Video, a versatile multimodal data format, has emerged as the predominant channel for information transmission and communication. Efficient video exploration and analysis are critical for numerous applications spanning business, security, and education. Taking advantage of powerful computational resources, many recent efforts have applied machine learning and deep learning models to summarize and analyze videos automatically. However, given the inherent complexity of video data, building models that effectively and comprehensively understand the spatial-temporal relationships and cross-modal information within videos remains a significant challenge. Moreover, these automated methods offer limited interaction and poor result representations, hindering analysts from conducting the fine-grained exploration needed to distill high-level insights efficiently. Visual analytics, which combines the efficiency of computational algorithms with human visual ability for pattern discovery and domain knowledge for decision-making, has brought new opportunities for model practitioners and data analysts to perform comprehensive video data exploration and model steering.
| Date of Award | 2024 |
|---|---|
| Original language | English |
| Awarding Institution | The Hong Kong University of Science and Technology |
| Supervisor | Huamin QU (Supervisor) & Qian ZHANG (Supervisor) |
Interactive visual analytics for multimodal video data exploration and model steering
HE, J. (Author). 2024
Student thesis: Doctoral thesis