Abstract
Visual object tracking and segmentation in omnidirectional videos are challenging due to the wide field-of-view and the large spherical distortion of 360° images. To alleviate these problems, we introduce a novel representation, extended bounding field-of-view (eBFoV), for target localization and use it as the foundation of a general 360 tracking framework that is applicable to both omnidirectional visual object tracking and segmentation tasks. Building upon our previous work on omnidirectional visual object tracking (360VOT), we propose a comprehensive dataset and benchmark that incorporates a new component called omnidirectional video object segmentation (360VOS). The 360VOS dataset includes 290 sequences accompanied by dense pixel-wise masks and covers a broader range of target categories. To support both the development and evaluation of algorithms in this domain, we divide the dataset into a training subset of 170 sequences and a testing subset of 120 sequences. Furthermore, we tailor evaluation metrics for both omnidirectional tracking and segmentation to ensure rigorous assessment. Through extensive experiments, we benchmark state-of-the-art approaches and demonstrate the effectiveness of our proposed 360 tracking framework and training dataset.
| Original language | English |
|---|---|
| Article number | 11090163 |
| Pages (from-to) | 9785-9797 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
| Volume | 47 |
| Issue number | 11 |
| Early online date | 23 Jul 2025 |
| Publication status | Published - Nov 2025 |
Bibliographical note
Publisher Copyright: © 1979-2012 IEEE.
Keywords
- Dataset
- omnidirectional vision
- visual object tracking
- video object segmentation