
Real-time object tracking with auxiliary models

  • Shengnan Cai

Student thesis: Master's thesis

Abstract

Efficient and effective tracking remains challenging because of factors such as partial occlusion, background clutter, and changes in pose or illumination. Although a tracker can implement general object tracking by following the basic tracking-by-detection procedure, most state-of-the-art algorithms incorporate additional, more sophisticated processing, such as auxiliary models, to address these challenges. This thesis presents novel methods for real-time general object tracking with auxiliary models. We study feature selection together with the appearance and motion models used in video tracking algorithms. The resulting categorization is based on real applications that use these features and models. For visual tracking, we propose two methods built on auxiliary models. The first uses a generalized part-based appearance model and a structure-constrained motion model as auxiliary models. The appearance of the target object is represented by the proposed generalized part-based appearance model, which is adaptively updated by an efficient structure learning scheme. In addition, we enhance the tracker's performance with a motion model that applies a structure-constrained rule: the structure of the target object changes only slightly between consecutive frames. The second tracking method uses layered detection, which combines detection on two independent layers, one at the global level and the other at the patch level, in a unified tracking-by-detection framework. Given the representation of the user-specified object of interest, the global level distinguishes the object from the background effectively, whereas the patch level samples the patches of interest within the bounding-box representation. Besides describing our own tracking methods, we compare diverse tracking algorithms from the literature on a public benchmark. Feature selection plays an important role in visual tracking; SIFT, SURF, and HOG are commonly used features.
In this thesis, we study the problem of tracking man-made objects through a video sequence and present a novel affine-invariant feature, Low-rank SIFT, which exploits the regular appearance of man-made objects and achieves full affine invariance without simulating over the affine parameter space. Our method uses the low-rank prior to estimate the affine parameters of local patches directly, and we propose a fast algorithm that computes these parameters by introducing the Low-rank Integral Map. By automatically rectifying local patches to their low-rank forms and then applying conventional SIFT to resolve the rotation, translation, and scaling ambiguities, our approach performs feature selection in tracking with higher accuracy. The promising experimental results and observable qualitative improvements indicate that auxiliary models and affine-invariant features are fruitful directions. Future work will explore tracking with more sophisticated models and more efficient algorithms.
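
The low-rank prior mentioned above can be illustrated with a toy example. The sketch below is not the thesis algorithm: it does not implement Low-rank SIFT or the Low-rank Integral Map, and it uses a brute-force search over a single shear parameter, which the thesis method specifically avoids. It only shows why a regular pattern becomes low rank once it is rectified. The function names, patch size, and candidate shear values are all hypothetical choices made for illustration.

```python
import numpy as np

def striped_patch(size=32, period=8):
    """Vertical stripes: every row is identical, so the matrix has rank 1."""
    row = ((np.arange(size) // (period // 2)) % 2).astype(float)
    return np.tile(row, (size, 1))

def shear_rows(patch, shear):
    """Toy horizontal shear with wrap-around: row i shifts by round(shear * i) pixels."""
    out = np.empty_like(patch)
    for i, row in enumerate(patch):
        out[i] = np.roll(row, int(round(shear * i)))
    return out

def nuclear_norm(patch):
    """Sum of singular values, a common convex surrogate for matrix rank."""
    return np.linalg.svd(patch, compute_uv=False).sum()

def estimate_shear(patch, candidates):
    """Return the candidate shear whose inverse gives the lowest nuclear norm.

    Brute-force search is used here only for illustration (assumption);
    the thesis describes a direct, fast estimation instead.
    """
    scores = [nuclear_norm(shear_rows(patch, -s)) for s in candidates]
    return candidates[int(np.argmin(scores))]

if __name__ == "__main__":
    regular = striped_patch()
    distorted = shear_rows(regular, 0.25)
    candidates = [-0.5, -0.25, 0.0, 0.25, 0.5]
    print("rank before shear:", np.linalg.matrix_rank(regular))
    print("rank after shear:", np.linalg.matrix_rank(distorted))
    print("estimated shear:", estimate_shear(distorted, candidates))
```

Because the row shifts preserve the Frobenius norm of the patch, the nuclear norm is smallest exactly when the rows line up again into a rank-1 matrix, which is why the search recovers the applied shear in this toy setting.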
Date of Award: 2015
Original language: English
Awarding Institution
  • The Hong Kong University of Science and Technology
