TY - GEN
T1 - A multi-sample, multi-tree approach to bag-of-words image representation for image retrieval
AU - Wu, Zhong
AU - Ke, Qifa
AU - Sun, Jian
AU - Shum, Heung Yeung
PY - 2009
Y1 - 2009
N2 - State-of-the-art content-based image retrieval systems have been significantly advanced by the introduction of SIFT features and the bag-of-words image representation. Converting an image into a bag-of-words, however, involves three non-trivial steps: feature detection, feature description, and feature quantization. At each of these steps, a significant amount of information is lost, and the resulting visual words are often not discriminative enough for large-scale image retrieval applications. In this paper, we propose a novel multi-sample multi-tree approach to computing the visual word codebook. By encoding more information from the original image features, our approach generates a much more discriminative visual word codebook that is also efficient in terms of both computation and space consumption, without losing the original repeatability of the visual features. We evaluate our approach using both a ground-truth data set and a real-world large-scale image database. Our results show that a significant improvement in both precision and recall can be achieved by using the codebook derived from our approach.
UR - https://www.webofscience.com/wos/woscc/full-record/WOS:000294955300257
U2 - 10.1109/ICCV.2009.5459439
DO - 10.1109/ICCV.2009.5459439
M3 - Conference Paper published in a book
SN - 9781424444205
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 1992
EP - 1999
BT - 2009 IEEE 12th International Conference on Computer Vision, ICCV 2009
T2 - 12th International Conference on Computer Vision, ICCV 2009
Y2 - 29 September 2009 through 2 October 2009
ER -