Query-Aware locality-sensitive hashing for approximate nearest neighbor search

Qiang Huang, Jianlin Feng, Yikai Zhang, Qiong Fang, Wilfred Ng

Research output: Chapter in Book/Conference Proceeding/ReportBook Chapterpeer-review

Abstract

Locality-Sensitive Hashing (LSH) and its variants are the well-known indexing schemes for the c-Approximate Nearest Neighbor (c-ANN) search problem in high-dimensional Eu- clidean space. Traditionally, LSH functions are constructed in a query-oblivious manner in the sense that buckets are partitioned before any query arrives. However, objects closer to a query may be partitioned into diffierent buckets, which is undesirable. Due to the use of query-oblivious bucket parti- tion, the state-of-the-art LSH schemes for external memory, namely C2LSH and LSB-Forest, only work with approxima- tion ratio of integer c ≥ 2. In this paper, we introduce a novel concept of query-aware bucket partition which uses a given query as the \anchor" for bucket partition. Accordingly, a query-aware LSH func- tion is a random projection coupled with query-aware bucket partition, which removes random shift required by tradi- tional query-oblivious LSH functions. Notably, query-aware bucket partition can be easily implemented so that query performance is guaranteed. We propose a novel query-aware LSH scheme named QALSH for c-ANN search over exter- nal memory. Our theoretical studies show that QALSH enjoys a guarantee on query quality. The use of query- aware LSH function enables QALSH to work with any ap- proximation ratio c > 1. Extensive experiments show that QALSH outperforms C2LSH and LSB-Forest, especially in high-dimensional space. Speciffically, by using a ratio c < 2, QALSH can achieve much better query quality.

Original languageEnglish
Title of host publicationProceedings of the VLDB Endowment
PublisherAssociation for Computing Machinery
Pages1-12
Number of pages12
Edition1
Publication statusPublished - 2016
Event42nd International Conference on Very Large Data Bases, VLDB 2016 - Delhi, India
Duration: 5 Sept 20169 Sept 2016

Publication series

NameProceedings of the VLDB Endowment
Number1
Volume9
ISSN (Electronic)2150-8097

Conference

Conference42nd International Conference on Very Large Data Bases, VLDB 2016
Country/TerritoryIndia
CityDelhi
Period5/09/169/09/16

Fingerprint

Dive into the research topics of 'Query-Aware locality-sensitive hashing for approximate nearest neighbor search'. Together they form a unique fingerprint.

Cite this