Selectivity estimation on streaming spatio-textual data using local correlations

Xiaoyang Wang, Ying Zhang, Wenjie Zhang, Xuemin Lin*, Wei Wang

*Corresponding author for this work

Research output: Contribution to journal › Conference article published in journal › peer-review

18 Citations (Scopus)

Abstract

In this paper, we investigate the selectivity estimation problem for streaming spatio-textual data, which arises in many social network and geo-location applications. Specifically, given a set of continuously and rapidly arriving spatio-textual objects, each of which is described by a geo-location and a short text, we aim to accurately estimate the cardinality of a spatial keyword query on objects seen so far, where a spatial keyword query consists of a search region and a set of query keywords. To the best of our knowledge, this is the first work to address this important problem. We first extend two existing techniques to solve this problem, and show their limitations. Inspired by two key observations on the "locality" of the correlations among query keywords, we propose a local correlation based method by utilizing an augmented adaptive space partition tree (A2SP-tree for short) to approximately learn a local Bayesian network on-the-fly for a given query and estimate its selectivity. A novel local boosting approach is presented to further enhance the learning accuracy of local Bayesian networks. Our comprehensive experiments on real-life datasets demonstrate the superior performance of the local correlation based algorithm in terms of estimation accuracy compared to other competitors.

Original language: English
Pages (from-to): 101-112
Number of pages: 12
Journal: Proceedings of the VLDB Endowment
Volume: 8
Issue number: 2
Publication status: Published - Oct 2014
Externally published: Yes
Event: 3rd Workshop on Spatio-Temporal Database Management, STDBM 2006, Co-located with the 32nd International Conference on Very Large Data Bases, VLDB 2006 - Seoul, Korea, Republic of
Duration: 11 Sept 2006 → 11 Sept 2006
