Xiaoying Chen - Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China (email)
Abstract: With the amount of data accumulated to tens of billions of scale, HBase, a distributed key-value database, plays a significant role in providing effective and high-throughput data service and management. However, for the applications involving spatio-temporal data, there is no good solution, due to inefficient query processing in HBase. In this paper, we propose spatio-temporal keyword searching problem for HBase, which is a meaningful issue in real life and a new challenge in this platform. To solve this problem, a novel access model for HBase is designed, containing row keys for indexing spatio-temporal dimensions and Bloom filters for fast detecting the existence of query keywords. And then, two algorithms for spatio-temporal keyword queries are developed, one is suitable for the queries with ordinary selectivity, the other is a parallel algorithm based on MapReduce aiming for the large range queries. We evaluate our algorithms on a real dataset, and the empirical results show that they are capable to handle spatio-temporal keyword queries efficiently.
Keywords: Spatio-temporal keyword query, HBase, Hilbert curve, bloom filter, MapReduce.
Received: May 2015; Revised: August 2015; Available Online: September 2015.