Spatio-temporal keywords queries in HBase
Xiaoying Chen Chong Zhang Zonglin Shi Weidong Xiao
As data volumes grow to tens of billions of records, HBase, a distributed key-value database, plays a significant role in providing effective, high-throughput data service and management. However, applications involving spatio-temporal data still lack a good solution, because query processing for such data is inefficient in HBase. In this paper, we propose the spatio-temporal keyword search problem for HBase, which is meaningful in real-life applications and poses a new challenge on this platform. To solve this problem, we design a novel access model for HBase that contains row keys for indexing the spatio-temporal dimensions and Bloom filters for quickly detecting the existence of query keywords. We then develop two algorithms for spatio-temporal keyword queries: one is suited to queries with ordinary selectivity, and the other is a parallel algorithm based on MapReduce aimed at large range queries. We evaluate our algorithms on a real dataset, and the empirical results show that they are capable of handling spatio-temporal keyword queries efficiently.
keywords: Spatio-temporal keyword query, Bloom filter, Hilbert curve, MapReduce, HBase.
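The abstract names three ingredients: row keys that index the spatio-temporal dimensions, a Hilbert curve, and Bloom filters for keyword checks. The following is a minimal sketch of how these pieces might fit together, not the paper's actual design. The key layout in `make_row_key`, the grid size, the time bucket, and the `BloomFilter` parameters are all assumptions chosen for illustration; only the Hilbert mapping follows the standard textbook algorithm.

```python
import hashlib


def hilbert_index(n, x, y):
    """Map cell (x, y) of an n x n grid (n a power of two) to its
    position along the Hilbert curve (standard iterative algorithm)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def make_row_key(n, x, y, time_bucket):
    """Hypothetical row key: zero-padded Hilbert index, then zero-padded
    time bucket, so that byte order matches numeric order."""
    return f"{hilbert_index(n, x, y):010d}_{time_bucket:010d}".encode()


class BloomFilter:
    """Tiny Bloom filter over keywords; sizes are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in a Python int

    def _positions(self, word):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word):
        for p in self._positions(word):
            self.bits |= 1 << p

    def might_contain(self, word):
        # False means definitely absent; True means possibly present.
        return all((self.bits >> p) & 1 for p in self._positions(word))
```

Under these assumptions, a query would scan the row-key ranges covering its spatial cells and time interval, then use `might_contain` to skip rows whose filter rules out a query keyword before reading the full record.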
Time series based urban air quality predication
Ruiqi Li Yifan Chen Xiang Zhao Yanli Hu Weidong Xiao
Urban air pollution poses a great threat to human health and has been a major concern of many metropolises in developing countries. Lately, a few air quality monitoring stations have been established in countries suffering from air pollution to inform the public of real-time air quality indices based on fine particulate matter, e.g. $PM_{2.5}$. Air quality, unfortunately, is fairly difficult to manage because of the many complex human activities involved, from driving to smelting. We observe that the hidden regular patterns of human activities make prediction possible, and this motivates us to infer urban air conditions from the perspective of time series. In this paper, we focus on $PM_{2.5}$-based urban air quality and introduce two kinds of time-series methods for real-time and fine-grained air quality prediction, harnessing historical air quality data reported by existing monitoring stations. The methods are evaluated on real-life $PM_{2.5}$ concentration data from 2013 (January to December) in Wuhan, China.
keywords: $PM_{2.5}$, multiplicative model, time series, urban air quality, ARIMA.
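The keywords mention a multiplicative model, in which an observation is treated as a level multiplied by a seasonal factor. The sketch below shows one classical form of that idea, assuming an odd seasonal period and a moving-average trend. It is not the paper's method: the function names, the period, and the synthetic numbers in the usage note are all illustrative assumptions, and no real $PM_{2.5}$ data is used.

```python
def seasonal_indices(series, period):
    """Estimate multiplicative seasonal indices.

    Assumes an odd `period`, so a centered moving average of that
    length serves as the trend estimate. Returns `period` indices
    normalized to average 1.0."""
    half = period // 2
    ratios = [[] for _ in range(period)]
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        trend = sum(window) / len(window)
        # Ratio of observation to trend isolates the seasonal factor.
        ratios[t % period].append(series[t] / trend)
    raw = [sum(r) / len(r) for r in ratios]
    mean = sum(raw) / period
    return [v / mean for v in raw]


def forecast(series, period, steps):
    """Forecast `steps` values as (recent level) x (seasonal index).

    The level is the mean of the last full period, a deliberately
    simple choice for this sketch."""
    idx = seasonal_indices(series, period)
    level = sum(series[-period:]) / period
    start = len(series)
    return [level * idx[(start + k) % period] for k in range(steps)]
```

For example, a synthetic series built by repeating `[50.0, 100.0, 150.0]` four times has a constant level of 100 and a period of 3, and `forecast(series, 3, 3)` continues the same pattern. Real concentration series would also need a changing trend and a noise term, which an ARIMA-style model handles differently.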
