American Institute of Mathematical Sciences

August 2018, 1(3): 295-309. doi: 10.3934/mfc.2018014

A real-time aggregate data publishing scheme with adaptive ω-event differential privacy

 1 School of Electronics and Information Engineering, Beijing Jiaotong University, Beijing, China 2 School of Big Data & Software Engineering, Chongqing University, Chongqing, China

* Corresponding author: Yan Huo

Received  November 2017 Revised  February 2018 Published  July 2018

Fund Project: The first and second authors are supported by NSFC grants (No. 61471028) and the Fundamental Research Funds for the Central Universities (No. 2017JBM004).
The third author is supported by NSFC grants (No. 61702062).
The fourth author is supported by NSFC grants (No. 61571010).

Although massive real-time data collected from users can provide benefits to improve the quality of human daily lives, it is possible to expose users' privacy. $\epsilon$-differential privacy is a notable model to provide strong privacy preserving in statistics. The existing works highlight $ω$-event differential privacy with a fixed window size, which may not be suitable for many practical scenarios. In view of this issue, we explore a real-time scheme with adaptive $ω$-event for differentially private time-series publishing (ADP) in this paper. In specific, we define a novel notion, Quality of Privacy (QoP) to measure both the utility of the released statistics and the performance of privacy preserving. According to this, we present an adaptive $ω$-event differential privacy model that can provide privacy protection with higher accuracy and better privacy protection effect. In addition, we also design a smart grouping mechanism to improve the grouping performance, and then improve the availability of publishing statistics. Finally, comparing with the existing schemes, we exploit real-world and synthetic datasets to conduct several experiments to demonstrate the superior performance of the ADP scheme.

Citation: Chengtao Yong, Yan Huo, Chunqiang Hu, Yanfei Lu, Guanlin Jing. A real-time aggregate data publishing scheme with adaptive ω-event differential privacy. Mathematical Foundations of Computing, 2018, 1 (3) : 295-309. doi: 10.3934/mfc.2018014
References:
 [1] Administration, United States Federal Highway, Traffic monitoring guide, Evaluation, (2001). [2] K. Aleksandra, K. Krishnaram, M. Nina and N. Alexandros, Releasing search queries and clicks privately, International Conference on World Wide Web, (2009), 171-180. [3] J. Ankur, Adaptive stream resource management using Kalman Filters, ACM SIGMOD International Conference on Management of Data, (2004), 11-22. [4] A. Blum, K. Ligett and A. Roth, A learning theory approach to non-interactive database privacy, Journal of the ACM, 60 (2008), Art. 12, 25 pp. doi: 10.1145/2450142.2450148. [5] C. A. Bradley, D. Rolka and J. Loonsk, BioSense: Implementation of a national early event detection and situational awareness system, Mmwr Supplements, 54 (2005), 11pp. [6] Z. Cai, Z. He, X. Guan and Y. Li, Collective data-sanitization for preventing sensitive information inference attacks in social networks, IEEE Transactions on Dependable and Secure Computing, PP (2016), 1-1. doi: 10.1109/TDSC.2016.2613521. [7] J. Cao, Q. Xiao, G. Ghinita, N. Li, E. Bertino and T. Kian Lee, Efficient and accurate strategies for differentially-private sliding window queries, International Conference on Extending Database Technology, (2013), 191-202. [8] R. Chen, N. Mohammed, B. C. M. Fung, B. C. Desai and X. Li, Publishing setvalued data via differential privacy, Vldb, 4 (2012), 1087-1098. [9] S. Cheng, Z. Cai, J. Li and H. Gao, Extracting kernel dataset from big sensory data in wireless sensor networks, IEEE Transactions on Knowledge and Data Engineering, 29 (2017), 813-827. doi: 10.1109/TKDE.2016.2645212. [10] S. Cheng, Z. Cai and J. Li, Curve query processing in wireless sensor networks, IEEE Transactions on Vehicular Technology, 64 (2015), 5198-5209. doi: 10.1109/TVT.2014.2375330. [11] S. Cheng, Z. Cai, J. Li and X. Fang, Drawing dominant dataset from big sensory data in wireless sensor networks, The 34th Annual IEEE International Conference on Computer Communications (INFOCOM 2015), (2015), 531-539. doi: 10.1109/INFOCOM.2015.7218420. [12] D. Cynthia, Differential privacy, International Colloquium on Automata, Languages, and Programming, 4052 (2006), 1-12. doi: 10.1007/11787006_1. [13] D. Cynthia, Differential privacy in new settings, Acm-Siam Symposium on Discrete Algorithms, (2010), 174-183. [14] C. Dwork, M. Naor, T. Pitassiand and G. N. Rothblum, Differential privacy under continual observation, STOC '10 Proceedings of the Forty-Second ACM Symposium on Theory of Computing, (2010), 715-724. doi: 10.1145/1806689.1806787. [15] C. Dwork, M. Naor, O. Reingold, G. N. Rothblum and S. Vadhan, On the complexity of differentially private data release: Efficient algorithms and hardness results, Proc. 41st Annu. ACM STOC, (2009), 381-390. [16] C. Dwork, F. McSherry, K. Nissim and A. Smith, Calibrating noise to sensitivity in private data analysis, Theory of Cryptography, 3876 (2006), 265-284. doi: 10.1007/11681878_14. [17] L. Fan and L. Xiong, An adaptive approach to real-time aggregate monitoring with differential privacy, IEEE Transactions on Knowledge and Data Engineering, 26 (2014), 2094-2106. [18] T. Florian, Privacy issues in vehicular ad hoc networks, International Conference on Privacy Enhancing Technologies, (2005), 197-209. [19] B. J. Frey and D. Dueck, Clustering by passing messages between data points, Science, 315 (2007), 972-976. doi: 10.1126/science.1136800. [20] C. Graham, P. Cecilia, S. Divesh and T. T. L. Tran, Differentially private summaries for sparse data, International Conference on Database Theory, (2011), 299-311. [21] Z. He, Z. Cai and J. Yu, Latent-data Privacy Preserving with Customized Data Utility for Social Network Data, IEEE Transactions on Vehicular Technology, 67 (2018), 665-673. doi: 10.1109/TVT.2017.2738018. [22] Z. He, Z. Cai, S. Cheng and X. Wang, Approximate aggregation for tracking quantiles and range countings in wireless sensor networks, Theoretical Computer Science, 607 (2015), 381-390. doi: 10.1016/j.tcs.2015.07.056. [23] Th. H. Chan, E. Shi and D. Song, Private and continual release of statistics, Automata, Languages and Programming, Part Ⅱ, 405-417, Lecture Notes in Comput. Sci., 6199, Springer, Berlin, 2010. doi: 10.1007/978-3-642-14162-1_34. [24] T. H. Hubert Chan, M. Li, E. Shi and W. Xu, Differentially private continual monitoring of heavy hitters from distributed streams, Springer Berlin Heidelberg, (2012), 140-159. [25] R. Kalman, A new approach to linear filtering and prediction problem, Journal of Basic Engineering, 82 (1960), 35-45. doi: 10.1115/1.3662552. [26] G. Kellaris, S. Papadopoulos, X. Xiao and D. Papadias, Differentially private event sequences over infinite streams, Proc. of the VLDB Endowment, 7 (2014), 1155-1166. doi: 10.14778/2732977.2732989. [27] D. Kifer and B.-R. Lin, Towards an axiomatization of statistical privacy and utility, Proc. of ACM PODS, (2010), 147-158. doi: 10.1145/1807085.1807106. [28] C. Li, M. Hay, V. Rastogi, G. Miklau and A. Mcgregor, Optimizing linear counting queries under differential privacy, Twenty-Ninth ACM Sigmod-Sigact-Sigart Symposium on Principles of Database Systems, (2010), 123-134. [29] J. Li, S. Cheng, Z. Cai, J. Yu, C. Wang and Y. Li, Approximate holistic aggregation in wireless sensor networks, 2015 IEEE 35th International Conference on Distributed Computing Systems, (2015). doi: 10.1109/ICDCS.2015.86. [30] Y. Liang and Z. Cai and Q. Han and Y. Li, Location privacy leakage through sensory data, Security and Communication Networks, (2017), Article ID 7576307, 12 pages. doi: 10.1155/2017/7576307. [31] J. Mao, W. Tian, Y. Zhang, J. Cui, H. Ma, J. Bian, J. Liu and J. Zhang, Co-Check: Collab-orative outsourced data auditing in multicloud environment, Security and Communication Networks, (2017), Article ID 2948025, 13 pages doi: 10.1155/2017/2948025. [32] J. Mao, Y. Zhang, P. Li, T. Li, Q. Wu and J. Liu, A Position-aware Merkle Tree for Dynamic Cloud Data Integrity Verification, Soft Computing, 21 (2017), 2151-2164. doi: 10.1007/s00500-015-1918-8. [33] H. Michael, R. Vibhor, M. Gerome and D. Suciu, Boosting the accuracy of differentially private histograms through consistency, Proceedings of the Vldb Endowment, 3 (2010), 1021-1032. [34] J. Sun, Y. Fang and X. Zhu, Privacy and emergency response in e-healthcare leveraging wireless body sensor networks, IEEE Wireless Communications, 17 (2010), 66-73. doi: 10.1109/MWC.2010.5416352. [35] B. Thomas, A framework for generating network-based moving objects, Geoinformatica, 6 (2002), 153-180. [36] Q. Wang, Y. Zhang, X. Lu and Z. Wang, RescueDP: Real-time spatio-temporal crowdsourced data publishing with differential privacy, IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, (2016). doi: 10.1109/INFOCOM.2016.7524458. [37] X. Xiao, G. Bender, H. Michael and G. Johannes, iReduct: differential privacy with reduced relative errors, ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece, June, (2011), 229-240. [38] J. Xu, Z. Zhang, X. Xiao, Y. Yang and G. Yu, Differentially private histogram publication, IEEE International Conference on Data Engineering, (2012), 32-43. [39] L. Zhang, Z. Cai and X. Wang, FakeMask: A Novel Privacy Preserving Approach for Smartphones, IEEE Transactions on Network and Service Management, 13 (2016), 335-348. doi: 10.1109/TNSM.2016.2559448. [40] X. Zheng, Z. Cai, J. Yu, C. Wang and Y. Li, Follow but no track: Privacy preserved profile publishing in cyber-physical social systems, IEEE Internet of Things Journal, 4 (2017), 1868-1878. doi: 10.1109/JIOT.2017.2679483. [41] CTR Data, Available from: https://www.kaggle.com/c/avazu-ctr-prediction. [42] Capital Bikeshare Data, Available from: https://www.capitalbikeshare.com/system-data.

show all references

References:
 [1] Administration, United States Federal Highway, Traffic monitoring guide, Evaluation, (2001). [2] K. Aleksandra, K. Krishnaram, M. Nina and N. Alexandros, Releasing search queries and clicks privately, International Conference on World Wide Web, (2009), 171-180. [3] J. Ankur, Adaptive stream resource management using Kalman Filters, ACM SIGMOD International Conference on Management of Data, (2004), 11-22. [4] A. Blum, K. Ligett and A. Roth, A learning theory approach to non-interactive database privacy, Journal of the ACM, 60 (2008), Art. 12, 25 pp. doi: 10.1145/2450142.2450148. [5] C. A. Bradley, D. Rolka and J. Loonsk, BioSense: Implementation of a national early event detection and situational awareness system, Mmwr Supplements, 54 (2005), 11pp. [6] Z. Cai, Z. He, X. Guan and Y. Li, Collective data-sanitization for preventing sensitive information inference attacks in social networks, IEEE Transactions on Dependable and Secure Computing, PP (2016), 1-1. doi: 10.1109/TDSC.2016.2613521. [7] J. Cao, Q. Xiao, G. Ghinita, N. Li, E. Bertino and T. Kian Lee, Efficient and accurate strategies for differentially-private sliding window queries, International Conference on Extending Database Technology, (2013), 191-202. [8] R. Chen, N. Mohammed, B. C. M. Fung, B. C. Desai and X. Li, Publishing setvalued data via differential privacy, Vldb, 4 (2012), 1087-1098. [9] S. Cheng, Z. Cai, J. Li and H. Gao, Extracting kernel dataset from big sensory data in wireless sensor networks, IEEE Transactions on Knowledge and Data Engineering, 29 (2017), 813-827. doi: 10.1109/TKDE.2016.2645212. [10] S. Cheng, Z. Cai and J. Li, Curve query processing in wireless sensor networks, IEEE Transactions on Vehicular Technology, 64 (2015), 5198-5209. doi: 10.1109/TVT.2014.2375330. [11] S. Cheng, Z. Cai, J. Li and X. Fang, Drawing dominant dataset from big sensory data in wireless sensor networks, The 34th Annual IEEE International Conference on Computer Communications (INFOCOM 2015), (2015), 531-539. doi: 10.1109/INFOCOM.2015.7218420. [12] D. Cynthia, Differential privacy, International Colloquium on Automata, Languages, and Programming, 4052 (2006), 1-12. doi: 10.1007/11787006_1. [13] D. Cynthia, Differential privacy in new settings, Acm-Siam Symposium on Discrete Algorithms, (2010), 174-183. [14] C. Dwork, M. Naor, T. Pitassiand and G. N. Rothblum, Differential privacy under continual observation, STOC '10 Proceedings of the Forty-Second ACM Symposium on Theory of Computing, (2010), 715-724. doi: 10.1145/1806689.1806787. [15] C. Dwork, M. Naor, O. Reingold, G. N. Rothblum and S. Vadhan, On the complexity of differentially private data release: Efficient algorithms and hardness results, Proc. 41st Annu. ACM STOC, (2009), 381-390. [16] C. Dwork, F. McSherry, K. Nissim and A. Smith, Calibrating noise to sensitivity in private data analysis, Theory of Cryptography, 3876 (2006), 265-284. doi: 10.1007/11681878_14. [17] L. Fan and L. Xiong, An adaptive approach to real-time aggregate monitoring with differential privacy, IEEE Transactions on Knowledge and Data Engineering, 26 (2014), 2094-2106. [18] T. Florian, Privacy issues in vehicular ad hoc networks, International Conference on Privacy Enhancing Technologies, (2005), 197-209. [19] B. J. Frey and D. Dueck, Clustering by passing messages between data points, Science, 315 (2007), 972-976. doi: 10.1126/science.1136800. [20] C. Graham, P. Cecilia, S. Divesh and T. T. L. Tran, Differentially private summaries for sparse data, International Conference on Database Theory, (2011), 299-311. [21] Z. He, Z. Cai and J. Yu, Latent-data Privacy Preserving with Customized Data Utility for Social Network Data, IEEE Transactions on Vehicular Technology, 67 (2018), 665-673. doi: 10.1109/TVT.2017.2738018. [22] Z. He, Z. Cai, S. Cheng and X. Wang, Approximate aggregation for tracking quantiles and range countings in wireless sensor networks, Theoretical Computer Science, 607 (2015), 381-390. doi: 10.1016/j.tcs.2015.07.056. [23] Th. H. Chan, E. Shi and D. Song, Private and continual release of statistics, Automata, Languages and Programming, Part Ⅱ, 405-417, Lecture Notes in Comput. Sci., 6199, Springer, Berlin, 2010. doi: 10.1007/978-3-642-14162-1_34. [24] T. H. Hubert Chan, M. Li, E. Shi and W. Xu, Differentially private continual monitoring of heavy hitters from distributed streams, Springer Berlin Heidelberg, (2012), 140-159. [25] R. Kalman, A new approach to linear filtering and prediction problem, Journal of Basic Engineering, 82 (1960), 35-45. doi: 10.1115/1.3662552. [26] G. Kellaris, S. Papadopoulos, X. Xiao and D. Papadias, Differentially private event sequences over infinite streams, Proc. of the VLDB Endowment, 7 (2014), 1155-1166. doi: 10.14778/2732977.2732989. [27] D. Kifer and B.-R. Lin, Towards an axiomatization of statistical privacy and utility, Proc. of ACM PODS, (2010), 147-158. doi: 10.1145/1807085.1807106. [28] C. Li, M. Hay, V. Rastogi, G. Miklau and A. Mcgregor, Optimizing linear counting queries under differential privacy, Twenty-Ninth ACM Sigmod-Sigact-Sigart Symposium on Principles of Database Systems, (2010), 123-134. [29] J. Li, S. Cheng, Z. Cai, J. Yu, C. Wang and Y. Li, Approximate holistic aggregation in wireless sensor networks, 2015 IEEE 35th International Conference on Distributed Computing Systems, (2015). doi: 10.1109/ICDCS.2015.86. [30] Y. Liang and Z. Cai and Q. Han and Y. Li, Location privacy leakage through sensory data, Security and Communication Networks, (2017), Article ID 7576307, 12 pages. doi: 10.1155/2017/7576307. [31] J. Mao, W. Tian, Y. Zhang, J. Cui, H. Ma, J. Bian, J. Liu and J. Zhang, Co-Check: Collab-orative outsourced data auditing in multicloud environment, Security and Communication Networks, (2017), Article ID 2948025, 13 pages doi: 10.1155/2017/2948025. [32] J. Mao, Y. Zhang, P. Li, T. Li, Q. Wu and J. Liu, A Position-aware Merkle Tree for Dynamic Cloud Data Integrity Verification, Soft Computing, 21 (2017), 2151-2164. doi: 10.1007/s00500-015-1918-8. [33] H. Michael, R. Vibhor, M. Gerome and D. Suciu, Boosting the accuracy of differentially private histograms through consistency, Proceedings of the Vldb Endowment, 3 (2010), 1021-1032. [34] J. Sun, Y. Fang and X. Zhu, Privacy and emergency response in e-healthcare leveraging wireless body sensor networks, IEEE Wireless Communications, 17 (2010), 66-73. doi: 10.1109/MWC.2010.5416352. [35] B. Thomas, A framework for generating network-based moving objects, Geoinformatica, 6 (2002), 153-180. [36] Q. Wang, Y. Zhang, X. Lu and Z. Wang, RescueDP: Real-time spatio-temporal crowdsourced data publishing with differential privacy, IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, (2016). doi: 10.1109/INFOCOM.2016.7524458. [37] X. Xiao, G. Bender, H. Michael and G. Johannes, iReduct: differential privacy with reduced relative errors, ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece, June, (2011), 229-240. [38] J. Xu, Z. Zhang, X. Xiao, Y. Yang and G. Yu, Differentially private histogram publication, IEEE International Conference on Data Engineering, (2012), 32-43. [39] L. Zhang, Z. Cai and X. Wang, FakeMask: A Novel Privacy Preserving Approach for Smartphones, IEEE Transactions on Network and Service Management, 13 (2016), 335-348. doi: 10.1109/TNSM.2016.2559448. [40] X. Zheng, Z. Cai, J. Yu, C. Wang and Y. Li, Follow but no track: Privacy preserved profile publishing in cyber-physical social systems, IEEE Internet of Things Journal, 4 (2017), 1868-1878. doi: 10.1109/JIOT.2017.2679483. [41] CTR Data, Available from: https://www.kaggle.com/c/avazu-ctr-prediction. [42] Capital Bikeshare Data, Available from: https://www.capitalbikeshare.com/system-data.
The aggregate time-series data publishing scenario
A high-level overview of ADP
Utility comparison when $\epsilon$ changes
Utility comparison with and without Adaptive $\omega$
MAE and QoP of different grouping mechanisms
 [1] Yunmei Lu, Mingyuan Yan, Meng Han, Qingliang Yang, Yanqing Zhang. Privacy preserving feature selection and Multiclass Classification for horizontally distributed data. Mathematical Foundations of Computing, 2018, 1 (4) : 331-348. doi: 10.3934/mfc.2018016 [2] Archana Prashanth Joshi, Meng Han, Yan Wang. A survey on security and privacy issues of blockchain technology. Mathematical Foundations of Computing, 2018, 1 (2) : 121-147. doi: 10.3934/mfc.2018007 [3] Ruinian Li, Yinhao Xiao, Cheng Zhang, Tianyi Song, Chunqiang Hu. Cryptographic algorithms for privacy-preserving online applications. Mathematical Foundations of Computing, 2018, 1 (4) : 311-330. doi: 10.3934/mfc.2018015 [4] Stefano Galatolo. Orbit complexity and data compression. Discrete & Continuous Dynamical Systems - A, 2001, 7 (3) : 477-486. doi: 10.3934/dcds.2001.7.477 [5] Subrata Dasgupta. Disentangling data, information and knowledge. Big Data & Information Analytics, 2016, 1 (4) : 377-389. doi: 10.3934/bdia.2016016 [6] Alessia Marigo. Equilibria for data networks. Networks & Heterogeneous Media, 2007, 2 (3) : 497-528. doi: 10.3934/nhm.2007.2.497 [7] Alexandre J. Chorin, Fei Lu, Robert N. Miller, Matthias Morzfeld, Xuemin Tu. Sampling, feasibility, and priors in data assimilation. Discrete & Continuous Dynamical Systems - A, 2016, 36 (8) : 4227-4246. doi: 10.3934/dcds.2016.36.4227 [8] Xiaosheng Li, Gunther Uhlmann. Inverse problems with partial data in a slab. Inverse Problems & Imaging, 2010, 4 (3) : 449-462. doi: 10.3934/ipi.2010.4.449 [9] Richard Boire. Understanding AI in a world of big data. Big Data & Information Analytics, 2017, 2 (5) : 22-42. doi: 10.3934/bdia.2018001 [10] Ida De Bonis, Daniela Giachetti. Singular parabolic problems with possibly changing sign data. Discrete & Continuous Dynamical Systems - B, 2014, 19 (7) : 2047-2064. doi: 10.3934/dcdsb.2014.19.2047 [11] Sylvain Ervedoza, Enrique Zuazua. A systematic method for building smooth controls for smooth data. Discrete & Continuous Dynamical Systems - B, 2010, 14 (4) : 1375-1401. doi: 10.3934/dcdsb.2010.14.1375 [12] Zhouchen Lin. A review on low-rank models in data analysis. Big Data & Information Analytics, 2016, 1 (2&3) : 139-161. doi: 10.3934/bdia.2016001 [13] Z. G. Feng, Kok Lay Teo, N. U. Ahmed, Yulin Zhao, W. Y. Yan. Optimal fusion of sensor data for Kalman filtering. Discrete & Continuous Dynamical Systems - A, 2006, 14 (3) : 483-503. doi: 10.3934/dcds.2006.14.483 [14] Liu Hui, Lin Zhi, Waqas Ahmad. Network(graph) data research in the coordinate system. Mathematical Foundations of Computing, 2018, 1 (1) : 1-10. doi: 10.3934/mfc.2018001 [15] Pedro Caro. On an inverse problem in electromagnetism with local data: stability and uniqueness. Inverse Problems & Imaging, 2011, 5 (2) : 297-322. doi: 10.3934/ipi.2011.5.297 [16] Francesca Sapuppo, Elena Umana, Mattia Frasca, Manuela La Rosa, David Shannahoff-Khalsa, Luigi Fortuna, Maide Bucolo. Complex spatio-temporal features in meg data. Mathematical Biosciences & Engineering, 2006, 3 (4) : 697-716. doi: 10.3934/mbe.2006.3.697 [17] Pankaj Sharma, David Baglee, Jaime Campos, Erkki Jantunen. Big data collection and analysis for manufacturing organisations. Big Data & Information Analytics, 2017, 2 (2) : 127-139. doi: 10.3934/bdia.2017002 [18] Débora A. F. Albanez, Maicon J. Benvenutti. Continuous data assimilation algorithm for simplified Bardina model. Evolution Equations & Control Theory, 2018, 7 (1) : 33-52. doi: 10.3934/eect.2018002 [19] A. Cascone, Alessia Marigo, B. Piccoli, L. Rarità. Decentralized optimal routing for packets flow on data networks. Discrete & Continuous Dynamical Systems - B, 2010, 13 (1) : 59-78. doi: 10.3934/dcdsb.2010.13.59 [20] Enrico Capobianco. Born to be big: Data, graphs, and their entangled complexity. Big Data & Information Analytics, 2016, 1 (2&3) : 163-169. doi: 10.3934/bdia.2016002

Impact Factor: