American Institute of Mathematical Sciences

May  2019, 2(2): 127-147. doi: 10.3934/mfc.2019010

Unsupervised robust nonparametric learning of hidden community properties

 Machine Learning and Optimization Laboratory, EPFL, Lausanne, CH-1015, Switzerland

* Corresponding author: Mikhail Langovoy

Received  April 2019 Published  June 2019

We consider learning of fundamental properties of communities in large noisy networks, in the prototypical situation where the nodes or users are split into two classes according to a binary property, e.g., according to their opinions or preferences on a topic. For learning these properties, we propose a nonparametric, unsupervised, and scalable graph scan procedure that is, in addition, robust against a class of powerful adversaries. In our setup, one of the communities can fall under the influence of a knowledgeable adversarial leader, who knows the full network structure, has unlimited computational resources and can completely foresee our planned actions on the network. We prove strong consistency of our results in this setup with minimal assumptions. In particular, the learning procedure estimates the baseline activity of normal users asymptotically correctly with probability 1; the only assumption being the existence of a single implicit community of asymptotically negligible logarithmic size. We provide experiments on real and synthetic data to illustrate the performance of our method, including examples with adversaries.

Citation: Mikhail Langovoy, Akhilesh Gotmare, Martin Jaggi. Unsupervised robust nonparametric learning of hidden community properties. Mathematical Foundations of Computing, 2019, 2 (2) : 127-147. doi: 10.3934/mfc.2019010
References:

show all references

References:
Graph scan estimators for the artificial large graph
Political blogosphere graph for the 2004 Elections
Graph scan estimators for the 2004 Elections graph
 $k$ 100 200 500 700 800 $\#(N_w=0)$ 98 98 78 56 55 $\#(N_w=1)$ 2 2 16 22 17 $\#(N_w=2)$ 0 0 3 16 12 $\#(N_w = 3)$ 0 0 3 5 9 $\#(N_w \geq 4)$ 0 0 0 1 7 $\mu_{\hat{a}}$ 1.705 1.815 1.913 1.949 1.961 $\max{\hat{a}}$ 1.806 1.903 1.98 2.03 2.031 $\min{\hat{a}}$ 1.603 1.686 1.824 1.848 1.879 $\sigma_{\hat{a}}$ 0.048 0.04 0.033 0.034 0.029
 $k$ 100 200 500 700 800 $\#(N_w=0)$ 98 98 78 56 55 $\#(N_w=1)$ 2 2 16 22 17 $\#(N_w=2)$ 0 0 3 16 12 $\#(N_w = 3)$ 0 0 3 5 9 $\#(N_w \geq 4)$ 0 0 0 1 7 $\mu_{\hat{a}}$ 1.705 1.815 1.913 1.949 1.961 $\max{\hat{a}}$ 1.806 1.903 1.98 2.03 2.031 $\min{\hat{a}}$ 1.603 1.686 1.824 1.848 1.879 $\sigma_{\hat{a}}$ 0.048 0.04 0.033 0.034 0.029