We consider the learning of fundamental properties of communities in large noisy networks, in the prototypical situation where the nodes or users are split into two classes according to a binary property, e.g., according to their opinions or preferences on a topic. For learning these properties, we propose a nonparametric, unsupervised, and scalable graph scan procedure that is, in addition, robust against a class of powerful adversaries. In our setup, one of the communities can fall under the influence of a knowledgeable adversarial leader, who knows the full network structure, has unlimited computational resources, and can completely foresee our planned actions on the network. We prove strong consistency of our procedure in this setup under minimal assumptions. In particular, the learning procedure estimates the baseline activity of normal users asymptotically correctly with probability 1, the only assumption being the existence of a single implicit community of asymptotically negligible logarithmic size. We provide experiments on real and synthetic data to illustrate the performance of our method, including examples with adversaries.
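To make the scan idea concrete, here is a minimal sketch of a neighborhood scan score for binary node labels. It is a generic illustration only: the scan windows (ego-neighborhoods), the baseline estimate (the global label mean), and the standardized score are simplifying assumptions of ours, not the paper's robust procedure.

```python
# Minimal sketch of a neighborhood scan statistic for binary node labels.
# Assumptions (not from the paper): ego-neighborhood windows, a global-mean
# baseline estimate, and a simple z-score as the scan score.
import math
import networkx as nx

def scan_scores(G, labels):
    """Score each ego-neighborhood for activity elevated above the baseline.

    labels: dict mapping node -> 0/1 (e.g., opinion on a topic).
    Large scores suggest a community deviating from the baseline activity.
    """
    p_hat = sum(labels.values()) / len(labels)  # estimated baseline activity
    scores = {}
    for v in G:
        window = [v] + list(G.neighbors(v))
        k = sum(labels[u] for u in window)      # observed activity in window
        n = len(window)
        sd = math.sqrt(n * p_hat * (1 - p_hat)) or 1.0
        scores[v] = (k - n * p_hat) / sd
    return scores

G = nx.erdos_renyi_graph(200, 0.05, seed=0)
labels = {v: int(v < 20) for v in G}            # toy planted community of 1s
scores = scan_scores(G, labels)
print(max(scores, key=scores.get))              # node with the strongest signal
```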
We study the problem of recovering a hidden community of cardinality $K$ from an $n \times n$ symmetric data matrix $A$, where for distinct indices $i,j$, $A_{ij} \sim P$ if $i, j$ both belong to the community and $A_{ij} \sim Q$ otherwise, for two known probability distributions $P$ and $Q$ depending on $n$. If $P=\mathrm{Bern}(p)$ and $Q=\mathrm{Bern}(q)$ with $p>q$, it reduces to the problem of finding a densely connected $K$-subgraph planted in a large Erdős-Rényi graph; if $P=\mathcal{N}(\mu,1)$ and $Q=\mathcal{N}(0,1)$ with $\mu>0$, it corresponds to the problem of locating a $K \times K$ principal submatrix of elevated means in a large Gaussian random matrix. We focus on two types of asymptotic recovery guarantees as $n \to \infty$: (1) weak recovery: the expected number of classification errors is $o(K)$; (2) exact recovery: the probability of classifying all indices correctly converges to one. Under mild assumptions on $P$ and $Q$, and allowing the community size to scale sublinearly with $n$, we derive a set of sufficient conditions and a set of necessary conditions for recovery, which are asymptotically tight with sharp constants. The results hold in particular for the Gaussian case, and for the case of bounded log likelihood ratio, including the Bernoulli case whenever $\frac{p}{q}$ and $\frac{1-p}{1-q}$ are bounded away from zero and infinity. An important algorithmic implication is that, whenever exact recovery is information-theoretically possible, any algorithm that provides weak recovery when the community size is concentrated near $K$ can be upgraded to achieve exact recovery in linear additional time by a simple voting procedure.
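As an illustration of the voting upgrade in the Bernoulli case, the sketch below re-scores every index by its number of edges into a rough community estimate (the sufficient statistic of the log-likelihood ratio in this case) and keeps the $K$ best. The function and variable names are ours; the paper's exact threshold and error analysis are not reproduced.

```python
# Hedged sketch of the clean-up-by-voting step, Bernoulli case: each index
# votes with its edge count into a rough estimate produced by any
# weak-recovery algorithm, and the K highest-scoring indices are kept.
import numpy as np

def voting_cleanup(A, rough_estimate, K):
    """A: n x n symmetric 0/1 adjacency matrix; rough_estimate: index array."""
    votes = A[:, rough_estimate].sum(axis=1)  # edges into the rough estimate
    return np.sort(np.argsort(votes)[-K:])    # keep the K best-connected indices
```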
We study a semidefinite programming (SDP) relaxation of the maximum likelihood estimation for exactly recovering a hidden community of cardinality $K$ from an $n \times n$ symmetric data matrix $A$, where for distinct indices $i,j$, $A_{ij} \sim P$ if $i, j$ are both in the community and $A_{ij} \sim Q$ otherwise, for two known probability distributions $P$ and $Q$. We identify a sufficient condition and a necessary condition for the success of SDP for the general model. For both the Bernoulli case ($P=\mathrm{Bern}(p)$ and $Q=\mathrm{Bern}(q)$ with $p>q$) and the Gaussian case ($P=\mathcal{N}(\mu,1)$ and $Q=\mathcal{N}(0,1)$ with $\mu>0$), which correspond to the problems of planted dense subgraph recovery and submatrix localization, respectively, the general results lead to the following findings: (1) If $K=\omega(n/\log n)$, SDP attains the information-theoretic recovery limits with sharp constants; (2) If $K=\Theta(n/\log n)$, SDP is order-wise optimal, but strictly suboptimal by a constant factor; (3) If $K=o(n/\log n)$ and $K \to \infty$, SDP is order-wise suboptimal. The same critical scaling for $K$ is found to hold, up to constant factors, for the performance of SDP on the stochastic block model of $n$ vertices partitioned into multiple communities of equal size $K$. A key ingredient in the proof of the necessary condition is a construction of a primal feasible solution based on random perturbation of the true cluster matrix.
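A plausible form of such a relaxation, sketched with cvxpy: the true cluster matrix $\xi \xi^T$ (for the 0/1 community indicator $\xi$) is relaxed to a PSD matrix with nonnegative entries, bounded diagonal, and total mass $K^2$. This constraint set is one standard choice and may differ in detail from the SDP analyzed in the paper.

```python
# Sketch of an SDP relaxation for hidden community recovery (cvxpy).
# The constraint set is an assumption: one standard relaxation of the
# combinatorial constraint Z = xi xi^T with xi a 0/1 vector summing to K.
import cvxpy as cp

def community_sdp(A, K):
    n = A.shape[0]
    Z = cp.Variable((n, n), symmetric=True)
    constraints = [Z >> 0,               # positive semidefinite
                   Z >= 0,               # entrywise nonnegative
                   cp.diag(Z) <= 1,      # relaxes Z_ii = xi_i^2 in {0, 1}
                   cp.sum(Z) == K ** 2]  # relaxes <J, xi xi^T> = K^2
    cp.Problem(cp.Maximize(cp.trace(A @ Z)), constraints).solve()
    return Z.value  # round (e.g., via the top eigenvector) to recover the community
```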
We introduce the concept of community trees, which summarize topological structures within a network. A community tree is a tree structure representing the clique communities obtained from the clique percolation method (CPM). A community tree also generates a persistence diagram. Community trees and persistence diagrams reveal topological structures of the underlying networks and can be used as visualization tools. We study the stability of community trees and derive a quantity called the total star number (TSN), which provides an upper bound on the change of community trees. Our findings give a topological interpretation of the stability of communities generated by the CPM.
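For concreteness, CPM communities at successive clique sizes nest: every $(k+1)$-clique community is contained in some $k$-clique community, which is what lets the levels be assembled into a tree. The sketch below uses networkx's k_clique_communities implementation of CPM; the parent-finding step is a simplified illustration of the nesting, not the paper's full construction of the tree or its persistence diagram.

```python
# CPM communities at increasing clique size k, linked by containment.
# Every (k+1)-clique community lies inside some k-clique community, so the
# containment links across levels define a community tree.
import networkx as nx
from networkx.algorithms.community import k_clique_communities

G = nx.karate_club_graph()
levels = {k: [frozenset(c) for c in k_clique_communities(G, k)]
          for k in range(3, 6)}
for k in range(4, 6):
    for child in levels[k]:
        parent = next(p for p in levels[k - 1] if child <= p)
        print(f"k={k}: |community|={len(child)} -> |parent|={len(parent)}")
```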
A message passing algorithm is derived for recovering communities within a graph generated by a variation of the Barabási-Albert preferential attachment model. The estimator is assumed to know the arrival times, or order of attachment, of the vertices. The derivation of the algorithm is based on belief propagation under an independence assumption. Two precursors to the message passing algorithm are analyzed: the first is a degree thresholding (DT) algorithm, and the second is an algorithm based on the arrival times of the children (C) of a given vertex, where the children of a given vertex are the vertices that attached to it. Comparison of the performance of the algorithms shows that it is beneficial to know the arrival times, and not just the number, of the children. The probability of correct classification of a vertex is asymptotically determined by the fraction of vertices arriving before it. Two extensions of Algorithm C are given: the first is based on the joint likelihood of the children of a fixed set of vertices and can sometimes be used to seed the message passing algorithm; the second is the message passing algorithm itself. Simulation results are given.
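As a sketch of the simplest precursor, degree thresholding can be written down directly: under preferential attachment, the vertex arriving at time $t$ has expected degree on the order of $\sqrt{n/t}$ after $n$ arrivals, so a vertex is flagged when its degree sits well above that curve. The threshold constant below is an illustrative choice of ours, not the paper's tuned value.

```python
# Sketch of a degree-thresholding (DT) classifier for a preferential
# attachment graph. Assumption: the threshold tau * sqrt(n / t) is an
# illustrative form; the paper derives and tunes its own threshold.
import math

def degree_threshold_classify(degrees, arrival_times, n, tau=1.5):
    """degrees[v]: final degree; arrival_times[v]: 1-indexed attachment order."""
    return {v: degrees[v] > tau * math.sqrt(n / arrival_times[v])
            for v in degrees}
```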
In federated learning problems, data is scattered across different servers, and exchanging or pooling it is often impractical or prohibited. We develop a Bayesian nonparametric framework for federated learning with neural networks. Each data server is assumed to provide local neural network weights, which are modeled through our framework. We then develop an inference approach that allows us to synthesize a more expressive global network without additional supervision or data pooling, and with as few as a single communication round. We demonstrate the efficacy of our approach on federated learning problems simulated from two popular image classification datasets.
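To convey the core mechanic, the sketch below matches hidden units across networks and averages the matched weights. A Hungarian assignment on weight-vector distances stands in for the paper's Bayesian nonparametric matching, and all names here are ours.

```python
# Illustrative neuron matching for federated model fusion. The Hungarian
# assignment on Euclidean distances is a stand-in for the Bayesian
# nonparametric matching; averaging matched units builds the global network.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_and_merge(global_W, local_W):
    """global_W, local_W: (hidden_units, inputs) weight matrices."""
    cost = np.linalg.norm(global_W[:, None, :] - local_W[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)   # match global unit i to local unit j
    merged = global_W.copy()
    merged[rows] = 0.5 * (global_W[rows] + local_W[cols])  # average matched units
    return merged
```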