Impossibility of Partial Recovery in the Graph Alignment Problem

67 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Luca Ganassali

تاريخ النشر 2021

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Luca Ganassali - Laurent Massoulie - Marc Lelarge

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Random graph alignment refers to recovering the underlying vertex correspondence between two random graphs with correlated edges. This can be viewed as an average-case and noisy version of the well-known graph isomorphism problem. For the correlated Erdos-Renyi model, we prove an impossibility result for partial recovery in the sparse regime, with constant average degree and correlation, as well as a general bound on the maximal reachable overlap. Our bound is tight in the noiseless case (the graph isomorphism problem) and we conjecture that it is still tight with noise. Our proof technique relies on a careful application of the probabilistic method to build automorphisms between tree components of a subcritical Erdos-Renyi graph.

قيم البحث

59 - Luca Ganassali , Laurent Massoulie , Marc Lelarge 2021

We consider alignment of sparse graphs, which consists in finding a mapping between the nodes of two graphs which preserves most of the edges. Our approach is to compare local structures in the two graphs, matching two nodes if their neighborhoods ar e close enough: for correlated ErdH{o}s-Renyi random graphs, this problem can be locally rephrased in terms of testing whether a pair of branching trees is drawn from either a product distribution, or a correlated distribution. We design an optimal test for this problem which gives rise to a message-passing algorithm for graph alignment, which provably returns in polynomial time a positive fraction of correctly matched vertices, and a vanishing fraction of mismatches. With an average degree $lambda = O(1)$ in the graphs, and a correlation parameter $s in [0,1]$, this result holds with $lambda s$ large enough, and $1-s$ small enough, completing the recent state-of-the-art diagram. Tighter conditions for determining whether partial graph alignment (or correlation detection in trees) is feasible in polynomial time are given in terms of Kullback-Leibler divergences.

بنى وهياكل البيانات والخوارزميات التعلم الآلي الاحتمالات

Community Recovery in a Preferential Attachment Graph

309 - Bruce Hajek , Suryanarayana Sankagiri 2018

A message passing algorithm is derived for recovering communities within a graph generated by a variation of the Barab{a}si-Albert preferential attachment model. The estimator is assumed to know the arrival times, or order of attachment, of the verti ces. The derivation of the algorithm is based on belief propagation under an independence assumption. Two precursors to the message passing algorithm are analyzed: the first is a degree thresholding (DT) algorithm and the second is an algorithm based on the arrival times of the children (C) of a given vertex, where the children of a given vertex are the vertices that attached to it. Comparison of the performance of the algorithms shows it is beneficial to know the arrival times, not just the number, of the children. The probability of correct classification of a vertex is asymptotically determined by the fraction of vertices arriving before it. Two extensions of Algorithm C are given: the first is based on joint likelihood of the children of a fixed set of vertices; it can sometimes be used to seed the message passing algorithm. The second is the message passing algorithm. Simulation results are given.

التعلم الالي الشبكات الاجتماعية والمعلومات الاحتمالات

Thresholded Adaptive Validation: Tuning the Graphical Lasso for Graph Recovery

104 - Mike Laszkiewicz , Asja Fischer , Johannes Lederer 2020

Many Machine Learning algorithms are formulated as regularized optimization problems, but their performance hinges on a regularization parameter that needs to be calibrated to each application at hand. In this paper, we propose a general calibration scheme for regularized optimization problems and apply it to the graphical lasso, which is a method for Gaussian graphical modeling. The scheme is equipped with theoretical guarantees and motivates a thresholding pipeline that can improve graph recovery. Moreover, requiring at most one line search over the regularization path, the calibration scheme is computationally more efficient than competing schemes that are based on resampling. Finally, we show in simulations that our approach can improve on the graph recovery of other approaches considerably.

التعلم الالي التعلم الآلي المنهجية

Towards Optimal Problem Dependent Generalization Error Bounds in Statistical Learning Theory

120 - Yunbei Xu , Assaf Zeevi 2020

We study problem-dependent rates, i.e., generalization errors that scale near-optimally with the variance, the effective loss, or the gradient norms evaluated at the best hypothesis. We introduce a principled framework dubbed uniform localized conver gence, and characterize sharp problem-dependent rates for central statistical learning problems. From a methodological viewpoint, our framework resolves several fundamental limitations of existing uniform convergence and localization analysis approaches. It also provides improvements and some level of unification in the study of localized complexities, one-sided uniform inequalities, and sample-based iterative algorithms. In the so-called slow rate regime, we provides the first (moment-penalized) estimator that achieves the optimal variance-dependent rate for general rich classes; we also establish improved loss-dependent rate for standard empirical risk minimization. In the fast rate regime, we establish finite-sample problem-dependent bounds that are comparable to precise asymptotics. In addition, we show that iterative algorithms like gradient descent and first-order Expectation-Maximization can achieve optimal generalization error in several representative problems across the areas of non-convex learning, stochastic optimization, and learning with missing data.

التعلم الالي التعلم الآلي الاحتمالات

Recovery thresholds in the sparse planted matching problem

373 - Guilhem Semerjian , Gabriele Sicuro , Lenka Zdeborova 2020

We consider the statistical inference problem of recovering an unknown perfect matching, hidden in a weighted random graph, by exploiting the information arising from the use of two different distributions for the weights on the edges inside and outs ide the planted matching. A recent work has demonstrated the existence of a phase transition, in the large size limit, between a full and a partial recovery phase for a specific form of the weights distribution on fully connected graphs. We generalize and extend this result in two directions: we obtain a criterion for the location of the phase transition for generic weights distributions and possibly sparse graphs, exploiting a technical connection with branching random walk processes, as well as a quantitatively more precise description of the critical regime around the phase transition.

الأنظمة المضطربة والشبكات العصبية الرياضيات المتقطعة الاحتمالات