Clustering in Partially Labeled Stochastic Block Models via Total Variation Minimization

62 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Alexander Jung

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Alexander Jung

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

A main task in data analysis is to organize data points into coherent groups or clusters. The stochastic block model is a probabilistic model for the cluster structure. This model prescribes different probabilities for the presence of edges within a cluster and between different clusters. We assume that the cluster assignments are known for at least one data point in each cluster. In such a partially labeled stochastic block model, clustering amounts to estimating the cluster assignments of the remaining data points. We study total variation minimization as a method for this clustering task. We implement the resulting clustering algorithm as a highly scalable message-passing protocol. We also provide a condition on the model parameters such that total variation minimization allows for accurate clustering.

قيم البحث

137 - Alexander Jung , Alfred O. Hero III , Alexandru Mara 2019

We propose and analyze a method for semi-supervised learning from partially-labeled network-structured data. Our approach is based on a graph signal recovery interpretation under a clustering hypothesis that labels of data points belonging to the sam e well-connected subset (cluster) are similar valued. This lends naturally to learning the labels by total variation (TV) minimization, which we solve by applying a recently proposed primal-dual method for non-smooth convex optimization. The resulting algorithm allows for a highly scalable implementation using message passing over the underlying empirical graph, which renders the algorithm suitable for big data applications. By applying tools of compressed sensing, we derive a sufficient condition on the underlying network structure such that TV minimization recovers clusters in the empirical graph of the data. In particular, we show that the proposed primal-dual method amounts to maximizing network flows over the empirical graph of the dataset. Moreover, the learning accuracy of the proposed algorithm is linked to the set of network flows between data points having known labels. The effectiveness and scalability of our approach is verified by numerical experiments.

التعلم الآلي التعلم الالي

Classifying Partially Labeled Networked Data via Logistic Network Lasso

86 - Nguyen Tran , Henrik Ambos , Alexander Jung 2019

We apply the network Lasso to classify partially labeled data points which are characterized by high-dimensional feature vectors. In order to learn an accurate classifier from limited amounts of labeled data, we borrow statistical strength, via an in trinsic network structure, across the dataset. The resulting logistic network Lasso amounts to a regularized empirical risk minimization problem using the total variation of a classifier as a regularizer. This minimization problem is a non-smooth convex optimization problem which we solve using a primal-dual splitting method. This method is appealing for big data applications as it can be implemented as a highly scalable message passing algorithm.

التعلم الآلي التعلم الالي

Sample Complexity of Total Variation Minimization

74 - Sajad Daei , Farzan Haddadi , Arash Amini 2018

This work considers the use of Total variation (TV) minimization in the recovery of a given gradient sparse vector from Gaussian linear measurements. It has been shown in recent studies that there exist a sharp phase transition behavior in TV minimiz ation in asymptotic regimes. The phase transition curve specifies the boundary of success and failure of TV minimization for large number of measurements. It is a challenging task to obtain a theoretical bound that reflects this curve. In this work, we present a novel upper-bound that suitably approximates this curve and is asymptotically sharp. Numerical results show that our bound is closer to the empirical TV phase transition curve than the previously known bound obtained by Kabanava.

نظرية المعلومات نظرية المعلومات

Precise Phase Transition of Total Variation Minimization

82 - Bingwen Zhang , Weiyu Xu , Jian-Feng Cai 2015

Characterizing the phase transitions of convex optimizations in recovering structured signals or data is of central importance in compressed sensing, machine learning and statistics. The phase transitions of many convex optimization signal recovery m ethods such as $ell_1$ minimization and nuclear norm minimization are well understood through recent years research. However, rigorously characterizing the phase transition of total variation (TV) minimization in recovering sparse-gradient signal is still open. In this paper, we fully characterize the phase transition curve of the TV minimization. Our proof builds on Donoho, Johnstone and Montanaris conjectured phase transition curve for the TV approximate message passing algorithm (AMP), together with the linkage between the minmax Mean Square Error of a denoising problem and the high-dimensional convex geometry for TV minimization.

نظرية المعلومات التعلم الآلي نظرية المعلومات

Characterizing Structural Regularities of Labeled Data in Overparameterized Models

47 - Ziheng Jiang , Chiyuan Zhang , Kunal Talwar 2020

Humans are accustomed to environments that contain both regularities and exceptions. For example, at most gas stations, one pays prior to pumping, but the occasional rural station does not accept payment in advance. Likewise, deep neural networks can generalize across instances that share common patterns or structures, yet have the capacity to memorize rare or irregular forms. We analyze how individual instances are treated by a model via a consistency score. The score characterizes the expected accuracy for a held-out instance given training sets of varying size sampled from the data distribution. We obtain empirical estimates of this score for individual instances in multiple data sets, and we show that the score identifies out-of-distribution and mislabeled examples at one end of the continuum and strongly regular examples at the other end. We identify computationally inexpensive proxies to the consistency score using statistics collected during training. We show examples of potential applications to the analysis of deep-learning systems.

التعلم الآلي التعلم الالي