Robust Learning of Mixtures of Gaussians

82 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Daniel Kane

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Daniel M. Kane

بنى وهياكل البيانات والخوارزميات التعلم الآلي نظرية الإحصاء

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We resolve one of the major outstanding problems in robust statistics. In particular, if $X$ is an evenly weighted mixture of two arbitrary $d$-dimensional Gaussians, we devise a polynomial time algorithm that given access to samples from $X$ an $eps$-fraction of which have been adversarially corrupted, learns $X$ to error $poly(eps)$ in total variation distance.

قيم البحث

88 - Konstantin Makarychev , Yury Makarychev , Aravindan Vijayaraghavan 2015

We study the problem of learning communities in the presence of modeling errors and give robust recovery algorithms for the Stochastic Block Model (SBM). This model, which is also known as the Planted Partition Model, is widely used for community det ection and graph partitioning in various fields, including machine learning, statistics, and social sciences. Many algorithms exist for learning communities in the Stochastic Block Model, but they do not work well in the presence of errors. In this paper, we initiate the study of robust algorithms for partial recovery in SBM with modeling errors or noise. We consider graphs generated according to the Stochastic Block Model and then modified by an adversary. We allow two types of adversarial errors, Feige---Kilian or monotone errors, and edge outlier errors. Mossel, Neeman and Sly (STOC 2015) posed an open question about whether an almost exact recovery is possible when the adversary is allowed to add $o(n)$ edges. Our work answers this question affirmatively even in the case of $k>2$ communities. We then show that our algorithms work not only when the instances come from SBM, but also work when the instances come from any distribution of graphs that is $epsilon m$ close to SBM in the Kullback---Leibler divergence. This result also works in the presence of adversarial errors. Finally, we present almost tight lower bounds for two communities.

بنى وهياكل البيانات والخوارزميات التعلم الآلي نظرية الإحصاء

Robust Mean Estimation on Highly Incomplete Data with Arbitrary Outliers

162 - Lunjia Hu , Omer Reingold 2020

We study the problem of robustly estimating the mean of a $d$-dimensional distribution given $N$ examples, where most coordinates of every example may be missing and $varepsilon N$ examples may be arbitrarily corrupted. Assuming each coordinate appea rs in a constant factor more than $varepsilon N$ examples, we show algorithms that estimate the mean of the distribution with information-theoretically optimal dimension-independent error guarantees in nearly-linear time $widetilde O(Nd)$. Our results extend recent work on computationally-efficient robust estimation to a more widely applicable incomplete-data setting.

بنى وهياكل البيانات والخوارزميات التعلم الآلي نظرية الإحصاء

Adaptive Learning with Robust Generalization Guarantees

160 - Rachel Cummings , Katrina Ligett , Kobbi Nissim 2016

The traditional notion of generalization---i.e., learning a hypothesis whose empirical error is close to its true error---is surprisingly brittle. As has recently been noted in [DFH+15b], even if several algorithms have this guarantee in isolation, t he guarantee need not hold if the algorithms are composed adaptively. In this paper, we study three notions of generalization---increasing in strength---that are robust to postprocessing and amenable to adaptive composition, and examine the relationships between them. We call the weakest such notion Robust Generalization. A second, intermediate, notion is the stability guarantee known as differential privacy. The strongest guarantee we consider we call Perfect Generalization. We prove that every hypothesis class that is PAC learnable is also PAC learnable in a robustly generalizing fashion, with almost the same sample complexity. It was previously known that differentially private algorithms satisfy robust generalization. In this paper, we show that robust generalization is a strictly weaker concept, and that there is a learning task that can be carried out subject to robust generalization guarantees, yet cannot be carried out subject to differential privacy. We also show that perfect generalization is a strictly stronger guarantee than differential privacy, but that, nevertheless, many learning tasks can be carried out subject to the guarantees of perfect generalization.

بنى وهياكل البيانات والخوارزميات التعلم الآلي

Recent Advances in Algorithmic High-Dimensional Robust Statistics

118 - Ilias Diakonikolas , Daniel M. Kane 2019

Learning in the presence of outliers is a fundamental problem in statistics. Until recently, all known efficient unsupervised learning algorithms were very sensitive to outliers in high dimensions. In particular, even for the task of robust mean esti mation under natural distributional assumptions, no efficient algorithm was known. Recent work in theoretical computer science gave the first efficient robust estimators for a number of fundamental statistical tasks, including mean and covariance estimation. Since then, there has been a flurry of research activity on algorithmic high-dimensional robust estimation in a range of settings. In this survey article, we introduce the core ideas and algorithmic techniques in the emerging area of algorithmic high-dimensional robust statistics with a focus on robust mean estimation. We also provide an overview of the approaches that have led to computationally efficient robust estimators for a range of broader statistical tasks and discuss new directions and opportunities for future work.

بنى وهياكل البيانات والخوارزميات التعقيد الحسابي نظرية الإحصاء

Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting and Regularization

132 - Ke Wang , Christos Thrampoulidis 2020

Deep neural networks generalize well despite being exceedingly overparameterized and being trained without explicit regularization. This curious phenomenon has inspired extensive research activity in establishing its statistical principles: Under wha t conditions is it observed? How do these depend on the data and on the training algorithm? When does regularization benefit generalization? While such questions remain wide open for deep neural nets, recent works have attempted gaining insights by studying simpler, often linear, models. Our paper contributes to this growing line of work by examining binary linear classification under a generative Gaussian mixture model. Motivated by recent results on the implicit bias of gradient descent, we study both max-margin SVM classifiers (corresponding to logistic loss) and min-norm interpolating classifiers (corresponding to least-squares loss). First, we leverage an idea introduced in [V. Muthukumar et al., arXiv:2005.08054, (2020)] to relate the SVM solution to the min-norm interpolating solution. Second, we derive novel non-asymptotic bounds on the classification error of the latter. Combining the two, we present novel sufficient conditions on the covariance spectrum and on the signal-to-noise ratio (SNR) under which interpolating estimators achieve asymptotically optimal performance as overparameterization increases. Interestingly, our results extend to a noisy model with constant probability noise flips. Contrary to previously studied discriminative data models, our results emphasize the crucial role of the SNR and its interplay with the data covariance. Finally, via a combination of analytical arguments and numerical demonstrations we identify conditions under which the interpolating estimator performs better than corresponding regularized estimates.

التعلم الالي التعلم الآلي نظرية الإحصاء