We prove a emph{query complexity} lower bound for approximating the top $r$ dimensional eigenspace of a matrix. We consider an oracle model where, given a symmetric matrix $mathbf{M} in mathbb{R}^{d times d}$, an algorithm $mathsf{Alg}$ is allowed to make $mathsf{T}$ exact queries of the form $mathsf{w}^{(i)} = mathbf{M} mathsf{v}^{(i)}$ for $i$ in ${1,...,mathsf{T}}$, where $mathsf{v}^{(i)}$ is drawn from a distribution which depends arbitrarily on the past queries and measurements ${mathsf{v}^{(j)},mathsf{w}^{(i)}}_{1 le j le i-1}$. We show that for every $mathtt{gap} in (0,1/2]$, there exists a distribution over matrices $mathbf{M}$ for which 1) $mathrm{gap}_r(mathbf{M}) = Omega(mathtt{gap})$ (where $mathrm{gap}_r(mathbf{M})$ is the normalized gap between the $r$ and $r+1$-st largest-magnitude eigenvector of $mathbf{M}$), and 2) any algorithm $mathsf{Alg}$ which takes fewer than $mathrm{const} times frac{r log d}{sqrt{mathtt{gap}}}$ queries fails (with overwhelming probability) to identity a matrix $widehat{mathsf{V}} in mathbb{R}^{d times r}$ with orthonormal columns for which $langle widehat{mathsf{V}}, mathbf{M} widehat{mathsf{V}}rangle ge (1 - mathrm{const} times mathtt{gap})sum_{i=1}^r lambda_i(mathbf{M})$. Our bound requires only that $d$ is a small polynomial in $1/mathtt{gap}$ and $r$, and matches the upper bounds of Musco and Musco 15. Moreover, it establishes a strict separation between convex optimization and emph{randomized}, strict-saddle non-convex optimization of which PCA is a canonical example: in the former, first-order methods can have dimension-free iteration complexity, whereas in PCA, the iteration complexity of gradient-based methods must necessarily grow with the dimension.

التعلم الآلي بنى وهياكل البيانات والخوارزميات نظرية المعلومات

On Pairwise Clustering with Side Information

120 - Stephen Pasteris , Fabio Vitale , Claudio Gentile 2017

Pairwise clustering, in general, partitions a set of items via a known similarity function. In our treatment, clustering is modeled as a transductive prediction problem. Thus rather than beginning with a known similarity function, the function instea d is hidden and the learner only receives a random sample consisting of a subset of the pairwise similarities. An additional set of pairwise side-information may be given to the learner, which then determines the inductive bias of our algorithms. We measure performance not based on the recovery of the hidden similarity function, but instead on how well we classify each item. We give tight bounds on the number of misclassifications. We provide two algorithms. The first algorithm SACA is a simple agglomerative clustering algorithm which runs in near linear time, and which serves as a baseline for our analyses. Whereas the second algorithm, RGCA, enables the incorporation of side-information which may lead to improved bounds at the cost of a longer running time.

التعلم الآلي

Reliable Distributed Clustering with Redundant Data Assignment

69 - Venkata Gandikota , Arya Mazumdar , Ankit Singh Rawat 2020

In this paper, we present distributed generalized clustering algorithms that can handle large scale data across multiple machines in spite of straggling or unreliable machines. We propose a novel data assignment scheme that enables us to obtain globa l information about the entire data even when some machines fail to respond with the results of the assigned local computations. The assignment scheme leads to distributed algorithms with good approximation guarantees for a variety of clustering and dimensionality reduction problems.

النظم الموزعة والتوازية والحوسبة العنقودية بنى وهياكل البيانات والخوارزميات نظرية المعلومات

Attribute-Efficient Learning of Halfspaces with Malicious Noise: Near-Optimal Label Complexity and Noise Tolerance

69 - Jie Shen , Chicheng Zhang 2020

This paper is concerned with computationally efficient learning of homogeneous sparse halfspaces in $mathbb{R}^d$ under noise. Though recent works have established attribute-efficient learning algorithms under various types of label noise (e.g. bound ed noise), it remains an open question when and how $s$-sparse halfspaces can be efficiently learned under the challenging malicious noise model, where an adversary may corrupt both the unlabeled examples and the labels. We answer this question in the affirmative by designing a computationally efficient active learning algorithm with near-optimal label complexity of $tilde{O}big({s log^4 frac d epsilon} big)$ and noise tolerance $eta = Omega(epsilon)$, where $epsilon in (0, 1)$ is the target error rate, under the assumption that the distribution over (uncorrupted) unlabeled examples is isotropic log-concave. Our algorithm can be straightforwardly tailored to the passive learning setting, and we show that the sample complexity is $tilde{O}big({frac 1 epsilon s^2 log^5 d} big)$ which also enjoys the attribute efficiency. Our main techniques include attribute-efficient paradigms for instance reweighting and for empirical risk minimization, and a new analysis of uniform concentration for unbounded data -- all of them crucially take the structure of the underlying halfspace into account.

التعلم الالي بنى وهياكل البيانات والخوارزميات التعلم الآلي

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

المعهد الوطني لإدارة الأعمال

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Query Complexity of Clustering with Side Information

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً