Learning Mixtures of Linear Classifiers

176 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Yuekai Sun

تاريخ النشر 2013

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Yuekai Sun - Stratis Ioannidis - Andrea Montanari

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We consider a discriminative learning (regression) problem, whereby the regression function is a convex combination of k linear classifiers. Existing approaches are based on the EM algorithm, or similar techniques, without provable guarantees. We develop a simple method based on spectral techniques and a `mirroring trick, that discovers the subspace spanned by the classifiers parameter vectors. Under a probabilistic assumption on the feature vector distribution, we prove that this approach has nearly optimal statistical efficiency.

قيم البحث

156 - Puoya Tabaghi , Eli Chien , Chao Pan 2021

Embedding methods for product spaces are powerful techniques for low-distortion and low-dimensional representation of complex data structures. Nevertheless, little is known regarding downstream learning and optimization problems in such spaces. Here, we address the problem of linear classification in a product space form -- a mix of Euclidean, spherical, and hyperbolic spaces. First, we describe new formulations for linear classifiers on a Riemannian manifold using geodesics and Riemannian metrics which generalize straight lines and inner products in vector spaces, respectively. Second, we prove that linear classifiers in $d$-dimensional space forms of any curvature have the same expressive power, i.e., they can shatter exactly $d+1$ points. Third, we formalize linear classifiers in product space forms, describe the first corresponding perceptron and SVM classification algorithms, and establish rigorous convergence results for the former. We support our theoretical findings with simulation results on several datasets, including synthetic data, CIFAR-100, MNIST, Omniglot, and single-cell RNA sequencing data. The results show that learning methods applied to small-dimensional embeddings in product space forms outperform their algorithmic counterparts in each space form.

التعلم الآلي التعلم الالي

Online Active Learning of Reject Option Classifiers

334 - Kulin Shah , Naresh Manwani 2019

Active learning is an important technique to reduce the number of labeled examples in supervised learning. Active learning for binary classification has been well addressed in machine learning. However, active learning of the reject option classifier remains unaddressed. In this paper, we propose novel algorithms for active learning of reject option classifiers. We develop an active learning algorithm using double ramp loss function. We provide mistake bounds for this algorithm. We also propose a new loss function called double sigmoid loss function for reject option and corresponding active learning algorithm. We offer a convergence guarantee for this algorithm. We provide extensive experimental results to show the effectiveness of the proposed algorithms. The proposed algorithms efficiently reduce the number of label examples required.

التعلم الآلي التعلم الالي

The Power of Comparisons for Actively Learning Linear Classifiers

48 - Max Hopkins , Daniel M. Kane , Shachar Lovett 2019

In the world of big data, large but costly to label datasets dominate many fields. Active learning, a semi-supervised alternative to the standard PAC-learning model, was introduced to explore whether adaptive labeling could learn concepts with expone ntially fewer labeled samples. While previous results show that active learning performs no better than its supervised alternative for important concept classes such as linear separators, we show that by adding weak distributional assumptions and allowing comparison queries, active learning requires exponentially fewer samples. Further, we show that these results hold as well for a stronger model of learning called Reliable and Probably Useful (RPU) learning. In this model, our learner is not allowed to make mistakes, but may instead answer I dont know. While previous negative results showed this model to have intractably large sample complexity for label queries, we show that comparison queries make RPU-learning at worst logarithmically more expensive in both the passive and active regimes.

التعلم الآلي الهندسة الحسابية التعلم الالي

Aggregated Learning: A Vector-Quantization Approach to Learning Neural Network Classifiers

83 - Masoumeh Soflaei , Hongyu Guo , Ali Al-Bashabsheh 2020

We consider the problem of learning a neural network classifier. Under the information bottleneck (IB) principle, we associate with this classification problem a representation learning problem, which we call IB learning. We show that IB learning is, in fact, equivalent to a special class of the quantization problem. The classical results in rate-distortion theory then suggest that IB learning can benefit from a vector quantization approach, namely, simultaneously learning the representations of multiple input objects. Such an approach assisted with some variational techniques, result in a novel learning framework, Aggregated Learning, for classification with neural network models. In this framework, several objects are jointly classified by a single neural network. The effectiveness of this framework is verified through extensive experiments on standard image recognition and text classification tasks.

التعلم الآلي التعلم الالي

Learning Optimal Linear Regularizers

83 - Matthew Streeter 2019

We present algorithms for efficiently learning regularizers that improve generalization. Our approach is based on the insight that regularizers can be viewed as upper bounds on the generalization gap, and that reducing the slack in the bound can impr ove performance on test data. For a broad class of regularizers, the hyperparameters that give the best upper bound can be computed using linear programming. Under certain Bayesian assumptions, solving the LP lets us jump to the optimal hyperparameters given very limited data. This suggests a natural algorithm for tuning regularization hyperparameters, which we show to be effective on both real and synthetic data.

التعلم الآلي التعلم الالي