Do you want to publish a course? Click here

Supervised classification via minimax probabilistic transformations

400   0   0.0 ( 0 )
 Added by Santiago Mazuelas
 Publication date 2019
and research's language is English




Ask ChatGPT about the research

Conventional techniques for supervised classification constrain the classification rules considered and use surrogate losses for classification 0-1 loss. Favored families of classification rules are those that enjoy parametric representations suitable for surrogate loss minimization, and low complexity properties suitable for overfitting control. This paper presents classification techniques based on robust risk minimization (RRM) that we call linear probabilistic classifiers (LPCs). The proposed techniques consider unconstrained classification rules, optimize the classification 0-1 loss, and provide performance bounds during learning. LPCs enable efficient learning by using linear optimization, and avoid overffiting by using RRM over polyhedral uncertainty sets of distributions. We also provide finite-sample generalization bounds for LPCs and show their competitive performance with state-of-the-art techniques using benchmark datasets.



rate research

Read More

Different types of training data have led to numerous schemes for supervised classification. Current learning techniques are tailored to one specific scheme and cannot handle general ensembles of training data. This paper presents a unifying framework for supervised classification with general ensembles of training data, and proposes the learning methodology of generalized robust risk minimization (GRRM). The paper shows how current and novel supervision schemes can be addressed under the proposed framework by representing the relationship between examples at test and training via probabilistic transformations. The results show that GRRM can handle different types of training data in a unified manner, and enable new supervision schemes that aggregate general ensembles of training data.
Supervised classification techniques use training samples to find classification rules with small expected 0-1 loss. Conventional methods achieve efficient learning and out-of-sample generalization by minimizing surrogate losses over specific families of rules. This paper presents minimax risk classifiers (MRCs) that do not rely on a choice of surrogate loss and family of rules. MRCs achieve efficient learning and out-of-sample generalization by minimizing worst-case expected 0-1 loss w.r.t. uncertainty sets that are defined by linear constraints and include the true underlying distribution. In addition, MRCs learning stage provides performance guarantees as lower and upper tight bounds for expected 0-1 loss. We also present MRCs finite-sample generalization bounds in terms of training size and smallest minimax risk, and show their competitive classification performance w.r.t. state-of-the-art techniques using benchmark datasets.
107 - Farzan Farnia , David Tse 2016
Given a task of predicting $Y$ from $X$, a loss function $L$, and a set of probability distributions $Gamma$ on $(X,Y)$, what is the optimal decision rule minimizing the worst-case expected loss over $Gamma$? In this paper, we address this question by introducing a generalization of the principle of maximum entropy. Applying this principle to sets of distributions with marginal on $X$ constrained to be the empirical marginal from the data, we develop a general minimax approach for supervised learning problems. While for some loss functions such as squared-error and log loss, the minimax approach rederives well-knwon regression models, for the 0-1 loss it results in a new linear classifier which we call the maximum entropy machine. The maximum entropy machine minimizes the worst-case 0-1 loss over the structured set of distribution, and by our numerical experiments can outperform other well-known linear classifiers such as SVM. We also prove a bound on the generalization worst-case error in the minimax approach.
Training a classifier over a large number of classes, known as extreme classification, has become a topic of major interest with applications in technology, science, and e-commerce. Traditional softmax regression induces a gradient cost proportional to the number of classes $C$, which often is prohibitively expensive. A popular scalable softmax approximation relies on uniform negative sampling, which suffers from slow convergence due a poor signal-to-noise ratio. In this paper, we propose a simple training method for drastically enhancing the gradient signal by drawing negative samples from an adversarial model that mimics the data distribution. Our contributions are three-fold: (i) an adversarial sampling mechanism that produces negative samples at a cost only logarithmic in $C$, thus still resulting in cheap gradient updates; (ii) a mathematical proof that this adversarial sampling minimizes the gradient variance while any bias due to non-uniform sampling can be removed; (iii) experimental results on large scale data sets that show a reduction of the training time by an order of magnitude relative to several competitive baselines.
We exploit a recently derived inversion scheme for arbitrary deep neural networks to develop a new semi-supervised learning framework that applies to a wide range of systems and problems. The approach outperforms current state-of-the-art methods on MNIST reaching $99.14%$ of test set accuracy while using $5$ labeled examples per class. Experiments with one-dimensional signals highlight the generality of the method. Importantly, our approach is simple, efficient, and requires no change in the deep network architecture.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا