Cost Sensitive Learning in the Presence of Symmetric Label Noise

Publication date: 2019

Fields: Informatics Engineering, Mathematical Statistics

Language: English

Authors: Sandhya Tripathi, N. Hemachandra

Machine Learning Machine Learning

In the binary classification framework, we are interested in making cost sensitive label predictions in the presence of uniform/symmetric label noise. We first observe that $0$-$1$ Bayes classifiers are not (uniform) noise robust in the cost sensitive setting. To circumvent this impossibility result, we present two schemes; unlike existing methods, our schemes do not require the noise rate. The first uses an $\alpha$-weighted $\gamma$-uneven margin squared loss function, $l_{\alpha, usq}$, which can handle cost sensitivity arising from a domain requirement (using a user-given $\alpha$), from class imbalance (by tuning $\gamma$), or from both. However, we observe that $l_{\alpha, usq}$ Bayes classifiers are also not cost sensitive and noise robust. We show that regularized ERM of this loss function over the class of linear classifiers yields a cost sensitive, uniform noise robust classifier as the solution of a system of linear equations. We also provide a performance bound for this classifier. The second scheme is a re-sampling based scheme that exploits the special structure of the uniform noise model and uses estimates of the in-class probability $\eta$. Our computational experiments on some UCI datasets with class imbalance show that the classifiers from our two schemes are on par with existing methods, and in some cases better, w.r.t. Accuracy and Arithmetic Mean, without using or tuning the noise rate. We also consider other cost sensitive performance measures, viz., F measure and Weighted Cost, for evaluation. As our re-sampling scheme requires estimates of $\eta$, we provide a detailed comparative study of various $\eta$ estimation methods on synthetic datasets w.r.t. half a dozen evaluation criteria. We also provide an interpretation of the cost parameters $\alpha$ and $\gamma$ using different synthetic data experiments.
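The non-robustness observation can be illustrated with a small numerical sketch. It is not the paper's construction. It assumes the standard symmetric-noise relation $\tilde{\eta}(x) = (1 - 2\rho)\,\eta(x) + \rho$ for a flip rate $\rho < 1/2$, and it assumes the cost sensitive Bayes rule thresholds $\eta$ at $\alpha$. The names `noisy_posterior`, `noisy_threshold`, `rho`, and the grid of $\eta$ values are hypothetical and chosen only for illustration.

```python
import numpy as np

def noisy_posterior(eta, rho):
    """Noisy in-class probability under symmetric label noise (assumes rho < 0.5)."""
    return (1.0 - 2.0 * rho) * eta + rho

def noisy_threshold(alpha, rho):
    """Threshold on the noisy posterior that reproduces the clean rule eta > alpha."""
    return (1.0 - 2.0 * rho) * alpha + rho

# Hypothetical values: a grid of clean posteriors that avoids the thresholds themselves.
eta = np.linspace(0.005, 0.995, 100)
rho, alpha = 0.3, 0.25
eta_noisy = noisy_posterior(eta, rho)

# 0-1 loss (threshold 1/2): the decision is unchanged by symmetric noise.
same_01 = np.array_equal(eta > 0.5, eta_noisy > 0.5)

# Cost sensitive rule (threshold alpha != 1/2): reusing alpha on noisy data changes decisions.
naive_matches = np.array_equal(eta > alpha, eta_noisy > alpha)

# Matching the clean rule requires a threshold that depends on rho.
corrected_matches = np.array_equal(eta > alpha, eta_noisy > noisy_threshold(alpha, rho))

print(same_01, naive_matches, corrected_matches)
```

In this sketch the corrected threshold depends on $\rho$, which is why schemes that avoid using the noise rate are of interest.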

Cost-Sensitive Reference Pair Encoding for Multi-Label Learning

Yao-Yuan Yang, Kuan-Hao Huang, Chih-Wei Chang (2016)

Label space expansion for multi-label classification (MLC) is a methodology that encodes the original label vectors to higher dimensional codes before training and decodes the predicted codes back to the label vectors during testing. The methodology has been demonstrated to improve the performance of MLC algorithms when coupled with off-the-shelf error-correcting codes for encoding and decoding. Nevertheless, such a coding scheme can be complicated to implement, and cannot easily satisfy a common application need of cost-sensitive MLC, namely adapting to different evaluation criteria of interest. In this work, we show that a simpler coding scheme based on the concept of a reference pair of label vectors achieves cost-sensitivity more naturally. In particular, our proposed cost-sensitive reference pair encoding (CSRPE) algorithm contains cluster-based encoding, weight-based training and voting-based decoding steps, all utilizing the cost information. Furthermore, we leverage the cost information embedded in the code space of CSRPE to propose a novel active learning algorithm for cost-sensitive MLC. Extensive experimental results verify that CSRPE performs better than state-of-the-art algorithms across different MLC criteria. The results also demonstrate that the CSRPE-backed active learning algorithm is superior to existing algorithms for active MLC, and further justify the usefulness of CSRPE.

TrustNet: Learning from Trusted Data Against (A)symmetric Label Noise

Amirmasoud Ghiassi, Taraneh Younesian, Robert Birke (2020)

Robustness to label noise is a critical property for weakly-supervised classifiers trained on massive datasets. In this paper, we first derive an analytical bound for any given noise pattern. Based on these insights, we design TrustNet, which first adversely learns the pattern of noise corruption, whether symmetric or asymmetric, from a small set of trusted data. Then, TrustNet is trained via a robust loss function, which weights the given labels against the labels inferred from the learned noise pattern. The weight is adjusted based on model uncertainty across training epochs. We evaluate TrustNet on synthetic label noise for CIFAR-10 and CIFAR-100, and on real-world data with label noise, i.e., Clothing1M. We compare against state-of-the-art methods, demonstrating the strong robustness of TrustNet under a diverse set of noise patterns.

Active Learning for Cost-Sensitive Classification

Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang (2017)

We design an active learning algorithm for cost-sensitive multiclass classification: problems where different errors have different costs. Our algorithm, COAL, makes predictions by regressing to each label's cost and predicting the label with the smallest predicted cost. On a new example, it uses a set of regressors that perform well on past data to estimate possible costs for each label. It queries only the labels that could be the best, ignoring the sure losers. We prove COAL can be efficiently implemented for any regression family that admits squared loss optimization; it also enjoys strong guarantees with respect to predictive performance and labeling effort. We empirically compare COAL to passive learning and several active learning baselines, showing significant improvements in labeling effort and test cost on real-world datasets.
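The prediction step described above (regress to each label's cost, then predict the cheapest label) can be sketched as follows. This is only the passive prediction rule, not COAL's query strategy or its regressor-set construction. The linear least-squares regressors, the small ridge term, the helper names `fit_cost_regressors` and `predict_min_cost`, and the toy costs are all assumptions made for illustration.

```python
import numpy as np

def add_bias(X):
    """Append a constant column so each regressor has an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_cost_regressors(X, C, ridge=1e-8):
    """Fit one linear squared-loss regressor per label.

    X: (n, d) features; C: (n, K) observed costs, one column per label.
    Returns W of shape (d + 1, K); column k predicts the cost of label k.
    """
    Xb = add_bias(X)
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ C)

def predict_min_cost(W, X):
    """Predict, for each example, the label with the smallest predicted cost."""
    return np.argmin(add_bias(X) @ W, axis=1)

# Hypothetical toy data: label 0 costs x, label 1 costs 1 - x.
x = np.linspace(0.0, 1.0, 20)
X = x.reshape(-1, 1)
C = np.column_stack([x, 1.0 - x])

W = fit_cost_regressors(X, C)
pred = predict_min_cost(W, X)
print(pred)
```

On this toy data label 0 is cheaper whenever x is below 0.5, so the first half of the grid is assigned label 0 and the second half label 1.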

Are Anchor Points Really Indispensable in Label-Noise Learning?

Xiaobo Xia, Tongliang Liu, Nannan Wang (2019)

In label-noise learning, the noise transition matrix, denoting the probabilities that clean labels flip into noisy labels, plays a central role in building statistically consistent classifiers. Existing theories have shown that the transition matrix can be learned by exploiting anchor points (i.e., data points that belong to a specific class almost surely). However, when there are no anchor points, the transition matrix will be poorly learned, and those current consistent classifiers will significantly degenerate. In this paper, without employing anchor points, we propose a transition-revision ($T$-Revision) method to effectively learn transition matrices, leading to better classifiers. Specifically, to learn a transition matrix, we first initialize it by exploiting data points that are similar to anchor points, having high noisy class posterior probabilities. Then, we modify the initialized matrix by adding a slack variable, which can be learned and validated together with the classifier by using noisy data. Empirical results on benchmark-simulated and real-world label-noise datasets demonstrate that without using exact anchor points, the proposed method is superior to the state-of-the-art label-noise learning methods.
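To make the role of the transition matrix concrete, here is a minimal sketch of how it links clean and noisy class posteriors. It assumes class-conditional noise with $T_{ij} = P(\tilde{y} = j \mid y = i)$, so the noisy posterior is the clean posterior multiplied by $T$. It does not implement $T$-Revision or any estimation of $T$. The symmetric matrix, the flip rate 0.2, and the function names are hypothetical.

```python
import numpy as np

def symmetric_transition_matrix(num_classes, rho):
    """T[i, j] = P(noisy = j | clean = i), with flip mass rho spread evenly off-diagonal."""
    T = np.full((num_classes, num_classes), rho / (num_classes - 1))
    np.fill_diagonal(T, 1.0 - rho)
    return T

def noisy_class_posterior(clean_posterior, T):
    """P(noisy = j | x) = sum_i P(clean = i | x) * T[i, j]."""
    return clean_posterior @ T

T = symmetric_transition_matrix(3, 0.2)
clean = np.array([[0.7, 0.2, 0.1]])      # hypothetical clean posterior for one example
noisy = noisy_class_posterior(clean, T)

# When T is known and invertible, the clean posterior can be recovered from the noisy one.
recovered = noisy @ np.linalg.inv(T)
print(noisy, recovered)
```

The recovery step is only as good as the estimate of $T$, which is why learning the matrix well matters for the resulting classifier.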

Part-dependent Label Noise: Towards Instance-dependent Label Noise

Xiaobo Xia, Tongliang Liu, Bo Han (2020)

Learning with instance-dependent label noise is challenging, because it is hard to model such real-world noise. Note that there is psychological and physiological evidence showing that we humans perceive instances by decomposing them into parts. Annotators are therefore more likely to annotate instances based on the parts rather than the whole instances, where a wrong mapping from parts to classes may cause the instance-dependent label noise. Motivated by this human cognition, in this paper, we approximate the instance-dependent label noise by exploiting part-dependent label noise. Specifically, since instances can be approximately reconstructed by a combination of parts, we approximate the instance-dependent transition matrix for an instance by a combination of the transition matrices for the parts of the instance. The transition matrices for parts can be learned by exploiting anchor points (i.e., data points that belong to a specific class almost surely). Empirical evaluations on synthetic and real-world datasets demonstrate that our method is superior to the state-of-the-art approaches for learning from instance-dependent label noise.
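The combination step can be sketched as a weighted sum of per-part transition matrices. This sketch assumes the weights for an instance are non-negative and sum to one, so the combined matrix stays row-stochastic. How the parts, the weights, and the part matrices are learned is not covered here, and the names and numbers below are hypothetical.

```python
import numpy as np

def instance_transition_matrix(part_weights, part_matrices):
    """Combine part-level transition matrices into one instance-level matrix.

    part_weights: (r,) non-negative weights summing to 1 (an assumption of this sketch).
    part_matrices: (r, K, K) row-stochastic transition matrices, one per part.
    """
    return np.einsum("r,rij->ij", part_weights, part_matrices)

# Hypothetical example with two parts and two classes.
part_matrices = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.0, 1.0]],
])
part_weights = np.array([0.25, 0.75])

T_x = instance_transition_matrix(part_weights, part_matrices)
print(T_x)
```

Because each part matrix has rows summing to one and the weights form a convex combination, every row of the combined matrix also sums to one.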
