On the intrinsic robustness to noise of some leading classifiers and symmetric loss function -- an empirical evaluation


Abstract in English

In some industrial applications such as fraud detection, the performance of common supervision techniques may be affected by the poor quality of the available labels : in actual operational use-cases, these labels may be weak in quantity, quality or trustworthiness. We propose a benchmark to evaluate the natural robustness of different algorithms taken from various paradigms on artificially corrupted datasets, with a focus on noisy labels. This paper studies the intrinsic robustness of some leading classifiers. The algorithms under scrutiny include SVM, logistic regression, random forests, XGBoost, Khiops. Furthermore, building on results from recent literature, the study is supplemented with an investigation into the opportunity to enhance some algorithms with symmetric loss functions.

Download