ﻻ يوجد ملخص باللغة العربية
In machine learning contests such as the ImageNet Large Scale Visual Recognition Challenge and the KDD Cup, contestants can submit candidate solutions and receive from an oracle (typically the organizers of the competition) the accuracy of their guesses compared to the ground-truth labels. One of the most commonly used accuracy metrics for binary classification tasks is the Area Under the Receiver Operating Characteristics Curve (AUC). In this paper we provide proofs-of-concept of how knowledge of the AUC of a set of guesses can be used, in two different kinds of attacks, to improve the accuracy of those guesses. On the other hand, we also demonstrate the intractability of one kind of AUC exploit by proving that the number of possible binary labelings of $n$ examples for which a candidate solution obtains a AUC score of $c$ grows exponentially in $n$, for every $cin (0,1)$.
Background: Medical decision-making impacts both individual and public health. Clinical scores are commonly used among a wide variety of decision-making models for determining the degree of disease deterioration at the bedside. AutoScore was proposed
The ICML 2013 Workshop on Challenges in Representation Learning focused on three challenges: the black box learning challenge, the facial expression recognition challenge, and the multimodal learning challenge. We describe the datasets created for th
Machine-learning systems such as self-driving cars or virtual assistants are composed of a large number of machine-learning models that recognize image content, transcribe speech, analyze natural language, infer preferences, rank options, etc. Models
In the context of data-mining competitions (e.g., Kaggle, KDDCup, ILSVRC Challenge), we show how access to an oracle that reports a contestants log-loss score on the test set can be exploited to deduce the ground-truth of some of the test examples. B
In the field of machine learning there is a growing interest towards more robust and generalizable algorithms. This is for example important to bridge the gap between the environment in which the training data was collected and the environment where