Algorithm performance in supervised learning is a combination of memorization, generalization, and luck. By estimating how much information an algorithm can memorize from a dataset, we can set a lower bound on the amount of performance due to other factors such as generalization and luck. With this goal in mind, we introduce the Labeling Distribution Matrix (LDM) as a tool for estimating the capacity of learning algorithms. The method attempts to characterize the diversity of possible outputs by an algorithm for different training datasets, using this to measure algorithm flexibility and responsiveness to data. We test the method on several supervised learning algorithms, and find that while the results are not conclusive, the LDM does allow us to gain potentially valuable insight into the prediction behavior of algorithms. We also introduce the Label Recorder as an additional tool for estimating algorithm capacity, with more promising initial results.
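To make the idea concrete, here is a minimal sketch of one simplified reading of the approach, assuming numpy-array inputs; build_ldm, the bootstrap resampling scheme, and n_resamples are illustrative assumptions rather than the paper's actual construction.

```python
# Sketch: characterize an algorithm's labeling diversity by training it on
# many resampled training sets and recording the labeling each fit induces
# on a fixed holdout set. All names here are illustrative, not the paper's.
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def build_ldm(model_factory, X_train, y_train, X_holdout, n_resamples=100, seed=0):
    """Count distinct holdout labelings across bootstrap-resampled fits."""
    rng = np.random.default_rng(seed)
    labelings = Counter()
    n = len(X_train)
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # bootstrap resample of the training set
        model = model_factory().fit(X_train[idx], y_train[idx])
        labelings[tuple(model.predict(X_holdout))] += 1
    return labelings

# A flexible (high-capacity) learner spreads its mass over many distinct
# labelings; a rigid one concentrates on a few:
#   ldm = build_ldm(DecisionTreeClassifier, X_tr, y_tr, X_ho)
#   print(len(ldm), "distinct labelings in", sum(ldm.values()), "fits")
```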
A key challenge in developing and deploying Machine Learning (ML) systems is understanding their performance across a wide range of inputs. To address this challenge, we created the What-If Tool, an open-source application that allows practitioners to probe, visualize, and analyze ML systems, with minimal coding.
We present a visualization tool to exhaustively search and browse through a set of large-scale machine learning datasets. Built on top of the VizWiz dataset, our dataset browser tool has the potential to support and enable a variety of qualitative and quantitative analyses.
Causal machine learning is about predicting the net effect (true lift) of treatments. Given data from a treatment group and a control group, it resembles a standard supervised learning problem. Unfortunately, there is no similarly well-defined ground-truth label: the true lift of an individual is never directly observed, because each individual is either treated or not, never both.
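One standard baseline that makes this concrete (a hedged sketch of the common "two-model" or T-learner approach, not necessarily the method of the abstract above): fit separate outcome models on the treatment and control groups and estimate lift as the difference of their predicted probabilities.

```python
# Sketch: two-model (T-learner) uplift estimation. Because an individual's
# true lift is never observed, we fit separate outcome models on the
# treatment and control groups and take the difference in predictions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_two_model_uplift(X, y, treated):
    """X: feature array; y: binary outcomes; treated: binary treatment flags."""
    model_t = LogisticRegression().fit(X[treated == 1], y[treated == 1])
    model_c = LogisticRegression().fit(X[treated == 0], y[treated == 0])

    def predict_uplift(X_new):
        # Estimated net effect of treatment for each new individual.
        return model_t.predict_proba(X_new)[:, 1] - model_c.predict_proba(X_new)[:, 1]

    return predict_uplift
```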
We develop techniques to quantify the degree to which a given (training or testing) example is an outlier in the underlying distribution. We evaluate five methods to score examples in a dataset by how well-represented the examples are, for different plausible definitions of "well-represented".
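As one illustrative scoring method (an assumption for illustration, not one of the five methods the abstract refers to), low density under a kernel density estimate can serve as an outlier signal:

```python
# Sketch: score how well-represented each example is via a kernel density
# estimate over the dataset; lower log-density = more outlier-like. The
# bandwidth value is an arbitrary assumption and should be tuned.
import numpy as np
from sklearn.neighbors import KernelDensity

def representedness_scores(X, bandwidth=1.0):
    """Return per-example log-density; higher means better represented."""
    kde = KernelDensity(bandwidth=bandwidth).fit(X)
    return kde.score_samples(X)
```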
In this paper, we consider the contextual variant of the MNL-Bandit problem. More specifically, we consider a dynamic set optimization problem, where in every round a decision maker offers a subset (assortment) of products to a consumer and observes the consumer's response, which follows a multinomial logit (MNL) choice model.
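For readers new to the setting, the sketch below simulates one round of the standard contextual MNL choice model, where product i in assortment S is chosen with probability exp(x_i·θ) / (1 + Σ_{j∈S} exp(x_j·θ)) and the constant 1 is the no-purchase option; the linear utilities x_i·θ and the function name are assumptions for illustration.

```python
# Sketch: one round of a contextual MNL-Bandit environment. theta is the
# unknown parameter the decision maker must learn from observed choices.
import numpy as np

def sample_choice(X_assortment, theta, rng):
    """X_assortment: (k, d) feature rows of the offered products.
    Returns the chosen product's index, or -1 for no purchase."""
    utilities = np.exp(X_assortment @ theta)
    probs = np.append(utilities, 1.0) / (utilities.sum() + 1.0)  # last = no purchase
    choice = rng.choice(len(probs), p=probs)
    return -1 if choice == len(probs) - 1 else choice

# Example: rng = np.random.default_rng(0); sample_choice(X_S, theta, rng)
```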