Algorithm performance in supervised learning is a combination of memorization, generalization, and luck. By estimating how much information an algorithm can memorize from a dataset, we can set a lower bound on the amount of performance due to other factors such as generalization and luck. With this goal in mind, we introduce the Labeling Distribution Matrix (LDM) as a tool for estimating the capacity of learning algorithms. The method attempts to characterize the diversity of possible outputs by an algorithm for different training datasets, using this to measure algorithm flexibility and responsiveness to data. We test the method on several supervised learning algorithms, and find that while the results are not conclusive, the LDM does allow us to gain potentially valuable insight into the prediction behavior of algorithms. We also introduce the Label Recorder as an additional tool for estimating algorithm capacity, with more promising initial results.
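To make the idea concrete, here is a minimal sketch of one simplified reading of the approach, assuming numpy-array inputs; build_ldm, the bootstrap resampling scheme, and n_resamples are illustrative assumptions rather than the paper's actual construction.

```python
# Sketch: characterize an algorithm's labeling diversity by training it on
# many resampled training sets and recording the labeling each fit induces
# on a fixed holdout set. All names here are illustrative, not the paper's.
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def build_ldm(model_factory, X_train, y_train, X_holdout, n_resamples=100, seed=0):
    """Count distinct holdout labelings across bootstrap-resampled fits."""
    rng = np.random.default_rng(seed)
    labelings = Counter()
    n = len(X_train)
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # bootstrap resample of the training set
        model = model_factory().fit(X_train[idx], y_train[idx])
        labelings[tuple(model.predict(X_holdout))] += 1
    return labelings

# A flexible (high-capacity) learner spreads its mass over many distinct
# labelings; a rigid one concentrates on a few:
#   ldm = build_ldm(DecisionTreeClassifier, X_tr, y_tr, X_ho)
#   print(len(ldm), "distinct labelings in", sum(ldm.values()), "fits")
```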
A key challenge in developing and deploying Machine Learning (ML) systems is understanding their performance across a wide range of inputs. To address this challenge, we created the What-If Tool, an open-source application that allows practitioners to probe, visualize, and analyze ML systems, with minimal coding.
We present a visualization tool to exhaustively search and browse through a set of large-scale machine learning datasets. Built on top of the VizWiz dataset, our dataset browser tool has the potential to support and enable a variety of qualitative and quantitative analyses.
Causal machine learning is about predicting the net effect (true lift) of treatments. Given data from a treatment group and a control group, it resembles a standard supervised learning problem. Unfortunately, there is no similarly well-defined ground-truth label: the true lift of an individual is never directly observed, because each individual is either treated or not, never both.
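One standard baseline that makes this concrete (a hedged sketch of the common "two-model" or T-learner approach, not necessarily the method of the abstract above): fit separate outcome models on the treatment and control groups and estimate lift as the difference of their predicted probabilities.

```python
# Sketch: two-model (T-learner) uplift estimation. Because an individual's
# true lift is never observed, we fit separate outcome models on the
# treatment and control groups and take the difference in predictions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_two_model_uplift(X, y, treated):
    """X: feature array; y: binary outcomes; treated: binary treatment flags."""
    model_t = LogisticRegression().fit(X[treated == 1], y[treated == 1])
    model_c = LogisticRegression().fit(X[treated == 0], y[treated == 0])

    def predict_uplift(X_new):
        # Estimated net effect of treatment for each new individual.
        return model_t.predict_proba(X_new)[:, 1] - model_c.predict_proba(X_new)[:, 1]

    return predict_uplift
```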
We develop techniques to quantify the degree to which a given (training or testing) example is an outlier in the underlying distribution. We evaluate five methods to score examples in a dataset by how well-represented the examples are, for different plausible definitions of "well-represented".
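As one illustrative scoring method (an assumption for illustration, not one of the five methods the abstract refers to), low density under a kernel density estimate can serve as an outlier signal:

```python
# Sketch: score how well-represented each example is via a kernel density
# estimate over the dataset; lower log-density = more outlier-like. The
# bandwidth value is an arbitrary assumption and should be tuned.
import numpy as np
from sklearn.neighbors import KernelDensity

def representedness_scores(X, bandwidth=1.0):
    """Return per-example log-density; higher means better represented."""
    kde = KernelDensity(bandwidth=bandwidth).fit(X)
    return kde.score_samples(X)
```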
In this paper, we consider the contextual variant of the MNL-Bandit problem. More specifically, we consider a dynamic set optimization problem, where in every round a decision maker offers a subset (assortment) of products to a consumer and observes the consumer's response, which follows a multinomial logit (MNL) choice model.
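For readers new to the setting, the sketch below simulates one round of the standard contextual MNL choice model, where product i in assortment S is chosen with probability exp(x_i·θ) / (1 + Σ_{j∈S} exp(x_j·θ)) and the constant 1 is the no-purchase option; the linear utilities x_i·θ and the function name are assumptions for illustration.

```python
# Sketch: one round of a contextual MNL-Bandit environment. theta is the
# unknown parameter the decision maker must learn from observed choices.
import numpy as np

def sample_choice(X_assortment, theta, rng):
    """X_assortment: (k, d) feature rows of the offered products.
    Returns the chosen product's index, or -1 for no purchase."""
    utilities = np.exp(X_assortment @ theta)
    probs = np.append(utilities, 1.0) / (utilities.sum() + 1.0)  # last = no purchase
    choice = rng.choice(len(probs), p=probs)
    return -1 if choice == len(probs) - 1 else choice

# Example: rng = np.random.default_rng(0); sample_choice(X_S, theta, rng)
```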