Model Selection for Production System via Automated Online Experiments

185 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Zhenwen Dai

تاريخ النشر 2021

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Zhenwen Dai - Praveen Chandar - Ghazal Fazelnia

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

A challenge that machine learning practitioners in the industry face is the task of selecting the best model to deploy in production. As a model is often an intermediate component of a production system, online controlled experiments such as A/B tests yield the most reliable estimation of the effectiveness of the whole system, but can only compare two or a few models due to budget constraints. We propose an automated online experimentation mechanism that can efficiently perform model selection from a large pool of models with a small number of online experiments. We derive the probability distribution of the metric of interest that contains the model uncertainty from our Bayesian surrogate model trained using historical logs. Our method efficiently identifies the best model by sequentially selecting and deploying a list of models from the candidate set that balance exploration-exploitation. Using simulations based on real data, we demonstrate the effectiveness of our method on two different tasks.

قيم البحث

224 - Leonardo Cella , Claudio Gentile , Massimiliano Pontil 2020

Motivated by a natural problem in online model selection with bandit information, we introduce and analyze a best arm identification problem in the rested bandit setting, wherein arm expected losses decrease with the number of times the arm has been played. The shape of the expected loss functions is similar across arms, and is assumed to be available up to unknown parameters that have to be learned on the fly. We define a novel notion of regret for this problem, where we compare to the policy that always plays the arm having the smallest expected loss at the end of the game. We analyze an arm elimination algorithm whose regret vanishes as the time horizon increases. The actual rate of convergence depends in a detailed way on the postulated functional form of the expected losses. Unlike known model selection efforts in the recent bandit literature, our algorithm exploits the specific structure of the problem to learn the unknown parameters of the expected loss function so as to identify the best arm as quickly as possible. We complement our analysis with a lower bound, indicating strengths and limitations of the proposed solution.

التعلم الالي التعلم الآلي

Model Selection for Generic Contextual Bandits

120 - Avishek Ghosh , Abishek Sankararaman , Kannan Ramchandran 2021

We consider the problem of model selection for the general stochastic contextual bandits under the realizability assumption. We propose a successive refinement based algorithm called Adaptive Contextual Bandit ({ttfamily ACB}), that works in phases a nd successively eliminates model classes that are too simple to fit the given instance. We prove that this algorithm is adaptive, i.e., the regret rate order-wise matches that of {ttfamily FALCON}, the state-of-art contextual bandit algorithm of Levi et. al 20, that needs knowledge of the true model class. The price of not knowing the correct model class is only an additive term contributing to the second order term in the regret bound. This cost possess the intuitive property that it becomes smaller as the model class becomes easier to identify, and vice-versa. We then show that a much simpler explore-then-commit (ETC) style algorithm also obtains a regret rate of matching that of {ttfamily FALCON}, despite not knowing the true model class. However, the cost of model selection is higher in ETC as opposed to in {ttfamily ACB}, as expected. Furthermore, {ttfamily ACB} applied to the linear bandit setting with unknown sparsity, order-wise recovers the model selection guarantees previously established by algorithms tailored to the linear setting.

التعلم الالي نظرية المعلومات التعلم الآلي

Parameter-free online learning via model selection

134 - Dylan J. Foster , Satyen Kale , Mehryar Mohri 2017

We introduce an efficient algorithmic framework for model selection in online learning, also known as parameter-free online learning. Departing from previous work, which has focused on highly structured function classes such as nested balls in Hilber t space, we propose a generic meta-algorithm framework that achieves online model selection oracle inequalities under minimal structural assumptions. We give the first computationally efficient parameter-free algorithms that work in arbitrary Banach spaces under mild smoothness assumptions; previous results applied only to Hilbert spaces. We further derive new oracle inequalities for matrix classes, non-nested convex sets, and $mathbb{R}^{d}$ with generic regularizers. Finally, we generalize these results by providing oracle inequalities for arbitrary non-linear classes in the online supervised learning model. These results are all derived through a unified meta-algorithm scheme using a novel multi-scale algorithm for prediction with expert advice based on random playout, which may be of independent interest.

التعلم الآلي التعلم الالي

Active Model Aggregation via Stochastic Mirror Descent

600 - Ravi Ganti 2015

We consider the problem of learning convex aggregation of models, that is as good as the best convex aggregation, for the binary classification problem. Working in the stream based active learning setting, where the active learner has to make a decis ion on-the-fly, if it wants to query for the label of the point currently seen in the stream, we propose a stochastic-mirror descent algorithm, called SMD-AMA, with entropy regularization. We establish an excess risk bounds for the loss of the convex aggregate returned by SMD-AMA to be of the order of $Oleft(sqrt{frac{log(M)}{{T^{1-mu}}}}right)$, where $muin [0,1)$ is an algorithm dependent parameter, that trades-off the number of labels queried, and excess risk.

التعلم الالي الذكاء الاصطناعي التعلم الآلي

Differentiable Model Compression via Pseudo Quantization Noise

79 - Alexandre Defossez , Yossi Adi , Gabriel Synnaeve 2021

We propose to add independent pseudo quantization noise to model parameters during training to approximate the effect of a quantization operator. This method, DiffQ, is differentiable both with respect to the unquantized parameters, and the number of bits used. Given a single hyper-parameter expressing the desired balance between the quantized model size and accuracy, DiffQ can optimize the number of bits used per individual weight or groups of weights, in a single training. We experimentally verify that our method outperforms state-of-the-art quantization techniques on several benchmarks and architectures for image classification, language modeling, and audio source separation. For instance, on the Wikitext-103 language modeling benchmark, DiffQ compresses a 16 layers transformer model by a factor of 8, equivalent to 4 bits precision, while losing only 0.5 points of perplexity. Code is available at: https://github.com/facebookresearch/diffq

التعلم الالي الذكاء الاصطناعي التعلم الآلي