Testing the Robustness of AutoML Systems

190 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل EPTCS

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Tuomas Halvari - Jukka K. Nurminen - Tommi Mikkonen

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Automated machine learning (AutoML) systems aim at finding the best machine learning (ML) pipeline that automatically matches the task and data at hand. We investigate the robustness of machine learning pipelines generated with three AutoML systems, TPOT, H2O, and AutoKeras. In particular, we study the influence of dirty data on accuracy, and consider how using dirty training data may help create more robust solutions. Furthermore, we also analyze how the structure of the generated pipelines differs in different cases.

قيم البحث

اقرأ أيضاً

An Open Source AutoML Benchmark

107 - Pieter Gijsbers , Erin LeDell , Janek Thomas 2019

In recent years, an active field of research has developed around automated machine learning (AutoML). Unfortunately, comparing different AutoML systems is hard and often done incorrectly. We introduce an open, ongoing, and extensible benchmark frame work which follows best practices and avoids common mistakes. The framework is open-source, uses public datasets and has a website with up-to-date results. We use the framework to conduct a thorough comparison of 4 AutoML systems across 39 datasets and analyze the results.

التعلم الآلي التعلم الالي

AutoML @ NeurIPS 2018 challenge: Design and Results

204 - Hugo Jair Escalante , Wei-Wei Tu , Isabelle Guyon 2019

We organized a competition on Autonomous Lifelong Machine Learning with Drift that was part of the competition program of NeurIPS 2018. This data driven competition asked participants to develop computer programs capable of solving supervised learnin g problems where the i.i.d. assumption did not hold. Large data sets were arranged in a lifelong learning and evaluation scenario and CodaLab was used as the challenge platform. The challenge attracted more than 300 participants in its two month duration. This chapter describes the design of the challenge and summarizes its main results.

التعلم الآلي التعلم الالي

FLAML: A Fast and Lightweight AutoML Library

117 - Chi Wang , Qingyun Wu , Markus Weimer 2019

We study the problem of using low computational cost to automate the choices of learners and hyperparameters for an ad-hoc training dataset and error metric, by conducting trials of different configurations on the given training data. We investigate the joint impact of multiple factors on both trial cost and model error, and propose several design guidelines. Following them, we build a fast and lightweight library FLAML which optimizes for low computational resource in finding accurate models. FLAML integrates several simple but effective search strategies into an adaptive system. It significantly outperforms top-ranked AutoML libraries on a large open source AutoML benchmark under equal, or sometimes orders of magnitude smaller budget constraints.

التعلم الآلي التعلم الالي

Iterative Compression of End-to-End ASR Model using AutoML

127 - Abhinav Mehrotra , {L}ukasz Dudziak , Jinsu Yeo 2020

Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interests in developing automatic model compression techniques. Past research have shown that AutoML-based Low Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5x compression without degrading the WER, thereby advancing the state-of-the-art in ASR compression.

التعلم الآلي التعلم الالي

An ADMM Based Framework for AutoML Pipeline Configuration

65 - Sijia Liu , Parikshit Ram , Deepak Vijaykeerthy 2019

We study the AutoML problem of automatically configuring machine learning pipelines by jointly selecting algorithms and their appropriate hyper-parameters for all steps in supervised learning pipelines. This black-box (gradient-free) optimization wit h mixed integer & continuous variables is a challenging problem. We propose a novel AutoML scheme by leveraging the alternating direction method of multipliers (ADMM). The proposed framework is able to (i) decompose the optimization problem into easier sub-problems that have a reduced number of variables and circumvent the challenge of mixed variable categories, and (ii) incorporate black-box constraints along-side the black-box optimization objective. We empirically evaluate the flexibility (in utilizing existing AutoML techniques), effectiveness (against open source AutoML toolkits),and unique capability (of executing AutoML with practically motivated black-box constraints) of our proposed scheme on a collection of binary classification data sets from UCI ML& OpenML repositories. We observe that on an average our framework provides significant gains in comparison to other AutoML frameworks (Auto-sklearn & TPOT), highlighting the practical advantages of this framework.

التعلم الآلي التعلم الالي