AutoML @ NeurIPS 2018 challenge: Design and Results

205 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Hugo Jair Escalante

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Hugo Jair Escalante - Wei-Wei Tu - Isabelle Guyon

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We organized a competition on Autonomous Lifelong Machine Learning with Drift that was part of the competition program of NeurIPS 2018. This data driven competition asked participants to develop computer programs capable of solving supervised learning problems where the i.i.d. assumption did not hold. Large data sets were arranged in a lifelong learning and evaluation scenario and CodaLab was used as the challenge platform. The challenge attracted more than 300 participants in its two month duration. This chapter describes the design of the challenge and summarizes its main results.

قيم البحث

370 - Zhen Xu , Wei-Wei Tu , Isabelle Guyon 2021

Analyzing better time series with limited human effort is of interest to academia and industry. Driven by business scenarios, we organized the first Automated Time Series Regression challenge (AutoSeries) for the WSDM Cup 2020. We present its design, analysis, and post-hoc experiments. The code submission requirement precluded participants from any manual intervention, testing automated machine learning capabilities of solutions, across many datasets, under hardware and time limitations. We prepared 10 datasets from diverse application domains (sales, power consumption, air quality, traffic, and parking), featuring missing data, mixed continuous and categorical variables, and various sampling rates. Each dataset was split into a training and a test sequence (which was streamed, allowing models to continuously adapt). The setting of time series regression, differs from classical forecasting in that covariates at the present time are known. Great strides were made by participants to tackle this AutoSeries problem, as demonstrated by the jump in performance from the sample submission, and post-hoc comparisons with AutoGluon. Simple yet effective methods were used, based on feature engineering, LightGBM, and random search hyper-parameter tuning, addressing all aspects of the challenge. Our post-hoc analyses revealed that providing additional time did not yield significant improvements. The winners code was open-sourced https://www.4paradigm.com/competition/autoseries2020.

التعلم الآلي

Results and Insights from Diagnostic Questions: The NeurIPS 2020 Education Challenge

565 - Zichao Wang , Angus Lamb , Evgeny Saveliev 2021

This competition concerns educational diagnostic questions, which are pedagogically effective, multiple-choice questions (MCQs) whose distractors embody misconceptions. With a large and ever-increasing number of such questions, it becomes overwhelmin g for teachers to know which questions are the best ones to use for their students. We thus seek to answer the following question: how can we use data on hundreds of millions of answers to MCQs to drive automatic personalized learning in large-scale learning scenarios where manual personalization is infeasible? Success in using MCQ data at scale helps build more intelligent, personalized learning platforms that ultimately improve the quality of education en masse. To this end, we introduce a new, large-scale, real-world dataset and formulate 4 data mining tasks on MCQs that mimic real learning scenarios and target various aspects of the above question in a competition setting at NeurIPS 2020. We report on our NeurIPS competition in which nearly 400 teams submitted approximately 4000 submissions, with encouragingly diverse and effective approaches to each of our tasks.

أجهزة الكمبيوتر والمجتمع تفاعل الإنسان والحاسوب

WIDER Face and Pedestrian Challenge 2018: Methods and Results

328 - Chen Change Loy , Dahua Lin , Wanli Ouyang 2019

This paper presents a review of the 2018 WIDER Challenge on Face and Pedestrian. The challenge focuses on the problem of precise localization of human faces and bodies, and accurate association of identities. It comprises of three tracks: (i) WIDER F ace which aims at soliciting new approaches to advance the state-of-the-art in face detection, (ii) WIDER Pedestrian which aims to find effective and efficient approaches to address the problem of pedestrian detection in unconstrained environments, and (iii) WIDER Person Search which presents an exciting challenge of searching persons across 192 movies. In total, 73 teams made valid submissions to the challenge tracks. We summarize the winning solutions for all three tracks. and present discussions on open problems and potential research directions in these topics.

الرؤية الحاسوبية وتمييز الأنماط

FLAML: A Fast and Lightweight AutoML Library

117 - Chi Wang , Qingyun Wu , Markus Weimer 2019

We study the problem of using low computational cost to automate the choices of learners and hyperparameters for an ad-hoc training dataset and error metric, by conducting trials of different configurations on the given training data. We investigate the joint impact of multiple factors on both trial cost and model error, and propose several design guidelines. Following them, we build a fast and lightweight library FLAML which optimizes for low computational resource in finding accurate models. FLAML integrates several simple but effective search strategies into an adaptive system. It significantly outperforms top-ranked AutoML libraries on a large open source AutoML benchmark under equal, or sometimes orders of magnitude smaller budget constraints.

التعلم الآلي التعلم الالي

An Open Source AutoML Benchmark

107 - Pieter Gijsbers , Erin LeDell , Janek Thomas 2019

In recent years, an active field of research has developed around automated machine learning (AutoML). Unfortunately, comparing different AutoML systems is hard and often done incorrectly. We introduce an open, ongoing, and extensible benchmark frame work which follows best practices and avoids common mistakes. The framework is open-source, uses public datasets and has a website with up-to-date results. We use the framework to conduct a thorough comparison of 4 AutoML systems across 39 datasets and analyze the results.

التعلم الآلي التعلم الالي