NeurIPS 2020 Competition: Predicting Generalization in Deep Learning

242 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Scott Yak

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Yiding Jiang

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Understanding generalization in deep learning is arguably one of the most important questions in deep learning. Deep learning has been successfully adopted to a large number of problems ranging from pattern recognition to complex decision making, but many recent researchers have raised many concerns about deep learning, among which the most important is generalization. Despite numerous attempts, conventional statistical learning approaches have yet been able to provide a satisfactory explanation on why deep learning works. A recent line of works aims to address the problem by trying to predict the generalization performance through complexity measures. In this competition, we invite the community to propose complexity measures that can accurately predict generalization of models. A robust and general complexity measure would potentially lead to a better understanding of deep learnings underlying mechanism and behavior of deep models on unseen data, or shed light on better generalization bounds. All these outcomes will be important for making deep learning more robust and reliable.

قيم البحث

116 - Sharada Mohanty , Jyotish Poonganam , Adrien Gaidon 2021

The NeurIPS 2020 Procgen Competition was designed as a centralized benchmark with clearly defined tasks for measuring Sample Efficiency and Generalization in Reinforcement Learning. Generalization remains one of the most fundamental challenges in dee p reinforcement learning, and yet we do not have enough benchmarks to measure the progress of the community on Generalization in Reinforcement Learning. We present the design of a centralized benchmark for Reinforcement Learning which can help measure Sample Efficiency and Generalization in Reinforcement Learning by doing end to end evaluation of the training and rollout phases of thousands of user submitted code bases in a scalable way. We designed the benchmark on top of the already existing Procgen Benchmark by defining clear tasks and standardizing the end to end evaluation setups. The design aims to maximize the flexibility available for researchers who wish to design future iterations of such benchmarks, and yet imposes necessary practical constraints to allow for a system like this to scale. This paper presents the competition setup and the details and analysis of the top solutions identified through this setup in context of 2020 iteration of the competition at NeurIPS.

التعلم الآلي الذكاء الاصطناعي

NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned

163 - Sewon Min , Jordan Boyd-Graber , Chris Alberti 2021

We review the EfficientQA competition from NeurIPS 2020. The competition focused on open-domain question answering (QA), where systems take natural language questions as input and return natural language answers. The aim of the competition was to bui ld systems that can predict correct answers while also satisfying strict on-disk memory budgets. These memory budgets were designed to encourage contestants to explore the trade-off between storing large, redundant, retrieval corpora or the parameters of large learned models. In this report, we describe the motivation and organization of the competition, review the best submissions, and analyze system predictions to inform a discussion of evaluation for open-domain QA.

الحساب واللغة الذكاء الاصطناعي

NeurIPS 2020 NLC2CMD Competition: Translating Natural Language to Bash Commands

86 - Mayank Agarwal , Tathagata Chakraborti , Quchen Fu 2021

The NLC2CMD Competition hosted at NeurIPS 2020 aimed to bring the power of natural language processing to the command line. Participants were tasked with building models that can transform descriptions of command line tasks in English to their Bash s yntax. This is a report on the competition with details of the task, metrics, data, attempted solutions, and lessons learned.

الحساب واللغة

Deep Learning for Predicting Dynamic Uncertain Opinions in Network Data

78 - Xujiang Zhao , Feng Chen , Jin-Hee Cho 2019

Subjective Logic (SL) is one of well-known belief models that can explicitly deal with uncertain opinions and infer unknown opinions based on a rich set of operators of fusing multiple opinions. Due to high simplicity and applicability, SL has been s ubstantially applied in a variety of decision making in the area of cybersecurity, opinion models, trust models, and/or social network analysis. However, SL and its variants have exposed limitations in predicting uncertain opinions in real-world dynamic network data mainly in three-fold: (1) a lack of scalability to deal with a large-scale network; (2) limited capability to handle heterogeneous topological and temporal dependencies among node-level opinions; and (3) a high sensitivity with conflicting evidence that may generate counterintuitive opinions derived from the evidence. In this work, we proposed a novel deep learning (DL)-based dynamic opinion inference model while node-level opinions are still formalized based on SL meaning that an opinion has a dimension of uncertainty in addition to belief and disbelief in a binomial opinion (i.e., agree or disagree). The proposed DL-based dynamic opinion inference model overcomes the above three limitations by integrating the following techniques: (1) state-of-the-art DL techniques, such as the Graph Convolutional Network (GCN) and the Gated Recurrent Units (GRU) for modeling the topological and temporal heterogeneous dependency information of a given dynamic network; (2) modeling conflicting opinions based on robust statistics; and (3) a highly scalable inference algorithm to predict dynamic, uncertain opinions in a linear computation time. We validated the outperformance of our proposed DL-based algorithm (i.e., GCN-GRU-opinion model) via extensive comparative performance analysis based on four real-world datasets.

التعلم الآلي التعلم الالي

Exploring Generalization in Deep Learning

177 - Behnam Neyshabur , Srinadh Bhojanapalli , David McAllester 2017

With a goal of understanding what drives generalization in deep networks, we consider several recently suggested explanations, including norm-based control, sharpness and robustness. We study how these measures can ensure generalization, highlighting the importance of scale normalization, and making a connection between sharpness and PAC-Bayes theory. We then investigate how well the measures explain different observed phenomena.

التعلم الآلي