A Pragmatic Look at Deep Imitation Learning

Published by: Kai Arulkumaran

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Kai Arulkumaran, Dan Ogawa Lillrank

Abstract

The introduction of the generative adversarial imitation learning (GAIL) algorithm has spurred the development of scalable imitation learning approaches using deep neural networks. The GAIL objective can be thought of as 1) matching the expert policy's state distribution; 2) penalising the learned policy's state distribution; and 3) maximising entropy. While theoretically motivated, in practice GAIL can be difficult to apply, not least due to the instabilities of adversarial training. In this paper, we take a pragmatic look at GAIL and related imitation learning algorithms. We implement and automatically tune a range of algorithms in a unified experimental setup, presenting a fair evaluation between the competing methods. From our results, our primary recommendation is to consider non-adversarial methods. Furthermore, we discuss the common components of imitation learning objectives, and present promising avenues for future research.
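For reference, the GAIL saddle-point objective as it is commonly written following Ho and Ermon (2016) is sketched below. The notation is an assumption of this note and does not come from the abstract: \pi is the learned policy, \pi_E the expert policy, D a discriminator over state-action pairs, H the policy entropy, and \lambda \ge 0 an entropy weight. Sign and labelling conventions for D vary between implementations.

```latex
\min_{\pi} \max_{D} \;
  \mathbb{E}_{(s,a) \sim \pi}\!\left[ \log D(s,a) \right]
  + \mathbb{E}_{(s,a) \sim \pi_E}\!\left[ \log\!\left( 1 - D(s,a) \right) \right]
  - \lambda H(\pi)
```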

Read also

A Closer Look at Deep Policy Gradients

Andrew Ilyas, Logan Engstrom, Shibani Santurkar (2018)

We study how the behavior of deep policy gradient algorithms reflects the conceptual framework motivating their development. To this end, we propose a fine-grained analysis of state-of-the-art methods based on key elements of this framework: gradient estimation, value prediction, and optimization landscapes. Our results show that the behavior of deep policy gradient algorithms often deviates from what their motivating framework would predict: the surrogate objective does not match the true reward landscape, learned value estimators fail to fit the true value function, and gradient estimates poorly correlate with the true gradient. The mismatch between predicted and empirical behavior we uncover highlights our poor understanding of current methods, and indicates the need to move beyond current benchmark-centric evaluation methods.

Machine learning, Neural and evolutionary computing, Robotics

Deep Learning at the Edge

Sahar Voghoei, Navid Hashemi Tonekaboni, Jason G. Wallace (2019)

The ever-increasing number of Internet of Things (IoT) devices has created a new computing paradigm, called edge computing, where most of the computations are performed at the edge devices, rather than on centralized servers. An edge device is an electronic device that provides connections to service providers and other edge devices; typically, such devices have limited resources. Since edge devices are resource-constrained, the task of launching algorithms, methods, and applications onto edge devices is considered to be a significant challenge. In this paper, we discuss one of the most widely used machine learning methods, namely Deep Learning (DL), and offer a short survey on the recent approaches used to map DL onto the edge computing paradigm. We also provide relevant discussions about selected applications that would greatly benefit from DL at the edge.

Machine learning, Neural and evolutionary computing

A First Look at Deep Learning Apps on Smartphones

Mengwei Xu, Jiawei Liu, Yuanqiang Liu (2018)

We are at the dawn of the deep learning explosion for smartphones. To bridge the gap between research and practice, we present the first empirical study of 16,500 of the most popular Android apps, demystifying how smartphone apps exploit deep learning in the wild. To this end, we build a new static tool that dissects apps and analyzes their deep learning functions. Our study answers three questions: which apps are the early adopters of deep learning, what they use deep learning for, and what their deep learning models look like. Our study has strong implications for app developers, smartphone vendors, and deep learning R&D. On one hand, our findings paint a promising picture of deep learning for smartphones, showing the prosperity of mobile deep learning frameworks as well as the prosperity of apps building their cores atop deep learning. On the other hand, our findings urge optimizations on deep learning models deployed on smartphones, the protection of these models, and validation of research ideas on these models.

Machine learning, Computers and society

Data Driven Aircraft Trajectory Prediction with Deep Imitation Learning

Alevizos Bastas, Theocharis Kravaris, George A. Vouros (2020)

The current Air Traffic Management (ATM) system worldwide has reached its limits in terms of predictability, efficiency and cost effectiveness. Different initiatives worldwide propose trajectory-oriented transformations that require high-fidelity aircraft trajectory planning and prediction capabilities, supporting the trajectory life cycle at all stages efficiently. Recently proposed data-driven trajectory prediction approaches provide promising results. In this paper we approach the data-driven trajectory prediction problem as an imitation learning task, where we aim to imitate experts shaping the trajectory. Towards this goal we present a comprehensive framework comprising the state-of-the-art Generative Adversarial Imitation Learning method, in a pipeline with trajectory clustering and classification methods. Compared to other state-of-the-art approaches, this approach can provide accurate predictions for the whole trajectory (i.e. with a prediction horizon until reaching the destination) both at the pre-tactical stage (i.e. starting at the departure airport at a specific time instant) and at the tactical stage (i.e. from any state while flying).

Machine learning, Artificial intelligence

Deep Learning with Limited Numerical Precision

Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan (2015)

Training of large-scale deep neural networks is often constrained by the available computational resources. We study the effect of limited precision data representation and computation on neural network training. Within the context of low-precision fixed-point computations, we observe the rounding scheme to play a crucial role in determining the network's behavior during training. Our results show that deep networks can be trained using only 16-bit wide fixed-point number representation when using stochastic rounding, and incur little to no degradation in the classification accuracy. We also demonstrate an energy-efficient hardware accelerator that implements low-precision fixed-point arithmetic with stochastic rounding.
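To illustrate the stochastic rounding mentioned in the abstract above, here is a minimal NumPy sketch. It is not the paper's implementation: the function name `stochastic_round` and the `frac_bits` parameter are hypothetical, and saturation to a fixed word length is left out. A value is rounded up to the next fixed-point grid level with probability equal to its fractional distance from the level below, so the rounding is unbiased in expectation.

```python
import numpy as np

def stochastic_round(x, frac_bits=8, rng=None):
    """Round x onto a fixed-point grid with spacing 2**-frac_bits.

    Hypothetical helper for illustration only. Each value is rounded
    up with probability equal to its fractional part (in grid units)
    and down otherwise, so the expected result equals x.
    """
    rng = np.random.default_rng() if rng is None else rng
    scale = 2.0 ** frac_bits
    scaled = np.asarray(x, dtype=np.float64) * scale
    lower = np.floor(scaled)
    prob_up = scaled - lower  # fractional part, in [0, 1)
    round_up = rng.random(scaled.shape) < prob_up
    return (lower + round_up) / scale

# Usage: with 2 fractional bits the grid spacing is 0.25, so 0.3 is
# rounded to either 0.25 or 0.5, and the average stays close to 0.3.
rng = np.random.default_rng(0)
samples = stochastic_round(np.full(100_000, 0.3), frac_bits=2, rng=rng)
print(np.unique(samples), samples.mean())
```

Round-to-nearest would map every 0.3 to 0.25 and lose the residual entirely, whereas the stochastic scheme preserves it on average.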

Machine learning, Neural and evolutionary computing