Efficient time stepping for numerical integration using reinforcement learning


Published by: Sebastian Peitz

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Michael Dellnitz - Eyke Hullermeier - Marvin Lucke

Dynamical systems, Machine learning


Abstract

Many problems in science and engineering require the efficient numerical approximation of integrals, a particularly important application being the numerical solution of initial value problems for differential equations. For complex systems, an equidistant discretization is often inadvisable, as it results in either prohibitively large errors or prohibitive computational effort. To this end, adaptive schemes have been developed that rely on error estimators based on Taylor series expansions. While these estimators a) rely on strong smoothness assumptions and b) may still result in erroneous steps for complex systems (and thus require step rejection mechanisms), we here propose a data-driven time stepping scheme based on machine learning, and more specifically on reinforcement learning (RL) and meta-learning. First, one or several (in the case of non-smooth or hybrid systems) base learners are trained using RL. Then, a meta-learner is trained which (depending on the system state) selects the base learner that appears to be optimal for the current situation. Several examples including both smooth and non-smooth problems demonstrate the superior performance of our approach over state-of-the-art numerical schemes. The code is available at https://github.com/lueckem/quadrature-ML.
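For orientation, the classical approach that the abstract contrasts against (step-size control driven by an error estimate, with step rejection) can be sketched roughly as follows. This is an illustrative sketch and not the authors' code: the function name `heun_euler_adaptive`, the choice of an embedded Euler/Heun pair, the safety factor 0.9, and the clamps 0.2 and 2.0 are all assumptions made here for the example.

```python
import math


def heun_euler_adaptive(f, t0, y0, t_end, tol=1e-6, h0=0.1):
    """Integrate the scalar ODE y' = f(t, y) from t0 to t_end.

    Uses an embedded Euler (order 1) / Heun (order 2) pair. The difference
    between the two results serves as the local error estimate.
    Returns (t, y, accepted_steps, rejected_steps).
    """
    t, y, h = t0, y0, h0
    accepted = rejected = 0
    while t_end - t > 1e-12:
        h = min(h, t_end - t)                # never step past the end point
        k1 = f(t, y)
        k2 = f(t + h, y + h * k1)
        y_low = y + h * k1                   # Euler step
        y_high = y + 0.5 * h * (k1 + k2)     # Heun step
        err = abs(y_high - y_low)            # local error estimate
        if err <= tol:
            t, y = t + h, y_high             # accept the step
            accepted += 1
        else:
            rejected += 1                    # reject and retry with smaller h
        # Illustrative controller: rescale h towards the tolerance,
        # with an assumed safety factor and assumed growth/shrink limits.
        factor = 0.9 * math.sqrt(tol / err) if err > 0 else 2.0
        h *= min(2.0, max(0.2, factor))
    return t, y, accepted, rejected


# Example: y' = -y, y(0) = 1, integrated to t = 1 (exact value exp(-1)).
t_final, y_final, n_accepted, n_rejected = heun_euler_adaptive(
    lambda t, y: -y, 0.0, 1.0, 1.0
)
print(y_final, math.exp(-1.0), n_accepted, n_rejected)
```

In this sketch the deliberately large initial step is rejected and shrunk before the first step is accepted, which is the kind of rejection overhead the abstract refers to. According to the abstract, the proposed scheme instead selects steps with trained base learners and a meta-learner; that part is not reproduced here.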


Dhruva Tirumala, Alexandre Galashov, Hyeonwoo Noh, 2020

As we deploy reinforcement learning agents to solve increasingly challenging problems, methods that allow us to inject prior knowledge about the structure of the world and effective solution strategies become increasingly important. In this work we consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors that capture the common movement and interaction patterns that are shared across a set of related tasks or contexts. For example, the day-to-day behavior of humans comprises distinctive locomotion and manipulation patterns that recur across many different situations and goals. We discuss how such behavior patterns can be captured using probabilistic trajectory models and how these can be integrated effectively into reinforcement learning schemes, e.g. to facilitate multi-task and transfer learning. We then extend these ideas to latent variable models and consider a formulation to learn hierarchical priors that capture different aspects of the behavior in reusable modules. We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and to mutual-information and curiosity-based objectives, thereby offering an alternative perspective on existing ideas. We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.

Artificial intelligence, Machine learning

Efficient numerical integration of thermal interaction rates

G. Jackson, M. Laine, 2021

In many problems in particle cosmology, interaction rates are dominated by $2 \leftrightarrow 2$ scatterings, or get a substantial contribution from them, given that $1 \leftrightarrow 2$ and $1 \leftrightarrow 3$ reactions are phase-space suppressed. We describe an algorithm to represent, regularize, and evaluate a class of thermal $2 \leftrightarrow 2$ and $1 \leftrightarrow 3$ interaction rates for general momenta, masses, chemical potentials, and helicity projections. A key ingredient is an automated inclusion of virtual corrections to $1 \leftrightarrow 2$ scatterings, which eliminate logarithmic and double-logarithmic IR divergences from the real $2 \leftrightarrow 2$ and $1 \leftrightarrow 3$ processes. We also review thermal and chemical potential induced contributions that require resummation if plasma particles are ultrarelativistic.

High Energy Physics - Phenomenology

Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems

Morteza Ibrahimi, Adel Javanmard, Benjamin Van Roy, 2013

We study the problem of adaptive control of a high dimensional linear quadratic (LQ) system. Previous work established the asymptotic convergence to an optimal controller for various adaptive control schemes. More recently, for the average cost LQ problem, a regret bound of $O(\sqrt{T})$ was shown, apart from logarithmic factors. However, this bound scales exponentially with $p$, the dimension of the state space. In this work we consider the case where the matrices describing the dynamics of the LQ system are sparse and their dimensions are large. We present an adaptive control scheme that achieves a regret bound of $O(p \sqrt{T})$, apart from logarithmic factors. In particular, our algorithm has an average cost of $(1+\epsilon)$ times the optimum cost after $T = \mathrm{polylog}(p)\, O(1/\epsilon^2)$. This is in comparison to previous work on the dense dynamics where the algorithm requires time that scales exponentially with dimension in order to achieve regret of $\epsilon$ times the optimal cost. We believe that our result has prominent applications in the emerging area of computational advertising, in particular targeted online advertising and advertising in social networks.

Machine learning, Optimization and control

Deep Reinforcement Learning for Efficient Measurement of Quantum Devices

V. Nguyen, S.B. Orbell, D.T. Lennon, 2020

Deep reinforcement learning is an emerging machine learning approach which can teach a computer to learn from its actions and rewards, similar to the way humans learn from experience. It offers many advantages in automating decision processes to navigate large parameter spaces. This paper proposes a novel approach to the efficient measurement of quantum devices based on deep reinforcement learning. We focus on double quantum dot devices, demonstrating the fully automatic identification of specific transport features called bias triangles. Measurements targeting these features are difficult to automate, since bias triangles are found in otherwise featureless regions of the parameter space. Our algorithm identifies bias triangles in a mean time of less than 30 minutes, and sometimes as little as 1 minute. This approach, based on dueling deep Q-networks, can be adapted to a broad range of devices and target transport features. This is a crucial demonstration of the utility of deep reinforcement learning for decision making in the measurement and operation of quantum devices.

Mesoscale and nanoscale physics, Machine learning, Quantum physics

Efficient Poverty Mapping using Deep Reinforcement Learning

Kumar Ayush, Burak Uzkent, Kumar Tanmay, 2020

The combination of high-resolution satellite imagery and machine learning has proven useful in many sustainability-related tasks, including poverty prediction, infrastructure measurement, and forest monitoring. However, the accuracy afforded by high-resolution imagery comes at a cost, as such imagery is extremely expensive to purchase at scale. This creates a substantial hurdle to the efficient scaling and widespread adoption of high-resolution-based approaches. To reduce acquisition costs while maintaining accuracy, we propose a reinforcement learning approach in which free low-resolution imagery is used to dynamically identify where to acquire costly high-resolution images, prior to performing a deep learning task on the high-resolution images. We apply this approach to the task of poverty prediction in Uganda, building on an earlier approach that used object detection to count objects and used these counts to predict poverty. Our approach exceeds previous performance benchmarks on this task while using 80% fewer high-resolution images. Our approach could have application in many sustainability domains that require high-resolution imagery.

Computer vision and pattern recognition