ترغب بنشر مسار تعليمي؟ اضغط هنا

Reinforced optimal control

191   0   0.0 ( 0 )
 نشر من قبل Denis Belomestny
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Least squares Monte Carlo methods are a popular numerical approximation method for solving stochastic control problems. Based on dynamic programming, their key feature is the approximation of the conditional expectation of future rewards by linear least squares regression. Hence, the choice of basis functions is crucial for the accuracy of the method. Earlier work by some of us [Belomestny, Schoenmakers, Spokoiny, Zharkynbay. Commun.~Math.~Sci., 18(1):109-121, 2020] proposes to emph{reinforce} the basis functions in the case of optimal stopping problems by already computed value functions for later times, thereby considerably improving the accuracy with limited additional computational cost. We extend the reinforced regression method to a general class of stochastic control problems, while considerably improving the methods efficiency, as demonstrated by substantial numerical examples as well as theoretical analysis.



قيم البحث

اقرأ أيضاً

This work discusses the finite element discretization of an optimal control problem for the linear wave equation with time-dependent controls of bounded variation. The main focus lies on the convergence analysis of the discretization method. The stat e equation is discretized by a space-time finite element method. The controls are not discretized. Under suitable assumptions optimal convergence rates for the error in the state and control variable are proven. Based on a conditional gradient method the solution of the semi-discretized optimal control problem is computed. The theoretical convergence rates are confirmed in a numerical example.
We present a data-driven point of view for rare events, which represent conformational transitions in biochemical reactions modeled by over-damped Langevin dynamics on manifolds in high dimensions. We first reinterpret the transition state theory and the transition path theory from the optimal control viewpoint. Given point clouds sampled from a reaction dynamics, we construct a discrete Markov process based on an approximated Voronoi tesselation. We use the constructed Markov process to compute a discrete committor function whose level set automatically orders the point clouds. Then based on the committor function, an optimally controlled random walk on point clouds is constructed and utilized to efficiently sample transition paths, which become an almost sure event in $O(1)$ time instead of a rare event in the original reaction dynamics. To compute the mean transition path efficiently, a local averaging algorithm based on the optimally controlled random walk is developed, which adapts the finite temperature string method to the controlled Monte Carlo samples. Numerical examples on sphere/torus including a conformational transition for the alanine dipeptide in vacuum are conducted to illustrate the data-driven solver for the transition path theory on point clouds. The mean transition path obtained via the controlled Monte Carlo simulations highly coincides with the computed dominant transition path in the transition path theory.
Two major problems in modern cities are air contamination and road congestion. They are closely related and present a similar origin: traffic flow. To face these problems, local governments impose traffic restrictions to prevent the entry of vehicles into sensitive areas, with the final aim of dropping down air pollution levels. However, these restrictions force drivers to look for alternative routes that usually generate congestions, implying both longer travel times and higher levels of air pollution. In this work, combining optimal control of partial differential equations and computational modelling, we formulate a multi-objective control problem with air pollution and drivers travel time as objectives and look for its optimal solutions in the sense of Stackelberg. In this problem, local government (the leader) implements traffic restrictions meanwhile the set of drivers (the follower) acts choosing travel preferences against leader constraints. Numerically, the discretized problem is solved by combining genetic-elitist algorithms and interior-point methods, and computational results for a realistic case posed in the Guadalajara Metropolitan Area (Mexico) are shown.
252 - Paul Manns , Stefan Ulbrich 2021
In PDE-constrained optimization, proper orthogonal decomposition (POD) provides a surrogate model of a (potentially expensive) PDE discretization, on which optimization iterations are executed. Because POD models usually provide good approximation qu ality only locally, they have to be updated during optimization. Updating the POD model is usually expensive, however, and therefore often impossible in a model-predictive control (MPC) context. Thus, reduced models of mediocre quality might be accepted. We take the view of a simplified Newton method for solving semilinear evolution equations to derive an algorithm that can serve as an offline phase to produce a POD model. Approaches that build the POD model with impulse response snapshots can be regarded as the first Newton step in this context. In particular, POD models that are based on impulse response snapshots are extended by adding a second simplified Newton step. This procedure improves the approximation quality of the POD model significantly by introducing a moderate amount of extra computational costs during optimization or the MPC loop. We illustrate our findings with an example satisfying our assumptions.
We investigate the problem of optimal transport in the so-called Kantorovich form, i.e. given two Radon measures on two compact sets, we seek an optimal transport plan which is another Radon measure on the product of the sets that has these two measu res as marginals and minimizes a certain cost function. We consider quadratic regularization of the problem, which forces the optimal transport plan to be a square integrable function rather than a Radon measure. We derive the dual problem and show strong duality and existence of primal and dual solutions to the regularized problem. Then we derive two algorithms to solve the dual problem of the regularized problem: A Gauss-Seidel method and a semismooth quasi-Newton method and investigate both methods numerically. Our experiments show that the methods perform well even for small regularization parameters. Quadratic regularization is of interest since the resulting optimal transport plans are sparse, i.e. they have a small support (which is not the case for the often used entropic regularization where the optimal transport plan always has full measure).
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا