Certainty Equivalent Perception-Based Control

Published by Sarah Dean

Publication date: 2020

Research field: Informatics Engineering

Paper language: English

Authors: Sarah Dean - Benjamin Recht

Abstract

In order to certify performance and safety, feedback control requires precise characterization of sensor errors. In this paper, we provide guarantees on such feedback systems when sensors are characterized by solving a supervised learning problem. We show a uniform error bound on nonparametric kernel regression under a dynamically-achievable dense sampling scheme. This allows for a finite-time convergence rate on the sub-optimality of using the regressor in closed-loop for waypoint tracking. We demonstrate our results in simulation with simplified unmanned aerial vehicle and autonomous driving examples.
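To make the regression step concrete, here is a minimal sketch of one standard nonparametric kernel regressor, the Nadaraya-Watson estimator. The abstract does not say which estimator, kernel, or bandwidth the authors use, so the Gaussian kernel, the `bandwidth` value, the function name `nadaraya_watson`, and the toy sine-shaped "sensor map" are all assumptions made for illustration.

```python
import numpy as np


def nadaraya_watson(x_query, x_train, y_train, bandwidth=0.1):
    """Kernel-weighted average of training labels (Gaussian kernel assumed).

    x_query: array of shape (m, d), x_train: array of shape (n, d),
    y_train: array of shape (n,). Returns predictions of shape (m,).
    """
    # Pairwise squared distances between query and training points: (m, n).
    diff = x_query[:, None, :] - x_train[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    # Gaussian kernel weights, normalized so each row sums to one.
    weights = np.exp(-sq_dist / (2.0 * bandwidth**2))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ y_train


# Toy data (hypothetical): densely sampled states with noisy measurements.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
y_train = np.sin(2 * np.pi * x_train[:, 0]) + 0.05 * rng.standard_normal(200)

x_query = np.array([[0.25], [0.5]])
y_hat = nadaraya_watson(x_query, x_train, y_train, bandwidth=0.02)
print(y_hat)  # estimates of sin(2*pi*x) at x = 0.25 and x = 0.5
```

In this toy setting the estimate is accurate only where training samples are dense relative to the bandwidth, which is the intuition behind requiring a dense sampling scheme.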

77 - Kunal Menda , Jean de Becdelièvre , Jayesh K. Gupta 2020

System identification is a key step for model-based control, estimator design, and output prediction. This work considers the offline identification of partially observed nonlinear systems. We empirically show that the certainty-equivalent approximation to expectation-maximization can be a reliable and scalable approach for high-dimensional deterministic systems, which are common in robotics. We formulate certainty-equivalent expectation-maximization as block coordinate-ascent, and provide an efficient implementation. The algorithm is tested on a simulated system of coupled Lorenz attractors, demonstrating its ability to identify high-dimensional systems that can be intractable for particle-based approaches. Our approach is also used to identify the dynamics of an aerobatic helicopter. By augmenting the state with unobserved fluid states, a model is learned that predicts the acceleration of the helicopter better than state-of-the-art approaches. The codebase for this work is available at https://github.com/sisl/CEEM.

Machine Learning, Robotics, Systems and Control

Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization

103 - Tolga Ergen , Arda Sahiner , Batu Ozturkler 2021

Batch Normalization (BN) is a commonly used technique to accelerate and stabilize training of deep neural networks. Despite its empirical success, a full theoretical understanding of BN is yet to be developed. In this work, we analyze BN through the lens of convex optimization. We introduce an analytic framework based on convex duality to obtain exact convex representations of weight-decay regularized ReLU networks with BN, which can be trained in polynomial time. Our analyses also show that optimal layer weights can be obtained as simple closed-form formulas in the high-dimensional and/or overparameterized regimes. Furthermore, we find that Gradient Descent provides an algorithmic bias effect on the standard non-convex BN network, and we design an approach to explicitly encode this implicit regularization into the convex objective. Experiments with CIFAR image classification highlight the effectiveness of this explicit regularization for mimicking and substantially improving the performance of standard BN networks.

Machine Learning, Optimization and Control

Non-Stochastic Control with Bandit Feedback

95 - Paula Gradu , John Hallman , Elad Hazan 2020

We study the problem of controlling a linear dynamical system with adversarial perturbations where the only feedback available to the controller is the scalar loss, and the loss function itself is unknown. For this problem, with either a known or unknown system, we give an efficient sublinear regret algorithm. The main algorithmic difficulty is the dependence of the loss on past controls. To overcome this issue, we propose an efficient algorithm for the general setting of bandit convex optimization for loss functions with memory, which may be of independent interest.

Machine Learning, Optimization and Control

How Are Learned Perception-Based Controllers Impacted by the Limits of Robust Control?

175 - Jingxi Xu , Bruce Lee , Nikolai Matni 2021

The difficulty of optimal control problems has classically been characterized in terms of system properties such as minimum eigenvalues of controllability/observability Gramians. We revisit these characterizations in the context of the increasing popularity of data-driven techniques like reinforcement learning (RL), and in control settings where input observations are high-dimensional images and transition dynamics are unknown. Specifically, we ask: to what extent are quantifiable control and perceptual difficulty metrics of a task predictive of the performance and sample complexity of data-driven controllers? We modulate two different types of partial observability in a cartpole stick-balancing problem -- (i) the height of one visible fixation point on the cartpole, which can be used to tune fundamental limits of performance achievable by any controller, and (ii) the level of perception noise in the fixation point position inferred from depth or RGB images of the cartpole. In these settings, we empirically study two popular families of controllers: RL and system identification-based $H_\infty$ control, using visually estimated system state. Our results show that the fundamental limits of robust control have corresponding implications for the sample-efficiency and performance of learned perception-based controllers. Visit our project website https://jxu.ai/rl-vs-control-web for more information.

Machine Learning, Computer Vision and Pattern Recognition, Robotics

Stochastic control of optimized certainty equivalents

117 - Julio Backhoff Veraguas , A. Max Reppen , Ludovic Tangpi 2020

Optimized certainty equivalents (OCEs) are a family of risk measures widely used by both practitioners and academics. This is mostly due to their tractability and the fact that the family encompasses important examples, including entropic risk measures and average value at risk. In this work we consider stochastic optimal control problems where the objective criterion is given by an OCE risk measure, or put in other words, a risk minimization problem for controlled diffusions. A major difficulty arises since OCEs are often time inconsistent. Nevertheless, via an enlargement of state space we achieve a substitute of sorts for time consistency in fair generality. This allows us to derive a dynamic programming principle and thus recover central results of (risk-neutral) stochastic control theory. In particular, we show that the value of our risk minimization problem can be characterized via the viscosity solution of a Hamilton--Jacobi--Bellman--Isaacs equation. We further establish the uniqueness of the latter under suitable technical conditions.

Optimization and Control, Statistics and Mathematical Finance, Statistics and Risk Management
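As a rough numerical illustration of the OCE family mentioned in the last abstract, the sketch below uses one common convention, OCE(X) = min over eta of eta + E[l(X - eta)], and checks that the exponential loss l(x) = (exp(gamma x) - 1) / gamma recovers the entropic risk measure (1 / gamma) log E[exp(gamma X)]. The abstract does not state the paper's sign convention or loss normalization, so the convention, the function name `oce`, the grid search over eta, and the Gaussian samples are assumptions made for illustration only.

```python
import numpy as np


def oce(samples, loss, etas):
    """Monte Carlo OCE under the assumed convention:
    minimum over a grid of eta values of eta + mean(loss(X - eta))."""
    values = [eta + np.mean(loss(samples - eta)) for eta in etas]
    return float(np.min(values))


gamma = 1.0  # risk-aversion parameter (hypothetical choice)
rng = np.random.default_rng(0)
samples = rng.standard_normal(20_000)  # toy losses X ~ N(0, 1)
etas = np.linspace(-2.0, 3.0, 1001)    # search grid for eta


def entropic_loss(x):
    # Exponential loss whose OCE is the entropic risk measure.
    return (np.exp(gamma * x) - 1.0) / gamma


oce_value = oce(samples, entropic_loss, etas)
# Closed form of the entropic risk on the same samples.
closed_form = float(np.log(np.mean(np.exp(gamma * samples))) / gamma)
print(oce_value, closed_form)
```

The two printed numbers should agree up to the resolution of the eta grid, since the objective is convex in eta and its minimizer equals the entropic risk itself.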