مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Integrated and Adaptive Guidance and Control for Endoatmospheric Missiles via Reinforcement Learning

157 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Brian Gaudet

تاريخ النشر 2021

مجال البحث هندسة إلكترونية الهندسة المعلوماتية

والبحث باللغة English

تأليف Brian Gaudet - Isaac Charcos - Roberto Furfaro

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We apply the meta reinforcement learning framework to optimize an integrated and adaptive guidance and flight control system for an air-to-air missile, implementing the system as a deep neural network (the policy). The policy maps observations directly to commanded rates of change for the missiles control surface deflections, with the observations derived with minimal processing from the computationally stabilized line of sight unit vector measured by a strap down seeker, estimated rotational velocity from rate gyros, and control surface deflection angles. The system induces intercept trajectories against a maneuvering target that satisfy control constraints on fin deflection angles, and path constraints on look angle and load. We test the optimized system in a six degrees-of-freedom simulator that includes a non-linear radome model and a strapdown seeker model. Through extensive simulation, we demonstrate that the system can adapt to a large flight envelope and off nominal flight conditions that include perturbation of aerodynamic coefficient parameters and center of pressure locations. Moreover, we find that the system is robust to the parasitic attitude loop induced by radome refraction, imperfect seeker stabilization, and sensor scale factor errors. Finally, we compare our systems performance to two benchmarks: a proportional navigation guidance system benchmark in a simplified 3-DOF environment, which we take as an upper bound on performance attainable with separate guidance and flight control systems, and a longitudinal model of proportional navigation coupled with a three loop autopilot. We find that our system moderately outperforms the former, and outperforms the latter by a large margin.

قيم البحث

251 - Zhian Kuang , Liting Sun , Huijun Gao 2021

To obtain precise motion control of wafer stages, an adaptive neural network and fractional-order super-twisting control strategy is proposed. Based on sliding mode control (SMC), the proposed controller aims to address two challenges in SMC: 1) redu cing the chattering phenomenon, and 2) attenuating the influence of model uncertainties and disturbances. For the first challenge, a fractional-order terminal sliding mode surface and a super-twisting algorithm are integrated into the SMC design. To attenuate uncertainties and disturbances, an add-on control structure based on the radial basis function (RBF) neural network is introduced. Stability analysis of the closed-loop control system is provided. Finally, experiments on a wafer stage testbed system are conducted, which proves that the proposed controller can robustly improve the tracking performance in the presence of uncertainties and disturbances compared to conventional and previous controllers.

أنظمة وتحكم علم الروبوتات أنظمة وتحكم

Adaptive Load Shedding for Grid Emergency Control via Deep Reinforcement Learning

313 - Ying Zhang , Meng Yue , Jianhui Wang 2021

Emergency control, typically such as under-voltage load shedding (UVLS), is broadly used to grapple with low voltage and voltage instability issues in practical power systems under contingencies. However, existing emergency control schemes are rule-b ased and cannot be adaptively applied to uncertain and floating operating conditions. This paper proposes an adaptive UVLS algorithm for emergency control via deep reinforcement learning (DRL) and expert systems. We first construct dynamic components for picturing the power system operation as the environment. The transient voltage recovery criteria, which poses time-varying requirements to UVLS, is integrated into the states and reward function to advise the learning of deep neural networks. The proposed approach has no tuning issue of coefficients in reward functions, and this issue was regarded as a deficiency in the existing DRL-based algorithms. Extensive case studies illustrate that the proposed method outperforms the traditional UVLS relay in both the timeliness and efficacy for emergency control.

أنظمة وتحكم أنظمة وتحكم

Deep Reinforcement Learning with Enhanced Safety for Autonomous Highway Driving

140 - Ali Baheri , Subramanya Nageshrao , H. Eric Tseng 2019

In this paper, we present a safe deep reinforcement learning system for automated driving. The proposed framework leverages merits of both rule-based and learning-based approaches for safety assurance. Our safety system consists of two modules namely handcrafted safety and dynamically-learned safety. The handcrafted safety module is a heuristic safety rule based on common driving practice that ensure a minimum relative gap to a traffic vehicle. On the other hand, the dynamically-learned safety module is a data-driven safety rule that learns safety patterns from driving data. Specifically, the dynamically-leaned safety module incorporates a model lookahead beyond the immediate reward of reinforcement learning to predict safety longer into the future. If one of the future states leads to a near-miss or collision, then a negative reward will be assigned to the reward function to avoid collision and accelerate the learning process. We demonstrate the capability of the proposed framework in a simulation environment with varying traffic density. Our results show the superior capabilities of the policy enhanced with dynamically-learned safety module.

أنظمة وتحكم علم الروبوتات أنظمة وتحكم

Lyapunov-Based Reinforcement Learning for Decentralized Multi-Agent Control

224 - Qingrui Zhang , Hao Dong , Wei Pan 2020

Decentralized multi-agent control has broad applications, ranging from multi-robot cooperation to distributed sensor networks. In decentralized multi-agent control, systems are complex with unknown or highly uncertain dynamics, where traditional mode l-based control methods can hardly be applied. Compared with model-based control in control theory, deep reinforcement learning (DRL) is promising to learn the controller/policy from data without the knowing system dynamics. However, to directly apply DRL to decentralized multi-agent control is challenging, as interactions among agents make the learning environment non-stationary. More importantly, the existing multi-agent reinforcement learning (MARL) algorithms cannot ensure the closed-loop stability of a multi-agent system from a control-theoretic perspective, so the learned control polices are highly possible to generate abnormal or dangerous behaviors in real applications. Hence, without stability guarantee, the application of the existing MARL algorithms to real multi-agent systems is of great concern, e.g., UAVs, robots, and power systems, etc. In this paper, we aim to propose a new MARL algorithm for decentralized multi-agent control with a stability guarantee. The new MARL algorithm, termed as a multi-agent soft-actor critic (MASAC), is proposed under the well-known framework of centralized-training-with-decentralized-execution. The closed-loop stability is guaranteed by the introduction of a stability constraint during the policy improvement in our MASAC algorithm. The stability constraint is designed based on Lyapunovs method in control theory. To demonstrate the effectiveness, we present a multi-agent navigation example to show the efficiency of the proposed MASAC algorithm.

أنظمة وتحكم التعلم الآلي أنظمة وتحكم

Control Barriers in Bayesian Learning of System Dynamics

205 - Vikas Dhiman , Mohammad Javad Khojasteh , Massimo Franceschetti 2020

This paper focuses on learning a model of system dynamics online while satisfying safety constraints. Our objective is to avoid offline system identification or hand-specified models and allow a system to safely and autonomously estimate and adapt it s own model during operation. Given streaming observations of the system state, we use Bayesian learning to obtain a distribution over the system dynamics. Specifically, we propose a new matrix variate Gaussian process (MVGP) regression approach with an efficient covariance factorization to learn the drift and input gain terms of a nonlinear control-affine system. The MVGP distribution is then used to optimize the system behavior and ensure safety with high probability, by specifying control Lyapunov function (CLF) and control barrier function (CBF) chance constraints. We show that a safe control policy can be synthesized for systems with arbitrary relative degree and probabilistic CLF-CBF constraints by solving a second order cone program (SOCP). Finally, we extend our design to a self-triggering formulation, adaptively determining the time at which a new control input needs to be applied in order to guarantee safety.

أنظمة وتحكم علم الروبوتات أنظمة وتحكم

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

المعهد العالي لإدارة الأعمال

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Integrated and Adaptive Guidance and Control for Endoatmospheric Missiles via Reinforcement Learning

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً