Safe Model-based Off-policy Reinforcement Learning for Eco-Driving in Connected and Automated Hybrid Electric Vehicles

Published by Zhaoxuan Zhu

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Zhaoxuan Zhu - Nicola Pivaro - Shobhit Gupta

Abstract

Connected and Automated Hybrid Electric Vehicles have the potential to reduce fuel consumption and travel time in real-world driving conditions. The eco-driving problem seeks to design optimal speed and power usage profiles based upon look-ahead information from connectivity and advanced mapping features. Recently, Deep Reinforcement Learning (DRL) has been applied to the eco-driving problem. While the previous studies synthesize simulators and model-free DRL to reduce online computation, this work proposes a Safe Off-policy Model-Based Reinforcement Learning algorithm for the eco-driving problem. The advantages over the existing literature are three-fold. First, the combination of off-policy learning and the use of a physics-based model improves the sample efficiency. Second, the training does not require any extrinsic rewarding mechanism for constraint satisfaction. Third, the feasibility of trajectory is guaranteed by using a safe set approximated by deep generative models. The performance of the proposed method is benchmarked against a baseline controller representing human drivers, a previously designed model-free DRL strategy, and the wait-and-see optimal solution. In simulation, the proposed algorithm leads to a policy with a higher average speed and a better fuel economy compared to the model-free agent. Compared to the baseline controller, the learned strategy reduces the fuel consumption by more than 21% while keeping the average speed comparable.

235 - Paul Young Joun Ha , Sikai Chen , Jiqian Dong 2020

Active Traffic Management strategies are often adopted in real-time to address such sudden flow breakdowns. When queuing is imminent, Speed Harmonization (SH), which adjusts speeds in upstream traffic to mitigate traffic shockwaves downstream, can be applied. However, because SH depends on driver awareness and compliance, it may not always be effective in mitigating congestion. The use of multiagent reinforcement learning for collaborative learning is a promising solution to this challenge. By incorporating this technique in the control algorithms of connected and autonomous vehicles (CAVs), it may be possible to train the CAVs to make joint decisions that can mitigate highway bottleneck congestion without human driver compliance to altered speed limits. In this regard, we present an RL-based multi-agent CAV control model to operate in mixed traffic (both CAVs and human-driven vehicles (HDVs)). The results suggest that even at a CAV percent share of corridor traffic as low as 10%, CAVs can significantly mitigate bottlenecks in highway traffic. Another objective was to assess the efficacy of the RL-based controller vis-à-vis that of the rule-based controller. In addressing this objective, we duly recognize that one of the main challenges of RL-based CAV controllers is the variety and complexity of inputs that exist in the real world, such as the information provided to the CAV by other connected entities and sensed information. These translate as dynamic length inputs which are difficult to process and learn from. For this reason, we propose the use of Graphical Convolution Networks (GCN), a specific RL technique, to preserve information network topology and corresponding dynamic length inputs. We then use this, combined with Deep Deterministic Policy Gradient (DDPG), to carry out multi-agent training for congestion mitigation using the CAV controllers.

Machine Learning, Systems and Control

Real-time Eco-Driving Control in Electrified Connected and Autonomous Vehicles using Approximate Dynamic Programming

132 - Shreshta Rajakumar Deshpande , Shobhit Gupta , Abhishek Gupta and Marcello Canova 2021

Connected and Automated Vehicles (CAVs), particularly those with a hybrid electric powertrain, have the potential to significantly improve vehicle energy savings in real-world driving conditions. In particular, the Eco-Driving problem seeks to design optimal speed and power usage profiles based on available information from connectivity and advanced mapping features to minimize the fuel consumption over an itinerary. This paper presents a hierarchical multi-layer Model Predictive Control (MPC) approach for improving the fuel economy of a 48V mild-hybrid powertrain in a connected vehicle environment. Approximate Dynamic Programming (ADP) is used to solve the Receding Horizon Optimal Control Problem (RHOCP), where the terminal cost for the RHOCP is approximated as the base-policy obtained from the long-term optimization. The controller was extensively tested virtually (using both deterministic and Monte Carlo simulations) across multiple real-world routes where energy savings of more than 20% have been demonstrated. Further, the developed controller was deployed and tested at a proving ground in real-time on a test vehicle equipped with a rapid prototyping embedded controller. Real-time in-vehicle testing confirmed the energy savings observed in simulation and demonstrated the ability of the developed controller to be effective in real-time applications.

Systems and Control

Zero-shot Deep Reinforcement Learning Driving Policy Transfer for Autonomous Vehicles based on Robust Control

432 - Zhuo Xu , Chen Tang , Masayoshi Tomizuka 2018

Although deep reinforcement learning (deep RL) methods have many strengths that are favorable for autonomous driving, real deep RL applications in autonomous driving have been slowed down by the modeling gap between the source (training) domain and the target (deployment) domain. Unlike current policy transfer approaches, which are generally limited to using uninterpretable neural network representations as the transferred features, we propose to transfer concrete kinematic quantities in autonomous driving. The proposed robust-control-based (RC) generic transfer architecture, which we call RL-RC, incorporates a transferable hierarchical RL trajectory planner and a robust tracking controller based on a disturbance observer (DOB). The deep RL policies trained with a known nominal dynamics model are transferred directly to the target domain, and DOB-based robust tracking control is applied to tackle the modeling gap, including vehicle dynamics errors and external disturbances such as side forces. We provide simulations validating the capability of the proposed method to achieve zero-shot transfer across multiple driving scenarios such as lane keeping, lane changing and obstacle avoidance.

Machine Learning, Robotics, Systems and Control

Fully Distributed Model Predictive Control of Connected Automated Vehicles in Intersections: Theory and Vehicle Experiments

103 - Alexander Katriniok , Benedikt Rosarius , Petri Mahonen 2021

We propose a fully distributed control system architecture, amenable to in-vehicle implementation, that aims to safely coordinate connected and automated vehicles (CAVs) in road intersections. For control purposes, we build upon a fully distributed model predictive control approach, in which the agents solve a nonconvex optimal control problem (OCP) locally and synchronously, and exchange their optimized trajectories via vehicle-to-vehicle (V2V) communication. To accommodate a fast solution of the nonconvex OCPs, we apply the penalty convex-concave procedure which aims to solve a convexified version of the original OCP. For experimental evaluation, we complement the predictive controller with a localization layer, being in charge of self-localization and the estimation of joint collision points with other agents. Moreover, we come up with a proprietary communication protocol to exchange trajectories with other agents. Experimental tests reveal the efficacy of the proposed control system architecture.

Optimization and Control, Systems and Control

Conflict-free Cooperation Method for Connected and Automated Vehicles at Unsignalized Intersections: Graph-based Modeling and Optimality Analysis

108 - Chaoyi Chen , Qing Xu , Mengchi Cai 2021

Connected and automated vehicles have shown great potential in improving traffic mobility and reducing emissions, especially at unsignalized intersections. Previous research has shown that vehicle passing order is the key influencing factor in improving intersection traffic mobility. In this paper, we propose a graph-based cooperation method to formalize the conflict-free scheduling problem at an unsignalized intersection. Based on graphical analysis, the vehicles' trajectory conflict relationship is modeled as a conflict directed graph and a coexisting undirected graph. Then, two graph-based methods are proposed to find the vehicle passing order. The first is an improved depth-first spanning tree algorithm, which aims to find the local optimal passing order vehicle by vehicle. The other novel method is a minimum clique cover algorithm, which identifies the global optimal solution. Finally, a distributed control framework and communication topology are presented to realize the conflict-free cooperation of vehicles. Extensive numerical simulations are conducted for various numbers of vehicles and traffic volumes, and the simulation results prove the effectiveness of the proposed algorithms.

Robotics, Systems and Control