
Integrated Decision and Control: Towards Interpretable and Computationally Efficient Driving Intelligence

Posted by Yang Guan
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





Decision and control are core functionalities of high-level automated vehicles. Current mainstream methods, such as functionality decomposition and end-to-end reinforcement learning (RL), suffer either high time complexity or poor interpretability and adaptability on real-world autonomous driving tasks. In this paper, we present an interpretable and computationally efficient framework called integrated decision and control (IDC) for automated vehicles, which decomposes the driving task into static path planning and dynamic optimal tracking, structured hierarchically. First, the static path planning generates several candidate paths considering only static traffic elements. Then, the dynamic optimal tracking is designed to track the optimal path while considering the dynamic obstacles. To that end, we formulate a constrained optimal control problem (OCP) for each candidate path, optimize them separately, and follow the one with the best tracking performance. To offload the heavy online computation, we propose a model-based RL algorithm that can serve as an approximate constrained OCP solver. Specifically, the OCPs for all paths are combined to construct a single complete RL problem, which is then solved offline in the form of value and policy networks, used for real-time online path selection and tracking, respectively. We verify our framework in both simulation and the real world. Results show that, compared with baseline methods, IDC achieves an order of magnitude higher online computing efficiency, as well as better driving performance, including traffic efficiency and safety. In addition, it yields great interpretability and adaptability across different driving tasks. The effectiveness of the proposed method is also demonstrated in real road tests with complicated traffic conditions.
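
As an illustration of the online phase, here is a minimal Python sketch (assuming hypothetical `value_net` and `policy_net` callables produced by the offline RL training; the names and input encoding are ours, not the paper's): the value network scores the tracking cost of each candidate path, and the policy network computes the control for the selected one.

```python
import numpy as np


def idc_online_step(state, candidate_paths, value_net, policy_net):
    """Select the candidate path with the lowest predicted tracking cost,
    then compute the tracking control for that path.

    state: 1-D array describing ego and surrounding traffic.
    candidate_paths: list of 1-D arrays, one encoded static path each.
    value_net / policy_net: callables trained offline (assumed here).
    """
    # The value network approximates the optimal cost of each path's OCP,
    # so path selection reduces to a handful of forward passes.
    costs = [value_net(np.concatenate([state, path]))
             for path in candidate_paths]
    best = int(np.argmin(costs))
    # The policy network approximates the optimal tracking controller
    # for the selected path.
    control = policy_net(np.concatenate([state, candidate_paths[best]]))
    return best, control
```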




Read also

Autonomous driving at intersections is one of the most complicated and accident-prone traffic scenarios, especially with mixed traffic participants such as vehicles, bicycles, and pedestrians. The driving policy should make safe decisions to handle the dynamic traffic conditions and meet the requirements of on-board computation. However, most current research focuses on simplified intersections, considering only the surrounding vehicles and idealized traffic lights. This paper improves the integrated decision and control framework and develops a learning-based algorithm to deal with complex intersections with mixed traffic flows, which can not only account for realistic characteristics of traffic lights but also learn a safe policy under different safety constraints. We first consider different velocity models for green and red lights in the training process and use a finite state machine to handle the different modes of light transition, as sketched below. Then we design different types of distance constraints for vehicles, traffic lights, pedestrians, and bicycles, respectively, and formulate the constrained optimal control problems (OCPs) to be optimized. Finally, reinforcement learning (RL) with value and policy networks is adopted to solve the series of OCPs. To verify the safety and efficiency of the proposed method, we design a multi-lane intersection with large-scale mixed traffic participants and set practical traffic light phases. The simulation results indicate that the trained decision and control policy balances safety and tracking performance well. Compared with model predictive control (MPC), the computational time is three orders of magnitude lower.
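
A hedged sketch of the light-handling idea in Python (phase names, durations, and the stop-on-red velocity model are illustrative assumptions, not the paper's exact parameters): a cyclic finite state machine selects the current light mode, which in turn selects the reference velocity model used during training.

```python
from enum import Enum


class LightPhase(Enum):
    GREEN = 0
    YELLOW = 1
    RED = 2


# Illustrative phase durations in seconds (assumed, not from the paper).
PHASE_DURATION = {LightPhase.GREEN: 30.0,
                  LightPhase.YELLOW: 3.0,
                  LightPhase.RED: 27.0}
NEXT_PHASE = {LightPhase.GREEN: LightPhase.YELLOW,
              LightPhase.YELLOW: LightPhase.RED,
              LightPhase.RED: LightPhase.GREEN}


def light_phase(t):
    """Finite state machine: return the phase active at time t for a
    fixed cyclic schedule."""
    t = t % sum(PHASE_DURATION.values())
    phase = LightPhase.GREEN
    while t >= PHASE_DURATION[phase]:
        t -= PHASE_DURATION[phase]
        phase = NEXT_PHASE[phase]
    return phase


def reference_speed(t, cruise_speed=8.0):
    """Velocity model: cruise on green, stop otherwise (an assumption)."""
    return cruise_speed if light_phase(t) is LightPhase.GREEN else 0.0
```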
Over the past few years, the use of swarms of Unmanned Aerial Vehicles (UAVs) in monitoring and remote-area surveillance applications has become widespread, thanks to price reductions and the increased capabilities of drones. The drones in the swarm need to cooperatively explore an unknown area in order to identify and monitor interesting targets while minimizing their movements. In this work, we propose a distributed Reinforcement Learning (RL) approach that scales to larger swarms without modification. The proposed framework relies on the possibility for the UAVs to exchange some information through a communication channel, in order to achieve context-awareness and implicitly coordinate the swarm's actions. Our experiments show that the proposed method can yield effective strategies that are robust to communication channel impairments and can easily deal with non-uniform distributions of targets and obstacles. Moreover, when agents are trained in a specific scenario, they can adapt to a new one with minimal additional training. We also show that our approach achieves better performance than a computationally intensive look-ahead heuristic.
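
A minimal sketch, under our own assumptions about the message content, of how a fixed-size context-aware observation could be built from teammate broadcasts so that one shared policy scales with swarm size; the sector histogram and the message-drop model are illustrative, not the authors' design.

```python
import numpy as np


def swarm_observation(own_position, local_map_patch, teammate_positions,
                      drop_prob=0.0, n_sectors=8, rng=None):
    """Fixed-size context observation: local sensing plus a histogram of
    teammate bearings, so the input does not grow with swarm size."""
    rng = rng or np.random.default_rng()
    hist = np.zeros(n_sectors)
    for pos in teammate_positions:
        if rng.random() < drop_prob:   # simulate a lost broadcast
            continue
        dx, dy = np.asarray(pos) - np.asarray(own_position)
        angle = np.arctan2(dy, dx) % (2 * np.pi)  # bearing in [0, 2*pi)
        hist[int(angle / (2 * np.pi) * n_sectors) % n_sectors] += 1
    return np.concatenate([np.ravel(local_map_patch),
                           np.asarray(own_position), hist])
```

Because the histogram aggregates teammates rather than listing them, the same observation dimension (and hence the same policy network) works for any swarm size.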
In recent years, the increasing interest in stochastic model predictive control (SMPC) schemes has highlighted the limitation arising from their inherent computational demand, which has restricted their applicability to slow-dynamics and high-performing systems. To reduce the computational burden, in this paper we extend the probabilistic scaling approach to obtain low-complexity inner approximations of chance-constrained sets. This approach provides probabilistic guarantees at a lower computational cost than other schemes whose sample complexity depends on the design-space dimension. To design candidate simple approximating sets, which approximate the shape of the probabilistic set, we introduce two possibilities: i) fixed-complexity polytopes, and ii) $\ell_p$-norm based sets. Once the candidate approximating set is obtained, it is scaled around its center so as to enforce the expected probabilistic guarantees. The resulting scaled set is then exploited to enforce constraints in the classical SMPC framework. The computational gain obtained with the proposed approach with respect to the scenario approach is demonstrated via simulations, where the objective is the control of a fixed-wing UAV performing a monitoring mission over a sloped vineyard.
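
The core scaling step admits a compact sketch for the $\ell_p$-norm case (a toy reconstruction; the sample size and the number of discarded samples must be chosen according to the probabilistic scaling bounds, which we do not reproduce here): the candidate ball is scaled around its center until it contains all but the discarded uncertainty samples.

```python
import numpy as np


def probabilistic_scaling(samples, center, p=2, discard=0):
    """Return the factor gamma such that the scaled candidate set
    {x : ||x - center||_p <= gamma} contains all samples except the
    `discard` farthest ones.

    For an l_p ball, the gauge of a sample is just its l_p distance
    from the center, so scaling reduces to an order statistic.
    """
    gauges = np.sort(np.linalg.norm(np.asarray(samples) - center,
                                    ord=p, axis=1))
    return gauges[len(gauges) - 1 - discard]


# Illustrative usage: scale a Euclidean ball around the origin using
# 1000 uncertainty samples, discarding the 10 farthest ones.
rng = np.random.default_rng(0)
gamma = probabilistic_scaling(rng.normal(size=(1000, 2)),
                              center=np.zeros(2), p=2, discard=10)
```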
Model-free or learning-based control, in particular reinforcement learning (RL), is expected to be applied to complex robotic tasks. Traditional RL requires the policy being optimized to be state-dependent; that is, the policy is a kind of feedback (FB) controller. Because such an FB controller requires correct state observation, it is sensitive to sensing failures. To alleviate this drawback of FB controllers, feedback error learning integrates an FB controller with a feedforward (FF) controller. RL could be improved by handling both FB and FF policies, but, to the best of our knowledge, a methodology for learning them in a unified manner has not been developed. In this paper, we propose a new optimization problem for optimizing the FB and FF policies simultaneously. Inspired by control as inference, the optimization problem considers the minimization/maximization of divergences between the trajectory predicted by the composed policy and a stochastic dynamics model, and the optimal/non-optimal trajectories. By approximating the stochastic dynamics model with a variational method, we naturally derive a regularization between the FB and FF policies. In numerical simulations and a robot experiment, we verified that the proposed method can stably optimize the composed policy even with a learning law different from that of traditional RL. In addition, we demonstrated that the FF policy is robust to sensing failures and can maintain the optimal motion. An accompanying video is available on YouTube: https://youtu.be/zLL4uXIRmrE
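
A minimal sketch of a composed FB/FF policy of the kind described above (the additive composition and the `state_valid` flag are our assumptions for illustration): the FF term depends only on time, so it keeps producing the nominal motion when state sensing fails.

```python
def composed_action(t, state, ff_policy, fb_policy, state_valid=True):
    """Compose a feedforward and a feedback policy into one control.

    ff_policy: callable t -> action; depends only on time, so it
        survives sensing failures.
    fb_policy: callable state -> corrective action; added only when
        the state observation is trusted.
    """
    u = ff_policy(t)
    if state_valid:
        u = u + fb_policy(state)
    return u
```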
Artificial neural networks (ANNs) are commonly labelled black boxes, lacking interpretability. This hinders human understanding of ANNs' behaviors. There is a need to generate a meaningful sequential logic for the production of a specific output. Decision trees exhibit better interpretability and expressive power due to their representation language and the existence of efficient algorithms for generating rules. Growing a decision tree based on the available data could produce larger-than-necessary trees or trees that do not generalise well. In this paper, we introduce two novel multivariate decision tree (MDT) algorithms for rule extraction from an ANN: an Exact-Convertible Decision Tree (EC-DT) and an Extended C-Net algorithm, which transform a neural network with Rectified Linear Unit activation functions into a representative tree that can be used to extract multivariate rules for reasoning. While the EC-DT translates the ANN layer-wise to represent exactly the decision boundaries implicitly learned by the hidden layers of the network, the Extended C-Net inherits the decompositional approach from EC-DT and combines it with a C5 tree learning algorithm to construct the decision rules. The results suggest that while EC-DT is superior in preserving the structure and accuracy of the ANN, Extended C-Net generates the most compact and highly effective trees from the ANN. Both proposed MDT algorithms generate rules that include combinations of multiple attributes for precise interpretation of decision-making processes.
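
As a toy illustration of the layer-wise idea behind EC-DT (our reconstruction, not the authors' algorithm): for a ReLU network, the activation pattern of an input fixes one linear inequality per hidden unit, and the conjunction of those inequalities is exactly the multivariate rule describing the linear region containing that input.

```python
import numpy as np


def region_rule(x, W, b):
    """Return the multivariate rule (A z <= c) that pins down the linear
    region containing x for a one-hidden-layer ReLU network with hidden
    pre-activation W z + b.

    Active unit i contributes  W_i z + b_i > 0  (rewritten -W_i z <= b_i);
    inactive unit i contributes W_i z + b_i <= 0 (i.e.  W_i z <= -b_i).
    Within the region, the network is an affine function of z.
    """
    active = (W @ x + b) > 0
    A = np.where(active[:, None], -W, W)
    c = np.where(active, b, -b)
    return A, c, active


# Illustrative usage with a random 2-input, 4-unit hidden layer.
rng = np.random.default_rng(0)
A, c, pattern = region_rule(np.array([0.5, -1.0]),
                            rng.normal(size=(4, 2)),
                            rng.normal(size=4))
```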
