Do you want to publish a course? Click here

Encoding Defensive Driving as a Dynamic Nash Game

146   0   0.0 ( 0 )
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

Robots deployed in real-world environments should operate safely in a robust manner. In scenarios where an ego agent navigates in an environment with multiple other non-ego agents, two modes of safety are commonly proposed -- adversarial robustness and probabilistic constraint satisfaction. However, while the former is generally computationally intractable and leads to overconservative solutions, the latter typically relies on strong distributional assumptions and ignores strategic coupling between agents. To avoid these drawbacks, we present a novel formulation of robustness within the framework of general-sum dynamic game theory, modeled on defensive driving. More precisely, we prepend an adversarial phase to the ego agents cost function. That is, we prepend a time interval during which other agents are assumed to be temporarily distracted, in order to render the ego agents equilibrium trajectory robust against other agents potentially dangerous behavior during this time. We demonstrate the effectiveness of our new formulation in encoding safety via multiple traffic scenarios.



rate research

Read More

In this paper, we propose a new reinforcement learning (RL) algorithm, called encoding distributional soft actor-critic (E-DSAC), for decision-making in autonomous driving. Unlike existing RL-based decision-making methods, E-DSAC is suitable for situations where the number of surrounding vehicles is variable and eliminates the requirement for manually pre-designed sorting rules, resulting in higher policy performance and generality. We first develop an encoding distributional policy iteration (DPI) framework by embedding a permutation invariant module, which employs a feature neural network (NN) to encode the indicators of each vehicle, in the distributional RL framework. The proposed DPI framework is proved to exhibit important properties in terms of convergence and global optimality. Next, based on the developed encoding DPI framework, we propose the E-DSAC algorithm by adding the gradient-based update rule of the feature NN to the policy evaluation process of the DSAC algorithm. Then, the multi-lane driving task and the corresponding reward function are designed to verify the effectiveness of the proposed algorithm. Results show that the policy learned by E-DSAC can realize efficient, smooth, and relatively safe autonomous driving in the designed scenario. And the final policy performance learned by E-DSAC is about three times that of DSAC. Furthermore, its effectiveness has also been verified in real vehicle experiments.
97 - Kuang Huang , Xu Chen , Xuan Di 2020
This paper aims to answer the research question as to optimal design of decision-making processes for autonomous vehicles (AVs), including dynamical selection of driving velocity and route choices on a transportation network. Dynamic traffic assignment (DTA) has been widely used to model travelerss route choice or/and departure-time choice and predict dynamic traffic flow evolution in the short term. However, the existing DTA models do not explicitly describe ones selection of driving velocity on a road link. Driving velocity choice may not be crucial for modeling the movement of human drivers but it is a must-have control to maneuver AVs. In this paper, we aim to develop a game-theoretic model to solve for AVss optimal driving strategies of velocity control in the interior of a road link and route choice at a junction node. To this end, we will first reinterpret the DTA problem as an N-car differential game and show that this game can be tackled with a general mean field game-theoretic framework. The developed mean field game is challenging to solve because of the forward and backward structure for velocity control and the complementarity conditions for route choice. An efficient algorithm is developed to address these challenges. The model and the algorithm are illustrated on the Braess network and the OW network with a single destination. On the Braess network, we first compare the LWR based DTA model with the proposed game and find that the driving and routing control navigates AVs with overall lower costs. We then compare the total travel cost without and with the middle link and find that the Braess paradox may still arise under certain conditions. We also test our proposed model and solution algorithm on the OW network.
Designing or learning an autonomous driving policy is undoubtedly a challenging task as the policy has to maintain its safety in all corner cases. In order to secure safety in autonomous driving, the ability to detect hazardous situations, which can be seen as an out-of-distribution (OOD) detection problem, becomes crucial. However, most conventional datasets only provide expert driving demonstrations, although some non-expert or uncommon driving behavior data are needed to implement a safety guaranteed autonomous driving platform. To this end, we present a novel dataset called the R3 Driving Dataset, composed of driving data with different qualities. The dataset categorizes abnormal driving behaviors into eight categories and 369 different detailed situations. The situations include dangerous lane changes and near-collision situations. To further enlighten how these abnormal driving behaviors can be detected, we utilize different uncertainty estimation and anomaly detection methods to the proposed dataset. From the results of the proposed experiment, it can be inferred that by using both uncertainty estimation and anomaly detection, most of the abnormal cases in the proposed dataset can be discriminated. The dataset of this paper can be downloaded from https://rllab-snu.github.io/projects/R3-Driving-Dataset/doc.html.
169 - K. Shu , H. Yu , X. Chen 2020
Left-turn planning is one of the formidable challenges for autonomous vehicles, especially at unsignalized intersections due to the unknown intentions of oncoming vehicles. This paper addresses the challenge by proposing a critical turning point (CTP) based hierarchical planning approach. This includes a high-level candidate path generator and a low-level partially observable Markov decision process (POMDP) based planner. The proposed (CTP) concept, inspired by human-driving behaviors at intersections, aims to increase the computational efficiency of the low-level planner and to enable human-friendly autonomous driving. The POMDP based low-level planner takes unknown intentions of oncoming vehicles into considerations to perform less conservative yet safe actions. With proper integration, the proposed hierarchical approach is capable of achieving safe planning results with high commute efficiency at unsignalized intersections in real time.
We propose a game theoretic approach to address the problem of searching for available parking spots in a parking lot and picking the ``optimal one to park. The approach exploits limited information provided by the parking lot, i.e., its layout and the current number of cars in it. Considering the fact that such information is or can be easily made available for many structured parking lots, the proposed approach can be applicable without requiring major updates to existing parking facilities. For large parking lots, a sampling-based strategy is integrated with the proposed approach to overcome the associated computational challenge. The proposed approach is compared against a state-of-the-art heuristic-based parking spot search strategy in the literature through simulation studies and demonstrates its advantage in terms of achieving lower cost function values.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا