We present an integrated approach to perception and control for an autonomous vehicle and demonstrate it in a high-fidelity urban driving simulator. Our approach first builds a model of the environment, then trains a policy that exploits the learned model to select the action to take at each time step. To build the environment model, we combine several deep learning components. First, we train a variational autoencoder to encode the input image into an abstract latent representation. We then use a recurrent neural network to predict the latent representation of the next frame and capture temporal information. Finally, we use an evolutionary reinforcement learning algorithm to train a controller that maps these latent representations to actions. We evaluate our approach in CARLA, a high-fidelity urban driving simulator, and conduct an extensive generalization study. Our results show that our approach outperforms several previously reported approaches in the percentage of successfully completed episodes on a lane-keeping task.
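The three-stage pipeline (encode the frame, predict the next latent, control from the latent state) can be sketched with toy linear stand-ins. All dimensions, weight matrices, and function names below are illustrative assumptions, not the paper's trained networks:

```python
import numpy as np

# Hypothetical dimensions: 64x64 grayscale frames, 32-dim latent,
# 64-dim hidden state, 3 actions (e.g. steer, throttle, brake).
rng = np.random.default_rng(0)
IMG, Z, H, A = 64 * 64, 32, 64, 3

W_enc = rng.standard_normal((Z, IMG)) * 0.01    # VAE encoder (mean head only)
W_rnn = rng.standard_normal((H, Z + H)) * 0.1   # recurrent transition
W_pred = rng.standard_normal((Z, H)) * 0.1      # next-latent prediction head
W_ctrl = rng.standard_normal((A, Z + H)) * 0.1  # linear controller

def encode(frame):
    """Map a flattened frame to its latent mean (VAE sampling omitted)."""
    return W_enc @ frame

def rnn_step(z, h):
    """Update the hidden state and predict the next frame's latent."""
    h_next = np.tanh(W_rnn @ np.concatenate([z, h]))
    return h_next, W_pred @ h_next

def act(z, h):
    """Controller reads latent + hidden state and emits a bounded action."""
    return np.tanh(W_ctrl @ np.concatenate([z, h]))

h = np.zeros(H)
frame = rng.random(IMG)
z = encode(frame)
h, z_pred = rnn_step(z, h)
action = act(z, h)
```

In the actual system the controller weights would be the parameters tuned by the evolutionary algorithm, since the latent pipeline makes the policy itself very small.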
This paper presents a safe reinforcement learning system for automated driving that benefits from multimodal future trajectory predictions. We propose a safety system consisting of two components: a heuristic safety module and a learning-based safety module. The heuristic module encodes common driving rules, whereas the learning-based module is a data-driven rule that learns safety patterns from driving data. Specifically, it uses a mixture density recurrent neural network (MD-RNN) to produce multimodal future trajectory predictions that accelerate learning. Our simulation results demonstrate that the proposed safety system outperforms previously reported results in terms of average reward and number of collisions.
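The multimodal aspect comes from the mixture-density output head: instead of one predicted trajectory point, the network emits several weighted Gaussian modes. A minimal sketch of such a head follows; the hidden size, number of mixtures, and 2-D position output are assumptions for illustration:

```python
import numpy as np

# Toy mixture-density head: K Gaussian modes over 2-D future positions.
rng = np.random.default_rng(1)
H, K, D = 16, 5, 2  # hidden size, mixture count, position dimension

W_pi = rng.standard_normal((K, H)) * 0.1      # mixture-weight logits
W_mu = rng.standard_normal((K * D, H)) * 0.1  # per-mode means
W_ls = rng.standard_normal((K * D, H)) * 0.1  # per-mode log-std-devs

def mdn_head(h):
    """Map an RNN hidden state to (weights, means, spreads) of K modes."""
    logits = W_pi @ h
    pi = np.exp(logits - logits.max())
    pi /= pi.sum()                           # softmax mixture weights
    mu = (W_mu @ h).reshape(K, D)            # predicted position per mode
    sigma = np.exp(W_ls @ h).reshape(K, D)   # positive spread per mode
    return pi, mu, sigma

h = rng.standard_normal(H)
pi, mu, sigma = mdn_head(h)
```

A safety module can then check each high-weight mode (e.g. any mode whose predicted position intersects the ego lane), rather than only the single most likely trajectory.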
In this paper, a concurrent learning framework is developed for source search in an unknown environment using autonomous platforms equipped with onboard sensors. Unlike existing solutions that require significant computational power for Bayesian estimation and path planning, the proposed solution is computationally affordable for onboard processors. A new concept of concurrent learning using multiple parallel estimators is proposed to learn the operational environment and quantify estimation uncertainty. The search agent is given the dual capability of exploiting the current parameter estimates to track the source and probing the environment to reduce the impact of uncertainty, termed Concurrent Learning for Exploration and Exploitation (CLEE). In this setting, the control action minimises not only the tracking error between the agent's future position and the estimated source location, but also the uncertainty of the predicted estimate. More importantly, rigorously proven properties, such as the convergence of the CLEE algorithm, are established under mild assumptions on the sensor noise, and the impact of noise on search performance is examined. Simulation results validate the effectiveness of the proposed CLEE algorithm. Compared with an information-theoretic approach, CLEE not only guarantees convergence but also achieves better search performance with much less computational time.
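The dual tracking/probing objective can be sketched as a cost that sums a tracking term and an uncertainty term. The weight, the grid of candidate moves, and the choice of probe point (the estimate that most disagrees with the consensus of the parallel estimators) are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def clee_cost(next_pos, estimates, w=0.3):
    """Tracking error to the combined estimate plus a probing term that
    draws the agent toward the point of highest estimator disagreement."""
    source_hat = estimates.mean(axis=0)  # exploit: consensus source estimate
    probe = estimates[np.argmax(np.linalg.norm(estimates - source_hat, axis=1))]
    tracking = np.sum((next_pos - source_hat) ** 2)
    probing = np.sum((next_pos - probe) ** 2)
    return tracking + w * probing

def choose_action(pos, estimates, step=1.0):
    """Evaluate a small set of candidate moves and pick the cheapest."""
    moves = step * np.array([[1, 0], [-1, 0], [0, 1], [0, -1], [0, 0]])
    costs = [clee_cost(pos + m, estimates) for m in moves]
    return moves[int(np.argmin(costs))]

# Three parallel estimators agree the source is to the right (x = 5),
# so the cheapest move from the origin is a step in +x.
estimates = np.array([[5.0, 0.0], [5.0, 2.0], [5.0, -2.0]])
move = choose_action(np.array([0.0, 0.0]), estimates)
```

The point of this shape of controller is that each step costs only a handful of arithmetic operations, which is what makes it affordable for onboard processors compared with full Bayesian planning.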
In this work, we study estimation problems in nonlinear mechanical systems subject to non-stationary and unknown excitation, which are common and critical in the design and health management of mechanical systems. A primary-auxiliary model scheduling procedure based on time-domain transmissibilities is proposed under switching linear dynamics: in addition to constructing a primary transmissibility family from the pseudo-inputs to the output during the offline stage, an auxiliary transmissibility family is constructed by further decomposing the pseudo-input vector into two parts. The auxiliary family makes it possible to determine the unknown working condition at which the system is currently running, so that an appropriate transmissibility from the primary family can be selected for estimating the unknown output during the online stage. As a result, the proposed approach offers a generalizable and explainable solution to signal estimation problems in nonlinear mechanical systems in the context of switching linear dynamics with unknown inputs. A real-world application to the estimation of the vertical wheel force in a full vehicle system is conducted to demonstrate the effectiveness of the proposed method. During the vehicle design phase, the vertical wheel force is the most important of the Wheel Center Loads (WCLs), and it is often measured directly with expensive, intrusive, and hard-to-install devices during full vehicle testing campaigns. Meanwhile, the estimation problem for the vertical wheel force has not been well solved and remains of great interest. The experimental results show good estimation accuracy for the vertical wheel force.
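The primary/auxiliary scheduling idea can be made concrete with a toy linear example: offline, fit one primary map (pseudo-inputs to output) and one auxiliary map (first half of the pseudo-input vector to the second half) per working condition; online, pick the condition whose auxiliary relation best explains the incoming sample. Everything below (dimensions, synthetic data, least-squares fits) is an illustrative sketch, not the paper's transmissibility construction:

```python
import numpy as np

rng = np.random.default_rng(2)
A_true = [rng.standard_normal((2, 2)) for _ in range(2)]  # per-condition u1->u2 coupling
t_true = [rng.standard_normal(4) for _ in range(2)]       # per-condition u->y map

def make_data(k, n=50):
    """Synthetic pseudo-inputs u = [u1, u2] with u2 tied to u1 by condition k."""
    U1 = rng.standard_normal((n, 2))
    U = np.hstack([U1, U1 @ A_true[k].T])
    return U, U @ t_true[k]

# Offline stage: one primary and one auxiliary model per working condition.
primary, auxiliary = [], []
for k in range(2):
    U, Y = make_data(k)
    primary.append(np.linalg.lstsq(U, Y, rcond=None)[0])
    auxiliary.append(np.linalg.lstsq(U[:, :2], U[:, 2:], rcond=None)[0])

def schedule_and_estimate(u):
    """Online stage: identify the working condition via the auxiliary
    residual, then estimate the output with the matching primary map."""
    resid = [np.linalg.norm(u[2:] - u[:2] @ Ak) for Ak in auxiliary]
    k = int(np.argmin(resid))
    return k, float(u @ primary[k])

# A fresh sample from condition 1 should be scheduled to condition 1.
u1 = rng.standard_normal(2)
u = np.concatenate([u1, A_true[1] @ u1])
k_hat, y_hat = schedule_and_estimate(u)
y_true = float(u @ t_true[1])
```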
This paper develops a model-free volt-VAR optimization (VVO) algorithm via multi-agent deep reinforcement learning (MADRL) in unbalanced distribution systems. The method is novel in that we cast the VVO problem in unbalanced distribution networks into a deep Q-network (DQN) framework, which avoids directly solving a specific optimization model under the time-varying operating conditions of the systems. We take the statuses/ratios of switchable capacitors, voltage regulators, and smart inverters installed at distributed generators as the action variables of the DQN agents. A carefully designed reward function guides these agents to interact with the distribution system so as to reinforce voltage regulation and power loss reduction simultaneously. The forward-backward sweep method for radial three-phase distribution systems provides accurate power flow results to the DQN environment within a few iterations. Finally, the proposed multi-objective MADRL method realizes the dual goals of VVO. We test the algorithm on the unbalanced IEEE 13-bus and 123-bus systems. Numerical simulations validate its excellent performance in voltage regulation and power loss reduction.
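For intuition, here is a single-phase toy version of the forward-backward sweep on a two-bus radial feeder (the paper's environment uses the three-phase unbalanced formulation; the per-unit impedances and loads below are made-up values):

```python
import numpy as np

def fb_sweep(z_line, s_load, v0=1.0 + 0j, tol=1e-8, max_iter=50):
    """Forward-backward sweep on a radial chain feeder (per-unit).

    z_line[k]: impedance of the line feeding bus k+1
    s_load[k]: complex power drawn at bus k+1
    """
    n = len(s_load)
    v = np.full(n, v0, dtype=complex)
    for _ in range(max_iter):
        # Backward sweep: accumulate branch currents from the feeder end.
        i_inj = np.conj(s_load / v)
        i_branch = np.cumsum(i_inj[::-1])[::-1]
        # Forward sweep: update voltages outward from the substation.
        v_new = np.empty_like(v)
        upstream = v0
        for k in range(n):
            v_new[k] = upstream - z_line[k] * i_branch[k]
            upstream = v_new[k]
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v

z = np.array([0.01 + 0.02j, 0.01 + 0.02j])  # assumed line impedances (p.u.)
s = np.array([0.5 + 0.2j, 0.3 + 0.1j])      # assumed bus loads (p.u.)
volts = fb_sweep(z, s)
```

Because each sweep is just a pass up and down the feeder, the environment can return power flow results to the agents in a few iterations without assembling or factorizing an admittance matrix, which is what makes it cheap enough to sit inside a DQN training loop.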
Load shedding has been one of the most widely used and effective emergency control approaches against voltage instability. With increased uncertainties and rapidly changing operating conditions in power systems, existing methods have outstanding issues in terms of speed, adaptiveness, or scalability. Deep reinforcement learning (DRL) has been regarded as a promising approach for fast and adaptive grid stability control in recent years. However, existing DRL algorithms exhibit two outstanding issues when applied to power system control problems: 1) computational inefficiency, requiring extensive training and tuning time; and 2) poor scalability, making it difficult to scale to high-dimensional control problems. To overcome these issues, an accelerated DRL algorithm named PARS was developed and tailored for power system voltage stability control via load shedding. PARS is highly scalable and easy to tune, with only five main hyperparameters. The method was tested on both the IEEE 39-bus and IEEE 300-bus systems, the latter being by far the largest system used in such a study. Test results show that, compared to other methods including model predictive control (MPC) and proximal policy optimization (PPO), PARS achieves better computational efficiency (faster convergence), greater robustness in learning, and excellent scalability and generalization capability.
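The abstract does not detail PARS itself, but derivative-free random-search policy updates of the family it appears to accelerate share the traits described here (few hyperparameters, perturbation rollouts that parallelize trivially). A minimal single-worker sketch on a toy objective follows; every name and hyperparameter value is an assumption, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)

def rs_step(theta, rollout, n_dirs=8, top_k=4, noise=0.05, lr=0.02):
    """One update of a linear policy theta from +/- perturbation rollouts.

    The rollouts over n_dirs directions are independent, so in a parallel
    implementation each would run on its own worker.
    """
    deltas = rng.standard_normal((n_dirs,) + theta.shape)
    r_plus = np.array([rollout(theta + noise * d) for d in deltas])
    r_minus = np.array([rollout(theta - noise * d) for d in deltas])
    # Keep only the best-performing directions; scale by reward spread.
    best = np.argsort(np.maximum(r_plus, r_minus))[-top_k:]
    sigma = np.concatenate([r_plus[best], r_minus[best]]).std() + 1e-8
    grad = sum((r_plus[i] - r_minus[i]) * deltas[i] for i in best)
    return theta + lr / (top_k * sigma) * grad

# Toy "episode reward": higher as theta approaches the target [1, -1].
target = np.array([1.0, -1.0])
reward = lambda th: -np.sum((th - target) ** 2)

theta = np.zeros(2)
for _ in range(200):
    theta = rs_step(theta, reward)
```

Note the hyperparameter count (number of directions, top-k, noise scale, learning rate, plus an episode length in a real rollout) matches the "handful of knobs" property the abstract emphasizes.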