ﻻ يوجد ملخص باللغة العربية
The balance between exploration and exploitation is a key problem for reinforcement learning methods, especially for Q-learning. In this paper, a fidelity-based probabilistic Q-learning (FPQL) approach is presented to naturally solve this problem and applied for learning control of quantum systems. In this approach, fidelity is adopted to help direct the learning process and the probability of each action to be selected at a certain state is updated iteratively along with the learning process, which leads to a natural exploration strategy instead of a pointed one with configured parameters. A probabilistic Q-learning (PQL) algorithm is first presented to demonstrate the basic idea of probabilistic action selection. Then the FPQL algorithm is presented for learning control of quantum systems. Two examples (a spin- 1/2 system and a lamda-type atomic system) are demonstrated to test the performance of the FPQL algorithm. The results show that FPQL algorithms attain a better balance between exploration and exploitation, and can also avoid local optimal policies and accelerate the learning process.
This paper provides a brief introduction to learning control of quantum systems. In particular, the following aspects are outlined, including gradient-based learning for optimal control of quantum systems, evolutionary computation for learning contro
Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems aris
This paper proposes a black box based approach for analysing deep neural networks (DNNs). We view a DNN as a function $boldsymbol{f}$ from inputs to outputs, and consider the local robustness property for a given input. Based on scenario optimization
This paper focuses on finding reinforcement learning policies for control systems with hard state and action constraints. Despite its success in many domains, reinforcement learning is challenging to apply to problems with hard constraints, especiall
Under voltage load shedding has been considered as a standard and effective measure to recover the voltage stability of the electric power grid under emergency and severe conditions. However, this scheme usually trips a massive amount of load which c