
On Training Effective Reinforcement Learning Agents for Real-time Power Grid Operation and Control

Submitted by Ruisheng Diao
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





Deriving fast and effectively coordinated control actions remains a grand challenge affecting the secure and economic operation of today's large-scale power grid. This paper presents a novel artificial intelligence (AI) based methodology to achieve multi-objective real-time power grid control for real-world implementation. A state-of-the-art off-policy reinforcement learning (RL) algorithm, soft actor-critic (SAC), is adopted to train AI agents with multi-thread offline training and periodic online training for regulating voltages and transmission losses without violating thermal constraints of lines. A software prototype was developed and deployed in the control center of SGCC Jiangsu Electric Power Company, where it interacts with their Energy Management System (EMS) every 5 minutes. Massive numerical studies using actual power grid snapshots in the real-time environment verify the effectiveness of the proposed approach. Well-trained SAC agents can learn to provide effective, sub-second control actions for regulating voltage profiles and reducing transmission losses.
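For readers who want a concrete picture of this kind of training loop, the following is a minimal sketch of an off-policy SAC agent trained against a power-flow environment with a multi-objective reward on voltage violations and losses. The GridEnv class, its observation and action conventions, and the reward weights are illustrative assumptions rather than the interface of the deployed Jiangsu prototype; stable-baselines3 is used only as a convenient SAC implementation.

# Sketch only: hypothetical environment standing in for the EMS/power-flow interface.
import gymnasium as gym
import numpy as np
from stable_baselines3 import SAC

class GridEnv(gym.Env):
    """Toy grid environment: actions are generator voltage set-points."""
    def __init__(self, n_buses=10, n_gens=3, horizon=96):
        super().__init__()
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, (n_buses,), dtype=np.float32)
        self.action_space = gym.spaces.Box(0.95, 1.05, (n_gens,), dtype=np.float32)
        self.n_buses, self.horizon = n_buses, horizon

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        self.v = self.np_random.uniform(0.93, 1.07, self.n_buses)
        return self.v.astype(np.float32), {}

    def step(self, action):
        # Stand-in for an AC power-flow solve driven by the new set-points.
        self.v = 0.5 * self.v + 0.5 * float(np.mean(action))
        volt_violation = np.sum(np.clip(np.abs(self.v - 1.0) - 0.05, 0.0, None))
        loss_proxy = np.sum((self.v - 1.0) ** 2)                # stand-in for MW losses
        reward = -float(volt_violation + 0.1 * loss_proxy)      # multi-objective reward
        self.t += 1
        return self.v.astype(np.float32), reward, False, self.t >= self.horizon, {}

agent = SAC("MlpPolicy", GridEnv(), verbose=0)
agent.learn(total_timesteps=20_000)    # offline training; periodic online training would call learn() again
action, _ = agent.predict(GridEnv().reset()[0], deterministic=True)   # sub-second inference at deployment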




Read also

As power systems are undergoing a significant transformation with more uncertainties, less inertia, and operation closer to their limits, there is an increasing risk of large outages. Thus, there is an imperative need to enhance grid emergency control to maintain system reliability and security. Towards this end, great progress has been made in developing deep reinforcement learning (DRL) based grid control solutions in recent years. However, existing DRL-based solutions have two main limitations: 1) they cannot cope well with a wide range of grid operation conditions, system parameters, and contingencies; 2) they generally lack the ability to adapt quickly to new grid operation conditions, system parameters, and contingencies, limiting their applicability for real-world applications. In this paper, we mitigate these limitations by developing a novel deep meta reinforcement learning (DMRL) algorithm. The DMRL combines meta strategy optimization with DRL, and trains policies modulated by a latent space that can quickly adapt to new scenarios. We test the developed DMRL algorithm on the IEEE 300-bus system. We demonstrate fast adaptation of the meta-trained DRL policies with latent variables to new operating conditions and scenarios using the proposed method, and achieve superior performance compared to state-of-the-art DRL and model predictive control (MPC) methods.
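The adaptation mechanism described above, a fixed meta-trained policy modulated by a latent vector that is searched anew for each scenario, can be illustrated with a deliberately tiny example. Everything below (the toy dynamics, the linear latent-conditioned policy, and plain random search over the latent) is an assumption made for illustration and is far simpler than the paper's DMRL algorithm.

# Rough, self-contained sketch: only the latent z is adapted; policy weights stay fixed.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4)) * 0.1          # "meta-trained" policy weights (held fixed)

def policy(state, z):
    return W @ np.concatenate([state, z])  # latent-conditioned linear policy

def episode_return(z, drift, horizon=50):
    """Toy scenario: keep a 2-d state near zero under an unknown drift."""
    s, total = np.ones(2), 0.0
    for _ in range(horizon):
        s = 0.9 * s + drift + policy(s, z)
        total -= float(s @ s)              # reward = negative squared deviation
    return total

def adapt_latent(drift, n_trials=200):
    """Fast adaptation: random search over the 2-d latent space only."""
    zs = rng.normal(size=(n_trials, 2))
    returns = [episode_return(z, drift) for z in zs]
    return zs[int(np.argmax(returns))]

z_star = adapt_latent(drift=np.array([0.05, -0.02]))   # new operating condition
print("adapted latent:", z_star)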
Given the rise of electric vehicle (EV) adoption, supported by government policies and dropping technology prices, new challenges arise in the modeling and operation of electric transportation. In this paper, we present a model for solving the EV routing problem while accounting for real-life stochastic demand behavior. We present a mathematical formulation that minimizes the travel time and energy costs of an EV fleet. The EV is represented by a battery energy consumption model. To adapt our formulation to real-life scenarios, customer pick-ups and drop-offs were modeled as stochastic parameters. A chance-constrained optimization model is proposed for addressing pick-up and drop-off uncertainties. Computational validation of the model is provided based on representative transportation scenarios. The results obtained showed quick convergence of our model with verifiable solutions. Finally, the impact of electric vehicle charging is validated in Downtown Manhattan, New York, by assessing its effect on the distribution grid.
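To make the chance-constrained ingredient concrete: under a common independence-and-normality assumption on stochastic demand, a constraint of the form P(route demand <= battery capacity) >= 1 - eps has a simple deterministic equivalent. The sketch below is a generic illustration with made-up numbers, not the paper's routing formulation.

# Generic chance-constraint check: mu + Phi^{-1}(1 - eps) * sigma <= capacity.
import numpy as np
from scipy.stats import norm

mu = np.array([4.0, 6.5, 3.2])        # mean energy demand per stop (kWh), made up
sigma = np.array([0.8, 1.2, 0.5])     # per-stop standard deviation, assumed independent
capacity, eps = 20.0, 0.05            # battery budget and allowed violation probability

route_mu = mu.sum()
route_sigma = np.sqrt((sigma ** 2).sum())
buffer = norm.ppf(1 - eps) * route_sigma
feasible = route_mu + buffer <= capacity
print(f"deterministic check: {route_mu:.1f} + {buffer:.2f} <= {capacity} -> {feasible}")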
Planning future operational scenarios of bulk power systems that meet security and economic constraints typically requires intensive labor in performing massive simulations. To automate this process and relieve engineers' burden, a novel multi-stage control approach is presented in this paper to train centralized and decentralized reinforcement learning agents that can automatically adjust grid controllers for regulating transmission line flows under normal conditions and contingencies. The power grid flow control problem is formulated as a Markov Decision Process (MDP). At stage one, a centralized soft actor-critic (SAC) agent is trained to control generator active power outputs over a wide area to keep transmission line flows within specified security limits. If line overloading issues remain unresolved, stage two trains a decentralized SAC agent via load throw-over at local substations. The effectiveness of the proposed approach is verified on a series of actual planning cases used for operating the power grid of SGCC Zhejiang Electric Power Company.
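A deliberately small numerical toy of the two-stage escalation logic follows: a wide-area stage-one adjustment is applied first, and local load throw-over is invoked only for lines that remain overloaded. The linear flow model, the made-up sensitivities, and the proportional adjustment rules are illustrative placeholders for the trained centralized and decentralized SAC agents.

# Toy two-stage escalation on a 2-line, 2-bus example (all numbers invented).
import numpy as np

ptdf = np.array([[0.6, 0.3],            # made-up line-flow sensitivities
                 [0.4, 0.7]])           # (2 lines x 2 bus injections)
injections = np.array([200.0, 150.0])   # MW generation at two buses
loads = np.array([40.0, 30.0])          # MW load at two substations
limits = np.array([100.0, 95.0])        # line thermal limits (MW)

def flows(gen, load):
    return ptdf @ (gen - load)

# Stage one: the "centralized agent" trims wide-area generation in proportion
# to each line's overload (gain chosen arbitrarily for the toy).
over = np.clip(flows(injections, loads) - limits, 0.0, None)
injections_1 = injections - ptdf.T @ over

# Stage two: only if overloads persist, "decentralized agents" throw over
# load at local substations (again with an arbitrary gain).
still_over = np.clip(flows(injections_1, loads) - limits, 0.0, None)
loads_2 = loads + 2.0 * ptdf.T @ still_over

print("initial flows: ", flows(injections, loads))      # both lines overloaded
print("after stage 1: ", flows(injections_1, loads))    # one line still overloaded
print("after stage 2: ", flows(injections_1, loads_2))  # within limits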
Microgrids are increasingly recognized as a key technology for the integration of distributed energy resources into the power network, allowing local clusters of load and distributed energy resources to operate autonomously. However, microgrid operation brings new challenges, especially in islanded operation, where frequency and voltage control are no longer provided by large rotating machines. Instead, the power converters in the microgrid must coordinate to regulate the frequency and voltage and ensure stability. We consider the problem of designing controllers to achieve these objectives. Using passivity theory to derive decentralized stability conditions for the microgrid, we propose a control design method for grid-forming inverters. For the analysis we use higher-order models for the inverters as well as advanced dynamic models for the lines with an arbitrarily large number of states. By satisfying the decentralized condition formulated, plug-and-play operation can be achieved with guaranteed stability, and performance can also be improved by incorporating this condition as a constraint in corresponding optimization problems. In addition, our control design can improve the power-sharing properties of the microgrid compared to previous non-droop approaches. Finally, realistic simulations confirm that the controller design improves the stability and performance of the power network.
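For orientation, the passivity-based design idea can be summarized by a generic decentralized condition: if every bus subsystem is passive from its current deviation to its voltage deviation, the interconnection through a passive network remains stable, which is what enables plug-and-play operation. The statement below is a textbook form of this condition, not the refined higher-order condition derived in the paper.
\[
  \int_0^{T} \tilde{v}_i(t)\,\tilde{\imath}_i(t)\,\mathrm{d}t \;\ge\; 0
  \qquad \forall\, T \ge 0,\ \forall\, i,
\]
which, for a stable linear single-input single-output subsystem with transfer function $G_i(s)$ from $\tilde{\imath}_i$ to $\tilde{v}_i$, reduces to the positive-real condition
\[
  \operatorname{Re}\, G_i(j\omega) \;\ge\; 0 \qquad \forall\, \omega \in \mathbb{R}.
\]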
We study the impact of predictions in online Linear Quadratic Regulator control with both stochastic and adversarial disturbances in the dynamics. In both settings, we characterize the optimal policy and derive tight bounds on the minimum cost and dynamic regret. Perhaps surprisingly, our analysis shows that the conventional greedy MPC approach is a near-optimal policy in both stochastic and adversarial settings. Specifically, for length-$T$ problems, MPC requires only $O(\log T)$ predictions to reach $O(1)$ dynamic regret, which matches (up to lower-order terms) our lower bound on the required prediction horizon for constant regret.
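A minimal sketch of the greedy MPC policy analyzed above: at every step the controller solves a finite-horizon LQR problem using only the next k predicted disturbances and applies the first input. The system matrices, horizon length, and disturbance model below are made-up illustrative choices, not the paper's experimental setup.

# Greedy MPC with a k-step prediction window on a toy double-integrator system.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), 0.1 * np.eye(1)
T, k = 200, 8                                   # problem length, prediction horizon
rng = np.random.default_rng(0)
w = 0.05 * rng.standard_normal((T, 2))          # disturbances (stochastic setting)

def mpc_step(x, w_pred):
    """Finite-horizon LQR using the predicted disturbances; return the first input."""
    H = len(w_pred)
    P, q = Q.copy(), np.zeros(2)                # value function: x'Px + 2q'x, terminal cost Q
    for t in reversed(range(H)):
        S = R + B.T @ P @ B
        K = np.linalg.solve(S, B.T @ P @ A)
        d = np.linalg.solve(S, B.T @ (P @ w_pred[t] + q))
        if t == 0:
            return -K @ x - d                   # greedy MPC: apply only the first input
        q = (A - B @ K).T @ (q + P @ w_pred[t])
        P = Q + A.T @ P @ (A - B @ K)

x, cost = np.zeros(2), 0.0
for t in range(T):
    u = mpc_step(x, w[t:min(t + k, T)])
    cost += float(x @ Q @ x + u @ R @ u)
    x = A @ x + B @ u + w[t]
print("total cost with", k, "predictions:", round(cost, 3))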
