PowerGym: A Reinforcement Learning Environment for Volt-Var Control in Power Distribution Systems

80 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ting-Han Fan

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ting-Han Fan - Xian Yeow Lee - Yubo Wang

التعلم الآلي الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We introduce PowerGym, an open-source reinforcement learning environment for Volt-Var control in power distribution systems. Following OpenAI Gym APIs, PowerGym targets minimizing power loss and voltage violations under physical networked constraints. PowerGym provides four distribution systems (13Bus, 34Bus, 123Bus, and 8500Node) based on IEEE benchmark systems and design variants for various control difficulties. To foster generalization, PowerGym offers a detailed customization guide for users working with their distribution systems. As a demonstration, we examine state-of-the-art reinforcement learning algorithms in PowerGym and validate the environment by studying controller behaviors.

قيم البحث

148 - Ying Zhang , Xinan Wang , Jianhui Wang 2020

This paper develops a model-free volt-VAR optimization (VVO) algorithm via multi-agent deep reinforcement learning (MADRL) in unbalanced distribution systems. This method is novel since we cast the VVO problem in unbalanced distribution networks to a n intelligent deep Q-network (DQN) framework, which avoids solving a specific optimization model directly when facing time-varying operating conditions of the systems. We consider statuses/ratios of switchable capacitors, voltage regulators, and smart inverters installed at distributed generators as the action variables of the DQN agents. A delicately designed reward function guides these agents to interact with the distribution system, in the direction of reinforcing voltage regulation and power loss reduction simultaneously. The forward-backward sweep method for radial three-phase distribution systems provides accurate power flow results within a few iterations to the DQN environment. Finally, the proposed multi-objective MADRL method realizes the dual goals for VVO. We test this algorithm on the unbalanced IEEE 13-bus and 123-bus systems. Numerical simulations validate the excellent performance of this method in voltage regulation and power loss reduction.

أنظمة وتحكم أنظمة وتحكم معالجة الإشارات

Bi-level Off-policy Reinforcement Learning for Volt/VAR Control Involving Continuous and Discrete Devices

87 - Haotian Liu , Wenchuan Wu 2021

In Volt/Var control (VVC) of active distribution networks(ADNs), both slow timescale discrete devices (STDDs) and fast timescale continuous devices (FTCDs) are involved. The STDDs such as on-load tap changers (OLTC) and FTCDs such as distributed gene rators should be coordinated in time sequence. Such VCC is formulated as a two-timescale optimization problem to jointly optimize FTCDs and STDDs in ADNs. Traditional optimization methods are heavily based on accurate models of the system, but sometimes impractical because of their unaffordable effort on modelling. In this paper, a novel bi-level off-policy reinforcement learning (RL) algorithm is proposed to solve this problem in a model-free manner. A Bi-level Markov decision process (BMDP) is defined to describe the two-timescale VVC problem and separate agents are set up for the slow and fast timescale sub-problems. For the fast timescale sub-problem, we adopt an off-policy RL method soft actor-critic with high sample efficiency. For the slow one, we develop an off-policy multi-discrete soft actor-critic (MDSAC) algorithm to address the curse of dimensionality with various STDDs. To mitigate the non-stationary issue existing the two agents learning processes, we propose a multi-timescale off-policy correction (MTOPC) method by adopting importance sampling technique. Comprehensive numerical studies not only demonstrate that the proposed method can achieve stable and satisfactory optimization of both STDDs and FTCDs without any model information, but also support that the proposed method outperforms existing two-timescale VVC methods.

أنظمة وتحكم الذكاء الاصطناعي التعلم الآلي

Reinforcement Learning for Decision-Making and Control in Power Systems: Tutorial, Review, and Vision

121 - Xin Chen , Guannan Qu , Yujie Tang 2021

With large-scale integration of renewable generation and distributed energy resources (DERs), modern power systems are confronted with new operational challenges, such as growing complexity, increasing uncertainty, and aggravating volatility. Meanwhi le, more and more data are becoming available owing to the widespread deployment of smart meters, smart sensors, and upgraded communication networks. As a result, data-driven control techniques, especially reinforcement learning (RL), have attracted surging attention in recent years. In this paper, we provide a tutorial on various RL techniques and how they can be applied to decision-making in power systems. We illustrate RL-based models and solutions in three key applications, frequency regulation, voltage control, and energy management. We conclude with three critical issues in the application of RL, i.e., safety, scalability, and data. Several potential future directions are discussed as well.

التعلم الآلي الذكاء الاصطناعي أنظمة وتحكم

Natural Environment Benchmarks for Reinforcement Learning

176 - Amy Zhang , Yuxin Wu , Joelle Pineau 2018

While current benchmark reinforcement learning (RL) tasks have been useful to drive progress in the field, they are in many ways poor substitutes for learning with real-world data. By testing increasingly complex RL algorithms on low-complexity simul ation environments, we often end up with brittle RL policies that generalize poorly beyond the very specific domain. To combat this, we propose three new families of benchmark RL domains that contain some of the complexity of the natural world, while still supporting fast and extensive data acquisition. The proposed domains also permit a characterization of generalization through fair train/test separation, and easy comparison and replication of results. Through this work, we challenge the RL research community to develop more robust algorithms that meet high standards of evaluation.

التعلم الآلي الذكاء الاصطناعي التعلم الالي

A Reinforcement Learning Environment for Mathematical Reasoning via Program Synthesis

67 - Joseph Palermo , Johnny Ye , Alok Singh 2021

We convert the DeepMind Mathematics Dataset into a reinforcement learning environment by interpreting it as a program synthesis problem. Each action taken in the environment adds an operator or an input into a discrete compute graph. Graphs which com pute correct answers yield positive reward, enabling the optimization of a policy to construct compute graphs conditioned on problem statements. Baseline models are trained using Double DQN on various subsets of problem types, demonstrating the capability to learn to correctly construct graphs despite the challenges of combinatorial explosion and noisy rewards.

التعلم الآلي الذكاء الاصطناعي