
A Deep Reinforcement Learning Based Approach for Optimal Active Power Dispatch

Published by Xiaohu Zhang
Publication date: 2019
Research field: Electronic engineering
Language: English





The stochastic and dynamic nature of renewable energy sources and power electronic devices is creating unique challenges for modern power systems. One such challenge is that the conventional optimal active power dispatch (OAPD) method, which relies on mathematical system models, is limited in its ability to handle uncertainties caused by renewables and other system contingencies. In this paper, a deep reinforcement learning (DRL)-based method is presented to provide a near-optimal solution to the OAPD problem without system modeling. The DRL agent is trained offline, after which it can obtain OAPD points under unseen scenarios, e.g., different load patterns. The DRL-based OAPD method is tested on the IEEE 14-bus system, validating its feasibility for solving the OAPD problem. Its utility is further confirmed in that it can be leveraged as a key component for solving future model-free AC-OPF problems.
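To make the idea of dispatch as a reinforcement learning problem concrete, the following is a minimal sketch of a toy environment in which an agent observes a load level and chooses generator setpoints. It is not the paper's formulation: the abstract does not give the state, action, or reward design, so the three-generator setup, the quadratic cost coefficients, the imbalance penalty, and the names `dispatch_reward` and `ToyDispatchEnv` are all assumptions made for illustration, and network losses and line limits are ignored.

```python
import numpy as np

# Hypothetical quadratic cost coefficients for three generators (not from the paper):
# cost_i(p) = A[i] * p**2 + B[i] * p
A = np.array([0.010, 0.020, 0.015])
B = np.array([2.0, 1.5, 1.8])
P_MIN = np.array([0.0, 0.0, 0.0])       # lower output limits in MW (assumed)
P_MAX = np.array([100.0, 80.0, 60.0])   # upper output limits in MW (assumed)
IMBALANCE_PENALTY = 50.0                # penalty per MW of mismatch (assumed)


def dispatch_reward(setpoints, total_load):
    """Return (reward, applied_setpoints) for one dispatch decision.

    The reward is the negative of generation cost plus a penalty on the
    mismatch between total generation and load (losses are ignored).
    """
    p = np.clip(np.asarray(setpoints, dtype=float), P_MIN, P_MAX)
    cost = float(np.sum(A * p**2 + B * p))
    imbalance = abs(float(np.sum(p)) - total_load)
    return -(cost + IMBALANCE_PENALTY * imbalance), p


class ToyDispatchEnv:
    """One-step episodes: observe a sampled load, act with generator setpoints."""

    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)
        self.load = None

    def reset(self):
        # Sample a load level so training covers varied load scenarios.
        self.load = float(self.rng.uniform(60.0, 200.0))
        return np.array([self.load])

    def step(self, action):
        reward, applied = dispatch_reward(action, self.load)
        # Each episode ends after a single dispatch decision.
        return np.array([self.load]), reward, True, {"applied": applied}
```

An agent trained offline against such an environment only ever sees loads and rewards, never the cost or balance equations themselves, which is the sense in which the approach is model-free.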




Read also

Yang Wang, Zhen Gao, Jun Zhang 2021
In this paper, we investigate an unmanned aerial vehicle (UAV)-assisted Internet-of-Things (IoT) system in a sophisticated three-dimensional (3D) environment, where the UAV's trajectory is optimized to efficiently collect data from multiple IoT ground nodes. Unlike existing approaches, which focus only on a simplified two-dimensional scenario and assume perfect channel state information (CSI), this paper considers a practical 3D urban environment with imperfect CSI, where the UAV's trajectory is designed to minimize data collection completion time subject to practical throughput and flight movement constraints. Specifically, inspired by state-of-the-art deep reinforcement learning approaches, we leverage the twin-delayed deep deterministic policy gradient (TD3) to design the UAV's trajectory and present a TD3-based trajectory design for completion time minimization (TD3-TDCTM) algorithm. In particular, we introduce an additional piece of information, the merged pheromone, to represent the state of the UAV and the environment and to serve as a reference for the reward, which facilitates the algorithm design. By taking the service statuses of the IoT nodes, the UAV's position, and the merged pheromone as input, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can achieve a near-optimal navigation strategy. Our simulation results show the superiority of the proposed TD3-TDCTM algorithm over three conventional non-learning-based baseline methods.
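For readers unfamiliar with TD3, the sketch below shows the critic target computation of the generic TD3 algorithm: target policy smoothing with clipped noise, followed by the minimum over two target critics. It is not the TD3-TDCTM algorithm from the abstract above; the function name `td3_target`, the callable arguments, and the default hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np


def td3_target(reward, done, next_state, target_actor, q1_target, q2_target,
               rng, gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=-1.0, act_high=1.0):
    """Compute the generic TD3 critic target for a single transition.

    target_actor(next_state) returns an action array; q1_target and
    q2_target take (state, action) and return a scalar value estimate.
    All callables are hypothetical stand-ins for neural networks.
    """
    a = np.asarray(target_actor(next_state), dtype=float)
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = np.clip(rng.normal(0.0, noise_std, size=a.shape),
                    -noise_clip, noise_clip)
    a_next = np.clip(a + noise, act_low, act_high)
    # Clipped double-Q: take the smaller of the two target critic estimates.
    q_min = min(float(q1_target(next_state, a_next)),
                float(q2_target(next_state, a_next)))
    # No bootstrapping from terminal states (done is 1.0 when terminal).
    return reward + gamma * (1.0 - done) * q_min
```

Taking the minimum of two critics counteracts the value overestimation that a single critic tends to accumulate, and the clipped noise keeps the target from exploiting narrow peaks in the value estimate.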
Vehicular edge computing (VEC) is envisioned as a promising approach to process the explosively growing computation tasks of vehicular users (VUs). In the VEC system, each VU allocates power to process part of its tasks through offloading and the remaining tasks through local execution. During offloading, each VU adopts the multi-input multi-output and non-orthogonal multiple access (MIMO-NOMA) channel to improve channel spectrum efficiency and capacity. However, the channel condition is uncertain due to the interference among VUs caused by the MIMO-NOMA channel and the time-varying path loss caused by the mobility of each VU. In addition, the task arrival of each VU is stochastic in the real world. The stochastic task arrival and uncertain channel condition greatly affect the power consumption and task latency of each VU. It is critical to design an optimal power allocation scheme that accounts for the stochastic task arrival and channel variation in order to optimize the long-term reward, including power consumption and latency, in MIMO-NOMA VEC. Unlike the traditional centralized deep reinforcement learning (DRL)-based scheme, this paper constructs a decentralized DRL framework to formulate the power allocation optimization problem, where local observations are selected as the state. The deep deterministic policy gradient (DDPG) algorithm is adopted to learn the optimal power allocation scheme based on the decentralized DRL framework. Simulation results demonstrate that our proposed power allocation scheme outperforms the existing schemes.
Priority dispatching rules (PDRs) are widely used for solving real-world job-shop scheduling problems (JSSP). However, the design of effective PDRs is a tedious task, requiring a myriad of specialized knowledge and often delivering limited performance. In this paper, we propose to automatically learn PDRs via an end-to-end deep reinforcement learning agent. We exploit the disjunctive graph representation of JSSP, and propose a Graph Neural Network based scheme to embed the states encountered during solving. The resulting policy network is size-agnostic, effectively enabling generalization on large-scale instances. Experiments show that the agent can learn high-quality PDRs from scratch with elementary raw features, and demonstrates strong performance against the best existing PDRs. The learned policies also perform well on much larger instances that are unseen in training.
This letter introduces a novel framework to optimize the power allocation for users in a Rate Splitting Multiple Access (RSMA) network. In the network, messages intended for users are split into a single common part and respective private parts. This mechanism enables RSMA to flexibly manage interference and thus enhance energy and spectral efficiency. Despite these advantages, optimizing power allocation in RSMA is very challenging because the communication channel is uncertain and the transmitter has limited knowledge of the channel information. To solve the problem, we first develop a Markov Decision Process framework to model the dynamics of the communication channel. A deep reinforcement learning algorithm is then proposed to find the optimal power allocation policy for the transmitter without requiring any prior information about the channel. The simulation results show that the proposed scheme can outperform baseline schemes in terms of average sum-rate under different power and QoS requirements.
Zhuo Li, Xu Zhou, Taixin Li 2021
With the mass deployment of computing-intensive and delay-sensitive applications on end devices, only adequate computing resources can meet the delay requirements of differentiated services. By offloading tasks to cloud servers or edge servers, computation offloading can alleviate computing and storage limitations and reduce delay and energy consumption. However, few of the existing offloading schemes take into consideration cloud-edge collaboration and the constraints of energy consumption and task dependency. This paper builds a collaborative computation offloading model for cloud and edge computing and formulates a multi-objective optimization problem. By fusing optimal transport and policy-based RL, we propose an Optimal-Transport-Based RL approach to solve the offloading problem and make the optimal offloading decision for minimizing the overall cost of delay and energy consumption. Simulation results show that the proposed approach can effectively reduce the cost and significantly outperforms existing optimization solutions.