Optimal Management of the Peak Power Penalty for Smart Grids Using MPC-based Reinforcement Learning


Abstract in English

The cost of power distribution infrastructure is driven by the peak power encountered in the system. Distribution network operators therefore consider billing consumers behind a common transformer as a function of their peak demand, leaving it to the consumers to manage their collective costs. This management problem is, however, not trivial. In this paper, we consider a multi-agent residential smart grid system in which each agent has local renewable energy production and energy storage, and all agents are connected to a local transformer. The objective is to develop an optimal policy that minimizes the economic cost, which consists of the spot-market cost for each consumer and their collective peak-power cost. We propose a parametric Model Predictive Control (MPC) scheme to approximate the optimal policy. The optimality of this policy is limited by its finite horizon and by inaccurate forecasts of local power production and consumption. A Deterministic Policy Gradient (DPG) method is deployed to adjust the MPC parameters and improve the policy. Our simulations show that the proposed MPC-based Reinforcement Learning (RL) method can effectively decrease the long-term economic cost for this smart grid problem.
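To make the two cost components concrete, the following is a minimal sketch of how such an economic cost could be evaluated over a billing period. It is an illustration under stated assumptions, not the formulation used in the paper: the function name `economic_cost`, its arguments, the fixed time step, and the choice of a single peak price applied to the maximum aggregate power at the transformer are all hypothetical.

```python
from typing import Sequence


def economic_cost(
    grid_power: Sequence[Sequence[float]],
    spot_price: Sequence[float],
    peak_price: float,
    dt_hours: float = 1.0,
) -> float:
    """Illustrative total cost for agents behind one transformer.

    Assumptions (not taken from the paper):
      grid_power[i][k] is the power in kW that agent i draws from the
                       grid during time step k,
      spot_price[k]    is the spot-market price per kWh in step k,
      peak_price       is the penalty per kW of collective peak power.
    """
    n_steps = len(spot_price)

    # Spot-market cost: each agent pays for the energy it buys in each step.
    spot_cost = sum(
        spot_price[k] * agent[k] * dt_hours
        for agent in grid_power
        for k in range(n_steps)
    )

    # Collective peak: largest aggregate power through the transformer.
    aggregate = [sum(agent[k] for agent in grid_power) for k in range(n_steps)]
    peak_cost = peak_price * max(aggregate)

    return spot_cost + peak_cost


# Two agents, two time steps (hypothetical numbers).
example_cost = economic_cost(
    grid_power=[[1.0, 3.0], [2.0, 1.0]],
    spot_price=[0.5, 1.0],
    peak_price=10.0,
)
```

The sketch shows why the problem couples the agents: the spot-market term separates per agent, while the peak term depends on the sum of all agents' power in the same time step, so one agent shifting its load can lower the bill for everyone.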
