No Arabic abstract
The successful launch of electric vehicles (EVs) depends critically on the availability of convenient and economic charging facilities. The problem of scheduling of large-scale charging of EVs by a service provider is considered. A Markov decision process model is introduced in which EVs arrive randomly at a charging facility with random demand and completion deadlines. The service provider faces random charging costs, convex non-completion penalties, and a peak power constraint that limits the maximum number of simultaneous activation of EV chargers. Formulated as a restless multi-armed bandit problem, the EV charging problem is shown to be indexable. A closed-form expression of the Whittles index is obtained for the case when the charging costs are constant. The Whittles index policy, however, is not optimal in general. An enhancement of the Whittles index policy based on spatial interchange according to the less laxity and longer processing time principle is presented. The proposed policy outperforms existing charging algorithms, especially when the charging costs are time varying.
The early sections of this paper present an analysis of a Markov decision model that is known as the multi-armed bandit under the assumption that the utility function of the decision maker is either linear or exponential. The analysis includes efficient procedures for computing the expected utility associated with the use of a priority policy and for identifying a priority policy that is optimal. The methodology in these sections is novel, building on the use of elementary row operations. In the later sections of this paper, the analysis is adapted to accommodate constraints that link the bandits.
We consider the scheduling of multiple tasks with pre-determined deadlines under random processing cost. This problem is motivated by the potential of large scale adoption of plug-in (hybrid) electric vehicles (PHEVs) in the near future. The charging requests of PHEVs usually have deadline constraints, and the electricity cost associated with PHEV charging is usually random due to the uncertainty in both system load and renewable generation. We seek to properly schedule the battery charging of multiple PHEVs so as to minimize the overall cost, which is derived from the total charging cost and the penalty for not completing charging before requested deadlines. Through a dynamic programming formulation, we establish the Less Laxity and Longer remaining Processing time (LLLP) principle that improves any charging policy on a sample-path basis, when the non-completion penalty is a convex function of the additional time needed to fulfill the uncompleted request. Specifically, the LLLP principle states that priority should be given to vehicles that have less laxity and longer remaining processing times. Numerical results demonstrate that heuristic policies that violate the LLLP principle, for example, the earliest deadline first (EDF) policy, can result in significant performance loss.
The number of electric vehicles (EVs) is expected to increase. As a consequence, more EVs will need charging, potentially causing not only congestion at charging stations, but also in the distribution grid. Our goal is to illustrate how this gives rise to resource allocation and performance problems that are of interest to the Sigmetrics community.
The multi-armed bandit (MAB) is a classical online optimization model for the trade-off between exploration and exploitation. The traditional MAB is concerned with finding the arm that minimizes the mean cost. However, minimizing the mean does not take the risk of the problem into account. We now want to accommodate risk-averse decision makers. In this work, we introduce a coherent risk measure as the criterion to form a risk-averse MAB. In particular, we derive an index-based online sampling framework for the risk-averse MAB. We develop this framework in detail for three specific risk measures, i.e. the conditional value-at-risk, the mean-deviation and the shortfall risk measures. Under each risk measure, the convergence rate for the upper bound on the pseudo regret, defined as the difference between the expectation of the empirical risk based on the observation sequence and the true risk of the optimal arm, is established.
Logistics has gained great attentions with the prosperous development of commerce, which is often seen as the classic optimal vehicle routing problem. Meanwhile, electric vehicle (EV) has been widely used in logistic fleet to curb the emission of green house gases in recent years. Solving the optimization problem of joint routing and charging of multiple EVs is in a urgent need, whose objective function includes charging time, charging cost, EVs travel time, usage fees of EV and revenue from serving customers. This joint problem is formulated as a mixed integer programming (MIP) problem, which, however, is NP-hard due to integer restrictions and bilinear terms from the coupling between routing and charging decisions. The main contribution of this paper lies at proposing an efficient two stage algorithm that can decompose the original MIP problem into two linear programming (LP) problems, by exploiting the exactness of LP relaxation and eliminating the coupled term. This algorithm can achieve a nearoptimal solution in polynomial time. In addition, another variant algorithm is proposed based on the two stage one, to further improve the quality of solution.