Large-scale ride-sharing systems combine real-time dispatching and routing optimization over a rolling time horizon with a model predictive control (MPC) component that relocates idle vehicles in anticipation of demand. The MPC optimization operates over a longer time horizon to compensate for the inherently myopic nature of real-time dispatching. Longer horizons improve the quality of relocation decisions but increase computational complexity, so ride-sharing operators are often forced to use a relatively short horizon. To address this computational challenge, this paper proposes a hybrid approach that combines machine learning and optimization. The machine-learning component learns the optimal solution to the MPC at an aggregated level, which overcomes the sparsity and high dimensionality of the solution. The optimization component transforms the machine-learning prediction back to the original granularity through a tractable transportation model. As a consequence, the original NP-hard MPC problem is reduced to a polynomial-time prediction and optimization, which allows ride-sharing operators to consider a longer time horizon. Experimental results show that the hybrid approach achieves significantly better service quality than the MPC optimization in terms of average rider waiting time, owing to its ability to model a longer horizon.
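The disaggregation step described above can be illustrated with a small transportation problem. The sketch below is not the paper's formulation, which the abstract does not give. It assumes, hypothetically, that the learned model outputs for each zone how many idle vehicles should leave (`supply`) and how many should arrive (`demand`), and that `cost[i][j]` is a travel-time estimate between zones. The function name `solve_transportation` and all data are illustrative. The balanced transportation problem is solved as a min-cost flow with successive shortest paths, which runs in polynomial time for fixed integer supplies.

```python
from math import inf


def solve_transportation(supply, demand, cost):
    """Balanced transportation problem solved as a min-cost flow.

    supply[i]  : vehicles leaving origin zone i (assumed ML output)
    demand[j]  : vehicles arriving at destination zone j (assumed ML output)
    cost[i][j] : relocation cost from zone i to zone j (assumed travel time)
    Returns (plan, total_cost), where plan[i][j] is the number of vehicles moved.
    """
    m, n = len(supply), len(demand)
    if sum(supply) != sum(demand):
        raise ValueError("expected a balanced problem")

    # Node ids: 0 = source, 1..m = origins, m+1..m+n = destinations, m+n+1 = sink.
    source, sink, size = 0, m + n + 1, m + n + 2
    # Each edge is [target, residual capacity, unit cost, index of reverse edge].
    graph = [[] for _ in range(size)]

    def add_edge(u, v, cap, c):
        graph[u].append([v, cap, c, len(graph[v])])
        graph[v].append([u, 0, -c, len(graph[u]) - 1])

    for i in range(m):
        add_edge(source, 1 + i, supply[i], 0)
        for j in range(n):
            add_edge(1 + i, 1 + m + j, supply[i], cost[i][j])
    for j in range(n):
        add_edge(1 + m + j, sink, demand[j], 0)

    total_cost = 0
    while True:
        # Bellman-Ford: cheapest augmenting path in the residual graph.
        dist = [inf] * size
        prev = [None] * size
        dist[source] = 0
        for _ in range(size - 1):
            updated = False
            for u in range(size):
                if dist[u] == inf:
                    continue
                for k, (v, cap, c, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, k)
                        updated = True
            if not updated:
                break
        if dist[sink] == inf:
            break  # no augmenting path left: the flow is optimal

        # Find the bottleneck capacity along the path, then push that much flow.
        push, v = inf, sink
        while v != source:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = sink
        while v != source:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        total_cost += push * dist[sink]

    # The flow on an origin->destination edge equals the reverse edge's capacity.
    plan = [[0] * n for _ in range(m)]
    for i in range(m):
        for v, _, _, rev in graph[1 + i]:
            if m + 1 <= v <= m + n:
                plan[i][v - m - 1] = graph[v][rev][1]
    return plan, total_cost


# Illustrative data only: two origin zones, two destination zones.
plan, total = solve_transportation(
    supply=[3, 2], demand=[1, 4], cost=[[4, 6], [5, 3]]
)
print(plan, total)
```

In practice a linear-programming or network-flow solver would replace the hand-written routine. The point of the sketch is only that, once the aggregate quantities are fixed by the prediction, the remaining zone-to-zone assignment is a tractable problem.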
Despite recent progress in robot learning, it still remains a challenge to program a robot to deal with open-ended object manipulation tasks. One approach that was recently used to autonomously generate a repertoire of diverse skills is a novelty bas
Deep reinforcement learning (DRL) has recently shown its success in tackling complex combinatorial optimization problems. When these problems are extended to multiobjective ones, it becomes difficult for the existing DRL approaches to flexibly and ef
The uncertainties from distributed energy resources (DERs) bring significant challenges to the real-time operation of microgrids. In addition, due to the nonlinear constraints in the AC power flow equation and the nonlinearity of the battery storage
We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state, in order to plan and to generalize better out-of-distribution. The agent's architecture uses a set representation and a b
The connectivity aspect of connected autonomous vehicles (CAV) is beneficial because it facilitates dissemination of traffic-related information to vehicles through Vehicle-to-External (V2X) communication. Onboard sensing equipment including LiDAR an