ترغب بنشر مسار تعليمي؟ اضغط هنا

Learning Model-Based Vehicle-Relocation Decisions for Real-Time Ride-Sharing: Hybridizing Learning and Optimization

131   0   0.0 ( 0 )
 نشر من قبل Enpeng Yuan
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Large-scale ride-sharing systems combine real-time dispatching and routing optimization over a rolling time horizon with a model predictive control (MPC) component that relocates idle vehicles to anticipate the demand. The MPC optimization operates over a longer time horizon to compensate for the inherent myopic nature of the real-time dispatching. These longer time horizons are beneficial for the quality of relocation decisions but increase computational complexity. Consequently, the ride-sharing operators are often forced to use a relatively short time horizon. To address this computational challenge, this paper proposes a hybrid approach that combines machine learning and optimization. The machine-learning component learns the optimal solution to the MPC on the aggregated level to overcome the sparsity and high-dimensionality of the solution. The optimization component transforms the machine-learning prediction back to the original granularity through a tractable transportation model. As a consequence, the original NP-hard MPC problem is reduced to a polynomial time prediction and optimization, which allows the ride-sharing operators to consider a longer time horizon. Experimental results show that the hybrid approach achieves significantly better service quality than the MPC optimization in terms of average rider waiting time, due to its ability to model a longer horizon.

قيم البحث

اقرأ أيضاً

Despite recent progress in robot learning, it still remains a challenge to program a robot to deal with open-ended object manipulation tasks. One approach that was recently used to autonomously generate a repertoire of diverse skills is a novelty bas ed Quality-Diversity~(QD) algorithm. However, as most evolutionary algorithms, QD suffers from sample-inefficiency and, thus, it is challenging to apply it in real-world scenarios. This paper tackles this problem by integrating a neural network that predicts the behavior of the perturbed parameters into a novelty based QD algorithm. In the proposed Model-based Quality-Diversity search (M-QD), the network is trained concurrently to the repertoire and is used to avoid executing unpromising actions in the novelty search process. Furthermore, it is used to adapt the skills of the final repertoire in order to generalize the skills to different scenarios. Our experiments show that enhancing a QD algorithm with such a forward model improves the sample-efficiency and performance of the evolutionary process and the skill adaptation.
Deep reinforcement learning (DRL) has recently shown its success in tackling complex combinatorial optimization problems. When these problems are extended to multiobjective ones, it becomes difficult for the existing DRL approaches to flexibly and ef ficiently deal with multiple subproblems determined by weight decomposition of objectives. This paper proposes a concise meta-learning-based DRL approach. It first trains a meta-model by meta-learning. The meta-model is fine-tuned with a few update steps to derive submodels for the corresponding subproblems. The Pareto front is built accordingly. The computational experiments on multiobjective traveling salesman problems demonstrate the superiority of our method over most of learning-based and iteration-based approaches.
111 - Hang Shuai , Member , IEEE 2021
The uncertainties from distributed energy resources (DERs) bring significant challenges to the real-time operation of microgrids. In addition, due to the nonlinear constraints in the AC power flow equation and the nonlinearity of the battery storage model, etc., the optimization of the microgrid is a mixed-integer nonlinear programming (MINLP) problem. It is challenging to solve this kind of stochastic nonlinear optimization problem. To address the challenge, this paper proposes a deep reinforcement learning (DRL) based optimization strategy for the real-time operation of the microgrid. Specifically, we construct the detailed operation model for the microgrid and formulate the real-time optimization problem as a Markov Decision Process (MDP). Then, a double deep Q network (DDQN) based architecture is designed to solve the MINLP problem. The proposed approach can learn a near-optimal strategy only from the historical data. The effectiveness of the proposed algorithm is validated by the simulations on a 10-bus microgrid system and a modified IEEE 69-bus microgrid system. The numerical simulation results demonstrate that the proposed approach outperforms several existing methods.
We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state, in order to plan and to generalize better out-of-distribution. The agents architecture uses a set representation and a b ottleneck mechanism, forcing the number of entities to which the agent attends at each planning step to be small. In experiments with customized MiniGrid environments with different dynamics, we observe that the design allows agents to learn to plan effectively, by attending to the relevant objects, leading to better out-of-distribution generalization.
The connectivity aspect of connected autonomous vehicles (CAV) is beneficial because it facilitates dissemination of traffic-related information to vehicles through Vehicle-to-External (V2X) communication. Onboard sensing equipment including LiDAR an d camera can reasonably characterize the traffic environment in the immediate locality of the CAV. However, their performance is limited by their sensor range (SR). On the other hand, longer-range information is helpful for characterizing imminent conditions downstream. By contemporaneously coalescing the short- and long-range information, the CAV can construct comprehensively its surrounding environment and thereby facilitate informed, safe, and effective movement planning in the short-term (local decisions including lane change) and long-term (route choice). In this paper, we describe a Deep Reinforcement Learning based approach that integrates the data collected through sensing and connectivity capabilities from other vehicles located in the proximity of the CAV and from those located further downstream, and we use the fused data to guide lane changing, a specific context of CAV operations. In addition, recognizing the importance of the connectivity range (CR) to the performance of not only the algorithm but also of the vehicle in the actual driving environment, the paper carried out a case study. The case study demonstrates the application of the proposed algorithm and duly identifies the appropriate CR for each level of prevailing traffic density. It is expected that implementation of the algorithm in CAVs can enhance the safety and mobility associated with CAV driving operations. From a general perspective, its implementation can provide guidance to connectivity equipment manufacturers and CAV operators, regarding the default CR settings for CAVs or the recommended CR setting in a given traffic environment.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا