Learning-based Preference Prediction for Constrained Multi-Criteria Path-Planning

329 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Kevin Osanlou Mr

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Kevin Osanlou - Christophe Guettier - Andrei Bursuc

الذكاء الاصطناعي التعلم الآلي علم الروبوتات

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Learning-based methods are increasingly popular for search algorithms in single-criterion optimization problems. In contrast, for multiple-criteria optimization there are significantly fewer approaches despite the existence of numerous applications. Constrained path-planning for Autonomous Ground Vehicles (AGV) is one such application, where an AGV is typically deployed in disaster relief or search and rescue applications in off-road environments. The agent can be faced with the following dilemma : optimize a source-destination path according to a known criterion and an uncertain criterion under operational constraints. The known criterion is associated to the cost of the path, representing the distance. The uncertain criterion represents the feasibility of driving through the path without requiring human intervention. It depends on various external parameters such as the physics of the vehicle, the state of the explored terrains or weather conditions. In this work, we leverage knowledge acquired through offline simulations by training a neural network model to predict the uncertain criterion. We integrate this model inside a path-planner which can solve problems online. Finally, we conduct experiments on realistic AGV scenarios which illustrate that the proposed framework requires human intervention less frequently, trading for a limited increase in the path distance.

قيم البحث

118 - Kevin Yang , Tianjun Zhang , Chris Cummins 2021

Path planning, the problem of efficiently discovering high-reward trajectories, often requires optimizing a high-dimensional and multimodal reward function. Popular approaches like CEM and CMA-ES greedily focus on promising regions of the search spac e and may get trapped in local maxima. DOO and VOOT balance exploration and exploitation, but use space partitioning strategies independent of the reward function to be optimized. Recently, LaMCTS empirically learns to partition the search space in a reward-sensitive manner for black-box optimization. In this paper, we develop a novel formal regret analysis for when and why such an adaptive region partitioning scheme works. We also propose a new path planning method PlaLaM which improves the function value estimation within each sub-region, and uses a latent representation of the search space. Empirically, PlaLaM outperforms existing path planning methods in 2D navigation tasks, especially in the presence of difficult-to-escape local optima, and shows benefits when plugged into model-based RL with planning components such as PETS. These gains transfer to highly multimodal real-world tasks, where we outperform strong baselines in compiler phase ordering by up to 245% and in molecular design by up to 0.4 on properties on a 0-1 scale. Code is available at https://github.com/yangkevin2/plalam.

الذكاء الاصطناعي التعلم الآلي علم الروبوتات

Optimal Solving of Constrained Path-Planning Problems with Graph Convolutional Networks and Optimized Tree Search

131 - Kevin Osanlou , Andrei Bursuc , Christophe Guettier 2021

Learning-based methods are growing prominence for planning purposes. However, there are very few approaches for learning-assisted constrained path-planning on graphs, while there are multiple downstream practical applications. This is the case for co nstrained path-planning for Autonomous Unmanned Ground Vehicles (AUGV), typically deployed in disaster relief or search and rescue applications. In off-road environments, the AUGV must dynamically optimize a source-destination path under various operational constraints, out of which several are difficult to predict in advance and need to be addressed on-line. We propose a hybrid solving planner that combines machine learning models and an optimal solver. More specifically, a graph convolutional network (GCN) is used to assist a branch and bound (B&B) algorithm in handling the constraints. We conduct experiments on realistic scenarios and show that GCN support enables substantial speedup and smoother scaling to harder problems.

الذكاء الاصطناعي التعلم الآلي علم الروبوتات

Risk-Constrained Reinforcement Learning with Percentile Risk Criteria

85 - Yinlam Chow , Mohammad Ghavamzadeh , Lucas Janson 2015

In many sequential decision-making problems one is interested in minimizing an expected cumulative cost while taking into account emph{risk}, i.e., increased awareness of events of small probability and high consequences. Accordingly, the objective o f this paper is to present efficient reinforcement learning algorithms for risk-constrained Markov decision processes (MDPs), where risk is represented via a chance constraint or a constraint on the conditional value-at-risk (CVaR) of the cumulative cost. We collectively refer to such problems as percentile risk-constrained MDPs. Specifically, we first derive a formula for computing the gradient of the Lagrangian function for percentile risk-constrained MDPs. Then, we devise policy gradient and actor-critic algorithms that (1) estimate such gradient, (2) update the policy in the descent direction, and (3) update the Lagrange multiplier in the ascent direction. For these algorithms we prove convergence to locally optimal policies. Finally, we demonstrate the effectiveness of our algorithms in an optimal stopping problem and an online marketing application.

الذكاء الاصطناعي التعلم الآلي التحسين والتحكم

Constrained Model-based Reinforcement Learning with Robust Cross-Entropy Method

248 - Zuxin Liu , Hongyi Zhou , Baiming Chen 2020

This paper studies the constrained/safe reinforcement learning (RL) problem with sparse indicator signals for constraint violations. We propose a model-based approach to enable RL agents to effectively explore the environment with unknown system dyna mics and environment constraints given a significantly small number of violation budgets. We employ the neural network ensemble model to estimate the prediction uncertainty and use model predictive control as the basic control framework. We propose the robust cross-entropy method to optimize the control sequence considering the model uncertainty and constraints. We evaluate our methods in the Safety Gym environment. The results show that our approach learns to complete the tasks with a much smaller number of constraint violations than state-of-the-art baselines. Additionally, we are able to achieve several orders of magnitude better sample efficiency when compared with constrained model-free RL approaches. The code is available at url{https://github.com/liuzuxin/safe-mbrl}.

الذكاء الاصطناعي التعلم الآلي علم الروبوتات

Constrained Shortest Path Search with Graph Convolutional Neural Networks

98 - Kevin Osanlou , Christophe Guettier , Andrei Bursuc 2021

Planning for Autonomous Unmanned Ground Vehicles (AUGV) is still a challenge, especially in difficult, off-road, critical situations. Automatic planning can be used to reach mission objectives, to perform navigation or maneuvers. Most of the time, th e problem consists in finding a path from a source to a destination, while satisfying some operational constraints. In a graph without negative cycles, the computation of the single-pair shortest path from a start node to an end node is solved in polynomial time. Additional constraints on the solution path can however make the problem harder to solve. This becomes the case when we need the path to pass through a few mandatory nodes without requiring a specific order of visit. The complexity grows exponentially with the number of mandatory nodes to visit. In this paper, we focus on shortest path search with mandatory nodes on a given connected graph. We propose a hybrid model that combines a constraint-based solver and a graph convolutional neural network to improve search performance. Promising results are obtained on realistic scenarios.

الذكاء الاصطناعي التعلم الآلي علم الروبوتات