Disentangled Planning and Control in Vision Based Robotics via Reward Machines

74 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Alberto Camacho

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Alberto Camacho - Jacob Varley - Deepali Jain

علم الروبوتات الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In this work we augment a Deep Q-Learning agent with a Reward Machine (DQRM) to increase speed of learning vision-based policies for robot tasks, and overcome some of the limitations of DQN that prevent it from converging to good-quality policies. A reward machine (RM) is a finite state machine that decomposes a task into a discrete planning graph and equips the agent with a reward function to guide it toward task completion. The reward machine can be used for both reward shaping, and informing the policy what abstract state it is currently at. An abstract state is a high level simplification of the current state, defined in terms of task relevant features. These two supervisory signals of reward shaping and knowledge of current abstract state coming from the reward machine complement each other and can both be used to improve policy performance as demonstrated on several vision based robotic pick and place tasks. Particularly for vision based robotics applications, it is often easier to build a reward machine than to try and get a policy to learn the task without this structure.

قيم البحث

95 - David DeFazio , Shiqi Zhang 2021

Legged robots have been shown to be effective in navigating unstructured environments. Although there has been much success in learning locomotion policies for quadruped robots, there is little research on how to incorporate human knowledge to facili tate this learning process. In this paper, we demonstrate that human knowledge in the form of LTL formulas can be applied to quadruped locomotion learning within a Reward Machine (RM) framework. Experimental results in simulation show that our RM-based approach enables easily defining diverse locomotion styles, and efficiently learning locomotion policies of the defined styles.

علم الروبوتات الذكاء الاصطناعي

Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control

368 - Frederik Ebert , Chelsea Finn , Sudeep Dasari 2018

Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains. We present a deep RL method that is practical for real-world robotics tasks, such as robotic manipulation, and generalizes effectively to never-before-seen tasks and objects. In these settings, ground truth reward signals are typically unavailable, and we therefore propose a self-supervised model-based approach, where a predictive model learns to directly predict the future from raw sensory readings, such as camera images. At test time, we explore three distinct goal specification methods: designated pixels, where a user specifies desired object manipulation tasks by selecting particular pixels in an image and corresponding goal positions, goal images, where the desired goal state is specified with an image, and image classifiers, which define spaces of goal states. Our deep predictive models are trained using data collected autonomously and continuously by a robot interacting with hundreds of objects, without human supervision. We demonstrate that visual MPC can generalize to never-before-seen objects---both rigid and deformable---and solve a range of user-defined object manipulation tasks using the same model.

علم الروبوتات الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

Topological Planning with Transformers for Vision-and-Language Navigation

86 - Kevin Chen , Junshen K. Chen , Jo Chuang 2020

Conventional approaches to vision-and-language navigation (VLN) are trained end-to-end but struggle to perform well in freely traversable environments. Inspired by the robotics community, we propose a modular approach to VLN using topological maps. G iven a natural language instruction and topological map, our approach leverages attention mechanisms to predict a navigation plan in the map. The plan is then executed with low-level actions (e.g. forward, rotate) using a robust controller. Experiments show that our method outperforms previous end-to-end approaches, generates interpretable navigation plans, and exhibits intelligent behaviors such as backtracking.

علم الروبوتات الذكاء الاصطناعي الحساب واللغة

KDF: Kinodynamic Motion Planning via Geometric Sampling-based Algorithms and Funnel Control

172 - Christos K. Verginis , Dimos V. Dimarogonas , 2021

We integrate sampling-based planning techniques with funnel-based feedback control to develop KDF, a new framework for solving the kinodynamic motion-planning problem via funnel control. The considered systems evolve subject to complex, nonlinear, an d uncertain dynamics (aka differential constraints). Firstly, we use a geometric planner to obtain a high-level safe path in a user-defined extended free space. Secondly, we develop a low-level funnel control algorithm that guarantees safe tracking of the path by the system. Neither the planner nor the control algorithm use information on the underlying dynamics of the system, which makes the proposed scheme easily distributable to a large variety of different systems and scenarios. Intuitively, the funnel control module is able to implicitly accommodate the dynamics of the system, allowing hence the deployment of purely geometrical motion planners. Extensive computer simulations and experimental results with a 6-DOF robotic arm validate the proposed approach.

علم الروبوتات

Scaling data-driven robotics with reward sketching and batch reinforcement learning

266 - Serkan Cabi , Sergio Gomez Colmenarejo , Alexander Novikov 2019

We present a framework for data-driven robotics that makes use of a large dataset of recorded robot experience and scales to several tasks using learned reward functions. We show how to apply this framework to accomplish three different object manipu lation tasks on a real robot platform. Given demonstrations of a task together with task-agnostic recorded experience, we use a special form of human annotation as supervision to learn a reward function, which enables us to deal with real-world tasks where the reward signal cannot be acquired directly. Learned rewards are used in combination with a large dataset of experience from different tasks to learn a robot policy offline using batch RL. We show that using our approach it is possible to train agents to perform a variety of challenging manipulation tasks including stacking rigid objects and handling cloth.

علم الروبوتات التعلم الآلي