ترغب بنشر مسار تعليمي؟ اضغط هنا

Joint Space Control via Deep Reinforcement Learning

76   0   0.0 ( 0 )
 نشر من قبل Stan Birchfield
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

The dominant way to control a robot manipulator uses hand-crafted differential equations leveraging some form of inverse kinematics / dynamics. We propose a simple, versatile joint-level controller that dispenses with differential equations entirely. A deep neural network, trained via model-free reinforcement learning, is used to map from task space to joint space. Experiments show the method capable of achieving similar error to traditional methods, while greatly simplifying the process by automatically handling redundancy, joint limits, and acceleration / deceleration profiles. The basic technique is extended to avoid obstacles by augmenting the input to the network with information about the nearest obstacles. Results are shown both in simulation and on a real robot via sim-to-real transfer of the learned policy. We show that it is possible to achieve sub-centimeter accuracy, both in simulation and the real world, with a moderate amount of training.



قيم البحث

اقرأ أيضاً

238 - Fan Yang , Chao Yang , Di Guo 2020
Robots have limited adaptation ability compared to humans and animals in the case of damage. However, robot damages are prevalent in real-world applications, especially for robots deployed in extreme environments. The fragility of robots greatly limi ts their widespread application. We propose an adversarial reinforcement learning framework, which significantly increases robot robustness over joint damage cases in both manipulation tasks and locomotion tasks. The agent is trained iteratively under the joint damage cases where it has poor performance. We validate our algorithm on a three-fingered robot hand and a quadruped robot. Our algorithm can be trained only in simulation and directly deployed on a real robot without any fine-tuning. It also demonstrates exceeding success rates over arbitrary joint damage cases.
Multi-agent collision-free trajectory planning and control subject to different goal requirements and system dynamics has been extensively studied, and is gaining recent attention in the realm of machine and reinforcement learning. However, in partic ular when using a large number of agents, constructing a least-restrictive collision avoidance policy is of utmost importance for both classical and learning-based methods. In this paper, we propose a Least-Restrictive Collision Avoidance Module (LR-CAM) that evaluates the safety of multi-agent systems and takes over control only when needed to prevent collisions. The LR-CAM is a single policy that can be wrapped around policies of all agents in a multi-agent system. It allows each agent to pursue any objective as long as it is safe to do so. The benefit of the proposed least-restrictive policy is to only interrupt and overrule the default controller in case of an upcoming inevitable danger. We use a Long Short-Term Memory (LSTM) based Variational Auto-Encoder (VAE) to enable the LR-CAM to account for a varying number of agents in the environment. Moreover, we propose an off-policy meta-reinforcement learning framework with a novel reward function based on a Hamilton-Jacobi value function to train the LR-CAM. The proposed method is fully meta-trained through a ROS based simulation and tested on real multi-agent system. Our results show that LR-CAM outperforms the classical least-restrictive baseline by 30 percent. In addition, we show that even if a subset of agents in a multi-agent system use LR-CAM, the success rate of all agents will increase significantly.
Many manipulation tasks require robots to interact with unknown environments. In such applications, the ability to adapt the impedance according to different task phases and environment constraints is crucial for safety and performance. Although many approaches based on deep reinforcement learning (RL) and learning from demonstration (LfD) have been proposed to obtain variable impedance skills on contact-rich manipulation tasks, these skills are typically task-specific and could be sensitive to changes in task settings. This paper proposes an inverse reinforcement learning (IRL) based approach to recover both the variable impedance policy and reward function from expert demonstrations. We explore different action space of the reward functions to achieve a more general representation of expert variable impedance skills. Experiments on two variable impedance tasks (Peg-in-Hole and Cup-on-Plate) were conducted in both simulations and on a real FANUC LR Mate 200iD/7L industrial robot. The comparison results with behavior cloning and force-based IRL proved that the learned reward function in the gain action space has better transferability than in the force space. Experiment videos are available at https://msc.berkeley.edu/research/impedance-irl.html.
Vision-based grasping systems typically adopt an open-loop execution of a planned grasp. This policy can fail due to many reasons, including ubiquitous calibration error. Recovery from a failed grasp is further complicated by visual occlusion, as the hand is usually occluding the vision sensor as it attempts another open-loop regrasp. This work presents MAT, a tactile closed-loop method capable of realizing grasps provided by a coarse initial positioning of the hand above an object. Our algorithm is a deep reinforcement learning (RL) policy optimized through the clipped surrogate objective within a maximum entropy RL framework to balance exploitation and exploration. The method utilizes tactile and proprioceptive information to act through both fine finger motions and larger regrasp movements to execute stable grasps. A novel curriculum of action motion magnitude makes learning more tractable and helps turn common failure cases into successes. Careful selection of features that exhibit small sim-to-real gaps enables this tactile grasping policy, trained purely in simulation, to transfer well to real world environments without the need for additional learning. Experimentally, this methodology improves over a vision-only grasp success rate substantially on a multi-fingered robot hand. When this methodology is used to realize grasps from coarse initial positions provided by a vision-only planner, the system is made dramatically more robust to calibration errors in the camera-robot transform.
We consider an autonomous exploration problem in which a range-sensing mobile robot is tasked with accurately mapping the landmarks in an a priori unknown environment efficiently in real-time; it must choose sensing actions that both curb localizatio n uncertainty and achieve information gain. For this problem, belief space planning methods that forward-simulate robot sensing and estimation may often fail in real-time implementation, scaling poorly with increasing size of the state, belief and action spaces. We propose a novel approach that uses graph neural networks (GNNs) in conjunction with deep reinforcement learning (DRL), enabling decision-making over graphs containing exploration information to predict a robots optimal sensing action in belief space. The policy, which is trained in different random environments without human intervention, offers a real-time, scalable decision-making process whose high-performance exploratory sensing actions yield accurate maps and high rates of information gain.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا