Role Playing Learning for Socially Concomitant Mobile Robot Navigation

221 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Mingming Li

تاريخ النشر 2017

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Mingming Li - Rui Jiang - Shuzhi Sam Ge

علم الروبوتات الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In this paper, we present the Role Playing Learning (RPL) scheme for a mobile robot to navigate socially with its human companion in populated environments. Neural networks (NN) are constructed to parameterize a stochastic policy that directly maps sensory data collected by the robot to its velocity outputs, while respecting a set of social norms. An efficient simulative learning environment is built with maps and pedestrians trajectories collected from a number of real-world crowd data sets. In each learning iteration, a robot equipped with the NN policy is created virtually in the learning environment to play itself as a companied pedestrian and navigate towards a goal in a socially concomitant manner. Thus, we call this process Role Playing Learning, which is formulated under a reinforcement learning (RL) framework. The NN policy is optimized end-to-end using Trust Region Policy Optimization (TRPO), with consideration of the imperfectness of robots sensor measurements. Simulative and experimental results are provided to demonstrate the efficacy and superiority of our method.

قيم البحث

139 - Yuxiang Cui , Haodong Zhang , Yue Wang 2020

Moving in dynamic pedestrian environments is one of the important requirements for autonomous mobile robots. We present a model-based reinforcement learning approach for robots to navigate through crowded environments. The navigation policy is traine d with both real interaction data from multi-agent simulation and virtual data from a deep transition model that predicts the evolution of surrounding dynamics of mobile robots. The model takes laser scan sequence and robots own state as input and outputs steering control. The laser sequence is further transformed into stacked local obstacle maps disentangled from robots ego motion to separate the static and dynamic obstacles, simplifying the model training. We observe that our method can be trained with significantly less real interaction data in simulator but achieve similar level of success rate in social navigation task compared with other methods. Experiments were conducted in multiple social scenarios both in simulation and on real robots, the learned policy can guide the robots to the final targets successfully while avoiding pedestrians in a socially compliant manner. Code is available at https://github.com/YuxiangCui/model-based-social-navigation

علم الروبوتات

CoMet: Modeling Group Cohesion for Socially Compliant Robot Navigation in Crowded Scenes

144 - Adarsh Jagan Sathyamoorthy , Utsav Patel , Moumita Paul 2021

We present CoMet, a novel approach for computing a groups cohesion and using that to improve a robots navigation in crowded scenes. Our approach uses a novel cohesion-metric that builds on prior work in social psychology. We compute this metric by ut ilizing various visual features of pedestrians from an RGB-D camera on-board a robot. Specifically, we detect characteristics corresponding to proximity between people, their relative walking speeds, the group size, and interactions between group members. We use our cohesion-metric to design and improve a navigation scheme that accounts for different levels of group cohesion while a robot moves through a crowd. We evaluate the precision and recall of our cohesion-metric based on perceptual evaluations. We highlight the performance of our social navigation algorithm on a Turtlebot robot and demonstrate its benefits in terms of multiple metrics: freezing rate (57% decrease), deviation (35.7% decrease), and path length of the trajectory(23.2% decrease).

علم الروبوتات

Dynamically Feasible Deep Reinforcement Learning Policy for Robot Navigation in Dense Mobile Crowds

106 - Utsav Patel , Nithish Kumar , Adarsh Jagan Sathyamoorthy 2020

We present a novel Deep Reinforcement Learning (DRL) based policy to compute dynamically feasible and spatially aware velocities for a robot navigating among mobile obstacles. Our approach combines the benefits of the Dynamic Window Approach (DWA) in terms of satisfying the robots dynamics constraints with state-of-the-art DRL-based navigation methods that can handle moving obstacles and pedestrians well. Our formulation achieves these goals by embedding the environmental obstacles motions in a novel low-dimensional observation space. It also uses a novel reward function to positively reinforce velocities that move the robot away from the obstacles heading direction leading to significantly lower number of collisions. We evaluate our method in realistic 3-D simulated environments and on a real differential drive robot in challenging dense indoor scenarios with several walking pedestrians. We compare our method with state-of-the-art collision avoidance methods and observe significant improvements in terms of success rate (up to 33% increase), number of dynamics constraint violations (up to 61% decrease), and smoothness. We also conduct ablation studies to highlight the advantages of our observation space formulation, and reward structure.

علم الروبوتات

Learning control for transmission and navigation with a mobile robot under unknown communication rates

347 - L. Busoniu , V. S. Varma , J. Loheac 2020

In tasks such as surveying or monitoring remote regions, an autonomous robot must move while transmitting data over a wireless network with unknown, position-dependent transmission rates. For such a robot, this paper considers the problem of transmit ting a data buffer in minimum time, while possibly also navigating towards a goal position. Two approaches are proposed, each consisting of a machine-learning component that estimates the rate function from samples; and of an optimal-control component that moves the robot given the current rate function estimate. Simple obstacle avoidance is performed for the case without a goal position. In extensive simulations, these methods achieve competitive performance compared to known-rate and unknown-rate baselines. A real indoor experiment is provided in which a Parrot AR.Drone 2 successfully learns to transmit the buffer.

علم الروبوتات التعلم الآلي أنظمة وتحكم

Composable Action-Conditioned Predictors: Flexible Off-Policy Learning for Robot Navigation

176 - Gregory Kahn , Adam Villaflor , Pieter Abbeel 2018

A general-purpose intelligent robot must be able to learn autonomously and be able to accomplish multiple tasks in order to be deployed in the real world. However, standard reinforcement learning approaches learn separate task-specific policies and a ssume the reward function for each task is known a priori. We propose a framework that learns event cues from off-policy data, and can flexibly combine these event cues at test time to accomplish different tasks. These event cue labels are not assumed to be known a priori, but are instead labeled using learned models, such as computer vision detectors, and then `backed up in time using an action-conditioned predictive model. We show that a simulated robotic car and a real-world RC car can gather data and train fully autonomously without any human-provided labels beyond those needed to train the detectors, and then at test-time be able to accomplish a variety of different tasks. Videos of the experiments and code can be found at https://github.com/gkahn13/CAPs

علم الروبوتات الذكاء الاصطناعي التعلم الآلي