No Arabic abstract
Recent literature in the robotics community has focused on learning robot behaviors that abstract out lower-level details of robot control. To fully leverage the efficacy of such behaviors, it is necessary to select and sequence them to achieve a given task. In this paper, we present an approach to both learn and sequence robot behaviors, applied to the problem of visual navigation of mobile robots. We construct a layered representation of control policies composed of low- level behaviors and a meta-level policy. The low-level behaviors enable the robot to locomote in a particular environment while avoiding obstacles, and the meta-level policy actively selects the low-level behavior most appropriate for the current situation based purely on visual feedback. We demonstrate the effectiveness of our method on three simulated robot navigation tasks: a legged hexapod robot which must successfully traverse varying terrain, a wheeled robot which must navigate a maze-like course while avoiding obstacles, and finally a wheeled robot navigating in the presence of dynamic obstacles. We show that by learning control policies in a layered manner, we gain the ability to successfully traverse new compound environments composed of distinct sub-environments, and outperform both the low-level behaviors in their respective sub-environments, as well as a hand-crafted selection of low-level policies on these compound environments.
Deep reinforcement learning has great potential to acquire complex, adaptive behaviors for autonomous agents automatically. However, the underlying neural network polices have not been widely deployed in real-world applications, especially in these safety-critical tasks (e.g., autonomous driving). One of the reasons is that the learned policy cannot perform flexible and resilient behaviors as traditional methods to adapt to diverse environments. In this paper, we consider the problem that a mobile robot learns adaptive and resilient behaviors for navigating in unseen uncertain environments while avoiding collisions. We present a novel approach for uncertainty-aware navigation by introducing an uncertainty-aware predictor to model the environmental uncertainty, and we propose a novel uncertainty-aware navigation network to learn resilient behaviors in the prior unknown environments. To train the proposed uncertainty-aware network more stably and efficiently, we present the temperature decay training paradigm, which balances exploration and exploitation during the training process. Our experimental evaluation demonstrates that our approach can learn resilient behaviors in diverse environments and generate adaptive trajectories according to environmental uncertainties.
Humans can routinely follow a trajectory defined by a list of images/landmarks. However, traditional robot navigation methods require accurate mapping of the environment, localization, and planning. Moreover, these methods are sensitive to subtle changes in the environment. In this paper, we propose a Deep Visual MPC-policy learning method that can perform visual navigation while avoiding collisions with unseen objects on the navigation path. Our model PoliNet takes in as input a visual trajectory and the image of the robots current view and outputs velocity commands for a planning horizon of $N$ steps that optimally balance between trajectory following and obstacle avoidance. PoliNet is trained using a strong image predictive model and traversability estimation model in a MPC setup, with minimal human supervision. Different from prior work, PoliNet can be applied to new scenes without retraining. We show experimentally that the robot can follow a visual trajectory when varying start position and in the presence of previously unseen obstacles. We validated our algorithm with tests both in a realistic simulation environment and in the real world. We also show that we can generate visual trajectories in simulation and execute the corresponding path in the real environment. Our approach outperforms classical approaches as well as previous learning-based baselines in success rate of goal reaching, sub-goal coverage rate, and computational load.
In this paper, we present the Role Playing Learning (RPL) scheme for a mobile robot to navigate socially with its human companion in populated environments. Neural networks (NN) are constructed to parameterize a stochastic policy that directly maps sensory data collected by the robot to its velocity outputs, while respecting a set of social norms. An efficient simulative learning environment is built with maps and pedestrians trajectories collected from a number of real-world crowd data sets. In each learning iteration, a robot equipped with the NN policy is created virtually in the learning environment to play itself as a companied pedestrian and navigate towards a goal in a socially concomitant manner. Thus, we call this process Role Playing Learning, which is formulated under a reinforcement learning (RL) framework. The NN policy is optimized end-to-end using Trust Region Policy Optimization (TRPO), with consideration of the imperfectness of robots sensor measurements. Simulative and experimental results are provided to demonstrate the efficacy and superiority of our method.
Moving in dynamic pedestrian environments is one of the important requirements for autonomous mobile robots. We present a model-based reinforcement learning approach for robots to navigate through crowded environments. The navigation policy is trained with both real interaction data from multi-agent simulation and virtual data from a deep transition model that predicts the evolution of surrounding dynamics of mobile robots. The model takes laser scan sequence and robots own state as input and outputs steering control. The laser sequence is further transformed into stacked local obstacle maps disentangled from robots ego motion to separate the static and dynamic obstacles, simplifying the model training. We observe that our method can be trained with significantly less real interaction data in simulator but achieve similar level of success rate in social navigation task compared with other methods. Experiments were conducted in multiple social scenarios both in simulation and on real robots, the learned policy can guide the robots to the final targets successfully while avoiding pedestrians in a socially compliant manner. Code is available at https://github.com/YuxiangCui/model-based-social-navigation
We are interested in the design of autonomous robot behaviors that learn the preferences of users over continued interactions, with the goal of efficiently executing navigation behaviors in a way that the user expects. In this paper, we discuss our work in progress to modify a general model for robot navigation behaviors in an exploration task on a per-user basis using preference-based reinforcement learning. The novel contribution of this approach is that it combines reinforcement learning, motion planning, and natural language processing to allow an autonomous agent to learn from sustained dialogue with a human teammate as opposed to one-off instructions.