To detect and correct physical exercises, a Grow-When-Required Network (GWR) with recurrent connections, episodic memory, and a novel subnode mechanism is developed to learn the spatiotemporal relationships of body movements and poses. Once an exercise is performed, the pose and movement information of each frame is stored in the GWR. For every frame, the current pose and motion pair is compared against the predicted output of the GWR, allowing feedback not only on the pose but also on the velocity of the motion. In a practical scenario, a physical exercise is performed by an expert such as a physiotherapist and then used as a reference for a humanoid robot such as Pepper to give feedback on a patient's execution of the same exercise. This approach, however, comes with two challenges. First, the distance to the humanoid robot and the position of the user in the robot's camera view have to be considered by the GWR as well, requiring robustness against the user's positioning in the robot's field of view. Second, since both pose and motion depend on the body measurements of the original performer, the expert's exercise cannot easily be used as a reference. This paper tackles the first challenge by designing an architecture that tolerates translations and rotations relative to the center of the field of view. For the second challenge, we allow the GWR to grow online on incremental data. For evaluation, we created a novel exercise dataset with virtual avatars, called the Virtual-Squat dataset. Overall, we claim that our novel GWR-based architecture can use a learned exercise reference for different body variations through continual online learning, while preventing catastrophic forgetting, enabling engaging long-term human-robot interaction with a humanoid robot.
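A minimal sketch of the per-frame comparison described above: the current pose/motion pair is matched against the closest stored reference and deviations in posture and velocity are reported separately. The node structure, distance measure, and tolerance values here are illustrative assumptions; the paper's actual GWR additionally uses recurrent connections, episodic memory, and the subnode mechanism, which are not shown.

    import numpy as np

    class ReferenceNode:
        def __init__(self, pose, velocity):
            self.pose = np.asarray(pose)          # joint-angle or keypoint vector
            self.velocity = np.asarray(velocity)  # frame-to-frame motion vector

    def frame_feedback(nodes, pose, velocity, pose_tol=0.1, vel_tol=0.05):
        """Compare the current (pose, velocity) pair against the closest
        stored reference node and report pose and velocity deviations."""
        pose, velocity = np.asarray(pose), np.asarray(velocity)
        dists = [np.linalg.norm(pose - n.pose) + np.linalg.norm(velocity - n.velocity)
                 for n in nodes]
        best = nodes[int(np.argmin(dists))]
        pose_err = np.linalg.norm(pose - best.pose)
        vel_err = np.linalg.norm(velocity - best.velocity)
        return {
            "pose_ok": pose_err <= pose_tol,    # posture matches the reference
            "velocity_ok": vel_err <= vel_tol,  # speed of the motion matches
            "pose_error": pose_err,
            "velocity_error": vel_err,
        }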
At the current stage of Soccer 3D, motion control is a key factor in team performance. Recent works take advantage of model-free approaches based on machine learning to exploit robot dynamics and obtain faster locomotion skills, achieving running policies and thereby opening a new research direction in the Soccer 3D environment. In this work, we present a methodology based on Deep Reinforcement Learning that learns running skills without any prior knowledge, using a neural network whose inputs are related to the robot's dynamics. Our results outperform the previous state-of-the-art sprint velocity reported in the Soccer 3D literature by a significant margin. The method also demonstrates improved sample efficiency, learning how to run in just a few hours. We report our results by analyzing the training procedure and evaluating the policies in terms of speed, reliability, and human similarity. Finally, we present the key factors that led us to improve on previous results and share some ideas for future work.
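A hedged sketch of a model-free policy network mapping robot-dynamics observations to joint commands, as one might train with an off-the-shelf deep reinforcement learning algorithm. The observation and action sizes, hidden width, and input composition are placeholders, not the paper's actual architecture.

    import torch
    import torch.nn as nn

    class RunningPolicy(nn.Module):
        def __init__(self, obs_dim=60, act_dim=22, hidden=256):
            super().__init__()
            # obs_dim: e.g. joint angles/velocities, torso orientation, foot contacts
            # act_dim: target positions for the humanoid's actuated joints
            self.net = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, act_dim), nn.Tanh(),  # normalized joint targets
            )

        def forward(self, obs):
            return self.net(obs)

    policy = RunningPolicy()
    obs = torch.zeros(1, 60)   # placeholder observation vector
    action = policy(obs)       # normalized joint commands in [-1, 1]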
We present a novel approach for interactive auditory object analysis with a humanoid robot. The robot elicits sensory information by physically shaking visually indistinguishable plastic capsules. It gathers the resulting audio signals from microphones embedded in the robotic ears. A neural network architecture learns from these signals to analyze properties of the containers' contents. Specifically, we evaluate the material classification and weight prediction accuracy and demonstrate that the framework is fairly robust against real-world acoustic noise.
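A minimal sketch of a two-head network for this kind of shaking-audio task: one head classifies the material, the other regresses the weight from a shared audio feature. The feature dimension, number of material classes, and layer sizes are assumptions for illustration, not the paper's model.

    import torch
    import torch.nn as nn

    class AudioAnalyser(nn.Module):
        def __init__(self, feat_dim=128, n_materials=4):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Linear(feat_dim, 256), nn.ReLU(),
                nn.Linear(256, 128), nn.ReLU(),
            )
            self.material_head = nn.Linear(128, n_materials)  # material classification
            self.weight_head = nn.Linear(128, 1)              # weight regression

        def forward(self, x):
            h = self.backbone(x)
            return self.material_head(h), self.weight_head(h)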
A practical approach to robot reinforcement learning is to first collect a large batch of real or simulated robot interaction data, using some data collection policy, and then learn from this data to perform various tasks, using offline learning algorithms. Previous work focused on manually designing the data collection policy, and on tasks where suitable policies can easily be designed, such as random picking policies for collecting data about object grasping. For more complex tasks, however, it may be difficult to find a data collection policy that explores the environment effectively and produces data that is diverse enough for the downstream task. In this work, we propose that data collection policies should actively explore the environment to collect diverse data. In particular, we develop a simple yet effective goal-conditioned reinforcement learning method that actively focuses data collection on novel observations, thereby collecting a diverse dataset. We evaluate our method on simulated robot manipulation tasks with visual inputs and show that the improved diversity of active data collection leads to significant improvements in the downstream learning tasks.
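A hedged sketch of novelty-driven goal selection for data collection: candidate goals are scored by their distance to the nearest observations already in the buffer, and the most novel one is handed to the goal-conditioned collection policy. The k-nearest-neighbour novelty score is a stand-in assumption for whatever novelty measure the paper's method actually uses.

    import numpy as np

    def knn_novelty(candidates, buffer, k=5):
        """Score each candidate observation by its mean distance to the
        k nearest observations already collected (buffer: N x d array)."""
        scores = []
        for c in candidates:
            d = np.linalg.norm(buffer - c, axis=1)
            scores.append(np.sort(d)[:k].mean())
        return np.asarray(scores)

    def sample_novel_goal(candidates, buffer, k=5):
        """Pick the most novel candidate as the next goal for the
        goal-conditioned data collection policy."""
        return candidates[int(np.argmax(knn_novelty(candidates, buffer, k)))]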
Hierarchical quadratic programming (HQP) is commonly applied to enforce strict hierarchies of multiple tasks and the robot's physical inequality constraints during whole-body compliance. However, for one-step HQP, the solution can oscillate when it is close to the boundary of the constraints. This is because abruptly hitting the bounds gives rise to unrealisable jerks and even infeasible solutions. This paper proposes a mixed control scheme, which blends single-axis model predictive control (MPC) and proportional-derivative (PD) control for whole-body compliance to overcome these deficiencies. The MPC predicts the distances between the bounds and the control targets of the critical tasks, and it provides smooth and feasible solutions by prediction and optimisation in advance. However, applying MPC inevitably increases the computation time. Therefore, to achieve a 500 Hz servo rate, PD controllers still regulate the other tasks to save computation resources. Also, we use a more efficient null space projection (NSP) whole-body controller instead of the HQP and distribute the single-axis MPCs over four CPU cores for parallel computation. Finally, we validate the desired capabilities of the proposed strategy via simulations and an experiment on the humanoid robot Walker X.
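A minimal sketch of null space projection (NSP) task prioritisation, the mechanism behind the whole-body controller mentioned above: a secondary task is executed only in the null space of the primary task's Jacobian, so it cannot disturb the primary task. The two-task form shown here is the textbook construction, not the specific controller of the paper; the Jacobians and task velocities are placeholders.

    import numpy as np

    def nsp_two_tasks(J1, dx1, J2, dx2):
        """Joint velocities realising task 1 strictly and task 2 only in
        the null space of task 1 (J1, J2: task Jacobians; dx1, dx2:
        desired task-space velocities)."""
        J1_pinv = np.linalg.pinv(J1)
        N1 = np.eye(J1.shape[1]) - J1_pinv @ J1        # null-space projector of task 1
        dq1 = J1_pinv @ dx1                            # primary-task contribution
        dq2 = np.linalg.pinv(J2 @ N1) @ (dx2 - J2 @ dq1)  # secondary task, projected
        return dq1 + N1 @ dq2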
Recent advances in robot learning have enabled robots to become increasingly better at mastering a predefined set of tasks. On the other hand, as humans, we have the ability to learn a growing set of tasks over our lifetime. Continual robot learning is an emerging research direction with the goal of endowing robots with this ability. In order to learn new tasks over time, the robot first needs to infer the task at hand. Task inference, however, has received little attention in the multi-task learning literature. In this paper, we propose a novel approach to continual learning of robotic control tasks. Our approach performs unsupervised learning of behavior embeddings by incrementally self-organizing demonstrated behaviors. Task inference is performed by finding the nearest behavior embedding to a demonstrated behavior, which is used together with the environment state as input to a multi-task policy trained with reinforcement learning to optimize performance over tasks. Unlike previous approaches, our approach makes no assumptions about the task distribution and requires no task exploration to infer tasks. We evaluate our approach in experiments with concurrently and sequentially presented tasks and show that it outperforms other multi-task learning approaches in terms of generalization performance and convergence speed, particularly in the continual learning setting.
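A hedged sketch of the nearest-embedding task inference step: a demonstrated behavior is encoded and matched to the closest stored behavior embedding, which then conditions the multi-task policy together with the environment state. The encoder, the policy, and the concatenated conditioning shown in the usage comment are illustrative assumptions rather than the paper's exact interfaces.

    import numpy as np

    def infer_task(demo_embedding, task_embeddings):
        """Return the index of the stored behavior embedding closest to
        the embedding of the demonstrated behavior."""
        dists = np.linalg.norm(task_embeddings - demo_embedding, axis=1)
        return int(np.argmin(dists))

    # Usage (hypothetical encoder and policy):
    # task_id = infer_task(encode(demonstration), stored_embeddings)
    # action = policy(np.concatenate([state, stored_embeddings[task_id]]))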