
Towards Exploiting Geometry and Time for Fast Off-Distribution Adaptation in Multi-Task Robot Learning

Published by Yulun Zhang
Publication date: 2021
Research field: Informatics Engineering
Research language: English





We explore possible methods for multi-task transfer learning which seek to exploit the shared physical structure of robotics tasks. Specifically, we train policies for a base set of pre-training tasks, then experiment with adapting to new off-distribution tasks, using simple architectural approaches for re-using these policies as black-box priors. These approaches include learning an alignment of either the observation space or action space from a base to a target task to exploit rigid body structure, and methods for learning a time-domain switching policy across base tasks which solves the target task, to exploit temporal coherence. We find that combining low-complexity target policy classes, base policies as black-box priors, and simple optimization algorithms allows us to acquire new tasks outside the base task distribution, using small amounts of offline training data.
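The observation-space alignment idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 2-D toy setting in which the target task sees the base task's observations under an unknown planar rotation, and it uses a plain grid search as a stand-in for the "simple optimization algorithms" mentioned above. The names `rotation`, `base_policy`, `aligned_policy`, and `fit_alignment`, as well as the policy weights, are hypothetical and chosen only for illustration.

```python
import numpy as np


def rotation(theta):
    """2-D rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def base_policy(obs):
    """Stand-in for a pre-trained base policy, queried only as a black box.

    The weights below are hypothetical; the fitting code never inspects them.
    obs has shape (N, 2); the returned actions have shape (N, 2).
    """
    K = np.array([[1.0, -0.5], [0.3, 0.8]])
    return np.tanh(obs @ K.T)


def aligned_policy(phi, obs):
    """Target policy: rotate target observations by phi, then call the base policy."""
    return base_policy(obs @ rotation(phi).T)


def fit_alignment(obs, actions, candidates):
    """Pick the rotation angle that best reproduces the offline actions.

    Uses only forward queries of the black-box policy (no gradients).
    """
    losses = [np.mean((aligned_policy(phi, obs) - actions) ** 2)
              for phi in candidates]
    best = int(np.argmin(losses))
    return candidates[best], losses[best]


# Small offline dataset for the target task (assumed setup):
# target observations are base-frame observations rotated by an unknown offset.
rng = np.random.default_rng(0)
obs_base = rng.normal(size=(50, 2))
true_offset = np.deg2rad(40.0)
obs_target = obs_base @ rotation(true_offset).T
expert_actions = base_policy(obs_base)  # actions that solve the target task

# One-parameter (low-complexity) policy class: search over whole degrees.
candidates = np.deg2rad(np.arange(-180.0, 180.0))
phi_hat, loss = fit_alignment(obs_target, expert_actions, candidates)
print(f"recovered alignment: {np.rad2deg(phi_hat):.1f} deg, loss: {loss:.2e}")
```

In this toy setting the recovered angle undoes the rotation between the two observation frames, so the base policy can be reused unchanged. The single-parameter search reflects the low-complexity target policy class described above; a real task would likely need a richer alignment map and a different optimizer.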




Read also

Autonomous robots operating in large knowledge-intensive domains require planning in the discrete (task) space and the continuous (motion) space. In knowledge-intensive domains, on the one hand, robots have to reason at the highest level, for example about the regions to navigate to or objects to be picked up and their properties; on the other hand, the feasibility of the respective navigation tasks has to be checked at the controller execution level. Moreover, employing multiple robots offers enhanced performance capabilities over a single robot performing the same task. To this end, we present an integrated multi-robot task-motion planning framework for navigation in knowledge-intensive domains. In particular, we consider a distributed multi-robot setting incorporating mutual observations between the robots. The framework is intended for motion planning under motion and sensing uncertainty, which is formally known as belief space planning. The underlying methodology and its limitations are discussed, providing suggestions for improvements and future work. We validate key aspects of our approach in simulation.
The recently introduced Intelligent Trial and Error algorithm (IT&E) enables robots to creatively adapt to damage in a matter of minutes by combining an off-line evolutionary algorithm and an on-line learning algorithm based on Bayesian Optimization. We extend the IT&E algorithm to allow robots to learn to compensate for damage while executing their task(s). This leads to a semi-episodic learning scheme that increases the robot's lifetime autonomy and adaptivity. Preliminary experiments on a toy simulation and a 6-legged robot locomotion task show promising results.
We consider the problem of dynamically allocating tasks to multiple agents under time window constraints and task completion uncertainty. Our objective is to minimize the number of unsuccessful tasks at the end of the operation horizon. We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination and addresses them in a hierarchical manner. The lower layer computes policies for individual agents using dynamic programming with tree search, and the upper layer resolves conflicts in individual plans to obtain a valid multi-agent allocation. Our algorithm, Stochastic Conflict-Based Allocation (SCoBA), is optimal in expectation and complete under some reasonable assumptions. In practice, SCoBA is computationally efficient enough to interleave planning and execution online. On the metric of successful task completion, SCoBA consistently outperforms a number of baseline methods and shows strong competitive performance against an oracle with complete lookahead. It also scales well with the number of tasks and agents. We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
A general-purpose intelligent robot must be able to learn autonomously and be able to accomplish multiple tasks in order to be deployed in the real world. However, standard reinforcement learning approaches learn separate task-specific policies and assume the reward function for each task is known a priori. We propose a framework that learns event cues from off-policy data, and can flexibly combine these event cues at test time to accomplish different tasks. These event cue labels are not assumed to be known a priori, but are instead labeled using learned models, such as computer vision detectors, and then "backed up" in time using an action-conditioned predictive model. We show that a simulated robotic car and a real-world RC car can gather data and train fully autonomously without any human-provided labels beyond those needed to train the detectors, and then at test time be able to accomplish a variety of different tasks. Videos of the experiments and code can be found at https://github.com/gkahn13/CAPs
Terrain adaptation is an essential capability for a ground robot to effectively traverse unstructured off-road terrain in real-world field environments such as forests. However, the expected robot behaviors generated by terrain adaptation methods cannot always be executed accurately due to setbacks such as wheel slip and reduced tire pressure. To address this problem, we propose a novel approach for consistent behavior generation that enables the ground robot's actual behaviors to more accurately match expected behaviors while adapting to a variety of unstructured off-road terrain. Our approach learns offset behaviors that are used to compensate for the inconsistency between the actual and expected behaviors without requiring the explicit modeling of various setbacks. Our approach is also able to estimate the importance of the multi-modal features to improve terrain representations for better adaptation. In addition, we develop an algorithmic solver for our formulated regularized optimization problem, which is guaranteed to converge to the global optimal solution. To evaluate the method, we perform extensive experiments using various unstructured off-road terrain in real-world field environments. Experimental results have validated that our approach enables robots to traverse complex unstructured off-road terrain with more navigational behavior consistency, and it outperforms previous methods, particularly on challenging terrain.
