We consider the problem of learning user preferences over robot trajectories in environments rich in objects and humans. This is challenging because the criteria defining a good trajectory vary with users, tasks, and interactions in the environment. We represent trajectory preferences using a cost function that the robot learns and then uses to generate good trajectories in new environments. We design a crowdsourcing system, PlanIt, in which non-expert users label segments of the robot's trajectory. PlanIt allows us to collect a large amount of user feedback, and from these weak and noisy labels we learn the parameters of our model. We test our approach on 122 different environments for robotic navigation and manipulation tasks. Our extensive experiments show that the learned cost function generates preferred trajectories in human environments. Our crowdsourcing system is publicly available for visualizing the learned costs and for providing preference feedback: http://planit.cs.cornell.edu
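PlanIt's actual model is more involved, but a minimal sketch of the general idea, fitting a linear cost over trajectory-segment features from weak binary good/bad labels, might look as follows. The feature function, the stand-in data, and the logistic-loss fitting procedure are illustrative assumptions, not the paper's method.

```python
import numpy as np

def fit_segment_cost(features, labels, lr=0.1, epochs=200):
    """Fit a linear cost over trajectory-segment features from
    binary labels (1 = bad segment) via logistic regression.
    Higher learned cost should correlate with 'bad' segments."""
    n, d = features.shape
    w = np.zeros(d)
    for _ in range(epochs):
        cost = features @ w                       # per-segment cost
        p_bad = 1.0 / (1.0 + np.exp(-cost))       # P(label = bad)
        grad = features.T @ (p_bad - labels) / n  # logistic gradient
        w -= lr * grad
    return w

# Hypothetical usage: imagine phi(segment) returns features such as
# distances to humans/objects along the segment; labels come from
# crowd workers. Here we use synthetic stand-ins for both.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))                     # stand-in features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)   # stand-in labels
w = fit_segment_cost(X, y)
print("learned cost weights:", np.round(w, 2))
```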
Model predictive control (MPC) is a popular control method that has proved effective in robotics, among other fields. MPC re-plans at every time step, and each re-planning step uses a limited horizon to satisfy computational and real-time constraints.
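As a concrete illustration of the receding-horizon loop (a generic sketch, not tied to any particular paper), the following Python example uses random-shooting MPC on a toy 1-D double integrator: at each time step the controller samples control sequences over a short horizon, scores them with an assumed quadratic cost, executes only the first control of the best sequence, and then re-plans.

```python
import numpy as np

def simulate(x, u, dt=0.1):
    """One step of a 1-D double integrator: state x = (pos, vel)."""
    pos, vel = x
    return np.array([pos + vel * dt, vel + u * dt])

def mpc_step(x, horizon=15, n_samples=256, rng=None):
    """Receding-horizon step via random shooting: sample control
    sequences, roll each out over the limited horizon, score them,
    and return only the first control of the best sequence."""
    if rng is None:
        rng = np.random.default_rng()
    seqs = rng.uniform(-1.0, 1.0, size=(n_samples, horizon))
    best_u, best_cost = 0.0, np.inf
    for seq in seqs:
        xi, cost = x.copy(), 0.0
        for u in seq:
            xi = simulate(xi, u)
            # assumed quadratic cost on position, velocity, effort
            cost += xi[0] ** 2 + 0.1 * xi[1] ** 2 + 0.01 * u ** 2
        if cost < best_cost:
            best_cost, best_u = cost, seq[0]
    return best_u

# Re-plan at every time step: execute one control, then plan again.
x = np.array([2.0, 0.0])
for t in range(50):
    u = mpc_step(x)
    x = simulate(x, u)
print("final state:", np.round(x, 3))  # should approach the origin
```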
We consider the problem of learning preferences over trajectories for mobile manipulators such as personal robots and assembly line robots. The preferences we learn are more intricate than simple geometric constraints on trajectories; they are rather
The sense of touch, being the earliest sensory system to develop in the human body [1], plays a critical part in our daily interactions with the environment. To complete a task successfully, many manipulation interactions require incorporating
Robots can learn the right reward function by querying a human expert. Existing approaches attempt to choose questions for which the robot is most uncertain about the human's response; however, they do not consider how easy it will be for the human to answer.
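For illustration, a common form of the uncertainty-driven querying this abstract refers to selects the pairwise comparison whose predicted answer is closest to a coin flip under samples of the reward weights. The sketch below is a generic version under a Bradley-Terry choice model; all names, the weight samples, and the candidate set are hypothetical.

```python
import numpy as np

def answer_prob(w_samples, phi_a, phi_b):
    """Posterior-averaged probability the human prefers trajectory A
    over B under a Bradley-Terry / logit choice model, where phi_*
    are trajectory feature vectors and w_samples are sampled
    reward weights."""
    diff = w_samples @ (phi_a - phi_b)   # reward gap per sample
    return np.mean(1.0 / (1.0 + np.exp(-diff)))

def most_uncertain_query(w_samples, candidates):
    """Pick the pairwise question whose predicted answer is closest
    to 0.5 (maximum uncertainty about the human's response); the
    abstract's point is that such questions can also be the hardest
    for the human to answer."""
    scores = [abs(answer_prob(w_samples, a, b) - 0.5)
              for a, b in candidates]
    return candidates[int(np.argmin(scores))]

rng = np.random.default_rng(1)
w_samples = rng.normal(size=(100, 4))   # sampled reward weights
candidates = [(rng.normal(size=4), rng.normal(size=4))
              for _ in range(20)]
phi_a, phi_b = most_uncertain_query(w_samples, candidates)
print("P(A preferred):", round(answer_prob(w_samples, phi_a, phi_b), 2))
```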
As language models become more powerful, training and evaluation are increasingly bottlenecked by the data and metrics used for a particular task. For example, summarization models are often trained to predict human reference summaries and evaluated