Show, Attend and Interact: Perceivable Human-Robot Social Interaction through Neural Attention Q-Network

269 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ahmed Qureshi

تاريخ النشر 2017

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ahmed Hussain Qureshi - Yutaka Nakamura - Yuichiro Yoshikawa

علم الروبوتات الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

For a safe, natural and effective human-robot social interaction, it is essential to develop a system that allows a robot to demonstrate the perceivable responsive behaviors to complex human behaviors. We introduce the Multimodal Deep Attention Recurrent Q-Network using which the robot exhibits human-like social interaction skills after 14 days of interacting with people in an uncontrolled real world. Each and every day during the 14 days, the system gathered robot interaction experiences with people through a hit-and-trial method and then trained the MDARQN on these experiences using end-to-end reinforcement learning approach. The results of interaction based learning indicate that the robot has learned to respond to complex human behaviors in a perceivable and socially acceptable manner.

قيم البحث

383 - Tianmin Shu , M. S. Ryoo , Song-Chun Zhu 2016

In this paper, we present an approach for robot learning of social affordance from human activity videos. We consider the problem in the context of human-robot interaction: Our approach learns structural representations of human-human (and human-obje ct-human) interactions, describing how body-parts of each agent move with respect to each other and what spatial relations they should maintain to complete each sub-event (i.e., sub-goal). This enables the robot to infer its own movement in reaction to the human body motion, allowing it to naturally replicate such interactions. We introduce the representation of social affordance and propose a generative model for its weakly supervised learning from human demonstration videos. Our approach discovers critical steps (i.e., latent sub-events) in an interaction and the typical motion associated with them, learning what body-parts should be involved and how. The experimental results demonstrate that our Markov Chain Monte Carlo (MCMC) based learning algorithm automatically discovers semantically meaningful interactive affordance from RGB-D videos, which allows us to generate appropriate full body motion for an agent.

علم الروبوتات الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

Learning Social Affordance Grammar from Videos: Transferring Human Interactions to Human-Robot Interactions

267 - Tianmin Shu , Xiaofeng Gao , Michael S. Ryoo 2017

In this paper, we present a general framework for learning social affordance grammar as a spatiotemporal AND-OR graph (ST-AOG) from RGB-D videos of human interactions, and transfer the grammar to humanoids to enable a real-time motion inference for h uman-robot interaction (HRI). Based on Gibbs sampling, our weakly supervised grammar learning can automatically construct a hierarchical representation of an interaction with long-term joint sub-tasks of both agents and short term atomic actions of individual agents. Based on a new RGB-D video dataset with rich instances of human interactions, our experiments of Baxter simulation, human evaluation, and real Baxter test demonstrate that the model learned from limited training data successfully generates human-like behaviors in unseen scenarios and outperforms both baselines.

علم الروبوتات الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

Robot gains Social Intelligence through Multimodal Deep Reinforcement Learning

106 - Ahmed Hussain Qureshi , Yutaka Nakamura , Yuichiro Yoshikawa 2017

For robots to coexist with humans in a social world like ours, it is crucial that they possess human-like social interaction skills. Programming a robot to possess such skills is a challenging task. In this paper, we propose a Multimodal Deep Q-Netwo rk (MDQN) to enable a robot to learn human-like interaction skills through a trial and error method. This paper aims to develop a robot that gathers data during its interaction with a human and learns human interaction behaviour from the high-dimensional sensory information using end-to-end reinforcement learning. This paper demonstrates that the robot was able to learn basic interaction skills successfully, after 14 days of interacting with people.

علم الروبوتات الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction

143 - Mohit Shridhar , David Hsu 2018

This paper presents INGRESS, a robot system that follows human natural language instructions to pick and place everyday objects. The core issue here is the grounding of referring expressions: infer objects and their relationships from input images an d language expressions. INGRESS allows for unconstrained object categories and unconstrained language expressions. Further, it asks questions to disambiguate referring expressions interactively. To achieve these, we take the approach of grounding by generation and propose a two-stage neural network model for grounding. The first stage uses a neural network to generate visual descriptions of objects, compares them with the input language expression, and identifies a set of candidate objects. The second stage uses another neural network to examine all pairwise relations between the candidates and infers the most likely referred object. The same neural networks are used for both grounding and question generation for disambiguation. Experiments show that INGRESS outperformed a state-of-the-art method on the RefCOCO dataset and in robot experiments with humans.

علم الروبوتات الحساب واللغة الرؤية الحاسوبية وتمييز الأنماط

Formalizing and Guaranteeing* Human-Robot Interaction

93 - Hadas Kress-Gazit , Kerstin Eder , Guy Hoffman 2020

Robot capabilities are maturing across domains, from self-driving cars, to bipeds and drones. As a result, robots will soon no longer be confined to safety-controlled industrial settings; instead, they will directly interact with the general public. The growing field of Human-Robot Interaction (HRI) studies various aspects of this scenario - from social norms to joint action to human-robot teams and more. Researchers in HRI have made great strides in developing models, methods, and algorithms for robots acting with and around humans, but these computational HRI models and algorithms generally do not come with formal guarantees and constraints on their operation. To enable human-interactive robots to move from the lab to real-world deployments, we must address this gap. This article provides an overview of verification, validation and synthesis techniques used to create demonstrably trustworthy systems, describes several HRI domains that could benefit from such techniques, and provides a roadmap for the challenges and the research needed to create formalized and guaranteed human-robot interaction.

علم الروبوتات