بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Memory-Augmented Reinforcement Learning for Image-Goal Navigation

343 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Lina Mezghani

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Lina Mezghani - Sainbayar Sukhbaatar - Thibaut Lavril

الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In this work, we present a memory-augmented approach for image-goal navigation. Earlier attempts, including RL-based and SLAM-based approaches have either shown poor generalization performance, or are heavily-reliant on pose/depth sensors. Our method uses an attention-based end-to-end model that leverages an episodic memory to learn to navigate. First, we train a state-embedding network in a self-supervised fashion, and then use it to embed previously-visited states into the agents memory. Our navigation policy takes advantage of this information through an attention mechanism. We validate our approach with extensive evaluations, and show that our model establishes a new state of the art on the challenging Gibson dataset. Furthermore, we achieve this impressive performance from RGB input alone, without access to additional information such as position or depth, in stark contrast to related work.

قيم البحث

99 - Georgios Georgakis , Bernadette Bucher , Karl Schmeckpeper 2021

We consider the problem of object goal navigation in unseen environments. In our view, solving this problem requires learning of contextual semantic priors, a challenging endeavour given the spatial and semantic variability of indoor environments. Cu rrent methods learn to implicitly encode these priors through goal-oriented navigation policy functions operating on spatial representations that are limited to the agents observable areas. In this work, we propose a novel framework that actively learns to generate semantic maps outside the field of view of the agent and leverages the uncertainty over the semantic classes in the unobserved areas to decide on long term goals. We demonstrate that through this spatial prediction strategy, we are able to learn semantic priors in scenes that can be leveraged in unknown environments. Additionally, we show how different objectives can be defined by balancing exploration with exploitation during searching for semantic targets. Our method is validated in the visually realistic environments offered by the Matterport3D dataset and show state of the art results on the object goal navigation task.

الرؤية الحاسوبية وتمييز الأنماط علم الروبوتات

Object Goal Navigation using Goal-Oriented Semantic Exploration

100 - Devendra Singh Chaplot , Dhiraj Gandhi , Abhinav Gupta 2020

This work studies the problem of object goal navigation which involves navigating to an instance of the given object category in unseen environments. End-to-end learning-based navigation methods struggle at this task as they are ineffective at explor ation and long-term planning. We propose a modular system called, `Goal-Oriented Semantic Exploration which builds an episodic semantic map and uses it to explore the environment efficiently based on the goal object category. Empirical results in visually realistic simulation environments show that the proposed model outperforms a wide range of baselines including end-to-end learning-based methods as well as modular map-based methods and led to the winning entry of the CVPR-2020 Habitat ObjectNav Challenge. Ablation analysis indicates that the proposed model learns semantic priors of the relative arrangement of objects in a scene, and uses them to explore efficiently. Domain-agnostic module design allow us to transfer our model to a mobile robot platform and achieve similar performance for object goal navigation in the real-world.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي علم الروبوتات

Texture Memory-Augmented Deep Patch-Based Image Inpainting

80 - Rui Xu , Minghao Guo , Jiaqi Wang 2020

Patch-based methods and deep networks have been employed to tackle image inpainting problem, with their own strengths and weaknesses. Patch-based methods are capable of restoring a missing region with high-quality texture through searching nearest ne ighbor patches from the unmasked regions. However, these methods bring problematic contents when recovering large missing regions. Deep networks, on the other hand, show promising results in completing large regions. Nonetheless, the results often lack faithful and sharp details that resemble the surrounding area. By bringing together the best of both paradigms, we propose a new deep inpainting framework where texture generation is guided by a texture memory of patch samples extracted from unmasked regions. The framework has a novel design that allows texture memory retrieval to be trained end-to-end with the deep inpainting network. In addition, we introduce a patch distribution loss to encourage high-quality patch synthesis. The proposed method shows superior performance both qualitatively and quantitatively on three challenging image benchmarks, i.e., Places, CelebA-HQ, and Paris Street-View datasets.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

Hypothesis-driven Stream Learning with Augmented Memory

81 - Mengmi Zhang , Rohil Badkundri , Morgan B. Talbot 2021

Stream learning refers to the ability to acquire and transfer knowledge across a continuous stream of data without forgetting and without repeated passes over the data. A common way to avoid catastrophic forgetting is to intersperse new examples with replays of old examples stored as image pixels or reproduced by generative models. Here, we considered stream learning in image classification tasks and proposed a novel hypotheses-driven Augmented Memory Network, which efficiently consolidates previous knowledge with a limited number of hypotheses in the augmented memory and replays relevant hypotheses to avoid catastrophic forgetting. The advantages of hypothesis-driven replay over image pixel replay and generative replay are two-fold. First, hypothesis-based knowledge consolidation avoids redundant information in the image pixel space and makes memory usage more efficient. Second, hypotheses in the augmented memory can be re-used for learning new tasks, improving generalization and transfer learning ability. We evaluated our method on three stream learning object recognition datasets. Our method performs comparably well or better than SOTA methods, while offering more efficient memory usage. All source code and data are publicly available https://github.com/kreimanlab/AugMem.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي

Hierarchical Image-Goal Navigation in Real Crowded Scenarios

132 - Qiaoyun Wu , Jun Wang , Jing Liang 2021

This work studies the problem of image-goal navigation, which entails guiding robots with noisy sensors and controls through real crowded environments. Recent fruitful approaches rely on deep reinforcement learning and learn navigation policies in si mulation environments that are much simpler in complexity than real environments. Directly transferring these trained policies to real environments can be extremely challenging or even dangerous. We tackle this problem with a hierarchical navigation method composed of four decoupled modules. The first module maintains an obstacle map during robot navigation. The second one predicts a long-term goal on the real-time map periodically. The third one plans collision-free command sets for navigating to long-term goals, while the final module stops the robot properly near the goal image. The four modules are developed separately to suit the image-goal navigation in real crowded scenarios. In addition, the hierarchical decomposition decouples the learning of navigation goal planning, collision avoidance and navigation ending prediction, which cuts down the search space during navigation training and helps improve the generalization to previously unseen real scenes. We evaluate the method in both a simulator and the real world with a mobile robot. The results show that our method outperforms several navigation baselines and can successfully achieve navigation tasks in these scenarios.

علم الروبوتات

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

الجامعة العربية الدولية الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Memory-Augmented Reinforcement Learning for Image-Goal Navigation

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً