بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Vision System and Depth Processing for DRC-HUBO+

80 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Inwook Shim

تاريخ النشر 2015

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Inwook Shim - Seunghak Shin - Yunsu Bok

الرؤية الحاسوبية وتمييز الأنماط علم الروبوتات

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This paper presents a vision system and a depth processing algorithm for DRC-HUBO+, the winner of the DRC finals 2015. Our system is designed to reliably capture 3D information of a scene and objects robust to challenging environment conditions. We also propose a depth-map upsampling method that produces an outliers-free depth map by explicitly handling depth outliers. Our system is suitable for an interactive robot with real-world that requires accurate object detection and pose estimation. We evaluate our depth processing algorithm over state-of-the-art algorithms on several synthetic and real-world datasets.

قيم البحث

140 - Jawad N. Yasin , Sherif A.S. Mohamed , Mohammad-hashem Haghbayan 2020

Moving towards autonomy, unmanned vehicles rely heavily on state-of-the-art collision avoidance systems (CAS). However, the detection of obstacles especially during night-time is still a challenging task since the lighting conditions are not sufficie nt for traditional cameras to function properly. Therefore, we exploit the powerful attributes of event-based cameras to perform obstacle detection in low lighting conditions. Event cameras trigger events asynchronously at high output temporal rate with high dynamic range of up to 120 $dB$. The algorithm filters background activity noise and extracts objects using robust Hough transform technique. The depth of each detected object is computed by triangulating 2D features extracted utilising LC-Harris. Finally, asynchronous adaptive collision avoidance (AACA) algorithm is applied for effective avoidance. Qualitative evaluation is compared using event-camera and traditional camera.

الرؤية الحاسوبية وتمييز الأنماط علم الروبوتات

Self-Guided Instance-Aware Network for Depth Completion and Enhancement

85 - Zhongzhen Luo , Fengjia Zhang , Guoyi Fu 2021

Depth completion aims at inferring a dense depth image from sparse depth measurement since glossy, transparent or distant surface cannot be scanned properly by the sensor. Most of existing methods directly interpolate the missing depth measurements b ased on pixel-wise image content and the corresponding neighboring depth values. Consequently, this leads to blurred boundaries or inaccurate structure of object. To address these problems, we propose a novel self-guided instance-aware network (SG-IANet) that: (1) utilize self-guided mechanism to extract instance-level features that is needed for depth restoration, (2) exploit the geometric and context information into network learning to conform to the underlying constraints for edge clarity and structure consistency, (3) regularize the depth estimation and mitigate the impact of noise by instance-aware learning, and (4) train with synthetic data only by domain randomization to bridge the reality gap. Extensive experiments on synthetic and real world dataset demonstrate that our proposed method outperforms previous works. Further ablation studies give more insights into the proposed method and demonstrate the generalization capability of our model.

الرؤية الحاسوبية وتمييز الأنماط علم الروبوتات

3D Shape Reconstruction from Vision and Touch

199 - Edward J. Smith , Roberto Calandra , Adriana Romero 2020

When a toddler is presented a new toy, their instinctual behaviour is to pick it upand inspect it with their hand and eyes in tandem, clearly searching over its surface to properly understand what they are playing with. At any instance here, touch pr ovides high fidelity localized information while vision provides complementary global context. However, in 3D shape reconstruction, the complementary fusion of visual and haptic modalities remains largely unexplored. In this paper, we study this problem and present an effective chart-based approach to multi-modal shape understanding which encourages a similar fusion vision and touch information.To do so, we introduce a dataset of simulated touch and vision signals from the interaction between a robotic hand and a large array of 3D objects. Our results show that (1) leveraging both vision and touch signals consistently improves single-modality baselines; (2) our approach outperforms alternative modality fusion methods and strongly benefits from the proposed chart-based structure; (3) there construction quality increases with the number of grasps provided; and (4) the touch information not only enhances the reconstruction at the touch site but also extrapolates to its local neighborhood.

الرؤية الحاسوبية وتمييز الأنماط علم الروبوتات

Bidirectional Attention Network for Monocular Depth Estimation

145 - Shubhra Aich , Jean Marie Uwabeza Vianney , Md Amirul Islam 2020

In this paper, we propose a Bidirectional Attention Network (BANet), an end-to-end framework for monocular depth estimation (MDE) that addresses the limitation of effectively integrating local and global information in convolutional neural networks. The structure of this mechanism derives from a strong conceptual foundation of neural machine translation, and presents a light-weight mechanism for adaptive control of computation similar to the dynamic nature of recurrent neural networks. We introduce bidirectional attention modules that utilize the feed-forward feature maps and incorporate the global context to filter out ambiguity. Extensive experiments reveal the high degree of capability of this bidirectional attention model over feed-forward baselines and other state-of-the-art methods for monocular depth estimation on two challenging datasets -- KITTI and DIODE. We show that our proposed approach either outperforms or performs at least on a par with the state-of-the-art monocular depth estimation methods with less memory and computational complexity.

الرؤية الحاسوبية وتمييز الأنماط علم الروبوتات

Visual Transformers: Token-based Image Representation and Processing for Computer Vision

254 - Bichen Wu , Chenfeng Xu , Xiaoliang Dai 2020

Computer vision has achieved remarkable success by (a) representing images as uniformly-arranged pixel arrays and (b) convolving highly-localized features. However, convolutions treat all image pixels equally regardless of importance; explicitly mode l all concepts across all images, regardless of content; and struggle to relate spatially-distant concepts. In this work, we challenge this paradigm by (a) representing images as semantic visual tokens and (b) running transformers to densely model token relationships. Critically, our Visual Transformer operates in a semantic token space, judiciously attending to different image parts based on context. This is in sharp contrast to pixel-space transformers that require orders-of-magnitude more compute. Using an advanced training recipe, our VTs significantly outperform their convolutional counterparts, raising ResNet accuracy on ImageNet top-1 by 4.6 to 7 points while using fewer FLOPs and parameters. For semantic segmentation on LIP and COCO-stuff, VT-based feature pyramid networks (FPN) achieve 0.35 points higher mIoU while reducing the FPN modules FLOPs by 6.5x.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة حلوان

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Vision System and Depth Processing for DRC-HUBO+

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً