مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

AutoLay: Benchmarking amodal layout estimation for autonomous driving

129 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Kaustubh Mani

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Kaustubh Mani - N. Sai Shankar - Krishna Murthy Jatavallabhula

علم الروبوتات الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Given an image or a video captured from a monocular camera, amodal layout estimation is the task of predicting semantics and occupancy in birds eye view. The term amodal implies we also reason about entities in the scene that are occluded or truncated in image space. While several recent efforts have tackled this problem, there is a lack of standardization in task specification, datasets, and evaluation protocols. We address these gaps with AutoLay, a dataset and benchmark for amodal layout estimation from monocular images. AutoLay encompasses driving imagery from two popular datasets: KITTI and Argoverse. In addition to fine-grained attributes such as lanes, sidewalks, and vehicles, we also provide semantically annotated 3D point clouds. We implement several baselines and bleeding edge approaches, and release our data and code.

قيم البحث

376 - Xin Zheng , Jianke Zhu 2021

LiDAR odometry plays an important role in self-localization and mapping for autonomous navigation, which is usually treated as a scan registration problem. Although having achieved promising performance on KITTI odometry benchmark, the conventional s earching tree-based approach still has the difficulty in dealing with the large scale point cloud efficiently. The recent spherical range image-based method enjoys the merits of fast nearest neighbor search by spherical mapping. However, it is not very effective to deal with the ground points nearly parallel to LiDAR beams. To address these issues, we propose a novel efficient LiDAR odometry approach by taking advantage of both non-ground spherical range image and birds-eye-view map for ground points. Moreover, a range adaptive method is introduced to robustly estimate the local surface normal. Additionally, a very fast and memory-efficient model update scheme is proposed to fuse the points and their corresponding normals at different time-stamps. We have conducted extensive experiments on KITTI odometry benchmark, whose promising results demonstrate that our proposed approach is effective.

علم الروبوتات الرؤية الحاسوبية وتمييز الأنماط

Benchmarking Image Sensors Under Adverse Weather Conditions for Autonomous Driving

98 - Mario Bijelic , Tobias Gruber , Werner Ritter 2019

Adverse weather conditions are very challenging for autonomous driving because most of the state-of-the-art sensors stop working reliably under these conditions. In order to develop robust sensors and algorithms, tests with current sensors in defined weather conditions are crucial for determining the impact of bad weather for each sensor. This work describes a testing and evaluation methodology that helps to benchmark novel sensor technologies and compare them to state-of-the-art sensors. As an example, gated imaging is compared to standard imaging under foggy conditions. It is shown that gated imaging outperforms state-of-the-art standard passive imaging due to time-synchronized active illumination.

معالجة الصور والفيديو الرؤية الحاسوبية وتمييز الأنماط

GAMMA: A General Agent Motion Prediction Model for Autonomous Driving

101 - Yuanfu Luo , Panpan Cai , David Hsu 2019

Autonomous driving in mixed traffic requires reliable motion prediction of nearby traffic agents such as pedestrians, bicycles, cars, buses, etc.. This prediction problem is extremely challenging because of the diverse dynamics and geometry of traffi c agents, complex road conditions, and intensive interactions among the agents. In this paper, we proposed GAMMA, a general agent motion prediction model for autonomous driving, that can predict the motion of heterogeneous traffic agents with different kinematics, geometry, human agents inner states, etc.. GAMMA formalizes motion prediction as geometric optimization in the velocity space, and integrates physical constraints and human inner states into this unified framework. Our results show that GAMMA outperforms state-of-the-art approaches significantly on diverse real-world datasets.

علم الروبوتات الرؤية الحاسوبية وتمييز الأنماط

End-to-End Interactive Prediction and Planning with Optical Flow Distillation for Autonomous Driving

111 - Hengli Wang , Peide Cai , Rui Fan 2021

With the recent advancement of deep learning technology, data-driven approaches for autonomous car prediction and planning have achieved extraordinary performance. Nevertheless, most of these approaches follow a non-interactive prediction and plannin g paradigm, hypothesizing that a vehicles behaviors do not affect others. The approaches based on such a non-interactive philosophy typically perform acceptably in sparse traffic scenarios but can easily fail in dense traffic scenarios. Therefore, we propose an end-to-end interactive neural motion planner (INMP) for autonomous driving in this paper. Given a set of past surrounding-view images and a high definition map, our INMP first generates a feature map in birds-eye-view space, which is then processed to detect other agents and perform interactive prediction and planning jointly. Also, we adopt an optical flow distillation paradigm, which can effectively improve the network performance while still maintaining its real-time inference speed. Extensive experiments on the nuScenes dataset and in the closed-loop Carla simulation environment demonstrate the effectiveness and efficiency of our INMP for the detection, prediction, and planning tasks. Our project page is at sites.google.com/view/inmp-ofd.

علم الروبوتات الرؤية الحاسوبية وتمييز الأنماط

Amodal 3D Reconstruction for Robotic Manipulation via Stability and Connectivity

118 - William Agnew , Christopher Xie , Aaron Walsman 2020

Learning-based 3D object reconstruction enables single- or few-shot estimation of 3D object models. For robotics, this holds the potential to allow model-based methods to rapidly adapt to novel objects and scenes. Existing 3D reconstruction technique s optimize for visual reconstruction fidelity, typically measured by chamfer distance or voxel IOU. We find that when applied to realistic, cluttered robotics environments, these systems produce reconstructions with low physical realism, resulting in poor task performance when used for model-based control. We propose ARM, an amodal 3D reconstruction system that introduces (1) a stability prior over object shapes, (2) a connectivity prior, and (3) a multi-channel input representation that allows for reasoning over relationships between groups of objects. By using these priors over the physical properties of objects, our system improves reconstruction quality not just by standard visual metrics, but also performance of model-based control on a variety of robotics manipulation tasks in challenging, cluttered environments. Code is available at github.com/wagnew3/ARM.

علم الروبوتات الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

المعهد الوطني الجزائري للبحث الزراعي

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

AutoLay: Benchmarking amodal layout estimation for autonomous driving

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً