
AutoLay: Benchmarking amodal layout estimation for autonomous driving

Added by Kaustubh Mani
Publication date: 2021
Language: English





Given an image or a video captured from a monocular camera, amodal layout estimation is the task of predicting semantics and occupancy in bird's-eye view. The term amodal implies that we also reason about entities in the scene that are occluded or truncated in image space. While several recent efforts have tackled this problem, there is a lack of standardization in task specification, datasets, and evaluation protocols. We address these gaps with AutoLay, a dataset and benchmark for amodal layout estimation from monocular images. AutoLay encompasses driving imagery from two popular datasets: KITTI and Argoverse. In addition to fine-grained attributes such as lanes, sidewalks, and vehicles, we also provide semantically annotated 3D point clouds. We implement several baselines and bleeding-edge approaches, and release our data and code.
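Benchmarks of this kind are typically scored with per-class intersection-over-union (IoU) on the bird's-eye-view grid. Below is a minimal sketch of such a metric; the grid shape and class set are illustrative assumptions, not AutoLay's exact protocol.

```python
import numpy as np

def bev_iou(pred, gt, num_classes):
    """Per-class IoU between predicted and ground-truth BEV semantic grids.

    pred, gt: integer arrays of shape (H, W) holding one class id per cell.
    Returns {class id: IoU}, with NaN for classes absent from both grids.
    """
    ious = {}
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        ious[c] = np.logical_and(p, g).sum() / union if union else float("nan")
    return ious

# Toy usage on a 128x128 grid with classes {0: empty, 1: road, 2: vehicle}
rng = np.random.default_rng(0)
pred = rng.integers(0, 3, (128, 128))
gt = rng.integers(0, 3, (128, 128))
print(bev_iou(pred, gt, num_classes=3))
```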




Related research

Xin Zheng, Jianke Zhu (2021)
LiDAR odometry plays an important role in self-localization and mapping for autonomous navigation, and is usually treated as a scan registration problem. Although it has achieved promising performance on the KITTI odometry benchmark, the conventional search-tree-based approach still has difficulty handling large-scale point clouds efficiently. The recent spherical range-image-based method enjoys the merits of fast nearest-neighbor search by spherical mapping. However, it is not very effective at handling ground points that lie nearly parallel to the LiDAR beams. To address these issues, we propose a novel, efficient LiDAR odometry approach that takes advantage of both a non-ground spherical range image and a bird's-eye-view map for ground points. Moreover, a range-adaptive method is introduced to robustly estimate the local surface normal. Additionally, a very fast and memory-efficient model update scheme is proposed to fuse points and their corresponding normals at different timestamps. Extensive experiments on the KITTI odometry benchmark demonstrate that our proposed approach is effective.
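To make the spherical range-image idea concrete, here is a rough sketch of projecting a point cloud onto a range image; the image size and vertical field of view are illustrative assumptions (roughly matching KITTI's HDL-64E sensor), not the authors' exact settings.

```python
import numpy as np

def to_range_image(points, h=64, w=1024, fov_up=2.0, fov_down=-24.8):
    """Project an (N, 3) point cloud onto an (h, w) spherical range image."""
    r = np.linalg.norm(points, axis=1)                     # range per point
    yaw = np.arctan2(points[:, 1], points[:, 0])           # azimuth in [-pi, pi]
    pitch = np.arcsin(points[:, 2] / np.maximum(r, 1e-8))  # elevation angle
    fov = np.radians(fov_up - fov_down)
    col = (0.5 * (1.0 - yaw / np.pi) * w).astype(int) % w
    row = ((1.0 - (pitch - np.radians(fov_down)) / fov) * h).astype(int)
    row = np.clip(row, 0, h - 1)
    img = np.full((h, w), np.inf)
    np.minimum.at(img, (row, col), r)  # keep the nearest return per cell
    return img
```

Ground points map to grazing elevation angles and crowd into the bottom rows of such an image, which is exactly why the authors handle them separately in a bird's-eye-view map.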
Adverse weather conditions are very challenging for autonomous driving because most state-of-the-art sensors stop working reliably under them. To develop robust sensors and algorithms, tests with current sensors under defined weather conditions are crucial for determining the impact of bad weather on each sensor. This work describes a testing and evaluation methodology that helps benchmark novel sensor technologies and compare them to state-of-the-art sensors. As an example, gated imaging is compared to standard imaging under foggy conditions, and it is shown that gated imaging outperforms state-of-the-art passive imaging thanks to its time-synchronized active illumination.
Autonomous driving in mixed traffic requires reliable motion prediction of nearby traffic agents such as pedestrians, bicycles, cars, and buses. This prediction problem is extremely challenging because of the diverse dynamics and geometry of traffic agents, complex road conditions, and intensive interactions among the agents. In this paper, we propose GAMMA, a general agent motion prediction model for autonomous driving that can predict the motion of heterogeneous traffic agents with different kinematics, geometry, and human inner states. GAMMA formalizes motion prediction as geometric optimization in velocity space, and integrates physical constraints and human inner states into this unified framework. Our results show that GAMMA significantly outperforms state-of-the-art approaches on diverse real-world datasets.
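As a rough intuition for "geometric optimization in velocity space", one can pick, for each agent, the feasible velocity closest to its preferred one. The sampling scheme and collision test below are deliberate simplifications for illustration, not GAMMA's actual formulation.

```python
import numpy as np

def predict_velocity(pref_vel, pos, others, radius=0.5, horizon=1.0, n=500):
    """Return the sampled 2D velocity nearest to pref_vel that keeps the
    agent at least 2*radius away from every other agent after `horizon`
    seconds; `others` is a list of (position, velocity) pairs."""
    rng = np.random.default_rng(0)
    best, best_cost = pref_vel, np.inf
    for _ in range(n):
        v = pref_vel + rng.uniform(-1.0, 1.0, 2)   # candidate velocity
        collision_free = all(
            np.linalg.norm((pos + v * horizon) - (p + u * horizon)) > 2 * radius
            for p, u in others
        )
        if collision_free:
            cost = np.linalg.norm(v - pref_vel)    # stay near the preference
            if cost < best_cost:
                best, best_cost = v, cost
    return best
```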
Hengli Wang, Peide Cai, Rui Fan (2021)
With the recent advancement of deep learning technology, data-driven approaches for autonomous-vehicle prediction and planning have achieved extraordinary performance. Nevertheless, most of these approaches follow a non-interactive prediction and planning paradigm, hypothesizing that a vehicle's behavior does not affect others. Approaches based on such a non-interactive philosophy typically perform acceptably in sparse traffic scenarios but can easily fail in dense ones. Therefore, we propose an end-to-end interactive neural motion planner (INMP) for autonomous driving. Given a set of past surround-view images and a high-definition map, our INMP first generates a feature map in bird's-eye-view space, which is then processed to detect other agents and perform interactive prediction and planning jointly. We also adopt an optical flow distillation paradigm, which effectively improves network performance while maintaining real-time inference speed. Extensive experiments on the nuScenes dataset and in the closed-loop CARLA simulation environment demonstrate the effectiveness and efficiency of our INMP on the detection, prediction, and planning tasks. Our project page is at sites.google.com/view/inmp-ofd.
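The optical flow distillation paradigm generally means training a fast student network to mimic a slower, more accurate flow teacher alongside the main task objective. A hedged PyTorch sketch, with the endpoint-error form and loss weight as illustrative assumptions:

```python
import torch

def distillation_loss(student_flow, teacher_flow, task_loss, alpha=0.5):
    """Combine the planner's task loss with an endpoint-error term pulling
    the student's flow toward the frozen teacher's flow.

    student_flow, teacher_flow: (B, 2, H, W) optical flow fields.
    """
    epe = torch.norm(student_flow - teacher_flow.detach(), dim=1).mean()
    return task_loss + alpha * epe
```

At inference time only the student runs, which is how the speed benefit is retained.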
Learning-based 3D object reconstruction enables single- or few-shot estimation of 3D object models. For robotics, this holds the potential to allow model-based methods to rapidly adapt to novel objects and scenes. Existing 3D reconstruction techniques optimize for visual reconstruction fidelity, typically measured by chamfer distance or voxel IoU. We find that, when applied to realistic, cluttered robotics environments, these systems produce reconstructions with low physical realism, resulting in poor task performance when used for model-based control. We propose ARM, an amodal 3D reconstruction system that introduces (1) a stability prior over object shapes, (2) a connectivity prior, and (3) a multi-channel input representation that allows reasoning over relationships between groups of objects. By using these priors over the physical properties of objects, our system improves reconstruction quality not just on standard visual metrics, but also in the performance of model-based control on a variety of robotic manipulation tasks in challenging, cluttered environments. Code is available at github.com/wagnew3/ARM.
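One simple way to see what a connectivity prior buys is to post-process a predicted binary voxel grid so that only its largest connected component survives. ARM's priors are built into the reconstruction itself, so the scipy-based snippet below is only an illustration of the underlying idea.

```python
import numpy as np
from scipy import ndimage

def largest_component(voxels):
    """Keep only the largest 26-connected component of a binary voxel grid."""
    labels, n = ndimage.label(voxels, structure=np.ones((3, 3, 3)))
    if n == 0:
        return voxels
    sizes = ndimage.sum(voxels, labels, index=range(1, n + 1))
    return labels == (np.argmax(sizes) + 1)
```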
