Radar Camera Fusion via Representation Learning in Autonomous Driving

Published by Xu Dong

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Xu Dong, Binnan Zhuang, Yunxiang Mao

Computer Vision and Pattern Recognition, Artificial Intelligence

Abstract

Radars and cameras are mature, cost-effective, and robust sensors that are widely used in the perception stack of mass-produced autonomous driving systems. Because their properties are complementary, the outputs of radar detection (radar pins) and camera perception (2D bounding boxes) are usually fused to produce the best perception results. The key to successful radar-camera fusion is accurate data association. The challenges in radar-camera association stem from the complexity of driving scenes, the noisy and sparse nature of radar measurements, and the depth ambiguity of 2D bounding boxes. Traditional rule-based association methods are prone to performance degradation in challenging scenarios and to failure in corner cases. In this study, we propose to address radar-camera association via deep representation learning, in order to explore feature-level interaction and global reasoning. Additionally, we design a loss sampling mechanism and an innovative ordinal loss to overcome the difficulty of imperfect labeling and to enforce critical human-like reasoning. Despite being trained with noisy labels generated by a rule-based algorithm, our proposed method achieves an F1 score of 92.2%, which is 11.6% higher than that of the rule-based teacher. Moreover, this data-driven method lends itself to continuous improvement via corner case mining.
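As a rough illustration of what association over learned representations could look like, the sketch below assumes that every radar pin and every 2D bounding box has already been mapped to an embedding vector. It links each pin to its most similar box when the cosine similarity exceeds a threshold, then scores the predicted links against reference links with the F1 measure. This is not the authors' network, loss sampling mechanism, or ordinal loss; the function names, the threshold, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def associate(pin_embs, box_embs, threshold=0.5):
    """Link each radar pin to its most similar box, if similarity > threshold.

    Returns a set of (pin_index, box_index) pairs. The threshold is an
    illustrative assumption, not a value taken from the paper.
    """
    links = set()
    for i, pin in enumerate(pin_embs):
        best_j, best_sim = None, threshold
        for j, box in enumerate(box_embs):
            sim = cosine(pin, box)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            links.add((i, best_j))
    return links


def f1_score(predicted, reference):
    """F1 = harmonic mean of precision and recall over sets of links."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Toy embeddings: the third pin resembles neither box and stays unassociated.
pins = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
boxes = [[0.9, 0.1], [0.1, 0.9]]
predicted = associate(pins, boxes)
reference = {(0, 0), (1, 1)}
score = f1_score(predicted, reference)
```

A greedy per-pin match is used here only for brevity; it allows several pins to share one box, which fits the case of multiple radar returns from the same vehicle.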

Zhiqing Wei, Fengkai Zhang, Shuo Chang (2021)

With autonomous driving developing rapidly, accurate object detection in complex scenarios has attracted wide attention as a means of ensuring driving safety. Millimeter wave (mmWave) radar and vision fusion is a mainstream solution for accurate obstacle detection. This article presents a detailed survey of obstacle detection methods based on mmWave radar and vision fusion. First, we introduce the tasks, evaluation criteria, and datasets of object detection for autonomous driving. Then, the process of mmWave radar and vision fusion is divided into three parts: sensor deployment, sensor calibration, and sensor fusion, each of which is reviewed comprehensively. In particular, we classify the fusion methods into data-level, decision-level, and feature-level fusion methods. In addition, we introduce the fusion of lidar and vision in autonomous driving for obstacle detection, object classification, and road segmentation, which is a promising direction for the future. Finally, we summarize this article.

Computer Vision and Pattern Recognition

Action-Based Representation Learning for Autonomous Driving

Yi Xiao, Felipe Codevilla, Christopher Pal (2020)

Human drivers produce a vast amount of data which could, in principle, be used to improve autonomous driving systems. Unfortunately, seemingly straightforward approaches for creating end-to-end driving models that map sensor data directly into driving actions are problematic in terms of interpretability, and typically have significant difficulty dealing with spurious correlations. Alternatively, we propose to use this kind of action-based driving data for learning representations. Our experiments show that an affordance-based driving model pre-trained with this approach can leverage a relatively small amount of weakly annotated imagery and outperform pure end-to-end driving models, while being more interpretable. Further, we demonstrate how this strategy outperforms previous methods based on learning inverse dynamics models as well as other methods based on heavy human supervision (ImageNet).

Computer Vision and Pattern Recognition, Machine Learning, Robotics

Full-Velocity Radar Returns by Radar-Camera Fusion

Yunfei Long, Daniel Morris, Xiaoming Liu (2021)

A distinctive feature of Doppler radar is the measurement of velocity in the radial direction for radar points. However, the missing tangential velocity component hampers object velocity estimation as well as temporal integration of radar sweeps in dynamic scenes. Recognizing that fusing camera with radar provides complementary information to radar, in this paper we present a closed-form solution for the point-wise, full-velocity estimate of Doppler returns using the corresponding optical flow from camera images. Additionally, we address the association problem between radar returns and camera images with a neural network that is trained to estimate radar-camera correspondences. Experimental results on the nuScenes dataset verify the validity of the method and show significant improvements over the state-of-the-art in velocity estimation and accumulation of radar points.
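The geometry behind the first two sentences can be sketched in a few lines: a Doppler radar observes only the projection of a target's velocity onto the line of sight, so the component perpendicular to that line is lost. The example below is a 2D illustration with the sensor at the origin; it is not the paper's closed-form solution and does not use optical flow. The function name and the sample values are hypothetical.

```python
import math


def decompose_velocity(position, velocity):
    """Split a 2D velocity into radial and tangential parts.

    The sensor is assumed to sit at the origin. The radial speed is the
    signed projection of the velocity onto the line of sight, which is the
    part a Doppler measurement captures. The tangential vector is what
    remains and is not observable from Doppler alone.
    """
    px, py = position
    vx, vy = velocity
    rng = math.hypot(px, py)
    if rng == 0.0:
        raise ValueError("target coincides with the sensor; direction undefined")
    ux, uy = px / rng, py / rng              # unit line-of-sight vector
    radial_speed = vx * ux + vy * uy         # dot product with line of sight
    tangential = (vx - radial_speed * ux, vy - radial_speed * uy)
    return radial_speed, tangential


# Target 10 m straight ahead on the x axis, moving at (3, 4) m/s.
radial_speed, tangential = decompose_velocity((10.0, 0.0), (3.0, 4.0))
```

In this example the radar would report a radial speed of 3 m/s, while the 4 m/s sideways motion goes unmeasured, which is why a second source of motion information is needed to recover the full velocity.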

Computer Vision and Pattern Recognition

Learning a Domain-Agnostic Visual Representation for Autonomous Driving via Contrastive Loss

Dongseok Shim, H. Jin Kim (2021)

Deep neural networks have been widely studied in autonomous driving applications such as semantic segmentation and depth estimation. However, training a neural network in a supervised manner requires a large amount of annotated labels, which are expensive and time-consuming to collect. Recent studies leverage synthetic data collected from a virtual environment, which is much easier to acquire and more accurately labeled than real-world data, but these approaches usually suffer from poor generalization due to the inherent domain shift problem. In this paper, we propose Domain-Agnostic Contrastive Learning (DACL), a two-stage unsupervised domain adaptation framework with cyclic adversarial training and contrastive loss. DACL leads the neural network to learn a domain-agnostic representation that overcomes the performance degradation caused by a difference between the training and test data distributions. Our proposed approach achieves better performance in the monocular depth estimation task than previous state-of-the-art methods and also shows effectiveness in the semantic segmentation task.

Computer Vision and Pattern Recognition, Machine Learning, Robotics

Joint Multi-Object Detection and Tracking with Camera-LiDAR Fusion for Autonomous Driving

Kemiao Huang, Qi Hao (2021)

Multi-object tracking (MOT) with camera-LiDAR fusion demands accurate results of object detection, affinity computation and data association in real time. This paper presents an efficient multi-modal MOT framework with online joint detection and tracking schemes and robust data association for autonomous driving applications. The novelty of this work includes: (1) development of an end-to-end deep neural network for joint object detection and correlation using 2D and 3D measurements; (2) development of a robust affinity computation module to compute occlusion-aware appearance and motion affinities in 3D space; (3) development of a comprehensive data association module for joint optimization among detection confidences, affinities and start-end probabilities. The experimental results on the KITTI tracking benchmark demonstrate the superior performance of the proposed method in terms of both tracking accuracy and processing speed.

Computer Vision and Pattern Recognition