ﻻ يوجد ملخص باللغة العربية
In this paper, we present an Intersection-over-Union (IoU) guided two-stage 3D object detector with a voxel-to-point decoder. To preserve the necessary information from all raw points and maintain the high box recall in voxel based Region Proposal Network (RPN), we propose a residual voxel-to-point decoder to extract the point features in addition to the map-view features from the voxel based RPN. We use a 3D Region of Interest (RoI) alignment to crop and align the features with the proposal boxes for accurately perceiving the object position. The RoI-Aligned features are finally aggregated with the corner geometry embeddings that can provide the potentially missing corner information in the box refinement stage. We propose a simple and efficient method to align the estimated IoUs to the refined proposal boxes as a more relevant localization confidence. The comprehensive experiments on KITTI and Waymo Open Dataset demonstrate that our method achieves significant improvements with novel architectures against the existing methods. The code is available on Github URLfootnote{url{https://github.com/jialeli1/From-Voxel-to-Point}}.
Monocular 3D detection currently struggles with extremely lower detection rates compared to LiDAR-based methods. The poor accuracy is mainly caused by the absence of accurate location cues due to the ill-posed nature of monocular imagery. LiDAR point
Due to the sparsity and irregularity of the 3D data, approaches that directly process points have become popular. Among all point-based models, Transformer-based models have achieved state-of-the-art performance by fully preserving point interrelatio
3D object detection is receiving increasing attention from both industry and academia thanks to its wide applications in various fields. In this paper, we propose the Point-Voxel Region based Convolution Neural Networks (PV-RCNNs) for accurate 3D det
We present Point-Voxel CNN (PVCNN) for efficient, fast 3D deep learning. Previous work processes 3D data using either voxel-based or point-based NN models. However, both approaches are computationally inefficient. The computation cost and memory foot
Mesh reconstruction from a 3D point cloud is an important topic in the fields of computer graphic, computer vision, and multimedia analysis. In this paper, we propose a voxel structure-based mesh reconstruction framework. It provides the intrinsic me