No Arabic abstract
Learning new representations of 3D point clouds is an active research area in 3D vision, as the order-invariant point cloud structure still presents challenges to the design of neural network architectures. Recent works explored learning either global or local features or both for point clouds, however none of the earlier methods focused on capturing contextual shape information by analysing local orientation distribution of points. In this paper, we leverage on point orientation distributions around a point in order to obtain an expressive local neighborhood representation for point clouds. We achieve this by dividing the spherical neighborhood of a given point into predefined cone volumes, and statistics inside each volume are used as point features. In this way, a local patch can be represented by not only the selected points nearest neighbors, but also considering a point density distribution defined along multiple orientations around the point. We are then able to construct an orientation distribution function (ODF) neural network that involves an ODFBlock which relies on mlp (multi-layer perceptron) layers. The new ODFNet model achieves state-of the-art accuracy for object classification on ModelNet40 and ScanObjectNN datasets, and segmentation on ShapeNet S3DIS datasets.
LiDAR point clouds contain measurements of complicated natural scenes and can be used to update digital elevation models, glacial monitoring, detecting faults and measuring uplift detecting, forest inventory, detect shoreline and beach volume changes, landslide risk analysis, habitat mapping, and urban development, among others. A very important application is the classification of the 3D cloud into elementary classes. For example, it can be used to differentiate between vegetation, man-made structures, and water. Our goal is to present a preliminary comparison study for the classification of 3D point cloud LiDAR data that includes several types of feature engineering. In particular, we demonstrate that providing context by augmenting each point in the LiDAR point cloud with information about its neighboring points can improve the performance of downstream learning algorithms. We also experiment with several dimension reduction strategies, ranging from Principal Component Analysis (PCA) to neural network-based auto-encoders, and demonstrate how they affect classification performance in LiDAR point clouds. For instance, we observe that combining feature engineering with a dimension reduction a method such as PCA, there is an improvement in the accuracy of the classification with respect to doing a straightforward classification with the raw data.
Detecting objects in 3D LiDAR data is a core technology for autonomous driving and other robotics applications. Although LiDAR data is acquired over time, most of the 3D object detection algorithms propose object bounding boxes independently for each frame and neglect the useful information available in the temporal domain. To address this problem, in this paper we propose a sparse LSTM-based multi-frame 3d object detection algorithm. We use a U-Net style 3D sparse convolution network to extract features for each frames LiDAR point-cloud. These features are fed to the LSTM module together with the hidden and memory features from last frame to predict the 3d objects in the current frame as well as hidden and memory features that are passed to the next frame. Experiments on the Waymo Open Dataset show that our algorithm outperforms the traditional frame by frame approach by 7.5%
[email protected] and other multi-frame approaches by 1.2% while using less memory and computation per frame. To the best of our knowledge, this is the first work to use an LSTM for 3D object detection in sparse point clouds.
Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics. As a dominating technique in AI, deep learning has been successfully used to solve various 2D vision problems. However, deep learning on point clouds is still in its infancy due to the unique challenges faced by the processing of point clouds with deep neural networks. Recently, deep learning on point clouds has become even thriving, with numerous methods being proposed to address different problems in this area. To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds. It covers three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation. It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions.
Deep learning-based point cloud registration models are often generalized from extensive training over a large volume of data to learn the ability to predict the desired geometric transformation to register 3D point clouds. In this paper, we propose a meta-learning based 3D registration model, named 3D Meta-Registration, that is capable of rapidly adapting and well generalizing to new 3D registration tasks for unseen 3D point clouds. Our 3D Meta-Registration gains a competitive advantage by training over a variety of 3D registration tasks, which leads to an optimized model for the best performance on the distribution of registration tasks including potentially unseen tasks. Specifically, the proposed 3D Meta-Registration model consists of two modules: 3D registration learner and 3D registration meta-learner. During the training, the 3D registration learner is trained to complete a specific registration task aiming to determine the desired geometric transformation that aligns the source point cloud with the target one. In the meantime, the 3D registration meta-learner is trained to provide the optimal parameters to update the 3D registration learner based on the learned task distribution. After training, the 3D registration meta-learner, which is learned with the optimized coverage of distribution of 3D registration tasks, is able to dynamically update 3D registration learners with desired parameters to rapidly adapt to new registration tasks. We tested our model on synthesized dataset ModelNet and FlyingThings3D, as well as real-world dataset KITTI. Experimental results demonstrate that 3D Meta-Registration achieves superior performance over other previous techniques (e.g. FlowNet3D).
Detecting pedestrians is a crucial task in autonomous driving systems to ensure the safety of drivers and pedestrians. The technologies involved in these algorithms must be precise and reliable, regardless of environment conditions. Relying solely on RGB cameras may not be enough to recognize road environments in situations where cameras cannot capture scenes properly. Some approaches aim to compensate for these limitations by combining RGB cameras with TOF sensors, such as LIDARs. However, there are few works that address this problem using exclusively the 3D geometric information provided by LIDARs. In this paper, we propose a PointNet++ based architecture to detect pedestrians in dense 3D point clouds. The aim is to explore the potential contribution of geometric information alone in pedestrian detection systems. We also present a semi-automatic labeling system that transfers pedestrian and non-pedestrian labels from RGB images onto the 3D domain. The fact that our datasets have RGB registered with point clouds enables label transferring by back projection from 2D bounding boxes to point clouds, with only a light manual supervision to validate results. We train PointNet++ with the geometry of the resulting 3D labelled clusters. The evaluation confirms the effectiveness of the proposed method, yielding precision and recall values around 98%.