ترغب بنشر مسار تعليمي؟ اضغط هنا

Uncertainty-aware Self-supervised 3D Data Association

115   0   0.0 ( 0 )
 نشر من قبل Jianren Wang
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

3D object trackers usually require training on large amounts of annotated data that is expensive and time-consuming to collect. Instead, we propose leveraging vast unlabeled datasets by self-supervised metric learning of 3D object trackers, with a focus on data association. Large scale annotations for unlabeled data are cheaply obtained by automatic object detection and association across frames. We show how these self-supervised annotations can be used in a principled manner to learn point-cloud embeddings that are effective for 3D tracking. We estimate and incorporate uncertainty in self-supervised tracking to learn more robust embeddings, without needing any labeled data. We design embeddings to differentiate objects across frames, and learn them using uncertainty-aware self-supervised training. Finally, we demonstrate their ability to perform accurate data association across frames, towards effective and accurate 3D tracking. Project videos and code are at https://jianrenw.github.io/Self-Supervised-3D-Data-Association.



قيم البحث

اقرأ أيضاً

Training deep convolutional neural networks usually requires a large amount of labeled data. However, it is expensive and time-consuming to annotate data for medical image segmentation tasks. In this paper, we present a novel uncertainty-aware semi-s upervised framework for left atrium segmentation from 3D MR images. Our framework can effectively leverage the unlabeled data by encouraging consistent predictions of the same input under different perturbations. Concretely, the framework consists of a student model and a teacher model, and the student model learns from the teacher model by minimizing a segmentation loss and a consistency loss with respect to the targets of the teacher model. We design a novel uncertainty-aware scheme to enable the student model to gradually learn from the meaningful and reliable targets by exploiting the uncertainty information. Experiments show that our method achieves high performance gains by incorporating the unlabeled data. Our method outperforms the state-of-the-art semi-supervised methods, demonstrating the potential of our framework for the challenging semi-supervised problems.
While making a tremendous impact in various fields, deep neural networks usually require large amounts of labeled data for training which are expensive to collect in many applications, especially in the medical domain. Unlabeled data, on the other ha nd, is much more abundant. Semi-supervised learning techniques, such as co-training, could provide a powerful tool to leverage unlabeled data. In this paper, we propose a novel framework, uncertainty-aware multi-view co-training (UMCT), to address semi-supervised learning on 3D data, such as volumetric data from medical imaging. In our work, co-training is achieved by exploiting multi-viewpoint consistency of 3D data. We generate different views by rotating or permuting the 3D data and utilize asymmetrical 3D kernels to encourage diversified features in different sub-networks. In addition, we propose an uncertainty-weighted label fusion mechanism to estimate the reliability of each views prediction with Bayesian deep learning. As one view requires the supervision from other views in co-training, our self-adaptive approach computes a confidence score for the prediction of each unlabeled sample in order to assign a reliable pseudo label. Thus, our approach can take advantage of unlabeled data during training. We show the effectiveness of our proposed semi-supervised method on several public datasets from medical image segmentation tasks (NIH pancreas & LiTS liver tumor dataset). Meanwhile, a fully-supervised method based on our approach achieved state-of-the-art performances on both the LiTS liver tumor segmentation and the Medical Segmentation Decathlon (MSD) challenge, demonstrating the robustness and value of our framework, even when fully supervised training is feasible.
79 - Hyeonwoo Yu , Jean Oh 2021
In this paper, we propose a self-supervised learningmethod for multi-object pose estimation. 3D object under-standing from 2D image is a challenging task that infers ad-ditional dimension from reduced-dimensional information.In particular, the estima tion of the 3D localization or orien-tation of an object requires precise reasoning, unlike othersimple clustering tasks such as object classification. There-fore, the scale of the training dataset becomes more cru-cial. However, it is challenging to obtain large amount of3D dataset since achieving 3D annotation is expensive andtime-consuming. If the scale of the training dataset can beincreased by involving the image sequence obtained fromsimple navigation, it is possible to overcome the scale lim-itation of the dataset and to have efficient adaptation tothe new environment. However, when the self annotation isconducted on single image by the network itself, trainingperformance of the network is bounded to the self perfor-mance. Therefore, we propose a strategy to exploit multipleobservations of the object in the image sequence in orderto surpass the self-performance: first, the landmarks for theglobal object map are estimated through network predic-tion and data association, and the corrected annotation fora single frame is obtained. Then, network fine-tuning is con-ducted including the dataset obtained by self-annotation,thereby exceeding the performance boundary of the networkitself. The proposed method was evaluated on the KITTIdriving scene dataset, and we demonstrate the performanceimprovement in the pose estimation of multi-object in 3D space.
Contrastive learning methods have significantly narrowed the gap between supervised and unsupervised learning on computer vision tasks. In this paper, we explore their application to remote sensing, where unlabeled data is often abundant but labeled data is scarce. We first show that due to their different characteristics, a non-trivial gap persists between contrastive and supervised learning on standard benchmarks. To close the gap, we propose novel training methods that exploit the spatiotemporal structure of remote sensing data. We leverage spatially aligned images over time to construct temporal positive pairs in contrastive learning and geo-location to design pre-text tasks. Our experiments show that our proposed method closes the gap between contrastive and supervised learning on image classification, object detection and semantic segmentation for remote sensing and other geo-tagged image datasets.
Semi-supervised approaches for crowd counting attract attention, as the fully supervised paradigm is expensive and laborious due to its request for a large number of images of dense crowd scenarios and their annotations. This paper proposes a spatial uncertainty-aware semi-supervised approach via regularized surrogate task (binary segmentation) for crowd counting problems. Different from existing semi-supervised learning-based crowd counting methods, to exploit the unlabeled data, our proposed spatial uncertainty-aware teacher-student framework focuses on high confident regions information while addressing the noisy supervision from the unlabeled data in an end-to-end manner. Specifically, we estimate the spatial uncertainty maps from the teacher models surrogate task to guide the feature learning of the main task (density regression) and the surrogate task of the student model at the same time. Besides, we introduce a simple yet effective differential transformation layer to enforce the inherent spatial consistency regularization between the main task and the surrogate task in the student model, which helps the surrogate task to yield more reliable predictions and generates high-quality uncertainty maps. Thus, our model can also address the task-level perturbation problems that occur spatial inconsistency between the primary and surrogate tasks in the student model. Experimental results on four challenging crowd counting datasets demonstrate that our method achieves superior performance to the state-of-the-art semi-supervised methods.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا