SimROD: A Simple Adaptation Method for Robust Object Detection

102 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Rindra Ramamonjison

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Rindra Ramamonjison - Amin Banitalebi-Dehkordi - Xinyu Kang

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This paper presents a Simple and effective unsupervised adaptation method for Robust Object Detection (SimROD). To overcome the challenging issues of domain shift and pseudo-label noise, our method integrates a novel domain-centric augmentation method, a gradual self-labeling adaptation procedure, and a teacher-guided fine-tuning mechanism. Using our method, target domain samples can be leveraged to adapt object detection models without changing the model architecture or generating synthetic data. When applied to image corruptions and high-level cross-domain adaptation benchmarks, our method outperforms prior baselines on multiple domain adaptation benchmarks. SimROD achieves new state-of-the-art on standard real-to-synthetic and cross-camera setup benchmarks. On the image corruption benchmark, models adapted with our method achieved a relative robustness improvement of 15-25% AP50 on Pascal-C and 5-6% AP on COCO-C and Cityscapes-C. On the cross-domain benchmark, our method outperformed the best baseline performance by up to 8% AP50 on Comic dataset and up to 4% on Watercolor dataset.

قيم البحث

308 - Xingxu Yao , Sicheng Zhao , Pengfei Xu 2021

To reduce annotation labor associated with object detection, an increasing number of studies focus on transferring the learned knowledge from a labeled source domain to another unlabeled target domain. However, existing methods assume that the labele d data are sampled from a single source domain, which ignores a more generalized scenario, where labeled data are from multiple source domains. For the more challenging task, we propose a unified Faster R-CNN based framework, termed Divide-and-Merge Spindle Network (DMSN), which can simultaneously enhance domain invariance and preserve discriminative power. Specifically, the framework contains multiple source subnets and a pseudo target subnet. First, we propose a hierarchical feature alignment strategy to conduct strong and weak alignments for low- and high-level features, respectively, considering their different effects for object detection. Second, we develop a novel pseudo subnet learning algorithm to approximate optimal parameters of pseudo target subset by weighted combination of parameters in different source subnets. Finally, a consistency regularization for region proposal network is proposed to facilitate each subnet to learn more abstract invariances. Extensive experiments on different adaptation scenarios demonstrate the effectiveness of the proposed model.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي التعلم الآلي

A Competitive Method to VIPriors Object Detection Challenge

328 - Fei Shen , Xin He , Mengwan Wei 2021

In this report, we introduce the technical details of our submission to the VIPriors object detection challenge. Our solution is based on mmdetction of a strong baseline open-source detection toolbox. Firstly, we introduce an effective data augmentat ion method to address the lack of data problem, which contains bbox-jitter, grid-mask, and mix-up. Secondly, we present a robust region of interest (ROI) extraction method to learn more significant ROI features via embedding global context features. Thirdly, we propose a multi-model integration strategy to refinement the prediction box, which weighted boxes fusion (WBF). Experimental results demonstrate that our approach can significantly improve the average precision (AP) of object detection on the subset of the COCO2017 dataset.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي

Progressive Domain Adaptation for Object Detection

192 - Han-Kai Hsu , Chun-Han Yao , Yi-Hsuan Tsai 2019

Recent deep learning methods for object detection rely on a large amount of bounding box annotations. Collecting these annotations is laborious and costly, yet supervised models do not generalize well when testing on images from a different distribut ion. Domain adaptation provides a solution by adapting existing labels to the target testing data. However, a large gap between domains could make adaptation a challenging task, which leads to unstable training processes and sub-optimal results. In this paper, we propose to bridge the domain gap with an intermediate domain and progressively solve easier adaptation subtasks. This intermediate domain is constructed by translating the source images to mimic the ones in the target domain. To tackle the domain-shift problem, we adopt adversarial learning to align distributions at the feature level. In addition, a weighted task loss is applied to deal with unbalanced image quality in the intermediate domain. Experimental results show that our method performs favorably against the state-of-the-art method in terms of the performance on the target domain.

الرؤية الحاسوبية وتمييز الأنماط

ST3D++: Denoised Self-training for Unsupervised Domain Adaptation on 3D Object Detection

165 - Jihan Yang , Shaoshuai Shi , Zhe Wang 2021

In this paper, we present a self-training method, named ST3D++, with a holistic pseudo label denoising pipeline for unsupervised domain adaptation on 3D object detection. ST3D++ aims at reducing noise in pseudo label generation as well as alleviating the negative impacts of noisy pseudo labels on model training. First, ST3D++ pre-trains the 3D object detector on the labeled source domain with random object scaling (ROS) which is designed to reduce target domain pseudo label noise arising from object scale bias of the source domain. Then, the detector is progressively improved through alternating between generating pseudo labels and training the object detector with pseudo-labeled target domain data. Here, we equip the pseudo label generation process with a hybrid quality-aware triplet memory to improve the quality and stability of generated pseudo labels. Meanwhile, in the model training stage, we propose a source data assisted training strategy and a curriculum data augmentation policy to effectively rectify noisy gradient directions and avoid model over-fitting to noisy pseudo labeled data. These specific designs enable the detector to be trained on meticulously refined pseudo labeled target data with denoised training signals, and thus effectively facilitate adapting an object detector to a target domain without requiring annotations. Finally, our method is assessed on four 3D benchmark datasets (i.e., Waymo, KITTI, Lyft, and nuScenes) for three common categories (i.e., car, pedestrian and bicycle). ST3D++ achieves state-of-the-art performance on all evaluated settings, outperforming the corresponding baseline by a large margin (e.g., 9.6% $sim$ 38.16% on Waymo $rightarrow$ KITTI in terms of AP$_{text{3D}}$), and even surpasses the fully supervised oracle results on the KITTI 3D object detection benchmark with target prior. Code will be available.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي

Consistency-based Active Learning for Object Detection

306 - Weiping Yu , Sijie Zhu , Taojiannan Yang 2021

Active learning aims to improve the performance of task model by selecting the most informative samples with a limited budget. Unlike most recent works that focused on applying active learning for image classification, we propose an effective Consist ency-based Active Learning method for object Detection (CALD), which fully explores the consistency between original and augmented data. CALD has three appealing benefits. (i) CALD is systematically designed by investigating the weaknesses of existing active learning methods, which do not take the unique challenges of object detection into account. (ii) CALD unifies box regression and classification with a single metric, which is not concerned by active learning methods for classification. CALD also focuses on the most informative local region rather than the whole image, which is beneficial for object detection. (iii) CALD not only gauges individual information for sample selection, but also leverages mutual information to encourage a balanced data distribution. Extensive experiments show that CALD significantly outperforms existing state-of-the-art task-agnostic and detection-specific active learning methods on general object detection datasets. Based on the Faster R-CNN detector, CALD consistently surpasses the baseline method (random selection) by 2.9/2.8/0.8 mAP on average on PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO. Code is available at url{https://github.com/we1pingyu/CALD}

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي التعلم الآلي