A Competitive Method to VIPriors Object Detection Challenge

329 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Fei Shen

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Fei Shen - Xin He - Mengwan Wei

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In this report, we introduce the technical details of our submission to the VIPriors object detection challenge. Our solution is based on mmdetction of a strong baseline open-source detection toolbox. Firstly, we introduce an effective data augmentation method to address the lack of data problem, which contains bbox-jitter, grid-mask, and mix-up. Secondly, we present a robust region of interest (ROI) extraction method to learn more significant ROI features via embedding global context features. Thirdly, we propose a multi-model integration strategy to refinement the prediction box, which weighted boxes fusion (WBF). Experimental results demonstrate that our approach can significantly improve the average precision (AP) of object detection on the subset of the COCO2017 dataset.

قيم البحث

101 - Rindra Ramamonjison , Amin Banitalebi-Dehkordi , Xinyu Kang 2021

This paper presents a Simple and effective unsupervised adaptation method for Robust Object Detection (SimROD). To overcome the challenging issues of domain shift and pseudo-label noise, our method integrates a novel domain-centric augmentation metho d, a gradual self-labeling adaptation procedure, and a teacher-guided fine-tuning mechanism. Using our method, target domain samples can be leveraged to adapt object detection models without changing the model architecture or generating synthetic data. When applied to image corruptions and high-level cross-domain adaptation benchmarks, our method outperforms prior baselines on multiple domain adaptation benchmarks. SimROD achieves new state-of-the-art on standard real-to-synthetic and cross-camera setup benchmarks. On the image corruption benchmark, models adapted with our method achieved a relative robustness improvement of 15-25% AP50 on Pascal-C and 5-6% AP on COCO-C and Cityscapes-C. On the cross-domain benchmark, our method outperformed the best baseline performance by up to 8% AP50 on Comic dataset and up to 4% on Watercolor dataset.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي التعلم الآلي

Towards Resolving the Challenge of Long-tail Distribution in UAV Images for Object Detection

74 - Weiping Yu , Taojiannan Yang , Chen Chen 2020

Existing methods for object detection in UAV images ignored an important challenge - imbalanced class distribution in UAV images - which leads to poor performance on tail classes. We systematically investigate existing solutions to long-tail problems and unveil that re-balancing methods that are effective on natural image datasets cannot be trivially applied to UAV datasets. To this end, we rethink long-tailed object detection in UAV images and propose the Dual Sampler and Head detection Network (DSHNet), which is the first work that aims to resolve long-tail distribution in UAV images. The key components in DSHNet include Class-Biased Samplers (CBS) and Bilateral Box Heads (BBH), which are developed to cope with tail classes and head classes in a dual-path manner. Without bells and whistles, DSHNet significantly boosts the performance of tail classes on different detection frameworks. Moreover, DSHNet significantly outperforms base detectors and generic approaches for long-tail problems on VisDrone and UAVDT datasets. It achieves new state-of-the-art performance when combining with image cropping methods. Code is available at https://github.com/we1pingyu/DSHNet

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي

Method Towards CVPR 2021 Image Matching Challenge

91 - Xiaopeng Bi , Yu Chen , Xinyang Liu 2021

This report describes Megvii-3D teams approach towards CVPR 2021 Image Matching Workshop.

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي

nnDetection: A Self-configuring Method for Medical Object Detection

60 - Michael Baumgartner , Paul F. Jaeger , Fabian Isensee 2021

Simultaneous localisation and categorization of objects in medical images, also referred to as medical object detection, is of high clinical relevance because diagnostic decisions often depend on rating of objects rather than e.g. pixels. For this ta sk, the cumbersome and iterative process of method configuration constitutes a major research bottleneck. Recently, nnU-Net has tackled this challenge for the task of image segmentation with great success. Following nnU-Nets agenda, in this work we systematize and automate the configuration process for medical object detection. The resulting self-configuring method, nnDetection, adapts itself without any manual intervention to arbitrary medical detection problems while achieving results en par with or superior to the state-of-the-art. We demonstrate the effectiveness of nnDetection on two public benchmarks, ADAM and LUNA16, and propose 10 further medical object detection tasks on public data sets for comprehensive method evaluation. Code is at https://github.com/MIC-DKFZ/nnDetection .

الرؤية الحاسوبية وتمييز الأنماط

End-to-End Object Detection with Fully Convolutional Network

449 - Jianfeng Wang , Lin Song , Zeming Li 2020

Mainstream object detectors based on the fully convolutional network has achieved impressive performance. While most of them still need a hand-designed non-maximum suppression (NMS) post-processing, which impedes fully end-to-end training. In this pa per, we give the analysis of discarding NMS, where the results reveal that a proper label assignment plays a crucial role. To this end, for fully convolutional detectors, we introduce a Prediction-aware One-To-One (POTO) label assignment for classification to enable end-to-end detection, which obtains comparable performance with NMS. Besides, a simple 3D Max Filtering (3DMF) is proposed to utilize the multi-scale features and improve the discriminability of convolutions in the local region. With these techniques, our end-to-end framework achieves competitive performance against many state-of-the-art detectors with NMS on COCO and CrowdHuman datasets. The code is available at https://github.com/Megvii-BaseDetection/DeFCN .

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي