Although a polygon is a more accurate representation than an upright bounding box for text detection, polygon annotations are extremely expensive and challenging to obtain. Unlike existing works that rely on fully supervised training with polygon annotations, we propose a novel text detection system termed SelfText Beyond Polygon (SBP) with Bounding Box Supervision (BBS) and Dynamic Self Training (DST), which trains a polygon-based text detector with only a limited set of upright bounding box annotations. For BBS, we first use synthetic data with character-level annotations to train a Skeleton Attention Segmentation Network (SASN). The box-level annotations are then used to guide the generation of high-quality polygon-like pseudo labels, which can be used to train any detector. In this way, our method achieves the same performance as text detectors trained with polygon annotations (i.e., both reach an 85.0% F-score for PSENet on ICDAR2015). For DST, by dynamically removing false alarms, the detector can leverage the limited labeled data together with massive unlabeled data to further outperform the baseline trained with expensive polygon annotations. We hope SBP can provide a new perspective for text detection and save substantial labeling costs. Code is available at: github.com/weijiawu/SBP.
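A minimal sketch of how the BBS pseudo-label generation step might look, assuming a trained SASN exposed as an object with a predict(crop) -> binary mask method; the interface, helper names, and contour-based polygon extraction are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: turn upright box annotations into polygon-like pseudo labels.
import cv2
import numpy as np

def generate_polygon_pseudo_labels(image, boxes, sasn):
    """For each annotated box, segment the cropped region with the SASN and
    convert the predicted text mask into a polygon pseudo label."""
    pseudo_polygons = []
    for (x1, y1, x2, y2) in boxes:
        crop = image[y1:y2, x1:x2]                    # restrict prediction to the box
        mask = sasn.predict(crop).astype(np.uint8)    # SASN trained on synthetic data
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            continue
        contour = max(contours, key=cv2.contourArea)  # keep the dominant text region
        polygon = contour.reshape(-1, 2) + np.array([x1, y1])  # back to image coords
        pseudo_polygons.append(polygon)
    return pseudo_polygons  # pseudo labels usable to train any polygon-based detector
```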
This paper presents a detection-aware pre-training (DAP) approach, which leverages only weakly-labeled classification-style datasets (e.g., ImageNet) for pre-training, but is specifically tailored to benefit object detection tasks. In contrast to the
Self-training is a simple semi-supervised learning approach: Unlabelled examples that attract high-confidence predictions are labelled with their predictions and added to the training set, with this process being repeated multiple times. Recently, se
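A minimal sketch of the basic self-training loop this abstract describes, using scikit-learn's LogisticRegression as a stand-in base learner; the confidence threshold, number of rounds, and model choice are illustrative assumptions.

```python
# Hypothetical sketch: iterative self-training on high-confidence pseudo labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(x_labeled, y_labeled, x_unlabeled, rounds=5, threshold=0.95):
    model = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        model.fit(x_labeled, y_labeled)
        if len(x_unlabeled) == 0:
            break
        probs = model.predict_proba(x_unlabeled)
        conf = probs.max(axis=1)
        keep = conf >= threshold                  # keep only high-confidence predictions
        if not keep.any():
            break
        pseudo_labels = probs[keep].argmax(axis=1)
        # add pseudo-labelled examples to the training set and repeat
        x_labeled = np.concatenate([x_labeled, x_unlabeled[keep]])
        y_labeled = np.concatenate([y_labeled, pseudo_labels])
        x_unlabeled = x_unlabeled[~keep]
    return model
```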
We study how to leverage Web images to augment human-curated object detection datasets. Our approach is two-pronged. On the one hand, we retrieve Web images by image-to-image search, which incurs less domain shift from the curated data than other sea
Few-shot semantic segmentation (FSS) has great potential for medical imaging applications. Most of the existing FSS techniques require abundant annotated semantic classes for training. However, these methods may not be applicable for medical images d
Training object class detectors typically requires a large set of images with objects annotated by bounding boxes. However, manually drawing bounding boxes is very time consuming. In this paper we greatly reduce annotation time by proposing center-cl