ﻻ يوجد ملخص باللغة العربية
Object proposal generation is often the first step in many detection models. It is lucrative to train a good proposal model, that generalizes to unseen classes. This could help scaling detection models to larger number of classes with fewer annotations. Motivated by this, we study how a detection model trained on a small set of source classes can provide proposals that generalize to unseen classes. We systematically study the properties of the dataset - visual diversity and label space granularity - required for good generalization. We show the trade-off between using fine-grained labels and coarse labels. We introduce the idea of prototypical classes: a set of sufficient and necessary classes required to train a detection model to obtain generalized proposals in a more data-efficient way. On the Open Images V4 dataset, we show that only 25% of the classes can be selected to form such a prototypical set. The resulting proposals from a model trained with these classes is only 4.3% worse than using all the classes, in terms of average recall (AR). We also demonstrate that Faster R-CNN model leads to better generalization of proposals compared to a single-stage network like RetinaNet.
Object proposals have become an integral preprocessing steps of many vision pipelines including object detection, weakly supervised detection, object discovery, tracking, etc. Compared to the learning-free methods, learning-based proposals have becom
Object proposals greatly benefit object detection task in recent state-of-the-art works. However, the existing object proposals usually have low localization accuracy at high intersection over union threshold. To address it, we apply saliency detecti
We present Sparse R-CNN, a purely sparse method for object detection in images. Existing works on object detection heavily rely on dense object candidates, such as $k$ anchor boxes pre-defined on all grids of image feature map of size $Htimes W$. In
Object detection has recently achieved a breakthrough for removing the last one non-differentiable component in the pipeline, Non-Maximum Suppression (NMS), and building up an end-to-end system. However, what makes for its one-to-one prediction has n
Imbalance issue is a major yet unsolved bottleneck for the current object detection models. In this work, we observe two crucial yet never discussed imbalance issues. The first imbalance lies in the large number of low-quality RPN proposals, which ma