Do you want to publish a course? Click here

Progressive Domain Adaptation for Object Detection

193   0   0.0 ( 0 )
 Added by Yi-Hsuan Tsai
 Publication date 2019
and research's language is English




Ask ChatGPT about the research

Recent deep learning methods for object detection rely on a large amount of bounding box annotations. Collecting these annotations is laborious and costly, yet supervised models do not generalize well when testing on images from a different distribution. Domain adaptation provides a solution by adapting existing labels to the target testing data. However, a large gap between domains could make adaptation a challenging task, which leads to unstable training processes and sub-optimal results. In this paper, we propose to bridge the domain gap with an intermediate domain and progressively solve easier adaptation subtasks. This intermediate domain is constructed by translating the source images to mimic the ones in the target domain. To tackle the domain-shift problem, we adopt adversarial learning to align distributions at the feature level. In addition, a weighted task loss is applied to deal with unbalanced image quality in the intermediate domain. Experimental results show that our method performs favorably against the state-of-the-art method in terms of the performance on the target domain.



rate research

Read More

To reduce annotation labor associated with object detection, an increasing number of studies focus on transferring the learned knowledge from a labeled source domain to another unlabeled target domain. However, existing methods assume that the labeled data are sampled from a single source domain, which ignores a more generalized scenario, where labeled data are from multiple source domains. For the more challenging task, we propose a unified Faster R-CNN based framework, termed Divide-and-Merge Spindle Network (DMSN), which can simultaneously enhance domain invariance and preserve discriminative power. Specifically, the framework contains multiple source subnets and a pseudo target subnet. First, we propose a hierarchical feature alignment strategy to conduct strong and weak alignments for low- and high-level features, respectively, considering their different effects for object detection. Second, we develop a novel pseudo subnet learning algorithm to approximate optimal parameters of pseudo target subset by weighted combination of parameters in different source subnets. Finally, a consistency regularization for region proposal network is proposed to facilitate each subnet to learn more abstract invariances. Extensive experiments on different adaptation scenarios demonstrate the effectiveness of the proposed model.
Most state-of-the-art methods of object detection suffer from poor generalization ability when the training and test data are from different domains, e.g., with different styles. To address this problem, previous methods mainly use holistic representations to align feature-level and pixel-level distributions of different domains, which may neglect the instance-level characteristics of objects in images. Besides, when transferring detection ability across different domains, it is important to obtain the instance-level features that are domain-invariant, instead of the styles that are domain-specific. Therefore, in order to extract instance-invariant features, we should disentangle the domain-invariant features from the domain-specific features. To this end, a progressive disentangled framework is first proposed to solve domain adaptive object detection. Particularly, base on disentangled learning used for feature decomposition, we devise two disentangled layers to decompose domain-invariant and domain-specific features. And the instance-invariant features are extracted based on the domain-invariant features. Finally, to enhance the disentanglement, a three-stage training mechanism including multiple loss functions is devised to optimize our model. In the experiment, we verify the effectiveness of our method on three domain-shift scenes. Our method is separately 2.3%, 3.6%, and 4.0% higher than the baseline method cite{saito2019strong}.
Recent advances in unsupervised domain adaptation have significantly improved the recognition accuracy of CNNs by alleviating the domain shift between (labeled) source and (unlabeled) target data distributions. While the problem of single-target domain adaptation (STDA) for object detection has recently received much attention, multi-target domain adaptation (MTDA) remains largely unexplored, despite its practical relevance in several real-world applications, such as multi-camera video surveillance. Compared to the STDA problem that may involve large domain shifts between complex source and target distributions, MTDA faces additional challenges, most notably the computational requirements and catastrophic forgetting of previously-learned targets, which can depend on the order of target adaptations. STDA for detection can be applied to MTDA by adapting one model per target, or one common model with a mixture of data from target domains. However, these approaches are either costly or inaccurate. The only state-of-art MTDA method specialized for detection learns targets incrementally, one target at a time, and mitigates the loss of knowledge by using a duplicated detection model for knowledge distillation, which is computationally expensive and does not scale well to many domains. In this paper, we introduce an efficient approach for incremental learning that generalizes well to multiple target domains. Our MTDA approach is more suitable for real-world applications since it allows updating the detection model incrementally, without storing data from previous-learned target domains, nor retraining when a new target domain becomes available. Our proposed method, MTDA-DTM, achieved the highest level of detection accuracy compared against state-of-the-art approaches on several MTDA detection benchmarks and Wildtrack, a benchmark for multi-camera pedestrian detection.
Unsupervised domain adaptive object detection aims to adapt detectors from a labelled source domain to an unlabelled target domain. Most existing works take a two-stage strategy that first generates region proposals and then detects objects of interest, where adversarial learning is widely adopted to mitigate the inter-domain discrepancy in both stages. However, adversarial learning may impair the alignment of well-aligned samples as it merely aligns the global distributions across domains. To address this issue, we design an uncertainty-aware domain adaptation network (UaDAN) that introduces conditional adversarial learning to align well-aligned and poorly-aligned samples separately in different manners. Specifically, we design an uncertainty metric that assesses the alignment of each sample and adjusts the strength of adversarial learning for well-aligned and poorly-aligned samples adaptively. In addition, we exploit the uncertainty metric to achieve curriculum learning that first performs easier image-level alignment and then more difficult instance-level alignment progressively. Extensive experiments over four challenging domain adaptive object detection datasets show that UaDAN achieves superior performance as compared with state-of-the-art methods.
This work tackles the unsupervised cross-domain object detection problem which aims to generalize a pre-trained object detector to a new target domain without labels. We propose an uncertainty-aware model adaptation method, which is based on two motivations: 1) the estimation and exploitation of model uncertainty in a new domain is critical for reliable domain adaptation; and 2) the joint alignment of distributions for inputs (feature alignment) and outputs (self-training) is needed. To this end, we compose a Bayesian CNN-based framework for uncertainty estimation in object detection, and propose an algorithm for generation of uncertainty-aware pseudo-labels. We also devise a scheme for joint feature alignment and self-training of the object detection model with uncertainty-aware pseudo-labels. Experiments on multiple cross-domain object detection benchmarks show that our proposed method achieves state-of-the-art performance.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا