Recent work shows that mean-teaching is an effective framework for unsupervised domain adaptive person re-identification. However, existing methods perform contrastive learning only on selected samples between teacher and student networks, which is sensitive to noise in pseudo labels and neglects the relationships among most samples. Moreover, these methods do not make effective use of cooperation among different teacher networks. To handle these issues, this paper proposes a Graph Consistency based Mean-Teaching (GCMT) method that constructs a Graph Consistency Constraint (GCC) between teacher and student networks. Specifically, given unlabeled training images, we apply teacher networks to extract the corresponding features and construct a teacher graph for each teacher network to describe the similarity relationships among training images. To boost representation learning, the different teacher graphs are fused to provide the supervisory signal for optimizing student networks. GCMT thus fuses the similarity relationships predicted by different teacher networks as supervision and effectively optimizes student networks with more sample relationships involved. Experiments on three datasets, i.e., Market-1501, DukeMTMC-reID, and MSMT17, show that the proposed GCMT outperforms state-of-the-art methods by a clear margin. Notably, GCMT even outperforms the previous method that uses a deeper backbone. Experimental results also show that GCMT can effectively boost performance with multiple teacher and student networks. Our code is available at https://github.com/liu-xb/GCMT.
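The graph consistency idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the use of cosine similarity for the graph edges, element-wise averaging as the fusion rule, and a mean squared difference as the consistency loss are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

def cosine_similarity_graph(features):
    """Return an n x n matrix of pairwise cosine similarities (assumed edge weights)."""
    normed = []
    for f in features:
        norm = math.sqrt(sum(x * x for x in f)) or 1.0  # guard against zero vectors
        normed.append([x / norm for x in f])
    return [[sum(a * b for a, b in zip(u, v)) for v in normed] for u in normed]

def fuse_graphs(graphs):
    """Fuse several teacher graphs by element-wise averaging (assumed fusion rule)."""
    n, k = len(graphs[0]), len(graphs)
    return [[sum(g[i][j] for g in graphs) / k for j in range(n)] for i in range(n)]

def graph_consistency_loss(student_graph, teacher_graph):
    """Mean squared difference between two graphs (assumed form of the constraint)."""
    n = len(student_graph)
    total = sum(
        (student_graph[i][j] - teacher_graph[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    return total / (n * n)

# Toy features for three images from two hypothetical teachers and one student.
teacher_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
teacher_b = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.8]]
student = [[0.8, 0.2], [0.3, 0.7], [0.9, 1.0]]

fused = fuse_graphs([cosine_similarity_graph(teacher_a), cosine_similarity_graph(teacher_b)])
loss = graph_consistency_loss(cosine_similarity_graph(student), fused)
```

Because the loss is computed over every pair of images in the batch, all sample relationships contribute to the supervision rather than only a few selected pairs.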
Person re-identification (re-ID) has attracted growing attention due to its widespread applications in intelligent video surveillance. Unfortunately, mainstream deep learning methods still need a large quantity of labeled data to train models, and annotating data is expensive in real-world scenarios. In addition, due to domain gaps between different datasets, performance decreases dramatically when re-ID models pre-trained on label-rich datasets (source domain) are directly applied to other unlabeled datasets (target domain). In this paper, we attempt to remedy these problems from two aspects, namely data and methodology. First, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them, which frees humans from heavy data collection and annotation. Based on them, we build two synthetic person re-ID datasets of different scales, the GSPR and mini-GSPR datasets. Second, we propose a synthesis-based multi-domain collaborative refinement (SMCR) network, which contains a synthetic pretraining module and two collaborative-refinement modules to sufficiently learn the valuable knowledge from multiple domains. Extensive experiments show that our proposed framework obtains significant performance improvements over state-of-the-art methods on multiple unsupervised domain adaptation tasks of person re-ID.
Unsupervised domain adaptive (UDA) person re-identification (ReID) aims to transfer knowledge from the labeled source domain to the unlabeled target domain for person matching. One challenge is how to generate target domain samples with reliable labels for training. To address this problem, we propose a Disentanglement-based Cross-Domain Feature Augmentation (DCDFA) strategy, where the augmented features characterize the target and source domain data distributions well while inheriting reliable identity labels. In particular, we disentangle each sample feature into a robust domain-invariant (shared) feature and a domain-specific feature, and perform cross-domain feature recomposition to enhance the diversity of samples used in training, under the constraints of a cross-domain ReID loss and a domain classification loss. Each recomposed feature is obtained from a domain-invariant feature (which enables reliable inheritance of identity) and an enhancement from a domain-specific feature (which enables the approximation of real distributions), and is thus an ideal augmentation. Extensive experimental results demonstrate the effectiveness of our method, which achieves state-of-the-art performance.
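The cross-domain recomposition step can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the shared and domain-specific parts are combined, so element-wise addition is assumed here, the disentangled parts are taken as given inputs, and the function name `recompose_cross_domain` is hypothetical.

```python
def recompose_cross_domain(shared, specific, labels, domains):
    """Pair each sample's shared part with domain-specific parts from other domains.

    Returns a list of (feature, identity_label, domain) tuples.
    """
    augmented = []
    for s_i, label_i, dom_i in zip(shared, labels, domains):
        for j, dom_j in enumerate(domains):
            if dom_j == dom_i:
                continue  # only recompose across domains
            # Assumed recomposition rule: element-wise addition of the two parts.
            feature = [a + b for a, b in zip(s_i, specific[j])]
            # The identity label follows the shared (domain-invariant) part,
            # while the domain follows the domain-specific part.
            augmented.append((feature, label_i, dom_j))
    return augmented

# Toy example: one labeled source sample and one target sample.
shared = [[1.0, 0.0], [0.0, 1.0]]
specific = [[0.5, 0.5], [-0.5, -0.5]]
labels = ["id_a", "id_b"]
domains = ["source", "target"]

augmented = recompose_cross_domain(shared, specific, labels, domains)
```

In this toy case the source identity `id_a` yields an augmented feature that carries target-domain characteristics, which is the sense in which a recomposed feature inherits a reliable label while approximating the other domain's distribution.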
Unsupervised domain adaptation (UDA) methods for person re-identification (re-ID) aim at transferring re-ID knowledge from labeled source data to unlabeled target data. Although they have achieved great success, most of them use only limited data from a single source domain for model pre-training, leaving the rich labeled data insufficiently exploited. To make full use of the valuable labeled data, we introduce the multi-source concept into the UDA person re-ID field, where multiple source datasets are used during training. However, because of domain gaps, simply combining different datasets brings only limited improvement. In this paper, we try to address this problem from two perspectives, i.e., a domain-specific view and a domain-fusion view. Two constructive modules are proposed, and they are compatible with each other. First, a rectification domain-specific batch normalization (RDSBN) module is explored to simultaneously reduce domain-specific characteristics and increase the distinctiveness of person features. Second, a graph convolutional network (GCN) based multi-domain information fusion (MDIF) module is developed, which minimizes domain distances by fusing features of different domains. The proposed method outperforms state-of-the-art UDA person re-ID methods by a large margin, and even achieves performance comparable to supervised approaches without any post-processing techniques.
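To make the domain-specific view concrete, the sketch below normalizes each sample with the statistics of its own domain. It shows plain domain-specific batch normalization only; the rectification step of RDSBN is not described in the abstract and is not modeled here. The function name, the omission of learnable scale and shift parameters, and the `eps` value are assumptions for illustration.

```python
import math

def domain_specific_normalize(features, domain_ids, eps=1e-5):
    """Normalize each feature dimension with the mean and variance of the sample's own domain."""
    by_domain = {}
    for idx, d in enumerate(domain_ids):
        by_domain.setdefault(d, []).append(idx)

    dim = len(features[0])
    out = [None] * len(features)
    for idxs in by_domain.values():
        m = len(idxs)
        means = [sum(features[i][k] for i in idxs) / m for k in range(dim)]
        variances = [
            sum((features[i][k] - means[k]) ** 2 for i in idxs) / m for k in range(dim)
        ]
        for i in idxs:
            out[i] = [
                (features[i][k] - means[k]) / math.sqrt(variances[k] + eps)
                for k in range(dim)
            ]
    return out

# Two domains whose raw features differ by a large offset.
features = [[1.0], [3.0], [101.0], [103.0]]
domain_ids = [0, 0, 1, 1]
normalized = domain_specific_normalize(features, domain_ids)
```

After normalization the two domains occupy the same range, so the offset that distinguished them (a domain-specific characteristic) no longer dominates the features.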
Although existing person re-identification (Re-ID) methods have shown impressive accuracy, most of them generalize poorly to unseen target domains. Generalizable person Re-ID, which trains a model on source domains so that it generalizes well to unseen target domains without model updating, has therefore drawn increasing attention. In this work, we propose a novel adaptive domain-specific normalization approach (AdsNorm) for generalizable person Re-ID. It describes an unseen target domain as a combination of the known source domains, and explicitly learns domain-specific representations with the target distribution to improve the model's generalization through a meta-learning pipeline. Specifically, AdsNorm utilizes batch normalization layers to collect the characteristics of individual source domains, and maps the source domains into a shared latent space by using these characteristics, where domain relevance is measured by a distance function over the different domain-specific normalization statistics and features. At the testing stage, AdsNorm projects images from the unseen target domain into the same latent space, and adaptively integrates the domain-specific features carrying the source distributions according to domain relevance, to learn a more generalizable aggregated representation on the unseen target domain. Considering that the target domain is unavailable during training, a meta-learning algorithm combined with a customized relation loss is proposed to optimize an effective and efficient ensemble model. Extensive experiments demonstrate that AdsNorm outperforms state-of-the-art methods. The code is available at: https://github.com/hzphzp/AdsNorm.
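The test-time aggregation described above can be sketched as a relevance-weighted sum of domain-specific features. This is a simplified illustration rather than the AdsNorm implementation: turning distances into weights with a softmax over negative distances, the `temperature` parameter, and both function names are assumptions.

```python
import math

def relevance_weights(distances, temperature=1.0):
    """Softmax over negative distances, so closer source domains get larger weights."""
    scores = [-d / temperature for d in distances]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_features(domain_features, distances, temperature=1.0):
    """Combine per-source-domain features of one image using the relevance weights."""
    weights = relevance_weights(distances, temperature)
    dim = len(domain_features[0])
    return [sum(w * f[k] for w, f in zip(weights, domain_features)) for k in range(dim)]

# One target image seen through three hypothetical source-domain branches.
domain_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
distances = [0.2, 1.5, 0.7]  # assumed distances between the image and each source domain

weights = relevance_weights(distances)
aggregated = aggregate_features(domain_features, distances)
```

The closest source domain contributes most to the aggregated representation, which matches the idea of describing an unseen target domain as a combination of the known source domains.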
Unsupervised domain adaptive (UDA) person re-identification (re-ID) is a challenging task due to the absence of labels for the target domain data. To handle this problem, some recent works adopt clustering algorithms to generate pseudo labels off-line, which can then be used as the supervision signal for on-line feature learning in the target domain. However, the off-line generated labels often contain a lot of noise, which significantly hinders the discriminability of the on-line learned features and thus limits the final UDA re-ID performance. To this end, we propose a novel approach, called Dual-Refinement, that jointly refines pseudo labels at the off-line clustering phase and features at the on-line training phase, to alternately boost label purity and feature discriminability in the target domain for more reliable re-ID. Specifically, at the off-line phase, a new hierarchical clustering scheme is proposed, which selects representative prototypes for every coarse cluster. Labels can thus be effectively refined by using the inherent hierarchical information of person images. In addition, at the on-line phase, we propose an instant memory spread-out (IM-spread-out) regularization, which takes advantage of the proposed instant memory bank to store sample features of the entire dataset and enable spread-out feature learning over the entire training data instantly. Our Dual-Refinement method reduces the influence of noisy labels and refines the learned features within the alternating training process. Experiments demonstrate that our method outperforms state-of-the-art methods by a large margin.