Person re-identification aims to identify a specific person at distinct times and locations. It is challenging because of occlusion, illumination changes, and viewpoint changes across camera views. Recently, the multi-shot person re-id task has received more attention since it is closer to real-world applications. A key component of a good multi-shot person re-id algorithm is the temporal aggregation of person appearance features. While most current approaches apply pooling strategies to obtain a fixed-size vector representation, these may lose the matching evidence between examples. In this work, we propose the idea of visual distributional representation, which interprets an image set as samples drawn from an unknown distribution in appearance feature space. Based on supervision signals from a downstream task of interest, the method reshapes the appearance feature space and further learns the unknown distribution of each image set. In the context of multi-shot person re-id, we apply this novel concept along with the Wasserstein distance and learn a distributional set distance function between two image sets. In this way, the proper alignment between two image sets can be discovered naturally in a non-parametric manner. Our experimental results on two public datasets show the advantages of the proposed method over other state-of-the-art approaches.
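To make the distributional set distance concrete, here is a minimal sketch (assuming NumPy, frame-level features already extracted, and uniform weights over each set) that treats two image sets as empirical distributions and computes an entropy-regularized Wasserstein distance with Sinkhorn iterations; the function name, regularization strength, and toy data are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def sinkhorn_wasserstein(X, Y, eps=0.1, n_iters=200):
    """Entropy-regularized Wasserstein distance between two sets of
    appearance features X (m x d) and Y (n x d), each treated as an
    empirical distribution with uniform weights."""
    m, n = X.shape[0], Y.shape[0]
    # Pairwise squared Euclidean cost between set members.
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    Cn = C / C.max()                  # scale costs to [0, 1] for numerical stability
    a = np.full(m, 1.0 / m)           # uniform weights on set X
    b = np.full(n, 1.0 / n)           # uniform weights on set Y
    K = np.exp(-Cn / eps)             # Gibbs kernel
    u = np.ones(m)
    for _ in range(n_iters):          # Sinkhorn fixed-point iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]   # (approximate) optimal transport plan
    return (P * C).sum()              # transport cost under the original costs

# Toy usage: two "image sets" of 5 and 7 frames with 128-d features.
rng = np.random.default_rng(0)
set_a = rng.normal(size=(5, 128))
set_b = rng.normal(size=(7, 128))
print(sinkhorn_wasserstein(set_a, set_b))
```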
Modeling the underlying person structure for person re-identification (re-ID) is difficult due to diverse deformable poses, changing camera views, and imperfect person detectors. How to exploit underlying person structure information without extra a
Modern video person re-identification (re-ID) models are often trained using a metric learning approach, supervised by a triplet loss. The triplet loss used in video re-ID is usually based on so-called clip features, each aggregated from a few fram
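As a rough illustration of how a triplet loss operates on clip features, the sketch below mean-pools per-frame features into a clip-level vector and evaluates a margin-based triplet loss; the pooling choice, margin value, and helper names are assumptions for illustration, not the specific formulation in this abstract.

```python
import numpy as np

def clip_feature(frame_feats):
    """Aggregate per-frame features (t x d) into one clip feature by mean pooling."""
    return frame_feats.mean(axis=0)

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard margin-based triplet loss on clip-level features."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(d_ap - d_an + margin, 0.0)

rng = np.random.default_rng(0)
anchor   = clip_feature(rng.normal(size=(8, 256)))  # 8-frame clip, 256-d frame features
positive = clip_feature(rng.normal(size=(8, 256)))  # same identity (toy data here)
negative = clip_feature(rng.normal(size=(8, 256)))  # different identity (toy data here)
print(triplet_loss(anchor, positive, negative))
```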
Feature representation and metric learning are two critical components in person re-identification models. In this paper, we focus on the feature representation and claim that hand-crafted histogram features can be complementary to Convolutional Neur
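A minimal sketch of how a hand-crafted histogram could complement a CNN embedding, assuming a simple per-channel RGB histogram concatenated with an L2-normalized CNN feature; the fusion weight and helper names are illustrative and not taken from the paper.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Hand-crafted RGB color histogram for an H x W x 3 uint8 image,
    L1-normalized so it is comparable across image sizes."""
    hist = []
    for c in range(3):
        h, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        hist.append(h)
    hist = np.concatenate(hist).astype(np.float64)
    return hist / (hist.sum() + 1e-12)

def fused_descriptor(cnn_feature, image, w=0.5):
    """Concatenate an L2-normalized CNN embedding with a weighted
    hand-crafted histogram so the two cues can complement each other."""
    cnn = cnn_feature / (np.linalg.norm(cnn_feature) + 1e-12)
    return np.concatenate([cnn, w * color_histogram(image)])

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(128, 64, 3), dtype=np.uint8)  # toy person crop
cnn_feat = rng.normal(size=512)                                # stand-in CNN embedding
print(fused_descriptor(cnn_feat, img).shape)                   # (536,) = 512 CNN dims + 24 histogram bins
```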
In this work, we present a Multi-Channel deep convolutional Pyramid Person Matching Network (MC-PPMN) based on the combination of semantic components and color-texture distributions to address the problem of person re-identification. In parti
Traditional person re-identification (ReID) methods typically represent person images as real-valued features, which makes ReID inefficient when the gallery set is extremely large. Recently, some hashing methods have been proposed to make ReID more e
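To illustrate why hashing helps when the gallery is extremely large, the following sketch binarizes real-valued features by per-dimension median thresholding and ranks the gallery by Hamming distance; the thresholding scheme and function names are illustrative assumptions rather than any specific hashing method from the ReID literature.

```python
import numpy as np

def binarize(features, thresholds):
    """Turn real-valued re-ID features (n x d) into binary codes by
    thresholding each dimension."""
    return (features > thresholds).astype(np.uint8)

def hamming_rank(query_code, gallery_codes):
    """Rank gallery entries by Hamming distance to the query code."""
    dists = (query_code[None, :] != gallery_codes).sum(axis=1)
    return np.argsort(dists), dists

rng = np.random.default_rng(0)
gallery = rng.normal(size=(10000, 128))          # large gallery of real-valued features
query   = gallery[42] + 0.05 * rng.normal(size=128)
thresholds = np.median(gallery, axis=0)          # per-dimension binarization thresholds
codes  = binarize(gallery, thresholds)
q_code = binarize(query[None, :], thresholds)[0]
order, dists = hamming_rank(q_code, codes)
print(order[:5])  # nearest gallery indices; index 42 should appear near the top
```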