ﻻ يوجد ملخص باللغة العربية
Many RGB-T trackers attempt to attain robust feature representation by utilizing an adaptive weighting scheme (or attention mechanism). Different from these works, we propose a new dynamic modality-aware filter generation module (named MFGNet) to boost the message communication between visible and thermal data by adaptively adjusting the convolutional kernels for various input images in practical tracking. Given the image pairs as input, we first encode their features with the backbone network. Then, we concatenate these feature maps and generate dynamic modality-aware filters with two independent networks. The visible and thermal filters will be used to conduct a dynamic convolutional operation on their corresponding input feature maps respectively. Inspired by residual connection, both the generated visible and thermal feature maps will be summarized with input feature maps. The augmented feature maps will be fed into the RoI align module to generate instance-level features for subsequent classification. To address issues caused by heavy occlusion, fast motion, and out-of-view, we propose to conduct a joint local and global search by exploiting a new direction-aware target-driven attention mechanism. The spatial and temporal recurrent neural network is used to capture the direction-aware context for accurate global attention prediction. Extensive experiments on three large-scale RGB-T tracking benchmark datasets validated the effectiveness of our proposed algorithm. The project page of this paper is available at https://sites.google.com/view/mfgrgbttrack/.
RGB-Infrared (IR) person re-identification is very challenging due to the large cross-modality variations between RGB and IR images. The key solution is to learn aligned features to the bridge RGB and IR modalities. However, due to the lack of corres
Due to the different photosensitive properties of infrared and visible light, the registered RGB-T image pairs shot in the same scene exhibit quite different characteristics. This paper proposes a siamese infrared and visible light fusion Network (Si
Prior correlation filter (CF)-based tracking methods for unmanned aerial vehicles (UAVs) have virtually focused on tracking in the daytime. However, when the night falls, the trackers will encounter more harsh scenes, which can easily lead to trackin
Visual object tracking is a challenging computer vision task with numerous real-world applications. Here we propose a simple but efficient Spectral Filter Tracking (SFT)method. To characterize rotational and translation invariance of tracking targets
Few-shot learning (FSL), which aims to recognise new classes by adapting the learned knowledge with extremely limited few-shot (support) examples, remains an important open problem in computer vision. Most of the existing methods for feature alignmen