ﻻ يوجد ملخص باللغة العربية
RGBT tracking receives a surge of interest in the computer vision community, but this research field lacks a large-scale and high-diversity benchmark dataset, which is essential for both the training of deep RGBT trackers and the comprehensive evaluation of RGBT tracking methods. To this end, we present a Large-scale High-diversity benchmark for RGBT tracking (LasHeR) in this work. LasHeR consists of 1224 visible and thermal infrared video pairs with more than 730K frame pairs in total. Each frame pair is spatially aligned and manually annotated with a bounding box, making the dataset well and densely annotated. LasHeR is highly diverse capturing from a broad range of object categories, camera viewpoints, scene complexities and environmental factors across seasons, weathers, day and night. We conduct a comprehensive performance evaluation of 12 RGBT tracking algorithms on the LasHeR dataset and present detailed analysis to clarify the research room in RGBT tracking. In addition, we release the unaligned version of LasHeR to attract the research interest for alignment-free RGBT tracking, which is a more practical task in real-world applications. The datasets and evaluation protocols are available at: https://github.com/BUGPLEASEOUT/LasHeR.
In this paper, we present a Large-Scale and high-diversity general Thermal InfraRed (TIR) Object Tracking Benchmark, called LSOTBTIR, which consists of an evaluation dataset and a training dataset with a total of 1,400 TIR sequences and more than 600
Salient object detection in complex scenes and environments is a challenging research topic. Most works focus on RGB-based salient object detection, which limits its performance of real-life applications when confronted with adverse conditions such a
For many years, multi-object tracking benchmarks have focused on a handful of categories. Motivated primarily by surveillance and self-driving applications, these datasets provide tracks for people, vehicles, and animals, ignoring the vast majority o
Crowd counting is a fundamental yet challenging task, which desires rich information to generate pixel-wise crowd density maps. However, most previous methods only used the limited information of RGB images and cannot well discover potential pedestri
Panoptic scene understanding and tracking of dynamic agents are essential for robots and automated vehicles to navigate in urban environments. As LiDARs provide accurate illumination-independent geometric depictions of the scene, performing these tas