ﻻ يوجد ملخص باللغة العربية
Most existing trackers based on deep learning perform tracking in a holistic strategy, which aims to learn deep representations of the whole target for localizing the target. It is arduous for such methods to track targets with various appearance variations. To address this limitation, another type of methods adopts a part-based tracking strategy which divides the target into equal patches and tracks all these patches in parallel. The target state is inferred by summarizing the tracking results of these patches. A potential limitation of such trackers is that not all patches are equally informative for tracking. Some patches that are not discriminative may have adverse effects. In this paper, we propose to track the salient local parts of the target that are discriminative for tracking. In particular, we propose a fine-grained saliency mining module to capture the local saliencies. Further, we design a saliency-association modeling module to associate the captured saliencies together to learn effective correlation representations between the exemplar and the search image for state estimation. Extensive experiments on five diverse datasets demonstrate that the proposed method performs favorably against state-of-the-art trackers.
In this paper, we propose a novel effective non-rigid object tracking framework based on the spatial-temporal consistent saliency detection. In contrast to most existing trackers that utilize a bounding box to specify the tracked target, the proposed
A saliency guided hierarchical visual tracking (SHT) algorithm containing global and local search phases is proposed in this paper. In global search, a top-down saliency model is novelly developed to handle abrupt motion and appearance variation prob
Object proposals greatly benefit object detection task in recent state-of-the-art works. However, the existing object proposals usually have low localization accuracy at high intersection over union threshold. To address it, we apply saliency detecti
The real human attention is an interactive activity between our visual system and our brain, using both low-level visual stimulus and high-level semantic information. Previous image salient object detection (SOD) works conduct their saliency predicti
Salient object detection aims at detecting the most visually distinct objects and producing the corresponding masks. As the cost of pixel-level annotations is high, image tags are usually used as weak supervisions. However, an image tag can only be u