ﻻ يوجد ملخص باللغة العربية
We propose a novel Siamese Natural Language Tracker (SNLT), which brings the advancements in visual tracking to the tracking by natural language (NL) descriptions task. The proposed SNLT is applicable to a wide range of Siamese trackers, providing a new class of baselines for the tracking by NL task and promising future improvements from the advancements of Siamese trackers. The carefully designed architecture of the Siamese Natural Language Region Proposal Network (SNL-RPN), together with the Dynamic Aggregation of vision and language modalities, is introduced to perform the tracking by NL task. Empirical results over tracking benchmarks with NL annotations show that the proposed SNLT improves Siamese trackers by 3 to 7 percentage points with a slight tradeoff of speed. The proposed SNLT outperforms all NL trackers to-date and is competitive among state-of-the-art real-time trackers on LaSOT benchmarks while running at 50 frames per second on a single GPU.
Natural Language (NL) descriptions can be one of the most convenient or the only way to interact with systems built to understand and detect city scale traffic patterns and vehicle-related events. In this paper, we extend the widely adopted CityFlow
In recent years, deep-learning-based visual object trackers have been studied thoroughly, but handling occlusions and/or rapid motion of the target remains challenging. In this work, we argue that conditioning on the natural language (NL) description
Recently, the majority of visual trackers adopt Convolutional Neural Network (CNN) as their backbone to achieve high tracking accuracy. However, less attention has been paid to the potential adversarial threats brought by CNN, including Siamese netwo
The task of retrieving clips within videos based on a given natural language query requires cross-modal reasoning over multiple frames. Prior approaches such as sliding window classifiers are inefficient, while text-clip similarity driven ranking-bas
In recent years, Siamese network based trackers have significantly advanced the state-of-the-art in real-time tracking. However, state-of-the-art Siamese trackers suffer from high memory cost which restricts their applicability in mobile applications