Natural language (NL) descriptions can be one of the most convenient ways, and sometimes the only way, to interact with systems built to understand and detect city-scale traffic patterns and vehicle-related events. In this paper, we extend the widely adopted CityFlow Benchmark with NL descriptions for vehicle targets and introduce the CityFlow-NL Benchmark. CityFlow-NL contains more than 5,000 unique and precise NL descriptions of vehicle targets, making it, to our knowledge, the first dataset for multi-target multi-camera tracking with NL descriptions. Moreover, the dataset facilitates research at the intersection of multi-object tracking, retrieval by NL descriptions, and temporal localization of events. We focus on two foundational tasks, the Vehicle Retrieval by NL task and the Vehicle Tracking by NL task, which take advantage of the proposed CityFlow-NL benchmark and provide a strong basis for future research on multi-target multi-camera tracking by NL description.
We propose a novel Siamese Natural Language Tracker (SNLT), which brings the advancements in visual tracking to the tracking by natural language (NL) descriptions task. The proposed SNLT is applicable to a wide range of Siamese trackers, providing a
Urban traffic optimization using traffic cameras as sensors is driving the need to advance state-of-the-art multi-target multi-camera (MTMC) tracking. This work introduces CityFlow, a city-scale traffic camera dataset consisting of more than 3 hours
Multi-Target Multi-Camera Tracking has a wide range of applications and is the basis for many advanced inferences and predictions. This paper describes our solution to the Track 3 multi-camera vehicle tracking task in the 2021 AI City Challenge (AICITY21
Vehicle search is a basic task for efficient traffic management in the AI City setting. Most existing practices focus on image-based vehicle matching, including vehicle re-identification and vehicle tracking. In this paper, we apply one ne
In this paper, we propose a novel method for video moment retrieval (VMR) that achieves state-of-the-art (SOTA) performance on R@1 metrics and surpasses the SOTA on the high IoU metric (R@1, IoU=0.7). First, we propose to use a multi-head self-at