
Is First Person Vision Challenging for Object Tracking?

Added by Matteo Dunnhofer
Publication date: 2021
Language: English





Understanding human-object interactions is fundamental in First Person Vision (FPV). Tracking algorithms that follow the objects manipulated by the camera wearer can provide useful cues to effectively model such interactions. Visual tracking solutions available in the computer vision literature have significantly improved their performance in recent years for a large variety of target objects and tracking scenarios. However, despite a few previous attempts to exploit trackers in FPV applications, a methodical analysis of the performance of state-of-the-art trackers in this domain is still missing. In this paper, we fill the gap by presenting the first systematic study of object tracking in FPV. Our study extensively analyses the performance of recent visual trackers and baseline FPV trackers with respect to different aspects and considering a new performance measure. This is achieved through TREK-150, a novel benchmark dataset composed of 150 densely annotated video sequences. Our results show that object tracking in FPV is challenging, and suggest that more research effort should be devoted to this problem so that tracking can benefit FPV tasks.
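To make the kind of analysis described above concrete, here is a minimal sketch of the standard one-pass evaluation (OPE) protocol commonly used in tracking benchmarks: the tracker is initialized with the ground-truth box of the first frame and then run once over the sequence, measuring bounding-box overlap. The `tracker` interface and the (x, y, w, h) box format are illustrative assumptions, not the TREK-150 toolkit or the paper's new performance measure.

    # A minimal OPE sketch: mean IoU and success rate over one sequence.
    def iou(a, b):
        """Intersection-over-union of two (x, y, w, h) boxes."""
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2 = min(a[0] + a[2], b[0] + b[2])
        y2 = min(a[1] + a[3], b[1] + b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        union = a[2] * a[3] + b[2] * b[3] - inter
        return inter / union if union > 0 else 0.0

    def one_pass_evaluation(tracker, frames, ground_truth, threshold=0.5):
        """Initialize on the first frame, run once, report mean IoU and success rate."""
        tracker.init(frames[0], ground_truth[0])   # hypothetical tracker API
        overlaps = []
        for frame, gt_box in zip(frames[1:], ground_truth[1:]):
            pred_box = tracker.update(frame)       # predicted (x, y, w, h)
            overlaps.append(iou(pred_box, gt_box))
        mean_iou = sum(overlaps) / len(overlaps)
        success_rate = sum(o >= threshold for o in overlaps) / len(overlaps)
        return mean_iou, success_rate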




Related research

Understanding human-object interactions is fundamental in First Person Vision (FPV). Tracking algorithms that follow the objects manipulated by the camera wearer can provide useful information to effectively model such interactions. Despite a few previous attempts to exploit trackers in FPV applications, a systematic analysis of the performance of state-of-the-art trackers in this domain is still missing. On the other hand, the visual tracking solutions available in the computer vision literature have significantly improved their performance in recent years for a large variety of target objects and tracking scenarios. To fill the gap, in this paper we present TREK-100, the first benchmark dataset for visual object tracking in FPV. The dataset is composed of 100 video sequences densely annotated with 60K bounding boxes, 17 sequence attributes, 13 action verb attributes and 29 target object attributes. Along with the dataset, we present an extensive analysis of the performance of 30 of the best and most recent visual trackers. Our results show that object tracking in FPV is challenging, which suggests that more research effort should be devoted to this problem.
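Attribute annotations like the ones above are typically used to break overall scores down by tracking condition. Below is a minimal sketch of such a per-attribute breakdown; the dictionaries and attribute labels are illustrative assumptions, not the TREK-100 schema.

    # A minimal sketch of averaging per-sequence scores by attribute.
    from collections import defaultdict

    def success_by_attribute(per_sequence_iou, sequence_attributes):
        """per_sequence_iou: {sequence_id: mean IoU};
        sequence_attributes: {sequence_id: set of labels, e.g. {"occlusion"}}."""
        buckets = defaultdict(list)
        for seq_id, score in per_sequence_iou.items():
            for attr in sequence_attributes.get(seq_id, ()):
                buckets[attr].append(score)
        # Mean score per attribute, revealing which conditions hurt trackers most.
        return {attr: sum(s) / len(s) for attr, s in buckets.items()}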
Recent work in adversarial machine learning started to focus on the visual perception in autonomous driving and studied Adversarial Examples (AEs) for object detection models. However, in such visual perception pipeline the detected objects must also be tracked, in a process called Multiple Object Tracking (MOT), to build the moving trajectories of surrounding obstacles. Since MOT is designed to be robust against errors in object detection, it poses a general challenge to existing attack techniques that blindly target objection detection: we find that a success rate of over 98% is needed for them to actually affect the tracking results, a requirement that no existing attack technique can satisfy. In this paper, we are the first to study adversarial machine learning attacks against the complete visual perception pipeline in autonomous driving, and discover a novel attack technique, tracker hijacking, that can effectively fool MOT using AEs on object detection. Using our technique, successful AEs on as few as one single frame can move an existing object in to or out of the headway of an autonomous vehicle to cause potential safety hazards. We perform evaluation using the Berkeley Deep Drive dataset and find that on average when 3 frames are attacked, our attack can have a nearly 100% success rate while attacks that blindly target object detection only have up to 25%.
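The robustness argument in this abstract comes from how detection-to-track association works: a track survives a few missed detections before it is deleted, so fooling the detector on a single frame barely moves the trajectory. The sketch below illustrates this with greedy IoU matching standing in for the Hungarian algorithm; the thresholds and track representation are illustrative assumptions, and `iou` is the helper from the evaluation sketch above.

    # A minimal sketch of IoU-based detection-to-track association.
    MAX_MISSES = 3  # frames a track survives without a matching detection

    def associate(tracks, detections, iou_thresh=0.3):
        """Greedily match detections to tracks; a fooled detector only
        increments a miss counter, so single-frame AEs rarely kill a track."""
        unmatched = list(detections)
        for track in tracks:
            best = max(unmatched, key=lambda d: iou(track["box"], d), default=None)
            if best is not None and iou(track["box"], best) >= iou_thresh:
                track["box"], track["misses"] = best, 0
                unmatched.remove(best)
            else:
                track["misses"] += 1   # detection error on this frame
        tracks[:] = [t for t in tracks if t["misses"] <= MAX_MISSES]
        tracks.extend({"box": d, "misses": 0} for d in unmatched)
        return tracks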
The most common paradigm for vision-based multi-object tracking is tracking-by-detection, due to the availability of reliable detectors for several important object categories such as cars and pedestrians. However, future mobile systems will need a capability to cope with rich human-made environments, in which obtaining detectors for every possible object category would be infeasible. In this paper, we propose a model-free multi-object tracking approach that uses a category-agnostic image segmentation method to track objects. We present an efficient segmentation mask-based tracker which associates pixel-precise masks reported by the segmentation. Our approach can utilize semantic information whenever it is available for classifying objects at the track level, while retaining the capability to track generic unknown objects in the absence of such information. We demonstrate experimentally that our approach achieves performance comparable to state-of-the-art tracking-by-detection methods for popular object categories such as cars and pedestrians. Additionally, we show that the proposed method can discover and robustly track a large variety of other objects.
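The category-agnostic cue this approach builds on is pixel-precise mask overlap across frames. Here is a minimal sketch of mask-IoU association, assuming masks are boolean NumPy arrays; the threshold and matching scheme are illustrative, not the paper's actual tracker.

    # A minimal sketch of linking segmentation masks between frames by mask IoU.
    import numpy as np

    def mask_iou(m1, m2):
        """IoU of two boolean masks of the same shape."""
        inter = np.logical_and(m1, m2).sum()
        union = np.logical_or(m1, m2).sum()
        return inter / union if union > 0 else 0.0

    def match_masks(prev_masks, curr_masks, thresh=0.4):
        """Link each previous-frame mask to its best-overlapping current mask."""
        links = {}
        for i, pm in enumerate(prev_masks):
            scores = [mask_iou(pm, cm) for cm in curr_masks]
            if scores and max(scores) >= thresh:
                links[i] = int(np.argmax(scores))
        return links  # {previous mask index: current mask index}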
Researchers and robotic development groups have recently started paying special attention to autonomous mobile robot navigation in indoor environments using vision sensors. A camera serves as the sensor providing the data required for robot navigation and object detection. The aim of the project is to construct a mobile robot with an integrated vision system that uses a webcam to locate, track and follow a moving object. To achieve this task, multiple image processing algorithms are implemented and processed in real time. A mini-laptop collects the necessary data and sends it to a PIC microcontroller, which converts the processed data into the commands that give the robot its proper orientation. Such a vision system can be utilized in object recognition for robot control applications. The results demonstrate that the proposed mobile robot can be successfully operated through a webcam that detects the object and distinguishes a tennis ball based on its color and shape.
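Detecting a tennis ball by color and shape, as described above, is classically done with an HSV color threshold followed by a circle detector. Below is a minimal OpenCV sketch; the HSV range for a yellow-green ball and the Hough parameters are assumptions that would need tuning per camera and lighting, and the function name is hypothetical.

    # A minimal sketch of color-and-shape tennis ball detection with OpenCV.
    import cv2
    import numpy as np

    def detect_tennis_ball(frame_bgr):
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        # Rough yellow-green band (assumption; tune for your setup).
        mask = cv2.inRange(hsv, (25, 80, 80), (45, 255, 255))
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        # Circle (shape) check on the color mask.
        circles = cv2.HoughCircles(mask, cv2.HOUGH_GRADIENT, dp=1.2, minDist=50,
                                   param1=100, param2=15, minRadius=5, maxRadius=100)
        if circles is None:
            return None
        x, y, r = circles[0][0]   # strongest circle: center and radius
        return int(x), int(y), int(r)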
Small object tracking is an increasingly important task which, however, has been largely unexplored in computer vision. The great challenges stem from the facts that: 1) small objects show extremely vague and variable appearances, and 2) they tend to be lost more easily than normal-sized ones due to the shaking of the lens. In this paper, we propose a novel aggregation signature suitable for small object tracking, especially aimed at the challenge of sudden and large drift. We make three-fold contributions in this work. First, technically, we propose a new descriptor, named aggregation signature, based on saliency, able to represent highly distinctive features for small objects. Second, theoretically, we prove that the proposed signature matches the foreground object more accurately with high probability. Third, experimentally, the aggregation signature achieves high performance on multiple datasets, outperforming the state-of-the-art methods by large margins. Moreover, we contribute two newly collected benchmark datasets, i.e., small90 and small112, for visually small object tracking. The datasets will be available at https://github.com/bczhangbczhang/.
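The descriptor above is saliency-based; as a generic point of reference (explicitly not the paper's aggregation signature), here is a minimal sketch of the classic spectral-residual saliency map (Hou & Zhang, 2007), a common baseline for highlighting small, low-texture targets. The resize size and smoothing parameters are illustrative.

    # A minimal spectral-residual saliency sketch (generic baseline, not the
    # aggregation signature proposed in the paper).
    import cv2
    import numpy as np

    def spectral_residual_saliency(gray, size=64):
        small = cv2.resize(gray, (size, size)).astype(np.float64)
        spectrum = np.fft.fft2(small)
        log_amp = np.log1p(np.abs(spectrum))
        phase = np.angle(spectrum)
        residual = log_amp - cv2.blur(log_amp, (3, 3))   # spectral residual
        back = np.fft.ifft2(np.exp(residual + 1j * phase))
        saliency = cv2.GaussianBlur(np.abs(back) ** 2, (9, 9), 2.5)
        return cv2.normalize(saliency, None, 0, 1, cv2.NORM_MINMAX)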
