ﻻ يوجد ملخص باللغة العربية
Detecting and segmenting individual objects, regardless of their category, is crucial for many applications such as action detection or robotic interaction. While this problem has been well-studied under the classic formulation of spatio-temporal grouping, state-of-the-art approaches do not make use of learning-based methods. To bridge this gap, we propose a simple learning-based approach for spatio-temporal grouping. Our approach leverages motion cues from optical flow as a bottom-up signal for separating objects from each other. Motion cues are then combined with appearance cues that provide a generic objectness prior for capturing the full extent of objects. We show that our approach outperforms all prior work on the benchmark FBMS dataset. One potential worry with learning-based methods is that they might overfit to the particular type of objects that they have been trained on. To address this concern, we propose two new benchmarks for generic, moving object detection, and show that our model matches top-down methods on common categories, while significantly out-performing both top-down and bottom-up methods on never-before-seen categories.
We consider move-making algorithms for energy minimization of multi-label Markov Random Fields (MRFs). Since this is not a tractable problem in general, a commonly used heuristic is to minimize over subsets of labels and variables in an iterative pro
Most recent transformer-based models show impressive performance on vision tasks, even better than Convolution Neural Networks (CNN). In this work, we present a novel, flexible, and effective transformer-based model for high-quality instance segmenta
In this paper, we tackle video panoptic segmentation, a task that requires assigning semantic classes and track identities to all pixels in a video. To study this important problem in a setting that requires a continuous interpretation of sensory dat
In this paper, we propose an end-to-end framework for instance segmentation. Based on the recently introduced DETR [1], our method, termed SOLQ, segments objects by learning unified queries. In SOLQ, each query represents one object and has multiple
We contribute to approximate algorithms for the quadratic assignment problem also known as graph matching. Inspired by the success of the fusion moves technique developed for multilabel discrete Markov random fields, we investigate its applicability