Scene model construction based on image rendering is an indispensable but challenging technique in computer vision and intelligent transportation systems. In this paper, we propose a framework for constructing 3D corridor-based road scene models. The framework consists of two successive stages: road detection and scene construction. Road detection is realized by a new superpixel Markov random field (MRF) algorithm. The data fidelity term in the MRF energy function is computed jointly from the color, texture and location features of each superpixel. The smoothness term is built on the interaction between spatio-temporally adjacent superpixels. In the subsequent scene construction stage, the foreground and background regions are modeled independently. Road detection experiments demonstrate that the proposed method outperforms state-of-the-art methods in both accuracy and speed. The scene construction experiments confirm that the proposed scene models achieve better correctness ratios and have the potential to support a range of applications.
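The energy structure described above (a per-superpixel data term plus a smoothness term over adjacent superpixels) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the exact form of either term or the solver, so the Potts-style smoothness penalty, the iterated conditional modes (ICM) minimizer, and all names and numbers below (`mrf_energy`, `icm`, `unary`, `edges`) are assumptions chosen for illustration. The data costs are taken as given inputs; in the paper they would come from the color, texture and location features.

```python
from typing import List, Tuple

# Hypothetical edge type: (superpixel i, superpixel j, smoothness weight).
# An edge may link neighbours within one frame or across adjacent frames.
Edge = Tuple[int, int, float]


def mrf_energy(labels: List[int], unary: List[List[float]], edges: List[Edge]) -> float:
    """Data term plus an assumed Potts smoothness term."""
    # Data term: cost of the label chosen for each superpixel.
    data = sum(unary[s][labels[s]] for s in range(len(labels)))
    # Smoothness term: pay the edge weight whenever two neighbours disagree.
    smooth = sum(w for i, j, w in edges if labels[i] != labels[j])
    return data + smooth


def icm(unary: List[List[float]], edges: List[Edge], max_sweeps: int = 10) -> List[int]:
    """Greedy local minimisation by iterated conditional modes (assumed solver)."""
    n, k = len(unary), len(unary[0])
    # Start from the labels that minimise the data term alone.
    labels = [min(range(k), key=lambda l: unary[s][l]) for s in range(n)]
    neighbours: List[List[Tuple[int, float]]] = [[] for _ in range(n)]
    for i, j, w in edges:
        neighbours[i].append((j, w))
        neighbours[j].append((i, w))
    for _ in range(max_sweeps):
        changed = False
        for s in range(n):
            # Local cost of each candidate label given the current neighbours.
            costs = [
                unary[s][l] + sum(w for t, w in neighbours[s] if labels[t] != l)
                for l in range(k)
            ]
            best = min(range(k), key=costs.__getitem__)
            if best != labels[s]:
                labels[s] = best
                changed = True
        if not changed:
            break
    return labels


# Toy example: four superpixels in a chain, label 0 = road, 1 = non-road.
unary = [[0.1, 0.9], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]
edges = [(0, 1, 0.5), (1, 2, 0.5), (2, 3, 0.5)]
initial = [min(range(2), key=lambda l: unary[s][l]) for s in range(4)]
refined = icm(unary, edges)
```

In this toy chain the third superpixel weakly prefers the non-road label on its data cost alone, but the smoothness term pulls it to agree with its road-labelled neighbours, which lowers the total energy. ICM only finds a local minimum; it is used here for brevity, not because the paper uses it.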
This work addresses the task of dense 3D reconstruction of a complex dynamic scene from images. The prevailing idea to solve this task is composed of a sequence of steps and is dependent on the success of several pipelines in its execution. To overco
With only bounding-box annotations in the spatial domain, existing video scene text detection (VSTD) benchmarks lack temporal relation of text instances among video frames, which hinders the development of video text-related applications. In this pap
Forecasting long-term human motion is a challenging task due to the non-linearity, multi-modality and inherent uncertainty in future trajectories. The underlying scene and past motion of agents can provide useful cues to predict their future motion.
In this work we introduce a time- and memory-efficient method for structured prediction that couples neuron decisions across both space and time. We show that we are able to perform exact and efficient inference on a densely connected spatio-temporal
We introduce TransformerFusion, a transformer-based 3D scene reconstruction approach. From an input monocular RGB video, the video frames are processed by a transformer network that fuses the observations into a volumetric feature grid representing t