
Fast Piecewise-Affine Motion Estimation Without Segmentation

Added by Denis Fortun
Publication date: 2018
Language: English





Current algorithmic approaches for piecewise-affine motion estimation are based on alternating motion segmentation and estimation. We propose a new method to estimate piecewise-affine motion fields directly, without intermediate segmentation. To this end, we reformulate the problem by imposing piecewise constancy of the parameter field, and derive a specific proximal splitting optimization scheme. A key component of our framework is an efficient one-dimensional piecewise-affine estimator for vector-valued signals. The first advantage of our approach over segmentation-based methods is that it requires no initialization. The second is a lower computational cost that is independent of the complexity of the motion field. Beyond these features, we demonstrate accuracy competitive with other piecewise-parametric methods on standard evaluation benchmarks. Our new regularization scheme also outperforms the more standard use of total variation and total generalized variation.
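The core building block named in the abstract, a one-dimensional piecewise-affine estimator for vector-valued signals, can be illustrated with the classical segmented least-squares dynamic program. The sketch below is a minimal illustration of that 1D fitting problem under a per-segment penalty, not the proximal splitting estimator of the paper; the function name `piecewise_affine_fit_1d` and the quadratic-cost formulation are assumptions made for this example.

```python
import numpy as np

def piecewise_affine_fit_1d(y, penalty):
    """Fit a piecewise-affine model to a vector-valued 1D signal y (N x D)
    by segmented least squares (dynamic programming over breakpoints).
    `penalty` trades data fit against the number of segments.
    Illustrative only: the paper uses a dedicated proximal estimator."""
    y = np.asarray(y, dtype=float)
    if y.ndim == 1:
        y = y[:, None]
    n = y.shape[0]
    x = np.arange(n, dtype=float)

    # err[i, j]: least-squares error of one affine segment over samples i..j
    err = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            A = np.stack([x[i:j + 1], np.ones(j - i + 1)], axis=1)
            _, res, *_ = np.linalg.lstsq(A, y[i:j + 1], rcond=None)
            err[i, j] = res.sum() if res.size else 0.0

    # Bellman recursion over the start index of the last segment
    best = np.full(n, np.inf)
    cut = np.zeros(n, dtype=int)
    for j in range(n):
        for i in range(j + 1):
            c = (best[i - 1] if i > 0 else 0.0) + err[i, j] + penalty
            if c < best[j]:
                best[j], cut[j] = c, i

    # Backtrack the optimal segmentation: list of (start, end) index pairs
    segments, j = [], n - 1
    while j >= 0:
        segments.append((cut[j], j))
        j = cut[j] - 1
    return segments[::-1]
```

On a noisy signal with a few affine pieces, setting `penalty` to a few times the noise variance recovers the breakpoints; the quadratic cost in the number of samples is what a specialized estimator like the paper's avoids.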



Related research

Recent geometric methods need reliable estimates of 3D motion parameters to obtain an accurate dense depth map of a complex dynamic scene from monocular images [kumar2017monocular, ranftl2016dense]. In general, estimating precise relative 3D motion parameters, and validating their accuracy from image data, is a challenging task. In this work, we propose an alternative approach that circumvents the 3D motion estimation requirement to obtain a dense depth map of a dynamic scene. Given per-pixel optical flow correspondences between two consecutive frames and a sparse depth prior for the reference frame, we show that we can effectively recover the dense depth map for the successive frames without solving for 3D motion parameters. Our method assumes a piecewise planar model of the dynamic scene, which undergoes rigid transformation locally and as-rigid-as-possible transformation globally between two successive frames. Under this assumption, we can avoid the explicit estimation of 3D rotation and translation when estimating scene depth. In essence, our formulation provides an unconventional way to recover the dense depth map of a complex dynamic scene that is incremental and motion-free in nature. Our proposed method makes no object-level or other high-level prior assumptions about the dynamic scene; as a result, it is applicable to a wide range of scenarios. Experimental results on benchmark datasets show the competence of our approach over multiple frames.
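The as-rigid-as-possible constraint this abstract relies on can be written down directly: for neighboring pixels matched across frames by optical flow, the distances between their back-projected 3D points should be preserved, which couples the unknown depths without any rotation or translation appearing in the objective. Below is a minimal sketch of such a residual, assuming known intrinsics `K`, flow-displaced pixel positions `uv2`, and a neighborhood edge list; all function names are hypothetical, and the actual method additionally enforces local planarity.

```python
import numpy as np

def backproject(uv, depth, K):
    """Lift pixel coordinates uv (N x 2) with depths (N,) to 3D points
    in the camera frame using intrinsics K."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (uv[:, 0] - cx) / fx * depth
    y = (uv[:, 1] - cy) / fy * depth
    return np.stack([x, y, depth], axis=1)

def arap_residuals(d2, uv1, d1, uv2, K, edges):
    """As-rigid-as-possible residuals: for each neighboring pixel pair
    (i, j) in `edges`, the 3D distance in frame 2 should match the 3D
    distance in frame 1. uv2 are the flow-displaced positions of uv1;
    d2 is the unknown per-pixel depth in frame 2. No explicit 3D motion
    parameters appear anywhere in this objective."""
    X1 = backproject(uv1, d1, K)
    X2 = backproject(uv2, d2, K)
    i, j = edges[:, 0], edges[:, 1]
    return (np.linalg.norm(X2[i] - X2[j], axis=1)
            - np.linalg.norm(X1[i] - X1[j], axis=1))
```

These residuals could be minimized over `d2` with a generic solver such as `scipy.optimize.least_squares`, with the sparse depth prior anchoring the overall scale.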
Hongjun Jia, Li Zhang (2008)
In block-matching motion estimation (BMME), the search patterns have a significant impact on an algorithm's performance, both in search speed and in search quality. The search pattern should be designed to fit the motion vector probability (MVP) distribution characteristics of real-world sequences. In this paper, we build a directional model of the MVP distribution to describe more exactly the directional-center-biased characteristic of the MVP distribution and the directional characteristics of the conditional MVP distribution, based on detailed statistics of the motion vectors of eighteen popular sequences. Three directional search patterns are first designed by exploiting these directional characteristics; they are the smallest search patterns among the popular ones. We then propose a fast BMME algorithm, called directional cross diamond search (DCDS), that uses the horizontal cross search pattern as the initial step and the horizontal/vertical diamond search pattern as the subsequent step. The DCDS algorithm can obtain the motion vector with fewer search points than CDS, DS, or HEXBS while maintaining similar or even better search quality. The speedup of DCDS over CDS or DS can be up to 54.9%. Simulation results show that DCDS is efficient, effective, and robust, and that it consistently searches faster on different sequences than other fast block-matching algorithms in common use.
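The two-stage idea, a wide horizontal cross for the initial step followed by a small diamond refinement, can be sketched as a generic SAD-based block-matching search. The code below is an illustrative stand-in, not the published DCDS patterns; the exact cross and horizontal/vertical diamond shapes and stopping rules differ in the paper.

```python
import numpy as np

def sad(block, ref, y, x):
    """Sum of absolute differences between a block and a reference window."""
    h, w = block.shape
    return np.abs(ref[y:y + h, x:x + w].astype(int) - block.astype(int)).sum()

def pattern_search(block, ref, y0, x0, search_range=7):
    """Two-stage pattern search: a horizontal cross (wide in x, narrow in y)
    as the initial step, then a small diamond refined until no improvement.
    Illustrative stand-in for the DCDS search strategy."""
    H, W = ref.shape
    h, w = block.shape

    def cost(y, x):
        if 0 <= y <= H - h and 0 <= x <= W - w:
            return sad(block, ref, y, x)
        return np.inf

    # Stage 1: horizontal cross pattern around the starting point
    best, best_cost = (y0, x0), cost(y0, x0)
    for dx in range(-search_range, search_range + 1):
        for dy in (-1, 0, 1):
            c = cost(y0 + dy, x0 + dx)
            if c < best_cost:
                best, best_cost = (y0 + dy, x0 + dx), c

    # Stage 2: small diamond pattern, iterated until no move helps
    diamond = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    improved = True
    while improved:
        improved = False
        for dy, dx in diamond:
            c = cost(best[0] + dy, best[1] + dx)
            if c < best_cost:
                best, best_cost, improved = (best[0] + dy, best[1] + dx), c, True
    return best
```

The motion vector is `(best[0] - y0, best[1] - x0)`; the gain of directional patterns comes from spending most search points along the horizontally biased MVP distribution.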
We present MultiBodySync, a novel, end-to-end trainable multi-body motion segmentation and rigid registration framework for multiple input 3D point clouds. The two non-trivial challenges posed by this multi-scan, multi-body setting that we investigate are: (i) guaranteeing correspondence and segmentation consistency across multiple input point clouds capturing different spatial arrangements of bodies or body parts; and (ii) obtaining robust motion-based rigid body segmentation applicable to novel object categories. We propose an approach that addresses these issues by incorporating spectral synchronization into an iterative deep declarative network, so as to simultaneously recover consistent correspondences and motion segmentation. At the same time, by explicitly disentangling the correspondence and motion segmentation estimation modules, we achieve strong generalizability across different object categories. Our extensive evaluations demonstrate that our method is effective on various datasets, ranging from rigid parts of articulated objects to individually moving objects in a 3D scene, be they single-view or full point clouds.
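The spectral step builds on the standard spectral relaxation of motion segmentation: given a pairwise motion-affinity matrix over points, the bottom eigenvectors of the normalized graph Laplacian embed the points so that rigidly co-moving ones cluster together. Below is a minimal sketch of that spectral step alone, under the assumption of a precomputed symmetric affinity matrix; the affinity construction, the multi-scan synchronization, and the deep declarative wrapper of MultiBodySync are all omitted, and `spectral_motion_segmentation` is a name invented for this example.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def spectral_motion_segmentation(affinity, n_bodies):
    """Segment points into rigid bodies from a symmetric pairwise
    motion-affinity matrix via normalized spectral clustering.
    Sketch of the spectral step only, not the full MultiBodySync pipeline."""
    d = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    # Symmetrically normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    lap = np.eye(len(affinity)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)     # eigenvalues in ascending order
    emb = vecs[:, :n_bodies]          # embedding from the bottom eigenvectors
    emb /= np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12
    _, labels = kmeans2(emb, n_bodies, minit='++', seed=0)
    return labels
```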
This paper presents a new method, called FlexiCurve, for photo enhancement. Unlike most existing methods that perform image-to-image mapping, which requires expensive pixel-wise reconstruction, FlexiCurve takes an input image and estimates global curves to adjust it. The adjustment curves are specially designed for piecewise mapping, taking nonlinear adjustment and differentiability into account. To cope with the challenging and diverse illumination properties of real-world images, FlexiCurve is formulated as a multi-task framework that produces diverse estimations and the associated confidence maps. These estimations are adaptively fused to improve local enhancement of different regions. Thanks to the image-to-curve formulation, for an image of size 512×512×3, FlexiCurve only needs a lightweight network (150K trainable parameters) and has a fast inference speed (83 FPS on a single NVIDIA 2080Ti GPU). The proposed method improves efficiency without compromising enhancement quality or losing details of the original image. The method is also appealing in that it is not limited to paired training data, so it can flexibly learn rich enhancement styles from unpaired data. Extensive experiments demonstrate that our method achieves state-of-the-art performance on photo enhancement, quantitatively and qualitatively.
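The image-to-curve formulation is what makes the method cheap: instead of predicting a value per pixel, the network predicts a handful of curve knots, and the curve is applied globally. A toy version of applying such a piecewise-linear curve is shown below; the knot count, the name `apply_piecewise_curve`, and the fixed example curve are assumptions, whereas FlexiCurve's actual curves are predicted by a CNN and fused with confidence maps.

```python
import numpy as np

def apply_piecewise_curve(img, knots_y):
    """Apply a global piecewise-linear adjustment curve to an image in [0, 1].
    `knots_y` gives the curve values at evenly spaced knot positions;
    monotonically increasing knots keep the mapping invertible. A toy
    stand-in for FlexiCurve's learned piecewise curves."""
    knots_x = np.linspace(0.0, 1.0, len(knots_y))
    return np.interp(img, knots_x, knots_y)

# Example: brighten shadows while protecting highlights
curve = np.array([0.0, 0.3, 0.55, 0.75, 0.9, 1.0])  # 6 monotonic knots
img = np.random.rand(512, 512, 3)
out = apply_piecewise_curve(img, curve)
```

Since the adjustment is a lookup over a few knots rather than a per-pixel reconstruction, the cost of the mapping itself is negligible next to the small network that predicts the knots.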
Instance segmentation of planar regions in indoor scenes benefits visual SLAM and other applications, such as augmented reality (AR), where scene understanding is required. Existing methods built upon two-stage frameworks show satisfactory accuracy but are limited by low frame rates. In this work, we propose a real-time deep neural architecture that estimates piecewise planar regions from a single RGB image. Our model employs a variant of a fast single-stage CNN architecture to segment plane instances. Considering the particularity of the detected targets, we propose Fast Feature Non-maximum Suppression (FF-NMS) to reduce the suppression errors resulting from overlapping bounding boxes of planes. We also utilize a Residual Feature Augmentation module in the Feature Pyramid Network (FPN). Our method achieves significantly higher frame rates and comparable segmentation accuracy compared with two-stage methods. We automatically label over 70,000 images from the Stanford 2D-3D-Semantics dataset as ground truth. Moreover, we integrate our method with a state-of-the-art planar SLAM system and validate its benefits.
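For context, the sketch below shows the standard IoU-based non-maximum suppression that FF-NMS modifies. Plane instances produce heavily overlapping bounding boxes, which is exactly the regime where vanilla NMS suppresses true detections; the exact FF-NMS rule is not reproduced here.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Standard IoU-based non-maximum suppression over [x1, y1, x2, y2]
    boxes. Baseline only: FF-NMS replaces this rule to better handle the
    heavily overlapping boxes of planar regions."""
    order = scores.argsort()[::-1]   # indices by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the top box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(xx2 - xx1, 0) * np.maximum(yy2 - yy1, 0)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0])
                 * (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter + 1e-12)
        # Drop every remaining box that overlaps the kept one too much
        order = order[1:][iou <= iou_thresh]
    return keep
```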
