Recently, parametric mappings have emerged as highly effective surface representations, yielding low reconstruction error. In particular, the latest works represent the target shape as an atlas of multiple mappings, which can closely encode object parts. Atlas representations, however, suffer from one major drawback: the individual mappings are not guaranteed to be consistent, which results in holes or jagged surface areas in the reconstructed shape. We introduce an approach that explicitly encourages global consistency of the local mappings. To this end, we introduce two novel loss terms. The first term exploits the surface normals and requires that they remain locally consistent when estimated within and across the individual mappings. The second term further encourages a better spatial configuration of the mappings by minimizing a novel stitching error. We show on standard benchmarks that the normal consistency requirement outperforms the baselines quantitatively, while enforcing better stitching leads to much better visual quality of the reconstructed objects compared to the state of the art.
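As an illustration of the first loss term, the following is a minimal sketch, assuming two chart networks chart_a and chart_b that map 2D UV samples to 3D; normals are estimated by finite differences within each mapping and penalized for disagreeing across mappings. The helper names and the finite-difference estimator are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of a normal-consistency loss between two
# chart mappings f_i, f_j: [0,1]^2 -> R^3. Names (chart_a, chart_b, uv, eps) are
# illustrative assumptions; normals are estimated by finite differences on the UV grid.
import torch

def chart_normals(chart, uv, eps=1e-3):
    """Approximate unit normals of a chart at UV samples via finite differences."""
    du = torch.tensor([eps, 0.0])
    dv = torch.tensor([0.0, eps])
    p = chart(uv)
    tu = (chart(uv + du) - p) / eps            # tangent along u
    tv = (chart(uv + dv) - p) / eps            # tangent along v
    n = torch.cross(tu, tv, dim=-1)
    return torch.nn.functional.normalize(n, dim=-1), p

def normal_consistency_loss(chart_a, chart_b, uv):
    """Penalize disagreement between normals of two charts at nearby 3D points."""
    na, pa = chart_normals(chart_a, uv)
    nb, pb = chart_normals(chart_b, uv)
    # match each point of chart A to its nearest point on chart B
    d = torch.cdist(pa, pb)                    # (N, N) pairwise distances
    idx = d.argmin(dim=1)
    cos = (na * nb[idx]).sum(dim=-1)
    # 1 - |cos| is invariant to normal orientation flips between charts
    return (1.0 - cos.abs()).mean()

if __name__ == "__main__":
    mlp = lambda: torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(),
                                      torch.nn.Linear(64, 3))
    uv = torch.rand(256, 2)
    print(normal_consistency_loss(mlp(), mlp(), uv).item())
```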
We introduce a novel RGB-D patch descriptor designed for detecting coplanar surfaces in SLAM reconstruction. The core of our method is a deep convolutional neural network that takes the RGB, depth, and normal information of a planar patch in an image and outputs a descriptor that can be used to find coplanar patches in other images. We train the network on 10 million triplets of coplanar and non-coplanar patches, and evaluate on a new coplanarity benchmark created from commodity RGB-D scans. Experiments show that our learned descriptor outperforms alternatives extended for this new task by a significant margin. In addition, we demonstrate the benefits of coplanarity matching in a robust RGB-D reconstruction formulation. We find that coplanarity constraints detected with our method are sufficient to obtain reconstruction results comparable to state-of-the-art frameworks on most scenes, and that they outperform other methods on standard benchmarks when combined with a simple keypoint method.
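The abstract describes triplet training of a patch descriptor; the sketch below shows one plausible setup, assuming a 7-channel input (RGB + depth + normals) and a small illustrative network, neither of which is the paper's actual architecture.

```python
# Hedged sketch of the kind of triplet setup the abstract describes; the channel
# layout (3 RGB + 1 depth + 3 normal = 7), patch size, and network are assumptions.
import torch
import torch.nn as nn

class PatchDescriptor(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(7, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim))

    def forward(self, x):                       # x: (B, 7, H, W)
        return nn.functional.normalize(self.net(x), dim=-1)

model = PatchDescriptor()
loss_fn = nn.TripletMarginLoss(margin=0.3)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# anchor/positive are coplanar patches, negative is non-coplanar (random stand-ins here)
anchor, positive, negative = (torch.randn(8, 7, 64, 64) for _ in range(3))
loss = loss_fn(model(anchor), model(positive), model(negative))
loss.backward(); opt.step()
```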
Our aim was to enhance the visual quality and quantitative accuracy of dynamic positron emission tomography (PET) uptake images through improved image reconstruction, using sophisticated sparse penalty models that incorporate both 2D spatial + 1D temporal (3DT) information. We developed two new 3DT PET reconstruction algorithms, incorporating different temporal and spatial penalties based on the discrete cosine transform (DCT) with patches and the tensor nuclear norm (TNN) with patches, and compared them to frame-by-frame methods: conventional 2D ordered subsets expectation maximization (OSEM) with post-filtering, 2D-DCT, and 2D-TNN. A 3DT brain phantom with kinetic uptake (two-tissue model) and a moving 3DT cardiac/lung phantom were simulated and reconstructed. For the cardiac/lung phantom, an additional cardiac-gated 2D-OSEM set was reconstructed. The structural similarity index (SSIM) and relative root mean squared error (rRMSE) with respect to the ground truth were investigated. The image-derived left ventricular (LV) volume for the cardiac/lung images was found by region growing, and parametric images of the brain phantom were calculated. For the cardiac/lung phantom, 3DT-TNN yielded optimal images, and 3DT-DCT was best for the brain phantom. The optimal LV volume from the 3DT-TNN images was on average 11 and 55 percentage points closer to the true value than cardiac-gated 2D-OSEM and 2D-OSEM, respectively. Compared to 2D-OSEM, parametric images based on 3DT-DCT images generally had smaller bias and higher SSIM. Our novel methods, which incorporate both 2D spatial and 1D temporal penalties, produced dynamic PET images of higher quality than conventional 2D methods without the need for post-filtering. Breathing and cardiac motion were captured simultaneously without respiratory or cardiac gating. LV volumes were better recovered, and the subsequently fitted parametric images were generally less biased and of higher quality.
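The two reported metrics (SSIM and rRMSE relative to ground truth) could be evaluated frame by frame along the lines of the following sketch; the array shapes and data-range choice are assumptions for illustration, not the study's evaluation code.

```python
# Hedged sketch of the evaluation metrics named in the abstract (SSIM and rRMSE),
# computed frame by frame for a dynamic image series.
import numpy as np
from skimage.metrics import structural_similarity

def rrmse(recon, truth):
    """Relative root mean squared error with respect to the ground-truth frame."""
    return np.sqrt(np.mean((recon - truth) ** 2)) / (np.sqrt(np.mean(truth ** 2)) + 1e-12)

def evaluate_series(recon, truth):
    """recon, truth: (T, H, W) dynamic series; returns per-frame SSIM and rRMSE."""
    drange = truth.max() - truth.min()
    ssim = [structural_similarity(r, t, data_range=drange) for r, t in zip(recon, truth)]
    err = [rrmse(r, t) for r, t in zip(recon, truth)]
    return np.array(ssim), np.array(err)

truth = np.random.rand(10, 64, 64)                 # stand-in ground-truth phantom frames
recon = truth + 0.05 * np.random.randn(10, 64, 64) # stand-in reconstructed frames
ssim, err = evaluate_series(recon, truth)
print(ssim.mean(), err.mean())
```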
Despite the long history of image and video stitching research, existing academic and commercial solutions still produce strong artifacts. In this work, we propose a wide-baseline video stitching algorithm for linear camera arrays that is temporally stable and tolerant to strong parallax. Our key insight is that stitching can be cast as a problem of learning a smooth spatial interpolation between the input videos. To solve this problem, inspired by pushbroom cameras, we introduce a fast pushbroom interpolation layer and propose a novel pushbroom stitching network, which learns a dense flow field to smoothly align the multiple input videos for spatial interpolation. Our approach outperforms the state-of-the-art by a significant margin, as we show with a user study, and has immediate applications in many areas such as virtual reality, immersive telepresence, autonomous driving, and video surveillance.
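The core operation the stitching network relies on, warping an input frame with a learned dense flow field for spatial interpolation, can be sketched as below; the function name and normalization convention are assumptions, not the paper's implementation.

```python
# Minimal sketch (assumed names, not the paper's network) of backward-warping a video
# frame with a dense flow field, the basic operation behind flow-based alignment.
import torch
import torch.nn.functional as F

def warp_with_flow(frame, flow):
    """frame: (B, C, H, W); flow: (B, 2, H, W) in pixels. Backward-warps the frame."""
    b, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + flow   # sample coords
    # normalize to [-1, 1] as required by grid_sample
    grid_x = 2.0 * grid[:, 0] / (w - 1) - 1.0
    grid_y = 2.0 * grid[:, 1] / (h - 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)                      # (B, H, W, 2)
    return F.grid_sample(frame, grid, align_corners=True)

frame = torch.rand(1, 3, 120, 160)
flow = torch.zeros(1, 2, 120, 160)       # zero flow returns the frame unchanged
print(torch.allclose(warp_with_flow(frame, flow), frame, atol=1e-5))
```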
The reconstruction of a discrete surface from a point cloud is a fundamental geometry processing problem that has been studied for decades, and many methods have been developed for it. We propose the use of a deep neural network as a geometric prior for surface reconstruction. Specifically, we overfit a neural network representing a local chart parameterization to part of an input point cloud, using the Wasserstein distance as a measure of approximation. By jointly fitting many such networks to overlapping parts of the point cloud, while enforcing a consistency condition, we compute a manifold atlas. By sampling this atlas, we can produce a dense reconstruction of the surface approximating the input cloud. The entire procedure requires no training data or explicit regularization, yet we show that it performs remarkably well: it avoids typical overfitting artifacts while closely approximating sharp features. We experimentally show that this geometric prior produces good results for both man-made objects containing sharp features and smoother organic objects, as well as for noisy inputs. We compare our method with a number of well-known reconstruction methods on a standard surface reconstruction benchmark.
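The step of overfitting a chart network to a local patch can be sketched as follows; note that the Chamfer distance used here is a simpler stand-in for the Wasserstein distance named in the abstract, and the network size and sample counts are illustrative assumptions.

```python
# Sketch of the "overfit a chart to a local patch" idea; Chamfer distance is a
# stand-in for the Wasserstein/approximate-EMD loss, and all sizes are illustrative.
import torch

def chamfer(a, b):
    """Symmetric Chamfer distance between point sets a: (N, 3) and b: (M, 3)."""
    d = torch.cdist(a, b)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

def fit_chart(patch, steps=500, n_samples=512):
    """Overfit an MLP chart f: [0,1]^2 -> R^3 to one point-cloud patch."""
    chart = torch.nn.Sequential(torch.nn.Linear(2, 128), torch.nn.Softplus(),
                                torch.nn.Linear(128, 128), torch.nn.Softplus(),
                                torch.nn.Linear(128, 3))
    opt = torch.optim.Adam(chart.parameters(), lr=1e-3)
    for _ in range(steps):
        uv = torch.rand(n_samples, 2)
        loss = chamfer(chart(uv), patch)
        opt.zero_grad(); loss.backward(); opt.step()
    return chart

patch = torch.rand(256, 3)            # stand-in for one local patch of the input cloud
chart = fit_chart(patch, steps=50)
dense = chart(torch.rand(4096, 2))    # sample the fitted chart densely
```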
We present a method for reconstructing triangle meshes from point clouds. Existing learning-based methods for mesh reconstruction mostly generate triangles individually, making it hard to create manifold meshes. We leverage the properties of 2D Delaunay triangulations to construct a mesh from manifold surface elements. Our method first estimates local geodesic neighborhoods around each point. We then project these neighborhoods to 2D using a learned logarithmic map. A Delaunay triangulation in this 2D domain is guaranteed to produce a manifold patch, which we call a Delaunay surface element. We synchronize the local 2D projections of neighboring elements to maximize the manifoldness of the reconstructed mesh. Our results show that our reconstructed meshes achieve better overall manifoldness than those of current methods for reconstructing meshes with arbitrary topology. Our code, data, and pretrained models are available online: https://github.com/mrakotosaon/dse-meshing
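A hedged sketch of constructing one Delaunay surface element is shown below; a PCA tangent-plane projection stands in for the learned logarithmic map, and the k-nearest-neighbor neighborhood size is an illustrative assumption rather than the paper's geodesic estimate.

```python
# Hedged sketch of building one "Delaunay surface element": project a local
# neighborhood to 2D and triangulate it. PCA replaces the learned log map here.
import numpy as np
from scipy.spatial import Delaunay, cKDTree

def surface_element(points, center_idx, k=20):
    """Return local triangles (as point indices) around points[center_idx]."""
    tree = cKDTree(points)
    _, nbr = tree.query(points[center_idx], k=k)    # approximate local neighborhood
    local = points[nbr] - points[nbr].mean(axis=0)
    # PCA: the two leading directions span the tangent plane used for the 2D projection
    _, _, vt = np.linalg.svd(local, full_matrices=False)
    uv = local @ vt[:2].T                           # 2D coordinates of the neighborhood
    tri = Delaunay(uv)                              # 2D Delaunay => manifold patch
    return nbr[tri.simplices]                       # triangles indexed into the full cloud

pts = np.random.rand(500, 3)
tris = surface_element(pts, center_idx=0)
print(tris.shape)                                   # (num_triangles, 3)
```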