Random mesh projectors for inverse problems

46 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Sidharth Gupta

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Sidharth Gupta - Konik Kothari - Maarten V. de Hoop

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We propose a new learning-based approach to solve ill-posed inverse problems in imaging. We address the case where ground truth training samples are rare and the problem is severely ill-posed - both because of the underlying physics and because we can only get few measurements. This setting is common in geophysical imaging and remote sensing. We show that in this case the common approach to directly learn the mapping from the measured data to the reconstruction becomes unstable. Instead, we propose to first learn an ensemble of simpler mappings from the data to projections of the unknown image into random piecewise-constant subspaces. We then combine the projections to form a final reconstruction by solving a deconvolution-like problem. We show experimentally that the proposed method is more robust to measurement noise and corruptions not seen during training than a directly learned inverse.

قيم البحث

94 - Rakesh Shrestha , Zhiwen Fan , Qingkun Su 2020

Deep learning based 3D shape generation methods generally utilize latent features extracted from color images to encode the semantics of objects and guide the shape generation process. These color image semantics only implicitly encode 3D information , potentially limiting the accuracy of the generated shapes. In this paper we propose a multi-view mesh generation method which incorporates geometry information explicitly by using the features from intermediate depth representations of multi-view stereo and regularizing the 3D shapes against these depth images. First, our system predicts a coarse 3D volume from the color images by probabilistically merging voxel occupancy grids from the prediction of individual views. Then the depth images from multi-view stereo along with the rendered depth images of the coarse shape are used as a contrastive input whose features guide the refinement of the coarse shape through a series of graph convolution networks. Notably, we achieve superior results than state-of-the-art multi-view shape generation methods with 34% decrease in Chamfer distance to ground truth and 14% increase in F1-score on ShapeNet dataset.Our source code is available at https://git.io/Jmalg

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

Multi-Domain Image Completion for Random Missing Input Data

325 - Liyue Shen , Wentao Zhu , Xiaosong Wang 2020

Multi-domain data are widely leveraged in vision applications taking advantage of complementary information from different modalities, e.g., brain tumor segmentation from multi-parametric magnetic resonance imaging (MRI). However, due to possible dat a corruption and different imaging protocols, the availability of images for each domain could vary amongst multiple data sources in practice, which makes it challenging to build a universal model with a varied set of input data. To tackle this problem, we propose a general approach to complete the random missing domain(s) data in real applications. Specifically, we develop a novel multi-domain image completion method that utilizes a generative adversarial network (GAN) with a representational disentanglement scheme to extract shared skeleton encoding and separate flesh encoding across multiple domains. We further illustrate that the learned representation in multi-domain image completion could be leveraged for high-level tasks, e.g., segmentation, by introducing a unified framework consisting of image completion and segmentation with a shared content encoder. The experiments demonstrate consistent performance improvement on three datasets for brain tumor segmentation, prostate segmentation, and facial expression image completion respectively.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

Data-consistent neural networks for solving nonlinear inverse problems

109 - Yoeri E. Boink , Markus Haltmeier , Sean Holman 2020

Data assisted reconstruction algorithms, incorporating trained neural networks, are a novel paradigm for solving inverse problems. One approach is to first apply a classical reconstruction method and then apply a neural network to improve its solutio n. Empirical evidence shows that such two-step methods provide high-quality reconstructions, but they lack a convergence analysis. In this paper we formalize the use of such two-step approaches with classical regularization theory. We propose data-consistent neural networks that we combine with classical regularization methods. This yields a data-driven regularization method for which we provide a full convergence analysis with respect to noise. Numerical simulations show that compared to standard two-step deep learning methods, our approach provides better stability with respect to structural changes in the test set, while performing similarly on test data similar to the training set. Our method provides a stable solution of inverse problems that exploits both the known nonlinear forward model as well as the desired solution manifold from data.

التحليل العددي التحليل العددي معالجة الصور والفيديو

Bilevel Integrative Optimization for Ill-posed Inverse Problems

71 - Risheng Liu , Long Ma , Xiaoming Yuan 2019

Classical optimization techniques often formulate the feasibility of the problems as set, equality or inequality constraints. However, explicitly designing these constraints is indeed challenging for complex real-world applications and too strict con straints may even lead to intractable optimization problems. On the other hand, it is still hard to incorporate data-dependent information into conventional numerical iterations. To partially address the above limits and inspired by the leader-follower gaming perspective, this work first introduces a bilevel-type formulation to jointly investigate the feasibility and optimality of nonconvex and nonsmooth optimization problems. Then we develop an algorithmic framework to couple forward-backward proximal computations to optimize our established bilevel leader-follower model. We prove its convergence and estimate the convergence rate. Furthermore, a learning-based extension is developed, in which we establish an unrolling strategy to incorporate data-dependent network architectures into our iterations. Fortunately, it can be proved that by introducing some mild checking conditions, all our original convergence results can still be preserved for this learnable extension. As a nontrivial byproduct, we demonstrate how to apply this ensemble-like methodology to address different low-level vision tasks. Extensive experiments verify the theoretical results and show the advantages of our method against existing state-of-the-art approaches.

الرؤية الحاسوبية وتمييز الأنماط التحسين والتحكم

Space-Time Correspondence as a Contrastive Random Walk

201 - Allan Jabri , Andrew Owens , Alexei A. Efros 2020

This paper proposes a simple self-supervised approach for learning a representation for visual correspondence from raw video. We cast correspondence as prediction of links in a space-time graph constructed from video. In this graph, the nodes are pat ches sampled from each frame, and nodes adjacent in time can share a directed edge. We learn a representation in which pairwise similarity defines transition probability of a random walk, so that long-range correspondence is computed as a walk along the graph. We optimize the representation to place high probability along paths of similarity. Targets for learning are formed without supervision, by cycle-consistency: the objective is to maximize the likelihood of returning to the initial node when walking along a graph constructed from a palindrome of frames. Thus, a single path-level constraint implicitly supervises chains of intermediate comparisons. When used as a similarity metric without adaptation, the learned representation outperforms the self-supervised state-of-the-art on label propagation tasks involving objects, semantic parts, and pose. Moreover, we demonstrate that a technique we call edge dropout, as well as self-supervised adaptation at test-time, further improve transfer for object-centric correspondence.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو