Fast Training of Neural Lumigraph Representations using Meta Learning

122 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Alexander Bergman

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Alexander W. Bergman - Petr Kellnhofer - Gordon Wetzstein

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Novel view synthesis is a long-standing problem in machine learning and computer vision. Significant progress has recently been made in developing neural scene representations and rendering techniques that synthesize photorealistic images from arbitrary views. These representations, however, are extremely slow to train and often also slow to render. Inspired by neural variants of image-based rendering, we develop a new neural rendering approach with the goal of quickly learning a high-quality representation which can also be rendered in real-time. Our approach, MetaNLR++, accomplishes this by using a unique combination of a neural shape representation and 2D CNN-based image feature extraction, aggregation, and re-projection. To push representation convergence times down to minutes, we leverage meta learning to learn neural shape and image feature priors which accelerate training. The optimized shape and image features can then be extracted using traditional graphics techniques and rendered in real time. We show that MetaNLR++ achieves similar or better novel view synthesis results in a fraction of the time that competing methods require.

قيم البحث

124 - Homer Walke , R. Kenny Jones , Daniel Ritchie 2020

Inferring programs which generate 2D and 3D shapes is important for reverse engineering, editing, and more. Training such inference models is challenging due to the lack of paired (shape, program) data in most domains. A popular approach is to pre-tr ain a model on synthetic data and then fine-tune on real shapes using slow, unstable reinforcement learning. In this paper, we argue that self-training is a viable alternative for fine-tuning such models. Self-training is a semi-supervised learning paradigm where a model assigns pseudo-labels to unlabeled data, and then retrains with (data, pseudo-label) pairs as the new ground truth. We show that for constructive solid geometry and assembly-based modeling, self-training outperforms state-of-the-art reinforcement learning approaches. Additionally, shape program inference has a unique property that circumvents a potential downside of self-training (incorrect pseudo-label assignment): inferred programs are executable. For a given shape from our distribution of interest $mathbf{x}^*$ and its predicted program $mathbf{z}$, one can execute $mathbf{z}$ to obtain a shape $mathbf{x}$ and train on $(mathbf{z}, mathbf{x})$ pairs, rather than $(mathbf{z}, mathbf{x}^*)$ pairs. We term this procedure latent execution self training (LEST). We demonstrate that self training infers shape programs with higher shape reconstruction accuracy and converges significantly faster than reinforcement learning approaches, and in some domains, LEST can further improve this performance.

الرؤية الحاسوبية وتمييز الأنماط الرسم الحاسوبي التعلم الآلي

MetaSDF: Meta-learning Signed Distance Functions

116 - Vincent Sitzmann , Eric R. Chan , Richard Tucker 2020

Neural implicit shape representations are an emerging paradigm that offers many potential benefits over conventional discrete representations, including memory efficiency at a high spatial resolution. Generalizing across shapes with such neural impli cit representations amounts to learning priors over the respective function space and enables geometry reconstruction from partial or noisy observations. Existing generalization methods rely on conditioning a neural network on a low-dimensional latent code that is either regressed by an encoder or jointly optimized in the auto-decoder framework. Here, we formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task. We demonstrate that this approach performs on par with auto-decoder based approaches while being an order of magnitude faster at test-time inference. We further demonstrate that the proposed gradient-based method outperforms encoder-decoder based methods that leverage pooling-based set encoders.

الرؤية الحاسوبية وتمييز الأنماط الرسم الحاسوبي التعلم الآلي

Fast and Explicit Neural View Synthesis

105 - Pengsheng Guo , Miguel Angel Bautista , Alex Colburn 2021

We study the problem of novel view synthesis of a scene comprised of 3D objects. We propose a simple yet effective approach that is neither continuous nor implicit, challenging recent trends on view synthesis. We demonstrate that although continuous radiance field representations have gained a lot of attention due to their expressive power, our simple approach obtains comparable or even better novel view reconstruction quality comparing with state-of-the-art baselines while increasing rendering speed by over 400x. Our model is trained in a category-agnostic manner and does not require scene-specific optimization. Therefore, it is able to generalize novel view synthesis to object categories not seen during training. In addition, we show that with our simple formulation, we can use view synthesis as a self-supervision signal for efficient learning of 3D geometry without explicit 3D supervision.

الرؤية الحاسوبية وتمييز الأنماط الرسم الحاسوبي التعلم الآلي

Augmenting Implicit Neural Shape Representations with Explicit Deformation Fields

229 - Matan Atzmon , David Novotny , Andrea Vedaldi 2021

Implicit neural representation is a recent approach to learn shape collections as zero level-sets of neural networks, where each shape is represented by a latent code. So far, the focus has been shape reconstruction, while shape generalization was mo stly left to generic encoder-decoder or auto-decoder regularization. In this paper we advocate deformation-aware regularization for implicit neural representations, aiming at producing plausible deformations as latent code changes. The challenge is that implicit representations do not capture correspondences between different shapes, which makes it difficult to represent and regularize their deformations. Thus, we propose to pair the implicit representation of the shapes with an explicit, piecewise linear deformation field, learned as an auxiliary function. We demonstrate that, by regularizing these deformation fields, we can encourage the implicit neural representation to induce natural deformations in the learned shape space, such as as-rigid-as-possible deformations.

الرؤية الحاسوبية وتمييز الأنماط الرسم الحاسوبي التعلم الآلي

AutoInt: Automatic Integration for Fast Neural Volume Rendering

109 - David B. Lindell , Julien N. P. Martel , Gordon Wetzstein 2020

Numerical integration is a foundational technique in scientific computing and is at the core of many computer vision applications. Among these applications, neural volume rendering has recently been proposed as a new paradigm for view synthesis, achi eving photorealistic image quality. However, a fundamental obstacle to making these methods practical is the extreme computational and memory requirements caused by the required volume integrations along the rendered rays during training and inference. Millions of rays, each requiring hundreds of forward passes through a neural network are needed to approximate those integrations with Monte Carlo sampling. Here, we propose automatic integration, a new framework for learning efficient, closed-form solutions to integrals using coordinate-based neural networks. For training, we instantiate the computational graph corresponding to the derivative of the network. The graph is fitted to the signal to integrate. After optimization, we reassemble the graph to obtain a network that represents the antiderivative. By the fundamental theorem of calculus, this enables the calculation of any definite integral in two evaluations of the network. Applying this approach to neural rendering, we improve a tradeoff between rendering speed and image quality: improving render times by greater than 10 times with a tradeoff of slightly reduced image quality.

الرؤية الحاسوبية وتمييز الأنماط الرسم الحاسوبي التعلم الآلي