Research papers, master's theses, and doctoral dissertations published by Kyle Genova

Local Deep Implicit Functions for 3D Shape

138 - Kyle Genova, Forrester Cole, Avneesh Sud 2019

The goal of this project is to learn a 3D shape representation that enables accurate surface reconstruction, compact storage, efficient computation, consistency for similar shapes, generalization across diverse shape categories, and inference from depth camera observations. Towards this end, we introduce Local Deep Implicit Functions (LDIF), a 3D shape representation that decomposes space into a structured set of learned implicit functions. We provide networks that infer the space decomposition and local deep implicit functions from a 3D mesh or posed depth image. During experiments, we find that it provides 10.3 points higher surface reconstruction accuracy (F-Score) than the state-of-the-art (OccNet), while requiring fewer than 1 percent of the network parameters. Experiments on posed depth image completion and generalization to unseen classes show 15.8 and 17.8 point improvements over the state-of-the-art, while producing a structured 3D representation for each input with consistency across diverse shape collections.

Computer Vision and Pattern Recognition, Computer Graphics

Learning Shape Templates with Structured Implicit Functions

95 - Kyle Genova, Forrester Cole, Daniel Vlasic 2019

Template 3D shapes are useful for many tasks in graphics and vision, including fitting observation data, analyzing shape collections, and transferring shape attributes. Because of the variety of geometry and topology of real-world shapes, previous methods generally use a library of hand-made templates. In this paper, we investigate learning a general shape template from data. To allow for widely varying geometry and topology, we choose an implicit surface representation based on composition of local shape elements. While long known to computer graphics, this representation has not yet been explored in the context of machine learning for vision. We show that structured implicit functions are suitable for learning and allow a network to smoothly and simultaneously fit multiple classes of shapes. The learned shape template supports applications such as shape exploration, correspondence, abstraction, interpolation, and semantic segmentation from an RGB image.

Computer Vision and Pattern Recognition, Computer Graphics
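
The abstract above describes an implicit surface built by composing local shape elements, but it does not spell out the form of each element. As a rough illustration only, the sketch below assumes each element is a scaled axis-aligned Gaussian and that the shape is the region where the summed value falls below a threshold. The names `element_value`, `implicit_value`, `is_inside`, the negative-scale convention, and the `iso_level` default are all hypothetical choices made for this example, not details taken from the paper.

```python
import math

def element_value(x, center, radii, scale):
    """One local element: a scaled axis-aligned Gaussian (assumed form)."""
    exponent = sum((xd - cd) ** 2 / (2.0 * rd ** 2)
                   for xd, cd, rd in zip(x, center, radii))
    return scale * math.exp(-exponent)

def implicit_value(x, elements):
    """Global function: the sum of all local element contributions."""
    return sum(element_value(x, center, radii, scale)
               for center, radii, scale in elements)

def is_inside(x, elements, iso_level=-0.5):
    """Assumed convention: scales are negative, and a point is inside
    the shape when the summed value drops below iso_level."""
    return implicit_value(x, elements) < iso_level

# Two hypothetical elements: (center, per-axis radii, scale).
elements = [
    ((0.0, 0.0, 0.0), (0.5, 0.5, 0.5), -1.0),
    ((1.0, 0.0, 0.0), (0.3, 0.3, 0.3), -1.0),
]

print(is_inside((0.0, 0.0, 0.0), elements))
print(is_inside((10.0, 10.0, 10.0), elements))
```

Because every element has the same small parameter set (center, radii, scale), a network could in principle regress a fixed-length vector per shape, which is one plausible reading of why such a representation suits learning.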

Unsupervised Training for 3D Morphable Model Regression

131 - Kyle Genova, Forrester Cole, Aaron Maschinot 2018

We present a method for training a regression network from image pixels to 3D morphable model coordinates using only unlabeled photographs. The training loss is based on features from a facial recognition network, computed on-the-fly by rendering the predicted faces with a differentiable renderer. To make training from features feasible and avoid network fooling effects, we introduce three objectives: a batch distribution loss that encourages the output distribution to match the distribution of the morphable model, a loopback loss that ensures the network can correctly reinterpret its own output, and a multi-view identity loss that compares the features of the predicted 3D face and the input photograph from multiple viewing angles. We train a regression network using these objectives, a set of unlabeled photographs, and the morphable model itself, and demonstrate state-of-the-art results.

Computer Vision and Pattern Recognition
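
The batch distribution loss in the abstract above is described only as encouraging the network's output distribution to match that of the morphable model. One simple way to sketch that idea, assuming the morphable model coefficients follow a zero-mean, unit-variance distribution in each dimension, is to penalize the batch statistics for drifting from those values. The function name `batch_distribution_loss` and the exact penalty are assumptions for illustration; the paper may define the loss differently.

```python
from statistics import fmean, pvariance

def batch_distribution_loss(batch):
    """batch: list of predicted coefficient vectors, one per image.

    Assumes the target distribution has mean 0 and variance 1 in every
    coefficient dimension, and penalizes squared deviations of the
    batch mean and batch variance from those targets.
    """
    dims = list(zip(*batch))  # regroup values by coefficient dimension
    mean_term = fmean([fmean(d) ** 2 for d in dims])
    var_term = fmean([(pvariance(d) - 1.0) ** 2 for d in dims])
    return mean_term + var_term

# A batch whose per-dimension mean is 0 and variance is 1 incurs no loss.
print(batch_distribution_loss([[1.0, -1.0], [-1.0, 1.0]]))
# A collapsed batch (every prediction identical) is penalized.
print(batch_distribution_loss([[2.0], [2.0]]))
```

A term of this kind discourages the degenerate solution where the network emits the same coefficients for every photograph.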

Learning Where to Look: Data-Driven Viewpoint Set Selection for 3D Scenes

75 - Kyle Genova, Manolis Savva, Angel X. Chang 2017

The use of rendered images, whether from completely synthetic datasets or from 3D reconstructions, is increasingly prevalent in vision tasks. However, little attention has been given to how the selection of viewpoints affects the performance of rendered training sets. In this paper, we propose a data-driven approach to view set selection. Given a set of example images, we extract statistics describing their contents and generate a set of views matching the distribution of those statistics. Motivated by semantic segmentation tasks, we model the spatial distribution of each semantic object category within an image view volume. We provide a search algorithm that generates a sampling of likely candidate views according to the example distribution, and a set selection algorithm that chooses a subset of the candidates that jointly cover the example distribution. Results of experiments with these algorithms on SUNCG indicate that they are indeed able to produce view distributions similar to an example set from NYUDv2 according to the earth mover's distance. Furthermore, the selected views improve performance on semantic segmentation compared to alternative view selection algorithms.

Computer Vision and Pattern Recognition
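
The abstract above compares view distributions using the earth mover's distance. The distributions in the paper are spatial and per-category, which this example does not attempt to reproduce. As a minimal sketch of the metric itself, the code below handles the simplest case: two histograms over the same one-dimensional, unit-width bins, where the distance reduces to the sum of absolute differences between the two cumulative distributions. The function names are hypothetical.

```python
from itertools import accumulate

def normalize(hist):
    """Scale a histogram so its bins sum to 1."""
    total = sum(hist)
    return [h / total for h in hist]

def emd_1d(hist_a, hist_b):
    """Earth mover's distance between two histograms that share the
    same unit-width 1D bins, computed from their cumulative sums."""
    a, b = normalize(hist_a), normalize(hist_b)
    cdf_a, cdf_b = accumulate(a), accumulate(b)
    return sum(abs(ca - cb) for ca, cb in zip(cdf_a, cdf_b))

# All mass must move two bins to the right.
print(emd_1d([1, 0, 0], [0, 0, 1]))
# Identical distributions need no mass moved.
print(emd_1d([1, 1], [3, 3]))
```

A lower value means less probability mass has to be moved to turn one distribution into the other, which is why the metric is a natural fit for checking that generated views resemble an example set.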
