Learned Interpolation for 3D Generation

136 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Austin Dill

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Austin Dill - Songwei Ge - Eunsu Kang

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In order to generate novel 3D shapes with machine learning, one must allow for interpolation. The typical approach for incorporating this creative process is to interpolate in a learned latent space so as to avoid the problem of generating unrealistic instances by exploiting the models learned structure. The process of the interpolation is supposed to form a semantically smooth morphing. While this approach is sound for synthesizing realistic media such as lifelike portraits or new designs for everyday objects, it subjectively fails to directly model the unexpected, unrealistic, or creative. In this work, we present a method for learning how to interpolate point clouds. By encoding prior knowledge about real-world objects, the intermediate forms are both realistic and unlike any existing forms. We show not only how this method can be used to generate creative point clouds, but how the method can also be leveraged to generate 3D models suitable for sculpture.

قيم البحث

374 - Kentaro Fukamizu , Masaaki Kondo , Ryuichi Sakamoto 2019

We present a method of generating high resolution 3D shapes from natural language descriptions. To achieve this goal, we propose two steps that generating low resolution shapes which roughly reflect texts and generating high resolution shapes which r eflect the detail of texts. In a previous paper, the authors have shown a method of generating low resolution shapes. We improve it to generate 3D shapes more faithful to natural language and test the effectiveness of the method. To generate high resolution 3D shapes, we use the framework of Conditional Wasserstein GAN. We propose two roles of Critic separately, which calculate the Wasserstein distance between two probability distribution, so that we achieve generating high quality shapes or acceleration of learning speed of model. To evaluate our approach, we performed quantitive evaluation with several numerical metrics for Critic models. Our method is first to realize the generation of high quality model by propagating text embedding information to high resolution task when generating 3D model.

الرسم الحاسوبي التعلم الآلي التعلم الالي

FAME: 3D Shape Generation via Functionality-Aware Model Evolution

122 - Yanran Guan , Han Liu , Kun Liu 2020

We introduce a modeling tool which can evolve a set of 3D objects in a functionality-aware manner. Our goal is for the evolution to generate large and diverse sets of plausible 3D objects for data augmentation, constrained modeling, as well as open-e nded exploration to possibly inspire new designs. Starting with an initial population of 3D objects belonging to one or more functional categories, we evolve the shapes through part recombination to produce generations of hybrids or crossbreeds between parents from the heterogeneous shape collection. Evolutionary selection of offsprings is guided both by a functional plausibility score derived from functionality analysis of shapes in the initial population and user preference, as in a design gallery. Since cross-category hybridization may result in offsprings not belonging to any of the known functional categories, we develop a means for functionality partial matching to evaluate functional plausibility on partial shapes. We show a variety of plausible hybrid shapes generated by our functionality-aware model evolution, which can complement existing datasets as training data and boost the performance of contemporary data-driven segmentation schemes, especially in challenging cases. Our tool supports constrained modeling, allowing users to restrict or steer the model evolution with functionality labels. At the same time, unexpected yet functional object prototypes can emerge during open-ended exploration owing to structure breaking when evolving a heterogeneous collection.

الرسم الحاسوبي

Voice2Mesh: Cross-Modal 3D Face Model Generation from Voices

127 - Cho-Ying Wu , Ke Xu , Chin-Cheng Hsu 2021

This work focuses on the analysis that whether 3D face models can be learned from only the speech inputs of speakers. Previous works for cross-modal face synthesis study image generation from voices. However, image synthesis includes variations such as hairstyles, backgrounds, and facial textures, that are arguably irrelevant to voice or without direct studies to show correlations. We instead investigate the ability to reconstruct 3D faces to concentrate on only geometry, which is more physiologically grounded. We propose both the supervised learning and unsupervised learning frameworks. Especially we demonstrate how unsupervised learning is possible in the absence of a direct voice-to-3D-face dataset under limited availability of 3D face scans when the model is equipped with knowledge distillation. To evaluate the performance, we also propose several metrics to measure the geometric fitness of two 3D faces based on points, lines, and regions. We find that 3D face shapes can be reconstructed from voices. Experimental results suggest that 3D faces can be reconstructed from voices, and our method can improve the performance over the baseline. The best performance gains (15% - 20%) on ear-to-ear distance ratio metric (ER) coincides with the intuition that one can roughly envision whether a speakers face is overall wider or thinner only from a persons voice. See our project page for codes and data.

الرسم الحاسوبي الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

Neural View-Interpolation for Sparse Light Field Video

144 - Mojtaba Bemana , Karol Myszkowski , Hans-Peter Seidel 2019

We suggest representing light field (LF) videos as one-off neural networks (NN), i.e., a learned mapping from view-plus-time coordinates to high-resolution color values, trained on sparse views. Initially, this sounds like a bad idea for three main r easons: First, a NN LF will likely have less quality than a same-sized pixel basis representation. Second, only few training data, e.g., 9 exemplars per frame are available for sparse LF videos. Third, there is no generalization across LFs, but across view and time instead. Consequently, a network needs to be trained for each LF video. Surprisingly, these problems can turn into substantial advantages: Other than the linear pixel basis, a NN has to come up with a compact, non-linear i.e., more intelligent, explanation of color, conditioned on the sparse view and time coordinates. As observed for many NN however, this representation now is interpolatable: if the image output for sparse view coordinates is plausible, it is for all intermediate, continuous coordinates as well. Our specific network architecture involves a differentiable occlusion-aware warping step, which leads to a compact set of trainable parameters and consequently fast learning and fast execution.

الرسم الحاسوبي الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

Automatic Generation of Large-scale 3D Road Networks based on GIS Data

140 - Hua Wang , Yue Wu , Xu Han 2021

How to automatically generate a realistic large-scale 3D road network is a key point for immersive and credible traffic simulations. Existing methods cannot automatically generate various kinds of intersections in 3D space based on GIS data. In this paper, we propose a method to generate complex and large-scale 3D road networks automatically with the open source GIS data, including satellite imagery, elevation data and two-dimensional(2D) road center axis data, as input. We first introduce a semantic structure of road network to obtain high-detailed and well-formed networks in a 3D scene. We then generate 2D shapes and topological data of the road network according to the semantic structure and 2D road center axis data. At last, we segment the elevation data and generate the surface of the 3D road network according to the 2D semantic data and satellite imagery data. Results show that our method does well in the generation of various types of intersections and the high-detailed features of roads. The traffic semantic structure, which must be provided in traffic simulation, can also be generated automatically according to our method.

الرسم الحاسوبي