Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

3D Magic Mirror: Automatic Video to 3D Caricature Translation

83 0 0.0 ( 0 )

Download Cite

Added by Yudong Guo

Publication date 2019

fields Informatics Engineering

and research's language is English

Authors Yudong Guo - Luo Jiang - Lin Cai

Graphics Computer Vision and Pattern Recognition

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Caricature is an abstraction of a real person which distorts or exaggerates certain features, but still retains a likeness. While most existing works focus on 3D caricature reconstruction from 2D caricatures or translating 2D photos to 2D caricatures, this paper presents a real-time and automatic algorithm for creating expressive 3D caricatures with caricature style texture map from 2D photos or videos. To solve this challenging ill-posed reconstruction problem and cross-domain translation problem, we first reconstruct the 3D face shape for each frame, and then translate 3D face shape from normal style to caricature style by a novel identity and expression preserving VAE-CycleGAN. Based on a labeling formulation, the caricature texture map is constructed from a set of multi-view caricature images generated by CariGANs. The effectiveness and efficiency of our method are demonstrated by comparison with baseline implementations. The perceptual study shows that the 3D caricatures generated by our method meet peoples expectations of 3D caricature style.

rate research

Appearance-Driven Automatic 3D Model Simplification

73 - Jon Hasselgren , Jacob Munkberg , Jaakko Lehtinen 2021

We present a suite of techniques for jointly optimizing triangle meshes and shading models to match the appearance of reference scenes. This capability has a number of uses, including appearance-preserving simplification of extremely complex assets, conversion between rendering systems, and even conversion between geometric scene representations. We follow and extend the classic analysis-by-synthesis family of techniques: enabled by a highly efficient differentiable renderer and modern nonlinear optimization algorithms, our results are driven to minimize the image-space difference to the target scene when rendered in similar viewing and lighting conditions. As the only signals driving the optimization are differences in rendered images, the approach is highly general and versatile: it easily supports many different forward rendering models such as normal mapping, spatially-varying BRDFs, displacement mapping, etc. Supervision through images only is also key to the ability to easily convert between rendering systems and scene representations. We output triangle meshes with textured materials to ensure that the models render efficiently on modern graphics hardware and benefit from, e.g., hardware-accelerated rasterization, ray tracing, and filtered texture lookups. Our system is integrated in a small Python code base, and can be applied at high resolutions and on large models. We describe several use cases, including mesh decimation, level of detail generation, seamless mesh filtering and approximations of aggregate geometry.

Graphics

Automatic Scene Inference for 3D Object Compositing

132 - Kevin Karsch , Kalyan Sunkavalli , Sunil Hadap 2019

We present a user-friendly image editing system that supports a drag-and-drop object insertion (where the user merely drags objects into the image, and the system automatically places them in 3D and relights them appropriately), post-process illumination editing, and depth-of-field manipulation. Underlying our system is a fully automatic technique for recovering a comprehensive 3D scene model (geometry, illumination, diffuse albedo and camera parameters) from a single, low dynamic range photograph. This is made possible by two novel contributions: an illumination inference algorithm that recovers a full lighting model of the scene (including light sources that are not directly visible in the photograph), and a depth estimation algorithm that combines data-driven depth transfer with geometric reasoning about the scene layout. A user study shows that our system produces perceptually convincing results, and achieves the same level of realism as techniques that require significant user interaction.

Graphics Image and Video Processing

ShapeAssembly: Learning to Generate Programs for 3D Shape Structure Synthesis

376 - R. Kenny Jones , Theresa Barton , Xianghao Xu 2020

Manually authoring 3D shapes is difficult and time consuming; generative models of 3D shapes offer compelling alternatives. Procedural representations are one such possibility: they offer high-quality and editable results but are difficult to author and often produce outputs with limited diversity. On the other extreme are deep generative models: given enough data, they can learn to generate any class of shape but their outputs have artifacts and the representation is not editable. In this paper, we take a step towards achieving the best of both worlds for novel 3D shape synthesis. We propose ShapeAssembly, a domain-specific assembly-language for 3D shape structures. ShapeAssembly programs construct shapes by declaring cuboid part proxies and attaching them to one another, in a hierarchical and symmetrical fashion. Its functions are parameterized with free variables, so that one program structure is able to capture a family of related shapes. We show how to extract ShapeAssembly programs from existing shape structures in the PartNet dataset. Then we train a deep generative model, a hierarchical sequence VAE, that learns to write novel ShapeAssembly programs. The program captures the subset of variability that is interpretable and editable. The deep model captures correlations across shape collections that are hard to express procedurally. We evaluate our approach by comparing shapes output by our generated programs to those from other recent shape structure synthesis models. We find that our generated shapes are more plausible and physically-valid than those of other methods. Additionally, we assess the latent spaces of these models, and find that ours is better structured and produces smoother interpolations. As an application, we use our generative model and differentiable program interpreter to infer and fit shape programs to unstructured geometry, such as point clouds.

Graphics Computer Vision and Pattern Recognition Machine Learning

PolyGen: An Autoregressive Generative Model of 3D Meshes

85 - Charlie Nash , Yaroslav Ganin , S. M. Ali Eslami 2020

Polygon meshes are an efficient representation of 3D geometry, and are of central importance in computer graphics, robotics and games development. Existing learning-based approaches have avoided the challenges of working with 3D meshes, instead using alternative object representations that are more compatible with neural architectures and training approaches. We present an approach which models the mesh directly, predicting mesh vertices and faces sequentially using a Transformer-based architecture. Our model can condition on a range of inputs, including object classes, voxels, and images, and because the model is probabilistic it can produce samples that capture uncertainty in ambiguous scenarios. We show that the model is capable of producing high-quality, usable meshes, and establish log-likelihood benchmarks for the mesh-modelling task. We also evaluate the conditional models on surface reconstruction metrics against alternative methods, and demonstrate competitive performance despite not training directly on this task.

Graphics Computer Vision and Pattern Recognition Machine Learning

Automatic Generation of Large-scale 3D Road Networks based on GIS Data

140 - Hua Wang , Yue Wu , Xu Han 2021

How to automatically generate a realistic large-scale 3D road network is a key point for immersive and credible traffic simulations. Existing methods cannot automatically generate various kinds of intersections in 3D space based on GIS data. In this paper, we propose a method to generate complex and large-scale 3D road networks automatically with the open source GIS data, including satellite imagery, elevation data and two-dimensional(2D) road center axis data, as input. We first introduce a semantic structure of road network to obtain high-detailed and well-formed networks in a 3D scene. We then generate 2D shapes and topological data of the road network according to the semantic structure and 2D road center axis data. At last, we segment the elevation data and generate the surface of the 3D road network according to the 2D semantic data and satellite imagery data. Results show that our method does well in the generation of various types of intersections and the high-detailed features of roads. The traffic semantic structure, which must be provided in traffic simulation, can also be generated automatically according to our method.

Graphics

comments

Fetching comments

National Institute of Business Administration

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

3D Magic Mirror: Automatic Video to 3D Caricature Translation

Ask ChatGPT about the research

No Arabic abstract

Read More