
AutoSweep: Recovering 3D Editable Objects from a Single Photograph

Published by Xin Chen
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





This paper presents a fully automatic framework for extracting editable 3D objects directly from a single photograph. Unlike previous methods, which recover depth maps, point clouds, or mesh surfaces, we aim to recover 3D objects with semantic parts that can be directly edited. We base our work on the assumption that most human-made objects are composed of parts and that these parts can be well represented by generalized primitives. Our work attempts to recover two types of primitive-shaped objects, namely generalized cuboids and generalized cylinders. To this end, we build a novel instance-aware segmentation network, GeoNet, for accurate part separation; it outputs a set of smooth part-level masks labeled as profiles and bodies. Then, in a key stage, we simultaneously identify profile-body relations and recover 3D parts by sweeping each recognized profile along its body contour, jointly optimizing the geometry to align with the recovered masks. Qualitative and quantitative experiments show that our algorithm recovers high-quality 3D models and outperforms existing methods in both instance segmentation and 3D reconstruction. The dataset and code of AutoSweep are available at https://chenxin.tech/AutoSweep.html.
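To make the sweeping step concrete, here is a minimal sketch of how a profile can be swept along a body axis to produce a generalized-cylinder mesh. The circular profile, the fixed-up-vector frames, and all function names are illustrative assumptions for this sketch, not the paper's actual implementation, which sweeps recognized profiles along image-space body contours and jointly optimizes the geometry against the recovered masks.

```python
# A minimal sketch of the sweeping idea: place copies of a 2D profile
# along a 3D axis curve and stitch them into a triangle mesh.
# (Illustrative assumptions throughout; not the paper's implementation.)
import numpy as np

def circular_profile(radius: float, n: int = 24) -> np.ndarray:
    """2D profile (here a circle) sampled as n points in its own plane."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.stack([radius * np.cos(t), radius * np.sin(t)], axis=1)

def sweep(profile: np.ndarray, axis_pts: np.ndarray):
    """Sweep `profile` along the polyline `axis_pts` (k x 3).

    Each profile copy is placed in the plane orthogonal to the local
    tangent of the axis curve; consecutive rings are stitched with quads
    split into two triangles.
    """
    k, n = len(axis_pts), len(profile)
    tangents = np.gradient(axis_pts, axis=0)          # finite differences
    tangents /= np.linalg.norm(tangents, axis=1, keepdims=True)
    verts = np.empty((k, n, 3))
    # Simple fixed-up-vector frame per ring; a rotation-minimizing frame
    # would avoid twisting on strongly curved axes.
    up = np.array([0.0, 0.0, 1.0])
    for i, (c, t) in enumerate(zip(axis_pts, tangents)):
        u = np.cross(up, t)
        if np.linalg.norm(u) < 1e-8:                  # tangent parallel to up
            u = np.cross([1.0, 0.0, 0.0], t)
        u /= np.linalg.norm(u)
        v = np.cross(t, u)
        verts[i] = c + profile[:, :1] * u + profile[:, 1:] * v
    # Stitch consecutive rings into triangles.
    faces = []
    for i in range(k - 1):
        for j in range(n):
            a, b = i * n + j, i * n + (j + 1) % n
            c_, d = a + n, b + n
            faces += [(a, b, c_), (b, d, c_)]
    return verts.reshape(-1, 3), np.array(faces)

# Example: a straight generalized cylinder of radius 0.5 and height 2.
axis = np.stack([np.zeros(10), np.zeros(10), np.linspace(0, 2, 10)], axis=1)
V, F = sweep(circular_profile(0.5), axis)
print(V.shape, F.shape)   # (240, 3) (432, 3)
```

A curved axis polyline or a non-circular profile (e.g. a rectangle, for generalized cuboids) can be passed in the same way; only the frame construction would need more care.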




Read also

Shuai Du, Youyi Zheng (2017)
Acquiring the 3D geometry of real-world objects has various applications in 3D digitization, such as navigation and content generation in virtual environments. Images remain one of the most popular media for such visual tasks due to their simplicity of acquisition. Traditional image-based 3D reconstruction approaches heavily exploit point-to-point correspondence among multiple images to estimate camera motion and 3D geometry. Establishing point-to-point correspondence lies at the center of the 3D reconstruction pipeline, yet it is easily prone to errors. In this paper, we propose an optimization framework which traces image points using a novel structure-guided dynamic tracking algorithm and estimates both the camera motion and a 3D structure model by enforcing a set of planar constraints. The key to our method is a structure model represented as a set of planes and their arrangements. Constraints derived from the structure model are used both in the correspondence establishment stage and in the bundle adjustment stage of our reconstruction pipeline. Experiments show that our algorithm can effectively localize structure correspondence across dense image frames while faithfully reconstructing the camera motion and the underlying structured 3D model.
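As a rough illustration of the planar-constraints idea, the sketch below adds a point-to-plane penalty to a reprojection cost, the kind of combined term a bundle-adjustment solver could minimize. The weighting and the plane parameterization are assumptions of this sketch, not the paper's exact formulation.

```python
# Hedged sketch: 3D points assigned to a plane (n, d) are penalized by
# their point-to-plane distance, on top of standard reprojection error.
import numpy as np

def plane_residuals(points: np.ndarray, normal: np.ndarray, d: float) -> np.ndarray:
    """Signed distances of 3D points (k x 3) to the plane n.x + d = 0."""
    n = normal / np.linalg.norm(normal)
    return points @ n + d

def constrained_cost(points, normal, d, reproj_residuals, w_plane=1.0):
    """Total cost = reprojection term + weighted planar term, as one
    scalar a bundle-adjustment solver would minimize."""
    r_plane = plane_residuals(points, normal, d)
    return np.sum(reproj_residuals**2) + w_plane * np.sum(r_plane**2)

# Example: points roughly on the plane z = 1 (n = (0, 0, 1), d = -1).
pts = np.array([[0.0, 0.0, 1.02], [1.0, 0.0, 0.98], [0.0, 1.0, 1.01]])
print(constrained_cost(pts, np.array([0.0, 0.0, 1.0]), -1.0,
                       reproj_residuals=np.zeros(3)))
```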
Meng Zhang, Youyi Zheng (2018)
We introduce Hair-GANs, an architecture of generative adversarial networks, to recover the 3D hair structure from a single image. The goal of our networks is to build a parametric transformation from 2D hair maps to 3D hair structure. The 3D hair structure is represented as a 3D volumetric field which encodes both the occupancy and the orientation information of the hair strands. Given a single hair image, we first align it with a bust model and extract a set of 2D maps encoding the hair orientation information in 2D, along with the bust depth map, to feed into our Hair-GANs. With our generator network, we compute the 3D volumetric field as the structure guidance for the final hair synthesis. The modeling results not only resemble the hair in the input image but also possess many vivid details in other views. The efficacy of our method is demonstrated on a variety of hairstyles and by comparison with the prior art.
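The volumetric field described here can be pictured as a grid with one occupancy channel and three orientation channels per voxel. Below is a toy numpy sketch that rasterizes a single strand polyline into such a field; the grid resolution and the nearest-voxel rasterization are illustrative assumptions, not the paper's construction.

```python
# Toy sketch of a 4-channel volumetric hair field: occupancy in channel 0,
# a unit strand tangent in channels 1:4. (Assumed layout for illustration.)
import numpy as np

def rasterize_strand(field: np.ndarray, strand: np.ndarray):
    """Write a polyline strand (k x 3, coords in [0, 1)) into `field`
    of shape (D, D, D, 4)."""
    D = field.shape[0]
    for p, q in zip(strand[:-1], strand[1:]):
        t = q - p
        t /= (np.linalg.norm(t) + 1e-12)
        i, j, k = (p * D).astype(int).clip(0, D - 1)
        field[i, j, k, 0] = 1.0          # occupancy
        field[i, j, k, 1:4] = t          # local orientation

D = 32
field = np.zeros((D, D, D, 4), dtype=np.float32)
# A single straight strand falling along z, as a toy example.
strand = np.stack([np.full(16, 0.5), np.full(16, 0.5),
                   np.linspace(0.9, 0.1, 16)], axis=1)
rasterize_strand(field, strand)
print(field[..., 0].sum())   # number of occupied voxels
```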
Light fields are a 4D scene representation, typically structured as arrays of views or as several directional samples per pixel in a single view. This highly correlated structure, however, is not very efficient to transmit and manipulate, especially for editing. To tackle these problems, we present a novel compact and editable light field representation consisting of a set of visual channels (i.e., the central RGB view) and a complementary meta channel that encodes the residual geometric and appearance information. The visual channels in this representation can be edited using existing 2D image editing tools, before the whole edited light field is accurately reconstructed. We propose to learn this representation via an autoencoder framework, consisting of an encoder for learning the representation and a decoder for reconstructing the light field. To handle the challenging occlusions and the propagation of edits, we specifically design an editing-aware decoding network and its associated training strategy, so that edits to the visual channels are consistently propagated to the whole light field upon reconstruction. Experimental results show that our proposed method outperforms related existing methods in reconstruction accuracy and achieves visually pleasing performance in editing propagation.
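A toy sketch of such a codec is shown below: an encoder compresses the stacked views into a residual meta channel, and a decoder reconstructs every view from the central RGB view plus that meta channel. The single-convolution networks, channel counts, and the class name `LightFieldCodec` are placeholder assumptions, not the paper's architecture.

```python
# Toy autoencoder whose bottleneck is (central RGB view, meta channel).
import torch
import torch.nn as nn

class LightFieldCodec(nn.Module):
    def __init__(self, n_views: int = 9, meta_channels: int = 1):
        super().__init__()
        # Encoder maps all stacked views to one residual meta channel.
        self.encoder = nn.Conv2d(3 * n_views, meta_channels, 3, padding=1)
        # Decoder reconstructs every view from central RGB + meta.
        self.decoder = nn.Conv2d(3 + meta_channels, 3 * n_views, 3, padding=1)
        self.n_views = n_views

    def forward(self, views: torch.Tensor):
        # views: (B, n_views, 3, H, W); the middle view is the visual channel.
        b, v, c, h, w = views.shape
        central = views[:, v // 2]                          # (B, 3, H, W)
        meta = self.encoder(views.reshape(b, v * c, h, w))
        recon = self.decoder(torch.cat([central, meta], dim=1))
        return recon.reshape(b, v, c, h, w), central, meta

lf = torch.rand(2, 9, 3, 32, 32)
recon, central, meta = LightFieldCodec()(lf)
print(recon.shape, meta.shape)
```

Editing then amounts to modifying `central` with any 2D tool and decoding again; the paper's editing-aware decoder is trained specifically so such edits propagate consistently, which this toy model does not capture.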
Generating free-viewpoint videos is critical for an immersive VR/AR experience, but recent neural advances still lack the editing ability to manipulate the visual perception of large dynamic scenes. To fill this gap, in this paper we propose the first approach for editable, photo-realistic free-viewpoint video generation for large-scale dynamic scenes using only 16 sparse cameras. The core of our approach is a new layered neural representation, where each dynamic entity, including the environment itself, is formulated as a space-time-coherent neural layered radiance representation called ST-NeRF. This layered representation supports full perception and realistic manipulation of the dynamic scene while still supporting a free viewing experience over a wide range. In our ST-NeRF, each dynamic entity/layer is represented as continuous functions, which achieves the disentanglement of the location, deformation, and appearance of the dynamic entity in a continuous and self-supervised manner. We propose scene-parsing 4D label-map tracking to disentangle the spatial information explicitly, and a continuous deform module to disentangle the temporal motion implicitly. An object-aware volume rendering scheme is further introduced to re-assemble all the neural layers. We adopt a novel layered loss and a motion-aware ray sampling strategy to enable efficient training on a large dynamic scene with multiple performers. Our framework further enables a variety of editing functions, i.e., manipulating the scale and location, or duplicating or retiming individual neural layers, to create numerous visual effects while preserving high realism. Extensive experiments demonstrate the effectiveness of our approach in achieving high-quality, photo-realistic, and editable free-viewpoint video generation for dynamic scenes.
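The object-aware re-assembly can be illustrated with standard volume-rendering quadrature: samples drawn from each layer are merged by depth along a ray and alpha-composited. In the numpy sketch below, the per-layer densities and colors are toy stand-ins for ST-NeRF network queries, so this shows only the compositing step, not the learned representation.

```python
# Merge samples from several layers by depth and alpha-composite them,
# the standard volume-rendering quadrature a layered scheme builds on.
import numpy as np

def composite(depths, sigmas, colors, deltas):
    """Alpha-composite samples sorted by depth along a single ray."""
    order = np.argsort(depths)
    sig, col, dlt = sigmas[order], colors[order], deltas[order]
    alpha = 1.0 - np.exp(-sig * dlt)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    return (weights[:, None] * col).sum(axis=0)

# Two "layers" (e.g. a performer and the environment) sampled separately.
d1, d2 = np.linspace(1, 2, 8), np.linspace(1.5, 3, 8)
depths = np.concatenate([d1, d2])
sigmas = np.concatenate([np.full(8, 2.0), np.full(8, 0.5)])
colors = np.concatenate([np.tile([1.0, 0.0, 0.0], (8, 1)),
                         np.tile([0.0, 0.0, 1.0], (8, 1))])
deltas = np.full(16, 0.1)
print(composite(depths, sigmas, colors, deltas))  # blended RGB
```

Editing a layer (rescaling, relocating, or retiming it) only changes where its samples land in `depths` before the merge, which is why the layered formulation makes such manipulations cheap.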
In this paper, we present a learning-based approach for recovering the 3D geometry of a human head from a single portrait image. Our method is learned in an unsupervised manner without any ground-truth 3D data. We represent the head geometry with a parametric 3D face model together with a depth map for other head regions, including hair and ears. A two-step geometry learning scheme is proposed to learn 3D head reconstruction from in-the-wild face images: we first learn face shape on single images using self-reconstruction, and then learn hair and ear geometry using pairs of images in a stereo-matching fashion. The second step builds on the output of the first, not only to improve accuracy but also to ensure the consistency of the overall head geometry. We evaluate the accuracy of our method both in 3D and on pose manipulation tasks on 2D images. We alter the pose based on the recovered geometry and apply a refinement network, trained with adversarial learning, to ameliorate the reprojected images and translate them to the real image domain. Extensive evaluations and comparisons with previous methods show that our new method can produce high-fidelity 3D head geometry and head pose manipulation results.
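The hybrid representation, a parametric face model plus a depth map for the remaining regions, can be sketched as follows. The linear face model, camera intrinsics, and mask below are toy assumptions used only to illustrate how the two parts combine into one head point set.

```python
# Hedged sketch: face geometry from a low-dimensional linear model,
# hair/ear geometry from a masked depth map unprojected to camera space.
import numpy as np

def face_from_params(mean, basis, params):
    """Linear parametric face model: vertices = mean + basis @ params."""
    return (mean + basis @ params).reshape(-1, 3)

def unproject_depth(depth, mask, fx, fy, cx, cy):
    """Lift masked depth pixels (hair/ear) to 3D camera-space points."""
    v, u = np.nonzero(mask)
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

# Toy example: a 100-vertex face model with 10 shape parameters.
mean = np.zeros(300)
basis = np.random.default_rng(0).normal(size=(300, 10)) * 0.01
face_pts = face_from_params(mean, basis, np.zeros(10))
depth = np.full((64, 64), 2.0)
mask = np.zeros((64, 64), bool); mask[:16] = True     # "hair" rows
head_pts = np.vstack([face_pts,
                      unproject_depth(depth, mask, 60, 60, 32, 32)])
print(head_pts.shape)   # (1124, 3): 100 face + 1024 depth-map points
```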