Learning to Infer Shape Programs Using Self Training

125 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Homer Walke

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Homer Walke - R. Kenny Jones - Daniel Ritchie

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Inferring programs which generate 2D and 3D shapes is important for reverse engineering, editing, and more. Training such inference models is challenging due to the lack of paired (shape, program) data in most domains. A popular approach is to pre-train a model on synthetic data and then fine-tune on real shapes using slow, unstable reinforcement learning. In this paper, we argue that self-training is a viable alternative for fine-tuning such models. Self-training is a semi-supervised learning paradigm where a model assigns pseudo-labels to unlabeled data, and then retrains with (data, pseudo-label) pairs as the new ground truth. We show that for constructive solid geometry and assembly-based modeling, self-training outperforms state-of-the-art reinforcement learning approaches. Additionally, shape program inference has a unique property that circumvents a potential downside of self-training (incorrect pseudo-label assignment): inferred programs are executable. For a given shape from our distribution of interest $mathbf{x}^$ and its predicted program $mathbf{z}$, one can execute $mathbf{z}$ to obtain a shape $mathbf{x}$ and train on $(mathbf{z}, mathbf{x})$ pairs, rather than $(mathbf{z}, mathbf{x}^)$ pairs. We term this procedure latent execution self training (LEST). We demonstrate that self training infers shape programs with higher shape reconstruction accuracy and converges significantly faster than reinforcement learning approaches, and in some domains, LEST can further improve this performance.

قيم البحث

121 - Alexander W. Bergman , Petr Kellnhofer , Gordon Wetzstein 2021

Novel view synthesis is a long-standing problem in machine learning and computer vision. Significant progress has recently been made in developing neural scene representations and rendering techniques that synthesize photorealistic images from arbitr ary views. These representations, however, are extremely slow to train and often also slow to render. Inspired by neural variants of image-based rendering, we develop a new neural rendering approach with the goal of quickly learning a high-quality representation which can also be rendered in real-time. Our approach, MetaNLR++, accomplishes this by using a unique combination of a neural shape representation and 2D CNN-based image feature extraction, aggregation, and re-projection. To push representation convergence times down to minutes, we leverage meta learning to learn neural shape and image feature priors which accelerate training. The optimized shape and image features can then be extracted using traditional graphics techniques and rendered in real time. We show that MetaNLR++ achieves similar or better novel view synthesis results in a fraction of the time that competing methods require.

الرؤية الحاسوبية وتمييز الأنماط الرسم الحاسوبي التعلم الآلي

Learning to Infer Semantic Parameters for 3D Shape Editing

120 - Fangyin Wei , Elena Sizikova , Avneesh Sud 2020

Many applications in 3D shape design and augmentation require the ability to make specific edits to an objects semantic parameters (e.g., the pose of a persons arm or the length of an airplanes wing) while preserving as much existing details as possi ble. We propose to learn a deep network that infers the semantic parameters of an input shape and then allows the user to manipulate those parameters. The network is trained jointly on shapes from an auxiliary synthetic template and unlabeled realistic models, ensuring robustness to shape variability while relieving the need to label realistic exemplars. At testing time, edits within the parameter space drive deformations to be applied to the original shape, which provides semantically-meaningful manipulation while preserving the details. This is in contrast to prior methods that either use autoencoders with a limited latent-space dimensionality, failing to preserve arbitrary detail, or drive deformations with purely-geometric controls, such as cages, losing the ability to update local part regions. Experiments with datasets of chairs, airplanes, and human bodies demonstrate that our method produces more natural edits than prior work.

الرؤية الحاسوبية وتمييز الأنماط

CSG-Stump: A Learning Friendly CSG-Like Representation for Interpretable Shape Parsing

80 - Daxuan Ren , Jianmin Zheng , Jianfei Cai 2021

Generating an interpretable and compact representation of 3D shapes from point clouds is an important and challenging problem. This paper presents CSG-Stump Net, an unsupervised end-to-end network for learning shapes from point clouds and discovering the underlying constituent modeling primitives and operations as well. At the core is a three-level structure called {em CSG-Stump}, consisting of a complement layer at the bottom, an intersection layer in the middle, and a union layer at the top. CSG-Stump is proven to be equivalent to CSG in terms of representation, therefore inheriting the interpretable, compact and editable nature of CSG while freeing from CSGs complex tree structures. Particularly, the CSG-Stump has a simple and regular structure, allowing neural networks to give outputs of a constant dimensionality, which makes itself deep-learning friendly. Due to these characteristics of CSG-Stump, CSG-Stump Net achieves superior results compared to previous CSG-based methods and generates much more appealing shapes, as confirmed by extensive experiments. Project page: https://kimren227.github.io/projects/CSGStump/

الرؤية الحاسوبية وتمييز الأنماط الرسم الحاسوبي التعلم الآلي

Self-supervised Learning of Point Clouds via Orientation Estimation

119 - Omid Poursaeed , Tianxing Jiang , Han Qiao 2020

Point clouds provide a compact and efficient representation of 3D shapes. While deep neural networks have achieved impressive results on point cloud learning tasks, they require massive amounts of manually labeled data, which can be costly and time-c onsuming to collect. In this paper, we leverage 3D self-supervision for learning downstream tasks on point clouds with fewer labels. A point cloud can be rotated in infinitely many ways, which provides a rich label-free source for self-supervision. We consider the auxiliary task of predicting rotations that in turn leads to useful features for other tasks such as shape classification and 3D keypoint prediction. Using experiments on ShapeNet and ModelNet, we demonstrate that our approach outperforms the state-of-the-art. Moreover, features learned by our model are complementary to other self-supervised methods and combining them leads to further performance improvement.

الرؤية الحاسوبية وتمييز الأنماط الرسم الحاسوبي التعلم الآلي

ShapeAssembly: Learning to Generate Programs for 3D Shape Structure Synthesis

376 - R. Kenny Jones , Theresa Barton , Xianghao Xu 2020

Manually authoring 3D shapes is difficult and time consuming; generative models of 3D shapes offer compelling alternatives. Procedural representations are one such possibility: they offer high-quality and editable results but are difficult to author and often produce outputs with limited diversity. On the other extreme are deep generative models: given enough data, they can learn to generate any class of shape but their outputs have artifacts and the representation is not editable. In this paper, we take a step towards achieving the best of both worlds for novel 3D shape synthesis. We propose ShapeAssembly, a domain-specific assembly-language for 3D shape structures. ShapeAssembly programs construct shapes by declaring cuboid part proxies and attaching them to one another, in a hierarchical and symmetrical fashion. Its functions are parameterized with free variables, so that one program structure is able to capture a family of related shapes. We show how to extract ShapeAssembly programs from existing shape structures in the PartNet dataset. Then we train a deep generative model, a hierarchical sequence VAE, that learns to write novel ShapeAssembly programs. The program captures the subset of variability that is interpretable and editable. The deep model captures correlations across shape collections that are hard to express procedurally. We evaluate our approach by comparing shapes output by our generated programs to those from other recent shape structure synthesis models. We find that our generated shapes are more plausible and physically-valid than those of other methods. Additionally, we assess the latent spaces of these models, and find that ours is better structured and produces smoother interpolations. As an application, we use our generative model and differentiable program interpreter to infer and fit shape programs to unstructured geometry, such as point clouds.

الرسم الحاسوبي الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي