
Image2Lego: Customized LEGO Set Generation from Images

Added by Kyle Lennon
Publication date: 2021
Language: English





Although LEGO sets have entertained generations of children and adults, designing customized builds that match the complexity of real-world or imagined scenes remains too difficult for the average enthusiast. To make this feat possible, we implement a system that generates a LEGO brick model from 2D images. Our novel solution uses an octree-structured autoencoder trained on 3D voxelized models to obtain a latent representation suitable for model reconstruction, and a separate network trained to predict this latent representation from 2D images. LEGO models are obtained by algorithmically converting the 3D voxelized model to bricks. We demonstrate first-of-its-kind conversion of photographs to 3D LEGO models. The octree architecture provides the flexibility to produce models at multiple resolutions to best fit a user's creative vision or design needs. To demonstrate the broad applicability of our system, we generate step-by-step building instructions and animations for LEGO models of objects and human faces. Finally, we validate these automatically generated LEGO sets by constructing physical builds with real LEGO bricks.
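The abstract describes the brick conversion only at a high level, so as a concrete illustration, here is a minimal greedy sketch of the voxel-to-brick step; the brick footprints and the layer-by-layer largest-fit strategy are assumptions for illustration, not the authors' algorithm.

```python
# Hypothetical sketch of the voxel-to-brick step. This greedy
# layer-by-layer merge covers each occupied voxel with the largest
# brick footprint that still fits entirely inside the occupied region.
import numpy as np

# Brick footprints (depth, width) in studs, largest first; all one unit tall.
BRICK_SIZES = [(2, 4), (4, 2), (2, 2), (1, 2), (2, 1), (1, 1)]

def voxels_to_bricks(vox):
    """vox: boolean array of shape (Z, Y, X); returns list of (z, y, x, dy, dx)."""
    vox = vox.copy()
    bricks = []
    for z in range(vox.shape[0]):            # process one layer at a time
        layer = vox[z]
        for y in range(layer.shape[0]):
            for x in range(layer.shape[1]):
                if not layer[y, x]:
                    continue
                for dy, dx in BRICK_SIZES:   # largest footprint that fits
                    patch = layer[y:y + dy, x:x + dx]
                    if patch.shape == (dy, dx) and patch.all():
                        patch[...] = False   # mark these voxels as covered
                        bricks.append((z, y, x, dy, dx))
                        break
    return bricks

# Toy example: a one-layer 2x6 slab becomes a 2x4 brick plus a 2x2 brick.
slab = np.zeros((1, 2, 6), dtype=bool)
slab[0] = True
print(voxels_to_bricks(slab))  # [(0, 0, 0, 2, 4), (0, 0, 4, 2, 2)]
```

A production converter would also need to stagger brick seams across layers for structural interlock, which this sketch deliberately omits.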



rate research

Read More

A set is an unordered collection of unique elements, and yet many machine learning models that generate sets impose an implicit or explicit ordering. Since model performance can depend on the choice of order, any particular ordering can lead to sub-optimal results. An alternative is a permutation-equivariant set generator, which does not specify an ordering. An example of such a generator is the Deep Set Prediction Network (DSPN). We introduce the Transformer Set Prediction Network (TSPN), a flexible permutation-equivariant model for set prediction based on the transformer, which builds upon and outperforms DSPN in the quality of predicted set elements and in the accuracy of their predicted sizes. We test our model on MNIST-as-point-clouds (SET-MNIST) for point-cloud generation and on CLEVR for object detection.
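As a rough illustration of the permutation-equivariant idea, the PyTorch sketch below broadcasts an input embedding onto a set of randomly initialized elements and runs a transformer with no positional encoding over them, so no element is distinguished by its position. All dimensions, the size-prediction head, and the module choices are assumptions, not the published TSPN architecture.

```python
# Minimal sketch of a permutation-equivariant set predictor.
import torch
import torch.nn as nn

class SetPredictor(nn.Module):
    def __init__(self, d_model=64, max_size=32, out_dim=2):
        super().__init__()
        self.size_mlp = nn.Linear(d_model, max_size)   # predicts set size
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.readout = nn.Linear(d_model, out_dim)

    def forward(self, z):                              # z: (B, d_model)
        n = self.size_mlp(z).argmax(-1).clamp(min=1)   # per-example set size
        k = int(n.max())
        # Random initial elements plus broadcast conditioning: omitting any
        # positional encoding keeps the mapping permutation-equivariant.
        init = torch.randn(z.size(0), k, z.size(1))
        x = self.transformer(init + z.unsqueeze(1))
        return self.readout(x), n

model = SetPredictor()
points, sizes = model(torch.randn(8, 64))
print(points.shape, sizes.shape)  # (8, k, 2) and (8,), k = max predicted size
```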
Sharing food has become very popular with the development of social media. For many real-world applications, people are keen to know the underlying recipe of a food item. In this paper, we are interested in automatically generating cooking instructions for food. We investigate the open research task of generating cooking instructions from only food images and ingredients, which is similar to the image captioning task. However, compared with image captioning datasets, the target recipes are long paragraphs and lack annotations of structure information. To address these limitations, we propose a novel framework, the Structure-aware Generation Network (SGN), to tackle the food recipe generation task. Our approach brings together several novel ideas in a systematic framework: (1) exploiting an unsupervised learning approach to obtain sentence-level tree structure labels before training; (2) generating trees of target recipes from images under the supervision of the tree structure labels learned in (1); and (3) integrating the inferred tree structures into the recipe generation procedure. Our proposed model produces high-quality and coherent recipes and achieves state-of-the-art performance on the benchmark Recipe1M dataset.
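The PyTorch skeleton below sketches how the three ideas could fit together as one data flow: an image-plus-ingredient context, a head that predicts a sentence-level tree as adjacency logits (trained against the unsupervised labels from step 1), and a decoder conditioned on that context. Every module here is an illustrative stand-in rather than the paper's SGN.

```python
# Schematic sketch of the SGN data flow; shapes are illustrative assumptions.
import torch
import torch.nn as nn

class SGNSketch(nn.Module):
    def __init__(self, d=128, vocab=10000, max_nodes=8):
        super().__init__()
        self.img_enc = nn.Linear(512, d)          # stand-in CNN feature proj
        self.ing_enc = nn.EmbeddingBag(vocab, d)  # ingredient-set encoder
        self.tree_head = nn.Linear(d, max_nodes * max_nodes)  # adjacency logits
        self.decoder = nn.GRU(d, d, batch_first=True)
        self.word_head = nn.Linear(d, vocab)
        self.max_nodes = max_nodes

    def forward(self, img_feat, ingredients, steps):
        ctx = self.img_enc(img_feat) + self.ing_enc(ingredients)
        # (2) predict a sentence-level tree from the inputs, supervised by
        # labels obtained via unsupervised parsing in step (1)
        tree = self.tree_head(ctx).view(-1, self.max_nodes, self.max_nodes)
        # (3) condition generation on the shared context / inferred structure
        h, _ = self.decoder(ctx.unsqueeze(1).expand(-1, steps, -1))
        return tree, self.word_head(h)

m = SGNSketch()
tree, words = m(torch.randn(2, 512), torch.randint(0, 10000, (2, 5)), steps=12)
print(tree.shape, words.shape)  # (2, 8, 8) (2, 12, 10000)
```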
Histopathology slides are routinely marked by pathologists using permanent ink markers that should not be removed, as they form part of the medical record. Tumour regions are often marked up to highlight features or to support downstream processing such as gene sequencing. Once the slides are digitised, there is no established method for removing this information from the whole slide images, limiting their usability in research and study. Removing marker ink from these high-resolution whole slide images is a non-trivial and complex problem, as the ink contaminates different regions in an inconsistent manner. We propose an efficient pipeline based on convolutional neural networks that yields ink-free images without compromising information or image resolution. Our pipeline includes a sequential classical convolutional neural network for accurate classification of contaminated image tiles, a fast region detector, and a domain-adaptive cycle-consistent generative adversarial model for restoration of foreground pixels. Both quantitative and qualitative results on four different whole slide images show that our approach yields visually coherent ink-free whole slide images.
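A schematic of the three-stage tile pipeline is sketched below, with stub functions standing in for the trained classifier, region detector, and CycleGAN-style restorer: only ink-positive tiles pass through the expensive restoration stage, and only detected ink pixels are replaced.

```python
# Illustrative skeleton of the three-stage ink-removal pipeline;
# the classifier/detector/restorer here are toy stand-ins.
import numpy as np

def remove_ink(slide_tiles, classifier, detector, restorer):
    """slide_tiles: iterable of HxWx3 uint8 arrays; returns cleaned tiles."""
    cleaned = []
    for tile in slide_tiles:
        if not classifier(tile):          # stage 1: skip ink-free tiles fast
            cleaned.append(tile)
            continue
        mask = detector(tile)             # stage 2: locate contaminated pixels
        restored = restorer(tile)         # stage 3: domain-translated tile
        out = tile.copy()
        out[mask] = restored[mask]        # replace only detected ink pixels
        cleaned.append(out)
    return cleaned

# Toy stand-ins: flag tiles with saturated green ink, restore to flat gray.
classifier = lambda t: (t[..., 1] > 200).mean() > 0.05
detector   = lambda t: t[..., 1] > 200
restorer   = lambda t: np.full_like(t, 128)
tiles = [np.zeros((4, 4, 3), np.uint8), np.full((4, 4, 3), 255, np.uint8)]
print([t.mean() for t in remove_ink(tiles, classifier, detector, restorer)])
# [0.0, 128.0] -- the clean tile is untouched, the inked tile is restored
```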
Recipe generation from food images and ingredients is a challenging task that requires interpreting information from another modality. Unlike image captioning, where a caption is usually a single sentence, cooking instructions contain multiple sentences and have a clear structure. To help the model capture the recipe structure and avoid missing cooking details, we propose a novel framework, Decomposed Generation Networks (DGN) with structure prediction, to produce more structured and complete recipes. Specifically, we split each cooking instruction into several phases and assign a different sub-generator to each phase. Our approach includes two novel ideas: (i) learning the recipe structure with a global structure prediction component, and (ii) producing the recipe phases with sub-generators conditioned on the predicted structure. Extensive experiments on the challenging large-scale Recipe1M dataset validate the effectiveness of our proposed DGN, which improves over the state-of-the-art results.
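The sketch below illustrates the decomposition idea: a global head predicts which phases a recipe contains, and each phase is produced by its own sub-generator. The number and meaning of the phases, the GRU cells, and the routing are illustrative assumptions, not the paper's DGN.

```python
# Minimal sketch of phase-decomposed generation with per-phase sub-generators.
import torch
import torch.nn as nn

N_PHASES = 4   # e.g. preparation / mixing / cooking / serving (assumed)

class DGNSketch(nn.Module):
    def __init__(self, d=128, vocab=10000):
        super().__init__()
        self.struct_head = nn.Linear(d, N_PHASES)           # phase logits
        self.sub_generators = nn.ModuleList(
            [nn.GRUCell(d, d) for _ in range(N_PHASES)])    # one per phase
        self.word_head = nn.Linear(d, vocab)

    def forward(self, ctx, n_steps=3):
        # Global structure prediction: which phases appear, in what order.
        phases = self.struct_head(ctx).topk(2, dim=-1).indices  # (B, 2)
        h, outputs = ctx, []
        for b_phases in phases.unbind(-1):      # generate phase by phase
            for _ in range(n_steps):
                # route each example through its predicted phase's sub-generator
                h = torch.stack(
                    [self.sub_generators[int(p)](ctx[i:i+1], h[i:i+1])[0]
                     for i, p in enumerate(b_phases)])
                outputs.append(self.word_head(h))
        return torch.stack(outputs, dim=1)      # (B, T, vocab)

m = DGNSketch()
print(m(torch.randn(2, 128)).shape)  # torch.Size([2, 6, 10000])
```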
Reconstructing multiple molecularly defined neurons from individual brains and across multiple brain regions can reveal organizational principles of the nervous system. However, high-resolution imaging of the whole brain is a technically challenging and slow process. Recently, oblique light sheet microscopy has emerged as a rapid imaging method that can provide whole-brain fluorescence microscopy at a voxel size of 0.4 × 0.4 × 2.5 μm. At the same time, complex image artifacts arising from whole-brain coverage produce apparent discontinuities in neuronal arbors. Here, we present connectivity-preserving methods and data augmentation strategies for supervised learning of neuroanatomy from light microscopy using neural networks. We quantify the merit of our approach by implementing an end-to-end automated tracing pipeline. Lastly, we demonstrate a scalable, distributed implementation that can reconstruct the large datasets that sub-micron whole-brain images produce.
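The paper's connectivity-preserving methods are not spelled out in this abstract; the sketch below shows one plausible augmentation in that spirit: darkening random z-planes of the image volume to mimic sectioning artifacts while leaving the neuron label untouched, so a segmentation network learns to bridge apparent discontinuities.

```python
# Hedged sketch of an artifact-simulating augmentation (an assumption,
# not the paper's method): dim random z-planes of the *image* only;
# the paired *label* volume is left intact so connectivity is preserved.
import numpy as np

def simulate_slab_artifact(image, rng, max_planes=3, dim_factor=0.1):
    """image: (Z, Y, X) float volume; returns an augmented copy."""
    out = image.copy()
    for _ in range(rng.integers(1, max_planes + 1)):
        z = rng.integers(0, image.shape[0])
        out[z] *= dim_factor        # darken one plane, as if a cut artifact
    return out

rng = np.random.default_rng(0)
vol = np.ones((8, 16, 16), np.float32)
aug = simulate_slab_artifact(vol, rng)
print(aug.mean(axis=(1, 2)))   # dimmed z-planes stand out from the rest
```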
