
Deep generative models of 3D shapes have received a great deal of research interest. Yet, almost all of them generate discrete shape representations, such as voxels, point clouds, and polygon meshes. We present the first 3D generative model for a drastically different shape representation --- describing a shape as a sequence of computer-aided design (CAD) operations. Unlike meshes and point clouds, CAD models encode the user's creation process of a 3D shape and are widely used in numerous industrial and engineering design tasks. However, the sequential and irregular structure of CAD operations poses significant challenges for existing 3D generative models. Drawing an analogy between CAD operations and natural language, we propose a CAD generative network based on the Transformer. We demonstrate the performance of our model for both shape autoencoding and random shape generation. To train our network, we create a new CAD dataset consisting of 178,238 models and their CAD construction sequences. We have made this dataset publicly available to promote future research on this topic.
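The language analogy above implies that a CAD construction sequence can be fed to a Transformer as a stream of discrete tokens. A minimal sketch of such a tokenization is below; the command vocabulary and helper name are illustrative assumptions, not the paper's actual encoding (which also quantizes continuous operation parameters).

```python
# Hypothetical tokenization of a CAD construction sequence, illustrating
# the analogy between CAD operations and natural language. The command
# vocabulary here is an assumption for illustration only.
COMMANDS = ["<SOS>", "<EOS>", "Line", "Arc", "Circle", "Extrude"]
CMD_TO_ID = {c: i for i, c in enumerate(COMMANDS)}

def encode_sequence(ops):
    """Map a list of CAD command names to integer tokens that a
    sequence model (e.g., a Transformer) can consume, bracketed by
    start/end tokens."""
    return [CMD_TO_ID["<SOS>"]] + [CMD_TO_ID[o] for o in ops] + [CMD_TO_ID["<EOS>"]]
```

In a full model, each command token would be paired with its quantized geometric parameters before being embedded.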
Generative Adversarial Networks (GANs) are able to generate high-quality images, but it remains difficult to explicitly specify the semantics of synthesized images. In this work, we aim to better understand the semantic representation of GANs, and thereby enable semantic control in a GAN's generation process. Interestingly, we find that a well-trained GAN encodes image semantics in its internal feature maps in a surprisingly simple way: a linear transformation of feature maps suffices to extract the generated image's semantics. To verify this simplicity, we conduct extensive experiments on various GANs and datasets; and thanks to this simplicity, we are able to learn a semantic segmentation model for a trained GAN from a small number (e.g., 8) of labeled images. Last but not least, leveraging our findings, we propose two few-shot image editing approaches, namely Semantic-Conditional Sampling and Semantic Image Editing. Given a trained GAN and as few as eight semantic annotations, the user is able to generate diverse images subject to a user-provided semantic layout, and control the synthesized image semantics. We have made the code publicly available.
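The claim that a linear transformation of feature maps suffices can be sketched as a per-pixel linear probe over the generator's internal activations. The function below is a minimal illustration under assumed shapes (channels-first feature maps, a learned weight matrix and bias), not the paper's actual implementation.

```python
import numpy as np

def linear_semantic_probe(features, weights, bias):
    """Predict a per-pixel semantic label from GAN feature maps using
    only a single linear map, illustrating the "linear transformation
    suffices" finding.

    features : (C, H, W) internal feature maps of the generator
    weights  : (num_classes, C) learned linear transformation
    bias     : (num_classes,) learned bias
    returns  : (H, W) array of predicted semantic labels
    """
    C, H, W = features.shape
    flat = features.reshape(C, H * W)            # (C, H*W) pixels as columns
    logits = weights @ flat + bias[:, None]      # (num_classes, H*W)
    return logits.argmax(axis=0).reshape(H, W)
```

In the few-shot setting described above, `weights` and `bias` would be fit from the handful (e.g., 8) of labeled images.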
Shi Qiu, Changxi Zheng, Qi Zhou, 2020
Understanding the structure and chemical composition at the liquid-nanoparticle (NP) interface is crucial for a wide range of physical, chemical and biological processes. In this study, direct imaging of the liquid-NP interface by atom probe tomography (APT) is reported for the first time, which reveals the distributions and the interactions of key atoms and molecules in this critical domain. The APT specimen is prepared by controlled graphene encapsulation of the solution containing nanoparticles on a metal tip, with an end radius in the range of 50 nm to allow field ionization and evaporation. Using Au nanoparticles (AuNPs) in suspension as an example, analysis of the mass spectrum and three-dimensional (3D) chemical maps from APT provides a detailed image of the water-gold interface with near-atomic resolution. At the water-gold interface, the formation of an electrical double layer (EDL) rich in water (H2O) molecules has been observed, which results from the charge arising from the binding between the trisodium-citrate layer and the AuNP. In the bulk water region, the density of reconstructed H2O has been shown to be consistent, reflecting a highly packed density of H2O molecules after graphene encapsulation. This study is the first demonstration of direct imaging of the liquid-NP interface using APT, with results providing an atom-by-atom 3D dissection of the liquid-NP interface.
We propose a simple change to existing neural network structures for better defending against gradient-based adversarial attacks. Instead of using popular activation functions (such as ReLU), we advocate the use of k-Winners-Take-All (k-WTA) activation, a C0 discontinuous function that purposely invalidates the neural network model's gradient at densely distributed input data points. The proposed k-WTA activation can be readily used in nearly all existing networks and training methods with no significant overhead. Our proposal is theoretically rationalized. We analyze why the discontinuities in k-WTA networks can largely prevent gradient-based search of adversarial examples and why they at the same time remain innocuous to the network training. This understanding is also empirically backed. We test k-WTA activation on various network structures optimized by a training method, be it adversarial training or not. In all cases, the robustness of k-WTA networks outperforms that of traditional networks under white-box attacks.
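The k-WTA activation itself is simple to state: within a layer, keep the k largest activations and zero the rest. A minimal NumPy sketch is below; the `ratio` parameter and flat per-layer input are assumptions for illustration, not the paper's exact interface.

```python
import numpy as np

def k_wta(x, ratio=0.1):
    """k-Winners-Take-All activation: keep the k largest activations
    in the layer and silence the rest, where k = ceil(ratio * n).

    The hard winner selection is what makes the function C0
    discontinuous in its input: an infinitesimal change can swap
    which units "win", breaking gradient-based attack search.
    """
    k = max(1, int(np.ceil(ratio * x.size)))
    # threshold = k-th largest value; everything below it is zeroed
    thresh = np.partition(x, -k)[-k]
    return np.where(x >= thresh, x, 0.0)
```

As a drop-in replacement for ReLU, this would be applied per layer (per channel or per feature vector) during both training and inference.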
Although 360° cameras ease the capture of panoramic footage, it remains challenging to add realistic 360° audio that blends into the captured scene and is synchronized with the camera motion. We present a method for adding scene-aware spatial audio to 360° videos in typical indoor scenes, using only a conventional mono-channel microphone and a speaker. We observe that the late reverberation of a room's impulse response is usually diffuse spatially and directionally. Exploiting this fact, we propose a method that synthesizes the directional impulse response between any source and listening location by combining a synthesized early reverberation part and a measured late reverberation tail. The early reverberation is simulated using a geometric acoustic simulation and then enhanced using a frequency modulation method to capture room resonances. The late reverberation is extracted from a recorded impulse response, with a carefully chosen time duration that separates out the late reverberation from the early reverberation. In our validations, we show that our synthesized spatial audio matches closely with recordings using ambisonic microphones. Lastly, we demonstrate the strength of our method in several applications.
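The early/late split described above can be sketched as assembling one impulse response from two pieces: a simulated early part up to a crossover time, and a measured late tail after it. The function below is a minimal illustration using a hard cut at the split time (the paper's actual method would handle the transition and directionality more carefully); all parameter names are assumptions.

```python
import numpy as np

def combine_impulse_response(early, late_tail, split_time, sr):
    """Assemble an impulse response by concatenating a simulated early
    reverberation with a measured late-reverberation tail.

    early      : simulated early part, sampled at rate `sr`
    late_tail  : measured late reverberation, starting at `split_time`
    split_time : crossover time in seconds separating early from late
    sr         : sample rate in Hz
    """
    n = int(split_time * sr)                       # crossover sample index
    ir = np.zeros(max(n + late_tail.size, early.size))
    ir[:min(n, early.size)] += early[:n]           # early part before the cut
    ir[n:n + late_tail.size] += late_tail          # measured tail after it
    return ir
```

Because the late reverberation is diffuse, the same measured tail can be reused for any source and listening location, while only the early part is re-simulated.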
