Research papers, master's theses, and doctoral theses published by Chao Shen

Per Garment Capture and Synthesis for Real-time Virtual Try-on

99 - Toby Chong , I-Chao Shen , Nobuyuki Umetani 2021

Virtual try-on is a promising application of computer graphics and human-computer interaction that can have a profound real-world impact, especially during this pandemic. Existing image-based works try to synthesize a try-on image from a single image of a target garment, which inherently limits their ability to react to possible interactions. It is difficult to reproduce the change of wrinkles caused by pose and body size change, as well as pulling and stretching of the garment by hand. In this paper, we propose an alternative per-garment capture and synthesis workflow to handle such rich interactions by training the model with many systematically captured images. Our workflow is composed of two parts: garment capturing and clothed person image synthesis. We designed an actuated mannequin and an efficient capturing process that collects the detailed deformations of the target garments under diverse body sizes and poses. Furthermore, we proposed to use a custom-designed measurement garment, and we captured paired images of the measurement garment and the target garments. We then learn a mapping between the measurement garment and the target garments using deep image-to-image translation. The customer can then try on the target garments interactively during online shopping.

Graphics Computer Vision and Pattern Recognition

Robust Symbol-Level Precoding and Passive Beamforming for IRS-Aided Communications

97 - Guangyang Zhang , Chao Shen , Bo Ai 2021

This paper investigates a joint beamforming design in a multiuser multiple-input single-output (MISO) communication network aided by an intelligent reflecting surface (IRS) panel. Symbol-level precoding (SLP) is adopted to enhance the system performance by exploiting the multiuser interference (MUI) with consideration of bounded channel uncertainty. The joint beamforming design is formulated as a nonconvex worst-case robust optimization problem that minimizes the transmit power subject to signal-to-noise ratio (SNR) requirements. To address the challenges due to the constant modulus and the coupling of the beamformers, we first study the single-user case. Specifically, we propose and compare two algorithms based on the semidefinite relaxation (SDR) and alternating optimization (AO) methods, respectively. It turns out that the AO-based algorithm has much lower computational complexity while achieving almost the same power as the SDR-based algorithm. Then, we apply the AO technique to the multiuser case and thereby develop an algorithm based on the proximal gradient descent (PGD) method. The algorithm can be generalized to the case of finite-resolution IRS and the scenario with direct links from the transmitter to the users. Numerical results show that the SLP can significantly improve the system performance. Meanwhile, 3-bit phase shifters can achieve near-optimal power performance.

Information Theory Signal Processing
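
The finite-resolution IRS case mentioned in the abstract above can be illustrated with a small sketch. It assumes uniform quantization: each continuous phase is rounded to the nearest of 2^b equally spaced levels, which keeps every reflection coefficient at unit modulus. The function names and the sample phases are hypothetical, and this is not the paper's algorithm, only the rounding step that a b-bit phase shifter implies.

```python
import cmath
import math


def quantize_phase(theta: float, bits: int) -> float:
    """Round a phase in radians to the nearest of 2**bits uniform levels in [0, 2*pi)."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    # Wrap into [0, 2*pi), round to the nearest level, and wrap the index
    # so that phases just below 2*pi map back to level 0.
    index = round((theta % (2 * math.pi)) / step) % levels
    return index * step


def reflection_coefficients(phases, bits):
    """Map continuous phases to unit-modulus coefficients exp(j * quantized phase)."""
    return [cmath.exp(1j * quantize_phase(t, bits)) for t in phases]


# Hypothetical continuous phases, e.g. the output of some continuous-phase design.
continuous = [0.1, 1.0, 3.0, 6.2]
coeffs = reflection_coefficients(continuous, bits=3)
```

With 3 bits the levels are spaced pi/4 apart, so the rounding error per element is at most pi/8.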

$L^2$-Extension of Adjoint bundles and Kollár's Conjecture

184 - Junchao Shentu , Chen Zhao 2021

We give a new proof of Kollár's conjecture on the pushforward of the dualizing sheaf twisted by a variation of Hodge structure. This conjecture was settled by M. Saito via mixed Hodge modules and has applications in the investigation of Albanese maps. Our technique is the $L^2$-method and we give a concrete construction and proofs of the conjecture. The $L^2$ point of view allows us to generalize Kollár's conjecture to the context of non-abelian Hodge theory.

Algebraic Geometry

Teacher Model Fingerprinting Attacks Against Transfer Learning

332 - Yufei Chen , Chao Shen , Cong Wang 2021

Transfer learning has become a common solution to address training data scarcity in practice. It trains a specified student model by reusing or fine-tuning early layers of a well-trained teacher model that is usually publicly available. However, besides utility improvement, the transferred public knowledge also brings potential threats to model confidentiality, and even further raises other security and privacy issues. In this paper, we present the first comprehensive investigation of the teacher model exposure threat in the transfer learning context, aiming to gain a deeper insight into the tension between public knowledge and model confidentiality. To this end, we propose a teacher model fingerprinting attack to infer the origin of a student model, i.e., the teacher model it transfers from. Specifically, we propose a novel optimization-based method to carefully generate queries to probe the student model to realize our attack. Unlike existing model reverse engineering approaches, our proposed fingerprinting method neither relies on fine-grained model outputs, e.g., posteriors, nor auxiliary information of the model architecture or training dataset. We systematically evaluate the effectiveness of our proposed attack. The empirical results demonstrate that our attack can accurately identify the model origin with few probing queries. Moreover, we show that the proposed attack can serve as a stepping stone to facilitating other attacks against machine learning models, such as model stealing.

Cryptography and Security Machine Learning

ClipGen: A Deep Generative Model for Clipart Vectorization and Synthesis

314 - I-Chao Shen , Bing-Yu Chen 2021

This paper presents a novel deep learning-based approach for automatically vectorizing and synthesizing the clipart of man-made objects. Given a raster clipart image and its corresponding object category (e.g., airplanes), the proposed method sequentially generates new layers, each of which is composed of a new closed path filled with a single color. The final result is obtained by compositing all layers together into a vector clipart image that falls into the target category. The proposed approach is based on an iterative generative model that (i) decides whether to continue synthesizing a new layer and (ii) determines the geometry and appearance of the new layer. We formulated a joint loss function for training our generative model, including the shape similarity, symmetry, and local curve smoothness losses, as well as vector graphics rendering accuracy loss for synthesizing clipart recognizable by humans. We also introduced a collection of man-made object clipart, ClipNet, which is composed of closed-path layers, and two designed preprocessing tasks to clean up and enrich the original raw clipart. To validate the proposed approach, we conducted several experiments and demonstrated its ability to vectorize and synthesize various clipart categories. We envision that our generative model can facilitate efficient and intuitive clipart designs for novice users and graphic designers.

Graphics
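
The layer-by-layer composition described in the ClipGen abstract above can be sketched as follows. This is a minimal illustration, assuming each layer is given as a point-in-shape test plus a single fill color and that later layers are painted over earlier ones. The `inside` callables stand in for closed paths and are hypothetical; the generative model that produces the layers is not shown.

```python
WHITE = (255, 255, 255)
RED = (255, 0, 0)
BLUE = (0, 0, 255)


def composite(layers, width, height, background=WHITE):
    """Paint layers in order onto a pixel grid; later layers cover earlier ones."""
    canvas = [[background for _ in range(width)] for _ in range(height)]
    for inside, color in layers:
        for y in range(height):
            for x in range(width):
                # `inside` is a stand-in for "pixel lies within the closed path".
                if inside(x, y):
                    canvas[y][x] = color
    return canvas


# Two hypothetical layers on a 3x2 canvas: a red region, then a blue region on top.
layers = [
    (lambda x, y: x < 2, RED),
    (lambda x, y: y < 1, BLUE),
]
canvas = composite(layers, width=3, height=2)
```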

Training Generative Adversarial Networks in One Stage

290 - Chengchao Shen , Youtan Yin , Xinchao Wang 2021

Generative Adversarial Networks (GANs) have demonstrated unprecedented success in various image generation tasks. The encouraging results, however, come at the price of a cumbersome training process, during which the generator and discriminator are alternately updated in two stages. In this paper, we investigate a general training scheme that enables training GANs efficiently in only one stage. Based on the adversarial losses of the generator and discriminator, we categorize GANs into two classes, Symmetric GANs and Asymmetric GANs, and introduce a novel gradient decomposition method to unify the two, allowing us to train both classes in one stage and hence alleviate the training effort. We also computationally analyze the efficiency of the proposed method, and empirically demonstrate that the proposed method yields a solid $1.5\times$ acceleration across various datasets and network architectures. Furthermore, we show that the proposed method is readily applicable to other adversarial-training scenarios, such as data-free knowledge distillation. The code is available at https://github.com/zju-vipa/OSGAN.

Computer Vision and Pattern Recognition Machine Learning Image and Video Processing

ClipFlip: Multi-view Clipart Design

68 - I-Chao Shen , Kuan-Hung Liu , Li-Wen Su 2020

We present an assistive system for clipart design by providing visual scaffolds from the unseen viewpoints. Inspired by the artists' creation process, our system constructs the visual scaffold by first synthesizing the reference 3D shape of the input clipart and rendering it from the desired viewpoint. The critical challenge of constructing this visual scaffold is to generate a reference 3D shape that matches the user's expectation in terms of object sizing and positioning while preserving the geometric style of the input clipart. To address this challenge, we propose a user-assisted curve extrusion method to obtain the reference 3D shape. We render the synthesized reference 3D shape with consistent style into the visual scaffold. By following the generated visual scaffold, the users can efficiently design clipart with their desired viewpoints. A user study conducted with an intuitive user interface and our generated visual scaffold suggests that the users are able to design clipart from different viewpoints while preserving the original geometric style without losing its original shape.

Graphics

A Unified Framework for Analyzing and Detecting Malicious Examples of DNN Models

84 - Kaidi Jin , Tianwei Zhang , Chao Shen 2020

Deep Neural Networks are well known to be vulnerable to adversarial attacks and backdoor attacks, where minor modifications on the input can mislead the models to give wrong results. Although defenses against adversarial attacks have been widely studied, research on mitigating backdoor attacks is still at an early stage. It is unknown whether there are any connections and common characteristics between the defenses against these two attacks. In this paper, we present a unified framework for detecting malicious examples and protecting the inference results of Deep Learning models. This framework is based on our observation that both adversarial examples and backdoor examples have anomalies during the inference process, highly distinguishable from benign samples. As a result, we repurpose and revise four existing adversarial defense methods for detecting backdoor examples. Extensive evaluations indicate these approaches provide reliable protection against backdoor attacks, with a higher accuracy than detecting adversarial examples. These solutions also reveal the relations of adversarial examples, backdoor examples and normal samples in model sensitivity, activation space and feature space. This can enhance our understanding about the inherent features of these two attacks, as well as the defense opportunities.

Machine Learning

Data-Free Adversarial Distillation

400 - Gongfan Fang , Jie Song , Chengchao Shen 2019

Knowledge Distillation (KD) has made remarkable progress in the last few years and become a popular paradigm for model compression and knowledge transfer. However, almost all existing KD algorithms are data-driven, i.e., relying on a large amount of original training data or alternative data, which is usually unavailable in real-world scenarios. In this paper, we devote ourselves to this challenging problem and propose a novel adversarial distillation mechanism to craft a compact student model without any real-world data. We introduce a model discrepancy to quantitatively measure the difference between student and teacher models and construct an optimizable upper bound. In our work, the student and the teacher jointly play the role of the discriminator to reduce this discrepancy, while a generator adversarially produces some hard samples to enlarge it. Extensive experiments demonstrate that the proposed data-free method yields comparable performance to existing data-driven methods. More strikingly, our approach can be directly extended to semantic segmentation, which is more complicated than classification, and our approach achieves state-of-the-art results. Code and pretrained models are available at https://github.com/VainF/Data-Free-Adversarial-Distillation.

Machine Learning Computer Vision and Pattern Recognition
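
The opposing objectives in the data-free distillation abstract above can be made concrete with a small sketch. The abstract does not state the form of the model discrepancy, so mean absolute difference between teacher and student outputs is used here purely as an assumption. The output values are made up for illustration, and no actual networks or gradient updates are involved.

```python
def model_discrepancy(teacher_outputs, student_outputs):
    """Mean absolute difference between teacher and student outputs over a batch.

    Assumed form of the discrepancy; the abstract does not specify one.
    """
    total = 0.0
    count = 0
    for t_row, s_row in zip(teacher_outputs, student_outputs):
        for t, s in zip(t_row, s_row):
            total += abs(t - s)
            count += 1
    return total / count


# Hypothetical outputs of both models on two generated samples.
teacher = [[2.0, -1.0], [0.5, 0.5]]
student = [[1.5, -1.0], [0.0, 1.5]]

d = model_discrepancy(teacher, student)

# The student is trained to reduce the discrepancy, while the generator is
# trained to enlarge it, so the two losses have opposite signs.
student_loss = d
generator_loss = -d
```

In a full training loop the two losses would be minimized in alternation, each with respect to its own model's parameters.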

Interactive Optimization of Generative Image Modeling using Sequential Subspace Search and Content-based Guidance

201 - Toby Chong Long Hin , I-Chao Shen , Issei Sato 2019

Generative image modeling techniques such as GAN demonstrate highly convincing image generation results. However, user interaction is often necessary to obtain the desired results. Existing attempts add interactivity but require either tailored architectures or extra data. We present a human-in-the-optimization method that allows users to directly explore and search the latent vector space of generative image modeling. Our system provides multiple candidates by sampling the latent vector space, and the user selects the best blending weights within the subspace using multiple sliders. In addition, the user can express their intention through image editing tools. The system samples latent vectors based on inputs and presents new candidates to the user iteratively. An advantage of our formulation is that one can apply our method to an arbitrary pre-trained model without developing a specialized architecture or data. We demonstrate our method with various generative image modeling applications, and show superior performance in a comparative user study against the prior art, iGAN.

Graphics Computer Vision and Pattern Recognition Human-Computer Interaction
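
The slider-based blending in the abstract above can be sketched as a weighted combination of candidate latent vectors. This is a minimal illustration under stated assumptions: candidates are drawn from a standard normal distribution, and slider weights are normalized to sum to 1 before blending. Both choices, and all function names, are hypothetical rather than taken from the paper.

```python
import random


def sample_candidates(dim, count, rng):
    """Draw `count` candidate latent vectors of length `dim` from N(0, 1)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


def blend(candidates, weights):
    """Blend candidate latent vectors with slider weights normalized to sum to 1."""
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one slider weight must be non-zero")
    normalized = [w / total for w in weights]
    dim = len(candidates[0])
    return [
        sum(w * z[i] for w, z in zip(normalized, candidates))
        for i in range(dim)
    ]


rng = random.Random(0)
candidates = sample_candidates(dim=4, count=3, rng=rng)
# Slider positions chosen by the user (hypothetical values).
latent = blend(candidates, weights=[0.2, 0.5, 0.3])
```

The blended vector would then be passed to the pre-trained generator, and new candidates sampled around the user's choice for the next round.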
