H-GAN: the power of GANs in your Hands


Published by: Sergiu Oprea

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Sergiu Oprea, Giorgos Karvounas, Pablo Martinez-Gonzalez


Abstract

We present HandGAN (H-GAN), a cycle-consistent adversarial learning approach implementing multi-scale perceptual discriminators. It is designed to translate synthetic images of hands to the real domain. Synthetic hands provide complete ground-truth annotations, yet they are not representative of the target distribution of real-world data. We aim to combine a realistic hand appearance with synthetic annotations. Relying on image-to-image translation, we improve the appearance of synthetic hands to approximate the statistical distribution underlying a collection of real images of hands. H-GAN tackles not only the cross-domain tone mapping but also structural differences in localized areas such as shading discontinuities. Results are evaluated both qualitatively and quantitatively, improving on previous work. Furthermore, we rely on a hand classification task to support the claim that our generated hands are statistically similar to the real domain of hands.
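The abstract does not spell out H-GAN's loss terms, so the following is only a minimal sketch of the generic cycle-consistency idea it builds on: translating a synthetic image to the real domain and back should reproduce the input, and likewise for real images. The L1 reconstruction distance, the flattened-list image representation, and the toy generators `brighten` and `darken` are assumptions made for illustration, not the authors' implementation.

```python
from typing import Callable, List

Image = List[float]  # assumption: a flattened toy "image" of pixel intensities
Generator = Callable[[Image], Image]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    synthetic: Image,
    real: Image,
    syn_to_real: Generator,
    real_to_syn: Generator,
) -> float:
    """Sum of the forward and backward reconstruction errors.

    forward:  synthetic -> real domain -> back to synthetic
    backward: real -> synthetic domain -> back to real
    """
    forward = l1_distance(real_to_syn(syn_to_real(synthetic)), synthetic)
    backward = l1_distance(syn_to_real(real_to_syn(real)), real)
    return forward + backward


# Hypothetical stand-ins for trained generators: a fixed brightness shift.
def brighten(img: Image) -> Image:
    return [p + 0.25 for p in img]


def darken(img: Image) -> Image:
    return [p - 0.25 for p in img]


def darken_too_much(img: Image) -> Image:
    return [p - 0.5 for p in img]


synthetic_hand = [0.0, 0.5, 1.0]
real_hand = [0.25, 0.75]

# Exact inverses reconstruct both inputs, so the loss vanishes.
perfect = cycle_consistency_loss(synthetic_hand, real_hand, brighten, darken)

# A mismatched inverse leaves a residual in both directions.
imperfect = cycle_consistency_loss(
    synthetic_hand, real_hand, brighten, darken_too_much
)
```

In a full cycle-consistent setup this term would be added to the adversarial losses from the discriminators; here it is isolated so the round-trip constraint is easy to inspect.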


Read also

Sketch Your Own GAN

Sheng-Yu Wang, David Bau, Jun-Yan Zhu (2021)

Can a user create a deep generative model by sketching a single example? Traditionally, creating a GAN model has required the collection of a large-scale dataset of exemplars and specialized knowledge in deep learning. In contrast, sketching is possibly the most universally accessible way to convey a visual concept. In this work, we present a method, GAN Sketching, for rewriting GANs with one or more sketches, to make GAN training easier for novice users. In particular, we change the weights of an original GAN model according to user sketches. We encourage the model's output to match the user sketches through a cross-domain adversarial loss. Furthermore, we explore different regularization methods to preserve the original model's diversity and image quality. Experiments have shown that our method can mold GANs to match shapes and poses specified by sketches while maintaining realism and diversity. Finally, we demonstrate a few applications of the resulting GAN, including latent space interpolation and image editing.

Computer Vision and Pattern Recognition, Machine Learning

Association: Remind Your GAN not to Forget

Yi Gu, Jie Li, Yuting Gao (2020)

Neural networks are susceptible to catastrophic forgetting. They fail to preserve previously acquired knowledge when adapting to new tasks. Inspired by the human associative memory system, we propose a brain-like approach that imitates the associative learning process to achieve continual learning. We design a heuristic mechanism to potentiatively stimulate the model, which guides the model to recall the historical episodes based on the current circumstance and obtained association experience. Besides, a distillation measure is added to depressively alter the efficacy of synaptic transmission, which dampens the feature reconstruction learning for new tasks. The framework is mediated by potentiation and depression stimulation that play opposing roles in directing synaptic and behavioral plasticity. It requires no access to the original data and is more similar to the human cognitive process. Experiments demonstrate the effectiveness of our method in alleviating catastrophic forgetting on image-to-image translation tasks.

Computer Vision and Pattern Recognition

Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs

Hui-Po Wang, Ning Yu, Mario Fritz (2020)

While Generative Adversarial Networks (GANs) show increasing performance and the level of realism is becoming indistinguishable from natural images, this also comes with high demands on data and computation. We show that state-of-the-art GAN models (such as those being publicly released by researchers and industry) can be used for a range of applications beyond unconditional image generation. We achieve this by an iterative scheme that also allows gaining control over the image generation process despite the highly non-linear latent spaces of the latest GAN models. We demonstrate that this opens up the possibility to re-use state-of-the-art, difficult-to-train, pre-trained GANs with a high level of control even if only black-box access is granted. Our work also raises concerns and awareness that the use cases of a published GAN model may well reach beyond the creators' intention, which needs to be taken into account before a full public release. Code is available at https://github.com/a514514772/hijackgan.

Computer Vision and Pattern Recognition, Cryptography and Security

Contextual RNN-GANs for Abstract Reasoning Diagram Generation

Arnab Ghosh, Viveka Kulharia, Amitabha Mukerjee (2016)

Understanding, predicting, and generating object motions and transformations is a core problem in artificial intelligence. Modeling sequences of evolving images may provide better representations and models of motion and may ultimately be used for forecasting, simulation, or video generation. Diagrammatic Abstract Reasoning is an avenue in which diagrams evolve in complex patterns and one needs to infer the underlying pattern sequence and generate the next image in the sequence. For this, we develop a novel Contextual Generative Adversarial Network based on Recurrent Neural Networks (Context-RNN-GANs), where both the generator and the discriminator modules are based on contextual history (modeled as RNNs) and the adversarial discriminator guides the generator to produce realistic images for the particular time step in the image sequence. We evaluate the Context-RNN-GAN model (and its variants) on a novel dataset of Diagrammatic Abstract Reasoning, where it performs competitively with 10th-grade human performance, although there is still scope for improvement compared to college-grade human performance. We also evaluate our model on a standard video next-frame prediction task, achieving improved performance over comparable state-of-the-art.

Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai (2021)

Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image segmentation. In comparison to convolutional neural networks, the Vision Transformer's weaker inductive bias is generally found to cause an increased reliance on model regularization or data augmentation ("AugReg" for short) when training on smaller training datasets. We conduct a systematic empirical study in order to better understand the interplay between the amount of training data, AugReg, model size and compute budget. As one result of this study, we find that the combination of increased compute and AugReg can yield models with the same performance as models trained on an order of magnitude more training data: we train ViT models of various sizes on the public ImageNet-21k dataset which either match or outperform their counterparts trained on the larger, but not publicly available, JFT-300M dataset.

Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning