ﻻ يوجد ملخص باللغة العربية
Adaptive and flexible image editing is a desirable function of modern generative models. In this work, we present a generative model with auto-encoder architecture for per-region style manipulation. We apply a code consistency loss to enforce an explicit disentanglement between content and style latent representations, making the content and style of generated samples consistent with their corresponding content and style references. The model is also constrained by a content alignment loss to ensure the foreground editing will not interfere background contents. As a result, given interested region masks provided by users, our model supports foreground region-wise style transfer. Specially, our model receives no extra annotations such as semantic labels except for self-supervision. Extensive experiments show the effectiveness of the proposed method and exhibit the flexibility of the proposed model for various applications, including region-wise style editing, latent space interpolation, cross-domain style transfer.
We present a novel approach to automatic image colorization by imitating the imagination process of human experts. Our imagination module is designed to generate color images that are context-correlated with black-and-white photos. Given a black-and-
In this paper, we leverage advances in neural networks towards forming a neural rendering for controllable image generation, and thereby bypassing the need for detailed modeling in conventional graphics pipeline. To this end, we present Neural Graphi
The existing text-guided image synthesis methods can only produce limited quality results with at most mbox{$text{256}^2$} resolution and the textual instructions are constrained in a small Corpus. In this work, we propose a unified framework for bot
We present an approach to synthesize highly photorealistic images of 3D object models, which we use to train a convolutional neural network for detecting the objects in real images. The proposed approach has three key ingredients: (1) 3D object model
The last decade has witnessed remarkable progress in the image captioning task; however, most existing methods cannot control their captions, emph{e.g.}, choosing to describe the image either roughly or in detail. In this paper, we propose to use a s