
Interactive Optimization of Generative Image Modeling using Sequential Subspace Search and Content-based Guidance

Posted by I-Chao Shen
Publication date: 2019
Research field: Informatics Engineering
Paper language: English

Generative image modeling techniques such as GANs produce highly convincing images. However, user interaction is often necessary to obtain the desired results. Existing attempts add interactivity but require either tailored architectures or extra data. We present a human-in-the-optimization method that allows users to directly explore and search the latent vector space of a generative image model. Our system provides multiple candidates by sampling the latent vector space, and the user selects the best blending weights within the subspace using multiple sliders. In addition, the user can express their intention through image editing tools. The system samples latent vectors based on these inputs and iteratively presents new candidates to the user. An advantage of our formulation is that it applies to an arbitrary pre-trained model without requiring a specialized architecture or additional data. We demonstrate our method on various generative image modeling applications and show superior performance over the prior art, iGAN, in a comparative user study.
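
The search loop itself is simple enough to sketch. The snippet below is a minimal illustration, assuming the pre-trained model is wrapped as a hypothetical `generate(z)` function; the latent dimensionality, slider count, step size, and the random weights standing in for actual slider input are all placeholder assumptions, not the paper's implementation.

```python
import numpy as np

LATENT_DIM = 512   # assumed latent dimensionality of the pre-trained model
N_SLIDERS = 6      # number of candidates / sliders shown per round (assumed)
STEP = 0.5         # radius of the local subspace around the current latent

def sample_subspace(z_center, rng):
    """Sample one candidate latent vector per slider around the current point."""
    directions = rng.standard_normal((N_SLIDERS, LATENT_DIM))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    return z_center + STEP * directions

def blend(z_center, candidates, weights):
    """Move within the subspace spanned by the candidate offsets."""
    offsets = candidates - z_center                 # (N_SLIDERS, LATENT_DIM)
    return z_center + np.asarray(weights) @ offsets

rng = np.random.default_rng(0)
z = rng.standard_normal(LATENT_DIM)
for _ in range(10):                                 # optimization rounds
    candidates = sample_subspace(z, rng)
    # images = [generate(c) for c in candidates]    # shown to the user
    weights = rng.uniform(0.0, 1.0, N_SLIDERS)      # stand-in for slider values
    z = blend(z, candidates, weights)
```

Restricting each round to a low-dimensional subspace keeps the number of sliders, and hence the user's effort per iteration, fixed regardless of the model's latent dimensionality.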


Read also

Eric Heim (2019)
Generative Adversarial Networks (GANs) have received a great deal of attention, due in part to recent success in generating original, high-quality samples from visual domains. However, most current methods only allow users to guide this image generation process through limited interactions. In this work we develop a novel GAN framework that allows humans to be in the loop of the image generation process. Our technique iteratively accepts relative constraints of the form "generate an image more like image A than image B". After each constraint is given, the user is presented with new outputs from the GAN, informing the next round of feedback. This feedback is used to constrain the output of the GAN with respect to an underlying semantic space that can be designed to model a variety of notions of similarity (e.g., classes, attributes, object relationships, color, etc.). In our experiments, we show that our GAN framework is able to generate images of comparable quality to equivalent unsupervised GANs while satisfying a large number of user-provided constraints, effectively turning a GAN into one that gives users interactive control over image generation without sacrificing image quality.
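
To make the constraint mechanism concrete, the sketch below scores a set of "more like A than B" constraints as hinge penalties over distances in an embedding space. The `constraint_hinge` and `total_violation` helpers and the random 64-dimensional embeddings are illustrative assumptions; in the actual work the semantic space is learned, not random.

```python
import numpy as np

def constraint_hinge(x_feat, a_feat, b_feat, margin=0.1):
    """Penalty that is zero once x is closer to A than to B by at least `margin`."""
    d_a = np.linalg.norm(x_feat - a_feat)
    d_b = np.linalg.norm(x_feat - b_feat)
    return max(0.0, margin + d_a - d_b)

def total_violation(x_feat, constraints):
    """Sum the penalties over every (A, B) pair the user has provided so far."""
    return sum(constraint_hinge(x_feat, a, b) for a, b in constraints)

# Toy usage with random embeddings standing in for a learned semantic space.
rng = np.random.default_rng(1)
a, b, x = (rng.standard_normal(64) for _ in range(3))
print(total_violation(x, [(a, b)]))
```
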
In this paper, we present a learning-based method for keyframe-based video stylization that allows an artist to propagate the style from a few selected keyframes to the rest of the sequence. Its key advantage is that the resulting stylization is semantically meaningful, i.e., specific parts of moving objects are stylized according to the artist's intention. In contrast to previous style transfer techniques, our approach requires neither a lengthy pre-training process nor a large training dataset. We demonstrate how to train an appearance translation network from scratch using only a few stylized exemplars while implicitly preserving temporal consistency. This leads to a video stylization framework that supports real-time inference, parallel processing, and random access to an arbitrary output frame. It can also merge the content from multiple keyframes without the need to perform an explicit blending operation. We demonstrate its practical utility in various interactive scenarios, where the user paints over a selected keyframe and sees her style transferred to an existing recorded sequence or a live video stream.
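
As a rough illustration of training an appearance translation network from a single exemplar pair, the PyTorch sketch below fits a tiny convolutional net on random crops of one keyframe and its stylized counterpart. The network, crop size, and the random tensors standing in for real frames are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

net = nn.Sequential(                    # toy translation net: frame -> style
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

keyframe = torch.rand(1, 3, 256, 256)   # original keyframe (stand-in data)
stylized = torch.rand(1, 3, 256, 256)   # the artist's painted version of it

for step in range(200):                 # train on random crops of one pair
    y, x = torch.randint(0, 192, (2,)).tolist()
    inp = keyframe[..., y:y+64, x:x+64]
    tgt = stylized[..., y:y+64, x:x+64]
    loss = nn.functional.l1_loss(net(inp), tgt)
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, net(frame) stylizes any frame of the sequence in one pass,
# which is what enables real-time inference and random frame access.
```
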
Modeling the distribution of natural images is challenging, partly because of strong statistical dependencies which can extend over hundreds of pixels. Recurrent neural networks have been successful in capturing long-range dependencies in a number of problems but only recently have found their way into generative image models. We here introduce a recurrent image model based on multi-dimensional long short-term memory units which are particularly suited for image modeling due to their spatial structure. Our model scales to images of arbitrary size and its likelihood is computationally tractable. We find that it outperforms the state of the art in quantitative comparisons on several image datasets and produces promising results when used for texture synthesis and inpainting.
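
The spatial recurrence that makes multi-dimensional LSTMs suited to images can be sketched with a simplified cell: each position's hidden state depends on its left and top neighbours, so the state at (i, j) summarizes the whole causal region above and to the left. The plain tanh cell below is a deliberate simplification of full LSTM gating, and all weights are random placeholders.

```python
import numpy as np

H, W, D = 8, 8, 16                       # image grid and hidden-state size
rng = np.random.default_rng(0)
wx = rng.standard_normal(D) * 0.1        # input weights (one grayscale pixel)
Wl = rng.standard_normal((D, D)) * 0.1   # recurrence from the left neighbour
Wt = rng.standard_normal((D, D)) * 0.1   # recurrence from the top neighbour

image = rng.random((H, W))
state = np.zeros((H, W, D))
for i in range(H):                       # scan top-left to bottom-right
    for j in range(W):
        left = state[i, j - 1] if j > 0 else np.zeros(D)
        top = state[i - 1, j] if i > 0 else np.zeros(D)
        state[i, j] = np.tanh(image[i, j] * wx + left @ Wl + top @ Wt)
# state[i, j] summarizes all pixels above and to the left of (i, j), which
# is what lets the model predict pixel (i, j) autoregressively.
```
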
Real-world images usually contain vivid content and rich textural details, which complicate their manipulation. In this paper, we design a new framework based on content-aware synthesis to enhance content-aware image retargeting. By detecting the textural regions in an image, the textural image content can be synthesized rather than simply distorted or cropped. This enables manipulating textural and non-textural regions with different strategies, since they have different natures. We propose to retarget the textural regions by content-aware synthesis and the non-textural regions by fast multi-operators. To achieve practical retargeting for general images, we develop an automatic and fast texture detection method that can detect multiple disjoint textural regions. We adjust the saliency of the image according to the features of the textural regions. To validate the proposed method, we conducted comparisons with state-of-the-art image retargeting techniques and a user study. Convincing visual results demonstrate the effectiveness of the proposed method.
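
As a hedged sketch of the texture-detection idea, the snippet below flags blocks with high mean gradient energy as textural; the window size, threshold, and gradient-energy criterion are illustrative assumptions, not the paper's detector.

```python
import numpy as np

def texture_mask(gray, win=8, thresh=0.02):
    """Mark win x win blocks whose mean gradient energy exceeds `thresh`."""
    gy, gx = np.gradient(gray)
    energy = gx ** 2 + gy ** 2
    h, w = gray.shape
    mask = np.zeros((h // win, w // win), dtype=bool)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            block = energy[i * win:(i + 1) * win, j * win:(j + 1) * win]
            mask[i, j] = block.mean() > thresh
    return mask  # True blocks -> synthesis; False blocks -> fast multi-operators

demo = np.random.default_rng(2).random((64, 64))  # noisy stand-in image
print(texture_mask(demo).mean())                  # fraction flagged as texture
```
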
The Japanese comic format known as Manga is popular all over the world. It is traditionally produced in black and white, and colorization is time consuming and costly. Automatic colorization methods generally rely on greyscale values, which are not present in manga. Furthermore, due to copyright protection, colorized manga available for training is scarce. We propose a manga colorization method based on conditional Generative Adversarial Networks (cGAN). Unlike previous cGAN approaches that use many hundreds or thousands of training images, our method requires only a single colorized reference image for training, avoiding the need for a large dataset. Colorizing manga using cGANs can produce blurry results with artifacts, and the resolution is limited. We therefore also propose a segmentation and color-correction method to mitigate these issues. The final results are sharp, clear, and high resolution, and stay true to the characters' original color scheme.
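
A minimal sketch of the single-reference cGAN objective is given below: a toy generator colorizes grayscale crops, a discriminator judges (input, color) pairs, and an L1 term keeps outputs near the reference colors. The tiny networks, the random tensors standing in for a real manga page, and the random-crop augmentation are illustrative assumptions, not the paper's released model.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid())
D = nn.Sequential(nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),
                  nn.Conv2d(32, 1, 3, stride=2, padding=1))
g_opt = torch.optim.Adam(G.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

bw_page = torch.rand(1, 1, 256, 256)    # black-and-white page (stand-in)
color_ref = torch.rand(1, 3, 256, 256)  # its single colorized reference

for step in range(100):                 # crops of one pair act as the dataset
    y, x = torch.randint(0, 192, (2,)).tolist()
    bw = bw_page[..., y:y+64, x:x+64]
    col = color_ref[..., y:y+64, x:x+64]
    fake = G(bw)
    # Discriminator step: real pairs vs. generated pairs.
    d_real = D(torch.cat([bw, col], 1))
    d_fake = D(torch.cat([bw, fake.detach()], 1))
    d_loss = (bce(d_real, torch.ones_like(d_real)) +
              bce(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()
    # Generator step: fool D while staying close to the reference colors.
    d_fake = D(torch.cat([bw, fake], 1))
    g_loss = (bce(d_fake, torch.ones_like(d_fake)) +
              nn.functional.l1_loss(fake, col))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```
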