
Multimodal Style Transfer via Graph Cuts

Published by Yulun Zhang
Publication date: 2019
Research field: Informatics engineering
Paper language: English





An assumption widely used in recent neural style transfer methods is that image styles can be described by global statistics of deep features, such as Gram or covariance matrices. Alternative approaches have represented styles by decomposing them into local pixel or neural patches. Despite the recent progress, most existing methods treat the semantic patterns of the style image uniformly, resulting in unpleasing results on complex styles. In this paper, we introduce a more flexible and general universal style transfer technique: multimodal style transfer (MST). MST explicitly considers the matching of semantic patterns in content and style images. Specifically, the style image features are clustered into sub-style components, which are matched with local content features under a graph cut formulation. A reconstruction network is trained to transfer each sub-style and render the final stylized result. We also generalize MST to improve some existing methods. Extensive experiments demonstrate the superior effectiveness, robustness, and flexibility of MST.
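To make the clustering-and-matching idea concrete, here is a minimal Python sketch of assigning content features to sub-style components. It assumes pre-extracted deep feature maps (e.g. from a VGG encoder) and replaces the paper's graph-cut labeling, which adds a spatial smoothness term, with plain nearest-centroid assignment; the shapes and the number of clusters k are illustrative, not the authors' settings.

import numpy as np
from sklearn.cluster import KMeans

def match_sub_styles(content_feat, style_feat, k=3):
    """content_feat, style_feat: (C, H, W) deep feature maps."""
    C, Hc, Wc = content_feat.shape
    style_vecs = style_feat.reshape(style_feat.shape[0], -1).T   # (Hs*Ws, C)
    content_vecs = content_feat.reshape(C, -1).T                 # (Hc*Wc, C)

    # 1. Cluster style features into k sub-style components.
    km = KMeans(n_clusters=k, n_init=10).fit(style_vecs)

    # 2. Assign each content feature to its closest sub-style centroid.
    #    (MST solves this jointly with a smoothness term via graph cut;
    #    this sketch drops the smoothness term.)
    labels = km.predict(content_vecs)                            # (Hc*Wc,)
    return labels.reshape(Hc, Wc), km.cluster_centers_

# Usage with random stand-in features:
content = np.random.randn(64, 32, 32).astype(np.float32)
style = np.random.randn(64, 32, 32).astype(np.float32)
label_map, centers = match_sub_styles(content, style, k=3)
print(label_map.shape, centers.shape)  # (32, 32) (3, 64)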


Read also

Video style transfer is receiving growing attention in the AI community for its numerous applications, such as augmented reality and animation production. Compared with traditional image style transfer, performing this task on video presents new challenges: how to effectively generate satisfactory stylized results for any specified style while maintaining temporal coherence across frames. Towards this end, we propose the Multi-Channel Correlation network (MCCNet), which can be trained to fuse exemplar style features and input content features for efficient style transfer while naturally maintaining the coherence of input videos. Specifically, MCCNet works directly in the feature space of the style and content domains, where it learns to rearrange and fuse style features based on their similarity to content features. The outputs generated by MCCNet are features containing the desired style patterns, which can further be decoded into images with vivid style textures. Moreover, MCCNet is designed to explicitly align the features to the input, which ensures that the output maintains the content structures as well as temporal continuity. To further improve the performance of MCCNet under complex lighting conditions, we also introduce an illumination loss during training. Qualitative and quantitative evaluations demonstrate that MCCNet performs well in both arbitrary video and image style transfer tasks.
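The step of rearranging and fusing style features based on their similarity with content features reads like an attention mechanism. The following is a hedged sketch of that reading, not MCCNet's actual architecture; all names and shapes are illustrative.

import numpy as np

def rearrange_style(content_vecs, style_vecs):
    """content_vecs: (Nc, C), style_vecs: (Ns, C) flattened features."""
    # Normalize so the dot product measures cosine similarity.
    c = content_vecs / (np.linalg.norm(content_vecs, axis=1, keepdims=True) + 1e-8)
    s = style_vecs / (np.linalg.norm(style_vecs, axis=1, keepdims=True) + 1e-8)
    affinity = c @ s.T                                  # (Nc, Ns)
    weights = np.exp(affinity)
    weights /= weights.sum(axis=1, keepdims=True)       # softmax over style positions
    return weights @ style_vecs                         # (Nc, C) rearranged style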
Universal Neural Style Transfer (NST) methods are capable of performing style transfer of arbitrary styles in a style-agnostic manner via feature transforms in (almost) real-time. Even though their unimodal parametric style modeling approach has proven adequate for transferring a single style from relatively simple images, they are usually not capable of effectively handling more complex styles, producing significant artifacts and reducing the quality of the synthesized textures in the stylized image. To overcome these limitations, in this paper we propose a novel universal NST approach that separately models each sub-style that exists in a given style image (or a collection of style images). This allows for better modeling of the subtle style differences within the same style image and for using the most appropriate sub-style (or mixture of different sub-styles) to stylize the content image. Extensive experiments demonstrate the ability of the proposed approach to a) perform a wide range of different stylizations using the sub-styles that exist in one style image, while allowing the user to appropriately mix the different sub-styles, b) automatically match the most appropriate sub-style to different semantic regions of the content image, improving on existing state-of-the-art universal NST approaches, and c) detect and transfer the sub-styles from collections of images.
Arbitrary style transfer aims to synthesize a content image with the style of another image, creating a third image that has never been seen before. Recent arbitrary style transfer algorithms find it challenging to balance the content structure and the style patterns. Moreover, simultaneously maintaining the global and local style patterns is difficult due to the patch-based mechanism. In this paper, we introduce a novel style-attentional network (SANet) that efficiently and flexibly integrates the local style patterns according to the semantic spatial distribution of the content image. A new identity loss function and multi-level feature embeddings enable our SANet and decoder to preserve the content structure as much as possible while enriching the style patterns. Experimental results demonstrate that our algorithm synthesizes stylized images in real time that are higher in quality than those produced by state-of-the-art algorithms.
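The identity loss mentioned above has a simple structure: when one image serves as both content and style, the network should reproduce it, both in pixels and in multi-level features. Below is a rough PyTorch sketch, where stylize, extract_feats, and the weights w1 and w2 are stand-ins rather than the paper's exact settings.

import torch
import torch.nn.functional as F

def identity_loss(stylize, extract_feats, content, style, w1=1.0, w2=50.0):
    """stylize(c, s) -> stylized image; extract_feats(x) -> list of feature maps."""
    icc = stylize(content, content)   # content used as its own style
    iss = stylize(style, style)       # style used as its own content
    # Pixel-level term: the reconstruction should match the input image.
    loss = w1 * (F.mse_loss(icc, content) + F.mse_loss(iss, style))
    # Multi-level feature term: features of the reconstruction should
    # match features of the input at every chosen encoder layer.
    for f_rec, f_in in zip(extract_feats(icc), extract_feats(content)):
        loss = loss + w2 * F.mse_loss(f_rec, f_in)
    for f_rec, f_in in zip(extract_feats(iss), extract_feats(style)):
        loss = loss + w2 * F.mse_loss(f_rec, f_in)
    return loss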
Lei Li, Fuping Wu, Guang Yang (2019)
Late gadolinium enhancement magnetic resonance imaging (LGE MRI) appears to be a promising alternative for scar assessment in patients with atrial fibrillation (AF). Automating the quantification and analysis of atrial scars can be challenging due to the low image quality. In this work, we propose a fully automated method based on the graph-cuts framework, where the potentials of the graph are learned on a surface mesh of the left atrium (LA) using a multi-scale convolutional neural network (MS-CNN). For validation, we have employed fifty-eight images with manual delineations. MS-CNN, which can efficiently incorporate both the local and global texture information of the images, has been shown to markedly improve the segmentation accuracy of the proposed graph-cuts based method. The segmentation can be further improved when the contributions of the t-link and n-link weights of the graph are balanced. The proposed method achieves a mean accuracy of 0.856 ± 0.033 and a mean Dice score of 0.702 ± 0.071 for LA scar quantification. Compared with conventional methods, which are based on the manual delineation of the LA for initialization, our method is fully automatic and demonstrates significantly better Dice score and accuracy (p < 0.01). The method is promising and can be useful in the diagnosis and prognosis of AF.
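For readers unfamiliar with the t-link/n-link terminology, the energy being balanced has a generic form: a unary (t-link) term scoring each pixel's label against the MS-CNN output, plus a pairwise (n-link) smoothness term, traded off by a weight. The sketch below only evaluates that energy for a candidate labeling; the max-flow solver that actually minimizes it is omitted, and all names are assumptions of this illustration.

import numpy as np

def graph_energy(labels, fg_prob, lam=0.5, eps=1e-8):
    """labels: (H, W) binary scar mask, fg_prob: (H, W) CNN foreground probability."""
    # t-links: negative log-likelihood of each pixel's current label.
    unary = np.where(labels == 1,
                     -np.log(fg_prob + eps),
                     -np.log(1.0 - fg_prob + eps)).sum()
    # n-links: penalize label disagreement between 4-connected neighbours.
    pairwise = (np.abs(np.diff(labels, axis=0)).sum()
                + np.abs(np.diff(labels, axis=1)).sum())
    # lam balances the two contributions, as discussed in the abstract.
    return unary + lam * pairwise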
Arbitrary image style transfer is a challenging task that aims to stylize a content image conditioned on an arbitrary style image. In this task, the content-style feature transformation is a critical component for a proper fusion of features. Existing feature transformation algorithms often suffer from unstable learning, loss of content and style details, and unnatural stroke patterns. To mitigate these issues, this paper proposes a parameter-free algorithm, Style Projection, for fast yet effective content-style transformation. To leverage the proposed Style Projection component, this paper further presents a real-time feed-forward model for arbitrary style transfer, including a regularization term for matching the content semantics between inputs and outputs. Extensive experiments have demonstrated the effectiveness and efficiency of the proposed method in terms of qualitative analysis, quantitative evaluation, and user study.
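The abstract does not spell out the Style Projection operator itself, so as a stand-in the sketch below shows a different, well-known parameter-free content-style transform: channel-wise mean/variance alignment in the style of AdaIN. It illustrates the general idea of a parameter-free feature transformation, not the paper's method.

import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """content_feat, style_feat: (C, H, W). Returns re-styled content features."""
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)
    # Strip the content's channel statistics, then impose the style's.
    return s_std * (content_feat - c_mean) / (c_std + eps) + s_mean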