Structure First Detail Next: Image Inpainting with Pyramid Generator

182 0 0.0 ( 0 )

Download Cite

Added by Shuyi Qu

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Shuyi Qu - Zhenxing Niu - Kaizhu Huang

Computer Vision and Pattern Recognition

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Recent deep generative models have achieved promising performance in image inpainting. However, it is still very challenging for a neural network to generate realistic image details and textures, due to its inherent spectral bias. By our understanding of how artists work, we suggest to adopt a `structure first detail next workflow for image inpainting. To this end, we propose to build a Pyramid Generator by stacking several sub-generators, where lower-layer sub-generators focus on restoring image structures while the higher-layer sub-generators emphasize image details. Given an input image, it will be gradually restored by going through the entire pyramid in a bottom-up fashion. Particularly, our approach has a learning scheme of progressively increasing hole size, which allows it to restore large-hole images. In addition, our method could fully exploit the benefits of learning with high-resolution images, and hence is suitable for high-resolution image inpainting. Extensive experimental results on benchmark datasets have validated the effectiveness of our approach compared with state-of-the-art methods.

rate research

Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE

97 - Jialun Peng , Dong Liu , Songcen Xu 2021

Given an incomplete image without additional constraint, image inpainting natively allows for multiple solutions as long as they appear plausible. Recently, multiplesolution inpainting methods have been proposed and shown the potential of generating diverse results. However, these methods have difficulty in ensuring the quality of each solution, e.g. they produce distorted structure and/or blurry texture. We propose a two-stage model for diverse inpainting, where the first stage generates multiple coarse results each of which has a different structure, and the second stage refines each coarse result separately by augmenting texture. The proposed model is inspired by the hierarchical vector quantized variational auto-encoder (VQ-VAE), whose hierarchical architecture isentangles structural and textural information. In addition, the vector quantization in VQVAE enables autoregressive modeling of the discrete distribution over the structural information. Sampling from the distribution can easily generate diverse and high-quality structures, making up the first stage of our model. In the second stage, we propose a structural attention module inside the texture generation network, where the module utilizes the structural information to capture distant correlations. We further reuse the VQ-VAE to calculate two feature losses, which help improve structure coherence and texture realism, respectively. Experimental results on CelebA-HQ, Places2, and ImageNet datasets show that our method not only enhances the diversity of the inpainting solutions but also improves the visual quality of the generated multiple images. Code and models are available at: https://github.com/USTC-JialunPeng/Diverse-Structure-Inpainting.

Computer Vision and Pattern Recognition

StructureFlow: Image Inpainting via Structure-aware Appearance Flow

253 - Yurui Ren , Xiaoming Yu , Ruonan Zhang 2019

Image inpainting techniques have shown significant improvements by using deep neural networks recently. However, most of them may either fail to reconstruct reasonable structures or restore fine-grained textures. In order to solve this problem, in this paper, we propose a two-stage model which splits the inpainting task into two parts: structure reconstruction and texture generation. In the first stage, edge-preserved smooth images are employed to train a structure reconstructor which completes the missing structures of the inputs. In the second stage, based on the reconstructed structures, a texture generator using appearance flow is designed to yield image details. Experiments on multiple publicly available datasets show the superior performance of the proposed network.

Computer Vision and Pattern Recognition

Detail Preserving Residual Feature Pyramid Modules for Optical Flow

89 - Libo Long , Jochen Lang 2021

Feature pyramids and iterative refinement have recently led to great progress in optical flow estimation. However, downsampling in feature pyramids can cause blending of foreground objects with the background, which will mislead subsequent decisions in the iterative processing. The results are missing details especially in the flow of thin and of small structures. We propose a novel Residual Feature Pyramid Module (RFPM) which retains important details in the feature map without changing the overall iterative refinement design of the optical flow estimation. RFPM incorporates a residual structure between multiple feature pyramids into a downsampling module that corrects the blending of objects across boundaries. We demonstrate how to integrate our module with two state-of-the-art iterative refinement architectures. Results show that our RFPM visibly reduces flow errors and improves state-of-art performance in the clean pass of Sintel, and is one of the top-performing methods in KITTI. According to the particular modular structure of RFPM, we introduce a special transfer learning approach that can dramatically decrease the training time compared to a typical full optical flow training schedule on multiple datasets.

Computer Vision and Pattern Recognition

Semi-parametric Image Inpainting

97 - Karim Iskakov 2018

This paper introduces a semi-parametric approach to image inpainting for irregular holes. The nonparametric part consists of an external image database. During test time database is used to retrieve a supplementary image, similar to the input masked picture, and utilize it as auxiliary information for the deep neural network. Further, we propose a novel method of generating masks with irregular holes and present public dataset with such masks. Experiments on CelebA-HQ dataset show that our semi-parametric method yields more realistic results than previous approaches, which is confirmed by the user study.

Computer Vision and Pattern Recognition

Video Super-Resolution with Recurrent Structure-Detail Network

135 - Takashi Isobe , Xu Jia , Shuhang Gu 2020

Most video super-resolution methods super-resolve a single reference frame with the help of neighboring frames in a temporal sliding window. They are less efficient compared to the recurrent-based methods. In this work, we propose a novel recurrent video super-resolution method which is both effective and efficient in exploiting previous frames to super-resolve the current frame. It divides the input into structure and detail components which are fed to a recurrent unit composed of several proposed two-stream structure-detail blocks. In addition, a hidden state adaptation module that allows the current frame to selectively use information from hidden state is introduced to enhance its robustness to appearance change and error accumulation. Extensive ablation study validate the effectiveness of the proposed modules. Experiments on several benchmark datasets demonstrate the superior performance of the proposed method compared to state-of-the-art methods on video super-resolution.

Computer Vision and Pattern Recognition

comments

Fetching comments

American University of Beirut

Additional details More universities

Structure First Detail Next: Image Inpainting with Pyramid Generator

Ask ChatGPT about the research

No Arabic abstract

Read More