Neural painting refers to the procedure of producing a series of strokes for a given image and recreating it non-photo-realistically using neural networks. While reinforcement learning (RL) agents can generate a stroke sequence step by step for this task, it is not easy to train a stable RL agent. Stroke-optimization methods, on the other hand, iteratively search for a set of stroke parameters in a large search space; such low efficiency significantly limits their prevalence and practicality. Different from previous methods, in this paper we formulate the task as a set prediction problem and propose a novel Transformer-based framework, dubbed Paint Transformer, to predict the parameters of a stroke set with a feed-forward network. This way, our model can generate a set of strokes in parallel and obtain the final painting of size 512 × 512 in near real time. More importantly, since there is no dataset available for training the Paint Transformer, we devise a self-training pipeline such that it can be trained without any off-the-shelf dataset while still achieving excellent generalization capability. Experiments demonstrate that our method achieves better painting performance than previous ones with cheaper training and inference costs. Code and models are available.
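As an illustration of the set-prediction formulation only (not the released Paint Transformer implementation), the following PyTorch sketch predicts a fixed-size set of stroke parameters in parallel from image features with a small Transformer decoder; the module names, the number of strokes, and the 8-dimensional stroke parameterization are assumptions for illustration.

# Minimal sketch of feed-forward stroke-set prediction (assumed shapes and names).
import torch
import torch.nn as nn

class StrokeSetPredictor(nn.Module):
    def __init__(self, d_model=256, n_strokes=8, stroke_dim=8):
        super().__init__()
        # CNN encoder turns the (target, current canvas) pair into a feature sequence.
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, d_model, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Learned stroke queries, one per predicted stroke (set prediction, DETR-style).
        self.queries = nn.Parameter(torch.randn(n_strokes, d_model))
        decoder_layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(decoder_layer, num_layers=3)
        # Heads for stroke parameters (in [0, 1]) and a confidence score per stroke.
        self.param_head = nn.Sequential(nn.Linear(d_model, stroke_dim), nn.Sigmoid())
        self.conf_head = nn.Linear(d_model, 1)

    def forward(self, target, canvas):
        x = torch.cat([target, canvas], dim=1)             # (B, 6, H, W)
        feat = self.encoder(x).flatten(2).transpose(1, 2)  # (B, H*W/16, d_model)
        q = self.queries.unsqueeze(0).expand(x.size(0), -1, -1)
        h = self.decoder(q, feat)                          # (B, n_strokes, d_model)
        return self.param_head(h), self.conf_head(h)       # all strokes in one pass

target = canvas = torch.rand(1, 3, 64, 64)
params, conf = StrokeSetPredictor()(target, canvas)
print(params.shape, conf.shape)  # torch.Size([1, 8, 8]) torch.Size([1, 8, 1])

Because all stroke queries are decoded in a single forward pass, the predicted strokes come out together as a set rather than one at a time, which is what enables the near-real-time painting described above.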
Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role in the network remains under-explored. We show that feed-forward layers in transformer-based language models operate as key-value memories, where each key correlates with textual patterns in the training examples and each value induces a distribution over the output vocabulary. Our experiments show that the learned patterns are human-interpretable, and that lower layers tend to capture shallow patterns while upper layers learn more semantic ones. The values complement the keys' input patterns by inducing output distributions that concentrate probability mass on tokens likely to appear immediately after each pattern, particularly in the upper layers. Finally, we demonstrate that the output of a feed-forward layer is a composition of its memories, which is subsequently refined throughout the model's layers via residual connections to produce the final output distribution.
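To make the key-value reading concrete, the toy sketch below (assumed dimensions and random weights, not the paper's code) writes a feed-forward layer as FF(x) = f(x·K^T)·V: each row of K acts as a pattern detector whose activation coefficient weights the corresponding value vector, and projecting a value through an assumed output embedding shows which vocabulary tokens that memory promotes.

# Toy view of a feed-forward layer as key-value memory: FF(x) = f(x K^T) V.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, vocab = 16, 64, 100

K = rng.normal(size=(d_ff, d_model))   # keys: one pattern detector per memory cell
V = rng.normal(size=(d_ff, d_model))   # values: what each memory writes back
E = rng.normal(size=(vocab, d_model))  # output embedding (assumed given)

x = rng.normal(size=(d_model,))        # hidden state of one token position

coeffs = np.maximum(x @ K.T, 0.0)      # ReLU memory coefficients, shape (d_ff,)
ffn_out = coeffs @ V                   # output = weighted composition of values
print("FFN output shape:", ffn_out.shape)

# Which memories fired most for this input?
top_cells = np.argsort(coeffs)[::-1][:5]
print("top memory cells:", top_cells)

# Which vocabulary tokens does the strongest memory's value promote?
logits = E @ V[top_cells[0]]
print("tokens promoted by cell", top_cells[0], ":", np.argsort(logits)[::-1][:5])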
This paper proposes an image-to-painting translation method that generates vivid and realistic painting artworks with controllable styles. Different from previous image-to-image translation methods that formulate the translation as pixel-wise prediction, we deal with such an artistic creation process in a vectorized environment and produce a sequence of physically meaningful stroke parameters that can be further used for rendering. Since a typical vector renderer is not differentiable, we design a novel neural renderer that imitates the behavior of the vector renderer and then frame stroke prediction as a parameter-searching process that maximizes the similarity between the input and the rendering output. We explore the zero-gradient problem in parameter searching and propose to solve it from an optimal-transportation perspective. We also show that previous neural renderers have a parameter-coupling problem, and we re-design the rendering network with a rasterization network and a shading network that better handle the disentanglement of shape and color. Experiments show that the paintings generated by our method have a high degree of fidelity in both global appearance and local textures. Our method can also be jointly optimized with neural style transfer to further transfer visual style from other images. Our code and animated results are available at \url{https://jiupinjia.github.io/neuralpainter/}.
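A minimal sketch of the parameter-searching idea, assuming a differentiable neural renderer callable that maps a batch of stroke parameters to RGBA stroke images. The renderer, the 8-dimensional stroke parameterization, and the plain pixel loss are placeholders; the paper's optimal-transport loss and the rasterization/shading two-branch design are not reproduced here.

# Sketch: optimize stroke parameters by gradient descent through a neural renderer.
import torch

def paint(target, renderer, n_strokes=64, steps=200, lr=0.02):
    # target: (3, H, W) in [0, 1]; renderer: params (N, P) -> (N, 4, H, W) RGBA strokes.
    params = torch.rand(n_strokes, 8, requires_grad=True)    # assumed 8-dim strokes
    opt = torch.optim.Adam([params], lr=lr)
    for _ in range(steps):
        strokes = renderer(torch.sigmoid(params))             # keep params in [0, 1]
        canvas = torch.zeros_like(target)
        for s in strokes:                                      # alpha-composite strokes
            rgb, alpha = s[:3], s[3:4]
            canvas = alpha * rgb + (1 - alpha) * canvas
        loss = torch.mean((canvas - target) ** 2)              # placeholder pixel loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return params.detach(), canvas.detach()

# Stand-in renderer for a quick smoke test; a real neural renderer would be trained.
target = torch.rand(3, 32, 32)
toy_renderer = lambda p: p.mean(dim=1).view(-1, 1, 1, 1).expand(-1, 4, 32, 32)
params, canvas = paint(target, toy_renderer, n_strokes=4, steps=10)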
Deep neural networks have been widely applied as an effective approach to handling complex and practical problems. However, one of the most fundamental open problems is the lack of formal methods to analyze the safety of their behaviors. To address this challenge, we propose a parallelizable technique to compute the exact reachable set of a neural network for a given input set. Our method currently focuses on feed-forward neural networks with ReLU activation functions. One of the primary challenges for polytope-based approaches is identifying the intersections between intermediate polytopes and the hyperplanes induced by neurons. In this regard, we present a new approach that constructs the polytopes with the face lattice, a complete combinatorial structure. The correctness and performance of our methodology are evaluated by verifying the safety of the ACAS Xu networks and other benchmarks. Compared to state-of-the-art methods such as Reluplex, Marabou, and NNV, our approach exhibits significantly higher efficiency. Additionally, our approach is capable of constructing the complete input set for a given output set, so that any input that leads to a safety violation can be tracked.
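The face-lattice construction itself is beyond a short snippet, but the sketch below illustrates the general exact-reachability idea such polytope methods build on: propagate an affine representation of the input polytope through each layer and split it at every ReLU hyperplane, keeping only feasible branches (feasibility checked with a linear program). All names and the tiny example network are assumptions; this is plain case enumeration for illustration, not the paper's parallel face-lattice algorithm.

# Sketch: exact ReLU reachability by case-splitting affine pieces of the input polytope.
import numpy as np
from scipy.optimize import linprog

def feasible(A, b):
    # Is {x : A x <= b} non-empty? Solve a trivial LP with a zero objective.
    res = linprog(c=np.zeros(A.shape[1]), A_ub=A, b_ub=b,
                  bounds=[(None, None)] * A.shape[1])
    return res.status == 0

def relu_split(sets, layer_W, layer_c):
    # Each set is (M, d, A, b): outputs are {M x + d : A x <= b} over original inputs x.
    new_sets = []
    for M, d, A, b in sets:
        M, d = layer_W @ M, layer_W @ d + layer_c       # affine layer, exact
        pieces = [(M, d, A, b)]
        for i in range(M.shape[0]):                      # split on each neuron's sign
            nxt = []
            for Mp, dp, Ap, bp in pieces:
                A_pos = np.vstack([Ap, -Mp[i]]); b_pos = np.append(bp, dp[i])   # M_i x + d_i >= 0
                A_neg = np.vstack([Ap,  Mp[i]]); b_neg = np.append(bp, -dp[i])  # M_i x + d_i <= 0
                if feasible(A_pos, b_pos):
                    nxt.append((Mp, dp, A_pos, b_pos))
                if feasible(A_neg, b_neg):
                    Mz, dz = Mp.copy(), dp.copy()
                    Mz[i], dz[i] = 0.0, 0.0              # ReLU zeroes this output
                    nxt.append((Mz, dz, A_neg, b_neg))
            pieces = nxt
        new_sets.extend(pieces)
    return new_sets

# Unit-box input [0, 1]^2 as {x : A x <= b}, pushed through one ReLU layer.
A0 = np.vstack([np.eye(2), -np.eye(2)]); b0 = np.array([1.0, 1.0, 0.0, 0.0])
sets = [(np.eye(2), np.zeros(2), A0, b0)]
W1, c1 = np.array([[1.0, -1.0], [1.0, 1.0]]), np.array([-0.5, 0.0])
pieces = relu_split(sets, W1, c1)
print(len(pieces), "affine pieces cover the exact reachable set")

Because every piece keeps its constraints in the original input space, any unsafe output region can be intersected with a piece and mapped back to the inputs that produce it, which is the tracking property mentioned above.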
We propose a general framework for finding the ground state of many-body fermionic systems using feed-forward neural networks. The anticommutation relation for fermions is usually implemented in a variational wave function via a Slater determinant (or Pfaffian), which is a computational bottleneck because of its $O(N^3)$ numerical cost for $N$ particles. We bypass this bottleneck by explicitly calculating the sign changes associated with particle exchanges in real space and using fully connected neural networks to optimize the remaining parts of the wave function. This reduces the computational cost to $O(N^2)$ or less. We show that the accuracy of the approximation can be improved by optimizing the variance of the energy simultaneously with the energy itself. We also find that a reweighting method in Monte Carlo sampling can stabilize the calculation. These improvements can be applied to other approaches based on variational Monte Carlo methods. Moreover, we show that the accuracy can be further improved by using the symmetry of the system, representative states, and an additional neural network implementing a generalized Gutzwiller-Jastrow factor. We demonstrate the efficiency of the method by applying it to a two-dimensional Hubbard model.
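A toy one-dimensional sketch of the sign-bookkeeping idea (assumed names and a stand-in network; the actual method works with real-space lattice configurations and trained networks): sorting the particle coordinates into a canonical order and attaching the sign of the sorting permutation enforces antisymmetry at the cost of a sort rather than an $O(N^3)$ determinant.

# Sketch: antisymmetry via an explicit exchange sign instead of a Slater determinant.
import numpy as np

def permutation_sign(perm):
    # Sign of a permutation via cycle decomposition, O(N).
    perm = np.asarray(perm)
    visited = np.zeros(len(perm), dtype=bool)
    sign = 1
    for i in range(len(perm)):
        if not visited[i]:
            j, cycle_len = i, 0
            while not visited[j]:
                visited[j] = True
                j = perm[j]
                cycle_len += 1
            if cycle_len % 2 == 0:
                sign = -sign
    return sign

def antisymmetric_psi(positions, net):
    # positions: (N,) 1-D particle coordinates (a toy stand-in for real-space configs).
    order = np.argsort(positions, kind="stable")   # O(N log N), not O(N^3)
    sign = permutation_sign(order)
    return sign * net(positions[order])            # network sees a canonical ordering

# Toy "network": any smooth function of the ordered coordinates suffices for the check.
net = lambda x: np.exp(-np.sum(x ** 2)) * (1.0 + 0.1 * np.sum(np.cos(x)))

x = np.array([0.3, -1.2, 0.7])
x_swapped = x[[1, 0, 2]]                           # exchange particles 0 and 1
print(antisymmetric_psi(x, net), antisymmetric_psi(x_swapped, net))  # opposite signs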
Both photonic quantum computation and the establishment of a quantum internet require fiber-based measurement and feed-forward in order to be compatible with existing infrastructure. Here we present a fiber-compatible scheme for measurement and feed-forward, whose performance is benchmarked by carrying out remote preparation of single-photon polarization states at telecom wavelengths. The result of a projective measurement on one photon deterministically controls the path a second photon takes with ultrafast optical switches. By placing well-calibrated bulk passive polarization optics in the paths, we achieve a measurement and feed-forward fidelity of (99.0 $\pm$ 1)%, after correcting for other experimental errors. Our methods are useful for photonic quantum experiments including computing, communication, and teleportation.
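A toy numerical sketch of the measurement-and-feed-forward step (an idealized, lossless two-photon simulation with assumed names, not the fiber and switch hardware): a projective measurement on photon A yields one of two outcomes, and the feed-forward branch applies a fixed correction rotation to photon B. The targets are restricted to linear polarization states here so that a single pre-calibrated correction works for every target, which is an assumption made for illustration.

# Toy simulation of remote state preparation with measurement and feed-forward.
import numpy as np

rng = np.random.default_rng(1)
theta = np.deg2rad(25.0)                               # target linear polarization
psi = np.array([np.cos(theta), np.sin(theta)])         # target |psi> for photon B
psi_perp = np.array([-np.sin(theta), np.cos(theta)])

bell = (np.kron([1, 0], [1, 0]) + np.kron([0, 1], [0, 1])) / np.sqrt(2)  # |Phi+>
correction = np.array([[0.0, 1.0], [-1.0, 0.0]])       # fixed 90-degree rotation

fidelities = []
for _ in range(1000):
    # Photon A is measured in the {|psi>, |psi_perp>} basis.
    amps = [np.kron(np.conj(basis_state), np.eye(2)) @ bell
            for basis_state in (psi, psi_perp)]
    probs = [np.linalg.norm(a) ** 2 for a in amps]
    outcome = rng.choice(2, p=probs)
    photon_b = amps[outcome] / np.linalg.norm(amps[outcome])
    if outcome == 1:                                    # feed-forward: route photon B
        photon_b = correction @ photon_b                # through the correction optics
    fidelities.append(abs(np.vdot(psi, photon_b)) ** 2)

print("mean preparation fidelity:", np.mean(fidelities))   # ~1.0 in the ideal case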