Learning to Write Stylized Chinese Characters by Reading a Handful of Examples


Published by Danyang Sun

Publication date: 2017

Research field: Informatics Engineering

Language: English

Authors: Danyang Sun, Tongzheng Ren, Chongxun Li


Automatically writing stylized Chinese characters is an attractive yet challenging task due to its wide applicability. In this paper, we propose a novel framework named Style-Aware Variational Auto-Encoder (SA-VAE) to flexibly generate Chinese characters. Specifically, we propose to capture the different characteristics of a Chinese character by disentangling the latent features into content-related and style-related components. Considering the complex shapes and structures of Chinese characters, we incorporate the structure information as prior knowledge into our framework to guide the generation. Our framework shows a powerful one-shot/low-shot generalization ability by inferring the style component given a character with unseen style. To the best of our knowledge, this is the first attempt to learn to write new-style Chinese characters by observing only one or a few examples. Extensive experiments demonstrate its effectiveness in generating different stylized Chinese characters by fusing the feature vectors corresponding to different contents and styles, which is of significant importance in real-world applications.
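The fusion of content and style feature vectors mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how style codes from several reference characters are combined or how the two components are joined, so this sketch simply averages the reference style codes and concatenates the result with a content code. The names `infer_style` and `fuse` and all numeric values are hypothetical, and no trained encoder or decoder is involved.

```python
from statistics import fmean

def infer_style(reference_style_codes):
    """Average per-example style codes into one style vector.

    Assumption: averaging is used here only as a simple stand-in for
    low-shot style inference; the paper may combine examples differently.
    """
    return [fmean(dim) for dim in zip(*reference_style_codes)]

def fuse(content_code, style_code):
    """Concatenate a content code and a style code into one decoder input."""
    return list(content_code) + list(style_code)

# Hypothetical codes; in a real model these would come from trained encoders.
content_code = [0.2, -1.0, 0.5]          # which character to write
style_refs = [[1.0, 0.0], [3.0, 2.0]]    # style codes of two example characters
style_code = infer_style(style_refs)
decoder_input = fuse(content_code, style_code)
print(decoder_input)
```

Swapping `content_code` while keeping `style_code` fixed corresponds to writing a different character in the same style; swapping `style_code` corresponds to writing the same character in a different style.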


142 - Philippe Laban , Andrew Hsi , John Canny 2021

This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency for a given length constraint. It introduces a novel method that encourages the inclusion of key terms from the original document into the summary: key terms are masked out of the original document and must be filled in by a coverage model using the current generated summary. A novel unsupervised training procedure leverages this coverage model along with a fluency model to generate and score summaries. When tested on popular news summarization datasets, the method outperforms previous unsupervised methods by more than 2 R-1 points, and approaches results of competitive supervised methods. Our model attains higher levels of abstraction with copied passages roughly two times shorter than prior work, and learns to compress and merge sentences without supervision.

Computation and Language, Machine Learning
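The key-term masking step described in the summarization abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: how key terms are selected and how the coverage model fills the blanks are not given in the abstract, so the key-term list is supplied by hand and the coverage score is simply the fraction of masked terms recovered. The names `mask_key_terms` and `coverage_score` and the `[MASK]` token are hypothetical.

```python
import re

def mask_key_terms(document, key_terms, mask="[MASK]"):
    """Replace each key term in the document with a mask token.

    Returns the masked text and the list of original terms, in order.
    """
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in key_terms) + r")\b",
        re.IGNORECASE,
    )
    answers = [m.group(0) for m in pattern.finditer(document)]
    return pattern.sub(mask, document), answers

def coverage_score(predicted, answers):
    """Fraction of masked terms that a (hypothetical) model filled in correctly."""
    if not answers:
        return 0.0
    correct = sum(p.lower() == a.lower() for p, a in zip(predicted, answers))
    return correct / len(answers)

document = "The senate passed the budget. The budget cuts taxes."
masked, answers = mask_key_terms(document, ["senate", "budget", "taxes"])
print(masked)

# Stand-in for the fill-ins a coverage model might produce from a summary.
predicted = ["senate", "budget", "budget", "income"]
print(coverage_score(predicted, answers))
```

A summary that lets the model recover more of the masked terms would receive a higher score under this toy measure.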

Glyph-aware Embedding of Chinese Characters

74 - Falcon Z. Dai , Zheng Cai 2017

Given the advantage and recent success of English character-level and subword-unit models in several NLP tasks, we consider the equivalent modeling problem for Chinese. Chinese script is logographic and many Chinese logograms are composed of common substructures that provide semantic, phonetic and syntactic hints. In this work, we propose to explicitly incorporate the visual appearance of a character's glyph in its representation, resulting in a novel glyph-aware embedding of Chinese characters. Inspired by the success of convolutional neural networks in computer vision, we use them to incorporate the spatio-structural patterns of Chinese glyphs as rendered in raw pixels. In the context of two basic Chinese NLP tasks of language modeling and word segmentation, the model learns to represent each character's task-relevant semantic and syntactic information in the character-level embedding.

Computation and Language, Machine Learning

GAP++: Learning to generate target-conditioned adversarial examples

247 - Xiaofeng Mao , Yuefeng Chen , Yuhong Li 2020

Adversarial examples are perturbed inputs that can pose a serious threat to machine learning models. Finding these perturbations is such a hard task that we can only use iterative methods to search for them. For computational efficiency, recent works use adversarial generative networks to model the distribution of both universal and image-dependent perturbations directly. However, these methods generate perturbations that rely only on input images. In this work, we propose a more general-purpose framework which infers target-conditioned perturbations dependent on both the input image and the target label. Unlike previous single-target attack models, our model can conduct target-conditioned attacks by learning the relations between the attack target and the semantics in the image. Using extensive experiments on the MNIST and CIFAR10 datasets, we show that our method achieves superior performance with single-target attack models and obtains high fooling rates with small perturbation norms.

Computer Vision and Pattern Recognition, Machine Learning

Adversarial Examples for Evaluating Reading Comprehension Systems

95 - Robin Jia , Percy Liang 2017

Standard accuracy metrics indicate that reading comprehension systems are making rapid progress, but the extent to which these systems truly understand language remains unclear. To reward systems with real language understanding abilities, we propose an adversarial evaluation scheme for the Stanford Question Answering Dataset (SQuAD). Our method tests whether systems can answer questions about paragraphs that contain adversarially inserted sentences, which are automatically generated to distract computer systems without changing the correct answer or misleading humans. In this adversarial setting, the accuracy of sixteen published models drops from an average of 75% F1 score to 36%; when the adversary is allowed to add ungrammatical sequences of words, average accuracy on four models decreases further to 7%. We hope our insights will motivate the development of new models that understand language more precisely.

Computation and Language, Machine Learning

TextScanner: Reading Characters in Order for Robust Scene Text Recognition

102 - Zhaoyi Wan , Minghang He , Haoran Chen 2019

Driven by deep learning and the large volume of data, scene text recognition has evolved rapidly in recent years. Formerly, RNN-attention based methods dominated this field, but they suffer from the problem of attention drift in certain situations. Lately, semantic segmentation based algorithms have proven effective at recognizing text of different forms (horizontal, oriented and curved). However, these methods may produce spurious characters or miss genuine characters, as they rely heavily on a thresholding procedure operated on segmentation maps. To tackle these challenges, we propose in this paper an alternative approach, called TextScanner, for scene text recognition. TextScanner bears three characteristics: (1) Basically, it belongs to the semantic segmentation family, as it generates pixel-wise, multi-channel segmentation maps for character class, position and order; (2) Meanwhile, akin to RNN-attention based methods, it also adopts RNN for context modeling; (3) Moreover, it performs parallel prediction for character position and class, and ensures that characters are transcribed in the correct order. The experiments on standard benchmark datasets demonstrate that TextScanner outperforms the state-of-the-art methods. Moreover, TextScanner shows its superiority in recognizing more difficult text such as Chinese transcripts and aligning with target characters.

Computer Vision and Pattern Recognition, Computation and Language, Machine Learning