ﻻ يوجد ملخص باللغة العربية
We apply entropy agglomeration (EA), a recently introduced algorithm, to cluster the words of a literary text. EA is a greedy agglomerative procedure that minimizes projection entropy (PE), a function that can quantify the segmentedness of an element set. To apply it, the text is reduced to a feature allocation, a combinatorial object to represent the word occurences in the texts paragraphs. The experiment results demonstrate that EA, despite its reduction and simplicity, is useful in capturing significant relationships among the words in the text. This procedure was implemented in Python and published as a free software: REBUS.
We propose a novel data augmentation for labeled sentences called contextual augmentation. We assume an invariance that sentences are natural even if the words in the sentences are replaced with other words with paradigmatic relations. We stochastica
Pretraining deep neural network architectures with a language modeling objective has brought large improvements for many natural language processing tasks. Exemplified by BERT, a recently proposed such architecture, we demonstrate that despite being
Paraphrase generation is a longstanding important problem in natural language processing. In addition, recent progress in deep generative models has shown promising results on discrete latent variables for text generation. Inspired by variational
Despite advances in neural network language model, the representation degeneration problem of embeddings is still challenging. Recent studies have found that the learned output embeddings are degenerated into a narrow-cone distribution which makes th
We propose a new global entity disambiguation (ED) model based on contextualized embeddings of words and entities. Our model is based on a bidirectional transformer encoder (i.e., BERT) and produces contextualized embeddings for words and entities in