Characters do not convey meaning, but sequences of characters do. We propose an unsupervised distributional method to learn the abstract meaning-bearing units in a sequence of characters. Rather than segmenting the sequence, this model discovers continuous representations of the objects in the sequence, using a recently proposed architecture for object discovery in images called Slot Attention. We train our model on different languages and evaluate the quality of the obtained representations with probing classifiers. Our experiments show promising results in the ability of our units to capture meaning at a higher level of abstraction.
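As a rough illustration of the architecture involved, the sketch below implements the core Slot Attention iteration in PyTorch over a sequence of character embeddings: slots are sampled from a learned Gaussian and refined by attention that is normalised over the slot axis, followed by a GRU and MLP update. The dimensions, number of slots and iterations, and the plain embedding encoder are illustrative assumptions, not the configuration used in the paper.

```python
import torch
import torch.nn as nn

class SlotAttention(nn.Module):
    def __init__(self, num_slots=4, dim=64, iters=3):
        super().__init__()
        self.num_slots, self.iters, self.scale = num_slots, iters, dim ** -0.5
        # Slots are sampled from a learned Gaussian at every forward pass.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_logsigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        self.gru = nn.GRUCell(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.norm_inputs = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)
        self.norm_mlp = nn.LayerNorm(dim)

    def forward(self, inputs):                       # inputs: (batch, seq_len, dim)
        b, n, d = inputs.shape
        inputs = self.norm_inputs(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_mu + self.slots_logsigma.exp() * torch.randn(b, self.num_slots, d)
        for _ in range(self.iters):
            slots_prev = slots
            q = self.to_q(self.norm_slots(slots))
            # Softmax over the slot axis, so slots compete for input positions.
            attn = torch.softmax(torch.einsum('bnd,bsd->bns', k, q) * self.scale, dim=-1)
            attn = attn / attn.sum(dim=1, keepdim=True)      # weighted mean over inputs
            updates = torch.einsum('bns,bnd->bsd', attn, v)
            slots = self.gru(updates.reshape(-1, d), slots_prev.reshape(-1, d)).reshape(b, -1, d)
            slots = slots + self.mlp(self.norm_mlp(slots))
        return slots                                 # (batch, num_slots, dim)

# Toy usage: embed a character sequence and extract slot representations.
chars = torch.randint(0, 100, (2, 30))               # batch of 2 sequences, 30 characters
emb = nn.Embedding(100, 64)(chars)
print(SlotAttention()(emb).shape)                    # torch.Size([2, 4, 64])
```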
Recently, it has been argued that encoder-decoder models can be made more interpretable by replacing the softmax function in the attention mechanism with its sparse variants. In this work, we introduce a novel, simple method for achieving sparsity in attention.
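For context on the "sparse variants" of softmax referenced above, here is a small sketch of sparsemax (Martins & Astudillo, 2016), the canonical example: scores are projected onto the probability simplex, so many attention weights become exactly zero. This illustrates the prior work being referenced, not the new method introduced in the paper.

```python
import torch

def sparsemax(z, dim=-1):
    """Sparsemax: Euclidean projection of the scores onto the probability
    simplex. Unlike softmax, many outputs are exactly zero."""
    z_sorted, _ = torch.sort(z, dim=dim, descending=True)
    shape = [1] * z.dim()
    shape[dim] = -1
    k = torch.arange(1, z.size(dim) + 1, device=z.device, dtype=z.dtype).view(shape)
    z_cumsum = z_sorted.cumsum(dim)
    support = (1 + k * z_sorted) > z_cumsum          # sorted entries that stay positive
    k_support = support.sum(dim=dim, keepdim=True).to(z.dtype)
    tau = (torch.where(support, z_sorted, torch.zeros_like(z_sorted)).sum(dim, keepdim=True) - 1) / k_support
    return torch.clamp(z - tau, min=0.0)

scores = torch.tensor([[2.0, 1.0, 0.1, -1.0]])
print(torch.softmax(scores, -1))   # all entries strictly positive
print(sparsemax(scores, -1))       # tensor([[1., 0., 0., 0.]]) -- exact zeros
```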
We explore the suitability of self-attention models for character-level neural machine translation. We test the standard transformer model, as well as a novel variant in which the encoder block combines information from nearby characters using convolutions.
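A minimal sketch of the kind of encoder block described: a transformer encoder layer in PyTorch with a 1D convolution that mixes information from neighbouring character positions before self-attention. The kernel size, placement of the convolution, and residual layout are illustrative assumptions; the paper's exact variant may differ.

```python
import torch
import torch.nn as nn

class ConvCharEncoderBlock(nn.Module):
    def __init__(self, dim=256, heads=4, kernel_size=5):
        super().__init__()
        # Depth-preserving 1D convolution over nearby character positions.
        self.conv = nn.Conv1d(dim, dim, kernel_size, padding=kernel_size // 2)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))
        self.norm1, self.norm2, self.norm3 = nn.LayerNorm(dim), nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x):                            # x: (batch, seq_len, dim)
        # Convolution first, so self-attention sees locally aggregated characters.
        x = x + self.conv(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        h = self.norm2(x)
        attn_out, _ = self.attn(h, h, h)
        x = x + attn_out
        return x + self.ff(self.norm3(x))

x = torch.randn(2, 50, 256)                          # batch of 2 character sequences
print(ConvCharEncoderBlock()(x).shape)               # torch.Size([2, 50, 256])
```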
LSTMs and other RNN variants have shown strong performance on character-level language modeling. These models are typically trained using truncated backpropagation through time, and it is common to assume that their success stems from their ability to remember long-term contexts.
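The training regime mentioned here, truncated backpropagation through time, can be sketched as follows: the LSTM's hidden state is carried across consecutive chunks of the character stream, but gradients are detached at chunk boundaries. The vocabulary size, model dimensions, chunk length, and random data stream below are placeholders.

```python
import torch
import torch.nn as nn

class CharLM(nn.Module):
    def __init__(self, vocab=128, dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, x, state=None):
        h, state = self.lstm(self.emb(x), state)
        return self.out(h), state

model = CharLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

data = torch.randint(0, 128, (1, 10_000))            # one long character stream (placeholder)
bptt = 64                                            # truncation length
state = None
for i in range(0, data.size(1) - 1 - bptt, bptt):
    x = data[:, i:i + bptt]
    y = data[:, i + 1:i + 1 + bptt]
    logits, state = model(x, state)
    # Detach so gradients do not flow past the chunk boundary (truncated BPTT),
    # while the hidden state itself still carries context forward.
    state = tuple(s.detach() for s in state)
    loss = loss_fn(logits.reshape(-1, 128), y.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```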
Intent detection and slot filling are two fundamental tasks for building a spoken language understanding (SLU) system. Multiple deep learning-based joint models have demonstrated excellent results on the two tasks. In this paper, we propose a new joint model for the two tasks.
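A generic sketch of the joint setup such models share: a single shared encoder feeds an utterance-level intent head and a token-level slot-tagging head, trained on the sum of the two losses. This illustrates the standard joint formulation, not the specific model proposed in the paper; all dimensions and label counts are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointSLU(nn.Module):
    def __init__(self, vocab=5000, dim=128, num_intents=10, num_slots=30):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.enc = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.intent_head = nn.Linear(2 * dim, num_intents)   # utterance-level classification
        self.slot_head = nn.Linear(2 * dim, num_slots)       # token-level tagging

    def forward(self, tokens):
        h, _ = self.enc(self.emb(tokens))            # (batch, seq_len, 2*dim)
        intent_logits = self.intent_head(h.mean(dim=1))
        slot_logits = self.slot_head(h)
        return intent_logits, slot_logits

model = JointSLU()
tokens = torch.randint(0, 5000, (4, 12))
intent_y = torch.randint(0, 10, (4,))
slot_y = torch.randint(0, 30, (4, 12))
intent_logits, slot_logits = model(tokens)
# Joint training: sum of the intent and slot losses.
loss = F.cross_entropy(intent_logits, intent_y) \
     + F.cross_entropy(slot_logits.reshape(-1, 30), slot_y.reshape(-1))
loss.backward()
```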
Given a set of integers containing no 3-term arithmetic progressions, one constructs a Stanley sequence by choosing integers greedily without forming such a progression. Independent Stanley sequences are a well-structured class of Stanley sequences.
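The greedy construction described here is easy to state in code: starting from a 3-AP-free seed set, repeatedly add the smallest integer that forms no 3-term arithmetic progression with the elements chosen so far. A short sketch (the seed is assumed to be given in ascending order and already 3-AP-free):

```python
def stanley_sequence(seed, length):
    """Greedily extend `seed` into a Stanley sequence of the given length."""
    seq = list(seed)
    elements = set(seq)
    candidate = seq[-1] + 1 if seq else 0
    while len(seq) < length:
        # The candidate c completes a 3-term AP a < b < c exactly when
        # a = 2*b - c is already in the sequence for some earlier b.
        if all(2 * b - candidate not in elements for b in seq):
            seq.append(candidate)
            elements.add(candidate)
        candidate += 1
    return seq

# The classic Stanley sequence from the seed {0, 1}: 0, 1, 3, 4, 9, 10, 12, 13, 27, ...
print(stanley_sequence([0, 1], 12))
```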