A Convolutional Encoder Model for Neural Machine Translation

published by Michael Auli in 2016 in Informatics Engineering and research's language is English Download

Abstract in English

The prevalent approach to neural machine translation relies on bi-directional LSTMs to encode the source sentence. In this paper we present a faster and simpler architecture based on a succession of convolutional layers. This allows to encode the entire source sentence simultaneously compared to recurrent networks for which computation is constrained by temporal dependencies. On WMT16 English-Romanian translation we achieve competitive accuracy to the state-of-the-art and we outperform several recently published results on the WMT15 English-German task. Our models obtain almost the same accuracy as a very deep LSTM setup on WMT14 English-French translation. Our convolutional encoder speeds up CPU decoding by more than two times at the same or higher accuracy as a strong bi-directional LSTM baseline.

Download