
One Model to Pronounce Them All: Multilingual Grapheme-to-Phoneme Conversion With a Transformer Ensemble

Posted by: Kaili Vesik
Publication date: 2020
Research field: Informatics Engineering
Language: English
Authors: Kaili Vesik





The task of grapheme-to-phoneme (G2P) conversion is important for both speech recognition and synthesis. As with other speech and language processing tasks, learning G2P models is challenging when only small amounts of training data are available. We describe a simple approach that exploits model ensembles, based on multilingual Transformers and self-training, to develop a highly effective G2P solution for 15 languages. Our models were developed as part of our participation in SIGMORPHON 2020 Shared Task 1, focused on G2P. Our best models achieve 14.99 word error rate (WER) and 3.30 phoneme error rate (PER), a sizeable improvement over the shared task's competitive baselines.
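
For concreteness, here is a minimal sketch (our own illustration, not the authors' code) of how the two reported metrics are typically computed in G2P evaluation: WER is the percentage of words whose predicted pronunciation is not an exact match, and PER is the phoneme-level edit distance normalized by the reference length.

def edit_distance(a: list, b: list) -> int:
    # Levenshtein distance over phoneme tokens, computed one row at a time.
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]

def wer_per(preds: list, golds: list) -> tuple:
    # WER: % of non-exact matches; PER: summed edit distance / summed length.
    wrong = sum(p != g for p, g in zip(preds, golds))
    dist = sum(edit_distance(p, g) for p, g in zip(preds, golds))
    return 100 * wrong / len(golds), 100 * dist / sum(len(g) for g in golds)

# Toy example: one exact pronunciation, one with a single wrong phoneme.
preds = [["k", "ae", "t"], ["d", "oh", "g", "z"]]
golds = [["k", "ae", "t"], ["d", "o", "g", "z"]]
print(wer_per(preds, golds))  # (50.0, 14.285...)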


Read also

Hao Sun, Xu Tan, Jun-Wei Gan (2019)
Grapheme-to-phoneme (G2P) conversion is an important task in automatic speech recognition and text-to-speech systems. Recently, G2P conversion has been viewed as a sequence-to-sequence task and modeled with RNN- or CNN-based encoder-decoder frameworks. However, previous works do not consider the practical issues of deploying a G2P model in a production system, such as how to leverage additional unlabeled data to boost accuracy and how to reduce model size for online deployment. In this work, we propose token-level ensemble distillation for G2P conversion, which can (1) boost accuracy by distilling knowledge from additional unlabeled data, and (2) reduce model size while maintaining high accuracy, both of which are very practical and helpful in an online production system. We use token-level knowledge distillation, which yields better accuracy than its sequence-level counterpart. Moreover, we adopt the Transformer instead of RNN- or CNN-based models to further boost the accuracy of G2P conversion. Experiments on the publicly available CMUDict dataset and an internal English dataset demonstrate the effectiveness of the proposed method. In particular, our method achieves 19.88% WER on CMUDict, outperforming previous work by more than 4.22% WER and setting new state-of-the-art results.
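
As a hedged illustration of the token-level distillation objective described above (a numpy sketch of ours, not the authors' implementation): the student is trained to match the teacher's per-token output distribution rather than only the one-hot gold label.

import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def token_level_kd_loss(student_logits, teacher_logits, temperature=1.0):
    # Shapes: (seq_len, vocab). Average over tokens of the cross-entropy
    # between teacher and student distributions (softened by temperature).
    t = softmax(teacher_logits / temperature)
    log_s = np.log(softmax(student_logits / temperature) + 1e-12)
    return float(-(t * log_s).sum(axis=-1).mean())

# Toy check: a student that matches the teacher gets a lower loss than a
# random one (cross-entropy is bounded below by the teacher's entropy).
rng = np.random.default_rng(0)
teacher = rng.normal(size=(5, 8))
print(token_level_kd_loss(teacher, teacher))                  # low
print(token_level_kd_loss(rng.normal(size=(5, 8)), teacher))  # higher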
Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and on the precise way quantization is performed. Robust quantization offers an alternative approach with improved tolerance to different classes of data-types and quantization policies. It opens up new exciting applications where the quantization process is not static and can vary to meet different circumstances and implementations. To address this issue, we propose a method that provides intrinsic robustness to the model against a broad range of quantization processes. Our method is motivated by theoretical arguments and enables us to store a single generic model capable of operating at various bit-widths and quantization policies. We validate our method's effectiveness on different ImageNet models.
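
To make the "varying quantization process" concrete, here is a small sketch (ours, not the paper's method) of symmetric uniform quantization at a configurable bit-width; a robustly quantizable model is one whose accuracy degrades gracefully as this bit-width changes.

import numpy as np

def quantize_uniform(w: np.ndarray, bits: int) -> np.ndarray:
    # Symmetric uniform quantizer: scale weights onto a signed integer
    # grid (e.g. [-127, 127] for 8 bits), round, then map back to float.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

w = np.random.default_rng(1).normal(size=1000).astype(np.float32)
for bits in (8, 6, 4, 2):
    err = np.abs(w - quantize_uniform(w, bits)).mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")  # error grows as bits shrink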
Transformers have become the powerhouse of natural language processing and have recently found use in computer vision tasks. Their effective use of attention can be exploited in other contexts as well, and in this paper we propose a transformer-based approach for efficiently solving complex motion planning problems. Traditional neural-network-based motion planning uses convolutional networks to encode the planning space, but these methods are limited to fixed map sizes, which is often unrealistic in the real world. Our approach first identifies regions on the map using transformers, directing attention to map areas likely to include the best path, and then applies local planners to generate the final collision-free path. We validate our method on a variety of randomly generated environments with different map sizes, demonstrating a reduction in planning complexity and achieving accuracy comparable to traditional planners.
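
A toy rendering of this coarse-to-fine pipeline (our sketch; in the paper the region proposals come from a trained transformer, which we replace with a stub): a mask of "attended" cells restricts where a simple local planner may search.

from collections import deque

def attention_mask(grid, start, goal):
    # Stub for the learned region proposal: here it simply admits every free
    # cell. A trained model would return a much smaller promising region,
    # conditioned on start and goal.
    return {(r, c) for r, row in enumerate(grid)
            for c, v in enumerate(row) if v == 0}

def local_plan(grid, start, goal):
    # BFS restricted to the attended region: a stand-in local planner.
    region = attention_mask(grid, start, goal)
    parent, frontier = {start: None}, deque([start])
    while frontier:
        r, c = frontier.popleft()
        if (r, c) == goal:
            node, path = goal, []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nxt in region and nxt not in parent:
                parent[nxt] = (r, c)
                frontier.append(nxt)
    return None  # no collision-free path inside the attended region

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]  # 1 = obstacle
print(local_plan(grid, (0, 0), (2, 0)))
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]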
The analytical solution of the three-dimensional linear pendulum in a rotating frame of reference is obtained, including Coriolis and centrifugal accelerations, and expressed in terms of initial conditions. This result offers the possibility of treating Foucault and Bravais pendula as trajectories of the very same system of equations, each of them with particular initial conditions. We compare with the common two-dimensional approximations in textbooks. A previously unnoticed pattern in the three-dimensional Foucault pendulum attractor is presented.
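
For reference, the linearized equation of motion the abstract alludes to can be written as follows (the standard form in our notation; the paper's exact conventions may differ):

\ddot{\mathbf{r}} = -\omega_0^2\,\mathbf{r} - 2\,\boldsymbol{\Omega}\times\dot{\mathbf{r}} - \boldsymbol{\Omega}\times(\boldsymbol{\Omega}\times\mathbf{r})

Here \omega_0 is the pendulum's natural frequency, \boldsymbol{\Omega} is the angular velocity of the rotating frame, and the last two terms are the Coriolis and centrifugal accelerations; the Foucault and Bravais pendula then differ only in the initial conditions \mathbf{r}(0) and \dot{\mathbf{r}}(0).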
One Monad to Prove Them All is a modern fairy tale about curiosity and perseverance, two important properties of a successful PhD student. We follow the PhD student Mona on her adventure of proving properties about Haskell programs in the proof assistant Coq. On the one hand, as a PhD student in computer science Mona observes an increasing demand for correct software products. In particular, because of the large amount of existing software, verifying existing software products becomes more important. Verifying programs in the functional programming language Haskell is no exception. On the other hand, Mona is delighted to see that communities in the area of theorem proving are becoming more popular. Thus, Mona sets out to learn more about the interactive theorem prover Coq and verifying Haskell programs in Coq. To prove properties about a Haskell function in Coq, Mona has to translate the function into Coq code. As Coq programs have to be total and Haskell programs are often not, Mona has to model partiality explicitly in Coq. In her quest for a solution Mona finds an ancient manuscript that explains how properties about Haskell functions can be proven in the proof assistant Agda by translating Haskell programs into monadic Agda programs. By instantiating the monadic program with a concrete monad instance, the proof can be performed in either a total or a partial setting. Mona discovers that the proposed transformation does not work in Coq due to a restriction in the termination checker. In fact, the transformation no longer works in Agda either, as the termination checker in Agda has been improved. We follow Mona on an educational journey through the land of functional programming where she learns about concepts like free monads and containers as well as the basics and restrictions of proof assistants like Coq. These concepts are well-known individually, but their interplay gives rise to a solution for Mona's problem, based on the originally proposed monadic transformation, that has not been presented before. When Mona starts to test her approach by proving a statement about simple Haskell functions, she realizes that her approach has an additional advantage over the original idea in Agda. Mona's final solution not only works for a specific monad instance but even allows her to prove monad-generic properties. Instead of proving properties over and over again for specific monad instances, she is able to prove properties that hold for all monads representable by a container-based instance of the free monad. In order to strengthen her confidence in the practicability of her approach, Mona evaluates her approach in a case study that compares two implementations of queues. In order to share the results with other functional programmers, the fairy tale is available as a literate Coq file. If you are a citizen of the land of functional programming or are at least familiar with its customs, have had a journey that involved reasoning about functional programs of your own, or are just a curious soul looking for the next story about monads and proofs, then this tale is for you.
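
The tale's central move, translating a partial Haskell function into a monadic program so that definedness becomes explicit, can be miniaturized outside Coq. The following Python toy is entirely ours (the paper's actual construction uses free monads over containers in Coq) and only mimics that basic idea: the same function can be read in a total setting (only Pure results) or a partial one (Undefined allowed).

from dataclasses import dataclass

@dataclass
class Pure:
    value: object  # a defined result

@dataclass
class Undefined:
    pass  # the single effect of the partiality monad

def bind(m, f):
    # Monadic bind: apply f to a defined value, propagate undefinedness.
    return f(m.value) if isinstance(m, Pure) else m

def head(xs):
    # Haskell's partial `head`, translated to return an explicit effect.
    return Pure(xs[0]) if xs else Undefined()

print(bind(head([1, 2, 3]), lambda x: Pure(x + 1)))  # Pure(value=2)
print(bind(head([]), lambda x: Pure(x + 1)))         # Undefined()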