بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Fast End-to-End Wikification

283 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Yosi Mass

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ilya Shnayderman - Liat Ein-Dor - Yosi Mass

الحساب واللغة

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Wikification of large corpora is beneficial for various NLP applications. Existing methods focus on quality performance rather than run-time, and are therefore non-feasible for large data. Here, we introduce RedW, a run-time oriented Wikification solution, based on Wikipedia redirects, that can Wikify massive corpora with competitive performance. We further propose an efficient method for estimating RedW confidence, opening the door for applying more demanding methods only on top of RedW lower-confidence results. Our experimental results support the validity of the proposed approach.

قيم البحث

190 - Yiming Wang , Tongfei Chen , Hainan Xu 2019

We present Espresso, an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning library PyTorch and the popular neural machine translation toolkit fairseq. Espresso supports distributed training across GPUs and computing nodes, and features various decoding approaches commonly employed in ASR, including look-ahead word-based language model fusion, for which a fast, parallelized decoder is implemented. Espresso achieves state-of-the-art ASR performance on the WSJ, LibriSpeech, and Switchboard data sets among other end-to-end systems without data augmentation, and is 4--11x faster for decoding than similar systems (e.g. ESPnet).

الحساب واللغة أنظمة الصوت في الحاسوب معالجة الصوت والكلام

End-to-End Spoken Language Translation

131 - Michelle Guo , Albert Haque , Prateek Verma 2019

In this paper, we address the task of spoken language understanding. We present a method for translating spoken sentences from one language into spoken sentences in another language. Given spectrogram-spectrogram pairs, our model can be trained compl etely from scratch to translate unseen sentences. Our method consists of a pyramidal-bidirectional recurrent network combined with a convolutional network to output sentence-level spectrograms in the target language. Empirically, our model achieves competitive performance with state-of-the-art methods on multiple languages and can generalize to unseen speakers.

الحساب واللغة أنظمة الصوت في الحاسوب معالجة الصوت والكلام

End-to-end Anchored Speech Recognition

99 - Yiming Wang , Xing Fan , I-Fan Chen 2019

Voice-controlled house-hold devices, like Amazon Echo or Google Home, face the problem of performing speech recognition of device-directed speech in the presence of interfering background speech, i.e., background noise and interfering speech from ano ther person or media device in proximity need to be ignored. We propose two end-to-end models to tackle this problem with information extracted from the anchored segment. The anchored segment refers to the wake-up word part of an audio stream, which contains valuable speaker information that can be used to suppress interfering speech and background noise. The first method is called Multi-source Attention where the attention mechanism takes both the speaker information and decoder state into consideration. The second method directly learns a frame-level mask on top of the encoder output. We also explore a multi-task learning setup where we use the ground truth of the mask to guide the learner. Given that audio data with interfering speech is rare in our training data set, we also propose a way to synthesize noisy speech from clean speech to mitigate the mismatch between training and test data. Our proposed methods show up to 15% relative reduction in WER for Amazon Alexa live data with interfering background speech without significantly degrading on clean speech.

الحساب واللغة التعلم الآلي أنظمة الصوت في الحاسوب

Multi-Speaker End-to-End Speech Synthesis

138 - Jihyun Park , Kexin Zhao , Kainan Peng 2019

In this work, we extend ClariNet (Ping et al., 2019), a fully end-to-end speech synthesis model (i.e., text-to-wave), to generate high-fidelity speech from multiple speakers. To model the unique characteristic of different voices, low dimensional tra inable speaker embeddings are shared across each component of ClariNet and trained together with the rest of the model. We demonstrate that the multi-speaker ClariNet outperforms state-of-the-art systems in terms of naturalness, because the whole model is jointly optimized in an end-to-end manner.

الحساب واللغة التعلم الآلي أنظمة الصوت في الحاسوب

Dial2Desc: End-to-end Dialogue Description Generation

141 - Haojie Pan , Junpei Zhou , Zhou Zhao 2018

We first propose a new task named Dialogue Description (Dial2Desc). Unlike other existing dialogue summarization tasks such as meeting summarization, we do not maintain the natural flow of a conversation but describe an object or an action of what pe ople are talking about. The Dial2Desc system takes a dialogue text as input, then outputs a concise description of the object or the action involved in this conversation. After reading this short description, one can quickly extract the main topic of a conversation and build a clear picture in his mind, without reading or listening to the whole conversation. Based on the existing dialogue dataset, we build a new dataset, which has more than one hundred thousand dialogue-description pairs. As a step forward, we demonstrate that one can get more accurate and descriptive results using a new neural attentive model that exploits the interaction between utterances from different speakers, compared with other baselines.

الحساب واللغة

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة إيبلا الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Fast End-to-End Wikification

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً