Automatic Speech Recognition Algorithms

خوارزميات تعرّف على الكلام آلياً

Added by Higher Institute for Applied Sciences and Technology. Master's thesis

Publication date 2017

Fields: Informatics Engineering

Research language: Arabic

Authors: وائل الرزوق (researcher)

Created by Shamra Editor

Tags: NLP, speech recognition, linguistics, ASR, Automatic Speech Recognition

Abstract

In general, the aim of an automatic speech recognition system is to transcribe what is said. State-of-the-art continuous speech recognition systems consist of four basic modules: signal processing, acoustic modeling, language modeling, and the search engine. Isolated word recognition systems, by contrast, do not contain a language model, which is the module responsible for connecting words together to form understandable sentences.
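The division of labor between these modules can be summarized with the standard Bayes decision rule for speech recognition. It is given here as textbook background, not as a formula quoted from the thesis; the symbols are assumptions of this note: X is the sequence of feature vectors produced by signal processing, W is a candidate word sequence, and w is a single vocabulary word.

```latex
% Continuous speech: the search engine maximizes over word sequences W,
% combining the acoustic model p(X | W) with the language model P(W).
\hat{W} = \arg\max_{W} P(W \mid X) = \arg\max_{W} \, p(X \mid W) \, P(W)

% Isolated words without a language model: equivalent to a uniform prior
% over the vocabulary, so only the acoustic score decides.
\hat{w} = \arg\max_{w} \, p(X \mid w)
```

Read this way, dropping the language model in an isolated word system amounts to treating every vocabulary word as equally likely before the audio is heard.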

Research summary

The thesis studies automatic speech recognition systems, whose aim is to convert spoken speech into written text. Continuous speech recognition systems consist of four basic components: signal processing, acoustic modeling, language modeling, and the search engine, whereas isolated word recognition systems do not include language modeling. In the signal processing part, two feature extraction algorithms were studied, Mel-frequency cepstral coefficients (MFCC) and Gammatone wavelet cepstral coefficients (GWCC), and their performance was tested on the TIDIGITS database. A hidden Markov model (HMM) was used to build the classifier because of its flexibility and ease of modification. A new algorithm, constant-Q cepstral coefficients (CQCC), was proposed and its performance was compared with that of the two previous algorithms. The algorithms were also tested in different noise environments (train, station, restaurant, ...).
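To make the feature extraction step concrete, the following is a minimal sketch of a conventional MFCC front end using only NumPy. It is not the thesis implementation: the function names, the 8 kHz sample rate, the 25 ms frame and 10 ms hop, the 20 mel filters, the 0.97 pre-emphasis factor, and the 13 output coefficients are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose centers are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def dct2(x, n_out):
    # Unnormalized DCT-II along the last axis, keeping the first n_out terms.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=8000, frame_len=200, hop=80,
         n_fft=256, n_filters=20, n_ceps=13):
    # All default values are illustrative assumptions, not thesis settings.
    # The signal must contain at least frame_len samples.
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies before analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum, mel filterbank energies, log compression, then DCT.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return dct2(np.log(energies + 1e-10), n_ceps)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    one_second = rng.standard_normal(8000)   # stand-in for a spoken digit
    print(mfcc(one_second).shape)
```

Each row of the returned matrix is the feature vector for one frame; a sequence of such rows is what an HMM classifier would consume. GWCC and CQCC differ mainly in the time-frequency analysis that replaces the FFT and mel filterbank stages, which this sketch does not attempt to reproduce.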

Critical review

This study is comprehensive and detailed in the field of automatic speech recognition: it examines several algorithms and tests their performance in different environments. Nevertheless, some constructive criticism can be offered. First, it would be preferable to include more databases for testing the algorithms, which would strengthen the reliability of the results. Second, the study could be improved with a deeper analysis of why some algorithms outperform others in particular noise environments. Finally, the study would be more complete if it included practical applications of speech recognition systems in everyday life, such as their use in smart devices or cars.

Questions related to the research

What are the basic components of automatic continuous speech recognition systems?

Automatic continuous speech recognition systems consist of four basic components: signal processing, acoustic modeling, language modeling, and the search engine.

Which feature extraction algorithms were studied in this thesis?

Two feature extraction algorithms were studied: Mel-frequency cepstral coefficients (MFCC) and Gammatone wavelet cepstral coefficients (GWCC).

What new algorithm was proposed in this study?

A new algorithm was proposed: constant-Q cepstral coefficients (CQCC).

How was the performance of the algorithms tested in different noise environments?

The algorithms were tested by adding different types of noise (train, station, restaurant, ...) to the tests.
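Noise tests of this kind are commonly built by mixing a noise recording into clean speech at a chosen signal-to-noise ratio (SNR). The sketch below shows one way to do that with NumPy. The summary does not state how the thesis mixed the noise or which SNR levels it used, so the function name `add_noise_at_snr` and the 10 dB value are purely illustrative assumptions.

```python
import numpy as np

def add_noise_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the requested SNR in dB."""
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Repeat or crop the noise so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[:len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the gain so that speech_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)   # stand-in for speech
    restaurant = rng.standard_normal(3000)                     # stand-in for recorded noise
    noisy = add_noise_at_snr(clean, restaurant, snr_db=10.0)
    achieved = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
    print(round(achieved, 3))
```

Running the same recognizer on the output at several SNR values, once per noise type, gives the kind of robustness comparison the answer above describes.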

Keywords

automatic speech recognition, signal processing, acoustic modeling, language modeling, hidden Markov model, Mel-frequency cepstral coefficients, Gammatone wavelet cepstral coefficients, constant-Q cepstral coefficients


Automatic Prosody Generation for Arabic Text-to-Speech Systems

1655 - Damascus University 2011 research paper

The main purpose of the present research is to provide Arabic text-to-speech synthesizers with natural prosody, based on linguistic analysis of the texts to be synthesized and on automatic prosody generation using rules deduced from the analysis of recorded signals of different types of Arabic sentences. All the types of Arabic sentences (declarative and constructive) were enumerated with the help of an expert in Arabic linguistics. A textual corpus of about 2500 sentences covering most of these types was built and recorded both with natural prosody and without prosody. These sentences were then analyzed to extract the effect of prosody on the signal parameters and to build prosody generation rules. In this paper, we present the results on negation sentences, applied to synthesized speech using the open source tool MBROLA. The results can be used with any parametric Arabic synthesizer. Future work will apply the rules to a new Arabic synthesizer based on semi-syllable units, which is under development at the Higher Institute for Applied Sciences and Technology.

Arabic Text To Speech, Prosodic Parameters, Automatic Prosody Generation Rules, Linguistic Analysis, Text Corpus, Speech Corpus, Speech Signal Analysis

Improvement of Speech Recognition by Merging Two Features Extraction Algorithms

2231 - Tishreen University 2017 research paper

Speech recognition is one of the most modern technologies and has come into use in many fields of life, whether medical, security, or industrial. Accordingly, many related systems have been developed, which differ from each other in their feature extraction and classification methods. In this research, three speech recognition systems were created. They differ in the method used during the feature extraction stage: the first system used the MFCC algorithm, the second used the LPCC algorithm, and the third used the PLP algorithm. All three systems used an HMM as the classifier. First, the performance of the speech recognition process was studied and evaluated for each proposed system separately. After that, the combination algorithm was applied to each pair of the studied systems in order to study its effect on improving the speech recognition process. Two kinds of errors (simultaneous errors and dependent errors) were used to evaluate how complementary each pair of systems is and to study the effectiveness of the combination in improving recognition performance. The comparison shows that the best improvement was obtained when combining the MFCC and PLP algorithms, with a recognition rate of 93.4%.

Feature extraction, hidden Markov models, speech recognition

Automatic Speech-Based Checklist for Medical Simulations

479 - Association for Computational Linguistics 2021 article

Medical simulators provide a controlled environment for training and assessing clinical skills. However, as an assessment platform, a simulator requires the presence of an experienced examiner to provide performance feedback, commonly performed using a task-specific checklist. This makes the assessment process inefficient and expensive. Furthermore, this evaluation method does not provide medical practitioners the opportunity for independent training. Ideally, the process of filling in the checklist should be done by a fully aware, objective system capable of recognizing and monitoring clinical performance. To this end, we have developed an autonomous and fully automatic speech-based checklist system, capable of objectively identifying and validating anesthesia residents' actions in a simulation environment. Based on the analyzed results, our system is capable of recognizing most of the tasks in the checklist: an F1 score of 0.77 for all of the tasks, and an F1 score of 0.79 for the verbal tasks. Developing an audio-based system will improve the experience of a wide range of simulation platforms. Furthermore, in the future, this approach may be implemented in the operating room and emergency room. This could facilitate the development of automatic assistive technologies for these domains.

medical simulators provide, automatic speech-based checklist, checklist

Speech Emotion Recognition Based on CNN+LSTM Model

797 - Association for Computational Linguistics 2021 article

Due to the popularity of intelligent dialogue assistant services, speech emotion recognition has become more and more important. In communication between humans and machines, emotion recognition and emotion analysis can enhance the interaction between machines and humans. This study uses the CNN+LSTM model to implement speech emotion recognition (SER) processing and prediction. The experimental results show that the CNN+LSTM model achieves better performance than the traditional NN model.

emotion recognition based, speech emotion recognition, emotion recognition

Sequential Randomized Smoothing for Adversarially Robust Speech Recognition

525 - Association for Computational Linguistics 2021 article

While Automatic Speech Recognition has been shown to be vulnerable to adversarial attacks, defenses against these attacks are still lagging. Existing, naive defenses can be partially broken with an adaptive attack. In classification tasks, the Randomized Smoothing paradigm has been shown to be effective at defending models. However, it is difficult to apply this paradigm to ASR tasks, due to their complexity and the sequential nature of their outputs. Our paper overcomes some of these challenges by leveraging speech-specific tools like enhancement and ROVER voting to design an ASR model that is robust to perturbations. We apply adaptive versions of state-of-the-art attacks, such as the Imperceptible ASR attack, to our model, and show that our strongest defense is robust to all attacks that use inaudible noise, and can only be broken with very high distortion.

pre-trained improvement, adversarially robust speech, robust speech recognition
