Research papers and master's and doctoral theses about MFCC

Analysis Study of Formant Frequencies Changes According to the Speaker's Vocal Tract Shape

1435 - Tishreen University 2017 Research paper

This research studies several properties of the audio signal in relation to the shape of the speaker's vocal tract. A database of audio files was recorded from 57 men aged between 35 and 45. All speakers came from the same academic and social background, and none had hearing or speech problems. The vowel database was recorded under ideal recording conditions. Recording took about five minutes per speaker, each of whom said the Arabic word " سألتمُونِيهَا " three times. This word is rich in vowels and contains all of the Arabic long vowels. Based on an analysis of the recorded signals, the relationship between the formant frequencies and the length of the speaker's vocal tract was studied. The results show an inverse proportionality for the first three formant frequencies (F1, F2, F3) and no clear relationship for the other two (F4, F5).
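The inverse relationship reported above is consistent with a textbook idealization, sketched here in Python. The sketch treats the vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1)c / (4L). This model is not taken from the paper; the function name, the default speed of sound, and the example lengths are illustrative assumptions.

```python
def tube_formants(length_m, n_formants=3, speed_of_sound=343.0):
    """Resonances (Hz) of an idealized uniform tube, closed at one end
    and open at the other: F_n = (2n - 1) * c / (4 * L).

    Assumption: this quarter-wavelength model only approximates a real
    vocal tract; it is not the method used in the paper.
    """
    if length_m <= 0:
        raise ValueError("length_m must be positive")
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, n_formants + 1)]


# Hypothetical tract lengths in metres: the longer tube has lower resonances.
shorter_tract = tube_formants(0.15)
longer_tract = tube_formants(0.18)
```

In this idealization every resonance scales with 1/L, so it cannot account for the lack of a clear relationship that the abstract reports for F4 and F5.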

Vowel database, speaker, formant frequencies, vocal tract

Improving the extraction of audio features In audio-visual Arabic systems

1924 - Al-Baath University 2017 Research paper

Audio-visual speech recognition systems, which rely on both the speech and the lip movements of the speaker, are among the most important speech recognition systems. Many techniques have been developed that differ in their feature extraction and classification methods. This research proposes a system that identifies isolated words from audio features extracted from videos of Arabic words pronounced in a noise-free environment, and then adds the energy and temporal derivative components to the feature extraction stage of the Mel Frequency Cepstral Coefficient (MFCC) method.
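As a rough illustration of what adding energy and temporal derivative components can look like, the following Python sketch appends a log-energy term to each MFCC vector and then appends first-order deltas computed with the commonly used regression formula. The abstract does not give the paper's exact formulation, so the function names, the window size, and the edge handling are assumptions.

```python
import math


def log_energy(frame):
    """Log of the frame's summed squared samples (floored to avoid log(0))."""
    return math.log(max(sum(x * x for x in frame), 1e-10))


def deltas(features, window=2):
    """First-order temporal derivatives by the usual regression formula:
    d[t] = sum_{n=1..N} n * (c[t+n] - c[t-n]) / (2 * sum_{n=1..N} n^2).
    Assumption: the first and last vectors are repeated at the edges.
    """
    if not features:
        return []
    padded = [features[0]] * window + list(features) + [features[-1]] * window
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(window, window + len(features)):
        row = []
        for k in range(len(features[0])):
            num = sum(n * (padded[t + n][k] - padded[t - n][k])
                      for n in range(1, window + 1))
            row.append(num / denom)
        out.append(row)
    return out


def add_energy_and_deltas(mfcc, frames, window=2):
    """Append log energy to each MFCC vector, then append the deltas of the
    extended vectors. `mfcc` and `frames` are parallel per-frame lists."""
    static = [list(c) + [log_energy(f)] for c, f in zip(mfcc, frames)]
    return [s + d for s, d in zip(static, deltas(static, window))]
```

With 12 static coefficients per frame, this layout would give 13 values after adding energy and 26 after adding the deltas; the dimensions used in the paper are not stated in the abstract.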

Feature extraction, MFCC, hidden Markov models, speech recognition, temporal derivatives, energy component

Improvement of Speech Recognition by Merging Two Features Extraction Algorithms

2674 - Tishreen University 2017 Research paper

Speech recognition is one of the most modern technologies and has entered many areas of life, including medical, security, and industrial applications. Accordingly, many systems have been developed that differ from each other in their feature extraction and classification methods. In this research, three speech recognition systems were created that differ in the method used at the feature extraction stage: the first uses the MFCC algorithm, the second the LPCC algorithm, and the third the PLP algorithm. All three systems use an HMM as the classifier. First, the speech recognition performance of each proposed system was studied and evaluated separately. Then the combination algorithm was applied to each pair of systems in order to study its effect on recognition performance. Two kinds of errors (simultaneous errors and dependent errors) were used to evaluate the complementarity of each pair of systems and the effectiveness of the combination in improving recognition performance. The comparison shows that the best improvement was obtained by combining the MFCC and PLP algorithms, with a recognition rate of 93.4%.
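The abstract does not define the two error types, so the Python sketch below adopts a common reading as an assumption: a simultaneous error is an utterance that both systems get wrong, and a dependent error is a simultaneous error where both systems output the same wrong answer. The function name and the sample labels are hypothetical.

```python
def error_overlap(reference, hyp_a, hyp_b):
    """Count (simultaneous, dependent) errors for two recognizers.

    Assumed definitions (not confirmed by the abstract):
      simultaneous: both hypotheses differ from the reference;
      dependent:    both differ from the reference and agree with each other.
    """
    if not (len(reference) == len(hyp_a) == len(hyp_b)):
        raise ValueError("all three sequences must have the same length")
    simultaneous = 0
    dependent = 0
    for ref, a, b in zip(reference, hyp_a, hyp_b):
        if a != ref and b != ref:
            simultaneous += 1
            if a == b:
                dependent += 1
    return simultaneous, dependent


# Hypothetical word labels for four test utterances.
reference = ["one", "two", "three", "four"]
system_a = ["one", "ten", "tree", "four"]
system_b = ["one", "ten", "free", "for"]
counts = error_overlap(reference, system_a, system_b)
```

Under these definitions, fewer simultaneous errors mean the two systems are more complementary, which leaves a combination more room to correct one system with the other.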

Feature extraction, hidden Markov models, speech recognition

Smoker Distinction Based on Analysis of Created Vowel Triangles

1408 - Tishreen University 2016 Research paper

This research proposes a new comparison criterion for studying the properties of the audio signal of smokers and non-smokers. For this purpose, a smoker database was created. It contains 12 native Syrian speakers, six smokers and six non-smokers. The smokers had been smoking for more than 10 years. All speakers were men aged between 35 and 42. They live in rural towns and speak the same dialect. Syrian vowels can be classified into long vowels and short ones. The long vowels are /AA/, /UU/, /II/ pronounced as ([ ي, و, ا ]) and the short vowels are /A/, /U/, /I/ pronounced as ([ كسرة, ضمة, فتحة ]). In this study, the speakers pronounced the sentence /I love Syria/, pronounced as ([ أَنَاْ أَحَبُّ سُوْرِيْة ]), which is rich in vowels; the recordings were made over three hours. For each speaker, a long vowel triangle and a short vowel triangle were each generated and analyzed in ten planes. A new criterion was suggested to determine the most suitable vowel triangle for distinguishing smokers. It is based on calculating the distances among all centers of the vowel triangles in each plane and determining the minimal distance, called d. For each plane, the most suitable vowel triangles were found to be the AIU35 short vowel triangle and the AAIIUU45 long vowel triangle.
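One possible reading of the distance criterion is sketched below in Python. It assumes that the "center" of a vowel triangle is its centroid and that the triangles being compared lie in the same two-dimensional plane; the abstract confirms neither, and the function names and coordinates are hypothetical.

```python
import math
from itertools import combinations


def centroid(triangle):
    """Centroid of a triangle given as three (x, y) points.
    Assumption: the paper's triangle 'center' is the centroid."""
    xs, ys = zip(*triangle)
    return (sum(xs) / 3.0, sum(ys) / 3.0)


def minimal_center_distance(triangles):
    """Smallest Euclidean distance d between the centers of any two
    triangles in the same plane."""
    if len(triangles) < 2:
        raise ValueError("need at least two triangles to compare")
    centers = [centroid(t) for t in triangles]
    return min(math.dist(a, b) for a, b in combinations(centers, 2))


# Hypothetical triangles, one per speaker, in an arbitrary feature plane.
speaker_triangles = [
    [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)],
    [(3.0, 3.0), (6.0, 3.0), (3.0, 6.0)],
    [(0.0, 0.0), (0.0, 6.0), (6.0, 0.0)],
]
d = minimal_center_distance(speaker_triangles)
```

Repeating this calculation for each of the ten planes would give one value of d per plane, which could then be compared across planes.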

Vowel database, MFCC algorithm, vowel triangles, smoker

Isolated Word Recognition

2760 - Tishreen University 2016 Graduation project

The aim of this research is to build a system that classifies spoken English digits, using hidden Markov models for classification and the signal spectrum for extracting the signal features.

MFCC, hidden Markov model, vector quantization, Mel frequency cepstral coefficients

Analysis study about (MFCC and Endpoint) algorithms and the extent of their impact in voice recognition rates

4531 - Tishreen University 2016 Research paper

Voice recognition comprises two basic parts: speech recognition and speaker recognition. These are considered among the most important processes in modern technology, and many systems have been developed that differ in their feature extraction and classification methods. This research addresses that subject: a system was designed to recognize the speaker and his voice commands, using several complementary algorithms. We conducted an analytical study of the MFCC algorithm used for feature extraction, examining two parameters, the number of filters in the filter bank and the number of features taken from each frame, along with their impact on the recognition rate and their relationship to each other. A feed-forward back-propagation neural network was used, and its performance was analyzed to arrive at the features and components that give the best recognition. We also studied the Endpoint algorithm, which is used to remove periods of silence, and its impact on voice recognition rates.
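To make the filter-count parameter concrete, the Python sketch below places the center frequencies of a mel filter bank for a given number of filters, using one common form of the mel scale. The abstract does not state which mel formula, frequency range, or filter shape the study used, so those choices and the function names are assumptions.

```python
import math


def hz_to_mel(f_hz):
    """One common mel-scale formula (an assumption; other variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def filter_centers(num_filters, f_low=0.0, f_high=4000.0):
    """Center frequencies (Hz) of `num_filters` filters spaced evenly on
    the mel scale between f_low and f_high (both assumed defaults)."""
    if num_filters < 1:
        raise ValueError("num_filters must be at least 1")
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (num_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(1, num_filters + 1)]


# Hypothetical setting: 20 filters over 0 to 4000 Hz.
centers = filter_centers(20)
```

The two parameters are linked because the cepstral coefficients are obtained by a discrete cosine transform of the log filter-bank energies, so a bank of M filters yields at most M coefficients per frame.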

Neural network, speaker, speech, feature

Generation and Analysis of Vowel Polygons for Syrian Dialects Using a Created Speech Database

1733 - Tishreen University 2015 Research paper

Speech databases are the main foundation for building automatic utterance, speaker recognition, and speech recognition systems in different languages and dialects. The elements of a speech database are audio files of people's voices recorded in the required language or dialect. The more comprehensive the elements of the speech database, the more it contributes to systems that perform well in communicating with machines. Because of the lack of speech databases for the Syrian dialects, this research created one. The database contains sixteen volunteers speaking different Syrian dialects. Their voices were recorded under different recording conditions in order to study the effect of dialect, gender, and recording conditions on the vowel polygons. The research used the created database to generate and analyze vowel polygons. A vowel polygon is a geometric polygon whose vertices represent formant frequency values and whose area represents the resulting acoustic space.
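The area of such a polygon can be computed with the shoelace formula, as in the Python sketch below. It assumes that each vertex is an (F1, F2) pair in Hz and that the vertices are listed in order around the polygon; the abstract does not specify which formants form the axes, and the sample values are hypothetical.

```python
def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula.

    `vertices` is a list of (x, y) points given in order around the
    polygon (clockwise or counter-clockwise).
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical (F1, F2) values in Hz for three vowels of one speaker.
vowel_polygon = [(300.0, 2300.0), (700.0, 1200.0), (350.0, 800.0)]
acoustic_space = polygon_area(vowel_polygon)  # in Hz^2
```

Comparing these areas across speakers or recording conditions is one way the effect of dialect, gender, and recording conditions could be quantified.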

Vowel database, vowel polygons, acoustic space, MFCC algorithm
