Speech databases are the main foundation for building automatic utterance recognition,
speaker recognition, and speech recognition systems for different languages and
dialects. The elements of a speech database are audio files of people's voices recorded in
the required language or dialect. The richer and more comprehensive the speech database,
the more it contributes to producing high-performance systems for communicating with
machines. Because speech databases for the Syrian dialects are lacking, one was created in
this research. The created database contained sixteen volunteers speaking different Syrian
dialects. The volunteers' voices were recorded under different recording conditions in
order to study the effect of dialect, gender, and recording conditions on the vowel
polygons. This research used the created speech database to generate and analyze vowel
polygons. A vowel polygon is a geometric polygon whose vertices represent the values of
formant frequencies and whose area represents the output acoustic space.
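As an illustration of how the area of a vowel polygon can be obtained from its vertices, the following is a minimal sketch using the shoelace formula. It assumes that each vertex is a pair of formant values in Hz and that the vertices are listed in order around the polygon. The function name `polygon_area` and the formant values are hypothetical and are not taken from the recorded database.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # one vertex: a pair of formant values in Hz (assumed)


def polygon_area(vertices: Sequence[Point]) -> float:
    """Return the area of a simple polygon using the shoelace formula.

    The vertices must be given in order around the polygon
    (clockwise or counter-clockwise).
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical (F1, F2) values for the three vowels of a vowel triangle.
vowel_triangle = [(700.0, 1300.0), (300.0, 2300.0), (320.0, 800.0)]
area_hz2 = polygon_area(vowel_triangle)  # area in Hz^2
```

A larger area under this measure corresponds to a larger acoustic space in the chosen formant plane.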
In this research, a new comparison criterion was proposed to study the properties of the
audio signal for smokers and non-smokers. For this purpose, a smoker database was
created. The smoker database contains 12 native Syrian speakers, six of whom were smokers
and six of whom were non-smokers. The smokers had been smoking for more than 10 years.
All speakers were men aged between 35 and 42 years. They live in rural towns and speak
the same dialect.
Syrian vowels can be classified into long vowels and short vowels. The long vowels are
/AA/, /UU/, /II/, written as ([ ي, و, ا ]), and the short vowels are /A/, /U/, /I/, written
as ([ كسرة, ضمة, فتحة ]). In this study, the speakers pronounced the sentence
/I love Syria/, written as ([ أَنَاْ أَحَبُّ سُوْرِيْة ]), and it was spoken over a period of three hours.
This sentence is rich in vowels.
For each speaker, a long vowel triangle and a short vowel triangle were each generated and
analyzed in ten planes. A new criterion was suggested to determine the most suitable vowel
triangle for distinguishing smokers. This criterion depends on calculating the distances
among all centers of the vowel triangles in each plane and determining the minimal
distance, called d. The most suitable vowel triangles were found to be the AIU35 short
vowel triangle and the AAIIUU45 long vowel triangle.
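The distance computation behind this criterion can be sketched as follows. This is only an illustrative reading of the description above: it assumes that the center of a vowel triangle is its centroid, that distances are Euclidean, and that each plane is a pair of formants (for example, reading the label 35 as F3 versus F5), none of which is stated explicitly in the text. The function names and the numeric values are hypothetical, and the sketch stops at computing d because the text does not say how d is then used to rank the planes.

```python
from itertools import combinations
from math import dist
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # one vertex in a formant plane, in Hz (assumed)


def centroid(triangle: Sequence[Point]) -> Point:
    """Return the centroid (mean of the vertices) of a vowel triangle."""
    xs = [p[0] for p in triangle]
    ys = [p[1] for p in triangle]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def minimal_center_distance(triangles: Sequence[Sequence[Point]]) -> float:
    """Return d: the smallest Euclidean distance between any two triangle centers."""
    centers = [centroid(t) for t in triangles]
    if len(centers) < 2:
        raise ValueError("at least two triangles are needed to compute a distance")
    return min(dist(a, b) for a, b in combinations(centers, 2))


# Hypothetical data: one list of speaker triangles per plane.
planes: Dict[str, List[List[Point]]] = {
    "plane_35": [
        [(2500.0, 4200.0), (2900.0, 4500.0), (2400.0, 4000.0)],
        [(2600.0, 4300.0), (3000.0, 4700.0), (2500.0, 4100.0)],
        [(2450.0, 4150.0), (2850.0, 4400.0), (2350.0, 3950.0)],
    ],
}

# Minimal distance d for each plane.
d_per_plane = {name: minimal_center_distance(ts) for name, ts in planes.items()}
```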