Improving the extraction of audio features in audio-visual Arabic systems


Abstract in English

Audio-visual speech recognition systems, which rely on both the speech and the lip movements of the speaker, are among the most important speech recognition systems. Many different techniques have been developed, differing in the methods they use for feature extraction and classification. This research proposes a system for recognizing isolated words based on audio features extracted from videos of Arabic words pronounced in a noise-free environment, and then adds energy and temporal derivative components at the Mel Frequency Cepstral Coefficient (MFCC) feature extraction stage.
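To illustrate what adding energy and temporal derivative components to MFCC features can look like, here is a minimal NumPy sketch. It is not the system described in the abstract: the function names, the 400-sample frame with a 160-sample hop, the regression window of 2 frames, and the standard regression formula for the derivative are all assumptions made for illustration, and the MFCC matrix itself is assumed to have been computed elsewhere.

```python
import numpy as np

def frame_log_energy(signal, frame_len=400, hop=160):
    """Log energy per frame. Frame and hop sizes are assumed values
    (25 ms / 10 ms at a 16 kHz sampling rate), not taken from the abstract."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    # Small constant avoids log(0) on silent frames.
    return np.log(np.sum(frames ** 2, axis=1) + 1e-10)

def delta(features, width=2):
    """Temporal derivative along the frame axis (axis 0), using the common
    regression formula: d_t = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2)."""
    n_frames = features.shape[0]
    # Repeat the first and last frames so edge frames have neighbours.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]
        behind = padded[width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom

def augment(mfcc, log_energy):
    """Append log energy to each MFCC frame, then append the derivatives
    of the combined static features."""
    static = np.column_stack([mfcc, log_energy])
    return np.hstack([static, delta(static)])

# Usage with placeholder data: 1 s of noise and a random stand-in
# for a (frames x 12) MFCC matrix.
rng = np.random.default_rng(0)
audio = rng.standard_normal(16000)
energy = frame_log_energy(audio)
mfcc = rng.standard_normal((len(energy), 12))
features = augment(mfcc, energy)
print(features.shape)
```

With 12 cepstral coefficients per frame, each frame's vector grows to 13 static values (coefficients plus log energy) and 13 derivative values. The derivative is what carries information about how the spectrum changes between neighbouring frames.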
