A ResNet-50-Based Convolutional Neural Network Model for Language ID Identification from Speech Recordings

published by Association for Computation Linguistics in 2021 in Artificial Intelligence and research's language is English Download

Abstract in English

This paper describes the model built for the SIGTYP 2021 Shared Task aimed at identifying 18 typologically different languages from speech recordings. Mel-frequency cepstral coefficients derived from audio files are transformed into spectrograms, which are then fed into a ResNet-50-based CNN architecture. The final model achieved validation and test accuracies of 0.73 and 0.53, respectively.

References used

https://aclanthology.org/

Download