ﻻ يوجد ملخص باللغة العربية
The present paper introduces a deep neural network (DNN) for predicting the instantaneous loudness of a sound from its time waveform. The DNN was trained using the output of a more complex model, called the Cambridge loudness model. While a modern PC can perform a few hundred loudness computations per second using the Cambridge loudness model, it can perform more than 100,000 per second using the DNN, allowing real-time calculation of loudness. The root-mean-square deviation between the predictions of instantaneous loudness level using the two models was less than 0.5 phon for unseen types of sound. We think that the general approach of simulating a complex perceptual model by a much faster DNN can be applied to other perceptual models to make them run in real time.
Audio classification is considered as a challenging problem in pattern recognition. Recently, many algorithms have been proposed using deep neural networks. In this paper, we introduce a new attention-based neural network architecture called Classifi
In this work, we propose an overlapped speech detection system trained as a three-class classifier. Unlike conventional systems that perform binary classification as to whether or not a frame contains overlapped speech, the proposed approach classifi
This paper introduces StutterNet, a novel deep learning based stuttering detection capable of detecting and identifying various types of disfluencies. Most of the existing work in this domain uses automatic speech recognition (ASR) combined with lang
We propose a method for the blind separation of sounds of musical instruments in audio signals. We describe the individual tones via a parametric model, training a dictionary to capture the relative amplitudes of the harmonics. The model parameters a
This paper introduces scattering transform for speech emotion recognition (SER). Scattering transform generates feature representations which remain stable to deformations and shifting in time and frequency without much loss of information. In speech