ﻻ يوجد ملخص باللغة العربية
In this paper, we propose a deep learning (DL)-based parameter enhancement method for a mixed excitation linear prediction (MELP) speech codec in noisy communication environment. Unlike conventional speech enhancement modules that are designed to obtain clean speech signal by removing noise components before speech codec processing, the proposed method directly enhances codec parameters on either the encoder or decoder side. As the proposed method has been implemented by a small network without any additional processes required in conventional enhancement systems, e.g., time-frequency (T-F) analysis/synthesis modules, its computational complexity is very low. By enhancing the noise-corrupted codec parameters with the proposed DL framework, we achieved an enhancement system that is much simpler and faster than conventional T-F mask-based speech enhancement methods, while the quality of its performance remains similar.
Previous studies have proven that integrating video signals, as a complementary modality, can facilitate improved performance for speech enhancement (SE). However, video clips usually contain large amounts of data and pose a high cost in terms of com
This paper proposes a full-band and sub-band fusion model, named as FullSubNet, for single-channel real-time speech enhancement. Full-band and sub-band refer to the models that input full-band and sub-band noisy spectral feature, output full-band and
Conventional deep neural network (DNN)-based speech enhancement (SE) approaches aim to minimize the mean square error (MSE) between enhanced speech and clean reference. The MSE-optimized model may not directly improve the performance of an automatic
Supervised learning for single-channel speech enhancement requires carefully labeled training examples where the noisy mixture is input into the network and the network is trained to produce an output close to the ideal target. To relax the condition
Statistical signal processing based speech enhancement methods adopt expert knowledge to design the statistical models and linear filters, which is complementary to the deep neural network (DNN) based methods which are data-driven. In this paper, by