Time-frequency representations such as the spectrogram are commonly used to analyze signals whose distribution of spectral energy varies over time, but the spectrogram is constrained by an unfortunate tradeoff between resolution in time and resolution in frequency. A method of achieving high-resolution spectral representations has been introduced independently by several parties. The technique has been variously named reassignment and remapping, and although the implementations differ in their details, they all rest on the same theoretical and mathematical foundation. In this work, we give a brief history of the method, which we will call time-frequency reassignment, and present a unified mathematical description of the technique and its derivation. We focus on the development of time-frequency reassignment in the context of the spectrogram, and conclude with a discussion of some current applications of the reassigned spectrogram.
Recent works have shown that Deep Recurrent Neural Networks using the LSTM architecture can achieve strong single-channel speech enhancement by estimating time-frequency masks. However, these models do not naturally generalize to multi-channel inputs
We propose a multi-channel speech enhancement approach with a novel two-stage feature fusion method and a pre-trained acoustic model in a multi-task learning paradigm. In the first fusion stage, the time-domain and frequency-domain features are extra
We propose a unified approach to data-driven source-filter modeling using a single neural network for developing a neural vocoder capable of generating high-quality synthetic speech waveforms while retaining flexibility of the source-filter model to
Cochlear implant users struggle to understand speech in reverberant environments. To restore speech perception, artifacts dominated by reverberant reflections can be removed from the cochlear implant stimulus. Artifacts can be identified and removed
This paper introduces an improved target speaker extractor, referred to as Speakerfilter-Pro, based on our previous Speakerfilter model. The Speakerfilter uses a bidirectional gated recurrent unit (BGRU) module to characterize the target speaker from