ﻻ يوجد ملخص باللغة العربية
Since space-domain information can be utilized, microphone array beamforming is often used to enhance the quality of the speech by suppressing directional disturbance. However, with the increasing number of microphone, the complexity would be increased. In this paper, a concise beamforming scheme using Maximum Signal-to-Noise Ratio (SNR) filter is proposed to reduce the beamforming complexity. The maximum SNR filter is implemented by using the estimated direction-of-arrival (DOA) of the speech source localization (SSL) and the solving method of independent vector analysis (IVA). Our experiments show that when compared with other widely-used algorithms, the proposed algorithm obtain higher gain of signal-to-interference and noise ratio (SINR).
We propose BeamTransformer, an efficient architecture to leverage beamformers edge in spatial filtering and transformers capability in context sequence modeling. BeamTransformer seeks to optimize modeling of sequential relationship among signals from
A crucial aspect for the successful deployment of audio-based models in-the-wild is the robustness to the transformations introduced by heterogeneous acquisition conditions. In this work, we propose a method to perform one-shot microphone style trans
Spatial clustering techniques can achieve significant multi-channel noise reduction across relatively arbitrary microphone configurations, but have difficulty incorporating a detailed speech/noise model. In contrast, LSTM neural networks have success
Deep speaker embedding represents the state-of-the-art technique for speaker recognition. A key problem with this approach is that the resulting deep speaker vectors tend to be irregularly distributed. In previous research, we proposed a deep normali
We propose a unified approach to data-driven source-filter modeling using a single neural network for developing a neural vocoder capable of generating high-quality synthetic speech waveforms while retaining flexibility of the source-filter model to