
Consistent Independent Low-Rank Matrix Analysis for Determined Blind Source Separation

Published by Daichi Kitamura
Publication date: 2020
Language: English





Independent low-rank matrix analysis (ILRMA) is the state-of-the-art algorithm for blind source separation (BSS) in the determined situation (the number of microphones is greater than or equal to that of source signals). ILRMA achieves high separation performance by modeling the power spectrograms of the source signals via nonnegative matrix factorization (NMF). Such a highly developed source model can solve the permutation problem of frequency-domain BSS to a large extent, which is the reason ILRMA performs so well. In this paper, we further improve the separation performance of ILRMA by additionally considering the general structure of spectrograms, which is called consistency, and hence we call the proposed method Consistent ILRMA. Since a spectrogram is calculated with overlapping windows (and a window function induces spectral smearing in the form of main- and side-lobes), the time-frequency bins depend on each other. In other words, the time-frequency components are related to each other via the uncertainty principle. Such co-occurrence among the spectral components can help solve the permutation problem, as a recent study has demonstrated. On the basis of these facts, we propose an algorithm for realizing Consistent ILRMA by slightly modifying the original algorithm. Its performance was extensively evaluated through experiments performed with various window lengths and shift lengths. The results indicated several tendencies of the original and proposed ILRMA, including some topics not fully discussed in the literature. For example, the proposed Consistent ILRMA tends to outperform the original ILRMA when the window length is sufficiently long compared to the reverberation time of the mixing system.
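The abstract does not spell out how consistency is enforced. One common way to formalize the idea is that a complex spectrogram is consistent when it equals the STFT of its own inverse STFT, so that the overlap between frames is respected. The following is a minimal NumPy sketch of that projection, not the authors' implementation. The function names (`stft`, `istft`, `project_consistent`, `inconsistency`), the periodic Hann window, the window length, and the hop size are all assumptions chosen for illustration.

```python
import numpy as np

def stft(x, win, hop):
    """Window the signal into overlapping frames and take a real FFT of each.
    Returns a complex array of shape (len(win)//2 + 1, n_frames)."""
    n = len(win)
    n_frames = 1 + (len(x) - n) // hop
    frames = np.stack([x[t * hop:t * hop + n] * win for t in range(n_frames)], axis=1)
    return np.fft.rfft(frames, axis=0)

def istft(X, win, hop):
    """Least-squares inverse of `stft` (weighted overlap-add)."""
    n = len(win)
    n_frames = X.shape[1]
    length = n + hop * (n_frames - 1)
    frames = np.fft.irfft(X, n=n, axis=0)
    x = np.zeros(length)
    norm = np.zeros(length)
    for t in range(n_frames):
        sl = slice(t * hop, t * hop + n)
        x[sl] += frames[:, t] * win   # synthesis with the same window
        norm[sl] += win ** 2          # accumulated window energy
    # Guard against division by zero where the window is exactly zero.
    return x / np.maximum(norm, 1e-12)

def project_consistent(X, win, hop):
    """Map an arbitrary complex array to a consistent spectrogram."""
    return stft(istft(X, win, hop), win, hop)

def inconsistency(X, win, hop):
    """Relative distance between X and its consistent projection."""
    return np.linalg.norm(X - project_consistent(X, win, hop)) / np.linalg.norm(X)

# Illustrative parameters (assumptions): 64-sample periodic Hann window, hop of 16.
n, hop = 64, 16
win = np.hanning(n + 1)[:-1]
rng = np.random.default_rng(0)
x = rng.standard_normal(n + hop * 20)

X = stft(x, win, hop)                       # spectrogram of a real signal
phase = rng.uniform(-np.pi, np.pi, X.shape)
Y = np.abs(X) * np.exp(1j * phase)          # same magnitudes, scrambled phases

print("true spectrogram     :", inconsistency(X, win, hop))
print("scrambled spectrogram:", inconsistency(Y, win, hop))
```

The first value should be at the level of floating-point round-off, because a spectrogram computed from an actual signal is consistent by construction. The second should be clearly larger, because independently chosen time-frequency bins generally do not correspond to any time-domain signal. A separation algorithm could apply such a projection to its estimated source spectrograms at each iteration, which is one plausible reading of "slightly modifying the original algorithm".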




Read also

We address the determined audio source separation problem in the time-frequency domain. In independent deeply learned matrix analysis (IDLMA), it is assumed that the inter-frequency correlation of each source spectrum is zero, which is inappropriate for modeling nonstationary signals such as music signals. To account for the correlation between frequencies, independent positive semidefinite tensor analysis has been proposed. This unsupervised (blind) method, however, severely restricts the structure of frequency covariance matrices (FCMs) to reduce the number of model parameters. As an extension of these conventional approaches, we here propose a supervised method that models FCMs using deep neural networks (DNNs). It is difficult to directly infer FCMs using DNNs. Therefore, we also propose a new FCM model represented as a convex combination of a diagonal FCM and a rank-1 FCM. Our FCM model is flexible enough not only to consider inter-frequency correlation but also to capture the dynamics of time-varying FCMs of nonstationary signals. We infer the proposed FCMs using two DNNs: a DNN for power spectrum estimation and a DNN for time-domain signal estimation. An experimental result of separating music signals shows that the proposed method provides higher separation performance than IDLMA.
In this paper, we generalize the source generative model of a state-of-the-art blind source separation (BSS) method, independent low-rank matrix analysis (ILRMA). ILRMA unifies frequency-domain independent component analysis and nonnegative matrix factorization and can provide better performance for audio BSS tasks. To further improve the performance and stability of the separation, we introduce an isotropic complex Student's $t$-distribution as a source generative model, which includes the isotropic complex Gaussian distribution used in conventional ILRMA as a special case. Experiments are conducted using both music and speech BSS tasks, and the results show the validity of the proposed method.
Independent deeply learned matrix analysis (IDLMA) is one of the state-of-the-art supervised multichannel audio source separation methods. It blindly estimates the demixing filters on the basis of source independence, using the source model estimated by the deep neural network (DNN). However, since the ratios of the source to interferer signals vary widely among time-frequency (TF) slots, it is difficult to obtain reliable estimated power spectrograms of sources at all TF slots. In this paper, we propose an IDLMA extension, empirical Bayesian IDLMA (EB-IDLMA), by introducing a prior distribution of source power spectrograms and treating the source power spectrograms as latent random variables. This treatment allows us to implicitly consider the reliability of the estimated source power spectrograms for the estimation of demixing filters through the hyperparameters of the prior distribution estimated by the DNN. Experimental evaluations show the effectiveness of EB-IDLMA and the importance of introducing the reliability of the estimated source power spectrograms.
Multichannel blind audio source separation aims to recover the latent sources from their multichannel mixtures without supervised information. One state-of-the-art blind audio source separation method, named independent low-rank matrix analysis (ILRMA), unifies independent vector analysis (IVA) and nonnegative matrix factorization (NMF). However, the spectral matrix produced by NMF may not yield a compact spectral basis, and it may not guarantee the identifiability of each source either. To address this problem, here we propose to enhance the identifiability of the source model by a minimum-volume prior distribution. We further regularize multichannel NMF (MNMF) and ILRMA, respectively, with the minimum-volume regularizer. The proposed methods maximize the posterior distribution of the separated sources, which ensures the stability of the convergence. Experimental results demonstrate the effectiveness of the proposed methods compared with auxiliary independent vector analysis, MNMF, ILRMA, and its extensions.
Independent deeply learned matrix analysis (IDLMA) is one of the state-of-the-art multichannel audio source separation methods using source power estimation based on deep neural networks (DNNs). The DNN-based power estimation works well for sounds having timbres similar to the DNN training data. However, the sounds to which IDLMA is applied do not always have such timbres, and the timbral mismatch degrades the performance of IDLMA. To tackle this problem, we focus on a blind source separation counterpart of IDLMA, independent low-rank matrix analysis. It uses nonnegative matrix factorization (NMF) as the source model, which can capture source spectral components that only appear in the target mixture, using the low-rank structure of the source spectrogram as a clue. We thus extend the DNN-based source model to encompass the NMF-based source model on the basis of the product-of-experts concept, which we call the product of source models (PoSM). For the proposed PoSM-based IDLMA, we derive a computationally efficient parameter estimation algorithm based on an optimization principle called the majorization-minimization algorithm. Experimental evaluations show the effectiveness of the proposed method.