Raking the Cocktail Party


الملخص بالإنكليزية

We present the concept of an acoustic rake receiver---a microphone beamformer that uses echoes to improve the noise and interference suppression. The rake idea is well-known in wireless communications; it involves constructively combining different multipath components that arrive at the receiver antennas. Unlike spread-spectrum signals used in wireless communications, speech signals are not orthogonal to their shifts. Therefore, we focus on the spatial structure, rather than temporal. Instead of explicitly estimating the channel, we create correspondences between early echoes in time and image sources in space. These multiple sources of the desired and the interfering signal offer additional spatial diversity that we can exploit in the beamformer design. We present several intuitive and optimal formulations of acoustic rake receivers, and show theoretically and numerically that the rake formulation of the maximum signal-to-interference-and-noise beamformer offers significant performance boosts in terms of noise and interference suppression. Beyond signal-to-noise ratio, we observe gains in terms of the emph{perceptual evaluation of speech quality} (PESQ) metric for the speech quality. We accompany the paper by the complete simulation and processing chain written in Python. The code and the sound samples are available online at url{http://lcav.github.io/AcousticRakeReceiver/}.

تحميل البحث