Multi-Source Direction-of-Arrival Estimation Using Improved Estimation Consistency Method

Added by Rohith Mars

Publication date 2019

Fields Electronic Engineering

Language English

Authors Rohith Mars

Audio and Speech Processing

We address the problem of estimating the directions of arrival (DOAs) of multiple acoustic sources in a reverberant environment using a spherical microphone array. Multi-source DOA estimation is well known to be challenging in the presence of room reverberation, environmental noise, and overlapping sources. In this work, we introduce multiple schemes that improve the robustness of the estimation consistency (EC) approach in reverberant and noisy conditions through redefined and modified parametric weights. Simulation results show that the proposed methods outperform the existing EC approach, especially when the sources are spatially close in a reverberant environment.

On the Potential of Multi-Mode Antennas for Direction-of-Arrival Estimation

Robert Pohlmann, Sami Alkubti Almasri, Siwei Zhang 2018

In this paper, we show that a multi-mode antenna (MMA) is an interesting alternative to a conventional phased antenna array for direction-of-arrival (DoA) estimation. By MMA we mean a single physical radiator with multiple ports, which excite different characteristic modes. In contrast to phased arrays, a closed-form mathematical model of the antenna response, such as a steering vector, is not straightforward to define for MMAs. Instead, one has to rely on calibration measurements or electromagnetic field (EMF) simulation data, which are discrete. To perform DoA estimation, the array interpolation technique (AIT) and wavefield modeling (WM) are suggested as methods with inherent interpolation capabilities that fully take antenna nonidealities such as mutual coupling into account. We present a non-coherent DoA estimator for low-cost receivers and show how coherent DoA estimation and joint DoA and polarization estimation can be performed with MMAs. Using these methods, we assess the DoA estimation performance of an MMA prototype in simulations for both 2D and 3D cases. The results show that WM outperforms AIT at high SNR. Coherent estimation is superior to non-coherent estimation, especially in 3D, because the non-coherent estimator suffers from estimation ambiguities. In conclusion, DoA estimation with a single MMA is feasible and accurate.

Signal Processing

Improved Generalization of Heading Direction Estimation for Aerial Filming Using Semi-supervised Regression

Wenshan Wang, Aayush Ahuja, Yanfu Zhang 2019

In autonomous aerial filming of a moving actor (e.g. a person or a vehicle), it is crucial to have a good estimate of the actor's heading direction from the visual input. However, models obtained in other similar tasks, such as pedestrian collision risk analysis and human-robot interaction, are very difficult to generalize to the aerial filming task because of the difference in data distributions. To improve generalization with less labeled data, this paper presents a semi-supervised algorithm for the heading direction estimation problem. We use temporal continuity as the unsupervised signal to regularize the model and achieve better generalization. The semi-supervised algorithm is applied in both the training and testing phases, which increases testing performance by a large margin. We show that by leveraging unlabeled sequences, the amount of labeled data required can be significantly reduced. We also discuss several important details for improving performance, including how to balance and combine the labeled and unlabeled losses. Experimental results show that our approach robustly outputs the heading direction for different types of actors. The aesthetic value of the video is also improved in the aerial filming task.

Computer Vision and Pattern Recognition Artificial Intelligence Machine Learning

Resource Constrained Neural Networks for 5G Direction-of-Arrival Estimation in Micro-controllers

Piyush Sahoo, Romesh Rajoria, Shivam Chandhok 2021

With the introduction of shared spectrum sensing and beamforming-based multi-antenna transceivers, 5G networks demand spectrum sensing to identify opportunities in the time, frequency, and spatial domains. Narrow beamforming makes it difficult to perform spatial sensing (direction-of-arrival, DoA, estimation) in a centralized manner, and with the evolution of paradigms such as the artificial intelligence of things (AIoT), ultra-reliable low-latency communication (URLLC) services, and distributed networks, intelligence on edge devices (Edge-AI) is highly desirable. It helps to reduce the data-communication overhead compared to cloud-AI-centric networks, is more secure, and is free from scalability limitations. However, achieving the desired functional accuracy is a challenge on edge devices such as microcontroller units (MCUs) due to area, memory, and power constraints. In this work, we propose a low-complexity neural-network-based algorithm for accurate DoA estimation and its efficient mapping onto off-the-shelf MCUs. An ad hoc graphical user interface (GUI) is developed to configure the STM32 NUCLEO-H743ZI2 MCU with the proposed algorithm and to validate its functionality. The performance of the proposed algorithm is analyzed for different signal-to-noise ratios (SNRs), word lengths, numbers of antennas, and DoA resolutions. In-depth experimental results show that it outperforms the conventional statistical spatial sensing approach.

Signal Processing

A Bayesian method for point source polarization estimation

D. Herranz, F. Argueso, L. Toffolatti 2021

The estimation of the polarization $P$ of extragalactic compact sources in Cosmic Microwave Background (CMB) images is a very important task, both for cleaning these images for cosmological purposes (for example, to constrain the tensor-to-scalar ratio of primordial fluctuations during inflation) and for obtaining relevant astrophysical information about the compact sources themselves in a frequency range, $\nu \sim 10$--$200$ GHz, where observations have only very recently started to become available. In this paper we propose a Bayesian maximum a posteriori (MAP) estimation scheme that incorporates prior information about the distribution of the polarization fraction of extragalactic compact sources between 1 and 100 GHz. We apply this Bayesian scheme to white noise simulations and to more realistic simulations that include CMB intensity, Galactic foregrounds, and instrumental noise with the characteristics of the QUIJOTE experiment Wide Survey at 11 GHz. Using these simulations, we also compare our Bayesian method with the frequentist Filtered Fusion method that has already been used on WMAP data and in the Planck mission. We find that the Bayesian method allows us to decrease the threshold for a feasible estimation of $P$ to levels below $\sim 100$ mJy (compared to $\sim 500$ mJy, the equivalent threshold for the frequentist Filtered Fusion). We compare the bias introduced by the Bayesian method and find it to be small in absolute terms. Finally, we test the robustness of the Bayesian estimator against uncertainties in the prior and in the flux density of the sources. We find that the Bayesian estimator is robust against moderate changes in the parameters of the prior and almost insensitive to realistic errors in the estimated photometry of the sources.

Cosmology and Nongalactic Astrophysics Instrumentation and Methods for Astrophysics

Target-speaker Voice Activity Detection with Improved I-Vector Estimation for Unknown Number of Speaker

Maokui He, Desh Raj, Zili Huang 2021

Target-speaker voice activity detection (TS-VAD) has recently shown promising results for speaker diarization on highly overlapped speech. However, the original model requires a fixed (and known) number of speakers, which limits its application to real conversations. In this paper, we extend TS-VAD to speaker diarization with an unknown number of speakers. This is achieved in two steps: first, an initial diarization system is applied to estimate the number of speakers; then, the TS-VAD network output is masked according to this estimate. We further investigate different diarization methods, including clustering-based methods and region proposal networks, for estimating the initial i-vectors. Since these systems have complementary strengths, we propose a fusion-based method that combines their frame-level decisions for an improved initialization. We demonstrate through experiments on variants of the LibriCSS meeting corpus that our proposed approach can reduce the diarization error rate (DER) by up to 50% relative across varying numbers of speakers. This improvement also results in better downstream ASR performance, approaching that obtained with oracle segments.
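
The output-masking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mask_tsvad_outputs`, the array layout (frames by speaker slots), the 0.5 decision threshold, and the example probabilities are all assumptions made for illustration. The only idea taken from the abstract is that a network with a fixed number of output slots has the slots beyond the estimated speaker count suppressed.

```python
import numpy as np

def mask_tsvad_outputs(probs, num_speakers):
    """Zero out speaker slots beyond the estimated speaker count.

    probs: array of shape (num_frames, max_speakers) holding per-frame
           speech activity probabilities, one column per speaker slot
           (this layout is an assumption, not taken from the paper).
    num_speakers: speaker count estimated by an initial diarization system.
    """
    probs = np.asarray(probs, dtype=float)
    max_speakers = probs.shape[1]
    if not 0 <= num_speakers <= max_speakers:
        raise ValueError("estimated speaker count must fit the available slots")
    masked = probs.copy()          # leave the caller's array untouched
    masked[:, num_speakers:] = 0.0 # unused slots can never become active
    return masked

# Hypothetical network output: 2 frames, 4 speaker slots.
probs = np.array([[0.9, 0.2, 0.7, 0.6],
                  [0.1, 0.8, 0.4, 0.3]])
masked = mask_tsvad_outputs(probs, num_speakers=2)
decisions = masked > 0.5           # assumed threshold for frame-level activity
```

With the mask applied, slots 3 and 4 stay inactive regardless of what the network emitted for them, so only the first two speakers can receive frame-level activity.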

Audio and Speech Processing