
MIRAGE: 2D Source Localization Using Microphone Pair Augmentation with Echoes

Added by Diego Di Carlo
Publication date: 2019
Language: English





It is commonly observed that acoustic echoes hurt the performance of sound source localization (SSL) methods. We introduce the concept of microphone array augmentation with echoes (MIRAGE) and show how estimating early-echo characteristics can in fact benefit SSL. We propose a learning-based scheme for echo estimation combined with a physics-based scheme for echo aggregation. In a simple scenario involving two microphones close to a reflective surface and one source, we show on simulated data that the proposed approach performs similarly to a correlation-based method in azimuth estimation while also retrieving elevation from the two microphones, a task that is impossible in anechoic settings.
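To make the geometry concrete, here is a toy numpy sketch (not the authors' code; the microphone spacing, surface placement, and grid search are illustrative assumptions). Mirroring the two microphones across the reflective surface, as in the image-source model, yields two virtual microphones, and the extra time differences of arrival (TDOAs) carried by the early echoes resolve elevation as well as azimuth.

import numpy as np

C = 343.0  # speed of sound, m/s

def unit(az, el):
    """Unit propagation vector for azimuth az and elevation el (radians)."""
    return np.array([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.sin(el)])

# Two real microphones 5 cm above a reflective floor (the z = 0 plane).
mics = np.array([[0.00, 0.00, 0.05],
                 [0.10, 0.00, 0.05]])
# Image-source model: mirroring across z = 0 gives two virtual microphones.
virtual = mics * np.array([1.0, 1.0, -1.0])
sensors = np.vstack([mics, virtual])        # 4 effective sensors

def tdoas(az, el):
    """Far-field TDOAs of all sensors relative to sensor 0."""
    delays = sensors @ unit(az, el) / C     # signed propagation delays
    return delays[1:] - delays[0]

# Ground-truth direction; in MIRAGE the echo delays behind these TDOAs
# would come from the learned early-echo estimator, not from an oracle.
true_az, true_el = np.deg2rad(40.0), np.deg2rad(25.0)
observed = tdoas(true_az, true_el)

# Grid search minimizing the TDOA mismatch. A microphone pair cannot
# distinguish mirror directions across the plane containing its axis,
# so azimuth is searched over a half-space only.
azs = np.deg2rad(np.arange(0.0, 180.0, 1.0))
els = np.deg2rad(np.arange(0.0, 90.0, 1.0))
errs = np.array([[np.sum((tdoas(a, e) - observed) ** 2) for a in azs]
                 for e in els])
ie, ia = np.unravel_index(np.argmin(errs), errs.shape)
print("estimated az/el (deg):", np.rad2deg(azs[ia]), np.rad2deg(els[ie]))

With the real pair alone (the first two sensors), all baselines are coplanar and the elevation term drops out of the TDOAs; it is only the virtual, echo-induced sensors that make the elevation grid dimension identifiable.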



Related research

A deep learning approach based on big data is proposed to locate broadband acoustic sources using a single hydrophone in ocean waveguides with uncertain bottom parameters. Several 50-layer residual neural networks, trained on a large number of sound-field replicas generated by an acoustic propagation model, are used to handle the bottom uncertainty in source localization. A two-step training strategy is presented to improve training of the deep models: first, the range is discretized on a coarse (5 km) grid; subsequently, the source range within the selected interval and the source depth are discretized on a finer (0.1 km and 2 m) grid. The deep learning methods are demonstrated on simulated magnitude-only multi-frequency data in uncertain environments, and experimental data from the China Yellow Sea further validate the approach.
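The coarse-to-fine discretization can be illustrated with a minimal sketch (bin conventions are assumed; the paper's networks and training code are not reproduced here) that converts a continuous range/depth pair into the two-step class labels and back to bin-center estimates.

import numpy as np

COARSE_KM, FINE_KM, FINE_DEPTH_M = 5.0, 0.1, 2.0  # grids from the abstract

def to_labels(range_km, depth_m):
    coarse = int(range_km // COARSE_KM)                       # stage-1 class
    fine_r = int((range_km - coarse * COARSE_KM) // FINE_KM)  # stage-2 range class
    fine_d = int(depth_m // FINE_DEPTH_M)                     # stage-2 depth class
    return coarse, fine_r, fine_d

def from_labels(coarse, fine_r, fine_d):
    # Map predicted class indices back to bin-center estimates.
    range_km = coarse * COARSE_KM + (fine_r + 0.5) * FINE_KM
    depth_m = (fine_d + 0.5) * FINE_DEPTH_M
    return range_km, depth_m

labels = to_labels(range_km=17.23, depth_m=41.0)
print(labels, from_labels(*labels))   # (3, 22, 20) (17.25 km, 41.0 m)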
In this paper, we describe our method for DCASE2019 Task 3: Sound Event Localization and Detection (SELD). We use four CRNN SELDnet-like single-output models which run consecutively to recover all available information about occurring events. We decompose the SELD task into estimating the number of active sources, estimating the direction of arrival (DOA) of a single source, estimating the DOA of a second source given the direction of the first, and a multi-label classification task. A custom consecutive ensemble then predicts event onset, offset, direction of arrival, and class. The proposed approach is evaluated on the TAU Spatial Sound Events 2019 - Ambisonic dataset and compared with other participants' submissions.
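The consecutive decomposition amounts to the control flow below (a sketch with stub stand-ins for the four CRNNs; the model interfaces and class names are assumptions, not the authors' implementation).

import numpy as np

def count_sources(features):
    """Stand-in for the source-counting CRNN: returns 0, 1, or 2."""
    return 2  # stub

def doa_single(features):
    """Stand-in for the single-source DOA CRNN: (azimuth, elevation) in deg."""
    return (30.0, 10.0)  # stub

def doa_second(features, first_doa):
    """Stand-in for the CRNN estimating the second DOA given the first."""
    return (-60.0, 0.0)  # stub

def classify(features, doas):
    """Stand-in for the multi-label event classifier."""
    return ["speech", "dog_bark"][: len(doas)]  # stub class names

def seld_inference(features):
    n = count_sources(features)
    doas = []
    if n >= 1:
        doas.append(doa_single(features))
    if n >= 2:
        doas.append(doa_second(features, doas[0]))
    return list(zip(classify(features, doas), doas))

print(seld_inference(np.zeros((64, 100))))  # [('speech', ...), ('dog_bark', ...)]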
193 - Hengnian Qi, Xiaoping Wu, 2020
In this paper, we introduce an intelligent prediction system for mobile source localization in the industrial Internet of Things. The position and velocity of a mobile source are jointly predicted from Time Delay (TD) measurements. To this end, a Relaxed Semi-Definite Programming (RSDP) algorithm is first designed by dropping the rank-one constraint. However, dropping the rank-one constraint produces a suboptimal solution. To improve performance, we further put forward a Penalty Function Semi-Definite Programming (PF-SDP) method that recovers a rank-one solution of the optimization problem by introducing penalty terms. An Adaptive Penalty Function Semi-Definite Programming (APF-SDP) algorithm is then proposed to avoid excessive penalties by choosing the penalty coefficient adaptively. We conduct experiments in both a simulation environment and a real system to demonstrate the effectiveness of the proposed method. The results demonstrate that the intelligent APF-SDP algorithm outperforms PF-SDP in both position and velocity estimation, whether the noise level is large or not.
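The rank-one issue these penalty methods target can be shown with a small numpy sketch (not the paper's algorithm; the matrix here is synthetic). A relaxed SDP returns a PSD matrix X that should ideally equal x x^T; the penalized quantity is the rank-one gap trace(X) - lambda_max(X), and the estimate is read off the leading eigenvector.

import numpy as np

rng = np.random.default_rng(0)
x_true = rng.normal(size=4)                      # stacked position/velocity
X = np.outer(x_true, x_true) + 0.05 * np.eye(4)  # relaxed solution, rank > 1

w, V = np.linalg.eigh(X)                  # eigenvalues in ascending order
gap = np.trace(X) - w[-1]                 # rank-one gap the penalty drives to 0
x_hat = np.sqrt(w[-1]) * V[:, -1]         # best rank-one factor of X
x_hat *= np.sign(x_hat @ x_true)          # fix sign ambiguity for comparison
print(f"rank-one gap: {gap:.3f}, recovery error: {np.linalg.norm(x_hat - x_true):.3f}")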
69 - Xuecong Sun, Han Jia, Zhe Zhang, 2019
Conventional approaches to sound localization and separation in artificial systems are based on microphone arrays. Inspired by the selective perception of the human auditory system, we design a multi-source listening system which can separate simultaneous overlapping sounds and localize the sound sources in three-dimensional space using only a single microphone with a metamaterial enclosure. The enclosure modifies the frequency response of the microphone in a direction-dependent way, giving each direction a signature. The location and audio content of the sound sources can thus be experimentally reconstructed from the modulated mixed signals with a compressive sensing algorithm. Owing to the low computational complexity of the proposed reconstruction algorithm, the system can also be applied to source identification and tracking. Its effectiveness has been demonstrated through random listening tests in multiple real scenarios. The proposed metamaterial-based single-sensor listening system opens a new route to sound localization and separation, with applications in intelligent scene monitoring and robot audition.
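A toy version of the reconstruction step (random stand-in signatures instead of the real metamaterial responses; dimensions, sparsity, and the use of orthogonal matching pursuit are assumptions, since the paper's exact solver is not given here): each candidate direction contributes one frequency signature, a column of A, and the single-microphone spectrum y = A @ x encodes which few directions are active.

import numpy as np

rng = np.random.default_rng(1)
n_freqs, n_dirs, n_active = 32, 40, 2
A = rng.normal(size=(n_freqs, n_dirs))      # direction-dependent signatures
A /= np.linalg.norm(A, axis=0)              # unit-norm columns

x = np.zeros(n_dirs)
x[rng.choice(n_dirs, n_active, replace=False)] = [1.0, 0.7]
y = A @ x                                   # mixed single-microphone spectrum

# Orthogonal matching pursuit: greedily pick the direction best matching
# the residual, then re-fit all chosen directions by least squares.
support, residual = [], y.copy()
for _ in range(n_active):
    support.append(int(np.argmax(np.abs(A.T @ residual))))
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    residual = y - A[:, support] @ coef

print("true directions:", np.nonzero(x)[0], "recovered:", sorted(support))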
Speech-related applications deliver inferior performance in complex noise environments. This study addresses the problem by introducing speech-enhancement (SE) systems based on deep neural networks (DNNs) applied to a distributed microphone architecture, and investigates the effectiveness of three different DNN model structures. The first system constructs a DNN model for each microphone to enhance the recorded noisy speech signal; the second combines all the noisy recordings into a large feature structure that is then enhanced through a single DNN model. In the third system, a channel-dependent DNN first enhances the corresponding noisy input, and all the channel-wise enhanced outputs are fed into a DNN fusion model that constructs a nearly clean signal. All three DNN SE systems operate in the acoustic frequency domain of speech signals in a diffuse-noise-field environment. Evaluation experiments were conducted on the Taiwan Mandarin Hearing in Noise Test (TMHINT) database, and the results indicate that all three DNN-based SE systems improve the speech quality and intelligibility of the noise-corrupted signals, with the third system delivering the highest signal-to-noise ratio (SNR) improvement and the best speech intelligibility.
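The third, fusion-based structure could be sketched roughly as follows (a minimal PyTorch stand-in; the layer sizes, channel count, and feature dimensions are assumptions, not the paper's models): one enhancement DNN per channel, whose outputs are concatenated and passed to a fusion DNN predicting the clean spectrum.

import torch
import torch.nn as nn

N_CH, N_FREQ = 4, 257   # distributed microphones, spectral bins (assumed)

def mlp(n_in, n_out):
    """Small fully connected enhancement network (stand-in architecture)."""
    return nn.Sequential(nn.Linear(n_in, 512), nn.ReLU(),
                         nn.Linear(512, n_out))

channel_dnns = nn.ModuleList([mlp(N_FREQ, N_FREQ) for _ in range(N_CH)])
fusion_dnn = mlp(N_CH * N_FREQ, N_FREQ)

def enhance(noisy):                       # noisy: (batch, N_CH, N_FREQ)
    # Channel-dependent enhancement, then fusion of all enhanced channels.
    per_channel = [dnn(noisy[:, c]) for c, dnn in enumerate(channel_dnns)]
    return fusion_dnn(torch.cat(per_channel, dim=-1))

clean_hat = enhance(torch.randn(8, N_CH, N_FREQ))
print(clean_hat.shape)                    # torch.Size([8, 257])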
