Algorithms for acoustic source localization and tracking provide estimates of the positional information about active sound sources in acoustic environments and are essential for a wide range of applications such as personal assistants, smart homes, tele-conferencing systems, hearing aids, or autonomous systems. The aim of the IEEE-AASP Challenge on sound source localization and tracking (LOCATA) was to objectively benchmark state-of-the-art localization and tracking algorithms using an open-access data corpus of recordings for scenarios typically encountered in audio and acoustic signal processing applications. The challenge tasks ranged from the localization of a single source with a static microphone array to the tracking of multiple moving sources with a moving microphone array.
The Multi-target Challenge aims to assess how well current speech technology is able to determine whether or not a recorded utterance was spoken by one of a large number of blacklisted speakers. It is a form of multi-target speaker detection based on real-world telephone conversations. Data recordings are generated from call center customer-agent conversations. The task is to measure how accurately one can detect 1) whether a test recording is spoken by a blacklisted speaker, and 2) which specific blacklisted speaker was talking. This paper outlines the challenge and provides its baselines, results, and discussions.
The Magnificent CE$\nu$NS Workshop (2018) was held November 2 & 3 of 2018 on the University of Chicago campus and brought together theorists, phenomenologists, and experimentalists working in numerous areas but sharing a common interest in the process of coherent elastic neutrino-nucleus scattering (CE$\nu$NS). This is a collection of abstract-like summaries of the talks given at the meeting, including links to the slides presented. This document and the slides from the meeting provide an overview of the field and a snapshot of the robust CE$\nu$NS-related efforts both planned and underway.
The Multitarget Challenge aims to assess how well current speech technology is able to determine whether or not a recorded utterance was spoken by one of a large number of blacklisted speakers. It is a form of multi-target speaker detection based on real-world telephone conversations. Data recordings are generated from call center customer-agent conversations. Each conversation is represented by a single i-vector. Given a pool of training and development data from non-Blacklist and Blacklist speakers, the task is to measure how accurately one can detect 1) whether a test recording is spoken by a Blacklist speaker, and 2) which specific Blacklist speaker was talking.
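The two-part task described above (is the speaker on the blacklist, and if so, which one) can be illustrated with a minimal sketch. It assumes cosine scoring of a test i-vector against one model vector per blacklist speaker and a single decision threshold; the function names, the toy three-dimensional vectors, and the threshold value are hypothetical and are not taken from the challenge description.

```python
import numpy as np

def length_normalize(x):
    """Scale vectors to unit Euclidean norm along the last axis."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def detect_blacklist(test_ivector, blacklist_models, threshold):
    """Return (is_blacklisted, speaker_index_or_None, top_score).

    blacklist_models: (n_speakers, dim), one model vector per blacklist
    speaker (assumed here, e.g. the mean of that speaker's i-vectors).
    """
    # Cosine similarity between the test vector and every blacklist model.
    scores = length_normalize(blacklist_models) @ length_normalize(test_ivector)
    best = int(np.argmax(scores))
    # Part 1: detection, by thresholding the best score.
    is_blacklisted = bool(scores[best] >= threshold)
    # Part 2: identification, only meaningful when detection fires.
    return is_blacklisted, (best if is_blacklisted else None), float(scores[best])

# Toy example with made-up vectors and an arbitrary threshold.
blacklist_models = np.array([[1.0, 0.0, 0.0],
                             [0.0, 1.0, 0.0]])
hit = detect_blacklist(np.array([0.1, 0.9, 0.0]), blacklist_models, threshold=0.8)
miss = detect_blacklist(np.array([0.0, 0.0, 1.0]), blacklist_models, threshold=0.8)
```

In this sketch the same score drives both decisions, so an error in the top-ranked speaker also counts as an identification error even when the detection itself is correct.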
This short paper presents an efficient, flexible implementation of the SRP-PHAT multichannel sound source localization method. The method is evaluated on the single-source tasks of the LOCATA 2018 development dataset, and an associated Matlab toolbox is made available online.
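For readers unfamiliar with SRP-PHAT, the idea can be sketched as follows: for each candidate source position, sum the PHAT-weighted generalized cross-correlation of every microphone pair at the time difference of arrival that the candidate implies, then pick the candidate with the highest steered power. The sketch below is written in Python rather than Matlab and is not the toolbox's interface. It assumes integer per-microphone delays (in samples) are already known for each candidate, whereas a practical implementation would derive them from the array geometry and the speed of sound.

```python
import numpy as np

def gcc_phat(x1, x2, n_fft):
    """PHAT-weighted cross-correlation; lag l is stored at index l mod n_fft."""
    X1 = np.fft.rfft(x1, n_fft)
    X2 = np.fft.rfft(x2, n_fft)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12  # keep phase only (PHAT weighting)
    return np.fft.irfft(cross, n_fft)

def srp_phat(signals, candidate_delays):
    """Steered response power for each candidate.

    signals: (n_mics, n_samples) array.
    candidate_delays: (n_candidates, n_mics) integer arrival delays in
    samples (hypothetical input format chosen for this sketch).
    """
    n_mics, n_samples = signals.shape
    n_fft = 2 * n_samples  # zero-pad to avoid circular wrap-around
    power = np.zeros(len(candidate_delays))
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cc = gcc_phat(signals[i], signals[j], n_fft)
            # Lag implied by each candidate for this microphone pair.
            lags = candidate_delays[:, i] - candidate_delays[:, j]
            power += cc[lags % n_fft]
    return power

# Toy example: white noise arriving at three mics with delays 0, 3, 5 samples.
rng = np.random.default_rng(0)
source = rng.standard_normal(1024 + 16)
true_delays = [0, 3, 5]
signals = np.stack([source[8 - d: 8 - d + 1024] for d in true_delays])
candidate_delays = np.array([[0, 0, 0], [0, 3, 5], [0, 5, 3], [0, 1, 2]])
power = srp_phat(signals, candidate_delays)
best = int(np.argmax(power))  # index of the candidate with the highest power
```

Precomputing each pair's correlation once and indexing it per candidate, as done here, is one common way to keep the grid search cheap; the paper's own efficiency techniques may differ.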
The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020) addresses three different research problems under well-defined conditions: far-field text-dependent speaker verification from a single microphone array, far-field text-independent speaker verification from a single microphone array, and far-field text-dependent speaker verification from distributed microphone arrays. All three tasks pose a cross-channel challenge to the participants. To simulate a real-life scenario, the enrollment utterances are recorded with a close-talk cellphone, while the test utterances are recorded with the far-field microphone arrays. In this paper, we describe the database, the challenge, and the baseline system, a ResNet-based deep speaker network with cosine similarity scoring. For a given utterance, the speaker embeddings of the different channels are averaged with equal weights to form the final embedding. The baseline system achieves minDCFs of 0.62, 0.66, and 0.64 and EERs of 6.27%, 6.55%, and 7.18% for task 1, task 2, and task 3, respectively.
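The channel averaging and cosine scoring described above can be sketched in a few lines. This is a minimal illustration, assuming the per-channel embeddings have already been extracted by some speaker network; the function names and the toy three-dimensional embeddings are hypothetical and do not come from the baseline's code.

```python
import numpy as np

def average_channel_embeddings(channel_embeddings):
    """Equal-weight mean over channels: (n_channels, dim) -> (dim,)."""
    return np.asarray(channel_embeddings, dtype=float).mean(axis=0)

def cosine_score(enroll, test):
    """Cosine similarity between an enrollment and a test embedding."""
    enroll = np.asarray(enroll, dtype=float)
    test = np.asarray(test, dtype=float)
    return float(enroll @ test / (np.linalg.norm(enroll) * np.linalg.norm(test)))

# Toy example: one close-talk enrollment embedding and a far-field test
# utterance seen through two channels (made-up values).
enroll_embedding = np.array([1.0, 0.0, 1.0])
test_channels = np.array([[0.9, 0.1, 1.1],
                          [1.1, -0.1, 0.9]])
test_embedding = average_channel_embeddings(test_channels)
score = cosine_score(enroll_embedding, test_embedding)
```

A verification decision would then compare `score` with a threshold tuned on development data; the threshold itself is outside the scope of this sketch.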