Computational auditory scene analysis has gained increasing interest in recent years. Trailing behind the more mature field of speech recognition, it is general sound event detection in particular that is attracting growing attention. Having enough suitable data available is crucial for training and testing reasonable models, yet until recently, general sound event databases were hard to find. We release and present a database with 714 wav files containing isolated, high-quality sound events of 14 different types, plus 303 "general" wav files containing anything else but these 14 types. All sound events are strongly labeled with perceptual on- and offset times, with attention paid to omitting in-between silences. The amount of isolated sound events, the quality of annotations, and the particular general sound class distinguish NIGENS from other databases.
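As a minimal illustration of what such strong labels afford, the sketch below reads per-event on-/offset annotations and cuts the corresponding isolated segments out of a wav file. The annotation file name, its whitespace-separated "onset offset" layout, and the example paths are assumptions made for illustration, not NIGENS' documented format.

# Minimal sketch: extract strongly labeled event segments from a wav file.
# The annotation layout and paths below are assumptions, not the NIGENS spec.
import soundfile as sf

def load_event_segments(wav_path, annotation_path):
    """Return the audio segments covered by each annotated on-/offset pair."""
    audio, sr = sf.read(wav_path)
    segments = []
    with open(annotation_path) as f:
        for line in f:
            onset, offset = map(float, line.split())   # times in seconds
            segments.append(audio[int(onset * sr):int(offset * sr)])
    return segments, sr

# Example call (hypothetical file names):
# segments, sr = load_event_segments("dog/dog_01.wav", "dog/dog_01.txt")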
Audio content analysis in terms of sound events is an important research problem for a variety of applications. Recently, the development of weak labeling approaches for audio or sound event detection (AED) and the availability of large-scale weakly labe
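To make the weak-label setting mentioned above concrete, here is a small sketch contrasting a clip-level (weak) target with a frame-level (strong) one; the class list, frame hop, and event timings are made up for illustration and do not come from any of the corpora described here.

# Sketch: weak (clip-level) vs. strong (frame-level) labels for AED.
# Class names, clip length, and event timings are illustrative assumptions.
import numpy as np

CLASSES = ["dog_bark", "siren", "speech"]          # assumed class list
clip_len_s, hop_s = 10.0, 0.1                      # 10 s clip, 100 ms frames
n_frames = int(clip_len_s / hop_s)

# Weak label: one multi-hot vector per clip, no timing information.
weak = np.array([1, 0, 1])                         # dog_bark and speech occur

# Strong label: one multi-hot vector per frame, built from on-/offset times.
strong = np.zeros((n_frames, len(CLASSES)))
events = [("dog_bark", 1.2, 2.5), ("speech", 4.0, 9.5)]   # (class, on, off)
for cls, onset, offset in events:
    strong[int(onset / hop_s):int(offset / hop_s), CLASSES.index(cls)] = 1

# A weak label can always be derived from a strong one by pooling over time.
assert np.array_equal(weak, strong.max(axis=0).astype(int))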
In this work, we present the development of a new database, namely the Sound Localization and Classification (SLoClas) corpus, for studying and analyzing sound localization and classification. The corpus contains a total of 23.27 hours of data recorded u
The motivation of our research is to develop a sound-to-image (S2I) translation system that enables a human receiver to visually infer the occurrence of sound-related events. We expect the computer to imagine the scene from the captured sound, genera
An automated system for continuous, real-time analysis of respiratory sounds is needed in clinical practice. Previously, we established an open-access lung sound database, HF_Lung_V1, and automated lung sound analysis algorithms capable of detecting inhalation
Automated Audio Captioning is a cross-modal task that generates natural language descriptions to summarize the sound events in audio clips. However, grounding the actual sound events in the given audio based on its corresponding caption has not been invest