An empirical investigation into audio pipeline approaches for classifying bird species

Added by: Vukosi Marivate

Publication date: 2021

Field: Informatics Engineering

Language: English

Authors: David Behr, Ciira wa Maina, Vukosi Marivate

Sound; Computers and Society; Machine Learning

Abstract

This paper investigates aspects of an audio classification pipeline appropriate for monitoring bird species on edge devices. These aspects include transfer learning, data augmentation, and model optimization. The hope is that the resulting models will be good candidates to deploy on edge devices to monitor bird populations. Two classification approaches are considered: one explores the effectiveness of a traditional Deep Neural Network (DNN), and the other makes use of convolutional layers. This study aims to contribute empirical evidence of the merits and demerits of each approach.
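As an illustration of the data augmentation step named in the abstract, here is a minimal sketch in plain Python. The abstract does not say which augmentations the authors used; random time shifting and additive white noise are assumed here because they are common choices for audio. The function names, the 16 kHz sample rate, and the default parameters are all hypothetical.

```python
import math
import random


def time_shift(samples, shift):
    """Circularly shift a waveform by `shift` samples (positive = delay)."""
    if not samples:
        return []
    shift %= len(samples)  # Python's modulo maps negative shifts into range
    if shift == 0:
        return list(samples)
    return samples[-shift:] + samples[:-shift]


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise scaled to roughly the requested SNR (dB)."""
    signal_power = sum(x * x for x in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in samples]


def augment(samples, rng, max_shift=1600, snr_db=20.0):
    """Apply a random time shift followed by additive noise (assumed recipe)."""
    shift = rng.randint(-max_shift, max_shift)
    return add_noise(time_shift(samples, shift), snr_db, rng)


# Stand-in for a recording: one second of a 440 Hz tone at an assumed 16 kHz.
rate = 16000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
augmented = augment(tone, random.Random(0))
```

Each call to `augment` with a different random state yields a new variant of the same clip, which is how augmentation enlarges a small training set without new field recordings.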

Falling for Phishing: An Empirical Investigation into People's Email Response Behaviors

Asangi Jayatilaka, Nalin Asanka Gamagedara Arachchilage, and Muhammad Ali Babar (2021)

Despite sophisticated phishing email detection systems and training and awareness programs, humans continue to be tricked by phishing emails. In an attempt to understand why phishing email attacks still work, we have carried out an empirical study to investigate how people make response decisions while reading their emails. We used a think-aloud method and follow-up interviews to collect data from 19 participants. The analysis of the collected data has enabled us to identify eleven factors that influence people's response decisions to both phishing and legitimate emails. Based on the identified factors, we discuss how people can be susceptible to phishing attacks due to the flaws in their decision-making processes. Furthermore, we propose design directions for developing a behavioral plugin for email clients that can be used to nudge people's secure behaviors, enabling them to have a better response to phishing emails.

Cryptography and Security; Computers and Society; Human-Computer Interaction

Automated bird sound recognition in realistic settings

Timos Papadopoulos, Stephen J. Roberts, Katherine J. Willis (2018)

We evaluated the effectiveness of an automated bird sound identification system in a situation that emulates a realistic, typical application. We trained classification algorithms on a crowd-sourced collection of bird audio recording data and restricted our training methods to be completely free of manual intervention. The approach is hence directly applicable to the analysis of multiple species collections, with labelling provided by crowd-sourced collection. We evaluated the performance of the bird sound recognition system on a realistic number of candidate classes, corresponding to real conditions. We investigated the use of two canonical classification methods, chosen due to their widespread use and ease of interpretation, namely a k Nearest Neighbour (kNN) classifier with histogram-based features and a Support Vector Machine (SVM) with time-summarisation features. We further investigated the use of a certainty measure, derived from the output probabilities of the classifiers, to enhance the interpretability and reliability of the class decisions. Our results demonstrate that both identification methods achieved similar performance, but we argue that the use of the kNN classifier offers somewhat more flexibility. Furthermore, we show that employing an outcome certainty measure provides a valuable and consistent indicator of the reliability of classification results. Our use of generic training data and our investigation of probabilistic classification methodologies that can flexibly address the variable number of candidate species/classes that are expected to be encountered in the field, directly contribute to the development of a practical bird sound identification system with potentially global application. Further, we show that certainty measures associated with identification outcomes can significantly contribute to the practical usability of the overall system.

Sound; Computers and Society; Machine Learning

TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer

Sicong Huang, Qiyang Li, Cem Anil (2018)

In this work, we address the problem of musical timbre transfer, where the goal is to manipulate the timbre of a sound sample from one instrument to match another instrument while preserving other musical content, such as pitch, rhythm, and loudness. In principle, one could apply image-based style transfer techniques to a time-frequency representation of an audio signal, but this depends on having a representation that allows independent manipulation of timbre as well as high-quality waveform generation. We introduce TimbreTron, a method for musical timbre transfer which applies image domain style transfer to a time-frequency representation of the audio signal, and then produces a high-quality waveform using a conditional WaveNet synthesizer. We show that the Constant Q Transform (CQT) representation is particularly well-suited to convolutional architectures due to its approximate pitch equivariance. Based on human perceptual evaluations, we confirmed that TimbreTron recognizably transferred the timbre while otherwise preserving the musical content, for both monophonic and polyphonic samples.

Sound; Machine Learning; Audio and Speech Processing

Empirical Bayesian Independent Deeply Learned Matrix Analysis For Multichannel Audio Source Separation

Takuya Hasumi, Tomohiko Nakamura, Norihiro Takamune (2021)

Independent deeply learned matrix analysis (IDLMA) is one of the state-of-the-art supervised multichannel audio source separation methods. It blindly estimates the demixing filters on the basis of source independence, using the source model estimated by the deep neural network (DNN). However, since the ratios of the source to interferer signals vary widely among time-frequency (TF) slots, it is difficult to obtain reliable estimated power spectrograms of sources at all TF slots. In this paper, we propose an IDLMA extension, empirical Bayesian IDLMA (EB-IDLMA), by introducing a prior distribution of source power spectrograms and treating the source power spectrograms as latent random variables. This treatment allows us to implicitly consider the reliability of the estimated source power spectrograms for the estimation of demixing filters through the hyperparameters of the prior distribution estimated by the DNN. Experimental evaluations show the effectiveness of EB-IDLMA and the importance of introducing the reliability of the estimated source power spectrograms.

Sound; Audio and Speech Processing

Open Data Ecosystems -- an empirical investigation into an emerging industry collaboration concept

Per Runeson, Thomas Olsson, Johan Linåker (2021)

Software systems are increasingly depending on data, particularly with the rising use of machine learning, and developers are looking for new sources of data. Open Data Ecosystems (ODE) is an emerging concept for data sharing under public licenses in software ecosystems, similar to Open Source Software (OSS). It has certain similarities to Open Government Data (OGD), where public agencies share data for innovation and transparency. We aimed to explore open data ecosystems involving commercial actors. Thus, we organized five focus groups with 27 practitioners from 22 companies, public organizations, and research institutes. Based on the outcomes, we surveyed three cases of emerging ODE practice to further understand the concepts and to validate the initial findings. The main outcome is an initial conceptual model of ODEs' value, intrinsics, governance, and evolution, and propositions for practice and further research. We found that ODE must be value driven. Regarding the intrinsics of data, we found their type, meta-data, and legal frameworks influential for their openness. We also found the characteristics of ecosystem initiation, organization, data acquisition, and openness to be differentiating, which we advise research and practice to take into consideration.

Software Engineering
