ﻻ يوجد ملخص باللغة العربية
In recent years, deep learning technique has received intense attention owing to its great success in image recognition. A tendency of adaption of deep learning in various information processing fields has formed, including music information retrieval (MIR). In this paper, we conduct a comprehensive study on music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this the first work to apply Densely Connected Convolutional Networks (DenseNet) to music audio tagging, which has been demonstrated to perform better than Residual neural network (ResNet). Additionally, two specific data augmentation approaches of time overlapping and pitch shifting have been proposed to address the deficiency of labelled data in the MIR. Moreover, an ensemble learning of stacking is employed based on SVM. We believe that the proposed combination of strong representation of DenseNet and data augmentation can be adapted to other audio processing tasks.
In this technical report, we present a joint effort of four groups, namely GT, USTC, Tencent, and UKE, to tackle Task 1 - Acoustic Scene Classification (ASC) in the DCASE 2020 Challenge. Task 1 comprises two different sub-tasks: (i) Task 1a focuses o
Generative Adversarial Networks (GANs) currently achieve the state-of-the-art sound synthesis quality for pitched musical instruments using a 2-channel spectrogram representation consisting of log magnitude and instantaneous frequency (the IFSpectrog
Music annotation has always been one of the critical topics in the field of Music Information Retrieval (MIR). Traditional models use supervised learning for music annotation tasks. However, as supervised machine learning approaches increase in compl
In acoustic scene classification (ASC), acoustic features play a crucial role in the extraction of scene information, which can be stored over different time scales. Moreover, the limited size of the dataset may lead to a biased model with a poor per
Unity is a software specifically designed for the development of video games. However, due to its programming possibilities and the polyvalence of its architecture, it can prove to be a versatile tool for stimuli presentation in research experiments.