أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل George Fazekas

Timbre Space Representation of a Subtractive Synthesizer

46 - Cyrus Vahidi , George Fazekas , Charalampos Saitis 2020

In this study, we produce a geometrically scaled perceptual timbre space from dissimilarity ratings of subtractive synthesized sounds and correlate the resulting dimensions with a set of acoustic descriptors. We curate a set of 15 sounds, produced by a synthesis model that uses varying source waveforms, frequency modulation (FM) and a lowpass filter with an enveloped cutoff frequency. Pairwise dissimilarity ratings were collected within an online browser-based experiment. We hypothesized that a varied waveform input source and enveloped filter would act as the main vehicles for timbral variation, providing novel acoustic correlates for the perception of synthesized timbres.

أنظمة الصوت في الحاسوب معالجة الصوت والكلام

The Effects of Noisy Labels on Deep Convolutional Neural Networks for Music Tagging

87 - Keunwoo Choi , George Fazekas , Kyunghyun Cho 2017

Deep neural networks (DNN) have been successfully applied to music classification including music tagging. However, there are several open questions regarding the training, evaluation, and analysis of DNNs. In this article, we investigate specific as pects of neural networks, the effects of noisy labels, to deepen our understanding of their properties. We analyse and (re-)validate a large music tagging dataset to investigate the reliability of training and evaluation. Using a trained network, we compute label vector similarities which is compared to groundtruth similarity. The results highlight several important aspects of music tagging and neural networks. We show that networks can be effective despite relatively large error rates in groundtruth datasets, while conjecturing that label noise can be the cause of varying tag-wise performance differences. Lastly, the analysis of our trained network provides valuable insight into the relationships between music tags. These results highlight the benefit of using data-driven methods to address automatic music tagging.

استرجاع المعلومات التعلم الآلي الوسائط المتعددة

Convolutional Recurrent Neural Networks for Music Classification

218 - Keunwoo Choi , George Fazekas , Mark Sandler 2016

We introduce a convolutional recurrent neural network (CRNN) for music tagging. CRNNs take advantage of convolutional neural networks (CNNs) for local feature extraction and recurrent neural networks for temporal summarisation of the extracted featur es. We compare CRNN with three CNN structures that have been used for music tagging while controlling the number of parameters with respect to their performance and training time per sample. Overall, we found that CRNNs show a strong performance with respect to the number of parameter and training time, indicating the effectiveness of its hybrid structure in music feature extraction and feature summarisation.

الحوسبة العصبية والتطورية التعلم الآلي الوسائط المتعددة

Towards Music Captioning: Generating Music Playlist Descriptions

73 - Keunwoo Choi , George Fazekas , Brian McFee 2016

Descriptions are often provided along with recommendations to help users discovery. Recommending automatically generated music playlists (e.g. personalised playlists) introduces the problem of generating descriptions. In this paper, we propose a meth od for generating music playlist descriptions, which is called as music captioning. In the proposed method, audio content analysis and natural language processing are adopted to utilise the information of each track.

الوسائط المتعددة الذكاء الاصطناعي الحساب واللغة

Explaining Deep Convolutional Neural Networks on Music Classification

82 - Keunwoo Choi , George Fazekas , Mark Sandler 2016

Deep convolutional neural networks (CNNs) have been actively adopted in the field of music information retrieval, e.g. genre classification, mood detection, and chord recognition. However, the process of learning and prediction is little understood, particularly when it is applied to spectrograms. We introduce auralisation of a CNN to understand its underlying mechanism, which is based on a deconvolution procedure introduced in [2]. Auralisation of a CNN is converting the learned convolutional features that are obtained from deconvolution into audio signals. In the experiments and discussions, we explain trained features of a 5-layer CNN based on the deconvolved spectrograms and auralised signals. The pairwise correlations per layers with varying different musical attributes are also investigated to understand the evolution of the learnt features. It is shown that in the deep layers, the features are learnt to capture textures, the patterns of continuous distributions, rather than shapes of lines.

التعلم الآلي الذكاء الاصطناعي الوسائط المتعددة

Towards Playlist Generation Algorithms Using RNNs Trained on Within-Track Transitions

77 - Keunwoo Choi , , George Fazekas 2016

We introduce a novel playlist generation algorithm that focuses on the quality of transitions using a recurrent neural network (RNN). The proposed model assumes that optimal transitions between tracks can be modelled and predicted by internal transit ions within music tracks. We introduce modelling sequences of high-level music descriptors using RNNs and discuss an experiment involving different similarity functions, where the sequences are provided by a musical structural analysis algorithm. Qualitative observations show that the proposed approach can effectively model transitions of music tracks in playlists.

الذكاء الاصطناعي الوسائط المتعددة أنظمة الصوت في الحاسوب

Automatic tagging using deep convolutional neural networks

99 - Keunwoo Choi , George Fazekas , Mark Sandler 2016

We present a content-based automatic music tagging algorithm using fully convolutional neural networks (FCNs). We evaluate different architectures consisting of 2D convolutional layers and subsampling layers only. In the experiments, we measure the A UC-ROC scores of the architectures with different complexities and input types using the MagnaTagATune dataset, where a 4-layer architecture shows state-of-the-art performance with mel-spectrogram input. Furthermore, we evaluated the performances of the architectures with varying the number of layers on a larger dataset (Million Song Dataset), and found that deeper models outperformed the 4-layer architecture. The experiments show that mel-spectrogram is an effective time-frequency representation for automatic tagging and that more complex models benefit from more training data.

أنظمة الصوت في الحاسوب التعلم الآلي

Text-based LSTM networks for Automatic Music Composition

53 - Keunwoo Choi , George Fazekas , Mark Sandler 2016

In this paper, we introduce new methods and discuss results of text-based LSTM (Long Short-Term Memory) networks for automatic music composition. The proposed network is designed to learn relationships within text documents that represent chord progr essions and drum tracks in two case studies. In the experiments, word-RNNs (Recurrent Neural Networks) show good results for both cases, while character-based RNNs (char-RNNs) only succeed to learn chord progressions. The proposed system can be used for fully automatic composition or as semi-automatic systems that help humans to compose music by controlling a diversity parameter of the model.

الذكاء الاصطناعي الوسائط المتعددة

Understanding Music Playlists

54 - Keunwoo Choi , George Fazekas , Mark Sandler 2015

As music streaming services dominate the music industry, the playlist is becoming an increasingly crucial element of music consumption. Con- sequently, the music recommendation problem is often casted as a playlist generation prob- lem. Better unders tanding of the playlist is there- fore necessary for developing better playlist gen- eration algorithms. In this work, we analyse two playlist datasets to investigate some com- monly assumed hypotheses about playlists. Our findings indicate that deeper understanding of playlists is needed to provide better prior infor- mation and improve machine learning algorithms in the design of recommendation systems.

الوسائط المتعددة استرجاع المعلومات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد