Representation Learning of Music Using Artist, Album, and Track Information

Published by Jongpil Lee

Publication date: 2019

Research field: Informatics Engineering

Paper language: English

Authors: Jongpil Lee - Jiyoung Park - Juhan Nam

Information Retrieval, Multimedia, Computer Audio Systems

Abstract

Supervised music representation learning has mainly relied on semantic labels such as music genres. However, annotating music with semantic labels is time-consuming and costly. In this work, we investigate the use of factual metadata such as artist, album, and track information, which is naturally attached to songs, for supervised music representation learning. The results show that each type of metadata captures distinct conceptual characteristics, and that using them jointly improves overall performance.
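As a rough illustration of why factual metadata comes "for free" as supervision, the sketch below turns catalogue rows into three sets of integer class targets (artist, album, track) that a classifier could be trained on. The catalogue rows, the function names `build_vocab` and `encode_targets`, and the choice to scope albums by artist are all assumptions made for this example; the abstract does not describe the authors' actual pipeline.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

# Hypothetical catalogue rows: (artist, album, track title).
CATALOGUE: List[Tuple[str, str, str]] = [
    ("Artist A", "Album 1", "Song x"),
    ("Artist A", "Album 1", "Song y"),
    ("Artist A", "Album 2", "Song z"),
    ("Artist B", "Album 3", "Song x"),
]


def build_vocab(keys: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Map each distinct key to an integer class id, in first-seen order."""
    vocab: Dict[Hashable, int] = {}
    for key in keys:
        vocab.setdefault(key, len(vocab))
    return vocab


def encode_targets(rows: List[Tuple[str, str, str]]):
    """Return one (artist_id, album_id, track_id) target per row,
    plus the number of classes for each label type."""
    artist_vocab = build_vocab(artist for artist, _, _ in rows)
    # Albums and tracks are keyed together with their parents so that
    # identically named albums or songs by different artists stay distinct.
    album_vocab = build_vocab((artist, album) for artist, album, _ in rows)
    track_vocab = build_vocab(rows)
    targets = [
        (artist_vocab[a], album_vocab[(a, b)], track_vocab[(a, b, t)])
        for a, b, t in rows
    ]
    sizes = {
        "artist": len(artist_vocab),
        "album": len(album_vocab),
        "track": len(track_vocab),
    }
    return targets, sizes


TARGETS, NUM_CLASSES = encode_targets(CATALOGUE)
```

In a setup like this, no human annotation step is needed: every label is read directly from the catalogue, and the three target columns can be used separately or together.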

116 - Rafael Valle 2018

This paper describes computational methods for the visual display and analysis of music information. We provide a concise description of software, music descriptors and data visualization techniques commonly used in music information retrieval. Finally, we provide use cases where the described software, descriptors and visualizations are showcased.

Information Retrieval, Multimedia

Tag2Risk: Harnessing Social Music Tags for Characterizing Depression Risk

57 - Aayush Surana, Yash Goyal, Manish Shrivastava 2020

Musical preferences have been considered a mirror of the self. In this age of Big Data, online music streaming services allow us to capture ecologically valid music listening behavior and provide a rich source of information to identify several user-specific aspects. Studies have shown musical engagement to be an indirect representation of internal states including internalized symptomatology and depression. The current study aims at unearthing patterns and trends in the individuals at risk for depression as it manifests in naturally occurring music listening behavior. Mental well-being scores, musical engagement measures, and listening histories of Last.fm users (N=541) were acquired. Social tags associated with each listener's most popular tracks were analyzed to unearth the mood/emotions and genres associated with the users. Results revealed that social tags prevalent in the users at risk for depression were predominantly related to emotions depicting Sadness associated with genre tags representing neo-psychedelic-, avant garde-, dream-pop. This study will open up avenues for an MIR-based approach to characterizing and predicting risk for depression which can be helpful in early detection and additionally provide bases for designing music recommendations accordingly.

Information Retrieval, Multimedia, Computer Audio Systems

Dual-track Music Generation using Deep Learning

210 - Sudi Lyu, Anxiang Zhang, Rong Song 2020

Music generation is always interesting in the sense that there is no formalized recipe. In this work, we propose a novel dual-track architecture for generating classical piano music, which is able to model the inter-dependency of left-hand and right-hand piano music. In particular, we experimented with many different neural network models as well as different representations of music, and the results show that our proposed model outperforms all other tested methods. In addition, we deployed some special policies for model training and generation, which contributed remarkably to model performance. Finally, under two evaluation methods, we compared our models with the MuseGAN project and true music.

Computer Audio Systems, Machine Learning

Towards Deep Modeling of Music Semantics using EEG Regularizers

316 - Francisco Raposo, David Martins de Matos, Ricardo Ribeiro 2017

Modeling of music audio semantics has been previously tackled through learning of mappings from audio data to high-level tags or latent unsupervised spaces. The resulting semantic spaces are theoretically limited, either because the chosen high-level tags do not cover all of music semantics or because audio data itself is not enough to determine music semantics. In this paper, we propose a generic framework for semantics modeling that focuses on the perception of the listener, through EEG data, in addition to audio data. We implement this framework using a novel end-to-end 2-view Neural Network (NN) architecture and a Deep Canonical Correlation Analysis (DCCA) loss function that forces the semantic embedding spaces of both views to be maximally correlated. We also detail how the EEG dataset was collected and use it to train our proposed model. We evaluate the learned semantic space in a transfer learning context, by using it as an audio feature extractor in an independent dataset and proxy task: music audio-lyrics cross-modal retrieval. We show that our embedding model outperforms Spotify features and performs comparably to a state-of-the-art embedding model that was trained on 700 times more data. We further discuss improvements to the model that are likely to improve its performance.
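To make the idea of "maximally correlated" views concrete, here is a deliberately simplified sketch: the negative Pearson correlation between two one-dimensional projections, which is the quantity a correlation-based objective would minimize in the one-dimensional case. It is not the DCCA loss used in the paper, which operates on multi-dimensional network outputs; the function names and the toy values `AUDIO`, `EEG_ALIGNED`, and `EEG_OPPOSED` are assumptions for illustration only.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_loss(audio_proj: Sequence[float],
                     eeg_proj: Sequence[float]) -> float:
    """Negative correlation: lower means the two views agree more."""
    return -pearson(audio_proj, eeg_proj)


# Toy one-dimensional projections of four hypothetical examples.
AUDIO = [0.1, 0.4, 0.35, 0.8]
EEG_ALIGNED = [2 * a + 1 for a in AUDIO]   # linearly related to AUDIO
EEG_OPPOSED = [-a for a in AUDIO]          # perfectly anti-correlated
```

With these toy values, the aligned pair gives a loss close to -1 (the minimum) and the opposed pair gives a loss close to +1, so minimizing the loss pushes the two views toward agreement.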

Information Retrieval, Machine Learning, Computer Audio Systems

Towards Playlist Generation Algorithms Using RNNs Trained on Within-Track Transitions

77 - Keunwoo Choi , , George Fazekas 2016

We introduce a novel playlist generation algorithm that focuses on the quality of transitions using a recurrent neural network (RNN). The proposed model assumes that optimal transitions between tracks can be modelled and predicted by internal transitions within music tracks. We introduce modelling sequences of high-level music descriptors using RNNs and discuss an experiment involving different similarity functions, where the sequences are provided by a musical structural analysis algorithm. Qualitative observations show that the proposed approach can effectively model transitions of music tracks in playlists.

Artificial Intelligence, Multimedia, Computer Audio Systems