Text-based LSTM networks for Automatic Music Composition

Published by: Keunwoo Choi

Publication date: 2016

Research field: Informatics Engineering

Paper language: English

Authors: Keunwoo Choi, George Fazekas, Mark Sandler

Artificial Intelligence, Multimedia

Abstract

In this paper, we introduce new methods and discuss results of text-based LSTM (Long Short-Term Memory) networks for automatic music composition. The proposed network is designed to learn relationships within text documents that represent chord progressions and drum tracks in two case studies. In the experiments, word-RNNs (Recurrent Neural Networks) show good results for both cases, while character-based RNNs (char-RNNs) succeed only in learning chord progressions. The proposed system can be used for fully automatic composition or as a semi-automatic system that helps humans compose music by controlling a diversity parameter of the model.
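The abstract's two main ideas, word-level versus character-level text input and a diversity parameter that controls generation, can be illustrated with a minimal sketch. It assumes that the diversity parameter behaves like a softmax temperature, which is common in text-generation code but is not confirmed by the abstract. The chord string, the probabilities, and the helper names `tokenize`, `rescale`, and `sample` are all hypothetical, and no trained network is involved.

```python
import math
import random


def tokenize(text, level):
    """Split a text-encoded progression into word-level or character-level tokens."""
    if level == "word":
        return text.split()
    if level == "char":
        return list(text)
    raise ValueError("level must be 'word' or 'char'")


def rescale(probs, diversity):
    """Reweight a probability distribution with a temperature-like diversity value.

    Assumption: diversity < 1 sharpens the distribution (safer choices),
    diversity > 1 flattens it (more surprising choices).
    """
    if diversity <= 0:
        raise ValueError("diversity must be positive")
    # A small epsilon avoids log(0) for zero-probability tokens.
    logits = [math.log(p + 1e-12) / diversity for p in probs]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample(tokens, probs, diversity, rng):
    """Draw one token from the rescaled distribution."""
    return rng.choices(tokens, weights=rescale(probs, diversity), k=1)[0]


# Hypothetical chord progression written as plain text.
progression = "C:maj A:min F:maj G:7"
word_tokens = tokenize(progression, "word")  # one token per chord
char_tokens = tokenize(progression, "char")  # one token per character

# Made-up next-token probabilities standing in for a model's output.
next_probs = [0.6, 0.2, 0.15, 0.05]

rng = random.Random(0)
low = rescale(next_probs, 0.5)   # concentrates mass on the likeliest chord
high = rescale(next_probs, 1.5)  # spreads mass across the alternatives
choice = sample(word_tokens, next_probs, 1.0, rng)
```

In this sketch a word-level model sees four tokens for the progression, while a character-level model must also learn how to spell each chord symbol from individual characters.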

124 - Gunjan Aggarwal, Devi Parikh 2021

Dance and music typically go hand in hand. The complexities in dance, music, and their synchronisation make them fascinating to study from a computational creativity perspective. While several works have looked at generating dance for a given music, automatically generating music for a given dance remains under-explored. This capability could have several creative expression and entertainment applications. We present some early explorations in this direction. We present a search-based offline approach that generates music after processing the entire dance video and an online approach that uses a deep neural network to generate music on-the-fly as the video proceeds. We compare these approaches to a strong heuristic baseline via human studies and present our findings. We have integrated our online approach in a live demo! A video of the demo can be found here: https://sites.google.com/view/dance2music/live-demo.

Computer Audio Systems, Multimedia, Audio and Speech Processing

AutoMATES: Automated Model Assembly from Text, Equations, and Software

270 - Adarsh Pyarelal, Marco A. Valenzuela-Escarcega, Rebecca Sharp 2020

Models of complicated systems can be represented in different ways: in scientific papers, they are represented using natural language text as well as equations. But to be of real use, they must also be implemented as software, thus making code a third form of representing models. We introduce the AutoMATES project, which aims to build semantically-rich unified representations of models from scientific code and publications to facilitate the integration of computational models from different domains and allow for modeling large, complicated systems that span multiple domains and levels of abstraction.

Artificial Intelligence, Multimedia, Software Engineering

Audio-Based Music Classification with DenseNet And Data Augmentation

129 - Wenhao Bian, Jie Wang, Bojin Zhuang 2019

In recent years, deep learning has received intense attention owing to its great success in image recognition. A trend toward adopting deep learning in various information processing fields has formed, including music information retrieval (MIR). In this paper, we conduct a comprehensive study on music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this is the first work to apply Densely Connected Convolutional Networks (DenseNet) to music audio tagging, which has been demonstrated to perform better than Residual neural network (ResNet). Additionally, two specific data augmentation approaches of time overlapping and pitch shifting have been proposed to address the deficiency of labelled data in the MIR. Moreover, an ensemble learning of stacking is employed based on SVM. We believe that the proposed combination of strong representation of DenseNet and data augmentation can be adapted to other audio processing tasks.

Audio and Speech Processing, Multimedia, Computer Audio Systems

MusiCoder: A Universal Music-Acoustic Encoder Based on Transformers

70 - Yilun Zhao, Jia Guo 2020

Music annotation has always been one of the critical topics in the field of Music Information Retrieval (MIR). Traditional models use supervised learning for music annotation tasks. However, as supervised machine learning approaches increase in complexity, the increasing need for more annotated training data can often not be matched with available data. In this paper, a new self-supervised music acoustic representation learning approach named MusiCoder is proposed. Inspired by the success of BERT, MusiCoder builds upon the architecture of self-attention bidirectional transformers. Two pre-training objectives, including Contiguous Frames Masking (CFM) and Contiguous Channels Masking (CCM), are designed to adapt BERT-like masked reconstruction pre-training to continuous acoustic frame domain. The performance of MusiCoder is evaluated in two downstream music annotation tasks. The results show that MusiCoder outperforms the state-of-the-art models in both music genre classification and auto-tagging tasks. The effectiveness of MusiCoder indicates a great potential of a new self-supervised learning approach to understand music: first apply masked reconstruction tasks to pre-train a transformer-based model with massive unlabeled music acoustic data, and then finetune the model on specific downstream tasks with labeled data.

Audio and Speech Processing, Multimedia

LSTM-Based Goal Recognition in Latent Space

47 - Leonardo Amado, João Paulo Aires, Ramon Fraga Pereira 2018

Approaches to goal recognition have progressively relaxed the requirements about the amount of domain knowledge and available observations, yielding accurate and efficient algorithms capable of recognizing goals. However, to recognize goals in raw data, recent approaches require either human engineered domain knowledge, or samples of behavior that account for almost all actions being observed to infer possible goals. This is clearly too strong a requirement for real-world applications of goal recognition, and we develop an approach that leverages advances in recurrent neural networks to perform goal recognition as a classification task, using encoded plan traces for training. We empirically evaluate our approach against the state-of-the-art in goal recognition with image-based domains, and discuss under which conditions our approach is superior to previous ones.

Artificial Intelligence