Composing Music with Grammar Argumented Neural Networks and Note-Level Encoding


Published by: Xiao Zhang

Publication date: 2016

Research field: Informatics Engineering

Paper language: English

Authors: Zheng Sun, Jiaqi Liu, Zewang Zhang

Machine Learning, Artificial Intelligence, Computer Sound Systems


Abstract

Creating aesthetically pleasing pieces of art, including music, has been a long-term goal for artificial intelligence research. Despite recent successes of long short-term memory (LSTM) recurrent neural networks (RNNs) in sequential learning, LSTM neural networks have not, by themselves, been able to generate natural-sounding music conforming to music theory. To transcend this inadequacy, we put forward a novel method for music composition that combines the LSTM with grammars motivated by music theory. The main tenets of music theory are encoded as grammar argumented (GA) filters on the training data, such that the machine can be trained to generate music inheriting the naturalness of human-composed pieces from the original dataset while adhering to the rules of music theory. Unlike previous approaches, pitches and durations are encoded as one semantic entity, which we refer to as note-level encoding. This allows easy implementation of music theory grammars, as well as closer emulation of the thinking pattern of a musician. Although the GA rules are applied to the training data and never directly to the LSTM music generation, our machine still composes music that possesses high incidences of diatonic scale notes, small pitch intervals and chords, in deference to music theory.
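The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it names: note-level encoding (pitch and duration fused into one token) and grammar filters applied to the training data before any model sees it. Everything specific here is an assumption made for illustration, not the authors' method: the `pitch_duration` token format, the C major scale, the function names, and the thresholds `min_diatonic` and `max_leap`.

```python
from typing import List, Tuple

# One note = (MIDI pitch number, duration in beats). Hypothetical representation.
Note = Tuple[int, float]

# Pitch classes of C major; the key is an assumption for this illustration.
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}


def encode(notes: List[Note]) -> List[str]:
    """Note-level encoding: fuse pitch and duration into a single token."""
    return [f"{pitch}_{duration}" for pitch, duration in notes]


def diatonic_ratio(notes: List[Note]) -> float:
    """Fraction of notes whose pitch class belongs to the assumed scale."""
    return sum(pitch % 12 in C_MAJOR for pitch, _ in notes) / len(notes)


def max_interval(notes: List[Note]) -> int:
    """Largest jump, in semitones, between consecutive notes."""
    pitches = [pitch for pitch, _ in notes]
    return max((abs(b - a) for a, b in zip(pitches, pitches[1:])), default=0)


def passes_filter(notes: List[Note],
                  min_diatonic: float = 0.9,
                  max_leap: int = 12) -> bool:
    """Hypothetical grammar filter: mostly diatonic notes, no large leaps."""
    return (bool(notes)
            and diatonic_ratio(notes) >= min_diatonic
            and max_interval(notes) <= max_leap)


melody = [(60, 1.0), (62, 0.5), (64, 0.5), (65, 1.0)]     # C D E F
chromatic = [(60, 1.0), (61, 1.0), (63, 1.0), (66, 1.0)]  # mostly off-scale

# Only sequences that pass the filter are encoded and kept for training.
training_set = [encode(m) for m in (melody, chromatic) if passes_filter(m)]
print(training_set)
```

In this sketch the filter touches only the training data; a sequence model trained on `training_set` would never see the rules themselves, which mirrors the abstract's statement that the GA rules are never applied directly to generation.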


Anwesh Bhattacharya, Marios Mattheakis, Pavlos Protopapas, 2021

In certain situations, Neural Networks (NN) are trained upon data that obey underlying physical symmetries. However, it is not guaranteed that NNs will obey the underlying symmetry unless embedded in the network structure. In this work, we explore a special kind of symmetry where functions are invariant with respect to involutory linear/affine transformations up to parity $p=\pm 1$. We develop mathematical theorems and propose NN architectures that ensure invariance and universal approximation properties. Numerical experiments indicate that the proposed models outperform baseline networks while respecting the imposed symmetry. An adaption of our technique to convolutional NN classification tasks for datasets with inherent horizontal/vertical reflection symmetry has also been proposed.

Machine Learning, Artificial Intelligence, Neural and Evolutionary Computing

From Note-Level to Chord-Level Neural Network Models for Voice Separation in Symbolic Music

Patrick Gray, Razvan Bunescu, 2020

Music is often experienced as a progression of concurrent streams of notes, or voices. The degree to which this happens depends on the position along a voice-leading continuum, ranging from monophonic, to homophonic, to polyphonic, which complicates the design of automatic voice separation models. We address this continuum by defining voice separation as the task of decomposing music into streams that exhibit both a high degree of external perceptual separation from the other streams and a high degree of internal perceptual consistency. The proposed voice separation task allows for a voice to diverge to multiple voices and also for multiple voices to converge to the same voice. Equipped with this flexible task definition, we manually annotated a corpus of popular music and used it to train neural networks that assign notes to voices either separately for each note in a chord (note-level), or jointly to all notes in a chord (chord-level). The trained neural models greedily assign notes to voices in a left to right traversal of the input chord sequence, using a diverse set of perceptually informed input features. When evaluated on the extraction of consecutive within voice note pairs, both models surpass a strong baseline based on an iterative application of an envelope extraction function, with the chord-level model consistently edging out the note-level model. The two models are also shown to outperform previous approaches on separating the voices in Bach music.

Computer Sound Systems, Artificial Intelligence, Machine Learning

Explaining Deep Convolutional Neural Networks on Music Classification

Keunwoo Choi, George Fazekas, Mark Sandler, 2016

Deep convolutional neural networks (CNNs) have been actively adopted in the field of music information retrieval, e.g. genre classification, mood detection, and chord recognition. However, the process of learning and prediction is little understood, particularly when it is applied to spectrograms. We introduce auralisation of a CNN to understand its underlying mechanism, which is based on a deconvolution procedure introduced in [2]. Auralisation of a CNN is converting the learned convolutional features that are obtained from deconvolution into audio signals. In the experiments and discussions, we explain trained features of a 5-layer CNN based on the deconvolved spectrograms and auralised signals. The pairwise correlations per layers with varying different musical attributes are also investigated to understand the evolution of the learnt features. It is shown that in the deep layers, the features are learnt to capture textures, the patterns of continuous distributions, rather than shapes of lines.

Machine Learning, Artificial Intelligence, Multimedia

Matrix Encoding Networks for Neural Combinatorial Optimization

Yeong-Dae Kwon, Jinho Choo, Iljoo Yoon, 2021

Machine Learning (ML) can help solve combinatorial optimization (CO) problems better. A popular approach is to use a neural net to compute on the parameters of a given CO problem and extract useful information that guides the search for good solutions. Many CO problems of practical importance can be specified in a matrix form of parameters quantifying the relationship between two groups of items. There is currently no neural net model, however, that takes in such matrix-style relationship data as an input. Consequently, these types of CO problems have been out of reach for ML engineers. In this paper, we introduce Matrix Encoding Network (MatNet) and show how conveniently it takes in and processes parameters of such complex CO problems. Using an end-to-end model based on MatNet, we solve asymmetric traveling salesman (ATSP) and flexible flow shop (FFSP) problems as the earliest neural approach. In particular, for a class of FFSP we have tested MatNet on, we demonstrate a far superior empirical performance to any methods (neural or not) known to date.

Machine Learning

Graph Coarsening with Neural Networks

Chen Cai, Dingkang Wang, Yusu Wang, 2021

As large-scale graphs become increasingly more prevalent, it poses significant computational challenges to process, extract and analyze large graph data. Graph coarsening is one popular technique to reduce the size of a graph while maintaining essential properties. Despite rich graph coarsening literature, there is only limited exploration of data-driven methods in the field. In this work, we leverage the recent progress of deep learning on graphs for graph coarsening. We first propose a framework for measuring the quality of coarsening algorithm and show that depending on the goal, we need to carefully choose the Laplace operator on the coarse graph and associated projection/lift operators. Motivated by the observation that the current choice of edge weight for the coarse graph may be sub-optimal, we parametrize the weight assignment map with graph neural networks and train it to improve the coarsening quality in an unsupervised way. Through extensive experiments on both synthetic and real networks, we demonstrate that our method significantly improves common graph coarsening methods under various metrics, reduction ratios, graph sizes, and graph types. It generalizes to graphs of larger size ($25\times$ of training graphs), is adaptive to different losses (differentiable and non-differentiable), and scales to much larger graphs than previous work.

Machine Learning, Artificial Intelligence