Generation of musical patterns through operads

Added by Samuele Giraudo

Publication date 2021

fields Informatics Engineering Electronic Engineering

Research language English

Authors Samuele Giraudo

Abstract

We introduce the notion of multi-pattern, a combinatorial abstraction of polyphonic musical phrases. The interest of this approach lies in the fact that it offers a way to compose two multi-patterns to produce a longer one. This places musical phrases in an algebraic context, since the set of multi-patterns has the structure of an operad, operads being structures that formalize the notion of operators and their compositions. Seeing musical phrases as operators allows us to perform computations on phrases and has applications in generative music: given a set of short patterns, we propose various algorithms to randomly generate a new and longer phrase inspired by the input patterns.
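To illustrate what composing patterns as operators can look like, here is a minimal sketch for a single voice. It rests on assumptions that the abstract does not state: a pattern is modeled as a list of scale degrees with `None` for rests, its arity is the number of non-rest beats, and composing at position `i` replaces the `i`-th non-rest beat with a second pattern transposed by that beat's degree. The names `Pattern`, `arity`, and `compose` are hypothetical and are not taken from the paper.

```python
from typing import List, Optional

# Assumed encoding: an int is a scale degree, None is a rest.
Pattern = List[Optional[int]]


def arity(p: Pattern) -> int:
    """Number of non-rest beats, i.e. the number of substitution slots."""
    return sum(1 for beat in p if beat is not None)


def compose(p: Pattern, i: int, q: Pattern) -> Pattern:
    """Replace the i-th (1-based) non-rest beat of p with q,
    transposing every degree of q by the degree being replaced."""
    seen = 0
    for pos, beat in enumerate(p):
        if beat is None:
            continue
        seen += 1
        if seen == i:
            shifted = [None if d is None else d + beat for d in q]
            return p[:pos] + shifted + p[pos + 1:]
    raise IndexError(f"pattern has arity {arity(p)}, cannot compose at {i}")


if __name__ == "__main__":
    short = [0, None, 2]
    motif = [0, 1, None]
    # The second note of `short` (degree 2) is replaced by `motif` shifted by 2.
    print(compose(short, 2, motif))
```

Under this assumed rule, the result has length `len(p) + len(q) - 1` and arity `arity(p) + arity(q) - 1`, which is the bookkeeping one expects from an operadic partial composition. A polyphonic multi-pattern could plausibly apply the same substitution to each voice in parallel, but that extension is also an assumption here.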

The music box operad: Random generation of musical phrases from patterns

79 - Samuele Giraudo 2021

We introduce the notion of multi-pattern, a combinatorial abstraction of polyphonic musical phrases. The interest of this approach to encoding musical phrases lies in the fact that it becomes possible to compose multi-patterns to produce new ones. This places the set of musical phrases in an algebraic framework, since the set of multi-patterns has the structure of an operad. Operads are algebraic structures that formalize the notion of operators and their compositions. Seeing musical phrases as operators allows us to perform computations on phrases and has applications in generative music. Indeed, given a set of short patterns, we propose various algorithms to randomly generate a new and longer phrase inspired by the input patterns.

Sound Audio and Speech Processing Combinatorics

Melody Generation for Pop Music via Word Representation of Musical Properties

106 - Andrew Shin , Leopold Crestel , Hiroharu Kato 2017

Automatic melody generation for pop music has been a long-time aspiration for both AI researchers and musicians. However, learning to generate euphonious melodies has turned out to be highly challenging due to a number of factors. Representing the multivariate properties of notes has been one of the primary challenges. It is also difficult to remain within the permissible spectrum of musical variety, outside of which the output would be perceived as plain random play without auditory pleasantness. Observing the conventional structure of pop music poses further challenges. In this paper, we propose to represent each note and its properties as a unique "word", thus lessening the prospect of misalignments between the properties, as well as reducing the complexity of learning. We also enforce regularization policies on the range of notes, thus encouraging the generated melody to stay close to what humans would find easy to follow. Furthermore, we generate melody conditioned on song part information, thus replicating the overall structure of a full song. Experimental results demonstrate that our model can generate auditorily pleasant songs that are harder to distinguish from human-written ones than those of previous models.

Sound Multimedia Audio and Speech Processing

Binaural Audio Generation via Multi-task Learning

93 - Sijia Li , Shiguang Liu , Dinesh Manocha 2021

We present a learning-based approach for generating binaural audio from mono audio using multi-task learning. Our formulation leverages additional information from two related tasks: the binaural audio generation task and the flipped audio classification task. Our learning model extracts spatialization features from the visual and audio input, predicts the left and right audio channels, and judges whether the left and right channels are flipped. First, we extract visual features using ResNet from the video frames. Next, we perform binaural audio generation and flipped audio classification using separate subnetworks based on visual features. Our learning method optimizes the overall loss based on the weighted sum of the losses of the two tasks. We train and evaluate our model on the FAIR-Play dataset and the YouTube-ASMR dataset. We perform quantitative and qualitative evaluations to demonstrate the benefits of our approach over prior techniques.

Sound Audio and Speech Processing

Joining Sound Event Detection and Localization Through Spatial Segregation

80 - Ivo Trowitzsch , Christopher Schymura , Dorothea Kolossa 2019

Identification and localization of sounds are both integral parts of computational auditory scene analysis. Although each can be solved separately, the goal of forming coherent auditory objects and achieving a comprehensive spatial scene understanding suggests pursuing a joint solution of the two problems. This work presents an approach that robustly binds localization with the detection of sound events in a binaural robotic system. Both tasks are joined through the use of spatial stream segregation which produces probabilistic time-frequency masks for individual sources attributable to separate locations, enabling segregated sound event detection operating on these streams. We use simulations of a comprehensive suite of test scenes with multiple co-occurring sound sources, and propose performance measures for systematic investigation of the impact of scene complexity on this segregated detection of sound types. Analyzing the effect of spatial scene arrangement, we show how a robot could facilitate high performance through optimal head rotation. Furthermore, we investigate the performance of segregated detection given possible localization error as well as error in the estimation of number of active sources. Our analysis demonstrates that the proposed approach is an effective method to obtain joint sound event location and type information under a wide range of conditions.

Sound Audio and Speech Processing

Demonstration of PerformanceNet: A Convolutional Neural Network Model for Score-to-Audio Music Generation

211 - Yu-Hua Chen , Bryan Wang , Yi-Hsuan Yang 2019

In this paper, we present PerformanceNet, a neural network model we proposed recently to achieve score-to-audio music generation. The model learns to convert a music piece from the symbolic domain to the audio domain, automatically assigning performance-level attributes such as changes in velocity to the music and then synthesizing the audio. The model is therefore not just a neural audio synthesizer, but an AI performer that learns to interpret a musical score in its own way. The code and sample outputs of the model can be found online at https://github.com/bwang514/PerformanceNet.

Sound Audio and Speech Processing
