Following the success of neural style transfer in the visual arts, there has recently been rising interest in music style transfer. However, music style is not yet a well-defined concept from a scientific point of view. The difficulty lies in the intrinsically multi-level and multi-modal character of music representation, which differs greatly from image representation. As a result, depending on their interpretation of music style, current studies categorized as music style transfer are actually solving completely different problems that belong to a variety of sub-fields of computer music. Moreover, a vanilla end-to-end approach, which aims to handle all levels of music representation at once by directly adopting the methods of image style transfer, leads to poor results. We therefore propose a more scientifically viable definition of music style transfer by breaking it down into the precise concepts of timbre style transfer, performance style transfer, and composition style transfer, and connect these aspects of music style transfer with existing well-established sub-fields of computer music. In addition, we discuss the current limitations of music style modeling and its future directions, drawing inspiration from deep generative models, especially those using unsupervised learning and disentanglement techniques.
In recent years, music source separation has been one of the most intensively studied research areas in music information retrieval. Advances in deep learning have led to substantial progress in music source separation performance. However, most of the pre
In this paper, we propose a simple yet effective method for multiple music source separation using convolutional neural networks. The stacked hourglass network, originally designed for human pose estimation in natural images, is applied to a mu
We present in this paper PerformanceNet, a neural network model we recently proposed to achieve score-to-audio music generation. The model learns to convert a music piece from the symbolic domain to the audio domain, assigning performance-level attr
In this paper, we adapt triplet neural networks (TNNs) to a regression task, music emotion prediction. Since TNNs were initially introduced for classification rather than regression, we propose a mechanism that allows them to provide meaningful low
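Triplet networks are trained with a triplet loss over (anchor, positive, negative) embedding tuples; adapting them to regression requires a rule for choosing positives and negatives from continuous targets. A minimal sketch, assuming a standard hinge-based triplet margin loss and an illustrative target-distance selection rule (the function names and the selection rule here are ours, not the paper's mechanism):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge-based triplet margin loss on embedding vectors:
    pull the anchor toward the positive and push it away from the
    negative by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

def select_triplet_by_target(embeddings, targets, anchor_idx):
    """Illustrative regression adaptation: pick the positive as the
    sample whose continuous target (e.g. emotion value) is closest
    to the anchor's, and the negative as the farthest one."""
    dists = np.abs(targets - targets[anchor_idx])
    dists[anchor_idx] = np.inf            # exclude the anchor itself
    pos_idx = int(np.argmin(dists))
    dists[anchor_idx] = -np.inf
    neg_idx = int(np.argmax(dists))
    return embeddings[pos_idx], embeddings[neg_idx]
```

With such a selection rule, samples with similar target values are drawn together in the embedding space, so distances in the learned low-dimensional space reflect distances in the regression target.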
Lyrics alignment in long music recordings can exhaust memory when performed in a single pass. In this study, we present a novel method that performs audio-to-lyrics alignment with a low memory footprint regardless of the duration of