No Arabic abstract is available.
We address the issue of editing musical performance data, in particular MIDI files representing human musical performances. Editing such sequences raises specific issues due to the ambiguous nature of musical objects. The first source of ambiguity is that musicians naturally produce many deviations from the metrical frame. These deviations may be intentional or subconscious, but they play an important role in conveying the groove or feeling of a performance. Relations between musical elements are also usually implicit, creating even more ambiguity. A note can relate to the surrounding notes in many ways: it can be part of a melodic pattern, play a harmonic role with simultaneous notes, or act as a pedal tone. All these aspects play an essential role that should be preserved, as much as possible, when editing musical sequences. In this paper, we contribute specifically to the problem of editing non-quantized, metrical musical sequences represented as MIDI files. We first list a number of problems caused by naive edit operations applied to performance data, using a motivating example. We then introduce a model, called Dancing MIDI, based on 1) two desirable, well-defined properties for edit operations and 2) two well-defined operations, Split and Concat, with an implementation. We show that our model formally satisfies the two properties, and that it prevents most of the problems that occur with naive edit operations on our motivating example, as well as on a real-world example using an automatic harmonizer.
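The kind of problem that naive edit operations cause on non-quantized data can be illustrated with a minimal Python sketch. This is not the Dancing MIDI implementation, and it does not reproduce the paper's Split and Concat or its two properties, which the abstract does not spell out. The `Note` class, the functions `naive_split` and `naive_concat`, the resolution of 480 ticks per beat, and all note values are assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    pitch: int  # MIDI pitch number (0-127)
    start: int  # onset, in ticks
    end: int    # offset, in ticks


def naive_split(notes, t):
    """Cut a sequence at tick t, truncating every note that crosses t."""
    left, right = [], []
    for n in notes:
        if n.end <= t:
            left.append(n)
        elif n.start >= t:
            right.append(Note(n.pitch, n.start - t, n.end - t))
        else:
            # The note crosses the cut: it is chopped into two pieces.
            left.append(Note(n.pitch, n.start, t))
            right.append(Note(n.pitch, 0, n.end - t))
    return left, right


def naive_concat(left, right, offset):
    """Append `right` after `left`, shifting its notes by `offset` ticks."""
    shifted = [Note(n.pitch, n.start + offset, n.end + offset) for n in right]
    return left + shifted


# Hypothetical performance at an assumed 480 ticks per beat.
# The bar line falls at tick 1920.
performance = [
    Note(60, 10, 450),     # melody note, played slightly late
    Note(48, 0, 3840),     # pedal tone held across the bar line
    Note(64, 1905, 2340),  # melody note played 15 ticks early
]

left, right = naive_split(performance, 1920)
rebuilt = naive_concat(left, right, 1920)

for note in rebuilt:
    print(note)
```

In this sketch, splitting at the bar line and concatenating the two halves again does not give back the original sequence: the three notes become five. The pedal tone is re-attacked at the bar line, and the early melody note leaves a 15-tick fragment at the end of the first half. These are the sort of artifacts that an edit model for performance data has to avoid.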
Recently, end-to-end (E2E) speech recognition has become popular, since it can integrate the acoustic, pronunciation and language models into a single neural network, which outperforms conventional models. Among E2E approaches, attention-based models
Music classification is a task to classify a music piece into labels such as genres or composers. We propose large-scale MIDI based composer classification systems using GiantMIDI-Piano, a transcription-based dataset. We propose to use piano rolls, o
We introduce the Expanded Groove MIDI dataset (E-GMD), an automatic drum transcription (ADT) dataset that contains 444 hours of audio from 43 drum kits, making it an order of magnitude larger than similar datasets, and the first with human-performed
Sounds are essential to how humans perceive and interact with the world and are captured in recordings and shared on the Internet on a minute-by-minute basis. These recordings, which are predominantly videos, constitute the largest archive of sounds
We introduce the notion of multi-pattern, a combinatorial abstraction of polyphonic musical phrases. The interest of this approach lies in the fact that this offers a way to compose two multi-patterns in order to produce a longer one. This dives musi