مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

An Augmented Lagrangian Method for Piano Transcription using Equal Loudness Thresholding and LSTM-based Decoding

209 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Sebastian Ewert

تاريخ النشر 2017

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Sebastian Ewert - Mark B. Sandler

أنظمة الصوت في الحاسوب الحوسبة العصبية والتطورية

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

A central goal in automatic music transcription is to detect individual note events in music recordings. An important variant is instrument-dependent music transcription where methods can use calibration data for the instruments in use. However, despite the additional information, results rarely exceed an f-measure of 80%. As a potential explanation, the transcription problem can be shown to be badly conditioned and thus relies on appropriate regularization. A recently proposed method employs a mixture of simple, convex regularizers (to stabilize the parameter estimation process) and more complex terms (to encourage more meaningful structure). In this paper, we present two extensions to this method. First, we integrate a computational loudness model to better differentiate real from spurious note detections. Second, we employ (Bidirectional) Long Short Term Memory networks to re-weight the likelihood of detected note constellations. Despite their simplicity, our two extensions lead to a drop of about 35% in note error rate compared to the state-of-the-art.

قيم البحث

154 - Sebastian Ewert , Mark Sandler 2016

Given a musical audio recording, the goal of automatic music transcription is to determine a score-like representation of the piece underlying the recording. Despite significant interest within the research community, several studies have reported on a glass ceiling effect, an apparent limit on the transcription accuracy that current methods seem incapable of overcoming. In this paper, we explore how much this effect can be mitigated by focusing on a specific instrument class and making use of additional information on the recording conditions available in studio or home recording scenarios. In particular, exploiting the availability of single note recordings for the instrument in use we develop a novel signal model employing variable-length spectro-temporal patterns as its central building blocks - tailored for pitched percussive instruments such as the piano. Temporal dependencies between spectral templates are modeled, resembling characteristics of factorial scaled hidden Markov models (FS-HMM) and other methods combining Non-Negative Matrix Factorization with Markov processes. In contrast to FS-HMMs, our parameter estimation is developed in a global, relaxed form within the extensible alternating direction method of multipliers (ADMM) framework, which enables the systematic combination of basic regularizers propagating sparsity and local stationarity in note activity with more complex regularizers imposing temporal semantics. The proposed method achieves an f-measure of 93-95% for note onsets on pieces recorded on a Yamaha Disklavier (MAPS DB).

أنظمة الصوت في الحاسوب التحليل العددي

Sequence-to-Sequence Piano Transcription with Transformers

92 - Curtis Hawthorne , Ian Simon , Rigel Swavely 2021

Automatic Music Transcription has seen significant progress in recent years by training custom deep neural networks on large datasets. However, these models have required extensive domain-specific design of network architectures, input/output represe ntations, and complex decoding schemes. In this work, we show that equivalent performance can be achieved using a generic encoder-decoder Transformer with standard decoding methods. We demonstrate that the model can learn to translate spectrogram inputs directly to MIDI-like output events for several transcription tasks. This sequence-to-sequence approach simplifies transcription by jointly modeling audio features and language-like output dependencies, thus removing the need for task-specific architectures. These results point toward possibilities for creating new Music Information Retrieval models by focusing on dataset creation and labeling rather than custom model design.

أنظمة الصوت في الحاسوب التعلم الآلي معالجة الصوت والكلام

An Investigation on Semismooth Newton based Augmented Lagrangian Method for Image Restoration

306 - Hongpeng Sun 2019

Augmented Lagrangian method (also called as method of multipliers) is an important and powerful optimization method for lots of smooth or nonsmooth variational problems in modern signal processing, imaging, optimal control and so on. However, one usu ally needs to solve the coupled and nonlinear system together and simultaneously, which is very challenging. In this paper, we proposed several semismooth Newton methods to solve the nonlinear subproblems arising in image restoration, which leads to several highly efficient and competitive algorithms for imaging processing. With the analysis of the metric subregularities of the corresponding functions, we give both the global convergence and local linear convergence rate for the proposed augmented Lagrangian methods with semismooth Newton solvers.

التحسين والتحكم

An Efficient Augmented Lagrangian Method for Support Vector Machine

79 - Yinqiao Yan , Qingna Li 2019

Support vector machine (SVM) has proved to be a successful approach for machine learning. Two typical SVM models are the L1-loss model for support vector classification (SVC) and $epsilon$-L1-loss model for support vector regression (SVR). Due to the nonsmoothness of the L1-loss function in the two models, most of the traditional approaches focus on solving the dual problem. In this paper, we propose an augmented Lagrangian method for the L1-loss model, which is designed to solve the primal problem. By tackling the nonsmooth term in the model with Moreau-Yosida regularization and the proximal operator, the subproblem in augmented Lagrangian method reduces to a nonsmooth linear system, which can be solved via the quadratically convergent semismooth Newtons method. Moreover, the high computational cost in semismooth Newtons method can be significantly reduced by exploring the sparse structure in the generalized Jacobian. Numerical results on various datasets in LIBLINEAR show that the proposed method is competitive with the most popular solvers in both speed and accuracy.

التحسين والتحكم

Solving the OSCAR and SLOPE Models Using a Semismooth Newton-Based Augmented Lagrangian Method

357 - Ziyan Luo , Defeng Sun , Kim-Chuan Toh 2018

The octagonal shrinkage and clustering algorithm for regression (OSCAR), equipped with the $ell_1$-norm and a pair-wise $ell_{infty}$-norm regularizer, is a useful tool for feature selection and grouping in high-dimensional data analysis. The computa tional challenge posed by OSCAR, for high dimensional and/or large sample size data, has not yet been well resolved due to the non-smoothness and inseparability of the regularizer involved. In this paper, we successfully resolve this numerical challenge by proposing a sparse semismooth Newton-based augmented Lagrangian method to solve the more general SLOPE (the sorted L-one penalized estimation) model. By appropriately exploiting the inherent sparse and low-rank property of the generalized Jacobian of the semismooth Newton system in the augmented Lagrangian subproblem, we show how the computational complexity can be substantially reduced. Our algorithm presents a notable advantage in the high-dimensional statistical regression settings. Numerical experiments are conducted on real data sets, and the results demonstrate that our algorithm is far superior, in both speed and robustness, than the existing state-of-the-art algorithms based on first-order iterative schemes, including the widely used accelerated proximal gradient (APG) method and the alternating direction method of multipliers (ADMM).

التحسين والتحكم

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة المأمون الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

An Augmented Lagrangian Method for Piano Transcription using Equal Loudness Thresholding and LSTM-based Decoding

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً