
Timbre Space Representation of a Subtractive Synthesizer

Published by Cyrus Vahidi
Publication date: 2020
Research language: English





In this study, we produce a geometrically scaled perceptual timbre space from dissimilarity ratings of subtractive synthesized sounds and correlate the resulting dimensions with a set of acoustic descriptors. We curate a set of 15 sounds, produced by a synthesis model that uses varying source waveforms, frequency modulation (FM) and a lowpass filter with an enveloped cutoff frequency. Pairwise dissimilarity ratings were collected within an online browser-based experiment. We hypothesized that a varied waveform input source and enveloped filter would act as the main vehicles for timbral variation, providing novel acoustic correlates for the perception of synthesized timbres.
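As a rough illustration of how pairwise dissimilarity ratings can be turned into a low-dimensional timbre space, the sketch below applies classical (metric) multidimensional scaling and then correlates each resulting dimension with an acoustic descriptor. The abstract does not state which scaling variant or descriptors were used, so the function names, the choice of classical MDS, and the synthetic 15-item data are all assumptions made for illustration.

```python
import numpy as np


def classical_mds(dissim, n_dims=2):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix.

    Classical MDS is an assumption here; the study may have used another variant.
    """
    d = np.asarray(dissim, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ (d ** 2) @ centring
    # Keep the eigenvectors belonging to the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def dimension_correlations(coords, descriptor):
    """Pearson correlation between each embedding dimension and one descriptor."""
    descriptor = np.asarray(descriptor, dtype=float)
    return np.array(
        [np.corrcoef(coords[:, k], descriptor)[0, 1] for k in range(coords.shape[1])]
    )


if __name__ == "__main__":
    # Synthetic stand-in for mean ratings of 15 sounds (hypothetical data).
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(15, 2))
    dissim = np.linalg.norm(latent[:, None, :] - latent[None, :, :], axis=-1)

    coords = classical_mds(dissim, n_dims=2)
    # Hypothetical descriptor, e.g. a per-sound spectral centroid value.
    descriptor = latent[:, 0] + 0.1 * rng.normal(size=15)
    print(dimension_correlations(coords, descriptor))
```

With real data, the dissimilarity matrix would typically be the ratings averaged over participants, and the recovered axes are only defined up to rotation and reflection, which is why they are interpreted through descriptor correlations.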




Read also

Content and style representations have been widely studied in the field of style transfer. In this paper, we propose a new loss function for audio source separation that uses a speaker content representation, which we call speaker representation loss. The objective is to extract the target speaker's voice from the noisy input and also remove it from the residual components. Compared to conventional spectral reconstruction, our proposed framework maximizes the use of target speaker information by minimizing the distance between the speaker representations of the reference and the source separation output. We also propose a triplet speaker representation loss as an additional criterion to remove the target speaker information from the residual spectrogram output. The VoiceFilter framework is adopted to evaluate source separation performance on the VCTK database, and we achieve improved performance compared to the baseline loss function without any additional network parameters.
Cardiovascular diseases are the leading cause of death and severely threaten human health in daily life. On the one hand, there has been a dramatically increasing demand, from both clinical practice and smart home applications, for monitoring the heart status of subjects suffering from chronic cardiovascular diseases. On the other hand, experienced physicians who can perform efficient auscultation are still too few in number. Automatic heart sound classification leveraging advanced signal processing and machine learning technologies has shown encouraging results. Nevertheless, hand-crafted features are expensive and time-consuming to design. To this end, we propose a novel deep representation learning method with an attention mechanism for heart sound classification. In this paradigm, high-level representations are learnt automatically from the recorded heart sound data. In particular, a global attention pooling layer improves the performance of the learnt representations by estimating the contribution of each unit in the feature maps. The Heart Sounds Shenzhen (HSS) corpus (170 subjects) is used to validate the proposed method. Experimental results show that our approach can achieve an unweighted average recall of 51.2% for classifying three categories of heart sounds, i.e., normal, mild, and moderate/severe, as annotated by cardiologists with the help of echocardiography.
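For readers unfamiliar with the metric quoted above, unweighted average recall (UAR) is the mean of the per-class recalls, so each class counts equally regardless of how many samples it has. The sketch below is a minimal illustration on made-up labels; the function name and the toy data are assumptions and have nothing to do with the HSS corpus results.

```python
import numpy as np


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall, computed over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = []
    for label in np.unique(y_true):
        mask = y_true == label
        # Recall for this class: fraction of its samples predicted correctly.
        recalls.append(np.mean(y_pred[mask] == label))
    return float(np.mean(recalls))


if __name__ == "__main__":
    # Toy, imbalanced three-class example (hypothetical labels).
    y_true = [0, 0, 0, 0, 1, 1, 2]
    y_pred = [0, 0, 0, 1, 1, 0, 2]
    print(unweighted_average_recall(y_true, y_pred))
```

With three classes, a classifier that always predicts one class scores a UAR of about 33.3%, which is the chance-level reference point for a three-way UAR figure.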
Xuecong Sun, Han Jia, Zhe Zhang (2019)
Conventional approaches to sound localization and separation are based on microphone arrays in artificial systems. Inspired by the selective perception of the human auditory system, we design a multi-source listening system which can separate simultaneous overlapping sounds and localize the sound sources in three-dimensional space, using only a single microphone with a metamaterial enclosure. The enclosure modifies the frequency response of the microphone in a direction-dependent way by giving each direction a signature. Thus, the information about the location and audio content of sound sources can be experimentally reconstructed from the modulated mixed signals using a compressive sensing algorithm. Owing to the low computational complexity of the proposed reconstruction algorithm, the designed system can also be applied to source identification and tracking. The effectiveness of the system in multiple real scenarios has been demonstrated through multiple random listening tests. The proposed metamaterial-based single-sensor listening system opens a new way of sound localization and separation, which can be applied to intelligent scene monitoring and robot audition.
In this work, we address the problem of musical timbre transfer, where the goal is to manipulate the timbre of a sound sample from one instrument to match another instrument while preserving other musical content, such as pitch, rhythm, and loudness. In principle, one could apply image-based style transfer techniques to a time-frequency representation of an audio signal, but this depends on having a representation that allows independent manipulation of timbre as well as high-quality waveform generation. We introduce TimbreTron, a method for musical timbre transfer which applies image domain style transfer to a time-frequency representation of the audio signal, and then produces a high-quality waveform using a conditional WaveNet synthesizer. We show that the Constant Q Transform (CQT) representation is particularly well-suited to convolutional architectures due to its approximate pitch equivariance. Based on human perceptual evaluations, we confirmed that TimbreTron recognizably transferred the timbre while otherwise preserving the musical content, for both monophonic and polyphonic samples.
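The pitch-equivariance property mentioned above follows from the CQT's logarithmic frequency axis: transposing a note multiplies every partial by the same ratio, which moves every partial by the same number of bins. The sketch below shows only this idealized geometry; the helper name, the minimum frequency, and the bin resolution are assumptions, and a real CQT is only approximately equivariant because of finite resolution and edge effects.

```python
import math


def cqt_bin_position(freq_hz, fmin_hz=32.70, bins_per_octave=12):
    """Fractional bin index of a frequency on an idealized log-frequency axis.

    fmin_hz and bins_per_octave are hypothetical settings for illustration.
    """
    return bins_per_octave * math.log2(freq_hz / fmin_hz)


if __name__ == "__main__":
    fundamental = 220.0  # A3
    semitones_up = 7     # transpose by a perfect fifth
    ratio = 2.0 ** (semitones_up / 12.0)

    for harmonic in range(1, 5):
        original = cqt_bin_position(harmonic * fundamental)
        transposed = cqt_bin_position(harmonic * fundamental * ratio)
        # Every harmonic moves by the same number of bins.
        print(harmonic, round(transposed - original, 6))
```

On a linear-frequency axis such as the STFT, the same transposition would stretch the spacing between harmonics rather than translate the pattern, which is one reason a log-frequency representation pairs more naturally with convolutional layers.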
Sossio Vergara (2013)
This article introduces an effective generalization of the polar flavor of the Fourier Theorem based on a new method of analysis. Under the premises of the new theory, an ample class of functions becomes viable as bases, with the further advantage of using the same basis for analysis and reconstruction. In fact, other tools, like wavelets, admit specially built nonorthogonal bases but require different bases for analysis and reconstruction (biorthogonal and dual bases) and vectorial coordinates; this renders those systems unintuitive and computationally intensive. As an example of the advantages of the new generalization of the Fourier Theorem, this paper introduces a novel synthesis method based on frequency-phase series of square waves (the equivalent of the polar Fourier Theorem but for nonorthogonal bases). The resulting synthesizer is very efficient, needing only a few components, frugal in terms of computing needs, and viable for many applications.