بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

DSP Based System for Real time Voice Synthesis Applications Development

333 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Radu Arsinte

تاريخ النشر 2008

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Radu Arsinte - Attila Ferencz - Costin Miron

أنظمة الصوت في الحاسوب

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This paper describes an experimental system designed for development of real time voice synthesis applications. The system is composed from a DSP coprocessor card, equipped with an TMS320C25 or TMS320C50 chip, voice acquisition module (ADDA2),host computer (IBM-PC compatible), software specific tools.

قيم البحث

181 - Siqi Zheng , Weilong Huang , Xianliang Wang 2021

In this paper we describe a speaker diarization system that enables localization and identification of all speakers present in a conversation or meeting. We propose a novel systematic approach to tackle several long-standing challenges in speaker dia rization tasks: (1) to segment and separate overlapping speech from two speakers; (2) to estimate the number of speakers when participants may enter or leave the conversation at any time; (3) to provide accurate speaker identification on short text-independent utterances; (4) to track down speakers movement during the conversation; (5) to detect speaker change incidence real-time. First, a differential directional microphone array-based approach is exploited to capture the target speakers voice in far-field adverse environment. Second, an online speaker-location joint clustering approach is proposed to keep track of speaker location. Third, an instant speaker number detector is developed to trigger the mechanism that separates overlapped speech. The results suggest that our system effectively incorporates spatial information and achieves significant gains.

أنظمة الصوت في الحاسوب التعلم الآلي معالجة الصوت والكلام

Time Alignment using Lip Images for Frame-based Electrolaryngeal Voice Conversion

86 - Yi-Syuan Liou , Wen-Chin Huang , Ming-Chi Yen 2021

Voice conversion (VC) is an effective approach to electrolaryngeal (EL) speech enhancement, a task that aims to improve the quality of the artificial voice from an electrolarynx device. In frame-based VC methods, time alignment needs to be performed prior to model training, and the dynamic time warping (DTW) algorithm is widely adopted to compute the best time alignment between each utterance pair. The validity is based on the assumption that the same phonemes of the speakers have similar features and can be mapped by measuring a pre-defined distance between speech frames of the source and the target. However, the special characteristics of the EL speech can break the assumption, resulting in a sub-optimal DTW alignment. In this work, we propose to use lip images for time alignment, as we assume that the lip movements of laryngectomee remain normal compared to healthy people. We investigate two naive lip representations and distance metrics, and experimental results demonstrate that the proposed method can significantly outperform the audio-only alignment in terms of objective and subjective evaluations.

أنظمة الصوت في الحاسوب الحساب واللغة الرؤية الحاسوبية وتمييز الأنماط

Real Time Fabric Defect Detection System on an Embedded DSP Platform

456 - J. L. Raheja , B. Ajay , Ankit Chaudhary 2014

In industrial fabric productions, automated real time systems are needed to find out the minor defects. It will save the cost by not transporting defected products and also would help in making compmay image of quality fabrics by sending out only und efected products. A real time fabric defect detection system (FDDS), implementd on an embedded DSP platform is presented here. Textural features of fabric image are extracted based on gray level co-occurrence matrix (GLCM). A sliding window technique is used for defect detection where window moves over the whole image computing a textural energy from the GLCM of the fabric image. The energy values are compared to a reference and the deviations beyond a threshold are reported as defects and also visually represented by a window. The implementation is carried out on a TI TMS320DM642 platform and programmed using code composer studio software. The real time output of this implementation was shown on a monitor.

الرؤية الحاسوبية وتمييز الأنماط

Real-time Timbre Transfer and Sound Synthesis using DDSP

363 - Francesco Ganis , Erik Frej Knudesn , S{o}ren V. K. Lyster 2021

Neural audio synthesis is an actively researched topic, having yielded a wide range of techniques that leverages machine learning architectures. Google Magenta elaborated a novel approach called Differential Digital Signal Processing (DDSP) that inco rporates deep neural networks with preconditioned digital signal processing techniques, reaching state-of-the-art results especially in timbre transfer applications. However, most of these techniques, including the DDSP, are generally not applicable in real-time constraints, making them ineligible in a musical workflow. In this paper, we present a real-time implementation of the DDSP library embedded in a virtual synthesizer as a plug-in that can be used in a Digital Audio Workstation. We focused on timbre transfer from learned representations of real instruments to arbitrary sound inputs as well as controlling these models by MIDI. Furthermore, we developed a GUI for intuitive high-level controls which can be used for post-processing and manipulating the parameters estimated by the neural network. We have conducted a user experience test with seven participants online. The results indicated that our users found the interface appealing, easy to understand, and worth exploring further. At the same time, we have identified issues in the timbre transfer quality, in some components we did not implement, and in installation and distribution of our plugin. The next iteration of our design will address these issues. Our real-time MATLAB and JUCE implementations are available at https://github.com/SMC704/juce-ddsp and https://github.com/SMC704/matlab-ddsp , respectively.

أنظمة الصوت في الحاسوب التعلم الآلي معالجة الصوت والكلام

Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking

212 - Hiroki Tamaru , Yuki Saito , Shinnosuke Takamichi 2019

This paper proposes a generative moment matching network (GMMN)-based post-filter that provides inter-utterance pitch variation for deep neural network (DNN)-based singing voice synthesis. The natural pitch variation of a human singing voice leads to a richer musical experience and is used in double-tracking, a recording method in which two performances of the same phrase are recorded and mixed to create a richer, layered sound. However, singing voices synthesized using conventional DNN-based methods never vary because the synthesis process is deterministic and only one waveform is synthesized from one musical score. To address this problem, we use a GMMN to model the variation of the modulation spectrum of the pitch contour of natural singing voices and add a randomized inter-utterance variation to the pitch contour generated by conventional DNN-based singing voice synthesis. Experimental evaluations suggest that 1) our approach can provide perceptible inter-utterance pitch variation while preserving speech quality. We extend our approach to double-tracking, and the evaluation demonstrates that 2) GMMN-based neural double-tracking is perceptually closer to natural double-tracking than conventional signal processing-based artificial double-tracking is.

أنظمة الصوت في الحاسوب الذكاء الاصطناعي التعلم الآلي

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الأندلس للعلوم الطبية

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

DSP Based System for Real time Voice Synthesis Applications Development

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً