Spectral modification for recognition of children's speech under mismatched conditions

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

In this paper, we propose spectral modification by sharpening formants and by reducing the spectral tilt to recognize children's speech with automatic speech recognition (ASR) systems developed using adult speech. In this type of mismatched condition, ASR performance is degraded by the acoustic and linguistic mismatch in attributes between child and adult speakers. The proposed method improves speech intelligibility in order to enhance children's speech recognition with an acoustic model trained on adult speech. In the experiments, WSJCAM0 and PFSTAR are used as databases for adults' and children's speech, respectively. The proposed technique gives a significant improvement in the context of DNN-HMM-based ASR. Furthermore, we validate the robustness of the technique by showing that it also performs well in mismatched noise conditions.
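As a rough illustration of what "reducing the spectral tilt" can mean in practice, the sketch below applies a first-order pre-emphasis filter, a standard way to boost high frequencies relative to low ones. This is an assumption chosen for illustration, not the method described in the paper, and it does not cover formant sharpening. The function names, the coefficient 0.97, and the toy two-tone signal are all hypothetical.

```python
import numpy as np


def pre_emphasis(signal, coeff=0.97):
    """First-order high-pass filter: y[n] = x[n] - coeff * x[n-1].

    Illustrative stand-in for spectral tilt reduction; the coefficient
    is a conventional default, not a value taken from the paper.
    """
    x = np.asarray(signal, dtype=float)
    if x.size == 0:
        return x
    y = np.empty_like(x)
    y[0] = x[0]
    y[1:] = x[1:] - coeff * x[:-1]
    return y


def band_energy_ratio(signal, sample_rate, split_hz):
    """Ratio of spectral energy at or above split_hz to energy below it."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return power[freqs >= split_hz].sum() / power[freqs < split_hz].sum()


# Toy signal (hypothetical): a strong 200 Hz tone plus a weak 3000 Hz tone,
# mimicking a spectrum that falls off steeply with frequency.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
toy = np.sin(2 * np.pi * 200 * t) + 0.1 * np.sin(2 * np.pi * 3000 * t)

flattened = pre_emphasis(toy)

print("high/low energy ratio before:", band_energy_ratio(toy, sample_rate, 1000))
print("high/low energy ratio after: ", band_energy_ratio(flattened, sample_rate, 1000))
```

The ratio of high-band to low-band energy rises after filtering, which is the sense in which the spectrum becomes less tilted.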

References used

https://aclanthology.org/

Language Clustering for Multilingual Named Entity Recognition

Association for Computational Linguistics, 2021, article

Recent work in multilingual natural language processing has shown progress in various tasks such as natural language inference and joint multilingual translation. Despite success in learning across many languages, challenges arise where multilingual training regimes often boost performance on some languages at the expense of others. For multilingual named entity recognition (NER) we propose a simple technique that groups similar languages together by using embeddings from a pre-trained masked language model, and automatically discovering language clusters in this embedding space. Specifically, we fine-tune an XLM-Roberta model on a language identification task, and use embeddings from this model for clustering. We conduct experiments on 15 diverse languages in the WikiAnn dataset and show our technique largely outperforms three baselines: (1) training a multilingual model jointly on all available languages, (2) training one monolingual model per language, and (3) grouping languages by linguistic family. We also conduct analyses showing meaningful multilingual transfer for low-resource languages (Swahili and Yoruba), despite being automatically grouped with other seemingly disparate languages.

spoken question, multilingual named entity, phosphoric acid industry

Developing a Lower Bound for the Spectral Radius of a Numerical Matrix

Damascus University, 2000, research paper

This paper is concerned with the calculation of the spectral radius of an arbitrary real matrix A. If rank A = m = 2, then (1) and (2) are equalities. In addition, we provide the numerical radius r(A) of an n×n matrix whose diagonal entries are complex numbers.

linear algebra, matrices, eigenvalues
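For orientation, the sketch below computes the spectral radius of a small real matrix and compares it with one elementary lower bound, |trace(A)| / n, which holds because the trace is the sum of the n eigenvalues. This bound is a textbook fact used here only as an example; it is not claimed to be one of the bounds (1) or (2) from the paper, which are not reproduced in this summary. The function names and the sample matrix are hypothetical.

```python
import numpy as np


def spectral_radius(matrix):
    """Largest absolute value among the eigenvalues of a square matrix."""
    eigenvalues = np.linalg.eigvals(np.asarray(matrix, dtype=complex))
    return float(np.max(np.abs(eigenvalues)))


def trace_lower_bound(matrix):
    """Elementary lower bound |trace(A)| / n <= rho(A).

    Follows from trace(A) = sum of eigenvalues and the triangle inequality.
    Not the bound derived in the paper.
    """
    a = np.asarray(matrix, dtype=complex)
    return float(abs(np.trace(a)) / a.shape[0])


# Hypothetical example matrix.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

rho = spectral_radius(A)
bound = trace_lower_bound(A)

print("spectral radius:", rho)
print("trace lower bound:", bound)
```

For a rotation-like matrix such as [[0, -1], [1, 0]], the trace bound is 0 while the spectral radius is 1, which shows how loose this simple bound can be and why sharper bounds are of interest.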

Studying the effect of modified spectral subtraction algorithm parameters and time window length in speech signals enhancement

Tishreen University, 2015, research paper

Speech denoising is a field of engineering that studies techniques for recovering the original signal from a noisy signal corrupted by different types of noise, such as broadband noise, narrowband noise, and other types present in the environment. Spectral subtraction is considered the most prominent technique in this area. In this study, we discuss the impact of the parameters of the modified spectral subtraction algorithm and of the time window length on the enhancement of speech corrupted by broadband noise. We determined the ideal parameter values and the ideal window length for different signal-to-noise ratio (SNR) values of the noisy speech, examining 18 cases for each value. The simulation was carried out in MATLAB, and the results were compared based on the improvement in SNR for each case.

signal-to-noise ratio, spectral subtraction, speech enhancement, time window, subtraction factor, spectral floor factor
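The parameters named in the keywords above (subtraction factor, spectral floor factor, time window) can be made concrete with a short sketch. It uses one common formulation of power spectral subtraction with over-subtraction and a spectral floor, applied to a single windowed frame. This is an assumption about the general technique, written in Python rather than the MATLAB used in the study; the function names, default values, and Hann window are hypothetical and are not the study's settings.

```python
import numpy as np


def spectral_subtract(noisy_power, noise_power,
                      subtraction_factor=2.0, floor_factor=0.01):
    """Subtract a scaled noise power estimate, bin by bin.

    Any bin that would fall below floor_factor * noise_power is clamped
    to that floor, which limits isolated spectral peaks (musical noise).
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    subtracted = noisy_power - subtraction_factor * noise_power
    return np.maximum(subtracted, floor_factor * noise_power)


def enhance_frame(noisy_frame, noise_power,
                  subtraction_factor=2.0, floor_factor=0.01):
    """Enhance one frame; the frame length plays the role of the time window."""
    frame = np.asarray(noisy_frame, dtype=float)
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    clean_power = spectral_subtract(np.abs(spectrum) ** 2, noise_power,
                                    subtraction_factor, floor_factor)
    # Keep the noisy phase, as is usual in spectral subtraction.
    clean_spectrum = np.sqrt(clean_power) * np.exp(1j * np.angle(spectrum))
    return np.fft.irfft(clean_spectrum, n=len(frame))


# Hypothetical demo: a 256-sample frame of white noise and a flat noise estimate.
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
noise_power = np.full(256 // 2 + 1, 1.0)
enhanced = enhance_frame(frame, noise_power)
print(enhanced.shape)
```

A full enhancer would split the signal into overlapping frames, estimate `noise_power` from speech-free segments, and overlap-add the enhanced frames; sweeping the frame length and the two factors is the kind of parameter study the abstract describes.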

Context Tracking Network: Graph-based Context Modeling for Implicit Discourse Relation Recognition

Association for Computational Linguistics, 2021, article

Implicit discourse relation recognition (IDRR) aims to identify logical relations between two adjacent sentences in the discourse. Existing models fail to fully utilize the contextual information, which plays an important role in interpreting each local sentence. In this paper, we thus propose a novel graph-based Context Tracking Network (CT-Net) to model the discourse context for IDRR. The CT-Net first converts the discourse into the paragraph association graph (PAG), where each sentence tracks its closely related context from the intricate discourse through different types of edges. Then, the CT-Net extracts contextual representation from the PAG through a specially designed cross-grained updating mechanism, which can effectively integrate both sentence-level and token-level contextual semantics. Experiments on PDTB 2.0 show that the CT-Net gains better performance than models that roughly model the context.

model solving models, context tracking network, phosphoric acid industry

Extend, don't rebuild: Phrasing conditional graph modification as autoregressive sequence labelling

Association for Computational Linguistics, 2021, article

Deriving and modifying graphs from natural language text has become a versatile basis technology for information extraction with applications in many subfields, such as semantic parsing or knowledge graph construction. A recent work used this technique for modifying scene graphs (He et al. 2020), by first encoding the original graph and then generating the modified one based on this encoding. In this work, we show that we can considerably increase performance on this problem by phrasing it as graph extension instead of graph generation. We propose the first model for the resulting graph extension problem based on autoregressive sequence labelling. On three scene graph modification data sets, this formulation leads to improvements in accuracy over the state of the art between 13 and 24 percentage points. Furthermore, we introduce a novel data set from the biomedical domain which has much larger linguistic variability and more complex graphs than the scene graph modification data sets. For this data set, the state of the art fails to generalize, while our model can produce meaningful predictions.

phrasing conditional graph, conditional graph modification, formal phrasing, graph, phosphoric acid industry
