Speech separation is a problem in speech processing that has been studied actively in recent years. However, little work has examined the multi-accent speech separation scenario. Unseen speakers with new accents and noise give rise to a domain mismatch problem that cannot be easily solved by conventional joint-training methods. We therefore applied MAML and FOMAML to tackle this problem and obtained higher average SI-SNRi values than joint training on almost all unseen accents. This demonstrates that both methods can produce well-initialized parameters that adapt to speech mixtures from new speakers and accents. Furthermore, we found that FOMAML achieves performance comparable to MAML while requiring far less training time.
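The difference between MAML and FOMAML mentioned above comes down to whether the meta-gradient is propagated through the inner adaptation step. As a minimal sketch, assuming toy quadratic per-task losses (w - c)^2 rather than the actual speech-separation objective, the two meta-updates differ only by the second-order correction factor:

```python
def grad(w, c):
    """Gradient of the toy task loss (w - c)^2 with respect to w."""
    return 2.0 * (w - c)

def meta_step(w, tasks, alpha, beta, first_order=False):
    """One meta-update over a batch of toy tasks.

    MAML differentiates through the inner step, which for this quadratic
    loss yields the factor (1 - 2*alpha); FOMAML drops that term and uses
    the plain gradient evaluated at the adapted parameters.
    """
    outer_grad = 0.0
    for c in tasks:
        w_adapted = w - alpha * grad(w, c)   # inner (per-task) adaptation
        g = grad(w_adapted, c)               # gradient at adapted params
        if not first_order:
            g *= (1.0 - 2.0 * alpha)         # second-order term kept by MAML
        outer_grad += g
    return w - beta * outer_grad / len(tasks)

# Two tasks whose optima differ; the meta-optimum lies at their mean.
tasks = [1.0, 3.0]
w = 0.0
for _ in range(200):
    w = meta_step(w, tasks, alpha=0.1, beta=0.1, first_order=True)
print(round(w, 3))  # converges toward 2.0, the mean of the task optima
```

Here FOMAML reaches the same meta-initialization as MAML while skipping the second-order term, which is why it trains substantially faster at comparable quality; the function names and toy losses are illustrative assumptions, not the paper's implementation.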