
Different Strokes for Different Folks: Investigating Appropriate Further Pre-training Approaches for Diverse Dialogue Tasks


Publication date: 2021
Language: English
Created by Shamra Editor





Loading models pre-trained on large-scale general-domain corpora and fine-tuning them on specific downstream tasks has gradually become a paradigm in Natural Language Processing. Previous investigations show that introducing a further pre-training phase between the pre-training and fine-tuning phases, in order to adapt the model to domain-specific unlabeled data, can bring positive effects. However, most of this further pre-training work simply keeps running the conventional pre-training task, e.g., masked language modeling, which can be regarded as domain adaptation to bridge the data-distribution gap. After observing diverse downstream tasks, we suggest that different tasks may also need a further pre-training phase with appropriate training tasks to bridge the task-formulation gap. To investigate this, we carry out a study on improving multiple task-oriented dialogue downstream tasks by designing various training tasks for the further pre-training phase. The experiments show that different downstream tasks prefer different further pre-training tasks, which have an intrinsic correlation, and that most further pre-training tasks significantly improve certain target tasks rather than all of them. Our investigation indicates that it is of great importance and effectiveness to design appropriate further pre-training tasks that model the specific information benefiting downstream tasks. In addition, we present several constructive empirical conclusions for enhancing task-oriented dialogue.
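
For readers unfamiliar with the pipeline the abstract describes, the sketch below illustrates the conventional variant of the further pre-training phase: continuing masked language modeling on in-domain text before task-specific fine-tuning. It is a minimal sketch, assuming the Hugging Face Transformers and Datasets libraries, a generic bert-base-uncased checkpoint, and a hypothetical unlabeled file dialogue_corpus.txt; it is not the authors' actual setup.

```python
# A minimal sketch of "pre-train -> further pre-train -> fine-tune", assuming the
# Hugging Face Transformers/Datasets libraries, a generic "bert-base-uncased"
# checkpoint, and a hypothetical file "dialogue_corpus.txt" (one dialogue turn per line).
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")  # general-domain checkpoint

# Further pre-training phase: keep running masked language modeling,
# but on in-domain dialogue text instead of the general corpus.
corpus = load_dataset("text", data_files={"train": "dialogue_corpus.txt"})["train"]
corpus = corpus.map(lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
                    batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="further-pretrained",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=corpus,
    data_collator=collator,
)
trainer.train()
model.save_pretrained("further-pretrained")  # later reloaded for task-specific fine-tuning
```

The saved checkpoint would then be loaded with a task-specific head (e.g., for dialogue state tracking or response selection) and fine-tuned on labeled data; the paper's point is that the masking objective above is only one of several possible further pre-training tasks.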




Read More

Cultivating and preparing the soil for farming field crops, together with adding fertilizers, are among the most important practices of modern agriculture. Based on this importance, the research was carried out in the area north-east of Homs city during the 2013-2014 seasons, using five methods of soil cultivation.
Further pre-training language models on in-domain data (domain-adaptive pre-training, DAPT) or task-relevant data (task-adaptive pre-training, TAPT) before fine-tuning has been shown to improve downstream task performance. However, in task-oriented dialog modeling, we observe that further pre-training with MLM does not always boost performance on a downstream task. We find that DAPT is beneficial in the low-resource setting, but as the fine-tuning data size grows, DAPT becomes less beneficial or even useless, and scaling up the size of the DAPT data does not help. Through Representational Similarity Analysis, we conclude that more fine-tuning data yields a greater change in the model's representations and thus reduces the influence of the initialization.
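
As a companion to the abstract above, here is a rough sketch of how Representational Similarity Analysis can compare a model's representations before and after fine-tuning. It assumes NumPy/SciPy and uses random arrays as stand-ins for real hidden states, so it only illustrates the general idea rather than the paper's exact procedure.

```python
# A rough sketch of Representational Similarity Analysis (RSA), assuming NumPy/SciPy;
# the random arrays below are stand-ins for hidden states of shape (n_examples, hidden_dim)
# extracted for the same probe sentences before and after fine-tuning.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_score(h1: np.ndarray, h2: np.ndarray) -> float:
    """Correlate the pairwise-distance structure of two representation spaces."""
    rdm1 = pdist(h1, metric="cosine")  # condensed representational dissimilarity matrix
    rdm2 = pdist(h2, metric="cosine")
    rho, _ = spearmanr(rdm1, rdm2)
    return float(rho)

rng = np.random.default_rng(0)
before = rng.normal(size=(50, 768))                 # stand-in: representations before fine-tuning
after = before + 0.5 * rng.normal(size=(50, 768))   # stand-in: representations after fine-tuning
print(f"RSA similarity (before vs. after): {rsa_score(before, after):.3f}")
```

A lower score indicates that fine-tuning has changed the representational geometry more, which is the kind of signal the abstract uses to argue that large fine-tuning sets wash out the effect of the initialization.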
Low-resource languages can be understood as languages that are more scarce, less studied, less privileged, less commonly taught and for which fewer resources are available (Singh, 2008; Cieri et al., 2016; Magueresse et al., 2020). Natural Language Processing (NLP) research and technology mainly focus on languages for which large data sets are available. To illustrate the differences in data availability: there are 6 million Wikipedia articles available for English, 2 million for Dutch, and merely 82 thousand for Albanian. The scarce-data issue becomes increasingly apparent when large parallel data sets are required for applications such as Neural Machine Translation (NMT). In this work, we investigate to what extent translation between Albanian (SQ) and Dutch (NL) is possible, comparing a one-to-one (SQ↔NL) model, a low-resource pivot-based approach (with English (EN) as pivot) and a zero-shot translation (ZST) system (Johnson et al., 2016; Mattoni et al., 2017). Our experiments show that the EN-pivot model outperforms both the direct one-to-one and the ZST model. Since small amounts of parallel data are often available for low-resource languages or settings, experiments were conducted using small sets of parallel NL↔SQ data. The ZST appeared to be the worst-performing model. Even when the available parallel data (NL↔SQ) was added, i.e. in a few-shot setting (FST), it remained the worst-performing system according to the automatic (BLEU and TER) and human evaluations.
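
To make the pivot-based setup concrete, the following sketch chains two translation models through English (SQ → EN → NL). It assumes the Hugging Face MarianMT interface; the checkpoint names are placeholders and may not correspond to actually released models, and the sketch is not the system evaluated in the abstract.

```python
# An illustrative sketch of pivot-based translation (SQ -> EN -> NL), assuming the
# Hugging Face MarianMT interface; the checkpoint names below are placeholders.
from transformers import MarianMTModel, MarianTokenizer

def load_pair(name: str):
    tokenizer = MarianTokenizer.from_pretrained(name)
    return tokenizer, MarianMTModel.from_pretrained(name)

def translate(text: str, tokenizer, model) -> str:
    batch = tokenizer([text], return_tensors="pt", padding=True)
    generated = model.generate(**batch, max_length=128)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

# Leg 1: Albanian -> English (the pivot language); leg 2: English -> Dutch.
sq_en_tok, sq_en_model = load_pair("Helsinki-NLP/opus-mt-sq-en")  # placeholder name
en_nl_tok, en_nl_model = load_pair("Helsinki-NLP/opus-mt-en-nl")  # placeholder name

albanian_sentence = "Si jeni sot?"
english_pivot = translate(albanian_sentence, sq_en_tok, sq_en_model)
dutch_output = translate(english_pivot, en_nl_tok, en_nl_model)
print(dutch_output)
```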
Objectives: This was a prospective study conducted to analyze the intraoperative and postoperative complications of abdominal versus vaginal hysterectomy. Methods: The study was carried out on 120 patients (85 abdominal and 35 vaginal hysterectomies) in the department of gynecology at Al-Assad University Hospital in Lattakia between 1/7/2013 and 1/7/2014. Results: The mean duration of surgery was 103 min for abdominal hysterectomy and 91 min for vaginal hysterectomy (p=0.0192). Wound infection was the main cause of febrile morbidity in the abdominal hysterectomy group, whereas urinary tract infection was the main cause of febrile morbidity in the vaginal hysterectomy group. There were 3 (3.5%) cases of bladder injury and 2 (2.8%) cases of ureteric injury in the abdominal hysterectomy group, while there were none in the vaginal hysterectomy group. Postoperatively there were 3 (3.5%) cases of secondary haemorrhage in the TAH group versus 1 (2.8%) case in the vaginal hysterectomy group, and 8 (9.4%) cases of paralytic ileus in abdominal hysterectomy versus none in vaginal hysterectomy. Overall, 45 (52.9%) abdominal hysterectomy cases and 12 (34.2%) vaginal hysterectomy cases had complications (p=0.029). Conclusions: This study showed that vaginal hysterectomy was associated with fewer intraoperative and postoperative complications than abdominal hysterectomy.
Neural machine translation (NMT) models are typically trained using a softmax cross-entropy loss in which the softmax distribution is compared against the gold labels. In low-resource scenarios, NMT models tend to perform poorly because model training quickly converges to a point where the softmax distribution computed from the logits approaches the gold label distribution. Although label smoothing is a well-known solution to this issue, we further propose dividing the logits by a temperature coefficient greater than one, forcing the softmax distribution to be smoother during training. This makes it harder for the model to over-fit quickly. In our experiments on 11 language pairs from the low-resource Asian Language Treebank dataset, we observed significant improvements in translation quality. Our analysis focuses on finding the right balance of label smoothing and softmax tempering, and indicates that they are orthogonal methods. Finally, a study of softmax entropies and gradients reveals the impact of our method on the internal behavior of our NMT models.
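
The core idea of softmax tempering combined with label smoothing can be written in a few lines; the PyTorch sketch below is illustrative only, with the temperature and smoothing values chosen arbitrarily rather than taken from the paper.

```python
# A small PyTorch sketch of softmax tempering combined with label smoothing; the
# temperature and smoothing values are illustrative, not the paper's tuned settings.
import torch
import torch.nn.functional as F

def tempered_smoothed_ce(logits: torch.Tensor, targets: torch.Tensor,
                         temperature: float = 2.0, smoothing: float = 0.1) -> torch.Tensor:
    """Cross-entropy of temperature-divided logits against a smoothed gold distribution."""
    vocab_size = logits.size(-1)
    # Dividing the logits by a temperature > 1 flattens the softmax during training.
    log_probs = F.log_softmax(logits / temperature, dim=-1)
    # Smoothed gold labels: (1 - smoothing) on the gold token, the rest spread uniformly.
    smooth_targets = torch.full_like(log_probs, smoothing / (vocab_size - 1))
    smooth_targets.scatter_(-1, targets.unsqueeze(-1), 1.0 - smoothing)
    return -(smooth_targets * log_probs).sum(dim=-1).mean()

logits = torch.randn(8, 32000)           # (batch, vocab) toy decoder logits
targets = torch.randint(0, 32000, (8,))  # toy gold token ids
print(tempered_smoothed_ce(logits, targets))
```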
