Larger-Context Tagging: When and Why Does It Work?

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: fine-grained modeling, tagging systems, larger-context

Abstract

The development of neural networks and pretraining techniques has spawned many sentence-level tagging systems that achieve superior performance on typical benchmarks. However, a relatively less discussed question is what happens when more context information is introduced into current top-scoring tagging systems. Although several existing works have attempted to shift tagging systems from sentence level to document level, there is still no consensus on when and why this works, which limits the applicability of the larger-context approach in tagging tasks. In this paper, instead of pursuing a state-of-the-art tagging system through architectural exploration, we focus on investigating when and why larger-context training, as a general strategy, can work. To this end, we conduct a thorough comparative study of four proposed aggregators for collecting context information and present an attribute-aided evaluation method to interpret the improvement brought by larger-context training. Experimentally, we set up a testbed based on four tagging tasks and thirteen datasets. We hope that our preliminary observations can deepen the understanding of larger-context training and inspire more follow-up work on the use of contextual information.
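To make the move from sentence-level to larger-context input concrete, here is a minimal sketch. It is not one of the paper's four aggregators, which the abstract does not describe. It assumes the simplest possible scheme: each target sentence is concatenated with its neighbouring sentences, and the offsets of the target are recorded so that tags predicted over the whole window can be cut back to the target sentence. The names `build_context_window`, `slice_target_tags`, and `radius` are hypothetical.

```python
from typing import List, Sequence, Tuple


def build_context_window(
    sentences: List[List[str]], index: int, radius: int = 1
) -> Tuple[List[str], int, int]:
    """Concatenate a target sentence with up to `radius` neighbours per side.

    Returns (tokens, start, end) such that tokens[start:end] is exactly the
    target sentence. This is an illustrative assumption, not the paper's method.
    """
    first = max(0, index - radius)
    last = min(len(sentences), index + radius + 1)
    tokens: List[str] = []
    start = end = 0
    for i in range(first, last):
        if i == index:
            start = len(tokens)  # target sentence begins here
        tokens.extend(sentences[i])
        if i == index:
            end = len(tokens)  # target sentence ends here (exclusive)
    return tokens, start, end


def slice_target_tags(window_tags: Sequence, start: int, end: int) -> list:
    """Keep only the tags that belong to the target sentence."""
    return list(window_tags[start:end])


# Toy document: three tokenized sentences.
doc = [
    ["Paris", "is", "lovely", "."],
    ["It", "hosted", "the", "Olympics", "."],
    ["Crowds", "were", "huge", "."],
]
tokens, start, end = build_context_window(doc, index=1, radius=1)
```

A tagger would then be run on `tokens`, and `slice_target_tags` would recover the predictions for the middle sentence only. With `radius=0` the sketch reduces to ordinary sentence-level input, which makes it easy to compare the two settings on the same data.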

References used

https://aclanthology.org/

When and Why a Model Fails? A Human-in-the-loop Error Detection Framework for Sentiment Analysis

Association for Computational Linguistics, 2021 (article)

Although deep neural networks have been widely employed and proven effective in sentiment analysis tasks, it remains challenging for model developers to assess their models for erroneous predictions that might exist prior to deployment. Once deployed, emergent errors can be hard to identify at prediction run time and impossible to trace back to their sources. To address such gaps, in this paper we propose an error detection framework for sentiment analysis based on explainable features. We perform global-level feature validation with human-in-the-loop assessment, followed by an integration of global and local-level feature contribution analysis. Experimental results show that, given limited human-in-the-loop intervention, our method is able to identify erroneous model predictions on unseen data with high precision.

Keywords: error detection framework, model fails

Implicitly Abusive Language -- What does it actually look like and why are we not getting there?

Association for Computational Linguistics, 2021 (article)

Abusive language detection is an emerging field in natural language processing which has received a large amount of attention recently. Still, the success of automatic detection is limited. In particular, the detection of implicitly abusive language, i.e., abusive language that is not conveyed by abusive words (e.g., dumbass or scum), is not working well. In this position paper, we explain why existing datasets make learning implicit abuse difficult and what needs to be changed in the design of such datasets. Arguing for a divide-and-conquer strategy, we present a list of subtypes of implicitly abusive language and formulate research tasks and questions for future research.

Keywords: implicitly abusive language, implicitly abusive

When does Further Pre-training MLM Help? An Empirical Study on Task-Oriented Dialog Pre-training

Association for Computational Linguistics, 2021 (article)

Further pre-training language models on in-domain data (domain-adaptive pre-training, DAPT) or task-relevant data (task-adaptive pre-training, TAPT) before fine-tuning has been shown to improve downstream task performance. However, in task-oriented dialog modeling, we observe that further pre-training MLM does not always boost the performance on a downstream task. We find that DAPT is beneficial in the low-resource setting, but as the fine-tuning data size grows, DAPT becomes less beneficial or even useless, and scaling the size of DAPT data does not help. Through Representational Similarity Analysis, we conclude that more data for fine-tuning yields greater change in the model's representations and thus reduces the influence of initialization.

Keywords: pre-training MLM, pre-training

Benchmarking Meta-embeddings: What Works and What Does Not

Association for Computational Linguistics, 2021 (article)

In the last few years, several methods have been proposed to build meta-embeddings. The general aim was to obtain new representations integrating complementary knowledge from different source pre-trained embeddings, thereby improving their overall quality. However, previous meta-embeddings have been evaluated using a variety of methods and datasets, which makes it difficult to draw meaningful conclusions regarding the merits of each approach. In this paper we propose a unified common framework, including both intrinsic and extrinsic tasks, for a fair and objective meta-embeddings evaluation. Furthermore, we present a new method to generate meta-embeddings, outperforming previous work on a large number of intrinsic evaluation benchmarks. Our evaluation framework also allows us to conclude that previous extrinsic evaluations of meta-embeddings have been overestimated.

Keywords: meta-embeddings, benchmarking meta-embeddings

Rare-Class Dialogue Act Tagging for Alzheimer's Disease Diagnosis

Association for Computational Linguistics, 2021 (article)

Alzheimer's Disease (AD) is associated with many characteristic changes, not only in an individual's language but also in the interactive patterns observed in dialogue. The most indicative changes of this latter kind tend to be associated with relatively rare dialogue acts (DAs), such as those involved in clarification exchanges and responses to particular kinds of questions. However, most existing work in DA tagging focuses on improving average performance, effectively prioritizing more frequent classes; it thus performs poorly on these rarer classes and is not suited for application to AD analysis. In this paper, we investigate tagging specifically for rare-class DAs, using a hierarchical BiLSTM model with various ways of incorporating information from previous utterances and DA tags in context. We show that this can give good performance for rare DA classes on both the general Switchboard corpus (SwDA) and an AD-specific conversational dataset, the Carolinas Conversation Collection (CCC), and that the tagger outputs then contribute useful information for distinguishing patients with and without AD.

Keywords: Alzheimer disease diagnosis, disease diagnosis, rare-class dialogue act
