Simplifying annotation of intersections in time normalization annotation: exploring syntactic and semantic validation

Added by Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: exploring syntactic, time normalization annotation, time normalization

Abstract

While annotating normalized times in food security documents, we found that the semantically compositional annotation for time normalization (SCATE) scheme required several near-duplicate annotations to get the correct semantics for expressions like Nov. 7th to 11th 2021. To reduce this problem, we explored replacing SCATE's Sub-Interval property with a Super-Interval property, that is, making the smallest units (e.g., 7th and 11th) rather than the largest units (e.g., 2021) the heads of the intersection chains. To ensure that the semantics of annotated time intervals remained unaltered despite our changes to the syntax of the annotation scheme, we applied several different techniques to validate our changes. These validation techniques detected and allowed us to resolve several important bugs in our automated translation from Sub-Interval to Super-Interval syntax.
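The contrast between the two chain directions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SCATE schema or the authors' translation tool: the `Unit` class, its `sub` and `sup` links, and the functions `to_super_interval`, `constraints`, and `same_semantics` are all hypothetical names. The sketch reverses a Sub-Interval chain (largest unit first) into a Super-Interval chain (smallest unit first) and then checks that both forms resolve to the same calendar day, which is one simple kind of semantic validation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Unit:
    """One link in an intersection chain (hypothetical, not the SCATE schema)."""
    kind: str                        # "year", "month", or "day"
    value: int
    sub: Optional["Unit"] = None     # Sub-Interval link: larger unit -> smaller unit
    sup: Optional["Unit"] = None     # Super-Interval link: smaller unit -> larger unit


def to_super_interval(head: Unit) -> Optional[Unit]:
    """Translate a Sub-Interval chain (largest unit first) into an
    equivalent Super-Interval chain (smallest unit first)."""
    new_head: Optional[Unit] = None
    node: Optional[Unit] = head
    while node is not None:
        # Copy the node and point it at the previously copied (larger) unit.
        new_head = Unit(node.kind, node.value, sup=new_head)
        node = node.sub
    return new_head


def constraints(head: Unit) -> dict:
    """Collect kind -> value pairs by walking whichever link the chain uses."""
    out = {}
    node: Optional[Unit] = head
    while node is not None:
        out[node.kind] = node.value
        node = node.sub if node.sub is not None else node.sup
    return out


def same_semantics(a: Unit, b: Unit) -> bool:
    """Semantic validation: both chains must denote the same calendar day."""
    return date(**constraints(a)) == date(**constraints(b))


# Sub-Interval syntax: each endpoint needs its own full chain from the year down.
sub_start = Unit("year", 2021, sub=Unit("month", 11, sub=Unit("day", 7)))
sub_end = Unit("year", 2021, sub=Unit("month", 11, sub=Unit("day", 11)))

# Super-Interval syntax: both days can point at one shared "Nov. 2021" tail.
nov_2021 = Unit("month", 11, sup=Unit("year", 2021))
sup_start = Unit("day", 7, sup=nov_2021)
sup_end = Unit("day", 11, sup=nov_2021)

print(same_semantics(sub_start, to_super_interval(sub_start)))
print(same_semantics(sub_end, sup_end))
```

In this toy model, the Sub-Interval form repeats the year and month for each endpoint of "Nov. 7th to 11th 2021", while the Super-Interval form lets both day units share a single month-and-year tail. That is one plausible reading of the near-duplicate annotations the abstract describes.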

References used

https://aclanthology.org/

On the Interaction between Annotation Quality and Classifier Performance in Abusive Language Detection

Association for Computational Linguistics, 2021 (article)

Abusive language detection has become an important tool for the cultivation of safe online platforms. We investigate the interaction of annotation quality and classifier performance. We use a new, fine-grained annotation scheme that allows us to distinguish between abusive language and colloquial uses of profanity that are not meant to harm. Our results show a tendency of crowd workers to overuse the abusive class, which creates an unrealistic class balance and affects classification accuracy. We also investigate different methods of distinguishing between explicit and implicit abuse and show that lexicon-based approaches either over- or under-estimate the proportion of explicit abuse in data sets.

Keywords: general-domain summary evaluation, quality and classifier

Joint Universal Syntactic and Semantic Parsing

Association for Computational Linguistics, 2021 (article)

While numerous attempts have been made to jointly parse syntax and semantics, high performance in one domain typically comes at the price of performance in the other. This trade-off contradicts the large body of research focusing on the rich interactions at the syntax-semantics interface. We explore multiple model architectures that allow us to exploit the rich syntactic and semantic annotations contained in the Universal Decompositional Semantics (UDS) dataset, jointly parsing Universal Dependencies and UDS to obtain state-of-the-art results in both formalisms. We analyze the behavior of a joint model of syntax and semantics, finding patterns supported by linguistic theory at the syntax-semantics interface. We then investigate to what degree joint modeling generalizes to a multilingual setting, where we find similar trends across 8 languages.

Keywords: universal decompositional semantics, parsing universal dependencies, jointly parsing universal

Learning with Different Amounts of Annotation: From Zero to Many Labels

Association for Computational Linguistics, 2021 (article)

Training NLP systems typically assumes access to annotated data that has a single human label per example. Given imperfect labeling from annotators and the inherent ambiguity of language, we hypothesize that a single label is not sufficient to learn the spectrum of language interpretation. We explore new annotation distribution schemes, assigning multiple labels per example for a small subset of training examples. Introducing such multi-label examples at the cost of annotating fewer examples brings clear gains on a natural language inference task and an entity typing task, even when we simply first train with single-label data and then fine-tune with multi-label examples. Extending a MixUp data augmentation framework, we propose a learning algorithm that can learn from training examples with different amounts of annotation (with zero, one, or multiple labels). This algorithm efficiently combines signals from uneven training data and brings additional gains in low annotation budget and cross-domain settings. Together, our method achieves consistent gains in two tasks, suggesting that distributing labels unevenly among training examples can be beneficial for many NLP tasks.

Keywords: robust inference evaluation, single label, labels

Emotion Ratings: How Intensity, Annotation Confidence and Agreements are Entangled

Association for Computational Linguistics, 2021 (article)

When humans judge the affective content of texts, they also implicitly assess the correctness of such judgment, that is, their confidence. We hypothesize that people's (in)confidence that they performed well in an annotation task leads to (dis)agreements among each other. If this is true, confidence may serve as a diagnostic tool for systematic differences in annotations. To probe our assumption, we conduct a study on a subset of the Corpus of Contemporary American English, in which we ask raters to distinguish neutral sentences from emotion-bearing ones, while scoring the confidence of their answers. Confidence turns out to approximate inter-annotator disagreements. Further, we find that confidence is correlated with emotion intensity: perceiving stronger affect in text prompts annotators to more certain classification performances. This insight is relevant for modelling studies of intensity, as it opens the question of whether automatic regressors or classifiers actually predict intensity, or rather humans' self-perceived confidence.

Keywords: emotion ratings, agreements are entangled, contemporary american english

Minimizing Annotation Effort via Max-Volume Spectral Sampling

Association for Computational Linguistics, 2021 (article)

We address the annotation data bottleneck for sequence classification. Specifically we ask the question: if one has a budget of N annotations, which samples should we select for annotation? The solution we propose looks for diversity in the selected sample, by maximizing the amount of information that is useful for the learning algorithm, or equivalently by minimizing the redundancy of samples in the selection. This is formulated in the context of spectral learning of recurrent functions for sequence classification. Our method represents unlabeled data in the form of a Hankel matrix, and uses the notion of spectral max-volume to find a compact sub-block from which annotation samples are drawn. Experiments on sequence classification confirm that our spectral sampling strategy is in fact efficient and yields good models.

Keywords: minimizing annotation effort, annotation effort
