While annotating normalized times in food security documents, we found that the semantically compositional annotation for time normalization (SCATE) scheme required several near-duplicate annotations to get the correct semantics for expressions like Nov. 7th to 11th 2021. To reduce this problem, we explored replacing SCATE's Sub-Interval property with a Super-Interval property, that is, making the smallest units (e.g., 7th and 11th) rather than the largest units (e.g., 2021) the heads of the intersection chains. To ensure that the semantics of annotated time intervals remained unaltered despite our changes to the syntax of the annotation scheme, we applied several different techniques to validate our changes. These validation techniques detected and allowed us to resolve several important bugs in our automated translation from Sub-Interval to Super-Interval syntax.
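To make the contrast between the two chain directions concrete, below is a minimal Python sketch for the expression Nov. 7th to 11th 2021. The Annotation class, its field names, and the resolve functions are hypothetical simplifications invented for illustration; they are not the SCATE data model, and the final assertions show only one plausible semantic check, not the authors' actual validation techniques.

```python
import datetime
from dataclasses import dataclass
from typing import Optional

@dataclass
class Annotation:
    """Hypothetical, simplified stand-in for a SCATE annotation entity."""
    entity_type: str          # e.g., "Day-Of-Month", "Month-Of-Year", "Year"
    value: object             # e.g., 7, "November", 2021
    sub_interval: Optional["Annotation"] = None    # original SCATE link
    super_interval: Optional["Annotation"] = None  # proposed replacement link

# Sub-Interval syntax: the largest units head the chains, so the
# month/year context must be repeated for each day mentioned.
nov_for_7 = Annotation("Month-Of-Year", "November")
year_for_7 = Annotation("Year", 2021, sub_interval=nov_for_7)
day_7 = Annotation("Day-Of-Month", 7)
nov_for_7.sub_interval = day_7

nov_for_11 = Annotation("Month-Of-Year", "November")             # near-duplicate
year_for_11 = Annotation("Year", 2021, sub_interval=nov_for_11)  # near-duplicate
day_11 = Annotation("Day-Of-Month", 11)
nov_for_11.sub_interval = day_11

# Super-Interval syntax: the smallest units head the chains, so both
# days can point at a single shared November-2021 context.
year_2021 = Annotation("Year", 2021)
november = Annotation("Month-Of-Year", "November", super_interval=year_2021)
day_7s = Annotation("Day-Of-Month", 7, super_interval=november)
day_11s = Annotation("Day-Of-Month", 11, super_interval=november)  # shared, no duplication

def resolve_super(day: Annotation) -> datetime.date:
    """Resolve a Super-Interval chain headed by a Day-Of-Month to a date."""
    month = day.super_interval
    year = month.super_interval
    month_num = datetime.datetime.strptime(month.value, "%B").month
    return datetime.date(year.value, month_num, day.value)

def resolve_sub(year: Annotation) -> datetime.date:
    """Resolve a Sub-Interval chain headed by a Year to a date."""
    month = year.sub_interval
    day = month.sub_interval
    month_num = datetime.datetime.strptime(month.value, "%B").month
    return datetime.date(year.value, month_num, day.value)

# One simple validation: both syntaxes must denote the same intervals.
assert resolve_sub(year_for_7) == resolve_super(day_7s)
assert resolve_sub(year_for_11) == resolve_super(day_11s)
```

In this sketch, the benefit of the Super-Interval direction is that the November and 2021 annotations are created once and shared by both day heads, which is exactly what eliminates the near-duplicate annotations required under the Sub-Interval direction.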