Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Event Coreference Data (Almost) for Free: Mining Hyperlinks from Online News

بيانات Aquerence Event (تقريبا) مجانا: تعلب الارتباطات التشعبية من الأخبار عبر الإنترنت

933 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Cross-document event coreference resolution (CDCR) is the task of identifying which event mentions refer to the same events throughout a collection of documents. Annotating CDCR data is an arduous and expensive process, explaining why existing corpora are small and lack domain coverage. To overcome this bottleneck, we automatically extract event coreference data from hyperlinks in online news: When referring to a significant real-world event, writers often add a hyperlink to another article covering this event. We demonstrate that collecting hyperlinks which point to the same article(s) produces extensive and high-quality CDCR data and create a corpus of 2M documents and 2.7M silver-standard event mentions called HyperCoref. We evaluate a state-of-the-art system on three CDCR corpora and find that models trained on small subsets of HyperCoref are highly competitive, with performance similar to models trained on gold-standard data. With our work, we free CDCR research from depending on costly human-annotated training data and open up possibilities for research beyond English CDCR, as our data extraction approach can be easily adapted to other languages.

References used

https://aclanthology.org/

rate research

WEC: Deriving a Large-scale Cross-document Event Coreference dataset from Wikipedia

664 - Association for Computation Linguistics 2021 مقالة

Cross-document event coreference resolution is a foundational task for NLP applications involving multi-text processing. However, existing corpora for this task are scarce and relatively small, while annotating only modest-size clusters of documents belonging to the same topic. To complement these resources and enhance future research, we present Wikipedia Event Coreference (WEC), an efficient methodology for gathering a large-scale dataset for cross-document event coreference from Wikipedia, where coreference links are not restricted within predefined topics. We apply this methodology to the English Wikipedia and extract our large-scale WEC-Eng dataset. Notably, our dataset creation method is generic and can be applied with relatively little effort to other Wikipedia languages. To set baseline results, we develop an algorithm that adapts components of state-of-the-art models for within-document coreference resolution to the cross-document setting. Our model is suitably efficient and outperforms previously published state-of-the-art results for the task.

المهام تحسين صفر النار event coreference cross-document event الحدث comeference. حدث عبر المستندات صناعة حمض الفوسفور

A Dataset for Research on Modelling Depression Severity in Online Forum Data

1115 - Association for Computation Linguistics 2021 مقالة

People utilize online forums to either look for information or to contribute it. Because of their growing popularity, certain online forums have been created specifically to provide support, assistance, and opinions for people suffering from mental i llness. Depression is one of the most frequent psychological illnesses worldwide. People communicate more with online forums to find answers for their psychological disease. However, there is no mechanism to measure the severity of depression in each post and give higher importance to those who are diagnosed more severely depressed. Despite the fact that numerous researches based on online forum data and the identification of depression have been conducted, the severity of depression is rarely explored. In addition, the absence of datasets will stymie the development of novel diagnostic procedures for practitioners. From this study, we offer a dataset to support research on depression severity evaluation. The computational approach to measure an automatic process, identified severity of depression here is quite novel approach. Nonetheless, this elaborate measuring severity of depression in online forum posts is needed to ensure the measurement scales used in our research meets the expected norms of scientific research.

modelling depression severity online forum data modelling depression نمذجة الشدة الاكتئاب بيانات المنتدى عبر الإنترنت نمذجة الاكتئاب صناعة حمض الفوسفور المزيد..

Almost Free Semantic Draft for Neural Machine Translation

697 - Association for Computation Linguistics 2021 مقالة

Translation quality can be improved by global information from the required target sentence because the decoder can understand both past and future information. However, the model needs additional cost to produce and consider such global information. In this work, to inject global information but also save cost, we present an efficient method to sample and consider a semantic draft as global information from semantic space for decoding with almost free of cost. Unlike other successful adaptations, we do not have to perform an EM-like process that repeatedly samples a possible semantic from the semantic space. Empirical experiments show that the presented method can achieve competitive performance in common language pairs with a clear advantage in inference efficiency. We will open all our source code on GitHub.

التنبؤات في الحكم صناعة حمض الفوسفور

The Online Pivot: Lessons Learned from Teaching a Text and Data Mining Course in Lockdown, Enhancing online Teaching with Pair Programming and Digital Badges

658 - Association for Computation Linguistics 2021 مقالة

In this paper we provide an account of how we ported a text and data mining course online in summer 2020 as a result of the COVID-19 pandemic and how we improved it in a second pilot run. We describe the course, how we adapted it over the two pilot r uns and what teaching techniques we used to improve students' learning and community building online. We also provide information on the relentless feedback collected during the course which helped us to adapt our teaching from one session to the next and one pilot to the next. We discuss the lessons learned and promote the use of innovative teaching techniques applied to the digital such as digital badges and pair programming in break-out rooms for teaching Natural Language Processing courses to beginners and students with different backgrounds.

enhancing online teaching text and data تعزيز التعليم عبر الإنترنت النص والبيانات بيانات التعدين صناعة حمض الفوسفور

Pedagogical Principles in the Online Teaching of Text Mining: A Retrospection

672 - Association for Computation Linguistics 2021 مقالة

The ongoing COVID-19 pandemic has brought online education to the forefront of pedagogical discussions. To make this increased interest sustainable in a post-pandemic era, online courses must be built on strong pedagogical foundations. With a long hi story of pedagogic research, there are many principles, frameworks, and models available to help teachers in doing so. These models cover different teaching perspectives, such as constructive alignment, feedback, and the learning environment. In this paper, we discuss how we designed and implemented our online Natural Language Processing (NLP) course following constructive alignment and adhering to the pedagogical principles of LTU. By examining our course and analyzing student evaluation forms, we show that we have met our goal and successfully delivered the course. Furthermore, we discuss the additional benefits resulting from the current mode of delivery, including the increased reusability of course content and increased potential for collaboration between universities. Lastly, we also discuss where we can and will further improve the current course design.

استنتاج شرح تجديد teaching of text online natural language تحليل النصوص تدريس النص اللغة الطبيعية عبر الإنترنت صناعة حمض الفوسفور المزيد..

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Event Coreference Data (Almost) for Free: Mining Hyperlinks from Online News

بيانات Aquerence Event (تقريبا) مجانا: تعلب الارتباطات التشعبية من الأخبار عبر الإنترنت

Ask ChatGPT about the research

Read More

suggested questions