What is on Social Media that is not in WordNet? A Preliminary Analysis on the TwitterAAE Corpus

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: cross-document structure theory, preliminary analysis, TwitterAAE corpus

Natural Language Processing tools and resources have been so far mainly created and trained for standard varieties of language. Nowadays, with the use of large amounts of data gathered from social media, other varieties and registers need to be processed, which may present other challenges and difficulties. In this work, we focus on English and we present a preliminary analysis by comparing the TwitterAAE corpus, which is annotated for ethnicity, and WordNet by quantifying and explaining the online language that WordNet misses.
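
The kind of comparison the abstract describes, counting corpus tokens that a lexicon does not cover, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the tokenizer, the function names, the sample posts, and the toy lexicon are all hypothetical, and the toy lexicon is a stand-in that does not reflect WordNet's actual contents.

```python
import re
from collections import Counter

# Hypothetical stand-in for a lexicon's word inventory. A real analysis
# would load the full resource; this toy set is NOT WordNet's content.
TOY_LEXICON = {"i", "am", "go", "to", "the", "store", "be", "tired"}

# Deliberately crude tokenizer: runs of lowercase letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def missing_from_lexicon(posts, lexicon):
    """Return (counts of tokens absent from the lexicon, share of all tokens)."""
    missing = Counter()
    total = 0
    for post in posts:
        for token in tokenize(post):
            total += 1
            if token not in lexicon:
                missing[token] += 1
    rate = sum(missing.values()) / total if total else 0.0
    return missing, rate


# Invented sample posts, for illustration only.
posts = ["I am finna go to the store", "lol I be tired tho"]
oov_counts, oov_rate = missing_from_lexicon(posts, TOY_LEXICON)
print(oov_counts.most_common())
print(f"share of tokens not in lexicon: {oov_rate:.2f}")
```

Quantifying the gap this way leaves the explanatory step (why a token is missing: abbreviation, spelling variant, new word, and so on) to manual or further automatic analysis.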

References used

https://aclanthology.org/

Applying Occam's Razor to Transformer-Based Dependency Parsing: What Works, What Doesn't, and What is Really Necessary

Association for Computational Linguistics, 2021, article

The introduction of pre-trained transformer-based contextualized word embeddings has led to considerable improvements in the accuracy of graph-based parsers for frameworks such as Universal Dependencies (UD). However, previous works differ in various dimensions, including their choice of pre-trained language models and whether they use LSTM layers. With the aims of disentangling the effects of these choices and identifying a simple yet widely applicable architecture, we introduce STEPS, a new modular graph-based dependency parser. Using STEPS, we perform a series of analyses on the UD corpora of a diverse set of languages. We find that the choice of pre-trained embeddings has by far the greatest impact on parser performance and identify XLM-R as a robust choice across the languages in our study. Adding LSTM layers provides no benefits when using transformer-based embeddings. A multi-task training setup outputting additional UD features may contort results. Taking these insights together, we propose a simple but widely applicable parser architecture and configuration, achieving new state-of-the-art results (in terms of LAS) for 10 out of 12 diverse languages.

Keywords: applying Occam's razor, Occam's razor, transformer-based dependency parsing

What is SemEval evaluating? A Systematic Analysis of Evaluation Campaigns in NLP

Association for Computational Linguistics, 2021, article

SemEval is the primary venue in the NLP community for the proposal of new challenges and for the systematic empirical evaluation of NLP systems. This paper provides a systematic quantitative analysis of SemEval aiming to evidence the patterns of the contributions behind SemEval. By understanding the distribution of task types, metrics, architectures, participation and citations over time we aim to answer the question on what is being evaluated by SemEval.

Keywords: evaluation campaigns, systematic empirical evaluation

What is Multimodality?

Association for Computational Linguistics, 2021, article

The last years have shown rapid developments in the field of multimodal machine learning, combining e.g., vision, text or speech. In this position paper we explain how the field uses outdated definitions of multimodality that prove unfit for the machine learning era. We propose a new task-relative definition of (multi)modality in the context of multimodal machine learning that focuses on representations and information that are relevant for a given machine learning task. With our new definition of multimodality we aim to provide a missing foundation for multimodal research, an important component of language grounding and a crucial milestone towards NLU.

Keywords: adaptation experiments, multimodal machine learning

Modeling Framing in Immigration Discourse on Social Media

Association for Computational Linguistics, 2021, article

The framing of political issues can influence policy and public opinion. Even though the public plays a key role in creating and spreading frames, little is known about how ordinary people on social media frame political issues. By creating a new dataset of immigration-related tweets labeled for multiple framing typologies from political communication theory, we develop supervised models to detect frames. We demonstrate how users' ideology and region impact framing choices, and how a message's framing influences audience responses. We find that the more commonly-used issue-generic frames obscure important ideological and regional patterns that are only revealed by immigration-specific frames. Furthermore, frames oriented towards human interests, culture, and politics are associated with higher user engagement. This large-scale analysis of a complex social and linguistic phenomenon contributes to both NLP and social science research.

Keywords: immigration discourse, discourse on social

Hidden Advertorial Detection on Social Media in Chinese

Association for Computational Linguistics, 2021, article

Nowadays, there are a lot of advertisements hiding as normal posts or experience sharing in social media. There is little research of advertorial detection on Mandarin Chinese texts. This paper thus aimed to focus on hidden advertorial detection of online posts in Taiwan Mandarin Chinese. We inspected seven contextual features based on linguistic theories in discourse level. These features can be further grouped into three schemas under the general advertorial writing structure. We further implemented these features to train a multi-task BERT model to detect advertorials. The results suggested that specific linguistic features would help extract advertorials.

Keywords: discourse improvement, Taiwan Mandarin Chinese, Mandarin Chinese
