Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

Gender is widely discussed in the context of language tasks and when examining the stereotypes propagated by language models. However, current discussions primarily treat gender as binary, which can perpetuate harms such as the cyclical erasure of non-binary gender identities. These harms are driven by model and dataset biases, which are consequences of the non-recognition and lack of understanding of non-binary genders in society. In this paper, we explain the complexity of gender and language around it, and survey non-binary persons to understand harms associated with the treatment of gender as binary in English language technologies. We also detail how current language representations (e.g., GloVe, BERT) capture and perpetuate these harms and related challenges that need to be acknowledged and addressed for representations to equitably encode gender information.

References used

https://aclanthology.org/

Visual News: Benchmark and Challenges in News Image Captioning

Association for Computational Linguistics - 2021 - Article

We propose Visual News Captioner, an entity-aware model for the task of news image captioning. We also introduce Visual News, a large-scale benchmark consisting of more than one million news images along with associated news articles, image captions, author information, and other metadata. Unlike the standard image captioning task, news images depict situations where people, locations, and events are of paramount importance. Our proposed method can effectively combine visual and textual features to generate captions with richer information such as events and entities. More specifically, built upon the Transformer architecture, our model is further equipped with novel multi-modal feature fusion techniques and attention mechanisms, which are designed to generate named entities more accurately. Our method uses far fewer parameters while achieving slightly better prediction results than competing methods. Our larger and more diverse Visual News dataset further highlights the remaining challenges in captioning news images.

Keywords: image captioning task

Investigating the Impact of Gender Representation in ASR Training Data: a Case Study on Librispeech

Association for Computational Linguistics - 2021 - Article

In this paper we question the impact of gender representation in training data on the performance of an end-to-end ASR system. We create an experiment based on the Librispeech corpus and build three different training corpora varying only the proportion of data produced by each gender category. We observe that although our system is overall robust to gender balance or imbalance in the training data, it nonetheless depends on the match between the individuals present in the training and testing sets.

Keywords: ASR training data, gender representation

Gender Bias in Natural Language Processing Across Human Languages

Association for Computational Linguistics - 2021 - Article

Natural Language Processing (NLP) systems are at the heart of many critical automated decision-making systems making crucial recommendations about our future world. Gender bias in NLP has been well studied in English, but has been less studied in other languages. In this paper, a team including speakers of 9 languages - Chinese, Spanish, English, Arabic, German, French, Farsi, Urdu, and Wolof - reports and analyzes measurements of gender bias in the Wikipedia corpora for these 9 languages. We develop extensions to profession-level and corpus-level gender bias metric calculations originally designed for English and apply them to 8 other languages, including languages that have grammatically gendered nouns including different feminine, masculine, and neuter profession words. We discuss future work that would benefit immensely from a computational linguistics perspective.

Keywords: language processing, human languages

EMBEDDIA Tools, Datasets and Challenges: Resources and Hackathon Contributions

Association for Computational Linguistics - 2021 - Article

This paper presents tools and data sources collected and released by the EMBEDDIA project, supported by the European Union's Horizon 2020 research and innovation program. The collected resources were offered to participants of a hackathon organized as part of the EACL Hackashop on News Media Content Analysis and Automated Report Generation in February 2021. The hackathon had six participating teams who addressed different challenges, either from the list of proposed challenges or their own news-industry-related tasks. This paper goes beyond the scope of the hackathon, as it brings together in a coherent and compact form most of the resources developed, collected and released by the EMBEDDIA project. Moreover, it constitutes a handy source for news media industry and researchers in the fields of Natural Language Processing and Social Science.

Keywords: European Union Horizon, hackathon contributions, EMBEDDIA tools

Gender and Representation Bias in GPT-3 Generated Stories

Association for Computational Linguistics - 2021 - Article

Using topic modeling and lexicon-based word similarity, we find that stories generated by GPT-3 exhibit many known gender stereotypes. Generated stories depict different topics and descriptions depending on GPT-3's perceived gender of the character in a prompt, with feminine characters more likely to be associated with family and appearance, and described as less powerful than masculine characters, even when associated with high power verbs in a prompt. Our study raises questions on how one can avoid unintended social biases when using large language models for storytelling.

Keywords: representation bias, generated stories
