This paper describes the development of an online lexical resource to help detection systems regulate and curb the use of offensive words online. With the growing prevalence of social media platforms, many conversations are now conducted online. The increase in online conversations for leisure, work, and socializing has led to an increase in harassment. In particular, we create a specialized sense-based vocabulary of Japanese offensive words for the Open Multilingual Wordnet. This vocabulary expands on an existing list of Japanese offensive words and provides categorization and proper linking to synsets within the multilingual wordnet. The paper then discusses the evaluation of the vocabulary as a resource for representing and classifying offensive words, and as a possible resource for detecting offensive word use in social media.
References used
https://aclanthology.org/