تحذير: تحتوي هذه المقالة على محتويات قد تسيء إلى القراء.
الاستراتيجيات التي تنشر ضجيجا متعمدا في نص عند نشرها شائعة في فضاء الإنترنت، وأحيانا تهدف إلى السماح لبعض أفراد المجتمع فقط بفهم الدلالات الحقيقية.
في هذه الورقة، نستكشف الغرض من هذه الإجراءات عن طريق تصنيفها إلى حيل و ميمز وفلاتر والأكواد، وتنظيم الاستراتيجيات اللغوية المستخدمة في كل غرض.
من خلال ذلك، نحدد أن يتم إجراء هذه الاستراتيجيات من قبل مؤلفين لأغراض متعددة، فيما يتعلق بوجود أصحاب المصلحة مثل الأقران والآخرين.ونحلل أخيرا كيفية ظهور هذه الاستراتيجيات بشكل مختلف في كل ظرف من الظروف، إلى جانب الأمثلة المصاحبة للتصنيف الموحد.
WARNING: This article contains contents that may offend the readers. Strategies that insert intentional noise into text when posting it are commonly observed in the online space, and sometimes they aim to let only certain community users understand the genuine semantics. In this paper, we explore the purpose of such actions by categorizing them into tricks, memes, fillers, and codes, and organize the linguistic strategies that are used for each purpose. Through this, we identify that such strategies can be conducted by authors for multiple purposes, regarding the presence of stakeholders such as Peers' and Others'. We finally analyze how these strategies appear differently in each circumstance, along with the unified taxonomy accompanying examples.
References used
https://aclanthology.org/
Sensitivity of deep-neural models to input noise is known to be a challenging problem. In NLP, model performance often deteriorates with naturally occurring noise, such as spelling errors. To mitigate this issue, models may leverage artificially nois
Morphological analysis (MA) and lexical normalization (LN) are both important tasks for Japanese user-generated text (UGT). To evaluate and compare different MA/LN systems, we have constructed a publicly available Japanese UGT corpus. Our corpus comp
User-generated texts include various types of stylistic properties, or noises. Such texts are not properly processed by existing morpheme analyzers or language models based on formal texts such as encyclopedias or news articles. In this paper, we pro
We present an algorithm based on multi-layer transformers for identifying Adverse Drug Reactions (ADR) in social media data. Our model relies on the properties of the problem and the characteristics of contextual word embeddings to extract two views
Orthorectification is the process of geometrically correcting imagery for geometric
distortions which can be caused by topography, camera geometry, and sensor related
errors. The output of orthorectification has the same geometric characteristics o