User commenting is a valuable feature of many news outlets: it gives them a channel of contact with readers and lets readers express opinions, offer different viewpoints, and even supply complementary information. Yet large volumes of user comments are hard to filter, let alone read and mine for relevant information. Research on summarizing user comments is still in its infancy, and human-created summarization datasets are scarce, especially for less-resourced languages. To address this issue, we propose an unsupervised approach to user comment summarization that combines a modern multilingual sentence representation with standard extractive summarization techniques. Our comparison of different sentence representation approaches coupled with different summarization approaches shows that the most successful combinations are the same for news and comment summarization. The empirical results and the presented visualisation show the usefulness of the proposed methodology for several languages.
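The general idea of pairing a sentence representation with an extractive technique can be illustrated with a minimal sketch. It rests on loud assumptions: `embed` is a hypothetical toy bag-of-words stand-in for a multilingual sentence encoder, and centroid ranking is just one standard extractive technique, not necessarily the combination the abstract reports as most successful. The function names and the parameter `k` are illustrative only.

```python
import math
from collections import Counter


def embed(sentence):
    """Hypothetical stand-in for a multilingual sentence encoder.

    Returns bag-of-words counts; a real system would return a dense vector.
    """
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(comments, k=2):
    """Pick the k comments closest to the centroid of all comments."""
    vectors = [embed(c) for c in comments]
    # Sum of all vectors; cosine ignores scale, so the sum acts as the mean.
    centroid = Counter()
    for v in vectors:
        centroid.update(v)
    scores = [cosine(v, centroid) for v in vectors]
    # Rank by similarity to the centroid and keep the k best indices.
    top = sorted(range(len(comments)), key=lambda i: scores[i], reverse=True)[:k]
    # Return the selected comments in their original order.
    return [comments[i] for i in sorted(top)]
```

Calling `summarize(comments, k=3)` on a list of comment strings returns three of them unchanged, which is what makes the summary extractive. Swapping `embed` for a real multilingual encoder would leave the selection logic untouched, since it needs only a similarity score between each comment and the centroid.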
References used
https://aclanthology.org/