Online conversations include more than just text. Increasingly, image-based responses such as memes and animated GIFs serve as culturally recognized and often humorous responses in conversation. However, while NLP has broadened to multimodal models, conversational dialog systems have largely focused on generating text replies only. Here, we introduce a new dataset of 1.56M text-GIF conversation turns and a new multimodal conversational model, Pepe the King Prawn, for selecting GIF-based replies. We demonstrate that our model produces relevant, high-quality GIF responses and, in a large randomized controlled trial of multiple models replying to real users, show that our model's GIF replies are significantly better received by the community.
References used
https://aclanthology.org/