Emotion recognition in conversation has received considerable attention recently because of its practical industrial applications. Existing methods tend to overlook the immediate mutual interaction between different speakers at the speaker-utterance level, or apply a single speaker-agnostic RNN to utterances from different speakers. We propose COIN, a conversational interactive model that mitigates this problem by applying state mutual interaction within history contexts. In addition, we introduce a stacked global interaction module to capture contextual and inter-dependency representations in a hierarchical manner. To improve robustness and generalization during training, we generate adversarial examples by applying minor perturbations to the multimodal feature inputs, unveiling the benefits of adversarial examples for emotion detection. Empirically, the proposed model achieves the current state-of-the-art results on the IEMOCAP benchmark dataset.
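The idea of perturbing feature inputs to create adversarial training examples can be illustrated with a minimal sketch. The abstract does not say how the perturbations are computed, so the fast-gradient-sign-style step below is an assumption, as are the linear-softmax classifier, the function names, the feature dimension, the number of emotion classes, and the value of epsilon. It is not the COIN implementation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D score vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(W, b, x, label):
    # Loss of a linear-softmax classifier (a stand-in for the real model).
    return -np.log(softmax(W @ x + b)[label])

def adversarial_features(W, b, x, label, epsilon=0.01):
    """Return x plus a small sign-gradient perturbation that increases the loss.

    Assumption: a fast-gradient-sign-style step; the source only says
    "minor perturbations on multimodal feature inputs".
    """
    probs = softmax(W @ x + b)
    one_hot = np.zeros_like(probs)
    one_hot[label] = 1.0
    # Gradient of the cross-entropy with respect to the input features.
    grad_x = W.T @ (probs - one_hot)
    return x + epsilon * np.sign(grad_x)

# Hypothetical setup: 4 emotion classes, a 6-dimensional fused feature vector.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))
b = np.zeros(4)
x = rng.normal(size=6)
label = 2

x_adv = adversarial_features(W, b, x, label)
loss_clean = cross_entropy(W, b, x, label)
loss_adv = cross_entropy(W, b, x_adv, label)
```

In adversarial training of this kind, the loss on `x_adv` is typically added to the loss on `x`, so the model learns to give stable predictions under small changes to its input features.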
References used
https://aclanthology.org/