Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

LT3 at SemEval-2021 Task 6: Using Multi-Modal Compact Bilinear Pooling to Combine Visual and Textual Understanding in Memes

LT3 في مهمة Semeval-2021 6: استخدام تجمع Bilinear المدمج متعدد الوسائط للجمع بين الفهم البصري والنصوص في الميمات

403 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

combine visual visual and textual textual understanding الجمع بين البصرية مرئي ونص فهم نصي صناعة حمض الفوسفور

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Internet memes have become ubiquitous in social media networks today. Due to their popularity, they are also a widely used mode of expression to spread disinformation online. As memes consist of a mixture of text and image, they require a multi-modal approach for automatic analysis. In this paper, we describe our contribution to the SemEval-2021 Detection of Persuasian Techniques in Texts and Images Task. We propose a Multi-Modal learning system, which incorporates memebeddings'', viz. joint text and vision features by combining them with compact bilinear pooling, to automatically identify rhetorical and psychological disinformation techniques. The experimental results show that the proposed system constantly outperforms the competition's baseline, and achieves the 2nd best Macro F1-score and 14th best Micro F1-score out of all participants.

References used

https://aclanthology.org/

rate research

1213Li at SemEval-2021 Task 6: Detection of Propaganda with Multi-modal Attention and Pre-trained Models

996 - Association for Computation Linguistics 2021 مقالة

This paper presents the solution proposed by the 1213Li team for subtask 3 in SemEval-2021 Task 6: identifying the multiple persuasion techniques used in the multi-modal content of the meme. We explored various approaches in feature extraction and th e detection of persuasion labels. Our final model employs pre-trained models including RoBERTa and ResNet-50 as a feature extractor for texts and images, respectively, and adopts a label embedding layer with multi-modal attention mechanism to measure the similarity of labels with the multi-modal information and fuse features for label prediction. Our proposed method outperforms the provided baseline method and achieves 3rd out of 16 participants with 0.54860/0.22830 for Micro/Macro F1 scores.

detection of propaganda multi-modal attention propaganda with multi-modal اهتمام متعدد الوسائط دعاية مع متعددة مشروط صناعة حمض الفوسفور

AIMH at SemEval-2021 Task 6: Multimodal Classification Using an Ensemble of Transformer Models

691 - Association for Computation Linguistics 2021 مقالة

This paper describes the system used by the AIMH Team to approach the SemEval Task 6. We propose an approach that relies on an architecture based on the transformer model to process multimodal content (text and images) in memes. Our architecture, cal led DVTT (Double Visual Textual Transformer), approaches Subtasks 1 and 3 of Task 6 as multi-label classification problems, where the text and/or images of the meme are processed, and the probabilities of the presence of each possible persuasion technique are returned as a result. DVTT uses two complete networks of transformers that work on text and images that are mutually conditioned. One of the two modalities acts as the main one and the second one intervenes to enrich the first one, thus obtaining two distinct ways of operation. The two transformers outputs are merged by averaging the inferred probabilities for each possible label, and the overall network is trained end-to-end with a binary cross-entropy loss.

aimh team visual textual transformer double visual textual فريق Aimh محول البصرية النصية ضعف المرئي النصية صناعة حمض الفوسفور المزيد..

Volta at SemEval-2021 Task 6: Towards Detecting Persuasive Texts and Images using Textual and Multimodal Ensemble

592 - Association for Computation Linguistics 2021 مقالة

Memes are one of the most popular types of content used to spread information online. They can influence a large number of people through rhetorical and psychological techniques. The task, Detection of Persuasion Techniques in Texts and Images, is to detect these persuasive techniques in memes. It consists of three subtasks: (A) Multi-label classification using textual content, (B) Multi-label classification and span identification using textual content, and (C) Multi-label classification using visual and textual content. In this paper, we propose a transfer learning approach to fine-tune BERT-based models in different modalities. We also explore the effectiveness of ensembles of models trained in different modalities. We achieve an F1-score of 57.0, 48.2, and 52.1 in the corresponding subtasks.

detecting persuasive texts detecting persuasive الكشف عن النصوص المقنعة الكشف عن إقناع صناعة حمض الفوسفور

YNU-HPCC at SemEval-2021 Task 6: Combining ALBERT and Text-CNN for Persuasion Detection in Texts and Images

621 - Association for Computation Linguistics 2021 مقالة

In recent years, memes combining image and text have been widely used in social media, and memes are one of the most popular types of content used in online disinformation campaigns. In this paper, our study on the detection of persuasion techniques in texts and images in SemEval-2021 Task 6 is summarized. For propaganda technology detection in text, we propose a combination model of both ALBERT and Text CNN for text classification, as well as a BERT-based multi-task sequence labeling model for propaganda technology coverage span detection. For the meme classification task involved in text understanding and visual feature extraction, we designed a parallel channel model divided into text and image channels. Our method achieved a good performance on subtasks 1 and 3. The micro F1-scores of 0.492, 0.091, and 0.446 achieved on the test sets of the three subtasks ranked 12th, 7th, and 11th, respectively, and all are higher than the baseline model.

memes combining image combining albert الميم يجمع بين الصورة الجمع بين ألبرت صناعة حمض الفوسفور

WVOQ at SemEval-2021 Task 6: BART for Span Detection and Classification

1149 - Association for Computation Linguistics 2021 مقالة

Simultaneous span detection and classification is a task not currently addressed in standard NLP frameworks. The present paper describes why and how an EncoderDecoder model was used to combine span detection and classification to address subtask 2 of SemEval-2021 Task 6.

span detection detection and classification simultaneous span detection سبان الكشف الكشف والتصنيف اكتشاف سبان في وقت واحد. صناعة حمض الفوسفور المزيد..

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

LT3 at SemEval-2021 Task 6: Using Multi-Modal Compact Bilinear Pooling to Combine Visual and Textual Understanding in Memes

LT3 في مهمة Semeval-2021 6: استخدام تجمع Bilinear المدمج متعدد الوسائط للجمع بين الفهم البصري والنصوص في الميمات

Ask ChatGPT about the research

Read More

suggested questions