ﻻ يوجد ملخص باللغة العربية
Images are more than a collection of objects or attributes -- they represent a web of relationships among interconnected objects. Scene Graph has emerged as a new modality for a structured graphical representation of images. Scene Graph encodes objects as nodes connected via pairwise relations as edges. To support question answering on scene graphs, we propose GraphVQA, a language-guided graph neural network framework that translates and executes a natural language question as multiple iterations of message passing among graph nodes. We explore the design space of GraphVQA framework, and discuss the trade-off of different design choices. Our experiments on GQA dataset show that GraphVQA outperforms the state-of-the-art model by a large margin (88.43% vs. 94.78%).
Fact-based Visual Question Answering (FVQA), a challenging variant of VQA, requires a QA-system to include facts from a diverse knowledge graph (KG) in its reasoning process to produce an answer. Large KGs, especially common-sense KGs, are known to b
This paper studies the task of Visual Question Answering (VQA), which is topical in Multimedia community recently. Particularly, we explore two critical research problems existed in VQA: (1) efficiently fusing the visual and textual modalities; (2) e
In this paper, we present Hierarchical Graph Network (HGN) for multi-hop question answering. To aggregate clues from scattered texts across multiple paragraphs, a hierarchical graph is created by constructing nodes on different levels of granularity
Incorporating external knowledge to Visual Question Answering (VQA) has become a vital practical need. Existing methods mostly adopt pipeline approaches with different components for knowledge matching and extraction, feature learning, etc.However, s
Commonsense question answering (QA) requires a model to grasp commonsense and factual knowledge to answer questions about world events. Many prior methods couple language modeling with knowledge graphs (KG). However, although a KG contains rich struc