ﻻ يوجد ملخص باللغة العربية
Linked text representation is critical for many intelligent web applications, such as online advertisement and recommender systems. Recent breakthroughs on pretrained language models and graph neural networks facilitate the development of corresponding techniques. However, the existing works mainly rely on cascaded model structures: the texts are independently encoded by language models at first, and the textual embeddings are further aggregated by graph neural networks. We argue that the neighbourhood information is insufficiently utilized within the above process, which restricts the representation quality. In this work, we propose GraphFormers, where graph neural networks are nested alongside each transformer layer of the language models. On top of the above architecture, the linked texts will iteratively extract neighbourhood information for the enhancement of their own semantics. Such an iterative workflow gives rise to more effective utilization of neighbourhood information, which contributes to the representation quality. We further introduce an adaptation called unidirectional GraphFormers, which is much more efficient and comparably effective; and we leverage a pretraining strategy called the neighbourhood-aware masked language modeling to enhance the training effect. We perform extensive experiment studies with three large-scale linked text datasets, whose results verify the effectiveness of our proposed methods.
Understanding human language is one of the key themes of artificial intelligence. For language representation, the capacity of effectively modeling the linguistic knowledge from the detail-riddled and lengthy texts and getting rid of the noises is es
Text generation has become one of the most important yet challenging tasks in natural language processing (NLP). The resurgence of deep learning has greatly advanced this field by neural generation models, especially the paradigm of pretrained langua
Purely character-based language models (LMs) have been lagging in quality on large scale datasets, and current state-of-the-art LMs rely on word tokenization. It has been assumed that injecting the prior knowledge of a tokenizer into the model is ess
Large-scale language models such as GPT-3 are excellent few-shot learners, allowing them to be controlled via natural text prompts. Recent studies report that prompt-based direct classification eliminates the need for fine-tuning but lacks data and i
Recently, text world games have been proposed to enable artificial agents to understand and reason about real-world scenarios. These text-based games are challenging for artificial agents, as it requires understanding and interaction using natural la