Do you want to publish a course? Click here

Time-dependent Entity Embedding is not All You Need: A Re-evaluation of Temporal Knowledge Graph Completion Models under a Unified Framework

تضمين الكيان التابع للوقت ليس كل ما تحتاجه: إعادة تقييم نماذج الإنجاز الرسم البياني للمعرفة الزمنية بموجب إطار موحد

862   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

Various temporal knowledge graph (KG) completion models have been proposed in the recent literature. The models usually contain two parts, a temporal embedding layer and a score function derived from existing static KG modeling approaches. Since the approaches differ along several dimensions, including different score functions and training strategies, the individual contributions of different temporal embedding techniques to model performance are not always clear. In this work, we systematically study six temporal embedding approaches and empirically quantify their performance across a wide range of configurations with about 3000 experiments and 13159 GPU hours. We classify the temporal embeddings into two classes: (1) timestamp embeddings and (2) time-dependent entity embeddings. Despite the common belief that the latter is more expressive, an extensive experimental study shows that timestamp embeddings can achieve on-par or even better performance with significantly fewer parameters. Moreover, we find that when trained appropriately, the relative performance differences between various temporal embeddings often shrink and sometimes even reverse when compared to prior results. For example, TTransE (CITATION), one of the first temporal KG models, can outperform more recent architectures on ICEWS datasets. To foster further research, we provide the first unified open-source framework for temporal KG completion models with full composability, where temporal embeddings, score functions, loss functions, regularizers, and the explicit modeling of reciprocal relations can be combined arbitrarily.



References used
https://aclanthology.org/
rate research

Read More

Static knowledge graph (SKG) embedding (SKGE) has been studied intensively in the past years. Recently, temporal knowledge graph (TKG) embedding (TKGE) has emerged. In this paper, we propose a Recursive Temporal Fact Embedding (RTFE) framework to tra nsplant SKGE models to TKGs and to enhance the performance of existing TKGE models for TKG completion. Different from previous work which ignores the continuity of states of TKG in time evolution, we treat the sequence of graphs as a Markov chain, which transitions from the previous state to the next state. RTFE takes the SKGE to initialize the embeddings of TKG. Then it recursively tracks the state transition of TKG by passing updated parameters/features between timestamps. Specifically, at each timestamp, we approximate the state transition as the gradient update process. Since RTFE learns each timestamp recursively, it can naturally transit to future timestamps. Experiments on five TKG datasets show the effectiveness of RTFE.
Representation learning approaches for knowledge graphs have been mostly designed for static data. However, many knowledge graphs involve evolving data, e.g., the fact (The President of the United States is Barack Obama) is valid only from 2009 to 20 17. This introduces important challenges for knowledge representation learning since the knowledge graphs change over time. In this paper, we present a novel time-aware knowledge graph embebdding approach, TeLM, which performs 4th-order tensor factorization of a Temporal knowledge graph using a Linear temporal regularizer and Multivector embeddings. Moreover, we investigate the effect of the temporal dataset's time granularity on temporal knowledge graph completion. Experimental results demonstrate that our proposed models trained with the linear temporal regularizer achieve the state-of-the-art performances on link prediction over four well-established temporal knowledge graph completion benchmarks.
This paper discusses different approaches to the Toxic Spans Detection task. The problem posed by the task was to determine which words contribute mostly to recognising a document as toxic. As opposed to binary classification of entire texts, word-le vel assessment could be of great use during comment moderation, also allowing for a more in-depth comprehension of the model's predictions. As the main goal was to ensure transparency and understanding, this paper focuses on the current state-of-the-art approaches based on the explainable AI concepts and compares them to a supervised learning solution with word-level labels. The work consists of two xAI approaches that automatically provide the explanation for models trained for binary classification of toxic documents: an LSTM model with attention as a model-specific approach and the Shapley values for interpreting BERT predictions as a model-agnostic method. The competing approach considers this problem as supervised token classification, where models like BERT and its modifications were tested. The paper aims to explore, compare and assess the quality of predictions for different methods on the task. The advantages of each approach and further research direction are also discussed.
We study the power of cross-attention in the Transformer architecture within the context of transfer learning for machine translation, and extend the findings of studies into cross-attention when training from scratch. We conduct a series of experime nts through fine-tuning a translation model on data where either the source or target language has changed. These experiments reveal that fine-tuning only the cross-attention parameters is nearly as effective as fine-tuning all parameters (i.e., the entire translation model). We provide insights into why this is the case and observe that limiting fine-tuning in this manner yields cross-lingually aligned embeddings. The implications of this finding for researchers and practitioners include a mitigation of catastrophic forgetting, the potential for zero-shot translation, and the ability to extend machine translation models to several new language pairs with reduced parameter storage overhead.
The adoption of Transformer-based models in natural language processing (NLP) has led to great success using a massive number of parameters. However, due to deployment constraints in edge devices, there has been a rising interest in the compression o f these models to improve their inference time and memory footprint. This paper presents a novel loss objective to compress token embeddings in the Transformer-based models by leveraging an AutoEncoder architecture. More specifically, we emphasize the importance of the direction of compressed embeddings with respect to original uncompressed embeddings. The proposed method is task-agnostic and does not require further language modeling pre-training. Our method significantly outperforms the commonly used SVD-based matrix-factorization approach in terms of initial language model Perplexity. Moreover, we evaluate our proposed approach over SQuAD v1.1 dataset and several downstream tasks from the GLUE benchmark, where we also outperform the baseline in most scenarios. Our code is public.

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا