Do you want to publish a course? Click here

Link Prediction on N-ary Relational Data Based on Relatedness Evaluation

88   0   0.0 ( 0 )
 Added by Saiping Guan
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

With the overwhelming popularity of Knowledge Graphs (KGs), researchers have poured attention to link prediction to fill in missing facts for a long time. However, they mainly focus on link prediction on binary relational data, where facts are usually represented as triples in the form of (head entity, relation, tail entity). In practice, n-ary relational facts are also ubiquitous. When encountering such facts, existing studies usually decompose them into triples by introducing a multitude of auxiliary virtual entities and additional triples. These



rate research

Read More

Link prediction on knowledge graphs (KGs) is a key research topic. Previous work mainly focused on binary relations, paying less attention to higher-arity relations although they are ubiquitous in real-world KGs. This paper considers link prediction upon n-ary relational facts and proposes a graph-based approach to this task. The key to our approach is to represent the n-ary structure of a fact as a small heterogeneous graph, and model this graph with edge-biased fully-connected attention. The fully-connected attention captures universal inter-vertex interactions, while with edge-aware attentive biases to particularly encode the graph structure and its heterogeneity. In this fashion, our approach fully models global and local dependencies in each n-ary fact, and hence can more effectively capture associations therein. Extensive evaluation verifies the effectiveness and superiority of our approach. It performs substantially and consistently better than current state-of-the-art across a variety of n-ary relational benchmarks. Our code is publicly available.
Tensor, an extension of the vector and matrix to the multi-dimensional case, is a natural way to describe the N-ary relational data. Recently, tensor decomposition methods have been introduced into N-ary relational data and become state-of-the-art on embedding learning. However, the performance of existing tensor decomposition methods is not as good as desired. First, they suffer from the data-sparsity issue since they can only learn from the N-ary relational data with a specific arity, i.e., parts of common N-ary relational data. Besides, they are neither effective nor efficient enough to be trained due to the over-parameterization problem. In this paper, we propose a novel method, i.e., S2S, for effectively and efficiently learning from the N-ary relational data. Specifically, we propose a new tensor decomposition framework, which allows embedding sharing to learn from facts with mixed arity. Since the core tensors may still suffer from the over-parameterization, we propose to reduce parameters by sparsifying the core tensors while retaining their expressive power using neural architecture search (NAS) techniques, which can search for data-dependent architectures. As a result, the proposed S2S not only guarantees to be expressive but also efficiently learns from mixed arity. Finally, empirical results have demonstrated that S2S is efficient to train and achieves state-of-the-art performance.
Multi-relational graph is a ubiquitous and important data structure, allowing flexible representation of multiple types of interactions and relations between entities. Similar to other graph-structured data, link prediction is one of the most important tasks on multi-relational graphs and is often used for knowledge completion. When related graphs coexist, it is of great benefit to build a larger graph via integrating the smaller ones. The integration requires predicting hidden relational connections between entities belonged to different graphs (inter-domain link prediction). However, this poses a real challenge to existing methods that are exclusively designed for link prediction between entities of the same graph only (intra-domain link prediction). In this study, we propose a new approach to tackle the inter-domain link prediction problem by softly aligning the entity distributions between different domains with optimal transport and maximum mean discrepancy regularizers. Experiments on real-world datasets show that optimal transport regularizer is beneficial and considerably improves the performance of baseline methods.
For many years, link prediction on knowledge graphs (KGs) has been a purely transductive task, not allowing for reasoning on unseen entities. Recently, increasing efforts are put into exploring semi- and fully inductive scenarios, enabling inference over unseen and emerging entities. Still, all these approaches only consider triple-based glspl{kg}, whereas their richer counterparts, hyper-relational KGs (e.g., Wikidata), have not yet been properly studied. In this work, we classify different inductive settings and study the benefits of employing hyper-relational KGs on a wide range of semi- and fully inductive link prediction tasks powered by recent advancements in graph neural networks. Our experiments on a novel set of benchmarks show that qualifiers over typed edges can lead to performance improvements of 6% of absolute gains (for the Hits@10 metric) compared to triple-only baselines. Our code is available at url{https://github.com/mali-git/hyper_relational_ilp}.
Currently, Chronic Kidney Disease (CKD) is experiencing a globally increasing incidence and high cost to health systems. A delayed recognition implies premature mortality due to progressive loss of kidney function. The employment of data mining to discover subtle patterns in CKD indicators would contribute achieving early diagnosis. This work presents the development and evaluation of an explainable prediction model that would support clinicians in the early diagnosis of CKD patients. The model development is based on a data management pipeline that detects the best combination of ensemble trees algorithms and features selected concerning classification performance. The results obtained through the pipeline equals the performance of the best CKD prediction models identified in the literature. Furthermore, the main contribution of the paper involves an explainability-driven approach that allows selecting the best prediction model maintaining a balance between accuracy and explainability. Therefore, the most balanced explainable prediction model of CKD implements an XGBoost classifier over a group of 4 features (packed cell value, specific gravity, albumin, and hypertension), achieving an accuracy of 98.9% and 97.5% with cross-validation technique and with new unseen data respectively. In addition, by analysing the models explainability by means of different post-hoc techniques, the packed cell value and the specific gravity are determined as the most relevant features that influence the prediction results of the model. This small number of feature selected results in a reduced cost of the early diagnosis of CKD implying a promising solution for developing countries.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا