ﻻ يوجد ملخص باللغة العربية
Knowledge bases of entities and relations (either constructed manually or automatically) are behind many real world search engines, including those at Yahoo!, Microsoft, and Google. Those knowledge bases can be viewed as graphs with nodes representing entities and edges representing (primary) relationships, and various studies have been conducted on how to leverage them to answer entity seeking queries. Meanwhile, in a complementary direction, analyses over the query logs have enabled researchers to identify entity pairs that are statistically correlated. Such entity relationships are then presented to search users through the related searches feature in modern search engines. However, entity relationships thus discovered can often be puzzling to the users because why the entities are connected is often indescribable. In this paper, we propose a novel problem called entity relationship explanation, which seeks to explain why a pair of entities are connected, and solve this challenging problem by integrating the above two complementary approaches, i.e., we leverage the knowledge base to explain the connections discovered between entity pairs. More specifically, we present REX, a system that takes a pair of entities in a given knowledge base as input and efficiently identifies a ranked list of relationship explanations. We formally define relationship explanations and analyze their desirable properties. Furthermore, we design and implement algorithms to efficiently enumerate and rank all relationship explanations based on multiple measures of interestingness. We perform extensive experiments over real web-scale data gathered from DBpedia and a commercial search engine, demonstrating the efficiency and scalability of REX. We also perform user studies to corroborate the effectiveness of explanations generated by REX.
In todays Web and social network environments, query workloads include ad hoc and OLAP queries, as well as iterative algorithms that analyze data relationships (e.g., link analysis, clustering, learning). Modern DBMSs support ad hoc and OLAP queries,
Trajectory-based spatiotemporal entity linking is to match the same moving object in different datasets based on their movement traces. It is a fundamental step to support spatiotemporal data integration and analysis. In this paper, we study the prob
Entity resolution (ER) is the problem of identifying and merging records that refer to the same real-world entity. In many scenarios, raw records are stored under heterogeneous environment. Specifically, the schemas of records may differ from each ot
Document subject classification is essential for structuring (digital) libraries and allowing readers to search within a specific field. Currently, the classification is typically made by human domain experts. Semi-supervised Machine Learning algorit
Probabilistic databases play a preeminent role in the processing and management of uncertain data. Recently, many database research efforts have integrated probabilistic models into databases to support tasks such as information extraction and labeli