Representations of geographic entities captured in popular knowledge graphs such as Wikidata and DBpedia are often incomplete. OpenStreetMap (OSM) is a rich source of openly available, volunteered geographic information that has a high potential to complement these representations. However, identity links between the knowledge graph entities and OSM nodes are still rare. The problem of link discovery in these settings is particularly challenging due to the lack of a strict schema and the heterogeneity of the user-defined node representations in OSM. In this article, we propose OSM2KG, a novel link discovery approach to predict identity links between OSM nodes and geographic entities in a knowledge graph. The core of the OSM2KG approach is a novel latent, compact representation of OSM nodes that captures semantic node similarity in an embedding. OSM2KG adopts this latent representation to train a supervised model for link prediction and utilises existing links between OSM and knowledge graphs for training. Our experiments, conducted on several OSM datasets as well as the Wikidata and DBpedia knowledge graphs, demonstrate that OSM2KG can reliably discover identity links. On average, OSM2KG achieves an F1 score of 92.05% on Wikidata and 94.17% on DBpedia, which corresponds to a 21.82 percentage point increase in F1 score on Wikidata compared to the best performing baselines.
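To make the supervised link prediction setting concrete, the following is a minimal sketch and not the OSM2KG implementation. It assumes that each OSM node is compared against a few candidate knowledge graph entities, that already linked nodes serve as training data, and that two hand-made features (geographic distance and exact name match) stand in for the learned latent node representation, which the abstract does not specify. All identifiers, names, and coordinates below are hypothetical toy values.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def pair_features(node, entity):
    """Assumed features for one (OSM node, KG entity) candidate pair.
    A learned embedding of the node's tags would replace these in practice."""
    dist = haversine_km(node["lat"], node["lon"], entity["lat"], entity["lon"])
    same_name = float(node["tags"].get("name", "").lower() == entity["label"].lower())
    return [1.0, dist, same_name]  # the leading 1.0 acts as the bias term

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def score(weights, features):
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def train(samples, labels, lr=0.5, epochs=500):
    """Logistic regression fitted with plain stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = score(weights, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights

def predict_link(node, candidates, weights):
    """Return the id of the candidate entity with the highest link score."""
    best = max(candidates, key=lambda e: score(weights, pair_features(node, e)))
    return best["id"]

# Hypothetical knowledge graph entities (toy data).
entities = [
    {"id": "E1", "label": "Old Mill", "lat": 50.0000, "lon": 8.0000},
    {"id": "E2", "label": "Harbour Museum", "lat": 50.0100, "lon": 8.0000},
    {"id": "E3", "label": "North Station", "lat": 50.0000, "lon": 8.0200},
]

# Hypothetical OSM nodes that already carry an identity link (training data).
linked_nodes = [
    ({"lat": 50.0001, "lon": 8.0001, "tags": {"name": "Old Mill"}}, "E1"),
    ({"lat": 50.0101, "lon": 8.0000, "tags": {"name": "Harbour Museum"}}, "E2"),
]

# Every (node, entity) pair is a sample; the linked entity is the positive one.
samples, labels = [], []
for node, linked_id in linked_nodes:
    for entity in entities:
        samples.append(pair_features(node, entity))
        labels.append(1.0 if entity["id"] == linked_id else 0.0)

weights = train(samples, labels)

# An unlinked node for which a link should be predicted.
new_node = {"lat": 50.0001, "lon": 8.0201, "tags": {"name": "North Station"}}
predicted = predict_link(new_node, entities, weights)
```

The sketch ranks candidates by a learned score rather than by a fixed distance threshold, which is the point of using existing links for training; how candidates are generated and which features or embeddings are used are left open here.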
OpenStreetMap (OSM) is one of the richest openly available sources of volunteered geographic information. Although OSM includes various geographical entities, their descriptions are highly heterogeneous, incomplete, and do not follow any well-defined
Accurate understanding and forecasting of traffic is a key contemporary problem for policymakers. Road networks are increasingly congested, yet traffic data is often expensive to obtain, making informed policy-making harder. This paper explores the e
In an era of heterogeneous data, novel methods and volunteered geographic information provide opportunities to understand how people interact with a place. However, it is not enough to simply have such heterogeneous data; instead, an understanding of
Accurate modelling of local population movement patterns is a core contemporary concern for urban policymakers, affecting both the short-term deployment of public transport resources and the longer-term planning of transport infrastructure. Yet, whil
This paper investigates the problem of utilizing network topology and partial timestamps to detect the information source in a network. The problem incurs prohibitive cost under canonical maximum likelihood estimation (MLE) of the source due to the e