Regex Queries over Incomplete Knowledge Bases


Abstract in English

We propose the novel task of answering regular expression queries (containing disjunction ($vee$) and Kleene plus ($+$) operators) over incomplete KBs. The answer set of these queries potentially has a large number of entities, hence previous works for single-hop queries in KBC that model a query as a point in high-dimensional space are not as effective. In response, we develop RotatE-Box -- a novel combination of RotatE and box embeddings. It can model more relational inference patterns compared to existing embedding based models. Furthermore, we define baseline approaches for embedding based KBC models to handle regex operators. We demonstrate performance of RotatE-Box on two new regex-query datasets introduced in this paper, including one where the queries are harvested based on actual user query logs. We find that our final RotatE-Box model significantly outperforms models based on just RotatE and just box embeddings.

Download