Structural Generalizability: The Case of Similarity Search


Abstract in English

Graph similarity search algorithms usually leverage the structural properties of a database. As a result, they are effective only on some structural variations of the data and ineffective on others, which makes them hard to use. Ideally, a data analytics algorithm should be structurally robust, i.e., it should return essentially the same accurate results over all possible structural variations of a dataset. We propose a novel approach to creating a structurally robust similarity search algorithm over graph databases. We leverage the classic insight from the database literature that schematic variations are caused by constraints in the database. We then present the RelSim algorithm, which is provably structurally robust under these variations. Our empirical studies show that our proposed algorithms are structurally robust while being efficient and as effective as, or more effective than, state-of-the-art similarity search algorithms.
