A Quadratic Assignment Formulation of the Graph Edit Distance

141 0 0.0 ( 0 )

Download Cite

Added by Sebastien Bougleux

Publication date 2015

fields Informatics Engineering

and research's language is English

Authors Sebastien Bougleux - Luc Brun - Vincenzo Carletti

Data Structures and Algorithms

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Computing efficiently a robust measure of similarity or dissimilarity between graphs is a major challenge in Pattern Recognition. The Graph Edit Distance (GED) is a flexible measure of dissimilarity between graphs which arises in error-tolerant graph matching. It is defined from an optimal sequence of edit operations (edit path) transforming one graph into an other. Unfortunately, the exact computation of this measure is NP-hard. In the last decade, several approaches have been proposed to approximate the GED in polynomial time, mainly by solving linear programming problems. Among them, the bipartite GED has received much attention. It is deduced from a linear sum assignment of the nodes of the two graphs, which can be efficiently computed by Hungarian-type algorithms. However, edit operations on nodes and edges are not handled simultaneously, which limits the accuracy of the approximation. To overcome this limitation, we propose to extend the linear assignment model to a quadratic one, for directed or undirected graphs having labelized nodes and edges. This is realized through the definition of a family of edit paths induced by assignments between nodes. We formally show that the GED, restricted to the paths in this family, is equivalent to a quadratic assignment problem. Since this problem is NP-hard, we propose to compute an approximate solution by an adaptation of the Integer Projected Fixed Point method. Experiments show that the proposed approach is generally able to reach a more accurate approximation of the optimal GED than the bipartite GED, with a computational cost that is still affordable for graphs of non trivial sizes.

rate research

Approximating Edit Distance Within Constant Factor in Truly Sub-Quadratic Time

58 - Diptarka Chakraborty , Debarati Das , Elazar Goldenberg 2018

Edit distance is a measure of similarity of two strings based on the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. The edit distance can be computed exactly using a dynamic programming algorithm that runs in quadratic time. Andoni, Krauthgamer, and Onak (2010) gave a nearly linear time algorithm that approximates edit distance within an approximation factor $text{poly}(log n)$. In this paper, we provide an algorithm with running time $tilde{O}(n^{2-2/7})$ that approximates the edit distance within a constant factor.

Data Structures and Algorithms

Graph Edit Distance Reward: Learning to Edit Scene Graph

83 - Lichang Chen , Guosheng Lin , Shijie Wang 2020

Scene Graph, as a vital tool to bridge the gap between language domain and image domain, has been widely adopted in the cross-modality task like VQA. In this paper, we propose a new method to edit the scene graph according to the user instructions, which has never been explored. To be specific, in order to learn editing scene graphs as the semantics given by texts, we propose a Graph Edit Distance Reward, which is based on the Policy Gradient and Graph Matching algorithm, to optimize neural symbolic model. In the context of text-editing image retrieval, we validate the effectiveness of our method in CSS and CRIR dataset. Besides, CRIR is a new synthetic dataset generated by us, which we will publish it soon for future use.

Computer Vision and Pattern Recognition

A Simple Sublinear Algorithm for Gap Edit Distance

339 - Joshua Brakensiek , Moses Charikar , Aviad Rubinstein 2020

We study the problem of estimating the edit distance between two $n$-character strings. While exact computation in the worst case is believed to require near-quadratic time, previous work showed that in certain regimes it is possible to solve the following {em gap edit distance} problem in sub-linear time: distinguish between inputs of distance $le k$ and $>k^2$. Our main result is a very simple algorithm for this benchmark that runs in time $tilde O(n/sqrt{k})$, and in particular settles the open problem of obtaining a truly sublinear time for the entire range of relevant $k$. Building on the same framework, we also obtain a $k$-vs-$k^2$ algorithm for the one-sided preprocessing model with $tilde O(n)$ preprocessing time and $tilde O(n/k)$ query time (improving over a recent $tilde O(n/k+k^2)$-query time algorithm for the same problem [GRS20].

Data Structures and Algorithms

On the edit distance function of the random graph

84 - Ryan R. Martin , Alexander W. N. Riasanovsky 2020

Given a hereditary property of graphs $mathcal{H}$ and a $pin [0,1]$, the edit distance function ${rm ed}_{mathcal{H}}(p)$ is asymptotically the maximum proportion of edge-additions plus edge-deletions applied to a graph of edge density $p$ sufficient to ensure that the resulting graph satisfies $mathcal{H}$. The edit distance function is directly related to other well-studied quantities such as the speed function for $mathcal{H}$ and the $mathcal{H}$-chromatic number of a random graph. Let $mathcal{H}$ be the property of forbidding an ErdH{o}s-R{e}nyi random graph $Fsim mathbb{G}(n_0,p_0)$, and let $varphi$ represent the golden ratio. In this paper, we show that if $p_0in [1-1/varphi,1/varphi]$, then a.a.s. as $n_0toinfty$, begin{align*} {rm ed}_{mathcal{H}}(p) = (1+o(1)),frac{2log n_0}{n_0} cdotminleft{ frac{p}{-log(1-p_0)}, frac{1-p}{-log p_0} right}. end{align*} Moreover, this holds for $pin [1/3,2/3]$ for any $p_0in (0,1)$.

Combinatorics Probability

Space efficient streaming algorithms for the distance to monotonicity and asymmetric edit distance

495 - Michael Saks , C. Seshadhri 2012

Approximating the length of the longest increasing sequence (LIS) of an array is a well-studied problem. We study this problem in the data stream model, where the algorithm is allowed to make a single left-to-right pass through the array and the key resource to be minimized is the amount of additional memory used. We present an algorithm which, for any $delta > 0$, given streaming access to an array of length $n$ provides a $(1+delta)$-multiplicative approximation to the emph{distance to monotonicity} ($n$ minus the length of the LIS), and uses only $O((log^2 n)/delta)$ space. The previous best known approximation using polylogarithmic space was a multiplicative 2-factor. Our algorithm can be used to estimate the length of the LIS to within an additive $delta n$ for any $delta >0$ while previous algorithms could only achieve additive error $n(1/2-o(1))$. Our algorithm is very simple, being just 3 lines of pseudocode, and has a small update time. It is essentially a polylogarithmic space approximate implementation of a classic dynamic program that computes the LIS. We also give a streaming algorithm for approximating $LCS(x,y)$, the length of the longest common subsequence between strings $x$ and $y$, each of length $n$. Our algorithm works in the asymmetric setting (inspired by cite{AKO10}), in which we have random access to $y$ and streaming access to $x$, and runs in small space provided that no single symbol appears very often in $y$. More precisely, it gives an additive-$delta n$ approximation to $LCS(x,y)$ (and hence also to $E(x,y) = n-LCS(x,y)$, the edit distance between $x$ and $y$ when insertions and deletions, but not substitutions, are allowed), with space complexity $O(k(log^2 n)/delta)$, where $k$ is the maximum number of times any one symbol appears in $y$.

Data Structures and Algorithms Discrete Mathematics