Linear Sum Assignment with Edition


Abstract in English

We consider the problem of transforming a set of elements into another by a sequence of elementary edit operations, namely substitutions, removals and insertions of elements. Each possible edit operation is penalized by a non-negative cost and the cost of a transformation is measured by summing the costs of its operations. A solution to this problem consists in defining a transformation having a minimal cost, among all possible transformations. To compute such a solution, the classical approach consists in representing removal and insertion operations by augmenting the two sets so that they get the same size. This allows to express the problem as a linear sum assignment problem (LSAP), which thus finds an optimal bijection (or permutation, perfect matching) between the two augmented sets. While the LSAP is known to be efficiently solvable in polynomial time complexity, for instance with the Hungarian algorithm, useless time and memory are spent to treat the elements which have been added to the initial sets. In this report, we show that the problem can be formalized as an extension of the LSAP which considers only one additional element in each set to represent removal and insertion operations. A solution to the problem is no longer represented as a bijection between the two augmented sets. We show that the considered problem is a binary linear program (BLP) very close to the LSAP. While it can be solved by any BLP solver, we propose an adaptation of the Hungarian algorithm which improves the time and memory complexities previously obtained by the approach based on the LSAP. The importance of the improvement increases as the size of the two sets and their absolute difference increase. Based on the analysis of the problem presented in this report, other classical algorithms can be adapted.

Download