We show tight bounds for online Hamming distance computation in the cell-probe model with word size w. The task is to output the Hamming distance between a fixed string of length n and the last n symbols of a stream. We give a lower bound of Omega((d/w)*log n) time on average per output, where d is the number of bits needed to represent an input symbol. We argue that this bound is tight within the model. The lower bound holds under randomisation and amortisation.
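To make the task concrete, the following is a minimal naive baseline, not the construction behind the bounds above. It recomputes the distance from scratch after every arriving symbol, costing O(n) per output. The function name `online_hamming` and the choice to produce no output until n symbols have arrived are assumptions made for illustration.

```python
from collections import deque


def online_hamming(pattern, stream):
    """Yield the Hamming distance between `pattern` and the last
    len(pattern) symbols of `stream`, once per arriving symbol.

    Assumption: nothing is reported until the window is full.
    """
    n = len(pattern)
    window = deque(maxlen=n)  # keeps only the n most recent symbols
    for symbol in stream:
        window.append(symbol)
        if len(window) == n:
            # Naive recomputation: compare position by position.
            yield sum(a != b for a, b in zip(pattern, window))


if __name__ == "__main__":
    print(list(online_hamming("abc", "abcabd")))
```

The lower bound in the abstract concerns how far below this per-output cost any data structure can go in the cell-probe model.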
We give cell-probe bounds for the computation of edit distance, Hamming distance, convolution and longest common subsequence in a stream. In this model, a fixed string of $n$ symbols is given and one $\delta$-bit symbol arrives at a time in a stream. After each symbol arrives, the distance between the fixed string and a suffix of most recent symbols of the stream is reported. The cell-probe model is perhaps the strongest model of computation for showing data structure lower bounds, subsuming in particular the popular word-RAM model. We first give an $\Omega((\delta \log n)/(w+\log\log n))$ lower bound for the time to give each output for both online Hamming distance and convolution, where $w$ is the word size. This bound relies on a new encoding scheme and for the first time holds even when $w$ is as small as a single bit. We then consider the online edit distance and longest common subsequence problems in the bit-probe model ($w=1$) with a constant-sized input alphabet. We give a lower bound of $\Omega(\sqrt{\log n}/(\log\log n)^{3/2})$ which applies to both problems. This second set of results relies on both our new encoding scheme and a carefully constructed hard distribution. Finally, for the online edit distance problem we show that there is an $O((\log n)^2/w)$ upper bound in the cell-probe model. This bound gives a contrast to our new lower bound and also establishes an exponential gap between the known cell-probe and RAM model complexities.
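As a point of reference for the streaming edit distance problem, here is a naive sketch that reruns the textbook dynamic program after each arriving symbol. It is not the $O((\log n)^2/w)$ cell-probe algorithm mentioned above. The abstract does not fix which suffix of the stream is compared, so the choice of the last $n$ symbols (or fewer, early in the stream) is an assumption, as are the function names.

```python
def edit_distance(a, b):
    """Standard two-row dynamic program for Levenshtein distance."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def stream_edit_distance(pattern, stream):
    """Yield the edit distance to the most recent symbols after each arrival.

    Assumption: the compared suffix is the last len(pattern) symbols.
    """
    n = len(pattern)
    recent = []
    for symbol in stream:
        recent.append(symbol)
        recent = recent[-n:]  # keep at most n symbols
        yield edit_distance(pattern, recent)


if __name__ == "__main__":
    print(list(stream_edit_distance("ab", "abb")))
```

Each output here costs O(n^2) time, which is the kind of gap the stated bounds are measured against.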
Vizing's celebrated theorem asserts that any graph of maximum degree $\Delta$ admits an edge coloring using at most $\Delta+1$ colors. In contrast, Bar-Noy, Naor and Motwani showed over a quarter century ago that the trivial greedy algorithm, which uses $2\Delta-1$ colors, is optimal among online algorithms. Their lower bound has a caveat, however: it only applies to low-degree graphs, with $\Delta=O(\log n)$, and they conjectured the existence of online algorithms using $\Delta(1+o(1))$ colors for $\Delta=\omega(\log n)$. Progress towards resolving this conjecture was only made under stochastic arrivals (Aggarwal et al., FOCS'03 and Bahmani et al., SODA'10). We resolve the above conjecture for \emph{adversarial} vertex arrivals in bipartite graphs, for which we present a $(1+o(1))\Delta$-edge-coloring algorithm for $\Delta=\omega(\log n)$ known a priori. Surprisingly, if $\Delta$ is not known ahead of time, we show that no $\big(\frac{e}{e-1} - \Omega(1) \big) \Delta$-edge-coloring algorithm exists. We then provide an optimal, $\big(\frac{e}{e-1}+o(1)\big)\Delta$-edge-coloring algorithm for unknown $\Delta=\omega(\log n)$. Key to our results, and of possible independent interest, is a novel fractional relaxation for edge coloring, for which we present optimal fractional online algorithms and a near-lossless online rounding scheme, yielding our optimal randomized algorithms.
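The trivial greedy algorithm referred to above can be sketched as follows. This is only the $2\Delta-1$ baseline, not the $(1+o(1))\Delta$ algorithm of the paper. The sketch processes edges one at a time, and the function name `greedy_edge_coloring` is hypothetical. Each new edge takes the smallest color not already used at either endpoint. Because each endpoint has at most $\Delta-1$ other edges, at most $2\Delta-2$ colors are blocked.

```python
from collections import defaultdict


def greedy_edge_coloring(edges):
    """Color edges online: each edge gets the smallest color that is
    unused at both of its endpoints. Colors are integers starting at 0."""
    used = defaultdict(set)  # vertex -> colors already present at that vertex
    coloring = {}
    for u, v in edges:
        blocked = used[u] | used[v]
        color = 0
        while color in blocked:
            color += 1
        coloring[(u, v)] = color
        used[u].add(color)
        used[v].add(color)
    return coloring


if __name__ == "__main__":
    # Path a-b-c-d with the middle edge revealed last (maximum degree 2).
    print(greedy_edge_coloring([("a", "b"), ("c", "d"), ("b", "c")]))
```

On the path in the usage example the maximum degree is 2, so the greedy rule never needs a color index above $2\Delta-2 = 2$.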
We consider the following online optimization problem. We are given a graph $G$ and each vertex of the graph is assigned to one of $\ell$ servers, where servers have capacity $k$ and we assume that the graph has $\ell \cdot k$ vertices. Initially, $G$ does not contain any edges and then the edges of $G$ are revealed one-by-one. The goal is to design an online algorithm $\operatorname{ONL}$, which always places the connected components induced by the revealed edges on the same server and never exceeds the server capacities by more than $\varepsilon k$ for constant $\varepsilon>0$. Whenever $\operatorname{ONL}$ learns about a new edge, the algorithm is allowed to move vertices from one server to another. Its objective is to minimize the number of vertex moves. More specifically, $\operatorname{ONL}$ should minimize the competitive ratio: the total cost $\operatorname{ONL}$ incurs compared to an optimal offline algorithm $\operatorname{OPT}$. Our main contribution is a polynomial-time randomized algorithm that is asymptotically optimal: we derive an upper bound of $O(\log \ell + \log k)$ on its competitive ratio and show that no randomized online algorithm can achieve a competitive ratio of less than $\Omega(\log \ell + \log k)$. We also settle the open problem of the achievable competitive ratio by deterministic online algorithms, by deriving a competitive ratio of $\Theta(\ell \lg k)$; to this end, we present an improved lower bound as well as a deterministic polynomial-time online algorithm. Our algorithms rely on a novel technique which combines efficient integer programming with a combinatorial approach for maintaining ILP solutions. We believe this technique is of independent interest and will find further applications in the future.
We consider the file maintenance problem (also called the online labeling problem) in which n integer items from the set {1,...,r} are to be stored in an array of size m >= n. The items are presented sequentially in an arbitrary order, and must be stored in the array in sorted order (but not necessarily in consecutive locations in the array). Each new item must be stored in the array before the next item is received. If r<=m then we can simply store item j in location j but if r>m then we may have to shift the location of stored items to make space for a newly arrived item. The algorithm is charged each time an item is stored in the array, or moved to a new location. The goal is to minimize the total number of such moves done by the algorithm. This problem is non-trivial when n<=m<r. In the case that m=Cn for some C>1, algorithms for this problem with cost O(log(n)^2) per item have been given [IKR81, Wil92, BCD+02]. When m=n, algorithms with cost O(log(n)^3) per item were given [Zha93, BS07]. In this paper we prove lower bounds that show that these algorithms are optimal, up to constant factors. Previously, the only lower bound known for this range of parameters was a lower bound of Omega(log(n)^2) for the restricted class of smooth algorithms [DSZ05a, Zha93]. We also provide an algorithm for the sparse case: If the number of items is polylogarithmic in the array size then the problem can be solved in amortized constant time per item.
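The cost model can be illustrated with a deliberately naive insertion routine. It is not one of the O(log(n)^2) algorithms cited above and can cost a linear number of moves per item. It shifts stored items toward the nearest free slot and returns the number of stores and moves charged. The function name `insert_sorted` and the use of `None` for empty slots are assumptions for illustration.

```python
def insert_sorted(array, item):
    """Insert `item` into `array` (None marks an empty slot), keeping the
    stored items in sorted order. Returns the charged cost: one unit for
    storing `item` plus one unit per existing item that is moved."""
    m = len(array)
    # Index of the first stored item greater than `item` (m if none).
    succ = next((i for i, x in enumerate(array)
                 if x is not None and x > item), m)

    # Prefer a free slot to the left of the successor.
    left = succ - 1
    while left >= 0 and array[left] is not None:
        left -= 1
    if left >= 0:
        moves = 0
        for i in range(left, succ - 1):  # shift the block one step left
            array[i] = array[i + 1]
            moves += 1
        array[succ - 1] = item
        return moves + 1

    # Otherwise use the nearest free slot to the right.
    right = succ
    while right < m and array[right] is not None:
        right += 1
    if right == m:
        raise ValueError("array is full")
    moves = 0
    for i in range(right, succ, -1):  # shift the block one step right
        array[i] = array[i - 1]
        moves += 1
    array[succ] = item
    return moves + 1


if __name__ == "__main__":
    slots = [None] * 4
    total = sum(insert_sorted(slots, x) for x in [5, 3, 4, 1])
    print(slots, total)
```

In the usage example only the insertion of 4 forces an existing item (3) to move, so the total charge is five.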
An assignment of colours to the vertices of a graph is stable if any two vertices of the same colour have identically coloured neighbourhoods. The goal of colour refinement is to find a stable colouring that uses a minimum number of colours. This is a widely used subroutine for graph isomorphism testing algorithms, since any automorphism needs to be colour preserving. We give an $O((m+n)\log n)$ algorithm for finding a canonical version of such a stable colouring, on graphs with $n$ vertices and $m$ edges. We show that no faster algorithm is possible, under some modest assumptions about the type of algorithm, which captures all known colour refinement algorithms.
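The definition of a stable colouring can be made concrete with a naive round-based refinement. It is much slower than the $O((m+n)\log n)$ algorithm described above and is shown only to illustrate the notion. Starting from a single colour, each round recolours every vertex by its current colour together with the sorted multiset of its neighbours' colours, and stops when no class splits. The function name and the adjacency-dictionary input format are assumptions.

```python
def colour_refinement(adj):
    """Return a dict vertex -> colour id describing the coarsest stable
    colouring of the graph given as `adj` (vertex -> iterable of neighbours)."""
    colour = {v: 0 for v in adj}
    while True:
        # Signature: own colour plus the sorted multiset of neighbour colours.
        signature = {
            v: (colour[v], tuple(sorted(colour[u] for u in adj[v])))
            for v in adj
        }
        # Assign ids in sorted signature order so the result is deterministic.
        ids = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        new_colour = {v: ids[signature[v]] for v in adj}
        # Refinement only splits classes, so an unchanged count means stable.
        if len(set(new_colour.values())) == len(set(colour.values())):
            return new_colour
        colour = new_colour


if __name__ == "__main__":
    path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(colour_refinement(path))
```

On the three-vertex path the two endpoints end up sharing a colour while the middle vertex gets its own, which is the coarsest stable colouring of that graph.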