No Arabic abstract
We present a number of new results about range searching for colored (or categorical) data: 1. For a set of $n$ colored points in three dimensions, we describe randomized data structures with $O(nmathop{rm polylog}n)$ space that can report the distinct colors in any query orthogonal range (axis-aligned box) in $O(kmathop{rm polyloglog} n)$ expected time, where $k$ is the number of distinct colors in the range, assuming that coordinates are in ${1,ldots,n}$. Previous data structures require $O(frac{log n}{loglog n} + k)$ query time. Our result also implies improvements in higher constant dimensions. 2. Our data structures can be adapted to halfspace ranges in three dimensions (or circular ranges in two dimensions), achieving $O(klog n)$ expected query time. Previous data structures require $O(klog^2n)$ query time. 3. For a set of $n$ colored points in two dimensions, we describe a data structure with $O(nmathop{rm polylog}n)$ space that can answer colored type-2 range counting queries: report the number of occurrences of every distinct color in a query orthogonal range. The query time is $O(frac{log n}{loglog n} + kloglog n)$, where $k$ is the number of distinct colors in the range. Naively performing $k$ uncolored range counting queries would require $O(kfrac{log n}{loglog n})$ time. Our data structures are designed using a variety of techniques, including colored variants of randomized incremental construction (which may be of independent interest), colored variants of shallow cuttings, and bit-packing tricks.
The problem of finding the maximum number of vertex-disjoint uni-color paths in an edge-colored graph (called MaxCDP) has been recently introduced in literature, motivated by applications in social network analysis. In this paper we investigate how the complexity of the problem depends on graph parameters (namely the number of vertices to remove to make the graph a collection of disjoint paths and the size of the vertex cover of the graph), which makes sense since graphs in social networks are not random and have structure. The problem was known to be hard to approximate in polynomial time and not fixed-parameter tractable (FPT) for the natural parameter. Here, we show that it is still hard to approximate, even in FPT-time. Finally, we introduce a new variant of the problem, called MaxCDDP, whose goal is to find the maximum number of vertex-disjoint and color-disjoint uni-color paths. We extend some of the results of MaxCDP to this new variant, and we prove that unlike MaxCDP, MaxCDDP is already hard on graphs at distance two from disjoint paths.
Longest Run Subsequence is a problem introduced recently in the context of the scaffolding phase of genome assembly (Schrinner et al., WABI 2020). The problem asks for a maximum length subsequence of a given string that contains at most one run for each symbol (a run is a maximum substring of consecutive identical symbols). The problem has been shown to be NP-hard and to be fixed-parameter tractable when the parameter is the size of the alphabet on which the input string is defined. In this paper we further investigate the complexity of the problem and we show that it is fixed-parameter tractable when it is parameterized by the number of runs in a solution, a smaller parameter. Moreover, we investigate the kernelization complexity of Longest Run Subsequence and we prove that it does not admit a polynomial kernel when parameterized by the size of the alphabet or by the number of runs. Finally, we consider the restriction of Longest Run Subsequence when each symbol has at most two occurrences in the input string and we show that it is APX-hard.
We investigate the computational complexity of the following problem. We are given a graph in which each vertex has an initial and a target color. Each pair of adjacent vertices can swap their current colors. Our goal is to perform the minimum number of swaps so that the current and target colors agree at each vertex. When the colors are chosen from {1,2,...,c}, we call this problem c-Colored Token Swapping since the current color of a vertex can be seen as a colored token placed on the vertex. We show that c-Colored Token Swapping is NP-complete for c = 3 even if input graphs are restricted to connected planar bipartite graphs of maximum degree 3. We then show that 2-Colored Token Swapping can be solved in polynomial time for general graphs and in linear time for trees. Besides, we show that, the problem for complete graphs is fixed-parameter tractable when parameterized by the number of colors, while it is known to be NP-complete when the number of colors is unbounded.
For any $epsilon>0$, Laue and Matijevi{c} [CCCG07, IPL08] give a PTAS for finding a $(1+epsilon)$-approximate solution to the $k$-hop MST problem in the Euclidean plane that runs in time $(n/epsilon)^{O(k/epsilon)}$. In this paper, we present an algorithm that runs in time $(n/epsilon)^{O(log k cdot(1/epsilon)^2cdotlog^2(1/epsilon))}$. This gives an improvement on the dependency on $k$ on the exponent, while having a worse dependency on $epsilon$. As in Laue and Matijevi{c}, we follow the framework introduced by Arora for Euclidean TSP. Our key ingredients include exponential distance scaling and compression of dynamic programming state tables.