No Arabic abstract
In the W-streaming model, an algorithm is given $O(n mathrm{polylog} n)$ space and must process a large graph of up to $O(n^2)$ edges. In this short note we give two algorithms for edge colouring under the W-streaming model. For edge colouring in W-streaming, a colour for every edge must be determined by the time all the edges are streamed. Our first algorithm uses $Delta + o(Delta)$ colours in $O(n log^2 n)$ space when the edges arrive according to a uniformly random permutation. The second algorithm uses $(1 + o(1))Delta^2 / s$ colours in $tilde{O}(n s)$ space when edges arrival adversarially.
The anti-Ramsey number, $ar(G, H)$ is the minimum integer $k$ such that in any edge colouring of $G$ with $k$ colours there is a rainbow subgraph isomorphic to $H$, i.e., a copy of $H$ with each of its edges assigned a different colour. The notion was introduced by Erd{{o}}s and Simonovits in 1973. Since then the parameter has been studied extensively in combinatorics, also the particular case when $H$ is a star graph. Recently this case received the attention of researchers from the algorithm community because of its applications in interface modelling of wireless networks. To the algorithm community, the problem is known as maximum edge $q$-colouring problem. In this paper, we study the maximum edge $2$-colouring problem from the approximation algorithm point of view. The case $q=2$ is particularly interesting due to its application in real-life problems. Algorithmically, this problem is known to be NP-hard for $qge 2$. For the case of $q=2$, it is also known that no polynomial-time algorithm can approximate to a factor less than $3/2$ assuming the unique games conjecture. Feng et al. showed a $2$-approximation algorithm for this problem. Later Adamaszek and Popa presented a $5/3$-approximation algorithm with the additional assumption that the input graph has a perfect matching. Note that the obvious but the only known algorithm issues different colours to the edges of a maximum matching (say $M$) and different colours to the connected components of $G setminus M$. In this article, we give a new analysis of the aforementioned algorithm leading to an improved approximation bound for triangle-free graphs with perfect matching. We also show a new lower bound when the input graph is triangle-free. The contribution of the paper is a completely new, deeper and closer analysis of how the optimum achieves a higher number of colours than the matching based algorithm, mentioned above.
In this paper, we study edit distance (ED) and longest common subsequence (LCS) in the asymmetric streaming model, introduced by Saks and Seshadhri [SS13]. As an intermediate model between the random access model and the streaming model, this model allows one to have streaming access to one string and random access to the other string. Our first main contribution is a systematic study of space lower bounds for ED and LCS in the asymmetric streaming model. Previously, there are no explicitly stated results in this context, although some lower bounds about LCS can be inferred from the lower bounds for longest increasing subsequence (LIS) in [SW07][GG10][EJ08]. Yet these bounds only work for large alphabet size. In this paper, we develop several new techniques to handle ED in general and LCS for small alphabet size, thus establishing strong lower bounds for both problems. In particular, our lower bound for ED provides an exponential separation between edit distance and Hamming distance in the asymmetric streaming model. Our lower bounds also extend to LIS and longest non-decreasing sequence (LNS) in the standard streaming model. Together with previous results, our bounds provide an almost complete picture for these two problems. As our second main contribution, we give improved algorithms for ED and LCS in the asymmetric streaming model. For ED, we improve the space complexity of the constant factor approximation algorithms in [FHRS20][CJLZ20] from $tilde{O}(frac{n^delta}{delta})$ to $O(frac{d^delta}{delta};mathsf{polylog}(n))$, where $n$ is the length of each string and $d$ is the edit distance between the two strings. For LCS, we give the first $1/2+epsilon$ approximation algorithm with space $n^{delta}$ for any constant $delta>0$, over a binary alphabet.
We present new lower bounds that show that a polynomial number of passes are necessary for solving some fundamental graph problems in the streaming model of computation. For instance, we show that any streaming algorithm that finds a weighted minimum $s$-$t$ cut in an $n$-vertex undirected graph requires $n^{2-o(1)}$ space unless it makes $n^{Omega(1)}$ passes over the stream. To prove our lower bounds, we introduce and analyze a new four-player communication problem that we refer to as the hidden-pointer chasing problem. This is a problem in spirit of the standard pointer chasing problem with the key difference that the pointers in this problem are hidden to players and finding each one of them requires solving another communication problem, namely the set intersection problem. Our lower bounds for graph problems are then obtained by reductions from the hidden-pointer chasing problem. Our hidden-pointer chasing problem appears flexible enough to find other applications and is therefore interesting in its own right. To showcase this, we further present an interesting application of this problem beyond streaming algorithms. Using a reduction from hidden-pointer chasing, we prove that any algorithm for submodular function minimization needs to make $n^{2-o(1)}$ value queries to the function unless it has a polynomial degree of adaptivity.
We give new approximation algorithms for the submodular joint replenishment problem and the inventory routing problem, using an iterative rounding approach. In both problems, we are given a set of $N$ items and a discrete time horizon of $T$ days in which given demands for the items must be satisfied. Ordering a set of items incurs a cost according to a set function, with properties depending on the problem under consideration. Demand for an item at time $t$ can be satisfied by an order on any day prior to $t$, but a holding cost is charged for storing the items during the intermediate period; the goal is to minimize the sum of the ordering and holding cost. Our approximation factor for both problems is $O(log log min(N,T))$; this improves exponentially on the previous best results.
In the time-decay model for data streams, elements of an underlying data set arrive sequentially with the recently arrived elements being more important. A common approach for handling large data sets is to maintain a emph{coreset}, a succinct summary of the processed data that allows approximate recovery of a predetermined query. We provide a general framework that takes any offline-coreset and gives a time-decay coreset for polynomial time decay functions. We also consider the exponential time decay model for $k$-median clustering, where we provide a constant factor approximation algorithm that utilizes the online facility location algorithm. Our algorithm stores $mathcal{O}(klog(hDelta)+h)$ points where $h$ is the half-life of the decay function and $Delta$ is the aspect ratio of the dataset. Our techniques extend to $k$-means clustering and $M$-estimators as well.