Quadratic Time Algorithms Appear to be Optimal for Sorting Evolving Data

Added by Timothy Johnson

Publication date 2018

Fields Informatics Engineering

Language English

Authors Juan Jose Besa - William E. Devanny - David Eppstein

Data Structures and Algorithms

We empirically study sorting in the evolving data model. In this model, a sorting algorithm maintains an approximation to the sorted order of a list of data items while simultaneously, with each comparison made by the algorithm, an adversary randomly swaps the order of adjacent items in the true sorted order. Previous work studies only t
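The evolving data model described above can be simulated in a few lines. The following is a minimal sketch, not the authors' experimental setup: the function names, the choice of repeated insertion sort as the quadratic algorithm, and the use of Kendall tau distance (number of inverted pairs) as the error measure are all assumptions made for illustration. After every comparison, one uniformly random adjacent pair in the true order is swapped.

```python
import random


def kendall_tau(approx, truth):
    """Count pairs of items that appear in opposite order in the two lists."""
    rank = {item: i for i, item in enumerate(truth)}
    r = [rank[x] for x in approx]
    return sum(1 for i in range(len(r)) for j in range(i + 1, len(r)) if r[i] > r[j])


def simulate_repeated_insertion_sort(n, steps, seed=0):
    """Run insertion sort passes until at least `steps` comparisons are made.

    Hypothetical helper: `truth` is the hidden true order, `approx` is the
    algorithm's maintained list. Requires n >= 2.
    """
    rng = random.Random(seed)
    truth = list(range(n))                      # truth[i] is the item of true rank i
    rank = {item: i for i, item in enumerate(truth)}
    approx = list(range(n))
    rng.shuffle(approx)                         # start from a random guess
    comparisons = 0

    def less(a, b):
        nonlocal comparisons
        result = rank[a] < rank[b]              # answer using the current true order
        comparisons += 1
        i = rng.randrange(n - 1)                # then one random adjacent swap occurs
        x, y = truth[i], truth[i + 1]
        truth[i], truth[i + 1] = y, x
        rank[x], rank[y] = i + 1, i
        return result

    while comparisons < steps:
        for i in range(1, n):                   # one full insertion sort pass
            j = i
            while j > 0 and less(approx[j], approx[j - 1]):
                approx[j], approx[j - 1] = approx[j - 1], approx[j]
                j -= 1
    return approx, truth


if __name__ == "__main__":
    approx, truth = simulate_repeated_insertion_sort(n=200, steps=200_000)
    print("Kendall tau distance:", kendall_tau(approx, truth))
```

Because the true order keeps moving, the distance printed at the end is generally not zero; the quantity of interest is how small an algorithm can keep it over time.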

Time-Optimal Sublinear Algorithms for Matching and Vertex Cover

Soheil Behnezhad, 2021

We present a near-tight analysis of the average query complexity -- à la Nguyen and Onak [FOCS08] -- of the randomized greedy maximal matching algorithm, improving over the bound of Yoshida, Yamamoto and Ito [STOC09]. For any $n$-vertex graph of average degree $\bar{d}$, this leads to the following sublinear-time algorithms for estimating the size of maximum matching and minimum vertex cover, all of which are provably time-optimal up to logarithmic factors:

$\bullet$ A multiplicative $(2+\epsilon)$-approximation in $\widetilde{O}(n/\epsilon^2)$ time using adjacency list queries. This (nearly) matches an $\Omega(n)$ time lower bound for any multiplicative approximation and is, notably, the first $O(1)$-approximation that runs in $o(n^{1.5})$ time.

$\bullet$ A $(2, \epsilon n)$-approximation in $\widetilde{O}((\bar{d} + 1)/\epsilon^2)$ time using adjacency list queries. This (nearly) matches an $\Omega(\bar{d}+1)$ lower bound of Parnas and Ron [TCS07] which holds for any $(O(1), \epsilon n)$-approximation, and improves over the bounds of [Yoshida et al. STOC09; Onak et al. SODA12] and [Kapralov et al. SODA20]: The former two take at least quadratic time in the degree which can be as large as $\Omega(n^2)$ and the latter obtains a much larger approximation.

$\bullet$ A $(2, \epsilon n)$-approximation in $\widetilde{O}(n/\epsilon^3)$ time using adjacency matrix queries. This (nearly) matches an $\Omega(n)$ time lower bound in this model and improves over the $\widetilde{O}(n\sqrt{n})$-time $(2, \epsilon n)$-approximate algorithm of [Chen, Kannan, and Khanna ICALP20].

It also turns out that any non-trivial multiplicative approximation in the adjacency matrix model requires $\Omega(n^2)$ time, so the additive $\epsilon n$ error is necessary too. As immediate corollaries, we get improved sublinear time estimators for (variants of) TSP and an improved AMPC algorithm for maximal matching.
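For orientation, the randomized greedy maximal matching algorithm named in this abstract can be sketched as follows. This is the plain global version that reads every edge, not the sublinear-time query procedure the abstract analyzes; the function name and the edge-list input format are assumptions for illustration.

```python
import random


def randomized_greedy_matching(edges, seed=None):
    """Scan the edges in uniformly random order, keeping each edge whose
    endpoints are both still unmatched. The result is a maximal matching."""
    rng = random.Random(seed)
    order = list(edges)
    rng.shuffle(order)
    matched = set()      # vertices already covered by the matching
    matching = []
    for u, v in order:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]
    print(randomized_greedy_matching(path, seed=0))
```

Any maximal matching has at least half as many edges as a maximum matching, and its endpoints form a vertex cover at most twice the minimum size, which is where the factor 2 in the approximation guarantees comes from.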

Data Structures and Algorithms Distributed Parallel and Cluster Computing

A Quadratic Time Locally Optimal Algorithm for NP-hard Equal Cardinality Partition Optimization

Kaan Gokcesu, Hakan Gokcesu, 2021

We study the optimization version of the equal cardinality set partition problem (where the absolute difference between the sums of the two equal-sized partitions is minimized). While this problem is NP-hard and requires exponential complexity to solve in general, we have formulated a weaker version of this NP-hard problem, where the goal is to find a locally optimal solution. The local optimality considered in our work is with respect to any swap of a pair of elements between the opposing partitions. To this end, we designed an algorithm which can produce such a locally optimal solution in $O(N^2)$ time and $O(N)$ space. Our approach does not require positive or integer inputs and works equally well under arbitrary input precisions. Thus, it is widely applicable in different problem scenarios.
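To make the notion of swap-local optimality concrete, here is a naive local search that keeps swapping one element from each side while doing so shrinks the absolute difference of the sums. It is a sketch for illustration only: it is not the $O(N^2)$ algorithm of the paper, its running time is not bounded that way, and the function name and the initial split (first half versus second half) are assumptions.

```python
def swap_local_search(values):
    """Return (left, right, diff) where no single cross-partition swap
    reduces |sum(left) - sum(right)|. Requires an even number of values."""
    n = len(values)
    if n % 2 != 0:
        raise ValueError("need an even number of values")
    left, right = list(values[: n // 2]), list(values[n // 2:])
    diff = sum(left) - sum(right)
    improved = True
    while improved:
        improved = False
        for i in range(len(left)):
            for j in range(len(right)):
                # Swapping left[i] with right[j] changes diff by -2 * (left[i] - right[j]).
                new_diff = diff - 2 * (left[i] - right[j])
                if abs(new_diff) < abs(diff):
                    left[i], right[j] = right[j], left[i]
                    diff = new_diff
                    improved = True
    return left, right, diff


if __name__ == "__main__":
    print(swap_local_search([1, 2, 3, 4, 5, 6]))
```

With exact (integer) arithmetic the loop terminates because the absolute difference strictly decreases with every accepted swap. With floating-point inputs the incrementally tracked difference can drift, so the result should be treated as approximately locally optimal.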

Data Structures and Algorithms Machine Learning Combinatorics

Applying Sorting Networks to Synthesize Optimized Sorting Libraries

Michael Codish, Luis Cruz-Filipe, Markus Nebel, 2015

This paper shows an application of the theory of sorting networks to facilitate the synthesis of optimized general purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm with insertion sort applied as the base case for small fixed numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching and results in code which is equivalent to a sorting network. This enables the application of further program transformations based on sorting network optimizations, and eventually the synthesis of code from sorting networks. We show that when only the number of comparisons and swaps is considered, theory predicts no real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction level parallelism and non-branching conditional assignment instructions, both of which are common in modern CPU architectures. We provide empirical evidence that using code synthesized from efficient sorting networks as the base case for Quicksort libraries results in significant real-world speed-ups.
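As a small illustration of what a sorting network is, the sketch below applies the standard 5-comparator network for 4 inputs and checks it with the zero-one principle (a comparator network sorts every input if and only if it sorts every input of 0s and 1s). The names are assumptions for illustration, and Python does not exhibit the branch-free, instruction-level speed-ups the abstract reports; those depend on compiled code using conditional-move style instructions.

```python
from itertools import product

# Standard 4-input network: each pair (i, j) with i < j is a comparator that
# places the smaller value at position i and the larger at position j.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def apply_network(network, values):
    """Run a fixed, data-independent sequence of compare-exchange steps."""
    out = list(values)
    for i, j in network:
        lo, hi = min(out[i], out[j]), max(out[i], out[j])
        out[i], out[j] = lo, hi
    return out


def sorts_all_zero_one_inputs(network, n):
    """Zero-one principle check over all 2**n binary inputs."""
    return all(
        apply_network(network, bits) == sorted(bits)
        for bits in product((0, 1), repeat=n)
    )


if __name__ == "__main__":
    print(apply_network(NETWORK_4, [3, 1, 4, 2]))
    print(sorts_all_zero_one_inputs(NETWORK_4, 4))
```

The sequence of comparisons never depends on the data, which is what allows such a network to be unrolled into straight-line code for a small fixed input size.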

Data Structures and Algorithms Mathematical Software

The Quest for Optimal Sorting Networks: Efficient Generation of Two-Layer Prefixes

Michael Codish, Luis Cruz-Filipe, Peter Schneider-Kamp, 2014

Previous work identifying depth-optimal $n$-channel sorting networks for $9 \leq n \leq 16$ is based on exploiting symmetries of the first two layers. However, the naive generate-and-test approach typically applied does not scale. This paper revisits the problem of generating two-layer prefixes modulo symmetries. An improved notion of symmetry is provided and a novel technique based on regular languages and graph isomorphism is shown to generate the set of non-symmetric representations. An empirical evaluation demonstrates that the new method outperforms the generate-and-test approach by orders of magnitude and easily scales up to $n=40$.

Data Structures and Algorithms Discrete Mathematics

Universally-Optimal Distributed Algorithms for Known Topologies

Bernhard Haeupler, David Wajc, Goran Zuzic, 2021

Many distributed optimization algorithms achieve existentially-optimal running times, meaning that there exists some pathological worst-case topology on which no algorithm can do better. Still, most networks of interest allow for exponentially faster algorithms. This motivates two questions: (1) What network topology parameters determine the complexity of distributed optimization? (2) Are there universally-optimal algorithms that are as fast as possible on every topology? We resolve these 25-year-old open problems in the known-topology setting (i.e., supported CONGEST) for a wide class of global network optimization problems including MST, $(1+\varepsilon)$-min cut, various approximate shortest paths problems, sub-graph connectivity, etc. In particular, we provide several (equivalent) graph parameters and show they are tight universal lower bounds for the above problems, fully characterizing their inherent complexity. Our results also imply that algorithms based on the low-congestion shortcut framework match the above lower bound, making them universally optimal if shortcuts are efficiently approximable. We leverage a recent result in hop-constrained oblivious routing to show this is the case if the topology is known -- giving universally-optimal algorithms for all above problems.

Data Structures and Algorithms