Lossy Kernelization of Same-Size Clustering

Published by Nidhi Purohit

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Sayan Bandyapadhyay - Fedor V. Fomin - Petr A. Golovach

Data Structures and Algorithms

In this work, we study the $k$-median clustering problem with an additional equal-size constraint on the clusters, from the perspective of parameterized preprocessing. Our main result is the first lossy ($2$-approximate) polynomial kernel for this problem, parameterized by the cost of clustering. We complement this result by establishing lower bounds for the problem that rule out the existence of an (exact) kernel of polynomial size and of a PTAS.

Read also

Lossy Kernelization

Daniel Lokshtanov, Fahad Panolan, M. S. Ramanujan, 2016

In this paper we propose a new framework for analyzing the performance of preprocessing algorithms. Our framework builds on the notion of kernelization from parameterized complexity. However, as opposed to the original notion of kernelization, our definitions combine well with approximation algorithms and heuristics. The key new definition is that of a polynomial size $\alpha$-approximate kernel. Loosely speaking, a polynomial size $\alpha$-approximate kernel is a polynomial time pre-processing algorithm that takes as input an instance $(I,k)$ to a parameterized problem, and outputs another instance $(I',k')$ to the same problem, such that $|I'|+k' \leq k^{O(1)}$. Additionally, for every $c \geq 1$, a $c$-approximate solution $s'$ to the pre-processed instance $(I',k')$ can be turned in polynomial time into a $(c \cdot \alpha)$-approximate solution $s$ to the original instance $(I,k)$. Our main technical contributions are $\alpha$-approximate kernels of polynomial size for three problems, namely Connected Vertex Cover, Disjoint Cycle Packing and Disjoint Factors. These problems are known not to admit any polynomial size kernels unless $NP \subseteq coNP/poly$. Our approximate kernels simultaneously beat both the lower bounds on the (normal) kernel size, and the hardness of approximation lower bounds for all three problems. On the negative side we prove that Longest Path parameterized by the length of the path and Set Cover parameterized by the universe size do not admit even an $\alpha$-approximate kernel of polynomial size, for any $\alpha \geq 1$, unless $NP \subseteq coNP/poly$. In order to prove this lower bound we need to combine in a non-trivial way the techniques used for showing kernelization lower bounds with the methods for showing hardness of approximation.

Data Structures and Algorithms

Kernelization of Whitney Switches

Fedor V. Fomin, Petr A. Golovach, 2020

A fundamental theorem of Whitney from 1933 asserts that 2-connected graphs G and H are 2-isomorphic, or equivalently, their cycle matroids are isomorphic, if and only if G can be transformed into H by a series of operations called Whitney switches. In this paper we consider the quantitative question arising from Whitney's theorem: Given two 2-isomorphic graphs, can we transform one into another by applying at most k Whitney switches? This problem is already NP-complete for cycles, and we investigate its parameterized complexity. We show that the problem admits a kernel of size O(k), and thus is fixed-parameter tractable when parameterized by k.

Data Structures and Algorithms, Combinatorics

Parameterized Complexity of Categorical Clustering with Size Constraints

Fedor V. Fomin, Petr A. Golovach, 2021

In the Categorical Clustering problem, we are given a set of vectors (matrix) $A=\{a_1,\ldots,a_n\}$ over $\Sigma^m$, where $\Sigma$ is a finite alphabet, and integers $k$ and $B$. The task is to partition $A$ into $k$ clusters such that the median objective of the clustering in the Hamming norm is at most $B$. That is, we seek a partition $\{I_1,\ldots,I_k\}$ of $\{1,\ldots,n\}$ and vectors $c_1,\ldots,c_k\in\Sigma^m$ such that $\sum_{i=1}^k\sum_{j\in I_i}d_H(c_i,a_j)\leq B$, where $d_H(a,b)$ is the Hamming distance between vectors $a$ and $b$. Fomin, Golovach, and Panolan [ICALP 2018] proved that the problem is fixed-parameter tractable (for the binary case $\Sigma=\{0,1\}$) by giving an algorithm that solves the problem in time $2^{O(B\log B)}(mn)^{O(1)}$. We extend this algorithmic result to a popular capacitated clustering model, where in addition the sizes of the clusters should satisfy certain constraints. More precisely, in Capacitated Clustering, we are additionally given two non-negative integers $p$ and $q$, and seek a clustering with $p\leq |I_i|\leq q$ for all $i\in\{1,\ldots,k\}$. Our main theorem is that Capacitated Clustering is solvable in time $2^{O(B\log B)}|\Sigma|^B(mn)^{O(1)}$. The theorem not only extends the previous algorithmic results to a significantly more general model, it also implies algorithms for several other variants of Categorical Clustering with constraints on cluster sizes.

Data Structures and Algorithms, Discrete Mathematics
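
The objective in the Categorical Clustering abstract above can be made concrete with a small sketch. It is not the algorithm from that paper: it only evaluates a given partition, and does not search for one. The function names (`hamming`, `best_center`, `clustering_cost`, `is_feasible`) are hypothetical, and indices are 0-based rather than the 1-based indices of the abstract. The sketch relies on the standard fact that, for Hamming distance, picking the most frequent symbol in each coordinate gives an optimal center for a fixed cluster.

```python
from collections import Counter


def hamming(a, b):
    """Hamming distance d_H(a, b) between two equal-length vectors."""
    return sum(x != y for x, y in zip(a, b))


def best_center(cluster):
    """Coordinate-wise majority symbol: an optimal center for a fixed cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*cluster))


def clustering_cost(vectors, partition):
    """Median objective: sum over clusters of distances to the cluster center."""
    total = 0
    for indices in partition:
        cluster = [vectors[j] for j in indices]
        center = best_center(cluster)
        total += sum(hamming(center, a) for a in cluster)
    return total


def is_feasible(vectors, partition, B, p, q):
    """Check that the partition covers every index once, that each cluster
    size lies in [p, q], and that the objective is at most B."""
    covered = sorted(j for indices in partition for j in indices)
    if covered != list(range(len(vectors))):
        return False
    if any(not (p <= len(indices) <= q) for indices in partition):
        return False
    return clustering_cost(vectors, partition) <= B


# Illustrative data (made up for this sketch): four binary vectors, k = 2.
vectors = ["0000", "0001", "1111", "1110"]
partition = [[0, 1], [2, 3]]
print(clustering_cost(vectors, partition))
print(is_feasible(vectors, partition, B=2, p=2, q=2))
```

Setting `p = q` in `is_feasible` gives the equal-size constraint studied in the main paper on this page, although that paper's distance measure may differ from the Hamming distance assumed here.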

Subexponential parameterized algorithms and kernelization on almost chordal graphs

Fedor V. Fomin, Petr A. Golovach, 2020

We study the algorithmic properties of the graph class Chordal-ke, that is, graphs that can be turned into a chordal graph by adding at most k edges or, equivalently, the class of graphs of fill-in at most k. We discover that a number of fundamental intractable optimization problems parameterized by k admit subexponential algorithms on graphs from Chordal-ke. We identify a large class of optimization problems on Chordal-ke that admit algorithms with the typical running time $2^{O(\sqrt{k}\log k)}\cdot n^{O(1)}$. Examples of the problems from this class are finding an independent set of maximum weight, finding a feedback vertex set or an odd cycle transversal of minimum weight, or the problem of finding a maximum induced planar subgraph. On the other hand, we show that some fundamental optimization problems, like finding an optimal graph coloring or finding a maximum clique, are FPT on Chordal-ke when parameterized by k but do not admit algorithms subexponential in k unless ETH fails. Besides subexponential time algorithms, the class of Chordal-ke graphs appears to be appealing from the perspective of kernelization (with parameter k). While it is possible to show that most of the weighted variants of optimization problems do not admit kernels polynomial in k on Chordal-ke graphs, this does not exclude the existence of Turing kernelization and kernelization for unweighted graphs. In particular, we construct a polynomial Turing kernel for Weighted Clique on Chordal-ke graphs. For (unweighted) Independent Set we design polynomial kernels on two interesting subclasses of Chordal-ke, namely, Interval-ke and Split-ke graphs.

Data Structures and Algorithms, Discrete Mathematics

FPT and kernelization algorithms for the k-in-a-tree problem

Guilherme C. M. Gomes, Vinicius F. dos Santos, Murilo V. G. da Silva, 2020

The three-in-a-tree problem asks for an induced tree of the input graph containing three mandatory vertices. In 2006, Chudnovsky and Seymour [Combinatorica, 2010] presented the first polynomial time algorithm for this problem, which has become a critical subroutine in many algorithms for detecting induced subgraphs, such as beetles, pyramids, thetas, and even and odd-holes. In 2007, Derhy and Picouleau [Discrete Applied Mathematics, 2009] considered the natural generalization to $k$ mandatory vertices, proving that, when $k$ is part of the input, the problem is $\mathsf{NP}$-complete, and asked what the complexity of four-in-a-tree is. Motivated by this question and the relevance of the original problem, we study the parameterized complexity of $k$-in-a-tree. We begin by showing that the problem is $\mathsf{W[1]}$-hard when jointly parameterized by the size of the solution and minimum clique cover and, under the Exponential Time Hypothesis, does not admit an $n^{o(k)}$ time algorithm. Afterwards, we use Courcelle's Theorem to prove fixed-parameter tractability under cliquewidth, which prompts our investigation into which parameterizations admit single exponential algorithms; we show that such algorithms exist for the unrelated parameterizations treewidth, distance to cluster, and distance to co-cluster. In terms of kernelization, we present a linear kernel under feedback edge set, and show that no polynomial kernel exists under vertex cover or distance to clique unless $\mathsf{NP} \subseteq \mathsf{coNP}/\mathsf{poly}$. Along with other remarks and previous work, our tractability and kernelization results cover many of the most commonly employed parameters in the graph parameter hierarchy.

Data Structures and Algorithms
