Mean Isoperimetry with Control on Outliers: Exact and Approximation Algorithms

Added by Amir Daneshgar

Publication date 2018

Field: Informatics Engineering

Language: English

Authors Morteza Alimi - Amir Daneshgar - Mohammad-Hadi Foroughmand-Araabi

Data Structures and Algorithms


Given a weighted graph $G=(V,E)$ with weight functions $c:E\to \mathbb{R}_+$ and $\pi:V\to \mathbb{R}_+$, and a subset $U\subseteq V$, the normalized cut value for $U$ is defined as the sum of the weights of edges exiting $U$ divided by the weight of vertices in $U$. The mean isoperimetry problem, $\mathsf{ISO}^1(G,k)$, for a weighted graph $G$ is a generalization of the classical uniform sparsest cut problem in which, given a parameter $k$, the objective is to find $k$ disjoint nonempty subsets of $V$ minimizing the average normalized cut value of the parts. The robust version of the problem seeks an optimizer where the number of vertices that fall out of the subpartition is bounded by some given integer $0 \leq \rho \leq |V|$. Our main result states that $\mathsf{ISO}^1(G,k)$, as well as its robust version, $\mathsf{CRISO}^1(G,k,\rho)$, subject to the condition that each part of the subpartition induces a connected subgraph, are solvable in time $O(k^2 \rho^2 \pi(V(T))^3)$ on any weighted tree $T$, in which $\pi(V(T))$ is the sum of the vertex-weights. This result implies that $\mathsf{ISO}^1(G,k)$ is strongly polynomial-time solvable on weighted trees when the vertex-weights are polynomially bounded and may be compared to the fact that the problem is NP-hard for weighted trees in general. Also, using this, we show that both mentioned problems, $\mathsf{ISO}^1(G,k)$ and $\mathsf{CRISO}^1(G,k,\rho)$ as well as the ordinary robust mean isoperimetry problem $\mathsf{RISO}^1(G,k,\rho)$, admit polynomial-time $O(\log^{1.5}|V| \log\log |V|)$-approximation algorithms for weighted graphs with polynomially bounded weights, using the Räcke-Shah tree cut sparsifier.
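The definitions in this abstract can be made concrete with a small sketch. The code below only evaluates the objective (normalized cut value of a part, the mean over a subpartition, and the set of outlier vertices) for a given candidate; it is not the dynamic program or the approximation algorithm from the paper. The function names, the dictionary-based graph encoding, and the toy path graph are assumptions made for illustration.

```python
def normalized_cut(edges, pi, part):
    """Weight of edges with exactly one endpoint in `part`,
    divided by the total vertex weight of `part`."""
    part = set(part)
    boundary = sum(c for (u, v), c in edges.items() if (u in part) != (v in part))
    return boundary / sum(pi[v] for v in part)


def mean_isoperimetry_value(edges, pi, parts):
    """Average normalized cut value over disjoint nonempty parts."""
    seen = set()
    for p in parts:
        if not p or seen & set(p):
            raise ValueError("parts must be nonempty and pairwise disjoint")
        seen |= set(p)
    return sum(normalized_cut(edges, pi, p) for p in parts) / len(parts)


def outliers(pi, parts):
    """Vertices that fall outside every part of the subpartition."""
    return set(pi) - set().union(*parts)


# Hypothetical toy instance: the path a-b-c-d with unit edge and vertex weights.
edges = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0}
pi = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
parts = [{"a", "b"}, {"d"}]  # k = 2 parts, vertex c is left out

value = mean_isoperimetry_value(edges, pi, parts)
left_out = outliers(pi, parts)
```

In this toy instance the part {a, b} has one exiting edge and vertex weight 2, the part {d} has one exiting edge and vertex weight 1, and c is the only outlier, so the candidate is feasible for the robust problem whenever $\rho \geq 1$.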

Robust Mean Estimation on Highly Incomplete Data with Arbitrary Outliers

Lunjia Hu, Omer Reingold (2020)

We study the problem of robustly estimating the mean of a $d$-dimensional distribution given $N$ examples, where most coordinates of every example may be missing and $\varepsilon N$ examples may be arbitrarily corrupted. Assuming each coordinate appears in a constant factor more than $\varepsilon N$ examples, we show algorithms that estimate the mean of the distribution with information-theoretically optimal dimension-independent error guarantees in nearly-linear time $\widetilde{O}(Nd)$. Our results extend recent work on computationally-efficient robust estimation to a more widely applicable incomplete-data setting.

Data Structures and Algorithms Machine Learning Statistics Theory

Multi-way sparsest cut problem on trees with a control on the number of parts and outliers

Ramin Javadi, Saleh Ashkboos (2017)

Given a graph, the sparsest cut problem asks for a subset of vertices whose edge expansion (the normalized cut given by the subset) is minimized. In this paper, we study a generalization of this problem that seeks $k$ disjoint subsets of vertices (clusters) all of whose edge expansions are small and for which, furthermore, the number of vertices remaining outside the subsets (outliers) is also small. We prove that although this problem is NP-hard for trees, it can be solved in polynomial time for all weighted trees, provided that we restrict the search space to subsets which induce connected subgraphs. The proposed algorithm is based on dynamic programming and runs in worst-case time $O(k^2 n^3)$, where $n$ is the number of vertices and $k$ is the number of clusters. It also runs in linear time when the number of clusters and the number of outliers are bounded by a constant.

Data Structures and Algorithms

Subspace approximation with outliers

Amit Deshpande, Rameshwar Pratap (2020)

The subspace approximation problem with outliers, for given $n$ points in $d$ dimensions $x_{1},\ldots, x_{n} \in R^{d}$, an integer $1 \leq k \leq d$, and an outlier parameter $0 \leq \alpha \leq 1$, is to find a $k$-dimensional linear subspace of $R^{d}$ that minimizes the sum of squared distances to its nearest $(1-\alpha)n$ points. More generally, the $\ell_{p}$ subspace approximation problem with outliers minimizes the sum of $p$-th powers of distances instead of the sum of squared distances. Even the case of robust PCA is non-trivial, and previous work requires additional assumptions on the input. Any multiplicative approximation algorithm for the subspace approximation problem with outliers must solve the robust subspace recovery problem, a special case in which the $(1-\alpha)n$ inliers in the optimal solution are promised to lie exactly on a $k$-dimensional linear subspace. However, robust subspace recovery is Small Set Expansion (SSE)-hard. We show how to extend dimension reduction techniques and bi-criteria approximations based on sampling to the problem of subspace approximation with outliers. To get around the SSE-hardness of robust subspace recovery, we assume that the squared distance error of the optimal $k$-dimensional subspace summed over the optimal $(1-\alpha)n$ inliers is at least $\delta$ times its squared-error summed over all $n$ points, for some $0 < \delta \leq 1 - \alpha$. With this assumption, we give an efficient algorithm to find a subset of $\mathrm{poly}(k/\epsilon) \log(1/\delta) \log\log(1/\delta)$ points whose span contains a $k$-dimensional subspace that gives a multiplicative $(1+\epsilon)$-approximation to the optimal solution. The running time of our algorithm is linear in $n$ and $d$. Interestingly, our results hold even when the fraction of outliers $\alpha$ is large, as long as the obvious condition $0 < \delta \leq 1 - \alpha$ is satisfied.

Computational Geometry Data Structures and Algorithms Statistics Theory

Dense Steiner problems: Approximation algorithms and inapproximability

Marek Karpinski, Mateusz Lewandowski, Syed Mohammad Meesum (2020)

The Steiner Tree problem is a classical problem in combinatorial optimization: the goal is to connect a set $T$ of terminals in a graph $G$ by a tree of minimum size. Karpinski and Zelikovsky (1996) studied the $\delta$-dense version of Steiner Tree, where each terminal has at least $\delta |V(G)\setminus T|$ neighbours outside $T$, for a fixed $\delta > 0$. They gave a PTAS for this problem. We study a generalization of pairwise $\delta$-dense Steiner Forest, which asks for a minimum-size forest in $G$ in which the nodes in each terminal set $T_1,\dots,T_k$ are connected, and every terminal in $T_i$ has at least $\delta |T_j|$ neighbours in $T_j$, and at least $\delta|S|$ nodes in $S = V(G)\setminus (T_1\cup\dots\cup T_k)$, for each $i, j$ in $\{1,\dots, k\}$ with $i \neq j$. Our first result is a polynomial-time approximation scheme for all $\delta > 1/2$. Then, we show a $(\frac{13}{12}+\varepsilon)$-approximation algorithm for $\delta = 1/2$ and any $\varepsilon > 0$. We also consider the $\delta$-dense Group Steiner Tree problem as defined by Hauptmann and show that the problem is $\mathsf{APX}$-hard.

Data Structures and Algorithms

Exact algorithms for maximum weighted independent set on sparse graphs

Sen Huang, Mingyu Xiao, Xiaoyu Chen (2021)

The maximum independent set problem is one of the most important problems in graph algorithms and has been extensively studied in the line of research on the worst-case analysis of exact algorithms for NP-hard problems. In the weighted version, each vertex in the graph is associated with a weight and the goal is to find an independent set of maximum total vertex weight. In this paper, we design several reduction rules and a fast exact algorithm for the maximum weighted independent set problem, and use the measure-and-conquer technique to analyze the running time bound of the algorithm. Our algorithm works on general weighted graphs and it has a good running time bound on sparse graphs. If the graph has an average degree at most 3, our algorithm runs in $O^*(1.1443^n)$ time and polynomial space, improving previous running time bounds for the problem in cubic graphs using polynomial space.
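To make the problem statement concrete, here is a minimal brute-force sketch of maximum weighted independent set. It enumerates every vertex subset and is therefore exponential in the number of vertices; it is a reference definition only, not the reduction rules or the measure-and-conquer algorithm described in the abstract. The function name and the toy path graph are assumptions made for illustration.

```python
from itertools import combinations


def max_weight_independent_set(weights, edges):
    """Brute force: try every vertex subset, keep the heaviest one
    that contains no edge. Exponential time; for tiny graphs only."""
    vertices = list(weights)
    edge_set = {frozenset(e) for e in edges}
    best_weight, best_set = 0, frozenset()
    for r in range(1, len(vertices) + 1):
        for subset in combinations(vertices, r):
            # Skip subsets that contain both endpoints of some edge.
            if any(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
                continue
            w = sum(weights[v] for v in subset)
            if w > best_weight:
                best_weight, best_set = w, frozenset(subset)
    return best_weight, best_set


# Hypothetical toy instance: the path 1-2-3-4 with the vertex weights below.
weights = {1: 3, 2: 5, 3: 4, 4: 1}
edges = [(1, 2), (2, 3), (3, 4)]
result = max_weight_independent_set(weights, edges)
```

On this path the heaviest single vertex (vertex 2, weight 5) is beaten by the non-adjacent pair {1, 3} with total weight 7, which shows why a greedy choice by weight alone is not enough.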

Data Structures and Algorithms