Do you want to publish a course? Click here

Approximating $(k,ell)$-Median Clustering for Polygonal Curves

66   0   0.0 ( 0 )
 Added by Dennis Rohde
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

In 2015, Driemel, Krivov{s}ija and Sohler introduced the $(k,ell)$-median problem for clustering polygonal curves under the Frechet distance. Given a set of input curves, the problem asks to find $k$ median curves of at most $ell$ vertices each that minimize the sum of Frechet distances over all input curves to their closest median curve. A major shortcoming of their algorithm is that the input curves are restricted to lie on the real line. In this paper, we present a randomized bicriteria-approximation algorithm that works for polygonal curves in $mathbb{R}^d$ and achieves approximation factor $(1+epsilon)$ with respect to the clustering costs. The algorithm has worst-case running-time linear in the number of curves, polynomial in the maximum number of vertices per curve, i.e. their complexity, and exponential in $d$, $ell$, $epsilon$ and $delta$, i.e., the failure probability. We achieve this result through a shortcutting lemma, which guarantees the existence of a polygonal curve with similar cost as an optimal median curve of complexity $ell$, but of complexity at most $2ell-2$, and whose vertices can be computed efficiently. We combine this lemma with the superset-sampling technique by Kumar et al. to derive our clustering result. In doing so, we describe and analyze a generalization of the algorithm by Ackermann et al., which may be of independent interest.



rate research

Read More

The Euclidean $k$-center problem is a classical problem that has been extensively studied in computer science. Given a set $mathcal{G}$ of $n$ points in Euclidean space, the problem is to determine a set $mathcal{C}$ of $k$ centers (not necessarily part of $mathcal{G}$) such that the maximum distance between a point in $mathcal{G}$ and its nearest neighbor in $mathcal{C}$ is minimized. In this paper we study the corresponding $(k,ell)$-center problem for polygonal curves under the Frechet distance, that is, given a set $mathcal{G}$ of $n$ polygonal curves in $mathbb{R}^d$, each of complexity $m$, determine a set $mathcal{C}$ of $k$ polygonal curves in $mathbb{R}^d$, each of complexity $ell$, such that the maximum Frechet distance of a curve in $mathcal{G}$ to its closest curve in $mathcal{C}$ is minimized. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension $2$ and higher. We show that, if $ell$ is part of the input, then there is no polynomial-time approximation scheme unless $mathsf{P}=mathsf{NP}$. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Frechet distance. In the case of the discrete Frechet distance on two-dimensional curves, we show hardness of approximation within a factor close to $2.598$. This result also holds when $k=1$, and the $mathsf{NP}$-hardness extends to the case that $ell=infty$, i.e., for the problem of computing the minimum-enclosing ball under the Frechet distance. Finally, we observe that a careful adaptation of Gonzalez algorithm in combination with a curve simplification yields a $3$-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.
textit{Clustering problems} often arise in the fields like data mining, machine learning etc. to group a collection of objects into similar groups with respect to a similarity (or dissimilarity) measure. Among the clustering problems, specifically textit{$k$-means} clustering has got much attention from the researchers. Despite the fact that $k$-means is a very well studied problem its status in the plane is still an open problem. In particular, it is unknown whether it admits a PTAS in the plane. The best known approximation bound in polynomial time is $9+eps$. In this paper, we consider the following variant of $k$-means. Given a set $C$ of points in $mathcal{R}^d$ and a real $f > 0$, find a finite set $F$ of points in $mathcal{R}^d$ that minimizes the quantity $f*|F|+sum_{pin C} min_{q in F} {||p-q||}^2$. For any fixed dimension $d$, we design a local search PTAS for this problem. We also give a bi-criterion local search algorithm for $k$-means which uses $(1+eps)k$ centers and yields a solution whose cost is at most $(1+eps)$ times the cost of an optimal $k$-means solution. The algorithm runs in polynomial time for any fixed dimension. The contribution of this paper is two fold. On the one hand, we are being able to handle the square of distances in an elegant manner, which yields near optimal approximation bound. This leads us towards a better understanding of the $k$-means problem. On the other hand, our analysis of local search might also be useful for other geometric problems. This is important considering that very little is known about the local search method for geometric approximation.
We give a polynomial-time constant-factor approximation algorithm for maximum independent set for (axis-aligned) rectangles in the plane. Using a polynomial-time algorithm, the best approximation factor previously known is $O(loglog n)$. The results are based on a new form of recursive partitioning in the plane, in which faces that are constant-complexity and orthogonally convex are recursively partitioned into a constant number of such faces.
124 - Haitao Wang 2019
Let $mathcal{P}$ be a polygonal domain of $h$ holes and $n$ vertices. We study the problem of constructing a data structure that can compute a shortest path between $s$ and $t$ in $mathcal{P}$ under the $L_1$ metric for any two query points $s$ and $t$. To do so, a standard approach is to first find a set of $n_s$ gateways for $s$ and a set of $n_t$ gateways for $t$ such that there exist a shortest $s$-$t$ path containing a gateway of $s$ and a gateway of $t$, and then compute a shortest $s$-$t$ path using these gateways. Previous algorithms all take quadratic $O(n_scdot n_t)$ time to solve this problem. In this paper, we propose a divide-and-conquer technique that solves the problem in $O(n_s + n_t log n_s)$ time. As a consequence, we construct a data structure of $O(n+(h^2log^3 h/loglog h))$ size in $O(n+(h^2log^4 h/loglog h))$ time such that each query can be answered in $O(log n)$ time.
For a polygonal domain with $h$ holes and a total of $n$ vertices, we present algorithms that compute the $L_1$ geodesic diameter in $O(n^2+h^4)$ time and the $L_1$ geodesic center in $O((n^4+n^2 h^4)alpha(n))$ time, respectively, where $alpha(cdot)$ denotes the inverse Ackermann function. No algorithms were known for these problems before. For the Euclidean counterpart, the best algorithms compute the geodesic diameter in $O(n^{7.73})$ or $O(n^7(h+log n))$ time, and compute the geodesic center in $O(n^{11}log n)$ time. Therefore, our algorithms are significantly faster than the algorithms for the Euclidean problems. Our algorithms are based on several interesting observations on $L_1$ shortest paths in polygonal domains.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا