
Approximating $(k,\ell)$-center clustering for curves

Published by: Kevin Buchin
Publication date: 2018
Research field: Informatics Engineering
Research language: English





The Euclidean $k$-center problem is a classical problem that has been extensively studied in computer science. Given a set $\mathcal{G}$ of $n$ points in Euclidean space, the problem is to determine a set $\mathcal{C}$ of $k$ centers (not necessarily part of $\mathcal{G}$) such that the maximum distance between a point in $\mathcal{G}$ and its nearest neighbor in $\mathcal{C}$ is minimized. In this paper we study the corresponding $(k,\ell)$-center problem for polygonal curves under the Fréchet distance: given a set $\mathcal{G}$ of $n$ polygonal curves in $\mathbb{R}^d$, each of complexity $m$, determine a set $\mathcal{C}$ of $k$ polygonal curves in $\mathbb{R}^d$, each of complexity $\ell$, such that the maximum Fréchet distance of a curve in $\mathcal{G}$ to its closest curve in $\mathcal{C}$ is minimized. We substantially extend and improve the known approximation bounds for curves in dimension $2$ and higher. We show that, if $\ell$ is part of the input, then there is no polynomial-time approximation scheme unless $\mathsf{P}=\mathsf{NP}$. Our constructions yield different bounds for one- and two-dimensional curves and for the discrete and continuous Fréchet distance. In the case of the discrete Fréchet distance on two-dimensional curves, we show hardness of approximation within a factor close to $2.598$. This result also holds when $k=1$, and the $\mathsf{NP}$-hardness extends to the case $\ell=\infty$, i.e., to the problem of computing the minimum enclosing ball under the Fréchet distance. Finally, we observe that a careful adaptation of Gonzalez' algorithm in combination with a curve simplification yields a $3$-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.
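The Gonzalez step named in the abstract can be illustrated with a minimal Python sketch. It runs the classical farthest-first traversal over a set of curves, measuring distances with the standard dynamic program for the discrete Fréchet distance. It deliberately omits the simplification step, so the returned centers are input curves of complexity $m$ rather than $\ell$. It therefore shows only the generic greedy $k$-center heuristic in a metric space, not the paper's $3$-approximation. The function names `discrete_frechet` and `gonzalez_centers` are illustrative assumptions and do not come from the paper.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Frechet distance between two vertex sequences (classic O(|p||q|) DP)."""
    n, m = len(p), len(q)
    table = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])  # Euclidean distance between the two vertices
            if i == 0 and j == 0:
                table[i][j] = d
            elif i == 0:
                table[i][j] = max(table[0][j - 1], d)
            elif j == 0:
                table[i][j] = max(table[i - 1][0], d)
            else:
                best_prev = min(table[i - 1][j], table[i][j - 1], table[i - 1][j - 1])
                table[i][j] = max(best_prev, d)
    return table[-1][-1]


def gonzalez_centers(curves, k):
    """Farthest-first traversal: returns (indices of chosen centers, clustering radius).

    Assumption: centers are picked among the input curves and are NOT simplified
    to complexity ell, unlike the adaptation described in the abstract.
    """
    centers = [0]  # start from an arbitrary curve (here: the first one)
    nearest = [discrete_frechet(c, curves[0]) for c in curves]
    while len(centers) < min(k, len(curves)):
        # pick the curve that is farthest from its closest current center
        far = max(range(len(curves)), key=nearest.__getitem__)
        centers.append(far)
        for i, c in enumerate(curves):
            nearest[i] = min(nearest[i], discrete_frechet(c, curves[far]))
    return centers, max(nearest)


if __name__ == "__main__":
    curves = [
        [(0, 0), (1, 0)],
        [(0, 0.1), (1, 0.1)],
        [(0, 5), (1, 5)],
        [(0, 5.2), (1, 5.2)],
    ]
    print(gonzalez_centers(curves, 2))
```

On the four toy curves above, the traversal picks one curve from each of the two well-separated groups. Turning this into a $(k,\ell)$-center routine would additionally require replacing each chosen curve by a simplification with $\ell$ vertices, which this sketch leaves out.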


Read also

In 2015, Driemel, Krivošija and Sohler introduced the $(k,\ell)$-median problem for clustering polygonal curves under the Fréchet distance. Given a set of input curves, the problem asks to find $k$ median curves of at most $\ell$ vertices each that minimize the sum of Fréchet distances over all input curves to their closest median curve. A major shortcoming of their algorithm is that the input curves are restricted to lie on the real line. In this paper, we present a randomized bicriteria-approximation algorithm that works for polygonal curves in $\mathbb{R}^d$ and achieves approximation factor $(1+\epsilon)$ with respect to the clustering costs. The algorithm has worst-case running time linear in the number of curves, polynomial in the maximum number of vertices per curve (i.e., their complexity), and exponential in $d$, $\ell$, $\epsilon$ and $\delta$, where $\delta$ is the failure probability. We achieve this result through a shortcutting lemma, which guarantees the existence of a polygonal curve with similar cost as an optimal median curve of complexity $\ell$, but of complexity at most $2\ell-2$, and whose vertices can be computed efficiently. We combine this lemma with the superset-sampling technique by Kumar et al. to derive our clustering result. In doing so, we describe and analyze a generalization of the algorithm by Ackermann et al., which may be of independent interest.
Clustering problems often arise in fields such as data mining and machine learning, where a collection of objects must be grouped with respect to a similarity (or dissimilarity) measure. Among clustering problems, $k$-means clustering in particular has received much attention from researchers. Although $k$-means is a very well studied problem, its status in the plane is still open. In particular, it is unknown whether it admits a PTAS in the plane. The best known approximation bound in polynomial time is $9+\epsilon$. In this paper, we consider the following variant of $k$-means. Given a set $C$ of points in $\mathcal{R}^d$ and a real $f > 0$, find a finite set $F$ of points in $\mathcal{R}^d$ that minimizes the quantity $f\cdot|F|+\sum_{p\in C} \min_{q \in F} \|p-q\|^2$. For any fixed dimension $d$, we design a local search PTAS for this problem. We also give a bi-criterion local search algorithm for $k$-means which uses $(1+\epsilon)k$ centers and yields a solution whose cost is at most $(1+\epsilon)$ times the cost of an optimal $k$-means solution. The algorithm runs in polynomial time for any fixed dimension. The contribution of this paper is twofold. On the one hand, we are able to handle the squared distances in an elegant manner, which yields a near-optimal approximation bound. This leads us towards a better understanding of the $k$-means problem. On the other hand, our analysis of local search might also be useful for other geometric problems. This is important considering that very little is known about the local search method for geometric approximation.
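To make the objective in the preceding abstract concrete, the following Python sketch evaluates $f\cdot|F|+\sum_{p\in C}\min_{q\in F}\|p-q\|^2$ for a given candidate set $F$. It only computes the cost of a solution and is not the local search algorithm from that paper. The function name `facility_kmeans_cost` and the sample points are illustrative assumptions.

```python
def facility_kmeans_cost(points, centers, f):
    """Cost f*|F| + sum over p in C of min over q in F of ||p - q||^2.

    points  -- the set C, as a list of coordinate tuples
    centers -- the candidate set F, as a non-empty list of coordinate tuples
    f       -- the positive per-center opening cost
    """
    if not centers:
        raise ValueError("F must contain at least one center")
    assignment = sum(
        min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in centers)
        for p in points
    )
    return f * len(centers) + assignment


if __name__ == "__main__":
    C = [(0, 0), (2, 0), (10, 0)]
    print(facility_kmeans_cost(C, [(1, 0)], 3))            # one center
    print(facility_kmeans_cost(C, [(1, 0), (10, 0)], 3))   # two centers
```

With these sample points, opening a second center at the outlier raises the opening cost by $f$ but removes a large squared distance, which is the trade-off the objective encodes.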
We study two fundamental problems dealing with curves in the plane, namely, the nearest-neighbor problem and the center problem. Let $\mathcal{C}$ be a set of $n$ polygonal curves, each of size $m$. In the nearest-neighbor problem, the goal is to construct a compact data structure over $\mathcal{C}$, such that, given a query curve $Q$, one can efficiently find the curve in $\mathcal{C}$ closest to $Q$. In the center problem, the goal is to find a curve $Q$, such that the maximum distance between $Q$ and the curves in $\mathcal{C}$ is minimized. We use the well-known discrete Fréchet distance function, both under $L_\infty$ and under $L_2$, to measure the distance between two curves. For the nearest-neighbor problem, despite discouraging previous results, we identify two important cases for which it is possible to obtain practical bounds, even when $m$ and $n$ are large. In these cases, either $Q$ is a line segment or $\mathcal{C}$ consists of line segments, and the bounds on the size of the data structure and query time are nearly linear in the size of the input and query curve, respectively. The returned answer is either exact under $L_\infty$, or approximated to within a factor of $1+\varepsilon$ under $L_2$. We also consider the variants in which the location of the input curves is only fixed up to translation, and obtain similar bounds, under $L_\infty$. As for the center problem, we study the case where the center is a line segment, i.e., we seek the line segment that represents the given set as well as possible. We present near-linear time exact algorithms under $L_\infty$, even when the location of the input curves is only fixed up to translation. Under $L_2$, we present a roughly $O(n^2m^3)$-time exact algorithm.
We consider the $k$-center problem in which the centers are constrained to lie on two lines. Given a set of $n$ weighted points in the plane, we want to locate up to $k$ centers on two parallel lines. We present an $O(n\log^2 n)$ time algorithm, which minimizes the weighted distance from any point to a center. We then consider the unweighted case, where the centers are constrained to be on two perpendicular lines. Our algorithms run in $O(n\log^2 n)$ time also in this case.
We give a polynomial-time constant-factor approximation algorithm for maximum independent set for (axis-aligned) rectangles in the plane. Using a polynomial-time algorithm, the best approximation factor previously known is $O(\log\log n)$. The results are based on a new form of recursive partitioning in the plane, in which faces that are constant-complexity and orthogonally convex are recursively partitioned into a constant number of such faces.