
Finding the Mode of a Kernel Density Estimate

Posted by Wai Ming Tai
Published 2019
Research field: Informatics Engineering
Paper language: English





Given points $p_1, \dots, p_n$ in $\mathbb{R}^d$, how do we find a point $x$ which maximizes $\frac{1}{n} \sum_{i=1}^n e^{-\|p_i - x\|^2}$? In other words, how do we find the maximizing point, or mode, of a Gaussian kernel density estimate (KDE) centered at $p_1, \dots, p_n$? Given the power of KDEs in representing probability distributions and other continuous functions, the basic mode finding problem is widely applicable. However, it is poorly understood algorithmically. Few provable algorithms are known, so practitioners rely on heuristics like the mean-shift algorithm, which are not guaranteed to find a global optimum. We address this challenge by providing fast and provably accurate approximation algorithms for mode finding in both the low and high dimensional settings. For low dimension $d$, our main contribution is to reduce the mode finding problem to solving a small number of systems of polynomial inequalities. For high dimension $d$, we prove the first dimensionality reduction result for KDE mode finding, which allows for reduction to the low dimensional case. Our result leverages Johnson-Lindenstrauss random projection, Kirszbraun's classic extension theorem, and, perhaps surprisingly, the mean-shift heuristic for mode finding.
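For intuition, here is a minimal NumPy sketch of the mean-shift heuristic the abstract mentions, restarted from every data point with the best endpoint kept. The function names and the multi-start strategy are illustrative choices, not the paper's algorithm:

```python
import numpy as np

def kde_value(points, x):
    """Gaussian KDE from the abstract: (1/n) * sum_i exp(-||p_i - x||^2)."""
    return np.mean(np.exp(-np.sum((points - x) ** 2, axis=1)))

def mean_shift_mode(points, n_iter=200, tol=1e-9):
    """Run the mean-shift fixed-point iteration from every data point
    and return the best endpoint found. This is the heuristic the
    abstract contrasts against: fast in practice, but with no
    guarantee of reaching the global mode."""
    best_x, best_val = None, -np.inf
    for start in points:
        x = start.copy()
        for _ in range(n_iter):
            w = np.exp(-np.sum((points - x) ** 2, axis=1))  # kernel weights
            x_new = w @ points / w.sum()                    # weighted mean of P
            if np.linalg.norm(x_new - x) < tol:
                x = x_new
                break
            x = x_new
        val = kde_value(points, x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
P = rng.normal(size=(50, 2))
mode, value = mean_shift_mode(P)
```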




Read also

168 - Wai Ming Tai 2020
Given a point set $P \subset \mathbb{R}^d$, the kernel density estimate for the Gaussian kernel is defined as $\overline{\mathcal{G}}_P(x) = \frac{1}{\left|P\right|}\sum_{p\in P}e^{-\left\lVert x-p \right\rVert^2}$ for any $x \in \mathbb{R}^d$. We study how to construct a small subset $Q$ of $P$ such that the kernel density estimate of $P$ can be approximated by the kernel density estimate of $Q$. Such a subset $Q$ is called a coreset. The primary technique in this work is to construct a $\pm 1$ coloring on the point set $P$ using discrepancy theory and to apply this coloring algorithm recursively. Our result leverages Banaszczyk's theorem. When $d > 1$ is constant, our construction gives a coreset of size $O\left(\frac{1}{\varepsilon}\right)$, as opposed to the best-known result of $O\left(\frac{1}{\varepsilon}\sqrt{\log\frac{1}{\varepsilon}}\right)$. It is the first result to break the $\sqrt{\log}$-factor barrier, even when $d = 2$.
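The coloring-and-halving recursion can be sketched as follows. The balanced random coloring below is only a placeholder for the low-discrepancy coloring the paper derives from Banaszczyk's theorem, so this illustrates the recursion but not the claimed $O(1/\varepsilon)$ size bound:

```python
import numpy as np

def placeholder_coloring(Q, rng):
    """Stand-in +/-1 coloring with equal class sizes. The paper's
    construction instead uses a low-discrepancy coloring (via
    Banaszczyk's theorem) so that dropping one class barely
    changes the KDE."""
    signs = np.array([1, -1] * (len(Q) // 2))
    rng.shuffle(signs)
    return signs

def halving_coreset(P, target_size, seed=0):
    """Recursive halving: color the current set, keep the +1 class
    (half the points), and repeat until the target size is reached.
    The KDE error accumulates the discrepancy of each coloring."""
    rng = np.random.default_rng(seed)
    Q = np.array(P)
    while len(Q) > target_size and len(Q) % 2 == 0:
        signs = placeholder_coloring(Q, rng)
        Q = Q[signs > 0]   # |Q| halves; per-point weight doubles
    return Q
```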
175 - Jack Wang 2012
Let $\mathscr{O}$ be a set of $n$ disjoint obstacles in $\mathbb{R}^2$ and $\mathscr{M}$ be a moving object. Let $s$ and $l$ denote the starting point and maximum path length of the moving object $\mathscr{M}$, respectively. Given a point $p$ in $\mathbb{R}^2$, we say the point $p$ is achievable for $\mathscr{M}$ if $\pi(s,p) \leq l$, where $\pi(\cdot)$ denotes the shortest path length in the presence of obstacles. The goal is to find a region $\mathscr{R}$ such that, for any point $p \in \mathbb{R}^2$, if it is achievable for $\mathscr{M}$, then $p \in \mathscr{R}$; otherwise, $p \notin \mathscr{R}$. In this paper, we restrict our attention to the case of line-segment obstacles. To tackle this problem, we develop three algorithms. We first present a simpler-version algorithm for the sake of intuition. Its basic idea is to reduce our problem to computing the union of a set of circular visibility regions (CVRs). This algorithm takes $O(n^3)$ time. By analysing its dominant steps, we break through its bottleneck by using the shortest path map (SPM) technique to obtain those circles (unavailable beforehand), yielding an $O(n^2 \log n)$ algorithm. Owing to the finding above, the third algorithm also uses the SPM technique. It does not, however, go on to construct the CVRs; instead, it directly traverses each region of the SPM to trace the boundaries. This final algorithm achieves $O(n \log n)$ complexity.
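For intuition about the achievability predicate itself, the following hedged sketch tests $\pi(s,p) \leq l$ by running Dijkstra over a brute-force visibility graph on the segment endpoints; it ignores geometric degeneracies and is not one of the three algorithms above:

```python
import heapq, math

def _orient(a, b, c):
    """Signed area test: >0 if a,b,c turn counterclockwise."""
    return (b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0])

def _cross(p, q, a, b):
    """True if open segments pq and ab properly cross. Degenerate
    cases (shared endpoints, collinear overlap) are ignored here."""
    return (_orient(a, b, p) * _orient(a, b, q) < 0 and
            _orient(p, q, a) * _orient(p, q, b) < 0)

def achievable(s, p, segments, l):
    """Test pi(s, p) <= l for line-segment obstacles by Dijkstra on
    the visibility graph over {s, p} and all segment endpoints.
    segments is a list of endpoint pairs, e.g. [((0,0),(1,1)), ...]."""
    nodes = [s, p] + [e for seg in segments for e in seg]
    def visible(u, v):
        return all(not _cross(u, v, a, b) for a, b in segments)
    dist = {0: 0.0}
    heap = [(0.0, 0)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist.get(i, math.inf):
            continue
        if i == 1:                      # node 1 is the query point p
            return d <= l
        for j in range(len(nodes)):
            if j != i and visible(nodes[i], nodes[j]):
                nd = d + math.dist(nodes[i], nodes[j])
                if nd < dist.get(j, math.inf):
                    dist[j] = nd
                    heapq.heappush(heap, (nd, j))
    return False
```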
We construct near-optimal coresets for kernel density estimates for points in $\mathbb{R}^d$ when the kernel is positive definite. Specifically, we show a polynomial-time construction for a coreset of size $O(\sqrt{d}/\varepsilon \cdot \sqrt{\log 1/\varepsilon})$, and we show a near-matching lower bound of size $\Omega(\min\{\sqrt{d}/\varepsilon, 1/\varepsilon^2\})$. When $d \geq 1/\varepsilon^2$, it is known that the coreset can have size $O(1/\varepsilon^2)$. The upper bound is a polynomial-in-$(1/\varepsilon)$ improvement when $d \in [3, 1/\varepsilon^2)$, and the lower bound is the first known lower bound to depend on $d$ for this problem. Moreover, the upper bound's restriction that the kernel be positive definite is significant in that it applies to a wide variety of kernels, specifically those most important for machine learning. This includes kernels for information distances and the sinc kernel, which can be negative.
We study the construction of coresets for kernel density estimates. That is, we show how to approximate the kernel density estimate described by a large point set with another kernel density estimate over a much smaller point set. For characteristic kernels (including Gaussian and Laplace kernels), our approximation preserves the $L_\infty$ error between kernel density estimates within error $\epsilon$, with coreset size $2/\epsilon^2$, independent of all other aspects of the data, including the dimension, the diameter of the point set, and the bandwidth of the kernel, which constrain other approximations. When the dimension is unrestricted, we show this bound is tight for these kernels as well as a much broader set. This work provides a careful analysis of the iterative Frank-Wolfe algorithm adapted to this context, an algorithm called \emph{kernel herding}. This analysis unites a broad line of work that spans statistics, machine learning, and geometry. When the dimension $d$ is constant, we demonstrate much tighter bounds on the size of the coreset specifically for Gaussian kernels, showing that it is bounded by the size of the coreset for axis-aligned rectangles. Currently the best known constructive bound is $O(\frac{1}{\epsilon} \log^d \frac{1}{\epsilon})$; non-constructively, this can be improved by a factor of $\sqrt{\log \frac{1}{\epsilon}}$. This improves the best constant-dimension bounds polynomially for $d \geq 3$.
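A minimal sketch of kernel herding in this discrete setting, with candidates restricted to $P$, uniform weights, and an illustrative unit-bandwidth Gaussian kernel:

```python
import numpy as np

def gaussian_gram(A, B):
    """Pairwise Gaussian kernel matrix K[i, j] = exp(-||a_i - b_j||^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2)

def kernel_herding(P, m):
    """Greedy Frank-Wolfe-style herding: at each step add the point
    of P whose kernel feature map is best aligned with the residual
    between the kernel mean of P and that of the coreset so far.
    P is an (n, d) array; returns m rows of P (with repetition)."""
    K = gaussian_gram(P, P)
    mean_embed = K.mean(axis=1)        # <mu_P, phi(p_i)> for every i
    picked, running = [], np.zeros(len(P))
    for t in range(m):
        coreset_embed = running / t if t > 0 else 0.0
        i = int(np.argmax(mean_embed - coreset_embed))
        picked.append(i)
        running += K[:, i]             # accumulate <phi(q_j), phi(p_i)>
    return P[picked]
```

Running this with $m \approx 2/\epsilon^2$ steps matches the coreset size discussed in the abstract.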
We give algorithms with running time $2^{O(\sqrt{k}\log k)} \cdot n^{O(1)}$ for the following problems. Given an $n$-vertex unit disk graph $G$ and an integer $k$, decide whether $G$ contains (1) a path on exactly/at least $k$ vertices, (2) a cycle on exactly $k$ vertices, (3) a cycle on at least $k$ vertices, (4) a feedback vertex set of size at most $k$, and (5) a set of $k$ pairwise vertex-disjoint cycles. For the first three problems, no subexponential-time parameterized algorithms were previously known. For the remaining two problems, our algorithms significantly outperform the previously best known parameterized algorithms, which run in time $2^{O(k^{0.75}\log k)} \cdot n^{O(1)}$. Our algorithms are based on a new kind of tree decomposition of unit disk graphs in which the separators can have size up to $k^{O(1)}$ and there exists a solution that crosses every separator at most $O(\sqrt{k})$ times. The running times of our algorithms are optimal up to the $\log k$ factor in the exponent, assuming the Exponential Time Hypothesis.