ترغب بنشر مسار تعليمي؟ اضغط هنا

Fast Sketching of Polynomial Kernels of Polynomial Degree

146   0   0.0 ( 0 )
 نشر من قبل Lichen Zhang
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Kernel methods are fundamental in machine learning, and faster algorithms for kernel approximation provide direct speedups for many core tasks in machine learning. The polynomial kernel is especially important as other kernels can often be approximated by the polynomial kernel via a Taylor series expansion. Recent techniques in oblivious sketching reduce the dependence in the running time on the degree $q$ of the polynomial kernel from exponential to polynomial, which is useful for the Gaussian kernel, for which $q$ can be chosen to be polylogarithmic. However, for more slowly growing kernels, such as the neural tangent and arc-cosine kernels, $q$ needs to be polynomial, and previous work incurs a polynomial factor slowdown in the running time. We give a new oblivious sketch which greatly improves upon this running time, by removing the dependence on $q$ in the leading order term. Combined with a novel sampling scheme, we give the fastest algorithms for approximating a large family of slow-growing kernels.

قيم البحث

اقرأ أيضاً

Kernel methods are fundamental tools in machine learning that allow detection of non-linear dependencies between data without explicitly constructing feature vectors in high dimensional spaces. A major disadvantage of kernel methods is their poor sca lability: primitives such as kernel PCA or kernel ridge regression generally take prohibitively large quadratic space and (at least) quadratic time, as kernel matrices are usually dense. Some methods for speeding up kernel linear algebra are known, but they all invariably take time exponential in either the dimension of the input point set (e.g., fast multipole methods suffer from the curse of dimensionality) or in the degree of the kernel function. Oblivious sketching has emerged as a powerful approach to speeding up numerical linear algebra over the past decade, but our understanding of oblivious sketching solutions for kernel matrices has remained quite limited, suffering from the aforementioned exponential dependence on input parameters. Our main contribution is a general method for applying sketching solutions developed in numerical linear algebra over the past decade to a tensoring of data points without forming the tensoring explicitly. This leads to the first oblivious sketch for the polynomial kernel with a target dimension that is only polynomially dependent on the degree of the kernel function, as well as the first oblivious sketch for the Gaussian kernel on bounded datasets that does not suffer from an exponential dependence on the dimensionality of input data points.
We study the existence of polynomial kernels, for parameterized problems without a polynomial kernel on general graphs, when restricted to graphs of bounded twin-width. Our main result is that a polynomial kernel for $k$-Dominating Set on graphs of t win-width at most 4 would contradict a standard complexity-theoretic assumption. The reduction is quite involved, especially to get the twin-width upper bound down to 4, and can be tweaked to work for Connected $k$-Dominating Set and Total $k$-Dominating Set (albeit with a worse upper bound on the twin-width). The $k$-Independent Set problem admits the same lower bound by a much simpler argument, previously observed [ICALP 21], which extends to $k$-Independent Dominating Set, $k$-Path, $k$-Induced Path, $k$-Induced Matching, etc. On the positive side, we obtain a simple quadratic vertex kernel for Connected $k$-Vertex Cover and Capacitated $k$-Vertex Cover on graphs of bounded twin-width. Interestingly the kernel applies to graphs of Vapnik-Chervonenkis density 1, and does not require a witness sequence. We also present a more intricate $O(k^{1.5})$ vertex kernel for Connected $k$-Vertex Cover. Finally we show that deciding if a graph has twin-width at most 1 can be done in polynomial time, and observe that most optimization/decision graph problems can be solved in polynomial time on graphs of twin-width at most 1.
Let $H$ be a fixed graph. Given a graph $G$ and an integer $k$, the $H$-free edge modification problem asks whether it is possible to modify at most $k$ edges in $G$ to make it $H$-free. Sandeep and Sivadasan (IPEC 2015) asks whether the paw-free com pletion problem and the paw-free edge deletion problem admit polynomial kernels. We answer both questions affirmatively by presenting, respectively, $O(k)$-vertex and $O(k^4)$-vertex kernels for them. This is part of an ongoing program that aims at understanding compressibility of $H$-free edge modification problems.
We show that the k-Dominating Set problem is fixed parameter tractable (FPT) and has a polynomial kernel for any class of graphs that exclude K_{i,j} as a subgraph, for any fixed i, j >= 1. This strictly includes every class of graphs for which this problem has been previously shown to have FPT algorithms and/or polynomial kernels. In particular, our result implies that the problem restricted to bounded- degenerate graphs has a polynomial kernel, solving an open problem posed by Alon and Gutner.
In list-decodable subspace recovery, the input is a collection of $n$ points $alpha n$ (for some $alpha ll 1/2$) of which are drawn i.i.d. from a distribution $mathcal{D}$ with a isotropic rank $r$ covariance $Pi_*$ (the emph{inliers}) and the rest a re arbitrary, potential adversarial outliers. The goal is to recover a $O(1/alpha)$ size list of candidate covariances that contains a $hat{Pi}$ close to $Pi_*$. Two recent independent works (Raghavendra-Yau, Bakshi-Kothari 2020) gave the first efficient algorithm for this problem. These results, however, obtain an error that grows with the dimension (linearly in [RY] and logarithmically in BK) at the cost of quasi-polynomial running time) and rely on emph{certifiable anti-concentration} - a relatively strict condition satisfied essentially only by the Gaussian distribution. In this work, we improve on these results on all three fronts: emph{dimension-independent} error via a faster fixed-polynomial running time under less restrictive distributional assumptions. Specifically, we give a $poly(1/alpha) d^{O(1)}$ time algorithm that outputs a list containing a $hat{Pi}$ satisfying $|hat{Pi} -Pi_*|_F leq O(1/alpha)$. Our result only needs $mathcal{D}$ to have emph{certifiably hypercontractive} degree 2 polynomials. As a result, in addition to Gaussians, our algorithm applies to the uniform distribution on the hypercube and $q$-ary cubes and arbitrary product distributions with subgaussian marginals. Prior work (Raghavendra and Yau, 2020) had identified such distributions as potential hard examples as such distributions do not exhibit strong enough anti-concentration. When $mathcal{D}$ satisfies certifiable anti-concentration, we obtain a stronger error guarantee of $|hat{Pi}-Pi_*|_F leq eta$ for any arbitrary $eta > 0$ in $d^{O(poly(1/alpha) + log (1/eta))}$ time.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا