Hutch++: Optimal Stochastic Trace Estimation

301 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Christopher Musco

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Raphael A. Meyer - Cameron Musco - Christopher Musco

بنى وهياكل البيانات والخوارزميات التعلم الآلي التحليل العددي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We study the problem of estimating the trace of a matrix $A$ that can only be accessed through matrix-vector multiplication. We introduce a new randomized algorithm, Hutch++, which computes a $(1 pm epsilon)$ approximation to $tr(A)$ for any positive semidefinite (PSD) $A$ using just $O(1/epsilon)$ matrix-vector products. This improves on the ubiquitous Hutchinsons estimator, which requires $O(1/epsilon^2)$ matrix-vector products. Our approach is based on a simple technique for reducing the variance of Hutchinsons estimator using a low-rank approximation step, and is easy to implement and analyze. Moreover, we prove that, up to a logarithmic factor, the complexity of Hutch++ is optimal amongst all matrix-vector query algorithms, even when queries can be chosen adaptively. We show that it significantly outperforms Hutchinsons method in experiments. While our theory mainly requires $A$ to be positive semidefinite, we provide generalized guarantees for general square matrices, and show empirical gains in such applications.

قيم البحث

اقرأ أيضاً

Instance Optimal Join Size Estimation

78 - Mahmoud Abo-Khamis , Sungjin Im , Benjamin Moseley 2020

We consider the problem of efficiently estimating the size of the inner join of a collection of preprocessed relational tables from the perspective of instance optimality analysis. The run time of instance optimal algorithms is comparable to the mini mum time needed to verify the correctness of a solution. Previously instance optimal algorithms were only known when the size of the join was small (as one component of their run time that was linear in the join size). We give an instance optimal algorithm for estimating the join size for all instances, including when the join size is large, by removing the dependency on the join size. As a byproduct, we show how to sample rows from the join uniformly at random in a comparable amount of time.

بنى وهياكل البيانات والخوارزميات قواعد البيانات

Optimal Coreset for Gaussian Kernel Density Estimation

168 - Wai Ming Tai 2020

Given a point set $Psubset mathbb{R}^d$, a kernel density estimation for Gaussian kernel is defined as $overline{mathcal{G}}_P(x) = frac{1}{left|Pright|}sum_{pin P}e^{-leftlVert x-p rightrVert^2}$ for any $xinmathbb{R}^d$. We study how to construct a small subset $Q$ of $P$ such that the kernel density estimation of $P$ can be approximated by the kernel density estimation of $Q$. This subset $Q$ is called coreset. The primary technique in this work is to construct $pm 1$ coloring on the point set $P$ by the discrepancy theory and apply this coloring algorithm recursively. Our result leverages Banaszczyks Theorem. When $d>1$ is constant, our construction gives a coreset of size $Oleft(frac{1}{varepsilon}right)$ as opposed to the best-known result of $Oleft(frac{1}{varepsilon}sqrt{logfrac{1}{varepsilon}}right)$. It is the first to give a breakthrough on the barrier of $sqrt{log}$ factor even when $d=2$.

بنى وهياكل البيانات والخوارزميات الهندسة الحسابية التعلم الآلي

Low-memory stochastic backpropagation with multi-channel randomized trace estimation

232 - Mathias Louboutin , Ali Siahkoohi , Rongrong Wang 2021

Thanks to the combination of state-of-the-art accelerators and highly optimized open software frameworks, there has been tremendous progress in the performance of deep neural networks. While these developments have been responsible for many breakthro ughs, progress towards solving large-scale problems, such as video encoding and semantic segmentation in 3D, is hampered because access to on-premise memory is often limited. Instead of relying on (optimal) checkpointing or invertibility of the network layers -- to recover the activations during backpropagation -- we propose to approximate the gradient of convolutional layers in neural networks with a multi-channel randomized trace estimation technique. Compared to other methods, this approach is simple, amenable to analyses, and leads to a greatly reduced memory footprint. Even though the randomized trace estimation introduces stochasticity during training, we argue that this is of little consequence as long as the induced errors are of the same order as errors in the gradient due to the use of stochastic gradient descent. We discuss the performance of networks trained with stochastic backpropagation and how the error can be controlled while maximizing memory usage and minimizing computational overhead.

التعلم الآلي التحليل العددي التحليل العددي

Stream sampling for variance-optimal estimation of subset sums

403 - Edith Cohen , Nick Duffield , Haim Kaplan 2010

From a high volume stream of weighted items, we want to maintain a generic sample of a certain limited size $k$ that we can later use to estimate the total weight of arbitrary subsets. This is the classic context of on-line reservoir sampling, thinki ng of the generic sample as a reservoir. We present an efficient reservoir sampling scheme, $varoptk$, that dominates all previous schemes in terms of estimation quality. $varoptk$ provides {em variance optimal unbiased estimation of subset sums}. More precisely, if we have seen $n$ items of the stream, then for {em any} subset size $m$, our scheme based on $k$ samples minimizes the average variance over all subsets of size $m$. In fact, the optimality is against any off-line scheme with $k$ samples tailored for the concrete set of items seen. In addition to optimal average variance, our scheme provides tighter worst-case bounds on the variance of {em particular} subsets than previously possible. It is efficient, handling each new item of the stream in $O(log k)$ time. Finally, it is particularly well suited for combination of samples from different streams in a distributed setting.

بنى وهياكل البيانات والخوارزميات

Circular Trace Reconstruction

122 - Shyam Narayanan , Michael Ren 2020

Trace reconstruction is the problem of learning an unknown string $x$ from independent traces of $x$, where traces are generated by independently deleting each bit of $x$ with some deletion probability $q$. In this paper, we initiate the study of Cir cular trace reconstruction, where the unknown string $x$ is circular and traces are now rotated by a random cyclic shift. Trace reconstruction is related to many computational biology problems studying DNA, which is a primary motivation for this problem as well, as many types of DNA are known to be circular. Our main results are as follows. First, we prove that we can reconstruct arbitrary circular strings of length $n$ using $expbig(tilde{O}(n^{1/3})big)$ traces for any constant deletion probability $q$, as long as $n$ is prime or the product of two primes. For $n$ of this form, this nearly matches what was the best known bound of $expbig(O(n^{1/3})big)$ for standard trace reconstruction when this paper was initially released. We note, however, that Chase very recently improved the standard trace reconstruction bound to $expbig(tilde{O}(n^{1/5})big)$. Next, we prove that we can reconstruct random circular strings with high probability using $n^{O(1)}$ traces for any constant deletion probability $q$. Finally, we prove a lower bound of $tilde{Omega}(n^3)$ traces for arbitrary circular strings, which is greater than the best known lower bound of $tilde{Omega}(n^{3/2})$ in standard trace reconstruction.

بنى وهياكل البيانات والخوارزميات نظرية الأعداد