Tensor Random Projection for Low Memory Dimension Reduction



Published by Madeleine Udell

Publication date: 2021

Research field: Informatics Engineering

Research language: English

Authors: Yiming Sun, Yang Guo, Joel A. Tropp



Random projections reduce the dimension of a set of vectors while preserving structural information, such as distances between vectors in the set. This paper proposes a novel use of row-product random matrices in random projection, which we call the Tensor Random Projection (TRP). The TRP requires substantially less memory than existing dimension reduction maps. The TRP map is formed as the Khatri-Rao product of several smaller random projections and is compatible with any base random projection, including sparse maps, which enable dimension reduction with very low query cost and no floating point operations. We also develop a reduced-variance extension. We provide a theoretical analysis of the bias and variance of the TRP, and a non-asymptotic error analysis for a TRP composed of two smaller maps. Experiments on both synthetic and MNIST data show that our method performs as well as conventional methods with substantially less storage.
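The construction described in the abstract can be illustrated with a minimal NumPy sketch for the two-factor case. This is not the authors' code: the function names (`make_factors`, `trp_dense`, `trp_project`), the Rademacher base maps, the `1/sqrt(k)` scaling, and the row-major reshaping of the input are all assumptions chosen for illustration. The point is that the projection can be applied from the two small factors alone, without ever storing the full map.

```python
import numpy as np

def make_factors(k, dims, rng):
    """Draw one k x d_i Rademacher (+1/-1) matrix per factor (assumed base map)."""
    return [rng.choice([-1.0, 1.0], size=(k, d)) for d in dims]

def trp_dense(factors):
    """Form the full k x (d1*d2) map explicitly. For checking only."""
    A1, A2 = factors
    k = A1.shape[0]
    # Row r of the full map is the Kronecker product of row r of A1 and row r of A2.
    return np.einsum("ri,rj->rij", A1, A2).reshape(k, -1)

def trp_project(x, factors):
    """Apply the two-factor map using only the small factors."""
    A1, A2 = factors
    k, d1 = A1.shape
    d2 = A2.shape[1]
    X = x.reshape(d1, d2)  # row-major reshape matches the Kronecker ordering above
    # <kron(A1[r], A2[r]), x> equals A1[r] @ X @ A2[r] for each output row r.
    return np.einsum("ri,ij,rj->r", A1, X, A2) / np.sqrt(k)

rng = np.random.default_rng(0)
k, dims = 50, (30, 40)
factors = make_factors(k, dims, rng)
x = rng.standard_normal(dims[0] * dims[1])

y = trp_project(x, factors)                      # structured application
y_dense = trp_dense(factors) @ x / np.sqrt(k)    # explicit reference

stored_trp = k * sum(dims)               # entries kept by the factored map
stored_dense = k * dims[0] * dims[1]     # entries kept by an unstructured map
print(stored_trp, stored_dense)
```

In this toy setting the factored map stores `k * (d1 + d2)` entries instead of `k * d1 * d2`, which is the kind of memory saving the abstract refers to.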


182 - Anna Breger , Jose Ignacio Orlando , Pavol Harar 2019

The use of orthogonal projections on high-dimensional input and target data in learning frameworks is studied. First, we investigate the relations between two standard objectives in dimension reduction: preservation of variance and preservation of pairwise relative distances. Investigations of their asymptotic correlation, as well as numerical experiments, show that a projection usually does not satisfy both objectives at once. In a standard classification problem we determine projections on the input data that balance the objectives and compare subsequent results. Next, we extend our application of orthogonal projections to deep learning tasks and introduce a general framework of augmented target loss functions. These loss functions integrate additional information via transformations and projections of the target data. In two supervised learning problems, clinical image segmentation and music information classification, the application of our proposed augmented target loss functions increases the accuracy.

Numerical Analysis, Machine Learning

Tensor Train Random Projection

51 - Yani Feng , Kejun Tang , Lianxing He 2020

This work proposes a novel tensor train random projection (TTRP) method for dimension reduction that approximately preserves pairwise distances. Based on the tensor train format, this new random projection method can speed up the computation for high-dimensional problems and requires less storage with little loss in accuracy, compared with existing methods (e.g., very sparse random projection). Our TTRP is systematically constructed through a rank-one TT-format with Rademacher random variables, which results in efficient projection with small variances. The isometry property of TTRP is proven in this work, and detailed numerical experiments with data sets (synthetic, MNIST and CIFAR-10) are conducted to demonstrate the efficiency of TTRP.

Machine Learning

Stochastic Gradients for Large-Scale Tensor Decomposition

72 - Tamara G. Kolda , David Hong 2019

Tensor decomposition is a well-known tool for multiway data analysis. This work proposes using stochastic gradients for efficient generalized canonical polyadic (GCP) tensor decomposition of large-scale tensors. GCP tensor decomposition is a recently proposed version of tensor decomposition that allows for a variety of loss functions such as Bernoulli loss for binary data or Huber loss for robust estimation. The stochastic gradient is formed from randomly sampled elements of the tensor and is efficient because it can be computed using the sparse matricized-tensor-times-Khatri-Rao product (MTTKRP) tensor kernel. For dense tensors, we simply use uniform sampling. For sparse tensors, we propose two types of stratified sampling that give precedence to sampling nonzeros. Numerical results demonstrate the advantages of the proposed approach and its scalability to large-scale problems.
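As a rough illustration of the kernel named in this abstract, the sketch below computes a dense mode-1 MTTKRP for a 3-way tensor with NumPy and checks it against the explicit unfolding-times-Khatri-Rao form. It is not the sparse, sampled variant the paper describes. The function names (`khatri_rao`, `mttkrp_mode1`) and the row-major unfolding convention are assumptions made for this example; other references order the Khatri-Rao factors differently to match a column-major unfolding.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Khatri-Rao product: column r is kron(B[:, r], C[:, r])."""
    J, R = B.shape
    K = C.shape[0]
    return np.einsum("jr,kr->jkr", B, C).reshape(J * K, R)

def mttkrp_mode1(X, B, C):
    """Mode-1 MTTKRP computed directly, without forming the Khatri-Rao product."""
    # M[i, r] = sum over j, k of X[i, j, k] * B[j, r] * C[k, r]
    return np.einsum("ijk,jr,kr->ir", X, B, C)

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 3
X = rng.standard_normal((I, J, K))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

M_fast = mttkrp_mode1(X, B, C)
# Reference: row-major mode-1 unfolding times the explicit Khatri-Rao product.
M_ref = X.reshape(I, J * K) @ khatri_rao(B, C)
```

The direct form avoids materializing the `(J*K) x R` Khatri-Rao matrix, which is the main reason this kernel is usually implemented as a fused contraction.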

Numerical Analysis, Machine Learning

Wedderburn rank reduction and Krylov subspace method for tensor approximation. Part 1: Tucker case

281 - S. A. Goreinov , I. V. Oseledets , D. V. Savostyanov 2010

New algorithms are proposed for the Tucker approximation of a 3-tensor that access it using only the tensor-by-vector-by-vector multiplication subroutine. In the matrix case, Krylov methods are the methods of choice for approximating the dominant column and row subspaces of a sparse or structured matrix given through the matrix-by-vector multiplication subroutine. Using the Wedderburn rank reduction formula, we propose an algorithm of matrix approximation that computes Krylov subspaces and allows generalization to the tensor case. Several variants of the proposed tensor algorithms differ in pivoting strategy, overall cost, and quality of approximation. Numerical experiments show that the proposed methods are faster and more accurate than the minimal Krylov recursion recently proposed by Elden and Savas.

Numerical Analysis, Data Structures and Algorithms

Accelerating Block Coordinate Descent for Nonnegative Tensor Factorization

81 - Andersen Man Shun Ang , Jeremy E. Cohen , Nicolas Gillis 2020

This paper is concerned with improving the empirical convergence speed of block-coordinate descent algorithms for approximate nonnegative tensor factorization (NTF). We propose an extrapolation strategy between block updates, referred to as heuristic extrapolation with restarts (HER). HER significantly accelerates the empirical convergence speed of most existing block-coordinate algorithms for dense NTF, in particular for challenging computational scenarios, while requiring a negligible additional computational budget.

Numerical Analysis, Machine Learning