ﻻ يوجد ملخص باللغة العربية
We introduce a learning-based algorithm for the low-rank decomposition problem: given an $n times d$ matrix $A$, and a parameter $k$, compute a rank-$k$ matrix $A$ that minimizes the approximation loss $|A-A|_F$. The algorithm uses a training set of input matrices in order to optimize its performance. Specifically, some of the most efficient approximate algorithms for computing low-rank approximations proceed by computing a projection $SA$, where $S$ is a sparse random $m times n$ sketching matrix, and then performing the singular value decomposition of $SA$. We show how to replace the random matrix $S$ with a learned matrix of the same sparsity to reduce the error. Our experiments show that, for multiple types of data sets, a learned sketch matrix can substantially reduce the approximation loss compared to a random matrix $S$, sometimes by one order of magnitude. We also study mixed matrices where only some of the rows are trained and the remaining ones are random, and show that matrices still offer improved performance while retaining worst-case guarantees. Finally, to understand the theoretical aspects of our approach, we study the special case of $m=1$. In particular, we give an approximation algorithm for minimizing the empirical loss, with approximation factor depending on the stable rank of matrices in the training set. We also show generalization bounds for the sketch matrix learning problem.
The success of deep reinforcement learning (DRL) is due to the power of learning a representation that is suitable for the underlying exploration and exploitation task. However, existing provable reinforcement learning algorithms with linear function
We consider adaptive approximations of the parameter-to-solution map for elliptic operator equations depending on a large or infinite number of parameters, comparing approximation strategies of different degrees of nonlinearity: sparse polynomial exp
In continual learning (CL), a learner is faced with a sequence of tasks, arriving one after the other, and the goal is to remember all the tasks once the continual learning experience is finished. The prior art in CL uses episodic memory, parameter r
We propose a reparametrization scheme to address the challenges of applying differentially private SGD on large neural networks, which are 1) the huge memory cost of storing individual gradients, 2) the added noise suffering notorious dimensional dep