ﻻ يوجد ملخص باللغة العربية
We present a methodology for parallel acceleration of learning in the presence of matrix orthogonality and unitarity constraints of interest in several branches of machine learning. We show how an apparently sequential elementary rotation parametrization can be restructured into blocks of commutative operations using a well-known tool for coloring the edges of complete graphs, in turn widely applied to schedule round-robin (all-against-all) sports tournaments. The resulting decomposition admits an algorithm to compute a fully-parametrized orthogonal matrix from its rotation parameters in $O(n)$ sequential steps and one to compute the gradient of a training loss with respect to its parameters in $O(nlog n)$ steps. We discuss parametric restrictions of interest to generative modeling and present promising performance results with a prototype GPU implementation.
We introduce an efficient approach for optimization over orthogonal groups on highly parallel computation units such as GPUs or TPUs. As in earlier work, we parametrize an orthogonal matrix as a product of Householder reflections. However, to overcom
We propose proximal backpropagation (ProxProp) as a novel algorithm that takes implicit instead of explicit gradient steps to update the network parameters during neural network training. Our algorithm is motivated by the step size limitation of expl
Consider a device that is connected to an edge processor via a communication channel. The device holds local data that is to be offloaded to the edge processor so as to train a machine learning model, e.g., for regression or classification. Transmiss
For reinforcement learning (RL), it is challenging for an agent to master a task that requires a specific series of actions due to sparse rewards. To solve this problem, reverse curriculum generation (RCG) provides a reverse expansion approach that a
Using properties of skew-Hamiltonian matrices and classic connectedness results, we prove that the moduli space $M_{ort}^0(r,n)$ of stable rank $r$ orthogonal vector bundles on $mathbb{P}^2$, with Chern classes $(c_1,c_2)=(0,n)$, and trivial splittin