Over a complete finite-dimensional Riemannian manifold, Greene and Wu introduced a convolution, now known as the Greene-Wu (GW) convolution. In this paper, we study properties of the GW convolution and apply it to non-Euclidean machine learning problems. In particular, we derive a new formula for how the curvature of the space affects the curvature of a function under the GW convolution. Building on this study of the GW convolution, we also introduce a new method for gradient estimation over Riemannian manifolds.
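As an informal illustration of the smoothing-based gradient estimation idea, here is a minimal sketch on the unit sphere; the choice of manifold, the smoothing radius mu, and the scaling by the tangent dimension are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def exp_sphere(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x with initial velocity v."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return x
    return np.cos(nv) * x + np.sin(nv) * (v / nv)

def estimate_grad(f, x, mu=1e-3, num_samples=500, seed=0):
    """Zeroth-order estimate of the Riemannian gradient of f at x on the sphere,
    obtained by averaging finite differences along random tangent directions."""
    rng = np.random.default_rng(seed)
    n = x.size
    g = np.zeros(n)
    for _ in range(num_samples):
        v = rng.standard_normal(n)
        v -= np.dot(v, x) * x              # project onto the tangent space at x
        v /= np.linalg.norm(v)
        g += (f(exp_sphere(x, mu * v)) - f(x)) / mu * v
    return (n - 1) * g / num_samples       # (n - 1) = dimension of the tangent space
```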
In this paper, we discuss the heat flow of a pseudo-harmonic map from a closed pseudo-Hermitian manifold to a Riemannian manifold with non-positive sectional curvature, and prove the existence of a pseudo-harmonic map, which generalizes the Eells-Sampson existence theorem. We also discuss the uniqueness of the pseudo-harmonic representative of its homotopy class, which generalizes Hartman's theorem, provided that the target manifold has negative sectional curvature.
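Schematically, and by analogy with the classical Eells-Sampson method, the heat-flow approach deforms an initial map $u_0$ along the tension field; the symbol $\tau_b$ below is an assumed placeholder for the pseudo-Hermitian tension field associated with the horizontal energy, not necessarily the paper's notation.

```latex
% Heat flow for the map u : M x [0, \infty) -> N (schematic form)
\begin{aligned}
  \partial_t u(x,t) &= \tau_b\bigl(u(x,t)\bigr), && x \in M,\ t > 0,\\
  u(x,0) &= u_0(x), && x \in M,
\end{aligned}
```

with long-time existence and subconvergence of $u(\cdot,t)$ to a pseudo-harmonic map expected under the non-positive curvature assumption on the target.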
We study gradient-based regularization methods for neural networks. We mainly focus on two regularization methods: total variation and Tikhonov regularization. Applying these methods is equivalent to using neural networks to solve certain partial differential equations, mostly in high dimensions in practical applications. In this work, we introduce a general framework to analyze the generalization error of regularized networks. The error estimate relies on two assumptions, on the approximation error and on the quadrature error. Moreover, we conduct experiments on image classification tasks showing that gradient-based methods can significantly improve the generalization ability and adversarial robustness of neural networks. A graphical extension of the gradient-based methods is also considered in the experiments.
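For concreteness, a minimal PyTorch-style sketch of the two gradient penalties; the helper name, the choice of penalizing the input gradient of the loss, and the weighting of the penalties are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def data_loss_and_gradient_penalties(model, x, y):
    """Return the data loss together with total-variation-type and Tikhonov-type
    penalties on the gradient of the loss with respect to the input."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad_x,) = torch.autograd.grad(loss, x, create_graph=True)
    tv = grad_x.abs().flatten(1).sum(dim=1).mean()         # total-variation-type penalty (L1 of the input gradient)
    tikhonov = grad_x.pow(2).flatten(1).sum(dim=1).mean()  # Tikhonov-type penalty (squared L2 of the input gradient)
    return loss, tv, tikhonov

# Training objective (lambda_tv, lambda_tik are assumed hyperparameters):
#   total = loss + lambda_tv * tv      or      total = loss + lambda_tik * tikhonov
```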
We study the convergence of the gradient algorithm (employing general step sizes) for optimization problems on general Riemannian manifolds (without curvature constraints). Under the assumption of local convexity/quasi-convexity (resp. weak sharp minima), local/global convergence (resp. linear convergence) results are established. As an application, the linear convergence properties of the gradient algorithm employing constant step sizes and Armijo step sizes for finding the Riemannian $L^p$ ($p\in[1,+\infty)$) centers of mass are explored, respectively, which in particular extend and/or improve the corresponding results in \cite{Afsari2013}.
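A small numerical sketch of the setting for $p=2$ (the Riemannian $L^2$ center of mass on the unit sphere), using gradient descent with Armijo backtracking; the manifold, tolerances, and line-search constants are illustrative assumptions rather than the paper's setup.

```python
import numpy as np

def exp_sphere(x, v):
    nv = np.linalg.norm(v)
    return x if nv < 1e-12 else np.cos(nv) * x + np.sin(nv) * (v / nv)

def log_sphere(x, y):
    """Inverse exponential (log) map on the unit sphere."""
    p = y - np.dot(x, y) * x
    npv = np.linalg.norm(p)
    if npv < 1e-12:
        return np.zeros_like(x)
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0)) * p / npv

def l2_center_of_mass(points, x0, sigma=1e-4, beta=0.5, max_iter=200):
    """Riemannian gradient descent with Armijo backtracking for the L^2 center of mass."""
    cost = lambda x: 0.5 * np.mean([np.linalg.norm(log_sphere(x, p)) ** 2 for p in points])
    x = x0 / np.linalg.norm(x0)
    for _ in range(max_iter):
        grad = -np.mean([log_sphere(x, p) for p in points], axis=0)  # Riemannian gradient
        if np.linalg.norm(grad) < 1e-9:
            break
        t, c0 = 1.0, cost(x)
        while cost(exp_sphere(x, -t * grad)) > c0 - sigma * t * np.dot(grad, grad) and t > 1e-12:
            t *= beta                                                # Armijo backtracking
        x = exp_sphere(x, -t * grad)                                 # geodesic step
    return x
```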
Thanks to the combination of state-of-the-art accelerators and highly optimized open software frameworks, there has been tremendous progress in the performance of deep neural networks. While these developments have been responsible for many breakthroughs, progress towards solving large-scale problems, such as video encoding and semantic segmentation in 3D, is hampered because access to on-premise memory is often limited. Instead of relying on (optimal) checkpointing or invertibility of the network layers -- to recover the activations during backpropagation -- we propose to approximate the gradient of convolutional layers in neural networks with a multi-channel randomized trace estimation technique. Compared to other methods, this approach is simple, amenable to analysis, and leads to a greatly reduced memory footprint. Even though the randomized trace estimation introduces stochasticity during training, we argue that this is of little consequence as long as the induced errors are of the same order as errors in the gradient due to the use of stochastic gradient descent. We discuss the performance of networks trained with stochastic backpropagation and how the error can be controlled while maximizing memory usage and minimizing computational overhead.
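To illustrate the core ingredient, here is a minimal Hutchinson-style randomized trace estimator; the multi-channel, memory-saving application to convolutional-layer gradients described above is not reproduced here.

```python
import numpy as np

def hutchinson_trace(matvec, dim, num_probes=64, seed=0):
    """Estimate trace(A) from matrix-vector products z -> A @ z using Rademacher probes.
    The estimator is unbiased: E[z^T A z] = trace(A) for z with i.i.d. +/-1 entries."""
    rng = np.random.default_rng(seed)
    est = 0.0
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=dim)   # Rademacher probing vector
        est += z @ matvec(z)
    return est / num_probes

# Example: compare against the exact trace of a random symmetric matrix.
A = np.random.default_rng(1).standard_normal((100, 100))
A = A + A.T
print(hutchinson_trace(lambda z: A @ z, 100), np.trace(A))
```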
We introduce and implement a method to compute stationary states of nonlinear Schrödinger equations on metric graphs. Stationary states are obtained as local minimizers of the nonlinear Schrödinger energy at fixed mass. Our method is based on a normalized gradient flow for the energy (i.e. a gradient flow projected onto a fixed-mass sphere) adapted to the context of nonlinear quantum graphs. We first prove that, at the continuous level, the normalized gradient flow is well-posed, mass-preserving, energy-diminishing, and converges (at least locally) towards stationary states. We then establish the link between the continuous flow and its discretized version. We conclude by conducting a series of numerical experiments in model situations, showing the good performance of the discrete flow in computing stationary states. Further experiments, as well as a detailed explanation of our numerical algorithm, are given in a companion paper.
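As a toy illustration of the normalized (projected) gradient flow, the following sketch works on a single interval with Dirichlet ends rather than a general metric graph; the discretization, nonlinearity exponent, and step size are assumptions for illustration only, not the companion paper's algorithm.

```python
import numpy as np

def normalized_gradient_flow(N=200, L=10.0, mass=1.0, p=3, dt=1e-4, steps=50000):
    """Discrete normalized gradient flow for the NLS energy
    E(u) = 1/2 int |u'|^2 - 1/(p+1) int |u|^(p+1), at fixed L^2 mass."""
    h = L / (N + 1)
    x = np.linspace(h, L - h, N)
    u = np.exp(-((x - L / 2) ** 2))                        # initial guess
    u *= np.sqrt(mass / (h * np.sum(u ** 2)))              # project onto the fixed-mass sphere
    for _ in range(steps):
        lap = np.empty_like(u)
        lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / h ** 2
        lap[0] = (u[1] - 2 * u[0]) / h ** 2                # Dirichlet boundary (u = 0 at the ends)
        lap[-1] = (u[-2] - 2 * u[-1]) / h ** 2
        grad_E = -lap - np.abs(u) ** (p - 1) * u           # L^2 gradient of the energy
        u = u - dt * grad_E                                # unconstrained gradient step
        u *= np.sqrt(mass / (h * np.sum(u ** 2)))          # renormalize: projection back to fixed mass
    return x, u
```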