This survey provides an introduction to the fundamental theorem of linear algebra and the theory behind it. Our goal is to give a rigorous introduction for readers with prior exposure to linear algebra. Specifically, we provide details and proofs of some results from (Strang, 1993). We then describe the fundamental theorem of linear algebra from different viewpoints and examine the properties and relationships that connect those viewpoints. The fundamental theorem of linear algebra is essential in many fields, such as electrical engineering, computer science, machine learning, and deep learning. This survey is primarily a summary of the purpose and significance of the important theories behind the theorem. Its sole aim is to give a self-contained, rigorous introduction to the concepts and mathematical tools underlying the fundamental theorem of linear algebra, in order to seamlessly introduce the properties of the four subspaces in subsequent sections. We realize, however, that we cannot cover all the useful and interesting results within the available scope; for example, we omit a separate analysis of (orthogonal) projection matrices. We refer the reader to the linear algebra literature for a more detailed introduction to related topics; excellent examples include (Rose, 1982; Strang, 2009; Trefethen and Bau III, 1997; Strang, 2019, 2021).
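To make the statement concrete: for $A \in \mathbb{R}^{m \times n}$ of rank $r$, the theorem concerns the column space $C(A)$, null space $N(A)$, row space $C(A^T)$, and left null space $N(A^T)$, with $\dim C(A) = \dim C(A^T) = r$, $\dim N(A) = n - r$, $\dim N(A^T) = m - r$, and the orthogonality relations $N(A) \perp C(A^T)$ and $N(A^T) \perp C(A)$. The following NumPy sketch (our illustration, not taken from the survey) verifies these dimension counts and orthogonality relations via the SVD.

```python
import numpy as np

# A minimal sketch: recover the four fundamental subspaces of a
# rank-deficient matrix A from its SVD, A = U @ diag(s) @ Vt.
rng = np.random.default_rng(0)
m, n, r = 5, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank r

U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))

col_space  = U[:, :rank]      # orthonormal basis of C(A),   dim = r
left_null  = U[:, rank:]      # orthonormal basis of N(A^T), dim = m - r
row_space  = Vt[:rank, :].T   # orthonormal basis of C(A^T), dim = r
null_space = Vt[rank:, :].T   # orthonormal basis of N(A),   dim = n - r

# Rank-nullity: dim C(A^T) + dim N(A) = n and dim C(A) + dim N(A^T) = m.
assert row_space.shape[1] + null_space.shape[1] == n
assert col_space.shape[1] + left_null.shape[1] == m

# Orthogonality of the pairs: N(A) ⊥ C(A^T) and N(A^T) ⊥ C(A).
assert np.allclose(row_space.T @ null_space, 0)
assert np.allclose(col_space.T @ left_null, 0)
```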
We report on a verification of the Fundamental Theorem of Algebra in ACL2(r). The proof consists of four parts. First, continuity for both complex-valued and real-valued functions of complex numbers is defined, and it is shown that continuous functions from the complex to the real numbers achieve a minimum value over a closed square region. An important class of continuous real-valued functions of a complex argument results from taking the traditional complex norm of a continuous complex function. We think of these continuous functions as having only one (complex) argument, but in ACL2(r) they appear as functions of two arguments. The extra argument is a context, which is uninterpreted; for example, it could hold other arguments that are kept fixed, as in an exponential function with a base and an exponent, either of which could be held fixed. Second, it is shown that complex polynomials are continuous, so the norm of a complex polynomial is a continuous real-valued function and achieves its minimum over an arbitrary square region centered at the origin. This part of the proof benefits from the introduction of the context argument, and it illustrates an innovation that simplifies the proofs of classical properties with unbound parameters. Third, we derive lower and upper bounds on the norm of non-constant polynomials for inputs that are sufficiently far from the origin. This means that a square can be found that is large enough to guarantee that it contains the global minimum of the norm of the polynomial. Fourth, it is shown that if a given number is not a root of a non-constant polynomial, then it cannot be the global minimum. Finally, these results are combined to show that the global minimum must be a root of the polynomial. This result is part of a larger effort in the formalization of complex polynomials in ACL2(r).
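As a numerical companion to this outline (an illustration only, not the ACL2(r) development), the following Python sketch traces the same minimum-modulus argument: bound a square region that must contain the global minimum of $|p(z)|$, search a grid over that square for the minimizer, and check that it lands near an actual root.

```python
import numpy as np

# A numerical companion (not the ACL2(r) development) to the classical
# minimum-modulus argument that the formal proof follows.
coeffs_desc = [1, 0, 0, -2, 2]                 # p(z) = z^4 - 2z + 2
p = np.polynomial.Polynomial(coeffs_desc[::-1])

# Part three of the outline: for |z| >= 1 (leading coefficient 1),
#   |p(z)| >= |z|^3 * (|z| - sum of |lower coefficients|),
# so the square of half-width R below contains the global minimum.
R = 1 + sum(abs(c) for c in coeffs_desc[1:])   # here R = 5

# Part two: search the square [-R, R]^2 for the minimum of |p(z)|.
xs = np.linspace(-R, R, 801)
X, Y = np.meshgrid(xs, xs)
Z = X + 1j * Y
norms = np.abs(p(Z))
i = np.argmin(norms)
zmin = Z.flat[i]

# Part four plus the final combination: the minimizer should be a root.
print("grid minimizer:", zmin, "with |p| =", norms.flat[i])
print("distance to nearest true root:", np.min(np.abs(p.roots() - zmin)))
```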
This note is devoted to two classical theorems: the open mapping theorem for analytic functions (OMT) and the fundamental theorem of algebra (FTA). We present a new proof of the first theorem, and then derive the second by a simple topological argument. The proof is elementary in nature and does not use any kind of integration (neither complex nor real). In addition, it is independent of the fact that the roots of an analytic function are isolated. The proof is based on either the Banach or the Brouwer fixed point theorem. In particular, this shows that one can obtain an (albeit indirect) proof of the FTA based on the Brouwer fixed point theorem, an aim that earlier attempts did not reach and whose very possibility was later questioned. We close this note with a simple generalization of the FTA. A short review of certain issues related to the OMT and the FTA is also included.
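As a loose numerical companion to the fixed-point viewpoint (this is a standard Newton-style fixed-point reformulation, not the note's topological argument), a root of a polynomial can be exhibited as the fixed point of a map that is a contraction near a simple root:

```python
# A root of p recast as a fixed point z = g(z), the shape of statement to
# which the Banach/Brouwer fixed point theorems apply. This Newton-style
# map is an illustration only, not the note's actual construction.
def p(z):  return z**3 - 2*z + 2
def dp(z): return 3*z**2 - 2

def g(z):
    # Fixed points of g are exactly the roots of p (where dp(z) != 0),
    # and g contracts in a neighborhood of any simple root.
    return z - p(z) / dp(z)

z = 1.0 + 1.0j                 # starting guess in the complex plane
for _ in range(50):
    z = g(z)
print("fixed point:", z, "residual |p(z)|:", abs(p(z)))
```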
This paper concerns the minimax center of a collection of linear subspaces. When the subspaces are $k$-dimensional subspaces of $\mathbb{R}^n$, this can be cast as finding the center of a minimum enclosing ball on a Grassmann manifold, Gr$(k,n)$. For subspaces of different dimension, the setting becomes a disjoint union of Grassmannians rather than a single manifold, and the problem is no longer well-defined. However, natural geometric maps exist between these manifolds, with a well-defined notion of distance for the images of the subspaces under the mappings. Solving the initial problem in this context leads to a candidate minimax center on each of the constituent manifolds, but does not inherently indicate which candidate best represents the data. Additionally, the solutions of different rank are generally not nested, so a deflationary approach will not suffice, and the problem must be solved independently on each manifold. We propose and solve an optimization problem parametrized by the rank of the minimax center. The solution is computed using a subgradient algorithm on the dual. By scaling the objective and penalizing the information lost by the rank-$k$ minimax center, we jointly recover an optimal dimension, $k^*$, and a central subspace, $U^* \in$ Gr$(k^*,n)$, at the center of the minimum enclosing ball that best represents the data.
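As a concrete illustration of the distance underlying the minimum enclosing ball (a standard construction; the paper's dual subgradient algorithm is not reproduced here), the geodesic distance on Gr$(k,n)$ is the 2-norm of the vector of principal angles between two subspaces:

```python
import numpy as np
from scipy.linalg import subspace_angles

# A minimal sketch (not the paper's algorithm): the geodesic distance on
# Gr(k, n) is the 2-norm of the principal angles between two subspaces.
def grassmann_distance(A, B):
    """Geodesic distance between span(A) and span(B), for n-by-k A, B."""
    return np.linalg.norm(subspace_angles(A, B))

rng = np.random.default_rng(1)
n, k = 6, 2
subspaces = [rng.standard_normal((n, k)) for _ in range(5)]

# A brute-force stand-in for the minimax center: among the data points
# themselves, pick the one minimizing the maximum distance to the others.
# (The paper optimizes over all of Gr(k, n) via a dual subgradient method;
# this only illustrates the minimax objective.)
radii = [max(grassmann_distance(A, B) for B in subspaces) for A in subspaces]
center = subspaces[int(np.argmin(radii))]
print("minimax radius among data points:", min(radii))
```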
Weight initialization plays an important role in training neural networks and affects a tremendous range of deep learning applications. Various weight initialization strategies have been developed for different activation functions and network architectures. These initialization algorithms are based on minimizing the variance of the parameters between layers and may still fail when neural networks are deep, e.g., due to dying ReLU. To address this challenge, we study neural networks from a nonlinear-computation point of view and propose a novel weight initialization strategy based on the linear product structure (LPS) of neural networks. The proposed strategy is derived from polynomial approximations of activation functions, using theories of numerical algebraic geometry that guarantee finding all local minima. We also provide a theoretical analysis showing that LPS initialization has a lower probability of dying ReLU compared with other existing initialization strategies. Finally, we test the LPS initialization algorithm on both fully connected and convolutional neural networks to show its feasibility, efficiency, and robustness on public datasets.
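The dying-ReLU failure mode mentioned above is easy to observe numerically. The sketch below (our illustration using standard He initialization, not the paper's LPS strategy) measures the fraction of units in the final layer that are inactive on every input as depth grows:

```python
import numpy as np

# An illustrative sketch of the dying-ReLU phenomenon (standard He
# initialization, NOT the paper's LPS strategy): count the final-layer
# units that output zero for every input after data passes through a
# deep stack of randomly initialized ReLU layers.
rng = np.random.default_rng(0)

def dead_fraction(depth, width=128, n_samples=256):
    """Fraction of last-layer units that are zero on all samples."""
    x = rng.standard_normal((n_samples, width))
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
        x = np.maximum(x @ W, 0.0)   # ReLU activation, no bias
    return np.mean(np.all(x == 0.0, axis=0))

# Deeper stacks tend to leave more units permanently inactive, the
# failure mode that variance-based initializations can still suffer.
for depth in (2, 10, 50):
    print(f"{depth:3d} layers: dead fraction = {dead_fraction(depth):.3f}")
```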
In a previous paper, we gave an algebraic model of the set of intervals. Here, we apply this model in a linear-algebra setting. We define a notion of diagonalization for square matrices whose coefficients are intervals. In contrast to the real case, such a matrix of order $n$ may have more than $n$ eigenvalues (the set of intervals is not factorial). We introduce a notion of central eigenvalues that permits a criterion for diagonalization. As an application, we define a notion of exponential mapping.
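To fix the arithmetic setting (this is classical interval arithmetic, not the authors' algebraic model of intervals), the following sketch implements interval addition and multiplication, the entrywise operations that enter an interval matrix product:

```python
from dataclasses import dataclass

# Classical interval arithmetic (a sketch; not the authors' algebraic
# model): an interval [lo, hi] with the usual addition and the
# multiplication taken over all endpoint products.
@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        corners = [self.lo * other.lo, self.lo * other.hi,
                   self.hi * other.lo, self.hi * other.hi]
        return Interval(min(corners), max(corners))

# Example: a 2x2 interval matrix-vector product, entry by entry.
A = [[Interval(1, 2), Interval(-1, 0)],
     [Interval(0, 1), Interval(2, 3)]]
v = [Interval(1, 1), Interval(-1, 2)]
Av = [A[i][0] * v[0] + A[i][1] * v[1] for i in range(2)]
print(Av)   # [Interval(lo=-1, hi=3), Interval(lo=-3, hi=7)]
```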