Equivariant bifurcation, quadratic equivariants, and symmetry breaking for the standard representation of $S_n$

74 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Yossi Arjevani

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Yossi Arjevani - Michael Field

التعلم الآلي النظم الديناميكية التحسين والتحكم

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Motivated by questions originating from the study of a class of shallow student-teacher neural networks, methods are developed for the analysis of spurious minima in classes of gradient equivariant dynamics related to neural nets. In the symmetric case, methods depend on the generic equivariant bifurcation theory of irreducible representations of the symmetric group on $n$ symbols, $S_n$; in particular, the standard representation of $S_n$. It is shown that spurious minima do not arise from spontaneous symmetry breaking but rather through a complex deformation of the landscape geometry that can be encoded by a generic $S_n$-equivariant bifurcation. We describe minimal models for forced symmetry breaking that give a lower bound on the dynamic complexity involved in the creation of spurious minima when there is no symmetry. Results on generic bifurcation when there are quadratic equivariants are also proved; this work extends and clarifies results of Ihrig & Golubitsky and Chossat, Lauterback & Melbourne on the instability of solutions when there are quadratic equivariants.

قيم البحث

146 - Yossi Arjevani , Michael Field 2020

We consider the optimization problem associated with fitting two-layer ReLU networks with $k$ hidden neurons, where labels are assumed to be generated by a (teacher) neural network. We leverage the rich symmetry exhibited by such models to identify v arious families of critical points and express them as power series in $k^{-frac{1}{2}}$. These expressions are then used to derive estimates for several related quantities which imply that not all spurious minima are alike. In particular, we show that while the loss function at certain types of spurious minima decays to zero like $k^{-1}$, in other cases the loss converges to a strictly positive constant. The methods used depend on symmetry, the geometry of group actions, bifurcation, and Artins implicit function theorem.

التعلم الآلي النظم الديناميكية التحسين والتحكم

The Standard Representation of the Symmetric Group $S_n$ over the Ring of Integers

60 - Kunle Adegoke , Olawanle Layeni , Rauf Giwa 2015

In this paper we give a Casimir Invariant for the Symmetric group $S_n$. Furthermore we obtain and present, for the first time in the literature, explicit formulas for the matrices of the standard representation in terms of the matrices of the permutation representation.

نظرية المجموعة

Parameter Inference with Bifurcation Diagrams

91 - Gregory Szep , Neil Dalchau , Attila Csikasz-Nagy 2021

Estimation of parameters in differential equation models can be achieved by applying learning algorithms to quantitative time-series data. However, sometimes it is only possible to measure qualitative changes of a system in response to a controlled c ondition. In dynamical systems theory, such change points are known as bifurcations and lie on a function of the controlled condition called the bifurcation diagram. In this work, we propose a gradient-based semi-supervised approach for inferring the parameters of differential equations that produce a user-specified bifurcation diagram. The cost function contains a supervised error term that is minimal when the model bifurcations match the specified targets and an unsupervised bifurcation measure which has gradients that push optimisers towards bifurcating parameter regimes. The gradients can be computed without the need to differentiate through the operations of the solver that was used to compute the diagram. We demonstrate parameter inference with minimal models which explore the space of saddle-node and pitchfork diagrams and the genetic toggle switch from synthetic biology. Furthermore, the cost landscape allows us to organise models in terms of topological and geometric equivalence.

التعلم الآلي النظم الديناميكية الأساليب الكمية

A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups

66 - Marc Finzi , Max Welling , Andrew Gordon Wilson 2021

Symmetries and equivariance are fundamental to the generalization of neural networks on domains such as images, graphs, and point clouds. Existing work has primarily focused on a small number of groups, such as the translation, rotation, and permutat ion groups. In this work we provide a completely general algorithm for solving for the equivariant layers of matrix groups. In addition to recovering solutions from other works as special cases, we construct multilayer perceptrons equivariant to multiple groups that have never been tackled before, including $mathrm{O}(1,3)$, $mathrm{O}(5)$, $mathrm{Sp}(n)$, and the Rubiks cube group. Our approach outperforms non-equivariant baselines, with applications to particle physics and dynamical systems. We release our software library to enable researchers to construct equivariant layers for arbitrary matrix groups.

التعلم الآلي النظم الديناميكية التعلم الالي

An equivariant Hochsters formula for $mathfrak S_n$-invariant monomial ideals

174 - Satoshi Murai , Claudiu Raicu 2020

Let $R=Bbbk[x_1,dots,x_n]$ be a polynomial ring over a field $Bbbk$ and let $Isubset R$ be a monomial ideal preserved by the natural action of the symmetric group $mathfrak S_n$ on $R$. We give a combinatorial method to determine the $mathfrak S_n$-m odule structure of $mathrm{Tor}_i(I,Bbbk)$. Our formula shows that $mathrm{Tor}_i(I,Bbbk)$ is built from induced representations of tensor products of Specht modules associated to hook partitions, and their multiplicities are determined by topological Betti numbers of certain simplicial complexes. This result can be viewed as an $mathfrak S_n$-equivariant analogue of Hochsters formula for Betti numbers of monomial ideals. We apply our results to determine extremal Betti numbers of $mathfrak S_n$-invariant monomial ideals, and in particular recover formulas for their Castelnuovo--Mumford regularity and projective dimension. We also give a concrete recipe for how the Betti numbers change as we increase the number of variables, and in characteristic zero (or $>n$) we compute the $mathfrak S_n$-invariant part of $mathrm{Tor}_i(I,Bbbk)$ in terms of $mathrm{Tor}$ groups of the unsymmetrization of $I$.

الجبر التبادلي نظرية التمثيل