Extracting Global Dynamics of Loss Landscape in Deep Learning Models

Published by: Mohammed Eslami

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Mohammed Eslami - Hamed Eramian - Marcio Gameiro

Dynamical Systems, Machine Learning

Deep learning models evolve through training to learn the manifold in which the data exists to satisfy an objective. It is well known that evolution leads to different final states which produce inconsistent predictions of the same test data points. This calls for techniques to be able to empirically quantify the difference in the trajectories and highlight problematic regions. While much focus is placed on discovering what models learn, the question of how a model learns is less studied beyond theoretical landscape characterizations and local geometric approximations near optimal conditions. Here, we present a toolkit for the Dynamical Organization Of Deep Learning Loss Landscapes, or DOODL3. DOODL3 formulates the training of neural networks as a dynamical system, analyzes the learning process, and presents an interpretable global view of trajectories in the loss landscape. Our approach uses the coarseness of topology to capture the granularity of geometry to mitigate against states of instability or elongated training. Overall, our analysis presents an empirical framework to extract the global dynamics of a model and to use that information to guide the training of neural networks.
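The idea of treating training as a dynamical system can be illustrated with a toy sketch. This is not DOODL3 and uses none of its topological machinery. It only assumes a hypothetical one-parameter double-well loss, (theta^2 - 1)^2, and views plain gradient descent as a discrete map theta -> theta - lr * grad(theta). The function names and the learning rate are illustrative choices, not taken from the paper.

```python
# Toy illustration (assumed setup, not the authors' toolkit):
# gradient descent viewed as a discrete dynamical system on a 1-D loss.

def loss(theta):
    """Hypothetical double-well loss with minima at theta = -1 and theta = +1."""
    return (theta ** 2 - 1.0) ** 2

def grad(theta):
    """Derivative of the double-well loss with respect to theta."""
    return 4.0 * theta * (theta ** 2 - 1.0)

def step(theta, lr=0.05):
    """One iteration of the map theta -> theta - lr * grad(theta)."""
    return theta - lr * grad(theta)

def trajectory(theta0, n_steps=200, lr=0.05):
    """Iterate the map from an initial condition and record every state."""
    thetas = [theta0]
    for _ in range(n_steps):
        thetas.append(step(thetas[-1], lr))
    return thetas

# Different initial conditions settle into different final states,
# which is the kind of inconsistency the abstract describes.
final_states = {t0: trajectory(t0)[-1] for t0 in (-1.5, -0.3, 0.3, 1.5)}
for t0, t_final in final_states.items():
    print(f"start {t0:+.1f} -> final {t_final:+.4f}, loss {loss(t_final):.2e}")
```

In this sketch the sign of the initial parameter decides which minimum the trajectory reaches, so the two wells act as two basins of attraction of the same map.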

Ulrich Bauer, David Hien, Oliver Junge 2020

We describe a computational method for constructing a coarse combinatorial model of some dynamical system in which the macroscopic states are given by elementary cycling motions of the system. Our method is in particular applicable to time series data. We illustrate the construction by a perturbed double well Hamiltonian as well as the Lorenz system.

Dynamical Systems, Chaotic Dynamics

Deep Learning of Conjugate Mappings

Jason J. Bramburger, Steven L. Brunton, J. Nathan Kutz 2021

Despite many of the most common chaotic dynamical systems being continuous in time, it is through discrete time mappings that much of the understanding of chaos is formed. Henri Poincaré first made this connection by tracking consecutive iterations of the continuous flow with a lower-dimensional, transverse subspace. The mapping that iterates the dynamics through consecutive intersections of the flow with the subspace is now referred to as a Poincaré map, and it is the primary method available for interpreting and classifying chaotic dynamics. Unfortunately, in all but the simplest systems, an explicit form for such a mapping remains outstanding. This work proposes a method for obtaining explicit Poincaré mappings by using deep learning to construct an invertible coordinate transformation into a conjugate representation where the dynamics are governed by a relatively simple chaotic mapping. The invertible change of variable is based on an autoencoder, which allows for dimensionality reduction, and has the advantage of classifying chaotic systems using the equivalence relation of topological conjugacies. Indeed, the enforcement of topological conjugacies is the critical neural network regularization for learning the coordinate and dynamics pairing. We provide expository applications of the method to low-dimensional systems such as the Rössler and Lorenz systems, while also demonstrating the utility of the method on infinite-dimensional systems, such as the Kuramoto-Sivashinsky equation.

Dynamical Systems, Machine Learning
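The topological conjugacy that the abstract above relies on can be checked numerically in a textbook case. The sketch below does not use the paper's autoencoder and learns nothing. It assumes the classical closed-form conjugacy h(y) = sin^2(pi*y/2) between the tent map and the logistic map x -> 4x(1-x), and verifies that h(T(y)) = L(h(y)) on a grid. All function names are illustrative.

```python
# Numerical check of a known conjugacy (illustrative, not the paper's method):
# the tent map T and logistic map L satisfy h(T(y)) == L(h(y))
# for the coordinate change h(y) = sin^2(pi * y / 2) on [0, 1].
import math

def tent(y):
    """Tent map on [0, 1]."""
    return 2.0 * y if y < 0.5 else 2.0 * (1.0 - y)

def logistic(x):
    """Logistic map with parameter 4 on [0, 1]."""
    return 4.0 * x * (1.0 - x)

def h(y):
    """Invertible change of coordinates from tent to logistic variables."""
    return math.sin(math.pi * y / 2.0) ** 2

def conjugacy_error(n_points=1001):
    """Largest violation of h(T(y)) = L(h(y)) over a uniform grid on [0, 1]."""
    worst = 0.0
    for i in range(n_points):
        y = i / (n_points - 1)
        worst = max(worst, abs(h(tent(y)) - logistic(h(y))))
    return worst

max_error = conjugacy_error()
print(f"max conjugacy error on grid: {max_error:.2e}")
```

A learned conjugacy would replace the fixed h with a trained invertible network and use this same residual, h(T(y)) - L(h(y)), as a penalty term; that pairing is an assumption about how such a loss could be set up, not a description of the authors' code.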

Global Dynamics and Existence of Traveling Wave Solutions for A Three-Species Models

Fanfan Li, Zhenlai Han, Ting-Hui Yang 2020

In this work, we investigate a three-species ecological model in which a predator-prey subsystem is coupled with a generalist predator that has a negative effect on the prey. Without diffusive terms, all global dynamics of the corresponding reaction equations are proved analytically for all classified parameters. With diffusive terms, the transitions between different spatially homogeneous solutions, the traveling wave solutions, are shown by a higher-dimensional shooting method, the Wazewski method. Some interesting numerical simulations are performed, and biological implications are given.

Dynamical Systems

Embedding Principle of Loss Landscape of Deep Neural Networks

Yaoyu Zhang, Zhongwang Zhang, Tao Luo 2021

Understanding the structure of loss landscape of deep neural networks (DNNs) is obviously important. In this work, we prove an embedding principle that the loss landscape of a DNN contains all the critical points of all the narrower DNNs. More precisely, we propose a critical embedding such that any critical point, e.g., local or global minima, of a narrower DNN can be embedded to a critical point/hyperplane of the target DNN with higher degeneracy and preserving the DNN output function. The embedding structure of critical points is independent of loss function and training data, showing a stark difference from other nonconvex problems such as protein-folding. Empirically, we find that a wide DNN is often attracted by highly-degenerate critical points that are embedded from narrow DNNs. The embedding principle provides an explanation for the general easy optimization of wide DNNs and unravels a potential implicit low-complexity regularization during the training. Overall, our work provides a skeleton for the study of loss landscape of DNNs and its implication, by which a more exact and comprehensive understanding can be anticipated in the near

Machine Learning

Learning Deep Models from Synthetic Data for Extracting Dolphin Whistle Contours

Pu Li, Xiaobai Liua, K. J. Palmer 2020

We present a learning-based method for extracting whistles of toothed whales (Odontoceti) in hydrophone recordings. Our method represents audio signals as time-frequency spectrograms and decomposes each spectrogram into a set of time-frequency patches. A deep neural network learns archetypical patterns (e.g., crossings, frequency modulated sweeps) from the spectrogram patches and predicts time-frequency peaks that are associated with whistles. We also developed a comprehensive method to synthesize training samples from background environments and train the network with minimal human annotation effort. We applied the proposed learn-from-synthesis method to a subset of the public Detection, Classification, Localization, and Density Estimation (DCLDE) 2011 workshop data to extract whistle confidence maps, which we then processed with an existing contour extractor to produce whistle annotations. The F1-score of our best synthesis method was 0.158 greater than our baseline whistle extraction algorithm (~25% improvement) when applied to common dolphin (Delphinus spp.) and bottlenose dolphin (Tursiops truncatus) whistles.

Quantitative Methods, Computer Sound Systems, Audio and Speech Processing