ﻻ يوجد ملخص باللغة العربية
Despite the tremendous success of Stochastic Gradient Descent (SGD) algorithm in deep learning, little is known about how SGD finds generalizable solutions in the high-dimensional weight space. By analyzing the learning dynamics and loss function landscape, we discover a robust inverse relation between the weight variance and the landscape flatness (inverse of curvature) for all SGD-based learning algorithms. To explain the inverse variance-flatness relation, we develop a random landscape theory, which shows that the SGD noise strength (effective temperature) depends inversely on the landscape flatness. Our study indicates that SGD attains a self-tuned landscape-dependent annealing strategy to find generalizable solutions at the flat minima of the landscape. Finally, we demonstrate how these new theoretical insights lead to more efficient algorithms, e.g., for avoiding catastrophic forgetting.
X-ray diffraction (XRD) data acquisition and analysis is among the most time-consuming steps in the development cycle of novel thin-film materials. We propose a machine-learning-enabled approach to predict crystallographic dimensionality and space gr
In liquid argon time projection chambers exposed to neutrino beams and running on or near surface levels, cosmic muons and other cosmic particles are incident on the detectors while a single neutrino-induced event is being recorded. In practice, this
We present a simulation-based study using deep convolutional neural networks (DCNNs) to identify neutrino interaction vertices in the MINERvA passive targets region, and illustrate the application of domain adversarial neural networks (DANNs) in this
We investigate the topics of sensitivity and robustness in feedforward and convolutional neural networks. Combining energy landscape techniques developed in computational chemistry with tools drawn from formal methods, we produce empirical evidence i
We investigate a stationary processs crypticity---a measure of the difference between its hidden state information and its observed information---using the causal states of computational mechanics. Here, we motivate crypticity and cryptic order as ph