The input space of a neural network with ReLU-like activations is partitioned into multiple linear regions, each corresponding to a specific activation pattern of the ReLU-like activations in the network. We demonstrate that this partition exhibits the following encoding properties across a variety of deep learning models: (1) determinism: almost every linear region contains at most one training example. We can therefore represent almost every training example by a unique activation pattern, which is parameterized by a neural code; and (2) categorization: using the neural code, simple algorithms, such as $K$-Means, $K$-NN, and logistic regression, can achieve fairly good performance on both training and test data. These encoding properties surprisingly suggest that normal neural networks well-trained for classification behave as hash encoders without any extra effort. In addition, the encoding properties exhibit variability in different scenarios. Further experiments demonstrate that model size, training time, training sample size, regularization, and label noise contribute to shaping the encoding properties, while the impacts of the first three are dominant. We then define an activation hash phase chart to represent the space spanned by model size, training time, training sample size, and the encoding properties, which is divided into three canonical regions: the under-expressive regime, the critically-expressive regime, and the sufficiently-expressive regime. The source code package is available at https://github.com/LeavesLei/activation-code.
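To make the notion of a neural code concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small fully connected ReLU network with randomly initialized (untrained) weights and synthetic 2-D inputs. The function names `neural_code` and `unique_code_fraction` are hypothetical and chosen for illustration only. The sketch records which ReLUs are active for each input, concatenates those bits into a binary code, and measures how many samples own a code shared with no other sample, which is one simple way to probe the determinism property described above.

```python
from collections import Counter

import numpy as np


def neural_code(x, weights, biases):
    """Return the binary activation pattern of a ReLU MLP for each row of x."""
    bits = []
    h = x
    for W, b in zip(weights, biases):
        pre = h @ W + b            # pre-activations of this hidden layer
        bits.append(pre > 0)       # True where the ReLU is active
        h = np.maximum(pre, 0.0)   # ReLU output feeds the next layer
    return np.concatenate(bits, axis=1).astype(np.uint8)


def unique_code_fraction(codes):
    """Fraction of samples whose code is not shared with any other sample."""
    keys = [row.tobytes() for row in codes]  # hashable key per code
    counts = Counter(keys)
    return sum(counts[k] == 1 for k in keys) / len(keys)


# Illustrative setup (assumption): 2-D inputs, two hidden layers of 16 units.
rng = np.random.default_rng(0)
layer_sizes = [2, 16, 16]
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [rng.normal(size=n) for n in layer_sizes[1:]]

x = rng.normal(size=(200, 2))            # synthetic stand-in for training data
codes = neural_code(x, weights, biases)  # shape (200, 32), entries in {0, 1}
frac = unique_code_fraction(codes)
print(f"fraction of samples with a unique code: {frac:.3f}")
```

The resulting binary codes can be passed as feature vectors to standard clustering or classification routines, for example with Hamming distance for nearest-neighbor search, to probe the categorization property. With a trained network and real data the measured fraction would differ from this random-weight illustration.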
Generative adversarial networks (GANs) are capable of producing high quality image samples. However, unlike variational autoencoders (VAEs), GANs lack encoders that provide the inverse mapping for the generators, i.e., encode images back to the laten
We perform a careful, thorough, and large scale empirical study of the correspondence between wide neural networks and kernel methods. By doing so, we resolve a variety of open questions related to the study of infinitely wide neural networks. Our ex
Tuning machine learning models with Bayesian optimization (BO) is a successful strategy to find good hyperparameters. BO defines an iterative procedure where a cross-validated metric is evaluated on promising hyperparameters. In practice, however, an
Many proposed methods for explaining machine learning predictions are in fact challenging to understand for nontechnical consumers. This paper builds upon an alternative consumer-driven approach called TED that asks for explanations to be provided in
XDeep is an open-source Python package developed to interpret deep models for both practitioners and researchers. Overall, XDeep takes a trained deep neural network (DNN) as the input, and generates relevant interpretations as the output with the pos