Activation functions play a pivotal role in learning functions with neural networks: the non-linearity of the learned function is achieved through repeated application of the activation function. Over the years, numerous activation functions have been proposed to improve accuracy on a variety of tasks. Basic functions such as ReLU, Sigmoid, Tanh, and Softplus have been favorites in the deep learning community because of their simplicity. In recent years, several novel activation functions built from these basic functions have been proposed and have improved accuracy on some challenging datasets. We propose a five-hyper-parameter family of activation functions, named EIS, defined as \[ \frac{x(\ln(1+e^{x}))^{\alpha}}{\sqrt{\beta+\gamma x^{2}}+\delta e^{-\theta x}}. \] We show examples of activation functions from the EIS family that outperform widely used activation functions on some well-known datasets and models. For example, $\frac{x\ln(1+e^{x})}{x+1.16e^{-x}}$ beats ReLU by 0.89% with DenseNet-169 and 0.24% with Inception V3 on the CIFAR100 dataset, and by 1.13% with Inception V3, 0.13% with DenseNet-169, and 0.94% with SimpleNet on the CIFAR10 dataset. Similarly, $\frac{x\ln(1+e^{x})}{\sqrt{1+x^{2}}}$ beats ReLU by 1.68% with DenseNet-169 and 0.30% with Inception V3 on CIFAR100, and by 1.0% with Inception V3, 0.15% with DenseNet-169, and 1.13% with SimpleNet on CIFAR10.
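A minimal PyTorch sketch of the EIS family may help make the five hyper-parameters concrete. The module name `EIS` and its default values are illustrative choices of ours, not reference code from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EIS(nn.Module):
    # f(x) = x * (ln(1+e^x))^alpha / (sqrt(beta + gamma*x^2) + delta*e^(-theta*x)),
    # as defined in the abstract above. Defaults here are illustrative only.
    def __init__(self, alpha=1.0, beta=1.0, gamma=1.0, delta=0.0, theta=1.0):
        super().__init__()
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
        self.delta, self.theta = delta, theta

    def forward(self, x):
        # softplus(x) = ln(1 + e^x), evaluated in a numerically stable way
        numerator = x * F.softplus(x) ** self.alpha
        denominator = (torch.sqrt(self.beta + self.gamma * x * x)
                       + self.delta * torch.exp(-self.theta * x))
        return numerator / denominator
```

With alpha=1, beta=1, gamma=1, delta=0 this reduces exactly to the second example above, $\frac{x\ln(1+e^{x})}{\sqrt{1+x^{2}}}$; the first example corresponds to beta=0, gamma=1, delta=1.16, theta=1, noting that $\sqrt{\gamma x^{2}}=|x|$, which agrees with the printed denominator $x+1.16e^{-x}$ for $x\ge 0$.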
Deep learning, at its core, contains functions that are compositions of a linear transformation with a non-linear function known as an activation function. In the past few years, there has been increasing interest in the construction of novel activation functions re
The scope of research in the domain of activation functions remains limited and centered around improving the ease of optimization or generalization quality of neural networks (NNs). However, to develop a deeper understanding of deep learning, it bec
Note: This paper describes an older version of DeepLIFT. See https://arxiv.org/abs/1704.02685 for the newer version. Original abstract follows: The purported black box nature of neural networks is a barrier to adoption in applications where interpret
The mixture extension of exponential family principal component analysis (EPCA) was designed to encode much more structural information about the data distribution than traditional EPCA does. For example, due to the linearity of EPCA's essential form,
Integrating discrete probability distributions and combinatorial optimization problems into neural networks has numerous applications but poses several challenges. We propose Implicit Maximum Likelihood Estimation (I-MLE), a framework for end-to-end
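Since the visible abstract is cut off before the method is described, the following is only a hedged sketch of the general idea commonly associated with I-MLE: a perturb-and-MAP forward pass through a discrete solver, with a finite-difference, target-distribution backward pass. The solver choice (top-k selection), the Gumbel noise, and the hyper-parameters `k` and `lam` are our illustrative assumptions, not the paper's exact estimator:

```python
import torch

class IMLETopK(torch.autograd.Function):
    # Sketch of an I-MLE-style estimator with top-k selection as the MAP solver.

    @staticmethod
    def solve_map(scores, k):
        # MAP state of a top-k distribution: a 0/1 mask of the k largest scores.
        idx = scores.topk(k, dim=-1).indices
        return torch.zeros_like(scores).scatter(-1, idx, 1.0)

    @staticmethod
    def forward(ctx, theta, k, lam):
        # Perturb-and-MAP: add Gumbel noise to the logits, then solve exactly.
        noise = -torch.log(-torch.log(torch.rand_like(theta)))
        ctx.k, ctx.lam = k, lam
        ctx.save_for_backward(theta, noise)
        return IMLETopK.solve_map(theta + noise, k)

    @staticmethod
    def backward(ctx, grad_output):
        theta, noise = ctx.saved_tensors
        z = IMLETopK.solve_map(theta + noise, ctx.k)
        # Nudge the logits against the downstream gradient, re-solve, and use the
        # difference of the two MAP states as the gradient estimate for theta.
        z_target = IMLETopK.solve_map(theta - ctx.lam * grad_output + noise, ctx.k)
        return (z - z_target) / ctx.lam, None, None

# Usage: z = IMLETopK.apply(torch.randn(8, 20, requires_grad=True), 5, 10.0)
```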