The empirical size of trained neural networks

Published by Alden Walker in 2016. Field: mathematical statistics. Language: English.

Abstract

ReLU neural networks define piecewise linear functions of their inputs. However, initializing and training a neural network is very different from fitting a linear spline. In this paper, we expand empirically upon previous theoretical work to demonstrate features of trained neural networks. Standard network initialization and training produce networks vastly simpler than a naive parameter count would suggest and can impart odd features to the trained network. However, we also show the forced simplicity is beneficial and, indeed, critical for the wide success of these networks.
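
To make the piecewise linear claim concrete, here is a minimal sketch that counts how many times the ReLU on/off pattern changes as a scalar input sweeps an interval. It is not the paper's experimental code. The function names, the He-style weight scaling, the bias standard deviation of 0.1, and the interval and grid resolution are all assumptions chosen for illustration.

```python
import random

def init_network(widths, rng):
    """Random Gaussian layers; `widths` starts with the input size (1 here).

    Every layer is treated as a hidden ReLU layer (no linear output layer).
    The scaling below is an illustrative assumption, not the paper's setup.
    """
    layers = []
    for fan_in, fan_out in zip(widths[:-1], widths[1:]):
        scale = (2.0 / fan_in) ** 0.5
        W = [[rng.gauss(0.0, scale) for _ in range(fan_in)] for _ in range(fan_out)]
        b = [rng.gauss(0.0, 0.1) for _ in range(fan_out)]
        layers.append((W, b))
    return layers

def activation_pattern(layers, x):
    """Return the on/off state of every ReLU unit for the scalar input x."""
    h = [x]
    pattern = []
    for W, b in layers:
        pre = [sum(w * v for w, v in zip(row, h)) + bi for row, bi in zip(W, b)]
        pattern.extend(p > 0.0 for p in pre)
        h = [max(p, 0.0) for p in pre]
    return tuple(pattern)

def count_regions(layers, lo=-5.0, hi=5.0, steps=20001):
    """Count activation-pattern changes along a uniform grid on [lo, hi].

    The network is linear wherever the pattern is constant, so this estimates
    the number of activation regions. A finite grid can miss narrow regions,
    and neighbouring regions may share the same linear function, so the
    number of distinct linear pieces can differ from this count.
    """
    count = 1
    prev = activation_pattern(layers, lo)
    for i in range(1, steps):
        x = lo + (hi - lo) * i / (steps - 1)
        cur = activation_pattern(layers, x)
        if cur != prev:
            count += 1
            prev = cur
    return count

if __name__ == "__main__":
    rng = random.Random(0)
    net = init_network([1, 16, 16], rng)  # two hidden layers of 16 units
    n_params = sum(len(W) * len(W[0]) + len(b) for W, b in net)
    print("parameters:", n_params, "regions seen on grid:", count_regions(net))
```

Running the script prints the parameter count next to the region count for one random initialization. Comparing the two numbers across widths and depths is one simple way to probe the kind of gap between parameter count and realized complexity that the abstract describes.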
