Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity


Abstract in English

As a popular approach to modeling the dynamics of training overparametrized neural networks (NNs), the neural tangent kernels (NTK) are known to fall behind real-world NNs in generalization ability. This performance gap is in part due to the textit{label agnostic} nature of the NTK, which renders the resulting kernel not as textit{locally elastic} as NNs~citep{he2019local}. In this paper, we introduce a novel approach from the perspective of emph{label-awareness} to reduce this gap for the NTK. Specifically, we propose two label-aware kernels that are each a superimposition of a label-agnostic part and a hierarchy of label-aware parts with increasing complexity of label dependence, using the Hoeffding decomposition. Through both theoretical and empirical evidence, we show that the models trained with the proposed kernels better simulate NNs in terms of generalization ability and local elasticity.

Download