Learning Sub-Patterns in Piecewise Continuous Functions


Abstract in English

Most stochastic gradient descent algorithms can optimize neural networks that are sub-differentiable in their parameters, which requires their activation function to exhibit a degree of continuity. However, this continuity constraint on the activation function prevents these neural models from uniformly approximating discontinuous functions. This paper focuses on the case where the discontinuities arise from distinct sub-patterns, each defined on a different part of the input space. We propose a new discontinuous deep neural network model trainable via a decoupled two-step procedure that avoids passing gradient updates through the network's non-differentiable unit. We provide universal approximation guarantees for our architecture in the space of bounded continuous functions and in the space of piecewise continuous functions, which we introduce herein. We present a novel semi-supervised two-step training procedure for our discontinuous deep learning model, and we provide theoretical support for its effectiveness. The performance of our architecture is evaluated experimentally on two real-world datasets and one synthetic dataset.
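As a rough illustration of why decoupling helps, the following toy sketch fits a one-dimensional function with a jump in two separate steps: first it learns which part of the input space a point belongs to, then it fits one continuous model per part. It is not the architecture or training procedure of the paper. The target function, the assumption that part labels are available for the training points, the threshold rule, and the use of polynomial fits in place of neural networks are all assumptions made for this example, and names such as `assign_part` and `predict_piecewise` are hypothetical.

```python
import numpy as np

def target(x):
    # Hypothetical piecewise continuous target with a jump at x = 0.
    return np.where(x < 0.0, np.sin(3.0 * x), 2.0 + x ** 2)

# Deterministic training grid; part labels are assumed known here.
x_train = np.linspace(-1.0, 1.0, 400)
y_train = target(x_train)
part_train = (x_train >= 0.0).astype(int)

# Step 1: learn the partition. A simple threshold stands in for the
# non-differentiable unit; no gradient ever passes through it.
threshold = 0.5 * (x_train[part_train == 0].max()
                   + x_train[part_train == 1].min())

def assign_part(x):
    return (np.asarray(x, dtype=float) >= threshold).astype(int)

# Step 2: fit one continuous model per part, independently of step 1.
# Polynomials are used only to keep the sketch short.
degree = 5
coeffs = [
    np.polyfit(x_train[part_train == k], y_train[part_train == k], degree)
    for k in (0, 1)
]

def predict_piecewise(x):
    x = np.asarray(x, dtype=float)
    parts = assign_part(x)
    return np.where(parts == 0,
                    np.polyval(coeffs[0], x),
                    np.polyval(coeffs[1], x))

# Baseline: a single continuous model fitted to all of the data.
global_coeffs = np.polyfit(x_train, y_train, degree)

# Compare worst-case errors away from a small neighborhood of the jump.
x_test = np.linspace(-1.0, 1.0, 2001)
x_test = x_test[np.abs(x_test) > 0.01]
err_piecewise = np.max(np.abs(predict_piecewise(x_test) - target(x_test)))
err_global = np.max(np.abs(np.polyval(global_coeffs, x_test) - target(x_test)))

print("max error, piecewise model:", err_piecewise)
print("max error, single continuous model:", err_global)
```

The single continuous model cannot follow the jump, so its worst-case error stays large near x = 0, whereas the two-part model only has to fit each continuous sub-pattern on its own region. Test points within 0.01 of the jump are excluded because the learned threshold is only approximate there.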
