DNN2LR: Interpretation-inspired Feature Crossing for Real-world Tabular Data


Abstract in English

For sake of reliability, it is necessary for models in real-world applications to be both powerful and globally interpretable. Simple classifiers, e.g., Logistic Regression (LR), are globally interpretable, but not powerful enough to model complex nonlinear interactions among features in tabular data. Meanwhile, Deep Neural Networks (DNNs) have shown great effectiveness for modeling tabular data, but is not globally interpretable. In this work, we find local piece-wise interpretations in DNN of a specific feature are usually inconsistent in different samples, which is caused by feature interactions in the hidden layers. Accordingly, we can design an automatic feature crossing method to find feature interactions in DNN, and use them as cross features in LR. We give definition of the interpretation inconsistency in DNN, based on which a novel feature crossing method called DNN2LR is proposed. Extensive experiments have been conducted on four public datasets and two real-world datasets. The final model, i.e., a LR model empowered with cross features, generated by DNN2LR can outperform the complex DNN model, as well as several state-of-the-art feature crossing methods. The experimental results strongly verify the effectiveness and efficiency of DNN2LR, especially on real-world datasets with large numbers of feature fields.

Download