Towards Interpretable Neural Networks: An Exact Transformation to Multi-Class Multivariate Decision Trees

Published by Tung Nguyen

Publication date: 2020

Research field: Informatics Engineering, Mathematical Statistics

Paper language: English

Authors: Duy T. Nguyen - Kathryn E. Kasmarik - Hussein A. Abbass

Machine learning

Artificial neural networks (ANNs) are commonly labelled as black boxes that lack interpretability, which hinders human understanding of their behavior. A need exists to generate a meaningful sequential logic for the production of a specific output. Decision trees exhibit better interpretability and expressive power due to their representation language and the existence of efficient algorithms to generate rules. However, growing a decision tree based on the available data could produce larger than necessary trees or trees that do not generalise well. In this paper, we introduce two novel multivariate decision tree (MDT) algorithms for rule extraction from an ANN: an Exact-Convertible Decision Tree (EC-DT) and an Extended C-Net algorithm. Both transform a neural network with Rectified Linear Unit activation functions into a representative tree which can be used to extract multivariate rules for reasoning. While the EC-DT translates the ANN in a layer-wise manner to represent exactly the decision boundaries implicitly learned by the hidden layers of the network, the Extended C-Net inherits the decompositional approach from EC-DT and combines it with a C5 tree learning algorithm to construct the decision rules. The results suggest that while EC-DT is superior in preserving the structure and the accuracy of the ANN, Extended C-Net generates the most compact and highly effective trees from the ANN. Both proposed MDT algorithms generate rules including combinations of multiple attributes for precise interpretation of decision-making processes.
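
An exact conversion of this kind rests on a general property of ReLU networks: once the on/off state of every hidden unit is fixed, the network reduces to a single affine map, and each on/off test is a linear inequality over several inputs, that is, a multivariate split. The Python sketch below illustrates that property on a toy network with one hidden layer. It is not the authors' EC-DT implementation; the weights and the helper names (`forward`, `path_rule`, `predict_from_rule`) are assumptions invented for illustration.

```python
# Hypothetical toy network: 2 inputs -> 2 ReLU hidden units -> 2 output logits.
# All weights are made-up values, not taken from the paper.
W1 = [[1.0, -1.0], [0.5, 1.0]]
b1 = [0.0, -1.0]
W2 = [[1.0, -2.0], [-1.0, 1.0]]
b2 = [0.5, 0.0]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(x):
    """Ordinary forward pass of the ReLU network."""
    z = [v + b for v, b in zip(matvec(W1, x), b1)]
    h = [max(0.0, v) for v in z]
    return [v + b for v, b in zip(matvec(W2, h), b2)]


def path_rule(x):
    """Return the activation pattern of x and the affine map valid on its region.

    Each entry of the pattern answers one multivariate test
    W1[k] . x + b1[k] > 0, which plays the role of a tree node.
    """
    z = [v + b for v, b in zip(matvec(W1, x), b1)]
    pattern = tuple(v > 0 for v in z)
    # Inactive units output zero, so their rows and biases drop out.
    W1_eff = [row if on else [0.0] * len(row) for row, on in zip(W1, pattern)]
    b1_eff = [b if on else 0.0 for b, on in zip(b1, pattern)]
    hidden = range(len(W1_eff))
    # Compose the two affine maps: y = W2 (W1_eff x + b1_eff) + b2.
    W_eff = [[sum(W2[i][k] * W1_eff[k][j] for k in hidden)
              for j in range(len(x))] for i in range(len(W2))]
    b_eff = [sum(W2[i][k] * b1_eff[k] for k in hidden) + b2[i]
             for i in range(len(W2))]
    return pattern, W_eff, b_eff


def predict_from_rule(x):
    """Evaluate the leaf's affine map instead of running the network."""
    _, W_eff, b_eff = path_rule(x)
    return [v + b for v, b in zip(matvec(W_eff, x), b_eff)]


if __name__ == "__main__":
    for x in ([2.0, 1.0], [3.0, -2.0], [0.0, 0.0]):
        print(x, path_rule(x)[0], forward(x), predict_from_rule(x))
```

For every input, the affine map attached to its activation pattern reproduces the network output, which is why a tree whose leaves store such maps can match the network exactly rather than approximate it.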

Ji Feng, Yang Yu, Zhi-Hua Zhou, 2018

Multi-layered representation is believed to be the key ingredient of deep neural networks, especially in cognitive tasks like computer vision. While non-differentiable models such as gradient boosting decision trees (GBDTs) are the dominant methods for modeling discrete or tabular data, they are hard to incorporate with such representation learning ability. In this work, we propose the multi-layered GBDT forest (mGBDTs), with an explicit emphasis on exploring the ability to learn hierarchical representations by stacking several layers of regression GBDTs as its building block. The model can be jointly trained by a variant of target propagation across layers, without the need to derive back-propagation or differentiability. Experiments and visualizations confirmed the effectiveness of the model in terms of performance and representation learning ability.

Machine learning

Efficient Decision Trees for Multi-class Support Vector Machines Using Entropy and Generalization Error Estimation

Pittipol Kantavat, Boonserm Kijsirikul, Patoomsiri Songsiri, 2017

We propose new methods for Support Vector Machines (SVMs) using tree architecture for multi-class classification. In each node of the tree, we select an appropriate binary classifier using entropy and generalization error estimation, then group the examples into positive and negative classes based on the selected classifier and train a new classifier for use in the classification phase. The proposed methods can work in time complexity between O(log2 N) and O(N), where N is the number of classes. We compared the performance of our proposed methods to the traditional techniques on the UCI machine learning repository using 10-fold cross-validation. The experimental results show that our proposed methods are very useful for problems that need fast classification time or problems with a large number of classes, as the proposed methods run much faster than the traditional techniques but still provide comparable accuracy.
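
One way to read the stated range is through tree shape: classifying an example costs one binary-classifier evaluation per tree level, so a balanced tree needs about log2 N evaluations while a fully skewed tree that peels off one class at a time needs N - 1. The Python sketch below counts those evaluations for the two extreme shapes. It is only an illustration of the complexity bounds, not the paper's selection procedure, and the function names `balanced_depth` and `chain_depth` are hypothetical.

```python
import math


def balanced_depth(classes):
    """Worst-case evaluations when every node splits its classes in half."""
    if len(classes) <= 1:
        return 0  # a single class is a leaf: nothing left to decide
    mid = len(classes) // 2
    return 1 + max(balanced_depth(classes[:mid]), balanced_depth(classes[mid:]))


def chain_depth(classes):
    """Worst-case evaluations when every node separates one class from the rest."""
    return max(len(classes) - 1, 0)


if __name__ == "__main__":
    for n in (4, 8, 100):
        classes = list(range(n))
        print(n, balanced_depth(classes), math.ceil(math.log2(n)), chain_depth(classes))
```

Real trees built from data would typically fall somewhere between these two shapes, depending on how evenly each selected classifier divides the remaining classes.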

Machine learning

Model family selection for classification using Neural Decision Trees

Anthea Merida Montes de Oca, Argyris Kalogeratos, Mathilde Mougeot, 2020

Model selection consists in comparing several candidate models according to a metric to be optimized. The process often involves a grid search or a similar procedure together with cross-validation, which can be time consuming and provides little information about the dataset itself. In this paper we propose a method to reduce the scope of exploration needed for the task. The idea is to quantify how far it would be necessary to depart from trained instances of a given family, reference models (RMs) carrying 'rigid' decision boundaries (e.g. decision trees), so as to obtain an equivalent or better model. In our approach, this is realized by progressively relaxing the decision boundaries of the initial decision trees (the RMs) as long as this is beneficial in terms of performance measured on an analyzed dataset. More specifically, this relaxation is performed by making use of a neural decision tree, which is a neural network built from DTs. The final model produced by our method carries non-linear decision boundaries. Measuring the performance of the final model and its agreement with its seeding RM can help the user decide which family of models to focus on.

Machine learning

Interpretable Multi-Task Deep Neural Networks for Dynamic Predictions of Postoperative Complications

Benjamin Shickel, Tyler J. Loftus, Shounak Datta, 2020

Accurate prediction of postoperative complications can inform shared decisions between patients and surgeons regarding the appropriateness of surgery, preoperative risk-reduction strategies, and postoperative resource use. Traditional predictive analytic tools are hindered by suboptimal performance and usability. We hypothesized that novel deep learning techniques would outperform logistic regression models in predicting postoperative complications. In a single-center longitudinal cohort of 43,943 adult patients undergoing 52,529 major inpatient surgeries, deep learning yielded greater discrimination than logistic regression for all nine complications. Predictive performance was strongest when leveraging the full spectrum of preoperative and intraoperative physiologic time-series electronic health record data. A single multi-task deep learning model yielded greater performance than separate models trained on individual complications. Integrated gradients interpretability mechanisms demonstrated the substantial importance of missing data. Interpretable, multi-task deep neural networks made accurate, patient-level predictions that harbor the potential to augment surgical decision-making.

Machine learning

Interpretable Additive Recurrent Neural Networks For Multivariate Clinical Time Series

Asif Rahman, Yale Chang, Jonathan Rubin, 2021

Time series models with recurrent neural networks (RNNs) can have high accuracy but are unfortunately difficult to interpret as a result of feature interactions, temporal interactions, and non-linear transformations. Interpretability is important in domains like healthcare, where constructing models that provide insight into the relationships they have learned is required to validate and trust model predictions. We want accurate time series models where users can understand the contribution of individual input features. We present the Interpretable-RNN (I-RNN) that balances model complexity and accuracy by forcing the relationship between variables in the model to be additive. Interactions are restricted between hidden states of the RNN and additively combined at the final step. I-RNN specifically captures the unique characteristics of clinical time series, which are unevenly sampled in time, asynchronously acquired, and have missing data. Importantly, the hidden state activations represent feature coefficients that correlate with the prediction target and can be visualized as risk curves that capture the global relationship between individual input features and the outcome. We evaluate the I-RNN model on the Physionet 2012 Challenge dataset to predict in-hospital mortality, and on a real-world clinical decision support task: predicting hemodynamic interventions in the intensive care unit. I-RNN provides explanations in the form of global and local feature importances comparable to highly intelligible models like decision trees trained on hand-engineered features while significantly outperforming them. I-RNN remains intelligible while providing accuracy comparable to state-of-the-art decay-based and interpolation-based recurrent time series models. The experimental results on real-world clinical datasets refute the myth that there is a tradeoff between accuracy and interpretability.

Machine learning, Artificial intelligence