ﻻ يوجد ملخص باللغة العربية
Decision trees are a popular technique in statistical data classification. They recursively partition the feature space into disjoint sub-regions until each sub-region becomes homogeneous with respect to a particular class. The basic Classification and Regression Tree (CART) algorithm partitions the feature space using axis parallel splits. When the true decision boundaries are not aligned with the feature axes, this approach can produce a complicated boundary structure. Oblique decision trees use oblique decision boundaries to potentially simplify the boundary structure. The major limitation of this approach is that the tree induction algorithm is computationally expensive. In this article we present a new decision tree algorithm, called HHCART. The method utilizes a series of Householder matrices to reflect the training data at each node during the tree construction. Each reflection is based on the directions of the eigenvectors from each classes covariance matrix. Considering axis parallel splits in the reflected training data provides an efficient way of finding oblique splits in the unreflected training data. Experimental results show that the accuracy and size of the HHCART trees are comparable with some benchmark methods in the literature. The appealing feature of HHCART is that it can handle both qualitative and quantitative features in the same oblique split.
Increasing the penetration of variable generation has a substantial effect on the operational reliability of power systems. The higher level of uncertainty that stems from this variability makes it more difficult to determine whether a given operatin
Interpretable surrogates of black-box predictors trained on high-dimensional tabular datasets can struggle to generate comprehensible explanations in the presence of correlated variables. We propose a model-agnostic interpretable surrogate that provi
Decision forests, including Random Forests and Gradient Boosting Trees, have recently demonstrated state-of-the-art performance in a variety of machine learning settings. Decision forests are typically ensembles of axis-aligned decision trees; that i
Ensembles of decision trees perform well on many problems, but are not interpretable. In contrast to existing approaches in interpretability that focus on explaining relationships between features and predictions, we propose an alternative approach t
Depression is a common mental disorder worldwide which causes a range of serious outcomes. The diagnosis of depression relies on patient-reported scales and psychiatrist interview which may lead to subjective bias. In recent years, more and more rese