
Handling Missing Data in Decision Trees: A Probabilistic Approach

Added by Pasha Khosravi
Publication date: 2020
Language: English





Decision trees are a popular family of models due to their attractive properties, such as interpretability and the ability to handle heterogeneous data. At the same time, missing data is a prevalent problem that hinders the performance of machine learning models, so handling missing data in decision trees is a well-studied problem. In this paper, we tackle this problem by taking a probabilistic approach. At deployment time, we use tractable density estimators to compute the expected prediction of our models. At learning time, we fine-tune the parameters of already-learned trees by minimizing their expected prediction loss w.r.t. our density estimators. We provide brief experiments showcasing the effectiveness of our methods compared to a few baselines.
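To make the deployment-time idea concrete, here is a minimal sketch of computing an expected prediction for a single decision tree when some inputs are missing. It assumes a fully factorized feature distribution (the paper uses richer tractable density estimators) and that each feature is split on at most once along any root-to-leaf path; the Node class and the p_leq query are illustrative, not the paper's API.

import numpy as np

# Sketch: expected prediction of a decision tree under a factorized
# feature distribution when some inputs are missing (np.nan).

class Node:
    def __init__(self, feature=None, threshold=None, left=None, right=None, value=None):
        self.feature = feature      # split feature index (None for a leaf)
        self.threshold = threshold  # go left if x[feature] <= threshold
        self.left, self.right = left, right
        self.value = value          # prediction stored at a leaf

def expected_prediction(node, x, p_leq):
    """Marginalize missing features out of a tree's prediction.

    p_leq(feature, threshold) -> P(X_feature <= threshold): the marginal
    query our (hypothetical) density estimator answers tractably.
    """
    if node.value is not None:                    # leaf: return its value
        return node.value
    xi = x[node.feature]
    if not np.isnan(xi):                          # observed: follow the split
        child = node.left if xi <= node.threshold else node.right
        return expected_prediction(child, x, p_leq)
    p = p_leq(node.feature, node.threshold)       # missing: weight both branches
    return (p * expected_prediction(node.left, x, p_leq)
            + (1 - p) * expected_prediction(node.right, x, p_leq))

# Example: a stump on feature 0 with a uniform[0, 1] marginal, so
# P(X_0 <= t) = t; the expected prediction for a missing input is 0.5.
tree = Node(feature=0, threshold=0.5, left=Node(value=1.0), right=Node(value=0.0))
print(expected_prediction(tree, np.array([np.nan]), lambda f, t: t))  # 0.5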



Related research

We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an $\eta$-corrupted set of uniform random samples labeled by a size-$s$ stochastic decision tree, our algorithm runs in time $n^{O(\log(s/\varepsilon)/\varepsilon^2)}$ and returns a hypothesis with error within an additive $2\eta + \varepsilon$ of the Bayes optimal. An additive $2\eta$ is the information-theoretic minimum. Previously, no non-trivial algorithm with a guarantee of $O(\eta) + \varepsilon$ was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously, no such algorithm was known even in the noiseless setting.
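Stated compactly (a direct paraphrase of the abstract, writing $\mathrm{opt}$ for the Bayes-optimal error), the guarantee is:

\Pr\big[h(x) \neq y\big] \;\le\; \mathrm{opt} + 2\eta + \varepsilon,
\qquad \text{computed in time } n^{O(\log(s/\varepsilon)/\varepsilon^2)}.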
S. Riggi, D. Riggi, F. Riggi (2020)
Identification of charged particles in a multilayer detector by the energy-loss technique may also be achieved with a neural network. The performance of the network degrades when a large fraction of the information is missing, for instance due to detector inefficiencies. Algorithms that impute missing information have been developed over the past years. Among the various approaches, we focused on normal mixture models, in comparison with standard mean imputation and multiple imputation methods. Further, to account for the intrinsic asymmetry of the energy-loss data, we considered skew-normal mixture models and provided a closed-form implementation within the Expectation-Maximization (EM) algorithm framework to handle missing patterns. The method has been applied to a test case where the energy losses of pions, kaons, and protons in a six-layer silicon detector are used as input neurons of a neural network. Results are given in terms of reconstruction efficiency and purity of the various species in different momentum bins.
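As a rough illustration of the mixture-based approach (using plain normal mixtures rather than the skew-normal components the paper develops, and fitting on complete rows instead of running EM over missing patterns), one can impute each missing entry by its conditional expectation under a fitted Gaussian mixture:

import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def gmm_impute(X, n_components=3, seed=0):
    """Fill np.nan entries with their conditional mean under a fitted GMM."""
    X = np.asarray(X, dtype=float)
    complete = X[~np.isnan(X).any(axis=1)]        # assumes some complete rows exist
    gmm = GaussianMixture(n_components=n_components, random_state=seed).fit(complete)

    X_imp = X.copy()
    for i, row in enumerate(X):
        m = np.isnan(row)
        if not m.any() or m.all():
            continue
        o = ~m
        weight = np.empty(gmm.n_components)
        cond_mean = np.empty((gmm.n_components, m.sum()))
        for k in range(gmm.n_components):
            mu, S = gmm.means_[k], gmm.covariances_[k]
            # component weight times marginal density of the observed block
            weight[k] = gmm.weights_[k] * multivariate_normal.pdf(
                row[o], mu[o], S[np.ix_(o, o)])
            # Gaussian conditional mean of missing given observed coordinates
            cond_mean[k] = mu[m] + S[np.ix_(m, o)] @ np.linalg.solve(
                S[np.ix_(o, o)], row[o] - mu[o])
        r = weight / weight.sum()                 # component responsibilities
        X_imp[i, m] = r @ cond_mean               # mixture conditional mean
    return X_imp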
Time series with missing data arise in many important machine learning settings. Some of the most successful prior approaches for modeling such time series are based on recurrent neural networks that transform the input and previous state to account for the missing observations, and then treat the transformed signal in a standard manner. In this paper, we introduce a single unifying framework, Recursive Input and State Estimation (RISE), for this general approach, and reformulate existing models as specific instances of it. We then explore additional novel variations within the RISE framework to improve the performance of any instance. We exploit representation learning techniques to learn latent representations of the signals used by RISE instances, and discuss and develop various encoding techniques to learn them. We benchmark instances of the framework with various encoding functions on three data-imputation datasets, observing that RISE instances always benefit from encoders that learn representations for numerical values from the digits into which they can be decomposed.
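A minimal sketch of the recursive-input-estimation pattern that RISE generalizes, in PyTorch: when an observation is missing, a decoder estimates it from the previous hidden state before the recurrent update. All module and parameter names here are illustrative, not the paper's API.

import torch
import torch.nn as nn

class InputEstimatingGRU(nn.Module):
    """GRU that decodes an estimate of each missing input from its state."""

    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.cell = nn.GRUCell(input_dim, hidden_dim)
        self.decode = nn.Linear(hidden_dim, input_dim)  # state -> input estimate

    def forward(self, x, mask):
        # x:    (T, B, D) observations, arbitrary filler where missing
        # mask: (T, B, D) 1.0 where observed, 0.0 where missing
        T, B, _ = x.shape
        h = x.new_zeros(B, self.cell.hidden_size)
        estimates = []
        for t in range(T):
            x_hat = self.decode(h)                        # estimate from state
            x_t = mask[t] * x[t] + (1 - mask[t]) * x_hat  # fill missing entries
            h = self.cell(x_t, h)
            estimates.append(x_hat)
        return torch.stack(estimates)  # per-step estimates, e.g. for an imputation loss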
One popular hypothesis about neural network generalization is that flat local minima of the loss surface in parameter space lead to good generalization. However, we demonstrate that the loss surface in parameter space has no obvious relationship with generalization, especially under adversarial settings. By visualizing decision surfaces in both parameter space and input space, we instead show that the geometry of the decision surface in input space correlates well with adversarial robustness. We then propose an adversarial robustness indicator, which can evaluate a neural network's intrinsic robustness without testing its accuracy under adversarial attacks. Guided by it, we further propose a robust training method. Without involving adversarial training, our method can enhance a network's intrinsic adversarial robustness against various adversarial attacks.
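As a first-order illustration of probing decision-surface geometry in input space (a generic proxy, not necessarily the exact indicator the paper proposes), one can measure the norm of the loss gradient with respect to the input:

import torch
import torch.nn.functional as F

def input_gradient_norm(model, x, y):
    """Mean L2 norm of d(loss)/d(input) over a batch: larger values suggest
    a steeper decision surface around the samples, i.e. less robustness."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x)
    return grad.flatten(1).norm(dim=1).mean()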
Ji Feng, Yang Yu, Zhi-Hua Zhou (2018)
Multi-layered representation is believed to be the key ingredient of deep neural networks, especially in cognitive tasks like computer vision. While non-differentiable models such as gradient boosting decision trees (GBDTs) are the dominant methods for modeling discrete or tabular data, it is hard to endow them with such representation-learning ability. In this work, we propose the multi-layered GBDT forest (mGBDT), with an explicit emphasis on exploring the ability to learn hierarchical representations by stacking several layers of regression GBDTs as its building blocks. The model can be jointly trained by a variant of target propagation across layers, without the need for back-propagation or differentiability. Experiments and visualizations confirm the effectiveness of the model in terms of performance and representation-learning ability.
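A heavily simplified sketch of the stacking-plus-target-propagation idea, using scikit-learn GBDTs. The layer sizes, the pseudo-target rule, and the learned inverse model are illustrative assumptions; the paper's variant of target propagation differs in detail.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

def fit_gbdt(X, Y):
    # one GBDT per output dimension, so a "layer" can emit a vector
    return MultiOutputRegressor(GradientBoostingRegressor(n_estimators=50)).fit(X, Y)

def train_mgbdt(X, y, hidden_dim=4, rounds=3, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    y = y.reshape(-1, 1)
    H = rng.standard_normal((len(X), hidden_dim))  # random initial representation
    F1 = fit_gbdt(X, H)                            # layer 1: inputs -> representation
    for _ in range(rounds):
        H = F1.predict(X)
        F2 = fit_gbdt(H, y)                        # layer 2: representation -> output
        out = F2.predict(H)
        G2 = fit_gbdt(out, H)                      # learned (approximate) inverse of F2
        target_out = out - lr * (out - y)          # nudge outputs toward the labels
        H_target = G2.predict(target_out)          # pull the nudge back through G2
        F1 = fit_gbdt(X, H_target)                 # retrain layer 1 toward its target
    return F1, F2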

