Decision trees are a popular family of models due to attractive properties such as interpretability and the ability to handle heterogeneous data. Concurrently, missing data is a prevalent occurrence that hinders the performance of machine learning models. As such, handling missing data in decision trees is a well-studied problem. In this paper, we tackle this problem by taking a probabilistic approach. At deployment time, we use tractable density estimators to compute the expected prediction of our models. At learning time, we fine-tune the parameters of already-learned trees by minimizing their expected prediction loss w.r.t. our density estimators. We provide brief experiments showcasing the effectiveness of our methods compared to a few baselines.
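To make the deployment-time idea concrete, the sketch below computes a tree's expected prediction when some inputs are missing, assuming a fully factorized density with known per-feature marginal CDFs (the paper's tractable density estimators are more general). The Node class, expected_prediction function, and cdf mapping are hypothetical names used only for illustration.

import numpy as np
from scipy.stats import norm

class Node:
    """Hypothetical binary tree node: internal nodes route on
    x[feature] < threshold; leaves store a scalar prediction."""
    def __init__(self, feature=None, threshold=None,
                 left=None, right=None, value=None):
        self.feature, self.threshold = feature, threshold
        self.left, self.right, self.value = left, right, value

def expected_prediction(node, x, cdf):
    """Expected tree output for instance x (np.nan marks missing values)
    under an assumed fully factorized density, where cdf[i](t) = P(X_i < t).
    Exact when each feature is split on at most once per root-to-leaf path;
    otherwise branch probabilities would need conditioning on the path."""
    if node.value is not None:                      # leaf
        return node.value
    xi = x[node.feature]
    if not np.isnan(xi):                            # observed: usual routing
        child = node.left if xi < node.threshold else node.right
        return expected_prediction(child, x, cdf)
    p_left = cdf[node.feature](node.threshold)      # probability mass sent left
    return (p_left * expected_prediction(node.left, x, cdf)
            + (1 - p_left) * expected_prediction(node.right, x, cdf))

# Toy usage with standard-normal marginals on both features.
tree = Node(feature=0, threshold=0.5,
            left=Node(value=1.0),
            right=Node(feature=1, threshold=0.0,
                       left=Node(value=2.0), right=Node(value=3.0)))
cdf = {i: (lambda t: norm.cdf(t)) for i in range(2)}
print(expected_prediction(tree, np.array([np.nan, 0.3]), cdf))

When the split feature is missing, the prediction averages the two subtrees weighted by the probability mass each side receives under the density, which is what makes tractable marginals attractive here.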
We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an $\eta$-corrupted set of uniform random samples labeled by a size-$s$ stochastic decision tree, our algorithm run
Identification of charged particles in a multilayer detector by the energy loss technique may also be achieved using a neural network. The performance of the network becomes worse when a large fraction of the information is missing, for instance
Time series with missing data arise in many important machine learning settings. Some of the most successful prior approaches for modeling such time series are based on recurrent neural networks that transform the input and previous
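The abstract is cut off above; purely as a hypothetical illustration of the mask-and-impute pattern common in this line of work (not this paper's specific cell), a recurrent step can take the observation mask as an extra input:

import numpy as np

def rnn_step_with_mask(h, x, m, params):
    # One vanilla tanh RNN step for series with missing entries: missing
    # inputs (mask m == 0) are replaced by a learned default x_fill, and the
    # mask itself is concatenated to the input so the update can depend on
    # which coordinates were actually observed. All names are hypothetical.
    W, U, b, x_fill = params["W"], params["U"], params["b"], params["x_fill"]
    x_hat = np.where(m > 0, x, x_fill)     # impute missing coordinates
    inp = np.concatenate([x_hat, m])       # expose the mask to the cell
    return np.tanh(W @ inp + U @ h + b)

rng = np.random.default_rng(0)
d_x, d_h = 3, 5
params = {"W": rng.normal(size=(d_h, 2 * d_x)) * 0.1,
          "U": rng.normal(size=(d_h, d_h)) * 0.1,
          "b": np.zeros(d_h),
          "x_fill": np.zeros(d_x)}
x = np.array([0.7, np.nan, 1.2])           # second coordinate is missing
m = (~np.isnan(x)).astype(float)           # observation mask
h = rnn_step_with_mask(np.zeros(d_h), x, m, params)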
One popular hypothesis of neural network generalization is that flat local minima of the loss surface in parameter space lead to good generalization. However, we demonstrate that the loss surface in parameter space has no obvious relationship with generalization
Multi-layered representation is believed to be the key ingredient of deep neural networks, especially in cognitive tasks like computer vision. While non-differentiable models such as gradient boosting decision trees (GBDTs) are the dominant methods for