Decision trees provide a rich family of highly non-linear yet efficient models, which is why they remain the go-to predictive models for practitioners across domains. Learning trees, however, is a challenging problem because of their highly discrete and non-differentiable decision boundaries. State-of-the-art techniques use greedy methods that exploit the discrete tree structure but are tailored to specific problem settings (say, categorical vs. real-valued predictions). In this work, we propose a reformulation of the tree learning problem that provides better-conditioned gradients and leverages successful deep-network training techniques such as overparameterization and straight-through estimators. Our reformulation admits an efficient and accurate gradient-based algorithm that allows us to deploy our solution in disparate tree learning settings such as supervised batch learning and online learning with bandit feedback. Through extensive validation on standard benchmarks, we observe that in the supervised setting our general method is competitive with, and in some cases more accurate than, existing methods designed specifically for that setting. In contrast, for bandit settings, where most existing techniques are not applicable, our models remain accurate and significantly outperform the applicable state-of-the-art methods.
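The abstract mentions straight-through estimators without detailing how they enter the tree-learning reformulation. As a rough illustration only, not the paper's actual algorithm (the function name and single-node setup below are assumptions), straight-through routing at one internal tree node can be sketched in PyTorch as:

    import torch

    def st_hard_split(logits):
        # Straight-through routing: the forward pass makes a hard 0/1
        # left/right decision, while the backward pass uses the gradient
        # of the soft sigmoid surrogate.
        soft = torch.sigmoid(logits)
        hard = (soft > 0.5).float()
        # Value equals `hard`; gradient flows only through `soft`.
        return hard + soft - soft.detach()

    # Toy usage: route points by a learned hyperplane at one node.
    x = torch.randn(8, 4)
    w = torch.randn(4, requires_grad=True)
    route = st_hard_split(x @ w)   # hard in value, soft in gradient
    route.sum().backward()         # w.grad flows through the sigmoid

The point of the trick is that the deployed tree stays fully discrete while training still receives usable gradients, which matches the motivation stated in the abstract.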
We study a novel variant of online finite-horizon Markov Decision Processes with adversarially changing loss functions and initially unknown dynamics. In each episode, the learner suffers the loss accumulated along the trajectory realized by the policy…
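For concreteness, one standard way to formalize "the loss accumulated along the trajectory" in episodic online MDPs is the following (the notation here is assumed, not taken from the abstract):

    L_t(\pi_t) = \sum_{h=1}^{H} \ell_t\bigl(s_h^t, a_h^t\bigr),
    \qquad
    \mathrm{Regret}_T = \sum_{t=1}^{T} L_t(\pi_t)
      \;-\; \min_{\pi}\, \mathbb{E}\Bigl[\sum_{t=1}^{T} L_t(\pi)\Bigr],

where H is the horizon, \pi_t is the policy played in episode t, \ell_t is the adversarially chosen loss function of that episode, and (s_h^t, a_h^t)_{h=1}^{H} is the realized trajectory.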
We systematically develop a learning-based treatment of stochastic optimal control (SOC), relying on direct optimization of parametric control policies. We propose a derivation of adjoint sensitivity results for stochastic differential equations through…
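The abstract describes direct optimization of parametric control policies through SDE dynamics. A minimal sketch under assumptions (toy quadratic cost, linear policy, Euler-Maruyama discretization; reverse-mode autodiff below plays the role of the adjoint computation, and none of this reproduces the paper's variational-calculus derivation) might look like:

    import torch

    def rollout_cost(policy, x0, steps=50, dt=0.02, sigma=0.1):
        # Pathwise rollout of the controlled SDE
        #     dX_t = u_theta(X_t) dt + sigma dW_t
        # via Euler-Maruyama; gradients w.r.t. the policy parameters
        # flow through the entire simulated trajectory.
        x, cost = x0, 0.0
        for _ in range(steps):
            u = policy(x)                          # parametric control
            dw = torch.randn_like(x) * dt ** 0.5   # Brownian increment
            x = x + u * dt + sigma * dw            # Euler-Maruyama step
            cost = cost + (x.pow(2).sum(-1) + u.pow(2).sum(-1)) * dt
        return cost.mean()

    policy = torch.nn.Linear(2, 2)   # u_theta(x) = Wx + b (illustrative)
    loss = rollout_cost(policy, torch.randn(128, 2))
    loss.backward()                  # backprop = discrete-time adjoint pass

Loosely, backpropagating through the discretized rollout computes the sensitivities that a continuous-time adjoint equation yields up to discretization error; adjoint sensitivity results of the kind the abstract mentions make that connection rigorous.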
Multi-layered representation is believed to be the key ingredient of deep neural networks, especially in cognitive tasks like computer vision. While non-differentiable models such as gradient boosting decision trees (GBDTs) are the dominant methods for…
Non-convex optimization problems are challenging to solve; the success and computational expense of a gradient descent algorithm or its variants depend heavily on the initialization strategy. Often, either random initialization is used or initialization r…
Learning an efficient update rule from data that promotes rapid learning of new tasks from the same distribution remains an open problem in meta-learning. Previous works have typically approached this issue either by attempting to train a neural network…