Conditional Density Estimation with Neural Networks: Best Practices and Benchmarks

348 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Jonas Rothfuss

تاريخ النشر 2019

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Jonas Rothfuss - Fabio Ferreira - Simon Walther

التعلم الالي التعلم الآلي المالية الحاسوبية

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Given a set of empirical observations, conditional density estimation aims to capture the statistical relationship between a conditional variable $mathbf{x}$ and a dependent variable $mathbf{y}$ by modeling their conditional probability $p(mathbf{y}|mathbf{x})$. The paper develops best practices for conditional density estimation for finance applications with neural networks, grounded on mathematical insights and empirical evaluations. In particular, we introduce a noise regularization and data normalization scheme, alleviating problems with over-fitting, initialization and hyper-parameter sensitivity of such estimators. We compare our proposed methodology with popular semi- and non-parametric density estimators, underpin its effectiveness in various benchmarks on simulated and Euro Stoxx 50 data and show its superior performance. Our methodology allows to obtain high-quality estimators for statistical expectations of higher moments, quantiles and non-linear return transformations, with very little assumptions about the return dynamic.

قيم البحث

147 - Wesley Tansey , Karl Pichotta , James G. Scott 2016

The vast majority of the neural network literature focuses on predicting point values for a given set of response variables, conditioned on a feature vector. In many cases we need to model the full joint conditional distribution over the response var iables rather than simply making point predictions. In this paper, we present two novel approaches to such conditional density estimation (CDE): Multiscale Nets (MSNs) and CDE Trend Filtering. Multiscale nets transform the CDE regression task into a hierarchical classification task by decomposing the density into a series of half-spaces and learning boolean probabilities of each split. CDE Trend Filtering applies a k-th order graph trend filtering penalty to the unnormalized logits of a multinomial classifier network, with each edge in the graph corresponding to a neighboring point on a discretized version of the density. We compare both methods against plain multinomial classifier networks and mixture density networks (MDNs) on a simulated dataset and three real-world datasets. The results suggest the two methods are complementary: MSNs work well in a high-data-per-feature regime and CDE-TF is well suited for few-samples-per-feature scenarios where overfitting is a primary concern.

التعلم الالي

Best-scored Random Forest Density Estimation

79 - Hanyuan Hang , Hongwei Wen 2019

This paper presents a brand new nonparametric density estimation strategy named the best-scored random forest density estimation whose effectiveness is supported by both solid theoretical analysis and significant experimental performance. The termino logy best-scored stands for selecting one density tree with the best estimation performance out of a certain number of purely random density tree candidates and we then name the selected one the best-scored random density tree. In this manner, the ensemble of these selected trees that is the best-scored random density forest can achieve even better estimation results than simply integrating trees without selection. From the theoretical perspective, by decomposing the error term into two, we are able to carry out the following analysis: First of all, we establish the consistency of the best-scored random density trees under $L_1$-norm. Secondly, we provide the convergence rates of them under $L_1$-norm concerning with three different tail assumptions, respectively. Thirdly, the convergence rates under $L_{infty}$-norm is presented. Last but not least, we also achieve the above convergence rates analysis for the best-scored random density forest. When conducting comparative experiments with other state-of-the-art density estimation approaches on both synthetic and real data sets, it turns out that our algorithm has not only significant advantages in terms of estimation accuracy over other methods, but also stronger resistance to the curse of dimensionality.

التعلم الالي التعلم الآلي

Noise Regularization for Conditional Density Estimation

187 - Jonas Rothfuss , Fabio Ferreira , Simon Boehm 2019

Modelling statistical relationships beyond the conditional mean is crucial in many settings. Conditional density estimation (CDE) aims to learn the full conditional probability density from data. Though highly expressive, neural network based CDE mod els can suffer from severe over-fitting when trained with the maximum likelihood objective. Due to the inherent structure of such models, classical regularization approaches in the parameter space are rendered ineffective. To address this issue, we develop a model-agnostic noise regularization method for CDE that adds random perturbations to the data during training. We demonstrate that the proposed approach corresponds to a smoothness regularization and prove its asymptotic consistency. In our experiments, noise regularization significantly and consistently outperforms other regularization methods across seven data sets and three CDE models. The effectiveness of noise regularization makes neural network based CDE the preferable method over previous non- and semi-parametric approaches, even when training data is scarce.

التعلم الالي التعلم الآلي

Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges

99 - Bernd Bischl , Martin Binder , Michel Lang 2021

Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time consuming and unreproducible manual trial-and-error process to find well-perfor ming hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization.

التعلم الالي التعلم الآلي

RFCDE: Random Forests for Conditional Density Estimation

92 - Taylor Pospisil , Ann B. Lee 2018

Random forests is a common non-parametric regression technique which performs well for mixed-type data and irrelevant covariates, while being robust to monotonic variable transformations. Existing random forest implementations target regression or cl assification. We introduce the RFCDE package for fitting random forest models optimized for nonparametric conditional density estimation, including joint densities for multiple responses. This enables analysis of conditional probability distributions which is useful for propagating uncertainty and of joint distributions that describe relationships between multiple responses and covariates. RFCDE is released under the MIT open-source license and can be accessed at https://github.com/tpospisi/rfcde . Both R and Pyth

التعلم الالي التعلم الآلي