D-vine quantile regression with discrete variables


Added by Daniel Kraus

Publication date 2017

Fields Mathematical Statistics

Research language English

Authors Niklas Schallhorn - Daniel Kraus - Thomas Nagler

Methodology


Quantile regression, the prediction of conditional quantiles, finds applications in various fields. Often, some or all of the variables are discrete. The authors propose two new quantile regression approaches to handle such mixed discrete-continuous data. Both of them generalize the continuous D-vine quantile regression, where the dependence between the response and the covariates is modeled by a parametric D-vine. D-vine quantile regression provides very flexible models that enable accurate and fast predictions. Moreover, it automatically takes care of major issues of classical quantile regression, such as quantile crossing and interactions between the covariates. The first approach keeps the parametric estimation of the D-vines but modifies the formulas to account for the discreteness. The second approach uses continuous convolution to make the discrete variables continuous and then estimates the D-vine nonparametrically. A simulation study is presented examining in which scenarios the discrete-continuous D-vine quantile regression can provide superior prediction abilities. Lastly, the functionality of the two introduced methods is demonstrated by a real-world example predicting the number of bike rentals.
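The idea of making a discrete variable continuous can be pictured with a minimal Python sketch. It assumes integer-valued counts and adds independent Uniform(-0.5, 0.5) noise; the uniform noise, the function name `jitter`, and the example counts are illustrative assumptions and not necessarily what the authors use.

```python
import math
import random


def jitter(counts, rng):
    """Add independent Uniform(-0.5, 0.5) noise to each integer observation.

    Assumption: uniform noise is chosen here only for illustration.
    """
    return [c + rng.uniform(-0.5, 0.5) for c in counts]


rng = random.Random(0)           # fixed seed so the sketch is reproducible
counts = [0, 3, 3, 7, 1, 3]      # hypothetical discrete observations with ties
smoothed = jitter(counts, rng)   # continuous values, ties are broken

# Rounding to the nearest integer maps each smoothed value back to its count,
# so the discrete information is not lost by the transformation.
recovered = [math.floor(x + 0.5) for x in smoothed]
```

After this step, estimators designed for continuous data can be applied to `smoothed`, while `recovered` shows that the original counts remain identifiable.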


D-vine copula based quantile regression

101 - Daniel Kraus, Claudia Czado 2015

Quantile regression, that is, the prediction of conditional quantiles, has steadily gained importance in statistical modeling and financial applications. The authors introduce a new semiparametric quantile regression method based on sequentially fitting a likelihood-optimal D-vine copula to given data, resulting in highly flexible models with easily extractable conditional quantiles. As a subclass of regular vine copulas, D-vines enable the modeling of multivariate copulas in terms of bivariate building blocks, a so-called pair-copula construction (PCC). The proposed algorithm is fast and accurate even in high dimensions and incorporates an automatic variable selection by maximizing the conditional log-likelihood. Further, typical issues of quantile regression such as quantile crossing or transformations, interactions and collinearity of variables are automatically taken care of. In a simulation study the improved accuracy and saved computational time of the approach in comparison with established quantile regression methods is highlighted. An extensive financial application to international credit default swap (CDS) data including stress testing and Value-at-Risk (VaR) prediction demonstrates the usefulness of the proposed method.
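With a single covariate, a D-vine reduces to one pair copula, and the conditional quantile is the inverse marginal of the response applied to the inverse conditional copula. The Python sketch below illustrates this for a Gaussian pair copula. The normal margins, the correlation `rho`, and the function name `conditional_quantile` are assumptions made for illustration; the method described above selects copula families from the data and is not limited to this case.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for the Gaussian copula


def conditional_quantile(alpha, x, rho, margin_x, margin_y):
    """alpha-quantile of Y given X = x under a Gaussian pair copula."""
    u = margin_x.cdf(x)  # probability integral transform of the covariate
    # Inverse conditional distribution (inverse h-function) of the Gaussian copula
    z = rho * STD.inv_cdf(u) + (1 - rho**2) ** 0.5 * STD.inv_cdf(alpha)
    v = STD.cdf(z)
    return margin_y.inv_cdf(v)  # back-transform to the scale of the response


margin_x = NormalDist(0, 1)   # assumed covariate margin
margin_y = NormalDist(10, 2)  # assumed response margin
levels = (0.1, 0.5, 0.9)
qs = [conditional_quantile(a, 1.0, 0.6, margin_x, margin_y) for a in levels]
```

Because the inverse conditional copula is increasing in `alpha`, the predicted quantiles in `qs` are ordered by construction, which is why quantile crossing does not occur in this kind of model.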

Methodology

Quantile Regression with Censoring and Endogeneity

430 - Victor Chernozhukov , Ivan Fernandez-Val , 2011

In this paper, we develop a new censored quantile instrumental variable (CQIV) estimator and describe its properties and computation. The CQIV estimator combines Powell (1986) censored quantile regression (CQR) to deal with censoring, with a control variable approach to incorporate endogenous regressors. The CQIV estimator is obtained in two stages that are non-additive in the unobservables. The first stage estimates a non-additive model with infinite dimensional parameters for the control variable, such as a quantile or distribution regression model. The second stage estimates a non-additive censored quantile regression model for the response variable of interest, including the estimated control variable to deal with endogeneity. For computation, we extend the algorithm for CQR developed by Chernozhukov and Hong (2002) to incorporate the estimation of the control variable. We give generic regularity conditions for asymptotic normality of the CQIV estimator and for the validity of resampling methods to approximate its asymptotic distribution. We verify these conditions for quantile and distribution regression estimation of the control variable. Our analysis covers two-stage (uncensored) quantile regression with non-additive first stage as an important special case. We illustrate the computation and applicability of the CQIV estimator with a Monte-Carlo numerical example and an empirical application on estimation of Engel curves for alcohol.

Methodology Econometrics

Instrumental Variable Quantile Regression with Misclassification

104 - Takuya Ura 2016

This paper considers the instrumental variable quantile regression model (Chernozhukov and Hansen, 2005, 2013) with a binary endogenous treatment. It offers two identification results when the treatment status is not directly observed. The first result is that, remarkably, the reduced-form quantile regression of the outcome variable on the instrumental variable provides a lower bound on the structural quantile treatment effect under the stochastic monotonicity condition (Small and Tan, 2007; DiNardo and Lee, 2011). This result is relevant, not only when the treatment variable is subject to misclassification, but also when any measurement of the treatment variable is not available. The second result is for the structural quantile function when the treatment status is measured with error; I obtain the sharp identified set by deriving moment conditions under widely-used assumptions on the measurement error. Furthermore, I propose an inference method in the presence of other covariates.

Methodology

Quantile Functional Regression using Quantlets

329 - Hojin Yang, Veerabhadran Baladandayuthapani, Jeffrey S. Morris 2017

In this paper, we develop a quantile functional regression modeling framework that models the distribution of a set of common repeated observations from a subject through the quantile function, which is regressed on a set of covariates to determine how these factors affect various aspects of the underlying subject-specific distribution. To account for smoothness in the quantile functions, we introduce custom basis functions we call quantlets that are sparse, regularized, near-lossless, and empirically defined, adapting to the features of a given data set and containing a Gaussian subspace so non-Gaussianness can be assessed. While these quantlets could be used within various functional regression frameworks, we build a Bayesian framework that uses nonlinear shrinkage of quantlet coefficients to regularize the functional regression coefficients and allows fully Bayesian inferences after fitting a Markov chain Monte Carlo. Specifically, we apply global tests to assess which covariates have any effect on the distribution at all, followed by local tests to identify at which specific quantiles the differences lie while adjusting for multiple testing, and to assess whether the covariate affects certain major aspects of the distribution, including location, scale, skewness, Gaussianness, or tails. If the difference lies in these commonly-used summaries, our approach can still detect them, but our systematic modeling strategy can also detect effects on other aspects of the distribution that might be missed if one restricted attention to pre-chosen summaries. We demonstrate the benefit of the basis space modeling through simulation studies, and illustrate the method using a biomedical imaging data set in which we relate the distribution of pixel intensities from a tumor image to various demographic, clinical, and genetic characteristics.

Methodology

Regression Analyses of Distributions using Quantile Functional Regression

197 - Hojin Yang, Veerabhadran Baladandayuthapani, Arvind U.K. Rao 2018

Radiomics involves the study of tumor images to identify quantitative markers explaining cancer heterogeneity. The predominant approach is to extract hundreds to thousands of image features, including histogram features comprised of summaries of the marginal distribution of pixel intensities, which leads to multiple testing problems and can miss out on insights not contained in the selected features. In this paper, we present methods to model the entire marginal distribution of pixel intensities via the quantile function as functional data, regressed on a set of demographic, clinical, and genetic predictors. We call this approach quantile functional regression, regressing subject-specific marginal distributions across repeated measurements on a set of covariates, allowing us to assess which covariates are associated with the distribution in a global sense, as well as to identify distributional features characterizing these differences, including mean, variance, skewness, and various upper and lower quantiles. To account for smoothness in the quantile functions, we introduce custom basis functions we call quantlets that are sparse, regularized, near-lossless, and empirically defined, adapting to the features of a given data set. We fit this model using a Bayesian framework that uses nonlinear shrinkage of quantlet coefficients to regularize the functional regression coefficients and provides fully Bayesian inference after fitting a Markov chain Monte Carlo. We demonstrate the benefit of the basis space modeling through simulation studies, and apply the method to a magnetic resonance imaging (MRI)-based radiomic dataset from glioblastoma multiforme to relate imaging-based quantile functions to demographic, clinical, and genetic predictors, finding specific differences in tumor pixel intensity distribution between males and females and between tumors with and without DDIT3 mutations.

Methodology