Regression models describing the joint distribution of multivariate response variables conditional on covariate information have become an important aspect of contemporary regression analysis. However, a limitation of such models is that they often rely on rather simplistic assumptions, e.g. a constant dependency structure that is not allowed to vary with the covariates or the restriction to linear dependence between the responses only. We propose a general framework for multivariate conditional transformation models that overcomes these limitations and describes the entire distribution in a tractable and interpretable yet flexible way conditional on nonlinear effects of covariates. The framework can be embedded into likelihood-based inference, including results on asymptotic normality, and allows the dependence structure to vary with covariates. In addition, the framework scales well beyond bivariate response situations, which were the main focus of most earlier investigations. We illustrate the application of multivariate conditional transformation models in a trivariate analysis of childhood undernutrition and demonstrate empirically that our approach can be beneficial compared to existing benchmarks such that complex truly multivariate data-generating processes can be inferred from observations.
This paper proposes a maximum-likelihood approach to jointly estimate marginal conditional quantiles of multivariate response variables in a linear regression framework. We consider a slight reparameterization of the Multivariate Asymmetric Laplace distribution proposed by Kotz et al. (2001) and exploit its location-scale mixture representation to implement a new EM algorithm for estimating model parameters. The idea is to extend the link between the Asymmetric Laplace distribution and the well-known univariate quantile regression model to a multivariate context, i.e. to the case of a multivariate dependent variable. The approach accounts for association among multiple responses and studies how the relationship between responses and explanatory variables can vary across different quantiles of the marginal conditional distribution of the responses. A penalized version of the EM algorithm is also presented to tackle the problem of variable selection. The validity of our approach is analyzed in a simulation study, where we also provide evidence on the efficiency gain of the proposed method compared to estimates obtained by separate univariate quantile regressions. Finally, a real data application studies the main determinants of financial distress in a sample of Italian firms.
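The univariate link that this abstract builds on can be illustrated with a minimal sketch. It assumes the standard three-parameter asymmetric Laplace density $f(y) = \frac{\tau(1-\tau)}{\sigma}\exp\{-\rho_\tau((y-\mu)/\sigma)\}$ with check loss $\rho_\tau(u) = u(\tau - \mathbb{1}\{u<0\})$, and an intercept-only model. The function names are hypothetical, and neither the multivariate distribution nor the EM algorithm of the paper is reproduced here.

```python
import math


def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def al_neg_loglik(y, mu, sigma, tau):
    """Negative log-likelihood of i.i.d. univariate asymmetric Laplace data.

    For fixed sigma and tau, this equals a constant plus the summed check
    loss of the scaled residuals, so maximizing the likelihood in mu is the
    same as minimizing the quantile-regression objective.
    """
    n = len(y)
    loss = sum(check_loss((yi - mu) / sigma, tau) for yi in y)
    return -n * math.log(tau * (1.0 - tau) / sigma) + loss


def best_location(y, tau, sigma=1.0):
    """Maximum-likelihood location for fixed sigma (intercept-only model).

    The objective is piecewise linear and convex in mu, so a minimizer is
    attained at one of the observations; scanning them is enough here.
    """
    return min(sorted(y), key=lambda m: al_neg_loglik(y, m, sigma, tau))


sample = [2.0, 7.0, 1.0, 9.0, 4.0]
median_fit = best_location(sample, tau=0.5)   # sample median
upper_fit = best_location(sample, tau=0.9)    # upper sample quantile
```

With covariates, the scan over observations would be replaced by a linear-programming or EM-type solver; the sketch only shows why the likelihood and the check-loss objective share the same minimizer.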
Graphical models express conditional independence relationships among variables. Although methods for vector-valued data are well established, functional data graphical models remain underdeveloped. We introduce a notion of conditional independence between random functions, and construct a framework for Bayesian inference of undirected, decomposable graphs in the multivariate functional data context. This framework is based on extending Markov distributions and hyper Markov laws from random variables to random processes, providing a principled alternative to naive application of multivariate methods to discretized functional data. Markov properties facilitate the composition of likelihoods and priors according to the decomposition of a graph. Our focus is on Gaussian process graphical models using orthogonal basis expansions. We propose a hyper-inverse-Wishart-process prior for the covariance kernels of the infinite coefficient sequences of the basis expansion, and establish its existence, uniqueness, strong hyper Markov property, and conjugacy. Stochastic search Markov chain Monte Carlo algorithms are developed for posterior inference, assessed through simulations, and applied to a study of brain activity and alcoholism.
Among Judea Pearl's many contributions to Causality and Statistics, the graphical d-separation criterion, the do-calculus, and the mediation formula stand out. In this chapter we show that d-separation provides direct insight into an earlier causal model originally described in terms of potential outcomes and event trees. In turn, the resulting synthesis leads to a simplification of the do-calculus that clarifies and separates the underlying concepts, and a simple counterfactual formulation of a complete identification algorithm in causal models with hidden variables.
This paper introduces a novel quantile approach that harnesses high-frequency information to improve daily conditional quantile estimation. Specifically, we model the conditional standard deviation as a realized GARCH model and employ the conditional standard deviation, realized volatility, realized quantile, and absolute overnight return as innovations in the proposed dynamic quantile models. We devise a two-step estimation procedure to estimate the conditional quantile parameters. The first step applies a quasi-maximum likelihood estimation procedure, with the realized volatility as a volatility proxy, to estimate the conditional standard deviation parameters. The second step applies a quantile regression estimation procedure using the conditional standard deviation estimated in the first step. Asymptotic theory is established for the proposed estimation methods, and a simulation study is conducted to check their finite-sample performance. Finally, we apply the proposed methodology to calculate the value at risk (VaR) of 20 individual assets and compare its performance with existing competitors.
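The second step of such a two-step procedure can be sketched in a deliberately reduced form. The sketch assumes that first-step estimates of the conditional standard deviation are already available (the list `sigma_hat` is a placeholder, not the output of a realized GARCH fit) and that the conditional quantile has the single-regressor form $Q_\tau(r_t) = q_\tau \sigma_t$ with no intercept. The models in the abstract use further innovations (realized volatility, realized quantile, absolute overnight return), which are omitted here, and all function names are hypothetical.

```python
def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def second_step_quantile(returns, sigma_hat, tau):
    """No-intercept quantile regression of returns on sigma_hat.

    Minimizes sum_t rho_tau(r_t - q * sigma_t) over the scalar q. Because
    sigma_t > 0, the objective is a weighted check loss of the standardized
    returns r_t / sigma_t, so a minimizer lies among those ratios.
    """
    def objective(q):
        return sum(check_loss(r - q * s, tau)
                   for r, s in zip(returns, sigma_hat))

    candidates = [r / s for r, s in zip(returns, sigma_hat)]
    return min(candidates, key=objective)


def conditional_quantile(sigma_next, q_tau):
    """Plug-in conditional quantile (e.g. a VaR level) for the next period."""
    return q_tau * sigma_next


# Placeholder inputs: daily returns and assumed first-step sigma estimates.
returns = [-2.0, 1.0, 0.5, -0.5, 3.0]
sigma_hat = [2.0, 1.0, 1.0, 1.0, 2.0]
q_median = second_step_quantile(returns, sigma_hat, tau=0.5)
```

Treating the first-step output as fixed in the second step is what makes the procedure two-step; accounting for that estimation error is the kind of issue the asymptotic theory mentioned above has to address.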
Modeling of longitudinal data often requires diffusion models that incorporate overall time-dependent, nonlinear dynamics of multiple components and provide sufficient flexibility for subject-specific modeling. This complexity challenges parameter inference and approximations are inevitable. We propose a method for approximate maximum-likelihood parameter estimation in multivariate time-inhomogeneous diffusions, where subject-specific flexibility is accounted for by incorporation of multidimensional mixed effects and covariates. We consider $N$ multidimensional independent diffusions $X^i = (X^i_t)_{0\leq t\leq T^i}$, $1\leq i\leq N$, with common overall model structure and unknown fixed-effects parameter $\mu$. Their dynamics differ by the subject-specific random effect $\phi^i$ in the drift and possibly by (known) covariate information, different initial conditions and observation times and duration. The distribution of $\phi^i$ is parametrized by an unknown $\vartheta$ and $\theta = (\mu, \vartheta)$ is the target of statistical inference. Its maximum likelihood estimator is derived from the continuous-time likelihood. We prove consistency and asymptotic normality of $\hat{\theta}_N$ when the number $N$ of subjects goes to infinity using standard techniques and consider the more general concept of local asymptotic normality for less regular models. The bias induced by time-discretization of sufficient statistics is investigated. We discuss verification of conditions and investigate parameter estimation and hypothesis testing in simulations.
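A toy instance may help fix ideas about random effects in the drift. The model below is an assumption chosen for illustration and is not taken from the paper: a scalar diffusion $dX^i_t = \phi^i\,dt + dW^i_t$ with $X^i_0 = 0$, a common horizon $T$, Gaussian random effects $\phi^i \sim N(m, \omega^2)$, and no fixed effect, so that $\vartheta = (m, \omega^2)$. In this case the increment $U_i = X^i_T - X^i_0$ is a sufficient statistic with $U_i \sim N(mT, \omega^2 T^2 + T)$, which gives the maximum likelihood estimator in closed form. Function names are hypothetical.

```python
import math
import random


def simulate_increments(n_subjects, horizon, n_steps, m, omega, seed=0):
    """Euler scheme for dX = phi_i dt + dW with phi_i ~ N(m, omega^2).

    Returns the terminal increments X_T - X_0, one per subject.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    increments = []
    for _ in range(n_subjects):
        phi = rng.gauss(m, omega)      # subject-specific random effect
        x = 0.0
        for _ in range(n_steps):
            x += phi * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
        increments.append(x)
    return increments


def estimate_population_parameters(increments, horizon):
    """Closed-form MLE of (m, omega^2) from U_i ~ N(m T, omega^2 T^2 + T)."""
    n = len(increments)
    mean_u = sum(increments) / n
    var_u = sum((u - mean_u) ** 2 for u in increments) / n
    m_hat = mean_u / horizon
    # Subtract the Brownian variance T and truncate at the boundary omega^2 = 0.
    omega2_hat = max((var_u - horizon) / horizon ** 2, 0.0)
    return m_hat, omega2_hat


increments = simulate_increments(n_subjects=500, horizon=2.0, n_steps=200,
                                 m=1.0, omega=0.5)
m_hat, omega2_hat = estimate_population_parameters(increments, horizon=2.0)
```

In this toy model the sufficient statistic depends only on the endpoints of each path, so time-discretization introduces no bias. In models whose sufficient statistics involve integrals along the path, those integrals must be approximated from discrete observations, which is where a discretization bias of the kind discussed in the abstract would arise.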