Latent Multivariate Log-Gamma Models for High-Dimensional Multi-Type Responses with Application to Daily Fine Particulate Matter and Mortality Counts


Added by Jonathan Bradley

Publication date 2019

fields Mathematical Statistics

Language English

Authors Zhixing Xu - Jonathan R. Bradley - Debajyoti Sinha

Methodology


Tracking and estimating daily fine particulate matter (PM2.5) is important because PM2.5 has been shown to be directly related to mortality associated with the lungs, the cardiovascular system, and stroke. High values of PM2.5 therefore constitute a public health problem in the US, and precise estimates of PM2.5 are needed to aid public policy decisions. To that end, we propose a Bayesian hierarchical model for high-dimensional multi-type responses. By multi-type responses we mean a collection of correlated responses that have different distributional assumptions (e.g., continuous skewed observations and count-valued observations). The Centers for Disease Control and Prevention (CDC) database provides counts of mortalities related to PM2.5 and daily averaged PM2.5, both of which are treated as responses in our analysis. Our model capitalizes on the shared conjugate structure between the Weibull (to model PM2.5), Poisson (to model disease mortalities), and multivariate log-gamma distributions, and we use dimension reduction to aid with computation. Our model can also be used to improve the precision of estimates and to estimate values at undisclosed/missing counties. We provide a simulation study to illustrate the performance of the model and give an in-depth analysis of the CDC dataset.
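The shared conjugate structure mentioned in the abstract can be illustrated in a much simpler univariate setting. The sketch below is not the authors' multivariate model: it assumes a single latent value q whose prior density is proportional to exp(alpha*q - kappa*exp(q)) (equivalently, exp(q) follows a gamma distribution with shape alpha and rate kappa), a Poisson likelihood with mean exp(q), and a Weibull likelihood with known shape rho and density rho*exp(q)*y^(rho-1)*exp(-exp(q)*y^rho). The function names and parameterization are illustrative assumptions. Under these assumptions both likelihoods have the form exp(a*q - b*exp(q)), so each update only shifts the two prior parameters.

```python
def poisson_update(alpha, kappa, counts):
    """Posterior (alpha, kappa) of the log-gamma prior after Poisson counts.

    Each count y contributes exp(y*q - exp(q)) to the likelihood,
    so alpha grows by the total count and kappa by the number of counts.
    """
    return alpha + sum(counts), kappa + len(counts)


def weibull_update(alpha, kappa, values, rho):
    """Posterior (alpha, kappa) after Weibull observations with known shape rho.

    Each value y contributes exp(q - y**rho * exp(q)) to the likelihood,
    so alpha grows by the sample size and kappa by the sum of y**rho.
    """
    return alpha + len(values), kappa + sum(y ** rho for y in values)


def posterior_mean_rate(alpha, kappa):
    """Posterior mean of exp(q), which is gamma(shape=alpha, rate=kappa)."""
    return alpha / kappa


# Hypothetical data, for illustration only.
a_pois, k_pois = poisson_update(2.0, 1.0, [3, 0, 5])
a_weib, k_weib = weibull_update(2.0, 1.0, [1.0, 2.0], rho=2.0)
```

Because both response types update the same two parameters, a count-valued and a continuous skewed response can share one latent prior family, which is the property the abstract refers to.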


High-dimensional Multivariate Mediation: with Application to Neuroimaging Data

Oliver Y. Chen, Ciprian M. Crainiceanu, Elizabeth L. Ogburn (2015)

Mediation analysis has become an important tool in the behavioral sciences for investigating the role of intermediate variables that lie in the path between a randomized treatment and an outcome variable. The influence of the intermediate variable on the outcome is often explored using structural equation models (SEMs), with model coefficients interpreted as possible effects. While there has been significant research on the topic in recent years, little work has been done on mediation analysis when the intermediate variable (mediator) is a high-dimensional vector. In this work we present a new method for exploratory mediation analysis in this setting called the directions of mediation (DMs). The first DM is defined as the linear combination of the elements of a high-dimensional vector of potential mediators that maximizes the likelihood of the SEM. The subsequent DMs are defined as linear combinations of the elements of the high-dimensional vector that are orthonormal to the previous DMs and maximize the likelihood of the SEM. We provide an estimation algorithm and establish the asymptotic properties of the obtained estimators. This method is well suited for cases when many potential mediators are measured. Examples of high-dimensional potential mediators are brain images composed of hundreds of thousands of voxels, genetic variation measured at millions of SNPs, or vectors of thousands of variables in large-scale epidemiological studies. We demonstrate the method using a functional magnetic resonance imaging (fMRI) study of thermal pain where we are interested in determining which brain locations mediate the relationship between the application of a thermal stimulus and self-reported pain.
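For orientation, the scalar-mediator case that the directions of mediation generalize can be sketched as follows. This is not the DM estimator from the paper; it is a minimal illustration of the linear SEM with one mediator, assuming M = a*X + error and Y = direct*X + b*M + error, fitted by ordinary least squares. The function name and the toy data are hypothetical.

```python
def _centered(values):
    """Subtract the sample mean from every element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mediation_effects(x, m, y):
    """Least-squares path coefficients for a single-mediator linear SEM.

    a        : slope of M regressed on X
    b        : slope of M in the regression of Y on X and M
    direct   : slope of X in the regression of Y on X and M
    indirect : product a * b (effect of X on Y through M)
    """
    xc, mc, yc = _centered(x), _centered(m), _centered(y)
    sxx, smm, sxm = _dot(xc, xc), _dot(mc, mc), _dot(xc, mc)
    sxy, smy = _dot(xc, yc), _dot(mc, yc)
    a = sxm / sxx
    # Solve the 2x2 normal equations for the regression of Y on (X, M).
    det = sxx * smm - sxm ** 2
    b = (sxx * smy - sxm * sxy) / det
    direct = (smm * sxy - sxm * smy) / det
    return {"a": a, "b": b, "direct": direct, "indirect": a * b}


# Toy data built so that M = 2*X + noise and Y = 3*M + 1*X exactly.
x = [0.0, 1.0, 0.0, 1.0]
m = [1.0, 3.0, -1.0, 1.0]
y = [3.0, 10.0, -3.0, 4.0]
effects = mediation_effects(x, m, y)
```

With a high-dimensional mediator vector, the abstract's approach replaces the single M with linear combinations of the mediators chosen to maximize the SEM likelihood.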

Methodology

Multi-population mortality forecasting using high-dimensional functional factor models

Chen Tang, Han Lin Shang, Yanrong Yang (2021)

This paper proposes a two-fold factor model for high-dimensional functional time series (HDFTS), which enables the modeling and forecasting of multi-population mortality under the functional data framework. The proposed model first decomposes the HDFTS into functional time series with lower dimensions (common feature) and a system of basis functions specific to different cross-sections (heterogeneity). Then the lower-dimensional common functional time series are further reduced into low-dimensional scalar factor matrices. The dimensionally reduced factor matrices can reasonably convey useful information in the original HDFTS. All the temporal dynamics contained in the original HDFTS are extracted to facilitate forecasting. The proposed model can be regarded as a general case of several existing functional factor models. Through a Monte Carlo simulation, we demonstrate the performance of the proposed method in model fitting. In an empirical study of the Japanese subnational age-specific mortality rates, we show that the proposed model produces more accurate point and interval forecasts in modeling multi-population mortality than those existing functional factor models. The financial impact of the improvements in forecasts is demonstrated through comparisons in life annuity pricing practices.

Methodology Applications Computation

Gaussian Graphical Regression Models with High Dimensional Responses and Covariates

Jingfei Zhang, Yi Li (2020)

Though Gaussian graphical models have been widely used in many scientific fields, limited progress has been made to link graph structures to external covariates because of substantial challenges in theory and computation. We propose a Gaussian graphical regression model, which regresses both the mean and the precision matrix of a Gaussian graphical model on covariates. In the context of co-expression quantitative trait locus (QTL) studies, our framework facilitates estimation of both population- and subject-level gene regulatory networks, and detection of how subject-level networks vary with genetic variants and clinical conditions. Our framework accommodates high dimensional responses and covariates, and encourages covariate effects on both the mean and the precision matrix to be sparse. In particular for the precision matrix, we stipulate simultaneous sparsity, i.e., group sparsity and element-wise sparsity, on effective covariates and their effects on network edges, respectively. We establish variable selection consistency first under the case with known mean parameters and then a more challenging case with unknown means depending on external covariates, and show in both cases that the convergence rate of the estimated precision parameters is faster than that obtained by lasso or group lasso, a desirable property for the sparse group lasso estimation. The utility and efficacy of our proposed method are demonstrated through simulation studies and an application to a co-expression QTL study with brain cancer patients.
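The simultaneous sparsity described above (element-wise plus group) is the pattern produced by a sparse group lasso penalty. As a generic illustration, and not necessarily the authors' estimation algorithm, the sketch below gives the standard proximal operator of lam_elem*||v||_1 + lam_group*||v||_2 for one coefficient group: element-wise soft-thresholding followed by shrinkage of the whole group. The function names and penalty values are assumptions made for the example.

```python
import math


def soft_threshold(value, lam):
    """Shrink a scalar toward zero by lam, setting small values to zero."""
    return math.copysign(max(abs(value) - lam, 0.0), value)


def sparse_group_prox(group, lam_elem, lam_group):
    """Proximal operator of lam_elem*||v||_1 + lam_group*||v||_2 for one group.

    Step 1 zeroes individual small coefficients (element-wise sparsity).
    Step 2 zeroes the entire group if its remaining norm is small
    (group sparsity), and otherwise rescales it toward zero.
    """
    shrunk = [soft_threshold(v, lam_elem) for v in group]
    norm = math.sqrt(sum(s * s for s in shrunk))
    if norm <= lam_group:
        return [0.0] * len(group)
    scale = 1.0 - lam_group / norm
    return [scale * s for s in shrunk]


# A group that survives with shrinkage, and a group that is removed entirely.
kept = sparse_group_prox([4.0, -5.0], lam_elem=1.0, lam_group=2.5)
dropped = sparse_group_prox([0.5, -0.5], lam_elem=1.0, lam_group=1.0)
```

In the abstract's setting a group would correspond to one covariate's effects on all network edges, so dropping a group removes that covariate, while the element-wise step removes individual edge effects.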

Methodology Statistics Theory

Multivariate functional responses low rank regression with an application to brain imaging data

Xiucai Ding, Dengdeng Yu, Zhengwu Zhang (2020)

We propose a multivariate functional responses low rank regression model with possibly high-dimensional functional responses and scalar covariates. By expanding the slope functions on a sieve basis, we arrange the basis coefficients as a matrix. To estimate these coefficients, we propose an efficient procedure using nuclear norm regularization. We also derive error bounds for our estimates and evaluate our method using simulations. To illustrate the usefulness of our results, we apply our method to the Human Connectome Project neuroimaging data to predict cortical surface motor task-evoked functional magnetic resonance imaging signals using various clinical covariates.

Methodology Applications

High-dimensional functional time series forecasting: An application to age-specific mortality rates

Yuan Gao, Han Lin Shang, Yanrong Yang (2018)

We address the problem of forecasting high-dimensional functional time series through a two-fold dimension reduction procedure. The difficulty of forecasting high-dimensional functional time series lies in the curse of dimensionality. In this paper, we propose a novel method to solve this problem. Dynamic functional principal component analysis is first applied to reduce each functional time series to a vector. We then use the factor model as a further dimension reduction technique so that only a small number of latent factors are preserved. Classic time series models can be used to forecast the factors and conditional forecasts of the functions can be constructed. Asymptotic properties of the approximated functions are established, including both estimation error and forecast error. The proposed method is easy to implement especially when the dimension of the functional time series is large. We show the superiority of our approach by both simulation studies and an application to Japanese age-specific mortality rates.
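The final step described above, forecasting the latent factors with a classic time series model, might look like the following sketch. It assumes the two dimension-reduction steps have already produced a single zero-mean factor series, and it uses a least-squares AR(1) model without intercept as one possible choice of classic model; the abstract does not specify which model is used. The function names and the factor values are hypothetical.

```python
def fit_ar1(series):
    """Least-squares AR(1) coefficient for a zero-mean series (no intercept)."""
    num = sum(curr * prev for prev, curr in zip(series[:-1], series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den


def forecast_ar1(series, horizon):
    """Iterate the fitted AR(1) recursion forward from the last observation."""
    phi = fit_ar1(series)
    last = series[-1]
    forecasts = []
    for _ in range(horizon):
        last = phi * last
        forecasts.append(last)
    return forecasts


# Hypothetical factor scores that decay by half at each time step.
factor_scores = [8.0, 4.0, 2.0, 1.0]
phi_hat = fit_ar1(factor_scores)
next_two = forecast_ar1(factor_scores, horizon=2)
```

Forecasts of the functions would then be reconstructed from the forecast factors through the estimated loadings and principal components, a step omitted here.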

Methodology