High-dimensional Multivariate Mediation: with Application to Neuroimaging Data

90 0 0.0 ( 0 )

Download Cite

Added by Oliver Ch\\'en

Publication date 2015

fields Mathematical Statistics

and research's language is English

Authors Oliver Y. Chen - Ciprian M. Crainiceanu - Elizabeth L. Ogburn

Methodology

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Mediation analysis has become an important tool in the behavioral sciences for investigating the role of intermediate variables that lie in the path between a randomized treatment and an outcome variable. The influence of the intermediate variable on the outcome is often explored using structural equation models (SEMs), with model coefficients interpreted as possible effects. While there has been significant research on the topic in recent years, little work has been done on mediation analysis when the intermediate variable (mediator) is a high-dimensional vector. In this work we present a new method for exploratory mediation analysis in this setting called the directions of mediation (DMs). The first DM is defined as the linear combination of the elements of a high-dimensional vector of potential mediators that maximizes the likelihood of the SEM. The subsequent DMs are defined as linear combinations of the elements of the high-dimensional vector that are orthonormal to the previous DMs and maximize the likelihood of the SEM. We provide an estimation algorithm and establish the asymptotic properties of the obtained estimators. This method is well suited for cases when many potential mediators are measured. Examples of high-dimensional potential mediators are brain images composed of hundreds of thousands of voxels, genetic variation measured at millions of SNPs, or vectors of thousands of variables in large-scale epidemiological studies. We demonstrate the method using a functional magnetic resonance imaging (fMRI) study of thermal pain where we are interested in determining which brain locations mediate the relationship between the application of a thermal stimulus and self-reported pain.

rate research

Bayesian inference on high-dimensional multivariate binary data

147 - Antik Chakraborty , Rihui Ou , David B. Dunson 2021

It has become increasingly common to collect high-dimensional binary data; for example, with the emergence of new sampling techniques in ecology. In smaller dimensions, multivariate probit (MVP) models are routinely used for inferences. However, algorithms for fitting such models face issues in scaling up to high dimensions due to the intractability of the likelihood, involving an integral over a multivariate normal distribution having no analytic form. Although a variety of algorithms have been proposed to approximate this intractable integral, these approaches are difficult to implement and/or inaccurate in high dimensions. We propose a two-stage Bayesian approach for inference on model parameters while taking care of the uncertainty propagation between the stages. We use the special structure of latent Gaussian models to reduce the highly expensive computation involved in joint parameter estimation to focus inference on marginal distributions of model parameters. This essentially makes the method embarrassingly parallel for both stages. We illustrate performance in simulations and applications to joint species distribution modeling in ecology.

Methodology Computation

Multivariate functional responses low rank regression with an application to brain imaging data

115 - Xiucai Ding , Dengdeng Yu , Zhengwu Zhang 2020

We propose a multivariate functional responses low rank regression model with possible high dimensional functional responses and scalar covariates. By expanding the slope functions on a set of sieve basis, we reconstruct the basis coefficients as a matrix. To estimate these coefficients, we propose an efficient procedure using nuclear norm regularization. We also derive error bounds for our estimates and evaluate our method using simulations. We further apply our method to the Human Connectome Project neuroimaging data to predict cortical surface motor task-evoked functional magnetic resonance imaging signals using various clinical covariates to illustrate the usefulness of our results.

Methodology Applications

Latent Multivariate Log-Gamma Models for High-Dimensional Multi-Type Responses with Application to Daily Fine Particulate Matter and Mortality Counts

494 - Zhixing Xu , Jonathan R. Bradley , Debajyoti Sinha 2019

Tracking and estimating Daily Fine Particulate Matter (PM2.5) is very important as it has been shown that PM2.5 is directly related to mortality related to lungs, cardiovascular system, and stroke. That is, high values of PM2.5 constitute a public health problem in the US, and it is important that we precisely estimate PM2.5 to aid in public policy decisions. Thus, we propose a Bayesian hierarchical model for high-dimensional multi-type responses. By multi-type responses we mean a collection of correlated responses that have different distributional assumptions (e.g., continuous skewed observations, and count-valued observations). The Centers for Disease Control and Prevention (CDC) database provides counts of mortalities related to PM2.5 and daily averaged PM2.5 which are both treated as responses in our analysis. Our model capitalizes on the shared conjugate structure between the Weibull (to model PM2.5), Poisson (to model diseases mortalities), and multivariate log-gamma distributions, and we use dimension reduction to aid with computation. Our model can also be used to improve the precision of estimates and estimate values at undisclosed/missing counties. We provide a simulation study to illustrate the performance of the model, and give an in-depth analysis of the CDC dataset.

Methodology

ICS for Multivariate Outlier Detection with Application to Quality Control

77 - Aurore Archimbaud , Klaus Nordhausen , Anne Ruiz-Gazen 2016

In high reliability standards fields such as automotive, avionics or aerospace, the detection of anomalies is crucial. An efficient methodology for automatically detecting multivariate outliers is introduced. It takes advantage of the remarkable properties of the Invariant Coordinate Selection (ICS) method. Based on the simultaneous spectral decomposition of two scatter matrices, ICS leads to an affine invariant coordinate system in which the Euclidian distance corresponds to a Mahalanobis Distance (MD) in the original coordinates. The limitations of MD are highlighted using theoretical arguments in a context where the dimension of the data is large. Unlike MD, ICS makes it possible to select relevant components which removes the limitations. Owing to the resulting dimension reduction, the method is expected to improve the power of outlier detection rules such as MD-based criteria. It also greatly simplifies outliers interpretation. The paper includes practical guidelines for using ICS in the context of a small proportion of outliers which is relevant in high reliability standards fields. The choice of scatter matrices together with the selection of relevant invariant components through parallel analysis and normality tests are addressed. The use of the regular covariance matrix and the so called matrix of fourth moments as the scatter pair is recommended. This choice combines the simplicity of implementation together with the possibility to derive theoretical results. A simulation study confirms the good properties of the proposal and compares it with other scatter pairs. This study also provides a comparison with Principal Component Analysis and MD. The performance of our proposal is also evaluated on several real data sets using a user-friendly R package accompanying the paper.

Methodology

Interpretable High-Dimensional Inference Via Score Projection with an Application in Neuroimaging

61 - Simon N. Vandekar , Philip T. Reiss , 2017

In the fields of neuroimaging and genetics, a key goal is testing the association of a single outcome with a very high-dimensional imaging or genetic variable. Often, summary measures of the high-dimensional variable are created to sequentially test and localize the association with the outcome. In some cases, the results for summary measures are significant, but subsequent tests used to localize differences are underpowered and do not identify regions associated with the outcome. Here, we propose a generalization of Raos score test based on projecting the score statistic onto a linear subspace of a high-dimensional parameter space. In addition, we provide methods to localize signal in the high-dimensional space by projecting the scores to the subspace where the score test was performed. This allows for inference in the high-dimensional space to be performed on the same degrees of freedom as the score test, effectively reducing the number of comparisons. Simulation results demonstrate the test has competitive power relative to others commonly used. We illustrate the method by analyzing a subset of the Alzheimers Disease Neuroimaging Initiative dataset. Results suggest cortical thinning of the frontal and temporal lobes may be a useful biological marker of Alzheimers risk.

Statistics Theory Statistics Theory