No Arabic abstract
Given the cost and duration of phase III and phase IV clinical trials, the development of statistical methods for go/no-go decisions is vital. In this paper, we introduce a Bayesian methodology to compute the probability of success based on the current data of a treatment regimen for the multivariate linear model. Our approach utilizes a Bayesian seemingly unrelated regression model, which allows for multiple endpoints to be modeled jointly even if the covariates between the endpoints are different. Correlations between endpoints are explicitly modeled. This Bayesian joint modeling approach unifies single and multiple testing procedures under a single framework. We develop an approach to multiple testing that asymptotically guarantees strict family-wise error rate control, and is more powerful than frequentist approaches to multiplicity. The method effectively yields those of Ibrahim et al. and Chuang-Stein as special cases, and, to our knowledge, is the only method that allows for robust sample size determination for multiple endpoints and/or hypotheses and the only method that provides strict family-wise type I error control in the presence of multiplicity.
In neuroimaging, hundreds to hundreds of thousands of tests are performed across a set of brain regions or all locations in an image. Recent studies have shown that the most common family-wise error (FWE) controlling procedures in imaging, which rely on classical mathematical inequalities or Gaussian random field theory, yield FWE rates that are far from the nominal level. Depending on the approach used, the FWER can be exceedingly small or grossly inflated. Given the widespread use of neuroimaging as a tool for understanding neurological and psychiatric disorders, it is imperative that reliable multiple testing procedures are available. To our knowledge, only permutation joint testing procedures have been shown to reliably control the FWER at the nominal level. However, these procedures are computationally intensive due to the increasingly available large sample sizes and dimensionality of the images, and analyses can take days to complete. Here, we develop a parametric bootstrap joint testing procedure. The parametric bootstrap procedure works directly with the test statistics, which leads to much faster estimation of adjusted emph{p}-values than resampling-based procedures while reliably controlling the FWER in sample sizes available in many neuroimaging studies. We demonstrate that the procedure controls the FWER in finite samples using simulations, and present region- and voxel-wise analyses to test for sex differences in developmental trajectories of cerebral blood flow.
The use of quantiles to obtain insights about multivariate data is addressed. It is argued that incisive insights can be obtained by considering directional quantiles, the quantiles of projections. Directional quantile envelopes are proposed as a way to condense this kind of information; it is demonstrated that they are essentially halfspace (Tukey) depth levels sets, coinciding for elliptic distributions (in particular multivariate normal) with density contours. Relevant questions concerning their indexing, the possibility of the reverse retrieval of directional quantile information, invariance with respect to affine transformations, and approximation/asymptotic properties are studied. It is argued that the analysis in terms of directional quantiles and their envelopes offers a straightforward probabilistic interpretation and thus conveys a concrete quantitative meaning; the directional definition can be adapted to elaborate frameworks, like estimation of extreme quantiles and directional quantile regression, the regression of depth contours on covariates. The latter facilitates the construction of multivariate growth charts---the question that motivated all the development.
Graphical models express conditional independence relationships among variables. Although methods for vector-valued data are well established, functional data graphical models remain underdeveloped. We introduce a notion of conditional independence between random functions, and construct a framework for Bayesian inference of undirected, decomposable graphs in the multivariate functional data context. This framework is based on extending Markov distributions and hyper Markov laws from random variables to random processes, providing a principled alternative to naive application of multivariate methods to discretized functional data. Markov properties facilitate the composition of likelihoods and priors according to the decomposition of a graph. Our focus is on Gaussian process graphical models using orthogonal basis expansions. We propose a hyper-inverse-Wishart-process prior for the covariance kernels of the infinite coefficient sequences of the basis expansion, establish existence, uniqueness, strong hyper Markov property, and conjugacy. Stochastic search Markov chain Monte Carlo algorithms are developed for posterior inference, assessed through simulations, and applied to a study of brain activity and alcoholism.
We propose a method for multiple hypothesis testing with familywise error rate (FWER) control, called the i-FWER test. Most testing methods are predefined algorithms that do not allow modifications after observing the data. However, in practice, analysts tend to choose a promising algorithm after observing the data; unfortunately, this violates the validity of the conclusion. The i-FWER test allows much flexibility: a human (or a computer program acting on the humans behalf) may adaptively guide the algorithm in a data-dependent manner. We prove that our test controls FWER if the analysts adhere to a particular protocol of masking and unmasking. We demonstrate via numerical experiments the power of our test under structured non-nulls, and then explore new forms of masking.
The article develops marginal models for multivariate longitudinal responses. Overall, the model consists of five regression submodels, one for the mean and four for the covariance matrix, with the latter resulting by considering various matrix decompositions. The decompositions that we employ are intuitive, easy to understand, and they do not rely on any assumptions such as the presence of an ordering among the multivariate responses. The regression submodels are semiparametric, with unknown functions represented by basis function expansions. We use spike-slap priors for the regression coefficients to achieve variable selection and function regularization, and to obtain parameter estimates that account for model uncertainty. An efficient Markov chain Monte Carlo algorithm for posterior sampling is developed. The simulation studies presented investigate the effects of priors on posteriors, the gains that one may have when considering multivariate longitudinal analyses instead of univariate ones, and whether these gains can counteract the negative effects of missing data. We apply the methods on a highly unbalanced longitudinal dataset with four responses observed over of period of 20 years