
Automatic Detection of Significant Areas for Functional Data with Directional Error Control

Added by Jian Shi
Publication date: 2015
Language: English





To detect differences between the mean curves of two samples in longitudinal studies or functional data analysis, one usually needs to partition the temporal or spatial domain into several pre-determined sub-areas. In this paper we apply the idea of large-scale multiple testing to find the significant sub-areas automatically in a general functional data analysis framework. A nonparametric Gaussian process regression model is introduced for two-sided multiple tests. We derive an optimal test which controls directional false discovery rates and propose a procedure that approximates it on a continuum. The proposed procedure controls directional false discovery rates at any specified level asymptotically. In addition, it is computationally inexpensive and accommodates different observation time points across the samples. Simulation studies demonstrate its finite-sample performance. We also apply it to a study of executive function in children with Hemiplegic Cerebral Palsy and extend it to equivalence tests.
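The paper's optimal rule is derived from a Gaussian process regression model, which is not reproduced here. As a rough stand-in for the general idea of directional multiple testing over a domain, the sketch below runs pointwise two-sample t-tests on a shared grid and applies a Benjamini-Hochberg step, with the sign of each rejected statistic giving the claimed direction; the function name and the whole setup are illustrative, not the paper's procedure.

```python
import numpy as np
from scipy import stats

def directional_bh(x, y, q=0.05):
    """Pointwise two-sample t-tests on a common grid (rows = curves,
    columns = grid points), followed by a Benjamini-Hochberg step.
    Returns a boolean rejection mask and the direction of each test.
    A generic sketch, not the GP-based optimal rule of the paper."""
    t, p = stats.ttest_ind(x, y, axis=0)            # two-sided p-value per grid point
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= q * np.arange(1, m + 1) / m  # BH step-up comparison
    k = passed.nonzero()[0].max() + 1 if passed.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True                        # reject the k smallest p-values
    return reject, np.sign(t)
```

Grid points inside a genuinely shifted region come out rejected with a consistent sign, which gives a crude automatically detected "significant area".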



Related research

Given the cost and duration of phase III and phase IV clinical trials, the development of statistical methods for go/no-go decisions is vital. In this paper, we introduce a Bayesian methodology to compute the probability of success of a treatment regimen, based on the current data, for the multivariate linear model. Our approach utilizes a Bayesian seemingly unrelated regression model, which allows multiple endpoints to be modeled jointly even when the covariates differ between endpoints. Correlations between endpoints are modeled explicitly. This Bayesian joint modeling approach unifies single and multiple testing procedures under a single framework. We develop an approach to multiple testing that asymptotically guarantees strict family-wise error rate control and is more powerful than frequentist approaches to multiplicity. The method yields those of Ibrahim et al. and Chuang-Stein as special cases and, to our knowledge, is the only method that allows for robust sample size determination for multiple endpoints and/or hypotheses while providing strict family-wise type I error control in the presence of multiplicity.
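The abstract's probability-of-success idea can be illustrated in a stripped-down single-endpoint form: draw the true effect from its approximate normal posterior given current data, compute the planned trial's power at each draw, and average. All names and the normal approximation are assumptions for illustration; the paper's seemingly unrelated regression model handles several correlated endpoints jointly, which this sketch does not.

```python
import numpy as np
from scipy import stats

def probability_of_success(eff_hat, se, n_per_arm, sd, alpha=0.025,
                           draws=20_000, seed=0):
    """Monte Carlo probability of success for one endpoint:
    average, over posterior draws of the true effect, the power of the
    future one-sided z-test. Single-endpoint toy, not the paper's SUR model."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(eff_hat, se, size=draws)     # posterior draws of the effect
    se_new = sd * np.sqrt(2.0 / n_per_arm)          # SE of the future trial's estimate
    z_alpha = stats.norm.ppf(1 - alpha)
    power = stats.norm.cdf(theta / se_new - z_alpha)  # power at each drawn effect
    return power.mean()
```

A strong observed effect relative to its uncertainty yields a probability of success near 1, while an effect estimate near zero yields a value close to the significance level, which is the qualitative behavior a go/no-go rule needs.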
In neuroimaging, hundreds to hundreds of thousands of tests are performed across a set of brain regions or all locations in an image. Recent studies have shown that the most common family-wise error (FWE) controlling procedures in imaging, which rely on classical mathematical inequalities or Gaussian random field theory, yield FWE rates that are far from the nominal level. Depending on the approach used, the FWER can be exceedingly small or grossly inflated. Given the widespread use of neuroimaging as a tool for understanding neurological and psychiatric disorders, it is imperative that reliable multiple testing procedures are available. To our knowledge, only permutation joint testing procedures have been shown to reliably control the FWER at the nominal level. However, these procedures are computationally intensive for the large sample sizes and image dimensionality now available, and analyses can take days to complete. Here, we develop a parametric bootstrap joint testing procedure. The parametric bootstrap procedure works directly with the test statistics, which leads to much faster estimation of adjusted p-values than resampling-based procedures while reliably controlling the FWER at sample sizes available in many neuroimaging studies. We demonstrate that the procedure controls the FWER in finite samples using simulations, and present region- and voxel-wise analyses to test for sex differences in developmental trajectories of cerebral blood flow.
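The core idea of working directly with the test statistics can be sketched as a max-T parametric bootstrap: simulate the statistic vector from its estimated joint null distribution and use the distribution of the maximum absolute statistic to obtain FWER-adjusted p-values. This is a generic max-T sketch under an assumed multivariate normal null with known correlation; the paper's estimation details differ.

```python
import numpy as np

def bootstrap_maxT_adjust(z, corr, n_boot=10_000, rng=None):
    """FWER-adjusted p-values via a parametric bootstrap of the joint
    null N(0, corr) of the test statistics: each observed |z_i| is
    compared against the simulated distribution of max_j |Z_j|.
    Generic max-T sketch, with corr assumed known."""
    rng = np.random.default_rng(rng)
    sims = rng.multivariate_normal(np.zeros(len(z)), corr, size=n_boot)
    max_null = np.abs(sims).max(axis=1)             # null distribution of the maximum
    return np.array([(max_null >= abs(zi)).mean() for zi in z])
```

Because only the statistics (not the raw images) are resampled, each bootstrap draw is a cheap multivariate normal sample, which is where the speed advantage over permutation of subject-level data comes from.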
A novel approach to perform unsupervised sequential learning for functional data is proposed. Our goal is to extract reference shapes (referred to as templates) from noisy, deformed and censored realizations of curves and images. Our model generalizes the Bayesian dense deformable template model (Allassonnière et al., 2007), a hierarchical model in which the template is the function to be estimated and the deformation is a nuisance, assumed to be random with a known prior distribution. The templates are estimated using a Monte Carlo version of the online Expectation-Maximization algorithm, extending the work of Cappé and Moulines (2009). Our sequential inference framework is significantly more computationally efficient than equivalent batch learning algorithms, especially when the missing data are high-dimensional. Some numerical illustrations on the curve registration problem and template extraction from images are provided to support our findings.
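A toy version of online EM for template extraction conveys the mechanics: for each incoming curve, an E-step weights candidate deformations (here, just cyclic shifts on a grid, a crude stand-in for the deformation posterior) by likelihood, and an M-step updates a running average of the realigned curves with a decreasing step size. Everything here, including the shift-only deformation model and the anchoring of the frame to the first curve, is a simplifying assumption, not the dense deformable template model of the abstract.

```python
import numpy as np

def online_em_template(stream, sigma=0.5):
    """Online EM sketch for template extraction under random cyclic shifts.
    E-step: weight all shifts of the new curve by Gaussian likelihood
    against the current template. M-step: stochastic-approximation update
    of the template with step size 1/t. Toy model, shifts only."""
    template = None
    for t, y in enumerate(stream, start=1):
        if template is None:
            template = np.asarray(y, dtype=float).copy()  # first curve anchors the frame
            continue
        cands = np.stack([np.roll(y, s) for s in range(len(y))])
        loglik = -((cands - template) ** 2).sum(axis=1) / (2.0 * sigma ** 2)
        w = np.exp(loglik - loglik.max())
        w /= w.sum()                                  # posterior over shifts (E-step)
        aligned = w @ cands                           # posterior-weighted realignment
        gamma = 1.0 / t                               # decreasing step size
        template = (1 - gamma) * template + gamma * aligned   # M-step update
    return template
```

Each curve is touched once, which is the sequential efficiency the abstract claims over batch EM; a batch algorithm would revisit the whole dataset at every iteration.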
For functional data, the ranks of the observed curves at each time point and the temporal evolution of these ranks can yield valuable insights into the time dynamics of the data. This approach is of interest in various application areas. For the analysis of rank dynamics, estimation of the cross-sectional ranks of functional data is a first step. Several statistics of interest for ranked functional data are proposed. To quantify the evolution of ranks over time, a model for rank derivatives is introduced, where rank dynamics are decomposed into two components. One component corresponds to population changes and the other to individual changes that both affect the rank trajectories of individuals. The joint asymptotic normality for suitable estimates of these two components is established. The proposed approaches are illustrated with simulations and three longitudinal data sets: growth curves obtained from the Zurich Longitudinal Growth Study, monthly house price data in the US from 1996 to 2015, and Major League Baseball offensive data for the 2017 season.
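The first step described above, computing cross-sectional ranks, amounts to ranking subjects within each time point; a minimal sketch on a common grid (function name and the (0, 1] scaling are illustrative choices):

```python
import numpy as np
from scipy.stats import rankdata

def cross_sectional_ranks(curves):
    """Rank each subject among all n subjects separately at every time
    point (rows = subjects, columns = time points), scaled to (0, 1].
    Each row of the result is one subject's rank trajectory over time."""
    n = curves.shape[0]
    return rankdata(curves, axis=0) / n
```

Plotting the rows of the result against time shows the rank trajectories whose derivatives the abstract's two-component model decomposes.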
In this paper we address the problem of representing functional data with tools from algebraic topology. We represent functions by means of merge trees, and this representation is compared with that offered by persistence diagrams. We show that these two tree structures, although not equivalent, are both invariant under homeomorphic re-parametrizations of the functions they represent, thus allowing for a statistical analysis which is indifferent to functional misalignment. We employ a novel metric for merge trees and we prove a few theoretical results related to its specific implementation when merge trees represent functions. To showcase the good properties of our topological approach to functional data analysis, we first go through a few examples using data generated in silico, employed to illustrate and compare the different representations provided by merge trees and persistence diagrams, and then we test it on the Aneurisk65 dataset, replicating, from our different perspective, the supervised classification analysis which contributed to making this dataset a benchmark for methods dealing with misaligned functional data.
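The persistence-diagram side of the comparison can be made concrete for a sampled 1-D function: sweeping values from low to high, every local minimum starts a component, and when two components meet the one born at the higher minimum dies (the elder rule), yielding (birth, death) pairs. The merge tree retains, in addition, how the components nest; the sketch below computes only the pairs, via a small union-find, and its function name is illustrative (this is not the paper's merge-tree metric).

```python
def sublevel_persistence(f):
    """0-dimensional sublevel-set persistence pairs of a sampled 1-D
    function f (a sequence of values). Returns (birth, death) pairs of
    the non-global local minima; the global minimum's class never dies."""
    order = sorted(range(len(f)), key=lambda i: f[i])
    parent, birth, pairs = {}, {}, []

    def find(i):                         # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:                      # sweep grid points by increasing value
        parent[i], birth[i] = i, f[i]
        for j in (i - 1, i + 1):         # merge with already-swept neighbours
            if j in parent:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                young, old = (ri, rj) if birth[ri] >= birth[rj] else (rj, ri)
                if birth[young] < f[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], f[i]))   # elder rule
                parent[young] = old
    return pairs
```

Because the pairs depend only on the ordering of function values, they are unchanged by re-parametrizations of the domain, which is the invariance to misalignment the abstract exploits.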