
A statistical test to identify differences in clustering structures

Published by Daniel Takahashi
Publication date: 2013
Research field: Mathematical statistics
Research language: English





Statistical inference on functional magnetic resonance imaging (fMRI) data is an important task in brain imaging. One major hypothesis is that the presence or absence of a psychiatric disorder can be explained by the differential clustering of neurons in the brain. Given this hypothesis, it is of clear interest to ask whether the properties of the clusters differ between groups of patients and controls. The standard approach to group differences in brain imaging is to carry out a voxel-wise univariate analysis for a difference between the mean group responses using an appropriate test (e.g. a t-test) and to assemble the resulting significantly different voxels into clusters, testing again at cluster level. In this approach, however, the primary voxel-level test is blind to any cluster structure. Direct assessments of differences between groups (or reproducibility within groups) at the cluster level have been rare in brain imaging. For this reason, we introduce a novel statistical test called ANOCVA (ANalysis Of Cluster structure Variability), which statistically tests whether two or more populations are equally clustered using specific features. The proposed method allows us to compare the clustering structure of multiple groups simultaneously, and also to identify features that contribute to the differential clustering. We illustrate the performance of ANOCVA through simulations and an application to an fMRI data set composed of children with ADHD and controls. Results show that there are several differences in the brain's clustering structure between them, corroborating the hypothesis in the literature. Furthermore, we identified some brain regions previously not described, generating new hypotheses to be tested empirically.
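
The abstract does not spell out how ANOCVA is computed, so the following is not the authors' method. As a rough illustration of what "testing whether two populations are equally clustered" could look like, here is a minimal Python sketch of a generic label-permutation test on the between-feature dissimilarity matrices that a clustering algorithm would consume. The function names, the correlation-based dissimilarity, and the sum-of-squares statistic are all assumptions made for illustration.

```python
import numpy as np


def subject_dissimilarities(subjects):
    """Per-subject (1 - correlation) matrices between features.

    subjects: array of shape (n_subjects, n_timepoints, n_features).
    Returns an array of shape (n_subjects, n_features, n_features).
    """
    return np.stack([1.0 - np.corrcoef(s, rowvar=False) for s in subjects])


def permutation_test(group_a, group_b, n_perm=500, seed=0):
    """Permutation test for a difference in mean dissimilarity structure.

    This is a generic stand-in, NOT the ANOCVA statistic from the paper.
    Returns (observed_statistic, p_value).
    """
    rng = np.random.default_rng(seed)
    # Dissimilarities are computed once; only the group labels are permuted.
    pooled = np.concatenate(
        [subject_dissimilarities(group_a), subject_dissimilarities(group_b)]
    )
    n_a = len(group_a)

    def statistic(idx_a, idx_b):
        # Squared distance between the two group-mean dissimilarity matrices.
        diff = pooled[idx_a].mean(axis=0) - pooled[idx_b].mean(axis=0)
        return float(np.sum(diff ** 2))

    all_idx = np.arange(len(pooled))
    observed = statistic(all_idx[:n_a], all_idx[n_a:])

    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        if statistic(perm[:n_a], perm[n_a:]) >= observed:
            exceed += 1

    # The +1 terms count the observed labelling as one of the permutations.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value
```

Calling `permutation_test(patients, controls)` with two arrays of per-subject time series returns the observed statistic and a permutation p-value. A small p-value suggests that the dissimilarity structure, and hence any clustering built on it, differs between the groups. Unlike the method described in the abstract, this sketch handles only two groups and does not identify which features drive the difference.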


Read also

85 - Hangjin Jiang 2020
Statistical modeling plays a fundamental role in understanding the underlying mechanism of massive data (statistical inference) and predicting the future (statistical prediction). Although all models are wrong, researchers try their best to make some of them useful. The question here is how we can measure the usefulness of a statistical model for the data at hand. This is key to statistical prediction. The important statistical problem of testing whether the observations follow the proposed statistical model has attracted relatively little attention. In this paper, we propose a new framework for this problem by building its connection with two-sample distribution comparison. The proposed method can be applied to evaluate a wide range of models. Examples are given to show the performance of the proposed method.
Clustering methods have led to a number of important discoveries in bioinformatics and beyond. A major challenge in their use is determining which clusters represent important underlying structure, as opposed to spurious sampling artifacts. This challenge is especially serious, and very few methods are available when the data are very high in dimension. Statistical Significance of Clustering (SigClust) is a recently developed cluster evaluation tool for high dimensional low sample size data. An important component of the SigClust approach is the very definition of a single cluster as a subset of data sampled from a multivariate Gaussian distribution. The implementation of SigClust requires the estimation of the eigenvalues of the covariance matrix for the null multivariate Gaussian distribution. We show that the original eigenvalue estimation can lead to a test that suffers from severe inflation of type-I error, in the important case where there are huge single spikes in the eigenvalues. This paper addresses this critical challenge using a novel likelihood based soft thresholding approach to estimate these eigenvalues which leads to a much improved SigClust. These major improvements in SigClust performance are shown by both theoretical work and an extensive simulation study. Applications to some cancer genomic data further demonstrate the usefulness of these improvements.
76 - Duncan Lee 2012
Disease maps display the spatial pattern in disease risk, so that high-risk clusters can be identified. The spatial structure in the risk map is typically represented by a set of random effects, which are modelled with a conditional autoregressive (CAR) prior. Such priors include a global spatial smoothing parameter, whereas real risk surfaces are likely to include areas of smooth evolution as well as discontinuities, the latter of which are known as risk boundaries. Therefore, this paper proposes an extension to the class of CAR priors, which can identify both areas of localised spatial smoothness and risk boundaries. However, allowing for this localised smoothing requires large numbers of correlation parameters to be estimated, which are unlikely to be well identified from the data. To address this problem we propose eliciting an informative prior about the locations of such boundaries, which can be combined with the information from the data to provide more precise posterior inference. We test our approach by simulation, before applying it to a study of the risk of emergency admission to hospital in Greater Glasgow, Scotland.
The tau statistic $\tau$ uses geolocation and, usually, symptom onset time to assess global spatiotemporal clustering from epidemiological data. We test different factors that could affect graphical hypothesis tests of clustering or bias clustering range estimates based on the statistic, by comparison with a baseline analysis of an open access measles dataset. From re-analysing this data we find that the spatial bootstrap sampling method used to construct the confidence interval for the tau estimate and confidence interval (CI) type can bias clustering range estimates. We suggest that the bias-corrected and accelerated (BCa) CI is essential for asymmetric sample bootstrap distributions of tau estimates. We also find evidence against no spatiotemporal clustering, $p$-value $\in [0, 0.014]$ (global envelope test). We develop a tau-specific modification of the Loh & Stein spatial bootstrap sampling method, which gives more precise bootstrapped tau estimates and a 20% higher estimated clustering endpoint than previously published (36.0 m; 95% BCa CI (14.9, 46.6), vs 30 m) and an equivalent increase in the clustering area of elevated disease odds by 44%. What appears a modest radial bias in the range estimate is more than doubled on the areal scale, which public health resources are proportional to. This difference could have important consequences for control. Correct practice of hypothesis testing of no clustering and clustering range estimation of the tau statistic are illustrated in the Graphical abstract. We advocate proper implementation of this useful statistic, ultimately to reduce inaccuracies in control policy decisions made during disease clustering analysis.
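
The link between the 20% and 44% figures above can be checked with a short calculation, assuming (as an illustration, not something the abstract states explicitly) that the clustering area is a disc whose radius $r$ is the estimated clustering endpoint:

$$
\frac{r_{\text{new}}}{r_{\text{old}}} = \frac{36.0}{30} = 1.2,
\qquad
\frac{A_{\text{new}}}{A_{\text{old}}} = \frac{\pi r_{\text{new}}^{2}}{\pi r_{\text{old}}^{2}} = 1.2^{2} = 1.44 .
$$

Under that assumption, a 20% increase in radius corresponds to a 44% increase in area, so the relative bias on the areal scale is a little more than twice the relative bias on the radial scale.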
361 - Fei Gao Vaccine 2021
Longitudinal cohorts to determine the incidence of HIV infection are logistically challenging, so researchers have sought alternative strategies. Recency test methods use biomarker profiles of HIV-infected subjects in a cross-sectional sample to infer whether they are recently infected and to estimate incidence in the population. Two main estimators have been used in practice: one that assumes a recency test is perfectly specific, and another that allows for false-recent results. To date, these commonly used estimators have not been rigorously studied with respect to their assumptions and statistical properties. In this paper, we present a theoretical framework with which to understand these estimators and interrogate their assumptions, and perform a simulation study to assess the performance of these estimators under realistic HIV epidemiological dynamics. We conclude with recommendations for the use of these estimators in practice and a discussion of future methodological developments to improve HIV incidence estimation via recency test.