Clustering of functional data with misalignment problems has drawn much attention in the last decade. Most methods perform the clustering after the functional data have been registered, and there has been little research using both functional and scalar variables. In this paper, we propose a simultaneous registration and clustering (SRC) model built from two levels, which allows the use of both types of variables and carries out registration and clustering simultaneously. For data collected from subjects in different unknown groups, a Gaussian process functional regression model with time warping is used as the first-level model; an allocation model depending on scalar variables is used as the second-level model, providing further information about the groups. The former carries out registration and modeling for the multi-dimensional functional data (2D or 3D curves) at the same time. The methodology is implemented using an EM algorithm and is examined on both simulated and real data.
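As a rough illustration of what time warping does to a sampled curve, the sketch below evaluates x(h(t)) on the original grid by linear interpolation. It is not the Gaussian process functional regression model of the abstract; the helper name `warp_curve` and the example warping function h(t) = t^2 are assumptions made only for illustration.

```python
def warp_curve(times, values, warp):
    """Return x(h(t)) on the grid `times`, where x is given by `values`.

    `warp` is a monotone map h of [times[0], times[-1]] onto itself.
    Hypothetical helper: linear interpolation stands in for a real curve model.
    """
    def interpolate(s):
        # Clamp to the observed range, then interpolate between neighbours.
        if s <= times[0]:
            return values[0]
        if s >= times[-1]:
            return values[-1]
        for k in range(1, len(times)):
            if s <= times[k]:
                w = (s - times[k - 1]) / (times[k] - times[k - 1])
                return (1 - w) * values[k - 1] + w * values[k]

    return [interpolate(warp(t)) for t in times]


times = [0.0, 0.25, 0.5, 0.75, 1.0]
values = [0.0, 1.0, 2.0, 3.0, 4.0]                     # x(t) = 4t on the grid
warped = warp_curve(times, values, lambda t: t ** 2)   # assumed warp h(t) = t^2
```

Registration can be read as the inverse task: estimating a warping function for each curve so that the warped curves line up, which the SRC model does jointly with clustering.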
Estimating the number of clusters (K) is a critical and often difficult task in cluster analysis. Many methods have been proposed to estimate K, including some top performers that use a resampling approach. When performing cluster analysis on high-dimensional data, simultaneous clustering and feature selection is needed for improved interpretation and performance. To our knowledge, no study has investigated simultaneous estimation of K and feature selection in an exploratory cluster analysis. In this paper, we propose a resampling method to fill this gap and evaluate its performance under the sparse K-means clustering framework. The proposed target function balances the sensitivity and specificity of pairwise subject co-clustering between the clustering of the full data and the clustering of subsampled data. In extensive simulations, the method is among the best performers, compared with classical methods, in estimating K in low-dimensional data. For high-dimensional simulated data, it also shows superior performance in simultaneously estimating K and the feature sparsity parameter. Finally, we evaluated the methods on four microarray, two RNA-seq, one SNP and two non-omics datasets. The proposed method achieves better clustering accuracy with fewer selected predictive genes in almost all real applications.
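One plausible reading of the pairwise sensitivity and specificity mentioned above can be sketched as follows. The exact target function is not given in the abstract, so the definitions here are assumptions: a pair of subjects counts as "positive" when the full-data clustering puts them in the same cluster, and the subsample clustering is scored against that. The helper name `pairwise_agreement` is hypothetical.

```python
from itertools import combinations


def pairwise_agreement(labels_full, labels_sub):
    """Sensitivity and specificity of pairwise co-membership.

    Both arguments map subject id -> cluster label. Only subjects present in
    both clusterings are compared (a subsample omits some subjects).
    """
    common = sorted(set(labels_full) & set(labels_sub))
    tp = fn = tn = fp = 0
    for a, b in combinations(common, 2):
        together_full = labels_full[a] == labels_full[b]
        together_sub = labels_sub[a] == labels_sub[b]
        if together_full and together_sub:
            tp += 1
        elif together_full:
            fn += 1
        elif together_sub:
            fp += 1
        else:
            tn += 1
    # Assumed convention: a rate with no eligible pairs is reported as 1.0.
    sensitivity = tp / (tp + fn) if tp + fn else 1.0
    specificity = tn / (tn + fp) if tn + fp else 1.0
    return sensitivity, specificity


full = {0: "a", 1: "a", 2: "b", 3: "b", 4: "b"}
subsample = {0: 1, 1: 2, 2: 2, 3: 2}          # subject 4 was not drawn
print(pairwise_agreement(full, subsample))    # (0.5, 0.5)
```

Averaging such scores over many subsamples, for each candidate K (and each sparsity parameter), would give a stability criterion of the kind the abstract describes; how the two rates are combined is left open here.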
Motivated by recent work involving the analysis of biomedical imaging data, we present a novel procedure for constructing simultaneous confidence corridors for the mean of imaging data. We propose to use flexible bivariate splines over triangulations to handle the irregular image domains that are common in brain imaging studies and in other biomedical imaging applications. The proposed spline estimators of the mean functions are shown to be consistent and asymptotically normal under some regularity conditions. We also provide a computationally efficient estimator of the covariance function and derive its uniform consistency. The procedure is also extended to the two-sample case, in which we focus on comparing the mean functions from two populations of imaging data. Through Monte Carlo simulation studies, we examine the finite-sample performance of the proposed method. Finally, the proposed method is applied to analyze brain Positron Emission Tomography (PET) data in two different studies. One dataset used in preparation of this article was obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.
This paper presents a novel algorithm, named Direct Simultaneous Registration (DSR), that registers a collection of mono-modal 3D images simultaneously. The algorithm optimizes the global poses of local frames directly from the image intensities, without extracting features from the images. To obtain the optimal result, we start by formulating a Direct Bundle Adjustment (DBA) problem, which jointly optimizes the pose parameters of the local frames and the intensities of the panoramic image. By proving that the poses are independent of the panoramic image in the iterative process, we propose DSR and prove that it generates the same optimal poses as DBA without optimizing the intensities of the panoramic image. The proposed DSR method is particularly suitable for mono-modal registration and for scenarios where distinct features are not available, such as Transesophageal Echocardiography (TEE) images. The proposed method is validated on simulated and in-vivo 3D TEE images. It is shown that the proposed method outperforms the conventional sequential registration method in terms of accuracy and that the obtained results produce good alignment in in-vivo images.
Task-based functional magnetic resonance imaging (task fMRI) is a non-invasive technique that allows identifying brain regions whose activity changes when individuals are asked to perform a given task. This contributes to the understanding of how the human brain is organized in functionally distinct subdivisions. Task fMRI experiments from high-resolution scans provide hundreds of thousands of longitudinal signals for each individual, corresponding to measurements of brain activity at each voxel of the brain over the duration of the experiment. In this context, we propose visualization techniques for high-dimensional functional data that rely on depth-based notions, allow for computationally efficient two-dimensional representations of task fMRI data, and shed light on sample composition, outlier presence and individual variability. We believe that this step is crucial prior to any inferential approach that aims to identify neuroscientific patterns across individuals, tasks and brain regions. We illustrate the proposed technique through a simulation study and demonstrate its application to a motor and language task fMRI experiment.
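To make "depth-based notions" concrete, the sketch below computes the modified band depth (with bands formed by pairs of curves) for curves sampled on a common grid. This is one widely used functional depth, chosen here as an assumption; the abstract does not say which depth the authors use, and the function name `modified_band_depth` is hypothetical.

```python
from itertools import combinations


def modified_band_depth(curves):
    """Modified band depth of each curve, using bands spanned by curve pairs.

    `curves` is a list of equal-length lists sampled on a common time grid.
    For each curve, the depth is the average, over all pairs of curves, of
    the fraction of time points at which the curve lies inside the band
    delimited by that pair.
    """
    n_points = len(curves[0])
    pairs = list(combinations(range(len(curves)), 2))
    depths = []
    for x in curves:
        inside = 0
        for i, j in pairs:
            for t in range(n_points):
                lower = min(curves[i][t], curves[j][t])
                upper = max(curves[i][t], curves[j][t])
                if lower <= x[t] <= upper:
                    inside += 1
        depths.append(inside / (len(pairs) * n_points))
    return depths


signals = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [10, 10, 10]]
depths = modified_band_depth(signals)   # central curves get larger depth
```

Curves with high depth are central to the sample, while low values flag extreme or outlying signals. The brute-force loop over all pairs is only meant for small examples; at the scale of voxel-level signals a more efficient formulation would be needed.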
Many classification techniques for data that are curves or functions have recently been proposed. However, misalignment problems in the curves can affect the performance of most of them. In this paper, we propose a model-based approach for simultaneous curve registration and classification. The method performs curve classification based on a functional logistic regression model that relies on both scalar and functional variables, and aligns the curves simultaneously via a data registration model. EM-based algorithms are developed to perform maximum likelihood inference for the proposed models. We establish identifiability results for the curve registration model and investigate the asymptotic properties of the proposed estimation procedures. Simulation studies are conducted to demonstrate the finite-sample performance of the proposed models. An application to hyoid bone movement data from stroke patients demonstrates the effectiveness of the new models.