Do you want to publish a course? Click here

Functional principal variance component testing for a genetic association study of HIV progression

68   0   0.0 ( 0 )
 Added by Denis Agniel
 Publication date 2017
and research's language is English




Ask ChatGPT about the research

HIV-1C is the most prevalent subtype of HIV-1 and accounts for over half of HIV-1 infections worldwide. Host genetic influence of HIV infection has been previously studied in HIV-1B, but little attention has been paid to the more prevalent subtype C. To understand the role of host genetics in HIV-1C disease progression, we perform a study to assess the association between longitudinally collected measures of disease and more than 100,000 genetic markers located on chromosome 6. The most common approach to analyzing longitudinal data in this context is linear mixed effects models, which may be overly simplistic in this case. On the other hand, existing non-parametric methods may suffer from low power due to high degrees of freedom (DF) and may be computationally infeasible at the large scale. We propose a functional principal variance component (FPVC) testing framework which captures the nonlinearity in the CD4 and viral load with potentially low DF and is fast enough to carry out thousands or millions of times. The FPVC testing unfolds in two stages. In the first stage, we summarize the markers of disease progression according to their major patterns of variation via functional principal components analysis (FPCA). In the second stage, we employ a simple working model and variance component testing to examine the association between the summaries of disease progression and a set of single nucleotide polymorphisms. We supplement this analysis with simulation results which indicate that FPVC testing can offer large power gains over the standard linear mixed effects model.



rate research

Read More

Functional principal component analysis (FPCA) could become invalid when data involve non-Gaussian features. Therefore, we aim to develop a general FPCA method to adapt to such non-Gaussian cases. A Kenalls $tau$ function, which possesses identical eigenfunctions as covariance function, is constructed. The particular formulation of Kendalls $tau$ function makes it less insensitive to data distribution. We further apply it to the estimation of FPCA and study the corresponding asymptotic consistency. Moreover, the effectiveness of the proposed method is demonstrated through a comprehensive simulation study and an application to the physical activity data collected by a wearable accelerometer monitor.
Functional binary datasets occur frequently in real practice, whereas discrete characteristics of the data can bring challenges to model estimation. In this paper, we propose a sparse logistic functional principal component analysis (SLFPCA) method to handle the functional binary data. The SLFPCA looks for local sparsity of the eigenfunctions to obtain convenience in interpretation. We formulate the problem through a penalized Bernoulli likelihood with both roughness penalty and sparseness penalty terms. An efficient algorithm is developed for the optimization of the penalized likelihood using majorization-minimization (MM) algorithm. The theoretical results indicate both consistency and sparsistency of the proposed method. We conduct a thorough numerical experiment to demonstrate the advantages of the SLFPCA approach. Our method is further applied to a physical activity dataset.
Functional principal component analysis (FPCA) has been widely used to capture major modes of variation and reduce dimensions in functional data analysis. However, standard FPCA based on the sample covariance estimator does not work well in the presence of outliers. To address this challenge, a new robust functional principal component analysis approach based on the functional pairwise spatial sign (PASS) operator, termed PASS FPCA, is introduced where we propose estimation procedures for both eigenfunctions and eigenvalues with and without measurement error. Compared to existing robust FPCA methods, the proposed one requires weaker distributional assumptions to conserve the eigenspace of the covariance function. In particular, a class of distributions called the weakly functional coordinate symmetric (weakly FCS) is introduced that allows for severe asymmetry and is strictly larger than the functional elliptical distribution class, the latter of which has been well used in the robust statistics literature. The robustness of the PASS FPCA is demonstrated via simulation studies and analyses of accelerometry data from a large-scale epidemiological study of physical activity on older women that partly motivates this work.
Functional principal component analysis is essential in functional data analysis, but the inferences will become unconvincing when some non-Gaussian characteristics occur, such as heavy tail and skewness. The focus of this paper is to develop a robust functional principal component analysis methodology in dealing with non-Gaussian longitudinal data, for which sparsity and irregularity along with non-negligible measurement errors must be considered. We introduce a Kendalls $tau$ function whose particular properties make it a nice proxy for the covariance function in the eigenequation when handling non-Gaussian cases. Moreover, the estimation procedure is presented and the asymptotic theory is also established. We further demonstrate the superiority and robustness of our method through simulation studies and apply the method to the longitudinal CD4 cell count data in an AIDS study.
We propose a supervised principal component regression method for relating functional responses with high dimensional covariates. Unlike the conventional principal component analysis, the proposed method builds on a newly defined expected integrated residual sum of squares, which directly makes use of the association between functional response and predictors. Minimizing the integrated residual sum of squares gives the supervised principal components, which is equivalent to solving a sequence of nonconvex generalized Rayleigh quotient optimization problems and thus is computationally intractable. To overcome this computational challenge, we reformulate the nonconvex optimization problems into a simultaneous linear regression, with a sparse penalty added to deal with high dimensional predictors. Theoretically, we show that the reformulated regression problem recovers the same supervised principal subspace under suitable conditions. Statistically, we establish non-asymptotic error bounds for the proposed estimators. Numerical studies and an application to the Human Connectome Project lend further support.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا