In electronic health records (EHRs), latent subgroups of patients may exhibit distinctive patterns in their longitudinal health trajectories. For such data, growth mixture models (GMMs) classify patients into latent classes based on individual trajectories and hypothesized risk factors. However, the application of GMMs is hindered by a missing data problem specific to EHRs, which involves two patient-led missing data processes: the visit process, and the response process for an EHR variable conditional on a patient visiting the clinic. If either process is associated with the process generating the longitudinal outcomes, then valid inferences require accounting for a nonignorable missing data mechanism. We propose a Bayesian shared parameter model that uses a discrete latent class variable to link GMMs of multiple longitudinal health outcomes, the visit process, and the response process of each outcome given a visit. Our focus is on multiple longitudinal health outcomes for which there can be a clinically prescribed visit schedule. We demonstrate our model on EHR measurements of early childhood weight and height z-scores. Using simulations, we illustrate the statistical properties of our method with respect to subgroup-specific or marginal inferences. We built the R package EHRMiss for model fitting, selection, and checking.
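As a rough illustration of how a discrete latent class can link the three processes, a generic latent class shared parameter factorization might be written as below. The notation is an assumption made here for illustration and is not taken from the paper: $c_i$ is the latent class of patient $i$, $K$ the number of classes, $x_i$ the risk factors, $y_i^{\mathrm{obs}}$ the observed longitudinal outcomes, $d_i$ the visit indicators over the prescribed schedule, and $m_i$ the response indicators for each outcome given a visit.

```latex
% Hypothetical notation; a sketch of a generic latent class
% shared parameter model, not the paper's exact specification.
\[
p\left(y_i^{\mathrm{obs}}, d_i, m_i \mid x_i\right)
  = \sum_{k=1}^{K}
    \underbrace{\Pr(c_i = k \mid x_i)}_{\text{class membership}}
    \,
    \underbrace{p\left(y_i^{\mathrm{obs}} \mid c_i = k\right)}_{\text{GMM trajectories}}
    \,
    \underbrace{p\left(d_i \mid c_i = k\right)}_{\text{visit process}}
    \,
    \underbrace{p\left(m_i \mid d_i, c_i = k\right)}_{\text{response process}}
\]
```

Under this sketch, the outcomes and the two missingness processes are assumed conditionally independent given the class, so any dependence between them arises only through $c_i$. Marginally over $c_i$ the missingness is then related to the outcomes, which is what makes the mechanism nonignorable.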
Analyzing electronic health records (EHRs) poses significant challenges because often few samples are available describing a patient's health and, when available, their information content is highly diverse. The problem we consider is how to integrate
We propose a distributed quadratic inference function framework to jointly estimate regression parameters from multiple potentially heterogeneous data sources with correlated vector outcomes. The primary goal of this joint integrative analysis is to
Although increasingly used as a data resource for assembling cohorts, electronic health records (EHRs) pose many analytic challenges. In particular, a patient's health status influences when and what data are recorded, generating sampling bias in the
Readily available proxies for time of disease onset such as time of the first diagnostic code can lead to substantial risk prediction error if performing analyses based on poor proxies. Due to the lack of detailed documentation and labor intensivenes
In the genomic era, the identification of gene signatures associated with disease is of significant interest. Such signatures are often used to predict clinical outcomes in new patients and aid clinical decision-making. However, recent studies have s