
Multiclass classification of growth curves using random change points and heterogeneous random effects

Published by Vincent Chin
Publication date: 2019
Research field: Mathematical statistics
Research language: English





Faltering growth among children is a nutritional problem prevalent in low- to middle-income countries; it is generally defined as a slower rate of growth compared to a healthy reference population of the same age and gender. As faltering is closely associated with reduced physical, intellectual and economic productivity potential, it is important to identify faltered children and to characterise different growth patterns so that targeted treatments can be designed and administered. We introduce a multiclass classification model for growth trajectories that flexibly extends a current classification approach called the broken stick model, which is a piecewise linear model with breaks at fixed knot locations. Heterogeneity in growth patterns among children is captured using mixture-distributed random effects, whereby the mixture components determine the classification of children into subgroups. The mixture distribution is modelled using a Dirichlet process prior, which avoids the need to choose the true number of mixture components and allows this to be driven by the complexity of the data. Because children differ individually in the onset of growth stages, we introduce child-specific random change points. Simulation results show that the random change point model outperforms the broken stick model because it has fewer restrictions on knot locations. We illustrate our model on a longitudinal birth cohort from the Healthy Birth, Growth and Development knowledge integration project funded by the Bill and Melinda Gates Foundation. The analysis reveals 9 subgroups of children within the population, which exhibit varying faltering trends between birth and age one.
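As a rough illustration of the piecewise linear idea described in the abstract, the following sketch evaluates a continuous broken-stick mean curve for a given set of knots. It is not the authors' model: the function name `broken_stick_mean`, its parameters, and all numeric values are hypothetical, and no random effects or Dirichlet process prior are included. Passing the same knots for every child mimics the fixed-knot broken stick model, while passing a different knot per child mimics a child-specific change point.

```python
import numpy as np

def broken_stick_mean(age, knots, intercept, slopes):
    """Continuous piecewise linear curve (hypothetical helper, not from the paper).

    age: non-negative ages at which to evaluate the curve
    knots: increasing break locations (len(slopes) - 1 of them)
    intercept: value of the curve at age 0
    slopes: one slope per segment
    """
    age = np.asarray(age, dtype=float)
    knots = np.asarray(knots, dtype=float)
    # Each segment runs from its start to its end; the last one is open-ended.
    starts = np.concatenate(([0.0], knots))
    ends = np.concatenate((knots, [np.inf]))
    # Time each age has spent inside each segment, shape (n_ages, n_segments).
    time_in_segment = np.clip(age[:, None], starts, ends) - starts
    return intercept + time_in_segment @ np.asarray(slopes, dtype=float)

ages = [0.0, 0.5, 1.0]
# Fixed knot at age 0.5 (illustrative values only).
fixed = broken_stick_mean(ages, [0.5], 3.0, [2.0, 1.0])
# Same slopes, but this hypothetical child changes growth phase earlier, at 0.25.
early = broken_stick_mean(ages, [0.25], 3.0, [2.0, 1.0])
print(fixed)  # values 3.0, 4.0, 4.5
print(early)  # values 3.0, 3.75, 4.25
```

With identical slopes, moving the change point alone already shifts the trajectory, which is the kind of flexibility the abstract attributes to random change points.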




Read also

We consider inference from non-random samples in data-rich settings where high-dimensional auxiliary information is available both in the sample and the target population, with survey inference being a special case. We propose a regularized prediction approach that predicts the outcomes in the population using a large number of auxiliary variables such that the ignorability assumption is reasonable while the Bayesian framework is straightforward for quantification of uncertainty. Besides the auxiliary variables, inspired by Little & An (2004), we also extend the approach by estimating the propensity score for a unit to be included in the sample and also including it as a predictor in the machine learning models. We show through simulation studies that the regularized predictions using soft Bayesian additive regression trees yield valid inference for the population means and coverage rates close to the nominal levels. We demonstrate the application of the proposed methods using two different real data applications, one in a survey and one in an epidemiology study.
We present an approach to estimate distance-dependent heterogeneous associations between point-referenced exposures to built environment characteristics and health outcomes. By estimating associations that depend non-linearly on distance between subjects and point-referenced exposures, this method addresses the modifiable area-unit problem that is pervasive in the built environment literature. Additionally, by estimating heterogeneous effects, the method also addresses the uncertain geographic context problem. The key innovation of our method is to combine ideas from the non-parametric function estimation literature and the Bayesian Dirichlet process literature. The former is used to estimate nonlinear associations between subjects' outcomes and proximate built environment features, and the latter identifies clusters within the population that have different effects. We study this method in simulations and apply our model to study heterogeneity in the association between fast food restaurant availability and weight status of children attending schools in Los Angeles, California.
We call $i$ a fixed point of a given sequence if the value of that sequence at the $i$-th position coincides with $i$. Here, we enumerate fixed points in the class of restricted growth sequences. The counting process is conducted by calculation of generating functions and leveraging a probabilistic sampling method.
Two-sample and independence tests with the kernel-based MMD and HSIC have shown remarkable results on i.i.d. data and stationary random processes. However, these statistics are not directly applicable to non-stationary random processes, a prevalent form of data in many scientific disciplines. In this work, we extend the application of MMD and HSIC to non-stationary settings by assuming access to independent realisations of the underlying random process. These realisations, in the form of non-stationary time-series measured on the same temporal grid, can then be viewed as i.i.d. samples from a multivariate probability distribution, to which MMD and HSIC can be applied. We further show how to choose suitable kernels over these high-dimensional spaces by maximising the estimated test power with respect to the kernel hyper-parameters. In experiments on synthetic data, we demonstrate superior performance of our proposed approaches in terms of test power when compared to current state-of-the-art functional or multivariate two-sample and independence tests. Finally, we employ our methods on a real socio-economic dataset as an example application.
The random coefficients model $Y_i={\beta_0}_i+{\beta_1}_i {X_1}_i+{\beta_2}_i {X_2}_i+\ldots+{\beta_d}_i {X_d}_i$, with $\mathbf{X}_i$, $Y_i$, $\mathbf{\beta}_i$ i.i.d., and $\mathbf{\beta}_i$ independent of $\mathbf{X}_i$, is often used to capture unobserved heterogeneity in a population. We propose a quasi-maximum likelihood method to estimate the joint density of the random coefficients. This method implicitly involves the inversion of the Radon transform in order to reconstruct the joint distribution, and hence is an inverse problem. Nonparametric estimation of the joint density of $\mathbf{\beta}_i=({\beta_0}_i,\ldots, {\beta_d}_i)$ based on kernel methods or Fourier inversion has been proposed in recent years. Most of these methods assume a heavy-tailed design density $f_\mathbf{X}$. To add stability to the solution, we apply a Tikhonov-type regularization method. We analyze the convergence of the method without assuming heavy tails for $f_\mathbf{X}$ and illustrate performance by applying the method to simulated and real data.