
Flexible multi-state models for interval-censored data: specification, estimation, and an application to ageing research

Added by Robson Machado
Publication date: 2017
Language: English





Continuous-time multi-state survival models can be used to describe health-related processes over time. In the presence of interval-censored times for transitions between the living states, the likelihood is constructed using transition probabilities. Models can be specified using parametric or semi-parametric shapes for the hazards. Semi-parametric hazards can be fitted using $P$-splines and penalised maximum likelihood estimation. This paper presents a method to estimate flexible multi-state models which allows for parametric and semi-parametric hazard specifications. The estimation is based on a scoring algorithm. The method is illustrated with data from the English Longitudinal Study of Ageing.
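As a rough sketch of the likelihood building block described above: with interval-censored transitions, each pair of observation times contributes a transition probability, which can be approximated by treating the hazards as piecewise constant on a fine grid. The Python fragment below illustrates this for a hypothetical three-state illness-death model; the Gompertz hazard forms, parameter values, and grid size are invented for illustration and are not the paper's specification, and the scoring algorithm used for estimation is not shown. A semi-parametric version would replace each log-hazard with a $P$-spline expansion, $\log h_{rs}(t) = \sum_k \beta_{rs,k} B_k(t)$, and subtract a difference penalty on adjacent coefficients from the log-likelihood before maximisation.

import numpy as np
from scipy.linalg import expm

# Hypothetical three-state illness-death model: 0 = healthy, 1 = ill, 2 = dead.
# Gompertz-type hazards in age t, chosen purely for illustration.
def generator(t, theta):
    # Off-diagonal entries are transition hazards at time t; rows sum to zero.
    h01 = np.exp(theta["01"][0] + theta["01"][1] * t)
    h02 = np.exp(theta["02"][0] + theta["02"][1] * t)
    h12 = np.exp(theta["12"][0] + theta["12"][1] * t)
    return np.array([
        [-(h01 + h02), h01, h02],
        [0.0, -h12, h12],
        [0.0, 0.0, 0.0],   # death is absorbing
    ])

def transition_probability(t0, t1, theta, n_grid=20):
    # P(t0, t1) via a piecewise-constant approximation of the generator:
    # split [t0, t1] into short intervals and multiply matrix exponentials.
    grid = np.linspace(t0, t1, n_grid + 1)
    P = np.eye(3)
    for a, b in zip(grid[:-1], grid[1:]):
        P = P @ expm(generator(0.5 * (a + b), theta) * (b - a))
    return P

theta = {"01": (-6.0, 0.04), "02": (-9.5, 0.09), "12": (-6.5, 0.07)}
P = transition_probability(70.0, 72.0, theta)   # e.g. between ages 70 and 72
print(P[0])  # P(healthy -> healthy / ill / dead), starting healthy at 70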



Related research

This work was motivated by observational studies in pregnancy with spontaneous abortion (SAB) as the outcome. Some women experience the SAB event while the rest do not. In addition, the data are left-truncated because of the way pregnant women are recruited into these studies. For the women who do experience SAB, the exact event times are sometimes unknown. Finally, a small percentage of the women are lost to follow-up during their pregnancy. Together, these features give rise to data that are left-truncated, partly interval-censored and right-censored, and that contain a clearly defined cured portion. We consider the non-mixture Cox regression cure rate model and adopt a semiparametric spline-based sieve maximum likelihood approach to analyze such data. Using modern empirical process theory, we show that both the parametric and nonparametric parts of the sieve estimator are consistent, and we establish asymptotic normality for both parts. Simulation studies are conducted to assess finite-sample performance. Finally, we apply our method to a database of observational studies on spontaneous abortion.
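For reference, one common way to write the non-mixture Cox cure rate model mentioned above (the notation here is generic, not necessarily the paper's) is

$$ S_{\mathrm{pop}}(t \mid x) = \exp\{ -e^{x^\top \beta} F(t) \}, \qquad \lim_{t \to \infty} S_{\mathrm{pop}}(t \mid x) = \exp( -e^{x^\top \beta} ) > 0, $$

where $F$ is a proper baseline distribution function (the component approximated by splines in the sieve approach) and the strictly positive limit is the cured proportion, which varies with the covariates $x$.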
Most generative models for clustering implicitly assume that the number of data points in each cluster grows linearly with the total number of data points. Finite mixture models, Dirichlet process mixture models, and Pitman-Yor process mixture models make this assumption, as do all other infinitely exchangeable clustering models. However, for some applications, this assumption is inappropriate. For example, when performing entity resolution, the size of each cluster should be unrelated to the size of the data set, and each cluster should contain a negligible fraction of the total number of data points. These applications require models that yield clusters whose sizes grow sublinearly with the size of the data set. We address this requirement by defining the microclustering property and introducing a new class of models that can exhibit this property. We compare models within this class to two commonly used clustering models using four entity-resolution data sets.
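In rough terms (the paper gives the formal definition), a sequence of random partitions of $N$ data points exhibits the microclustering property when the largest cluster is asymptotically negligible:

$$ \frac{M_N}{N} \xrightarrow{\;p\;} 0 \quad \text{as } N \to \infty, $$

where $M_N$ denotes the size of the largest cluster. Linear-growth models such as Dirichlet process mixtures violate this, since their largest clusters occupy a non-vanishing fraction of the data.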
The mixture cure rate model is the most commonly used cure rate model in the literature. In the context of the mixture cure rate model, the standard approach to modeling the effect of covariates on the cured or uncured probability is to use a logistic function, which readily implies that the boundary classifying the cured and uncured subjects is linear. In this paper, we propose a new mixture cure rate model based on interval-censored data that uses the support vector machine (SVM) to model the effect of covariates on the uncured or cured probability (i.e., on the incidence part of the model). Our proposed model inherits the features of the SVM and provides the flexibility to capture classification boundaries that are non-linear and more complex. Furthermore, the new model can be used to model the effect of covariates on the incidence part when the dimension of the covariates is high. The latency part is modeled by a proportional hazards structure. We develop an estimation procedure based on the expectation maximization (EM) algorithm to estimate the cured/uncured probability and the latency model parameters. Our simulation results show that the proposed model performs better at capturing complex classification boundaries than the existing logistic-regression-based mixture cure rate model. We also show that the model's ability to capture complex classification boundaries improves the estimation of the latency parameters. For illustration, we apply the proposed methodology to interval-censored data on smoking cessation.
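For orientation, the mixture cure rate model referred to above can be written (in generic notation) as

$$ S_{\mathrm{pop}}(t \mid x, z) = \pi(z) + \{1 - \pi(z)\}\, S_u(t \mid x), \qquad S_u(t \mid x) = S_0(t)^{\exp(x^\top \beta)}, $$

where $\pi(z)$ is the cured probability given incidence covariates $z$ and $S_u$ is the latency (uncured) survival function with a proportional hazards structure. The standard choice $\pi(z) = 1/\{1 + \exp(-z^\top \gamma)\}$ imposes a linear classification boundary; the proposal replaces this logistic form with an SVM-based estimate of the incidence part.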
A new method for the analysis of time to the ankylosis complication in a dataset of replanted teeth is proposed. In this context of left-censored, interval-censored, and right-censored data, a Cox model with a piecewise constant baseline hazard is introduced. Estimation is carried out with the EM algorithm by treating the true event times as unobserved variables. This estimation procedure is shown to produce a block-diagonal Hessian matrix of the baseline parameters. Taking advantage of this feature of the estimation method, an L0-penalised likelihood method is implemented to automatically determine the number and locations of the cuts of the baseline hazard. This procedure makes it possible to detect specific time periods in which patients are at greater risk of ankylosis. The method extends directly to the inclusion of exact observations and to a cure fraction. Theoretical results are obtained that allow statistical inference on the model parameters to be derived from asymptotic likelihood theory. Through simulation studies, the penalisation technique is shown to provide a good fit of the baseline hazard and precise estimates of the resulting regression parameters. A minimal numeric sketch of such a step baseline hazard is given below.
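In the sketch below, the cut points and hazard levels are invented, and the EM and L0-penalisation machinery that would select them is not shown; only the evaluation of the step hazard and the implied Cox survival function is illustrated.

import numpy as np

# Hypothetical cut points (months since replantation) and hazard levels.
cuts = np.array([0.0, 6.0, 18.0, np.inf])
levels = np.array([0.010, 0.035, 0.008])   # step hazard on each interval

def cumulative_hazard(t):
    # H0(t): integrate the step function from 0 to t.
    covered = np.clip(np.minimum(cuts[1:], t) - cuts[:-1], 0.0, None)
    return np.sum(levels * covered)

def survival(t, eta=0.0):
    # Cox model survival with linear predictor eta.
    return np.exp(-np.exp(eta) * cumulative_hazard(t))

print(survival(12.0))           # baseline subject at 12 months
print(survival(12.0, eta=0.7))  # subject at elevated risk

A natural way to implement the selection step in this representation is an L0 penalty on differences between adjacent levels, so that runs of equal levels merge and only genuine jumps in risk (such as the high-risk middle interval above) are retained.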
Aggregating large databases into a specific format is a frequently used way to make data easily manageable, and interval-valued data is one of the data types generated by such aggregation. Analysing interval-valued data with traditional methods results in a loss of information, and several interval-valued data models have therefore been proposed to extract reliable information from such data. At the same time, recent technological developments have led to high-dimensional and complex data in many application areas that may not be amenable to traditional techniques. Functional data analysis is one of the most commonly used approaches for analysing such complex datasets. While functional extensions of many traditional statistical techniques are available, the functional form of interval-valued data has not been well studied. This paper introduces functional forms of several well-known regression models for interval-valued data. The proposed methods are based on the function-on-function regression model, in which both the response and the predictor(s) are functional. Through several Monte Carlo simulations and an empirical data analysis, the finite-sample performance of the proposed methods is evaluated and compared with the state of the art.
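For reference, the function-on-function regression model on which the proposed methods are based has the generic form

$$ Y(t) = \beta_0(t) + \int_{\mathcal{S}} X(s)\, \beta(s, t)\, ds + \varepsilon(t), $$

where both the response $Y$ and the predictor $X$ are functions and $\beta(s,t)$ is a bivariate coefficient surface. How the interval bounds enter (for example, as lower/upper-bound or centre/range curves treated as functional variables) is a modelling choice made in the paper; the statement above is only the common template.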