
Estimation for recurrent events through conditional estimating equations

Added by Hai Yan Liu
Publication date: 2021
Research language: English





We present new estimators for analysing how the mean length of the gap time between consecutive recurrent events depends on a set of explanatory random variables, in the presence of right censoring. The dependence is expressed through regression-like and overdispersion parameters, estimated via conditional estimating equations. The mean and variance of the length of each gap time, conditioned on the observed history of prior events and other covariates, are known functions of the parameters and covariates. Under certain conditions on censoring, we construct normalized estimating functions that are asymptotically unbiased and involve only observed data. We discuss the existence, consistency and asymptotic normality of a sequence of estimators of the parameters, defined as roots of these estimating equations. Simulations suggest that our estimators can be used successfully even with a relatively small sample size in a study of short duration.
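As a concrete illustration, the following is a minimal sketch, assuming a log-link mean model mu = exp(x'beta) and a working variance proportional to mu^2; the paper's actual normalized estimating functions and censoring correction are not reproduced, and naively dropping censored gaps, as done here, is only a placeholder for that correction.

import numpy as np
from scipy.optimize import root

def make_estimating_function(X, y, observed):
    # X: (n, p) gap-level covariates; y: gap lengths; observed: 1 if uncensored
    def g(beta):
        mu = np.exp(X @ beta)            # assumed log-link conditional mean
        resid = (y - mu) * observed      # toy handling: censored gaps dropped
        # quasi-score D' V^{-1} (y - mu) with V proportional to mu^2
        return X.T @ (resid / mu) / len(y)
    return g

rng = np.random.default_rng(0)
n, p = 200, 2
X = rng.normal(size=(n, p))
beta_true = np.array([0.5, -0.3])
y = rng.exponential(np.exp(X @ beta_true))            # exponential gap times
observed = (rng.uniform(size=n) < 0.8).astype(float)  # toy independent censoring
sol = root(make_estimating_function(X, y, observed), x0=np.zeros(p))
print(sol.x)  # rough estimate of beta_true, biased by the naive censoring handling

Solving the estimating equation with a general-purpose root finder, as above, mirrors how roots of such equations are typically computed in practice; the asymptotic theory in the paper concerns the behaviour of these roots as the sample grows.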

Related research


The recent advent of smart meters has led to large micro-level datasets. For the first time, the electricity consumption at individual sites is available on a near real-time basis. Efficient management of energy resources, electric utilities, and transmission grids can be greatly facilitated by harnessing the potential of this data. The aim of this study is to generate probability density estimates for consumption recorded by individual smart meters. Such estimates can assist decision making by helping consumers identify and minimize their excess electricity usage, especially during peak times. For suppliers, these estimates can be used to devise innovative time-of-use pricing strategies aimed at their target consumers. We consider methods based on conditional kernel density (CKD) estimation with the incorporation of a decay parameter. The methods capture the seasonality in consumption, and enable a nonparametric estimation of its conditional density. Using eight months of half-hourly data for one thousand meters, we evaluate point and density forecasts, for lead times ranging from one half-hour up to a week ahead. We find that the kernel-based methods outperform a simple benchmark method that does not account for seasonality, and compare well with an exponential smoothing method that we use as a sophisticated benchmark. To gauge the financial impact, we use density estimates of consumption to derive prediction intervals of electricity cost for different time-of-use tariffs. We show that a simple strategy of switching between different tariffs, based on a comparison of cost densities, delivers significant cost savings for the great majority of consumers.
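A minimal sketch of the decay-weighted conditional kernel density idea follows; the Gaussian kernels, the bandwidths hy and hx, the decay rate, and the use of period-of-week as the conditioning variable are illustrative assumptions, not the study's exact specification.

import numpy as np

def ckd_estimate(y_grid, y_hist, x_hist, x0, hy=0.1, hx=1.0, decay=0.99):
    # Density of consumption y at conditioning point x0 (e.g. period of week);
    # older observations receive geometrically smaller weight.
    age = np.arange(len(y_hist))[::-1]                 # 0 = most recent observation
    w = decay ** age                                   # decay weights
    w = w * np.exp(-0.5 * ((x_hist - x0) / hx) ** 2)   # kernel in conditioning variable
    w = w / w.sum()
    diffs = (y_grid[:, None] - y_hist[None, :]) / hy
    return (w * np.exp(-0.5 * diffs ** 2)).sum(axis=1) / (hy * np.sqrt(2 * np.pi))

# usage (hypothetical arrays): density of the next half-hour's load given
# period-of-week index 17:
# dens = ckd_estimate(np.linspace(0, 5, 200), past_kwh, past_period, 17)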
We construct a Bayesian inference deep learning machine for parameter estimation of gravitational wave events from binary black hole coalescences. The structure of our deep Bayesian machine adopts the conditional variational autoencoder scheme, conditioning on both the gravitational wave strains and the variations in the amplitude spectral density of the detector noise. We show that our deep Bayesian machine is capable of yielding posteriors compatible with those from the nested sampling method, and of coping with noise outliers. We also apply our deep Bayesian machine to the LIGO/Virgo O3 events, and find that conditioning on the detector noise to counter its drift is relevant for events with medium signal-to-noise ratios.
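The following schematic shows the conditioning structure in code, assuming plain fully connected layers and a Gaussian decoder head; the layer sizes and architecture are placeholders rather than the authors' network, but the key point, concatenating the strain with the noise ASD as the conditioning input, matches the description above.

import torch
import torch.nn as nn

class ConditionalVAE(nn.Module):
    def __init__(self, strain_dim, asd_dim, param_dim, latent_dim=16, hidden=128):
        super().__init__()
        cond_dim = strain_dim + asd_dim          # condition on strain and noise ASD
        self.encoder = nn.Sequential(
            nn.Linear(param_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim))   # latent mean and log-variance
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * param_dim))    # Gaussian head over source parameters

    def loss(self, params, strain, asd):
        cond = torch.cat([strain, asd], dim=-1)
        mu, logvar = self.encoder(torch.cat([params, cond], dim=-1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()     # reparameterization
        out_mu, out_logvar = self.decoder(torch.cat([z, cond], dim=-1)).chunk(2, -1)
        kl = 0.5 * (mu ** 2 + logvar.exp() - 1 - logvar).sum(-1)
        nll = 0.5 * (((params - out_mu) ** 2) / out_logvar.exp() + out_logvar).sum(-1)
        return (nll + kl).mean()                 # negative ELBO, up to constants

At test time one samples many latent draws for a new strain/ASD pair and reads posterior samples off the decoder, which is how such machines produce posteriors comparable to nested sampling.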
The analysis of data arising from environmental health studies which collect a large number of measures of exposure can benefit from using latent variable models to summarize exposure information. However, difficulties with the estimation of model parameters may arise, since existing fitting procedures for linear latent variable models require correctly specified residual variance structures for unbiased estimation of the regression parameters quantifying the association between (latent) exposure and health outcomes. We propose an estimating equations approach for latent exposure models with longitudinal health outcomes which is robust to misspecification of the outcome variance. We show that, compared to maximum likelihood, the loss of efficiency of the proposed method is relatively small when the model is correctly specified. The proposed equations formalize the ad hoc regression-on-factor-scores procedure, and generalize regression calibration. We propose two weighting schemes for the equations, and compare their efficiency. We apply this method to a study of the effects of in-utero lead exposure on child development.
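The toy example below illustrates the regression-on-factor-scores idea that the proposed equations formalize, assuming a one-factor model with equal loadings so that the score reduces to the average of the exposure measures; all quantities are simulated, and the reliability is treated as known purely for illustration.

import numpy as np

rng = np.random.default_rng(1)
n = 5000
latent = rng.normal(size=n)                                # true latent exposure
W = latent[:, None] + rng.normal(scale=0.5, size=(n, 3))   # three noisy surrogates
y = 0.8 * latent + rng.normal(size=n)                      # health outcome

score = W.mean(axis=1)                 # factor score under equal loadings
cov = np.cov(score, y)
naive = cov[0, 1] / cov[0, 0]          # attenuated towards zero
reliability = 1.0 / (1.0 + 0.25 / 3)   # var(latent) / var(score), known here
print(naive, naive / reliability)      # the rescaled value approximately recovers 0.8

Dividing by the reliability is the regression-calibration correction that the estimating-equations approach generalizes.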
We conduct a review to assess how the simulation of repeated or recurrent events is planned. For such multivariate time-to-event data, it is well established that the underlying mechanism is likely to be complex, involving in particular both heterogeneity in the population and event-dependence. In this respect, we focus in particular on these two dimensions of the event dynamics when mimicking actual data. Next, we investigate whether the processes generated in the simulation studies have similar properties to those expected in the clinical data of interest. Finally, we describe a simulation scheme for generating data according to the timescale of choice (gap time or calendar time) and to whether heterogeneity and/or event-dependence are to be considered. The main finding is that event-dependence is less widely considered in simulation studies than heterogeneity. This is unfortunate, since the occurrence of an event may alter the risk of occurrence of new events.
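A minimal sketch of such a generation scheme on the gap timescale, with heterogeneity entering through a gamma frailty and event-dependence through a per-event rate multiplier, follows; all distributional choices and parameter values are illustrative.

import numpy as np

def simulate_subject(rng, followup=10.0, base_rate=0.5, frailty_var=0.5, dep=1.2):
    z = rng.gamma(1.0 / frailty_var, frailty_var)   # subject frailty, mean 1
    t, k, times = 0.0, 0, []
    while True:
        rate = z * base_rate * dep ** k             # event-dependence via dep ** k
        t += rng.exponential(1.0 / rate)            # next gap time
        if t > followup:                            # administrative censoring
            return times
        times.append(t)
        k += 1

rng = np.random.default_rng(2)
histories = [simulate_subject(rng) for _ in range(5)]
print(histories)  # event times per subject; dep > 1 makes events cluster

Setting dep = 1 switches event-dependence off, and taking frailty_var close to zero switches heterogeneity off, so the two mechanisms the review focuses on can be varied independently.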
In biomedical studies it is of substantial interest to develop risk prediction scores using high-dimensional data, such as gene expression data, for clinical endpoints that are subject to censoring. In the presence of well-established clinical risk factors, investigators often prefer a procedure that also adjusts for these clinical variables. While accelerated failure time (AFT) models are a useful tool for the analysis of censored outcome data, they assume that covariate effects on the logarithm of time-to-event are linear, which is often unrealistic in practice. We propose to build risk prediction scores through regularized rank estimation in partly linear AFT models, where high-dimensional data such as gene expression data are modeled linearly and important clinical variables are modeled nonlinearly using penalized regression splines. We show through simulation studies that our model has better operating characteristics compared to several existing models. In particular, we show that there is a nonnegligible effect on prediction as well as feature selection when nonlinear clinical effects are misspecified as linear. This work is motivated by a recent prostate cancer study, where investigators collected gene expression data along with established prognostic clinical variables, and where the primary endpoint is time to prostate cancer recurrence.
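As a rough sketch of the rank-estimation ingredient, the code below minimizes a Gehan-type loss with a lasso penalty on the high-dimensional coefficients; the penalized-spline component for nonlinear clinical effects is omitted for brevity, and the data, penalty level, and optimizer are illustrative choices.

import numpy as np
from scipy.optimize import minimize

def gehan_lasso_loss(beta, X, logt, delta, lam):
    e = logt - X @ beta                        # residuals on the log-time scale
    diff = e[:, None] - e[None, :]             # e_i - e_j for all pairs
    loss = np.sum(delta[:, None] * np.maximum(-diff, 0.0)) / len(e) ** 2
    return loss + lam * np.abs(beta).sum()     # lasso penalty on gene effects

rng = np.random.default_rng(3)
n, p = 100, 5
X = rng.normal(size=(n, p))
logt = X[:, 0] + rng.normal(size=n)                # only the first feature matters
delta = (rng.uniform(size=n) < 0.7).astype(float)  # 1 = event observed
res = minimize(gehan_lasso_loss, np.zeros(p), args=(X, logt, delta, 0.01),
               method="Powell")
print(res.x.round(2))  # first coefficient large, others shrunk towards zero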
