Complex biological processes are usually observed over time in a collection of individuals. Longitudinal data are then available, and the statistical challenge is to better understand the underlying biological mechanisms. The standard statistical approach is the mixed-effects model, with regression functions that are now highly developed to describe the biological processes precisely (solutions of multi-dimensional ordinary differential equations or of partial differential equations). When there is no analytical solution, a classical estimation approach couples a stochastic version of the EM algorithm (SAEM) with an MCMC algorithm. This procedure requires many evaluations of the regression function, which is prohibitive when a time-consuming solver is used to compute it. In this work, a meta-model relying on a Gaussian process emulator is proposed to replace this regression function. The new source of uncertainty due to this approximation can be incorporated in the model, which leads to what is called a mixed meta-model. A control on the distance between the maximum likelihood estimates in this mixed meta-model and those obtained with the exact mixed model is guaranteed. Finally, numerical simulations are performed to illustrate the efficiency of this approach.
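The emulation idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a one-dimensional input, a squared-exponential kernel with fixed hyperparameters, and a logistic curve standing in for the costly solver; the names `expensive_regression_function`, `fit_gp` and `predict_gp` are hypothetical. The predictive variance returned by the emulator is the kind of quantity a mixed meta-model could use to account for the approximation error.

```python
import numpy as np

def expensive_regression_function(t):
    # Hypothetical stand-in for a costly ODE/PDE solver: a logistic growth curve.
    return 10.0 / (1.0 + 9.0 * np.exp(-0.8 * t))

def sq_exp_kernel(a, b, length_scale, variance):
    # Squared-exponential covariance between two sets of 1-D inputs.
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def fit_gp(x_train, y_train, length_scale=1.5, variance=1.0, nugget=1e-8):
    # Centre the outputs so that a zero-mean GP prior is reasonable.
    y_mean = y_train.mean()
    K = sq_exp_kernel(x_train, x_train, length_scale, variance)
    K[np.diag_indices_from(K)] += nugget  # small jitter for numerical stability
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    return {"x": x_train, "L": L, "alpha": alpha, "y_mean": y_mean,
            "length_scale": length_scale, "variance": variance}

def predict_gp(model, x_new):
    # Posterior mean and variance of the emulator at new inputs.
    k_star = sq_exp_kernel(x_new, model["x"], model["length_scale"], model["variance"])
    mean = model["y_mean"] + k_star @ model["alpha"]
    v = np.linalg.solve(model["L"], k_star.T)
    var = model["variance"] - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)

# Evaluate the costly function on a small design, then emulate it elsewhere.
x_design = np.linspace(0.0, 10.0, 11)
y_design = expensive_regression_function(x_design)
gp_model = fit_gp(x_design, y_design)

x_grid = np.linspace(0.0, 10.0, 101)
emulated_mean, emulated_var = predict_gp(gp_model, x_grid)
max_abs_error = np.max(np.abs(emulated_mean - expensive_regression_function(x_grid)))
```

In an estimation loop, calls to the solver would be replaced by `predict_gp`, so the solver itself is evaluated only at the design points.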
We consider a resampling scheme for estimating the population parameters in mixed-effects nonlinear regression models of the type used, for example, in clinical pharmacokinetics. We provide an estimation procedure which {\it recycles}, via random weighting, the relevant two-stage parameter estimates to construct consistent estimates of the sampling distribution of the various estimates. We establish the asymptotic consistency and asymptotic normality of the resampled estimates, and demonstrate the applicability of the {\it recycling} approach in a small simulation study and via an example.
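A rough sketch of what resampling by random weighting can look like follows. The abstract does not specify the weighting scheme, so this assumes normalized i.i.d. exponential weights (equivalently, flat Dirichlet weights) applied to stage-one individual estimates, with the population estimate taken as their mean. The function name `random_weighting_replicates` and the simulated stage-one estimates are hypothetical.

```python
import numpy as np

def random_weighting_replicates(individual_estimates, n_replicates, rng):
    # Each replicate is a randomly weighted mean of the individual estimates.
    # Assumption: weights are i.i.d. Exp(1) draws normalized to sum to one.
    est = np.asarray(individual_estimates, dtype=float)
    n_individuals = est.shape[0]
    weights = rng.exponential(1.0, size=(n_replicates, n_individuals))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ est

rng = np.random.default_rng(0)

# Hypothetical stage-one output: one parameter vector per individual
# (40 individuals, 2 parameters), simulated here purely for illustration.
individual_estimates = rng.normal(loc=[1.5, 0.3], scale=[0.4, 0.05], size=(40, 2))

# Stage two: the population estimate is the unweighted mean.
population_estimate = individual_estimates.mean(axis=0)

# The stage-one estimates are reused, never refitted, to approximate
# the sampling distribution of the population estimate.
replicates = random_weighting_replicates(individual_estimates, 4000, rng)
resampled_se = replicates.std(axis=0)
```

The individual fits are computed once and only the weights are redrawn, so the cost of each replicate is a single weighted average.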
It is well known that the minimax rates of convergence for nonparametric density and regression function estimation of a random variable measured with error are much slower than the rates in the error-free case. Surprisingly, we show that if one is willing to impose the relatively mild assumption that the error-prone variable has compact support, then the results can be greatly improved. We describe new and constructive methods that take full advantage of the compact support assumption via spline-assisted semiparametric methods. We further prove that the new estimator achieves the usual nonparametric rate in estimating both the density and regression functions, as if there were no measurement error. The proof involves linear and bilinear operator theories, semiparametric theory, asymptotic analysis of B-splines, and integral equation treatments. The performance of the new methods is demonstrated through several simulations and a data example.
We present new results for consistency of maximum likelihood estimators with a focus on multivariate mixed models. Our theory builds on the idea of using subsets of the full data to establish consistency of estimators based on the full data. It requires neither that the data consist of independent observations, nor that the observations can be modeled as a stationary stochastic process. Compared to existing asymptotic theory using the idea of subsets, we substantially weaken the assumptions, bringing them closer to what suffices in classical settings. We apply our theory to two multivariate mixed models for which it was unknown whether maximum likelihood estimators are consistent. The models we consider have non-stochastic predictors and multivariate responses which are possibly of mixed type (some discrete and some continuous).
We consider a model where the failure hazard function, conditional on a covariate $Z$, is given by $R(t,\theta^0|Z)=\eta_{\gamma^0}(t)f_{\beta^0}(Z)$, with $\theta^0=(\beta^0,\gamma^0)^\top\in \mathbb{R}^{m+p}$. The baseline hazard function $\eta_{\gamma^0}$ and the relative risk $f_{\beta^0}$ both belong to parametric families. The covariate $Z$ is measured through the error model $U=Z+\epsilon$, where $\epsilon$ is independent of $Z$, with known density $f_\epsilon$. We observe an $n$-sample $(X_i, D_i, U_i)$, $i=1,\ldots,n$, where $X_i$ is the minimum of the failure time and the censoring time, and $D_i$ is the censoring indicator. We aim at estimating $\theta^0$ in the presence of the unknown density $g$. Our estimation procedure, based on a least squares criterion, provides two estimators. The first one minimizes an estimate of the least squares criterion in which $g$ is estimated by density deconvolution. Its rate depends on the smoothness of $f_\epsilon$ and of $f_\beta(z)$ as a function of $z$. We derive sufficient conditions that ensure $\sqrt{n}$-consistency. The second estimator is constructed under conditions ensuring that the least squares criterion can be directly estimated at the parametric rate. These estimators, studied in depth through examples, are in particular $\sqrt{n}$-consistent and asymptotically Gaussian in the Cox model and in the excess risk model, whatever the form of $f_\epsilon$.
The plurigaussian model is particularly suited to describing categorical regionalized variables. Starting from a simple principle, the thresholding of one or several Gaussian random fields (GRFs) to obtain categories, the plurigaussian model is well adapted to a wide range of situations. By acting on the form of the thresholding rule and/or the threshold values (which can vary over space) and on the variograms of the underlying GRFs, one can generate many spatial configurations for the categorical variables. One difficulty is to choose a variogram model for the underlying GRFs. Indeed, the latter are hidden by the truncation, and we only observe the simple and cross-variograms of the category indicators. In this paper, we propose a semiparametric method based on the pairwise likelihood to estimate the empirical variogram of the GRFs. It provides an exploratory tool for choosing a suitable model for each GRF and later estimating its parameters. We illustrate the efficiency of the method with a Monte Carlo simulation study. The method presented in this paper is implemented in the R package RGeostats.
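The thresholding principle can be sketched as follows. This is only an illustration of the forward simulation, not the estimation method of the paper and not the RGeostats API. It assumes a one-dimensional grid, two independent GRFs with exponential covariance, constant thresholds, and a simple three-category rule; `simulate_grf` and `plurigaussian_categories` are hypothetical names.

```python
import numpy as np

def simulate_grf(coords, corr_range, rng):
    # Zero-mean, unit-variance Gaussian field with exponential covariance,
    # drawn through a Cholesky factor of the covariance matrix.
    dist = np.abs(coords[:, None] - coords[None, :])
    cov = np.exp(-dist / corr_range)
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(coords)))
    return chol @ rng.standard_normal(len(coords))

def plurigaussian_categories(z1, z2, t1, t2):
    # Assumed thresholding rule:
    #   category 0 where z1 < t1,
    #   category 1 where z1 >= t1 and z2 < t2,
    #   category 2 elsewhere.
    cats = np.full(z1.shape, 2, dtype=int)
    cats[(z1 >= t1) & (z2 < t2)] = 1
    cats[z1 < t1] = 0
    return cats

rng = np.random.default_rng(42)
coords = np.arange(200, dtype=float)

# Two hidden GRFs with different correlation ranges.
z1 = simulate_grf(coords, corr_range=20.0, rng=rng)
z2 = simulate_grf(coords, corr_range=5.0, rng=rng)

# Only the categories are observed; z1 and z2 stay hidden.
categories = plurigaussian_categories(z1, z2, t1=0.0, t2=0.5)
proportions = np.bincount(categories, minlength=3) / categories.size
```

Changing the thresholds, the rule, or the correlation ranges changes the proportions and spatial arrangement of the categories, which is why the variograms of the hidden fields matter for modelling.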