We introduce estimation and test procedures through divergence minimization for models satisfying linear constraints with an unknown parameter. These procedures extend the empirical likelihood (EL) method and share common features with the generalized empirical likelihood approach. We treat the problems of existence and characterization of the divergence projections of probability distributions on sets of signed finite measures. We give a precise characterization of duality for the proposed class of estimates and test statistics, which is used to derive their limiting distributions (including those of the EL estimate and the EL ratio statistic) both under the null hypotheses and under alternatives or misspecification. An approximation to the power function is deduced, as well as the sample size that ensures a desired power for a given alternative.
We introduce estimation and test procedures through divergence optimization for discrete or continuous parametric models. This approach is based on a new dual representation for divergences. We treat point estimation and tests for simple and composite hypotheses, extending the maximum likelihood technique. Another view of the maximum likelihood approach, for estimation and testing, is given. We prove existence and consistency of the proposed estimates. The limit laws of the estimates and test statistics (including the generalized likelihood ratio statistic) are given both under the null and the alternative hypotheses, and approximations of the power functions are deduced. A new procedure for constructing confidence regions, when the parameter may be a boundary value of the parameter space, is proposed. Also, a solution to the irregularity problem of the generalized likelihood ratio test pertaining to the number of components in a mixture is given, and a new test is proposed, based on the $\chi^{2}$-divergence on signed finite measures and a duality technique.
In prevalent cohort studies where subjects are recruited at a cross-section, the time to an event may be subject to length-biased sampling, with the observed data being either the forward recurrence time, or the backward recurrence time, or their sum. In the regression setting, it has been shown that the accelerated failure time model for the underlying event time is invariant under these observed data set-ups and can be fitted using standard methodology for accelerated failure time model estimation, ignoring the length-bias. However, the efficiency of these estimators is unclear, owing to the fact that the observed covariate distribution, which is also length-biased, may contain information about the regression parameter in the accelerated life model. We demonstrate that if the true covariate distribution is completely unspecified, then the naive estimator based on the conditional likelihood given the covariates is fully efficient.
We consider the problem of estimating the support size of a discrete distribution whose minimum non-zero mass is at least $\frac{1}{k}$. Under the independent sampling model, we show that the sample complexity, i.e., the minimal sample size to achieve an additive error of $\epsilon k$ with probability at least 0.1, is within universal constant factors of $\frac{k}{\log k}\log^2\frac{1}{\epsilon}$, which improves the state-of-the-art result of $\frac{k}{\epsilon^2 \log k}$ in \cite{VV13}. A similar characterization of the minimax risk is also obtained. Our procedure is a linear estimator based on the Chebyshev polynomial and its approximation-theoretic properties, which can be evaluated in $O(n+\log^2 k)$ time and attains the sample complexity within a factor of six asymptotically. The superiority of the proposed estimator in terms of accuracy, computational efficiency and scalability is demonstrated on a variety of synthetic and real datasets.
We introduce estimation and test procedures through divergence minimization for models satisfying linear constraints with an unknown parameter. Several statistical examples and motivations are given. These procedures extend the empirical likelihood (EL) method and share common features with generalized empirical likelihood (GEL). We treat the problems of existence and characterization of the divergence projections of probability measures on sets of signed finite measures. Our approach allows for a study of the estimates under misspecification. The asymptotic behavior of the proposed estimates is studied using the dual representation of the divergences and the explicit forms of the divergence projections. We discuss the choice of the divergence from several points of view. We also address efficiency and robustness properties of minimum divergence estimates. A simulation study shows that the Hellinger divergence enjoys good efficiency and robustness properties.
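As a minimal illustration of the empirical likelihood special case named in the abstract above, the following Python sketch handles the simplest linear constraint, $E[X - \mu] = 0$. It follows the standard textbook formulation of EL for a mean: the constrained problem over the weights is reduced to a one-dimensional dual problem in a Lagrange multiplier, which is then solved by bisection. The function name `el_mean_test`, the bisection solver, and the toy data are assumptions made for illustration only; they are not taken from the paper, which treats general divergences and general constraints.

```python
import math

def el_mean_test(xs, mu, tol=1e-12, max_iter=200):
    """Empirical likelihood for the single constraint sum_i p_i (x_i - mu) = 0.

    Returns (stat, weights), where stat = -2 log(EL ratio) and weights are
    the constrained probabilities p_i = 1 / (n (1 + lam z_i)), z_i = x_i - mu.
    """
    z = [x - mu for x in xs]
    n = len(z)
    if not (min(z) < 0.0 < max(z)):
        raise ValueError("mu must lie strictly inside the range of the data")

    # The dual variable lam must keep every 1 + lam * z_i strictly positive,
    # which restricts it to the open interval (-1/max(z), -1/min(z)).
    lo, hi = -1.0 / max(z), -1.0 / min(z)
    shrink = 1e-10 * (hi - lo)
    lo, hi = lo + shrink, hi - shrink

    def g(lam):
        # Dual first-order condition; strictly decreasing in lam.
        return sum(zi / (1.0 + lam * zi) for zi in z)

    # Bisection: g > 0 means the root lies to the right of the midpoint.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)

    weights = [1.0 / (n * (1.0 + lam * zi)) for zi in z]
    stat = 2.0 * sum(math.log(1.0 + lam * zi) for zi in z)
    return stat, weights

# Toy data (hypothetical) and a hypothesized mean mu0.
xs = [1.2, 0.7, 2.5, 1.9, 0.3, 1.1, 1.6, 2.2]
mu0 = 1.0
stat, weights = el_mean_test(xs, mu0)
print("EL ratio statistic:", stat)
print("sum of weights:", sum(weights))
```

The weights returned here sum to one and satisfy the mean constraint, and the statistic vanishes when `mu` equals the sample mean. In the classical EL setting this statistic is compared with a chi-square quantile with one degree of freedom.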
In this study, we propose shrinkage methods based on {\it generalized ridge regression} (GRR) estimation, which is suitable for both multicollinearity and high-dimensional problems with a small number of samples (large $p$, small $n$). We also obtain theoretical properties of the proposed estimators in the low- and high-dimensional cases. Furthermore, we demonstrate the performance of the listed estimators through both simulation studies and real-data analysis, and compare it with that of existing penalty methods. We show that the proposed methods compare well with competing regularization techniques.
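For orientation, the basic generalized ridge estimate that such shrinkage methods build on can be sketched as follows. The sketch assumes the common form $\hat\beta = (X^\top X + K)^{-1} X^\top y$ with $K = \mathrm{diag}(k_1, \dots, k_p)$ and positive penalties $k_j$. The function names, the penalty values, and the toy data are hypothetical, and the specific shrinkage estimators proposed in the paper are not reproduced here.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("matrix is numerically singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def generalized_ridge(X, y, k):
    """Generalized ridge estimate (X'X + diag(k))^{-1} X'y."""
    n, p = len(X), len(X[0])
    # Penalized Gram matrix X'X + diag(k).
    gram = [[sum(X[i][a] * X[i][b] for i in range(n)) + (k[a] if a == b else 0.0)
             for b in range(p)] for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    return solve_linear(gram, xty)

# Toy "large p, small n" example (hypothetical data): n = 2, p = 3,
# so X'X alone is singular and ordinary least squares is not unique.
X = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 2.0]
k = [0.5, 1.0, 2.0]  # one penalty per coefficient (assumed values)
beta_hat = generalized_ridge(X, y, k)
print(beta_hat)
```

Allowing a separate penalty per coefficient is what distinguishes this form from ordinary ridge regression, which uses a single common penalty. With every $k_j > 0$ the penalized Gram matrix is invertible even when $p > n$.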