
Semiparametric empirical likelihood inference with estimating equations under density ratio models

 Added by Pengfei Li
 Publication date 2021
Research language: English





The density ratio model (DRM) provides a flexible and useful platform for combining information from multiple sources. In this paper, we consider statistical inference under two-sample DRMs in which additional parameters are defined through estimating equations and/or additional auxiliary information is expressed as estimating equations. We examine the asymptotic properties of the maximum empirical likelihood estimators (MELEs) of the unknown parameters, both those in the DRMs and those defined through estimating equations, and establish the chi-square limiting distributions for the empirical likelihood ratio (ELR) statistics. We show that the asymptotic variance of the MELEs of the unknown parameters does not decrease if one estimating equation is dropped. Similar properties are obtained for inferences on the cumulative distribution function and quantiles of each of the populations involved. We also propose an ELR test for the validity and usefulness of the auxiliary information. Simulation studies show that correctly specified estimating equations for the auxiliary information result in more efficient estimators and shorter confidence intervals. Two real-data examples are used for illustration.
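
For orientation, a two-sample DRM with estimating equations is commonly written in the form sketched below. The symbols $G_0$, $G_1$, $\alpha$, $\beta$, $q(x)$, $g$ and $\psi$ are notation assumed here for illustration; the abstract does not fix any notation, and the paper's own formulation may differ.

```latex
% Two-sample density ratio model (standard form; notation assumed)
dG_1(x) = \exp\{\alpha + \beta^{\top} q(x)\}\, dG_0(x),
% Additional parameters \psi and/or auxiliary information,
% expressed as estimating equations (assumed generic form)
\mathbb{E}_{G_0}\{\, g(X;\, \psi, \alpha, \beta) \,\} = 0 .
```

Here $q(x)$ is a prespecified basis function, and the baseline distribution $G_0$ is left unspecified, which is what makes the model semiparametric.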


The Gini index is a popular inequality measure with many applications in social and economic studies. This paper studies semiparametric inference on the Gini indices of two semicontinuous populations. We characterize the distribution of each semicontinuous population by a mixture of a discrete point mass at zero and a continuous skewed positive component. A semiparametric density ratio model is then employed to link the positive components of the two distributions. We propose the maximum empirical likelihood estimators of the two Gini indices and their difference, and further investigate the asymptotic properties of the proposed estimators. The asymptotic results enable us to construct confidence intervals and perform hypothesis tests for the two Gini indices and their difference. We show that the proposed estimators are more efficient than the existing fully nonparametric estimators. The proposed estimators and the asymptotic results are also applicable to cases without excessive zero values. Simulation studies show the superiority of our proposed method over existing methods. Two real-data applications are presented using the proposed methods.
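
As a point of reference for the abstract above, the following is a minimal sketch of the standard fully nonparametric plug-in estimator of the Gini index, which is the kind of baseline the abstract compares against. It is not the semiparametric DRM-based estimator proposed in that paper. The function name `gini_index` is hypothetical, and the formula assumes nonnegative data with a positive total; zeros are allowed, as in semicontinuous samples.

```python
def gini_index(values):
    """Nonparametric plug-in Gini index (illustrative baseline only).

    Uses the sorted-data identity
        G = sum_i (2*i - n - 1) * x_(i) / (n * sum_i x_i),
    which equals the mean absolute difference divided by twice the mean.
    Assumes all values are nonnegative and their sum is positive.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need nonnegative data with a positive total")
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


if __name__ == "__main__":
    # Perfect equality: every unit holds the same amount.
    print(gini_index([1, 1, 1, 1]))
    # Semicontinuous-style sample: three zeros and one positive value.
    print(gini_index([0, 0, 0, 1]))
```

In the second call a single unit holds everything, so the estimate takes its finite-sample maximum of (n - 1)/n.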
Statistical methods with empirical likelihood (EL) are appealing and effective, especially in conjunction with estimating equations through which useful data information can be adaptively and flexibly incorporated. It is also known in the literature that EL approaches encounter difficulties when dealing with problems having high-dimensional model parameters and estimating equations. To overcome these challenges, we begin our study with a careful investigation of high-dimensional EL from a new scope targeting the estimation of high-dimensional sparse model parameters. We show that the new scope provides an opportunity for relaxing the stringent requirement on the dimensionality of the model parameter. Motivated by the new scope, we then propose a new penalized EL by applying two penalty functions that respectively regularize the model parameters and the associated Lagrange multipliers in the optimizations of EL. By penalizing the Lagrange multiplier to encourage its sparsity, we show that drastic dimension reduction in the number of estimating equations can be effectively achieved without compromising the validity and consistency of the resulting estimators. Most attractively, such a reduction in the dimensionality of estimating equations is actually equivalent to a selection among those high-dimensional estimating equations, resulting in a highly parsimonious and effective device for high-dimensional sparse model parameters. Allowing the dimensionalities of both the model parameters and the estimating equations to grow exponentially with the sample size, our theory demonstrates that the estimator from our new penalized EL is sparse and consistent, with asymptotically normally distributed nonzero components. Numerical simulations and a real data analysis show that the proposed penalized EL works promisingly.
The Youden index is a popular summary statistic for the receiver operating characteristic curve. It gives the optimal cutoff point of a biomarker for distinguishing diseased from healthy individuals. In this paper, we propose to model the distributions of a biomarker for individuals in the healthy and diseased groups via a semiparametric density ratio model. Based on this model, we use the maximum empirical likelihood method to estimate the Youden index and the optimal cutoff point. We further establish the asymptotic normality of the proposed estimators and construct valid confidence intervals for the Youden index and the corresponding optimal cutoff point. The proposed method automatically covers both the case where there is no lower limit of detection (LLOD) and the case where there is a fixed and finite LLOD for the biomarker. Extensive simulation studies and a real data example are used to illustrate the effectiveness of the proposed method and its advantages over existing methods.
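
To make the quantity in the abstract above concrete, here is a minimal sketch of the simple empirical (fully nonparametric) Youden index, J = max over c of {sensitivity(c) + specificity(c) - 1}. It is not the DRM-based maximum empirical likelihood estimator that the abstract proposes. The function name `empirical_youden` is hypothetical, and the sketch assumes that larger biomarker values indicate disease, so a subject is classified as diseased when the value exceeds the cutoff.

```python
from bisect import bisect_right


def empirical_youden(healthy, diseased):
    """Empirical Youden index and cutoff (illustrative baseline only).

    Assumes larger biomarker values indicate disease: a subject is
    classified as diseased when the value is strictly above the cutoff.
    Returns (J, cutoff), taking the smallest cutoff among ties.
    """
    h = sorted(healthy)
    d = sorted(diseased)
    if not h or not d:
        raise ValueError("both samples must be non-empty")
    best_j, best_c = float("-inf"), None
    # Only observed values need to be checked as candidate cutoffs.
    for c in sorted(set(h) | set(d)):
        specificity = bisect_right(h, c) / len(h)      # healthy with value <= c
        sensitivity = 1 - bisect_right(d, c) / len(d)  # diseased with value > c
        j = sensitivity + specificity - 1
        if j > best_j:
            best_j, best_c = j, c
    return best_j, best_c


if __name__ == "__main__":
    # Perfectly separated toy samples (made-up numbers for illustration).
    print(empirical_youden([1, 2, 3, 4], [5, 6, 7, 8]))
```

With perfectly separated groups the index reaches its maximum of 1; overlapping groups give values between 0 and 1.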
Simultaneous, post-hoc inference is desirable in large-scale hypothesis testing as it allows for exploration of data while deciding on criteria for proclaiming discoveries. It was recently proved that all admissible post-hoc inference methods for the number of true discoveries must be based on closed testing. In this paper we investigate tractable and efficient closed testing with local tests of different properties, such as monotonicity, symmetry and separability, meaning that the test thresholds a monotonic or symmetric function or a function of sums of test scores for the individual hypotheses. This class includes well-known global null tests by Fisher, Stouffer and Rüschendorf, as well as newly proposed ones based on harmonic means and Cauchy combinations. Under monotonicity, we propose a new linear time statistic (coma) that quantifies the cost of multiplicity adjustments. If the tests are also symmetric and separable, we develop several fast (mostly linear-time) algorithms for post-hoc inference, making closed testing tractable. Paired with recent advances in global null tests based on generalized means, our work immediately instantiates a series of simultaneous inference methods that can handle many complex dependence structures and signal compositions. We provide guidance on choosing from these methods via theoretical investigation of the conservativeness and sensitivity for different local tests, as well as simulations that find analogous behavior for local tests and full closed testing. One result of independent interest is the following: if $P_1,\dots,P_d$ are $p$-values from a multivariate Gaussian with arbitrary covariance, then their arithmetic average $P$ satisfies $\Pr(P \leq t) \leq t$ for $t \leq \frac{1}{2d}$.
Zijian Guo, Cun-Hui Zhang 2019
Additive models, as a natural generalization of linear regression, have played an important role in studying nonlinear relationships. Despite a rich literature and many recent advances on the topic, the statistical inference problem in additive models is still relatively poorly understood. Motivated by the inference for the exposure effect and other applications, we tackle in this paper the statistical inference problem for $f_1'(x_0)$ in additive models, where $f_1$ denotes the univariate function of interest and $f_1'(x_0)$ denotes its first order derivative evaluated at a specific point $x_0$. The main challenge for this local inference problem is the understanding and control of the additional uncertainty due to the need to estimate the other components in the additive model as nuisance functions. To address this, we propose a decorrelated local linear estimator, which is particularly useful in reducing the effect of the nuisance function estimation error on the estimation accuracy of $f_1'(x_0)$. We establish the asymptotic limiting distribution for the proposed estimator and then construct confidence interval and hypothesis testing procedures for $f_1'(x_0)$. The variance level of the proposed estimator is of the same order as that of the local least squares in nonparametric regression, or equivalently the additive model with one component, while the bias of the proposed estimator is jointly determined by the statistical accuracies in estimating the nuisance functions and the relationship between the variable of interest and the nuisance variables. The method is developed for general additive models and is demonstrated in the high-dimensional sparse setting.