We give analytic methods for nonparametric bias reduction that remove the need for computationally intensive methods like the bootstrap and the jackknife. We call an estimate {\it $p$th order} if its bias has magnitude $n_0^{-p}$ as $n_0 \to \infty$, where $n_0$ is the sample size (or the minimum sample size if the estimate is a function of more than one sample). Most estimates are only first order and require $O(N)$ calculations, where $N$ is the total sample size. The usual bootstrap and jackknife estimates are second order, but they are computationally intensive, requiring $O(N^2)$ calculations for one sample. By contrast, Jaeckel's infinitesimal jackknife is an analytic second order one-sample estimate requiring only $O(N)$ calculations. When $p$th order bootstrap and jackknife estimates are available, they require $O(N^p)$ calculations, and so become even more computationally intensive if one chooses $p>2$. For general $p$ we provide analytic $p$th order nonparametric estimates that require only $O(N)$ calculations. Our estimates are given in terms of the von Mises derivatives of the functional being estimated, evaluated at the empirical distribution. For products of moments an unbiased estimate exists: our form for this polykay is much simpler than the usual form in terms of power sums.
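As a point of reference for the computational cost the abstract discusses, here is a minimal sketch of the classical (Quenouille) jackknife bias correction that the paper's analytic estimators are contrasted with — not the paper's own method. For an $O(N)$ statistic it performs $n$ leave-one-out re-evaluations, hence the $O(N^2)$ cost:

```python
import numpy as np

def jackknife_bias_corrected(x, stat):
    """Second-order (Quenouille jackknife) bias correction of a plug-in
    statistic.  Needs n re-evaluations of `stat`, i.e. O(N^2) total work
    for an O(N) statistic -- the cost analytic estimators avoid."""
    n = len(x)
    full = stat(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])
    # n*T_n - (n-1)*mean(leave-one-out values) removes the O(1/n) bias term
    return n * full - (n - 1) * loo.mean()

rng = np.random.default_rng(0)
x = rng.normal(size=50)
# Plug-in variance is biased by -sigma^2/n; the jackknife correction
# reproduces the unbiased sample variance (np.var with ddof=1) exactly.
plug_in_var = lambda z: np.mean((z - z.mean()) ** 2)
corrected = jackknife_bias_corrected(x, plug_in_var)
```

For the plug-in variance the $1/n$ bias term is exact, so the correction recovers the unbiased estimator identically; for general functionals the jackknife only reduces the bias from $O(n^{-1})$ to $O(n^{-2})$, i.e. it is second order in the abstract's terminology.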
Let $X_1,\dots, X_n$ be i.i.d. random variables sampled from a normal distribution $N(\mu,\Sigma)$ in ${\mathbb R}^d$ with unknown parameter $\theta=(\mu,\Sigma)\in \Theta:={\mathbb R}^d\times {\mathcal C}_+^d,$ where ${\mathcal C}_+^d$ is the cone of positive definite covariance operators in ${\mathbb R}^d.$ Given a smooth functional $f:\Theta \mapsto {\mathbb R}^1,$ the goal is to estimate $f(\theta)$ based on $X_1,\dots, X_n.$ Let $$ \Theta(a;d):={\mathbb R}^d\times \Bigl\{\Sigma\in {\mathcal C}_+^d: \sigma(\Sigma)\subset [1/a, a]\Bigr\}, \quad a\geq 1, $$ where $\sigma(\Sigma)$ is the spectrum of covariance $\Sigma.$ Let $\hat \theta:=(\hat \mu, \hat \Sigma),$ where $\hat \mu$ is the sample mean and $\hat \Sigma$ is the sample covariance, based on the observations $X_1,\dots, X_n.$ For an arbitrary functional $f\in C^s(\Theta),$ $s=k+1+\rho,\ k\geq 0,\ \rho\in (0,1],$ we define a functional $f_k:\Theta \mapsto {\mathbb R}$ such that \begin{align*} & \sup_{\theta\in \Theta(a;d)}\|f_k(\hat \theta)-f(\theta)\|_{L_2({\mathbb P}_{\theta})} \lesssim_{s, \beta} \|f\|_{C^{s}(\Theta)} \biggl[\biggl(\frac{a}{\sqrt{n}} \bigvee a^{\beta s}\biggl(\sqrt{\frac{d}{n}}\biggr)^{s} \biggr)\wedge 1\biggr], \end{align*} where $\beta =1$ for $k=0$ and $\beta>s-1$ is arbitrary for $k\geq 1.$ This error rate is minimax optimal, and similar bounds hold for more general loss functions. If $d=d_n\leq n^{\alpha}$ for some $\alpha\in (0,1)$ and $s\geq \frac{1}{1-\alpha},$ the rate becomes $O(n^{-1/2}).$ Moreover, for $s>\frac{1}{1-\alpha},$ the estimator $f_k(\hat \theta)$ is shown to be asymptotically efficient. The crucial part of the construction of the estimator $f_k(\hat \theta)$ is a bias reduction method studied in the paper for statistical models more general than the normal one.
A popular approach for testing if two univariate random variables are statistically independent consists of partitioning the sample space into bins, and evaluating a test statistic on the binned data. The partition size matters, and the optimal partition size is data dependent. While for detecting simple relationships coarse partitions may be best, for detecting complex relationships a great gain in power can be achieved by considering finer partitions. We suggest novel consistent distribution-free tests that are based on summation or maximization aggregation of scores over all partitions of a fixed size. We show that our test statistics based on summation can serve as good estimators of the mutual information. Moreover, we suggest regularized tests that aggregate over all partition sizes, and prove those are consistent too. We provide polynomial-time algorithms, which are critical for computing the suggested test statistics efficiently. We show that the power of the regularized tests is excellent compared to existing tests, and almost as powerful as the tests based on the optimal (yet unknown in practice) partition size, in simulations as well as on a real data example.
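To make the binning approach concrete, the sketch below computes a plug-in mutual-information statistic on a single rank-based $m\times m$ equal-count partition. This is illustrative only: the paper's summation tests aggregate scores of this kind over all partitions of a given size, and the function name and rank-binning scheme here are my own choices, not the paper's construction.

```python
import numpy as np

def binned_mi(x, y, m):
    """Plug-in mutual information of one rank-based m x m partition.

    Ranks give equal-count marginal bins, so the statistic is
    distribution-free under continuous marginals.  2*n*binned_mi is the
    likelihood-ratio (G) statistic for independence on this partition.
    """
    n = len(x)
    rx = np.argsort(np.argsort(x)) * m // n   # equal-count bin index 0..m-1
    ry = np.argsort(np.argsort(y)) * m // n
    counts = np.zeros((m, m))
    np.add.at(counts, (rx, ry), 1)            # m x m contingency table
    p = counts / n
    px = p.sum(axis=1, keepdims=True)         # marginals
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0                                # 0*log(0) convention
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())
```

Under independence the statistic is near zero (up to the well-known $O(m^2/n)$ plug-in bias), while a perfectly dependent pair such as $y=x$ attains the maximum $\log m$, matching the abstract's point that summation statistics estimate mutual information.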
Many proposals have emerged as alternatives to the Heckman selection model, mainly to address the non-robustness of its normality assumption. The 2001 Medical Expenditure Panel Survey data is often used to illustrate this non-robustness of the Heckman model. In this paper, we propose a generalization of the Heckman sample selection model by allowing the sample selection bias and dispersion parameters to depend on covariates. We show that the non-robustness of the Heckman model may be due to the assumption of a constant sample selection bias parameter rather than to the normality assumption. Our proposed methodology allows us to understand which covariates are important to explain the sample selection bias phenomenon, rather than only to form conclusions about its presence. We explore the inferential aspects of the maximum likelihood estimators (MLEs) for our proposed generalized Heckman model. More specifically, we show that this model satisfies regularity conditions that ensure consistency and asymptotic normality of the MLEs. Proper score residuals for sample selection models are provided, and model adequacy is addressed. Simulation results are presented to check the finite-sample behavior of the estimators and to verify the consequences of not considering varying sample selection bias and dispersion parameters. We show that the normality assumption is suitable for analyzing the medical expenditure data and that the conclusions drawn using our approach are coherent with findings from the prior literature. Moreover, we identify which covariates are relevant to explain the presence of sample selection bias in this important dataset.
Propensity score methods have been shown to be powerful in obtaining efficient estimators of the average treatment effect (ATE) from observational data, especially in the presence of confounding factors. When estimating, it is important to decide which covariates to include in the propensity score function, since incorporating unnecessary covariates may amplify both the bias and the variance of ATE estimators. In this paper, we show that including additional instrumental variables that satisfy the exclusion restriction for the outcome harms statistical efficiency. We also prove that controlling for covariates that act as outcome predictors, i.e. predict the outcomes but are irrelevant to the exposures, can help reduce the asymptotic variance of ATE estimation. We further note that efficiently estimating the ATE by nonparametric or semiparametric methods requires an estimated propensity score function, as described in Hirano et al. (2003) \cite{Hirano2003}. Such estimation procedures usually require many regularity conditions; Rothe (2016) \cite{Rothe2016} also illustrated this point and proposed a known propensity score (KPS) estimator that requires mild regularity conditions and is still fully efficient. In addition, we introduce a linearly modified (LM) estimator that is nearly efficient in most general settings and does not require estimation of the propensity score function, and is hence convenient to compute. The construction of this estimator borrows ideas from the interaction estimator of Lin (2013) \cite{Lin2013}, in which regression adjustment with interaction terms is applied to data arising from a completely randomized experiment. As its name suggests, the LM estimator can be viewed as a linear modification of the IPW estimator using known propensity scores. We also investigate its statistical properties both analytically and numerically.
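For readers unfamiliar with the baseline estimator being modified, a minimal sketch of the IPW (Horvitz–Thompson) ATE estimator with known propensity scores follows. This is the standard textbook form, not the paper's LM estimator, whose exact construction is not given in the abstract; the simulated experiment is purely illustrative.

```python
import numpy as np

def ipw_ate(y, t, e):
    """IPW (Horvitz-Thompson) estimate of the ATE with *known*
    propensity scores e_i = P(T_i = 1 | X_i)."""
    t = np.asarray(t, dtype=float)
    return float(np.mean(t * y / e - (1 - t) * y / (1 - e)))

# Illustrative completely randomized experiment: known e = 0.5,
# true treatment effect tau = 2 (values chosen for this demo only).
rng = np.random.default_rng(1)
n = 20_000
t = rng.binomial(1, 0.5, n)
x = rng.normal(size=n)                     # outcome predictor
y = 2.0 * t + x + rng.normal(size=n)
est = ipw_ate(y, t, np.full(n, 0.5))       # close to 2.0 for large n
```

Note that the covariate `x` is an outcome predictor in the abstract's sense: it predicts `y` but is independent of the treatment, so adjusting for it (as the LM and Lin-style interaction estimators do) can only reduce the asymptotic variance relative to this unadjusted IPW estimate.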
We develop an analytic approach to the four-point crossing equation in CFT, for general spacetime dimension. In a unitary CFT, the crossing equation (for, say, the s- and t-channel expansions) can be thought of as a vector equation in an infinite-dimensional space of complex analytic functions in two variables, which satisfy a boundedness condition in the u-channel Regge limit. We identify a useful basis for this space of functions, consisting of the set of s- and t-channel conformal blocks of double-twist operators in mean field theory. We describe two independent algorithms to construct the dual basis of linear functionals, and work out explicitly many examples. Our basis of functionals appears to be closely related to the CFT dispersion relation recently derived by Carmi and Caron-Huot.