Leave-out estimation of variance components


Abstract in English

We propose leave-out estimators of quadratic forms designed for the study of linear models with unrestricted heteroscedasticity. Applications include analysis of variance and tests of linear restrictions in models with many regressors. An approximation algorithm is provided that enables accurate computation of the estimator in very large datasets. We study the large sample properties of our estimator allowing the number of regressors to grow in proportion to the number of observations. Consistency is established in a variety of settings where plug-in methods and estimators predicated on homoscedasticity exhibit first-order biases. For quadratic forms of increasing rank, the limiting distribution can be represented by a linear combination of normal and non-central $\chi^2$ random variables, with normality ensuing under strong identification. Standard error estimators are proposed that enable tests of linear restrictions and the construction of uniformly valid confidence intervals for quadratic forms of interest. In Italian social security records, we find that leave-out estimates of a variance decomposition in a two-way fixed effects model of wage determination yield substantially different conclusions regarding the relative contribution of workers, firms, and worker-firm sorting to wage inequality than conventional methods do. Monte Carlo exercises corroborate the accuracy of our asymptotic approximations, with clear evidence of non-normality emerging when worker mobility between blocks of firms is limited.
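To make the idea of a leave-out correction concrete, the following is a minimal sketch, not the implementation used for the results above. It assumes a linear model $y_i = x_i'\beta + \varepsilon_i$ with independent, possibly heteroscedastic errors, a symmetric matrix $A$ defining the quadratic form $\beta' A \beta$, and leverages strictly below one. The function name `leave_out_quadratic_form` and the simulated data are hypothetical and chosen only for illustration. The plug-in estimate $\hat\beta' A \hat\beta$ is biased upward by $\sum_i B_{ii}\sigma_i^2$, and the sketch subtracts an estimate of that bias built from leave-one-out residuals. It uses dense linear algebra and does not include the approximation algorithm for very large datasets.

```python
import numpy as np

def leave_out_quadratic_form(X, y, A):
    """Return (leave-out estimate, plug-in estimate) of beta' A beta.

    Illustrative sketch: assumes X has full column rank and that every
    leverage P_ii is strictly below one.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat

    # Leverage P_ii = x_i' (X'X)^{-1} x_i for each observation.
    P_ii = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

    # Leave-one-out residual y_i - x_i' beta_{-i} equals resid_i / (1 - P_ii),
    # so sigma2_hat_i = y_i * (y_i - x_i' beta_{-i}) needs no refitting.
    sigma2_hat = y * resid / (1.0 - P_ii)

    # B_ii = x_i' (X'X)^{-1} A (X'X)^{-1} x_i weights each variance estimate.
    M = XtX_inv @ A @ XtX_inv
    B_ii = np.einsum("ij,jk,ik->i", X, M, X)

    plug_in = beta_hat @ A @ beta_hat
    return plug_in - B_ii @ sigma2_hat, plug_in

# Hypothetical usage on simulated heteroscedastic data.
rng = np.random.default_rng(0)
n, k = 200, 40
X = rng.normal(size=(n, k))
beta = rng.normal(size=k)
y = X @ beta + rng.normal(size=n) * (1.0 + np.abs(X[:, 0]))
A = np.eye(k) / k  # quadratic form: mean squared coefficient
theta_lo, theta_pi = leave_out_quadratic_form(X, y, A)
```

Because each $\hat\sigma_i^2$ uses a coefficient estimate that excludes observation $i$, it is unbiased for $\sigma_i^2$ under these assumptions even when the number of regressors is large relative to the sample size, which is the setting where the plug-in bias does not vanish.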
