Complete Subset Averaging with Many Instruments


Abstract in English

We propose a two-stage least squares (2SLS) estimator whose first stage is the equal-weighted average over a complete subset with $k$ instruments among $K$ available, which we call the complete subset averaging (CSA) 2SLS. The approximate mean squared error (MSE) is derived as a function of the subset size $k$ by the Nagar (1959) expansion. The subset size is chosen by minimizing the sample counterpart of the approximate MSE. We show that this method achieves the asymptotic optimality among the class of estimators with different subset sizes. To deal with averaging over a growing set of irrelevant instruments, we generalize the approximate MSE to find that the optimal $k$ is larger than otherwise. An extensive simulation experiment shows that the CSA-2SLS estimator outperforms the alternative estimators when instruments are correlated. As an empirical illustration, we estimate the logistic demand function in Berry, Levinsohn, and Pakes (1995) and find the CSA-2SLS estimate is better supported by economic theory than the alternative estimates.

Download