We study wild bootstrap inference for instrumental variable (quantile) regressions in the framework of a small number of large clusters, in which the number of clusters is viewed as fixed and the number of observations in each cluster diverges to infinity. For subvector inference, we show that the wild bootstrap Wald test, with or without using the cluster-robust covariance matrix, controls size asymptotically up to a small error as long as the parameters of the endogenous variables are strongly identified in at least one of the clusters. We further develop a wild bootstrap Anderson-Rubin (AR) test for full-vector inference and show that it controls size asymptotically up to a small error even under weak or partial identification in all clusters. We illustrate the good finite-sample performance of the new inference methods using simulations and provide an empirical application to a well-known dataset on U.S. local labor markets.
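To fix ideas, the following is a minimal sketch of a cluster-level wild bootstrap test of H0: beta = beta0 in a just-identified linear IV model, with one Rademacher draw per cluster. It is an illustration under simplifying assumptions (the names y, x, z, cluster_ids and the toy data are ours), not the authors' exact procedure.

```python
import numpy as np

def iv_estimate(y, x, z):
    """Just-identified IV estimator: beta_hat = (z'y) / (z'x)."""
    return (z @ y) / (z @ x)

def wild_cluster_bootstrap_pvalue(y, x, z, cluster_ids, beta0, n_boot=999, seed=0):
    rng = np.random.default_rng(seed)
    clusters = np.unique(cluster_ids)
    stat = abs(iv_estimate(y, x, z) - beta0)
    u0 = y - x * beta0                      # residuals imposing the null
    count = 0
    for _ in range(n_boot):
        # One Rademacher weight per cluster, applied to all its observations.
        w = rng.choice([-1.0, 1.0], size=len(clusters))
        y_star = x * beta0 + u0 * w[np.searchsorted(clusters, cluster_ids)]
        count += abs(iv_estimate(y_star, x, z) - beta0) >= stat
    return (1 + count) / (1 + n_boot)

# Toy usage: six equal-sized clusters, strongly identified design.
rng = np.random.default_rng(1)
n, G = 600, 6
cluster_ids = np.repeat(np.arange(G), n // G)
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n)
print(wild_cluster_bootstrap_pvalue(y, x, z, cluster_ids, beta0=1.0))
```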
This paper presents a simple method for carrying out inference in a wide variety of possibly nonlinear IV models under weak assumptions. The method is non-asymptotic in the sense that it provides a finite-sample bound on the difference between the true and nominal probabilities of rejecting a correct null hypothesis. The method is a non-Studentized version of the Anderson-Rubin test but is motivated and analyzed differently. In contrast to the conventional Anderson-Rubin test, the method proposed here does not require restrictive distributional assumptions, linearity of the estimated model, or a simultaneous-equations framework. Nor does it require knowledge of whether the instruments are strong or weak, or testing or estimating their strength. The method can be applied to quantile IV models that may be nonlinear and can be used to test a parametric IV model against a nonparametric alternative. The results presented here hold in finite samples, regardless of the strength of the instruments.
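For reference, here is a minimal sketch of the conventional (Studentized) Anderson-Rubin statistic for a linear IV model, whose exact F distribution under normal errors motivates the comparison drawn above; the paper's non-Studentized, finite-sample version differs. The names (y, X, Z, beta0) and the toy data are illustrative.

```python
import numpy as np
from scipy import stats

def anderson_rubin_test(y, X, Z, beta0):
    n, k = Z.shape
    u0 = y - X @ beta0                              # residuals imposing H0: beta = beta0
    Pu = Z @ np.linalg.lstsq(Z, u0, rcond=None)[0]  # projection of u0 onto the instruments
    ar = ((Pu @ Pu) / k) / ((u0 @ u0 - Pu @ Pu) / (n - k))
    return ar, stats.f.sf(ar, k, n - k)             # exact F p-value under normal errors

# Toy usage: two instruments, one endogenous regressor.
rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 2))
X = (Z @ np.array([0.5, 0.3]) + rng.normal(size=n)).reshape(-1, 1)
y = X[:, 0] + rng.normal(size=n)
print(anderson_rubin_test(y, X, Z, np.array([1.0])))
```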
We develop monitoring procedures for cointegrating regressions, testing the null of no break against the alternatives that there is either a change in the slope or a change to non-cointegration. After observing the regression over a calibration sample of size m, we study a CUSUM-type statistic to detect the presence of a change during a monitoring horizon m+1,...,T. Our procedures use a class of boundary functions that depend on a parameter whose value affects the delay in detecting a possible break. Technically, these procedures are based on almost sure limit theorems whose derivation is not straightforward. We therefore define a monitoring function which, at every point in time, diverges to infinity under the null and drifts to zero under the alternatives. We cast this sequence in a randomised procedure to construct an i.i.d. sequence, which we then employ to define the detector function. When the null of no break is correct, our monitoring procedure rejects it with a small probability, whilst in the presence of a break it rejects with probability one over the monitoring horizon.
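As an illustration of the monitoring setup, the sketch below implements a residual-based CUSUM scheme with a standard boundary function in the style of Chu, Stinchcombe, and White (1996); it is not the paper's randomised detector, and the names (y, x, m, gamma, crit) are assumptions.

```python
import numpy as np

def monitor_cusum(y, x, m, gamma=0.25, crit=2.0):
    # Fit y_t = a + b * x_t on the calibration sample 1..m.
    X = np.column_stack([np.ones(m), x[:m]])
    coef, *_ = np.linalg.lstsq(X, y[:m], rcond=None)
    resid = y - (coef[0] + coef[1] * x)        # residuals over the full sample
    sigma = resid[:m].std(ddof=2)              # scale from the calibration sample
    detector = np.abs(np.cumsum(resid[m:])) / (sigma * np.sqrt(m))
    k = np.arange(1, len(detector) + 1)
    boundary = crit * (1 + k / m) * (k / (k + m)) ** gamma
    hits = np.nonzero(detector > boundary)[0]
    return None if hits.size == 0 else m + int(hits[0]) + 1  # first detection time

# Toy usage: cointegrated pair with a slope break at t = 300.
rng = np.random.default_rng(2)
T, m = 500, 200
x = np.cumsum(rng.normal(size=T))              # I(1) regressor
u = rng.normal(size=T)
y = 1.0 * x + u
y[300:] = 1.5 * x[300:] + u[300:]              # break in the slope
print(monitor_cusum(y, x, m))
```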
We propose a novel conditional quantile prediction method based on complete subset averaging (CSA) for quantile regressions. All models under consideration are potentially misspecified, and the dimension of the regressors grows to infinity as the sample size increases. Since we average over the complete subsets, the number of models is much larger than in the usual model averaging methods, which adopt sophisticated weighting schemes. We propose to use equal weights but to select the proper size of the complete subset based on leave-one-out cross-validation. Building upon the theory of Lu and Su (2015), we investigate the large-sample properties of CSA and show its asymptotic optimality in the sense of Li (1987). We check the finite-sample performance via Monte Carlo simulations and empirical applications.
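A minimal sketch of the CSA prediction step, assuming an equal-weight average of quantile regressions over all regressor subsets of a given size k; the leave-one-out choice of k is omitted, and the names (X, y, X_new, tau, k) are illustrative.

```python
from itertools import combinations
import numpy as np
import statsmodels.api as sm

def csa_quantile_predict(X, y, X_new, tau=0.5, k=2):
    preds = []
    for cols in combinations(range(X.shape[1]), k):   # every subset of size k
        Xs = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
        fit = sm.QuantReg(y, Xs).fit(q=tau)
        Xn = np.column_stack([np.ones(len(X_new)), X_new[:, list(cols)]])
        preds.append(fit.predict(Xn))
    return np.mean(preds, axis=0)                     # equal-weight average

# Toy usage: median prediction averaged over all 10 two-regressor models.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = X[:, 0] - 0.5 * X[:, 1] + rng.standard_t(df=3, size=200)
print(csa_quantile_predict(X, y, X[:5], tau=0.5, k=2))
```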
This paper examines methods of inference concerning quantile treatment effects (QTEs) in randomized experiments with matched-pairs designs (MPDs). Standard multiplier bootstrap inference fails to capture the negative dependence of observations within each pair and is therefore conservative. Analytical inference involves estimating multiple functional quantities that require several tuning parameters. Instead, this paper proposes two bootstrap methods that can consistently approximate the limit distribution of the original QTE estimator and lessen the burden of tuning parameter choice. Notably, the inverse propensity score weighted multiplier bootstrap can be implemented without knowledge of pair identities.
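As a simplified illustration, the sketch below computes a multiplier-bootstrap confidence interval for a QTE using i.i.d. mean-one exponential multipliers; it ignores the matched-pairs refinements that are the paper's contribution, and all names (y, d, tau) are assumptions.

```python
import numpy as np

def weighted_quantile(y, w, tau):
    order = np.argsort(y)
    cw = np.cumsum(w[order]) / w.sum()
    return y[order][np.searchsorted(cw, tau)]

def qte_multiplier_ci(y, d, tau=0.5, n_boot=999, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    qte = np.quantile(y[d == 1], tau) - np.quantile(y[d == 0], tau)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        w = rng.exponential(size=len(y))      # i.i.d. mean-one multipliers
        draws[b] = (weighted_quantile(y[d == 1], w[d == 1], tau)
                    - weighted_quantile(y[d == 0], w[d == 0], tau))
    lo, hi = np.quantile(draws - qte, [alpha / 2, 1 - alpha / 2])
    return qte - hi, qte - lo                 # basic (reversed-percentile) CI

# Toy usage: alternating treatment assignment within the sample.
rng = np.random.default_rng(4)
n = 400
d = np.tile([0, 1], n // 2)
y = rng.normal(size=n) + 0.5 * d
print(qte_multiplier_ci(y, d, tau=0.5))
```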
This paper introduces structured machine learning regressions for prediction and nowcasting with panel data consisting of series sampled at different frequencies. Motivated by the empirical problem of predicting corporate earnings for a large cross-section of firms with macroeconomic, financial, and news time series sampled at different frequencies, we focus on the sparse-group LASSO regularization. This type of regularization can take advantage of mixed-frequency time series panel data structures, and we find that it empirically outperforms unstructured machine learning methods. We obtain oracle inequalities for the pooled and fixed effects sparse-group LASSO panel data estimators, recognizing that financial and economic data exhibit heavier-than-Gaussian tails. To that end, we rely on a novel Fuk-Nagaev concentration inequality for panel data consisting of heavy-tailed $\tau$-mixing processes, which may be of independent interest in other high-dimensional panel data settings.
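To illustrate the penalty itself, here is a minimal proximal-gradient sketch of the sparse-group LASSO on a plain least-squares loss; the groups, tuning parameters (lam, alpha), and step size are assumptions, and the paper's mixed-frequency panel structure and theory are not modeled.

```python
import numpy as np

def sgl_prox(b, groups, lam, alpha):
    # Elementwise soft-thresholding (the lasso part of the penalty)...
    b = np.sign(b) * np.maximum(np.abs(b) - lam * alpha, 0.0)
    # ...followed by groupwise shrinkage (the group-lasso part).
    for g in groups:
        norm = np.linalg.norm(b[g])
        if norm > 0:
            b[g] *= max(1.0 - lam * (1 - alpha) * np.sqrt(len(g)) / norm, 0.0)
    return b

def sparse_group_lasso(X, y, groups, lam=0.1, alpha=0.5, n_iter=500):
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2      # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n          # gradient of the least-squares loss
        b = sgl_prox(b - step * grad, groups, lam * step, alpha)
    return b

# Toy usage: twelve regressors in four groups of three; only group 0 is active.
rng = np.random.default_rng(5)
X = rng.normal(size=(300, 12))
beta = np.zeros(12)
beta[:3] = [1.0, -0.5, 0.25]
y = X @ beta + rng.normal(size=300)
groups = [list(range(3 * j, 3 * j + 3)) for j in range(4)]
print(np.round(sparse_group_lasso(X, y, groups), 2))
```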