
A $t$-test for synthetic controls

Published by Yinchu Zhu
Publication date: 2018
Research field: Economics
Research language: English





We propose a practical and robust method for making inferences on average treatment effects estimated by synthetic controls. We develop a $K$-fold cross-fitting procedure for bias correction. To avoid the difficult estimation of the long-run variance, inference is based on a self-normalized $t$-statistic, which has an asymptotically pivotal $t$-distribution. Our $t$-test is easy to implement, provably robust against misspecification, valid with non-stationary data, and demonstrates excellent small-sample performance. Compared to difference-in-differences, our method often yields more than 50% shorter confidence intervals and is robust to violations of parallel trends assumptions. An R package for implementing our methods is available.
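The abstract's two ingredients, $K$-fold cross-fitting for bias correction and a self-normalized $t$-statistic, can be illustrated with a minimal Python sketch. This is not the authors' R package or their exact procedure. It assumes unconstrained least-squares weights in place of constrained synthetic control weights, consecutive pre-treatment blocks as folds, and a plain sample standard deviation across fold estimates as the normalizer; the paper's exact variance scaling may differ. The function name `cross_fit_t_stat` and the simulated data are hypothetical.

```python
import numpy as np


def cross_fit_t_stat(y_pre, X_pre, y_post, X_post, K=3, tau0=0.0):
    """Illustrative K-fold cross-fitted effect estimate and t-statistic.

    Assumption: weights are fit by ordinary least squares, not by the
    constrained synthetic control estimator used in the paper.
    """
    T0 = len(y_pre)
    # Split the pre-treatment period into K consecutive blocks.
    folds = np.array_split(np.arange(T0), K)
    taus = []
    for hold in folds:
        # Fit weights on pre-treatment data excluding the held-out block.
        train = np.setdiff1d(np.arange(T0), hold)
        w, *_ = np.linalg.lstsq(X_pre[train], y_pre[train], rcond=None)
        # Average post-treatment gap, bias-corrected by the average gap
        # on the held-out pre-treatment block.
        post_gap = np.mean(y_post - X_post @ w)
        hold_gap = np.mean(y_pre[hold] - X_pre[hold] @ w)
        taus.append(post_gap - hold_gap)
    taus = np.asarray(taus)
    tau_hat = taus.mean()
    # Self-normalization: scale by the dispersion of the K fold estimates
    # instead of estimating a long-run variance.
    se = taus.std(ddof=1) / np.sqrt(K)
    t_stat = (tau_hat - tau0) / se
    return tau_hat, t_stat, K - 1  # compare t_stat with a t(K-1) distribution


# Simulated example (hypothetical data): 3 control units, true effect 2.0.
rng = np.random.default_rng(0)
T0, T1, J = 60, 20, 3
w_true = np.array([0.5, 0.3, 0.2])
X_pre = rng.normal(size=(T0, J))
X_post = rng.normal(size=(T1, J))
y_pre = X_pre @ w_true + 0.1 * rng.normal(size=T0)
y_post = X_post @ w_true + 2.0 + 0.1 * rng.normal(size=T1)

tau_hat, t_stat, df = cross_fit_t_stat(y_pre, X_pre, y_post, X_post, K=3)
print(tau_hat, t_stat, df)
```

Under these assumptions, the returned statistic would be compared with critical values of a $t$-distribution with $K-1$ degrees of freedom.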




Read also

97 - Paul Hunermund 2021
Double machine learning (DML) is becoming an increasingly popular tool for automated model selection in high-dimensional settings. At its core, DML assumes unconfoundedness, or exogeneity of all considered controls, which is likely to be violated if the covariate space is large. In this paper, we lay out a theory of bad controls building on the graph-theoretic approach to causality. We then demonstrate, based on simulation studies and an application to real-world data, that DML is very sensitive to the inclusion of bad controls and exhibits considerable bias even with only a few endogenous variables present in the conditioning set. The extent of this bias depends on the precise nature of the assumed causal model, which calls into question the ability to select appropriate controls for regressions in a purely data-driven way.
We present a robust generalization of the synthetic control method for comparative case studies. Like the classical method, we present an algorithm to estimate the unobservable counterfactual of a treatment unit. A distinguishing feature of our algorithm is that of de-noising the data matrix via singular value thresholding, which renders our approach robust in multiple facets: it automatically identifies a good subset of donors, overcomes the challenges of missing data, and continues to work well in settings where covariate information may not be provided. To begin, we establish the condition under which the fundamental assumption in synthetic control-like approaches holds, i.e. when the linear relationship between the treatment unit and the donor pool prevails in both the pre- and post-intervention periods. We provide the first finite sample analysis for a broader class of models, the Latent Variable Model, in contrast to Factor Models previously considered in the literature. Further, we show that our de-noising procedure accurately imputes missing entries, producing a consistent estimator of the underlying signal matrix provided $p = \Omega(T^{-1 + \zeta})$ for some $\zeta > 0$; here, $p$ is the fraction of observed data and $T$ is the time interval of interest. Under the same setting, we prove that the mean-squared-error (MSE) in our prediction estimation scales as $O(\sigma^2/p + 1/\sqrt{T})$, where $\sigma^2$ is the noise variance. Using a data aggregation method, we show that the MSE can be made as small as $O(T^{-1/2+\gamma})$ for any $\gamma \in (0, 1/2)$, leading to a consistent estimator. We also introduce a Bayesian framework to quantify the model uncertainty through posterior probabilities. Our experiments, using both real-world and synthetic datasets, demonstrate that our robust generalization yields an improvement over the classical synthetic control method.
130 - Yong Li , Xiaobin Liu , Jun Yu 2018
In this paper, a new and convenient $\chi^2$ Wald test based on MCMC outputs is proposed for hypothesis testing. The new statistic can be interpreted as an MCMC version of the Wald test and has several important advantages that make it very convenient in practical applications. First, it is well-defined under improper prior distributions and avoids the Jeffreys-Lindley paradox. Second, its asymptotic distribution can be proved to follow the $\chi^2$ distribution, so that the threshold values can be easily calibrated from this distribution. Third, its statistical error can be derived using the Markov chain Monte Carlo (MCMC) approach. Fourth, and most importantly, it is based only on the MCMC random samples drawn from the posterior distribution. Hence, it is merely a by-product of the posterior outputs and very easy to compute. In addition, when prior information is available, the finite sample theory is derived for the proposed test statistic. Finally, the usefulness of the test is illustrated with several applications to latent variable models widely used in economics and finance.
Counterfactual estimation using synthetic controls is one of the most successful recent methodological developments in causal inference. Despite its popularity, the current description only considers time series aligned across units and synthetic controls expressed as linear combinations of observed control units. We propose a continuous-time alternative that models the latent counterfactual path explicitly using the formalism of controlled differential equations. This model is directly applicable to the general setting of irregularly-aligned multivariate time series and may be optimized in rich function spaces -- thereby improving on some limitations of existing approaches.
In a recent paper, Juodis and Reese (2021) (JR) show that the application of the CD test proposed by Pesaran (2004) to residuals from panels with latent factors results in over-rejection. They propose a randomized test statistic to correct for over-rejection and add a screening component to achieve power. This paper considers the same problem from a different perspective. It shows that the standard CD test remains valid if the latent factors are weak, and proposes a simple bias-corrected CD test, labelled CD*, which is shown to be asymptotically normal irrespective of whether the latent factors are weak or strong. This result is shown to hold for pure latent factor models as well as for panel regressions with latent factors. Small sample properties of the CD* test are investigated by Monte Carlo experiments, and the test is shown to have the correct size and satisfactory power for both Gaussian and non-Gaussian errors. In contrast, it is found that JR's test tends to over-reject in the case of panels with non-Gaussian errors, and has low power against spatial network alternatives. The use of the CD* test is illustrated with two empirical applications from the literature.