
Sparsity Double Robust Inference of Average Treatment Effects

Added by Yinchu Zhu
Publication date: 2019
Field: Economics
Language: English





Many popular methods for building confidence intervals for causal effects under high-dimensional confounding require strong ultra-sparsity assumptions that may be difficult to validate in practice. To alleviate this difficulty, we study a new method for average treatment effect estimation that yields asymptotically exact confidence intervals assuming that either the conditional response surface or the conditional probability of treatment allows for an ultra-sparse representation (but not necessarily both). This guarantee allows us to provide valid inference for the average treatment effect in high dimensions under considerably more generality than available baselines. In addition, we show that our estimator is semiparametrically efficient.
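For intuition, the sketch below implements a generic cross-fitted augmented inverse-propensity-weighted (AIPW) estimator, the standard doubly robust construction that methods of this kind build on. It is an illustrative stand-in, not the paper's procedure: the function name aipw_ate, the lasso/logistic-lasso first stages, and the propensity clipping at 0.01 are all choices made here for the example.

```python
# A minimal cross-fitted AIPW (doubly robust) ATE sketch.
# Illustrative only: generic lasso / logistic-lasso first stages,
# not the sparse estimators analyzed in the paper.
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegressionCV
from sklearn.model_selection import KFold

def aipw_ate(X, T, Y, n_folds=5, seed=0):
    n = len(Y)
    psi = np.zeros(n)  # per-observation influence-function values
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        Xtr, Ttr, Ytr = X[train], T[train], Y[train]
        # Outcome regressions, fit separately on treated and control units.
        mu1 = LassoCV(cv=3).fit(Xtr[Ttr == 1], Ytr[Ttr == 1])
        mu0 = LassoCV(cv=3).fit(Xtr[Ttr == 0], Ytr[Ttr == 0])
        # Propensity score via l1-penalized logistic regression.
        ps = LogisticRegressionCV(cv=3, penalty="l1", solver="liblinear").fit(Xtr, Ttr)
        e = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)
        m1, m0 = mu1.predict(X[test]), mu0.predict(X[test])
        t, y = T[test], Y[test]
        # AIPW score: regression contrast plus inverse-propensity-weighted residuals.
        psi[test] = m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
    ate = psi.mean()
    se = psi.std(ddof=1) / np.sqrt(n)
    return ate, (ate - 1.96 * se, ate + 1.96 * se)  # point estimate and 95% CI
```

Because the score combines the outcome model and the propensity model, the estimate remains consistent if either first stage is correct, which is the structure the paper's weaker ultra-sparsity requirement exploits.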



Related research

Kolyan Ray, Botond Szabo (2019)
Bayesian approaches have become increasingly popular in causal inference problems due to their conceptual simplicity, excellent performance and in-built uncertainty quantification (posterior credible sets). We investigate Bayesian inference for average treatment effects from observational data, which is a challenging problem due to the missing counterfactuals and selection bias. Working in the standard potential outcomes framework, we propose a data-driven modification to an arbitrary (nonparametric) prior based on the propensity score that corrects for the first-order posterior bias, thereby improving performance. We illustrate our method for Gaussian process (GP) priors using (semi-)synthetic data. Our experiments demonstrate significant improvement in both estimation accuracy and uncertainty quantification compared to the unmodified GP, rendering our approach highly competitive with the state-of-the-art.
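One rough way to picture the recipe (estimate the propensity score, then push a correction direction into the prior) is sketched below; it is not the paper's exact prior modification. Here the estimated Riesz direction $t/\hat e(x) - (1-t)/(1-\hat e(x))$ is simply appended as an extra GP input, and the ATE posterior is read off from GP samples; the clipping bounds and the helper name bayes_ate_draws are assumptions of this illustration.

```python
# Hedged sketch: propensity-score-adjusted Bayesian ATE via a GP outcome model.
# The paper's precise prior correction differs; here the estimated Riesz
# direction is appended to the design as a crude stand-in for that shift.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.linear_model import LogisticRegression

def bayes_ate_draws(X, T, Y, n_draws=200, seed=0):
    # Step 1: estimated propensity score, clipped away from 0 and 1.
    e = np.clip(LogisticRegression().fit(X, T).predict_proba(X)[:, 1], 0.05, 0.95)
    riesz = T / e - (1 - T) / (1 - e)      # estimated Riesz representer values
    Z = np.column_stack([X, T, riesz])     # augmented design
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(Z, Y)
    # Step 2: posterior draws of both counterfactual surfaces.
    Z1 = np.column_stack([X, np.ones(len(X)), 1 / e])
    Z0 = np.column_stack([X, np.zeros(len(X)), -1 / (1 - e)])
    f1 = gp.sample_y(Z1, n_draws, random_state=seed)
    f0 = gp.sample_y(Z0, n_draws, random_state=seed + 1)
    return (f1 - f0).mean(axis=0)          # posterior draws of the ATE
```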
This paper studies inference in linear models whose parameter of interest is a high-dimensional matrix. We focus on the case where the high-dimensional matrix parameter is well-approximated by a "spiked" low-rank matrix whose rank grows slowly compared to its dimensions and whose nonzero singular values diverge to infinity. We show that this framework covers a broad class of latent-variable models, accommodating matrix completion problems, factor models, varying coefficient models, principal components analysis with missing data, and heterogeneous treatment effects. For inference, we propose a new "rotation debiasing" method for product parameters initially estimated using nuclear norm penalization. We present general high-level results under which our procedure yields asymptotically normal estimators, and then low-level conditions under which we verify the high-level conditions in a treatment effects example.
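As background for the first stage, a bare-bones nuclear-norm-penalized estimator can be written as a proximal gradient loop whose prox step soft-thresholds singular values. This sketch shows a standard matrix-completion solver, not the paper's rotation-debiasing step; the 0/1 observation mask, fixed penalty lam, and function names are assumptions of the example.

```python
# Hedged sketch: nuclear-norm penalized matrix completion by proximal gradient.
import numpy as np

def svt(A, lam):
    # Prox of lam * nuclear norm: soft-threshold the singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

def nuclear_norm_complete(Y, mask, lam=1.0, n_iter=500):
    # Minimize 0.5 * ||mask * (M - Y)||_F^2 + lam * ||M||_*.
    # Step size 1 is valid since the masked quadratic has Lipschitz constant 1.
    # Unobserved entries of Y (mask == 0) may hold any finite placeholder.
    M = np.zeros_like(Y, dtype=float)
    for _ in range(n_iter):
        M = svt(M - mask * (M - Y), lam)
    return M
```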
Keli Guo (2020)
This paper revisits the classical doubly robust estimation of the average treatment effect by systematically comparing, in terms of asymptotic efficiency, all possible combinations of the estimated propensity score and outcome regression. To this end, we consider all nine combinations under parametric, nonparametric and semiparametric structures, respectively. The comparisons provide useful information on when and how to efficiently utilize the model structures in practice. Further, when either the propensity score or the outcome regression is misspecified, we give the corresponding comparisons. Three phenomena are observed. First, when all models are correctly specified, every combination achieves the same semiparametric efficiency bound, which coincides with existing results for some combinations. Second, when the propensity score is correctly modeled and estimated but the outcome regression is misspecified parametrically or semiparametrically, the asymptotic variance is always larger than or equal to the semiparametric efficiency bound. Third, in contrast, when the propensity score is misspecified parametrically or semiparametrically while the outcome regression is correctly modeled and estimated, the asymptotic variance is not necessarily larger than the semiparametric efficiency bound; in some cases, super-efficiency occurs. We also conduct a small numerical study.
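For concreteness, all nine combinations plug first-stage estimates $\hat e$ and $\hat\mu_t$ into the standard doubly robust (AIPW) estimating equation,

$$\hat\tau = \frac{1}{n}\sum_{i=1}^{n}\left[\hat\mu_1(X_i)-\hat\mu_0(X_i)+\frac{T_i\,\{Y_i-\hat\mu_1(X_i)\}}{\hat e(X_i)}-\frac{(1-T_i)\,\{Y_i-\hat\mu_0(X_i)\}}{1-\hat e(X_i)}\right],$$

and the double robustness referred to above is the fact that this average is consistent for the ATE whenever either $\hat e$ or $\hat\mu_t$ converges to the truth.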
This paper is about the ability and means to root-n consistently and efficiently estimate linear, mean square continuous functionals of a high-dimensional, approximately sparse regression. Such objects include a wide variety of interesting parameters, such as the covariance between two regression residuals, a coefficient of a partially linear model, an average derivative, and the average treatment effect. We give lower bounds on the convergence rate of estimators of such objects and find that these bounds are substantially larger than in a low-dimensional, semiparametric setting. We also give automatic debiased machine learners that are $1/\sqrt{n}$ consistent and asymptotically efficient under minimal conditions. These estimators use no cross-fitting or a special kind of cross-fitting to attain efficiency with faster than $n^{-1/4}$ convergence of the regression. This rate condition is substantially weaker than requiring that the product of convergence rates of two functions be faster than $1/\sqrt{n}$, as required for many other debiased machine learners.
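As a concrete instance of one functional on this list, the coefficient $\theta$ in the partially linear model $Y = \theta D + g(X) + \varepsilon$ can be estimated by a debiased, cross-fitted partialling-out scheme like the sketch below; the lasso first stages and the helper name partialling_out are choices of this illustration, not the paper's automatic debiasing machinery.

```python
# Hedged sketch: Neyman-orthogonal (partialling-out) estimation of theta
# in Y = theta*D + g(X) + eps, with cross-fitted lasso first stages.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import cross_val_predict

def partialling_out(X, D, Y):
    # Cross-fitted residualization of Y and D on the high-dimensional X.
    y_res = Y - cross_val_predict(LassoCV(cv=3), X, Y, cv=5)
    d_res = D - cross_val_predict(LassoCV(cv=3), X, D, cv=5)
    theta = (d_res @ y_res) / (d_res @ d_res)  # orthogonal moment solution
    eps = y_res - theta * d_res
    # Sandwich standard error from the influence function.
    se = np.sqrt(np.mean(d_res**2 * eps**2)) / (np.mean(d_res**2) * np.sqrt(len(Y)))
    return theta, se
```

Orthogonality of the residual-on-residual moment is what makes the estimate insensitive to first-order errors in the two lasso fits.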
David M. Kaplan (2016)
Bayesian and frequentist criteria are fundamentally different, but posterior and sampling distributions are often asymptotically equivalent (e.g., Gaussian). For the corresponding limit experiment, we characterize the frequentist size of a certain Bayesian hypothesis test of (possibly nonlinear) inequalities. If the null hypothesis is that the (possibly infinite-dimensional) parameter lies in a certain half-space, then the Bayesian test's size is $\alpha$; if the null hypothesis is a subset of a half-space, then size is above $\alpha$ (sometimes strictly); and in other cases, size may be above, below, or equal to $\alpha$. Two examples illustrate our results: testing stochastic dominance and testing curvature of a translog cost function.
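The half-space case can be checked directly in the simplest one-dimensional Gaussian limit experiment: with $X \sim N(\theta, 1)$ and a flat prior, the posterior is $N(x, 1)$, so the Bayesian test that rejects $H_0\colon \theta \le 0$ when the posterior mass on $H_0$ falls below $\alpha$ rejects exactly when $x > z_{1-\alpha}$, giving frequentist size $\alpha$ at the boundary. This toy simulation (an assumption of the illustration, not the paper's general framework) confirms it:

```python
# Simulated frequentist size of the Bayesian half-space test
# in the 1-D Gaussian limit experiment with a flat prior.
import numpy as np
from scipy.stats import norm

alpha, n_sim = 0.05, 200_000
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, size=n_sim)   # data drawn at the boundary theta = 0
post_prob_h0 = norm.cdf(-x)           # posterior N(x, 1) mass on {theta <= 0}
size = (post_prob_h0 < alpha).mean()  # frequency of rejection
print(f"simulated size: {size:.4f}  (alpha = {alpha})")
```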