This paper provides estimation and inference methods for a large number of heterogeneous treatment effects in the presence of an even larger number of controls and unobserved unit heterogeneity. In our main example, the vector of heterogeneous treatments is generated by interacting the base treatment variable with a subset of controls. We first estimate the unit-specific expectation functions of the outcome and each treatment interaction conditional on controls and take the residuals. Second, we report the Lasso (L1-regularized least squares) estimate of the heterogeneous treatment effect parameter, regressing the outcome residual on the vector of treatment residuals. We debias the Lasso estimator to conduct simultaneous inference on the target parameter by Gaussian bootstrap. We account for the unobserved unit heterogeneity by projecting it onto the time-invariant covariates, following the correlated random effects approach of Mundlak (1978) and Chamberlain (1982). We demonstrate our method by estimating price elasticities of groceries based on scanner data.
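The two-step procedure described above (residualize, then run Lasso on the residuals) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the simulated data, the single control `c`, the use of plain OLS for the first stage, the penalty level `lam`, and the helper names `residualize`, `soft_threshold` and `lasso_cd`. The debiasing and Gaussian bootstrap steps are not shown.

```python
import random


def residualize(v, c):
    """Residual of v after an OLS fit on a single control c (with intercept)."""
    n = len(v)
    mv, mc = sum(v) / n, sum(c) / n
    slope = (sum((ci - mc) * (vi - mv) for ci, vi in zip(c, v))
             / sum((ci - mc) ** 2 for ci in c))
    return [vi - mv - slope * (ci - mc) for ci, vi in zip(c, v)]


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X beta
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (j excluded).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Hypothetical simulated data: 5 treatments, sparse effects, one confounder c.
rng = random.Random(0)
n, p = 300, 5
beta_true = [1.5, 0.0, 0.0, -1.0, 0.0]
c = [rng.gauss(0, 1) for _ in range(n)]
D = [[0.5 * c[i] + rng.gauss(0, 1) for _ in range(p)] for i in range(n)]
y = [sum(b * d for b, d in zip(beta_true, D[i])) + 2.0 * c[i]
     + rng.gauss(0, 0.5) for i in range(n)]

# Step 1: partial the control out of the outcome and of each treatment.
y_res = residualize(y, c)
D_cols_res = [residualize([D[i][j] for i in range(n)], c) for j in range(p)]
D_res = [[D_cols_res[j][i] for j in range(p)] for i in range(n)]

# Step 2: Lasso of the outcome residual on the treatment residuals.
beta_hat = lasso_cd(D_res, y_res, lam=0.1)
print([round(b, 2) for b in beta_hat])
```

In this toy setup the nonzero coefficients are shrunk slightly toward zero by the penalty, which is the bias that a debiasing step is meant to remove before inference.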
We investigate how to exploit structural similarities of an individual's potential outcomes (POs) under different treatments to obtain better estimates of conditional average treatment effects in finite samples. Especially when it is unknown whether a treatment has an effect at all, it is natural to hypothesize that the POs are similar. Yet some existing strategies for treatment effect estimation employ regularization schemes that implicitly encourage heterogeneity even when it does not exist and fail to fully make use of shared structure. In this paper, we investigate and compare three end-to-end learning strategies to overcome this problem (based on regularization, reparametrization and a flexible multi-task architecture), each encoding inductive bias favoring shared behavior across POs. To build understanding of their relative strengths, we implement all strategies using neural networks and conduct a wide range of semi-synthetic experiments. We observe that all three approaches can lead to substantial improvements upon numerous baselines and gain insight into performance differences across various experimental settings.
Understanding treatment effect heterogeneity in observational studies is of great practical importance to many scientific fields because the same treatment may affect different individuals differently. Quantile regression provides a natural framework for modeling such heterogeneity. In this paper, we propose a new method for inference on heterogeneous quantile treatment effects that incorporates high-dimensional covariates. Our estimator combines a debiased $\ell_1$-penalized regression adjustment with a quantile-specific covariate balancing scheme. We present a comprehensive study of the theoretical properties of this estimator, including weak convergence of the heterogeneous quantile treatment effect process to the sum of two independent, centered Gaussian processes. We illustrate the finite-sample performance of our approach through Monte Carlo experiments and an empirical example, dealing with the differential effect of mothers' education on infant birth weights.
Bayesian approaches have become increasingly popular in causal inference problems due to their conceptual simplicity, excellent performance and in-built uncertainty quantification (posterior credible sets). We investigate Bayesian inference for average treatment effects from observational data, which is a challenging problem due to the missing counterfactuals and selection bias. Working in the standard potential outcomes framework, we propose a data-driven modification to an arbitrary (nonparametric) prior based on the propensity score that corrects for the first-order posterior bias, thereby improving performance. We illustrate our method for Gaussian process (GP) priors using (semi-)synthetic data. Our experiments demonstrate significant improvement in both estimation accuracy and uncertainty quantification compared to the unmodified GP, rendering our approach highly competitive with the state-of-the-art.
The policy relevant treatment effect (PRTE) measures the average effect of switching from a status-quo policy to a counterfactual policy. Estimation of the PRTE involves estimation of multiple preliminary parameters, including propensity scores, conditional expectation functions of the outcome and covariates given the propensity score, and marginal treatment effects. These preliminary estimators can affect the asymptotic distribution of the PRTE estimator in complicated and intractable ways. In this light, we propose an orthogonal score for double debiased estimation of the PRTE, whereby the asymptotic distribution of the PRTE estimator is obtained without any influence of preliminary parameter estimators as long as they satisfy mild requirements on convergence rates. To our knowledge, this paper is the first to develop limit distribution theories for inference about the PRTE.
Given the unconfoundedness assumption, we propose new nonparametric estimators for the reduced dimensional conditional average treatment effect (CATE) function. In the first stage, the nuisance functions necessary for identifying CATE are estimated by machine learning methods, allowing the number of covariates to be comparable to or larger than the sample size. The second stage consists of a low-dimensional local linear regression, reducing CATE to a function of the covariate(s) of interest. We consider two variants of the estimator depending on whether the nuisance functions are estimated over the full sample or over a hold-out sample. Building on Belloni et al. (2017) and Chernozhukov et al. (2018), we derive functional limit theory for the estimators and provide an easy-to-implement procedure for uniform inference based on the multiplier bootstrap. The empirical application revisits the effect of maternal smoking on a baby's birth weight as a function of the mother's age.
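The two-stage structure described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's estimator: the first-stage signal is taken to be the standard doubly robust (AIPW) score, which the abstract does not specify; the nuisance values (`e`, `mu0`, `mu1`) are supplied by hand where a machine learning fit would normally go; and the Gaussian kernel, the bandwidth `h`, and the function names `aipw_score` and `local_linear` are hypothetical choices.

```python
import math


def aipw_score(y, d, e, mu0, mu1):
    """Doubly robust signal whose conditional mean given x is the CATE.

    e is the propensity score, mu0 and mu1 are the outcome regressions
    under control and treatment (all assumed estimated in a first stage).
    """
    return mu1 - mu0 + d * (y - mu1) / e - (1 - d) * (y - mu0) / (1 - e)


def local_linear(x, z, x0, h):
    """Local linear regression of z on scalar x, evaluated at x0.

    Uses Gaussian kernel weights with bandwidth h and returns the
    intercept of the weighted least squares line centered at x0.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, zi in zip(x, z):
        u = xi - x0
        w = math.exp(-0.5 * (u / h) ** 2)
        s0 += w
        s1 += w * u
        s2 += w * u * u
        t0 += w * zi
        t1 += w * u * zi
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)


# Toy data with a known CATE tau(x) = 1 + 2x and noiseless outcomes.
x = [i / 10 for i in range(21)]
d = [i % 2 for i in range(21)]           # alternating treatment status
mu0 = [xi for xi in x]                   # control outcome regression
mu1 = [xi + 1 + 2 * xi for xi in x]      # treated outcome regression
e = [0.5] * 21                           # known propensity score
y = [m1 if di == 1 else m0 for di, m0, m1 in zip(d, mu0, mu1)]

# Stage 1 output: one score per observation.
scores = [aipw_score(yi, di, ei, m0, m1)
          for yi, di, ei, m0, m1 in zip(y, d, e, mu0, mu1)]

# Stage 2: smooth the scores over the covariate of interest.
cate_at_1 = local_linear(x, scores, x0=1.0, h=0.3)
print(cate_at_1)
```

Because the toy CATE is exactly linear in `x`, the local linear fit recovers it without smoothing bias; with real data the bandwidth would trade off bias against variance, and the hold-out variant would compute the nuisance values on a separate subsample.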