Ensemble Methods for Causal Effects in Panel Data Settings

Added by Susan Athey

Publication date: 2019

Field: Economics

Language: English

Authors: Susan Athey, Mohsen Bayati, Guido Imbens

Econometrics

This paper studies a panel data setting where the goal is to estimate causal effects of an intervention by predicting the counterfactual values of outcomes for treated units, had they not received the treatment. Several approaches have been proposed for this problem, including regression methods, synthetic control methods and matrix completion methods. This paper considers an ensemble approach, and shows that it performs better than any of the individual methods in several economic datasets. Matrix completion methods are often given the most weight by the ensemble, but this clearly depends on the setting. We argue that ensemble methods present a fruitful direction for further research in the causal panel data setting.
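The ensemble idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the two base predictors (a difference-in-differences style predictor and a ridge regression on control units), the holdout-based weighting rule, and all function names and parameters below are assumptions chosen for illustration. The sketch holds out the last few pre-treatment periods, picks the convex weight that best combines the two predictors on those periods, then refits on all pre-treatment data to predict the untreated counterfactual.

```python
import numpy as np

def did_predict(controls_fit, treated_fit, controls_new):
    """Difference-in-differences style: control mean plus the average gap in the fitting window."""
    gap = treated_fit.mean() - controls_fit.mean()
    return controls_new.mean(axis=0) + gap

def ridge_predict(controls_fit, treated_fit, controls_new, alpha=1.0):
    """Ridge regression of the treated unit on the control units, with an unpenalised intercept."""
    X = controls_fit.T                      # shape (periods, control units)
    x_mean, y_mean = X.mean(axis=0), treated_fit.mean()
    Xc = X - x_mean
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]),
                           Xc.T @ (treated_fit - y_mean))
    return y_mean + (controls_new.T - x_mean) @ beta

def ensemble_counterfactual(controls, treated, t0, holdout=3):
    """controls: (N0, T) array; treated: (T,) array; treatment starts at period index t0."""
    fit_end = t0 - holdout
    if fit_end < 2:
        raise ValueError("not enough pre-treatment periods for the holdout")
    models = (did_predict, ridge_predict)
    # Pseudo out-of-sample predictions on the held-out pre-treatment periods.
    p1, p2 = (m(controls[:, :fit_end], treated[:fit_end], controls[:, fit_end:t0])
              for m in models)
    y = treated[fit_end:t0]
    d = p1 - p2
    denom = d @ d
    # Closed-form least-squares weight for two predictors, clipped to [0, 1].
    w = 0.5 if denom == 0 else float(np.clip((y - p2) @ d / denom, 0.0, 1.0))
    # Refit both predictors on all pre-treatment periods and combine.
    f1, f2 = (m(controls[:, :t0], treated[:t0], controls[:, t0:]) for m in models)
    return w * f1 + (1 - w) * f2, w

# Simulated one-factor panel with no true treatment effect (illustrative data only).
rng = np.random.default_rng(0)
factor = np.cumsum(rng.normal(size=20))
loadings = rng.uniform(0.5, 1.5, size=9)
panel = np.outer(loadings, factor) + 0.1 * rng.normal(size=(9, 20))
controls, treated = panel[:8], panel[8]
counterfactual, weight = ensemble_counterfactual(controls, treated, t0=15)
effect_estimate = treated[15:] - counterfactual
```

Here `weight` is the share given to the difference-in-differences predictor; a richer ensemble would add further base learners (for example a matrix completion estimator) and choose weights by cross-validation over several folds.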

Double-Robust Identification for Causal Panel Data Models

Dmitry Arkhangelsky, Guido W. Imbens (2019)

We study identification and estimation of causal effects in settings with panel data. Traditionally researchers follow model-based identification strategies relying on assumptions governing the relation between the potential outcomes and the unobserved confounders. We focus on a novel, complementary, approach to identification where assumptions are made about the relation between the treatment assignment and the unobserved confounders. We introduce different sets of assumptions that follow the two paths to identification, and develop a double robust approach. We propose estimation methods that build on these identification strategies.

Econometrics, General Economics, Economics

Subspace Clustering for Panel Data with Interactive Effects

Jiangtao Duan, Wei Gao, Hao Qu (2019)

In this paper, we consider a statistical model for panel data with unobservable grouped factor structures that are correlated with the regressors, where the group membership may be unknown. The factor loadings are assumed to lie in different subspaces, and subspace clustering of the factor loadings is considered. A method called least squares subspace clustering estimate (LSSC) is proposed to estimate the model parameters by minimizing the least-squares criterion and to perform the subspace clustering simultaneously. The consistency of the proposed subspace clustering is proved and the asymptotic properties of the estimation procedure are studied under certain conditions. A Monte Carlo simulation study is used to illustrate the advantages of the proposed method. Situations in which the number of subspaces for factors, the dimension of factors and the dimension of subspaces are unknown are also discussed. For illustrative purposes, the proposed method is applied to study the linkage between income and democracy across countries while allowing for subspace patterns of unobserved factors and factor loadings.

Econometrics

Matrix Completion Methods for Causal Panel Data Models

Susan Athey, Mohsen Bayati, Nikolay Doudchenko (2017)

In this paper we study methods for estimating causal effects in settings with panel data, where some units are exposed to a treatment during some periods and the goal is estimating counterfactual (untreated) outcomes for the treated unit/period combinations. We propose a class of matrix completion estimators that uses the observed elements of the matrix of control outcomes corresponding to untreated unit/periods to impute the missing elements of the control outcome matrix, corresponding to treated units/periods. This leads to a matrix that well-approximates the original (incomplete) matrix, but has lower complexity according to the nuclear norm for matrices. We generalize results from the matrix completion literature by allowing the patterns of missing data to have a time series dependency structure that is common in social science applications. We present novel insights concerning the connections between the matrix completion literature, the literature on interactive fixed effects models and the literatures on program evaluation under unconfoundedness and synthetic control methods. We show that all these estimators can be viewed as focusing on the same objective function. They differ solely in the way they deal with identification, in some cases solely through regularization (our proposed nuclear norm matrix completion estimator) and in other cases primarily through imposing hard restrictions (the unconfoundedness and synthetic control approaches). The proposed method outperforms unconfoundedness-based or synthetic control estimators in simulations based on real data.
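The nuclear norm imputation described above can be sketched with a generic soft-impute style iteration (repeated SVD soft-thresholding). This is a simplified illustration, not the estimator proposed in the paper: it omits any fixed effects or covariates, the penalty `lam` is fixed rather than tuned, and the function name, parameters and simulated data are assumptions made for the example.

```python
import numpy as np

def soft_impute(Y, observed, lam, n_iter=500, tol=1e-8):
    """Approximately minimise 0.5 * ||P_O(Y - L)||_F^2 + lam * ||L||_* over L.

    Y: (N, T) outcome matrix; entries where `observed` is False are ignored.
    observed: boolean (N, T) mask of untreated (observed control) entries.
    """
    L = np.zeros_like(Y, dtype=float)
    for _ in range(n_iter):
        # Keep observed outcomes, fill the missing entries with the current estimate.
        filled = np.where(observed, Y, L)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Shrink singular values toward zero; this is what lowers the nuclear norm.
        L_new = (U * np.maximum(s - lam, 0.0)) @ Vt
        converged = np.linalg.norm(L_new - L) <= tol * max(1.0, np.linalg.norm(L))
        L = L_new
        if converged:
            break
    return L

# Simulated rank-2 panel; the last 5 units are "treated" in the last 5 periods,
# so their control outcomes are missing there (illustrative data only).
rng = np.random.default_rng(0)
truth = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
observed = np.ones_like(truth, dtype=bool)
observed[25:, 15:] = False
Y = np.where(observed, truth, np.nan)

L_hat = soft_impute(Y, observed, lam=0.1)
imputed_block = L_hat[25:, 15:]   # estimated counterfactual control outcomes
```

In an application, the treatment effect estimate for each treated unit/period would be the observed treated outcome minus the corresponding entry of `imputed_block`, with `lam` typically chosen by cross-validation on held-out observed entries.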

Statistics Theory, Econometrics

Double-Robust Two-Way-Fixed-Effects Regression For Panel Data

Dmitry Arkhangelsky, Guido W. Imbens, Lihua Lei (2021)

We propose a new estimator for the average causal effects of a binary treatment with panel data in settings with general treatment patterns. Our approach augments the two-way-fixed-effects specification with the unit-specific weights that arise from a model for the assignment mechanism. We show how to construct these weights in various settings, including situations where units opt into the treatment sequentially. The resulting estimator converges to an average (over units and time) treatment effect under the correct specification of the assignment model. We show that our estimator is more robust than the conventional two-way estimator: it remains consistent if either the assignment mechanism or the two-way regression model is correctly specified and performs better than the two-way-fixed-effect estimator if both are locally misspecified. This strong double robustness property quantifies the benefits from modeling the assignment process and motivates using our estimator in practice.

Econometrics, General Economics, Economics

Pricing Engine: Estimating Causal Impacts in Real World Business Settings

Matt Goldman, Brian Quistorff (2018)

We introduce the Pricing Engine package to enable the use of Double ML estimation techniques in general panel data settings. Customization allows the user to specify first-stage models, first-stage featurization, second-stage treatment selection and second-stage causal modeling. We also introduce a DynamicDML class that allows the user to generate dynamic treatment-aware forecasts at a range of leads and to understand how the forecasts will vary as a function of causally estimated treatment parameters. The Pricing Engine is built on Python 3.5 and can be run on an Azure ML Workbench environment with the addition of only a few Python packages. This note provides a high-level discussion of the Double ML method, describes the package's intended use and includes an example Jupyter notebook demonstrating application to some publicly available data. Installation instructions for the package and additional technical documentation are available at https://github.com/bquistorff/pricingengine.

Econometrics, Machine Learning