ترغب بنشر مسار تعليمي؟ اضغط هنا

On Inductive Biases for Heterogeneous Treatment Effect Estimation

71   0   0.0 ( 0 )
 نشر من قبل Alicia Curth
 تاريخ النشر 2021
والبحث باللغة English




اسأل ChatGPT حول البحث

We investigate how to exploit structural similarities of an individuals potential outcomes (POs) under different treatments to obtain better estimates of conditional average treatment effects in finite samples. Especially when it is unknown whether a treatment has an effect at all, it is natural to hypothesize that the POs are similar - yet, some existing strategies for treatment effect estimation employ regularization schemes that implicitly encourage heterogeneity even when it does not exist and fail to fully make use of shared structure. In this paper, we investigate and compare three end-to-end learning strategies to overcome this problem - based on regularization, reparametrization and a flexible multi-task architecture - each encoding inductive bias favoring shared behavior across POs. To build understanding of their relative strengths, we implement all strategies using neural networks and conduct a wide range of semi-synthetic experiments. We observe that all three approaches can lead to substantial improvements upon numerous baselines and gain insight into performance differences across various experimental settings.



قيم البحث

اقرأ أيضاً

Federated learning is an appealing framework for analyzing sensitive data from distributed health data networks. Under this framework, data partners at local sites collaboratively build an analytical model under the orchestration of a coordinating si te, while keeping the data decentralized. While integrating information from multiple sources may boost statistical efficiency, existing federated learning methods mainly assume data across sites are homogeneous samples of the global population, failing to properly account for the extra variability across sites in estimation and inference. Drawing on a multi-hospital electronic health records network, we develop an efficient and interpretable tree-based ensemble of personalized treatment effect estimators to join results across hospital sites, while actively modeling for the heterogeneity in data sources through site partitioning. The efficiency of this approach is demonstrated by a study of causal effects of oxygen saturation on hospital mortality and backed up by comprehensive numerical results.
The goal of many scientific experiments including A/B testing is to estimate the average treatment effect (ATE), which is defined as the difference between the expected outcomes of two or more treatments. In this paper, we consider a situation where an experimenter can assign a treatment to research subjects sequentially. In adaptive experimental design, the experimenter is allowed to change the probability of assigning a treatment using past observations for estimating the ATE efficiently. However, with this approach, it is difficult to apply a standard statistical method to construct an estimator because the observations are not independent and identically distributed. We thus propose an algorithm for efficient experiments with estimators constructed from dependent samples. We also introduce a sequential testing framework using the proposed estimator. To justify our proposed approach, we provide finite and infinite sample analyses. Finally, we experimentally show that the proposed algorithm exhibits preferable performance.
It is important to estimate the local average treatment effect (LATE) when compliance with a treatment assignment is incomplete. The previously proposed methods for LATE estimation required all relevant variables to be jointly observed in a single da taset; however, it is sometimes difficult or even impossible to collect such data in many real-world problems for technical or privacy reasons. We consider a novel problem setting in which LATE, as a function of covariates, is nonparametrically identified from the combination of separately observed datasets. For estimation, we show that the direct least squares method, which was originally developed for estimating the average treatment effect under complete compliance, is applicable to our setting. However, model selection and hyperparameter tuning for the direct least squares estimator can be unstable in practice since it is defined as a solution to the minimax problem. We then propose a weighted least squares estimator that enables simpler model selection by avoiding the minimax objective formulation. Unlike the inverse probability weighted (IPW) estimator, the proposed estimator directly uses the pre-estimated weight without inversion, avoiding the problems caused by the IPW methods. We demonstrate the effectiveness of our method through experiments using synthetic and real-world datasets.
Borrowing from the transformer models that revolutionized the field of natural language processing, self-supervised feature learning for visual tasks has also seen state-of-the-art success using these extremely deep, isotropic networks. However, the typical AI researcher does not have the resources to evaluate, let alone train, a model with several billion parameters and quadratic self-attention activations. To facilitate further research, it is necessary to understand the features of these huge transformer models that can be adequately studied by the typical researcher. One interesting characteristic of these transformer models is that they remove most of the inductive biases present in classical convolutional networks. In this work, we analyze the effect of these and more inductive biases on small to moderately-sized isotropic networks used for unsupervised visual feature learning and show that their removal is not always ideal.
This paper provides estimation and inference methods for a large number of heterogeneous treatment effects in the presence of an even larger number of controls and unobserved unit heterogeneity. In our main example, the vector of heterogeneous treatm ents is generated by interacting the base treatment variable with a subset of controls. We first estimate the unit-specific expectation functions of the outcome and each treatment interaction conditional on controls and take the residuals. Second, we report the Lasso (L1-regularized least squares) estimate of the heterogeneous treatment effect parameter, regressing the outcome residual on the vector of treatment ones. We debias the Lasso estimator to conduct simultaneous inference on the target parameter by Gaussian bootstrap. We account for the unobserved unit heterogeneity by projecting it onto the time-invariant covariates, following the correlated random effects approach of Mundlak (1978) and Chamberlain (1982). We demonstrate our method by estimating price elasticities of groceries based on scanner data.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا