Researchers are often interested in treatment effects on outcomes that are only defined conditional on a post-treatment event status. For example, in a study of the effect of different cancer treatments on quality of life at end of follow-up, the quality of life of individuals who die during the study is undefined. In these settings, a naive contrast of outcomes conditional on the post-treatment variable is not an average causal effect, even in a randomized experiment. Therefore, the effect in the principal stratum of those who would have the same value of the post-treatment variable regardless of treatment, such as the always-survivors in a truncation-by-death setting, is often advocated for causal inference. While this principal stratum effect is a well-defined causal contrast, it is often hard to justify that it is relevant to scientists, patients or policy makers, and it cannot be identified without relying on unfalsifiable assumptions. Here we formulate alternative estimands, the conditional separable effects, that have a natural causal interpretation under assumptions that can be falsified in a randomized experiment. We provide identification results and introduce different estimators, including a doubly robust estimator derived from the nonparametric influence function. As an illustration, we estimate a conditional separable effect of chemotherapies on quality of life in patients with prostate cancer, using data from a randomized clinical trial.
In competing event settings, a counterfactual contrast of cause-specific cumulative incidences quantifies the total causal effect of a treatment on the event of interest. However, effects of treatment on the competing event may indirectly contribute to this total effect, complicating its interpretation. We previously proposed the separable effects (Stensrud et al., 2019) to define direct and indirect effects of the treatment on the event of interest. This definition presupposes a treatment decomposition into two components acting along two separate causal pathways, one exclusively outside of the competing event and the other exclusively through it. Unlike previous definitions of direct and indirect effects, the separable effects can be subject to empirical scrutiny in a study where separate interventions on the treatment components are available. Here we extend and generalize the notion of the separable effects in several ways, allowing for interpretation, identification and estimation under considerably weaker assumptions. We propose and discuss a definition of separable effects that is applicable to general time-varying structures, where the separable effects can still be meaningfully interpreted, even when they cannot be regarded as direct and indirect effects. We further derive weaker conditions for identification of separable effects in observational studies where decomposed treatments are not yet available; in particular, these conditions allow for time-varying common causes of the event of interest, the competing events and loss to follow-up. For these general settings, we propose semi-parametric weighted estimators that are straightforward to implement. As an illustration, we apply the estimators to study the separable effects of intensive blood pressure therapy on acute kidney injury, using data from a randomized clinical trial.
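As a rough illustration of the total effect described in the abstract above, the following sketch computes cause-specific cumulative incidences from discrete-time hazards and contrasts them between two arms. The hazard values, the function name `cumulative_incidence`, and the convention that the competing event is evaluated before the event of interest within each interval are assumptions made for this example; none of them is taken from the paper.

```python
import numpy as np

def cumulative_incidence(h_event, h_competing):
    """Cumulative incidence of the event of interest in discrete time.

    h_event[k]: hazard of the event of interest in interval k, given no
        earlier event and no competing event in interval k.
    h_competing[k]: hazard of the competing event in interval k, given no
        earlier event.
    Assumed ordering within an interval: competing event first.
    """
    h_event = np.asarray(h_event, dtype=float)
    h_competing = np.asarray(h_competing, dtype=float)
    # Probability of staying free of both events through each interval.
    stay_free = (1.0 - h_event) * (1.0 - h_competing)
    # Probability of being event-free at the start of each interval.
    at_risk = np.concatenate(([1.0], np.cumprod(stay_free)[:-1]))
    # Probability that the event of interest occurs in each interval.
    increments = at_risk * (1.0 - h_competing) * h_event
    return np.cumsum(increments)

# Hypothetical hazards for the two arms of a randomized trial.
cif_treated = cumulative_incidence([0.05, 0.05, 0.05], [0.10, 0.10, 0.10])
cif_control = cumulative_incidence([0.10, 0.10, 0.10], [0.20, 0.20, 0.20])
total_effect = cif_treated - cif_control  # contrast at each time point
```

With the hazard of the event of interest held fixed, lowering the competing-event hazard alone raises the cumulative incidence computed here, which is one way to see why the total effect mixes both pathways.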
We propose a general new method, the conditional permutation test, for testing the conditional independence of variables $X$ and $Y$ given a potentially high-dimensional random vector $Z$ that may contain confounding factors. The proposed test permutes entries of $X$ non-uniformly, so as to respect the existing dependence between $X$ and $Z$ and thus account for the presence of these confounders. Like the conditional randomization test of Candès et al. (2018), our test relies on the availability of an approximation to the distribution of $X \mid Z$. While the test of Candès et al. (2018) uses this estimate to draw new $X$ values, for our test we use this approximation to design an appropriate non-uniform distribution on permutations of the $X$ values already seen in the true data. We provide an efficient Markov chain Monte Carlo sampler for the implementation of our method, and establish bounds on the Type I error in terms of the error in the approximation of the conditional distribution of $X \mid Z$, finding that, for the worst-case test statistic, the inflation in Type I error of the conditional permutation test is no larger than that of the conditional randomization test. We validate these theoretical results with experiments on simulated data and on the Capital Bikeshare data set.
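The idea of permuting $X$ non-uniformly can be sketched with a simple pairwise-swap Markov chain. This is a minimal illustration under stated assumptions, not the paper's implementation: the conditional model $X \mid Z \sim N(Z, 1)$ is assumed known, the swap probability and the "hub" scheme for generating copies are choices made for this sketch, and the names `log_q`, `pairwise_swap_step`, `abs_corr` and `cpt_pvalue` are hypothetical.

```python
import numpy as np

def log_q(x, z):
    """Log density (up to a constant) of the assumed model X | Z ~ N(z, 1)."""
    return -0.5 * (x - z) ** 2

def pairwise_swap_step(x, z, rng):
    """Pair indices at random and swap X-values within each pair.

    A pair (i, j) is swapped with probability proportional to the
    likelihood of the swapped configuration under the assumed model.
    """
    n = len(x)
    order = rng.permutation(n)
    half = n // 2
    i, j = order[:half], order[half:2 * half]
    log_odds = (log_q(x[j], z[i]) + log_q(x[i], z[j])
                - log_q(x[i], z[i]) - log_q(x[j], z[j]))
    p_swap = 1.0 / (1.0 + np.exp(-np.clip(log_odds, -50.0, 50.0)))
    swap = rng.random(half) < p_swap
    x_new = x.copy()
    x_new[i[swap]] = x[j[swap]]
    x_new[j[swap]] = x[i[swap]]
    return x_new

def abs_corr(x, y):
    """Illustrative test statistic: absolute sample correlation."""
    return abs(np.corrcoef(x, y)[0, 1])

def cpt_pvalue(x, y, z, n_copies=200, n_steps=30, seed=0):
    """Permutation p-value from copies of X generated by the swap chain."""
    rng = np.random.default_rng(seed)

    def run_chain(start):
        current = start
        for _ in range(n_steps):
            current = pairwise_swap_step(current, z, rng)
        return current

    hub = run_chain(x)  # intermediate state shared by all copies
    t_obs = abs_corr(x, y)
    t_perm = np.array([abs_corr(run_chain(hub), y) for _ in range(n_copies)])
    return (1 + np.sum(t_perm >= t_obs)) / (1 + n_copies)

# Simulated data: Y depends on X given Z only in the second case.
rng = np.random.default_rng(42)
z = rng.normal(size=300)
x = z + rng.normal(size=300)
y_null = z + rng.normal(size=300)
y_alt = z + 2.0 * x + rng.normal(size=300)
p_null = cpt_pvalue(x, y_null, z)
p_alt = cpt_pvalue(x, y_alt, z)
```

Every copy is a permutation of the observed $X$ values, so the marginal distribution of $X$ is preserved exactly, while the swap probabilities keep the copies compatible with the assumed dependence of $X$ on $Z$.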
In observational studies, balancing covariates in different treatment groups is essential to estimate treatment effects. One of the most commonly used methods for such purposes is weighting. The performance of this class of methods usually depends on strong regularity conditions for the underlying model, which might not hold in practice. In this paper, we investigate weighting methods from a functional estimation perspective and argue that the weights needed for covariate balancing could differ from those needed for treatment effect estimation under low regularity conditions. Motivated by this observation, we introduce a new weighting framework that directly targets treatment effect estimation. Unlike existing methods, the resulting estimator for a treatment effect under this new framework is a simple kernel-based $U$-statistic after applying a data-driven transformation to the observed covariates. We characterize the theoretical properties of the new estimators of treatment effects under a nonparametric setting and show that they are able to work robustly under low regularity conditions. The new framework is also applied to several numerical examples to demonstrate its practical merits.
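For readers unfamiliar with weighting, the sketch below shows the standard normalized inverse probability weighting estimator that the abstract above takes as its starting point. It is not the kernel-based $U$-statistic proposed in the paper. The simulated data, the assumption that the propensity score is known, and the function name `hajek_contrast` are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                     # a single confounder
propensity = 1.0 / (1.0 + np.exp(-x))      # assumed known P(A = 1 | X)
a = rng.binomial(1, propensity)            # treatment indicator
y = 2.0 * a + x + rng.normal(size=n)       # true average effect is 2

def hajek_contrast(v, a, e):
    """Difference in normalized inverse-probability-weighted means of v."""
    w1 = a / e
    w0 = (1 - a) / (1 - e)
    return np.sum(w1 * v) / np.sum(w1) - np.sum(w0 * v) / np.sum(w0)

naive = y[a == 1].mean() - y[a == 0].mean()   # confounded comparison
ipw = hajek_contrast(y, a, propensity)        # weighted effect estimate
balance = hajek_contrast(x, a, propensity)    # weighted covariate imbalance
```

Applying the same weights to the covariate (`balance`) and to the outcome (`ipw`) makes the link between covariate balancing and effect estimation explicit; the abstract argues that the best weights for these two tasks can differ under low regularity.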
Modeling of longitudinal data often requires diffusion models that incorporate overall time-dependent, nonlinear dynamics of multiple components and provide sufficient flexibility for subject-specific modeling. This complexity challenges parameter inference and approximations are inevitable. We propose a method for approximate maximum-likelihood parameter estimation in multivariate time-inhomogeneous diffusions, where subject-specific flexibility is accounted for by incorporation of multidimensional mixed effects and covariates. We consider $N$ multidimensional independent diffusions $X^i = (X^i_t)_{0 \leq t \leq T^i}$, $1 \leq i \leq N$, with common overall model structure and unknown fixed-effects parameter $\mu$. Their dynamics differ by the subject-specific random effect $\phi^i$ in the drift and possibly by (known) covariate information, different initial conditions and observation times and duration. The distribution of $\phi^i$ is parametrized by an unknown $\vartheta$ and $\theta = (\mu, \vartheta)$ is the target of statistical inference. Its maximum likelihood estimator is derived from the continuous-time likelihood. We prove consistency and asymptotic normality of $\hat{\theta}_N$ when the number $N$ of subjects goes to infinity using standard techniques and consider the more general concept of local asymptotic normality for less regular models. The bias induced by time-discretization of sufficient statistics is investigated. We discuss verification of conditions and investigate parameter estimation and hypothesis testing in simulations.
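To make the setting above concrete, the following sketch simulates $N$ independent one-dimensional diffusions with a subject-specific random effect in the drift and computes time-discretized versions of the integrals that enter the continuous-time likelihood. The specific model $dX^i_t = -(\mu + \phi^i) X^i_t\,dt + \sigma\,dW^i_t$ with $\phi^i \sim N(0, \omega^2)$, the Euler-Maruyama scheme, and all function and parameter names are assumptions of this example. The paper's setting is multivariate and more general, and the per-subject estimate at the end is not the paper's estimator of $\theta$.

```python
import numpy as np

def simulate_subjects(n_subjects, mu, omega, sigma, t_end, dt, x0, rng):
    """Euler-Maruyama paths of dX = -(mu + phi_i) X dt + sigma dW."""
    n_steps = int(round(t_end / dt))
    phi = rng.normal(0.0, omega, size=n_subjects)   # random effects
    x = np.empty((n_subjects, n_steps + 1))
    x[:, 0] = x0
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_subjects)
        x[:, k + 1] = x[:, k] - (mu + phi) * x[:, k] * dt + sigma * dw
    return x, phi

def discretized_statistics(x, dt):
    """Riemann-Ito sums approximating the integrals of X dX and X^2 dt."""
    dx = np.diff(x, axis=1)
    u = np.sum(x[:, :-1] * dx, axis=1)
    v = np.sum(x[:, :-1] ** 2, axis=1) * dt
    return u, v

rng = np.random.default_rng(1)
paths, phi = simulate_subjects(
    n_subjects=100, mu=1.0, omega=0.2, sigma=1.0,
    t_end=20.0, dt=0.01, x0=1.0, rng=rng,
)
u, v = discretized_statistics(paths, dt=0.01)
# Per-subject maximizer of the continuous-time likelihood in the drift rate.
rate_hat = -u / v
```

Recomputing `u` and `v` on a coarser time grid, for example by subsampling `paths`, gives a quick impression of how discretizing the sufficient statistics can affect the estimates.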
Causal effect sizes may vary among individuals and can even be in opposite directions. When there is serious effect heterogeneity, the population average causal effect (ACE) is not very informative. It is well known that individual causal effects (ICEs) cannot be determined in cross-sectional studies, but we will show that ICEs can be retrieved from longitudinal data under certain conditions. We will present a general framework for individual causality in which effect heterogeneity is viewed as an individual-specific effect modification that can be parameterized with a latent variable, the receptiveness factor. The distribution of the receptiveness factor can be retrieved, and it will enable us to study the contrast of the potential outcomes of an individual under stationarity assumptions. Within the framework, we will study the joint distribution of the individual's potential outcomes conditioned on all of the individual's factual data and subsequently the distribution of the cross-world causal effect (CWCE). We discuss conditions under which the latter converges to a degenerate distribution, in which case the ICE can be estimated consistently. To demonstrate the use of this general framework, we present examples in which the outcome process can be parameterized as a (generalized) linear mixed model.