In this guide, we show how to perform constraint-based causal discovery using three popular software packages: pcalg (with add-ons tpc and micd), bnlearn, and TETRAD. We focus on how these packages can be used with observational data and in the presence of mixed data (i.e., data where some variables are continuous, while others are categorical), a known time ordering between variables, and missing data. Throughout, we point out the relative strengths and limitations of each package and give practical recommendations. We hope this guide helps anyone who is interested in performing constraint-based causal discovery on their data.
This paper introduces a unified framework of counterfactual estimation for time-series cross-sectional data, which estimates the average treatment effect on the treated by directly imputing treated counterfactuals. Examples include the fixed effects counterfactual estimator, the interactive fixed effects counterfactual estimator, and the matrix completion estimator. These estimators provide more reliable causal estimates than conventional two-way fixed effects models when treatment effects are heterogeneous or unobserved time-varying confounders exist. Under this framework, we propose a new dynamic treatment effects plot, as well as several diagnostic tests, to help researchers gauge the validity of the identifying assumptions. We illustrate these methods with two political economy examples and develop an open-source package, fect, in both R and Stata to facilitate implementation.
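The imputation idea behind the fixed effects counterfactual estimator can be illustrated with a minimal sketch. It assumes a balanced panel, additive unit and time effects, and no covariates; the function name `fe_counterfactual_att` and the simulated data are hypothetical and are not taken from the fect package. The model is fitted on untreated observations only, the untreated potential outcome is imputed for each treated observation, and the differences are averaged.

```python
import numpy as np

def fe_counterfactual_att(Y, D):
    """Impute Y(0) for treated cells from a two-way FE fit on untreated cells.

    Y, D: arrays of shape (units, periods); D is 1 where treated, else 0.
    Returns the estimated ATT and the per-cell effect estimates.
    """
    n_units, n_periods = Y.shape
    unit_idx, time_idx = np.indices(Y.shape)

    def design(u, t):
        # One dummy per unit and one per period (no covariates in this sketch).
        X = np.zeros((u.size, n_units + n_periods))
        rows = np.arange(u.size)
        X[rows, u] = 1.0
        X[rows, n_units + t] = 1.0
        return X

    control = D == 0
    treated = ~control
    # Fit on untreated observations only; lstsq copes with the collinear dummies.
    beta, *_ = np.linalg.lstsq(
        design(unit_idx[control], time_idx[control]), Y[control], rcond=None
    )
    # Imputed untreated potential outcomes for the treated observations.
    y0_hat = design(unit_idx[treated], time_idx[treated]) @ beta
    effects = Y[treated] - y0_hat
    return effects.mean(), effects

# Simulated example (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
N, T, T0 = 40, 10, 6                       # units 0..19 are treated from period 6 on
alpha = rng.normal(size=(N, 1))            # unit effects
xi = rng.normal(size=(1, T))               # time effects
D = np.zeros((N, T), dtype=int)
D[:20, T0:] = 1
tau = 1.0 + 0.5 * (np.arange(T) - T0 + 1)  # effect grows with time since treatment
Y = alpha + xi + D * tau + rng.normal(scale=0.1, size=(N, T))

att_hat, _ = fe_counterfactual_att(Y, D)
att_true = (D * tau)[D == 1].mean()
print(f"estimated ATT = {att_hat:.3f}, true ATT = {att_true:.3f}")
```

Because the treated observations never enter the fit, heterogeneous effects such as the growing `tau` above do not contaminate the estimated unit and time effects, which is the motivation the abstract gives for preferring imputation over a conventional two-way fixed effects regression.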
In this review, we present a simple guide for researchers to obtain pseudo-random samples with censored data. We focus our attention on the most common types of censoring: type I, type II, and random censoring. We discuss the necessary steps to sample pseudo-random values from long-term survival models, in which an additional cure fraction is included. For illustrative purposes, these techniques are applied to the Weibull distribution. The algorithms and R code are presented, enabling the reproducibility of our study.
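The three censoring schemes can be sketched as follows. This is an illustrative Python version rather than the R code from the review; the function names, the exponential censoring distribution used for random censoring, and all parameter values are assumptions made for the example. The cure fraction is modelled as a simple mixture in which cured units never fail.

```python
import numpy as np

def sample_weibull(n, shape, scale, rng, cure_prob=0.0):
    """Inverse-transform Weibull draws; cured units get an infinite failure time."""
    u = rng.uniform(size=n)                       # u lies in [0, 1)
    t = scale * (-np.log1p(-u)) ** (1.0 / shape)  # solves 1 - exp(-(t/scale)**shape) = u
    cured = rng.uniform(size=n) < cure_prob
    t[cured] = np.inf
    return t

def censor(t, c):
    """Return observed times min(t, c) and event indicators (1 = failure observed)."""
    return np.minimum(t, c), (t <= c).astype(int)

def type_i(t, c_fixed):
    """Type I: follow-up stops at a fixed, pre-specified time."""
    return censor(t, c_fixed)

def type_ii(t, r):
    """Type II: follow-up stops at the r-th observed failure."""
    c = np.sort(t)[r - 1]
    if not np.isfinite(c):
        raise ValueError("r exceeds the number of units that can fail")
    return censor(t, c)

def random_censoring(t, rate, rng):
    """Random censoring with exponential censoring times (an assumed choice)."""
    c = rng.exponential(scale=1.0 / rate, size=t.size)
    return censor(t, c)

rng = np.random.default_rng(2024)
t = sample_weibull(500, shape=1.5, scale=2.0, rng=rng)
obs1, d1 = type_i(t, 3.0)
obs2, d2 = type_ii(t, 300)
obs3, d3 = random_censoring(t, 0.2, rng)

# Long-term survivors: 30% cured, type I censoring at time 10.
t_cure = sample_weibull(500, shape=1.5, scale=2.0, rng=rng, cure_prob=0.3)
obs4, d4 = type_i(t_cure, 10.0)
print(d1.mean(), d2.sum(), d3.mean(), d4.mean())
```

With a cure fraction, the proportion of observed failures stays below one no matter how long follow-up lasts, which is the feature that distinguishes long-term survival models from the standard setting.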
Causal discovery algorithms estimate causal graphs from observational data. This can provide a valuable complement to analyses focussing on the causal relation between individual treatment-outcome pairs. Constraint-based causal discovery algorithms rely on conditional independence testing when building the graph. Until recently, these algorithms have been unable to handle missing values. In this paper, we investigate two alternative solutions: test-wise deletion and multiple imputation. We establish necessary and sufficient conditions for the recoverability of causal structures under test-wise deletion, and argue that multiple imputation is more challenging in the context of causal discovery than for estimation. We conduct an extensive comparison by simulating from benchmark causal graphs. As one might expect, we find that test-wise deletion and multiple imputation both clearly outperform list-wise deletion and single imputation. Crucially, our results further suggest that multiple imputation is especially useful in settings with a small number of either Gaussian or discrete variables, but when the dataset contains a mix of both, neither method is uniformly best. The methods we compare include random forest imputation and a hybrid procedure combining test-wise deletion and multiple imputation. An application to data from the IDEFICS cohort study on diet- and lifestyle-related diseases in European children serves as an illustrative example.
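Test-wise deletion can be illustrated with a small sketch: each conditional independence test uses every row that is complete in the variables involved in that particular test, instead of discarding all rows with any missing value. The example below assumes Gaussian data and a Fisher z test of partial correlation; the function name `fisher_z_testwise` and the simulated chain X → Z → Y are hypothetical and do not reproduce the paper's implementation.

```python
import math
import numpy as np

def fisher_z_testwise(data, x, y, z=()):
    """Fisher z test of X independent of Y given Z under test-wise deletion.

    data: 2-D float array with np.nan marking missing values.
    x, y: column indices; z: tuple of conditioning column indices.
    Returns the two-sided p-value and the number of rows used.
    """
    sub = data[:, [x, y, *z]]
    sub = sub[~np.isnan(sub).any(axis=1)]   # keep rows complete in these columns only
    n = sub.shape[0]
    prec = np.linalg.inv(np.corrcoef(sub, rowvar=False))
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])   # partial correlation
    r = min(max(r, -0.999999), 0.999999)                   # keep atanh finite
    stat = math.sqrt(n - len(z) - 3) * abs(math.atanh(r))
    return math.erfc(stat / math.sqrt(2)), n               # two-sided normal p-value

# Simulated chain X -> Z -> Y plus an unrelated variable W (illustrative values).
rng = np.random.default_rng(1)
n_rows = 2000
X = rng.normal(size=n_rows)
Z = X + rng.normal(size=n_rows)
Y = Z + rng.normal(size=n_rows)
W = rng.normal(size=n_rows)
data = np.column_stack([X, Y, Z, W])
data[rng.random(data.shape) < 0.1] = np.nan   # 10% of entries missing completely at random

n_listwise = int((~np.isnan(data).any(axis=1)).sum())
p_marginal, n_marginal = fisher_z_testwise(data, 0, 1)
p_conditional, n_conditional = fisher_z_testwise(data, 0, 1, z=(2,))
print(n_listwise, n_marginal, n_conditional, p_marginal, p_conditional)
```

Each test retains at least as many rows as list-wise deletion would, and the sample size varies from test to test, which is why the degrees of freedom are computed from the rows actually used.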
Adaptive designs for clinical trials permit alterations to a study in response to accumulating data in order to make trials more flexible, ethical and efficient. These benefits are achieved while preserving the integrity and validity of the trial, through the pre-specification and proper adjustment for the possible alterations during the course of the trial. Despite much research in the statistical literature highlighting the potential advantages of adaptive designs over traditional fixed designs, the uptake of such methods in clinical research has been slow. One major reason for this is that different adaptations to trial designs, as well as their advantages and limitations, remain unfamiliar to large parts of the clinical community. The aim of this paper is to clarify where adaptive designs can be used to address specific questions of scientific interest; we introduce the main features of adaptive designs and commonly used terminology, highlighting their utility and pitfalls, and illustrate their use through case studies of adaptive trials ranging from early-phase dose escalation to confirmatory Phase III studies.
Multi-image alignment, bringing a group of images into common register, is a ubiquitous problem and the first step of many applications in a wide variety of domains. As a result, a great amount of effort is being invested in developing efficient multi-image alignment algorithms. Little has been done, however, to answer fundamental practical questions such as: What is the comparative performance of existing methods? Is there still room for improvement? Under which conditions should one technique be preferred over another? Does adding more images or prior image information improve the registration results? In this work, we present a thorough analysis and evaluation of the main multi-image alignment methods which, combined with theoretical limits on multi-image alignment performance, allows us to organize them under a common framework and provide practical answers to these essential questions.