Measuring the incremental Return On Ad Spend (iROAS) is a fundamental problem for the online advertising industry. A standard modern tool is the randomized geo experiment, in which the experimental units are non-overlapping ad-targetable geographical areas (Vaver & Koehler 2011). However, designing a reliable and cost-effective geo experiment can be complicated for several reasons: 1) the number of geos is often small, 2) the response metric (e.g. revenue) can be very heavy-tailed across geos due to geo heterogeneity, and 3) the response metric can vary dramatically over time. To address these issues, we propose a robust nonparametric design method, called Trimmed Match Design (TMD), which extends the idea of Trimmed Match (Chen & Au 2019) and integrates the techniques of optimal subset pairing and sample splitting in a novel and systematic manner. Simulation studies and real case studies are presented. We also point out a few open problems for future research.
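As a rough illustration of the trimming idea behind Trimmed Match, the following minimal sketch estimates a ratio from paired geo differences after discarding the most extreme residuals. It is not the authors' implementation and does not cover the design steps (subset pairing, sample splitting). The function name `trimmed_ratio`, its parameters, and the fixed-point iteration are assumptions made for illustration only.

```python
import numpy as np

def trimmed_ratio(resp_diff, spend_diff, trim=1, iters=100):
    """Hypothetical sketch: ratio estimate with symmetric residual trimming.

    resp_diff[i]  : response difference (treatment minus control) in pair i
    spend_diff[i] : ad spend difference (treatment minus control) in pair i
    trim          : number of pairs dropped from each tail of the residuals
    """
    x = np.asarray(resp_diff, dtype=float)
    y = np.asarray(spend_diff, dtype=float)
    theta = x.sum() / y.sum()  # untrimmed starting value
    for _ in range(iters):
        resid = x - theta * y               # residuals under the current estimate
        order = np.argsort(resid)
        keep = order[trim:len(x) - trim]    # drop the `trim` smallest and largest
        new_theta = x[keep].sum() / y[keep].sum()
        if np.isclose(new_theta, theta):    # fixed point: trimmed residual sum is zero
            break
        theta = new_theta
    return theta

# Toy data (made up): true ratio is 2, and one pair carries a large outlier.
spend = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
resp = 2.0 * spend
resp[5] += 100.0
estimate = trimmed_ratio(resp, spend, trim=1)
```

On this toy input the untrimmed ratio is pulled far above 2 by the single outlying pair, while the trimmed version recovers the underlying ratio. The iteration is a simple heuristic and is not guaranteed to converge on arbitrary data.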
Instrumental variable methods are widely used in medical and social science research to draw causal conclusions when the treatment and outcome are confounded by unmeasured variables. One important feature of such studies is that the instrumental variable is often applied at the cluster level, e.g., a hospital's or physician's preference for a certain treatment, where each hospital or physician naturally defines a cluster. This paper proposes to embed such observational instrumental variable data into a cluster-randomized encouragement experiment using statistical matching. Potential outcomes and causal assumptions underpinning the design are formalized and examined. Testing procedures for two commonly used estimands, Fisher's sharp null hypothesis and the pooled effect ratio, are extended to the current setting. We then introduce a novel cluster-heterogeneous proportional treatment effect model and the relevant estimand: the average cluster effect ratio. This new estimand has an advantage over the structural parameter in a constant proportional treatment effect model in that it allows treatment heterogeneity, and an advantage over the pooled effect ratio estimand in that it is immune to Simpson's paradox. We develop an asymptotically valid randomization-based testing procedure for this new estimand based on solving a mixed integer quadratically-constrained optimization problem. The proposed design and inferential methods are applied to a study of the effect of using transesophageal echocardiography during CABG surgery on patients' 30-day mortality rate.
Treatment switching in a randomized controlled trial is said to occur when a patient randomized to one treatment arm switches to another treatment arm during follow-up. This can occur at the point of disease progression, when patients in the control arm may be offered the experimental treatment. It is widely known that failure to account for treatment switching can seriously dilute the estimated effect of treatment on overall survival. In this paper, we aim to account for the potential impact of treatment switching in a re-analysis evaluating the treatment effect of Nucleoside Reverse Transcriptase Inhibitors (NRTIs) on a safety outcome (time to first severe or worse sign or symptom) in participants receiving a new antiretroviral regimen that either included or omitted NRTIs in the Optimized Treatment That Includes or Omits NRTIs (OPTIONS) trial. We propose an estimator of a causal treatment effect under a structural cumulative survival model (SCSM) that leverages randomization as an instrumental variable to account for selective treatment switching. Unlike Robins' accelerated failure time model, which is often used to address treatment switching, the proposed approach avoids the need for artificial censoring in estimation. We establish that the proposed estimator is uniformly consistent and asymptotically Gaussian under standard regularity conditions. A consistent variance estimator is also given, and a simple resampling approach provides uniform confidence bands for the causal difference comparing treatment groups over time on the cumulative intensity scale. We develop an R package named ivsacim implementing all proposed methods, freely available from CRAN. We examine the finite-sample performance of the estimator via extensive simulations.
We review recent literature that proposes to adapt ideas from classical model-based optimal design of experiments to problems of data selection from large datasets. Special attention is given to bias reduction and to protection against confounders. Some new results are presented. Theoretical and computational comparisons are made.
Motivation: Although principal component analysis is frequently applied to reduce the dimensionality of matrix data, the method is sensitive to noise and bias and has difficulty with comparability and interpretation. These issues are addressed by improving the fidelity to the study design. Principal axes and the components for variables are found through the arrangement of the training data set, and the centers of data are found according to the design. By using both the axes and the center, components for observations belonging to various studies can be estimated separately. The components for both variables and observations are scaled to unit length, which enables relationships between them to be seen. Results: Analyses in transcriptome studies showed an improvement in the separation of experimental groups and in robustness to bias and noise. Unknown samples were appropriately classified on predetermined axes. These axes reflected the study design well, which facilitated interpretation. Together, the introduced concepts resulted in improved generality and objectivity in the analytical results, with the ability to locate hidden structures in the data.
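The separation of axes and center described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the design-based center is the mean of a designated control group, and the function names `fit_axes` and `project` are hypothetical. The unit-length scaling of components mentioned in the abstract is not reproduced here.

```python
import numpy as np

def fit_axes(train, center):
    """Principal axes of the training data around a design-specified center.

    train  : array of shape (samples, variables)
    center : array of shape (variables,), chosen from the study design
             (assumed here to be a control-group mean)
    """
    centered = train - center
    # Rows of vt are orthonormal principal axes, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt

def project(samples, center, axes, k=2):
    """Components of (possibly unseen) samples on the first k fixed axes."""
    return (samples - center) @ axes[:k].T

# Toy data (made up): the first two rows play the role of the control group.
train = np.array([[1.0, 2.0, 3.0],
                  [2.0, 1.0, 0.0],
                  [4.0, 5.0, 6.0],
                  [3.0, 3.0, 1.0]])
center = train[:2].mean(axis=0)
axes = fit_axes(train, center)

# A sample from another study is placed on the same predetermined axes.
new_sample = np.array([[2.0, 2.0, 2.0]])
scores = project(new_sample, center, axes)
```

Because the axes are fixed after training and the center is supplied separately, samples from a different study could in principle be projected with that study's own center while reusing the same axes.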
In paired comparison experiments, respondents usually evaluate pairs of competing options. For this situation, we introduce an appropriate model and derive optimal designs in the presence of second-order interactions when all attributes are dichotomous.