Recent developments in data-driven decision science have brought great advances in individualized decision making. Given data with individual covariates, treatment assignments and outcomes, researchers can search for the optimal individualized treatment rule (ITR) that maximizes the expected outcome. Existing methods typically require initial estimation of some nuisance models. The double robustness property, which protects against misspecification of either the treatment-free effect or the propensity score, has been widely advocated. However, when model misspecification exists, a doubly robust estimate can be consistent but may suffer from reduced efficiency. Beyond potentially misspecified nuisance models, most existing methods do not account for the case in which the variance of the outcome is heterogeneous across covariates and treatments. We observe that such heteroscedasticity can greatly affect the estimation efficiency of the optimal ITR. In this paper, we demonstrate that the consequences of a misspecified treatment-free effect and of heteroscedasticity can be unified as a covariate-treatment dependent variance of residuals. To improve the efficiency of the estimated ITR, we propose an Efficient Learning (E-Learning) framework for finding an optimal ITR in the multi-armed treatment setting. We show that the proposed E-Learning is optimal among a regular class of semiparametric estimates that allow treatment-free effect misspecification. In our simulation study, E-Learning demonstrates its effectiveness when treatment-free effect misspecification, heteroscedasticity, or both are present. Our analysis of a Type 2 Diabetes Mellitus (T2DM) observational study also suggests the improved efficiency of E-Learning.
Recent developments in data-driven decision science have brought great advances in individualized decision making. Given data with individual covariates, treatment assignments and outcomes, policy makers seek the best individualized treatment rule (ITR) that maximizes the expected outcome, known as the value function. Many existing methods assume that the training and testing distributions are the same. However, the estimated optimal ITR may generalize poorly when the training and testing distributions are not identical. In this paper, we consider the problem of finding an optimal ITR from a restricted ITR class when there are unknown covariate changes between the training and testing distributions. We propose a novel distributionally robust ITR (DR-ITR) framework that maximizes the worst-case value function over a set of underlying distributions that are close to the training distribution. The resulting DR-ITR can guarantee reasonably good performance across all such distributions. We further propose a calibrating procedure that tunes the DR-ITR adaptively to a small amount of calibration data from a target population. In our numerical studies, the calibrated DR-ITR shows better generalizability than the standard ITR.
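As a hedged sketch, the worst-case criterion described in this abstract can be written in the following generic max-min form. The notation is assumed here for illustration and is not taken from the abstract: $\mathcal{D}$ is the restricted ITR class, $P_0$ is the training distribution, $\mathcal{P}_c(P_0)$ is a set of distributions within some distance $c$ of $P_0$, and $Y(d(X))$ is the outcome under the treatment recommended by rule $d$.

```latex
\[
  d^{\mathrm{DR}} \in \arg\max_{d \in \mathcal{D}} \;
  \inf_{P \in \mathcal{P}_c(P_0)} \;
  \mathbb{E}_{P}\bigl[\, Y\bigl(d(X)\bigr) \,\bigr]
\]
```

Taking $c = 0$ so that $\mathcal{P}_c(P_0) = \{P_0\}$ recovers the usual value-maximizing ITR on the training distribution; how the distance and the set are actually defined is not stated in the abstract.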
With a large number of baseline covariates, we propose a new semi-parametric modeling strategy for heterogeneous treatment effect estimation and individualized treatment selection, which are two major goals in personalized medicine. We achieve the first goal by estimating a covariate-specific treatment effect (CSTE) curve modeled as an unknown function of a weighted linear combination of all baseline covariates. The weight, or coefficient, for each covariate is estimated by fitting a sparse semi-parametric logistic single-index coefficient model. The CSTE curve is estimated by a spline-backfitted kernel procedure, which enables us to further construct a simultaneous confidence band (SCB) for the CSTE curve at a desired confidence level. Based on the SCB, we find the subgroups of patients that benefit from each treatment, so that we can make individualized treatment selections. The innovations of the proposed method are three-fold. First, the proposed method can quantify the variability associated with the estimated optimal individualized treatment rule with high-dimensional covariates. Second, the proposed method can depict both local and global associations between the treatment and baseline covariates in the presence of high-dimensional covariates, and thus it remains flexible while achieving dimensionality reduction. Third, the SCB achieves the nominal confidence level asymptotically, and it provides a uniform inferential tool for making individualized treatment decisions.
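The band-based selection step described in this abstract can be illustrated with a minimal sketch. It assumes the lower and upper limits of the simultaneous confidence band have already been computed on a grid of index values, and that a positive CSTE favors the treatment arm; the function name, labels, and sign convention are hypothetical and not taken from the abstract.

```python
from typing import List, Sequence


def select_by_band(lower: Sequence[float], upper: Sequence[float]) -> List[str]:
    """Classify each grid point from a simultaneous confidence band.

    Assumption (not from the source): a positive treatment effect
    favors the treatment arm and a negative one favors control.
    """
    if len(lower) != len(upper):
        raise ValueError("lower and upper must have the same length")
    decisions = []
    for lo, hi in zip(lower, upper):
        if lo > hi:
            raise ValueError("each lower limit must not exceed its upper limit")
        if lo > 0:
            # Whole band above zero: effect is positive at this point.
            decisions.append("treatment")
        elif hi < 0:
            # Whole band below zero: effect is negative at this point.
            decisions.append("control")
        else:
            # Band covers zero: the data do not separate the two arms.
            decisions.append("undetermined")
    return decisions


if __name__ == "__main__":
    # Hypothetical band limits at three index values.
    print(select_by_band([0.2, -0.5, -0.1], [0.9, -0.1, 0.3]))
```

Because the band is simultaneous, the classifications across all grid points hold jointly at the chosen confidence level, which is what makes subgroup statements of this kind defensible.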
Precision medicine is an emerging scientific topic for disease treatment and prevention that takes into account individual patient characteristics. It is an important direction for clinical research, and many statistical methods have been recently proposed. One of the primary goals of precision medicine is to obtain an optimal individual treatment rule (ITR), which can help make decisions on treatment selection according to each patient's specific characteristics. Recently, outcome weighted learning (OWL) has been proposed to estimate such an optimal ITR in a binary treatment setting by maximizing the expected clinical outcome. However, for ordinal treatment settings, such as individualized dose finding, it is unclear how to use OWL. In this paper, we propose a new technique for estimating the ITR with ordinal treatments. In particular, we propose a data duplication technique with a piecewise convex loss function. We establish Fisher consistency for the resulting estimated ITR under certain conditions, and obtain the convergence and risk bound properties. Simulated examples and two applications to datasets from an irritable bowel problem and a type 2 diabetes mellitus observational study demonstrate the highly competitive performance of the proposed method compared to existing alternatives.
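The expected clinical outcome that OWL maximizes in the binary setting is commonly estimated by inverse probability weighting. The sketch below shows that standard estimator only, not the ordinal method proposed in this abstract; it assumes known propensities and that larger outcomes are better, and the function and argument names are hypothetical.

```python
from typing import Sequence


def ipw_value(
    outcomes: Sequence[float],
    treatments: Sequence[int],
    propensities: Sequence[float],
    recommended: Sequence[int],
) -> float:
    """Inverse-probability-weighted estimate of the value of a rule.

    propensities[i] is the assumed probability that subject i received
    the treatment actually observed in treatments[i].
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("at least one observation is required")
    if not (len(treatments) == len(propensities) == len(recommended) == n):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for y, a, p, d in zip(outcomes, treatments, propensities, recommended):
        if p <= 0.0 or p > 1.0:
            raise ValueError("propensities must lie in (0, 1]")
        if a == d:
            # Only subjects whose observed treatment matches the rule
            # contribute, reweighted by the inverse propensity.
            total += y / p
    return total / n


if __name__ == "__main__":
    # Hypothetical randomized data with treatment coded as +1 / -1.
    y = [1.0, 2.0, 3.0, 4.0]
    a = [1, -1, 1, -1]
    p = [0.5, 0.5, 0.5, 0.5]
    print(ipw_value(y, a, p, [1, 1, 1, 1]))
    print(ipw_value(y, a, p, [-1, -1, -1, -1]))
```

Maximizing this quantity over rules is equivalent to a weighted classification problem in which each subject's observed treatment is the label and the outcome divided by the propensity is the weight, which is the reformulation OWL builds on.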
Randomized controlled trials typically analyze the effectiveness of treatments with the goal of making treatment recommendations for patient subgroups. With the advance of electronic health records, a great variety of data has been collected in clinical practice, enabling the evaluation of treatments and treatment policies based on observational data. In this paper, we focus on learning individualized treatment rules (ITRs) to derive a treatment policy that is expected to generate a better outcome for an individual patient. In our framework, we cast ITR learning as a contextual bandit problem and minimize the expected risk of the treatment policy. We conduct experiments with the proposed framework both in a simulation study and on a real-world dataset. In the latter case, we apply our proposed method to learn the optimal ITRs for the administration of intravenous (IV) fluids and vasopressors (VP). Based on various offline evaluation methods, we show that the policy derived in our framework performs better than both the physicians and other baselines, including a simple treatment prediction approach. As a long-term goal, our derived policy might eventually lead to better clinical guidelines for the administration of IV and VP.
Data-driven individualized decision making has recently received increasing research interest. Most existing methods rely on the assumption of no unmeasured confounding, which unfortunately cannot be ensured in practice, especially in observational studies. Motivated by the recently proposed proximal causal inference, we develop several proximal learning approaches to estimating optimal individualized treatment regimes (ITRs) in the presence of unmeasured confounding. In particular, we establish several identification results for different classes of ITRs, exhibiting the trade-off between the risk of making untestable assumptions and the value function improvement in decision making. Based on these results, we propose several classification-based approaches to finding a variety of restricted in-class optimal ITRs and develop their theoretical properties. The appealing numerical performance of our proposed methods is demonstrated via an extensive simulation study and one real data application.