
Resampling-based Confidence Intervals for Model-free Robust Inference on Optimal Treatment Regimes

Added by Yunan Wu
Publication date: 2019
Language: English





We propose a new procedure for inference on optimal treatment regimes in the model-free setting, which does not require specifying an outcome regression model. Existing model-free estimators for optimal treatment regimes are usually not suitable for the purpose of inference, because they either have nonstandard asymptotic distributions or do not necessarily guarantee consistent estimation of the parameter indexing the Bayes rule due to the use of surrogate loss. We first study a smoothed robust estimator that directly targets the parameter corresponding to the Bayes decision rule for optimal treatment regime estimation. This estimator is shown to have an asymptotically normal distribution. Furthermore, we verify that a resampling procedure provides asymptotically accurate inference for both the parameter indexing the optimal treatment regime and the optimal value function. A new algorithm is developed to calculate the proposed estimator with substantially improved speed and stability. Numerical results demonstrate the satisfactory performance of the new methods.
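The abstract's resampling procedure applies to a specific smoothed robust estimator; as a simplified illustration only, the sketch below computes a percentile bootstrap confidence interval for the value of a *fixed* linear treatment rule, using the standard inverse-probability-weighted (IPW) value estimate under a known randomization probability. The function names, the choice of percentile intervals, and the assumption p = 0.5 are all illustrative, not the paper's method.

```python
import random

def ipw_value(X, A, Y, beta, p=0.5):
    # IPW estimate of the value of the linear rule d(x) = 1{x . beta > 0},
    # assuming treatment A was randomized with known probability p.
    num = 0.0
    for x, a, y in zip(X, A, Y):
        d = 1 if sum(xi * bi for xi, bi in zip(x, beta)) > 0 else 0
        if a == d:
            num += y / (p if a == 1 else 1 - p)
    return num / len(Y)

def bootstrap_ci(X, A, Y, beta, B=500, alpha=0.05, seed=0):
    # Percentile bootstrap CI for the value of the rule indexed by beta.
    rng = random.Random(seed)
    n = len(Y)
    vals = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]
        vals.append(ipw_value([X[i] for i in idx], [A[i] for i in idx],
                              [Y[i] for i in idx], beta))
    vals.sort()
    lo = vals[int((alpha / 2) * B)]
    hi = vals[min(B - 1, int((1 - alpha / 2) * B))]
    return lo, hi
```

Note that the paper's contribution is precisely that naive plug-in inference for the *estimated* optimal rule is invalid; this sketch only shows the resampling mechanics for a fixed rule.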



Related Research

Yunan Wu, Lan Wang, Haoda Fu (2021)
This paper develops new tools to quantify uncertainty in optimal decision making and to gain insight into which variables one should collect information about given the potential cost of measuring a large number of variables. We investigate simultaneous inference to determine if a group of variables is relevant for estimating an optimal decision rule in a high-dimensional semiparametric framework. The unknown link function permits flexible modeling of the interactions between the treatment and the covariates, but leads to nonconvex estimation in high dimension and imposes significant challenges for inference. We first establish that a local restricted strong convexity condition holds with high probability and that any feasible local sparse solution of the estimation problem can achieve the near-oracle estimation error bound. We further rigorously verify that a wild bootstrap procedure based on a debiased version of the local solution can provide asymptotically honest uniform inference for the effect of a group of variables on optimal decision making. The advantage of honest inference is that it does not require the initial estimator to achieve perfect model selection and does not require the zero and nonzero effects to be well-separated. We also propose an efficient algorithm for estimation. Our simulations suggest satisfactory performance. An example from a diabetes study illustrates the real application.
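The paper's full procedure (debiased local solutions of a nonconvex problem) is beyond a short sketch, but one core ingredient it describes, calibrating simultaneous inference for a group of coefficients via a multiplier (wild) bootstrap of a max statistic, can be illustrated generically. Everything below (function name, Gaussian multipliers, the max-|.| statistic) is a standard textbook construction, not the paper's exact algorithm.

```python
import math
import random

def multiplier_bootstrap_crit(scores, B=1000, alpha=0.05, seed=1):
    # Gaussian-multiplier (wild) bootstrap critical value for the
    # max-absolute statistic over a group of p coordinates.
    # `scores` is an n x p list of per-observation influence values.
    rng = random.Random(seed)
    n, p = len(scores), len(scores[0])
    maxima = []
    for _ in range(B):
        w = [rng.gauss(0.0, 1.0) for _ in range(n)]
        maxima.append(max(
            abs(sum(w[i] * scores[i][j] for i in range(n))) / math.sqrt(n)
            for j in range(p)))
    maxima.sort()
    return maxima[min(B - 1, int((1 - alpha) * B))]
```

A group of variables would then be declared relevant if the observed max statistic for that group exceeds the bootstrap critical value.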
Jay Bartroff, Gary Lorden (2021)
We present an efficient method of calculating exact confidence intervals for the hypergeometric parameter. The method inverts minimum-width acceptance intervals after shifting them to make their endpoints nondecreasing while preserving their level. The resulting set of confidence intervals achieves minimum possible average width, and even in comparison with confidence sets not required to be intervals it attains the minimum possible cardinality most of the time, and always within 1. The method compares favorably with existing methods not only in the size of the intervals but also in the time required to compute them. The available R package hyperMCI implements the proposed method.
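The minimum-width construction described in the abstract is implemented in the R package hyperMCI; as a point of comparison, the sketch below shows the plain tail-based test-inversion baseline (a Clopper-Pearson analogue) for the hypergeometric parameter M, the unknown number of marked items in a population of size N, using only the standard library. The function names are illustrative.

```python
from math import comb

def hyper_pmf(k, N, M, n):
    # P(X = k) when drawing n items without replacement from a
    # population of N items of which M are marked.
    if k < max(0, n - (N - M)) or k > min(n, M):
        return 0.0
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def exact_ci_M(x, N, n, alpha=0.05):
    # Tail-based test inversion: keep every M whose two tail
    # p-values at the observed count x both exceed alpha / 2.
    keep = []
    for M in range(N + 1):
        lower_tail = sum(hyper_pmf(k, N, M, n) for k in range(0, x + 1))
        upper_tail = sum(hyper_pmf(k, N, M, n) for k in range(x, n + 1))
        if lower_tail > alpha / 2 and upper_tail > alpha / 2:
            keep.append(M)
    return min(keep), max(keep)
```

The paper's method improves on this baseline by shifting minimum-width acceptance intervals to make their endpoints nondecreasing, which yields shorter confidence intervals at the same level.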
Weibin Mo, Yufeng Liu (2021)
Recent development in data-driven decision science has seen great advances in individualized decision making. Given data with individual covariates, treatment assignments and outcomes, researchers can search for the optimal individualized treatment rule (ITR) that maximizes the expected outcome. Existing methods typically require initial estimation of some nuisance models. The double robustness property that can protect from misspecification of either the treatment-free effect or the propensity score has been widely advocated. However, when model misspecification exists, a doubly robust estimate can be consistent but may suffer from downgraded efficiency. Beyond potentially misspecified nuisance models, most existing methods do not account for the case where the variance of the outcome is heterogeneous across covariates and treatment. We observe that such heteroscedasticity can greatly affect the estimation efficiency of the optimal ITR. In this paper, we demonstrate that the consequences of a misspecified treatment-free effect and of heteroscedasticity can be unified as a covariate-treatment dependent variance of residuals. To improve efficiency of the estimated ITR, we propose an Efficient Learning (E-Learning) framework for finding an optimal ITR in the multi-armed treatment setting. We show that the proposed E-Learning is optimal among a regular class of semiparametric estimates that allow treatment-free effect misspecification. In our simulation study, E-Learning demonstrates its effectiveness when the treatment-free effect is misspecified, when heteroscedasticity is present, or both. Our analysis of a Type 2 Diabetes Mellitus (T2DM) observational study also suggests the improved efficiency of E-Learning.
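The central efficiency point, that inverse-variance weighting beats equal weighting when residual variance depends on the covariates, can be illustrated with a toy Monte Carlo. This is a generic weighted-least-squares demonstration, not the E-Learning estimator itself; the function names, the slope model, and the two-level noise pattern are all illustrative assumptions.

```python
import random

def wls_slope(x, y, w):
    # Weighted least-squares slope of y on x, intercept profiled out.
    s = sum(w)
    xb = sum(wi * xi for wi, xi in zip(w, x)) / s
    yb = sum(wi * yi for wi, yi in zip(w, y)) / s
    num = sum(wi * (xi - xb) * (yi - yb) for wi, xi, yi in zip(w, x, y))
    return num / sum(wi * (xi - xb) ** 2 for wi, xi in zip(w, x))

def compare_efficiency(reps=300, n=100, seed=2):
    # Monte Carlo variance of the slope estimate under equal weights
    # (OLS) versus inverse-variance weights, when the noise standard
    # deviation depends on the covariate (heteroscedasticity).
    rng = random.Random(seed)
    ols, wls = [], []
    for _ in range(reps):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        sd = [0.1 if xi > 0 else 3.0 for xi in x]
        y = [2.0 * xi + rng.gauss(0.0, s) for xi, s in zip(x, sd)]
        ols.append(wls_slope(x, y, [1.0] * n))
        wls.append(wls_slope(x, y, [1.0 / s ** 2 for s in sd]))
    var = lambda v: sum((vi - sum(v) / len(v)) ** 2 for vi in v) / len(v)
    return var(ols), var(wls)
```

Both estimators are consistent for the slope, but the inverse-variance version has markedly smaller Monte Carlo variance, which mirrors the abstract's claim that ignoring heteroscedasticity downgrades efficiency without affecting consistency.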
Jinghao Sun (2020)
Capture-recapture (CRC) surveys are widely used to estimate the size of a population whose members cannot be enumerated directly. When $k$ capture samples are obtained, counts of unit captures in subsets of samples are represented naturally by a $2^k$ contingency table in which one element -- the number of individuals appearing in none of the samples -- remains unobserved. In the absence of additional assumptions, the population size is not point-identified. Assumptions about independence between samples are often used to achieve point-identification. However, real-world CRC surveys often use convenience samples in which independence cannot be guaranteed, and population size estimates under independence assumptions may lack empirical credibility. In this work, we apply the theory of partial identification to show that weak assumptions or qualitative knowledge about the nature of dependence between samples can be used to characterize a non-trivial set in which the true population size lies with high probability. We construct confidence sets for the population size under bounds on pairwise capture probabilities, and bounds on the highest order interaction term in a log-linear model using two methods: test inversion bootstrap confidence intervals, and profile likelihood confidence intervals. We apply these methods to recent survey data to estimate the number of people who inject drugs in Brussels, Belgium.
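The paper's confidence procedures (test-inversion bootstrap and profile likelihood) are more elaborate, but the underlying partial-identification logic can be shown in the simplest two-sample case. Below, a bound r on the cross-product ratio (the highest-order interaction in the two-list log-linear model) is inverted to bound the unobserved cell and hence the population size; the function name and the two-list simplification are illustrative assumptions.

```python
def crc_bounds(n11, n10, n01, r):
    # Two-sample capture-recapture partial identification.
    # n11 = caught in both lists, n10 / n01 = caught in exactly one.
    # Assume the cross-product ratio psi = (n11 * n00) / (n10 * n01)
    # lies in [1/r, r]; solve for the unobserved cell n00.
    n_obs = n11 + n10 + n01
    n00_lo = n10 * n01 / (r * n11)
    n00_hi = r * n10 * n01 / n11
    return n_obs + n00_lo, n_obs + n00_hi
```

With r = 1 (independence between lists) the two bounds coincide at the classical Lincoln-Petersen estimate; relaxing r widens the identified set, formalizing how "weak assumptions about the nature of dependence" trade off against precision.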
Although parametric empirical Bayes confidence intervals for multiple normal means are fundamental tools for compound decision problems, their performance can be sensitive to misspecification of the parametric prior distribution (typically a normal distribution), especially when some strong signals are included. We suggest a simple modification of the standard confidence intervals such that the proposed interval is robust against misspecification of the prior distribution. Our main idea is to use the well-known Tweedie's formula with a robust likelihood based on the $\gamma$-divergence. An advantage of the new interval is that its length is always smaller than or equal to that of the parametric empirical Bayes confidence interval, so that the new interval is both efficient and robust. We prove asymptotic validity: the coverage probability of the proposed confidence intervals attains the nominal level even when the true underlying distribution of signals is contaminated, and the coverage accuracy is less sensitive to the contamination ratio. The numerical performance of the proposed method is demonstrated through simulation experiments and a real data application.
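The building block this abstract modifies is Tweedie's formula, E[theta | z] = z + sigma^2 d/dz log m(z), where m is the marginal density of the observations. The sketch below shows the standard plug-in version with a kernel density estimate of m and a central-difference score, not the paper's gamma-divergence robustification; the function names and bandwidth are illustrative.

```python
import math

def kde(z, data, h):
    # Gaussian kernel density estimate of the marginal density m(z).
    return sum(math.exp(-0.5 * ((z - x) / h) ** 2) for x in data) \
        / (len(data) * h * math.sqrt(2.0 * math.pi))

def tweedie(z, data, h=0.5, sigma2=1.0, eps=1e-4):
    # Tweedie's formula E[theta | z] = z + sigma^2 * d/dz log m(z),
    # with the score estimated by a central difference on the KDE.
    score = (math.log(kde(z + eps, data, h)) -
             math.log(kde(z - eps, data, h))) / (2.0 * eps)
    return z + sigma2 * score
```

Shrinking each observation toward regions where the estimated marginal density is high is exactly the mechanism whose robustness the paper improves by replacing the plug-in likelihood with a gamma-divergence-based one.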