SDRcausal is a package that implements sufficient dimension reduction methods for causal inference as proposed in Ghosh, Ma, and de Luna (2021). The package implements (augmented) inverse probability weighting and outcome regression (imputation) estimators of an average treatment effect (ATE) parameter. Nuisance models, both the treatment assignment probability given the covariates (propensity score) and the outcome regression models, are fitted using semiparametric locally efficient dimension reduction estimators, thereby allowing for large sets of confounding covariates. Techniques including linear extrapolation, numerical differentiation, and truncation are used to obtain a practicable implementation of the methods. Finding a suitable dimension reduction map (central mean subspace) requires solving an optimization problem, and several optimization algorithms are offered as choices to the user. The package also provides estimators of the asymptotic variances of the causal effect estimators implemented, as well as plotting options. The core of the methods is implemented in C, with support for parallelization, and the user-friendly, free R language serves as the interface. The package can be downloaded from the GitHub repository: https://github.com/stat4reg.
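To fix ideas, the three estimator types named above (imputation, IPW, and augmented IPW) can be sketched in a few lines of Python on synthetic data. This is only an illustration of the generic estimators, not the package's R API; the data-generating model, the known propensity score, and the true ATE of 2 are all assumptions of the example:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                    # one confounder for illustration
p = 1.0 / (1.0 + np.exp(-0.8 * x))        # true propensity score P(T=1 | x)
t = rng.binomial(1, p)                    # treatment indicator
y = 2.0 * t + x + rng.normal(size=n)      # outcome; true ATE = 2

# Outcome regression (imputation): fit a line in each treatment arm,
# then impute both potential outcomes for every unit
b1 = np.polyfit(x[t == 1], y[t == 1], 1)
b0 = np.polyfit(x[t == 0], y[t == 0], 1)
mu1, mu0 = np.polyval(b1, x), np.polyval(b0, x)

ate_or = np.mean(mu1 - mu0)                             # imputation estimator
ate_ipw = np.mean(t * y / p - (1 - t) * y / (1 - p))    # IPW estimator
ate_aipw = np.mean(mu1 - mu0                            # augmented IPW
                   + t * (y - mu1) / p
                   - (1 - t) * (y - mu0) / (1 - p))
```

In the package the propensity score and outcome regressions are instead fitted through the dimension reduction estimators, and truncation is applied to weights like `1 / p` to keep them stable.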
Important advances have recently been achieved in developing procedures yielding uniformly valid inference for a low-dimensional causal parameter when high-dimensional nuisance models must be estimated. In this paper, we review the literature on uniformly valid causal inference and discuss the costs and benefits of using uniformly valid inference procedures. Naive estimation strategies based on regularisation, machine learning, or a preliminary model selection stage for the nuisance models have finite sample distributions which are badly approximated by their asymptotic distributions. To solve this serious problem, estimators which converge uniformly in distribution over a class of data generating mechanisms have been proposed in the literature. In order to obtain uniformly valid results in high-dimensional situations, sparsity conditions on the nuisance models typically need to be imposed, although a double robustness property holds, whereby if one of the nuisance models is more sparse, the other nuisance model is allowed to be less sparse. While uniformly valid inference is a highly desirable property, uniformly valid procedures pay a high price in terms of inflated variability. Our discussion of this dilemma is illustrated by the study of a double-selection outcome regression estimator, which we show is uniformly asymptotically unbiased, yet less variable than uniformly valid estimators in the numerical experiments conducted.
We propose novel estimators for categorical and continuous treatments by using an optimal covariate balancing strategy for inverse probability weighting. The resulting estimators are shown to be consistent and asymptotically normal for causal contrasts of interest, either when the model explaining treatment assignment is correctly specified, or when the correct set of bases for the outcome models has been chosen and the assignment model is sufficiently rich. For the categorical treatment case, we show that the estimator attains the semiparametric efficiency bound when all models are correctly specified. For the continuous case, the causal parameter of interest is a function of the treatment dose. The latter is not parametrized, and the estimators proposed are shown to have bias and variance of the classical nonparametric rate. Asymptotic results are complemented with simulations illustrating the finite sample properties. Our analysis of a data set suggests a nonlinear effect of BMI on the decline in self-reported health.
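The idea of choosing weights to balance covariates, rather than plugging in an estimated propensity score, can be illustrated with a toy one-covariate exponential-tilting scheme. This is a minimal sketch of the balancing principle only, not the optimal strategy proposed in the paper; the target mean and bisection bounds are assumptions of the example:

```python
import numpy as np

def balance_weights(xc, target, lo=-10.0, hi=10.0, iters=100):
    """Exponential-tilting weights w_i = exp(lam * (x_i - mean)) chosen by
    bisection so the weighted mean of xc equals `target` (toy, one covariate).
    The tilted weighted mean is monotonically increasing in lam."""
    c = xc.mean()                               # center for numerical stability
    def wmean(lam):
        w = np.exp(lam * (xc - c))
        return np.sum(w * xc) / np.sum(w)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if wmean(mid) < target:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return np.exp(lam * (xc - c))

rng = np.random.default_rng(1)
xc = rng.normal(1.0, 1.0, size=2000)            # controls' covariate, mean ~1
w = balance_weights(xc, target=0.5)             # reweight toward target mean 0.5
balanced_mean = np.sum(w * xc) / np.sum(w)
```

With several covariates the scalar bisection is replaced by a multivariate optimization over the tilting parameters, which is the kind of problem the balancing estimators solve.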
When estimating the treatment effect in an observational study, we use a semiparametric locally efficient dimension reduction approach to assess both the treatment assignment mechanism and the average responses in the treated and nontreated groups. We then integrate all results through imputation, inverse probability weighting and doubly robust augmentation estimators. Doubly robust estimators are locally efficient, while imputation estimators are super-efficient when the response models are correct. To take advantage of both procedures, we introduce a shrinkage estimator that automatically combines the two, retaining the double robustness property while improving on the variance when the response model is correct. We demonstrate the performance of these estimators through simulated experiments and a real dataset concerning the effect of maternal smoking on baby birth weight. Key words and phrases: Average Treatment Effect, Doubly Robust Estimator, Efficiency, Inverse Probability Weighting, Shrinkage Estimator.
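A generic way to combine two estimators of the same parameter, in the spirit of the shrinkage idea above, is the minimum-variance linear combination of their per-observation contributions. The weight formula below is this standard combination, not necessarily the one derived in the paper, and the two `psi` arrays are synthetic stand-ins:

```python
import numpy as np

def shrinkage_weight(psi_a, psi_b):
    """Weight w minimizing the estimated variance of the combination
    w * mean(psi_a) + (1 - w) * mean(psi_b)."""
    va = np.var(psi_a, ddof=1)
    vb = np.var(psi_b, ddof=1)
    cab = np.cov(psi_a, psi_b)[0, 1]            # sample covariance (ddof=1)
    return (vb - cab) / (va + vb - 2.0 * cab)

rng = np.random.default_rng(2)
signal = rng.normal(size=2000)
psi_imp = signal + 0.3 * rng.normal(size=2000)  # stand-in: imputation contributions
psi_dr = signal + 0.6 * rng.normal(size=2000)   # stand-in: doubly robust contributions

w = shrinkage_weight(psi_imp, psi_dr)
theta = w * psi_imp.mean() + (1.0 - w) * psi_dr.mean()
```

Because the optimum is taken over all linear combinations, the combined contribution has estimated variance no larger than either input, which is the sense in which the shrinkage estimator cannot do worse than the doubly robust estimator alone.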
Classical semiparametric inference with missing outcome data is not robust to contamination of the observed data, and a single observation can have arbitrarily large influence on estimation of a parameter of interest. This sensitivity is exacerbated when inverse probability weighting methods are used, which may overweight contaminated observations. We introduce inverse probability weighted, double robust and outcome regression estimators of location and scale parameters, which are robust to contamination in the sense that their influence function is bounded. We give asymptotic properties and study finite sample behaviour. Our simulated experiments show that contamination can be a more serious threat to the quality of inference than model misspecification. An interesting aspect of our results is that the auxiliary outcome model used by some of the estimators to adjust for ignorable missingness is also useful to protect against contamination. We also illustrate through a case study how both adjustment for ignorable missingness and protection against contamination are achieved through weighting schemes, which can be contrasted to gain further insights.
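To fix ideas, here is a sketch of an IPW location estimator whose influence is bounded by a Huber ψ-function, contrasted with the plain IPW mean under contamination. This is an illustration of the bounded-influence principle only, not the paper's estimators; the MCAR missingness mechanism and 2% contamination level are assumptions of the example:

```python
import numpy as np

def ipw_huber_mean(y, r, p, k=1.345, iters=100):
    """IPW M-estimator of location: solves sum_i (r_i / p_i) psi((y_i - mu)/s) = 0
    with Huber psi, via iteratively reweighted means (r_i = 1 if y_i observed)."""
    obs = r == 1
    yv, wv = y[obs], 1.0 / p[obs]
    s = 1.4826 * np.median(np.abs(yv - np.median(yv)))  # robust scale (MAD)
    mu = np.median(yv)
    for _ in range(iters):
        res = np.abs(yv - mu) / s
        u = np.minimum(1.0, k / np.maximum(res, 1e-12)) # Huber weight min(1, k/|r|)
        mu = np.sum(wv * u * yv) / np.sum(wv * u)
    return mu

rng = np.random.default_rng(3)
n = 4000
y = rng.normal(0.0, 1.0, size=n)          # clean outcomes, true location 0
y[: n // 50] = 50.0                       # 2% gross contamination
p = np.full(n, 0.7)                       # MCAR: each outcome observed w.p. 0.7
r = rng.binomial(1, p)

mu_robust = ipw_huber_mean(y, r, p)
mu_plain = np.sum(r * y / p) / np.sum(r / p)   # plain IPW (Hajek) mean
```

The plain IPW mean is dragged toward the contaminating value, while the bounded-influence version stays near the clean location, mirroring the abstract's point that contamination can outweigh model misspecification as a threat.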
Causal inference with observational data can be performed under an assumption of no unobserved confounders (the unconfoundedness assumption). There is, however, seldom clear subject-matter or empirical evidence for such an assumption. We therefore develop uncertainty intervals for average causal effects based on outcome regression estimators and doubly robust estimators, which provide inference taking into account both sampling variability and uncertainty due to unobserved confounders. In contrast with sampling variation, uncertainty due to unobserved confounding does not decrease with increasing sample size. The intervals introduced are obtained by deriving the bias of the estimators due to unobserved confounders. We are thus also able to contrast the size of the bias due to violation of the unconfoundedness assumption with the bias due to misspecification of the models used to explain potential outcomes. This is illustrated through numerical experiments where bias due to moderate unobserved confounding dominates misspecification bias for typical situations in terms of sample size and modeling assumptions. We also study the empirical coverage of the uncertainty intervals introduced and apply the results to a study of the effect of regular food intake on health. An R package implementing the proposed inference is available.
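The construction can be sketched as a Wald confidence interval widened by a bound on the confounding bias; because that bias term does not shrink with the sample size, the interval never becomes narrower than twice the bias bound. This is a schematic version, not the paper's exact formulas, and all numbers are illustrative:

```python
def uncertainty_interval(est, se, bias_bound, z=1.96):
    """Widen an approximate 95% Wald interval for an average causal effect by a
    sampling-independent bound on the bias from unobserved confounding."""
    half = z * se + bias_bound      # se shrinks with n; bias_bound does not
    return est - half, est + half

# Point estimate 1.0, standard error 0.1, assumed confounding bias bound 0.2
lo, hi = uncertainty_interval(est=1.0, se=0.1, bias_bound=0.2)
```

Setting `bias_bound=0` recovers the usual confidence interval, so the extra width directly quantifies the price of doubting the unconfoundedness assumption.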
To estimate direct and indirect effects of an exposure on an outcome from observed data, strong assumptions about unconfoundedness are required. Since these assumptions cannot be tested using the observed data, a mediation analysis should always be accompanied by a sensitivity analysis of the resulting estimates. In this article we propose a sensitivity analysis method for parametric estimation of direct and indirect effects when the exposure, mediator and outcome are all binary. The sensitivity parameters consist of the correlation between the error terms of the mediator and outcome models, the correlation between the error terms of the mediator model and the model for the exposure assignment mechanism, and the correlation between the error terms of the exposure assignment and outcome models. These correlations are incorporated into the estimation of the model parameters, and identification sets are then obtained for the direct and indirect effects for a range of plausible correlation values. We take the sampling variability into account through the construction of uncertainty intervals. The proposed method is able to assess sensitivity to both mediator-outcome confounding and confounding involving the exposure. To illustrate the method we apply it to a mediation study based on data from the Swedish Stroke Register (Riksstroke).
We introduce uncertainty regions to perform inference on partial correlations when data are missing not at random. These uncertainty regions are shown to have a desired asymptotic coverage. Their finite sample performance is illustrated via simulations and a real data example.
Naive Bayes classifiers have proven to be useful in many prediction problems with complete training data. Here we consider the situation where a naive Bayes classifier is trained with data where the response is right censored. Such prediction problems are, for instance, encountered in profiling systems used at National Employment Agencies. In this paper we propose the maximum collective conditional likelihood estimator for the prediction and show that it is strongly consistent under the usual identifiability condition.