
Missing at random: a stochastic process perspective

Added by Daniel Farewell
Publication date: 2018
Language: English





We offer a natural and extensible measure-theoretic treatment of missingness at random. Within the standard missing data framework, we give a novel characterisation of the observed data as a stopping-set sigma algebra. We demonstrate that the usual missingness at random conditions are equivalent to requiring particular stochastic processes to be adapted to a set-indexed filtration of the complete data: measurability conditions that suffice to ensure the likelihood factorisation necessary for ignorability. Our rigorous statement of the missing at random conditions also clarifies a common confusion: what is fixed, and what is random?
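
For orientation, the likelihood factorisation behind ignorability can be written out in classical density notation rather than the measure-theoretic language of the paper. The symbols below are assumptions introduced for illustration only: $y_{\mathrm{obs}}$ and $y_{\mathrm{mis}}$ are the observed and missing parts of the complete data, $r$ is the realised missingness pattern, and $\theta$ and $\psi$ parameterise the data model and the missingness mechanism.

```latex
\begin{align*}
L(\theta,\psi)
  &= \int f(y_{\mathrm{obs}}, y_{\mathrm{mis}};\theta)\,
          f(r \mid y_{\mathrm{obs}}, y_{\mathrm{mis}};\psi)\,\mathrm{d}y_{\mathrm{mis}} \\
  &= f(r \mid y_{\mathrm{obs}};\psi)
     \int f(y_{\mathrm{obs}}, y_{\mathrm{mis}};\theta)\,\mathrm{d}y_{\mathrm{mis}}
     && \text{(missing at random)} \\
  &= f(r \mid y_{\mathrm{obs}};\psi)\, f(y_{\mathrm{obs}};\theta).
\end{align*}
```

The second line uses the missing at random condition: at the realised $r$ and $y_{\mathrm{obs}}$, the density $f(r \mid y_{\mathrm{obs}}, y_{\mathrm{mis}};\psi)$ does not depend on $y_{\mathrm{mis}}$. If $\theta$ and $\psi$ are also distinct, the first factor can be ignored in likelihood inference about $\theta$.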




Practical problems with missing data are common, and statistical methods have been developed to address the validity and efficiency of statistical procedures in their presence. A longstanding focus has been the mechanism governing missingness, and correctly identifying the appropriate mechanism is crucial for conducting sound practical investigations. The conventional classification comprises three mechanisms: missing completely at random, missing at random, and missing not at random. In this paper, we present a new hypothesis testing approach for deciding between missing at random and missing not at random. Since the potential alternatives to missing at random are broad, we focus our investigation on a general class of models with instrumental variables for data missing not at random. Our setting is broadly applicable because the model for the missing data is nonparametric, requiring no explicit model specification for the missingness. The foundational idea is to develop appropriate discrepancy measures between estimators whose properties differ significantly only when missing at random does not hold. We show that our new hypothesis testing approach achieves an objective, data-oriented choice between missing at random and missing not at random. We demonstrate the feasibility, validity, and efficacy of the new test through theoretical analysis, simulation studies, and a real data analysis.
BaoLuo Sun, Lan Liu, Wang Miao (2016)
Missing data occur frequently in empirical studies in health and social sciences, often compromising our ability to make accurate inferences. An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables, the missing data mechanism still depends on the unobserved outcome. In such settings, identification is generally not possible without imposing additional assumptions. Identification is sometimes possible, however, if an instrumental variable (IV) is observed for all subjects which satisfies the exclusion restriction that the IV affects the missingness process without directly influencing the outcome. In this paper, we provide necessary and sufficient conditions for nonparametric identification of the full data distribution under MNAR with the aid of an IV. In addition, we give sufficient identification conditions that are more straightforward to verify in practice. For inference, we focus on estimation of a population outcome mean, for which we develop a suite of semiparametric estimators that extend methods previously developed for data missing at random. Specifically, we propose inverse probability weighted estimation, outcome regression-based estimation and doubly robust estimation of the mean of an outcome subject to MNAR. For illustration, the methods are used to account for selection bias induced by HIV testing refusal in the evaluation of HIV seroprevalence in Mochudi, Botswana, using interviewer characteristics such as gender, age and years of experience as IVs.
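
As background for the estimators mentioned above, here is a minimal sketch of inverse probability weighting for an outcome mean in the simpler missing at random setting that the abstract says is being extended. It is not the authors' MNAR or instrumental-variable estimator. The data-generating model, the function names, and the assumption that the response probabilities are known are all hypothetical choices made for illustration.

```python
import math
import random


def expit(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, seed=0):
    """Simulate (y, r, pi) triples under an assumed MAR mechanism.

    x is always observed, y = 2 + x + noise has true mean 2, and the
    response probability pi depends on x only, never on y given x.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 2.0 + x + rng.gauss(0.0, 1.0)
        pi = expit(0.5 + x)                # assumed known response probability
        r = 1 if rng.random() < pi else 0  # r = 1 means y is observed
        rows.append((y, r, pi))
    return rows


def complete_case_mean(rows):
    """Naive mean over observed outcomes; biased here because pi depends on x."""
    observed = [y for y, r, _ in rows if r == 1]
    return sum(observed) / len(observed)


def ipw_mean(rows):
    """Normalised (Hajek-type) inverse probability weighted mean."""
    num = sum(y / pi for y, r, pi in rows if r == 1)
    den = sum(1.0 / pi for _, r, pi in rows if r == 1)
    return num / den


if __name__ == "__main__":
    rows = simulate(20000, seed=1)
    print("complete-case mean:", complete_case_mean(rows))
    print("IPW mean:", ipw_mean(rows))
```

In this simulation the complete-case mean overshoots the true mean of 2 because units with larger x respond more often, while weighting each observed outcome by the inverse of its response probability corrects for that. In practice the response probabilities would have to be estimated, and under MNAR they are not identified without further assumptions such as the instrumental variable described in the abstract.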
We study the identification and estimation of statistical functionals of multivariate data missing non-monotonically and not-at-random, taking a semiparametric approach. Specifically, we assume that the missingness mechanism satisfies what has been previously called no self-censoring or itemwise conditionally independent nonresponse, which roughly corresponds to the assumption that no partially-observed variable directly determines its own missingness status. We show that this assumption, combined with an odds ratio parameterization of the joint density, enables identification of functionals of interest, and we establish the semiparametric efficiency bound for the nonparametric model satisfying this assumption. We propose a practical augmented inverse probability weighted estimator, and in the setting with a (possibly high-dimensional) always-observed subset of covariates, our proposed estimator enjoys a certain double-robustness property. We explore the performance of our estimator with simulation experiments and on a previously-studied data set of HIV-positive mothers in Botswana.
Yijie Li, Wei Fan, Miao Zhang (2020)
The causal structure for measurement bias (MB) remains controversial. Aided by the Directed Acyclic Graph (DAG), this paper proposes a new structure for measuring one singleton variable whose MB arises in the selection of an imperfect I/O device-like measurement system. For effect estimation, however, an extra source of MB arises from any redundant association between a measured exposure and a measured outcome. The misclassification will be bidirectionally differential for a common outcome, unidirectionally differential for a causal relation, and non-differential for a common cause between the measured exposure and the measured outcome or a null effect. The measured exposure can actually affect the measured outcome, or vice versa. Reverse causality is a concept defined at the level of measurement. Our new DAGs have clarified the structures and mechanisms of MB.
A pedigree is a directed graph that describes how individuals are related through ancestry in a sexually-reproducing population. In this paper we explore the question of whether one can reconstruct a pedigree by just observing sequence data for present day individuals. This is motivated by the increasing availability of genomic sequences, but in this paper we take a more theoretical approach and consider what models of sequence evolution might allow pedigree reconstruction (given sufficiently long sequences). Our results complement recent work that showed that pedigree reconstruction may be fundamentally impossible if one uses just the degrees of relatedness between different extant individuals. We find that for certain stochastic processes, pedigrees can be recovered up to isomorphism from sufficiently long sequences.