Weak consistency of the 1-nearest neighbor measure with applications to missing data


Abstract in English

When data is partially missing at random, imputation and importance weighting are often used to estimate moments of the unobserved population. In this paper, we study 1-nearest neighbor (1NN) importance weighting, which estimates moments by replacing missing data with the complete data that is the nearest neighbor in the non-missing covariate space. We define an empirical measure, the 1NN measure, and show that it is weakly consistent for the measure of the missing data. The main idea behind this result is that the 1NN measure is performing inverse probability weighting in the limit. We study applications to missing data and mitigating the impact of covariate shift in prediction tasks.

Download