Spatio-Temporal Mixed Models to Predict Coverage Error Rates at Local Areas


Abstract in English

Despite great efforts during censuses, some nonsampling errors, such as coverage error, are inevitable. Coverage error, which can be classified as either undercount or overcount, occurs when there is no bijective (one-to-one) mapping between the individuals in the census count and the target population, that is, the individuals who usually reside in the country (de jure residents). Coverage error arises for a variety of reasons: in the undercount case, deficiencies in the census maps, errors in field operations, or people's reluctance to participate; in the overcount case, multiple enumeration of individuals or enumeration of those who fall outside the scope of the census. A routine practice for estimating the net coverage error is to subtract the census count from the estimated true population, which is obtained from a dual system (or capture-recapture) technique. The estimated coverage error usually suffers from substantial uncertainty in the direct estimate of the true population or from other errors, such as matching error. To address this problem and predict a more reliable coverage error rate, we propose a set of spatio-temporal mixed models. In an illustrative study of the 2010 census coverage error rate for U.S. counties with populations of more than 100,000, we select the best mixed model for prediction using the deviance information criterion (DIC) and the conditional predictive ordinate (CPO). Our proposed approach for predicting the coverage error rate and its measure of uncertainty is fully Bayesian, and it leads to a reasonable improvement over the direct coverage error rate provided by the U.S. Census Bureau in terms of mean squared error (MSE) and confidence interval (CI).
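The direct estimate described above can be illustrated with a minimal Python sketch. It assumes the simplest form of the dual system estimator (the Lincoln-Petersen capture-recapture formula) and defines the rate as the net error divided by the estimated true population. The function names, the rate definition, and the numbers are hypothetical, chosen only for illustration; the estimator used in practice includes further adjustments, for example for erroneous enumerations and matching error.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Lincoln-Petersen estimate of the true population size.

    census_count  -- individuals counted in the census (first capture)
    survey_count  -- individuals counted in an independent survey (recapture)
    matched_count -- individuals found in both lists
    """
    if matched_count <= 0:
        raise ValueError("matched_count must be positive")
    return census_count * survey_count / matched_count


def net_coverage_error(census_count, survey_count, matched_count):
    """Return (estimated population, net coverage error, error rate).

    The net error is the estimated true population minus the census count,
    so a positive value indicates a net undercount. Dividing by the
    estimated population to obtain a rate is an assumption of this sketch.
    """
    n_hat = dual_system_estimate(census_count, survey_count, matched_count)
    error = n_hat - census_count
    rate = error / n_hat
    return n_hat, error, rate


# Hypothetical area: 9,500 people counted, 1,000 surveyed, 950 matched.
n_hat, error, rate = net_coverage_error(9500, 1000, 950)
print(n_hat, error, rate)
```

With few matched records, the denominator of the estimator is small and the estimate becomes unstable, which is one way to see why direct estimates for local areas carry substantial uncertainty.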
