Sequential Monte Carlo for Sampling Balanced and Compact Redistricting Plans

Published by Cory McCartan
Publication date: 2020
Research language: English





Random sampling of graph partitions under constraints has become a popular tool for evaluating legislative redistricting plans. Analysts detect partisan gerrymandering by comparing a proposed redistricting plan with an ensemble of sampled alternative plans. For successful application, sampling methods must scale to large maps with many districts, incorporate realistic legal constraints, and accurately and efficiently sample from a selected target distribution. Unfortunately, most existing methods struggle in at least one of these three areas. We present a new Sequential Monte Carlo (SMC) algorithm that draws representative redistricting plans from a realistic target distribution of choice. Because it samples directly, the SMC algorithm can explore the relevant space of redistricting plans more efficiently than the existing Markov chain Monte Carlo (MCMC) algorithms that yield dependent samples. Our algorithm can simultaneously incorporate several constraints commonly imposed in real-world redistricting problems, including equal population, compactness, and preservation of administrative boundaries. We validate the accuracy of the proposed algorithm by using a small map where all redistricting plans can be enumerated. We then apply the SMC algorithm to evaluate the partisan implications of several maps submitted by relevant parties in a recent high-profile redistricting case in the state of Pennsylvania. We find that the proposed algorithm is roughly 40 times more efficient in sampling from the target distribution than a state-of-the-art MCMC algorithm. Open-source software is available for implementing the proposed methodology.
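The ensemble comparison described in the abstract can be illustrated with a minimal sketch. It assumes that each sampled plan has already been reduced to a single summary statistic (here, a party's seat count); the function name `ensemble_tail_share` and the seat counts are hypothetical and made up for illustration, not taken from the paper or its software.

```python
def ensemble_tail_share(ensemble_values, proposed_value):
    """Share of sampled plans whose statistic is at least the proposed plan's."""
    if not ensemble_values:
        raise ValueError("ensemble must contain at least one sampled plan")
    at_least = sum(1 for v in ensemble_values if v >= proposed_value)
    return at_least / len(ensemble_values)


# Made-up seat counts for one party across ten sampled plans (illustration only).
sampled_seats = [8, 9, 9, 10, 9, 8, 10, 9, 9, 8]
proposed_seats = 12

share = ensemble_tail_share(sampled_seats, proposed_seats)
print(share)
```

A share near zero would suggest that the proposed plan is an outlier relative to the sampled alternatives, which is only meaningful if the ensemble is representative of the chosen target distribution.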




Read also

We use a multivariate formulation of the sequential Monte Carlo filter that utilizes mechanistic models for Ebola virus propagation and available incidence data to simultaneously estimate the disease progression states and the model parameters. This method has the advantage of performing the inference online as new data becomes available and estimates the evolution of the basic reproductive ratio $R_0(t)$ of the Ebola outbreak through time. Our analysis identifies a peak in the basic reproductive ratio close to the time when Ebola cases were reported in Europe and the USA.
Decision making for dynamic systems is challenging due to the scale and dynamicity of such systems, and it is comprised of decisions at strategic, tactical, and operational levels. One of the most important aspects of decision making is incorporating real-time information that reflects the immediate status of the system. This type of decision making, which may apply to any dynamic system, needs to comply with the system's current capabilities and calls for a dynamic data driven planning framework. The performance of dynamic data driven planning frameworks relies on the decision making process, which in turn depends on the quality of the available data. This means that the planning framework should be able to set the level of decision making based on the current status of the system, which is learned through the continuous readings of sensory data. In this work, a Markov chain Monte Carlo sampling method is proposed to determine the optimal fidelity of decision making in a dynamic data driven framework. To evaluate the performance of the proposed method, an experiment is conducted in which the impact of workers' performance on the production capacity and the fidelity level of decision making is studied.
The US Census Bureau plans to protect the privacy of 2020 Census respondents through its Disclosure Avoidance System (DAS), which attempts to achieve differential privacy guarantees by adding noise to the Census microdata. By applying redistricting simulation and analysis methods to DAS-protected 2010 Census data, we find that the protected data are not of sufficient quality for redistricting purposes. We demonstrate that the injected noise makes it impossible for states to accurately comply with the One Person, One Vote principle. Our analysis finds that the DAS-protected data are biased against certain areas, depending on voter turnout and partisan and racial composition, and that these biases lead to large and unpredictable errors in the analysis of partisan and racial gerrymanders. Finally, we show that the DAS algorithm does not universally protect respondent privacy. Based on the names and addresses of registered voters, we are able to predict their race as accurately using the DAS-protected data as when using the 2010 Census data. Despite this, the DAS-protected data can still inaccurately estimate the number of majority-minority districts. We conclude with recommendations for how the Census Bureau should proceed with privacy protection for the 2020 Census.
This paper explores the application of methods from information geometry to the sequential Monte Carlo (SMC) sampler. In particular the Riemannian manifold Metropolis-adjusted Langevin algorithm (mMALA) is adapted for the transition kernels in SMC. Similar to its function in Markov chain Monte Carlo methods, the mMALA is a fully adaptable kernel which allows for efficient sampling of high-dimensional and highly correlated parameter spaces. We set up the theoretical framework for its use in SMC with a focus on the application to the problem of sequential Bayesian inference for dynamical systems as modelled by sets of ordinary differential equations. In addition, we argue that defining the sequence of distributions on geodesics optimises the effective sample sizes in the SMC run. We illustrate the application of the methodology by inferring the parameters of simulated Lotka-Volterra and Fitzhugh-Nagumo models. In particular we demonstrate that compared to employing a standard adaptive random walk kernel, the SMC sampler with an information geometric kernel design attains a higher level of statistical robustness in the inferred parameters of the dynamical systems.
Sequential Monte Carlo (SMC), also known as particle filters, has been widely accepted as a powerful computational tool for making inference with dynamical systems. A key step in SMC is resampling, which plays the role of steering the algorithm towards the future dynamics. Several strategies have been proposed and used in practice, including multinomial resampling, residual resampling (Liu and Chen 1998), optimal resampling (Fearnhead and Clifford 2003), stratified resampling (Kitagawa 1996), and optimal transport resampling (Reich 2013). We show that, in the one dimensional case, optimal transport resampling is equivalent to stratified resampling on the sorted particles, and they both minimize the resampling variance as well as the expected squared energy distance between the original and resampled empirical distributions; in the multidimensional case, the variance of stratified resampling after sorting particles using Hilbert curve (Gerber et al. 2019) in $\mathbb{R}^d$ is $O(m^{-(1+2/d)})$, an improved rate compared to the original $O(m^{-(1+1/d)})$, where $m$ is the number of resampled particles. This improved rate is the lowest for ordered stratified resampling schemes, as conjectured in Gerber et al. (2019). We also present an almost sure bound on the Wasserstein distance between the original and Hilbert-curve-resampled empirical distributions. In light of these theoretical results, we propose the stratified multiple-descendant growth (SMG) algorithm, which allows us to explore the sample space more efficiently compared to the standard i.i.d. multiple-descendant sampling-resampling approach as measured by the Wasserstein metric. Numerical evidence is provided to demonstrate the effectiveness of our proposed method.
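The stratified resampling step named in the abstract above can be sketched as follows. This is a minimal, generic version: it draws one uniform variate in each of $m$ equal-width strata of $[0, 1)$ and maps it through the cumulative weights. The function name `stratified_resample` is hypothetical, and the sketch does not include the particle sorting, the Hilbert-curve ordering, or the SMG algorithm discussed in that abstract.

```python
import random
from itertools import accumulate


def stratified_resample(weights, m, rng=None):
    """Return m particle indices drawn by stratified resampling."""
    rng = rng or random.Random()
    total = sum(weights)
    # Normalized cumulative weights; the last entry is forced to 1.0
    # to guard against floating-point shortfall.
    cumulative = [c / total for c in accumulate(weights)]
    cumulative[-1] = 1.0

    indices = []
    i = 0
    for k in range(m):
        # One uniform draw inside the k-th stratum [k/m, (k+1)/m).
        u = (k + rng.random()) / m
        # The draws are increasing in k, so the pointer i only moves forward.
        while cumulative[i] < u:
            i += 1
        indices.append(i)
    return indices


# Example: three particles with weights 0.5, 0.25, 0.25 resampled to 8 particles.
picked = stratified_resample([0.5, 0.25, 0.25], 8, random.Random(0))
print(picked)
```

Because each stratum contributes exactly one draw, the number of copies of each particle stays close to $m$ times its weight, which is the variance reduction over multinomial resampling that motivates the method.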