No Arabic abstract
We propose a novel probabilistic method for detection of objects in noisy images. The method uses results from percolation and random graph theories. We present an algorithm that allows to detect objects of unknown shapes in the presence of random noise. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. We prove results on consistency and algorithmic complexity of our procedure.
We develop an unsupervised, nonparametric, and scalable statistical learning method for detection of unknown objects in noisy images. The method uses results from percolation theory and random graph theory. We present an algorithm that allows to detect objects of unknown shapes and sizes in the presence of nonparametric noise of unknown level. The noise density is assumed to be unknown and can be very irregular. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. We prove strong consistency and scalability of our method in this setup with minimal assumptions.
By means of Monte Carlo simulations, we study long-range site percolation on square and simple cubic lattices with various combinations of nearest neighbors, up to the eighth neighbors for the square lattice and the ninth neighbors for the simple cubic lattice. We find precise thresholds for 23 systems using a single-cluster growth algorithm. Site percolation on lattices with compact neighborhoods can be mapped to problems of lattice percolation of extended shapes, such as disks and spheres, and the thresholds can be related to the continuum thresholds $eta_c$ for objects of those shapes. This mapping implies $zp_{c} sim 4 eta_c = 4.51235$ in 2D and $zp_{c} sim 8 eta_c = 2.73512$ in 3D for large $z$ for circular and spherical neighborhoods respectively, where $z$ is the coordination number. Fitting our data to the form $p_c = c/(z+b)$ we find good agreement with $c = 2^d eta_c$; the constant $b$ represents a finite-$z$ correction term. We also study power-law fits of the thresholds.
Global sensitivity analysis (GSA) of numerical simulators aims at studying the global impact of the input uncertainties on the output. To perform the GSA, statistical tools based on inputs/output dependence measures are commonly used. We focus here on dependence measures based on reproducing kernel Hilbert spaces: the Hilbert-Schmidt Independence Criterion denoted HSIC. Sometimes, the probability distributions modeling the uncertainty of inputs may be themselves uncertain and it is important to quantify the global impact of this uncertainty on GSA results. We call it here the second-level global sensitivity analysis (GSA2). However, GSA2, when performed with a double Monte Carlo loop, requires a large number of model evaluations which is intractable with CPU time expensive simulators. To cope with this limitation, we propose a new statistical methodology based on a single Monte Carlo loop with a limited calculation budget. Firstly, we build a unique sample of inputs from a well chosen probability distribution and the associated code outputs are computed. From this inputs/output sample, we perform GSA for various assumed probability distributions of inputs by using weighted HSIC measures estimators. Statistical properties of these weighted esti-mators are demonstrated. Finally, we define 2 nd-level HSIC-based measures between the probability distributions of inputs and GSA results, which constitute GSA2 indices. The efficiency of our GSA2 methodology is illustrated on an analytical example, thereby comparing several technical options. Finally, an application to a test case simulating a severe accidental scenario on nuclear reactor is provided.
The statistical analysis of Randomized Numerical Linear Algebra (RandNLA) algorithms within the past few years has mostly focused on their performance as point estimators. However, this is insufficient for conducting statistical inference, e.g., constructing confidence intervals and hypothesis testing, since the distribution of the estimator is lacking. In this article, we develop an asymptotic analysis to derive the distribution of RandNLA sampling estimators for the least-squares problem. In particular, we derive the asymptotic distribution of a general sampling estimator with arbitrary sampling probabilities. The analysis is conducted in two complementary settings, i.e., when the objective of interest is to approximate the full sample estimator or is to infer the underlying ground truth model parameters. For each setting, we show that the sampling estimator is asymptotically normally distributed under mild regularity conditions. Moreover, the sampling estimator is asymptotically unbiased in both settings. Based on our asymptotic analysis, we use two criteria, the Asymptotic Mean Squared Error (AMSE) and the Expected Asymptotic Mean Squared Error (EAMSE), to identify optimal sampling probabilities. Several of these optimal sampling probability distributions are new to the literature, e.g., the root leverage sampling estimator and the predictor length sampling estimator. Our theoretical results clarify the role of leverage in the sampling process, and our empirical results demonstrate improvements over existing methods.
Chase-escape percolation is a variation of the standard epidemic spread models. In this model, each site can be in one of three states: unoccupied, occupied by a single prey, or occupied by a single predator. Prey particles spread to neighboring empty sites at rate $p$, and predator particles spread only to neighboring sites occupied by prey particles at rate $1$, killing the prey particle that existed at that site. It was found that the prey can survive with non-zero probability, if $p>p_c$ with $p_c<1$. Using Monte Carlo simulations on the square lattice, we estimate the value of $p_c = 0.49451 pm 0.00001$, and the critical exponents are consistent with the undirected percolation universality class. We define a discrete-time parallel-update version of the model, which brings out the relation between chase-escape and undirected bond percolation. For all $p < p_c$ in $D$-dimensions, the number of predators in the absorbing configuration has a stretched-exponential distribution in contrast to the exponential distribution in the standard percolation theory. We also study the problem starting from the line initial condition with predator particles on all lattice points of the line $y=0$ and prey particles on the line $y=1$. In this case, for $p_c<p < 1$, the center of mass of the fluctuating prey and predator fronts travel at the same speed. This speed is strictly smaller than the speed of an Eden front with the same value of $p$, but with no predators. At $p=1$, the fronts undergo a depinning transition. The fluctuations of the front follow Kardar-Parisi-Zhang scaling both above and below this depinning transition.