We derive the second-order sampling properties of certain autocovariance and autocorrelation estimators for sequences of independent and identically distributed samples. Specifically, the estimators we consider are the classic lag-windowed correlogram, the correlogram with subtracted sample mean, and the fixed-length summation correlogram. For each correlogram we derive explicit formulas for the bias, covariance, mean square error, and consistency for generalised higher-order white noise sequences. In particular, this class of sequences may have non-zero means, may be complex valued, and also includes non-analytic noise signals. We find that these commonly used correlograms exhibit lag-dependent covariance even though the underlying processes are white and therefore, by definition, uncorrelated across lags.
We derive the bias, variance, covariance, and mean square error of the standard lag-windowed correlogram estimator, both with and without sample mean removal, for complex white noise with an arbitrary mean. We find that the arbitrary mean introduces lag-dependent covariance between different lags of the correlogram estimates, in spite of the fact that white noise has zero covariance at non-zero lags. We provide a heuristic rule for when the sample mean should, and should not, be removed if the true mean is not known. The sampling properties derived here are useful in assessing the general statistical performance of autocovariance and autocorrelation estimators in different parameter regimes. Alternatively, the sampling properties could be used as bounds on the detection of a weak signal in general white noise.
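The effect of an unremoved mean is easy to see numerically. The following is a minimal sketch (not the paper's derivation): it computes the biased lag-windowed correlogram of complex white noise with an assumed, illustrative mean $\mu = 1 + 0.5i$, with and without sample-mean subtraction. Without subtraction, the non-zero lags sit near $|\mu|^2$ rather than near zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Complex white noise with a non-zero mean (illustrative parameters, not from the paper).
N, mu = 4096, 1.0 + 0.5j
x = mu + (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

def correlogram(x, max_lag, remove_mean=False):
    """Biased lag-windowed correlogram r[k] = (1/N) sum_n x[n+k] conj(x[n])."""
    if remove_mean:
        x = x - x.mean()
    N = len(x)
    return np.array([np.sum(x[k:] * np.conj(x[:N - k])) / N
                     for k in range(max_lag + 1)])

r_raw = correlogram(x, 5)             # non-zero lags hover near |mu|^2 = 1.25
r_demeaned = correlogram(x, 5, True)  # non-zero lags hover near zero

print(np.round(np.abs(r_raw), 3))
print(np.round(np.abs(r_demeaned), 3))
```

Here the noise variance is 1, so the raw correlogram at lag 0 is close to $1 + |\mu|^2$, while every non-zero lag is biased by roughly $|\mu|^2$ unless the sample mean is removed first.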
Inverse problems defined on the sphere arise in many fields, and are generally high-dimensional and computationally very complex. As a result, sampling the posterior of spherical inverse problems is a challenging task. In this work, we describe a framework that leverages a proximal Markov chain Monte Carlo algorithm to efficiently sample the high-dimensional space of spherical inverse problems with a sparsity-promoting wavelet prior. We detail the modifications needed for the algorithm to be applied to spherical problems, and give special consideration to the crucial forward modelling step which contains spherical harmonic transforms that are computationally expensive. By sampling the posterior, our framework allows for full and flexible uncertainty quantification, something which is not possible with other methods based on, for example, convex optimisation. We demonstrate our framework in practice on a common problem in global seismic tomography. We find that our approach is potentially useful for a wide range of applications at moderate resolutions.
We propose a novel method for computing $p$-values based on nested sampling (NS) applied to the sampling space rather than the parameter space of the problem, in contrast to its usage in Bayesian computation. The computational cost of NS scales as $\log^2(1/p)$, which compares favorably to the $1/p$ scaling for Monte Carlo (MC) simulations. For significances greater than about $4\sigma$ in both a toy problem and a simplified resonance search, we show that NS requires orders of magnitude fewer simulations than ordinary MC estimates. This is particularly relevant for high-energy physics, which adopts a $5\sigma$ gold standard for discovery. We conclude with remarks on new connections between Bayesian and frequentist computation and possibilities for tuning NS implementations for still better performance in this setting.
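The scaling gap is dramatic at discovery-level significances. A short back-of-the-envelope sketch (the constant factors are ignored; only the stated scalings are used) compares $1/p$ with $\log^2(1/p)$ for one-sided Gaussian significances:

```python
import math

# One-sided tail probability p for a z-sigma threshold: p = Phi(-z).
def p_of_sigma(z):
    return 0.5 * math.erfc(z / math.sqrt(2))

for z in (3, 4, 5):
    p = p_of_sigma(z)
    mc_cost = 1.0 / p                 # ~ simulations needed by plain Monte Carlo
    ns_cost = math.log(1.0 / p) ** 2  # stated NS scaling, up to constants
    print(f"{z} sigma: p = {p:.2e}, 1/p = {mc_cost:.1e}, log^2(1/p) = {ns_cost:.0f}")
```

At $5\sigma$, $p \approx 2.9 \times 10^{-7}$, so plain MC needs on the order of $10^6$–$10^7$ simulations while the $\log^2(1/p)$ scaling is in the low hundreds, consistent with the orders-of-magnitude savings described above.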
A phenomenological systems approach for identifying potential precursors in multiple signals of different types for the same local seismically active region is proposed, based on the assumption that a large earthquake may be preceded by a system reconfiguration (preparation) at different time and space scales. A nonstationarity factor introduced within the framework of flicker-noise spectroscopy, a statistical physics approach to the analysis of time series, is used as the dimensionless criterion for detecting qualitative (precursory) changes within relatively short time intervals in arbitrary signals. Nonstationarity factors for chlorine-ion concentration variations in the underground water of two boreholes on the Kamchatka peninsula and geoacoustic emissions in a deep borehole within the same seismic zone are studied together in the time frame around a large earthquake on October 8, 2001. It is shown that nonstationarity factor spikes (potential precursors) take place in the interval from 70 to 50 days before the earthquake for the hydrogeochemical data and at 29 and 6 days in advance for the geoacoustic data.
Deep neural networks, when optimized with sufficient data, provide accurate representations of high-dimensional functions; in contrast, function approximation techniques that have predominated in scientific computing do not scale well with dimensionality. As a result, many high-dimensional sampling and approximation problems once thought intractable are being revisited through the lens of machine learning. While the promise of unparalleled accuracy may suggest a renaissance for applications that require parameterizing representations of complex systems, in many applications gathering sufficient data to develop such a representation remains a significant challenge. Here we introduce an approach that combines rare-event sampling techniques with neural network optimization to optimize objective functions that are dominated by rare events. We show that importance sampling reduces the asymptotic variance of the solution to a learning problem, suggesting benefits for generalization. We study our algorithm in the context of learning dynamical transition pathways between two states of a system, a problem with applications in statistical physics and implications in machine learning theory. Our numerical experiments demonstrate that we can successfully learn even with the compounding difficulties of high dimensionality and rare data.
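The variance-reduction mechanism at the heart of this approach can be illustrated on a textbook problem. The sketch below (a generic importance-sampling demonstration, not the paper's algorithm) estimates the rare tail probability $P(X > 4)$ for $X \sim \mathcal{N}(0,1)$, comparing a naive estimator with one that samples from a shifted proposal $\mathcal{N}(4,1)$ and reweights by the likelihood ratio:

```python
import numpy as np

rng = np.random.default_rng(0)
n, a = 100_000, 4.0  # rare event: P(X > a) for X ~ N(0, 1), illustrative choice

# Naive Monte Carlo: almost no samples land in the rare region.
x = rng.standard_normal(n)
naive = (x > a).astype(float)

# Importance sampling: draw from N(a, 1) and reweight by the likelihood ratio
# w(y) = phi(y) / phi(y - a) = exp(a^2/2 - a*y).
y = rng.standard_normal(n) + a
w = np.exp(a * a / 2 - a * y)
isamp = (y > a) * w

print(f"naive: {naive.mean():.2e} +/- {naive.std(ddof=1) / np.sqrt(n):.1e}")
print(f"IS:    {isamp.mean():.2e} +/- {isamp.std(ddof=1) / np.sqrt(n):.1e}")
```

With the same budget of samples, the naive estimator sees only a handful of hits (the true value is about $3.2 \times 10^{-5}$), while the reweighted estimator concentrates its samples where the objective is non-trivial and achieves a far smaller standard error, the same qualitative effect claimed for the learning problem above.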