A data-driven method for improving the correlation estimation in serial ensemble Kalman filters is introduced. The method finds a linear map that transforms, at each assimilation cycle, the poorly estimated sample correlation into an improved correlation. This map is obtained, without any tuning, from an offline training procedure: it is the solution of a linear regression problem that uses appropriate sample correlation statistics obtained from historical data assimilation products. In an idealized observing system simulation experiment (OSSE) with the Lorenz-96 model, and for a range of linear and nonlinear observation models, the proposed scheme improves the filter estimates, especially when the ensemble size is small relative to the dimension of the state space.
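A minimal sketch of the offline regression idea described above, under assumed notation (all names and the stand-in data are illustrative, not the paper's exact setup): given historical pairs of poorly estimated sample correlations and reference correlations, fit the linear map by ordinary least squares and apply it to new sample correlations at assimilation time.

```python
import numpy as np

# Hypothetical sketch: learn a linear map L that sends poorly estimated
# sample correlations to improved ones, by least squares on historical pairs.
rng = np.random.default_rng(0)
n_samples, n_entries = 500, 40
R_ref = np.tanh(rng.standard_normal((n_samples, n_entries)))       # stand-in reference correlations
R_small = R_ref + 0.3 * rng.standard_normal((n_samples, n_entries))  # stand-in small-ensemble estimates

# Offline training: least-squares solution of R_small @ L ≈ R_ref
L, *_ = np.linalg.lstsq(R_small, R_ref, rcond=None)

# Online use: map a new sample correlation through the trained L
r_new = np.tanh(rng.standard_normal(n_entries))
r_improved = r_new @ L
```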
Sequential data with serial correlation and an unknown, unstructured, and dynamic background is ubiquitous in neuroscience, psychology, and econometrics. Inferring serial correlation for such data is a fundamental challenge in statistics. We propose a total variation constrained least-squares estimator, coupled with hypothesis tests, to infer the serial correlation in the presence of an unknown and unstructured dynamic background. The total variation constraint on the dynamic background encourages a piecewise-constant structure, which can approximate a wide range of dynamic backgrounds. The tuning parameter is selected via the Ljung-Box test to control the bias-variance trade-off. We establish a non-asymptotic upper bound for the estimation error through variational inequalities. We also derive a lower error bound via Fano's method and show that the proposed method is near-optimal. Numerical simulations and a real-data study in psychology demonstrate the excellent performance of our proposed method compared with the state of the art.
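A rough illustration of the idea, not the paper's exact estimator: fit a piecewise-constant background under a total variation budget, then read the lag-1 serial correlation off the residuals. The TV budget `tau` plays the role of the tuning parameter, which the paper selects via the Ljung-Box test; the simulated data and all names here are assumptions for the sketch.

```python
import numpy as np
import cvxpy as cp

# Illustrative sketch: TV-constrained least squares for the background,
# then a lag-1 autocorrelation estimate from the residuals.
rng = np.random.default_rng(1)
T = 200
b_true = np.r_[np.zeros(T // 2), 2 * np.ones(T // 2)]   # piecewise-constant background
e = np.zeros(T)
for t in range(1, T):                                   # AR(1) noise with rho = 0.5
    e[t] = 0.5 * e[t - 1] + rng.standard_normal()
y = b_true + e

b = cp.Variable(T)
tau = 5.0  # TV budget; the paper selects the tuning parameter via the Ljung-Box test
cp.Problem(cp.Minimize(cp.sum_squares(y - b)), [cp.tv(b) <= tau]).solve()

r = y - b.value                                         # residuals after background removal
rho_hat = (r[1:] @ r[:-1]) / (r @ r)                    # lag-1 sample autocorrelation
```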
A new type of ensemble Kalman filter is developed, which is based on replacing the sample covariance in the analysis step by its diagonal in a spectral basis. It is proved that this technique improves the approximation of the covariance when the covariance itself is diagonal in the spectral basis, as is the case, e.g., for a second-order stationary random field and the Fourier basis. The method is extended by wavelets to the case when the state variables are random fields that are not spatially homogeneous. Efficient implementations by the fast Fourier transform (FFT) and discrete wavelet transform (DWT) are presented for several types of observations, including high-dimensional data given on a part of the domain, such as radar and satellite images. Computational experiments confirm that the method performs well on the Lorenz 96 problem and the shallow water equations with very small ensembles and over multiple analysis cycles.
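A minimal numpy sketch of the core approximation, assuming a one-dimensional periodic state and the unitary FFT: the ensemble sample covariance is replaced by its diagonal in the Fourier basis, after which applying the covariance to a vector costs only two FFTs.

```python
import numpy as np

# Minimal sketch: spectral-diagonal covariance estimate from an ensemble.
rng = np.random.default_rng(2)
n, N = 128, 10                                   # state dimension, ensemble size
X = rng.standard_normal((n, N))                  # ensemble, one member per column

Xc = X - X.mean(axis=1, keepdims=True)           # centered ensemble
F = np.fft.fft(Xc, axis=0, norm="ortho")         # spectral coefficients of each member
d = (np.abs(F) ** 2).sum(axis=1) / (N - 1)       # spectral variances: diag of C in Fourier basis

# Applying the approximate covariance: C v ≈ F* diag(d) F v, i.e. two FFTs
v = rng.standard_normal(n)
Cv = np.real(np.fft.ifft(d * np.fft.fft(v, norm="ortho"), norm="ortho"))
```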
Data assimilation is concerned with sequentially estimating a temporally evolving state. This task, which arises in a wide range of scientific and engineering applications, is particularly challenging when the state is high-dimensional and the state-space dynamics are unknown. This paper introduces a machine learning framework for learning dynamical systems in data assimilation. Our auto-differentiable ensemble Kalman filters (AD-EnKFs) blend ensemble Kalman filters for state recovery with machine learning tools for learning the dynamics. In doing so, AD-EnKFs leverage the ability of ensemble Kalman filters to scale to high-dimensional states and the power of automatic differentiation to train high-dimensional surrogate models for the dynamics. Numerical results using the Lorenz-96 model show that AD-EnKFs outperform existing methods that use expectation-maximization or particle filters to merge data assimilation and machine learning. In addition, AD-EnKFs are easy to implement and require minimal tuning.
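A schematic numpy version of one filtering step together with the quantity AD-EnKFs differentiate, assuming linear observations `H` and Gaussian noise covariance `R` (all names are illustrative): the Gaussian log-likelihood of the innovation. In the actual framework this computation is written in an autodiff library so that its gradient with respect to the parameters of the surrogate dynamics is available.

```python
import numpy as np

# Sketch of one stochastic EnKF step that also returns the innovation
# log-likelihood, the training signal that AD-EnKFs differentiate.
def enkf_step(X, y, H, R, rng):
    n, N = X.shape
    Xc = (X - X.mean(axis=1, keepdims=True)) / np.sqrt(N - 1)
    S = H @ Xc                                   # observation-space perturbations
    C = S @ S.T + R                              # innovation covariance
    innov = y - H @ X.mean(axis=1)

    # Gaussian log-likelihood of the innovation (the quantity to differentiate)
    _, logdet = np.linalg.slogdet(C)
    loglik = -0.5 * (innov @ np.linalg.solve(C, innov)
                     + logdet + len(y) * np.log(2 * np.pi))

    # Stochastic EnKF analysis with perturbed observations
    K = Xc @ S.T @ np.linalg.inv(C)              # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=N).T
    Xa = X + K @ (Y - H @ X)
    return Xa, loglik
```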
Various methods have been proposed for the nonlinear filtering problem, including the extended Kalman filter (EKF), iterated extended Kalman filter (IEKF), unscented Kalman filter (UKF), and iterated unscented Kalman filter (IUKF). In this paper, two new nonlinear Kalman filters are proposed and investigated, namely the observation-centered extended Kalman filter (OCEKF) and the observation-centered unscented Kalman filter (OCUKF). Although the UKF and EKF are common default choices for nonlinear filtering, there are situations in which they are poor choices. Examples are given where the EKF and UKF perform very poorly while the IEKF and OCEKF perform well. In addition, the IUKF and OCUKF generally behave similarly to the IEKF and OCEKF and also perform well, though care is needed in the choice of tuning parameters when the observation error is small. The reasons for this behaviour are explored in detail.
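For concreteness, a minimal sketch of the standard IEKF measurement update referred to above, i.e. the Gauss-Newton iteration on the posterior mode; the observation function `h`, its Jacobian `jac_h`, and the other names are user-supplied and illustrative. The observation-centered variants (OCEKF, OCUKF) are defined in the paper itself and are not reproduced here.

```python
import numpy as np

# Standard iterated EKF (IEKF) measurement update: Gauss-Newton iteration
# relinearizing h around the current iterate rather than the prior mean.
def iekf_update(x_prior, P, y, h, jac_h, R, n_iter=10):
    x = x_prior.copy()
    for _ in range(n_iter):
        H = jac_h(x)                             # Jacobian at current iterate
        S = H @ P @ H.T + R                      # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)           # gain at current linearization
        x = x_prior + K @ (y - h(x) - H @ (x_prior - x))
    P_post = (np.eye(len(x)) - K @ H) @ P        # covariance from last iterate
    return x, P_post
```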
Ensemble filters implement sequential Bayesian estimation by representing the probability distribution by an ensemble mean and covariance. Unbiased square root ensemble filters use deterministic algorithms to produce an analysis (posterior) ensemble with prescribed mean and covariance, consistent with the Kalman update. This class includes several filters used in practice, such as the Ensemble Transform Kalman Filter (ETKF), the Ensemble Adjustment Kalman Filter (EAKF), and a filter by Whitaker and Hamill. We show that at every time index, as the number of ensemble members increases to infinity, the mean and covariance of an unbiased ensemble square root filter converge to those of the Kalman filter, in the case of a linear model and an initial distribution all of whose moments exist. The convergence is in $L^{p}$, and the convergence rate does not depend on the model dimension. The result also holds in infinite-dimensional Hilbert space.
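Schematically, and with the standard Monte Carlo rate assumed for illustration (the precise constants and rate are given in the paper), the statement for the ensemble mean $\bar{X}^{k}_{N}$ and ensemble covariance $P^{k}_{N}$ at time index $k$ reads
\[
\bigl\| \bar{X}^{k}_{N} - \mu^{k} \bigr\|_{L^{p}} \le \frac{c}{\sqrt{N}}, \qquad
\bigl\| P^{k}_{N} - P^{k} \bigr\|_{L^{p}} \le \frac{c}{\sqrt{N}},
\]
where $\mu^{k}$ and $P^{k}$ are the Kalman filter mean and covariance, $N$ is the ensemble size, and the constant $c = c(k,p)$ does not depend on the dimension of the state space.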