A nonparametric method to predict non-Markovian time series of partially observed dynamics is developed. The prediction problem we consider is a supervised learning task of finding a regression function that maps a delay-embedded observable to the observable at a future time. When delay embedding theory is applicable, the proposed regression function is a consistent estimator of the flow map induced by the delay embedding. Furthermore, the corresponding Mori-Zwanzig equation governing the evolution of the observable simplifies to only a Markovian term, represented by the regression function. We realize this supervised learning task with a class of kernel-based linear estimators, the kernel analog forecast (KAF), which are consistent in the limit of large data. In a scenario with a high-dimensional covariate space, we employ a Markovian kernel smoothing method which is computationally cheaper than the Nyström projection method for realizing KAF. In addition to the guaranteed theoretical convergence, we numerically demonstrate the effectiveness of this approach on higher-dimensional problems where the relevant kernel features are difficult to capture with the Nyström method. Given noisy training data, we propose a nonparametric smoother as a denoising method. Numerically, we show that the proposed smoother is more accurate than EnKF and 4D-Var in denoising signals corrupted by independent (but not necessarily identically distributed) noise, even if the smoother is constructed using a data set corrupted by white noise. We show skillful prediction using the KAF constructed from the denoised data.
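As a rough illustration of the regression task described above (delay-embedded observable in, future observable out), the sketch below uses a plain Nadaraya-Watson Gaussian kernel smoother. This is a generic stand-in, not the KAF construction of the abstract; the Hénon map test system, the function names, the embedding length `q=2`, and the bandwidth `0.1` are all assumptions chosen for the example.

```python
import math

def henon_x(n, a=1.4, b=0.3, transient=100):
    """Return n samples of the x-coordinate of the Henon map (y is unobserved)."""
    x, y = 0.0, 0.0
    out = []
    for i in range(n + transient):
        x, y = 1.0 - a * x * x + y, b * x  # right-hand side uses the old x
        if i >= transient:
            out.append(x)
    return out

def delay_embed(series, q, lead=1):
    """Pair each window of q consecutive values with the value `lead` steps later."""
    X, Y = [], []
    for t in range(q - 1, len(series) - lead):
        X.append(series[t - q + 1 : t + 1])
        Y.append(series[t + lead])
    return X, Y

def kernel_forecast(X_train, Y_train, x, bandwidth):
    """Nadaraya-Watson estimate: Gaussian-weighted average of training targets."""
    num = den = 0.0
    for xi, yi in zip(X_train, Y_train):
        d2 = sum((u - v) ** 2 for u, v in zip(xi, x))
        w = math.exp(-d2 / (2.0 * bandwidth ** 2))
        num += w * yi
        den += w
    # Fall back to the training mean if every weight underflows to zero.
    return num / den if den > 0 else sum(Y_train) / len(Y_train)

def rmse(pred, truth):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

series = henon_x(1200)
X, Y = delay_embed(series, q=2)
X_train, Y_train = X[:1000], Y[:1000]
X_test, Y_test = X[1000:], Y[1000:]

pred = [kernel_forecast(X_train, Y_train, x, bandwidth=0.1) for x in X_test]
mean_y = sum(Y_train) / len(Y_train)

forecast_rmse = rmse(pred, Y_test)
baseline_rmse = rmse([mean_y] * len(Y_test), Y_test)  # climatological-mean baseline
print(f"kernel forecast RMSE: {forecast_rmse:.3f}, mean baseline RMSE: {baseline_rmse:.3f}")
```

With two delays the next value of the Hénon x-coordinate is a function of the embedded state, so the smoother has a well-defined target to approximate; for other systems the embedding length and bandwidth would need tuning.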
Due to their dynamic nature, chaotic time series are difficult to predict. In conventional signal processing approaches, signals are treated either in the time domain or in the space domain only. Spatio-temporal analysis of a signal offers advantages over conventional uni-dimensional approaches by harnessing information from both the temporal and spatial domains. Herein, we propose a spatio-temporal extension of RBF neural networks for the prediction of chaotic time series. The proposed algorithm utilizes the concept of time-space orthogonality and deals separately with the temporal dynamics and the spatial non-linearity (complexity) of the chaotic series. The proposed RBF architecture is explored for the prediction of the Mackey-Glass time series, and the results are compared with the standard RBF. The spatio-temporal RBF is shown to outperform the standard RBFNN by achieving a significantly reduced estimation error.
We propose a new estimator to measure directed dependencies in time series. The dimensionality of the data is first reduced using a new non-uniform embedding technique, where the variables are ranked according to a weighted sum of the amount of new information and the improvement in prediction accuracy that each variable provides. Then, using a greedy approach, the most informative subsets are selected iteratively. The algorithm terminates when the highest-ranked variable is not able to significantly improve the accuracy of the prediction compared to that obtained using the already selected subsets. In a simulation study, we compare our estimator to existing state-of-the-art methods at different data lengths and directed-dependency strengths. It is demonstrated that the proposed estimator has a significantly higher accuracy than existing methods, especially in the difficult case where the data are highly correlated and coupled. Moreover, we show that its rate of false detection of directed dependencies caused by instantaneous coupling effects is lower than that of existing measures. We also show the applicability of the proposed estimator to real intracranial electroencephalography data.
Inferring nonlinear and asymmetric causal relationships between multivariate longitudinal data is a challenging task with wide-ranging application areas including clinical medicine, mathematical biology, economics and environmental research. A number of methods for inferring causal relationships within complex dynamic and stochastic systems have been proposed, but there is no unified, consistent definition of causality in this context. We evaluate the performance of ten prominent bivariate causality indices for time series data across four simulated model systems that have different coupling schemes and characteristics. In further experiments, we show that these methods may not always be invariant to transformations relevant to real-world data (data availability, standardisation and scaling, rounding error, missing data and noisy data). We recommend transfer entropy and nonlinear Granger causality as likely to be particularly robust indices for estimating bivariate causal relationships in real-world applications. Finally, we provide flexible open-access Python code for computation of these methods and for the model simulations.
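To make the idea of a bivariate causality index concrete, here is a minimal sketch of the simplest linear, lag-1 Granger index: the log ratio of the residual variance of a model that predicts the target from its own past to that of a model that also uses the past of the source. This is not the authors' code and not the nonlinear variant they recommend; the coupled AR(1) test system, the coefficients, and the function names are assumptions made for the example.

```python
import math
import random

def simulate_coupled_ar(n, coupling=0.4, seed=1, burn_in=200):
    """Two AR(1) processes where x drives y (and y never drives x)."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    xs, ys = [], []
    for _ in range(n + burn_in):
        x_new = 0.5 * x + rng.gauss(0.0, 1.0)
        y_new = 0.5 * y + coupling * x + rng.gauss(0.0, 1.0)
        x, y = x_new, y_new
        xs.append(x)
        ys.append(y)
    return xs[burn_in:], ys[burn_in:]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def granger_index(source, target):
    """Lag-1 linear Granger index, ln(var_restricted / var_unrestricted)."""
    y_now, y_past, x_past = target[1:], target[:-1], source[:-1]
    n = len(y_now)

    # Restricted model: y_t = a * y_{t-1} + error (least squares, no intercept).
    a_r = dot(y_now, y_past) / dot(y_past, y_past)
    var_r = sum((yn - a_r * yp) ** 2 for yn, yp in zip(y_now, y_past)) / n

    # Unrestricted model: y_t = a * y_{t-1} + b * x_{t-1} + error,
    # solved from the 2x2 normal equations in closed form.
    s11, s12, s22 = dot(y_past, y_past), dot(y_past, x_past), dot(x_past, x_past)
    r1, r2 = dot(y_now, y_past), dot(y_now, x_past)
    det = s11 * s22 - s12 ** 2
    a = (r1 * s22 - r2 * s12) / det
    b = (r2 * s11 - r1 * s12) / det
    var_u = sum(
        (yn - a * yp - b * xp) ** 2 for yn, yp, xp in zip(y_now, y_past, x_past)
    ) / n

    return math.log(var_r / var_u)

xs, ys = simulate_coupled_ar(5000)
g_xy = granger_index(xs, ys)  # x -> y, the true coupling direction
g_yx = granger_index(ys, xs)  # y -> x, no coupling in the simulation
print(f"G(x->y) = {g_xy:.3f}, G(y->x) = {g_yx:.4f}")
```

Because the two models are nested, the index is non-negative; in practice a significance test would be needed before reading a small positive value as evidence of coupling.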
In applications spanning from image analysis and speech recognition to energy dissipation in turbulence and time-to-failure of fatigued materials, researchers and engineers want to calculate how often a stochastic observable crosses a specific level, such as zero. At first glance this problem looks simple, but it is in fact theoretically very challenging, and therefore few exact results exist. One exception is the celebrated Rice formula, which gives the mean number of zero-crossings in a fixed time interval of a zero-mean Gaussian stationary process. In this study we use the so-called Independent Interval Approximation to go beyond Rice's result and derive analytic expressions for all higher-order zero-crossing cumulants and moments. Our results agree well with simulations for the non-Markovian autoregressive model.
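The mean crossing count can be checked numerically in discrete time. For a zero-mean stationary Gaussian sequence with lag-1 autocorrelation rho, the probability that two consecutive samples have opposite signs is arccos(rho)/pi, which plays the role of the mean zero-crossing rate per step. The sketch below compares that value with a simulation of an AR(2) process; the AR(2) coefficients, sample size, and function names are assumptions for illustration, and the higher-order cumulants derived in the study are not reproduced here.

```python
import math
import random

def simulate_ar2(a1, a2, n, seed=0, burn_in=1000):
    """Gaussian AR(2): x_t = a1*x_{t-1} + a2*x_{t-2} + noise, after a burn-in."""
    rng = random.Random(seed)
    prev2, prev1 = 0.0, 0.0
    series = []
    for i in range(n + burn_in):
        x = a1 * prev1 + a2 * prev2 + rng.gauss(0.0, 1.0)
        prev2, prev1 = prev1, x
        if i >= burn_in:
            series.append(x)
    return series

def zero_crossing_rate(series):
    """Fraction of consecutive pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(series, series[1:]) if a * b < 0)
    return crossings / (len(series) - 1)

def expected_crossing_rate(rho):
    """P(sign change) for a zero-mean bivariate Gaussian pair with correlation rho."""
    return math.acos(rho) / math.pi

a1, a2 = 0.5, 0.2
rho1 = a1 / (1.0 - a2)  # lag-1 autocorrelation of AR(2) from the Yule-Walker equations

series = simulate_ar2(a1, a2, 200_000)
empirical = zero_crossing_rate(series)
theory = expected_crossing_rate(rho1)
print(f"empirical rate: {empirical:.4f}, arccos(rho)/pi: {theory:.4f}")
```

Only the mean is fixed by the lag-1 correlation; the variance and higher cumulants of the crossing count depend on the full correlation structure, which is where approximations such as the one in the study come in.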
While Internet of Things (IoT) devices and sensors create continuous streams of information, Big Data infrastructures are expected to handle the influx of data in real time. One type of such a continuous stream of information is time series data. Because time series are rich in information and summary statistics are inadequate to capture the structures and patterns in such data, the development of new approaches for learning from time series is of interest. In this paper, we propose a novel method, called pattern tree, to learn patterns in time series using a binary-structured tree. While a pattern tree can be used for many purposes, such as lossless compression, prediction and anomaly detection, in this paper we focus on its application in time series estimation and forecasting. In comparison to other methods, our proposed pattern tree method improves the mean squared error of estimation.