Symbolic Analysis-based Reduced Order Markov Modeling of Time Series Data

331 0 0.0 ( 0 )

Download Cite

Added by Abhishek Srivastav

Publication date 2017

fields Mathematical Statistics

and research's language is English

Authors Devesh K Jha - Nurali Virani - Jan Reimann

Machine Learning

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

This paper presents a technique for reduced-order Markov modeling for compact representation of time-series data. In this work, symbolic dynamics-based tools have been used to infer an approximate generative Markov model. The time-series data are first symbolized by partitioning the continuous measurement space of the signal and then, the discrete sequential data are modeled using symbolic dynamics. In the proposed approach, the size of temporal memory of the symbol sequence is estimated from spectral properties of the resulting stochastic matrix corresponding to a first-order Markov model of the symbol sequence. Then, hierarchical clustering is used to represent the states of the corresponding full-state Markov model to construct a reduced-order or size Markov model with a non-deterministic algebraic structure. Subsequently, the parameters of the reduced-order Markov model are identified from the original model by making use of a Bayesian inference rule. The final model is selected using information-theoretic criteria. The proposed concept is elucidated and validated on two different data sets as examples. The first example analyzes a set of pressure data from a swirl-stabilized combustor, where controlled protocols are used to induce flame instabilities. Variations in the complexity of the derived Markov model represent how the system operating condition changes from a stable to an unstable combustion regime. In the second example, the data set is taken from NASAs data repository for prognostics of bearings on rotating shafts. We show that, even with a very small state-space, the reduced-order models are able to achieve comparable performance and that the proposed approach provides flexibility in the selection of a final model for representation and learning.

rate research

Markov Modeling of Time-Series Data using Symbolic Analysis

108 - Devesh K. Jha 2021

Markov models are often used to capture the temporal patterns of sequential data for statistical learning applications. While the Hidden Markov modeling-based learning mechanisms are well studied in literature, we analyze a symbolic-dynamics inspired approach. Under this umbrella, Markov modeling of time-series data consists of two major steps -- discretization of continuous attributes followed by estimating the size of temporal memory of the discretized sequence. These two steps are critical for the accurate and concise representation of time-series data in the discrete space. Discretization governs the information content of the resultant discretized sequence. On the other hand, memory estimation of the symbolic sequence helps to extract the predictive patterns in the discretized data. Clearly, the effectiveness of signal representation as a discrete Markov process depends on both these steps. In this paper, we will review the different techniques for discretization and memory estimation for discrete stochastic processes. In particular, we will focus on the individual problems of discretization and order estimation for discrete stochastic process. We will present some results from literature on partitioning from dynamical systems theory and order estimation using concepts of information theory and statistical learning. The paper also presents some related problem formulations which will be useful for machine learning and statistical learning application using the symbolic framework of data analysis. We present some results of statistical analysis of a complex thermoacoustic instability phenomenon during lean-premixed combustion in jet-turbine engines using the proposed Markov modeling method.

Machine Learning Machine Learning

Lagrangian Data-Driven Reduced Order Modeling of Finite Time Lyapunov Exponents

81 - Xuping Xie , Peter J. Nolan , Shane D. Ross 2018

There are two main strategies for improving the projection-based reduced order model (ROM) accuracy: (i) improving the ROM, i.e., adding new terms to the standard ROM; and (ii) improving the ROM basis, i.e., constructing ROM bases that yield more accurate ROMs. In this paper, we use the latter. We propose new Lagrangian inner products that we use together with Eulerian and Lagrangian data to construct new Lagrangian ROMs. We show that the new Lagrangian ROMs are orders of magnitude more accurate than the standard Eulerian ROMs, i.e., ROMs that use standard Eulerian inner product and data to construct the ROM basis. Specifically, for the quasi-geostrophic equations, we show that the new Lagrangian ROMs are more accurate than the standard Eulerian ROMs in approximating not only Lagrangian fields (e.g., the finite time Lyapunov exponent (FTLE)), but also Eulerian fields (e.g., the streamfunction). We emphasize that the new Lagrangian ROMs do not employ any closure modeling to model the effect of discarded modes (which is standard procedure for low-dimensional ROMs of complex nonlinear systems). Thus, the dramatic increase in the new Lagrangian ROMs accuracy is entirely due to the novel Lagrangian inner products used to build the Lagrangian ROM basis.

Fluid Dynamics Computational Physics

Harnessing the power of Topological Data Analysis to detect change points in time series

321 - Umar Islambekov , Monisha Yuvaraj , Yulia R. Gel 2019

We introduce a novel geometry-oriented methodology, based on the emerging tools of topological data analysis, into the change point detection framework. The key rationale is that change points are likely to be associated with changes in geometry behind the data generating process. While the applications of topological data analysis to change point detection are potentially very broad, in this paper we primarily focus on integrating topological concepts with the existing nonparametric methods for change point detection. In particular, the proposed new geometry-oriented approach aims to enhance detection accuracy of distributional regime shift locations. Our simulation studies suggest that integration of topological data analysis with some existing algorithms for change point detection leads to consistently more accurate detection results. We illustrate our new methodology in application to the two closely related environmental time series datasets -ice phenology of the Lake Baikal and the North Atlantic Oscillation indices, in a research query for a possible association between their estimated regime shift locations.

Machine Learning Machine Learning

Probabilistic structure discovery in time series data

75 - David Janz , Brooks Paige , Tom Rainforth 2016

Existing methods for structure discovery in time series data construct interpretable, compositional kernels for Gaussian process regression models. While the learned Gaussian process model provides posterior mean and variance estimates, typically the structure is learned via a greedy optimization procedure. This restricts the space of possible solutions and leads to over-confident uncertainty estimates. We introduce a fully Bayesian approach, inferring a full posterior over structures, which more reliably captures the uncertainty of the model.

Machine Learning Machine Learning

Feature-based time-series analysis

78 - Ben D. Fulcher 2017

This work presents an introduction to feature-based time-series analysis. The time series as a data type is first described, along with an overview of the interdisciplinary time-series analysis literature. I then summarize the range of feature-based representations for time series that have been developed to aid interpretable insights into time-series structure. Particular emphasis is given to emerging research that facilitates wide comparison of feature-based representations that allow us to understand the properties of a time-series dataset that make it suited to a particular feature-based representation or analysis algorithm. The future of time-series analysis is likely to embrace approaches that exploit machine learning methods to partially automate human learning to aid understanding of the complex dynamical patterns in the time series we measure from the world.

Machine Learning