No Arabic abstract
Modern biomedical applications often involve time-series data, from high-throughput phenotyping of model organisms, through to individual disease diagnosis and treatment using biomedical data streams. Data and tools for time-series analysis are developed and applied across the sciences and in industry, but meaningful cross-disciplinary interactions are limited by the challenge of identifying fruitful connections. Here we introduce the web platform CompEngine, a self-organizing, living library of time-series data that lowers the barrier to forming meaningful interdisciplinary connections between time series. Using a canonical feature-based representation, CompEngine places all time series in a common space, regardless of their origin, allowing users to upload their data and immediately explore interdisciplinary connections to other data with similar properties, and to be alerted when similar data are uploaded in the future. In contrast to conventional databases, which are organized by assigned metadata, CompEngine incentivizes data sharing by automatically connecting experimental and theoretical scientists across disciplines based on the empirical structure of their data. CompEngine's growing library of interdisciplinary time-series data also facilitates comprehensive characterization of algorithm performance across diverse types of data, and can be used to empirically motivate the development of new time-series analysis algorithms.
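The idea of placing time series in a common feature space and retrieving neighbors can be sketched as follows. This is a minimal illustration only: the three features, the function names `features` and `nearest`, and the toy library are assumptions made for the example, not CompEngine's actual feature set or API.

```python
import numpy as np

def features(x):
    """Map a time series to a small, illustrative feature vector (hypothetical choice)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()           # remove offset and scale
    ac1 = np.mean(z[:-1] * z[1:])          # approximate lag-1 autocorrelation
    ac5 = np.mean(z[:-5] * z[5:])          # approximate lag-5 autocorrelation
    rough = np.std(np.diff(z))             # spread of successive changes
    return np.array([ac1, ac5, rough])

def nearest(query, library):
    """Return library names sorted by Euclidean distance to `query` in feature space."""
    names = list(library)
    F = np.array([features(library[n]) for n in names])
    mu, sd = F.mean(axis=0), F.std(axis=0) + 1e-12   # normalize each feature over the library
    q = (features(query) - mu) / sd
    d = np.linalg.norm((F - mu) / sd - q, axis=1)
    return [names[i] for i in np.argsort(d)]

rng = np.random.default_rng(0)
t = np.arange(500)
library = {                                 # toy stand-in for a shared data library
    "slow_sine": np.sin(2 * np.pi * t / 50),
    "fast_sine": np.sin(2 * np.pi * t / 5),
    "white_noise": rng.standard_normal(500),
}
# A noisy oscillation "uploaded" by a user, from an unrelated source
query = np.sin(2 * np.pi * t / 40 + 0.3) + 0.02 * rng.standard_normal(500)
ranking = nearest(query, library)
```

Because the comparison uses only properties computed from the data, the query is matched to the slow oscillation without any metadata about where either series came from.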
Spatio-temporally extended nonlinear systems often exhibit a remarkable complexity in space and time. In many cases, extensive datasets of such systems are difficult to obtain, yet needed for a range of applications. Here, we present a method to generate synthetic time series or fields that reproduce statistical multi-scale features of complex systems. The method is based on a hierarchical refinement employing transition probability density functions (PDFs) from one scale to another. We address the case in which such PDFs can be obtained from experimental measurements or simulations and then used to generate arbitrarily large synthetic datasets. The validity of our approach is demonstrated using an experimental dataset of high-Reynolds-number turbulence.
We introduce the concept of time series motifs for time series analysis. Time series motifs consider not only the spatial information of mutual visibility but also the temporal information of relative magnitude between the data points. We study the profiles of the six triadic time series motifs. The six motif occurrence frequencies are derived for uncorrelated time series and are approximately linear functions of the length of the time series. The corresponding motif profile thus converges to a constant vector $(0.2,0.2,0.1,0.2,0.1,0.2)$. These analytical results have been verified by numerical simulations. For fractional Gaussian noises, numerical simulations reveal a nonlinear dependence of the motif occurrence frequencies on the Hurst exponent. Applications of the time series motif analysis show that the motif occurrence frequency distributions capture the different dynamics in the heartbeat rates of healthy subjects, congestive heart failure (CHF) subjects, and atrial fibrillation (AF) subjects, and in the price fluctuations of bullish and bearish markets. Our method shows potential for classifying different types of time series and for testing the time irreversibility of time series.
We report on a novel stochastic analysis of seismic time series for the Earth's vertical velocity, using methods originally developed for complex hierarchical systems, and in particular for turbulent flows. Analysis of the fluctuations of the detrended increments of the series reveals a pronounced change in the shapes of the probability density functions (PDFs) of the series increments. Before and close to an earthquake, the shape of the PDF and the long-range correlation in the increments both change significantly. For a moderate or large earthquake, the typical time at which the PDF undergoes the transition from a Gaussian to a non-Gaussian shape is about 5-10 hours. Thus, the transition represents a new precursor for detecting such earthquakes.
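A rough way to see what a change from a Gaussian to a non-Gaussian increment PDF looks like numerically is sketched below. It is an illustration under stated assumptions, not the analysis used in the paper: the linear detrending, the use of excess kurtosis as a non-Gaussianity proxy, the function name `increment_excess_kurtosis`, and the synthetic random walks are all choices made for this example.

```python
import numpy as np

def increment_excess_kurtosis(x, lag):
    """Excess kurtosis of the lag-`lag` increments of a linearly detrended series.

    A value near 0 is consistent with a Gaussian increment PDF; clearly positive
    values indicate heavier-than-Gaussian tails. (Assumed proxy, for illustration.)
    """
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, 1)      # simple linear detrending
    x = x - (slope * t + intercept)
    dx = x[lag:] - x[:-lag]                     # increments at the chosen scale
    dx = (dx - dx.mean()) / dx.std()            # normalize to zero mean, unit variance
    return np.mean(dx ** 4) - 3.0

rng = np.random.default_rng(1)
n = 50_000
gauss_walk = np.cumsum(rng.standard_normal(n))  # Gaussian steps
heavy_walk = np.cumsum(rng.laplace(size=n))     # heavy-tailed (Laplace) steps

k_gauss = increment_excess_kurtosis(gauss_walk, lag=1)
k_heavy_short = increment_excess_kurtosis(heavy_walk, lag=1)
k_heavy_long = increment_excess_kurtosis(heavy_walk, lag=50)
```

In this synthetic setting the heavy-tailed walk has strongly non-Gaussian increments at short lags that approach Gaussian statistics at longer lags, which is the kind of scale-dependent shape change such an increment analysis is designed to track.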
This paper has been withdrawn by the authors.
To handle time series with complicated oscillatory structure, we propose a novel time-frequency (TF) analysis tool that fuses the short-time Fourier transform (STFT) and the periodic transform (PT). Since many time series oscillate with time-varying frequency, amplitude, and non-sinusoidal oscillatory pattern, a direct application of the PT or the STFT might not be suitable. However, we show that by combining them in a proper way, we obtain a powerful TF analysis tool. We first combine the Ramanujan sums and $l_1$ penalization to implement the PT. We call the algorithm the Ramanujan PT (RPT). The RPT is of independent interest for other applications, such as analyzing short signals composed of components with integer periods, but that is not the focus of this paper. Second, the RPT is applied to modify the STFT and generate a novel TF representation of the complicated time series that faithfully reflects the instantaneous frequency information of each oscillatory component. We coin the proposed TF analysis the Ramanujan de-shape (RDS) and vectorized RDS (vRDS). In addition to showing some preliminary analysis results on complicated biomedical signals, we provide a theoretical analysis of the RPT. Specifically, we show that the RPT is robust to three commonly encountered types of noise: envelope fluctuation, jitter, and additive noise.
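For readers unfamiliar with the building block named above, the Ramanujan sum is $c_q(n) = \sum_{k=1,\ \gcd(k,q)=1}^{q} \cos(2\pi k n / q)$, an integer-valued sequence with period $q$. The sketch below computes only these sums; it does not implement the $l_1$-penalized RPT or the RDS, and the function names are chosen for this example.

```python
from math import gcd, cos, pi

def ramanujan_sum(q, n):
    """Ramanujan sum c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) = 1."""
    total = sum(cos(2 * pi * k * n / q) for k in range(1, q + 1) if gcd(k, q) == 1)
    return round(total)  # the exact value is an integer; rounding removes float error

def ramanujan_sequence(q, length):
    """First `length` samples c_q(0), c_q(1), ... of the q-periodic sequence."""
    return [ramanujan_sum(q, n) for n in range(length)]

c3 = ramanujan_sequence(3, 6)   # two full periods for q = 3
c4 = ramanujan_sequence(4, 4)   # one full period for q = 4
c6 = ramanujan_sequence(6, 6)   # one full period for q = 6
```

Each sequence repeats with its own integer period $q$, which is what makes these sums natural templates for detecting components with integer periods in short signals.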