
A systematic review of Python packages for time series analysis

 Added by Julien Siebert
Publication date: 2021
Language: English





This paper presents a systematic review of Python packages with a focus on time series analysis. The objective is to provide (1) an overview of the different time series analysis tasks and preprocessing methods implemented, and (2) an overview of the development characteristics of the packages (e.g., documentation, dependencies, and community size). This review is based on a search of literature databases as well as GitHub repositories. Following the filtering process, 40 packages were analyzed. We classified the packages according to the analysis tasks implemented, the methods related to data preparation, and the means for evaluating the results produced (methods and access to evaluation data). We also reviewed documentation aspects, the licenses, the size of the packages' communities, and the dependencies used. Among other things, our results show that forecasting is by far the most frequently implemented task, that half of the packages provide access to real datasets or allow generating synthetic data, and that many packages depend on a few libraries (the most used ones being numpy, scipy, and pandas). We hope that this review can help practitioners and researchers navigate the space of Python packages dedicated to time series analysis. We will provide an updated list of the reviewed packages online at https://siebert-julien.github.io/time-series-analysis-python/.
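
Since forecasting is the most frequently implemented task and numpy, scipy, and pandas are the most used dependencies, the following minimal sketch shows what a one-step-ahead forecast can look like when built on those core libraries alone. The series, the lag order, and the least-squares autoregression are illustrative choices made here, not taken from any reviewed package.

```python
import numpy as np
import pandas as pd

# Illustrative monthly series with a trend and noise.
rng = np.random.default_rng(0)
idx = pd.date_range("2020-01-01", periods=48, freq="MS")
y = pd.Series(np.linspace(10, 20, 48) + rng.normal(0, 0.5, 48), index=idx)

# Fit a simple AR(p) model by ordinary least squares on lagged values.
p = 3
lagged = pd.concat({f"lag{k}": y.shift(k) for k in range(1, p + 1)}, axis=1).dropna()
X = np.column_stack([np.ones(len(lagged)), lagged.to_numpy()])
target = y.loc[lagged.index].to_numpy()
coef, *_ = np.linalg.lstsq(X, target, rcond=None)

# One-step-ahead forecast from the p most recent observations.
recent = y.iloc[-p:][::-1].to_numpy()  # ordered lag1, lag2, ..., lagp
forecast = coef[0] + coef[1:] @ recent
print(f"next-step forecast: {forecast:.2f}")
```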

Related research

Seglearn is an open-source Python package for machine learning on time series and sequences using a sliding-window segmentation approach. The implementation provides a flexible pipeline for tackling classification, regression, and forecasting problems with multivariate sequence and contextual data. The package is compatible with scikit-learn and is listed under scikit-learn Related Projects. It depends on numpy, scipy, and scikit-learn. Seglearn is distributed under the BSD 3-Clause License. Documentation includes a detailed API description, a user guide, and examples. Unit tests provide a high degree of code coverage.
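
To make the sliding-window idea concrete, here is a generic numpy sketch of the segmentation step; the function name and parameters are our own illustration of the approach, not Seglearn's API.

```python
import numpy as np

def segment(ts: np.ndarray, width: int, step: int) -> np.ndarray:
    """Slice a (time, channels) series into overlapping windows.

    Returns an array of shape (n_segments, width, channels): the
    sliding-window representation that segmentation approaches feed
    into downstream estimators.
    """
    n_segments = (len(ts) - width) // step + 1
    return np.stack([ts[i * step : i * step + width] for i in range(n_segments)])

# Example: a 100-sample, 3-channel multivariate series.
ts = np.random.default_rng(1).normal(size=(100, 3))
segments = segment(ts, width=20, step=10)  # 50% window overlap
print(segments.shape)  # (9, 20, 3)
```

Each window then serves as one sample for a downstream classifier or regressor, which is what makes this representation compatible with standard scikit-learn pipelines.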
Time series data are fundamental for a variety of applications, ranging from financial markets to energy systems. Due to their importance, the number and complexity of tools and methods used for time series analysis is constantly increasing. However, due to unclear APIs and a lack of documentation, researchers struggle to integrate them into their research projects and replicate results. Additionally, time series analysis involves many repetitive tasks, which are often re-implemented for each project, unnecessarily costing time. To solve these problems, we present pyWATTS, an open-source Python-based package that is a non-sequential workflow automation tool for the analysis of time series data. pyWATTS includes modules with clearly defined interfaces to enable seamless integration of new or existing methods, subpipelining to easily reproduce repetitive tasks, load and save functionality to simply replicate results, and native support for key Python machine learning libraries such as scikit-learn, PyTorch, and Keras.
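
The notion of modules with clearly defined interfaces feeding a non-sequential workflow can be sketched abstractly. The classes and names below are hypothetical stand-ins invented for illustration, not the pyWATTS API; they only hint at how independently produced, named outputs can coexist in one workflow.

```python
import numpy as np

# Hypothetical modules with a minimal, uniform interface.
class Scaler:
    def fit(self, x):
        self.mean_, self.std_ = x.mean(), x.std()
        return self

    def transform(self, x):
        return (x - self.mean_) / self.std_

class Differencer:
    def transform(self, x):
        return np.diff(x)

# A non-sequential step: both modules consume the same raw input,
# and their named outputs are kept side by side for later steps.
raw = np.array([3.0, 4.0, 7.0, 11.0, 16.0])
outputs = {
    "scaled": Scaler().fit(raw).transform(raw),
    "diffed": Differencer().transform(raw),
}
print({name: vals.round(2) for name, vals in outputs.items()})
```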
We introduce two new packages, Nemo and Hecke, written in the Julia programming language for computer algebra and number theory. We demonstrate that high-performance generic algorithms can be implemented in Julia, without the need to resort to a low-level C implementation. For specialised algorithms, we use Julia's efficient native C interface to wrap existing C/C++ libraries such as Flint, Arb, Antic, and Singular. We give examples of how to use Hecke and Nemo and discuss some algorithms that we have implemented to provide high-performance basic arithmetic.
Electroencephalography (EEG) is a complex signal and can require several years of training to be correctly interpreted. Recently, deep learning (DL) has shown great promise in helping make sense of EEG signals due to its capacity to learn good feature representations from raw data. Whether DL truly presents advantages as compared to more traditional EEG processing approaches, however, remains an open question. In this work, we review 156 papers that apply DL to EEG, published between January 2010 and July 2018, and spanning different application domains such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring. We extract trends and highlight interesting approaches in order to inform future research and formulate recommendations. Various data items were extracted for each study pertaining to 1) the data, 2) the preprocessing methodology, 3) the DL design choices, 4) the results, and 5) the reproducibility of the experiments. Our analysis reveals that the amount of EEG data used across studies varies from less than ten minutes to thousands of hours. As for the model, 40% of the studies used convolutional neural networks (CNNs), while 14% used recurrent neural networks (RNNs), most often with a total of 3 to 10 layers. Moreover, almost one-half of the studies trained their models on raw or preprocessed EEG time series. Finally, the median gain in accuracy of DL approaches over traditional baselines was 5.4% across all relevant studies. More importantly, however, we noticed studies often suffer from poor reproducibility: a majority of papers would be hard or impossible to reproduce given the unavailability of their data and code. To help the field progress, we provide a list of recommendations for future studies and we make our summary table of DL and EEG papers available and invite the community to contribute.
Zinovy Malkin, 2016
The Allan variance (AVAR) was introduced 50 years ago as a statistical tool for assessing the deviations of frequency standards. For the past decades, AVAR has increasingly been used in geodesy and astrometry to assess the noise characteristics in geodetic and astrometric time series. A specific feature of astrometric and geodetic measurements, as compared with clock measurements, is that they are generally associated with uncertainties; thus, an appropriate weighting should be applied during data analysis. Besides, some physically connected scalar time series naturally form series of multi-dimensional vectors. For example, the three station coordinate time series $X$, $Y$, and $Z$ can be combined to analyze 3D station position variations. The classical AVAR is not intended for processing unevenly weighted and/or multi-dimensional data. Therefore, AVAR modifications, namely weighted AVAR (WAVAR), multi-dimensional AVAR (MAVAR), and weighted multi-dimensional AVAR (WMAVAR), were introduced to overcome these deficiencies. In this paper, a brief review is given of the experience of using AVAR and its modifications in processing astro-geodetic time series.
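
For reference, the classical (unweighted, one-dimensional) Allan variance is straightforward to compute with numpy, as the sketch below shows; the weighted and multi-dimensional modifications reviewed in the paper (WAVAR, MAVAR, WMAVAR) are not reproduced here.

```python
import numpy as np

def allan_variance(y: np.ndarray, m: int) -> float:
    """Classical non-overlapping Allan variance at averaging factor m:

        AVAR(m) = 0.5 * mean((ybar[k+1] - ybar[k]) ** 2),

    where ybar[k] are means of consecutive blocks of m samples.
    """
    n_blocks = len(y) // m
    ybar = y[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
    return 0.5 * float(np.mean(np.diff(ybar) ** 2))

# For white noise, the Allan variance falls roughly as 1/m.
y = np.random.default_rng(2).normal(size=10_000)
for m in (1, 10, 100):
    print(m, allan_variance(y, m))
```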