In this paper, we present the FATS (Feature Analysis for Time Series) library. FATS is a Python library that facilitates and standardizes feature extraction for time series data. We focus on one application, feature extraction for astronomical light curve data, although the library can be generalized to other uses. We detail the methods and features implemented for light curve analysis, and present examples of its usage.
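As a rough illustration of what feature extraction from a light curve means, the sketch below computes a handful of summary statistics with NumPy. It does not use the FATS API: the function name `extract_features`, the choice of features, and the synthetic light curve are all assumptions made for this example.

```python
import numpy as np

def extract_features(time, mag):
    """Return a few illustrative summary features of one light curve."""
    time = np.asarray(time, dtype=float)
    mag = np.asarray(mag, dtype=float)
    mean = mag.mean()
    return {
        "mean": mean,
        # Sample standard deviation (n - 1 in the denominator).
        "std": mag.std(ddof=1),
        # Half the peak-to-peak range, a crude amplitude estimate.
        "amplitude": 0.5 * (mag.max() - mag.min()),
        # Third standardized moment (biased form of the skewness).
        "skew": np.mean((mag - mean) ** 3) / mag.std() ** 3,
        # Median spacing between observations, a sampling descriptor.
        "median_dt": np.median(np.diff(time)),
    }

# Synthetic sinusoidal light curve on an irregular time grid (hypothetical data).
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, size=200))
m = 15.0 + 0.3 * np.sin(2 * np.pi * t / 7.5) + rng.normal(0.0, 0.02, size=200)
features = extract_features(t, m)
```

Each light curve is reduced to a fixed-length set of named values, which is the form a classifier would typically take as input.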
This work presents an introduction to feature-based time-series analysis. The time series as a data type is first described, along with an overview of the interdisciplinary time-series analysis literature. I then summarize the range of feature-based representations for time series that have been developed to aid interpretable insights into time-series structure. Particular emphasis is given to emerging research that facilitates wide comparison of feature-based representations that allow us to understand the properties of a time-series dataset that make it suited to a particular feature-based representation or analysis algorithm. The future of time-series analysis is likely to embrace approaches that exploit machine learning methods to partially automate human learning to aid understanding of the complex dynamical patterns in the time series we measure from the world.
Machine learning algorithms are highly useful for the classification of time series data in astronomy in this era of peta-scale public survey data releases. These methods can facilitate the discovery of previously unknown events in most astrophysical areas, as well as improve the analysis of samples of known phenomena. Machine learning algorithms use features extracted from collected data as input predictive variables. A public tool called Feature Analysis for Time Series (FATS) has proved an excellent workhorse for feature extraction, particularly for light curve classification of variable objects. In this study, we present a major improvement to FATS, which corrects inconvenient design choices, minor details, and documentation for the re-engineering process. This improvement takes the form of a new Python package called feets, which is important for future code refactoring of astronomical software tools.
One of the tasks of the Kepler Asteroseismic Science Operations Center (KASOC) is to provide asteroseismic analyses on Kepler Objects of Interest (KOIs). However, asteroseismic analysis of planetary host stars presents some unique complications with respect to data preprocessing, compared to pure asteroseismic targets. If not accounted for, the presence of planetary transits in the photometric time series often greatly complicates or even hinders these asteroseismic analyses. This drives the need for specialised methods of preprocessing data to make them suitable for asteroseismic analysis. In this paper we present the KASOC Filter, which is used to automatically prepare data from the Kepler/K2 mission for asteroseismic analyses of solar-like planet host stars. The methods are very effective at removing unwanted signals of both instrumental and planetary origins and produce significantly cleaner photometric time series than the original data. The methods are automated and can therefore easily be applied to a large number of stars. The application of the filter is not restricted to planetary hosts, but can be applied to any solar-like or red giant stars observed by Kepler/K2.
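The general idea of removing a slow instrumental trend from a photometric time series can be sketched with a running median. This is not the KASOC Filter itself, whose algorithm is not described in the abstract; the function names, the window length, and the synthetic data below are assumptions chosen only for illustration.

```python
import numpy as np

def running_median(flux, window):
    """Median of `flux` in a centred sliding window of odd length `window`."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    # Reflect the series at both ends so every point has a full window.
    padded = np.pad(np.asarray(flux, dtype=float), half, mode="reflect")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=1)

def detrend(flux, window=101):
    """Divide out the slow trend estimated by the running median."""
    flux = np.asarray(flux, dtype=float)
    return flux / running_median(flux, window)

# Synthetic example: a slow drift plus white noise around unit flux.
rng = np.random.default_rng(1)
n = 1000
drift = 1.0 + 0.05 * np.linspace(-1.0, 1.0, n) ** 2
flux = drift + rng.normal(0.0, 0.001, size=n)
flat = detrend(flux)
```

A median is used rather than a mean because it is less sensitive to short, sharp features, so the trend estimate is not pulled strongly toward brief dips or outliers.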
Statistical parameters are used to draw conclusions in finance, weather, industry, and science, among a vast number of other fields. New, more efficient selection methods are needed to analyse the huge amount of astronomical data. Standard and new data-mining parameters for analysing non-correlated data are used to establish the best way to discriminate between stochastic and non-stochastic variations. We introduce 16 modified statistical parameters covering different features of a statistical distribution, such as average, dispersion, and shape parameters. Many of the dispersion and shape parameters are unbound parameters, i.e. equations which do not require the calculation of the average. Moreover, the majority of them have lower errors than previous ones, which is mainly observed for distributions having few measurements. A set of non-correlated variability indices, sample size corrections, and a new noise model, as well as tests of different apertures and cutoffs on the data (BAS approach), are introduced. The number of misselections is reduced by about 520% using a single waveband and 1200% combining all wavebands. The even mean also improves the correlated indices introduced in Paper 1 (Ferreira Lopes & Cross 2016): the misselection rate is reduced by about 18% if the even mean is used instead of the mean to compute the correlated indices in the WFCAM database. Even statistics thus allow us to improve the effectiveness of both correlated and non-correlated indices. The even parameters will also be useful for classifying light curves in the last step of this project. We consider that the first step of this project, in which we set out new techniques and methods that provide a large improvement in the efficiency of selecting variable stars, is now complete.
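To make concrete the notion of a dispersion measure that does not require the average, the sketch below uses a standard identity: the sample variance equals the sum of squared pairwise differences divided by 2n(n - 1). This is not one of the 16 parameters introduced in the paper, and it is not the even mean; the function name `pairwise_variance` is an assumption for this example.

```python
import numpy as np

def pairwise_variance(x):
    """Sample variance computed from pairwise differences, without the mean."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("at least two measurements are required")
    # Matrix of all ordered differences x_i - x_j.
    diffs = x[:, None] - x[None, :]
    # Each unordered pair appears twice in the matrix, hence the factor 2.
    return np.sum(diffs ** 2) / (2.0 * n * (n - 1))

# Small hypothetical sample of magnitudes.
sample = np.array([15.02, 14.98, 15.10, 14.95, 15.01])
var_pairs = pairwise_variance(sample)
var_usual = np.var(sample, ddof=1)
```

Both routes give the same number; the pairwise form simply never evaluates the average explicitly.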
In astronomy, we are witnessing an enormous increase in the number of source detections and in the precision and diversity of measurements. Additionally, multi-epoch data is becoming the norm, making time-series analyses an important aspect of current astronomy. The Gaia mission is an outstanding example of a multi-epoch survey that provides measurements in a large diversity of domains, with its broad-band photometry; spectrophotometry in blue and red (used to derive astrophysical parameters); spectroscopy (employed to infer radial velocities, v sin(i), and other astrophysical parameters); and its extremely precise astrometry. Most of that information is provided for sources covering the entire sky. Here, we present several properties related to the Gaia time series, such as the time sampling; the different types of measurements; the Gaia G, G_BP, and G_RP band photometry; and Gaia-inspired studies using the CORrelation-RAdial-VELocities data to assess the potential of the information on the radial velocity, the FWHM, and the contrast of the cross-correlation function. We also present techniques (which are used or are under development) that optimize the extraction of astrophysical information from the different instruments of Gaia, such as principal component analysis and multi-response regression. A detailed understanding of the behavior of the observed phenomena in the various measurement domains can lead to a richer and more precise characterization of the Gaia data, including the definition of more informative attributes that serve as input to (our) machine-learning algorithms.
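Principal component analysis, one of the techniques named above, can be sketched in a few lines using a singular value decomposition. This is a generic textbook formulation, not the Gaia processing pipeline; the function name `pca` and the synthetic three-column data set are assumptions made for this example.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto their leading principal components."""
    X = np.asarray(X, dtype=float)
    # Centre each column so the components describe variance about the mean.
    centred = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(centred, full_matrices=False)
    components = Vt[:n_components]
    scores = centred @ components.T
    # Fraction of the total variance carried by each kept component.
    explained = (S ** 2 / np.sum(S ** 2))[:n_components]
    return scores, components, explained

# Hypothetical data: two strongly correlated columns and one weak noise column.
rng = np.random.default_rng(2)
a = rng.normal(size=500)
X = np.column_stack([
    a,
    2.0 * a + 0.1 * rng.normal(size=500),
    0.1 * rng.normal(size=500),
])
scores, components, explained = pca(X, n_components=2)
```

In this synthetic case the first component captures nearly all of the variance, because the first two columns are almost perfectly correlated.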