AstroCatR: a Mechanism and Tool for Efficient Time Series Reconstruction of Large-Scale Astronomical Catalogues

74 0 0.0 ( 0 )

Download Cite

Added by Kun Li

Publication date 2020

fields Physics

and research's language is English

Authors Ce Yu - Kun Li - Shanjiang Tang

Instrumentation and Methods for Astrophysics

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Time series data of celestial objects are commonly used to study valuable and unexpected objects such as extrasolar planets and supernova in time domain astronomy. Due to the rapid growth of data volume, traditional manual methods are becoming extremely hard and infeasible for continuously analyzing accumulated observation data. To meet such demands, we designed and implemented a special tool named AstroCatR that can efficiently and flexibly reconstruct time series data from large-scale astronomical catalogues. AstroCatR can load original catalogue data from Flexible Image Transport System (FITS) files or databases, match each item to determine which object it belongs to, and finally produce time series datasets. To support the high-performance parallel processing of large-scale datasets, AstroCatR uses the extract-transform-load (ETL) preprocessing module to create sky zone files and balance the workload. The matching module uses the overlapped indexing method and an in-memory reference table to improve accuracy and performance. The output of AstroCatR can be stored in CSV files or be transformed other into formats as needed. Simultaneously, the module-based software architecture ensures the flexibility and scalability of AstroCatR. We evaluated AstroCatR with actual observation data from The three Antarctic Survey Telescopes (AST3). The experiments demonstrate that AstroCatR can efficiently and flexibly reconstruct all time series data by setting relevant parameters and configuration files. Furthermore, the tool is approximately 3X faster than methods using relational database management systems at matching massive catalogues.

rate research

Computational Intelligence Challenges and Applications on Large-Scale Astronomical Time Series Databases

70 - Pablo Huijse , Pablo A. Estevez , Pavlos Protopapas 2015

Time-domain astronomy (TDA) is facing a paradigm shift caused by the exponential growth of the sample size, data complexity and data generation rates of new astronomical sky surveys. For example, the Large Synoptic Survey Telescope (LSST), which will begin operations in northern Chile in 2022, will generate a nearly 150 Petabyte imaging dataset of the southern hemisphere sky. The LSST will stream data at rates of 2 Terabytes per hour, effectively capturing an unprecedented movie of the sky. The LSST is expected not only to improve our understanding of time-varying astrophysical objects, but also to reveal a plethora of yet unknown faint and fast-varying phenomena. To cope with a change of paradigm to data-driven astronomy, the fields of astroinformatics and astrostatistics have been created recently. The new data-oriented paradigms for astronomy combine statistics, data mining, knowledge discovery, machine learning and computational intelligence, in order to provide the automated and robust methods needed for the rapid detection and classification of known astrophysical objects as well as the unsupervised characterization of novel phenomena. In this article we present an overview of machine learning and computational intelligence applications to TDA. Future big data challenges and new lines of research in TDA, focusing on the LSST, are identified and discussed from the viewpoint of computational intelligence/machine learning. Interdisciplinary collaboration will be required to cope with the challenges posed by the deluge of astronomical data coming from the LSST.

Instrumentation and Methods for Astrophysics Machine Learning

VARTOOLS: A Program for Analyzing Astronomical Time-Series Data

74 - Joel D. Hartman 2016

This paper describes the VARTOOLS program, which is an open-source command-line utility, written in C, for analyzing astronomical time-series data, especially light curves. The program provides a general-purpose set of tools for processing light curves including signal identification, filtering, light curve manipulation, time

Instrumentation and Methods for Astrophysics

catsHTM - A tool for fast accessing and cross-matching large astronomical catalogs

78 - Maayane T. Soumagnac , Eran O. Ofek 2018

Fast access to large catalogs is required for some astronomical applications. Here we introduce the catsHTM tool, consisting of several large catalogs reformatted into HDF5-based file format, which can be downloaded and used locally. To allow fast access, the catalogs are partitioned into hierarchical triangular meshes and stored in HDF5 files. Several tools are provided to perform efficient cone searches at resolutions spanning from a few arc seconds to degrees, within a few milliseconds time. The first released version includes the following catalogs (by alphabetical order): 2MASS, 2MASS extended sources, AKARI, APASS, Cosmos, DECaLS/DR5, FIRST, GAIA/DR1, GAIA/DR2, GALEX/DR6Plus7, HSC/v2, IPHAS/DR2, NED redshifts, NVSS, Pan-STARRS1/DR1, PTF photometric catalog, ROSAT faint source, SDSS sources, SDSS/DR14 spectroscopy, Spitzer/SAGE, Spitzer/IRAC galactic center, UCAC4, UKIDSS/DR10, VST/ATLAS/DR3, VST/KiDS/DR3, WISE and XMM. We provide Python code that allows to perform cone searches, as well as MATLAB code for performing cone searches, catalog cross-matching, general searches, as well as load and create these catalogs.

Instrumentation and Methods for Astrophysics

Provenance as a requirement for large-scale complex astronomical instruments

148 - Mathieu Servillat , Catherine Boisson , Julien Lefaucheur 2018

We developed several pieces of software to enable the tracking of provenance information for the large-scale complex astronomical observatory CTA, the Cherenkov Telescope Array. Such major facilities produce data that will be publicly released to a large community of scientists. There are thus strong requirements to ensure data quality, reliability and trustworthiness. Among those requirements, traceability and reproducibility of the data products have to be included in the development of large projects. Those requirements can be answered by structuring and storing the provenance information for each data product. We followed the Provenance data model, currently discussed at the IVOA, and implemented solutions to collect provenance information during the CTA data processing and the execution of jobs on a work cluster.

Instrumentation and Methods for Astrophysics

exoplanet: Gradient-based probabilistic inference for exoplanet data & other astronomical time series

102 - Daniel Foreman-Mackey , Rodrigo Luger , Eric Agol 2021

exoplanet is a toolkit for probabilistic modeling of astronomical time series data, with a focus on observations of exoplanets, using PyMC3 (Salvatier et al., 2016). PyMC3 is a flexible and high-performance model-building language and inference engine that scales well to problems with a large number of parameters. exoplanet extends PyMC3s modeling language to support many of the custom functions and probability distributions required when fitting exoplanet datasets or other astronomical time series. While it has been used for other applications, such as the study of stellar variability, the primary purpose of exoplanet is the characterization of exoplanets or multiple star systems using time-series photometry, astrometry, and/or radial velocity. In particular, the typical use case would be to use one or more of these datasets to place constraints on the physical and orbital parameters of the system, such as planet mass or orbital period, while simultaneously taking into account the effects of stellar variability.

Instrumentation and Methods for Astrophysics Earth and Planetary Astrophysics