Modern astronomical data processing requires complex software pipelines to handle ever-growing datasets. For radio astronomy, these pipelines have become so large that they need to be distributed across a computational cluster, which makes it difficult to monitor the performance of each pipeline step. To gain insight into the performance of each step, a performance monitoring utility needs to be integrated with the pipeline execution. In this work we have developed such a utility and integrated it with the calibration pipeline of the Low Frequency Array (LOFAR), a leading radio telescope. We tested the tool by running the pipeline on several different compute platforms and collecting the performance data. Based on these data, we make well-informed recommendations on future hardware and software upgrades. The aim of these upgrades is to accelerate the slowest processing steps of this LOFAR pipeline. The pipeline collector suite is open source and will be incorporated in future LOFAR pipelines to create a performance database for all LOFAR processing.
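As a rough illustration of what per-step performance monitoring can look like, the following minimal Python sketch times each pipeline step and reports the slowest one. It is not the collector suite described in the abstract: the `StepMonitor` class, its methods, and the step names `flagging` and `calibration` are hypothetical and chosen only for this example, and a real cluster deployment would also need to gather records from several nodes.

```python
import time
from contextlib import contextmanager


class StepMonitor:
    """Hypothetical helper: records wall-clock and CPU time per named step."""

    def __init__(self):
        self.records = []

    @contextmanager
    def step(self, name):
        wall_start = time.perf_counter()
        cpu_start = time.process_time()
        try:
            yield
        finally:
            # Store the timings even if the step raised an exception.
            self.records.append({
                "step": name,
                "wall_s": time.perf_counter() - wall_start,
                "cpu_s": time.process_time() - cpu_start,
            })

    def slowest(self):
        """Return the name of the step with the largest wall-clock time."""
        return max(self.records, key=lambda r: r["wall_s"])["step"]


def run_demo():
    """Run two stand-in steps (assumed names) under the monitor."""
    monitor = StepMonitor()
    with monitor.step("flagging"):
        sum(i * i for i in range(1000))   # cheap placeholder workload
    with monitor.step("calibration"):
        time.sleep(0.05)                  # placeholder for a slow step
    return monitor


if __name__ == "__main__":
    demo = run_demo()
    for record in demo.records:
        print(record)
    print("slowest step:", demo.slowest())
```

Wrapping each step in a context manager keeps the measurement separate from the step's own code, so monitoring can be added to an existing pipeline without changing the processing logic.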
Users of the Atacama Large Millimeter/submillimeter Array (ALMA) are provided with calibration and imaging products in addition to raw data. In Cycle 0 and Cycle 1, these products are produced by a team of data reduction experts spread across Chile, East Asia, Europe, and North America. This article discusses the lines of communication between the data reducers and ALMA users that enable this model of distributed data reduction. This article also discusses the calibration and imaging scripts that have been provided to ALMA users in Cycles 0 and 1, and what will be different in future Cycles.
Data processing pipelines represent an important slice of the astronomical software library. They comprise chains of processes that transform raw data into valuable information via data reduction and analysis. In this work we present Corral, a Python framework for astronomical pipeline generation. Corral features a Model-View-Controller design pattern on top of an SQL relational database capable of handling custom data models, processing stages, and communication alerts. It also provides automatic quality and structural metrics based on unit testing. The Model-View-Controller pattern provides conceptual separation between the user logic and the data models, while also delivering multi-processing and distributed computing capabilities. Corral represents an improvement over commonly found data processing pipelines in astronomy, since the design pattern relieves the programmer of dealing with processing flow and parallelization issues, allowing them to focus on the specific algorithms needed for the successive data transformations, and at the same time provides a broad measure of quality over the created pipeline. Corral and working examples of pipelines that use it are available to the community at https://github.com/toros-astro.
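The separation between data models and user logic described above can be sketched in a few lines of Python using only the standard library's `sqlite3` module. This is an illustrative assumption, not Corral's actual API: the `observation` table, the `Step` and `Calibrate` classes, and the `run_pipeline` function are hypothetical names invented for this example.

```python
import sqlite3


def create_model(conn):
    """Data model (hypothetical): one table of observations with a status."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observation ("
        "id INTEGER PRIMARY KEY, "
        "raw REAL NOT NULL, "
        "calibrated REAL, "
        "status TEXT NOT NULL DEFAULT 'new')"
    )


class Step:
    """Base step: handles row selection and updates; subclasses only transform."""

    input_status = None
    output_status = None

    def process(self, raw):
        raise NotImplementedError

    def run(self, conn):
        rows = conn.execute(
            "SELECT id, raw FROM observation WHERE status = ?",
            (self.input_status,),
        ).fetchall()
        for obs_id, raw in rows:
            conn.execute(
                "UPDATE observation SET calibrated = ?, status = ? WHERE id = ?",
                (self.process(raw), self.output_status, obs_id),
            )
        conn.commit()
        return len(rows)


class Calibrate(Step):
    """User logic: only the transformation itself is written here."""

    input_status = "new"
    output_status = "calibrated"

    def process(self, raw):
        return raw * 2.0  # placeholder for a real calibration algorithm


def run_pipeline(raw_values):
    """Load raw values into an in-memory database and run one step."""
    conn = sqlite3.connect(":memory:")
    create_model(conn)
    conn.executemany(
        "INSERT INTO observation (raw) VALUES (?)", [(v,) for v in raw_values]
    )
    Calibrate().run(conn)
    return conn.execute(
        "SELECT raw, calibrated, status FROM observation ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline([1.0, 2.5]))
```

In this sketch the base class owns the processing flow (which rows to pick up and how to mark them as done), so the author of `Calibrate` writes only the transformation, which is the kind of division of labour the abstract attributes to the design pattern.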
The Chinese Spectral RadioHeliograph (CSRH) is a synthetic aperture radio interferometer built in Inner Mongolia, China. As a solar-dedicated interferometric array, CSRH is capable of producing high-quality radio images over a frequency range from 400 MHz to 15 GHz with high temporal, spatial, and spectral resolution. High-cadence, wide-band imaging at a number of frequencies more than two orders of magnitude higher makes the implementation of the CSRH data processing system a great challenge. A pipeline for processing the massive volume of data that CSRH generates every day is urgently needed. In this paper, we develop a high-performance distributed data processing pipeline (DDPP) built on the OpenCluster infrastructure for processing CSRH observational data, including data storage, archiving, preprocessing, image reconstruction, deconvolution, and real-time monitoring. We describe in detail the system architecture of the pipeline and the implementation of each subsystem. The DDPP is automatic, robust, scalable, and manageable. Its processing performance on a hybrid system combining multiple computers in parallel with GPUs meets the requirements of CSRH data processing. The study provides a valuable reference for other radio telescopes, especially aperture synthesis telescopes, and a valuable contribution to current and future data-intensive astronomical observations.
In the multi-messenger era, astronomical projects share information about transient phenomena by issuing science alerts to the scientific community through different communication networks. This coordination is mandatory to understand the nature of these physical phenomena. For this reason, astrophysical projects rely on real-time analysis software pipelines to identify transients (e.g. GRBs) as soon as possible and to shorten the reaction time to external alerts. These pipelines can share and receive science alerts through the Gamma-ray Coordinates Network. This work presents a framework designed to simplify the development of real-time scientific analysis pipelines. The framework provides the architecture and the automation required to develop a real-time analysis pipeline, allowing researchers to focus more on the scientific aspects. The framework has been successfully used to develop real-time pipelines for the scientific analysis of the AGILE space mission data. It is planned to reuse this framework for the Super-GRAWITA and AFISS projects. A possible future use for the Cherenkov Telescope Array (CTA) project is under evaluation.
{Context}. The HIFI instrument on the Herschel Space Observatory performed over 9100 astronomical observations, almost 900 of which were calibration observations, in the course of the nearly four-year Herschel mission. The data from each observation had to be converted from raw telemetry into calibrated products and included in the Herschel Science Archive. {Aims}. The HIFI pipeline was designed to provide robust conversion from raw telemetry into calibrated data throughout all phases of the HIFI mission. Pre-launch laboratory testing was supported, as were routine mission operations. {Methods}. A modular software design allowed components to be easily added, removed, amended, or extended as the understanding of the HIFI data developed during and after mission operations. {Results}. The HIFI pipeline processed data from all HIFI observing modes within the Herschel automated processing environment as well as within an interactive environment. The same software can be used by the general astronomical community to reprocess any standard HIFI observation. The pipeline also recorded the consistency of processing results and provided automated quality reports. Many pipeline modules had been in use since the HIFI pre-launch instrument-level testing. {Conclusions}. Processing in steps facilitated data analysis to discover and address instrument artefacts and uncertainties. The availability of the same pipeline components from pre-launch throughout the mission made for well-understood, tested, and stable processing. A smooth transition from one phase to the next significantly enhanced processing reliability and robustness.