A collaboration between the W. M. Keck Observatory (WMKO) in Hawaii and the NASA Exoplanet Science Institute (NExScI) in California, the Keck Observatory Archive (KOA) was commissioned in 2004 to archive observing data from WMKO, which operates two classically scheduled 10 m ground-based telescopes. The observing data from Keck are not suitable for direct ingestion into the archive, since the metadata contained in the original FITS headers lack the information necessary for proper archiving. Coupled with differing standards among instrument builders and the heterogeneous nature of data inherent in classical observing, in which observers have complete control of the instruments and their observations, the data pose a number of technical challenges for KOA. We describe the methodologies and tools that we have developed to address these difficulties, adding content to the FITS headers and retrofitting the metadata to support archiving Keck data, especially data obtained before the archive was designed. With the expertise gained from having successfully archived observations taken with all eight currently active instruments at WMKO, we present lessons learned from handling this complex array of heterogeneous metadata; these lessons help ensure a smooth ingestion of data for both current and future instruments, as well as a better experience for the archive user.
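As a purely illustrative picture of the kind of header retrofitting described above, the following Python sketch uses astropy to add archive-oriented keywords to a raw FITS header; the keyword names, program identifier, and derivation of the archive identifier are assumptions for this example, not KOA's actual pipeline.

```python
# Illustrative sketch only (not KOA's pipeline): retrofit a FITS header by
# adding archive-required keywords that the original instrument header lacks.
# PROGID, SEMESTER, and the KOAID construction below are assumptions.
from astropy.io import fits

def retrofit_header(path, prog_id, semester):
    with fits.open(path, mode="update") as hdul:
        hdr = hdul[0].header
        # Fill in metadata needed for archiving if absent from the raw header.
        if "PROGID" not in hdr:
            hdr["PROGID"] = prog_id       # observing program identifier
        if "SEMESTER" not in hdr:
            hdr["SEMESTER"] = semester    # observing semester, e.g. "2004A"
        # Derive a unique archive identifier from instrument and UT date.
        inst = hdr.get("INSTRUME", "UNKNOWN")
        dateobs = hdr.get("DATE-OBS", "0000-00-00")
        hdr["ARCHID"] = f"{inst[:2].upper()}.{dateobs.replace('-', '')}"
        hdul.flush()

retrofit_header("raw_frame.fits", prog_id="U123", semester="2004A")
```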
The Parkes pulsar data archive currently provides access to 144,044 data files obtained from observations carried out at the Parkes observatory since 1991. Around 10^5 files come from surveys of the sky; the remainder are observations of 775 individual pulsars and their corresponding calibration signals. Survey observations are included from the Parkes 70 cm and Swinburne Intermediate Latitude surveys. Individual pulsar observations are included from young-pulsar timing projects, the Parkes Pulsar Timing Array, and the PULSE@Parkes outreach program. The data files and access methods are compatible with Virtual Observatory protocols. This paper describes the data currently stored in the archive and presents ways in which these data can be searched and downloaded.
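As a hedged illustration of how a Virtual Observatory-compatible archive of this kind can be queried programmatically, the sketch below uses pyvo to run an ObsCore-style TAP query; the service URL, table, and column names are placeholders rather than the actual Parkes archive endpoints.

```python
# Illustrative sketch only: searching a VO-compatible archive with pyvo.
# The TAP endpoint and column names are placeholders, not the real
# Parkes pulsar data archive service.
import pyvo

service = pyvo.dal.TAPService("https://example.org/tap")  # hypothetical endpoint
result = service.search(
    "SELECT obs_id, target_name, t_min, access_url "
    "FROM ivoa.obscore WHERE target_name = 'J0437-4715'"
)
for row in result:
    # access_url points at the file to download for each matching observation
    print(row["obs_id"], row["access_url"])
```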
In the Virtual Observatory era, in which we intend to expose scientists (or software agents acting on their behalf) to a stream of observations from all existing facilities, the ability to access and further interpret the origin, relationships, and processing steps of archived astronomical assets (their Provenance) is a requirement for proper observation selection and quality assessment. In this article we present the use cases for which Data Provenance is needed, the challenges inherent in building such a system for the ESO archive, and their link to ongoing work in the International Virtual Observatory Alliance (IVOA).
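To make the notion of Provenance concrete, here is a minimal, illustrative data structure in the spirit of the W3C PROV concepts (entities and activities) on which the IVOA Provenance Data Model builds; the class and field names are assumptions for this sketch, not the ESO implementation.

```python
# Illustrative sketch only: entities (archived assets) linked by activities
# (processing steps), so that the origin of a product can be traced back.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Entity:               # an archived asset: raw frame, calibration, product
    entity_id: str
    derived_from: List[str] = field(default_factory=list)

@dataclass
class Activity:             # a processing step, e.g. flux calibration
    name: str
    used: List[str]         # identifiers of input entities
    generated: List[str]    # identifiers of output entities

# A reduced spectrum derived from a raw frame via one calibration step.
raw = Entity("RAW.2015-06-01T03:14:15.fits")
reduced = Entity("SPEC.2015-06-01.fits", derived_from=[raw.entity_id])
calib = Activity("flux_calibration", used=[raw.entity_id],
                 generated=[reduced.entity_id])
```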
The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) is the largest optical telescope in China. Over the last four years, the LAMOST telescope has published four editions of data (the pilot data release and data releases 1, 2, and 3). To archive and release these data (raw data, catalogs, spectra, etc.), we have set up a data cycle management system covering data transfer, archiving, and backup. And through the evolution of four softwa
From the moment astronomical observations are made, the resulting data products begin to grow stale. Even if perfect binary copies are preserved through repeated timely migration to more robust storage media, data standards evolve and new tools are created that require different kinds of data or metadata. The expectations of the astronomical community change even if the data do not. We discuss data engineering to mitigate the ensuing risks, with examples from a recent project to refactor seven million archival images to new standards of nomenclature, metadata, format, and compression.
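As a hedged sketch of what refactoring a single image to new standards might involve (renaming, patching metadata, and tile compression), the following Python uses astropy; the keyword migration, naming scheme, and file names are illustrative assumptions rather than the project's actual code.

```python
# Sketch only (not the project's actual pipeline): migrate one archival image
# to a new name, patch a header keyword, and write it tile-compressed.
from astropy.io import fits

def refactor_image(old_path, new_path):
    with fits.open(old_path) as hdul:
        hdr, data = hdul[0].header, hdul[0].data
        # Example metadata migration: standardise a legacy date keyword.
        if "DATE-OBS" not in hdr and "DATE_OBS" in hdr:
            hdr["DATE-OBS"] = hdr["DATE_OBS"]
        # Rice tile compression preserves the pixels while shrinking storage.
        comp = fits.CompImageHDU(data=data, header=hdr,
                                 compression_type="RICE_1")
        fits.HDUList([fits.PrimaryHDU(), comp]).writeto(new_path,
                                                        overwrite=True)

# Hypothetical old and new file names following a new nomenclature standard.
refactor_image("o4m001.fits", "OBS_2004-06-01T05-00-00_o4m001.fits.fz")
```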
The Data Quality Segment Database (DQSEGDB) software is a database service, backend API, frontend graphical web interface, and client package used by the Laser Interferometer Gravitational-Wave Observatory (LIGO), Virgo, GEO600 and the Kamioka Gravitational wave detector for storing and accessing metadata describing the status of their detectors. The DQSEGDB has been used in the analysis of all published detections of gravitational waves in the advanced detector era. The DQSEGDB currently stores roughly 600 million metadata entries and responds to roughly 600,000 queries per day with an average response time of 0.223 ms.
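One common client-side way to retrieve this kind of detector-status metadata is through GWpy's DataQualityFlag interface, sketched below; the flag name and GPS interval are examples, and real queries against the collaboration segment server typically require appropriate credentials.

```python
# Illustrative sketch: querying detector-status segments from a segment
# database with GWpy. Flag name and GPS times are examples only.
from gwpy.segments import DataQualityFlag

# "Analysis-ready" segments for the LIGO Hanford detector over roughly one day.
flag = DataQualityFlag.query("H1:DMT-ANALYSIS_READY:1", 1126051217, 1126137617)
print(flag.known)    # intervals for which metadata exist
print(flag.active)   # intervals during which the flag was active
```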