Handling, processing and archiving the huge amount of data produced by the new generation of experiments and instruments in Astronomy and Astrophysics are among the more exciting challenges to address in designing the future data management infrastructures and computing services. We investigated the feasibility of a data management and computation infrastructure, available world-wide, with the aim of merging the FAIR (Findable, Accessible, Interoperable, Reusable) data management provided by IVOA standards with the efficiency and reliability of a cloud approach. Our work involved the Canadian Advanced Network for Astronomy Research (CANFAR) infrastructure and the European EGI federated cloud (EFC). We designed and deployed a pilot data management and computation infrastructure that provides IVOA-compliant VOSpace storage resources and wide access to interoperable federated clouds. In this paper, we detail the main user requirements covered, the technical choices and the implemented solutions and we describe the resulting Hybrid cloud Worldwide infrastructure, its benefits and limitations.
The Simple Image Access protocol (SIA) provides capabilities for the discovery, description, access, and retrieval of multi-dimensional image datasets, including 2-D images as well as datacubes of three or more dimensions. SIA data discovery is based on the ObsCore Data Model (ObsCoreDM), which primarily describes data products by their physical axes (spatial, spectral, time, and polarization). Image datasets with dimension greater than 2 are often referred to as datacubes, cubes, or image cubes, and may be considered examples of hypercube or n-cube data. In this document the term image refers to general multi-dimensional datasets and is synonymous with these other terms unless the image dimensionality is otherwise specified. SIA provides capabilities for image discovery and access. Data discovery and metadata access (using ObsCoreDM) are defined here. The capabilities for drilling down to data files (and related resources) and the services for remote access are defined elsewhere, but SIA also allows direct access to data retrieval.
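As an illustration of the discovery step described above, the following minimal sketch builds an SIA-style query URL that constrains the spatial and spectral axes. The endpoint `https://example.org/sia/v2/query` and the helper name `build_sia_query` are hypothetical; the `POS` and `BAND` parameter forms are assumed to follow the SIA 2.0 conventions (a circle in degrees, a wavelength interval in metres) and should be checked against the specification before use.

```python
from urllib.parse import urlencode

# Hypothetical service endpoint, used only for illustration.
SIA_QUERY_URL = "https://example.org/sia/v2/query"


def build_sia_query(ra_deg, dec_deg, radius_deg, band_m=None):
    """Return a discovery URL for a circular sky region (illustrative helper)."""
    # Spatial constraint: circle centre (RA, Dec) and radius, all in degrees.
    params = [("POS", f"CIRCLE {ra_deg} {dec_deg} {radius_deg}")]
    if band_m is not None:
        low, high = band_m
        # Spectral constraint: wavelength interval in metres, "low high".
        params.append(("BAND", f"{low} {high}"))
    # urlencode percent-encodes values and turns spaces into "+".
    return f"{SIA_QUERY_URL}?{urlencode(params)}"


url = build_sia_query(12.0, 34.0, 0.5, band_m=(5e-7, 6e-7))
print(url)
```

The response to such a query would be a table of ObsCoreDM metadata, one row per matching image or datacube.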
A joint project between the Canadian Astronomy Data Center of the National Research Council Canada and the Italian Istituto Nazionale di Astrofisica-Osservatorio Astronomico di Trieste (INAF-OATs), partially funded by the EGI-Engage H2020 European Project, is devoted to deploying an integrated infrastructure, based on the International Virtual Observatory Alliance (IVOA) standards, to access and exploit astronomical data. Currently CADC-CANFAR provides scientists with an access, storage and computation facility, based on software libraries implementing a set of standards developed by the IVOA. The deployment of a twin infrastructure, built on essentially the same open source software libraries, has been started at INAF-OATs. This new infrastructure now provides users with an Access Control Service and a Storage Service. The final goal of the ongoing project is to build an integrated, geographically distributed infrastructure providing complete interoperability, both in user access control and in data sharing. This paper describes the target infrastructure, the main user requirements covered, the technical choices and the implemented solutions.
The Simple Spectral Access (SSA) Protocol (SSAP) defines a uniform interface to remotely discover and access one dimensional spectra. SSA is a member of an integrated family of data access interfaces altogether comprising the Data Access Layer (DAL) of the IVOA. SSA is based on a more general data model capable of describing most tabular spectrophotometric data, including time series and spectral energy distributions (SEDs) as well as 1-D spectra; however the scope of the SSA interface as specified in this document is limited to simple 1-D spectra, including simple aggregations of 1-D spectra. The form of the SSA interface is simple: clients first query the global resource registry to find services of interest and then issue a data discovery query to selected services to determine what relevant data is available from each service; the candidate datasets available are described uniformly in a VOTable format document which is returned in response to the query. Finally, the client may retrieve selected datasets for analysis. Spectrum datasets returned by an SSA spectrum service may be either precomputed, archival datasets, or they may be virtual data which is computed on the fly to respond to a client request. Spectrum datasets may conform to a standard data model defined by SSA, or may be native spectra with custom project-defined content. Spectra may be returned in any of a number of standard data formats. Spectral data is generally stored externally to the VO in a format specific to each spectral data collection; currently there is no standard way to represent astronomical spectra, and virtually every project does it differently. Hence spectra may be actively mediated to the standard SSA-defined data model at access time by the service, so that client analysis programs do not have to be familiar with the idiosyncratic details of each data collection to be accessed.
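The step in which candidate datasets come back as a VOTable document can be sketched as follows. This is a minimal illustration, not a full VOTable reader: the sample response is hand-written and heavily simplified, the field names `title` and `access_url` and the example URLs are hypothetical, and the parser assumes a single table serialized as TABLEDATA.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for a discovery response (assumption).
SAMPLE_VOTABLE = """<?xml version="1.0"?>
<VOTABLE version="1.3" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="title" datatype="char" arraysize="*"/>
      <FIELD name="access_url" datatype="char" arraysize="*"/>
      <DATA><TABLEDATA>
        <TR><TD>Spectrum A</TD><TD>https://example.org/spectra/a.fits</TD></TR>
        <TR><TD>Spectrum B</TD><TD>https://example.org/spectra/b.fits</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def votable_rows(xml_text):
    """Return each table row as a dict keyed by FIELD name (single table only)."""
    root = ET.fromstring(xml_text)
    names = [el.get("name") for el in root.iter() if local_name(el.tag) == "FIELD"]
    rows = []
    for tr in root.iter():
        if local_name(tr.tag) != "TR":
            continue
        cells = [td.text or "" for td in tr if local_name(td.tag) == "TD"]
        rows.append(dict(zip(names, cells)))
    return rows


rows = votable_rows(SAMPLE_VOTABLE)
for row in rows:
    print(row["title"], row["access_url"])
```

A client would then retrieve the selected datasets through the access URL found in each row. Real responses carry many more fields and may use other serializations, so a dedicated VOTable library is the safer choice in practice.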
The Simple Line Access Protocol (SLAP) is an IVOA Data Access protocol for retrieving spectral lines from various Spectral Line Data Collections through a uniform interface within the VO framework. These lines can be either observed or theoretical and will typically be used to identify emission or absorption features in astronomical spectra. SLAP makes use of the Simple Spectral Line Data Model (SSLDM [1]) to characterize spectral lines through the use of uTypes [14]. Physical quantities and units are described using the standard Units DM [15]. SLAP services can be registered in an IVOA Registry of Resources using the VOResource [12] Extension standard, having a unique ResourceIdentifier [13] in the Registry. The SLAP interface is meant to be reasonably simple for service providers to implement. A basic query is made over a wavelength range against the different services, and each service returns a list of spectral lines formatted as a VOTable. In addition, an implementation of the service may support additional search parameters (some of which may be custom to that particular service) to control the selection of spectral lines more finely. The specification also describes how the search on extra parameters has to be done, making use of the support provided by the SSLDM [1].
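The basic wavelength-range query can be sketched as below. This is an assumption-laden illustration: the endpoint `https://example.org/slap` and the helper `build_slap_query` are hypothetical, and the `REQUEST=queryData` and `WAVELENGTH=min/max` forms (wavelengths in metres) reflect our reading of the SLAP conventions rather than a verified service.

```python
from urllib.parse import urlencode

# Hypothetical service endpoint, used only for illustration.
SLAP_URL = "https://example.org/slap"

ANGSTROM_IN_M = 1e-10  # one angstrom expressed in metres


def build_slap_query(min_angstrom, max_angstrom):
    """Return a line-search URL for a closed wavelength range (illustrative)."""
    if min_angstrom > max_angstrom:
        raise ValueError("lower wavelength bound exceeds upper bound")
    # Convert the bounds to metres, the unit assumed for WAVELENGTH.
    low = min_angstrom * ANGSTROM_IN_M
    high = max_angstrom * ANGSTROM_IN_M
    params = {
        "REQUEST": "queryData",
        "WAVELENGTH": f"{low:.6E}/{high:.6E}",
    }
    # Keep "/" unescaped so the range stays readable in the URL.
    return f"{SLAP_URL}?{urlencode(params, safe='/')}"


# Search a narrow window around the H-alpha line (about 6563 angstroms).
slap_url = build_slap_query(6562, 6564)
print(slap_url)
```

Service-specific search parameters would be added to `params` in the same way.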
This document describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data. The web service capability supports a drill-down into the details of a specific dataset and provides a set of links to the dataset file(s) and related resources. This specification also includes a VOTable-specific method of providing descriptions of one or more services and their input(s), usually using parameter values from elsewhere in the VOTable document. Providers are able to describe services that are relevant to the records (usually datasets with identifiers) by including service descriptors in a result document.
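The drill-down from one dataset identifier to its files and related resources can be illustrated with a small sketch. The rows below are hand-written stand-ins for a parsed links response. The column names (`ID`, `access_url`, `service_def`, `error_message`, `semantics`, `content_type`) are assumed to match the DataLink links table, while the identifiers, URLs, and the helper `select_links` are hypothetical.

```python
# Hand-written rows standing in for a parsed links response (assumption).
links = [
    {"ID": "ivo://example/data?obs1", "access_url": "https://example.org/obs1.fits",
     "service_def": "", "error_message": "", "semantics": "#this",
     "content_type": "application/fits"},
    {"ID": "ivo://example/data?obs1", "access_url": "https://example.org/obs1.png",
     "service_def": "", "error_message": "", "semantics": "#preview",
     "content_type": "image/png"},
    # A row that points at a described service instead of a direct URL.
    {"ID": "ivo://example/data?obs1", "access_url": "",
     "service_def": "cutout-svc", "error_message": "", "semantics": "#cutout",
     "content_type": ""},
]


def select_links(rows, dataset_id, semantics):
    """Return the direct URLs for one dataset that carry the given semantics."""
    return [
        row["access_url"]
        for row in rows
        if row["ID"] == dataset_id
        and row["semantics"] == semantics
        and row["access_url"]          # skip rows that only reference a service
        and not row["error_message"]   # skip rows that report a failure
    ]


downloads = select_links(links, "ivo://example/data?obs1", "#this")
previews = select_links(links, "ivo://example/data?obs1", "#preview")
print(downloads, previews)
```

Rows with a `service_def` value and no direct URL would instead be resolved through the matching service descriptor in the same result document.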