No Arabic abstract
AstronomicAL is a human-in-the-loop interactive labelling and training dashboard that allows users to create reliable datasets and robust classifiers using active learning. This technique prioritises data that offer high information gain, leading to improved performance using substantially less data. The system allows users to visualise and integrate data from different sources and deal with incorrect or missing labels and imbalanced class sizes. AstronomicAL enables experts to visualise domain-specific plots and key information relating both to broader context and details of a point of interest drawn from a variety of data sources, ensuring reliable labels. In addition, AstronomicAL provides functionality to explore all aspects of the training process, including custom models and query strategies. This makes the software a tool for experimenting with both domain-specific classifications and more general-purpose machine learning strategies. We illustrate using the system with an astronomical dataset due to the fields immediate need; however, AstronomicAL has been designed for datasets from any discipline. Finally, by exporting a simple configuration file, entire layouts, models, and assigned labels can be shared with the community. This allows for complete transparency and ensures that the process of reproducing results is effortless
ESASky is a science-driven discovery portal to explore the multi-wavelength sky and visualise and access multiple astronomical archive holdings. The tool is a web application that requires no prior knowledge of any of the missions involved and gives users world-wide simplified access to the highest-level science data products from multiple astronomical space-based astronomy missions plus a number of ESA source catalogues. The first public release of ESASky features interfaces for the visualisation of the sky in multiple wavelengths, the visualisation of query results summaries, and the visualisation of observations and catalogue sources for single and multiple targets. This paper describes these features within ESASky, developed to address use cases from the scientific community. The decisions regarding the visualisation of large amounts of data and the technologies used were made in order to maximise the responsiveness of the application and to keep the tool as useful and intuitive as possible.
We present CosmoHub (https://cosmohub.pic.es), a web application based on Hadoop to perform interactive exploration and distribution of massive cosmological datasets. Recent Cosmology seeks to unveil the nature of both dark matter and dark energy mapping the large-scale structure of the Universe, through the analysis of massive amounts of astronomical data, progressively increasing during the last (and future) decades with the digitization and automation of the experimental techniques. CosmoHub, hosted and developed at the Port dInformacio Cientifica (PIC), provides support to a worldwide community of scientists, without requiring the end user to know any Structured Query Language (SQL). It is serving data of several large international collaborations such as the Euclid space mission, the Dark Energy Survey (DES), the Physics of the Accelerating Universe Survey (PAUS) and the Marenostrum Institut de Ci`encies de lEspai (MICE) numerical simulations. While originally developed as a PostgreSQL relational database web frontend, this work describes the current version of CosmoHub, built on top of Apache Hive, which facilitates scalable reading, writing and managing huge datasets. As CosmoHubs datasets are seldomly modified, Hive it is a better fit. Over 60 TiB of catalogued information and $50 times 10^9$ astronomical objects can be interactively explored using an integrated visualization tool which includes 1D histogram and 2D heatmap plots. In our current implementation, online exploration of datasets of $10^9$ objects can be done in a timescale of tens of seconds. Users can also download customized subsets of data in standard formats generated in few minutes.
We present an interactive IDL program for viewing and analyzing astronomical spectra in the context of modern imaging surveys. SpecPros interactive design lets the user simultaneously view spectroscopic, photometric, and imaging data, allowing for rapid object classification and redshift determination. The spectroscopic redshift can be determined with automated cross-correlation against a variety of spectral templates or by overlaying common emission and absorption features on the 1-D and 2-D spectra. Stamp images as well as the spectral energy distribution (SED) of a source can be displayed with the interface, with the positions of prominent photometric features indicated on the SED plot. Results can be saved to file from within the interface. In this paper we discuss key program features and provide an overview of the required data formats.
The performance of soccer players is one of most discussed aspects by many actors in the soccer industry: from supporters to journalists, from coaches to talent scouts. Unfortunately, the dashboards available online provide no effective way to compare the evolution of the performance of players or to find players behaving similarly on the field. This paper describes the design of a web dashboard that interacts via APIs with a performance evaluation algorithm and provides graphical tools that allow the user to perform many tasks, such as to search or compare players by age, role or trend of growth in their performance, find similar players based on their pitching behavior, change the algorithms parameters to obtain customized performance scores. We also describe an example of how a talent scout can interact with the dashboard to find young, promising talents.
The recent increase in volume and complexity of available astronomical data has led to a wide use of supervised machine learning techniques. Active learning strategies have been proposed as an alternative to optimize the distribution of scarce labeling resources. However, due to the specific conditions in which labels can be acquired, fundamental assumptions, such as sample representativeness and labeling cost stability cannot be fulfilled. The Recommendation System for Spectroscopic follow-up (RESSPECT) project aims to enable the construction of optimized training samples for the Rubin Observatory Legacy Survey of Space and Time (LSST), taking into account a realistic description of the astronomical data environment. In this work, we test the robustness of active learning techniques in a realistic simulated astronomical data scenario. Our experiment takes into account the evolution of training and pool samples, different costs per object, and two different sources of budget. Results show that traditional active learning strategies significantly outperform random sampling. Nevertheless, more complex batch strategies are not able to significantly overcome simple uncertainty sampling techniques. Our findings illustrate three important points: 1) active learning strategies are a powerful tool to optimize the label-acquisition task in astronomy, 2) for upcoming large surveys like LSST, such techniques allow us to tailor the construction of the training sample for the first day of the survey, and 3) the peculiar data environment related to the detection of astronomical transients is a fertile ground that calls for the development of tailored machine learning algorithms.