Astrophysics lies at the crossroads of big datasets (such as the Large Synoptic Survey Telescope and Gaia), open source software to visualize and interpret high dimensional datasets (such as Glue, WorldWide Telescope, and OpenSpace), and uniquely skilled software engineers who bridge data science and research fields. At the same time, more than 4,000 planetariums across the globe immerse millions of visitors in scientific data. We have identified the potential for critical synergy across data, software, hardware, locations, and content that -- if prioritized over the next decade -- will drive discovery in astronomical research. Planetariums can and should be used for the advancement of scientific research. Current facilities such as the Hayden Planetarium in New York City, the Adler Planetarium in Chicago, the Morrison Planetarium in San Francisco, the Iziko Planetarium and Digital Dome Research Consortium in Cape Town, and Visualization Center C in Norrköping are already developing software that ingests catalogs of astronomical and multi-disciplinary data critical for exploration research, primarily for the purpose of creating scientific storylines for the general public. We propose a transformative model whereby scientists become the audience and explorers in planetariums, utilizing such software for their own investigative purposes. In this manner, research benefits from the authentic and unique experience of data immersion in an environment rich in context and equipped for collaboration. Consequently, in this white paper we argue that over the next decade the research astronomy community should partner with planetariums to create visualization-based research opportunities for the field. Realizing this vision will require new investments in software and human capital.
Spherical coordinate systems, which are ubiquitous in astronomy, cannot be shown without distortion on flat, two-dimensional surfaces. This poses challenges for the two complementary phases of visualisation: exploration, where discoveries are made in data by looking for relationships, patterns, or anomalies; and publication, where the results of an exploration are made available for scientific scrutiny or communication. This is a long-standing problem, and many practical solutions have been developed. Our allskyVR approach provides a workflow for experimentation with commodity virtual reality head-mounted displays. Using the free, open source S2PLOT programming library and the A-Frame WebVR browser-based framework, we provide a straightforward way to visualise all-sky catalogues on a user-centred, virtual celestial sphere. The allskyVR distribution contains both a quickstart option, complete with a gaze-based menu system, and a fully customisable mode for those who need more control of the immersive experience. The software is available for download from: https://github.com/cfluke/allskyVR
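The geometric core of any such workflow is placing catalogue sources on a sphere centred on the viewer. A minimal Python sketch of that step follows; the function and sample values are illustrative only, not allskyVR's actual code, which is built on S2PLOT and A-Frame:

    import numpy as np

    def radec_to_cartesian(ra_deg, dec_deg, radius=1.0):
        """Place catalogue positions (RA/Dec in degrees) on a
        viewer-centred celestial sphere of the given radius."""
        ra = np.radians(np.asarray(ra_deg, dtype=float))
        dec = np.radians(np.asarray(dec_deg, dtype=float))
        x = radius * np.cos(dec) * np.cos(ra)
        y = radius * np.cos(dec) * np.sin(ra)
        z = radius * np.sin(dec)
        return np.column_stack((x, y, z))

    # Three sample sources on a sphere of radius 10 (units arbitrary).
    print(radec_to_cartesian([0.0, 90.0, 266.4], [0.0, 45.0, -29.0], radius=10.0))

Because the sphere is centred on the user, angular relationships between sources are preserved without the distortion inherent to flat projections.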
In the era of big data astronomy, next generation telescopes and large sky surveys produce data sets at the TB or even PB level. Due to their large volumes, these astronomical data sets are extremely difficult to transfer and analyze using personal computers or small clusters. In order to offer better access to data, data centers now generally provide online science platforms that enable analysis close to the data. The Chinese Virtual Observatory (China-VO) is one of the member projects in the International Virtual Observatory Alliance, and it is dedicated to providing a research and education environment where globally distributed astronomy archives are simple to find, access, and interoperate. In this study, we summarize highlights of the work conducted at the China-VO, as well as the experiences and lessons learned during the full life-cycle management of astronomical data. Finally, we discuss the challenges and future trends for astronomical science platforms.
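In practice, the interoperability the Virtual Observatory provides is exercised through standard protocols such as TAP with ADQL queries. A hedged illustration using pyvo, one common client library, is sketched below; the endpoint URL and the table and column names are hypothetical placeholders, not actual China-VO services:

    from pyvo.dal import TAPService

    # Hypothetical TAP endpoint; real archive URLs differ.
    service = TAPService("https://tap.example-archive.org/tap")

    # ADQL is the VO-standard query language; the table and columns
    # here are illustrative placeholders.
    result = service.search(
        "SELECT TOP 10 ra, dec, mag FROM survey.catalog WHERE mag < 15"
    )
    print(result.to_table())

The point of such protocols is that the same client code works against any compliant archive, which is what makes globally distributed data centers interoperable.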
This paper presents the design, implementation, and evaluation of the PyTorch distributed data parallel module. PyTorch is a widely-adopted scientific computing package used in deep learning research and applications. Recent advances in deep learning argue for the value of large datasets and large models, which necessitates the ability to scale out model training to more computational resources. Data parallelism has emerged as a popular solution for distributed training thanks to its straightforward principle and broad applicability. In general, the technique of distributed data parallelism replicates the model on every computational resource to generate gradients independently and then communicates those gradients at each iteration to keep model replicas consistent. Despite the conceptual simplicity of the technique, the subtle dependencies between computation and communication make it non-trivial to optimize the distributed training efficiency. As of v1.5, PyTorch natively provides several techniques to accelerate distributed data parallel training, including bucketing gradients, overlapping computation with communication, and skipping gradient synchronization. Evaluations show that, when configured appropriately, the PyTorch distributed data parallel module attains near-linear scalability using 256 GPUs.
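Since the module's user-facing API is small, a minimal sketch may help. DistributedDataParallel and its no_sync() context manager are the real PyTorch interfaces; the toy model, the sizes, and the every-other-step accumulation schedule below are illustrative:

    import contextlib
    import os

    import torch
    import torch.distributed as dist
    import torch.multiprocessing as mp
    from torch.nn.parallel import DistributedDataParallel as DDP

    def train(rank, world_size):
        # One process per GPU; NCCL is the usual backend for GPU training.
        os.environ.setdefault("MASTER_ADDR", "localhost")
        os.environ.setdefault("MASTER_PORT", "29500")
        dist.init_process_group("nccl", rank=rank, world_size=world_size)

        model = torch.nn.Linear(20, 10).to(rank)
        # DDP broadcasts parameters at construction; during backward it
        # reduces gradients in buckets, overlapping communication with
        # the remaining computation.
        ddp_model = DDP(model, device_ids=[rank])
        loss_fn = torch.nn.MSELoss()
        optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.001)

        for step in range(4):
            inputs = torch.randn(32, 20).to(rank)
            labels = torch.randn(32, 10).to(rank)
            # Skip gradient synchronization on even steps, accumulating
            # gradients locally; the odd-step backward then synchronizes.
            ctx = ddp_model.no_sync() if step % 2 == 0 else contextlib.nullcontext()
            with ctx:
                loss_fn(ddp_model(inputs), labels).backward()
            if step % 2 == 1:
                optimizer.step()
                optimizer.zero_grad()

        dist.destroy_process_group()

    if __name__ == "__main__":
        world_size = torch.cuda.device_count()
        mp.spawn(train, args=(world_size,), nprocs=world_size)

The no_sync() pattern is how the paper's "skipping gradient synchronization" technique is exposed: communication is deferred across accumulation steps and amortized over the final backward pass.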
Improving software citation and credit continues to be a topic of interest across and within many disciplines, with numerous efforts underway. In this Birds of a Feather (BoF) session, we started with a list of actionable ideas from last year's BoF and other similar efforts, and worked alone or in small groups to begin implementing them. Work was captured in a common Google document; the session organizers will disseminate or otherwise put this information to use in or for the community, in collaboration with those who contributed.
In two recent papers, the mesoscale model Meso-NH, coupled with the Astro-Meso-NH package, has been validated at Dome C, Antarctica, for the characterization of the optical turbulence. It has been shown that the meteorological parameters (temperature and wind speed, on which the optical turbulence depends) as well as the Cn2 profiles above Dome C are correctly reproduced in a statistical sense. The three most important derived parameters that characterize the optical turbulence above the internal Antarctic plateau (the surface-layer thickness, the seeing in the free atmosphere, and the seeing in the total atmosphere) proved to be in very good agreement with observations. Validation of the Cn2 profiles has been performed using all the measurements of the optical turbulence vertical distribution obtained in winter so far. In this paper, in order to investigate the ability of the model to discriminate between different turbulence conditions for site testing, we extend the study to two other potential astronomical sites in Antarctica, Dome A and South Pole, which we expect to be characterized by different turbulence conditions. The optical turbulence has been calculated above these two sites for the same 15 nights studied for Dome C, and a comparison between the three sites has been performed.
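For reference, the seeing quoted above follows from the Cn2 profile through the standard Fried-parameter relations, r0 = [0.423 (2*pi/lambda)^2 * integral of Cn2 dh]^(-3/5) and seeing = 0.98 * lambda / r0. A minimal Python sketch is given below; the toy profile is illustrative, not the paper's data:

    import numpy as np

    def seeing_from_cn2(h, cn2, wavelength=500e-9):
        """Integrate a Cn2 profile (m^(-2/3)) over altitude h (m) to get
        the Fried parameter r0 (m) and the seeing in arcseconds."""
        k = 2.0 * np.pi / wavelength
        integral = np.trapz(cn2, h)
        r0 = (0.423 * k**2 * integral) ** (-3.0 / 5.0)
        seeing_rad = 0.98 * wavelength / r0
        return r0, np.degrees(seeing_rad) * 3600.0

    # Toy profile: a strong, thin surface layer plus a weak free atmosphere.
    h = np.linspace(0.0, 20e3, 2001)
    cn2 = 1e-14 * np.exp(-h / 30.0) + 1e-17 * np.exp(-h / 5e3)
    r0, seeing = seeing_from_cn2(h, cn2)
    print(f"r0 = {r0:.3f} m, total seeing = {seeing:.2f} arcsec")

Restricting the integral to altitudes above the surface layer gives the free-atmosphere seeing, which is why the surface-layer thickness is such a decisive site-testing parameter on the Antarctic plateau.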