The next generation of High Energy Physics experiments is expected to generate exabytes of data---two orders of magnitude greater than the current generation. In order to reliably meet peak demands, facilities must either plan to provision enough resources to cover the forecasted need, or find ways to elastically expand their computational capabilities. Commercial cloud and allocation-based High Performance Computing (HPC) resources both have explicit and implicit costs that must be considered when deciding when to provision these resources, and at what scale. In order to support such provisioning in a manner consistent with organizational business rules and budget constraints, we have developed a modular intelligent decision support system (IDSS) to aid in the automatic provisioning of resources spanning multiple cloud providers, multiple HPC centers, and grid computing federations.
Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and designed to be flexibly deployed for a variety of computing tasks. There is growing interest among cloud providers in demonstrating the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized both local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. In addition, we discuss the economics and operational efficiencies of executing workflows both in the cloud and on dedicated resources.
HEPCloud is rapidly becoming the primary system for provisioning compute resources for all Fermilab-affiliated experiments. In order to reliably meet the peak demands of the next generation of High Energy Physics experiments, Fermilab must plan to elastically expand its computational capabilities to cover the forecasted need. Commercial cloud and allocation-based High Performance Computing (HPC) resources both have explicit and implicit costs that must be considered when deciding when to provision these resources, and at what scale. In order to support such provisioning in a manner consistent with organizational business rules and budget constraints, we have developed a modular intelligent decision support system (IDSS) to aid in the automatic provisioning of resources spanning multiple cloud providers, multiple HPC centers, and grid computing federations. In this paper, we discuss the goals and architecture of the HEPCloud Facility, the architecture of the IDSS, and our early experience in using the IDSS for automated facility expansion at both Fermilab and Brookhaven National Laboratory.
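As a rough illustration of the kind of cost- and policy-aware decision such a system must automate, the following Python sketch greedily assigns a compute demand to the cheapest policy-allowed resource pools within a budget. All pool names, prices, and rules here are hypothetical assumptions for demonstration and do not describe the actual HEPCloud decision engine.

    # Illustrative sketch only: a toy cost- and policy-aware provisioning decision.
    # Pool names, prices, and rules are hypothetical, not the HEPCloud IDSS itself.
    from dataclasses import dataclass

    @dataclass
    class ResourcePool:
        name: str
        cost_per_core_hour: float    # explicit cost in USD; 0.0 for allocation-based HPC or local grid
        capacity_core_hours: float   # core-hours this pool can still provide
        allowed_by_policy: bool      # stand-in for organizational business rules

    def plan_provisioning(demand_core_hours, pools, budget_usd):
        """Greedily assign demand to the cheapest policy-allowed pools within budget."""
        plan = {}
        remaining = demand_core_hours
        for pool in sorted(pools, key=lambda p: p.cost_per_core_hour):
            if not pool.allowed_by_policy or remaining <= 0:
                continue
            # Core-hours this pool can absorb before exhausting capacity or budget.
            affordable = (budget_usd / pool.cost_per_core_hour
                          if pool.cost_per_core_hour > 0 else remaining)
            take = min(remaining, pool.capacity_core_hours, affordable)
            if take > 0:
                plan[pool.name] = take
                remaining -= take
                budget_usd -= take * pool.cost_per_core_hour
        return plan

    pools = [
        ResourcePool("local-grid", 0.00, 5_000, True),
        ResourcePool("hpc-allocation", 0.00, 2_000, True),
        ResourcePool("commercial-cloud", 0.05, 50_000, True),
    ]
    print(plan_provisioning(10_000, pools, budget_usd=200.0))
    # -> {'local-grid': 5000, 'hpc-allocation': 2000, 'commercial-cloud': 3000}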
The movement of large-scale (tens of terabytes and larger) data sets between high performance computing (HPC) facilities is an important and increasingly critical capability. A growing number of scientific collaborations rely on HPC facilities for tasks which either require large-scale data sets as input or produce large-scale data sets as output. In order to enable the transfer of these data sets as needed by the scientific community, HPC facilities must design and deploy the appropriate data transfer capabilities to allow users to perform data placement at scale. This paper describes the Petascale DTN Project, an effort undertaken by four HPC facilities, which succeeded in achieving routine data transfer rates of over 1 PB/week between the facilities. We describe the design and configuration of the Data Transfer Node (DTN) clusters used for large-scale data transfers at these facilities, the software tools used, and the performance tuning that enabled this capability.
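For intuition about what such a rate implies (a back-of-the-envelope check, not a figure from the project itself), sustaining 1 PB/week corresponds to roughly 13 Gb/s of continuous throughput:

    # Back-of-the-envelope: sustained throughput implied by 1 PB/week.
    PB_BITS = 1e15 * 8               # one petabyte (decimal) in bits
    SECONDS_PER_WEEK = 7 * 24 * 3600
    rate_gbps = PB_BITS / SECONDS_PER_WEEK / 1e9
    print(f"1 PB/week ~ {rate_gbps:.1f} Gb/s sustained")   # ~13.2 Gb/s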
Deep learning models are vulnerable to adversarial examples, which can fool a target classifier by imposing imperceptible perturbations onto natural examples. In this work, we consider the practical and challenging decision-based black-box adversarial setting, where the attacker can only acquire the final classification labels by querying the target model, without access to the model's details. Under this setting, existing works often rely on heuristics and exhibit unsatisfactory performance. To better understand the rationale behind these heuristics and the limitations of existing methods, we propose to automatically discover decision-based adversarial attack algorithms. In our approach, we construct a search space using basic mathematical operations as building blocks and develop a random search algorithm to efficiently explore this space, incorporating several pruning techniques and intuitive priors inspired by work on program synthesis. Although we use a small and fast model to efficiently evaluate attack algorithms during the search, extensive experiments demonstrate that the discovered algorithms are simple yet query-efficient when transferred to larger normal and defended models on the CIFAR-10 and ImageNet datasets. They consistently achieve performance comparable to or better than state-of-the-art decision-based attack methods.
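The following toy sketch conveys the flavor of such a search: compositions of basic operations are sampled at random, each candidate update rule is scored by the number of queries it needs to flip the hard-label decision of a small surrogate model, and the cheapest rule is kept. The operation set, surrogate classifier, and scoring below are simplified assumptions for illustration; the paper's actual search space, pruning techniques, and priors are not reproduced here.

    # Illustrative sketch only: random search over compositions of basic operations
    # to find a query-efficient decision-based update rule on a toy linear model.
    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.normal(size=20)            # toy linear "black-box" classifier
    x0 = -w                            # natural example, guaranteed to be labeled 0

    def query_label(x):
        return int(w @ x > 0)          # only the hard label is observable

    # Basic building blocks: each maps (current point, random direction) to a candidate.
    OPS = {
        "add":     lambda x, d: x + 0.5 * d,
        "sign":    lambda x, d: x + 0.5 * np.sign(d),
        "project": lambda x, d: x + 0.5 * (d - (d @ x) * x / (x @ x + 1e-9)),
    }

    def run_attack(program, max_queries=200):
        """Return the number of queries needed to flip the label (inf on failure)."""
        x = x0.copy()
        for q in range(1, max_queries + 1):
            d = rng.normal(size=x.shape)
            candidate = x
            for op_name in program:            # compose the sampled operations in order
                candidate = OPS[op_name](candidate, d)
            if query_label(candidate):         # success: the hard label flipped
                return q
            x = candidate                      # otherwise accept the step and continue
        return float("inf")

    # Random search over short programs (1-3 composed operations), keeping the best.
    best_program, best_cost = None, float("inf")
    for _ in range(50):
        program = list(rng.choice(list(OPS), size=rng.integers(1, 4)))
        cost = np.mean([run_attack(program) for _ in range(5)])
        if cost < best_cost:
            best_program, best_cost = program, cost
    print("best program:", best_program, "average queries:", best_cost)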
With increasing linkage within value chains, the IT systems of different companies are also being connected with each other. This enables the integration of services within the Industry 4.0 movement in order to improve the quality and performance of processes. Enterprise architecture models form the basis for this by improving business-IT alignment. However, the heterogeneity of modeling frameworks and description languages makes interconnecting such models considerably difficult, especially due to differences in syntax, semantics, and relations. Therefore, this paper presents a transformation engine to convert enterprise architecture models between several languages. We have developed the first generic translation approach that is free of specific meta-modeling and is flexibly adaptable to arbitrary modeling languages. The transformation process is defined by various pattern matching techniques using a rule-based description language. It uses set theory and first-order logic as the basis for an intuitive description. The concept is evaluated in practice using an example from a large German IT service provider. Moreover, the approach is applicable across a wide range of enterprise architecture frameworks.
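As a minimal illustration of the rule-based idea (the element types and mapping below are hypothetical and far simpler than the pattern-matching and first-order-logic machinery described in the paper), each rule pairs a predicate over a source element with a constructor for its counterpart in the target language:

    # Illustrative sketch only: a tiny rule-based mapping between two notations.
    # The element types and the mapping itself are assumptions for demonstration.
    from dataclasses import dataclass

    @dataclass
    class Element:
        language: str
        type: str
        name: str

    # Each rule: a predicate over a source element and a constructor for the target element.
    RULES = [
        (lambda e: e.language == "ArchiMate" and e.type == "BusinessProcess",
         lambda e: Element("BPMN", "Process", e.name)),
        (lambda e: e.language == "ArchiMate" and e.type == "ApplicationComponent",
         lambda e: Element("UML", "Component", e.name)),
    ]

    def transform(model):
        """Apply the first matching rule to each element; keep unmatched elements as-is."""
        result = []
        for element in model:
            target = next((make(element) for match, make in RULES if match(element)), element)
            result.append(target)
        return result

    source_model = [
        Element("ArchiMate", "BusinessProcess", "Order Handling"),
        Element("ArchiMate", "ApplicationComponent", "ERP System"),
    ]
    for e in transform(source_model):
        print(e)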