Arhuaco: Deep Learning and Isolation Based Security for Distributed High-Throughput Computing

Published by: Andres Gomez Ramirez

Publication date: 2018

Research field: Informatics Engineering

Language: English

Authors: A. Gomez Ramirez, C. Lara, L. Betev

Subjects: Distributed, Parallel, and Cluster Computing; Cryptography and Security; Machine Learning

Grid computing systems require innovative methods and tools to identify cybersecurity incidents and perform autonomous actions, i.e., without administrator intervention. They also require methods to isolate and trace job payload activity in order to protect users and find evidence of malicious behavior. We introduce an integrated approach to security monitoring that combines Security by Isolation with Linux Containers and Deep Learning methods for analyzing real-time data from Grid jobs running inside virtualized High-Throughput Computing infrastructure, in order to detect and prevent intrusions. A dataset for malware detection in Grid computing is described. We also show how generative methods based on Recurrent Neural Networks can be used to improve the collected dataset. We present Arhuaco, a prototype implementation of the proposed methods, and empirically study the performance of our technique. The results show that Arhuaco outperforms other methods used in Intrusion Detection Systems for Grid Computing. The study is carried out in the ALICE Collaboration Grid, part of the Worldwide LHC Computing Grid.

E. A. Huerta, Roland Haas, Shantenu Jha, 2018

The advent of experimental science facilities (instruments and observatories such as the Large Hadron Collider, the Laser Interferometer Gravitational Wave Observatory, and the upcoming Large Synoptic Survey Telescope) has brought about challenging, large-scale computational and data processing requirements. Traditionally, the computing infrastructure supporting these facilities' requirements was organized into separate systems: those that supported their high-throughput needs and those that supported their high-performance computing needs. We argue that to enable and accelerate scientific discovery at the scale and sophistication that is now needed, this separation between high-performance computing and high-throughput computing must be bridged and an integrated, unified infrastructure provided. In this paper, we discuss several case studies where such infrastructure has been implemented. These case studies span different science domains, software systems, and application requirements, as well as levels of sustainability. A further aim of this paper is to provide a basis for determining the common characteristics and requirements of such infrastructure, and to begin a discussion of how best to support the computing requirements of existing and future experimental science facilities.

Subjects: Distributed, Parallel, and Cluster Computing; High Energy Astrophysical Phenomena; General Relativity and Quantum Cosmology

GRAPLEr: A Distributed Collaborative Environment for Lake Ecosystem Modeling that Integrates Overlay Networks, High-throughput Computing, and Web Services

Kensworth Subratie, Saumitra Aditya, Renato Figueiredo, 2015

The GLEON Research And PRAGMA Lake Expedition (GRAPLE) is a collaborative effort between computer science and lake ecology researchers. It aims to improve our understanding of, and capacity to predict, the threats to the water quality of our freshwater resources, including climate change. This paper presents GRAPLEr, a distributed computing system used to address the modeling needs of GRAPLE researchers. GRAPLEr integrates and applies overlay virtual network, high-throughput computing, and Web service technologies in a novel way. First, its user-level IP-over-P2P (IPOP) overlay network allows compute and storage resources distributed across independently administered institutions (including private and public clouds) to be aggregated into a common virtual network, despite the presence of firewalls and network address translators. Second, resources aggregated by the IPOP virtual network run unmodified high-throughput computing middleware (HTCondor) to enable large numbers of model simulations to be executed concurrently across the distributed computing resources. Third, a Web service interface allows end users to submit job requests to the system using client libraries that integrate with the R statistical computing environment. The paper presents the GRAPLEr architecture, describes its implementation, and reports on its performance for batches of General Lake Model (GLM) simulations across three cloud infrastructures (University of Florida, CloudLab, and Microsoft Azure).

Subjects: Distributed, Parallel, and Cluster Computing

BigDL: A Distributed Deep Learning Framework for Big Data

Jason Dai, Yiheng Wang, Xin Qiu, 2018

This paper presents BigDL, a distributed deep learning framework for Apache Spark, which has been used by a variety of users in industry for building deep learning applications on production big data platforms. It allows deep learning applications to run on an Apache Hadoop/Spark cluster so that they can directly process the production data, and to form part of the end-to-end data analysis pipeline for deployment and management. Unlike existing deep learning frameworks, BigDL implements distributed, data-parallel training directly on top of the functional compute model of Spark (with copy-on-write and coarse-grained operations). We also share real-world experience and war stories from users who have adopted BigDL to address their challenges (i.e., how to easily build end-to-end data analysis and deep learning pipelines for their production data).
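As a rough illustration of what synchronous data-parallel training means in general, the sketch below splits a toy dataset into shards, computes a gradient per shard, and combines the gradients before each update. It is plain Python and does not use BigDL or Spark; the names `partition`, `local_gradient`, and `train` are hypothetical, and the linear model is an assumption chosen only to keep the example small.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair for a toy 1-D regression


def partition(data: List[Sample], n_parts: int) -> List[List[Sample]]:
    """Split the dataset into n_parts shards (round-robin)."""
    return [data[i::n_parts] for i in range(n_parts)]


def local_gradient(w: float, b: float, shard: List[Sample]) -> Tuple[float, float, int]:
    """Summed squared-error gradient for y ~ w*x + b on one shard.

    Returns (dL/dw, dL/db, shard size). In a real cluster each worker
    would run this on its own partition.
    """
    gw = gb = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        gw += 2.0 * err * x
        gb += 2.0 * err
    return gw, gb, len(shard)


def train(data: List[Sample], n_parts: int = 4, lr: float = 0.05,
          steps: int = 2000) -> Tuple[float, float]:
    """Synchronous data-parallel gradient descent (simulated sequentially)."""
    shards = partition(data, n_parts)
    w = b = 0.0
    for _ in range(steps):
        # "Map": every shard computes a gradient with the same parameters.
        results = [local_gradient(w, b, shard) for shard in shards]
        # "Reduce": sum shard gradients and normalize by the total sample count.
        n = sum(r[2] for r in results)
        gw = sum(r[0] for r in results) / n
        gb = sum(r[1] for r in results) / n
        # Every worker then applies the same update.
        w -= lr * gw
        b -= lr * gb
    return w, b


# Toy data generated from y = 3x + 1.
data = [(k / 10, 3 * (k / 10) + 1) for k in range(20)]
w, b = train(data)
```

Because the shard gradients are summed and divided by the total sample count, the combined gradient equals the one computed on the full dataset, so the number of shards changes only where the work happens, not the result.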

Subjects: Distributed, Parallel, and Cluster Computing; Artificial Intelligence; Machine Learning

Distributed Machine Learning for Predictive Analytics in Mobile Edge Computing Based IoT Environments

Prabath Abeysekara, Hai Dong, A.K. Qin, 2020

Predictive analytics in Mobile Edge Computing (MEC) based Internet of Things (IoT) environments is in growing demand in many real-world applications. A prediction problem in an MEC-based IoT environment typically corresponds to a collection of tasks, with each task solved in a specific MEC environment based on the data accumulated locally, and can therefore be regarded as a Multi-task Learning (MTL) problem. However, the heterogeneity of the data (non-IIDness) accumulated across different MEC environments challenges the application of general MTL techniques in such a setting. Federated MTL (FMTL) has recently emerged as an attempt to address this issue. Besides FMTL, there exists another powerful but under-exploited distributed machine learning technique, called Network Lasso (NL), which is inherently related to FMTL but has its own unique features. In this paper, we carry out an in-depth evaluation and comparison of these two techniques on three distinct IoT datasets representing real-world application scenarios. Experimental results reveal that NL outperforms FMTL in MEC-based IoT environments in terms of both accuracy and computational efficiency.
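For orientation, Network Lasso is usually stated as the following optimization problem over a graph. The abstract does not give the formulation, so the notation here is an assumption based on the standard definition: the nodes $V$ would correspond to MEC environments, $x_i$ to the local model at node $i$, $f_i$ to its local loss, $w_{jk}$ to edge weights, and $\lambda \ge 0$ to the regularization strength.

```latex
\min_{\{x_i\}_{i \in V}} \;
  \sum_{i \in V} f_i(x_i)
  \;+\; \lambda \sum_{(j,k) \in E} w_{jk} \,\lVert x_j - x_k \rVert_2
```

The penalty uses the unsquared $\ell_2$ norm of the difference between neighboring models, which encourages connected nodes to share exactly the same parameters and so groups nodes into clusters with a common model. With $\lambda = 0$ every node fits its own model independently.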

Subjects: Distributed, Parallel, and Cluster Computing

PIRATE: A Blockchain-based Secure Framework of Distributed Machine Learning in 5G Networks

Sicong Zhou, Huawei Huang, Wuhui Chen, 2019

In fifth-generation (5G) networks and beyond, communication latency and network bandwidth will no longer be a bottleneck for mobile users. Thus, almost every mobile device can participate in distributed learning; that is, the availability issue of distributed learning can be eliminated. However, model safety will become a challenge, because a distributed learning system is prone to Byzantine attacks during the stages of updating model parameters and aggregating gradients among multiple learning participants. Therefore, to provide Byzantine resilience for distributed learning in the 5G era, this article proposes a secure computing framework based on the sharding technique of blockchain, namely PIRATE. A case study shows how the proposed PIRATE contributes to distributed learning. Finally, we also envision some open issues and challenges based on the proposed Byzantine-resilient learning framework.

Subjects: Distributed, Parallel, and Cluster Computing; Cryptography and Security