Archipelago: A Scalable Low-Latency Serverless Platform

Published by Arjun Singhvi

Publication date: 2019

Research field: Informatics Engineering

Paper language: English

Authors: Arjun Singhvi, Kevin Houck, Arjun Balasubramanian

Distributed, Parallel, and Cluster Computing

Abstract

The increased use of micro-services to build web applications has spurred the rapid growth of Function-as-a-Service (FaaS) or serverless computing platforms. While FaaS simplifies provisioning and scaling for application developers, it introduces new challenges in resource management that need to be handled by the cloud provider. Our analysis of popular serverless workloads indicates that schedulers need to handle functions that are very short-lived, have unpredictable arrival patterns, and require expensive setup of sandboxes. The challenge of running a large number of such functions in a multi-tenant cluster makes existing scheduling frameworks unsuitable. We present Archipelago, a platform that enables low-latency request execution in a multi-tenant serverless setting. Archipelago views each application as a DAG of functions, and every DAG is associated with a latency deadline. Archipelago achieves its per-DAG request latency goals by: (1) partitioning a given cluster into a number of smaller worker pools, and associating each pool with a semi-global scheduler (SGS), (2) using a latency-aware scheduler within each SGS along with proactive sandbox allocation to reduce overheads, and (3) using a load balancing layer to route requests for different DAGs to the appropriate SGS, and automatically scale the number of SGSs per DAG. Our testbed results show that Archipelago meets the latency deadline for more than 99% of realistic application request workloads, and reduces tail latencies by up to 36X compared to state-of-the-art serverless platforms.
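The three mechanisms named in the abstract can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the class names (`Request`, `SemiGlobalScheduler`, `LoadBalancer`), the least-slack-first ordering, and the hash-plus-round-robin routing are hypothetical and are not taken from the paper, whose abstract does not specify the actual policies. Proactive sandbox allocation is left out entirely.

```python
import heapq
import zlib
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    # Slack = deadline - now - estimated runtime; smaller slack is more urgent.
    slack: float
    dag_id: str = field(compare=False)


class SemiGlobalScheduler:
    """Hypothetical SGS: owns one worker pool and orders requests by slack."""

    def __init__(self, name, num_workers):
        self.name = name
        self.num_workers = num_workers
        self.queue = []  # min-heap keyed on slack

    def submit(self, dag_id, deadline, est_runtime, now=0.0):
        heapq.heappush(self.queue, Request(deadline - now - est_runtime, dag_id))

    def dispatch(self):
        """Pop up to num_workers requests, most urgent first."""
        batch = []
        while self.queue and len(batch) < self.num_workers:
            batch.append(heapq.heappop(self.queue).dag_id)
        return batch


class LoadBalancer:
    """Hypothetical routing layer: maps each DAG to one of its assigned SGSs."""

    def __init__(self, schedulers):
        self.schedulers = schedulers
        self.counters = {}  # per-DAG request counter for round-robin

    def route(self, dag_id, sgs_per_dag=1):
        # A stable hash picks the DAG's first SGS; round-robin spreads
        # requests over sgs_per_dag consecutive SGSs once the DAG is scaled out.
        base = zlib.crc32(dag_id.encode()) % len(self.schedulers)
        turn = self.counters.get(dag_id, 0)
        self.counters[dag_id] = turn + 1
        index = (base + turn % sgs_per_dag) % len(self.schedulers)
        return self.schedulers[index]


if __name__ == "__main__":
    sgs = SemiGlobalScheduler("sgs-0", num_workers=2)
    sgs.submit("thumbnail", deadline=100.0, est_runtime=20.0)  # slack 80
    sgs.submit("checkout", deadline=50.0, est_runtime=30.0)    # slack 20
    sgs.submit("report", deadline=500.0, est_runtime=100.0)    # slack 400
    print(sgs.dispatch())  # ['checkout', 'thumbnail']
```

With two workers, the first dispatch picks the two requests closest to missing their deadlines and leaves the one with the most slack queued. A real platform would also need runtime estimates, sandbox state, and scaling signals, none of which this sketch models.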

Justin Hu, Ariana Bruno, Brian Ritchken (2020)

Swarms of autonomous devices are increasing in ubiquity and size. There are two main schools of thought for controlling devices in such swarms: centralized and distributed control. Centralized platforms achieve higher output quality but result in high network traffic and limited scalability, while decentralized systems are more scalable but less sophisticated. In this work we present HiveMind, a centralized coordination control platform for IoT swarms that is both scalable and performant. HiveMind leverages a centralized cluster for all resource-intensive computation, deferring lightweight and time-critical operations, such as obstacle avoidance, to the edge devices to reduce network traffic. HiveMind employs an event-driven serverless framework to run tasks on the cluster, guarantees fault tolerance both in the edge devices and serverless functions, and handles straggler tasks and underperforming devices. We evaluate HiveMind on a swarm of 16 programmable drones in two scenarios: searching for given items, and counting unique people in an area. We show that HiveMind achieves better performance and battery efficiency compared to fully centralized and fully decentralized platforms, while also handling load imbalances and failures gracefully, and allowing edge devices to leverage the cluster to collectively improve their output quality.

Distributed, Parallel, and Cluster Computing; Networking and Internet Architecture

RFaaS: RDMA-Enabled FaaS Platform for Serverless High-Performance Computing

Marcin Copik, Konstantin Taranov, Alexandru Calotoiu (2021)

The rigid MPI programming model and batch scheduling dominate high-performance computing. While clouds brought new levels of elasticity into the world of computing, supercomputers still suffer from low resource utilization rates. To enhance supercomputing clusters with the benefits of serverless computing, a modern cloud programming paradigm for pay-as-you-go execution of stateless functions, we present rFaaS, the first RDMA-aware Function-as-a-Service (FaaS) platform. With hot invocations and decentralized function placement, we overcome the major performance limitations of FaaS systems and provide low-latency remote invocations in multi-tenant environments. We evaluate the new serverless system through a series of microbenchmarks and show that remote functions execute with negligible performance overheads. We demonstrate how serverless computing can bring elastic resource management into MPI-based high-performance applications. Overall, our results show that MPI applications can benefit from modern cloud programming paradigms to guarantee high performance at lower resource costs.

Distributed, Parallel, and Cluster Computing

A Serverless Cloud-Fog Platform for DNN-Based Video Analytics with Incremental Learning

Huaizheng Zhang, Meng Shen, Yizheng Huang (2021)

DNN-based video analytics have empowered many new applications (e.g., automated retail). Meanwhile, the proliferation of fog devices provides developers with more design options to improve performance and save cost. To the best of our knowledge, this paper presents the first serverless system that takes full advantage of the client-fog-cloud synergy to better serve DNN-based video analytics. Specifically, the system aims to achieve two goals: 1) Provide the optimal analytics results under the constraints of lower bandwidth usage and shorter round-trip time (RTT) by judiciously managing the computational and bandwidth resources deployed in the client, fog, and cloud environment. 2) Free developers from tedious administration and operation tasks, including DNN deployment and cloud and fog resource management. To this end, we implement a holistic cloud-fog system referred to as VPaaS (Video-Platform-as-a-Service). VPaaS adopts serverless computing to enable developers to build a video analytics pipeline by simply programming a set of functions (e.g., model inference), which are then orchestrated to process videos through carefully designed modules. To save bandwidth and reduce RTT, VPaaS provides a new video streaming protocol that only sends low-quality video to the cloud. The state-of-the-art (SOTA) DNNs deployed at the cloud can identify regions of video frames that need further processing at the fog ends. At the fog ends, misidentified labels in these regions can be corrected using a lightweight DNN model. To address data drift issues, we incorporate limited human feedback into the system to verify the results and adopt incremental learning to improve our system continuously. The evaluation demonstrates that VPaaS is superior to several SOTA systems: it maintains high accuracy while reducing bandwidth usage by up to 21%, RTT by up to 62.5%, and cloud monetary cost by up to 50%.

Distributed, Parallel, and Cluster Computing; Artificial Intelligence

The Scalable Systems Laboratory: a Platform for Software Innovation for HEP

Robert Gardner, Lincoln Bryant, Mark Neubauer (2020)

The Scalable Systems Laboratory (SSL), part of the IRIS-HEP Software Institute, provides Institute participants and HEP software developers generally with a means to transition their R&D from conceptual toys to testbeds to production-scale prototypes. The SSL enables tooling, infrastructure, and services supporting the innovation of novel analysis and data architectures, development of software elements and tool-chains, reproducible functional and scalability testing of service components, and foundational systems R&D for accelerated services developed by the Institute. The SSL is constructed with a core team having expertise in scale testing and deployment of services across a wide range of cyberinfrastructure. The core team embeds and partners with other areas in the Institute, and with LHC and other HEP development and operations teams as appropriate, to define investigations and required service deployment patterns. We describe the approach and experiences with early application deployments, including analysis platforms and intelligent data delivery systems.

Distributed, Parallel, and Cluster Computing

Platform Autonomous Custom Scalable Service using Service Oriented Cloud Computing Architecture

B. Kamala, B. Priya, J. M. Nandhini (2016)

The global economic recession and the shrinking budgets of IT projects have created a need to develop integrated information systems at a lower cost. Today, the emerging phenomenon of cloud computing aims at transforming the traditional way of computing by providing both software applications and hardware resources as a service. With the rapid evolution of Information Communication Technology (ICT), governments, organizations, and businesses are looking for solutions to improve their services and integrate their IT infrastructures. In recent years, advanced technologies such as SOA and Cloud computing have evolved to address integration problems. The Cloud's enormous capacity at comparatively low cost makes it an ideal platform for SOA deployment. This paper deals with the combined approach of Cloud and Service Oriented Architecture, along with a case study and a review.

Distributed, Parallel, and Cluster Computing
