A Micro-Service based Approach for Constructing Distributed Storage System

131 0 0.0 ( 0 )

Download Cite

Added by Yuhao Lu

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Yuhao Lu - Zhenqing Liu - Dejun Jiang

Distributed Parallel and Cluster Computing

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

This paper presents an approach for constructing distributed storage system based on micro-service architecture. By building storage functionalities using micro services, we can provide new capabilities to a distributed storage system in a flexible way. We take erasure coding and compression as two case studies to show how to build a micro-service based distributed storage system. We also show that by building erasure coding and compression as micro-services, the distributed storage system still achieves reasonable performance compared to the monolithic one.

rate research

Distributed Storage Allocations for Optimal Service Rates

99 - Pei Peng , Moslem Noori , Emina Soljanin 2021

Redundant storage maintains the performance of distributed systems under various forms of uncertainty. This paper considers the uncertainty in node access and download service. We consider two access models under two download service models. In one access model, a user can access each node with a fixed probability, and in the other, a user can access a random fixed-size subset of nodes. We consider two download service models. In the first (small file) model, the randomness associated with the file size is negligible. In the second (large file) model, randomness is associated with both the file size and the systems operations. We focus on the service rate of the system. For a fixed redundancy level, the systems service rate is determined by the allocation of coded chunks over the storage nodes. We consider quasi-uniform allocations, where coded content is uniformly spread among a subset of nodes. The question we address asks what the size of this subset (spreading) should be. We show that in the small file model, concentrating the coded content to a minimum-size subset is universally optimal. For the large file model, the optimal spreading depends on the system parameters. These conclusions hold for both access models.

Information Theory Distributed Parallel and Cluster Computing Information Theory

Live Forensics for Distributed Storage Systems

147 - Saurabh Jha , Shengkun Cui , Tianyin Xu 2019

We present Kaleidoscope an innovative system that supports live forensics for application performance problems caused by either individual component failures or resource contention issues in large-scale distributed storage systems. The design of Kaleidoscope is driven by our study of I/O failures observed in a peta-scale storage system anonymized as PetaStore. Kaleidoscope is built on three key features: 1) using temporal and spatial differential observability for end-to-end performance monitoring of I/O requests, 2) modeling the health of storage components as a stochastic process using domain-guided functions that accounts for path redundancy and uncertainty in measurements, and, 3) observing differences in reliability and performance metrics between similar types of healthy and unhealthy components to attribute the most likely root causes. We deployed Kaleidoscope on PetaStore and our evaluation shows that Kaleidoscope can run live forensics at 5-minute intervals and pinpoint the root causes of 95.8% of real-world performance issues, with negligible monitoring overhead.

Distributed Parallel and Cluster Computing Machine Learning

A Layered Architecture for Erasure-Coded Consistent Distributed Storage

70 - Kishori M. Konwar , N. Prakash , Nancy Lynch 2017

Motivated by emerging applications to the edge computing paradigm, we introduce a two-layer erasure-coded fault-tolerant distributed storage system offering atomic access for read and write operations. In edge computing, clients interact with an edge-layer of servers that is geographically near; the edge-layer in turn interacts with a back-end layer of servers. The edge-layer provides low latency access and temporary storage for client operations, and uses the back-end layer for persistent storage. Our algorithm, termed Layered Data Storage (LDS) algorithm, offers several features suitable for edge-computing systems, works under asynchronous message-passing environments, supports multiple readers and writers, and can tolerate $f_1 < n_1/2$ and $f_2 < n_2/3$ crash failures in the two layers having $n_1$ and $n_2$ servers, respectively. We use a class of erasure codes known as regenerating codes for storage of data in the back-end layer. The choice of regenerating codes, instead of popular choices like Reed-Solomon codes, not only optimizes the cost of back-end storage, but also helps in optimizing communication cost of read operations, when the value needs to be recreated all the way from the back-end. The two-layer architecture permits a modular implementation of atomicity and erasure-code protocols; the implementation of erasure-codes is mostly limited to interaction between the two layers. We prove liveness and atomicity of LDS, and also compute performance costs associated with read and write operations. Further, in a multi-object system running $N$ independent instances of LDS, where only a small fraction of the objects undergo concurrent accesses at any point during the execution, the overall storage cost is dominated by that of persistent storage in the back-end layer, and is given by $Theta(N)$.

Distributed Parallel and Cluster Computing Information Theory Information Theory

Tycoon: A Distributed Market-based Resource Allocation System

59 - Kevin Lai , Bernardo A. Huberman , 2004

P2P clusters like the Grid and PlanetLab enable in principle the same statistical multiplexing efficiency gains for computing as the Internet provides for networking. The key unsolved problem is resource allocation. Existing solutions are not economically efficient and require high latency to acquire resources. We designed and implemented Tycoon, a market based distributed resource allocation system based on an Auction Share scheduling algorithm. Preliminary results show that Tycoon achieves low latency and high fairness while providing incentives for truth-telling on the part of strategic users.

Distributed Parallel and Cluster Computing Multiagent Systems

Pangea: Monolithic Distributed Storage for Data Analytics

114 - Jia Zou , Arun Iyengar , Chris Jermaine 2018

Storage and memory systems for modern data analytics are heavily layered, managing shared persistent data, cached data, and non-shared execution data in separate systems such as distributed file system like HDFS, in-memory file system like Alluxio and computation framework like Spark. Such layering introduces significant performance and management costs for copying data across layers redundantly and deciding proper resource allocation for all layers. In this paper we propose a single system called Pangea that can manage all data---both intermediate and long-lived data, and their buffer/caching, data placement optimization, and failure recovery---all in one monolithic storage system, without any layering. We present a detailed performance evaluation of Pangea and show that its performance compares favorably with several widely used layered systems such as Spark.

Distributed Parallel and Cluster Computing