Distributed Storage Allocations for Optimal Service Rates

Published by Pei Peng

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Pei Peng - Moslem Noori - Emina Soljanin

Information Theory, Distributed, Parallel, and Cluster Computing

Abstract

Redundant storage maintains the performance of distributed systems under various forms of uncertainty. This paper considers the uncertainty in node access and download service. We consider two access models under two download service models. In one access model, a user can access each node with a fixed probability, and in the other, a user can access a random fixed-size subset of nodes. We consider two download service models. In the first (small file) model, the randomness associated with the file size is negligible. In the second (large file) model, randomness is associated with both the file size and the system's operations. We focus on the service rate of the system. For a fixed redundancy level, the system's service rate is determined by the allocation of coded chunks over the storage nodes. We consider quasi-uniform allocations, where coded content is uniformly spread among a subset of nodes. The question we address is what the size of this subset (the spreading) should be. We show that in the small file model, concentrating the coded content on a minimum-size subset is universally optimal. For the large file model, the optimal spreading depends on the system parameters. These conclusions hold for both access models.
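To make the two access models and the notion of spreading concrete, here is a minimal Python sketch. It does not compute the paper's service-rate metric. It only computes the probability that the nodes a user reaches hold enough coded content to rebuild a unit-size file, under assumptions that are not stated in the abstract: MDS-style coding (any coded content of total size 1 suffices), a total storage `budget` in units of the file size, and a quasi-uniform allocation that puts `budget / m` on each of `m` nodes. The function and parameter names are hypothetical.

```python
from fractions import Fraction
from math import ceil, comb


def nodes_needed(budget, m):
    """Fewest loaded nodes whose content adds up to one full file.

    Each of the m loaded nodes stores budget/m of a unit-size file,
    so ceil(m / budget) of them are required (illustrative assumption).
    """
    return ceil(Fraction(m) / Fraction(budget))


def recovery_prob_independent(budget, m, p):
    """Access model 1: each node is reachable independently with probability p."""
    need = nodes_needed(budget, m)
    return sum(comb(m, k) * p**k * (1 - p) ** (m - k) for k in range(need, m + 1))


def recovery_prob_fixed_subset(n, budget, m, r):
    """Access model 2: the user reaches a uniformly random r-subset of the n nodes."""
    if not 0 < m <= n or not 0 <= r <= n:
        raise ValueError("require 0 < m <= n and 0 <= r <= n")
    need = nodes_needed(budget, m)
    hits = sum(comb(m, k) * comb(n - m, r - k) for k in range(need, min(m, r) + 1))
    return hits / comb(n, r)


if __name__ == "__main__":
    # Compare spreadings m for a budget of two file sizes.
    for m in (2, 4, 8):
        print(m, recovery_prob_independent(2, m, 0.5),
              recovery_prob_fixed_subset(10, 2, m, 3))
```

Sweeping `m` from the minimum spreading (here `m = budget`, i.e. whole copies) up to `n` shows how the recovery probability changes with spreading under each access model. The service-rate analysis in the paper additionally accounts for download time, which this sketch leaves out.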

Jing Wang, Zhiyuan Yan, Hongmei Xie, 2016

Recently, research on local repair codes has mainly been confined to repairing failed nodes within each repair group. But in the extreme case where an entire repair group fails, the local code stored in the failed group needs to be recovered as a whole. In this paper, local codes with cooperative repair, in which the local codes are constructed based on minimum storage regeneration (MSR) codes, are proposed to repair failed groups. Specifically, the proposed local codes with cooperative repair construct a mutual interleaving structure among the parity symbols, such that the parity symbols of each local code, named distributed local parity, can be generated from the parity symbols of the MSR codes in its two adjacent local codes. Taking advantage of this structure, failed local groups can be repaired cooperatively by their adjacent local groups with lower repair locality, and the minimum distance of local codes with cooperative repair is derived. Theoretical analysis and simulation experiments show that, compared with codes with local regeneration (such as MSR-local codes and MBR-local codes), the proposed local codes with cooperative repair have benefits in bandwidth overhead and repair locality when local groups fail.

Information Theory, Distributed, Parallel, and Cluster Computing

Optimal Coding Scheme and Resource Allocation for Distributed Computation with Limited Resources

Shu-Jie Cao, Lihui Yi, Haoning Chen, 2021

A central issue in distributed computing systems is how to optimally allocate computing and storage resources and design data shuffling strategies such that the total execution time for computing and data shuffling is minimized. This is extremely critical when the computation, storage, and communication resources are limited. In this paper, we study the resource allocation and coding scheme for the MapReduce-type framework with limited resources. In particular, we focus on the coded distributed computing (CDC) approach proposed by Li et al. We first extend the asymmetric CDC (ACDC) scheme proposed by Yu et al. to the cascade case where each output function is computed by multiple servers. Then we demonstrate that whether CDC or ACDC is better depends on system parameters (e.g., number of computing servers) and task parameters (e.g., number of input files), implying that neither CDC nor ACDC is optimal. By merging the ideas of CDC and ACDC, we propose a hybrid scheme and show that it can strictly outperform CDC and ACDC. Furthermore, we derive an information-theoretic converse showing that for the MapReduce task using a type of weakly symmetric Reduce assignment, which includes the Reduce assignments of CDC and ACDC as special cases, the hybrid scheme with a corresponding resource allocation strategy is optimal, i.e., achieves the minimum execution time, for an arbitrary number of computing servers and storage memories.

Information Theory, Distributed, Parallel, and Cluster Computing

An Application of Storage-Optimal MatDot Codes for Coded Matrix Multiplication: Fast k-Nearest Neighbors Estimation

Utsav Sheth, Sanghamitra Dutta, Malhar Chaudhari, 2018

We propose a novel application of coded computing to the problem of nearest neighbor estimation using MatDot codes [Fahim et al. 2017], which are known to be optimal for matrix multiplication in terms of recovery threshold under storage constraints. In approximate nearest neighbor algorithms, it is common to construct efficient in-memory indexes to improve query response time. One such strategy is Multiple Random Projection Trees (MRPT), which reduces the set of candidate points over which Euclidean distance calculations are performed. However, this may result in a high memory footprint and possibly paging penalties for large or high-dimensional data. Here we propose two techniques to parallelize MRPT that exploit data and model parallelism, respectively, by dividing both the data storage and the computation effort among different nodes in a distributed computing cluster. This is especially critical when a single compute node cannot hold the complete dataset in memory. We also propose a novel coded computation strategy based on MatDot codes for the model-parallel architecture that, in a straggler-prone environment, achieves the storage-optimal recovery threshold, i.e., the number of nodes that are required to serve a query. We experimentally demonstrate that, in the absence of straggling, our distributed approaches require less query time than execution on a single processing node, providing near-linear speedups with respect to the number of worker nodes. Through our experiments on real systems with simulated straggling, we also show that our strategy achieves faster query execution than the uncoded strategy in a straggler-prone environment.
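As a rough illustration of the recovery threshold mentioned in this abstract, the following Python sketch shows a MatDot-style coded matrix product for the smallest case of two blocks. It is not the authors' nearest-neighbor pipeline. It assumes the usual MatDot layout (A split by columns, B split by rows, polynomial encoding, product recovered as one coefficient of a degree-2 matrix polynomial), uses exact rational arithmetic instead of a finite field or floating point, and all function names are hypothetical.

```python
from fractions import Fraction


def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def lin_comb(X, cx, Y, cy):
    """Entrywise cx*X + cy*Y for equally sized matrices."""
    return [[cx * x + cy * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def encode(A, B, x):
    """Coded inputs for the worker at evaluation point x (two blocks)."""
    h = len(A[0]) // 2
    A0, A1 = [row[:h] for row in A], [row[h:] for row in A]  # column blocks of A
    B0, B1 = B[:h], B[h:]                                    # row blocks of B
    pA = lin_comb(A0, 1, A1, x)  # A0 + A1 * x
    pB = lin_comb(B0, x, B1, 1)  # B0 * x + B1
    return pA, pB


def decode(results):
    """Recover A @ B from any three (x, pA(x) @ pB(x)) pairs.

    pA(x) @ pB(x) has degree 2, and its x^1 coefficient is A0@B0 + A1@B1 = A@B.
    That coefficient is read off with Lagrange interpolation weights.
    """
    if len(results) != 3:
        raise ValueError("two-block MatDot needs exactly 3 worker results")
    rows, cols = len(results[0][1]), len(results[0][1][0])
    out = [[Fraction(0)] * cols for _ in range(rows)]
    for i, (xi, Pi) in enumerate(results):
        a, b = [xj for j, (xj, _) in enumerate(results) if j != i]
        weight = Fraction(-(a + b), (xi - a) * (xi - b))  # x^1 coefficient of L_i
        out = lin_comb(out, 1, Pi, weight)
    return out


if __name__ == "__main__":
    A, B = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
    workers = [(x, matmul(*encode(A, B, x))) for x in (1, 2, 3, 4)]
    # Suppose the worker at x = 2 straggles; any three results still suffice.
    print(decode([workers[0], workers[2], workers[3]]))
```

With four workers and two blocks, any three responses are enough, so one straggler can be ignored. Each worker stores only half of A and half of B in coded form, which is the storage constraint the abstract refers to.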

Information Theory, Distributed, Parallel, and Cluster Computing

Efficient Storage Schemes for Desired Service Rate Regions

Fatemeh Kazemi, Sascha Kurz, Emina Soljanin, 2020

A major concern in cloud/edge storage systems is serving a large number of users simultaneously. The service rate region was recently introduced as an important performance metric for coded distributed systems; it is defined as the set of all data access requests that can be simultaneously handled by the system. This paper studies the problem of designing a coded distributed storage system storing k files, where a desired service rate region R of the system is given and the goal is 1) to determine the minimum number of storage nodes n(R) (or a lower bound on n(R)) for serving all demand vectors inside the set R and 2) to design the most storage-efficient redundancy scheme with a service rate region covering R. Towards this goal, we propose three general lower bounds for n(R). Also, for k=2, we characterize n(R), i.e., we show that the proposed lower bounds are tight by designing a novel storage-efficient redundancy scheme with n(R) storage nodes and a service rate region covering R.

Information Theory, Discrete Mathematics

Generic Secure Repair for Distributed Storage

Wentao Huang, Jehoshua Bruck, 2017

This paper studies the problem of repairing secret sharing schemes, i.e., schemes that encode a message into $n$ shares, assigned to $n$ nodes, so that any $n-r$ nodes can decode the message but any colluding $z$ nodes cannot infer any information about the message. In the event of node failures, where the shares held by the failed nodes are lost, the system needs to be repaired by reconstructing and reassigning the lost shares to the failed (or replacement) nodes. This can be achieved trivially by a trustworthy third party that receives the shares of the available nodes, then recomputes and reassigns the lost shares. The interesting question, studied in the paper, is how to repair without a trustworthy third party. The main issue that arises is repair security: how to maintain the requirement that any colluding $z$ nodes, including the failed nodes, cannot learn any information about the message, during and after the repair process? We solve this secure repair problem from the perspective of secure multi-party computation. Specifically, we design generic repair schemes that can securely repair any (scalar or vector) linear secret sharing scheme. We prove a lower bound on the repair bandwidth of secure repair schemes and show that the proposed secure repair schemes achieve the optimal repair bandwidth up to a small constant factor when $n$ dominates $z$, or when the secret sharing scheme being repaired has optimal rate. We adopt a formal information-theoretic approach in our analysis and bounds. A main idea in our schemes is to allow a more flexible repair model than the straightforward one-round repair model implicitly assumed by existing secure regenerating codes. In particular, the proposed secure repair schemes are simple and efficient two-round protocols.
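The setting in this abstract can be illustrated with a minimal Python sketch of Shamir's threshold secret sharing, which is one standard linear secret sharing scheme. It shows only the baseline that the abstract calls trivial: a trusted party collects enough shares and recomputes a lost one. It is not the paper's secure two-round repair protocol. The threshold choice (any $z+1$ shares decode, so $n-r = z+1$), the prime modulus, and all function names are assumptions made for illustration.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime; the field size is an illustrative choice


def make_shares(secret, n, z):
    """Split `secret` into n shares; any z+1 decode, any z reveal nothing."""
    # Random polynomial of degree z with constant term equal to the secret.
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(z)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]


def interpolate_at(shares, x):
    """Lagrange-interpolate the sharing polynomial at x from z+1 shares.

    x = 0 returns the secret; x = index of a failed node returns its lost share.
    """
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (x - xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


if __name__ == "__main__":
    shares = make_shares(secret=1234, n=5, z=2)
    print(interpolate_at(shares[:3], 0))     # decode from any z+1 = 3 shares
    lost_x, lost_y = shares[4]               # node 5 fails
    print(interpolate_at(shares[:3], lost_x) == lost_y)  # trusted-party repair
```

The weakness of this baseline is visible in the code: whoever runs `interpolate_at` holds $z+1$ shares and could equally evaluate at 0 to learn the secret, which is why repair without a trusted party is the harder problem.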

Information Theory, Cryptography and Security
