Cooperative Training for Attribute-Distributed Data: Trade-off Between Data Transmission and Performance

Published by Haipeng Zheng

Publication date: 2009

Research field: Informatics Engineering

Language: English

Authors: Haipeng Zheng, Sanjeev R. Kulkarni, H. Vincent Poor

Subjects: Distributed, Parallel, and Cluster Computing; Multiagent Systems

This paper introduces a modeling framework for distributed regression with agents/experts observing attribute-distributed data (heterogeneous data). Under this model, a new algorithm, the iterative covariance optimization algorithm (ICOA), is designed to reshape the covariance matrix of the training residuals of individual agents so that the linear combination of the individual estimators minimizes the ensemble training error. Moreover, a scheme (Minimax Protection) is designed to provide a trade-off between the number of data instances transmitted among the agents and the performance of the ensemble estimator without undermining the convergence of the algorithm. This scheme also provides an upper bound (with high probability) on the test error of the ensemble estimator. The efficacy of ICOA combined with Minimax Protection and the comparison between the upper bound and actual performance are both demonstrated by simulations.
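The combination step mentioned in the abstract can be illustrated with a small sketch. This is not ICOA itself, whose details are not given on this page. It shows only the standard closed-form choice of weights w that minimizes w^T A w subject to the weights summing to one, where A is the covariance matrix of the agents' training residuals. The sum-to-one constraint, the function names, and the toy residuals are assumptions made for illustration.

```python
def residual_covariance(residuals):
    """Uncentered covariance: A[i][j] = mean over instances of r_i * r_j."""
    n_agents = len(residuals)
    n_samples = len(residuals[0])
    return [
        [
            sum(residuals[i][k] * residuals[j][k] for k in range(n_samples)) / n_samples
            for j in range(n_agents)
        ]
        for i in range(n_agents)
    ]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ensemble_weights(cov):
    """Weights minimizing w^T A w subject to sum(w) = 1: A^{-1} 1, normalized."""
    z = solve(cov, [1.0] * len(cov))
    total = sum(z)
    return [v / total for v in z]


def ensemble_error(cov, weights):
    """Training error of the weighted ensemble, w^T A w."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))


# Hypothetical residuals: three agents evaluated on four training instances.
toy_residuals = [
    [1.0, -1.0, 1.0, -1.0],
    [0.5, 0.5, -0.5, -0.5],
    [1.0, 2.0, -1.0, 0.0],
]
cov = residual_covariance(toy_residuals)
weights = ensemble_weights(cov)
print("weights:", weights)
print("ensemble error:", ensemble_error(cov, weights))
print("best single agent error:", min(cov[i][i] for i in range(len(cov))))
```

Because each single agent corresponds to a feasible weight vector, the combined error can never exceed the best individual error under this constraint. How strongly the residuals are correlated decides how much is gained.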

قيم البحث

Ajay Badita, Rooji Jinan, Balajee Vamanan (2021)

We consider energy minimization for data-intensive applications run on a large number of servers, for given performance guarantees. We consider a system where each incoming application is sent to a set of servers and is considered complete once a subset of them finish serving it. We consider a simple case in which each server core has two speed levels, where the higher speed is achieved by supplying more power to each core independently. The core selects one of the two speeds probabilistically for each incoming application request. We model the arrival of application requests by a Poisson process, and the random service times at the servers by independent exponential random variables. Our model and analysis generalize to today's state of the art in CPU energy management, where each core can independently select a speed level from a set of supported speeds and corresponding voltages. The performance metrics under consideration are the mean number of applications in the system and the average energy expenditure. We first provide a tight approximation to study this previously intractable problem and derive closed-form approximate expressions for the performance metrics when service times are exponentially distributed. Next, we study the trade-off between the approximate mean number of applications and energy expenditure in terms of the switching probability.

Subjects: Distributed, Parallel, and Cluster Computing; Information Retrieval

PyTorch Distributed: Experiences on Accelerating Data Parallel Training

Shen Li, Yanli Zhao, Rohan Varma (2020)

This paper presents the design, implementation, and evaluation of the PyTorch distributed data parallel module. PyTorch is a widely-adopted scientific computing package used in deep learning research and applications. Recent advances in deep learning argue for the value of large datasets and large models, which necessitates the ability to scale out model training to more computational resources. Data parallelism has emerged as a popular solution for distributed training thanks to its straightforward principle and broad applicability. In general, the technique of distributed data parallelism replicates the model on every computational resource to generate gradients independently and then communicates those gradients at each iteration to keep model replicas consistent. Despite the conceptual simplicity of the technique, the subtle dependencies between computation and communication make it non-trivial to optimize the distributed training efficiency. As of v1.5, PyTorch natively provides several techniques to accelerate distributed data parallel, including bucketing gradients, overlapping computation with communication, and skipping gradient synchronization. Evaluations show that, when configured appropriately, the PyTorch distributed data parallel module attains near-linear scalability using 256 GPUs.
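The principle described above, where each replica computes gradients on its own data and the gradients are then communicated so the replicas stay consistent, can be sketched in plain Python. This is not PyTorch's implementation and uses none of its APIs. The 1-D linear model, the function names, and the in-process `all_reduce_mean` stand-in for a collective operation are all assumptions for illustration, and the sketch leaves out bucketing and the overlap of computation with communication.

```python
def gradient(weight, shard):
    """Gradient of the mean squared error for the 1-D model y_hat = weight * x."""
    return sum(2.0 * (weight * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Stand-in for a collective all-reduce: every replica receives the mean."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def data_parallel_step(replica_weights, shards, learning_rate):
    """One synchronous step: local gradients, averaging, identical updates."""
    local_grads = [gradient(w, shard) for w, shard in zip(replica_weights, shards)]
    synced_grads = all_reduce_mean(local_grads)
    return [w - learning_rate * g for w, g in zip(replica_weights, synced_grads)]


# Hypothetical data set, split into two equal-size shards (one per replica).
dataset = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [dataset[:2], dataset[2:]]

replicas = [0.0, 0.0]  # both replicas start from the same weight
replicas = data_parallel_step(replicas, shards, learning_rate=0.01)

# Reference: a single process taking one step on the full batch.
single = 0.0 - 0.01 * gradient(0.0, dataset)

print("replica weights:", replicas)
print("single-process weight:", single)
```

With equal-size shards, the mean of the per-shard gradients equals the full-batch gradient, so every replica ends the step with the same weight that a single process would have computed.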

Subjects: Distributed, Parallel, and Cluster Computing; Machine Learning

Diversity/Parallelism Trade-off in Distributed Systems with Redundancy

Pei Peng, Emina Soljanin, Philip Whiting (2020)

As numerous machine learning and other algorithms increase in complexity and data requirements, distributed computing becomes necessary to satisfy the growing computational and storage demands, because it enables parallel execution of smaller tasks that make up a large computing job. However, random fluctuations in task service times lead to straggling tasks with long execution times. Redundancy, in the form of task replication and erasure coding, provides diversity that allows a job to be completed when only a subset of redundant tasks is executed, thus removing the dependency on the straggling tasks. In situations of constrained resources (here a fixed number of parallel servers), increasing redundancy reduces the available resources for parallelism. In this paper, we characterize the diversity vs. parallelism trade-off and identify the optimal strategy, among replication, coding and splitting, which minimizes the expected job completion time. We consider three common service time distributions and establish three models that describe scaling of these distributions with the task size. We find that different distributions with different scaling models operate optimally at different levels of redundancy, and thus may require very different code rates.
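A toy calculation can make the diversity vs. parallelism tension concrete. The model below is an assumption chosen for illustration and is not taken from the paper: a unit-size job is split into k equal tasks, each task is replicated on n/k of the n servers, and a task of size s takes s * shift plus an exponential time with the given rate, independently on each server. Under these assumptions the expected completion time is shift/k + H_k / ((n/k) * rate), where H_k is the k-th harmonic number. Coding is not modeled, and all names are hypothetical.

```python
def harmonic(k):
    """k-th harmonic number H_k = 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))


def expected_completion_time(n_servers, k_tasks, shift, rate):
    """Expected job time with k tasks, each replicated on n/k servers.

    A task finishes when its fastest replica does: the minimum of r
    exponentials with rate `rate` is exponential with rate r * rate.
    The job finishes when its slowest task does: the maximum of k
    exponentials with rate lam has mean H_k / lam.
    """
    if n_servers % k_tasks != 0:
        raise ValueError("k_tasks must divide n_servers")
    replicas = n_servers // k_tasks
    return shift / k_tasks + harmonic(k_tasks) / (replicas * rate)


def best_split(n_servers, shift, rate):
    """Number of tasks (a divisor of n_servers) with the smallest expected time."""
    divisors = [k for k in range(1, n_servers + 1) if n_servers % k == 0]
    return min(divisors, key=lambda k: expected_completion_time(n_servers, k, shift, rate))


# Hypothetical setting: 12 servers, unit exponential rate, varying shift.
for shift in (0.0, 10.0, 100.0):
    k = best_split(12, shift, 1.0)
    t = expected_completion_time(12, k, shift, 1.0)
    print(f"shift={shift}: best k={k}, expected time={t:.3f}")
```

In this toy model, pure replication (k = 1) is best when there is no deterministic shift, full splitting is best when the shift dominates, and intermediate values favor a mix. That matches the abstract's qualitative point that the optimal level of redundancy depends on the service time distribution and how it scales.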

Subjects: Distributed, Parallel, and Cluster Computing; Information Theory; Performance

Distributed Quantum Proofs for Replicated Data

Pierre Fraigniaud, François Le Gall, Harumichi Nishimura (2020)

The paper tackles the issue of $\textit{checking}$ that all copies of a large data set replicated at several nodes of a network are identical. The fact that the replicas may be located at distant nodes prevents the system from verifying their equality locally, i.e., by having each node consult only nodes in its vicinity. On the other hand, it remains possible to assign $\textit{certificates}$ to the nodes, so that verifying the consistency of the replicas can be achieved locally. However, we show that, as the data set is large, classical certification mechanisms, including distributed Merlin-Arthur protocols, cannot guarantee good completeness and soundness simultaneously, unless they use very large certificates. The main result of this paper is a distributed $\textit{quantum}$ Merlin-Arthur protocol enabling the nodes to collectively check the consistency of the replicas, based on small certificates, and in a single round of message exchange between neighbors, with short messages. In particular, the certificate-size is logarithmic in the size of the data set, which gives an exponential advantage over classical certification mechanisms.

Subjects: Distributed, Parallel, and Cluster Computing; Quantum Physics

HPTMT Parallel Operators for High Performance Data Science & Data Engineering

Vibhatha Abeykoon, Supun Kamburugamuve, Chathura Widanage (2021)

Data-intensive applications are becoming commonplace in all science disciplines. They comprise a rich set of sub-domains such as data engineering, deep learning, and machine learning. These applications are built around efficient data abstractions and operators that suit the applications of different domains. Often, the lack of a clear definition of data structures and operators in the field has led to other implementations that do not work well together. The HPTMT architecture that we proposed recently identifies a set of data structures, operators, and an execution model for creating rich data applications that link all aspects of data engineering and data science together efficiently. This paper elaborates and illustrates this architecture using an end-to-end application with deep learning and data engineering parts working together.

Subjects: Distributed, Parallel, and Cluster Computing; Artificial Intelligence
