Data Races and the Discrete Resource-time Tradeoff Problem with Resource Reuse over Paths

69 0 0.0 ( 0 )

Download Cite

Added by Rathish Das

Publication date 2019

fields Informatics Engineering

and research's language is English

Authors Rathish Das - Shih-Yu Tsai - Sharmila Duppala

Distributed Parallel and Cluster Computing

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

A determinacy race occurs if two or more logically parallel instructions access the same memory location and at least one of them tries to modify its content. Races often lead to nondeterministic and incorrect program behavior. A data race is a special case of a determinacy race which can be eliminated by associating a mutual-exclusion lock or allowing atomic accesses to the memory location. However, such solutions can reduce parallelism by serializing all accesses to that location. For associative and commutative updates, reducers allow parallel race-free updates at the expense of using some extra space. We ask the following question. Given a fixed budget of extra space to mitigate the cost of races in a parallel program, which memory locations should be assigned reducers and how should the space be distributed among the reducers in order to minimize the overall running time? We argue that the races can be captured by a directed acyclic graph (DAG), with nodes representing memory cells and arcs representing read-write dependencies between cells. We then formulate our optimization problem on DAGs. We concentrate on a variation of this problem where space reuse among reducers is allowed by routing extra space along a source to sink path of the DAG and using it in the construction of reducers along the path. We consider two reducers and the corresponding duration functions (i.e., reduction time as a function of space budget). We generalize our race-avoiding space-time tradeoff problem to a discrete resource-time tradeoff problem with general non-increasing duration functions and resource reuse over paths. For general DAGs, the offline problem is strongly NP-hard under all three duration functions, and we give approximation algorithms. We also prove hardness of approximation for the general resource-time tradeoff problem and give a pseudo-polynomial time algorithm for series-parallel DAGs.

rate research

Real-Time Notification for Resource Synchronization

841 - Martin Klein , Robert Sanderson , Herbert Van de Sompel 2014

Web applications frequently leverage resources made available by remote web servers. As resources are created, updated, deleted, or moved, these applications face challenges to remain in lockstep with the servers change dynamics. Several approaches exist to help meet this challenge for use cases where good enough synchronization is acceptable. But when strict resource coverage or low synchronization latency is required, commonly accepted Web-based solutions remain elusive. This paper details characteristics of an approach that aims at decreasing synchronization latency while maintaining desired levels of accuracy. The approach builds on pushing change notifications and pulling changed resources and it is explored with an experiment based on a DBpedia Live instance.

Distributed Parallel and Cluster Computing

Federated Learning Over Wireless Channels: Dynamic Resource Allocation and Task Scheduling

74 - Shunfeng Chu , Jun Li , Jianxin Wang 2021

With the development of federated learning (FL), mobile devices (MDs) are able to train their local models with private data and sends them to a central server for aggregation, thereby preventing sensitive raw data leakage. In this paper, we aim to improve the training performance of FL systems in the context of wireless channels and stochastic energy arrivals of MDs. To this purpose, we dynamically optimize MDs transmission power and training task scheduling. We first model this dynamic programming problem as a constrained Markov decision process (CMDP). Due to high dimensions rooted from our CMDP problem, we propose online stochastic learning methods to simplify the CMDP and design online algorithms to obtain an efficient policy for all MDs. Since there are long-term constraints in our CMDP, we utilize Lagrange multipliers approach to tackle this issue. Furthermore, we prove the convergence of the proposed online stochastic learning algorithm. Numerical results indicate that the proposed algorithms can achieve better performance than the benchmark algorithms.

Distributed Parallel and Cluster Computing

Data Diffusion: Dynamic Resource Provision and Data-Aware Scheduling for Data Intensive Applications

441 - Ioan Raicu , Yong Zhao , Ian Foster 2008

Data intensive applications often involve the analysis of large datasets that require large amounts of compute and storage resources. While dedicated compute and/or storage farms offer good task/data throughput, they suffer low resource utilization problem under varying workloads conditions. If we instead move such data to distributed computing resources, then we incur expensive data transfer cost. In this paper, we propose a data diffusion approach that combines dynamic resource provisioning, on-demand data replication and caching, and data locality-aware scheduling to achieve improved resource efficiency under varying workloads. We define an abstract data diffusion model that takes into consideration the workload characteristics, data accessing cost, application throughput and resource utilization; we validate the model using a real-world large-scale astronomy application. Our results show that data diffusion can increase the performance index by as much as 34X, and improve application response time by over 506X, while achieving near-optimal throughputs and execution times.

Distributed Parallel and Cluster Computing

Resource Pools and the CAP Theorem

74 - Andrew Lewis-Pye , Tim Roughgarden 2020

Blockchain protocols differ in fundamental ways, including the mechanics of selecting users to produce blocks (e.g., proof-of-work vs. proof-of-stake) and the method to establish consensus (e.g., longest chain rules vs. BFT-inspired protocols). These fundamental differences have hindered apples-to-apples comparisons between different categories of blockchain protocols and, in turn, the development of theory to formally discuss their relative merits. This paper presents a parsimonious abstraction sufficient for capturing and comparing properties of many well-known permissionless blockchain protocols, simultaneously capturing essential properties of both proof-of-work and proof-of-stake protocols, and of both longest-chain-type and BFT-type protocols. Our framework blackboxes the precise mechanics of the user selection process, allowing us to isolate the properties of the selection process which are significant for protocol design. We illustrate our frameworks utility with two results. First, we prove an analog of the CAP theorem from distributed computing for our framework in a partially synchronous setting. This theorem shows that a fundamental dichotomy holds between protocols (such as Bitcoin) that are adaptive, in the sense that they can function given unpredictable levels of participation, and protocols (such as Algorand) that have certain finality properties. Second, we formalize the idea that proof-of-work (PoW) protocols and non-PoW protocols can be distinguished by the forms of permission that users are given to carry out updates to the state.

Distributed Parallel and Cluster Computing Data Structures and Algorithms

A Survey on Time-Sensitive Resource Allocation in the Cloud Continuum

84 - Saravanan Ramanathan 2020

Artificial Intelligence (AI) and Internet of Things (IoT) applications are rapidly growing in todays world where they are continuously connected to the internet and process, store and exchange information among the devices and the environment. The cloud and edge platform is very crucial to these applications due to their inherent compute-intensive and resource-constrained nature. One of the foremost challenges in cloud and edge resource allocation is the efficient management of computation and communication resources to meet the performance and latency guarantees of the applications. The heterogeneity of cloud resources (processors, memory, storage, bandwidth), variable cost structure and unpredictable workload patterns make the design of resource allocation techniques complex. Numerous research studies have been carried out to address this intricate problem. In this paper, the current state-of-the-art resource allocation techniques for the cloud continuum, in particular those that consider time-sensitive applications, are reviewed. Furthermore, we present the key challenges in the resource allocation problem for the cloud continuum, a taxonomy to classify the existing literature and the potential research gaps.

Distributed Parallel and Cluster Computing Networking and Internet Architecture