
Leader Confirmation Replication for Millisecond Consensus in Geo-distributed Systems

Added by Dr. Haiwen Du
Publication date: 2021
Research language: English





Geo-distributed private chains and databases have created higher performance requirements for consistency models. However, with millisecond network latency between nodes, the widely used leader-based SMR models cause frequent retransmission of logs because the leader cannot learn the replication status of logs in time, resulting in high network and computing costs on the leader. To address this problem, we propose a Leader Confirmation based Replication (LCR) model. First, we demonstrate the efficacy of the approach by designing the Future-Log Replication model, in which the followers are responsible for non-transactional log replication; it reduces the leader's network load by using signal logs. Second, we design a Generation Re-replication strategy, which ensures the security and consistency of future-logs when the number of nodes changes. Finally, we implemented LCR-Raft and designed experiments. The results show that in single-millisecond network latency environments, LCR-Raft provides 1.5X higher TPS, reduces the response time of transactional data by 40%-60%, and reduces network traffic by 20%-30%, at an acceptable network traffic and CPU cost on the followers. In addition, LCR offers high portability since it changes neither the number of leaders nor the election process.
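
The abstract does not include pseudocode, but the mechanism it describes, letting followers disseminate the bulky non-transactional ("future") logs among themselves while the leader appends only a lightweight signal entry and a later confirmation, can be sketched roughly as follows. This is a minimal illustration of that reading, not the paper's implementation; all class and method names here are hypothetical.

    import uuid

    class Follower:
        def __init__(self):
            self.future_log = {}                    # entry_id -> payload, awaiting confirmation

        def disseminate(self, entry_id, payload, peers):
            # The follower, not the leader, ships the full payload to its peers.
            self.future_log[entry_id] = payload
            acks = 1
            for p in peers:
                p.store(entry_id, payload)
                acks += 1
            return acks

        def store(self, entry_id, payload):
            self.future_log[entry_id] = payload

    class Leader:
        def __init__(self, followers):
            self.followers = followers
            self.log = []                           # ordered log of signal/confirm entries

        def append_non_transactional(self, payload):
            # Instead of replicating the full payload itself, the leader records only a
            # small "signal" entry and delegates dissemination to a follower.
            entry_id = str(uuid.uuid4())
            self.log.append(("signal", entry_id))
            acks = self.followers[0].disseminate(entry_id, payload, self.followers[1:])
            # Once a quorum of followers holds the future-log, the leader confirms it,
            # so the entry becomes part of the agreed state without retransmissions.
            if acks >= len(self.followers) // 2 + 1:
                self.log.append(("confirm", entry_id))
            return entry_id

    leader = Leader([Follower() for _ in range(4)])
    leader.append_non_transactional(b"bulk non-transactional record")
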




Read More

Crowdsourced live video streaming (livecast) services such as Facebook Live, YouNow, Douyu and Twitch are gaining more momentum recently. Allocating the limited resources in a cost-effective manner while maximizing the Quality of Service (QoS) through real-time delivery and the provision of the appropriate representations for all viewers is a challenging problem. In our paper, we introduce a machine-learning based predictive resource allocation framework for geo-distributed cloud sites, considering the delay and quality constraints to guarantee the maximum QoS for viewers and the minimum cost for content providers. First, we present an offline optimization that decides the required transcoding resources in distributed regions near the viewers with a trade-off between the QoS and the overall cost. Second, we use machine learning to build forecasting models that proactively predict the approximate transcoding resources to be reserved at each cloud site ahead of time. Finally, we develop a Greedy Nearest and Cheapest algorithm (GNCA) to perform the resource allocation of real-time broadcasted videos on the rented resources. Extensive simulations have shown that GNCA outperforms the state-of-the-art resource allocation approaches for crowdsourced live streaming by achieving more than 20% gain in terms of system cost while serving the viewers with relatively lower latency.
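
The greedy rule named above (nearest feasible site first, cheapest as a tie-breaker) can be illustrated with a small sketch. The data layout and field names below are assumptions made for the example, not the paper's interface.

    def allocate(videos, sites, max_delay):
        """Assign each video's transcoding task to a cloud site.

        videos: list of dicts with 'id' and 'region' keys
        sites:  list of dicts with 'region', 'delay_to' (region -> delay), 'price', 'capacity'
        """
        assignment = {}
        for v in videos:
            # Keep only sites that can still host the task and meet the delay bound.
            feasible = [s for s in sites
                        if s['capacity'] > 0 and s['delay_to'][v['region']] <= max_delay]
            if not feasible:
                continue  # no feasible edge site: the request falls back elsewhere
            # Prefer the closest site, breaking ties by rental price.
            best = min(feasible, key=lambda s: (s['delay_to'][v['region']], s['price']))
            best['capacity'] -= 1
            assignment[v['id']] = best['region']
        return assignment
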
We address the problem of content replication in large distributed content delivery networks, composed of a data center assisted by many small servers with limited capabilities located at the edge of the network. The objective is to optimize the placement of contents on the servers so as to offload the data center as much as possible. We model the system constituted by the small servers as a loss network, each loss corresponding to a request to the data center. Based on large system/storage behavior, we obtain an asymptotic formula for the optimal replication of contents and propose adaptive schemes related to those encountered in cache networks but reacting here to loss events, as well as faster algorithms that generate virtual events at a higher rate while keeping the same target replication. We show through simulations that our adaptive schemes significantly outperform standard replication strategies both in terms of loss rates and adaptation speed.
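
One plausible reading of the loss-driven adaptation described above is sketched here: each time a request cannot be served at the edge (a loss), an extra copy of the missed content is pushed to an edge server. The eviction rule and data layout are illustrative assumptions; the paper derives the actual target replication from its asymptotic analysis.

    import random

    def handle_request(content_id, servers, capacity):
        # servers: list of sets, each holding the content ids stored at one edge server
        if any(content_id in s for s in servers):
            return "served at the edge"
        # Loss event: the data center serves the request; the adaptive scheme reacts by
        # replicating the missed content at a randomly chosen edge server.
        target = random.choice(servers)
        if len(target) >= capacity and target:
            target.remove(random.choice(sorted(target)))   # evict one copy to make room
        target.add(content_id)
        return "served by the data center"
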
With the emergence of smart cities, Internet of Things (IoT) devices as well as deep learning technologies have witnessed increasing adoption. To support the requirements of such a paradigm in terms of memory and computation, a joint, real-time deep co-inference framework with IoT synergy was introduced. However, the distribution of Deep Neural Networks (DNN) has drawn attention to the privacy protection of sensitive data. In this context, various threats have been presented, including black-box attacks, where a malicious participant can accurately recover an arbitrary input fed into their device. In this paper, we introduce a methodology aiming to secure the sensitive data by re-thinking the distribution strategy, without adding any computation overhead. First, we examine the characteristics of the model structure that make it susceptible to privacy threats. We found that the more devices the model's feature maps are divided across, the better the properties of the original image are hidden. We formulate such a methodology, namely DistPrivacy, as an optimization problem, where we establish a trade-off between the latency of co-inference, the privacy level of the data, and the limited resources of IoT participants. Due to the NP-hardness of the problem, we introduce an online heuristic that supports heterogeneous IoT devices as well as multiple DNNs and datasets, making the pervasive system a general-purpose platform for privacy-aware and low decision-latency applications.
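
The key observation, splitting a layer's feature maps over many devices so that no single participant holds enough of them to reconstruct the input, can be illustrated in a few lines. The splitting rule below is an assumption for illustration only.

    import numpy as np

    def split_feature_maps(feature_maps, n_devices):
        # feature_maps: array of shape (channels, height, width); one channel shard per device
        return np.array_split(feature_maps, n_devices, axis=0)

    # Example: 64 feature maps spread over 8 devices, so each device sees only 8 of them.
    shards = split_feature_maps(np.random.rand(64, 28, 28), 8)
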
We consider distributed optimization under communication constraints for training deep learning models. We propose a new algorithm, whose parameter updates rely on two forces: a regular gradient step, and a corrective direction dictated by the currently best-performing worker (leader). Our method differs from the parameter-averaging scheme EASGD in a number of ways: (i) our objective formulation does not change the location of stationary points compared to the original optimization problem; (ii) we avoid convergence decelerations caused by pulling local workers descending to different local minima to each other (i.e. to the average of their parameters); (iii) our update by design breaks the curse of symmetry (the phenomenon of being trapped in poorly generalizing sub-optimal solutions in symmetric non-convex landscapes); and (iv) our approach is more communication efficient since it broadcasts only parameters of the leader rather than all workers. We provide theoretical analysis of the batch version of the proposed algorithm, which we call Leader Gradient Descent (LGD), and its stochastic variant (LSGD). Finally, we implement an asynchronous version of our algorithm and extend it to the multi-leader setting, where we form groups of workers, each represented by its own local leader (the best performer in a group), and update each worker with a corrective direction comprised of two attractive forces: one to the local, and one to the global leader (the best performer among all workers). The multi-leader setting is well-aligned with current hardware architecture, where local workers forming a group lie within a single computational node and different groups correspond to different nodes. For training convolutional neural networks, we empirically demonstrate that our approach compares favorably to state-of-the-art baselines.
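
The update rule stated above, a regular gradient step plus a corrective pull toward the currently best-performing worker, can be sketched as follows. The coupling coefficient and data layout are illustrative assumptions, not the paper's code.

    import numpy as np

    def lgd_step(workers, grad_fn, loss_fn, lr=0.01, pull=0.1):
        # workers: list of dicts, each holding a 'params' numpy array
        # The leader is the worker whose parameters currently achieve the lowest loss.
        leader = min(workers, key=lambda w: loss_fn(w['params']))
        leader_params = leader['params'].copy()
        for w in workers:
            w['params'] = (w['params']
                           - lr * grad_fn(w['params'])                # regular gradient step
                           + pull * (leader_params - w['params']))    # pull toward the leader
        return leader

    # Example: two workers minimising f(x) = x . x, whose gradient is 2x.
    workers = [{'params': np.array([1.0, -2.0])}, {'params': np.array([0.5, 0.5])}]
    for _ in range(100):
        lgd_step(workers, grad_fn=lambda p: 2 * p, loss_fn=lambda p: float(p @ p))
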
This paper addresses the problem of target detection and localisation in a limited area using multiple coordinated agents. The swarm of Unmanned Aerial Vehicles (UAVs) determines, as fast as possible, the position of the stack whose effluents disperse into a gas plume over a certain production area; the time variability of the target makes the problem challenging to model and solve. Three different exploration algorithms are designed and compared. Besides the exploration strategies, the paper reports a solution for quick convergence towards the actual stack position once it is detected by one member of the team. Both the navigation and localisation algorithms are fully distributed and based on consensus theory. Simulations on realistic case studies are reported.
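
A minimal consensus-style averaging update of the kind the distributed navigation and localisation algorithms rely on is sketched below: each agent repeatedly moves its estimate of the stack position toward its neighbours' estimates, so the swarm converges on a common value. The topology and step size are illustrative assumptions.

    import numpy as np

    def consensus_step(estimates, neighbours, eps=0.2):
        # estimates: list of 2-D position estimates (numpy arrays)
        # neighbours[i]: indices of the agents that agent i can communicate with
        updated = []
        for i, x in enumerate(estimates):
            correction = sum(estimates[j] - x for j in neighbours[i])
            updated.append(x + eps * correction)
        return updated
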