Leader Confirmation Replication for Millisecond Consensus in Geo-distributed Systems


Abstract in English

Geo-distributed private chains and databases place higher performance demands on consistency models. However, with millisecond network latency between nodes, the widely used leader-based state machine replication (SMR) models retransmit logs frequently because they cannot learn the log replication status in time, which consumes substantial network and computing resources on the leader. To address this problem, we propose a Leader Confirmation based Replication (LCR) model. First, we design the Future-Log Replication model, in which the followers are responsible for replicating non-transactional logs. It reduces the leader's network load by using the signal log. Second, we design a Generation Re-replication strategy, which ensures the security and consistency of future-logs when the number of nodes changes. Finally, we implement LCR-Raft and evaluate it experimentally. The results show that in environments with single-digit-millisecond network latency, LCR-Raft provides 1.5X higher TPS, reduces transactional data response time by 40%-60%, and reduces network traffic by 20%-30%, with acceptable network traffic and CPU cost on followers. In addition, LCR is highly portable, since it changes neither the number of leaders nor the election process.
