Carbide: Highly Reliable Networks Through Real-Time Multiple Control Plane Composition


Abstract in English

Highly reliable networks are essential for network operators, who must ensure proper packet delivery in the event of software errors or hardware failures. Networks must ensure both reachability and routing correctness, such as subnet isolation and waypoint traversal. Existing work in network verification relies on centralized computation at the cost of fault tolerance, while other approaches either build an over-engineered, complex control plane or compose multiple control planes without any guarantee of correctness. This paper presents Carbide, a novel system that achieves high reliability in networks through distributed verification and multiple control plane composition. The core of Carbide is a simple, generic, and efficient distributed verification framework that transforms a generic network verification problem into a reachability verification problem on a directed acyclic graph (DAG), and solves the latter via an efficient distributed verification protocol (DV-protocol). Using the verification results, Carbide systematically composes multiple control planes and realizes operator-specified consistency. Carbide is fully implemented. Extensive experiments show that (1) Carbide reduces downtime by 43% compared with the most reliable individual underlying control plane, while enforcing correctness requirements on all traffic; and (2) by systematically distributing computation across devices and pruning unnecessary messaging between devices during verification, Carbide scales to a production data center network.
