Preventing network crashes caused by misconfigurations

 

Network misconfigurations can have major consequences, such as network crashes that may be extremely debilitating. Consider for example the October 2021 Facebook outage, which was caused by a network misconfiguration. Ideally, we would like to prevent these misconfigurations from occurring in the first place – but how?

On a technical level, the “section” of the Internet that handles communication between devices consists of two parts: the “control plane” and the “data plane”. The control plane has the “overview” of the network layout, and thus defines forwarding rules for each router, indicating in which direction each router should send specific packets – the technical term for messages – for them to reach their destination. The data plane then performs the actual receiving, processing and sending of packets, as well as performing other tasks such as load balancing and blocking packets (e.g. as part of a firewall).

Traditionally, both of these planes are found ”on” the router. Today, some networks use a modern network design known as ”software-defined networks”, which separates the data and control plane, moving the control plane away from the router and to a, possibly remote, controller. This allows for much more detailed fine-tuning when configuring a network. For this project, we focus on P4, a protocol-independent, vendor-neutral, domain-specific language used for configuring the data plane of routers and switches.

However, with increased customizability comes more possible points of failure. Since users can now write their own data plane configuration, there are many ways such a configuration could be incorrect, even if it compiles. Also, updates (such as a forwarding table entry) pushed using the control plane API (P4Runtime) might update the data plane such that it becomes incorrect w.r.t some specification. Finally, there is also the possibility that an update fails halfway through and is only pushed to some of the desired routers, resulting in incompatible configurations. The problem is thus twofold: one, making sure that P4 data plane programs are initially correct according to some specification, and two, making sure that updates sent via P4Runtime adhere to this specification.

In this project, we focus on solving the second part. We intend to first develop a domain-specific language for specifying the P4Runtime API calls. The language should have an unambiguous and formal ruleset, which will allow in-depth analysis of programs written in that language. Notably, the ruleset should be generated based on a P4 configuration file (which is itself generated from a P4 program), such that e.g. using a nonexistent table ID will never type. From here, we will extend the language with a way of allowing user-defined correctness invariants, in order to further ensure that updates can never make ”incorrect” changes.

PhD project

By: Jens Kanstrup Larsen

Section: Software Systems Engineering

Principal supervisor: Alceste Scalas

Co-supervisors: Philipp Haller, Roberto Guanciale

Project title: Verification of Software-Defined Networks

Term: 01/08/2022

Contact

Jens Kanstrup Larsen
PhD student
DTU Compute
+45 53 55 25 26

Contact

Alceste Scalas
Associate Professor
DTU Compute
+45 45 25 37 59