Apoorv Shukla, Kevin Hudemann, Zsolt Vági, Lily Hügerich,
Georgios Smaragdakis, Artur Hecker, Stefan Schmid, Anja
Feldmann
Fix with P6: Verifying Programmable Switches at
Runtime
Open Access via institutional repository of Technische Universität Berlin
Document type
Conference paper | Accepted version
(i. e. final author-created version that incorporates referee comments and is the version accepted for
publication; also known as: Author’s Accepted Manuscript (AAM), Final Draft, Postprint)
This version is available at
https://doi.org/10.14279/depositonce-12013
Citation details
Shukla, Apoorv; Hudemann, Kevin; Vági, Zsolt; Hügerich, Lily; Smaragdakis, Georgios; Hecker, Artur; Schmid,
Stefan; Feldmann, Anja (2021). Fix with P6: Verifying Programmable Switches at Runtime. IEEE INFOCOM
2021 IEEE Conference on Computer Communications, 10–13.05.2021.
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other
uses, in any current or future media, including reprinting/republishing this material for advertising or
promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of
any copyrighted component of this work in other works.
Terms of use
This work is protected by copyright and/or related rights. You are free to use this work in any way permitted by
the copyright and related rights legislation that applies to your usage. For other uses, you must obtain
permission from the rights-holder(s).
Fix with P6: Verifying Programmable Switches at Runtime
Apoorv Shukla 1,*, Kevin Hudemann 2,*, Zsolt Vági 3,*, Lily Hügerich 4,
Georgios Smaragdakis 4,6, Artur Hecker 1, Stefan Schmid 5, Anja Feldmann 6
1 Huawei Munich Research Center, 2 SAP, 3 Swisscom, 4 TU Berlin, 5 Faculty of Computer Science, University of Vienna, 6 MPI-Informatics
Abstract—We design, develop, and evaluate P6, an auto-
mated approach to (a) detect, (b) localize, and (c) patch
software bugs in P4 programs. Bugs are reported via a
violation of pre-specified expected behavior that is captured
by P6. P6 is based on machine learning-guided fuzzing
that tests a P4 switch non-intrusively, i.e., without modifying
the P4 program, to detect runtime bugs. This enables
an automated and real-time localization and patching of
bugs. We used a P6 prototype to detect and patch existing
bugs in various publicly available P4 application programs
deployed on two different switch platforms: behavioral model
(bmv2) and Tofino. Our evaluation shows that P6 significantly
outperforms bug detection baselines while generating fewer
packets and patches bugs in large P4 programs such as
switch.p4 without triggering any regressions.
I. INTRODUCTION
Programmable networks herald a paradigm shift in the
design and operation of networks. By breaking the tie
between vendor-specific hardware and proprietary software,
programmable networks facilitate an independent evolution
of software and hardware. With the
P4 language [1], [2], one can define in a P4 program,
the instructions for processing the packets, e.g., how the
received packet should be read, manipulated, and forwarded
by a network device, e.g., a P4 switch.
However, these new capabilities also bring new challenges
related to P4 software verification,
i.e., ensuring that the software fully satisfies all the expected
requirements. The behavior of a P4 switch depends on the
correctness of the P4 program running on it. We realize
that a bug in a P4 program, i.e., a small fault such as a
missing line of code or a fat finger error, or a vendor-
specific implementation error, can trigger unexpected and
abnormal switch behavior. In the worst case, it can result
in a network outage, or even a security compromise [3].
Problem Statement. In this paper, we examine and verify
the behavior of P4 switches after the P4 programs are
deployed. We pose the question: “Is it possible to detect,
localize, and patch software bugs in a P4 program
running on P4 switches?”. We believe that being able to
answer this question, even partially, unlocks the full potential of
programmable networks, improves their security, and will
hence increase their penetration in operational and mission-
critical networks.
Recently, a panoply of P4 program verification tools [4]–
[10] has been proposed. These verification systems, however,
fail to repair the P4 program containing bugs. Most
of them [4]–[7] aim to statically verify user-defined P4
programs which are later compiled to run on a target
switch. They mostly find bugs that violate memory safety
properties, e.g., invalid memory access, buffer overflow,
etc. Furthermore, they are prone to false positives and are
unable to verify the runtime behavior on real packets. In
addition, some classes of bugs, e.g., checksum-related or ECMP
(Equal-Cost Multi-path) hash calculation-related bugs, are
platform-dependent or specific to the P4 target switch
implementation and, thus, cannot be detected by static analysis
approaches [4]–[7] or others [11]. Since runtime verification
aims to verify the actual behavior against the expected
behavior of a switch by sending specially-crafted input
packets to the switch and observing the behavior, such
verification is complementary to static analysis. Currently,
the development and testing cycles in P4-based systems are
short [12] due to intense competition and the need for new
applications, which makes runtime verification indispensable.
Note that detecting the bugs that cause
abnormal runtime behavior is a challenging task, as the P4
switch does not throw any runtime exceptions. Furthermore,
the detection of bugs is also challenging if there is no
output, i.e., packets are dropped silently instead of being
forwarded. Thus, runtime verification of the switch is crucial.
*Authors worked on this paper while affiliated with TU Berlin.
A useful approach to verify the runtime behavior is
fuzz testing or fuzzing [13]–[23], a well-known dynamic
program testing technique that generates semi-valid, ran-
dom inputs which may trigger abnormal program behavior.
However, for fuzzing to be efficient, intelligence needs to
be added to the input generation, so that the inputs are not
rejected by the parser and the chances of triggering bugs
are maximized. This becomes crucial especially in networking,
where the input space is huge, e.g., even a 32-bit destination
IPv4 address field in a packet header introduces 2^32
possibilities. To make fuzzing more effective, we consider
the use of machine learning, to guide the fuzzing process to
generate smart inputs that trigger abnormal target behavior.
Recently, Shukla et al. [23] have shown that Reinforcement
Learning (RL) [24], [25] can be used to train a verification
system. We build upon [23] by adding (a) static analysis to
the fuzzing process to significantly reduce the input search
space, and thus, adding input structure awareness, and (b)
support for platform-dependent bug detection.
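The benefit of input structure awareness can be illustrated with a minimal sketch. It is not P6's Fuzzer: the toy parser check, the function names, and the 20-byte header layout are assumptions made for illustration only. It compares how often fully random headers and headers with valid version and ihl fields pass a simple IPv4 parser check.

```python
import random

IPV4_HEADER_LEN = 20  # bytes, IPv4 header without options


def toy_parser_accepts(header: bytes) -> bool:
    """Hypothetical stand-in for a switch parser: accept plausible IPv4 headers only."""
    if len(header) < IPV4_HEADER_LEN:
        return False
    version = header[0] >> 4
    ihl = header[0] & 0x0F
    return version == 4 and ihl >= 5


def blind_input(rng: random.Random) -> bytes:
    """Structure-unaware fuzzing: every byte of the header is random."""
    return bytes(rng.randrange(256) for _ in range(IPV4_HEADER_LEN))


def structure_aware_input(rng: random.Random) -> bytes:
    """Keep the fields the parser checks valid; all other fields stay random."""
    header = bytearray(blind_input(rng))
    header[0] = (4 << 4) | 5  # version = 4, ihl = 5
    return bytes(header)


def acceptance_rate(generator, trials: int = 1000, seed: int = 0) -> float:
    """Fraction of generated headers that the toy parser accepts."""
    rng = random.Random(seed)
    accepted = sum(toy_parser_accepts(generator(rng)) for _ in range(trials))
    return accepted / trials


if __name__ == "__main__":
    print("blind:", acceptance_rate(blind_input))
    print("structure-aware:", acceptance_rate(structure_aware_input))
```

In this toy setting, every structure-aware header reaches the logic behind the parser, whereas most blind headers are rejected before they can trigger anything.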
Even if a bug in a P4 program is detected, the localization
of code statements in the P4 code that are responsible
for the bug, is non-trivial. The difficulty stems from the
fact that practical P4 programs can be large with a dense
conditional structure. In addition, the same faulty statements
in a P4 program may be executed for both passed as well
as failed pre-defined test cases; this makes it difficult to
pinpoint the actual faulty line/s of code. Tarantula [26]–
[28] is a dynamic program analysis technique that helps in
fault localization by pinpointing the potential faulty lines
of code. To localize the software bugs, we tailor Tarantula,
designed for generic software, to P4 programs by building a localizer
called P4Tarantula and integrating it with the bug detection
of machine learning-guided fuzzing. In this paper, we also
show how the detection and localization of bugs make it
possible to patch a number of bugs in P4 programs.
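As background, the suspiciousness score commonly associated with Tarantula can be sketched as follows. This is the generic formulation, not P4Tarantula itself; the function name and the representation of coverage as sets of line numbers are assumptions. A line scores high when it is executed mostly by failing tests.

```python
from typing import Dict, List, Set


def tarantula_scores(coverage: List[Set[int]], failed: List[bool]) -> Dict[int, float]:
    """Return a suspiciousness score in [0, 1] for each covered line.

    coverage[i] is the set of line numbers executed by test i;
    failed[i] is True if test i failed.
    """
    total_failed = sum(failed)
    total_passed = len(failed) - total_failed
    lines = set().union(*coverage) if coverage else set()
    scores: Dict[int, float] = {}
    for line in sorted(lines):
        f = sum(1 for cov, bad in zip(coverage, failed) if bad and line in cov)
        p = sum(1 for cov, bad in zip(coverage, failed) if not bad and line in cov)
        fail_ratio = f / total_failed if total_failed else 0.0
        pass_ratio = p / total_passed if total_passed else 0.0
        denominator = fail_ratio + pass_ratio
        scores[line] = fail_ratio / denominator if denominator else 0.0
    return scores


if __name__ == "__main__":
    # Two passing tests execute lines 1, 2, 4; two failing tests execute lines 1, 3, 4.
    coverage = [{1, 2, 4}, {1, 2, 4}, {1, 3, 4}, {1, 3, 4}]
    failed = [False, False, True, True]
    print(tarantula_scores(coverage, failed))
```

Lines executed by both passing and failing tests (lines 1 and 4 here) receive an intermediate score, which reflects the difficulty described above when the same statements run in passed and failed test cases.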
P6. In P4, automated program repair [29] is uncharted
territory and becomes increasingly important as the
software development lifecycle in programmable networks
is short [12] with insufficient testing. In this paper, we show
that due to the structure of P4 programs, it is possible to
automate patching of platform-independent bugs (P4 program-specific
software bugs) in P4 programs, if the patch is available.
To this end, we present P6, P4 with runtime Program
Patching, a novel runtime P4-switch verification system
that (a) detects, (b) localizes, and (c) patches
software bugs in a P4 program. P6 improves existing work
on machine learning-guided fuzzing [23] in P4 by extending
it and augmenting it with: (a) automated localization, and
(b) runtime patching. P6 relies on the combination of static
analysis of the P4 program and Reinforcement Learning
(RL) technique to guide the fuzzing process to verify the
P4-switch behavior at runtime.
In a nutshell, in P6, the first step is to capture the
expected behavior of a P4 switch, which is achieved using
information from three different sources: (i) the control
plane configuration, (ii) queries in p4q (§III-B1), a query
language which we leverage to describe expected behav-
ior using conditional statements, and (iii) accepted header
layouts, e.g., IPv4, IPv6, etc., learned via static analysis
of the P4 program. If the actual runtime behavior to the
test packets generated via machine-learning guided fuzzing
differs from the expected behavior through the violation
of the p4q queries, it signals a bug to P6 which then
identifies a patch from a library of patches. If the patch
is available, P6 modifies the original P4 program to fix
the bug signaled by the p4q queries. Then, the patched P4
program is subjected to sanity and regression testing.
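The idea of expressing expected behavior as conditional statements can be sketched in a few lines. The concrete p4q syntax is given in §III-B1 and is not reproduced here; the sketch below uses plain Python predicates as a stand-in, and the two example queries (drop packets with ttl below 2, otherwise decrement ttl by one) as well as all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Packet = Dict[str, int]  # field name -> value, e.g. {"ipv4.ttl": 64}


@dataclass
class Query:
    """Hypothetical 'if condition then expectation' rule on a sent/received packet pair."""
    name: str
    condition: Callable[[Packet], bool]
    expectation: Callable[[Packet, Optional[Packet]], bool]


def violated_queries(queries: List[Query], sent: Packet,
                     received: Optional[Packet]) -> List[str]:
    """Names of all queries whose condition holds but whose expectation does not.

    received is None when the switch emitted no packet (silent drop).
    """
    return [q.name for q in queries
            if q.condition(sent) and not q.expectation(sent, received)]


QUERIES = [
    Query("drop_expired_ttl",
          lambda p: p["ipv4.ttl"] < 2,
          lambda p, out: out is None),
    Query("decrement_ttl",
          lambda p: p["ipv4.ttl"] >= 2,
          lambda p, out: out is not None and out["ipv4.ttl"] == p["ipv4.ttl"] - 1),
]

if __name__ == "__main__":
    # A packet with ttl 1 that is forwarded anyway violates the first query.
    print(violated_queries(QUERIES, {"ipv4.ttl": 1}, {"ipv4.ttl": 0}))
```

A non-empty result plays the role of the bug signal: the observed output contradicts at least one stated expectation, including the case where the expected output is "no packet at all".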
We develop a prototype of P6 and evaluate it by testing it
on eight P4_16 application programs from switch.p4 [30],
P4 tutorial solutions [31], and NetPaxos codebase [32]
across two P4 switch platforms, namely, behavioral model
version 2 (bmv2) [33] and Tofino [34]. Our results show
that P6 successfully detects, localizes, and patches diverse
bugs in all P4_16 programs while significantly outperforming
bug detection baselines without introducing any regressions.
Related Work. Unlike P6, P4-based verification approaches
[4]–[9], [11], [23], [35], [36] are insufficient in
localizing and patching of runtime bugs. Besides, they
cannot detect the platform-dependent bugs. Contrary to
them, P6 can automatically detect, localize and patch the
software bugs in the P4 programs. In addition, P6 detects
the platform-dependent bugs. Table I illustrates capabilities
of other P4 verification tools as compared to P6.
Related work in P4    Runtime Verification  Detection  Localization  Patching  Detection of PD bugs
Cocoon [36]           ✗                     ✓          ✓             ✗         ✗
Vera [4]              ✗                     ✓          ✗             ✗         ✗
p4v [5]               ✗                     ✓          ✗             ✗         ✗
ASSERT-P4 [6], [7]    ✗                     ✓          ✗             ✗         ✗
P4NOD [35]            ✗                     ✓          ✗             ✗         ✗
p4pktgen [8]          ✗                     ✓          ✗             ✗         ✗
P4CONSIST [9]         ✓                     ✓          ✗             ✗         ✗
P4RL [23]             ✓                     ✓          ✗             ✗         ✗
P6                    ✓                     ✓          ✓             ✓         ✓
TABLE I: Related work in P4 verification. PD corresponds to the
platform-dependent bugs. Note, ✓ denotes the capability, (✓) denotes
a part of full capability, and ✗ denotes the missing capability.
Contributions. Our main contributions are:
• We design, implement, and evaluate P6, an end-to-end
runtime P4 verification system that detects, localizes, and
patches bugs in P4 programs non-intrusively. (§III)
• We observe that the success of P6 relies on the increased
patchability of P4 programs from the old (P4_14) to the new
version (P4_16). (§II)
• We present a P6 prototype and report on an evaluation
study. We evaluate our P6 prototype on a P4 switch running
eight P4_16 programs (including switch.p4 with 8,715
LOC) from publicly available sources [30]–[32] across
two platforms, namely, behavioral model and Tofino. Our
results show that P6 non-intrusively detects both platform-dependent
and platform-independent bugs, and significantly
outperforms state-of-the-art bug detection baselines. (§IV)
• For platform-independent bugs, P6 localizes bugs and
fixes the P4 program, when a patch is available, without
causing regressions or introducing new bugs. (§III, §IV)
• We release the P6 software and library of ready patches
for all existing bugs in the P4 programs [37].
II. CHALLENGES & OPPORTUNITIES
A. Primer: Packet Processing Pipeline of P4
P4 [1], [2] is a domain-specific language comprising
packet-processing abstractions, e.g., headers, parsers,
tables, actions, and controls. The P4 packet processing
pipeline evolved from [38] to its current form, P4_16 [2],
in the generic Portable Switch Architecture (PSA) [39]
switch platform, e.g., Tofino [34] (Figures 1a and 1b). In the
P4_16 pipeline, there are six programmable blocks that
are platform-independent, namely, ingress parser,
ingress match-action, ingress deparser,
egress parser, egress match-action, and
egress deparser. The programmable blocks are
annotated with a solid line in Figures 1a and 1b. There are
also two platform-dependent blocks (annotated with dashed
lines in Figures 1a and 1b): the packet replication
engine (PRE) and the buffer queuing engine
(BQE). These are non-programmable and rely on proprietary
implementations of hardware vendors.
[Figure 1 appears here; the text below keeps what survives of its three panels.]
(a) An example of platform-independent bug in P4_16 pipeline. Blocks shown: Ingress Parser, Ingress Match-Action, Ingress Deparser, Packet Replication Engine (PRE), Buffer Queuing Engine (BQE), Egress Parser, Egress Match-Action, Egress Deparser. Code shown in the panel:
    parser MyParser(...) {
        (...)
        state parse_ipv4 {
            pkt.extract(hdr.ipv4);
            transition accept;
        }
        (...)
    }
    (...)
    update_checksum(
        (...)
        { hdr.ipv4.version,
          hdr.ipv4.dstAddr },
        (...));
    (...)
(b) An example of platform-dependent bug in P4_16 pipeline. Labels in the ingress match-action block: Table 1, Table n, "Match: Clone", "Miss: Drop & Exit", "Packet cloned". PRE logic shown in the panel:
    if (clone_flag != 0) {clone}
    if (resubmit_flag != 0) {resubmit}
    elif (mcast_grp != 0) {multicast}
    elif (egr_port == 511) {Drop}
    ...
(c) P6 Workflow. Modules of P6 (in solid green boxes): Fuzzer (§3.2), Localizer (§3.3), Patcher (§3.4), connected to the P4 Switch. Edge labels, numbered 1 to 5 in the figure: Test Packets, Feedback, Activate Localizer, Activate Patcher, Compile and deploy the patched P4 program.
Fig. 1: Fig. 1a and Fig. 1b illustrate platform-independent and -dependent bugs respectively. Fig. 1c depicts P6 Workflow.
B. Challenges: Runtime Bugs in P4
Bugs or errors can occur at any stage in the P4 pipeline.
If a bug occurs in any of the programmable blocks, then we
term the bug as platform-independent and software patching
can solve the problem. If the bug appears in the non-
programmable or platform-dependent blocks, namely, the
PRE or BQE, then the vendor has to be informed to fix the
issue if the implementation is hardware-related or vendor-specific.
P4 program verification systems [4]–[7] are able
to detect bugs using static analysis. Unfortunately, static
analysis (i) is prone to false positives, (ii) cannot detect
platform-dependent bugs, and (iii) cannot detect runtime
bugs whose detection requires actively sending real packets.
For platform-independent bugs, consider Figure 1a
(solid line blocks). It illustrates part of the implementation
of a Layer-3 (L3) switch provided in the P4 tutorial solutions
[31]. Here, the parser does not check whether the IPv4 header
contains IPv4 options, i.e., whether the IPv4 ihl field is
equal to 5. When updating the IPv4 checksum
of the packets during egress processing, IPv4 options are
not taken into account. Hence, for IPv4 packets with
options, the resulting checksum is wrong: such
packets are forwarded and then incorrectly dropped at the
next hop, leading to anomalies in network behavior. Other
bugs that fall in this category are those related to the IPv4/6
checksum and ttl in the packet. Such bugs are platform-independent,
as they only result from programming errors.
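The checksum mismatch described above can be reproduced with a small, self-contained sketch. It uses the standard Internet checksum over 16-bit words; the helper names, addresses, and option bytes are assumptions chosen for illustration and are not taken from the P4 tutorial code. The flag `assume_no_options` mimics a program that always covers only the first 20 header bytes.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented (Internet checksum)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_checksum(header: bytes, assume_no_options: bool) -> int:
    """Header checksum over 20 bytes (buggy assumption) or over ihl * 4 bytes."""
    ihl = header[0] & 0x0F
    length = 20 if assume_no_options else ihl * 4
    covered = bytearray(header[:length])
    covered[10:12] = b"\x00\x00"  # the checksum field counts as zero
    return internet_checksum(bytes(covered))


def build_header(options: bytes = b"") -> bytes:
    """Build an IPv4 header; options must be a multiple of 4 bytes long."""
    ihl = 5 + len(options) // 4
    fixed = struct.pack("!BBHHHBBH4s4s", (4 << 4) | ihl, 0, 20 + len(options),
                        0, 0, 64, 6, 0,
                        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    return fixed + options


if __name__ == "__main__":
    plain = build_header()
    with_options = build_header(b"\x01\x01\x01\x00")  # NOP, NOP, NOP, end of list
    print(hex(ipv4_checksum(plain, True)), hex(ipv4_checksum(plain, False)))
    print(hex(ipv4_checksum(with_options, True)), hex(ipv4_checksum(with_options, False)))
```

For a header without options both computations agree, so the bug stays invisible; only a packet with ihl greater than 5 exposes it, which is why test packets must deliberately exercise such header layouts.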
For a platform-dependent bug, consider the scenario
shown in Figure 1b (dashed line blocks). Here, we assume
a P4 program implements at least two match-action tables.
Any table except the last one could be a longest prefix
match (LPM) table, offering unicast, clone, and drop actions
(ingress match-action block). The last match-action table
implements an access control list (ACL). So, the packets
can either be dropped or forwarded according to the chosen
actions by the previous tables. In this case, it is possible that
conflicting forwarding decisions are made. Suppose packets
are matched by the first table (Table 1), where a clone decision
is made, and are later dropped by the ACL table (Table
n). In such a case, the forwarding behavior depends on the
implementation of the PRE, which is platform-dependent.
The implementation of the PRE of the SimpleSwitch target in
the behavioral model (bmv2) is illustrated in Figure 1b. It
would drop the original packet but forward the cloned
copy of the packet. Similar bugs can occur if, instead of
the clone action, the resubmit action is chosen (blue) or when
implementing multicast (green).
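The decision order shown in Figure 1b can be transcribed into a short sketch to see why conflicting decisions interact this way. The function name, the returned action labels, and the final unicast fallback are assumptions; only the order of the four checks and the drop port value 511 come from the figure.

```python
from typing import List

DROP_PORT = 511  # egress port value that the figure associates with "Drop"


def pre_decisions(clone_flag: int, resubmit_flag: int,
                  mcast_grp: int, egr_port: int) -> List[str]:
    """Return the actions taken for one packet, following the order in Figure 1b."""
    actions = []
    if clone_flag != 0:
        actions.append("clone")       # checked independently of the chain below
    if resubmit_flag != 0:
        actions.append("resubmit")
    elif mcast_grp != 0:
        actions.append("multicast")
    elif egr_port == DROP_PORT:
        actions.append("drop")
    else:
        actions.append("unicast")     # assumed fallback, elided ("...") in the figure
    return actions


if __name__ == "__main__":
    # Table 1 requested a clone, the ACL table later marked the packet to drop.
    print(pre_decisions(clone_flag=1, resubmit_flag=0, mcast_grp=0, egr_port=DROP_PORT))
```

Because the clone check stands apart from the `elif` chain, a packet marked both for cloning and for dropping yields a forwarded clone and a dropped original, and a resubmit or multicast request takes precedence over the drop.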
The above motivates us to turn our attention to run-
time detection of bugs. Runtime verification is a useful
and complementary tool in the P4 verification repertoire
that detects both platform-independent bugs resulting from
programming errors as well as platform-dependent bugs.
C. Opportunities for Patching: Structure of a P4 Program
In the evolution of P4, there are two recent versions:
P4_14 [40] and P4_16 [2]. P4_16 allows programmers to use
definitions of a target switch-specific architecture, e.g.,
PSA (Portable Switch Architecture) [39], [41]. P4_16 is an
upgraded version of P4_14. In particular, a large number
of language features have been eliminated from P4_14 and
moved into libraries including counters, checksum units,
meters, etc., in P4_16. P4_14 allowed the programmer to
explicitly program three blocks: ingress parser (including
header definitions of accepted header layouts), ingress control
and egress control functions. Recall that P4_16 allows the programmer to
explicitly program six programmable blocks (Figure 1a).
By analyzing programs in the P4_14 and P4_16 versions,
we realize that as more blocks of the P4 program become programmable,
there is more onus on the programmer to write
a program that behaves as expected (when it gets compiled
and deployed on the P4 switch). Missing checks or fat-finger
errors can cause havoc in the network. However, this is a
blessing in disguise as the more programmable the code
is, the more patchable it is. Thus, programming errors can
be fixed. We observe that the potentially patchable code
percentage increases from P4_14 to P4_16 in all applications
(excluding calculator) from P4 tutorial solutions [31] and
NetPaxos codebase [32] in behavioral model (bmv2) switch
platform [33] and other generic PSA switch platforms [39],
[41], e.g., Tofino [34] respectively. The patchable code
percentage comes from the six programmable blocks in
P4_16. Roughly, whatever is programmable is patchable. In
principle, around 40-45% of a P4 program is patchable in
P4_16 programs for the behavioral model (bmv2) switch platform
[33]. This increases to 50-55% if the ingress deparser
and egress parser are programmable for other target switch
platforms, e.g., Tofino [34]. In particular, the parser and
header definitions account for 20-40% of the total patchable
code. If there is no bug in the parser or header, packets with
an incorrect header get dropped. However, a bug can still be
either in the non-patchable platform-dependent block or in
the application code logic or deparser, which is patchable
as it is platform-independent.
Observation: From P4_14 to P4_16, a P4 program possesses
twice as many programmable blocks, increasing the chances
for patchability. Bugs detected in the platform-independent
part can be localized and patched; a platform-dependent
bug may not be patchable if it is hardware-related.
[Figure 2 appears here. Three panels, left to right: "P4 Source Code with Bugs", "Localized by P4Tarantula", "Patched by Patcher"; each panel shows the Fuzzer, Localizer, and Patcher connected to the P4 Switch.]
Left and center panels (the center panel highlights the localized code):
    state parse_ipv4 {
        packet.extract(hdr.ipv4);
        transition accept;
    }
Right panel (patched code):
    state parse_ipv4 {
        packet.extract(hdr.ipv4);
        verify(hdr.ipv4.version == 4, error.BadHeader);
        verify(hdr.ipv4.ihl == 5, error.BadHeader);
        verify(hdr.ipv4.len >= 20, error.BadHeader);
        verify(hdr.ipv4.ttl >= 2, error.BadHeader);
        transition accept; }
    ………….
    apply {
        if (standard_metadata.parser_error != error.NoError) {
            mark_to_drop();
            return;
        }
    ………….
Fig. 2: P6 in Action: depicting the automated detection, localization and patching of a bug in an L3 switch P4 program [31].
III. P6: SYSTEM DESIGN
A. P6: Overview
P6's goal (see Figure 1c) is to detect, localize, and
patch the software bugs in a P4 program at runtime.
This is achieved by verifying the actual runtime behavior
against the expected behavior of a P4 switch running a
pre-compiled P4 program in response to the incoming packets.
The P6 system contains three main modules:
(1) Fuzzer: Generates test packets using RL-guided
fuzzing, static analysis, and p4q queries (§III-B1), and sends them to the
P4 switch running the pre-compiled P4 program. (§III-B)
(2) Localizer: P4Tarantula is the Localizer which pinpoints
faulty lines of code causing bugs in the P4 program. (§III-C)
(3) Patcher: Automates patching of the bugs localized by
P4Tarantula Localizer, if patchable. Then, Patcher compiles
and loads the patched P4 program on the P4 switch. (§III-D)
P6 Workflow. P6 is a closed-loop control system. Through
a pre-generated dictionary from control plane configuration,
p4q queries, and static analysis of a P4 program, the
expected runtime behavior of the P4 switch is captured and
sent as an input to the Fuzzer containing the RL Agent
and the Reward System (§III-B). As shown in Figure 1c,
the Fuzzer selects appropriate mutation actions such as
add/delete/modify bytes in a packet to generate test packets
towards the P4 switch running the pre-compiled P4 program ①.
If the actual runtime behavior towards the packets
defies the expected behavior through the violation of the
p4q queries, it signals a bug in the form of a reward as
feedback to the Reward System, which is then exploited by
the RL Agent to improve during the training process by
selecting better mutation actions on the packet ②. After the
bug detection, the Fuzzer automatically triggers the Localizer
(§III-C), P4Tarantula (only for platform-independent bugs;
for platform-dependent bugs, the vendor is informed), which
pinpoints the faulty line of code ③ and triggers the Patcher
(§III-D), which searches for the appropriate patch from a
library of patches for the corresponding P4 program ④.
If the patch is available, the Patcher modifies the original P4
program, compiles and loads it on the P4 switch, and checks
that the bug is no longer triggered by the p4q queries by repeating
the whole cycle and executing sanity and regression testing ⑤.
Note, P6 is non-intrusive and thus requires no
modification to the P4 program for testing before patching.
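The patching step of this workflow can be pictured with a deliberately simple sketch. It is not the Patcher of §III-D: it treats the P4 source as text, uses the four verify statements from the right panel of Figure 2 as the only library entry, and all Python names are hypothetical. It adds each statement after the header extraction only if that statement is missing from the program.

```python
EXTRACT_STATEMENT = "packet.extract(hdr.ipv4);"

# Statements taken from the patched panel of Figure 2.
VERIFY_STATEMENTS = [
    "verify(hdr.ipv4.version == 4, error.BadHeader);",
    "verify(hdr.ipv4.ihl == 5, error.BadHeader);",
    "verify(hdr.ipv4.len >= 20, error.BadHeader);",
    "verify(hdr.ipv4.ttl >= 2, error.BadHeader);",
]

BUGGY_PARSER = """state parse_ipv4 {
    packet.extract(hdr.ipv4);
    transition accept;
}"""


def patch_parser(source: str) -> str:
    """Insert missing verify statements right after the IPv4 extract statement."""
    patched = []
    for line in source.splitlines():
        patched.append(line)
        if line.strip() == EXTRACT_STATEMENT:
            indent = line[: len(line) - len(line.lstrip())]
            for statement in VERIFY_STATEMENTS:
                if statement not in source:  # skip checks that already exist
                    patched.append(indent + statement)
    return "\n".join(patched)


if __name__ == "__main__":
    print(patch_parser(BUGGY_PARSER))
```

Checking for an existing statement before inserting it makes the operation idempotent, so rerunning the cycle on an already patched program leaves it unchanged; a real patcher would additionally have to recompile the program and rerun the sanity and regression tests, as described above.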
P6 in Action. Before we dive into the details of Fuzzer,
Localizer, and Patcher, we demonstrate the operation of
P6. Figure 2 illustrates how P6 detects, localizes, and
patches an existing bug in a layer-3 (L3) switch P4 source
code (program) from [31] in an automated fashion. The
left part of Figure 2 shows the P4 program containing a
platform-independent bug in the parser code, i.e., no header
field validation is implemented, hence all IPv4 packets are
incorrectly accepted by the parser. After the P4 program is
deployed on the P4 switch, P6 is triggered. Initially, the
Fuzzer detects the bug violating the corresponding p4q
query based on the feedback (reward) received from the
P4 switch. Then, it triggers the P4Tarantula for localization
(shown in the center of Figure 2) where it pinpoints the
problematic part of the code (highlighted). Afterwards, the
Patcher is triggered automatically, patching the necessary
problematic parts of the code, i.e., adding header field
verification statements (highlighted in right), after checking
if the patch was indeed missing from the P4 program.
Finally, Patcher automatically compiles [42] and deploys
the patched P4 program on the P4 switch, and triggers P6
to ensure that the patches caused no regressions and fixed