scieee Science in your language
[en] (orig)
Forschungsberichte
der Fakultät IV – Elektrotechnik und Informatik
Improving Network Troubleshooting using
Virtualization
Andreas Wundsam
Amir Mehmood
Anja Feldmann
Olaf Maennel
Technische Universität Berlin/ Deutsche Telekom Laboratories
Bericht-Nr. 2009-12
ISSN 1436-9915
Improving Network Troubleshooting using
Virtualization
Andreas Wundsam Amir Mehmood Anja Feldmann Olaf Maennel
TU Berlin / Deutsche Telekom Laboratories, Germany
ABSTRACT
Diagnosing problems, deploying new services, testing
protocol interactions, or validating network configura-
tions are hard problems in today’s Internet. This pa-
per proposes to leverage the concept of Network Virtu-
alization to overcome such problems: (1) Monitoring
VNets can be created on demand along side any pro-
duction network to enable network-wide monitoring in
a robust and cost-efficient manner; (2) Shadow VNets
enable troubleshooting as well as safe upgrades to both
the software components and their configurations. Both
approaches build on the agility and isolation properties
of the underlying virtualized infrastructure. Neither re-
quires changes to the physical or logical structure of
the production network. Thus, they have the potential
to substantially ease network operation and improve re-
silience against mistakes.
1. INTRODUCTION
There is hardly any more difficult task than diag-
nosing problems in the Internet. This is in part due
to the complexity of the problem itself. IP networks
are distributed in nature, they span the globe and
involve billions of components, while even individ-
ual network protocols, e.g., BGP, are very complex
in their own right. There are interactions of vari-
ous network layers, e.g., IP routing with SONET,
and some applications rely on unstated assump-
tions, e.g., on RTTs not exceeding certain values.
Moreover, problems are often only vaguely speci-
fied. For instance, a user may be complaining about
the Web not working, when in fact DNS is failing.
While these complexities may be intrinsic to the
nature of the Internet, coping with them is made all
the harder by our dismal abilities to monitor large
scale production networks. Monitoring capabilities
are usually inversely proportional to the size of net-
work if the network is small we can monitor and
analyze almost everything out-of-band. However, if
the network is large, we can monitor and analyze
data at only a small number of locations and often
have to resort to in-band monitoring, adding sub-
stantial overhead.
As a result, error diagnosis is often delegated to
“artificial” environments (simulator, testbed) that
offer good monitoring capabilities, or attempted in
production environment with only limited monitor-
ing capabilities. Such approaches have significant
drawbacks: Simulations have limited accuracy, test-
beds are limited in scale, in-band monitoring is lim-
ited. Thus, these methods often do not suffice to
catch the real-life network troubleshooting prob-
lems that stem from complex interactions of many
parties, especially if they only occur sporadically.
Even if it is possible to replicate a complete setup
as done by some vendors, e.g., via Cisco’s NSite [1],
these setups only carry test traffic. As such, many
problems are not observable.
To solve such problems, network engineers usu-
ally rely on their intuition to guide them when ad-
justing the network configuration or adding extra
instrumentation, either external or in-band. How-
ever, this can and does introduce new bugs [14, 17]
and the instrumentation overhead may even alter
the very behavior that is meant to be observed. The
latter is often referred to as probe effect.
In this paper, we propose to utilize the emerg-
ing concept of network virtualization to tackle the
network diagnosis problem in a novel fashion. Net-
work virtualization expands the existing concepts
of host and link virtualization to the entire net-
work. Indeed, a virtual network may span multiple
physical network domains. Network virtualization
frameworks enable virtual networks or VNets to be
dynamically provisioned and configured and to op-
erate in parallel on a shared physical infrastructure.
1
A controlling instance, often called the virtual node
monitor isolates virtual networks from each other.
As a result, virtual networks can be quickly instan-
tiated on demand, and each virtual network is in-
dependent of the other virtual networks.
The contributions of this paper are two new novel
approaches for adding to our network diagnosis ca-
pabilities while avoiding the undesirable effects out-
lined above: (a) Monitoring VNets enable decen-
tralized, non-intrusive network-wide monitoring in
production networks and (b) Shadow VNets for repli-
cating not just networks but also their input, safely.
This offers network operators a new range of capa-
bilities at with low overhead: He can safely evalu-
ate and test new configurations or test new software
components, at the scale of his production network,
under real user traffic. As the new setup exist in
parallel with the old one he can swap them in a
near-atomic fashion. This enables him to switch
only once he has convinced himself that the new
setup is operational and offers the expected bene-
fits. We note that such capabilities are crucial for
performance troubleshooting, for network diagno-
sis, for debugging new network protocols, or to eval-
uate new network configurations. The overhead is
very limited given the assumption that there will
be a reasonable number of VNets but only a small
number of Monitoring or Shadow VNets.
Our naming of Shadow VNets is inspired by the
results of Alimi et al. [6], who implement shadow
configurations to improve the safety and smooth-
ness of configuration updates. While shadow con-
figurations provide a convincing solution to today’s
configuration management, they are intrinsically un-
able to handle, e.g., software updates or updates to
the end-systems. Moreover, network virtualization
intrinsicly offers features that have to be added to
support shadow configurations. As such, Shadow
VNets offer, on the one hand, a clean implemen-
tation of shadow configuration by relying on the
abstraction and isolation properties of network vir-
tualization. Additionally, they substantially extend
their scope.
The remainder of this paper is organized as fol-
lows: In Section 2 we present our approaches and
review their intrinsic benefits and limitations. We
then discuss, in Section 3, enabling technologies and
currently available implementation options for our
approaches. With the help of a prototype imple-
mentation, we evaluate the feasibility of our ap-
proach in Section 4. Finally, we conclude with an
Figure 1: Substrate running production
VNet (Prod) and Monitoring/Analysis VNet
(Moni)
outlook in Section 5.
2. APPROACH
In the following we discuss the benefits and limi-
tations of our ideas on how virtualization can help
network troubleshooting.
2.1 Monitoring VNets
Network operation heavily relies on network traf-
fic monitoring at different granularities. For in-
stance, link statistics and CPU utilization on routers
are commonly used to assess the load on network
links and routers. If, e.g., a router’s CPU usage is
getting too close to 100% most ISPs require net-
work operations to react to prevent the router from
crashing. Other beneficiaries of network measure-
ments include traffic engineering and legal inter-
cepts. Among the most powerful tools for network
diagnosis is fine grained packet monitoring. This
is done, e.g., with tcpdump or wireshark on end-
systems, via monitoring ports on switches, or via
passive optical taps on the wire. Unfortunately, de-
ployment of such monitoring often requires changes
to the logical configuration or even additional hard-
ware and thus changes to the physical configuration
of the network. Accordingly, such solutions are of-
ten costly and difficult to deploy on demand.
Therefore, we propose to exploit the agility and
isolation offered by network virtualization to enable
a lightweight alternative that can be deployed with-
out changes to the production network itself. With
the help of a virtualization framework we couple
any production network with a monitoring virtual
network, as shown in Fig. 2.1 for VNet Prod. The
monitoring network has its own resources and con-
sists of VNodes (called Moni) on each of the sub-
strate nodes of VNet Prod which are interconnected
by separate virtual links. On each node, traffic go-
ing in and/or out of the production VNodes can be
selectively mirrored to interfaces on the monitoring
VNode depending on the requirements of the mon-
2
itoring use case. The selected traffic can then be
processed by the software running within the mon-
itoring VNode. For example, Moni can record all
selected packets, or compute aggregate statistics to
reduce the data volume, or even analyze them in
real-time using a specialized application. Finally,
the results can be made available to the network
operator. All the computation as well as the data
transfer is accounted to Moni rather than the pro-
duction VNet. As such, the production network can
remain unaware of the monitoring VNet.
The benefits of network-wide traffic monitoring
are multifold and include:
Cost effectiveness: This approach does not re-
quire any physical or logical changes to the
configuration of the production network it
only adds a virtual network. Thus it has a
light deployment “footprint”. As a result it
might be possible to deploy it more widely
than traditional monitoring approaches even
with limited resources, thus enabling network-
wide monitoring of a geographically distributed
network.
Resilience against operator mistakes: The ab-
sence of changes to the production network
also results in increased operational safety
configuration mistakes in the Monitoring VNet
do not result in production network outages.
Note, our approach assumes a virtualization
framework that enables automatic instantia-
tion of virtual networks without the need for
manual configuration.
Performance isolation: Virtualization offers re-
source isolation. Thus it limits the perfor-
mance degradation caused by the monitoring
VNet on the production VNet. Even if a moni-
toring VNode is overloaded, e.g., due to an un-
expected traffic spike or a bug in the deployed
IDS, the production networks is isolated and
can continue to operate at regular speed (given
appropriate priorities).
Reduced result transfer volume: We offer local
analysis on each node. Thus it is possible to
substantially reduce the data transfer volume
to the central data warehouse of the operator.
In essence, our approach offers the benefits of a
monitor port on every switch/router in the network
coupled with a close-by analysis node. It can be
provisioned dynamically on demand and without
the need for additional or dedicated hardware. The
main limitations of the proposed approach are that
/dev/null
Figure 2: Substrate running produc-
tion VNet(a), Shadow VNet(b) and Con-
trol/Monitoring/Analysis VNet(Control)
current virtualization frameworks do not yet offer
the monitoring service and the question of how high
the overhead is. For a study of the latter see Sec-
tion 4.1.
2.2 Shadow VNets
Next, we show how network virtualization can
be used to enable smooth and fail-safe network up-
grades, including network software, operating sys-
tem images and configurations. Inspired by Shadow
Configs [6] we call the resulting VNets: Shadow
VNets.
To upgrade, e.g., the production VNet (VNet A)
in Fig. 2, we clone it by creating a parallel Shadow
VNet (VNet B). It uses the identical configuration
and is in the same state, e.g., by relying on same
techniques as used for network node migration. In
addition, we create a third virtual network, Con-
trol VNet, that serves as a monitoring and control
facility. Both VNets A and B continue to operate
in parallel and isolation from each other. However,
VNets rarely operate in complete isolation from the
outside world. Rather they interact with external
entities, e.g., end-systems or non virtualized legacy
parts of the Internet. We thus duplicate traffic from
external entities on entering the production VNet
and mirror it to the Shadow VNet. Therefore, any
user traffic traverses both VNets. However, only
traffic from the production VNet A is propagated to
external nodes. Traffic from the Shadow VNet B is
discarded, silently. To upgrade the production net-
work from VNet A to B the only necessary change
is to discard the output of VNet A and use the out-
put from VNet B, making VNet A the new Shadow
VNet of VNet B. As such, it is always possible to
quickly roll back to the “old” network. Finally, at
some later point in time, the old VNet, VNet A,
can be dismantled to free resources.
Therefore, any operator can use Shadow VNets to
safely reconfigure and then upgrade their network.
For example, he can optimize the routing protocols
3
by changing link weights or introducing QoS param-
eters, or he can upgrade faulty software components
to newer versions. To judge if his changes improve
the network operations he can use the nodes of the
control network to verify the impact of his changes.
Only if a new configuration is stable, e.g., if the
routing protocols in VNet B have converged, may
he decide to use VNet B as his new production net-
work. In contrast to Shadow Config, Shadow VNet
does properly separate real traffic from shadow traf-
fic (shadow config abuses the IP protocol version
field), and Shadow VNets support software and con-
figurations, even operating system running in par-
allel.
The benefits of Shadow VNets are also multifold
and include:
Resilience against operator mistakes: Mistakes
during the configuration and/or update pro-
cess are limited to the Shadow VNet. Thus
they do not effect the production network. More-
over, the entire change set can be tested under
realistic conditions before moving them into
production.
Real user traffic: Shadow VNets expose the new
system and its configuration to real user traffic
at full scale and thus offer the opportunities to
detect more bugs earlier.
Near Atomicity: Changing a network setup usu-
ally takes time and it may require some time
for stabilization. This is hidden from the users
of the production VNet.
An inherent limitation of this approach is that
closed-loop effects cannot be captured by monitor-
ing the Shadow VNet. For example, a configura-
tion upgrade that provides faster delivery of re-
quests to an external server may result in faster
response traffic which may cause an overload on the
network. Such effects only occur once the Shadow
VNet is made productive and thus cannot be pre-
dicted by monitoring the shadow VNet while it is
still in shadow mode. One possible solution is to
integrate all relevant communicating parties into
the VNet itself. Another limitation is that the sub-
strate has to carry the additional load imposed by
the Shadow and the Control VNets. However, if
we assume that there is a reasonable number of ac-
tive VNets supported by the substrate at any given
time it should be possible to add Shadow VNets for
a fraction of the VNets. In the event of an unex-
pected traffic or load spike, the productive VNets
can be given priority over the Shadow VNets, thus
impairing the measurements and testing but not de-
teriorating the quality experienced by the user.
3. IMPLEMENTATION OPTIONS
This work assumes that network virtualization
architectures exist that can automatically provision
VNets and offer proper VNet isolation. In addition,
we need the ability to mirror packets.
VNet Architectures: Several past and present
projects are proposing network virtualization ar-
chitectures that offer some of the necessary fea-
tures, e.g., X-Bone [21], Genesis [10], DaVinci [15],
VINI [8], Cabernet [22] and 4WARD [5]. While
the earlier proposals, X-Bone [21], Genesis [10], Da-
Vinci [15], offer automatic configuration, they were
never designed to offer Internet wide VNets with
performance isolation which potentially involves many
ISPs.
VINI [8] is available today as it builds upon plan-
etlab and special servers. It offers simple container
based host virtualization coupled with network vir-
tualization capabilities for building test setups with
virtual topologies. Other proposals aim at pro-
viding perspectives for virtualizing the Internet at
large, and incorporate economical aspects and de-
tailed roles. Examples include Cabernet [22] and
4WARD [5]. We loosely base our terminology on
the latter.
Manual Resource Virtualization: Given that
no virtualization framework is already widely de-
ployed we test the feasibility of our proposed net-
work diagnosis approaches by manually configuring
the monitoring networks using available node and
link virtualization techniques. 802.11Q VLANs are
commonly used in local are networks to build vir-
tual links while MPLS tunnels are prevalent in the
WAN.
Recently, node virtualization has become very pop-
ular for data centers and server farms as it offers
better utilization of the computing capabilities, in-
creased availability, and ease of management. XEN[7],
an open source virtual node monitor or hypervisor,
is among the most commonly used open source vir-
tualization solutions and allows to run multiple op-
erating systems side-by-side. Hereby, one OS as-
sumes a privileged role. It serves as the interface to
the Hypervisor and controls the hardware (Dom0).
The others OSs are unprivileged guests (DomU ).
XEN has been shown to be a viable solution [12] for
building high performance routers on commodity
hardware. Still it is not perfect functional blocks
4
have to be selected and placed very carefully and
performance isolation and fairness remain problem-
atic especially with regards to network I/O [9, 11].
Many other node virtualization solutions are avail-
able offering individual trade-offs between perfor-
mance and flexibility, including VMWare [4], KVM
[16], OpenVZ [2], and Trellis building on VServer
and NetNS [9].
Packet mirroring: For both of our approaches,
we need packets to be duplicated and delivered to
two different virtual networks in parallel. We thus
study how links are attached to the virtual nodes, ei-
ther in software or with hardware support. Software
options include using the standard Linux bridge de-
vice or the tc mirred command. However, naively
bridging interfaces can impose severe performance
problems [9].
Fortunately, several hardware accelerated link vir-
tualization technologies are about to become com-
mercially available including Multi-Queue NIC cards,
and OpenFlow. Multi-Queue NIC cards (e.g., [3])
lower the I/O overhead by allowing packets to be de-
livered directly to the target VNode. OpenFlow [18]
enables an external entity, the controller, to con-
trol Ethernet switches. This controller can dynam-
ically set the forwarding rules using wildcard pat-
terns across the packet headers while the frame for-
warding is done by the switch hardware. Thus,
OpenFlow offers to be a flexible building block for
virtualized solutions.
Prototype implementation: For our feasibility
study, we build a prototype system utilizing XEN
for host virtualization and VLANs for link virtual-
ization. Unfortunately, current virtualization tech-
niques do not yet offer perfect isolation especially
regarding network I/O [11, 9]. To study its impact
we compare multiple options of link attachment and
packet duplication. Option (a) uses the standard
Linux software bridge, see Fig. 3(a). Option (b)
maps the NIC directly into the DomUs and thus
hides it from the Dom0(pciback see Fig.3(b)).
In Option (a) we duplicate the packets inside the
host, using tc. In Option (b) this is not possible.
Packets have to be inspected and duplicated exter-
nally. We can use a statically configured switch port
or a dynamically configured open flow switch for
this purpose. In this work, we simulate the use of
Multi-Queue NIC cards by using several dedicated
cards, and the use of OpenFlow-enabled traffic du-
plication by manually configuring monitoring ports
on a switch.
(a) internal bridging (b) external bridging
Figure 3: Options for link attachment
Traffic generation node VNET transfer Traffic receiver node
Moni stats
Vrouter
Node 1 Node 2 Node 3
Figure 4: Monitoring experiment setup
4. FEASIBILITY STUDY
4.1 Monitoring VNets
To assess the feasibility of network-wide Monitor-
ing VNets, we need to study if current virtualiza-
tion approaches offer sufficient isolation. As such,
we focus on the monitoring capabilities and not on
the ability to setup the monitoring virtual network.
The task we choose for the virtual network is packet
forwarding the base task of any virtual node.
For our evaluation we rely on a three node setup.
Each node has two Quad Core Intel Xeon L5420
processors running at 2.5GHz, 16GB of RAM, and
4-8 1GBit/s Intel Ethernet ports. The schematic
setup is shown in Fig. 4. We deploy Debian Linux
4.0 with XEN 3.0.3, which is part of the distribu-
tion. We use this out-of-the-box configuration as it
resembles a possible production deployment better
than custom, hand optimized kernel and hypervisor
versions. The virtual network consists of 3 nodes.
Node 1 is the traffic source while Node 3 is the sink.
Node 2 forwards the traffic from Node 1 to Node 3.
On UNIX systems forwarding performance is usu-
ally dominated by the per-packet overhead. As such,
our baseline experiment uses minimum sized pack-
ets and explores the performance impact of the var-
ious options on how to attach the link to the virtual
node and how to do packet duplication, see Sec. 3.
Fig. 5 shows the boxplots of the forwarding per-
5
Baseline
not bridged
bridged
alone
monitoring
alone
monitor
monitor,load
kPPS
0 200 400 600 800 1200
Linux Dom0 DomU (bridge) DomU (pci direct)
Figure 5: Forwarding performance for 64-
Byte packets, with and without Monitoring
formance within 1 second time bins for an experi-
ment duration of 500 seconds. Results are grouped
according to the phases of the experiment by the
dashed vertical (red) lines.
In the first two phases, we baseline our setup by
measuring the forwarding rate in native Linux (group
1) and Dom0 (group 2). Native Linux (Debian
2.6.18-6) supports a median forwarding rate of 840
kpps. Interestingly, the XEN kernel performs better
when doing native, unbridged forwarding in Dom0
(939.9 kpps). Next, we introduce the soft-bridge
that is required for internal attachment of the Do-
mUs, but still keep the forwarding in Dom0. We
notice that the performance drops to 522 kpps. We
then switch to option (a) by delegating the forward-
ing to DomU via a soft-bridge (third phase). Notice
that the forwarding performance drops to 215 kpps
even without monitoring, After enabling mirroring
forwarding performance is further reduced to 180 kpps
which indicates that network performance isolation
is a problem in this setup.
Next we study option (b) (directly attaching the
DomU interface to the NIC via pcidirect, group
4 in the figure). There is hardly any performance
degradation. Indeed, we see a performance improve-
ment: without monitoring, a directly attached DomU
forwards at 961.6 kpps. We then enable packet du-
plication via option (b) (simulated OpenFlow packet
duplication using a manually configured switch mon-
itoring port). As expected, the performance impact
is minimal forwarding is at 953 kpps. Even when
the monitoring domain is overloaded, with 100%
CPU load and 100% hard drive I/O load, forward-
ing performance degrades by only 5% to 896 kpps.
We conclude that probe-effect free Monitoring VNets
/dev/null
VOIP
sender VOIP
receiver
Vnet entry node Vnet transfer node
Vnet exit node
Moni
Vnode 1A Vnode 2A Vnode 3A
BG Traffic
Sender
BG Traffic
Receiver
Node 1 Node 2 Node 3
Vnode 2B
Vnode 1B Vnode 3B
VNET A
VNET B
prod
shadow
Traffic flow
Figure 6: Shadow VNet experiment setup
are possible. However, their performance strongly
depends on the isolation properties offered by the
virtualization platform for network I/O.
4.2 Shadow VNets
To assess the feasibility of a network-wide Shadow
VNet we choose the following scenario: A VNet op-
erator that offers both VoIP and Internet access
across a best effort VNet, considers moving to a
setup with service differentiation to offer better qual-
ity of service (QoS) to its VoIP traffic. This move is
motivated by customer complains about their VoIP
quality during certain times of the day. Given the
open-loop nature of VoIP traffic we expect that
VoIP performance can be estimated with some ac-
curacy by the Shadow VNet approach.
For our experiment, a VoIP call and background
traffic of varying intensity is routed through a vir-
tualized substrate (Fig. 6). The substrate network
again consists of three nodes. We now instantiate
two parallel VNets, VNet A and B, each with a
maximum bandwidth of 20 Mbit/s throughout the
experiment, enforced by traffic shaping on Node 2.
Moreover, we setup an additional virtual network
for monitoring. On entry to the VNet, traffic is
duplicated to both VNets A and B and forwarded
within each via Node 2 to node 3 using separate vir-
tual links (VLANs). On exit, when leaving Node 3,
only output from one VNet is sent to the receivers.
In addition, the monitoring VNet “Moni” receives
a copy of only the VoIP traffic from both VNets.
Metrics: For our evaluation, we measure at two
points in the experiment: Moni records data for
both VNets on exit of the VNet, while the Receiver
records the quality as experienced by the user. We
record the percentage of dropped packages on the
VoIP call as a rough quality indicator, and calcu-
late the MoS value as defined by the the E-model,
6
Table 1: Experiment outline across phases
Phase 1 2 3 4 5 6
Production VNet A A A A B B
Active VNets A A A&B A&B A&B B
QoS enabled - - - B B B
Internet traffic L L/H H H H H
intensity
an ITU-T standard for measuring the transmission
quality of voice calls [13]1.
Setup: For the VoIP traffic we use the pjsip [19]
client, an open source VoIP client based on SIP. It
generates traffic at a constant rate of 80kbps us-
ing the G.711 codex with a net bitrate of 64kbps.
Each voip RTP packet contains 20 ms voice and
has a payload of 160 Bytes. A pool of servers is
used to generate the background traffic, using Har-
poon [20], with properties that are consistent with
those observed in the Internet heavy-tailed file
distributions and self-similar traffic, emulating the
Internet access traffic by the users of the VNet. To
account for different intensities of the background
traffic during different times during the day we use
two different load levels: L/H correspond to 20-
25%, 60-86% average link utilization, respectively.
All traffic sources are located on the left in Fig. 6.
The experiment is conducted in six phases; each
of five minutes. Fig. 7 shows the rate of the back-
ground traffic (shaded area) averaged over 10s (scale
on right axis) across time. In addition, Fig. 7 shows
the number of dropped packets across time, again
using 10s bins (scale on left axis). Drop rates for
VNet A are depicted as blue plus signs, VNet B
as red crosses, and the values measured at the re-
ceiver as green diamonds. Table 1 summarizes the
configuration of each phase.
Results: In phase 1, background traffic is running
at low intensity. In the middle of phase 2 the inten-
sity of the Internet traffic is switched to high. This
causes a problem in VoIP quality as measured by
the MoS value, see Figure 8. The perceived quality
drops from a MoS score of 4.34 which corresponds
1The E-model states that the various impairments con-
tributing to the overall perception of voice quality (e.g.,
drops, delay, jitter) are additive when converted to
the appropriate psycho-acoustic scale (R factor). The
R-factor is then translated via a non-linear mapping
to the Mean opinion Score(M oS), a quality metric
for voice. MoS values range from 1.0 (not recom-
mended) to 5.0 (very satisfied). Values above 4.0 in-
dicate satisfied users, values below 4.0/3.6/3.1 indicate
that some/many/nearly all users are dissatisfied.
Figure 8: MoS results per phase
to a “very satisfied” service level drops to 4.16 which
corresponds to a level of “satisfied”.
As such, the VNet operator asks to instantiate
a Shadow VNet at the beginning of phase 3. This
means that all packets are now duplicated at Node 1
and are routed in both VNets A and B. However,
the end user for VoIP service is still getting service
through VNet A. This allows the operator to assess
the impact of the degradation and to do root cause
analysis in VNet B. Indeed, the quality of the call
decreases further in our experiment. In our case the
operator decides to prioritize VoIP traffic to counter
the bad performance. He enables QoS at the start
of phase 4. this reduces the loss rates within VNet B
significantly and the MoS value increases again to
4.38. Indeed, due to some bad congestion the VoIP
MoS score within VNet A drops to 1.45 which cor-
responds to “not recommended”.
At the start of phase 5 the operator switches his
production VNet from VNet A to VNet B. There-
fore, the user is now getting the good performance
provided by VNet B. With phase 6 the operator
deactivates VNet A.
Already this very simple scenario shows how an
operator can benefit from Shadow VNets, e.g., to
smoothly upgrade this network configuration to amend
a network performance problem.
5. SUMMARY AND FUTURE WORK
In this paper, we present two novel approaches
that leverage the capabilities of network virtualiza-
tion to add to our network troubleshooting capabili-
ties, especially for large production networks. Mon-
itoring VNets can be used to provide cost-effective,
side effect free network-wide monitoring capabili-
ties. Shadow VNets enable operators to upgrade
configurations and software in an operationally safe
way and with transaction semantics while exposing
7
0 500 1000 1500
0 10 20 30 40
Experiment time [s]
packet drops %
0 5 10 15 20 25
phase 1
phase 2
phase 3
phase 4
phase 5
phase 6
Receiver
VNET A
VNET B
BG Traffic
BG traffic throughput [MBit/s]
Figure 7: VoIP packet drops (left) and background traffic throughput(right) across time
the new system and configuration to real user be-
havior. As such the system can be tested before
putting it into the wild.
The experiences with our prototype implemen-
tation underlines the feasibility of the approaches,
especially if used on a virtualization platform that
offers good isolation. It also hints at the power of
these new troubleshooting tools.
In the future we plan further experiments using
hardware virtualization enablers (e.g., Open Flow,
Multi Queue NIC) as these promise proper isola-
tion and explore the scalability limits. Moreover,
we plan to integrate Monitoring VNets and Shadow
VNets into one of the emerging VNet architecture
platforms.
6. REFERENCES
[1] Networked solutions integrated test engineering:
Delivering on the promise of innovation.
https://www.cisco.com/en/US/solutions/ns341/
ns522/networking_solutions_products_
genericcontent0900aecd80458f98.pdf.
[2] Openvz. http://wiki.openvz.org/.
[3] Solving the hypervisor network I/O bottleneck.
http://www.solarflare.com/technology/documents/
SF-101233-TM-5.pdf.
[4] Vmware infrastructure.
http://www.vmware.com/products/vi/.
[5] 4WARD Project. http://www.4ward-project.eu.
[6] R. Alimi, Y. Wang, and Y. R. Yang. Shadow
configuration as a network management primitive. In
ACM Sigcomm, 2008.
[7] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris,
A. Ho, R. Neugebauer, I. Pratt, and A. Warfield. Xen
and the art of virtualization. In ACM SOSP, 2003.
[8] A. Bavier, N. Feamster, M. Huang, L. Peterson, and
J. Rexford. In vini veritas: Realistic and controlled
network experimentation. In ACM Sigcomm, 2006.
[9] S. Bhatia, M. Motiwala, W. M¨uhlbauer, Y. Mundad,
V. Valancius, A. Bavier, N. Feamster, L. Peterson, and
J. Rexford. Trellis: A platform for building flexible,
fast virtual networks on commodity hardware. In ACM
ROADS Workshop, 2008.
[10] P. Ch, A. Fisher, C. Kosak, T. E. Ng, P. Steenkiste,
E. Takahashi, and H. Zhang. Darwin: Customizable
resource management for value-added network services.
IEEE Network, 2001.
[11] N. Egi, A. Greenhalgh, M. Handley, M. Hoerdt,
F. Huici, and L. Mathy. Fairness issues in software
virtual routers. In ACM PRESTO Workshop, 2008.
[12] N. Egi, A. Greenhalgh, M. Handley, M. Hoerdt,
L. Mathy, and T. Schooley. Evaluating xen for router
virtualization. In IEEE PMECT, 2007.
[13] The E-model, a Computational Model for Use in
Transmission Planning, ITU-T Rec. G.107, 2005.
[14] N. Feamster and H. Balakrishnan. Detecting bgp
configuration faults with static analysis. In USENIX
NSDI, 2005.
[15] J. He, R. Zhang-Shen, Y. Li, C.-Y. Lee, J. Rexford,
and M. Chiang. Davinci: Dynamically adaptive virtual
networks for a customized internet. In Proc. ACM
CONEXT, 2008.
[16] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and
A. Liguori. kvm: the linux virtual machine monitor. In
Proc. of the Linux Symposium, 2007.
[17] R. Mahajan, D. Wetherall, and T. Anderson.
Understanding bgp misconfiguration. In ACM
Sigcomm, 2002.
[18] N. McKeown, T. Anderson, H. Balakrishnan,
G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and
J. Turner. Openflow: enabling innovation in campus
networks. ACM Sigcomm CCR, 38(2), 2008.
[19] Open Source SIP Stack and Media Stack for Presence,
Instant Messaging, and Multimedia Communication,
2009. http://www.pjsip.org.
[20] J. Sommers and P. Barford. Self-configuring network
traffic generation. In ACM IMC, 2004.
[21] J. D. Touch. Dynamic Internet overlay deployment and
management using the X-Bone. In ICNP, 2000.
[22] Y. Zhu, R. Zhang-Shen, S. Rangarajan, and
J. Rexford. Cabernet: Connectivity architecture for
better network services. In ACM ReArch Workshop,
2008.
8