Document [original]

Forschungsberichte

der Fakultät IV – Elektrotechnik und Informatik

Content-aware Traffic Engineering

Benjamin Frank

Ingmar Poese

Georgios Smaragdakis

Steve Uhlig

Anja Feldmann

Bericht-Nr. 2012 – 03

ISSN 1436-9915

Content-aware Traffic Engineering

Benjamin Frank

T-Labs/TU Berlin

[email protected]

Ingmar Poese

T-Labs/TU Berlin

[email protected]

Georgios Smaragdakis

T-Labs/TU Berlin

[email protected]

Steve Uhlig

Queen Mary University of London

ste[email protected]

Anja Feldmann

T-Labs/TU Berlin

[email protected]

ABSTRACT

Today, a large fraction of Internet traffic is originated by Content

Providers (CPs) such as content distribution networks and hyper-

giants. To cope with the increasing demand for content, CPs de-

ploy massively distributed infrastructures. This poses new chal-

lenges for CPs as they have to dynamically map end-users to ap-

propriate servers, without being fully aware of network conditions

within an ISP as well as the end-users network locations. Further-

more, ISPs struggle to cope with rapid traffic shifts caused by the

dynamic server selection process of CPs.

In this paper, we argue that the challenges that CPs and ISPs

face separately today can be turned into an opportunity. We show

how they can jointly take advantage of the deployed distributed in-

frastructures to improve their operation and end-user performance.

We propose Content-aware Traffic Engineering (CaTE), which dy-

namically adapts the traffic demand for content hosted on CPs by

utilizing ISP network information and end-user location during the

server selection process. As a result, CPs enhance their end-user

to server mapping and improve end-user experience, thanks to the

ability of network-informed server selection to circumvent network

bottlenecks. In addition, ISPs gain the ability to partially influence

the traffic demands in their networks. Our results with operational

data show improvements in path length and delay between end-user

and the assigned CP server, network wide traffic reduction of up to

15%, and a decrease in ISP link utilization of up to 40% when ap-

plying CaTE to traffic delivered by a small number of major CPs.

1. INTRODUCTION

People value the Internet for the content it makes available [35].

For example, the demand for online entertainment and web brows-

ing has exceeded 70% of the peak downstream traffic in the United

States [34]. Recent traffic studies [27, 40, 52] show that a large

fraction of Internet traffic is originated by a small number of Con-

tent Providers (CPs). Major CPs are highly popular rich media

sites like YouTube and Netflix, One-Click Hosters (OCHs), e.g.,

RapidShare or MegaUpload, as well as Content Delivery Networks

(CDN) suchas Akamaior Limelight andhyper-giants, e.g., Google,

Yahoo! or Microsoft. Gerber and Doverspike [27] report that a few

CPs account for more than half of the traffic of a US-based Tier-

1 carrier. Poese et al. [52] report a similar observation from the

traffic of a European Tier-1 carrier. Labovitz et al. [40] infer that

more than 10% of the total Internet inter-domain traffic originates

from Google, and Akamai claims to deliver more than 20% of the

total Web traffic in the Internet [50]. In North America, Netflix is

responsible for around 30% of the traffic during peak hours [34] by

offering a high definition video streaming service hosted on CDN

infrastructures such as Limelight and the CDN operated by Level3.

To cope with the increasing demand for content, CPs deploy

massively distributed server infrastructures [42] to replicate content

and make it accessible from different locations in the Internet [62,

2]. For example, Akamai operates more than 60,000 servers in

more than 5,000 locations across nearly 1,000 networks [42, 50].

Google is reported to operate tens of data-centers and front-end

server clusters worldwide [39, 61]. Microsoft has deployed its

CDN infrastructure in 24 locations around the world [33]. Ama-

zon maintains at least 5 large data-centers and caches in at least 21

locations around the world [55]. Limelight operates thousands of

servers in more than 22 delivery centers and connects directly to

more than 900 networks worldwide [49].

The growth of demand for content and the resulting deployment

of content delivery infrastructures pose new challenges to CPs and

to ISPs. For CPs, the cost of deploying and maintaining such a

massive infrastructure has significantly increased during the last

years [53] and the revenue from delivering traffic to end-users has

decreased due to the intense competition. Furthermore, CPs strug-

gle to engineer and manage their infrastructures, replicate content

based on end-user demand, and assign users to appropriate servers.

Thelatter ischallenging asend-user toserver assignmentis based

on inaccurate end-user location information [47, 12], and inferring

the network conditions within an ISP without direct information

from the network is difficult. Moreover, due to highly distributed

server deployment and adaptive server assignment, the traffic in-

jected by CPs is volatile. For example, if one of its locations is

overloaded, a CP will re-assign end-users to other locations, result-

ing in large traffic shifts in the ISP network within minutes. Current

traffic engineering by ISP networks adapts the routing and operates

on time scales of several hours, and is therefore too slow to react to

rapid traffic changes caused by CPs.

The pressure for cost reduction and customer satisfaction that

both CPs and ISPs are confronted with, coupled with the oppor-

tunity that distributed server infrastructures offer, motivate us to

propose a new tool in the traffic engineering landscape. We intro-

duce Content-aware Traffic Engineering (CaTE). CaTE leverages

the location diversity offered by CPs and, through this, it allows to

adapt to traffic demand shifts. In fact, CaTE relies on the observa-

tion that by selecting an appropriate server among those available

to deliver the content, the path of the traffic in the network can be

influenced in a desired way. Figure 1 illustrates the basic concept

of CaTE. The content requested by the client is in principle avail-

able from three servers (A, B, and C) in the network. However, the

client only connects to one of the network locations. Today, the

decision of where the client will connect to is solely done by the

CP and is partially based on measurements and/or inference of net-

work information and end-user location. With CaTE the decision

on end-user to server assignment can be done jointly between the

CP and ISP.

ISP

Pos ACP Server A

Pos B

Client

CP Server B

CP Server C

Figure 1: By choosing a CP server for a client with the help of

CaTE, traffic engineering goals and accurate end-user server

assignment become possible.

CaTE complements the existing traffic engineering ecosystem

by focusing on traffic demands rather than routing, by combining

(i) the knowledge of CPs about their location diversity and server

load, with (ii) the ISPs detailed knowledge of the network condi-

tions and end-user location. CaTE offers additional traffic engi-

neering capabilities to ISPs to better manage the volatility of CP

traffic. Also, thanks to the information about ISP networks, CPs

gain the ability to better assign end-users to their servers and better

amortize the cost of deploying and maintaining their infrastructure.

Furthermore, the burden of measuring and inferring network topol-

ogy and state is removed from the CPs. In short, all involved par-

ties, including the end-users, benefit from CaTE, creating a win-

win situation for everyone. Our contributions are as follows:

•We introduce the concept of CaTE.

•We present the design, incentives, and possible deployment

schemes of systems to realize CaTE.

•We propose an online algorithm to map end-user requests to

servers for CaTE and discuss its properties.

•We evaluate the performance of CaTE using real data from

a European Tier-1 ISP. We show that CaTE can improve the

assignment of end-users to servers for a number of metrics,

namely, link utilization, path length and path delay. Our re-

sults show that the maximum link utilization can be reduced

by half, especially during the peak hour, that the total traf-

fic that flows in the network can be reduced by up to 15%,

and the delay by 20% respectively when applying CaTE to

a small number of major CPs. Similar results are obtained

when evaluating CaTE on two other operational networks.

The remainder of this paper is structured as follows. In Section 2

we present the observations that motivate our work. In Section 3 we

introduce our concept of CaTEand present the general architecture

as well as possible deployment schemes. We formally define and

model CaTE in Section 4. We propose algorithms to enable CaTE

in Section 5. We evaluate the benefits of CaTE in Section 6 using

data from operational networks with different metrics, including

link utilization, path delay and length. We present related work in

Section 7 and summarize in Section 8.

2. CHALLENGESANDOPPORTUNITIESIN

CONTENT DISTRIBUTION

With the emergence of “hyper-giants” and other popular CPs,

the traffic of the Internet has undergone drastic changes [40]. These

changes stem from trends in business and organizational integration

and consolidation. As a consequence, a small number of CPs are

responsible for a large fraction of traffic [27, 52]. Content delivered

by CPs, including highly popular rich media sites like Facebook

and high definition video streaming such as Netflix or YouTube, is

mostly carried over HTTP. Recent studies unveil that HTTP con-

tributes more than 60% of Internet traffic [3, 17, 27, 34, 40, 46].

Moreover, CPs peer directly with a large number of ISPs and in

many locations. For scalability reasons, most CPs make the content

available from all their infrastructure locations [62]. The globally

deployed infrastructures allow CPs to rapidly shift large amounts

of traffic from one peering point to another. While the diverse foot-

print of CPs and the ability to shift traffic in short timescales poses

new challenges to both CPs and ISPs, it also offers new opportuni-

ties for joint optimization of content delivery.

2.1 Challenges in Content Delivery

The scale and complexity of content delivery, especially from

distributed infrastructures, brings multiplechallenges to CPs. These

challenges have a major impact on both the end-user performance

and ISP operation.

Content Delivery Cost. CPs strive to minimize the overall cost of

delivering huge amounts of content to end-users. To that end, their

assignment strategy is mainly driven by economic aspects such as

bandwidth or energy cost [53, 28]. While a CP will try to assign

end-users in such a way that the server can deliver reasonable per-

formance, this does not always result in end-users being assigned

to the server able to deliver the best performance. Moreover, the

intense competition in the content delivery market has led to di-

minishing returns of delivering traffic to end-users.

End-user Mis-location. End-user mapping requests received by

the CP DNS servers originate from the DNS resolver of the end-

user, not from the end-user itself. The assignment is therefore based

on the assumption that end-users are close to their DNS resolvers.

Recent studies have shown that in many cases this assumption does

not hold [1, 47]. As a result, the end-user is mis-located and the

server assignment is not optimal. As a response, DNS extensions

have been proposed to include the end-user IP information [12].

Network Bottlenecks. Despite their efforts to discover the paths

betweentheend-usersandtheirserversto predictperformances [32],

CPs have limited information about the actual network conditions.

Tracking the ever changing network conditions, i.e., through ac-

tive measurements and end-user reports, incurs an extensive over-

head for the CP without a guarantee of performance improvements

for the end-user. Without sufficient information about the network

paths between the CP servers and the end-user, an assignment per-

formed by the CP can lead to additional load on existing network

bottlenecks, or create new ones.

End-userPerformance. Applications delivered by CPs often have

requirements in terms of end-to-end delay [39]. Moreover, faster

and more reliable content delivery results in higher revenues for e-

commerce applications [50] as well as user engagement [15]. De-

spite the significant efforts of CPs, end-user mis-location and the

limited view of network bottlenecks are major obstacles to improve

end-user performance.

2.2 Opportunities for CaTE

The idea behind CaTE is to provide solutions for the new chal-

lenges in content delivery. Indeed, ISPs are in a unique position,

both in terms of knowledge as well as incentives, to improve con-

tent delivery. ISPs have the knowledge about the state of the un-

derlying network topology and the status of individual links. This

information not only helps CPs in their user-to-server mapping, but

also reduces the need for CPs to perform large-scale active mea-

surements and topology discovery [32]. It also enables CPs to bet-

ter amortize their existing infrastructure, offer better quality of ex-

perience to their users, and postpone their infrastructure expansion.

The opportunity for ISPs to coordinate with CPs in their server

selection is technically possible thanks to the decoupling of the

server selection from the content delivery. In general, any end-user

requesting content from a CP first does a mapping request, usually

through the Domain Name System (DNS). During this request the

CP needs to locate the network position of the end-user and as-

sign a server capable of delivering the content, preferably close to

the end-user. However, locating the user in a network and infer-

ring the conditions of the path between the end-user and eligible

CP servers is hard as the CP is missing network information. In

contrast, ISPs have this information ready at their fingertips, but

are currently missing a communication channel to inform the CPs.

Furthermore, ISPs face the challenge of predicting the CP traffic,

which is very difficult due to the lack of information on the map-

ping of end-users to server decided by CPs.

We propose to use CaTE during the server selection process of

CPs. In today’s CP deployment, the server selection is done di-

rectly between the end-user and the CP without the involvement of

the ISP (see arrow A in Figure 2). Through CaTE, CPs are offered

the opportunity to optimize their server selection beyond their cur-

rent capabilities by communicating directly with the ISP (CP-ISP

Communication, see Figure 2). Furthermore, ISPs gain the ability

of adapting to the volatile traffic induced by content delivery, by be-

ing able to influence the choice of the CP. We believe that CaTE is

a step forward in improving the end-user performance and enabling

ISP and CP collaboration.

2.3 Incentives

The opportunities that CaTE enables for both CPs and ISPs re-

quire that both parties have incentives to work together. Further-

more, the growing awareness of end-users about CaTE’s benefits

will accelerate the penetration of CaTE in a highly commoditized

content delivery market.

2.3.1 Incentives for CPs

The market of CPs requires them to enable new applications

while reducing their operational cost, and to improve the end-user

experience [50]. With CaTE improving the mapping of end-users

to servers, CPs can expect improvements in the end-user experi-

ence, and thus, a competitive advantage. This is particularly im-

portant for CPs in light of the commoditization of the content de-

livery market and the choice that is offered to end-users, for exam-

ple through meta-CDNs [15]. The improved mapping also yields

better infrastructure amortization and thanks to CaTE, CPs will no

longer have to perform and analyze voluminous measurements in

order to infer the network conditions or end-user locations.

To stimulate the use of CaTE, ISPs can operate and provide

CaTE as a free service to CPs or even offer discounts on peer-

ing or hosting prices, e.g., for early adopters and CPs that expose

a higher server diversity while using CaTE. The loss of peering or

hosting revenue is amortized with the benefits of a lowered network

utilization, reduced investments in network capacity expansion and

by taking back some control over the traffic within the network.

Ma et al. [45] have developed a methodology to estimate the prices

in such a cooperative scheme by utilizing the Shapley settlement

mechanism. CaTE can also act as an enabler for CPs and ISPs to

jointly launch new applications in a cost-effective way, for example

traffic-intensive applications such as the delivery of high definition

video on-demand, or real-time applications such as online games.

In an ISP-CP collaborative scheme, CaTE can play the role of a

recommendation system and is not intended to be applied unilater-

ally by the ISP.

ISP

Client

End-Users CaTE

System

CP Server

Selection

Content

Server

Content

Transfer

ACP-ISP

Communication

CP InfrastructureCP Infrastructure

Figure 2: CaTE deployment and interaction with CPs.

2.3.2 Incentives for ISPs

ISPs are interested in reducing their operational and infrastruc-

ture upgradecosts, offeringbroadband servicesat competitiveprices,

and delivering the best end-user experience possible. Due to net-

work congestion during the peak hour, ISPs in North America have

recently revisited the flat pricing model and have announced data

caps to broadband services. A better management of traffic in their

network with CaTE can allow them to offer higher data caps or

even alleviate the need to introduce them. From an ISP perspec-

tive, CaTE offers the possibility to do global traffic and peering

management, through an improved awareness of the traffic across

the whole network. For example, peering agreements with CPs can

offer the use of CaTE in exchange for reduced costs to the CPs.

This can be an incentive for CPs to peer with a CaTE-enable ISP

and an additional revenue for an ISP, as such reduced prices can at-

tract additional peering customers. An ISP can also offer CaTE to

other ISPs it peers with, which makes sense especially in the case

that the peering ISPs hosts content or also acts as CP. The interac-

tion and federation of CPs run by ISPs can also be enabled through

CaTE. There is high interest on the side of ISPs, as reflected by

the creation of the IETF working group CDNi [44]. Furthermore,

CaTE has the potential to reduce the significant overhead due to

the handling of customer complaints that often do not stem from

the operation of the ISP but the operation of CPs [8]. With CaTE,

ISPs can identify and mitigate congestion, and react to short distur-

bances caused by an increased demand of content from CPs.

2.3.3 Incentives for end-users

CaTE offers a way to empower end-users to obtain the best pos-

sible quality of experience. As such, this creates an incentive for

end-users to support the adoption of CaTE by both ISPs and CPs.

For example, an ISP can offer more attractive products, i.e., higher

bandwidth or lower prices, since it is able to better manage the

traffic inside its network. Also, thanks to better traffic engineer-

ing, ISPs can increase data caps on their broadband offers, making

the ISP more attractive to end-users. Moreover, CPs that utilize

CaTE can offer better quality of experience to end-users. This can

be done through premium services based on CaTE. For example,

CPs delivering streaming services can offer higher quality videos to

end-users thanks to better server assignment and network engineer-

ing. Also, applications running over the Internet can greatly benefit

in their performance from CaTE (see Appendix B). This, in turn,

gives end-users a good reason to choose CaTE enabled services.

3. CaTE APPROACH

The concept of CaTE relies on two key observations. First,

a major fraction of the traffic in ISPs is delivered by massively

distributed CP infrastructures. Therefore, the same content is of-

ten available at different network locations with different network

paths to the end-user. Second, the server selection of CPs is de-

coupled from the content transfer. Thus, it is possible to augment

the server selection strategy of CPs with detailed information from

ISPs about the current network state, the status of links that are

traversed and the precise network location of the end-user.

3.1 Concept of CaTE

CaTE relies on the fact that by selecting an appropriate server

among those being able to satisfy a request, the flow of traffic

within the network can be influenced. To illustrate the concept,

we show in Figure 1 how, by selecting server A instead of B or C,

a shorter path through the network is chosen. However, CPs have

limited knowledge about the path characteristics inside a network.

On the other hand, ISPs are aware of the state of their network,

the location of their users, as well as the path conditions between

end-users and servers. Given the large fraction of traffic that orig-

inates from CPs and their highly distributed infrastructure, CaTE

can shift traffic among paths within a network and, through this,

achieve traffic engineering goals for both CPs and the ISP.

3.2 CaTE Deployment Schemes

Our main architectural motivation is that the server selection is

decoupled from the content transfer. In Figure 2 we provide a sim-

plified version of how CPs handle content requests. Today, the

server selection process of CPs works as follows. When an end-

user wants to obtain a specific content, it first sends a request to the

CP server selection of the CP (see Figure 2, (A)). Today, there are

two prevalent techniques used to transfer this request: DNS queries

and HTTP redirection. The CP server selection selects the con-

tent server based on the requested content, the objectives of the CP,

its current view of the network, and its knowledge of the end-user

network location. Finally, it returns the selected server IP, either

through a DNS reply or a HTTP redirection, to the end-user, which

in turn establishes a connection to the supplied server IP to down-

load the content.

In order for CaTE to hook into the server selection of CPs, a

new component inside the ISPs network is needed. In general, this

component offers an interface between the CP and the ISP to get

supplement information about the network position of end-users,

path conditions between an end-user and eligible servers, etc. To

this end, the system uses information readily available to an ISP,

such as the actual network topology, routing information, end-user

assignment databases, current network loads, etc. Today, systems

capable of providing the interface between an ISP and a CP are for

example the IETF ALTO service [4] or the Provider-aided Distance

information System (PaDIS) [52]. In Figure 2 we outline the range

of possible CaTE deployment schemes:

1. CP contacts ISP: The end-user contacts the CP server selection

module via its DNS resolver (A) as it does today. When choosing

the server for the end-user, the CP uses the CP-ISP Communication

to retrieve information about the network status, topology, or a rec-

ommendation by the ISP based on the network conditions between

the end-user and the candidate content servers. The advantage of

the recommendation option is that no party reveals any sensitive

operational information.

This can be implemented by including the client IP in the map-

ping request as proposed at the IETF dnsext working group [12]

while using the IETF ALTO protocol or PaDIS by the CP to re-

trieve topology information, network status information, or server

recommendation by the ISP.

2. ISP contacts CP: The end-user contacts CaTE directly (B) for

the mapping. Then, CaTE uses the CP-ISP Communication to for-

ward the request to the CP. The CP returns a list of potential servers

and CaTEranks them based on network characteristics and the cur-

rent path conditions between end-user and server network location.

This can be implemented by utilizing the part of the DNS res-

olution process handled by CPs. When end-users query the ISP

DNS resolver and, in turn, the CP DNS server, the CP returns all

candidate content servers, which are re-ordered by the ISP DNS

resolvers according to CaTE.

3. ISP-based: The end-user contacts CaTE directly (B) for the

mapping. However, CaTE forwards the request through the CP-

ISP Communication to the CP server selection, which returns the

normal reply as it happens today. CaTEcollects and aggregates the

replies from the CP and overwrites the replies using the knowledge

it has obtained from past results.

This can be implemented by using the DNS resolution process

of CPs. When end-users query the ISP DNS resolver the ISP for-

wards the request. However, the answer from the CP is kept and

aggregated as proposed by Poese et al. [52] and the DNS replies

are overwritten as CaTE sees fit.

4. User-based: The end-user collects the potential content servers

from the CP as well as the current network state from the ISP. By

utilizing this information, it calculates the best server to connect

to based on active end-to-end measurements or previously reported

experience.

This can be achieved when both the CP and the ISP run the IETF

ALTO service or PaDIS. In this case, the client downloads all the

needed information and performs the server selection itself.

In the first three schemes CaTE can be incrementally deployed

and interacts with the existing CP infrastructures while being trans-

parent to the end-user. In the collaborative schemes 1 and 2, the

final decision is made by the CPs to avoid any disturbance on their

operation. The frequency of ranking exchanges as well as the gran-

ularity of end-user location identification is up to the administrator

of the system. It is also possible to provide end-users the choice

to opt-in or opt-out. CPs can also negotiate how many locations

they make available to ISPs. Note, CPs can dynamically change

the locations made available to the ISP depending on the utilization

of each location. In the last deployment option, we describe how

CaTE can also be deployed at the end-user, e.g., via the browser

or home gateway, but the penetration will be slower as it requires

the installation of software at the end-user.

4. MODELLING CaTE

Next, we formalize CaTE and discuss how it relates to tradi-

tional traffic engineering and multipath routing.

4.1 Traffic Engineering

We model the network as a directed graph G(V, E)where Vis

the set of nodes and Eis the set of links. An origin-destination

(OD) flow fod consists of all traffic entering the network at a given

point o∈V(origin) and exiting the network at some point d∈V

(destination). The traffic on a link is the superposition of all OD

flows that traverse the link.

The relationship between link and OD flow traffic is expressed

by the routing matrix A. The matrix Ahas size |E| × |V|2. Each

element of matrix Ahas a boolean value. Aml = 1 if OD flow

mtraverses link l, and 0otherwise. The routing matrix Acan be

derived from routing protocols, e.g., OSPF, ISIS, BGP. Typically,

Ais very sparse since each OD flow traverses only a very small

number of links. Let ybe a vector of size |E|with traffic counts

on links and xa vector of size |V|2with traffic counts in OD flows,

then y=Ax. Note, xis the vector representation of the traffic matrix.

ISP

CP Server A CP Server B

CP Server C

Client

Highly

utilized

link

shift

Figure 3: Content-aware Traffic Engineering Process

Traditional Traffic Engineering: In its broadest sense, traffic en-

gineering encompasses the application of technology and scien-

tific principles to the measurement, characterization, modeling, and

control of Internet traffic [5]. Traditionally, traffic engineering re-

duces to controlling and optimizing the routing function and to

steeringtrafficthrough thenetworkinthemosteffectiveway. Trans-

lated into the above matrix form, traffic engineering is the process

of adjusting A, given the OD flows x, so as to influence the link

traffic yin a desirable way, as coined in [41]. The above definition

assumes that the OD flow vector xis known. For instance, direct

observations can be obtained, e.g., with Netflow data [9, 19].

Terminology: We denote as flow an OD flow between two routers

in the network. We call a flow splittable if arbitrarily small pieces

of the flow can be assigned to other flows. This is not to be con-

fused with end-to-end sessions, i.e., TCP connections, which are

un-splittable. The assumption that flows are splittable is reason-

able, as the percentage of traffic of a single end-to-end session is

small compared to that of a flow between routers. Let Cbe the set

of nominal capacities of the links in the network G. We denote as

link utilization the fraction of the link capacity that is used by flows.

We denote as flow utilization the maximum link utilization among

all links that a flow traverses. We introduce the terms of traffic con-

sumer and traffic producer which refer to the aggregated demand of

users attached to a router, and the CPs that are responsible for the

traffic respectively. Throughout this paper, we refer to the different

alternatives from which content can be supplied by a given CP as

network locations that host servers.

4.2 Definition of CaTE

We revisit traffic engineering by focusing on the traffic demands

rather than changing the routing.

Definition 1: Content-aware Traffic Engineering(CaTE)is the

process of adjusting the traffic demand vector x, given a routing

matrix A, so as to change the link traffic y.

Not all the traffic can be adjusted arbitrarily. Only traffic for

which location diversity is available can be adjusted by CaTE.

Therefore, x=xr+xswhere xrdenotes the content demands that can

be adjusted and xsdenotes the content demands that can not be ad-

justed as there is only a single location in the network where the

content can be downloaded from. The amount of traffic that can be

adjusted depends on the diversity of locations from which the con-

tent can be obtained. We can rewrite the relation between traffic

counts on links and traffic counts in flows as follows: y=A(xs+

xr). CaTE adjusts the traffic on each link of the network by ad-

justing the content demands xr:yr=Axr. Applying CaTE means

adjusting the content demand to satisfy a traffic engineering goal.

Definition 2: Optimal Traffic Matrix is the new traffic matrix, x∗,

after applying CaTE, given a network topology G, a routing matrix

Aand an initial traffic matrix x.

Figure 3 illustrates the CaTE process. A content consumer re-

quests content that three different servers can deliver. Let us as-

sume that, without CaTE, the CP redirects the clients to servers B

and C. Unfortunately, the resulting traffic crosses a highly-utilized

link. With CaTE, content can also be downloaded from server A,

thus, the traffic within the network is better balanced as the highly

utilized link is circumvented.

Minimizing the maximum utilization across all links in a net-

work is a popular traffic engineering goal [24, 25, 42]. It poten-

tially improves the quality of experience and postpones the need

for capacity increase. CaTE mitigates bottlenecks and minimizes

the maximum link utilization by re-assigning parts of the traffic

traversing heavily loaded paths. Thus it redirects traffic to other,

less utilized paths. As we will elaborate in Section 6, different met-

rics such as path length or network delay can also be used in CaTE.

4.3 CaTE and Traditional TE

CaTE is complementary to routing-based traffic engineering as

it does not modify the routing. Routing-based traffic engineering

adjusts routing weights to adapt to traffic matrix changes. To avoid

micro-loops during IGP convergence [26], it is common practice to

only adjust a small number of routing weights [25]. To limit the

number of changes in routing weights, routing-based traffic engi-

neering relies on traffic matrices computed over long time periods

and offline estimation of the routing weights. Therefore, routing-

based traffic engineering operates on time scales of hours, which

can be too slow to react to rapid change of traffic demands. CaTE

complements routing-based traffic engineering and can influence

flows at shorter time scales by assigning clients to servers on a per

request basis. Thus, CaTE influences the traffic within a network

online in a fine-grained fashion.

4.4 CaTE and Multipath Routing

Multipath routing helps end-hosts to increase and control their

upload capacity [37]. It can be used to minimize transit costs [28].

Multipath also enables ASes to dynamically distribute the load in-

side networks in the presence of volatile and hard to predict traffic

demand changes [19, 16, 58, 21]. This is a significant advantage,

as routing-based traffic engineering can be too slow to react to phe-

nomena such as flash crowds. Multipath takes advantage of the

diversity of paths to better distribute traffic.

CaTE also leverages the path diversity, and can be advanta-

geously combined with multipath to further improve traffic engi-

neering and end-user performance. One of the advantages of CaTE

is its limited investments in hardware deployed within an ISP. It can

be realized with no change to routers, contrary to some of the pre-

vious multipath proposals [58, 16, 21]. The overhead of CaTE is

also limited as no state about individual TCP connections needs

to be maintained, contrary to multipath [58, 16, 21]. In contrast

to [16, 58], CaTE is not restricted to MPLS-like solutions and is

easily deployable in todays networks.

4.5 CaTE and Oscillations

Theoretical results [23, 22] have shown that load balancing al-

gorithms can take advantage of multipath while provably avoiding

traffic oscillations. In addition, their convergence is fast. Building

on these theoretical results, Fischer et al. proposed REPLEX [21],

a dynamic traffic engineering algorithm that exploits the fact that

there are multiple paths to a destination. It dynamically changes

the traffic load routed on each path. Extensive simulations show

that REPLEX leads to fast convergence, without oscillations, even

when there is lag between consecutive updates about the state of

2 31

Server

Machines

Provider 1 Provider 2

Server

Consumer Consumer

Link 1

Link 2

Tasks

Consumer

Figure 4: CaTE and Restricted Machine Load Balancing.

the network. CaTE is derived from the same principles and thus

inherits all the above-mentioned desired properties.

5. CaTE ALGORITHMS

In this section we propose algorithms to realize CaTE, in the

context of an ISP. A key observation is that CaTE can be reduced

to the restricted machine load balancing problem [7] for which op-

timal online algorithms are available. The benefit of the CaTE

online algorithm can be estimated either by reporting results from

field tests within an ISP or by using trace-driven simulations. Typ-

ically, in operational networks only aggregated monitoring data is

available. To estimate the benefit that CaTE offers to an ISP, we

present offline algorithms that uses traffic demands and server di-

versity over time extracted from those statistics as input.

5.1 Connection to Restricted Machine Load

Balancing

Given a set of CPs and their network location diversity, we con-

sider the problem of re-assigning the flows that correspond to de-

mands of content consumers to the CPs in such a way that a specific

traffic engineering goal is achieved. Given that sub-flows between

end-systems and content provider servers can be re-distributed only

to a subset of the network paths, we show that the solution of the

optimal traffic matrix problem corresponds to solving the restricted

machine load balancing problem [7]. In the restricted machine load

balancing problem, a sequence of tasks is arriving, where each task

can be executed by a subset of all the available machines. The goal

is to assign each task upon arrival to one of the machines that can

execute it so that the total load is minimized. Note, contrary to the

case of multipath where paths between only one source-destination

pair are utilized, CaTE can utilize any eligible path between any

candidate source and destination of traffic.

For ease of presentation let us assume that the traffic engineer-

ing goal is to minimize the maximum link utilization in the net-

work [24, 25]. Let us consider three consumers where each one

wants to download one unit of content from two different content

providers, see Figure 4. Given that different servers can deliver the

content on behalf of the two providers, the problem consists in as-

signing consumers to servers in such a way that their demands are

satisfied while minimizing the maximum link utilization in the net-

work. Thus, the problem is the restricted machine load balancing

one where tasks are the demands satisfied by the servers and ma-

chines are the bottleneck links that are traversed when a path, out

of all eligible server-consumer paths, is selected. Figure 4 shows

one of the possible solutions to this problem, where consumer 1

is assigned to servers 1 and 4, consumer 2 to servers 5 and 2, and

consumer 3 to servers 3 and 6. Note that the machine load refers to

the utilization of the bottleneck links of eligible paths, denoted as

link 1 and 2.

To be consistent with our terminology, we define the restricted

flow load balancing problem. Let Jbe the set of the consumers in

the network, Kbe the set of content producers, and Ibe the set of

servers for a given content provider, i.e., the set of locations where

a request can be satisfied. Note, this set is offered by the CP in

order to satisfy its own objectives and can change over time. We

denote as Mjk the set of flows that can deliver content for a given

content producer kto consumer j.

Definition 3: Restricted Flow Load Balancing Problem is the

problem of finding a feasible assignment of flows such that a traffic

engineering goal is achieved, given a set of sub-flows {fijk}from

all eligible servers i∈Iof a given content provider k∈Kto a

consumer j∈J, and a set of eligible residual flows f−k

ij , i ∈Mjk

(after removing the traffic of the above mentioned sub-flows).

Despite some similarities, the nature of our problem differs from

themulti-commodity flowand binpacking. In themulti-commodity

flow problem [6], the demand between source and destination pairs

is given while in our problem the assignment of demands is part

of the solution. In the bin packing problem [11], the objective is

to minimize the number of bins, i.e., number of flows in our set-

ting, even if this means deviating from the given traffic engineering

goal. Note, in the restricted flow load balancing problem any eli-

gible path from a candidate source to the destination can be used,

contrary to the multipath problem where only equal-cost paths can

be used.

5.2 Online Algorithm and Competitiveness

We next turn to the design of online algorithms. It has been

shown that in the online restricted machine load balancing prob-

lem, the greedy algorithm that schedules a permanent task to an

eligible processor having the least load is exactly optimal [7], i.e.,

it is the best that can be found, achieving a competitive ratio of

⌈log2n⌉+1, where nis the number of machines. If tasks are split-

table then the greedy algorithm is 1-competitive, i.e., it yields the

same performance as an offline optimal algorithm. The greedy al-

gorithm is an online one, thus it converges to the optimal solution

immediately without oscillations.

In the restricted flow load balancing problem, the set Mjk can

be obtained from the set of candidate servers that can deliver con-

tent when utilizing CaTE as described in Section 3.2. The online

assignment of users to servers per request, which minimizes the

overall load, leads to an optimal assignment of sessions within sub-

flows. In our case, flows are splittable since the content correspond-

ing to each content request is negligible compared to the overall

traffic traversing a link. Note, the end-to-end TCP connections are

not splittable. Thus, the following online algorithm is optimal:

Algorithm 1. Online Greedy Server Selection. Upon the arrival

of a content user request, assign the user to the server that can de-

liver the content, out of all the servers offered by the CP, such that

the traffic engineering goal is achieved.

5.3 Estimating the Benefit of CaTE with Pas-

sive Measurements

Before applying CaTE in real operational networks, it is impor-

tant to understand the potential benefits that it can bring in a given

context. For example, the operator of an ISP network would like to

know in advance what are the gains when applying CaTE, as well

as being able to answer what-if scenarios, when applying CaTE

to traffic delivered by different CPs. Operators of CPs would also

like to quantify the benefits by participating in CaTE before col-

laborating with an ISP. In most operational networks, aggregated

statistics and passive measurements are collected to support oper-

ational decisions. Therefore, we provide a framework that allows

a simulation-driven evaluation of CaTE. To that end, we present

0.2

0.4

0.6

0.8

1 10 100 1000

CDF of total HTTP Traffic

Number of Providers

Figure 5: CDF of traffic volume of CPs in ISP1.

0.2

0.4

0.6

0.8

0:00 8:00 16:00 0:00 8:00 16:00 0:00

Normalized Traffic

Time of Day (10 Min. Buckets)

All Traffic

Top 100 CPs Top 10 CPs

Top 1 CPs

Figure 6: Normalized traffic for top CPs by volume in ISP1.

offline algorithms that can take as input passive measurements and

evaluate the potential gain when applying CaTE in different sce-

narios in Appendix A. We propose a linear programming formu-

lation as well as greedy approximation algorithms to speed-up the

process of estimating the gain when using CaTE.

6. EVALUATION OF CaTE

In this section, we quantify the potential of CaTE with different

traffic engineering goals in mind. We evaluate CaTE with opera-

tional data from three different networks. For the first network, we

rely on content demands built from observed traffic of a European

Tier-1 ISP. The other two networks, namely AT&T and Abilene,

allow us to evaluate the impact of the ISP topology structure.

6.1 Experimental Setting

To evaluate CaTE, an understanding of the studied ISP network

is necessary, including its topological properties and their implica-

tions on the flow of traffic. Indeed, the topological properties of the

ISP network influence the availability of disjoint paths, which are

key to benefit from the load-balancing ability of CaTE. Because

CaTE influences traffic aggregates inside the ISP network at the

granularity of requests directed to CPs, fine-grained traffic statis-

tics are necessary. Traffic counts per-OD flow, often used in the

literature, are too coarse an input for CaTE.

6.1.1 Data from a Large European ISP

To build fine-grained traffic demands, we rely on anonymized

packet-level traces of residential DSL connections from a large Eu-

ropean Tier-1 ISP, henceforth called ISP1. For ISP1, we have the

complete annotated router-level topology including the router loca-

tions as well as all public and private peerings. ISP1 contains more

than 650 routers and 30 peering points all over the world.

We collect a 10 days long trace starting on May 7, 2010. Our

monitor, using Endace monitoring cards [10], allows us to observe

the traffic of more than 20,000 DSL lines to the Internet. We cap-

ture HTTP and DNS traffic using the Bro intrusion detection sys-

tem [51]. We observe 720 million DNS messages as well as more

than 1 billion HTTP requests involving about 1.4 million unique

hostnames, representing more than 35 TBytes of data. With re-

gards to the application mix, more than 65% of the traffic volume

is due to HTTP. Other popular applications that contribute to the

overall traffic volume are NNTP, BitTorrent, and eDonkey.

A large fraction of the traffic in the Internet is due to large CPs,

including CDNs, hyper-giants, and OCHs, as reported in earlier

studies [27, 40, 52]. In Figure 5, we plot the cumulative fraction

of HTTP traffic volume as a function of the CPs that originate the

traffic. We define a CP as a organizational unit where all servers

from the distributed infrastructure serve the same content, such as

Akamai or Google. We rank the CPs by decreasing traffic volume

observed in our trace. Note that the x-axis uses a logarithmic scale.

The top 10 CPs are responsible for around 40% of the HTTP traffic

volume and the top 100 CPs for close to 70% of the HTTP traffic

volume. The marginal increase of traffic is diminishing when in-

creasing the number of CPs. This shows that collaborating directly

with a small number of large CPs, can yield significant savings.

In Figure 6 we plot the traffic of the top 1, 10, 100 CPs by volume

as well as the total traffic over time normalized to the peak traffic in

our dataset. For illustrative purposes, we show the evolution across

the first 60 hours of our trace. A strong diurnal pattern of traffic

activity is observed. We again observe that a small number of CPs

are responsible for about half of the traffic. Similar observations

are made for the rest of the trace.

6.1.2 Understanding the Location Diversity of CPs

To achieve traffic engineering goals, it is crucial to also under-

stand the location diversity of the top CPs, as CaTE relies on the

fact that the same content is available at multiple locations. Traffic

originated from multiple network locations by a given CP is seen by

CaTE as a single atomic traffic aggregate to be engineered. Fur-

thermore, as routing in the Internet works per prefix, we assume

that the granularity of subnets is the finest at which CaTE should

engineer the traffic demand. Thus, we differentiate candidate lo-

cations of CPs by their subnets and quantify the location diversity

of CPs through the number of subnets from which content can be

obtained.

We examine the amount of location diversity offered by CPs

based on traces from ISP1. To identify the subnets of individ-

ual CPs, we rely on a similar methodology to the one from Poese

et al. [52]. Our granularity is comparable to their "infrastructure

redirection aggregation". Figure 7 shows the cumulative fraction

of HTTP traffic as a function of the number of subnets (logarithmic

scale) from which a given content can be obtained, over the entire

10 days of the trace. We observe that more than 50% of the HTTP

traffic can be delivered from at least 8different subnets, and more

than 60% of the HTTP traffic from more than 3locations. These

results confirm the observations made in [52].

6.1.3 Dynamics in Location Diversity

So far the location diversity of CPs has been evaluated irrespec-

tive of time. To complement the finding, we turn our attention to

the location diversity exposed by CPs at small time-scales, i.e., in

the order of minutes. To this end, we split the original trace into

10 minutes bins. Figure 8 shows the evolution of the number of

exposed subnets of five of the top 10 CPs by volume. Note that the

diversity exposed by some CPs exhibits explicit time of day pat-

terns, while others do not. This can be due to the structural setup or

the type of content served by the CP. The exposed location diver-

0.2

0.4

0.6

0.8

1 10 100

CDF of total HTTP Traffic

Available Subnets

Figure 7: Subnet diversity from which content is available.

0:00 8:00 16:00 0:00 8:00 16:00 0:00

Number of Subnets

Time of Day (10 Min. Buckets)

CP1 CP2 CP3 CP4 CP5

Figure 8: Evolution over time of number of subnets for selected

CPs in the top 10 CPs.

sity patterns, i.e., flat or diurnal, are representative for all CPs with

a major traffic share in our trace. We conclude that a significant

location diversity is exposed by popular CPs at any point in time,

and is quite extensive during the peak hour.

6.1.4 Content Demand Generation

The location diversity is not a mere observation about CPs de-

ployment. It requires to revisit the mapping between a given con-

tent demand and the realized traffic matrix. Given the location di-

versity for content, multiple traffic matrices can be realized from a

given content demand. The standard view of the OD flows therefore

provides an incomplete picture of the options available for CaTE.

As an input for CaTE, we introduce an abstraction of the de-

mand that reflects the available location diversity. We rely on the

notion of potential vectors, that were denoted as xrin Section 4.2.

To generate the potential vector for a given CP, the amount of traf-

fic this CP originates as well as the potential ingress points need to

be known. Combining all potential vectors and xs, we synthesize a

network-wide content demand matrix for each time bin, by scaling

the traffic demand to match the network utilization of ISP1. For

our evaluation, we use the series of content demand matrices over

a period of 10 days. The content demands are based exclusively on

the HTTP traffic of our trace.

6.2 CaTE in ISP1

To quantify the benefits of CaTE, we first consider one of the

most populartraffic engineering goals, namelyminimizing themax-

imum utilization of the links in the network [24, 25]. The rationale

is that by minimizing the maximum link utilization, network bot-

tlenecks are reduced, in turn limiting queueing delays, improving

the quality of experience and postponing the need for increased

network capacity.

With CaTE, an ISP can collaborate with any CP. It is up to the

0.2

0.4

0.6

0.8

0:00 8:00 16:00 0:00 8:00 16:00 0:00

Utilization Reduction

Time of Day (10 Min. Buckets)

Top 100 CPs Top 10 CPs Top 1 CP

0.03

0.06

0.09

0:00 8:00 16:00 0:00 8:00 16:00 0:00

Traffic Reduction

Time of Day (10 Min. Buckets)

Top 100 CPs Top 10 CPs Top 1 CP

Figure 9: Maximum link utilization reduction (top) and total

traffic reduction (bottom) with CaTE for the top CPs.

ISP to select the set of CPs that are the most important to establish

collaboration with. Since a significant fraction of the traffic orig-

inates from a small number of CPs, we consider the most popular

CPs by volume to evaluate CaTE. In the following, we perform a

sensitivity study where we quantify the benefits of CaTE when re-

stricting its use to the top 1, 10 and 100 CPs by volume. All other

traffic remains unaffected by CaTE. For all experiments, we use

the Algorithm 2 from Appendix A.2.

Effect on Maximum Link Utilization. Figure 9 (top) shows the

reduction of the maximum link utilization over a period of 2days

when considering the top 1, 10 and 100 CPs. Once again, we

normalized the absolute link utilization by the maximal one. The

largest gain in maximum link utilization reduction is up to 15%,

40% and 70% respectively. We observe large fluctuations of the

gains which are due to variations in traffic (see Figure 7) and loca-

tion diversity (Figure 8) throughout the day. The largest gains are

obtained during peak times, when there is more traffic and the high-

est location diversity is available. This is also when congestion is

at its peak and CaTE is most needed. Our results show that CaTE

is able to react to diurnal changes in traffic volume and utilizes the

available location diversity.

Effect on Network-wide Traffic. Although optimizing for link

utilization, CaTE reduces the overall traffic that flows through the

network, see Figure 9 (bottom). This is due to CaTE choosing the

shortest path when multiple ones with the same utilization are avail-

able, thus, as a side effect, content is fetched from closer locations

and therefore traverses less links. With CaTE, the gains in overall

traffic reduction are up to 6% and follows a clear diurnal pattern.

It is worth noticing that just with the top 10 CPs, the total traffic

reduction is very close to the one when considering the top 100

CPs, indicating that CaTE only needs to be implemented with the

major players. Also, an ISP that is able to reduce the overall traffic

inside its network is more competitive as it can serve more end-

users with the same infrastructure, delay additional investments in

0.2

0.4

0.6

0.8

0.001 0.01 0.1 1

CDF of total Volume

Relative Link Utilization

Top 100 CPs

Top 10 CPs

Top 1 CP

Original

Figure 10: Improvements in link utilization with CaTE.

0.1

0.2

0.3

0.4

0.5

0 1 2 3 4 5 6 7

Fraction of Traffic

Path Length in Traversed Backbone Links

Top 100 CPs

Top 10 CPs

Top 1 CP

Original

Figure 11: Backbone path length count with CaTE.

capacity upgrades and improve end-user satisfaction.

Effect on Distribution of Link Utilization. Reducing the maxi-

mum link utilization shifts traffic away from congested links. How-

ever, it should not be done at the expense of creating congestion on

other highly utilized links. In Figure 10 we plot the CDF of traffic

volume in ISP1 across all link utilizations, normalized by the max-

imum one when considering sets of the top CPs by volume. The

results show that CaTE shifts the traffic away from highly utilized

links to low utilized ones.

Effect on Traffic Path Length. Our results in Figure 9 (bottom)

show a reduction in the overall traffic in ISP1, which can be at-

tributed to an overall reduction of the path length. Path length re-

duction is an important metric for ISPs for the dimensioning of the

network as well as the reduction of operational costs. To quantify

this reduction in terms of the path length inside ISP1, Figure 11

shows the relative traffic across different path lengths inside the

network. CaTE redirects the traffic towards paths with the same

or even shorter length than the ones used without CaTE, only in

the rare case where a longer paths yields a lower utilization, CaTE

can choose a longer one. Note that there is no traffic for backbone

path length equal to 1 due to the network design of ISP1. We con-

clude that applying CaTE to a small number of CPs yields major

improvements in terms of path length.

Effect on Path Delay. Although the objective of minimizing max-

imum link utilization is not directly related to the reduction of path

delay, the achieved reduction in path length directly affects the path

delay. Figure 12 shows the accumulated path delay for the traffic

that flows within ISP1, when applying CaTE. The reported num-

bers for the backbone path delay are relatively modest compared

to the values for the access part of the network [46]. However,

improving the access delay requires significant investments as it

can be done mostly through changes in the access technology, e.g.,

from copper to fiber. When considering the end-to-end delay, the

delay of the path outside the ISP’s network also needs to be con-

0.2

0.4

0.6

0.8

2 10 100

Fraction of Traffic

Accumulated Path Delay in ms

Top 100 CPs

Top 10 CPs

Top 1 CP

Original

Figure 12: Improvement in path delay (in ms) with CaTE.

sidered. As content infrastructures are located close to peering

points [40, 39, 2], e.g., IXPs or private peerings, the delays are

expected to be relatively small, especially for popular CPs. Es-

timating the impact of CaTE on the end-to-end performance for

every application is very challenging, due to the many factors that

influence flow performance, especially network bottlenecks outside

the considered ISP. In Appendix B we show the results from ac-

tive measurements conducted in the case of traffic-heavy applica-

tions, confirming the significant improvements in end-to-end delay

as well as download time that can be achieved thanks to CaTE.

Summary. Our evaluation shows that CaTE yields encouraging

results, even when only a few large CPs are collaborating with an

ISP. In fact, even metrics that are not directly related to the opti-

mization function of CaTE are improved. Besides significant im-

provements for the operation of ISP networks, the end-users are

expected to also benefit from these gains. This can be attributed to

the decrease of delay as well as the reduced link utilization.

6.3 CaTE with other Network Metrics

So far we have evaluated CaTE with one traffic engineering

objective, namely, the minimization of maximum link utilization.

CaTE allows ISPs and CPs to to optimize for other network met-

rics such as path length or path delay. To this end, we quantify the

effects of CaTE when using path length and delay and compare it

with the results presented in Section 6.2. We focus on the top 10

CPs as our results show that most of the benefits from CaTE can be

achieved with this rather low number of CPs. Similar observations

are made when applying CaTE to the top 1 and 100 CPs.

In Figure 13 (top) we plot the total traffic reduction when apply-

ing CaTE to the top 10 CPs with different optimization goals. The

first observation is that when the network metric is path length, the

total traffic reduction is the highest, with up to 15%. The total traf-

fic reduction when optimizing for path length are close to the one

achieved when the metric is delay. Optimizing for other metrics

provides the expected result: the optimized metric is significantly

improved, but at the cost of not optimizing other metrics as much.

For example, optimizing for link utilization diminishes the benefits

from path length (Figure 14 top) and vice-versa (Figure 13 bot-

tom). Still, significant improvements can be achieved even when

optimizing for another network metric and we encountered no case

of significant deterioration in on of the network metrics throughout

our experiments, see Figure 13 and Figure 14.

6.4 CaTE in AT&T and Abilene

To quantify the potential benefits of CaTE in networks with dif-

ferent topological structures than ISP1, we repeat our experiments

for two other ISPs: AT&T and Abilene.

AT&T is one of the largest commercial networks. We use the

0.03

0.06

0.09

0.12

0.15

0:00 8:00 16:00 0:00 8:00 16:00 0:00

Traffic Reduction

Time of Day (10 Min. Buckets)

Path Delay Utilization

0.1

0.2

0.3

0.4

0.5

0:00 8:00 16:00 0:00 8:00 16:00 0:00

Utilization Reduction

Time of Day (10 Min. Buckets)

Utilization Delay Path

Figure 13: Total traffic (top) and maximum link utilization

(bottom) reduction with CaTE and different network metrics.

topology for the US backbone of AT&T as measured by the Rock-

etfuel project [60, 57]. Given that no publicly available traffic de-

mands exist for AT&T, we rely on the gravity model [54] to gener-

ate several traffic demand matrices as in ISP1.

Abilene is the academic network in the US. We use the Abilene

topology and traffic demands covering a 6 month period that are

both publicly available.1

The topology of both networks differ significantly from the one

of ISP1. In AT&T, many smaller nodes within a geographical area

are aggregated into a larger one. Abilene has few but large and well

connected nodes with a high degree of peerings. For the application

mix we rely on recent measurements in AT&T [27] and for server

diversity we rely on measurements of users in these networks [2].

Figure 15 shows the cumulative fraction of normalized link uti-

lizations for AT&T and Abilene with different optimization goals.

As already done in ISP, only the Top 10 CPs are considered for

CaTE, while all other traffic stays unaffected. For AT&T the bene-

fit for the maximum link utilization is about 36% when the network

is optimized for minimizing the maximum link utilization, while

themedian reductionin termsof network-widetrafficisabout3.7%.

When other optimizations are used, the benefits of CaTE regard-

ing the link utilization minimization are approximately 12% for

path length and delay. However, when looking at the median traf-

fic reduction of these metrics, the traffic is reduced by 5.4% when

path length is used, while delay achieves a reduction of 5%. In the

Abilene network benefits of CaTE are more significant: 45% re-

duction in the maximum link utilization and 18% for network-wide

traffic when CaTE optimizes for link utilization. When targeting

the other two metrics, i.e., path length and delay, the results show

that CaTE does not reduce the maximum link utilization. In fact,

the maximum link utilizations stays constant. This is due to the

structure of the network and the fact that the content is available

1http://userweb.cs.utexas.edu/~yzhang/research/AbileneTM/

0.1

0.2

0.3

0.4

0.5

0 1 2 3 4 5 6 7

Fraction of Traffic

Path Length in Traversed Backbone Links

Path

Delay

Utilization

Original

0.2

0.4

0.6

0.8

2 10 100

Fraction of Traffic

Accumulated Path Delay in ms

delay

path

utilization

original

Figure 14: Backbone path length (top) and accumulated path

delay (bottom) with CaTE and different network metrics.

closer, but at the cost of keeping the high utilization on some of

the links. However, when looking at the median traffic reduction,

both metrics manage to reduce the traffic by over 24%. These re-

sults show that CaTE is capable of targeting different optimization

goals in different network structures and is able to optimize for dif-

ferent metrics.

It is worth noting that for AT&T 40% of the links have a nor-

malized link utilization less than 10% while the remaining link

utilizations are distributed almost linear. This distribution fits the

structural observations made for the AT&T network: many links

from smaller nodes are aggregated into larger ones. This also ex-

plains why the benefits for AT&T are smaller, since such a struc-

ture reduces the path diversity. Turning our attention to Abilene,

we attribute the higher reduction of maximum link utilization and

network-wide traffic to the non-hierarchical structure of the net-

work and a higher ratio of peering locations. Applying CaTE to

both AT&T and Abilene networks where the network metric is de-

lay or path length shows similar behavior of CaTE as it does in

ISP1.

6.5 CaTE and Popular Applications

Today, the launch of new content hosted on CPs such as high

definition video or others that share flash-crowd characteristics, is

not done in coordination with ISPs. This is challenging to ISPs

that have to deal with rapid shifts of traffic volume as currently

deployed traffic engineering tools are too slow to react to rapid

demand changes. Furthermore, the end-user experience for pop-

ular applications is far from optimal as application designers have

limited means to optimize the end-to-end delivery of content [39].

Both ISPs and applications would benefit from the improved traffic

engineering capabilities of CaTE. We believe that CaTE can act

as an enabler for ISP-application collaboration.

For example, Netflix, a very popular application that delivers

high quality videos to end-users, relies on commercial CDNs such

0.2

0.4

0.6

0.8

0.1 1

CDF of total Volume

Relative Link Utilization

Rocketfuel measured ATT

Utilization

Delay

Path

Original 0

0.2

0.4

0.6

0.8

0.1 1

CDF of total Volume

Relative Link Utilization

Abilene

Utilization

Delay

Path

Original

Figure 15: Link utilization improvements after applying when using CaTE in AT&T and Abilene.

as Level3 and Limelight to improve the content delivery. Today,

Netflix is only available in North and Latin America. However,

Netflix has announced that it will be launching its services in Eu-

rope early 2012. To quantify the effect of Netflix coming to Europe,

we use our simulation to estimate the effect on ISP1. We run a se-

ries of experiments, assuming that the traffic of the CPs hosting

Netflix will increase 20-fold. Our results show that with CaTE, the

total HTTP traffic volume is reduced by up to 8% and the utiliza-

tion of the most utilized link by 60%. More detailed results can be

found in Appendix C.

7. RELATED WORK

To meet the requirements of mission critical applications with

stringent Service Level Agreements (SLAs), today’s ISPs rely on

traffic engineering [5] to better control the flow of IP packets inside

their network. Several techniques have been proposed in the liter-

ature, some require tuning of the IP routing protocols used inside

the ISP network [24, 25, 64], while others rely on multipath [19,

16, 58, 21, 28, 65]. Changing routing weights can lead to oscil-

lations [30] and is applied on time scales of hours. Multipath en-

ables ISPs to dynamically distribute the traffic load within the net-

work in the presence of volatile and hard to predict traffic demand

changes [19, 16, 58, 21], even at very small time scales, but re-

quires additional configuration and management or router support.

CaTE is complementary to both routing-based traffic engineering

and multipath enabled networks.

Traffic engineering relies on the availability of information about

the traffic demands, which can be obtained either by direct obser-

vations [19, 31, 20, 63] or through inference [48, 66, 67, 59, 18].

CaTE relies on the network location diversity exposed by current

hosting and content delivery infrastructures [2].

Game-theoretic results [36, 14, 45] show that the collaboration

between CPs and ISPs can lead to a win-win situation. Recent stud-

ies also show that content location diversity has significant implica-

tions on traffic engineering within an ISP [56]. To our knowledge,

CaTE is the first system that is proposed to leverage the benefits of

a direct collaboration between CPs and ISPs.

8. SUMMARY

Today, a large fraction of Internet traffic is due to a few content

providers that rely on highly distributed infrastructures [40, 42, 2].

These distributed infrastructures expose a significant location di-

versity, which opens new opportunities to improve end-user perfor-

mance, help CPs to better locale end-user and circumvent network

bottlenecks, and enables new traffic engineering capabilities. We

introduce the concept of content-aware traffic engineering (CaTE),

that leverages this location diversity to engineer the traffic through

careful selection of the locations from which content is obtained.

We propose deployment schemes of CaTE based on an online al-

gorithm. The algorithm is stable and incurs no oscillations in link

utilizations. Furthermore, CaTE works on time scales ranging be-

tween the TCP control loop and traditional traffic engineering, and

therefore advantageously complements existing traffic engineering

techniques.

We evaluate some of the potential benefits of CaTE on multi-

ple operational networks using an offline derivative of the online

algorithm. Our results show that CaTE provides benefits to CPs,

ISPs and end-users, by reducing the maximum link utilization, the

path length and the delay inside an ISP network, as well as enabling

improved end-user to CP server assignment.

In the future, we envision CaTE as an enabler for coordinated

and Internet-wide traffic engineering. Meanwhile, CaTE creates

incentives for both ISPs and CPs to interlock their traffic engineer-

ing planes through the mutual benefits it brings. As further work,

we want to deploy a prototype implementation of CaTE and eval-

uate it through a direct collaboration between a CP and an ISP.

9. REFERENCES

[1] B. Ager, W. Mühlbauer, G. Smaragdakis, and S. Uhlig. Comparing

DNS Resolvers in the Wild. In ACM IMC, 2010.

[2] B. Ager, W. Mühlbauer, G. Smaragdakis, and S. Uhlig. Web Content

Cartography. In ACM IMC, 2011.

[3] B. Ager, F. Schneider, J. Kim, and A. Feldmann. Revisiting

Cacheability in Times of User Generated Content. In IEEE Global

Internet, 2010.

[4] R. Alimi, R. Penno, and Y. Yang. ALTO Protocol.

draft-ietf-alto-protocol-08, May 2011.

[5] D. Awduche, A. Chiu, A. Elwalid, I. Widjaja, and X. Xiao. Overview

and principles of internet traffic engineering. RFC3272.

[6] B. Awerbuch and F. Leighton. Multicommodity Flows: A Survey of

Recent Research. In ISAAC, 1993.

[7] Y. Azar, J. Naor, and R. Rom. The competitiveness of on-line

assignments. J. Algorithms, 1995.

[8] B. Chow, P. Golle, M. Jakobsson, E. Shi, J. Staddon, R. Masuoka,

and J. Molina. Controlling Data in the Cloud: Outsourcing

Computation withour Outsourcing Control. 2009.

[9] Cisco. NetFlow services and applications. White paper:

http://www.cisco.com/warp/public/732/netflow.

[10] J. Cleary, S. Donnelly, I. Graham, A. McGregor, and M. Pearson.

Design Principles for Accurate Passive Measurement. In PAM, 2000.

[11] Jr. E. Coffman, M. Garey, and D. Johnson. Approximation

Algorithms for Bin Packing: A Survey. In Approximation algorithms

for NP-hard problems, 1997.

[12] C. Contavalli, W. van der Gaast, S. Leach, and D. Rodden. Client IP

Information in DNS Requests. IETF draft, work in progress,

draft-vandergaast-edns-suspend-ip-01.txt, 2011.

[13] A. Czumaj, C. Riley, and C. Scheideler. Perfectly Balanced

Allocation. In RANDOM-APPROX, 2003.

[14] D. DiPalantino and R. Johari. Traffic Engineering versus Content

Distribution: A Game-theoretic Perspective. In IEEE INFOCOM,

2009.

[15] F. Dobrian, A. Awan, I. Stoica, V. Sekar, A. Ganjam, D. Joseph,

J. Zhan, and H. Zhang. Understanding the Impact of Video Quality

on User Engagement. In ACM SIGCOMM, 2011.

[16] A. Elwalid, C. Jin, S. Low, and I. Widjaja. MATE : MPLS adaptive

traffic engineering. In IEEE INFOCOM, 2001.

[17] J. Erman, A. Gerber, M. Hajiaghayi, D. Pei, and O. Spatscheck.

Network-aware forward caching. In WWW Conf., 2009.

[18] V. Erramilli, M. Crovella, and N. Taft. An Independent-connection

Model for Traffic Matrices. In ACM IMC, 2006.

[19] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, and

F. True. Deriving Traffic Demands for Operational IP Networks:

Methodology and Experience. IEEE/ACM Trans. Netw., 2001.

[20] A. Feldmann, N. Kammenhuber, O. Maennel, B. Maggs, R. De

Prisco, and R. Sundaram. A Methodology for Estimating

Interdomain Web Traffic Demand. In ACM IMC, 2004.

[21] S. Fischer, N. Kammenhuber, and A. Feldmann. REPLEX: Dynamic

Traffic Engineering based on Wardrop Routing Policies. In ACM

CoNEXT, 2006.

[22] S. Fischer, H. Räcke, and B. Vöcking. Fast Convergence to Wardrop

Equilibria by Adaptive Sampling Methods. In ACM STOC, 2006.

[23] S. Fischer and B. Vöcking. Adaptive Routing with Stale Information.

In ACM PODC, 2005.

[24] B. Fortz and M. Thorup. Internet Traffic Engineering by Optimizing

OSPF Weights. In IEEE INFOCOM, 2000.

[25] B. Fortz and M. Thorup. Optimizing OSPF/IS-IS Weights in a

Changing World. IEEE J. Sel. Areas in Commun., 2002.

[26] P. Francois and O. Bonaventure. Avoiding transient loops during the

convergence of link-state routing protocols. IEEE/ACM Trans. Netw.,

2007.

[27] A. Gerber and R. Doverspike. Traffic Types and Growth in Backbone

Networks. In OFC/NFOEC, 2011.

[28] D. Goldenberg, L. Qiuy, H. Xie, Y. Yang, and Y. Zhang. Optimizing

Cost and Performance for Multihoming. In ACM SIGCOMM, 2004.

[29] R. Graham. Bounds on Multiprocessing Timing Anomalies. SIAM J.

Applied Math., 1969.

[30] T. Griffin and G. Wilfong. On the Correctness of iBGP

Configuration. In ACM SIGCOMM, 2002.

[31] A. Gunnar, M. Johansson, and T. Telkamp. Traffic Matrix Estimation

on a Large IP Backbone: A Comparison on Real Data. In ACM IMC,

2004.

[32] Akamai Inc. SureRoute.

www.akamai.com/dl/feature_sheets/fs_edgesuite_sureroute.pdf.

[33] Microsoft Inc. Windows azure.

http://www.microsoft.com/windowsazure/cdn.

[34] Sandvine Inc. Global broadband phenomena. Research Report

http://www.sandvine.com/news/global_broadband_trends.asp, 2011.

[35] V. Jacobson, D. Smetters, J. Thornton, M. Plass, N. Briggs, and

R. Braynard. Networking Named Content. In ACM CoNEXT, 2009.

[36] W. Jiang, R. Zhang-Shen, J. Rexford, and M. Chiang. Cooperative

Content Distribution and Traffic Engineering in an ISP Network. In

ACM SIGMETRICS, 2009.

[37] P. Key, L. Massoulié, and D. Towsley. Path Selection and Multipath

Congestion Control. Commun. of the ACM, 2011.

[38] L. Khachiyan. A Polynomial Time Algorithm for Linear

Programming. Dokl. Akad. Nauk SSSR, 1979.

[39] R. Krishnan, H. Madhyastha, S. Srinivasan, S. Jain,

A. Krishnamurthy, T. Anderson, and J. Gao. Moving Beyond

End-to-end Path Information to Optimize CDN Performance. In

ACM IMC, 2009.

[40] C. Labovitz, S. Lekel-Johnson, D. McPherson, J. Oberheide, and

F. Jahanian. Internet Inter-Domain Traffic. In ACM SIGCOMM, 2010.

[41] A. Lakhina, K. Papagiannaki, Mark. Crovella, C. Diot, E. Kolaczyk,

and N. Taft. Structural Analysis of Network Traffic Flows. In ACM

SIGMETRICS, 2004.

[42] T. Leighton. Improving Performance on the Internet. Commun. of the

ACM, 2009.

[43] J. Lenstra, D. Shmoys, and É. Tardos. Approximation Algorithms for

Scheduling Unrelated Parallel Machines. Math. Program., 1990.

[44] K. Ma. Content Distribution Network Interconnection (CDNI)

Metadata Interface. draft-ma-cdni-metadata-00, April 2011.

[45] R. T. B. Ma, D. .M. Chiu, J. C. S. Lui, V. Misra, and D. Rubenstein.

On Cooperative Settlement Between Content, Transit, and Eyeball

Internet Service Providers. IEEE/ACM Trans. Netw., 2011.

[46] G. Maier, A. Feldmann, V. Paxson, and M. Allman. On Dominant

Characteristics of Residential Broadband Internet Traffic. In ACM

IMC, 2009.

[47] Z. Mao, C. Cranor, F. Douglis, M. Rabinovich, O. Spatscheck, and

J. Wang. A Precise and Efficient Evaluation of the Proximity

Between Web Clients and Their Local DNS Servers. In Usenix ATC,

2002.

[48] A. Medina, N. Taft, K. Salamatian, S. Bhattacharyya, and C. Diot.

Traffic Matrix Estimation: Existing Techniques and New Directions.

In ACM SIGCOMM, 2002.

[49] Limelight Networks.

http://www.limelightnetworks.com/platform/cdn.

[50] E. Nygren, R. K. Sitaraman, and J. Sun. The Akamai Network: A

Platform for High-performance Internet Applications. SIGOPS Oper.

Syst. Rev., 44:2–19, August 2010.

[51] V. Paxson. Bro: A System for Detecting Network Intruders in

Real-Time. Computer Networks, 1999.

[52] I. Poese, B. Frank, B. Ager, G. Smaragdakis, and A. Feldmann.

Improving Content Delivery using Provider-Aided Distance

Information. In ACM IMC, 2010.

[53] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs.

Cutting the Electric Bill for Internet-scale Systems. In ACM

SIGCOMM, 2009.

[54] M. Roughan. Simplifying the Synthesis of Internet Traffic Matrices.

ACM Comp. Comm. Rev., 2005.

[55] AMAZON Web Services. http://aws.amazon.com.

[56] A. Sharma, A. Mishra, V. Kumar, and A. Venkataramani. Beyond

MLU: An application-centric comparison of traffic engineering

schemes. In IEEE INFOCOM, 2011.

[57] R. Sherwood, A. Bender, and N. Spring. DisCarte: A Disjunctive

Internet Cartographer. In ACM SIGCOMM, 2008.

[58] S.Kandula, D. Katabi, B. Davie, and A. Charny. Walking the

Tightrope: Responsive Yet Stable Traffic Engineering. In ACM

SIGCOMM, 2005.

[59] A. Soule, A. Nucci, R. Cruz, E. Leonardi, and N. Taft. How to

Identify and Estimate the Largest Traffic Matrix Elements in a

Dynamic Environment. In ACM SIGMETRICS, 2004.

[60] N. Spring, R. Mahajan, D. Wetherall, and T. Anderson. Measuring

ISP Topologies with Rocketfuel. IEEE/ACM Trans. Netw., 2004.

[61] M. Tariq, A. Zeitoun, V. Valancius, N. Feamster, and M. Ammar.

Answering What-if Deployment and Configuration Questions with

Wise. In ACM SIGCOMM, 2009.

[62] S. Triukose, Z. Al-Qudah, and M. Rabinovich. Content Delivery

Networks: Protection or Threat? In ESORICS, 2009.

[63] S. Uhlig, B. Quoitin, J. Lepropre, and S. Balon. Providing Public

Intradomain Traffic Matrices to the Research Community. ACM

Comp. Comm. Rev., 2006.

[64] Y. Wang, Z. Wang, and L. Zhang. Internet Traffic Engineering

Without Full Mesh Overlaying. In IEEE INFOCOM, 2001.

[65] M. Zhang, C. Yi, B. Liu, and B. Zhang. GreenTE: Power-aware

Traffic Engineering. In ICNP, 2010.

[66] Y. Zhang, M. Roughan, N. Duffield, and A. Greenberg. Fast Accurate

Computation of Large-scale IP Traffic Matrices from Link Loads. In

ACM SIGMETRICS, 2003.

[67] Y. Zhang, M. Roughan, C. Lund, and D. Donoho. An

Information-theoretic Approach to Traffic Matrix Estimation. In

ACM SIGCOMM, 2003.

APPENDIX

A. ESTIMATING THE BENEFITS OF CaTE

WITH PASSIVE MEASUREMENTS

We answer the question of the potential benefit CaTE can offer

to CPs, ISPs, and end-users. The online algorithm requires de-

ployment of CaTE inside an operational network. An alternative

is to rely on a simulation-driven evaluation of CaTE. For this, we

design offline algorithms that take as input passive measurements

and estimate the gain when applying CaTE under different scenar-

ios. We first propose a linear programming formulation and then

we present greedy algorithms to speed-up the process of estimating

the benefits of CaTE.

A.1 Linear Programming Formulation

To estimate the potential improvement of CaTE we formulate

the Restricted Flow Load Balancing problem (see Section 5.1) as

a Linear Program (LP) with restrictions on the variable values.

Variables fijk correspond to flows that can be influenced. Setting

fijk = 0 indicates that consumer jcannot download the content

from server iof a content provider k. For each consumer jwe re-

quire that its demand djk for content provider kis satisfied, i.e.,

we require Pi∈Mjk fijk =djk. The utilization on a flow fij is

expressed as fij =Pkfijk.

We use the objective function to encode the traffic engineering

goal. For ease of presentation we use as objective function the

minimization of the maximum link utilization. Let Tebe the set

of flows fij that traverse a link e∈E. The link utilization of a link

e∈Eis expressed as Le=PTefij . Let variable Lcorrespond to

the maximum link utilization. We use the inequality PTefij ≤L

for all links. This results in the following LP problem:

min L

fijk =djk,∀j∈J, k ∈K

fijk ≤L, ∀j∈J, i ∈I, k ∈K, e ∈E

0≤fijk ≤djk,∀j∈J, i ∈Mjk, k ∈K

fijk = 0,∀j∈J, i /∈Mjk, k ∈K

The solution of the above LP provides a fractional assignment of

flows under the assumption that flows are splittable and thus can be

solved in polynomial time [38]. The solution is the optimal flow

assignment, f∗

ijk, that corresponds to the optimal traffic matrix x∗.

If flows are not splittable, or the sub-flows are discretized, then the

integer programming formulation has to be solved. In this case the

Restricted Flow Load Balancing problem is NP-hard and a poly-

nomial time rounding algorithm that approximates the assignment

within a factor of 2exists [43].

A.2 Approximation Algorithms

Since it is a common practice for operators to study multiple

scenarios to quantify the effect of changes in traffic matrices over

periods that spans multiple weeks or months, solutions based on LP

may be too slow. It might be also too slow to estimate the gain of

CaTE when applying it to an arbitrary combination of CPs. To that

end, we turn our attention to the design of fast approximation algo-

rithms. Simple greedy algorithms for load balancing problems [29]

are among the best known. Accordingly, we propose a greedy al-

Algorithm 2: Iterative Greedy-Sort-Flow.

INPUT: I,J,K,{fijk},{Mjk},A.

OUTPUT: {f∗

ijk}.

Initialization:

1. Sort k∈Kby decreasing volume: PiPjfijk.

2. Sort j∈Jby decreasing volume: Pifijk for all k∈K.

Iteration:

Until no sub-flow is re-assigned or the maximum number of

iterations has been reached.

Pick unprocessed k∈Kin descending order.

Pick unprocessed j∈Jin descending order.

Re-assign fijk in f−k

ij , i ∈Mjk s.t. the engineering

goal is achieved.

0.5

0.6

0.7

0.8

0.9

20 30 40 50 60 70 80 90 100

CDF

Download time in seconds

original

CaTE

Figure 16: Distribution of download times of a CP.

gorithm for our problem which starts with the largest flow first.

Algorithm 3: Greedy-Sort-Flow. Sort sub-flows in decreasing

order based on volume and re-assign them in this order to any other

eligible flow which, after assigning the sub-flow fijk, will yield the

desire traffic engineering goal.

Assignment in sorted order has been shown to significantly im-

prove the approximation ratio and the convergence speed [13, 29].

Recent studies [27, 40, 52] show that a small number of content

providers are responsible for a large fraction of the traffic. There-

fore it is expected that the algorithm yields results close to the opti-

mal ones. To further improve the accuracy of the proposed approx-

imation algorithm, we design an iterative version of the algorithm,

presented in Algorithm 2, that converges to the optimal solution.

Indeed, a small number of iterations, typically one, suffice to pro-

vide a stable assignment of flows.

As we elaborate in Section 6, we performed a number of sim-

ulations using real operational traces, and different sets of CPs.

Our evaluation show that the performance of the iterative greedy

algorithm presented in Algorithm 2 yields results very close to this

obtained with LP, but in significantly shorter time.

B. ACTIVE MEASUREMENTS IN ISP1

The CaTE evaluation in Section 6.2 does not allow us to argue

about end-user performance, as it is based on simulations. To this

end, we complement our previous network-wide simulations with

active measurements. Over a period of one week, we repeatedly

downloaded a 60MB object from one of the major CPs. This CP

is an OCH distributed across 12 locations. The downloads were

performed every two hours, from each of the 12 locations. Addi-

tionally, mapping requests were issued every 200ms to find out the

dynamics in the server assignment of this CP. Figure 16 shows the

-0.6

-0.3

0.3

0.6

0:00 8:00 16:00 0:00 8:00 16:00 0:00

Utilization Reduction

Time of Day (10 Min. Buckets)

Utilization Delay Path

0.04

0.08

0.12

0.16

0:00 8:00 16:00 0:00 8:00 16:00 0:00

Traffic Reduction

Time of Day (10 Min. Buckets)

Path Delay Utilization

0.1

0.2

0.3

0.4

0.5

0 1 2 3 4 5 6 7

Fraction of Traffic

Path Length in Traversed Backbone Links

Path

Delay

Utilization

Original

Figure 17: Projection of reduction in link utilization (top), re-

duction in overall network traffic (middle) and fraction of vol-

ume by path length (bottom) if Netflix is launched in ISP1.

distribution of total download times when the CP assigns end-users

to its servers ("original") and compares it to the download time that

would be observed if CaTE had been used. We observe that more

than 50% of the downloads do not show a significant difference.

This happens when congestion is low, e.g., during non-peak hours.

For 20% of the downloads, we observe a significant difference in

the download times, mainly during peak hours. This confirms our

observation that CaTE is most beneficial during peak hours.

C. CASE STUDY: NETFLIX IN ISP1

Netflix, a very popularapplication thatdelivers highquality videos

to end-users, relies on commercial CDNs such as Level3 and Lime-

light to improve the content delivery. Today, Netflix is available in

North and Latin America, and is announced to arrive in the UK

soon. Recent studies show that Netflix is responsible for more than

30% of the peak downstream traffic in large ISPs [34]. Consider the

scenariowhereNetflix islaunching itsservice inthe large European

ISP1 we described in Section 6.1. If the launch happens overnight,

ISP1 would have to deal with a huge amount of highly variable traf-

fic, which would have significant implications on the operation of

ISP1. With CaTE, the traffic of Netflix can be spread across the

ingress points of ISP1. This will limit the negative consequences

imposed by additional traffic for the CP delivering Netflix as well

as for ISP1 and thus avoids a deteriorated end-user experience.

To quantify the effect of Netflix being deployed in Europe, we

simulate the launch of Netflix in ISP1, assuming that the CP cur-

rently hosting Netflix increases its traffic 20-fold, while keeping the

distribution of the requests. Next, we generate a new set of traffic

demands for CaTE accordingly. We consider the the top 10 CPs

by volume for CaTE, and show the benefits when optimizing for

different metrics.

Our results show that with CaTE, the utilization of the most uti-

lized link can be reduced by up to 60% (see top of Figure 17), the

total HTTP traffic volume can be reduced by 15% (see middle of

Figure 17) and traffic can be shifted towards shorter paths inside

the network of ISP1 (bottom of Figure 17). However, when consid-

ering all metrics, we observe that not all metrics can be optimized

to their full extend at the same time. For example, a reduction of

traffic in the order of 15% would actually increase the utilization

on the highest loaded link by 60%. This indicates that the opti-

mization function employed by CaTE needs to be carefully chosen

to target the most important metrics when deploying CaTE inside

a network. Nonetheless, if minimizing the maximum link utiliza-

tion is chosen as the optimization function for CaTE, benefits in

all metrics can be observed.

Internet applications such as Netflix are in a position to negotiate

how they should be deployed in order to improve end-user experi-

ence and not disturb the operation of ISPs. CaTE can be used to

identify the best peering points between the CPs that deliver Netflix

traffic and the ISPs that receive its traffic. In addition, ISPs might

offer better peering prices if the CPs hosting Netflix are willing to

provide a higher diversity in the locations from which the traffic can

be obtained. This would lead to a win-win situation where Netflix

can offer better service to its users, the CPs achieve reduced pricing

on their peering agreements, and ISPs can compensate the reduced

peering revenue through more efficient operations.