scieee Science in your language
[en] (orig)
Power-Aware Online File Allocation
in Dynamic Networks
Dissertation
by
Jan Mehler
Faculty of Computer Science, Electrical Engineering and Mathematics
Department of Computer Science and Heinz Nixdorf Institute
University of Paderborn, Germany
July 2010
ii
Zusammenfassung
Sowohl die Vernetzung von mobilen drahtlosen Ger¨
aten wie Smartphones und
PDAs als auch die Verbreitung von Sensornetzwerken nimmt zur Zeit stark zu.
Eine wesentliche Anforderung an solche mobilen ad hoc Netzwerke besteht dar-
in, den Netzwerkknoten eine gemeinsame Nutzung von Daten zu erm¨
oglichen.
Beim von Bartal eingef¨
uhrten File Allocation Problem hat ein Datenverwaltungs-
system die M¨
oglichkeit nach Bedarf beliebig viele Kopien eines Datums auf den
Knoten des Netzwerks zu erzeugen und auch wieder zu l¨
oschen. Da die Knoten
eines mobilen ad hoc Netzwerks in der Regel nur eine stark beschr¨
ankte Energie-
reserve besitzen, besteht unser Ziel darin Algorithmen zu entwickeln, die den bei
der Bedienung einer Folge von Lese- und Schreibanfragen der Netzwerkknoten
anfallenden Energiebedarf, m¨
oglichst gering halten. Um dies zu erreichen muss
ein Algorithmus Kopien so im Netzwerk platzieren, dass sie zwar m¨
oglichst nahe
an den zugreifenden Knoten liegen, aber gleichzeitig eine Aktualisierung aller
Kopien nicht zu teuer wird. Wir verallgemeinern das File Allocation Problem von
Bartal auf Netzwerke, die sich mit der Zeit ver¨
andern. Dabei besteht eine wesent-
liche Herausforderung darin, dass weder bekannt ist welche Anfragen in Zukunft
gestellt werden noch wie sich das Netzwerk ver¨
andern wird. Wir untersuchen die
Qualit¨
at verschiedener online Algorithmen f¨
ur das File Allocation Problem in dy-
namischen Netzwerken sowohl theoretisch als auch mittels simulationsbasierter
Experimente.
Reviewers:
Prof. Dr. Friedhelm Meyer auf der Heide, University of Paderborn, Germany
Prof. Dr. Christian Scheideler, University of Paderborn, Germany
Acknowledgments
First of all, I would like to thank my supervisor Professor Friedhelm Meyer
auf der Heide for giving me the opportunity to write this thesis and for his
persistent optimism and encouragements. I especially appreciate that he gave me
the freedom to do my work in my own style and pace.
I want to thank all the members of the working group “Algorithms and Com-
plexity” I encountered during my time in Paderborn for providing a great working
environment. I will miss especially the enjoyable lunch breaks.
In particular, I would like to thank Bastian Degener for several good advices
and for making my life more adventurous. Furthermore, I want to thank Peter
Pietrzyk for proofreading my dissertation and especially for his everlasting moral
support.
Paderborn, July 2010 Jan Mehler
Contents
1 Introduction 1
1.1 The file allocation problem in static networks . . . . . . . . . . . . . 3
1.2 Competitiveanalysis........................... 3
1.3 Dynamicnetworks ............................ 5
1.4 Relatedwork................................ 7
1.5 Ourcontribution ............................. 10
2 Limitations of Extension to Dynamic Networks 13
2.1 Ourmodel................................. 13
2.2 Lowerbound ............................... 14
2.3 Conclusion................................. 16
3 File Allocation with Step Costs 17
3.1 File allocation in a dynamic star network . . . . . . . . . . . . . . . 19
3.1.1 Ourmodel............................. 19
3.1.2 Demand-driven algorithms . . . . . . . . . . . . . . . . . . . 21
3.1.2.1 Lowerbound...................... 21
3.1.2.2 Algorithm Follow ................... 23
3.1.2.3 Algorithm Count ................... 26
3.1.2.4 Algorithm RandomizedFollow ........... 33
3.1.3 Lower bound for non-demand-driven algorithms . . . . . . 37
3.2 File allocation in a dynamic tree network . . . . . . . . . . . . . . . 40
3.2.1 Ourmodel............................. 40
3.2.2 Algorithm RandomizedTree .................. 42
3.3 Conclusion and open problems . . . . . . . . . . . . . . . . . . . . . 51
v
vi Contents
4 Simulation Based Evaluation of File Allocation with Step Costs 55
4.1 Evaluated file allocation algorithms . . . . . . . . . . . . . . . . . . 56
4.2 Simulation environment . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3 Mobilitymodel .............................. 60
4.4 Experiments................................ 62
4.5 Results ................................... 63
4.6 Conclusion................................. 70
5 File Leasing 73
5.1 Ourmodel................................. 74
5.2 Lowerbound ............................... 77
5.3 Algorithms................................. 79
5.4 Conclusion and open problems . . . . . . . . . . . . . . . . . . . . . 84
Bibliography 87
Notation
Frequently Used Variables and Notations
G=(V,E) A (weighted) graph consisting of nodes Vand edges EV2.
i,j,kNodes of a network.
cCenter node of a star network.
dt(e)Weight of edge eat time t.
di
tDistance of node ito the center cat time tin a star network.
dti,jThe distance between node iand node jat time t.
dt(R,S)The distance between RVand SVat time t.
ST(R)A minimum Steiner tree for RV.
stt(R)The weight of a minimum Steiner tree for RVat time t.
DFile size.
pStand-by costs of a participating node in one step.
LPeriod of validity of a lease.
δMaximum change of an edge weight in one step.
σ=(dt,at)tAn input sequence.
atData access request in step t.
Alg A (randomized) algorithm.
Det A deterministic algorithm.
Adv An adversary.
Opt An optimal offline algorithm.
Rt
Alg Replica set of algorithm Alg at the end of step t.
CAlg(σ) The costs incurred by Alg on serving input sequence σ.
vii
Chapter 1
Introduction
The availability of lightweight mobile devices equipped with wireless transceivers
significantly increases since the beginning of the 21st century. The first generation
of these devices could not communicate directly with each other. They could
only communicate with stationary base stations which were connected to a wired
network. So, they were completely dependent on a wired communication infras-
tructure. By now, more and more devices are able to communicate directly with
other devices within their communication range. This allows to form widespread
networks without the necessity of a costly and inflexible stationary infrastructure.
Such networks are called mobile ad hoc networks (MANET).
A MANET consists of a set of mobile wireless nodes equipped with wireless
(e.g. radio or infrared) transceivers [CM99]. Examples for such networks are ad
hoc networks formed by wireless gadgets (e.g. PDAs, netbooks, mobile phones)
or wireless sensor networks. A wireless sensor network consists of a large num-
ber of spatially distributed autonomous devices which gather data from their
environment and intelligently forward relevant data for analysis [ASSC02].
Common to all MANETs is that every node has its own power source and
that the transmission range of the nodes is limited [CM99]. In most cases power
will be provided by a battery with a limited amount of energy. Therefore, the
scarcity of energy constitutes a major challenge for the operation of MANETs. In
the majority of cases the wireless communication constitutes a significant part of
the energy consumption of the wireless nodes. The limited transmission range
implies that data transfers necessarily have to use multi-hop routing paths. A data
transfer between two nodes therefore induces not only a power consumption in
the sending respectively receiving nodes but also in every hop of the used path.
Thus, every service provided in a MANET should try to minimize the amount of
transferred data.
We investigate the problem of providing access to a shared file (e.g. a database)
1
2 Introduction
used relay
unused relay
participating node
Figure 1.1: A set of nodes shares a file in a MANET. The participating nodes are
interconnected via paths of relays.
to a subset of the nodes of a MANET. We call this subset of nodes the set of
participating nodes. The file consists of a fixed number of data units and can be
stored on any non-empty subset of the participating nodes. All the participating
nodes of the MANET can read and modify the file. Every such access affects
only a single unit of the file. In order to avoid data transfers, an algorithm can
create/delete additional copies of the file and place them on the participating
nodes. A read request is fulfilled either by a copy of the file which was prior
placed on the requesting node or by a data transfer from one of the available
copies. A write request needs to update all the copies of the file. Our goal is to
minimize the overall energy consumption of the nodes of the MANET induced
by the data management system.
The nodes of the MANET can have two different roles in the data management
system (cf. Figure 1.1): participating nodes and relays. The participating nodes
are interested in the contents of the shared file. They can issue read and write
requests to the file and can hold a copy of the file. We assume that during a
run of the data management system the set of participating nodes is fixed.1All
the nodes of the MANET (including the participating nodes) are relays. They are
used for data transfers between the participating nodes. Which relays are actually
used is determined by the routing protocol of the MANET. In this thesis we do
not deal with a concrete routing protocol. We assume that an arbitrary routing
protocol, which is capable of routing messages between any two connected nodes,
is implemented in the MANET (e.g. LLS [ADM04] and MLS [FW06]). Relays are
typically not aware of their involvement in the data management system.
The described scenario induces two major challenges for the design of data
1Our results can easily be adapted to the case that mobile nodes can join and leave the set of
participating nodes.
1.1 The file allocation problem in static networks 3
management strategies. Firstly, a data management strategy has to decide online
where and when to place copies, i.e. without knowing what requests are issued
in the future. Secondly, it has to take the mobility of the nodes of the MANET
into account.
1.1 The file allocation problem in static networks
The file allocation problem in static networks introduced by Bartal et al. [BFR95]
is defined as follows. A single indivisible file consisting of several data units is
shared by a set of nodes (e.g. processors, computers) which are interconnected by
a static network, i.e. neither the topology nor the edge weights change over time.
Each node can access (read/write) the file and can hold a copy of the file. The input
of the file allocation problem consists of a sequence of read/write accesses issued
by the nodes of the network. It is assumed that these accesses arrive sequential
and that each access affects only a single unit of the file. If a node issues a read
request, the requested data has to be send from one of the nodes holding a copy
to the requesting node. Thus, the incurred cost is defined by the distance of the
reading node to the closest copy of the file. If a node modifies the file, all copies
have to be updated. Thus, the cost of a write request is the weight of a minimum
Steiner tree spanning all the copies and the writing node. In order to minimize
the overall costs, an algorithm can create and delete copies between consecutive
requests. Creating a new copy means that the whole file has to be transferred from
a node holding a copy to the destination of the copy. This incurs a cost of Dtimes
the length of a shortest path between the source and the destination. Deleting a
copy is only allowed if at least one copy of the file remains. Hence the deletion
of a copy needs an understanding between the node whose copy shall be deleted
and at least one other node holding a copy. This implies a cost of the distance of
the passing copy to its nearest neighbor copy.2
A simple reduction of the minimum Steiner tree problem shows that the offline
version of the file allocation problem is NP-hard.
1.2 Competitive analysis
In practice the input of the file allocation problem is not given at once but arises
step-by-step. Furthermore, the decision which action should be performed has
to be made immediately and can not be deferred until the whole input is known.
Problems with these characteristics are called online problems. An algorithm for
2Our model differs here from the model of Bartal [Bar94] where deletions are free.
4 Introduction
an online problem which considers the immediate execution of actions in every
step based only on prior requests is called online algorithm. There are lots of online
problems considered in the literature. Most prominently are the paging [ST85],
list accessing [ST85, AW98] and k-server [MMS88] problems.
The most common method to measure the performance of online algorithms is
the competitive analysis [ST85, KMRS88, BEY98]. It consists of a comparison of an
online algorithm to an optimal offline algorithm which knows the whole input
beforehand. Formally, let Alg be a (randomized) online algorithm and let Opt be
an optimal offline algorithm. Furthermore, let CAlg(σ) resp. COpt(σ) denote the
costs which Alg resp. Opt incur while serving input sequence σ. Alg is called
γ-competitive if an α0 exists such that for all input sequences σ
E[CAlg(σ)]γ·E[COpt(σ)]+α
holds. The expectation is thereby taken over the random bits of Alg. If this in-
equality is even valid for α=0, then Alg is called strictly γ-competitive. Since the
competitive ratio of an online algorithm is a worst-case performance measure, mea-
surements on practical inputs often show a better performance than the theoretical
analysis suggests [AL99].
A common view on competitive analysis is to assume a malicious adversary
Adv which tries to fool an online algorithm Alg. Adv therefore generates both
an input sequence σAlg which is costly for Alg and a solution for σAlg which
causes only low costs. In the case of randomized online algorithms three types
of adversaries are commonly used [RS94, BDBK+94, BEY98]. They differ in the
knowledge of the actual actions of Alg and in the point of time when they have
to construct their solution. An oblivious adversary has to generate σAlg and an
optimal offline solution before Alg is started. Both the adaptive-online and the
adaptive-offline adversaries create the input interleaved with the execution of Alg.
In every step they choose the next request based on the knowledge of the actions
Alg has performed in the past. The adaptive-offline adversary creates an optimal
offline solution after the whole input is fixed. The adaptive-online algorithm
creates its solution step-by-step like the online algorithm. It has to decide which
actions to perform right after it fixes a request, but before it sees Alg’s reaction on
this request.
The adaptive-offline adversary is more powerful than the adaptive-online ad-
versary which in turn is more powerful than the oblivious adversary [BDBK+94].
1.3 Dynamic networks 5
1.3 Dynamic networks
There are different reasons why the properties of a network can change. We dis-
cuss here the most common reasons for dynamics in different types of networks.
A wired network, which exclusively executes a single application, can change
over time because of link respectively node failures or because of intended topol-
ogy changes. In all these cases the routing paths through the network have to be
adapted to the new topology. While these changes normally occur only rarely,
their effect on the length of the routing paths can be drastic. Imagine for example
a ring network consisting of nnodes. When a single link fails the length of the
path between its two adjacent nodes rises from 1 to n1. But since those changes
occur only rarely, a manual adaption of the file allocation strategy can be feasible.
The situation is more complicated if the wired network is not used exclusively
by one application, e.g. the application is executed over the Internet or a shared
LAN/MAN/WAN. Then attributes like the available bandwidth or the latency of
a link changes with the amount of data transferred over it. These changes usually
do not occur abruptly and are less drastic than a topology change.
In wireless networks mobility of the nodes constitutes another source for dy-
namics. In a MANET the topology of the network is defined by the transmission
ranges of the nodes. Whenever two nodes can communicate with each other this
constitute a link of the network. Since the nodes can move arbitrarily the topology
of the network changes continuously. Furthermore, failures of nodes occur more
often than in wired networks since the nodes mostly depend on finite energy
supplies and their environment is generally unsafer. This leads to ever changing
routing paths in th network.
Especially in the field of peer-to-peer systems, the usage of overlay networks
is wide spread. An overlay network is a logical network which is build on top
of another network. The topology of an overlay network is mostly independent
of the underlying network and can change over time. Reasons for changes of
an overlay network are entering respectively leaving nodes, and measures for
perpetuating certain characteristics of the topology, e.g. randomness or expansion
[Mah10]. When two nodes of the overlay network communicate with each other,
the messages are routed through the overlay network by a routing protocol specific
to the overlay network. Thus, the actual routing paths in the physical network
depend additionally on the routing paths through the overlay network.
Our model of dynamic networks
We are mostly interested in file allocation in MANETs and use the energy con-
sumption used to transfer data as a distance measure. In this thesis we assume
that the file allocation is performed in an overlay network with a fixed and sim-
6 Introduction
ple topology in the MANET. This allows us to neglect the continuous topology
changes of the MANET and deal only with the changes of the distances. Further-
more, the simplicity of the topology avoids the necessity to deal with the tracking
of the positions of the files in the network, which is still an open problem.
The power consumption of a wireless transceiver can be divided into three
parts [WS04, WMY06]: the power consumption for sending data, the power
consumption for receiving data, and the power consumption for staying in idle
mode. The actual amount of energy consumed thereby mainly depends on the
characteristics of the transceiver at hand. We assume that all the mobile nodes
are equipped with identical transceivers which always use the same amount of
energy for transmissions. Thus, the topology of the MANET can be modeled as
an unit disk graph and the power consumption for sending respectively receiving
a message of unit size is a fixed constant.
The energy needed for the transfer of a data unit between two participating
nodes is defined by the used multi-hop path. We assume that this path is deter-
mined by the routing protocol of the MANET. Every relay on the path through the
MANET has to receive the message from its predecessor and send the message to
the next node of the path. The overall power consumption of the transfer is the
sum of the power consumptions of the nodes on the routing path. In this way we
can determine for any point in time and any pair of participating nodes the en-
ergy which would be consumed if a data transfer of unit size would be performed
between them. We call this value the energy-distance of the two participating
nodes.
(a) Node adisappears. The
path is fixed with node b.
(b) Node bmoves out of
the transmission range of
node a. The path is fixed
with node c.
(c) The routing protocol
switches to a new path
between node aand node
baround the obstacle.
Figure 1.2: Examples for topology adaptions. The gray paths are adapted to the
black ones.
A change of the energy-distance of a node can have different reasons (cf. Fig-
ure 1.2): If a node of a path disappears (cf. Figure 1.2(a)) or moves in such a way
that it either gets dispensable or the path gets disconnected (cf. Figure 1.2(b)), a
slight topology change is sufficient. In the case of a dispensable node the path can
be shortened by removing the node from the path. In the other cases the path can
1.4 Related work 7
be fixed by using one or more nearby nodes. More drastic changes of the energy-
distance occur if the routing protocol changes big parts of a routing path. This
could for example happen if the path circumvents an obstacle (cf. Figure 1.2(c)),
if the used routing protocol updates outdated paths, or if a path is disconnected
and no local fix is possible. The analysis of the algorithms presented in this thesis
can basically deal with all possible jumps in the energy-distance. But the higher
the changes of the energy-distances are the higher is the guaranteed competitive
ratio of the algorithms.
While our model basically does not restrict the changes of the energy-distances
of the participating nodes in a single step, it is not possible to investigate a worst
case measure with unrestricted adversaries. An unrestricted adversary could for
example choose a participating node on which the online algorithm does not hold
a copy, place itself a copy on it, move it in a single step very far away from all the
other nodes and then issue a read request from this participating node. Clearly
the adversary could cause any online algorithm an unbounded competitive ratio
this way. We therefore restrict the changes of the energy-distances an adversaries
may generate in a single step.
Definition 1. Let δ0. An adversary Adv is called δ-restricted iffAdv generates
only input sequences such that the energy-distances of every edge is always at least δand
changes at most by δin one step.
1.4 Related work
Since the placement of data in networks has a major impact on the performance
of distributed systems, a lot of research has been made on data management
strategies. We will provide here a brief overview of different data management
models and especially of the online file allocation problem.
Data management
Early data placement models used a pre-computed static placement of copies of
the file, i.e. the positions of the copies are fixed throughout the whole runtime of
the system. These models required good estimation of the request characteristics
of the nodes and could not deal with fluctuations of these characteristics. An
overview of different static file assignment problems can be found in the survey
by Dowdy and Foster [DF82].
The next step in data management was the dynamic placement of files. Here
copies of the file could be migrated, created and deleted during runtime. The
survey of Gravish and Sheng [GS90] gives an overview of different offline models
for file placement problems. Common to all these models is that the calculation of
8 Introduction
the placement of copies is performed offline. Thus, for these models it is important
to have a good estimation of the request characteristics and their fluctuations.
The arising field of online algorithms and competitive analysis [ST85, BEY98]
offered a way to get rid of the requirement of prior knowledge about the request
characteristics. In 1989 Black and Sleator introduced the replication and the mi-
gration problems [BS89]. And in 1992 Bartal, Fiat and Rabani introduced the file
allocation and distributed paging problems [BFR95]. These four related problems
are all online problems and do not assume any knowledge about the input se-
quences. A survey about these online problems can be found in [Bar98a]. We have
already described the file allocation problem in Subsection 1.1. The replication
and the migration problems are basically special cases of the file allocation prob-
lem. In the replication problem only read requests are issued and hence there is no
reason to delete any copies. The migration problem allows only one copy in the
network which can be migrated between nodes. Bartal shows in Theorem 4.2.1
of [Bar94] that the migration problem is basically equivalent to the file allocation
problem if only write requests are issued since then multiple copies provide no
benefit. The distributed paging problem is a generalization of the file allocation
problem where more than one file is present and the nodes have memories of
restricted size.
Online file allocation
Bartal, Fiat and Rabani introduced the online file allocation problem in [BFR95].
There they describe the randomized algorithm SteinerBased which works on
arbitrary networks. SteinerBased uses internally an arbitrary algorithm for the
online Steiner tree problem introduced by Imase and Waxman in [IW91]. Imase
and Waxman also introduced the greedy Steiner tree algorithm and showed that
it is Olog n-competitive. Furthermore, they showed that there are networks and
inputs such that every online algorithm is at least log n-competitive. Bartal et
al. derived a lower bound of log nfor the file allocation problem on arbitrary
networks from this lower bound for the online Steiner tree problem. Further-
more, they showed that SteinerBased is 2+3·c-competitive against adaptive
online adversaries if the used Steiner tree algorithm is strictly c-competitive. In
[ABF93] Awerbuch, Bartal and Fiat showed an improved analysis of the greedy
Steiner tree algorithm which shows that it is strictly Omin(log n,log(diameter))-
competitive. This implies that SteinerBased combined with the greedy Steiner
tree algorithm is Omin log n,log(diameter)-competitive against adaptive online
adversaries. Westbrook and Yan showed in [WY95] that SteinerBased combined
with the greedy Steiner tree algorithm is Olog log n-competitive in unweighted
networks with diameter Olog n. In [ABF93] Awerbuch, Bartal and Fiat gave
an Omin(log n,log(diameter)-competitive deterministic algorithm for arbitrary
1.4 Related work 9
networks. They also described a distributed version of the algorithm which is
Olog4n-competitive.
In [BFR95] Bartal et al. also investigated the file allocation problem on uniform
networks, i.e. complete networks with unit edge weights. An example for such
an uniform network is a set of processors with local memory which are inter-
connected via a shared bus. They introduced the deterministic algorithm Count,
which is adapted to a dynamic star network in Subsection 3.1.2.3, and showed
that it is 3-competitive as follows: They defined a biunique mapping of the overall
costs incurred by Count to the processors. For a processor pthey partitioned the
input into phases. A phase is the time between two copy deletions on node p.
Then they showed that in each phase the costs of pincurred by Count are at
most 3-times the costs of an optimal offline algorithm. This proof heavily relies
on constant edge weights during a phase and therefore can not be extended to
dynamic networks. They also provided a matching lower bound of 3 for random-
ized algorithms against adaptive online adversaries. This lower bound uses only
two adjacent nodes and therefore also holds for trees.
A common approach to handle data management problems on complex or un-
structured networks is to approximate the network by a tree embedding [Bar98b,
MadHVW97] and perform the data management on this tree. Therefore file al-
location algorithms for trees are of special interest. In [BFR95] Bartal et al. gave
a memoryless randomized algorithm for tree networks which is 3-competitive
against adaptive online adversaries. Lund, Reingold, Westbrook and Yan pre-
sented in [LRWY99] a deterministic 3-competitive algorithm based on a work
function. They also provided a lower bound of 2+1
Dagainst oblivious adversaries.
In [MVW99] Krick, Meyer auf der Heide, R¨
acke, V¨
ocking and Westermann gave
another deterministic 3-competitive algorithm which additionally is distributed.
As far as we are aware, closing the gap between the lower 2 +1
Dand upper 3
bound for randomized algorithms against oblivious adversaries is still an open
problem.
Data management in dynamic networks
In [ABF98] Awerbuch, Bartal and Fiat introduced an algorithm for file allocation
in dynamic networks. Their dynamics are limited to the ability of the nodes to
issue requests. Inactive nodes are not able to issue requests, but they can still be
used for communication. So, their network is actually static.
Bienkowski, Korzeniowski and Meyer auf der Heide introduced in [BKM04]
the dynamic page migration problem. They modeled the dynamic network by
a set of nnodes which are placed in a metric space. The nodes can change their
positions in the metric over time. The movement is restricted by the assumption
that during a fixed time interval the position can change by at most δ.
10 Introduction
In [BKM04] Bienkowski et al. gave an Omin nD,D, λ-competitive ran-
domized algorithm against adaptive online adversaries, where λis the maximum
distance allowed in the metric. There they also provided a matching lower bound
of nD,D, λ. In [BDK05] Bienkowski, Dynia and Korzeniowski presented a
deterministic OnD-competitive algorithm. They also provided a lower bound
of pDlog n,D2/3, λfor randomized algorithms and a ODlog n-competitive
randomized algorithm against oblivious adversaries. Bienkowski and Byrka im-
proved this algorithm in [BJ05] to a OpDlog n-competitive algorithm against
oblivious adversaries.
In [BK05, BKM04] Bienkowski and Korzeniowski investigated the dynamic
page migration under Brownian motion. In this scenario the movement of the
nodes is modeled by random walks on a ring. They gave a deterministic algorithm
which is Omin B,qD
B,n·polylog(D,B,n)-competitive with high probability
on rings with diameter B3
D.
In [Bie05a] Bienkowski reduced the power of the adversary by taking away its
ability to generate adversarial requests. Instead, the adversary has to choose a
probability distribution π:{1,...,n} [0,1] which is used to generate randomly
a request in every step. They gave a deterministic algorithm and show that it
is strictly O(1)-competitive with probability 1 O(Dγ)for any constant γif the
input sequence is sufficiently long. Furthermore, they showed that the expected
competitive ratio is also constant.
A complete overview of the results about the dynamic page migration problem
can be found in [Bie05b, BBKadH09, BM05]. As far as we know these are the only
theoretical results for online data management problems in dynamic networks
known today.
1.5 Our contribution
We introduce three different models of the file allocation problem in dynamic
networks. The first model is a straight forward extension of the file allocation
problem to dynamic networks. The second model comprehends stand-by power
consumptions of the participating nodes. Furthermore, we introduce an overlay
network which is used for the communication between the participating nodes.
In the third model the copies of a file are not permanent but are leased for a fixed
duration.
Chapter 2 gives a straightforward generalization of the file allocation problem
to dynamic networks. The only difference between this model and the classic one
described in Section 1.1 is that the edge weights can slightly change in every step.
1.5 Our contribution 11
We show that in this model no competitive online algorithms are possible even
for networks consisting of two nodes. Furthermore, we discuss in detail that the
main difficulty of an online algorithm for the file allocation problem in dynamic
networks is, that it can hardly deduce from an adversarial input sequence where
to place copies.
In Chapter 3 we extend the straightforward model by extra costs p, which are in-
curred by every participating node in every step of the input sequence. These costs
can for example be motivated by unavoidable stand-by costs of the transceivers
of the participating nodes. We investigate two scenarios of file allocation systems
in a MANET. The two scenarios mainly differ in the usage of overlay networks
with different topologies.
In the first scenario the participating nodes are connected to a stationary server
which holds a primary copy of the file. This scenario implies that the par-
ticipating nodes and the server form an overlay network with a star topology
where the center node always holds a copy. We analyse the performance of the
two deterministic algorithms Follow and Count and the randomized algorithm
RandomizedFollow and show that all three are OD
pδ-competitive. We give a
matching lower bound of 1 +D
pδfor the natural class of “demand-driven” online
algorithms which basically create only copies on participating nodes which ac-
cessed the file recently. Furthermore, we give a general lower bound of D
pδ
lgD
pδ!
for randomized algorithms against oblivious adversaries. These results have
previously been published in [MM09].
In the second scenario, we assume that the participating nodes of the MANET
form an overlay network with a tree topology. We analyse the randomized online
algorithm RandomizedTree and show that it is max 3,3D+3
p11
nδ-competitive
against adaptive-online adversaries. This is again optimal for demand-driven
algorithms since the lower bound for star networks holds also for trees.
Chapter 4 presents the description and results of simulations of the two algo-
rithms Follow and Count for the star topology. We therefor simulated a MANET
using the sensor network simulator Shawn [KPB+05, FKFP07]. The file is primar-
ily stored on a stationary server and some of the mobile nodes access it. Because
the communication range of the mobile nodes is limited, they have to commu-
nicate via multi-hop paths with the server. We used the “mobility model based
on social network theory” from [MM07] to generate the movement of the nodes.
The participating nodes generate file accesses stochastically and independent of
each other. The simulations show that the competitive ratio for realistic inputs is
much lower than the worst-case analysis suggests. Especially, the performance of
Count proved to be reasonable good for practical application.
In Chapter 5 we introduce another model for file allocation in dynamic net-
12 Introduction
works. We are again looking at a star topology where the center node always
holds a copy. But here we do not assume step costs of the participating nodes.
Instead, we assume that a copy placed on a participating node is only leased for a
fixed number of Lsteps. When this time expires, the participating node can either
drop the copy or renew the lease for another Lsteps. Dropping the copy is free,
but a renewal costs the current distance of the participating node to the center. The
concept of leases is especially able to reduce the duration of inconsistencies which
can occur when a participating node holding a copy is temporarily not connected.
We give a lower bound of (L)for the competitive ratio of randomized algorithms
against oblivious adversaries. Furthermore, we analyse two simple deterministic
algorithms which can be combined to a O(LD)-competitive algorithm.
Chapter 2
Limitations of Extension to
Dynamic Networks
In this chapter, we investigate a straightforward extension of the file allocation
problem to dynamic networks. We consider a set of participating nodes in a
MANET which is interconnected via an overlay network with an arbitrary but
fixed topology. Thus, our model is basically the same as the classical model
for static networks. The only difference is that the weights of the edges may
change over time. We show in Section 2.2 that there can be no competitive
online algorithms against an adversary which controls both the edge weights and
the request sequence. This means that a straightforward generalization of the
online file allocation problem to dynamic networks is not suitable. This stands in
contrast to the online page migration problem where the (nearly) straightforward
generalization of Bienkowski [Bie05b] was very fruitful.
2.1 Our model
The file allocation problem in a dynamic network is defined as follows: We are looking
at a connected network G=({1,...,n},E) of nnodes. All the nodes of the network
can access an indivisible file which initially is stored on an arbitrary node s
{1,...,n}. During the runtime of the system, an algorithm Alg can arbitrarily
place copies of the file on the nodes. The replica set Rt
Alg {1,...,n}is defined as
the set of nodes where algorithm Alg has a copy at the end of step t.
The input consists of the initial edge weights d0and a finite sequence σof pairs
(dt,at), 1 t |σ|, of a weight-function and an access request. It is presented to
the online algorithm in equidistant discrete time steps. The function dt:ERn
>0
defines the weights of the edges in step t. The access request at {−,ri,wi|
i {1,...,n}} either indicates that no access occurs () or a read (r) respectively
13
14 Limitations of Extension to Dynamic Networks
a write (w) access is issued at node i. Every step tis divided into the following
three phases, which are executed strictly in the given order: First the distances of
the edges are updated to dt, then request atis served, and finally, the algorithm
may replicate and/or delete copies.
We consider the shared file as an indivisible container of a fixed number of data
units of uniform size. A single read respectively write access always affects just
one unit of the file. While a read request can be fulfilled by a single copy, a write
request has to update all the copies. Every transfer of such a data unit costs the
length of the used path. The costs CAlg(σt) incurred by Alg in step tare composed
of two components: the transfer costs for fulfilling the data access and the transfer
costs for creating and deleting copies. The costs CAlg(at) for serving a data access
from i {1,...,n}are naturally defined as
Ct
Alg() :=0Ct
Alg(ri) :=dti,Rt1
AlgCt
Alg(wi) :=sttRt1
Alg {i}
A replication of the file from node ito node jcosts D·dti,j, where D1 is a
fixed constant depending on the size of the file. The deletion of a copy can either
be free as in the classical model of Bartal [BFR92] or costs the distance to the next
copy as in the models in the following chapters. The lower bound in the following
section works for both variants. We denote by CAlg(σ) :=|σ|
P
t=1
CAlg(σt) the overall
costs incurred by algorithm Alg while serving input σ.
2.2 Lower bound
Here, we will give a simple proof that no deterministic algorithm can be competi-
tive in this model. The main idea of this proof can also be extended to randomized
algorithms against oblivious adversaries as in [BBKadH09]. The impossibility of
competitive randomized algorithms against oblivious adversaries in this model
is also implicitly shown by Theorem 3.9 (p=0) and Theorem 5.3 (L=).
Theorem 2.1. For every connected network G =({1,...,n},E)with n 2, no deter-
ministic online algorithm for the file allocation problem in a dynamic network can be
competitive.
Proof. Let Det be an arbitrary deterministic online algorithm for the file allocation
problem in a dynamic network and let i,j {1,...,n}be two arbitrary nodes in
G. We describe a δ-restricted adversary Adv generating an input sequence which
consists of arbitrary many phases τ. We show for every phase that CDet(τ)
D
D+1γ·CAdv(τ) holds for any γN>0.
2.2 Lower bound 15
We assume that the distance of ito jis initially δand that the start copy is on
i. If this is not the case the adversary can establish this situation by moving iand
jto distance δ, changing RAdv to {i}and forcing Det to do the same as described
below. This would incur Adv at most a constant cost.
Figure 2.1: A single phase of the lower bound.
Every phase τ(see Figure 2.1) starts with a distance of δbetween iand j.
Furthermore, both Det and Adv solely have a copy on node i. A phase consists
of an expansion-stage, an optional request-stage and a contraction-stage. In the
expansion-stage the adversary moves iand japart with speed δ, i.e. di,j=t·δin
step tof the expansion-stage, and issues read requests from i. In the request-stage
the nodes do not move and jissues read requests for Dsteps. In the contraction-
stage iand jmove back to their initial distance of δand no requests are issued.
If Det creates a copy on jin step sγof the expansion-stage, Adv immediately
changes to the contraction-stage. This means that Det incurs at least a cost of
D·sδfor creating the copy on j. Adv does nothing during the phase. And since
there were only read requests from iand Adv permanently had a copy on i, Adv
does not incur any costs at all. Thus,
CDet(τ)D·sδ > 0=γ·CAdv(τ)
Otherwise, iand jreached distance γδ and Det did not place a copy on j. Then
Adv starts the request-stage, i.e. issues read request from jat distance γδ. Since
Det has no copy on jat the beginning of the request-stage, it either has to fulfill
these requests by transferring data for every request or to create a copy on j. In
16 Limitations of Extension to Dynamic Networks
both cases the costs of Det are at least D·γδ. Adv on the other hand knows about
the upcoming request-stage beforehand and therefore creates a copy on jin the
first step of the expansion-stage which costs Dδ. It deletes this copy right after
the contraction stage for δ. Thus depending on the chosen cost model, the costs
of Adv are Dδif deletions are free and (D+1)δotherwise. This results in
CDet(τ)D·γδ D
D+1γ·CAdv(τ)
Thus, for every phase CDet(τ)D
D+1γ·CAdv(τ) holds. Between two consecutive
phases Adv forces Det to store an exclusive copy on node i. Adv does this by
consecutively issuing write requests from i. As long as Det has copies on any
other node than i, it has to pay for these copies. Adv on the other hand avoids any
costs by deleting a possibly existing copy on j. Thus, Det has to delete eventually
all copies on other nodes than ior will incur an arbitrary high cost, what implies
that it can not be competitive.
2.3 Conclusion
The given lower bound shows that the file allocation problem is more difficult to
handle online than the page migration problem. The reason for this is that the
adversary can hide its own configuration very well. When an adversary issues
a long series of requests from a fixed node, the online algorithm knows that the
adversary has a copy of the file near that node. Since in the page migration prob-
lem there is always only one copy, the online algorithm has a good approximation
of the configuration of the adversary. In the file allocation problem on the other
hand, the adversary could have additional copies. But there is no clue for the
online algorithm how many additional copies exist and where they are placed.
The lack of information about the configuration of the adversary makes the file
allocation problem very difficult for online algorithms. A similar lack of informa-
tion would occur in the page migration problem if the model would not force the
adversary to issue a request in every step.
In the following chapters we give and analyse two variants of the file allocation
problem in dynamic networks which lessen the described difficulty. The version
in Chapter 3 introduces unavoidable step costs and the version in Chapter 5 adds
an expiry date to copies. Both variants prevent that the adversary moves hidden
copies far away from the center of requests for free. This makes competitive online
algorithms possible.
Chapter 3
File Allocation with Step Costs
In this chapter we introduce step costs for the participating nodes to the file
allocation problem in dynamic networks. We assume that the transceiver of
every participating node consumes a constant stand-by power pin every step.
This stand-by power consumptions imply unavoidable costs for the adversary in
every step. While this does not prevent an adversary to cheaply create copies
and hide them from the online algorithm as seen in the last chapter, it means that
an adversary can not move such hidden copies far away from the current center
of requests for free anymore. As we will see, these stand-by costs allow online
algorithms to be competitive.
The definition of energy-distances implies that, even given a fixed maximum
speed of the mobile nodes, the energy-distance of a participating node can change
drastically during a single step (cf. Figure 1.2(c)). Therefore, we try to relate the
costs of our online algorithms to the costs of an optimal offline algorithm Opt in
a way that takes the changes of energy-distances into account. Here, we measure
the quality of an online algorithm by the following refinement of the competitive
ratio:
Definition 2. Let Alg be any online algorithm and Opt be an optimal offline algorithm.
Alg has amortized competitive ratio γ:RR1if an α0exists such that for any
input sequence σ
|σ|
X
t=1
CAlg(σt)|σ|
X
t=1
γ(¯
δt)·COpt(σt)+α
holds, where ¯
δtis the average change of the edge weights of the network in step t.
We will examine two different overlay networks with fixed topologies. The
first one uses a star topology where the center node holds a primary copy and the
17
18 File Allocation with Step Costs
used relay
unused relay
participating node
Figure 3.1: A stationary server provides a shared file to the members of a MANET.
Every participating node is connected to one access point via a path of relays.
second one is a tree topology. Both topologies have in common that the Steiner
tree for any subset of nodes is unique and easy to find.
The star topology arises from the following scenario: A stationary server is
connected via one or more stationary access points to a mobile ad hoc network
(see Figure 3.1). On this server a primary copy of the file is stored at all times. The
participating nodes need secured access to the file. The wireless communication
between the server and the participating nodes therefore is cryptographically
authenticated and secured. This induces that all the communication has to involve
the server, i.e. there are no direct data exchanges between two participating nodes.
A file allocation algorithm can create copies of the file on the participating nodes
which can then be used to fulfill read requests from this participating node.
We describe and analyse the deterministic algorithms Count and Follow and
the randomized algorithm RandomizedFollow. They are all OD
pδ-competitive
against δ-restricted adversaries. Common to all three algorithms is that they create
copies only on a participating node right after it issued a request. Intuitively,
this demand-driven behavior seems reasonable. In Subsection 3.1.2 we give a
more general definition of demand-driven algorithms which probably includes
all practically useful online algorithms. We give a very strong lower bound for
those demand-driven algorithms of 1 +D
pδ.
In Subsection 3.1.3 we show a lower bound of 1 +
D
pδ
8l1+D
pδm for general online
algorithms. This lower bound holds especially for algorithms which are not
demand-driven.
The tree topology investigated in Section 3.2 is a first attempt to perform file
allocation in a serverless MANET. Figure 3.2 gives an example of such a tree
in a MANET. The participating nodes thereby form an overlay network with a
tree topology. Two participating nodes can only communicate with each other
3.1 File allocation in a dynamic star network 19
used relay
unused relay
participating node
Figure 3.2: A file is shared on a tree which is embedded into a MANET. Every
participating node is connected to its tree neighbors via a path of relays.
via a path in the overlay network. The simplicity of the tree topology avoids the
necessity of a data tracking service which is able to find approximations of nearest
copies for read requests and minimum Steiner trees for updates.
In Subsection 3.2.2 we describe and analyse the memoryless randomized al-
gorithm RandomizedTree on a dynamic tree and show that it is also OD
pδ-
competitive.
3.1 File allocation in a dynamic star network
In this section we introduce and analyse our model for the file allocation problem
with stand-by costs in a dynamic star network.
3.1.1 Our model
In order to determine the costs of a data management strategy in step tfor the
scenario described above, we only have to know the following information about
the configuration: the current replica set, i.e. the set of participating nodes holding
a copy of the file, and the vector of current distances dt=[d1
t,...,dn
t], where di
t
denotes the energy-distance from participating node ito the server in step t.
Such a configuration can be described via a star network: the center of the star
represents the server, the nperipheral nodes represent the participating nodes and
the distances between the peripheral nodes and the center represent the current
energy-distances of the participating nodes to the servers.
The file allocation problem in a dynamic star network (Fads)is defined as follows:
We are given a star network consisting of a center node c(the server) and n
peripheral nodes 1,...,n. All the nodes of the star can access an indivisible file
20 File Allocation with Step Costs
which initially is stored only on c. During the runtime of the system, an algorithm
Alg can arbitrarily place copies of the file on the peripheral nodes. The center
node always holds a copy. The replica set Rt
Alg {c,1,...,n}is defined as the set of
nodes where algorithm Alg has a copy at the end of step t.
Figure 3.3: The file allocation system is modeled as a star with dynamic edge
weights.
The input consists of the initial distances d0and a finite sequence σof pairs (dt,at),
1t |σ|, of a vector of distances and an access request. It is presented to the
online algorithm in equidistant discrete time steps. The vector dt=[d1
t,..., dn
t]
Rn
>0defines the distances of the peripheral nodes to the center in step t. The access
request at {−,ri,wi|i {c,1,...,n}}either indicates that no access occurs () or a
read (r) respectively a write (w) access is issued at node i. Every step tis divided
into the following three phases, which are executed strictly in the given order:
First the distances between the peripheral nodes and care updated to dt, then
request atis served, and finally, the algorithm may replicate and/or delete copies.
We consider the shared file as an indivisible container of a fixed number of data
units of uniform size. A single read respectively write access always affects just
one unit of the file. While a read request can be fulfilled by a single copy, a write
request has to update all the copies. Every transfer of such a data unit costs the
corresponding distance.
The costs CAlg(σt) incurred by algorithm Alg in step tare composed of three
components: a fixed stand-by cost of pfor every peripheral node, the transfer costs
for fulfilling the data access and the transfer costs for creating and deleting copies.
We will occasionally use the following split-up of the costs CAlg(σt)=np+˜
CAlg(σt),
3.1.2 Demand-driven algorithms 21
where ˜
CAlg(σt) denotes the costs incurred by Alg for serving the request and
performing actions. The costs ˜
CAlg(at) incurred by Alg for serving a data access
from i {c,1,...,n}are naturally defined by
˜
Ct
Alg() :=0,˜
Ct
Alg(ri) :=
0,if iRt1
Alg
di
t,otherwise ,˜
Ct
Alg(wi) :=X
j(Rt1
Alg∪{i})\{c}
dj
t
Notice that ˜
CAlg(rc) is always 0 and therefore a read access from the center node
is equivalent to no access (). Furthermore, ˜
CAlg(wi) is always at least di
tsince the
center always holds a copy. A replication of the file from the center to peripheral
node icosts D·di
t, where D1 is a fixed constant depending on the size of the file.
If an algorithm decides to delete the copy of peripheral node i, both the center
and ihave to know this. Thus, a communication between these two nodes has
to take place which costs di
t.1We denote by CAlg(σ) the overall costs incurred by
algorithm Alg while serving input sequence σ.
3.1.2 Demand-driven algorithms
In this subsection we investigate the natural class of demand-driven algorithms.
Definition 3. Let r 1and 0. A file allocation algorithm is called (r,)-demand-
driven if it replicates only to peripheral nodes which have a distance of at most to any
node which issued a request during the previous r steps.
As far as we know, all the investigated algorithms for the online file allocation
problem in static networks are demand-driven. Furthermore, our intuition tells
us that placing copies on a peripheral node which is far away from the current
center of requests is not a good idea, because this means that an algorithm just
speculates where future requests could occur.
3.1.2.1 Lower bound
The following lower bound for demand-driven algorithms works not only for
deterministic, but also for randomized algorithms. It consists of an input sequence
which is completely oblivious of the concrete online algorithm. In the following
subsections, we give two up to a constant factor matching OD
pδ-competitive
(1,0)-demand-driven algorithms.
Theorem 3.1. Let r 1and 0. The competitive ratio of any (r,)-demand-driven
algorithm Alg for Fads against δ-restricted oblivious adversaries is at least 1+D+1
pδ.
1This differs from the file allocation model of Bartal et al. [BFR92] where deletions are free.
22 File Allocation with Step Costs
Proof. In the following, we define a class of input sequences σTfor Alg where
the parameter T1 basically determines the length of σT. We show that for any
constant αand > 0 we can choose a Tsuch that
CAlg(σT) 1+D+1
pδ!·COpt(σT)+α. (3.1)
During the whole input sequence all the peripheral nodes perform the same
movement, i.e. di
t=dj
tfor all 1 i,jnand t0. The input sequence σT
(see Figure 3.4) starts with all peripheral nodes at distance + δand with no
copies on the peripheral nodes. σTconsists of two stages: In the expansion-
stage the peripheral nodes move away from cwith speed δfor Tsteps. During
this stage only cissues read requests. In the reading-stage each peripheral node
consecutively issues D+1 read requests. Note that σTis beside completely
independent from Alg’s behavior.
Figure 3.4: A single phase σTof the lower bound for demand-driven algorithms.
Obviously, an optimal offline algorithm Opt creates a copy on every peripheral
node at the beginning of the expansion-stage, because it knows that these nodes
will read the file at distance + Tδ. So the costs of Opt are bounded by
COpt(σT)nD( + 1·δ)
| {z }
replication
+Tnp
|{z}
expansion-stage
+n(D+1)np
| {z }
reading-stage
.
Since the distances of the peripheral nodes to the requesting node are greater
than during the expansion-stage, a (r,)-demand-driven algorithm will not
create copies on any peripheral nodes before the reading-stage. So Alg has to pay
in the reading-stage + T·δfor the first read request from a peripheral node and
3.1.2 Demand-driven algorithms 23
then either can create a copy on this peripheral node or serve all D+1 requests
by sending the requested data from c. In either case it incurs a cost of at least
(D+1)( + T·δ). Thus,
CAlg(σT)(T+n(D+1))
| {z }
# of steps
np +n(D+1)( + T·δ)
| {z }
reading
>np +n(D+1)δ·T.
Altogether we therefore have for any constant α
CAlg(σT)α
COpt(σT)>(np +n(D+1)δ)·Tα
np ·T+n2(D+1)p+nD( + 1·δ)
T→∞1+D+1
pδ.
The given input sequence can also be repeated several times like in the proof of
Theorem 2.1. Then the lower bound drops to 1 +D+1
2pδsince the adversary has to
move the peripheral nodes back to the initial position in every phase.
3.1.2.2 Algorithm Follow
Algorithm Follow2(see Algorithm 1) is based on the “dynamic caching strategy”
of [MMVW97]. It consists of the following two independent rules:
(i) If a request is issued by a peripheral node which does not hold a copy, a new
copy is placed on this peripheral node.
(ii) If a node issues a write request, all the copies on peripheral nodes other than
the requesting one are deleted.
This means that the copies basically follow the issued requests.
Theorem 3.2. Let σbe an arbitrary input sequence and Opt be an arbitrary optimal
offline algorithm for Fads. Then
CF(σ)|σ|
X
t=0
γF¯
δt·COpt(σt)
holds, where ¯
δtis the average change of the distances of the peripheral nodes in step t, i.e.
¯
δt:=1
n
n
P
i=1di
tdi
t1and
γF(¯
δt) :=max D+3,1+D+1
p¯
δt!.
2We abbreviate Follow with F in indices.
24 File Allocation with Step Costs
Algorithm 1 Follow
1: On request (dt,at)do
2: Let ibe the source of the current request at.
3: if i<RFthen
4: Replicate to i.
5: end if
6: if at=withen
7: for jRF\{c,i}do
8: Delete copy on j.
9: end for
10: end if
11: end
The function γF(¯
δt) is depicted in Figure 3.5. Note that the theorem does not
induce that CF(σt) has to be smaller than or equal to γF(¯
δt)·COpt(σt) in every step
t.γF(¯
δt) rather constitutes the amortized competitive ratio of step t.
The following corollary is an immediate consequence of Theorem 3.2 since by
definition a δ-restricted adversary can only produce input sequences such that
¯
δt=1
n
n
P
i=1di
tdi
t11
nnδ=δholds in every step t.
Corollary 3.3. Follow is strictly max D+3,1+D+1
pδ-competitive against δ-restrict-
ed adversaries.
Proof of Theorem 3.2. We will prove the theorem with the help of the potential
function Φt:=
n
P
i=1
Φi
t·di
t, where
Φi
t:=
D+1 if iROpt \RF,
2 if iRF\ROpt,
0 else.
Notice that Φt0 for 0 t |σ|and especially Φ0=0 in the natural case that
Follow and Opt start with the same replica set.
We divide every step into two parts. In the first part both Follow and Opt
pay pfor every peripheral node and the peripheral nodes move to their new
positions. In the second part the file access is served and the algorithms may
reallocate copies. We denote by Φt1
2the potential after the first part of step t, i.e.
Φt1
2=
n
P
i=1
Φi
t1·di
t.
3.1.2 Demand-driven algorithms 25
Below we show that for the first part of any step tholds:
np + Φt1
2Φt1 1+D+1
p¯
δt!np.(3.2)
And for the second part, we show:
˜
CF(σt)+ ΦtΦt1
2(D+3) ˜
COpt(σt).(3.3)
Notice that (3.3) especially says that Follow is (D+3)-competitive in a static star
network.
Together this results in
CF(σ)CF(σ)+ Φ|σ|Φ0
=|σ|
X
t=1np +˜
CF(σt)+ Φ|σ|Φ0
=|σ|
X
t=1np + Φt1
2Φt1+˜
CF(σt)+ ΦtΦt1
2
()
|σ|
X
t=1 1+D+1
p¯
δt!np +(D+3) ˜
COpt(σt)!
|σ|
X
t=1
max D+3,1+D+1
p¯
δt!COpt(σt),
where () follows from (3.2) and (3.3).
For the proof of (3.2), we observe that the costs of Follow and Opt in the first
phase of a step are always np. Furthermore, the value of Φchanges only due to
the changes of the distances since no actions are performed. Thus, we have:
np + Φt1
2Φt1=np +
n
X
i=1
Φi
t1·(di
tdi
t1)
np +
n
X
i=1
(D+1) ·di
tdi
t1
=
1+D+1
np
n
X
i=1di
tdi
t1
np
= 1+D+1
p¯
δt!np
26 File Allocation with Step Costs
For the analysis of Follow in the second part of a step, we use the following
general cost allocation scheme: we allot the transfer costs incurred by any algo-
rithm Alg as follows to the peripheral nodes. Peripheral node iis charged for all
those costs that are caused by it. More precisely, peripheral node ipays for all
costs incurred for creating and deleting copies on i. Furthermore, iis charged di
t
in step tif iissues a read request and does not hold a copy, or if iholds a copy and
a write request is issued by any node. We denote by Ci
Alg(σ) the costs incurred by
iwhile Alg serves the input sequence σ. Since all the costs incurred by Alg are
mapped to exactly one peripheral node, CAlg(σ)=
n
P
i=1
Ci
Alg(σ) holds.
We proof (3.3) by showing that for all 1 inholds
˜
Ci
F(σt)+Φi
tΦi
t1
2di
t(D+3) ·˜
Ci
Opt(σt).(3.4)
This implies
˜
CF(σt)+ ΦtΦt1
2=
n
X
i=1˜
Ci
F(σt)+Φi
tΦi
t1
2di
t
n
X
i=1
(D+3) ·˜
Ci
Opt(σt)
=(D+3) ·COpt(σt).
We show (3.4) in two steps: First we analyze the costs and the change of Φidue
to the request and the actions of Follow. Thereafter we have a look at the change
of Φicaused by an action of Opt.
The proof of (3.4) for the first step can be found in Table 3.1. In the second step,
we have always ˜
Ci
F=0. So, it is enough to look at the change of the potential Φi.
If Opt creates a new copy on peripheral node i,Φionly increases if Follow does
not hold a copy on i. Thus, ∆Φi(D+1). So we have
˜
Ci
F+ ∆Φidi
t(D+1)di
t<(D+3)Ddi
t=(D+3) ·˜
Ci
Opt.
If Opt deletes a copy on peripheral node i,Φionly increases if Follow does
hold a copy on i. Thus, ∆Φi2. So we have
˜
Ci
F+ ∆Φidi
t2di
t<(D+3)di
t=(D+3) ·˜
Ci
Opt.
3.1.2.3 Algorithm Count
Algorithm Count3(see Algorithm 2) is an adaption of the algorithm of the same
name from [BFR92]. There it is shown that Count is 3-competitive on a static
uniform network, i.e. a complete network with uniform distances.
3We abbreviate Count with C in indices.
3.1.2 Demand-driven algorithms 27
Table 3.1: Changes of replica set RFand corresponding values of Φi,˜
Ci
Fand ˜
Ci
Opt
regarding peripheral node iin the second part of a step on request at.
iRFiR0
FiROpt ˜
Ci
F(σt)∆Φidi
t˜
Ci
F(σt)+ ∆Φidi
t(D+3) ·˜
Ci
Opt(σt)
at=rjwith j,i
no no no 0 (0 0)di
t0 0
no no yes 0 ((D+1) (D+1))di
t0 0
yes yes no 0 (2 2)di
t0 0
yes yes yes 0 (0 0)di
t0 0
at=ri
no yes no (D+1)di
t(2 0)di
t(D+3)di
t(D+3)di
t
no yes yes (D+1)di
t(0 (D+1))di
t0 0
yes yes no 0 (2 2)di
t0 (D+3)di
t
yes yes yes 0 (0 0)di
t0 0
at=wjwith j,i
no no no 0 (0 0)di
t0 0
no no yes 0 ((D+1) (D+1))di
t0 (D+3)di
t
yes no no 2di
t(0 2)di
t0 0
yes no yes 2di
t((D+1) 0)di
t(D+3)di
t(D+3)di
t
at=wi
no yes no (D+1)di
t(2 0)di
t(D+3)di
t(D+3)di
t
no yes yes (D+1)di
t(0 (D+1))di
t0 (D+3)di
t
yes yes no di
t(2 2)di
tdi
t(D+3)di
t
yes yes yes di
t(0 0)di
tdi
t(D+3)di
t
Count has a counter cifor every peripheral node i {1,...,n}, which may hold
integer values ranging from 0 to D+1. On any request (read or write) from a
peripheral node, Count increases the value of the corresponding counter by 1
(but not exceeding D+1). If cireaches D+1 and idoes not already hold a copy, a
new copy is created on i. On write requests from any node (including the center),
Count additionally decreases the counters of all the other peripheral nodes by 1
(but not below 0). Thereafter all the copies on peripheral nodes with a counter
value of 0 are deleted.
We assume that initially no peripheral node holds a copy and therefore all the
counters are initialized with 0. Count can be implemented without the need of
any extra communication by maintaining a counter cion peripheral node iwhile
it holds a copy of the file and on the center node otherwise (c.f. the distributed
implementation in Section 4.1).
The change of the ranges of the counters compared to the version in [BFR92] is
solely necessary because of the different cost measure for the deletion of copies.
While in [BFR92] deletions are free, we charge a cost of di
tfor the deletion of the
copy on peripheral node i. So after a copy was deleted at least D+1 read requests
have to be issued until the creation and deletion of a new copy is amortized.
28 File Allocation with Step Costs
Algorithm 2 Count
1: Initialize ci:=0 for all i {1,...,n}.
2: On request (dt,at)do
3: Let ibe the source of the current request at.
4: if i {1,...,n}then
5: ci:=min(ci+1,D+1)
6: if (ci=D+1 and i<RC)then
7: Replicate to i.
8: end if
9: end if
10: if at=withen
11: for j {1,...,n}\{i}do
12: cj:=max(cj1,0)
13: if cj=0 and jRCthen
14: Delete the copy on j.
15: end if
16: end for
17: end if
18: end
Theorem 3.4. Let σbe an arbitrary input sequence and Opt be an arbitrary optimal
offline algorithm for Fads. Then
CC(σ)|σ|
X
t=1
γr
C¯
δt·COpt(σt)
holds for any 0r1, where ¯
δtis the average change of the distances in step t, i.e.
¯
δt:=1
n
n
P
i=1di
tdi
t1and
γr
C(¯
δt) :=max rD +(3 r),1+(3 r)D+r
p¯
δt!.
Notice that the proof of Theorem 3.4 requires that ris fixed during the whole
input sequence.
Corollary 3.5. Count is strictly γC(δ)-competitive against δ-restricted adversaries,
where
γC(δ) :=min
0r1γr
C(δ)=
3if δ2
3Dp,
p+3(D+1)δ
p+δif 2
3Dp<δ< D+1
2D+1p,
1+2D+1
pδif δD+1
2D+1p.
3.1.2 Demand-driven algorithms 29
0¯
δt
γ¡¯
δt¢
Follow
D+ 3
1 + D+1
p
¯
δt
D+2
D+1 p
Count
3
D+ 2
1 + 3D
p
¯
δt1 + 2D+1
p
¯
δt
2
3DpD+1
2D+1 p
r= 0
r=1
5
r=2
5
r=3
5
r=4
5
r= 1
lower bound
2
D+1
Figure 3.5: The amortized competitive ratios of Count for different values of
rand Follow. The gray curve gives the competitive ratio of Count against ¯
δt-
restricted adversaries. The bold curve depicts the lower bound for demand-driven
algorithms.
Figure 3.5 shows the amortized competitive ratios γr
C(¯
δt) for different values
of rcompared to the amortized competitive ratio γF(¯
δt) of Follow and the lower
bound for demand-driven algorithms. It shows that for small values of ¯
δtthe
competitive ratio of Count is optimal and the one of Follow depends on Dand
is therefore dissatisfactory. On the other hand the competitive ratio of Follow
is optimal for large ¯
δtwhile the competitive ratio of Count increases faster with
increasing ¯
δt. Thus, Follow seems to be better in scenarios with high mobility of
the peripheral nodes. The reason for Follow’s good performance for large ¯
δtis that
it handles the lower bound for demand-driven algorithms (c.f. Subsection 3.1.2.1)
in the best way a competitive demand-driven algorithm can.
Proof of Theorem 3.4. We show that the amortized competitive ratio of any step
tof any input σis smaller than or equal to max rD +(3 r),1+(3r)D+r
p¯
δtfor
an arbitrary but fixed 0 r1. The proof proceeds analogous to the proof of
Theorem 3.2. We use the potential function Φt:=
n
P
i=1
Φi
t·di
t, where
Φi
t:=
2ci
tif i<Rt1
Cand i<Rt1
Opt,
(3 r)Dci
t+rif i<Rt1
Cand iRt1
Opt,
ci
t+1 if iRt1
Cand i<Rt1
Opt,
(3 r)D2ci
t+(r+1) if iRt1
Cand iRt1
Opt.
30 File Allocation with Step Costs
Since 0 ci
tD+1 holds, we have Φi
t0 for 1 t |σ|and Φi
0=0 for the natural
case that Count and Opt start holding only a copy on the center node.
Before we proceed with the analysis of the two parts of a step, we need the
following properties of Count.
Claim 3.6.
(a) If counter ciis 0, then peripheral node i can not hold a copy.
(b) If counter ciis D +1, then peripheral node i holds a copy.
Proof. We show both properties by induction over the number of steps. Initially
all the counters are 0 and no peripheral node holds a copy. After the t-th step the
following cases have to be considered: If counter cihas not changed during step
t, both claims follow from the induction hypothesis. If counter cihas changed to
0 during step t, a node j,ihad issued a write request and thus Count ensures
in line 10 that idoes not hold a copy. If counter cihas changed to D+1 during
step t,ihad issued a request and thus Count ensures in line 6 that iholds a copy.
Otherwise ciis neither 0 nor D+1 and therefore nothing is to be shown.
iRCrange of ciiROpt Φi
no 0 ciDno 2D
no 0 ciDyes (3 r)D+r
yes 1 ciD+1 no D+2
yes 1 ciD+1 yes (3 r)D+(r1)
Table 3.2: Upper bounds of Φifor the different combinations of RC,ciand ROpt.
With the help of the above claim, we can derive Table 3.2 and hence for the first
part of any step t(movement and stand-by costs) holds
np + Φt1
2Φt1=np +
n
X
i=1
Φi
t1·(di
tdi
t1)
Table 3.2
np +
n
X
i=1
((3 r)D+r)·di
tdi
t1
=
1+((3 r)D+r)
n
P
i=1di
tdi
t1
np
np
= 1+(3 r)D+r
p¯
δt!np.(3.5)
3.1.2 Demand-driven algorithms 31
For the analysis of Count in the second part of a step (request and actions), we
use the cost allocation scheme already used in the analysis of Follow. We show
that for every peripheral node 1 inholds
˜
Ci
C(σt)+Φi
tΦi
t1
2
0di
t(rD +(3 r))·˜
Ci
Opt(σt).(3.6)
We show this in two steps: First, we analyse the costs and the change of Φidue to
the request and the actions of Count. Thereafter, we have a look at the change of
Φicaused by an action of Opt. The proof of (3.6) for the first step can be found in
Table 3.4 and the proof for the second step in Table 3.3.
Table 3.3: Changes of replica set ROpt and corresponding values of Φi,˜
Ci
Cand ˜
Ci
Opt
regarding peripheral node iin the second part of step t.
iRCciiROpt iR0
Opt
˜
Ci
C+ ∆Φidi
t(rD +(3 r))·˜
Ci
Opt
Opt creates a new copy on i.
no ci1 no yes ((3 r)D3ci+r)di
t((3 r)D3+r)di
t(rD +(3 r))·Ddi
t
yes ci0 no yes ((3 r)D3ci+r)di
t((3 r)D+r)di
t(rD +(3 r))·Ddi
t
Opt deletes copy on i.
no ciD+1 yes no ((3 r)D+ci+r)di
t(rD +3r)di
t(rD +(3 r))·di
t
yes ciDyes no ((3 r)Dci+2+r)di
t(rD r)di
t(rD +(3 r))·di
t
Together, this results in
CC(σ)CC(σ)+ Φ|σ|Φ0
=|σ|
X
t=1np +˜
CC(σt)+ Φ|σ|Φ0
=|σ|
X
t=1np + Φt1
2Φt1+˜
CC(σt)+ ΦtΦt1
2
()
|σ|
X
t=1 1+(3 r)D+r
p¯
δt!np +(rD +(3 r))·˜
COpt(σt)!
|σ|
X
t=1
max rD +(3 r),1+(3 r)D+r
p¯
δt!COpt(σt),
where () follows from (3.5) and (3.6).
32 File Allocation with Step Costs
Table 3.4: Changes of configuration RC,ciand corresponding values of ∆Φi,˜
Ci
Cand ˜
Ci
Opt regarding peripheral node i
in the second part of step ton request at.
iRCciiR0
Cci0iROpt ˜
Ci
C(σt)∆Φidi
t˜
Ci
C(σt)+ ∆Φidi
t(rD +(3 r))˜
Ci
Opt(σt)
at=rjwith j,i
no Dno cino 0 (2ci2ci)di
t0 0
no Dno ciyes 0 (((3 r)Dci+r)((3 r)Dci+r))di
t0 0
yes 1 yes cino 0 ((ci+1) (ci+1))di
t0 0
yes 1 yes ciyes 0 (((3 r)D2ci+(1 +r)) ((3 r)D2ci+(1 +r)))di
t0 0
at=ri
no D1 no ci+1 no di
t((2(ci+1)) 2ci)di
t3di
t(rD +(3 r))di
t
no Dyes D+1 no (D+1)di
t(((D+1) +1) 2D)di
t3di
t(rD +(3 r))di
t
no D1 no ci+1 yes di
t(((3 r)D(ci+1) +r)((3 r)Dci+r))di
t0 0
no Dyes D+1 yes (D+1)di
t(((3 r)D2(D+1) +(1 +r)) ((3 r)DD+r))di
t0 0
yes Dyes ci+1 no 0 (((ci+1) +1) (ci+1))di
tdi
t(rD +(3 r))di
t
yes D+1 yes D+1 no 0 (((D+1) +1) ((D+1) +1))di
t0(rD +(3 r))di
t
yes Dyes ci+1 yes 0 (((3 r)D2(ci+1) +(1 +r)) ((3 r)D2ci+(1 +r)))di
t2di
t0
yes D+1 yes D+1 yes 0 (((3 r)D2(D+1) +(1 +r)) ((3 r)D2(D+1) +(1 +r)))di
t0 0
at=wjwith j,i
no 0 no 0 no 0 (0 0)di
t0 0
n o 1 no ci1 no 0 (2(ci1) 2ci)di
t2di
t0
no 0 no 0 yes 0 (((3 r)D0+r)((3 r)D0+r))di
t0(rD +(3 r))di
t
no 1 no ci1 yes 0 (((3 r)D(ci1) +r)((3 r)Dci+r))di
tdi
t(rD +(3 r))di
t
yes 1 no 0 no 2di
t(0 (1 +1))di
t0 0
yes 2 yes ci1 no di
t(ci(ci+1))di
t0 0
yes 1 no 0 yes 2di
t(((3 r)D0+r)((3 r)D2+(1 +r)))di
t3di
t(rD +(3 r))di
t
yes 2 yes ci1 yes di
t(((3 r)D2(ci1) +(1 +r)) ((3 r)D2ci+(1 +r)))di
t3di
t(rD +(3 r))di
t
at=wi
no D1 no ci+1 no di
t(2(ci+1) 2ci)di
t3di
t(rD +(3 r))di
t
no Dyes D+1 no (D+1)di
t(((D+1) +1) 2D)di
t3di
t(rD +(3 r))di
t
no D1 no ci+1 yes di
t(((3 r)D(ci+1) +r)((3 r)Dci+r))di
t0(rD +(3 r))di
t
no Dyes D+1 yes (D+1)di
t(((3 r)D2(D+1) +(1 +r)) ((3 r)DD+r))di
t0(rD +(3 r))di
t
yes Dyes ci+1 no di
t(((ci+1) +1) (ci+1))di
t2di
t(rD +(3 r))di
t
yes D+1 yes D+1 no di
t(((D+1) +1) ((D+1) +1))di
tdi
t(rD +(3 r))di
t
yes Dyes ci+1 yes di
t(((3 r)D2(ci+1) +(1 +r)) ((3 r)D2ci+(1 +r)))di
tdi
t(rD +(3 r))di
t
yes D+1 yes D+1 yes di
t(((3 r)D2(D+1) +(1 +r)) ((3 r)D2(D+1) +(1 +r)))di
tdi
t(rD +(3 r))di
t
3.1.2 Demand-driven algorithms 33
3.1.2.4 Algorithm RandomizedFollow
Algorithm RandomizedFollow4(see Algorithm 3) is a randomized extension of
Follow. It consists of the following two independent rules:
(i) If a request is issued at a peripheral node which does not hold a copy, place
a new copy on this peripheral node with probability pr.
(ii) If a node iissues a write request, each copy on a peripheral node other than
iis deleted with probability pd.
We denote by RandomizedFollowl, with l1, the instance of RandomizedFollow
with pr:=l
D+land pd:=l
D+l.
Algorithm 3 RandomizedFollow
1: On request (dt,at)do
2: Let ibe the source of the current request at.
3: if i<RRF then
4: Replicate to iwith probability pr.
5: end if
6: if at=withen
7: for jRRF \{c,i}do
8: Delete copy on jwith probability pd.
9: end for
10: end if
11: end
Theorem 3.7. Let σbe an arbitrary input sequence and Adv be an adaptive-online
adversary for Fads. Then
ECRFl(σ)|σ|
X
t=1
γr
RFl¯
δt·E[CAdv(σt)]
holds for any 0r1
l, where ¯
δtis the average change of the distances of the peripheral
nodes in step t, i.e. ¯
δt:=1
n
n
P
i=1di
tdi
t1and
γr
RFl(¯
δt) :=max
12
l+1+rD+3r,1+1+2
lrD+11
l+r
p¯
δt
.
4We abbreviate RandomizedFollow with RF in indices.
34 File Allocation with Step Costs
Notice that the expected number of read respectively write requests until
RandomizedFollow1creates respectively deletes a copy is D+1. Thus, the behav-
ior of RandomizedFollow1is basically similar to the one of Count. And in fact,
the expected amoritzed competitive ratio max rD +3r,1+(3r)D+r
p¯
δt, 0 r1,
of RandomizedFollow1is exactly the amortized competitive ratio of Count. Fur-
thermore, if l both prand pdapproach 1 and therefore RandomizedFollow
approaches Follow. And accordingly the amortized competitive ratio approaches
max D+3,1+D+1
p¯
δt, which is the amortized competitive ratio of Follow.
Corollary 3.8. RandomizedFollowlis strictly γRFl(δ)-competitive against δ-restricted
adaptive-online adversaries, where
γRFl(δ) :=
12
l+1D+3if δl(12
l+1)D+2l
(l+2)D+l1p, (r=0)
1+((2+2
l2
l+1)D+41
l)δ
p
1+δ
p
if l(12
l+1)D+2l
(l+2)D+l1p<δ<l(12
l+1)D+2l+(D1)
(l+2)D+l1(D1) p, (0 <r<1
l)
1+1+1
lD+1δ
pif δl(12
l+1)D+2l+(D1)
(l+2)D+l1(D1) p. (r=1
l)
Proof of Theorem 3.7. We will prove the theorem with the help of the potential
function Φt:=
n
P
i=1
Φi
t·di
t, where
Φi
t:=
0 if i<RRFland i<RAdv,
1
lD+2 if iRRFland i<RAdv,
1+2
lrD+11
l+rif i<RRFland iRAdv,
1
lrD1
l+rif iRRFland iRAdv.
Notice that Φt0 for 1 t |σ|and especially Φ0=0 in the natural case that
RandomizedFollow and Adv start without copies on peripheral nodes.
We divide every step into two parts. In the first part both RandomizedFollowl
and Adv pay pfor every peripheral node and the peripheral nodes move to their
new positions. In the second part the file access is served and the algorithms may
reallocate copies. We denote by Φt1
2the potential after the first part of step t, i.e.
Φt1
2=
n
P
i=1
Φi
t1·di
t.
Below we show that for the first part of any step tholds:
np +EhΦt1
2Φt1i
1+1+2
lrD+11
l+r
p¯
δt
np.(3.7)
And for the second part, we show:
Eh˜
CRFl(σt)+ ΦtΦt1
2i12
l+1+rD+3r·Eh˜
CAdv(σt)i.(3.8)
3.1.2 Demand-driven algorithms 35
Inequality(3.8)especially saysthat RandomizedFollowlis12
l+1+rD+3r-
competitive against adaptive-online adversaries in a static star network.
Together this results in
ECRFl(σ)
ECRFl(σ)+EΦ|σ|Φ0
=E
|σ|
X
t=1np +˜
CRFl(σt)+ Φ|σ|Φ0
=E
|σ|
X
t=1np +Φt1
2Φt1+˜
CRFl(σt)+ ΦtΦt1
2
=|σ|
X
t=1np +EhΦt1
2Φt1i+Eh˜
CRFl(σt)+ ΦtΦt1
2i
()
|σ|
X
t=1
1+1+2
lrD+11
l+r
p¯
δt
np +12
l+1+rD+3rEh˜
CAdv(σt)i
|σ|
X
t=1
max
12
l+1+rD+3r,1+1+2
lrD+11
l+r
p¯
δt
E[CAdv(σt)],
where () follows from (3.7) and (3.8).
For the proof of (3.7), we observe that the costs of RandomizedFollowland Adv
in the first phase of a step are always np. Furthermore, the value of Φchanges
only due to the changes of the distances since no actions are performed. Thus, we
have:
np +EhΦt1
2Φt1i=np +E
n
X
i=1
Φi
t1·(di
tdi
t1)
np +
n
X
i=11+2
lrD+11
l+r·di
tdi
t1
=
1+1+2
lrD+11
l+r
np
n
X
i=1di
tdi
t1
np
=
1+1+2
lrD+11
l+r
p¯
δt
np
For the analysis of RandomizedFollowlin the second part of a step, we use the
cost allocation scheme already used in the analysis of Follow and Count. We
prove (3.8) by showing that for all 1 inholds
E˜
Ci
RF(σt)+Φi
tΦi
t1
2di
t12
l+1+rD+3r·Eh˜
Ci
Adv(σt)i.(3.9)
36 File Allocation with Step Costs
Table 3.5: Changes of replica set RRFland corresponding values of E h∆Φidi
ti, E h˜
Ci
RFliand ˜
Ci
Adv regarding peripheral node
iin the second part of step ton request at.
iRRFliRAdv E˜
Ci
RFl(σt)Φidi
tEhΦi0di
tiE˜
Ci
RFl(σt)+ ∆Φidi
t12
l+1+rD+3r˜
Ci
Adv(σt)
at=rjwith j,i
no no 0 0 Φidi
t0 0
no yes 0 1+2
lrD+11
l+rdi
tΦidi
t0 0
yes no 0 1
lD+2di
tΦidi
t0 0
yes yes 0 1
lrD1
l+rDΦidi
t0 0
at=ri
no no 1+l
D+lDdi
t0l
D+l1
lD+2di
t+1l
D+lΦidi
t2+l
D+l(D+1)di
t12
l+1+rD+3rdi
t
no yes 1+l
D+lDdi
t1+2
lrD+11
l+rdi
tl
D+l1
lrD1
l+rdi
t+1l
D+lΦidi
t0 0
yes no 0 1
lD+2di
tΦidi
t012
l+1+rD+3rdi
t
yes yes 0 1
lrD1
l+rDΦidi
t0 0
at=wjwith j,i
no no 0 0 Φidi
t0 0
no yes 0 1+2
lrD+11
l+rdi
tΦidi
t012
l+1+rD+3rdi
t
yes no 1+l
D+ldi
t1
lD+2di
t01l
D+lΦidi
t0 0
yes yes 1+l
D+ldi
t1
lrD1
l+rDl
D+l1+2
lrD+11
l+rdi
t+1+l
D+lDΦidi
t2+l
D+l(D+1)di
t12
l+1+rD+3rdi
t
at=wi
no no 1+l
D+lDdi
t0l
D+l1
lD+2di
t+1l
D+lΦidi
t2+l
D+l(D+1)di
t12
l+1+rD+3rdi
t
no yes 1+l
D+lDdi
t1+2
lrD+11
l+rdi
tl
D+l1
lrD1
l+rdi
t+1l
D+lΦidi
t012
l+1+rD+3rdi
t
yes no di
t1
lD+2di
tΦidi
tdi
t12
l+1+rD+3rdi
t
yes yes di
t1
lrD1
l+rDΦidi
tdi
t12
l+1+rD+3rdi
t
3.1.3 Lower bound for non-demand-driven algorithms 37
We show this in two steps: First we analyze the costs and the change of Φidue
to the request and the actions of RandomizedFollowl. Thereafter we have a look
at the change of Φicaused by an action of Adv. The proof of (3.9) for the first step
can be found in Table 3.5 and the one for the second step in Table 3.6.
Table 3.6: Changes of replica set RAdv and corresponding values of ∆Φiand ˜
Ci
Adv
regarding peripheral node iin the second part of step t.
iRRF iRAdv iR0
Adv ∆Φidi
trD +2+lrl2+1
l+1·˜
Ci
Adv
Adv creates a new copy on i.
no no yes (1 +2
lr)D+11
l+rdi
t(1 2
l+1+r)D+3r·Ddi
t
yes no yes rD 21
l+rdi
t(1 2
l+1+r)D+3r·Ddi
t
Adv deletes copy on i.
no yes no (12
l+r)D1+1
lrdi
t(1 2
l+1+r)D+3r·di
t
yes yes no rD +2+1
lrdi
t(1 2
l+1+r)D+3r·di
t
3.1.3 Lower bound for non-demand-driven algorithms
In the previous section we showed that every algorithm from the natural class of
demand-driven algorithms has a competitive ratio of D
pδand gave examples
of demand-driven algorithms with a competitive ratio of ΘD
pδ. Here we give a
slightly worse lower bound for arbitrary randomized online algorithms against δ-
restricted oblivious adversaries. Thereby, we have to regard that a given algorithm
may create a copy on a peripheral node even though this node is far away from
the recent origins of requests.
Theorem 3.9. The competitive ratio of any randomized algorithm for Fads against δ-
restricted oblivious adversaries is at least max 2+1
D,1+
D
pδ
8llg1+D
pδm!.
This theorem especially implies that there can be no competitive algorithm for
Fads if δ > 0 and p=0 as discussed in Chapter 2.
Proof. Let Alg be an arbitrary randomized algorithm for Fads. In the case that
1+
D
pδ
8llg1+D
pδm 2+1
D, the lower bound can be deduced from the lower bound for
static networks given in [LRWY99]. For the other case, we describe an adversary
Adv who constructs an input sequence consisting of an arbitrary number of
phases. We show that the expected costs of Alg are at least 1+
D
pδ
8llg1+D
pδm!-times
38 File Allocation with Step Costs
the costs of an optimal offline algorithm in every phase. We thereby upper bound
the costs of an optimal offline algorithm by describing how Adv would serve the
generated input.
Every phase starts with all the peripheral nodes at distance Sδfor some fixed
S1. During the phase all the peripheral nodes perform the same movement,
i.e. di
t=dj
tfor all 1 i,jnand t0.
Since Adv is oblivious, it does not know on which peripheral nodes Alg holds
copies, but it can compute the probability that a peripheral node holds a copy. We
denote by qi
tthe probability that peripheral node iholds a copy at the end of step
t. Furthermore, we denote by Qt:=
n
P
i=1
qi
tthe sum of these probabilities, i.e. the
expected number of copies on peripheral nodes. Below we omit the index tand
denote by qiand Qthe corresponding values at the end of the current step.
Figure 3.6: A single phase of the lower bound for general randomized algorithms.
A phase (see Figure 3.6) starts with a clearance-stage, in which Adv reduces the
expected number of copies of Alg to less than n
4. Adv achieves this by issuing
write requests from cuntil Q<n
4. Notice that Alg incurs an expected cost of at
least np +Q·Sδnp +n
4·Sδfor every such write request. Adv on the other hand
deletes all its copies on peripheral nodes at the end of each phase and therefore
pays np in every step of the clearance stage. Thus, E[CAlg](1 +S
4pδ)CAdv
holds for every step of the clearance stage. By choosing SD
2llg1+D
pδm we get
E[CAlg](1 +
D
pδ
8llg1+D
pδm)CAdv. We can therefore neglect the clearance stage in the
3.1.3 Lower bound for non-demand-driven algorithms 39
following analysis of the main part of the phase.
After the clearance-stage all the peripheral nodes move with speed δaway from
the center. This expansion is divided into l:=llg 1+D
pδmstages. Stage sstarts
at distance 2s1Sδand ends at distance (2sS1)δ. At the end of stage sAdv tests
whether Qn
4(1 +s
l). If so, Adv aborts the phase by moving the peripheral nodes
back to the initial distance Sδ. Otherwise the next stage begins. If Alg reaches the
end of stage lwithout abortion, i.e. the expected number of copies Qis lower than
n
2, Adv issues consecutively Dread requests from each of the peripheral nodes at
distance 2lSδ. Thereafter all the peripheral nodes move back to the initial distance
Sδ.
Let us first assume that the phase is aborted after stage 1 sl. Since there
is no request from a peripheral node in an aborted phase, Adv does not create
any copies. Thus, Adv pays 2(2s1)Snp for the phase. For the expected costs of
Alg we observe that Qis greater than n
4(1 +s
l) after stage s, i.e. it increased by
at least n
4
s
lsince the start of the expansion. Additionally, Qwas smaller than or
equal to n
4(1 +r
l) at the end of every stage 1 r<s, because the phase was not
aborted in stage r. Thus, Qincreased at most by n
41+r
ln
41+r1
lin stage r.
Since creating a copy gets more expensive over time, Alg would pay the least if
it created copies as early as possible. Thus, the best behavior of Alg would be
to create copies such that Qrises in the first step of stage rto n
41+r
l, where
> 0. Thus, we can lower bound the expected costs of Alg as follows:
E[CAlg]2(2s1)S
| {z }
# of steps
np +
s
X
r=1n
41+r
ln
41+r1
l·D·2r1Sδ
=2(2s1)Snp +nD
4lSδ
s
X
r=1
2r1
=2(2s1)Snp +nD
4l(2s1)Sδ
= 1+D
p
1
8lδ!·2(2s1)Snp
=
1+
D
pδ
8llg 1+D
pδm
CAdv
In the other case Qis lower than n
2after the l-th stage and hence Adv issues
consecutively Dread requests from every peripheral node. If Alg does not hold a
copy on a peripheral node iat this time, it has to either send Dsingle data units or
to create a new copy on i. Thus, Alg incurs at least a cost of D·2lSδfor peripheral
node i. This implies an expected cost of at least
n
P
i=1
(1 qi)·D2lSδ > n
2·D2lSδ
40 File Allocation with Step Costs
for serving the read requests. Adv on the other hand creates copies on all the
peripheral nodes at the first step of the expansion and deletes them at the end of
the phase. Thus, the transfer costs of Adv are n(D+1)Sδ. Altogether we have
E[CAlg]
CAdv
>2(2l1)S+nDnp +n
2D2lSδ
(2(2l1)S+nD)np +n(D+1)Sδ
=1+
n
2D2lSδn(D+1)Sδ
(2(2l1)S+nD)np +n(D+1)Sδ
S→∞1+
1
2D2lδ(D+1)δ
2(2l1)p+(D+1)δ
1+
1
2D2lδ2Dδ
2(2l1)p+2Dδ(3.10)
1+
1
2D(1 +D
pδ)δ2Dδ
2Dδ+2Dδ(3.11)
=1+
D
pδ
83
8
>1+
D
pδ
8llg 1+D
pδm,(3.12)
where (3.10) follows from D1 and (3.11) can be derived from llg 1+D
pδ.
Furthermore, (3.12) is valid for all D
pδsuch that 1 +
D
pδ
8llg1+D
pδm >2+1
D.
3.2 File allocation in a dynamic tree network
In this section we extend the technique we used for analyzing algorithms in
dynamic star networks to arbitrary dynamic tree networks. Such a dynamic tree
network consists of a tree with a fixed topology where the weights of the edges
can change over time.
3.2.1 Our model
The file allocation problem in a dynamic tree network (Fadt)is defined as follows: We
are given a tree network consisting of nnodes 1,...,nwhich are interconnected
via edges E. All the nodes of the tree can access an indivisible file which initially
is stored on an arbitrary node s {1,...,n}of the tree. During the runtime of the
system, an algorithm Alg can arbitrarily place copies of the file on the nodes. The
replica set Rt
Alg {1,...,n}is defined as the set of nodes where algorithm Alg has
3.2.1 Our model 41
a copy at the end of step t. There has to be at all times at least one copy in the tree,
i.e Rt
Alg ,for all t0.
Figure 3.7: An example of a dynamic tree with 13 nodes.
The input consists of the initial edge weight function d0and a finite sequence
σof pairs (dt,at), 1 t |σ|, of an edge weight function and an access request.
It is presented to the online algorithm in equidistant discrete time steps. The
function dt:ERn
>0defines the edge weights in step t. The access request
at {−,ri,wi|i {1,...,n}} either indicates that no access occurs () or a read (r)
respectively a write (w) access is issued at node i. Every step tis divided into the
following three phases, which are executed strictly in the given order: First the
edge weights are updated to dt, then request atis served, and finally, the algorithm
can create and/or delete copies.
We consider the shared file as an indivisible container of a fixed number of data
units of uniform size. A single read respectively write access always affects just
one unit of the file. While a read request can be fulfilled by a single copy, a write
request has to update all the copies. Every transfer of such a data unit costs the
corresponding distance.
The costs CAlg(σt) incurred by an algorithm Alg in step tare composed of
three components: a fixed stand-by cost of pfor every node, the transfer costs for
fulfilling the data access and the transfer costs for creating and deleting copies. We
will occasionally use the following split-up of the costs CAlg(σt)=np +˜
CAlg(σt),
where ˜
CAlg(σt) denotes the costs incurred by Alg for serving the request and
performing actions. The costs ˜
CAlg(at) for serving a data access from i {c,1,...,n}
42 File Allocation with Step Costs
are naturally defined by
˜
Ct
Alg() :=0,˜
Ct
Alg(ri) :=dti,Rt1
Alg˜
Ct
Alg(wi) :=sttRt1
Alg {i}
A replication of the file from node ito node jcosts D·dti,j, where D1 is a
fixed constant depending on the size of the file. If an algorithm decides to delete
the copy of a node i, it has to make sure that at least one copy remains in the
tree. Thus, a communication between iand the current replica set RAlg has to take
place, which costs dt(i,RAlg \{i}).5We denote by CAlg(σ) the overall costs incurred
by algorithm Alg while serving input sequence σ.
3.2.2 Algorithm RandomizedTree
The algorithm RandomizedTree6(see Algorithm 4) was introduced and analyzed
regarding static tree networks in [Bar94, BFR92]. We present here a version with
slightly adapted probabilities, because our model involves costs for deletions.
Notice that the set of replicas of RandomizedTree RRT is always a connected
subtree.
Algorithm 4 RandomizedTree
1: On request (dt,at)do
2: if (at=riand i<RRT)then
3: Let jRt1
RT be the nearest node to iwith a copy.
4: With probability 1
D+1place copies on the path from jto i.
5: else if (at=wi)then
6: Let jRt1
RT be the nearest node to iwith a copy.
7: With probability 1
D+1delete all copies but the one on j.
8: With probability 1
2migrate the copy from jto i.
9: end if
10: end
In the following, we abuse notation by writing e ST(R)and i ST(R)to
denote either an edge or a node of the Steiner tree ST(R). It will be clear from the
context whether an edge or a node is meant.
We need the following lemma in the analysis of RandomizedTree, which easily
follows from the uniqueness of Steiner trees in tree networks.
Lemma 3.10. Let S,T {1,...,n}, i {1,...,n}and t N. Then holds:
stt(S{i})=stt(S)+dt(i,ST(S)) .(3.13)
stt(ST)=stt(S)+stt(T)stt(ST(S)ST(T)) +dt(ST(S),ST(T)) (3.14)
5This differs from the file allocation model of Bartal et al. [BFR92] where deletions are free.
6We abbreviate RandomizedTree with RT in indices.
3.2.2 Algorithm RandomizedTree 43
The following lemma allows us to assume in the analysis of RandomizedTree
that the replica set of the adversary always constitutes a connected subtree.
Lemma 3.11. Let Alg be an arbitrary file allocation algorithm for an arbitrary dynamic
tree network ({1,...,n},E)and s {1,...,n}an arbitrary position of the initial copy.
Furthermore, let σbe an arbitrary suitable input sequence and ({s},R1
Alg,...,R|σ|
Alg)be the
sequence of replica sets produced by Alg while serving σ. Then there is an algorithm Alg
which produces the sequence of replica sets {s},STR1
Alg,...,STRσ
Algsuch that
CAlg0(σ)CAlg(σ).
Proof. The proof of this lemma is mostly equivalent to the proof of Theorem 3.1
in [LRWY99]. We therefore discuss only the differences here. Their model differs
from our model only in the stand-by costs, the deletion costs and the dynamics of
the edge weights.
The stand-by costs are equal for Alg and Alg since they are completely obliv-
ious of the behavior of an algorithm.
For the deletion costs of our model, assume that node ideletes its copy. Let jbe
the nearest copy in RAlg \{i}to i. Then Alg deletes all its copies on the path from
ito jwhich do not belong to ST(RAlg \{i}). Thus, CAlg0di,j=CAlg holds for
the deletion.
Thedynamicsof theedge weightsdo notinterferewith theprooffrom[LRWY99]
since in every step Alg transfers data only over edges which are also used by
Alg.
We can analyse the performance of RandomizedTree now.
Theorem 3.12. Let σbe an arbitrary input sequence, s {1,...,n}the initial position
of the file and Adv be an adaptive-online adversary for the file allocation problem in an
arbitrary dynamic tree ({1,...,n},E). Then
E[CRT(σ)]|σ|
X
t=1
γRT ¯
δt·E[CAdv(σt)]
holds, where ¯
δt:=1
|E|P
eEdt(e)dt1(e)is the average change of edge weights in step t
and γRT(¯
δt) :=max 3,1+3D+3
p11
n¯
δt.
Corollary 3.13. RandomizedTree is strictly max 3,1+3D+3
p11
nδ-competitive
against δ-restricted adversaries.
44 File Allocation with Step Costs
Proof of Theorem 3.12. We will regularly use in the following that the replica set RRT
of RandomizedTree constitutes always a connected subtree, i.e. ST(RRT)=RRT.
Because of Lemma 3.11 we can assume that the same holds for the adversary Adv,
i.e. ST(RAdv)=RAdv.
We show the theorem using this potential function:
Φt:=3(D+1)sttRt
RT Rt
Adv(2D+1)sttRt
RT3sttRt
Adv.(3.15)
Φ0is 0 since both algorithms start with a single copy on s. Furthermore, we have
for all possible copy sets RRT,RAdv and edge-weights:
Φt=3(D+1)stt(RRT RAdv)(2D+1)stt(RRT)3stt(RAdv)
3(D+1)stt(RRT RAdv)(2D+1)stt(RRT)(D+2)stt(RAdv)
(3.14)
=3(D+1) (stt(RRT)+stt(RAdv)stt(RRT RAdv)
+dt(RRT,RAdv)) (2D+1)stt(RRT)(D+2)stt(RAdv)
(D+2)stt(RRT)+(2D+1)stt(RAdv)3(D+1)stt(RRT RAdv)
3(D+1)stt(RRT RAdv)3(D+1)stt(RRT RAdv)
=0.
We divide every step into two parts. In the first part both RandomizedTree and
Adv pay pfor every node and the weights of the edges are updated. In the second
part the file access is served and the algorithms may reallocate copies. We denote
by Φt1
2the potential after the first part of step t, i.e.
Φt1
2=3(D+1)sttRt1
RT Rt1
Adv(2D+1)sttRt1
RT 3sttRt1
Adv.
Below we show that for the first part of any step tholds:
np +EhΦt1
2Φt1i 1+3D+3
p11
n¯
δt!np.(3.16)
And for the second part, we show:
Eh˜
CRT(at)+ ΦtΦt1
2i3·Eh˜
CAdv(σt)i.(3.17)
Notice that (3.8) especially says that RandomizedTree is 3-competitive against
adaptive-online adversaries in a static tree network.
3.2.2 Algorithm RandomizedTree 45
Altogether, we then have
E[CRT(σ)]ECRT(σ)+ Φ|σ|Φ0
=|σ|
X
t=1np +Eh˜
CRT(σt)i+EΦ|σ|Φ0
=|σ|
X
t=1np +EhΦt1
2Φt1+˜
CRT(σt)+ ΦtΦt1
2i
=|σ|
X
t=1np +EhΦt1
2Φt1i+Eh˜
CRT(σt)+ ΦtΦt1
2i
()
|σ|
X
t=1 1+3D+3
p11
n¯
δt!np +3E h˜
CAdv(σt)i!
|σ|
X
t=1
max 3,1+3D+3
p11
n¯
δt!E[CAdv(σt)],
where () follows from (3.16) and (3.17).
For the proof of (3.16), we observe that the costs of RandomizedTree and Adv in
the first phase of a step are always np. Furthermore, the value of Φchanges only
due to the changes of the weights of the edges since no actions are performed. We
first transform (3.16) as follows:
np +EhΦt1
2Φt1i 1+3D+3
p11
n¯
δt!np
EhΦt1
2Φt1i3D+3
p
n1
n
1
|E|X
eEdt(e)dt1(e)
np
EhΦt1
2Φt1i(3D+3) X
eEdt1
edt1
e.
46 File Allocation with Step Costs
This holds since
EhΦt1
2Φt1i
=Eh3(D+1)sttRt1
RT Rt1
Adv(2D+1)sttRt1
RT 3sttRt1
Adv
3(D+1)stt1Rt1
RT Rt1
Adv(2D+1)stt1Rt1
RT 3stt1Rt1
Advi
=E
3(D+1) X
e∈ST(Rt1
RT Rt1
Adv)
dt(e)dt1(e)(2D+1)X
eRt1
RTdt(e)dt1(e)3X
eRt1
Adv
dt(e)dt1(e)
=E
[3(D+1) (2D+1)]X
eRt1
RT \Rt1
Adv
dt(e)dt1(e)+[3(D+1) 3]X
eRt1
Adv\Rt1
RT
dt(e)dt1(e)
+[3(D+1) (2D+1) 3]X
eRt1
RT Rt1
Adv
dt(e)dt1(e)+[3(D+1)]X
e∈ST(Rt1
RT Rt1
Adv)\(Rt1
RT Rt1
Adv)
dt(e)dt1(e)
E
(D+2) X
eRt1
RT \Rt1
Adv
dt(e)dt1(e)+3DX
eRt1
Adv\Rt1
RT
dt(e)dt1(e)
+(D1) X
eRt1
RT Rt1
Adv
dt(e)dt1(e)+3(D+1) X
e∈ST(Rt1
RT Rt1
Adv)\(Rt1
AdvRt1
RT )
dt(e)dt1(e)
E
3(D+1) X
e∈ST(Rt1
RT Rt1
Adv)
dt(e)dt1(e)
3(D+1) X
eEdt(e)dt1(e).
For the proof of (3.17), we follow the proof of Theorem 4.4.2 of [Bar94]. We show
(3.17) in two steps: First the request is served and RandomizedTree performs
actions, then the adversary can perform actions.
Request and RandomizedTrees actions.
We will analyse the following components of the potential:
Θt:=sttRt1
RT ,
Ψt:=sttRt1
RT Rt1
AdvsttRt1
RT .
Since RAdv does not change in this step of the analysis, this leads to
∆Φ = (D+2)∆Θ + 3(D+1)∆Ψ.
3.2.2 Algorithm RandomizedTree 47
No request.
Since RandomizedTree does nothing, there are neither any costs nor does the
potential change. Thus,
Eh˜
CRT + ∆Φi=0=3˜
CAdv.
Read request.
Let ibe the source of the read request. RandomizedTree first sends the requested
data from RRT to iand then creates copies on the way from RRT to iwith probability
1
D+1. Thus, the expected costs of RandomizedTree are:
Eh˜
CRTi=dti,Rt1
RT +1
D+1Ddti,Rt1
RT =2D+1
D+1dti,Rt1
RT .
Θincreases by dti,Rt1
RT if RandomizedTree replicates to i. Otherwise, it does
not change. Thus,
E[∆Θ]=1
D+1sttRt1
RT {i}sttRt1
RT +11
D+1sttRt1
RT sttRt1
RT 
(3.13)
=1
D+1dti,Rt1
RT .
Ψincreases also only if RandomizedTree replicates to i. Thus,
E[∆Ψ]=1
D+1hsttRt1
RT Rt1
Adv {i}sttRt1
RT {i}sttRt1
RT Rt1
AdvsttRt1
RT i
=1
D+1hsttRt1
RT Rt1
Adv {i}sttRt1
RT Rt1
AdvsttRt1
RT {i}sttRt1
RT i
(3.13)
=1
D+1hdti,STRt1
RT Rt1
Advdti,Rt1
RT i
1
D+1hdti,Rt1
Advdti,Rt1
RT i.
Together, this results in
Eh˜
CRT + ∆Φi=Eh˜
CRTi+(D+2)E [∆Θ]+3(D+1)E [∆Ψ]
2D+1
D+1dti,Rt1
RT +D+2
D+1dti,Rt1
RT
+3D+1
D+1dti,Rt1
Advdti,Rt1
RT 
=3dti,Rt1
Adv
=3˜
CAdv.
48 File Allocation with Step Costs
Write request.
Let ibe the source of the write request and jbe the nearest node to iin Rt1
RT . On a
write request, an update is first send from ito jand then from jto all the nodes in
Rt1
RT . This costs stt(RRT {i})=dti,Rt1
RT +sttRt1
RT . Then RandomizedTree deletes
all the copies but the one on jwith probability 1
D+1. This is done by sending delete
messages starting from jover each edge in STRt1
RT and therefore incurs a cost
of sttRt1
RT . Furthermore, the copy from jis migrated to iwith probability 1
2. This
is done by a replication from jto iand a subsequent deletion of the copy on j
initiated by i. Thus, the migration costs (D+1)dti,Rt1
RT . Altogether, this results
in
E[CRT]=dti,Rt1
RT +stRt1
RT +1
D+1sttRt1
RT +1
2(D+1)dti,Rt1
RT
=1+1
D+1sttRt1
RT +3
2dti,Rt1
RT .
The value of Θchanges to 0 if RandomizedTree deletes all the copies but one
(either jor iremains). Otherwise, it does not change. Thus,
E[∆Θ]=1
D+10sttRt1
RT +11
D+1sttRt1
RT sttRt1
RT 
=1
D+1sttRt1
RT .
We divide the change of Ψin two parts. ∆Ψ1is the change of Ψfor the case
that RandomizedTree deletes all the copies but the one at jand does not migrate
to i. And ∆Ψ2is the change of Ψfor the case that RandomizedTree deletes all the
copies and migrates to i. Then we have
E[∆Ψ]=11
D+1·0+1
2(D+1)∆Ψ1+1
2(D+1)∆Ψ2.
For ∆Ψ1, we get
∆Ψ1=sttRt1
Adv {j}stt{j}sttRt1
RT Rt1
AdvsttRt1
RT 
(3.13)
=sttRt1
Adv+dtj,Rt1
Adv+sttRt1
RT sttRt1
RT Rt1
Adv
(3.14)
=sttRt1
Adv+dtj,Rt1
Adv+sttRt1
RT
sttRt1
RT +sttRt1
AdvsttRt1
RT Rt1
Adv+dtRt1
RT ,Rt1
Adv
=sttRt1
RT Rt1
Adv+dtj,Rt1
AdvdtRt1
RT ,Rt1
Adv
sttRt1
Adv+dtj,Rt1
AdvdtRt1
RT ,Rt1
Adv.
3.2.2 Algorithm RandomizedTree 49
For the analysis of ∆Ψ2we assume that RandomizedTree first replicates from j
to iand thereafter deletes all the copies but the one on i. Then we have
∆Ψ2=hsttRt1
Adv {i}stt({i})ihsttRt1
RT Rt1
AdvsttRt1
RT i
=hsttRt1
Adv {i}stt({i})ihsttRt1
RT Rt1
Adv {i}sttRt1
RT {i}i
+hsttRt1
RT Rt1
Adv {i}sttRt1
RT {i}ihsttRt1
RT Rt1
AdvsttRt1
RT i
sttRt1
Adv {i}+hsttRt1
RT Rt1
Adv {i}sttRt1
RT {i}ihsttRt1
RT Rt1
AdvsttRt1
RT i
=sttRt1
Adv {i}+hsttRt1
RT Rt1
Adv {i}sttRt1
RT Rt1
AdvihsttRt1
RT {i}sttRt1
RT i
(3.13)
=sttRt1
Adv {i}+dti,STRt1
RT Rt1
Advdti,Rt1
RT .
Before we can bound the expected change of Ψ, we need to show
dtj,Rt1
AdvdtRt1
RT ,Rt1
Adv+dti,STRt1
RT Rt1
Advdti,Rt1
Adv.(3.18)
We show this by differentiating two cases. If dtRt1
RT ,Rt1
Adv=dtj,Rt1
Advthen, we
trivially get
dtj,Rt1
AdvdtRt1
RT ,Rt1
Adv+dti,STRt1
RT Rt1
Adv
=dti,STRt1
RT Rt1
Adv
dti,Rt1
Adv.
Otherwise, dtRt1
RT ,Rt1
Adv<dtj,Rt1
Advholds, i.e. there is a node in Rt1
RT , which is
closer to Rt1
Adv than jis. Imagine that node iis the root of the tree ({1,...,n},E). Then
all the nodes in Rt1
RT are below jsince jis defined as the node nearest to iin Rt1
RT .
And since the distance between Rt1
RT and Rt1
Adv is not determined by j, the nodes
of Rt1
Adv also have to be below j. This implies that dti,STRt1
RT Rt1
Adv=dti,j
and di,Rt1
Adv=di,j+dj,Rt1
Adv. Thus,
dtj,Rt1
AdvdtRt1
RT ,Rt1
Adv+dti,STRt1
RT Rt1
Adv
=dti,j+dtj,Rt1
AdvdtRt1
RT ,Rt1
Adv
=dti,Rt1
AdvdtRt1
RT ,Rt1
Adv
dti,Rt1
Adv.
50 File Allocation with Step Costs
The expected change of Ψcan now be bounded by
E[∆Ψ]=11
D+1·0+1
2(D+1) ·∆Ψ1+1
2(D+1) ·∆Ψ2
1
2(D+1)hsttRt1
Adv+dtj,Rt1
AdvdtRt1
RT ,Rt1
Adv
+sttRt1
Adv {i}+dti,STRt1
RT Rt1
Advdi,Rt1
RT i
=1
2(D+1)hsttRt1
Adv+sttRt1
Adv {i}di,Rt1
RT
+dtj,Rt1
AdvdtRt1
RT ,Rt1
Adv+dti,STRt1
RT Rt1
Advi
(3.18)
1
2(D+1) hsttRt1
Adv+sttRt1
Adv {i}dti,Rt1
RT +dti,Rt1
Advi
(3.13)
=1
2(D+1) h2sttRt1
Adv {i}dti,Rt1
RT i.
Altogether, we have:
E[CRT + ∆Φ]=E[CRT]+(D+2)E [∆Θ]+3(D+1)E [∆Ψ]
D+2
D+1sttRt1
RT +3
2dti,Rt1
RT D+2
D+1sttRt1
RT
+3
22sttRt1
Adv {i}dti,Rt1
RT 
=3sttRt1
Adv {i}
=3CAdv.
Advs actions.
Without loss of generality we can treat each action performed by Adv separately.
Adv replicates.
Let i {1,...,n}\RAdv be the destination of the new copy. Since RandomizedTree
has no costs and its replica set does not change, we have
˜
CRT + ∆Φ = h3(D+1)stt(RRT (RAdv {i})) (2D+1)stt(RRT)3stt(RAdv {i})i
h3(D+1)stt(RRT RAdv)(2D+1)stt(RRT)3stt(RAdv)i
=3(D+1)hstt((RRT RAdv){i})stt(RRT RAdv)i
3hstt(RAdv {i})stt(RAdv)i
(3.13)
=3(D+1)dt(i,ST(RRT RAdv)) 3dt(i,RAdv)
3(D+1)dt(i,RAdv)3dt(i,RAdv)
=3Ddt(i,RAdv)
=3˜
CAdv.
3.3 Conclusion and open problems 51
Adv deletes.
Let iRAdv be the node which holds the copy to be deleted. Since Random-
izedTree has no costs and its replica set does not change, we have
˜
CRT + ∆Φ = h3(D+1)stt(RRT (RAdv \{i})) (2D+1)stt(RRT)3stt(RAdv \{i})i
h3(D+1)stt(RRT RAdv)(2D+1)stt(RRT)3stt(RAdv)i
=3(D+1)hstt(RRT (RAdv \{i}))stt((RRT (RAdv \{i})) {i})i
3hstt(RAdv \{i})stt((RAdv \i){i})i
(3.13)
=3(D+1)dt(i,ST(RRT (RAdv \{i}))) +3dt(i,ST(RAdv \{i}))
3dt(i,ST(RAdv \{i}))
3dt(i,RAdv \{i})
=3˜
CAdv.
This concludes the proof of (3.17) and therewith of the theorem.
3.3 Conclusion and open problems
In this chapter we have considered a model for data management in dynamic
networks which comprises stand-by costs of the participating nodes. We limited
our examinations to networks with star respectively tree topologies, because
such networks have unique and easy to find Steiner trees. The drawback of
these topologies is that they do not allow to exploit locality in the MANET. In
the star topology a participating node can only access the copy on the server
and the possibly available own copy, but not the copy of a nearby participating
node. Thus, creating a copy can be beneficial only for the participating node
which receives the copy but not for its neighborhood. In the tree topology two
participating nodes, which are close-by in the MANET, can have several hops
between them in the tree. And since the communication only happens along
the tree links, this can result in costly data transfers even though the actually
communicating nodes are side by side in the MANET.
In order to exploit locality of requests in the MANET optimally, each partic-
ipating node has to be able to communicate with any other participating node
directly. This results in the difficulty that we can no longer reduce the MANET
to an overlay network of the participating nodes with a fixed and well structured
topology. This is because the topology of the overlay network, which is defined by
the used routing protocol, can be arbitrary and even changes with the movement
of the nodes. Moreover, a file allocation system in a network with cycles requires
a distributed mechanism to track the available copies. Such a tracking protocol
52 File Allocation with Step Costs
has to be able to name a path to a nearby copy whenever a participating node
reads the file. Furthermore, it has to find Steiner trees of all present copies and an
arbitrary initiator of a write request to update the file. The tracking protocol given
by Bartal et al. [Bar94] provides these services for arbitrary static networks. But
it is not possible to generalize this protocol to dynamic networks since it relies on
a precalculated data structure (called regional matching [AP95]) which comprises
information about distances. Furthermore, their solution for the cover problem,
which is used by the tracking protocol, depends on static edge weights and a
fixed topology. Thus, it is still an open and surely challenging problem to find a
distributed (or even local) tracking protocol for dynamic networks.
It is especially doubtful, whether such a tracking protocol can be analyzed using
competitive analysis. One reason is that it is not clear how to define the optimal
costs of the tracking problem. One could define the sum of the shortest distances
respectively Steiner trees at the points of time where tracking queries are issued
as the optimal costs. But then no online algorithm could be competitive, because
obviously any online algorithm has to update its configuration regularly when
the network considerably changes even if no actual tracking request is issued.
Another possible definition for the optimal costs is that it is the lowest costs any
algorithm with global (or even local) knowledge, which is able to return at any
point of time and for any requesting node the optimal path respectively Steiner
tree, can have. But this seems to be a too tough definition since optimal Steiner
trees are quiet unstable under movement and therefore a lot of updates and
communication would be necessary even for such an optimal algorithm. And in
fact an approximation of the shortest path and Steiner trees would be sufficient
for an actual implementation of a file allocation system.
Another possible model is to use a complete network as the overlay network of
participating nodes where edge weights are defined by the shortest routing paths
through the MANET, i.e. we are looking at the distance metric of the MANET7. A
shortcoming of this model is that the weight of Steiner trees are inaccurate. This is
because Steiner trees in the MANET could use non-participating nodes as Steiner
nodes. But as shown in Section 6.1 of [PS02] the weight of a minimum Steiner
tree in the metric constitutes a 2-approximation of the weight of a minimum
Steiner tree in the MANET. The more severe problems of this model are that a
participating node of the MANET would have to be aware of its energy-distance
to all the other participating nodes and of the current locations of the copies.
Star topology
We have shown that the deterministic algorithms Follow and Count are OD
pδ-
competitive against δ-restricted adversaries and that this is up to a constant factor
7This is actually only true if the routing protocol uses optimal routing paths.
3.3 Conclusion and open problems 53
optimal for demand-driven algorithms. Furthermore we have given a lower
bound of D
pδ
lg(D
pδ)against δ-restricted oblivious adversaries for general random-
ized algorithms. The obvious open question is: Does an oD
pδ-competitive non-
demand-driven algorithm exist or can a better lower bound be found?
Tree topology
We have shown that the randomized algorithm RandomizedTree is OD
pδ-com-
petitive against δ-restricted adaptive online algorithms. Since the lower bounds
for the star network can be applied to tree networks as well, RandomizedTree
is an optimal demand-driven online algorithm for file allocation in dynamic tree
networks. As for star networks, it is an open problem, whether a non-demand
driven algorithm with a competitive ratio of oD
pδexists.
As discussed above, the fixed topology of the tree overlay does not allow to
exploit locality in the MANET. It would therefore be interesting to find a mecha-
nism to change the topology of the overlay tree in response to the dynamics of the
MANET. Such a mechanism should maintain an approximation of a minimum
Steiner tree of the participating nodes in the MANET. Whenever the topology of
this tree changes, the file allocation system deletes all copies but one and restarts
its file allocation algorithm with the new tree. Unfortunately the maintenance of
a minimum Steiner tree is inherently communication intensive in a MANET since
the nodes have only local information.
Chapter 4
Simulation Based Evaluation of File
Allocation with Step Costs
In this chapter we describe an experimental evaluation of the algorithms Follow
and Count described in Subsection 3.1.2 in a simulated MANET. This MANET
consists of a stationary server and a set of mobile nodes. Some of these mobile
nodes are participating nodes. A participating node can only access a copy stored
on itself or the primary copy on the server. Thus, the file allocation is performed
on an overlay network with a star topology as described in Section 3.1.1.
Figure 4.1: A screenshot of a simulation run. The mobile nodes and the server
form a unit disk graph. The red nodes are the participating nodes and the red
links represent the routing paths to the server located in the center of the MANET.
The green squares represent the copies created by the simulated algorithms.
55
56 Simulation Based Evaluation of File Allocation with Step Costs
We used the customizable sensor network simulator Shawn [KPB+05, FKFP07]
to simulate the behavior of the participating nodes, the server and the other relays.
The movement of the nodes was generated by the “mobility model based on social
network theory” of [MM07]. The file accesses of the participating nodes and the
server were generated stochastically. Every participating node issued requests
independent from the other participating nodes. This implies that, unlike our
model in Section 3.1.1, more than one request can occur in a single step. In
order to ease the evaluation of the experiments, only the server could issue write
requests.
Figure 4.1 shows a screenshot of an example simulation run, which was per-
formed in a smaller area and with fewer nodes than the actual experiments.
We will see, that the competitive ratio of Follow and Count for realistic inputs
are much lower than the worst-case analysis suggests. While Follow has nev-
ertheless a quiet poor performance, Count turns out to be reasonable good for
practical application.
4.1 Evaluated file allocation algorithms
We investigate the performance of the two deterministic online algorithms Fol-
low and Count described in subsections 3.1.2.2 respectively 3.1.2.3. We use the
two trivial strategies DoNothing and ReplicateOnFirstRequest as performance
references. DoNothing never creates any copies and therefore is equivalent to
storing the file only on the server. ReplicateOnFirstRequest replicates to any
participating node after its first request and keeps this copy forever. Hence,
ReplicateOnFirstRequest is the contrary extreme of DoNothing.
Algorithm 5 Follow on the server
1: RFollow :={c};
2: On event edo
3: if (eis a read-message from participating node i)then
4: Replicate to i;
5: RF:=RF{i};
6: else if (eis a write request from the server ) then
7: for iRF\{c}do
8: Send delete-message to i;
9: end for
10: RF:={c};
11: end if
12: end
4.1 Evaluated file allocation algorithms 57
Here, we give the implemented distributed versions of the algorithms. They
consist of an algorithm that is executed on the server and one that is executed
independently on each of the participating nodes. We thereby omit the handling
of the requests, i.e. we only describe what the algorithms do to determine when
to perform an action. Furthermore, we did not include code for handling write
requests from the participating nodes since such requests do not occur in our
experiments. Notice that more than one event can occur and will be handled
in a single step, since we allow more than one request in a single step in our
simulations.
Algorithm 5 gives the implementation of Follow on the server. The participat-
ing nodes are not involved in the decision about performing actions and therefore
do not need to execute any extra algorithm.
Algorithm 6 Count on the server
1: ci:=0 for all i {1,...,n};
2: RCount :={c};
3: On event edo
4: if (eis a read-message from participating node i)then
5: ci:=min(ci+1,D+1);
6: if (ci=D+1 ) then
7: Replicate to i;
8: RC:=RCount {i};
9: end if
10: else if (eis a write request from the server ) then
11: for i {1,...,n}\RCdo
12: ci:=max(ci1,0);
13: end for
14: else if (eis a delete-message from participating node i)then
15: RC:=RC\{i};
16: ci:=0;
17: end if
18: end
The distributed implementation of Count executes Algorithm 6 on the server
and Algorithm 7 on every participating node. Initially no participating node
holds a copy and therefore all the counters are initialized with 0. Notice that
counter ciis maintained on participating node iwhen iholds a copy and on the
server otherwise. This does not require any additional communication, because
on every replicate respectively delete operation the counter always has the value
58 Simulation Based Evaluation of File Allocation with Step Costs
Algorithm 7 Count on participating node i
1: ci:=0;
2: holds copy :=false;
3: On event edo
4: if (eis a read request from i)then
5: if (holds copy )then
6: ci:=min(ci+1,0);
7: end if
8: else if (eis a write-message from the server ) then
9: ci:=max(ci1,0);
10: if (ci=0 ) then
11: Send delete-message to the server.
12: holds copy :=false;
13: end if
14: else if (eis a replicate-message from the server ) then
15: ci:=D+1;
16: holds copy :=true
17: end if
18: end
D+1 respectively 0. Thus, the arrival of a replicate- respectively a delete-message
implicitly gives the current value of the counter.
4.2 Simulation environment
We performed the simulations using the customizable sensor network simulator
Shawn [KPB+05, FKFP07]. Unlike other network simulators, Shawn does not
simulate the lower layers of the OSI Reference Model, but simulates the effects of
these layers (e.g. packet loss, corruption and delay) by probabilistic models. We
have chosen Shawn, because the simplified simulation of the lower layers allows a
reasonable number of nodes and simulation rounds. Furthermore, Shawn allows
to calculate the required routing paths using a centralized algorithm.
A set of 1500 mobile nodes are placed in a 2000m×2000marea which is parti-
tioned into 15 ×15 squares of equal size. Each mobile node can communicate via
a wireless transceiver with a communication radius of 100m. Besides the mobile
nodes, a stationary sever is positioned on the midpoint of the area, which also has
a wireless transceiver. The server always holds the primary copy of the file.
The power consumption for sending respectively receiving a single unit of the
file is chosen uniformly and independently at random between 0.95 and 1.05 for
4.2 Simulation environment 59
each mobile node and direction at the beginning of the simulation. The overall
power consumption of a data transfer consists of the sum of the power consump-
tions incurred for sending respectively receiving the data by any relay on the path
between the server and the involved participating node. We thereby do not con-
sider any power consumption of the server. For the routing of a data packages we
always use the paths with the lowest overall power consumption, which are calcu-
lated by a centralized algorithm in every step. Furthermore, the stand-by power
consumption of a participating node pis chosen uniformly and independently
at random between 0.095 and 0.105. Unlike the model introduced in Section 3.1
each participating node has its individual stand-by power consumption.
A simulation run is performed in discrete time steps. Each step stands for 1s.
Each simulation run consists of 10.000 steps (2.8h). In each step every mobile
node and the server may issue a request. We used different probabilistic input
generation mechanisms which are described in the following sections. Unlike the
model introduced in Section 3.1.1, only the server issues write request and only the
participating nodes issue read requests. We omitted read requests from the server
since they can always be answered by the server and therefore never cause any
communication in the wireless network. A write request from participating node
idoes not differ from a write request from the server for all participating nodes but
i. If we allowed write requests from the participating nodes, the number of write
requests would depend on the number of participating nodes. Furthermore, it
would be more difficult to force a fixed ratio of the number of write requests to the
number of read requests. Since we primarily want to investigate the dependence
of the performance of the described online algorithms on the write/read-ratio, we
disallowed write requests from participating nodes.
The simulations are performed in four phases: In the first phase the movement
of the 1500 mobile nodes is generated based on the mobility model described in the
next section. Then, for each step the tree of minimum energy-distance paths from
each node to the server are calculated using Dijkstra’s algorithm [Dij59]. These
trees are stored in the nodes such that they can perform the routing between the
server and the mobile nodes without global knowledge. Nodes which are not
connected to the server during any step are marked as such.
In the second phase we first select uniformly at random 50 participating nodes
from the set of nodes which are connected to the server during the whole move-
ment. Thereafter, the file accesses of the participating nodes are generated based
on the probability distributions described later in this section. This completes the
whole input sequence for one simulation run.
In the third phase an optimal offline solution is calculated for the current input
sequence. This is done centrally based only on the energy-distances and file
accesses, i.e. the optimal offline algorithm does not use information about node
60 Simulation Based Evaluation of File Allocation with Step Costs
positions or routing paths.
In the fourth phase the online algorithms and the offline solution are executed
in parallel on the simulated MANET. In this phase no global information are used.
The online algorithms execute the distributed algorithms described in Section 4.1
on the server and the participating nodes. Request- and action-messages issued
by the server or a participating node are forwarded from hop to hop based on
the precalculated routing paths. Each mobile node, but not the server, has one
power consumption counter for each simulated algorithm. A transfer of a data
unit increases the power consumption counters of the corresponding algorithm in
the sending and the receiving node. The participating nodes additionally increase
their counters in every step by their stand-by power consumption. At the end of
the simulation the registered power consumptions of the mobile nodes are added
up for every algorithm.
Since the movement and input generation are randomized, a single simulation
run is not representative for a given instance of parameters. We therefore per-
formed several simulation runs for each value of the independent variable. For
every experiment we conducted at least three simulation batches. A batch con-
sisted of at least ten simulation runs for every value of the independent variable.
All the simulation runs of a batch used the same movement of nodes, power con-
sumption parameters and routing paths, but different selections of participating
nodes and request sequences. In other words, we executed simulation phase one
once per batch and simulation phases two to four for each simulation run. Thus, if
the independent variable can for example take five different values, we performed
three batches and in each batch we performed ten simulation runs per value of
the independent variable This results in 3 ·5·10 =150 performed simulation runs
for this example experiment.
4.3 Mobility model
The movement of the nodes is generated according to the “mobility model based
on social network theory” described in [MM07]. This mobility model is primarily
based on an interaction matrix Mwhich represents the degree of relatedness of
every pair of nodes. Entry mi,jof Mcan take values in the range [0,1], where a
value of 0 means that there is no interaction between nodes iand jand 1 indicates a
strong interaction. The generation of Mis based on the connected-caveman graph
from [Wat99]. First, 300 isolated communities of 5 nodes each are created. The
nodes of these communities are initially fully connected to each other. Then each
edge is re-wired with probability 0.1 such that it points to another community.
The resulting graph is called the connectivity graph. The value of mi,jis then chosen
4.3 Mobility model 61
uniformly at random either from the range [0.9,1] if the edge (i,j) exists in the
connectivity graph, or otherwise from the range [0,0.1].
Initially each community is assigned to a random cell of the 15 ×15 grid. Each
member of the community is placed at a random position in the chosen cell.
Then for 1350 nodes a random speed is chosen from the range [1m
s,2.5m
s] which
remains unchanged during the whole movement instance. The used velocity
range correlates approximately to the walking speed of humans. The other 150
nodes are stationary. The first goal of every node is chosen randomly inside its
initial cell. Whenever a node reaches its current goal, it chooses its next goal
randomly using the following probability distribution: let Cp,q, 1 p,q15 be
the set of nodes which have the cell (p,q) as its current goal. Then
Pr ichooses cell (r,s) as its next goal:=P
jCr,s
mi,j
P
p,qP
jCp,q
mi,j
.
Thus, node iprefers cells which are the goal of nodes related to it. The nodes
always start instantly to move to their next goal, i.e. in contrast to the random
waypoint mobility model they do not pause.
Since the initial placement of nodes results in a rather untypical node distribu-
tion, we started the actual simulation of the file allocation system after the nodes
moved for 1000 steps. This means that most of the nodes already reached their
second goal and thus the social component of the movement started to be in effect.
We have chosen this mobility model, because it is a good compromise between
synthetic mobility models and recorded traces of real movements. The widely
used synthetic mobility models (e.g. the random walk [Ein56], random waypoint
[JM96] and random direction [RMSM01] mobility models) produce movements
where every node moves independent of all the other nodes. This behavior is
very uncommon for social beings like humans. The main advantage of these
synthetic mobility models is that they are easy to implement and that they can
generate arbitrary many different movement instances. Traces of real movements
on the other hand provide realistic movements, but are hard to gain and handle.
There are several free movement traces available (e.g. the project CRAWDAD
[KH05, KH10] provides an archive). The main problem is that different traces
have completely different settings, i.e. they use different hardware, different
number of nodes, and different area sizes. Thus, it is hard to perform the same
experiments on different movement instances. But using only one movement
trace bears the danger to measure phenomenons which are caused by this special
movement instance. The chosen “mobility model based on social network theory”
provides more realistic movements than the synthetic models and can easily
generate different instances. In [MM07] this mobility model was compared to
62 Simulation Based Evaluation of File Allocation with Step Costs
the random waypoint mobility model and a trace of real movement recorded by
Intel (cf. [CHC+05]). They examined the contact duration, i.e. the duration two
nodes can communicate nonstop with each other, and the inter-contact time, i.e.
the time between two contacts. Their experiments showed that the distributions
of the contact duration and the inter-contact time of their mobility model are
significantly more similar to the real traces than the one of the random waypoint
mobility model.
4.4 Experiments
We conducted three different experiments.
Write/read-ratio with constant write probability
In this experiment we investigate the performance of the described algorithms
under different ratios of the number of write requests to the number of read
requests. The server and the participating nodes issue their requests randomly
and independent of each other. In every step the server issues a write request with
probability pwand each participating node issues a read request with probability
pr. This means that the expected number of write requests is Tpwand the expected
number of read requests is nTpr, where Tis the number of simulated steps. Thus,
the ratio of write request to read requests is pw
npr.
We set pwto the fixed value of 1
20 and prto values between 1
5and 1
35. Thus,
the expected write/read-ratios were between 0.005 and 0.035 if n=50. At the
same time the expected number of requests T(pw+npr) declined with shrinking
pr. We performed this experiment with different numbers of participating nodes
n {10,50,100}. Each simulation run consisted of T=10,000 simulation steps
and the file size Dwas always 50.
Write/read-ratio with constant expected number of requests
In this experiment we also investigate the performance of the described algorithms
under different ratios of the number of write requests to the number of read
requests. But contrary to the prior experiment, we chose the values of pwand
prsuch that the expected number of requests is always 20,000 (2 requests per
step). This is the case if pw=αβ
β+1and pr=α
n
1
β+1, where αis the desired expected
number of requests per step and βis the desired write/read-ratio. Thus, if βrises
the expected number of write requests rises while the expected number of read
requests falls. We have chosen the value of βbetween 0.005 and 0.035.
Each simulation run consisted of T=10,000 simulation steps and involved
50 participating nodes. We conducted the experiment for different file sizes
D {10,30,50,70,100}.
4.5 Results 63
Heterogeneous request rates
In this experiment we investigate the performance of the described algorithms
when the participating nodes have heterogeneous read probabilities. The sim-
ulations involved 50 participating nodes and lasted for 10,000 steps. Each par-
ticipating node iissued a read request to the file with probability pi, 1 i50,
and the server issued write request with probability pwin every step . We have
chosen piuniformly at random from the range [ 1
30,1
10] at the beginning of the input
generation. The write probability pwhad a fixed value of 1
15 which corresponds
to the expected read probability of the participating nodes. We investigated the
performance of the described algorithms in this scenario under different file sizes
D {5,10,15,20,30,40,50,60,70,80,90,100}.
4.5 Results
DoNothing has to pay for every read request, but has no costs for write requests.
Thus, the competitive ratio of DoNothing is high if the write/read-ratio is low and
decreases with increasing write/read-ratio. Since DoNothing never replicates,
its costs are completely independent from D. The competitive ratio then again
decreases with growing D, because the costs of the optimal offline algorithm are
increasing.
ReplicateOnFirstRequest on the other hand pays only for the first read request
from a participating node and thereafter for every write request. Thus, the com-
petitive ratio of ReplicateOnFirstRequest are low if the write/read-ratio is low
and increases with increasing write/read-ratio. The costs of ReplicateOnFirstRe-
quest depends only on Dbecause of the initial replications. If the input sequence
is long enough the influence of Dcan be neglected.
Follow creates a copy on a participating node whenever this participating
node issues a request and does not hold a copy and deletes all copies on every
write request. This is only a good behavior if there are long sequences of read
requests which are not interlaced by write requests. Since in our experiments
this is usually not the case, Follow quiet often replicates and deletes copies. As
can be seen in Figure 4.2, this results in a much worse performance than Count,
DoNothing and ReplicateOnFirstRequest, which rarely create and delete copies.
We will therefore concentrate in the following on the comparison of Count with
DoNothing and ReplicateOnFirstRequest.
We also calculated the upper bounds for the costs of Follow and Count from
64 Simulation Based Evaluation of File Allocation with Step Costs
0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035
write/read ratio
0
20,000,000
40,000,000
60,000,000
80,000,000
100,000,000
costs
Follow upper bound Follow
Count upper bound Count
ReplicateOnFirstRequest DoNothing
Opt
0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035
write/read ratio
50.0
100.0
150.0
200.0
competitive ratio
Follow upper bound Follow
Count upper bound Count
ReplicateOnFirstRequest DoNothing
Figure 4.2: The performance of Count, Follow and their upper bounds for
different write/read-ratios with fixed write probability.
Theorem 3.2 respectively Theorem 3.4:
CF(σ)|σ|
X
t=0
max D+3,1+D+1
p¯
δt!·COpt(σt)
and
CC(σ)min
0r1
|σ|
X
t=1
max rD +(3 r),1+(3 r)D+r
p¯
δt!·COpt(σt)
.
Inallperformed simulationrunsthe average valueof ¯
δtwasinthe range[0.17,0.26].
For values of ¯
δtin this range combined with the used average stand-by power
consumption of p=0.01, the upper bound of Count is lowest for r=1. Thus, the
effective upper bound for the competitive ratio of Count in our simulations is
CC(σ)|σ|
X
t=1
max D+2,1+2D+1
p¯
δt!·COpt(σt).
In all experiments the upper bounds for the costs of Follow and Count were
much higher than their actual costs. Figure 4.2 shows a representative example
taken from the first experiment. The decrease of the upper bounds of the costs
are thereby caused by the decrease of the optimal offline costs (c.f. Figure 4.3).
Notice that the scale for the competitive ratio reaches a value of 200 while in
all the following diagrams the competitive ratio only ranges up to 2. While the
upper bound for Follow was always significantly lower than the one for Count,
4.5 Results 65
the actual costs of Follow were always significantly higher than the costs of
Count. This suggests that the theoretical worst-case upper bounds are not good
performance estimators for realistic inputs.
Write/read-ratio with constant write probability
At first, we will analyse the measured results for 50 participating nodes shown
Count
ReplicateOnFirstRequest
DoNothing
Opt
0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035
write/read ratio
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
competitive ratio
Count
ReplicateOnFirstRequest
DoNothing
Figure 4.3: The performance of Count, DoNothing and ReplicateOnFirstRe-
quest for different write/read-ratios with fixed write probability.
in Figure 4.3. In this experiment the expected number of write requests is con-
stant. This implies that the costs for ReplicateOnFirstRequest are nearly con-
stant. The expected number of read requests on the other hand sinks with the
write/read-ratio. This is the reason for the decrease of the costs of DoNothing
with rising write/read-ratio. The optimal offline solution creates, in the case of a
write/read ratio of 0.005, copies on every participating node and keeps them. In
the range 0.01 to 0.025 it creates and deletes copies as needed. And for write/read-
ratios starting above 0.025 the optimal offline solution nearly never creates copies.
Thus, ReplicateOnFirstRequest has a nearly optimal competitive ratio for low
write/read-ratios and DoNothing has a nearly optimal competitive ratio for high
write/read-ratios. But both algorithms have rather bad competitive ratios in the
respectively opposite case.
In contrast, Count is relative good over all write/read-ratios. In the case of
low write/read-ratios Count is slightly worse than ReplicateOnFirstRequest.
This is because Count creates a copy at the earliest after Dread request from a
participating node, while ReplicateOnFirstRequest creates its copy immediately
after the first request. For high write/read-ratios Count behaves basically like
DoNothing, i.e. it never creates copies. For write/read-ratios in between Count
66 Simulation Based Evaluation of File Allocation with Step Costs
creates and deletes copies similar to the optimal offline solution, but shifted in
time.
0.000 0.025 0.050 0.075 0.100 0.125 0.150 0.175
write/read ratio
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
competitive ratio
Count
ReplicateOnFirstRequest
DoNothing
0.000 0.003 0.005 0.007 0.010 0.013 0.015 0.018
write/read ratio
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
competitive ratio
Count
ReplicateOnFirstRequest
DoNothing
Figure 4.4: The performance of Count, DoNothing and ReplicateOnFirstRe-
quest for different write/read-ratios with fixed write probability for n=10 (left)
and n=100 (right).
The diagrams of the competitive ratio for 10 and 100 participating nodes shown
in Figure 4.4 have basically the same shape as the one for 50 participating nodes.
The main difference is the scaling of the x-axis. This is due to the fact that the
write/read-ratio pw
nprdepends on the number of participating nodes. The shape
of the diagram does not change, because the costs caused by any participating
node is defined only by the write requests from the center and its own read
requests and is therefore independent of the accesses of the other participating
nodes. This implies that the overall competitive ratio is some kind of averaging of
the competitive ratios of all participating nodes for a common sequence of write
requests. Thus, the overall competitive ratio depends mainly on the expected
write/read-ratio of an individual participating node and is therefore independent
of n.
Write/read-ratio with constant expected number of requests
At first, we will analyse the measured results for D=50 shown in Figure 4.5.
In this experiment the write probability pw=αβ
β+1takes values in the range
(0.009,0.070,) while the read probabilities pr=α
n
1
β+1take values from the range
(0.038,0.040). Thus, the read probabilities are nearly constant and the change of
the write/read-ratio is mainly due to the increasing write probability. This leads
to nearly constant costs for DoNothing and linearly rising costs for ReplicateOn-
FirstRequest.
4.5 Results 67
0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035
write/read-ratio
100,000
200,000
300,000
400,000
500,000
600,000
700,000
costs
Count
ReplicateOnFirstRequest
DoNothing
Opt
0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035
write/read-ratio
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
competitive ratio
Count
ReplicateOnFirstRequest
DoNothing
Figure 4.5: The performance of Count, DoNothing and ReplicateOnFirstRe-
quest for different write/read-ratios with constant expected number of requests.
The competitive ratio of DoNothing is again bad for low write/read-ratios and
improves while the write/read-ratio rises. ReplicateOnFirstRequest offers the
opposite behavior. Its competitive ratio is good for low write/read-ratios and gets
worse while the write/read-ratio rises.
The optimal offline solution and Count act basically just like in the first experi-
ment: When the write/read-ratio is low they create many copies and rarely delete
them again. When the write/read-ratio is high they nearly never create copies.
And in between they create and delete copies if it is sensible. Count is again
never better than both DoNothing and ReplicateOnFirstRequest, but offers for
all write/read-ratios a reasonable good performance.
In the first experiment the expected overall number of requests highly differs for
different write/read-ratios. It declines from about 100,000 for a write/read-ratio of
0.005 to about 15,000 for a write/read-ratio of 0.035. Since the expected number
of requests in the second experiment has a constant value of 20,000, we conclude
that the competitive ratio depends mainly on the write/read-ratio but hardly on
the overall number of requests.
Figure 4.6 shows the diagrams of the competitive ratio for the file sizes 10, 30,
70 and 100. Obviously, the optimal offline costs of a fixed input sequence rises
if the file size increases. Furthermore, the number of copies an optimal offline
algorithm creates will decrease with increasing D.
Since the costs of DoNothing are completely independent of D, the graph of its
competitive ratio only differs because of the changes of the costs of the optimal
solution. For growing D, the point where creating no copies is optimal is reached
at lower write/read-ratios.
68 Simulation Based Evaluation of File Allocation with Step Costs
0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035
write/read-ratio
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
competitive ratio
Count
ReplicateOnFirstRequest
DoNothing
D=10
0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035
write/read-ratio
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
competitive ratio
Count
ReplicateOnFirstRequest
DoNothing
D=30
0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035
write/read-ratio
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
competitive ratio
Count
ReplicateOnFirstRequest
DoNothing
D=70
0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.035
write/read-ratio
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
competitive ratio
Count
ReplicateOnFirstRequest
DoNothing
D=100
Figure 4.6: The competitive ratios of Count, DoNothing and ReplicateOn-
FirstRequest for different write/read-ratios with fixed expected number of re-
quests for D {10,30,70,100}.
As can be seen in the charts, the curves of the competitive ratio of Replica-
teOnFirstRequest is shifted upwards for growing D, i.e. the competitive ratio
grows with D. This is because the costs of ReplicateOnFirstRequest are not
independent of Dsince it replicates to every participating node right after its first
read request. If the optimal offline algorithm also creates a copy to fulfill this first
read request, it usually does this a few steps sooner than the request in order to
save costs. The competitive ratio of ReplicateOnFirstRequest rises for growing
D, because these cost savings grow with D.
The higher the value of Dis the more read requests have to be issued by a
participating node until Count creates a copy on it. The optimal offline algorithm
on the other hand mostly creates this copy much earlier than Count. This leads
4.5 Results 69
to a rising competitive ratio of Count with growing Dfor small write/read-ratios.
For high write/read-ratios the creation of copies gets less useful with growing D.
Both Count and the optimal offline strategy therefore approach the behavior of
DoNothing for high write/read-ratios.
Heterogeneous request rates
The points in the charts of Figure 4.7 correspond to the mean value of 3 simula-
Count
ReplicateOnFirstRequest
DoNothing
Opt
0 10 20 30 40 50 60 70 80 90 100
D
1.00
1.05
1.10
1.15
1.20
1.25
1.30
competitive ratio
Count
ReplicateOnFirstRequest
DoNothing
Figure 4.7: The performance of Count, DoNothing and ReplicateOnFirstRe-
quest for different file sizes with heterogeneous request rates.
tion batches (overall 30 measured values). The charts furthermore indicate the
standard deviation of the measured values.
In this experiment every participating node acts in the same way as in the first
experiment. But here each participating node has its individual read probability
and therefore its individual write/read-ratio. The diagrams in Figure 4.7 show that
Count clearly outperforms DoNothing and ReplicateOnFirstRequest for values
of Dgreater than or equal to 10. As seen in the analysis of the prior experiments
Count is reasonable good for all possible write/read-ratios, but both DoNothing
and ReplicateOnFirstRequest are bad in a certain range of write/read-ratios.
Since the used range of read-probabilities contains the bad cases of DoNothing
and ReplicateOnFirstRequest, this results in a better overall performance of
Count.
The competitive ratio of DoNothing slightly decreases with growing D, because
the optimal offline solution creates fewer copies for bigger values of D. This is
also the reason for the growing competitive ratio of ReplicateOnFirstRequest.
The competitive ratio of Count has the highest value for D=5. Thereafter, the
competitive ratio falls rapidly until D=30 and then rises slowly with growing D.
70 Simulation Based Evaluation of File Allocation with Step Costs
Why is the competitive ratio of Count such bad for small values of D? Imagine
that Count creates a copy on a participating node and thereafter no read requests
are issued from this node until Count deletes this copy. Then Count has payed
for the replication, Dwrite requests and the deletion. This means Count has
roughly payed the same as if it would have served 2D+1 further read requests
without the copy. Thus, creating a copy at a wrong point of time can cause sever
additional costs. An analysis of the number of created copies showed, that Count
created significantly more copies than the optimal offline strategy for D=5.
This implies that Count created many unnecessary copies which drives its costs
significantly up. For growing Dthe number of copies created by Count decreases
and especially falls under the number of copies created by the optimal offline
strategy at D=10. This leads to a better competitive ratio since Count has to pay
for fewer erroneous copy creations. The continued fall of the number of copies
created by Count implies that the competitive ratio of Count approaches the
competitive ratio of DoNothing.
4.6 Conclusion
The simulation based experimental evaluation of Count and Follow showed
that their performance is significantly better on realistic inputs than the worst
case upper bounds given in Chapter 3 suggest.
In all conducted simulation runs, the parameter ¯
δthad an average value in the
range [0.17,0.26] and phad an average value of 0.01. Since D+2
D+1p<3
2p0.015 <
0.17 ¯
δt, the graph in Figure 3.5 suggests that the competitive ratios of Count
and Follow are considerably above D. Furthermore, the graph implies that the
competitive ratio of Count is higher than the competitive ratio of Follow. But
actually, the competitive ratio of Count was in all simulation runs below 3 and
therewith below the upper bound for static networks. While the competitive ratio
of Follow was always significantly higher than the competitive ratio of Count, it
was still significantly below the theoretical upper bound.
The theoretical upper bound for Follow for large enough ¯
δtis the lowest possi-
ble, because Follow handles the lower bound for demand-driven algorithms pro-
vided in Subsection 3.1.2.1 in the best way possible for any competitive demand-
driven online algorithm. But at the same time, Follow was significantly outper-
formed by the trivial strategies DoNothing and ReplicateOnFirstRequest in all
experiments. This clearly shows that a good performance in the worst case does
not imply a good performance on realistic inputs.
Count on the other hand turned out to be a good choice in all experiments. In
the experiments with a fixed write/read-ratio, Count provided a reasonable good
4.6 Conclusion 71
performance for all write/read-ratios. In contrast to Count, the trivial strategies
DoNothing and ReplicateOnFirstRequest were only good for certain write/read-
ratios and quite bad for the other write/read-ratios. While Count was never better
than both trivial strategies, it was also never much worse than the better of the
trivial strategies. This resulted in a clearly better performance of Count in the
experiment with heterogeneous write/read-ratios of the participating nodes.
Chapter 5
File Leasing
A major problem in a MANET is that connectivity can not be guaranteed. On the
one hand, the movement of the nodes can lead to several connected components
which are not connected among each other. The analysis of two real movement
traces in [SBF+08] showed that the investigated MANETs consisted almost always
of more than one connected component. On the other hand, connectivity can be
disturbed by nodes which fail because of hardware or software failures or because
they run out of energy. A file allocation system should therefore be able to handle
disconnection of the participating nodes.
The model of the file allocation problem in a dynamic star network in the prior
chapters assumes that each participating node is always connected to the server. If
this assumption is not fulfilled, a file allocation system acting on this assumption
can encounter inconsistencies in the copies of the file. If a participating node i
which holds a copy gets disconnected, it will not receive updates from the server.
Thus, every future read access from icould result in an invalid answer. Although
the server would notice a failed update of the disconnected copy, it can not delete
this copy until a new connecting path is available. But even when a new path
appeared, the server is not automatically aware of it. Thus, the server either has
to periodically try to send delete-messages to ior has to tolerate the invalid copy
until icontacts him. In either case icould access invalid data arbitrary often.
In this chapter we introduce an alternative file allocation model which can deal
better with disconnected copies. As in Section 3.1, we are looking at a file which
is primary stored on a server and shared with a subset of participating nodes
of a MANET. Instead of creating permanent copies on the participating nodes,
the server grants leases of the file. This means that every copy of the file has an
expiration date. If the copy of a participating node expires, it either drops the
copy or renews the lease. Since the server and the participating node know when
a lease expires, dropping the copy does not require a data transfer. The renewal of
73
74 File Leasing
the lease on the other hand has to be notified and therefore needs a data transfer.
A renewal not only fails if the participating node is disconnected, but also if the
copy is not up to date because of a prior failed update. Thus, an invalid copy will
remain at most until the lease expires.
We give in Section 5.2 a lower bound of (L)for the competitive ratio of
randomized algorithms against oblivious adversaries. Furthermore, we analyse
two simple deterministic algorithms in Section 5.3 which can be combined to a
O(LD)-competitive algorithm.
5.1 Our model
The file leasing problem in a dynamic star network (Fads)is defined as follows: We
are given a star network consisting of a center node c(the server) and nperipheral
nodes 1,...,n(the participating nodes of the MANET). All the peripheral nodes
of the star can access an indivisible file which initially is stored on c. During the
runtime of the system, an algorithm Alg can arbitrarily place copies of the file on
the peripheral nodes. A copy of the file is leased for Lsteps. When a lease expires,
it can either be removed or renewed. The center node always holds a copy. The
replica set Rt
Alg {c,1,...,n}is defined as the set of nodes where algorithm Alg
has a copy at the end of step t.
Figure 5.1: The file leasing system is modeled as a star with dynamic edge weights.
Each copy has the residual term of its lease attached to it.
The input consists of the initial distances d0and a finite sequence σof pairs (dt,at),
5.1 Our model 75
1t |σ|, of a vector of distances and an access request. It is presented to the
online algorithm in equidistant discrete time steps. The vector dt=[d1
t,..., dn
t]
Rn
>0defines the distances of the peripheral nodes to the center in step t. The access
request at {−,ri,wi|i {c,1,...,n}} either indicates that no access occurs ()
or a read (r) respectively a write (w) access is issued at node i. Every step tis
divided into the following three phases, which are executed strictly in the given
order: first the distances between the peripheral nodes and care updated to dt,
then request atis served, and finally, the algorithm may create new copies, renew
leases and/or delete copies.
We consider the shared file as an indivisible container of a fixed number of data
units of uniform size. A single read respectively write access always affects just
one unit of the file. While a read request can be fulfilled by a single copy, a write
request has to update all the copies. Every transfer of such a data unit costs the
corresponding distance. The costs CAlg(σt) incurred by Alg in step tare composed
of two components: the transfer costs for fulfilling the data access and the transfer
costs for creating and deleting copies and for renewing leases. The costs CAlg(at)
for serving a data access from i {c,1,...,n}are naturally defined by
Ct
Alg() :=0,Ct
Alg(ri) :=
0,if iRt1
Alg
di
t,otherwise ,Ct
Alg(wi) :=X
j(Rt1
Alg∪{i})\{c}
dj
t
Notice that CAlg(rc) is always 0 and therefore a read access from the center node
is equivalent to no access (). Furthermore, CAlg(wi) is always at least di
tsince the
center always holds a copy. A replication of the file from the center to peripheral
node icosts D·di
t, where D1 is a fixed constant depending on the size of the file.
If the lease of a copy expires, an algorithm decides whether the copy is deleted or
renewed. The deletion is free since both the peripheral node and the server know
when a lease expires. A renewal costs the current distance between the peripheral
node and the server since both have to be aware that the copy persists. If an
algorithm decides to delete the copy of a peripheral node iwhile its still is still
valid, both the center and ihave to know this. Thus, a communication between
these two nodes has to take place which costs di
t. We denote by CAlg(σ) the overall
costs incurred by algorithm Alg while serving input σ.
In the following we investigate only the case LD. This is a reasonable
assumption, because otherwise the costs for creating a copy would, in the majority
of cases, be higher than the costs this copy could avoid during a single leasing
period.
Related work
Different variants of leases are used for caching in distributed file systems [GC89]
like Echo [BHJ+93] or MFS [Bur88]. Furthermore, an extended version of leases
76 File Leasing
has also been proposed for caching in large-scale systems like the World Wide
Web [YADL98].
Our file leasing model can be seen as a generalization of the classic ski-rental
problem, where a skier has to decide whether to rent a set of skis for 1 per day
or to buy it for D. In our model the price for renting respectively buying changes
over time, but has a fixed ratio of D, and the lifetime of a bought set of skis is
exactly Ldays. This is basically a combination of the Bahncard problem introduced
by Fleischer [Fle01] and the ski rental problem with dynamic prices introduced by
Bienkowski [Bie08].
The Bahncard problem is inspired by the price system of the German railway
company “Deutsche Bahn”. The input consists of a stream of pairs (t,p) where tis
the date of an unavoidable travel and pis the regular price of a ticket. The traveler
can buy a discount card (Bahncard) for a fixed price with a fixed period of validity
before he buys the next ticket. While the Bahncard is valid, any ticket costs only β
times the regular price, where 0 <β<1. Thus, the main differences to our leasing
model are that the costs of two consecutive tickets can change arbitrary in the
Bahncard problem and that the possession of a Bahncard does not avoid all the
costs of a request. In [Fle01] optimal (2 β)-competitive deterministic algorithms
are given for the Bahncard problem.
In the ski rental problem with dynamic prices the price of renting respectively
buying a set of skis is determined by a lipschitz continuous function. As in
our model the ratio of buying to renting is always a fixed constant. The main
difference to our model is that a bought set of skis lasts forever.
Reduction lemma
The following lemma shows that regarding the competitive ratio every statement
valid for 1-restricted adversaries is also valid for δ-restricted adversaries. We will
therefore constrain our investigations to 1-restricted adversaries.
Lemma 5.1. Let Alg be a γ-competitive (randomized) algorithm against δ-restricted
adversaries. Then there exists for every δ0>0an algorithm which is γ-competitive
against δ0-restricted adversaries.
Proof. Let σ0=(dt,at)1t≤|σ|be an arbitrary δ0-restricted input sequence. Then the
input sequence σ:=δ
δ0dt,at1t≤|σ|is obviously δ-restricted. Let Alg be the online
algorithm which simulates on input σ0the behavior of Alg on input σ. Since the
costs incurred by any algorithm on any input sequence is a linear combination of
the distances, we have
CAlg0(σ0)=CAlg(σ)·δ0
δγ·COpt(σ)·δ0
δ=γ·COpt(σ0).
5.2 Lower bound 77
5.2 Lower bound
Here, we give a lower bound for randomized algorithms against oblivious adver-
saries. In the proof we use the following version of Yao’s min-max principle for
online algorithms. The proof for this lemma can be found in [CLLR97].
Lemma 5.2 (Yao’s min-max principle).If for every C R>0a probability distribution
πover input sequences σexists such that
(a) Eπ[COpt(σ)]C, and
(b) Eπ[CDet(σ)]γ·Eπ[COpt(σ)]for every deterministic algorithm Det,
then no randomized algorithm can be γ0-competitive against oblivious adversaries for any
γ0< γ.
Theorem 5.3. The competitive ratio of every randomized algorithm for Flds against
1-restricted oblivious adversaries is at least
γ(L,D):=
(L+1)2
4D,if D L<2D1
L+1D,if L 2D1
Proof. We proof the theorem using Lemma 5.2. Let Det be an arbitrary determin-
istic online algorithm for Flds and 1 inan arbitrary peripheral node.1The
input sequences of πare concatenations of equal phases τT, where 1 TD. We
will choose an adequate value for Tat the end of the proof. Tdepends only on L
and Dand therefore is constant during every input sequence. We show that for
each τTholds:
(a) E [COpt(τT)]=D
L+1T, and
(b) E [CDet(τT)](L+1T)T
D·E[COpt(τT)].
The number of phases mis chosen such that E hCOpt(τm
T)i=m·D
L+1TC.
Every phase τT(cf. Figure 5.2) consists of four stages: the clearance-stage, the
expansion-stage, the request-stage and the contraction-stage. In the clearance-
stage DL read requests are issued from cat a constant distance of 1. In the
expansion-stage peripheral node imoves away from cwith speed 1 for LT
steps, i.e. di
t=1+tin the t-th step of the expansion-stage. During the expansion-
stage ckeeps on issuing read requests. In the request-stage peripheral node i
issues Tread requests with probability 1
L+1T, otherwise cissues Tread requests.
During the request-stage, peripheral node istays at distance (L+1T). Finally
in the contraction-stage, peripheral node imoves back to distance 1 while cissues
read requests.
1The described phases can also be executed on all peripheral nodes in parallel.
78 File Leasing
Figure 5.2: A single phase of the lower bound.
An optimal offline algorithm knows which node will issue requests during
the request-stage beforehand. Thus, it will not pay anything if cissues requests.
If iissues requests, Opt can either serve these requests by creating a copy or
by sending data for each of the Trequests. The cheapest way to serve read
requests from iby a copy is by creating the copy in the last step of the clearance-
stage, because a copy created earlier will not last until the end of the request-
stage and a copy created later will be more expensive. Thus, Opt pays at most
D. If Opt does not create a copy, it incurs costs of T(L+1T) which is at
least min (L,D(L+1D))Dsince this parabola has no local minimum and
1TDL. Together this results in
E[COpt](τT)=D
1+LT.
We distinguish two cases regarding the behavior of Det: either Det has a copy
on peripheral node iat the beginning of the request-stage or it does not.
If Det has a copy on i, this copy was either created in the current phase which
costs at least D. Or the copy was already present at the beginning of the phase.
Then Det had to pay at least Dfor renewals during the clearance-stage. Thus, in
5.3 Algorithms 79
both cases Det payed at least Din this phase. Together with TD, we have
E[CDet(τT)]D
=(L+1T)D
L+1T
=(L+1T)E[COpt(τT)]
(L+1T)T
DE[COpt(τT)].
If Det does not hold a copy on iat the beginning of the request-stage, it has to
pay for possibly occurring read requests from i. Since TD, Det can not reduce
its costs by creating a copy anymore. Thus,
E[CDet(τT)]T·(L+1T)
=(L+1T)T
D·D
>(L+1T)T
DE[COpt(τT)].
So, in either case
E[CDet(τT)](L+1T)T
D·E[COpt(τT)].
The theorem follows by choosing T:=L+1
2if L<2D1 and T:=Dif L2D1.
5.3 Algorithms
Here, we analyse the competitive ratios of the two simple online algorithms
NeverLease and AlwaysLease.2NeverLease (see Algorithm 8) does not create
any copies at all. AlwaysLease (see Algorithm 9) creates a copy of the file whenever
a peripheral node issues a read request and does not hold a copy. If the lease of a
copy on a peripheral node expires, AlwaysLease renews it only if this peripheral
node issued a read request in the same step. On a write request, AlwaysLease
deletes all the copies of peripheral nodes but a possibly existing copy of the writing
node.
Algorithm 8 NeverLease
1: On request (dt,at)do
2: end
2We abbreviate NeverLease with NL and AlwaysLease with AL in indices.
80 File Leasing
Algorithm 9 AlwaysLease
1: On request (dt,at)do
2: Let ibe the source of the current request at.
3: if at=rithen
4: if i<RAL then
5: Replicate to i.
6: else if iRAL and the copy of iexpires in step tthen
7: Renew the copy of i.
8: end if
9: else if at=withen
10: for jRAL \{c,i}do
11: Delete copy on j.
12: end for
13: end if
14: end
For the analysis of the algorithms we allot the costs incurred by an algorithm
Alg as follows to the peripheral nodes: A peripheral node 1 inis charged
for all those costs that are caused by it. More precisely, peripheral node ipays
for all costs incurred for creating and deleting copies and renewing leases on
i. Furthermore, iis charged di
tin step tif iissues a read request and does not
hold a copy, or if iholds a copy and a write request is issued by any node. We
denote by Ci
Alg(σ) the costs incurred by iwhile Alg serves the input sequence σ.
Since all the costs incurred by Alg are mapped to exactly one peripheral node,
CAlg(σ)=
n
P
i=1
Ci
Alg(σ) holds.
We start with the analysis of NeverLease.
Theorem 5.4. NeverLease is strictly LL+3
2-competitive against 1-restricted adversaries.
Proof. Let σbe an arbitrary input sequence and Opt be an optimal offline algorithm
for Flds. We show that for any peripheral node 1 inholds
Ci
NL(σ)LL+3
2·Ci
Opt(σ).
This implies
CNL(σ)=
n
X
i=1
Ci
NL(σ)
n
X
i=1
LL+3
2·Ci
Opt(σ)=LL+3
2·COpt(σ).
We examine the processing of σbackwards and allot the costs incurred by
NeverLease to costs of Opt.
5.3 Algorithms 81
NeverLease incurs costs di
tin every step tin which iissues a request. On a
write request or if Opt does not hold a copy on i, Opt also has to pay di
tto fulfill
the request. Then we have
Ci
NL(σt)=di
tLL+3
2di
t=LL+3
2Ci
Opt(σt).
Otherwise, iissued a read request and Opt has a copy on i. Then Opt has
created or renewed this copy in a step t0with t>t0tL. We allot all the costs
of NeverLease in the steps after t0until tto Opt’s costs for creating respectively
renewing this copy. Thus,
Ci
NL
t
X
s=t0+1
di
s
L
X
s=1di
t0+s
=Ldi
t0+L
2(L+1)
L+L
2(L+1)di
t0
LL+3
2Ci
Opt.
The proof of the competitiveness of AlwaysLease uses the same technique as
the prior one, but is trickier.
Theorem 5.5. AlwaysLease isstrictly ((D+5)L+D+3)-competitiveagainst1-restricted
adversaries.
Proof. Let σbe an arbitrary input sequence and Opt be an optimal offline algorithm
for Flds. We show that for any 1 in
Ci
AL(σ)((D+5)L+D+3)Ci
Opt(σ) (5.1)
holds. This implies
CAL(σ)=
n
X
i=1
Ci
AL(σ)
n
X
i=1
((D+5)L+D+3)Ci
Opt(σ)=((D+5)L+D+3)COpt(σ).
We examine the processing of σbackwards and allot the costs incurred by
AlwaysLease to costs of Opt. We thereby divide each step into two parts. In the
first part the request is fulfilled by both AlwaysLease and Opt, and AlwaysLease
performs actions. In the second part Opt performs actions.
82 File Leasing
AlwaysLease only incurs costs in step tif ireads the file and it has no copy on
i, or if a node j,iwrites to the file and it has a copy on i, or if iwrites to the
file. In the last case both AlwaysLease and Opt have to pay di
tregardless of their
configurations and (5.1) surely holds.
If ireads the file and AlwaysLease has no copy on i, then AlwaysLease pays
(D+1)di
tfor serving the read request and creating a copy. If Opt has no copy on
i, it has to pay di
tfor the request and we are done. Otherwise, Opt has a copy on
iand we search the last step t0before twhere Opt incurs costs. This can only be
due to a write access or due to Opt renewing or creating the copy on i. This means
that t0tLand Ci
Opt di
t0holds. Since AlwaysLease would have a copy on iin
step t, if there were a read request from isince t0, we can conclude that only read
requests from nodes other than ican occur between t0and t. Thus,
Ci
AL di
t0+(D+1)di
t
di
t0+(D+1)(di
t0+L)
(D+2+(D+1)L)dt0
<((D+5)L+D+3)di
t0
((D+5)L+D+3)Ci
Opt.
If a node j,iwrites to the file and AlwaysLease has a copy on i, then Al-
waysLease pays 2di
tfor serving the write request and deleting the copy. If Opt also
has a copy on i, it has to pay di
tfor the request and we are done. Otherwise, Opt
has no copy on iand we search the last step t0before twhere Opt incurred costs.
This can be because of a request from ior because Opt deleted a copy on i. But
it is also possible that Opt does not have a copy because its prior copy expired
after t0. We will treat this case below. So, if t0is determined by an access from ior
a deletion of Opt, then Ci
Opt di
t0holds. Since AlwaysLease has a copy on iand
no read request from ioccurred since t0, we have t0tLand there is no write
request from any node between t0and t. Thus,
Ci
AL di
t0+2di
t
di
t0+2(di
t0+L)
(3 +2L)dt0
<((D+5)L+D+3)di
t0
((D+5)L+D+3)Ci
Opt.
Now for the special case in which Opt has no copy at the write request from
jin step t, because its prior copy expired after t0. Let us denote the step when
AlwaysLease creates or renews its copy by t00 and the step when Opt loses its copy
by t000. Then, we have t0<t000 <tand t000 t0+Land tt00 +L. First note that
5.3 Algorithms 83
t00 ,t0, because this would imply that the access in t0were a read request from
i, but this can not cause costs for Opt since Opt has and keeps a copy on isince
t000 L<tLt00. Furthermore, between t000 and tonly read requests from nodes
other than ican occur, because Opt would incur costs on a request from iand
a write request from another node than iwould cause AlwaysLease to delete its
copy earlier than t.
If AlwaysLease created respectively renewed its copy before t0, then tt00 +L<
t0+L. This means that the costs of Opt at t0could only be caused by a write request
from i, because Opt and AlwaysLease have a copy at t0and AlwaysLease still has
its copy after t0. Since between t000 and tonly read requests from nodes other than
ican occur, this results in
Ci
AL =di
t0+2di
t
di
t0+2(di
t0+L)
(3 +2L)dt0
<((D+5)L+D+3)di
t0
((D+5)L+D+3)Ci
Opt.
If AlwaysLease created respectively renewed its copy after t0, then we have
t0<t00 t000 <t, because there are no read request from iafter t000. Together with
t000 t0+Land tt00 +L, this implies t00 t0+Land tt0+2L. As seen above
between t000 and tonly read requests from nodes other than ican occur. Between
t0and t000 only read request can occur, because Opt has a copy but incurs no cost.
Especially, a read request is issued by iin step t00 which causes AlwaysLease to
renew or create its copy on i. Since AlwaysLease can create respectively renew
only one copy during t0and t000, this results in
Ci
AL (D+1)di
t00 +2di
t
(D+1)(di
t0+L)+2(di
t0+2L)
((D+5)L+D+3)di
t0
((D+5)L+D+3)Ci
Opt.
Since the parameters Land Dare fixed throughout the whole input sequence, we
cancombineAlwaysLease and NeverLease toa strictlymin (D+5)L+D+3,LL+3
2-
competitive algorithm.
84 File Leasing
5.4 Conclusion and open problems
In this chapter we have considered a new model for the file allocation problem
in dynamic networks which uses renewable leases with a fixed period of validity.
We have shown that this model overcomes the problem of hidden copies, which
are moved far away from the server, described in chapter 2 and thereby allows
competitive algorithms. This is because the costs of lease renewals do not allow
an adversary to move a hidden copy far away from the center of requests for
free. But this is in fact just a slight weakening of the main problem of online file
allocation. The adversary still can hide its actual configuration from the online
algorithm. And so the given lower bound basically depends on the duration the
adversary can hide a copy without paying for it.
In our model we defined that every write access is forwarded to all existing
copies immediately. But this is actually just one reasonable behavior. The server
could also invalidate existing leases either immediately or on the next renewal
request. This would give an adversary the power to delete the copies of an
online algorithm. Furthermore, the server could defer all the updates of a copy
until the next renewal of its lease. This would mean that the size of the answer
to a lease renewal would depend on the number of updated data units of the
file. While this is surely a reasonable strategy for a practical implementation, it
complicates theoretical analysis because of the varying lease renewal answer sizes.
Model variants with deferred reactions on write accesses have the disadvantage
of temporary inconsistency of the content of the copies. Finally, a vast amount
of hybrid strategies to handle write accesses could be implemented. The lower
bound given in Section 5.2 holds for all these model variants since it does not
contain any write requests.
A possible generalization of the model is to allow an algorithm to assign dif-
ferent durations of validity to leases. In this model the lower bound given in
Section 5.2 still holds with the difference that Lnow stands for the longest al-
lowed duration of validity of a lease. This means especially that if there is no
upper bound for the duration of validity of a lease, no competitive online algo-
rithm is possible.
We gave a lower bound of (L)and two simple O(DL)-competitive algorithms.
So there is still a big gap between the upper and the lower bounds. We conjecture
that the following extension LeaseCount of the algorithm Count from Subsec-
tion 3.1.2.3 is O(L)-competitive. LeaseCount behaves exactly as Count with one
exception, whenever a lease expires it deletes the corresponding copy and sets the
corresponding counter to 0. The attempt to prove this conjecture with the help of
a potential function argument failed because of the following universal problem:
The potential function obviously has to depend on the distances of the peripheral
5.4 Conclusion and open problems 85
nodes, because otherwise the costs for the creation respectively deletion of a copy
could not be amortized by a decrease of the potential. But, it is possible that the
adversary has no costs in one (or several) step(s), e.g. when the center node issues
a read request. This especially means that the potential may not increase during
this step, i.e.
Ci
Alg
|{z}
0
(di+ di)Φ(di)0
But, if the potential function depends on the current distance of the peripheral
node, this means that
δdiδ:Φ(di+ di)Φ(di)0
has to hold. Thus, the decisive question is: How can you define a potential
function that does not increase for both increasing and decreasing distance? In
the model with stand-by costs in Chapter 3 this problem was avoided by the
inevitable stand-by costs in every step.
Bibliography
[ABF93] Baruch Awerbuch, Yair Bartal, and Amos Fiat. Competitive dis-
tributed file allocation. In Proceedings of the 25th Symposium on
Theory of Computing (STOC), pages 164–173. ACM, 1993.
[ABF98] Baruch Awerbuch, Yair Bartal, and Amos Fiat. Distributed paging
for general networks. Journal of Algorithms, 28(1):67–104, 1998.
[ADM04] Ittai Abraham, Danny Dolev, and Dahlia Malkhi. LLS: a locality
aware location service for mobile ad hoc networks. In Proceedings
of the Joint Workshop on Foundations of Mobile Computing (DIAL M-
POMC), pages 75–84, New York, NY, USA, 2004. ACM.
[AL99] Susanne Albers and Stefano Leonardi. Online algorithms. ACM
Computing Surveys, 31(3es), 1999.
[AP95] Baruch Awerbuch and David Peleg. Online tracking of mobile
users. Journal of the ACM, 42(5):1021–1058, 1995.
[ASSC02] Ian F. Akyildiz, Weilian Su, Yogesh Sankarasubramaniam, and
Erdal Cayirci. Wireless sensor networks: a survey. Computer
Networks, 38(4):393–422, 2002.
[AW98] Susanne Albers and Jeffery Westbrook. Self-organizing data struc-
tures. In Online Algorithms, volume 1442 of Lecture Notes in Com-
puter Science, pages 13–51. Springer, 1998.
[Bar94] Yair Bartal. Competitive analysis of distributed on-line problems
distributed paging. PhD thesis, Tel-Aviv University, 1994.
[Bar98a] Yair Bartal. Distributed paging. In Online Algorithms, volume 1442
of Lecture Notes in Computer Science, pages 97–117. Springer, 1998.
87
88 Bibliography
[Bar98b] Yair Bartal. On approximating arbitrary metrices by tree metrics.
In Proceedings of the 30th Annual ACM Symposium on Theory of
Computing (STOC), pages 161–168, New York, NY, USA, 1998.
ACM.
[BBKadH09] Marcin Bienkowski, Jaroslaw Byrka, Miroslaw Korzeniowski, and
Friedhelm Meyer auf der Heide. Optimal algorithms for page
migration in dynamic networks. Journal of Discrete Algorithms,
7(4):545–569, 2009.
[BDBK+94] Shai Ben-David, Allan Borodin, Richard M. Karp, G´
abor Tardos,
and Avi Wigderson. On the power of randomization in on-line
algorithms. Algorithmica, 11(1):2–14, 1994.
[BDK05] Marcin Bienkowski, Miroslaw Dynia, and Miroslaw Ko-
rzeniowski. Improved algorithms for dynamic page migration.
In Proceedings of the 22nd Symposium on Theoretical Aspects of Com-
puter Science (STACS), pages 365–376, 2005.
[BEY98] Allan Borodin and Ran El-Yaniv. Online computation and competitive
analysis. Cambridge University Press, 1998.
[BFR92] Yair Bartal, Amos Fiat, and Yuval Rabani. Competitive algorithms
for distributed data managemen. In Proceedings of the 24th Sympo-
sium on Theory of Computing (STOC), pages 39–50. ACM, 1992.
[BFR95] Yair Bartal, Amos Fiat, and Yuval Rabani. Competitive algorithms
for distributed data management. Journal of Computer and System
Sciences, 51(3):341–358, 1995.
[BHJ+93] Andrew D. Birrell, Andy Hisgen, Chuck Jerian, Timothy Mann,
and Garret Swart. The Echo distributed file system. Technical
Report 111, DEC Systems Research Center, 1993.
[Bie05a] Marcin Bienkowski. Dynamic page migration with stochastic re-
quests. In Proceedings of the 17th ACM Symposium on Parallelism
in Algorithms and Architectures (SPAA), pages 270–278, Las Vegas,
Nevada, USA, July 2005. ACM Press, New York, NY, USA.
[Bie05b] Marcin Bienkowski. Page migration in dynamic networks. PhD thesis,
University of Paderborn, 2005.
Bibliography 89
[Bie08] Marcin Bienkowski. Ski rental problem with dynamic pricing.
Technical Report TR032008, Institute of Computer Science, Uni-
versity of Wrocław, 2008.
[BJ05] Marcin Bienkowski and Byrka Jarosław. Bucket game with appli-
cations to set multicover and dynamic page migration. In Proceed-
ings of the 13th Annual European Symposium on Algorithms (ESA),
volume 3669 of Lecture Notes in Computer Science, pages 815–826.
Springer Verlag, October 2005.
[BK05] Marcin Bienkowski and Miroslaw Korzeniowski. Dynamic page
migration under brownian motion. In Proceedings of the European
Conference in Parallel Processing (Euro-Par), pages 962–971, 2005.
[BKM04] Marcin Bienkowski, Miroslaw Korzeniowski, and Friedhelm
Meyer auf der Heide. Fighting against two adversaries: page
migration in dynamic networks. In Proceedings of the 16th ACM
Symposium on Parallelism in Algorithms and Architectures (SPAA),
pages 64–73, 2004.
[BM05] Marcin Bienkowski and Friedhelm Meyer auf der Heide. Page
migration in dynamic networks. In Proceedings of the 30th Interna-
tional Symposium on Mathematical Foundations of Computer Science
(MFCS), pages 1–14, September 2005. Invited talk.
[BS89] David L. Black and Daniel D. Sleator. Competitive algorithms for
replication and migration problems. Technical Report CMU-CS-
89-201, Department of Computer Science, Carnegie-Mellon Uni-
versity, 1989.
[Bur88] Michael Burrows. Efficient data sharing. PhD thesis, University of
Cambridge, 1988.
[CHC+05] Augustin Chaintreau, Pan Hui, Jon Crowcroft, Christophe Diot,
Richard Gass, and James Scott. Pocket switched networks: real-
world mobility and its consequences for opportunistic forward-
ing. Technical Report UCAM-CL-TR-617, Computer Laboratory,
University of Cambridge, 2005.
[CLLR97] Marek Chrobak, Lawrence L. Larmore, Carsten Lund, and Nick
Reingold. A better lower bound on the competitive ratio of the ran-
domized 2-server problem. Information Processing Letters, 63(2):79–
83, 1997.
90 Bibliography
[CM99] M. Scott Corson and Joseph Macker. RFC 2501: Mobile ad hoc
networking (MANET): routing protocol performance issues and
evaluation considerations, January 1999.
[DF82] Lawrence W. Dowdy and Derrell V. Foster. Comparative models of
the file assignment problem. ACM Computing Surveys, 14(2):287–
313, 1982.
[Dij59] Edsger J. Dijkstra. A note on two problems in connexion with
graphs. Numerische Mathematik, 1(1):269–271, 1959.
[Ein56] Albert Einstein. Investigations on the theory of the Brownian move-
ment. Dover Publications, Inc., 1956.
[FKFP07] S´
andor P. Fekete, Alexander Kr¨
oller, Stefan Fischer, and Dennis
Pfisterer. Shawn: the fast, highly customizable sensor network
simulator. In Proceedings of the 4th International Conference on Net-
worked Sensing Systems (INSS), page 299, 2007.
[Fle01] Rudolf Fleischer. On the Bahncard problem. Theoretical computer
science, 268(1):161–174, 2001.
[FW06] Roland Flury and Roger Wattenhofer. MLS: an efficient location
service for mobile ad hoc networks. In Proceedings of the 7th ACM
International Symposium on Mobile Ad Hoc Networking and Comput-
ing (MobiHoc), pages 226–237, New York, NY, USA, 2006. ACM.
[GC89] Cary C. Gray and David R. Cheriton. Leases: an efficient fault-
tolerant mechanism for distributed file cache consistency. ACM
SIGOPS Operating Systems Review, 23(5):202–210, 1989.
[GS90] Bezalel Gavish and Olivia R. Liu Sheng. Dynamic file migration
in distributed computer systems. Communications of the ACM,
33(2):177–189, 1990.
[IW91] Makoto Imase and Bernard M. Waxman. Dynamic Steiner tree
problem. SIAM Journal on Discrete Mathematics, 4:369–384, 1991.
[JM96] David B. Johnson and David A. Maltz. Dynamic source routing in
ad hoc wireless networks. Kluwer International Series in Engeneering
and Computer Science, pages 153–179, 1996.
[KH05] David Kotz and Tristan Henderson. CRAWDAD: a community
resource for archiving wireless data at dartmouth. IEEE Pervasive
Computing, 4(4):12–14, 2005.
Bibliography 91
[KH10] David Kotz and Tristan Henderson. CRAWDAD: a commu-
nity resource for archiving wireless data at Dartmouth. http:
//crawdad.cs.dartmouth.edu/, July 2010.
[KMRS88] Anna R. Karlin, Mark S. Manasse, Larry Rudolph, and Daniel D.
Sleator. Competitive snoopy caching. Algorithmica, 3(1):79–119,
1988.
[KPB+05] AlexanderKr¨
oller, DennisPfisterer, Carsten Buschmann, S´
andorP.
Fekete, and Stefan Fischer. Shawn: a new approach to simulat-
ing wireless sensor networks. In Proceedings of the 3rd Symposium
on Design, Analysis, and Simulation of Distributed Systems (DASD),
pages 117–124, 2005.
[LRWY99] Carsten Lund, Nick Reingold, Jeffery Westbrook, and Dicky Yan.
Competitive on-line algorithms for distributed data management.
SIAM Journal on Computing, 28(3):1086–1111, 1999.
[MadHVW97] Bruce M. Maggs, Friedhelm Meyer auf der Heide, Berthold V¨
ock-
ing, and Matthias Westermann. Exploiting locality for data man-
agement in systems of limited bandwidth. In Proceedings of the
38th Annual Symposium on Foundations of Computer Science (FOCS),
pages 284–293, October 1997.
[Mah10] Peter Mahlmann. Peer-to-peer networks based on random graphs. PhD
thesis, University of Paderborn, 2010.
[MM07] Mirco Musolesi and Cecilia Mascolo. Designing mobility models
based on social network theory. ACM SIGMOBILE Mobile Comput-
ing and Communications Review, 11(3):59–70, 2007.
[MM09] Jan Mehler and Friedhelm Meyer auf der Heide. Power-aware
online file allocation in mobile ad hoc networks. In Proceedings
of the 21st Symposium on Parallelism in Algorithms and Architectures
(SPAA), pages 347–356, New York, NY, USA, 2009. ACM.
[MMS88] Mark S. Manasse, Lyle A. McGeoch, and Daniel D. Sleator. Com-
petitive algorithms for on-line problems. In Proceedings of the 20th
Annual ACM symposium on Theory of computing (STOC), pages 322–
333, New York, NY, USA, 1988. ACM.
92 Bibliography
[MMVW97] Bruce Maggs, Friedhelm Meyer auf der Heide, Berthold V¨
ocking,
and Matthias Westermann. Exploiting locality for data manage-
ment in systems of limited bandwidth. Proceedings of the 38th An-
nual Symposium on Foundations of Computer Science (FOCS), pages
284–293, 1997.
[MVW99] Friedhelm Meyer auf der Heide, Berthold V¨
ocking, and Matthias
Westermann. Provably good and practical strategies for non-
uniform data management in networks. In Proceedings of the 7th
Annual European Symposium on Algorithms (ESA), pages 89–100,
1999.
[PS02] Hans J¨
urgen Promel and Angelika Steger. The Steiner tree problem.
Vieweg, 2002.
[RMSM01] Elizabeth M. Royer, P. Michael Melliar-Smith, and Louise E. Moser.
An analysis of the optimum node density for ad hoc mobile net-
works. In Proceedings of the International Conference on Communica-
tions (ICC), volume 3, pages 857–861, 2001.
[RS94] Prabhakar Raghavan and Marc Snir. Memory versus randomiza-
tion in on-line algorithms. IBM Journal of Research and Development,
38(6):683–708, 1994.
[SBF+08] Antoine Scherrer, Pierre Borgnat, Eric Fleury, Jean-Loup Guil-
laume, and C`
eline Robardet. Description and simulation of dy-
namic mobility networks. Computer Networks, 52(15):2842–2858,
2008. Complex Computer and Communication Networks.
[ST85] Daniel D. Sleator and Robert E. Tarjan. Amortized efficiency of list
update and paging rules. Communications of the ACM, 28(2):202–
208, 1985.
[Wat99] Duncan J. Watts. Networks, dynamics, and the smallworld phe-
nomenon. American Journal of Sociology, 105(2):493–527, 1999.
[WMY06] Qin Wang, MarkHempstead, and Woodward Yang. A realistic
power consumption model for wireless sensor network devices.
In Proceedings of the 3rd Annual IEEE Communications Society Con-
ference on Sensor, Mesh and Ad Hoc Communications and Networks
(SECON), pages 286–295, Sept. 2006.
Bibliography 93
[WS04] Andrew Y. Wang and Charles G. Sodini. A simple energy model
for wireless microsensor transceivers. In Proceedings of the 47th
Annual IEEE Global Telecommunications Conference (GLOBECOM),
volume 5, pages 3205–3209, 2004.
[WY95] Jeffery Westbrook and Dicky C. K. Yan. The performance of greedy
algorithms for the on-line Steiner tree and related problems. Theory
of Computing Systems, 28(5):451–468, 1995.
[YADL98] Jian Yin, Lorenzo Alvisi, Michael Dahlin, and Calvin Lin. Us-
ing leases to support server-driven consistency in large-scale sys-
tems. In Proceedings of the The 18th International Conference on Dis-
tributed Computing Systems (ICDCS), pages 285–294, Washington,
DC, USA, 1998. IEEE Computer Society.