scieee Science in your language
[en] (orig)
JournalofScheduling
https://doi.org/10.1007/s10951-020-00643-w
Broadcasting a file in a communication network
Kai-Simon Goetzmann1·Tobias Harks3·Max Klimm2
© The Author(s) 2021, corrected publication 2021
Abstract
We study the problem of distributing a file, initially located at a server, among a set of nnodes. The file is divided into m1
equally sized packets. After downloading a packet, nodes can upload it to other nodes, possibly to multiple nodes in parallel.
Each node, however, may receive each packet from a single source node only. The upload and download rates between nodes
are constrained by node- and server-specific upload and download capacities. The objective is to minimize the makespan.
This problem has been proposed and analyzed first by Mundinger et al. (J Sched 11:105–120, 2008.https://doi.org/10.1007/
s10951-007-0017-9) under the assumption that uploads obey the fair sharing principle, that is, concurrent upload rates from
a common source are equal at any point in time. Under this assumption, the authors devised an optimal polynomial time
algorithm for the case where the upload capacity of the server and the nodes’ upload and download capacities are all equal.
In this work, we drop the fair sharing assumption and derive an exact polynomial time algorithm for the case when upload
and download capacities per node and among nodes are equal. We further show that the problem becomes strongly NP-hard
for equal upload and download capacities per node that may differ among nodes, even for a single packet. For this case,
we devise a polynomial time (1+22)-approximation algorithm. Finally, we devise two polynomial time algorithms with
approximation guarantees of 5 and 2 +log2n/m/m, respectively, for the general case of mpackets.
Keywords Broadcasting problem ·Content distribution ·File sharing ·Load balancing ·Makespan ·Peer-to-peer ·
Approximation algorithm
The results of Section 3 of this paper appeared as an extended abstract
in Proceedings of the 22nd International Symposium on Algorithms
and Computation (Goetzmann et al. 2011).
The research of Kai-Simon Goetzmann was supported by the
Deutsche Forschungsgemeinschaft within the research training group
“Methods for Discrete Structures” (GRK 1408).
The research of Max Klimm was carried out in the framework of
Matheon supported by Einstein Foundation Berlin (MI-8).
BMax Klimm
Tobias Harks
1Institut für Mathematik, Technische Universität Berlin,
Straße des 17. Juni 136, 10623 Berlin, Germany
2School of Business and Economics, Humboldt-Universität zu
Berlin, Spandauer Straße 1, 10178 Berlin, Germany
3Institut für Mathematik, Universität Augsburg,
86135 Augsburg, Germany
1 Introduction
The theory and practice of computing has seen two major
paradigm shifts over the past decades. First, with the rise of
cloud computing services, the computations for the solution
of a difficult problem are often distributed among several
data centers scattered all over the world. Second, on a more
microscopic level, most processor architectures have mul-
tiple cores that (ideally) solve computational problems in
parallel. In both realms, it is important to efficiently distribute
data available at one entity (computer, processor, etc.) to all
other entities of the system, or said differently, to efficiently
broadcast data (Armbrust et al. 2009). A widely accepted
measure of efficiency is the time needed to complete the
broadcast (called the makespan), leading to the minimum
time broadcasting problem. The makespan objective is for
instance important for cloud computing systems, where it is
crucial that security updates are disseminated to every com-
puting node as fast as possible.
In the minimum time broadcasting problem, we are given
a graph, whose nodes correspond to the entities of the system.
At time zero, a file is located at a designated node (the server),
123
JournalofScheduling
0
core
1 2 ... ... n
cd
0
cd
1
cu
1
cd
2cu
2
cd
n
cu
n
Fig. 1 Graphical representation of the file distribution problem we con-
sider
and the task is to disseminate the file to all nodes as fast as
possible. We assume that the file is divided into mequally
sized packets. At any point in time, each node that possesses
a certain packet may send it with an arbitrary rate to other
nodes that have not yet received this packet. As it is common
in cloud computing, the nodes are connected to each other via
a core network which is usually overprovisioned. Thus, the
sending rates to and from other nodes (via the core) are lim-
ited by two capacities. First, for each node ithere is an upload
capacity cu
i, and the sum of the sending rates of imay not
exceed cu
i. Second, every node has a download capacity cd
i.
This problem contains elements of several classical com-
binatorial optimization problems such as network design,
scheduling and routing. A feasible solution (for a single
packet) can be thought of as a directed tree rooted at the
source node with the understanding that a node receives the
packet from its unique parent and sends it to its children
nodes. The additional feature besides finding an optimal dis-
tribution tree is to define a schedule determining the points
in time when nodes actually send the packet. In contrast to
classical scheduling models, however, once nodes received
the packet they may send the packet to other nodes, thus, in
scheduling terminology, machines can be activated.
Alternatively, the problem can be interpreted as a dynamic
multicommodity flow problem in a star-shaped topology, see
Fig. 1. The central vertex of the network represents the core
network. All nodes are connected to this vertex via a pair of
antiparallel arcs, equipped with upload and download capac-
ities. Packets of the file can be sent from one node to another
via the unique directed simple path between them. The total
flow on an arc must not exceed its capacity at any point in
time.
1.1 Previous work
There is a large body of work on the (minimum time) broad-
casting problem, multicast problem, and gossiping problem,
see Hedetniemi et al. (1998) for a survey. In the broadcasting
problem, the task is to disseminate a file from a source node to
the rest of the nodes in a given communication network as fast
as possible, see Ravi (1994) for an approximation algorithm.
When the file needs to be disseminated only to a subset of the
nodes, this task is referred to as multicasting, see Bar-Noy
et al. (2000a) for approximation algorithms. In the gossiping
problem, several nodes possess different files and the goal
is to transmit every file to every node. The usual underly-
ing communication model for these problems is known as
the telephone model: A node may send a file to at most one
other node at a time, and it takes one round (i.e., one unit
of time) to transfer a file. For complete graphs on n+1
nodes, this process terminates in log2(n+1)rounds. It is
known that for arbitrary communication graphs, the problem
of computing an optimal broadcast in the telephone model
is NP-hard (cf. Garey and Johnson 1979), even for 3-regular
planar graphs (cf. Middendorf 1993).
Khuller and Kim (2007) studied the problem of broadcast-
ing in heterogeneous networks [cf. Bar-Noy et al. (2000a)for
the corresponding multicast problem]. They consider com-
plete graphs, but extend the telephone model by allowing
the transmission time of the file to depend on the sender.
They prove that it is NP-hard to minimize the makespan and
present an approximation scheme.
All the above models, however, do not take into account
two important features of many real-world implementations
of broadcasting systems, such as cloud computing or peer-
to-peer networks. First, each node typically sends the file to
multiple nodes of the network in parallel. Second, the file
is usually divided into packets (as the smallest indivisible
unit), and a node may receive different packets from dif-
ferent nodes. These extensions result in structural changes
of the model. In contrast to the telephone model [including
the extension of Khuller and Kim (2007)], in our model the
transfer time between any two nodes is no longer part of the
instance but determined by the sending rates of a feasible
solution. In fact, the model of Khuller and Kim (2007)for
broadcasting in heterogeneous networks reduces to a special
case of our model. If cd
imaxjNcu
jfor all iN, there
always is an optimal solution in which no node will send the
file to more than one other node at a time, thus, the time to
transfer the file from one node to another only depends on
the sender’s upload capacity. In general, however, it may be
beneficial to serve multiple nodes simultaneously as illus-
trated in the following example of a file distribution problem
in a network with three nodes. Node 0 with capacity cu
0=2
initially owns the file consisting of one packet. There are two
further nodes with capacities cu
i=cd
i=1, i=1,2. In the
optimal solution, both nodes receive the file in parallel at a
rate of 1, yielding a makespan of M=1. Restricting nodes
to upload to at most one other node at a time results in a
makespan of 2.
Mundinger et al. (2008) studied a peer-to-peer file distri-
bution problem, where a file is subdivided in multiple parts
123
JournalofScheduling
and the goal is to disseminate the complete file to every peer
as fast as possible. The crucial difference to our model is that
they assume that the upload capacity of a node is equally
shared among concurrent uploads. Under this fair sharing
assumption, they prove that for homogeneous capacities (i.e.,
cu
i=cd
i=1 for all nodes i) a simple greedy algorithm is opti-
mal. (A similar result was also earlier reported by Kwon and
Chwa (1995) and Bar-Noy et al. (2000b), assuming the tele-
phone model.) We prove this result without the fair sharing
assumption. Our proofs are quite involved, since dropping the
fair sharing assumption considerably complicates matters.
For heterogeneous capacities, to the best of our knowledge,
neither approximation algorithms nor hardness results were
known before.
When the number of parts of the file tends to infinity, one
obtains the so-called fluid flow model, for which a relatively
easy closed-form expression of the minimum completion
time can be derived, see, e.g., Ezovski et al. (2009), Kumar
et al. (2006), and Mehyar et al. (2007). Qiu and Srikant (2004)
studied a related model with stochastic arrival of peers. Fan
et al. (2009) considered a finite number of parts, but assumed
that the number of parts is large enough such that a peer
always has enough “new” parts to share with other peers.
The file distribution problem we consider has elements of
a scheduling problem in which a job becomes a machine after
its completion. A related problem exhibiting this property is
the freeze-tag problem studied in the area of robotics. Here, a
set of asleep robots is placed on a graph. There is one desig-
nated robot which is awake and that can travel to other robots
in order to awake them. Once a robot is awake, it may travel
to other robots to awake them. The task is to awake all robots
as fast as possible. Arkin et al. (2003) showed the hardness of
the problem in unweighted graphs and gave a constant factor
approximation algorithm for the case that there is at most
one robot at each node. For the general case, they obtained a
log n-approximation, where nis the number of robots.
Könemann et al. (2005)devisedaOlog n-approximation
for the weighted case. Arkin et al. (2006) studied the prob-
lem for different graph topologies. Closest to our setting,
they showed that the problem is NP-complete, even for star
networks. For this case, they devised a 14-approximation.
1.2 Summary of the results and used techniques
Our results for a single packet. In Sect. 3, we study the basic
situation in which a single packet is to be broadcasted. For
the case of homogeneous symmetric (unit) capacities, that is,
cu
i=cd
i=1 for all nodes i, we show that a greedy algorithm
computes an optimal solution with makespan log2(n+1).
Although similar results for the same algorithm have been
obtained before, e.g., by Mundinger et al. (2008), or in the
broadcasting literature (Kwon and Chwa 1995; Bar-Noy et al.
2000b), our result holds for a more general model as we drop
the fair sharing assumption used by Mundinger et al. in the
telephone model. For the case of unit node capacities and an
arbitrary integer server capacity, we propose a polynomial
time algorithm (that possibly splits up server capacity) and
prove its optimality. We also give a closed-form expression
of the minimal makespan.
If node capacities are heterogeneous and symmetric (i.e.,
cu
i=cd
ifor all nodes i), we show that the problem becomes
strongly NP-hard. A key ingredient of the reduction (from 3-
Partition)istheCapacity Expansion Lemma (Lemma 5)
that provides an upper bound on the total amount of data
downloaded at any point in time.
In light of the hardness, we then study approxima-
tion algorithms. We first devise a polynomial time 22-
approximation algorithm for instances with heterogeneous,
symmetric capacities in which the upload capacity of the
server is larger than the download capacity of any other
node. For smaller server capacities, a slight modification of
our algorithm gives a (1+22)-approximation. Our algo-
rithm which we term Scale- Fit proceeds in two phases; in a
first phase, we use a time-varying resource augmentation to
construct a so-called 2-augmented solution, that violates
the capacity constraints of the nodes at any point in time
by a factor of at most 2. By exploiting again our Capac-
ity Expansion Lemma, we prove that the makespan of the
augmented solution thus constructed is at most a factor of 2
away from the optimal makespan of the original instance. We
then rescale the relaxed solution to obtain a feasible solution
with makespan less than a factor 22 away from the optimal
makespan.
Our results for multiple packets. In Sect. 4, we proceed
by analyzing the more challenging problem of distributing
a file that is divided into m>1 packets to a set of nodes
with heterogeneous symmetric capacities. First, we devise
an algorithm, which we call Spread- Exchange, that works
in two phases. In the first phase, we use a slight variation of
our Scale- Fit-Algorithm for single packet distribution in
order to send one packet to each node. At the end of the
first phase, each node possesses one packet of the file, but
the packets owned by different nodes may differ. In fact,
we strive to maximize the variety of different packets avail-
able at the nodes. In the second phase, the nodes exchange
the packets among each other, in each round increasing the
number of packets available at each node by one. By care-
fully keeping track of the proportions of the different packets
available at the nodes during both phases of the algorithm, we
prove that for instances for which the server has the largest
capacity, Spread- Exchange achieves an 4-approximation
of the optimal makespan. For smaller server capacities, a
slight variation of the algorithm gives a 5-approximation.
Motivated by real-world instances where the number of
packets is large compared to the number of nodes, we also
propose a different algorithm, termed Spread- Mirror-
123
JournalofScheduling
Cycle, that performs better than Spread- Exchange on
these instances. The main idea is to invest more time in the
first phase, in order to achieve a balanced proportion of the
different packets available at the different nodes. This allows
to perform a faster exchange procedure in the second phase.
For a large number of packets, the second phase dominates
the makespan of the solutions of both algorithms and, thus,
Spread- Mirror- Cycle performs better than Spread-
Exchange. We prove that Spread- Mirror- Cycle yields
a(2+2log2n/m/m)-approximation, establishing that
Spread- Mirror- Cycle has a better worst-case guarantee
for instances in which nm2(3/2)m.
2 Preliminaries
An instance of the minimum time broadcasting prob-
lem is described by a tuple I=(N,m,cd,cu), where
N={0,...,n}is the set of nodes, mN1is the
number of packets, cd=(cd
0,...,cd
n)Qn+1
0is the vec-
tor of download capacities from the core network and
cu=(cu
0,...,cu
n)Qn+1
0is the vector of upload capacities
to the core network. Let [m]={1,2,...,m}be the set of
packets. We will identify the server with node 0 and assume
that only the server initially owns the file. See Fig. 1for an
illustration. The file is of unit size; the size of each packet is
thus 1/m. A feasible solution S=s(k)
i,ji,jN,k∈[m]is a fam-
ily of integrable functions s(k)
i,j:R0R0,i,jN,
k∈[m], where s(k)
i,j(t)denotes the rate at which node i
sends packet kto node jat time t. We require that each node
iNreceives each packet kfrom a unique node, denoted
by p(k)(i)= i:
s(k)
,i(t)=0 for all = p(k)(i)and all tR0.
In addition, only nodes that possess a packet kcan send that
packet with a positive rate:
s(k)
i,j(t)=0 for all i,jN\{0},k∈[m],tR0
with t
0
N
s(k)
,i) dτ< 1
m.
Finally, the sending rates have to obey download and upload
capacity constraints:
k∈[m]
jN
s(k)
i,j(t)cu
ifor all iN,tR0,and
k∈[m]
s(k)
p(k)(j), j(t)cd
jfor all jN,tR0.
We denote by
xi(t)=t
0
k∈[m]
s(k)
p(k)(i),i) dτ
the proportion of the file owned by node iN\{0}at time t.
For notational convenience, we set x0(t)=1 for all tR0.
We let Ci=inf{tR0:xi(t)=1}denote the completion
time of node i. The makespan of a solution Sis then defined
as M(S)=maxiN\{0}Ci. If an instance satisfies cu
i=cd
i=
cu
j=cd
jfor all i,jN\{0}we speak of an instance with
homogeneous symmetric capacities. If an instance satisfies
cu
i=cd
ifor all iN(but possibly cu
i= cu
jfor some i,j
N\{0}), we speak of heterogeneous symmetric capacities.
In both cases, we only write I=(N,m,c)meaning that
c=cu=cd.Ifm=1, we only write I=(N,c).
As we will show, the minimum time broadcasting problem
is in general NP-hard. A common approach in theoretical
computer science to deal with such problems is to consider
approximation algorithms. For a minimization problem P
and ρ>1, a ρ-approximation algorithm is an algorithm
that, for any instance IP, computes a feasible solution
Sin time that is polynomial in the encoding length |I|of
the input, such that the objective value c(S)is not greater
than the optimal value Opt(I)by more than a factor of ρ,
i.e., c(S)ρOpt(I).
There are some particularities of the minimum time broad-
casting problem regarding the polynomial running time of
algorithms. In general, a solution may consist of arbitrar-
ily complicated sending rate functions s(k)
i,j. Without loss
of generality, we can restrict ourselves to piecewise con-
stant functions with discontinuities only at points in time
where some node starts or finishes the download of a packet.
Any solution that does not have this property can be modi-
fied to do so by flattening the sending rates in the intervals
where the sending and receiving nodes for each packet are
constant. Specifically, let [t1,t2)be such an interval. It is
easy to verify that the new sending rates ¯s(k)
i,jdefined as
¯s(k)
i,j=1
t2t1t2
t1s(k)
i,j) dτfor all i,jNobey the capacity
constraints of the upload and download links; see also Miller
and Wolisz (2011, Theorem 1) for a formal proof. A simple
shifting argument further shows that it is without loss of gen-
erality to assume that a node starts receiving a packet only
at time zero or when another node has finished receiving a
packet. There are at most m·nof these breakpoints.
However, if the number of packets mis part of the input,
encoded in binary, a polynomial time algorithm would usu-
ally only be allowed time that is polynomial in log m.Given
that even a simplified (optimal) solution might need space
linear in m, in the context of broadcasting a divisible file we
call an algorithm polynomial if it is polynomial in the encod-
ing length of the input and m. Another reason for assuming
123
JournalofScheduling
Algorithm 1: Greedy for identical capacities
Input: instance I=(N,c)with c=1
Output: makespan-minimal solution S
1si,j) := 0τ0;
2A:= {0},P:= N\{0},t:= 0;
3while P=∅do
4A:= ;
5forall the iAdo
6Choose jP;
7si,j(t):= 1forallt∈[t,t+1);
8A:= A∪{j},P:= P\{j};
9end
10 A:= AA,t:= t+1;
11 end
an input size proportional to mis that in real systems each
packet contains a unique piece of information and, thus, the
input size is in practice proportional to m.
3 Broadcasting a single packet
We begin by studying the base case m=1, that is, the file
consists of a single packet that is to be broadcasted. Parts
of the results obtained in this chapter will be reused later to
solve the more challenging general case m>1. To increase
the readability, throughout this section we drop the index k
for the packet number from the notation.
This section is organized as follows. First, we propose
efficient and optimal solutions for the case of homogeneous
symmetric capacities, that is, upload and download capaci-
ties among the nodes are equal. Then, we show that for the
more general case of heterogeneous symmetric capacities the
computation of an optimal solution becomes strongly NP-
hard. Finally, we give an efficient O(nlog n)-algorithm that
approximates the optimal makespan by a factor of 1 +22.
3.1 Homogeneous symmetric capacities
In this section, we consider the homogeneous symmetric set-
ting, i.e., cu
i=cd
i=ci=1 for all iN\{0}. The server
has a capacity of cu
0=c0.
If the server has unit upload capacity c0=1 as well,
the following greedy procedure yields an optimal solution:
At each point in time, any node that already owns the file
uploads it to exactly one other node, which takes one unit of
time. Thus in each step, the number of nodes owning the file
is doubled, resulting in a makespan of log2(n+1).Fora
formal description of the procedure, see Algorithm 1.
Lemma 1 If ci=1for all i N, Algorithm 1computes
an optimal solution in time O(n). The optimal makespan is
log2(n+1).
Proof Throughout the algorithm, the number of nodes that
own the file (including node 0) at time tNis min{2t,n}
(proof by induction). All completion times are integral;
hence, the makespan of the constructed solution is
M=min{tN:2tn+1}
=min{tN:tlog2(n+1)}
=log2(n+1).
To prove that this is best possible, consider an optimal
solution S=(s
i,j)i,jNwith the corresponding completion
times C
1,...,C
n. We will modify this solution such that in
the time interval [0,1)only one node is served, completing
the transfer at time 1, without increasing the makespan. Iter-
ating this argument yields a solution with the same structure
as the one constructed by Algorithm 1, establishing the claim.
Let the nodes be indexed such that C
1=mini=1,...,nC
i.We
set the rate vector sof the modified solution as follows:
s
0,1(t)=1fort∈[0,1),
0fort1,
s
0,i(t)=
0fort∈[0,1),
x
i(C
1)
C
11for t∈[1,C
1),
s
0,i(t)for tC
1,
for all iN\{1}with p(i)=0,
s
i,j(t)=s
i,j(t)
for all other i,jNand tR0. Without loss of generality,
we can assume s
i,j(t)=1
C
1C
1
0s
i,j) dτfor all t∈[0,C
1)
and all i,jN; see Miller and Wolisz (2011, Theorem 1).
It then holds that x
i(C
1)=s
0,i(0)·C
1for iN\{1}
with p(i)=0. Using this, we can show that in the modified
solution the upload capacity of node 0 is obeyed: For t
[1,C
1),
iN\{1}:p(i)=0s
0,i(t)
=1
C
11iN\{1}:p(i)=0x
i(C
1)
=C
1
C
11iN\{1}:p(i)=0s
0,i(0)
=1
11
C
1iNs
0,i(0)

1
s
0,1(0)
1
1s
0,1(0)1s
0,1(0)
=1.
123
JournalofScheduling
The inequality above also yields s
0,i(t)1 for all iN\{1}
with p(i)=0, so download capacities are obeyed as well
since p(i)=0 implies s
j,i(t)=0 for all j= 0, tR0.
Finally, at time C
1every node iNowns the same fraction
of the file as in the original solution, namely x
i(C
1). Since
in Sno node completes its download before time C
1, and
in the modified solution after time C
1the original rates are
used, no completion of a download is delayed, and hence,
the makespan is not increased.
We now consider the case where the server capacity c0is
an arbitrary integer and start with a proof that in this case
there always is an optimal solution that uses fair sharing,
i.e., whenever node 0 serves several nodes simultaneously,
all of them are served with equal rate. The proof is quite
involved, and the integrality condition on c0is crucial. To
see this, consider an example with c0=3/2 and n=4. In
an optimal solution, the server serves node 1 in [0,1)and
node 2 in [1,2)with a rate of 1 each and node 3 with a rate
of 1/2in[0,2). Node 4 is served by node 1 in [1,2), resulting
in a makespan of M=2. The best solution that uses fair
sharing, however, has a makespan of M=7/3: The server
serves nodes 1 and 2 in [0,4/3)witharateof3/4 each, and
node 3 and 4 are both served at a rate of 1 in [4/3,7/3)by
the server and node 1.
Lemma 2 For any instance I =(N,c)with c0Nand
ci=1for all i =1,...,n, there always exists an optimal
solution that uses fair sharing.
Proof For an arbitrary instance Iwith the demanded proper-
ties, consider an optimal solution Swith the corresponding
completion times C
1,...,C
n. W.l.o.g. assume that the nodes
are indexed such that
{1,2,...,k}={iN:p(i)=0}
and let 1,
2,...,∈{1,...,k}be such that
C
1=...=C
1
<C
1+1=···=C
1+2<C
1+2+1...C
k.
Let N1={1,...,
1}.Fori=1,...,k,letmibe the
makespan for serving all nodes that are directly or indirectly
served by node i, i.e.,
mi=maxC
jC
i:jN,pr(j)=ifor some rN,
where pr(j)denotes the r-fold application of the function
pon j. By Lemma 1, we can assume that these nodes
are served according to Algorithm 1. (Otherwise, we mod-
ify the solution to get this structure without increasing the
makespan.) In particular, we get that miNfor all i=
1,...,k. The makespan of the given optimal solution is
M=maxi=1,...,k{C
i+mi}. Finally let M1=maxiN1mi.
In the following, we show how to modify the original solu-
tion sto obtain a solution sthat uses fair sharing and has
not a larger makespan. All primed variables s
i,j,C,x,
1
and N
1refer to variables of the modified solution. When-
ever s
i,j(t)is not specified for some i,j,t, it is equal to the
original value s
i,j(t). The modification of sto sproceeds
iteratively in steps. In each step, one of four modifications
is made. Specifically, if the server capacity is not smaller
than k, all nodes {1,...,k}are served by the server at
equal rate (Case 1). If k>c0and M1mi0for some
i0∈{1+1,...,k}, a single node is added to N1and all
nodes in N1are served with equal rate while maintaining the
original rates for all nodes not in N1(Case 2). If 1c0and
M1>mifor all i∈{1+1,...,k}, the server serves all
nodes in N1at full capacity and with fair sharing; the serving
of the other nodes is delayed (Case 3). Finally, if 1<c0<k
and M1>mifor all i∈{1+1,...,k}, we increase N1to
size c0and serve the nodes in N1at full capacity with fair
sharing; the serving of all other nodes is delayed (Case 4).
Application of Case 1 immediately leaves us with a solu-
tion that uses fair sharing. Application of Case 3 and Case 4
leaves us with a solution swith completion times
C
1=...=C
1
<C
1+1=···=C
1+
2
<C
1+
2+1...C
k,
where the nodes in N
1={1,...,
1}are served in time
[0,C
1)with fair sharing and nodes i∈{
1+1,...,k}are
served by the server after C
1. We then fix the part of the
solution related to the nodes in N
1(and the nodes served by
these nodes) and apply the same modifications for the nodes
in N
2={
1+1,...,
1+
2}. Finally, after application of
Case 2, we have increased the size of N1by one. Since the
number of nodes is finite, after a finite number of applications
of Case 2, one of the other three cases is applied.
We proceed to show that the application of the four cases
yields a feasible solution and does not increase the makespan.
Case 1: kc0.
In this case, we let node 0 serve the nodes 1,...,kat a rate
of 1 in the time interval [0,1). Obviously no node completes
its download later than in the given solution, and fair sharing
is used all the time.
In the following cases, we always assume that k>c0.
Case 2: M1mi0for some i0∈{1+1,...,k}.
We add one node to N1, serving nodes 1,...,
1+1 with
equal rates in the time interval 0,C
1+1:
i1,...,
1+1:
s
0,i(t)=1+1
j=1
s
0,j(t)
1+1for all t0,C
1+1,
0 otherwise.
123
JournalofScheduling
Then nodes 1,...,
1+1 all complete their download at time
C
1+1since
x
i(C
1+1)=C
1+1
0
s
0,i) dτ
=1
1+1
1+1
j=1C
1+1
0
s
0,j) dτ

=1
=1
for all i=1,...,
1+1. The download capacities of all
nodes are satisfied since at any point tin time
s
0,i(t)=1
1+1
1+1
j=1
s
0,j(t)max
j=1,...,1+1s
0,j(t)1.
The same is true for the upload capacity of the server, since it
uploads with the same overall rate as in the original solution
at each point in time.
We proceed to argue that this modification does not
increase the makespan. First, since M1mi0, for all iN1
we have
C
i+mi=C
k+1+miC
i0+mi0M,
so delaying the completion of nodes in N1does not influence
the makespan. Furthermore, in the modified solution s,all
nodes i∈{1+2,...,k}are served by the server in the
same way as in solution sso that their completion time Ci
as well as the completion times of the nodes served by them
does not change.
We now turn to the cases where M1>mifor all i=
1+1,...,k. Note that by Lemma 1, we know that miN
for all i; hence, we can assume M1mi+1 for all i>
1
in the subsequent cases.
Case 3: 1c0and M1mi+1 for all i>
1.
In this case, we modify the solution such that node 0 first
serves all nodes in N1at full capacity, delaying the serving
of all other nodes until those in N1have completed their
downloads. More formally, let t1=1/c0, and set
iN1:s
0,i(t)=c0
1for all t∈[0,t1)
0 otherwise.
i=1+1,...,k:s
0,i(t)=0 for all t∈[0,t1).
It is obvious that in [0,t1)all capacities are obeyed and C
i=
t1C
ifor all iN1.
To specify how the remaining nodes are served, let t2=
max{C
1,t1+1}. We serve nodes 1+1,...,kin such a way
that the fraction of the file they own at time t2is the same as
in the original solution, i.e., x
i(t2)=x
i(t2):
i=1+1,...,k:s
0,i(t)=x
i(t2)
t2t1
for all t∈[t1,t2).
Since x
i(t2)1 and t2t11, this solution obeys the
download capacities of all nodes. Also the upload capacity
of the server is obeyed, since in the time interval [0,t2)it
sends the same volume as in the original solution:
t2
0
iN
s
0,i) dτ=
1
i=1t1
0
c0
1
dτ+
k
i=1+1t2
t1
x
i(t2)
t2t1
dτ
=
k
i=1
t1·c0
1

=1=x
i(t2)
+
k
i=1+1
x
i(t2)
=
k
i=1
x
i(t2),
and the overall rate the server sends at is c0at the beginning
and constant afterward, hence it is at most c0at all times.
From time t2on, in the modified solution s, all nodes are
served as in the original solution s.
Nodes in N1and nodes with C
it2complete their down-
load no later than in the original solution. If there is a node
i∈{1+1,...,k}with C
i<t2, then t2>C
1and thus
t2=t1+1, and this node icompletes its download exactly
one time unit after the nodes in N1. Since M1mi+1we
obtain that
C
i+mi=t1+1+miC
1+M1M,
therefore the makespan is not increased.
Case 4: 1<c0and M1mi+1 for all i>
1.
In this case, we add nodes 1+1,...,c0to N1and serve
these nodes jointly at full capacity in the time interval [0,1):
i=1,...,c0:s
0,i(t)=1 for all t∈[0,1)
0 otherwise.
i=c0+1,...,k:s
0,i(t)=0 for all t∈[0,1).
For the serving of the remaining nodes, we set t2=
max{C
c0,2}and serve nodes c0+1,...,ksuch that x
i(t2)=
x
i(t2)by setting
i=c0+1,...,k:s
0,i(t)=x
i(t2)
t21for all t∈[1,t2).
Feasibility follows similarly to Case 3, as well as the fact that
the makespan is not increased.
We continue by proving that the server always serves
groups of c0nodes jointly, possibly except for some start
123
JournalofScheduling
interval [0,t1),t1R0where a group of at least c0and
most 2c0nodes is served.
Lemma 3 For any instance I =(N,c)with c0Nand
ci=1for all i =1,...,n, there always exists an opti-
mal solution that uses fair sharing where the server never
serves more than c0nodes simultaneously except for some
interval [0,t1],t
1R0in which at least c0and at most
2c0nodes are served simultaneously until they are com-
pleted.
Proof For an instance with the demanded properties, con-
sider an optimal solution S. Using Lemma 2, we can assume
w.l.o.g. that the set {iN:p(i)=0}of nodes served
by the server is partitioned into sets N1,...,Nksuch that
all nodes iNjare served simultaneously by the server
at equal rates in the time interval [tj1,tj], where t0=0
and tj=j
=1max{1,|N|
c0}for j=1,...,k. We can
further assume that |Nj|≥c0for j=1,...,k1
(otherwise we can add nodes from Nj+1to Njwithout
increasing the makespan) and |Nj|<2c0for j=1,...,k
(Otherwise, we split Njinto two sets and serve c0nodes
in [tj1,tj1+1]and the remaining |Nj|−c0nodes in
[tj1+1,tj].)
For j=1,...,k, we denote by ˜
Njthe set of nodes that
are directly or indirectly served by a node in Nj, and by Mj
the makespan needed for this, i.e.,
˜
Nj=iN:pr(i)=ifor some iNj,rN,
Mj=maxi˜
NjC
itj,(1)
where again pr(j)denotes the r-fold application of the func-
tion pon j. As in the proof of Lemma 2, we can assume that
MjNfor all j=1,...,k. We now prove by induction
that the solution can be modified such that |Nj|≤c0for
j=2,...,k, without increasing the makespan. For the base
case, assume |N2|>c0. We decrease N2to contain only c0
nodes.
If M1M2+1, we remove |N2|−c0nodes from |N2|,
add them to N1and let them be served by the server jointly
with the other nodes in N1. This delays the completion of
nodes in N1to time
|N1|+|N2|−c0
c0=t21,
but using M1M2+1 it does not increase the makespan
since the new makespan of any node in ˜
N1can be bounded
by t21+M1t2+M2. The case M1M2+2
is more involved. Here, the |N2|−c0removed nodes will
be served by nodes in ˜
N2. For this, the makespan M2
has to be increased by at most one. To see this note that
|˜
N2|≤|N22M2. With |N
2|=c0,inM2+1 time units
we can serve up to c0·2M2+1=2c0·2M2>|N22M2
Algorithm 2: Extended Greedy for integral server
capacity
Input:I=(N,c)with c0N,ci=1foralliN\{0}
Output: makespan-minimal solution S
1h:= log2n
c0+1;
2if nc0(2h1+2h1)then
3Choose N1Nwith |N1|=c0;
4k:= h+1;
5else if n<c0(2h1+2h1)then
6Choose N1Nwith |N1|=nc0(2h11)
2h1;
7k:= h;
8end
9t1:= |N1|/c0;
10 In [0,t1), node 0 serves nodes in N1;
11 for t=0,...,k2do
12 Choose Nt+2N,|Nt+2|=c0of unserved nodes;
13 In [t1+t,t1+t+1), node 0 serves nodes in Nt+2,any
completed node iN\{0}serves one other node that is not
yet served (if there still is one);
14 end
nodes through nodes in N
2, so we can serve all nodes in
˜
N2(including the nodes from N2). The overall makespan is
not increased by this, since the serving of N
2is now com-
pleted at time t
2=t1+1, and hence, all nodes in ˜
N2are
served at time t
2+M2+1=t1+M2+2t1+M1
M.
For the inductive step, assume |Nj|>c0for some
j=3,...,k. By the induction hypothesis, we know that
|N|=c0for =2,..., j1 and thus tj1=t+
j1 for all =1,..., j1. With this and a dis-
tinction between the cases MMj+jand M
Mj+j+1, the inductive step follows similarly to the
base case.
Summing up, we have shown so far that there always is
an optimal solution where the server at the beginning serves
a set of nodes N1,c0≤|N1|<2c0,in|N1|/c0time units,
then some sets N2,...,Nk1with |Nj|=c0for all j2,
in one time unit each, and finally a set Nkcontaining at most
c0nodes, in another unit of time.
We are now ready to present an exact algorithm for this
special case and prove its correctness. Intuitively speaking,
the algorithm is an extended greedy procedure that attempts
to balance the sub-makespans Mjof the different sets Nj
of nodes served by the server. Set h=log2(n/c0+1).If
n<c0(2h1+2h1),let|N1|=nc0(2h11)/2h1
and k=h; otherwise, let |N1|=c0and k=h+1.
Let |Nj|=c0for j=2,...,k. The server serves N1in
[0,|N1|/c0)and Njin |N1|/c0+j2,|N1|/c0+j
2. Meanwhile, any node that finishes its download starts
serving other nodes greedily. Details are listed in Algo-
rithm 2.
123
JournalofScheduling
Theorem 4 For any instance I =(N,c)with c0Nand
ci=1for all i =1,...,n, Algorithm 2yields an optimal
solution. The optimal makespan is
M=
h1+1
c0nc0(2h11)
2h1
if n c0(2h1), c0(2h1+2h1)
h+1
if n c0(2h1+2h1), c0(2h+11),
where h =log2(n/c0+1).
Proof W.l.o.g. we can assume that in the solution produced
by Algorithm 2,|˜
Nj|=c0·2kjfor all j=2,...,k[with
˜
Njas defined in (1)], i.e., all sets ˜
Njfor j2 are as big
as possible. If this is not the case, we can move nodes from
˜
N1to these sets. The lemmas above state that there also is an
optimal solution with this structure and c0≤|N1|<2c0.In
the argumentation below, we therefore restrict to solutions of
this structure.
The integer his chosen such that c0(2h1)n<
c0(2h+11). We first consider the case where nc0(2h
1+2h1). Our algorithm sets k=h+1 and |N1|=c0.
The maximum number of nodes that can be served like this
is c0(2h+11)>n, so indeed all nodes are served within a
makespan of h+1. Assume that in an optimal solution only
k=hsets are served by the server. The maximum number
of nodes served in this way is |N12h1+c0(2h11)<
c0(2h+2h11)n, so not all nodes can be served,
contradiction! Therefore in an optimal solution it must hold
that k=h+1 and |N1|≥c0, proving that the solution
produced by Algorithm 2is optimal.
If n<c0(2h1+2h1), the size of N1in an optimal
solution (with |˜
Nj|=c0·2kjfor j2) has to be chosen
such that |˜
N1|=nc0(2h11)≤|N12h1, i.e.,
|N1|=min N:nc0(2h11)
2h1
=nc0(2h11)
2h1,
which is exactly how the algorithm chooses N1. Moreover,
the algorithm sets k=h. It is obvious that choosing kh1
does not lead to a solution where all nodes are served, and
choosing kh+1 leads to a solution with a makespan of
at least h+1. The solution produced by our algorithm has a
makespan of t1+k1h+1, where we used t12 and
k=h. This proves the correctness of Algorithm 2.
3.2 Heterogeneous symmetric capacities
In this section, we consider the case of heterogeneous and
symmetric capacities. First, we show that the minimum time
broadcasting problem for heterogeneous symmetric capaci-
ties is strongly NP-hard, even for an indivisible file. Then, we
devise an algorithm that approximates the optimal makespan
by a factor 1+22. Our hardness and approximation results
rely on a useful lemma that bounds the work that has to
be done in any feasible solution. For a feasible solution, let
ui(t1,t2)and zi(t1,t2)denote the total upload and the idle-
ness of node iin time interval [t1,t2], defined as
ui(t1,t2)=t2
t1
jN
si,j) dτ
and
zi(t1,t2)=t2
max{t1,Ci}
ci
jN
si,j)
dτ.
For t0, we define
X(t)=
iN
xi(t)
and
Z(t)=
iN
zi(0,t).
We obtain the immediate equality X(t2)X(t1)=
iNui(t1,t2)for all 0 t1t2.
Lemma 5 (Capacity expansion lemma) Let I =(N,c)be an
instance of the minimum time broadcasting problem with an
indivisible file, and let c =maxiNci. Then, for all solutions
S of I, the following two statements hold:
1. For all nodes i N and all times 0t1<t2:
ui(t1,t2)+zi(t1,t2)
max0,t2t11xi(t1)
cici
=max0,(t2t1)ci1+xi(t1)
2. Xk
c+Zk
c2kfor all k N. This inequality is strict
if there is i N with ci<c and 0<Cik
c.
Proof To see 1., note that for any node iwith Cit1,the
upload rate (integrand of the total upload) and the idle rate
(integrand of the idleness) sum up to cifor all t∈[t1,t2],
thus ui(t1,t2)+zi(t1,t2)(t2t1)ci.Ifxi(t1)<1, node i
needs at least 1xi(t1)
citime units to finish its download, thus
only a time interval of length t2t11xi(t1)
ciremains for
the upload and the claimed inequality follows.
123
JournalofScheduling
We prove 2.by induction over k. The inequality is trivial
for k=0 since initially only the server holds the file. So, let
us assume that for kN,wehaveXk1
c+Zk1
c2k1
and that this inequality is strict if there is iNwith ci<c
and 0 <Cik1
c.Using1., we obtain
Xk
cXk1
c+Zk
cZk1
c
=
iN
uik1
c,k
c+zik1
c,k
c
iN
max$0,ci
c1+xik1
c%
Xk1
c,(2)
whereweusecic. Rearranging terms and using the induc-
tion hypothesis, we obtain
Xk
c+Zk
c2Xk1
c+Zk1
c2k.(3)
To finish the proof, let us assume that there is iNwith
ci<cand 0 <Cik
c.IfCik1
c, using the induction
hypothesis, we obtain Xk1
c<2k1and, by (3), Xk
c<
2k. If, on the other hand, Cik1
c,k
c&, then the inequality
(2) is satisfied strictly and we obtain Xk
c<2kby (3).
3.2.1 Hardness
We are now ready to prove strong NP-hardness of the mini-
mum time broadcasting problem.
Theorem 6 The minimum time broadcasting problem for het-
erogeneous symmetric capacities is strongly NP-hard, even
for m =1.
Proof We reduce from 3-Partition. An instance of 3-
Partition is given by a multiset P={k1,...,kn}of n=
3μ,μN, positive integers. The goal is to partition Pinto μ
subsets P0,...,Pμ1such that the sum of all integers within
each subset is equal. The decision whether such partition
exists remains NP-complete in the strong sense even if B/4<
ki<B/2 for all kiP, where B=n
i=1ki is integer,
see Garey and Johnson (1979). We first construct a modi-
fied 3- Partition instance in which μis a power of two and
larger or equal to 16. To this end, let =max{log2μ,4}
and set μ=2. We define the new instance by
P=P
μ
'
k=μ+1{B/2,B/2}.
The goal is to partition Pinto μsubsets P0,...,Pμ1such
that the sum of all integers within each subset is equal. Note
that |P|might not be equal to 3μ, but this does not hurt our
following arguments. Because the integer B/2 can only go
with another integer B/2 in a “yes”-instance of the modified
3- Partition problem, the modified 3-Partition instance
Padmits a solution if and only if the original instance P
admits a solution.
Given the modified instance Pof 3-Partition, we con-
struct an instance Iof the minimum time broadcasting
problem as follows: For every kiP, we introduce a mas-
ter element node i0with capacity ci0=kiand 2ki2
element nodes i1,...,i2ki2with capacity cik=ki
ki1for
all k∈{1,...,2ki2}.For j∈{0,...,μ
1},we
introduce a subset node sjwith capacity csj=B. Subset
node s0initially owns the file. Note that 3- Partition is
strongly NP-complete, i.e., is NP-complete even if the inte-
gers kiare represented unary. For unary encoded integers of
the 3- Partition instance, we obtain a polynomial reduction.
We claim that the optimal makespan of Iis less or equal
B+if and only if Pis a yes-instance. For the “if”-part, we
construct a solution Swhere first all subset nodes are served,
and then, each subset node serves those master element nodes
that correspond to elements in one of the sets of the partition.
Afterward all element nodes are served. More formally, in
the first 1
Btime units subset node s0sends the file to the first
subset node s1witharateofss0,s1(t)=Bfor all t∈[0,1
B).
After 1/Btime units, the first subset node has completed the
download of the file. Starting at time 1/Bsubset nodes s0and
s1send the file to the subset nodes s2and s3, respectively, each
witharateofss0,s2(t)=ss1,s3(t)=Bfor all t1
B,2
B.
Continuing in this fashion, at time /Ball μsubset nodes
own the file but none of the other nodes has received any
data. Let P
0,...,P
μ1be a solution of P.Intheremain-
ing time units, subset node sj,j∈{0,...,μ
1}, sends
the file to every master element node i0with kiP
j,atarate
of ssj,i0(t)=kifor all t
B+1
ki. This is feasible since
P
0,...,P
μ1is a solution of Pand thus iS
jki=B.
The download of each master element node is finished after
1/kitime units at time
B+1
ki. At this point in time, master
element node i0starts to send data to ki1 of the corre-
sponding element nodes ij,j∈{1,...,ki1},atarateof
si0,ij(t)=ki
ki1for all t
B+1
ki,
B+. The remaining
ki1 element nodes ij,j∈{ki,2ki2}, are served
by the subset node sjwith rate ssj,ij(t)=ki
ki1for all
t
B+1
ki,
B+. The download of the file by the element
nodes starts at time
B+1
kiand needs additional ki1
kitime
units, resulting in a total makespan of
B+.
It remains to be shown that the optimal makespan of the
broadcasting instance Iis strictly larger than
B+in the
case that Pis a no-instance. We first argue that no element
node ik,k∈{1,...,2ki2}, will upload the file to any
other node. Both the upload and the download of the com-
plete file by element node ikrequires at least ki1
ki=1
ki
time units. We calculate
123
JournalofScheduling
21
ki4
3+8
32
ki
4
3+2
3
>4
31
B+1,
where we use 4. This value is larger than the assumed
makespan. Hence, we can assume that only the subset nodes
(including the server s0) and the master element nodes upload
the file to other nodes. In order to find a solution of Iwith
M
B+, it is necessary to find a partial solution ¯
Sof the
partial instance ¯
Ithat consists only of subset nodes and the
master element nodes and has accumulated enough idleness
until time
B+, i.e.,
Z
B+
iP
(2ki2).
Otherwise the idleness of subset nodes and master element
nodes does not suffice to serve all element nodes until time
B+in the original instance S. We proceed to show that
for the partial instance, no such partial solution exists.
Let us first assume that xsj
B=1 for every subset node
sj. Using Lemma 5, the inequality X
B+Z
Bμholds,
thus Z
B=0 and xi0
B=0 for every master element
node i0. We calculate
Z
B+=
iN
zi
B,
B+
=μB+
kiP
ki
B+Ci0
jN
kiP
B+
B
sj,i0) dτ

=|P|
=μB+
kiPki
B+Ci01.
For the completion time Ci0of a master element node i0,we
obtain the inequality Ci0
B+1
ki. We claim that there is
at least one master element node i
0for which this inequality
is strict. For a contradiction, suppose that Ci0=
B+1
kifor
all kiP. This implies that every master element node i0
downloads with its full rate sp(i0),i0=ci0. Let us define
Pj={ki:p(i0)=sj}. As subset node sjhas capac-
ity csj=Band serves all master element nodes i0with
kiSjwith full rate, we derive that kiSjkiB. Thus,
Sjis a solution to the 3- Partition instance P, a contradic-
tion. Consequently, there is a master element node i
0with
Ci
0>
B+1
ki, establishing that
Z
B+
B+
kiP
(ki2)=
kiP
(2ki2).
We are left with the case that at least one of the subset
nodes has not finished the download of the file at time
B,
that is, there is j∈{1,...,μ
1}with xsj(
B)<1. We
calculate
Z
B+Z
B+
μ1
j=0
B
B+Csj
+
kiP
ki
B+Ci0)
iN
jN
B+
B
si,j) dτ
Z
B+
kiP
2ki+
μ1
j=0
B
BCsj
+
kiP
ki
BCi0
iN1xi
B (4)
=
kiP
(2ki1)μ+
μ1
j=0
B
BCsj
+
kiP
ki
BCi0+X
B+Z
B

μ
kiP
(2ki1)+
μ1
j=0
B
BCsj
+
kiP
ki
BCi0,(5)
where (4) stems from the fact that each node iN
still needs to download a proportion of 1 xi(
B)of the
file. For each subset node sjwith xsj(
B)<1, we have
Csj
B+1xsj(/B)
B, and for each master element node
with xi0(
B)<1, we have Ci0
B+1xi0(/B)
ki. We distin-
guish two cases.
First case: At least one of the master element nodes has
finished the download at time
B.
Then, inequality (5) is strict and we obtain
Z
B+<
kiP
(2ki1)
μ1
j=01xsj
B
kiP1xi0
B
123
JournalofScheduling
=
kiP
(2ki1)+X
B

μ
μ−|P|
kiP
(2ki2).
Second case: All master element nodes and at least one
of the subset nodes still have not finished the downloads at
time /B.
Then, the upload capacity of the other subset nodes is not
sufficient to serve all master element nodes with their full
download rate, thus, one of the inequalities Ci0
B+
1xi0(/B)
kiis satisfied strictly. As a consequence, we obtain
Z(
B+) < kiP(2ki2).
3.2.2 Approximation
In this section, we devise an algorithm that runs in time
O(nlog n)and computes a solution with makespan no larger
than (1+22)M, where Mis the optimal makespan. We
first consider the case c0cifor all i=1,...,n,for
which we will show a 22-approximation. Then, we will
use this result to obtain a (1+22)-approximation for arbi-
trary server capacities. The additional summand of Min
the approximation arises as possibly we first have to transfer
the file to the node having the largest capacity which takes at
most Mtime units.
Before we present details of the algorithm, we give a
high-level picture of our approach. The intrinsic difficulty
in proving any approximation guarantee is to obtain lower
bounds on the optimal makespan. Let us recall the inequal-
ity shown in the Capacity Expansion Lemma (Lemma 5).
For any solution, node is contribution to the upload in time
interval [t1,t2]is bounded by ui(t1,t2)max{0,ci(t2
t1)1+xi(t1)}−zi(t1,t2)where zi(t1,t2)is the total idle-
ness of node iin the time interval [t1,t2]. To exploit this
capacity bound, our algorithm should fulfill two properties:
it should send the file to nodes with large capacity as soon
as possible, and it should avoid idle time as long as pos-
sible. To achieve this, we use the concept of time-varying
resource augmentation.Forλ1, we call a possibly infea-
sible solution ˜
Saλ-augmented solution, if at any point in
time the original capacities are not exceeded by more than a
factor of λ, i.e., jNsi,j(t)λcu
iand sp(i),i(t)λcd
ifor
all iN,tR0. We will show that there is an efficiently
computable 2-augmented solution that satisfies the above-
mentioned properties. This allows us to apply the Capacity
Expansion Lemma, and we get that the makespan ˜
Mof ˜
Sis
not larger than 2M. Rescaling the augmented solution ˜
Sto
obtain a feasible solution S, we get an additional factor of
2.
We now explain the scaling of the capacities during the
course of the algorithm, see also Algorithm 3for a more
formal description. The algorithm maintains a queue Qof
capacity release events
(t,i,c)R0×N×R0
meaning that at time ta capacity of cof node iis available
for uploads. At the start of the algorithm only the server is
available for uploads so Qhas only a single entry (0,0,c0).
Order the nodes non-increasingly by capacities such that
c1c2···cnand consider a point in time t, where a
node ibecomes available with its full upload capacity ciand
suppose that nodes 1,...,k1 have already received a frac-
tion of the file. We now determine a number kksuch that
nodes k,...,kstart receiving the file from node i. The num-
ber kis chosen such that either nodes k,...,kreceive the file
at their full capacity and the upload factor of node iis violated
by a factor β∈[1,2)or node isends the file at full capacity
and the download capacity of all nodes k,...,kis violated
by a factor α∈[1,2). The next lemma shows that such k
always exists (provided that there are enough nodes left).
Lemma 7 Let c1c2 ··· cnand c c1
2.If
n
j=1cj>c, then there is k n such that
c
2
k
j=1
cj2c.
Proof If c
2c1, there is nothing left to show. Otherwise,
let
k=max $∈{1,...,n}:
j=1
cj<c
2%.
Since c1<c
2and n
j=1cj>c,wehavek∈{1,...,n
1}.Wesetk=k+1. By definition, k
j=1cjc
2.In
addition, we have k
j=1cj2k
j=1cj<2c.
During the course of the algorithm, nodes are served in
order of non-increasing capacity. Since the capacities of the
uploaders are inflated by a factor of at most 2, the released
capacity csatisfies cck
2, where kis the smallest index
of a node not served yet. Hence, we can apply Lemma 7for
the nodes ckck+1···cnand derive the existence of
kksuch that
c
2
k
j=k
cj2c.
The capacities of nodes k,...,kare increased by a factor of
α=max 1,ci
k
j=kcj(
123
JournalofScheduling
and that of node iby a factor of
β=
k
j=k
αcj
ci
,
see lines 13 and 15 in Algorithm 3.
For the instance with such augmented capacities, the
schedule is now constructed as follows. At time t, nodes
j∈{k,...,k}start downloading from node iwith their full
(possibly augmented) capacity of αcj(line 20). Node j
{k,...,k}thus finishes the download at time ˜
Cj=t+1
αcj,
see line 18. At that time, both the new upload capacity of
node jas well as the fraction of the upload capacity of node i
that was devoted to uploading the file to node jbecome
available for further uploads. To avoid double augmentation
of capacities during the course of the algorithm, we have to
rescale both to their original level. This is done in line 21.
Here both the event that the (non-augmented) capacity cjof
node jis available and that a fraction αcj
βof the capacity of
node iis available for further uploads is added to the event
queue. Note that if ck>ck+1at the time ˜
Ckonly some frac-
tion of the upload capacity of node ibecomes available since
nodes k+1,...,kare still downloading from node i.The
algorithm reassigns this fraction of the upload capacity at
time ˜
Ckwithout waiting for nodes k+1,...,kto avoid idle
times.
The algorithm proceeds by taking the next capacity release
event out of the event queue. The process stops as soon as the
total capacity of the remaining nodes is less than the currently
available upload capacity.
From this point on, all remaining nodes are served simul-
taneously at full download rate. For the formal details of the
Scale- Fit procedure, see Algorithm 3.
The choice of the augmentation factors and the rescaling
procedure ensure the invariant that the augmented capacities
do not exceed the original capacities by more than a factor
of 2.
Lemma 8 Scale- Fit computes a 2-augmented solution ˜
S.
Proof When new downloads from a node iare assigned, the
download capacities are increased by the factor
α=max 1,c
k
j=kcj(,
see line 13. By the choice of kin line 12, α2 so that no
download capacity is increased by a factor larger than 2.
We prove that also the upload capacities are never exceeded
by a factor larger than 2 by two inductive claims. First, we
claim that after a node istarted the download of the file at
time ˜
Ai, we have that the capacities of all further capacity
Algorithm 3: Scale- Fit
Input: instance I=(N,c)with c0=maxiNci
Output:2
2-approximate solution Sof I
1sort nodes non-increasingly with respect to their capacities, i.e.,
c1c2···cn;
// server has unused upload capacity c0at
time 0
// add this capacity release to event queue
2Q:= (0,0,c0);
// node with largest capacity not served
yet
3k:= 1;
// largest augmentation factor so far
4b:= 1;
5while kndo
// take next event from queue
6(t,i,c):= arg min(t,i,c)Qt;
7Q:= Q\{(t,i,c)};
8if n
j=kcj2cthen
// iserves all remaining nodes
9k:= n;
// augmentation factor of downloader
10 α:= 1;
11 else
12 choose ksuch that c
2k
j=kcj2c;
// augmentation factor of downloader
13 α:= max1,c)k
j=kcj;
14 end
// augmentation factor of uploader
15 β:= k
j=kαcj
c;
// update largest augmentation factor
16 b:= max{b};
// time when jstarts downloading
17 ˜
Aj:= tfor all j∈{k,...,k};
// time when jfinishes downloading
18 ˜
Cj:= t+1
αcjfor all j∈{k,...,k};
// parent of nodes k,...,kis j
19 p(j):= ifor all j∈{k,...,k};
// sending rate from ito j
20 ˜si,j) := αcjfor all j∈{k,...,k} ∈[˜
Aj,˜
Cj);
// add rescaled capacity release to queue
21 Q:= Q*k
j=k(˜
Cj,j,cj)(˜
Cj,icj
β);
22 k:= k+1;
23 end
24 forall the iN\{0}do
// Rescale ˜
S
25 Ai:= ˜
Aib;
26 Ci:= ˜
Cib;
27 sp(i),i(t):= ˜sp(i),i(˜
Ai)
bfor all t∈[Ai,Ci);
28 end
release events for isum up to ci(as long as there are still
unserved nodes), i.e.,
(t,i,c)Q:i=i
c=ci.
123
JournalofScheduling
This is obviously true for all times t∈[
˜
Ai,˜
Ci)when
the only event in Qis (˜
Ci,i,ci). Next suppose that the
claim is correct for all times until a capacity release event
(t,i,c)for iis taken from the queue Q. Then, the capacity
of cis used to serve nodes k,...,kwhere the down-
load capacity of the downloaders is increased by the factor
α=max1,c)k
j=kcjand the upload capacity of iis
increased by the factor β=k
j=kαcj
c. Finally, the capacity
release events *k
j=k(˜
Cj,icj
β)are added for i.Wehave
k
j=k
αcj
β=
k
j=k
cj
k
=k
c
c=c,
by the choice of β. This proves the claim.
Next, we observe that after node ifinished the download of
the file at time ˜
Ciand as long as there are still unserved nodes,
every capacity release event (t,icj
β)for icorresponds to
a an upload of node ito some other node jat rate αcjand
vice versa.
To sum up, for each node i,ateverytimet˜
Cjall active
uploads at rates αjcjfor some α∈[1,2]correspond to
a capacity release event (t,ij
cj
βj)for some βj∈[1,2]
and the sum of the capacities of the release events is ci.We
thus obtain
j:si,j(t)>0
αjcj=
(t,ij
cj
βj)Q
βjαj
cj
βj2c,
so that the upload capacity of each node iis also never
exceeded by a factor larger than 2.
Before we proceed to formally analyze the performance
of Scale- Fit, let us illustrate the algorithm by an example:
Example 9 Consider an instance with 6 nodes and capacities
c0=5, c1=3, c2=3, c3=5/2, c4=2, and c5=2.
We first describe how to construct the augmented solution
˜
S. At time 0, the upload capacity of node 0 is augmented by
a factor of 6/5 in order to serve nodes 1 and 2 at their full
download capacities. These downloads are completed at time
1/3. At that time, the rescaled upload capacities of the parent
of nodes 1 and 2 (i.e., node 0) can be used separately to serve
further nodes. The 5/2 units of capacity (previously assigned
to node 1) are now assigned to node 3, which downloads
with full capacity without any augmentation. The remaining
5/2 units of capacity are assigned to node 4, whose down-
load capacity is augmented by a factor of 5/4. At the same
time, node 1 starts serving node 5 without any augmentation
because node 5 is the only remaining node. The highest aug-
mentation factor is 5/4. Thus, we rescale all rates by 4/5 and
obtain the feasible solution S. See Fig. 2for an illustration.
time
c0
node 1
node 2
node 3
node 4
c1
node 5
01
3
11
15
5
6
(a)Augmented solution ˜
S
time
c0
node 1
node 2
node 3
node 4
c1
node 5
05
12
11
12
25
24
(b) Feasible solution S
Fig. 2 aAugmented solution ˜
Sand bfeasible solution Sof Example 9
We now turn to the approximation guarantee of our algo-
rithm. We first need the following lemma that provides a
lower bound on the optimal makespan.
Lemma 10 Let I =(N,c)be an instance of the minimum
time broadcasting problem with c0c1c2···cnfor
all i =1,...,n, and let ˜
S be a possibly augmented solution
with the following properties:
1. Every node i N\{0}receives the file during some
interval [˜
Ai,˜
Ci]at constant rate ˜cici;
2. ˜
Ai˜
Ajfor all i,jN\{0}with i j.
Then,
t0=min$t0:there is i N with ˜
Cit
and
jN˜si,j(t)<ci%M,
where Mis the optimal makespan for I.
Proof Let us first consider the case t0<˜
M, where ˜
Mis the
makespan of the augmented solution. For a contradiction,
suppose M<t0and fix an optimal solution S.Wehave
123
JournalofScheduling
˜
X(t0)<X(t0)=n.Letk0be the node with smallest index
that has not finished the download in the augmented solution
˜
Sat time t0, that is,
k0=min{iN:˜
Ci>t0}.
Such a node exists, since we assume t0<˜
M.
If k0=1, then we obtain the inequality t0<1/c1, which
is a lower bound on M, and there is nothing left to show. So
we assume k0>1 and consider the point in time t1=t0
1/ck0. For a contradiction, let us assume that ˜
X(t1)X(t1).
Since ˜xk0(t0)<1 and in ˜
S, node k0receives the file in some
time interval [˜
Ai,˜
Ci]at rate ˜cici,wehave ˜xk0(t1)=0,
and since the nodes start downloading in the order of their
indices, we also have ˜xi(t1)=0 for all ik0. Every node
iNwith ˜xi(t1)>0 receives data with download rate
˜ciciuntil time ˜
Ci.Using˜zi(t1,t0)=0, we obtain
˜ui(t1,t0)1
ck01−˜xi(t1)
˜cici
=ci
ck0+ci
˜ci˜x(t1)1
ci
ck0xi(t1)1
for all ik0with ˜xi(t1)>0, and ˜ui(t1,t0)0 for all other
nodes. Referring to Lemma 5(1.), we calculate
X(t0)X(t1)
iN
max0,ci
ck01+x
i(t1)
=
iN
x
i(t1)>1ci
ck0
ci
ck01+x
i(t1)
i<k0
x(t1)>1ci
ck0
ci
ck01+X(t1),
whereweuseci/ck01 for all ik0. We get
X(t0)X(t1)
i<k0
(ci/ck01)+˜
X(t1)
˜
X(t0)˜
X(t1),
a contradiction. We conclude X(t1)> ˜
X(t1). Applying the
same line of argumentation for t1instead of t0, we derive
the existence of t2=t11/ck1for some k1Nwith
X(t2)> ˜
X(t1). We can iterate this argument until we reach
t, with the property that X(t)> ˜
X(t)and k=min{i
N:˜
Ci>t}=1. This is a contradiction since in the time
interval [0,t]only the server can upload and its upload rate
in ˜
Sis not smaller than that in S. We conclude that our
initial assumption that t0>Mwas wrong, finishing the
proof for the case t0<˜
M.Ift0=˜
M, we can use the same
line of argumentation for t0instead of t0, where >0is
arbitrary. Thus we obtain t0Malso for that case.
We are now ready to prove the approximation guarantee
of Scale- Fit.
Theorem 11 For the minimum time broadcasting problem
with heterogeneous symmetric capacities and an indivisible
file, the following holds:
1. If c0=maxiNci, then Scale- Fit is a 22-approxi-
mation.
2. If c0<maxiNci, then uploading the file to another node
and applying Scale- Fit is a (1+22)-approximation.
The running time of Scale- Fit is O(nlog n).
Proof We first show 1.By construction, node ndetermines
the makespan and starts its download not after t0. Hence,
the makespan of the 2-augmented solution ˜
Sis not larger
than t0+1/cn2M, where we use Lemma 10 and the
fact that 1/cnis a lower bound on the optimal makespan.
For rational input, the largest factor, with which a capacity
constraint is violated, is strictly smaller than 2. Dividing
all sending rates by that factor, we obtain a feasible solution
with makespan smaller than 22M.
To see 2., note that the server needs 1/c0Mtime units
to transfer the file to node 1. We then treat the instance as an
instance where node 1 is the server, i.e., we do not let node 0
upload the file to any other node except node 1. Using the
same arguments as in the proof of Lemma 10, we derive that
this takes at most 22Madditional time units, implying the
claimed approximation factor.
The complexity of the algorithm is as follows: In a first
step, the nodes are sorted with respect to their capacities
(line 1 of Algorithm 3), which takes time O(nlog n).The
most expensive operations within the loop (lines 5–23) are
adding elements to Q(line 21) and retrieving the minimal
element from Q(line 6). In total, there are O(n)of these
operations, and if Qis a priority queue, all operations can
be executed in time O(log n). Therefore the overall running
time of the algorithm is O(nlog n).
4 Broadcasting multiple packets
We now turn to the more general case that the file is divided
into mN1packets. The file is still of unit size; there-
fore, the download of one packet at a rate of ctakes 1
cm
time units. We present two efficient algorithms. The first one,
termed Spread- Exchange, uses a variant of the Scale-
Fit algorithm, developed in the last section, as a subroutine
123
JournalofScheduling
and yields a 5-approximation. The second algorithm, termed
Spread- Mirror- Cycle, has an approximation guarantee
of 2 +2log2n/m
m, which is super-constant in general, but
better than 5 if the number of packets is large compared to
the number of nodes.
4.1 The SPREAD-EXCHANGE algorithm
The Spread- Exchange algorithm works in two phases. In
the first phase, called Spread, a modification of the Scale-
Fit algorithm is used to provide each node with a single
packet. For better reference, we denote the modified version
of the algorithm by Scale- Fit. In the second phase, called
Exchange, the packets are exchanged between the nodes in
a pairwise manner until all nodes own all packets. The second
phase can only work efficiently, if the packets are distributed
in a suitable pattern among the agents.
For ease of exposition, we will first make some simplifying
assumptions on the instance and explain the Spread-
Exchange algorithm for such simple instances. In the end of
this section, we will relax these assumptions. First, we require
that the server has the highest capacity and that nodes are
ordered by their capacities. Second, we require that the capac-
ities are powers of two. Third, we assume that iN
ci
c0=2
for some integer . Intuitively, this condition ensures that the
distribution tree is complete and has height . Finally, we
assume that m, that is, there are not more packets than
levels in the distribution tree. An instance that satisfies these
properties will be called simple as summarized in the follow-
ing definition.
Definition 12 An instance I=(N,c,m)is called simple, if
it satisfies the following properties
1. c0c1c2···cn,
2. ci=2riwith riNfor all iN,
3. n
i=0
ci
c0=2for some Nwith m.
4.1.1 The SCALE-FIT-algorithm
Our algorithm uses a variant of the Scale- Fit-Algorithm to
send exactly one packet to each node. Compared to the orig-
inal Scale- Fit-Algorithm, we introduce two modifications.
The modified algorithm still maintains a queue Qof capacity
release events but the events now are quadruples
(t,i,c,q)R0×N×R0×[m],
indicating that at time tnode ihas an upload capacity of c
and will use this capacity to upload packet q.
Second, we observe that for a simple instance, where all
capacities are powers of two, the Scale- Fitalgorithm in
line 10 can choose ksuch that
k
j=k
cj=c,
that is, no further scaling of the upload or download capacities
is needed. This implies in particular that after removing the
entry (t,i,c,q)from the capacity release event queue, the
set of entries
k
'
j=kt+1
mcj
,i,cj,q,t+1
mcj
,j,cj,q
is added to the queue. The first event (t+1
mcj,i,cj,q)
means that for the uploading node ithe capacity of cjwill
be available for further uploads at time t+1
mcj. The packet
qto be send in these uploads will be specified in Sect. 4.1.2.
The second event (t+1
mcj,j,cj,q)means that node jwill
have finished the download of the file at time t+1
mcjand thus
its full capacity of cjwill be available for further uploads.
Since each node receives a single packet only, node jwith
j∈{k,...,k}can send only this packet in further uploads.
Further note that the download times are shorter by a factor
of 1/msince only a part of size 1/mis distributed by the
algorithm.
4.1.2 The distribution tree
When introducing new capacity release events in the event
queue Q, we have to decide which packet is scheduled
for upload. This is trivial for non-server nodes who only
receive a single packet during the course of the Scale- Fit-
Algorithm. To explain how the server decides which packet
is uploaded, we introduce the notion of a distribution tree T
which is build incrementally from the events introduced into
Q. To this end, let Qdenote the set of all capacity release
events (t,i,c,q)added to Qduring the course of the algo-
rithm. We build a tree structure Twith vertex set V(T)=Q
as follows. The root of Tis the tree node (0,0,c0,1)cor-
responding to the event initially placed in Q. Whenever a
capacity release event (t,i,c,q)is removed from Q,the
algorithm afterward adds the set of events
k
'
j=k$t+1
mcj
,i,cj,q,t+1
mcj
,j,cj,q% (6)
to Q.InT, we add an edge from the tree node (t,i,c,q)to
each of the tree nodes (t+1
mcj,i,cj,q)with j=k,...,k
and an edge from (t,i,c,q)to each of the tree nodes (t+
1
mcj,j,cj,q)with j=k,...,k. We call tree nodes of the
form (t+1
mcj,j,cj,q)proper tree nodes and denote the set
123
JournalofScheduling
Algorithm 4: Scale- Fit
Input: simple instance I=(N,c,m)
Output: partial solution where each node iN\{0}owns one
packet
// server has unused upload capacity c0at
time 0
// server will send packet 1
// add this capacity release to event queue
1Q:= (0,0,c0,1);
// node with largest capacity not served
yet
2k:= 1;
3while kndo
// take next event from queue
4(t,i,c,q):= arg min(t,i,c,q)Qt;
5Q:= Q\{(t,i,c,q)};
// assign downloaders
6if n
j=kcj<cthen
7k:= n;
8else
9choose ksuch that c=k
j=kcj;
10 end
// sending rate from ito j
11 s(q)
i,j) := cjfor all j∈{k,...,k} ∈[t,t+1
mcj);
// add new capacity release events to
queue
12 Q:= Q*k
j=k(t+1
mcj,j,cj,k);
13 if i=0then
14 Q:= Q*k
j=k(t+1
mcj,0,cj,min{q+1,m});
15 else
16 Q:= Q*k
j=k(t+1
mcj,0,cj,q);
17 end
18 k:= k+1;
19 end
of proper tree nodes by V(T). Note that there is a natural
bijection between the set of non-server nodes iN\{0}
and the set of proper tree nodes vV(T). Specifically,
for each non-server node iN\{0}, there is a proper tree
node (Ci,i,ci,qi), where qiis the packet send to node i.
Further, there is also a corresponding non-proper tree node
(Ci,p(qi)
i,ci,q)with q∈[m], where p(qi)
iis the unique
node that sends packet qito node iduring the course of the
algorithm.
We proceed to provide useful properties of the distri-
bution tree T. For a tree node v,letlevel(v) denote the
number of edges from vto the root. Further, for a tree
node v=(t,i,c,q), we introduce the notations t(v) =t,
i(v) =i,c(v) =cand q(v) =qas a shorthand for the time,
node, capacity and packet associated with the corresponding
capacity release event. The following lemma establishes that
nodes at higher levels have no larger capacities and higher
completion times.
Lemma 13 Let v, w V(T)with level(v) < level(w). Then,
c(v) c(w) and t(v) < t(w).
Proof We show by induction over that for all tree nodes
vwith level(v) =and all tree nodes wwith level(w) >
level(v), the claimed properties hold. The only tree node with
level(v) =0 is node (0,0,c0,1). Since t(0,0,c0,1)=0
and, by our preprocessing, c0cifor all iN,theresult
follows for all tree nodes with level 0.
Assume the statement is true for all tree nodes vV(T)
with some fixed level level(v) =and all nodes wV(T)
with level(w) > level(v). Consider a node vV(T)with
level(v) =+1 and a node wV(T)with level(w) >
level(v).Letvand wbe the parent nodes of vand win T.
By the induction hypothesis, we have t(v)<t(w), i.e., the
capacity release event vhappens before the event w. Since
the algorithm considers the nodes in order of non-increasing
capacity, this implies c(v) c(w). Since all nodes all served
with full capacity, we also have t(v) =t(v)+1
mcv<t(w)+
1
mcw.
Lemma 13 implies that all capacity release events of a
level happen after all capacity release events of a former
level. This implies that the tree Tis complete in the sense
that the capacity of a tree node is equal to the sum of all
capacities of its children. Since n
i=0
ci
c0=2with N
also the last level is complete.
Next, we note that that each capacity release event
(t,i,ci,q)indicates that node ifinishes the download of
packet qat time [m]. Each such capacity event is mirrored by
another event of the form (t,p(q)
i,ci,q)indicating that the
upload capacity of the node p(q)
ifrom which node ireceived
packet qis free again and ready for further uploads. Specif-
ically, q=min{q+1,m}if p(q)
iis the server, and q=q
otherwise. As a consequence, half of the capacity of the tree
nodes in each level corresponds to available upload capacity
of nodes.
We proceed to show that the joint capacity of all nodes
that own a particular packet after running the Scale- Fit-
Algorithm is distributed very regularly.
Lemma 14 Let I =(N,c,m)be a simple instance and let
¯c=iNci, and for k ∈[m]let
Nk=iN\{0}:i owns packet k
and ¯ck=iNkci. Then
¯ck=¯c/2kfor k =1,...,m1,
¯c/2m1c0for k =m.
Proof Let
V
1={vV(T):level(v) =1}
123
JournalofScheduling
be the set of proper tree nodes in level 1 and let
˜
V1={vV(T)\V(T):level(v) =1}
be the set of non-proper tree nodes in level 1. Since the total
capacity of all nodes equals the capacity of the server plus
the total capacity of all proper tree nodes, we have
¯c=c0+
rV
1
vV(T(r))
c(v) +
r˜
V1
vV(T(r))
c(v).
The nodes vV(T(r)) with rV
1are exactly those that
receive packet 1, thus,
¯c=c0c1+
r˜
V1
vV(T(r))
c(v).
Since the trees are complete, we have
r˜
V1
vV(T(r))
c(v) =
rV
1
vV(T(r))
c(v)
r˜
V1
c(r)
=
rV
1
vV(T(r))
c(v) c0
We then obtain ¯c=2¯c1showing the claim for k=1.
To show the result for 1 <k<m, we apply this argument
recursively in the subtrees of level k.LetV
kdenote the set of
proper nodes of level kthat are served by the server and let
˜
Vkdenote the set of non-proper tree nodes in level k. Then
¯ck=rV
kvV(T(r)) c(v). By induction,
¯c=c0+
k1
i=1¯ci
+
rV
k
vV(T(r))
c(v) +
r˜
Vk
vV(T(r))
c(v).
With the same reasoning as before, we obtain the equality
r˜
VkvV(T(r)) c(v) =rV
kvV(T(r)) c(v) c0,
thus,
¯c=
k1
i=1
¯c
2i+2
rV
k
vV(T(r))
c(v)
=12(k1)¯c+2¯ck,
hence ¯ckc/2kshowing the results for 1 <k<m.
For k=m,wehave
¯c=c0+
m1
i=1¯cicm
=c0+(12(m1))¯ccm
and thus
¯c=12(m1)¯c+c0cm,
or equivalently ¯cmc/2m1c0, as claimed.
4.1.3 The EXCHANGE phase
For the Exchange part of the algorithm, we treat the server
as a node that owns only packet m. We carry out m1 rounds
of exchange. In round k, the total capacity of nodes that own
packets j∈{k,...,m}is doubled. Lemma 14 then implies
that after round k, all nodes own packet k, and this packet
can be ignored in all further rounds. Moreover, after round
m1, all nodes own all packets, and the file is completely
distributed.
In round k, we assign the nodes that own packet kto nodes
that own other packets k>kwith a straightforward greedy
algorithm, see Algorithm 6for details. As long as the largest
capacity ciof a node ithat owns packet kis larger than the
capacity of at least one of the other nodes, we match this
node ito a subset ˜
Nof the other nodes, where ˜
Nis chosen
in such a way that the capacities in this set sum up to ci.This
is always possible since all capacities are powers of two. As
soon as the maximum capacity of an unmatched node that
owns packet kis smaller than the minimum capacity of the
unmatched nodes without packet k, we proceed the other way
round—a single node ithat does not own packet kis matched
to a set of nodes that do own packet k, where the capacities in
this set sum up to ci. Using Lemma 14, it is straightforward
to show that all nodes will get matched this way.
In round k, after all nodes are matched that way, every
node has a unique packet j∈{k,...,m}and sends this
packet jat a rate of cmin to its matched node(s).
4.1.4 The SPREAD-EXCHANGE ALGORITHM
A detailed description of the whole procedure Spread-
Exchange can be found in Algorithm 5. We first show that
it yields a 2-approximation algorithm for simple instances.
Theorem 15 For the minimum time broadcasting problem
with heterogeneous symmetric capacities and a file dis-
tributed in m packets, Spread- Exchange computes a
2-approximation for simple instances.
Proof Let I=(N,c,m)be a simple instance and M(I)be
its optimal makespan. Let I1/m=(N,c,1)be the instance
with the same set of nodes and capacities where a single file
of size 1/mhas to be send to all nodes. Since in Iall m
packets of size 1/mhave to be send to all nodes, we have
M(I1/m)M(I). Consider the Spread phase when the
123
JournalofScheduling
Algorithm 5: Spread- Exchange algorithm
Input: simple instance I=(N,c,m)
Output: 2-approximate solution S
1run Scale- Fit;// spread
2t1:= makespan of Scale- Fit;
3for k=1,...,m1do // exchange
4Nk:= iN\{0}:iowns packet k};
5Nk:= N\Nk;
6Greedy assignment for Nkand Nk;
7forall the iNkdo
8forall the j assigned to i do
9s(k)
i,j(t):= cmin for all tt1,t1+1
mcn;
10 let k>kbe a packet that is owned by j;
11 s(k)
j,i(t):= cmin for all tt1,t1+1
mcn;
12 end
13 end
14 t1:= t1+1
mcn;
15 end
Algorithm 6: Greedy assignment
Input:setsA={a1,...,ar},B={b1,...,bs}such that
aAa=bBb,
ai=2ri,bi=2siwith ri,siNfor all iN
Output: assignment (ai,Bi)iI,(Aj,bj)jJsuch that
A=˙
*iIai˙
˙
*jJAj,
B=˙
*iIBi˙
˙
*jJbj,
ai=bBibfor all iI,
bj=aAjafor all jJ
1M:= ;
2while A=∅do
3s:= arg maxtABt;
4if sAthen
5choose BBwith bBb=s;
6M:= M(s,B),A:= A\{s},B:= B\B;
7else
8assign sto a set AAwith aAa=s;
9M:= M(A,s),B:= B\{s},A:= A\A;
10 end
11 end
12 return M;
Spread-Exchange algorithm is run for Iand let
t0=min$t0:there is iNwith xi1
m
and
jN
k∈[m]
s(k)
i,j(t)<ci%.(7)
be the first point in time during the Spread phase when
available upload capacity remains idle. During the Spread
phase, no node receives more than one packet, so we may
identify the whole Spread phase with a solution for instance
I1/m. Using Lemma 10, we then obtain that t0M(I1/m)
implying that t0M(I). Since in the Spread phase no
node starts the download of a packet strictly after t0,wehave
t1t01
mcnwhere t1is the makespan of the Spread phase.
After t1,theSpread- Exchange algorithm carries out
m1 rounds of mutual exchanges of packets, each of which
needs 1
mcntime units. As a consequence, the makespan of the
solution computed by Spread-Exchange is bounded from
above by
M(I)+1
mcn+m1
mcn2M(I),
where we used that M(I)1
cnsince the optimal solution
has to send the whole file to node n.
We proceed to show that a slight modification of the
Spread- Exchange algorithm yields a 5-approximation for
arbitrary, not necessarily simple, instances.
Theorem 16 For the minimum time broadcasting problem
with heterogenous symmetric capacities and a file distributed
in m packets, the following holds:
1. If c0=maxiNci, there is a 4-approximation algorithm.
2. If c0<maxiNci, there is a 5-approximation algorithm.
In both cases, the running time is O(mn log n).
Proof Let M(I)denote the optimal makespan of the original
instance and let cmax =maxiNci.Ifc0<cmax, we spend
1/c0time units to send the whole file from the server node 0
to a node iwith maximal capacity, remove node 0 from the
instance and treat ias the new server. Using that 1/c0
M(I), it suffices to show 1.
To this end, let the nodes be numbered such that c0c1
c2 ··· cn. Consider the instance Iwhere all nodes’
capacities are rounded down to the next power of two, i.e.,
I=(N,(ci)iN,m)where ci=2log2ci. By appropriate
scaling of the instance, it is without loss of generality to
assume that cn=cn.LetNand n∈{0,...,n}be such
that
=+log2,n
i=0
ci
c0-. and
n
i=0
ci
c0=2.
For the subset of nodes N={0,...,n}, consider the
instance with rounded capacities I=(N,(ci)iN,m).Let
p,rNbe such that m=p+rwith r<.
The general structure of the algorithm is follows. We con-
duct prounds of Spread- Exchange for Iin each of which
we distribute packets to the nodes in N. Then, we conduct
a final round of Spread- Exchange in which we distribute
r<packets, so that after the p+1 rounds all mpackets
are distributed to the nodes in N. Finally, we send the whole
file from nodes in Nto nodes in N\N. For the analysis of
this algorithm, we distinguish the cases p=0 and p>0.
123
JournalofScheduling
First case p =0. Consider the run of Spread-Exchange
on the instance I=(N,(ci)iN,m)with a subset of nodes
and rounded capacities. As in the proof of Theorem 15, con-
sider the time t0during the single Spread phase defined in
(7) when one of the nodes starts to become idle. We claim
that t02M(I).
To see this, consider again the relaxation I1/mof Iwhere
only a single file of size 1/mhas to be distributed. We
have M(I1/m)M(I). We consider the solution for the
Spread phase for I, double all upload and download rates
and only send packet 1. By construction, this yields a 2-
augmented solution for I
1/m. By adding arbitrary (feasible)
downloads for the nodes in N\N, we may extend this solu-
tion into a solution for I1/m. Note that by doubling all upload
and download rates, all completion times are halved, but oth-
erwise, no nodes become idle earlier than in the original
solution.
By Lemma 10,t02M(I1/m)and, thus, t02M(I).
Since in the Spread phase no node starts sending a packet
strictly after t0,theSpread phase ends not later than M(I)+
1
cn.IntheExchange phase, the algorithm performs (m1)
rounds of exchange of length 1
mcn. Overall, we see that the
time needed to run the Spread- Exchange algorithm is at
most
2M(I)+1
mcn+m1
mcn=2M(I)+1
cn
2M(I)+1
cn
=2M(I)+1
cn3M(I).
After the Spread- Exchange algorithm, we perform a
greedy matching between the nodes Nthat received the file
and the nodes N\Nthat did not receive any portion of
the file yet. Since iNciiN\Nciand all capacities
are powers of two, there is a greedy assignment in which
all nodes iN\Nreceive the file at their full (rounded
down) capacity ci. The assignment can be computed with
Algorithm 6. This additional step takes additional
1
cn=1
cnM(I)
time units. We need at most 3M(I)time units for the
Spread- Exchange algorithm and M(I)time units for
sending the file to the nodes in N\N, so that the we obtain
a 4-approximation for the case p=0.
Second case p >0. For the instance Iwith a subset of
nodes and rounded capacities, the algorithm performs p>0
rounds of Spread- Exchange distributing packets each
followed by one round distributing rpackets. Consider one of
the such round during which we distribute s∈{, r}packets.
In the Spread part, the maximum number of subsequent
uploads of the server is and each upload takes at most
1
mcn1
mcn=1
mcn
time units. Further, no download starts strictly after the server
has finished its last upload. Thus, the Spread phase of this
round takes at most +1
mcntime units. In the Exchange phase,
we perform s1 rounds of exchange requiring further s1
mcn
time units. In total, all p+1 rounds of Spread- Exchange
need
(p+1)+1
mcn+p1
mcn+r1
mcn
=(p+1)
mcn+p
mcn+r
mcn
=(p+1)
mcn+1
cn
3
cn
3M(I)
time units.
As in the first case, we spend additional 1
cnM(I)time
units to send the file from the nodes in Nto the nodes in
N\Nyielding a 4-approximation.
We now turn to the running time of the algorithm. For
the Spread phase, we need to run Scale- Fitwhich has a
complexity of O(nlog n). Note that even when Scale- Fit
is run multiple times for different sets of packets, it actually
suffices to run the algorithm only one time and adapt the
solutions by changing the packets that are disseminated. For
the Exchange phase, we conduct m1 rounds of exchanges
for each of which we need to compute an assignment. The
greedy algorithm has a complexity of O(nlog n). Sending the
file to the nodes in N\Nrequires the solution of a single
greedy assignment problem, so that we obtain the overall
complexity of O(mn log n).
We note that in a situation where in the beginning the
packets are spread all over the nodes in the network, we still
obtain a 5-approximation. In the beginning, all packets are
sent to a node with largest capacity, which then acts as a
server. This sending does not take longer than M(I).
We conclude this section with an observation, relating our
result to that of Mundinger et al. (2008).
Remark 17 If the number of nodes is a power of two and
capacities are homogeneous, i.e., |N|=2for some N
and ci=1 for all iN, the solution produced by Spread-
Exchange is exactly the same as the one produced by the
algorithm presented in Section 3 of Mundinger et al. (2008).
(Note that we do not have to scale capacities in this case and
123
JournalofScheduling
thus also do not scale the sending rates in the end.) This also
implies that in this special case,Spread- Exchange yields
an optimal solution.
4.2 An algorithm for many packets
In this section, we devise a simple alternative algorithm,
which we call Spread- Mirror- Cycle, that has a better
performance guarantee than Spread- Exchange when the
number of packets is large compared to the number of nodes.
The new algorithm proceeds in three phases. In the first phase
(Spread), we choose a certain subset of nodes and send each
packet to exactly one node from this subset. In the second
phase (Mirror), we let these nodes mirror their packets to
the remaining nodes, in such a way that the distribution of
packets among the initial set of nodes is copied (possibly
several times), and at the end of phase two every node owns
at least one packet and each packet is owned by at least one
node. In the final phase of the algorithm (Cycle), we per-
form a rather simple circular exchange of packets between
nodes until all nodes possess the whole file. The main differ-
ence to the Spread- Exchange-algorithm, described in the
previous section, is that Spread- Exchange uses Scale-
Fitas a subroutine to provide each node with one packet.
We use a different method for this task, that aims for a large
variety of the distributed packets. This comes at the cost of
a worse performance guarantee for the first two phases, but
allows for more flexibility and hence a better performance
of the exchange part our algorithm (Cycle). When the num-
ber of packets is large compared to the number of nodes,
the exchange phase dominates the running time of the algo-
rithm, and Spread- Mirror- Cycle has a better worst-case
performance than Spread- Exchange.
We proceed to describe Spread- Mirror- Cycle in
detail. In contrast to Spread- Exchange, we do not round
the capacities to powers of 2, but work on the original
capacities. Let k=n/m. We partition the set of nodes
into kmutually disjoint sets N1,...,Nk. The partition is
chosen in such a way that the cardinalities differ by at
most one, and the sets are indexed such that the cardinal-
ities are non-decreasing. Formally, for j∈{1,...,k},let
nj=|Nj|denote the cardinality of the j-th set and let
Nj={ij
1,...,ij
nj}. Then we have |njnj|≤1 for all
j,j∈{1,...,k}, and njnjfor all j,j∈{1...,k}with
jj. Note that by the choice of k, each partition contains
more than m/2 nodes, but at most m, i.e., m/2<njm
for all j.
In the Spread phase of our algorithm, node 0 sends one
packet to each of the nodes in N1as follows: at time 0,
node 0 starts sending packet 1 to node i1
1at maximum rate
min{c0,ci1
1}. After this transmission is completed, node 0
sends packet 2 to node i1
2at maximum rate min{c0,ci1
2}, and
so on. This process continues until node 0 has sent packet
n1to node i1
n1. Then, if m>n1, nodes i1
1,...,i1
mn1one
after another receive a second packet from node 0 at full rate,
until each node iN1owns at least one packet and each
packet is owned by at least one node in N1. For each node
j∈{1,...,n1}, the transmission of a packet needs at most
1
mcmin time units.
In the Mirror phase, all nodes in N1mirror themselves to
a node in N2. More formally, we form a matching between
the nodes in N1and N2. Assume for the moment that the
sets are of equal size. Each node isends its packets to its
matched node iat the full rate min{ci,ci}. As each node
sends at most two packets, this process needs at most 2
mcmin
time units. We then continue iteratively mirroring the internal
node structure of N1to all other subsets of nodes. That is,
the nodes in N1send their packets to the nodes in N3.Atthe
same time, the nodes in N2send their packets to the nodes in
N4, and so on. After at most 2log2k
mcmin time units, the internal
structure of N1is mirrored to all other sets N2,...Nk.
When the cardinalities of the two sets are different,
i.e., nodes in set Njmirror themselves to the nodes in set
Njwith nj=nj+1, we slightly correct the mirroring
process to take care of that fact. As nj=nj1m1,
there is a node iNjthat owns two packets. We match
this node to two different nodes in Nj, and let it send one
packet to each of them. In that fashion, we ensure that after
the mirroring, all nodes in Njpossess at least one packet.
In the Cycle phase, we perform a simple circular
exchange in all sets Nj,j∈{1,...,k}. Recall that within
each set Nj, each node owns at least one packet and each
packet is present at exactly one node. The Cycle phase con-
sists of m1 rounds. In each round, node ij
ksends one of its
packets to node ij
k+1, except for ij
nj, which sends to ij
1. Each
node sends each packet only once, in order of their reception.
After m1 rounds of such exchange, all nodes in Njpossess
all packets. To see this, note that during the m1 exchange
rounds the order of the packets sent along the cycle is pre-
served. Hence, each node receives all other packets before
it gets the packet it sent first. Thus, each node has all pack-
ets after m1 rounds. Performing the exchange in all sets
N1,...,Nkin parallel, the final phase of our algorithm needs
m1
mcmin time units.
Summing up, we get the following result:
Theorem 18 Spread- Mirror- Cycle yields a (2+2
log2n/m/m)-approximation in time O(nm).
Proof Adding up the time spent on the single phases, the total
makespan of the solution computed by Spread- Mirror-
Cycle is bounded by
1
cmin +2log2n/m
mcmin +m1
mcmin
123
JournalofScheduling
2M+2log2n/m
mM,
where Mis the optimal makespan. The partitioning of the
set can be done in time O(n), the sending rates for the Spread
phase are computed in time O(m). Then, there are O(k)=
O(n/m)Mirror steps, each of which can be computed in
time O(m). Finally, there is the Cycle phase, in which each
node sends m1 packets. Computing the sending rates of
this phase thus takes time O(nm), dominating the overall
complexity of the algorithm.
5 Conclusion
We studied the problem of distributing a file, divided into
mpackets, to all nodes of a communication network. In
contrast to the prevailing assumption in the broadcasting lit-
erature known as the telephone model (and variants thereof),
our model allows for flexible data transfers, that is, at any
point in time each node that possesses a certain packet may
send it with an arbitrary rate to other nodes that have not yet
received this packet. For this quite general model, we pro-
vided a detailed study of various settings. For the simplest
setting with homogeneous capacities and a single packet, we
presented an efficient exact algorithm. Already for a single
packet and heterogeneous capacities, the problem becomes
strongly NP-hard. We thus turned to approximation algo-
rithms and devised constant factor approximation algorithms
for heterogeneous capacities, both for single and multiple
packets. Some questions remain open, however, such as the
case of different upload and download capacities per node. It
would also be interesting whether the presented approxima-
tion factors can be improved, or whether inapproximability
bounds can be shown. Moreover, other objectives such as
the weighted sum of completion times seem interesting and
deserve further study.
Funding Open Access funding enabled and organized by Projekt
DEAL.
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing, adap-
tation, distribution and reproduction in any medium or format, as
long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons licence, and indi-
cate if changes were made. The images or other third party material
in this article are included in the article’s Creative Commons licence,
unless indicated otherwise in a credit line to the material. If material
is not included in the article’s Creative Commons licence and your
intended use is not permitted by statutory regulation or exceeds the
permitted use, you will need to obtain permission directly from the copy-
right holder. To view a copy of this licence, visit http://creativecomm
ons.org/licenses/by/4.0/.
References
Arkin, E., Bender, M., Fekete, S., Mitchell, J., & Skutella, M. (2006).
The freeze-tag problem: How to wake up a swarm of robots. Algo-
rithmica, 46(2), 193–221.
Arkin, E., Bender, M., Ge, D., Hejoseph, S., & Mitchell, S. (2003).
Improved approximation algorithms for the freeze-tag problem.
In Proceedings of the 15th annual ACM symposium on parallel
algorithms and architectures (SPAA) (pp. 295–303).
Armbrust, M., Fox, A., Griffith, R., Joseph, A., Katz, R., Konwinski,
A., et al. (2009). Above the clouds: A Berkeley view of cloud
computing. Tech. rep., University of California at Berkeley.
Bar-Noy, A., Guha, S., Naor, J., & Schieber, B. (2000). Message multi-
casting in heterogeneous networks. SIAM Journal on Computing,
30(2), 347–358.
Bar-Noy, A., Kipnis, S., & Schieber, B. (2000). Optimal multiple
message broadcasting in telephone-like communication systems.
Discrete Applied Mathematics, 100(1–2), 1–15.
Ezovski, G., Tang, A., & Andrew, L. (2009). Minimizing average finish
time in P2P networks. In Proceedings of the 28th IEEE interna-
tional conference on computer communications (INFOCOM) (pp.
594–602).
Fan, B., Lui, J. C. S., & Chiu, D. M. (2009). The design trade-offs of
bittorrent-like file sharing protocols. IEEE/ACM Transactions on
Networking, 17, 365–376.
Garey, M., & Johnson, D. (1979). Computers and intractability.New
York, NY, USA: W. H. Freeman.
Goetzmann, K. S., Harks, T., Klimm, M., & Miller, K. (2011). Opti-
mal file distribution in peer-to-peer networks. In Proceedings of
the 22nd international conference on algorithms and computation
(ISAAC) (pp. 210–219).
Hedetniemi, S. T., Hedetniemi, S. M., & Liestman, A. (1998). A sur-
vey of gossiping and broadcasting in communication networks.
Networks, 18, 129–134.
Khuller, S., & Kim, Y. A. (2007). Broadcasting in heterogeneous net-
works. Algorithmica, 48(1), 1–21.
Könemann, J., Levin, A., & Sinha, A. (2005). Approximating the
degree-bounded minimum diameter spanning tree problem. Algo-
rithmica, 41(2), 117–129.
Kumar, R., & Ross, K. (2006). Peer assisted file distribution: The min-
imum distribution time. In Proc. 1st IEEE workshop on hot topics
in web systems and technologies (pp. 1–11).
Kwon, O. H., & Chwa, K. Y. (1995). Multiple message broadcasting in
communication networks. Networks, 26(4), 253–261. https://doi.
org/10.1002/net.3230260409.
Mehyar, M., Gu, W., Low, S., Effros, M., & Ho, T. (2007). Optimal
strategies for efficient peer-to-peer file sharing. In Proceedings of
the IEEE international conference on acoustics, speech and signal
processing (ICASSP) (Vol. 4, pp. 1337–1340).
Middendorf, M. (1993). Minimum broadcast time is NP-complete for
3-regular planar graphs and deadline 2. Information Processing
Letters, 46(6), 281–287.
Miller, K., & Wolisz, A. (2011). Transport optimization in peer-to-peer
networks. In Proceedings of the 19th international conference on
parallel, distributed and network-based processing (PDP), (pp.
567–573)
Mundinger, J., Weber, R., & Weiss, G. (2008). Optimal scheduling of
peer-to-peer file dissemination. Journal of Scheduling, 11, 105–
120. https://doi.org/10.1007/s10951-007-0017-9.
Qiu, D., Srikant, R.: Modeling and performance analysis of BitTorrent-
like peer-to-peer networks. In Proc. ACM conf. applications,
technologies, architectures, and protocols for comput. comm.
(SIGCOMM) (pp. 367–378).
123
JournalofScheduling
Ravi, R.: Rapid rumor ramification: Approximating the minimum
broadcast time. In Proceedings of the 35th annual IEEE sympo-
sium on foundations of computer science (pp. 202–213).
Publisher’s Note Springer Nature remains neutral with regard to juris-
dictional claims in published maps and institutional affiliations.
123