Document [original]

JournalofScheduling

https://doi.org/10.1007/s10951-020-00643-w

Broadcasting a file in a communication network

Kai-Simon Goetzmann1·Tobias Harks3·Max Klimm2

Abstract

We study the problem of distributing a file, initially located at a server, among a set of nnodes. The file is divided into m≥1

equally sized packets. After downloading a packet, nodes can upload it to other nodes, possibly to multiple nodes in parallel.

Each node, however, may receive each packet from a single source node only. The upload and download rates between nodes

are constrained by node- and server-specific upload and download capacities. The objective is to minimize the makespan.

This problem has been proposed and analyzed first by Mundinger et al. (J Sched 11:105–120, 2008.https://doi.org/10.1007/

s10951-007-0017-9) under the assumption that uploads obey the fair sharing principle, that is, concurrent upload rates from

a common source are equal at any point in time. Under this assumption, the authors devised an optimal polynomial time

algorithm for the case where the upload capacity of the server and the nodes’ upload and download capacities are all equal.

In this work, we drop the fair sharing assumption and derive an exact polynomial time algorithm for the case when upload

and download capacities per node and among nodes are equal. We further show that the problem becomes strongly NP-hard

for equal upload and download capacities per node that may differ among nodes, even for a single packet. For this case,

we devise a polynomial time (1+2√2)-approximation algorithm. Finally, we devise two polynomial time algorithms with

approximation guarantees of 5 and 2 +log2n/m/m, respectively, for the general case of mpackets.

Keywords Broadcasting problem ·Content distribution ·File sharing ·Load balancing ·Makespan ·Peer-to-peer ·

Approximation algorithm

The results of Section 3 of this paper appeared as an extended abstract

in Proceedings of the 22nd International Symposium on Algorithms

and Computation (Goetzmann et al. 2011).

The research of Kai-Simon Goetzmann was supported by the

Deutsche Forschungsgemeinschaft within the research training group

“Methods for Discrete Structures” (GRK 1408).

The research of Max Klimm was carried out in the framework of

Matheon supported by Einstein Foundation Berlin (MI-8).

BMax Klimm

[email protected]

Tobias Harks

[email protected]urg.de

1Institut für Mathematik, Technische Universität Berlin,

Straße des 17. Juni 136, 10623 Berlin, Germany

2School of Business and Economics, Humboldt-Universität zu

Berlin, Spandauer Straße 1, 10178 Berlin, Germany

3Institut für Mathematik, Universität Augsburg,

86135 Augsburg, Germany

1 Introduction

The theory and practice of computing has seen two major

paradigm shifts over the past decades. First, with the rise of

cloud computing services, the computations for the solution

of a difficult problem are often distributed among several

data centers scattered all over the world. Second, on a more

microscopic level, most processor architectures have mul-

tiple cores that (ideally) solve computational problems in

parallel. In both realms, it is important to efficiently distribute

data available at one entity (computer, processor, etc.) to all

other entities of the system, or said differently, to efficiently

broadcast data (Armbrust et al. 2009). A widely accepted

measure of efficiency is the time needed to complete the

broadcast (called the makespan), leading to the minimum

time broadcasting problem. The makespan objective is for

instance important for cloud computing systems, where it is

crucial that security updates are disseminated to every com-

puting node as fast as possible.

In the minimum time broadcasting problem, we are given

a graph, whose nodes correspond to the entities of the system.

At time zero, a file is located at a designated node (the server),

123

JournalofScheduling

core

1 2 ... ... n

2cu

Fig. 1 Graphical representation of the file distribution problem we con-

sider

and the task is to disseminate the file to all nodes as fast as

possible. We assume that the file is divided into mequally

sized packets. At any point in time, each node that possesses

a certain packet may send it with an arbitrary rate to other

nodes that have not yet received this packet. As it is common

in cloud computing, the nodes are connected to each other via

a core network which is usually overprovisioned. Thus, the

sending rates to and from other nodes (via the core) are lim-

ited by two capacities. First, for each node ithere is an upload

capacity cu

i, and the sum of the sending rates of imay not

exceed cu

i. Second, every node has a download capacity cd

This problem contains elements of several classical com-

binatorial optimization problems such as network design,

scheduling and routing. A feasible solution (for a single

packet) can be thought of as a directed tree rooted at the

source node with the understanding that a node receives the

packet from its unique parent and sends it to its children

nodes. The additional feature besides finding an optimal dis-

tribution tree is to define a schedule determining the points

in time when nodes actually send the packet. In contrast to

classical scheduling models, however, once nodes received

the packet they may send the packet to other nodes, thus, in

scheduling terminology, machines can be activated.

Alternatively, the problem can be interpreted as a dynamic

multicommodity flow problem in a star-shaped topology, see

Fig. 1. The central vertex of the network represents the core

network. All nodes are connected to this vertex via a pair of

antiparallel arcs, equipped with upload and download capac-

ities. Packets of the file can be sent from one node to another

via the unique directed simple path between them. The total

flow on an arc must not exceed its capacity at any point in

time.

1.1 Previous work

There is a large body of work on the (minimum time) broad-

casting problem, multicast problem, and gossiping problem,

see Hedetniemi et al. (1998) for a survey. In the broadcasting

problem, the task is to disseminate a file from a source node to

the rest of the nodes in a given communication network as fast

as possible, see Ravi (1994) for an approximation algorithm.

When the file needs to be disseminated only to a subset of the

nodes, this task is referred to as multicasting, see Bar-Noy

et al. (2000a) for approximation algorithms. In the gossiping

problem, several nodes possess different files and the goal

is to transmit every file to every node. The usual underly-

ing communication model for these problems is known as

the telephone model: A node may send a file to at most one

other node at a time, and it takes one round (i.e., one unit

of time) to transfer a file. For complete graphs on n+1

nodes, this process terminates in log2(n+1)rounds. It is

known that for arbitrary communication graphs, the problem

of computing an optimal broadcast in the telephone model

is NP-hard (cf. Garey and Johnson 1979), even for 3-regular

planar graphs (cf. Middendorf 1993).

Khuller and Kim (2007) studied the problem of broadcast-

ing in heterogeneous networks [cf. Bar-Noy et al. (2000a)for

the corresponding multicast problem]. They consider com-

plete graphs, but extend the telephone model by allowing

the transmission time of the file to depend on the sender.

They prove that it is NP-hard to minimize the makespan and

present an approximation scheme.

All the above models, however, do not take into account

two important features of many real-world implementations

of broadcasting systems, such as cloud computing or peer-

to-peer networks. First, each node typically sends the file to

multiple nodes of the network in parallel. Second, the file

is usually divided into packets (as the smallest indivisible

unit), and a node may receive different packets from dif-

ferent nodes. These extensions result in structural changes

of the model. In contrast to the telephone model [including

the extension of Khuller and Kim (2007)], in our model the

transfer time between any two nodes is no longer part of the

instance but determined by the sending rates of a feasible

solution. In fact, the model of Khuller and Kim (2007)for

broadcasting in heterogeneous networks reduces to a special

case of our model. If cd

i≥maxj∈Ncu

jfor all i∈N, there

always is an optimal solution in which no node will send the

file to more than one other node at a time, thus, the time to

transfer the file from one node to another only depends on

the sender’s upload capacity. In general, however, it may be

beneficial to serve multiple nodes simultaneously as illus-

trated in the following example of a file distribution problem

in a network with three nodes. Node 0 with capacity cu

0=2

initially owns the file consisting of one packet. There are two

further nodes with capacities cu

i=cd

i=1, i=1,2. In the

optimal solution, both nodes receive the file in parallel at a

rate of 1, yielding a makespan of M∗=1. Restricting nodes

to upload to at most one other node at a time results in a

makespan of 2.

Mundinger et al. (2008) studied a peer-to-peer file distri-

bution problem, where a file is subdivided in multiple parts

123

JournalofScheduling

and the goal is to disseminate the complete file to every peer

as fast as possible. The crucial difference to our model is that

they assume that the upload capacity of a node is equally

shared among concurrent uploads. Under this fair sharing

assumption, they prove that for homogeneous capacities (i.e.,

i=cd

i=1 for all nodes i) a simple greedy algorithm is opti-

mal. (A similar result was also earlier reported by Kwon and

Chwa (1995) and Bar-Noy et al. (2000b), assuming the tele-

phone model.) We prove this result without the fair sharing

assumption. Our proofs are quite involved, since dropping the

fair sharing assumption considerably complicates matters.

For heterogeneous capacities, to the best of our knowledge,

neither approximation algorithms nor hardness results were

known before.

When the number of parts of the file tends to infinity, one

obtains the so-called fluid flow model, for which a relatively

easy closed-form expression of the minimum completion

time can be derived, see, e.g., Ezovski et al. (2009), Kumar

et al. (2006), and Mehyar et al. (2007). Qiu and Srikant (2004)

studied a related model with stochastic arrival of peers. Fan

et al. (2009) considered a finite number of parts, but assumed

that the number of parts is large enough such that a peer

always has enough “new” parts to share with other peers.

The file distribution problem we consider has elements of

a scheduling problem in which a job becomes a machine after

its completion. A related problem exhibiting this property is

the freeze-tag problem studied in the area of robotics. Here, a

set of asleep robots is placed on a graph. There is one desig-

nated robot which is awake and that can travel to other robots

in order to awake them. Once a robot is awake, it may travel

to other robots to awake them. The task is to awake all robots

as fast as possible. Arkin et al. (2003) showed the hardness of

the problem in unweighted graphs and gave a constant factor

approximation algorithm for the case that there is at most

one robot at each node. For the general case, they obtained a

√log n-approximation, where nis the number of robots.

Könemann et al. (2005)devisedaO√log n-approximation

for the weighted case. Arkin et al. (2006) studied the prob-

lem for different graph topologies. Closest to our setting,

they showed that the problem is NP-complete, even for star

networks. For this case, they devised a 14-approximation.

1.2 Summary of the results and used techniques

Our results for a single packet. In Sect. 3, we study the basic

situation in which a single packet is to be broadcasted. For

the case of homogeneous symmetric (unit) capacities, that is,

i=cd

i=1 for all nodes i, we show that a greedy algorithm

computes an optimal solution with makespan log2(n+1).

Although similar results for the same algorithm have been

obtained before, e.g., by Mundinger et al. (2008), or in the

broadcasting literature (Kwon and Chwa 1995; Bar-Noy et al.

2000b), our result holds for a more general model as we drop

the fair sharing assumption used by Mundinger et al. in the

telephone model. For the case of unit node capacities and an

arbitrary integer server capacity, we propose a polynomial

time algorithm (that possibly splits up server capacity) and

prove its optimality. We also give a closed-form expression

of the minimal makespan.

If node capacities are heterogeneous and symmetric (i.e.,

i=cd

ifor all nodes i), we show that the problem becomes

strongly NP-hard. A key ingredient of the reduction (from 3-

Partition)istheCapacity Expansion Lemma (Lemma 5)

that provides an upper bound on the total amount of data

downloaded at any point in time.

In light of the hardness, we then study approxima-

tion algorithms. We first devise a polynomial time 2√2-

approximation algorithm for instances with heterogeneous,

symmetric capacities in which the upload capacity of the

server is larger than the download capacity of any other

node. For smaller server capacities, a slight modification of

our algorithm gives a (1+2√2)-approximation. Our algo-

rithm which we term Scale- Fit proceeds in two phases; in a

first phase, we use a time-varying resource augmentation to

construct a so-called √2-augmented solution, that violates

the capacity constraints of the nodes at any point in time

by a factor of at most √2. By exploiting again our Capac-

ity Expansion Lemma, we prove that the makespan of the

augmented solution thus constructed is at most a factor of 2

away from the optimal makespan of the original instance. We

then rescale the relaxed solution to obtain a feasible solution

with makespan less than a factor 2√2 away from the optimal

makespan.

Our results for multiple packets. In Sect. 4, we proceed

by analyzing the more challenging problem of distributing

a file that is divided into m>1 packets to a set of nodes

with heterogeneous symmetric capacities. First, we devise

an algorithm, which we call Spread- Exchange, that works

in two phases. In the first phase, we use a slight variation of

our Scale- Fit-Algorithm for single packet distribution in

order to send one packet to each node. At the end of the

first phase, each node possesses one packet of the file, but

the packets owned by different nodes may differ. In fact,

we strive to maximize the variety of different packets avail-

able at the nodes. In the second phase, the nodes exchange

the packets among each other, in each round increasing the

number of packets available at each node by one. By care-

fully keeping track of the proportions of the different packets

available at the nodes during both phases of the algorithm, we

prove that for instances for which the server has the largest

capacity, Spread- Exchange achieves an 4-approximation

of the optimal makespan. For smaller server capacities, a

slight variation of the algorithm gives a 5-approximation.

Motivated by real-world instances where the number of

packets is large compared to the number of nodes, we also

propose a different algorithm, termed Spread- Mirror-

123

JournalofScheduling

Cycle, that performs better than Spread- Exchange on

these instances. The main idea is to invest more time in the

first phase, in order to achieve a balanced proportion of the

different packets available at the different nodes. This allows

to perform a faster exchange procedure in the second phase.

For a large number of packets, the second phase dominates

the makespan of the solutions of both algorithms and, thus,

Spread- Mirror- Cycle performs better than Spread-

Exchange. We prove that Spread- Mirror- Cycle yields

a(2+2log2n/m/m)-approximation, establishing that

Spread- Mirror- Cycle has a better worst-case guarantee

for instances in which nm2(3/2)m.

2 Preliminaries

An instance of the minimum time broadcasting prob-

lem is described by a tuple I=(N,m,cd,cu), where

N={0,...,n}is the set of nodes, m∈N≥1is the

number of packets, cd=(cd

0,...,cd

n)∈Qn+1

≥0is the vec-

tor of download capacities from the core network and

cu=(cu

0,...,cu

n)∈Qn+1

≥0is the vector of upload capacities

to the core network. Let [m]={1,2,...,m}be the set of

packets. We will identify the server with node 0 and assume

that only the server initially owns the file. See Fig. 1for an

illustration. The file is of unit size; the size of each packet is

thus 1/m. A feasible solution S=s(k)

i,ji,j∈N,k∈[m]is a fam-

ily of integrable functions s(k)

i,j:R≥0→R≥0,i,j∈N,

k∈[m], where s(k)

i,j(t)denotes the rate at which node i

sends packet kto node jat time t. We require that each node

i∈Nreceives each packet kfrom a unique node, denoted

by p(k)(i)= i:

s(k)

,i(t)=0 for all = p(k)(i)and all t∈R≥0.

In addition, only nodes that possess a packet kcan send that

packet with a positive rate:

s(k)

i,j(t)=0 for all i,j∈N\{0},k∈[m],t∈R≥0

with t

0

∈N

s(k)

,i(τ) dτ< 1

Finally, the sending rates have to obey download and upload

capacity constraints:



k∈[m]

j∈N

s(k)

i,j(t)≤cu

ifor all i∈N,t∈R≥0,and



k∈[m]

s(k)

p(k)(j), j(t)≤cd

jfor all j∈N,t∈R≥0.

We denote by

xi(t)=t

0

k∈[m]

s(k)

p(k)(i),i(τ) dτ

the proportion of the file owned by node i∈N\{0}at time t.

For notational convenience, we set x0(t)=1 for all t∈R≥0.

We let Ci=inf{t∈R≥0:xi(t)=1}denote the completion

time of node i. The makespan of a solution Sis then defined

as M(S)=maxi∈N\{0}Ci. If an instance satisfies cu

i=cd

j=cd

jfor all i,j∈N\{0}we speak of an instance with

homogeneous symmetric capacities. If an instance satisfies

i=cd

ifor all i∈N(but possibly cu

i= cu

jfor some i,j∈

N\{0}), we speak of heterogeneous symmetric capacities.

In both cases, we only write I=(N,m,c)meaning that

c=cu=cd.Ifm=1, we only write I=(N,c).

As we will show, the minimum time broadcasting problem

is in general NP-hard. A common approach in theoretical

computer science to deal with such problems is to consider

approximation algorithms. For a minimization problem P

and ρ>1, a ρ-approximation algorithm is an algorithm

that, for any instance I∈P, computes a feasible solution

Sin time that is polynomial in the encoding length |I|of

the input, such that the objective value c(S)is not greater

than the optimal value Opt(I)by more than a factor of ρ,

i.e., c(S)≤ρOpt(I).

There are some particularities of the minimum time broad-

casting problem regarding the polynomial running time of

algorithms. In general, a solution may consist of arbitrar-

ily complicated sending rate functions s(k)

i,j. Without loss

of generality, we can restrict ourselves to piecewise con-

stant functions with discontinuities only at points in time

where some node starts or finishes the download of a packet.

Any solution that does not have this property can be modi-

fied to do so by flattening the sending rates in the intervals

where the sending and receiving nodes for each packet are

constant. Specifically, let [t1,t2)be such an interval. It is

easy to verify that the new sending rates ¯s(k)

i,jdefined as

¯s(k)

i,j=1

t2−t1t2

t1s(k)

i,j(τ) dτfor all i,j∈Nobey the capacity

constraints of the upload and download links; see also Miller

and Wolisz (2011, Theorem 1) for a formal proof. A simple

shifting argument further shows that it is without loss of gen-

erality to assume that a node starts receiving a packet only

at time zero or when another node has finished receiving a

packet. There are at most m·nof these breakpoints.

However, if the number of packets mis part of the input,

encoded in binary, a polynomial time algorithm would usu-

ally only be allowed time that is polynomial in log m.Given

that even a simplified (optimal) solution might need space

linear in m, in the context of broadcasting a divisible file we

call an algorithm polynomial if it is polynomial in the encod-

ing length of the input and m. Another reason for assuming

123

JournalofScheduling

Algorithm 1: Greedy for identical capacities

Input: instance I=(N,c)with c=1

Output: makespan-minimal solution S

1si,j(τ) := 0∀τ≥0;

2A:= {0},P:= N\{0},t:= 0;

3while P=∅do

4A:= ∅;

5forall the i∈Ado

6Choose j∈P;

7si,j(t):= 1forallt∈[t,t+1);

8A:= A∪{j},P:= P\{j};

9end

10 A:= A∪A,t:= t+1;

11 end

an input size proportional to mis that in real systems each

packet contains a unique piece of information and, thus, the

input size is in practice proportional to m.

3 Broadcasting a single packet

We begin by studying the base case m=1, that is, the file

consists of a single packet that is to be broadcasted. Parts

of the results obtained in this chapter will be reused later to

solve the more challenging general case m>1. To increase

the readability, throughout this section we drop the index k

for the packet number from the notation.

This section is organized as follows. First, we propose

efficient and optimal solutions for the case of homogeneous

symmetric capacities, that is, upload and download capaci-

ties among the nodes are equal. Then, we show that for the

more general case of heterogeneous symmetric capacities the

computation of an optimal solution becomes strongly NP-

hard. Finally, we give an efficient O(nlog n)-algorithm that

approximates the optimal makespan by a factor of 1 +2√2.

3.1 Homogeneous symmetric capacities

In this section, we consider the homogeneous symmetric set-

ting, i.e., cu

i=cd

i=ci=1 for all i∈N\{0}. The server

has a capacity of cu

0=c0.

If the server has unit upload capacity c0=1 as well,

the following greedy procedure yields an optimal solution:

At each point in time, any node that already owns the file

uploads it to exactly one other node, which takes one unit of

time. Thus in each step, the number of nodes owning the file

is doubled, resulting in a makespan of log2(n+1).Fora

formal description of the procedure, see Algorithm 1.

Lemma 1 If ci=1for all i ∈N, Algorithm 1computes

an optimal solution in time O(n). The optimal makespan is

log2(n+1).

Proof Throughout the algorithm, the number of nodes that

own the file (including node 0) at time t∈Nis min{2t,n}

(proof by induction). All completion times are integral;

hence, the makespan of the constructed solution is

M=min{t∈N:2t≥n+1}

=min{t∈N:t≥log2(n+1)}

=log2(n+1).

To prove that this is best possible, consider an optimal

solution S∗=(s∗

i,j)i,j∈Nwith the corresponding completion

times C∗

1,...,C∗

n. We will modify this solution such that in

the time interval [0,1)only one node is served, completing

the transfer at time 1, without increasing the makespan. Iter-

ating this argument yields a solution with the same structure

as the one constructed by Algorithm 1, establishing the claim.

Let the nodes be indexed such that C∗

1=mini=1,...,nC∗

i.We

set the rate vector sof the modified solution as follows:

s

0,1(t)=1fort∈[0,1),

0fort≥1,

s

0,i(t)=⎧

⎪

⎨

⎪

⎩

0fort∈[0,1),

x∗

i(C∗

C∗

1−1for t∈[1,C∗

1),

s∗

0,i(t)for t≥C∗

for all i∈N\{1}with p(i)=0,

s

i,j(t)=s∗

i,j(t)

for all other i,j∈Nand t∈R≥0. Without loss of generality,

we can assume s∗

i,j(t)=1

C∗

1C∗

0s∗

i,j(τ) dτfor all t∈[0,C∗

and all i,j∈N; see Miller and Wolisz (2011, Theorem 1).

It then holds that x∗

i(C∗

1)=s∗

0,i(0)·C∗

1for i∈N\{1}

with p(i)=0. Using this, we can show that in the modified

solution the upload capacity of node 0 is obeyed: For t∈

[1,C∗

1),

i∈N\{1}:p(i)=0s

0,i(t)

C∗

1−1i∈N\{1}:p(i)=0x∗

i(C∗

=C∗

C∗

1−1i∈N\{1}:p(i)=0s∗

0,i(0)

1−1

C∗

1i∈Ns∗

0,i(0)

 

≤1

−s∗

0,1(0)

≤1

1−s∗

0,1(0)1−s∗

0,1(0)

=1.

123

JournalofScheduling

The inequality above also yields s

0,i(t)≤1 for all i∈N\{1}

with p(i)=0, so download capacities are obeyed as well

since p(i)=0 implies s∗

j,i(t)=0 for all j= 0, t∈R≥0.

Finally, at time C∗

1every node i∈Nowns the same fraction

of the file as in the original solution, namely x∗

i(C∗

1). Since

in S∗no node completes its download before time C∗

1, and

in the modified solution after time C∗

1the original rates are

used, no completion of a download is delayed, and hence,

the makespan is not increased. 

We now consider the case where the server capacity c0is

an arbitrary integer and start with a proof that in this case

there always is an optimal solution that uses fair sharing,

i.e., whenever node 0 serves several nodes simultaneously,

all of them are served with equal rate. The proof is quite

involved, and the integrality condition on c0is crucial. To

see this, consider an example with c0=3/2 and n=4. In

an optimal solution, the server serves node 1 in [0,1)and

node 2 in [1,2)with a rate of 1 each and node 3 with a rate

of 1/2in[0,2). Node 4 is served by node 1 in [1,2), resulting

in a makespan of M∗=2. The best solution that uses fair

sharing, however, has a makespan of M=7/3: The server

serves nodes 1 and 2 in [0,4/3)witharateof3/4 each, and

node 3 and 4 are both served at a rate of 1 in [4/3,7/3)by

the server and node 1.

Lemma 2 For any instance I =(N,c)with c0∈Nand

ci=1for all i =1,...,n, there always exists an optimal

solution that uses fair sharing.

Proof For an arbitrary instance Iwith the demanded proper-

ties, consider an optimal solution S∗with the corresponding

completion times C∗

1,...,C∗

n. W.l.o.g. assume that the nodes

are indexed such that

{1,2,...,k}={i∈N:p(i)=0}

and let 1,

2,...,∈{1,...,k}be such that

C∗

1=...=C∗

1

<C∗

1+1=···=C∗

1+2<C∗

1+2+1≤...≤C∗

Let N1={1,...,

1}.Fori=1,...,k,letmibe the

makespan for serving all nodes that are directly or indirectly

served by node i, i.e.,

mi=maxC∗

j−C∗

i:j∈N,pr(j)=ifor some r∈N,

where pr(j)denotes the r-fold application of the function

pon j. By Lemma 1, we can assume that these nodes

are served according to Algorithm 1. (Otherwise, we mod-

ify the solution to get this structure without increasing the

makespan.) In particular, we get that mi∈Nfor all i=

1,...,k. The makespan of the given optimal solution is

M∗=maxi=1,...,k{C∗

i+mi}. Finally let M1=maxi∈N1mi.

In the following, we show how to modify the original solu-

tion s∗to obtain a solution sthat uses fair sharing and has

not a larger makespan. All primed variables s

i,j,C,x,



and N

1refer to variables of the modified solution. When-

ever s

i,j(t)is not specified for some i,j,t, it is equal to the

original value s∗

i,j(t). The modification of s∗to sproceeds

iteratively in steps. In each step, one of four modifications

is made. Specifically, if the server capacity is not smaller

than k, all nodes {1,...,k}are served by the server at

equal rate (Case 1). If k>c0and M1≤mi0for some

i0∈{1+1,...,k}, a single node is added to N1and all

nodes in N1are served with equal rate while maintaining the

original rates for all nodes not in N1(Case 2). If 1≥c0and

M1>mifor all i∈{1+1,...,k}, the server serves all

nodes in N1at full capacity and with fair sharing; the serving

of the other nodes is delayed (Case 3). Finally, if 1<c0<k

and M1>mifor all i∈{1+1,...,k}, we increase N1to

size c0and serve the nodes in N1at full capacity with fair

sharing; the serving of all other nodes is delayed (Case 4).

Application of Case 1 immediately leaves us with a solu-

tion that uses fair sharing. Application of Case 3 and Case 4

leaves us with a solution swith completion times

C

1=...=C



<C



1+1=···=C



1+

<C

1+

2+1≤...≤C

where the nodes in N

1={1,...,



1}are served in time

[0,C

1)with fair sharing and nodes i∈{

1+1,...,k}are

served by the server after C

1. We then fix the part of the

solution related to the nodes in N

1(and the nodes served by

these nodes) and apply the same modifications for the nodes

in N

2={

1+1,...,



1+

2}. Finally, after application of

Case 2, we have increased the size of N1by one. Since the

number of nodes is finite, after a finite number of applications

of Case 2, one of the other three cases is applied.

We proceed to show that the application of the four cases

yields a feasible solution and does not increase the makespan.

Case 1: k≤c0.

In this case, we let node 0 serve the nodes 1,...,kat a rate

of 1 in the time interval [0,1). Obviously no node completes

its download later than in the given solution, and fair sharing

is used all the time.

In the following cases, we always assume that k>c0.

Case 2: M1≤mi0for some i0∈{1+1,...,k}.

We add one node to N1, serving nodes 1,...,

1+1 with

equal rates in the time interval 0,C∗

1+1:

∀i∈1,...,

1+1:

s

0,i(t)=1+1

j=1

s∗

0,j(t)

1+1for all t∈0,C∗

1+1,

0 otherwise.

123

JournalofScheduling

Then nodes 1,...,

1+1 all complete their download at time

C∗

1+1since

x

i(C∗

1+1)=C∗

1+1

s

0,i(τ) dτ

1+1



j=1C∗

1+1

s∗

0,j(τ) dτ

 

for all i=1,...,

1+1. The download capacities of all

nodes are satisfied since at any point tin time

s

0,i(t)=1

1+1



j=1

s∗

0,j(t)≤max

j=1,...,1+1s∗

0,j(t)≤1.

The same is true for the upload capacity of the server, since it

uploads with the same overall rate as in the original solution

at each point in time.

We proceed to argue that this modification does not

increase the makespan. First, since M1≤mi0, for all i∈N1

we have

C

i+mi=C∗

k+1+mi≤C∗

i0+mi0≤M∗,

so delaying the completion of nodes in N1does not influence

the makespan. Furthermore, in the modified solution s,all

nodes i∈{1+2,...,k}are served by the server in the

same way as in solution s∗so that their completion time Ci

as well as the completion times of the nodes served by them

does not change.

We now turn to the cases where M1>mifor all i=

1+1,...,k. Note that by Lemma 1, we know that mi∈N

for all i; hence, we can assume M1≥mi+1 for all i>

in the subsequent cases.

Case 3: 1≥c0and M1≥mi+1 for all i>

In this case, we modify the solution such that node 0 first

serves all nodes in N1at full capacity, delaying the serving

of all other nodes until those in N1have completed their

downloads. More formally, let t1=1/c0, and set

∀i∈N1:s

0,i(t)=c0

1for all t∈[0,t1)

0 otherwise.

∀i=1+1,...,k:s

0,i(t)=0 for all t∈[0,t1).

It is obvious that in [0,t1)all capacities are obeyed and C

t1≤C∗

ifor all i∈N1.

To specify how the remaining nodes are served, let t2=

max{C∗

1,t1+1}. We serve nodes 1+1,...,kin such a way

that the fraction of the file they own at time t2is the same as

in the original solution, i.e., x

i(t2)=x∗

i(t2):

∀i=1+1,...,k:s

0,i(t)=x∗

i(t2)

t2−t1

for all t∈[t1,t2).

Since x∗

i(t2)≤1 and t2−t1≥1, this solution obeys the

download capacities of all nodes. Also the upload capacity

of the server is obeyed, since in the time interval [0,t2)it

sends the same volume as in the original solution:

t2

0

i∈N

s

0,i(τ) dτ=

1



i=1t1

1

dτ+



i=1+1t2

x∗

i(t2)

t2−t1

dτ

k



i=1

t1·c0

1



=1=x∗

i(t2)



i=1+1

x∗

i(t2)



i=1

x∗

i(t2),

and the overall rate the server sends at is c0at the beginning

and constant afterward, hence it is at most c0at all times.

From time t2on, in the modified solution s, all nodes are

served as in the original solution s∗.

Nodes in N1and nodes with C∗

i≥t2complete their down-

load no later than in the original solution. If there is a node

i∈{1+1,...,k}with C∗

i<t2, then t2>C∗

1and thus

t2=t1+1, and this node icompletes its download exactly

one time unit after the nodes in N1. Since M1≥mi+1we

obtain that

C

i+mi=t1+1+mi≤C∗

1+M1≤M∗,

therefore the makespan is not increased.

Case 4: 1<c0and M1≥mi+1 for all i>

In this case, we add nodes 1+1,...,c0to N1and serve

these nodes jointly at full capacity in the time interval [0,1):

∀i=1,...,c0:s

0,i(t)=1 for all t∈[0,1)

0 otherwise.

∀i=c0+1,...,k:s

0,i(t)=0 for all t∈[0,1).

For the serving of the remaining nodes, we set t2=

max{C∗

c0,2}and serve nodes c0+1,...,ksuch that x

i(t2)=

x∗

i(t2)by setting

∀i=c0+1,...,k:s

0,i(t)=x∗

i(t2)

t2−1for all t∈[1,t2).

Feasibility follows similarly to Case 3, as well as the fact that

the makespan is not increased. 

We continue by proving that the server always serves

groups of c0nodes jointly, possibly except for some start

123

JournalofScheduling

interval [0,t1),t1∈R≥0where a group of at least c0and

most 2c0nodes is served.

Lemma 3 For any instance I =(N,c)with c0∈Nand

ci=1for all i =1,...,n, there always exists an opti-

mal solution that uses fair sharing where the server never

serves more than c0nodes simultaneously except for some

interval [0,t1],t

1∈R≥0in which at least c0and at most

2c0nodes are served simultaneously until they are com-

pleted.

Proof For an instance with the demanded properties, con-

sider an optimal solution S∗. Using Lemma 2, we can assume

w.l.o.g. that the set {i∈N:p(i)=0}of nodes served

by the server is partitioned into sets N1,...,Nksuch that

all nodes i∈Njare served simultaneously by the server

at equal rates in the time interval [tj−1,tj], where t0=0

and tj=j

=1max{1,|N|

c0}for j=1,...,k. We can

further assume that |Nj|≥c0for j=1,...,k−1

(otherwise we can add nodes from Nj+1to Njwithout

increasing the makespan) and |Nj|<2c0for j=1,...,k

(Otherwise, we split Njinto two sets and serve c0nodes

in [tj−1,tj−1+1]and the remaining |Nj|−c0nodes in

[tj−1+1,tj].)

For j=1,...,k, we denote by ˜

Njthe set of nodes that

are directly or indirectly served by a node in Nj, and by Mj

the makespan needed for this, i.e.,

Nj=i∈N:pr(i)=ifor some i∈Nj,r∈N,

Mj=maxi∈˜

NjC∗

i−tj,(1)

where again pr(j)denotes the r-fold application of the func-

tion pon j. As in the proof of Lemma 2, we can assume that

Mj∈Nfor all j=1,...,k. We now prove by induction

that the solution can be modified such that |Nj|≤c0for

j=2,...,k, without increasing the makespan. For the base

case, assume |N2|>c0. We decrease N2to contain only c0

nodes.

If M1≤M2+1, we remove |N2|−c0nodes from |N2|,

add them to N1and let them be served by the server jointly

with the other nodes in N1. This delays the completion of

nodes in N1to time

|N1|+|N2|−c0

c0=t2−1,

but using M1≤M2+1 it does not increase the makespan

since the new makespan of any node in ˜

N1can be bounded

by t2−1+M1≤t2+M2. The case M1≥M2+2

is more involved. Here, the |N2|−c0removed nodes will

be served by nodes in ˜

N2. For this, the makespan M2

has to be increased by at most one. To see this note that

|˜

N2|≤|N2|·2M2. With |N

2|=c0,inM2+1 time units

we can serve up to c0·2M2+1=2c0·2M2>|N2|·2M2

Algorithm 2: Extended Greedy for integral server

capacity

Input:I=(N,c)with c0∈N,ci=1foralli∈N\{0}

Output: makespan-minimal solution S

1h:= log2n

c0+1;

2if n≥c0(2h−1+2h−1)then

3Choose N1⊆Nwith |N1|=c0;

4k:= h+1;

5else if n<c0(2h−1+2h−1)then

6Choose N1⊆Nwith |N1|=n−c0(2h−1−1)

2h−1;

7k:= h;

8end

9t1:= |N1|/c0;

10 In [0,t1), node 0 serves nodes in N1;

11 for t=0,...,k−2do

12 Choose Nt+2⊆N,|Nt+2|=c0of unserved nodes;

13 In [t1+t,t1+t+1), node 0 serves nodes in Nt+2,any

completed node i∈N\{0}serves one other node that is not

yet served (if there still is one);

14 end

nodes through nodes in N

2, so we can serve all nodes in

N2(including the nodes from N2). The overall makespan is

not increased by this, since the serving of N

2is now com-

pleted at time t

2=t1+1, and hence, all nodes in ˜

N2are

served at time t

2+M2+1=t1+M2+2≤t1+M1≤

M∗.

For the inductive step, assume |Nj|>c0for some

j=3,...,k. By the induction hypothesis, we know that

|N|=c0for =2,..., j−1 and thus tj−1=t+

j−−1 for all =1,..., j−1. With this and a dis-

tinction between the cases M≤Mj+j−and M≥

Mj+j−+1, the inductive step follows similarly to the

base case. 

Summing up, we have shown so far that there always is

an optimal solution where the server at the beginning serves

a set of nodes N1,c0≤|N1|<2c0,in|N1|/c0time units,

then some sets N2,...,Nk−1with |Nj|=c0for all j≥2,

in one time unit each, and finally a set Nkcontaining at most

c0nodes, in another unit of time.

We are now ready to present an exact algorithm for this

special case and prove its correctness. Intuitively speaking,

the algorithm is an extended greedy procedure that attempts

to balance the sub-makespans Mjof the different sets Nj

of nodes served by the server. Set h=log2(n/c0+1).If

n<c0(2h−1+2h−1),let|N1|=n−c0(2h−1−1)/2h−1

and k=h; otherwise, let |N1|=c0and k=h+1.

Let |Nj|=c0for j=2,...,k. The server serves N1in

[0,|N1|/c0)and Njin |N1|/c0+j−2,|N1|/c0+j−

2. Meanwhile, any node that finishes its download starts

serving other nodes greedily. Details are listed in Algo-

rithm 2.

123

JournalofScheduling

Theorem 4 For any instance I =(N,c)with c0∈Nand

ci=1for all i =1,...,n, Algorithm 2yields an optimal

solution. The optimal makespan is

M∗=⎧

⎪

⎨

⎪

⎩

h−1+1

c0n−c0(2h−1−1)

2h−1

if n ∈c0(2h−1), c0(2h−1+2h−1)

h+1

if n ∈c0(2h−1+2h−1), c0(2h+1−1),

where h =log2(n/c0+1).

Proof W.l.o.g. we can assume that in the solution produced

by Algorithm 2,|˜

Nj|=c0·2k−jfor all j=2,...,k[with

Njas defined in (1)], i.e., all sets ˜

Njfor j≥2 are as big

as possible. If this is not the case, we can move nodes from

N1to these sets. The lemmas above state that there also is an

optimal solution with this structure and c0≤|N1|<2c0.In

the argumentation below, we therefore restrict to solutions of

this structure.

The integer his chosen such that c0(2h−1)≤n<

c0(2h+1−1). We first consider the case where n≥c0(2h−

1+2h−1). Our algorithm sets k=h+1 and |N1|=c0.

The maximum number of nodes that can be served like this

is c0(2h+1−1)>n, so indeed all nodes are served within a

makespan of h+1. Assume that in an optimal solution only

k=hsets are served by the server. The maximum number

of nodes served in this way is |N1|·2h−1+c0(2h−1−1)<

c0(2h+2h−1−1)≤n, so not all nodes can be served,

contradiction! Therefore in an optimal solution it must hold

that k=h+1 and |N1|≥c0, proving that the solution

produced by Algorithm 2is optimal.

If n<c0(2h−1+2h−1), the size of N1in an optimal

solution (with |˜

Nj|=c0·2k−jfor j≥2) has to be chosen

such that |˜

N1|=n−c0(2h−1−1)≤|N1|·2h−1, i.e.,

|N1|=min ∈N:≥n−c0(2h−1−1)

2h−1

=n−c0(2h−1−1)

2h−1,

which is exactly how the algorithm chooses N1. Moreover,

the algorithm sets k=h. It is obvious that choosing k≤h−1

does not lead to a solution where all nodes are served, and

choosing k≥h+1 leads to a solution with a makespan of

at least h+1. The solution produced by our algorithm has a

makespan of t1+k−1≤h+1, where we used t1≤2 and

k=h. This proves the correctness of Algorithm 2.

3.2 Heterogeneous symmetric capacities

In this section, we consider the case of heterogeneous and

symmetric capacities. First, we show that the minimum time

broadcasting problem for heterogeneous symmetric capaci-

ties is strongly NP-hard, even for an indivisible file. Then, we

devise an algorithm that approximates the optimal makespan

by a factor 1+2√2. Our hardness and approximation results

rely on a useful lemma that bounds the work that has to

be done in any feasible solution. For a feasible solution, let

ui(t1,t2)and zi(t1,t2)denote the total upload and the idle-

ness of node iin time interval [t1,t2], defined as

ui(t1,t2)=t2

t1

j∈N

si,j(τ) dτ

and

zi(t1,t2)=t2

max{t1,Ci}⎛

⎝ci−

j∈N

si,j(τ)⎞

⎠dτ.

For t≥0, we define

X(t)=

i∈N

xi(t)

and

Z(t)=

i∈N

zi(0,t).

We obtain the immediate equality X(t2)−X(t1)=

i∈Nui(t1,t2)for all 0 ≤t1≤t2.

Lemma 5 (Capacity expansion lemma) Let I =(N,c)be an

instance of the minimum time broadcasting problem with an

indivisible file, and let c =maxi∈Nci. Then, for all solutions

S of I, the following two statements hold:

1. For all nodes i ∈N and all times 0≤t1<t2:

ui(t1,t2)+zi(t1,t2)

≤max0,t2−t1−1−xi(t1)

cici

=max0,(t2−t1)ci−1+xi(t1)

2. Xk

c+Zk

c≤2kfor all k ∈N. This inequality is strict

if there is i ∈N with ci<c and 0<Ci≤k

Proof To see 1., note that for any node iwith Ci≤t1,the

upload rate (integrand of the total upload) and the idle rate

(integrand of the idleness) sum up to cifor all t∈[t1,t2],

thus ui(t1,t2)+zi(t1,t2)≤(t2−t1)ci.Ifxi(t1)<1, node i

needs at least 1−xi(t1)

citime units to finish its download, thus

only a time interval of length t2−t1−1−xi(t1)

ciremains for

the upload and the claimed inequality follows.

123

JournalofScheduling

We prove 2.by induction over k. The inequality is trivial

for k=0 since initially only the server holds the file. So, let

us assume that for k∈N,wehaveXk−1

c+Zk−1

c≤2k−1

and that this inequality is strict if there is i∈Nwith ci<c

and 0 <Ci≤k−1

c.Using1., we obtain

Xk

c−Xk−1

c+Zk

c−Zk−1

c

=

i∈N

uik−1

c,k

c+zik−1

c,k

c

≤

i∈N

max$0,ci

c−1+xik−1

c%

≤Xk−1

c,(2)

whereweuseci≤c. Rearranging terms and using the induc-

tion hypothesis, we obtain

Xk

c+Zk

c≤2Xk−1

c+Zk−1

c≤2k.(3)

To finish the proof, let us assume that there is i∈Nwith

ci<cand 0 <Ci≤k

c.IfCi≤k−1

c, using the induction

hypothesis, we obtain Xk−1

c<2k−1and, by (3), Xk

c<

2k. If, on the other hand, Ci∈k−1

c,k

c&, then the inequality

(2) is satisfied strictly and we obtain Xk

c<2kby (3). 

3.2.1 Hardness

We are now ready to prove strong NP-hardness of the mini-

mum time broadcasting problem.

Theorem 6 The minimum time broadcasting problem for het-

erogeneous symmetric capacities is strongly NP-hard, even

for m =1.

Proof We reduce from 3-Partition. An instance of 3-

Partition is given by a multiset P={k1,...,kn}of n=

3μ,μ∈N, positive integers. The goal is to partition Pinto μ

subsets P0,...,Pμ−1such that the sum of all integers within

each subset is equal. The decision whether such partition

exists remains NP-complete in the strong sense even if B/4<

ki<B/2 for all ki∈P, where B=n

i=1ki/μ is integer,

see Garey and Johnson (1979). We first construct a modi-

fied 3- Partition instance in which μis a power of two and

larger or equal to 16. To this end, let =max{log2μ,4}

and set μ=2. We define the new instance by

P=P∪

μ

k=μ+1{B/2,B/2}.

The goal is to partition Pinto μsubsets P0,...,Pμ−1such

that the sum of all integers within each subset is equal. Note

that |P|might not be equal to 3μ, but this does not hurt our

following arguments. Because the integer B/2 can only go

with another integer B/2 in a “yes”-instance of the modified

3- Partition problem, the modified 3-Partition instance

Padmits a solution if and only if the original instance P

admits a solution.

Given the modified instance Pof 3-Partition, we con-

struct an instance Iof the minimum time broadcasting

problem as follows: For every ki∈P, we introduce a mas-

ter element node i0with capacity ci0=kiand 2ki−2

element nodes i1,...,i2ki−2with capacity cik=ki

ki−1for

all k∈{1,...,2ki−2}.For j∈{0,...,μ

−1},we

introduce a subset node sjwith capacity csj=B. Subset

node s0initially owns the file. Note that 3- Partition is

strongly NP-complete, i.e., is NP-complete even if the inte-

gers kiare represented unary. For unary encoded integers of

the 3- Partition instance, we obtain a polynomial reduction.

We claim that the optimal makespan of Iis less or equal



B+if and only if Pis a yes-instance. For the “if”-part, we

construct a solution Swhere first all subset nodes are served,

and then, each subset node serves those master element nodes

that correspond to elements in one of the sets of the partition.

Afterward all element nodes are served. More formally, in

the first 1

Btime units subset node s0sends the file to the first

subset node s1witharateofss0,s1(t)=Bfor all t∈[0,1

B).

After 1/Btime units, the first subset node has completed the

download of the file. Starting at time 1/Bsubset nodes s0and

s1send the file to the subset nodes s2and s3, respectively, each

witharateofss0,s2(t)=ss1,s3(t)=Bfor all t∈1

B,2

B.

Continuing in this fashion, at time /Ball μsubset nodes

own the file but none of the other nodes has received any

data. Let P∗

0,...,P∗

μ−1be a solution of P.Intheremain-

ing time units, subset node sj,j∈{0,...,μ

−1}, sends

the file to every master element node i0with ki∈P∗

j,atarate

of ssj,i0(t)=kifor all t∈

B+1

ki. This is feasible since

P∗

0,...,P∗

μ−1is a solution of Pand thus i∈S∗

jki=B.

The download of each master element node is finished after

1/kitime units at time 

B+1

ki. At this point in time, master

element node i0starts to send data to ki−1 of the corre-

sponding element nodes ij,j∈{1,...,ki−1},atarateof

si0,ij(t)=ki

ki−1for all t∈

B+1

ki,

B+. The remaining

ki−1 element nodes ij,j∈{ki,2ki−2}, are served

by the subset node sjwith rate ssj,ij(t)=ki

ki−1for all

t∈

B+1

ki,

B+. The download of the file by the element

nodes starts at time 

B+1

kiand needs additional ki−1

kitime

units, resulting in a total makespan of 

B+.

It remains to be shown that the optimal makespan of the

broadcasting instance Iis strictly larger than 

B+in the

case that Pis a no-instance. We first argue that no element

node ik,k∈{1,...,2ki−2}, will upload the file to any

other node. Both the upload and the download of the com-

plete file by element node ikrequires at least ki−1

ki=−1

time units. We calculate

123

JournalofScheduling

2−1

ki≥4

3+8

3−2

≥4

3+2

>4

3≥1

B+1,

where we use ≥4. This value is larger than the assumed

makespan. Hence, we can assume that only the subset nodes

(including the server s0) and the master element nodes upload

the file to other nodes. In order to find a solution of Iwith

M≤

B+, it is necessary to find a partial solution ¯

Sof the

partial instance ¯

Ithat consists only of subset nodes and the

master element nodes and has accumulated enough idleness

until time 

B+, i.e.,

Z

B+≥

i∈P

(2ki−2).

Otherwise the idleness of subset nodes and master element

nodes does not suffice to serve all element nodes until time



B+in the original instance S. We proceed to show that

for the partial instance, no such partial solution exists.

Let us first assume that xsj

B=1 for every subset node

sj. Using Lemma 5, the inequality X

B+Z

B≤μholds,

thus Z

B=0 and xi0

B=0 for every master element

node i0. We calculate

Z

B+=

i∈N

zi

B,

B+

=μB+

ki∈P

ki

B+−Ci0

−

j∈N

ki∈P

B+



sj,i0(τ) dτ

 

=|P|

=μB+

ki∈Pki

B+−Ci0−1.

For the completion time Ci0of a master element node i0,we

obtain the inequality Ci0≥

B+1

ki. We claim that there is

at least one master element node i

0for which this inequality

is strict. For a contradiction, suppose that Ci0=

B+1

kifor

all ki∈P. This implies that every master element node i0

downloads with its full rate sp(i0),i0=ci0. Let us define

Pj={ki:p(i0)=sj}. As subset node sjhas capac-

ity csj=Band serves all master element nodes i0with

ki∈Sjwith full rate, we derive that ki∈Sjki≤B. Thus,

Sjis a solution to the 3- Partition instance P, a contradic-

tion. Consequently, there is a master element node i

0with

Ci

0>

B+1

ki, establishing that

Z

B+<μ

B+

ki∈P

(ki−2)=

ki∈P

(2ki−2).

We are left with the case that at least one of the subset

nodes has not finished the download of the file at time 

that is, there is j∈{1,...,μ

−1}with xsj(

B)<1. We

calculate

Z

B+≤Z

B+

μ−1



j=0

B

B+−Csj

+

ki∈P

ki

B+−Ci0)

−

i∈N

j∈N

B+



si,j(τ) dτ

≤Z

B+

ki∈P

2ki+

μ−1



j=0

B

B−Csj

+

ki∈P

ki

B−Ci0−

i∈N1−xi

B (4)

=

ki∈P

(2ki−1)−μ+

μ−1



j=0

B

B−Csj

+

ki∈P

ki

B−Ci0+X

B+Z

B

 

≤μ

≤

ki∈P

(2ki−1)+

μ−1



j=0

B

B−Csj

+

ki∈P

ki

B−Ci0,(5)

where (4) stems from the fact that each node i∈N

still needs to download a proportion of 1 −xi(

B)of the

file. For each subset node sjwith xsj(

B)<1, we have

Csj≥

B+1−xsj(/B)

B, and for each master element node

with xi0(

B)<1, we have Ci0≥

B+1−xi0(/B)

ki. We distin-

guish two cases.

First case: At least one of the master element nodes has

finished the download at time 

Then, inequality (5) is strict and we obtain

Z

B+<

ki∈P

(2ki−1)−

μ−1



j=01−xsj

B

−

ki∈P1−xi0

B

123

JournalofScheduling

=

ki∈P

(2ki−1)+X

B



≤μ

−μ−|P|

≤

ki∈P

(2ki−2).

Second case: All master element nodes and at least one

of the subset nodes still have not finished the downloads at

time /B.

Then, the upload capacity of the other subset nodes is not

sufficient to serve all master element nodes with their full

download rate, thus, one of the inequalities Ci0≥

B+

1−xi0(/B)

kiis satisfied strictly. As a consequence, we obtain

Z(

B+) < ki∈P(2ki−2).

3.2.2 Approximation

In this section, we devise an algorithm that runs in time

O(nlog n)and computes a solution with makespan no larger

than (1+2√2)M∗, where M∗is the optimal makespan. We

first consider the case c0≥cifor all i=1,...,n,for

which we will show a 2√2-approximation. Then, we will

use this result to obtain a (1+2√2)-approximation for arbi-

trary server capacities. The additional summand of M∗in

the approximation arises as possibly we first have to transfer

the file to the node having the largest capacity which takes at

most M∗time units.

Before we present details of the algorithm, we give a

high-level picture of our approach. The intrinsic difficulty

in proving any approximation guarantee is to obtain lower

bounds on the optimal makespan. Let us recall the inequal-

ity shown in the Capacity Expansion Lemma (Lemma 5).

For any solution, node i’s contribution to the upload in time

interval [t1,t2]is bounded by ui(t1,t2)≤max{0,ci(t2−

t1)−1+xi(t1)}−zi(t1,t2)where zi(t1,t2)is the total idle-

ness of node iin the time interval [t1,t2]. To exploit this

capacity bound, our algorithm should fulfill two properties:

it should send the file to nodes with large capacity as soon

as possible, and it should avoid idle time as long as pos-

sible. To achieve this, we use the concept of time-varying

resource augmentation.Forλ≥1, we call a possibly infea-

sible solution ˜

Saλ-augmented solution, if at any point in

time the original capacities are not exceeded by more than a

factor of λ, i.e., j∈Nsi,j(t)≤λcu

iand sp(i),i(t)≤λcd

ifor

all i∈N,t∈R≥0. We will show that there is an efficiently

computable √2-augmented solution that satisfies the above-

mentioned properties. This allows us to apply the Capacity

Expansion Lemma, and we get that the makespan ˜

Mof ˜

Sis

not larger than 2M∗. Rescaling the augmented solution ˜

Sto

obtain a feasible solution S, we get an additional factor of

√2.

We now explain the scaling of the capacities during the

course of the algorithm, see also Algorithm 3for a more

formal description. The algorithm maintains a queue Qof

capacity release events

(t,i,c)∈R≥0×N×R≥0

meaning that at time ta capacity of cof node iis available

for uploads. At the start of the algorithm only the server is

available for uploads so Qhas only a single entry (0,0,c0).

Order the nodes non-increasingly by capacities such that

c1≥c2≥···≥cnand consider a point in time t, where a

node ibecomes available with its full upload capacity ciand

suppose that nodes 1,...,k−1 have already received a frac-

tion of the file. We now determine a number k≥ksuch that

nodes k,...,kstart receiving the file from node i. The num-

ber kis chosen such that either nodes k,...,kreceive the file

at their full capacity and the upload factor of node iis violated

by a factor β∈[1,√2)or node isends the file at full capacity

and the download capacity of all nodes k,...,kis violated

by a factor α∈[1,√2). The next lemma shows that such k

always exists (provided that there are enough nodes left).

Lemma 7 Let c1≥c2≥ ··· ≥ cnand c ≥c1

√2.If

n

j=1cj>c, then there is k ≤n such that

√2≤



j=1

cj≤√2c.

Proof If c

√2≤c1, there is nothing left to show. Otherwise,

let

k=max $∈{1,...,n}:





j=1

cj<c

√2%.

Since c1<c

√2and n

j=1cj>c,wehavek∈{1,...,n−

1}.Wesetk=k+1. By definition, k

j=1cj≥c

√2.In

addition, we have k

j=1cj≤2k

j=1cj<√2c.

During the course of the algorithm, nodes are served in

order of non-increasing capacity. Since the capacities of the

uploaders are inflated by a factor of at most √2, the released

capacity csatisfies c≥ck

√2, where kis the smallest index

of a node not served yet. Hence, we can apply Lemma 7for

the nodes ck≥ck+1≥···≥cnand derive the existence of

k≥ksuch that

√2≤

k



j=k

cj≤√2c.

The capacities of nodes k,...,kare increased by a factor of

α=max 1,ci

k

j=kcj(

123

JournalofScheduling

and that of node iby a factor of

β=

k



j=k

αcj

see lines 13 and 15 in Algorithm 3.

For the instance with such augmented capacities, the

schedule is now constructed as follows. At time t, nodes

j∈{k,...,k}start downloading from node iwith their full

(possibly augmented) capacity of αcj(line 20). Node j∈

{k,...,k}thus finishes the download at time ˜

Cj=t+1

αcj,

see line 18. At that time, both the new upload capacity of

node jas well as the fraction of the upload capacity of node i

that was devoted to uploading the file to node jbecome

available for further uploads. To avoid double augmentation

of capacities during the course of the algorithm, we have to

rescale both to their original level. This is done in line 21.

Here both the event that the (non-augmented) capacity cjof

node jis available and that a fraction αcj

βof the capacity of

node iis available for further uploads is added to the event

queue. Note that if ck>ck+1at the time ˜

Ckonly some frac-

tion of the upload capacity of node ibecomes available since

nodes k+1,...,kare still downloading from node i.The

algorithm reassigns this fraction of the upload capacity at

time ˜

Ckwithout waiting for nodes k+1,...,kto avoid idle

times.

The algorithm proceeds by taking the next capacity release

event out of the event queue. The process stops as soon as the

total capacity of the remaining nodes is less than the currently

available upload capacity.

From this point on, all remaining nodes are served simul-

taneously at full download rate. For the formal details of the

Scale- Fit procedure, see Algorithm 3.

The choice of the augmentation factors and the rescaling

procedure ensure the invariant that the augmented capacities

do not exceed the original capacities by more than a factor

of √2.

Lemma 8 Scale- Fit computes a √2-augmented solution ˜

Proof When new downloads from a node iare assigned, the

download capacities are increased by the factor

α=max 1,c

k

j=kcj(,

see line 13. By the choice of kin line 12, α≤√2 so that no

download capacity is increased by a factor larger than √2.

We prove that also the upload capacities are never exceeded

by a factor larger than √2 by two inductive claims. First, we

claim that after a node istarted the download of the file at

time ˜

Ai, we have that the capacities of all further capacity

Algorithm 3: Scale- Fit

Input: instance I=(N,c)with c0=maxi∈Nci

Output:2

√2-approximate solution Sof I

1sort nodes non-increasingly with respect to their capacities, i.e.,

c1≥c2≥···≥cn;

// server has unused upload capacity c0at

time 0

// add this capacity release to event queue

2Q:= (0,0,c0);

// node with largest capacity not served

yet

3k:= 1;

// largest augmentation factor so far

4b:= 1;

5while k≤ndo

// take next event from queue

6(t,i,c):= arg min(t,i,c)∈Qt;

7Q:= Q\{(t,i,c)};

8if n

j=kcj≤√2cthen

// iserves all remaining nodes

9k:= n;

// augmentation factor of downloader

10 α:= 1;

11 else

12 choose ksuch that c

√2≤k

j=kcj≤√2c;

// augmentation factor of downloader

13 α:= max1,c)k

j=kcj;

14 end

// augmentation factor of uploader

15 β:= k

j=kαcj

// update largest augmentation factor

16 b:= max{b,α,β};

// time when jstarts downloading

17 ˜

Aj:= tfor all j∈{k,...,k};

// time when jfinishes downloading

18 ˜

Cj:= t+1

αcjfor all j∈{k,...,k};

// parent of nodes k,...,kis j

19 p(j):= ifor all j∈{k,...,k};

// sending rate from ito j

20 ˜si,j(τ) := αcjfor all j∈{k,...,k},τ ∈[˜

Aj,˜

Cj);

// add rescaled capacity release to queue

21 Q:= Q∪*k

j=k(˜

Cj,j,cj)∪(˜

Cj,i,αcj

β);

22 k:= k+1;

23 end

24 forall the i∈N\{0}do

// Rescale ˜

25 Ai:= ˜

Aib;

26 Ci:= ˜

Cib;

27 sp(i),i(t):= ˜sp(i),i(˜

Ai)

bfor all t∈[Ai,Ci);

28 end

release events for isum up to ci(as long as there are still

unserved nodes), i.e.,



(t,i,c)∈Q:i=i

c=ci.

123

JournalofScheduling

This is obviously true for all times t∈[

Ai,˜

Ci)when

the only event in Qis (˜

Ci,i,ci). Next suppose that the

claim is correct for all times until a capacity release event

(t,i,c)for iis taken from the queue Q. Then, the capacity

of cis used to serve nodes k,...,kwhere the down-

load capacity of the downloaders is increased by the factor

α=max1,c)k

j=kcjand the upload capacity of iis

increased by the factor β=k

j=kαcj

c. Finally, the capacity

release events *k

j=k(˜

Cj,i,αcj

β)are added for i.Wehave

k



j=k

αcj

β=

k



j=k

k

=k

c

c=c,

by the choice of β. This proves the claim.

Next, we observe that after node ifinished the download of

the file at time ˜

Ciand as long as there are still unserved nodes,

every capacity release event (t,i,αcj

β)for icorresponds to

a an upload of node ito some other node jat rate αcjand

vice versa.

To sum up, for each node i,ateverytimet≥˜

Cjall active

uploads at rates αjcjfor some α∈[1,√2]correspond to

a capacity release event (t,i,αj

βj)for some βj∈[1,√2]

and the sum of the capacities of the release events is ci.We

thus obtain



j:si,j(t)>0

αjcj=

(t,i,αj

βj)∈Q

βjαj

βj≤√2c,

so that the upload capacity of each node iis also never

exceeded by a factor larger than √2. 

Before we proceed to formally analyze the performance

of Scale- Fit, let us illustrate the algorithm by an example:

Example 9 Consider an instance with 6 nodes and capacities

c0=5, c1=3, c2=3, c3=5/2, c4=2, and c5=2.

We first describe how to construct the augmented solution

S. At time 0, the upload capacity of node 0 is augmented by

a factor of 6/5 in order to serve nodes 1 and 2 at their full

download capacities. These downloads are completed at time

1/3. At that time, the rescaled upload capacities of the parent

of nodes 1 and 2 (i.e., node 0) can be used separately to serve

further nodes. The 5/2 units of capacity (previously assigned

to node 1) are now assigned to node 3, which downloads

with full capacity without any augmentation. The remaining

5/2 units of capacity are assigned to node 4, whose down-

load capacity is augmented by a factor of 5/4. At the same

time, node 1 starts serving node 5 without any augmentation

because node 5 is the only remaining node. The highest aug-

mentation factor is 5/4. Thus, we rescale all rates by 4/5 and

obtain the feasible solution S. See Fig. 2for an illustration.

time

node 1

node 2

node 3

node 4

node 5

(a)Augmented solution ˜

time

node 1

node 2

node 3

node 4

node 5

(b) Feasible solution S

Fig. 2 aAugmented solution ˜

Sand bfeasible solution Sof Example 9

We now turn to the approximation guarantee of our algo-

rithm. We first need the following lemma that provides a

lower bound on the optimal makespan.

Lemma 10 Let I =(N,c)be an instance of the minimum

time broadcasting problem with c0≥c1≥c2≥···≥cnfor

all i =1,...,n, and let ˜

S be a possibly augmented solution

with the following properties:

1. Every node i ∈N\{0}receives the file during some

interval [˜

Ai,˜

Ci]at constant rate ˜ci≥ci;

2. ˜

Ai≤˜

Ajfor all i,j∈N\{0}with i ≤j.

Then,

t0=min$t≥0:there is i ∈N with ˜

Ci≤t

and 

j∈N˜si,j(t)<ci%≤M∗,

where M∗is the optimal makespan for I.

Proof Let us first consider the case t0<˜

M, where ˜

Mis the

makespan of the augmented solution. For a contradiction,

suppose M∗<t0and fix an optimal solution S∗.Wehave

123

JournalofScheduling

X(t0)<X∗(t0)=n.Letk0be the node with smallest index

that has not finished the download in the augmented solution

Sat time t0, that is,

k0=min{i∈N:˜

Ci>t0}.

Such a node exists, since we assume t0<˜

If k0=1, then we obtain the inequality t0<1/c1, which

is a lower bound on M∗, and there is nothing left to show. So

we assume k0>1 and consider the point in time t1=t0−

1/ck0. For a contradiction, let us assume that ˜

X(t1)≥X∗(t1).

Since ˜xk0(t0)<1 and in ˜

S, node k0receives the file in some

time interval [˜

Ai,˜

Ci]at rate ˜ci≥ci,wehave ˜xk0(t1)=0,

and since the nodes start downloading in the order of their

indices, we also have ˜xi(t1)=0 for all i≥k0. Every node

i∈Nwith ˜xi(t1)>0 receives data with download rate

˜ci≥ciuntil time ˜

Ci.Using˜zi(t1,t0)=0, we obtain

˜ui(t1,t0)≥1

ck0−1−˜xi(t1)

˜cici

=ci

ck0+ci

˜ci˜x(t1)−1

≥ci

ck0+˜xi(t1)−1

for all i≤k0with ˜xi(t1)>0, and ˜ui(t1,t0)≥0 for all other

nodes. Referring to Lemma 5(1.), we calculate

X∗(t0)−X∗(t1)≤

i∈N

max0,ci

ck0−1+x∗

i(t1)

=

i∈N

x∗

i(t1)>1−ci

ck0

ci

ck0−1+x∗

i(t1)

≤

i<k0

x∗(t1)>1−ci

ck0

ci

ck0−1+X∗(t1),

whereweuseci/ck0≤1 for all i≥k0. We get

X∗(t0)−X∗(t1)≤

i<k0

(ci/ck0−1)+˜

X(t1)

≤˜

X(t0)−˜

X(t1),

a contradiction. We conclude X∗(t1)> ˜

X(t1). Applying the

same line of argumentation for t1instead of t0, we derive

the existence of t2=t1−1/ck1for some k1∈Nwith

X∗(t2)> ˜

X(t1). We can iterate this argument until we reach

t, with the property that X∗(t)> ˜

X(t)and k=min{i∈

N:˜

Ci>t}=1. This is a contradiction since in the time

interval [0,t]only the server can upload and its upload rate

in ˜

Sis not smaller than that in S∗. We conclude that our

initial assumption that t0>M∗was wrong, finishing the

proof for the case t0<˜

M.Ift0=˜

M, we can use the same

line of argumentation for t0−instead of t0, where >0is

arbitrary. Thus we obtain t0≤M∗also for that case. 

We are now ready to prove the approximation guarantee

of Scale- Fit.

Theorem 11 For the minimum time broadcasting problem

with heterogeneous symmetric capacities and an indivisible

file, the following holds:

1. If c0=maxi∈Nci, then Scale- Fit is a 2√2-approxi-

mation.

2. If c0<maxi∈Nci, then uploading the file to another node

and applying Scale- Fit is a (1+2√2)-approximation.

The running time of Scale- Fit is O(nlog n).

Proof We first show 1.By construction, node ndetermines

the makespan and starts its download not after t0. Hence,

the makespan of the √2-augmented solution ˜

Sis not larger

than t0+1/cn≤2M∗, where we use Lemma 10 and the

fact that 1/cnis a lower bound on the optimal makespan.

For rational input, the largest factor, with which a capacity

constraint is violated, is strictly smaller than √2. Dividing

all sending rates by that factor, we obtain a feasible solution

with makespan smaller than 2√2M∗.

To see 2., note that the server needs 1/c0≤M∗time units

to transfer the file to node 1. We then treat the instance as an

instance where node 1 is the server, i.e., we do not let node 0

upload the file to any other node except node 1. Using the

same arguments as in the proof of Lemma 10, we derive that

this takes at most 2√2M∗additional time units, implying the

claimed approximation factor.

The complexity of the algorithm is as follows: In a first

step, the nodes are sorted with respect to their capacities

(line 1 of Algorithm 3), which takes time O(nlog n).The

most expensive operations within the loop (lines 5–23) are

adding elements to Q(line 21) and retrieving the minimal

element from Q(line 6). In total, there are O(n)of these

operations, and if Qis a priority queue, all operations can

be executed in time O(log n). Therefore the overall running

time of the algorithm is O(nlog n).

4 Broadcasting multiple packets

We now turn to the more general case that the file is divided

into m∈N≥1packets. The file is still of unit size; there-

fore, the download of one packet at a rate of ctakes 1

time units. We present two efficient algorithms. The first one,

termed Spread- Exchange, uses a variant of the Scale-

Fit algorithm, developed in the last section, as a subroutine

123

JournalofScheduling

and yields a 5-approximation. The second algorithm, termed

Spread- Mirror- Cycle, has an approximation guarantee

of 2 +2log2n/m

m, which is super-constant in general, but

better than 5 if the number of packets is large compared to

the number of nodes.

4.1 The SPREAD-EXCHANGE algorithm

The Spread- Exchange algorithm works in two phases. In

the first phase, called Spread, a modification of the Scale-

Fit algorithm is used to provide each node with a single

packet. For better reference, we denote the modified version

of the algorithm by Scale- Fit∗. In the second phase, called

Exchange, the packets are exchanged between the nodes in

a pairwise manner until all nodes own all packets. The second

phase can only work efficiently, if the packets are distributed

in a suitable pattern among the agents.

For ease of exposition, we will first make some simplifying

assumptions on the instance and explain the Spread-

Exchange algorithm for such simple instances. In the end of

this section, we will relax these assumptions. First, we require

that the server has the highest capacity and that nodes are

ordered by their capacities. Second, we require that the capac-

ities are powers of two. Third, we assume that i∈N

c0=2

for some integer . Intuitively, this condition ensures that the

distribution tree is complete and has height . Finally, we

assume that m≤, that is, there are not more packets than

levels in the distribution tree. An instance that satisfies these

properties will be called simple as summarized in the follow-

ing definition.

Definition 12 An instance I=(N,c,m)is called simple, if

it satisfies the following properties

1. c0≥c1≥c2≥···≥cn,

2. ci=2riwith ri∈Nfor all i∈N,

3. n

i=0

c0=2for some ∈Nwith ≥m.

4.1.1 The SCALE-FIT∗-algorithm

Our algorithm uses a variant of the Scale- Fit-Algorithm to

send exactly one packet to each node. Compared to the orig-

inal Scale- Fit-Algorithm, we introduce two modifications.

The modified algorithm still maintains a queue Qof capacity

release events but the events now are quadruples

(t,i,c,q)∈R≥0×N×R≥0×[m],

indicating that at time tnode ihas an upload capacity of c

and will use this capacity to upload packet q.

Second, we observe that for a simple instance, where all

capacities are powers of two, the Scale- Fit∗algorithm in

line 10 can choose ksuch that

k



j=k

cj=c,

that is, no further scaling of the upload or download capacities

is needed. This implies in particular that after removing the

entry (t,i,c,q)from the capacity release event queue, the

set of entries

k

j=kt+1

mcj

,i,cj,q,t+1

mcj

,j,cj,q

is added to the queue. The first event (t+1

mcj,i,cj,q)

means that for the uploading node ithe capacity of cjwill

be available for further uploads at time t+1

mcj. The packet

qto be send in these uploads will be specified in Sect. 4.1.2.

The second event (t+1

mcj,j,cj,q)means that node jwill

have finished the download of the file at time t+1

mcjand thus

its full capacity of cjwill be available for further uploads.

Since each node receives a single packet only, node jwith

j∈{k,...,k}can send only this packet in further uploads.

Further note that the download times are shorter by a factor

of 1/msince only a part of size 1/mis distributed by the

algorithm.

4.1.2 The distribution tree

When introducing new capacity release events in the event

queue Q, we have to decide which packet is scheduled

for upload. This is trivial for non-server nodes who only

receive a single packet during the course of the Scale- Fit∗-

Algorithm. To explain how the server decides which packet

is uploaded, we introduce the notion of a distribution tree T

which is build incrementally from the events introduced into

Q. To this end, let Qdenote the set of all capacity release

events (t,i,c,q)added to Qduring the course of the algo-

rithm. We build a tree structure Twith vertex set V(T)=Q

as follows. The root of Tis the tree node (0,0,c0,1)cor-

responding to the event initially placed in Q. Whenever a

capacity release event (t,i,c,q)is removed from Q,the

algorithm afterward adds the set of events

k

j=k$t+1

mcj

,i,cj,q,t+1

mcj

,j,cj,q% (6)

to Q.InT, we add an edge from the tree node (t,i,c,q)to

each of the tree nodes (t+1

mcj,i,cj,q)with j=k,...,k

and an edge from (t,i,c,q)to each of the tree nodes (t+

mcj,j,cj,q)with j=k,...,k. We call tree nodes of the

form (t+1

mcj,j,cj,q)proper tree nodes and denote the set

123

JournalofScheduling

Algorithm 4: Scale- Fit∗

Input: simple instance I=(N,c,m)

Output: partial solution where each node i∈N\{0}owns one

packet

// server has unused upload capacity c0at

time 0

// server will send packet 1

// add this capacity release to event queue

1Q:= (0,0,c0,1);

// node with largest capacity not served

yet

2k:= 1;

3while k≤ndo

// take next event from queue

4(t,i,c,q):= arg min(t,i,c,q)∈Qt;

5Q:= Q\{(t,i,c,q)};

// assign downloaders

6if n

j=kcj<cthen

7k:= n;

8else

9choose ksuch that c=k

j=kcj;

10 end

// sending rate from ito j

11 s(q)

i,j(τ) := cjfor all j∈{k,...,k},τ ∈[t,t+1

mcj);

// add new capacity release events to

queue

12 Q:= Q∪*k

j=k(t+1

mcj,j,cj,k);

13 if i=0then

14 Q:= Q∪*k

j=k(t+1

mcj,0,cj,min{q+1,m});

15 else

16 Q:= Q∪*k

j=k(t+1

mcj,0,cj,q);

17 end

18 k:= k+1;

19 end

of proper tree nodes by V∗(T). Note that there is a natural

bijection between the set of non-server nodes i∈N\{0}

and the set of proper tree nodes v∈V∗(T). Specifically,

for each non-server node i∈N\{0}, there is a proper tree

node (Ci,i,ci,qi), where qiis the packet send to node i.

Further, there is also a corresponding non-proper tree node

(Ci,p(qi)

i,ci,q)with q∈[m], where p(qi)

iis the unique

node that sends packet qito node iduring the course of the

algorithm.

We proceed to provide useful properties of the distri-

bution tree T. For a tree node v,letlevel(v) denote the

number of edges from vto the root. Further, for a tree

node v=(t,i,c,q), we introduce the notations t(v) =t,

i(v) =i,c(v) =cand q(v) =qas a shorthand for the time,

node, capacity and packet associated with the corresponding

capacity release event. The following lemma establishes that

nodes at higher levels have no larger capacities and higher

completion times.

Lemma 13 Let v, w ∈V(T)with level(v) < level(w). Then,

c(v) ≥c(w) and t(v) < t(w).

Proof We show by induction over that for all tree nodes

vwith level(v) =and all tree nodes wwith level(w) >

level(v), the claimed properties hold. The only tree node with

level(v) =0 is node (0,0,c0,1). Since t(0,0,c0,1)=0

and, by our preprocessing, c0≥cifor all i∈N,theresult

follows for all tree nodes with level 0.

Assume the statement is true for all tree nodes v∈V(T)

with some fixed level level(v) =and all nodes w∈V(T)

with level(w) > level(v). Consider a node v∈V(T)with

level(v) =+1 and a node w∈V(T)with level(w) >

level(v).Letvand wbe the parent nodes of vand win T.

By the induction hypothesis, we have t(v)<t(w), i.e., the

capacity release event vhappens before the event w. Since

the algorithm considers the nodes in order of non-increasing

capacity, this implies c(v) ≥c(w). Since all nodes all served

with full capacity, we also have t(v) =t(v)+1

mcv<t(w)+

mcw.

Lemma 13 implies that all capacity release events of a

level happen after all capacity release events of a former

level. This implies that the tree Tis complete in the sense

that the capacity of a tree node is equal to the sum of all

capacities of its children. Since n

i=0

c0=2with ∈N

also the last level is complete.

Next, we note that that each capacity release event

(t,i,ci,q)indicates that node ifinishes the download of

packet qat time [m]. Each such capacity event is mirrored by

another event of the form (t,p(q)

i,ci,q)indicating that the

upload capacity of the node p(q)

ifrom which node ireceived

packet qis free again and ready for further uploads. Specif-

ically, q=min{q+1,m}if p(q)

iis the server, and q=q

otherwise. As a consequence, half of the capacity of the tree

nodes in each level corresponds to available upload capacity

of nodes.

We proceed to show that the joint capacity of all nodes

that own a particular packet after running the Scale- Fit∗-

Algorithm is distributed very regularly.

Lemma 14 Let I =(N,c,m)be a simple instance and let

¯c=i∈Nci, and for k ∈[m]let

Nk=i∈N\{0}:i owns packet k

and ¯ck=i∈Nkci. Then

¯ck=¯c/2kfor k =1,...,m−1,

¯c/2m−1−c0for k =m.

Proof Let

V∗

1={v∈V∗(T):level(v) =1}

123

JournalofScheduling

be the set of proper tree nodes in level 1 and let

V1={v∈V(T)\V∗(T):level(v) =1}

be the set of non-proper tree nodes in level 1. Since the total

capacity of all nodes equals the capacity of the server plus

the total capacity of all proper tree nodes, we have

¯c=c0+

r∈V∗

1

v∈V∗(T(r))

c(v) +

r∈˜

V1

v∈V∗(T(r))

c(v).

The nodes v∈V∗(T(r)) with r∈V∗

1are exactly those that

receive packet 1, thus,

¯c=c0+¯c1+

r∈˜

V1

v∈V∗(T(r))

c(v).

Since the trees are complete, we have



r∈˜

V1

v∈V∗(T(r))

c(v) =

r∈V∗

1

v∈V∗(T(r))

c(v) −

r∈˜

c(r)

=

r∈V∗

1

v∈V∗(T(r))

c(v) −c0

We then obtain ¯c=2¯c1showing the claim for k=1.

To show the result for 1 <k<m, we apply this argument

recursively in the subtrees of level k.LetV∗

kdenote the set of

proper nodes of level kthat are served by the server and let

Vkdenote the set of non-proper tree nodes in level k. Then

¯ck=r∈V∗

kvV∗(T(r)) c(v). By induction,

¯c=c0+

k−1



i=1¯ci

+

r∈V∗

k

v∈V∗(T(r))

c(v) +

r∈˜

Vk

v∈V∗(T(r))

c(v).

With the same reasoning as before, we obtain the equality

r∈˜

Vkv∈V∗(T(r)) c(v) =r∈V∗

kv∈V∗(T(r)) c(v) −c0,

thus,

¯c=

k−1



i=1

¯c

2i+2

r∈V∗

k

v∈V∗(T(r))

c(v)

=1−2−(k−1)¯c+2¯ck,

hence ¯ck=¯c/2kshowing the results for 1 <k<m.

For k=m,wehave

¯c=c0+

m−1



i=1¯ci+¯cm

=c0+(1−2−(m−1))¯c+¯cm

and thus

¯c=1−2−(m−1)¯c+c0+¯cm,

or equivalently ¯cm=¯c/2m−1−c0, as claimed. 

4.1.3 The EXCHANGE phase

For the Exchange part of the algorithm, we treat the server

as a node that owns only packet m. We carry out m−1 rounds

of exchange. In round k, the total capacity of nodes that own

packets j∈{k,...,m}is doubled. Lemma 14 then implies

that after round k, all nodes own packet k, and this packet

can be ignored in all further rounds. Moreover, after round

m−1, all nodes own all packets, and the file is completely

distributed.

In round k, we assign the nodes that own packet kto nodes

that own other packets k>kwith a straightforward greedy

algorithm, see Algorithm 6for details. As long as the largest

capacity ciof a node ithat owns packet kis larger than the

capacity of at least one of the other nodes, we match this

node ito a subset ˜

Nof the other nodes, where ˜

Nis chosen

in such a way that the capacities in this set sum up to ci.This

is always possible since all capacities are powers of two. As

soon as the maximum capacity of an unmatched node that

owns packet kis smaller than the minimum capacity of the

unmatched nodes without packet k, we proceed the other way

round—a single node ithat does not own packet kis matched

to a set of nodes that do own packet k, where the capacities in

this set sum up to ci. Using Lemma 14, it is straightforward

to show that all nodes will get matched this way.

In round k, after all nodes are matched that way, every

node has a unique packet j∈{k,...,m}and sends this

packet jat a rate of cmin to its matched node(s).

4.1.4 The SPREAD-EXCHANGE ALGORITHM

A detailed description of the whole procedure Spread-

Exchange can be found in Algorithm 5. We first show that

it yields a 2-approximation algorithm for simple instances.

Theorem 15 For the minimum time broadcasting problem

with heterogeneous symmetric capacities and a file dis-

tributed in m packets, Spread- Exchange computes a

2-approximation for simple instances.

Proof Let I=(N,c,m)be a simple instance and M∗(I)be

its optimal makespan. Let I1/m=(N,c,1)be the instance

with the same set of nodes and capacities where a single file

of size 1/mhas to be send to all nodes. Since in Iall m

packets of size 1/mhave to be send to all nodes, we have

M∗(I1/m)≤M∗(I). Consider the Spread phase when the

123

JournalofScheduling

Algorithm 5: Spread- Exchange algorithm

Input: simple instance I=(N,c,m)

Output: 2-approximate solution S

1run Scale- Fit∗;// spread

2t1:= makespan of Scale- Fit∗;

3for k=1,...,m−1do // exchange

4Nk:= i∈N\{0}:iowns packet k};

5N−k:= N\Nk;

6Greedy assignment for Nkand N−k;

7forall the i∈Nkdo

8forall the j assigned to i do

9s(k)

i,j(t):= cmin for all t∈t1,t1+1

mcn;

10 let k>kbe a packet that is owned by j;

11 s(k)

j,i(t):= cmin for all t∈t1,t1+1

mcn;

12 end

13 end

14 t1:= t1+1

mcn;

15 end

Algorithm 6: Greedy assignment

Input:setsA={a1,...,ar},B={b1,...,bs}such that

a∈Aa=b∈Bb,

ai=2ri,bi=2siwith ri,si∈Nfor all i∈N

Output: assignment (ai,Bi)i∈I,(Aj,bj)j∈Jsuch that

A=˙

*i∈Iai˙

∪˙

*j∈JAj,

B=˙

*i∈IBi˙

∪˙

*j∈Jbj,

ai=b∈Bibfor all i∈I,

bj=a∈Ajafor all j∈J

1M:= ∅ ;

2while A=∅do

3s:= arg maxt∈A∪Bt;

4if s∈Athen

5choose B⊆Bwith b∈Bb=s;

6M:= M∪(s,B),A:= A\{s},B:= B\B;

7else

8assign sto a set A⊆Awith a∈Aa=s;

9M:= M∪(A,s),B:= B\{s},A:= A\A;

10 end

11 end

12 return M;

Spread-Exchange algorithm is run for Iand let

t0=min$t≥0:there is i∈Nwith xi≥1

and 

j∈N

k∈[m]

s(k)

i,j(t)<ci%.(7)

be the first point in time during the Spread phase when

available upload capacity remains idle. During the Spread

phase, no node receives more than one packet, so we may

identify the whole Spread phase with a solution for instance

I1/m. Using Lemma 10, we then obtain that t0≤M∗(I1/m)

implying that t0≤M∗(I). Since in the Spread phase no

node starts the download of a packet strictly after t0,wehave

t1−t0≤1

mcnwhere t1is the makespan of the Spread phase.

After t1,theSpread- Exchange algorithm carries out

m−1 rounds of mutual exchanges of packets, each of which

needs 1

mcntime units. As a consequence, the makespan of the

solution computed by Spread-Exchange is bounded from

above by

M∗(I)+1

mcn+m−1

mcn≤2M∗(I),

where we used that M∗(I)≥1

cnsince the optimal solution

has to send the whole file to node n.

We proceed to show that a slight modification of the

Spread- Exchange algorithm yields a 5-approximation for

arbitrary, not necessarily simple, instances.

Theorem 16 For the minimum time broadcasting problem

with heterogenous symmetric capacities and a file distributed

in m packets, the following holds:

1. If c0=maxi∈Nci, there is a 4-approximation algorithm.

2. If c0<maxi∈Nci, there is a 5-approximation algorithm.

In both cases, the running time is O(mn log n).

Proof Let M∗(I)denote the optimal makespan of the original

instance and let cmax =maxi∈Nci.Ifc0<cmax, we spend

1/c0time units to send the whole file from the server node 0

to a node i∗with maximal capacity, remove node 0 from the

instance and treat i∗as the new server. Using that 1/c0≤

M∗(I), it suffices to show 1.

To this end, let the nodes be numbered such that c0≥c1≥

c2≥ ··· ≥ cn. Consider the instance Iwhere all nodes’

capacities are rounded down to the next power of two, i.e.,

I=(N,(ci)i∈N,m)where ci=2log2ci. By appropriate

scaling of the instance, it is without loss of generality to

assume that cn=cn.Let∈Nand n∈{0,...,n}be such

that

=+log2,n



i=0

c0-. and

n



i=0

c0=2.

For the subset of nodes N={0,...,n}, consider the

instance with rounded capacities I=(N,(ci)i∈N,m).Let

p,r∈Nbe such that m=p+rwith r<.

The general structure of the algorithm is follows. We con-

duct prounds of Spread- Exchange for Iin each of which

we distribute packets to the nodes in N. Then, we conduct

a final round of Spread- Exchange in which we distribute

r<packets, so that after the p+1 rounds all mpackets

are distributed to the nodes in N. Finally, we send the whole

file from nodes in Nto nodes in N\N. For the analysis of

this algorithm, we distinguish the cases p=0 and p>0.

123

JournalofScheduling

First case p =0. Consider the run of Spread-Exchange

on the instance I=(N,(ci)i∈N,m)with a subset of nodes

and rounded capacities. As in the proof of Theorem 15, con-

sider the time t0during the single Spread phase defined in

(7) when one of the nodes starts to become idle. We claim

that t0≤2M∗(I).

To see this, consider again the relaxation I1/mof Iwhere

only a single file of size 1/mhas to be distributed. We

have M∗(I1/m)≤M∗(I). We consider the solution for the

Spread phase for I, double all upload and download rates

and only send packet 1. By construction, this yields a 2-

augmented solution for I

1/m. By adding arbitrary (feasible)

downloads for the nodes in N\N, we may extend this solu-

tion into a solution for I1/m. Note that by doubling all upload

and download rates, all completion times are halved, but oth-

erwise, no nodes become idle earlier than in the original

solution.

By Lemma 10,t0≤2M∗(I1/m)and, thus, t0≤2M∗(I).

Since in the Spread phase no node starts sending a packet

strictly after t0,theSpread phase ends not later than M∗(I)+

cn.IntheExchange phase, the algorithm performs (m−1)

rounds of exchange of length 1

mcn. Overall, we see that the

time needed to run the Spread- Exchange algorithm is at

most

2M∗(I)+1

mcn+m−1

mcn=2M∗(I)+1

cn

≤2M∗(I)+1

=2M∗(I)+1

cn≤3M∗(I).

After the Spread- Exchange algorithm, we perform a

greedy matching between the nodes Nthat received the file

and the nodes N\Nthat did not receive any portion of

the file yet. Since i∈Nci≥i∈N\Nciand all capacities

are powers of two, there is a greedy assignment in which

all nodes i∈N\Nreceive the file at their full (rounded

down) capacity ci. The assignment can be computed with

Algorithm 6. This additional step takes additional

cn=1

cn≤M∗(I)

time units. We need at most 3M∗(I)time units for the

Spread- Exchange algorithm and M∗(I)time units for

sending the file to the nodes in N\N, so that the we obtain

a 4-approximation for the case p=0.

Second case p >0. For the instance Iwith a subset of

nodes and rounded capacities, the algorithm performs p>0

rounds of Spread- Exchange distributing packets each

followed by one round distributing rpackets. Consider one of

the such round during which we distribute s∈{, r}packets.

In the Spread part, the maximum number of subsequent

uploads of the server is and each upload takes at most

mcn≤1

mcn=1

mcn

time units. Further, no download starts strictly after the server

has finished its last upload. Thus, the Spread phase of this

round takes at most +1

mcntime units. In the Exchange phase,

we perform s−1 rounds of exchange requiring further s−1

mcn

time units. In total, all p+1 rounds of Spread- Exchange

need

(p+1)+1

mcn+p−1

mcn+r−1

mcn

=(p+1)

mcn+p

mcn+r

mcn

=(p+1)

mcn+1

≤3

≤3M∗(I)

time units.

As in the first case, we spend additional 1

cn≤M∗(I)time

units to send the file from the nodes in Nto the nodes in

N\Nyielding a 4-approximation.

We now turn to the running time of the algorithm. For

the Spread phase, we need to run Scale- Fit∗which has a

complexity of O(nlog n). Note that even when Scale- Fit∗

is run multiple times for different sets of packets, it actually

suffices to run the algorithm only one time and adapt the

solutions by changing the packets that are disseminated. For

the Exchange phase, we conduct m−1 rounds of exchanges

for each of which we need to compute an assignment. The

greedy algorithm has a complexity of O(nlog n). Sending the

file to the nodes in N\Nrequires the solution of a single

greedy assignment problem, so that we obtain the overall

complexity of O(mn log n).

We note that in a situation where in the beginning the

packets are spread all over the nodes in the network, we still

obtain a 5-approximation. In the beginning, all packets are

sent to a node with largest capacity, which then acts as a

server. This sending does not take longer than M∗(I).

We conclude this section with an observation, relating our

result to that of Mundinger et al. (2008).

Remark 17 If the number of nodes is a power of two and

capacities are homogeneous, i.e., |N|=2for some ∈N

and ci=1 for all i∈N, the solution produced by Spread-

Exchange is exactly the same as the one produced by the

algorithm presented in Section 3 of Mundinger et al. (2008).

(Note that we do not have to scale capacities in this case and

123

JournalofScheduling

thus also do not scale the sending rates in the end.) This also

implies that in this special case,Spread- Exchange yields

an optimal solution.

4.2 An algorithm for many packets

In this section, we devise a simple alternative algorithm,

which we call Spread- Mirror- Cycle, that has a better

performance guarantee than Spread- Exchange when the

number of packets is large compared to the number of nodes.

The new algorithm proceeds in three phases. In the first phase

(Spread), we choose a certain subset of nodes and send each

packet to exactly one node from this subset. In the second

phase (Mirror), we let these nodes mirror their packets to

the remaining nodes, in such a way that the distribution of

packets among the initial set of nodes is copied (possibly

several times), and at the end of phase two every node owns

at least one packet and each packet is owned by at least one

node. In the final phase of the algorithm (Cycle), we per-

form a rather simple circular exchange of packets between

nodes until all nodes possess the whole file. The main differ-

ence to the Spread- Exchange-algorithm, described in the

previous section, is that Spread- Exchange uses Scale-

Fit∗as a subroutine to provide each node with one packet.

We use a different method for this task, that aims for a large

variety of the distributed packets. This comes at the cost of

a worse performance guarantee for the first two phases, but

allows for more flexibility and hence a better performance

of the exchange part our algorithm (Cycle). When the num-

ber of packets is large compared to the number of nodes,

the exchange phase dominates the running time of the algo-

rithm, and Spread- Mirror- Cycle has a better worst-case

performance than Spread- Exchange.

We proceed to describe Spread- Mirror- Cycle in

detail. In contrast to Spread- Exchange, we do not round

the capacities to powers of 2, but work on the original

capacities. Let k=n/m. We partition the set of nodes

into kmutually disjoint sets N1,...,Nk. The partition is

chosen in such a way that the cardinalities differ by at

most one, and the sets are indexed such that the cardinal-

ities are non-decreasing. Formally, for j∈{1,...,k},let

nj=|Nj|denote the cardinality of the j-th set and let

Nj={ij

1,...,ij

nj}. Then we have |nj−nj|≤1 for all

j,j∈{1,...,k}, and nj≤njfor all j,j∈{1...,k}with

j≤j. Note that by the choice of k, each partition contains

more than m/2 nodes, but at most m, i.e., m/2<nj≤m

for all j.

In the Spread phase of our algorithm, node 0 sends one

packet to each of the nodes in N1as follows: at time 0,

node 0 starts sending packet 1 to node i1

1at maximum rate

min{c0,ci1

1}. After this transmission is completed, node 0

sends packet 2 to node i1

2at maximum rate min{c0,ci1

2}, and

so on. This process continues until node 0 has sent packet

n1to node i1

n1. Then, if m>n1, nodes i1

1,...,i1

m−n1one

after another receive a second packet from node 0 at full rate,

until each node i∈N1owns at least one packet and each

packet is owned by at least one node in N1. For each node

j∈{1,...,n1}, the transmission of a packet needs at most

mcmin time units.

In the Mirror phase, all nodes in N1mirror themselves to

a node in N2. More formally, we form a matching between

the nodes in N1and N2. Assume for the moment that the

sets are of equal size. Each node isends its packets to its

matched node iat the full rate min{ci,ci}. As each node

sends at most two packets, this process needs at most 2

mcmin

time units. We then continue iteratively mirroring the internal

node structure of N1to all other subsets of nodes. That is,

the nodes in N1send their packets to the nodes in N3.Atthe

same time, the nodes in N2send their packets to the nodes in

N4, and so on. After at most 2log2k

mcmin time units, the internal

structure of N1is mirrored to all other sets N2,...Nk.

When the cardinalities of the two sets are different,

i.e., nodes in set Njmirror themselves to the nodes in set

Njwith nj=nj+1, we slightly correct the mirroring

process to take care of that fact. As nj=nj−1≤m−1,

there is a node i∗∈Njthat owns two packets. We match

this node to two different nodes in Nj, and let it send one

packet to each of them. In that fashion, we ensure that after

the mirroring, all nodes in Njpossess at least one packet.

In the Cycle phase, we perform a simple circular

exchange in all sets Nj,j∈{1,...,k}. Recall that within

each set Nj, each node owns at least one packet and each

packet is present at exactly one node. The Cycle phase con-

sists of m−1 rounds. In each round, node ij

ksends one of its

packets to node ij

k+1, except for ij

nj, which sends to ij

1. Each

node sends each packet only once, in order of their reception.

After m−1 rounds of such exchange, all nodes in Njpossess

all packets. To see this, note that during the m−1 exchange

rounds the order of the packets sent along the cycle is pre-

served. Hence, each node receives all other packets before

it gets the packet it sent first. Thus, each node has all pack-

ets after m−1 rounds. Performing the exchange in all sets

N1,...,Nkin parallel, the final phase of our algorithm needs

m−1

mcmin time units.

Summing up, we get the following result:

Theorem 18 Spread- Mirror- Cycle yields a (2+2

log2n/m/m)-approximation in time O(nm).

Proof Adding up the time spent on the single phases, the total

makespan of the solution computed by Spread- Mirror-

Cycle is bounded by

cmin +2log2n/m

mcmin +m−1

mcmin

123

JournalofScheduling

≤2M∗+2log2n/m

mM∗,

where M∗is the optimal makespan. The partitioning of the

set can be done in time O(n), the sending rates for the Spread

phase are computed in time O(m). Then, there are O(k)=

O(n/m)Mirror steps, each of which can be computed in

time O(m). Finally, there is the Cycle phase, in which each

node sends m−1 packets. Computing the sending rates of

this phase thus takes time O(nm), dominating the overall

complexity of the algorithm. 

5 Conclusion

We studied the problem of distributing a file, divided into

mpackets, to all nodes of a communication network. In

contrast to the prevailing assumption in the broadcasting lit-

erature known as the telephone model (and variants thereof),

our model allows for flexible data transfers, that is, at any

point in time each node that possesses a certain packet may

send it with an arbitrary rate to other nodes that have not yet

received this packet. For this quite general model, we pro-

vided a detailed study of various settings. For the simplest

setting with homogeneous capacities and a single packet, we

presented an efficient exact algorithm. Already for a single

packet and heterogeneous capacities, the problem becomes

strongly NP-hard. We thus turned to approximation algo-

rithms and devised constant factor approximation algorithms

for heterogeneous capacities, both for single and multiple

packets. Some questions remain open, however, such as the

case of different upload and download capacities per node. It

would also be interesting whether the presented approxima-

tion factors can be improved, or whether inapproximability

bounds can be shown. Moreover, other objectives such as

the weighted sum of completion times seem interesting and

deserve further study.

Funding Open Access funding enabled and organized by Projekt

DEAL.

Open Access This article is licensed under a Creative Commons

Attribution 4.0 International License, which permits use, sharing, adap-

tation, distribution and reproduction in any medium or format, as

long as you give appropriate credit to the original author(s) and the

source, provide a link to the Creative Commons licence, and indi-

cate if changes were made. The images or other third party material

in this article are included in the article’s Creative Commons licence,

unless indicated otherwise in a credit line to the material. If material

is not included in the article’s Creative Commons licence and your

intended use is not permitted by statutory regulation or exceeds the

permitted use, you will need to obtain permission directly from the copy-

right holder. To view a copy of this licence, visit http://creativecomm

ons.org/licenses/by/4.0/.

References

Arkin, E., Bender, M., Fekete, S., Mitchell, J., & Skutella, M. (2006).

The freeze-tag problem: How to wake up a swarm of robots. Algo-

rithmica, 46(2), 193–221.

Arkin, E., Bender, M., Ge, D., Hejoseph, S., & Mitchell, S. (2003).

Improved approximation algorithms for the freeze-tag problem.

In Proceedings of the 15th annual ACM symposium on parallel

algorithms and architectures (SPAA) (pp. 295–303).

Armbrust, M., Fox, A., Griffith, R., Joseph, A., Katz, R., Konwinski,

A., et al. (2009). Above the clouds: A Berkeley view of cloud

computing. Tech. rep., University of California at Berkeley.

Bar-Noy, A., Guha, S., Naor, J., & Schieber, B. (2000). Message multi-

casting in heterogeneous networks. SIAM Journal on Computing,

30(2), 347–358.

Bar-Noy, A., Kipnis, S., & Schieber, B. (2000). Optimal multiple

message broadcasting in telephone-like communication systems.

Discrete Applied Mathematics, 100(1–2), 1–15.

Ezovski, G., Tang, A., & Andrew, L. (2009). Minimizing average finish

time in P2P networks. In Proceedings of the 28th IEEE interna-

tional conference on computer communications (INFOCOM) (pp.

594–602).

Fan, B., Lui, J. C. S., & Chiu, D. M. (2009). The design trade-offs of

bittorrent-like file sharing protocols. IEEE/ACM Transactions on

Networking, 17, 365–376.

Garey, M., & Johnson, D. (1979). Computers and intractability.New

York, NY, USA: W. H. Freeman.

Goetzmann, K. S., Harks, T., Klimm, M., & Miller, K. (2011). Opti-

mal file distribution in peer-to-peer networks. In Proceedings of

the 22nd international conference on algorithms and computation

(ISAAC) (pp. 210–219).

Hedetniemi, S. T., Hedetniemi, S. M., & Liestman, A. (1998). A sur-

vey of gossiping and broadcasting in communication networks.

Networks, 18, 129–134.

Khuller, S., & Kim, Y. A. (2007). Broadcasting in heterogeneous net-

works. Algorithmica, 48(1), 1–21.

Könemann, J., Levin, A., & Sinha, A. (2005). Approximating the

degree-bounded minimum diameter spanning tree problem. Algo-

rithmica, 41(2), 117–129.

Kumar, R., & Ross, K. (2006). Peer assisted file distribution: The min-

imum distribution time. In Proc. 1st IEEE workshop on hot topics

in web systems and technologies (pp. 1–11).

Kwon, O. H., & Chwa, K. Y. (1995). Multiple message broadcasting in

communication networks. Networks, 26(4), 253–261. https://doi.

org/10.1002/net.3230260409.

Mehyar, M., Gu, W., Low, S., Effros, M., & Ho, T. (2007). Optimal

strategies for efficient peer-to-peer file sharing. In Proceedings of

the IEEE international conference on acoustics, speech and signal

processing (ICASSP) (Vol. 4, pp. 1337–1340).

Middendorf, M. (1993). Minimum broadcast time is NP-complete for

3-regular planar graphs and deadline 2. Information Processing

Letters, 46(6), 281–287.

Miller, K., & Wolisz, A. (2011). Transport optimization in peer-to-peer

networks. In Proceedings of the 19th international conference on

parallel, distributed and network-based processing (PDP), (pp.

567–573)

Mundinger, J., Weber, R., & Weiss, G. (2008). Optimal scheduling of

peer-to-peer file dissemination. Journal of Scheduling, 11, 105–

120. https://doi.org/10.1007/s10951-007-0017-9.

Qiu, D., Srikant, R.: Modeling and performance analysis of BitTorrent-

like peer-to-peer networks. In Proc. ACM conf. applications,

technologies, architectures, and protocols for comput. comm.

(SIGCOMM) (pp. 367–378).

123

JournalofScheduling

Ravi, R.: Rapid rumor ramification: Approximating the minimum

broadcast time. In Proceedings of the 35th annual IEEE sympo-

sium on foundations of computer science (pp. 202–213).

Publisher’s Note Springer Nature remains neutral with regard to juris-

dictional claims in published maps and institutional affiliations.

123