
Journal of Mathematical Biology (2021) 82:53

https://doi.org/10.1007/s00285-021-01596-0

Mathematical Biology

Separation of timescales for the seed bank diffusion and its

jump-diffusion limit

Jochen Blath¹ · Eugenio Buzzoni¹ · Adrián González Casanova² · Maite Wilke Berenguer³

Received: 6 July 2018 / Revised: 1 October 2020 / Accepted: 27 October 2020 / Published online: 28 April 2021

Abstract

We investigate scaling limits of the seed bank model when migration (to and from

the seed bank) is ‘slow’ compared to reproduction. This is motivated by models for

bacterial dormancy, where periods of dormancy can be orders of magnitude larger

than reproductive times. Speeding up time, we encounter a separation of timescales

phenomenon which leads to mathematically interesting observations, in particular

providing a prototypical example where the scaling limit of a continuous diffusion

will be a jump diffusion. For this situation, standard convergence results typically fail.

While such a situation could in principle be attacked by the sophisticated analytical

scheme of Kurtz (J Funct Anal 12:55–67, 1973), this will require significant technical

efforts. Instead, in our situation, we are able to identify and explicitly characterise a

well-defined limit via duality in a surprisingly non-technical way. Indeed, we show that

moment duality is in a suitable sense stable under passage to the limit and allows a direct

and intuitive identification of the limiting semi-group while at the same time providing

a probabilistic interpretation of the model. We also obtain a general convergence

strategy for continuous-time Markov chains in a separation of timescales regime,

which is of independent interest.

Keywords Strong seed bank · Two-island model · Separation of timescales · Diffusion limits · Jump-diffusion · Duality

Mathematics Subject Classification Primary 60K35; Secondary 92D10

Maite Wilke Berenguer

maite.wilkeberenguer@ruhr-uni-bochum.de

¹ Institut für Mathematik, Technische Universität Berlin, Berlin, Germany

² Instituto de Matemáticas, Universidad Nacional Autónoma de México, Mexico City, Mexico

³ Fakultät für Mathematik, Ruhr-Universität Bochum, Bochum, Germany


1 Motivation and main results

In this extended introductory section, we first provide some background on the bio-

logical concept of dormancy and its relevance in particular in microbial communities.

This is followed by a short review of modelling approaches for dormancy in population

genetics, where we think that dormancy might be seen as an additional evolutionary

force, interacting with other forces such as genetic drift in complex ways. Since dor-

mancy periods vary over several orders of magnitude (depending on the underlying

species and environmental conditions), we aim for a systematic classification of rel-

evant timescales, leading to the distinction of three separate scaling regimes. While

the first two regimes have been modelled and analysed in population genetics before,

the last one, leading to a separation of timescales between genetic drift and dormancy

periods, is new, and completes the picture (at least on the level of ‘toy models’) of

modelling scenarios. Our results for this regime will be presented in this introduction

both for the forward-in-time population model and for the dual genealogical

processes, leading to novel scaling limits, which are interesting also from a purely

mathematical perspective.

The proofs of these results can be found in Sects. 2 and 3 for the results going

backwards and forwards in time, respectively. We believe that our rather direct method

of proof to obtain and characterise these limits, making extensive use of duality for

Markov processes, can be applied in a variety of situations, so that in each section, we

first present the corresponding methodology in a general set-up and then discuss its

application to our concrete motivation.

Background on dormancy Dormancy is a complex trait that has developed indepen-

dently in many species across the tree of life and comes in many different guises.

Originally, theory for dormancy and the resulting seed banks has been developed in the

context of bet-hedging strategies for plants (Cohen 1966). However, dormancy is also

a highly common trait in microbial communities, with important consequences for

their evolutionary, ecological and pathogenic properties.

Here, we define dormancy as the ability of (micro-) organisms to enter and leave

a state of vanishing metabolic activity. It has been observed for many habitats that at

any given time a large fraction of micro-organisms can be in such a dormant state. For

example, more than 80% of bacteria in soil are reported to be metabolically inactive,

forming large ‘seed banks’ comprised of dormant individuals, see Lennon and Jones

(2011). While dormancy seems to be an efficient and wide-spread strategy, e.g. to

withstand unfavourable environmental conditions, competitive pressure, or antibiotic

treatment, it is at the same time a costly trait whose maintenance involves energy and

a sophisticated ‘switching machinery’.

Dormancy also plays a role in various (human) diseases. So-called persister cells,

that may evade antibiotic treatment by remaining in a state of low activity, play a

major role in chronic infections, cf. Fisher et al. (2017), and individual cell dormancy

is linked to relapses in cancer, cf. Marx (2018), Endo and Inoue (2019).

In this paper, we will focus on microbial seed banks. Lennon and Jones (2011)

and Shoemaker and Lennon (2018) provide a broad overview of this rich and fas-

cinating field and serve as a motivation in the present paper. Given the relevance of


biological systems exhibiting dormancy, investigating the mathematical implications

of dormancy in large populations seems to be a timely and interesting task.

Classification of the duration of dormancy: Known models and motivation for this

paper As indicated above, dormancy comes in many different forms, specific to the

involved species and environments. One variation lies in the duration of dormancy

periods: While in some microbial species dormancy periods last at most a few days,

others stay dormant for prolonged periods of time, and some, e.g. bacterial endospores,

have been reported to successfully resuscitate from dormancy after millions of years

(Shoemaker and Lennon 2018; Cano and Borucki 1995; Johnson et al. 2007; Morono

et al. 2020). The theoretical derivation and analysis of mathematical models may

help to identify, understand and classify the different effects of dormancy, on suitable

timescales, on the population dynamics and genealogical processes of the underlying

populations.

Hence, in this paper, we consider the consequences of dormancy and seed banks

in the framework of population genetics. More precisely, we are interested in the

interplay of dormancy and the classical evolutionary force of random genetic drift, in

particular with respect to its sensitivity to the duration of dormancy periods.

In a bi-allelic, haploid population that reproduces according to the Wright-Fisher

model, the frequency of a given allele converges to the Wright-Fisher diffusion, given

as the solution to

$$dZ(t)=\sqrt{Z(t)(1-Z(t))}\,dB(t),$$

where $(B(t))_{t\ge 0}$ is a standard Brownian motion, if one measures time in the coalescent

timescale (also known as the evolutionary timescale), i.e. on the order of the population

size as this tends to infinity. This diffusion is dual to the block-counting process of the

Kingman coalescent which in turn describes the genealogy of the population. These

objects serve as a reference for populations without dormancy and are widely studied

and applied in biology and mathematics alike. See e.g. Wakeley (2009) or Etheridge

(2011) for an overview. We will consider suitable extensions incorporating dormancy.

We propose to distinguish three regimes comparing the duration of dormancy peri-

ods to the coalescent timescale, i.e. the scale at which the random genetic drift acts.

1. Dormancy periods are small compared to the coalescent timescale.

In 2001, Kaj et al. (2001) introduced a model for dormancy in the following fashion:

instead of always choosing the ancestor in the preceding generation like in the Wright-

Fisher model, individuals are allowed to choose an ancestor several generations in the

past. Their lineages thus ‘jump’ this number of generations and can be interpreted as

dormant during that time. If we denote by B≥1 the expected size of the ‘jump’, the

genealogy of the model converges on the coalescent timescale to a delayed Kingman

coalescent, depicted in Fig. 1b, where coalescences occur at rate $\beta^2$, where $\beta := 1/B$,

instead of at rate 1, cf. Kaj et al. (2001), Blath et al. (2013). This in turn is dual to the

delayed Wright-Fisher diffusion

$$d\tilde Z(t)=\sqrt{\beta^2\,\tilde Z(t)\,(1-\tilde Z(t))}\,dB(t), \qquad (1)$$


Fig. 1 Typical realisations of (a) the Kingman coalescent, where lineages merge at rate 1 per pair, (b) a delayed Kingman coalescent, where lineages merge at rate $\beta^2<1$ per pair, and (c) the seed bank coalescent, see Def. 1.2. In the seed bank coalescent there are two kinds of lines: blue lines are active lineages, while purple lines are dormant lineages. The differences can be seen in the (asymptotic) expected time to the most recent ancestor when started with a sample of $n$ (active and $m$ dormant) individuals given on the time-axis (colour figure online)

that again describes the frequency of a given allele in the population, cf. Fig. 2a. Note

that β does not depend on the population size, whence its qualitatively weak impact

on the coalescent timescale.

2. Dormancy periods are on the order of the coalescent timescale.

For microbial species, however, dormancy times can be much longer than just a

few ‘generations’. In this set-up, Lennon and Jones (2011) proposed a model based on

two reservoirs, the ‘active’ and the ‘dormant’ population, between which individuals

‘migrate/switch’ via initiation of and resuscitation from dormancy, at fixed rates. A

mathematical model for ‘spontaneous/stochastic’ switching (observed in nature under

stable environmental conditions, cf. Epstein 2009; Shoemaker and Lennon 2018), was

introduced and studied in Blath et al. (2016). This is reminiscent of the ‘two-island

model’ (Wright 1931; Moran 1959) with the notable difference of the absence of

reproduction on the second island.

If the sizes of the active and dormant populations are proportional, with the ratio

given by some $K>0$, the frequencies $X(t)$ and $Y(t)$ of a given allele in the active and

dormant population, respectively, when time is measured on the coalescent timescale,

are described by the seed bank diffusion, cf. Fig. 2b. This diffusion was first introduced

in Corollary 2.5 in Blath et al. (2016). The existence of a unique strong solution that

is Feller follows from Theorem 3.2 and Remark 3.2 in Shiga and Shimizu (1980), see

also Greven et al. (2020) for a more general seed bank diffusion.

Definition 1.1 (Seed bank diffusion) Let $(B(t))_{t\ge 0}$ be a standard Brownian motion and $c, K$ finite positive constants. The $[0,1]^2$-valued continuous strong Markov process $(X(t), Y(t))_{t\ge 0}$ given as the unique strong solution of the initial value problem

$$\begin{aligned} dX(t) &= c\,(Y(t)-X(t))\,dt+\sqrt{X(t)(1-X(t))}\,dB(t),\\ dY(t) &= Kc\,(X(t)-Y(t))\,dt, \end{aligned} \qquad (2)$$

with $(X(0), Y(0)) = (x,y)\in[0,1]^2$, is called seed bank diffusion with parameters $c, K$, starting at $(x,y)\in[0,1]^2$.
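To get a feeling for the seed bank diffusion (2), one can discretise it with an Euler-Maruyama scheme. The following is only a minimal sketch and not part of the paper: the function name, the step size `dt` and the clipping of both coordinates to $[0,1]$ are our own assumptions, and the clipping in particular is an ad hoc device to keep the square root well defined.

```python
import math
import random


def simulate_seed_bank_diffusion(x0, y0, c, K, t_end, dt=1e-3, seed=0):
    """Euler-Maruyama sketch for the system (2); returns a list of (t, X, Y).

    Hypothetical helper: name, defaults and clipping are illustrative choices.
    """
    rng = random.Random(seed)
    x, y = x0, y0
    path = [(0.0, x, y)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        # Brownian increment over one step: N(0, dt)
        noise = rng.gauss(0.0, math.sqrt(dt))
        drift_x = c * (y - x)            # migration term of the active population
        drift_y = K * c * (x - y)        # the seed bank has drift only, no noise
        diffusion = math.sqrt(max(x * (1.0 - x), 0.0))
        x_new = x + drift_x * dt + diffusion * noise
        y_new = y + drift_y * dt
        # Ad hoc assumption: clip to [0, 1] so the discretised path stays in the state space
        x = min(1.0, max(0.0, x_new))
        y = min(1.0, max(0.0, y_new))
        path.append((i * dt, x, y))
    return path
```

Calling `simulate_seed_bank_diffusion(0.5, 0.5, c=1.0, K=2.0, t_end=1.0)` returns a discretised trajectory. With `c=0.0` the second coordinate stays frozen at its initial value, in line with the absence of reproduction in the seed bank.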


Fig. 2 Typical realisations of the trajectory of (a) a time-changed Wright-Fisher diffusion, where the time-change is an effect of a weak seed bank, (b) the seed bank diffusion, with the frequency of a given allele in the active population displayed in blue and in the dormant population in purple, (c) the frequency process $(\tilde X(t), \tilde Y(t))$, using the same colour code (colour figure online)

The genealogy of such a population is given by the seed bank coalescent, introduced

in Definition 3.2 in Blath et al. (2016). Here, lineages can switch between an active and

a dormant state independently (hence ‘spontaneous’ switching) at a given rate c>0.

While the active lineages behave like the Kingman coalescent, dormant lineages are

prohibited from coalescing, as depicted in Fig. 1c.

That dormancy appears in such a prominent form in the coalescent and in the

diffusion and therefore is visible on the coalescent timescale is due to the underlying

scaling assumptions of the model. These imply that dormancy times are of the order of

the population size and therefore on the coalescent timescale. Here, many population

genetic quantities and statistics are affected in non-trivial ways, see Blath et al. (2015),

Blath et al. (2016) and Blath et al. (2020b) for a discussion of the scaling assumptions

and further extensions of the model. Since the seed bank here has a major qualitative

effect on both the diffusion and the coalescent, this is sometimes referred to as the

strong seed bank model.

As in the previous models, an important mathematical tool in our analysis will

be the formal duality relation between the seed bank diffusion $(X(t), Y(t))_{t\ge 0}$ and

the block-counting process of the seed bank coalescent $(N(t), M(t))_{t\ge 0}$. Note that

the notion of a ‘block’ comes from the mathematical definition of a coalescent as a

partition-valued process. In the biological context, the process could as well be called

the line-counting process, keeping track of the number of ancestral lines present at

each time in the past.

Definition 1.2 (Block-counting process of the seed bank coalescent) Let $E := \mathbb{N}_0\times\mathbb{N}_0$. Let $c, K>0$. We define $(N(t), M(t))_{t\ge 0}$ to be the continuous-time Markov chain taking values in $E$ with conservative $Q$-matrix $R$ given by

$$R_{(n,m),(\bar n,\bar m)}=\begin{cases} \binom{n}{2}, & \text{if } (\bar n,\bar m)=(n-1,m),\\ cn, & \text{if } (\bar n,\bar m)=(n-1,m+1),\\ cKm, & \text{if } (\bar n,\bar m)=(n+1,m-1),\\ -\binom{n}{2}-cn-cKm, & \text{if } (\bar n,\bar m)=(n,m),\\ 0, & \text{otherwise.} \end{cases} \qquad (3)$$

This continuous-time Markov chain, introduced in Definition 2.7 in Blath et al. (2016), satisfies the moment duality

$$\mathbb{E}_{x,y}\left[X(t)^n\,Y(t)^m\right]=\mathbb{E}^{n,m}\left[x^{N(t)}\,y^{M(t)}\right] \qquad (4)$$

for every $t>0$, for every $(x,y)\in[0,1]^2$ and for every $n,m\in\mathbb{N}_0$, see Theorem 2.8 in Blath et al. (2016). In other words, the distribution of the seed bank diffusion at any time $t$ is uniquely determined by the moment dual at said time.
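The duality (4) can be made plausible at the level of generators. By Itô's formula, the diffusion (2) has generator $c(y-x)\partial_x+\tfrac12 x(1-x)\partial_{xx}+Kc(x-y)\partial_y$; applying it to the monomial $x^ny^m$ should give the same value as applying the $Q$-matrix (3) to the function $(n,m)\mapsto x^ny^m$. The sketch below checks this identity numerically. It is an illustration with hypothetical function names, not the proof given in Blath et al. (2016).

```python
from math import comb


def diffusion_generator_on_monomial(n, m, x, y, c, K):
    """Generator of (2) applied to f(x, y) = x^n y^m, using exact derivatives."""
    fx = n * x ** (n - 1) * y ** m if n >= 1 else 0.0
    fxx = n * (n - 1) * x ** (n - 2) * y ** m if n >= 2 else 0.0
    fy = m * x ** n * y ** (m - 1) if m >= 1 else 0.0
    return c * (y - x) * fx + 0.5 * x * (1.0 - x) * fxx + K * c * (x - y) * fy


def chain_generator_on_monomial(n, m, x, y, c, K):
    """Q-matrix (3) applied to g(n, m) = x^n y^m, for fixed (x, y)."""
    def g(a, b):
        return x ** a * y ** b

    total = 0.0
    if n >= 2:  # coalescence of two active lines
        total += comb(n, 2) * (g(n - 1, m) - g(n, m))
    if n >= 1:  # an active line falls dormant
        total += c * n * (g(n - 1, m + 1) - g(n, m))
    if m >= 1:  # a dormant line resuscitates
        total += c * K * m * (g(n + 1, m - 1) - g(n, m))
    return total
```

For instance, both functions agree up to rounding error at `n=3, m=2, x=0.3, y=0.8, c=0.7, K=2.5`; the diagonal entry of (3) is what makes the two expressions match term by term.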

3. Dormancy periods are large compared to the coalescent timescale.

In view of the (potentially) extreme duration of dormancy times of bacterial spores,

it is natural to ask: What happens in the third natural scaling-regime, when dormancy

times are long in comparison to the scale on which genetic drift acts? This is the

question answered in this manuscript in the following subsections.

To this end, we consider scaling limits of the above seed bank/two-island model

when migration between active and dormant states (say at rate c) and reproduction (say

at rate 1) act on different timescales, that is, $c$ being much smaller than 1. Interesting

limits can only be expected when switching to a ‘fast’ super-evolutionary timescale.

Indeed, if one just lets c→0, then one obtains the trivial limit where the active popula-

tion follows a Wright-Fisher diffusion and a Kingman coalescent, respectively, and is

completely separated from the dormant population, as can be readily seen from (2) and

(3). Hence, in order to capture the effect of long dormancy times one needs to speed up

time by a factor $1/c$, as $c\to 0$, thus switching to a new timescale, which we will refer

to as the super-evolutionary timescale. At this super-evolutionary timescale migration

between the active and the dormant population occurs at rate 1 while reproduction,

and hence genetic drift, acts ‘instantaneously’. Intuitively, fast reproduction should

drive the $X$ coordinate of the diffusion process immediately towards the boundaries

0 and 1, between which it then only rarely switches due to immigration of

‘ancient’ alleles. This is indeed what we will see below.

This scaling regime also leads to mathematically appealing problems. The naïve

scaling limit would lead to a coefficient of “∞” for the genetic drift in the seed bank

diffusion and an infinite coalescent rate in the seed bank coalescent, respectively, and

we thus need to find a way to rigorously identify and describe such a ‘degenerate’

mathematical limit.

Main results under separation of timescales: the frequency process The following

two theorems provide the main results for the frequency processes of Wright-Fisher

models with seed banks, if dormancy times are sufficiently long for the timescales of


dormancy and genetic drift to separate. Note that we switch to the super-evolutionary

timescale.

Theorem 1.3 Let $(X^c(t), Y^c(t))_{t\ge 0}$ be the seed bank diffusion given in Definition 1.1 with migration rate $c>0$. Assume that the initial distributions $(X^c(0), Y^c(0))$ converge weakly to an $(x,y)\in[0,1]^2$ as $c\to 0$. Then, there exists a strong Markov process $(\tilde X(t), \tilde Y(t))_{t\ge 0}$, started in $(\tilde X(0), \tilde Y(0))=(x,y)$, with the property that for any sequence of migration rates with $c_\kappa\to 0$ when $\kappa\to\infty$,

$$\left(X^{c_\kappa}\!\left(\tfrac{1}{c_\kappa}t\right),\,Y^{c_\kappa}\!\left(\tfrac{1}{c_\kappa}t\right)\right)_{t\ge 0}\xrightarrow{\text{f.d.d.}}(\tilde X(t), \tilde Y(t))_{t\ge 0}\quad\text{as }\kappa\to\infty.$$

Furthermore,

$$\lim_{t\to 0}\mathbb{P}\big(\tilde X(t)=1\big)=1-\lim_{t\to 0}\mathbb{P}\big(\tilde X(t)=0\big)=x \qquad (5)$$

and we may choose $(\tilde X(t), \tilde Y(t))_{t\ge 0}$ to be càdlàg and such that for every $t>0$, $(\tilde X(t), \tilde Y(t))\in\{0,1\}\times[0,1]$.

Here, càdlàg stands for continue à droite, limite à gauche, i.e. the property of a path to be right-continuous for every $t\ge 0$ and have a limit from the left for every $t>0$.

Note that the above convergence is in the sense of the finite-dimensional distribu-

tions (f.d.d.), which uniquely determines the law of the limit. As indicated above, it

will have jumps in the first component $\tilde X$, which is remarkable since the prelimiting processes all have continuous paths. In order to understand this, we prove in Proposition 3.8 that, if started in $\{0,1\}\times[0,1]$, $(\tilde X(t), \tilde Y(t))_{t\ge 0}$ coincides in distribution with a Feller process $(\bar X(t), \bar Y(t))_{t\ge 0}$ taking values in $\{0,1\}\times[0,1]$ which is defined via the generator

$$Af(x,y)=(1-x)\,y\,\big(f(1,y)-f(x,y)\big)+x\,(1-y)\,\big(f(0,y)-f(x,y)\big)+K(x-y)\,\frac{\partial f}{\partial y}(x,y), \qquad (6)$$

for functions $f$ in $\{f:\{0,1\}\times[0,1]\to\mathbb{R}\mid f(0,\cdot), f(1,\cdot)\in C^1([0,1],\mathbb{R})\}$.

The dynamics of the process $(\tilde X(t), \tilde Y(t))_{t\ge 0}$ are therefore as follows: The first component $\tilde X$ is indeed a piece-wise deterministic process, switching between states 0 and 1. The switching rate at time $t$ for jumps from 0 to 1 is just given by the value of the second component $\tilde Y(t)$, and from 1 to 0 by the complementary rate $1-\tilde Y(t)$. In-between jump times of $\tilde X$, the second component $\tilde Y$ behaves deterministically, following the equation

$$d\tilde Y(t)=K(\tilde X(t)-\tilde Y(t))\,dt.$$

So while $\tilde X(t)$ is in state 0, $\tilde Y(t)$ decreases deterministically and exponentially fast, with drift $-K\tilde Y(t)$, and while $\tilde X(t)$ is in state 1, $\tilde Y(t)$ increases exponentially fast towards 1, with drift $K(1-\tilde Y(t))$. This is illustrated in Fig. 2c.
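The dynamics encoded in the generator (6) can be simulated exactly, without time discretisation. Between jumps the flow is explicit, $\tilde Y(t)=\tilde X+(\tilde Y(0)-\tilde X)e^{-Kt}$, so in both states the jump rate decays as $r_0e^{-Kt}$, where $r_0$ is the rate right after the previous jump. The sketch below inverts the resulting integrated rate $r_0(1-e^{-Kt})/K$ against a standard exponential variable. The function names are hypothetical and the code is an illustration under these assumptions, not the construction used in the paper.

```python
import math
import random


def next_jump_time(rate0, K, e):
    """Solve rate0 * (1 - exp(-K t)) / K = e for t; None if there is no solution."""
    if rate0 <= 0.0:
        return None
    u = K * e / rate0
    if u >= 1.0:
        # The total integrated rate rate0 / K is finite: no further jump occurs
        return None
    return -math.log1p(-u) / K


def simulate_limit_process(x0, y0, K, t_end, seed=0):
    """Exact sketch of the jump process with generator (6).

    x0 must be 0 or 1. Returns the initial state and the state right after
    every jump as a list of (t, X, Y).
    """
    rng = random.Random(seed)
    t, x, y = 0.0, x0, y0
    jumps = [(t, x, y)]
    while True:
        rate0 = y if x == 0 else 1.0 - y   # rate 0 -> 1 is Y, rate 1 -> 0 is 1 - Y
        tau = next_jump_time(rate0, K, rng.expovariate(1.0))
        if tau is None or t + tau > t_end:
            break
        # Deterministic flow dY = K (X - Y) dt over the waiting time, then flip X
        y = x + (y - x) * math.exp(-K * tau)
        t += tau
        x = 1 - x
        jumps.append((t, x, y))
    return jumps
```

A call such as `simulate_limit_process(0, 0.5, K=1.0, t_end=50.0)` returns the jump skeleton; the values of $\tilde Y$ between two recorded times follow from the explicit flow above.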


Interpretation: dormancy versus genetic drift on different timescales In the classical

Wright-Fisher model without dormancy, genetic drift drives the frequency process

$(Z(t))_{t\ge 0}$ of a given allele towards the boundaries 0 and 1, where it fixates. This

occurs on timescales of the order of the (effective total) population size.

In the weak seed bank regime frequencies are described by $(\tilde Z(t))_{t\ge 0}$ and genetic

drift is ‘slowed down’ in a quantitative sense by a factor $\beta^2$, since dormant individuals may jump generations, increasing the effective population size accordingly. For

example, expected fixation times will be stretched by the factor $\beta^{-2}$.

In the strong seed bank regime, dormancy times and genetic drift both act on

the same timescale. The resulting additional seed bank ‘island’ in the diffusion

$(X(t), Y(t))_{t\ge 0}$ will slow down the effect of genetic drift in a qualitative sense. In fact,

although the active population may fixate briefly in 0 or 1, the seed bank component

will then quickly reintroduce variability via the migration term, hence the memory

in the seed bank prevents final fixation in finite time (at least for non-trivial initial

states). This interesting effect is discussed in detail in Blath et al. (2019), where it is

also shown that the seed bank introduces ‘variability’ into the population model in a

suitable sense, by means of a delay-equation reformulation of the seed bank diffusion.

Finally, in the extreme case where dormancy periods are much longer than the

timescale of genetic drift, if time is measured in the super-evolutionary scale, fixation/extinction in the active population of $(\tilde X(t), \tilde Y(t))_{t\ge 0}$ will happen instantaneously,

and last for a finite time. The switches of the frequency in the active population between

0 and 1 can be explained as follows: When a single ‘ancient’ allele ‘resuscitates’, it

will usually not be able to fixate in the population and go extinct again. However, on

the super-evolutionary timescale, these ‘trials’ reoccur many times, and eventually a

resuscitating allele will fixate. If it is of the same type as the allele currently present in

the active population, nothing changes and there will be no jump. However, if it is of

the other type, this will cause $\tilde X$ to switch to the opposite boundary. The probabilities

of the allele resuscitating at time $t$ being of the given type or of the opposite type are

$\tilde Y(t)$ and $1-\tilde Y(t)$, which explains the form of the rates in the generator (6).

These observations regarding fixation or coexistence of types can be summed up

as follows. In the Wright-Fisher diffusion without mutation $(Z(t))_{t\ge 0}$, ultimately, one

type will fixate. In the weak seed bank regime described by $(\tilde Z(t))_{t\ge 0}$, there will also

be one type that fixates, but the (expected) time until this happens is increased by a

factor of $\beta^{-2}$. In the strong seed bank regime, we will occasionally see fixation of

one type in the active population, but then the seed bank will reintroduce variability

immediately, so that coexistence is visible almost all the time. Finally, in the case of

dormancy on the super-evolutionary timescale, at any given time, the active population

will always be monomorphic, but the dominant type will switch from time to time,

and there are no visible periods of coexistence at all.

Duality and genealogical interpretation of the scaling regimes

As we have seen, the processes describing the forward-in-time frequency of a given

allele in a Wright-Fisher model with seed bank have natural dual processes describing

their genealogies. Such genealogical processes shed light on the effect of dormancy

on the ancestral processes of samples, but are also useful tools for the proofs of the


Fig. 3 A typical realisation of an ancient ancestral lines process. Blue lines are active lineages, purple

lines are dormant. At the macroscopic time-scale coalescence occurs instantaneously, which is what we see

between the times 0 and 0+. Afterwards we have at most one active lineage at any given time. If a dormant

lineage activates, it coalesces immediately with the active lineage (colour figure online)

previous theorems, as they tend to be mathematically simpler objects. Our new scaling

regime is no exception.

In the super-evolutionary scaling regime of Theorem 1.3 we obtain the block-

counting process of the ancient ancestral lines process as a scaling limit of the

genealogies (see Theorem 1.5 below). Intuitively, since we are considering a pop-

ulation for which dormancy times are of a larger order than the times of coalescences,

at the super-evolutionary timescale, coalescences occur instantaneously, while migra-

tion between the active and the dormant state occurs at order 1, cf. Fig. 3. Hence, in

the limit, for each time t>0, there will be at most one active line. More formally, we

obtain the following definition.

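Definition 1.4 (The ancient ancestral lines process) Let $(n_0,m_0)\in\mathbb{N}_0\times\mathbb{N}_0$. The $(n_0,m_0)$-ancient ancestral lines process is the continuous-time Markov chain $(\tilde N(t), \tilde M(t))_{t\ge 0}$ with initial value $(\tilde N(0), \tilde M(0))=(n_0,m_0)$, taking values in the state space

$$E_{(n_0,m_0)}:=\{0,\dots,n_0+m_0\}^2,$$

with semi-group

$$\Pi(t):=Pe^{tG},\quad t>0,$$

where $\Pi(0)$ is defined as $I_E$, the identity on $E_{(n_0,m_0)}$. $P$ is a projection ($P^2=P$) given by

$$P_{(n,m),(\bar n,\bar m)}:=\begin{cases} 1, & \text{if } \bar n=1,\ n\ge 1,\ \bar m=m,\\ 1, & \text{if } \bar n=n=0,\ \bar m=m,\\ 0, & \text{otherwise,} \end{cases} \qquad (7)$$

for all $(n,m), (\bar n,\bar m)\in E_{(n_0,m_0)}$ and $G$ is defined as

$$G_{(n,m),(\bar n,\bar m)}:=\begin{cases} Km, & \text{if } \bar n=1,\ n\ge 0,\ \bar m=m-1,\\ 1, & \text{if } \bar n=0,\ n\ge 1,\ \bar m=m+1,\\ -1-Km, & \text{if } \bar n=1,\ n\ge 1,\ \bar m=m,\\ -Km, & \text{if } \bar n=n=0,\ \bar m=m,\\ 0, & \text{otherwise.} \end{cases}$$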

Note the form of the semi-group of the Markov chain which in particular is not standard, i.e. $\lim_{t\downarrow 0}\Pi(t)=P\neq \mathrm{Id}_E$ (cf. Chung 1960). Since the projection $P$ acts for all $t>0$, this process takes values in the smaller space $\{0,1\}\times\{0,\dots,m_0+1\}$ $\mathbb{P}$-a.s. for every (fixed) $t>0$. The first two “rates” given in the definition of $G$ correspond to the events of resuscitation (with immediate coalescence if applicable) and initiation of dormancy. $G$ is, however, not a $Q$-matrix, since for any $n\ge 2$ it has negative values off the diagonal, namely the entry $-1-Km$ in the column of the state $(1,m)$. These only regard states that will be collapsed by $P$ into the smaller state space.
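The algebraic structure of $P$ and $G$ in Definition 1.4 is easy to inspect on a small example. The sketch below builds both matrices with plain Python lists and lets one check $P^2=P$ and $PG=GP=G$ as well as the negative off-diagonal entries of $G$. The function names are hypothetical, and entries whose target state would leave $E_{(n_0,m_0)}$ are simply dropped, which is our own convention for states that cannot be reached from $(n_0,m_0)$.

```python
from itertools import product


def build_P_and_G(n0, m0, K):
    """Return (index, P, G) for Definition 1.4 on E = {0, ..., n0 + m0}^2."""
    N = n0 + m0
    states = list(product(range(N + 1), repeat=2))
    index = {s: i for i, s in enumerate(states)}
    size = len(states)
    P = [[0.0] * size for _ in range(size)]
    G = [[0.0] * size for _ in range(size)]
    for (n, m), i in index.items():
        if n >= 1:
            P[i][index[(1, m)]] = 1.0            # all active lines merge into one
            G[i][index[(1, m)]] = -1.0 - K * m   # off the diagonal whenever n >= 2
            if m + 1 <= N:                       # assumption: drop targets outside E
                G[i][index[(0, m + 1)]] = 1.0    # the active line falls dormant
        else:
            P[i][i] = 1.0
            G[i][i] = -K * m
        if m >= 1:
            G[i][index[(1, m - 1)]] = K * m      # resuscitation, then coalescence
    return index, P, G


def matmul(A, B):
    """Plain matrix product for lists of lists."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]
```

With `index, P, G = build_P_and_G(2, 1, K=3.0)`, the entry `G[index[(2, 1)]][index[(1, 1)]]` is negative although it is off the diagonal, while the rows of $G$ belonging to states with $n+m\le n_0+m_0$ still sum to zero.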

The technical challenges due to the degenerate form of the semi-group of the scaling

limit coming from “separation of timescales phenomena” (cf. for example Wakeley

2009, Chapter 6 from the population genetics perspective) require special care as

we detail in Sect. 2.1. Subsequently, we apply the above strategy to our model in

Sect. 2.2 proving that the ancient ancestral lines process arises as the scaling limit of

the block-counting process of the seed bank coalescent in the sense of convergence of

the finite-dimensional distributions.

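Theorem 1.5 Denote by $(N^c(t), M^c(t))_{t\ge 0}$ the block counting process of the seed bank coalescent as defined in Definition 1.2 with migration rate $c>0$ and assume that it starts at some $(n_0,m_0)\in\mathbb{N}\times\mathbb{N}$, $\mathbb{P}$-a.s.

Furthermore let $(\tilde N(t), \tilde M(t))_{t\ge 0}$ be the ancient ancestral lines process from Definition 1.4 with the same initial condition. Then, for any sequence of migration rates $(c_\kappa)_{\kappa\in\mathbb{N}}$ with $c_\kappa\to 0$ when $\kappa\to\infty$, we have

$$\left(N^{c_\kappa}\!\left(\tfrac{1}{c_\kappa}t\right),\,M^{c_\kappa}\!\left(\tfrac{1}{c_\kappa}t\right)\right)_{t\ge 0}\xrightarrow{\text{f.d.d.}}\left(\tilde N(t), \tilde M(t)\right)_{t\ge 0}.$$

Without loss of generality, we assume $(\tilde N(t), \tilde M(t))_{t\ge 0}$ to be càdlàg.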

Spontaneous and simultaneous switching

One should note that for the above models, we assumed a ‘spontaneous’ switching.

‘Simultaneous’ switching, where transitions to and from the dormant population are

triggered by environmental cues, is currently an active area of research, see e.g. Blath

et al. (2020a).


2 Scaling limits for continuous-time Markov chains

Motivated by the example of the super-evolutionary scaling in the introductory section,

as a first step, we consider scaling limits of continuous-time Markov chains. Indeed,

when speeding up time, some transition rates diverge to ∞, thus obstructing direct

Q-matrix computations and producing states that are vacated immediately. This effect

is frequently observed when dealing with “separation of timescales phenomena” and

can in a ‘well-behaved’ scenario still lead to a scaling limit with potentially “degen-

erate”, i.e. non-standard transition semi-group of the form

$$Pe^{tG},\quad t\ge 0,$$

where $P$ is a projection to a subspace of the original state space as a result of “immediately vacated states” and satisfies $G=PG=GP$. For discrete-time Markov chains,

this situation was considered e.g. in Möhle (1998), Birkner et al. (2013) and recently

also Möhle and Notohara (2016). Since the handling of such situations for continuous-

time Markov chains (such as the above block counting process) might be of general

interest and is somewhat more involved than the discrete case, we give a detailed

“recipe” for such convergence proofs in Sect. 2.1. Note that all of these results can

in principle be seen as specialised and ready-to-use variants of the general operator-

theoretic scheme derived in Kurtz (1973) in the context of ‘random evolutions’ (see

also Ethier and Kurtz 1986, Sect. 1.7). Recent applications of this scheme can also be

found in Bobrowski (2015).

2.1 Separation of timescales phenomena for continuous-time Markov chains: a

strategy

Given a sequence of continuous-time Markov chains $(\xi^\kappa(t))_{t\ge 0}$, $\kappa\in\mathbb{N}$, with finite state-space $E$ (equipped with a metric $d$), suppose that our aim is to prove its convergence in finite-dimensional distributions under a suitable time-rescaling $(C_\kappa)_{\kappa\in\mathbb{N}}$ to a continuous-time Markov chain $(\xi(t))_{t\ge 0}$ when $\kappa\to\infty$.

Our programme to carry out such a proof has two steps:

First, consider an appropriate time discretisation of $(\xi^\kappa(t))_{t\ge 0}$, $\kappa\in\mathbb{N}$. Employing

the machinery from Birkner et al. (2013), Möhle (1998) and Möhle and Notohara

(2016) available in this context, one can prove convergence of a rescaling of the

discretised processes to a continuous-time Markov chain $(\xi(t))_{t\ge 0}$ when $\kappa\to\infty$ in

the sense of weak convergence in finite-dimensional distributions.

Second, we prove a continuity result to show that the suitably rescaled original

process converges in finite-dimensional distributions to the same limit.

In order to formulate the conditions on the time-rescaling and the original sequence of Markov chains, we rewrite the time-rescaling as $C_\kappa=b_\kappa/a_\kappa$, where further assumptions on the non-negative sequences $(a_\kappa)_{\kappa\in\mathbb{N}}$ and $(b_\kappa)_{\kappa\in\mathbb{N}}$ will be specified below.

Step (i) Time discretisation and its convergence

The following lemma is an immediate application of Lemma 1.7 in Birkner et al.

(2013) analogous to Theorem 1 in Möhle (1998). We rephrase it in this framework


for the convenience of the reader and as reference for the examples we will consider

below.

Observe that for a non-negative sequence $(a_\kappa)_{\kappa\in\mathbb{N}}$, $(\xi^\kappa(i/a_\kappa))_{i\in\mathbb{N}}$ is a discrete-time Markov chain with finite state-space $E$ for each $\kappa\in\mathbb{N}$. We equip the matrices $A=(A_{e,\bar e})_{e,\bar e\in E}$ on $E$ with the matrix norm $\|A\|:=\max_{e\in E}\sum_{\bar e\in E}|A_{e,\bar e}|$. Since $E$ is finite, convergence in the matrix norm is equivalent to pointwise convergence.

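Lemma 2.1 Let $(a_\kappa)_{\kappa\in\mathbb{N}}$ and $(b_\kappa)_{\kappa\in\mathbb{N}}$ be non-negative sequences such that $a_\kappa, b_\kappa, b_\kappa/a_\kappa\to\infty$ as $\kappa\to\infty$. For each $\kappa\in\mathbb{N}$ denote by $\Pi_\kappa$ the transition matrix of the discrete-time, time-homogeneous Markov chain $(\xi^\kappa(i/a_\kappa))_{i\in\mathbb{N}}$.

Assume that for every $\kappa\in\mathbb{N}$ we have a representation of the transition matrix of the form

$$\Pi_\kappa=A_\kappa+\frac{1}{b_\kappa}B_\kappa, \qquad (8)$$

such that the following holds: $A_\kappa$ is a stochastic matrix and

$$\lim_{C\to\infty}\lim_{\kappa\to\infty}\sup_{r\ge Ca_\kappa}\left\|(A_\kappa)^r-P\right\|=0 \qquad (9)$$

for some matrix $P$. Furthermore, we require that the matrix limit with respect to the matrix norm

$$G:=\lim_{\kappa\to\infty}PB_\kappa P\quad\text{exists.} \qquad (10)$$

Then, we obtain the following convergence (with respect to the matrix norm):

$$\lim_{\kappa\to\infty}\Pi_\kappa^{\lfloor tb_\kappa\rfloor}=\lim_{\kappa\to\infty}\left(A_\kappa+\frac{1}{b_\kappa}B_\kappa\right)^{\lfloor tb_\kappa\rfloor}=Pe^{tG}=:\Pi(t)\quad\text{for all } t>0. \qquad (11)$$

In particular, if we define $\Pi(0):=\mathrm{Id}_E$, then $(\Pi(t))_{t\ge 0}$ is a semi-group that generates a continuous-time Markov chain which we denote by $(\xi(t))_{t\ge 0}$.

If $\xi^\kappa(0)\xrightarrow{w}\xi(0)$ as $\kappa\to\infty$, Eq. (11) implies

$$\left(\xi^\kappa\!\left(\frac{\lfloor b_\kappa t\rfloor}{a_\kappa}\right)\right)_{t\ge 0}\xrightarrow{\text{f.d.d.}}(\xi(t))_{t\ge 0},\quad\text{as }\kappa\to\infty.$$

Here, $\xrightarrow{w}$ denotes weak convergence.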

Before proceeding to the proof of this lemma, let us make a few remarks about the

assumptions and results observed in it.

Remark 2.2 1. Sinceκisthe transition matrix of the (ξκ(t))t≥0underatime-change

by a−1

κ, in a representation like (8), Aκis a stochastic matrix that contains only

entries of order 1 and a−1

κ, and Bκcontains only entries of order 1 and o(1).

Since we then speed-up time by a factor bκ, we obtain a separation of timescales,

where the entries in Aκgive rise to a projection matrix Pacting on the probability

distributionson E,whiletheentriesin Bκgiverise to a “Q-matrix”. The Aκcontain


the transition rates of $(\xi_\kappa(t))_{t\ge0}$ that occur at a faster rate than the new timescale, hence they occur “instantaneously” in the limit. The entries in $B_\kappa$ correspond to the transitions of $(\xi_\kappa(t))_{t\ge0}$ that either occur on the new timescale or are slower, hence describing the transitions visible in the limit and those that vanish.

2. Note that given (9), the matrix $P$ is necessarily a projection on $E$, i.e. satisfies $P^2=P$. Since $P=P^2$ and $G=\lim_{\kappa\to\infty}PB_\kappa P$, we have $PG=GP=G$ and hence $Pe^{tG}=e^{tG}P=P-I+e^{tG}$ for any $t\ge0$. In particular, $(\Pi(t))_{t\ge0}$ is not standard, as $\lim_{t\downarrow0}\Pi(t)=P\neq\Pi(0)=\mathrm{Id}_E$. $P$ effectively restricts the state space of the limiting chain to a subspace of $E$.

Observe that $G$ differs from a normal Q-matrix as it may have negative entries off the diagonal.
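The statement of Lemma 2.1 and the observations of Remark 2.2 can be tried out numerically. The following is a minimal sketch on a hypothetical three-state toy model that is not taken from the paper: state 2 collapses to state 1 on the fast timescale, while the moves 0 → 2 and 1 → 0 are slow. The choices $a_\kappa=\kappa$, $b_\kappa=\kappa^2$ and all function names are assumptions made for illustration only.

```python
import math

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_pow(X, r):
    """X**r for an integer r >= 0, by repeated squaring."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    while r > 0:
        if r & 1:
            result = mat_mul(result, X)
        X = mat_mul(X, X)
        r >>= 1
    return result

def mat_exp(X, terms=60):
    """Truncated power series for exp(X); adequate for the small matrices used here."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def norm_diff(X, Y):
    """Matrix norm ||X - Y||: maximal row sum of absolute values."""
    return max(sum(abs(x - y) for x, y in zip(rx, ry)) for rx, ry in zip(X, Y))

# Hypothetical toy model on E = {0, 1, 2} (an assumption, not from the paper).
P = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]    # projection: 2 -> 1
B = [[-1.0, 0.0, 1.0], [2.0, -2.0, 0.0], [0.0, 0.0, 0.0]]  # slow transitions
G = mat_mul(mat_mul(P, B), P)                              # G = P B P as in (10)

def prelimit(kappa, t):
    """(A_kappa + B / b_kappa)^floor(t b_kappa) with a_kappa = kappa, b_kappa = kappa^2."""
    a, b = float(kappa), float(kappa) ** 2
    A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0 / a, 1.0 - 1.0 / a]]
    Pi = [[A[i][j] + B[i][j] / b for j in range(3)] for i in range(3)]
    return mat_pow(Pi, math.floor(t * b))

t = 1.0
exp_tG = mat_exp([[t * g for g in row] for row in G])
limit = mat_mul(P, exp_tG)                                 # P e^{tG} as in (11)
errors = [norm_diff(prelimit(kappa, t), limit) for kappa in (10, 100, 1000)]
print(errors)
```

In this toy model the row of $G$ belonging to state 2 has a negative off-diagonal entry, which is the kind of deviation from a normal Q-matrix mentioned in Remark 2.2.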

Proof of Lemma 2.1 Conditions (8), (9) and (10) above are precisely conditions (36), (46) and (48) in Birkner et al. (2013). Hence (11) is the claim of (49) in Lemma 1.7 and Remark 1.8 in Birkner et al. (2013). Remark 2.2 in particular implies that the Chapman–Kolmogorov equations hold for $(\Pi(t))_{t\ge0}$ and hence this generates a continuous-time Markov chain which we denote by $(\xi(t))_{t\ge0}$ (see, for example, Kallenberg 2002, Thm. 8.4). The convergence in Eq. (11) and the Markov property then imply the convergence in finite-dimensional distributions. □

Step (ii) Convergence of the continuous-time Markov chains

The previous step ensured the existence of a limit for suitably discretised versions

of the original sequence of continuous-time Markov chains (ξκ(t))t≥0. The following

lemma tells us under what conditions such a discretisation is sufficiently fine to also

imply the convergence of the (ξκ(t))t≥0to the same limit.

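Lemma 2.3 Let $(\xi_\kappa(t))_{t\ge0}$, $\kappa\in\mathbb{N}$, be a sequence of continuous-time, time-homogeneous Markov chains with finite state space $E$ (equipped with some metric $d$). Let $(a_\kappa)_{\kappa\in\mathbb{N}}$ and $(b_\kappa)_{\kappa\in\mathbb{N}}$ be non-negative sequences.

Denote by $G^\kappa$ the Q-matrix of $(\xi_\kappa(t))_{t\ge0}$ for each $\kappa\in\mathbb{N}$ and set $q_\kappa := \max_{e\in E}\bigl(-G^\kappa_{e,e}\bigr)$. If

(a) $q_\kappa/a_\kappa\to0$ as $\kappa\to\infty$, and

(b) $\bigl(\xi_\kappa\bigl(\lfloor b_\kappa t\rfloor/a_\kappa\bigr)\bigr)_{t\ge0}\xrightarrow{\text{f.d.d.}}(\xi(t))_{t\ge0}$ as $\kappa\to\infty$,

then also

\[ \Bigl(\xi_\kappa\Bigl(\frac{b_\kappa}{a_\kappa}t\Bigr)\Bigr)_{t\ge0}\xrightarrow{\text{f.d.d.}}(\xi(t))_{t\ge0} \quad\text{as } \kappa\to\infty. \]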

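Proof When started at $e\in E$, the time to the first jump of $\xi_\kappa$ is exponentially distributed with parameter $-G^\kappa_{e,e}$. Hence one sees that condition (a) was chosen precisely such that

\[ \mathbb{P}\bigl\{(\xi_\kappa(t))_{t\ge0}\text{ has a jump in }[0,1/a_\kappa]\bigr\}\le 1-\exp\Bigl(-\frac{q_\kappa}{a_\kappa}\Bigr)\to0,\quad\kappa\to\infty. \tag{12} \]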


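Observe that for the distance between $\xi_\kappa\bigl(\lfloor b_\kappa t\rfloor/a_\kappa\bigr)$ and $\xi_\kappa\bigl(b_\kappa t/a_\kappa\bigr)$ at any time $t\ge0$ we have

\[ d\Bigl(\xi_\kappa\Bigl(\frac{\lfloor b_\kappa t\rfloor}{a_\kappa}\Bigr),\xi_\kappa\Bigl(\frac{b_\kappa t}{a_\kappa}\Bigr)\Bigr)>0 \]

only if the process $(\xi_\kappa(t))_{t\ge0}$ has a jump in the interval $\bigl(\lfloor b_\kappa t\rfloor/a_\kappa,\ b_\kappa t/a_\kappa\bigr]$. Since the length of this interval can be estimated through

\[ 0\le\frac{b_\kappa t}{a_\kappa}-\frac{\lfloor b_\kappa t\rfloor}{a_\kappa}\le\frac{1}{a_\kappa} \]

and the Markov chains are time-homogeneous we can in turn estimate the probability of a jump in the interval using (12) and obtain

\[ \mathbb{P}\Bigl\{d\Bigl(\xi_\kappa\Bigl(\frac{\lfloor b_\kappa t\rfloor}{a_\kappa}\Bigr),\xi_\kappa\Bigl(\frac{b_\kappa t}{a_\kappa}\Bigr)\Bigr)>0\Bigr\} \le \mathbb{P}\Bigl\{(\xi_\kappa(t))_{t\ge0}\text{ has a jump in }\Bigl(\frac{\lfloor b_\kappa t\rfloor}{a_\kappa},\frac{b_\kappa t}{a_\kappa}\Bigr]\Bigr\} \le 1-\exp\Bigl(-\frac{q_\kappa}{a_\kappa}\Bigr)\to0,\quad\kappa\to\infty. \tag{13} \]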

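In order to prove the convergence of the finite-dimensional distributions, recall that weak convergence of measures is equivalent to convergence in the Prohorov metric (see, e.g. Whitt (2002), Section 3.2). Hence, assumption (b) yields that for all time points $0\le t_0,\ldots,t_l<\infty$, states $e_0,\ldots,e_l\in E$ and any $\varepsilon>0$ sufficiently small there exists a $\bar\kappa\in\mathbb{N}$ such that for all $\kappa\ge\bar\kappa$:

\[ \mathbb{P}\Bigl\{\xi_\kappa\Bigl(\frac{\lfloor b_\kappa t_0\rfloor}{a_\kappa}\Bigr)=e_0,\ldots,\xi_\kappa\Bigl(\frac{\lfloor b_\kappa t_l\rfloor}{a_\kappa}\Bigr)=e_l\Bigr\}\ge\mathbb{P}\{\xi(t_0)=e_0,\ldots,\xi(t_l)=e_l\}-\varepsilon. \]

Combining this with (13) we see that for all time points $0\le t_0,\ldots,t_l<\infty$, states $e_0,\ldots,e_l\in E$ and any $\varepsilon>0$ sufficiently small there exists a $\bar\kappa\in\mathbb{N}$ such that for all $\kappa\ge\bar\kappa$

\begin{align*}
\mathbb{P}\Bigl\{\xi_\kappa\Bigl(\frac{b_\kappa t_0}{a_\kappa}\Bigr)=e_0,\ldots,\xi_\kappa\Bigl(\frac{b_\kappa t_l}{a_\kappa}\Bigr)=e_l\Bigr\}
&\ge\mathbb{P}\Bigl\{\xi_\kappa\Bigl(\frac{b_\kappa t_0}{a_\kappa}\Bigr)=e_0,\ldots,\xi_\kappa\Bigl(\frac{b_\kappa t_l}{a_\kappa}\Bigr)=e_l,\\
&\qquad d\Bigl(\xi_\kappa\Bigl(\frac{\lfloor b_\kappa t_0\rfloor}{a_\kappa}\Bigr),\xi_\kappa\Bigl(\frac{b_\kappa t_0}{a_\kappa}\Bigr)\Bigr)=\cdots=d\Bigl(\xi_\kappa\Bigl(\frac{\lfloor b_\kappa t_l\rfloor}{a_\kappa}\Bigr),\xi_\kappa\Bigl(\frac{b_\kappa t_l}{a_\kappa}\Bigr)\Bigr)=0\Bigr\}\\
&\ge\mathbb{P}\Bigl\{\xi_\kappa\Bigl(\frac{\lfloor b_\kappa t_0\rfloor}{a_\kappa}\Bigr)=e_0,\ldots,\xi_\kappa\Bigl(\frac{\lfloor b_\kappa t_l\rfloor}{a_\kappa}\Bigr)=e_l\Bigr\}-\varepsilon\\
&\ge\mathbb{P}\{\xi(t_0)=e_0,\ldots,\xi(t_l)=e_l\}-2\varepsilon.
\end{align*}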


This implies the convergence of the finite-dimensional distributions of $\bigl(\xi_\kappa\bigl(\tfrac{b_\kappa}{a_\kappa}t\bigr)\bigr)_{t\ge0}$ to the finite-dimensional distributions of $(\xi(t))_{t\ge0}$ in the Prohorov metric and hence weakly, which completes the proof. □

2.2 The ancient ancestral lines process (and other scaling limits)

Let us apply this machinery to the “ancestral lines process” introduced in Sect. 1.

Indeed, consider the block-counting process of the seed bank coalescent defined in

Definition 1.2 with vanishing migration rate c.

If we let $c\to0$ and simultaneously speed up time by a factor $1/c\to\infty$, we obtain a new structure given in Definition 1.4, thus uncovering a separation-of-timescales phenomenon. Theorem 1.5 formalises this heuristic and establishes the ancient ancestral lines process as scaling limit in finite-dimensional distributions of the block-counting process of the seed bank coalescent. Note that indeed $P$ is a projection matrix and $PG=GP=G$, for $P$ and $G$ as in Definition 1.4.

Proof of Theorem 1.5 Let $(c_\kappa)_{\kappa\in\mathbb{N}}$ be a positive sequence such that $c_\kappa\to0$. Without loss of generality assume $c_\kappa\le1$ for all $\kappa\in\mathbb{N}$. We prove the result using the machinery outlined in the previous section with $a_\kappa := c_\kappa^{-2}$ and $b_\kappa := c_\kappa^{-3}$.

Recall that $(N^{c_\kappa}(t),M^{c_\kappa}(t))_{t\ge0}$ is the block-counting process of the seed bank coalescent as defined in Definition 1.2 with migration rate $c_\kappa>0$ and assume that it starts at some $(n_0,m_0)\in\mathbb{N}\times\mathbb{N}$, $\mathbb{P}$-a.s. Let $\mathbb{N}_0$ be equipped with the discrete topology.

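Step (i) In analogy to the notation in the previous section we abbreviate

\[ (\xi_\kappa(t))_{t\ge0} := (N^{c_\kappa}(t),M^{c_\kappa}(t))_{t\ge0} \]

and consider a discretised process with time steps of length $a_\kappa^{-1}=c_\kappa^2$ by defining

\[ \eta_\kappa(i) := \xi_\kappa(ic_\kappa^2),\quad i\in\mathbb{N}_0. \]

Let $\Pi_\kappa$ be the transition matrix of the Markov chain $(\eta_\kappa(i))_{i\in\mathbb{N}_0}$. The transition probabilities of this chain are

\begin{align*}
(\Pi_\kappa)_{(n,m),(\bar n,\bar m)} &= \mathbb{P}\{\eta_\kappa(1)=(\bar n,\bar m)\mid\eta_\kappa(0)=(n,m)\}\\
&= \mathbb{P}\bigl\{(N^{c_\kappa}(c_\kappa^2),M^{c_\kappa}(c_\kappa^2))=(\bar n,\bar m)\mid(N^{c_\kappa}(0),M^{c_\kappa}(0))=(n,m)\bigr\}\\
&= \begin{cases}
\binom{n}{2}c_\kappa^2+o(c_\kappa^3), & \text{if } \bar n=n-1,\ \bar m=m,\\
c_\kappa n\,c_\kappa^2+o(c_\kappa^3), & \text{if } \bar n=n-1,\ \bar m=m+1,\\
c_\kappa Km\,c_\kappa^2+o(c_\kappa^3), & \text{if } \bar n=n+1,\ \bar m=m-1,\\
1-\binom{n}{2}c_\kappa^2-c_\kappa n\,c_\kappa^2-c_\kappa Km\,c_\kappa^2-o(c_\kappa^3), & \text{if } \bar n=n,\ \bar m=m,\\
o(c_\kappa^3), & \text{otherwise,}
\end{cases} \tag{14}
\end{align*}

for any sensible $(n,m),(\bar n,\bar m)\in E_{(n_0,m_0)}$, recalling the convention of $\binom{n}{2}=0$ for $n\le1$. This can be seen as follows.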


Denote by $T_1$ the time of the first jump of $\xi_\kappa$ and by $T_2$ the time between the first and the second jump of $\xi_\kappa$. By the strong Markov property we know that $T_1$ and $\xi_\kappa(T_1)$, as well as $T_1$ and $T_2$ are independent. Conditioning on $\xi_\kappa$ to start in $(n,m)$, we also know that $T_1$ follows an exponential distribution with parameter $\binom{n}{2}+c_\kappa n+c_\kappa Km$ and that $T_2$ dominates an exponential random variable with parameter $2\binom{n-1}{2}+\binom{n+1}{2}+c_\kappa(3n+1+3Km)$ (condition on the possible values of $\xi_\kappa(T_1)$, then take the minimum of the possible exponential random variables describing the waiting time to the next jump). Using this one can check that

\[ \mathbb{P}\{T_1+T_2\le c_\kappa^2\}\in o(c_\kappa^3). \tag{15} \]
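One possible way to carry out this check is sketched below. The shorthands $\lambda_1,\lambda_2$ are introduced here for illustration, and the argument assumes that the domination of $T_2$ is used in the form $\mathbb{P}\{T_2\le s\}\le 1-e^{-\lambda_2 s}$.

```latex
\begin{align*}
\lambda_1 &:= \binom{n}{2}+c_\kappa n+c_\kappa Km, \qquad
\lambda_2 := 2\binom{n-1}{2}+\binom{n+1}{2}+c_\kappa(3n+1+3Km),\\[2pt]
\mathbb{P}\{T_1+T_2\le c_\kappa^2\}
  &\le \mathbb{P}\{T_1\le c_\kappa^2,\ T_2\le c_\kappa^2\}
   = \mathbb{P}\{T_1\le c_\kappa^2\}\,\mathbb{P}\{T_2\le c_\kappa^2\}\\
  &\le \bigl(1-e^{-\lambda_1 c_\kappa^2}\bigr)\bigl(1-e^{-\lambda_2 c_\kappa^2}\bigr)
   \le \lambda_1\lambda_2\,c_\kappa^4 .
\end{align*}
```

Since $c_\kappa\le1$ and the state space is finite, $\lambda_1\lambda_2$ is bounded uniformly in $\kappa$ and in $(n,m)$, so the right-hand side is of order $c_\kappa^4$ and in particular in $o(c_\kappa^3)$.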

To calculate the transition probabilities in (14), note that (15) tells us that the probability of seeing more than one jump by $\xi_\kappa$ in the interval $[0,c_\kappa^2]$ is in $o(c_\kappa^3)$. In particular, this gives us the order of the transition probabilities for $\eta_\kappa$ to states summarised under “otherwise”, i.e. those that require more than one jump by $\xi_\kappa$. The transitions that are possible with just one jump are “coalescence”, “dormancy” and “resuscitation” in the order in which they appear in (14). We calculate the case of “coalescence”: Note that in order to see such a transition at least one jump must have happened. Hence,

\begin{align*}
\mathbb{P}\{\eta_\kappa(1)=(n-1,m)\mid\eta_\kappa(0)=(n,m)\}
&= \mathbb{P}\{\xi_\kappa(T_1)=(n-1,m),\ T_1\le c_\kappa^2,\ T_1+T_2>c_\kappa^2\mid\eta_\kappa(0)=(n,m)\}+o(c_\kappa^3)\\
&= \mathbb{P}\{\xi_\kappa(T_1)=(n-1,m),\ T_1\le c_\kappa^2\mid\eta_\kappa(0)=(n,m)\}+o(c_\kappa^3)\\
&= \frac{\binom{n}{2}}{\binom{n}{2}+c_\kappa(n+Km)}\bigl(1-\mathbb{P}\{T_1\ge c_\kappa^2\}\bigr)+o(c_\kappa^3)
 = \binom{n}{2}c_\kappa^2+o(c_\kappa^3),
\end{align*}

where we used (15) for the first two equalities, the independence of $\xi_\kappa(T_1)$ and $T_1$ for the third, and a Taylor expansion together with the distribution of $T_1$ for the fourth equality. The transition probabilities for “dormancy” and “resuscitation” can be calculated analogously. The transition probability to the same state the chain originated from then follows because each row of $\Pi_\kappa$ sums to one.

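With the representation in (14) we now obtain the decomposition as in (8)

\[ \Pi_\kappa = A_\kappa+\frac{B_\kappa}{b_\kappa} \]

with $b_\kappa=c_\kappa^{-3}$ as defined above and

\[ (A_\kappa)_{(n,m),(\bar n,\bar m)} = \begin{cases}
\binom{n}{2}c_\kappa^2, & \text{if } \bar n=n-1,\ \bar m=m,\\
1-\binom{n}{2}c_\kappa^2, & \text{if } \bar n=n,\ \bar m=m,\\
0, & \text{otherwise,}
\end{cases} \]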


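and

\[ (B_\kappa)_{(n,m),(\bar n,\bar m)} = \begin{cases}
n+o(1), & \text{if } \bar n=n-1,\ \bar m=m+1,\\
Km+o(1), & \text{if } \bar n=n+1,\ \bar m=m-1,\\
-n-Km+o(1), & \text{if } \bar n=n,\ \bar m=m,\\
o(1), & \text{otherwise.}
\end{cases} \tag{16} \]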

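In order to apply Lemma 2.1, we now need to check condition (9), i.e.

\[ \lim_{C\to\infty}\lim_{\kappa\to\infty}\sup_{r\ge Cc_\kappa^{-2}}\|(A_\kappa)^r-P\| = 0 \tag{17} \]

for $P$ given in (7). Since $A_\kappa$ is a stochastic matrix, let $(Z^\kappa_r)_{r\in\mathbb{N}_0}$ be the Markov chain associated to it. This is a pure death process in the first component and constant in the second. By definition of the matrix norm, we get

\begin{align*}
\|(A_\kappa)^r-P\| &= \max_{(n,m)\in E_{(n_0,m_0)}}\sum_{(\bar n,\bar m)\in E_{(n_0,m_0)}}\bigl|(A_\kappa)^r_{(n,m),(\bar n,\bar m)}-P_{(n,m),(\bar n,\bar m)}\bigr|\\
&= \max_{(n,m)\in E_{(n_0,m_0)}}\Bigl(\bigl|(A_\kappa)^r_{(n,m),(1,m)}-1\bigr|+\sum_{\bar n=2}^{n}\bigl|(A_\kappa)^r_{(n,m),(\bar n,m)}-0\bigr|\Bigr)\\
&= \max_{(n,m)\in E_{(n_0,m_0)}}2\bigl(1-(A_\kappa)^r_{(n,m),(1,m)}\bigr)\\
&= 2\max_{(n,m)\in E_{(n_0,m_0)}}\mathbb{P}\{Z^\kappa_r\ne(1,m)\mid Z^\kappa_0=(n,m)\}\\
&= 2\,\mathbb{P}\{Z^\kappa_r\ne(1,m_0)\mid Z^\kappa_0=(n_0,m_0)\}.
\end{align*}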

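Observe that for all $n\in\{2,\ldots,n_0\}$ (and all $m\in\{0,\ldots,m_0\}$) the probability of $Z^\kappa$ to jump to $(n-1,m)$ in the next step can be bounded:

\[ (A_\kappa)_{(n,m),(n-1,m)} = \binom{n}{2}c_\kappa^2\ge c_\kappa^2. \]

Hence, the number of time-steps required for $Z^\kappa$ to reach $(1,m_0)$ if it is started in $(n_0,m_0)$ is dominated by the sum of $n_0-1$ independent geometric random variables $\gamma^\kappa_1,\ldots,\gamma^\kappa_{n_0-1}$ with success probability $c_\kappa^2$. More precisely, if we define $T := \inf\{\bar r\in\mathbb{N}_0\mid Z^\kappa_{\bar r}=(1,m_0)\}$, then

\[ \mathbb{P}\{Z^\kappa_r\ne(1,m_0)\mid Z^\kappa_0=(n_0,m_0)\}\le\mathbb{P}\{T\ge r\mid Z^\kappa_0=(n_0,m_0)\}\le\mathbb{P}\{\gamma^\kappa_1+\cdots+\gamma^\kappa_{n_0-1}\ge r\}. \]

By Markov’s inequality, we get

\[ \mathbb{P}\{\gamma^\kappa_1+\cdots+\gamma^\kappa_{n_0-1}\ge r\}\le\frac{1}{r}\,\mathbb{E}\bigl[\gamma^\kappa_1+\cdots+\gamma^\kappa_{n_0-1}\bigr] = \frac{(n_0-1)\,c_\kappa^{-2}}{r}. \]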
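For $r\ge Cc_\kappa^{-2}$ the Markov bound above gives $\|(A_\kappa)^r-P\|\le 2(n_0-1)/C$. The following minimal sketch compares this bound with the exact distance for the pure death chain; it tracks only the first coordinate (the second is constant under $A_\kappa$), and the parameter values $n_0=5$, $c_\kappa=0.1$ as well as the function names are illustrative assumptions.

```python
import math
from math import comb

def death_chain(n0, c):
    """First-coordinate part of A_kappa on states 1..n0: n -> n-1 w.p. C(n,2) c^2."""
    A = [[0.0] * n0 for _ in range(n0)]
    for n in range(1, n0 + 1):
        p = comb(n, 2) * c ** 2
        A[n - 1][n - 1] = 1.0 - p
        if n >= 2:
            A[n - 1][n - 2] = p
    return A

def distance_to_projection(n0, c, r):
    """max over starts n of sum_j |(A^r)_{n,j} - P_{n,j}|, where P sends every n >= 1 to 1."""
    A = death_chain(n0, c)
    worst = 0.0
    for start in range(n0):
        v = [0.0] * n0
        v[start] = 1.0
        for _ in range(r):  # propagate the row vector r steps
            v = [sum(v[i] * A[i][j] for i in range(n0)) for j in range(n0)]
        worst = max(worst, (1.0 - v[0]) + sum(v[1:]))
    return worst

n0, c = 5, 0.1
results = {}
for C in (5, 20, 50):
    r = math.ceil(C / c ** 2)                 # smallest r with r >= C c^{-2}
    results[C] = (distance_to_projection(n0, c, r), 2 * (n0 - 1) / C)
    print(C, results[C])
```

The exact distance decays much faster than the bound, which is expected since Markov’s inequality is crude; for the limit in (17) only the decay in $C$ matters.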


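Combining these observations we obtain

\begin{align*}
\lim_{C\to\infty}\lim_{\kappa\to\infty}\sup_{r\ge Cc_\kappa^{-2}}\|(A_\kappa)^r-P\|
&\le \lim_{C\to\infty}\lim_{\kappa\to\infty}\sup_{r\ge Cc_\kappa^{-2}}2\,\mathbb{P}\{\gamma^\kappa_1+\cdots+\gamma^\kappa_{n_0-1}\ge r\}\\
&= \lim_{C\to\infty}\lim_{\kappa\to\infty}2\,\mathbb{P}\{\gamma^\kappa_1+\cdots+\gamma^\kappa_{n_0-1}\ge Cc_\kappa^{-2}\}\\
&\le \lim_{C\to\infty}\lim_{\kappa\to\infty}\frac{2(n_0-1)\,c_\kappa^{-2}}{Cc_\kappa^{-2}} = 0
\end{align*}

and (17) holds. We are now left to establish the matrix-norm limit (10) and show that it coincides with the $G$ given in Definition 1.4. Notice that $B_\kappa$ itself converges when $\kappa\to\infty$ uniformly and in the matrix norm (recalling that the state space $E_{(n_0,m_0)}$ is finite):

\[ B_{(n,m),(\bar n,\bar m)} := \lim_{\kappa\to\infty}(B_\kappa)_{(n,m),(\bar n,\bar m)} = \begin{cases}
n, & \text{if } \bar n=n-1,\ \bar m=m+1,\\
Km, & \text{if } \bar n=n+1,\ \bar m=m-1,\\
-n-Km, & \text{if } \bar n=n,\ \bar m=m,\\
0, & \text{otherwise.}
\end{cases} \]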

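Simply multiplying the matrices on the left-hand side we obtain $PBP=G$ and therefore

\[ \lim_{\kappa\to\infty}PB_\kappa P = PBP = G \tag{18} \]

and thus (10). Since we have proven the assumptions, Lemma 2.1 yields

\[ \lim_{\kappa\to\infty}\Pi_\kappa^{\lfloor tc_\kappa^{-3}\rfloor} = \lim_{\kappa\to\infty}\bigl(A_\kappa+c_\kappa^3B_\kappa\bigr)^{\lfloor tc_\kappa^{-3}\rfloor} = Pe^{tG} =: \Pi(t)\quad\text{for all } t>0, \]

and under the additional assumption that $\eta_\kappa(0)=(N^{c_\kappa}(0),M^{c_\kappa}(0))=(\tilde N(0),\tilde M(0))$, also

\[ \bigl(\eta_\kappa(\lfloor c_\kappa^{-3}t\rfloor)\bigr)_{t\ge0}\xrightarrow{\text{f.d.d.}}(\tilde N(t),\tilde M(t))_{t\ge0},\quad\kappa\to\infty, \]

where $(\tilde N(t),\tilde M(t))_{t\ge0}$ is the ancient ancestral lines process defined in Definition 1.4.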
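The matrix product $PBP$ in (18) can also be evaluated mechanically on a small state space. The sketch below makes several assumptions that should be checked against (7) and Definition 1.4, neither of which is reproduced here: $P$ is taken to send $(n,m)$ to $(1,m)$ for $n\ge1$ and to fix $(0,m)$ (as suggested by the norm computation above), the state space is cut down to $\{(n,m): n+m\le N\}$ so that $B$ maps it into itself, and `expected_G` encodes the entries one would expect from the structure of the limit. All names are hypothetical.

```python
def states(N):
    """Triangular state space {(n, m): n + m <= N} (an illustrative restriction)."""
    return [(n, m) for n in range(N + 1) for m in range(N + 1 - n)]

def projection(N):
    """Assumed form of P: (n, m) -> (1, m) for n >= 1, and (0, m) -> (0, m)."""
    return {s: {((1 if s[0] >= 1 else 0), s[1]): 1} for s in states(N)}

def limit_B(N, K):
    """The limit B of B_kappa: dormancy at rate n, resuscitation at rate K m."""
    B = {}
    for (n, m) in states(N):
        row = {(n, m): -n - K * m}
        if n >= 1:
            row[(n - 1, m + 1)] = n
        if m >= 1:
            row[(n + 1, m - 1)] = K * m
        B[(n, m)] = row
    return B

def mat_mul(X, Y):
    """Product of sparse matrices stored as {row: {column: value}}; zeros are dropped."""
    out = {}
    for s, row in X.items():
        acc = {}
        for u, a in row.items():
            for v, b in Y[u].items():
                acc[v] = acc.get(v, 0) + a * b
        out[s] = {v: w for v, w in acc.items() if w != 0}
    return out

def expected_G(N, K):
    """Entries one expects for P B P under the assumptions stated above."""
    G = {}
    for (n, m) in states(N):
        row = {}
        if n >= 1:
            row[(0, m + 1)] = 1
            row[(1, m)] = -1 - K * m
        else:
            row[(0, m)] = -K * m
        if m >= 1:
            row[(1, m - 1)] = K * m
        G[(n, m)] = {v: w for v, w in row.items() if w != 0}
    return G

N, K = 4, 2
P, B = projection(N), limit_B(N, K)
G = mat_mul(mat_mul(P, B), P)
print(G[(3, 1)])
```

With these conventions the product also satisfies $P^2=P$ and $PG=GP=G$, in line with Remark 2.2.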

Step (ii) We would now like to apply Lemma 2.3. Denote by $Q^{c_\kappa}$ the Q-matrix of the process $(\xi_\kappa(t))_{t\ge0}$ as given in Definition 1.2. We can estimate

\[ q_\kappa := \max_{(n,m)\in E_{(n_0,m_0)}}-(Q^{c_\kappa})_{(n,m),(n,m)}\le\binom{n_0+m_0}{2}+c_\kappa(n_0+m_0)+c_\kappa K(n_0+m_0). \]


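As we can see, condition (a) of Lemma 2.3 holds with

\[ \frac{q_\kappa}{a_\kappa}\le\frac{\binom{n_0+m_0}{2}+c_\kappa(n_0+m_0)+c_\kappa K(n_0+m_0)}{c_\kappa^{-2}}\longrightarrow0,\quad\kappa\to\infty. \]

Condition (b) was proven in Step (i). Therefore we may conclude

\[ \bigl(N^{c_\kappa}(c_\kappa^{-1}t),M^{c_\kappa}(c_\kappa^{-1}t)\bigr)_{t\ge0} = \Bigl(\xi_\kappa\Bigl(\frac{c_\kappa^{-3}}{c_\kappa^{-2}}t\Bigr)\Bigr)_{t\ge0}\xrightarrow{\text{f.d.d.}}\bigl(\tilde N(t),\tilde M(t)\bigr)_{t\ge0} \]

when $\kappa\to\infty$ and the proof of Theorem 1.5 is complete. □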

Remark 2.4 (Imbalanced Island Size) It is straightforward to pursue the same consideration for the two-island model and its structured coalescent (Herbots 1994; Notohara 1990). The two-island model considers two populations much like the seed bank model, but allows for coalescence in the second population. Its genealogy is then given by the structured coalescent, whose block-counting process allows for the same transition rates described in (3) adding $r_{(n,m),(\bar n,\bar m)}=\binom{m}{2}$ for $\bar n=n$ and $\bar m=m-1$, i.e. coalescence in the second island (and adapting the diagonal entries accordingly). Letting the migration rate converge $c\to0$ while speeding up time by $1/c\to\infty$ as we have done for the block-counting process of the seed bank coalescent above will lead to a structure with instantaneous coalescences in both islands, leaving us with a single line migrating between them.

In this set-up it is much more interesting to consider a two-island model with different scalings of the coalescence rates in the islands. In order to do this, we introduce the parameters $\alpha$ and $\alpha'$ such that the Q-matrix of the block-counting process of the structured coalescent now is

\[ R_{(n,m),(\bar n,\bar m)} = \begin{cases}
\alpha\binom{n}{2}, & \text{if } (\bar n,\bar m)=(n-1,m),\\
\alpha'\binom{m}{2}, & \text{if } (\bar n,\bar m)=(n,m-1),\\
cn, & \text{if } (\bar n,\bar m)=(n-1,m+1),\\
cKm, & \text{if } (\bar n,\bar m)=(n+1,m-1),\\
-\alpha\binom{n}{2}-\alpha'\binom{m}{2}-cn-cKm, & \text{if } (\bar n,\bar m)=(n,m),\\
0, & \text{otherwise.}
\end{cases} \tag{19} \]

$\alpha$ and $\alpha'$ are associated with the notion of effective population size (cf. e.g. Wakeley 2009) so a different scaling corresponds to a significant difference in population size on the two islands. If, in addition to $c\to0$, we assume the coalescence rate $\alpha'=\alpha'(c)>0$ in the second island to scale as $c$, i.e. $\alpha'/c\to1$, the result is a two-island model with instantaneous coalescences in the first island, but otherwise ‘normal’ migration and coalescence behaviour in the second.

In order to formalise this heuristic observation, denote by $(N^{c,\alpha'}(t),M^{c,\alpha'}(t))_{t\ge0}$ the block-counting process of the structured coalescent as defined by the rates in (19) with migration rate $c>0$ and coalescence rate $\alpha'>0$ in the second island and assume


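that it starts at some $(n_0,m_0)\in\mathbb{N}\times\mathbb{N}$, $\mathbb{P}$-a.s. (The parameters $\alpha,K>0$ are arbitrary but fixed.)

Define $(\hat N(t),\hat M(t))_{t\ge0}$ to be the continuous-time Markov chain with initial value $(\hat N(0),\hat M(0))=(n_0,m_0)$, taking values in the state space $E_{(n_0,m_0)} := \{0,\ldots,n_0+m_0\}^2$, with transition matrix $\Pi(t) := Pe^{t\hat G}$, for $t>0$ and $\Pi(0)=\mathrm{Id}_E$, where $P$ is given by (7) (as in the case of seed banks) and $\hat G$ is now a matrix of the form

\[ \hat G_{(n,m),(\bar n,\bar m)} := \begin{cases}
Km+\binom{m}{2}, & \text{if } \bar n=1,\ n\ge1,\ \bar m=m-1,\\
Km, & \text{if } \bar n=1,\ n=0,\ \bar m=m-1,\\
\binom{m}{2}, & \text{if } \bar n=0,\ n=0,\ \bar m=m-1,\\
1, & \text{if } \bar n=0,\ n\ge1,\ \bar m=m+1,\\
-\binom{m}{2}-1-Km, & \text{if } \bar n=1,\ n\ge1,\ \bar m=m,\\
-\binom{m}{2}-Km, & \text{if } \bar n=n=0,\ \bar m=m,\\
0, & \text{otherwise.}
\end{cases} \]

Then, for any sequence of migration rates $(c_\kappa)_{\kappa\in\mathbb{N}}$ and any sequence of coalescence rates $(\alpha'_\kappa)_{\kappa\in\mathbb{N}}$ with $c_\kappa\to0$ and $c_\kappa/\alpha'_\kappa\to1$ when $\kappa\to\infty$

\[ \Bigl(N^{c_\kappa,\alpha'_\kappa}\Bigl(\frac{1}{c_\kappa}t\Bigr),M^{c_\kappa,\alpha'_\kappa}\Bigl(\frac{1}{c_\kappa}t\Bigr)\Bigr)_{t\ge0}\xrightarrow{\text{f.d.d.}}\bigl(\hat N(t),\hat M(t)\bigr)_{t\ge0},\quad\kappa\to\infty. \]

This observation for the two-island model is analogous to Theorem 1.5 for seed banks. Its proof is a close parallel to that of Theorem 1.5. Considering, again, the sequences $a_\kappa := c_\kappa^{-2}$ and $b_\kappa := c_\kappa^{-3}$, $A_\kappa$ and $P$ coincide with those in the proof of Theorem 1.5, hence the hardest work has already been done. Small alterations to $B_\kappa$ immediately yield the result and we therefore omit any further details.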

3 Scaling limits for the diffusion

We would now also like to observe similar scaling limits for the diffusion (2). As we

saw in the case of Markov chains, rescaling time may lead to a limiting process that

is still Markovian, but whose semi-group is not standard, i.e. not continuous in 0. We

can use moment duality to obtain this limit.

3.1 Convergence of the finite-dimensional distributions obtained from duality

We present a method to obtain convergence in finite-dimensional distributions of a sequence of Markov processes using moment duality and the convergence in finite-dimensional distributions of the dual processes. The result does not depend on whether time is rescaled, too, or not. It is, however, of particular interest in the rescaled case, since it might lead to the identification of limiting objects which are rather “ill-behaved”. Indeed, we will see examples in Sect. 3.2 where the limit does not have a generator with


a sufficiently large domain and hence the common approach of proving convergence

through generator convergence fails.

For any tuples $n := (n_1,\ldots,n_d)\in\mathbb{N}_0^d$ and $x := (x_1,\ldots,x_d)\in[0,1]^d$, define the mixed-moment function $m$ as $m(x,n) := x_1^{n_1}\cdots x_d^{n_d}$.

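Theorem 3.1 Let $(\zeta_\kappa(t))_{t\ge0}$, $\kappa\in\mathbb{N}_0$, be a sequence of Feller processes taking values in $[0,1]^d$ (for some $d\in\mathbb{N}$), and $(\xi_\kappa(t))_{t\ge0}$, $\kappa\in\mathbb{N}_0$, a sequence of Markov chains with values in $\mathbb{N}_0^d$ such that they are pairwise moment duals, i.e.

\[ \forall\kappa\in\mathbb{N}_0,\ \forall t\ge0,\ \forall x\in[0,1]^d,\ n\in\mathbb{N}_0^d:\quad\mathbb{E}_n[m(x,\xi_\kappa(t))] = \mathbb{E}_x[m(\zeta_\kappa(t),n)]. \]

As usual, $\mathbb{P}_n$ and $\mathbb{P}_x$ denote the distributions for which $\xi$ and $\zeta$ start in $n$ and $x$, respectively.

If $(\xi_\kappa)_{\kappa\in\mathbb{N}_0}$ converges to some Markov chain $\xi$ in the f.d.d.-sense, then there exists a Markov process $\zeta$ with values in $[0,1]^d$ such that it is the f.d.d.-limit of $(\zeta_\kappa)_{\kappa\in\mathbb{N}_0}$ and the moment dual to $\xi$, i.e.

\[ \forall t\ge0,\ \forall x\in[0,1]^d,\ n\in\mathbb{N}_0^d:\quad\mathbb{E}_n[m(x,\xi(t))] = \mathbb{E}_x[m(\zeta(t),n)]. \tag{20} \]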

Remark 3.2 At first glance one might suspect that this result should also hold in a more general set-up as long as the employed duality function yields convergence determining families for the respective semi-groups. Indeed, most of the steps of the proof would still go through. However, note that we did not assume existence of a limiting Markov process beforehand. We can conclude this by the solvability of Hausdorff’s moment problem on $[0,1]^d$ (Hildebrandt and Schoenberg 1933), which precisely treats the existence (and uniqueness) of a distribution with a given sequence of moments and therefore “matches” the moment duality function in our theorem.

Proof of Theorem 3.1 The proof can roughly be split into three steps: We first use duality to prove the convergence of the one-dimensional distributions of $(\zeta_\kappa)_{\kappa\in\mathbb{N}_0}$. This, together with the Markov property will give us the convergence of the finite-dimensional distributions of $(\zeta_\kappa)_{\kappa\in\mathbb{N}_0}$ to a family of limiting distributions. Then we prove consistency of this family of distributions and hence by Kolmogorov’s Extension Theorem the existence of a limiting process $\zeta$, which must then be Markovian.

Since the mixed-moment function $m$ is continuous and bounded as a function on $\mathbb{N}_0^d$, the convergence of the finite-dimensional distributions of $(\xi_\kappa)_{\kappa\in\mathbb{N}}$ and the assumed moment duality yield

\[ \mathbb{E}_x[m(\zeta_\kappa(t),n)] = \mathbb{E}_n[m(x,\xi_\kappa(t))]\xrightarrow{\kappa\to\infty}\mathbb{E}_n[m(x,\xi(t))] =: \gamma(n,x,t) \tag{21} \]

for any $t\ge0$, $x\in[0,1]^d$ and $n\in\mathbb{N}_0^d$. For fixed $x\in[0,1]^d$ and $t\ge0$ this is a monotonic sequence, i.e.

\[ \forall n\in\mathbb{N}_0^d,\ \forall k_1\le n_1,\ldots,k_d\le n_d:\quad\nabla_1^{k_1}\cdots\nabla_d^{k_d}\gamma(n,x,t)\ge0, \]


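where $\nabla_i\gamma(n,x,t) := \gamma((n_1,\ldots,n_{i-1},n_i-1,n_{i+1},\ldots,n_d),x,t)-\gamma(n,x,t)$ is the difference operator acting on the $i$th component of $n$, $i=1,\ldots,d$. This can be seen from (21): writing $k := (k_1,\ldots,k_d)$,

\begin{align*}
\nabla_1^{k_1}\cdots\nabla_d^{k_d}\gamma(n,x,t) &= \lim_{\kappa\to\infty}\nabla_1^{k_1}\cdots\nabla_d^{k_d}\,\mathbb{E}_n[m(x,\xi_\kappa(t))]\\
&= \lim_{\kappa\to\infty}\nabla_1^{k_1}\cdots\nabla_d^{k_d}\,\mathbb{E}_x[m(\zeta_\kappa(t),n)]\\
&= \lim_{\kappa\to\infty}\mathbb{E}_x\bigl[m(\zeta_\kappa(t),n-k)\,(1-\zeta_\kappa^1(t))^{k_1}\cdots(1-\zeta_\kappa^d(t))^{k_d}\bigr]\ge0. \tag{22}
\end{align*}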
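The monotonicity property behind (22) can be illustrated in one dimension with exact arithmetic. The sketch below uses the moments $\gamma(n)=1/(n+1)$ of the uniform distribution on $[0,1]$, an example chosen here for illustration and not taken from the paper. With the difference operator as defined above, $\nabla\gamma(n)=\gamma(n-1)-\gamma(n)$, iterating $k\le n$ times should reproduce $\mathbb{E}[X^{n-k}(1-X)^k]$, which for a uniform $X$ is the Beta integral $(n-k)!\,k!/(n+1)!$.

```python
from fractions import Fraction
from math import factorial

def gamma(n):
    """n-th moment of the uniform distribution on [0, 1] (illustrative choice)."""
    return Fraction(1, n + 1)

def nabla(seq):
    """Difference operator from the text: (nabla seq)(n) = seq(n - 1) - seq(n)."""
    return lambda n: seq(n - 1) - seq(n)

def iterated_nabla(seq, k):
    """Apply the difference operator k times."""
    for _ in range(k):
        seq = nabla(seq)
    return seq

def beta_integral(n, k):
    """E[X^(n-k) (1-X)^k] for X uniform on [0, 1], i.e. (n-k)! k! / (n+1)!."""
    return Fraction(factorial(n - k) * factorial(k), factorial(n + 1))

# Tabulate nabla^k gamma(n) for all admissible pairs k <= n <= 8.
table = {(n, k): iterated_nabla(gamma, k)(n) for n in range(9) for k in range(n + 1)}
print(table[(4, 2)], beta_integral(4, 2))
```

Non-negativity of all these iterated differences is exactly the condition under which the Hausdorff moment problem is solvable, which is how the limiting distribution is obtained in the next step.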

Hence, the Hausdorff moment problem for $(\gamma(n,x,t))_{n\in\mathbb{N}_0^d}$ is solvable according to Theorem 1 in Hildebrandt and Schoenberg (1933), which means that there exists a measure $\mu_{x,t}$ on $([0,1]^d,\mathcal{B}([0,1]^d))$ (where $\mathcal{B}$ is the Borel-$\sigma$-algebra) such that

\[ \forall n\in\mathbb{N}_0^d:\quad\gamma(n,x,t) = \int_{[0,1]^d}m(\bar x,n)\,\mathrm{d}\mu_{x,t}(\bar x). \]

In particular, this holds for $n=(0,\ldots,0)$, hence $\mu_{x,t}([0,1]^d)=\mathbb{E}_0[1]=1$ and $\mu_{x,t}$ is therefore a distribution. Since the polynomials are dense in the continuous functions, (21) implies the convergence of the one-dimensional distributions to $(\mu_{x,t})_{t\ge0}$ (for each starting point $x\in[0,1]^d$).

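To check the convergence in finite-dimensional distributions, let us first make a general observation regarding weak convergence. Let $P_\kappa$, $\kappa\in\mathbb{N}$, and $P$ be distributions on $([0,1]^d,\mathcal{B}([0,1]^d))$ such that the $P_\kappa$ converge weakly to $P$. Furthermore, let $f_\kappa,f:[0,1]^d\to\mathbb{R}$, $\kappa\in\mathbb{N}$, be continuous such that $f_\kappa(x)$ is uniformly bounded in $\kappa$ and $x$ and converges to $f$ pointwise (and therefore uniformly). Then we can estimate

\begin{align*}
\Bigl|\int_{[0,1]^d}f_\kappa(x)\,P_\kappa(\mathrm{d}x)-\int_{[0,1]^d}f(x)\,P(\mathrm{d}x)\Bigr|
&\le\max_{x\in[0,1]^d}|f_\kappa(x)-f(x)|+\Bigl|\int_{[0,1]^d}f(x)\,P_\kappa(\mathrm{d}x)-\int_{[0,1]^d}f(x)\,P(\mathrm{d}x)\Bigr|\\
&\xrightarrow{\kappa\to\infty}0. \tag{23}
\end{align*}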

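Returning to the task at hand, let $P_\kappa$ be the probability transition function of $\zeta_\kappa$ and recall that we assumed the $\zeta_\kappa$ to be Feller, hence $x\mapsto\int_{[0,1]^d}f(\bar x)\,P_\kappa(x,t,\mathrm{d}\bar x)$ is continuous and bounded by 1 for any $f$ continuous and bounded by 1. For $0\le t_1<\ldots<t_l<\infty$, $x\in[0,1]^d$ and $n^{(1)},\ldots,n^{(l)}\in\mathbb{N}_0^d$ then observe

\begin{align*}
&\mathbb{E}_x\bigl[m(\zeta_\kappa(t_1),n^{(1)})\cdots m(\zeta_\kappa(t_l),n^{(l)})\bigr]\\
&\quad=\int_{[0,1]^d}\cdots\int_{[0,1]^d}m(\bar x^{(1)},n^{(1)})\cdots m(\bar x^{(l)},n^{(l)})\,P_\kappa(\bar x^{(l-1)},t_l-t_{l-1},\mathrm{d}\bar x^{(l)})\cdots P_\kappa(x,t_1,\mathrm{d}\bar x^{(1)})\\
&\quad=\int_{[0,1]^d}m(\bar x^{(1)},n^{(1)})\cdots\int_{[0,1]^d}m(\bar x^{(l)},n^{(l)})\,P_\kappa(\bar x^{(l-1)},t_l-t_{l-1},\mathrm{d}\bar x^{(l)})\cdots P_\kappa(x,t_1,\mathrm{d}\bar x^{(1)})\\
&\quad\xrightarrow{\kappa\to\infty}\gamma(n^{(1)},\ldots,n^{(l)},x,t_1,\ldots,t_l). \tag{24}
\end{align*}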

Here we used the Markov property of $\zeta_\kappa$ in the first equality. For the convergence to some constant $\gamma(n^{(1)},\ldots,n^{(l)},x,t_1,\ldots,t_l)$ we used the convergence of the finite-dimensional distributions shown above together with the observation that $x\mapsto m(x,n)$ is continuous and bounded by 1 (on $[0,1]^d$) and a recursive application of (23).

By the same argument as in (22), for fixed $x\in[0,1]^d$ and $t\ge0$, $\gamma(n^{(1)},\ldots,n^{(l)},x,t_1,\ldots,t_l)$, $(n^{(1)},\ldots,n^{(l)})\in(\mathbb{N}_0^d)^l$, is a monotonic sequence and Theorem 1 in Hildebrandt and Schoenberg (1933) yields the existence of a distribution $\mu_{I,x}$ on $(([0,1]^d)^I,\mathcal{B}([0,1]^d)^{\otimes I})$ for any finite set of indices $I=\{t_1,\ldots,t_l\}\subset[0,\infty)$ and starting point $x\in[0,1]^d$. In addition, (24) implies the convergence of the finite-dimensional distributions of $(\zeta_\kappa)_{\kappa\in\mathbb{N}}$ to a respective $\mu_{I,x}$. Since these $\mu_{I,x}$ are the limits of a consistent family they are themselves consistent and according to Kolmogorov’s Extension Theorem there exists a unique measure $\mu_x$ on the product space $(([0,1]^d)^{[0,\infty)},\mathcal{B}([0,1]^d)^{\otimes[0,\infty)})$ which is the distribution of the desired process $\zeta$. This is a Markov process, because the $(\zeta_\kappa)_{\kappa\in\mathbb{N}_0}$ are Markov processes.

The duality of $\zeta$ and $\xi$ follows from the duality of the prelimiting processes. □

3.2 Ancient ancestral material scaling regime

As an application of Theorem 3.1 we consider the diffusion (2) with the scaling regime of Sect. 2.2, namely, with the migration rate $c\to0$ while simultaneously speeding up time by a factor $1/c\to\infty$ and obtain Theorem 1.3 stating the convergence of the rescaled diffusions to a Markovian limit $(\tilde X(t),\tilde Y(t))_{t\ge0}$.

Theorem 3.3 Let $(X^c(t),Y^c(t))_{t\ge0}$ be the seed bank diffusion given in Definition 1.1 with migration rate $c>0$. Assume that the initial distributions $(X^c(0),Y^c(0))$ converge weakly to a $(x,y)\in[0,1]^2$ as $c\to0$. Then, there exists a Markov process $(\tilde X(t),\tilde Y(t))_{t\ge0}$, started in $(\tilde X(0),\tilde Y(0))=(x,y)$ with the property that for any sequence of migration rates with $c_\kappa\to0$ when $\kappa\to\infty$,

\[ \Bigl(X^{c_\kappa}\Bigl(\frac{1}{c_\kappa}t\Bigr),Y^{c_\kappa}\Bigl(\frac{1}{c_\kappa}t\Bigr)\Bigr)_{t\ge0}\xrightarrow{\text{f.d.d.}}(\tilde X(t),\tilde Y(t))_{t\ge0}\quad\text{as } \kappa\to\infty \]

and $(\tilde X(t),\tilde Y(t))_{t\ge0}$ is the moment dual of $(\tilde N(t),\tilde M(t))_{t\ge0}$ given in Definition 1.4.

Note that $(\tilde X(t),\tilde Y(t))_{t\ge0}$ makes sense for more general initial conditions in $[0,1]^2$. In any case, the limiting process $(\tilde X(t),\tilde Y(t))_{t\ge0}$ instantaneously jumps into the smaller state space $\{0,1\}\times[0,1]$ at time $0+$. The jump probabilities to 0 and 1 are the fixation probabilities of the ordinary Wright-Fisher diffusion. This corresponds to an instantaneous application of a projection operator $\tilde P$ defined as the limit (in a suitable sense)

\[ \tilde P := \lim_{t\to\infty}\tilde P_t, \]


where $(\tilde P_t)_{t\ge0}$ is the semi-group associated to the classical Wright-Fisher diffusion, cf. Kurtz (1973) (or Bobrowski 2015, Equation (3)). Intuitively, this can be explained as follows: In the regime where dormancy duration is significantly larger than the effect of genetic drift, the population evolves according to a Wright-Fisher diffusion without dormancy and has the chance to be absorbed in 0 or 1, before ever seeing a resuscitation/migration into the population from the seed bank. Hence, on the super-evolutionary time-scale the probabilities to immediately jump to 0 or 1 are precisely given by the corresponding fixation probabilities of the Wright-Fisher diffusion.

Remark 3.4 (Convergence on path space?) Once convergence of the finite-dimensional distributions is established in Theorem 1.3, it is natural (at least for mathematicians) to ask whether it is possible to prove tightness on the space of càdlàg paths in order to obtain weak convergence. However, since the set of continuous paths forms a closed subset of the càdlàg paths in the classical Skorohod (J1) topology (cf. Skorohod (1956)), and the solutions to our pre-limiting seed bank diffusions are continuous, convergence in the above topologies would predict a limit with continuous paths, which we know not to be correct at least in 0. This makes weak convergence on path space impossible. However, the set of jump times of the above process is finite on finite time intervals, and in particular has Lebesgue-measure zero, so that we expect that convergence is true in weaker topologies, such as the Meyer-Zheng topology corresponding to convergence in measure (Meyer and Zheng 1984; Kurtz 1991). However, we refrain from going into these technicalities here, which we consider to be outside the scope of this manuscript.

Remark 3.5 Remark 3.2 in Shiga and Shimizu (1980) implies that the unique strong solution to the SDE (2), which is the seed bank diffusion from Definition 1.1, is a Feller process. This is considered in more generality in Theorem 2.4 in Greven et al. (2020).

Proof of Theorem 3.3 Since the $(X^{c_\kappa}(t/c_\kappa),Y^{c_\kappa}(t/c_\kappa))_{t\ge0}$ are constant time-changes of the seed bank diffusion introduced in Definition 1.1, they are Feller, as well. Since the moment duality of the block-counting process of the seed bank coalescent and the seed bank diffusion (4) holds for every time $t\ge0$, it is preserved for the time-changed processes $(N^{c_\kappa}(t/c_\kappa),M^{c_\kappa}(t/c_\kappa))_{t\ge0}$ and $(X^{c_\kappa}(t/c_\kappa),Y^{c_\kappa}(t/c_\kappa))_{t\ge0}$. Together with Theorem 1.5 all assumptions of Theorem 3.1 hold and we get the existence of a Markov process $(\tilde X(t),\tilde Y(t))_{t\ge0}$ that is the dual of $(\tilde N(t),\tilde M(t))_{t\ge0}$. By the uniqueness of the solution to the Hausdorff moment problem (Theorem 2 in Hildebrandt and Schoenberg 1933) a distribution on $[0,1]^2$ is uniquely determined by all its mixed moments. The moment duality of the limit with a process that does not depend on the scaling sequence $(c_\kappa)_{\kappa\in\mathbb{N}_0}$ therefore implies that the one-dimensional distributions of the limit do not depend on the choice of scaling sequence, either. Since the limit is a Markov process the one-dimensional distributions uniquely determine its entire distribution. Hence, the distribution of the limit does not depend on the choice of scaling sequence $(c_\kappa)_{\kappa\in\mathbb{N}_0}$.

So far we have characterised the process $(\tilde X(t),\tilde Y(t))_{t\ge0}$ only as the moment dual of the continuous-time Markov chain $(\tilde N(t),\tilde M(t))_{t\ge0}$ whose semi-group we could


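give explicitly in Definition 1.4. We now use this characterisation to better understand the process $(\tilde X(t),\tilde Y(t))_{t\ge0}$ itself. More precisely, since (20) holds in particular for $t>0$, $m=0$ and any $n\ge1$, $x,y\in[0,1]$ we see

\begin{align*}
\mathbb{E}_{x,y}\bigl[\tilde X(t)^n\,\tilde Y(t)^0\bigr] &= \mathbb{E}_{n,0}\bigl[x^{\tilde N(t)}y^{\tilde M(t)}\bigr]\\
&= x\,\mathbb{P}_{n,0}(\tilde N(t)=1,\tilde M(t)=0)+y\,\mathbb{P}_{n,0}(\tilde N(t)=0,\tilde M(t)=1)\\
&= x\,(Pe^{tG})_{(n,0),(1,0)}+y\,(Pe^{tG})_{(n,0),(0,1)} = x\,(e^{tG})_{(1,0),(1,0)}+y\,(e^{tG})_{(1,0),(0,1)}. \tag{25}
\end{align*}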

We used the fact that the first component of the ancient ancestral lines process $(\tilde N(t),\tilde M(t))_{t\ge0}$ takes values in $\{0,1\}$ for any $t>0$ in the second equality and the definition of the projection in the last equality. Since the right-hand side does not depend on $n\ge1$, we can conclude that

\[ \tilde X(t)\in\{0,1\}\quad\mathbb{P}_{x,y}\text{-a.s. for any } t>0 \text{ and any } (x,y)\in[0,1]^2. \tag{26} \]

We can use this observation together with (25) to obtain

\[ \lim_{t\downarrow0}\mathbb{P}_{x,y}\{\tilde X(t)=1\} = \lim_{t\downarrow0}\mathbb{E}_{x,y}[\tilde X(t)^n] = x\,(I_{\mathbb{N}_0^2})_{(1,0),(1,0)}+y\,(I_{\mathbb{N}_0^2})_{(1,0),(0,1)} = x. \tag{27} \]

(Here $I_{\mathbb{N}_0^2}$ is the identity matrix on $\mathbb{N}_0^2$.)
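The right-hand side of (25) can be made explicit. The following worked computation is a sketch under the assumption that, on the two states $(1,0)$ and $(0,1)$, the entries of $G$ agree with those of the Q-matrix in Definition 3.6 below (rate 1 from $(1,0)$ to $(0,1)$ and rate $K$ back); started in $(1,0)$ the chain never leaves these two states, so $e^{tG}$ reduces to the exponential of a $2\times2$ matrix.

```latex
\begin{align*}
G\big|_{\{(1,0),(0,1)\}} &= \begin{pmatrix} -1 & 1\\ K & -K \end{pmatrix},\\[4pt]
(e^{tG})_{(1,0),(1,0)} &= \frac{K + e^{-(1+K)t}}{1+K}, \qquad
(e^{tG})_{(1,0),(0,1)} = \frac{1 - e^{-(1+K)t}}{1+K},\\[4pt]
\mathbb{P}_{x,y}\{\tilde X(t)=1\} &= x\,\frac{K + e^{-(1+K)t}}{1+K} + y\,\frac{1 - e^{-(1+K)t}}{1+K}
\quad\text{for } t>0 .
\end{align*}
```

Letting $t\downarrow0$ recovers the value $x$ from (27), while for $t\to\infty$ the expression tends to $(Kx+y)/(1+K)$.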

This small observation has an important consequence: Much like in the case of its dual $(\tilde N(t),\tilde M(t))_{t\ge0}$, the semi-group of the ancient ancestral material process $(\tilde X(t),\tilde Y(t))_{t\ge0}$ is not right-continuous in 0.

Intuitively, the reproduction mechanism (in the active population) acts so fast, that

fixation (or extinction) in the active population happens instantaneously. Whenever

there is an invasion from the seed bank, the chances that this is by an individual of

the type extinct in the active population (thereby causing a change of type here) are

given by the frequency of said type in the dormant population. The limit is thus a pure

jump process in the active component that moves between the states 0 and 1 at rates

proportional to the frequency in the dormant population of the allele that is extinct in

the active population, while the seed bank component retains its classical behaviour.

We can formalise this observation if we restrict the process to the smaller state space

{0,1}×[0,1], see Proposition 3.8 below.

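Definition 3.6 Let $(\bar N(t),\bar M(t))_{t\ge0}$ be the Markov chain on $\{0,1\}\times\mathbb{N}_0$ given by the Q-matrix

\[ \bar G_{(n,m),(\bar n,\bar m)} = \begin{cases}
Km, & \text{if } \bar n=1,\ n\in\{0,1\},\ \bar m=m-1,\\
1, & \text{if } \bar n=0,\ n=1,\ \bar m=m+1,\\
-n-Km, & \text{if } \bar n=n,\ \bar m=m,\\
0, & \text{otherwise,}
\end{cases} \]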

123

53 Page 26 of 34 J. Blath et al.

for any $(n,m),(\bar n,\bar m)\in\{0,1\}\times\mathbb{N}_0$.

Furthermore, let $(\bar X(t),\bar Y(t))_{t\ge0}$ be the Markov process on $\{0,1\}\times[0,1]$ defined by the generator given in (6).

Proposition 3.7 $(\bar X(t),\bar Y(t))_{t\ge0}$ is well-defined, i.e. the closure of $\bar A$ given in (6) is indeed the generator of a Markov process and this process is Feller. Furthermore, we may assume that $(\bar X(t),\bar Y(t))_{t\ge0}$ is càdlàg on $[0,\infty)$.

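Proof Define $E := \{0,1\}\times[0,1]$ and

\begin{align*}
C &:= C(E,\mathbb{R}) = \{f:E\to\mathbb{R}\mid f(0,\cdot),f(1,\cdot):[0,1]\to\mathbb{R}\text{ are continuous}\},\\
D_{\bar A} &:= C^1 := \{f:E\to\mathbb{R}\mid f(0,\cdot),f(1,\cdot):[0,1]\to\mathbb{R}\text{ are differentiable}\}.
\end{align*}

We verify the conditions of the Hille–Yosida Theorem, cf. Theorem 19.11 in Kallenberg (2002), for $(\bar A,D_{\bar A})$, where $\bar A$ is given in (6). First note that

\[ \{f:E\to\mathbb{R}\mid f(0,\cdot),f(1,\cdot):[0,1]\to\mathbb{R}\text{ are polynomials}\}\subset D_{\bar A}, \]

hence $D_{\bar A}$ is dense in $C$. In order to verify the maximum principle choose an arbitrary $f\in D_{\bar A}$ and let $(\bar x,\bar y)\in E$ be such that $f(\bar x,\bar y)\ge f(x,y)\vee0$ for all $(x,y)\in E$. Then

\[ \bar Af(\bar x,\bar y) = (1-\bar x)\bar y\bigl(f(1,\bar y)-f(\bar x,\bar y)\bigr)+\bar x(1-\bar y)\bigl(f(0,\bar y)-f(\bar x,\bar y)\bigr)+K(\bar x-\bar y)\frac{\partial f}{\partial y}(\bar x,\bar y). \]

Since we assumed $f$ to have a maximum in $(\bar x,\bar y)$, the first two summands are non-positive. If $\bar y\in(0,1)$, a maximum in $(\bar x,\bar y)$ implies $\frac{\partial f}{\partial y}(\bar x,\bar y)=0$. If $\bar y=0$, a maximum in $(\bar x,\bar y)$ implies $\frac{\partial f}{\partial y}(\bar x,\bar y)\le0$ and therefore $(\bar x-0)\frac{\partial f}{\partial y}(\bar x,\bar y)\le0$. Likewise, if $\bar y=1$, a maximum in $(\bar x,\bar y)$ implies $\frac{\partial f}{\partial y}(\bar x,\bar y)\ge0$ and therefore $(\bar x-1)\frac{\partial f}{\partial y}(\bar x,\bar y)\le0$. Hence, $\bar Af(\bar x,\bar y)\le0$ and the maximum principle holds. We are left to prove that there exists a $\lambda>0$ such that $(\lambda-\bar A)D_{\bar A}$ is dense in $C$. First, observe that $f\in C$ if and only if it can be written in the form $f(x,y)=(1-x)f_0(y)+xf_1(y)$, where $f_0(\cdot),f_1(\cdot):[0,1]\to\mathbb{R}$ are continuous. Since the polynomials are dense in the continuous functions on $[0,1]$ and $\bar A$ is a linear operator, it suffices to show that for any $r\in\mathbb{N}_0$ we can find $f^{(r)},\tilde f^{(r)}\in D_{\bar A}$ such that $(\lambda-\bar A)f^{(r)}(x,y)=(1-x)y^r$ and $(\lambda-\bar A)\tilde f^{(r)}(x,y)=xy^r$. In an intuitive abuse of notation, we will in the following denote maps of the form $(x,y)\mapsto xy^r$ by $xy^r$ and likewise for $(1-x)y^r$. We begin by calculating, for any $r\in\mathbb{N}_0$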

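\begin{align*}
(\lambda-\bar A)xy^r &= \lambda xy^r-(1-x)y(y^r-xy^r)-x(1-y)(0-xy^r)-K(x-y)xry^{r-1}\\
&= (1-x)\bigl(-y^{r+1}\bigr)+x\bigl(-rKy^{r-1}+(\lambda+1+rK)y^r-y^{r+1}\bigr)\\
&= x\bigl(-rKy^{r-1}+(\lambda+1+rK)y^r\bigr)-y^{r+1},\\[4pt]
(\lambda-\bar A)(1-x)y^r &= \lambda(1-x)y^r-(1-x)y\bigl(0-(1-x)y^r\bigr)\\
&\quad-x(1-y)\bigl(y^r-(1-x)y^r\bigr)-K(x-y)(1-x)ry^{r-1}\\
&= (1-x)\bigl((\lambda+rK)y^r+y^{r+1}\bigr)+x\bigl(-y^r+y^{r+1}\bigr)\\
&= (1-x)(\lambda+rK)y^r+x\bigl(-y^r\bigr)+y^{r+1},\\[4pt]
(\lambda-\bar A)y^r &= x\bigl(-rKy^{r-1}\bigr)+(\lambda+rK)y^r.
\end{align*}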

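Proceed by induction on the degree $r$, beginning with $r=0$. Observe that $(\lambda-\bar A)1=\lambda$ and

\begin{align*}
(\lambda-\bar A)\Bigl((1-x)-\frac{1}{\lambda+K}y\Bigr) &= (1-x)\lambda-x+y-\frac{1}{\lambda+K}\bigl(-xK+(\lambda+K)y\bigr)\\
&= (1-x)\lambda+x\,\frac{-\lambda}{\lambda+K},
\end{align*}

therefore

\begin{align*}
(\lambda-\bar A)\biggl(\frac{\lambda+K}{-\lambda(\lambda+K+1)}\Bigl((1-x)-\frac{1}{\lambda+K}y-1\Bigr)\biggr)
&= \frac{\lambda+K}{-\lambda(\lambda+K+1)}\Bigl(x(-\lambda)+x\,\frac{-\lambda}{\lambda+K}\Bigr) = xy^0
\end{align*}

and immediately also

\[ (\lambda-\bar A)\biggl(\frac{1}{\lambda}-\frac{\lambda+K}{-\lambda(\lambda+K+1)}\Bigl((1-x)-\frac{1}{\lambda+K}y-1\Bigr)\biggr) = 1-x = (1-x)y^0. \]

Now let $n\in\mathbb{N}$ and assume that for any $r\le n-1$ there exist $f^{(r)},\tilde f^{(r)}\in D_{\bar A}$ such that $(\lambda-\bar A)f^{(r)}(x,y)=(1-x)y^r$ and $(\lambda-\bar A)\tilde f^{(r)}(x,y)=xy^r$. Note that

\[ (\lambda-\bar A)\bigl(y^n+nK\tilde f^{(n-1)}(x,y)\bigr) = (\lambda+nK)y^n. \]

In addition, similarly to the above,

\[ (\lambda-\bar A)\Bigl((1-x)y^n-\frac{1}{\lambda+(n+1)K}y^{n+1}\Bigr) = (1-x)(\lambda+nK)y^n+x\,\frac{-\lambda}{\lambda+(n+1)K}\,y^n. \]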

Hence we may again obtain

(λ −¯

A)−λ−(n+1)K

λ+(λ +nK)(λ +(n+1)K)

×(1−x)yn−1

λ+(n+1)Kyn+1−yn+nKf(n−1)

123

53 Page 28 of 34 J. Blath et al.

=−λ−(n+1)K

λ+(λ +nK)(λ +(n+1)K)x(λ +nK)(−yn)+x−λ

λ+(n+1)Kyn

=xyn

and with this also

(λ −¯

A)1

λ+nK yn+nKf(n−1)(x,y)

+λ+(n+1)K

λ+(λ +nK)(λ +(n+1)K)

×(1−x)yn−1

λ+(n+1)Kyn+1−yn+nKf(n−1)

=yn−xyn=(1−x)yn.
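The induction can be traced concretely. The sketch below is an illustration under stated assumptions: functions on $\{0,1\}\times[0,1]$ that are polynomial in $y$ are stored as pairs of coefficient lists, the names `preimages`, `f0` and `f1` are hypothetical, and `f0[n]`, `f1[n]` denote the preimages of $(1-x)y^n$ and $xy^n$ under $(\lambda-\bar A)$. It builds these preimages by the recursion displayed above (the case $n=0$ is the same formula without the $nK$ term) using exact rational arithmetic.

```python
from fractions import Fraction

def ptrim(a):
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def padd(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

def pscale(c, a):
    return [c * v for v in a]

def pshift(a):
    return [Fraction(0)] + list(a)

def pderiv(a):
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]

def mono(n):
    """Coefficient list of y^n."""
    return [Fraction(0)] * n + [Fraction(1)]

def resolvent_op(f, lam, K):
    """Apply (lam - A) to f = (p0, p1), i.e. f(0, y) = p0(y), f(1, y) = p1(y)."""
    p0, p1 = f
    diff = padd(p1, pscale(-1, p0))
    a0 = padd(pshift(diff), pscale(-K, pshift(pderiv(p0))))
    g = padd(pscale(-1, diff), pscale(K, pderiv(p1)))
    a1 = padd(g, pscale(-1, pshift(g)))
    return (ptrim(padd(pscale(lam, p0), pscale(-1, a0))),
            ptrim(padd(pscale(lam, p1), pscale(-1, a1))))

def fadd(f, g):
    return (padd(f[0], g[0]), padd(f[1], g[1]))

def fscale(c, f):
    return (pscale(c, f[0]), pscale(c, f[1]))

def preimages(n_max, lam, K):
    """Return lists f0, f1 with (lam - A) f0[n] = (1 - x) y^n and (lam - A) f1[n] = x y^n."""
    f0, f1 = [], []
    zero = [Fraction(0)]
    for n in range(n_max + 1):
        # g = y^n + n K f1[n-1], so that (lam - A) g = (lam + n K) y^n
        g = (mono(n), mono(n))
        if n > 0:
            g = fadd(g, fscale(n * K, f1[n - 1]))
        a = lam + (n + 1) * K
        # h = (1 - x) y^n - y^{n+1} / a - g
        h = fadd((mono(n), zero), fscale(Fraction(-1) / a, (mono(n + 1), mono(n + 1))))
        h = fadd(h, fscale(-1, g))
        d = lam + (lam + n * K) * a
        f1.append(fscale(-a / d, h))
        f0.append(fadd(fscale(1 / (lam + n * K), g), fscale(a / d, h)))
    return f0, f1
```

Applying `resolvent_op` to `f1[n]` should return the pair representing $xy^n$, that is, the zero polynomial at $x=0$ and $y^n$ at $x=1$, and analogously for `f0[n]`.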

This completes the proof that $(\lambda-\bar A)D_{\bar A}$ is dense in $C$. Hence, the closure of $\bar A$ generates a Feller semigroup on $C$. According to Kallenberg (2002, Proposition 19.14) this Feller semigroup then generates a Feller process, which we may assume to have càdlàg paths thanks to Kallenberg (2002, Theorem 19.15).

Both processes correspond to the ancient ancestral material scaling when considering only the reduced “effective” state spaces:

Proposition 3.8 The processes $(\bar N(t),\bar M(t))_{t\ge 0}$ and $(\bar X(t),\bar Y(t))_{t\ge 0}$ introduced in Definition 3.6 are moment duals, i.e.
\[
\forall t\ge 0\ \ \forall (x,y)\in[0,1]^2,\ (n,m)\in\mathbb{N}_0^2:\quad
\mathbb{E}_{n,m}\big[x^{\bar N(t)}y^{\bar M(t)}\big]=\mathbb{E}_{x,y}\big[\bar X(t)^n\,\bar Y(t)^m\big]. \tag{28}
\]
Furthermore, $(\bar N(t),\bar M(t))_{t\ge 0}$ coincides in distribution with $(\tilde N(t),\tilde M(t))_{t\ge 0}$ if (both are) started in the reduced state-space $\{0,1\}\times\mathbb{N}_0$.

Likewise, $(\bar X(t),\bar Y(t))_{t\ge 0}$ coincides in distribution with $(\tilde X(t),\tilde Y(t))_{t\ge 0}$ if (both are) started in the reduced state-space $\{0,1\}\times[0,1]$.

Moment duality of the involved processes will be important for the proof of the last

statement, which is crucial for the proof of Theorem 1.3.

Proof We prove the claims in order of appearance.

The duality of $(\bar N(t),\bar M(t))_{t\ge 0}$ and $(\bar X(t),\bar Y(t))_{t\ge 0}$ can be shown using the respective generators: Define $S((x,y),(n,m)):=S_{x,y}(n,m):=S_{n,m}(x,y):=x^ny^m$ for $(n,m)\in\{0,1\}\times\mathbb{N}_0$ and $(x,y)\in\{0,1\}\times[0,1]$. Applying $\bar A$ to $S_{n,m}$ yields
\[
\begin{aligned}
\bar AS_{n,m}(x,y)&=y\,(y^m-0^ny^m)\,\mathbb{1}_{\{0\}}(x)+(1-y)(0^ny^m-y^m)\,\mathbb{1}_{\{1\}}(x)+K(x-y)\,x^n\,m\,y^{m-1}\\
&=-(x-y)(y^m-0^ny^m)\,\mathbb{1}_{\{0\}}(x)+(x-y)(0^ny^m-y^m)\,\mathbb{1}_{\{1\}}(x)+K(x-y)\,x^n\,m\,y^{m-1}\\
&=-n(x-y)y^m+Km(x-y)x^ny^{m-1}\\
&=Kmx^{n+1}y^{m-1}+(-Kmx^n-nx)\,y^m+ny^{m+1},
\end{aligned}
\]
where we continue to use $0^0=1$, the fact that $n\in\{0,1\}$ and simply sorted the terms by powers of $y$ for easier comparison in the last line.

Fig. 4 Strategy of the proof of Proposition 3.8. The moment duality of $(\tilde N(t),\tilde M(t))_{t\ge 0}$ and $(\tilde X(t),\tilde Y(t))_{t\ge 0}$ is a consequence of Theorem 3.3. The laws of $(\tilde N(t),\tilde M(t))_{t\ge 0}$ and $(\bar N(t),\bar M(t))_{t\ge 0}$ agree when restricted to the reduced state-space $\{0,1\}\times\mathbb{N}_0$. We show the moment duality of $(\bar N(t),\bar M(t))_{t\ge 0}$ and $(\bar X(t),\bar Y(t))_{t\ge 0}$, which then allows us to conclude that the restricted laws of $(\tilde X(t),\tilde Y(t))_{t\ge 0}$ and $(\bar X(t),\bar Y(t))_{t\ge 0}$ also agree on $\{0,1\}\times[0,1]$

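In order to do the analogous calculation for $(\bar N(t),\bar M(t))_{t\ge 0}$ we need its generator. Since $\bar G$ is a conservative Q-matrix, the generator $\bar G$ is given by
\[
\begin{aligned}
\bar Gf(n,m)&:=\sum_{(\bar n,\bar m)\in\{0,1\}\times\mathbb{N}_0}\bar G_{(n,m),(\bar n,\bar m)}\,f(\bar n,\bar m)\\
&=Km\big(f(1,m-1)-f(n,m)\big)+\big(f(0,m+1)-f(n,m)\big)\mathbb{1}_{\{1\}}(n)
\end{aligned}
\]
for all $f:\{0,1\}\times\mathbb{N}_0\to\mathbb{R}$ which are bounded. If we apply $\bar G$ to $S$ as a function in $(n,m)\in\{0,1\}\times\mathbb{N}_0$, we get
\[
\begin{aligned}
\bar GS_{x,y}(n,m)&=Km(xy^{m-1}-x^ny^m)+(y^{m+1}-xy^m)\mathbb{1}_{\{1\}}(n)\\
&=Km(xy^{m-1}-xy^m)\underbrace{\mathbb{1}_{\{1\}}(n)}_{=n}
+Km(xy^{m-1}-y^m)\underbrace{\mathbb{1}_{\{0\}}(n)}_{=1-n}
+(y^{m+1}-xy^m)\underbrace{\mathbb{1}_{\{1\}}(n)}_{=n}\\
&=Kmxy^{m-1}+\big(-Kmnx-Km(1-n)-nx\big)y^m+ny^{m+1}.
\end{aligned}
\]
A close look, noting that for our choices of variables we have $x^{n+1}=x$ and $nx+(1-n)=x^n$, shows that
\[
\forall (x,y)\in\{0,1\}\times[0,1],\ (n,m)\in\{0,1\}\times\mathbb{N}_0:\quad \bar AS_{n,m}(x,y)=\bar GS_{x,y}(n,m).
\]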
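The generator identity can also be confirmed by direct evaluation. The following is a minimal sketch, assuming the two generators act on $S$ exactly as written above; the function names `A_bar_S`, `G_bar_S` and `generators_agree` are hypothetical, and the grid of $y$-values is an arbitrary choice. Exact rational arithmetic avoids rounding issues, and the convention $0^0=1$ is the one Python uses for integer exponents.

```python
from fractions import Fraction

def S(x, y, n, m):
    """Duality function S((x, y), (n, m)) = x^n y^m with the convention 0^0 = 1."""
    return Fraction(x) ** n * Fraction(y) ** m

def A_bar_S(x, y, n, m, K):
    """Generator of the jump-diffusion applied to S_{n,m} at (x, y), x in {0, 1}."""
    jump_up = (1 - x) * y * (S(1, y, n, m) - S(x, y, n, m))
    jump_down = x * (1 - y) * (S(0, y, n, m) - S(x, y, n, m))
    # dS/dy = m x^n y^(m-1); the term vanishes for m = 0
    dS_dy = m * S(x, y, n, m - 1) if m > 0 else 0
    return jump_up + jump_down + K * (x - y) * dS_dy

def G_bar_S(x, y, n, m, K):
    """Generator of the dual chain applied to S_{x,y} at (n, m), n in {0, 1}."""
    # rate K m: move to (1, m - 1)
    migration = K * m * (S(x, y, 1, m - 1) - S(x, y, n, m)) if m > 0 else 0
    # rate 1, only if n = 1: move to (0, m + 1)
    switch = (S(x, y, 0, m + 1) - S(x, y, n, m)) if n == 1 else 0
    return migration + switch

def generators_agree(K, m_max=5):
    ys = [Fraction(k, 4) for k in range(5)]
    return all(
        A_bar_S(x, y, n, m, K) == G_bar_S(x, y, n, m, K)
        for x in (0, 1)
        for n in (0, 1)
        for m in range(m_max + 1)
        for y in ys
    )
```

Evaluating both sides on a finite grid is of course no substitute for the algebraic argument; it merely guards against sign or index slips.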

$S$ is bounded and continuous. For any $t\ge 0$, $(x,y)\in\{0,1\}\times[0,1]$ the functions $S_{x,y}$ and $\{0,1\}\times\mathbb{N}_0\ni(n,m)\mapsto\mathbb{E}_{x,y}[\bar X(t)^n\bar Y(t)^m]$ are bounded. Furthermore, for any $t\ge 0$, $(n,m)\in\{0,1\}\times\mathbb{N}_0$, the functions $S_{n,m}$ and $\{0,1\}\times[0,1]\ni(x,y)\mapsto\mathbb{E}_{n,m}[x^{\tilde N(t)}y^{\tilde M(t)}]$ are continuously differentiable on $(0,1)$ and continuous on $[0,1]$ in the second component and continuous due to the theorem of bounded convergence. Hence, all assumptions of Jansen and Kurt (2014, Prop. 1.2) hold and we have proven the duality.

Next, we want to prove the equality of $(\tilde N(t),\tilde M(t))_{t\ge 0}$ and $(\bar N(t),\bar M(t))_{t\ge 0}$ in distribution, if both processes have the same initial distribution (which then must be in the smaller space $\{0,1\}\times\mathbb{N}_0$). Recall that $(Pe^{tG})_{t\ge 0}$ is the semi-group of $(\tilde N(t),\tilde M(t))_{t\ge 0}$ from Definition 1.4. On the other hand, the semi-group of $(\bar N(t),\bar M(t))_{t\ge 0}$ is given by $(e^{t\bar G})_{t\ge 0}$. Since both are Markov processes it suffices to prove that they both have the same semi-group. Note that technically these semi-groups have different dimensions, so to be precise, we want to prove that the restriction of $(Pe^{tG})_{t\ge 0}$ to the space $\{0,1\}\times\mathbb{N}_0$ coincides with $(e^{t\bar G})_{t\ge 0}$, i.e.
\[
\forall t\ge 0:\quad \big((Pe^{tG})_{i,j}\big)_{i,j\in\{0,1\}\times\mathbb{N}_0}=e^{t\bar G}.
\]
This will be true because of the structure of $G$, which reflects that the space $\{0,1\}\times\mathbb{N}_0$ is absorbing for $(\tilde N(t),\tilde M(t))_{t\ge 0}$. More precisely, we prove by induction that
\[
\big(G^k_{i,j}\big)_{i,j\in\{0,1\}\times\mathbb{N}_0}=\bar G^k \tag{29}
\]

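for all $k\in\mathbb{N}$. Comparing the definitions of $G$ in Definition 1.4 and of $\bar G$ in Definition 3.6, we see that (29) holds for $k=1$. Assume this is true for some fixed $k\in\mathbb{N}$. Then, for $(n,m),(\bar n,\bar m)\in\{0,1\}\times\mathbb{N}_0$
\[
\begin{aligned}
G^{k+1}_{(n,m),(\bar n,\bar m)}
&=\sum_{(l_1,l_2)\in\mathbb{N}_0^2}G_{(n,m),(l_1,l_2)}\,G^k_{(l_1,l_2),(\bar n,\bar m)}\\
&\overset{*}{=}\sum_{(l_1,l_2)\in\{0,1\}\times\mathbb{N}_0}
\underbrace{G_{(n,m),(l_1,l_2)}}_{=\bar G_{(n,m),(l_1,l_2)}}\,
\underbrace{G^k_{(l_1,l_2),(\bar n,\bar m)}}_{=\bar G^k_{(l_1,l_2),(\bar n,\bar m)}}\\
&=\bar G^{k+1}_{(n,m),(\bar n,\bar m)},
\end{aligned}
\]
where we used that $G_{(n,m),(l_1,l_2)}=0$ if $(l_1,l_2)\notin\{0,1\}\times\mathbb{N}_0$ in $*$ and then applied the induction assumption. Hence (29) does indeed hold for any $k\in\mathbb{N}$. Hence, for every choice of $(n,m),(\bar n,\bar m)\in\{0,1\}\times\mathbb{N}_0$ and $t\ge 0$, recalling that $PG=G$,
\[
(Pe^{tG})_{(n,m),(\bar n,\bar m)}
=\sum_{k\in\mathbb{N}_0}\frac{t^k}{k!}\,G^k_{(n,m),(\bar n,\bar m)}
=\sum_{k\in\mathbb{N}_0}\frac{t^k}{k!}\,\bar G^k_{(n,m),(\bar n,\bar m)}
=(e^{t\bar G})_{(n,m),(\bar n,\bar m)}.
\]
Therefore the processes $(\tilde N(t),\tilde M(t))_{t\ge 0}$ and $(\bar N(t),\bar M(t))_{t\ge 0}$ do indeed coincide in distribution if started in the same state $(n,m)\in\{0,1\}\times\mathbb{N}_0$.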
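The mechanism behind (29) is purely structural: if the rows of a matrix indexed by a subset of states vanish outside that subset, then restricting commutes with taking powers. The sketch below illustrates this on a small toy example; the $5\times 5$ matrix `G` is a hypothetical conservative Q-matrix chosen for illustration and is not the matrix $G$ of Definition 1.4, and the helper names are assumptions.

```python
def matmul(a, b):
    """Product of two square matrices of equal size, given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matpow(a, k):
    """k-th power of a square matrix (k >= 0)."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result

def restrict(a, states):
    """Sub-matrix with rows and columns indexed by `states`."""
    return [[a[i][j] for j in states] for i in states]

# Hypothetical toy Q-matrix on states 0..4 (every row sums to zero).
# Rows 0, 1, 2 have no entries in columns 3, 4, so {0, 1, 2} cannot be left.
G = [
    [-2, 2, 0, 0, 0],
    [1, -3, 2, 0, 0],
    [0, 4, -4, 0, 0],
    [1, 0, 2, -5, 2],
    [0, 3, 0, 1, -4],
]
closed = [0, 1, 2]
G_bar = restrict(G, closed)

def powers_agree(k_max):
    """Check (G^k restricted to `closed`) == (G_bar)^k for k = 1, ..., k_max."""
    return all(
        restrict(matpow(G, k), closed) == matpow(G_bar, k)
        for k in range(1, k_max + 1)
    )
```

Since the identity holds for every power, it carries over term by term to the exponential series, which is the step used in the proof.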

Since we now in particular have the equality of the one-dimensional distributions, we can use the duality (28) and the duality given in Theorem 3.3 to obtain
\[
\mathbb{E}_{x,y}\big[\tilde X(t)^n\tilde Y(t)^m\big]
=\mathbb{E}_{n,m}\big[x^{\tilde N(t)}y^{\tilde M(t)}\big]
=\mathbb{E}_{n,m}\big[x^{\bar N(t)}y^{\bar M(t)}\big]
=\mathbb{E}_{x,y}\big[\bar X(t)^n\bar Y(t)^m\big] \tag{30}
\]
for all $t\ge 0$ and all $(x,y)\in\{0,1\}\times[0,1]$ and $(n,m)\in\{0,1\}\times\mathbb{N}_0$.

Recall from (26) that for any $t>0$ we have $(\tilde X(t),\tilde Y(t))\in\{0,1\}\times[0,1]$, $\mathbb{P}_{x,y}$-a.s., $(x,y)\in[0,1]^2$. Since a distribution on $\{0,1\}\times[0,1]$ is uniquely determined by its moments of order $(n,m)\in\{0,1\}\times\mathbb{N}_0$, (30) implies that $(\tilde X(t),\tilde Y(t))\sim(\bar X(t),\bar Y(t))$ for any $t>0$ (when started in the same $(x,y)\in\{0,1\}\times[0,1]$). Since they are both Markovian, this implies that the distributions of $(\bar X(t),\bar Y(t))_{t\ge 0}$ and $(\tilde X(t),\tilde Y(t))_{t\ge 0}$ coincide when started in the reduced state-space $\{0,1\}\times[0,1]$.

Combining these results we obtain the proof of Theorem 1.3.

Proof of Theorem 1.3 Theorem 3.3 already yields the existence of $(\tilde X(t),\tilde Y(t))_{t\ge 0}$ as the limit in finite-dimensional distributions.

(5) is simply the observation of (27).

Hence we are left to prove that we can choose a process with the above properties (determined only by the distribution!) with nice path-properties.

Fix $(x,y)\in[0,1]^2$. Now, let $(\bar X_*(t),\bar Y_*(t))_{t\ge 0}$ and $(\bar X^*(t),\bar Y^*(t))_{t\ge 0}$ be independent copies of $(\bar X(t),\bar Y(t))_{t\ge 0}$, starting at $(0,y)\in\{0,1\}\times[0,1]$ and $(1,y)\in\{0,1\}\times[0,1]$, respectively. Furthermore, let $B$ be an independent Bernoulli random variable with success parameter $x$. With this, define the process
\[
\forall t\ge 0:\quad (\check X(t),\check Y(t)):=B\,(\bar X^*(t),\bar Y^*(t))+(1-B)\,(\bar X_*(t),\bar Y_*(t)).
\]
This process is càdlàg (with a random initial distribution $(B,y)$). We now prove that $(\check X(t),\check Y(t))_{t>0}$ and $(\tilde X(t),\tilde Y(t))_{t>0}$ are equal in distribution. (Note that we claim this for $t>0$ only.) We prove this using duality. Recall that for $t>0$ and any $(n,m)\in\mathbb{N}_0^2$ we have $\mathbb{P}_{n,m}\big(\tilde N(t)\in\{0,1\}\big)=1$, and we can therefore calculate
\[
\begin{aligned}
\mathbb{E}\big[\check X(t)^n\check Y(t)^m\big]
&=x\,\mathbb{E}_{1,y}\big[\bar X^*(t)^n\bar Y^*(t)^m\big]+(1-x)\,\mathbb{E}_{0,y}\big[\bar X_*(t)^n\bar Y_*(t)^m\big]\\
&=x\,\mathbb{E}_{1,y}\big[\tilde X(t)^n\tilde Y(t)^m\big]+(1-x)\,\mathbb{E}_{0,y}\big[\tilde X(t)^n\tilde Y(t)^m\big]\\
&=x\,\mathbb{E}_{n,m}\big[1^{\tilde N(t)}y^{\tilde M(t)}\big]+(1-x)\,\mathbb{E}_{n,m}\big[0^{\tilde N(t)}y^{\tilde M(t)}\big]\\
&=\mathbb{E}_{n,m}\big[\big(x+(1-x)\mathbb{1}_{\{\tilde N(t)=0\}}\big)\,y^{\tilde M(t)}\big]\\
&=\mathbb{E}_{n,m}\big[x^{\tilde N(t)}y^{\tilde M(t)}\big]=\mathbb{E}_{x,y}\big[\tilde X(t)^n\tilde Y(t)^m\big].
\end{aligned}
\]
Here, we used Proposition 3.8 in the second equality, the duality between $(\tilde X(t),\tilde Y(t))_{t\ge 0}$ and $(\tilde N(t),\tilde M(t))_{t\ge 0}$ from Theorem 3.3 in the third and last equality, and the observation that $\mathbb{P}_{n,m}\big(\tilde N(t)\in\{0,1\}\big)=1$ in the fifth equality. Since $(n,m)\in\mathbb{N}_0^2$ was arbitrary, we have shown that for every $t>0$, $(\check X(t),\check Y(t))$ and $(\tilde X(t),\tilde Y(t))$ are equal in distribution. Since both processes are time-homogeneous Markov processes, this implies the claim. Thus, the process $(\hat X(t),\hat Y(t))_{t\ge 0}$, defined as $(\hat X(0),\hat Y(0)):=(x,y)$ and
\[
\forall t>0:\quad (\hat X(t),\hat Y(t)):=(\check X(t),\check Y(t)),
\]
is càdlàg for all $t>0$ and coincides in distribution with $(\tilde X(t),\tilde Y(t))_{t\ge 0}$ started in $(\tilde X(0),\tilde Y(0))=(x,y)$.

Remark 3.9 (Imbalanced Island size: Part 2) We return to the example discussed in Remark 2.4 of the two-island model and its close relation to the seed bank model. The frequency process of the given allele is then described by the two-island diffusion (Kermany et al. 2008),
\[
\begin{aligned}
dX(t)&=c\,(Y(t)-X(t))\,dt+\alpha\sqrt{X(t)(1-X(t))}\,dB(t),\\
dY(t)&=cK\,(X(t)-Y(t))\,dt+\alpha'\sqrt{Y(t)(1-Y(t))}\,dB'(t),
\end{aligned}
\tag{31}
\]
where $(B(t))_{t\ge 0}$ and $(B'(t))_{t\ge 0}$ are independent Brownian motions.

Again, the interesting consideration here is to use different scalings of the coalescence rates in the islands, i.e. different scalings for $\alpha\ge 0$ and $\alpha'\ge 0$. If, in addition to $c\to 0$, we assume the coalescence rate $\alpha'>0$ in the second island to scale as $c$, i.e. $\alpha'/c\to 1$, the result is a two-island model with instantaneous coalescences in the first island, but otherwise regular migration and diffusive behaviour in the second.

For more precision, denote by $(X^{c,\alpha'}(t),Y^{c,\alpha'}(t))_{t\ge 0}$ the two-island diffusion with migration rate $c>0$ and island 2 of size $\alpha'>0$ and assume that it starts at some $(x,y)\in[0,1]^2$, $\mathbb{P}$-a.s. Repeating the calculations we did for the seed bank model, it can be shown that the sequence $(X^{c_\kappa,\alpha'_\kappa}(t),Y^{c_\kappa,\alpha'_\kappa}(t))_{t\ge 0}$ will converge to a Markovian degenerate limit coinciding in distribution with a Markov process with generator
\[
\begin{aligned}
Lf(x,y)&=(1-x)\,y\,\big(f(1,y)-f(x,y)\big)+x(1-y)\big(f(0,y)-f(x,y)\big)\\
&\quad+K(x-y)\frac{\partial f}{\partial y}(x,y)+\frac{1}{2}\,y(1-y)\frac{\partial^2}{\partial y^2}f(x,y)
\end{aligned}
\]
for functions $f$ in $\{f:\{0,1\}\times[0,1]\to\mathbb{R}\mid f(0,\cdot),f(1,\cdot)\in C^2([0,1],\mathbb{R})\}$ whenever started in the smaller state-space $\{0,1\}\times[0,1]$.
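To get a feeling for the two-island diffusion (31) before passing to the limit, one can simulate it numerically. The following is a rough sketch using an Euler-Maruyama discretisation with clipping to $[0,1]$; the scheme, the function name `simulate_two_island`, the parameter names `alpha` and `alpha_prime` (the diffusion coefficients of island 1 and island 2) and the fixed seed are assumptions made for illustration and do not appear in the original text. Clipping is a crude way to keep the discretised path inside the state space and slightly distorts the behaviour near the boundary.

```python
import math
import random

def simulate_two_island(x0, y0, c, K, alpha, alpha_prime, T, steps, seed=0):
    """Euler-Maruyama path of the two-island diffusion on [0, T].

    dX = c (Y - X) dt + alpha * sqrt(X (1 - X)) dB
    dY = c K (X - Y) dt + alpha_prime * sqrt(Y (1 - Y)) dB'
    Returns a list of (x, y) pairs of length steps + 1.
    """
    rng = random.Random(seed)
    dt = T / steps
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(steps):
        # independent Brownian increments with variance dt
        dB1 = rng.gauss(0.0, math.sqrt(dt))
        dB2 = rng.gauss(0.0, math.sqrt(dt))
        x_new = x + c * (y - x) * dt + alpha * math.sqrt(x * (1 - x)) * dB1
        y_new = y + c * K * (x - y) * dt + alpha_prime * math.sqrt(y * (1 - y)) * dB2
        # clip to the state space [0, 1]^2 (an artefact of the discretisation)
        x = min(1.0, max(0.0, x_new))
        y = min(1.0, max(0.0, y_new))
        path.append((x, y))
    return path
```

With both diffusion coefficients set to zero the scheme reduces to the deterministic migration dynamics, for which $KX(t)+Y(t)$ is conserved; this gives a simple consistency check of the drift terms.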

Acknowledgements JB and MWB were supported by DFG Priority Programme 1590 “Probabilistic Structures in Evolution”, Project BL 1105/5-1, EB and AGC by the Berlin Mathematical School.

Funding Information Open Access funding enabled and organized by Projekt DEAL.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which

permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give

appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence,

and indicate if changes were made. The images or other third party material in this article are included

in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If

material is not included in the article’s Creative Commons licence and your intended use is not permitted

by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the

References

Birkner M, Blath J, Eldon B (2013) An ancestral recombination graph for diploid populations with skewed offspring distribution. Genetics 193(1):255–290

Blath J, González Casanova A, Kurt N, Wilke Berenguer M (2020a) The seed bank coalescent with simultaneous switching. Electron J Probab 25:27

Blath J, Buzzoni E, Koskela J, Wilke Berenguer M (2020b) Statistical tools for seed bank detection. Theor Popul Biol 132:1–15

Blath J, González Casanova A, Kurt N, Spanò D (2013) The ancestral process of long-range seed bank models. J Appl Probab 50(3):741–759

Blath J, González Casanova A, Eldon B, Kurt N, Wilke Berenguer M (2015) Genetic variability under the seedbank coalescent. Genetics 200(3):921–934

Blath J, González Casanova A, Kurt N, Wilke Berenguer M (2016) A new coalescent for seed-bank models. Ann Appl Probab 26(2):857–891

Blath J, Buzzoni E, González Casanova A, Wilke Berenguer M (2019) Structural properties of the seed bank and the two island diffusion. J Math Biol 79(1):369–392

Bobrowski A (2015) Singular perturbations involving fast diffusion. J Math Anal Appl 427(2):1004–1026

Cano R, Borucki M (1995) Revival and identification of bacterial spores in 25- to 40-million-year-old Dominican amber. Science (New York, NY) 268:1060–1064

Chung KL (1960) Markov chains with stationary transition probabilities. Grundlehren der Mathematischen Wissenschaften, 1st edn, vol. 104. Springer, Berlin

Cohen D (1966) Optimizing reproduction in a randomly varying environment. J Theor Biol 12:119–129

Endo H, Inoue M (2019) Dormancy in cancer. Cancer Sci 110(2):474–480

Epstein SS (2009) Microbial awakenings. Nature 457(7233):1083

Etheridge A (2011) Some mathematical models from population genetics. Lecture notes in mathematics, vol 2012. Springer, Heidelberg. Lectures from the 39th Probability Summer School held in Saint-Flour, 2009, École d’Été de Probabilités de Saint-Flour. [Saint-Flour Probability Summer School]

Ethier SN, Kurtz TG (1986) Markov processes: characterization and convergence. In: Wiley series in probability and mathematical statistics: probability and mathematical statistics. Wiley, New York

Fisher RA, Gollan B, Helaine S (2017) Persistent bacterial infections and persister cells. Nat Rev Microbiol 15(8):453

Greven A, den Hollander WTF, Oomen M (2020) Spatial populations with seed-bank: well-posedness, duality and equilibrium. arXiv:2004.14137 (Preprint)

Herbots HM (1994) Stochastic models in population genetics: genealogical and genetic differentiation in structured populations. PhD thesis, University of London

Hildebrandt TH, Schoenberg IJ (1933) On linear functional operations and the moment problem for a finite interval in one or several dimensions. Ann Math 34(2):317–328

Jansen S, Kurt N (2014) On the notion(s) of duality for Markov processes. Probab Surv 11:59–120

Johnson SS, Hebsgaard MB, Christensen TR, Mastepanov M, Nielsen R, Munch K, Brand T, Gilbert MT, Zuber MT, Bunce M, Ronn R (2007) Ancient bacteria show evidence of DNA repair. Proc Natl Acad Sci 104(36):14401–14405

Kaj I, Krone SM, Lascoux M (2001) Coalescent theory for seed bank models. J Appl Probab 38(2):285–300

Kallenberg O (2002) Foundations of modern probability. Probability and its applications, 2nd edn. Springer, New York

Kermany ARR, Zhou X, Hickey DA (2008) Joint stationary moments of a two-island diffusion model of population subdivision. Theor Popul Biol 74(3):226–232

Kurtz TG (1973) A limit theorem for perturbed operator semigroups with applications to random evolutions. J Funct Anal 12:55–67

Kurtz TG (1991) Random time changes and convergence in distribution under the Meyer-Zheng conditions. Ann Probab 19:1010–1034

Lennon JT, Jones SE (2011) Microbial seed banks: the ecological and evolutionary implications of dormancy. Nat Rev Microbiol 9:119

Marx V (2018) How to pull the blanket off dormant cancer cells. Nat Methods 15:249–252

Meyer P-A, Zheng WA (1984) Tightness criteria for laws of semimartingales. Ann Inst Henri Poincaré Probab Stat 20(4):353–372

Möhle M (1998) A convergence theorem for Markov chains arising in population genetics and the coalescent with selfing. Adv Appl Probab 30(2):493–512

Möhle M, Notohara M (2016) An extension of a convergence theorem for Markov chains arising in population genetics. J Appl Probab 53(3):953–956

Moran PAP (1959) The theory of some genetical effects of population subdivision. Aust J Bio Sci 12(2):109–116

Morono Y, Ito M, Hoshino T et al (2020) Aerobic microbial life persists in oxic marine sediment as old as 101.5 million years. Nat Commun 11:3626

Notohara M (1990) The coalescent and the genealogical process in geographically structured population. J Math Biol 29(1):59–75

Shiga T, Shimizu A (1980) Infinite-dimensional stochastic differential equations and their applications. J Math Kyoto Univ 20(3):395–416

Shoemaker WR, Lennon JT (2018) Evolution with a seed bank: the population genetic consequences of microbial dormancy. Evol Appl 11(1):60–75

Skorohod AV (1956) Limit theorems for stochastic processes. Teor Veroyatnost i Primenen 1:289–319

Wakeley J (2009) Coalescent theory: an introduction. Roberts & Company Publishers, Greenwood Village

Whitt W (2002) Stochastic-process limits: an introduction to stochastic-process limits and their application to queues. Springer series in operations research. Springer, New York

Wright S (1931) Evolution in Mendelian populations. Genetics 16(2):97–159

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps

and institutional affiliations.
