Supplementary Materials
Population Size Differences Can Lead to Biases in Phylogenetic Inference and Introgression Detection in the Presence of Purifying Selection
Chong He1*, Meng-Yun Chen2, Hao Zhu1,3,4*
1 Bioinformatics Section, School of Basic Medical Sciences, Southern Medical University, Guangzhou, 510515, China
2 Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, Guangdong Provincial Key Laboratory of Biotechnology for Plant Development, School of Life Sciences, South China Normal University, Guangzhou 510631, China
3 Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou, 510515, China
4 Guangdong Provincial Key Lab of Single Cell Technology and Application, Southern Medical University, Guangzhou, 510515, China
*Corresponding author: [email protected]; [email protected]
Supplementary Methods
Examining the Values of P(V1 | V) under Neutral Evolution and under Purifying Selection
P(V1 | V) is the probability that a genealogical tree is a type V1 genealogical tree given that this genealogical tree is a type V genealogical tree (Fig. 7). Under neutral evolution, it is theoretically expected that P(V1 | V) = 2/3 (Fig. 9B, Supplementary Appendix S3). The SLiM simulation described here aimed to empirically examine whether P(V1 | V) is indeed 2/3 under neutral evolution. We also examined whether the value of P(V1 | V) under purifying selection is greater or smaller than that under neutral evolution. We set the simplest biallelic context in our simulations because our aim was to determine the direction (greater or smaller) of the difference between the values of P(V1 | V) under purifying selection and under neutral evolution. If more than two alleles are involved, the magnitude of the difference can change, but the direction of the difference will not change. The “Mutation.time” and “Mutation.node” attributes in the Python package tskit were used to extract information about the time each mutation occurs and the branch each mutation falls on. We analyzed gene trees that experienced deep coalescence and at the same time contained one and only one mutation occurring at a time earlier than time t123 (this mutation corresponds to the mutation M in our theoretical analysis). Among these gene trees, we calculated the ratio between gene trees where the mutation occurring in S123 falls on the first diverging branch (type V1) and gene trees where the mutation occurring in S123 falls on any of the three terminal branches (type V).
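The classification step described above can be sketched as follows. This is a minimal illustration, not the script used for the simulations: it assumes each three-tip genealogy has already been reduced to a child-to-parent map (for example, from a tskit tree) and that the mutated node is the value stored in Mutation.node. The names classify_mutation and estimate_p_v1_given_v, and the toy tree, are hypothetical.

```python
def classify_mutation(parent, tips, mutation_node):
    """Classify a three-tip genealogy by the branch carrying the mutation.

    parent: dict mapping each non-root node to its parent (assumed input format).
    tips: the three sampled tip nodes.
    mutation_node: node directly below the mutated branch.
    Returns "V1" (first diverging branch), "V_nested" (one of the two
    nested terminal branches), or None (not a terminal branch).
    """
    if mutation_node not in tips:
        return None  # mutation on an internal branch: not a type V genealogy
    # The root is the only node that appears as a parent but has no parent itself.
    root = next(p for p in parent.values() if p not in parent)
    # The first diverging tip is the tip attached directly to the root.
    return "V1" if parent[mutation_node] == root else "V_nested"


def estimate_p_v1_given_v(labels):
    """Ratio of type V1 genealogies to all type V genealogies."""
    v1 = sum(1 for x in labels if x == "V1")
    v = sum(1 for x in labels if x in ("V1", "V_nested"))
    return v1 / v if v else float("nan")


# Toy genealogy ((0, 1), 2): node 3 joins tips 0 and 1, node 4 is the root.
toy_parent = {0: 3, 1: 3, 3: 4, 2: 4}
toy_tips = (0, 1, 2)
toy_labels = [classify_mutation(toy_parent, toy_tips, n) for n in (2, 0, 2, 3)]
```

In the toy example, the mutation on tip 2 is labeled "V1", the mutation on tip 0 is labeled "V_nested", and the mutation on internal node 3 is excluded from the ratio.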
Supplementary Appendix S1
Here, we provide a more rigorous explanation of why under patterns A3, A4, and A5, P(V1 | Ax) = P(V1 | V) (x = 3, 4, 5). As patterns A3, A4, and A5 are similar, here, we take pattern A3 as an example (g1 and g2 are connected to “noncarriers” and g3 is connected to a “carrier” at t123, Fig. 7). If pattern A3 occurs, G123 is certainly a type V genealogical tree. Hence, P(V1 | A3) may be written as P(V1 | A3V). Then, according to Bayes’ theorem, P(V1 | A3V) may be written as follows:
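\[
\begin{aligned}
P(V_1 \mid A_3) = P(V_1 \mid A_3 V) &= \frac{P(V_1 A_3 \mid V)}{P(A_3 \mid V)} = \frac{P(V_1 A_3 \mid V)}{P(V_1 A_3 \mid V) + P(\overline{V_1} A_3 \mid V)} \\
&= \frac{P(V_1 \mid V)\,P(A_3 \mid V_1 V)}{P(V_1 \mid V)\,P(A_3 \mid V_1 V) + P(\overline{V_1} \mid V)\,P(A_3 \mid \overline{V_1} V)}
\end{aligned}
\]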
Biologically speaking, regardless of mediators’ genealogical positions in G123 (depending on evolutionary events earlier than t123), “carriers” are still “carriers”, and “noncarriers” are still “noncarriers”. How g1, g2, and g3 are connected to mediators (depending on evolutionary events later than t123) only depends on whether mediators are “carriers” or “noncarriers”. This is irrelevant to mediators’ genealogical positions in G123. This essentially is a case of the Markov property: given the “current state” (the states of mediators at t123), the “earlier state” is irrelevant to the “future state”. Accordingly,
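\[
P(A_3 \mid V_1 V) = P(A_3 \mid \overline{V_1} V) = P(A_3 \mid V),
\]
and the above expression becomes:
\[
P(V_1 \mid A_3) = \frac{P(V_1 \mid V)}{P(V_1 \mid V) + P(\overline{V_1} \mid V)} = P(V_1 \mid V)
\]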
Following a similar logic, it can also be proven that under patterns A6, A7, and A8, P(W1 | Ax) = P(W1 | W) (x = 6, 7, 8).
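The Bayes step in this appendix can be checked numerically. The sketch below is only an illustration under assumed numbers: the prior 2/3 and the likelihood values 0.2 and 0.3 are arbitrary, and the function name posterior_v1 is hypothetical. It shows that the posterior equals the prior whenever the two likelihoods of A3 are equal, which is what the Markov property supplies.

```python
def posterior_v1(prior_v1, lik_a3_given_v1, lik_a3_given_not_v1):
    """P(V1 | A3, V) from Bayes' theorem, with everything conditioned on V."""
    numerator = prior_v1 * lik_a3_given_v1
    denominator = numerator + (1.0 - prior_v1) * lik_a3_given_not_v1
    return numerator / denominator


prior = 2.0 / 3.0  # illustrative value of P(V1 | V)

# Equal likelihoods (Markov property holds): posterior equals the prior.
equal_case = posterior_v1(prior, 0.2, 0.2)

# Unequal likelihoods (hypothetical violation): posterior departs from the prior.
unequal_case = posterior_v1(prior, 0.3, 0.2)

print(equal_case, unequal_case)
```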
Supplementary Appendix S2
The equivalence relationship between pi and E(fi) can be more rigorously understood as follows:
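\[
\begin{aligned}
p_i &= P\{a_i \text{ is a “carrier”}\} \\
&= P\{g_i \text{ is a descendant of a “carrier”}\} \\
&= \int_{-\infty}^{+\infty} P(g_i \text{ is a descendant of a “carrier”} \mid f_i)\,p(f_i)\,df_i \\
&= \int_{-\infty}^{+\infty} P(\text{a descendant of a “carrier” is sampled from } S_i \mid f_i)\,p(f_i)\,df_i \\
&= \int_{-\infty}^{+\infty} f_i\,p(f_i)\,df_i \\
&= E(f_i)
\end{aligned}
\]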
Here, fi denotes the frequency of descendants of “carriers” in species Si, and p(fi) denotes the probability density function of fi.
Additionally, it may be worth noting that the equivalence relationship between pi and E(fi) is unaffected by the presence of mutations occurring later than t123 (shallow mutations). This is because shallow mutations should be considered as “random noise” relative to mutation M itself. More specifically, P(gi is a descendant of a “carrier” | fi) is a probability over all possible carrying states of shallow mutations:
\[
P(g_i \text{ is a descendant of a “carrier”} \mid f_i) = \sum_{u} P(g_i \text{ is a descendant of a “carrier”} \mid u, f_i)\,P(u)
\]
Here, u denotes each possible carrying state of shallow mutations. Therefore, pi/E(fi) is a parameter that reflects the intrinsic property of mutation M.
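The identity pi = E(fi) can be illustrated with a small Monte Carlo sketch. The Beta distribution used for fi is purely an assumption made for illustration (the appendix does not specify p(fi)), and the function name simulate_p_i is hypothetical. Each replicate draws a frequency fi and then samples one gene copy that descends from a “carrier” with probability fi.

```python
import random


def simulate_p_i(n_draws, alpha, beta, seed=1):
    """Return (fraction of sampled copies descending from carriers, mean of f_i)."""
    rng = random.Random(seed)
    carrier_count = 0
    frequency_sum = 0.0
    for _ in range(n_draws):
        f = rng.betavariate(alpha, beta)  # assumed distribution of f_i
        frequency_sum += f
        if rng.random() < f:              # sample one gene copy from S_i
            carrier_count += 1
    return carrier_count / n_draws, frequency_sum / n_draws


p_hat, mean_f = simulate_p_i(200_000, alpha=0.5, beta=5.0)
print(p_hat, mean_f)
```

The two returned values should agree up to sampling error, whatever distribution is assumed for fi.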
Supplementary Appendix S3
Here, we prove that under a neutral evolution assumption with a constant mutation rate and a constant population size, P(V1 | V) = 2/3. We use X1 and X2 to denote the waiting time of the first coalescent event and the waiting time of the second coalescent event, respectively (Fig. 9B). The waiting time of the first coalescent event X1 and the waiting time of the second coalescent event X2 follow an exponential distribution with rate parameter λ1 and an exponential distribution with rate parameter λ2, respectively. The expected waiting time of the first coalescent event and that of the second coalescent event are 1/λ1 and 1/λ2, respectively. Let R denote the ratio of X2 to X1, i.e., R = X2/X1. As the first and the second coalescent events are mutually independent, the probability density function of R is as follows (Blom 1989):
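\[
f(r) =
\begin{cases}
\displaystyle\int_0^\infty x_1 \cdot \lambda_1 e^{-\lambda_1 x_1} \cdot \lambda_2 e^{-\lambda_2 (r x_1)}\,dx_1, & r > 0 \\
0, & \text{otherwise}
\end{cases}
\]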
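When r > 0, f(r) may be simplified as follows:
\[
\begin{aligned}
f(r) &= \int_0^\infty x_1 \cdot \lambda_1 e^{-\lambda_1 x_1} \cdot \lambda_2 e^{-\lambda_2 (r x_1)}\,dx_1 \\
&= \lambda_1 \lambda_2 \int_0^\infty x_1 \cdot e^{-(\lambda_1 + \lambda_2 r) x_1}\,dx_1 \\
&= \frac{\lambda_1 \lambda_2}{(\lambda_1 + \lambda_2 r)^2}\,\Gamma(2)
\end{aligned}
\]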
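Above, Γ(2) is the gamma function evaluated at 2, which is equal to 1. Hence:
\[
f(r) = \frac{\lambda_1 \lambda_2}{(\lambda_1 + \lambda_2 r)^2} = \frac{\lambda_1 / \lambda_2}{(\lambda_1 / \lambda_2 + r)^2}
\]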
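As a quick sanity check on this density (a worked step added for illustration, writing c = λ1/λ2 as shorthand that does not appear in the original), f(r) integrates to 1 and has a closed-form cumulative distribution function:

```latex
\[
\int_0^\infty \frac{c}{(c + r)^2}\,dr = \left[-\frac{c}{c + r}\right]_0^\infty = 1,
\qquad
P(R \le r) = \int_0^r \frac{c}{(c + s)^2}\,ds = 1 - \frac{c}{c + r} = \frac{r}{c + r},
\qquad c = \frac{\lambda_1}{\lambda_2}.
\]
```

For example, with c = 3 the median of R is 3, since P(R ≤ 3) = 3/(3 + 3) = 1/2.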
In a three-tip genealogical tree, the length of the first diverging branch is X1 + X2. The total length of the three terminal branches is 3X1 + X2. If mutation M falls on one of the three terminal branches, the three-tip genealogical tree is a type V genealogical tree. Among the three terminal branches, if mutation M falls on the first diverging branch, the three-tip genealogical tree is a type V1 genealogical tree (Fig. 9B). Therefore, let φ denote the conditional value of P(V1 | V) given that the values of X1 and X2 are X1 = x1 and X2 = x2, respectively. Assuming a constant mutation rate, φ is as follows:
\[
\varphi = \frac{x_1 + x_2}{3 x_1 + x_2} = \frac{1 + r}{3 + r}
\]
Then, the overall value of P(V1 | V) across all possible values of X1 and X2 may be written as follows:
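\[
\begin{aligned}
P(V_1 \mid V) &= \int_0^\infty \varphi(r) f(r)\,dr \\
&= \int_0^\infty \left(\frac{1 + r}{3 + r}\right) \left[\frac{\lambda_1 / \lambda_2}{(\lambda_1 / \lambda_2 + r)^2}\right] dr \\
&= (\lambda_1 / \lambda_2) \int_0^\infty \left(1 - \frac{2}{3 + r}\right) \left[\frac{1}{(\lambda_1 / \lambda_2 + r)^2}\right] dr \\
&= (\lambda_1 / \lambda_2) \left[\int_0^\infty \frac{1}{(\lambda_1 / \lambda_2 + r)^2}\,dr - \int_0^\infty \frac{2}{(3 + r)(\lambda_1 / \lambda_2 + r)^2}\,dr\right]
\end{aligned}
\]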
Assuming a constant population size, the expected waiting time of the second coalescent event is 3 times as long as that of the first coalescent event (Kingman 1982; Tajima 1983), i.e., λ1/λ2 = 3. Hence:
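\[
\begin{aligned}
P(V_1 \mid V) &= (\lambda_1 / \lambda_2) \left[\int_0^\infty \frac{1}{(\lambda_1 / \lambda_2 + r)^2}\,dr - \int_0^\infty \frac{2}{(3 + r)(\lambda_1 / \lambda_2 + r)^2}\,dr\right] \\
&= 3 \times \left[\int_0^\infty \frac{1}{(3 + r)^2}\,dr - \int_0^\infty \frac{2}{(3 + r)^3}\,dr\right] \\
&= 3 \times \left[\left.\frac{-1}{3 + r}\right|_0^\infty + \left.\frac{1}{(3 + r)^2}\right|_0^\infty\right] \\
&= 3 \times \left(\frac{1}{3} - \frac{1}{9}\right) = \frac{2}{3}
\end{aligned}
\]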
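The result of this derivation can be checked by simulation. The following sketch draws X1 and X2 from independent exponential distributions and averages φ = (x1 + x2)/(3x1 + x2). It assumes time is measured in units where λ2 = 1 and λ1 = 3 (any pair with λ1/λ2 = 3 would do); the function name simulate_phi is hypothetical, and this is not the SLiM simulation described in the Supplementary Methods.

```python
import random


def simulate_phi(n_trees, rate1=3.0, rate2=1.0, seed=1):
    """Monte Carlo average of phi over neutral three-tip genealogies."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        x1 = rng.expovariate(rate1)  # waiting time of the first coalescent event
        x2 = rng.expovariate(rate2)  # waiting time of the second coalescent event
        # First diverging branch length over total terminal branch length.
        total += (x1 + x2) / (3.0 * x1 + x2)
    return total / n_trees


estimate = simulate_phi(200_000)
print(estimate)  # expected to be close to 2/3
```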
Supplementary Section S1: The Conclusion Regarding Gene Tree Distributions under Purifying Selection Remains Valid When Multiple Ancestral Mutations Are Assumed
For demonstration, in the main text, we focus on the simplest one-mutation case. Here, we show that no matter how many mutations are assumed in ancestral species S123, the conclusions regarding mediators’ input and output remain valid. Let us assume n mutations occurring in S123 (n can be any positive integer). Then, there are \(\binom{n}{n} + \binom{n}{n-1} + \cdots + \binom{n}{0} = 2^n\) possible allelic states. We consider an arbitrary pair of these allelic states, type_x and type_{x+k}, where type_x denotes an allelic state with a combination of x mutations, and type_{x+k} denotes an allelic state where k mutations are added to type_x. If the mediators’ output is completely random, regardless of what allelic states type_x and type_{x+k} are, a type_x gene copy should be equally likely to occupy each of the three tips of G123, and a type_{x+k} gene copy should also be equally likely to occupy each tip of G123. The k mutations mentioned above can be viewed as a “super mutation”, and type_x and type_{x+k} gene copies can be viewed as “noncarriers” and “carriers”, respectively. Therefore, the effect of this “super mutation” can be understood in the same way as the effect of mutation M (Fig. 9).
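The count of possible allelic states given above can be made concrete with a short enumeration. This is only an illustrative sketch: mutations are represented by the hypothetical labels 0 to n − 1, and an allelic state is represented as the set of mutations it carries.

```python
from itertools import combinations
from math import comb


def allelic_states(n):
    """List every allelic state as a tuple of carried mutation labels."""
    states = []
    for size in range(n + 1):
        # All states carrying exactly `size` of the n mutations.
        states.extend(combinations(range(n), size))
    return states


states_for_three = allelic_states(3)
# The number of states equals the sum of binomial coefficients.
assert len(states_for_three) == sum(comb(3, j) for j in range(4))
print(len(states_for_three), states_for_three)
```

In this representation, a type_{x+k} state is any state whose label set contains that of a type_x state plus k further labels.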
Above, there are two different allelic states among the three tips of G123. When more than one mutation is assumed in S123, it is also possible that there are three different allelic states among the three tips of G123. We now add another allelic state, type_{x+l}. Accordingly, type_{x+k} and type_{x+l} gene copies are two types of “carriers”, and type_x gene copies are “noncarriers”. If there are three different allelic states among the three tips of G123, there are two “super mutations” in G123: one is constituted by k mutations, and the other is constituted by l mutations (see the figure below). If mediators’ output is completely random, the “noncarrier” in G123 should be equally likely to occupy each of the three tips. That is to say, the probability that the two “super mutations” fall on the first diverging branch should be equal to the probability that they fall on one of the two nested branches (see the figure below). This conflicts with the fact that the first diverging branch is longer than the two nested branches. Therefore, gene copies at t123 are not equally distributed among the tips of G123. Instead, it is expected that when an extant gene copy is connected to a “carrier”, its connection tends to be established with the first diverging branch of G123 (see the figure below). This effect acts in the same direction as the effect described in the main text. Therefore, we exclude the possibility that there is a pair of effects that cancel each other out. Collectively, when multiple mutations are assumed in S123, it remains biologically untenable to claim that changes in mediators’ input have no effect on gene tree distributions. Furthermore, when multiple mutations are assumed, how likely gi is connected to each type of gene copies at t123 (type_x, type_{x-k}, or type_{x-k-l}) is still determined by the expected fates of these gene copies in Si. Therefore, regardless of how many mutations are assumed in S123, it can still be concluded that, under purifying selection, population size differences can lead to changes in mediators’ input, and consequently lead to changes in gene tree distributions.
Illustration of the situations where the tips of G123 contain three types of gene copies. A. If mediators’ output is completely random, the type x gene copy (blue circle) should be equally likely to occupy each of the three possible positions. B. If mediators’ output is completely random, the two situations shown here should be equally likely to occur, which conflicts with the fact that the first diverging branch is longer than the two nested branches. The left situation is more likely to occur than the right one. Therefore, a “noncarrier” tends to bring the gene copy connected to it to a nested branch.
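\[
\begin{aligned}
P(T_1 \mid \overline{C_3}) &= \sum_{x = 1, 4, 5, 6} P(T_1 \mid A_x)\,P(A_x \mid \overline{C_3}) \\
&= \frac{\frac{1}{3} q_1 q_2 q_3 + \left[\frac{1 - P(V_1 \mid V)}{2}\right] q_1 p_2 q_3 + \left[\frac{1 - P(V_1 \mid V)}{2}\right] p_1 q_2 q_3 + P(W_1 \mid W)\,p_1 p_2 q_3}{q_1 q_2 q_3 + q_1 p_2 q_3 + p_1 q_2 q_3 + p_1 p_2 q_3} \\
&= \frac{1}{3} q_1 q_2 + \left[\frac{1 - P(V_1 \mid V)}{2}\right] q_1 p_2 + \left[\frac{1 - P(V_1 \mid V)}{2}\right] p_1 q_2 + P(W_1 \mid W)\,p_1 p_2
\end{aligned}
\]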
Again, we have shown that P(V1 | V) ≥ 2/3 and P(W1 | W) = 1. Hence, the above expression becomes:
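\[
\begin{aligned}
P(T_1 \mid \overline{C_3}) &\le \frac{1}{3} q_1 q_2 + \frac{1}{6} q_1 p_2 + \frac{1}{6} p_1 q_2 + p_1 p_2 \\
&= \frac{1}{3} - \frac{1}{6}(p_1 + p_2) + p_1 p_2 \\
&= \frac{1}{3} + \frac{1}{2} p_1 \left(p_2 - \frac{1}{3}\right) + \frac{1}{2} p_2 \left(p_1 - \frac{1}{3}\right) \qquad (8)
\end{aligned}
\]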
According to expression (8), if p1 < 1/3 and p2 < 1/3 (as shown above, almost always satisfied), \(P(T_1 \mid \overline{C_3}) < 1/3\). Because \(P(T_1 \mid C_3) > 1/3\) and \(P(T_1 \mid \overline{C_3}) < 1/3\), it is expected that as p3 increases (i.e., g3 becomes more likely to be connected to a “carrier” at t123), T1 becomes more likely to occur. This conclusion is the same as that drawn by analyzing \(\partial P(T_1) / \partial p_3\).
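The algebra behind expression (8) can be spelled out as follows. This worked step is added for illustration and assumes q_i = 1 − p_i, i.e., that each a_i is either a “carrier” or a “noncarrier”:

```latex
\[
\begin{aligned}
\tfrac{1}{3} q_1 q_2 &= \tfrac{1}{3} - \tfrac{1}{3} p_1 - \tfrac{1}{3} p_2 + \tfrac{1}{3} p_1 p_2, \\
\tfrac{1}{6} q_1 p_2 + \tfrac{1}{6} p_1 q_2 &= \tfrac{1}{6} p_1 + \tfrac{1}{6} p_2 - \tfrac{1}{3} p_1 p_2, \\
\tfrac{1}{3} q_1 q_2 + \tfrac{1}{6} q_1 p_2 + \tfrac{1}{6} p_1 q_2 + p_1 p_2 &= \tfrac{1}{3} - \tfrac{1}{6}(p_1 + p_2) + p_1 p_2 .
\end{aligned}
\]
```

When 0 ≤ p1 < 1/3 and 0 ≤ p2 < 1/3, both p1(p2 − 1/3) and p2(p1 − 1/3) are non-positive, and they vanish together only when p1 = p2 = 0, so the right-hand side of expression (8) is below 1/3 whenever at least one of p1 and p2 is positive.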
In fact, the expression of \(\partial P(T_1) / \partial p_3\) may be rearranged as follows:
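\[
\begin{aligned}
\frac{\partial P(T_1)}{\partial p_3} &= -\frac{1}{3} q_1 q_2 + \frac{1}{3} p_1 p_2 \\
&\quad + P(V_1 \mid V)\,q_1 q_2 - \left[\frac{1 - P(V_1 \mid V)}{2}\right] q_1 p_2 - \left[\frac{1 - P(V_1 \mid V)}{2}\right] p_1 q_2 \\
&\quad - P(W_1 \mid W)\,p_1 p_2 + \left[\frac{1 - P(W_1 \mid W)}{2}\right] p_1 q_2 + \left[\frac{1 - P(W_1 \mid W)}{2}\right] q_1 p_2 \\
&= \left\{\frac{1}{3} p_1 p_2 + P(V_1 \mid V)\,q_1 q_2 + \left[\frac{1 - P(W_1 \mid W)}{2}\right] p_1 q_2 + \left[\frac{1 - P(W_1 \mid W)}{2}\right] q_1 p_2\right\} \\
&\quad - \left\{\frac{1}{3} q_1 q_2 + \left[\frac{1 - P(V_1 \mid V)}{2}\right] q_1 p_2 + \left[\frac{1 - P(V_1 \mid V)}{2}\right] p_1 q_2 + P(W_1 \mid W)\,p_1 p_2\right\} \\
&= P(T_1 \mid C_3) - P(T_1 \mid \overline{C_3})
\end{aligned}
\]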
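The decomposition above can be verified numerically. The sketch below is an illustration under stated assumptions: q_i = 1 − p_i, P(T1) is taken to be p3·P(T1 | C3) + (1 − p3)·P(T1 | not C3) by the law of total probability, and the values p1 = 0.1, p2 = 0.2, P(V1 | V) = 2/3, and P(W1 | W) = 1 are arbitrary examples. The function names are hypothetical.

```python
def p_t1_given_c3(p1, p2, pv, pw):
    """P(T1 | C3): a3 is a carrier. pv = P(V1 | V), pw = P(W1 | W)."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return p1 * p2 / 3.0 + pv * q1 * q2 + (1.0 - pw) / 2.0 * (p1 * q2 + q1 * p2)


def p_t1_given_not_c3(p1, p2, pv, pw):
    """P(T1 | not C3): a3 is a noncarrier."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return q1 * q2 / 3.0 + (1.0 - pv) / 2.0 * (q1 * p2 + p1 * q2) + pw * p1 * p2


def p_t1(p1, p2, p3, pv, pw):
    """Total probability of T1 (assumed mixture over the state of a3)."""
    return (p3 * p_t1_given_c3(p1, p2, pv, pw)
            + (1.0 - p3) * p_t1_given_not_c3(p1, p2, pv, pw))


args = dict(p1=0.1, p2=0.2, pv=2.0 / 3.0, pw=1.0)
with_carrier = p_t1_given_c3(**args)
without_carrier = p_t1_given_not_c3(**args)

# Finite-difference slope of P(T1) with respect to p3.
step = 1e-6
slope = (p_t1(p3=0.3 + step, **args) - p_t1(p3=0.3, **args)) / step
print(with_carrier, without_carrier, slope)
```

With these example values, P(T1 | C3) lies above 1/3, P(T1 | not C3) lies below 1/3, and the slope matches their difference.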
An increase in p3 can be understood as a process where a fraction of gene trees with a “noncarrier” a3 are replaced by a fraction of gene trees with a “carrier” a3. In this fraction of gene trees, as gene trees with a “noncarrier” a3 are replaced by gene trees with a “carrier” a3, the probability of occurrence of gene tree topology T1 will change from \(P(T_1 \mid \overline{C_3})\) to \(P(T_1 \mid C_3)\). When this gene tree replacement process is expressed in mathematical language, it is the expression presented above. Therefore, the analysis of \(\partial P(T_1) / \partial p_3\) and the analysis of \(P(T_1 \mid \overline{C_3})\) and \(P(T_1 \mid C_3)\) are essentially equivalent. Clarifying the equivalence between the analysis of \(\partial P(T_1) / \partial p_3\) and the analysis of \(P(T_1 \mid \overline{C_3})\) and \(P(T_1 \mid C_3)\) helps to better understand the biological meaning of the expression of \(\partial P(T_1) / \partial p_3\). \(\partial P(T_2) / \partial p_3\) and \(\partial P(T_3) / \partial p_3\) can be similarly understood by decomposing them into \(P(T_2 \mid \overline{C_3})\) and \(P(T_2 \mid C_3)\), and into \(P(T_3 \mid \overline{C_3})\) and \(P(T_3 \mid C_3)\), respectively.
References
Blom G. 1989. Ratio of random variables. In: Fienberg S, Olkin I, editors. Probability and Statistics, Theory and Applications. New York, NY: Springer-Verlag. p. 91–92.
Kingman JFC. 1982. The coalescent. Stoch. Process. their Appl. 13:235–248.
Tajima F. 1983. Evolutionary relationship of DNA sequences in finite populations. Genetics 105:437–460.
Supplementary Figure S1. A. A comparison of simulation results at different levels of scaling. The result on the left was generated with parameters identical to those shown in Figure 5F; the result on the right was generated using 5-fold larger population sizes, a 5-fold lower selection coefficient, and a 5-fold lower mutation rate (i.e., using a scaling factor that is 1/5 of that of Fig. 5F). These results are similar. Each simulation result shown here only contains 3000 × 20 = 60,000 gene trees (in Fig. 5, each contains 6000 × 20 = 120,000). Therefore, the variation among replicates is greater than that in Figure 5.
[Figure S1. Left panel: identical to Fig. 5F (N123 = 200, μ = 1.5 × 10^-5, and s = -0.0075). Right panel: 5-fold larger population sizes (N123 = 1000, μ = 3 × 10^-6, and s = -0.0015).]
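The two parameter sets in Figure S1 can be compared through their population-scaled values. The sketch below is illustrative: it assumes that the relevant scaled quantities are the products N123 × s and N123 × μ (the scaling convention is not restated in this figure), and the dictionary keys and the function name scaled_parameters are hypothetical.

```python
import math

# Parameter values as given in the Figure S1 panel labels.
settings = {
    "identical_to_fig_5F": {"N123": 200, "mu": 1.5e-5, "s": -0.0075},
    "five_fold_larger": {"N123": 1000, "mu": 3e-6, "s": -0.0015},
}


def scaled_parameters(params):
    """Return (N123 * s, N123 * mu), the assumed population-scaled values."""
    return params["N123"] * params["s"], params["N123"] * params["mu"]


scaled_small = scaled_parameters(settings["identical_to_fig_5F"])
scaled_large = scaled_parameters(settings["five_fold_larger"])
print(scaled_small, scaled_large)
```

Both settings give the same scaled values up to floating-point rounding, which is consistent with the two panels producing similar results.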
[Figure labels: N1 = (1/25) × N12, N2 = N12, N3 = N12. Population sizes used by Vanderpool et al. (2020): N12:N123:N1:N2:N3 = 25:25:1:25:25 (σ123 = -7.5). Population sizes used by He et al. (2020): N12:N123:N1:N2:N3 = 5:5:1:25:25 (σ123 = -1.5).]
Supplementary Figure S2. A. The species tree set in the simulations of Vanderpool et al. (2020). The left side shows the SLiM codes used by Vanderpool et al. (2020). These codes correspond to the species tree shown on the right side. This species tree is not comparable to any of those used by He et al. (2020). Please compare the species tree shown here and that shown in Figure 5. Additionally, each locus in Vanderpool et al.’s simulations has 4^1000 possible allelic states, whereas each locus in the simulations of He et al. (2020) is biallelic. Because of these differences, Vanderpool et al.’s simulations cannot be used to examine the replicability of the simulation results of He et al. (2020). Among the differences mentioned above, the most crucial difference lies in N123. The population sizes used by Vanderpool et al. (2020) are in the ratio N12:N123:N1:N2:N3 = 25:25:1:25:25, whereas the population sizes used by He et al. (2020) are in the ratio N12:N123:N1:N2:N3 = 5:5:1:25:25. That is to say, the N123 used by Vanderpool et al. (2020) is effectively 5 times that used by He et al. (2020). Consequently, the scaled selection coefficient in ancestral species S123 (σ123) in Vanderpool et al.’s simulations has an absolute value 5 times that in the simulations of He et al. (2020) (7.5 versus 1.5). Because of this difference, in Vanderpool et al.’s simulations, deleterious mutations occurring in ancestral species S123 have a much lower probability of being present in extant species S1, S2, and S3. In such a situation, the ancestries of g1, g2, and g3 are often traced back to three identical ancestors. It is unsurprising that asymmetry in gene tree distribution is not pronounced in such a situation. In reality, slightly deleterious mutations have a chance of being present in species S1, S2, and S3. Therefore, the simulation settings used by Vanderpool et al. (2020) are not realistic. B. We changed the population sizes used by Vanderpool et al. (2020) to those used by He et al. (2020). Other settings used by Vanderpool et al. (2020) remained unchanged. This change in population sizes causes a change in the distribution of gene tree topologies. Such a phenomenon contradicts Vanderpool et al.’s claim that “there should be no effect of negative selection on the distribution of tree topologies”.
[Figure S2. Panel A: species tree used by Vanderpool et al. (2020); labels: N1, N2, N3, N12, N123 (N123 = N12); 0.005 × (2N12), 4 × (2N12), 10 × (2N12). Panel B: population sizes used by Vanderpool et al. (2020), N12:N123:N1:N2:N3 = 25:25:1:25:25 (σ123 = 0); column labels: Neutral, Purifying selection, Purifying selection.]
Supplementary Figure S3. The values of P(V1|V) under neutral evolution and under purifying selection. Under neutral evolution, the observed values of P(V1|V) agree with the theoretical expectation described in the main text, P(V1|V) = 2/3. Under purifying selection, the observed values of P(V1|V) are slightly greater than 2/3. The result of the Mann–Whitney U test suggests that the difference between the values of P(V1|V) under neutral evolution and those under purifying selection is statistically significant (p = 5.2 × 10^-5). This phenomenon can be understood as follows: under purifying selection, “noncarriers” tend to produce more offspring than under neutral evolution. The necessary condition for the occurrence of V1 genealogical trees is that a “noncarrier” produces more than one offspring. Therefore, under purifying selection, it is expected that V1 genealogical trees have a greater probability of occurrence than under neutral evolution. In addition, this phenomenon can be understood by the ancestral selection graph (ASG) model. In the ASG model, the influence of selection is represented by “incoming branches”. The presence of incoming branches can cause a proportion of type W1 genealogical trees to be transformed into type V1 genealogical trees. Therefore, the value of P(V1|V) is expected to increase under purifying selection.
[Figure S3. Axis label: P(V1|V). Groups: Neutral (σ123 = 0); Purifying selection (σ123 = -1.5).]