Molecular Genetics and Genomics (2023) 298:153–160
https://doi.org/10.1007/s00438-022-01969-0
ORIGINAL ARTICLE
Neolithic expansion and the 17q21.31 inversion in Iberia: an evolutionary approach to H2 haplotype distribution in the Near East and Europe
Ibone Espinosa1 · Miguel A. Alfonso-Sánchez1 · Luis Gómez-Pérez1 · Jose A. Peña1
Received: 17 December 2021 / Accepted: 25 October 2022 / Published online: 10 November 2022
© The Author(s) 2022
Abstract
The chromosomal region 17q21.31 harbors a 900 kb inversion polymorphism named after the microtubule-associated protein tau (MAPT) gene. Since no recombination occurs, two haplotypes are recognized: a directly oriented variant (H1) and an inverted variant (H2). The H2 haplotype features a distribution pattern with high frequencies in the Near East and Europe, medium levels in South Asia and North Africa, and low levels elsewhere. Studies of this genomic region are relevant owing to its likely association with numerous neurodegenerative diseases. However, the causes underlying the geographic distribution of the haplotype frequencies remain a bone of contention among researchers. With this work, we have intended to outline a plausible hypothesis on the origin of the high European H2 frequencies. To that end, we have analyzed an extensive population database (including three new Iberian populations) to explore potential clinal variations of H2 frequencies. We found a sigmoidal frequency cline with an upward trend from South Asia to Europe. The maximum value was detected in the Basques from Gipuzkoa province (0.494) with the curve's inflection point in the Near East. From our results, we suggest that the most likely scenario for high H2 frequencies in Europe would be a founding event in the Near East during the late Paleolithic or early Neolithic. Subsequently, such H2 overrepresentation would have reached Europe with the arrival of the first Neolithic farmers. The current frequencies and geographic distribution of the 17q21.31 inversion suggest that the founding events mainly affected the H2D subhaplotype.
Keywords MAPT · Tau protein · Basques · Founder effect · Genetic drift · Most recent common ancestor
Introduction
The chromosomal region 17q21.31 harbors a 900 kb inversion polymorphism (Stefansson et al. 2005), named after one of the most studied genes in this region because of its plausible association with several neurodegenerative diseases (Ballatore et al. 2007; Arendt et al. 2016): the microtubule-associated tau protein (MAPT) gene. Linkage disequilibrium resulting from changes in the orientation of the 17q21.31 genomic region has led to the definition of two distinct haplotypes: H1, considered the common, directly oriented variant, and the inverted H2 variant (Donnelly et al. 2010). In addition, some authors have identified detectable levels of MAPT haplotype diversity, thus distinguishing different subhaplotypes such as H2' and H2D. Among them, H2' would be the ancestral H2 haplotype and H2D, carrying some duplications, would be the derived haplotype (Steinberg et al. 2012). The H1 haplotype is globally widespread, being present in all human populations studied to date (Evans et al. 2004). On the other hand, the less frequent H2 haplotype is virtually absent in sub-Saharan Africa, East Asia, the Americas, and Oceania, thus contrasting with the notable frequencies observed in Southwest Asia, Europe and North Africa (Evans et al. 2004; Donnelly et al. 2010; Steinberg et al. 2012; Alves et al. 2015).
Different estimates of the time to the Most Recent Common Ancestor (MRCA) of the inversion have been obtained so far, leading to divergent hypotheses on the origin of the H2 ancestral haplotype. These estimates range from 3 million years (Stefansson et al. 2005) to less than 100 kyrs (Donnelly et al. 2010). The source area is also controversial, as these authors suggest that it could be either Africa (Stefansson et al. 2005; Donnelly et al. 2010) or southwest Asia (Donnelly et al. 2010). However, demographic events that might have given rise to the current worldwide distribution of the H2 haplotype remain relatively unexplored. Intending to provide new evidence on why H2 is especially frequent in southern Europe, herein we analyzed an extensive population database (including three new Iberian populations) to look for potential clinal variations of H2 frequencies across the territory involved.
Communicated by Shuhua Xu.
* Jose A. Peña
[email protected]
1 Departamento de Genética, Antropología Física y Fisiología Animal, Universidad del País Vasco (UPV/EHU), Apartado 644, 48080 Bilbao, Spain
Materials and methods
Study samples
Aimed at obtaining representative samples of the genetic variability in the Iberian Peninsula, our study included three population samples from two geographically and genetically differentiated Spanish regions. Thus, we assessed the genetic diversity of the Cantabrian watershed by sampling two Basque areas. On the other hand, analysis of the Mediterranean region relied on a sample from the Valencian community.
A recent study suggested that Basque human groups might have undergone remarkable population isolation since, at least, the Iron Age. Logically, this fact would have been crucial to make autochthonous Basques an anthropologically distinct population (Olalde et al. 2019). Besides the complex orography of the Basque area, several studies have claimed that Euskera, the native Basque language of pre-Indo-European origin, could have represented a linguistic barrier to gene flow with neighbouring populations speaking a language other than Euskera (Alfonso-Sánchez et al. 2005; García-Obregón et al. 2007). Thus, we gathered samples from two distinct zones of the traditional Basque area essentially distinguished by the prevalence of Euskera, a factor frequently neglected in investigations involving Basques. Specifically, study samples stemmed from different regions of the Gipuzkoa province and the Baztan valley, located in the northern end of Navarre province. At present, Euskera persists in both areas. Yet, while Gipuzkoa borders other Basque-speaking populations in Spain and the French Basque region (Iparralde), Navarra is mainly surrounded by Spanish territories speaking an Indo-European language (Castilian). Accordingly, Navarre province has traditionally been more permeable to gene flow from outside the Basque-speaking area (Pérez-Miranda et al. 2005).
As previously mentioned, we examined the genetic heterogeneity of the Mediterranean basin by analyzing a sample from Valencia. Contrary to the population isolation of the Basque area, Valencia seems to have been under the influence of recurrent gene flow events since the Iron Age, coming from both the Central and Eastern Mediterranean and North Africa (García-Obregón et al. 2006; Olalde et al. 2019).
MAPT haplotyping
Assignment of MAPT haplotypes/subhaplotypes involved typing of two single nucleotide polymorphisms (SNPs), namely rs10514879 and rs199451 (see Supplementary Table 1), using the high-resolution melting (HRM) assay (Alfonso-Sánchez et al. 2018).
To refine the MAPT haplotype identification, we further examined four short tandem repeats (STRs) (MAPT07, 08, 09 and 14) located within the boundaries of the inversion of the MAPT gene, following the protocol developed by Donnelly et al. (2010). Microsatellites were typed via PCR amplification with fluorescently labeled primers, and the lengths of the PCR products were determined on an ABI Prism 310 Genetic Analyzer (Applied Biosystems, Foster City, CA).
Statistical and phylogenetic analysis
MAPT haplotypic frequencies in European, Near Eastern and South Asian human samples were collected from published studies to broaden the geographical context of this study (Supplementary Table 2). As mentioned, these geographic regions show higher MAPT*H2 haplotype frequencies than those reported for other continents (Evans et al. 2004). Alves et al. (2015) also referred to relatively high H2 frequencies in North Africa, but no published frequencies were found in the relevant literature. In the next step, H2 frequency data were used to assess genetic heterogeneity in selected population clusters by a hierarchical analysis of molecular variance (AMOVA) using the program Arlequin 3.5 (Excoffier and Lischer 2010). We further explored potential geographic (spatial) patterns of haplotype frequencies using the GenoCline software (Peña et al. 2021), which can detect frequency gradients either in the form of linear or sigmoid genetic clines.
Phylogenetic relationships were examined by constructing a genealogy of STR/SNPs haplotypes using the median-joining network approach (Bandelt et al. 1995) in the Network 10.2 software (Fluxus Technology). Subsequently, STR/SNPs haplotype data were input to calculate the time elapsed to the most recent common ancestor (MRCA) of the Iberian haplotypes, following the method of Stephens et al. (1998). To estimate the number of generations to the MRCA, we considered a mutation rate between 0.0005 and 0.0010, as per Donnelly et al. (2010). We also assumed a recombination rate of zero, considering that all four STRs examined are within the inversion. Finally, we set an average generation length of 25 years.
To gain insights into the evolution of variations in MAPT haplotype frequencies over time, we identified and compiled MAPT haplotypes from ancient DNA databases of European (Lazaridis et al. 2014, 2016, 2017; Mathieson et al. 2015, 2018; Fu et al. 2016; Lipson et al. 2017; Olalde et al. 2018, 2019) and South Asian (Narasimhan et al. 2019) samples.
Finally, H2' and H2D frequencies were compiled for different continents (Supplementary Table 3), aiming to elucidate which of these two subhaplotypes could account for the origin of the high MAPT*H2 European frequencies.
Results
Table 1 presents MAPT haplotype and subhaplotype frequencies of Iberian samples. In Gipuzkoa Basques, tau H1 and H2 haplotype frequencies were similar, around 50% (H1: 50.6%, H2: 49.4%). In contrast, the haplotype frequencies of the Navarrese Basques and Valencians were in line with European populations, with a clear predominance of H1 (68.9% and 74.2%, respectively). In Gipuzkoa, the high H2 frequency is primarily due to the high incidence of the inverted haplotype with duplication (H2D: 98.8% of total H2). Interesting results emerged by applying the likelihood ratio test (or G test) to evaluate population differentiation. Thus, we found statistically significant differences of H2 haplotype frequencies between the pool of Basque collections and the Valencian sample (G = 10.16, df = 1, P < 0.01). This genetic heterogeneity was determined by Gipuzkoa Basques. In this way, while H2 frequencies between Navarrese Basques and Valencians did not show significant differences (G = 1.26, df = 1, P = 0.262), we found statistically significant differences between Gipuzkoan and Navarrese Basques (G = 13.87, df = 1, P < 0.001).
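The pairwise G statistics above can be approximately reproduced from the chromosome counts implied by Table 1. The following is a minimal sketch, not the authors' workflow: the H2 counts (89, 65 and 47) are assumptions obtained by rounding frequency × N, and the function name `g_test_2x2` is hypothetical.

```python
import math

def g_test_2x2(h2_a, n_a, h2_b, n_b):
    """Likelihood ratio (G) test of independence on a 2x2 table of
    haplotype counts (H2 vs H1) in two samples.
    Returns (G, p_value) for 1 degree of freedom."""
    observed = [[h2_a, n_a - h2_a], [h2_b, n_b - h2_b]]
    row_totals = [n_a, n_b]
    col_totals = [h2_a + h2_b, (n_a - h2_a) + (n_b - h2_b)]
    total = n_a + n_b
    g = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            if observed[i][j] > 0:  # a zero cell contributes nothing
                g += observed[i][j] * math.log(observed[i][j] / expected)
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding error
    # Chi-square survival function for df = 1: P(X > g) = erfc(sqrt(g / 2))
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value

# Assumed counts (H2 chromosomes, total chromosomes): round(frequency * N)
GIPUZKOA = (89, 180)
NAVARRA = (65, 210)
VALENCIA = (47, 182)

g_nv, p_nv = g_test_2x2(*NAVARRA, *VALENCIA)
g_gn, p_gn = g_test_2x2(*GIPUZKOA, *NAVARRA)
print(f"Navarra vs Valencia: G = {g_nv:.2f}, P = {p_nv:.3f}")
print(f"Gipuzkoa vs Navarra: G = {g_gn:.2f}, P = {p_gn:.5f}")
```

With these rounded counts, the Navarra versus Valencia comparison gives G ≈ 1.26, and the Gipuzkoa versus Navarra comparison gives G ≈ 13.9, close to the reported 13.87. Small discrepancies are expected because the exact counts are not given in the text.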
There was a distinctive pattern in the MAPT*H2 haplotype frequency distribution throughout the territory from Europe to South Asia (see Fig. 1 and Supplementary Table 2). Haplotype frequencies tend to be lower in South Asia and the eastern part of the Near East, ranging from 0.00 to 0.13. In contrast, in the western part of the Near East and Europe, H2 frequencies tend to be much higher, ranging from 0.15 to 0.49.
An analysis of the spatial structuring of haplotype frequencies using the GenoCline program revealed a sigmoid cline with a Southeast-Northwest orientation, specifically with an azimuth of 312° to the North. All sigmoid function parameters were statistically significant, including the ordinate scaling factor (P < 10⁻⁶), slope (P < 10⁻³), midpoint (P < 10⁻⁶), and F ratio (P < 10⁻⁶).
Table 1 Frequency estimates (± standard error) of direct (H1) and inverted (H2) MAPT haplotypes in three Iberian populations

        Gipuzkoa         Navarra          Valencia
N       180              210              182
H1      0.506 ± 0.037    0.689 ± 0.035    0.742 ± 0.032
H2      0.494 ± 0.037    0.311 ± 0.035    0.258 ± 0.032
H2'     0.006 ± 0.006    0.033 ± 0.013    0.022 ± 0.011
H2D     0.488 ± 0.037    0.278 ± 0.033    0.236 ± 0.031

Also displayed are the frequencies of inverted subhaplotypes without duplication (H2'), and with duplication (H2D)
N sample size as chromosome number
Fig. 1 Regression of MAPT*H2 haplotype frequencies on geographic coordinates in a set of South Asian and European populations. The y-axis is a coordinate axis with a rotation of 312° relative to the North. Results matched the sigmoid function $y = \dfrac{a}{1 + e^{-b(x - c)}}$, with a = 0.2687, b = 13.8090 and c = 0.2519. For population labels see Supplementary Table 2. Solid circles represent South Asia, open circles the Near East, and squares Europe
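As an illustration of the fitted curve, the sketch below evaluates the sigmoid function from the Fig. 1 caption with the reported parameter values. The scaling of the coordinate x is not stated in the caption, so the evaluation points used here are assumptions chosen only to show the shape of the function. The name `sigmoid_cline` is hypothetical.

```python
import math

# Parameter values reported in the Fig. 1 caption
A, B, C = 0.2687, 13.8090, 0.2519

def sigmoid_cline(x, a=A, b=B, c=C):
    """Expected H2 frequency at position x along the rotated axis:
    y = a / (1 + exp(-b * (x - c)))."""
    return a / (1.0 + math.exp(-b * (x - c)))

# At the midpoint x = c the curve reaches half of its upper asymptote a
midpoint_value = sigmoid_cline(C)

# Far below and far above the midpoint the curve approaches 0 and a
# (the offsets of one unit are arbitrary illustration values)
low_end = sigmoid_cline(C - 1.0)
high_end = sigmoid_cline(C + 1.0)
print(midpoint_value, low_end, high_end)
```

The parameter c therefore marks the inflection point of the cline, and a sets the plateau that the fitted frequencies approach on the European side of the axis.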
The midpoint of the threshold (i.e., the curve's inflection point) appeared around the Near East, indicating a shift in the function's trend in this region. As a result, populations from the eastern portion of the Near East showed frequencies below 0.15 (Kuwait: 0.09 and Yemenite Jews: 0.12), while those in the western area featured frequencies above this value (Jordan: 0.39, Bedouin of Jordan: 0.37, Druze: 0.32, Samaritan: 0.31, Bedouin: 0.26, Palestine: 0.23, and Kurds: 0.21).
Through AMOVA, we compared the H2 frequencies between the geographic regions delimited by sigmoid regression (South Asia, including Eastern-Near East versus Europe, including Western-Near East). All of the fixation indices obtained were highly significant at P < 10⁻⁵ (FST: 0.1455, FSC: 0.0025 and FCT: 0.01689), thus confirming genetic heterogeneity between Europe and South Asia in terms of MAPT*H2 frequencies. Further analysis based only on European populations revealed a significant cline of MAPT*H2 (R: 0.6094; P < 0.01), but in this case, spatial variation in H2 frequencies matched the linear regression model (Fig. 2). The resulting genetic cline was North–South oriented, with an azimuth of 190° relative to the North.
Southern European samples, especially Spanish groups, exhibited high levels of genetic heterogeneity. Spanish populations evidenced a wide range of H2 haplotype frequencies, varying from 0.21 in Catalonia to 0.49 in Gipuzkoa. Overall, such a wide variability could be attributable to the differences between autochthonous Basques and the remaining Spanish samples. In this way, MAPT*H2 frequencies in the non-Basque collections ranged from 0.21 to 0.26, whereas the Basque collections exhibited figures between 0.31 and 0.49. Non-native Basques (residing population of the Basque Country), with a high admixture level between both groups, recorded an intermediate frequency (0.28). Earlier studies have suggested that the French-Basque population has traditionally experienced substantial gene flow from neighbouring non-Basque communities (Calderón et al. 1998, 2000), as reflected by the lower H2 frequency relative to Iberian Basques. The strikingly high frequency of MAPT*H2 in Gipuzkoa (0.49) is noteworthy. Such a value represents a maximum among the world populations analyzed to date.
To delve into the causes of the high frequency of H2 in Europe, we analyzed four STRs from the MAPT region (Donnelly et al. 2010) in both the study samples (Navarra, Gipuzkoa, and Valencia) and the general population in the Basque Country (Alfonso-Sánchez et al. 2018). Network analysis of the haplotypes generated for this sample set (Fig. 3) depicts one group of haplotypes associated with H1 (left side of the graph) and two groups of haplotypes associated with H2 (right side). Haplotype a is the most frequent one (0.075) and presumably the ancestor of all Iberian H2 haplotypes. For this reason, we relied on data for haplotype a to perform the MRCA estimates.
Estimates of the time to the MRCA of Iberian H2 haplotypes ranged between 648 and 1295 generations, i.e. between 16,189 and 32,378 years. The range central value was about 24 kiloyears (kyrs), that is, around the beginning of the Last Glacial Maximum.
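The conversion from generations to years can be checked with a few lines of arithmetic. This is only a sketch under the stated 25-year generation length: it uses the rounded generation counts quoted above, so it yields 16,200 and 32,375 years rather than the reported 16,189 and 32,378, which presumably derive from unrounded generation estimates (an assumption, since the unrounded values are not given). The function name is hypothetical.

```python
GENERATION_YEARS = 25  # average generation length stated in the Methods

def generations_to_years(generations, years_per_generation=GENERATION_YEARS):
    """Convert a number of generations into calendar years."""
    return generations * years_per_generation

lower_years = generations_to_years(648)       # lower bound of the MRCA age
upper_years = generations_to_years(1295)      # upper bound of the MRCA age
central_years = (lower_years + upper_years) / 2

print(lower_years, upper_years, central_years)
```

The roughly twofold span between the bounds is what one would expect if the estimate scales inversely with the assumed mutation rate (0.0010 versus 0.0005), although the text does not spell out this relationship.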
A set of previous studies provides genome-wide data on ancient European populations (references in the Methods section). A molecular polymorphism commonly examined in those studies based on ancient DNA from human remains is the SNP rs10514879, a diagnostic marker of MAPT haplotype assignment. Unfortunately, Iberian samples with identifiable genotypes proved rare, so all European individuals were pooled. Table 2 shows the frequencies arranged according to different cultural periods. All Paleolithic and Mesolithic samples showed H2 frequencies below the values of extant Europeans. By contrast, frequencies corresponding to the Neolithic, Chalcolithic and Bronze Age fell within the range of present European frequencies, above 0.15.
Fig. 2 Linear regression of MAPT*H2 haplotype frequencies in European populations against a rotating coordinate axis. Parameter values match the function y = a + b*x, where a = 0.1553, and b = 0.1634. The axis is rotated 190° relative to the North. Dashed lines are 95% confidence limits
MAPT haplotype assignment was also feasible on ancient Central and South Asian populations from data published by Narasimhan et al. (2019). The number of human remains with identifiable MAPT haplotypes was low, except for the Bronze Age, where 21 out of 92 total chromosomes could be classed as H2 haplotypes. H2 frequency in this group was strikingly high (0.23), a value clearly above frequencies currently observed in Central and South Asia (Table 3).
Unlike African populations, where H2D is generally less frequent than H2' (Steinberg et al. 2012), H2D predominates in Europe, with clearly higher frequencies than H2'. Average H2' frequencies tend to be low, with figures not exceeding 0.03 in any continent (East Asia: 0.00; South Asia: 0.02; Africa: 0.02; Europe: 0.03). In contrast, H2D tends to show more uneven values, with low average frequencies in East Asia (0.00) and Africa (0.01), intermediate in South Asia (0.06), and substantially higher in Europe (0.24), where figures oscillated from 0.09 to 0.49 (see Supplementary Table 3).
Discussion
Estimates of the time to the MRCA of tau H2 haplotypes in previous works have been far from consistent, ranging from 3 million years (Stefansson et al. 2005) to less than 100 kyrs (Donnelly et al. 2010). The hypothetical area of origin also remains ambiguous and the authors cited above suggest that it could be either Africa (Stefansson et al. 2005; Donnelly et al. 2010) or Southwest Asia (Donnelly et al. 2010). Also, some authors have postulated a selective advantage of H2 (Stefansson et al. 2005; Alves et al. 2015). In any case, the differentiating factor of the tau H2 haplotype among human populations is its uneven worldwide distribution pattern, characterized by high frequencies in the Near East and Europe, medium frequency levels in South Asia and North Africa, and low levels elsewhere (Evans et al. 2004; Donnelly et al. 2010; Alves et al. 2015).
Fig. 3 Haplotype phylogenetic network based on four STRs and two SNPs from the MAPT region of several Spanish populations. The left-side group of haplotypes represents individuals carrying MAPT*H1 (exception marked with an asterisk), while the right-side clusters correspond to MAPT*H2 individuals. The label a indicates the most frequent H2 haplotype. Solid circles represent H2' subhaplotypes
Table 2 Frequency estimates (± standard error) of the MAPT*H2 haplotype in ancient DNA samples from the European continent, arranged by cultural periods

                            N      H2
Paleolithic & Mesolithic    69     0.043 ± 0.012
Neolithic                   109    0.220 ± 0.020
Chalcolithic                65     0.154 ± 0.022
Bronze age                  124    0.161 ± 0.017

N sample size as chromosome number
Table 3 Frequency estimates (± standard error) of the MAPT*H2 haplotype in ancient DNA samples from Central and South Asia, arranged by cultural periods

                N     H2
Neolithic       11    0.000 ± 0.000
Chalcolithic    10    0.200 ± 0.063
Bronze age      92    0.228 ± 0.022
Iron age        31    0.065 ± 0.022

N sample size as chromosome number
Our study first examined the spatial distribution of MAPT haplotype frequencies in the Iberian Peninsula based on three geographically and demographically distinct populations. As one of our main findings, the remarkably high H2 frequency recorded for the Basque population of Gipuzkoa (0.49) stands out as the highest among all human groups analyzed to date. Gipuzkoa Basques are exceptional because their H2 frequencies are practically equal to H1 (0.51). As a result, we could detect significant genetic heterogeneity between a pool of the Basque samples (Gipuzkoa and Navarre) and the Mediterranean collection (Valencia), on the one hand, and also between the two native Basque groups, most likely due to the unusually high H2 frequency in Gipuzkoa. There has been plenty of evidence that autochthonous Basques from Gipuzkoa are unique in terms of language (predominance of Euskera in rural villages), topography, inbreeding patterns and consanguinity structures, among other traits, relative to other neighboring Basque groups (Alfonso-Sánchez et al. 2005; Pérez-Miranda et al. 2005).
Second, we analyzed the H2 clinal variation in South Asia and Europe, finding a statistically significant sigmoidal cline. Usually, such clines indicate hybrid zones, i.e., contact zones between two genetically divergent populations (Barton and Hewitt 1985). Here, the distinct populations would be the western Near East and Europe, on the one hand, and the eastern Near East and southern Asia, on the other. The hybrid zone would therefore be the Near East.
Then, we utilized a set of four STRs and two SNPs to estimate the time to the MRCA of Iberian H2 haplotypes. Along these lines, including the MAPT haplotypes identified from ancient DNA analysis facilitated the tracking of H2 frequencies across Europe and South Asia over time, especially with the stratification of individuals according to cultural periods. The findings of such an approach supported our dating estimates and the likely origins of the 17q21.31 inversion in Europe.
Based on our estimates, the origin of Iberian H2 haplotypes would have occurred between 16 and 32 thousand years ago, with a central value of 24 thousand years, i.e. around the onset of the Last Glacial Maximum. Therefore, while the MRCA age would be centered around the dawn of the Last Glacial Maximum, the estimate's upper limit would fall within such an evolutionary milestone. This range of dating allows us to conjecture two plausible evolutionary scenarios. A first look would focus on the frequency peak found in Basques (Gipuzkoa sample). Thus, we could postulate the impact of genetic drift by cumulative effects of founder events during the Last Glacial Maximum in the Franco-Cantabrian refugium and a subsequent spreading associated with the postglacial recolonization of central and northern Europe (Torroni et al. 1998). However, this hypothesis would not explain the high frequencies found in other populations from southern Europe and the Near East.
A second scenario would assume that founder effect episodes occurred in the Near East during the late Paleolithic or early Neolithic (Alkaraki et al. 2021), with a subsequent dispersal across Europe associated with Neolithic demic diffusion (Harris 2017; Isern et al. 2017). In this case, the high H2 frequency in Gipuzkoa might be explained by more recent genetic drift events associated with the persistent isolation of autochthonous Basque communities until relatively recent times (Alfonso-Sánchez et al. 2008; Olalde et al. 2019). Our findings based on ancient genomes revealed that high H2 frequencies in Europe seem to have emerged during Neolithic times, as Paleolithic and Mesolithic individuals showed relatively low frequencies: only three out of 69 chromosomes carried the H2 haplotype in Europe.
The Neolithic hypothesis could explain the high H2 levels in Southern Europe, particularly in Sardinia (0.375), whose population gene pool still shares substantial similarities with the early European farming groups (Lazaridis et al. 2014). On the other hand, high frequencies identified in the western region of the Near East might well be genetic traces of the haplotype's origins. The relatively high H2 frequencies in South Asia might mirror the expansion of Eurasian steppe peoples during the Bronze Age (Narasimhan et al. 2019). In turn, comparatively high frequencies in North Africa (Alves et al. 2015) could be the consequence of the Arab migrations stemming from the Near East since the 8th century.
Finally, when considering frequencies of H2' and H2D subhaplotypes, we found that H2', proposed as an ancestral subhaplotype of MAPT*H2 (Steinberg et al. 2012), presented low frequencies throughout all continents. In contrast, H2D, derived from H2' and carrying some duplications, featured the highest frequencies in Europe and intermediate values in South Asia. These values match the current geographic distribution of the tau H2 haplotype. Thus, our findings suggest that genetic drift caused by founder effects in the Near East (which might have contributed to the high H2 frequencies observed there and in Europe) would have primarily affected chromosomes carrying duplications, i.e., H2D.
In summary, the most plausible hypothesis for the origin of high MAPT*H2 frequencies in Europe seems to point to genetic drift by founder events during the late Paleolithic or early Neolithic in the western Near East. According to currently available H2 frequency data and the geographic distribution of chromosomal variants carrying the 17q21.31 inversion, the founder effects would have mainly affected the H2D subhaplotype. H2 overrepresentation would then have entered Europe with the first Neolithic farming communities, on the one hand, and spread to South Asia with the migrations and expansions of Indo-European peoples, on the other. The robustness and scope of our findings could be improved by increasing the number of ancient DNA samples from different cultural periods in Europe, North Africa, and Southwest Asia to refine both the timing and the hypothetical dispersal routes of H2 haplotypes.
Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s00438-022-01969-0.
Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This study was funded by the Basque Government (Grant to consolidated research groups, IT-833-13, and SAIOTEK Program, S-PE10UN54).
Declarations
Conflict of interest The authors declare that there is no conflict of interest or competing interests.
Ethical approval Ethical guidelines for research with human beings were adhered to as stipulated by the Institutional Review Board from the University of the Basque Country (UPV/EHU), including written informed consent of voluntary participants in the study. The study protocol was approved by the Ethics Committee of the cited institution.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
Alfonso-Sánchez MA, Aresti U, Peña JA, Calderón R (2005) Inbreeding levels and consanguinity structure in the Basque province of Guipúzcoa (1862–1980). Am J Phys Anthropol 127:240–252. https://doi.org/10.1002/ajpa.20172
Alfonso-Sánchez MA, Cardoso S, Martínez-Bouzas C, Peña JA, Herrera RJ, Castro A et al (2008) Mitochondrial DNA haplogroup diversity in Basques: a reassessment based on HVI and HVII polymorphisms. Am J Hum Biol 20:154–164. https://doi.org/10.1002/ajhb.20706
Alfonso-Sánchez MA, Espinosa I, Gómez-Pérez L, Poveda A, Rebato E, Peña JA (2018) Tau haplotypes support the Asian ancestry of the Roma population settled in the Basque Country. Heredity 120:91–99. https://doi.org/10.1038/s41437-017-0001-x
Alkaraki AK, Abuelezz AI, Khabour OF, Peña JA, Alfonso-Sánchez MA, Altaany Z (2021) MAPT haplotypes in Jordan: evidence on the Middle East as a melting-pot predating Neolithic migration. Ann Hum Biol. https://doi.org/10.1080/03014460.2021.1983018
Alves JM, Lima AC, Pais IA, Amir N, Celestino R et al (2015) Reassessing the evolutionary history of the 17q21 inversion polymorphism. Genome Biol Evol 7(12):3239–3248. https://doi.org/10.1093/gbe/evv214
Arendt T, Stieler JT, Holzer M (2016) Tau and tauopathies. Brain Res Bull 126:238–292. https://doi.org/10.1016/j.brainresbull.2016.08.018
Ballatore C, Lee VMY, Trojanowski JQ (2007) Tau-mediated neurodegeneration in Alzheimer's disease and related disorders. Nat Rev Neurosci 8:663–672. https://doi.org/10.1038/nrn2194
Bandelt HJ, Forster P, Sykes BC, Richards MB (1995) Mitochondrial portraits of human populations using median networks. Genetics 141:743–753. https://doi.org/10.1093/genetics/141.2.743
Barton NH, Hewitt GM (1985) Analysis of hybrid zones. Annu Rev Ecol Syst. https://doi.org/10.1146/annurev.es.16.110185.000553
Calderón R, Vidales C, Pena JA, Perez-Miranda A, Dugoujon JM (1998) Immunoglobulin allotypes (GM and KM) in Basques from Spain: approach to the origin of the Basque population. Hum Biol 70:667–698
Calderón R, Pérez-Miranda A, Peña JA, Vidales C, Aresti U, Dugoujon JM (2000) The genetic position of the autochthonous subpopulation of Northern Navarre (Spain) in relation to other Basque subpopulations. A study based on GM and KM immunoglobulin allotypes. Hum Biol 72:619–640
Donnelly MP, Paschou P, Grigorenko E, Gurwitz D, Mehdi SQ, Kajuna SL et al (2010) The distribution and most recent common ancestor of the 17q21 inversion in humans. Am J Hum Genet 86:161–171. https://doi.org/10.1016/j.ajhg.2010.01.007
Evans W, Fung HC, Steele J, Eerola J, Tienari P, Pittman A et al (2004) The tau H2 haplotype is almost exclusively Caucasian in origin. Neurosci Lett 369:183–185. https://doi.org/10.1016/j.neulet.2004.05.119
Excoffier L, Lischer HE (2010) Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Res 10:564–567. https://doi.org/10.1111/j.1755-0998.2010.02847.x
Fu Q, Posth C, Hajdinjak M, Petr M, Mallick S, Fernandes D et al (2016) The genetic history of Ice Age Europe. Nature 534:200–205. https://doi.org/10.1038/nature17993
García-Obregón S, Alfonso-Sánchez MA, Pérez-Miranda AM, Vidales C, Arroyo D, Peña JA (2006) Genetic position of Valencia (Spain) in the Mediterranean basin according to Alu insertions. Am J Hum Biol 18(2):187–195. https://doi.org/10.1002/ajhb.20487
García-Obregón S, Alfonso-Sánchez MA, Pérez-Miranda AM, De Pancorbo MM, Peña JA (2007) Polymorphic Alu insertions and the genetic structure of Iberian Basques. J Hum Genet 52:317–327. https://doi.org/10.1007/s10038-007-0114-9
Harris EE (2017) Demic and cultural diffusion in prehistoric Europe in the age of ancient genomes. Evol Anthropol 26:228–241. https://doi.org/10.1002/evan.21545
Isern N, Fort J, de Rioja VL (2017) The ancient cline of haplogroup K implies that the Neolithic transition in Europe was mainly demic. Sci Rep 7:11229. https://doi.org/10.1038/s41598-017-11629-8
Lazaridis I, Patterson N, Mittnik A, Renaud G, Mallick S, Kirsanow K et al (2014) Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513:409–413. https://doi.org/10.1038/nature13673
Lazaridis I, Nadel D, Rollefson G, Merrett DC, Rohland N, Mallick S et al (2016) Genomic insights into the origin of farming in the ancient Near East. Nature 536:419–424. https://doi.org/10.1038/nature19310
Lazaridis I, Mittnik A, Patterson N, Mallick S, Rohland N, Pfrengle S et al (2017) Genetic origins of the Minoans and Mycenaeans. Nature 548:214–218. https://doi.org/10.1038/nature23310
Lipson M, Szécsényi-Nagy A, Mallick S, Pósa A, Stégmár B, Keerl V et al (2017) Parallel palaeogenomic transects reveal complex genetic history of early European farmers. Nature 551:368–372. https://doi.org/10.1038/nature24476
Mathieson I, Lazaridis I, Rohland N, Mallick S, Patterson N, Roodenberg SA et al (2015) Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528:499–503. https://doi.org/10.1038/nature16152
Mathieson I, Alpaslan-Roodenberg S, Posth C, Szécsényi-Nagy A, Rohland N, Mallick S et al (2018) The genomic history of southeastern Europe. Nature 555:197–203. https://doi.org/10.1038/nature25778
Narasimhan VM, Patterson N, Moorjani P, Lazaridis I, Lipson M, Mallick S et al (2019) The genomic formation of South and Central Asia. Science 365:eaat7487. https://doi.org/10.1126/science.aat7487
Olalde I, Brace S, Allentoft ME, Armit I, Kristiansen K, Booth T et al (2018) The beaker phenomenon and the genomic transformation of northwest Europe. Nature 555:190–196. https://doi.org/10.1038/nature25738
Olalde I, Mallick S, Patterson N, Rohland N, Villalba-Mouco V, Silva M et al (2019) The genomic history of the Iberian Peninsula over the past 8000 years. Science 363:1230–1234. https://doi.org/10.1126/science.aav4040
Peña JA, Gómez-Pérez L, Alfonso-Sánchez MA (2021) On the trail of spatial patterns of genetic variation. Evol Biol. https://doi.org/10.1007/s11692-021-09552-y
Pérez-Miranda AM, Alfonso-Sánchez MA, Kalantar A, García-Obregón S, de Pancorbo MM, Peña JA et al (2005) Microsatellite data support subpopulation structuring among Basques. J Hum Genet 50:403–414. https://doi.org/10.1007/s10038-005-0268-2
Stefansson H, Helgason A, Thorleifsson G, Steinthorsdottir V, Masson G, Barnard J et al (2005) A common inversion under selection in Europeans. Nat Genet 37:129–137. https://doi.org/10.1038/ng1508
Steinberg KM, Antonacci F, Sudmant PH, Kidd JM, Campbell CD, Vives L et al (2012) Structural diversity and African origin of the 17q21.31 inversion polymorphism. Nat Genet 44:872–880. https://doi.org/10.1038/ng.2335
Stephens JC, Reich DE, Goldstein DB, Shin HD, Smith MW, Carrington M et al (1998) Dating the origin of the CCR5-Δ32 AIDS-resistance allele by the coalescence of haplotypes. Am J Hum Genet 62:1507–1515. https://doi.org/10.1086/301867
Torroni A, Bandelt HJ, D'urbano L, Lahermo P, Moral P, Sellitto D et al (1998) MtDNA analysis reveals a major late Paleolithic population expansion from southwestern to northeastern Europe. Am J Hum Genet 62:1137–1152. https://doi.org/10.1086/301822
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.