Research Article

Neural correlates of foreign speech imitation: The effects of age and music

Xiaohui Yan(a), Jiaqi Mao(b), Zixin Ma(a), Kyle Perkins(c), Weizheng Li(d), Yang Wang(d), Fan Cao(a)

(a) Department of Psychology, the University of Hong Kong, Hong Kong, China
(b) BCBL Basque Center on Cognition, Brain and Language, Donostia, Gipuzkoa, Spain
(c) Retired professor, Florida International University, Miami, FL, United States
(d) Department of Psychology, Sun Yat-Sen University, Guangzhou, China

Corresponding Author: Fan Cao ([email protected])

Imaging Neuroscience, Volume 3, 2025
https://doi.org/10.1162/IMAG.a.75
Received: 24 January 2025 Revision: 1 May 2025 Accepted: 10 June 2025 Available Online: 25 June 2025
© 2025 The Authors. Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

ABSTRACT
Adult learners of a foreign speech are often marked by having a foreign accent; however, children and adults with singing training tend to have better pronunciations than adults without music training. The assimilation hypothesis proposes that people tend to assimilate foreign speech to native speech during perception and production, which may explain foreign accent. Unfortunately, the neural mechanisms underlying the age and music effects are still unclear. In this study, we compared brain activation patterns in three groups of participants, namely, children, adults with singing training, and adults without music training (control adults) during native (Chinese) and foreign speech (Spanish) imitation with each word repeated three times. We found greater representational similarity between Chinese and Spanish in both groups of adults than in children during both speech perception and production, supporting the assimilation hypothesis. Furthermore, we found group-specific effects for the similarity between different times of imitation, suggesting different mechanisms. Specifically, control adults showed greater similarity between different times of Spanish word imitation than the other two groups in the medial orbital frontal cortex involved in adaptive learning/memory; children showed greater similarity than the other two groups in the bilateral inferior premotor/postcentral gyri involved in sensorimotor learning; adults with singing training showed greater similarity than the other two groups in the left superior temporal gyrus involved in auditory feedback. It suggests that singing training facilitates reliance on auditory discrimination, while children rely on somatosensory and speech motor control to learn foreign speech sounds, implicating different mechanisms for age and singing training effects. Our results provide insights in understanding the neural mechanisms of age and music effects in foreign speech learning.

Keywords: representational similarity, foreign speech imitation, musicians, speech production

1. INTRODUCTION
Speech production requires complex motor control, involving more than 100 muscles to work in concert. Speech motor control develops early in life (Perrin & Venance, 2019; Wächter et al., 2009), making the acquisition of a second language speech easier and more native to children than adults, where foreign accents are common (Flynn & Manuel, 1991; Geschwind & Carterette, 1966; Hack et al., 2012; Long, 1990; Wolfe, 1967). According to Kuhl et al. (2005), the end of the critical period around 7 years old is characterized by a reduced cortical plasticity in the motor and auditory circuits, along with lower proficiencies in foreign speech phonetic discrimination and speech production. Although reduced neural plasticity marks the end of the critical period, the specific underlying neural mechanisms of how learning foreign speech differs between children and adults remain unclear.
The Directions into Velocities of Articulators (DIVA) model provides a framework for understanding the neural mechanisms underlying foreign speech imitation (Tourville & Guenther, 2011), which is essentially a process of adjusting motor control according to somatosensory and auditory feedback. According to the DIVA model (Tourville & Guenther, 2011), a feedforward control system is responsible for projecting motor commands for executing articulatory gestures. At the same time, a feedback control system generates somatosensory expectations of the articulatory gestures, and compares incoming somatosensory and auditory feedbacks with sensory expectations. If error signals are detected, corrective motor commands are sent to the motor cortex (Guenther & Vladusich, 2012; Tourville & Guenther, 2011). Specifically, in the motor cortex, the middle precentral gyrus is a larynx area which directly controls muscles in the vocal folds, operation of which determines the opening and closing of the glottal space, as well as tensing and relaxing of vocal folds, modulating the pitch. The inferior precentral gyrus is involved in lip and tongue movement control. Lesion studies found that the middle and inferior precentral gyrus are the most consistent regions associated with foreign accent syndrome in stroke patients (Higashiyama et al., 2021).

While the middle and inferior precentral gyrus are involved in producing acquired speech sounds, the striatum, thalamus, and premotor cortex are more involved in vocal learning of novel speech sounds, according to Jarvis (2004, 2006) based on research on songbirds and humans. Simmonds (2015) further suggested that the vocal learning pathway (e.g., the striatum) becomes inactive too early during vocal learning, and the motor cortex for producing acquired speech sounds is involved instead, which may be why there is foreign accent in late second language learners.
Unfortunately, no studies have compared the real-time brain mechanisms underlying foreign speech imitation in children and adults. Previous studies have compared native and nonnative vowels imitation in adults (Carey, Miquel et al., 2017; Klein et al., 2006), as well as adults with higher L2 aptitude and those with lower L2 aptitude during L1 and L2 production (Hu, Ackermann et al., 2013). A few studies also concerned how age of acquisition affects speech production in second language (Berken et al., 2015; Berken, Gracco, et al., 2016; Frenck-Mestre et al., 2005). Two of them found that early bilinguals tend to show greater activation and greater grey matter volume in the putamen than late bilinguals (Berken, Gracco, et al., 2016; Frenck-Mestre et al., 2005), while another study found greater activation in the inferior frontal gyrus in early bilinguals than late bilinguals during speech production of L2 (Berken, Chai, et al., 2016). However, these studies exclusively scanned adults, with half being early bilinguals and half being late bilinguals, examining L2 as a long-term learning effect, rather than the real-time neural underpinnings of foreign speech acquisition.

A few studies have examined the real-time brain activations of adults learning new languages. One study found greater activation in the anterior insula and inferior frontal gyrus when learning to speak a new language compared to native speech production, especially in the first 10 min of speaking the new language (Moser et al., 2009). Training studies examined brain activation during non-native speech sound perception (Golestani & Zatorre, 2004) and production (Simmonds et al., 2014) before and after training. It was found that more efficient processing in the frontal speech areas was correlated with greater success in foreign phonetic identification (Golestani & Zatorre, 2004). Simmonds et al. (2014) found reduced activation in the anterior striatum over time both within and between scanning sessions, suggesting that the striatum becomes inactive too early during vocal learning. However, no studies have directly compared the learning process of foreign speech imitation between children and adults to understand why children have a reduced foreign accent.
One hypothesis for the existence of foreign accents in adults when learning a second language is that they use their first language as reference. Some previous studies have shown assimilation of foreign speech sounds to native sounds during perception in adult learners (Best & Tyler, 2007; Best et al., 2001). This inaccurate speech representation further causes failure in speech motor control during production (Ingram & Park, 1997). On the other hand, children confront less interference from native language. For example, Baker et al. (2008) found that children aged 7 to 14 years old performed better at distinguishing similar native and foreign speech sounds than adults. One possible reason for the less assimilation to the native speech in children than in adults is that the native speech representation system is still under development in children (Zevin, 2012). In fact, the representation of native speech sound categories shows a significant development for the age range of 6–12 years and 12–18 years (McMurray et al., 2018).

However, no neuroimaging evidence is available yet to support this assimilation hypothesis. Moreover, it is unknown whether the greater similarity between foreign speech sounds and native speech sounds in adults than in children exists only in perception or extends to production as well. In the current study, we compared the brain activation patterns of native Chinese-speaking children (aged 9–10) and adults in Spanish speech imitation. The task mimicked a naturalistic speech learning situation in which each Spanish word was repeated three times, with the participant repeating it after each presentation. Under this paradigm, we aimed at comparing how children and adults are different during this foreign speech learning process. We expect greater similarity between Chinese and Spanish representation in adults than in children and faster decline of the striatum in adults than in children during Spanish speech learning.
Furthermore, about 5–15% of the population have the ability to achieve native-like speech, even when the age of acquisition is late (Abrahamsson & Hyltenstam, 2008; Birdsong, 2005; Flynn & Manuel, 1991; Wells, 1985). It has been repeatedly documented that there is cross-domain transfer from musical expertise to speech perception and production (Jekiel & Malarski, 2021; Milovanov & Tervaniemi, 2011; Weiss et al., 2015; Wong et al., 2007; Zuk et al., 2013), which could be explained by the common neural correlates involved in musical and speech processing in the auditory pathway and motor cortex (Ozdemir et al., 2006). Moreover, the difference in musical notes is more trivial than that in speech phonemes, leading to the fact that musical abilities can be transferred to speech abilities. One study showed that musical ability predicted L2 phonological ability (both receptive and productive) even after controlling for other factors, but did not account for the unique variance in L2 syntax or lexical knowledge (Slevc & Miyake, 2006). Another study found that musical experience is correlated with greater pronunciation accuracy of English vowels after speech therapy for accent reduction (Jekiel & Malarski, 2021).
Unlike musical instrument training, which mainly fosters speech perception, vocal training is beneficial to both speech perception and speech production, as singing not only enhances individuals' pitch discrimination, but also trains the vocal motor apparatus necessary for pitch production. As illustrated in a study by Christiner and Reiterer (2015), in which three groups of German adults (i.e., singers, instrumentalists, non-musicians) were asked to imitate sentences of a foreign language (i.e., Hindi), both singers and instrumentalists had higher performances than non-musicians in mimicking the Hindi utterances. In addition, as vocalists had more precise vocal control than instrumentalists, singers outperformed instrumentalists in the Hindi sentence imitation task. Therefore, singing may be a strong predictor of good pronunciation skills in foreign speech imitation.
Neurologically, researchers have found that professional singers tend to have a greater volume in the ventral primary somatosensory cortex, rostral supramarginal gyrus, and auditory cortex than non-musicians (Kleber et al., 2017). In addition, larger volumes of arcuate fasciculus, which is a fiber tract connecting temporal areas and prefrontal regions, were found in musicians than non-musicians (Halwani et al., 2011). These brain structural changes in musicians were argued to play a significant role in speech production (Halwani et al., 2011). However, no published studies have compared brain activation patterns during actual foreign speech learning in singers and non-musicians to understand the brain differences for singers to outperform non-musicians.
In the current study, we recruited adults with professional vocal music training for more than 2 years, and we planned to compare the foreign speech learning mechanisms in the brain in singer adults, children, and adults without music training (control adults) to understand why adults with vocal training and children have advantages over control adults in foreign speech learning. We expect singers to have more accurate representations than control adults in the feedforward motor control areas, including the key vocal learning regions in the striatum, or somatosensory and auditory feedback areas.
2. METHOD
2.1. Participants
We recruited 32 adults without music background (i.e., control adults) (mean age 22.6, range 19–31), 20 adults with vocal singing training of at least 2 years (mean age 19, range 18–24), and 20 children without music training (mean age 10.3, range 10–11) in the local city. The adults with singing training were recruited from a local music college. The demographic information of the participants is presented in Table 1. The adults with singing training had 4.08 years of professional vocal training on average (range: 2–9 years, std: 2 years), with age of vocal training onset being 7.95 years old on average (range: 6–17 years old, std: 3.75 years old). The two adult groups were matched on education and English pronunciation according to a native English speaker's rating on each participant's language sample (t(50) = 0.446, p = .657). All participants also met the following criteria: (a) native Chinese speakers with English as their L2; (b) never learned Spanish, French, Portuguese, or Italian for more than 3 months; (c) free of medical implants and other metal accessories; (d) free of claustrophobia, hearing disorders, attention deficit hyperactivity disorder (ADHD), other developmental disabilities, neurological disease, and psychiatric disorders; and (e) right-handed. The present study was approved by the ethics committee at the local university. Informed consent was obtained from all participants/parents of participants before data collection.
2.2. Sensitivity analysis
Sensitivity analysis was conducted using G*Power 3.1.9. In order to achieve α = 0.001 and a statistical power of 95%, our current sample size would need an effect size of 1.51 for the comparison between control adults and children, 1.71 for the comparison between children and adults with singing training, and 1.49 for the comparison between control adults and adults with singing training. In the whole-brain analysis, our actual effect size is 3.09 (https://www.sdmproject.com/utilities/?show=Statistics) for all group comparisons when the voxel-level threshold was set at p < .001, which is much larger than needed.
2.3. Procedures
2.3.1. Behavioral tasks
Several behavioral tests were administered before the MRI scanning. A pseudoword rhyming judgment test and an initial sound deletion test were included to test phonological awareness. The pseudoword rhyming judgment test consists of 40 pairs of single-syllable English pseudowords, and participants were asked to determine if the two pseudowords in a pair rhymed or not. In the initial sound deletion test, participants were asked to listen to real English words and repeat them out loud without the initial sound. There were 30 words, including 10 single-syllable words, 10 two-syllable words, and 10 three-syllable words. Furthermore, we measured working memory using a digit span test in forward and reversed order. The digit span test was in Chinese, which is the first language of participants. In this test, experimenters explicitly read random digit strings with an increasing span. All participants were given the same tests.

In addition, we measured music aptitude in adult participants using the Advanced Measures of Music Audiation (AMMA; Gordon, 1989). In this test, participants listened to 30 pairs of music audios, and after each pair of audio, they were asked to choose one from the following options: the two audios differ in tones, differ in rhythms, same, or not sure. There is a tonal score, a rhythm score, and a composite score in AMMA.
2.3.2. MRI task
In the speech imitation task, there was a Spanish run and a Chinese run that were counterbalanced across participants. For the Spanish run, there were 28 Spanish real words (15 two-syllable words, 10 three-syllable words, and 3 four-syllable words), and for the Chinese run, there were 28 Chinese pseudo-words with syllable numbers matched with the Spanish words. Chinese pseudo-words were used in order to avoid semantic activation in the native language but not in the foreign language. In both runs, participants were asked to listen to and repeat each word/pseudo-word three times consecutively. Audio stimuli played in the scanner were recorded by a native female speaker in Spanish and Chinese respectively. A 5-min practice session was conducted prior to the scanning.
Table 1. Demographic information and behavioral tests results in each group of participants.

Measure | Control adults | Adults with singing training | Children
N | 32 (10M) | 20 (3M) | 20 (6M)
Age (years) | 22.6 (3.3) [19-31] | 19.9 (1.5) [18-24] | 10.3 (0.6) [10-11]
Rhyming judgment | 36.8 (2.5) | 35.3 (3.8) | 31.0 (4.9)
Initial sound deletion | 26.5 (4.3) | 23.9 (7.3) | 15.5 (8.5)
Digit span (forward) | 9.0 (1.4) | 9.6 (1.3) | 8.1 (1.3)
Digit span (backward) | 6.4 (2.2) | 6.2 (2.4) | 4.7 (1.3)
AMMA (percentile) | 61.4 (27.4) | 80.5 (15.8) | -

Numbers in parentheses are standard deviations, and in brackets are ranges. M: males; AMMA: Advanced Measures of Music Audiation.

As illustrated in Figure S1, a red cross and an audio word/pseudoword were presented for 1500 ms. Then, the participant was asked to imitate the word/pseudoword they just heard in the next 1500 ms (t1). After a "jitter" screen which was presented for 1000–5000 ms (3000 ms on average), the same word was played again for 1500 ms and the participant was asked to imitate the word for the second time (t2). The same procedure was repeated again for the third time (t3). After the third "jitter" screen, a red cross was displayed for 3000 ms without auditory stimulus, which served as the baseline, followed by the fourth "jitter". Then, the next trial started. After the MRI scanning, participants were asked to imitate the Spanish speech again outside the scanner in order to have a high-quality recording of their pronunciation.
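The trial structure described above can be checked with a small worked calculation. The sketch below is only an illustration built from the durations stated in the text (1500 ms stimulus, 1500 ms imitation, a jitter averaging 3000 ms, a 3000 ms baseline, 28 words per run, TR = 2000 ms); the constant and function names are hypothetical and do not come from the authors' scripts.

```python
# Durations in milliseconds, taken from the task description in the text.
STIMULUS_MS = 1500      # red cross plus audio word/pseudoword
IMITATION_MS = 1500     # participant repeats the word
JITTER_MEAN_MS = 3000   # jitter lasts 1000-5000 ms, 3000 ms on average
BASELINE_MS = 3000      # red cross without auditory stimulus
N_REPETITIONS = 3       # t1, t2, t3
N_JITTERS = 4           # one after each imitation, one after the baseline
N_TRIALS = 28           # words per run
TR_MS = 2000            # repetition time of the EPI sequence


def mean_trial_ms() -> int:
    """Average duration of one trial (three listen-and-repeat cycles)."""
    listen_and_repeat = N_REPETITIONS * (STIMULUS_MS + IMITATION_MS)
    return listen_and_repeat + N_JITTERS * JITTER_MEAN_MS + BASELINE_MS


def mean_run_volumes() -> float:
    """Average number of volumes needed to cover all trials of one run."""
    return N_TRIALS * mean_trial_ms() / TR_MS


print(mean_trial_ms())     # 24000
print(mean_run_volumes())  # 336.0
```

Under these assumptions an average trial lasts 24 s and the 28 trials of a run span about 336 volumes, slightly fewer than the 348 volumes reported per run; the remaining volumes presumably cover time that the text does not describe.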
2.4. MRI data acquisition
The MRI images were acquired using a 3T Siemens Prisma MRI scanner. Participants lay down in the scanner with a standard 20-channel head coil, and two foam pads were used to help reduce head movement. Before they entered the scanner, a mock scanner was used for practicing speaking with limited head movement. During scanning, a real-time monitoring of head movement was conducted, and participants were reminded to keep their head still while talking during the break between runs if the head movement was large. A single-shot echo planar imaging (EPI) sequence was adopted to collect functional BOLD signals, with an interleaved acquisition from bottom to top for each volume (repetition time (TR) = 2000 ms, echo time (TE) = 20.0 ms, flip angle = 80°, matrix size = 128 × 128, field of view (FOV) = 220 mm, slice thickness = 3.0 mm, number of slices = 34, voxel size = 1.7 × 1.7 × 3.0 mm³). There were 348 volumes collected for each run. High-resolution structural T1-weighted 3D images (MPRAGE) were also acquired (TR = 2300 ms, TE = 3.24 ms, TI = 900 ms, flip angle = 9°, matrix size = 256 × 256, FOV = 260 mm, slice thickness = 1.0 mm, number of slices = 160).
2.5. Data analysis
2.5.1. Acoustic analysis
2.5.1.1. Voice onset time (VOT) analysis. As a behavioral indicator of the in-scanner task performance, we measured the voice onset time (VOT) of /b/ and /d/ in seven Spanish words using Praat (Boersma, 2001). The VOT is the time interval between a plosive consonant release and voicing onset. The seven Spanish words ("bebé", "bueno", "brazo", "dado", and "difícil") we chose contained voiced stops (/b/ and /d/ in Spanish) in which their onsets of phonation occur before the consonant release, resulting in a negative VOT. Figure S2 illustrates the measurement of VOT in Praat.
2.5.1.2. Formant frequency analysis. The formant frequency analysis included the same seven Spanish words as the VOT analysis, covering all vowels in Spanish (/a/, /e/, /i/, /o/, and /u/). In order to quantify the performance on Spanish vowels' imitation, the first formant (F1) and second formant (F2) of each vowel were calculated in Praat. The F1 and F2 are the first two resonating frequencies in a vowel's pronunciation, with F1 indicating the opening of lips and F2 indicating the tongue's position (Wood, 1982). The individual formant values were normalized using the R package NORM (Thomas & Kendall, 2007), in order to eliminate the influence of gender and age (Fabricius et al., 2009).
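The imitation accuracy reported in Section 3.3 is a distance between the participant's vowel and the stimulus vowel in the (F1, F2) space. The text does not name the distance metric, so the sketch below assumes a Euclidean distance on normalized formant values; the function names and the example numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def formant_distance(f1: float, f2: float,
                     f1_stimulus: float, f2_stimulus: float) -> float:
    """Euclidean distance between a produced vowel and the stimulus vowel
    in the (F1, F2) plane (assumed metric; not stated in the text)."""
    return math.hypot(f1 - f1_stimulus, f2 - f2_stimulus)


def mean_formant_distance(productions: Sequence[Tuple[float, float]],
                          stimulus: Tuple[float, float]) -> float:
    """Average distance over several productions of the same vowel,
    for example the same vowel across words within one imitation order."""
    distances = [formant_distance(f1, f2, *stimulus) for f1, f2 in productions]
    return sum(distances) / len(distances)


# Hypothetical normalized formant values for one vowel.
print(formant_distance(3.0, 4.0, 0.0, 0.0))  # 5.0
```

A smaller distance means that the produced vowel lies closer to the native speaker's vowel, which is how the group comparisons in the results are read.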
Two independent researchers who were blinded about the participants' information calculated the VOT and frequency formants, and their evaluation results were correlated (r = .856, p < .001 for the VOT; r = .982, p < .001 for the frequency formant). We averaged their results to serve as the final score. The VOT and formant values of each consonant and vowel were averaged across all words and entered into a repeated-measure ANOVA of group (control adults, adults with singing training, children) by time (1st, 2nd, 3rd) for further analysis.
2.5.2. MRI images preprocessing
MRI data preprocessing was conducted using DPARSF 4.3 (Yan et al., 2016; http://rfmri.org/DPARSF). First of all, slice timing was performed to correct timing difference of the interleaved slices with the middle slice as the reference. Next, functional images were aligned to the first volume to correct head movement. We used ART (Artifact Detection Tools, https://www.nitrc.org/projects/artifact_detect) to detect head movements that exceeded 3 mm of translations or 3° of rotations for each participant. We found that 4 participants (2 from the adults without music training group and 2 from the children group) had excessive head movements for less than 8 volumes across all sessions. Considering that the affected data points were less than 2% of the total data in each participant, we kept these participants and repaired the affected volumes in ArtRepair (https://www.nitrc.org/projects/art_repair/), using the interpolated values from neighboring time points to replace the affected time points. We also excluded two participants due to extensive head movement (1 from the adults without music training group and 1 from the adults with singing training group). The sample sizes reported above are those after excluding these two participants. Then, T1-weighted structural images were co-registered to the realigned functional images for each individual. A filter was applied so that only signals above 0.01 Hz were kept. The images were segmented into gray matter, white matter, and cerebrospinal fluid, before being normalized to the Montreal Neurological Institute (MNI) space. Finally, the normalized images were detrended by regressing out the nuisances using a Friston 24-parameter model (Friston et al., 1996).
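For readers unfamiliar with the Friston 24-parameter model, the sketch below shows how the 24 nuisance regressors are commonly built from the six realignment parameters: the six parameters, their values at the previous volume, and the squares of both sets. It is a plain-Python illustration, not the DPARSF implementation; the function name is hypothetical, and setting the "previous" values of the first volume to zero is an assumption.

```python
from typing import List, Sequence


def friston24(motion: Sequence[Sequence[float]]) -> List[List[float]]:
    """Expand T x 6 realignment parameters into T x 24 nuisance regressors.

    Column layout per volume: R_t (6), R_{t-1} (6), R_t squared (6),
    R_{t-1} squared (6).
    """
    expanded = []
    previous = [0.0] * 6  # assumption: no displacement before the first volume
    for row in motion:
        current = [float(value) for value in row]
        if len(current) != 6:
            raise ValueError("expected 6 motion parameters per volume")
        expanded.append(
            current
            + previous
            + [value * value for value in current]
            + [value * value for value in previous]
        )
        previous = current
    return expanded


# Two hypothetical volumes: 1 mm, then 2 mm translation along x.
regressors = friston24([[1, 0, 0, 0, 0, 0], [2, 0, 0, 0, 0, 0]])
print(len(regressors[0]))  # 24
```

These 24 columns would then enter the nuisance regression alongside any other covariates the preprocessing pipeline uses.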
2.5.3. Representational similarity analysis (RSA)
RSA was conducted to calculate the similarity between Chinese and Spanish, as well as between different times of imitation within a language (i.e., t1 and t2, t2 and t3, t1 and t3) using the CoSMoMVPA toolbox (http://www.cosmomvpa.org/). Specifically, brain responses of each trial were estimated using unsmoothed preprocessed data. The Least-Squares Separate (LSS) method (Mumford et al., 2012) was used, with six head movement parameters and all other trials as covariates. A searchlight approach was adopted, where a sphere containing 125 voxels was centered at each voxel and moved across the entire brain. For the cross-language similarity analysis, pattern similarity between Chinese and Spanish was calculated using split-half correlation. Specifically, within each searchlight, a Pearson's correlation coefficient was calculated between each Chinese trial and each Spanish trial on the beta values of the 125 voxels, and in total there were 84 × 84 = 7056 such correlation coefficients, because we had 84 trials (28 × 3 = 84) in each language. Then, we averaged these 7056 correlation coefficients to represent similarity between Chinese and Spanish at this voxel.
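The cross-language computation within a single searchlight can be sketched as follows. This is a minimal plain-Python illustration of averaging all pairwise Pearson correlations between the trials of two languages; it is not CoSMoMVPA code, the function names are hypothetical, and the toy patterns use three voxels instead of 125.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long voxel patterns.
    Raises ZeroDivisionError if either pattern has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_language_similarity(chinese_trials: Sequence[Sequence[float]],
                              spanish_trials: Sequence[Sequence[float]]) -> float:
    """Mean of all Chinese-by-Spanish trial correlations in one searchlight.
    With 84 trials per language this averages 84 x 84 = 7056 coefficients."""
    correlations = [pearson(c, s) for c in chinese_trials for s in spanish_trials]
    return sum(correlations) / len(correlations)


# Toy beta patterns (3 voxels, 2 trials per language).
chinese = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
spanish = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
print(cross_language_similarity(chinese, spanish))
```

In the toy example two of the four pairs correlate at 1 and two at -1, so the mean similarity is zero; in the real analysis the resulting value is assigned to the voxel at the centre of the searchlight.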
For the similarity between different times of imitation within a language, at each searchlight, a Pearson's correlation coefficient was calculated between each trial and each other trial in the same imitation order within a language (28 trials in total) on the beta values of the 125 voxels; therefore, we had a 28 × 28 DSM. Spearman's correlations were calculated between the DSMs of the first imitation and the second imitation, between the second and the third imitation, and between the first and the third imitation. Following the analysis, the results were z-transformed and subjected to further group analysis. The threshold was set at uncorrected p < .001 at the voxel level and FWE-corrected p < .05 at the cluster level when reporting group analysis.
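The second-order comparison between imitation times can be sketched in the same spirit. The code below builds a trial-by-trial correlation matrix for each imitation order, compares two matrices with Spearman's correlation, and applies a z-transform. Two choices are assumptions rather than details given in the text: only the off-diagonal upper triangle of each matrix is compared, and "z-transformed" is read as the Fisher r-to-z transform. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def upper_triangle(trials: Sequence[Sequence[float]]) -> List[float]:
    """Off-diagonal upper triangle of the trial-by-trial correlation matrix."""
    n = len(trials)
    return [pearson(trials[i], trials[j]) for i in range(n) for j in range(i + 1, n)]


def similarity_between_times(trials_a, trials_b) -> float:
    """Spearman correlation between the matrices of two imitation orders."""
    return pearson(rank(upper_triangle(trials_a)), rank(upper_triangle(trials_b)))


def fisher_z(r: float) -> float:
    """Fisher r-to-z transform (undefined for r equal to 1 or -1)."""
    return math.atanh(r)


# Toy example: 3 trials x 4 voxels; identical patterns at both times.
t1 = [[1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 5.0], [4.0, 1.0, 3.0, 2.0]]
print(similarity_between_times(t1, t1))
```

With 28 trials per imitation order, each upper triangle holds 378 values, and the transformed coefficient is what would enter the group-level analysis.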
2.5.4. Machine learning
In order to confirm that the three groups have different representation patterns of foreign speech, we adopted a machine-learning approach. After calculating the representational similarity between different times of imitation, three machine-learning models were set up to classify children versus control adults, children versus adults with singing training, and control adults versus adults with singing training separately using similarity between t1 and t2, t2 and t3, t1 and t3 in each language for both perception and production. In total, there were 12 similarity parameters (i.e., three similarities in two languages for both perception and production). The analysis was performed using PRoNTo 2.1 (Pattern recognition for neuroimaging toolbox) (http://www.mlnl.cs.ucl.ac.uk/pronto/prtsoftware.html). Specifically, using training data, the 12 similarity parameters were first averaged and then mean-centered. Subsequently, a binary linear Support Vector Machine (SVM) was employed to train the model in a whole-brain gray matter mask, with cross-validation performed using the leave-one-subject-out approach. To measure whether the classification is successful, 1000 permutations were conducted. For a successful classification, the contributing weight map was computed. The weight map was then thresholded at 30% of the maximum weight value with 100 extended voxels.
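The validation logic of this step (leave-one-subject-out cross-validation followed by a label-permutation test) can be sketched without any toolbox. The example below is not PRoNTo code and does not use an SVM: a nearest-centroid rule stands in for the linear SVM to keep the sketch dependency-free, the features are toy numbers rather than whole-brain similarity maps, and the p-value convention (count + 1) / (permutations + 1) is an assumption. All names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier (not an SVM): label of the closest class mean."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda lab: squared_distance(test_x, centroids[lab]))


def loso_accuracy(features: List[List[float]], labels: List[int]) -> float:
    """Leave-one-subject-out accuracy, mean-centering with training data only."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        means = [sum(col) / len(train_x) for col in zip(*train_x)]
        centred = [[v - m for v, m in zip(row, means)] for row in train_x]
        test = [v - m for v, m in zip(features[i], means)]
        correct += nearest_centroid_predict(centred, train_y, test) == labels[i]
    return correct / len(features)


def permutation_test(features, labels, n_permutations=1000, seed=0) -> Tuple[float, float]:
    """Observed accuracy and its permutation p-value under shuffled labels."""
    rng = random.Random(seed)
    observed = loso_accuracy(features, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if loso_accuracy(features, shuffled) >= observed:
            at_least_as_good += 1
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Toy data: two well-separated groups of 8 subjects, 2 features each.
toy_x = [[0.1 * i, 0.0] for i in range(8)] + [[5.0 + 0.1 * i, 1.0] for i in range(8)]
toy_y = [0] * 8 + [1] * 8
accuracy, p_value = permutation_test(toy_x, toy_y, n_permutations=200)
print(accuracy)
```

Centering with the training folds only, as done here, mirrors the statement in the text that the parameters were mean-centered using training data, which keeps information about the held-out subject out of the model.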
2.5.5. Statistical analysis on brain activation
A general linear model (GLM) was constructed in SPM12 (Statistical Parametric Mapping, http://www.fil.ion.ucl.ac.uk/spm) after the data were smoothed with an isotropic Gaussian kernel of 4 mm full width half maximum (FWHM). For each participant, the preprocessed functional images from all sessions were entered into a GLM to estimate the whole-brain neural activities for the perception and production stage. In the group-level statistical analysis, we conducted flexible factorial ANOVAs of group (control adults, adults with singing training, children) by imitation orders (t1, t2, t3) separately for perception and production of each language in SPM (Gläscher & Gitelman, 2008). The main effects of order and group as well as the interaction between order and group were calculated. The threshold was set at uncorrected p < .001 at the voxel level and FWE-corrected p < .05 at the cluster level.
3. RESULTS
3.1. Behavioral tests
There was a significant main effect of group for both phonological awareness tests (for the rhyming judgment test: F(2, 69) = 15.86, p < .001, partial η² = .261; for the initial sound deletion test: F(2, 69) = 18.239, p < .001, partial η² = .261). The Bonferroni-corrected post-hoc analysis showed that children had significantly lower scores on the pseudoword rhyming judgment test than control adults (t(69) = 5.603, p < .001, partial η² = .313) and adults with singing training (t(69) = 3.575, p = .017, partial η² = .156). Children also had lower scores on the initial sound deletion test than control adults (t(69) = 5.97, p < .001, partial η² = .341) and adults with singing training (t(69) = 4.14, p < .001, partial η² = .199). No differences were found between the two adult groups on the pseudoword rhyming judgment test (t(69) = 1.637, p = .238, partial η² = .037), or the initial sound deletion test (t(69) = 1.371, p = .389, partial η² = .026). For the working memory test of digit span, a significant group effect was found in the test of forward order (F(2, 69) = 5.908, p = .004, partial η² = .105), but not in the test of reversed order (F(2, 69) = 1.761, p = .179, partial η² = .078). A Bonferroni-corrected post-hoc test revealed greater digit span in adults with singing training than in children (t(69) = 3.345, p = .025, partial η² = .14) and in control adults than in children (t(69) = 2.514, p = .043, partial η² = .084). No significant difference between the two adult groups (t(69) = 1.761, p = .179, partial η² = .043) was found. For the AMMA test, the adults with singing training significantly outperformed the control adults (t(50) = 8.568, p = .005, partial η² = .595).
3.2. VOTs
We ran an ANOVA of group by order for each consonant separately (i.e., /d/ and /b/). We found a significant main effect of group for /d/ (F(2, 65) = 5.299, p = .007, partial η² = .14) and /b/ (F(2, 65) = 3.445, p = .038, partial η² = .096). Bonferroni-corrected post-hoc analysis revealed that children's VOT was more negative than control adults (t(45) = 3.934, p = .006, partial η² = .256) for /d/, and marginally more negative than control adults (t(45) = 2.847, p = .06, partial η² = .153) for /b/ (Fig. 1). Adults with singing training did not differ significantly from the other two groups in the VOTs of either /d/ (for comparison with children: t(33) = -1.857, p = .574, partial η² = .095; for comparison with control adults: t(48) = 1.724, p = .248, partial η² = .058) or /b/ (for comparison with children: t(33) = -.907, p = .645, partial η² = .024; for comparison with control adults: t(48) = 1.955, p = .184, partial η² = .074). The other main effects or interaction effects were not significant.
3.3. Fo man o he owels
To e alua e he imi a ion pe o mance, we calcula ed he
dis ance be ween he pa icipan ’s owel in he equency
space (F1, F2) and he na i e speake ’s owel (F1s imulus,
F2s imulus). Then, a epea ed- measu e ANOVA o g oup by
o de was conduc ed o each owel on he dis ance. Fo
he owel /o/ and /i/, he ANOVA e ealed a signi ican
main e ec o g oup (F(2, 63)=11.559, p<.001, pa ial
η
²=.187 o /o/, and F(2, 63)=5.965, p=.004, pa ial
η
²=.101 o /i/). Bon e oni- co ec ed pos - hoc analysis
e ealed ha o /o/, con ol adul s had a la ge dis ance
han adul s wi h singing aining ( (48)=4.679, p<.001,
pa ial
η
²=.314) and child en ( (43)=2.793, p=.013,
pa ial
η
²=.153). Fo /i/, child en and con ol adul s had
Fig.1. Con ol adul s showed a poo e pe o mance on VOT and owel’s equency o man han singing- ained adul s
and child en. A and B a e he VOT esul s o /d/ and /b/ in each g oup; C and D a e he dis ance o he na i e speake in
he equency o man space o /o/ and /i/. The do ed line in A and B is he VOT o /d/ and /b/ in a na i e Spanish speake .
8
X. Yan, J. Mao, Z. Ma e al. Imaging Neu oscience, Volume 3, 2025
Table2. Resul s o he c oss- linguis ic simila i y.
Ana omical label H
Clus e size
(Voxels)
MNI coo dina e
Zx y z
(Con ol adul s>child en) & (adul s wi h singing aining>child en) (Pe cep ion)
Supe io occipi al gy us/cuneus, BA 7/17/18/19 B 703 18 - 84 18 3.71
Cauda e/ Thalamus B 145 4 14 0 3.48
(Con ol adul s>child en) & (adul s wi h singing aining>child en) (P oduc ion)
P ecuneus/ pos e io cingula e gy us, BA 18/30 L 125 - 20 - 52 2 3.42
Calca ine sulcus/cuneus/pos e io cingula e gy us, BA7/18/19/23 B 1265 20 - 50 6 4.07
Rolandic/ supe io empo al gy us/middle empo al gy us, BA
13/22/40/41
R 299 54 - 28 20 3.44
Pe cep ion>p oduc ion (main e ec )
Thalamus, cauda e L 131 - 14 - 18 14 4.47
B ain s em R 138 0 - 22 - 8 4.42
Thalamus, cauda e head R 57 0 6 2 4.19
Cauda e body L 59 - 8 14 10 3.99
In e ac ion o g oup by p ocess (pe cep ion, p oduc ion)
Cingula e gy us/supplemen a y mo o a ea, BA 31 B 138 0 - 28 44 4.92
a larger distance than adults with singing training (for children: t(31) = 1.619, p = .007, partial η² = .078; for control adults: t(48) = 1.694, p = .019, partial η² = .056) (Fig. 1). However, no significant group difference was detected for the other vowels (/a/: F(2, 63) = 1.419, p = .250, partial η² = .064; /e/: F(2, 63) = .527, p = .593, partial η² = .007; /u/: F(2, 63) = .902, p = .411, partial η² = .09). No main effects of order or interactions between group and order were significant for any vowel.
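As an illustration of the vowel measure reported above, the following minimal sketch computes a learner's distance to a native speaker in formant space. It assumes a Euclidean distance over normalized (F1, F2) values; the function name, the group labels, and all numbers are hypothetical and are not taken from the study's data or code.

```python
import math

def formant_distance(learner, native):
    """Euclidean distance between two (F1, F2) points given in the same units."""
    return math.hypot(learner[0] - native[0], learner[1] - native[1])

# Hypothetical normalized (F1, F2) values for the vowel /o/; not data from the study.
native_o = (1.00, 0.80)
learners_o = {
    "control_adult": (1.30, 1.20),
    "singer": (1.06, 0.88),
}

# A larger distance means a production further from the native target.
distances = {who: formant_distance(pt, native_o) for who, pt in learners_o.items()}
for who, d in distances.items():
    print(f"{who}: {d:.3f}")
```

Under this assumed measure, a group effect such as the one reported for /o/ would appear as systematically larger distances in one group than in another.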
3.4. Representational similarity analysis
3.4.1. Similarity between Chinese and Spanish
We conducted an ANOVA of group by process (perception, production) to examine group differences in the similarity between the two languages during perception and production. We found main effects of group and process, as well as interactions between group and process. For the group differences, we found that both adult groups showed greater similarity than children, but children did not show greater similarity than adults; therefore, we conducted a conjunction analysis between control adults>children and adults with singing training>children. The conjunction analysis showed that both control adults and adults with singing training showed greater representational similarity between Chinese and Spanish than children in the bilateral caudate/thalamus and superior occipital gyrus/cuneus during perception, and in the bilateral precuneus, cuneus, posterior cingulate and right STG during production (Table 2, Fig. 2A, B). For the main effect of process, we found greater similarity in the bilateral thalamus and caudate in perception than in production (Table 2).
There was an interaction between group and process in the cingulate gyrus/SMA, driven by greater cross-linguistic similarity in control adults than in children in perception, and greater similarity in adults with singing training than in children in production (Fig. 2C).
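The cross-linguistic similarity measure can be sketched as follows. This is only an illustration under stated assumptions: it takes similarity to be the Pearson correlation between the voxel-wise activation patterns of the two languages, followed by a Fisher z-transform, which is a common choice in representational similarity analysis but is not confirmed by this section. The function names and the pattern values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length activation patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cross_language_similarity(pattern_a, pattern_b):
    """Fisher z-transformed correlation (assumed similarity measure)."""
    r = pearson(pattern_a, pattern_b)
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite at r = +/-1
    return math.atanh(r)

# Hypothetical voxel patterns within one region; not data from the study.
chinese_pattern = [0.2, 1.1, -0.4, 0.9, 0.3]
spanish_pattern = [0.1, 0.9, -0.2, 1.0, 0.5]

similarity = cross_language_similarity(chinese_pattern, spanish_pattern)
print(f"similarity (Fisher z): {similarity:.3f}")
```

One such value per participant, region, and process would then enter a group by process ANOVA of the kind described above.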
3.4.2. Similarity between the 1st, 2nd, and 3rd imitation
We also calculated representational similarity between different times of imitation within a language separately for perception and production. Then, we conducted a group (3) by language (2) ANOVA separately for perception and production. We found main effects of group. Then, we conducted a conjunction analysis to identify group-specific similarity patterns. Specifically, a conjunction between control adults>children and control adults>adults with singing training would reveal regions specific to control adults. A conjunction between children>control adults and children>adults with singing training would reveal regions specific to children. A conjunction between adults with singing training>children and adults with singing training>control adults would reveal regions specific to adults with singing training. We found greater similarity in control adults than in children and adults with singing training in the bilateral medial orbital frontal cortex across both perception and production; and greater similarity in children than in control adults and adults with singing training in the bilateral inferior premotor/postcentral gyrus in both perception and production (Table 3, Fig. 3). In the conjunction analysis, we did not find greater similarity in adults with singing training than the other two groups.
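The conjunction logic described above can be sketched as a minimum-statistic test: a voxel counts as group-specific only if it exceeds the threshold in both pairwise contrasts. This is an assumed formulation for illustration; the function name, the z values, and the threshold of 3.1 are hypothetical and are not taken from the study.

```python
def conjunction(z_map_a, z_map_b, threshold):
    """Flag voxels where BOTH contrasts exceed the threshold (minimum statistic)."""
    return [min(a, b) > threshold for a, b in zip(z_map_a, z_map_b)]

# Hypothetical z values for five voxels; not data from the study.
control_gt_children = [3.4, 1.2, 4.0, 2.9, 0.5]
control_gt_singers = [3.6, 3.8, 1.0, 3.2, 0.1]

# Voxels "specific to control adults" under the assumed threshold.
specific_to_control = conjunction(control_gt_children, control_gt_singers, threshold=3.1)
print(specific_to_control)
```

Only the first voxel passes here, because it is the only one where the smaller of the two z values is above the threshold. The same function applies unchanged to the children-specific and singer-specific conjunctions by swapping in the corresponding pair of contrasts.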
For the main effect of language, we found greater similarity in Spanish than in Chinese in the bilateral thalamus during perception (Table 3).
We found interaction effects between group and language for both perception and production (Fig. 4). For perception, we found interactions at the left STG and supramarginal gyrus. At the left STG, adults with singing training had greater similarity than the other two groups in Spanish, while control adults had greater similarity than the other two groups in Chinese. At the left supramarginal gyrus, adults with singing training had greater similarity than control adults in Spanish, while control adults had greater similarity than adults with singing training in Chinese. Children did not differ from the other two groups in either language at the left supramarginal gyrus.
For production, control adults showed greater similarity than the other two groups in the left insula in Chinese, but no group differences were found in Spanish. In the left STG, control adults showed greater similarity than adults with singing training in Chinese, but no group differences were found in Spanish.
3.4.3. Machine learning results
Machine learning yielded significant results when classifying control adults and children (ACC = 74.38%, p = .001) (Fig. 5). The bilateral medial frontal gyrus, bilateral superior temporal gyrus, bilateral caudate/thalamus, and bilateral postcentral gyrus/precentral gyrus contributed to the contrast of control adults minus children; in contrast, the bilateral superior frontal gyrus, bilateral
Fig. 2. Results of the similarity between Chinese and Spanish. Adults showed greater similarity between the two languages than children in both the perception and production processes, while children did not show greater similarity than adults (A, B). (C) is the interaction between group and process for the similarity between Chinese and Spanish. Adults with singing training showed greater similarity in the cingulate gyrus in production than perception, while control adults showed greater similarity in perception than production.
adaptive learning; children showed greater similarity in the bilateral inferior premotor/postcentral gyrus, suggesting sensorimotor learning; and adults with singing training showed greater similarity in the left STG, suggesting reliance on auditory feedback. Taken together, these findings pave the way to understanding why adults face greater challenges in foreign speech learning than children, and how singing training may help.
DATA AND CODE AVAILABILITY
All data and code will be available after acceptance of the paper upon request to the corresponding author, possibly subject to a formal data-sharing agreement and approval from the requesting researcher's local ethics committee.
AUTHOR CONTRIBUTIONS
X.Y.: investigation, methodology, data curation, data validation, visualization, and formal data analysis. J.M.: investigation, methodology, data curation, data validation, formal data analysis, and draft writing. Z.M.: writing, editing, and reviewing. K.P.: data validation, writing, editing, and reviewing. W.L.: investigation, methodology, and data curation. Y.W.: investigation, methodology, and data curation. F.C.: formation of ideas, supervision of investigation, methodology, formal data analysis, draft writing, editing, and reviewing.
FUNDING
This work was funded by "General Research Fund (17605925), the Research Grants Council, Hong Kong" awarded to F.C.
DECLARATION OF COMPETING INTEREST
The authors declare no competing interests.
SUPPLEMENTARY MATERIALS
Supplementary material for this article is available with the online version here: https://doi.org/10.1162/IMAG.a.75