DRAFT
Everyday language input and production in 1001 children from 6 continents

Elika Bergelson (a,1), Melanie Soderstrom (b), Iris-Corinna Schwarz (c,n), Caroline F. Rowland (d,e,f), Nairán Ramírez-Esparza (g), Lisa Rague Hamrick (h), Ellen Marklund (c), Marina Kalashnikova (i,o), Ava Guez (j), Marisa Casillas (d,f,k), Lucia Benetti (l), Petra van Alphen (m), and Alejandrina Cristia (j,1)
a Harvard University, Department of Psychology;
b University of Manitoba, Department of Psychology;
c Stockholm University, Department of Linguistics;
d Max Planck Institute for Psycholinguistics, Language Development Department;
e Radboud University, Donders Centre for Brain, Cognition and Behaviour;
f ARC Centre of Excellence for the Dynamics of Language (CoEDL);
g University of Connecticut, Psychological Sciences;
h Purdue University, Department of Psychological Sciences;
i Basque Center on Cognition Brain and Language (BCBL);
j PSL University, Laboratoire de Sciences Cognitives et de Psycholinguistique, Département d’études cognitives, ENS, EHESS, CNRS;
k University of Chicago, Comparative Human Development Department;
l Ohio State University, School of Music;
m Royal Dutch Kentalis;
n Stockholm University, Department of Special Education;
o Ikerbasque, Basque Foundation for Science
This manuscript was compiled on August 23, 2023

Language is a universal human ability, acquired readily by young children, who otherwise struggle with many basics of survival. And yet, language ability is variable across individuals. Naturalistic and experimental observations suggest that children’s linguistic skills vary with factors like socioeconomic status and children’s gender. But which factors really influence children’s day-to-day language use? Here we leverage speech technology in a big-data approach to report on a unique cross-cultural and diverse data set: >2,500 day-long, child-centered audio-recordings of 1,001 2- to 48-month-olds from 12 countries spanning 6 continents across urban, farmer-forager, and subsistence-farming contexts. As expected, age and language-relevant clinical risks and diagnoses predicted how much speech (and speech-like vocalization) children produced. Critically, so too did adult talk in children’s environments: Children who heard more talk from adults produced more speech. In contrast to previous conclusions based on more limited sampling methods and a different set of language proxies, socioeconomic status (operationalized as maternal education) was not significantly associated with children’s productions over the first four years of life, and neither were gender or multilingualism. These findings from large-scale naturalistic data advance our understanding of which factors are robust predictors of variability in the speech behaviors of young learners in a wide range of everyday contexts.
infancy | human diversity | language | socioeconomic status | speech
Typically-developing children readily progress from coos to complex sentences within just a few years, leading some to hypothesize that the universal language abilities of humans develop uniformly, with only incidental effects of individual- or group-level variation (1). And yet, studies using a variety of proxies of language development find some evidence of such variation in early language skills, with differences reported between girls and boys (2), as well as those raised in socioeconomically privileged compared to disadvantaged households (3, 4).
However interesting, these studies tend to rely on Western-centric samples and methods, and may not reflect everyday language use in children. Moreover, prior work often stops after only considering individual predictors in a binary way (i.e. do they significantly impact language development or not), while failing to ask the more informative question of how large their relative impact is (5), especially in freely-occurring, everyday speech behavior.
Recent research on mice and whales shows the promise of machine learning for examining everyday animal behavior (6, 7). We leverage advances in wearables and machine-learning-based speech technology to catalyze a similar breakthrough in language development research. Our dataset is comprised of >40,000 hours of audio from >2,500 days in the lives of 1,001 2- to 48-month-olds from 6 continents and diverse cultural contexts (Figure 1). Within this dataset, we focused on the amount of speech or speech-like vocalization young children produce in their everyday life. Critically, these automatically-extractable “quantity” measures correlate robustly with gold-standard “quality” measures of children’s language skills and knowledge, like vocabulary estimates (see SI1D for relevant evidence) (4).
We query and compare the effects of two types of factors. First, there are factors with undeniable effects on early language production, namely, child age and language-relevant clinical risks and diagnoses. Second, there are individual- and family-level factors that are reported to correlate with variability in early language skills: socioeconomic status (SES; operationalized here as maternal education; SI2B), gender, language input quantity, and multilingual background. Because small and homogeneous samples make universal claims more questionable, a key novel contribution of this work is its benchmarking of the level of stability and variability of everyday language use in a heterogeneous, richly diverse participant sample.∗

Significance Statement
Harnessing a global sample of >40,000 hours of child-centered audio capturing young children’s home environment, we measured contributors to how much speech 0–4 year olds naturally produce. Amount of adult talk, age, and normative development were the sole significant predictors; child gender, socioeconomic status, and multilingualism did not explain how often children vocalized, or how much adult talk they heard. These findings (strengthened by our validation of existing automated speech algorithms) open up new conversations regarding early language development to the broader public, including parents, clinicians, educators, and policymakers. The factors explaining variance also inform our understanding of humans’ unique capacity for learning, and potentially large-scale applications of machine technology to everyday human behavior.

EB, MC, and AC developed the initial conceptualization of the project and recruited corpus owners and co-authors. EB, MC, and AC curated the meta-corpus and meta-data and prepared them for analysis. EB and AC prepared materials for and/or led group decision-making. EB, MS, CR, NRE, AG, MC, LB, PvA, and AC contributed to the decision-making on the analytic approach, including selection of exploratory and confirmatory sets, selection of variables, identification of hypotheses and/or specification of models. AC, EB, and AG drafted the preregistrations. AC, EB, and AG designed and implemented the analyses. EB, CR, NRE, LRH, MK, and LB conducted and synthesized literature reviews on key topics related to the decision-making regarding literature review, hypotheses, and analyses. EB, EM, ICS, CR, LRH, MS, NRE, MK, MC, PvA, and AC provided corpus data and meta-data. See acknowledgments for non-author data contributors. EB, AC, and MS contributed to the initial manuscript draft writing. EB, MK, MC, and AC contributed to visualizations. EB, MC, and AC revised and responded to feedback and informal peer-review. MS, CR, LRH, LB, EB, AC, EM, ICS, and PvA contributed to supplementary materials, Open Science Framework project page and/or other documentation. Note: Other than first and last authors, middle authors are listed in reverse alphabetical order.

The authors have no conflict of interest to declare.

1 To whom correspondence should be addressed. E-mail: [email protected], [email protected]

www.pnas.org/cgi/doi/10.1073/pnas.XXXXXXXXXX PNAS | August 23, 2023 | vol. XXX | no. XX | 1–12
Measuring Diverse, Real-life Language Use. Language skills and knowledge are not directly observable. As a result, all studies use a proxy when estimating them in individual children. These proxies have variable validity and predictive power relative to other measures, both concurrently and predictively, and likely vary in the extent to which they reflect children’s everyday language behavior. For instance, parental report measures are indirect and, especially for receptive knowledge, can be difficult for caretakers to estimate (9), even in relatively homogeneous Western-centric contexts.
Here, we adopt a very different approach. We employed the LENA™ system, which captures what children hear and say across an entire day through small wearable recorders (10); this ecologically-valid sampling method reduces observer effects relative to, e.g., shorter video recordings (11). The LENA™ system uses standardized algorithms that estimate who is speaking when, alongside automated counts of adult and child linguistic vocalizations (4) (see definition and validation in SI1C:E). The resulting LENA™ measures correlate with and predict other measures of language skills in children with and without clinical risks or diagnoses, as revealed by manual transcriptions, clinical instruments, and parent questionnaires (12, 13). We use LENA™’s validated, automated estimates to derive our measures of everyday language use: adult talk and child speech (see detailed motivation in SI3B). We define child speech as the quantity of children’s speech-related vocalizations (e.g., protophones (14), babbles, syllables, words, or sentences, but not laughing or crying) per hour, and adult talk as the number of near and clear vocalizations per hour attributed to adults (both as detected by LENA™’s algorithm; see Methods). Assuaging concerns that these measures are merely capturing chattiness or repetition, both have a ≥.7 correlation with measures of lexical diversity and language “quality”: our child speech measure correlates with vocabulary in an independent sample, and the adult talk measure correlates with the number of word types from manual transcription in a subset of the data (SI1D).
Capitalizing on this standardized and deidentified numeric output, we solicited LENA™ datasets that researchers had previously collected to study mono- and multilingual children (i.e. those learning >1 language) in urban, farmer-forager, and subsistence-farming contexts worldwide (Figure 1). This resulted in a dataset reflecting the state of current knowledge in ecologically-valid speech samples from children’s daily lives (SI3A; see Methods for more sample details).
∗ While these data collectively span living circumstances, geography, and family structure, some data donors were concerned that highlighting differences when minoritized communities are involved poses ethical challenges, in terms of honorable representation and potential harm. Individual data stewards are actively engaging in richer descriptions of included samples (see SI5), which may enable future work on meaningful population-level differences (e.g., 8).
The dataset includes children from wide-ranging SES backgrounds, based on maternal education levels spanning from no formal education to advanced degrees (SI2B). This SES proxy was selected not only because it was available in all 18 corpora (only 3 had alternative SES proxies), but most importantly because it is the most commonly employed SES proxy in language acquisition research, as established in meta-analyses (15, 16). This allows our findings to inform ongoing discussions. Theories of how SES relates to children’s language development have proposed a wide range of pathways in which maternal education is predictive of children’s language experiences, including the connection between maternal education and the tendency to employ verbal over physical responsiveness (17), the diversity in mothers’ vocabulary (18), and the frequency of verbally-rich activities (19). Maternal education also correlates highly with other SES proxies (e.g. r=.86 in a study of children growing up in 10 European or North American countries, 20), suggesting it may also indirectly pick up on other pathways linking SES to language development, through e.g. differential access to resources and nutrition, or exposure to stress perinatally (21). At the same time, we recognize that comparing a variable like education across countries, although commonly done (22), is not straightforward. Therefore, we supplement our pre-registered approach with numerous exploratory checks and analyses examining alternative implementations (SI3G:H, described further below).
Crucially, by including children aged 2 to 48 months, we span a wide range of linguistic skills, allowing us to better capture the effects of our variables over a broad span of development within our socio-culturally and geographically broad-ranging participants. We also include children with a variety of diagnoses of language delays and disorders, as well as those at high risks for them (see Methods & SI2A for definitions and detailed justification). Such children’s language development is by definition non-normative. Thus, age and non-normative status provide useful yardsticks for considering the significance and effect size of other child- and family-level factors (SES through maternal education, child gender, mono- vs. multilingual status, and how much adults talk to and around the child). That is, if a factor (e.g., gender) has an effect far smaller than that of age or non-normative development, it would suggest that individual differences within it are relatively limited in their connection to everyday language use. If the effects are comparable in size, it would instead suggest that the amount of speech humans produce in everyday interactions is undergirded by substantial and structured individual differences, rather than striking uniformity. Given that effects could vary as a function of child age, we make sure to include key interaction terms. For instance, we can expect age to interact with adult talk if (as anticipated) older children are more sensitive to adults’ talking to them than younger ones.
Predicting Children’s Speech Production. We employed a hypothesis-testing approach: In a two-step preregistration, we first established exploration and confirmation data subsets (see Methods and SI3A for detailed explanation, and SI3D:E for the procedure used to derive pre-registered hypotheses and analyses). We then leveraged the held-out confirmation subset to answer our key question: How well do specific individual- and family-level factors predict variation in how much speech young children produce? At stake in these analyses is whether systematic differences in children’s lives have measurable links to their language production, and if so, what the strength of these relationships is both overall, and in relation to one another (see Table 1 for results†).

Fig. 1. Geographical location, primary language, number of children (Nchild), number of recordings (Nrec) and data citation for each corpus.

Table 1. Model results predicting child speech. q-values show FDR-corrected p-values.

                              β        SE       q
Intercept                     0.109    0.128    .681
Child Gender (Male)           0.026    0.051    .852
SES(<H.S.(1))                 0.001    0.111    .991
SES(H.S.(2))                 -0.033    0.115    .932
SES(B.A.(4))                 -0.064    0.079    .681
SES(>B.A.(5))                -0.002    0.090    .991
Control                      -0.085    0.029    .035 *
Norm                         -0.220    0.087    .036 *
Adult Talk                    0.260    0.037   <.001 *
Age                           0.647    0.024   <.001 *
Mono                          0.045    0.095    .852
Norm × Adult Talk            -0.005    0.063    .991
Norm × Age                   -0.217    0.051   <.001 *
Adult Talk × Age              0.125    0.022   <.001 *
Adult Talk × Mono             0.092    0.072    .45
Mono × Age                   -0.048    0.056    .681
Norm × Adult Talk × Age       0.019    0.043    .852
Mono × Adult Talk × Age       0.137    0.065    .094

Note. Betas show deviation from the following baseline levels: Child Gender: female; SES: some university(3); Norm: Norm(ative development); Mono: Mono(lingual). SES = child SES based on maternal education (<H.S.(1) = less than high school, H.S.(2) = high school, B.A.(4) = college degree, >B.A.(5) = advanced degree); Control = overlap rate control; Adult Talk = adult vocalization count rate.
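Table 1 reports q-values as FDR-corrected p-values, but this section does not name the correction procedure. The following is a minimal sketch of one common choice, the Benjamini-Hochberg step-up adjustment; both the choice of procedure and the example p-values are assumptions made for illustration, not values or methods taken from the paper.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order.

    Assumption: BH is used here only as a common FDR procedure; the paper
    section does not state which correction produced its q-values.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the minimum
    # of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical raw p-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
q_values = benjamini_hochberg(example_p)
```

A predictor would then be flagged (as with the asterisks in Table 1) when its q-value falls below the chosen threshold.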
As expected, we found that older children produced more speech than younger ones (β=0.647, SE=0.024). Children with non-normative development produced less speech than children with normative development (β=-0.22, SE=0.087)‡, an effect that strengthened with age (β=-0.217, SE=0.051; see Figure 2B). This is expected because for some groups in our non-normative subset (e.g. those with familial risk of a speech impairment) older children are more likely to have an actual diagnosis (as opposed to risk factor) than younger ones (see SI2A for details on non-normative classification).

Our results further revealed that young children’s speech production correlated with the amount of adult talk they heard (β=0.26, SE=0.037). This correlation strengthened with age (β=0.125, SE=0.022; see Figure 2A), perhaps because variation in adult talk rate has less effect on infants (whose early babbles occur frequently even when infants are alone, 14). The effect of adult talk is a substantial one. Taking the effects of age and normativity as convenient (but unrelated) gauges of what counts as a considerable effect, we see that the effect size of adult talk is about a third of that of age and similar to that of normativity (adult talk: 0.26; interaction adult talk by age: 0.125; age: 0.647; non-normative development: -0.22; interaction non-normative by age: -0.217; all effect size betas expressed as SDs).
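The yardstick comparison can be made concrete by taking ratios of the standardized main-effect estimates. The sketch below uses only the betas quoted in the text; the dictionary keys and the helper name relative_size are hypothetical, introduced purely for illustration.

```python
# Standardized main-effect betas quoted in the text (in SD units).
betas = {
    "adult_talk": 0.260,
    "age": 0.647,
    "non_normative": -0.220,
}


def relative_size(name, reference):
    """Ratio of absolute effect sizes: |beta[name]| / |beta[reference]|."""
    return abs(betas[name]) / abs(betas[reference])


# How large is the adult-talk effect relative to each yardstick?
adult_vs_age = relative_size("adult_talk", "age")
adult_vs_norm = relative_size("adult_talk", "non_normative")
```

Absolute values are used because the sign of the normativity estimate only reflects which level serves as the baseline.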
To provide these results in more intuitive units, we fit the same model centering variables without scaling. Children produced 66 more vocalizations per hour with each year of life. For every 100 adult vocalizations per hour, children produced 27 more vocalizations; this effect grew by 16 vocalizations per year. Relative to infants with typical development, those with non-normative development produced 20 fewer vocalizations per hour; this difference grew by 8 vocalizations per year.

† All βs in Tables and text are based on treatment-coded models. See SI3H for sum-coded models, which give the same pattern of results.
‡ The normativity estimate is negative because normative development is the baseline.
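As a worked illustration of the unscaled estimates, the sketch below shows how the adult talk slope changes with age under the reported interaction. It assumes that, because predictors were centered, the main effect of 27 is the slope at the sample-mean age; the mean age itself is not given in this section, so age enters only as an offset in years from that mean. The function and constant names are hypothetical.

```python
# Unscaled estimates quoted in the text.
ADULT_TALK_SLOPE = 27.0   # child vocs/hr per 100 adult vocs/hr, at the mean age (assumed)
ADULT_TALK_BY_AGE = 16.0  # change in that slope per additional year of age


def adult_talk_slope(years_from_mean_age):
    """Adult-talk slope (per 100 adult vocs/hr) at a given age offset."""
    return ADULT_TALK_SLOPE + ADULT_TALK_BY_AGE * years_from_mean_age


def expected_difference(extra_adult_vocs_per_hour, years_from_mean_age):
    """Expected difference in child vocs/hr between two same-aged children
    whose adult talk differs by `extra_adult_vocs_per_hour`."""
    slope = adult_talk_slope(years_from_mean_age)
    return slope * extra_adult_vocs_per_hour / 100.0


slope_at_mean = adult_talk_slope(0.0)
slope_one_year_older = adult_talk_slope(1.0)
diff_50_more_adult_vocs = expected_difference(50.0, 1.0)
```

Only differences between children are computed here, since the intercept of the unscaled model is not reported in this section.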
Surprisingly, and in contrast to previous results using smaller and less diverse datasets and/or other language proxies, we found that child gender, SES (indexed here by maternal education), and monolingual status did not explain significant variation in child speech. As our raw data figures and model outcome results show, these null effects hold both when considering covariates (as in our model; Table 1) and when considering these variables individually (as in Figure 3; SI3F, 3G, 3H). In our full model controlling for other variables (Table 1), the largest estimate for main effects or interactions involving child gender, SES, and monolingual status was about half of that of normativity, and one-sixth of that of age; none reached thresholds for statistical significance.
While our models are well-powered to estimate associations of child speech with age, normativity, adult talk, gender, SES (as measured by maternal education), and monolingual status, this is predicated upon pooling the data and accounting statistically for corpus- and child-level variance via random effects, as described in Methods. This makes it beyond this paper’s scope to analyze language or population/cultural differences in detail, i.e. in a way that might allow the consideration of additional, culture-specific variables (hence their omission in Figs 2–3); see SI5 for citations to research on individual datasets, some of which tackle such differences directly.
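One way to write down the pooled model implied by Table 1 and the description above is sketched below. The fixed-effect terms mirror the rows of Table 1. The random-effects structure (a random intercept for each corpus and for each child within corpus) is an assumption for illustration; the exact specification is given in the paper's Methods, not in this section. Here $y_{ijk}$ is child speech in recording $i$ of child $j$ in corpus $k$, $T$ is adult talk, $A$ is age, and $N$ and $M$ are indicators for the non-baseline levels of normative and monolingual status.

```latex
\begin{aligned}
y_{ijk} ={}& \beta_0
  + \beta_{\mathrm{gender}}\,\mathrm{Male}_{jk}
  + \sum_{s \in \{1,2,4,5\}} \beta_{\mathrm{SES},s}\,\mathbb{1}[\mathrm{SES}_{jk}=s]
  + \beta_{\mathrm{control}}\,\mathrm{Control}_{ijk} \\
&+ \beta_{N} N_{jk} + \beta_{T} T_{ijk} + \beta_{A} A_{ijk} + \beta_{M} M_{jk} \\
&+ \beta_{NT} N_{jk} T_{ijk} + \beta_{NA} N_{jk} A_{ijk} + \beta_{TA} T_{ijk} A_{ijk}
  + \beta_{TM} T_{ijk} M_{jk} + \beta_{MA} M_{jk} A_{ijk} \\
&+ \beta_{NTA} N_{jk} T_{ijk} A_{ijk} + \beta_{MTA} M_{jk} T_{ijk} A_{ijk} \\
&+ u_k + v_{jk} + \varepsilon_{ijk}, \\
& u_k \sim \mathcal{N}(0,\sigma_u^2), \quad
  v_{jk} \sim \mathcal{N}(0,\sigma_v^2), \quad
  \varepsilon_{ijk} \sim \mathcal{N}(0,\sigma^2)
\end{aligned}
```

Under this reading, $u_k$ and $v_{jk}$ absorb corpus- and child-level variance, which is why the fixed effects describe pooled associations rather than differences between specific populations.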
Noting that the results above have the strongest inferential value thanks to being pre-registered, we also addressed certain alternative hypotheses and interpretations that could have rendered our conclusions unjustified through a series of follow-up analyses. These checked for robustness of our key results with different operationalizations and statistical implementations of SES, when considering only children under or over 18 months, when considering causal paths, and when incorporating speech from other children as a predictor; our key results held in all cases (SI3H).
We highlight here the results that may run most counter to many readers’ assumptions, namely, that in this large sample, SES (indexed by maternal education) does not come out as a significant predictor of child speech. This conclusion held when declaring SES as an ordinal and as a continuous variable based on levels or years of maternal education, when binarizing SES levels based on individual countries’ average education completion rate, and when declaring a random slope for SES within corpus (which allows SES effects to vary across corpora). Some readers may wonder whether there were some corpora for which SES did matter. If so, the analysis with random SES slopes by corpus would have indicated this, but it did not (SI3H). The relationship between SES and child speech was weak and inconsistent across corpora (as evident in Fig. 4).
Perhaps most convincingly, results also held when constraining our analysis to our largest homogeneous subset, the North American subsample (642 daylong recordings from 206 infants in 7 corpora; SI3G). We essentially replicated the full-sample results in this subsample: adult talk and age were significant predictors, whereas gender and SES (based on maternal education) were not. The significant adult talk × age interaction also replicated. The main effect of normativity did not, likely because normativity’s interaction with age was larger than in the full-sample analysis. Finally, we also tested whether removing the adult talk variable would result in an SES effect, i.e. testing whether adult talk was absorbing variance that would otherwise be accounted for by SES. This was not the case: Removing the adult talk predictor, SES still does not account for significant variance in child speech in our analysis. A central contribution of this work is thus the clear lack of evidence we find for effects of SES (under several operationalizations focused on maternal education), on how much speech young children produce in day-to-day life.
Another potential concern is that our conclusions hinge on the use of LENA™’s particular algorithm; they do not. The findings above successfully replicate in the subset of data for which data stewards were able to share raw audio (11/18 corpora), which was analyzed with a wholly different algorithmic approach, the Voice Type Classifier or VTC (Methods; SI3F).§ Yet another worry is that our focus on adult talk may mask other important contributions to children’s language experiences, for instance, speech from other children. Testing this in a supplemental analysis, we confirm that the level of association found between adult talk and children’s speech was unaffected by including other children’s talk measured by LENA as a predictor variable (SI3H), confirming that our key conclusions hold when factoring this other source of input in.
Finally, we also ran a model predicting adult talk (rather than child speech). The amount of adult talk was not significantly predicted by SES, child age, gender, monolingual or normative status (Table 2, Figure 3E:H; SI3G:H). Importantly, these null results replicated in the North American subset (SI3G) as well as in every other alternative analysis we attempted (SI3H). Together, these analyses suggest that the relationship we find between adult talk and child speech in the child speech models is not attributable to child- or family-level factors affecting adult talk.
Speech and Other Early Vocal Behavior. While our central query concerned variability within early speech production, we conducted a further descriptive analysis examining how much of children’s vocalizations were speech or speech-like, as opposed to the two other classes of LENA™-identified vocalizations: crying and vegetative sounds (e.g. burps, hiccups). We examined these vocalization types as a function of age, monolingual status, and normative status. As Figure 2C shows, for children with normative development, the proportion of vocalizations that were speech increased from just over half to the vast majority over 2–48 months. In contrast, the crying proportion fell steeply over the same period, from nearly half of vocalizations to a small fraction of them; the proportion of vegetative sounds was low and constant. Convergent with our speech analyses, monolingual status did not alter these patterns but normative status did: While the same overall patterns held for children with non-normative development, their decrease in crying and increase in speech production with age was less steep (see Figure 2C).
As with more narrowly-defined non-normative populations (e.g. children with Autism Spectrum Disorder (23)), we find clear divergences in language trajectories in our normative vs. non-normative samples. This is notable because (a) our non-normative sample is heterogeneous (SI2A) and (b) as 2–48-month-olds, many children with non-normative classifications here were at risk for (but not yet diagnosed with) language delays or disorders. Automated speech analyses in naturalistic recordings thus hold promise for future research into early diagnostics (24, 25).

§ VTC too has been robustly validated relative to various gold standard manual measures (SI1E).

[Figure 2: three panels. A: child linguistic vocalization rate (per hr.) by age (months), in Low-AVC, Mid-AVC, and High-AVC panels, with points marked by corpus (1–18). B: child linguistic vocalization rate (per hr.) by age (months), in Non-Normative and Normative panels. C: proportion of vocal behavior by age (months), in Speech, Cry, and Vegetative panels, with lines for non-monolingual, non-normative, and monolingual & normative children.]

Fig. 2. Effects of adult talk, child age, and normative development on children's speech production. Points show each daylong recording; lines show linear regression with 95% Confidence Intervals (CI). Child speech is quantified as child linguistic vocalization rate; adult talk as adult vocalization count rate (AVCr). A: Child speech by age, split by low/mid/high tertiles of adult talk. Lines depict significant adult talk × age interaction. Color-shape combinations show each unique corpus, numbered to preserve anonymity. B: Child speech by age and normative status. Lines depict significant age × normative status interaction. C: Proportion of vocal behavior classified as speech, cry, or vegetative, by age. Line type/color indicate monolingual and normative statuses. N.B. Monolingual normative CI (blue) falls fully within that of multilingual children (pink) for all 3 types of vocal behavior, highlighting these groups' equivalent patterns.

Table 2. Model results predicting adult talk (i.e. adult vocalization count rate). q-values show FDR-corrected p-values.

                                  β        SE       q
Intercept                        -0.100    0.160    .778
Child Gender (Male)               0.174    0.148    .547
SES(<H.S.(1))                     0.239    0.173    .547
SES(H.S.(2))                     -0.015    0.194    .939
SES(B.A.(4))                      0.148    0.131    .547
SES(>B.A.(5))                     0.098    0.150    .778
Control                           0.084    0.055    .547
Norm                              0.013    0.103    .939
Age                              -0.030    0.029    .547
Mono                             -0.028    0.112    .939
Gender (Male) × SES(<H.S.(1))    -0.375    0.196    .547
Gender (Male) × SES(H.S.(2))     -0.263    0.252    .547
Gender (Male) × SES(B.A.(4))     -0.220    0.176    .547
Gender (Male) × SES(>B.A.(5))     0.016    0.201    .939
Norm × Age                       -0.076    0.060    .547
Mono × Age                        0.035    0.068    .804

Note. None of the variables in our model predicted adult talk. All abbreviations and baselines as in Table 1.
Adult Talk and Child Speech. Children who heard more adult talk produced dramatically higher rates of speech, and this effect increased with age. This result feeds into ongoing theoretical debates regarding the relevance of individual differences (26). Although we cannot infer causality from our correlational data, it is useful to consider possible causal paths that could in principle have led to our results. A correlation between child speech and adult talk is compatible with at least three explanations: (1) Children who produce more speech elicit more talk from adults; (2) Language-dense environments lead children to produce more speech; or (3) A third variable causes increases in both adult talk and child speech.¶
Our model predicting adult talk (see Table 2) can be brought to bear on Explanation 1. If children talking more elicited more talk from adults, then we would have expected to see that age and normative status were significant predictors of adult talk. Instead, we find that neither these (nor any other variables in our model) predicted the quantity of adult talk (Figure 3G). Nonetheless, the precise statistical analyses we carried out do not allow us to directly rule out any of the explanations, a combination of which may be jointly true. Establishing a precise causal chain will require careful consideration of a variety of proximal and ultimate pathways through which child and adult behaviors are shaped. As one example, given that most children here are genetically related to their adult caregivers, we may be observing covariance in amount of talk and its linguistic correlates (Explanation 3). Evaluating these alternatives requires evidence from children raised by unrelated caregivers or from genome-wide association studies, as genetic and environmental factors remain challenging to disentangle (27). In this vein, recent work with adopted 15–73-month-olds provides evidence of input effects (maternal utterance length and/or lexical diversity) on adopted children’s vocabulary size (measured via caretaker checklist) (28). This study suggests that shared genetics is not the sole contributor to links between (at least these proxies of) caretaker input and child language outcomes. Moreover, shared genetics is just one of the ways in which adult and child behavior may be independently shaped by an unmeasured confounded variable (as per Explanation 3). For instance, other third variables related to dimensions like personality, neighborhood, and childcare context too may be contributors (29, 30). These explanations can only be definitively teased apart by future work.

¶ Our analyses suggest that one such potential third variable, differences in activities across recordings, is not a likely candidate for the correlation between child speech and adult talk (SI4).

[Figure 3: eight panels, A–H. A, B: child linguistic vocalization rate (per hr.) by child gender (F, M) and by SES (<H.S. (1), H.S. (2), S.U. (3), B.A. (4), >B.A. (5)). C, D: child linguistic vocalization rate (per hr.) by age (months) in Low-AVC, Mid-AVC, and High-AVC panels, split by normative status (C) and monolingual status (D). E, F: adult vocalization count rate (per hr.) by child gender and by SES. G, H: adult vocalization count rate (per hr.) by age (months), split by normative status (G) and monolingual status (H).]

Fig. 3. Factors that do not predict child speech or adult talk. Points = individual recordings, jittered horizontally. Lines = linear fit with 95% Confidence Intervals. Error bars = 99% bootstrapped CIs of sample means. Child speech is quantified as child linguistic vocalization rate; adult talk as adult vocalization count rate (AVCr). A & B: null effects of child gender (A) and socioeconomic status (SES) (B) on child speech. C: null 3-way effect of normative development × adult talk × age (N.B.: normative × age and adult talk × age are significant; see Fig. 2). D: null 3-way effect of age × adult talk × monolingual status. E and F: null effects of child gender (E) and SES (F) on adult talk. G & H: null effect of normative development (G) and monolingual status (H) on adult talk.

[Figure 4: child linguistic vocalization rate (per hr.) by SES level (<H.S. (1), H.S. (2), S.U. (3), B.A. (4), >B.A. (5)), shown in separate panels for individual corpora.]

Fig. 4. Child speech as a function of SES within individual corpora. SES = maternal education levels as in Table 1. White lines = linear fit with 95% CIs in color, color = corpus. Black lines = 99% CIs of sample means bootstrapped separately from linear fit for each level of SES. These data (as well as our main models and further analyses in SI 3H/G) do not reveal an SES effect on child speech.
New Insight on Child and Family Factors. Our main models, figures showing the raw data, and additional analyses (in the North American subset of the data, as well as using an alternative algorithm, see SI3F) reveal effects of normativity, age, and adult talk but not SES (measured here through maternal education), child gender, or monolingualism. To illustrate the complexities involved in determining causal links between child and family factors and child language skills, we again consider how causal links might manifest, using SES as a central example.
Our findings bear on debates regarding SES-associated academic achievement differences in Western industrialized societies (31, 32). Slower language development has often been attributed to parents from lower-SES backgrounds providing less input to their children (viewed from a middle-class Western-centric perspective (32)), leading to calls for behavioral interventions aiming to increase it. Proponents of such interventions might highlight our correlation between adult talk and child speech; critics might instead underscore our finding that SES was not significant in our main analyses nor in any of the other re-analyses we attempted (SI3E:G).
A full understanding of how SES may relate to children's language input is complicated for empirical and conceptual reasons, leaving strong conclusions premature. On the empirical side, two recent meta-analyses have investigated SES–input correlations, one focused on LENA™ measures (15), and the other based on human-annotated measures (mostly from short lab recordings) (16). The former finds evidence consistent with a publication bias; correcting this bias statistically nearly halves the association between SES and LENA™'s adult talk measure (r = .19 versus .12). The latter finds a sizeable SES effect when inspecting infant-directed speech (r = .34) and a much smaller one when analyzing overall input quantities (r = .09). Together, these studies suggest that our best estimate of the association between overall input quantities and SES is small (r = .1) and may not be detectable even with a sample as large as ours (where the effect was estimated at |d| = .06, or |r| = .03, which did not reach the threshold of significance). Similarly, descriptive plots of the potential correlation between our SES proxy and children's speech (Figure 4) did not suggest a strong or stable relationship across the 18 corpora, leading to our conclusion that, in the sample as a whole, on average, maternal education does not predict how much adults and children talk.
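As a rough check on the two effect-size scales quoted above (|d| = .06 and |r| = .03), the following minimal sketch applies a common conversion from a standardized mean difference to a correlation. It assumes two groups of equal size; the text does not say which conversion was actually used, and the function name `d_to_r` is hypothetical.

```python
import math


def d_to_r(d: float) -> float:
    """Convert a standardized mean difference d to a correlation r.

    Uses r = d / sqrt(d^2 + 4), which assumes two equal-sized groups
    (an assumption made for illustration, not stated in the text).
    """
    return d / math.sqrt(d * d + 4.0)


# Effect size reported in the text for the present sample: |d| = .06
r = d_to_r(0.06)
print(f"|r| is approximately {r:.2f}")
```

Under that assumption, |d| = .06 corresponds to |r| of about .03, which agrees with the pair of values reported in the paragraph above.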
On the conceptual side, SES differences in input and language skills may depend on how language is measured (33). For instance, we speculate that SES effects may be magnified by measures like prevalence of low-frequency words and complex sentence structures common in written text. Such words and structures may occur more in the input to Western, higher-SES children because of parenting practices stereotypical in these groups (34). Moreover, such measures may predict academic achievement better than others, because of the importance literacy has in Western schooling today. In contrast, SES differences in input may be minimized by holistic measures of speech quantities. Indeed, a strength of daylong recordings is that they provide a relatively neutral (rather than Western, high SES-centric) measure, as they tap into how much children are contributing (via speech) to their community's conversational interactions instead of how many rare words or complex constructions they have been taught.
An exclusive focus on word counts or speech quantities likely misses certain behaviors. As machine learning advances (35), it may soon be possible to automatically transcribe conversations happening in daylong recordings (at least in monolingual high-resource language contexts). We suspect that analysis of conversational content may reveal SES differences in, e.g., rare word use or family practices around book-reading even in naturalistic samples (36). Future work with a high-density longitudinal lens is also needed to assess the predictive value of global quantitative measures of speech (like those we employ) relative to more specialized measures (e.g. book-reading practices) with respect to culturally-relevant outcomes (e.g. academic achievement, pragmatic competence in multi-party conversation, etc.).
In our view, causal links between parental behavior and children's outcomes can best be illuminated by randomized control trials. Discovering and leveraging such links to change long-term language outcomes depends on community partnership-based approaches that are informed by the role that structural inequalities play in these outcomes and engage with culturally informed perspectives (37). The present results should not be used to deny families access to resources that evidence suggests are linked with better outcomes for children and their families.
Complicated causal effects are integral to all developmental processes. While we illustrated this with our SES null results, we also found no differences in child speech or adult talk as a function of child gender or multilingual status. Regarding multilingualism, we could not examine relative input in each language the child was exposed to. Future machine learning advances will permit the separate quantification of different languages in daylong recordings, but this must happen alongside reflection on how to fairly measure input and outcomes in such heterogeneous populations (38–40).
Automated Tools and What They Count. A key benefit of our approach is that we were able to pool and identically process 40,933 hours of independently-collected data (SI3A). Moreover, unlike parental surveys, clinical assessments, lab instruments, or hand-annotated data, current published evidence suggests that the LENA™ algorithm's results do not vary systematically by language (though they do vary somewhat across samples, 12). More relevant here, in analyzing the algorithm's accuracy as a function of samples grouped by language and cultural features, we found no significant differences (Methods, SI1E). While children's language skills grow dramatically over 2–48 months, our measure is not an index of comprehension (which can show quite a different trajectory, 41) but rather of observable linguistic behavior, focusing exclusively on children's rate of linguistic vocalizations (SI3B). These results certainly do not deny effects found on proxies of more narrow-scoped linguistic developments (e.g. vocabulary, processing efficiency, or syntactic complexity), given that some predictors that fail to explain variance here may nonetheless be significant there (3, 42).
The same holds for our measure of adult talk, which is quantitative and holistic; additional research is needed to distinguish child-directed from child-available speech, with the latter including all speech the child hears. Although some research suggests child-directed speech shows tighter correlations with children's vocabulary than child-available speech does (43, 44), the importance of the latter has not been as fully studied for other types of language knowledge (45); and, as far as we know, this paper is the first to document a significant link to everyday child speech behavior. Therefore, it would be relevant to further investigate the strength of the predictive value of overall adult talk (which was a significant predictor here) versus child-directed talk, in a similarly large and diverse sample as the present one. Unfortunately, automated tools for separating child-directed from overheard speech are not yet sufficiently accurate to make this possible (46). Future work could also develop promising new approaches to considering other sources of speech (e.g., other children) given their relevance as a function of family structure (47). These approaches were not possible here due to both technical algorithmic constraints and family structure information not being available in our data-subsets. Another fruitful future direction could consider conversational dynamics, studying both children's tendency to vocalize around adults and the complexity of such vocalizations. Recent work (that is critically reliant on human annotation of social intent) raises particularly interesting ideas in this domain (14, 48). Relatedly, novel exploratory analyses describing the acoustics of children's vocalizations (49) hold promise for driving future hypothesis-testing work building on the present results.
Whatever measures are employed in the future as proxies of child language production and input, we strongly encourage researchers to consider psychometric properties and ecological validity. The current approach demonstrates measure validity that is comparable to that of other standard infant instruments (SI1D:E). As context, measures used as proxies of infant language and cognitive knowledge are inherently noisier than the best batteries used to assess highly educated adults in Western-centric settings. Notably, even there, reliabilities can fall well below r = 1.‖
Moreover, standardized tests face ecological validity threats, particularly when applied cross-culturally. If our goal is to measure and understand the human mind, we need implementable, culturally sensitive and appropriate ways of measuring human behavior on a large scale. To our knowledge, there are no such measures whose reliability has been examined, driving us to conduct extensive quantification of the reliability of the metrics we employed here (SI1D:E). We found that our measures show levels of reliability that are consistent with those already in use for research and clinical purposes in infant populations. For example, the MacArthur-Bates Communicative Development Inventory (a parental report instrument used largely as a proxy of vocabulary size) has been the basis of cross-linguistic, demographic, and clinical research (9, 51–53), and reports a median correlation between itself and laboratory measures of .61 (54). Our median accuracy comparing automated and manual annotation for each of our algorithms (LENA™ and VTC) is .74, squarely in line with field standards (SI1E). Indeed, converging evidence across these two wholly separate algorithms regarding overall accuracy of our measure serves to increase confidence in the validity of our results.
In sum, rather than eliciting knowledge or caregiver-child interaction in a constrained lab setting, or using checklists in contexts where they make little sense socio-culturally, we measure everyday language use en masse. Our measure of early speech production is global, since we simply measure more versus less speech or speech-like production on the part of adults and children as they go about their daily life. And yet, these measures have important advantages, which led us to select them as proxies here, including comparable reliability to other measures of language development commonly used in both research and applied settings (Methods, SI1D:E); reported correlations between them and finer-grained, "qualitative" measures of language development (SI1D); and convergent validity with respect to standardized language tests (13). Most importantly, our speech measure merits consideration as one of many possible proxies of language development thanks to its cross-cultural adaptability, observer-free sampling volume, and sheer ecological validity. Indeed, our results raise the possibility that more ecologically-valid lexical, phonetic, or grammatical measures will also reveal stability across factors like SES (55), gender, and multilingualism. Exploring these factors, however, awaits machine-learning developments that can extract such fine-grained linguistic measures from the raw audio collected with child-worn devices.
Conclusion. Our analysis of speech behavior in daily life around the world evinces scientific progress on two fronts. First, by revealing substantial variation in young children's speech, we provide evidence against a monolithic picture of language development. Instead, this work reveals individual variation as fundamental to our understanding of this species-wide ability. Second, by tapping into natural speech interactions at unprecedented scale and diversity, we are able to move beyond prior work by simultaneously considering the interlocking factors that affect speech production over early development. Our results reveal not only the expected correlations with age and clinical factors, but also substantial associations with adult talk. All other factors paled in comparison with these three, the null effect of our SES proxy being of particular noteworthiness. These findings open exciting avenues for both theoretical research and potential applications, including the prospect of behavioral interventions to harness adult talk in the context of speech and language diagnoses. Small-scale experimental and observational research has been fundamental to our understanding of language, development, and the human mind. Machine learning (like that in speech technology) promises to extend our scientific reach by exploding the range of everyday interactions we are able to capture and analyze. Just as recent technological innovations have opened new vistas in understanding the vocalizations of mice and whales (6, 7), so too does speech technology have the potential to reveal how everyday human communication gives rise to language learning in children around the world.

‖ For instance, prior work finds test-retest reliabilities as low as r = .6 for certain sections of the widely used Wechsler Adult Intelligence Scale among North American English-speaking adults (50).

Methods
All code used to generate our analysis and the manuscript is available at https://osf.io/9 2m5/?view_only=50d 17 c 0844145ae692c35b78c6b08.
Data Discovery and Integration. We took steps to counter a prevalent bias for normative North American data (see SI3A for corpus constitution procedure). Included data were independently collected by 18 stewards (56–77); see SI5 for list of publications based on individual datasets. We note that while our corpora covered a much greater variety of participants than prior work, it would not be appropriate to interpret our samples as comprehensively representative of the country or language community from which they are drawn.
Socioeconomic status and normative development were streamlined for cross-corpus consistency (SI2A:B, SI3A, Figure S3A.1). For socioeconomic status we use maternal education, a reliable proxy of SES in previous research on language development (18, 78). Maternal education was available across all datasets, and could be converted into a 5-point maternal education scale with levels corresponding to less than high school degree, high school degree or equivalent, some college/vocational/associate degree level training, university/college degree, and advanced degree (SI2B; Table S2B.1).
For non-normative development, data stewards had tagged a wide variety of infant or familial characteristics as potentially non-normative. We confirmed that the classification was backed up by extant literature (SI2A). Infants ultimately classified as having non-normative development in the present sample include those who met one or more of the following criteria: preterm birth (<37 weeks); diagnosed speech or language delay; global developmental delay; low birth weight (<2500g when specified); hearing loss, hearing aids or cochlear implants; familial risk of Autism Spectrum Disorder, specific language impairment and/or dyslexia; other relevant genetic syndromes. Notably, our child vocalization rate measure is not a standardized normed clinical evaluation, and thus non-normative status may not necessarily translate to behavior that falls >1 standard deviation below the norm in these naturalistic recordings.
Analysis Details. We first randomly partitioned the data within each corpus such that 35% of monolingual, normative children were placed in an exploration set (N children = 264; N recordings = 850), and all others in a confirmation set (N children = 737; N recordings = 2025) (SI3A). The exploration set was used to study the psychometric properties of potential language input and output variables (SI3B), resulting in the selection of the output variable referred to as child speech above, and CVC (Child Vocalization Count rate) in analysis and supplementary files (SI3B, Table S3B.1); and the input variable referred to as adult talk above, and AVC (Adult Vocalization Count rate) in analysis and supplementary files (SI3B, Table S3B.2). Note that this includes both child-directed and child-available speech.
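The within-corpus partition described above can be illustrated with a minimal sketch. It is not the authors' code (their analysis was run in R): the record keys (`child_id`, `corpus`, `monolingual`, `normative`), the function name, the rounding rule, and the toy data are all assumptions made for illustration. The split is made at the level of children, so that all recordings from one child would fall in the same set.

```python
import random
from collections import defaultdict


def partition_children(children, fraction=0.35, seed=0):
    """Split child IDs into exploration and confirmation sets.

    `children` is a list of dicts with the hypothetical keys
    'child_id', 'corpus', 'monolingual', and 'normative'.
    """
    rng = random.Random(seed)
    # Only monolingual, normative children are eligible for exploration.
    eligible = defaultdict(list)
    for child in children:
        if child["monolingual"] and child["normative"]:
            eligible[child["corpus"]].append(child["child_id"])

    exploration = set()
    for corpus in sorted(eligible):
        ids = sorted(eligible[corpus])
        rng.shuffle(ids)
        # Take the requested fraction within each corpus separately.
        n_explore = round(len(ids) * fraction)
        exploration.update(ids[:n_explore])

    # Every other child (including all non-eligible ones) is confirmation.
    all_ids = {child["child_id"] for child in children}
    confirmation = all_ids - exploration
    return exploration, confirmation


# Toy data: two corpora with 20 eligible children each,
# plus 5 non-normative children in the first corpus.
children = [
    {"child_id": f"{corpus}-{i}", "corpus": corpus,
     "monolingual": True, "normative": True}
    for corpus in ("corpus_a", "corpus_b")
    for i in range(20)
]
children += [
    {"child_id": f"corpus_a-nn{i}", "corpus": "corpus_a",
     "monolingual": True, "normative": False}
    for i in range(5)
]
exploration, confirmation = partition_children(children)
print(len(exploration), len(confirmation))
```

Sampling separately within each corpus keeps every corpus represented in both sets, which is the point of partitioning "within each corpus" rather than over the pooled sample.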
In addition, we used the exploration set to check the robustness of results to variation in random effect structure, and explored diverse model structures using mixed models in R's lme4 package (79), checking whether the addition of effects or interactions explained additional variance (SI3C). This led us to (a) include overlap rate as a covariate (see Figure S3C.1), to control for the fact that in noisy environments, more child speech and adult talk within the same recordings may be labeled as "overlap" by LENA (and thus not attributed to either speaker type) and (b) to not include random slopes for any of the predictors. Regarding the latter choice, our exploration of random effect structure revealed that models including random slopes for any of the predictors (notably including gender and SES) as a function of corpus led to non-convergent models. While such non-convergence could be due to various reasons, the most likely explanation is that the model is overparametrized (80), i.e., variance cannot be reliably attributed to predictors within each corpus (see SI3H for additional checks, including one including random slopes for SES, and SI2B for discussion of alternatives to our SES implementation).
Evaluation against human annotations. To assess the validity of our child speech and adult talk measures, we evaluated them against human annotations (see SI1D:E for further information). The median correlation of human to algorithm performance for the algorithms is >.7, i.e. comparable reliability to established developmental clinical and research instruments (81–83). As far as we know, the present multi-cultural validation exceeds those from prior research instruments. For example, the Ages and Stages Questionnaire (84) is a standard instrument used at well-child visits in the U.S. It is also recommended by the World Bank as one of the most popular tools to measure child development, used in at least 20 countries (85). And yet, a recent systematic review (83) reports only 6 reliability analyses (averaging, e.g., .7 for internal consistency at 24mo.). Relative to this, our validation effort containing estimates for 14/18 corpora and finding strong validity is notable.
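To make the summary statistic above concrete, here is a minimal sketch of a median of per-corpus correlations between human and automated counts. The text does not specify the correlation coefficient used, so Pearson's r is an assumption here, and the function names and toy counts are hypothetical.

```python
import math
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def median_corpus_correlation(corpora):
    """`corpora` maps corpus name -> (human_counts, algorithm_counts)."""
    return median(pearson_r(human, algo) for human, algo in corpora.values())


# Toy counts (not from the paper): one pair of sequences per corpus.
toy_corpora = {
    "corpus_a": ([1, 2, 3, 4], [2, 1, 4, 3]),
    "corpus_b": ([5, 15, 25], [5, 15, 25]),
    "corpus_c": ([1, 2, 3], [3, 2, 1]),
}
print(median_corpus_correlation(toy_corpora))
```

Taking the median across corpora, rather than pooling all clips, keeps a single large or unusually noisy corpus from dominating the summary.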
Finally, one may wonder whether the LENA™ algorithm performs less well for languages and cultures that diverge from its training set, which was English-learning children growing up in an urban/suburban U.S. setting. Although we observe considerable corpus variation, this variation is not attributable to whether children were learning English or growing up in an urban setting, as assessed by Welch's t-tests, for either our child speech measure (CVC; English versus non-English medians 0.785 vs. 0.71, t(6.04) = -0.5, p = 0.637; urban versus rural medians 0.77 vs. 0.71, t(8.11) = -0.46, p = 0.661), or for our adult talk measure (AVC; English versus non-English medians 0.75 vs. 0.74, t(7.91) = 0.42, p = 0.686; urban versus rural medians 0.75 vs. 0.74, t(3.07) = -0.23, p = 0.835). Instead, our results suggest that corpus variation more likely reflects how the human annotation was done rather than how well the algorithm worked, since the corpora with lower reliabilities were also those in which the human annotation was more coarse-grained (see SI1E).
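The non-integer degrees of freedom reported above, such as t(6.04), are characteristic of Welch's unequal-variance t-test. The sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom for two groups of per-corpus accuracy values. The function name and the example values are hypothetical and are not taken from the paper; the p-value step is omitted because it needs a t-distribution CDF that the Python standard library does not provide.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test.

    `a` and `b` are sequences of observations, each with at least
    two values; variances are sample variances (n - 1 denominator).
    """
    va = variance(a) / len(a)  # squared standard error, group a
    vb = variance(b) / len(b)  # squared standard error, group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-corpus accuracies for two groups of corpora.
group_1 = [0.80, 0.78, 0.75, 0.70]
group_2 = [0.74, 0.71, 0.69, 0.77, 0.66]
t_stat, dof = welch_t(group_1, group_2)
print(f"t({dof:.2f}) = {t_stat:.2f}")
```

Because the two groups of corpora differ in size and possibly in spread, the Welch form avoids the equal-variance assumption of the classical two-sample t-test.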
Additional algorithm. To make sure that key conclusions were robust to methodological details, we reanalyzed the subset of the data for which data stewards shared audio with a newer, open-source alternative to LENA™: the Voice Type Classifier (VTC) (86). Like the LENA™ algorithm, VTC returns an estimation of child and adult vocalization counts. A total of 1065 audio files from 11 corpora were available for this reanalysis (SI3F).
The VTC algorithm employs a completely different approach than the proprietary algorithm developed by LENA™, including the use of neural networks running directly from the audio (rather than from MFCC features). VTC allows multiple talker classes to be activated at the same time, whereas in the LENA™ algorithm, overlap between talkers (or between a talker and noise) is tagged as "Overlap," which is not counted towards children's input or output. VTC also differs from LENA™ in its training set. While LENA™ was trained entirely on data from North American, monolingual English-learning, urban children, VTC was developed using the combination of various corpora of children residing in urban or rural settings and learning one or more of several languages (including the tonal language Minn, French, Ju|'hoan, Tsimane, English, and several others, in rough order of quantity of data). Further information on accuracy is provided in SI1E; both algorithms render similar accuracy when compared to human annotation as noted above.
Models. We used linear mixed regressions (Gaussian family), and established model structure from the exploration data (SI3C). Hypotheses were derived from exploratory models and systematic reviews of literature on monolingualism and normativity (SI3D). The model predicting the rate of children's linguistic vocalizations (i.e. child speech) was: child_gender + SES + child_normative * AVC * age + child_monolingual * AVC * age + overlap + (1 + overlap + AVC | corpus) + (1 | corpus:child_id). The model predicting the rate of adult linguistic vocalizations (i.e. adult talk) was: child_gender + SES + child_normative * age + child_monolingual * age + overlap + (1 + overlap | corpus) + (1 | corpus:child_id). Full model details and a link to model diagnostics are provided in SI3E. We report estimates (standardized, which serve as effect sizes), standard errors of the estimates, and q-values (FDR-corrected p-values); see Tables 1 and 2.
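The q-values mentioned above are described only as FDR-corrected p-values; the text does not name the correction procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure, q-values can be computed from a list of p-values as follows. The function name and the example p-values are hypothetical.

```python
def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values).

    Returns the q-values in the same order as the input p-values.
    """
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical p-values for four model terms.
example_p = [0.01, 0.04, 0.03, 0.005]
print(bh_qvalues(example_p))
```

A term is then treated as significant when its q-value, rather than its raw p-value, falls below the chosen threshold, which controls the expected share of false discoveries across all terms tested.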
Participants. Table 3 lists participant characteristics noting both (1) the exploration/confirmation split (SI3A), and (2) that some children provided multiple recordings. We excluded 2/850 recordings from 1/264 children from the exploration set and 8/2025 recordings from 5/737 children in the confirmation set from our models because data regarding their maternal education was missing. For child gender, there were slightly more boys than girls. This was in part because corpora with children with non-normative development also include children with normative development matched in gender, leading to an over-representation of boys since more boys than girls have non-normative development. See Table 3 and Figure 5 for specific numbers and visualized distributions.