Journal of Econometrics 240 (2024) 105689
Available online 10 February 2024
0304-4076/© 2024 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Non-representative sampled networks: Estimation of network structural properties by weighting✩
Chih-Sheng Hsiehᵃ, Yu-Chin Hsuᵇ,ᶜ,ᵈ, Stanley I.M. Koᵉ, Jaromír Kováříkᶠ,ᵍ,∗, Trevon D. Loganʰ,ⁱ
ᵃ Department of Economics, National Taiwan University, Taipei, Taiwan
ᵇ Institute of Economics, Academia Sinica, Taipei, Taiwan
ᶜ Department of Finance, National Central University, Taoyuan City, Taiwan
ᵈ Department of Economics, National Chengchi University, Taipei, Taiwan
ᵉ Graduate School of Economics and Management, Tohoku University, Japan
ᶠ Dpto. del Análisis Económico, University of the Basque Country UPV-EHU, Bilbao, Spain
ᵍ Faculty of Arts & Faculty of Economics, University of West Bohemia, Pilsen, Czech Republic
ʰ Department of Economics, The Ohio State University, 410 Arps Hall, 1945 N. High Street, Columbus, OH, 43210, United States of America
ⁱ NBER, United States of America
ARTICLE INFO
Keywords: Networks; Weighting; (Post-)stratification; Non-representativeness; Measurement errors
ABSTRACT
This paper analyzes statistical issues arising from non-representative samples of a network. Sampled network data could systematically bias the network properties and generate non-classical measurement error problems. Apart from the sampling rate and the elicitation procedure, the biases on network structural measures depend non-trivially on which subpopulations of nodes are missing with higher probability. We propose a methodology, adapting weighted estimators to networked contexts, which enables researchers to recover several network-level statistics and reduce the biases in the estimated network effects. The proposed weighted estimators are consistent and asymptotically normally distributed and have good performance in finite samples. Notably, our approach does not require users to assume any network formation model and is straightforward to implement.
1. Motivation
There is growing interest in understanding the role of networks in Economics (Vega-Redondo, 2007; Jackson, 2010). Different ‘‘micro’’ and ‘‘macro’’ features of network architecture shape diffusion, learning, behavior, and other substantive phenomena in a variety of contexts. Due to the increasing availability of large network data sets and increasing computational power, empirical network research is now a dynamic and growing part of this literature. At the same time, empirical network analysis generates
✩ We are grateful to Isaiah Andrews, Aureo de Paula, Marco van der Leij, and participants at numerous seminars for comments and suggestions. Hsieh acknowledges financial support from the National Science and Technology Council of Taiwan (NSTC110-2410-H-002-195). Hsieh and Hsu gratefully acknowledge the research support from the Center for Research in Econometric Theory and Applications of National Taiwan University, Taiwan (Grant no. 112L8601). Hsu gratefully acknowledges research support from the National Science and Technology Council of Taiwan (NSTC112-2628-H-001-001), and the Academia Sinica Investigator Award of Academia Sinica, Taiwan (AS-IA-110-H01). Kovářík acknowledges financial support from Ministerio de Economía y Competitividad, Spain and Fondo Europeo de Desarrollo Regional (PID2019-106146GB-I00), the Basque Government, Spain (IT1461-22), and the Grant Agency of the Czech Republic (21-22796S).
∗ Corresponding author at: Dpto. del Análisis Económico, University of the Basque Country UPV-EHU, Bilbao, Spain.
E-mail addresses: [email protected] (C.-S. Hsieh), [email protected] (Y.-C. Hsu), [email protected] (S.I.M. Ko), [email protected] (J. Kovářík), [email protected] (T.D. Logan).
https://doi.org/10.1016/j.jeconom.2024.105689
Received 27 July 2022; Received in revised form 7 January 2024; Accepted 10 January 2024
new econometric challenges (Fortin and Boucher, 2015; De Paula, 2017; Jackson et al., 2017). This paper tackles the challenges that arise when network data come from non-representative samples of the population, which is the most commonly encountered scenario in practical applications.
The vast majority of empirical network studies analyze sampled data, and the sampling rates are typically low.¹ Even though the literature across several disciplines has noted that using sampled data may lead to considerable biases and other statistical issues (see below for references), the typical approach is to treat the sampled data ‘‘as if’’ it were complete. Chandrasekhar and Lewis (2016) show formally that, even if the nodes are selected representatively through simple random sampling (SRS, henceforth), the statistics of the sampled networks differ significantly from those of the population network. This disparity results in measurement errors and inconsistency problems when we estimate network effects through regressions. The estimates from sampled networks may suffer from attenuation, expansion, or even sign-switching. As a result, one cannot rely on solutions to classical measurement-error problems to correct these issues, even if the sample is representative.
Furthermore, nodes observed in network samples are typically non-representative. First, non-representativeness may be caused by the sampling design itself (Frank, 1981; Kolaczyk, 2009; Handcock and Gile, 2010). For instance, the star subgraph sampling design analyzed in this paper is prone to including nodes with higher connectivity than nodes with a small number of network neighbors. The reason is that star subgraphs encompass not only the initially sampled nodes but also their network neighbors even if the latter were not initially sampled. Having more connections thus increases the probability of a node being included. This is an example of a design that generates samples in which the inclusion probabilities of nodes are endogenous to the underlying population network structure. Non-representativeness may also arise when the inclusion (or missing) probabilities are orthogonal to the population network architecture. For instance, when network samples are collected with specified boundaries such as within schools, or within villages, etc., it is not guaranteed that samples within boundaries are representative of the entire population. Such boundary-induced network samples are equivalent to the induced subgraph sampling design analyzed in this paper. Other common sources of non-representativeness in sampling studies are non-responses or disproportionate stratified sampling. Many studies exploit stratified samples to improve precision and sampling efficiency. Unfortunately, it is difficult and costly to stratify for all relevant characteristics.
To intuitively explain the issues arising from sampled networks, we decompose the problem into two sources, scaling and non-representativeness. Scaling refers to observing fewer nodes and edges than there exist in the whole network, independently of the (non-)representativeness of the sample. In contrast, non-representativeness arises when different nodes have unequal probabilities of being included in the sample. If nodes appear in the sample with equal probability, only scaling matters. As an example of the effect of scaling, let us consider the average degree of a network. When the links between the sampled and non-sampled nodes are not observed, the sample average degree is biased downwards by construction. Furthermore, suppose the average degree is correlated with the network’s diffusion properties. As a result, using the sample average degree in a regression analysis leads to an overestimation of the average degree’s impact on diffusion, even when samples are representative. This is an example of the expansion of the estimated effect and thus, non-classical measurement error. However, if nodes appear in the sample with unequal probabilities, whether the observed average degree and the estimates are inflated or attenuated will depend on who is missing. For example, if less connected nodes are missing with higher probability, scaling and non-representativeness can bias the average degree and the estimates in opposite directions, and one cannot easily predict which force will dominate. In contrast to the average degree, the global clustering coefficient and the homophily index can be unbiased in representative samples. In samples in which different types of nodes are missing with different probabilities, homophily will be biased by definition. Since clustering is typically associated with connectivity in social networks (Jackson and Rogers, 2007), it is also likely to be mismeasured. The magnitude and direction of the biases in these characteristics and their estimated effects in regressions again depend crucially and non-trivially on who is missing.
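The interplay of scaling and non-representativeness can be made concrete with a small illustration. The sketch below is not taken from the paper: it uses a hypothetical six-node network with two hubs and four leaves, assumes that nodes enter the sample independently, and enumerates every possible sample to obtain the exact expected naive average degree of the induced subgraph. All names (`expected_naive_avg_degree`, `EDGES`, the three sampling scenarios) are illustrative assumptions.

```python
from itertools import product

def expected_naive_avg_degree(n, edges, incl_prob):
    """Exact expected naive average degree of the induced subgraph.

    Assumption: node i enters the sample independently with probability
    incl_prob[i]. The expectation is conditional on a non-empty sample.
    """
    total, p_nonempty = 0.0, 0.0
    for draw in product([0, 1], repeat=n):      # enumerate every possible sample
        m = sum(draw)
        if m == 0:
            continue                            # average degree is undefined
        p = 1.0
        for i in range(n):
            p *= incl_prob[i] if draw[i] else 1.0 - incl_prob[i]
        links = sum(1 for i, j in edges if draw[i] and draw[j])
        total += p * (2 * links / m)            # naive average degree of this sample
        p_nonempty += p
    return total / p_nonempty

# Hypothetical population: nodes 0 and 1 are hubs, nodes 2..5 are leaves.
N = 6
EDGES = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5)]
true_avg_degree = 2 * len(EDGES) / N

# All three scenarios sample three nodes in expectation.
equal_rates = expected_naive_avg_degree(N, EDGES, [0.5] * N)
leaves_missing = expected_naive_avg_degree(N, EDGES, [0.9, 0.9, 0.3, 0.3, 0.3, 0.3])
hubs_missing = expected_naive_avg_degree(N, EDGES, [0.3, 0.3, 0.6, 0.6, 0.6, 0.6])

print(true_avg_degree, equal_rates, leaves_missing, hubs_missing)
```

In this toy example the expected sample size is the same in all three scenarios, so the differences between them come only from who is missing: the naive average degree is pushed up when leaves are under-sampled and pushed further down when hubs are under-sampled, while scaling keeps all three below the population value.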
In this study, we systematically analyze the problems arising from sampled network data elicited via two widely employed sampling methods, and propose a solution that makes it possible to recover the true structural features of a network (e.g., average degree) and mitigate biases in regressions which study the impact of these network features on either individual or group-level behaviors and outcomes.² We first derive analytically weighted estimators for a set of network structural properties from sampled networks assuming that nodes appear in the sample with unequal probabilities according to their types. Secondly, we study the asymptotic properties of the proposed weighted estimators and evaluate their finite-sample performance numerically. Lastly, the proposed methodology is applied to a widely employed stratified data set on Indian villages (Banerjee et al., 2013).³ This data set is suited to our approach because it contains a relatively large number of networks, and we document that the network data have been collected from a non-representative sample of the population under scrutiny.
This study shows that relying on the assumption of representativeness to adjust network samples, which is rarely satisfied in real-world applications, can be as biased as using raw network samples without any adjustments. Since the direction and magnitude
¹ The reasons behind the common use of network samples are the impracticality of analyzing the entire population and the higher costs associated with network elicitation compared to collecting basic individual characteristics (Aral, 2016; Breza et al., 2020). Chandrasekhar and Lewis (2016) report that the median sampling rate in applied work in Economics is 25% and more than 66% of network studies have a sampling rate lower than 51%. Similar rates are found in other fields.
² Our study also improves inferences in network-formation applications studying contextual determinants of the network architecture (i.e., applying network properties as regressands). Since network formation represents a key topic in the network literature (see Jackson, 2005 and De Paula, 2020 for reviews), it enlarges the applicability of the proposed methodology. However, this study focuses on regressions including network properties as regressors.
³ See, e.g., Jackson et al. (2012), Banerjee et al. (2013, 2014), Chandrasekhar and Lewis (2016) and De Paula et al. (2018), among others.
of the biases depend on who is missing, we demonstrate the necessity of accounting for potentially different missing rates of different types of nodes in applied work. This is particularly important in network data where population and distributional parameters are of primary interest.
As the main contribution, we propose weighted estimators for a selected set of network characteristics that are widely used in applications: average degree, global clustering coefficient, epidemic threshold, and homophily index. These network features represent fundamental aspects of network architecture employed in theoretical and empirical research and provide intuitive insights regarding the way social organization shapes individual and group-level phenomena (Jackson et al., 2017).⁴ To that aim, we assume that network members can be divided into a finite number of disjoint types, and that sampling rates differ across these types. Taking explicit account of the differing sampling rates across types, we adapt standard (network-free) Horvitz–Thompson (H–T) estimators to networked contexts and propose (post-)stratification as a viable approach to correct sampling biases caused either by the sampling procedure or due to varying non-response rates among different demographic or socioeconomic categories (or both) in order to improve the precision of sample estimates of objective variables of interest (Smith, 1991; Little, 1993). The main difference between the standard H–T estimators and our approach is to weight on network objects, such as links, triples, or triangles, rather than on nodes.⁵ We prove that, in sparse networks, the proposed weighted estimators are consistent and asymptotically normally distributed.⁶ We also provide sufficient conditions so that we can ignore the estimation effects when the regression analysis includes the proposed weighted network measures as covariates. Our numerical analysis shows that our methodology performs well in finite samples and substantially outperforms both the naive (uncorrected) statistics from the raw data and corrections designed for representative samples.
Our empirical application shows that the Indian village network data stratified on religion and geography are non-representative in terms of age and gender. We then show that not accounting for unequal missing rates of nodes of different types affects the estimated network effects substantially and one cannot easily predict the direction and magnitude of the biases. Given the differences, applied researchers should carefully consider to what extent their results might be driven by the non-representativeness of their samples.
The present paper connects to three pieces of literature. First, our methodology complements emerging econometric literature on imperfectly measured network data and the estimation of network effects. Chandrasekhar and Lewis (2016) show that estimations with network data coming from representative samples suffer from non-classical measurement errors and propose a method to ensure consistent estimates. Their methodology consists of two alternative approaches. First, they provide formal corrections for several network measures. Our approach generalizes this first strategy. As a second approach, they propose a graphical reconstruction technique that delivers consistent estimates in both network-level and individual-level regressions. The procedure is first to estimate a network formation model and then employ the estimated model to interpolate over missing parts of the network. The network reconstruction approach requires a correct model specification and certain assumptions to ensure the consistency of the network effects. However, this second approach does not necessarily recover the structural properties of the population network, which is the primary objective of our study. Most importantly, from the perspective of the present work, both approaches are restricted to the case that the sample is representative. Chandrasekhar and Jackson (2016) propose a network formation model similar in spirit to our methodology in that it is also based on subgraphs as a function of the types of nodes. However, none of these approaches can effectively recover the true network formation process from non-representative samples because, when the network-formation model is fitted on non-representative network samples, the estimated parameters in the first stage will likely be biased and potentially inconsistent even if the assumed model is correct. Thirkettle (2019) proposes a network formation model enabling the estimation of bounds on network statistics from partially observed networks. The advantage of our approach, as opposed to the graphical reconstruction techniques, is that our methodology does not rely on any assumed network formation model. Our work complements and expands the above studies by providing the first step toward the statistical treatment of network data coming from non-representative samples of the population, which is the most common type of network data available.⁷
Second, we contribute to the statistical sampling theory that has developed procedures for recovering the true network structural parameters from samples if the only source of non-representativeness comes from the network sampling design (see Kolaczyk, 2009 for a survey). Our methodology nests these procedures as a special case (e.g., Frank, 1981, Kolaczyk, 2009, Chandrasekhar and Lewis, 2016). Our weighting method shares the same goals with these approaches but differs substantially in the underlying assumptions and applicability. Unlike these approaches which are typically suitable for specific sampling designs, our method can be applied or adapted to various sampling procedures.⁸ Furthermore, our approach remains effective even in cases where non-representativeness is caused by factors unrelated to the sampling design, such as non-response or the presence of hard-to-reach subpopulations. Most importantly, existing approaches assume certain forms of representativeness in the sampling process ex-ante, while our proposed methodology targets both ex-ante and ex-post non-representativeness of the sample. If the sampling rates are set by the researcher before the data collection (as in the approaches discussed above and in standard stratification), they are treated
⁴ Section 6 discusses the extension of our approach to other network characteristics.
⁵ Such network objects are referred to as subgraphs, subnetworks, or network motifs in different fields.
⁶ Since virtually all real-life social and economic networks are sparse, our asymptotic results are broadly applicable to empirical research.
⁷ Boucher and Houndetoungan (2020) study the estimation of peer effects when the researchers only observe consistent estimates of aggregate network statistics. Hence, our methodology and their approach naturally complement each other in non-representative samples since our methodology delivers such consistent estimates from non-representative samples.
⁸ For the sake of brevity, we concentrate on two sampling designs commonly used in economics. However, our methodology can be applied or adapted to other sampling designs. See Section 6.
as known parameters. If the sampling rates are learned after the data collection, our methodology corresponds to post-stratification by exploiting the non-representativeness of the sample and treats the sampling rates as unknown parameters to be estimated.
Last, we contribute to better practices for empirically evaluating the effects of global network features in socio-economic environments. Our study shows that, despite other econometric issues, mismeasured network features with non-representative samples might lead to a serious misunderstanding of network effects. However, our methodology mitigates this issue and provides an additional argument for the employment of sampling in empirical network work. With the increasing use of network data and corresponding empirical techniques, our proposed approach can improve the design of network sampling strategies and the inference we draw from network studies more generally. Moreover, it can serve as a standard robustness check for empirical results.
2. Framework
2.1. Notation
A graph or network is defined by $G_n = (V, E)$, where $V$ is the set of vertices (nodes) with $n = |V|$ denoting the cardinality of $V$, and $E$ is the set of edges (links). The network can be represented by an $n \times n$ adjacency matrix $W_n$. We focus on unweighted and undirected networks; i.e., $W_{ij,n} = 1\,(0)$ if $i$ and $j$ are (not) connected and $W_{ij,n} = W_{ji,n}$ for each $i, j \in V$. Following the convention, we exclude self-loops by setting $W_{ii,n} = 0$. We assume that the nodes can be classified into $T$ disjoint types with a generic type $t \in \mathcal{T} = \{1, 2, \ldots, T\}$. One can view this classification as stratification, which can be carried out either before or after sample collection. When conducted after the collection, this process is commonly referred to as post-stratification. We write $t_i = t$ if node $i$ is of type $t$. Then, $t_i = t_j$ ($t_i \neq t_j$) indicates that $i$ and $j$ are (not) of the same type. Let $V_t$ be the set of nodes of type $t$, $n_t = |V_t|$ is the size of this set, and $\sum_{t=1}^{T} n_t = n$.
Rather than the whole network $G_n$, researchers only observe the sampled network, which is also referred to as a subgraph of $G_n$. Let $V^* \subseteq V$ be the set of sampled nodes of size $m = |V^*|$ and let $\psi$ denote the sampling rate. Analogously, $V^*_t$ denotes the set of nodes of type $t$ in the sample and $m_t = |V^*_t|$ is the number of sampled nodes of type $t$ and $\sum_{t=1}^{T} m_t = m$. We use $\psi_t$ to denote type $t$’s sampling rate. We assume that $\psi_t > \tau$ for some $\tau > 0$ for each $t$ and is independent of $n$. Crucially, we assume that, within each type, individual nodes have an equal probability of being selected into the sample. Our framework primarily focuses on non-representative samples, i.e., $\psi_t \neq \psi_s$ for at least one $t, s \in \mathcal{T}$, while also encompassing the representative sample, i.e., $\psi_t = \psi$ for all $t \in \mathcal{T}$, as a special case.
In the context of (ex-ante) stratification, the true value of $\psi_t$ is a known quantity specified by the researcher. However, when it comes to post-stratification, the true value of $\psi_t$ is treated as unknown and can be estimated by $\hat{\psi}_t = m_t / n_t$, the ratio of the number of nodes of type $t$ included in the sample to the population number of nodes of type $t$. We denote $\varphi_i = \sum_{t=1}^{T} \psi_t \mathbf{1}(t_i = t)$ the sampling probability of node $i$, conditional on the type, and $\hat{\varphi}_i = \sum_{t=1}^{T} \hat{\psi}_t \mathbf{1}(t_i = t)$ the corresponding estimator based on $\hat{\psi}_t$.
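As a minimal sketch of the post-stratification step, the snippet below computes the type-level rates $m_t / n_t$ and the implied node-level inclusion probabilities. The function names, the dictionary-based data layout, and the toy population are assumptions made for illustration; the only inputs needed are the type of every population node and the identifiers of the sampled nodes.

```python
from collections import Counter

def estimate_sampling_rates(pop_types, sampled_ids):
    """Post-stratification estimate of each type's sampling rate, m_t / n_t.

    pop_types   : dict mapping every population node to its type
    sampled_ids : iterable of sampled node identifiers
    """
    n_t = Counter(pop_types.values())                 # population counts by type
    m_t = Counter(pop_types[i] for i in sampled_ids)  # sample counts by type
    return {t: m_t[t] / n_t[t] for t in n_t}

def node_inclusion_probs(pop_types, psi_hat):
    """Estimated inclusion probability of node i: the rate of i's own type."""
    return {i: psi_hat[t] for i, t in pop_types.items()}

# Hypothetical population with four nodes of type "a" and two of type "b".
pop_types = {0: "a", 1: "a", 2: "a", 3: "a", 4: "b", 5: "b"}
sampled = {0, 1, 2, 4}

psi_hat = estimate_sampling_rates(pop_types, sampled)
phi_hat = node_inclusion_probs(pop_types, psi_hat)
print(psi_hat, phi_hat)
```

A type with no sampled nodes receives an estimated rate of zero here, which would make inverse-probability weights undefined; this is the practical counterpart of the assumption that every $\psi_t$ is bounded away from zero.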
Given sampled nodes, this paper focuses on two designs of eliciting network edges. The first is the induced subgraph, in which the sampled network is denoted by $G^I_n = (V^*, E^I)$. In $G^I_n$, the set $V^*$ involves $m$ sampled nodes, and the set $E^I \subseteq E$ involves network links among these $m$ sampled nodes. $W^I_n$ is the $m \times m$ adjacency matrix corresponding to $G^I_n$. The second is the star subgraph, in which the sampled network is denoted by $G^S_n = (V^*, E^S)$.⁹ In $G^S_n$, there are $m$ initially sampled nodes in the set $V^*_0$. However, researchers observe not only the network links among these $m$ sampled nodes, but also the links of the $m$ sampled nodes to unsampled nodes in $V$. Hence, we use $E^S$ to denote the set of edges such that at least one node of the corresponding dyad is in $V^*_0$. The set $V^*_0$ is enlarged to $V^*$ by including all the vertices $i \in V \setminus V^*_0$ that are connected through the observed links to at least one sampled node from $V^*_0$. The size of this enlarged vertex set is denoted by $m' = |V^*|$, and the corresponding sampling rate is denoted by $\psi'$. Let $W^S_n$ be the $m' \times m'$ adjacency matrix corresponding to the graph $G^S_n$. In both the induced and star subgraphs, we assume that edges are reported without errors.
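The two elicitation designs can be sketched on an edge list as follows. This is an illustrative reading of the definitions above, with hypothetical function names and a hypothetical five-node network; undirected links are stored as `frozenset` pairs.

```python
def induced_subgraph(edges, sampled):
    """Induced subgraph: keep only links whose two endpoints are both sampled."""
    return {frozenset(e) for e in edges if e[0] in sampled and e[1] in sampled}

def star_subgraph(edges, sampled0):
    """Star subgraph: keep links with at least one initially sampled endpoint.

    Returns the enlarged vertex set (initial sample plus observed neighbors)
    and the observed edge set.
    """
    e_s = {frozenset(e) for e in edges if e[0] in sampled0 or e[1] in sampled0}
    v_star = set(sampled0).union(*e_s)   # add every endpoint of an observed link
    return v_star, e_s

# Hypothetical population network and initial sample.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
sampled0 = {0, 1}

e_induced = induced_subgraph(edges, sampled0)
v_star, e_star = star_subgraph(edges, sampled0)
print(e_induced, v_star, e_star)
```

In this example nodes 2 and 3 both enter the enlarged vertex set of the star subgraph, yet the link between them stays unobserved because neither endpoint was initially sampled.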
We study several network structural properties (measures) and refer to a generic population network measure as $\Lambda$. Let $\Lambda(G_n)$ denote the estimated network measure of $\Lambda$ based on the whole network data, and let $\Lambda(\tilde{G}_n)$, $\tilde{G}_n \in \{G^I_n, G^S_n\}$, represent the corresponding estimated network measure based on the sampled network $\tilde{G}_n$. We call $\Lambda(\tilde{G}_n)$ the naive estimator of network property. Additionally, let $\widehat{\Lambda}(\tilde{G}_n)$ denote the weighted network measure proposed to mitigate sample biases with respect to the whole network. For example, $\Lambda(G_n) = \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n}$ is the average degree of a graph, which we denote $d(G_n)$ below. Hence, $d(\tilde{G}_n)$ is the average degree of the sampled network, and $\widehat{d}(\tilde{G}_n)$ is the proposed weighted estimator to mitigate biases of $d(\tilde{G}_n)$.
In applications, researchers may observe multiple networks. We use a generic subscript $r \in \mathcal{R} = \{1, 2, \ldots, R\}$ when a measure refers to network $r$. That is, $G_{r,n_r}$ denotes the graph $r$, and $\tilde{G}_{r,n_r} \in \{G^I_{r,n_r}, G^S_{r,n_r}\}$ denotes the corresponding sampled network. Therefore, $n_{r,t}$ and $m_{r,t}$ are the number of nodes of type $t$ in the whole network $r$ and its corresponding number in the sample.
2.2. Regression with network measures
In addition to the reconstruction of network properties of interest, we also consider regression analysis with network measures. Throughout the analysis, we focus on regressions in which researchers are interested in understanding whether and how the global measures of network properties influence a particular outcome. Formally,
$$y_r = \alpha + \beta \Lambda_r + x_r \gamma + \varepsilon_r, \tag{1}$$
⁹ $G^S_n$ is referred to as the labeled star subgraph in Kolaczyk (2009) because the unsampled nodes which connect to sampled nodes are identified and labeled.
where $y_r$ is the outcome variable of network (or community) $r$, $x_r$ is the set of network-level controls, and $\Lambda_r$ is the population network property of interest of the $r$th network population. The researchers are interested in estimating the parameters $\alpha$, $\beta$, and $\gamma$. Examples of the applications of (1) in the literature include Alatas et al. (2016), who regress the ability of villagers to aggregate information on a set of network characteristics in Indonesian villages, Banerjee et al. (2013), who model the microfinance take-up rate in rural India as a function of the average centrality of the initial seeds, Currarini et al. (2009) and Golub and Jackson (2012), who relate homophily to school-level statistics using Add Health data, or Fleming et al. (2007), who model the ability of different regions to generate knowledge depending on the structure of regional research networks. Such regressions are also of interest theoretically. For example, the overall clustering of a network may explain the magnitude and efficiency of risk-sharing within a society (Bloch et al., 2008), and the stability of behavior in a society may be related to the minimal eigenvalue of the adjacency matrix (Bramoullé et al., 2014).
The proposed approach also applies to models investigating the influence of a network’s global measure on individual-level outcomes: $y_{ir} = \alpha + \beta \Lambda_r + x_{ir} \gamma + \mu_r + \varepsilon_{ir}$, where $y_{ir}$ is the outcome of an individual $i$ in network $r$, $x_{ir}$ captures individual heterogeneity (that can also include the heterogeneity of $i$’s neighborhood), and $\mu_r$ is a network random effect. For instance, the decision of an individual to adopt a product (e.g., microfinance as in Banerjee et al., 2013), participate in an activity (e.g., recreational activity as in Bramoullé et al., 2009), or behave in a particular way (Centola, 2010) can depend on the overall structure of the network. In the same vein, the innovation literature studies how the structure of regional networks shapes the innovative performance of individual innovators (Schilling and Phelps, 2007). There also exist theories arguing that the overall structure of a network may determine the behavior at the individual level (see, e.g., Ballester et al., 2006; Bramoullé et al., 2014).
With sampled data, researchers observe $\tilde{G}_{r,n_r} \in \{G^I_{r,n_r}, G^S_{r,n_r}\}$, and the naive estimator $\Lambda(\tilde{G}_{r,n_r})$ is not a consistent estimator of $\Lambda_r$. Therefore, when researchers estimate
$$y_r = \alpha + \beta \Lambda(\tilde{G}_{r,n_r}) + x_r \gamma + u_r, \tag{2}$$
it leads to a measurement error in the regressor. The classic measurement error and the resulting attenuation bias are based on several assumptions that are generally not satisfied in the case of network measures.¹⁰ Chandrasekhar and Lewis (2016) show analytically and via simulations that the biases are generally not tractable and can lead to expansion or sign switching under representativeness. The issues become even more problematic if the representativeness assumption is violated. On the other hand, when researchers estimate
$$y_r = \alpha + \beta \widehat{\Lambda}(\tilde{G}_{r,n_r}) + x_r \gamma + u_r, \tag{3}$$
it leads to consistent estimation of the parameters. In this regard, we later provide sufficient conditions such that we can ignore the estimated effect of $\widehat{\Lambda}(\tilde{G}_{r,n_r})$ in the OLS regression.
3. Weighted estimators for sampled network measures
This section proposes weighted estimators for commonly used network measures when sampled data are used. We also address the biases present in both the naive (unweighted) estimators and weighted estimators that solely account for scaling effects. One key assumption made throughout this section is that the network measures (statistics) under consideration are well-defined. For example, when the sampling rate is extremely low, the global clustering coefficient could be zero if no closed triplets are observed in the sampled network. In such scenarios, both the naive estimator and our proposed weighted estimator are null and it is not possible to recover the true value of the coefficient. Hence, we stress that our corrections of network statistics are applicable under sampling rates in which well-defined naive estimators exist. To maintain notational simplicity, we will omit the network index $r$ in the subscripts throughout Sections 3.1 and 3.2. We reintroduce it when discussing the asymptotics of regressions with network measures in Section 3.3.
3.1. Average degree
The degree is the number of connections of a node, which is a basic measure of a node’s importance or local centrality. The average degree of the graph $G_n$ is simply the average number of network links per node in the network, defined as $d(G_n) = \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n}$. It has been applied as a regressor in numerous empirical studies of different contexts (see, e.g., Branas-Garza et al., 2010; Banerjee et al., 2013; Alatas et al., 2016, among many others).
For an induced subgraph, the naive estimator of the average degree is computed as
$$d(G^I_n) = \frac{1}{m} \sum_{i \in V^*} \sum_{j \in V^*} W^I_{ij,n} = \frac{1}{m} \sum_{i \in V} \sum_{j \in V} W_{ij,n} D_i D_j, \tag{4}$$
where $D_i$ is a binary variable that takes the value 1 if $i \in V^*$, and 0 otherwise. To correct the biases from both scaling and non-representativeness in (4), we propose the weighted sample average degree by multiplying each observed sample edge $W^I_{ij,n}$ with the
¹⁰ Although network regressions often face additional challenges such as endogeneity and omitted variable problems, we contend that the sampling issue persists even in the absence of these problems.
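weight, $(\hat{\varphi}_i \hat{\varphi}_j)^{-1}$, which is the inverse of the estimated inclusion probability. Thus, the weighted sample average degree is given by
$$\widehat{d}(G^I_n) = \frac{1}{n} \sum_{i \in V^*} \sum_{j \in V^*} W^I_{ij,n} (\hat{\varphi}_i \hat{\varphi}_j)^{-1} = \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n} \frac{D_i}{\hat{\varphi}_i} \frac{D_j}{\hat{\varphi}_j}. \tag{5}$$
As the true value of the inclusion probability $(\varphi_i \varphi_j)$ is typically unknown and needs to be estimated from the sample, we refer to the weighted estimator in (5) as a post-stratification estimator. However, when the true value of $(\varphi_i \varphi_j)$ is known and applied in (5), the estimator follows the general principle of the H–T estimator (Horvitz and Thompson, 1952). To show why the proposed weighted estimator (5) removes the bias in (4), assume known $(\varphi_i \varphi_j)$. Then,
$$\mathrm{E}\left( \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n} \frac{D_i}{\varphi_i} \frac{D_j}{\varphi_j} \,\middle|\, G_n \right) = \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n} \left( \frac{\mathrm{E}(D_i D_j \mid G_n)}{\varphi_i \varphi_j} \right) = \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n} \left( \frac{\varphi_i \varphi_j}{\varphi_i \varphi_j} \right) = \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n}.$$
The second equality uses $\mathrm{E}(D_i D_j \mid G_n) = \varphi_i \varphi_j$ for $i \neq j$, which holds when nodes are sampled independently of one another; the terms with $i = j$ vanish because $W_{ii,n} = 0$. That is, the expected value of our weighted estimator is the true average degree of the population network. The intuition behind (5) is as follows. There are $\sum_{i \in V} \sum_{j \in V} W_{ij,n}$ edges to account for in $G_n$. However, due to variations in the inclusion probabilities of sample edges, we only observe $\sum_{i \in V} \sum_{j \in V} W_{ij,n} (\varphi_i \varphi_j)$ edges in an induced subgraph in expectation. Even if samples are representative (i.e., $\varphi_i = \varphi_j = \psi$), as long as $\psi < 1$, a bias emerges due to scaling. Moreover, as $\varphi_i$ and $\varphi_j$ are not necessarily the same, we have the second source of bias, non-representativeness, and the issues become more complicated.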
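The unbiasedness argument for (5) can be checked numerically on a toy network. The sketch below is an illustration under stated assumptions, not the authors' code: inclusion probabilities are treated as known, nodes are assumed to enter the sample independently, and the expectation is computed exactly by enumerating all samples of a hypothetical six-node network. Function and variable names are illustrative.

```python
from itertools import product

def weighted_avg_degree_induced(n, edges, sampled, phi):
    """Weighted average degree of an induced subgraph, as in (5), with known phi.

    edges lists each undirected link once, so every observed link contributes
    twice: once as the ordered pair (i, j) and once as (j, i).
    """
    total = 0.0
    for i, j in edges:
        if i in sampled and j in sampled:
            total += 2.0 / (phi[i] * phi[j])
    return total / n

def expectation_over_samples(n, phi, statistic):
    """Exact expectation of statistic(sample) under independent node sampling."""
    expectation = 0.0
    for draw in product([0, 1], repeat=n):
        p = 1.0
        for i in range(n):
            p *= phi[i] if draw[i] else 1.0 - phi[i]
        sampled = {i for i in range(n) if draw[i]}
        expectation += p * statistic(sampled)
    return expectation

# Hypothetical network: two well-sampled hubs, four poorly sampled leaves.
N = 6
EDGES = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5)]
PHI = [0.9, 0.9, 0.3, 0.3, 0.3, 0.3]

true_avg_degree = 2 * len(EDGES) / N
expected_weighted = expectation_over_samples(
    N, PHI, lambda s: weighted_avg_degree_induced(N, EDGES, s, PHI)
)
print(true_avg_degree, expected_weighted)
```

Up to floating-point error the two printed numbers coincide, as the derivation implies. With estimated rather than known inclusion probabilities the estimator is a ratio-type statistic, so this exact equality should not be expected to carry over.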
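For a star subgraph, the naive (sample) average degree is defined as
$$d(G^S_n) = \frac{1}{m'} \sum_{i \in V^*} \sum_{j \in V^*} W^S_{ij,n} = \frac{1}{m'} \sum_{i \in V} \sum_{j \in V} W_{ij,n} \bigl(1 - (1 - D_i)(1 - D_j)\bigr), \tag{6}$$
where $D_i$ is a binary variable that takes the value 1 if $i \in V^*_0$, and 0 otherwise. To correct the bias, we propose the following weighted sample average degree,
$$\widehat{d}(G^S_n) = \frac{1}{n} \sum_{i \in V^*} \sum_{j \in V^*} W^S_{ij,n} \bigl(1 - (1 - \hat{\varphi}_i)(1 - \hat{\varphi}_j)\bigr)^{-1} = \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n} \frac{1 - (1 - D_i)(1 - D_j)}{1 - (1 - \hat{\varphi}_i)(1 - \hat{\varphi}_j)}. \tag{7}$$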
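Once again, assuming that $\varphi_i$’s are known, we can demonstrate the following:
$$\mathrm{E}\left( \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n} \frac{1 - (1 - D_i)(1 - D_j)}{1 - (1 - \varphi_i)(1 - \varphi_j)} \,\middle|\, G_n \right) = \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n} \left( \frac{\mathrm{E}\bigl(1 - (1 - D_i)(1 - D_j) \mid G_n\bigr)}{1 - (1 - \varphi_i)(1 - \varphi_j)} \right)$$
$$= \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n} \left( \frac{1 - (1 - \varphi_i)(1 - \varphi_j)}{1 - (1 - \varphi_i)(1 - \varphi_j)} \right) = \frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n}.$$
This result justifies why the weighted sample average in (7) mitigates the bias problem.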
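The same kind of check applies to the star-subgraph estimator in (7). The sketch below again assumes known inclusion probabilities and independent sampling of the initial nodes, and it reuses a hypothetical six-node network; the names are illustrative. A link is observed whenever at least one endpoint is initially sampled, so its weight is the inverse of $1 - (1 - \varphi_i)(1 - \varphi_j)$.

```python
from itertools import product

def weighted_avg_degree_star(n, edges, sampled0, phi):
    """Weighted average degree of a star subgraph, as in (7), with known phi.

    A link is observed when at least one endpoint is in the initial sample;
    each undirected link counts for the two ordered pairs (i, j) and (j, i).
    """
    total = 0.0
    for i, j in edges:
        if i in sampled0 or j in sampled0:
            total += 2.0 / (1.0 - (1.0 - phi[i]) * (1.0 - phi[j]))
    return total / n

def expectation_over_initial_samples(n, phi, statistic):
    """Exact expectation under independent sampling of the initial nodes."""
    expectation = 0.0
    for draw in product([0, 1], repeat=n):
        p = 1.0
        for i in range(n):
            p *= phi[i] if draw[i] else 1.0 - phi[i]
        sampled0 = {i for i in range(n) if draw[i]}
        expectation += p * statistic(sampled0)
    return expectation

# Hypothetical network: two hubs (nodes 0, 1) and four leaves.
N = 6
EDGES = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5)]
PHI = [0.9, 0.9, 0.3, 0.3, 0.3, 0.3]

true_avg_degree = 2 * len(EDGES) / N
expected_star = expectation_over_initial_samples(
    N, PHI, lambda s: weighted_avg_degree_star(N, EDGES, s, PHI)
)
print(true_avg_degree, expected_star)
```

Because more links are observed under the star design, the weights are closer to one than in the induced-subgraph case for the same inclusion probabilities.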
The weighted estimators proposed in (5) and (7) account for two phenomena. Firstly, they account for the differing inclusion probabilities of the links as a function of the types of the involved nodes. Secondly, they respect the correlations in who is connected to whom in the observed part of the network (i.e., they respect the network homophily). If one applies the corrections assuming representativeness of the sample, (5) and (7) will change to
$$\bar{d}(G^I_n) = \frac{1}{n} \sum_{i \in V^*} \sum_{j \in V^*} W^I_{ij,n} (\psi^2)^{-1} \tag{8}$$
and
$$\bar{d}(G^S_n) = \frac{1}{n} \sum_{i \in V^*} \sum_{j \in V^*} W^S_{ij,n} \bigl(1 - (1 - \psi)^2\bigr)^{-1}, \tag{9}$$
respectively. These corrections are exactly the same as shown by Chandrasekhar and Lewis (2016). However, biases would still emerge in (8) and (9) if the sample is not truly representative. Importantly, there is no reason for these biases to be smaller than in the raw (uncorrected) data as their size depends on who is missing.
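To illustrate how the representativeness-based correction in (8) can miss in either direction, the sketch below evaluates its expected value in closed form for a hypothetical network. The assumptions are ours: nodes are sampled independently with type-specific probabilities, and the single rate $\psi$ plugged into (8) is taken to be the average of those probabilities. Under these assumptions the expectation of (8) is $\frac{1}{n} \sum_{i} \sum_{j} W_{ij,n} \varphi_i \varphi_j / \psi^2$.

```python
def expected_srs_corrected_degree(n, edges, phi):
    """Expected value of the correction in (8) when sampling is not uniform.

    Assumptions: independent sampling with probabilities phi[i], and psi set
    to the overall expected sampling rate. Each undirected link counts twice.
    """
    psi = sum(phi) / n
    return sum(2.0 * phi[i] * phi[j] for i, j in edges) / (psi ** 2) / n

# Hypothetical network: two hubs (nodes 0, 1) and four leaves.
N = 6
EDGES = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5)]
true_avg_degree = 2 * len(EDGES) / N

representative = expected_srs_corrected_degree(N, EDGES, [0.5] * N)
leaves_missing = expected_srs_corrected_degree(N, EDGES, [0.9, 0.9, 0.3, 0.3, 0.3, 0.3])
hubs_missing = expected_srs_corrected_degree(N, EDGES, [0.3, 0.3, 0.6, 0.6, 0.6, 0.6])

print(true_avg_degree, representative, leaves_missing, hubs_missing)
```

The overall sampling rate is one half in all three scenarios. The correction recovers the population value only in the representative case; it overshoots when leaves are under-sampled and undershoots when hubs are under-sampled.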
One can perceive the proposed weighted estimators (5) and (7) as a design-based approach. However, the remainder of this subsection characterizes the asymptotic properties of the estimators. To this aim, we envision that the underlying finite-population network $(G_n)$ expands progressively toward a hypothetical superpopulation network. This prompts a natural transition to a model-based approach, targeting the unknown (model) parameters that characterize this hypothetical superpopulation for asymptotic statistical inference.¹¹ Consequently, we advocate a synthesis of design-based and model-based approaches (Binder and Roberts, 2003; Sterba, 2009). We expand upon the framework introduced by Bickel et al. (2011) to account for non-representativeness of nodes under the assumption that the network is sparse. Sparse networks refer to networks where the number of observed links is considerably lower than the maximum number of possible links, a common feature of real-life social networks. Formally, sparseness
11 An alternative possibility is asymptotic analysis with finite population sampling (Prášková and Sen, 2009; Li and Ding, 2017). However, the methodology cannot currently handle sparse networks in finite populations. Consequently, we adopt the approach in Bickel et al. (2011) to investigate the asymptotics of a superpopulation and defer the analysis of a finite population to future research.
is defined as the property of an infinite sequence of graphs where the (average) degree is bounded as $n \to \infty$ (Bickel and Chen, 2009; Lovász, 2012).12 In addition, similar to Bickel et al. (2011), we assume that the adjacency matrix of the whole network $W_n$ is exchangeable.13 As a result, according to the Aldous–Hoover theorem (Aldous, 1981; Hoover, 1979), the adjacency matrix can be represented by
\[
W_{ij,n} \overset{D}{=} g_n(\xi_i, \xi_j, \epsilon_{ij}, t_i, t_j), \tag{10}
\]
where $\overset{D}{=}$ denotes equality in distribution, and $g_n$ is a measurable function symmetric in its first two and last two arguments. In (10), $\xi_i$ and $\epsilon_{ij}$ are i.i.d. uniform random variables on $[0,1]$, $\epsilon_{ij} = \epsilon_{ji}$, and $\{t_i\}_{i=1}^n$ are independent of $\{\xi_i\}_{i=1}^n$ and $\{\epsilon_{ij}\}_{i,j=1}^n$. Note that this implies $W_{ij,n} = W_{ji,n}$.
Since the function $g_n(\cdot)$ in (10) cannot be uniquely identified (Bickel and Chen, 2009), it would be advisable to explore an alternative parameterization, $h_{ts,n}(u,v) \equiv P[W_{ij,n} = 1 \mid \xi_i = u, \xi_j = v, t_i = t, t_j = s]$ for $t, s \in \{1, \dots, T\}$, which refers to the unique canonical $h_{ts,can}$ such that $\int_0^1 h_{ts,can}(u,v)\,dv$ is monotone non-decreasing in $u$. Also, let $p_t = P(t_i = t)$ and assume that, for all $t \in \{1, \dots, T\}$, $p_t \geq \tau$ for some $\tau > 0$ and that $p_t$ is independent of $n$. Under these assumptions, we have $h_{ts,n}(u,v) = h_{st,n}(u,v)$ and $h_n(u,v) = P[W_{ij,n} = 1 \mid \xi_i = u, \xi_j = v] = \sum_{t=1}^{T} \sum_{s=1}^{T} h_{ts,n}(u,v)\, p_t p_s$. Let
\[
\rho_n = \int_0^1 \int_0^1 h_n(u,v)\,du\,dv \tag{11}
\]
be the probability of an edge in the network (i.e., network density). We can then write $w_{ts,n}(u,v) = \rho_n^{-1} h_{ts,n}(u,v)$, which represents the conditional density of $(\xi_i, \xi_j)$ given that there is an edge between $i$ and $j$. The expression $w_{ts,n}$ decouples the network density from the inhomogeneity structure. For the asymptotics, we will assume that $w_{ts,n}(u,v) = w_{ts}(u,v)$, where $w_{ts}(u,v)$ is independent of $n$. Let $w_t(u,v) = \sum_{s=1}^{T} w_{ts}(u,v)\, p_s$ and $w(u,v) = \sum_{t,s=1}^{T} w_{ts}(u,v)\, p_t p_s = \sum_{t=1}^{T} w_t(u,v)\, p_t$. We will control the rate of the expected degree $\lambda_n = (n-1)\rho_n > 0$ as $n \to \infty$.14
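The asymptotics of the average degree of the whole network $G_n$, $d(G_n)$, and our proposed weighted estimators $\hat{d}(G^I_n)$ and $\hat{d}(G^S_n)$ can be summarized in the following theorem.15

Theorem 1. Suppose that $\int_0^1 \int_0^1 w^2(u,v)\,dv\,du < \infty$ and $\lim_{n \to \infty} \lambda_n = \lambda < \infty$. Then,
(a) under the population (non-sampled) network,
\[
d(G_n) \xrightarrow{p} \lambda, \qquad \sqrt{n}\,\bigl(d(G_n) - \lambda\bigr) \xrightarrow{d} \mathcal{N}\bigl(0, \sigma^2_{d(G)}\bigr)
\]
for some $\sigma^2_{d(G)} > 0$;
(b) under the induced subgraph,
\[
\hat{d}(G^I_n) \xrightarrow{p} \lambda, \qquad \sqrt{n}\,\bigl(\hat{d}(G^I_n) - \lambda\bigr) \xrightarrow{d} \mathcal{N}\bigl(0, \sigma^2_{d(G^I)}\bigr)
\]
for some $\sigma^2_{d(G^I)} > 0$;
(c) under the star subgraph,
\[
\hat{d}(G^S_n) \xrightarrow{p} \lambda, \qquad \sqrt{n}\,\bigl(\hat{d}(G^S_n) - \lambda\bigr) \xrightarrow{d} \mathcal{N}\bigl(0, \sigma^2_{d(G^S)}\bigr)
\]
for some $\sigma^2_{d(G^S)} > 0$.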
Theorem 1 establishes that, if the network is sparse, our weighted estimators $\hat{d}(G^I_n)$ and $\hat{d}(G^S_n)$ are consistent and asymptotically normally distributed with finite variance.16 This is the case independently of whether the sampling rates are treated as estimators or not. Section 4 complements the asymptotic analysis (in Theorem 1 and Supplementary Appendix B) with numerical analysis assessing to what extent the corrections proposed in this section differ from their true values in finite samples.
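The consistency statement in Theorem 1(b) can be illustrated with a small simulation. The sketch below is a simplified special case rather than the paper's simulation design: it takes $h_{ts,n}(u,v) = \rho_n w_{ts}$ constant in $(u,v)$ (a sparse two-type block model), sets $\rho_n = \lambda/(n-1)$ so that the expected degree equals $\lambda$, and applies induced-subgraph weights $(\varphi_i \varphi_j)^{-1}$, which is the assumed form of (5). All numerical values are hypothetical.

```python
import numpy as np


def simulate_weighted_degree(n, lam, rng):
    """One draw of the weighted induced-subgraph average degree in a sparse
    two-type block model (illustrative special case of the framework)."""
    types = rng.integers(0, 2, size=n)              # p_t = 1/2 for both types
    # w_ts normalized so that sum_ts w_ts p_t p_s = 1 (1.6 within, 0.4 across).
    w = np.where(types[:, None] == types[None, :], 1.6, 0.4)
    rho = lam / (n - 1)                             # density of order 1/n
    upper = np.triu(rng.random((n, n)) < rho * w, k=1)
    i, j = np.nonzero(upper)                        # each undirected edge once
    phi = np.where(types == 0, 0.3, 0.7)            # assumed known sampling rates
    D = rng.random(n) < phi
    kept = D[i] & D[j]                              # both endpoints sampled
    return float(2.0 * np.sum(1.0 / (phi[i[kept]] * phi[j[kept]])) / n)


rng = np.random.default_rng(7)
lam = 5.0
estimates = [simulate_weighted_degree(1500, lam, rng) for _ in range(10)]
mean_estimate = float(np.mean(estimates))
```

With these settings `mean_estimate` should be close to `lam`. Repeating the exercise for increasing `n` gives an informal view of the $\sqrt{n}$ rate, although the sketch does not estimate the asymptotic variance.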
12 The sparse networks that we consider here are restricted to a particular class of networks with $o(n^2)$ edges, or equivalently $o(n)$ network degrees, and do not contain dense spots. This may thus preclude some real-life social networks that exhibit the power-law degree distribution (Borgs et al., 2019).
13 To be precise, a network is relatively exchangeable with respect to the type variable $t$ if $[W_{\sigma_t(i)\sigma_t(j),n}] \overset{D}{=} [W_{ij,n}]$ for all $n$ and all permutations $\sigma_t$ satisfying $[t_{\sigma_t(i)}]_{i \in n} = [t_i]_{i \in n}$ (Crane and Towsner, 2018). Exchangeability implies a particular dependence structure across the elements of $W_{ij,n}$. In particular, $W_{ij,n}$ and $W_{i'j',n}$ are dependent if $i = i'$ or $j = j'$. This type of dependence is implied by many statistical and econometric network formation models, such as stochastic blockmodels (Holland et al., 1983), the latent position model (Hoff et al., 2002), and other conditional edge independence models (Chandrasekhar, 2016). More details of this framework are available in the handbook chapter by Graham (2020).
14 Specifically, we require $\rho_n = \Theta(1/n)$, i.e., $\rho_n$ is of the same order as $1/n$, so that $\lambda_n$ converges to a non-zero constant when $n$ goes to infinity.
15 The proofs of Theorem 1 and Lemma 2 are relegated to Supplementary Appendix B.
16 The analytical expressions of the asymptotic variances of $\hat{d}(G^I_n)$ and $\hat{d}(G^S_n)$ are complex due to intricate network patterns (Bickel et al., 2011; Bhattacharyya and Bickel, 2015; Graham, 2020) and we therefore leave their derivations to future research. Nevertheless, Bickel et al. (2011) propose subsampling bootstrap methods for approximation and conjecture (although they do not prove) that these methods might work properly in sparse networks; see pp. 2291–2292 of their paper.
3.2. Other network measures
In addition to the average degree, we also study three other fundamental network measures: the global clustering coefficient, epidemic threshold, and homophily index. We will now provide a brief introduction to these three network measures.
Global clustering coefficient. The global clustering coefficient is defined as the ratio between the number of closed triplets $T_c(G_n)$ and the number of connected triples $N_c(G_n)$ in the network (Watts and Strogatz, 1998),17 calculated as
\[
c(G_n) = \frac{T_c(G_n)}{N_c(G_n)}, \tag{12}
\]
where
\[
T_c(G_n) = \frac{1}{2} \sum_{i \in V} \sum_{j \in V} \sum_{\substack{k \in V \\ i \neq j \neq k}} W_{ij,n} W_{jk,n} W_{ki,n} \quad \text{and} \quad N_c(G_n) = \frac{1}{2} \sum_{i \in V} \sum_{j \in V} \sum_{\substack{k \in V \\ i \neq j \neq k}} W_{ij,n} W_{jk,n}.
\]
The global clustering coefficient has traditionally been considered a measure of social capital. For example, it plays an important role in risk-sharing (Bloch et al., 2008), trust building (Karlan et al., 2009), job search (Ruiz-Palazuelos et al., 2023), and enhancing cooperation (Granovetter, 1985). Several empirical studies have used the global clustering coefficient as a regressor or a dependent variable (e.g., Fleming et al., 2007; Alatas et al., 2016).
Supplementary Appendix A.1 shows that the naive estimators of the global clustering coefficient (12) under the induced and star subgraphs display biases. We propose their corrections, which differ from those of the average degree: rather than edges connecting dyads (pairs of individuals), we adjust "relationships" involving three individuals, taking into account their interconnections as closed triplets or connected triples, and accounting for the associated sampling probabilities. Nevertheless, research has demonstrated that the global clustering coefficient in (12) approaches zero in sparse networks as $n \to \infty$ (see Supplementary Appendix B.2 for a formal proof; see also Bhattacharyya and Bickel 2015 and Graham 2020 for further discussion). Therefore, the asymptotic analysis of $c(G_n)$ is uninformative. To overcome this issue, we follow the literature, employing a normalized global clustering coefficient which converges to a non-zero value asymptotically and is robust to network size, network density, and degree heterogeneity. In particular, we focus on the normalized coefficient proposed in Li et al. (2019), calculated as
\[
c_{norm}(G_n) = \frac{\dfrac{T_c(G_n)}{3\binom{n}{3}} \left( \dfrac{n\, d(G_n)}{2\binom{n}{2}} \right)^{3}}{\left( \dfrac{N_c(G_n)}{3\binom{n}{3}} \right)^{3}} = \frac{(n-2)^2\, tr(W_n^3)\, (\mathbf{1}' W_n \mathbf{1})^3}{n(n-1)\, \bigl(\mathbf{1}' W_n^2 \mathbf{1} - tr(W_n^2)\bigr)^3}, \tag{13}
\]
where $\mathbf{1}$ is an $n$-dimensional vector of 1's. It is straightforward to see that the normalization in (13) balances the exponents regarding the network size $n$ and the adjacency matrix $W_n$ in the numerator and denominator. Therefore, as $n \to \infty$, the numerator and the denominator will converge at the same rate.18 After rearranging, (13) can be expressed as follows:
\[
c_{norm}(G_n) = \zeta_n \frac{T_c(G_n)\, d(G_n)^3}{\left( \dfrac{N_c(G_n)}{n} \right)^{3}}, \tag{14}
\]
with $\zeta_n = \frac{(n-2)^2}{4n(n-1)}$. Supplementary Appendices A.1 and B.2 analyze the biases in the naive estimators of the normalized global clustering coefficient in (14), provide the corresponding corrections, and show that the weighted estimators are consistent and asymptotically normally distributed as $n \to \infty$.
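As a check on the algebra linking (12), (13), and (14), the sketch below computes the three quantities on a fully observed network. It assumes a symmetric 0/1 adjacency matrix with zero diagonal, so that $T_c = tr(W^3)/2$ and $N_c = (\mathbf{1}'W^2\mathbf{1} - tr(W^2))/2$. The function name and the example graphs are illustrative, and no sampling correction is applied.

```python
import numpy as np


def clustering_measures(W):
    """Return c(G_n) from (12) and c_norm(G_n) computed via (13) and via (14)."""
    n = W.shape[0]
    W2 = W @ W
    tr_W3 = np.trace(W2 @ W)
    open_paths = W2.sum() - np.trace(W2)       # 1'W^2 1 - tr(W^2)
    T_c = tr_W3 / 2.0                          # closed triplets
    N_c = open_paths / 2.0                     # connected triples
    d = W.sum() / n                            # average degree
    c = T_c / N_c
    c_norm_13 = (n - 2) ** 2 * tr_W3 * W.sum() ** 3 / (n * (n - 1) * open_paths ** 3)
    zeta = (n - 2) ** 2 / (4.0 * n * (n - 1))
    c_norm_14 = zeta * T_c * d ** 3 / (N_c / n) ** 3
    return float(c), float(c_norm_13), float(c_norm_14)


# Complete graph on four nodes: every connected triple is closed.
K4 = np.ones((4, 4)) - np.eye(4)
c_k4, c13_k4, c14_k4 = clustering_measures(K4)

# Hypothetical random graph to compare the two expressions for c_norm.
rng = np.random.default_rng(3)
A = np.triu((rng.random((30, 30)) < 0.2).astype(float), k=1)
A = A + A.T
c_rand, c13_rand, c14_rand = clustering_measures(A)
```

On the complete graph all three values equal one, and on the random graph the matrix form (13) and the rearranged form (14) coincide up to floating-point error.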
Epidemic threshold. There is an increasing interest in understanding the diffusion properties of networks. The epidemic threshold is one way to quantify how easy it is for a disease, information, idea, or behavior to propagate through a network. The applications range from product adoption (Banerjee et al., 2013) and the spread of information (Alatas et al., 2016) to the spread of behaviors (Centola, 2010). There is a large variety of epidemic thresholds, depending on the diffusion conditions and network properties (see, e.g., Vega-Redondo, 2007, and Jackson, 2010). We focus on the following widely used version, based on the mean-field approximation (Pastor-Satorras and Vespignani, 2002):
\[
\delta(G_n) = \frac{\frac{1}{n} \sum_{i \in V} \sum_{j \in V} W_{ij,n}}{\frac{1}{n} \sum_{i \in V} \bigl( \sum_{j \in V} W_{ij,n} \bigr)^2}.
\]
Supplementary Appendices A.2 and B.3 show that the naive estimators are biased and our proposed weighted estimators are consistent and asymptotically normally distributed.
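The threshold above is the ratio of the mean degree to the mean squared degree. A minimal sketch for a fully observed network follows; the function name and the two toy graphs are illustrative, and the sampling corrections from the Supplementary Appendix are not reproduced here.

```python
import numpy as np


def epidemic_threshold(W):
    """Mean degree divided by mean squared degree, as in the display above."""
    degrees = W.sum(axis=1)
    return float(degrees.mean() / np.mean(degrees ** 2))


# Ring of six nodes: every node has degree 2, so the threshold is 1/2.
ring = np.zeros((6, 6))
for i in range(6):
    ring[i, (i + 1) % 6] = ring[(i + 1) % 6, i] = 1.0

# Star with one hub and four leaves: degree heterogeneity lowers the threshold.
star = np.zeros((5, 5))
star[0, 1:] = star[1:, 0] = 1.0

delta_ring = epidemic_threshold(ring)
delta_star = epidemic_threshold(star)
```

For a regular graph of degree $k$ the measure reduces to $1/k$. In the star the hub raises the second moment of the degree distribution, which is why its value is lower than that of the ring even though its mean degree is smaller.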
17 The number of closed triplets also equals three times the number of triangles. A triangle refers to a complete subnetwork of three individuals, which consists of three closed triplets, one centered on each node. A connected triple is a three-node subnetwork in which at least two edges are present. Hence, every triangle is a connected triple, but the reverse is not necessarily true.
18 In addition to (13), there is an alternative normalized global clustering coefficient proposed in Bhattacharyya and Bickel (2015), which we discuss in further detail in Supplementary Appendix B.2. We focus on (13) in the main text for the sake of brevity.
Homophily index. Social and economic networks exhibit a feature called homophily, a tendency to bond with similar individuals. In social and economic networks, who links with whom is typically correlated with characteristics such as gender, age, race, and social and economic status, among others (see McPherson et al., 2001 for a survey). This phenomenon of "birds of a feather flock together" gains particular relevance in our approach because we explicitly consider the types of nodes in the network. Homophily is an important measure of cross-type segregation and affects many economically relevant phenomena such as diffusion or learning and their speeds (Golub and Jackson, 2012), labor market outcomes (Calvo-Armengol and Jackson, 2004), or individual and firm-level success (McPherson and Smith-Lovin, 1987).
We adopt the homophily index from Currarini et al. (2009). The index of type $t$ is defined as $H_t(G_n) = \frac{d_{tt}(G_n)}{d_t(G_n)}$, where $d_{tt}(G_n)$ denotes the average number of friendships that agents of type $t$ have within the same type and $d_t(G_n)$ denotes the average number of friendships that type $t$ forms regardless of others' types. Supplementary Appendix A.3 contains detailed derivations of the weighted estimator of $H_t(G_n)$ under induced and star subgraphs. Supplementary Appendix B.4 again proves that our weighted estimators are consistent and asymptotically normally distributed.
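The definition of $H_t(G_n)$ translates directly into code for a fully observed network. The sketch below is illustrative only: the function name, the four-node example, and the type labels are assumptions, and the weighted version for sampled networks in Supplementary Appendix A.3 is not reproduced.

```python
import numpy as np


def homophily_index(W, types, t):
    """H_t = d_tt / d_t: the share of type-t agents' links that stay within type t."""
    is_t = types == t
    n_t = is_t.sum()
    d_tt = W[np.ix_(is_t, is_t)].sum() / n_t   # average same-type friendships
    d_t = W[is_t, :].sum() / n_t               # average friendships of any type
    return float(d_tt / d_t)


# Four nodes, two types; links 0-1 (both type 0) and 0-2 (types 0 and 1).
types = np.array([0, 0, 1, 1])
W = np.zeros((4, 4))
for i, j in [(0, 1), (0, 2)]:
    W[i, j] = W[j, i] = 1.0

H0 = homophily_index(W, types, 0)
H1 = homophily_index(W, types, 1)
```

Type 0 has three link ends in total, two of which point to other type-0 nodes, so `H0` is 2/3. The only link of type 1 goes to type 0, so `H1` is 0. The index is undefined for a type with no links at all, a case this sketch does not guard against.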
3.3. Asymptotics of regressions with estimated network measures
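This section discusses the asymptotic properties of OLS regressions in (3), in which our weighted estimators are employed as regressors. Suppose we have $R$ networks. Let $n_r$ denote the number of nodes in the $r$th network for $r = 1, \dots, R$. Let $n_r = a_r \cdot n$ and assume that $0 < \varsigma_\ell \leq a_r \leq \varsigma_u < \infty$ for all $r$ with $\varsigma_\ell$ and $\varsigma_u$ being constants that do not depend on $r$. That is, we assume that the number of nodes in each network is of the same order. It is well-known that in OLS regressions with covariates being estimated, if the estimating error of the regressor is independent of the regression error $\epsilon_r$ and $\max_{r=1,\dots,R}\{|\hat{\Lambda}(G_{r,n_r}) - \Lambda_r|\} = o_p(1)$, the estimation effect can be ignored asymptotically. To be specific, let
\[
\begin{aligned}
(\hat{\alpha}_{in}, \hat{\beta}_{in}, \hat{\gamma}'_{in})' &= \arg\min_{(\alpha, \beta, \gamma')} \frac{1}{R} \sum_{r=1}^{R} (y_r - \alpha - \beta \Lambda_r - x_r \gamma)^2, \\
(\hat{\alpha}, \hat{\beta}, \hat{\gamma}')' &= \arg\min_{(\alpha, \beta, \gamma')} \frac{1}{R} \sum_{r=1}^{R} \bigl(y_r - \alpha - \beta \hat{\Lambda}(G_{r,n_r}) - x_r \gamma\bigr)^2,
\end{aligned} \tag{15}
\]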
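where $(\hat{\alpha}_{in}, \hat{\beta}_{in}, \hat{\gamma}'_{in})'$ denotes the infeasible OLS estimator because the true $\Lambda_r$'s are not observable and $(\hat{\alpha}, \hat{\beta}, \hat{\gamma}')'$ denotes the OLS estimator when the true $\Lambda_r$'s are replaced with their estimates. If $\max_{r=1,\dots,R}\{|\hat{\Lambda}(G_{r,n_r}) - \Lambda_r|\} = o_p(1)$, then
\[
\sqrt{R}\,\bigl((\hat{\alpha}, \hat{\beta}, \hat{\gamma}')' - (\hat{\alpha}_{in}, \hat{\beta}_{in}, \hat{\gamma}'_{in})'\bigr) = o_p(1),
\]
i.e., the infeasible OLS estimator and the OLS estimator based on estimated $\hat{\Lambda}(G_{r,n_r})$'s are asymptotically equivalent. In other words, we can treat the $\hat{\Lambda}(G_{r,n_r})$'s as the true $\Lambda_r$'s in the regression without the need to correct for the estimation effect of the $\hat{\Lambda}(G_{r,n_r})$'s. The following lemma provides sufficient conditions for $\max_{r=1,\dots,R}\{|\hat{\Lambda}(G_{r,n_r}) - \Lambda_r|\} = o_p(1)$.

Lemma 2. Assume that the variance of $\sqrt{n}\,(\hat{\Lambda}(G_{r,n_r}) - \Lambda_r)$ is uniformly bounded above by a finite constant $M$, for all $r$ and $n \geq N$ for some finite large number $N$, and $R/n \to 0$. Then, $\max_{r=1,\dots,R}\{|\hat{\Lambda}(G_{r,n_r}) - \Lambda_r|\} = o_p(1)$.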
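The asymptotic equivalence in (15) can be illustrated with a toy data-generating process. The sketch below is a hypothetical design, not the paper's simulation: the true measures $\Lambda_r$, the control $x_r$, and the errors are standard normal, the coefficients are arbitrary, and the estimation error of $\hat{\Lambda}$ has variance $1/n$ and is independent of the regression error, in line with the conditions stated above.

```python
import numpy as np


def ols(y, regressor, x):
    """OLS of y on a constant, the network measure, and one control."""
    X = np.column_stack([np.ones_like(y), regressor, x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def scaled_gap(R, n, rng):
    """sqrt(R) times the largest coefficient gap between feasible and infeasible OLS."""
    Lam = rng.normal(size=R)                           # true measures (hypothetical)
    x = rng.normal(size=R)
    y = 1.0 + 2.0 * Lam + 0.5 * x + rng.normal(size=R)
    Lam_hat = Lam + rng.normal(size=R) / np.sqrt(n)    # estimation error, variance 1/n
    return float(np.sqrt(R) * np.max(np.abs(ols(y, Lam_hat, x) - ols(y, Lam, x))))


gap_small_n = scaled_gap(R=200, n=4, rng=np.random.default_rng(2))
gap_large_n = scaled_gap(R=200, n=10**8, rng=np.random.default_rng(2))
```

When the networks are small relative to their number, the noise in $\hat{\Lambda}$ attenuates the slope and the scaled gap is large. When $R/n$ is close to zero the gap is negligible, which is the situation covered by Lemma 2.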
4. Monte Carlo simulations
This section complements the previous one in assessing the performance of our approach in finite samples. In particular, we evaluate numerically the estimation biases in the network measures under study (in Section 4.1), as well as the network effects when using these measures as regressors in regression analysis (in Section 4.2). The evaluation considers various factors such as the sampling design (induced vs. star subgraph), the sampling rate, and whether SRS (representativeness) is assumed when applying the weighted estimators. We quantify the biases present in the naive estimators and the corrections made under the SRS assumption and compare their performances vis-à-vis our post-stratification estimators. For ease of interpretation, we concentrate on the scenarios that mimic our modeling assumptions.
In this simulation exercise, we demonstrate the effectiveness of our post-stratification approach by analyzing the network measures discussed in Section 3. These measures include the average degree, global clustering coefficient, normalized global clustering coefficient, epidemic threshold, and homophily index.19 The network data in our simulation study are adopted from the Add Health Wave-I In-school data.20 In particular, we adopt one school as a prototype.21 By adopting the real-life friendship network
19 We include both the standard global clustering coefficient and its normalized variant. Our corrections of the latter are asymptotically well-behaved and we would like to assess its performance in finite samples. However, the (non-normalized) coefficient is widely employed in the literature. Hence, although we know it converges to zero in sparse networks asymptotically (see Section 3.2), we analyze its performance in finite samples.
20 This is a program project designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris, and funded by a grant P01-HD31921 from the National Institute of Child Health and Human Development, with cooperative funding from 17 other agencies. Special acknowledgment is due Ronald R. Rindfuss and Barbara Entwisle for assistance in the original design. Persons interested in obtaining data files from Add Health should contact Add Health, Carolina Population Center, 123 W. Franklin Street, Chapel Hill, NC 27516-2524 ([email protected]). No direct support was received from grant P01-HD31921 for this analysis.
21 This adopted school is a public suburban school with 1606 students from grades 9 to 12. The school is located in the southern U.S.
Table 1
Population and sample shares of different characteristics and labor market outcomes in the Indian rural village data from Banerjee et al. (2013).

                            Population    Sample      Diff. (p-value)
Age
  <30                       38.71%        30.97%      7.74% (0.000)
  30–50                     39.60%        54.11%      −14.51% (0.000)
  >50                       21.69%        14.92%      6.77% (0.000)
Male                        50.34%        44.57%      5.77% (0.000)
Household size
  <3                        17.26%        15.49%      1.77% (0.038)
  3–8                       71.57%        73.48%      −1.91% (0.039)
  >8                        11.17%        11.03%      0.14% (0.879)
Labor market outcome
  employed                                62.49%
  work outside village                    21.21%
Number of villages          75            75
Observations                48,646        16,995
collected the census information of each household in all villages. Subsequently, they conducted a comprehensive follow-up survey with a subset of each village, wherein they also recorded the networks of relationships among surveyed individuals. As is common in most studies, the survey respondents only represent a sample of each village, and their reported network is an induced subgraph of the whole network. The average sampling rate across villages is 35%. The crucial aspect of the sampling design in Banerjee et al. (2013) is the stratification by religion and geographic sub-location, generating a representative sample with respect to these two variables. This is a common approach in many applications. Despite the stratification based on religion and geography, Table 1 reveals that the data are not representative in terms of age, gender, and, to a lesser extent, household size. Below, we show to what extent the differences between the village population and sample shares of these categories affect the estimation of network effects in regressions discussed in Section 2.2.
The data contain several variables regarding the labor market outcomes of the participants, such as their employment status, whether they work outside the village, and their occupation. Since the important role of social networks in labor markets is widely acknowledged (Granovetter, 1985; Calvo-Armengol and Jackson, 2004; Cingano and Rosolia, 2012), we ask how the village employment rate and the fraction of people working outside the village correlate with the global features of the underlying network of relationships within the village.25 Theoretical literature suggests that both connectivity and the global clustering coefficient can have a direct impact on employment prospects (Calvo-Armengol and Jackson, 2004; Ruiz-Palazuelos et al., 2023). Additionally, the epidemic threshold can indirectly influence labor outcomes by affecting the flow of labor-market information (Calvo-Armengol and Jackson, 2004). Similarly, the degree of segregation can determine which individuals have access to job information and those who do not. Most importantly, for the present study, we ask how the estimated network effects change if we account for non-representativeness of the network sample. We hypothesize that the over-representation of individuals aged 30–50 and the under-representation of men in the sample (as evident in Table 1), who are typically more active participants in labor markets in a country like India, could bias the estimated network effects if this misrepresentation is not taken into account.
Table 2 reports the estimated network effects in a series of regressions differing in (i) the dependent variable (employment rate or fraction working outside the village), (ii) whether raw sample statistics or corrections are used, and (iii) the network measure. As for (ii), to separate the effect of scaling from the effect of non-representativeness of the sample, we use the naive estimators (denoted Raw in Table 2), corrections assuming SRS (denoted SRS), and our approach in which we weight on cross-characteristics (incorporating the information on age, gender, and household size; denoted Cross). Table 1 illustrates the distributions of these three variables, from which we compute the $\psi_t$ for the 3 × 2 × 3 = 18 types according to the variable Cross. Each row reports the estimated network effect (and the standard error robust to heteroskedasticity in parentheses) from a separate regression of one dependent variable on the corresponding network statistic and village size, mimicking the structure of the regressions in Section 2.2. We also apply the post-stratification weighting on the dependent variables (i.e., employment and working outside villages) at the village level to correct measurement errors.26 Consequently, the columns Cross provide a typical example of standard post-stratification with a reasonable number of stratification groups, where the sampling rates are estimated from the differences between the sample and population shares of auxiliary variables. Since we show that our approach delivers consistent estimates, we believe that applied researchers should report estimates such as those in the columns Cross as their main result when estimating the effect of network measures on outcomes in non-representative samples or, at least, as a robustness check of their main analysis.
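The step of estimating sampling rates from sample and population shares can be sketched with the figures in Table 1. The example below is illustrative only: it pools all villages and uses the marginal age distribution, whereas the application described above works with 18 cross-types, whose cell counts are not reported here. The function name is hypothetical.

```python
def sampling_rates(pop_share, sample_share, n_pop, n_sample):
    """psi_t = (sampled units of type t) / (population units of type t)."""
    return {
        t: (sample_share[t] * n_sample) / (pop_share[t] * n_pop)
        for t in pop_share
    }


# Pooled age shares and observation counts taken from Table 1.
age_pop = {"<30": 0.3871, "30-50": 0.3960, ">50": 0.2169}
age_sample = {"<30": 0.3097, "30-50": 0.5411, ">50": 0.1492}
psi_age = sampling_rates(age_pop, age_sample, n_pop=48_646, n_sample=16_995)

# Population-weighted average of the type-specific rates (overall sampling rate).
overall_rate = sum(age_pop[t] * psi_age[t] for t in age_pop)
```

The resulting rates differ markedly across age groups, with the 30–50 group sampled at a much higher rate than the other two, while their population-weighted average matches the overall rate of about 35%. This heterogeneity is what the SRS correction ignores.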
As for the influence of village networks on labor market outcomes, our findings support existing literature, highlighting the significant role played by the structure of social networks in shaping labor markets. By accounting for the non-representativeness
25 To maintain simplicity and align better with the assumptions of our analysis, we focus on a simple application compared to Banerjee et al. (2013), who propose a more intricate estimation strategy.
26 We use the network constructed by the union of all relationships reported by survey respondents (e.g., borrowing, lending, seeking advice, going to temple together, visiting home, etc.). We find similar results if we only focus on friendships (see Table C.1 in the Supplementary Appendix).
Table 2
Estimated network effects on the labor market outcomes of villagers in rural India villages.

Dependent variable         (I) Employed (%)                          (II) Work outside village (%)
                           Raw          SRS          Cross           Raw          SRS          Cross
Average degree             0.0269***    0.0091**     0.0088**        −0.0235*     −0.0093**    −0.0101*
                           (0.0095)     (0.0035)     (0.0039)        (0.0120)     (0.0044)     (0.0051)
Global clustering          0.4989**     0.4989**     0.4240**        −0.6410**    −0.6410**    −0.4996***
                           (0.1930)     (0.1930)     (0.1830)        (0.2666)     (0.2666)     (0.1879)
Norm. global clustering    −0.0047      −0.0047      −0.0022         0.0038       0.0038       0.0037
                           (0.0061)     (0.0061)     (0.0051)        (0.0048)     (0.0048)     (0.0054)
Epidemic threshold         −1.1530***   −2.3017***   −2.0965**       0.9357**     2.3498**     2.3924**
                           (0.3589)     (0.8442)     (0.8341)        (0.4148)     (0.9967)     (1.0430)
HI-male                    0.1939*      0.1939*      0.1445          −0.1374      −0.1374      −0.0463
                           (0.1028)     (0.1028)     (0.0939)        (0.1490)     (0.1490)     (0.1621)
HI-middle age              0.2848       0.2848       −0.2386         −0.5150**    −0.5150**    0.0010
                           (0.1956)     (0.1956)     (0.2119)        (0.2033)     (0.2033)     (0.2428)
HI-small household size    0.0930       0.0930       0.1866**        −0.2677**    −0.2677**    −0.0818
                           (0.0991)     (0.0991)     (0.0856)        (0.0992)     (0.0992)     (0.0920)

Note: Regressions are based on 75 villages. Standard errors robust to heteroskedasticity are reported in parentheses. Each row represents a separate regression with a different network measure, and the village size is included in every regression as a default control. Raw indicates an unweighted sample statistic, SRS signifies the correction based on the representativeness assumption, and Cross denotes the weighting on the Cross characteristic variable.
* Stands for significance at 10%.
** Stands for significance at 5%.
*** Stands for significance at 1%.
of the sample (as indicated by the Cross columns in Table 2), certain features of the social networks have a meaningful impact on average labor outcomes within the village. Moreover, the effects of these various network characteristics largely exhibit consistency with one another.
Regarding the main purpose of this exercise, Table 2 shows the sensitivity of the results with respect to (non-)representativeness of network samples. In contrast to Section 4, we do not know the true impact of the different network measures. However, since the data and the performed regressions match the assumptions behind our approach, all the previous analysis suggests that the results using our methodology are consistent, less biased, and more stable than either the naive estimators or corrections assuming representativeness. As a result, the following discussion provides an informal assessment of the disparities among the results obtained from naive estimators, the corrections assuming SRS, and our approach.
Table 2 documents that the estimates using raw data or corrections based on the representativeness assumption are mostly expanded compared to the corrections that account for both scaling and the non-representativeness of the network data. However, we also observe instances of attenuation and even sign-switching. There are three cases in which we observe a network effect when employing the naive estimators or corrections under SRS, but this effect does not show up using our weighting approach. In one other case, the network effect is absent with the naive estimators and corrections for scaling, but this effect becomes significant in the Cross column. All these four cases are associated with the impact of homophily. Quantitatively speaking, the effect of the average degree, when based on raw data, is overestimated in Table 2 by over 130% compared to the effect observed through our weighting approach. Likewise, the effect of the global clustering coefficient is overestimated by more than 17%, while the effect of the epidemic threshold is underestimated by at least 45%. Hence, some of these differences are economically significant. The corrections assuming representativeness either alleviate or maintain the biases when compared to the results obtained through our approach. These corrections effectively reduce the biases with respect to the Cross column to below 10% for the average degree and epidemic threshold. However, the biases remain economically significant for network measures that are unbiased in representative samples but generally biased in non-representative samples, such as the clustering coefficients and the homophily indices.
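The percentages quoted above follow from the point estimates in Table 2. The short sketch below recomputes them as relative differences with respect to the Cross column. The dictionary layout and function name are illustrative, and only the three measures discussed quantitatively are included.

```python
# Point estimates from Table 2, ordered as (Raw, SRS, Cross).
table2 = {
    "Average degree": {
        "Employed": (0.0269, 0.0091, 0.0088),
        "Work outside": (-0.0235, -0.0093, -0.0101),
    },
    "Global clustering": {
        "Employed": (0.4989, 0.4989, 0.4240),
        "Work outside": (-0.6410, -0.6410, -0.4996),
    },
    "Epidemic threshold": {
        "Employed": (-1.1530, -2.3017, -2.0965),
        "Work outside": (0.9357, 2.3498, 2.3924),
    },
}


def pct_diff(estimate, benchmark):
    """Relative difference in percent; positive values mean overestimation."""
    return 100.0 * (estimate / benchmark - 1.0)


raw_vs_cross = {
    m: [pct_diff(raw, cross) for raw, _, cross in cols.values()]
    for m, cols in table2.items()
}
srs_vs_cross = {
    m: [pct_diff(srs, cross) for _, srs, cross in cols.values()]
    for m, cols in table2.items()
}
```

Each list holds the values for the employment and work-outside-village regressions, in that order. Because estimate and benchmark share the same sign in all six rows used here, the ratio-based measure is well defined; it would not be meaningful for the rows where the sign switches.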
In sum, significant differences are present between the estimates obtained using our approach and those from the naive estimators as well as the corrections assuming representativeness. These findings suggest that false positives (or negatives), expansion of network effects, and sign switching might be common phenomena resulting from non-representativeness of network samples. Given that most network data share the underlying properties of this data, the results here imply that applied researchers should consider the effect of weighting on the sign, size, and magnitude of network effects. More importantly, the direction and the magnitude of the biases depend non-trivially on the particular network statistics, the dependent variable under study, and who is missing. Hence, this exercise corroborates that researchers cannot easily predict the direction of the biases and, consequently, they should not rely on classical measurement-error solutions, even in the simplest cases analyzed here.
6. Discussion
This section discusses potential extensions and limitations of our methodology and provides several recommendations concerning the selection of auxiliary variables for weighting.
Alternative Network Sampling Designs. Although this paper focuses on the induced and star subgraphs, the proposed methodology can be adapted to other sampling schemes as long as the researcher knows the strategy employed for the elicitation of the sample and possesses some information about the whole population. We present several examples illustrating how the proposed approach can be applied to different sampling strategies and discuss cases where our methodology cannot be directly applied, or requires modification.
As a first example, consider the issue known as the boundary specification problem. Researchers sometimes set a boundary to determine the whole network of interest. Imagine a researcher who collects a network sample from a few classes within a school, excluding individuals from other classes and any connections between the classes under investigation and individuals outside the class. Although the sampled network may provide a comprehensive representation of the analyzed classes, it remains incomplete in capturing the entirety of the true social network within the school. If one would like to study the school network, and individual characteristics are available for the whole school, one can mitigate the boundary specification problem by applying our method directly because setting a boundary is mathematically equivalent to the induced subgraph sampling.
As a second example, consider snowball sampling, a sampling procedure commonly applied in Sociology, Marketing, and Epidemiology (see, e.g., Berg, 2004; Browne, 2005). In snowball sampling, a researcher begins by randomly selecting seed nodes. These seeds serve as the starting point for the first wave, during which the researcher collects information on all the contacts of the initially selected nodes. In subsequent waves, the researcher expands the sample by eliciting the contacts of the nodes identified in the previous wave, and this process continues iteratively. Note that conducting a one-wave snowball sampling is essentially equivalent to the star subgraph sampling approach discussed earlier, thus making our methodology directly applicable. The literature has suggested corrections for one-wave snowball sampling (Frank, 1977; Kolaczyk, 2009), but these corrections only align with our approach when the initial seeds are representative samples of the population. We argue this is rarely the case even in very carefully and systematically collected data sets. Although the computation becomes increasingly complex as more waves are performed, one can adapt our approach to multiple waves of snowball sampling, taking into account the missing frequencies of each type and the information about the within-type and across-type connectivity from the observed part of the network using combinatorial arguments. In fact, our methodology has certain parallelism with Respondent Driven Sampling (Heckathorn, 1997), a weighting approach on snowball samples to compensate analytically for the non-randomness of snowball-sampling procedures. In contrast to this approach that corrects for the non-representativeness ex-ante, our approach adjusts for these issues ex-post by mitigating the discrepancy between the sampled and population networks and treating the sampling rates as estimators.
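The two equivalences used in these examples (a boundary as induced-subgraph sampling, and one-wave snowball sampling as star-subgraph sampling) can be made concrete with indicator masks. In the sketch below, $D_i = 1$ marks the nodes inside the boundary or the seed nodes. The star mask $1 - (1 - D_i)(1 - D_j)$ is the one appearing in (6) and (7), while the induced mask $D_i D_j$ is the standard definition and is assumed here. Function names and the path-graph example are illustrative.

```python
import numpy as np


def induced_subgraph(W, D):
    """Keep a link only if both endpoints are sampled (boundary specification)."""
    return W * np.outer(D, D)


def star_subgraph(W, D):
    """Keep a link if at least one endpoint is sampled (one-wave snowball)."""
    return W * (1.0 - np.outer(1.0 - D, 1.0 - D))


# Path graph 0 - 1 - 2 - 3 with nodes 0 and 1 sampled.
W = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    W[i, j] = W[j, i] = 1.0
D = np.array([1.0, 1.0, 0.0, 0.0])

W_induced = induced_subgraph(W, D)   # only the link 0-1 survives
W_star = star_subgraph(W, D)         # links 0-1 and 1-2 survive; 2-3 is lost
```

The star subgraph retains the link from the sampled node 1 to its unsampled contact 2, which the induced subgraph drops. Neither design observes the link between the two unsampled nodes.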
Unsurprisingly, our corrections cannot be applied to some alternative sampling designs or should be tailored to the specific sampling strategy employed in the corresponding study. Consider, for example, random selection of links (also known as random edge sampling) where an individual $i$ is included in the sample if at least one of her edges is sampled. Such sampling is commonplace in communication data, where only random samples of phone calls or e-mails are selected. We do not target this procedure in this study as additional assumptions would be necessary, but see, e.g., Kolaczyk (2009) for a potential direction. Relatedly, our approach assumes that, conditionally on observing a particular sample of nodes and the sampling design, the links are observed perfectly. That is, this study specifically analyzes issues arising from imperfect observation of network members but cannot solve issues arising from mismeasured links (see, e.g., Hardy et al., 2019). A notable example of this issue is the truncated fixed-choice survey design, where respondents are constrained to nominate a certain number of friends (e.g., up to ten friends). Our approach mitigates the biases due to the non-representativeness but not those due to the truncation. However, both issues might be targeted simultaneously by combining our post-stratification weighting with the approach proposed by Griffith (2022), which is specifically designed to mitigate the issues due to the truncation. Similarly, additional applications of our approach might result from combining our approach with methods designed for other purposes. The extension of our approach to these other more specialized sampling procedures is left for future research.
Other Network Measures. Due to their theoretical and empirical relevance, this study focuses on four fundamental network measures commonly seen in the empirical literature. Nevertheless, one can adapt the methodology to other measures that solely require the knowledge of nodes' local information.27 The first set of examples allows for a direct application of our methodology, which includes the assortativity coefficient and the average size of the second-order neighborhood. Assortativity plays a crucial role in the process of diffusion, as it can either impede or facilitate the transmission of diseases, behaviors, and social norms (Newman, 2002; Jackson et al., 2017). The average size of the second-order neighborhood enables us to assess how fast diffusion spreads, and it is important in labor markets (Calvo-Armengol and Jackson, 2004). Since the computation of both the assortativity coefficient and the second-order neighborhood only requires the knowledge of an individual's degree and the degrees of their neighbors, their weighted corrections follow directly from Section 3.
Other measures do not follow directly from Section 3, but our approach can still be applied. For instance, Eagle et al. (2010) apply
the concept of entropy to capture the diversity of connections of an individual to different types in the network. Since their measure
only relies on the neighborhood of each node, the corrected variation of this measure for sampled networks is straightforward.
Similarly, cycles of length four have recently received certain attention in sociology (Opsahl, 2013) and economics (Ruiz-Palazuelos
et al., 2023). One can recover it following our approach using the combinatorial logic. Since these characteristics are extensions
of the ideas of homophily and the global clustering coefficient, respectively, we focus on the more common variations and do not
propose the corrections for these two in this study.
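To make the entropy idea concrete, the sketch below computes, for each node, the Shannon entropy of the type shares among its neighbors, and then averages it over sampled nodes with weights. This is an illustrative variant only: the exact definition and normalization used by Eagle et al. (2010) are not reproduced here, and the names `neighbor_type_entropy`, `node_type`, and the toy data are hypothetical.

```python
import math
from collections import Counter


def neighbor_type_entropy(adj, node_type, i):
    """Shannon entropy (in nats) of the type shares among i's neighbors."""
    counts = Counter(node_type[j] for j in adj[i])
    k = sum(counts.values())
    if k == 0:
        return 0.0  # isolated node: no diversity by convention (assumption)
    return -sum((c / k) * math.log(c / k) for c in counts.values())


# Hypothetical star network: "a" is linked to everyone else.
adj = {
    "a": {"b", "c", "d", "e"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a"},
    "e": {"a"},
}
node_type = {"a": "x", "b": "x", "c": "x", "d": "y", "e": "y"}

# Hypothetical sample of two nodes with assumed post-stratification weights.
sample_weights = {"a": 1.0, "b": 3.0}
num = sum(w * neighbor_type_entropy(adj, node_type, i)
          for i, w in sample_weights.items())
weighted_diversity = num / sum(sample_weights.values())
```

Because the statistic depends only on each node's own neighborhood, it fits the local-information requirement stated above.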
27 In this paper, local information always refers to the first- and second-order neighborhoods of each node. One can go further and incorporate more distant
neighbors probably at the cost of lower precision of the proposed corrections.
The proposed methodology cannot recover global network measures computed based on the entire network architecture. This
includes spectral properties, average betweenness or eigenvalue centrality, and network distances. However, there is a rich literature
proposing approximations, bounds, or ‘‘plug-in’’ estimators computed on the basis of nodes’ local information (e.g., Van Mieghem,
2010; Comellas and Gago, 2007). Hence, one can correct these bounds and approximations using our approach either directly or
by plugging some of our corrections into more general expressions. Future research shall establish the finite-sample as well as
asymptotic properties of such bounds, approximations, and plug-in estimators. The proposed approach cannot recover the network
characteristics at the individual node level.
Selection of (Auxiliary) Weighting Variables. A natural question arising from the proposed methodology is the choice of
the (auxiliary) weighting variables for post-stratification. The evidence points out that different characteristics matter in different
contexts and situations. For instance, Morelli et al. (2017) report that positive emotions explain positioning in network reflecting
time sharing, while empathy plays a role in intimate networks of the same people describing trust and support. Similarly, firms
may form ties differently if searching for providers (or buyers) compared to innovation collaborations. Hence, one has to know the
particular application under study to assess which node-level characteristic might provide valuable information about the network
and we prefer to refrain from making general recommendations regarding the application of particular variables. For this reason, we
would generally encourage applied researchers to first analyze the degree of non-representativeness and then use that information
to inform the variables chosen for the correction.
Practically speaking, most data sets are limited to a relatively small set of variables that encompass census information. Since
our results show that the performance improves with more information and applying variables that provide no information about
the network does not affect the performance negatively, we recommend employing all the available information in such cases. In
contrast, when many variables are available for weighting, a problem would be to have too few observations in each stratified cell.
This can lead to an increase in variance, resulting in reduced efficiency of the weighting estimates of the characteristic being studied.
One straightforward solution is to apply the principal component analysis to filter the relevant independent information from a large
number of potentially correlated variables and construct the weights using the discretized components. Another solution can be a
simple two-step algorithm, outlined in Supplementary Appendix E, that we propose for the selection of the ‘‘right’’ variables.
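The principal-component route can be sketched as follows, using NumPy. The sketch assumes that the auxiliary variables are observed for every node in the population (for example from a census) and that a sampling indicator is available; both are assumptions for illustration, as are the function names `pca_cells` and `poststrat_weights`, the quantile-based discretization, and the toy data. It is not the two-step algorithm of Supplementary Appendix E.

```python
import numpy as np


def pca_cells(X, n_components=1, n_bins=2):
    """Assign each row of X to a cell built from discretized PCA scores."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # assumes no constant column
    # Rows of vt are the principal directions of the standardized data.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ vt[:n_components].T
    cells = np.zeros(len(X), dtype=int)
    for k in range(n_components):
        # Interior quantiles split component k into n_bins groups.
        cuts = np.quantile(scores[:, k], np.linspace(0, 1, n_bins + 1)[1:-1])
        bins = np.searchsorted(cuts, scores[:, k], side="right")
        cells = cells * n_bins + bins
    return cells


def poststrat_weights(cells, sampled):
    """Weight N_c / n_c for sampled nodes in cell c, zero for unsampled nodes."""
    cells = np.asarray(cells)
    sampled = np.asarray(sampled, dtype=bool)
    w = np.zeros(len(cells))
    for c in np.unique(cells):
        in_cell = cells == c
        n_c = np.sum(in_cell & sampled)
        if n_c == 0:
            raise ValueError(f"cell {c} has no sampled nodes; use fewer bins")
        w[in_cell & sampled] = in_cell.sum() / n_c
    return w


# Hypothetical population of 8 nodes with two highly correlated variables.
X = [[1, 1.1], [2, 1.9], [3, 3.2], [4, 3.9],
     [5, 5.1], [6, 6.0], [7, 7.2], [8, 7.9]]
sampled = [1, 1, 1, 0, 1, 0, 0, 0]

cells = pca_cells(X, n_components=1, n_bins=2)
w = poststrat_weights(cells, sampled)
```

Using a single component with two bins collapses the two correlated variables into two cells instead of four, which keeps more sampled observations per cell. The explicit error for empty cells reflects the small-cell problem described above.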
We remain agnostic about the specific approach a researcher would take for a particular project. However, researchers
should be aware of the inferential problem addressed here and the general limits of treating the network as if it were complete or
assuming representativeness of the network sample. Given this sensitivity, a variety of weights should be used to discover whether the
results are sensitive to accounting for non-representativeness. Such analysis should serve as a standard robustness check for empirical
network results, giving scholars confidence that the results reflect network effects and are not a figment of the sampling strategy.
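One simple way to organize such a robustness check is to recompute the same statistic under several stratifications and inspect the spread. The sketch below does this for the average degree with a normalized post-stratified mean. The sample records, the population cell counts, and the function name `poststrat_mean` are hypothetical, and the population counts are assumed to be known from an external source such as a census.

```python
from collections import Counter


def poststrat_mean(sample, pop_counts, keys, stat="degree"):
    """Normalized post-stratified mean of `stat`, stratifying on `keys`."""
    def cell(record):
        return tuple(record[k] for k in keys)

    n_sample = Counter(cell(r) for r in sample)
    num = den = 0.0
    for r in sample:
        c = cell(r)
        w = pop_counts[c] / n_sample[c]  # population count / sample count
        num += w * r[stat]
        den += w
    return num / den


# Hypothetical sample of four nodes drawn from a population of 100.
sample = [
    {"sex": "F", "age": "young", "degree": 4},
    {"sex": "F", "age": "old", "degree": 6},
    {"sex": "F", "age": "young", "degree": 5},
    {"sex": "M", "age": "old", "degree": 2},
]

# Each specification: (stratifying variables, assumed population cell counts).
specs = {
    "unweighted": ((), {(): 100}),
    "sex": (("sex",), {("F",): 50, ("M",): 50}),
    "age": (("age",), {("young",): 40, ("old",): 60}),
}

estimates = {name: poststrat_mean(sample, counts, keys)
             for name, (keys, counts) in specs.items()}
spread = max(estimates.values()) - min(estimates.values())
```

A large spread across specifications, as with the stratification by sex in this toy example, suggests that the unweighted result is sensitive to non-representativeness along that dimension.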
Appendix A. Supplementary data
Supplementary material related to this article can be found online at https://doi.org/10.1016/j.jeconom.2024.105689.
References
Alatas, Vivi, Banerjee, Abhijit, Chandrasekhar, Arun G., Hanna, Rema, Olken, Benjamin A., 2016. Network structure and the aggregation of information: Theory
and evidence from Indonesia. Amer. Econ. Rev. 106 (7), 1663–1704.
Aldous, David J., 1981. Representations for partially exchangeable arrays of random variables. J. Multivariate Anal. 11 (4), 581–598.
Aral, Sinan, 2016. Networked experiments. In: The Oxford Handbook of the Economics of Networks. Oxford, UK: Oxford University Press, pp. 376–411.
Ballester, Coralio, Calvó-Armengol, Antoni, Zenou, Yves, 2006. Who’s who in networks. Wanted: The key player. Econometrica 74 (5), 1403–1417.
Banerjee, Abhijit, Chandrasekhar, Arun G., Duflo, Esther, Jackson, Matthew O., 2013. The diffusion of microfinance. Science 341 (6144), 1236498.
Banerjee, Abhijit, Chandrasekhar, Arun G., Duflo, Esther, Jackson, Matthew O., 2014. Gossip: Identifying central individuals in a social network. No. w20422
NBER Working paper.
Berg, Sven, 2004. Snowball sampling—I. Encycl. Stat. Sci. 12.
Bhattacharyya, Sharmodeep, Bickel, Peter J., 2015. Subsampling bootstrap of count features of networks. Ann. Statist. 43 (6).
Bickel, Peter J., Chen, Aiyou, 2009. A nonparametric view of network models and Newman–Girvan and other modularities. Proc. Natl. Acad. Sci. 106 (50),
21068–21073.
Bickel, Peter J., Chen, Aiyou, Levina, Elizaveta, 2011. The method of moments and degree distributions for network models. Ann. Statist. 39 (5), 2280–2301.
Binder, David A., Roberts, Georgia R., 2003. Design-based and model-based methods for estimating model parameters. Anal. Survey Data 29, 33–54.
Bloch, Francis, Genicot, Garance, Ray, Debraj, 2008. Informal insurance in social networks. J. Econom. Theory 143 (1), 36–58.
Borgs, Christian, Chayes, Jennifer, Cohn, Henry, Zhao, Yufei, 2019. An L^p theory of sparse graph convergence I: Limits, sparse random graph models, and power
law distributions. Trans. Amer. Math. Soc. 372 (5), 3019–3062.
Boucher, Vincent, Houndetoungan, Aristide, 2020. Estimating peer effects using partial network data. Working paper.
Bramoullé, Yann, Djebbari, Habiba, Fortin, Bernard, 2009. Identification of peer effects through social networks. J. Econometrics 150 (1), 41–55.
Bramoullé, Yann, Kranton, Rachel, D’amours, Martin, 2014. Strategic interaction and networks. Amer. Econ. Rev. 104 (3), 898–930.
Branas-Garza, Pablo, Cobo-Reyes, Ramón, Espinosa, María Paz, Jiménez, Natalia, Kovářík, Jaromír, Ponti, Giovanni, 2010. Altruism and social integration. Games
Econom. Behav. 69 (2), 249–257.
Breza, Emily, Chandrasekhar, Arun G., McCormick, Tyler H., Pan, Mengjie, 2020. Using aggregated relational data to feasibly identify network structure without
network data. Amer. Econ. Rev. 110 (8), 2454–2484.
Browne, Kath, 2005. Snowball sampling: using social networks to research non-heterosexual women. Int. J. Soc. Res. Methodol. 8 (1), 47–60.
Calvo-Armengol, Antoni, Jackson, Matthew O., 2004. The effects of social networks on employment and inequality. Amer. Econ. Rev. 94 (3), 426–454.
Centola, Damon, 2010. The spread of behavior in an online social network experiment. Science 329 (5996), 1194–1197.
Chandrasekhar, Arun, 2016. Econometrics of network formation. In: The Oxford Handbook of the Economics of Networks. pp. 303–357.
Chandrasekhar, Arun G., Jackson, Matthew O., 2016. A network formation model based on subgraphs, Working paper. Available at SSRN: https://ssrn.com/
abstract=2660381.
Chandrasekhar, Arun, Lewis, Randall, 2016. Econometrics of sampled networks, Working paper.
Cingano, Federico, Rosolia, Alfonso, 2012. People I know: job search and social networks. J. Labor Econ. 30 (2), 291–332.
Comellas, F., Gago, S., 2007. Spectral bounds for the betweenness of a graph. Linear Algebra Appl. 423 (1), 74–80.
Crane, Harry, Towsner, Henry, 2018. Relatively exchangeable structures. J. Symbolic Logic 83 (2), 416–442.
Currarini, Sergio, Jackson, Matthew O., Pin, Paolo, 2009. An economic model of friendship: Homophily, minorities, and segregation. Econometrica 77 (4),
1003–1045.
De Paula, Áureo, 2017. Econometrics of network models. In: Advances in Economics and Econometrics: Eleventh World Congress. In: Econometric Society
Monographs, Cambridge University Press, Cambridge, pp. 268–323.
De Paula, Áureo, 2020. Econometric models of network formation. Annu. Rev. Econ. 12, 775–799.
De Paula, Áureo, Rasul, Imran, Souza, Pedro, 2018. Recovering social networks from panel data: Identification, simulations and an application. Working paper.
Eagle, Nathan, Macy, Michael, Claxton, Rob, 2010. Network diversity and economic development. Science 328 (5981), 1029–1031.
Fleming, Lee, King, III, Charles, Juda, Adam I., 2007. Small worlds and regional innovation. Organ. Sci. 18 (6), 938–954.
Fortin, Bernard, Boucher, Vincent, 2015. Some challenges in the empirics of the effects of networks. In: The Oxford Handbook of the Economics of Networks.
Frank, Ove, 1977. Survey sampling in graphs. J. Statist. Plann. Inference 1 (3), 235–264.
Frank, Ove, 1981. A survey of statistical methods for graph analysis. Sociol. Methodol. 12, 110–155.
Golub, Benjamin, Jackson, Matthew O., 2012. How homophily affects the speed of learning and best-response dynamics. Q. J. Econ. 127 (3), 1287–1338.
Graham, Bryan S., 2020. Network data. In: Handbook of Econometrics, vol. 7, Elsevier, pp. 111–218.
Granovetter, Mark, 1985. Economic action and social structure: The problem of embeddedness. Am. J. Sociol. 91 (3), 481–510.
Griffith, Alan, 2022. Name your friends, but only five? the importance of censoring in peer effects estimates using social network data. J. Labor Econ. 40 (4),
779–805.
Handcock, Mark S., Gile, Krista J., 2010. Modeling social networks from sampled data. Annals of Applied Statistics 4 (1), 5.
Hardy, Morgan, Heath, Rachel M., Lee, Wesley, McCormick, Tyler H., 2019. Estimating spillovers using imprecisely measured networks. arXiv preprint
arXiv:1904.00136.
Heckathorn, Douglas D., 1997. Respondent-driven sampling: a new approach to the study of hidden populations. Soc. Problems 44 (2), 174–199.
Hoff, Peter D., Raftery, Adrian E., Handcock, Mark S., 2002. Latent space approaches to social network analysis. J. Amer. Statist. Assoc. 97 (460), 1090–1098.
Holland, Paul W., Laskey, Kathryn Blackmond, Leinhardt, Samuel, 1983. Stochastic blockmodels: first steps. Soc. Netw. 5 (2), 109–137.
Hoover, Douglas N., 1979. Relations on Probability Spaces and Arrays of Random Variables, Preprint. vol. 2, Princeton, NJ, p. 275.
Horvitz, Daniel G., Thompson, Donovan J., 1952. A generalization of sampling without replacement from a finite universe. J. Amer. Statist. Assoc. 47 (260),
663–685.
Jackson, Matthew O., 2005. A survey of network formation models: stability and efficiency. Group Form. Econ. Netw. Clubs Coalitions 11–49.
Jackson, Matthew O., 2010. Social and Economic Networks. Princeton University Press.
Jackson, Matthew O., Rodriguez-Barraquer, Tomas, Tan, Xu, 2012. Social capital and social quilts: Network patterns of favor exchange. Amer. Econ. Rev. 102
(5), 1857–1897.
Jackson, Matthew O., Rogers, Brian W., 2007. Meeting strangers and friends of friends: How random are social networks? Amer. Econ. Rev. 97 (3), 890–915.
Jackson, Matthew O., Rogers, Brian W., Zenou, Yves, 2017. The economic consequences of social-network structure. J. Econ. Lit. 55 (1), 49–95.
Karlan, Dean, Mobius, Markus, Rosenblat, Tanya, Szeidl, Adam, 2009. Trust and social collateral. Q. J. Econ. 124 (3), 1307–1361.
Kolaczyk, Eric D., 2009. Statistical Analysis of Network Data: Methods and Models. Springer Science & Business Media.
Li, Xinran, Ding, Peng, 2017. General forms of finite population central limit theorems with applications to causal inference. J. Amer. Statist. Assoc. 112 (520),
1759–1769.
Li, Ting, Yu, Xianshi, Jing, Bing-Yi, 2019. Measuring the clustering strength of a network via the normalized clustering coefficient. arXiv preprint arXiv:1908.00523.
Little, Roderick J.A., 1993. Post-stratification: a modeler’s perspective. J. Amer. Statist. Assoc. 88 (423), 1001–1012.
Lovász, László, 2012. Large Networks and Graph Limits, vol. 60, American Mathematical Society.
McPherson, J. Miller, Smith-Lovin, Lynn, 1987. Homophily in voluntary organizations: Status distance and the composition of face-to-face groups. Am. Sociol.
Rev. 370–379.
McPherson, Miller, Smith-Lovin, Lynn, Cook, James M., 2001. Birds of a feather: Homophily in social networks. Annu. Rev. Sociol. 27 (1), 415–444.
Morelli, Sylvia A., Ong, Desmond C., Makati, Rucha, Jackson, Matthew O., Zaki, Jamil, 2017. Empathy and well-being correlate with centrality in different social
networks. Proc. Natl. Acad. Sci. 114 (37), 9843–9847.
Newman, Mark E.J., 2002. Assortative mixing in networks. Phys. Rev. Lett. 89 (20), 208701.
Opsahl, Tore, 2013. Triadic closure in two-mode networks: Redefining the global and local clustering coefficients. Social Networks 35 (2), 159–167.
Pastor-Satorras, Romualdo, Vespignani, Alessandro, 2002. Immunization of complex networks. Phys. Rev. E 65 (3), 036104.
Prášková, Zuzana, Sen, Pranab Kumar, 2009. Asymptotics in finite population sampling. Handbook of Statist. 29, 489–522.
Ruiz-Palazuelos, Sofía, Espinosa, María Paz, Kovářík, Jaromír, 2023. The weakness of common job contacts. Eur. Econ. Rev. 160, 104594.
Schilling, Melissa A., Phelps, Corey C., 2007. Interfirm collaboration networks: The impact of large-scale network structure on firm innovation. Manage. Sci. 53
(7), 1113–1126.
Smith, Terence M.F., 1991. Post-stratification. J. Royal Stat. Soc. Series D 40 (3), 315–323.
Sterba, Sonya K., 2009. Alternative model-based and design-based frameworks for inference from samples to populations: From polarization to integration.
Multivar. Behav. Res. 44 (6), 711–740.
Thirkettle, Matthew, 2019. Identification and estimation of network statistics with missing link data. Working paper.
Van Mieghem, Piet, 2010. Graph Spectra for Complex Networks. Cambridge University Press.
Vega-Redondo, Fernando, 2007. Complex Social Networks, No. 44. Cambridge University Press.
Watts, Duncan J., Strogatz, Steven H., 1998. Collective dynamics of ‘‘small-world’’ networks. Nature 393 (6684), 440–442.