Scientific Reports | (2024) 14:21261 | https://doi.org/10.1038/s41598-024-72354-7
www.nature.com/scientificreports
Lifestyle differences between co-twins are associated with decreased similarity in their internal and external exposome profiles
Gabin Drouard1*, Zhiyang Wang1, Aino Heikkinen1, Maria Foraster2, Jordi Julvez3,4, Katja M. Kanninen5, Irene van Kamp6, Matti Pirinen1,7,8, Miina Ollikainen1,9 & Jaakko Kaprio1*
Whether differences in lifestyle between co-twins are reflected in differences in their internal or external exposome profiles remains largely underexplored. We therefore investigated whether within-pair differences in lifestyle were associated with within-pair differences in exposome profiles across four domains: the external exposome, proteome, metabolome and epigenetic age acceleration (EAA). For each domain, we assessed the similarity of co-twin profiles using Gaussian similarities in up to 257 young adult same-sex twin pairs (54% monozygotic). We additionally tested whether similarity in one domain translated into greater similarity in another. Results suggest that a lower degree of similarity in co-twins’ exposome profiles was associated with greater differences in their behavior and substance use. The strongest association was identified between excessive drinking behavior and the external exposome. Overall, our study demonstrates how social behavior and especially substance use are connected to the internal and external exposomes, while controlling for familial confounders.
Keywords: Multi-omics, Exposome, Proteome, Metabolome, Epigenetic age acceleration, Twins, Environment and genetics, Within-pair proximity scores (WPPS)
Lifestyle can have a significant impact on health over time, as the years spent in good health increase for individuals who maintain healthy lifestyles1. Behavioral risk factors such as smoking, alcohol use and diet account for a substantial part of the global burden of disease2. These risk factors reflect both societal influences and genetic predisposition. The same risk factors also influence pathophysiological processes through multiple mechanisms, including metabolic state and gene action.
Omics data enable biological representation at different molecular levels, defining multiple biological layers that can be either deep (e.g., genotype) or shallow (e.g., metabolome). The volume and diversity of omics data have increased exponentially since the start of the twenty-first century3 and have shown great potential in the study of disease and common traits, as their potential for an in-depth understanding of underlying molecular mechanisms is undeniable4. As the volume of omics data increases, one concept whose use has recently been on the rise is the Exposome, which can be defined as the total environmental exposures that an individual experiences5. The Exposome is usually classified into three parts: general external, specific external, and internal exposome. The general external exposome generally represents large-scale environmental factors, financial status or the built
1Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland. 2PHAGEX Research Group, Blanquerna School of Health Science, Universitat Ramon Llull (URL), Barcelona, Spain. 3Clinical and Epidemiological Neuroscience (NeuroÈpia), Institut d’Investigació Sanitària Pere Virgili (IISPV), Reus, Spain. 4ISGlobal, Parc de Recerca Biomèdica de Barcelona (PRBB), Barcelona, Spain. 5A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland. 6Centre for Sustainability, Environment and Health, National Institute for Public Health and the Environment, Bilthoven, The Netherlands. 7Department of Public Health, Faculty of Medicine, University of Helsinki, Helsinki, Finland. 8Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland. 9Minerva Foundation Institute for Medical Research, Helsinki, Finland. *email: [email protected]; jaakk[email protected]
environment (e.g., green spaces and building density). The specific external exposome includes specific environmental factors such as diet, occupational exposures and lifestyle factors6. In the context of this paper, we distinguish between the internal and external exposome, the latter including social and physical exposomes7 as well as socio-demographic factors, but not lifestyle. Finally, environmental imprints on omic profiles are commonly classified as part of the internal exposome, as omics data have been shown to be highly sensitive to the environment, and genetic factors only explain part of their variance. For example, it has been estimated that 40–50% of the variance in metabolite levels is explained by genetics8, and for DNA methylation the mean heritability is 19% with large differences and polygenic effects across methylation sites9,10. This suggests a substantial role of both the genome and environment in one’s omic profile. For example, the epigenome is widely used to estimate various exposures, such as cigarette smoking11,12 and alcohol consumption, as well as one’s biological age13,14. While studies focusing on the internal or external exposome are growing in number, those aiming at connecting the internal to the external exposome remain scarce15.
Recently, studies have shown associations between lifestyle and omics data, such as proteomics16,17 and metabolomics18,19 data. DNA methylation-based biological aging is also being increasingly used as it reflects health-related exposures and cellular processes, thus being more strongly associated with an individual’s health than chronological age. Associations of DNA methylation-based aging with, for example, diet, obesity, alcohol consumption and physical activity are now well established20–22. Two major challenges in Exposome studies of lifestyle remain to be addressed. The first is whether the exposome as a whole is structurally associated with lifestyle and related traits. Such knowledge would allow the scientific community to identify sets of exposures that together characterize people’s lifestyles and guide policymakers in, for example, urban planning and management. The second challenge relates to the role of genetics and environment in Exposome-lifestyle associations, which twin studies could elucidate, although such studies remain scarce in the literature. Recently, a twin study23 showed that the variation in biological aging that is shared with lifestyle can be explained by both common genetic factors and environmental factors. This finding implies that the molecular mechanisms underlying an individual’s lifestyle are intricate and multi-faceted. Twin designs could provide a more in-depth view of whether and how genetics confounds associations between domains of the exposome and lifestyle, and thus be a useful step in elucidating potential causal mechanisms.
Using and integrating multiple data sources of different types, such as omics or environmental data, could provide valuable insights into the complex relationship that links lifestyle to omics. Multi-omics approaches, which analyze multiple omics simultaneously, raised the promise of drawing results at the scale of all omic layers, i.e., at a holistic scale, rather than at the scale of within-omic variables24. Despite the high volume of omics available and their known clinical potential, such approaches remain underutilized in the lifestyle literature.
Twin cohorts constitute powerful epidemiological designs, wherein the utilization of twin study designs illuminates the nature of relationships between variables, whether attributable to genetic or environmental factors. As such, twin correlations in monozygotic (MZ) and dizygotic (DZ) pairs are commonly used for such purposes25,26. As MZ co-twins share identical DNA at the sequence level, while DZ co-twins share on average 50% of their segregating genes, higher twin correlations within MZ pairs than within DZ pairs may indicate the presence of genetic effects. Conversely, twin studies enable the detection of environmental factors, as the sum of genetic and environmental effects equals the sum of all measurable and unmeasurable effects27. A similar approach can be used in bivariate frameworks, notably using cross-twin cross-trait correlations, which may lead to the quantification of genetic or environmental correlations. However, detecting shared genetic and environmental correlations underlying more than a few traits remains challenging. As such, twin correlations are powerful tools to dissect the etiology of traits in univariate or bivariate perspectives, but are not suited to a “whole-dataset” perspective. Consequently, the emergence of multi-omics studies in twin cohorts is challenged by high-dimensional designs. Multi-omics studies in twins are nevertheless expected to improve our understanding of common traits28, as was shown in a few studies of mental health29, behavior30, or body weight31,32.
To counter the curse of dimensionality in omics datasets, diverse methods are commonly used in machine learning. As such, multiple clustering algorithms aim to evaluate proximity (i.e., closeness) between individuals at the scale of a whole dataset. Similarity matrices, which quantify between-individual distances, make it possible to estimate how “close” two individuals are at the scale of a whole dataset33. Their use, combined with for example Gaussian kernels, can therefore project high-dimensional data into single univariate scores depicting closeness between individuals, called Gaussian similarities. In datasets with highly correlated variables, performing an upstream Principal Component Analysis (PCA) can aid in the downstream evaluation of distances between individuals. Although such approaches are at the basis of several clustering algorithms, including spectral clustering, Gaussian similarities as such do not seem to have been used in the twin literature, even though their use could open new perspectives in the study of relatives, including twins.
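As a purely illustrative sketch of the idea (not the authors' implementation, which relies on a modified Euclidean distance whose details are not given in this section), a classical Gaussian kernel maps the squared Euclidean distance between two profiles to a similarity in (0, 1]. The function name, the bandwidth `sigma` and the example profiles below are assumptions made for illustration only.

```python
import math

def gaussian_similarity(x, y, sigma=1.0):
    """Classical Gaussian (RBF) similarity between two equal-length profiles.

    Returns exp(-||x - y||^2 / (2 * sigma^2)): 1.0 for identical profiles,
    decreasing towards 0 as the profiles move apart.
    """
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of variables")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Hypothetical standardized profiles of two co-twins (three variables each).
twin_a = [0.2, -1.0, 0.5]
twin_b = [0.1, -0.7, 0.9]
similarity = gaussian_similarity(twin_a, twin_b, sigma=1.0)
```

With many correlated variables, the same function could be applied to principal component scores instead of raw variables, in line with the upstream PCA mentioned above.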
Our main aim was to investigate whether co-twins differing in lifestyle may also show less similar omic or epigenetic aging profiles, or be exposed to a less similar external exposome, compared to twins from pairs with a very similar lifestyle. By answering this question, we were able to investigate whether domains of the exposome are structurally associated with lifestyle-related traits, i.e. whether groups of exposures within a domain discriminate healthy from unhealthy lifestyles, while controlling for familial confounders. To do so, we first quantified, using Gaussian similarities, how close two co-twins are at the scale of a whole dataset (Fig. 1A), across four domains: the plasma proteome, the plasma metabolome, epigenetic age acceleration (EAA), and the external exposome. These scores were later referred to as within-pair proximity scores (WPPS) (Fig. 1B). We then quantified associations between domain-specific WPPS and multiple variables depicting lifestyle, such as education, leisure-time activity, substance use and behavior. We also investigated whether similarity between co-twins at the scale of a domain could be associated with greater similarity at the scale of another domain. Using all twins and subsets of MZ and DZ twin pairs only, we then aimed to investigate whether pairwise WPPS associations were likely to be driven by genetics and/or environment.
Results
The present study included up to 257 same-sex twin pairs (63% females; 54% monozygotic pairs) from the FinnTwin12 cohort. The proximity of the twins within each twin pair across the proteome, metabolome, EAA, and external exposome domains was estimated using within-pair proximity scores (WPPS) (Figs. 1, 2A; Table 1). MZ pairs had significantly higher domain-specific WPPS for the proteome (p = 2.9e−7), the metabolome (p = 9.5e−5), and EAA (p = 3.5e−8) than DZ pairs, suggesting that genetic factors contribute to co-twin similarity in omic profiles and epigenetic aging. MZ co-twins did not show closer external exposomes than DZ co-twins (p = 0.09). Male co-twins had a more similar proteome than females (p = 9.8e−4), which was not observed for the metabolome (p = 0.18), EAA (p = 0.13), and external exposome domains (p = 0.70). The age at which the co-twins separated from their familial home was significantly positively associated with greater similarity between the co-twins at the external exposome level (p = 4.6e−7) but did not show a significant association with domain-specific WPPS for the proteome (p = 0.50), the metabolome (p = 0.11), or EAA (p = 0.88).
Associations between within-pair proximity scores across different domains
We assessed pairwise associations between WPPSs using linear regression, adjusting for differences in age at blood sampling between co-twins. In all twins, greater similarity between co-twins at the level of the external exposome was associated with greater similarity at the level of the proteome (estimate: 0.31; standard error (se): 0.15; p = 0.04). No significant association was observed between the external exposome and either the EAA (estimate: 0.02; se: 0.08; p = 0.77) or the metabolome (estimate: 0.16; se: 0.10; p = 0.11). The closer the proteomes of two co-twins were to each other, the more similar their epigenetic age acceleration was (estimate: 0.47; se: 0.13; p = 3.1e−4), while no associations were observed between EAA and metabolome (estimate: 0.09; se: 0.08; p = 0.25). Finally, greater metabolomic similarity between co-twins was associated with greater proteomic proximity (estimate: 0.21; se: 0.04; p = 2.6e−8). While most of the significant associations reported are modest in strength (Fig. 2C), about one seventh of the proteome-specific WPPS variance was explained by variations of the metabolome-specific WPPS (R2 = 14.1%). This suggests a substantial interplay between these two omics.
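To make the covariate adjustment concrete, the following minimal sketch estimates the slope of one WPPS on another while adjusting for a single covariate. It uses residualization (the Frisch-Waugh-Lovell result), which gives the same slope as a multiple regression with an intercept and both predictors. The function names and the five data points are hypothetical and are not study data; the software used by the authors is not stated in this section.

```python
def residuals(y, x):
    """Residuals from a simple least-squares regression of y on x (with intercept)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return [(yi - mean_y) - slope * (xi - mean_x) for xi, yi in zip(x, y)]

def adjusted_slope(outcome, predictor, covariate):
    """Slope of outcome on predictor after removing the covariate from both."""
    res_y = residuals(outcome, covariate)
    res_x = residuals(predictor, covariate)
    return sum(a * b for a, b in zip(res_x, res_y)) / sum(a * a for a in res_x)

# Hypothetical values for five twin pairs (illustration only).
age_diff = [0.0, 0.1, 0.3, 0.2, 0.5]          # within-pair age difference at sampling
wpps_metabolome = [0.4, 0.7, 0.5, 0.9, 0.6]   # predictor
# Outcome built as an exact linear function, so the adjusted slope is 0.3.
wpps_proteome = [0.1 + 0.3 * m + 0.2 * a for m, a in zip(wpps_metabolome, age_diff)]

slope = adjusted_slope(wpps_proteome, wpps_metabolome, age_diff)
```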
In order to determine whether the associations observed in all twins were likely to be due to genetic or environmental factors, we divided the twin pairs by zygosity into MZ and DZ pairs. For both MZ and DZ pairs, we re-quantified the associations and compared the magnitude of the coefficients by standardizing the WPPS of each domain (Fig. 2B). The numbers of MZ and DZ pairs with complete WPPS for each domain are available in the supplementary material (Supplementary document 1, Fig. S1). Among MZ twin pairs only, the associations between the proteome and the metabolome (estimate: 0.35; se: 0.08; p = 5.7e−5), the EAA (estimate: 0.19; se:
Fig. 1. Study workflow diagram. The study was divided into two main analyses (A), which followed the calculation of Within-Pair Proximity Scores (WPPS) at the level of each domain (B). Cross-domain WPPS analysis involved pairwise linear regressions, while the ExWAS analysis was designed to identify associations between lifestyle differences and WPPS scores in twin pairs.
Fig. 2. Associations between within-pair proximity scores across the four domains in all twin pairs, in monozygotic twin pairs only, and in dizygotic twin pairs only. (A) Graphical representation of the WPPS of each pair across all domains. (B) Standardized estimates of the associations across domains for MZ and DZ twin pairs separately. The horizontal and vertical bars represent the 95% confidence intervals of the standardized estimates in the MZ and DZ twin pairs, respectively. Standardized positive coefficients for which the confidence interval does not include zero indicate that a high degree of closeness between co-twins at the level of a given domain was also reflected in a higher degree of closeness between co-twins at the level of a second domain. The range where estimates for MZ twins were larger than those for DZ twins is indicated by the yellow area. (C) R2 measures in all twin pairs, in MZ twin pairs only, or in DZ twin pairs only. The sample sizes for each of the associations that led to the displayed R2 are available in the supplementary material (Fig. S1).
Table 1. Cohort description and distributions of within-pair proximity scores. N Number of pairs, MZ Monozygotic, DZ Dizygotic, M Males, F Females, WPPS Within-pair proximity score.
                                  Mean/N           Standard deviation   Range
Number of pairs (MZ/DZ)           257 (140/117)
Sex at birth of the pairs         94 M/163 F
Age of twins at blood sampling    22.3             0.6                  21.0–24.7
WPPS (proteome)                   0.53             0.11                 0.18–0.76
WPPS (metabolome)                 0.66             0.18                 0.04–0.93
WPPS (EAA)                        0.70             0.22                 0.01–0.98
WPPS (exposome)                   0.77             0.27                 0.07–1.00
0.09; p = 0.03) and the external exposome (estimate: 0.17; se: 0.08; p = 0.05) remained significant. All of the coefficients of association between domain-specific WPPS were higher among the MZ twin pairs than among the DZ twin pairs. Similarly, R2 measures were larger for MZ than DZ twin pairs for most associations (Fig. 2C). Although this may suggest that genetic factors play a role in within-pair domain associations, we were unable to demonstrate that any of these associations involved genetic factors, as assessed by z-test. The MZ-DZ difference for the association between the external exposome and the proteome came closest to statistical significance without reaching it. This association remained significant among MZ twin pairs (estimate: 0.17; se: 0.08; p = 0.05), but not in DZ twin pairs (estimate: 0.01; se: 0.19; p = 0.92), although the difference in the effects between the two subsamples was not significant (one-tailed z-test: p = 0.10).
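The exact form of the z-test is not specified in this section. One common choice for comparing two regression coefficients estimated in independent subsamples, shown here only as an assumed sketch, divides the difference in estimates by the square root of the sum of their squared standard errors. The function name and the example inputs are hypothetical; because the comparison in the study was made on standardized coefficients, plugging in the unstandardized estimates quoted above need not reproduce the reported p value.

```python
import math
from statistics import NormalDist

def one_tailed_z_test(est_a, se_a, est_b, se_b):
    """One-tailed z-test of H1: est_a > est_b for two independent estimates.

    Assumes the two estimates are independent and approximately normal.
    Returns the z statistic and the upper-tail p value.
    """
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value

# Hypothetical standardized coefficients and standard errors (illustration only).
z_stat, p_one_tailed = one_tailed_z_test(1.0, 0.3, 0.0, 0.4)
```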
Associations between within-pair proximity scores and difference in lifestyle between co-twins
We sought to examine whether lifestyle differences between co-twins were associated with differences between co-twins at the level of each domain. To do so, we quantified associations between the WPPS of each domain and 14 lifestyle variables (Fig. 3) that reflected twin discordance (i.e., one co-twin exhibits a feature that the other co-twin does not). Models were adjusted for differences in age at blood sampling between co-twins. The number of discordant twin pairs is available in the supplementary material (Supplementary document 1, Table S1).
Similarity between two co-twins at the metabolomic level, defined by high metabolome-specific WPPS, was not significantly associated with any lifestyle variable. Twin pairs in which one co-twin frequently drinks to the point of intoxication, while the other does not, had a less similar proteome (estimate: −0.04; se: 0.02; nominal p value: 0.01; FDR-corrected p value: 0.16) compared to those pairs with twins possessing similar drinking habits.
The more the co-twins differed in their EAA, the greater the chance that the corresponding twin pair was discordant for cigarette smoking (estimate: −0.10; se: 0.03; nominal p value: 1.9e−3; FDR-corrected p value: 0.03).
The external exposome captured the most associations with lifestyle variables (Fig. 3). The more the external exposome differed between twins in a pair, the more likely the pair was discordant for lifestyle, such as having a vocational degree (estimate: −0.10; se: 0.04; nominal p value: 0.02; FDR-corrected p value: 0.10), going out (i.e.,
Fig. 3. Associations between domain-specific within-pair proximity scores and binary variables indicating discordance in twin pairs. The log10 nominal p values that have FDR-corrected p values of less than 0.20 are shown in yellow. The dashed lines indicate the threshold value for a nominal p value of 0.05.
hanging out) (estimate: −0.10; se: 0.03; nominal p value: 5.1e−3; FDR-corrected p value: 0.03), cigarette smoking (estimate: −0.08; se: 0.04; nominal p value: 0.04; FDR-corrected p value: 0.15), or frequent drinking to intoxication (estimate: −0.15; se: 0.04; nominal p value: 6.8e−5; FDR-corrected p value: 9.5e−4). These associations suggest that both social habits related to substance use and education are linked to the environment in which the twins live. All summary statistics are available as supplementary documents (Supplementary document 1, Table S2).
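The FDR procedure is not named in this section. The sketch below assumes the widely used Benjamini-Hochberg step-up adjustment, applied within a domain across its lifestyle variables; this assumption is at least consistent with the smallest reported value, since 6.8e−5 × 14 ≈ 9.5e−4. The function name and the example p values are illustrative only.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order.

    Each sorted p value p_(k) is scaled by m / k, then a running minimum is
    taken from the largest rank downwards so that adjusted values are monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical nominal p values for four tests (illustration only).
nominal = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(nominal)
```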
Sensitivity analyses
We assessed the reliability of the associations that included the external exposome because of the skewness of its WPPS. Sensitivity analyses were performed by reducing the skewness, either by 1) transforming the WPPS of this domain or 2) excluding twin pairs with a WPPS of 1, i.e. pairs of twins living in the same household and therefore having the same external exposome. These sensitivity analyses, which can be consulted in the supplementary material (Supplementary document 1, Sect. 1), indicate a relatively good reliability of the results of the main analyses.
Discussion
By assessing the similarity between twins in a pair in four different domains, we showed that co-twins who differ in lifestyle tend to be less similar at the internal and external exposome levels, despite sharing at least half of their genetic makeup as well as a common familial environment. Specifically, differences in multi-omic profiles within twin pairs were associated with greater within-pair differences in social behavior and substance use. The greatest within-pair differences in the external exposome were identified between co-twins differing in excessive drinking behavior (i.e., frequent drinking to intoxication). These findings put in perspective the multiple layers, from internal to external exposome, that characterize lifestyle differences within twin pairs. In addition, we observed that co-twins with similar external exposomes tended to have more similar proteomes. Greater similarity between co-twins in epigenetic age acceleration or metabolome was also associated with increased similarity in proteome profiles. These findings suggest an interplay in co-twins’ multi-omic resemblance.
Differences in the external exposome between co-twins were strongly associated with differences in lifestyle between the co-twins. Fewer associations were found between lifestyle and the metabolome, the proteome or EAA. This suggests that there is a relatively strong association between parts of the external exposome and lifestyle, even though in our study the lifestyle information was derived from questionnaires self-reported by the twins, while the external exposome domain we considered included information about the twins at the neighborhood and geocode level. In particular, the results showed that co-twins with a different external exposome tended to differ in terms of social behavior and substance use. This is in line with previous studies showing associations between substance use and the external exposome34,35, as well as between neighborhood organization and alcohol or drug use36. Similarly, Williams and Latkin37 found a positive association between neighborhood poverty and substance use. Our study complements these findings in a family setting, where a more similar external exposome within twin pairs is associated with more similar behaviors towards excessive alcohol consumption.
Twins share a common family environment as well as half or all of their genetic heritage with their DZ or MZ co-twin, respectively. Thus, by design, a number of measurable and unmeasurable factors shared within each pair are accounted for in the within-pair comparisons. It is of note that the within-pair proximity scores of the external exposome were similar in MZ and DZ pairs, providing empirical evidence to support one central assumption of classical twin modeling, namely that the environmental exposures and experiences are the same, on average, in MZ and DZ pairs. Observed associations are therefore likely to be due to environmental factors specific to each twin, regardless of their zygosity. Similarly, within-family Genome-Wide Association Studies (GWAS) using measured genes can also control for effects such as population stratification and familial biases38 compared to our control of genetic effects using twin pairs alone. Interestingly, within-family GWAS analyses revealed a higher SNP heritability for alcohol use than population-based analyses39, while the effect was in the opposite direction for education and smoking in the same analysis. Finally, findings from the current twin study can, to some extent, be extrapolated to a more global family context. DZ twins are genetically matched like normal siblings, although DZ twins may share a more intense early life environment with their co-twins.
Although we provide additional evidence for a direct effect between the external exposome and both social behavior and frequent drinking to intoxication, our design does not allow us to establish any causal relationships. High consumption of alcohol is associated with multiple social problems and medical disorders, both mental and somatic. The educational attainment of the young adult twins may, for example, explain some of the associations between multi-omic profile and lifestyle differences. Additionally, one twin moving away from the family home to attend tertiary education may explain within-pair differences in the external exposome, with the latter being a mediator rather than a cause of differences in lifestyles between the co-twins. More studies are needed to disentangle the role of education and socio-economic factors in linking lifestyle and the external exposome.
Metabolites are known to reflect environmental effects40, yet we observed no association between metabolomic and lifestyle similarity. Several factors and limitations of our study may explain the absence of an association. First, the sample size was relatively modest (complete pairs of twins with metabolome-specific WPPS: N = 219), which limited the statistical power. This observation also applies to the other internal exposome domains such as the proteome. However, datasets that include both internal and external exposome data in more than half a thousand twins are rare. Although specific metabolites or classes of metabolites are known to be associated with lifestyle19, it remains to be explored whether noticeable associations between the entire metabolome and lifestyle exist. In addition, the set of lifestyle variables used in the current study was relatively limited. For example, we did not include information on physical activity or diet, even though strong connections with some metabolites have been reported in the literature41,42. Whether similarity in co-twins’ metabolome profiles is associated with
lifestyle remains unresolved by the current study. Finally, internal factors other than metabolites and proteins, such as ingested metals and pesticides, were not examined in the current study. Thus, our study represents only a partial investigation of the relationship between lifestyle and the internal exposome. Incorporating new sets of omics, including information about pollutants, may pave the way for a deeper understanding of how exposure to pollutants, such as pesticides, might modulate associations between parts of the internal exposome, such as the metabolome or proteome, and lifestyle. Further studies investigating complex interactions between subdomains of the internal exposome in relation to lifestyle are to be encouraged.
The similarity of co-twins’ multi-omic profiles was assessed using Gaussian similarities based on modified Euclidean distances. The modification of the Euclidean distance aimed to counteract high data correlations, while adhering to the most classical version of Gaussian kernels. However, other methods could be used to assess the proximity between individuals in a similar context; the Mahalanobis distance is one such method. Using WPPS in contexts with low-dimensional datasets does not seem to be the most effective either. The methodology presented in the current study, although enabling quantification of how similar co-twins are, does not offer an in-depth epidemiological perspective, as genetic modeling does, for example, to estimate variance components. However, the methodology proved effective and useful for within-pair analyses, especially for high-dimensional data. As such, manipulating a univariate score depicting the proximity between twins in a pair is relatively straightforward in simple configurations such as that offered by regression. Other more advanced machine learning models could also benefit from using these univariate scores, as they could simultaneously incorporate different types of data to study lifestyle and related traits.
In summary, we show here that co-twins with different lifestyles are less similar in their internal or external exposome profiles than are co-twins with similar lifestyles. Results indicate associations between lower similarity in co-twins’ multi-omic profiles and differences in social behavior or substance use, with the strongest association in pairs differing in excessive drinking behavior.
Methods
FinnTwin12 cohort and participants
FinnTwin12 is a nationwide cohort based on Finnish twins born 1983–1987, which aims at investigating behavioral development from childhood to adulthood43,44. Participants were identified from the population database of the Digital and Population Data Services Agency of Finland (dvv.fi). They completed multiple questionnaires at ages 11/12, 14, 17 and 22. A subset of these twins was studied more intensively, and 786 of them participated in in-person assessments and provided venous blood samples after overnight fasting as young adults (mean age: 22.3; sd: 0.6), from which proteomic, metabolomic and epigenetic data were generated. Besides omics data, we used exposures derived from geocodes to depict each twin’s external exposome45, including for example access to green spaces. Geocodes were based on the full residential history of the twins until the end of 2020, obtained from the Digital and Population Data Services Agency. The metabolome (number of variables: p = 140), proteome (p = 439), EAA (p = 8) and external exposome (p = 65) datasets are each referred to as domains from which within-pair proximity scores were later calculated. In addition to the domains, we used questionnaire data characterizing education, leisure activities, substance use, and behavior, all of which were measured in the questionnaire administered at age 22 and thus temporally close to the domain assessments. Details about domains and questionnaire data are introduced in eponymous data processing subsections.
Data processing
The sample sizes of omics and external exposome data varied but overlapped well, with most twin pairs having all domains available. The maximum number of available twin pairs was 257 (63% females), all of which had proteomic data and a high proportion of which had data from other domains as well. Details of omics and external exposome overlap are provided in the supplementary material (Supplementary document 1, Fig. S1).
Proteomics
Proteins from the plasma samples of 786 participants were precipitated and subjected to in-solution digestion according to the standard protocol of the Turku Proteomics Facility (Turku Proteomics Facility, Turku, Finland). Details about high abundant protein depletion, precipitation and digestion in this sample have been described elsewhere46. Samples were first analyzed by independent data acquisition LC–MS/MS using a Q Exactive HF mass spectrometer and further analyzed using Spectronaut software. Data were locally normalized47, and raw matrix counts were processed and quality controlled as described elsewhere31. Briefly, protein levels were log2-transformed and proteins with > 10% missing values were excluded. Missing values were imputed by the lowest observed value for each protein carrying missing values. Corrections for batch effects were performed with ComBat48 and the final proteomic dataset comprised 439 proteins, which were scaled such that one unit corresponded to one standard deviation (sd) with zero mean. Proteomic data were available for 257 complete pairs of same-sex twins. The list of proteins is presented in the supplementary material (Supplementary document 1, Table S3).
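The filtering, imputation and scaling steps described above can be sketched per protein as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the batch correction step is omitted, missing values are represented by `None`, the sample standard deviation is used for scaling, and the function name, protein labels and intensities are hypothetical.

```python
import math
from statistics import mean, stdev

def preprocess_protein(values, max_missing=0.10):
    """Log2-transform, filter, impute and scale one protein's intensities.

    Returns None if more than `max_missing` of the values are missing;
    otherwise returns values with zero mean and unit standard deviation.
    """
    n_missing = sum(v is None for v in values)
    if n_missing / len(values) > max_missing:
        return None  # protein excluded
    logged = [None if v is None else math.log2(v) for v in values]
    lowest = min(v for v in logged if v is not None)
    imputed = [lowest if v is None else v for v in logged]  # lowest observed value
    mu, sd = mean(imputed), stdev(imputed)
    return [(v - mu) / sd for v in imputed]

# Hypothetical raw intensities for ten samples (illustration only).
raw = {
    "protein_a": [4.0, 8.0, 16.0, None, 32.0, 8.0, 4.0, 16.0, 8.0, 64.0],
    "protein_b": [4.0, None, 16.0, None, 32.0, 8.0, 4.0, 16.0, 8.0, 64.0],
}
processed = {}
for name, vals in raw.items():
    result = preprocess_protein(vals)
    if result is not None:
        processed[name] = result
```

In this toy input, `protein_a` has exactly 10% missing values and is kept, while `protein_b` has 20% and is excluded.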
Metabolomics
Metabolites were quantified from plasma samples using high-throughput proton nuclear magnetic resonance spectroscopy (1H-NMR) (Nightingale Health Ltd, Helsinki, Finland) in one batch44,49,50. An extensive description of the data is available elsewhere51. The data initially comprised 149 metabolites including lipids and lipoprotein subclasses within 14 subclasses, fatty acid composition, and various low-molecular weight metabolites, including amino acids. We excluded pregnant women (n = 53) and an individual with cholesterol. Only metabolites with less than 10% missing values were kept, leading to a selection of 140 metabolites (see Table S4 for details of the
metabolites used) (Supplementary document 1, Table S4). Imputation by the observed sample minimum value
was performed for each metabolite (rate of missing values in dataset: 1.2%). No participant had a score on any
of the first three principal components, derived from principal component analysis, more than 5 SD above or
below the mean, indicating the absence of outliers. Metabolite values were normalized using inverse normal rank
transformation, and scaled so that one unit corresponded to a change of one sd, with a mean of zero.
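For readers unfamiliar with the inverse normal rank transformation, a minimal Python sketch is given below. The text does not state which rank offset was used, so the Blom offset c = 3/8 and the averaging of tied ranks are assumptions; the function name `inverse_normal_rank` is likewise hypothetical.

```python
from statistics import NormalDist


def inverse_normal_rank(values, c=3.0 / 8.0):
    """Map values to normal quantiles of their ranks.

    Uses (rank - c) / (n - 2c + 1) as the plotting position, with c = 3/8
    (Blom offset, an assumption here). Tied values share their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        # Find the block of sorted positions sharing the same value.
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2.0 + 1.0  # ranks are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]
```

The subsequent scaling to zero mean and unit standard deviation is an ordinary z-score and is omitted from the sketch.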
Epigenetic age acceleration
DNA methylation levels were quantified using Infinium Illumina HumanMethylation450K, and preprocessed
using R-package meffil52 removing bad quality samples and probes53. We calculated different EAA estimates,
defined as the residuals of chronological age regressed on epigenetic age, using their respective algorithms:
Horvath54, Hannum55, PhenoAge56, and GrimAge57, alongside their respective PC-score versions58. Details about
EAA calculations are described elsewhere23. In contrast with chronological age, which is defined by the amount
of time elapsed since birth, epigenetic age aims at estimating cells' biological age and shows promising clinical
potential13. The final EAA dataset consisted of 8 different EAAs available for 241 complete same-sex twin pairs,
showing relatively moderate correlations (mean Pearson correlation: r = 0.49) between the EAA estimates.
External exposome
The domain of the external exposome comprised a total of 65 exposures derived from two sources: Equal-life
project7 enrichment datasets and Statistics Finland. We used the twins' geocodes in 2005–2006 to merge the
exposures, which were extensively described elsewhere45. Briefly, exposures were continuous and included: sizes of
and access to green spaces, percentage of built-up areas, population ages and headcounts, crime rates, and
voting patterns at municipal elections. The complete dataset was available for 250 pairs of same-sex twins. A complete
description of the external exposome's variables is available in the supplementary material (Supplementary
document 1, Table S5).
Lifestyle variables
A panel of 14 variables characterizing education, leisure activities, substance use, and social behavior was derived
from home questionnaires completed by the twins prior to the in-person study as well as questionnaires
completed during the study visit, all measured at approximately age 22. Home questionnaires were completed within
2 weeks of blood sampling for more than 95% of these twins31. We examined whether these variables
were associated with domain-specific WPPS. Playing video games, watching videos, playing an instrument,
and reading were used as variables to characterize leisure time. Social behavior was assessed by the following
variables: going out (i.e., hanging out), going dancing, taking part in a club, going to fast food, and going to
bars. Variables characterizing substance use were cigarette smoking, alcohol consumption, and alcohol-induced
intoxication59. These frequency variables were dichotomized into binary variables, such that modalities were
defined by a frequency of at most once a month versus at least once a week. In addition, the attainment of a
vocational degree (modalities: yes/no) was used to characterize possible educational differences between the
co-twins. We also included a variable indicating the age at first sexual intercourse (modalities: strictly before
the age of 18/at age 18 or after).
Within-pair lifestyle variables were then created. They were coded as follows: 1 indicates a discordant pair,
0 a concordant pair. That is, for each within-pair indicator variable, 1 denoted a twin pair in which one twin
expressed a feature that the other did not, and 0 denoted two co-twins expressing the same feature (e.g., same
substance use or social behavior). The number of lifestyle discordant twin pairs ranged from 42 to 94 among
the 257 available twin pairs. For each variable, pairs of twins in which at least one of the co-twins had a
missing value were coded as having a missing value for the pair. Detailed numbers of discordant and missing pairs
per each lifestyle variable are available in the supplementary material (Supplementary document 1, Table S1).
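The dichotomization and within-pair coding described above can be illustrated with a short Python sketch. The questionnaire response labels (`"never"`, `"monthly_or_less"`, `"weekly"`, `"daily"`) are hypothetical placeholders, since the original answer categories are not listed here; only the coding logic follows the text.

```python
# Hypothetical response labels; the real questionnaire categories may differ.
AT_LEAST_WEEKLY = {"weekly", "daily"}
AT_MOST_MONTHLY = {"never", "monthly_or_less"}


def dichotomize(answer):
    """Return 1 for 'at least once a week', 0 for 'at most once a month'.

    A missing answer (None) stays missing.
    """
    if answer is None:
        return None
    if answer in AT_LEAST_WEEKLY:
        return 1
    if answer in AT_MOST_MONTHLY:
        return 0
    raise ValueError(f"unknown response label: {answer!r}")


def pair_discordance(twin_a, twin_b):
    """Return 1 for a discordant pair, 0 for a concordant pair.

    The pair is coded as missing (None) if either co-twin is missing.
    """
    if twin_a is None or twin_b is None:
        return None
    return int(twin_a != twin_b)
```

For example, a pair in which one twin reports weekly smoking and the co-twin reports never smoking would be coded 1 for the smoking variable.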
Statistical analyses
Within-pair proximity scores (WPPS) across domains
Gaussian kernels, as defined by (Δ), are commonly used functions to score the proximity between two
individuals $x_i$ and $x_j$ by projecting their mutual distance $D(x_i, x_j)$ (e.g., Euclidean distance) into the
interval [0, 1], with $\sigma$ being a tunable hyperparameter controlling the width of the neighborhoods.
A proximity score of 1 therefore reflects a pair of identical individuals, while a decreasing score value reflects
an increased dissimilarity between two individuals. We propose to adapt the Gaussian kernel to our twin design,
by scoring the proximity between co-twins, which we refer to as within-pair proximity scores (WPPS). Because
correlations between variables within each domain may be substantial, as in the metabolomic data, we did not
use the Euclidean distance as distance D(.,.), but a weighted version of it. This avoided inflation of proximity
scores between two co-twins due to repetition of information from highly correlated variables. For each domain
(e.g., metabolome, external exposome, etc.) of dimension p, WPPS were calculated as described by the following
sequential procedure:
$$K(x_i, x_j) = \exp\left(-\frac{D(x_i, x_j)}{2\sigma^{2}}\right) \tag{$\Delta$}$$
1. A Principal Component Analysis (PCA) was first conducted, from which p Principal Components (PCs)
were derived, covering 100% of the initial inertia (i.e., the total variance). We denote $\omega_k$ the percentage of
inertia covered by $PC_k$ (k = 1, …, p), i.e., the proportion of total variance explained by $PC_k$, such that $\sum_k \omega_k = 1$.
2. Then, the p PCs were scaled to mean zero and variance one.
3. The distance $D(x_i, x_j)$ between two co-twins $x_i$ and $x_j$ was defined by weighting each PC by its associated
percentage of inertia, i.e., $D(x_i, x_j) = \sum_k \omega_k \, [PC_k(x_i) - PC_k(x_j)]^2$.
4. The proximity between the two co-twins was calculated as in (Δ). The hyperparameter $\sigma$ was set to one.
With σ set to one, two co-twins whose inertia-weighted mean squared difference across all p PCs of a domain
exceeds one (i.e., one unit of variance of the scaled PCs) have a WPPS lower than exp(−1/2) ≈ 0.61.
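Steps 3 and 4 of the procedure above can be sketched in a few lines of Python. The sketch assumes that the PCA and the scaling of the PCs (steps 1 and 2) have already been carried out upstream, so that each co-twin is represented by a list of standardized PC scores and `weights` holds the proportions of inertia; the function names are hypothetical and this is not the authors' R code.

```python
import math


def weighted_distance(pc_i, pc_j, weights):
    """Inertia-weighted squared distance between two co-twins.

    pc_i, pc_j: standardized PC scores of each co-twin (same length p).
    weights:    proportion of total variance explained by each PC (sums to 1).
    """
    if not (len(pc_i) == len(pc_j) == len(weights)):
        raise ValueError("PC scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-8:
        raise ValueError("weights must sum to 1")
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, pc_i, pc_j))


def wpps(pc_i, pc_j, weights, sigma=1.0):
    """Within-pair proximity score: Gaussian kernel of the weighted distance."""
    distance = weighted_distance(pc_i, pc_j, weights)
    return math.exp(-distance / (2.0 * sigma ** 2))
```

With `sigma=1.0`, two co-twins with identical PC scores obtain a score of 1, and a pair differing by exactly one unit on every PC obtains exp(−1/2), which matches the threshold quoted in the text.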
Since we used a weighted version of the Euclidean distance to compute the WPPS, we evaluated the extent to which
the WPPS might differ if we used another metric, such as the Manhattan distance. We found that domain-specific WPPS
were highly correlated between the Euclidean-based and Manhattan-based methods, with Pearson correlations
ranging from 0.96 to 0.98. This suggests that the choice of distance used to compute the WPPS is likely to have
little effect on the results presented in the current study.
WPPS-WPPS and WPPS-lifestyle associations
First, we sought to investigate whether sex, zygosity and the age at which the co-twins were first separated from
the family home60 were associated with each domain-specific WPPS. We therefore fitted linear regressions
modeling each WPPS as the dependent variable, and sex, zygosity, and age at separation of the pair as independent
variables.
We then examined whether co-twins being similar at the scale of a particular domain may also share greater
similarity at the scale of another domain. To do so, we quantified pairwise associations between WPPS across
domains using linear regressions. We modeled these associations considering one WPPS as the dependent variable
and the other as the independent variable. The dependent variable of each domain pair was prioritized as follows: (1)
external exposome, (2) EAA, (3) proteome, and (4) metabolome. Differences in age at blood sampling between
co-twins were added as a covariate to correct for potential effects on WPPS, even though these differences were
zero days for three quarters of the twin pairs and averaged only 20 days across all pairs. A WPPS of a specific domain
was considered significantly associated with a WPPS of another domain if the coefficient of the latter was
significantly non-zero, as assessed by t-test (p < 0.05).
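The regression of one WPPS on another, adjusted for the within-pair difference in age at blood sampling, can be sketched with ordinary least squares in plain Python. This is an illustrative reimplementation under stated assumptions, not the authors' R code: the helper names `solve` and `ols` are hypothetical, and the p values use a normal approximation to the t distribution, which is only reasonable for sample sizes of the order used here (a few hundred pairs).

```python
import math
from statistics import NormalDist


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(y, predictors):
    """Least-squares fit of y on the given predictor columns plus an intercept.

    Returns (coefficients, standard errors, approximate two-sided p values),
    with the intercept first.
    """
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in predictors]
    k = len(cols)
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * cols[j][i] for j in range(k)) for i in range(n)]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sigma2 = rss / (n - k)  # residual variance
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        # j-th diagonal element of the inverse of X'X.
        se.append(math.sqrt(sigma2 * solve(xtx, unit)[j]))
    # Normal approximation to the t-test; fails if a standard error is zero.
    normal = NormalDist()
    p = [2.0 * (1.0 - normal.cdf(abs(b / s))) for b, s in zip(beta, se)]
    return beta, se, p
```

A call such as `ols(wpps_external, [wpps_proteome, age_gap_days])` would return the coefficient of the proteome WPPS as the second element of the first list; the variable names in this call are placeholders.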
To assess whether significant associations are likely to be due to genetic effects, we stratified linear regression
models by zygosity, as MZ twins are fully matched for genetics, while DZ twins are half-matched. We scaled each
WPPS to mean zero and variance one in each subsample, so as to be able to compare coefficients across the two
subsamples. The null hypothesis of no difference in these coefficients between the MZ and DZ subsamples was
assessed by a one-sided Z-test61,62. Testing was one-sided as MZ twin pairs differ from DZ twin pairs by better
genetic matching within the pair; the larger the genetic effects are, the larger the coefficients in the MZ pairs
should be compared to those in the DZ twin pairs. Sample sizes of all WPPS-WPPS association scenarios in all twin
pairs, MZ pairs only, or DZ pairs only are shown in the Supplementary Material (Supplementary document 1, Fig. S1).
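A one-sided comparison of two regression coefficients estimated in independent subsamples is commonly computed as the difference in coefficients divided by the square root of the sum of their squared standard errors. The Python sketch below uses that form; it is an assumption that this matches the exact statistic of the cited references, and the function name `one_sided_z_test` is hypothetical.

```python
import math
from statistics import NormalDist


def one_sided_z_test(beta_mz, se_mz, beta_dz, se_dz):
    """Test whether the MZ coefficient is larger than the DZ coefficient.

    Assumes the two estimates come from independent subsamples.
    Returns (z, p), where p is the upper-tail probability of z under
    the standard normal distribution.
    """
    z = (beta_mz - beta_dz) / math.sqrt(se_mz ** 2 + se_dz ** 2)
    p = 1.0 - NormalDist().cdf(z)
    return z, p
```

Under this convention a small p value supports a larger coefficient among MZ pairs than among DZ pairs, which is the direction motivated in the text.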
In the second step, we sought to investigate whether co-twins sharing similar proteome, metabolome, EAA
or external exposome differed in lifestyle. To do so, we quantified associations between WPPS from each domain
and discordance in the 14 lifestyle variables, as in a standard Exposome-Wide Association Study (ExWAS).
Differences in age at blood sampling were added as a covariate. Associations between each domain-specific WPPS and
lifestyle variables were corrected for multiple testing by False Discovery Rate (FDR) using the
Benjamini–Hochberg method, and we discussed associations for which FDR-corrected p values were lower than 0.2.
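The Benjamini–Hochberg adjustment applied to the lifestyle associations of one domain can be written compactly. The following Python function is a generic sketch of the standard procedure, with a hypothetical name, rather than the code used in the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

Associations would then be retained for discussion when their adjusted value falls below the 0.2 threshold stated above.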
Sensitivity analyses
Because the twins were young, some of them lived in the same familial household as their co-twin. This resulted
in 80 twin pairs with an external exposome WPPS of 1. To evaluate potential problems caused by the skewness
of this WPPS in linear modeling (skewness: −0.73), we performed two sensitivity analyses. First, we modified
the external exposome WPPS by transforming it with the logit function so that the values are spread
over the whole real axis. As the value 1 is not within the valid range of application of the logit function, we
first remapped WPPS into the interval [0.025, 0.975] by linear transformation. The second sensitivity analysis
consisted of excluding pairs with a WPPS of 1 from the analyses in which the external exposome was
included. We reported associations in both configurations and discussed potential differences observed with
the original modeling.
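The first sensitivity analysis can be illustrated as follows. The sketch assumes that the linear remapping sends the theoretical WPPS range [0, 1] onto [0.025, 0.975]; the text does not say whether the observed minimum and maximum were used instead, and the function names are hypothetical.

```python
import math


def remap(score, low=0.025, high=0.975):
    """Linearly map a score from [0, 1] onto [low, high] (assumed mapping)."""
    return low + (high - low) * score


def logit(p):
    """Log-odds of p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def logit_wpps(score):
    """Logit of the remapped WPPS, finite for every score in [0, 1]."""
    return logit(remap(score))
```

With this mapping, the 80 pairs sharing a household (WPPS of 1) receive a finite transformed value of about log(39) instead of an undefined logit.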
Data availability
FinnTwin12 data analyzed in this study is not publicly available due to the restrictions of informed consent.
Requests to access these datasets should be directed to the Institute for Molecular Medicine Finland (FIMM)
Data Access Committee (DAC) (fimm[email protected]) for authorized researchers who have IRB/ethics approval
and an institutionally approved study plan. To ensure the protection of privacy and compliance with national
data protection legislation, a data use/transfer agreement is needed, the content and specific clauses of which
will depend on the nature of the requested data.
Code availability
Methodological resources, including R scripts, are available upon request from the corresponding author.