Methodological Considerations on the External Validity of the Kim HJ et al. COVID-19 Vaccination Study (Biomark Res 13:114, 2025): A Quantitative Analysis
Marco Roccetti
ma co. occe [email protected]
University of Bologna, Department of Computer Science and Engineering, Bologna, Italy
Corresponding Author: Marco Roccetti
Affiliation: University of Bologna, Department of Computer Science and Engineering, Bologna, Italy
Email Add ess: ma co. occe [email protected]
ORCID: 0000-0003-1264-8595
Word Count: 4402
Abstract
Despite the fundamental importance of retrospective studies in assessing the real-world impact of COVID-19 vaccination, many of these works employ cohort construction methodologies that do not adhere to the most basic rules of biostatistics, thus compromising the validity of their results. The objective of this study is to formally evaluate the methodology and test the external validity of a recent large-scale cohort study that reported a surprising and significantly higher 1-year cancer incidence risk in the COVID-19 vaccinated group. Aggregated raw data (n=2,975,035 individuals) from the cohort were used to calculate the overall Crude Incidence Rate (CR) of cancer. The resulting cohort CR was then compared against the established official national average CR for the reference period (2020–2022) to assess external validity. A secondary analysis employed the Chi-Squared Goodness-of-Fit Test to quantify the impact of the 1:4 Propensity Score Matching (PSM) on the age structure of the final cohort against the national demographic benchmark. The cohort's overall CR was 40.78 per 10,000, a substantial 22.26% downward deviation from the national average of 52.46 per 10,000 (SD 2.97). This discrepancy establishes a pronounced epidemiological paradox, strongly indicating a lack of external validity. Furthermore, the Chi-Squared test revealed a profound structural alteration, with a value of 69,370 (p < 0.00001), confirming that the PSM procedure underrepresented the high-risk demographic group (>= 65 years) compared to the national average (12.15% observed vs 18.00% expected). In conclusion, the reliability of the statistical associations reported by the scrutinized study is significantly challenged by the lack of external validity and the methodological ambiguities concerning the composition of the cohort. Independent validation is mandatory, necessitating immediate public access to the underlying administrative health data sources.
Keywords: COVID-19 Vaccination, Crude Incidence Rate, Cancer Incidence, Epidemiological Paradox, External Validity, Cohort Representativeness.
Introduction
The value of retrospective studies in analyzing the real-world effectiveness of COVID-19 vaccination is undeniable. However, it is concerning that many studies do not adhere to the basic biostatistical requirements of cohort construction, effectively rendering their results less valid or unreliable. Importantly, the accurate assessment of post-marketing adverse events, particularly those associated with widespread public health interventions like COVID-19 vaccination, is critical to public trust and effective health policy. Observational studies drawing from national health databases are essential tools in this process, offering large sample sizes and real-world data. However, the reliability of such studies is fundamentally dependent on the methodological rigor applied to cohort selection and statistical adjustment [1].
A core standard of rigor in epidemiology is External Validity [2]. This concept refers to the extent to which the findings of a study can be generalized to other populations, settings, and circumstances outside the study's specific cohort. For a cohort derived from a national registry, high external validity requires that the study's overall burden of disease (measured by the Crude Incidence Rate, or CR) is statistically consistent with the known national burden of disease for the same period. Failure to meet this standard, often due to sampling or selection issues, means the cohort is not representative of the broader population, rendering its conclusions questionable in a real-world context.
Study [3] provides a recent, retrospective, population-based cohort analysis utilizing data from a South Korean administrative database to investigate the 1-year risks of cancers associated with COVID-19 vaccination in South Korea [4]. The study's finding, suggesting a higher rate of new cancer cases among the vaccinated population compared to the unvaccinated, is a surprising and scientifically challenging result that has yet to be analyzed with the required depth and urgency, especially considering the global scope of the vaccination programs [5].
The present work serves as a comprehensive scrutiny that integrates two sequential computational analyses. Initially, we identified a pronounced epidemiological paradox based on raw incidence calculations derived from the study's supplementary data. This paradox established a significant external inconsistency between the study cohort's aggregate cancer incidence and official national statistics [6]. The follow-up analysis we developed is an integral part of the overall argument and posits a specific methodological explanation for this paradox: the likely misapplication or inversion of the 1:4 Propensity Score Matching (PSM) procedure [7] used in [3].
In summary, the primary objective of this paper is to quantify and demonstrate the severity of the external inconsistency observed in the scrutinized cohort, thereby challenging its external validity. We then propose a plausible methodological hypothesis, specifically the inverted Propensity Score Matching (PSM), that would reconcile this numerical discrepancy and ultimately call into question the reliability of the study's final association results. It is essential to undertake this critical examination using the known data, as the magnitude of the finding demands the highest level of scientific scrutiny.
Methods
In this Section, we provide all the necessary details on the data and methods used in our study, allowing readers to easily replicate our findings.
Sources of data
This analysis is based entirely on publicly available, aggregated data extracted from [3] and official South Korean national health statistics as reported in [8-11]. In particular, the raw cohort figures necessary for calculation were obtained from Table S4 ("Cumulative incidences of overall cancers in the matched cohort between vaccinated and unvaccinated individuals") from the Supplementary Material of [3]. These figures are summarized in the following Table 1.
Table 1: Raw Cohort Data showing the initial and final matched cohort counts, case numbers, and the Propensity Score Matching (PSM) details used in [3].

| Metric | Value |
|---|---|
| Initial Cohort Size | 8,407,849 individuals |
| Final Matched Cohort Size | 2,975,035 individuals |
| Total Cancer Cases in Matched Cohort | 12,133 cancer cases |
| Unvaccinated Group (N) | 595,007 individuals |
| Unvaccinated Group (Cases) | 1,989 cancer cases |
| Vaccinated Group (N) | 2,380,028 individuals |
| Vaccinated Group (Cases) | 10,144 cancer cases |
| Propensity Score Matching (PSM) | 1:4 Ratio |

The Official National Cancer Statistical data, including the Official Crude Incidence Rate (CR) for all cancers in South Korea, were instead sourced from the Korean Central Cancer Registry for the years immediately preceding and during the study period (2020–2022), as reported in [8-11]. These data provide the robust national baseline against which the study cohort's representativeness has been tested. Furthermore, the official South Korean demographic structure, which specifies that the population aged >= 65 years constitutes 18.00% of the total, was used as the expected national baseline for assessing the cohort's age representativeness, as reported in [12].
Definition and calculation of Crude Incidence Rate (CR)
The Crude Incidence Rate (CR) per 10,000 population is a fundamental epidemiological measure used here specifically to evaluate the external validity of the cohort. Unlike Age-Standardized Rates (ASRs), which adjust for age distribution to allow comparison between populations, the CR reflects the raw burden of disease in a defined population over time [13]. Most importantly, any cohort derived from a national database should possess an aggregate CR that is statistically consistent with the national average CR for the same time period. A significant deviation signals a foundational problem in the initial sampling or selection process that any given procedure used to construct the cohort would fail to correct. The CR is calculated using the established epidemiological formula:

$$\mathrm{CR}_{\text{per } 10{,}000} = \frac{\text{Number of new cases during a given period}}{\text{Average population at risk during the same period}} \times 10{,}000 \qquad (1)$$
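As a minimal sketch, Equation (1) can be applied to the counts of Table 1 as follows. The function name `crude_incidence_rate` is illustrative and not taken from [3]; the sketch also assumes, as this analysis does, that the matched group sizes stand in for the average population at risk. Values computed this way may differ from tabulated ones in the last decimal because of rounding.

```python
def crude_incidence_rate(new_cases: int, population: int, per: int = 10_000) -> float:
    """Crude Incidence Rate: new cases divided by population at risk, scaled to `per`."""
    if population <= 0:
        raise ValueError("population must be positive")
    return new_cases / population * per


# (cases, group size) as listed in Table 1 for the matched cohort of [3].
cohort = {
    "overall": (12_133, 2_975_035),
    "vaccinated": (10_144, 2_380_028),
    "unvaccinated": (1_989, 595_007),
}

rates = {group: crude_incidence_rate(cases, n) for group, (cases, n) in cohort.items()}

for group, rate in rates.items():
    print(f"CR ({group}): {rate:.2f} per 10,000")
```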
This is followed by a straightforward calculation of the official South Korean CR baseline [8-11], whose values for both sexes per 100,000 population were converted to a per 10,000 basis to establish the national benchmark as recorded in Table 2.
Table 2: Official National Crude Incidence Rates (CR) for All Cancers in South Korea per 100,000 and the derived CR per 10,000, used to establish the national average baseline for the reference period (2020–2022).
Consequently, the official average CR baseline for all cancers for the reference period (2020–2022) can be established as the mean of these values: CR (Official Average) = 52.46 per 10,000, with Standard Deviation SD = 2.97. The SD is calculated using N as the denominator, under the assumption that the three annual CR values constitute the whole population for the reference period. Finally, using the raw figures from Table S4 in the Supplementary Material provided in [3], the CRs of Table 3 are calculated with the CR equation for the cohort of interest.
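The national baseline described above can be reproduced with a few lines of standard-library Python. This is only a sketch: the variable names are illustrative, and `pstdev` is used because the text states that the SD takes N (not N - 1) as the denominator.

```python
from statistics import mean, pstdev

# Official crude incidence rates per 100,000 (all cancers, both sexes), as in Table 2.
cr_per_100k = {2020: 482.9, 2021: 540.6, 2022: 550.2}

# Rescale to a per-10,000 basis.
cr_per_10k = [rate / 10 for rate in cr_per_100k.values()]

baseline_mean = mean(cr_per_10k)
# Population standard deviation (divides by N), matching the stated assumption.
baseline_sd = pstdev(cr_per_10k)

print(f"mean = {baseline_mean:.2f} per 10,000, SD = {baseline_sd:.2f}")
```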
Table 3: Calculated Crude Incidence Rates (CR) for the matched cohort of [3], showing the overall rate for the entire cohort and the rates for the segregated vaccinated and unvaccinated groups.
Hypothesis formulation on Propensity Score Matching (PSM) inversion
A 1:4 Propensity Score Matching (PSM) procedure aims to match each individual in the Treatment Group with four comparable individuals from the Control Group [7]. In general, PSM is a quasi-experimental statistical method used to reduce the confounding bias that occurs when estimating the effect of a treatment or intervention (like vaccination, in our case) in observational studies. The Propensity Score is defined as the conditional probability of an individual receiving the treatment given a set of observed covariates (e.g., age, sex, comorbidities). Hence, the propensity score e(X) is given by e(X) = Prob(Z = 1 | X), where Z is the treatment assignment and X is the vector of baseline covariates.
The PSM calculation procedure involves a multi-step process. First, a logistic regression model is constructed to estimate the propensity score for every individual, based on the set of observed confounders. Once the propensity scores are calculated, the matching phase begins. Different matching algorithms exist (e.g., nearest neighbor, caliper, or kernel matching). In the reported 1:4 PSM, each treated individual (or the base group) should be paired with four comparable control individuals whose propensity scores are nearly identical. This process would effectively create a synthetic, balanced cohort where the two groups are comparable on all measured confounders, thereby minimizing selection bias.
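To make the matching phase concrete, the following toy sketch pairs each treated unit with its four nearest controls inside a caliper. It is not the algorithm used in [3], whose details are not available: the function `match_one_to_k`, the greedy strategy, the caliper of 0.05, and all scores are hypothetical, and the estimation of the scores (e.g., by logistic regression) is assumed to have been done beforehand.

```python
def match_one_to_k(treated, controls, k=4, caliper=0.05):
    """Greedy 1:k nearest-neighbour matching on the propensity score, without replacement.

    treated, controls: dicts mapping a unit id to its estimated propensity score.
    Returns a dict mapping each matched treated id to a list of k control ids.
    Treated units with fewer than k controls inside the caliper are dropped.
    """
    available = dict(controls)
    matches = {}
    for t_id, t_score in treated.items():
        # Controls still available and within the caliper, closest first.
        candidates = sorted(
            (c_id for c_id, c_score in available.items()
             if abs(c_score - t_score) <= caliper),
            key=lambda c_id: abs(available[c_id] - t_score),
        )
        if len(candidates) < k:
            continue
        matches[t_id] = candidates[:k]
        for c_id in candidates[:k]:
            del available[c_id]  # each control is used at most once
    return matches


# Hypothetical propensity scores, for illustration only.
treated = {"t1": 0.30, "t2": 0.70}
controls = {
    "c1": 0.29, "c2": 0.31, "c3": 0.32, "c4": 0.28, "c5": 0.33,
    "c6": 0.69, "c7": 0.71, "c8": 0.72, "c9": 0.68, "c10": 0.50,
}
matched = match_one_to_k(treated, controls)
print(matched)
```

In this toy run the base group is the treated one, so the matched set contains four controls per treated unit; swapping the two arguments would make the controls the base group instead.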
The primary rationale for using PSM is to mimic the randomization process of a randomized controlled trial in non-randomized observational data. By balancing the distribution of baseline covariates between the treated and control groups, PSM aims to isolate the true effect of the treatment from the effects of confounding factors that influenced the decision to vaccinate. If the PSM is successfully implemented, any residual difference in outcome between the matched groups can be more confidently attributed to the treatment itself. A failure in the PSM process, or a misapplication like the hypothesized inversion, fundamentally undermines this rationale and reintroduces significant bias into the analysis.

Table 2 (official national CRs, 2020–2022):

| Year | CR per 100,000 | CR per 10,000 |
|---|---|---|
| 2020 | 482.9 | 48.29 |
| 2021 | 540.6 | 54.06 |
| 2022 | 550.2 | 55.02 |

Table 3 (calculated cohort CRs):

| Group | Calculation: (New Cancer Cases / Population) x 10,000 | Crude Incidence Rate (CR) |
|---|---|---|
| CR (Cohort Overall) | (12,133 / 2,975,035) x 10,000 | 40.78 per 10,000 |
| CR (Vaccinated) | (10,144 / 2,380,028) x 10,000 | 42.63 per 10,000 |
| CR (Unvaccinated) | (1,989 / 595,007) x 10,000 | 33.43 per 10,000 |
Age-Stratified Data and Chi-Squared Goodness-of-Fit Test
To formally assess the structural integrity of the final matched cohort, we utilize the age-stratified data provided in the supplementary materials (Table S4) of [3]. The full breakdown, presented in Table 4, is essential to validating the cohort's external validity against the national age structure.
Table 4: Age-stratified composition of the matched cohort, based on publicly available supplementary data of [3].
The Chi-Squared Goodness-of-Fit Test is used to formally test the null hypothesis that the age distribution of the final PSM-matched cohort is consistent with the known age distribution of the general South Korean population (82% under 65 years, 18% 65 years and older). This statistical test quantifies the magnitude of the difference between the observed frequencies in the study cohort and the expected frequencies based on the national demographic benchmark. A statistically significant result from this test indicates that the sampling or matching procedure introduced a profound structural bias, challenging the cohort's representativeness and, consequently, its external validity. The calculation uses one degree of freedom (two age categories minus one), based on the formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E} \qquad (2)$$

where O is the observed frequency in the cohort and E is the expected frequency based on the national demographic proportion.
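For the two age categories used here, Equation (2) can be written out explicitly. The following is only a worked restatement, assuming that the expected counts are 82% and 18% of the cohort size N, and writing O_{<65} and O_{>=65} for the observed counts (symbols introduced here for illustration). Because the two observed counts sum to N, both deviations have the same magnitude, which gives the compact form on the right.

```latex
\chi^2
  = \frac{(O_{<65} - 0.82\,N)^2}{0.82\,N}
  + \frac{(O_{\ge 65} - 0.18\,N)^2}{0.18\,N}
  = \frac{(O_{\ge 65} - 0.18\,N)^2}{0.82 \cdot 0.18 \cdot N}
```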
To conclude, we reiterate that all data used for calculations are publicly available and explicitly cited from the supplementary materials of [3] and official Korean cancer registries as reported in [8-11]. All calculations of this study are fully reproducible by using the method described in this Section, plus the data referenced above.
Patient and Public Involvement
Patients and/or the public were not involved in the design, conduct, reporting, or dissemination plans of this research.
Results
This Results Section presents three types of results: first, the quantification of the Epidemiological Paradox through the comparison of the calculated Crude Incidence Rate (CR) against the national baseline; second, the numerical evidence supporting the hypothesis of Propensity Score Matching (PSM) inversion; and third, the statistical quantification of the PSM's impact on the age structure of the cohort.
Quantification of the epidemiological paradox
The comparison between the study cohort's aggregated CR and the national average CR revealed a substantial and significant downward deviation, confirming the epidemiological paradox as summarized in the following Table 5.

Table 5: Quantification of the Epidemiological Paradox: Comparison of the scrutinized Cohort's overall Crude Incidence Rate (CR) against the Official National Average CR, highlighting the severe downward deviation.

| Metric | Rate per 10,000 | Reference/Calculation |
|---|---|---|
| Official National Average CR (2020–2022) | 52.46 | Table 2 |
| Cohort Overall CR | 40.78 | Table 3 |
| Absolute Deviation | -11.68 | (40.78 - 52.46) |
| Percentage Deviation | -22.26% | (-11.68) / 52.46 x 100 |

Table 4 (age-stratified composition of the matched cohort):

| Group | Total Participants | Total Cancer Cases | % Participants < 65 | % Cases < 65 | % Participants >= 65 | % Cases >= 65 |
|---|---|---|---|---|---|---|
| Unvaccinated | 595,007 | 1,989 | 87.85 | 69.03 | 12.15 | 30.97 |
| Vaccinated | 2,380,028 | 10,144 | 87.85 | 67.64 | 12.15 | 32.36 |
| Total | 2,975,035 | 12,133 | 87.85 | 67.87 | 12.15 | 32.13 |
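The two deviation rows of Table 5 follow directly from the rounded rates, as the short check below illustrates. The variable names are illustrative; the inputs are the two-decimal values reported in the text.

```python
national_cr = 52.46  # official national average CR per 10,000 (2020-2022)
cohort_cr = 40.78    # overall CR of the matched cohort per 10,000

absolute_deviation = cohort_cr - national_cr
percentage_deviation = absolute_deviation / national_cr * 100

print(f"absolute deviation:   {absolute_deviation:.2f} per 10,000")
print(f"percentage deviation: {percentage_deviation:.2f}%")
```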
The paradox is summarized as follows: the study's analysis suggests an elevated cancer risk within the majority group (vaccinated CR is 27.7% higher than unvaccinated CR), which should intuitively push the overall cohort CR higher, yet the overall CR is 22.26% lower than the national baseline. This profound inconsistency represents a strong presumption of the cohort's lack of representativeness, which awaits formal refutation, though such a refutation appears mathematically challenging. To comprehend the full impact of this deviation, one must translate these statistical discrepancies into absolute numbers, which reveal the magnitude of the effect. Based on the cohort's overall rate of 40.78 per 10,000 and applying this to South Korea's population (approx. 51.77 million inhabitants [12]), the cohort rate would translate to approximately 210,873 new annual cancer cases. This is over 61,000 fewer new cases than the 271,957 derived from the official national average rate of 52.46 per 10,000 for the same population. This massive deficit in expected cases underscores the profound lack of representativeness.
Evidence supporting the PSM inversion hypothesis
The hypothesis that the 1:4 PSM was inverted is strongly supported by the final cohort sizes reported in [3]. Given the study's focus on the COVID-19 vaccine, the standard and appropriate group assignment should have been: Treatment = Vaccinated; Control = Unvaccinated. The hypothesis of Inverted PSM is here formulated by analyzing the final reported cohort sizes: 595,007 Unvaccinated and 2,380,028 Vaccinated. This distribution is mathematically consistent with taking the smaller group (approx. 600,000) as the base "1" and matching it to the larger group (approx. 2.4 million) as the "4" component, suggesting the reverse assignment: Base Group = Unvaccinated; Matched Group = Vaccinated.
All this is further evidenced by the "numerical signature" whereby the total matched cohort (2,975,035) is exactly five times the size of the smaller unvaccinated group (595,007), representing the sum of the 1:4 ratio. This result indicates that the smaller unvaccinated group was used as the base "1" for the matching, thus inverting the standard procedure and yielding the final ratio: Ratio (Vaccinated / Unvaccinated) = 2,380,028 / 595,007 = 4. The vaccinated group (2,380,028) is precisely four times the size of the unvaccinated group (595,007). This exact numerical construction solidifies the argument for an inversion, where the small, unvaccinated pool defined the base cohort size for the 1:4 matching.
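The "numerical signature" can be verified with integer arithmetic on the reported group sizes. The sketch below uses illustrative variable names and confirms only the arithmetic of the ratios; it does not by itself identify which group served as the base in [3].

```python
unvaccinated_n = 595_007
vaccinated_n = 2_380_028
matched_total = 2_975_035

ratio = vaccinated_n / unvaccinated_n

# Exact integer checks, so no rounding is involved.
is_exact_one_to_four = vaccinated_n == 4 * unvaccinated_n
total_is_five_times_base = matched_total == 5 * unvaccinated_n

print(ratio, is_exact_one_to_four, total_is_five_times_base)
```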
PSM Impact on Cohort Age Structure: Chi-squared Test Quantification
The analysis of the age-stratified data from Table 4 reveals a severe structural alteration in the cohort resulting from the PSM procedure. The cohort's total population aged >= 65 years (from Table 4) constitutes only 12.15% of the total cohort (361,425 / 2,975,035 x 100). This represents a substantial 5.85 percentage point downward deviation from the expected national demographic average of 18.00% for the >= 65 age bracket. To formally assess the statistical significance of this deviation, a Chi-Squared Goodness-of-Fit Test was performed, comparing the observed age structure of the cohort against the national benchmark (82% vs 18%). The test yielded an exceptionally high value of Chi-squared = 69,370 (with one degree of freedom). This result translates into a p value far lower than 0.00001. This overwhelming statistical evidence confirms that the final matched cohort is not a random representation of the South Korean population and possesses a structural age distribution that is profoundly biased towards the younger, lower-risk demographic. This quantified structural defect provides the direct methodological explanation for the suppressed Crude Incidence Rate (CR) identified in the previous Section.
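A minimal standard-library sketch of this test is given below. The helper names are illustrative, and the p-value uses the identity that a chi-squared variable with one degree of freedom is the square of a standard normal. With the rounded count quoted above (361,425), the statistic comes out on the order of 69,000; the exact figure depends on the counts used, so it need not match the reported value to the last digit.

```python
import math


def chi_squared_gof(observed, expected_proportions):
    """Pearson chi-squared goodness-of-fit statistic for observed counts
    against expected proportions (assumed to sum to 1)."""
    total = sum(observed)
    expected = [p * total for p in expected_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi2):
    """Upper-tail p-value of the chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


cohort_total = 2_975_035
aged_65_plus = 361_425                  # count quoted in the text
under_65 = cohort_total - aged_65_plus

share_65_plus = aged_65_plus / cohort_total * 100
chi2 = chi_squared_gof([under_65, aged_65_plus], [0.82, 0.18])
p = p_value_df1(chi2)

print(f"share >= 65: {share_65_plus:.2f}%")
print(f"chi-squared (df = 1): {chi2:,.0f}")
print(f"p < 0.00001: {p < 1e-5}")
```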
Discussion
Principal Findings
The first part of this Discussion synthesizes the main quantitative findings, focusing on the core issues raised. First, it is worth recalling that this research has integrated computational epidemiological analyses to critically evaluate the methodology and findings of the scrutinized cohort of study [3], which reported the surprising finding of a higher cancer incidence in the vaccinated group. Our initial critique established a pronounced epidemiological paradox: while the cohort suggested a higher crude cancer incidence rate (CR) among the vaccinated group, the overall cohort CR was found to deviate downwards by 22.26% from the official national average CR (2020–2022). This fundamental discrepancy suggested a lack of external validity. We then found and statistically validated a plausible methodological explanation for this paradox: the reported 1:4 Propensity Score Matching (PSM) procedure resulted in a quantifiable structural bias in the cohort's age composition. This bias essentially introduces unidentified confounding factors and artificially shifts the overall CR downwards.
The core issue of our research is the epidemiological paradox demonstrating a profound lack of external validity of the cohort of [3]. This is evidenced by the suppressed overall CR of 40.78 per 10,000 compared to the national average of 52.46 per 10,000. This fundamental inconsistency suggests that the sample is not representative of the underlying population's cancer incidence. Further, the discrepancy is so large that it cannot be dismissed as a minor fluctuation, calling into question the reliability of the study's conclusions. Second, the hypothesis that the 1:4 PSM was inverted provides a concrete methodological explanation for the observed low CR. While the numerical signature (the exact 4:1 ratio) initially suggested an inversion in the base group assignment, the subsequent Chi-squared test definitively quantifies the structural consequence of this procedure. The highly significant value of the Chi-squared test has shown that the PSM process, or the initial sampling it followed, resulted in a cohort that severely underrepresents the high-risk demographic group (>= 65 years), which makes up only 12.15% of the final sample instead of the expected 18.00%. This quantified alteration of the age structure, favoring a younger population with a lower baseline cancer incidence, explains the suppression of the overall CR compared to the national average. Thus, the PSM procedure, intended to balance known confounders, effectively destroyed the external validity of the cohort by altering its fundamental demographic signature.
In fairness, we must also remember that matching from the less numerous group (unvaccinated) is standard statistical practice. However, in a scenario where the exposed group (vaccinated) is the overwhelming majority, this choice has significantly constrained the cohort construction. Even if the PSM were executed correctly according to its internal algorithm, this choice would always drastically reduce the external validity, as the resulting cohorts would lose their ability to reflect the true epidemiological background of the original population.
Strengths and Limitations
The present analysis has several strengths, which we condense into three fundamental points: a) the study provides concrete, reproducible mathematical evidence (the 22.26% CR deviation) establishing a fundamental lack of external validity, which overrides subsequent statistical associations; b) it formulates a specific, testable hypothesis, the inverted PSM procedure, which would numerically reconcile the observed epidemiological paradox (4:1 ratio), offering a concrete explanation for the bias; c) the critique shifts the focus from clinical outcomes to core methodological integrity (CR and PSM), serving as a crucial cautionary example for future large-scale epidemiological studies utilizing administrative data.
The limitations inherent to the present study, instead, stem primarily from the nature of observational analysis built upon aggregated, externally published data. Specifically, our quantitative findings regarding the suppressed Crude Incidence Rate (CR) and the hypothesis of Propensity Score Matching (PSM) inversion cannot be definitively resolved without primary data access. However, these limitations are ultimately attributable to the original study [3], which, despite its vast scale, did not provide the necessary data access for independent validation. The lack of data transparency, an ethical and scientific imperative, remains the single greatest constraint, forcing our scrutiny to rely on numerical signatures and logical inference rather than direct verification. Moreover, we must clarify that our intention is not to reject the statistical associations found in [3], showing a higher incidence of cancers in the COVID-19 vaccinated population. Such associations, while potentially valid within the limits of their non-representative cohort, are not the focus of this critique. However, it is impossible to ignore the profound discrepancies in the Crude Incidence Rate and the strong evidence suggesting the inverted application of the 1:4 PSM. These possible methodological flaws would render the external validity and, consequently, the reliability of the hazard ratios derived from this specific cohort highly questionable until proven otherwise.
Conclusion
The fundamental premise of any large population-based study is that the raw incidence rate of any common background event must be consistent with national epidemiological surveillance and its gold standard [13-15]. The Paradox of Crude Rates, showing simultaneously an increase in the vaccinated group and an overall decrease, borders on the nonsensical from the perspective of public health and surveillance. Specifically, the 22.26% downward deviation observed in the CR of the cohort of [3], supported by our Chi-squared test quantifying the structural age bias, constitutes plausible evidence of the lack of external validity of this exemplar case, in which the compatibility checks of the selected cohort against the gold standards were not properly conducted. This failure is particularly alarming because it has led to a situation where, due to methodological and numerical inconsistencies, the reported results cannot be accepted at a large scale. To resolve these ambiguities and reinforce scientific rigor, the following actions are essential: a) transparency from those who should disclose the detailed algorithmic methodology and group assignment used for the cohort construction, and b) full disclosure of the initial data to allow for independent verification and resolution of the raised methodological discrepancies. While public availability of the entire South Korean database is ideal, a comprehensive, extended summary of the raw data, consistent with privacy regulations, would be acceptable, provided it is detailed enough to confirm or invalidate the methodological concerns raised here. Resolving the identified methodological ambiguity is fundamental to global trust in reported associations and broader COVID-19 vaccine safety assessments.
Acknowledgments: Not applicable
Funding: The author does not declare a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Data availability statement: All data used for calculations are publicly available and explicitly cited from the cited literature and official South Korean cancer registries. All the calculations and results of this study are fully reproducible by using the method described in the article and the data mentioned above. Further reasonable requests can be addressed to the corresponding author (email: ma co. occe [email protected] ).
Authors' Contributions: MR as a single author conceived, designed, wrote, managed, and revised this manuscript, and has read and agreed to the published version of the manuscript.
Conflict of Interests: None declared.
Research Ethics Approval: Not applicable; neither humans nor animals nor plants nor personal data were involved in this study.
Patient consent for publication: Not required.
Transparency: The sole author (MR) affirms that the manuscript is an honest, accurate, and transparent account of the computational and methodological analysis being reported; that no important aspects of the publicly available data and calculations have been omitted; and that all findings are derived exclusively from the cited public data sources.
Abbreviations: CR = Crude Incidence Rate; PSM = Propensity Score Matching; ASR = Age-Standardized Rate; SD = standard deviation
References
1 Chirico F, Teixeira da Silva JA. (2022). COVID-19 Health Policies: The need for transparent data sharing between Scientists, Governments, and Policymakers. Oman Med J, 37(5). doi: 10.5001/omj.2022.63
2 Moro PL, Haber P, McNeil MM. (2019). Challenges in evaluating post-licensure vaccine safety: observations from the Centers for Disease Control and Prevention. Expert Rev Vaccines, 18(10):1091–1101. doi: 10.1080/14760584.2019.1676154
3 Kim HJ, Kim M-H, Choi MG, Chun EM. (2025). 1-year risks of cancers associated with COVID-19 vaccination: a large population-based cohort study in South Korea. Biomark Res., 13(114). doi: 10.1186/s40364-025-00831-w
4 The Straits Times Editor. (2023, August). South Korea opens Covid-19 vaccine reservations for all adults. The Straits Times. https://www.straitstimes.com/asia/east-asia/south-korea-opens-covid-19-vaccine-reservations-for-all-adults (accessed online 15 October 2025)
5 Paul E, Steptoe A, Fancourt D. (2021). Attitudes towards vaccines and intention to vaccinate against COVID-19: Implications for public health communications. Lancet Reg Health Eur, 1:100012. doi: 10.1016/j.lanepe.2020.10001
6 Murad MH, Katabi A, Benkhadra R, et al. (2018). External validity, generalisability, applicability and directness: a brief primer. BMJ Evid Based Med, 23:17-19. doi: 10.1136/ebmed-2017-110800
7 Wijn SRW, Rovers MM, Hannink G. (2022). Confounding adjustment methods in longitudinal observational data with a time-varying treatment: a mapping review. BMJ Open, 12:e058977. doi: 10.1136/bmjopen-2021-058977
8 Kang MJ, Jung K-W, Bang SH, et al. (2023). Cancer Statistics in Korea: Incidence, Mortality, Survival, and Prevalence in 2020. Cancer Res Treat., 55(2):385-399. doi: 10.4143/crt.2023.447
9 Park EH, Jung K-W, Park NJ, et al. (2024). Cancer Statistics in Korea: Incidence, Mortality, Survival, and Prevalence in 2021. Cancer Res Treat., 56(2):357-371. doi: 10.4143/crt.2024.253
10 Park EH, Jung K-W, Park NJ, et al. (2025). Cancer Statistics in Korea: Incidence, Mortality, Survival, and Prevalence in 2022. Cancer Res Treat., 57(2):312-330. doi: 10.4143/crt.2025.264
11 WHO. (2023). Republic of Korea: Health data overview. World Health Organization. https://data.who.int/countries/410 (accessed online 15 October 2025)
12 WorldBank. Population ages 65 and above (% of total population) – S Korea Rep. World Population Prospects, United Nations (UN). [Accessed 2025 Nov 2]. Available from: https://data.worldbank.org/indicator/SP.POP.65UP.TO.ZS?locations=KR
13 Last JM. (2014). A Dictionary of Public Health. Oxford University Press. doi: 10.1093/acref/9780195160901.001.0001
14 Roccetti M, Cacciapuoti G. (2025). Beyond the Gold Standard: Linear Regression and Poisson GLM Yield Identical Mortality Trends and Deaths Counts for COVID-19 in Italy: 2021–2025. Computation, 13(10):233. doi: 10.3390/computation13100233
15 Eysenbach G. (1999). Research Questions for Systematic Reviews must be Unambiguous from Protocol Stage. The BMJ, 319:1265. doi: 10.1136/bmj.319.7219.1265a