
COMET Community Meeting Slides, 30 October 2025

Author: Mentis, Dione; Buttrick, Adam
Publisher: Zenodo
DOI: 10.5281/zenodo.17542523
Source: https://zenodo.org/records/17542523/files/COMET_Community_Meeting_20251030.pdf
COMET Community Meeting
30 October 2025
Improve the Quality of PID Metadata
The Goal
Collective Stewardship
Trust through Transparency and Collective Governance
COMET Model Foundational Principles
doi.org/10.7269/C1TG6H
Collective Benefit
A rising tide lifts all boats
COMET is an innovative facilitator
Develops new capabilities whilst amplifying and connecting existing solutions, uniting community enrichments.

COMET Organisers
Public Knowledge Project
CWTS, Leiden University
California Digital Library
DataCite
Pilot Enrichment Projects
High quality curated metadata exists, but the benefits are isolated.
De-siloing quality metadata
How it works
●We normalize the institution's input data using the same method as our OpenAlex data ingestion.
●For each input work, we use fuzzy author and affiliation string matching (including name variants) to find overlaps and discrepancies.
●We capture both the raw and normalized affiliation strings from all matches.
●Use these extracted strings to query the database for all other instances of those affiliations, finding every work they appear on.
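The normalization and matching steps above can be illustrated with a minimal sketch. The slides do not say which normalization rules or fuzzy-matching algorithm the pilot uses, so everything here is an assumption: `normalize`, `similarity`, `match_authorships`, the record layout, and the 0.85 threshold are hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for whatever matcher the pilot actually applies. For brevity the sketch matches on author names (including variants) only, then captures the raw and normalized affiliation strings of each match.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace (assumed rules)."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_authorships(curated: dict, authorships: list[dict], threshold: float = 0.85) -> list[dict]:
    """Capture affiliation strings from authorships whose author name fuzzily matches.

    `curated` holds the institution's author name plus optional name variants;
    `authorships` is a hypothetical list of index-side records for the same work.
    """
    names = [curated["author"], *curated.get("variants", [])]
    captured = []
    for authorship in authorships:
        score = max(similarity(name, authorship["author"]) for name in names)
        if score < threshold:
            continue
        for raw in authorship["raw_affiliations"]:
            # Keep both forms, as the third step above describes.
            captured.append({"raw": raw, "normalized": normalize(raw), "name_score": score})
    return captured


curated = {"author": "María García", "variants": ["M. Garcia"]}
authorships = [
    {"author": "Maria Garcia", "raw_affiliations": ["EMBL, Heidelberg, Germany"]},
    {"author": "John Smith", "raw_affiliations": ["Other Univ."]},
]
matches = match_authorships(curated, authorships)
```

The captured `normalized` strings are what a later step would use to look up every other work carrying the same or a similar affiliation.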

Outputs
●Linkage Validation File
○DOI-by-DOI check showing if the institution's curated author-affiliation pair exists in OpenAlex.
○Flags discrepancies (e.g., institution says author is on a paper, but OpenAlex is missing the affiliation or ROR ID).
●Full Works List
○Uses the affiliation strings from the small input sample to capture all other works, returning the full list from OpenAlex with the same or similar affiliations, including those with incorrect or missing ROR IDs.
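A DOI-by-DOI linkage check of the kind described above might look like the following sketch. The record layout, the status labels, the `validate_linkages` and `to_csv` helpers, and the placeholder ROR ID are all hypothetical; the pilot's actual file format is not shown in the slides. Author names are compared exactly (case-insensitively) here to keep the example short, whereas the pilot describes fuzzy matching.

```python
import csv
import io

# Placeholder value, not a real ROR ID.
INSTITUTION_ROR = "https://ror.org/EXAMPLE"


def validate_linkages(curated_rows, index_by_doi, institution_ror):
    """Return one status row per curated (DOI, author) pair."""
    report = []
    for row in curated_rows:
        authorships = index_by_doi.get(row["doi"].lower())
        if authorships is None:
            status = "work_not_found"
        else:
            wanted = row["author"].lower()
            match = next((a for a in authorships if a["author"].lower() == wanted), None)
            if match is None:
                status = "author_not_found"
            elif not match.get("raw_affiliation"):
                status = "missing_affiliation"
            elif match.get("ror") != institution_ror:
                status = "missing_or_incorrect_ror"
            else:
                status = "ok"
        report.append({"doi": row["doi"], "author": row["author"], "status": status})
    return report


def to_csv(report):
    """Serialize the report rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["doi", "author", "status"])
    writer.writeheader()
    writer.writerows(report)
    return buffer.getvalue()


curated_rows = [
    {"doi": "10.1234/a", "author": "Ada Example"},
    {"doi": "10.1234/b", "author": "Ben Example"},
    {"doi": "10.1234/c", "author": "Cy Example"},
]
index_by_doi = {
    "10.1234/a": [{"author": "Ada Example", "raw_affiliation": "EMBL", "ror": INSTITUTION_ROR}],
    "10.1234/b": [{"author": "Ben Example", "raw_affiliation": "EMBL Heidelberg", "ror": None}],
}
report = validate_linkages(curated_rows, index_by_doi, INSTITUTION_ROR)
```

Rows with any status other than `ok` correspond to the discrepancies the file is meant to flag.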
Pilot Partners
European Molecular Biology Laboratory (EMBL)
Wageningen University & Research (WUR)
Results: EMBL
• Matched ~90% of curated works with reconciliation strategy, found new works, with 5% missed because of author affiliation gaps
• Remaining 5% potentially missed because of small input size (single year) or additional refinement needed on author affiliation matching
• For all works from the test year (matched + missed), 83% have correct author affiliation ROR ID assignments, 12% appear to have an EMBL affiliation string but no ROR ID, remainder affected by same author affiliation parsing gap
• Not just an OpenAlex issue! ROR needs more EMBL aliases to support matching in OpenAlex
[Figure: two EMBL charts, "Reconciliation" and "Matched + missed". Legend labels: matched; incomplete or missing affiliation (bars match); needs investigation; correct; has affiliation.]
Results: WUR
• Complex curation workflow with source and resource type exclusion prevents straightforward reconciliation result analysis, but lots to learn from WUR’s example
• Testing with example year and resource type, 92% of curated works have the correct affiliation and ROR ID assignment
• 6% of works have affiliation gaps or errors that prevent reconciliation and ROR ID assignments
• Gaps overlap with EMBL on source clustering and we found a sliver of new works!
[Figure: WUR chart. Legend labels: correct; missed ROR ID assignment; incomplete or missing affiliation (bars correct ROR ID assignment).]
Takeaways
●Pilot demonstrates a viable, scalable strategy that creates value for both sides.
●For Institutions:
○Fast, automated validation of local curation.
○A way to find new, uncurated works.
○A clear, actionable list of data gaps to address.
●For Infrastructures:
○Receives targeted, reproducible feedback from a high-quality source.
○Provides an opportunity for institutions to use their existing work to improve the scholarly record.

Next Steps
●We've flagged the parsing issues to OpenAlex for investigation, but they’re doing great!
●Future Work:
○Explore formalizing this reconciliation (e.g., harvesting assertions from OAI-PMH).
○Incorporate different inputs and fields.
●Special thanks to our partners at EMBL and WUR.
●The code for this pilot is open source (check it out!).
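As a rough illustration of what harvesting from OAI-PMH could involve, the sketch below builds `ListRecords` request URLs. The verb and the `metadataPrefix`, `from`, `set`, and `resumptionToken` arguments come from the OAI-PMH 2.0 protocol; the `list_records_url` helper and the repository base URL are hypothetical, and the slides do not describe how the project would actually implement harvesting.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    Per the protocol, resumptionToken is an exclusive argument: when it is
    supplied, only the verb and the token are sent.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, for illustration only.
BASE_URL = "https://repository.example.org/oai"
first_page = list_records_url(BASE_URL, from_date="2024-01-01")
next_page = list_records_url(BASE_URL, resumption_token="abc123")
```

A harvester would fetch `first_page`, read any `resumptionToken` from the response, and keep requesting pages until no token is returned.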
Community Perspective: Diego Restrepo from CoLav
Open Discussion: Leveraging Curation from CRIS systems
Join Us!
• Contribute curated metadata and enrichment processes
• Get involved in our pilot projects
• Save the date for our next community meeting on 21 January at 4 PM (UTC).
• Keep informed by signing up to our newsletter.