PATOF – From the PAst TO the Future: Legacy Data in Small and Medium-Scale "PUNCH" Experiments – a Blueprint for PUNCH and Other Disciplines

Author: Hu, Ding-Ze
Publisher: Zenodo
DOI: 10.5281/zenodo.17288537
Source: https://zenodo.org/records/17288537/files/PATOF___D5_report.pdf
PATOF - Deliverable D5
Ding-Ze Hu
May 2025
Abstract
The purpose of this report is to audit the aims and objectives of the
task package and the results achieved by the case studies, as well as to
clarify the list of relevant topics for further development. This report
describes the aims and objectives of the task package and the results
delivered.

1 Introduction
Data from the “PUNCH” (Particles, Universe, NuClei & Hadrons)
disciplines (particle / astroparticle / hadron & nuclear physics, and
astronomy) are valuable and often allow for new scientific insights not
expected during an experiment’s lifetime. The PUNCH disciplines are very
experienced in data management, largely due to early digital data
acquisition systems, high data rates & volumes to be managed, and
globally distributed user communities.

The experiment and strategy of metadata management from PUNCH will be
utilized in the PATOF project and we will go beyond. As the DESY
library, we provide a “cookbook” capturing the conceptual methodology of
making individual experiment-specific data FAIR and describing a “FAIR
Metadata Factory”, i.e. a process to create a naturally evolved metadata
schema by extending the last version of the DataCite metadata schema
without discarding the original individual metadata concepts.

1.1 Metadata Factory
Metadata generation is a crucial task for enhancing the FAIRness in
scientific data from concrete experiments. In this project we capture
the conceptual methodology of making individual experiment-specific
metadata schemas FAIR and describing a “FAIR Metadata Factory”, i.e. a
process to create a naturally evolved metadata schema by extending the
latest DataCite metadata schema without discarding the original
individual metadata concepts. The result is a schema that contains FAIR
experiment-specific extensions on top of the schema that is already
defined by DataCite.

1.1.1 Design of the metadata schema in Metadata Factory
To generate XML metadata records with a well-defined and
well-structured schema, the Metadata Factory uses the metadata schema
defined by DataCite in its metadata generation process. The schema
designed for the Metadata Factory has two layers. The top-layer metadata
schema, with common metadata fields, follows the schema defined by
DataCite, which is already used by many research institutes within
Germany for metadata modeling of their research data.

The second layer, which encompasses experiment-specific metadata, is
currently being designed and refined based on data and feedback provided
by domain experts from concrete experimental contexts. These
contributions are helping to ensure that the resulting structures
accurately reflect real use cases and can accommodate the diversity of
data formats, practices, and requirements observed across different
research environments.

The metadata schema defined by DataCite is well established: DataCite is
a DOI registration agency for research data sets, and its schema is used
by a large number of research institutions in Germany. We therefore
decided to use the metadata schema from DataCite for the ALPS II use
case. The schema being designed covers two types of metadata records.
One type is for DOI minting and contains the bare minimum of information
needed when minting a DOI for a data set. The other type contains
considerably more information than the bare minimum. Both types of
metadata records are linked together.

The following metadata fields make up the preliminary metadata schema
that is being designed from one of the experiment use cases, ALPS II:
• identifier
• creators
• titles
• subjects
• contributors
• publisher
• publicationYear
• resourceType
• DESY ALPSII instrument
  – device
  – UUID
  – deviceType
  – deviceName
  – manufacturer
  – firmwares
  – hardwares
  – dates
  – network
  – formFactor
  – relatedItems
  – installation
  – specifics
• DESY ALPSII software
  – UUID
  – softwareType
  – softwareName
  – contributors
    ∗ contributor
      · givenName
      · familyName
      · nameIdentifier
      · affiliation
  – version
  – operationSystem
  – dates
  – relatedItems
  – installation
  – specifics
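
To make the two-layer structure concrete, the following minimal sketch
assembles one record with Python's standard xml.etree.ElementTree
module. It is an illustration under stated assumptions, not the
project's implementation: the element name DESY_ALPSII_instrument
(written with underscores because XML names cannot contain spaces), the
omission of the DataCite namespace, and all sample values are
placeholders chosen for this example.

```python
import xml.etree.ElementTree as ET

# Placeholder values; none of them come from the ALPS II experiment.
CORE = {
    "identifier": "10.0000/example-dataset",
    "creators": ["Example, Author"],
    "title": "Example data set",
    "publisher": "Example Publisher",
    "publicationYear": 2025,
    "resourceType": "Dataset",
}
DEVICE = {
    "UUID": "00000000-0000-0000-0000-000000000001",
    "deviceType": "detector",
    "deviceName": "example-detector",
    "manufacturer": "Example Manufacturer",
}


def build_record(core, device):
    """Build a two-layer record: common core fields plus one extension."""
    root = ET.Element("resource")

    # Top layer: common fields, laid out in the DataCite style.
    ET.SubElement(root, "identifier", identifierType="DOI").text = core["identifier"]
    creators = ET.SubElement(root, "creators")
    for name in core["creators"]:
        creator = ET.SubElement(creators, "creator")
        ET.SubElement(creator, "creatorName").text = name
    titles = ET.SubElement(root, "titles")
    ET.SubElement(titles, "title").text = core["title"]
    ET.SubElement(root, "publisher").text = core["publisher"]
    ET.SubElement(root, "publicationYear").text = str(core["publicationYear"])
    ET.SubElement(
        root, "resourceType", resourceTypeGeneral="Dataset"
    ).text = core["resourceType"]

    # Second layer: experiment-specific extension (element name assumed).
    instrument = ET.SubElement(root, "DESY_ALPSII_instrument")
    for field in ("UUID", "deviceType", "deviceName", "manufacturer"):
        ET.SubElement(instrument, field).text = device[field]
    return root


record = build_record(CORE, DEVICE)
print(ET.tostring(record, encoding="unicode"))
```

Keeping the extension in its own subtree means a consumer that only
understands the core fields can ignore it without losing the citation
information.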

1.1.2 XML (Extensible Markup Language) for metadata record
For this use case, we create metadata records using XML (Extensible
Markup Language), which is one of the most common formats for
representing structured metadata. The way metadata is used typically
implies that it will be shared and interpreted beyond its original
creator community, requiring a clear and consistent structure.
Extensible Markup Language is the underlying standard that enables such
interoperability, providing a flexible yet well-defined framework for
describing, exchanging, and validating metadata across diverse systems
and domains.

1.1.3 XSD (XML Schema Definition) for defining metadata
To ensure that XML metadata records are structured consistently and can
be validated automatically, an accompanying XML Schema Definition (XSD)
file will be developed. The XSD specifies the elements, attributes, and
relationships that define the structure and content of the metadata,
effectively serving as a blueprint for data creators and users alike. By
providing this formal schema, it becomes possible to check the
correctness of XML files, maintain interoperability across different
systems, and enable automated processing of metadata within and beyond
the originating community.
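
As a rough illustration of what such a schema file might declare, the
sketch below embeds a small hypothetical XSD fragment for a device
entry and reads it back with Python's standard library. The field names
come from the preliminary field list; the types (xs:string) and the
optional manufacturer are assumptions, since the report does not
specify them. The code only inspects the schema. Validating records
against it would need an XSD-aware validator, for example the
third-party lxml package.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical XSD fragment; types and cardinalities are assumptions.
DEVICE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="device">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="UUID" type="xs:string"/>
        <xs:element name="deviceType" type="xs:string"/>
        <xs:element name="deviceName" type="xs:string"/>
        <xs:element name="manufacturer" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def declared_elements(xsd_text):
    """Return (name, required) for every typed element declaration."""
    schema = ET.fromstring(xsd_text.strip())
    result = []
    for element in schema.iter(f"{{{XS}}}element"):
        if "type" in element.attrib:  # skip the untyped container element
            required = element.get("minOccurs", "1") != "0"
            result.append((element.get("name"), required))
    return result


for name, required in declared_elements(DEVICE_XSD):
    print(name, "required" if required else "optional")
```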

• Usage of the latest version of the DataCite metadata schema: The
  metadata schema defined by DataCite is well established. DataCite is
  a DOI registration agency for research data sets, and its schema is
  used by a large number of research institutions in Germany. We
  therefore decided to use the metadata schema from DataCite for the
  ALPS II use case.
• Two types of metadata records: Two types of metadata records are
  created for a data set. One type is for DOI minting and contains the
  bare minimum of information needed when minting a DOI for the data
  set. The other type contains considerably more information than the
  bare minimum. Both types of metadata records are linked together.
• Data fields in the metadata record that is used for DOI minting:
  – identifier
  – creator
  – title
  – publisher
  – publicationYear
  – resourceType
• Data fields in the metadata record that contains more information:
  – subject
  – contributors
  – relatedItems
  – collaborations
  – DESY ALPSII instruments
    ∗ hardware
  – DESY ALPSII software
    ∗ software
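
The split into a minimal DOI-minting record and a linked extended
record can be sketched as follows. The field split follows the two
lists above. How the two records reference each other is not described
in this report, so the keys extendedRecord and describes, the
identifier of the extended record, and all sample values are
hypothetical.

```python
# Fields of the DOI-minting record, as listed above.
MINTING_FIELDS = (
    "identifier",
    "creator",
    "title",
    "publisher",
    "publicationYear",
    "resourceType",
)

# Placeholder metadata for one data set; values are invented.
FULL = {
    "identifier": "10.0000/example-dataset",
    "creator": "Example, Author",
    "title": "Example data set",
    "publisher": "Example Publisher",
    "publicationYear": 2025,
    "resourceType": "Dataset",
    "subject": "example subject",
    "contributors": ["Example, Contributor"],
}


def split_record(full, extended_id):
    """Split one metadata dict into a minimal and an extended record.

    Raises KeyError if a field needed for DOI minting is missing.
    """
    minimal = {key: full[key] for key in MINTING_FIELDS}
    extended = {k: v for k, v in full.items() if k not in MINTING_FIELDS}

    # Assumed linking mechanism: each record stores the other's identifier.
    minimal["extendedRecord"] = extended_id
    extended["identifier"] = extended_id
    extended["describes"] = full["identifier"]
    return minimal, extended


minimal, extended = split_record(FULL, "example-extended-0001")
print(minimal)
print(extended)
```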

1.2 Metadata cookbook
Building upon the Metadata Factory, the metadata cookbook consolidates
conceptual, technical, and procedural knowledge into a coherent
reference framework. It translates the experience gained from schema
design, workflow definition, and software implementation into a
structured and reusable form. The cookbook serves as both a record of
the methods employed and a practical resource that supports consistent
metadata generation across different experimental domains.

The metadata cookbook serves as both a conceptual framework and a
practical guide for the creation, extension, and management of metadata
within the Metadata Factory. It consolidates the knowledge and
experience gained through the development of metadata schemas,
workflows, and software components, translating them into a structured
and reproducible methodology. The cookbook aims to support researchers,
data managers, and developers in generating consistent, interoperable,
and FAIR-compliant metadata across a variety of experimental contexts.

At its core, the cookbook outlines a systematic process for producing
metadata records that integrate two complementary layers: the top layer,
based on the well-established DataCite metadata schema, and the second
layer, which contains experiment-specific extensions derived from
collaboration with domain experts. The top layer ensures alignment with
widely accepted metadata standards and supports DOI registration, while
the second layer captures the contextual richness of individual
experiments, including instruments, software, and data acquisition
details. Together, these layers enable both interoperability across
disciplines and precision within specific scientific domains.

Conceptually, the cookbook describes not only the structure of metadata
but also the reasoning behind its design. It provides an overview of how
common metadata elements can be reused, adapted, and extended in a
controlled manner. Practically, it offers templates, examples, and
guidance for generating XML metadata records validated by corresponding
XSD schemas. These examples illustrate how metadata fields can be
populated, linked, and verified to ensure internal consistency and
compliance with established community practices.

In addition to serving as documentation of the metadata schema, the
cookbook functions as a living resource that can evolve with the needs
of the community. It facilitates the exchange of knowledge between
research groups by providing a shared language and framework for
describing experimental data. The intention is that the cookbook can be
applied beyond the initial use cases, guiding new communities in
adopting or extending the Metadata Factory approach while maintaining
interoperability with existing infrastructures.

Ultimately, the metadata cookbook bridges the conceptual and operational
dimensions of metadata management. It captures the rationale, the
process, and the implementation details required to generate meaningful,
structured metadata that is both machine-readable and scientifically
relevant. By combining methodological clarity with practical tools, the
cookbook supports a sustainable path toward more coherent and reusable
metadata practices across diverse research environments.

1.2.1 Conceptual and Practical Guide to Metadata Generation
This section outlines the practical steps and conceptual principles that
guide the generation of metadata within the Metadata Factory framework.
The process combines standardized top-layer metadata fields with
experiment-specific extensions to ensure both interoperability and
contextual accuracy. The approach aims to enable the reproducible
generation of metadata records that are compliant with DataCite
standards while remaining adaptable to the unique needs of individual
research communities.

Step 1: Establishing the Core Metadata Layer. The first step in the
metadata generation process is to define the top-layer metadata fields
that are based on the DataCite schema. These elements form the backbone
of all metadata records and include core identifiers such as identifier,
creators, titles, publisher, and publicationYear. These fields ensure
that every record can be recognized, cited, and integrated into existing
data catalogues such as SciCat or institutional repositories.

The cookbook provides templates and XML snippets that demonstrate how
these fields are structured and how they correspond to
DataCite-compliant elements. This ensures that generated metadata
records maintain a consistent baseline and can be seamlessly
incorporated into DOI registration workflows.

Step 2: Incorporating Experiment-Specific Metadata. The second layer of
the metadata schema captures domain- or experiment-specific details.
These include fields describing instruments, software, acquisition
settings, and experimental parameters that provide the necessary
scientific context. For example, within the ALPS II use case, metadata
extensions such as DESY ALPSII instrument and DESY ALPSII software have
been defined to document detailed hardware and software information.

In practice, these fields are designed collaboratively with domain
experts, ensuring that the vocabulary, structure, and granularity
accurately reflect real experimental conditions. The cookbook
illustrates how new fields can be introduced through XML Schema
Definition (XSD) extensions, while still maintaining compatibility with
the DataCite core.

Step 3: Schema Definition and Validation. For each set of metadata
records, a corresponding XSD file defines the allowed structure,
relationships, and data types. The cookbook includes examples of XSD
structures, showing how experiment-specific elements can be declared and
linked to existing DataCite components. Validation tools are introduced
to verify that XML metadata records conform to the schema, ensuring
quality and interoperability.

Researchers and data managers can validate XML records using standard
schema validation tools or through automated pipelines integrated into
the Metadata Factory workflow. This validation process enhances the
consistency and reliability of metadata produced across different use
cases.
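
The kind of check an automated pipeline might run before full schema
validation can be sketched with the standard library alone. Note that
this is a simple required-field presence check, a lightweight stand-in
and not XSD validation itself. The element paths assume the
DataCite-style layout without a namespace, and the sample record is
invented.

```python
import xml.etree.ElementTree as ET

# Paths of the core fields a record is expected to carry (assumed layout).
REQUIRED_PATHS = (
    "identifier",
    "creators/creator",
    "titles/title",
    "publisher",
    "publicationYear",
    "resourceType",
)

GOOD = """
<resource>
  <identifier identifierType="DOI">10.0000/example-dataset</identifier>
  <creators><creator><creatorName>Example, Author</creatorName></creator></creators>
  <titles><title>Example data set</title></titles>
  <publisher>Example Publisher</publisher>
  <publicationYear>2025</publicationYear>
  <resourceType resourceTypeGeneral="Dataset">Dataset</resourceType>
</resource>
"""

# A damaged copy: the title is empty and the publisher is missing.
BAD = GOOD.replace("<title>Example data set</title>", "<title></title>").replace(
    "<publisher>Example Publisher</publisher>", ""
)


def missing_fields(xml_text):
    """Return the required paths that are absent or contain no text."""
    root = ET.fromstring(xml_text.strip())
    missing = []
    for path in REQUIRED_PATHS:
        node = root.find(path)
        if node is None or not "".join(node.itertext()).strip():
            missing.append(path)
    return missing


print(missing_fields(GOOD))
print(missing_fields(BAD))
```

A check like this reports which fields need attention, whereas a
schema validator additionally enforces order, types, and cardinality.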

Step 4: Workflow Integration and Automation. The metadata generation
process will be designed to integrate with broader data management
workflows. Within the Metadata Factory, automation scripts or modules
can populate XML templates using data provided from local databases,
instruments, or user inputs. The cookbook describes how these components
interact, providing a practical overview of the data flow from source
information to validated metadata record.

Examples in the cookbook illustrate how workflow steps can be
modularized, allowing new experimental setups or metadata schemas to be
incorporated without major structural changes. This supports scalability
and flexibility across different research contexts.
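
One possible shape for such a modular workflow is a list of small steps
applied in sequence, as in the sketch below. It is not the Metadata
Factory's actual code: the step names (collect, render,
check_well_formed), the source keys (doi, name, year), the template,
and the sample row are all hypothetical. Values are XML-escaped before
they are placed in the template.

```python
import xml.etree.ElementTree as ET
from string import Template
from xml.sax.saxutils import escape

# Hypothetical XML template with $placeholders for the core fields.
RECORD_TEMPLATE = Template(
    """<resource>
  <identifier identifierType="DOI">$identifier</identifier>
  <titles><title>$title</title></titles>
  <publisher>$publisher</publisher>
  <publicationYear>$publicationYear</publicationYear>
</resource>"""
)

# Invented row, standing in for data from a local database or user input.
SOURCE_ROW = {"doi": "10.0000/example-dataset", "name": "Run 7 & calibration", "year": 2025}


def collect(row):
    """Map source-specific keys to schema field names."""
    return {
        "identifier": row["doi"],
        "title": row["name"],
        "publisher": row.get("publisher", "Example Publisher"),
        "publicationYear": row["year"],
    }


def render(fields):
    """Fill the template, escaping characters that are special in XML."""
    safe = {key: escape(str(value)) for key, value in fields.items()}
    return RECORD_TEMPLATE.substitute(safe)


def check_well_formed(xml_text):
    """Raise xml.etree.ElementTree.ParseError if the text is not well-formed."""
    ET.fromstring(xml_text)
    return xml_text


def run_pipeline(source, steps=(collect, render, check_well_formed)):
    """Apply each step to the output of the previous one."""
    value = source
    for step in steps:
        value = step(value)
    return value


print(run_pipeline(SOURCE_ROW))
```

Supporting a new experimental setup would then mean supplying a
different collect step or template while the remaining steps stay
unchanged.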

Step 5: Sharing and Reuse. Once generated and validated, metadata
records can be deposited into shared catalogues such as SciCat or
institutional repositories, where they can be searched, reused, and
linked to publications or datasets. The cookbook provides guidance on
aligning metadata output with repository requirements and DataCite DOI
registration formats. This alignment promotes wide visibility and
facilitates integration into existing research infrastructures.

Summary. In summary, the metadata cookbook provides a structured yet
adaptable framework for generating, validating, and integrating
metadata. It guides users from conceptual schema design to practical
implementation, ensuring that metadata records remain FAIR,
interoperable, and scientifically meaningful. The approach combines a
stable foundation in the DataCite standard with the flexibility to
incorporate experiment-specific details, offering a reproducible and
extensible model for metadata creation across diverse scientific
disciplines.