EOSC EDEN M1.1 – Report on Identification of Core Preservation Processes

Author: EOSC EDEN T1.2; Lindlar, Micky; Caron, Bertrand; Benauer, Maria; Kylander, Johan; Dekeyser, Kris; Addis, Matthew; Levlin, Mattias; Laukkanen, Mikko; Lehtonen, Juha; Burger, Felix; Koho, Tiina; Schwab, Franziska; Molloy, Laura; Zhang, Fen
Publisher: Zenodo
DOI: 10.5281/zenodo.16992452
Source: https://zenodo.org/records/16992452/files/EOSC-EDEN_CPP-024_Enabling_Discovery.pdf
Enabling Discovery (CPP-024)
CPP-Identifier: CPP-024
CPP-Label: Enabling Discovery
Author: Bertrand Caron
Contributors: Mattias Levlin
Evaluators: Matthew Addis, Felix Burger, Maria Benauer
Date of edition completed: 29.08.2025
Change history: Version 1.0 - 29.08.2025
Comments: Milestone version
1. Description of the CPP
The TDA provides catalogue services to its consumers to help them identify Objects that they may be interested in.
Inputs and outputs
Input(s)
Data: Optional: Derivatives useful to the consumer to identify the content and scope of the Object
Metadata: Digital Archive Database
Documentation / guidance: Query scenarios; Metadata mapping specifications
Output(s)
Metadata: Catalogue service; Process execution report
Definition and scope
Enabling Discovery covers the extraction of the subset of Metadata from the digital archive database to enable consumers to identify which Objects they may be interested in. In OAIS, this subset is called “Descriptive Information”.
Discovery relies on CPP-018 (Community Watch) to study consumers’ needs regarding searches and to elaborate query scenarios. Query scenarios are use cases in which consumers identify the subset of the TDA’s holdings that addresses their needs. Discovery is in charge of providing Metadata, query and retrieval features that support these query scenarios in an efficient way. If the TDA wants its Objects to be discoverable in third-party catalogues (e.g. federated catalogues, portals etc.), it must perform this process for each service in order to ensure that Metadata and derivatives conform to the specifications of the third-party catalogue(s).
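As an illustration of how query scenarios can drive the choice of Metadata to expose, the following minimal Python sketch models a scenario as the fields a consumer searches on plus the fields shown in results, and derives the subset to expose as their union. The class, the field names and the two scenarios are hypothetical and are not taken from this CPP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryScenario:
    """A use case in which a consumer narrows the holdings down to what they need."""
    name: str
    search_fields: frozenset[str]   # fields the consumer filters or searches on
    display_fields: frozenset[str]  # fields shown in the result list


def fields_to_expose(scenarios: list[QueryScenario]) -> set[str]:
    """Union of every field that at least one scenario searches on or displays."""
    exposed: set[str] = set()
    for scenario in scenarios:
        exposed |= scenario.search_fields | scenario.display_fields
    return exposed


# Hypothetical scenarios, for illustration only.
SCENARIOS = [
    QueryScenario("find by creator", frozenset({"creator"}), frozenset({"title", "date"})),
    QueryScenario("find by subject and period", frozenset({"subject", "date"}), frozenset({"title"})),
]

EXPOSED_FIELDS = fields_to_expose(SCENARIOS)
```

Under these assumptions, a field that no scenario needs is simply never exposed, which keeps the catalogue subset tied to documented consumer needs.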
If the TDA provides direct access to its holdings to end users, then it may also generate a PID (e.g. a DOI) for each accessible Object so that it is referenceable and locatable using third-party discovery services. However, this CPP only focuses on the generation of Metadata that is subsequently used for discovery (rather than the discovery process and the use of discovery services such as portals and catalogues). The details of registering PIDs, publishing Metadata and/or data with third-party catalogues and discovery services, as well as the access to Objects via PID resolution and TDA services are beyond the scope of this CPP.
Enabling Discovery is also in charge of verifying the legal status of Metadata based on rights assessment as issued by CPP-020 (Rights Management). Metadata access might indeed need to be restricted, so Discovery must ensure it disseminates these only to authorised users.
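A minimal sketch of such a restriction, assuming each discovery record carries a rights label derived from the rights assessment. The record structure and the labels "public" and "restricted" are hypothetical; a real TDA would use whatever rights vocabulary CPP-020 produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveryRecord:
    object_id: str
    metadata: dict[str, str]
    # Hypothetical rights label, assumed to come from the rights assessment.
    metadata_access: str  # "public" or "restricted"


def visible_records(records: list[DiscoveryRecord], user_is_authorised: bool) -> list[DiscoveryRecord]:
    """Return only the records whose Metadata this user may see.

    Anything not explicitly labelled "public" is treated as restricted,
    so an unknown or missing label never leaks to unauthorised users.
    """
    return [
        record for record in records
        if record.metadata_access == "public" or user_is_authorised
    ]
```

The filter fails closed: a record with an unexpected label is shown only to authorised users.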
In addition to Metadata, discovery might provide derivative copies (e.g. thumbnails, textual transcription, redacted copy, etc.) that would help consumers in identifying the content and scope of the relevant Objects. Indeed, access to the preservation copies or to a sufficiently complete derivative is often limited to the TDA’s precinct.
Process description
Trigger event(s)
Trigger event: New Object or new Object version ingested
CPP-identifier: CPP-029 (Ingest), CPP-021 (AIP Versioning)
Step-by-step description
No: 1
Supplier: CPP-018 (Community Watch)
Input: Query scenarios
Steps: Select the relevant Metadata (and possibly data) useful to the consumer
Output: Subset of Metadata to be exposed in the catalogue

No: 2
Supplier: CPP-018 (Community Watch)
Input: Query scenarios
Steps: Select the syntax or serialisations useful to the consumer
Output: Syntax of Metadata to be exposed in the catalogue

No: 3a
Input: New AIP or AIP version; Subset of Metadata to be exposed in the catalogue; Syntax of Metadata to be exposed in the catalogue
Steps: Extraction: Extract the required subset of Metadata from the new AIP or AIP version
Output: Extracted discovery Metadata

No: 3b
Input: Extracted discovery Metadata
Steps: Mapping & transformation: Map and transform the Metadata according to the required format of the discovery catalogue
Output: Transformed discovery Metadata

No: 3c
Input: Transformed discovery Metadata
Steps: Validation: Validate the resulting subset of Metadata against the target catalogue’s schema
Output: Validation successful: Validated discovery Metadata (step 4). Validation failed: Log the error and flag the record for review. The record must not proceed to the discovery catalogue until corrected

No: 4
Supplier: CPP-020 (Rights Management)
Input: Validated discovery Metadata; Rights statement
Steps: Check rights status of discovery Metadata
Output: Validated discovery Metadata with cleared rights

No: 5a
Input: Validated discovery Metadata with cleared rights
Steps: Add discovery Metadata of the Object(s) to the catalogue service. The catalogue service may be provided by the TDA, by a third-party (e.g. a federated catalogue), or by a combination of the two.
Output: Entry in catalogue service
Customer: Consumer

No: 5b
Input: Validated discovery Metadata with cleared rights
Steps: Optional: If the TDA requires a PID (e.g. DOI), because the Object or Metadata about the Object will be publicly accessible (e.g. open access), the TDA may request a PID from an appropriate registration agency. The TDA may also add the PID to the Metadata in the catalogue service.
Output: PID
Customer: Consumer

No: 5c
Supplier: CPP-025 (Enabling Access), CPP-028 (Creation of Derivatives)
Input: DIP or Derivatives
Steps: Optional: The TDA may provide versions of its Objects for inclusion in the catalogue service (for example, thumbnails, preview versions, redacted documents etc.). These are added to the catalogue entry for the Object.
Output: Entry in catalogue service.

No: 6
Input: Query scenarios; Catalogue service
Steps: Verification: Using the defined query scenarios, perform a test query to confirm that the new or updated Object is discoverable in the catalogue service.
Output: Verification successful: Verification confirmation in the process execution report. Verification failed: If the Object is not found, log the failure as an incident to investigate the ingest and indexing chain
Customer: CPP-013 (Object Management Reporting)
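Steps 3a to 3c (extraction, mapping and transformation, validation) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the field names, the mapping table and the list of required catalogue fields are hypothetical, and validation is reduced to a check for required fields rather than a full schema validation.

```python
import logging

logger = logging.getLogger("discovery")

# Hypothetical mapping specification: internal field name -> catalogue field name.
MAPPING = {"dc_title": "title", "dc_creator": "creator", "dc_date": "issued"}
# Hypothetical stand-in for the target catalogue's schema: fields it requires.
REQUIRED_CATALOGUE_FIELDS = {"title", "creator"}


def extract(aip_metadata: dict, subset: set) -> dict:
    """Step 3a: keep only the fields selected for discovery."""
    return {key: value for key, value in aip_metadata.items() if key in subset}


def transform(extracted: dict, mapping: dict) -> dict:
    """Step 3b: rename fields to the catalogue's vocabulary."""
    return {mapping[key]: value for key, value in extracted.items() if key in mapping}


def validate(record: dict, required: set) -> list:
    """Step 3c: return a list of problems; an empty list means the record is valid."""
    return [f"missing or empty field: {name}" for name in sorted(required) if not record.get(name)]


def prepare_for_catalogue(aip_metadata: dict):
    """Run steps 3a-3c; return the record, or None if it must be held for review."""
    record = transform(extract(aip_metadata, set(MAPPING)), MAPPING)
    errors = validate(record, REQUIRED_CATALOGUE_FIELDS)
    if errors:
        # A failed record is logged and flagged; it does not reach the catalogue.
        logger.error("validation failed, record flagged for review: %s", errors)
        return None
    return record
```

For example, `prepare_for_catalogue({"dc_title": "Survey data", "dc_creator": "Example Lab", "internal_note": "x"})` drops the internal note and returns the two mapped fields, while a record without a creator is logged and held back.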
Rationale(s) [1] and worst case(s)
Rationale: If the TDA is providing access to its holdings to consumers, ensuring query services that suit the consumers’ needs is mandatory.
Impact of inaction or failure of the process: If the catalogue service does not allow query scenarios useful to the consumers, discoverability and usage of the TDA holdings is compromised.
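The verification step of the process (a test query per query scenario) is what detects this failure mode in practice. A minimal sketch follows, assuming the catalogue can be represented as a list of entries and that each scenario can be reduced to a (field, search term) pair. Both assumptions are simplifications, and the `search` function is a toy stand-in for a real catalogue query.

```python
def search(catalogue: list[dict], field: str, term: str) -> list[str]:
    """Toy query: case-insensitive substring match on a single field."""
    return [
        entry["object_id"] for entry in catalogue
        if term.lower() in str(entry.get(field, "")).lower()
    ]


def verify_discoverable(catalogue: list[dict], object_id: str, test_queries: list[tuple]) -> list[tuple]:
    """Run each test query and return those that fail to find the Object.

    An empty result means the Object is discoverable through every scenario;
    a non-empty result should be logged as an incident for investigation.
    """
    return [
        (field, term) for field, term in test_queries
        if object_id not in search(catalogue, field, term)
    ]
```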
2. Dependencies and relationships with other CPPs
Dependencies
CPP-ID (CPP-Title): Relationship description
CPP-005 (Identifier Management): Enabling Discovery should make use of PIDs.
CPP-009 (Metadata Extraction): Some Metadata provided to the consumer must have been extracted from the Files.
CPP-016 (Metadata Ingest and Management): Enabling Discovery relies on a correct metadata management process. In particular, Metadata created by and within the TDA is of particular interest to the consumer in order to understand preservation actions that could have affected the Object.
CPP-018 (Community watch): The TDA must have identified the needs of its designated community in order to enable queries that support the community's defined query scenarios.
Other relations
Relation: Affinity with
CPP-ID: CPP-025
CPP-Title: Enabling Access
Relationship description: The distinction between Enabling Discovery and Enabling Access may be blurred as derivative copies may be indexed and searched in the same way as Metadata. In addition, these derivative copies may be sufficient to address some consumers’ needs. Nevertheless, the distinction is still useful as giving access to the original data is often governed by specific legal constraints, and requires specific hardware and software tools.
[1] Term derived from PREMIS.
3. Links to frameworks
Certification
Certification framework: CTS
Term used in framework to refer to the CPP: Discovery
Section: R12 Discovery and Identification

Certification framework: Nestor Seal
Term used in framework to refer to the CPP: Research (options)
Section: C4 Access

Certification framework: ISO 16363
Term used in framework to refer to the CPP: Discovery
Section: 4.5.1 The repository shall specify minimum information requirements to enable the Designated Community to discover and identify material of interest.
Other frameworks and reference documents
Reference Document: OAIS
Term used in framework to refer to the process: No exact term is available in OAIS for Discovery, but the topic is approached through the Package Description notion
Section: 4.3.3.7 Unit Description; 4.3.3.9 Collection Descriptions

Reference Document: PREMIS
Term used in framework to refer to the process: Discovery
Section: Section “More on Objects”, subsection “Intellectual Entities”, p. 8.
4. Reference implementations
Example use cases
ePADD Discovery module
Institutional Background
Institution: Stanford University's Special Collections & University Archives, USA
Hyperlink: https://www.epaddproject.org/using-epadd/discovery-module
Discovery module of Stanford’s email collections: https://epadd-discovery.stanford.edu/epadd/collections
Description
Problem statement: Email collections are born-digital material that need and allow specific usages and access. On the other hand, they raise privacy issues that force memory organisations to give access to these collections only in the organisation’s precinct. Consumers therefore need to identify the scope and content of the collection through remote queries before planning an on-site visit.
Proposed solution: Beyond the mails’ Metadata, the full text was indexed and is searchable, though not entirely readable: when accessing the mail, the full text is redacted and only the searched term and email Metadata are displayed. Access to the mails’ collections is possible only in the organisation’s precinct.
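The "searchable but not readable" approach can be illustrated with a small inverted index that records which messages contain which words but never stores or returns the message body. This is a generic sketch of the idea, not ePADD's actual implementation; the class and all names in it are hypothetical.

```python
import re
from collections import defaultdict


class SearchOnlyIndex:
    """Full text is indexed for search, but only Metadata and the matched term are returned."""

    def __init__(self):
        self._postings = defaultdict(set)  # word -> ids of the messages containing it
        self._metadata = {}                # message id -> displayable Metadata

    def add(self, message_id: str, metadata: dict, body: str) -> None:
        self._metadata[message_id] = metadata
        for word in re.findall(r"\w+", body.lower()):
            self._postings[word].add(message_id)
        # The body itself is deliberately not kept.

    def search(self, term: str) -> list:
        term = term.lower()
        return [
            {"id": message_id, "metadata": self._metadata[message_id], "matched_term": term}
            for message_id in sorted(self._postings.get(term, ()))
        ]
```

A remote consumer can thus confirm that a collection mentions a given term and see who wrote the message and when, which is enough to decide whether an on-site visit is worthwhile.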
etsin.fairdata.fi (CSC): Open discovery of datasets with varied access conditions
Institutional Background
Institution: CSC – IT Center for Science, Finland
Hyperlink
Main Discovery Service: https://etsin.fairdata.fi/
Example 1 (Direct Access via use-copy): FIRE Profile 3
Example 2 (Mediated Access): Drive-tested topsoil
Description