Deliverable D2.1
Graph Ontology for Evidence Storage
Editor(s): Verena Geist, Stefan Schöberl
Responsible Partner: Software Competence Center Hagenberg GmbH
Status-Version: Final – 1.0
Date: 31.07.2024
Type: Report
Distribution level (SEN, PU): PU
D2.1 – Graph Ontology for Evidence Storage Version 1.0 – Final. Date: 31.07.2024
© EMERALD Consortium Contract No. GA 101120688 Page 2 of 33
www.emerald-he.eu
Project Number: 101120688
Project Title: EMERALD
Title of Deliverable: D2.1 Graph Ontology for Evidence Storage
Due Date of Delivery to the EC: 31.07.2024
Workpackage responsible for the Deliverable: WP2 - Methodology for Knowledge Extraction
Editor(s): Verena Geist, Stefan Schöberl (SCCH)
Contributor(s): Christian Banse, Immanuel Kunz, Angelika Schneider, Florian Wendland (FHG); Franz Deimling (FABA)
Reviewer(s): Nico Haas (FHG); Cristina Martínez, Juncal Alonso (TECNALIA)
Approved by: All Partners
Recommended/mandatory readers: WP2, WP3
Abstract: EMERALD aims to integrate evidence collected at different levels of the cloud service into a single graph-based structure, the Certification Graph (CertGraph). This document describes the development of a uniform schema for storing and linking these heterogeneous data. The report mainly involves T2.1, but inputs of T2.2, T2.3, T2.4, T2.5, and T3.1 are also considered.
Keyword List: Evidence collection, knowledge graph schema, ontology extensions, knowledge integration, combined evidence analysis.
Licensing information: This work is licensed under Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) http://creativecommons.org/licenses/by-sa/3.0/
Disclaimer: Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. The European Union cannot be held responsible for them.
Document Description
Version | Date | Modification Reason | Modified by
0.1 | 30.04.2024 | First draft version, key information, and TOC. | Verena Geist, Stefan Schöberl (SCCH), Christian Banse, Immanuel Kunz (FHG), Franz Deimling (FABA)
0.2 | 21.05.2024 | Basics on ontologies and knowledge graphs, architecture, and requirements. | Verena Geist, Stefan Schöberl (SCCH)
0.3 | 23.05.2024 | Executive summary, introduction, and architecture. | Verena Geist, Stefan Schöberl (SCCH)
0.4 | 27.05.2024 | Reconciliation of TOC and initial contents by internal reviewer. | Nico Haas (FHG)
0.5 | 11.06.2024 | Ontology extensions and collaboration. | Verena Geist, Stefan Schöberl (SCCH)
0.6 | 26.06.2024 | Consolidation of input from task leaders, and illustrative example. | Angelika Schneider, Florian Wendland (FHG), Franz Deimling (FABA), Verena Geist, Stefan Schöberl (SCCH)
0.7 | 11.07.2024 | QA review by internal reviewer in accordance with the QA process. | Nico Haas (FHG)
0.8 | 15.07.2024 | Address comments and suggestions from the QA review. | Stefan Schöberl (SCCH)
1.0 | 31.07.2024 | Submitted to the European Commission. | Cristina Martínez (TECNALIA)
Table of contents
Terms and abbreviations
Executive Summary
1 Introduction
1.1 About this Deliverable
1.2 Document Structure
2 From the MEDINA Ontology to the EMERALD Knowledge Graph
2.1 Differences between an Ontology and a Knowledge Graph
2.2 Recap: Cloud Property Graph Ontology
2.3 Overview of Planned Extensions
2.4 Embedding the new Ontology in the EMERALD Architecture
3 Requirements for Designing the Ontology
4 Core Ontology and Extensions
4.1 Core with Security Feature
4.1.1 Core – A Base Ontology
4.1.2 Security Feature – Containing Data Properties for Security Metrics
4.2 Ontology Extensions
4.2.1 Application – A Taxonomy of Source Code
4.2.2 Document – A Taxonomy of Policy Documents
4.2.3 ML – A Taxonomy of AI/ML Models
4.2.4 Cloud – A Taxonomy of Cloud Resources including Runtime Information
5 Illustrative Example – Modelling and Combining Evidence Information for “TLS Version”
5.1 Overview of Used Concepts
5.2 Adding Instances of Extracted Evidence
5.3 Challenges and future work
6 Conclusion
7 References
APPENDIX A: Collaborative Ontology Development using Protégé
A.1 Governance
A.2 Technical Aspects
A.2.1 Restructuring and Extending the Ontology
A.2.2 Used Tools: Protégé and Git
APPENDIX B: Owl2proto – Converting Ontology Files to Protobuf
B.1 Motivation
B.2 Approach
B.3 Future Work
List of tables
TABLE 1. ONTOLOGY VS. KNOWLEDGE GRAPH
TABLE 2. ONTOLOGY EXTENSIONS AND THEIR DEDICATED EXTRACTORS
TABLE 3. SUB-ONTOLOGIES AND THEIR NAMESPACES
List of figures
FIGURE 1. EXCERPT OF THE CLOUD PROPERTY GRAPH ONTOLOGY SHOWING DIFFERENT RELATIONSHIPS – BETWEEN ENTITIES IN BLUE AND INHERITANCE IN YELLOW
FIGURE 2. MAPPING THE CPG TO THE CODE PROPERTY GRAPH ONTOLOGY
FIGURE 3. EXCERPT OF THE EMERALD COMPONENT DIAGRAM [3]
FIGURE 4. OVERVIEW OF HOW THE CERTGRAPH ONTOLOGY LOGICALLY INTERACTS WITH (SELECTED) EMERALD COMPONENTS
FIGURE 5. MODULAR DESIGN OF THE CERTGRAPH ONTOLOGY WITH THE EXTENSIONS IN GREEN
FIGURE 6. CLASSES AND INSTANCES FOR THE TLS EXAMPLE
FIGURE 7. SCREENSHOT OF PROTÉGÉ
FIGURE 8. OVERVIEW OF THE ONTOLOGY HIERARCHY OF THE RESOURCE VIRTUALMACHINE
FIGURE 9. EXAMPLE FOR THE PROPERTIES OF THE RESOURCE CLOUDRESOURCE
FIGURE 10. EXAMPLE FOR THE PROPERTIES OF THE RESOURCE VIRTUAL MACHINE
FIGURE 11. EXAMPLE FOR THE AUTO-GENERATED PROTOBUF MESSAGE FOR THE VIRTUALMACHINE RESOURCE
FIGURE 12. EXAMPLE FOR THE AUTO-GENERATED PROTOBUF MESSAGE FOR THE INTERMEDIATE NODE/RESOURCE COMPUTE
Terms and abbreviations
AI: Artificial Intelligence
AMOE: Assessment and Management of Organizational Evidence
API: Application Programming Interfaces
AST: Abstract Syntax Tree
BSI: Bundesamt für Sicherheit in der Informationstechnik
CertGraph: Certification Graph
CPG: Cloud Property Graph
CSA or EU CSA: EU Cybersecurity Act
CSP: Cloud Service Provider
CRY: Cryptography and Key Management
DB: Database
DoA: Description of Action
EC: European Commission
GA: Grant Agreement to the project
HTTP: Hypertext Transfer Protocol
IRI: Internationalized Resource Identifier
KPI: Key Performance Indicator
KR: Key Result
ML: Machine Learning
NLP: Natural Language Processing
OWL: Web Ontology Language
PoC: Proof-of-Concept
Protobuf: Protocol Buffers
RDF: Resource Description Framework
RDFS: RDF Schema
SW: Software
SWRL: Semantic Web Rule Language
TLS: Transport Layer Security
URI: Uniform Resource Identifier
URL: Uniform Resource Locator
XML: Extensible Markup Language
Executive Summary
This deliverable describes the design and development of the CertGraph Ontology, a central ontology for storing evidence in a graph-based format. It addresses the key result CERTGRAPH (KR2) of the EMERALD project by outlining a concept of a uniform graph-based model to consolidate all necessary evidence information extracted from a cloud service and to enable the retrieval of combined evidence. In this way, it serves as a common structure that is filled by all evidence extraction tools of WP2.
EMERALD follows a knowledge graph-based approach to provide this unified view of the cloud service under certification at different layers of the service. The schema for storing and linking heterogeneous evidence information is developed in WP2, and the model is then implemented in WP3 as a knowledge graph that can be leveraged by assessment tools to measure certification-relevant security metrics.
This document starts by sketching the prerequisites for the transition from the MEDINA ontology to the EMERALD knowledge graph. It introduces a list of requirements for developing the ontology, such as using a formal language and providing a clear conceptualization of the cloud service certification domain. The main part describes the ontology extensions that support the holistic approach of evidence collection, including all levels of the cloud service, ranging from the infrastructure layer (e.g., virtual resources), to the business layer (e.g., policies and procedures), to the implementation and data layer (e.g., source code and increasingly used artificial intelligence (AI) models). Afterwards, it provides an illustrative example of modelling and combining evidence information for TLS encryption from different sources (e.g., runtime information, policy documents, and source code) as proof-of-concept (PoC). Finally, the document concludes with a short summary and discussion of future work.
The graph-based approach described in this deliverable allows aggregating individual aspects and fragments of information into a higher-level viewpoint of combined evidence that was not previously detectable by a single tool. At the same time, the approach maintains traceability back to different information sources and extraction processes. The uniform schema of evidence information will then be analysed using intelligent algorithms and leveraged to acquire new insights or knowledge in future WP2 deliverables, namely D2.10 “Certification Graph–v1” (M15) and D2.11 “Certification Graph–v1” (M27).
1 Introduction
For automated compliance tools to work, suitable evidence needs to be extracted and linked from different layers of a cloud service. This includes, on the one hand, (i) the virtual infrastructure, such as virtual machines, containers, or storage (based on the Cloud Property Graph ontology from the MEDINA project). In addition, the following sources should also be taken into account in EMERALD: (ii) the source code of services, often written in different programming languages, such as Java, Go, or Python; (iii) relevant parts of legal and policy documents, such as requirement or architecture documents; (iv) applied machine learning (ML) models with respect to various criteria, such as robustness, fairness, and explainability; and (v) runtime information, such as configuration or log files. Extraction tools, which will be developed as part of the future deliverables D2.2 - D2.9, will extract and provide evidence from the different layers and sources described above.
The CertGraph Ontology with its respective extensions, described in this document, is a central tool to bridge the different layers and sources of extracted information.
1.1 About this Deliverable
This document aims to describe the ontology for modelling evidence information in the cloud service certification domain, i.e., the schema of the EMERALD knowledge graph, consisting of entities, relations, and properties. In addition, a further task is to define guidelines for designing the ontology extensions and domain-specific schema constraints for the underlying data. Thereby, defining additional data properties for enriching data with provenance information (meta data from sources and extraction processes) is essential for providing traceability down to different sources of certification.
The ontology represents the basis for integrating and instantiating the knowledge graph as a repository for target values in the Evidence Store (a microservice of Clouditor) in Task 3.1. It is also the foundation for analysing the semantic information and context of the heterogeneous evidence information in Task 2.6 to build a higher-level viewpoint of combined evidence, which facilitates querying of certification evidence and provides the basis for the evaluation and assessment of metrics in Task 3.4.
1.2 Document Structure
The document is structured as follows.
In Section 2, we discuss how to extend the MEDINA ontology to the EMERALD knowledge graph. To this end, we start by presenting the main differences between an ontology and a knowledge graph, then give a short recap of the Cloud Property Graph ontology, sketch the planned extensions, and describe how we intend to embed the new ontology in the EMERALD architecture.
Section 3 provides the requirements for designing the ontology.
Section 4 details the ontology extensions for the different cloud service layers, i.e., for extracted evidence from source code, from policy documents, from ML models, and from cloud runtime environments. We further discuss refinements of data properties for combining evidence and supporting traceability, as well as of security features to assess new security metrics.
In Section 5, a seamless example of modelling and combining extracted evidence information from different sources is provided.
Section 6 closes with the conclusions, including a summary of the main contributions, open challenges, and future work.
The deliverable also includes two appendices:
• APPENDIX A: Collaborative Ontology Development using Protégé, which includes remarks on collaborative development of ontology extensions using the Protégé tool.
• APPENDIX B: Owl2proto – Converting Ontology Files to Protobuf, which presents Owl2proto, a new tool to convert ontology files to Protocol Buffers (Protobuf) structures.
3 Requirements for Designing the Ontology
As ontologies are formal representations of knowledge with a rich set of concepts within a domain and the relationships between those concepts, they are used to reason about the objects within that domain and to describe how they are related. The following requirements are essential to enable sophisticated knowledge management, retrieval, and reasoning capabilities:
• Formal language. The ontology should be defined using a formal language that allows for the expression of concepts, relationships, instances, and axioms. Examples of ontology languages include the Web Ontology Language (OWL)⁴, Resource Description Framework (RDF)⁵, and RDF Schema (RDFS)⁶.
• Clear conceptualization. The ontology should provide a clear and comprehensive conceptualization of the domain it represents, including the definition of classes (or concepts), properties (attributes or relationships), and instances (individual examples of classes). Concepts should be consistently defined (e.g., Database was divided into Storage and Service, whereas Backup was not) and relationships should be properly refined (e.g., offer should be refined into specifies (for policy documents) and implements (for source code)). Information that is needed for code generation from the ontology should not be included in the ontology (e.g., the Cloud Property Graph ontology used the has and hasMultiple properties to model to-one or to-many relationships so that code generators could generate appropriate code to represent them). Instead, the domain should be the focus and the ontology should reflect it in a meaningful way.
• Hierarchical structure of concepts. The ontology should support the creation of a hierarchical structure of concepts, allowing for subclass relationships and the organization of concepts into a taxonomy.
• Reasoning and consistency checking. The ontology should be compatible with inference engines and allow for the definition of logical rules that enable automated reasoning about the concepts and their relationships. In addition, tools and methods should be available for checking the consistency and validity of the ontology, ensuring that there are no logical contradictions within the defined concepts and relationships.
• Interoperability and extensibility. The ontology should be developed in a way that ensures interoperability with other ontologies, facilitating data exchange and integration across different layers of a cloud service. The ontology should be accessible to both humans and machines, with clear naming conventions and identifiers, allowing parts of the ontology to be reused in different namespaces and contexts. It should also be extensible regarding novel security schemas and standards, in case they require additional evidence, which has to be modelled as an extension.
• Documentation and annotation. Comprehensive documentation and annotation of the ontology should be available, including descriptions of the purpose, scope, and structure of the ontology, as well as the meaning of all concepts and relationships.
• Versioning. There should be a clear strategy for handling the releases of the ontology (e.g., annually, quarterly, or on demand) and how changes and new versions are announced.
⁴ https://www.w3.org/OWL/
⁵ https://www.w3.org/RDF/
⁶ https://www.w3.org/TR/rdf12-schema/
4 Core Ontology and Extensions
In this section, we discuss how to integrate evidence extracted from the multiple cloud service layers (i.e., infrastructure, platform, and software), including policy documents and runtime information, into a single graph-based structure (KR2-CERTGRAPH). Furthermore, evidence information for the security evaluation of AI models (KR5-AIPOC) will be included. Based on the general idea of (harmonized) security metrics, we allow different evidence collection tools to gather different layers of evidence for the same metric, enhancing reuse of evidence collected, and providing answers to assess the metrics.
Therefore, we plan different extensions to the overall ontology (see Figure 5), which together represent the unified source of types in the cloud service certification domain. Content is typically represented by a set of triples (subject, predicate, object), where the predicate describes the relation between the subject and object entity [4]. Knowledge graphs [5] are a well-established method for managing complex multi-relational relationships based on the provided schema. This way, fine-grained entities from different heterogeneous sources (structured data, semi-structured data, free text) can be extracted and linked step by step [6]. Of importance are the ideas of the extended DIKW hierarchy (data, information, knowledge, and wisdom) [7], where each concept is related to the previous concept, forming a chain of increasing interconnectedness and evaluated human understanding [8].
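The triple representation mentioned above can be illustrated with a minimal Python sketch. It is not the EMERALD implementation: the identifiers (vm-1, tls-1), the predicate names (instanceOf, hasSecurityFeature) and the version value are hypothetical and only show how linked triples can be traversed to reach a piece of evidence.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One (subject, predicate, object) statement of a knowledge graph."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts; all identifiers are illustrative assumptions.
graph = [
    Triple("vm-1", "instanceOf", "VirtualMachine"),
    Triple("vm-1", "hasSecurityFeature", "tls-1"),
    Triple("tls-1", "instanceOf", "TransportEncryption"),
    Triple("tls-1", "version", "1.3"),
]


def objects(graph: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return all objects linked to `subject` via `predicate`."""
    return [t.obj for t in graph
            if t.subject == subject and t.predicate == predicate]


def tls_versions(graph: List[Triple], resource: str) -> List[str]:
    """Follow resource -> security feature -> version (two hops)."""
    return [version
            for feature in objects(graph, resource, "hasSecurityFeature")
            for version in objects(graph, feature, "version")]
```

A call such as tls_versions(graph, "vm-1") walks two edges of the graph, which is the kind of step-by-step linking the paragraph describes.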
The growing availability of large amounts of evidence data, which evolves over time and is continuously extracted for cloud service certification, requires annotating the graph with temporal information, such as timestamps [9]. Explicitly capturing temporal dependencies in addition to structural properties increases the traceability of facts back to extraction tools and transparency in processes and procedures required to run cloud services.
Figure 5. Modular design of the CertGraph Ontology with the extensions in green
As shown in Figure 5, the CertGraph Ontology consists of multiple sub-ontologies and extensions, which cover individual aspects. The Core ontology, together with the Security Feature ontology, builds the foundation of the ontology and contains base classes and properties. Specifically, Security Feature models different security-related concepts. The extensions are built on top of this foundation and each extension models the evidence gathered by a different type of extractor (see Table 2). The collected evidence from the extractors is represented as instances within a separate part that, in turn, is built upon the ontology and implemented in the Evidence Store.
Table 2. Ontology extensions and their dedicated extractors
Extension | Extractor
Cloud | Clouditor-Discovery
Application | eknows and Codyze
ML | AI-SEC
Document | AMOE
Each sub-ontology and extension has its own namespaces (see Table 3). This allows for interoperability and a flexible extension of the ontology beyond the aspects considered within EMERALD.
Table 3. Sub-ontologies and their namespaces
Sub-ontology | Namespace
Core | https://ontology.emerald-he.eu/core
Security Feature | https://ontology.emerald-he.eu/core/securityfeature
Cloud | https://ontology.emerald-he.eu/resources/cloud
Application | https://ontology.emerald-he.eu/resources/application
ML | https://ontology.emerald-he.eu/resources/ml
Document | https://ontology.emerald-he.eu/resources/documents
Evidence from <tool> | https://ontology.emerald-he.eu/evidence/<tool>
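As a small illustration of how the namespaces in Table 3 could be used to build identifiers, the following Python sketch maps each sub-ontology to its namespace and derives the per-tool evidence namespace. The namespace values are taken from Table 3; the helper names, the lowercase tool identifier and the "#" separator between namespace and local name are assumptions made for this example only.

```python
BASE = "https://ontology.emerald-he.eu"

# Namespaces as listed in Table 3.
NAMESPACES = {
    "Core": f"{BASE}/core",
    "Security Feature": f"{BASE}/core/securityfeature",
    "Cloud": f"{BASE}/resources/cloud",
    "Application": f"{BASE}/resources/application",
    "ML": f"{BASE}/resources/ml",
    "Document": f"{BASE}/resources/documents",
}


def evidence_namespace(tool: str) -> str:
    """Namespace for evidence instances produced by one extractor."""
    return f"{BASE}/evidence/{tool}"


def term_iri(namespace: str, local_name: str) -> str:
    """Build a term IRI; the '#' separator is an assumption of this sketch."""
    return f"{namespace}#{local_name}"
```

Keeping schema namespaces (core, resources) apart from evidence namespaces mirrors the separation between the ontology and the instances stored per extractor.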
At the time of writing, the sub-ontologies are not yet modelled in detail. In the following sections we outline the main concepts which will be included in each sub-ontology. Section 5 shows an example of the planned content and includes a diagram, which zooms in and just shows the relevant parts of the example. The planned collaboration for creating the CertGraph Ontology and its sub-ontologies and extensions is described in APPENDIX A: Collaborative Ontology Development using Protégé.
A newly developed tool, called Owl2proto⁷, converts an ontology to Protobuf structures. With this tool, we are able to automatically generate proto files from the ontology files and use them directly in different supported programming languages. Previously, we had to create and update all ontology objects for each programming language manually. The Owl2proto tool is described in APPENDIX B: Owl2proto – Converting Ontology Files to Protobuf.
4.1 Core with Security Feature
4.1.1 Core – A Base Ontology
Core is an ontology that constitutes the core of the overall CertGraph Ontology, where different extensions can be imported (depending on the actual certification use case).
• Root node: Resource (an abstract concept being the anchor of all imported extensions)
• Content:
  o Contains the (mandatory) Security Feature Ontology.
  o Specifies refinements for combining evidence and supporting traceability, such as required properties of the extraction source (i.e., which extractor performed the evidence extraction, in which version of the tool, etc.), timestamps, etc.
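The traceability properties listed above (extractor, tool version, timestamp) can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and field names (Provenance, extractor_version, extracted_at) are hypothetical and are not taken from the CertGraph Ontology.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Provenance:
    """Where a piece of evidence came from (hypothetical field names)."""
    extractor: str           # which extractor performed the extraction
    extractor_version: str   # version of the tool that was used
    extracted_at: datetime   # timestamp of the extraction


@dataclass
class Resource:
    """Stand-in for the abstract root concept Resource of the Core ontology."""
    id: str
    provenance: Provenance


def latest(resources: List[Resource]) -> Resource:
    """Pick the most recently extracted resource from a non-empty list."""
    return max(resources, key=lambda r: r.provenance.extracted_at)
```

Attaching provenance to every resource is what allows a combined result to be traced back to the individual extraction runs.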
⁷ https://github.com/oxisto/owl2proto
4.1.2 Security Feature – Containing Data Properties for Security Metrics
Security Feature is an ontology where all extracted evidence information from the different layers of a cloud service will be linked to specific data properties of security features to assess security metrics / requirements from the pilots.
• Root node: SecurityFeature
• Content:
  o Based on the existing taxonomy from the MEDINA Ontology, additional security features and properties will be added based on new metrics and pilot requirements.
  o Additional security features and properties for existing extractors of MEDINA will be added (e.g., regarding AMOE and Codyze).
  o The Functionality taxonomy of the MEDINA Ontology, which comprises a collection of general concepts, will also be included.
4.2 Ontology Extensions
This section describes the key information of the ontology extensions for different cloud service layers. Note that a mixed approach will be followed when creating the extensions: We will start “top-down” by modelling the hierarchical structure of concepts and relationships in a rather generic way – independent of security metrics and requirements. These taxonomies will then be refined and linked to data properties of security features “bottom-up”, i.e., depending on what we need, what we want to measure, and what we actually get from the given artifacts in the pilots. The more concrete the links, the more added value the ontology will provide. The starting point for concrete metrics will most likely be official security schemes, such as BSI AIC4⁸ or C5⁹.
4.2.1 Application – A Taxonomy of Source Code
Application is a source code taxonomy to categorize and organize code elements based on their characteristics, functionalities, and security aspects in software systems (with regard to evidence extracted in Task 2.2).
• Root node: Application
• Extractor(s)¹⁰: Codyze / eknows
• Content:
  o Sub-concepts of an application are described in more detail, represented as a superset of several languages, and linked to (additional) security features. It may include:
    ▪ Source code file with line information,
    ▪ security-related APIs,
    ▪ business rules,
    ▪ security guidelines,
    ▪ project configuration and repository (meta)data.
  o Framework taxonomy from the MEDINA Ontology.
  o Mapping to eknows classes (AST) will be defined, analogous to mapping to Codyze classes (CPG).
⁸ https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/CloudComputing/AIC4/AI-Cloud-Service-Compliance-Criteria-Catalogue_AIC4.html
⁹ https://www.bsi.bund.de/dok/7685384
¹⁰ Please note that the two extractors should complement each other. For example, one security control can be better covered by Codyze, another by eknows. There are no plans to combine the two tools.
  o Extracted evidence of security requirements / features may include cryptography, secure storage, dependency management, transport encryption, authentication, authorization, logging, input validation and best practices for (secure) coding.
4.2.2 Document – A Taxonomy of Policy Documents
Document is an organizational taxonomy to categorize and organize textual information from policy documents (with regard to evidence extracted in Task 2.3).
• Root node: Document
• Extractor: AMOE
• Content:
  o Differentiation between different kinds of policy documents as high-level nodes (e.g., architecture, requirements, etc.) – the better classified, the more precisely linkable to security features described in policy documents.
  o Might also contain information about author, page number, responsible person, link to the document, etc.
  o Extracted evidence of security requirements/features may include encryption (transport, browser, password, API, etc.), certificate (key length, validity period, etc.), authentication (login, password, etc.), security incident, malware protection, data access, and backup.
4.2.3 ML – A Taxonomy of AI/ML Models
ML is a taxonomy to categorize and organize information extracted from AI/ML models based on certain criteria (with regard to evidence extracted in Task 2.4).
• Root node: ML
• Extractor: AI-SEC
• Content:
  o Differentiation between different kinds of ML models (e.g., for images, text, etc.) and different kinds of tasks (e.g., classification, prediction, etc.) as high-level nodes.
  o Different kinds of information denoting relevant criteria (e.g., fairness, robustness, privacy-preserving, etc.) which will be linked to specific security features.
  o Types of extracted evidence of security features are not defined yet (types may contain strings, vectors, etc.).
4.2.4 Cloud – A Taxonomy of Cloud Resources including Runtime Information
Cloud is a taxonomy (based on the contents of MEDINA) with additional properties to extend evidence gathering of cloud resource configurations and to enhance evidence with application-specific runtime information, e.g., from log files (with regard to evidence extracted in Task 2.5).
• Root node: Cloud Resource
• Extractor: Clouditor-Discovery
• Content:
  o Based on the existing taxonomy from the MEDINA Ontology. The taxonomy is divided into different resource categories (e.g., Compute, Storage, Networking) which contain the corresponding cloud resources (e.g., Virtual Machine, Block Storage, Network Interface). For example, the resource category Compute contains the underlying Cloud resources Container, Function and Virtual Machine.
  o Additional and refined links to the security features.
  o Extracted evidence of security requirements/features may include encryption (transport, encryption in use, at-rest encryption, etc.), logging (enabled, retention period, etc.), authentication (password, multifactor, token based, etc.), access restriction (restricted ports, firewall, etc.), backup (transport encryption, location, retention period, etc.), and redundancy.
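The category structure described above (resource categories containing concrete cloud resources) maps naturally onto a subclass hierarchy. The following Python sketch mirrors only the examples named in the text; the class names are written without spaces, and the helper category_of is a hypothetical function added for illustration, not part of the ontology or of Clouditor.

```python
class CloudResource:
    """Root node of the Cloud taxonomy."""


# Resource categories named in the text.
class Compute(CloudResource): pass
class Storage(CloudResource): pass
class Networking(CloudResource): pass


# Concrete cloud resources named in the text.
class VirtualMachine(Compute): pass
class Container(Compute): pass
class Function(Compute): pass
class BlockStorage(Storage): pass
class NetworkInterface(Networking): pass


def category_of(cls: type) -> str:
    """Return the category, i.e. the ancestor directly below CloudResource."""
    for base in cls.__mro__:
        if CloudResource in base.__bases__:
            return base.__name__
    raise ValueError(f"{cls.__name__} is not below a resource category")
```

In an OWL ontology the same structure would be expressed with subclass axioms; the Python classes are only a convenient stand-in to show the two-level hierarchy.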
5 Illus a i e Example – Modelling and Combining E idence
In o ma ion o “TLS Ve sion”
This section presents an illustrative example of modelling and combining extracted evidence
information from different sources. At the time of writing this document, the final choice of the
used security schema(s) and security controls/metrics in EMERALD has not yet been made. We,
therefore, start an initial proof-of-concept with a meaningful security property from a software
perspective taken from BSI C5, e.g., encryption of data for transmission (BSI C5: CRY-02).
The key idea is to represent security-related parts of the source code of a cloud service in a graph
structure and provide additional context through the discovery of the cloud resources the
service is running on and related policy documents, e.g., regarding the used TLS version. Bridging
the world of static code analysis and the extraction of a cloud service’s runtime information allows
evidence to be combined at a higher level of knowledge and also enables a comparison with what is
described in policy documents.
5.1 Overview of Used Concepts
The focus of this example is on illustrating the big picture and interconnectivity between
sub-ontologies (see Figure 6) and not on details within a certain ontology. Furthermore, OWL will be
used as formal language to describe the ontologies. In the diagram, classes are visualized as
rectangles and instances as hexagons. Open-headed arrows with a filled line (⇾) represent
“subclass of” relations, which connect subclasses to their parent class, and open-headed arrows
with a dashed line (┉▹) represent “instance of” relations, which connect instances to their class.
Simple arrows (→) represent data and object properties. These arrows are used between classes
to define the schema, as well as between instances in their materialized form.
As described in Section 4, the two ontologies Core and Security Feature form the basis of the
CertGraph Ontology. The Core ontology defines the metamodel of EMERALD evidence and uses
the concepts defined in the Security Feature ontology.
The Security Feature ontology contains a variety of security features and data properties, which
are based on the same-named taxonomy from MEDINA. To keep things simple, only a single
feature (TransportEncryption class) is showcased in this example and the hierarchy has also been
simplified to two levels. The base class of this hierarchy is SecurityFeature. Also, for simplicity
reasons, just one data property version is defined to store the TLS version.
Figure 6. Classes and instances of the TLS example
The Core ontology contains classes that define the model of evidence and its connections
to other related information fragments. At the time of writing, our proof-of-concept consists of
the following classes:
• Evidence, which is the central class; its instances represent detected security
evidence. Each evidence has connections to Security Feature, Service, Asset, and Tool.
• Asset, which represents the source of a piece of evidence and should store relevant
metadata of the location, as it best fits the asset. Each Asset has a connection to an
AssetType.
• AssetType, which classifies the role of assets within the system. AssetType is modelled
as an enumeration type in ontology terms. For this, a class is needed, and an instance is
created for each possible variant. Currently, we distinguish between these two possible
variants:
o The first variant, Specification, is used for evidence found in assets that
describe how the system should behave. The main application of this variant
is in human-readable documents which are not automatically processed for
compilation, for example, architecture descriptions or policy documents.
o The second variant, Implementation, is used for evidence found in assets that
describe how the system actually behaves. This variant is mainly used for
evidence found in machine-processed assets, for example, source code,
configuration files, or runtime information.
• Service, which ties the evidence to a certain service.
• Tool, which represents the extractor component that has collected the evidence.
In particular, the connection to Service should enable the fusion of evidence from multiple
sources. This requires a unique identifier for each service, which will be used as URI for the
service instance.
Extensions are built on top of the Core and Security Feature ontologies. In this example, we used
the Document and Application extensions and limit the scope to just one class per extension. As
previously described, the classes in the extensions should model their respective domains. The
following two classes are used in the example:
• ArchitectureDocument, which represents a human-readable textual document of
software architecture, and
• SourceCodeFile, which represents a source code file which is compiled for a given
service and is stored in a repository.
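
The class structure described in this section can be sketched in plain Python. This is only an illustration under stated assumptions, not the OWL definition used in EMERALD: the property names (hasSecurityFeature, hasService, hasAsset, hasTool) and the helper is_subclass_of are hypothetical, and it is assumed here that the two extension classes specialise Asset.

```python
from enum import Enum


class AssetType(Enum):
    """Enumeration-style class: one instance per possible variant."""
    SPECIFICATION = "Specification"    # how the system should behave
    IMPLEMENTATION = "Implementation"  # how the system actually behaves


# "subclass of" relations (child -> parent).
SUBCLASS_OF = {
    "TransportEncryption": "SecurityFeature",
    "ArchitectureDocument": "Asset",  # assumption: extension class specialises Asset
    "SourceCodeFile": "Asset",        # assumption: extension class specialises Asset
}

# Connections of the central Evidence class (hypothetical property names).
EVIDENCE_LINKS = {
    "hasSecurityFeature": "SecurityFeature",
    "hasService": "Service",
    "hasAsset": "Asset",
    "hasTool": "Tool",
}


def is_subclass_of(cls, ancestor):
    """Walk the "subclass of" chain upwards until the ancestor or the root is reached."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False
```

With this sketch, is_subclass_of("SourceCodeFile", "Asset") holds, so an instance such as a source code file could be used wherever Evidence expects an Asset.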
5.2 Adding Instances of Extracted Evidence
Evidence gathered from the system is modelled as OWL instances. In the example in Figure 6,
evidence extracted by eknows will be used. The found evidence is represented as the instance
TEFoundInCode in the diagram and has connections to other instances. Please note that
“TransportEncryption” is abbreviated as “TE” in this example for better readability in the
diagram.
TEFoundInCode connects to the following instances:
• Controller.java, an instance of the SourceCodeFile class,
• TLS, an instance of TransportEncryption with the version property set to “1.2”,
• eknows, an instance of Tool, to represent the extraction component,
• Implementation, an instance provided by Core, to indicate that the evidence has been
found as actual behaviour, and
• ProductService, an instance of Service, to represent the service to which the evidence
belongs.
Note: Evidence from other extraction components must link to the same service instance. In
OWL, two instances are considered as the same if they are identified by the same URI. This
enables knowledge fusion later on for the assessment, and therefore one must ensure that the
same URI is used to identify a given service across all extraction components. This challenge was
tackled in MEDINA by creating an ID for each cloud service, and the same strategy will be applied
here.
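
The role of the shared service URI can be illustrated with a small Python sketch. It assumes evidence records are plain dictionaries. The first record mirrors the TEFoundInCode instance from the text. The second record, its asset name, its tool name, its TLS version and the example URI are all hypothetical and only serve to show how evidence from a second extractor would be fused and compared.

```python
from collections import defaultdict

# Hypothetical URI; in practice it is derived from the unique service ID.
SERVICE_URI = "https://example.org/services/ProductService"

evidence = [
    {   # mirrors the TEFoundInCode instance described above
        "id": "TEFoundInCode",
        "asset": "Controller.java",
        "asset_type": "Implementation",
        "feature": {"class": "TransportEncryption", "version": "1.2"},
        "tool": "eknows",
        "service": SERVICE_URI,
    },
    {   # hypothetical finding from a document extractor (illustration only)
        "id": "TEFoundInDocument",
        "asset": "architecture.pdf",
        "asset_type": "Specification",
        "feature": {"class": "TransportEncryption", "version": "1.3"},
        "tool": "document-extractor",
        "service": SERVICE_URI,
    },
]


def fuse_by_service(records):
    """Group evidence by service URI; identical URIs denote the same service."""
    fused = defaultdict(list)
    for record in records:
        fused[record["service"]].append(record)
    return dict(fused)


def version_mismatches(records):
    """Return (service, specified, implemented) where the TLS versions differ."""
    result = []
    for service, items in fuse_by_service(records).items():
        spec = {i["feature"]["version"] for i in items
                if i["asset_type"] == "Specification"}
        impl = {i["feature"]["version"] for i in items
                if i["asset_type"] == "Implementation"}
        if spec and impl and spec != impl:
            result.append((service, sorted(spec), sorted(impl)))
    return result
```

If the two extractors used different URIs for the same service, fuse_by_service would place their records in separate groups and no comparison between specification and implementation would take place, which is why a common identifier is required.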
5.3 Challenges and future work
Based on this example, the ontology will continuously be extended in the course of the project.
Furthermore, some design decisions are not final and are still discussed. This includes, but is not
limited to, connections between classes in general.
Another open discussion is the structure of Evidence and SecurityFeature. Currently they are
modelled as two separate classes and it is being evaluated whether it would be more sensible
and simple to merge these two classes into one.
At the time of writing, the implications of each decision cannot yet be estimated entirely, and
the structure of the ontology will continue to evolve. The results will be reported in the
upcoming deliverables D2.10 “Certification Graph–v1” (M15) and D2.11 “Certification Graph–
v1” (M27).
6 Conclusion
The aim of this deliverable is to document how to establish a unified view of the cloud service
under certification by extracting and enriching knowledge on different layers of the service and
providing a suitable schema for storing this evidence. Therefore, we describe a seamless
approach consisting of data acquisition, knowledge extraction, and knowledge fusion to build
the CertGraph Ontology and its sub-ontologies and extensions to consolidate all necessary
information of the service as the basis for implementing the knowledge graph in WP3. An
illustrative example on how to model and combine evidence extracted by different EMERALD
extractors is provided as an initial proof-of-concept (PoC).
The main contributions of the ontology for evidence storage include the provision of:
1 A concept of generic models to map security aspects.
Firstly, generic models (i.e., sub-ontologies and extensions) provide support for
evidence extraction from different sources (e.g., infrastructure, source code, and
documents) of the service and a schema for storing the heterogeneous evidence
information. Following a knowledge graph-based approach, these models make it possible to
view partial evidence from different perspectives.
2 A clean basis for multi-evidence fusion.
Secondly, linking of heterogeneous evidence makes it possible to aggregate individual aspects
and fragments of information to a higher level of combined evidence, while providing
support for traceability of information sources and extraction processes. This way, the
graph serves as a common structure filled by all evidence extraction tools that can be
leveraged by the assessment tools in WP3 to measure security metrics.
3 Enhanced quality of measurement and possibility of comparison.
Thirdly, assessing (partial) evidence from different sources allows a qualitative
statement about the accuracy of measured results for auditors and, furthermore,
enables the comparison between specification (e.g., in policy documents) and
implementation (e.g., in source code) of security features.
4 Representation of evidence about AI model security.
Lastly, by also integrating evidence extracted by novel methods for the security
assessment of AI models, EMERALD will also be able to certify cloud-based AI systems
and transfer the innovation results to upcoming AI certification schemes.
There are also some open challenges that we will address in future work:
• Security controls and metrics to be addressed are not yet defined.
It is not clear yet what evidence should be included in the ontology extensions in detail.
This heavily depends on the security controls and metrics to be addressed in the pilots,
which are not known by the time of writing. To mitigate this challenge, we use a mixed
approach (“top-down” followed by “bottom-up”) for developing the ontology
extensions and mapping them to required security feature properties later.
In addition, a workshop is planned together with responsible technical persons of the
pilot partners to discuss functional requirements of the technical extraction
components.
• Details on fusion of partial evidence and implementation options need to be clarified.
Further details on ensuring unique identifiers, temporal constraints, and semantic
refinements need to be investigated. In Task 2.6, we will research creating a graph
abstraction layer to facilitate querying of evidence and apply graph-based analysis to
generate new insights of knowledge. We will further investigate reasoning techniques
Figure 11. Example of the auto-generated protobuf message of the VirtualMachine resource
Another specialty is that we use the oneof keyword in protobuf messages. This is used for all
nodes (classes) that are not leaf nodes (the lowest classes). This makes it possible to obtain the individual
type information of the intermediate nodes. Figure 12 shows an example of the protobuf
message.
Figure 12. Example of the auto-generated protobuf message of the intermediate node/resource
Compute
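
The use of oneof for intermediate nodes can be illustrated with a short Python sketch that emits a protobuf message for a non-leaf class. It is not the generator tool itself: the function names, the oneof name "type", the snake_case field names and the field numbering are assumptions made for this illustration. Only the Compute category and its three resources are taken from the taxonomy described in Section 4.2.4.

```python
import re

# Excerpt of the taxonomy: parent class -> direct subclasses.
TAXONOMY = {
    "Compute": ["Container", "Function", "VirtualMachine"],
}


def snake_case(name):
    """Convert a CamelCase class name into a snake_case field name."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def oneof_message(node, taxonomy):
    """Emit a protobuf message that wraps the subclasses of a non-leaf node in a oneof."""
    children = taxonomy.get(node, [])
    if not children:
        raise ValueError(f"{node} is a leaf node; no oneof is generated")
    lines = [f"message {node} {{", "  oneof type {"]
    # Field numbers are assigned by position here (hypothetical numbering).
    for number, child in enumerate(children, start=1):
        lines.append(f"    {child} {snake_case(child)} = {number};")
    lines += ["  }", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(oneof_message("Compute", TAXONOMY))
```

A consumer of such a message can check which oneof field is set and thereby recover the concrete type (e.g., VirtualMachine) behind the intermediate node Compute.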
B.3 Future Work
We plan to implement the following improvements:
• Field numbers. To stay compatible with previous protobuf versions, field numbers must
not change. Currently, the tool cannot handle this when new properties are added.
• Several files. The tool can currently only read in one owl file to generate the
corresponding protobuf file. Since the CertGraph Ontology consists of several import
files, it would be desirable if the tool could directly process several files as input.
• Load Ontology files. Ontology files are usually available at a specific URL. It would be
desirable if only the URL of the root file had to be specified and this and all other
imported files would be automatically retrieved and used for the file generation.
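
One possible way to approach the field number improvement listed above is to keep the mapping from an earlier generation run and only allocate fresh numbers for new properties. The following Python sketch illustrates this idea. It is not part of the current tool, and the function name and the property names in the usage example are hypothetical.

```python
def assign_field_numbers(properties, previous=None):
    """Keep numbers known from `previous`; give new properties the next unused numbers."""
    previous = dict(previous or {})
    # Properties that already existed keep their old number.
    assigned = {name: previous[name] for name in properties if name in previous}
    # Never reuse a number, even one that belonged to a property that was removed.
    next_number = max(previous.values(), default=0) + 1
    for name in properties:
        if name not in assigned:
            assigned[name] = next_number
            next_number += 1
    return assigned


if __name__ == "__main__":
    v1 = assign_field_numbers(["id", "name"])
    # A property inserted in the middle does not shift existing numbers.
    v2 = assign_field_numbers(["id", "block_storage", "name"], v1)
    print(v1, v2)
```

The design choice here is that numbers of removed properties stay retired. A complete solution would additionally have to persist the mapping between runs and remember retired numbers across more than one version, which this sketch does not do.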