Author:
Wilcke, W.X., VU University Amsterdam
ARIADNE is funded by the European Commission's
7th Framework Programme.
D16.1: First Report on Data Mining
ARIADNE D16.1 (Public)
The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7-INFRASTRUCTURES-2012-1) under grant agreement n° 313193.
Version: 1.2 (final)
March 2015
Contributing partners:
De Boer, V., VU University Amsterdam
Van Harmelen, F.A.H., VU University Amsterdam
De Kleijn, M.T.M., VU University Amsterdam
Wansleeben, M., Leiden University
Quality Control Review:
Wright, H. E., Archaeology Data Service, University of York
ARIADNE is a project funded by the European Commission under the Community's Seventh Framework Programme, contract no. FP7
The views and opinions expressed in this presentation are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission.
Table of Contents
Document History
List of Abbreviations
Executive Summary
Recommendations
Roadmap
1 Introduction and Objectives
1.1 Structure of Report
2 Introduction to Linked Data
2.1 The RDF Data Model
2.2 Ontologies
2.3 The Semantic Web
2.4 Linked Archaeological Data
3 Introduction to Data Mining
3.1 Learning from Archaeological Data
3.2 Knowledge Discovery and Data Mining
3.3 Data Mining Tasks
3.4 Towards Mining the Semantic Web
4 Semantic-Web Mining
4.1 Data-Mining Tasks
4.2 Applicable Solutions
5 Domain Understanding
5.1 Relevant Studies
5.2 Wishes of Domain Experts
5.3 Summary
6 Data Understanding
6.1 Data Produced using Natural-Language Processing
6.2 Case Study on Data Repositories
6.3 Summary
7 Data Mining on Linked Archaeological Data
7.1 Hypothesis Generation
7.2 Assisted Query Formulation
7.3 Ranking of Query Results
7.4 Resource Recommender System
7.5 Data Quality Analysis
7.6 Trust Analysis
8 Conclusions
8.1 Domain Understanding
8.2 Data Understanding
8.3 Recommendations
8.4 Roadmap
Bibliography
Appendix A Reasoning with Logic
A.1 Reasoning by Deduction
A.2 Reasoning by Induction
A.3 Logic Reasoning within the Semantic Web
Appendix B Vector Space Models
Appendix C Learning Methods for Semantic Web Mining
C.1 Propositional Learning
C.2 Statistical Relational Learning
C.3 Kernel Methods
Appendix D Sample of Archaeological Scenarios
Document History
1. 13th February, 2015 – Full Draft Version 1.0
2. 19th February, 2015 – QC Review 1.0
3. 4th March, 2015 – Full Draft Version 1.1
4. 6th March, 2015 – QC Review 1.1
5. 6th March, 2015 – Full Final Version 1.2
List of Abbreviations
The following abbreviations will be used in this report.
Abbreviation   Full Term
ACDM           ARIADNE Catalogue Data Model
ADS            Archaeology Data Service
API            Application Programming Interface
ARIADNE        Advanced Research Infrastructure for Archaeological Dataset Networking in Europe
BGV            Basic Geo Vocabulary
CAA            Computer Applications & Quantitative Methods in Archaeology
DCAT           Data Catalogue Vocabulary
DINAA          Digital Index of North American Archaeology
DM             Data Mining
GIS            Geographic Information System
IG             Information Gain
ILP            Inductive Logic Programming
IT             Information Task
KDD            Knowledge Discovery and Data Mining
LAD            Linked Archaeological Data
LD             Linked Data
LOD            Linked Open Data
ML             Machine Learning
MRDM           Multi-Relational Data Mining
NLP            Natural Language Processing
OGC            Open Geospatial Consortium
OLAP           Online Analytical Processing
OWL            Web Ontology Language
PCA            Principal Component Analysis
PSL            Probabilistic Soft Logic
RDF            Resource Description Framework
RDFS           Resource Description Framework Schema
SKOS           Simple Knowledge Organization System
SRL            Statistical Relational Learning
SVM            Support Vector Machine
SW             Semantic Web
SWM            Semantic Web Mining
TL             Trust Level
UI             User Interface
URI            Uniform Resource Identifier
VSM            Vector Space Models
W3C            World Wide Web Consortium
WGS            World Geodetic System
WP             Work Package
Executive Summary
ARIADNE, the Advanced Research Infrastructure for Archaeological Dataset Networking in Europe, will facilitate a central web portal that provides access to archaeological data from various sources in a standardized and open format. This format will likely adhere to the Linked Data paradigm either fully or partially, with the former being the option that we believe is needed to propel ARIADNE towards a higher level of interoperability. Under this assumption, users will be able to use the portal to browse and search the data, thereby making use of all the advantages that Linked Data has to offer. Among these advantages are advanced search abilities, the inherent capability of drawing inferences, and the enrichment of data by linkage to and from external sources. We expect these features to have a positive impact on the archaeological research community. However, this does not necessarily add to the knowledge contained within the aggregated data sets. Ideally, we would like to expand this knowledge as well. One field of expertise that specializes in exactly that is data mining. To this end, it provides tools and techniques to identify valid, novel, potentially useful, and ultimately understandable patterns in data (U. M. Fayyad 1996).
This report examines the applicability and feasibility of integrating data mining solutions into ARIADNE. To this end, we explored various state-of-the-art theories, methods, and solutions to detect patterns in, and establish relations between, data from the archaeological domain. Throughout this report, we made the assumption that this data will adhere, either fully or partially, to the principles of the Linked Data paradigm. The subfield of data mining dedicated to this form of data, known as semantic web mining, was deliberately chosen over the more traditional tabular data mining, for its ability to fully exploit the graph-like structure of Linked Data without the loss of knowledge. In addition to data mining, our study focussed on usage-pattern analysis and content linking, as well as on information retrieval. To this end, a thorough analysis of users' needs and wishes was conducted, as well as an exploration of the expected data's characteristics. Furthermore, recent and relevant literature and experience on the topics involved were examined in depth.
The user-requirements study involved an analysis of the questionnaires and interviews that were conducted by work packages 2.1 and 13.1, respectively. While providing valuable insight into the stakeholders of ARIADNE, both work packages only touched on the possibility of data mining. As a result, very little could be ascertained as to what path any data mining solution should follow. Moreover, the large majority of the stakeholders had little to no experience with data mining and were unaware of what it actually entailed. To mitigate this lack of direction, several additional interview sessions with stakeholders were held, during which the possibility of data mining was more actively explored. Regardless, of all the topics discussed, only a few were relevant with respect to data mining.
In its entirety, the requirements study seemed to indicate that the large majority of the difficulties experienced by stakeholders could be mitigated by the use of Linked Data alone. Several of these issues could additionally be improved upon even more with the help of data mining. These issues involved knowing which data is available, how to locate relevant data, and how to distil the relevant results from those that are not. In addition, the quality of the data was mentioned prominently, thereby emphasizing their (lack of) completeness and the (lack of) trust bestowed on them. Together, these were amongst the prime areas considered to which a data-mining solution could be applied.
Exploring the data is an important early step within any data-mining process, during which the data's characteristics, their quality, and their abnormalities are inspected. Generally, a generous amount of data is provided from which conclusions can be drawn that influence choices made during the development of the eventual data-mining solution. Unfortunately, the minimal amount of data currently available through ARIADNE prevents such a sequence of events from taking place. Therefore, Linked Data from several different archaeological repositories around the globe was inspected instead. These data were chosen for their almost-disjoint characteristics, thus hopefully providing good representations of the different facets that ARIADNE might bring forth. Assuming they do, several observations could be made: save for the generally expected differences in used ontologies and structure, the examined linked archaeological data were found to strongly depend on descriptive values, as well as consisting largely of relatively flat data structures. These aspects of the data should be considered during the development of the forthcoming data-mining solution.
Understanding the domain and its data are two early but crucial steps in any data-mining process. Together with our updated knowledge on Linked Data and data mining, these all come together to form the field of Semantic Web mining. This field represents a young area of research of which many aspects are still uncertain or left unexplored, from both a technical and practical perspective. In fact, many of its methods are still under heavy development, with few of them having progressed outside the confines of academic research. Therefore, instead of considering all possible approaches, we have solely focussed on the more prominent movements as seen in the literature.
Recommendations
Generally, the developer of a typical data mining solution will explore a large amount of data with the goal of revealing potentially relevant patterns. After careful inspection, the truly relevant patterns will subsequently be generalized to the entirety of the data. Unfortunately, the small amount of data currently available through ARIADNE makes it rather unlikely that any discovered pattern can be successfully generalized to the large amount of data that, one day, will be accessible. Instead, a more generic approach is suggested, such that its workings are ensured regardless of the exact characteristics of the future data.
Based on the study of both domain and data, as well as on practical constraints with respect to time and resources, two data-mining solutions were chosen which were deemed the most feasible and suitable for implementation within the ARIADNE infrastructure. These constitute 1) the ability for users to generate potentially relevant hypotheses, and 2) analysing the quality of data as well as helping to improve it. We will briefly touch on these two solutions next.
2.2 Ontologies
As discussed previously, RDF is a simple data model with which statements can be specified. In fact, RDF alone can be used to make any statement, even ludicrous ones such as that the Pyramids of Giza are in Rome. The reason behind this is that it does not make any assumptions about the domain it describes, nor the semantics used to describe that domain. That is, RDF itself is "unaware" of the information it states. Instead, this "awareness" is provided by ontologies (Shadbolt, Hall and Berners-Lee 2006, van Harmelen, et al. 2012, Heath and Bizer 2011).
Within the field of Computer Science, ontologies describe a domain by types, properties, and relation types. They are the culmination of a progression from simple vocabularies with a fixed list of terms to full-fledged languages with a powerful expressiveness (Garshol 2004, van Harmelen, et al. 2012). To this end, ontologies generally provide a hierarchy of classes and properties, as well as allowing some form of reasoning over them. These classes may encompass subjects, predicates, and objects, including both resources and literals. For instance, an archaeological ontology might define the exact meaning of the concept excavation, and that this concept should always contain an archaeological context to be valid.
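The excavation example can be made concrete with a minimal sketch. It assumes triples are plain Python tuples and that the "ontology" is reduced to a table of required properties per class; all `ex:` names are hypothetical and do not come from any real vocabulary. Real ontology languages such as OWL reason under an open-world assumption, so this closed-world check is a deliberate simplification.

```python
# Toy triples: (subject, predicate, object). All ex: names are hypothetical.
TRIPLES = {
    ("ex:dig42", "rdf:type", "ex:Excavation"),
    ("ex:dig42", "ex:hasContext", "ex:context7"),
    ("ex:dig43", "rdf:type", "ex:Excavation"),  # no context: plain RDF allows it
    ("ex:Giza", "ex:locatedIn", "ex:Rome"),     # ludicrous, yet well-formed RDF
}

# A minimal stand-in for an ontology: each class lists the properties it requires.
REQUIRED = {"ex:Excavation": {"ex:hasContext"}}


def violations(triples, required):
    """Return (resource, missing property) pairs, sorted for stable output."""
    found = []
    for subject, predicate, cls in triples:
        if predicate != "rdf:type":
            continue
        # Every property this subject actually uses anywhere in the data.
        present = {p for s, p, _ in triples if s == subject}
        for prop in required.get(cls, set()):
            if prop not in present:
                found.append((subject, prop))
    return sorted(found)


print(violations(TRIPLES, REQUIRED))
```

The triple about Giza passes unnoticed because nothing in this toy ontology says anything about locations, which mirrors the point that RDF alone carries no awareness of what it states.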
Numerous ontologies, both simple and powerful, modelling various domains have already been made available in an RDF-compliant format. Within ARIADNE, the more relevant of these is, unsurprisingly, the archaeological domain. As this domain has a strong geospatial component (Wagtendonk, et al. 2009, De Kleijn, et al. 2014, Conolly and Lake 2006), this latter domain can be regarded as quite relevant as well. Therefore, several ontologies describing these two domains will be briefly discussed next. Please note that, for simplicity, the technical aspects behind ontologies have been omitted.
2.2.1 Geospatial Ontologies
There exist several ontologies holding geospatial knowledge, of which the most elementary is the Basic Geo Vocabulary (BGV) (Brickley 2006). BGV allows for the specification of points within the World Geodetic System (WGS) standard. To this end, it accepts latitude, longitude, and altitude declarations.
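As an illustration of such declarations, the sketch below writes a single point as N-Triples lines using the W3C Basic Geo namespace and its `Point`, `lat`, `long`, and `alt` terms. The helper name `point_triples` and the example URI are assumptions made for this sketch, and the coordinates are only approximate.

```python
WGS84 = "http://www.w3.org/2003/01/geo/wgs84_pos#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def point_triples(uri, lat, lon, alt=None):
    """Serialise one WGS84 point as a list of N-Triples lines."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude/longitude out of range")
    lines = [
        f"<{uri}> <{RDF_TYPE}> <{WGS84}Point> .",
        f'<{uri}> <{WGS84}lat> "{lat}" .',
        f'<{uri}> <{WGS84}long> "{lon}" .',
    ]
    if alt is not None:  # altitude is optional in this sketch
        lines.append(f'<{uri}> <{WGS84}alt> "{alt}" .')
    return lines


# Hypothetical resource URI; coordinates roughly those of the Giza plateau.
for line in point_triples("http://example.org/site/giza", 29.9792, 31.1342):
    print(line)
```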
A geospatial ontology with a more powerful level of expressiveness than BGV is GeoSPARQL, a geographical query language developed by the Open Geospatial Consortium [3] (OGC) (OGC GeoSPARQL - A Geographic Query Language for RDF Data 2012). While it emphasizes the retrieval of knowledge, i.e. querying, more than describing, GeoSPARQL still offers a flexible ontology to describe topologies. To accomplish this, it incorporates specifications from other geospatial standards designed by the OGC, among which are the Geography Markup Language [4] and Simple Feature Access [5]. Therefore, GeoSPARQL accepts declarations of points, (multi) lines, and (multi) polygons, as well as describing their properties by Region Connection Calculus. However, some of the more exotic features require that the data store supports the GeoSPARQL protocol.
[3] Open Geospatial Consortium, see www.opengeospatial.org
[4] Geography Markup Language, see www.opengeospatial.org/standards/gml
[5] Simple Feature Access, see www.opengeospatial.org/standards/sfa
2.2.2 Archaeological Ontologies
An ontology within the archaeological domain will be the ARIADNE Catalogue Data Model (ACDM) (Aloia, et al. 2014). The ACDM attempts to provide a data model to describe archaeological resources, such as collections, data sets and services, as well as metadata and vocabularies. Due to this area of focus, it is being built upon the Data Catalogue Vocabulary (DCAT) (Maali, Erickson and Archer 2014), a vocabulary recommended by the W3C for its ability to represent government data catalogues. Instead of catalogues however, ACDM emphasizes collections and data sets, with the former being a set of heterogeneous items without a formal structure and the latter being a set of structured records. These structured records are assumed to originate from either a database or a Geographic Information System (GIS).
Two other archaeological ontologies are the CRMarchaeo and CRM-EH extensions of the CIDOC Conceptual Reference Model, an ontology for describing knowledge from the domain of cultural heritage (The CIDOC Conceptual Reference Model n.d., Doerr and Schaller 2008). The CIDOC CRM was developed in collaboration with the International Council of Museums, with the aim of allowing diverse perspectives that incorporate different institutional histories, disciplines, and objectives. To this end, it provides a solid core with the ability to add functionality by use of extensions.
The CRMarchaeo constitutes a generic archaeological extension, developed within the ARIADNE framework, which aims at encoding metadata on the excavation process (Cripps, et al. 2014). By offering this metadata, CRMarchaeo endeavors to optimize the interpretability of a documented excavation, thereby providing the rationale for conducting that excavation, as well as knowledge on previous excavations and studies on the same site.
The second archaeological CRM extension, the CIDOC CRM-EH (May n.d.), was developed to include the archaeological concepts and processes in use by English Heritage [6], a national heritage body in the UK charged with safeguarding cultural heritage. To this end, it offers numerous classes and (inverse) properties divided amongst several modules.
[6] English Heritage, see www.english-heritage.org.uk
2.3 The Semantic Web
The LOD cloud constitutes a large number of interconnected RDF repositories. These repositories, commonly referred to as triple stores, allow queries to be processed on their data (Shadbolt, Hall and Berners-Lee 2006, van Harmelen, et al. 2012).
A single, isolated triple store already provides a data structure that allows powerful methods of searching through and reasoning with the data to be used. The real advantage of LD, however, only surfaces when multiple triple stores are available on the web and are linked to each other. Recall that this distinction is equivalent to the difference between the four and five star rating of the LOD project as was discussed earlier. Irrespective, these interlinked triple stores together form a web of data: the Semantic Web (Bizer, Heath and Berners-Lee 2009, Heath and Bizer 2011, van Harmelen, et al. 2012).
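The difference between an isolated and a linked triple store can be sketched as follows. The sketch assumes two toy stores held as Python sets and connected by a single `owl:sameAs` statement; the `a:` and `b:` resources are invented for illustration and the helper `describe` is hypothetical.

```python
# Two independent toy "triple stores"; every URI is made up for illustration.
STORE_A = {
    ("a:coin17", "a:foundAt", "a:siteX"),
    ("a:siteX", "owl:sameAs", "b:place9"),  # the link into the other store
}
STORE_B = {
    ("b:place9", "b:country", "b:Italy"),
}


def describe(resource, *stores):
    """Collect facts about a resource, following owl:sameAs links across stores."""
    web = set().union(*stores)
    aliases, frontier = {resource}, [resource]
    while frontier:
        current = frontier.pop()
        for s, p, o in web:
            if p != "owl:sameAs":
                continue
            # sameAs is symmetric: follow the link in either direction.
            other = o if s == current else s if o == current else None
            if other is not None and other not in aliases:
                aliases.add(other)
                frontier.append(other)
    return sorted((p, o) for s, p, o in web if s in aliases and p != "owl:sameAs")


print(describe("a:siteX", STORE_A))           # the isolated store knows nothing more
print(describe("a:siteX", STORE_A, STORE_B))  # the linked stores add the country
```

Only when both stores are consulted does the site acquire a country, which is the enrichment by external linkage that the text describes.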
2.3.1 Querying the Semantic Web
Nearly all search engines of the current-day World-Wide Web are keyword-based (Freitas, et al. 2012); given a provided set of words the most relevant result is sought. Generally, this is accomplished by some variation on Vector Space Models (Appendix B), which constitute a method with which documents can be identified. Unfortunately, such a solution would be unsuited for searching through the LOD that makes up the Semantic Web (SW) or web of data, as it lacks the ability to represent structured data as well as their semantics (Figure 2-3). Instead, the SW is generally searched through by query-based languages that were specifically developed for this purpose.
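To make the keyword-based approach tangible, here is a minimal sketch of one common Vector Space Model variant: raw term-frequency vectors compared by cosine similarity. This particular weighting is an assumption of the sketch, not necessarily the formulation used in Appendix B, and the sample documents are invented.

```python
import math
from collections import Counter


def vectorise(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(query, documents):
    """Return document indices ordered from most to least similar."""
    q = vectorise(query)
    scores = [cosine(q, vectorise(d)) for d in documents]
    return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)


DOCS = [
    "roman coin found near york",
    "pottery fragment with burn marks",
    "burn marks on roman pottery",
]
print(rank("roman pottery", DOCS))
```

The ranking depends only on shared words. The model has no notion of which word is a period, which is a material, or how the two relate, which is the limitation noted above.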
In the case of a triple store, the commonly used query language is that of SPARQL (Prud'hommeaux and Seaborne 2008, van Harmelen, et al. 2012, Heath and Bizer 2011, Shadbolt, Hall and Berners-Lee 2006, Bizer, Heath and Berners-Lee 2009). SPARQL is a W3C recommended protocol and a query language developed to access, retrieve, and modify RDF data. Similar to other query languages, such as SQL, it offers a wide range of capabilities ranging from simple pattern matching to complex queries with restrictions in range, time, and domain. Moreover, SPARQL's query-processing engine enables the ability to reason deductively (Appendix A) over the data to which it provides access. This ability can be either relatively powerful or rather limited, depending on the expressive strength of the ontologies used to specify these data with.
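A small sketch may help to show what deductive reasoning over such data amounts to. It forward-chains two RDFS-style subclass rules over toy triples until nothing new can be derived. This is a hand-written illustration under the assumption of a tiny invented class hierarchy, not a description of how any particular SPARQL engine implements entailment.

```python
def infer_types(triples):
    """Forward-chain two RDFS-style rules until no new triples appear.

    Rule 1: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    Rule 2: (x type A), (A subClassOf B)       => (x type B)
    """
    closed = set(triples)
    while True:
        new = set()
        subclass = {(s, o) for s, p, o in closed if p == "rdfs:subClassOf"}
        for a, b in subclass:
            for b2, c in subclass:
                if b == b2:
                    new.add((a, "rdfs:subClassOf", c))
            for x, p, cls in closed:
                if p == "rdf:type" and cls == a:
                    new.add((x, "rdf:type", b))
        if new <= closed:  # fixed point reached
            return closed
        closed |= new


# Hypothetical class hierarchy and a single typed find.
FACTS = {
    ("ex:Amphora", "rdfs:subClassOf", "ex:Pottery"),
    ("ex:Pottery", "rdfs:subClassOf", "ex:Artefact"),
    ("ex:find1", "rdf:type", "ex:Amphora"),
}
print(("ex:find1", "rdf:type", "ex:Artefact") in infer_types(FACTS))
```

How much can be derived this way depends on how much the ontology states: without the two subclass statements, nothing beyond the original fact about `ex:find1` would follow.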
Querying a triple store through a query language like SPARQL is accomplished by connecting to a so-called endpoint. Endpoints are provided by the triple store and form the bridge between a user and the data contained in that triple store (van Harmelen, et al. 2012, Heath and Bizer 2011). Once a query has been submitted to such a publicly accessible endpoint, it will be processed and the results returned. For instance, a query might request all pottery fragments with burn marks and larger than 20 cm² that were found between 1950 and 1960 at a specific site in Italy.
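The pottery example could be phrased roughly as in the sketch below. The `ex:` vocabulary, the resource `ex:site42` (standing in for the Italian site), and the endpoint address are all hypothetical, since the report does not name a concrete schema. Only the use of a `query` parameter in a GET request follows the SPARQL protocol. The sketch builds the request URL without sending it.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and vocabulary; neither comes from the report.
ENDPOINT = "http://example.org/sparql"

QUERY = """
PREFIX ex: <http://example.org/vocab#>
SELECT ?fragment WHERE {
  ?fragment a ex:PotteryFragment ;
            ex:hasFeature ex:BurnMark ;
            ex:areaCm2 ?area ;
            ex:foundIn ?year ;
            ex:foundAt ex:site42 .
  FILTER (?area > 20 && ?year >= 1950 && ?year <= 1960)
}
"""


def request_url(endpoint, query):
    """Build the GET URL for a query using the SPARQL protocol's `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


print(request_url(ENDPOINT, QUERY))
```

Unlike a keyword search, each condition in the query refers to a named property, so the restriction on area cannot be confused with the restriction on year.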
Figure 2-3: Expressivity-Usability trade-off for querying with either a word or ontology-based search engine (Freitas, et al. 2012).
2.4 Linked Archaeological Data
While fairly new, the concept of LD is not unheard of within the archaeological domain. In fact, some think of it as being the next logical step in sharing archaeological knowledge (Signore 2009, Richards 2006). Others, however, believe that its semantic complexity lacks the ability to describe the uncertainty in archaeological data (Isaksen, Martinez, et al. 2010, Martinez and Isaksen 2010). Regardless, several endeavors regarding Linked Archaeological Data (LAD) repositories have already been undertaken.
One of the larger LAD repositories within Europe is disseminated by the Archaeology Data Service [7] (ADS), which provides a SPARQL endpoint to a triple store hosted by the University of York [8] (Charno, et al. 2012). This triple store was developed as part of the STELLAR [9] project (Tudhope, et al. 2011), a collaboration, funded by the UK Arts & Humanities Research Council [10], in partnership with the University of Glamorgan (now South Wales) [11], and English Heritage, with the aim of improving the integration of LD into the digital archaeological domain. The data currently in the ADS triple store were converted from databases and spreadsheets to RDF, using the CRM-EH ontology. The resulting triples are stored in an AllegroGraph [12] triple store, which points to a SPARQL endpoint and allows the results to be provided in one of several RDF serializations.
On the other side of the Atlantic Ocean, the Digital Index of North American Archaeology (DINAA) aims to integrate government-curated public data from both offline and online digital archaeological repositories (Wells, et al. 2014). Supported by the National Science Foundation [13], its primary focus concerns aiding the researcher in data discovery, as well as filling the gap in archaeological information infrastructures. Moreover, DINAA emphasizes the reuse of both technologies and data. These data, which are stored in a MySQL [14] database, are expressed with the help of the DINAA [15] vocabulary, an ontology built upon OWL and strongly influenced by CIDOC CRM. Furthermore, as these data are open, DINAA has been welcomed into the LOD cloud.
Where both ADS and DINAA provide national data for the most part, ARIADNE will attempt to extend this to the whole of Europe. Not all LAD projects aim at such scale however. For instance, Gruber, et al. (2012) explore the feasibility and usefulness of introducing LD into the field of numismatics, thereby providing enhanced searching abilities to a database of Roman coins. Another example is from Isaksen, Martinez, et al. (2009), who tried to improve our understanding of ancient trade networks by analyzing LAD concerning the distribution of amphorae and marble. As a final example, consider the research done in de Boer, et al. (2014), where they paired LAD concerning Dutch shipwrecks with that of Dutch sailors, resulting in new insights on the socio-economic realities of the 18th century.
[7] Archaeology Data Service, see data.archaeologicaldataservice.ac.uk
[8] University of York, see www.york.ac.uk
[9] STELLAR Project, see www.archaeologicaldataservice.ac.uk/research/stellar
[10] Arts & Humanities Research Council, see www.ahrc.ac.uk
[11] University of Glamorgan, see www.southwales.ac.uk
[12] AllegroGraph, see www.franz.com/agraph/allegrograph
[13] National Science Foundation, see www.nsf.gov
[14] MySQL, see www.mysql.com
[15] DINAA Ontology, see opencontext.org/vocabularies/dinaa
3 Introduction to Data Mining
Data mining (DM) is a fairly new and multi-disciplinary field which intersects with Artificial Intelligence, Data Science, and Statistics, as well as partially overlapping with Machine Learning (ML) from which it draws its technical basis (Kantardzic 2011, Hastie, et al. 2009, Friedman 1998, Witten, Frank and Hall 2011). Therefore, the people who specialize in this field stem from various backgrounds and hold different views, making it difficult to provide a definition agreed upon by all those involved. Consider the following three definitions, as found in the literature:
"Data Mining is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data" (U. M. Fayyad 1996).
"Data Mining is the extraction of implicit, previously unknown, and potentially useful information from data" (Witten, Frank and Hall 2011).
"Data Mining is a decision support process where we look in large data bases for unknown and unexpected patterns of information" (Parsaye 1996).
A common notion throughout the different definitions is that of identifying, extracting, and using information from data that was previously unknown. In other words, Data Mining concerns learning from data (Hastie, et al. 2009).
3.1 Learning from Archaeological Data
Long before the field of DM came to be, statistics was the only area that specialized in learning from data. Despite its potential usefulness however, many archaeologists used to refrain from familiarizing themselves with these methods (Baxter 2003). Hence, their use in the archaeological domain progressed rather slowly. Instead, most methods were adopted out of necessity, i.e. to solve a problem, and were originally developed in other fields such as geography and ecology. Only with the emergence of New Archaeology did the interest in statistical methods grow. Still, early uses were largely without the then-called 'complex statistics', which included principal component (PCA), factor, and cluster analysis (Whallon 1987, Kintigh 1987). This changed with the increased availability of statistical applications.
Over the last two decades, a rise in computational power gave birth to various statistical applications (Baxter 2003). Some of these, such as SPSS [16] and, to a lesser degree, R [17], were aimed at individuals who were not really familiar with the theory behind statistical methods. These tools allowed archaeologists to apply and get acquainted with basic approaches such as regression analysis and Bayesian statistics, as well as providing a simple frontend to the aforementioned ‘complex statistics’. Recent years saw the integration of statistical methods into various non-dedicated applications. For example, consider a Geographical Information System (GIS), a tool commonly used by archaeologists to perform some form of spatial analysis (Selhofer and Geser 2014, Baxter 2003). Hereto, most modern GIS frameworks provide a simplified frontend to several adapted statistical methods, thereby including location and predictive modelling, as well as a subset of the ‘complex statistics’ mentioned earlier.
Figure 3-1: Rough segmentation of an archaeological aerial photograph as determined by a DM algorithm (Kobylinski and Walczak 2006).
The acceptance of DM by archaeologists appears to follow a line similar to that of statistics, with the term “data mining” having been mentioned only sporadically in archaeologically-related literature. Nevertheless, certain topics related to DM appear to be quite well represented, especially those involving some form of classification. Amongst these, the often-encountered artefacts are coins, glass, and ceramics, which are classified based on the similarity between their visual characteristics (van der Maaten, et al. 2006, Huber, et al. 2005, Nolle, et al. 2003, Karasik, et al. 2004). An example of a more specialized study involving classification is that of Bi, et al. (2008), who created a method to spatially classify and partition archaeological settlements based on the discovery of nearby hearths, pits, urn tombs, and pit tombs. Another specialized example is the research conducted by Linderholm & Geladi (2012), who developed an approach to classify archaeological soil and sediment samples based on infrared readings of those samples. As yet another example, consider Di Ludovico and Pieri (2011), who explored various means to classify entries within large corpora of decorations on Mesopotamian cylinder seals. As a final example, consider the research done by Kobylinski & Walczak (2006), who developed a method to automatically determine potentially interesting features in archaeological aerial photos (Figure 3-1).
[16] SPSS, see www.ibm.com/software/nl/analytics/spss
[17] The R-project, see www.r-project.org
3.2 Knowledge Discovery and Data Mining
Until now, we have used the term “data mining” to refer to the whole process of discovering useful patterns from any form of data. For simplicity's sake, we will continue to do so. Strictly speaking, however, the term merely denotes the act of running an algorithm on a data set. This is only one stage in a larger knowledge discovery process, generally known as Knowledge Discovery and Data Mining (KDD). During the course of ARIADNE, such a process will be undertaken by the knowledge engineers involved with WP 16. In fact, the research being conducted for this report already constitutes a partial KDD process.
Figure 3-2: Schematic depiction of a generic KDD process.
A generic KDD process (Figure 3-2) typically consists of six different stages (Kurgan and Musilek 2006): Domain Understanding, Data Understanding, Data Preparation, Data Mining, Evaluation, and Knowledge Consolidation. For each of these, a general description will be given.
Domain Understanding concerns familiarizing oneself with the domain at hand (Fayyad, Piatetsky-Shapiro and Smyth 1996, Kantardzic 2011, Kurgan and Musilek 2006, Maimon and Rokach 2005). This entails a sufficient comprehension of the current state of affairs, of the problems therein, and of the goals that have to be reached. Furthermore, key figures and their terminology should be identified. Within ARIADNE, this boils down to an understanding of the archaeological domain, as well as of the archaeologists themselves.
Data Understanding concerns analysing the data, thereby inspecting its quality (Witten, Frank and Hall 2011, Fayyad, Piatetsky-Shapiro and Smyth 1996, Kantardzic 2011, Kurgan and Musilek 2006). This involves the identification of anomalies, such as noise, outliers, and missing values, as well as the selection of interesting subsets of features. In the case of ARIADNE, this involves an understanding of LAD and its anomalies.
Data Preparation concerns the transformation of the data so as to make it suitable for DM (Witten, Frank and Hall 2011, Fayyad, Piatetsky-Shapiro and Smyth 1996, Kantardzic 2011, Kurgan and Musilek 2006, Maimon and Rokach 2005). This entails resolving the problems in data quality that were discovered previously, as well as scaling and normalizing values if needed. Furthermore, a final selection of interesting features is made.
Data Mining concerns applying a suitable inductive-reasoning method (Appendix A) to the prepared data set, resulting in the automated discovery of potentially relevant patterns (Witten, Frank and Hall 2011, Fayyad, Piatetsky-Shapiro and Smyth 1996, Kantardzic 2011, Kurgan and Musilek 2006, Maimon and Rokach 2005). These patterns are described in a mathematical model that approximates the data.
Evaluation is the phase during which the previously generated knowledge is interpreted and inspected for its usefulness (Witten, Frank and Hall 2011, Fayyad, Piatetsky-Shapiro and Smyth 1996, Kantardzic 2011, Kurgan and Musilek 2006, Maimon and Rokach 2005). This typically involves a visualization of the corresponding patterns. In the case of ARIADNE, this will involve an iterative review process attended by both knowledge engineers and archaeological researchers.
Knowledge Consolidation concerns the presentation of the new knowledge in a user-oriented fashion, followed by the possible incorporation of that knowledge into a final system (Witten, Frank and Hall 2011, Fayyad, Piatetsky-Shapiro and Smyth 1996, Kantardzic 2011, Kurgan and Musilek 2006, Maimon and Rokach 2005). Within ARIADNE, this would correspond to either showing the DM results to the user or adding these results to the corresponding triple store.
The generic sequence of stages as outlined above is only one of many proposed KDD models (Kurgan and Musilek 2006). Due to its domain-independent properties, we believe it to be a suitable model to follow during our research within this WP. In fact, the steps of Domain and Data Understanding will be covered largely during the course of this report.
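As a rough illustration of the Data Understanding and Data Preparation stages, the following minimal Python sketch counts the missing values of one feature and then imputes and rescales it. The record layout, the feature name depth_cm, and the choice of mean imputation with min-max scaling are assumptions made purely for illustration; neither the generic KDD model nor ARIADNE prescribes them.

```python
from statistics import mean


def understand(records, feature):
    """Data Understanding: report how many values of a feature are missing."""
    values = [r.get(feature) for r in records]
    return {"total": len(values), "missing": sum(v is None for v in values)}


def prepare(records, feature):
    """Data Preparation: fill gaps with the mean, then scale to [0, 1]."""
    known = [r[feature] for r in records if r.get(feature) is not None]
    fill = mean(known)
    filled = [fill if r.get(feature) is None else r[feature] for r in records]
    lo, hi = min(filled), max(filled)
    span = hi - lo
    # Guard against a constant feature, which would divide by zero.
    return [0.0 if span == 0 else (v - lo) / span for v in filled]


# Hypothetical find records; one depth measurement is missing.
finds = [{"depth_cm": 10.0}, {"depth_cm": None},
         {"depth_cm": 30.0}, {"depth_cm": 20.0}]
report = understand(finds, "depth_cm")
scaled = prepare(finds, "depth_cm")
```

In a real project the imputation strategy would follow from the Domain Understanding stage rather than being fixed in advance.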
3.3 Data Mining Tasks
The domain-understanding stage of a KDD process typically provides the knowledge engineer with insight into the desired goals of that process. These goals form the main criteria when deciding on which DM task to implement. While many variants exist, these tasks generally fall into one of the following higher-level tasks (Fayyad, Piatetsky-Shapiro and Smyth 1996, Kantardzic 2011, Lavrac and Dzeroski 2001): Classification, Regression, Clustering, Summarization, Change and Deviation Detection, and Dependency Modelling. For each of these, a general description will be given.
Classification focusses on learning a predictive model that is capable of correctly assigning new instances of unknown class to one of several predefined classes (Fayyad, Piatetsky-Shapiro and Smyth 1996, Witten, Frank and Hall 2011, Kantardzic 2011, Lavrac and Dzeroski 2001, Berendt, et al. 2004, Hagood 2012). Typically, these classes represent related categories within a certain domain. For example, red and green denote classes within a finite set of colours.
Regression analysis involves learning a model that can predict unknown attribute values based on known values of the other attributes belonging to the corresponding instances (Fayyad, Piatetsky-Shapiro and Smyth 1996, Witten, Frank and Hall 2011, Kantardzic 2011). Here, both known and unknown values should have a numerical internal representation. In the case of binary or categorical values, e.g. labels, specific numerical ranges are used. For instance, consider predicting either true or false by letting a positive and a negative value denote the former and the latter, respectively.
Cluster analysis tries to describe a finite set of groups or clusters composed of instances with similar attribute values (Fayyad, Piatetsky-Shapiro and Smyth 1996, Witten, Frank and Hall 2011, Kantardzic 2011, Lavrac and Dzeroski 2001, Berendt, et al. 2004, Hagood 2012). These clusters are determined without prior knowledge on the underlying structure of the data, and may be regarded as nameless classes. Hence, cluster analysis can be seen as a variant of classification.
Summarization concerns the methods that are capable of compressing data into more compact forms without losing too much of the original knowledge (Fayyad, Piatetsky-Shapiro and Smyth 1996, Kantardzic 2011, Lavrac and Dzeroski 2001). This problem can be tackled through two distinct paths: either through extraction or through abstraction (Mani and Maybury 1999). Here, the former entails the automatic extraction of existing fragments of the data that are deemed relevant, while the latter generates new data that describes these relevant aspects in a concise way. The alert reader might recognize the similarities between the method of abstraction and the technique behind VSM.
Change and Deviation Detection aims at discovering significant changes or deviations, e.g. outliers, from previously measured or normative values, respectively (Fayyad, Piatetsky-Shapiro and Smyth 1996, Kantardzic 2011, Lavrac and Dzeroski 2001). In both cases, the key is to determine whether the probability of such an anomaly occurring is too low for it to be expected by chance. For instance, given an average human height of 1.70 meters, the arrival of someone with a height of 2.50 meters would almost certainly stand out.
Dependency Modelling consists of learning one or more models that are capable of describing significant dependencies between the different variables found in the data (Fayyad, Piatetsky-Shapiro and Smyth 1996, Kantardzic 2011). Typically, these models focus on a particular subset of the data, thereby describing different dependencies, and thus do not cover the entire data set.
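The height example given for Change and Deviation Detection can be sketched with a simple z-score test. This is only one of many possible techniques; the reference heights, the function name is_deviation, and the threshold of three standard deviations are illustrative assumptions rather than values taken from the cited literature.

```python
from statistics import mean, stdev


def is_deviation(value, reference, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations
    away from the mean of previously measured reference values."""
    mu = mean(reference)
    sigma = stdev(reference)
    if sigma == 0:
        # All reference values are identical: any other value deviates.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical reference measurements (meters) with a mean of about 1.70.
heights = [1.62, 1.68, 1.70, 1.71, 1.75, 1.74, 1.66, 1.74]

tall_visitor = is_deviation(2.50, heights)
ordinary_visitor = is_deviation(1.75, heights)
```

Note that the reference sample is kept separate from the value under test, so that a single extreme value cannot inflate the standard deviation and mask itself.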
None of these six high-level tasks make assumptions on the domain at hand, nor do any of the algorithms used to perform these tasks. They do, however, make assumptions on the data to which they
conceptual hierarchical-clustering algorithm, each of the clusters may be provided with a human-interpretable label (Fisher 1987, Fanizzi, d’Amato and Esposito 2008).
Figure 4-5: Example of generating a taxonomy (right) from a hierarchically-clustered data set (left). Here, assume this is a dataset about Greek horae that is clustered based first on civilization, second on shape, and third on location. Note that the labels would generally be more descriptive than the ones used here.
Thus far, only the generation of a taxonomy from new data has been considered. The same method, however, may also be applied to existing data sets that already adhere to a certain ontology. In those cases, an alternative taxonomy may be offered that is based on similarities between the entities instead of on a predefined taxonomy made by domain experts. For instance, instead of finding Nero listed under Roman Emperor, he might alternatively be found in the cluster containing resources concerning Roman Cities and Disaster. Because this alternative taxonomy is based on the data itself, it will be less likely to possess a user-induced bias with respect to the hierarchy. However, the occurrence of erroneous data and variation in the data’s quality might limit this effect.
Instead of clustering a whole data set, the clustering could alternatively be limited to a specific (combination of) subgraph(s). As such, this approach may be applied to the results returned by a query. An advantage of this is that the clusters and their hierarchy are based on the local neighbourhood of one's query, thus offering only information on relevant data. In addition, each action that moves, narrows, or broadens the scope of the search will trigger a recalculation of the clusters, thus, once again, providing information only on the most-relevant data. Furthermore, by clustering similar results, the user is presented with a less-exhaustive list, thereby allowing for a more user-friendly and intuitive browsing climate.
4.2 Applicable Solutions
The majority of the developments in SWM are still fairly academic. Even so, there have also been a fair number of projects that have resulted in a more-refined and usable product. Some of these might even be suited for ARIADNE, albeit with a number of adjustments. Therefore, a cross section of the existing solutions will be discussed next. Note that, for readability, most technical details will be omitted.
4.2.1 SPARQL extensions
Recall that SPARQL is the recommended (query-based) interface for a triple store. As a result, nearly all triple stores support this standard. Hence, the integration of DM with SPARQL might provide a clean and natural solution. Two different types of extensions that implement such capabilities will be discussed next.
4.2.1.1 Assisted Query Forming
While SPARQL has long been the recommended standard for querying a triple store, its complexity still forms a barrier for many users. A large part of this complexity originates from the heterogeneity of the data, which may be mapped to very different ontologies. It is unlikely that the average user is familiar with all of these variations, thus resulting in an inability to construct queries that would optimally exploit the data. Therefore, several projects have focussed on extending SPARQL with the ability to assist in the forming of queries (Figure 4-6). Two different approaches will be briefly discussed next.
SPACE is a query-driven autocompletion extension to SPARQL (Kramer, Dividino and Gröner 2013). At its heart lies an index built from past queries fired at endpoints. The rationale for this approach stems from the idea that the query logs of a specific endpoint provide a good representation of the data to which that endpoint provides access. Therefore, while a query is being written, SPACE compares the query in progress to those in its index for the corresponding endpoint and subsequently suggests the most similar past query.
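The idea of suggesting the most similar past query can be illustrated with a minimal Python sketch. It is not the SPACE implementation: the similarity measure (Jaccard overlap of whitespace-separated tokens), the function names, and the example query log are all assumptions chosen to keep the example short.

```python
def tokens(query):
    """Split a query into a set of lower-cased, whitespace-separated tokens."""
    return set(query.lower().split())


def suggest(partial, query_log):
    """Return the logged query with the highest Jaccard similarity
    to the partially written query, or None for an empty log."""
    part = tokens(partial)

    def jaccard(past):
        other = tokens(past)
        union = part | other
        return len(part & other) / len(union) if union else 0.0

    return max(query_log, key=jaccard, default=None)


# Hypothetical query log of a single endpoint.
log = [
    "SELECT ?site WHERE { ?site a :Settlement }",
    "SELECT ?find WHERE { ?find :foundAt ?site }",
    "SELECT ?coin WHERE { ?coin a :Coin ; :foundAt ?site }",
]
best = suggest("SELECT ?coin WHERE { ?coin a :Coin", log)
```

A production system would presumably compare query structure rather than raw tokens, but the principle of ranking past queries by similarity to the current input is the same.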
An alternative to a query-driven approach is that of a data-driven approach (Gombos and Kiss 2014, Campinas, et al. 2012). That is, instead of predicting a query based on previous queries, the possible queries are restricted to what the data can offer. One such method consists of generating a graph summary of the RDF graphs. Once built, it represents a generalization of the original graph from which often-paired RDF elements can be queried. For instance, it might link the predicate written by to the class author. Consequently, if the latest-written term of a query consists of the predicate written by, a suggestion of the class author, e.g. D. Wheatley, might be given.
Figure 4-6: Implemented example of a SPARQL assisted query formulation (Campinas, et al. 2012).
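The data-driven approach can be sketched as a very small graph summary that records, for every predicate, which classes its objects belong to. The triples, the type mapping, and the function names below are hypothetical and serve only to illustrate the written by / author example; real graph summaries, such as the one by Campinas, et al. (2012), are considerably richer.

```python
from collections import Counter, defaultdict


def summarise(triples, types):
    """Build a summary mapping each predicate to a count of object classes."""
    summary = defaultdict(Counter)
    for _subject, predicate, obj in triples:
        cls = types.get(obj)
        if cls is not None:
            summary[predicate][cls] += 1
    return summary


def suggest_class(summary, predicate):
    """Suggest the class most often paired with a predicate, if any."""
    counts = summary.get(predicate)
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical data: two reports with authors, one report describing a site.
triples = [
    ("report1", "writtenBy", "wheatley"),
    ("report2", "writtenBy", "author2"),
    ("report1", "describes", "site7"),
]
types = {"wheatley": "Author", "author2": "Author", "site7": "Site"}

summary = summarise(triples, types)
suggestion = suggest_class(summary, "writtenBy")
```

Because the summary is derived from the data alone, suggestions remain available even for endpoints without any query history.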
4.2.1.2 SPARQL-ML
With SPARQL-ML, a wide range of prediction and classification methods are added to the SPARQL interface (Kiefer, Bernstein and Locher 2008, Locher 2007). These methods stem from SRL and have been modified to work directly on graph data. In addition, the methods can be accessed through statements that follow the SPARQL grammar and which are similar to those used by Microsoft’s Data-Mining Extension [18].
Under the hood, SPARQL-ML requires that the data is stored in the MonetDB [19] database, which supports both relational and graph data. In addition, it needs the Weka [20] and Proximity [21] DM APIs, through which all DM operations are processed. These software packages are run on the server side, i.e. on the server that provides a SPARQL-ML interface. Hence, it spares the users the burden of installing additional tools. Furthermore, all normal SPARQL operations remain unaffected.
Figure 4-7: An example query in SPARQL-ML to learn a predictive model.
[18] Microsoft’s Data-Mining Extension, see msdn.microsoft.com/en-us/library/ms132058.aspx
[19] MonetDB, see www.monetdb.org
[20] Weka is an open-source data mining tool for propositional data, see www.cs.waikato.ac.nz/ml/weka/
[21] Proximity is an open-source data mining tool for relational data, see kdl.cs.umass.edu/display/public/Proximity
4.2.2 Ranking Methods
Traditional ranking methods, such as Google’s PageRank [22], generally apply a form of an authority-ranking algorithm. These algorithms order pages on the web based on how often they are referred to from authoritative sites. Here, authoritative sites are defined as important and trusted hubs on the web, e.g. due to their influence within the online community.
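To make the notion of authority ranking concrete, the sketch below implements the textbook power-iteration form of PageRank on a tiny link graph. The damping factor of 0.85, the fixed number of iterations, and the example graph are assumptions for illustration; this is a generic formulation and not a description of any production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores by power iteration.

    `links` maps each page to the list of pages it links to.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Distribute this page's rank evenly over its outgoing links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                share = damping * rank[node] / n
                for target in nodes:
                    new[target] += share
        rank = new
    return rank


# Hypothetical web: three pages refer to "hub", which refers back to "a".
links = {"a": ["hub"], "b": ["hub"], "c": ["hub", "a"], "hub": ["a"]}
rank = pagerank(links)
top = max(rank, key=rank.get)
```

Note that every link is treated identically here, which is exactly the limitation with respect to semantic relationships discussed next.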
While traditional ranking methods work quite well on the regular web, they do not contain a mechanism for handling and exploiting semantic relationships. Hence, these are ill suited in the case of the SW. Fortunately, a handful of ranking schemes exist that specifically target the SW. Five of those will briefly be discussed next, thereby omitting those less relevant to ARIADNE.
4.2.2.1 TripleRank
TripleRank is an authority-ranking method that takes the semantics of LD into account (Franz, et al. 2009). This is accomplished by representing the graph as a third-order tensor (C.1.2), which is capable of exploiting these semantics in a natural fashion. By applying a specific factorization method, authoritative sources can be determined. By subsequently calculating the contributions of these sources to a group of triples, an ordering can be found. In addition, groups of semantically-similar predicates and resources may be identified.
4.2.2.2 ReConRank
The ReConRank method is a fusion of two ranking algorithms: ResourceRank and ContextRank (Hogan, Harth and Decker 2006). The first of these has been adapted from PageRank and thus applies a form of authority ranking. This is accomplished by iteratively going through the graph whilst ignoring the semantics of the connecting links. ContextRank, on the other hand, takes the context graph into account. This graph consists of context-specific resources and predicates which are trusted to be valid. Finally, both ranking algorithms are combined to compute the ReConRank order of relevance.
4.2.2.3 xhRank
xhRank is a ranking approach that endeavours to implement multiple different metrics into one single package in the hope of achieving the best of several worlds (He and Baker 2011). Hereto, it calculates the ranking based on relevance, on importance, and on query length. Of these, the relevance is determined based on the context graph of a query, as well as on the contextual similarity between the phrases and terms within that query and those contained in the RDF graph. The importance is computed by considering authority nodes, as well as the popularity of all relevant resources. Finally, the query length is calculated by evaluating a (weighted) context graph with respect to the input query. Once all metrics have resulted in a (raw) rank, these are combined to form the overall rank.
[22] PageRank, see www.google.com/about/company/products/
4.2.2.4 SemRank
The SemRank relevance model constitutes a fusion of semantic and information theoretic methods, as well as heuristics (E e  and Domeniconi 2014, Anyanwu, Maduko and Sheth 2005). Together, these techniques result in a unified model with which all types of complex semantic relations, known as Semantic Associations (SA) (Figure 4-8), can be ranked by relevance. Instead of applying popular relevance measures, such as shortest path or least-frequently occurring path, SemRank calculates the Information Gain (IG) per relation. This metric conveys how much information a user would gain when presented with the SA to which the IG belongs. However, as the developers acknowledge that different domains require different measures, they provide the option of easily switching to another relevance metric.
Figure 4-8: Three types of Semantic Associations (Anyanwu, Maduko and Sheth 2005).
4.2.2.5 Vector Space Models
Vector Space Models (VSM) constitute an approach by which (unstructured) data can easily be searched through to find potentially relevant answers that fit a descriptive query (Appendix B). Such an approach can be added on top of regular SW queries, thereby providing more versatile and more scalable searching abilities. Moreover, as VSMs represent the potential relevance on a continuous scale, these values may be used to rank the corresponding results as well.
Several studies have focused on incorporating VSM into the SW (Freitas, et al. 2012, Mendes, et al. 2011, Castells, Fernandez and Vallet 2007, Tous and Delgado 2006). Hereto, they indexed resources by their vectors. That is, each non-zero term in a vector denoted a predicate-resource pair belonging to that resource. In addition, each of these pairs was weighted to reflect how well it represented its corresponding resource. For instance, given an item in an archaeological data set, the predicate-resource pair instance of Dragendorff 33 would be far better at describing the item than the pair is an artefact.
The implementation of VSM into the SW can be approached in two different ways. Either a keyword-based query is used to generate a SPARQL query, or a SPARQL query is used from which keywords are extracted. Either way, both query and keywords are, at some point, available for further processing. This process continues with the execution of the query by the query engine, after which the results are ranked based on their similarity to the keywords (Figure 4-9).
Figure 4-9: Workflow as to how VSM may aid in ranking query results (Castells, Fernandez and Vallet 2007).
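The ranking step of such a workflow can be sketched with sparse vectors over predicate-resource pairs and cosine similarity. The weights, the resource names, and the use of cosine similarity as the ranking measure are assumptions for illustration; the cited studies each define their own weighting schemes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_results(query, index):
    """Order resource names by decreasing similarity to the query vector."""
    scored = [(cosine(query, vector), name) for name, vector in index.items()]
    return [name for _score, name in sorted(scored, reverse=True)]


# Hypothetical index: each resource is described by weighted
# (predicate, resource) pairs; the specific pair outweighs the generic one.
index = {
    "cup_12": {("instanceOf", "Dragendorff 33"): 0.9, ("isA", "artefact"): 0.1},
    "coin_3": {("instanceOf", "denarius"): 0.9, ("isA", "artefact"): 0.1},
}
query = {("instanceOf", "Dragendorff 33"): 1.0}
ranked = rank_results(query, index)
```

Because the similarity is a continuous value, the same score that decides whether a result is relevant also determines its position in the result list.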
4.2.3 Frameworks
Several frameworks exist that focus on adding DM capabilities to graph data or data composed of formal ontologies. Of these, four relevant examples will be discussed next.
4.2.3.1 SUNS
Statistical Unit Node Set (SUNS) is a ML plugin for the Large Knowledge Collider [23] (LarKC), a large-scale integration project aimed at developing a platform for massive distributed incomplete reasoning on the SW. It has since been ported to work on relational data as well (Huang, Tresp and Kriegel, et al. 2009).
[23] Large Knowledge Collider; a FP7 project, see www.larkc.eu
The SUNS framework (Huang, Tresp and Bundschus, et al. 2011, Huang and Tresp 2010) centres on the concepts of statistical unit and population, which it defines as an instance of a certain class and all instances under consideration, respectively. In addition, it defines each potential triple as a binary triple node of which the value is true if the triple is known to exist and false if the triple is known not to exist. Moreover, the entirety of triple nodes that belong to a statistical unit is defined as the statistical unit node set. For instance, an arbitrary artefact of the class Dragendorff 33 would be a statistical unit of that class. Each triple that states a fact about that artefact, e.g. that it has the colour red, is a positive triple node.
At its core, SUNS applies a multivariate [24] prediction algorithm, which is advocated by SUNS’ developers as providing an improved predictive performance when compared to traditional algorithms. Regardless, it constitutes a propositional approach, thus requiring the graph data to be translated into a relational matrix. The unknown triples are subsequently predicted by factorization (C.1.1), with the option of integrating this new knowledge into the triple store.
Figure 4-10: A schematic overview of the SUNS framework (Huang and Tresp 2012).
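The translation of graph data into a relational matrix can be illustrated as follows. Each row is a statistical unit and each column a (predicate, object) pair, so that a cell corresponds to one triple node. The triples and the function name are hypothetical, and the sketch deliberately stops before the factorization step; here a 0 merely marks a triple that is not asserted in the data, which is the kind of entry a prediction step would subsequently score.

```python
def triple_node_matrix(triples, units):
    """Build a binary unit-by-triple-node matrix from a list of triples.

    Returns the column labels and one row per statistical unit.
    """
    known = set(triples)
    # One column per (predicate, object) pair observed for any unit.
    columns = sorted({(p, o) for s, p, o in triples if s in units})
    matrix = [
        [1 if (unit, p, o) in known else 0 for (p, o) in columns]
        for unit in units
    ]
    return columns, matrix


# Hypothetical artefacts acting as statistical units.
triples = [
    ("cup_1", "hasColour", "red"),
    ("cup_1", "foundAt", "site_7"),
    ("cup_2", "hasColour", "red"),
]
units = ["cup_1", "cup_2"]
columns, matrix = triple_node_matrix(triples, units)
```

In this example the empty cell for cup_2 and (foundAt, site_7) is a candidate triple whose likelihood a factorization method could estimate.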
4.2.3.2 LiDDM
The Linked-Data Data Miner (LiDDM) aims at providing a framework that offers a LD-specific alternative to the more-regular KDD schema (Figure 4-11) (Narasimha, et al. 2011, Ramezani, Saraee and Nematbakhsh 2013). Moreover, it emphasizes the notion of simplicity, with the developers advocating the use of regular SPARQL queries in order to prevent alienating the user with a steep learning curve.
[24] Multivariate methods are a fusion of supervised and unsupervised methods, which use known input features to predict several variables jointly, thereby generally increasing their predictive strength.
The first step in the LiDDM framework is the import of data, formulated as a SPARQL query. By going through several pre-processing steps, the data is then, among other things, translated to a propositional format. Once completed, propositional DM methods can be applied as provided by an external DM processing engine. The results of this can subsequently be visualized.
In order to test their framework, the developers created the LiDDM Tool (LiDDMT). To facilitate the DM methods, they made use of the Weka API. Based on the results gained, they conclude that the strength of the proposed framework lies in mining several aggregated data sets simultaneously, thereby offering flexibility with respect to the data format. In addition, the developers hope to automate many of its features in the near future.
Figure 4-11: Schematic depiction of the LiDDM architecture (Narasimha, et al. 2011).
4.2.3.3 OLAP
In the field of Business Intelligence, Online Analytical Processing (OLAP) involves a framework for the analysis of multidimensional relational data (Codd, Codd and Salley 1993). Typically, this concerns an interactive DM process from which the results may lead to business and financial reports.
The core concept of any OLAP application is the OLAP Cube (Figure 4-12). Simply put, an OLAP Cube represents a generalization of tabular data, thereby placing certain aspects of a multidimensional data set on the axes of the cube. When requiring more than the three dimensions that a cube provides, it is customary to speak of an OLAP Hypercube. New insights may next be obtained by interactively selecting and analysing slices of this (hyper)cube. For instance, data on artefacts, location of their discovery, and their carbon dating might be brought together in a three-dimensional OLAP Cube to analyse the possible relations between them.
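A three-dimensional cube and a slice operation can be sketched in a few lines of Python. The axes (artefact type, location, period), the fact records, and the function names are illustrative assumptions; an actual OLAP engine would offer many more operations, such as roll-up and drill-down.

```python
from collections import defaultdict


def build_cube(facts):
    """Count facts per (artefact, location, period) cell of the cube."""
    cube = defaultdict(int)
    for artefact, location, period in facts:
        cube[(artefact, location, period)] += 1
    return cube


def slice_cube(cube, axis, value):
    """Fix one axis to a value and return the remaining two-dimensional table."""
    table = defaultdict(int)
    for key, count in cube.items():
        if key[axis] == value:
            rest = key[:axis] + key[axis + 1:]
            table[rest] += count
    return dict(table)


# Hypothetical finds: (artefact type, location of discovery, dated period).
facts = [
    ("coin", "site_A", "Roman"),
    ("coin", "site_A", "Roman"),
    ("urn", "site_A", "Iron Age"),
    ("coin", "site_B", "Roman"),
]
cube = build_cube(facts)
roman = slice_cube(cube, axis=2, value="Roman")
```

The resulting slice is an ordinary table of artefact type against location, which is the form in which relations between the dimensions would typically be inspected.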
Analysing data on the SW by OLAP has slowly been gaining momentum. However, the current focus leans more towards the translation of relational data to an OLAP representation on the SW. To accomplish this, the W3C recommends the use of the RDF Data Cube ontology (QB) (Cyganiak and Reynolds 2014). Others, however, deem this ontology too limited (Etcheverry and Vaisman 2012, Ibragimov, et al. 2014), and have extended it with QB4OLAP to enable all analytical abilities of OLAP. As a result, relational data published on the SW using the QB ontology can be analysed with OLAP techniques by using QB4OLAP. A more native alternative is to use the Open Cube vocabulary (Etcheverry and Vaisman 2012), which combines QB and QB4OLAP into a single ontology. Moreover, it allows for performing OLAP operations via SPARQL queries.
Figure 4-12: Schematic representation of an OLAP workflow.
4.2.3.4 AITION
AITION is an interactive DM solution for the biomedical domain (Dimitropoulos, et al. 2012, Metaxas, et al. 2014). Developed by the University of Athens [25] for the FP6 Health-e-Child [26] project, it specifically aims at discovering knowledge in a medical processing environment. More specifically, it provides a full KDD solution by which (biomedical) researchers can pre-process, simulate, and visualize relational data, as well as construct statistical models and subsequently infer from them (Figure 4-13). Hereto, AITION offers a user-friendly graphical interface similar to those of statistical software packages. This interface allows one to tweak the KDD process to his or her needs, amongst which are the selection of algorithm and the optional specification of prior knowledge and medical ontologies.
[25] University of Athens, see www.uoa.gr
[26] Health-e-Child, see www.health-e-child.org
Originally, AITION was developed as a stand-alone desktop application. However, due to limitations in processing capabilities it was later extended to a server-oriented design. This allows it to run on distributed architectures such as clusters, grids, and clouds. These architectures, however, should provide access to a relational (big data) database for AITION to work properly. Hence, AITION expects the data to conform to the specifications of MRDM. That is, graph-based data such as LD is unsupported at this time.
Figure 4-13: Schematic depiction of the AITION framework (Dimitropoulos, et al. 2012).
4.2.4 Platforms
Whereas the previous solutions were either mostly theoretical or extensions to triple stores, the following two platforms constitute relatively complete products with a refined user interface. These will be touched on next.
4.2.4.1 Rapidminer
Rapidminer is a popular DM and Business Analytics platform that aims at providing the whole KDD process to business users. To this end, it combines a wide range of DM and DM-related techniques with an intuitive interface (Figure 4-14). Moreover, its developers try to stay at the front of technological innovation, thereby offering versions capable of running on high-performance and distributed architectures, as well as running directly from the cloud.
apparent lack of currently-available DM applications within the domain. That is, of the users that participated in the survey, less than a fifth rated their availability good (12%) or very good (5%). This state of affairs prevents users from gaining any practical experience, and thus impedes them in forming a realistic view of the added value that DM could offer.
In the case of data enhancement, only a few stakeholders of the sample group, all affiliated with data repositories, appear to hold the belief that DM might be beneficial. In fact, only one of them strongly believes that the integration of DM in repositories should be regarded as a very important aspect. Additional interviews (n = 3) were conducted by Hollander and Hoogerwerf (2014) as part of ARIADNE Deliverable 13.1, from which a desire to enhance metadata was concluded. This enhancement would entail, among other things, automated metadata extraction, duplication detection, link prediction, and overall semantic enrichment.
During the survey, the sample group appeared to hold mixed beliefs about whether DM is an important
aspect of their research (Selhofer and Geser 2014). More specifically, half of them deemed it either very
important (16%) or rather important (34%), while the remainder thought it to be rather unimportant
(26%) or very unimportant (23%). Even fewer of the survey’s participants had ever used a DM solution
during their research, with less than a tenth (8%) having used it very often, and slightly more (16%)
having applied it frequently. In fact, nearly half (41%) had (almost) never used DM at all. These numbers
may partially be inaccurate given that, as indicated earlier, many participants were unaware of the
possibilities that the integration of DM tools into their research may provide, and may, in fact, have used
such tools unwittingly. Examples of such tools are those integrated into a GIS, with which advanced
analysis of archaeological data is already possible (Conolly and Lake 2006).
According to the survey’s participants, the most important type of data they are using during their
research is, unsurprisingly, that of excavation data (Selhofer and Geser 2014). Of the other important
sources, many appear to involve a geographical component, such as GIS or satellite data. Therefore,
augmenting those types of data with the help of DM would likely offer the most effective improvement
during the knowledge-discovery stage as perceived by the user. However, it may also be true that
researchers are using the other types of data less often due to the poor transparency that currently
exists. Irrespective, Hollander and Hoogerwerf (2014) emphasize the need to present the user with,
among others, a geo-integrated search, thereby offering numerous options such as an interactive
timeline and various background layers.
Another important part of data transparency is the ability to easily distil relevant information from
exhausting quantities of potentially relevant information. One approach to achieve this might be to
apply some form of ranking to the data (Hollander and Hoogerwerf 2014), either automatically or
manually by the user. Alternatively, a combination of the two may be possible, whereby the users can
correct or otherwise influence the automatically generated rankings.
5.2 Wishes of Domain Experts
In an effort to gain additional insight into the needs of archaeological researchers we organized five
one-hour brainstorming sessions. Four of these were held at Leiden University and one at the VU University
Amsterdam. Each of these sessions consisted of an open discussion with a different archaeological
specialist, among which were junior and senior researchers as well as Ph.D. candidates. Furthermore,
their fields of expertise ranged from Prehistoric hunter gatherers and farming communities, to
long-term developments of settlements, land-use dynamics, and spatial dynamic modelling.
The participants were asked to write down several archaeological research scenarios, which they
deemed difficult with the current information infrastructure. Moreover, the assumption was made that
all archaeological publications and data would have been made available in one large database. In total,
the sessions resulted in 26 scenarios, of which an English translation is given in Appendix D.
An analysis of the scenarios posed revealed that the large majority (77%) of the difficulties that
researchers experience concern an information-gathering task. Hence, the aggregation of data is
desired. In several occasions, these data are likely to originate from multiple, distinct sources and stem
from different domains, amongst which are plant and dental records. Moreover, nearly half of these
information-gathering scenarios (40%) relate some archaeological entity to a geospatial component,
thereby making this latter domain the second most prominent. Finally, three scenarios mention
difficulty in finding relevant data using a keyword-based search engine.
The remaining research scenarios (23%) involve a fact-finding task, among which are the search for
contact information, specific GIS files, and the number of a physical storage container. Two scenarios
that stand out concern the trustworthiness of certain data, which is dependent on the corresponding
archaeological context.
Information-Seeking Task    Number of Scenarios
Fact Finding                6
Information Gathering       20
Keeping Up-to-date          0
Table 5-1: Dissemination of the archaeological research scenarios into the three distinct
Information-Seeking tasks.
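The percentages quoted in the text can be checked against the counts in Table 5-1. The following is a minimal sketch of that arithmetic; the names `scenario_counts` and `shares` are illustrative and not part of the report.

```python
# Scenario counts as reported in Table 5-1.
scenario_counts = {
    "Fact Finding": 6,
    "Information Gathering": 20,
    "Keeping Up-to-date": 0,
}


def shares(counts):
    """Return each task's share of all scenarios as a whole-number percentage."""
    total = sum(counts.values())
    return {task: round(100 * n / total) for task, n in counts.items()}


percentages = shares(scenario_counts)
```

With 26 scenarios in total, 20 information-gathering scenarios round to 77% and 6 fact-finding scenarios to 23%, which matches the figures given in the text.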
When assuming an ideal situation in which all required data has been made available as LD, most, if not
all, of these information-gathering tasks would be accomplishable without much trouble. In contrast,
none of the fact-finding tasks would require LD; for these, a simple relational database would suffice.
These results are similar to those of the survey and the interviews. That is, the abilities that are provided
by LD are likely to greatly improve the knowledge-discovery process of an archaeological researcher.
Also similar, unfortunately, is the difficulty of distilling tasks to which DM might prove beneficial.
5.3 Summary
Based on the previously-discussed survey and interviews, the following points can be distilled which are
deemed relevant with respect to DM. Please note, however, that a solution to the last point would more
likely be considered an issue treatable by ML rather than DM.
Unawareness of the available data reflects the lack of knowledge users possess on what data is
available to them. This hinders them during the early stages of their research, as they fail to
explore data outside the scope of their current search.
Uncertainty on how to locate relevant data concerns the difficulty of users to get an
understanding of the actions required to find and access the data they are looking for. Instead,
they run into ambiguity issues with the terms they use to search.
Inability to effectively distil relevant data involves the issues that users experience when faced
with an exhaustive list of potentially relevant data. To determine what data is relevant and what is
not, they need to thoroughly examine each entry in the list.
Incompleteness of Data concerns the perceived gaps in information, due to having been omitted
during either the study or the subsequent digitization. As researchers cannot ascertain whether
the missing data is of importance, this hinders them from trusting the data set as a whole.
6 Data Understanding
Data Understanding is an important step within any KDD process and concerns inspecting the data, their
quality, and their abnormalities. Obtaining a good overview of these aspects contributes to the performance
of any future DM applications within the ARIADNE infrastructure. After completion of this project at the
end of February 2017, it is hoped that Europe's archaeological communities will adopt it. Until that
moment, however, the available data will be comprised of what already exists, as well as small amounts
of new data which will be produced by those involved with Natural Language Processing. Therefore, the
conclusions at the end of this report will apply only to these data. It is expected, however, that these
data will constitute a sufficient representation of the future data, and thus allow for generalization, such
that the conclusions will hold.
We expect that the large majority of the data that ARIADNE will offer during its first couple of years will
be provided by the currently existing digitally-accessible data infrastructures. Only a few of those have
already explored the possibility of publishing their data as LD, and even fewer have embraced it entirely.
A good example of a repository that does embrace LD is the ADS, the Archaeological Data Service based
at the University of York, which has adopted all facets of the LD paradigm for a section of their
archive data, and for all their resource discovery metadata. An example of a repository that is well on its
way might be EASY [29], the digital warehouse on Digital Humanities hosted by DANS [30], which offers
unstructured data together with their corresponding LD counterpart. In some cases, however, data sets
consisting solely of LD are published [31].
6.1 Data Produced using Natural-Language Processing
In addition to the Linked Archaeological Data (LAD) from existing infrastructures, ARIADNE will provide
LAD that has been generated semi-automatically from unpublished archaeological reports by means of
Natural-Language Processing (NLP). These reports, the so-called grey literature, are increasingly ‘born
digital’. Those that are not are scanned to create a copy, which is usually made available in PDF format.
In either case, this may be followed by processing the text through specialised tools, to convert the
report to LD.
Within ARIADNE, the task of exploring the applicability of NLP falls under WP16 with task number 16.2.
The main contributors to this task are the University of South Wales, the ADS, and Leiden University. As
it may be assumed that other NLP endeavours will proceed in a similar fashion, there will thus be no
difference between data from either origin with respect to accessing and retrieving it. That is, both will
be stored on the same (or similar) triple store using the same (or similar) ontologies.
[29] Electronic Archiving System, see easy.dans.knaw.nl/
[30] Data Archiving and Networked Services, see dans.knaw.nl/en
[31] For example, consider the recent CLARIN Dutch Ships and Sailors data set
The main challenge with which the field of NLP struggles is very similar to that which the Semantic Web
tries to solve, namely the problem of making knowledge interpretable by software agents (Briscoe
1991). However, the Semantic Web tries to attack this problem by structuring the knowledge, whereas
the field of NLP generally tries to create semantically-aware agents. Unfortunately, due to its
complexity, this latter approach is still far from perfect. Therefore, any knowledge engineer should be
wary of possible flaws in the data converted by NLP, which were not detected during manual or
(semi-)automatic checks.
6.2 Case Study on Data Repositories
Of the existing LAD infrastructures (Fentress 2014), four have been chosen from which the RDF data was
deemed a suitable representative of their specific area of expertise. In addition, this selection favoured
minimal overlap between the corresponding data sets, so as to provide as high a degree of variability as
possible. These four are the ADS, EASY, Open Context, and Pleiades.
6.2.1 ADS
Recall that the Archaeological Data Service (ADS) provides a web portal to a triple store. At the moment
of writing, it holds the aggregated data of nearly 500 distinct collections. Together, these collections
provide 472,172 triples on 103,148 resources [32].
The data appears to be structured in three sections, each with a different purpose. The smallest of
these, comprising less than a few percent of the whole graph, describes several types of amphorae by
the distinct physical features that make up their shape. The corresponding statements consist largely
of SKOS statements.
With a slightly higher percentage follow numerous descriptions of the sources and studies from which
the data originated, all of which mainly make use of Dublin Core (DC). It should be noted, however, that
most of the DC triples have literals as their object. Furthermore, it appears as if the range of certain DC
attributes, e.g. dc:coverage, covers disjointed sets such as the disjoint union of temporal period and
geographical location.
The large majority of the graph describes the artefacts provided by the sources and studies. The
corresponding statements are specified using the CIDOC CRM-EH ontology, of which the properties start
with a unique identifier. In addition, descriptions of the same resource frequently occur more than once,
with each occurrence specifying a different set of properties.
[32] These figures are based on the graphs retrieved in December 2014.
6.2.2 EASY
Recall that EASY concerns a digital repository on Digital Humanities hosted by DANS, which
experiments with offering LD alongside their corresponding unstructured counterpart, among which are
databases, images, and reports. On the whole, the data within EASY’s test triple store constitutes more
than 25,000 data sets which, together, result in roughly 28,000 resources consisting of 836,447 triples
in total [33].
Little variation between concept descriptions exists within the graph, with almost every one consisting of
a single data set with similar properties. These properties mostly concern metadata and are described
in Dublin Core [34] (DC) with literal values. Therefore, while the graph is relatively uniformly divided, it
contains little cross-linking.
While not serialised in RDF, EASY additionally provides each data set with more contextual information
in the form of an XML file: the “pakbon”, or packing slip in English. This file consists of numerous fixed
attributes, developed in cooperation with the SIKB [35], that are to be supplied by the submitter of the
data set. If converted to RDF, the information therein might prove quite beneficial to archaeological
researchers.
6.2.3 Open Context
Open Context [36] is a web portal that allows researchers to publish and access scientific data from various
domains, such as zooarchaeology and spatial archaeology, as well as numismatics. At the time of
writing, the portal offered access to data from 20 projects, with 34 others forthcoming [37]. Of the
corresponding data sets, only a fraction has been converted to RDF. Instead, most data resides in a
combination of JSON [38], KML [39], and ArchaeoML [40]. Here, the latter two are XML formats, which aim at
expressing geospatial and archaeological data, respectively. However, Open Context is currently working
on aligning its metadata to the CIDOC-CRM.
The currently available RDF data consists of nearly 5,000 triples which, together, describe roughly 1,250
resources [41]. Of these resources, the majority consist of coin, region, and site descriptions. Little variation
between concept descriptions exists within or even between these domains, which all apply a static
generic set of properties whereby only the values differ. These values often involve custom resources, as
well as those provided by Geonames [42].
[33] These figures are based on the graphs retrieved in December 2014.
[34] Dublin Core, see www.dublincore.org
[35] Stichting Infrastructuur Kwaliteitsborging Bodemonderzoek, see http://www.sikb.nl
[36] Open Context, see www.opencontext.org
[37] These figures are based on the graphs retrieved in December 2014.
[38] JavaScript Object Notation, see www.json.org
[39] Keyhole Markup Language (KML), see www.opengeospatial.org/standards/kml/
[40] Archaeo Markup Language (ArchaeoML), see www.opencontext.org/about/concepts/
[41] These figures are based on the graphs retrieved in December 2014.
6.2.4 Pleiades
Pleiades [43] is a web portal that provides historical geographical information about places from the ancient
world. To this end, it uses its own definition of place, which constitutes a geographical location with an
ancient name, which may vary over time. Currently, the database contains close to 3,500 places,
resulting in 2,258,807 triples [44]. Furthermore, the corresponding resources on authors, place types, and
time periods consist of an additional 5,000 triples.
Within the triple store each resource type has its own graph, with the place descriptions encompassing
nine graphs. These latter graphs constitute the majority of the data. While moderate variation in
concept descriptions exists between the type-specific graphs, there is little variation within them.
Particularly interesting is the choice to include an errata graph. The goal of this graph is to hold the
falsified statements from other graphs instead of correcting the errors directly. Therefore, both the
correct and the incorrect version are available.
6.3 Summary
Based on the case study of four different data sets, as well as on the differences between them, the
following notes can be distilled which might require special attention when designing a DM solution.
Differences in ontologies used exist between the data sets from different sources, whereas this
occurs less so within those from the same source. Additionally, the usage of different versions of
the same ontology can be observed, which may cause such an ontology to be either under- or
overrepresented if not dealt with accordingly. Moreover, attention should be given to the use of
custom ontologies. Note that, ideally, all data will be translated using a single ontology such as
CIDOC CRMarchaeo, CRM-EH, or the ACDM.
Structural variation within data sets from the same source appears relatively little, with most of a
dataset’s concept descriptions following roughly the same structural schema. More specifically,
nearly all descriptions in a data set use roughly the same set of properties, which are specified
using the same ontologies. The inverse appears true between the data sets from different
sources, which all have their own distinct schema structure.
[42] Geonames, see www.geonames.org
[43] Pleiades, see pleiades.stoa.org/
[44] These figures are based on the graphs retrieved in December 2014.
Structurally-flat graphs appear to be quite common. This phenomenon occurs due to the
scarceness of URIs per resource, either because a resource only has a small number of properties,
or because of an extensive use of literals. In the latter case, numerous occurrences of unnecessary
use have been observed. That is, literals were used to denote property values for which URIs were
available.
Strong Dependency on descriptive values tends to occur frequently throughout all data sets.
Often, these descriptions provide crucial information and thus cannot be ignored without a
significant loss of knowledge. Therefore, such values should be given additional thought during
the development of a DM solution.
Concurrent RDF statements were observed within one data set in which both the correct and the
corresponding falsified statements on the same concept were kept available. Due to this
construction, duplicate and possibly conflicting entries might surface. Hence, methods for conflict
resolution should be explored.
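One way to make the notion of a structurally-flat graph measurable is to compute, per resource, the share of property values that are literals rather than URIs. The following is a minimal sketch under stated assumptions: the toy triples, the `ex:` identifiers, the function names, and the 0.9 threshold are all hypothetical and do not come from the studied data sets.

```python
from collections import defaultdict

# Hypothetical toy triples: (subject, predicate, object, object_is_literal).
TRIPLES = [
    ("ex:find1", "dc:title", "Amphora fragment", True),
    ("ex:find1", "dc:coverage", "Roman period", True),
    ("ex:find1", "ex:foundAt", "ex:site7", False),
    ("ex:find2", "dc:title", "Coin", True),
    ("ex:find2", "dc:coverage", "Iron Age", True),
]


def literal_share(triples):
    """Fraction of objects per subject that are literals rather than URIs."""
    totals, literals = defaultdict(int), defaultdict(int)
    for subject, _predicate, _obj, is_literal in triples:
        totals[subject] += 1
        literals[subject] += int(is_literal)
    return {s: literals[s] / totals[s] for s in totals}


def flat_resources(triples, threshold=0.9):
    """Subjects whose literal share meets the (arbitrarily chosen) threshold."""
    shares = literal_share(triples)
    return sorted(s for s, share in shares.items() if share >= threshold)
```

A resource such as `ex:find2`, which is described by literals only, would be reported as flat, whereas `ex:find1` links to another resource and would not.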
7 Data Mining on Linked Archaeological Data
As discussed earlier, a typical KDD process starts off with a deep understanding of the domain and the
data. At this moment, expertise within the former is readily available within ARIADNE. Unfortunately,
the same cannot be said about the data aspect. This is understandable, as ARIADNE has reached half of
its four-year run. However, as the exploration of data is an important step in any KDD process, this
means it is difficult to predict what kind of new knowledge may be brought to light as a result of this
process. Therefore, this section will focus on the more-generic options expected to function properly on
any form of Linked Archaeological Data (LAD).
Based on both the Domain and Data Understanding step within our selected KDD process, several DM
solutions were selected which we deem suitable for a LAD framework such as ARIADNE will likely be. We
will next discuss these solutions in more depth.
7.1 Hypothesis Generation
DM methods are capable of detecting patterns in data. Interesting and potentially relevant subsets of
these patterns can then be presented to users as starting points for forming new research hypotheses.
For instance, the system might detect that specific types of pottery are most often found near coastal
areas. This might already be known to the researcher, or it might be something the researcher is
interested in exploring further. The interestingness of patterns will be derived algorithmically on the
basis of predefined criteria and user feedback. To facilitate this, any LAD repository should ideally offer a
KDD interface that provides its users with such capabilities. These capabilities would then be applied
directly to the data from one or more repositories.
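To illustrate how a pattern such as the pottery example could be scored, the sketch below computes the support and confidence of a simple association rule. This is only one possible interestingness criterion and is not prescribed by this report; the records, the attribute labels, and the function name `rule_stats` are hypothetical.

```python
def rule_stats(records, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    Each record is a set of attribute labels; antecedent and consequent
    are sets of labels as well.
    """
    n_total = len(records)
    n_antecedent = sum(1 for r in records if antecedent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical finds, each described by a pottery type and a setting.
FINDS = [
    {"type:amphora", "setting:coastal"},
    {"type:amphora", "setting:coastal"},
    {"type:amphora", "setting:inland"},
    {"type:beaker", "setting:inland"},
]

support, confidence = rule_stats(FINDS, {"type:amphora"}, {"setting:coastal"})
```

In this toy data, half of all finds are amphorae from coastal settings (support 0.5), and two out of three amphorae are coastal (confidence 2/3). Rules above user-chosen thresholds could then be offered as candidate hypotheses.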
The integration of KDD capabilities into a structured query interface such as SPARQL or similar would
provide a solution that largely satisfies the earlier-mentioned criteria. As a result, any query can easily
be extended with data-mining operations capable of generating potential hypotheses. Moreover, as
using these additional operations is purely optional, their presence would not hinder users who are
solely interested in regular queries. Furthermore, it might prove useful to assist with the formulation of
queries for those users who are unfamiliar with the syntax of a structured query language such as
SPARQL. Alternatively, the capabilities of several higher-level abstractions could be integrated into a
graphical UI, thus lowering their learning curve. Moreover, multiple hypotheses could then easily be
presented to users in non-intrusive ways to allow for quick scanning for potentially valuable directions.
7.2 Assisted Query Formulation
The complexity of accessing DM capabilities through a query language might pose a hurdle to interested
users, i.e. those who want to use these capabilities but who are unfamiliar with their syntax. To lower
this barrier, it might prove useful to assist users in formulating their queries. Instead of limiting this
assistance to solely the DM capabilities, it may be extended to aid with formulating regular queries as
well. This may manifest itself as either predicting or autocompleting the query. Alternatively, a
combination of the two may be applied.
Predicting a query involves learning from past queries. Based on a partially-written query, the remainder
is predicted by comparing the similarity between the written part and (parts of) past queries. A
distinction herein is whether a more local or more global view should be maintained. Here, a more local
view would concern queries from users who share a similar background or interest, as well as users who
have accessed much of the same data. This, however, would require user profiles. In contrast, a global
view would consider the queries of all users. A combination is possible as well, thereby favouring more
local queries over global ones.
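The combination of a local and a global view can be sketched as follows. This is a minimal illustration under stated assumptions, not a proposed implementation: similarity is taken to be the Jaccard overlap between the typed tokens and the opening tokens of each past query, and queries from "local" users are simply given a higher weight. The function names, the weight of 2.0, and the example queries are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two token collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def suggest(partial, past_queries, local_users, local_weight=2.0):
    """Rank past queries by how well their opening tokens match the partial query.

    past_queries is a list of (user, query_string) pairs. Scores of queries
    written by users in local_users are multiplied by local_weight, so that
    local queries are favoured over global ones.
    """
    typed = partial.split()
    scored = []
    for user, query in past_queries:
        tokens = query.split()
        score = jaccard(typed, tokens[:len(typed)])
        if user in local_users:
            score *= local_weight
        if score > 0:
            scored.append((score, query))
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [query for _score, query in scored]


PAST = [
    ("alice", "SELECT ?find WHERE { ?find ex:foundAt ?site }"),
    ("bob", "SELECT ?find WHERE { ?find ex:material ex:bronze }"),
    ("carol", "ASK { ex:site7 ex:period ?p }"),
]
suggestions = suggest("SELECT ?find WHERE", PAST, local_users={"bob"})
```

Here both SELECT queries match the typed part equally well, but the one written by the local user is ranked first; the ASK query shares no tokens and is not suggested.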
Whereas the previously discussed approach is query-driven, a data-driven approach can be used as well.
This would entail learning from the available data, thereby determining frequently occurring
combinations of relations which may be offered as suggestions. This could additionally be coupled with
relevant ontologies and knowledge of the query language and its DM extension. Such a coupling would
prevent offering invalid suggestions as well. Hence, it could be considered as a form of autocompletion.
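A data-driven variant could, for instance, count which relations frequently occur together on the same resource and suggest the partners of relations already present in the query. The sketch below assumes triples given as plain (subject, predicate, object) tuples; the function names, the `min_count` threshold, and the example data are hypothetical, and the coupling with ontologies mentioned above is left out.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_predicate_pairs(triples, min_count=2):
    """Count pairs of predicates that co-occur on the same subject."""
    by_subject = defaultdict(set)
    for subject, predicate, _obj in triples:
        by_subject[subject].add(predicate)
    counts = Counter()
    for predicates in by_subject.values():
        counts.update(combinations(sorted(predicates), 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


def suggest_predicates(used, pairs):
    """Predicates that frequently co-occur with one already used in the query."""
    scores = Counter()
    for (a, b), n in pairs.items():
        if a in used and b not in used:
            scores[b] += n
        elif b in used and a not in used:
            scores[a] += n
    return [predicate for predicate, _n in scores.most_common()]


TRIPLES = [
    ("f1", "dc:title", "x"), ("f1", "ex:foundAt", "s1"), ("f1", "ex:period", "p1"),
    ("f2", "dc:title", "y"), ("f2", "ex:foundAt", "s2"),
    ("f3", "ex:foundAt", "s1"), ("f3", "ex:period", "p2"),
]
pairs = frequent_predicate_pairs(TRIPLES)
candidates = suggest_predicates({"ex:foundAt"}, pairs)
```

For a query that already uses `ex:foundAt`, both `dc:title` and `ex:period` would be offered, since each co-occurs with it on two resources.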
7.3 Ranking of Query Results
Integrating KDD capabilities into a query interface makes it possible to rank the results of such a query.
This ability concerns the ordering of results based upon certain criteria. Within a LAD, a strongly desired
criterion is that of relevancy. This criterion is commonly regarded as the most challenging to determine,
as it is intertwined with the researcher's flow of thought (Franz, et al. 2009). However, when using a
query language capable of representing structure and semantics, e.g. SPARQL, determining this
relevancy becomes more manageable.
While SPARQL or a similar query language would always form the bridge between a LAD’s frontend and
backend, it might not necessarily be the interface that is used to search by the researchers. That is, the
frontend might facilitate faceted searching, thereby effectively posing as a wrapper to the underlying
query language with its DM extension. In addition, it may provide a keyword-based search as well,
thereby requiring an NLP solution which translates the query to a SPARQL-based language or similar.
While such a translation would likely result in some loss in precision, it does allow for a more
user-friendly search. Moreover, it opens up the possibility of incorporating VSM, thereby enabling the
traditional ranking of documents based on their relevance to the provided keywords. Note that the
Strong Dependency on descriptive values
All studied data sets appear to depend strongly on descriptions. In most cases, these descriptions
provide crucial information and thus cannot be omitted without a significant loss of knowledge.
However, such descriptions are often represented as a single literal, thus being little more than
unstructured text.
Concurrent RDF statements
Data sets were observed in which both the correct and incorrect statements for the same concept
were kept available. That is, instead of updating falsified statements, an erratum was supplied.
Due to this construction, duplicate and possibly conflicting entries might surface.
8.3 Recommendations
Generally, the developer of a typical Data Mining solution is supplied with a generous amount of data
from which the exploration might reveal potentially relevant patterns. After careful inspection of the
data currently available, relevant patterns can likely be generalized to the entirety of the data.
Unfortunately, the minimal amount of data currently available through ARIADNE prevents such a sequence
of events from taking place. Therefore, it would be rather unlikely to successfully generalize any discovered
pattern to the large amount of data that, one day, will be accessible through ARIADNE. Instead, a
more-generic approach is suggested, such that its workings are ensured regardless of the exact characteristics
of the future data.
Based on the study of both the domain and data generated by the domain, as well as on practical
constraints with respect to time and resources, two data mining solutions were chosen which were
deemed the most feasible and suitable for implementation within the ARIADNE framework. These are:
Hypothesis Generation (7.1)
The official project proposal of ARIADNE mentions the ability to detect patterns in archaeological
data or related data and applications within the ARIADNE infrastructure. Data-mining methods
are capable of detecting such patterns. Interesting and potentially relevant subsets of these
patterns can then be presented to users as starting points for forming new research hypotheses.
These may already be known to the researcher, or they might be something they are interested in
exploring further. The interestingness of patterns will be determined algorithmically on the basis
of predefined criteria and user feedback. To facilitate this, a user interface should provide access
to a data mining backend. Initially, this might best be integrated within a text-based query
interface such as SPARQL or similar. At a later stage, a wrapper for the graphical user interface
should be made. This will allow multiple hypotheses to be presented in non-intrusive ways,
thereby allowing users to quickly scan in potentially valuable directions.
Data Quality Analysis (7.5)
Two aspects that reflect poorly on the quality of data are the occurrence of gaps and errors in the
knowledge contained therein. In case of the former, filling these voids involves predicting the
most-likely resource, link, or literal. In the latter case, these errors typically include anomalies
within the data which cannot be explained by any of the discovered patterns alone. Depending on
the likelihood of them being erroneous, the detected errors could be suggested for removal or
tagged as dubious. Alternatively, they could be replaced by a prediction of the correct value. This
could reuse the earlier-mentioned prediction method and data mining backend.
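As a rough illustration of how gap filling and error tagging could share one mechanism, the sketch below builds a frequency profile of one property conditioned on another, predicts a missing value as the most frequent one among comparable records, and tags rare values as dubious. This is a deliberately simple stand-in for the prediction method mentioned above; the record layout, the function names, and the 0.2 threshold are hypothetical.

```python
from collections import Counter


def value_profile(records, key, target):
    """For each value of `key`, count the observed values of `target`."""
    profile = {}
    for rec in records:
        if key in rec and target in rec:
            profile.setdefault(rec[key], Counter())[rec[target]] += 1
    return profile


def predict_missing(record, profile, key):
    """Most frequent target value among records sharing record[key], or None."""
    counts = profile.get(record.get(key))
    return counts.most_common(1)[0][0] if counts else None


def is_dubious(record, profile, key, target, min_share=0.2):
    """True when the record's target value is rare among comparable records."""
    counts = profile.get(record.get(key))
    if not counts or target not in record:
        return False
    return counts[record[target]] / sum(counts.values()) < min_share


# Hypothetical records: five Roman amphorae, one Neolithic one, one without a period.
RECORDS = (
    [{"type": "amphora", "period": "Roman"}] * 5
    + [{"type": "amphora", "period": "Neolithic"}, {"type": "amphora"}]
)
profile = value_profile(RECORDS, "type", "period")
```

In this toy data the record without a period would be filled with "Roman", while the single "Neolithic" amphora falls below the threshold and would be tagged as dubious rather than removed outright.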
8.4 Roadmap
The sequel to this report, i.e. Deliverable 16.3, will present the final results of the applicability and
feasibility of data mining within the ARIADNE framework. To this end, the aforementioned
recommendations will be explored and experimented with further. This process will consist of several
phases.
Following this report, a more extensive study into the recommended topics will first be performed. This
will assist us in narrowing down the list of possible options to only those that we believe possess the
most potential. The remaining options will subsequently be implemented into our experimental
environment on site where they will thoroughly be tested on various linked archaeological data. The
result of these tests will determine whether the selected options are suitable for integration within
ARIADNE. Based on the experience gained during the current study, we expect that most of the possible
options will need to be adapted to suit both the data and the user’s needs. If, for some reason, none of
these options are found to be suitable, a custom solution will be developed instead.
Once the selected options have been successfully implemented within the experimental environment,
internal evaluation rounds will be organized during which domain enthusiasts and experts of varying
levels of expertise will be asked to experiment with the implementation. Here, we expect the groups to
be comprised of students, junior and senior archaeological researchers, and local data and repository
managers. This will additionally provide the input needed for the development of (elements of) a
graphical user interface. Similarly as before, an iterative scheme will be followed.
The final phase will consist of potentially implementing the data mining solutions into the ARIADNE
infrastructure. This would be followed by extensively experimenting on the various data accessible
through ARIADNE. This implementation will be improved upon further during a series of iterative and
open evaluation sessions for the remainder of the ARIADNE project.
Bibliography
Aloia, N, C Meghini, D Gavrilis, and C Papatheodorou. ARIADNE Catalogue Data Model. Deliverable, ARIADNE, 2014.
Amin, A, J Van Ossenbruggen, L Hardman, and A van Nispen. "Understanding cultural heritage experts' information seeking needs." The 8th ACM/IEEE-CS joint conference on Digital libraries. ACM, 2008. 39-47.
Anyanwu, K, A Maduko, and A Sheth. "SemRank: ranking complex relationship search results on the semantic web." 14th international conference on World Wide Web. ACM, 2005. 117-127.
Artz, D, and Y Gil. "A survey of trust in computer science and the semantic web." Web Semantics: Science, Services and Agents on the World Wide Web, 2007: 58-71.
Balmin, A, V Hristidis, and Y Papakonstantinou. "Objectrank: authority-based keyword search in databases." VLDB, 2004: 564-575.
Baxter, M. Statistics in archaeology. London: Arnold, 2003.
Berendt, B, A Hotho, and G Stumme. "Towards semantic web mining." The Semantic Web - ISWC. Springer Berlin Heidelberg, 2002. 264-278.
Berendt, B, A Hotho, D Mladenic, M Van Someren, M Spiliopoulou, and G Stumme. "A roadmap for web mining." Web to semantic web, 2004: 1-22.
Bi, S, S Xue, Y Xu, and A Pei. "Spatial Data Mining in Settlement Archaeological Databases Based on Vector Features." Fuzzy Systems and Knowledge Discovery. Jinan Shandong: IEEE, 2008. 277-281.
Bicer, V, T Tran, and A Gossen. "Relational kernel machines for learning from graph-structured RDF data." The Semantic Web: Research and Applications, 2011: 47-62.
Bizer, C, and R Oldakowski. "Using Context- and Content-Based Trust Policies on the Semantic Web." 13th International World Wide Web Conference. New York, NY: ACM Press, 2004. 228-229.
Bizer, C, T Heath, and T Berners-Lee. "Linked data - the story so far." International journal on semantic web and information systems 5, no. 3 (2009).
Bloehdorn, S, and Y Sure. "Kernel methods for mining instance data in ontologies." 2007: 58-71.
Borgwardt, K M, N N Schraudolph, and S Vishwanathan. "Fast computation of graph kernels." Advances in neural information processing systems, 2006: 1449-1456.
Bray, T, J Paoli, C M Sperberg-McQueen, E Maler, and F Yergeau. "Extensible Markup Language (XML) 1.0." W3C. November 26, 2008. www.w3.org/TR/xml.
Brickley, D. "Basic Geo Vocabulary." W3C. February 1, 2006. www.w3.org/2003/01/geo.
Briscoe, T. "Lexical issues in natural language processing." Natural language and speech, 1991: 39-68.
Buneman, P. "Semistructured data." The sixteenth ACM SIGACT-SIGMOD-SIGART Principles of database systems. ACM, 1997. 117-121.
Campinas, S, T E Perry, D Ceccarelli, R Delbru, and G Tummarello. "Introducing rdf graph summary with application to assisted sparql formulation." 23rd International Workshop on Database and Expert Systems Applications. 2012.
Castells, P, M Fernandez, and D Vallet. "An adaptation of the vector-space model for ontology-based information retrieval." Knowledge and Data Engineering. IEEE Transactions, 2007. 261-272.
Charno, M, S Jeffrey, C Binding, D Tudhope, and K May. "From the Slope of Enlightenment to the Plateau of Productivity: Developing Linked Data at the ADS." 40th Annual Conference of Computer Applications and Quantitative Methods in Archaeology. Southampton: Amsterdam University Press, 2012. 216-223.
Chen, M S, J Han, and P S Yu. "Data mining: an overview from a database perspective." Knowledge and data Engineering, 1996: 866-883.
Cleave, J P. A study of logics. Oxford University Press, 1991.
Codd, E F. "A relational model of data for large shared data banks." Communications of the ACM, 1970: 377-387.
Codd, E F, S B Codd, and C T Salley. Providing OLAP (on-line analytical processing) to user-analysts: An IT mandate. Codd and Date, 1993.
Cohen, W W, and Z Kou. "Stacked graphical learning: approximating learning in markov random fields using very short inhomogeneous markov chains." Technical report, 2006.
Conolly, J, and M Lake. Geographical Information Systems in Archaeology. Cambridge: Cambridge University Press, 2006.
Cripps, P, et al. CRMarchaeo: the Excavation Model. CIDOC CRM, 2014.
Cyganiak, R, and D Reynolds. "The RDF Data Cube Vocabulary." W3C. January 2014. www.w3.org/TR/vocab-data-cube.
d'Amato, C, N Fanizzi, and F Esposito. "Inductive learning for the Semantic Web: What does it buy?" Semantic Web, 2010: 53-59.
De Kleijn, M, N van Manen, J Kolen, and H Scholten. "Towards a User-centric SDI Framework for Historical and Heritage European Landscape Research." International Journal of Spatial Data Infrastructures Research, 2014: 1-35.
Di Ludovico, A, and G Pieri. "Artificial Neural Networks and ancient artefacts: Justifications for a multiform integrated approach using PST and Auto-CM models." Archeologia e calcolatori, 2011: 91-128.
Dimitropoulos, H, et al. "AITION: a scalable platform for interactive data mining." Scientific and Statistical Database Management, 2012: 646-651.
Doerr, M, and K Schaller. "The Dream of a Global Knowledge Network - A new Approach." ACM Journal on Computers and Cultural Heritage, 2008.
Doerr, M, K Schaller, and M Theodoridou. "Integration of complementary archaeological sources." Computer Applications and Quantitative Methods in Archaeology. Prato, Italy, 2004.
Dubin, D. "The most influential paper Gerard Salton never wrote." Library Trends, 2004: 748-764.
Earl, G, T Sly, and D D Wheatley. "Archaeology in the Digital Era." Computer Applications and Quantitative Methods in Archaeology. Southampton: Amsterdam University Press, 2014.
Etcheverry, L, and A A Vaisman. "Enhancing OLAP analysis with web cubes." Semantic Web: Research and Applications, 2012: 469-483.
Etcheverry, L, and A A Vaisman. QB4OLAP: A New Vocabulary for OLAP Cubes on the Semantic Web. R1210LAC004, 2012.
Etter, D, and C Domeniconi. "SemRank: Semantic Rank Learning for Multimedia Retrieval." 2014.
Fanizzi, N, C d'Amato, and F Esposito. "Conceptual clustering and its application to concept drift and novelty detection." Munich: Springer Berlin Heidelberg, 2008.
Fayyad, U M. "Data mining and knowledge discovery: Making sense out of data." IEEE Intelligent Systems 11, no. 5 (1996): 20-25.
Fayyad, Usama, Gregory Piatetsky-Shapiro, and Padhraic Smyth. "From Data Mining to Knowledge Discovery in Databases." AI Magazine 17 (1996): 37-54.
Fentress, E. Register of Online Archaeological Databases. Deliverable, ARIADNE, 2014.
Fisher, D H. "Knowledge acquisition via incremental conceptual clustering." Machine learning 2, no. 2 (1987).
Franz, T, A Schultz, S Sizov, and S Staab. "Triplerank: Ranking semantic web data by tensor decomposition." 2009: 213-228.
Freitas, A, E Curry, J G Oliveira, and S O'Riain. "Querying heterogeneous datasets on the linked data web: Challenges, approaches, and trends." Internet Computing, IEEE, 2012: 24-33.
Friedman, J H. "Data Mining and Statistics: What's the connection?" Computing Science and Statistics 29, no. 1 (1998): 3-9.
Garshol, L M. "Metadata? Thesauri? Taxonomies? Topic maps! Making sense of it all." Journal of information science 30, no. 4 (2004): 378-391.
Gärtner, T. "A survey of kernels for structured data." ACM SIGKDD Explorations Newsletter 5 (2003): 49-58.
Getoor, L, and B Taskar. Introduction to statistical relational learning. MIT Press, 2007.
Golbeck, J, B Parsia, and J Hendler. "Trust networks on the semantic web." Journal of Web Semantics, 2003: 238-249.
Gombos, G, and A Kiss. "SPARQL query writing with recommendations based on datasets." Information and Knowledge Design and Evaluation, 2014: 310-319.
Gruber, E, G Bransbourg, S Heath, and A Meadows. "Linking Roman Coins: Current Work at the American Numismatic Society." 40th Annual Conference of Computer Applications and Quantitative Methods in Archaeology. Southampton: Amsterdam University Press, 2012. 249-258.
Gupta, M, Y Sun, and J Han. "Trust analysis with clustering." 20th international conference companion on World Wide Web. ACM, 2011. 53-54.
Hagood, J. "A brief introduction to data mining projects in the humanities." Bulletin of the American Society for Information Science and Technology 38, no. 4 (2012): 20-23.
Hastie, T, R Tibshirani, and J Friedman. The elements of statistical learning. New York: Springer, 2009.
He, X, and M Baker. "xhRank: Ranking Entities for Semantic Web Searching." Fifth International Conference on Advances in Semantic Processing. 2011. 62-68.
Heath, T, and C Bizer. Linked Data: Evolving the Web into a Global Data Space. Morgan & Claypool Publishers, 2011.
Hogan, A, A Harth, and S Decker. "Reconrank: A scalable ranking method for semantic web data with context." 2006.
Hollander, Hella, and Maarten Hoogerwerf. D13.1: Service Design. ARIADNE, 2014.
Huang, B, A Kimmig, L Getoor, and J Golbeck. "Probabilistic soft logic for trust analysis in social networks." International Workshop on Statistical Relational AI. 2012. 1-8.
Huang, Y, and V Tresp. "Accessing the Semantic Web via Statistical Machine Learning." ESWC 2012 Tutorial. May 2012. http://www.dbs.ifi.lmu.de/~huang/eswc2012tutorial/ESWC2012-TutorialV10.pdf.
Huang, Y, and V Tresp. Relation Prediction in Semantic Domains using Multivariate Prediction. Munich, Germany: Siemens AG, 2010.
Huang, Y, V Tresp, and H P Kriegel. "Multivariate prediction for learning in relational graphs." Workshop: Analyzing Networks and Learning With Graphs. 2009.
Huang, Y, V Tresp, M Bundschus, A Rettinger, and H P Kriegel. "Multivariate prediction for learning on the semantic web." Inductive Logic Programming, 2011: 92-104.
Huber, R, H Ramoser, K Mayer, H Penz, and M Rubik. "Classification of coins using an eigenspace approach." Pattern Recognition Letters, 2005: 61-75.
Huggett, J. "Disciplinary Issues: Challenging the Research and Practice of Computer Applications in Archaeology." 40th Annual Conference of Computer Applications and Quantitative Methods in Archaeology. Southampton: Amsterdam University Press, 2012. 14-24.
Isaksen, L, G Earl, K Martinez, S Keay, and N Gibbins. "Linking archaeological data." International Conference on Computer Applications and Quantitative Methods in Archaeology. 2009.
Isaksen, L, K Martinez, N Gibbins, G Earl, and S Keay. "Linking archaeological data." CAA, 2009.
Isaksen, L, K Martinez, N Gibbins, G Earl, and S Keay. "Interoperate with whom? Formality, Archaeology and the Semantic Web." WebScience. Raleigh, NC, 2010.
Kantardzic, M. Data mining: concepts, models, methods, and algorithms. John Wiley & Sons, 2011.
Karasik, A, I Sharon, U Smilansky, and A Gilboa. "Typology and classification of ceramics based on curvature analysis." Computer Applications and Quantitative Methods in Archaeology. 2004. 472-475.
Kellar, M, C Watters, and K M Inkpen. "An exploration of web-based monitoring: implications for design." The SIGCHI conference on Human factors in computing systems. ACM, 2007. 377-386.
Khan, M A, G A Grimnes, and A Dengel. "Two pre-process operators for improved learning from semantic web data." First Rapidminer Community Meeting and Conference. 2010.
Kiefer, C, A Bernstein, and A Locher. "Adding data mining support to SPARQL via statistical relational learning methods." The Semantic Web: Research and Applications, 2008: 478-492.
Kintigh, K. "Quantitative methods designed for archaeological problems." In Quantitative Research in Archaeology: Progress and Prospects, by M S Aldenderfer, 135-150. Newbury Park, CA: Sage, 1987.
Knobbe, A J. Multi-relational data mining. IOS Press, 2006.
Kobylinski, L, and K Walczak. "Data mining approach to classification of archaeological aerial photographs." Intelligent Information Processing and Web Mining, 2006: 479-487.
Kolda, T G, and B W Bader. "Tensor decompositions and applications." SIAM review 51, no. 3 (2009): 455-500.
Koller, D, and N Friedman. Probabilistic graphical models: principles and techniques. MIT Press, 2009.
Kramer, K, R Q Dividino, and G Gröner. "SPACE: SPARQL Index for Efficient Autocompletion." International Semantic Web Conference. 2013. 157-160.
Krogel, M A, S Rawles, F Železný, P A Flach, N Lavrač, and S Wrobel. "Comparative evaluation of approaches to propositionalization." 13th International Conference on ILP. Szeged: Springer Berlin Heidelberg, 2003. 197-214.
Kurgan, L A, and P Musilek. "A survey of Knowledge Discovery and Data Mining process models." The Knowledge Engineering Review, 2006: 1-24.
Lavrač, N, A Vavpetič, L Soldatova, I Trajkovski, and P K Novak. "Using ontologies in semantic data mining with segs and g-segs." Discovery Science, 2011: 165-178.
Lavrac, N, and S Dzeroski. Relational Data Mining. Springer, 2001.
Linderholm, J, and P Geladi. "Classification of archaeological soil and sediment samples using near infrared techniques." NIR news, 2012: 6.
Locher, A. SPARQL-ML: Knowledge Discovery for the Semantic Web. Thesis, University of Zurich, 2007.
Maali, F, J Erickson, and P Archer. "Data Catalogue Vocabulary (DCAT)." W3C. January 16, 2014. www.w3.org/TR/vocab-dcat.
Maedche, A, and V Zacharias. "Clustering ontology-based metadata in the semantic web." Principles of Data Mining and Knowledge Discovery. Springer Berlin Heidelberg, 2002. 348-360.
Maimon, O Z, and L Rokach. Data mining and knowledge discovery handbook. New York: Springer, 2005.
Mani, Inderjeet, and Mark T Maybury. Advances in Automatic Text Summarization. Cambridge: MIT Press, 1999.
Martinez, K, and L Isaksen. "The Semantic Web Approach to Increasing Access to Cultural Heritage." Revisualizing Visual Culture. Farnham, 2010. 29-44.
May, K. Hypermedia Research Unit - CIDOC CRM-EH Ontology. n.d. http://hypermedia.research.southwales.ac.uk/kos/CRM/ (accessed November 02, 2014).
Mendes, P N, M Jakob, A García-Silva, and C Bizer. "DBpedia spotlight: shedding light on the web of documents." 7th International Conference on Semantic Systems. ACM, 2011. 1-8.
Metaxas, O, H Dimitropoulos, Y Ioannidis, and M Paedigree. "AITION: A scalable KDD platform for Big Data Healthcare." Biomedical and Health Informatics. IEEE, 2014. 601-604.
Middleton, S E, N R Shadbolt, and D C De Roure. "Ontological user profiling in recommender systems." ACM Transactions on Information Systems (TOIS) 22, no. 1 (2004): 54-88.
Narasimha, V, P Kappara, R Ichise, and O P Vyas. "LiDDM: A Data Mining System for Linked Data." CEUR Workshop Proceedings: Linked Data on the Web. 2011.
Nickel, M, V Tresp, and H Kriegel. "A three-way model for collective learning on multi-relational data." 28th international conference on machine learning. 2011. 809-816.
Nolle, M, H Penz, M Rubik, K Mayer, I Hollander, and R Granec. "Dagobert - a new coin recognition and sorting system." International Conference on Digital Image Computing, Techniques, and Applications. Sydney: CSIRO Publishing, 2003. 329-338.
Novak, P K, A Vavpetic, I Trajkovski, and N Lavrac. "Towards semantic data mining with g-segs." 11th International Multiconference Information Society. 2009.
OGC GeoSPARQL - A Geographic Query Language for RDF Data. Specification, Open Geospatial Consortium, 2012.
Padawitz, P. Computing in Horn clause theories. Springer Publishing Company, 2012.
Parsaye, K. "Surveying Decision Support: New Realms of Analysis." Database Programming and Design, 1996: 26-33.
Paulheim, H, and J Fürnkranz. "Unsupervised generation of data mining features from linked open data." International conference on web intelligence, mining and semantics. ACM, 2012. 31.
Potoniec, J, and A Lawrynowicz. "RMonto: ontological extension to RapidMiner." ISWC. 2011. 1-4.
Potoniec, J, and A Lawrynowicz. "RMonto - towards KDD workflows for ontology-based data mining." ECML PKDD. 2011b. 11.
Prud'hommeaux, E, and A Seaborne. "SPARQL Query Language for RDF." W3C. January 18, 2008. www.w3.org/TR/rdf-sparql-query.
Pu, L, and B Faltings. "Understanding and improving relational matrix factorization in recommender systems." 7th ACM conference on Recommender systems. ACM, 2013. 41-48.
Ibragimov, D, K Hose, T B Pedersen, and E Zimányi. "Towards Exploratory OLAP over Linked Open Data - A Case Study." 2014: 18.
Ramezani, R, M Saraee, and M A Nematbakhsh. "Finding association rules in linked data, a centralization approach." Iranian Conference on Electrical Engineering. IEEE, 2013. 1-6.
Rapidminer SemWeb. n.d. https://code.google.com/p/rapidminer-semweb (accessed September 12, 2014).
Rettinger, A, U Lösch, V Tresp, C d'Amato, and N Fanizzi. "Mining the semantic web." Data Mining and Knowledge Discovery, 2012: 613-662.
Richards, J D. "Archaeology, e-publication and the semantic web." Antiquity 80, no. 310 (2006): 970-979.
Ristoski, P, and H Paulheim. "A Comparison of Propositionalization Strategies for Creating Features from Linked Open Data." Linked Data for Knowledge Discovery, 2014: 6-17.
Ristoski, P, C Bizer, and H Paulheim. "Mining the web of linked data with rapidminer." International Semantic Web Conference. 2014.
Rocha, C, D Schwabe, and M P Aragao. "A hybrid approach for searching in the semantic web." 13th International conference on World Wide Web. ACM, 2004. 374-383.
Salton, G, A Wong, and C S Yang. "A vector space model for automatic indexing." Communications of the ACM, 1975: 613-620.
Schölkopf, B, and A J Smola. Learning with kernels: Support vector machines, regularization, optimization, and beyond. MIT Press, 2002.
Selhofer, Hannes, and Guntram Geser. D2.1: First Report on Users' Needs. Salzburg: ARIADNE, 2014.
Sen, P, G Namata, M Bilgic, L Getoor, B Gallagher, and T Eliassi-Rad. "Collective classification in network data." AI magazine 29, no. 3 (2008).
Shadbolt, N, W Hall, and T Berners-Lee. "The semantic web revisited." Intelligent Systems (IEEE) 21, no. 3 (2006).
Signore, O. "Representing knowledge in archaeology: from cataloguing cards to semantic web." Archeologia e calcolatori, no. 20 (2009): 111-128.
Singla, P, and P Domingos. "Entity resolution with markov logic." Sixth International Conference on Data Mining. IEEE, 2006. 572-582.
Appendix C Learning Methods for Semantic Web Mining
Semantic-Web Mining offers a wide range of potential methods, most of which are rather experimental. Of these methods (Knobbe 2006, Rettinger, et al. 2012, Berendt, et al. 2004), two feature prominently in recent literature, namely Propositionalization and Statistical Relational Learning. The core concepts of both will be discussed next. In addition, we will briefly look at the promising and rather new Kernel Methods, which have recently become quite popular within ML communities.
C.1 Propositional Learning
Recall that propositional data are assumed to be independent and identically distributed, which allows statistical machine learning algorithms to be applied. Given that these assumptions do not necessarily hold for LD, applying such algorithms anyway would likely result in false conclusions. Instead of abandoning these proven methods, however, an alternative would be to temporarily convert LD to propositional data; a process known as propositionalization (Ristoski and Paulheim 2014, Tresp, et al. 2008).
A well-known method used during propositionalization is factorization, which involves the decomposition of an object into a product of smaller objects; its factors. When put together again, these factors return (an estimation of) the original object. Within the context of LD, it entails the translation of graph data to propositional data, and vice versa. While the data are in this propositional format, statistical ML can be applied.
C.1.1 Relational Matrices
A matrix is a two-dimensional array arranged in n rows and m columns. Each point (i, j) with 1 ≤ i ≤ n and 1 ≤ j ≤ m is an element of the matrix. When two or more matrices are needed to describe a single data set, these are often denoted as relational matrices. Note that this strongly resembles the relational data model as used in relational databases.
Translating LD into relational matrices (Figure C-1) is a fairly straightforward procedure (Tresp, et al. 2008, Pu and Faltings 2013, Rettinger, et al. 2012). Each matrix represents a single predicate in the data set. Hence, the number of required matrices equals the number of predicates. Next, for each triple containing a certain predicate, its subject and object are placed on row i and column j of the corresponding matrix, respectively. Within that matrix, the element (i, j) now contains a value of 1.0, thereby representing that the predicate holds with respect to the corresponding subject and object.
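The translation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation used within ARIADNE: the function name to_relational_matrices, the ex: identifiers, and the example triples are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def to_relational_matrices(triples):
    """Build one binary subject-by-object matrix per predicate.

    `triples` is an iterable of (subject, predicate, object) tuples.
    """
    subjects = sorted({s for s, _, _ in triples})
    objects = sorted({o for _, _, o in triples})
    row = {s: i for i, s in enumerate(subjects)}  # subject -> row index i
    col = {o: j for j, o in enumerate(objects)}   # object -> column index j

    matrices = {}
    for s, p, o in triples:
        if p not in matrices:
            # One matrix per predicate, created on first use.
            matrices[p] = np.zeros((len(subjects), len(objects)))
        matrices[p][row[s], col[o]] = 1.0  # the statement (s, p, o) holds
    return subjects, objects, matrices


# Hypothetical example triples (identifiers are illustrative only).
triples = [
    ("ex:anomaly1", "ex:locatedIn", "ex:trench3"),
    ("ex:anomaly2", "ex:locatedIn", "ex:trench3"),
    ("ex:anomaly1", "rdf:type", "ex:WaterWell"),
]
subjects, objects, matrices = to_relational_matrices(triples)
```

Two predicates occur in the example, so two matrices are created. Every element that is not set explicitly remains 0.0, which is the starting point for the factorization step.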
After propositionalization, each element (i, j) of a matrix contains either a value of 1.0 (holds) or 0.0 (otherwise) (Pu and Faltings 2013, Rettinger, et al. 2012). Factorization is then applied to determine the latent features that are hidden between the entities. This is similar to the well-known statistical technique of Principal Component Analysis (PCA). Once split up, the factors are multiplied again, thereby creating an estimation of the original matrix. However, where before some of the elements had the value 0.0, they now have a value between 0.0 and 1.0, thereby representing confidence values that the corresponding statement holds.
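As a rough illustration of this split-and-multiply step, the sketch below uses a truncated singular value decomposition. That technique is chosen here only for brevity; it is one of many factorization approaches and is not necessarily the one used in the cited work. The matrix values and the function name low_rank_estimate are hypothetical. A plain truncated SVD does not guarantee values within [0.0, 1.0], so the estimate is clipped to that range as a simplifying assumption.

```python
import numpy as np


def low_rank_estimate(matrix, rank):
    """Factorize `matrix` with an SVD and rebuild it from its top `rank` factors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep only the strongest latent features and multiply the factors again.
    estimate = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    # Assumption: clip so the result can be read as a confidence value.
    return np.clip(estimate, 0.0, 1.0)


# Hypothetical relational matrix for one predicate: rows are subjects,
# columns are objects. Element (2, 1) is 0.0, i.e. not known to hold.
observed = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
])
estimate = low_rank_estimate(observed, rank=2)
```

In this example the third subject resembles the first two, so the reconstructed element (2, 1) receives a value strictly between 0.0 and 1.0, whereas elements linking the unrelated fourth subject to the first two objects stay at 0.0.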
"
Figure C-1: Propositionalization of graph data to relational matrices. Note that, for clarity, only three distinct RDF predicates have been depicted as relational matrices, whereas the graph depicts a maximum of nine.
An advantage of using relational matrices is that the processes of applying propositionalization and factorization are fairly straightforward. Although many different approaches to factorizing a matrix exist (a technical detail omitted here), factorization as a whole constitutes a proven method that is in use at many companies (Pu and Faltings 2013, Rettinger, et al. 2012). Furthermore, once an RDF graph has gone through this process, the reconstructed statements that were previously unknown could be integrated as (weighted) triples (Tresp, et al. 2008). Care should be taken, however, as perhaps not every unknown statement is justified. Another aspect of which one should be wary is that all statements of a certain predicate are estimated in a single step. Extending this over the entire graph results in a performance that scales proportionally to the number of predicates, as well as to their frequency. That is, every additional triple requires an additional entry in the corresponding (possibly not yet existing) matrix.
C.1.2 Tensors
Tensors may be thought of as the generalization of the matrix, thereby allowing for an arbitrary number of modes, called orders. Put differently, a matrix can be regarded as a second-order tensor. A tensor with N orders consists of elements that are each addressed by N indices (i, j, …, n), one per order. Note, however, that orders are independent of the spatial dimensions.
During propositionalization (Figure C-2), each graph is translated to a single tensor (Kolda and Bader 2009, Rettinger, et al. 2012, Nickel, Tresp and Kriegel 2011). Given that each triple consists of three entities, those tensors will be of the third order. Of these orders, the first will contain all subjects, the second all predicates, and the third all objects of the graph. Assuming indices i, j, k for the first, second, and third order respectively, each element (i, j, k) will now contain the value 1.0 if the statement holds, and 0.0 otherwise.
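The construction of such a third-order tensor can be sketched as follows. As before, this is only an illustration: the function name to_tensor and the example triples are hypothetical, and NumPy is assumed.

```python
import numpy as np


def to_tensor(triples):
    """Translate (subject, predicate, object) triples to a third-order binary tensor."""
    subjects = sorted({s for s, _, _ in triples})
    predicates = sorted({p for _, p, _ in triples})
    objects = sorted({o for _, _, o in triples})
    s_idx = {s: i for i, s in enumerate(subjects)}    # first order: subjects
    p_idx = {p: j for j, p in enumerate(predicates)}  # second order: predicates
    o_idx = {o: k for k, o in enumerate(objects)}     # third order: objects

    tensor = np.zeros((len(subjects), len(predicates), len(objects)))
    for s, p, o in triples:
        tensor[s_idx[s], p_idx[p], o_idx[o]] = 1.0  # the statement holds
    return tensor, subjects, predicates, objects


# Hypothetical example triples (identifiers are illustrative only).
triples = [
    ("ex:anomaly1", "ex:locatedIn", "ex:trench3"),
    ("ex:anomaly2", "ex:locatedIn", "ex:trench3"),
    ("ex:anomaly1", "rdf:type", "ex:WaterWell"),
]
tensor, subjects, predicates, objects = to_tensor(triples)
```

In contrast to the relational matrices, the whole graph ends up in a single object, with the predicate acting as one of the three indices.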
Factorization is then applied to determine the latent features that are hidden between the entities. Once split up, the factors are multiplied again, thereby creating an estimation of the original tensor (Kolda and Bader 2009, Rettinger, et al. 2012). However, where before some of the elements had the value 0.0, they now have a value between 0.0 and 1.0, thereby representing confidence values that the corresponding statement holds.

Figure C-2: Propositionalization of graph data to a third-order tensor in a three-dimensional vector space. Note that, for clarity, only two RDF statements have been depicted in the tensor, whereas the graph depicts nine.
An advantage of using tensors is the simplicity of applying propositionalization and factorization. In addition, tensors permit the inclusion of contextual information (Rettinger, et al. 2012), and they allow for collective learning, which makes the method less dependent on the explicit aggregation of information. Furthermore, once an RDF graph has gone through this process, the reconstructed statements that were previously unknown could be integrated as (weighted) triples (Tresp, et al. 2008). Care should be taken, however, as perhaps not every unknown statement is justified. Another aspect of which one should be wary is that all statements of the whole graph are estimated in a single step. Nevertheless, tensors tend to scale rather well, despite their large size. That is, the size of each order grows proportionally to the number of corresponding entities within the graph.
C.2 Statistical Relational Learning
Statistical Relational Learning is an umbrella term, which encompasses a large number of various methods to represent, reason, and learn in domains with complex relational and rich probabilistic structures (Getoor and Taskar 2007, Rettinger, et al. 2012). Typically, these methods are based on either logic- or frame-based formalisms, such as rules or graphical models respectively.
C.2.1 Inductive Logic Programming
Inductive Logic Programming (ILP) encompasses methods that attempt to learn logical (Horn) clauses directly from relational data (Lavrac and Dzeroski 2001, Getoor and Taskar 2007, Tresp, et al. 2008, Maimon and Rokach 2005). These clauses typically consist of conjunctions of positive and negative logical atoms, which together can be seen as Logic Programs. In addition, these Logic Programs allow for the integration of valid background (domain) knowledge.
Given a set of positive and negative atoms of a target relation p, as well as a set of background relations q_i, the task is to learn a definition of relation p that is consistent and complete (Lavrac and Dzeroski 2001, Getoor and Taskar 2007, Maimon and Rokach 2005). That is, this definition should cover all specified positive atoms (completeness) and none of the specified negative atoms (consistency). Within the context of LD, both types of atoms can be thought of as triples that either do or do not hold. In that case, the outcome of a single clause would constitute the truth value of a hypothesized triple.
As an example, consider wanting to determine whether a colour difference in a layer of excavated soil marks the remains of an ancient water well. Given numerous examples of colour differences that were found to be such a water well, and given numerous more examples of those that were not, a new rule could be learned that discriminates between the two cases. A very crude version of such a rule, or Logic Program, might resemble Equation C-1, which evaluates the colour difference CD as being a well (V = t) if its radius is larger than 50 units and its depth larger than 200 units, and if its difference D in colour with respect to the surrounding area SA exceeds a value of 1.42 units.
\[
\mathit{waterWell}(CD, V = t) \leftarrow \mathit{radius}(CD) > 50 \,\land\, \mathit{depth}(CD) > 200 \,\land\, \mathit{colourDifference}(CD, SA, D) \,\land\, D > 1.42
\tag{Equation C-1}
\]
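To make the rule concrete, it can be written as a plain Boolean function. The sketch below is only a direct transcription of the crude example rule, using the thresholds stated in the text; the function and parameter names are hypothetical, and the colour difference with respect to the surrounding area is assumed to have been measured beforehand and passed in as a number.

```python
def water_well(radius, depth, colour_difference):
    """Return True if the example rule classifies the anomaly as a water well.

    `colour_difference` is the difference D in colour between the anomaly
    and its surrounding area (assumed to be precomputed).
    """
    return radius > 50 and depth > 200 and colour_difference > 1.42


# Hypothetical measurements for two soil anomalies.
is_well = water_well(radius=60, depth=250, colour_difference=1.5)
is_not_well = water_well(radius=60, depth=250, colour_difference=1.0)
```

An ILP system would learn the structure and thresholds of such a clause from the positive and negative examples, rather than have them specified by hand as is done here.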
A big advantage of ILP is the experience that has been gained with it in (multi-)relational DM (Getoor and Taskar 2007, Lavrac and Dzeroski 2001, Maimon and Rokach 2005), which, while not the same, shares many of the hurdles that are also present with graph data. Another advantage is its logical representation, which provides strong expressive power and which fits naturally with the formal logics behind LD. This fit allows for an easy integration of background knowledge, as well as for an equally easy integration of inferred knowledge into the original graph. However, while some variants do exist, this new knowledge is mainly limited to statements that either do or do not hold. Another disadvantage is the need for multiple positive and negative atoms per relation, which, in the case of LD, is challenging due to the sparseness that often exists in RDF graphs. Moreover, the need for negative atoms translates into a need for explicitly specified triples that state that a relation does not hold; a practice that is rare in a system that adheres to the Open-World Assumption.
C.2.1.1 Propositional ILP
In addition to relational learning, ILP also has a strong relation with propositional learning. In fact, it was in the area of ILP where the term propositionalization was originally used. Within that context, it constitutes the translation of first-order clauses into features to which statistical ML algorithms can be applied. However, this process may result in the loss of information, and thus is said to exchange accuracy for efficiency (Krogel, et al. 2003).
With propositionalization (Figure C-3), a set of Boolean features is sought whereby each feature can be defined as a corresponding clause (Krogel, et al. 2003, Lavrac and Dzeroski 2001, Tresp, et al. 2008). Given n features, a propositionalization of a relational-learning problem is a set of n clauses, with each clause consisting of one or more logical literals. These literals are derived from the relational background knowledge. Once all features have been defined, they are evaluated with respect to an instance, thus resulting in a sequence of n Boolean values.
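The evaluation step can be illustrated with a small sketch in which each feature is a named Boolean test on an instance. The three feature definitions reuse the thresholds of the water-well example and are written by hand here; in an actual ILP setting they would be derived from the relational background knowledge. All names and values are hypothetical.

```python
# Each feature corresponds to one clause, given here as a (name, test) pair.
FEATURES = [
    ("radius_gt_50", lambda x: x["radius"] > 50),
    ("depth_gt_200", lambda x: x["depth"] > 200),
    ("colour_difference_gt_1_42", lambda x: x["colour_difference"] > 1.42),
]


def propositionalize(instance):
    """Evaluate all n features on one instance, giving n Boolean values."""
    return [bool(test(instance)) for _, test in FEATURES]


# Hypothetical instance describing one soil anomaly.
anomaly = {"radius": 60, "depth": 250, "colour_difference": 1.0}
row = propositionalize(anomaly)  # [True, True, False]
```

Applying propositionalize to every instance yields a table with one row per instance and one Boolean column per feature, which is the propositional format expected by statistical ML algorithms.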
Within the field of ILP, there are two approaches to propositionalization (Krogel, et al. 2003, Lavrac and Dzeroski 2001): complete and partial. With complete propositionalization, all knowledge contained within a data set is translated to feature definitions, whereas with the partial variant this is done only for the most relevant subset of this knowledge. Therefore, in case of the latter, certain knowledge is lost. Moreover, determining which features are of importance is typically accomplished through heuristics, thus introducing assumptions.

Figure C-3: Propositionalization of graph data to a set of ILP feature definitions.
An advantage of propositional ILP is that, in most cases, it is a very robust and effective method (Krogel, et al. 2003, Lavrac and Dzeroski 2001, Tresp, et al. 2008). Only a few data sets are known to exist for which this does not hold. Another benefit is that the partial variant may easily be tailored to one's needs by specifying the number of features. This latter property can additionally be regarded as a disadvantage, as it may quickly lead to lost knowledge. Moreover, it might introduce assumptions that may bias any future reasoning.
C.2.2 Relational Graphical Models
Probabilistic Graphical Models (PGM) are a general framework (Figure C-4) to represent complex real-world phenomena over a high-dimensional space by the combination of probability theory and logical structures (Getoor and Taskar 2007, Cohen and Kou 2006, Koller and Friedman 2009). Therefore, PGMs are able to reason with uncertainty, as well as with dependencies between entities. While numerous kinds exist, the majority of the PGMs can easily be depicted by either a directed or an undirected graph, such as a Bayesian or a Markov model, respectively. Within this graph, the nodes map to domain variables and the edges correspond to direct probabilistic interactions between these variables. Furthermore, PGMs have a fixed graphical structure, which limits their ability to reason about a varying number of entities in a variety of configurations.
Relational Graphical Models (RGM) are a specific kind of PGM (Figure C-4) that extend the PGM framework with a flexible graphical structure, as well as with concepts of objects, their properties, and relations between them (Getoor and Taskar 2007, Tresp, et al. 2008, Lavrac and Dzeroski 2001). This separation is similar to that of propositional and relational logics. Whether a relation holds is encoded by a corresponding binary variable, and these variables are represented as nodes in the RGM. Within the context of LD, these variables denote the potential RDF triples and not the nodes of the corresponding RDF graph, which are either resources or literals. More specifically, a variable with the value 1.0 would denote that the corresponding triple holds, whereas the value 0.0 would denote the opposite. Any value in between would indicate a degree of certainty.
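
The distinction between RDF graph nodes and RGM variables can be illustrated with a minimal Python sketch. It assumes a toy vocabulary (`SoilAnomaly`, `hasRadius`, `hasDepth`, `WaterWell`) and a certainty value of 0.8 that are purely hypothetical and not taken from any ARIADNE data set; the helper `holds` and its threshold are likewise illustrative choices.

```python
from itertools import product

# Hypothetical toy vocabulary; none of these names come from a real data set.
subjects = ["SoilAnomaly"]
predicates = ["hasRadius", "hasDepth", "a"]
objects = ["60", "250", "WaterWell"]

# One variable per *potential* triple (not per node of the RDF graph).
# Every variable starts at 0.0, i.e. "the triple does not hold".
variables = {triple: 0.0 for triple in product(subjects, predicates, objects)}

# Observed triples are certain; an inferred triple gets an assumed certainty.
variables[("SoilAnomaly", "hasRadius", "60")] = 1.0
variables[("SoilAnomaly", "hasDepth", "250")] = 1.0
variables[("SoilAnomaly", "a", "WaterWell")] = 0.8  # hypothetical value


def holds(triple, threshold=0.5):
    """Treat a triple as true when its certainty exceeds an assumed threshold."""
    return variables[triple] > threshold
```

Note that three subjects, predicates, and objects already yield nine candidate variables, which hints at why the number of variables grows quickly with the size of the graph.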

Figure C-4: Venn diagram of the hierarchy within the PGM framework. Note that the cursive terms within the boundaries denote that class' most-common model.

As with PGMs, RGMs may be either directed or undirected (Figure C-4). Directed RGMs are generally known as Probabilistic Relational Models (PRM) (Getoor and Taskar 2007, Tresp, et al. 2008, Lavrac and Dzeroski 2001, Rettinger, et al. 2012). In addition, the type of a relation and its direction are assumed to be known, an assumption intended to simplify the model. An extension to this additionally considers two types of structural uncertainty:
Reference uncertainty concerns the case in which a relation, and only one of its two members, is known. That is, it is unknown whether a certain entity is dependent upon one or more other entities, and if so, which those other entities are.
Existence uncertainty concerns the case in which the relation between two or more entities is unknown. That is, given two entities, it is unknown whether one depends on the other.
As an example, consider translating a simple RDF graph (Figure C-5, left) to a PRM (Figure C-5, right). Here, the former involves an anomaly discovered within the soil, which was found to be an ancient water well. This conclusion was based upon the radius and depth of the anomaly, thus making these measurements dependencies of that conclusion. Therefore, a PRM would represent the corresponding triples, i.e. Soil Anomaly has radius 60 and Soil Anomaly has depth 250, as parent nodes of the concluding triple Soil Anomaly a Water Well.

Figure C-5: Left) a simple RDF graph concerning an anomaly in the soil that once constituted a water well. Right) a PRM of the RDF graph with every triple being true.
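
The water-well example can be sketched in Python as a single child node with two parent nodes. This is a minimal illustration only: the conditional probability table `CPT` contains made-up numbers, the parents are assumed to be independent, and the function name `child_probability` is hypothetical.

```python
from itertools import product

# Hypothetical conditional probability table:
# P("Soil Anomaly a Water Well" holds | truth values of the two parent triples).
# Key order: (has radius 60, has depth 250). All numbers are assumptions.
CPT = {
    (True, True): 0.9,
    (True, False): 0.4,
    (False, True): 0.3,
    (False, False): 0.05,
}


def child_probability(parent_probs, cpt):
    """Marginal probability that the child triple holds.

    Sums over every combination of parent truth values, weighting each
    CPT entry by the probability of that combination (parents are
    assumed independent in this sketch).
    """
    total = 0.0
    for states in product([True, False], repeat=len(parent_probs)):
        weight = 1.0
        for state_holds, p in zip(states, parent_probs):
            weight *= p if state_holds else 1.0 - p
        total += weight * cpt[states]
    return total
```

With both measurements certain, as in Figure C-5 where every triple is true, `child_probability([1.0, 1.0], CPT)` reduces to the single table entry for `(True, True)`.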
An advantage of RGMs is the flexible but powerful expressiveness of their graphical representation (Getoor and Taskar 2007, Tresp, et al. 2008, Lavrac and Dzeroski 2001, Rettinger, et al. 2012). Being grounded in a sound statistical framework, they allow for direct learning from graph data without the need for propositionalization. In addition, ontological background knowledge can be integrated, and inferred knowledge can be incorporated into the RDF graph as weighted triples. This makes RGMs particularly well suited to exploratory data analysis. Furthermore, they are able to learn and perform inference in large networks. However, both tasks tend to be expensive, as computational requirements scale with the number of statements whose truth value is known. Alternative approaches have been suggested, which limit the learning and reasoning to relevant subgraphs only. These approaches, however, are still fairly new and experimental. Another challenge that is still largely unresolved is how to deal with missing data. Finally, one should note that RGMs are mainly limited to learning and predicting truth values of RDF statements.
C.3 Kernel Methods
Kernel methods are a group of techniques that try to solve mathematically complex ML problems by translating them into ML problems that are mathematically easier to work with (Schölkopf and Smola 2002). The function that performs this translation is called the kernel. How such a kernel is defined depends greatly on the characteristics of the problem at hand. In fact, many different kernels exist, each designed with a specific goal in mind.
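Mathematically speaking[45], given a problem specified in input space $\mathcal{X}$, a mapping $\Phi$ is defined which translates points in $\mathcal{X}$ to feature space $\mathcal{H}$ (Equation C-2). A kernel $k$ is then defined such that, for input vectors $x$ and $x'$ from $\mathcal{X}$, it returns the inner product in $\mathcal{H}$ of their images under $\Phi$ (Equation C-3).

$$\Phi : \mathcal{X} \to \mathcal{H}$$   Equation C-2

$$k : (x, x') \mapsto \langle \Phi(x), \Phi(x') \rangle$$   Equation C-3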
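
The relation between the mapping and the kernel can be checked numerically. The sketch below assumes a homogeneous polynomial kernel of degree two on two-dimensional inputs, which is a standard textbook choice and not one prescribed by this report; the names `feature_map` and `poly_kernel` are illustrative.

```python
import math


def feature_map(x):
    """Explicit mapping Phi from 2-D input space to a 3-D feature space."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def inner(u, v):
    """Ordinary inner product of two equally long vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def poly_kernel(x, y):
    """Kernel k(x, y) = (x . y)^2, computed without ever building Phi(x)."""
    return inner(x, y) ** 2


x, y = (1.0, 2.0), (3.0, 4.0)
via_feature_space = inner(feature_map(x), feature_map(y))
via_kernel = poly_kernel(x, y)
```

Both routes yield the same value up to floating-point rounding, but the kernel only needs one inner product in input space. This shortcut is what makes high-dimensional feature spaces affordable.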
Amongst the most popular methods that use kernels are Support-Vector Machines (SVM) (Schölkopf and Smola 2002, Gärtner 2003). SVMs are commonly used to perform classification tasks (Figure C-6). However, they can additionally be used for either clustering or regression. By incorporating kernels, which is referred to as applying the kernel trick, SVMs are able to mitigate several of the issues that are encountered by traditional ML algorithms. One prime example of such an issue is local minima, which cause an optimization process to stall. Moreover, the kernel trick allows SVMs to learn non-linear separation boundaries.

Figure C-6: Example of how SVMs translate complex problems (left), here a binary-classification problem, to more-workable problems (right) (Schölkopf and Smola 2002).
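
How a kernel yields a non-linear separation boundary can be sketched with a kernel perceptron, a much simpler learner than an SVM that is swapped in here only because it fits in a few lines. The XOR-style data set, the polynomial kernel, and all function names below are assumptions made for illustration.

```python
def linear_kernel(a, b):
    """Plain inner product: can only express linear boundaries."""
    return sum(ai * bi for ai, bi in zip(a, b))


def poly_kernel(a, b, degree=2):
    """Inhomogeneous polynomial kernel (1 + a . b)^degree."""
    return (1 + linear_kernel(a, b)) ** degree


def score(x, points, labels, alphas, kernel):
    """Decision value: weighted sum of kernel similarities to training points."""
    return sum(a * y * kernel(p, x) for a, p, y in zip(alphas, points, labels))


def train(points, labels, kernel, epochs=10):
    """Kernel perceptron: raise a point's weight whenever it is misclassified."""
    alphas = [0] * len(points)
    for _ in range(epochs):
        mistakes = 0
        for j, (x, y) in enumerate(zip(points, labels)):
            if y * score(x, points, labels, alphas, kernel) <= 0:
                alphas[j] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return alphas


def predict(x, points, labels, alphas, kernel):
    return 1 if score(x, points, labels, alphas, kernel) > 0 else -1


# XOR-like problem: not separable by any straight line through the origin.
points = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
labels = [1, -1, -1, 1]

alphas_poly = train(points, labels, poly_kernel)
pred_poly = [predict(p, points, labels, alphas_poly, poly_kernel) for p in points]

alphas_lin = train(points, labels, linear_kernel)
pred_lin = [predict(p, points, labels, alphas_lin, linear_kernel) for p in points]
```

With the polynomial kernel the learner reproduces all four labels, whereas the same procedure with the linear kernel cannot. The only difference between the two runs is the kernel that is plugged in.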

[45] For simplicity, the terminology chosen is the one commonly used in SVMs. Regardless, the principles remain the same.
An advantage of kernel methods is that they do not restrict their arguments to vector-type data. Instead, they may be defined on almost any type of data (Bloehdorn and Sure 2007, Gärtner 2003). As a result, they can be applied directly to heterogeneous and interconnected data without the need to convert these data to vectors. A related advantage of kernels is their ability to simplify complex learning problems by mapping them to less-complex learning problems. Moreover, the SVM, the field's most well-known method, is one of the most successful recent developments in ML. In fact, it has been shown to effectively solve learning problems which were left unsolved by ILP, a field with many more years of experience. However, most of the success stories on kernel methods stem from academic research. So far, very little experience has been gained outside of those confines. Due to the field's current popularity, however, this disadvantage may eventually be nullified.
C.3.1 Kernels for Structured Data
There are two common approaches when developing kernels for structured data. These approaches concern the decision to use either model-driven or syntax-driven kernels (Gärtner 2003). Here, the former is often applied when either background knowledge or states[46] are of importance. In contrast, the latter emphasizes the semantics within a data set. These kernels typically involve rules, trees, and graphs.
Kernels developed for learning on the SW often fall into the group of Graph Kernels. Each of these graph kernels constitutes a function that translates a graph to an element whose similarity to other elements can easily be computed. Unfortunately, this translation step is generally rather resource expensive, especially when isomorphism[47] has to be taken into account (Gärtner 2003, Borgwardt, Schraudolph and Vishwanathan 2006, Vishwanathan, et al. 2010). A compromise is to use approximate kernels instead. One well-known and simple example of these is the Random-Walk Graph kernel, which computes the translation of a graph by randomly walking over its vertices. Two graphs $G$ and $G'$ can subsequently be checked for similarity by comparing their corresponding random walks.
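
The idea of comparing walks can be sketched in Python. Rather than sampling walks at random, the sketch below counts all pairs of walks with identical label sequences up to a fixed length, using the direct product of the two graphs; this deterministic, truncated variant is chosen for illustration and is not claimed to be the exact formulation used in the cited works. The graph encoding, the `decay` weighting, and the example labels are assumptions.

```python
from itertools import product


def walk_kernel(g1, g2, max_len=3, decay=0.5):
    """Truncated walk kernel between two vertex-labelled graphs.

    Each graph is a pair (labels, adjacency): labels maps a vertex to its
    label, adjacency maps a vertex to a list of neighbouring vertices.
    Walks of length n contribute with weight decay**n.
    """
    labels1, adj1 = g1
    labels2, adj2 = g2
    # Vertices of the direct product graph: vertex pairs with equal labels.
    nodes = [(u, v) for u, v in product(labels1, labels2)
             if labels1[u] == labels2[v]]
    node_set = set(nodes)
    # counts[p]: number of common walks of the current length ending in p.
    counts = {p: 1 for p in nodes}
    total = float(len(nodes))  # walks of length zero
    weight = 1.0
    for _ in range(max_len):
        weight *= decay
        nxt = {p: 0 for p in nodes}
        for (u, v), c in counts.items():
            for u2 in adj1[u]:
                for v2 in adj2[v]:
                    if (u2, v2) in node_set:
                        nxt[(u2, v2)] += c
        counts = nxt
        total += weight * sum(counts.values())
    return total


# Hypothetical toy graphs: a site linked to two finds, and a site with one find.
g_a = ({0: "site", 1: "find", 2: "find"}, {0: [1, 2], 1: [0], 2: [0]})
g_b = ({0: "site", 1: "find"}, {0: [1], 1: [0]})
```

A larger value indicates that the two graphs share more label-matching walks. The nested loops over neighbours also show why the cost grows quickly with graph size, which is the motivation for approximate kernels.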
Another kind of kernel that is appropriate for learning from the SW is the Clause Kernel (Bicer, Tran and Gossen 2011). Each clause kernel contains an ILP clause which corresponds to a feature of the corresponding RDF graph. This approach adds the advantage of dynamically defining kernels based on the likelihood of them being able to explain a set of provided examples. Furthermore, several of these kernels may subsequently be combined into one composite kernel, thereby improving efficiency. In addition, combining kernels provides resilience to sparse data.
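
Combining several kernels into one composite kernel can be sketched as a weighted sum. The sketch assumes, hypothetically, that a clause is modelled as a Boolean predicate over the description of an entity and that its kernel is 1 when both entities satisfy the clause; this is a simplification for illustration and not necessarily the definition used by Bicer, Tran and Gossen. All names and example data are invented.

```python
def clause_kernel(clause):
    """Kernel induced by one clause: 1.0 if both entities satisfy it."""
    def k(e1, e2):
        return 1.0 if clause(e1) and clause(e2) else 0.0
    return k


def composite_kernel(kernels, weights=None):
    """Weighted sum of kernels; non-negative weights keep it a valid kernel."""
    if weights is None:
        weights = [1.0] * len(kernels)

    def k(e1, e2):
        return sum(w * kern(e1, e2) for w, kern in zip(weights, kernels))
    return k


# Entities described as sets of (predicate, object) pairs (hypothetical data).
anomaly_1 = {("a", "WaterWell"), ("hasRadius", "60")}
anomaly_2 = {("a", "WaterWell")}


def is_well(entity):
    return ("a", "WaterWell") in entity


def has_radius(entity):
    return any(pred == "hasRadius" for pred, _ in entity)


k = composite_kernel([clause_kernel(is_well), clause_kernel(has_radius)])
```

Here `anomaly_2` lacks a radius, yet the composite kernel still reports some similarity to `anomaly_1` through the shared type. This illustrates, in a small way, how combining kernels can compensate for sparse descriptions.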

[46] A state involves a snapshot of a data set at a specific time. Applying an operation, e.g. add or remove, to a state results in a new state.
[47] Isomorphic graphs differ only in the enumeration of their vertices.
Appendix D Sample of Archaeological Scenarios
Several junior and senior researchers, as well as a couple of Ph.D. candidates, were invited to participate in a brainstorming session at the Faculty of Archaeology of Leiden University as well as at the VU University Amsterdam. Those participating were asked to write down an archaeological research scenario which they deemed difficult with the current information infrastructure. In addition, participants were instructed to assume that all required data were readily available within this infrastructure.
The following scenarios were submitted (translated from Dutch):
1) The need to retrieve all literature about individual finds from very (often obscure) local journals about a particular excavation (i.e. the Late Bronze Age settlement in Bovenkarspel).
2) The need for information on plants which are useful to humans, as food, oil, medicine or material (for creating ropes, buckets etc.), found in Holocene contexts dated to late prehistory (i.e. Late Neolithic, Bronze Age and Iron Age).
3) All publications on excavated settlements which are dated to the Bronze Age and located on a specific geomorphological unit, in an area from Denmark to Northern France.
4) All images (images and drawings) with metadata about the dating and archaeological context of prehistoric traps made of willow. In addition, a list of persons who researched these artefact types.
5) Overview of all Neolithic axes made of stone, and in particular flint, found in a Roman context (i.e. found during excavations, not individual finds).
6) The reliability and usefulness of C14 dating depends on a variety of factors (i.e. stratigraphical position, soil disturbances, type of sample etc.). This information is noted on separate sample forms. It would be very useful if this information were accessible at the level of individual C14 identifiers (e.g. www.lumid.nl provides TL information, but not as LOD).
7) Functionality to retrieve all the GIS data from a specific archaeological area (e.g. all GIS data from Roman excavations on the Kops Plateau in Nijmegen).
8) All published radiometrically dated (in years) material (with id numbers) on Neanderthal sites in France.
9) All types of arrowheads dated to the Middle Neolithic period.
10) All characteristics of hand axes from the South East of the Netherlands.
11) All information on the origins of the Levallois technique (site locations, X,Y coordinates).
12) Information on the influence of the pH value of soils on the conservation conditions of charcoal.
13) All archaeological contexts in which broken flint axes dated to the Middle Neolithic period are found (i.e. settlements, funerary, individual finds etc.).
14) All sources, ethnographic, historic and archaeological, in which land clearance by fire is mentioned with regard to its influence on the landscape vegetation.
15) Functionality to automatically search for keyword synonyms (for example in Google Scholar).