Citation: Mitxelena-Hoyos, O.; Amaro-Mellado, J.-L. A Comparison of Cartographic and Toponymic Databases in a Multilingual Environment: A Methodology for Detecting Redundancies Using ETL and GIS Tools. ISPRS Int. J. Geo-Inf. 2023, 12, 70. https://doi.org/10.3390/ijgi12020070
Academic Editors: Florian Hruby and Wolfgang Kainz
Received: 1 December 2022
Revised: 5 February 2023
Accepted: 16 February 2023
Published: 18 February 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
International Journal of Geo-Information
Article
A Comparison of Cartographic and Toponymic Databases in a Multilingual Environment: A Methodology for Detecting Redundancies Using ETL and GIS Tools
Oihana Mitxelena-Hoyos 1,2 and José-Lázaro Amaro-Mellado 3,4,*
1 Instituto Geográfico Nacional-Servicio Regional en Cantabria-País Vasco, 20010 San Sebastian, Spain
2 Escuela de Ingeniería de Gipuzkoa, UPV-EHU, 20018 San Sebastian, Spain
3 Departamento de Ingeniería Gráfica, Universidad de Sevilla, 41092 Seville, Spain
4 Instituto Geográfico Nacional-Servicio Regional en Andalucía, 41013 Seville, Spain
* Correspondence: jama [email protected]
Abstract: Toponymy, a transversal discipline of geography, linguistics, and history, finds one of its main supports in cartography. Due to exhaustiveness on the territory, cadastral cartography and its toponymy have the ideal characteristics to develop systematic geographical analyses. Moreover, cadastre and geographical names are part of the geographic reference data according to Annex 1 of the INSPIRE directive. This work presents the design, implementation, and application of a methodology based on Geographic Information Systems and Extract, Transform, and Load (ETL) tools for detecting coincidences between the cadastral geoinformation and the official gazetteer corresponding to the province of Gipuzkoa, Spain. Methodologically, this study proposes a solution to the issues raised by bilingualism in the study area. This problem is approached a priori, in the previous data treatment, and a posteriori, applying semantic criteria. The results show a match between the datasets of close to 40%. In this way, the uniqueness and richness of the analyzed source and its outstanding contribution to the potential integration of the official toponymic corpus are evidenced.
Keywords: gazetteer; cartography; cadastre; GIS; tools; toponymy; place name
1. Introduction
Landscape, cartography, and toponymy are intertwined because of their links to territories and human activity. All three involve ways of understanding and describing the organization of the space where humans carry out their activities [1–3]. On the one hand, a geographical name has the primary function of naming a place and acting, at the same time, as a semantic and locating function [4,5]. On the other hand, the analysis of the nature of its formation and the determining factors of its survival allow it to become a piece of the landscape archaeology [6], since they bring past realities to today. This patrimonial function of place names can also be observed in a sociolinguistic aspect, especially in regions where contact between languages has given rise to different forms of multilingualism over time [7]. In this context, we must also emphasize that the transversal nature of toponymy makes it necessary to maintain a multidisciplinary perspective in its treatment.
At the same time, cartography is an unbeatable tool for the study, standardization, and dissemination of toponymy, taking advantage of the geographical component of the place name with which the human groups that inhabit, or have inhabited, a territory know and designate it [4]. Cadastral cartography has specific characteristics that place it in a special/peculiar position concerning its toponymic content. According to González García [8], cadastral information is endowed with certain characteristics with respect to the territory due to its status as an exhaustive census record. Since it reflects all the properties on the ground and is continuously maintained, it also facilitates the knowledge of geometric or other variations in the territory. These characteristics are ideal for combining in studies based on territorial analysis.
Additionally, cadastral information is a repository of geographic and toponymic knowledge. Both the cadastre and geographical names are part of the core (reference data) of the geoinformation of territory, since they appear in Annex I of the INSPIRE Directive 2007/2/EC, “which means that they are considered as reference data, i.e., data that constitute the spatial frame for recognizing geographical location” [9]. Likewise, the Spanish legislation, Law 14/2010 of 5 July, on infrastructure and geographic information services in Spain (LISIGE) establishes that both the Official Gazetteers and cadastral geographic information belong to the Geographic Reference Information (GRI) [10]. This phenomenon is so important that the cadastral parcel has become the working and reference unit for the other GRI features.
Furthermore, the geospatial data-integration capability of Geographic Information Systems (GIS) is beyond doubt. These tools can be used to work in all fields where the location is relevant and are especially useful in multidisciplinary environments [11]. Alternatively, when the volume of data is huge or repetitive geographic processes are to be performed, Extract, Transform, and Load (ETL) tools become extremely useful [12–14].
This work aims to improve the efficiency of integrating toponymic data from different geoinformation sources, using a semi-automatic methodology that minimizes the effect that multilingualism produces in detecting data redundancy. Consequently, this research offers a methodology for the synchronization of different toponymic databases. To this end, we use GIS and ETL tools to integrate place names from official sources of different levels of administration, in the geographical area of the historical territory of Gipuzkoa, Basque Country (País Vasco or Euskadi or CAPV or CAE), Spain. The databases used have different characteristics in terms of the scale of capture and processing of place names, degree of linguistic standardization of the names, as well as classification into typologies. In this context, there is a possibility of innovation by incorporating foral cadastral cartography in these tasks.
The closest equivalent work, both geographically and conceptually, is that carried out by the Autonomous Community of Andalusia [15], where the integration of cadastral toponymy in its Official Gazetteer (NGA) was addressed with very favorable results [16]. Another close piece of research is the semi-automatic collation work planned for the improvement of the Basic Geographic Gazetteer of Spain (NGBE) [17] called “autocorrection.” This gazetteer is handled by the Spanish National Mapping Agency, Instituto Geográfico Nacional (IGN). As a result, concordance between elements of the NGBE and the Official Geographic Gazetteer of the Autonomous Community of the Basque Country (NGO-CAE) was found. These two cases serve as a reference to contrast the suitability of the proposed methodology.
The method developed aims to evaluate the complementarity of the datasets to discover their original value and avoid redundancies when working with them jointly. The FME extraction capability has been used in the initial source processing work and the search for combined textual and localization matches. For the design issues of the lexeme-based semantic validation of the results [18], as well as the manual evaluation of the method on a sample of the data, the capabilities of GIS tools have been used. Concerning this research, the study presented here is innovative because it considers and anticipates specific aspects derived from bilingualism, since Basque (or Euskera) and Spanish languages are co-official in Gipuzkoa.
2. Related Work
This section presents the relevant publications related to this toponymy investigation from different points of view, such as cartography, GIS, cadastre, and multilingualism. Given that our research is strongly multidisciplinary, some of the cited works could appear (or have appeared) in different subsections.
2.1. On Place Names, Cartography, and GIS
Cartography is a form of representation of a synthesized geographic space, and toponymy identifies its different elements. The place names on a map allow the identification of the geographical elements represented. In addition, as a graphic element of the map, the label itself becomes a vehicle for transmitting information about the element through the visual variables of the field of graphic semiology [19,20]. As geographic information, it is inevitable to conclude that the most efficient way to manage this dataset is the usage of Geographic Information Systems. Therefore, the design and implementation of the GIS project are crucial to respond to current requirements [21].
Regarding the localizing function of toponymy, different works prove this capacity. Thus, Frajer and Fiedor made up a toponymic GIS to find out extinct water bodies [22]; Gordova et al. deal with geographical names to solve historic-geographical issues and created regional atlases [23]. Other researchers have assessed place names as geographical information tools [24]. Giraut and Houssay-Holzschuch suggested a theoretical framework in order to analyze place naming regarding geopolitics and power relations [25].
On the one hand, the perspective of physical geography has been widely portrayed through place names [26]. On the other hand, different works interrelate the names of different entities that represent types of structuring landscape elements [27], such as roads [28], hydrology [22], and constructions, whether traditional [29] or urban [30]. All this does not undermine the fourth dimension, the temporal aspect present both in the historical testimony [31,32] and in the dynamism of the landscape, reflected in the labeling of maps [33] through the diachronic perspective in historical research [34]. Furthermore, with this temporal component, we cite its localizing capacity [35]. On the other hand, from individual work to distributed and networked systems, the way of accessing and exploiting geographic information, including cadastral information, has radically changed [36].
In addition, ETL assists in the management of different databases by adding power to the analysis. These tools also facilitate the preparation of the data for analysis while providing the possibility of optimizing the databases’ structure [37], both for the cadastre and the other geoinformation sets of the INSPIRE directive [38].
With regard to the toponymy treatment, some works point out progress in applying new technologies. Among others, Conedera et al. prove that place names can also help find out past land uses, considering different, phonetically alike dialect options using a GIS [39]; Pijet-Migón and Migón examined the link between geoheritage and cultural heritage [40]. Furthermore, Blaschke et al. investigated how linguistic and cultural settings could affect the perception of place using a multi-language approach and a GIS [41]. Finally, Serikova and Baishukurova exposed an example of how Google Maps, Apple Maps, and Yandex can contribute to improving the exploration and identification of geographical names in Kazakhstan [42].
Moreover, toponym-comparison operations have been extensively analyzed for their use in search engines. In the case of the Canary Islands (Spain), the implementation of these solutions in GIS environments has been further developed [43]. Additionally, a general idea of the complete panorama of toponymy in Spain can be found in Gordón-Peral [44], as well as in the periodical conferences of the Commission on Geographical Names (Comisión Especializada de Nombres Geográficos). These works give brief indications of the sources used in inventory and research work, but do not go into procedures in any depth.
Finally, it is relevant to mention the management of toponymic gazetteer banks, both historical [45] and gazetteers of place names, in the frame of the European Union. Interoperability between gazetteers has significant advantages but also brings challenges with issues such as “localization, non-univocity, classification, level of detail or other data” [46].
2.2. On Gazetteer Synchronization
Attending to the digital treatment of geographical names as text strings, new technologies overcome challenges, such as those proposed by Deng et al., where the Levenshtein distance is employed to compute the name similarity to conduct a geographical data integration in China [36]. On the other hand, toponymy can also be treated with natural language processing (NLP) technology [37,38]. Yan et al. made a deep neural network to consider both local and global features to deal with geocoding [39]. Martins employed machine-learning techniques to detect duplicated records in digital gazetteers [40].
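As an illustration of the string-similarity idea mentioned above, the following is a minimal Python sketch of the Levenshtein distance and a normalized similarity score. It is not the implementation used by Deng et al. or in this study; the function names and the normalization by the longer string are assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def name_similarity(a: str, b: str) -> float:
    """Hypothetical helper: 1.0 for identical names (ignoring case), lower otherwise."""
    a, b = a.casefold().strip(), b.casefold().strip()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


print(levenshtein("Oñati", "Oñate"))  # 1
```

A threshold on such a score is one possible way to flag candidate duplicates between two name lists; the appropriate threshold depends on the languages and spelling conventions involved.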
These advances have a direct application in the synchronization of toponymic databases. In the case of the IGN (Spain), the synchronization of the state database (NGBE) with the regional databases has been tackled. This issue has been solved with their collaboration, using semi-automatic methods. In the international sphere, an example of the integration of large-scale databases is the United States federal gazetteer. It is overseen by the United States Board on Geographic Names, and it combines local and state sources in its composition (https://www.usgs.gov/us-board-on-geographic-names; accessed on 2 February 2023).
A success story of detecting redundancies due to crowd-source contributions is reported from Indonesia. Gelernter et al. [47] address the same issue, comparing a created fuzzy match algorithm using machine learning (SVM). The latter checks both approximate spelling and approximate geocoding in order to find duplicates between the crowd-sourced tags and the gazetteer. In the Swiss case, two methods for removing duplicates are compared: the first method is based on rule-based matching, while the second applies machine learning using Random Forests [48].
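The combination of approximate spelling and approximate location can be sketched with a simple rule-based check. The Python example below is only an illustration of that idea: it is neither the SVM approach of Gelernter et al. nor the Random Forest approach of the Swiss case, and the record layout, function names, and both thresholds (`min_ratio`, `max_km`) are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def is_duplicate_candidate(rec_a, rec_b, min_ratio=0.85, max_km=0.5):
    """Flag two records when their names are similar AND their positions are close.

    Records are plain dicts with hypothetical keys "name", "lat", and "lon".
    """
    ratio = SequenceMatcher(
        None, rec_a["name"].casefold(), rec_b["name"].casefold()
    ).ratio()
    distance = haversine_km(rec_a["lat"], rec_a["lon"], rec_b["lat"], rec_b["lon"])
    return ratio >= min_ratio and distance <= max_km


# Invented sample records, for illustration only.
a = {"name": "Zabaleta", "lat": 43.000, "lon": -2.0}
b = {"name": "zabaleta", "lat": 43.001, "lon": -2.0}
print(is_duplicate_candidate(a, b))  # True
```

Requiring both conditions keeps homonymous places in different locations apart, at the cost of missing true duplicates whose coordinates were captured far from each other.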
2.3. On Cadastral Information and Cadastral Place Names
Certainly, cadastre has been understood in many ways, and in each territory has had its evolution, giving rise to studies on historical cadastre [49]. On the other hand, administrations have taken the path of standardization in this area [50], partly due to the development of technologies that make it possible [51] to move towards e-governance. Attempts to harmonize administrations have been crystallized in the ISO standard 19152 Land Administration Domain Model (LADM) [52], which in its work reflects the advances in data integration in Colombia. In the same way, synergies arise between different lines of technological development so that cadastral information is aided in the 3D concept supported by BIM [53]. In the case of Spain, the reform of the cadastral legislation by Law 13/2015, of 24 June, on the Reform of the Mortgage Law [54] has been a challenge for the administration and an opportunity for technical and conceptual improvement [55]. In addition to the works that study the cadastre, some investigations apply it as a tool for the study of other fields [56], such as urban planning [57] or historical studies [58].
Place names contained in cadastral documentation, understood as geoinformation, appear in numerous works as a vehicle for interpreting space. For example, one of the first cadastres in Spain dates from the reign of Fernando VI, the year 1749, called “Cadastre of the Marquis de la Ensenada”, which was carried out in several places of the Kingdom of Castile, but not in the Basque Country because it was exempt from taxes [59]. On this dataset of great historical value, we can find lines of research under the linguistic prism [60], physical geography [61], botanical interpretation [62], and different applications through GIS related to the territory [63,64]. Specifically, cadastral toponymy can be helpful in the study of urban dynamics [65]. In another order, Pearn analyzes the relationship between the cadastral cartographic execution itself and place names [66]. Furthermore, there are studies dealing with the analysis of the authenticity of data recorded in cadastral databases [67].
Regarding toponymy gathered from cadastral information, academic documentation of the integration process is not frequent, but is an implicit part of the genesis of various datasets. Next, some examples of Spanish regions are presented. In the case of Aragon, according to the specifications of its gazetteer 2019 [68], the cadastre database has served as a source for the creation of the Geographic Gazetteer of Aragon, providing a large number of toponyms. In the creation of other regional databases, the cadastre is not cited. Yet the parcel structure of the cadastre is used as an element in the genesis of the place names. Specifically, Galicia has a very large volume of toponyms, partly explained by the smallholding in land distribution [69]. This same element also appears in the case of the Balearic Islands. There, despite their systematic work based on field collection beyond the integration of cartographic sources, it is emphasized that the usual unit of nomination is the estate [70].
The toponymy of the cadastre has been taken into account in the preparation of local studies of toponymy, as in the case of San Martín de Unx, Navarra [71]. Even so, it has not been approached systematically throughout the territory, which is the aim of our study.
2.4. On Multilingualism
The geographic scope in which the study is contextualized includes a coexistence of languages, among which official bilingualism is found. Some authors prefer the term “languages in contact” to encompass all possible situations [72]. The current sociolinguistic context witnesses the presence of autonomous territoriality in which, in addition to Spanish and Basque, other languages of very different cultural origins coexist [73]. The political and identity issues related to multilingualism in place names have been studied by Jordan et al. [74] and Mácha et al. [7]. Moreover, the study of European policies and legislation is discussed by Ruiz Vieytez [75].
A key aspect of understanding the institutional management of bilingualism is the distance between coexisting languages [76]. In the case of Basque in its relation to Spanish, it does not enable semantic transparency between the languages. In this sense, we should remark on the emerging mainstream related to the study of the linguistic landscape [77], in environments of linguistic coexistence, especially in asymmetric situations [78], which deserves special attention. It even allows highlighting such asymmetry in urban space [79]. The contribution of linguistic varieties to forming an individual or group identity is explored in Rugkhapan [80], who makes special mention of the distance between Cartesian cartography and popular perception, official language, and everyday parlance.
Although linguistic standardization is a cutting-edge topic, the application of standardization in toponymy requires bearing in mind aspects related to its testimonial character [81]. In this way, “research is valued as a previous and essential step in toponymic standardization” [82], even more so in the case of Euskera, where the process of linguistic standardization continues to the present day.
3. Study Area
The geographical scope of this study is the province of Gipuzkoa (Figure 1), the smallest of the three territories of the CAPV (or CAE) in extension, approximately 1980 km² (https://ssweb.seap.minhap.es/REL/; accessed on 20 November 2022). This province has a population of 716,616 inhabitants (inh), which implies a population density of about 362 inh/km², slightly higher than that of its autonomous community, 302 inh/km² in 2022 (https://www.eustat.eus; accessed on 20 November 2022). This region’s density is more than three times that of Spain, 94 inh/km² (https://www.ine.es; accessed on 20 November 2022), representing a high relative rate and leading to high pressure on the territory. In addition, Gipuzkoa has an irregular population distribution, depending on factors such as physical geography or industrial development. Specifically, the coastal regions account for three out of every four inhabitants [83]. Similarly, they are on the border of continental Europe, which is also a key factor [84].
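The density figures quoted above can be checked with a few lines of arithmetic. The short Python sketch below uses only the values given in the text; the variable names are chosen for the example.

```python
population = 716_616        # inhabitants of Gipuzkoa, as quoted in the text
area_km2 = 1980             # approximate area in km², as quoted in the text
density_basque_country = 302  # inh/km², as quoted in the text
density_spain = 94            # inh/km², as quoted in the text

density = population / area_km2
print(round(density))  # 362

# Ratios of the provincial density to the regional and national figures.
ratio_to_region = density / density_basque_country
ratio_to_spain = density / density_spain
print(round(ratio_to_region, 2), round(ratio_to_spain, 2))
```

Both the provincial and the regional densities exceed three times the national value, which is consistent with the comparison made in the paragraph.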
Its regional organization is arranged according to the hydrographic basins of river valleys, most of which flow into the Cantabrian Sea, adopting different aspects of joint management of municipal services and collaborating in the structure of the territory [85]. Even though the secondary sector has a great weight, with the rise of the paper, textile, and, in part, iron and steel sectors, the area used for primary activities constitutes more than 60% of the total territory [86]. This last point is essential to understanding and justifying the study of the cultural heritage that toponymy represents, in any of its approaches, especially given its capacity to conserve the landscape information of the past [87].
Figure 1. The geographical extent of the research (province of Gipuzkoa, Spain). Source: Own elaboration from www.ign.es (accessed on 22 November 2022). Frame coordinates in km.
For the linguistic aspect, the Spanish Constitution [88] establishes a situation of co-officiality of Spanish and Basque throughout the autonomous community of the Basque Country. Furthermore, the Basque Government studies sociolinguistic dynamics through statistical operations such as the Sociolinguistic Survey or the Sociolinguistic Map. They show that 52.6% of the population knows Basque [89]. Despite this being the current situation, it should not be forgotten that the toponymy of any place is the result of multiple functional languages that have occurred throughout history, maintaining fossilized linguistic elements characteristic of past times [90].
4. Materials and Methods
In this section, the data used, the software utilized, the description of the process, and the validation strategy are explained.
4.1. Toponymic Geodata
4.1.1. Provincial Council (Gipuzkoa)
The authority responsible for the management of geographic information in the study area is the Gipuzkoa Provincial Council, Diputación Foral de Gipuzkoa (DFG). Nevertheless, it has not developed a vocational toponymic database; it makes use of toponyms so that they can represent toponymic inventories. Its best-known geoinformation source, its 1:5000 cartography, shares a toponymic survey with the regional database, so a high correlation is expected. However, in this work, the source of information we want to give prominence to is the cartography of the Provincial Council’s foral cadastre. It should be taken into account that the Cadastre of the Foral Deputations of the Basque Country, as well as that of the Foral (Autonomous) Community of Navarre, is outside the Spanish Common Regime territory [59], having had a separate trajectory historically with different data models. In addition to the characteristics of cadastral information, these data offer the particularity that they have not yet been systematically used to compile the official gazetteers. Hence, the contribution is original, expecting a lower correlation than among the rest of the spatial datasets. The cadastral information of the DFG is available online as described below: the parcel is divided into rustic and urban; on the one hand, there is a graphic description of the parcel (https://www.gipuzkoa.eus/es/web/ogasuna/catastro/informacion-general; accessed on 29 November 2022); on the other hand, there is an alphanumeric description (https://www.gipuzkoairekia.eus; accessed on 29 November 2022).
4.1.2. Autonomous Community (Basque Country)
The toponymic corpus of the Basque Government acquires its current status through the Decree 179/2019 of 19 November, “on the standardization of the institutional and administrative use of the official languages in the local institutions of Euskadi” [91] under the protection of the European Charter for Regional or Minority Languages [92] (instrument of ratification by Spain, “Spanish Official Bulletin”, Boletín Oficial del Estado, of 15 September 2001 [93]). This is because, as indicated in its preamble, under the UNESCO Atlas of the World’s Languages in Danger, Basque is a minority and vulnerable language in its own territory and is in an asymmetrical situation concerning the other official language, namely, Spanish. The fifth chapter of the aforementioned regulation deals with municipal toponymy. It pays particular attention and respect to the distribution of competencies and responsibilities of different public authorities, including those of the Academy of the Basque Language, Euskaltzaindia. In addition, the NGO-CAE is created as a public register, attached to the department with competencies in matters of linguistic normalization of the Basque Government, in which the official place names of the Basque Country will be registered.
The precursor of this database is the agglutination of different works of collection and processing of place names, which have been deployed since the 1990s [94]. These have been carried out on the initiative of the autonomous administration in its territorial scope and, additionally, through different local projects that have been complementing, improving, and updating the original database. The autonomous government itself has used different measures to boost this local work. Together with regional and local contributions, this corpus has a linguistic vocation that prevails over the geographic component. Thus, the intended use of this dataset within the institution’s cartography grants it the nature of geoinformation in all its functionality. This research adopts the compact version of the NGO-CAE (https://www.opendata.euskadi.eus/inicio/; accessed on 29 November 2022), which is materialized in a text file and presents point geometry. This characteristic simplifies the vector overlay geoprocessing and reduces processing time. In case of the need to retrieve the geometry of the features, it is possible to link the “identity” field of this database with the identifier of the toponym collected in the Harmonized Topographic Base, Base Topográfica Armonizada (BTA).
4.1.3. National Level (Spain)
At the national level, the National Cartographic System establishes the National Geographic Gazetteer as part of the National Reference Geographic Equipment, which is made up of the harmonization of the NGBE and the Geographic Gazetteers of each of the Autonomous Communities [95]. Therefore, the NGBE has been completed by the IGN to meet the requirements of the INSPIRE directive and the LISIGE. Although the starting point of the database is the toponymic content of the Mapa Topográfico Nacional, 1:25,000 (MTN25) and the rest of the IGN cartographic series, the maintenance and improvement activity over the last decade has been aimed at coordination between gazetteers. After a process called “autocorrection”, the correction of NGBE toponyms by comparison with the NGO-CAE was addressed through a process with an extensive manual component [17]. Thereby, the strong correlation or convergence between these two databases is recognized in the case of the CAE. On the other hand, this database, managed from a regional perspective, oriented to cover the needs of cartography at a scale of 1:25,000, has a much lower density, so its contribution is limited.
4.2. Software Utilized
Data management and pre-processing, as well as the comparison of their contents, including different linguistic aspects, was performed using ETL tools, specifically FME© Desktop 2021.2 (https://www.safe.com/fme/fme-desktop/; accessed on 30 November 2022). Alternatively, the analysis of the results and the methodology validation were conducted using a GIS tool, QGIS 3.24.0 Tisler; www.qgis.org (accessed on 30 November).
4.3. Processing Overview
The process followed can be summarized graphically using the diagram shown in Figure 2.
4.3.1. Input Data
First of all, in the NGO-CAE a series of operations was performed on the data to keep only the objects of interest, eliminating repeated elements that were due to entries with a large set of attributes that differentiate each instance. This issue is conditioned by the original structure of the Vice-Ministry of Linguistic Policy’s own database. It was also necessary to eliminate the coding of communication routes, which, as a literal system of geographic location, in contrast with toponymy, has a normative and not a cultural origin; therefore, this information was discarded. The input of these data in the ETL processing was through an alphanumeric file (in .CSV format). The comparison field we used contained the specific part of the name to avoid as much as possible the coincidences that would occur if we included the generic part. In this sense, the gathering capacity of the Basque language, which favors the agglutination of lexicalized generics, caused the appearance of coincidences of lexemes specific to toponymic or geographic terminology.
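The pre-processing described above (removing repeated entries and discarding route codes before comparing on the specific part of the name) was carried out in FME. Purely as an illustration, the same logic can be sketched in Python. The column names (`municipality`, `specific`, `generic`, `feature_type`), the route-code pattern, and the sample rows are all assumptions invented for this example and do not reflect the actual NGO-CAE schema.

```python
import csv
import io
import re

# Assumed pattern for communication-route codes such as "GI-2630" or "N-634".
ROAD_CODE = re.compile(r"^[A-Z]{1,3}-\d+$")


def clean_gazetteer(rows):
    """Drop route codes and repeated (municipality, specific name) pairs."""
    seen = set()
    kept = []
    for row in rows:
        name = row["specific"].strip()
        if not name or ROAD_CODE.match(name):
            continue  # normative codes are not toponyms in the cultural sense
        key = (row["municipality"], name.casefold())
        if key in seen:
            continue  # same name repeated with different attributes
        seen.add(key)
        kept.append(row)
    return kept


# Invented sample rows, for illustration only.
SAMPLE_CSV = """municipality,specific,generic,feature_type
059,Zabaleta,baserria,building
059,Zabaleta,baserria,farm
059,GI-2630,,road
056,Sakoneta,hondartza,beach
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = clean_gazetteer(rows)
print([r["specific"] for r in cleaned])  # ['Zabaleta', 'Sakoneta']
```

Keeping the specific and generic parts in separate fields is what allows the comparison to ignore generics; when the generic is agglutinated into the name, as often happens in Basque, a later lexical filter is still needed.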
In the cadastre-DFG database, numeric codes are also inserted together with the toponymy (for example, the portal number of specific parcels). Still, their elimination was incidental since they did not affect the processing. The coding of the parcels in polygons, together with the municipality code, facilitated the union of the geometric element representing the parcel and its alphanumeric attributes, including the place name.
For the comparison of the elements of each origin, for computational efficiency reasons, the centroids of the cadastral parcels (CP) were superimposed on areas of influence around the NGO-CAE place names. For areas where the parcel extension was large, it was decided to generate a continuous surface of the Voronoi diagram, as has been used in other recent cartographic works [96–98]; these are usually forestry and pasture areas. In most cases, the municipality owns the land, either as public domain or as patrimonial land.
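The spatial comparison described above can be sketched in plain Python, under stated assumptions: parcels are simple polygons in a projected coordinate system (metres), the area of influence is modeled as a circle of a hypothetical radius around each gazetteer point, and the Voronoi surface is assumed to be built from the gazetteer points, so that locating a centroid in a Voronoi cell is equivalent to finding its nearest point. The actual workflow was implemented in FME and QGIS; the names, radius, and coordinates below are illustrative only.

```python
from math import hypot


def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0  # shoelace term for this edge
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def toponyms_within(centroid, toponyms, radius):
    """Names whose circular area of influence contains the centroid."""
    x, y = centroid
    return [n for n, (tx, ty) in toponyms.items() if hypot(x - tx, y - ty) <= radius]


def nearest_toponym(centroid, toponyms):
    """Closest gazetteer point, i.e., the Voronoi cell the centroid falls in."""
    x, y = centroid
    return min(toponyms, key=lambda n: hypot(x - toponyms[n][0], y - toponyms[n][1]))


# Invented geometry, for illustration only (coordinates in metres).
parcel = [(0, 0), (100, 0), (100, 100), (0, 100)]
gazetteer = {"Zabaleta": (60, 50), "Sakoneta": (400, 400)}

c = polygon_centroid(parcel)
print(c)                                      # (50.0, 50.0)
print(toponyms_within(c, gazetteer, 50))      # ['Zabaleta']
print(nearest_toponym(c, gazetteer))          # Zabaleta
```

Working with centroids reduces each parcel to a single point test, which is the computational saving referred to in the text; the nearest-point rule guarantees that large parcels far from any buffer still receive a candidate name.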
Figure 2. Flowchart with the process undertaken.
Figure 5. Cadastral parcels selected automatically (5C+ matching). Frame coordinates in km.
Table 4 displays the four character results, both supervised (Figure 6) and automatic (Figure 7). As can be seen, the depuration of the results with the LKB has not assured the automatic method’s reliability. Furthermore, as inferred from the last column, some municipalities have commission errors, such as Abaltzisketa or Mutriku. For these reasons, a manual check of this dataset is necessary. Finally, regarding the efficient planning of the process, it should be emphasized that if from the total of 4C combinations found, we ex-
Figure 6. Cadastral parcels selected by supervision (4C and 5C+ matching). Frame coordinates in km.
Figure 7. Cadastral parcels selected automatically (4C and 5C+ matching). Frame coordinates in km.
Finally, to derive a quantitative assessment of the quality of the process, Table 5 reports the results of quantifying the discrepancies between the automatic method and its manual verification (Figure 8). As a result, a rate of errors of omission and errors of commission is obtained for the sample. As explained above, semantic matching on the sample has been used to demonstrate the reliability of the matches found automatically. In the case of errors of omission, a rate of 5.7% was observed. This value is calculated on the named parcels of the sample space. Similarly, the commission error rate is 1.7%. This percentage can be modulated by applying the lexical filter more or less restrictively, but reducing the rate of errors of commission inevitably means raising the rate of errors of omission.
Table 5. Revision of errors of omission and commission in the sampled municipalities.
| Administrative Unit | CP with a Name | Total Manual Significant Match CP | Total Auto + Semantic CP Selection | Omission Occurrence CP | Commission Occurrence CP |
|---|---|---|---|---|---|
| Abaltzisketa (001) | 508 | 180 | 156 | 18 | 3 |
| Mutriku (056) | 556 | 346 | 258 | 25 | 23 |
| Ordizia (079) | 111 | 21 | 24 | 0 | 0 |
| Oñati (059) | 2661 | 1045 | 993 | 176 | 39 |
| % Total (from Total CP with manual significant match) | | | | 13.8% | 4.1% |
| % Total (from Total CP with a name) | | | | 5.7% | 1.7% |
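The two percentage rows of Table 5 follow directly from the per-municipality counts. The short sketch below recomputes them; the dictionary layout and the helper `pct` are illustrative choices, while the counts are copied from the table.

```python
# Counts copied from Table 5:
# (CP with a name, manual significant matches, omissions, commissions)
table5 = {
    "Abaltzisketa": (508, 180, 18, 3),
    "Mutriku": (556, 346, 25, 23),
    "Ordizia": (111, 21, 0, 0),
    "Oñati": (2661, 1045, 176, 39),
}

named = sum(row[0] for row in table5.values())
manual = sum(row[1] for row in table5.values())
omissions = sum(row[2] for row in table5.values())
commissions = sum(row[3] for row in table5.values())

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

rates = {
    "omission_vs_manual": pct(omissions, manual),
    "commission_vs_manual": pct(commissions, manual),
    "omission_vs_named": pct(omissions, named),
    "commission_vs_named": pct(commissions, named),
}
print(rates)
```

The denominators matter: the same 219 omissions give 13.8% against the manually confirmed matches and 5.7% against all named parcels.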
To illustrate this issue with an example, we use the adjectives "zabal" (wide) and "sakon" (deep) in the sample of the municipality of Oñati. For the first of them (zabal), this coincidence is automatically detected 32 times. After manual supervision, 19 elements are identified as significant in concordance with the specific; on the contrary, 13 are not. Because of these results, it is concluded that the studied substring is not a discriminating element in the consistency of proper names and is not applied in the filtering. Conversely, 18 automatic matches are found with the adjective "sakon", and the manual review reveals that only two are inconsistent. In this case, deciding to include this lexeme as a discriminator depends on how conservative we want to be with commission errors.
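The decision for each lexeme can be read as a precision check: the share of automatic matches that manual supervision confirms. The sketch below uses the counts given for Oñati; the function name and the two cut-off values (0.85 and 0.90) are hypothetical and only show how the choice shifts with the tolerance for commission errors.

```python
# (automatic matches, matches confirmed by manual supervision), from the Oñati sample
counts = {"zabal": (32, 19), "sakon": (18, 16)}

def lexeme_precision(auto_matches, confirmed):
    """Share of automatic matches confirmed as significant."""
    return confirmed / auto_matches

def discriminators(counts, min_precision):
    """Lexemes whose precision reaches the (hypothetical) cut-off."""
    return sorted(
        lex for lex, (auto, ok) in counts.items()
        if lexeme_precision(auto, ok) >= min_precision
    )

lenient = discriminators(counts, min_precision=0.85)  # assumed cut-off
strict = discriminators(counts, min_precision=0.90)   # assumed cut-off
print(lenient, strict)
```

With these counts, "zabal" is confirmed in about 59% of cases and "sakon" in about 89%, so "sakon" passes the lenient cut-off and fails the strict one.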
To conclude, after checking the method's validity, Table 6 summarizes the results derived from applying the method to the universe of discourse (Figure 9). It presents the occurrence rate detected by the two methods for the sample and by the automatic method for the whole province of Gipuzkoa. Out of the 45,326 named cadastral parcels (CP), 18,179 (40%) have some element that is partially or totally coincident with the NGO-CAE.
Table 6. Results obtained after the application of the automatic process in the total population (on the parcels of the entire municipality).
| Administrative Unit | CP with a Name | Total Manual Significant Match CP | Total Auto + Semantic CP Selection | Auto + Semantic % from Significant Match CP | Auto + Semantic % from Significant with a Name |
|---|---|---|---|---|---|
| Abaltzisketa (001) | 508 | 180 | 156 | 86.6% | 30.7% |
| Mutriku (056) | 556 | 346 | 258 | 74.5% | 46.4% |
| Ordizia (079) | 111 | 21 | 24 | 114.3% | 21.6% |
| Oñati (059) | 2661 | 1045 | 993 | 96.0% | 37.3% |
| Sample average | | | | 89.9% | 37.7% |
| GIPUZKOA | 45,326 | | 18,179 | | 40.1% |
Figure 8. Omission and commission errors in the selected cadastral parcels. Frame coordinates in km.
Figure 9. Results from the proposed method for the whole province of Gipuzkoa. Frame coordinates in km.
6. Discussion
In this work, we have designed a process using ETL and GIS tools to assess the redundancy between two groups of toponymic data originated from geoinformation of different natures. The objective is their integration and joint use in subsequent spatial analysis works. In pursuing this redundancy, the criteria of location and textual coincidence have been applied, as well as the semantic criteria. The methodology has been applied, providing absolute and relative values that allow us to positively evaluate its usefulness and the use of the sources, as argued below.
Firstly, we have designed an automatic process, which has attained a concordance of 40% of coincidences in the texts of the five-character names. The result is susceptible to improvement with semi-automatic processes, such as including the results of four-character matches, which would reduce the 5% value of errors of omission. Redundancy can also be interpreted in a complementary way; it can be estimated that 60% of the toponymic corpus is susceptible to being integrated into the NGO-CAE. Apart from considerations about the feasibility and provenance of this task that we will discuss below, we can compare these results with the analogous work carried out by the Andalusian Regional Government [16].
While the contribution of the cadastral toponymy to the Andalusian Gazetteer (NGA) accounts for 18.7% of elements, with the considerations presented below, we can calculate that the contribution to the NGO on the territory of Gipuzkoa could reach up to 46.9%. This estimate is derived by considering that the 18,179 concordances of the result (which represent 40%) have been established with 7850 elements of the NGO-CAE, which is 31.3% of the gazetteer. This is explained by the fact that there is also an internal redundancy or correlation between the names of the parcels. Therefore, following the same logic and applying this redundancy rate to the 60% of parcels with independent names, we arrive at the figure of 46.9%. In any case, this is only a maximum estimate, but a substantial improvement concerning the results of the Andalusian project cannot be ruled out. This is because, while the databases compared in Andalusia do have interdependence in their content [16], this is different in the Basque Country. In other words, the initial specifications of the Mapa Topográfico de Andalucía (MTA) cartography established the cadastral cartographic information from the years 1985 to 1990 as a source of reference [16]; conversely, the toponymy of the total cadastre has not been used for the preparation of the BTA in the Basque Country [105], which means a lower correlation. More specifically, a correlation exclusively based on the object and not on the process is expected.
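The arithmetic behind the 46.9% figure is not spelled out in the text; one reading that reproduces it is a simple proportion, sketched below. The variable names and the proportionality assumption are ours, while the shares are those quoted in the paragraph.

```python
matched_share = 0.40      # named parcels with a concordance (18,179 of 45,326)
gazetteer_share = 0.313   # share of the NGO-CAE reached by those concordances (7850 elements)
unmatched_share = 1.0 - matched_share  # parcels with independent names

# Assumption: independent parcel names would yield new gazetteer entries at the
# same rate at which matched parcel names map onto existing entries.
max_contribution = gazetteer_share * unmatched_share / matched_share
print(f"Upper-bound contribution: {100 * max_contribution:.1f}% of the gazetteer")
```

Under this assumption the upper bound comes out just below 47%, in line with the 46.9% reported; it remains a maximum, because independent names may be more or less redundant among themselves than the matched ones.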
On the other hand, it should not be forgotten that the most significant difference between these two works is that, while the improvement of the NGA is performed in the administrative field, the initiative of our study is academic. The requirements for inclusion in the official geographical gazetteer of the Autonomous Community of the Basque Country are regulated in Decree 179/2019. Under this regulation, the initiative must come from a competent body, which is the local administration in the case of microtoponymy. In addition, this process requires the classification of candidate names based on the target data catalog. Thus, this task transcends the scope of this work and academic research. Nevertheless, this is not an impediment to considering this integrated dataset in developing works of analysis of the territory based on place names. In this case, the integration of sources can only further the process and increase the robustness of the tests.
Another project related to this study, framed in the National Geographic Gazetteer (NGN) preparation, is the correlation of the initial version of the NGBE with the toponymic databases and information of the autonomous communities developed by the IGN. Its objective is to correct the toponymy of this national gazetteer according to the documentation produced by the autonomous community entities, which are generally competent in the matter. The methodology, called "autocorrection", was designed around 2011 and has had different degrees of development. The evaluation of the pilot projects, implemented in the provinces of Huelva and Alava, shows a degree of exact coincidence between databases of 30% [17], although certain problems are evident in cases of bilingualism, such as an increase in errors and spelling variations. The manual component of validation of all correspondences has a great weight.
On the other hand, this implementation has brought homogeneity and reliability to the NGBE, in addition to promoting a rigorous activity of the administration with the aim of complying with the requirements derived from the European INSPIRE directive. It also brings an essential idea, which is that the improvement of these databases can only be approached iteratively. Whereas the spirit of the project may be aligned with this work, the results are hardly comparable, and there is a lack of descriptive statistics for the geographic scope of our study.
The methodology design is based on GIS tools, which provide all the power of spatial analysis. On the other hand, the capacity of ETL tools has been used for managing databases and their transformation based on alphanumeric data. Although the method can be fully reproduced by database managers or by GIS, there are several advantages of ETL tools. Firstly, there is the agility of data loading (the cadastre's working unit is the municipality, so there are multiple files to be linked); then the operability between formats; and finally, the possibility of implementing subroutines with code adapted to the needs of the data models used. The difference between data models depending on the territorial unit analyzed is one of the main challenges for the extension of the application of this method to other datasets with toponymic content.
Nevertheless, different projects are known to offer solutions to similar problems, so it is necessary to discuss the choice of the tools used. Let us consider the design implemented by Kaufman [102], which searches for matches using an algorithmic solution based on the fuzzy string matching methodology. We find that our proposal, which uses matching with a geographic LKB, is better adapted to the Basque language construction logic of lexeme agglutination. The use of natural language processing tools using artificial intelligence [106] should also be considered. Nevertheless, this route requires a deeper immersion into the linguistic concepts beyond the objectives of the line of research in which this study is embedded.
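The contrast between a whole-string similarity score and a shared-lexeme view can be illustrated with Python's standard `difflib` module. This is only a stand-in for fuzzy string matching in general; it does not reproduce the algorithm of [102], and the two compound names are invented for the example.

```python
from difflib import SequenceMatcher

def fuzzy_ratio(a, b):
    """Whole-string similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def longest_common_block(a, b):
    """Longest contiguous block shared by both strings, case-insensitive."""
    a, b = a.lower(), b.lower()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]

# Invented agglutinated names sharing one lexeme.
left, right = "Iturrizabal", "Iturrialde"
print(round(fuzzy_ratio(left, right), 2), longest_common_block(left, right))
```

A single similarity score says how alike two strings are overall, whereas the shared block isolates the lexeme that both names are built on, which is the unit that can then be checked against a lexical knowledge base.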
Throughout this manuscript, we have also indicated the different limitations in the scope of the work. Due to the initial geographic perspective, we have not considered the implementation of forms of natural language processing using artificial intelligence, which has yielded very favorable results, as mentioned in Sections 2.2 and 2.4. In addition, a toponymic standardization process has yet to be deployed on cadastral geographic information, which makes it particularly difficult to compare with linguistically standardized databases. On the other hand, this work leaves aside the typification of geographic elements. Thus, this could be one of the objectives of future work on the corpus created with the set of place names. Finally, it must be admitted that any systematic toponymy treatment offers improvements in homogeneity and speed in obtaining results. Nevertheless, it needs to improve control in the management of detail and the interpretation of nuances, which is possible in manual methods.
Concerning the semantic aspect, it is possible to outline an action to improve the procedure: including in the reference LKB terminology related to botany, land use, and land cover, in addition to geographical terms. Therefore, it would be advisable to design a semantic map of the organization of the lexical system related to toponymy [107], which would endow a semantic affiliation to different categories in order to facilitate the application of this LKB.
7. Conclusions
In this research, we have designed, implemented, and validated a methodology for the detection of coincidences between geographic databases using ETL and GIS tools. These have been the NGO-CAE gazetteer and the cadastral geoinformation corresponding to the province of Gipuzkoa (a bilingual territory). The objective of this comparison is the integration of information at an academic level. In addition, toponymy is a special type of geoinformation because of its transversality and the multitude of approaches from which it can be addressed.
The implemented methodology is based on spatial criteria, textual element treatments, and finally on semantic criteria, which outline the automatic selection of concordances, as verified in the manual validation of the method, performing the same process manually. The validation phase of the method on a sample of four municipalities clarifies the minimum concordance chain length in which semantic units are found. These are crucial data to determine the threshold at which the processing is reliable by exclusively automatic methods. In our case, as Basque is the predominant language, it is found that strings of fewer than four characters cease to have a unique significance, and indeterminacy takes over the analysis so that the execution of the succession of commands is not enough. It has also been found that the more or less restrictive use of LKB leads to error modulation. For the issue that concerns us, which is the integration of databases, the errors of commission result in the potential loss of toponymic information. Thereby, it will be the parameter that we must consider to limit the terminological glossary.
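The length thresholds discussed above can be summarized as a three-way rule. The sketch below is a simplified reading of the 5C+ and 4C categories used in the figures; the function name, the labels, and the default parameters are assumptions, and the semantic (LKB) filter is deliberately left out.

```python
def classify_chain(common_chain, auto_min=5, supervised_min=4):
    """Classify a common chain by its length.

    auto_min and supervised_min mirror the 5C+ and 4C categories;
    shorter chains are treated as carrying no unique significance.
    """
    length = len(common_chain)
    if length >= auto_min:
        return "automatic"
    if length >= supervised_min:
        return "supervised"
    return "not significant"

# Invented chains of six, four, and three characters.
labels = [classify_chain(chain) for chain in ("erreka", "sako", "ola")]
print(labels)
```

Raising `auto_min` would lower commission errors at the cost of more omissions, which is the trade-off reported for the sample.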
The results achieved in this implementation, considering the high number of non-repeated names, demonstrate this method's goodness and the adequacy of the choice of databases for their integration. A lay view might think this low consistency may denote a lack of quality in the compared sources. However, given the nature of the sources, the interpretation derived from the detailed knowledge of the datasets tends to relate this incoherence to differences in the times and methods of information capture, or types of entities consigned to the purpose of each work. In short, this work confronts different approaches, objectives, and requirements of each representation of reality through geographic information.
If this study is contrasted with other above-mentioned related works, we can assert that the goodness of the results is comparable. Furthermore, this example provides an innovative solution to the treatment of toponymy in a bilingual environment. Finally, it explores the toponymic possibilities of a geoinformation database, such as that of the Gipuzkoa land registry, which until now has not been used for these goals.
The assumption of the challenge of extending the geographical scope of application to the other territories of the Basque Country, Alava, or Bizkaia must involve result verification due to the population and sociolinguistic differences. Finally, and from a more general perspective, the main challenge is to advance toward automated, systematic, and reliable management in a field of knowledge that has been treated by hand until very recently.
Author Contributions: Conceptualization, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; methodology, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; software, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; validation, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; formal analysis, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; investigation, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; resources, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; data curation, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; writing (original draft preparation), Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; writing (review and editing), Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; visualization, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; supervision, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; project administration, Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado; funding acquisition. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
# Subroutine input variables:
#   csvNombreC: cadastral parcel toponym
#   Relationships{}.NOMBRET: dictionary that contains the place names that fulfil
#     the geometric criterion (see Section 4.3.3, Geometric Criterion)
#   ELEMENTOS_EN_LISTA: number of elements in Relationships{}
# Subroutine output variables:
#   For every feature of the Relationships{} dictionary, we have the following variables:
#   Relationships{}.COMMON_CHAIN: the maximum-common-chain text
#   Relationships{}.LEN_COMMON_CHAIN: length of the maximum-common-chain text
import fme
import fmeobjects


def FeatureProcessor(feature):
    nombre = str(feature.getAttribute('csvNombreC'))
    m = len(nombre)
    num_ele_list = int(feature.getAttribute('ELEMENTOS_EN_LISTA'))
    count = 0
    while count < num_ele_list:
        prefix = '_relationships{' + str(count) + '}.'
        # Cast to str once so that len() and indexing work on the same value.
        toponimo = str(feature.getAttribute(prefix + 'NOMBRET'))
        n = len(toponimo)
        maxLen = 0
        endIndex = m
        # FIND[i][j]: length of the common chain ending at nombre[i - 1] and toponimo[j - 1].
        FIND = [[0 for x in range(n + 1)] for y in range(m + 1)]
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if nombre[i - 1] == toponimo[j - 1]:
                    FIND[i][j] = FIND[i - 1][j - 1] + 1
                    if FIND[i][j] > maxLen:
                        maxLen = FIND[i][j]
                        endIndex = i
        resultado = nombre[endIndex - maxLen:endIndex]
        feature.setAttribute(prefix + 'COMMON_CHAIN', str(resultado))
        feature.setAttribute(prefix + 'LEN_COMMON_CHAIN', len(str(resultado)))
        count = count + 1
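The dynamic-programming core of the Appendix A subroutine can be tried outside the ETL environment. The version below is a sketch that keeps the same logic but takes two plain strings; the function name `longest_common_chain` and the sample names are hypothetical.

```python
def longest_common_chain(name_a, name_b):
    """Longest contiguous substring shared by two names.

    table[i][j] holds the length of the common chain that ends at
    name_a[i - 1] and name_b[j - 1]; the comparison is case-sensitive.
    """
    m, n = len(name_a), len(name_b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    max_len, end_index = 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if name_a[i - 1] == name_b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
                if table[i][j] > max_len:
                    max_len, end_index = table[i][j], i
    return name_a[end_index - max_len:end_index]

# Invented names for illustration.
chain = longest_common_chain("sakonerreka", "errekaldea")
print(chain, len(chain))
```

The table costs time and memory proportional to the product of the two name lengths, which is negligible for toponyms. Because the comparison is case-sensitive, names should be normalized to a common case beforehand if the sources differ in capitalization.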
References
1. Quesada-García, S. A cartography of al-Andalus' landscape: Mapping settlements of Muslim agricultural colonization in Europe applying GIS techniques. J. Hist. Geogr. 2022, 77, 65–84. [CrossRef]
2. Rosselló i Verger, V.M. Cartography, landscape and territory. Catalan Soc. Sci. Rev. 2012, 1, 46–57. [CrossRef]
3. Zamorshchikova, L.; Gadal, S.; Filippova, V.; Samsonova, M. Landscape Toponymic Maps: Interdisciplinary Approach (Example of Sakha Republic, Russia). In Proceedings of the International Multidisciplinary Scientific Conference SGEM2016, Albena, Bulgaria, 30 June–6 July 2016; pp. 311–316.
4. Arroyo Illera, F. Toponymy as an intangible cultural legacy. Boletín Real Soc. Geogr. 2019, 153, 33–60.
5. Mollo, N.M. Determinación geográfica de los sitios de interés histórico y arqueológico mediante la utilización de técnicas cartográficas. Teor. Práct. Arqueol. Hist. Latinoam. 2022, 3, 21–48. [CrossRef]
6. Ingelmo Casado, R. Localización y tratamiento de información histórica a través de la toponimia menor: Utilidad del catastro de la riqueza rústica (Localization and treatment of historical information through the minor toponymy: Utility of the cadastre of rustic wealth). In Tecnologías de la Información Geográfica: La Información Geográfica al Servicio de los Ciudadanos (Geographic Information Technologies: Geographic Information at the Service of Citizens); Ojeda, J., Piya, M.F., Vallejo, I., Eds.; Secretariado de Publicaciones de la Universidad de Sevilla: Sevilla, Spain, 2010; pp. 199–213, ISBN 978-84-472-1294-1.
7. Mácha, P.; Obrusník, U.; Jordan, P.; Sancho Reinoso, A. The Challenges of Studying Place-Name Politics in Multilingual Areas. In Place-Name Politics in Multilingual Areas; Springer International Publishing: Cham, Switzerland, 2021; pp. 45–69.
8. González García, E.M. El catastro: Fuente de información del territorio (The cadastre: Source of territorial information). In Proceedings of the X Coloquio de Historia Canario-Americano. Coloquio 10. Tomo 2; Cabildo Insular de Gran Canaria, Ed.; ULPGC: Las Palmas de Gran Canaria, Spain, 1992; pp. 160–175.
9. European Parliament and Council of the European Union. Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 Establishing an Infrastructure for Spatial Information in the European Community (INSPIRE), L108; Official Journal of the European Union: Brussels, Belgium, 2007.
10. Gobierno de España. Ley 14/2010, de 5 de Julio, sobre las Infraestructuras y los Servicios de Información Geográfica en España (LISIGE) (Law 14/2010, of July 5, 2010, on Geographic Information Infrastructures and Services in Spain); Government of Spain: Madrid, Spain, 2010.
11. Liu, Z.; Cheng, L. Review of GIS Technology and Its Applications in Different Areas. IOP Conf. Ser. Mater. Sci. Eng. 2020, 735, 012066. [CrossRef]
12. Mesquitela, J.; Elvas, L.B.; Ferreira, J.C.; Nunes, L. Data Analytics Process over Road Accidents Data: A Case Study of Lisbon City. ISPRS Int. J. Geo-Inf. 2022, 11, 143. [CrossRef]
13. Páez, O.; Vilches-Blázquez, L.M. Bringing Federated Semantic Queries to the GIS-Based Scenario. ISPRS Int. J. Geo-Inf. 2022, 11, 86. [CrossRef]
14. Drešček, U.; Kosmatin Fras, M.; Tekavec, J.; Lisec, A. Spatial ETL for 3D Building Modelling Based on Unmanned Aerial Vehicle Data in Semi-Urban Areas. Remote Sens. 2020, 12, 1972. [CrossRef]