Modelling Decisions in Banking Supervision: A Machine Learning Approach

Author: Guerra, Pedro Arteaga
Year: 2022
Source: https://run.unl.pt/bitstream/10362/141297/1/D0068.pdf
Doctoral Programme - Information Management
Acknowledgements
To my supervisor Prof. Mauro Castelli for his invaluable support, guidance and
experience. To a great sparring partner.
Also to my co-supervisor and colleague Prof. Nadine Côrte-Real for her timely
insights and critical business views.
To my wonderful parents, who grew into my closest friends, for teaching me when
to walk and when to fly.
To my beloved and magnificent wife, with whom I share the greatest joys, for
being key to my successes and my haven in desperate times. She stood by me
through all my travails, my absences, my fits of pique and impatience. She gave me
support and help, discussed ideas and prevented several wrong turns.
To my children, the courageous, adorable and rebellious Miguel and Tomás, who
teach me every day to be patient and understanding. From the very beginning, with
arms wide open, I hope to live up to them and watch them do better than me.
To my friends and family, la crème de la crème, the very very few that prevail
laughing with me.
A special thank you to my most understanding bosses, João Pedro Gomes and
Luís Costa Ferreira, who gave me the luxury of time and provided the means to see
this project through.
Publications
Machine Learning Applied to Banking Supervision: a Literature Review
Pedro Guerra and Mauro Castelli
Risks, 2021, 9, no. 7: 136
https://doi.org/10.3390/risks9070136
Machine learning for liquidity risk modelling: A supervisory perspective
Pedro Guerra, Mauro Castelli, Nadine Côrte-Real
Economic Analysis and Policy, 2022, Volume 74, Pages 175-187
https://doi.org/10.1016/j.eap.2022.02.001
Approaching European Supervisory Risk Assessment with SupTech: A
Proposal for an Early Warning System
Pedro Guerra, Mauro Castelli, Nadine Côrte-Real
Risks, 2022, 10, no. 4: 71
https://doi.org/10.3390/risks10040071
“Begin at the beginning,” the King said gravely, “and go on till you come
to the end: then stop.”
- Lewis Carroll, Alice in Wonderland
It is a capital mistake to theorize before one has data. Insensibly one
begins to twist facts to suit theories, instead of theories to suit facts.
- Sir Arthur Conan Doyle, Sherlock Holmes
Contents
1 Introduction
2 Machine Learning Applied to Banking Supervision: a Literature Review
2.1 Introduction
2.2 Methodology
2.2.1 Engines
2.2.2 Query
2.2.3 Steps
2.3 Results
2.3.1 Distribution
2.3.2 Evolution
2.3.3 Datasets
2.3.4 Related Work
2.3.5 Global Analysis
2.4 Conclusion
2.4.1 Limitations and future work
3 Machine Learning for Liquidity Risk Modelling: a Supervisory Perspective
3.1 Introduction
3.1.1 Risk assessment measures
3.1.2 Machine learning for risk assessment
3.2 Methodology
3.2.1 The Data
3.2.2 Transformations
3.2.3 Feature Selection
3.2.4 Experiments
3.3 Results and Discussion
3.4 Conclusion
3.4.1 Practical and theoretical implications
3.4.2 Limitations and future work
4 Approaching European Supervisory Risk Assessment with SupTech: A Proposal for an Early Warning System
4.1 Introduction
4.1.1 Related work
4.2 Methodology
4.2.1 The Data
gies for risk assessment. These can be the pillars of their next decision support
systems by laying down the technologies supporting risk assessment processes.
Furthermore, this work can also incite surveys and case studies on the use and
adoption of ML at central banks.
3. Consultancy companies will benefit from a compendium of ML techniques and
risk measures, to better support their clients.
4. Academia receives an important contribution that gathers an extensive number
of papers on risk assessment and collates the identified methodologies
from a supervisory perspective. This will hopefully serve as a stepping stone
for future developments in this area, and provide a baseline for testing new
methodologies.
This paper is organised as follows: it starts by justifying the methodology and
describing how the references were selected. The results section gathers similarities
among published scientific knowledge and presents the most relevant works that
influence this field. The last section provides a space for discussing lessons learned
and future work.
2.2 Methodology
This research was conducted through a series of exploratory steps on the topics of
machine learning, banking, risk assessment, and banking supervision. The initial
objective was to evaluate how machine learning techniques were being used at central
banks. Additionally, we intended to analyse how these methods were informing the
analytical capabilities of supervisors. We then refined a search query broad enough
to return a set of articles we could work on. The following subsections describe a
step-by-step guide to the reference search and selection.
2.2.1 Engines
This literature review relies on three search engines: SpringerLink, ScienceDirect,
and Google Scholar, queried until June 2021. The first and second search engines
are extensively renowned for their trustworthiness and for selecting top journals for
their results. The last one provides an extensive overview of all articles published
in English (Gusenbauer, 2019).
2.2.2 Query
Through extensive addition and diversification of search terms, we refined the search
query to the following: “machine learning” and (“bank” or “banking” or
“supervision”).
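The boolean structure of this query can be illustrated with a minimal sketch. How each search engine actually evaluates the query is not described here, so the helper `matches_query`, the regular expressions and the sample titles below are assumptions made purely for illustration.

```python
import re

# Hypothetical patterns mirroring the query:
# "machine learning" AND ("bank" OR "banking" OR "supervision")
TOPIC = re.compile(r"\bmachine learning\b", re.IGNORECASE)
DOMAIN = re.compile(r"\b(bank|banking|supervision)\b", re.IGNORECASE)


def matches_query(text: str) -> bool:
    """Return True when the text contains the topic term and at least one domain term."""
    return bool(TOPIC.search(text)) and bool(DOMAIN.search(text))


# Illustrative titles, made up for this sketch (not taken from the review).
titles = [
    "Machine learning for banking supervision",
    "Machine learning for image recognition",
    "Stress testing in banking",
]

selected = [title for title in titles if matches_query(title)]
print(selected)
```

Whole-word matching is a deliberate simplification: with these patterns a title containing only "banks" would not match, whereas a real search engine may apply stemming.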
The underlying reasoning is that machine learning techniques are the focal point
of this review article. The added value comes from analysing their potential
applications to the banking sector, specifically banking supervision. No limitation
concerning the year of publication was applied. Overlapping results are addressed
in our secondary analysis. Furthermore, no filter regarding type or place of
publication was applied, since the included papers’ journals of publication were evaluated
and classified after screening. Additionally, to keep up with new publications, we
defined an alert in Google Scholar with this query. Finally, we pay close attention
to Mendeley’s alerts for articles related to the set gathered in this review.
2.2.3 Steps
The following subsections detail every step of the selection process, summarised in the
PRISMA diagram in Figure 2.1.
Figure 2.1: PRISMA diagram detailing the selection process of the identified articles.
Table A.3 lists the selected papers, providing a single-sentence summary of their
content.
Identification
The research query identified 85 articles and two books, from the three search
engines. All the papers were published in English, in several different journals, and
spanned from 2000 to 2021. This first step involved title and abstract analysis, and
excluded 14 articles for lack of relevance.
Screening
In this phase, the main topics of each article were analysed, resulting in the exclusion
of 21 papers, based on the following criteria:
• Dataset: when the analysed paper used data from outside the banking sector,
it was discarded. We are aware that applications of ML to the stock market
are a trendy topic in the literature, and that the insurance and pension funds
sector is of great importance in the Eurozone. Nevertheless, the regulation
is substantially different, and they would merit a different study and
approach;
• Methodology: risk assessment exercises are historically based on quantitative
data, combined with expert judgment. Furthermore, it is the quantitative data
that holds the largest amount of information regarding risk exposure practices.
Therefore, we focus our analysis on quantitative methods, for which a risk
assessment classification has already been assigned (leveraging previous
knowledge through supervised learning). We thus excluded works concerning
unsupervised learning methods, or sentiment analysis (qualitative);
• Region: this criterion is closely related to the first, since regulation changes
according to geography. We chose to focus mainly on works based upon
institutions operating in the Eurozone. Nonetheless, relevant works by other
central banks were considered eligible.
Eligibility
The next step required a thorough analysis of each paper, to verify its sources and
classify the journal it was published in (quartile of impact). Papers were analysed
from 2021 backward to identify any overlapping results or new or improved
methodologies, resulting in the exclusion of ten more articles: nine being personal loans
related and one duplicate result.
The scope of this review is the application of ML techniques to risk assessment
from a supervisory perspective, which includes at best how institutions are
addressing their risk assessment exercises. The data and predictors used to evaluate an
individual credit application (personal loan) differ substantially from the data used
by banks from a corporate perspective, and even more from the data collected in the
regulatory context. As such, works regarding credit risk of individual applicants
were also excluded.
Considered papers
The final article base consists of 41 papers and two books, published from 2000 until
2021, selected through the steps mentioned. In the next section, we will describe
the similarities among the papers, as well as the methods applied and respective
banking areas.
2.3 Results
2.3.1 Distribution
Based on the reviewed works from the previous section, the following paragraphs
describe how machine learning techniques have been used in the banking sector. Our
research intends to provide a future reference on how these technologies address and
support the risk assessment process, in particular from a central bank’s perspective.
These results solely reflect the analysis of the papers selected for this review. They
represent neither the total of publications throughout these years nor the distribution
of topics of all publications.
Table A.1 summarises the selected articles, referenced by author, year of
publication, affiliation and number of citations. Additionally, Table A.2 lists the journals
from the selected articles.
The most common topic in these papers is credit risk related (nearly 34% of
references), as shown in Figure 2.2.
Figure 2.2: Distribution of articles according to main topic.
The second major category relates to “ML application” (surveys, fin-tech and
sup-tech, as per the division suggested by Broeders and Prenio (2018), the use of
innovative technologies by supervisory agencies to support their processes) along
with “stress tests”. The remainder of the results focuses either on “bank risk” more
broadly, or on specific topics of supervision such as liquidity risk and other banking
risk perspectives. Another relevant aspect is the publication date of these articles,
ranging from 2000 to 2021 and distributed as shown in Figure 2.3.
Figure 2.3: References according to year of publication.
Importantly, although ML applied to the financial sector has been present since
2000, by 2015 the intersection of these knowledge areas had gained considerable interest. This
translated into increasing numbers of publications in this field, with the majority of
relevant articles in this study being published from 2017 onward. Table A.4 lists
the machine learning methods applied by each author as well as the datasets that
supported each piece of research.
2.3.2 Evolution
The selected papers were organised by date of publication. Publication intervals
were defined based on relevant events in the banking sector, technological evolution,
and the number of papers per interval. The first slot, ranging from 2000 to 2011,
encompasses the effects of the financial crises of 1999 and 2008. The second range
(from 2012 to 2016) still reflects several studies based on the 2008 crisis, but with a
more mature insight. In this period there is also a trending increase of ANN models.
The third slot encompasses the years of 2017-2018, which show a significant increase
in publications intersecting ML and the banking sector.
The final interval (2019 to the current date) depicts important ML applications
to the financial market in general. Studies in this period reveal an increased
ponderation of the uses and impacts of machine learning in banking supervision, with
several publications from banking authorities.
2000-2011
Six papers were identified from this period. They mostly focus on stress tests,
although three of them engage with the topic of credit risk and default risk.
Early in this period, Galindo and Tamayo (2000) identified the risk assessment
task as crucial for an efficient use of resources. They used an error curve
methodology to compare model precision and concluded that tree-based models outperform
ANNs, KNN and probit. This puts forward the finding that tree-based models are
more appropriate for structured data, as opposed to ANNs.
Hillegeist et al. (2004) proposed a new method for assessing bankruptcy
probability. Based on the Black–Scholes–Merton option-pricing model, this method was
compared to the well-known Z-score (Altman, 1968) and O-score (Ohlson, 1980),
obtaining superior results. These authors stressed the need for a standardised risk
assessment measure mainly for comparability purposes.
Min and Lee (2005) presented a paper that compares statistical and artificial
intelligence methods, with the latter outperforming the former in the classification
of bankruptcy. Although this study focuses on credit risk assessment of heavy
industry firms in Korea, we included it in our sample for a compelling reason. It is
a clear example of machine learning methods outperforming conventional statistics
and it uses a set of predictors (financial ratios) easily mapped to regulatory
financial reporting since they are based on balance sheet entries.
Angelini et al. (2008) based their work on the Basel II capital requirements and
the need for a system to assess credit risk. The main objective of this work is to
evaluate the possibility of using neural networks to estimate the probability of
default of a borrower (Italian small companies).
In spite of some ANNs being used, the comparison of classic
machine learning models to conventional statistical methods was the more recurrent
approach. Furthermore, the risk definition used to evaluate the data sets was based
on the probability of default. This is explained by the fact that the datasets are
mostly from loan applications, either from small and medium enterprises or personal
loans (housing included). These findings contradict Galindo and Tamayo (2000) as
well as more recent developments in this area. ANNs have been proved to excel
in time-series, image, and voice recognition, as opposed to their performance using
structured data.
Additionally, some articles used financial ratios and the CAMELS rating model (an
international rating system used by regulatory banking authorities to rate financial
institutions) to assess an institution’s performance (stress testing and bankruptcy
prediction). Assessing the health of a bank is crucial to prevent its failure and
contain the systemic risk its failure or losses represent. The work of Boyacioglu
et al. (2009) identifies this assessment as an original classification problem. The
authors use the CAMELS method to select the most relevant predictors. Using this
method, neural networks were shown to outperform multivariate statistical methods
for a Turkish banking sector use case.
Chaudhuri and De (2011) consider the Basel II definition of risk to select features
for the models. In this case, ANNs are not as frequently used as other conventional
ML techniques, such as support vector machines and k-nearest neighbours. As a
consequence, the authors focus on the optimisation of those models for the problem
at hand (i.e. nature of the dataset).
2012-2016
In this period, articles mostly reflect the first insights gained from the 2008 financial
crisis.
Having identified the lack of a comprehensive method to incorporate
circumstantial aspects into the banking default risk predictive models, Ribeiro et al. (2012)
reported that SVM+ outperformed other methods that did not include non-financial
information. Hammer et al. (2012) showed that Logical Analysis of Data (LAD) is
an accurate method by reverse-engineering Fitch risk ratings. The authors stated
that LAD can be used as an internal rating system that is Basel compliant.
Iturriaga and Sanz (2015) took a different approach to this matter. First, they
used self-organising maps (SOM) to profile distressed banks. This unsupervised
learning method is competitive, so it strives to reach the right pattern, the
representation of bankruptcy of a bank. Afterward, the authors applied multi-layer
perceptrons to assess a bank’s risk in several time frames, obtaining very promising
results predicting bankruptcy of commercial banks. This two-step approach is the
first in this selection of papers to recognise the benefits of a pre-processing phase to
map the bankruptcy layout of a bank. Although previous research has shown better
results using conventional ML, the success shown by this perceptron model suggests
it is adequate to model the time evolution of quantitative data.
A new approach to credit scoring using an ensemble model was proposed by
Ala’raj and Abbod (2016). These authors combine several data filtering and
feature selection methods before evaluating model performance, and compare the most
traditional classifiers with their method. The results are validated on several public
datasets and their accuracy assessed under several measures: average accuracy, area
under the curve (AUC), H-measure, and Brier Score. This is the first paper in our
sample showing that ensembles outperform single models for classification problems.
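Three of the measures named above can be sketched in a few lines. This is a minimal illustration using their standard textbook definitions, not the evaluation code of Ala’raj and Abbod (2016); the function names and the toy labels and scores are assumptions, and the H-measure is omitted because it requires a choice of cost distribution.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Share of observations whose thresholded prediction equals the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(y_prob, y_true)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative."""
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = default, 0 = no default.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

print(accuracy(y_true, y_prob), auc(y_true, y_prob), brier_score(y_true, y_prob))
```

Accuracy depends on the chosen threshold, whereas AUC and the Brier score use the predicted probabilities directly, which is one reason studies report several measures side by side.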
2017-2018
These two years showed a more than 60% increase in publications in the intersection
of ML and the banking sector. As highlighted by Strydom and Buckley (2019), the
technological evolution allowed for the development of deep learning (DL) models, as
well as new ensemble methods like extreme gradient boosting (XGBoost). Although
DL first reappeared in 2012 (Krizhevsky et al.), its application to
financial risk only came to light in 2016-2017.
Traditional ML and classical statistical approaches are still the cornerstones of
most of these articles. However, an increasing trend is noticeable in the use of
ANN-based models mainly due to bigger datasets and enhanced computing power.
Abellán and Castellano (2017) build on their previous work showing how
ensembles achieve better results in credit risk assessment than single models, validating
the findings of Ala’raj and Abbod (2016). The authors stress the importance of
individual model performance as a criterion for ensemble selection. Although the
authors emphasize their own tree-based model (Credal Decision Tree, CDT), the
main finding of their work is the corroboration of the hypothesis that ensembles
outperform single classifiers.
Prompted by the 2008 Global Financial Crisis and the need to foresee signals
of financial instability, Italian authors Pompella and Dicanio (2017) developed an
Early Warning System (EWS) to help uncover distress signs of banks. This credit
risk model allows users to discriminate stable from likely-to-fail banks and might be
useful in adjusting rating assignments by Rating Agencies. The authors suggest its
implementation in regulators to support the supervisory process.
Xia et al. (2017) present an extreme gradient boosting model (XGBoost by Chen
and Guestrin (2016)) that consistently outperforms baseline models. The authors
stress the importance of model-based feature selection as well as the use of Bayesian
hyper-parameter optimisation to achieve better predictive results. Although
personal credit risk is not the main topic of interest in this review, this study shows the
advantages of boosting techniques and the importance of an interpretable model for
decision making. This type of model has won several Kaggle competitions and
consistently shows excellent results with structured data.
Chakraborty and Joseph (2017) from the Bank of England introduce a central
bank perspective on machine learning and its applications. The authors provide an
overview of machine learning models and model validation to support the
presentation of three case studies. As a final note, this work acknowledges the amount of
available data as an important vector in decision support systems based on machine
learning at central banks and other offices. As previously stated, agency papers
such as this one are paramount in understanding the use of machine learning in these
contexts, providing use cases and areas of interest for future work.
Alessi and Detken (2018) contribute another EWS to detect excessive credit
growth. This phenomenon is usually at the root of systemic risk to financial stability
and its early detection can help avoid cases of bankruptcy. The authors use a Random
Forest classifier model with credit and real estate predictors. Their work pioneers
the domain of risk assessment from the perspective of central banks, thus setting
peer practitioners on their future path. Moreover, the work reinforces that
ensembles consistently outperform single models. Other authors successfully use extreme
gradient boosting to develop a credit risk model for financial institutions (Chang
et al., 2018). Those tools promise significant support (i.e. low error rate) to risk
assessment in loans.
The Central Bank of Greece also provides a thorough analysis based on post-2008
crisis loan data from Greek banks, by Petropoulos et al. (2018). This study sets a
milestone for the use of advanced ML techniques from a supervisory perspective.
Furthermore, it leverages the resulting model to create an EWS that will support
subsequent decisions in loan approval. Similar to what Iturriaga and Sanz (2015)
have shown, modelling a timeline evolution is where neural networks (in this case
deep neural networks, DNNs) excel. Another important result is that DNNs can
perform just as well as XGBoost, showcasing how precisely deep learning models
adapt to structured data.
Tavana et al. (2018) present a study that directly addresses liquidity risk, which
is the most rapidly devastating risk a bank is exposed to. In this paper, the authors
present an artificial neural network model combined with a Bayesian network (BN)
to assess liquidity risk using solvency as a proxy. This combined approach models the
liquidity risk indicator through the ANN and the probability of occurrence through
the BN. The results show this approach distinguishes the most critical factors of
liquidity in this dataset.
Broeders and Prenio (2018) conduct a study that compiles the experience of
early users of innovative technology in financial supervision (sup-tech). The authors
structure a definition of sup-tech and show how it is used for data collection and
analytics. These two applications have different initiators in supervisory agencies.
Data collection tends to be initiated by management decisions and projects whereas
analytics usually start out as research questions or analysis queries from
supervision units. A common thread of all use cases is the sharing of the experience of
some early adopters and the impact those technologies are having on the
organisation. Similar studies, such as the one conducted by Chakraborty and Joseph (2017),
are essential to compiling, sharing, and contrasting the several approaches throughout
central banks and other agencies.
The Federal Reserve provides a broader perspective, analysing how the use of
machine learning and big data will impact compliance aspects (Jagtiani et al., 2018).
The authors also stress the need to identify the risks that these technologies carry
when applied to the financial market.
Gogas et al. (2018) propose a methodology that separates solvent and failed
banks, using machine learning models. The authors present an alternative tool to
stress-testing that outperforms the O-score. Their approach is based on a support
vector machine model that helps to define a boundary between solvent and insolvent
banks, converting this issue into a classification problem. Kupiec (2018) presents a
related study that stresses the need for new methodologies to validate conventional
bank stress tests.
As a final reference for this period, Le and Viviani (2018) also tackle the problem
of bank failure prediction using machine learning and classical financial ratios. One
important aspect of this work is that the authors use ratios from 5 different risk
perspectives: Loan quality, Capital quality, Operations efficiency, Profitability, and
Liquidity. This work validates yet again that machine learning methods outperform
traditional statistics. However, these authors do not explore the possibility of using
ensembles, which have already been proven to be top performers in classification
problems.
2019-2021
Credit and banking risks are essential to a balanced economy; trying to prevent
systemic repercussions stemming from them is considered of the utmost importance.
Similarly to earlier periods, these risks maintain a privileged spot in research. Still,
it was in ML application that we saw the most significant increase in publications. This
suggests a demand for coordination and a global perspective on the developments
achieved so far in this area.
Leo et al. (2019) produce a thorough review on how machine learning has been
used at banks for risk assessment. This paper sets the industrial and academic
claims of ML application against real-life practices, highlighting a series of
perspectives where risk management has been poorly applied. Climent et al. (2019) develop
an insightful study that aims to identify a set of financial predictors that best model
a bank’s financial distress. To this end, the authors apply an XGBoost-based model
to a set of indicators that might predict a bank failure in the Eurozone. The set of
selected indicators (Total assets, Loan loss provisions/net interest revenue, Equity/net
loans and Interbank ratio) is shown to best help regulators monitor financial
distress of those banks. From a technical perspective, this work reinforces the choice
of XGBoost for classification problems using structured data. A recent study by
Wang et al. (2021) deconstructs the use of logit as the base classifier of EWS
developed to predict banking crises. In fact, the authors use a random forest classifier
to simulate expert decision, obtaining a generalisation capability above 80% area
under the curve (AUC).
Kou et al. (2019) compare several ongoing lines of research concerning the applications
of machine learning methods to the detection of systemic risk events, that is,
financial distress phenomena that affect several markets or geographic regions. They also
propose the use of big-data analysis to assess systemic risk.
Soui et al. (2019) address the issue of comprehensibility of machine learning
models for credit risk assessment. Interestingly, in this study, interpretability was
mentioned as one of the barriers to adopting ML models in day-to-day decision
making. In an attempt to circumvent this problem, the authors proceeded to develop
an evolutionary algorithm to approach credit risk assessment as an optimisation
problem: minimising complexity while maximising accuracy.
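The trade-off between complexity and accuracy can be made concrete with a small sketch of Pareto dominance. This is not the evolutionary algorithm of Soui et al. (2019), whose details are not given here; the `Candidate` type, the use of a rule count as the complexity measure and all the numbers are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    complexity: int   # e.g. number of rules in the model; lower is better
    accuracy: float   # share of correct classifications; higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on both objectives and better on at least one."""
    no_worse = a.complexity <= b.complexity and a.accuracy >= b.accuracy
    better = a.complexity < b.complexity or a.accuracy > b.accuracy
    return no_worse and better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical candidate models.
candidates = [
    Candidate("A", 3, 0.70),
    Candidate("B", 5, 0.78),
    Candidate("C", 8, 0.77),   # dominated by B: more complex and less accurate
    Candidate("D", 12, 0.81),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

A multi-objective search would typically return such a front rather than a single model, leaving the final choice between simpler and more accurate models to the decision maker.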
A recent review by Dastile et al. (2020) comparing statistical and ML
models for credit scoring showed that ensembles outperform single classifiers,
confirming the results of previously mentioned works. The authors identify model
explainability and the ability to deal with imbalanced datasets as the main issues
to deal with when modelling credit risk. Deep learning models also show promising
results, although they have not been extensively explored for credit risk assessment.
The authors identify the lack of interpretability as the main barrier to adopting
deep learning for credit risk assessment.
Banco de España (Alonso and Carbo, 2021) published a comparison of several
well-known machine learning algorithms for credit default prediction, showing
significant improvements over logit. The authors estimate that implementing
XGBoost-mediated assessment could lead to savings of up to 17% of capital requirements
under current ECB regulation. Antunes (2021) from the Central Bank of Brazil
presents a solid argument to maintain supervisory on-site inspections. The author
compares two machine learning models, one trained with portfolio ratings assessed
by the banks themselves, and the other based on past ratings obtained through
on-site inspections. The results show that the overall performance is consistently
higher when using data retrieved through inspections.
This is the period with the most ML applications papers identified (with a total
of 9 out of 13). They span from insights on how AI will continue to revolutionise
industries and change social behaviour (Dwivedi et al., 2021), to more practical
approaches on how to incorporate ML in financial services (Lee and Shin, 2020).
Milian et al. (2019) also provide a list comparing fin-tech definitions, how it is
supported by digital transformation, and the financial risks associated with the use
of ML.
A comprehensive study by di Castri et al. (2019) focuses on the
definition of sup-tech and highlights the need for a more precise notion of what to include
as “innovative technology” at the service of a financial authority. It presents several
use cases and classifies the technologies into maturity levels (named in the paper as
“generations”), concluding that the identified initiatives (applications of innovative
technologies to support the activities carried out by financial regulators and
authorities) are mostly experimental. The authors suggest an international coordination
effort and alignment to create synergies that leverage sup-tech development.
The Bank of Italy presented a use case of a classification problem (deducing the
institutional sector code of a company based on its characteristics) (Massaro et al.,
2020). Although this work is not related to risk assessment, it provides an excellent
example of a production-ready application of ML to supervisory tasks.
Alonso and Carbo (2020) from Banco de España stress the need for a joint
strategy to assess ML models to increase transparency and promote adherence to
this technology. The authors conclude ML models increase the predictive capability
of a credit default classifier by 20%. The study also identifies factors in credit risk
management that might increase supervisory costs.
Driven by the recent progress in financial technology, Huang et al. (2021)
acknowledge the complex and hierarchical nature of financial data and the
technological barriers found when using statistics and classic ML. The authors then proceed to
apply advanced deep learning methods and make use of several graphic processors
to improve computation.
As a final remark regarding ML applications, Doerr et al. (2021), from the Bank
for International Settlements, presented a policy briefing on the European Money
and Finance Forum, evaluating to what extent central banks are making use of ML
and big data. The authors conclude that although central banks are acquainted
with big data, there exists a persistent need for specialised knowledge on how to use
ML throughout these organisations.
Stress tests are also referenced in these years. Kolari et al.
(2019) hypothesise that stress tests themselves are more of an assessment of a bank’s
ability to deal with the risks it is exposed to. This statement challenges the common
conception of stress tests as a marker of a bank’s resilience to adverse alternative
macroeconomic scenarios. For this purpose, the authors develop an early warning
system to assess how European banks will perform on stress tests. These authors
suggest surviving stress tests depends largely on the underlying risk dimensions of
individual banks. Moreover, this paper reaffirms boosting techniques as winning
solutions, not only for this sort of classification problem but also when applied to
structured data. As future work, the authors recommend a similar approach using
regulatory data.
In the same line of investigation, an EWS was developed by Filippopoulou et al.
(2020) to predict bank systemic risks in the Eurozone. This study starts by analysing
the importance of the indicators that are usually applied and presents a model that
detects a systemic crisis one to four years beforehand. In spite of using a classic
multivariate binary logistic regression model, the methodology adopted for this EWS
classification, for instance, "failure" or "no failure" of a bank. This target variable
is derived from a set of financial ratios, most often from public or proxy datasets.
For this study, we consider the classification method presented in a well-established
and widely approved methodology for risk measurement - the Supervisory Review
and Evaluation Process (SREP) (Bank) - defined by the ECB in cooperation with the
National Competent Authorities (NCAs). This is the process through which
supervisors periodically assess and measure the risk of each bank from five perspectives:
liquidity, credit, market, operational, and profitability. We base our
risk classification on the automatic Risk Assessment System (RAS), which is then
reclassified according to expert judgement.
This methodology uses real supervisory data collected through the European
Banking Authority (EBA) directive for Implementing Technical Standards
(Authority, 2013), within the scope of the Single Supervisory Mechanism (SSM)
(Commission, 2015). Data is used to classify each institution in terms of its risk level,
according to the automatic risk assessment system from the SREP process. These
observations range from 2014 until March 2021. The data used in this research is
extensively validated, thus ensuring a positive correlation with liquidity risk
assessment capabilities (Ng, 2011).
3.1.2 Machine learning for risk assessment
Risk assessment is a predominantly quantitative exercise, often adjusted through
expert judgement. The use of machine learning methods from a central bank
perspective is a recent topic of interest, not only from NCAs and other agencies' perspective,
but also from the academic point of view.
Since the early 2000s, risk assessment has been identified as a top priority for the
efficient use of financial resources (Galindo and Tamayo, 2000). Early in that decade,
the same authors established that tree-based models are more adequate in prediction
tasks when compared to artificial neural networks (ANN), using structured data.
This result is reinforced by other publications throughout the years. Kolari et al.
(2019) specifically address stress testing, suggesting it is an assessment of a bank's
ability to deal with the risk it is exposed to, rather than the bank's actual resilience.
Recent technological evolution has been supporting the development of more
sophisticated models (Strydom and Buckley, 2019), like deep learning (DL) models,
as well as new ensemble methods like extreme gradient boosting (XGBoost) (Abellán
and Castellano, 2017), due to their capability to capture the complexity of this type
of phenomenon. DL first reappeared in 2012 with ImageNet (Krizhevsky et al.).
However, DL was applied to financial risk assessment only in 2016. Dastile et al.
(2020) confirm DL as a promising tool in risk assessment, in particular for credit risk.
They hypothesise extrapolating this approach to other risk perspectives, although
the lack of interpretability of DL is seen by these authors as the main barrier to
adopting this approach.
At the same time, several studies showcase the level of precision with which
deep learning models adapt to structured data. Petropoulos et al. (2018) expand on
the use of advanced ML techniques from a supervisory perspective. These authors
developed an Early Warning System (EWS) for credit risk prediction, using data
from Greek banks' corporate loans (Bank of Greece; 2005-2015). Although XGBoost
emerged as the best model, DNNs also presented promising results. Similarly to
what Iturriaga and Sanz (2015) have demonstrated, modelling a timeline evolution
is where neural networks excel (in this case, deep neural networks - DNNs).
As in bankruptcy prediction, using machine learning to model a risk assessment
usually sums up to a classification task where the developed model assigns a binary
result to a certain observation or context: "fail" or "no fail". This means that for a
set of independent variables/indicators that represent a bank's context in a certain
period, the model will first learn, then predict, whether that bank will go bankrupt
or not, with a particular degree of certainty.
On the business side, it is crucial to understand how banks, national competent
authorities and other agencies are adapting to this evolution. In particular, we
are interested in how central banks use innovative technologies to leverage their
analytical capabilities, namely for risk assessment.
Following the account by Stock and Watson (2001) of what macroeconometricians
at policy institutions do, NCAs are responsible for:
1. Summarising and analysing data;
2. Forecasting the key macroeconomic variables;
3. Conducting risk analysis and balance of uncertainties;
4. Performing structural/causal analysis, as well as scenario analysis;
5. Making decisions, communicating them and justifying these decisions vis-a-vis
the public.
A study conducted by Broeders and Prenio (2018) showcases the experience of
early users of innovative technology in supervision (sup-tech). This work presents a
new definition of sup-tech and shows how it is used for data collection and analytics.
Chakraborty and Joseph (2017) published a similar study where the authors compile,
present and compare the approaches adopted by NCAs and other agencies. As
noted before, the amount of available data emerges as an important vector for the
development of decision support systems based on ML.
Massaro et al. (2020) present a production-ready solution using ML to support
an NCA's everyday tasks. Although this work is not a risk assessment tool, it shows
how NCAs can leverage sup-tech.
We found only one paper addressing risk assessment using ML from a
supervisory perspective (Filippopoulou et al., 2020). The EWS developed by these authors
is of great relevance to central banks. It addresses risk assessment, but most
importantly, it uses real data gathered in the aftermath of the 2008 economic collapse
(European Central Bank Macroprudential Database). Pompella and Dicanio (2017)
also propose an EWS to alert to banks' distress signs. The authors propose a credit
risk model to help adjust rating assignments by the responsible agencies. Along
with Filippopoulou et al. (2020), these findings suggest EWS as reliable instruments
supporting supervisory processes.
The paucity of studies such as the one just mentioned is a gap we propose to
address. To the best of our knowledge, there are no papers addressing liquidity risk
assessment from a supervisory perspective. Additionally, this work uses real-world
data collected at a central bank in the context of supervisory directives. The fact
that this type of dataset is privileged and therefore confidential further explains
the nonexistence of similar studies.
The few studies addressing risk assessment with ML techniques use public or
proxy datasets. These early works set the tone for the particular use case of central
banks. In the supervisory context, data is confidential and the processes are
supported by European-wide legislation, thus making these papers more likely to
stem from joint works with NCAs. Additionally, we do not use a sample dataset
but rather the entire population: the Portuguese banking sector. Also supporting
the novelty of this work is the risk assessment methodology used: the quantitative
pillar of SREP, the Risk Assessment System (RAS). We model the risk assessment
task through a classification problem. As opposed to the papers cited above, we
propose expanding the usual binary classification into multiple classes, according to
banks' risk level and as established in the RAS methodology:
1. low risk;
2. medium-low risk;
3. medium risk;
4. high risk.
This approach ensures that we can look through the same lenses at all banks in
the Euro-area, making these assessments comparable, replicable and transparent.
In this work, we decide to consider solely liquidity risk due to its high importance
to a bank's financial health (Vento and Ganga, 2009). A liquidity crisis can lead a
bank to bankruptcy in less than a week (Shah et al., 2018). Therefore, it is of the
utmost importance to deliver innovative tools that increase the current analytical
capabilities of central banks. We aim to provide a solid base for a scenario analysis
tool.
3.2 Methodology
The fundamental purpose of machine learning (ML) is extracting predictions from
underlying data (or Big Data). Generally, Machine Learning algorithms are applied
to data to get insights from it. In this case we are using Cross Sectional Data, that
can be captured at any point in time. Using information from previously observed
circumstances (cross sectional data), ML algorithms can predict values pertaining
to events that have yet to occur.
Figure 3.1: Methodology process overview.
Figure 3.1 displays the steps performed in the experimental phase of this study.
We started by retrieving the data from the Banco de Portugal production database
of supervisory data. This dataset includes all the available features, as well as
the pre-computed target - the RAS score for liquidity risk. Data transformation
comprises data cleaning, implementing a strategy to deal with missing values, and
the feature selection process. In the experiment phase, we compare three different
approaches to evaluate the ML algorithms for this task: the classic train-test split,
the more accurate cross-validation, and the TPOT AutoML framework (Olson et al.,
2016). We then use the F1-score and the confusion matrices to compare the results
and finally, select the best model. In future use, this model can be deployed as an
Early Warning System making predictions of the liquidity risk level.
In this section we will describe the methods used in this research, from data
gathering to model performance evaluation.
3.2.1 The Data
This study relies on supervisory data collected by Banco de Portugal (Portuguese
Central Bank - BdP) within the Capital Requirements Regulation (CRR) and
Capital Requirements Directive IV (CRD IV) (Parliament, 2013). The data ranges from
March 2014 until March 2021. Depending on its nature, some data is gathered
monthly while in other cases it is gathered quarterly (Authority, 2013). Due to
confidentiality issues, the dataset used in this study cannot be made available for
public consultation.
Data is extracted via SQL query from BdP's production database into a
comma-separated-values (csv) file to be imported using the Python programming language.
An extraction routine was implemented to assure consistency and automation in
data gathering. No filter is applied regarding reference date, institutions or level of
consolidation. The extraction is structured in two steps:
1. First, the features are selected from the reported data. These belong to the
4 main reporting frameworks for banking supervision: Financial Reporting,
Common Reporting, Asset Encumbrance and Funding Plans. This set
encompasses all possible predictors.
2. The target variables are selected. These are computed through a corporate
calculation process but all intermediate variables are discarded, in order to
avoid any possible mathematical relation between features and target.
The data resides in a relational database where each row represents a reported
value. This means that in the data source, several rows represent a single
observation. During extraction, data is anonymised using the MD5 algorithm within a hash
function. This step assures the same identifier for every row in the same observation.
The base dataset has the following topology:
1. ID - a hash code representing each observation's identifier;
2. variable - a code with business meaning that represents each reported value;
3. val - the actual numeric value of the variable.
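The anonymisation step can be pictured with a minimal sketch. The source only states that MD5 is used within a hash function; which fields are hashed is not given, so the institution code, the reference date and the "|" separator below are hypothetical.

```python
import hashlib


def observation_id(institution_code: str, reference_date: str) -> str:
    """Return a deterministic MD5 hex digest for one observation.

    The hashed fields and the "|" separator are illustrative assumptions.
    """
    key = f"{institution_code}|{reference_date}".encode("utf-8")
    return hashlib.md5(key).hexdigest()


# Two reported values from the same (hypothetical) bank and date share an ID.
row_a = observation_id("BANK001", "2021-03-31")
row_b = observation_id("BANK001", "2021-03-31")
other = observation_id("BANK002", "2021-03-31")
```

Because the digest is deterministic, every row belonging to the same observation receives the same ID without exposing the underlying institution.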
3.2.2 Transformations
A Python routine imports the CSV file, preparing the data for machine learning
algorithms. The first step is pivoting the data set so that each of the resulting
lines corresponds to an observation. Subsequently, we go through the data cleaning
process that starts by discarding the target columns that fall out of the liquidity
context. By this stage, each row corresponds to a single observation, and the last
column represents our target variable (the RAS liquidity risk score). The other
columns portray all the features available in our dataset.
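The pivoting step can be sketched with pandas. The column names ID, variable and val follow the topology described earlier. The sample values and the variable codes (feat_x, feat_y, target_liq) are invented for illustration and do not come from the supervisory data.

```python
import pandas as pd

# Hypothetical long-format extract: one row per reported value.
long_df = pd.DataFrame({
    "ID": ["a1", "a1", "b2", "b2", "b2"],
    "variable": ["feat_x", "target_liq", "feat_x", "feat_y", "target_liq"],
    "val": [10.0, 2.0, 12.5, 3.0, 4.0],
})

# One row per observation, one column per variable code.
wide_df = long_df.pivot(index="ID", columns="variable", values="val")

# Move the (assumed) target column to the last position.
target_col = "target_liq"
feature_cols = [c for c in wide_df.columns if c != target_col]
wide_df = wide_df[feature_cols + [target_col]]
```

Values that were never reported for an observation (here, feat_y for a1) appear as missing values after the pivot.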
The next steps delineate under which circumstances a row or column is discarded
from our dataset:
1. Rows for which the target variable is null.
2. Rows that have a target variable 0. This value represents a non-applicable
observation.
3. Rows where all features/columns are null.
4. Null columns: every column/feature has at least one reported value. After
completing the previous steps, we must confirm that every feature still has
values.
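The four discard rules above can be sketched as a small pandas function. The function name `clean`, the column names and the toy values are hypothetical; only the rules themselves come from the text.

```python
import numpy as np
import pandas as pd


def clean(df: pd.DataFrame, target: str) -> pd.DataFrame:
    """Apply the four discard rules to a wide (pivoted) dataset."""
    out = df[df[target].notna()]                    # rule 1: null target
    out = out[out[target] != 0]                     # rule 2: target equal to 0
    features = [c for c in out.columns if c != target]
    out = out.dropna(axis=0, how="all", subset=features)  # rule 3: empty rows
    out = out.dropna(axis=1, how="all")             # rule 4: empty columns
    return out


raw = pd.DataFrame({
    "f1": [1.0, np.nan, 3.0, np.nan, 5.0],
    "f2": [np.nan] * 5,
    "f3": [0.5, np.nan, 0.7, 0.1, np.nan],
    "target": [1, 2, 0, np.nan, 3],
})
cleaned = clean(raw, "target")
```

In this toy frame only the first and last rows survive, and the always-empty column f2 is dropped.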
Finally, we deal with missing values for each feature. As pointed out by
Madley-Dowd et al. (2019), multiple imputation can attain unbiased results with up to 90%
of missing data. Since in our dataset we have at most 20% of missing values, we do
not discard observations based on this criterion. Instead, we use the median to fill
in the missing values, which is the most adequate strategy for numeric datasets
where the features present different distributions (Acuna and Rodriguez, 2004). If
within the same feature/column we have similar mean and median it is indifferent
which strategy to use. The use of the median gives a more appropriate idea of
data distribution. After undergoing this process, the final sample included 5299
observations.
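A minimal sketch of median imputation, using pandas on invented values, shows why the median is less sensitive than the mean to a skewed feature. The column names are hypothetical.

```python
import numpy as np
import pandas as pd

X = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 100.0],   # skewed feature with one large value
    "b": [np.nan, 2.0, 4.0, 6.0],
})

# Column-wise medians ignore missing entries by default.
medians = X.median()
X_filled = X.fillna(medians)

# For comparison only: the mean of "a" is pulled up by the large value.
mean_a = X["a"].mean()
```

Here the gap in column a is filled with 3.0 (the median), whereas the mean would have inserted a value above 34.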
3.2.3 Feature Selection
The selection of the most relevant predictors is an important step, not only for
reducing computational time, but also to compare and contrast with the business
perspective, the ECB Risk Assessment Methodology. After cleaning the data and
dropping some non-representative features we are still dealing with the total universe
of available data.
For the feature selection process we used Random Forest Classifier with an 85%
threshold for the feature importance. This method was chosen due to its ability to
rank the purity of each node (Gini impurity): the greatest impurity decrease occurs at
the top of the tree (near root level) whereas a smaller impurity decrease rate is observed
at the end (near leaf nodes). When this algorithm prunes below a particular node,
it creates a subset of the most important features.
Through this strategy, we are able to technically assess the relevance of each
feature regarding the variable we want to predict and select the ones that explain
85% (the importance threshold defined in the algorithm) of our target variable. The
final dataset has a total of 3409 features selected from a universe of 82559 predictors,
and 5299 observations.
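One way to read the 85% threshold is as a cumulative cut-off: rank the features by Random Forest importance and keep the smallest set whose importances sum to at least 0.85. This reading is an assumption, since the text does not spell out the exact rule. The sketch below uses synthetic data from scikit-learn rather than the supervisory dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the confidential data (4 classes, 20 features).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = forest.feature_importances_   # Gini-based, sums to 1

# Rank features from most to least important and accumulate importance.
order = np.argsort(importances)[::-1]
cumulative = np.cumsum(importances[order])

# Smallest number of top-ranked features reaching the assumed 0.85 cut-off.
threshold = 0.85
n_keep = int(np.searchsorted(cumulative, threshold) + 1)
selected = np.sort(order[:n_keep])
X_selected = X[:, selected]
```

The indices in `selected` can then be mapped back to variable codes and compared with the indicators highlighted by the business methodology.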
Afterwards, we compare the similarity of the obtained features with the ones the
methodology highlights. This, per se, is a useful analysis since it gives hints to the
analysts on which indicators to monitor more closely.
For the purpose of reducing computational time we have also considered, at first,
the Principal Component Analysis (PCA). Although this method is associated with
dimensionality reduction, its use compromises model explainability. At the same
time, PCA loses track of the features that better represent our target variable, by
projecting the feature space into a lower dimensional space.
At the end of this process we compute the correlation matrix of the dataset
to assure there is not a high correlation between features and target. This would
suggest that a certain feature represents the same phenomenon as the target. The
correlation indices range between a positive 26% and a negative 32%.
3.2.4 Experiments
The experiments carried out to assess and compare the performance of each model
were organised in three separate phases, each of which is explained in the following
subsections. First, we adopted the most straightforward approach of splitting the
data into two sets, the train and test sets. Afterwards, we use cross validation to
measure the average performance of each model, considering every observation for
either training or testing. Finally, we use an auto-ml library, TPOT (Olson et al.,
2016), to have another evaluation perspective.
For each of the three approaches we calculate a measure of performance/scoring
for both train and test sets. Furthermore, we compute the confusion matrix for a
precise picture of each model's prediction.
We have selected a list of some of the most common machine learning algorithms
used for classification problems. For the purpose of these experiments we have
selected the scikit-learn implementation of the following models:
1. Logistic Regression (LG) by Cox (1958), or Multinomial Logistic Regression,
is an extension of the Bivariate Logistic Regression proposed by McCullagh
and Nelder in 1989 (Glonek and McCullagh, 1995) for problems with more
than two discrete outcomes. The original approach was designed for binary
problems, and the target variable was modelled through a binomial probability
distribution function. In its multiclass form, the probability is distributed by
the number of classes of the problem at hand. In this paper, we used the
scikit-learn implementation of the Logistic Regression for multi-classes (Pedregosa
et al., 2011).
2. Support Vector Machine Classifier (SVC) - or Multi-class Support Vector
Machine - is a generalisation proposed by Weston and Watkins (1998) of the
binary classification Support Vector. Instead of computing the probability of
an observation corresponding to a certain class (like the Logistic Regression),
this method represents all datapoints in an n-dimensional space, and aims at
creating a boundary, called a hyperplane, that separates the datapoints into
classes. The algorithm tries to maximise the distance between the boundary
and the nearest datapoints. Real-world data is seldom linearly separable, so
it becomes computationally expensive to project all data into a higher
dimensional space for calculating the distances to the optimal boundary. To
overcome this computational hurdle, SVM uses the kernel trick, a method
that uses a kernel function that takes two vectors/datapoints in the original
space and computes their dot product in the feature space. Since the vectors
are normalised the result is related to the Euclidean distance of both vectors -
the distance we wanted to compute. In other words, this method shortcuts the
computation of the distances from the datapoints to the possible hyperplanes,
by performing them in the original n-dimensional space, thus reducing wall
time (Adankon and Cheriet, 2009). We have used the scikit-learn implementation
of SVM based on the libsvm library.
3. Naive Bayes Classifier (NBC) is a supervised learning method based on Bayes
theorem, based upon the statistical independence of features. This simplified
approach to learning shows it is up to par with more sophisticated classifiers,
namely when dealing with high dimensionality and complex classification
problems (Rish, 2001). Naive Bayes algorithms are thus very efficient to train and
require little data to converge. This derives from the fact that they only
require to compute the probability of each class, the conditional probabilities of
each input value given a certain class, and the mean and standard deviation
values of each attribute for each class. In this paper, we use the Gaussian
Naive Bayes implementation from scikit-learn which presupposes a Gaussian
distribution of the features.
4. Random Forest Classifier (RFC) is a learning method that combines tree
predictors working together to minimise the error (Breiman, 2001). As thoroughly
explained by Fawagreh et al. (2014), each decision tree in the forest is a base
classifier using a sample of the instances in-bag, hence the bagging technique.
The trees are combined through a voting system - one vote per tree - where
the forest chooses the class with most votes. Another aspect that improved
the randomness of the trees was the use of the Gini index - features with the
highest index are used to split the inner node of the tree. This algorithm
presents great results when dealing with data noise and avoiding overfit, and
handles large datasets with high dimensionality. Here again, we are using its
scikit-learn implementation.
5. Extreme Gradient Boosting (XGBC) Classifier proposed by Chen and Guestrin
(2016) is a machine learning algorithm used for tree boosting that uses data
compression and sharding (a data partition technique) to scale to large amounts
of data. Due to its capability to avoid overfitting and its efficient use of large
amounts of data, it has become one of the most popular ML methods in the
last few years (Sahin, 2020). Having the Friedman (2001) gradient boosting
technique as its pillar, XGBoost uses a differentiable loss function and optimises it
with a gradient descent algorithm, in order to build an ensemble of classification
trees. For this algorithm, we have used the authors' implementation package
(Chen and Guestrin, 2016).
The TPOT auto-ml library automatically selects the best model and we use that
result to compare with the others.
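The kernel trick mentioned in the SVC entry above can be illustrated with a toy case. The sketch below uses a homogeneous polynomial kernel of degree 2 on two-dimensional vectors, chosen for illustration only; the text does not say which kernel was used in the experiments. Evaluating the kernel in the original space gives the same number as an explicit dot product in the higher-dimensional feature space.

```python
import numpy as np


def phi(v):
    """Explicit degree-2 feature map for a 2-D vector (3-D feature space)."""
    x1, x2 = v
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])


def poly_kernel(a, b):
    """Degree-2 homogeneous polynomial kernel, computed in the original space."""
    return float(np.dot(a, b)) ** 2


a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

explicit = float(np.dot(phi(a), phi(b)))  # dot product after projecting
implicit = poly_kernel(a, b)              # same value, no projection needed
```

Both routes return the same value, but the kernel never builds the projected vectors, which is where the computational saving comes from.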
In order to have all features in a similar scale we have applied a scaling method
when preprocessing the data. MinMaxScaler was the best choice since it preserves
the shape of the original distribution. It does not significantly change the
information embedded in the original data. Note that MinMaxScaler does not reduce the
importance of outliers. The default range for the feature returned by MinMaxScaler
is 0 to 1.
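As a small illustration with invented values, MinMaxScaler maps each feature to (x - min) / (max - min). The outlier in the example stays at the top of the range and compresses the remaining values, which is why the scaler does not reduce the importance of outliers.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# One feature with an outlier (illustrative values).
X = np.array([[1.0], [2.0], [3.0], [100.0]])

scaler = MinMaxScaler()          # default feature_range is (0, 1)
X_scaled = scaler.fit_transform(X)

# The same transformation written out by hand.
manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
```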
Here we present a list of the main characteristics of the experiments' environment:
1. Lenovo ThinkPad P50 with an Intel Xeon processor (2.8GHz), 32 GB of RAM,
1 TB SSD;
2. Windows 10 64-bits;
3. Python 3.9.1 64-bits;
4. Pandas 1.2.0;
5. scikit-learn 0.24.0;
6. TPOT 0.11.7.
Performance measures
We used two different tools for comparison purposes: the confusion matrix and the
F1-score. The confusion matrix is the most detailed view of how a particular machine
learning model is performing in a classification problem (Tharwat, 2018). Through
this tool, we are able to assess each of our model's predictions and compare them
with the correct value.
Figure 3.2: Example of a confusion matrix for a binary classification problem.
Figure 3.2 shows a generic matrix for a binary classification problem where we
can observe each possible classification:
• True positive (TP) corresponds to the model's correct hits.
• False negative (FN) represents every missed case, where the model
underestimated.
• False positive (FP) represents false alarms, where the model overestimated.
• True negative (TN) corresponds to the correct rejections made by the model.
For our specific problem where we have four classes representing the risk levels of
liquidity, we will have a 4x4 matrix for each model, which is simply a generalisation
of the one just presented.
There are several metrics that one can extract from these statistics. However,
we will focus on the F1-score and the two metrics from which it is derived: precision -
or positive predictive value, that is, the share of positive results that are true
positives - and recall - also known as sensitivity or true positive rate, which measures
the share of positive hits among all the actual positives:
• F1-score represents the harmonic mean of precision and recall. It is most suited
for uneven class distributions, as is the case of our dataset. It is calculated as
$$F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \qquad (3.1)$$
where
$$\mathrm{precision} = \frac{TP}{TP + FP} \qquad (3.2)$$
$$\mathrm{recall} = \frac{TP}{TP + FN} \qquad (3.3)$$
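For a multi-class problem, Equations 3.1 to 3.3 can be applied to each class in turn, treating that class as "positive" and all others as "negative". The sketch below does this for an invented 4x4 confusion matrix (rows are true classes, columns are predictions). Averaging the per-class scores with equal weight (macro averaging) is an assumption, since the text does not state how the per-class values were combined.

```python
import numpy as np

# Hypothetical confusion matrix: rows = true class, columns = predicted class.
cm = np.array([
    [50,  5,  0, 0],
    [ 4, 30,  6, 0],
    [ 0,  5, 20, 5],
    [ 0,  0,  2, 8],
])

tp = np.diag(cm)                 # correct hits per class
fp = cm.sum(axis=0) - tp         # predicted as the class, but wrong
fn = cm.sum(axis=1) - tp         # belonging to the class, but missed

precision = tp / (tp + fp)       # Equation 3.2, per class
recall = tp / (tp + fn)          # Equation 3.3, per class
f1 = 2 * precision * recall / (precision + recall)  # Equation 3.1

macro_f1 = f1.mean()             # assumed averaging scheme
```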
Train-test split
Our first approach to evaluating the performance of each model is through a
train-test split of the available sample data. As a general principle, we used 80% of the
data for training and 20% for testing.
The assessment is organised as follows:
1. Use the MinMaxScaler, as specified above;
2. Iterate through all machine learning models;
3. Fit the model to the data;
4. Assess train and test scores;
5. Compute the confusion matrix;
6. Store the results.
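The six steps above can be sketched with scikit-learn as follows. The data is synthetic, only three of the five models are included to keep the example short, and several details are assumptions rather than the documented setup: the stratified split, fitting the scaler on the training part only, and the macro-averaged F1-score.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the confidential data: 4 risk classes labelled 1..4.
X, y = make_classification(n_samples=400, n_features=15, n_informative=6,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
y = y + 1

# 80% for training, 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Step 1: scale the features (scaler fitted on the training part only).
scaler = MinMaxScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "LG": LogisticRegression(max_iter=1000),
    "NBC": GaussianNB(),
    "RFC": RandomForestClassifier(n_estimators=50, random_state=0),
}

results = {}
for name, model in models.items():            # step 2: iterate over models
    model.fit(X_train_s, y_train)             # step 3: fit
    pred_train = model.predict(X_train_s)
    pred_test = model.predict(X_test_s)
    results[name] = {                         # steps 4 to 6: score and store
        "f1_train": f1_score(y_train, pred_train, average="macro"),
        "f1_test": f1_score(y_test, pred_test, average="macro"),
        "confusion": confusion_matrix(y_test, pred_test, labels=[1, 2, 3, 4]),
    }
```

Comparing f1_train with f1_test for each model gives a first indication of overfitting before moving on to cross-validation.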
Cross-validation
When we are dealing with small or medium datasets a simple train-test split will
most likely misrepresent our real-world problem by missing some classes. This is the
main indication for using cross-validation, where every single observation is eligible
for the train and test sets. The technique consists of splitting the dataset into a
specific number of folds, or partitions, and iterating through the partitions.
Figure 3.3: Example of a 5-fold cross-validation process.
In Figure 3.3 we picture how a 5-fold cross-validation example would proceed.
First, the dataset is split into 5 folds. Then, in each of the five iterations, one of
the folds assumes the role of test fold and the other four act as the training fold. In each
iteration, the machine learning algorithms are trained on the training fold, and their
performance is assessed on the test fold. By the end of the five iterations, the average
of the performance obtained on each iteration is the value considered for comparison.
Cross-validation is the preferred method for assessing model performance because it
gives models the opportunity to train on multiple train-test splits. This will better
indicate how well a model will perform on unseen data. Conversely, a simple
train-test split is dependent on just a single data split which can overestimate the overall
performance.
In this experiment we used StratifiedKFold (a form of cross-validation) to
preserve the percentage of samples among classes. The purpose of this specific form
is for each test fold to be as close as possible to the whole dataset. The stratification
ensures the class frequencies of the partitions are approximately equal to those of the
complete dataset. This is particularly advantageous in an imbalanced dataset scenario,
where this method ensures every class is represented.
The use of cross-validation can also raise some issues. Since we are assessing
the performance of a model on several splits, situations may arise where data leaks
from one iteration to another. In other words, data leakage can happen when we
are learning from both the testing and training set. If we do any pre-processing
outside the cross-validation algorithm, we will bias our results and most likely overfit
our model. To avoid this common problem we feed our cross-validation cycle the
entire dataset and perform every transformation within each iteration. Although
we concede that this repetition takes its toll on performance, the extra step
assures no data is leaking from each of the splits or iterations.
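One common way to keep every transformation inside each iteration is to wrap the pre-processing and the model in a scikit-learn Pipeline and hand the whole pipeline to the cross-validation routine. The sketch below is an illustration on synthetic data, not the authors' code. The choice of Random Forest, the 10% missing rate and the macro-averaged F1-score are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

# Knock out roughly 10% of the values to mimic missing reports.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.10] = np.nan

# Imputation and scaling are re-fitted on the training folds of each iteration,
# so no statistic computed on a test fold reaches the model.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", MinMaxScaler()),
    ("model", RandomForestClassifier(n_estimators=50, random_state=0)),
])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="f1_macro")
mean_f1 = scores.mean()
```

The price of this design is that the imputer and scaler are fitted once per fold, which is the performance toll acknowledged in the text.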
F1-score is used as a performance measure since it keeps a balance between
precision and recall. Furthermore, since we observe uneven class distribution in the
dataset, F1-score is more appropriate than AUC (F1 gives a score for a specific
threshold, whereas AUC averages over all possible thresholds).
The confusion matrix was selected as the best tool for describing the performance of a
classification model. This is an NxN matrix where N is the number of classes in our
classification problem (as mentioned earlier, classes 1, 2, 3, and 4 representing the
risk tiers of any given financial institution).
We also find it relevant to include expert judgement to reinforce the final risk
assessment. To this end, qualitative data sources like internal notes and risk assessment
reports, as well as risk scores reassigned by the supervisors, should contribute to the
model's learning phase.
Finally, we believe the other risk perspectives comprised in the SREP
methodology should also be addressed using the same methodology. Ultimately, combining
all risk perspectives could be a stepping stone for regulators as a support to the
SREP exercise.
Chapter 4
Approaching European
Supervisory Risk Assessment with
SupTech: A Proposal for an Early
Warning System
4.1 Introduction
In recent years, the use of decision support systems has skyrocketed, with machine
learning (ML) spearheading the change. The financial industry has always been one
of the main drivers of that development (Zopounidis et al., 1997). As the amount
of data collected soars and computing power rises to meet the challenge, the use
of classical statistics such as linear and logistic regressions is gradually declining.
Although they were once the mainstay of decision support systems, nowadays they
tend to be recalled sporadically, and mainly for their better comprehensibility in
comparison to most ML models (Yang and Wu, 2021). The current research problem
is how to leverage ML to support risk assessment processes at central banks, using
quantitative supervisory data.
Recent uses of machine learning have unveiled data patterns that were as of yet
undiscovered (Huang et al., 2021). These applications have also expanded to the
fields of regulation and supervision, as described by Hertig (2021). For supervisory
purposes, there has been a huge increase of interest in developing sup-tech tools.
As Beerman et al. (2021) reported, the number of ongoing ML projects in this field
skyrocketed from 12 in 2019 to 71 as of December 2021. The pandemic forced an
off-site approach to what was previously required to be done in person. In the past two
years we have seen an increasingly higher number of production-ready systems
applying ML to support central banks' tasks (Massaro et al., 2020). From the specific
standpoint of supervision, the work from Filippopoulou et al. (2020) is a watershed
in EWS development at central banks, using the ECB Macro-prudential Database to
address credit risk. This work, along with the EWS proposed by Pompella and Dicanio
(2017), supports the importance of these systems to support rating assignments and
alert to distress signals.
The amount of data retrieved in the supervisory framework is overwhelmingly
high (Authority, 2013). Additionally, supervisors often ask for complementary
information, either quantitative or qualitative. Even though National Central Banks
(NCBs) are equipped with business intelligence systems that allow them to organise
most quantitative information, data analysis is mostly done in an ad-hoc manner that
is impractical for a prompt spotting of risky events (Broeders and Prenio, 2018).
Besides, this method only looks at past events, making it impossible to
systematically test alternative economic scenarios. Furthermore, we must mention that risk
methodologies might vary, making it difficult to compare not only the assessments,
but also the evolution of the classifications.
Traditional approaches already set out an organised perspective of the reported
data, through dashboards and reports that provide aggregated and specific views
of key indicators (di Castri et al., 2019). However, these approaches only consider
past events and they are constrained by the regulatory framework (not to mention,
they lack what-if analysis and decision processes built on that data). The use of
innovative technologies to support supervisory processes is defined by Broeders and
Prenio (2018); Doerr et al. (2021) as sup-tech, and these authors summarise the
barriers to adoption in three main items:
1. Frequent regulatory updates;
2. Conservative industry;
3. Lack of qualified human resources.
From a data perspective, Early Warning Systems for predicting banking crises
have also been in the spotlight. Casabianca et al. (2019); Consoli et al. (2021)
are some of the many examples of landmark findings in that area, along with the
previously mentioned Filippopoulou et al. (2020). However, none of these authors
explore the information available in the European supervisory framework.
In a previous work, we have addressed the issues of using a single risk
methodology, selecting literature-supported ML models to evaluate the risk level of banks,
and using up-to-date real-world supervisory data from the Portuguese banking
sector (Guerra et al., 2022). The previous work addressed the concept of liquidity risk
since it is crucial for a bank's ability to operate (Vento and Ganga, 2009) and it
can render a bank nonoperational in a matter of days (Shah et al., 2018). In our
paper we expand previous findings to the other risk perspectives comprised in the
Supervisory Review and Evaluation Process (SREP).
In our current study, we have extended the sample from March 2014 until August
2021. This data is extensively validated by Banco de Portugal and European Central
Bank (ECB) quality assurance processes. The quality of gathered information allows
for accurate assessment, thus ensuring a positive correlation between risk prediction
and the observed phenomena (Ng, 2011).
Another key component of our approach is the way we set up the classification
problem. Contrary to what is commonly found in the literature, we reiterate the
importance of considering a multi-tier classification approach to this problem. Our
data being provided by a real-world context, we feel highly confident in expanding
from the fail/no-fail classes and adopting the four classes comprised in the RAS
methodology, a European-wide risk assessment methodology:
1. Low-risk;
2. Medium-low risk;
3. Medium risk;
4. High risk.
Our work also showcases literature-backed ML models for structured financial data
that support the efficiency of supervisory processes.
Based on the findings of our study, we provide comprehensive guidance for the
development of valuable supervisory use-cases enhanced by innovative techniques.
The purpose of this work is to leverage the above-mentioned aspects, and
expand the academic body of knowledge of quantitative risk assessment for
prudential supervision. From a supervisor's standpoint, we aim to bring better insights
into the data and attain higher efficiency - automating resource intensive tasks and
freeing up analysts for more integrative analysis (Beerman et al., 2021). As pointed
out, there is room for improvement in this field, since less than 25% of sup-tech
systems are exclusively intended for quantitative purposes. Following in that lead,
this work develops a methodology to address each of the risk perspectives in the
RAS methodology: credit, market, operational and profitability.
4.1.1 Related work
The use of machine learning for risk assessment has been a highly debated topic, both
from an academic and industry standpoint. Since the 2000s (Galindo and Tamayo,
2000), risk assessment has been recurrently identified as a top-priority investment for
developing the data literacy of financial institutions. As recently shown by Antunes
(2021), risk assessment by central banks is paramount to accurate supervision and
less biased than the self-assessments carried out by the banks themselves.
Additionally, Galindo and Tamayo (2000) established that tree-based models
perform consistently better than artificial neural networks (ANNs) considering structured
financial data. This finding is one of the pillars of our approach and it has
been confirmed by several other authors (Xia et al., 2017; Chen and Guestrin, 2016;
Climent et al., 2019).
In their literature review, Leo et al. (2019) highlighted the popularity of machine
learning applications for risk management in banking industry, while also noting the
experimental nature of most approaches. The authors also remark the discrepancy
between the high level of academic research concerning this area versus the de facto
industry applications.
This debate has focused around two main issues:
• Finding the right risk assessment measure.
• Finding the adequate machine learning algorithm to build a risk assessment
model.
Guerra and Castelli (2021) studied both of these aspects appraising several
methodologies for assessing distress signals. This review spans from 2004, when
Hillegeist et al. (2004) turned the page on two landmark methods (the Z-score (Altman,
1968) and the O-score (Ohlson, 1980)) by proposing the use of the Black–Scholes–Merton
option-pricing model, up until 2019, when Kou et al. (2019) listed the most common
methodologies for assessing systemic risk on the financial system.
On the same topic, Climent et al. (2019) used XGBoost to identify the best
predictors of bank failure and develop a classification model to label failed and
non-failed banks in the Eurozone. The data used in their study comprised 25 annual
financial ratios of commercial banks.
The majority of current literature converts the risk assessment problem into a
binary classification task, where each bank is labelled as "failed or likely to fail" or
"no fail" (Climent et al., 2019; Kolari et al., 2019; Leo et al., 2019; Filippopoulou
et al., 2020; Wang et al., 2021). These studies usually rely on public datasets, where
the target variable is derived from a set of financial ratios.
At central banks, as clearly pointed out by Stock and Watson (2001), economists
are responsible for conducting risk analysis and performing scenario testing.
Since the appearance of the Single Supervisory Mechanism (Commission, 2015)
we are bearing witness to a standardisation of reporting requirements and methodologies.
The heterogeneous landscape of financial performance measures identified
in the literature has been increasingly replaced by the use of the Supervisory Review
and Evaluation Process (SREP) (Bank), leading us to leverage on this risk
assessment methodology. SREP is an ongoing work by the European Central Bank
(ECB) and the National Central Banks (NCBs) that provides an integrated view on
each bank according to five risk perspectives: liquidity, credit, market, operational
and profitability. The Risk Assessment System (RAS) is the quantitative pillar of
the methodology, and it is the focal point of this work.
Selecting the adequate machine learning methods applied to central banking,
we found that it recently became a hot topic from both an academic and NCBs
standpoint (Lee and Shin, 2020; Huang et al., 2021; Wang et al., 2021; Alonso
and Carbo, 2020; Antunes, 2021). Beerman et al. (2021) report that the pandemic
prompted NCBs to rely on sup-tech solutions in their everyday processes. Several
of the surveyed authorities already have operational systems. For instance, Central
Bank of Brazil has a tool that examines the whole credit portfolio of a bank to detect
exposures with unrecognised expected losses; Bank of Spain is applying inference
maps to model the relationships between borrowers and evaluate the risk impact;
and the Monetary Authority of Singapore is developing a tool to automate data
analysis so that supervisors can rely on complete datasets, instead of sampling. For
this reason, we expanded our research to applications of ML to risk assessment.
By broadening this research, we can evaluate how ML has been used for financial
structured data and then focus on the central bank case.
Stress-testing is one of the many forms of risk assessment that is particularly used
at central banks. Kolari et al. (2019) challenged the concept of a bank’s resilience
by suggesting that it mostly represents a bank’s ability to deal with a specific risk
supported by its own capacity to absorb it. In such a setting, applying a risk-focused
methodology like SREP allows supervisors to better assess the root causes of what
might otherwise be perceived as a general business model issue.
Chakraborty and Joseph (2017) presented a series of ML applications to financial
problems and they analysed the most frequently used algorithms, like tree-based
ensembles, artificial neural networks and clustering techniques. The authors also
showcase three use-cases at central banks, that establish ML as a better solution than
traditional statistics. The most relevant to our work is one that develops a series
of alerts (EWS) based on the balance sheet structure of a bank, in a supervisory
context. This shows not only how relevant supervisory data is to a proactive risk
assessment, but also how it can be used to sense the risk proclivity of supervised
institutions.
Recent technological developments have allowed newer and more complex models
to emerge (Strydom and Buckley, 2019), such as deep learning (DL) and extreme
gradient boosting (XGBoost) (Abellán and Castellano, 2017). Evidence shows those
analysis methods have a unique capacity for capturing the intricacies of financial
phenomena (Ribeiro et al., 2012; Huang et al., 2021).
Iturriaga and Sanz (2015) showed that modelling time series is where DL excels.
Also Petropoulos et al. (2018) leverage on DL’s precision and develop an Early
Warning System (EWS) for predicting failure of Greek banks (data in 2005-2015).
This is a landmark report on the use of advanced ML in a daily supervisory context.
Wang et al. (2021) proposed an add-on to the conventional logit-based EWS, which
involves simulating expert voting through a Random Forest based system, and that
showed valuable results in predicting systemic crises.
Broeders and Prenio (2018) organise supervisory innovation concepts and present
a series of use-cases where early adopters are implementing innovative approaches
(sup-tech), converting retrieved data into predictive indicators. These works are of
great importance to systematise how to implement this technology. The increasing
amount of available data is one of the main drivers for the development of ML-based
systems, as Chakraborty and Joseph (2017) also have claimed. Banking supervision
acknowledges the benefits of innovative technologies and the importance of keeping
up with the variety of sup-tech initiatives being developed. These initiatives have
the potential to dramatically change the supervisory process; anticipating the
consequences of current behaviours instead of belatedly reacting to past events. The same
authors explore several use-cases from the Central Bank of the Republic of Austria
(OeNB), Monetary Authority of Singapore (MAS), Securities and Exchange
Commission (SEC), among others. Business process effectiveness, cost reduction
and increased analytical capability, are noted as the main drivers of the sup-tech
endeavour. These supervisory agencies report several challenges in exploring and
implementing these technologies, such as:
• The technical know-how and appropriate infrastructure to support these
analytical solutions;
• The legal framework to support the use of the relevant information;
• The internal support from management to invest in these initiatives and from
the end-users, to provide the expert knowledge and to use and promote the
new analytical tools.
Board (2020) also shows how the balance of supply and demand ignited the
development and use of sup-tech tools. From the demand side, these authors mention,
among other aspects, enhanced supervisory and regulatory requirements and
improved risk management capabilities, where the automation of data retrieval and
summarisation can drastically improve supervisory processes. From the supply side,
the increasing availability of data and new analytical methods are among the top
supporters of the above-mentioned regulatory necessities. Listed benefits of
implementing these ground-breaking tools include:
• Enhanced analytical capabilities;
• Enriched visuals, stemming from state-of-the-art data collection to
sophisticated dashboards;
• Reduced costs, as a consequence of automation.
Nevertheless, adopting new analytical processes inevitably brings on fresh
challenges. Recognising this aspect, Jagtiani et al. (2018) expand on the impacts of
these new analytical solutions and possible risks of adopting them, such as:
• Third-party vendor risk, where banks give access to outside specialists - data
scientists and business users involved in setting up the tool - that can lead to
data breaches. Additionally, if the vendor is a dominant player in the market,
that circumstance can create a single point of failure in the financial system.
• Cyber-security risk, which is related to the previous topic, as vendors might
not comply with supervisors’ security requirements. Additionally, by allowing
for external sources of data, banks and central banks become exposed to that
channel and the information therein contained.
• Model risk, where systems based on complex machine learning models or even
black-boxes make decisions that might not make sense from a business
perspective, hence providing wrong predictions.
Another factor with major impact in ML use is the comprehensibility of the
models. Although ML models are seldom capable of explaining prediction, they
consistently outperform the classic approaches. Dastile et al. (2020) published a
systematic literature review contrasting these techniques for a credit scoring problem,
and they stress the lack of interpretability of DL as the main barrier to adoption
in supporting financial decisions.
It is worth bearing in mind that pointing out the direction of future research is as
important as signalling risks associated with implementing ML models. Kou et al.
(2019) present a thorough report on state-of-the-art applications and ML techniques
to assess systemic risk. Based on the existing technology, they suggest several future
work areas, like big-data analysis, data-driven research and policy analysis with data
science.
4.2 Methodology
Developments in the area of data science and machine learning usually fall into one of
two categories: developing a new computational method to better solve an existing
problem; or alternatively, using the existing methods to address a new problem. In
this work, we aim to address a problem that was yet to be solved using machine
learning: supervisory risk modelling.
Figure 4.1 illustrates how we attained our objective in a step-by-step diagram,
as a development of what was presented in Guerra et al. (2022). The first step
comprises a data retrieval process from Banco de Portugal supervisory data system,
including a wide set of features and the target variables we want to model. Next,
there is a transformation process that is responsible for cleaning the data, dealing
with missing values and selecting the most significant features. In the following
step, we compare the ML models for this task using train-test split, cross-validation
and the TPOT AutoML framework (Olson et al., 2016). The F1-score and confusion
matrices are used to compare the results and select the best model that can then
support an Early Warning System for the RAS risk perspectives.
Figure 4.1: Methodology process overview.
In this section we present the steps carried out in this research, beginning with
explaining how the data was retrieved, what transformations were required, which
features were selected and its criteria, and finally, how the models’ performance was
evaluated.
4.2.1 The Data
One of the main pillars of this paper is the supervisory data collected by Banco de
Portugal (BdP) within the Capital Requirements Regulation (CRR) and Capital
Requirements Directive IV (CRD IV) (Parliament, 2013). Our data ranges from
March 2014 until August 2021 and most of the data used for the purposes of this
paper is quarterly (Authority, 2013). Due to confidentiality issues, the dataset used
in this study cannot be made available for public consultation.
Data is extracted via an SQL query from BdP’s production database (with no
filters regarding reference date, banks or their consolidation level) into a
comma-separated-values (csv) file. The result set is imported using a Python script within
Jupyter notebooks. An extraction routine was implemented to assure consistency
and automation in data gathering.
To account for all possible predictors, we have selected our feature space from
the four main reporting frameworks for banking supervision: Financial Reporting,
Common Reporting, Asset Encumbrance and Funding Plans.
The data resides in a relational database where each row represents a reported
value. This means that in the data source, several rows represent a single observation.
During extraction, data is anonymised using the MD5 algorithm through an SQL
hashing function. This step assures the same identifier for every row in the same
observation. The extracted dataset follows this column schema:
1. ID - a hash code representing each observation’s identifier;
2. variable - a code with business meaning that represents each reported value;
3. val - the actual numeric value of the variable.
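
As an illustration of the anonymisation step, the sketch below hashes a composite key with MD5 so that every row of one observation receives the same ID. The study performs this inside the SQL query; the Python version, the key layout (bank code plus reference date), the function name and the sample rows are all assumptions made for the example.

```python
import hashlib


def anonymise_id(bank_code: str, reference_date: str) -> str:
    """Return an MD5 hex digest for one (bank, reference date) observation."""
    # Hypothetical key layout: the source does not state which fields are hashed.
    key = f"{bank_code}|{reference_date}"
    return hashlib.md5(key.encode("utf-8")).hexdigest()


# Three reported values (rows) belonging to two observations; values are made up.
raw_rows = [
    ("BANK_A", "2014-03-31", "var_001", 10.0),
    ("BANK_A", "2014-03-31", "var_002", 0.5),
    ("BANK_B", "2014-03-31", "var_001", 7.0),
]

# Rewrite every row into the (ID, variable, val) schema described above.
extracted = [
    (anonymise_id(bank, date), variable, val)
    for bank, date, variable, val in raw_rows
]
```

Because the digest depends only on the key, the first two rows share one ID while the third receives a different one, which is the property the later pivot relies on.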
4.2.2 Transformations
Preparing the data for machine learning algorithms is the single most critical stage
in such studies and projects. The first step in our study is to pivot the data with
the aim of having each row corresponding to one observation. This transformation
uncovers the sparsity of our feature space, requiring null columns to be dropped.
Another important step is to focus the dataset on the risk perspective to be
evaluated. In our study we are addressing credit, market, operational and profitability
risks. When investigating one risk perspective - one specific target variable - we
drop all the others. This might lead to invalid observations, that is, observations
that only made sense for a certain risk. As a consequence, we discard the rows for
which the selected target value is null.
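
The pivot and the target-focused filtering can be sketched in plain Python as follows. This is a minimal illustration, not the routine used in the study: the variable codes, the target names and the sample values are hypothetical, and dictionaries stand in for the data frame a real pipeline would use.

```python
from collections import defaultdict

# Long-format rows following the (ID, variable, val) schema; values are made up.
long_rows = [
    ("obs1", "feat_a", 1.0), ("obs1", "target_credit", 2.0),
    ("obs2", "feat_a", 3.0), ("obs2", "feat_b", 4.0),
    ("obs3", "feat_b", 5.0), ("obs3", "target_credit", 1.0),
    ("obs3", "target_market", 3.0),
]
TARGETS = {"target_credit", "target_market"}  # hypothetical target codes
selected_target = "target_credit"

# 1) Pivot: one dictionary per observation, keyed by variable code.
wide = defaultdict(dict)
for obs_id, variable, val in long_rows:
    wide[obs_id][variable] = val

# 2) Keep only observations with a value for the selected target,
#    and drop every other target from the observations that remain.
dataset = {}
for obs_id, values in wide.items():
    if values.get(selected_target) is None:
        continue
    dataset[obs_id] = {
        k: v for k, v in values.items()
        if k not in TARGETS or k == selected_target
    }

# 3) Surviving columns: variables reported by at least one kept observation.
columns = sorted({k for values in dataset.values() for k in values})
```

Here `obs2` is discarded because it has no credit-risk target, and `target_market` never reaches the final column list.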
Dealing with missing values is the final step of the transformations phase. Our
dataset is exclusively numeric and each column/feature has its own distribution.
Therefore, we opted for imputing the missing values of each feature with the median,
since it provides a more accurate perspective on the data’s distribution when dealing
with up to twenty percent of missing values (Acuna and Rodriguez, 2004).
By the end of these steps our dataset consists of 9262 rows and 82576 columns.
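
The median imputation mentioned above can be illustrated with a short sketch. The function name and the sample column are assumptions; `None` stands in for a missing value.

```python
from statistics import median


def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


# Made-up feature with two gaps and one large outlier.
feature = [1.0, None, 3.0, 100.0, None]
imputed = impute_median(feature)
```

With these values the gaps are filled with 3.0, whereas a mean-based fill would be pulled towards the outlier (about 34.7), which shows why the median is the more robust choice for skewed features.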
4.2.3 Feature Selection
As we saw in the previous subsection, this dataset is extremely sparse - here the
inaccuracy of the term "extremely" endeavours to capture the fact that this is a
wide dataset (more features than observations). Although we have considered using
Principal Component Analysis (PCA), this method compromises model
comprehensibility. Since it projects the original features into a lower dimensionality feature
space, there is always information loss from discarding the components with less
variance/information. The selection criteria is based on the covariance matrix, and
does not account for the target variable to be studied. As this dataset comprises
five different target variables - one per risk - PCA might exclude features regardless
of their contribution to a specific target.
To address the above-mentioned issues we have used the Random Forest feature
selection algorithm, with an 85% threshold for feature importance. Tree-based
models are best to perform this task since they not only take into account the target
variable to be explained, but also a priori they rank features according to how well
they improve the purity of nodes (gini impurity). The closer a node is to the root,
the greater impurity decrease occurs (i.e. the "cleaner" data becomes). Contrarily,
leaf nodes have smaller impurity decrease. Hence, pruning below a certain node
results in a subset of the most important features.
This method allowed us to technically assess the list of features that explain at
least 85% of our target variable. From the original total of 82576 features we selected
2608 features - for credit risk. This number varied for different target variables.
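
One plausible reading of the 85% threshold is a cut on cumulative importance: rank the features by importance and keep the smallest leading set whose importances add up to at least 85% of the total. The sketch below implements that reading only; it is an assumption about the selection rule, and the importance values are invented. In practice the scores would come from a fitted forest, for example the `feature_importances_` attribute of a scikit-learn `RandomForestClassifier`.

```python
def select_by_cumulative_importance(importances, threshold=0.85):
    """Keep top-ranked features until their importances reach the threshold share."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        if cumulative >= threshold * total:
            break  # the features kept so far already cover the requested share
        selected.append(name)
        cumulative += score
    return selected


# Hypothetical importances for five features (they sum to 1).
importances = {"f1": 0.50, "f2": 0.25, "f3": 0.15, "f4": 0.07, "f5": 0.03}
selected = select_by_cumulative_importance(importances)
```

With these values the first three features cover 90% of the total importance, so `f4` and `f5` are dropped.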
As a final check, we have computed the correlation matrix for each target variable
to assure features and target were not highly correlated - Pearson’s correlation
coefficient less than 0.3.
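
The correlation check can be sketched as follows. The 0.3 limit comes from the text; applying it to the absolute value of the coefficient is an assumption, as are the function names and the sample data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def highly_correlated(features, target, limit=0.3):
    """Names of the features whose |r| with the target reaches the limit."""
    return [
        name for name, values in features.items()
        if abs(pearson(values, target)) >= limit
    ]


# Made-up data: "proxy" is a rescaled copy of the target, "noise" is unrelated.
target = [1.0, 2.0, 3.0, 4.0]
features = {"proxy": [2.0, 4.0, 6.0, 8.0], "noise": [1.0, -1.0, -1.0, 1.0]}
flagged = highly_correlated(features, target)
```

A feature flagged this way behaves like a near-copy of the target and would be a candidate for leakage, which is what the final check guards against.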
4.2.4 Experiments
In the following subsections we lay out the three approaches followed to assess the
best machine learning model:
• Train-test split: simply splitting the dataset in train and test sets.
• Cross-validation: using different partitions of the data to test and train the
model on every observation, iteratively.
• TPOT AutoML: an AutoML framework by Olson et al. (2016), for comparison
purposes.
These approaches provide a performance measure that summarises the
generalisation capability of every model and allows for a reliable and fast comparison among
models. F1-score was used as a performance measure since it keeps a balance
between precision and recall. Furthermore, since we observe uneven class distribution
in the dataset, F1-score is more appropriate than the Area Under the Curve (AUC)
(F1-score gives a score for a specific threshold, whereas AUC averages over all
possible thresholds). For a full detail of each evaluation, the confusion matrices are also
provided.
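
To make the link between the confusion matrices and the F1-score concrete, the sketch below derives a per-class F1 from a four-class confusion matrix using F1 = 2TP / (2TP + FP + FN). The matrix is invented, and the unweighted (macro) average is an assumption: the text does not state which multi-class averaging was used.

```python
def per_class_f1(confusion):
    """F1-score of each class from a square confusion matrix.

    confusion[i][j] counts observations of true class i predicted as class j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true other
        fn = sum(confusion[k]) - tp                       # true k, predicted other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Hypothetical confusion matrix for the four risk levels (rows: true class).
confusion = [
    [8, 2, 0, 0],
    [1, 6, 1, 0],
    [0, 2, 5, 1],
    [0, 0, 1, 3],
]
f1_by_class = per_class_f1(confusion)
macro_f1 = sum(f1_by_class) / len(f1_by_class)  # assumed averaging scheme
```

Reading the scores per class alongside the matrix shows which risk levels are confused with their neighbours, information that a single aggregate number hides.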
For the purposes of this study we selected and evaluated the performance of each
of the following models:
• Logistic Regression (LG); used only for benchmarking;
• k-Nearest Neighbours Classifier (kNN);
• Random Forest Classifier (RFC);
• Extreme Gradient Boosting Classifier (XGBC).
The TPOT framework is an AutoML framework that makes use of genetic
programming to optimise the process of finding the best model for the problem at hand. This
is a rising trend in the usage of machine learning and we have included it in order
to evaluate its adequacy to this problem.
All three approaches comprise an optimisation phase, where we experiment with
a range of values for the hyper-parameters of each of the considered models. For
both the train-test split and cross-validation we carried out a 5-fold cross validated
grid search for the specific parameters of each model. The TPOT framework has an
optimisation step within its pipeline that is fully documented.
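
The mechanics of a 5-fold cross-validated grid search can be sketched without any modelling library: enumerate every hyper-parameter combination, and evaluate each one on five train/test partitions in which every observation is tested exactly once. The helper names and the parameter ranges below are hypothetical; the source does not list the values that were searched.

```python
from itertools import product


def k_fold_indices(n_samples, n_folds=5):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def parameter_grid(grid):
    """Yield one dict per combination of the listed hyper-parameter values."""
    keys = sorted(grid)
    for combo in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, combo))


# Hypothetical ranges for a tree ensemble.
grid = {"max_depth": [3, 6], "n_estimators": [100, 300, 500]}
candidates = list(parameter_grid(grid))
folds = list(k_fold_indices(n_samples=12, n_folds=5))

# A grid search would fit every candidate on each train_idx, score it on the
# matching test_idx, and keep the candidate with the best mean score.
```

The cost grows as the number of candidates times the number of folds, which is consistent with the long cross-validation wall times reported for the wide datasets in this chapter.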
Just before feeding the data to the ML algorithms, we used the MinMaxScaler to
fit the features in the same scale. This approach preserves outliers and the original
distribution of each feature, hence conserving the information embedded in the data.
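
Min-max scaling maps each feature linearly onto the interval [0, 1]. The sketch below reproduces that arithmetic in plain Python on an invented column; the function name and the handling of constant columns are choices made for this illustration.

```python
def min_max_scale(column):
    """Linearly rescale values to [0, 1]; a constant column maps to 0.0."""
    low, high = min(column), max(column)
    span = high - low
    if span == 0:
        return [0.0 for _ in column]
    return [(v - low) / span for v in column]


# Made-up feature with one outlier (110.0).
scaled = min_max_scale([10.0, 20.0, 15.0, 110.0])
```

The outlier stays at the top of the range and the relative spacing of the other values is unchanged, which is the sense in which the original distribution is preserved. In a train-test setting the minimum and maximum would normally be learned from the training partition only.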
All the experiments were executed at Banco de Portugal using its computing
infrastructure. The specifications of the node assigned to these experiments were
the following:
• 4 Intel(R) Xeon(R) CPUs E7-8891 v4 @ 2.80GHz, 32 GB of RAM, 1 TB SSD;
• Ubuntu 20.04.3 LTS;
• Python 3.8.10;
• Pandas 1.2.0;
• scikit-learn 0.24.0;
• TPOT 0.11.7.
The results of the train-test split evaluation are shown in figure 4.5. Here the
results show a distribution similar to what we observed with credit risk; however,
the scores are slightly better.
The Logistic Regression results suggest that we face linear (or close to linear)
boundaries between classes. This reading is also supported by the fact that its score
is close to k-Nearest Neighbours’.
Still, the use of ensemble tree-based models shows a significant increase in
performance. The spike is not as prominent as with credit risk, and Random Forest
has again a similar, but lower, score than XGBoost - on the order of tenths of a
percentage point. Figure A.3 shows the confusion matrices for the train-test split
evaluation.
Figure 4.6: F1-scores for each model, using cross-validation approach.
However, a random train-test split might give an undervalued or overvalued
perspective of a model’s performance. To validate these findings we applied
cross-validation with F1-score to the whole dataset. The results of this process are shown
in figure 4.6 along with the evaluation of the TPOT framework.
The models show similar scores when compared to each other, with Logistic
Regression faring close to the k-Nearest Neighbours. Contrarily to what we observed
in the train-test split, a more discerning look at the results shows that Random
Forest classifier slightly outperforms XGBoost. TPOT comes in third place in terms
of performance, and it becomes even less appealing if we consider its wall time.
Figure A.4 presents the confusion matrices of this classification process.
4.3.3 Operational Risk
The sample provided to evaluate operational risk has 4819 observations and 3447
features. The wall time needed to evaluate the models on this sample was:
1. Train-test split: 5 minutes and 19 seconds;
2. Cross-validation: 18 hours, 13 minutes and 52 seconds;
3. TPOT framework: 2 days, 15 hours, 31 minutes and 41 seconds.
The train-test split results shown in figure 4.7 paint a different picture than the
other perspectives. Although we can observe a similar distribution of results, the
Logistic Regression presents below-average results on unseen data. Furthermore, the
k-Nearest Neighbours classifier exhibits a slight improvement over the previous model.
Random Forest and XGBoost classifiers again come into the spotlight, with the
latter showing a modest advantage of less than two percentage points. Figure A.5
shows the confusion matrices for the train-test split, for a detailed view of each
classification.
Figure 4.7: F1-scores for each model, using train-test split approach.
Applying cross-validation to this sample reveals several performance
adjustments. Our non-tree-based models - the Logistic Regression and k-Nearest
Neighbours classifiers - showed an increase in their scores, due to the optimisation process.
For Random Forest and XGBoost we see minor adjustments in the F1-score;
however, their performance difference is consistent with the train-test split approach.
This finding confirms their ability to grasp the heterogeneity of regulatory financial
data. The TPOT framework is again in third place, proving to be a poor choice due
to the more than two and a half days of processing. Figure A.6 shows the confusion
matrices for the cross-validation process, for a detailed view of the classifications of
each model.
Figure 4.8: F1-scores for each model, using cross-validation approach.
4.3.4 Profitability Risk
As for our final risk perspective - profitability - we used a sample of 6448 observations
and 3177 features. The processing and evaluation times for each of the approaches
were:
1. Train-test split: 9 minutes and 14 seconds;
2. Cross-validation: 1 day, 2 hours, 25 minutes and 58 seconds;
3. TPOT framework: 1 day, 11 hours, 56 minutes and 42 seconds.
This is the risk perspective with the worst overall results. Figure 4.9 shows the
train and test scores of each model. Logistic Regression and k-Nearest Neighbours
present a paltry performance. Even Random Forest and XGBoost show some
decrease in performance, although still presenting good results. Figure A.7 pictures
the detailed classifications of these models through the confusion matrices.
Figure 4.9: F1-scores for each model, using train-test split approach.
The cross-validation process corrects for any misclassification resulting from an
unfavourable train-test split. In figure 4.10 we show the F1-scores of each model,
including the TPOT framework.
Just as with train-test split, the Logistic Regression and k-Nearest Neighbours
present a low score, when compared to the other algorithms and their performance
in other risk perspectives. Although this seems not related to class imbalance (see
figure 4.2), the complexity of the decision boundaries and the dependence of some
of the features might be the root cause of these floundering results.
Even so, the Random Forest and XGBoost show good results, with the latter
again outperforming the former. The TPOT framework comes in third with average
results and one and a half days of processing, again making it an unsatisfactory
alternative for this task. See figure A.8 for the confusion matrices of the
cross-validation process.
Figure 4.10: F1-scores for each model, using cross-validation approach.
4.3.5 Final remarks
Following our previous work (Guerra and Castelli, 2021), we clearly defined the
required elements for modelling the supervisory risk assessment process comprised
in RAS. First, we suggested the use of SREP’s quantitative pillar - Risk Assessment
System - as a standard methodology to compare the banks at European level. This
methodology is already established across the Euro-area, hence making it the ideal
choice for the task. Moreover, most works in this area adopt a binary classification
of the risk level of the banks, limiting the classification to "failure" or "no failure".
As mentioned before, this approach lacks the flexibility required for central banks to
detect the effect of distress events gradually and earlier in time. This is accomplished
through the progressive multi-class scale provided in the RAS. Additionally, we
identified a research gap specifically addressing the supervisory use-case. Real-world
supervisory data, designed and retrieved for regulatory purposes, has been
proven to provide the most accurate outlook (Broeders and Prenio, 2018; di Castri
et al., 2019; Massaro et al., 2020; Filippopoulou et al., 2020).
We tested the above-mentioned elements and successfully modelled the liquidity
risk of a bank (Guerra et al., 2022). Based on those findings, we set out to generalise
the methodology and model the remaining risk perspectives comprised in the RAS:
credit, market, operational and profitability.
From a technical standpoint, we confirmed that an optimised XGBoost
outperformed the other considered models. This is in accordance with previous literature
results suggesting XGBoost performs best with structured financial data. In
addition to that, we have tested it against the AutoML framework TPOT, a rising trend
in the field. The results showed that due to the characteristics of the dataset - large
number of features and sparse dataset - computing time was extremely taxing, even
with low parameters for the GP algorithm. It might be interesting to reduce the
number of features to fewer than ten, and see how TPOT performs.
From a business perspective, the novelty within the presented results is the fact
that we are modelling a multi-class decision process with real-world supervisory data.
Whereas other works have not explored supervisory data, we rely on the European
regulatory framework and the data collected within it. This data is the pillar of
supervisory processes and brings the structure and context to our models. By relying
on these models, we can develop early warning systems capable of anticipating
distress events, considering the risk measures above, and also give supervisors a
tool to test alternative economic scenarios to prevent pitfalls.
4.4 Conclusion
Streamlining an effective supervisory methodology requires an integrated view of
the risks a credit institution is subject to. In our previous work we have successfully
modelled liquidity risk according to SREP methodology. Once that pillar was set,
we were able to apply the same modelling techniques to the other risk perspectives
comprised in the Supervisory Review and Evaluation Process (SREP) and its Risk
Assessment System (RAS): credit, market, operational and profitability.
Based on the quantifiable mainstay of ECB’s Risk Assessment Methodology, we
classified credit institutions from the Portuguese banking sector according to their
risk level on each of the perspectives encompassed in the methodology. We used
real-life supervisory data and modelled this decision process by comparing several
machine learning techniques, benchmarked against a widely used statistical method.
We have reached significant results clearly establishing that this decision process
can be modelled and that the ML techniques used outperform the classic statistical
approaches.
Regulatory supervisory data is highly correlated and heterogeneous, making the
decision boundaries of this exercise a challenging task. Additionally, real-world
events are seldom represented by balanced data. All risk levels are observed but with
occurrences that are subject to events in a specific point in time. The complexities
of such reality were best represented by ensemble tree-based models, like Random
Forest and XGBoost classifiers. These models can capture the heterogeneous nature
of financial data and establish clear decision boundaries with little error - F1-score
between 87% and 94%. These results were obtained after applying an optimisation
process within the cross-validation cycle.
Given the computational resources available and the cutting edge genetic
programming optimisation pipeline available through TPOT, we expected it to
outperform XGBoost. However, TPOT consistently came in third regarding F1-score,
being outperformed by Random Forest and XGBoost. Its long processing times can
be explained by the dedicated optimisation process, and the fact that our dataset
is sparse (82576 features). The feature selection process is computationally costly
and it might account for a significant share of the wall time.
We firmly believe this work is a meaningful contribution to a set of stakeholders
involved in risk assessment in the banking sector:
• National Central Banks (NCBs) can leverage on the findings of this work and
use these models to develop early warning systems. These sup-tech initiatives
are currently in the limelight, with many projects being developed in this area
by the ECB, Bank for International Settlements (BIS) and worldwide NCBs. A
decision support system like this would provide an enhanced risk assessment
perspective to supervisors.
• Banks and the consulting industry can convey these principles into their own
systems. Consultancy companies can further support their clients in
implementing their decision support systems using the data owned by the banks
themselves. A bank can then proactively monitor and adjust its risk profile
and strategy according to the regulatory requirements.
• Academia can use this work to extend and apply this type of ML methodologies
to expand its usage on a regulatory perspective. Furthermore, we stress the
postulates of using high quality, highly validated relevant data, and adopting
a universal methodology for risk assessment, one that standardises how to
appraise a bank.
Through this paper, we aim to contribute to the technical understanding of
ML that can be applied to sup-tech use cases according to the business needs.
Grounded in historical supervisory data, we propose a Sup-Tech tool that improves
the European supervisory risk assessment by providing early warnings on several
risks.
4.4.1 Limitations and future work
There are several aspects we have identified over the course of this study that would
merit revision and improvement.
The dataset we used in this work reflects the Portuguese banking sector. Ideally,
expanding to the European level and using data from all central banks in the
Euro-area would provide a complete supervisory perspective. Additionally, more diverse
data, with more business models, would strengthen the ML models presented here.
Each of the risks would also benefit from context specific data, in order to enhance
the generalisation of each model. This would also allow the supervisors to reach
more timely decisions. Supervisory data is mostly quarterly, which prevents quick
reactions to adverse events. By combining it with daily data sources, such as market
data, payments systems and credit responsibilities data, we might be able to obtain
a daily signal of each aspect of a bank’s risk. Confirming this decision path will
strengthen the aforementioned models and provide a running risk assessment on
which supervisors can rely.
The ability to explain the reasoning behind each model is of utmost importance,
in particular for critical systems such as for crisis detection. Explainable AI benefits
hold true not only for experts to validate the decision process carried out by the
ML models, but also as common ground language to report any issue to banks.
As such, investing in explainable models will deliver a better understanding of the
technology, bringing supervisors closer to sup-tech, and will also set forth a clearer
communication between institutions and central banks.
Combining our quantitative data with qualitative expert judgement, using
Natural Language Processing (NLP), will allow for automated score adjustments
based on internal supervisory notes and risk assessment reports.
As a final remark, consolidating the results of each risk model with the relevant
qualitative data could provide a single integrated bank score as an additional
measure for the SREP exercise.
Chapter 5
Conclusion
The ever-growing amount of data retrieved by organisations has led almost every
industry to invest in big data technologies. Machine learning is one of today’s top
choices to harvest data, and the financial sector has been one of its main drivers.
Similarly, National Central Banks also thrive by leveraging innovation as a key
pillar of their mission: keeping prices stable and banks healthy. However, industries
are adopting these new technologies at different paces. The much-needed changes
in how supervision is accomplished have been delayed due to the financial sector
being particularly conservative, the constant updating of the regulatory outlook, and
the lack of qualified professionals who can carry out these changes in a sustainable
manner. This is the identified gap that triggered our research.
Here, we use machine learning in the supervisory context with the purpose of
modelling risk assessment processes. We achieved our objective and contributed to
closing the aforementioned research gap in three steps:
1. We first reviewed the literature on risk assessment and machine learning at
central banks. Regarding this recent intersection of knowledge areas, we
discovered the need for a common risk assessment methodology. Additionally, we
gleaned that there are several central banks carrying out sup-tech initiatives,
evidencing the potential of ML in the supervisory realm (Broeders and Prenio,
2018; Beerman et al., 2021; Hertig, 2021).
2. We then proceeded to successfully model the liquidity risk of a bank as a
classification problem, comparing several ML models. The two key components
of this step were the application of ECB’s Risk Assessment System (RAS)
used across the Euro-area, and the use of real-world supervisory data from the
Portuguese banking sector.
3. Finally, we compiled equally valid results when applying the same methodology
to the remaining risk perspectives: credit, market, operational and
profitability.
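The comparison of classifiers described in step 2 can be sketched with a plain k-fold cross-validation loop. This is a minimal illustration and not the pipeline used in the thesis: the helper names, the two toy classifiers (a majority-class baseline and a small k-nearest-neighbours model), the class labels and the synthetic two-feature data are all assumptions made for the example.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def majority_classifier(train_X, train_y):
    """Baseline: always predict the most frequent training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda x: most_common


def knn_classifier(train_X, train_y, k=3):
    """k-nearest neighbours with squared Euclidean distance."""
    def predict(x):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, row)), label)
            for row, label in zip(train_X, train_y)
        )
        votes = Counter(label for _, label in dists[:k])
        return votes.most_common(1)[0][0]
    return predict


def cross_val_accuracy(make_model, X, y, k=5):
    """Mean accuracy over k folds, each fold held out once."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        model = make_model(train_X, train_y)
        correct = sum(model(X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)


# Synthetic, well-separated data: two hypothetical indicators per bank.
rng = random.Random(42)
X, y = [], []
for _ in range(20):
    X.append((rng.uniform(-1, 1), rng.uniform(-1, 1)))
    y.append("low_risk")
    X.append((10 + rng.uniform(-1, 1), 10 + rng.uniform(-1, 1)))
    y.append("high_risk")

baseline_acc = cross_val_accuracy(majority_classifier, X, y)
knn_acc = cross_val_accuracy(knn_classifier, X, y)
print(f"baseline: {baseline_acc:.2f}  k-NN: {knn_acc:.2f}")
```

Evaluating every candidate on the same folds keeps the comparison fair; real supervisory data would of course be far less separable than this toy sample.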
The findings from the above-mentioned steps serve a vast number of stakeholders,
who can be grouped into three categories:
• Banks and consultancy companies, which can rely on the developed ML models
to proactively manage their risk posture. This can positively influence and
support how banks approach compliance obligations and also how they adjust
their practices to their business model.
• Central Banks can uplevel their supervisory processes and innovation
initiatives by applying the ML techniques described here to financial data.
Early Warning Systems are among the best use cases at supervisory agencies and
can already be found at the ECB and BIS (Hertig, 2021).
• Academia will be able to support new investigation on the applications of ML
to the supervisory area, and expand the usage of innovative technologies in
the regulatory realm.
There are some limitations to the work developed here that we would like to
address as future development topics. In a first tier, we would improve model
robustness by extending the sample to all banks in the Euro-area (approximately
3150 organisations). This could be achieved through a joint project with the ECB,
where data from all Central Banks is already retrieved. Afterwards, we would
increase accuracy by complementing each ML model with risk-specific data. As a
final recommendation, we would suggest intersecting the quantitative perspective
studied in this thesis with the qualitative information present in internal
documents and reports. This would develop an innovative use case of Natural
Language Processing (NLP) in supervision.
Bibliography
Abellán, J., Castellano, J.G., 2017. A comparative study on base classifiers in
ensemble methods for credit scoring. Expert Systems with Applications 73, 1–10.
doi:10.1016/j.eswa.2016.12.020.
Acuna, E., Rodriguez, C., 2004. The treatment of missing values and its effect on
classifier accuracy.
Adankon, M.M., Cheriet, M., 2009. Encyclopedia of Biometrics - Support Vector
Machine. Springer US. URL: https://doi.org/10.1007/978-0-387-73003-5_299,
doi:10.1007/978-0-387-73003-5_299.
Ala’raj, M., Abbod, M.F., 2016. A new hybrid ensemble credit scoring model based
on classifiers consensus system approach. Expert Systems with Applications 64,
36–55. doi:10.1016/j.eswa.2016.07.017.
Alessi, L., Detken, C., 2018. Identifying excessive credit growth and leverage.
Journal of Financial Stability 35, 215–225. doi:10.1016/j.jfs.2017.06.005.
Alonso, A., Carbo, J.M., 2020. Machine learning in credit risk: Measuring the
dilemma between prediction and supervisory cost. SSRN Electronic Journal
doi:10.2139/ssrn.3724374.
Alonso, A., Carbo, J.M., 2021. Understanding the performance of machine learning
models to predict credit default: A novel approach for supervisory evaluation.
SSRN Electronic Journal doi:10.2139/ssrn.3774075.
Altman, E.I., 1968. Financial ratios, discriminant analysis and the prediction of
corporate bankruptcy. The Journal of Finance 23, 589–609.
doi:10.1111/j.1540-6261.1968.tb00843.x.
Angelini, E., di Tollo, G., Roli, A., 2008. A neural network approach for credit risk
evaluation. Quarterly Review of Economics and Finance 48, 733–755.
doi:10.1016/j.qref.2007.04.001.
Antunes, J.A.P., 2021. To supervise or to self-supervise: a machine learning based
comparison on credit supervision. Financial Innovation 7.
URL: https://doi.org/10.1186/s40854-021-00242-4, doi:10.1186/s40854-021-00242-4.
Authority, E.B., 2013. EBA implementing technical standards (ITS).
URL: http://www.eba.europa.eu/documents/10180/532570/EBA-ITS-2013-12+(Final+draft+ITS+on+Hypothetical+Capital+of+a+CCP).pdf.
Appendix A
Tables
Authors | Year | Affiliation | Title | Citations
Abellan et al. | 2017 | academia | A comparative study on base classifiers in ensemble methods for credit scoring | 88
Ala’raj et al. | 2016 | academia | A new hybrid ensemble credit scoring model based on classifiers consensus system approach | 66
Alessi et al. | 2018 | central bank | Identifying excessive credit growth and leverage | 135
Alonso et al. | 2020 | central bank | Machine Learning in Credit Risk: Measuring the Dilemma Between Prediction and Supervisory Cost | 1
Alonso et al. | 2021 | central bank | Understanding the Performance of Machine Learning Models to Predict Credit Default: A Novel Approach for Supervisory Evaluation | 0
Angelini et al. | 2008 | academia | A neural network approach for credit risk evaluation | 305
Antunes | 2021 | central bank | To supervise or to self-supervise: a machine learning based comparison on credit supervision | 0
Boyacioglu et al. | 2009 | academia | Predicting bank financial failures using neural networks, support vector machines and multivariate statistical methods: A comparative analysis in the sample of savings deposit insurance fund (SDIF) transferred banks in Turkey | 272
Broeders et al. | 2018 | industry | FSI Insights Innovative technology in financial supervision | 23
Chakraborty et al. | 2017 | central bank | Machine Learning at Central Banks | 62
Chang et al. | 2018 | academia | Application of eXtreme gradient boosting trees in the construction of credit risk assessment models for financial institutions | 17
Chaudhuri et al. | 2011 | academia | Fuzzy Support Vector Machine for bankruptcy prediction | 155
Climent et al. | 2019 | academia | Anticipating bank distress in the Eurozone: An Extreme Gradient Boosting approach | 10
Dastile et al. | 2020 | academia | Statistical and machine learning models in credit scoring: A systematic literature survey | 0
Doerr et al. | 2021 | industry | How do central banks use big data and machine learning? | 0
Dwivedi et al. | 2019 | academia | Artificial Intelligence (AI): Multidisciplinary perspectives on emerging challenges, opportunities, and agenda for research, practice and policy | 39
Filippopoulou et al. | 2020 | academia | An early warning system for predicting systemic banking crises in the Eurozone: A logit regression approach | 1
Galindo et al. | 2000 | academia | Credit risk assessment using statistical and machine learning: Basic methodology and risk modeling applications | 213
Giudice et al. | 2020 | central bank | Institutional Sector Classifier, a Machine Learning Approach | 0
Gogas et al. | 2018 | academia | Forecasting bank failures and stress testing: A machine learning approach | 20
Hammer et al. | 2012 | academia | A logical analysis of banks’ financial strength ratings | 49
Hillegeist et al. | 2004 | academia | Assessing the probability of bankruptcy | 1393
Hohl et al. | 2019 | industry | FSI Insights on policy implementation The suptech generations | 3
Huang et al. | 2021 | academia | Intelligent FinTech Data Mining by Advanced Deep Learning Approaches | 0
Jagtiani et al. | 2018 | central bank | The Roles of Big Data and Machine Learning in Bank Supervision | 4
Kolari et al. | 2019 | academia | Predicting European bank stress tests: Survival of the fittest | 4
Kou et al. | 2019 | academia | Machine learning methods for systemic risk analysis in financial sectors | 47
Kupiec et al. | 2018 | industry | On the accuracy of alternative approaches for calibrating bank stress test models | 5
Le et al. | 2018 | academia | Predicting bank failure: An improvement by implementing a machine-learning approach to classical financial ratios | 24
Lee et al. | 2020 | academia | Machine learning for enterprises: Applications, algorithm selection, and challenges | 7
Leo et al. | 2019 | (blank) | Machine learning in banking risk management: A literature review | 11
Lopez Iturriaga et al. | 2015 | academia | Bankruptcy visualization and prediction using neural networks: A study of U.S. commercial banks | 129
Milian et al. | 2019 | academia | Fintechs: A literature review and research agenda | 31
Min et al. | 2005 | academia | Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters | 866
Petropoulos et al. | 2018 | central bank | A robust machine learning approach for credit risk analysis of large loan level datasets using deep learning and extreme gradient boosting | 6
Pompella et al. | 2017 | academia | Ratings based Inference and Credit Risk: Detecting likely-to-fail Banks with the PC-Mahalanobis Method | 5
Ribeiro et al. | 2012 | academia | Enhanced default risk models with SVM+ | 57
Soui et al. | 2019 | academia | Rule-based credit risk assessment model using multi-objective evolutionary algorithms | 3
Tavana et al. | 2018 | academia | An Artificial Neural Network and Bayesian Network model for liquidity risk assessment in banking | 30
Wang et al. | 2021 | academia | A machine learning-based early warning system for systemic banking crises | 2
Xia et al. | 2017 | academia | A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring | 158
Table A.1: List of papers collected through the research query, referenced by author,
year of publication, affiliation, and number of citations.
Quartile / Origin | Journal | Number of Papers
A (ERA) | Advances in Neural Information Processing Systems | 1
Banca d’Italia | Questioni di Economia e Finanza | 1
Banco de España | SSRN Electronic Journal | 2
Bank for International Settlements | FSI Insights on policy implementation | 2
Bank of England | Bank of England | 1
Bank of Greece | Ninth IFC Conference on “Are post-crisis statistical initiatives completed?” | 1
Federal Reserve | Banking Perspectives, Forthcoming | 1
Q1 | Applied Soft Computing Journal | 3
Q1 | Business Horizons | 1
Q1 | Electronic Commerce Research and Applications | 1
Q1 | Expert Systems with Applications | 9
Q1 | International Journal of Forecasting | 1
Q1 | International Journal of Information Management | 1
Q1 | Journal of Business Research | 1
Q1 | Journal of Economic Behavior and Organization | 1
Q1 | Journal of Financial Stability | 2
Q1 | Neurocomputing | 1
Q1 | Research in International Business and Finance | 1
Q1 | Review of Accounting Studies | 1
Q1 | Technological and Economic Development of Economy | 1
Q2 | Applied Economics | 1
Q2 | Computational Economics | 2
Q2 | Economic Modelling | 1
Q2 | Financial Innovation | 1
Q2 | Global Finance Journal | 1
Q2 | Quarterly Review of Economics and Finance | 1
Q2 | Risks | 1
SUERF | SUERF - The European Money and Finance Forum | 1
Table A.2: Journals of the selected articles and their quartile. Where the journal is
not indexed, the entity responsible for the publishing was included.
Authors | Year | Summary sentence
Galindo et al. | 2000 | CART decision-trees out-perform statistics for credit risk assessment, using a commercial bank loans dataset
Hillegeist et al. | 2004 | Black–Scholes–Merton option-pricing model is a better indicator for bankruptcy probability than Z-Score and O-Score.
Min et al. | 2005 | Motivated by the increasing use of machine learning techniques, this paper aims to outperform classical statistics in bankruptcy prediction. An optimised SVM model performs better than MDA, logit and BPN for bankruptcy prediction.
Angelini et al. | 2008 | Regulation-imposed capital requirements increase the need for precise credit risk assessment systems. This paper shows ANNs’ very good results predicting the default tendency of a borrower.
Boyacioglu et al. | 2009 | Multi-layer perceptrons and learning vector quantization are the most successful models predicting bank failure as a classification problem, in a Turkish case.
Chaudhuri et al. | 2011 | Fuzzy-SVM satisfies Basel II demands for detecting bankruptcy probability, outperforming other approaches. This algorithm also proved to have more clustering capabilities than PNN.
Hammer et al. | 2012 | The logical analysis of data (LAD) is able to reverse-engineer Fitch risk ratings of banks, showing better results than support-vector machines and logistic regression when evaluating the creditworthiness of banks.
Ribeiro et al. | 2012 | This study establishes the limitations of using exclusively quantitative financial data when developing default risk models. The authors propose a new approach that includes contextual knowledge in an SVM model, showing better predictability performance
Lopez Iturriaga et al. | 2015 | Profiling distressed banks using self-organising maps and modelling failure detection with multi-layer perceptron outperforms traditional models of bankruptcy prediction. The resulting model detects 96% of failures, up to 3 years before the bankruptcy e
Ala’raj et al. | 2016 | The proposed hybrid ensemble model improves predicting capability compared to base classifiers, using 7 real-world datasets. It uses a classifier consensus system to compare this new approach with the traditional combination methods.
Abellan et al. | 2017 | Selection of the best base classifier in ensemble methods for credit scoring problems. The individual performance of classifiers is not the only criterion for ensemble schemes.
Chakraborty et al. | 2017 | An overview of the applications of machine learning to financial problems, the most popular modelling approaches, and three case studies of relevant works for central banks. This study also establishes that machine learning models usually outperform tradi
Pompella et al. | 2017 | An EWS is proposed to detect likely-to-fail banks. This method is compared with risk agencies’ rating and detects possibly wrongly rated banks. The authors suggest the adoption of this EWS by regulators.
Xia et al. | 2017 | The credit scoring problem is addressed using an XGBoost model with Bayesian hyper-parameter optimisation, not only obtaining better accuracy than baseline models, but also providing feature importance and a decision chart for interpretability.
Alessi et al. | 2018 | The use of random forest to predict banking crises secondary to excessive credit growth, using credit and real estate predictors.
Broeders et al. | 2018 | A survey on the use of innovative technologies in financial supervision, the challenges faced by supervisory agencies and the need for a clear suptech strategy. Additionally, the experience of early adopters is described.
Chang et al. | 2018 | The development of a credit risk model using XGBoost classifier to address the heterogeneous nature of financial data. An under-sampling method is applied to deal with the imbalanced data.
Gogas et al. | 2018 | Outperforming the Ohlson’s score with a stress-testing tool based on a support-vector machine model to forecast bank failures. The adopted methodology defines a clear boundary between solvent and insolvent banks.
Jagtiani et al. | 2018 | The impact of machine learning in banking supervision in terms of new possible analytical solutions and risks involved in those new approaches.
Kupiec et al. | 2018 | Addressing the need for validation of bank stress test models, by emphasising model forecast accuracy. A Lasso model shows the best forecasting capabilities for determining capital requirements in stressful conditions.
Le et al. | 2018 | Artificial neural networks and k-nearest neighbour methods are more accurate for predicting bank failure than traditional statistics.
Petropoulos et al. | 2018 | Predicting the probability of default of Greek banks using data mining techniques to reduce dimensionality, with XGBoost emerging as the best model. The authors aim to fully capture the information within these large datasets to better support the overall
Tavana et al. | 2018 | Addressing liquidity risk assessment through a model that uses neural networks and Bayesian networks. The models were capable of distinguishing the most critical factors in liquidity risk measurement.
Climent et al. | 2019 | Using XGBoost to identify the best predictors of bank failure and develop a classification model to label failed and non-failed banks in the Eurozone. The data used in this study is composed of 25 annual financial ratios of commercial banks in the Eurozo
Dwivedi et al. | 2019 | Expert contributors identify and compile a series of opportunities, impacts and research topics raised by the rapid adoption of AI. The financial sector shows enormous potential in robo advisory and automation, and bankruptcy prediction.
Hohl et al. | 2019 | A survey of activities within the scope of suptech, classifying the degree of technological development, and the strategies in place to implement them, highlighting the experimental nature of these initiatives and the need for international coordination.
Kolari et al. | 2019 | Successfully undergoing European bank stress-tests depends largely on the risks a bank is exposed to, as opposed to being prepared for specific adverse scenarios. Using Bankscope data, the developed model accurately predicts 90% of the failing banks.
Kou et al. | 2019 | A survey depicting the most common methodologies to assess systemic risk in the financial system, using machine learning, big data analysis, network analysis and sentiment analysis. The paper showcases current researches on the use of machine learning in
Leo et al. | 2019 | A literature review evidencing machine learning use for risk management purposes in the banking industry, while also noting the experimental nature of most approaches.
Milian et al. | 2019 | A literature review aiming to find consensus on a fintech definition, showing how banks and supervisory agencies are using these innovative technologies and dealing with the risks involved.
Soui et al. | 2019 | Using evolutionary algorithms to address credit risk assessment by considering it as an optimisation (rule-based) search problem: minimise complexity, maximise accuracy and weight (rules importance).
Alonso et al. | 2020 | Comparing machine learning models for credit default prediction. Necessity of a structured strategy for assessing ML models to increase transparency in the use of these technologies, and promote innovation in the financial industry.
Dastile et al. | 2020 | A systematic literature review on how statistic and machine learning techniques have been used to address the credit scoring problem. Although machine learning is often incapable of explaining predictions, these models consistently outperform the classic
Filippopoulou et al. | 2020 | Developing an EWS to detect systemic banking crisis based on the ECB Macroprudential database. Most of the risk indicators used in the dataset are key to forecast a systemic risk crisis 1 to 4 years before the event.
Giudice et al. | 2020 | Developing an automatic classification system for the sector of economic activity of Italian companies, using a multi-step classifier with gradient boosting and support-vector machine models. The developed model is already being used in a production envi
Lee et al. | 2020 | A study on types of machine learning applications, exploring the accuracy-interpretability trade-off, and three use cases in financial industry.
Alonso et al. | 2021 | Predicting credit default probability with machine learning surpasses traditional statistic methods, potentially leading to savings of up to 17% in regulatory capital requirements.
Antunes | 2021 | Establishing the need for supervisory on-site inspection by comparing the results of two machine learning models, one based on the banks’ own risk assessment and the other based on the findings from previous on-site inspections.
Doerr et al. | 2021 | Policy brief showing central banks are relying on big data for daily tasks, and identifying a clear need for specialised knowledge on how to adequately use machine learning, and extract greater value from that data.
Huang et al. | 2021 | This study is developed under the assumption that the intricate nature of financial data cannot be properly explored through traditional methods. An advanced deep learning model to address the complex and hierarchical features of financial data, that outp
Wang et al. | 2021 | Random forest based EWS outperforms the classic logit approach as the predictive tool to prevent systemic banking crises. This paper shows an expert voting approach to model the multivariate nature of systemic risk assessment data.
Table A.3: Short summary of each analysed paper, referenced by authors and year.
Authors | ML Methods | Dataset
Abellan et al. | ada-boosting, bagging, random subspace, DECORATE, rotation forest | public: Australian, German, and Japanese datasets obtained from UCI repository of machine learning; Iranian dataset from ”A comparison between statistical and data mining methods for credit scoring in case of limited available data. (2007)”; Polish dataset
Ala’raj et al. | neural networks, support vector machines, random forests, decision trees, Naive Bayes | public: Australian, German, and Japanese datasets obtained from UCI repository of machine learning; Iranian dataset from ”A comparison between statistical and data mining methods for credit scoring in case of limited available data. (2007)”; Polish dataset
Alessi et al. | logit, decision trees, random forest | public: crisis dataset by Detken et al. 2014, capturing systemic banking crises related to domestic credit cycle
Alonso et al. | logit, lasso, CART, random forest, xgboost, deep learning | private: anonymized dataset from Banco Santander, containing more than 75000 credit operations
Alonso et al. | logit, lasso, CART, random forest, xgboost, deep learning, RL & ensemble methods | public: kaggle.com ”Give me some credit” dataset
Angelini et al. | ann | private: SME loans from an Italian bank
Figure A.6: Operational risk: Confusion matrices generated when evaluating the
above-mentioned models, using cross-validation approach. Panels: (a) Logistic
Regression; (b) k-Nearest Neighbours classifier; (c) Random Forest classifier;
(d) Extreme Gradient Boosting classifier; (e) TPOT classifier autoML framework.
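For readers less familiar with the confusion matrices shown in these figures, the following minimal sketch builds one from true and predicted labels. The labels 1 to 4 are an assumption standing in for risk scores, and the label vectors are invented for illustration; they are not results from this work.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Hypothetical scores (assumed scale 1 to 4), for illustration only.
scores = [1, 2, 3, 4]
y_true = [1, 2, 2, 3, 3, 3, 4, 4]
y_pred = [1, 2, 3, 3, 3, 2, 4, 3]

cm = confusion_matrix(y_true, y_pred, scores)
# Accuracy is the share of observations on the main diagonal.
accuracy = sum(cm[i][i] for i in range(len(scores))) / len(y_true)

for label, row in zip(scores, cm):
    print(label, row)
print("accuracy:", accuracy)
```

Off-diagonal cells show which scores a model confuses with which, something a single accuracy figure cannot convey.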
Figure A.7: Profitability risk: Confusion matrices generated when evaluating the
above-mentioned models, using train-test split approach. Panels: (a) Logistic
Regression; (b) k-Nearest Neighbours classifier; (c) Random Forest classifier;
(d) Extreme Gradient Boosting classifier.
Figure A.8: Profitability risk: Confusion matrices generated when evaluating the
above-mentioned models, using cross-validation approach. Panels: (a) Logistic
Regression; (b) k-Nearest Neighbours classifier; (c) Random Forest classifier;
(d) Extreme Gradient Boosting classifier; (e) TPOT classifier autoML framework.