Detection of non-technical losses in smart meter data based on load curve profiling and time series analysis

Author: Oregui Bravo, Izaskun; Del Ser Lorente, Javier; Villar Rodríguez, Esther; Bilbao Maron, Miren Nekane; Gil López, Sergio
Publisher: Elsevier
Year: 2017
DOI: 10.1016/j.energy.2017.07.008
Source: https://addi.ehu.eus/bitstream/10810/78659/3/2017_Detection_of_non_technical_losses_in_sma.pdf
Detection of Non-Technical Losses in Smart Meter Data based on Load Curve Profiling and Time Series Analysis
Esther Villar-Rodriguez (a), Javier Del Ser (a,b,c,*), Izaskun Oregi (a), Miren Nekane Bilbao (b), and Sergio Gil-Lopez (a)
(a) TECNALIA, 48160 Derio, Bizkaia, Spain.
(b) University of the Basque Country (EHU/UPV), 48013 Bilbao, Bizkaia, Spain.
(c) Basque Center for Applied Mathematics (BCAM), 48009 Bilbao, Bizkaia, Spain.
Abstract
The advent and progressive deployment of the so-called Smart Grid has unleashed a profitable portfolio of new possibilities for an efficient management of the low-voltage distribution network, supported by the introduction of information and communication technologies to exploit its digitalization. Among all such possibilities this work focuses on the detection of anomalous energy consumption traces: regardless of whether they are due to malfunctioning metering equipment or fraudulent purposes, utilities invest strong efforts to detect such outlying events and address them so as to optimize the power distribution and avoid significant income costs. In this context this manuscript introduces a novel algorithmic approach for the identification of consumption outliers in Smart Grids that relies on concepts from probabilistic data mining and time series analysis. A key ingredient of the proposed technique is its ability to accommodate time irregularities (shifts and warps) in the consumption habits of the user by concentrating on the shape of the consumption rather than on its temporal properties. Simulation results over real data from a Spanish utility are presented and discussed, from which it is concluded that the proposed approach excels at detecting different outlier cases emulated on the aforementioned consumption traces.
Keywords: Smart Grids; Smart Meter Data; Non-Technical Losses; Outlier Detection.
(*) Corresponding author: ja[email protected] (Prof. Dr. Javier Del Ser). TECNALIA. P. Tecnologico Bizkaia, Ed. 700, 48160 Derio, Spain. Tl: +34 946 430 50. Fax: +34 901 760 009. E-mail: ja[email protected].
June 26, 2017
This is the accepted manuscript of the article that appeared in final form in Energy 137: 118-128 (2017), which has been published in final form at https://doi.org/10.1016/j.energy.2017.07.008. © 2017 Elsevier under CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
1. Introduction
According to the official definition introduced by the Energy Independence and Security Act of 2007 [1], Smart Grids can be understood in the wide sense as the technological efforts to modernize and digitize the electricity distribution system of a nation to ensure, in a scalable manner, the improved reliability, security and efficiency of the grid, as well as to guarantee an optimized management of its resources and operation. This definition describes such a term as a comprehensive set of operational resources established to guarantee an efficient power transmission and electricity distribution, paying special attention to reliability and security.
The advent and progressive deployment of the so-called Smart Grid has unleashed a profitable portfolio of new possibilities for an efficient management of the low and medium voltage distribution network, supported by the introduction of information and communication technologies to exploit its digitalization. In this context, the deployment of Advanced Metering Infrastructures (AMIs) allows utilities to acquire fine-grained data about the real consumption of end-users (not based on estimations or monthly measurements), which is essential to gain deeper insights on how, when and where energy is distributed and consumed through the network [2, 3, 4].
This fact is particularly crucial in regard to the traceability and characterization of electrical losses, which account for the difference between the amount of energy distributed by the electrical distribution company and the amount of energy paid for by the consumers. Such losses may be due to two main contributing causes: 1) losses inherent to the transformation and distribution of energy, which are proportional to the square of the electrical current and widely referred to as Technical Losses (TL); and 2) non-technical losses (NTLs), associated with erroneous readings, defective smart meters or fraud [5].
This work focuses on the detection of energy consumption traces which contribute to NTLs, regardless of whether they are due to malfunctioning metering equipment or fraudulent purposes. As mentioned by [6, 7, 8], the amount of energy lost in distribution grids varies between 7% and 50% of the total delivered energy (depending on the country and the characteristics of the distribution network), which undoubtedly justifies the strong efforts that utilities are investing towards detecting and inspecting atypical consumption traces to ultimately avoid significant economic losses. As stated in [9], in the US alone between 1 and 10 billion worth of electricity was stolen in the late 90s, showing an increment between 5-10% in the last two decades, with a remarkable 40% and beyond in Southeast Asia [10]. In addition, the identification of atypicalities provides further profitable advantages beyond fraud assessment: by properly characterizing the statistics of the consumption traces registered over the power grid, the power distribution can be optimized by matching generation to consumption, thereby avoiding network under-dimensioning and electrical surges.
Interestingly for the scope of this work, energy theft accounts for the majority of the aforementioned non-technical losses. There are indeed very diverse methods by which malicious consumers illegally reduce the consumption monitored by the installed smart equipment, particularly in the last stage of the distribution network. One of the most usual forms of electricity theft is fraud, by which the user deliberately attempts to deceive the energy supplier (utility) at hand. This can be achieved by diverse means such as meter tampering, by which the meter is forced to register a lower power reading than the real consumption of the user. While other forms of electricity theft prevail across different countries and cultures (e.g. billing irregularities), this work revolves around those fraudulent cases in which the non-technical loss may be reflected in a behavioral change of the energy consumption trace registered by the metering device. In this regard, both tampering and electricity theft fall within the scope of this work: they constitute a prioritized target of most utility companies around the world, due to the severe consequences of these phenomena (i.e. higher electricity rates for paying consumers, increased risk of fire or electrocution due to improperly installed bypasses and, in general, a reduced grid reliability).
From a data-based perspective, a change in the energy consumption profile of a user contributing to NTLs can be understood as a deviating observation in the time series that models such a profile, whose statistics make it quite likely to have been generated by a different underlying behavior [11]. However, normal load profiling in the low-voltage network can be produced by the aggregation of different, yet related behavioral components (seasonality, daily and weekly statistical variability, changes of habits, among others) that differ from each other in both amplitude (i.e. amount of energy consumed from the power grid due to different load consumptions) and time domains (correspondingly, the statistical consumption schedule of the set of users' loads along the day or week). It is the dissimilarity of any new consumption trace to any of those previously learned behavioral patterns (load profiling) that should differentiate both behaviors (normal and NTLs). Furthermore, the detection of anomalous observations allows for the inference of more robust models by discarding those instances that result from strongly irregular samples, which could deviate the models from the representation of statistically significant regular trends in the consumption habits of the user under analysis. This being said, outlier detection can be defined as the task of classifying elements as normal or differing with respect to the statistical regularity characterizing a dataset. At this point the concept of regularity must be determined by the addressed application scenario.
In this context, a baseline taxonomy of outlier detection algorithms comprises 1) parametric methods that rely on prior hypotheses about the statistical model generating the data; and 2) model-free, non-parametric techniques, which avoid any prior assumption about the underlying distribution of the data or statistical parameter estimates. Among the latter we focus on distance-based unsupervised outlier detection approaches, which generally hinge on local distance measurements (not accounting for behavioral differences) and are capable of efficiently handling large datasets [12]. By properly defining a distance or measure of similarity between samples, subsequent data mining procedures such as cluster analysis can identify groups of samples that do not belong to the set of discovered data clusters. This identification can be done based on different distance-based criteria, such as the density of samples within a given distance threshold. In all such cases the selection of the distance metric is a key point of this collection of techniques, since the similarity criterion, which roughly depends on the chosen distance metric, will guide the whole identification process. Therefore, it is clear that the best similarity function must be compliant with the nature of the data and the specific particularities of the application.
In this regard, several prior contributions have hitherto dealt with the identification of NTLs in energy consumption traces. To begin with, several contributions have gravitated around the use of machine learning models over supervised datasets, such as Support Vector Machines [13, 14, 15, 16, 17, 18], Neural Networks [19, 20, 21], Extreme Learning Machines [22], Path Forests [23, 24], Decision Trees [25, 26, 27], model ensembles [28], and statistical methods [29, 30]. However, all such previous work builds upon the assumption that supervised datasets capture the entire casuistry of symptomatic anomalies of interest for fraud detection and/or electricity theft, which is not only unrealistic in practice but also yields highly imbalanced datasets that subsequently jeopardize the model learning process. By contrast, unsupervised anomaly detection in Smart Grids overrides any need for previously labeled data, yet makes the evaluation and tuning of the model hard to perform due to the non-utilization of positive examples during the construction of the learner. The literature dealing with electricity fraud using non-supervised learning models has been relatively scarce, with Self Organizing Maps [31] and fuzzy clustering schemes [32] mostly used to date.
This manuscript introduces a novel algorithmic approach for acquiring knowledge of customers' behaviors (load profiling), which allows for the identification of consumption behavioral outliers in Smart Grids based on the hourly measurements provided by the AMIs. The proposed scheme advances over the state of the art by combining probabilistic data mining and time series analysis; we adopt the so-called Dynamic Time Warping (DTW) metric as the measure of similarity between consumption traces registered for the user under analysis, by which such sequences are aligned in a dynamic, non-linear fashion disregarding any shifts or warps along time [33]. This metric is then used within two different distance-based learning models, both relying on density estimations to detect anomalous patterns. A further novel ingredient of this work is a trace encoding strategy that depends on the spanned hourly statistical ranges of every user, which increases the flexibility of the models to avoid false alarms. The performance of the derived schemes is assessed and discussed based on simulation results computed over real AMI data captured by a Spanish utility. Given the obtained scores we conclude that the proposed method accommodates irregularities of the analyzed consumption traces along time by focusing exclusively on their shape.
The rest of the manuscript is structured as follows: Section 2 poses the notation used throughout the manuscript, and formulates the problem of outlier detection contextualized to the application tackled in this manuscript. Section 3 provides an overview of the proposed approach, with emphasis on its constituent elements in the subsections therein. Next, Section 4 describes the dataset utilized for performance assessment, justifies the different emulated cases over such data and discusses the obtained results. Finally, conclusions are given in Section 5 along with an outline of future research lines.
2. Notation and Problem Statement
As depicted in Figure 1, we assume that an energy distribution company has deployed a set of $N$ smart meters to monitor the consumption of part of its customer portfolio. Let the data samples registered by the $n$-th smart meter be denoted as $x^n \doteq \{x^n_t\}_{t=1}^{T^n}$, where $t$ stands for the time dimension discretized as per the granularity $t^n_s$ [minutes] by which the smart meter records data (e.g. hourly, $t^n_s = 60$ minutes). Here $T^n$ denotes the total number of samples read for the customer at hand, which may vary among different customers due to e.g. the date on which the smart meter was installed in the user premises. We further consider that the minimum decisional unit of the outlier detection model is an entire day (24 hours), for which $x^n \doteq \{x^n_t\}_{t=1}^{T^n}$ can be reshaped as a matrix $X^n$, with each row containing the $(24 \cdot 60)/t^n_s$ values that the meter of customer $n \in \{1, \ldots, N\}$ samples during each day. For the sake of simplicity, in the derivations that follow we will force $t^n_s = 60$ minutes $\forall n$, such that $X^n$ will have 24 readings per day out of a total of $D^n \doteq \lfloor T^n/24 \rfloor$ days monitored for customer $n$. Samples of day $d \in \{1, \ldots, D^n\}$ will be expressed as $X^n_d$, i.e. by the $d$-th row in $X^n$.
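The reshaping step described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `to_daily_matrix` is hypothetical, readings are assumed hourly and to start at the first hour of a day, and trailing samples that do not complete a day are dropped so that the number of rows equals $\lfloor T^n/24 \rfloor$.

```python
import numpy as np

def to_daily_matrix(x, samples_per_day=24):
    """Reshape a 1-D series of meter readings into a (days, samples_per_day) matrix.

    Assumption: the series starts at the first sample of a day. Trailing
    samples that do not fill a whole day are dropped, mirroring
    D = floor(T / 24) in the text.
    """
    x = np.asarray(x, dtype=float)
    n_days = x.size // samples_per_day
    return x[: n_days * samples_per_day].reshape(n_days, samples_per_day)

# Hypothetical example: 50 hourly readings -> 2 complete days, 2 readings dropped.
readings = np.arange(50)
X = to_daily_matrix(readings)
print(X.shape)  # (2, 24)
```

Row `X[d]` then plays the role of the daily trace $X^n_d$ for a single customer.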
The aim of an outlier detection model $M^n_\theta(X^n_{d'}; X^n)$ is to infer, for user $n$, whether a new daily consumption trace $X^n_{d'}$ captured by the smart meter of user $n$ follows the same distribution as that characterizing $X^n$ (declaring it to be an inlier) or, instead, differs significantly (correspondingly, is an outlier). The latter case serves as a trigger for a further inspection process to confirm whether the behavioral change is due to e.g. fraud. The model is controlled by a set of parameters collected in $\theta$, which allow balancing between the True Positive Rate (TPR, also referred to as sensitivity or recall) and the True Negative Rate of the model (namely, TNR or specificity) [34].
At this point it is important to note that for measuring the TNR and TPR metrics of any outlier detection model we need supervised labels for the test traces over which such metrics are computed. In other words, for assessing the performance of an outlier detection algorithm it is mandatory to know a priori whether the distribution utilized for producing each of the test traces corresponds to that utilized for modeling the outlier prototype that the model should detect.
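As a minimal sketch of how these two metrics are obtained once ground-truth labels for the test traces are available, the snippet below computes TPR and TNR from binary labels. The function name `tpr_tnr` and the example labels are illustrative assumptions; the convention that 1 marks an outlier and 0 an inlier is assumed here.

```python
import numpy as np

def tpr_tnr(y_true, y_pred):
    """Return (TPR, TNR) for binary labels, with 1 = outlier and 0 = inlier."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    positives = np.sum(y_true)
    negatives = np.sum(~y_true)
    true_pos = np.sum(y_true & y_pred)
    true_neg = np.sum(~y_true & ~y_pred)
    # A rate is undefined (NaN) when the test set lacks the corresponding class.
    tpr = true_pos / positives if positives else float("nan")
    tnr = true_neg / negatives if negatives else float("nan")
    return float(tpr), float(tnr)

# Hypothetical test set: 3 legitimate traces followed by 2 emulated outliers.
tpr, tnr = tpr_tnr([0, 0, 0, 1, 1], [0, 1, 0, 1, 0])
print(tpr, round(tnr, 3))  # 0.5 0.667
```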
We refer as $\ell^n_{d'} \doteq M^n_\theta(X^n_{d'}; X^n) \in \{0, 1\}$ to the label predicted by the model for test trace $X^n_{d'}$. Bearing this definition in mind, the TPR and TNR scores achieved by model $M^n_\theta(\cdot)$ over a test dataset $\{X^n_{d'}\}_{d'=1}^{D'}$ are given by $\mathrm{TPR}(M^n_\theta)$ and $\mathrm{TNR}(M^n_\theta)$, respectively. It is worth highlighting that these metrics implicitly measure the extent to which the model is adapted to discriminate the distribution $f^1_X(x)$ followed by outliers within $\{X^n_{d'}\}_{d'=1}^{D'}$ from that followed by regular traces in $X^n$ (correspondingly, $f^0_X(x)$). While learning $f^0_X(x)$ is a matter of fitting the model to $X^n$ on the assumption that all consumption traces therein are legitimate, the casuistry of outliers dictated by $f^1_X(x)$ is driven by the specificities of the application scenario itself. To this end, in this work we focus on four different hypotheses for the test trace $X^n_{d'}$, which $M^n_\theta(\cdot)$ should declare as an inlier or an outlier:
1. The test trace $X^n_{d'}$ belongs to the normal behavioral distribution of customer $n$, i.e. $X^n_{d'} \sim f^0_X(x)$ with high likelihood. In this case the model should declare that $X^n_{d'}$ is an inlier, namely, $\ell^n_{d'} \doteq M^n_\theta(X^n_{d'}; X^n) = 0$.
2. The test trace $X^n_{d'}$ falls again within the trace space spanned by the normal behavior of customer $n$. However, in this case a shape-preserving shift (of $\delta \in [-\Delta_{\max}, \Delta_{\max}]$ hours) in the time domain is present in the test trace to account for exogenous factors affecting the consumption patterns of the user along the time domain. For instance, a domestic user does not necessarily use his/her home appliances at the same time during the week, but it is often the case that such home duties follow a regular pattern in their execution. In this case the model should be elastic enough to accommodate this time variability, focus on a purely shape-related characterization of the consumption patterns and predict that $\ell^n_{d'} = 0$.
3. The test trace $X^n_{d'}$ reflects a subtle energy loss over its time span with respect to a particular legitimate example in $X^n$. This effect is symptomatic of sophisticated manipulations by which the meter is slowed down regularly in short time intervals (e.g. by installing a circuit inside the device) to halt the recording process and under-register the energy consumed by the customer. Clearly, in this case the model should output $\ell^n_{d'} = 1$ depending on the ratio $\sigma \in (0, 1]$ between the overall energy of the test trace and that of the legitimate consumption trace from which it was produced.
4. Meter tampering, by which the meter is deliberately bypassed so that the device does not record any consumption at all. As a result, abrupt energy losses are obtained in the data traces of the customer, which emerge in the data trace of the day in which the tampering was performed as a series of $Z_{\max}$ zero-valued samples. The model should predict $\ell^n_{d'} = 1$ for this event, and trigger a subsequent manual inspection of the user at hand.
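The three non-trivial hypotheses above (time shift, subtle loss, tampering) can be emulated from a legitimate daily trace roughly as sketched below. This is not the emulation procedure of the paper, whose details are not given in this excerpt: the function names are hypothetical, the shift is assumed circular (wrap-around at midnight), the subtle loss is assumed to be a uniform scaling by $\sigma$, and the tampering is assumed to zero one contiguous run of $Z_{\max}$ samples.

```python
import numpy as np

def shift_trace(trace, delta):
    """Hypothesis 2: shape-preserving shift of delta hours.
    Assumption: the shift wraps around the day (circular shift)."""
    return np.roll(np.asarray(trace, dtype=float), delta)

def subtle_loss(trace, sigma):
    """Hypothesis 3: total energy reduced to a fraction sigma in (0, 1].
    Assumption: the loss is spread uniformly over all hours."""
    if not 0.0 < sigma <= 1.0:
        raise ValueError("sigma must lie in (0, 1]")
    return sigma * np.asarray(trace, dtype=float)

def tamper(trace, start, z_max):
    """Hypothesis 4: z_max consecutive zero-valued samples from hour `start`.
    Assumption: the zeroed samples form a single contiguous run."""
    out = np.array(trace, dtype=float)  # copy, the input is left untouched
    out[start:start + z_max] = 0.0
    return out

# Hypothetical legitimate trace: flat 1 kWh per hour with an evening peak.
legit = np.ones(24)
legit[18:22] = 3.0

shifted = shift_trace(legit, 2)      # peak moves from 18-21h to 20-23h
reduced = subtle_loss(legit, 0.8)    # 20% of the energy goes unregistered
zeroed = tamper(legit, 10, 6)        # meter bypassed from 10h to 15h
print(shifted.sum() == legit.sum(), int(np.sum(zeroed == 0.0)))  # True 6
```

Traces produced by `shift_trace` would carry label 0, whereas those produced by `subtle_loss` (with $\sigma < 1$) and `tamper` would carry label 1.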
A good outlier detection model should take into account that the goal of the application is to correctly predict test traces falling within any of the above 4 categories. Therefore, the design goal can be formulated as a multi-objective optimization problem where the optimality of the sought set of model parameters is driven by the trade-off between two conflicting objectives: the ratio of confirmed outliers (TPR) and the proportion of correctly identified inliers (TNR) when the model predicts a test set composed of $D'$ new consumption traces. Mathematically:
$$\theta^{\mathrm{opt}} = \arg_{\theta} \left\langle \max \mathrm{TNR}\left(M^n_\theta(\{X^n_{d'}\}_{d'=1}^{D'}; X^n)\right),\ \max \mathrm{TPR}\left(M^n_\theta(\{X^n_{d'}\}_{d'=1}^{D'}; X^n)\right) \right\rangle,$$
subject to $X^n_{d'} \sim \{f^{0,X}_X(x), f^{0,\delta}_X(x), f^{1,\sigma}_X(x), f^{1,z}_X(x)\}\ \forall d' \in \{1, \ldots, D'\}$, wherein by a slight abuse of notation we discriminate the particular hypothesis that each distribution models: normal behavior ($f^{0,X}_X(x)$), shape-preserving time variability ($f^{0,\delta}_X(x)$), subtle loss ($f^{1,\sigma}_X(x)$) or tampering ($f^{1,z}_X(x)$). In essence, we pursue the best model configuration to detect all classes of inlier and outlier traces in the test set, based on the trace set $X^n$ of user $n$.
The above optimization problem models the conceptual, standard model adjustment process in data mining, which can be tackled by using different well-known methodologies such as cross-validation [35]. However, the design challenge goes beyond the numerical refinement of the parameters controlling the learning process of the model itself. Since a design target is to accommodate time shifts in the load curve that are not symptomatic of NTL, we opt for distance-based outlier detectors that leverage a similarity metric between time series that is not affected by such non-linear variations. Two different outlier detection schemes will be designed based on this similarity measurement, computed not over the original data traces, but rather on their quantized values based on the hourly statistics of $X^n$. The next section delves into the details of these models, along with the utilized similarity distance and the statistical quantization.
3. Proposed Approach
Figure 2 shows the overall processing flow of the outlier detection methods proposed in this manuscript. Four ingredients lie at the core of the developed techniques, which are described as follows:
3.1. Similarity Measure
As argued in the previous section, an elastic measure of similarity between load profiles will be used to accommodate behavioral changes that do not imply a decrease in the energy consumed by the monitored user (e.g. time warps). To this end we will embrace the so-called Dynamic Time Warping (DTW) measure, by which the similarity between any two given consumption traces $X^n_d$ and $X^n_{d'}$ (i.e. traces recorded for user $n$ at days $d$ and $d'$) can be computed by searching for a minimum-weight optimal path $P$ between the $(1,1)$ and $(N,N)$ vertices of a rectangular $N \times N$ grid (in this subsection $N$ denotes the number of samples per daily trace, i.e. 24 for hourly readings, rather than the number of smart meters). The weight $w_{i,j}$ associated with vertex $(i,j)$ in this grid corresponds to the Euclidean distance between $X^n_{d,i}$ (i.e. the consumption measured for user $n$, day $d$ and hour $i$) and $X^n_{d',j}$, namely, $w_{i,j} = \lVert X^n_{d,i} - X^n_{d',j} \rVert$. The DTW metric between traces of user $n$ corresponding to days $d$ and $d'$ is given by [33, 36]
$$\mathrm{DTW}(X^n_d, X^n_{d'}) = \min_{P \in \mathcal{P}} \sum_{k=1}^{K_P} w_{p_k} = \min_{P \in \mathcal{P}} \sum_{k=1}^{K_P} w_{i_k, j_k}, \qquad (1)$$
with $P = \{p_1, p_2, \ldots, p_{K_P}\}$ denoting a $K_P$-long warping path composed of steps $p_k = (i_k, j_k)$ ($k \in \{1, \ldots, K_P\}$), and $\mathcal{P}$ denoting the set of all paths through the grid fulfilling $p_1 = (1,1)$, $p_k - p_{k-1} \in \{(1,1), (0,1), (1,0)\}$ and $p_{K_P} = (N,N)$.
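Expression (1) can be evaluated with the classical dynamic programming recursion over the grid, as in the following sketch. It is a plain, unconstrained DTW written for illustration (no warping window or other speed-ups, which the paper may or may not use); the function name `dtw_distance` and the toy traces are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Unconstrained DTW between two 1-D traces with |a_i - b_j| as local weight.

    acc[i, j] holds the minimum accumulated weight of any warping path from
    (1, 1) to (i, j) using steps (1, 1), (0, 1) and (1, 0).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = a.size, b.size
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            w = abs(a[i - 1] - b[j - 1])
            acc[i, j] = w + min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    return float(acc[n, m])

# Toy traces: the same bump, shifted by one sample.
a = np.array([0, 0, 1, 2, 1, 0])
b = np.array([0, 1, 2, 1, 0, 0])
print(dtw_distance(a, b))             # 0.0 (the shift is absorbed by the warping)
print(float(np.sum(np.abs(a - b))))   # 4.0 (point-to-point comparison penalizes it)
```

The example illustrates why an elastic measure suits the application: a pure time shift yields zero DTW cost, whereas an amplitude change (e.g. `dtw_distance(a, 0.5 * a)`) does not.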
When contextualized to the energy application tackled in this manuscript, the DTW metric allows measuring the degree of dissimilarity between two consumption traces by dismissing small behavioral shifts over the time domain and hence focusing strictly on differences in the amplitude of the energy consumed by the customer at hand. The DTW algorithm provides an adapted metric to assess the similarity between two temporal sequences which may vary in speed. A pattern in terms of the daily electric consumption must be flexible enough to cope with time deformations resulting from irregular household habits or different working schedules. Therefore, a concrete consumption pattern does not necessarily correspond to a unique feature vector in terms of both sequence modulation and periodicity (which would imply a constant window spacing and a point-to-point definition), but rather to a shape or silhouette at a higher level of abstraction that allows stretching or compressing sections of the series for comparison. In this work we postulate that DTW properly deals with such an assumption on the similarity between two consumption traces under a more elastic consideration of alignment.
3.2. Statistical Trace Encoding
An optional trace encoding strategy is proposed based on the statistical ranges spanned by the hourly measurements registered for the user at hand. When computing the DTW metric two distinct strategies can be adopted: the first hinges on computing the similarity between data instances $X^n_d$ and $X^n_{d'}$ by directly using the numerical values of the hourly energy consumed by the user at hand. However, the straightforward use of unprocessed values
Q1: Do all encoding-model combinations (i.e. LOF, LSA, LOF-box, LSA-box) perform reasonably well with respect to the targeted casuistry of NTL events? Which dominates? In terms of which metric (TNR/TPR)?
Q2: When opting for encoding traces based on their statistical boundaries (LOF-box, LSA-box), does it yield an enhanced robustness against false positives (i.e. a higher value of TNR)? What is the downside in return?
Q3: How are misclassified traces distributed over the different parts comprising the test dataset? Is there any link to the regularity of the user?
Q4: Is there any chance of increasing the performance scores in a practical implementation of this scheme?
To this end macroscopic performance score statistics have been computed based on the results obtained after a previous data cleansing stage that discards corrupted data. The parameter grid $\{\theta_1, \ldots, \theta_\vartheta\}$, over which models for every discovered cluster were refined via cross-validation, is, for LOF, $\{1, 2, \ldots, 20\} \times \{0, 0.1, \ldots, 1.9, 2\}$, where the first term corresponds to the number of neighbors and the second one stands for the decision threshold $\gamma^{n,\mathrm{LOF}}_c$. As for models based on LSA, the parameter grid is $\{0, 0.1, \ldots, 0.9, 1\} \times \{0, 0.1, \ldots, 0.9, 1\} \times \{0.5\}$, corresponding to $\rho^n_c$, $\tau^n_c$ and $\gamma^{n,\mathrm{LSA}}_c$ (the latter restricted to a single value), so as to alleviate the computational complexity of the cross-validation process. To the same end the kernel estimation within LSA-based approaches was further restrained to a maximum of 50 points instead of resorting to the whole training set of the cluster at hand. Those representative points can be emulated by the $\min(|D^n_c|, 50)$ medoids computed by a hierarchical clustering model, where we recall that $|D^n_c|$ is the number of samples in the training set of cluster $c$. The number of folds is $F = 10$ in all cases.
As previously stated in Algorithm 1, the fitness function quantifying the optimality of a parameter set during the cluster-wise cross-validation process is $\max\{\min\{\mathrm{TNR}, \mathrm{TPR}\}\}$ for both LOF and LSA approaches. This combined metric prevents either of the involved metrics from becoming dominated by the other, which would otherwise force the model to achieve a high score in one of the two pursued criteria to the detriment of the other.
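The selection rule $\max\{\min\{\mathrm{TNR}, \mathrm{TPR}\}\}$ can be sketched as below. The cross-validated scores in the example are made up for illustration, the parameter tuples merely mimic the (number of neighbors, decision threshold) layout of the LOF grid, and the function name `select_parameters` is hypothetical.

```python
def select_parameters(cv_scores):
    """Pick the parameter tuple whose worst score among (TNR, TPR) is highest.

    cv_scores maps each candidate parameter tuple to its cross-validated
    (TNR, TPR) pair. Ties are resolved in favor of the first candidate seen.
    """
    return max(cv_scores, key=lambda theta: min(cv_scores[theta]))

# Hypothetical cross-validated scores for three LOF-like configurations,
# keyed by (number of neighbors, decision threshold).
cv_scores = {
    (5, 1.0): (0.95, 0.40),   # very specific, poor sensitivity
    (10, 1.2): (0.80, 0.75),  # balanced
    (20, 1.5): (0.60, 0.90),  # very sensitive, poor specificity
}
print(select_parameters(cv_scores))  # (10, 1.2)
```

A configuration with one excellent score and one poor score loses against a balanced one, which is the behavior the combined metric is meant to enforce.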
4.1. Results and Discussion
In response to Q1, we begin our discussion by analyzing Figure 4, which depicts a scatter plot comprising the test TNR/TPR scores attained by the proposed methods for every user in the dataset. Also included in the plot are Gaussian distributions fitted for every score and technique via kernel density estimation with a bandwidth parameter equal to 1 in all cases. A first look at the results plotted in this figure reveals that indeed both LSA and LOF benefit from the optional statistical encoding approach (Subsection 3.2) when the focus is placed on maximizing the number of true negatives. This is especially notable in the case of LOF, where the average TNR increases from 0.62 (LOF) to 0.77 (LOF-box). This, as expected, comes along with a severe penalty in the number of detected positives, with a decrease in average TPR from 0.70 (LOF) to 0.36 (LOF-box). This particular result evinces the trade-off between both scores, for which the inclusion of algorithmic design options such as the statistical encoding scheme is crucial to achieve performance scores aligned with the operational requirements. For instance, the operator might conservatively prioritize a low number of false positives due to internal budgetary/resource constraints for inspection tasks, hence opting for the aforementioned encoding scheme.
Comparisons between techniques can be better analyzed by redrawing the results in Figure 4 as a series of violin plots, i.e. an enhanced version of the conventional boxplot with extended information about the shape of a kernel distribution fitted to the data samples. Such plots are provided in Figure 5 along with conventional boxplots overlaid on each case. In light of these results and linking to question Q2, it can be inferred that the naive LOF and LSA schemes in general outperform their statistically encoded counterparts in terms of outlier detection (TPR), since they essentially yield a finely adjusted model capable of discriminating slight deviations from the regular consumption patterns of the user. However, for users with more chaotic or unsteady patterns the parameter search procedure of the overall model fails to find a proper balance between sensitivity (TPR) and specificity (TNR). Because a portion of the validation set (and accordingly another part of the test set) is produced by emulating minor fluctuations in legitimate consumption traces, the new data traces are likely to fall in high-density regions already populated by legitimate user traces, which eventually makes it infeasible to draw boundaries for binary classification. At this point it is interesting to remark that the LSA-box scheme seems to be more resilient to the TPR degradation expected when including the statistical encoding within the outlier detection flow, with 70% of the overall set of analyzed users keeping TPR scores above 0.6 for this scheme.
The discussion follows by addressing question Q3; in this regard, Figure 6 depicts the distribution of the accuracy metric (i.e. the proportion of true estimations, both positive and negative, with respect to the total number of samples processed for each user) over the different parts in which the test set is divided: Region 1 (original legitimate test traces of the user), Region 2 (original traces with random shifts in the time domain), Region 3 (subtle random perturbations in the hourly consumption value of the user) and Region 4 (sharp zeroing of the consumption trace). For the sake of space and clarity results are only shown for the LSA and LSA-box schemes. As expected, the use of an elastic measure of similarity at the core of the classifier design implies that the score statistics of Regions 1 and 2 are similar to each other, thus evincing that the overall model is capable of accommodating occasional behavioral changes in the consumption habits of the user that other conventional similarity metrics (e.g. pairwise Euclidean distance) would declare as a false positive. When focusing on Regions 3 and 4 the obtained results confirm the intuition that subtle variations in Region 3 are significantly more challenging to detect as outliers than the zeroed data traces composing Region 4. Interestingly, accuracy scores of LSA-box for Region 3 are lower than those of the naive LSA scheme, due to the fact that small deviations may fall within the computed statistical boundaries driving the trace encoding strategy of LSA-box. By contrast, zeroed samples playing the role of malfunctions in the power quantification or tampering (namely, Region 4) are better detected by the LSA-box scheme, with accuracy scores above 0.8 for 80% of the total set of users in the experimental setup.
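A per-region breakdown of the accuracy metric like the one discussed above can be computed as in this sketch. The labels, predictions and region assignment are invented for illustration and the function name `accuracy_by_region` is hypothetical; regions 1 and 2 are assumed to hold inliers (label 0) and regions 3 and 4 outliers (label 1).

```python
import numpy as np

def accuracy_by_region(y_true, y_pred, region):
    """Return {region id: fraction of correctly predicted traces in that region}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    region = np.asarray(region)
    return {
        int(r): float(np.mean(y_true[region == r] == y_pred[region == r]))
        for r in np.unique(region)
    }

# Hypothetical test set of one user: two traces per region.
region = [1, 1, 2, 2, 3, 3, 4, 4]
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0, 1, 1]
print(accuracy_by_region(y_true, y_pred, region))
# {1: 1.0, 2: 0.5, 3: 0.5, 4: 1.0}
```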
The rationale for the different performance patterns found between techniques over the regions of the test data traces can also be understood in connection with the regularity of the user in his/her energy consumption patterns. When translating raw values of the consumed energy to a reduced yet statistically meaningful alphabet, the overall dataset of the user at hand is more likely to be explained by a reduced set of patterns. A byproduct of this simplification is a better discrimination of outliers when they are characterized by severe amplitude drops, as distances become enlarged by virtue of the range discretization of their median values. We exemplify this observation in Figure 7, which shows a boxplot of the hourly energy measurements of two different users in the dataset considering the DTW alignment between the data traces and the average consumption habit of every customer. As opposed to the consumption irregularity characterizing User A, User B features relatively more stable consumption patterns, yielding significantly better predictive scores than those obtained for User A (i.e. average TNR/TPR scores equal to 0.95/0.93 versus 0.83/0.58 for LSA-box).
We end the discussion by elaborating on the implementation of the proposed detectors in practice (question Q4). In this context it is important to remark that the scores reported so far refer to isolated daily predictions, i.e. TNR/TPR values correspond to decisions made over one single day. This, however, lies at an unrealistic extreme with respect to the practical implementation of the proposed detectors, in which the operator would ask the inspection department to investigate the equipment installed at a certain user's premises only after a number of consecutive positives have been detected on his/her data traces.
A naive albeit insightful scheme modeling a more realistic implementation hinges on voting by majority a number of consecutive predictions for every user. Results shown in Figure 8 for 3 consecutively voted outcomes of the model buttress this hypothesis: predictive scores are improved notably by adopting this practical approach over those obtained by the model predicting on an individual sample basis (included also in the plot for comparison). Remarkably, LSA-box achieves TNR/TPR scores above 0.9 for at least 75% of all users, promisingly paving the way for the deployment and operation of this model in real smart grid scenarios.
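The majority voting over consecutive daily predictions can be sketched as follows. The text does not specify whether the voted groups of 3 days overlap; the sketch assumes a sliding window, and both the function name `majority_vote` and the example sequence are illustrative.

```python
import numpy as np

def majority_vote(labels, window=3):
    """Majority vote over every run of `window` consecutive daily labels.

    labels: sequence of 0/1 daily predictions (1 = outlier).
    Assumption: a sliding window is used, so the output has
    len(labels) - window + 1 entries; `window` must be odd to avoid ties.
    """
    labels = np.asarray(labels, dtype=int)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if labels.size < window:
        return np.empty(0, dtype=int)
    votes = np.convolve(labels, np.ones(window, dtype=int), mode="valid")
    return (votes > window // 2).astype(int)

# Hypothetical daily predictions: an isolated false alarm on day 2,
# followed by a sustained anomaly from day 5 onwards.
daily = [0, 1, 0, 0, 1, 1, 1, 0]
print(majority_vote(daily).tolist())  # [0, 0, 0, 1, 1, 1]
```

The isolated positive is voted out, whereas the sustained run of positives survives, which is the mechanism by which voting can raise both TNR and TPR.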
5. Concluding Remarks and Future Research Lines
This manuscript has elaborated on the detection of NTL events in energy consumption profiles captured by AMIs in Smart Grids. In particular we have proposed a portfolio of techniques incorporating several novel ingredients over the related literature. First, an elastic measure of similarity between consumption traces has been adopted so as to accommodate the eventual temporal variability of the consumption patterns featured by the user under analysis, thus making the overall detector focus on shape patterns within the consumption traces regardless of the time support over which they occur. Second, we have defined an optional encoding strategy relying on boundaries driven by the statistics of the load curves of the user, conceived as a means to provide flexibility to the overall detector against minor amplitude fluctuations and consequently, to detect true negatives more reliably.
A data mining flow has been built upon two different distance-based learning mechanisms (LOF and LSA) that can be adopted as its inner classification model, incorporating further elements (e.g. distance-based clustering and cross-validation) aimed at a proper characterization of the user with regard to the casuistry of NTL events targeted in the paper. The combination of distance-based learning algorithms and the optionality of the encoding strategy has given rise to 4 different schemes (namely, LOF, LSA, LOF-box and LSA-box), which have been described in detail throughout the article and compared to each other over a dataset comprising real data traces of a Spanish utility company. Results obtained therefrom have been analyzed macroscopically by assessing how each scheme balances the trade-off between sensitivity and specificity when detecting emulated events reflecting different effects of NTL events in the load curves. The observed performance scores of each technique in the benchmark confirm the postulated hypotheses: the use of an elastic measure of similarity between time series reduces the rate of false alarms due to the eventual variability of legitimate consumption traces along time, whereas the inclusion of a statistical encoding approach prior to distance computation enhances the reliability of the detector when predicting legitimate traces (higher true negative rate), at the cost of a degraded discriminability of confirmed NTL events (lower true positive rate).
Nevertheless, the ultimate decision concerning the selection of one model or
another (accepting possibly optimal models and discarding suboptimal ones)
is essentially a business-related matter depending on both the availability
of inspection resources and the interest of the utility company in
triggering manual inspection campaigns. Frequently, in real environments a
misclassification involves considerable inspection costs derived from
checking in situ the reasons for the predicted NTL event, hence turning the
rate of false alarms into the most critical objective. Among the methods
compared in our experiments, LSA-box stands out as the one achieving the
best balance between the rate of true positives and the rate of true
negatives.
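
For reference, the two scores balanced throughout this discussion can be computed as in the minimal sketch below. It assumes that the positive class denotes an NTL event and the negative class a legitimate trace, consistent with the usage above; the function name `tnr_tpr` and the toy labels are illustrative only and are not data from the experiments.

```python
import math
from typing import Sequence, Tuple


def tnr_tpr(y_true: Sequence[bool], y_pred: Sequence[bool]) -> Tuple[float, float]:
    """Return (TNR, TPR), where True denotes an NTL event (positive class).

    TNR = TN / (TN + FP): fraction of legitimate traces kept as legitimate.
    TPR = TP / (TP + FN): fraction of NTL events that are detected.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1
        elif not truth and not pred:
            tn += 1
        else:
            fp += 1  # legitimate trace flagged as NTL: a false alarm
    tnr = tn / (tn + fp) if (tn + fp) else math.nan
    tpr = tp / (tp + fn) if (tp + fn) else math.nan
    return tnr, tpr


# Toy labels (hypothetical): 3 NTL events and 5 legitimate traces.
truth = [True, True, True, False, False, False, False, False]
preds = [True, True, False, False, False, False, False, True]
print(tnr_tpr(truth, preds))
```

A false alarm lowers the TNR and is the outcome that triggers an unnecessary inspection, which is why a utility with scarce inspection resources may weigh the TNR more heavily than the TPR.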
Finally, we have presented a more practical detection scheme based on
majority voting over consecutive predictions of the proposed NTL detection
algorithms, which has been shown to enhance the performance scores
significantly for all techniques in the benchmark, with values above 0.9 for
75% of the users for LSA-box with just three votes in the decision. This
last result is especially encouraging for the practical deployment and
operation of the proposed scheme, in which research efforts will be invested
in the near future. Another aspect in the research agenda related to this
work will gravitate on the alleviation of the computational complexity
characterizing the cluster-wise parameter setting by selecting the cluster
samples over which models are subsequently trained and optimized. Practical
policies to periodically reschedule the overall detector based on the
prediction accuracy statistics and the feedback from inspection campaigns
will be investigated. The applicability of the proposed method to other
energy-related scenarios (e.g. sub-metering, user profiling, demand-side
management) will also be examined.
Acknowledgments
This work has been partially supported by the Basque Government under the
ELKARTEK program (BID3ABI project, grant ref. KK-2015/0000080), as well as
by the Spanish Ministerio de Energía y Competitividad under the RETOS
program (OSIRIS project, grant ref. RTC-2014-1556-3).
Bibliography
[1] US Energy Independence and Security Act (EISA) (2007): Energy
Independence and Security Act of 2007. 110th United States Congress.
[2] Liu, X., Nielsen, P. S. (2016): A Hybrid ICT-Solution for Smart Meter
Data Analytics. Energy 115 (3): 1710-1722.
[3] Trindade, F. C., Ochoa, L. F., Freitas, W. (2016): Data Analytics in
Smart Distribution Networks: Applications and Challenges. IEEE Innovative
Smart Grid Technologies - Asia (ISGT-Asia): 574-579.
[4] Beckel, C., Sadamori, L., Staake, T., Santini, S. (2014): Revealing
Household Characteristics from Smart Meter Data. Energy 78: 397-410.
[5] McLaughlin, S., Podkuiko, D., McDaniel, P. (2009): Energy Theft in the
Advanced Metering Infrastructure. International Workshop on Critical
Information Infrastructures Security, CRITIS, 176-187.
[6] Refou, O., Alsafasfeh, Q., Alsoud, M. (2015): Evaluation of Electrical
Energy Losses in Southern Governorates of Jordan Distribution Electric
System. International Journal of Energy Engineering 5(2): 25-32.
[7] Antmann, P. (2009): Reducing Technical and Non-Technical Losses in the
Power Sector. Background Paper for the World Bank Group Energy Sector
Strategy.
[8] Smith, T. B. (2004): Electricity Theft: A Comparative Analysis. Energy
Policy 32: 2067-2076.
[9] Nesbit, B. (2000): Thieves Lurk – the Sizeable Problem of Stolen
Electricity. Electrical World T&D.
[10] Nagi, J., Yap, K. S., Nagi, F., Tiong, S. K., Koh, S. P., Ahmed, S. K.
(2010): NTL Detection of Electricity Theft and Abnormalities for Large Power
Consumers in TNB Malaysia. IEEE Student Conference on Research and
Development (SCOReD): 202-206.
[11] Hawkins, D. M. (1980): Identification of outliers. Chapman and Hall 11.
[12] Knorr, E., Ng, R., Tucakov, V. (2000): Distance-based Outliers:
Algorithms and applications. VLDB Journal: Very Large Data Bases 8(3-4):
237-253.
[13] Ahmad, T. (2017): Non-technical Loss Analysis and Prevention using
Smart Meters. Renewable and Sustainable Energy Reviews 72: 573-589.
[14] Jokar, P., Arianpoo, N., Leung, V. C. (2016): Electricity Theft
Detection in AMI using Customers' Consumption Patterns. IEEE Transactions on
Smart Grid 7(1): 216-226.
[15] Nagi, J., Yap, K. S., Tiong, S. K., Ahmed, S. K., Nagi, F. (2011):
Improving SVM-based Nontechnical Loss Detection in Power Utility using the
Fuzzy Inference System. IEEE Transactions on Power Delivery 26(2):
1284-1285.
[16] Depuru, S. S. S. R., Wang, L., Devabhaktuni, V. (2011): Support Vector
Machine based Data Classification for Detection of Electricity Theft.
IEEE/PES Power Systems Conference and Exposition, 1-8.
[17] Nagi, J., Yap, K. S., Tiong, S. K., Ahmed, S. K., Mohamad, M. (2010):
Nontechnical Loss Detection for Metered Customers in Power Utility using
Support Vector Machines. IEEE Transactions on Power Delivery 25(2):
1162-1171.
[18] Nagi, J., Yap, K. S., Tiong, S. K., Ahmed, S. K., Mohammad, A. M.
(2008): Detection of Abnormalities and Electricity Theft using Genetic
Support Vector Machines. IEEE TENCON Conference, 1-6.
[19] Ford, V., Siraj, A., Eberle, W. (2014): Smart Grid Energy Fraud
Detection using Artificial Neural Networks. IEEE Symposium on Computational
Intelligence Applications in Smart Grid (CIASG), 1-6.
[20] Markoč, Z., Hlupić, N., Basch, D. (2011): Detection of Suspicious
Patterns of Energy Consumption using Neural Network trained by Generated
Samples. ITI International Conference on Information Technology Interfaces,
551-556.
[21] Monedero, Í., Biscarri, F., Leon, C., Biscarri, J., Millan, R. (2006):
MIDAS: Detection of Non-Technical Losses in Electrical Consumption using
Neural Networks and Statistical Techniques. International Conference on
Computational Science and Its Applications, 725-734.
[22] Nizar, A. H., Dong, Z. Y., Wang, Y. (2008): Power Utility Nontechnical
Loss Analysis with Extreme Learning Machine Method. IEEE Transactions on
Power Systems 23(3): 946-955.
[23] Ramos, C. C., Souza, A. N., Chiachia, G., Falcão, A. X., Papa, J. P.
(2011): A Novel Algorithm for Feature Selection using Harmony Search and its
Application for Non-Technical Losses Detection. Computers & Electrical
Engineering 37(6): 886-894.
[24] Ramos, C. C. O., de Sousa, A. N., Papa, J. P., Falcão, A. X. (2011): A
New Approach for Nontechnical Losses Detection based on Optimum-Path Forest.
IEEE Transactions on Power Systems 26(1): 181-189.
[25] Cody, C., Ford, V., Siraj, A. (2015): Decision Tree Learning for Fraud
Detection in Consumer Energy Consumption. IEEE International Conference on
Machine Learning and Applications (ICMLA), 1175-1179.
[26] Nizar, A. H., Zhao, J. H., Dong, Z. Y. (2006): Customer Information
System Data Pre-processing with Feature Selection Techniques for
Non-Technical Losses Prediction in an Electricity Market. International
Conference on Power System Technology, 1-7.
[27] Filho, J. R., Gontijo, E. M., Delaiba, A. C., Mazina, E., Cabral, J.
E., Pinto, J. P. O. (2004): Fraud Identification in Electricity Company
Customers using Decision Trees. IEEE International Conference on Systems,
Man and Cybernetics 4: 3730-3734.
[28] Muniz, C., Figueiredo, K., Vellasco, M., Chavez, G., Pacheco, M.
(2009): Irregularity Detection on Low Tension Electric Installations by
Neural Network Ensembles. IEEE International Joint Conference on Neural
Networks, 2176-2182.
[29] Fourie, J. W., Calmeyer, J. E. (2004): A Statistical Method to Minimize
Electrical Energy Losses in a Local Electricity Distribution Network. IEEE
AFRICON Conference 2: 667-673.
[30] Nizar, A. H., Dong, Z. Y., Jalaluddin, M., Raffles, M. J. (2006): Load
Profiling Non-Technical Loss Activities in Power Utility. First
International Power and Energy Conference (PECON) 1: 82-87.
[31] Cabral, J. E., Pinto, J. O., Pinto, A. M. (2009): Fraud Detection
System for High and Low Voltage Electricity Consumers based on Data Mining.
IEEE Power & Energy Society General Meeting, 1-5.
[32] Angelos, E. W. S., Saavedra, O. R., Cortes, O. A. C., de Souza, A. N.
(2011): Detection and Identification of Abnormalities in Customer
Consumptions in Power Distribution Systems. IEEE Transactions on Power
Delivery 26(4): 2436-2442.
[33] Berndt, D. J., Clifford, J. (1994): Using Dynamic Time Warping to find
Patterns in Time Series. KDD workshop 10(16): 359-370.
[34] Fawcett, T. (2006): An introduction to ROC analysis. Pattern
Recognition Letters 27(8): 861-874.
[35] Arlot, S., Celisse, A. (2010): A survey of cross-validation procedures
for model selection. Statistics Surveys 4: 40-79.
[36] Fu, T. C. (2011): A review on time series data mining. Engineering
Applications of Artificial Intelligence 24(1): 164-181.
[37] Shekhar, S., Lu, C. T., Zhang, P. (2002): Detecting Graph-Based Spatial
Outlier. Intelligent Data Analysis 6(5): 451-468.
[38] Acuna, E., Rodriguez, C. A. (2004): Meta Analysis Study of Outlier
Detection Methods in Classification. Technical paper, Department of
Mathematics, University of Puerto Rico at Mayaguez.
[39] Breunig, M. M., Kriegel, H. P., Ng, R. T., Sander, J. (2000): LOF:
Identifying Density-based Local Outliers. ACM SIGMOD Record 29(2): 93-104.
[40] Quinn, J. A., Sugiyama, M. (2014): A Least-Squares Approach to Anomaly
Detection in Static and Sequential Data. Pattern Recognition Letters 40:
36-40.
[41] Sugiyama, M. (2010): Superfast-Trainable Multi-class Probabilistic
Classifier by Least-Squares Posterior Fitting. IEICE Transactions on
Information and Systems 93: 2690-2701.
[Figure 5 plot. x-axis: Technique (LOF, LSA, LOF-box, LSA-box); y-axis:
Value (0.0 to 1.0); legend: TNR, TPR.]
Figure 5: Violin plot of the TNR-TPR statistics for every technique in the
benchmark. The LOF-box is severely affected by the statistical trace
encoding strategy, with the Pareto trade-off between TNR and TPR severely
unbalanced in favor of the latter. By contrast, TNR statistics of LSA-box
improve slightly, while keeping the TPR score at admissible levels.
Figure 6: Distribution of errors in every region of the dataset for LSA-raw
and LSA-box: region 1 corresponds to original data traces that should be
labeled as inliers, similarly to those in region 2, where original data
traces are warped along time up to a maximum shift of ∆max = 4 hours.
Regions 3 and 4 should be declared as outliers since they emulate sharp
(zeroing, as could happen in tampering) and subtle (small decreases of the
recorded energy) NTL events, respectively. Expectedly, scores are
significantly lower in region 3, where the effect of the NTL event is less
severe over the test data than in the rest of regions.
Figure 7: Hourly boxplot exemplifying the regularity and irregularity of two
consumers with regard to their energy consumption habits. Data samples used
for computing the boxplot at hour h ∈ {0, 1, ..., 23} are composed of those
hourly measurements along X_n (with n ∈ {A, B}) matched, via DTW alignment,
to the h-th hour of the average consumption habit of the customer (computed
over X_n).
Figure 8: Boxplots corresponding to the TNR/TPR scores obtained for each
technique (retrieved from Figure 5), and those scored by voting by majority
three consecutive outcomes of the model.