Computer vision for Pedestrian detection using Histograms of Oriented Gradients

Author: Rodríguez Fernández, José Marcos
Publisher: Universitat Politècnica de Catalunya
Year: 2014
Source: https://upcommons.upc.edu/bitstream/2099.1/21343/1/95066.pdf
Computer vision for pedestrian detection using Histograms of Oriented Gradients
Jose Marcos Rodríguez Fernández
Facultat Informática de Barcelona
Universitat Politècnica de Catalunya
A thesis submitted for the degree of Engineer in computer science
2014 January
1. Vocal: Francisco Javier Larrosa Bondia
2. Secretary: Joan Climent Vilaró
3. President: Manel Frigola Bourlon
Day of the defense: 31/03/2014
Abstract
This work addresses pedestrian detection in static images from a computer
vision point of view. The interest of such a detector lies in its many
applications: automotive safety, crowd control, video surveillance and automatic
image indexing are just a few examples. Detecting pedestrians is challenging
because people can adopt a wide range of poses, against very different
backgrounds and under significant changes in illumination and color. To
achieve a robust detection method we study and develop a HOG plus SVM
solution, as proposed by Dalal & Triggs. The HOG descriptor turns out to be
robust to small changes in image contour, location and direction, and to
significant changes in illumination and color. Even though HOGs perform
equally well for other classes, in this work we specifically target upright
persons, that is to say, pedestrians. Furthermore, we try several SVM models
and training approaches to pick out the best possible SVM kernel and
parameters for two well-known person data sets: the MIT and INRIA data sets.
Dedicated to my mother for her tireless efforts and kind soul.
Contents
List of Figures
List of Tables
Glossary
1 Introduction
1.1 Computer vision
1.2 Some background on object detection
1.2.1 Challenges in modeling the object class
1.2.2 Challenges in modeling the non-object class
2 Aims of the project
2.1 Final aim
2.2 Preliminary aims
3 A robust feature for object recognition
3.1 Introduction
3.2 HOG configuration
3.3 Extracting HOGs from an image
3.3.1 Brief overview of HOG descriptors
3.3.2 Input images
3.3.2.1 Image size
3.3.2.2 Color spaces
3.3.3 Gradient computation
3.3.4 Spatial / Orientation Binning
3.3.5 Normalization and descriptor blocks
3.4 Visualizing the data
4 SVM
4.1 What is SVM?
4.2 How does it work?
4.2.1 SVM kernels
4.2.2 Training a SVM
4.2.2.1 Selecting the SVM parameters
4.2.3 Testing a SVM
5 Pedestrian detection
5.1 Training the detector
5.1.1 Data sets
5.1.2 SVM models
5.1.2.1 MIT models
5.1.2.2 INRIA models
5.2 Testing the detector
5.3 The final detector
6 Implementation and Usage
6.1 Implementation
6.1.1 Auxiliary functions
6.2 Usage
7 Performance
7.1 MIT models
7.2 INRIA models
7.2.1 1st training approach
7.2.1.1 Linear models
7.2.1.2 RBF models
7.2.1.3 Linear versus RBF model comparison
7.2.2 2nd training approach
7.2.2.1 Linear models
7.2.2.2 RBF models
7.2.2.3 Linear versus RBF model comparison
7.2.3 Selecting the final model
7.3 Detector performance
8 Discussion
8.1 Possible improvements
8.1.1 Code compilation
8.1.2 Parallel GPGPU computation
8.1.3 Principal Component Analysis for dimensional reduction
8.2 Alternatives
9 Materials & methods
9.1 MATLAB
9.1.1 Variable types
9.2 Libraries and packages
9.2.1 libSVM
9.2.2 NIST DET plots
9.2.3 m2html
10 Project Management
10.1 Planning
10.2 Costs
10.2.1 Human resources
10.2.2 Hardware resources
10.2.3 Software cost
References
Glossary
ANN Artificial Neural Network
CNN Convolutional Neural Network
DET Detection Error Tradeoff
FLOPS FLoating-point Operations Per Second
GPGPU General-Purpose GPU
HOG Histogram of Oriented Gradients
MIT Massachusetts Institute of Technology
NIST National Institute of Standards and Technology
RAM Random Access Memory
RBF Radial Basis Function
ROC Receiver Operating Characteristic
SIFT Scale-Invariant Feature Transform
SVM Support Vector Machine
t-SNE t-distributed Stochastic Neighbor Embedding
1 Introduction
Computers and machines are ubiquitous elements in our daily lives, and
increasingly so day by day. It is natural to try to improve computer-human and
computer-environment interaction. Even though computers process huge amounts
of data with an ease that humans have failed to achieve, everyday human
problems such as recognizing another person or obtaining high-level
information from a scene turn out to be something machines do not handle with
the same ease as humans do. This fact was pointed out in the 80s by Hans
Moravec, who observed that human reasoning needs relatively little computation
while sensorimotor skills require enormous amounts of computation.¹ Therefore
one of the most challenging goals of modern engineering is to achieve a good
interaction between machines and their environment. This requires computers to
have some intelligence and learning capability. Speech recognition and visual
interpretation in particular are key to granting computers and machines good
interaction skills.

¹ As Moravec writes: "It is comparatively easy to make computers exhibit adult
level performance on intelligence tests or playing checkers, and difficult or
impossible to give them the skills of a one-year-old when it comes to
perception and mobility."

This chapter presents a brief description of the computer vision field, some
background on it and the main challenges it presents.
1.1 Computer vision
Computer vision, or artificial vision, is a field that comprises methods,
techniques and technologies to acquire, process and understand images. Typical
targets in computer vision include:
• detection, segmentation and localization of objects (e.g. face identification)
• object tracking (e.g. following pedestrian movements for surveillance)
• image search by its content (e.g. online image search engines)
• image restoration (e.g. noise or motion blur removal)
• 3D construction from a set of 2D images
Even though many computer vision applications are programmed to solve a
particular problem, it is becoming common to find methods based on machine
learning. Computer vision is therefore closely related to other fields. Figure
1.1 shows these relations.
Figure 1.1: Relations between computer vision and many other fields
As a scientific discipline, computer vision is concerned with the theory behind
artificial systems that extract information from images. From a more
technological point of view, among its applications we could highlight:
• Industrial quality control (e.g. detecting cracks or fissures in tiles, bottles, etc.)
• Autonomous driving
• Detecting events in video surveillance
• Indexing image databases
• Modeling objects or environments (e.g. medical image analysis)
• Computer-human interaction (e.g. controlling a computer with eye movement)
• Automatic document treatment (e.g. bank check verification)
1.2 Some background on object detection
In this thesis we aim to search for objects, in this case pedestrians, within
static images. An object detection application can mainly be split into two
stages: an encoding stage, where the image or a region of it is translated
into a feature vector representation, and a detection stage, which decides
whether a given region is a positive match or not. The feature representation
can be either sparse or dense and usually captures shapes, contour information
or intensity patterns. The sparse feature approach extracts information only
from some special regions, on the assumption that not all parts of the image
are equally important. It is noteworthy that this method needs some way to
detect the important regions, which itself implies a dense search over the
image. Nevertheless, the way these points are found differs from the final
encoding of the dense feature approach. Additionally, the spatial
relationships between the features can also be used to make the detection
decision. From the detector point of view, mainly two approaches can be found.
On one hand there is the part-based approach, where a detection is defined by
the presence of the different parts that constitute the object and by the
spatial relations between these parts. Considering the pedestrian detection
problem, we could think of a pedestrian as a set of arms, legs, head and
torso. For this set of parts to actually be a person it is not enough to find
each part; the parts must also satisfy some spatial ordering. On the other
hand there is the template approach, where a dense feature is computed and
then classified as matching or not by a learning algorithm, commonly SVM or
similar techniques.
Among the solutions and methods proposed for pedestrian detection, one of the
most promising approaches seems to be the sliding window paradigm when dealing
with low to medium resolution images.
Papageorgiou et al. proposed one of the first sliding window plus SVM
detectors, with an overcomplete dictionary of multiscale Haar wavelets (2).
Viola and Jones built their detector upon integral images for fast feature
computation and a cascade structure for efficient detection, using AdaBoost
for automatic feature selection (3). Zhu et al. sped up HOG features by using
integral histograms (4, 5). Dollar et al. proposed an extension of the Viola
and Jones method in which Haar-like features are computed over multiple
channels of visual data (LUV color channels, grayscale, gradient magnitude and
gradient magnitude quantized by orientation) (6). The Fastest Pedestrian
Detector in the West (FPDW) extended this approach to fast multi-scale
detection, after it was demonstrated that features computed at a single scale
can be used to approximate features at nearby scales. It achieves 5 fps on
640 × 480 images, since the method does not need to compute a fine
scale-space pyramid (7).
In this thesis we densely compute our feature vector representation over image
regions following a template approach, and then classify every feature with an
SVM. The goal is to model pedestrians in such a way that lighting, background
or clothing do not harm the detection. This leads us to briefly review some of
the most common issues that object-detection problems present. These can be
divided into two categories: on one hand, the issues encountered when modeling
the object class that we want to detect; on the other hand, the issues in
modeling the non-object class.
1.2.1 Challenges in modeling the object class
Among the several difficulties that make it hard to model and detect a
particular object we could mention the following:
• Illumination variations: objects appear widely different, and modeling them
in such a way that they become invariant to these variations turns out to be a
key point.
• Object pose or object deformation: changes what an object looks like, and
therefore template approaches to object recognition tend to perform poorly on
highly changing objects.
• Cluttered scenes: hinder detection and localization of objects.
• Occlusion: makes it hard to recognize shapes, and therefore objects, as some
parts cannot be seen.
• Intra-class appearance: makes it difficult to group dissimilar objects (e.g.
car types, colors and shapes vary widely but they all represent cars).
• Viewpoint: makes appearance vary widely (e.g. pedestrians from an aerial
point of view versus a frontal view).
1.2.2 Challenges in modeling the non-object class
On the other hand, for the non-object class we should mention:
• Bad localization: sub-parts of the object could be detected but not the whole
object.
• Confusion with similar objects
• Miscellaneous backgrounds
2 Aims of the project
2.1 Final aim
The main objective of the thesis is to implement and evaluate a pedestrian
detection method for static images. We rely on the descriptor proposed by
Dalal and Triggs, from the French National Institute for Research in Computer
Science and Control (INRIA). Their solution consists of a HOG plus SVM
approach, which we build using the MATLAB language and environment. More
specifically, we compare the effectiveness of different sets of parameters,
SVM models, kernels, pedestrian data sets and variations of the detection
method, while achieving a complete understanding of Support Vector Machines,
kernels and the HOG descriptor itself. Furthermore, we carry out a complete
set of tests and measures for each trained detector to show, in different
ways, which configuration turns out to be the best, taking into account
detection accuracy in terms of a trade-off between two different types of
errors. In addition, every bit of code is developed with flexibility and
abstraction in mind, so every experiment and model in this thesis can be
replicated with the same or different parameters, configurable through
parameter files. This enables other people to obtain extra results or make
different comparisons. For the same reason a wide set of routines and
functions is also available to reproduce the same data and results and to make
comparisons easier.
3. A ROBUST FEATURE FOR OBJECT RECOGNITION
Figure 3.5: Gradient magnitude equation
As regards the angle, it can be calculated as the four-quadrant inverse tangent
of Gy and Gx, namely:
$$\Theta(i, j) = \arctan\frac{G_y(i, j)}{G_x(i, j)} \tag{3.5}$$
Figure 3.6: Gradient angle equation
However, we have not yet commented on two details of how the HOG is obtained.
The first detail concerns the already mentioned aspect of the color space of
the input image. The process described so far does not say how to deal with
image colors, that is, with the RGB image case. As we will see in §7, using the
three channels when they are available slightly improves the overall
performance. The basic idea behind the HOG computation for a three-channel
image is to apply the explained method independently to all the channels and,
for each pixel of the image, to take the gradient with the highest magnitude
and its corresponding angle.
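The per-pixel channel selection described above can be sketched in a few lines of Python. This is a minimal illustration and not the thesis's MATLAB code: the centred `[-1, 0, 1]` derivative mask, the border handling (reusing the nearest valid pixel) and the function names `channel_gradients` and `dominant_gradient` are all assumptions made for the example.

```python
import math

def channel_gradients(channel):
    """Centred [-1, 0, 1] derivatives of one channel (a list of rows).

    Border pixels reuse the nearest valid neighbour; this is only one
    possible boundary convention and is an assumption of this sketch.
    """
    h, w = len(channel), len(channel[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            left, right = channel[i][max(j - 1, 0)], channel[i][min(j + 1, w - 1)]
            up, down = channel[max(i - 1, 0)][j], channel[min(i + 1, h - 1)][j]
            gx[i][j] = float(right - left)
            gy[i][j] = float(down - up)
    return gx, gy

def dominant_gradient(channels, signed=False):
    """Per pixel, keep the gradient of the channel with the largest magnitude.

    Returns (magnitude, angle_in_degrees) as two lists of rows. Angles are
    folded into [0, 180) when unsigned and into [0, 360) when signed.
    """
    h, w = len(channels[0]), len(channels[0][0])
    period = 360.0 if signed else 180.0
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for channel in channels:
        gx, gy = channel_gradients(channel)
        for i in range(h):
            for j in range(w):
                m = math.hypot(gx[i][j], gy[i][j])
                if m > mag[i][j]:
                    mag[i][j] = m
                    # Four-quadrant inverse tangent, as in equation (3.5).
                    ang[i][j] = math.degrees(math.atan2(gy[i][j], gx[i][j])) % period
    return mag, ang
```

For a grayscale image the same function can be called with a single-element list of channels.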
The second detail has to do with the boundary condition, that is to say, how to
compute the gradient at the image boundaries, where the center of the
derivative mask cannot be placed as it should. We can set this detail aside
for now, so as not to obscure the main idea behind the descriptor itself; it
will be explained in more detail when we review the concrete implementation in
§6.
3.3.4 Spatial / Orientation Binning
This step introduces the non-linearity of the descriptor and implies the
creation of a histogram for each of a set of local spatial areas, the ones we
called cells. As stated in the first, brief definition given in the
introduction, the image is subdivided into small cells. For every pixel within
a cell a weighted vote is issued to the bin corresponding to the angle of its
gradient, and the votes of all the pixels of a cell are accumulated to form
the final histogram of that cell. These histograms represent angles evenly
spaced between 0° and 180° or between 0° and 360°, depending on whether the
angle is unsigned or signed, respectively.
The authors found that using unsigned angles together with 9 orientation bins
was the best performing configuration. The reason why omitting the angle sign
works better is probably that in human detection we find a wide range of
clothing and background colors, and therefore the sign is uninformative.
With respect to the vote weighting, the squared gradient magnitude, its square
root or the magnitude itself could be used. In practice, using the magnitude
itself turns out to be the best option.
Therefore the resulting equation for the computation of the k-th bin of the
histogram is:
$$h_k = \sum_{i,j} M(i, j)\,\mathbf{1}[\Phi(i, j) = k] \tag{3.6}$$
Figure 3.7: Bin k equation
(1 is the characteristic function that indicates whether a particular orientation belongs to a given bin or not)
In our concrete case, k ranges from 1 to 9. As Dalal and Triggs showed in (8),
a fine orientation coding turns out to be essential, while the spatial coding
can be coarser.
In fact, given the previous division of angles into bins, a given angle will
most likely fall between two bin centers, so the vote is split by
interpolating between the two neighboring bin centers. Voting like this
reduces aliasing. The bilinear interpolation is done as a double linear
interpolation, in both orientation and position.
$$f(x \mid x_1, x_2) = f(x_1) + \frac{f(x_2) - f(x_1)}{x_2 - x_1}\,(x - x_1) \tag{3.7}$$
Figure 3.8: Linear interpolation equation
Summarizing, the image is by now divided into small sub-regions called cells,
and cells are in turn grouped into larger spatial areas called blocks. For
every pixel within a cell the gradient (G(i, j)) is computed and a vote is
issued; this vote is then linearly interpolated to determine which portion of
its magnitude (M(i, j)) corresponds to each contiguous bin. So every pixel
contributes to a couple of bins and, at the same time, to each of the four
histograms that form a block.
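The orientation part of this voting scheme can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: it interpolates in orientation alone (the spatial interpolation between neighboring cells is left out), bin centers are assumed to sit at the middle of each bin, bins are indexed from 0 rather than from 1 as in equation (3.6), and the names `vote` and `cell_histogram` are hypothetical.

```python
import math

def vote(histogram, angle, magnitude, span=180.0):
    """Split one pixel's vote between the two nearest orientation bin centres."""
    n_bins = len(histogram)
    width = span / n_bins
    # Position of the angle, in bin widths, measured from the first bin centre.
    pos = (angle % span) / width - 0.5
    lower = math.floor(pos)
    frac = pos - lower
    # Orientations wrap around, so the last bin neighbours the first one.
    histogram[lower % n_bins] += magnitude * (1.0 - frac)
    histogram[(lower + 1) % n_bins] += magnitude * frac

def cell_histogram(magnitudes, angles, n_bins=9, span=180.0):
    """Accumulate the interpolated votes of every pixel of one cell."""
    histogram = [0.0] * n_bins
    for m_row, a_row in zip(magnitudes, angles):
        for m, a in zip(m_row, a_row):
            vote(histogram, a, m, span)
    return histogram
```

With 9 unsigned bins the centres fall at 10°, 30°, ..., 170°, so a pixel at 20° gives half of its magnitude to each of the first two bins, while a pixel at exactly 10° votes only for the first one. The total vote mass always equals the sum of the magnitudes.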
3.3.5 Normalization and descriptor blocks
As has been said throughout this work, detecting pedestrians is subject to the
main computer vision issues, such as illumination and background variations or
occlusions. For the sake of a robust descriptor some kind of illumination
normalization must be done; this becomes evident when paying attention to the
variations in gradient strength. This step is in fact essential to achieve
good results.
Several normalization schemes were explored in (8). Let us first define v as
the vector containing all the histograms of a given block, $\|v\|_k$ as the
k-norm of v, with $k \in \{1, 2\}$, and let $\epsilon$ be a small constant.
Then the normalization schemes are:
$$\text{L1-norm}: \quad v \to \frac{v}{\|v\|_1 + \epsilon} \tag{3.8}$$
Figure 3.9: L1-normalization of the descriptor vector of a block
$$\text{L1-sqrt}: \quad v \to \sqrt{\frac{v}{\|v\|_1 + \epsilon}} \tag{3.9}$$
Figure 3.10: L1-sqrt normalization of the descriptor vector of a block
$$\text{L2-norm}: \quad v \to \frac{v}{\sqrt{\|v\|_2^2 + \epsilon^2}} \tag{3.10}$$
Figure 3.11: L2-normalization of the descriptor vector of a block
The fourth scheme, L2-Hys, is the L2-norm followed by clipping, which limits
the values of v to 0.2, and a second normalization. It is obtained by taking
the result of the L2-norm, clipping it and normalizing again.
The experiments showed that L1-sqrt, L2-norm and L2-Hys perform similarly and
achieve good results, whereas the plain L1-norm decreases performance by 5%.
Not normalizing at all penalizes performance enormously, by around 27%.
One could suppose that including the $\epsilon$ value in the above
calculations might disrupt or distort the discriminative power of the
descriptor, but the results are insensitive to a wide range of $\epsilon$
values.
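As an illustration, the four block normalization schemes of Dalal and Triggs (8) can be written as one small Python function. This is a sketch rather than the thesis implementation: the function name `normalize_block`, the scheme labels and the default `eps` are assumptions, and the L1-sqrt branch (the square root of the L1-normalized vector) assumes non-negative histogram entries.

```python
import math

def normalize_block(v, scheme="L2-Hys", eps=1e-3, clip=0.2):
    """Normalize the concatenated histograms of one block.

    v      -- list of non-negative histogram values of the block
    scheme -- "L1", "L1-sqrt", "L2" or "L2-Hys"
    eps    -- small constant that avoids division by zero (value assumed here)
    clip   -- clipping threshold used by L2-Hys
    """
    if scheme in ("L1", "L1-sqrt"):
        norm = sum(abs(x) for x in v) + eps
        if scheme == "L1":
            return [x / norm for x in v]
        return [math.sqrt(x / norm) for x in v]
    if scheme in ("L2", "L2-Hys"):
        norm = math.sqrt(sum(x * x for x in v) + eps ** 2)
        out = [x / norm for x in v]
        if scheme == "L2-Hys":
            # Clip the L2-normalized values, then normalize a second time.
            out = [min(x, clip) for x in out]
            norm = math.sqrt(sum(x * x for x in out) + eps ** 2)
            out = [x / norm for x in out]
        return out
    raise ValueError("unknown normalization scheme: " + scheme)
```

Because every scheme divides by a norm of the block itself, multiplying all gradient magnitudes of a block by a constant (a uniform contrast change) leaves the result essentially unchanged, which is the illumination robustness discussed above.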
3.4 Visualizing the data
In general, before dealing with any classification task, it sometimes helps to
visualize how the selected descriptor represents the elements we want to
classify. Obviously, as the dimension of the descriptor increases, it becomes
more difficult to visualize how the selected features represent the data, and
therefore more difficult to determine at a glance whether that set of
features, in conjunction with a concrete classification method, will lead to
success or will fail.
In this case we are trying to visualize points in a 3780-dimensional space, as
the HOG configuration explained above gives a vector of that dimension. Since
such a space cannot really be visualized, the only possible way to figure out
how the data is distributed is to apply some dimensionality reduction method.
The idea is to end up with a 2D or 3D plot that reveals in some manner the
implicit relations between points. A good descriptor plus a good visualization
technique for high-dimensional data should give a map where instances of the
same class are close to each other and different classes are far from each
other.
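The figure of 3780 dimensions can be checked with a little arithmetic. The sketch below assumes the default Dalal and Triggs layout (a 64x128 window, 8x8-pixel cells, blocks of 2x2 cells shifted by one cell, and 9 orientation bins); these parameters and the function name `hog_length` are assumptions of the example, since the configuration itself is given elsewhere in the thesis.

```python
def hog_length(win_w=64, win_h=128, cell=8, block=2, stride=1, bins=9):
    """Length of a HOG descriptor for one detection window.

    cell   -- cell side in pixels
    block  -- block side in cells
    stride -- block stride in cells
    """
    cells_x, cells_y = win_w // cell, win_h // cell
    blocks_x = (cells_x - block) // stride + 1
    blocks_y = (cells_y - block) // stride + 1
    # Each block holds block*block cell histograms of `bins` values each.
    return blocks_x * blocks_y * block * block * bins

print(hog_length())  # 7 * 15 blocks * 4 cells * 9 bins = 3780
```

Overlapping blocks are what make the vector this long: each interior cell is counted in up to four different blocks, each time under a different normalization.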
We selected the t-distributed Stochastic Neighbor Embedding (t-SNE) (15)
method to obtain a visualization of the HOG representation, because of the
improvement this method presents over other similar ones. Plots from t-SNE
show a lower tendency to crowd points together in the center of the map, and
they are better at representing structures at many different scales in a
single map.
Before presenting the results of the t-SNE visualization, a very brief
explanation of how it works follows. The algorithm comprises two stages.
First, t-SNE constructs a probability distribution over pairs of
high-dimensional objects in such a way that similar objects have a high
probability of being picked, whilst dissimilar points have an infinitesimal
probability of being picked. Second, t-SNE defines a similar probability
distribution over the points in the low-dimensional map, and it minimizes the
Kullback-Leibler divergence (16) between the two distributions with respect to
the locations of the points in the map.
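Written out, the quantity minimized in the second stage is the Kullback-Leibler divergence between the two pairwise distributions. The symbols below are introduced here for illustration and do not appear in the original text: $p_{ij}$ is the probability assigned to the pair (i, j) in the high-dimensional space, $q_{ij}$ the corresponding probability in the low-dimensional map, and C the cost.

$$C = KL(P \,\|\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}$$

A pair that is similar in the original space (large $p_{ij}$) but placed far apart in the map (small $q_{ij}$) contributes a large term, so the minimization mainly works to keep similar points close together.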
Looking at the visualization provided by the t-SNE algorithm, one can observe
at least a strong relationship among the points representing HOGs extracted
from pedestrian windows, and likewise among those extracted from windows not
containing any pedestrian.
Once we have some clue about the discrimination capabilities of the HOG
descriptor, we can start thinking about classification methods, which we
explain in detail in the next chapter.
Figure 3.12: HOG t-SNE map - red circles represent pedestrians, blue crosses
non-pedestrians
Moreover, instead of visualizing the whole set of points representing every
instance, we may want to visualize the HOG feature extracted from a concrete
image, that is to say, to visualize the image in feature space and see
something similar to what the classifier sees. One possible approach is to
directly plot the values of each histogram bar for all cells in the image,
commonly represented in a star shape. Figure 3.13 shows some examples of this
representation. Another approach, more complex but considerably better for
understanding why our detector fails in some cases, is the method proposed by
Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz and Antonio Torralba from MIT
(17). Figure 3.14 shows some examples in image and feature space; the HOGgles
representation shows that the HOG descriptor does appear to correspond to
images of persons.
Figure 3.13: HOG common representation - (a) HOG representation, (b) original image
Figure 3.14: HOG HOGgles representation - panels (a), (b), (c)
4 SVM
4.1 What is SVM?
In a wide variety of situations one may want to assign a concrete object to
one or more categories or classes based on its characteristics. For example,
given the characteristics of a concrete document page, we may want to
determine, from a set of page types, which type of page it is. In computer
science this is called a classification problem. Support Vector Machines (SVM)
are supervised learning models and algorithms that analyze data and recognize
patterns in order to automatically build a set of rules for classifying
similar data that they have never seen before.
By supervised learning we refer to the task of inferring the above-mentioned
set of rules from already labeled data. Other kinds of algorithms are able to
group different objects into clusters by similarity without knowing a priori
the classes they belong to. In the case of SVM we need to provide labeled data
for all the classes into which we wish to have our data classified.
4.2 How does it work?
Let's explain the simple case in which we only have two classes, generally
called binary classification.
Given a set of L labeled points $\{x_i, y_i\}$, with $x_i \in \mathbb{R}^d$
being a vector of features or characteristics of the i-th object and
$y_i \in \{-1, +1\}$ the class label, we want to build a rule that assigns a
new x to one of the two possible classes. We will also assume our data is
linearly separable, that is to say, that we can draw a line that splits all
points belonging to one class from the points belonging to the other class
when d = 2, and a hyperplane when d > 2. Furthermore, we want this separating
hyperplane to maximize the margins, or distances, to the closest points of
each class.
The above-mentioned hyperplane can be described by $w \cdot x + b = 0$, where w
is normal to the hyperplane and $|b| / \|w\|$ is the perpendicular distance
from the hyperplane to the origin.
The classification conditions can then be written as:
$$x_i \cdot w + b \geq +1 \quad \text{for } y_i = +1 \tag{4.1}$$
$$x_i \cdot w + b \leq -1 \quad \text{for } y_i = -1 \tag{4.2}$$
These two equations can be combined into a single one of the following form:
$$y_i (x_i \cdot w + b) - 1 \geq 0 \quad \forall i \tag{4.3}$$
If, as we said, the sets of points are linearly separable, then we can draw a
line or hyperplane through the points of each class that lie closest to the
separating hyperplane, that is, the support vectors. The main aim of SVM is
then to maximize the distance between these two new hyperplanes; let's call
them H1 and H2.
Figure 4.1 illustrates the theory explained above.
Figure 4.1: SVM maximum separating hyperplane margin
Geometrically we can see that the distance from H1 to the hyperplane is equal
to $1 / \|w\|$, and similarly for H2, so the distance from H1 to H2 is equal to
$2 / \|w\|$. As we explained, the main goal is to maximize this distance, so
solving the problem reduces to minimizing $\|w\|$. Minimizing $\|w\|$ is
equivalent to minimizing $\frac{1}{2}\|w\|^2$, but using the latter term makes
it possible to perform Quadratic Programming (QP) optimization.
Therefore we have:
$$\min_{(w, b)} \; \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i (x_i \cdot w + b) - 1 \geq 0 \;\; \forall i \tag{4.4}$$
The development of the QP problem is out of the scope of this thesis, but the
complete development can be studied in detail in (18, 19, 20).
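Although solving the QP is out of scope, the constraint (4.3) and the margin width $2 / \|w\|$ are easy to check numerically for a candidate hyperplane. The following Python sketch does only that; it is not a solver, and the toy data and the helper names `satisfies_constraints`, `margin_width` and `support_vectors` are invented for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def satisfies_constraints(points, labels, w, b, tol=1e-9):
    """True if every point fulfils y_i (x_i . w + b) - 1 >= 0."""
    return all(y * (dot(x, w) + b) - 1 >= -tol for x, y in zip(points, labels))

def margin_width(w):
    """Distance between the two margin hyperplanes H1 and H2: 2 / ||w||."""
    return 2.0 / math.sqrt(dot(w, w))

def support_vectors(points, labels, w, b, tol=1e-9):
    """Points lying exactly on H1 or H2, i.e. y_i (x_i . w + b) = 1."""
    return [x for x, y in zip(points, labels)
            if abs(y * (dot(x, w) + b) - 1) <= tol]

# Toy data: the negative class sits at x = 0, the positive class at x >= 2.
points = [(0, 0), (0, 1), (2, 0), (2, 1), (3, 0)]
labels = [-1, -1, 1, 1, 1]

# Both candidates separate the data, but the first one has the wider margin.
for w, b in [((1.0, 0.0), -1.0), ((2.0, 0.0), -2.0)]:
    print(w, b, satisfies_constraints(points, labels, w, b), margin_width(w))
```

Both candidates satisfy the constraints, which is why the objective in (4.4) is needed: among all feasible (w, b) it selects the one with the smallest $\|w\|$, here w = (1, 0) with a margin of 2.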
Regardless of the details of the QP, the intuition behind SVM can be easily
explained with figure 4.2, where we have three different planes. H3 does not
even completely separate the two sets of points. H1 does separate the points,
but does not satisfy the maximum margin restriction. Finally, H2 separates
both sets and maximizes the distance from the hyperplane to the support
vectors.
Figure 4.2: Different hyperplanes in 2D - This figure shows three different
separations, but only H2 gives the maximum separation margin
4.2.1 SVM kernels
The reader will have noticed that, until now, all the possible separation
boundaries that SVM is able to infer turn out to be lines in the
$\mathbb{R}^2$ case or hyperplanes in $\mathbb{R}^n$. Sometimes more complex
boundaries are needed in order to classify data that is not linearly
separable. Figure 4.3 shows two examples of 2D representations of non-linearly
separable data.
Figure 4.9: Learning curve showing the evolution of the error as a function of
the model complexity - High variance = over-fit, high bias = under-fit
Another reason for over-fitting could be an insufficient number of training
instances. Plotting error curves as a function of the number of training
instances, for both the training set and the cross-validation or test set,
could help. In this case we expect to see the training error increase as the
number of training instances grows, since the model can no longer fit every
instance perfectly, and the cross-validation error decrease. It is also common
to see a large gap between both types of error. Figure 4.10 shows an example
of the expected curve.
Figure 4.10: Typical learning curve for high variance - Large gap between
errors and error convergence tendency suggest over-fitting
In conclusion, to solve this problem we should either increase the number of
training instances, which is not always possible and usually implies hard
labeling work, or relax the model complexity, by choosing another parameter
configuration or using another kernel.
Under-fitting.
On the other hand, we say a model has under-fitted when it is not capable of
describing the relationships in the given data. Under-fitting may occur when
trying to fit complex data with an excessively simple model. For instance,
trying to classify the data shown in Figure 4.3 with a linear kernel would
lead to a clear case of under-fitting, as a line is not able to split both
classes. Contrary to what happens in the over-fitting case, when the model is
too simple both training and testing errors are high. The same analysis can be
done to determine whether we have an under-fitting problem: plotting error
curves as a function of the training set size should help. Figure 4.11 shows
the typical shape these curves have when suffering from under-fitting. In this
case usually only a small gap separates both errors.
Figure 4.11: Typical learning curve for high bias - Little gap between errors
and high errors suggest under-fitting
In this case, increasing the number of training instances does not, by itself,
tend to help. When dealing with this problem we should increase the model
complexity.
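The two diagnoses can be condensed into a rough rule of thumb applied to the final training and cross-validation errors. The Python sketch below is purely illustrative: the function name `diagnose` and both thresholds are arbitrary assumptions, and in practice the full curves should be inspected rather than a single pair of numbers.

```python
def diagnose(train_error, cv_error, high_error=0.2, large_gap=0.1):
    """Rough reading of a learning curve's end point.

    train_error, cv_error -- error rates in [0, 1]
    high_error, large_gap -- illustrative thresholds, not values from the thesis
    """
    gap = cv_error - train_error
    if gap > large_gap:
        # Low training error but much worse validation error: high variance.
        return "over-fitting: add training instances or relax the model"
    if train_error > high_error and cv_error > high_error:
        # Both errors high and close together: high bias.
        return "under-fitting: increase the model complexity"
    return "no clear problem"

print(diagnose(0.02, 0.25))  # large gap between both errors
print(diagnose(0.30, 0.33))  # both errors high, small gap
```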
4.2.3 Testing a SVM
Some considerations must be taken into account when trying to determine the
effectiveness of a classification model. The first, and most important, is to
use as testing instances data that was not used in the training process. As we
want to test the real performance of the model, we need to know how it behaves
when dealing with unseen data; this gives us an idea of how well the model
generalized during training.
The second consideration has to do with the proportion of instances from each
class. Suppose we test with a very skewed data set where 99% of our data
belongs to the positive class and the remaining 1% to the negative one, and we
run the test with a model that, independently of the input, always outputs a
positive prediction. In this scenario, counting the right guesses gives 99%
accuracy, yet it is obvious that this model is not a good choice. Therefore it
is preferable to use a similar number of instances for each class. However,
measures other than a simple count can be used to circumvent this drawback if
for some reason we need to use skewed data sets. These measures and their
explanations can be seen in §5.2.
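The 99% example can be reproduced in a few lines of Python. The per-class rates computed here (miss rate and false positive rate) are common choices used only for illustration; they are not necessarily the measures defined in §5.2, and the function name `rates` is hypothetical.

```python
def rates(labels, predictions):
    """Accuracy plus per-class error rates for labels in {+1, -1}."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == -1 and p == -1)
    fp = sum(1 for y, p in pairs if y == -1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == -1)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "miss_rate": ratio(fn, tp + fn),            # positives that were missed
        "false_positive_rate": ratio(fp, fp + tn),  # negatives accepted as positive
    }

# 99 positives, 1 negative, and a "model" that always answers positive.
labels = [1] * 99 + [-1]
predictions = [1] * 100
print(rates(labels, predictions))
```

The accuracy is 0.99, but the false positive rate is 1.0: every negative instance is misclassified, which the simple count hides.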
5 Pedestrian detection
5.1 Training the detector
A complete explanation follows of how the SVM training was done, not only for
the final detector but also for all the intermediate states, models and
detectors that led the way to the definitive one.
In general, the training of every detector is divided into two or more
retraining steps. First a pre-model is trained; then several hard example
searches follow, each using the previously trained model. That is, using the
model just trained, an exhaustive search over the negative images is done in
order to find windows that the model misclassifies. Each model in turn is
trained by searching, through cross-validation, for the optimum parameter set
for its kernel within a range of possible values for each parameter. Once the
best parameters within the given parameter space are found, that model is
saved and becomes the model for the next hard example search. This process is
repeated until no further improvement is reached or the performance increases
only at a very low rate.
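The retraining loop can be sketched generically in Python. This is a simplified illustration and not the thesis's MATLAB pipeline: `train` and `predict` are placeholder callables (any cross-validated parameter search is assumed to happen inside `train`), the stopping rule is reduced to "no new hard examples or a maximum number of rounds", and the toy one-dimensional classifier exists only to make the example runnable.

```python
def train_with_hard_examples(positives, negatives, negative_windows,
                             train, predict, max_rounds=5):
    """Train, search the negative windows for false positives, retrain.

    train(positives, negatives) -> model
    predict(model, window)      -> +1 or -1
    """
    negatives = list(negatives)
    model = train(positives, negatives)
    for _ in range(max_rounds):
        # Hard examples: negative windows the current model accepts as positive.
        hard = [w for w in negative_windows
                if predict(model, w) == 1 and w not in negatives]
        if not hard:
            break
        negatives.extend(hard)
        model = train(positives, negatives)
    return model, negatives

# Toy stand-ins: a 1-D threshold halfway between the two classes.
def toy_train(pos, neg):
    return (max(neg) + min(pos)) / 2.0

def toy_predict(threshold, x):
    return 1 if x > threshold else -1

model, used_negatives = train_with_hard_examples(
    [10, 12], [0, 1], [0, 1, 2, 6, 7, 9], toy_train, toy_predict)
print(model, sorted(used_negatives))
```

In the toy run the first threshold (5.5) wrongly accepts the negative windows 6, 7 and 9; adding them to the training set moves the threshold to 9.5, after which no further hard examples are found.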
By exhaustive search we mean a search over a dense scale-space pyramid
following the sliding window paradigm. See figure 5.1.
The starting scale level is 1, so the first image is the original one. We then
keep adding levels to the pyramid as long as the scaled image is still larger
than 64 pixels along the horizontal axis and 128 pixels along the vertical
axis. The scale ratio between consecutive levels is 1.2. Concretely, we keep
generating scaled images while
$\lfloor \text{ImageWidth} / \text{Scale} \rfloor > 64$ and
$\lfloor \text{ImageHeight} / \text{Scale} \rfloor > 128$.
Figure 5.1: Scale-space pyramid illustration - Each level represents a scaled
version of the original image
As regards the sliding window configuration, the window stride (the sampling distance between two consecutive windows) at any scale is 8 pixels. If, after fitting all windows at a scale level, some margin remains at the borders, we divide the margin by 2, take its floor and shift the whole window grid by that amount.
For example, if the image size at the current level is (75, 130) and the margin left (with a stride of 8 and a window size of (64, 128)) is (3, 2), then we shift all windows by $\lfloor \mathrm{MarginX} / 2 \rfloor$ and $\lfloor \mathrm{MarginY} / 2 \rfloor$, that is, by (1, 1).
The new image width and height are calculated using the formulas $\mathrm{NewWidth} = \lfloor \mathrm{OrigWidth} / \mathrm{Scale} \rfloor$ and $\mathrm{NewHeight} = \lfloor \mathrm{OrigHeight} / \mathrm{Scale} \rfloor$. Here Scale = 1 implies the original image size.
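The pyramid and window grid described in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the project's MATLAB implementation: the function names are hypothetical, and a level is kept whenever one full 64×128 window still fits, which slightly relaxes the strict inequality given in the text.

```python
from math import floor

WINDOW = (64, 128)   # detection window (width, height) in pixels
STRIDE = 8           # sampling distance between consecutive windows
SCALE_RATIO = 1.2    # ratio between consecutive pyramid levels


def pyramid_levels(width, height):
    """Return (scale, new_width, new_height) per level, starting at scale 1."""
    levels = []
    scale = 1.0
    while True:
        w, h = floor(width / scale), floor(height / scale)
        # Assumption: stop as soon as a full window no longer fits.
        if w < WINDOW[0] or h < WINDOW[1]:
            break
        levels.append((scale, w, h))
        scale *= SCALE_RATIO
    return levels


def window_grid(width, height):
    """Top-left corners of all windows at one level, shifted by half the margin."""
    margin_x = (width - WINDOW[0]) % STRIDE
    margin_y = (height - WINDOW[1]) % STRIDE
    shift_x, shift_y = margin_x // 2, margin_y // 2
    xs = range(shift_x, width - WINDOW[0] + 1, STRIDE)
    ys = range(shift_y, height - WINDOW[1] + 1, STRIDE)
    return [(x, y) for y in ys for x in xs]
```

For the (75, 130) example in the text, `window_grid(75, 130)` places two windows, both shifted by (1, 1).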
5.1.1 Data sets
As pedestrian detection in particular, and object detection in general, is becoming an important goal in the machine learning field, several image data sets can be found on the internet, many of them belonging to computer vision institutes and research departments around the world. For this thesis two well-known pedestrian data sets have been used: the first one from the Massachusetts Institute of Technology (MIT) and the second one from the French National Institute for Research in Computer Science and Control (INRIA).
These are the specifications of each data set:
• MIT Data Set: (21)
– 64 x 128 (x3) PPM format images
– 924 files (positive, upright standing pedestrians)
– 10 Megabytes compressed : 22 Megabytes uncompressed
– pedestrian poses limited to rear or front views (people height from shoulders to the feet is approximately 80 px)
– images obtained from color video sequences taken in different seasons with different video cameras
• INRIA Data Set: (22) Divided in two formats: (a) original images with corresponding annotation files, and (b) positive images in normalized 64x128 pixel format (as used in (8)) with original negative images.
– 70 x 134 normalized and centered positive test images (left and right reflections)
– 96 x 160 normalized and centered positive train images (left and right reflections)
– 1218 original negative training images
– 614 original positive training images
– 453 original negative testing images
– 288 original positive testing images
– 970 Megabytes compressed : 1150 Megabytes uncompressed
– Only upright persons with height larger than 100 px are marked in each image
– images obtained from different sources
5.1.2 SVM models
Several SVM models have been trained in order to find the configuration that performs best.
5.1.2.1 MIT models
For the MIT data set a linear kernel and color images have been used, because perfect separation was achieved. Results can be found in the test section. As the MIT data set only provides positive images, an additional group of 2000 random negative images was sampled from images where no pedestrian could be found. These images, joined with a group of 654 positive images (the positive images given as train set by the MIT data set), were fed into a linear SVM to train a pre-model.
To find a reasonably good pre-model, various models were trained with different parameter configurations, performing a 5-fold cross-validation over the training set to measure the performance as a function of the parameter. As the model uses a linear kernel, the only parameter to adjust is the cost parameter (C), as explained in §4.2.2. The parameter search was made over a power-of-two range from $2^{-2}$ up to $2^{7}$, finding that increasing C beyond $2^{3}$ makes the model overfit and decreases its performance, as can be seen in figure 5.2.
Figure 5.2: MIT pre linear model cross-validation - y-axis represents performance and x-axis cost value.
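The cost search described above can be illustrated with a short Python sketch. It only shows the structure of the search (a power-of-two grid evaluated with k-fold cross-validation); the `evaluate` callback, which would train and score an SVM on one fold, is a hypothetical placeholder, and the interleaved fold assignment is an assumption made for brevity.

```python
def cost_grid(min_exp=-2, max_exp=7):
    """Cost values 2^-2, 2^-1, ..., 2^7, as used for the linear models."""
    return [2.0 ** e for e in range(min_exp, max_exp + 1)]


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


def select_cost(evaluate, n_samples, k=5):
    """Return (best_cost, mean_accuracy) over the grid.

    evaluate(cost, train_idx, val_idx) -> accuracy is a hypothetical
    callback that trains on train_idx and scores on val_idx.
    """
    best = None
    for cost in cost_grid():
        scores = [evaluate(cost, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        mean = sum(scores) / len(scores)
        if best is None or mean > best[1]:
            best = (cost, mean)
    return best
```

Passing `k=3` gives the 3-fold variant used later for the INRIA models.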
Although this model is just a preliminary model, it reaches a cross-validation accuracy of 99.85%, and even though the cross-validation was performed over the train set, it gives a good intuition about how the model will perform on unseen data.
The final model was obtained by searching exhaustively, as explained at the beginning of this chapter, the original 2000 negative images from which the negative training set was sampled, in order to find hard examples. Then the initial training set plus the hard examples found by the pre-model were fed into the linear SVM to obtain the final model, repeating the process explained above. This definitive MIT model performed slightly better than the preliminary one; the slight difference can be seen in the next cross-validation curve.
Figure 5.3: MIT round 1 linear model cross-validation - y-axis represents performance and x-axis cost value.
5.1.2.2 INRIA models
Linear models. For the more challenging INRIA data set there are two groups of models, depending on the SVM kernel used.
For the purpose of training SVM models from the INRIA data set, a set of 12180 negative windows was sampled randomly from the original negative images.
As the training process remains the same as long as the SVM kernel type is the same, it will only be explained once for each kernel type, even though all the results will be shown later in this chapter.
The process for all the linear kernels on the INRIA data set is identical to the process explained for the MIT data set, so it can be considered as already explained, except for a small difference in the cross-validation process. While a 5-fold cross-validation was performed for the MIT models, a 3-fold cross-validation was used for the INRIA models, because of the difference in the number of images used to create the models.
Due to the large number of images involved in the training process, and because of the nature of the training itself, a large amount of time and memory is needed to perform a cross-validation with a higher number of folds. This is especially noticeable in the first retrain round, when the hard examples found by the pre-model are used. With this in mind, and at the expense of a slightly poorer error estimation in the training process, the number of folds was reduced from 5 to 3.
As the training method followed by the authors was not clearly explained with respect to the number of negative instances used to train the pre-models, two ways of proceeding were tried.
In the first approach a subset of the negative training images was chosen randomly to train a pre-model. Once this model was ready, the hard examples were joined with the negative training images and served as the training set for the different rounds. Generally a similar number of negative and positive instances was used.
It is noteworthy that, in contrast to the MIT case, the first linear model for the INRIA data set did not perform as well as the MIT one did, so the re-train step becomes more important. Even so, only one retrain round is worthwhile; further rounds do not significantly improve the performance of the detector. The cross-validation curves for the RGB models can be seen in figure 5.4.
(a) INRIA pre linear model cross-validation (b) INRIA round 1 linear model cross-validation
Figure 5.4: Cross-validation accuracy tendency round comparison - y-axis represents performance and x-axis cost value.
The second approach consists of the same process but uses as initial set the whole negative training set provided by the INRIA team. In this case a wider range of cost values was needed to find the optimum parameter configuration, leading to the parameter search and curve shown in figure 5.5.
Figure 5.5: INRIA pre linear model cross-validation using all the negative training instances - y-axis represents performance and x-axis cost value.
ROC curves. When there is a tradeoff between error types, a single performance number is not the best way to represent the capabilities of a system. Such a system has many operating points, and is best represented by a performance curve. A receiver operating characteristic curve, or simply ROC curve, is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold is varied. It is a plot of the true positive rate against the false positive rate, and therefore shows the tradeoff between sensitivity and specificity (any increase in sensitivity will be accompanied by a decrease in specificity).
Some observations should help to interpret this plot:
• The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test.
• The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the test.
• The slope of the tangent line at a point gives the likelihood ratio for that value of the test.
• The area under the curve is a measure of accuracy. An area of 1 represents a perfect test; an area of 0.5 represents a worthless test.
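To make these observations concrete, the following Python sketch computes the ROC points and the area under the curve from a list of classifier scores. It is an illustration only, not the project's plotting script; it assumes labels of 1 for positives and -1 for negatives, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return the ROC curve as a list of (false_positive_rate, true_positive_rate).

    labels are assumed to be 1 (positive) or -1 (negative).
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Instances with the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A classifier that ranks every positive above every negative yields an area of 1, while one that gives all instances the same score yields 0.5, matching the last observation above.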
However, when a good performance is achieved, most of the plotting area is underutilized, as the curve tends to get close to the left and upper margins, giving fewer details about the tradeoff between errors. Therefore we also use another type of plot in order to compare different models in detail.
DET curves. In order to solve this issue with ROC curves we also use DET curves. In the DET curve we plot error rates on both axes, giving uniform treatment to both types of error, and use a scale for both axes which spreads out the plot and better distinguishes different well-performing systems. This usually produces plots that are close to linear.
This plot assumes a normal likelihood distribution for both positive and negative instances and scales the axes according to this assumption. This scaling and linearity allow a clearer observation of the system behavior.
Besides this advantage, some special points can be easily identified in a DET curve. A weighted average of the missed detection and false positive rates may be used as a reference for the overall performance; in this work all the plotted curves indicate, with a small circle, the point where this average is minimum. As we do not have any special requirement to minimize one of the two types of error, an evenly weighted average is used to find the point where this measure is minimum. This could be adjusted to meet any special requirement that a concrete application may have.
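The DET construction described above can be sketched in Python as follows. This is an illustrative sketch rather than the project's `plotDETcurve` code: the function names are hypothetical, labels are assumed to be 1 and -1, and the normal deviate scale is obtained from the standard library's `statistics.NormalDist`, clipping rates of exactly 0 or 1 so that the transform stays finite.

```python
from statistics import NormalDist


def det_curve(scores, labels):
    """Return (threshold, false_positive_rate, miss_rate) for each distinct score."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    curve = []
    for t in sorted(set(scores)):
        miss = sum(1 for s in pos if s < t) / len(pos)   # positives rejected
        fpr = sum(1 for s in neg if s >= t) / len(neg)   # negatives accepted
        curve.append((t, fpr, miss))
    return curve


def normal_deviate(rate, eps=1e-6):
    """Map an error rate to the normal deviate scale used on DET axes."""
    rate = min(max(rate, eps), 1.0 - eps)  # avoid infinite values at 0 and 1
    return NormalDist().inv_cdf(rate)


def best_operating_point(curve, miss_weight=0.5):
    """Point minimizing the weighted average of miss rate and false positive rate."""
    return min(curve, key=lambda p: miss_weight * p[2] + (1 - miss_weight) * p[1])
```

With `miss_weight=0.5` both error types count equally, as in this work; another weight would move the marked point along the curve.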
SVM scores distribution. In the general case an SVM defines one or more hyperplanes which behave as boundaries between classes; therefore the confidence of belonging to a concrete class can be seen as the distance from a boundary. This is what we call the classification score. To get an idea of how well the boundaries were placed, we plot a histogram counting how many positive and negative instances lie at a similar distance from the boundary, using a different color for each class. In these plots we expect to find little overlap between bins of different colors, as the SVM tries to maximize the margin from the boundary to each class. We could also use the probability estimates, but they turn out to be expensive to compute and are known to have some numerical issues. Even though a single score may not be very informative, we are not interested in the individual values but in the distribution of all the scores.
Threshold curves. When working with prediction probabilities, the SVM implicitly uses a 0.5 threshold to determine whether a concrete instance belongs to one class or the other. Depending on how the prediction probabilities are distributed and on the actual ground truth values, other thresholds may prove to be more appropriate. To find the optimum threshold for a desired tradeoff between the two possible kinds of error (false positives and false negatives) we plot the precision, recall and F-score measures as a function of the classification threshold.
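The threshold search can be illustrated with the following Python sketch, which sweeps candidate thresholds over the positive-class probabilities and keeps the one with the highest F-score. It is a sketch under stated assumptions: the function names and the evenly spaced grid of candidate thresholds are hypothetical, and labels are assumed to be 1 and -1.

```python
def precision_recall_f(probabilities, labels, threshold):
    """Precision, recall and F-score when predicting positive for p >= threshold."""
    pairs = list(zip(probabilities, labels))
    tp = sum(1 for p, y in pairs if p >= threshold and y == 1)
    fp = sum(1 for p, y in pairs if p >= threshold and y != 1)
    fn = sum(1 for p, y in pairs if p < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


def best_threshold(probabilities, labels, steps=100):
    """Threshold in (0, 1) with the highest F-score on an evenly spaced grid."""
    candidates = [i / steps for i in range(1, steps)]
    return max(candidates,
               key=lambda t: precision_recall_f(probabilities, labels, t)[2])
```

When a model is very confident about negatives, the best threshold found this way can lie well below 0.5, which is the situation reported later for the INRIA linear round-1 model.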
5.3 The final detector
Once the best possible model has been selected, a complete detector can be built. The detector consists of an SVM classifier using the best model achieved in the previous training stages plus a sliding window system.
Given an image, we first re-scale it; then a dense scale-space pyramid is built and a 64×128 detection window scans all the pyramid levels, sliding as explained for the exhaustive search performed during training to find hard examples.
This re-scaling of the input image is done for two main reasons. The first has to do with how the model was trained: as all the models were trained with 64×128 images in which pedestrians had an average height of 100 pixels, the testing stage must be consistent with this. The second is that it reduces the overall processing time by not checking windows that cannot be useful.
After this reduction is made, the exhaustive search begins. As one pedestrian can cause several detections from nearby windows, non-maximum suppression is applied so that only the most probable detection window is shown. The detector is nevertheless capable of drawing all the detection windows if desired, as well as showing each window in a separate figure for a more detailed examination.
Just for the purpose of understanding and visualizing how the sliding window method works, we made another detector in which the sliding window is shown at every stage, drawing a red rectangle when no pedestrian is found inside the window and a green rectangle with the confidence percentage when a pedestrian is found.
Regarding the non-maximum suppression algorithm, the main idea behind it is to group all the detection windows that fulfil some proximity condition, compare their classification probabilities and suppress all but the most probable detection within each cluster.
We tried two proximity conditions to group detection windows. The first approach is to group them by a squared Euclidean distance measured in pixels. For every pair of windows we compute the distance as $d = d_x^2 + d_y^2$, where $d_x$ and $d_y$ are the coordinate differences between the top-left corners of the two windows. As the square root operation is computationally expensive and does not improve the comparison, we decided not to apply it. Windows whose distance is shorter than the square of the shortest side of the detection window are considered near each other.
The other approach is similar but compares the overlapping area between windows: any two windows with an overlapping area greater than half the total area are considered close, and therefore are considered as two windows detecting the same pedestrian.
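Both proximity conditions can be sketched in Python as follows. This is an illustration, not the project's `non_max_suppression` function: the names are hypothetical, all windows are assumed to have the same 64×128 size in the coordinate frame being compared, "half the total area" is read as half the area of one window, and the grouping is done greedily from the most confident detection downwards, which is one simple way to keep only the best window of each cluster.

```python
WINDOW = (64, 128)  # detection window (width, height) in pixels


def near_by_distance(a, b, window=WINDOW):
    """Squared Euclidean distance between top-left corners, no square root."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    return dx * dx + dy * dy < min(window) ** 2


def near_by_overlap(a, b, window=WINDOW):
    """Overlapping area greater than half the area of one window."""
    w, h = window
    overlap_x = max(0, w - abs(a[0] - b[0]))
    overlap_y = max(0, h - abs(a[1] - b[1]))
    return overlap_x * overlap_y > 0.5 * w * h


def non_max_suppression(detections, near=near_by_distance):
    """Keep the most confident detection of each group of nearby windows.

    detections is a list of (x, y, confidence) tuples.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[2], reverse=True):
        if not any(near(det, k) for k in kept):
            kept.append(det)
    return kept
```

The two conditions do not always agree: windows at (10, 10) and (60, 10) are near by distance (2500 < 4096) but overlap by less than half a window, so the choice of condition changes which detections are merged.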
Figure 5.12: Multi detection versus non-max-suppression detection
It appears that the first approach gives better suppressions for the tested images. However, more detailed testing should be considered, as the proximity conditions may not be the optimum ones.
Some detection suppression examples are shown in figure 5.12.
6
Implementation and Usage
In this section we give a brief review of the implementation and some details of each detector component. A list and explanation of the libraries and packages used in the project can be found in §9.
In addition to the description in this chapter, complete documentation in HTML format is provided to easily navigate through the whole set of functions, making it possible to see the calls between them and the complete code.
The entire project has been developed using MATLAB, a widely known language and environment designed for numerical calculus and fast development of applications. Among the main reasons why MATLAB was chosen, we find:
1. Much faster development compared with traditional languages like C++ or Java
2. Large number of libraries and toolboxes for computer vision, optimization, etc.
3. Efficient manipulation of matrices and vectors
4. Large community and support
Although MATLAB provides an interesting development environment, as it runs on top of the Java Virtual Machine its performance is poor compared with a lower-level language.
MATLAB structures all the code in .m files, which can be scripts or functions. The main difference between them is that functions expect input parameters and return any number of output parameters, while scripts use neither input nor output parameters and therefore work with the workspace variables, so no private variables are available when working with scripts.
(A detailed explanation of the workspace and MATLAB can be found in §9)
In addition to the MATLAB project code, some scripts are also available to make it fast and easy to collect results. Mainly we provide functions for plotting cross-validation grids and curves and for creating Excel worksheets from the different logging data written by the training or testing functions of the project itself. Many of these scripts are written in Python, but their use is generally straightforward and brief explanations are included with the code.
6.1 Implementation
Before explaining the functions and scripts composing the project, it is worth being familiar with some MATLAB variable types, libraries and concepts such as cell arrays (9.1.1), .mat files (9.1.1) and the libSVM library (9.2.1).
It is also noteworthy that the whole project is built in such a way that every configurable parameter of the studied method can be changed through configuration files.
The functions and scripts composing the project follow (in alphabetical order):
• compute_cell_coordinates.m - function for computing cell coordinates given the cell x and y size. Input: image to split in cells, x cell size and y cell size.
• compute_gradient.m - function for computing the gradient of the input image. If the input image is an RGB image, the gradient is computed for each channel and the returned magnitudes and angles correspond to the highest magnitude for each pixel. Input: image from which to compute the gradient. Output: angles and magnitudes of the gradient in each pixel.
• compute_HOG.m - function responsible for computing the descriptor. Calls the compute_gradient function and performs the histogram computation and block normalization. Input: image to process, cell size in pixels, block size in cells and number of bins of the histogram. Output: HoG descriptor of the image.
• convert2gray.m - script for saving a gray scale version of every image in a specified folder. Used to obtain a gray set from the RGB images in the train and test sets of the INRIA person dataset.
• cross_validate.m - function responsible for the cross-validation of an SVM model. Once the whole parameter space is searched, the best configuration found is returned. A cross-validation log file, where the accuracy reached for each configuration is logged, is also saved along with a cross-validation grid or curve representing the evolution of the accuracy with respect to the parameter selection. Input: kernel type (RBF or linear, as string), cost and gamma ranges as lists of doubles, the train matrix (as many rows as training instances and as many columns as the HoG dimension), a label column vector specifying the class of each instance and the model save path. Output: string representing the best parameter set in the libSVM format.
• static_detector.m - function that performs the pedestrian search over all the images found in a specified folder. Given a model, it asks for a folder from which to read the desired images. Then, for each image, it searches for pedestrians. If desired, it applies non-maximum suppression to draw only the most probable matches over the image, marked with a green bounding box. If non-maximum suppression is not desired, all positive matches can be drawn. Every detection can also be shown in a separate figure for a more accurate examination if wanted. Input: SVM model. Output: none.
• sliding_detector.m - function similar to static_detector, but it draws the scanning window all along its sliding process over every image. Prompts for the image folder path and calls draw_sliding_window for every image found in the folder. Input: SVM model. Output: none.
• draw_sliding_window.m - function responsible for drawing the sliding window. Called from sliding_detector. Responsible for computing the descriptor and classifying each window as the detection window slides through the image. Input: sub-image defined by the detection window and the libSVM model desired for the classification. Output: none.
• get_feature_matrix.m - function to compute the descriptor matrix for all the input images. Input: paths to positive and negative images. All window parameters are read from the window_params file. Output: labels, a column vector with the ground truth class of each input instance; train matrix, the descriptor/feature matrix where the number of rows is equal to the number of input instances and the number of columns is equal to the feature vector dimension.
• get_files.m - function for retrieving all or a subset of the image paths from the positive and negative folders. Input: number of positive image paths desired, number of negative image paths desired (-1 to ask for all the images in the folder) and a cell array containing the paths to the folders containing positive and negative images respectively. In case no folder paths are given, a window prompts for both folder paths. Output: two lists containing the paths to the positive and negative images.
• get_negative_windows.m - function for computing the descriptor matrix of all input images. Input: two lists containing positive and negative image paths respectively. Output: labels, a column matrix with 1 for positive images and -1 for negative ones; train matrix, the descriptor/feature matrix, where the number of rows equals the number of instances and the number of columns equals the feature dimension.
• get_params.m - function responsible for reading different parameters from a .mat file. Input: parameter file path. Output: map or dictionary mapping all the parameter keys to their actual values.
• get_pyramid_dimensions.m - function for computing the dimensions of the scale-space pyramid given an input image. The parameters defining the stride, scale factor and window dimensions are read from a pyramid parameter file. By default this file is searched for in the .m file root folder; if not found, it is searched for in the params folder where the .m file is located. If both searches fail to find the parameters file, a window prompts for it. According to the parameters read, the function computes how many detection windows the pyramid will have. Input: image to process. All pyramid parameters are read from the pyramid_params file. Output: number of levels, total number of windows and windows per level.
• get_pyramid_hogs.m - function responsible for computing all the HoGs from the scale-space pyramid given an input image and the pyramid configuration parameters. Input: input image, descriptor size, pyramid scale factor between levels and window stride. Output: HoGs for every window, all the pyramid windows, number of windows per level and every window coordinate.
• get_scale_space_pyramid_images.m - function for computing all the pyramid window images and their coordinates given an input image. Input: image to process. Output: pyramid, a cell array containing in a pyramid structure all the windows of the scale-space pyramid; coordinates, the upper-left coordinates of each window in the returned pyramid structure.
• get_window.m - function for retrieving a window of a specified size from an image. Input: image to process, width and height of the desired window and the extraction method. The possible methods are centered or random: the centered method extracts a centered window of the specified size from the input image, while the random method simply picks any valid window of the specified size somewhere in the input image. Giving a coordinate as the method yields a window whose upper-left coordinate coincides with the specified one. Output: the extracted window.
• non_max_suppression.m - function responsible for suppressing all nearby detections, selecting the most confident one. Two modes of proximity are defined. Windows closer than the distance criterion, or exceeding the threshold value of overlapping area, are considered nearby windows and are therefore compared to find the most confident among them. Input: window pixel coordinates, window confidence measure and window size. Output: none.
• plotDETcurve.m - function for plotting one or more overlapped DET curves given one or more SVM models. Input: list containing the models to test, cell array containing the model names as they should appear in the legend, positive and negative folder paths. If neither positive nor negative paths are given, a window prompts for the folder paths. Output: none.
• plotROCcurve.m - script responsible for plotting the ROC curve. A window prompts for the SVM model file and the folders containing the positive and negative training images.
• plot_tsne_map.m - function to plot the t-SNE map given the image paths. In case no paths are given, it opens an explorer window to locate the folders. Input: cell array containing positive and negative image paths. Output: t-SNE plot.
7. PERFORMANCE
a broader testing stage, including a lot of comparisons between models to find the best possible options.
The INRIA person data set provides 1126 positive windows and 453 negative images for testing. From the negative images, we randomly sampled 3 windows per image, obtaining 1359 negative windows. This was done in order to perform all the tests without skewed classes that could potentially distort some measures. More details of the image data sets can be found in §5.1.1.
As we saw in §5.1.2, different approaches were followed when training the INRIA models, depending on the ratio between positive and negative instances. The first approach consisted of training every model with a similar number of positive and negative instances, while in the second all the training instances were used, that is, a data set where the available negative instances are around 5 times the positive ones. For each approach, linear and Gaussian kernels were trained. In the following sections we first present the performance measures grouped by approach and kernel type, and finally a review of the best models found in the previous comparisons in order to find the best option regarding the kernels.
7.2.1 1st training approach
7.2.1.1 Linear models
As regards the first method and the linear kernels, little improvement is achieved by re-training with the hard example images, as already indicated by the cross-validation curves shown in the SVM training chapter. The improvement achieved can be seen in figure 7.3, which shows how the optimal threshold value makes the second model slightly better.
The bar plots in figure 7.4 illustrate the small difference between both models. The middle bar represents the performance of the re-trained model when no optimal threshold is used, that is, when a 0.5 threshold determines the predicted class. In this case the performance falls dramatically.
7.2 INRIA models
Figure 7.3: INRIA linear rgb pre-model versus round-1 model DET curve (1st approach) - The optimal threshold value yields a slightly better model after the re-training
(a) Absolute measures (b) Relative measures
Figure 7.4: Re-train impact in INRIA linear model (1st approach)
A quick review of the classification probabilities shows that the round-1 model is very confident when predicting negative instances, so a small suspicion that a particular window is a positive instance is enough to decide that it actually is one. This gives an optimal threshold of 0.1 for the positive prediction probability, which is the same as saying that, if the detector is not at least 90% sure that a window is not a positive instance, the window is classified as positive. The optimal threshold search can be seen in figure 7.5.
Figure 7.5: INRIA linear rgb round-1 model threshold search (1st approach) - X in the figure shows the optimal threshold, Y shows the convergence of the three performance measures
7.2.1.2 RBF models
As we did for the linear models, we now compare the performance of every Gaussian or RBF model. In this case more than one round is needed to reach a model that further re-training steps do not improve. The first three steps are essential, as every new re-train produced a much better model. The DET curves in figure 7.6 show the improvement of every model.
Concrete numeric measures over the test set can be seen in figure 7.7. As indicated by the DET curves in figure 7.6, every stage improves all of the performance measures of the model.
Figure 7.6: INRIA RBF rgb training rounds DET curves (1st approach) - At every re-training stage we achieve an error reduction of around 50%.
(a) Absolute measures (b) Relative measures
Figure 7.7: Re-train impact in INRIA RBF models
It should be noted that these measures have been taken at the optimal threshold found in figure 7.8, where the optimality criterion is to minimize a cost function in which false positives and false negatives are penalized equally; in other words, to maximize the F1-score. As explained earlier, a different weighting might be desired, and the same plots could be used for that purpose.
(a) INRIA rbf rgb pre-1 model (b) INRIA rbf rgb round-1 model (c) INRIA rbf rgb round-2 model
Figure 7.8: Optimal threshold search for INRIA RBF models (1st approach)
7.2.1.3 Linear versus RBF model comparison
Now that we have the best model for each kernel type, we compare the performance of the linear and the Gaussian or RBF models using the same measures. As shown in figure 7.9, the Gaussian model performs better at every threshold level.
As we had a large amount of training data and a rather complex feature descriptor, we can take advantage of a more complex model that can generate more complex boundaries, therefore obtaining a better discrimination capability. Even though the RBF kernel performs better at every threshold, the effort needed to achieve this gain is noteworthy: on the one hand because of the much more expensive cross-validation process, and on the other hand because of the extra re-training steps required to reach the optimum model.
Figure 7.9: INRIA linear vs rbf models DET curves (1st approach) - DET curves show an advantage for the kernel model
The difference in performance when using the optimal thresholds for both models can be seen in figure 7.10.
Figure 7.10: INRIA linear model versus RBF model performance comparison (1st approach) - The RBF model increases the precision by about 2% and the overall performance by around 1% with respect to the linear one
7.2.2 2nd training approach
7.2.2.1 Linear models
The same methodology as for the first approach follows. First we compare the performance of the different steps needed by the linear models to achieve the best possible results. As before, this can be seen with the DET curves in figure 7.11.
Figure 7.11: INRIA linear rgb pre-model versus round-1 model - The round-1 re-training step improves the miss rate error
As shown in figure 7.11, it is not easy to determine at first glance which model performs better. A numeric measure at the optimum thresholds reveals a very slight gain for the round-1 model. In particular, figure 7.12 shows a subtle improvement in the miss rate measure in favor of the round-1 model.
7.2.2.2 RBF models
For the RBF models following the second approach only one training step was performed, as further re-training did not improve the results. However, an extensive parameter search was needed to reach a good performance. In this section the figures therefore refer not to different re-training steps but to different parameterizations, which lead to quite different models. Even though we do not present the concrete performance measures for every cross-validation search made while looking for the best Gaussian model, figure 7.13 shows the DET curves that trace how the performance of each model evolves with the training parameters. Finally, table 7.2 shows the numeric measures obtained with the final Gaussian model.
(a) Absolute measures (b) Relative measures
Figure 7.12: Re-train impact in INRIA linear model (2nd approach)
OKs   KOs  fp  fn  tp    tn    miss rate  Precision  Recall  F-score
2457  28   16  12  1114  1343  0.01065    0.9858     0.9893  0.9875
Table 7.2: INRIA RBF final model performance.
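The columns of this table are related through the confusion counts, which the following Python sketch makes explicit. The function name is hypothetical; the formulas are the standard definitions of miss rate, precision, recall and F-score, and the dictionary keys simply mirror the table headers.

```python
def detection_metrics(fp, fn, tp, tn):
    """Derive the table columns from the four confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "OKs": tp + tn,                  # correctly classified windows
        "KOs": fp + fn,                  # misclassified windows
        "miss_rate": fn / (tp + fn),     # fraction of pedestrians missed
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),
    }


# Confusion counts of the final RBF model (table 7.2).
final_rbf = detection_metrics(fp=16, fn=12, tp=1114, tn=1343)
```

The counts also give the size of the test set: tp + fn = 1126 positive windows and fp + tn = 1359 negative windows.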
7.2.2.3 Linear versus RBF model comparison
Again, in order to find the best model from this training approach, we compare the performance of all the linear and RBF models. Figure 7.14 shows how the Gaussian model outperforms the linear model, achieving an improvement of around 1% in both types of error. This makes the Gaussian model, again, the best choice.
7.2.3 Selecting the final model
Having completed all the possible comparisons over the whole set of trained models, we now show a more general comparison from the point of view of the model kernel type. With this section we intend not only to determine the best model among all the trained ones, but also to find out which approach proved to be the most successful, that is, which approach gives the best models with less effort.
Figure 7.13: INRIA RBF model performance due to the parameter configuration - Every curve represents a parameter search, from 1 to 3, until an outperforming model was achieved
Figure 7.15 illustrates, on the one hand, the comparison between all the linear models and, on the other hand, a comparison of the best linear models from each training approach.
(a) All linear models comparison (b) Final linear models comparison
Figure 7.15: Linear models comparison between the 1st and 2nd training approach
As can be seen in figure 7.15, both the final round-1 models and the pre-models are better when trained following the second approach, that is, using all the available data. It is important to note that even the best 1st approach linear model is worse than the 2nd approach pre-model. Figure 7.15(b) isolates the best linear model from each training approach, and clearly illustrates the improvement that the 2nd approach brings to the linear models.
Figure 7.14: INRIA linear versus rbf model performance comparison (2nd approach) - DET curves show an advantage of about 1% for the RBF model
The same analysis of the RBF models leads us to the best-performing model found in the previous sections, so we can objectively conclude which SVM model gives the best detection accuracy under our optimality criterion. Figure 7.16 shows that the best possible model turns out to be the RBF or Gaussian model trained following the second approach. Its advantage lies in decreasing both types of error (false positives and false negatives) by around 1%.
8. DISCUSSION
Therefore the SVM will deal with shorter feature vectors, which translates into less
computational effort and time.
Our brief tests on applying PCA to the HOG feature matrix show that it is possible
to reduce a 3780-dimensional HOG to a 937-dimensional one by sacrificing roughly
1% of the overall performance. However, we must take into account that only
a pre-model has been trained following this approach (so no re-training steps) and no
optimal threshold has been searched for.
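To make the reduction step concrete, the following is a minimal sketch of PCA on toy data: the samples are centered, the covariance matrix is built, and the leading eigenvectors are found by power iteration with deflation. It is an illustration under assumptions, not the procedure used in the tests described here: the project code is MATLAB, the text does not state which PCA routine or component-selection rule produced the 937 dimensions, and every function name below is hypothetical. The toy data reduces 3 dimensions to 1 instead of 3780 to 937.

```python
import math
import random


def center(rows):
    """Subtract the per-dimension mean from every sample."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dominant_eigenpair(m, iterations=200, seed=0):
    """Power iteration: largest eigenvalue of a symmetric matrix and its vector."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in m]
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:          # no variance left in this matrix
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))
    return eigenvalue, v


def pca_project(rows, k):
    """Project the samples onto their first k principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        eigenvalue, v = dominant_eigenpair(cov)
        components.append(v)
        # Deflation: remove the component just found before searching the next.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    reduced = [[sum(a * b for a, b in zip(row, c)) for c in components]
               for row in centered]
    return reduced, components


# Toy stand-in for a HOG feature matrix: 6 samples lying on one line in 3-D.
samples = [[float(t), 2.0 * t, 0.0] for t in range(6)]
reduced, components = pca_project(samples, k=1)
print(len(samples[0]), "->", len(reduced[0]), "dimensions per sample")
```

For feature matrices of realistic size a library eigen-solver or SVD would be the natural choice; the pure-Python loops above only serve to show the steps.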
Figure 8.2 shows the cross-validation curve and Table 8.1 shows the test measures
for a linear model trained with reduced HOGs.
Figure 8.2: INRIA linear PCA-model cross-validation curve - Cross-validation of
a linear model trained with PCA-reduced features.
| OKs | KOs | fp | fn | tp | tn | Miss rate | Precision | Recall | F-score |
|---|---|---|---|---|---|---|---|---|---|
| 2410 | 75 | 55 | 20 | 1106 | 1304 | 0.0176 | 0.952 | 0.982 | 0.967 |
Table 8.1: INRIA linear PCA-model performance.
In any case, as already mentioned, further tests should be done in order to reach a more
accurate conclusion, find the optimum reduction and uncover possible drawbacks not noticed
during our brief testing.
8.2 Alternatives
At the beginning of this thesis we mentioned some methods, developed by many
different institutes and researchers, other than the one developed here. All of those
methods consist in defining some kind of feature and then classifying it,
following different scanning approaches and trying many other possible enhancements
such as early rejection. Recently some interesting results have been obtained using
Artificial Neural Networks (ANN), especially Convolutional Neural Networks (CNN).
Surprising results were achieved in many different fields and problems: handwritten digit
recognition (23), object recognition (24) or autonomous driving. In particular, the work
developed in (24) by Alex Krizhevsky achieved a much lower error rate in the ILSVRC-2012
competition than many famous computer vision groups from all over the
world. Among several advantages, the CNN integrates feature extraction and
classification into one single, fully adaptive structure. Furthermore, we do not need to worry
about defining complex feature representations, as the embedding layers of the network
already define implicitly which features best represent the inputs. CNNs are designed
to recognize visual patterns directly from pixel images with minimal pre-processing.
Another significant improvement has to do with the relative tolerance that CNNs show
to geometric transformations, local distortions and so on.
Figure 8.3 shows a representation of the features learned by a hidden layer of 25 hidden
units in an ANN designed to recognize handwritten digits. The inputs of the network
were 20 × 20 images, each containing a handwritten digit. After the training stage every hidden
unit had a most likely activation representation. This ANN was just a small example
with only one input layer, one hidden layer and one output layer.
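As a rough sketch of how small this example network is, the Python code below builds a fully connected network with 20 × 20 = 400 inputs and 25 hidden units, runs one forward pass and counts its parameters. Several details are assumptions that the text does not state: the 10 output units (one per digit), the sigmoid activations, the bias terms and the random, untrained weights. All function names are illustrative.

```python
import math
import random


def count_parameters(n_inputs, n_hidden, n_outputs):
    """Weights plus one bias per unit for a network with one hidden layer."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Fully connected layer: one weighted sum and activation per unit."""
    return [sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Untrained layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
N_INPUTS, N_HIDDEN = 20 * 20, 25   # sizes given in the text
N_OUTPUTS = 10                     # assumption: one output unit per digit

w1, b1 = random_layer(N_INPUTS, N_HIDDEN, rng)
w2, b2 = random_layer(N_HIDDEN, N_OUTPUTS, rng)

image = [rng.random() for _ in range(N_INPUTS)]  # stand-in for a digit image
hidden = layer(image, w1, b1)
output = layer(hidden, w2, b2)

print(len(hidden), len(output), count_parameters(N_INPUTS, N_HIDDEN, N_OUTPUTS))
```

Each hidden unit owns 400 input weights, so one common way to draw a per-unit picture is to reshape each row of `w1` into a 20 × 20 image; whether Figure 8.3 was produced this way is an assumption.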
However, defining the topology of deep networks and training them can be a very
expensive task, so many resources may be needed to obtain good results. To
illustrate this idea we give some figures for the deep net trained in (24):
• Trained with stochastic gradient descent on two NVIDIA GPUs for about a week
• 650,000 neurons
• 60,000,000 parameters
• 630,000,000 connections
• Final feature layer: 4096-dimensional
Figure 8.3: Hidden layer feature representation learned by a simple ANN -
Every square image represents the input that makes a particular neuron activate.
This could be a very interesting approach for further investigation. Y. Bengio
and Y. LeCun present in (25) theoretical and empirical evidence showing that kernel
methods and other "shallow" architectures are inefficient for representing complex
functions such as the ones involved in artificially intelligent behavior, for example visual
perception.
9 Materials & methods
9.1 MATLAB
9.1.1 Variable types
Although the aim of this chapter is not to give a complete explanation of
MATLAB variables, we give a brief description of some data types used in the
project implementation.
MATLAB works mainly with matrices of different numeric representations, although
non-numeric types can also be stored in a matrix as long as some requirements are fulfilled.
Figure 9.1 shows the data types that can be stored in matrix form. The only
requirement that a matrix must meet is to have consistent dimensions, that is
to say, all elements in a matrix must have the same dimension and all elements within
the matrix must be of the same type.
Figure 9.1: MATLAB matrix data types - For a more accurate type explanation
refer to the MATLAB documentation in (26)
Cell Array.
For storing data of varying size, for example strings of different lengths in a vector or
array, MATLAB provides a special variable type called the cell array, which is a data
type with indexed data containers called cells, where each cell can contain any type of
data. Cell arrays commonly contain either lists of text strings, combinations of text
and numbers, or numeric arrays of different sizes.
Mat files.
Files with a .mat extension contain MATLAB formatted data. This data can be
loaded from or written to these files using the functions LOAD and SAVE, respectively.
We use this format to store the SVM models and the different parameters needed to
configure the scale-space pyramid procedure, the testing or training process, etc. More
information about .mat files is available in (27).
9.2 Libraries and packages
9.2.1 libSVM
Many SVM packages are available, but libSVM turns out to be one of the most popular
and complete ones. It is an open source library developed at the National Taiwan
University, written in C++, and it supports classification and regression. It is free
software published under a BSD license. Among its advantages, it
offers, through many interfaces, compatibility with a wide range of languages and
platforms. This is noteworthy because, should the code be ported to another language,
the same classifier with the same parameters and configuration could be used.
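The portability argument can be made concrete. Once trained, a kernel SVM is described by its support vectors, their coefficients, a bias term and the kernel parameters, so evaluating it in another language takes only a few lines. The Python sketch below uses the usual decision function, the sum of coefficient times kernel value minus a bias rho, with the linear and RBF kernels compared in this work. The toy support vectors, coefficients, `rho` and `GAMMA` are hypothetical values chosen for illustration, and the functions are not libSVM's own API.

```python
import math


def linear_kernel(u, v):
    """K(u, v) = u . v"""
    return sum(a * b for a, b in zip(u, v))


def rbf_kernel(u, v, gamma):
    """K(u, v) = exp(-gamma * ||u - v||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, coefficients, rho, kernel):
    """sum_i coef_i * K(sv_i, x) - rho; its sign gives the predicted class."""
    total = sum(c * kernel(sv, x)
                for sv, c in zip(support_vectors, coefficients))
    return total - rho


# Hypothetical toy model: one support vector per class in a 2-D feature space.
support_vectors = [[1.0, 1.0], [-1.0, -1.0]]
coefficients = [1.0, -1.0]   # positive for the "person" class, negative otherwise
rho = 0.0
GAMMA = 0.5                  # hypothetical RBF width parameter


def rbf(u, v):
    return rbf_kernel(u, v, GAMMA)


sample = [0.8, 1.2]
linear_score = decision_value(sample, support_vectors, coefficients, rho, linear_kernel)
rbf_score = decision_value(sample, support_vectors, coefficients, rho, rbf)
print("linear:", linear_score, "rbf:", rbf_score)
```

Both scores are positive for this sample, so both toy models assign it to the positive class; with a real model the support vectors and coefficients would come from the stored model file.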
9.2.2 NIST DET plots
To plot Detection Error Tradeoff (DET) curves, a MATLAB package provided by the
National Institute of Standards and Technology (NIST) is used. The complete software
can be downloaded from (28). For additional information about DET curves refer to (29).
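A DET curve plots the miss rate against the false alarm rate as the decision threshold is swept. The sketch below, in Python, computes those pairs from classifier scores; it only illustrates the quantities being plotted and is not the NIST package, which also takes care of the axis scaling. The function name and the toy scores are hypothetical.

```python
def det_points(positive_scores, negative_scores):
    """Return (false_alarm_rate, miss_rate) pairs, one per candidate threshold.

    A window is accepted as a pedestrian when its score >= threshold.
    """
    thresholds = sorted(set(positive_scores) | set(negative_scores))
    points = []
    for threshold in thresholds:
        misses = sum(1 for s in positive_scores if s < threshold)
        false_alarms = sum(1 for s in negative_scores if s >= threshold)
        points.append((false_alarms / len(negative_scores),
                       misses / len(positive_scores)))
    return points


# Hypothetical SVM decision values for pedestrian and background windows.
pedestrian_scores = [2.0, 1.5, 0.3, -0.2]
background_scores = [-1.8, -1.0, 0.1, -0.5]

points = det_points(pedestrian_scores, background_scores)
for false_alarm_rate, miss_rate in points:
    print(f"false alarms: {false_alarm_rate:.2f}  misses: {miss_rate:.2f}")
```

Raising the threshold trades false alarms for misses, which is the tradeoff the curves in the performance chapter display.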
9.2.3 m2html
To deliver comprehensive code documentation in an easy to read
and ecological way, all the MATLAB code has been parsed with the m2html functions,
creating a set of HTML pages containing all the information regarding the code. In
this way we explicitly define the relationship between functions, inputs, outputs and
the code itself, while making it extremely easy to navigate through all the source code and
files. The tool itself and all the information about it can be found in (30).
10 Project Management
10.1 Planning
The project is divided into four main phases: Research, Implementation, Building and
Documentation. The Research phase consists of compiling and understanding the
information, the Implementation phase covers all the code development, the Building phase
comprises all the model training, testing and selection, and finally the Documentation
phase has to do with collecting all the results and writing the documentation.
Figure 10.1 shows the overall project planning.
Figure 10.1: Overall project schedule - Collapsed view of all four project phases
A more detailed view of the research step reveals two main stages. The first one is an
approach to the theory behind the HOG descriptor, SVMs and the pedestrian detection
problem. The second one has to do with the selection of the development environment
with respect to the programming language, tools, libraries and available information or
community. Figure 10.2 shows the original schedule. The state-of-the-art step took longer
because further research was needed, as the pedestrian detection alternatives
involved some unfamiliar theory.
Figure 10.2: Research schedule - Detailed view comprising a theory and
technology study
Once all the theory had been digested and MATLAB had been chosen, the implementation phase
began, developing first the HOG feature extraction and then the classifier
construction. All the measurement tools were also developed in order to be able to
parallelize the training, testing and evaluation of the models. Figure 10.3 shows this
breakdown. Fortunately, some pre-built functions for plotting measures like ROC curves
and visualizing HOGs helped to shorten the data visualization phase.
The next stage consists of training and testing the SVM models. When possible, the
training and testing stages were overlapped in time, using two different computers. Due
to the long computation times needed to train complex models like the RBF ones
with a large data set and cross-validations over wide parameter spaces, the model building
phase was delayed. Meanwhile, the development of the final detector was carried out. Figure
10.4 shows this temporal overlap.
Finally, the results collection and documentation writing stage compiled all the project
explanation. As the code was properly documented in a first stage, the only pending
work was to produce documentation that is easy to use and understand. Using a MATLAB to
HTML converter, the code explanation turned out to be very economical in effort and time.
It is noteworthy that the writing of the thesis is, by itself, the largest step. Figure
10.5 shows this detail.
Figure 10.3: Implementation schedule - Detailed view showing all the implementation
phase
Figure 10.4: Building schedule - Detailed view of the building stage