Signal & Image Processing: An International Journal (SIPIJ) Vol.10, No.3, June 2019
DOI: 10.5121/sipij.2019.10304
A NOVEL DATA DICTIONARY LEARNING FOR LEAF
RECOGNITION
Shaimaa Ibrahem1, Yasser M. Abd El-Latif2 and Naglaa M. Reda2
1Higher Institute of Computer Sciences and Information System, Egypt.
2Faculty of Sciences, Ain Shams University, Egypt.
ORCID: 0000-0003-0701-087X
ABSTRACT
Automatic leaf recognition via image processing is of great importance to a number of professionals, such as botanical taxonomists, environmental protectors, and foresters. Learning an over-complete leaf dictionary is an essential step in leaf image recognition. Large leaf image dimensions and a large number of training images make it difficult to build a complete leaf data dictionary quickly. In this work, an efficient approach is applied to construct an over-complete leaf data dictionary for a set of images with large dimensions, based on sparse representation. In the proposed method, a new cropped-contour method is used to crop the training images. The experiments are evaluated using the correlation between the sparse representation and the data dictionary, with a focus on computing time.
KEYWORDS
Leaf image recognition, Dictionary learning, Sparse representation, Online Dictionary Learning
1. INTRODUCTION
Recognition has become an important technique in various fields over the past few years. Recognition algorithms are popular in many kinds of applications, such as face recognition, action recognition and leaf recognition [1]. This research focuses on leaf recognition based on the analysis of leaf images. The challenging task in recognizing leaves from leaf images is to find discriminant features that are appropriate for distinguishing different leaf classes. To classify leaves, different characteristics have been evaluated as comparative tools, such as color, shape, texture, morphology and venation structure, and leaf datasets such as the Swedish leaf dataset, the Flavia dataset, and the ICL dataset serve as standard benchmarks.
Most researchers focus on leaf shape features, because the overall shape structure of a leaf may be preserved even when the leaf sample has been damaged by age or insects, and because the leaf shape is approximately the same across a large set of leaves.
Shape-based approaches to leaf classification have been presented in many studies [1-5]. Jou-Ken [1] presented a plant identification system based on shape features extracted from the leaf images by the scale-invariant feature transform (SIFT) method. Yang et al. [2] presented a new approach to plant leaf recognition using a contour-based shape descriptor called multi-scale triangular centroid distance (MTCD) together with dynamic programming (DP). The descriptor extracts the MTCD features from each contour point to provide a compact, multi-scale shape descriptor, and the DP step finds the best alignment between corresponding points of the shapes. Wu, S.G. [3] developed a plant identification system based on a probabilistic neural network (PNN), a kind of feed-forward neural network. Plant leaf recognition was also presented in [4], using the move median centers (MMC) hypersphere classifier.
Mzoughi et al. [6] proposed a leaf recognition system that splits the entire leaf image into three parts (top, middle and base) and then, based on the contour and texture of each part, searches for candidate species through the fusion of descriptors. The authors integrated the proposed structuring process into a retrieval decision with a k-NN classifier and pointed out the reduction of leaf variety complexity within and across species as the main advantage, since it enables the definition of pertinent features for each subspace (part), depending on its discriminatory properties.
Aakif and Khan [7] proposed a novel shape-based leaf recognition algorithm that uses different features such as morphological characters, Fourier descriptors and a newly designed Shape-Defining Feature (SDF). The algorithm showed its effectiveness on baseline datasets such as Flavia.
Recently, many researchers have presented combinations of characteristics for plant leaf recognition [8, 9]. Kadir [8] presented plant identification based on a set of characteristics such as color, texture and shape extracted from the leaf images. Olsen et al. [9] proposed in situ leaf classification using a rotation- and scale-invariant histograms of oriented gradients (HOG) feature set to represent regions of texture within leaf images. Tang et al. [10] introduced a novel texture extraction method, based on the combination of the Gray Level Co-Occurrence Matrix (GLCM) and the local binary pattern (LBP), to classify green tea leaves. G.L. Grinblat [11] presented a new plant identification method using vein morphological patterns. This algorithm first extracts the vein patterns using the Unconstrained Hit-or-Miss Transform (UHMT) and then trains a convolutional neural network (CNN) to identify them using a set of central patches of leaf images. In addition, a considerable amount of research has used combinations of features to represent leaves.
An interesting approach for dealing with the variety of leaf features and the huge number of leaves is online dictionary learning (ODL), which our proposed approach uses. The Online Sparse Dictionary Learning algorithm (OSDL) is a well-known algorithm for learning a dictionary. Most recent dictionary learning is performed by algorithms such as online dictionary learning (ODL) [12], which is used in learning-based leaf image recognition frameworks [1]. Another algorithm for building a data dictionary is K-singular value decomposition (K-SVD) [13]. These algorithms work efficiently on images with small pixel dimensions but usually take a long time to compute. This paper concentrates on the ODL algorithm applied to images with large pixel dimensions. In this work we present a leaf recognition system that finds discriminant features appropriate for distinguishing between different leaf classes, based on the sparse representation of the leaf. We reach an accuracy of 96% in our experiments.
The rest of this paper proceeds as follows. The methodology, including the pre-processing operations, feature extraction, and building of the online data dictionary, is explained in Section 2, while the experimental results are presented and evaluated in Section 3. Section 4 concludes the paper with a summary of the proposed work.
2. METHODOLOGY
The proposed system is divided into two stages: a training stage and a testing stage. The training stage starts with pre-processing operations on the training image set. The sparse representation is then computed for the whole training image set and used to learn the data dictionary (DL) for each class in the dataset separately. In the testing stage, the sparse representation of each image in the testing set is computed and then compared with the data dictionaries using the correlation method; the best match indicates the class to which the testing image belongs.
The two stages of our proposed algorithm are described as follows:
2.1 Training stage:
Step 1: pre-processing operations (extract the leaf from the training image using the contour).
Step 2: patching the training leaf image set for all leaves in the different classes separately (feature extraction).
Step 3: compute the sparse code of the patches (sparse representation).
Step 4: build the ODL for the different leaf classes.
2.2 Testing stage:
Step 1: compute the sparse code of the testing image set.
Step 2: compute the correlation of the testing image with all ODLs.
Step 3: compare the correlation results. The greatest correlation indicates the class to which the test image belongs (classification process).
We describe the proposed method in detail in the following paragraphs:
2.1.1 Pre-processing:
In all leaf datasets, the leaves were scanned on a white sheet (background). We consider here a set of training images numbered from 1 to n. To classify a leaf, we crop it from the background (discarding the w blocks) based on its contour edges. The color image is converted to a grayscale image by applying rgb2gray and then converted to a binary image using the im2bw function. The leaf contour is then extracted.
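The pre-processing chain above (color to grayscale, grayscale to binary, crop to the leaf) can be sketched in a few lines. This is only an illustrative stand-in written in plain Python, not the MATLAB code used in the experiments: the fixed threshold of 128, the function names, and the use of a bounding-box crop in place of a true contour-based crop are all assumptions.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) pixels to grayscale with standard luminance weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def to_binary(gray_image, threshold=128):
    """Mark dark (leaf) pixels as 1 and bright (white background) pixels as 0.

    The threshold value is an assumption made for this sketch.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray_image]


def crop_to_leaf(gray_image, mask):
    """Crop the grayscale image to the bounding box of the leaf mask.

    A bounding box is a simplification of the contour-based crop in the text.
    """
    rows = [i for i, row in enumerate(mask) if any(row)]
    cols = [j for j in range(len(mask[0])) if any(row[j] for row in mask)]
    if not rows or not cols:
        return []  # no leaf pixels found
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    return [row[left:right + 1] for row in gray_image[top:bottom + 1]]


# Tiny 4x4 example: a dark 2x2 "leaf" on a white background.
WHITE, GREEN = (255, 255, 255), (30, 120, 40)
image = [
    [WHITE, WHITE, WHITE, WHITE],
    [WHITE, GREEN, GREEN, WHITE],
    [WHITE, GREEN, GREEN, WHITE],
    [WHITE, WHITE, WHITE, WHITE],
]
gray = to_gray(image)
mask = to_binary(gray)
cropped = crop_to_leaf(gray, mask)
```

Cropping before patch extraction keeps background-only patches out of the training set, which is the stated purpose of the cropping step.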
2.1.2 Feature Extraction:
In this step, each image resulting from the previous pre-processing method is submitted to the extraction process in order to obtain different descriptors from each image. The descriptors extracted here are the image patches.
In the proposed system there are many ways to patch the leaf image. If we want a certain number of patches, we can add boundary columns to fit the required number of patches, or delete or ignore columns to fit it. Otherwise, if we do not fix the number of patches, as is the case in this paper, we determine the coordinates of one patch and take as many patches from the input image as are available.
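As a rough illustration of the patching step, the sketch below slides a square window over a 2-D grayscale image and flattens each complete patch into a vector. The function name, the `stride` parameter, and the choice to ignore leftover boundary columns and rows (one of the options mentioned above) are assumptions for this example; the paper does not state the stride it uses.

```python
def extract_patches(image, patch_size, stride=None, max_patches=None):
    """Return flattened patch_size x patch_size patches from a 2-D image.

    stride defaults to patch_size (non-overlapping patches); boundary rows and
    columns that do not fill a whole patch are ignored. max_patches optionally
    caps the number of patches returned.
    """
    if stride is None:
        stride = patch_size
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            if max_patches is not None and len(patches) >= max_patches:
                return patches
            patches.append([image[top + i][left + j]
                            for i in range(patch_size)
                            for j in range(patch_size)])
    return patches


# 4x5 toy image whose pixel value encodes its position (row * 5 + column).
image = [[r * 5 + c for c in range(5)] for r in range(4)]
patches = extract_patches(image, 2)                  # last column is ignored
capped = extract_patches(image, 2, max_patches=3)    # fixed number of patches
dense = extract_patches(image, 2, stride=1)          # overlapping patches
```

Each returned vector has length patch_size squared, which becomes the atom length N of the dictionary.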
2.1.3 Sparse representation:
In this step, the sparse representation is computed for the training image set. Sparse representation is a representation method that aims to find a sparse code for the input data in the form of a linear combination of basic elements. These elements are called atoms.
Sparse representation of an image over an over-complete dictionary is achieved by optimizing a sparse dictionary, which is a matrix holding the sparse coding of K patches, or feature vectors, of length N extracted from the training set images C. Sparse coding models the data as sparse linear combinations of basic dictionary elements (atoms) weighted by sparse coefficients, as shown in Figure 1.
A sparse dictionary [15] has the sparse structure D = B*A, where B is a fixed base dictionary and A is a sparse matrix. The sparse dictionary D is a matrix of size N × K, and A contains the sparse representations of the dictionary atoms over B. The base dictionary B typically has a fast algorithmic implementation (specifically, faster than explicit matrix multiplication), which makes the sparse dictionary very efficient to apply.
Figure 1: An example illustrating the sparse representation of patches extracted from the training image set
2.1.4 Online Dictionary Learning based on Sparse Representation
A learned data dictionary is a dictionary of atoms that yield good sparse coefficients. It should be small while still representing the data well. The dictionary D learned for sparse representation here is a collection of atoms (training patches) of size N×K, where N is the atom vector length and K is the number of atoms. The dictionary learning problem is to find a dictionary such that the approximation of the training set is as good as possible, given a sparse coefficient vector w, where x is a set of extracted patches. This problem can be written as
\[ x = Dw \tag{1} \]
There are infinitely many possible solutions of the system $x = Dw$. Among this infinitely large set of solutions, the sparse representation with the smallest $\ell_0$ norm $\|w\|_0$ is preferred. Thus, the task of computing the sparse representation of a signal over D can be formally written as
\[ \min_{w} \|w\|_0 \quad \text{subject to} \quad x = Dw \tag{2} \]
The exact equality in the constraint above is replaced by the alternative requirement that the residual $\|x - Dw\|_2$ be small, to allow for additive noise and model deviations.
After collecting a set of vectors $y_j$, $j = 1, 2, \ldots, P$, the training patch exemplars, a dictionary D that sparsifies the $y_j$ is learned by solving the following optimization problem [8], where $w_j$ denotes the sparse coefficient vector of $y_j$ with respect to D and $\lambda$ is a regularization parameter:
\[ \min_{D,\, w_j} \; \frac{1}{P} \sum_{j=1}^{P} \left( \frac{1}{2} \|y_j - D w_j\|_2^2 + \lambda \|w_j\|_1 \right) \tag{3} \]
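To make the inner term of Eq. (3) concrete, the following sketch minimizes (1/2)‖y − Dw‖² + λ‖w‖₁ over w for a single patch y with the dictionary D held fixed. It uses the iterative soft-thresholding algorithm (ISTA), which is a swapped-in technique chosen here for brevity; it is not the solver used in the experiments, and the function names, step size, and toy dictionary are assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of the l1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def objective(D, y, w, lam):
    """Evaluate 0.5 * ||y - D w||_2^2 + lam * ||w||_1 for one signal."""
    residual = [y[i] - sum(D[i][k] * w[k] for k in range(len(w)))
                for i in range(len(y))]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in w)


def sparse_code_ista(D, y, lam, step, iterations=200):
    """Approximate the sparse code of y over a fixed dictionary D with ISTA.

    D is a list of N rows with K entries each (atoms are columns). step should
    be at most 1 / (largest eigenvalue of D^T D) for the iteration to converge.
    """
    n, k = len(D), len(D[0])
    w = [0.0] * k
    for _ in range(iterations):
        residual = [sum(D[i][a] * w[a] for a in range(k)) - y[i] for i in range(n)]
        gradient = [sum(D[i][a] * residual[i] for i in range(n)) for a in range(k)]
        w = [soft_threshold(w[a] - step * gradient[a], step * lam) for a in range(k)]
    return w


# Toy over-complete dictionary: 3 unit-norm atoms in 2 dimensions.
D = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
y = [0.6, 0.8]          # identical to the third atom
w = sparse_code_ista(D, y, lam=0.1, step=0.5)
```

In this toy case the signal coincides with one atom, so the code concentrates on that single coefficient, slightly shrunk by the l1 penalty. A full dictionary learning loop would alternate a coding step of this kind with an update of D.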
2.2.1 Sparse representation of the testing image set:
In this step, the sparse representation is computed separately for every image in the testing set.
2.2.2 Correlation:
Correlation is a statistical technique used to measure and describe the strength and direction of the relationship between two variables. It is obtained by dividing the covariance of the two variables by the product of their standard deviations. Karl Pearson developed the coefficient from a similar but slightly different idea by Francis Galton [16].
After we extract the sparse representation w of a test image, we compute the correlation between w and the data dictionary D. With expected values $\mu_w$ and $\mu_D$ and standard deviations $\sigma_w$ and $\sigma_D$, it is defined as:
\[ \mathrm{corr}(w, D) = \frac{E\left[(w - \mu_w)(D - \mu_D)\right]}{\sigma_w \sigma_D} \tag{4} \]
Here corr is a widely used alternative notation for the correlation coefficient.
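The coefficient in Eq. (4) can be computed directly from its definition. The sketch below assumes the two inputs have already been reduced to equal-length vectors; how the sparse code w and the dictionary D are aligned for this comparison is not specified in the text, so the function name and its inputs are illustrative only.

```python
import math


def pearson_correlation(a, b):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a)
    spread_b = sum((y - mean_b) ** 2 for y in b)
    if spread_a == 0 or spread_b == 0:
        return 0.0  # correlation is undefined for a constant vector; report 0
    # The 1/n factors of covariance and variances cancel in the ratio.
    return covariance / math.sqrt(spread_a * spread_b)


same_direction = pearson_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
opposite_direction = pearson_correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Values close to 1 indicate a strong positive linear relationship, values close to -1 a strong negative one, and values near 0 little linear relationship.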
2.2.3 Classification process:
In this step, we classify the testing image set using the correlation results obtained between the sparse coefficients of each test image and each class data dictionary. The class with the maximum correlation with the input image is chosen as the class to which the input image belongs.
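The decision rule described above is an arg-max over the per-class correlation values. A minimal sketch follows, assuming that one correlation score per class has already been computed; the class labels and score values are hypothetical.

```python
def classify_by_correlation(class_correlations):
    """Return (label, score) of the class whose correlation score is highest."""
    if not class_correlations:
        raise ValueError("at least one class correlation is required")
    best_label = max(class_correlations, key=class_correlations.get)
    return best_label, class_correlations[best_label]


# Hypothetical correlations between one test image and three class dictionaries.
scores = {"class_01": 0.41, "class_02": 0.87, "class_03": 0.63}
label, score = classify_by_correlation(scores)
```

Because each class has its own dictionary, adding a new class only adds one more entry to the score table; the existing entries are unaffected.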
3. EXPERIMENT AND RESULT ANALYSIS
3.1 Experimental environment:
The recognition system was evaluated on leaf images from the “Flavia leaf image dataset” [16], a popular leaf image dataset. The dataset consists of 32 classes of leaf images (c=32), where each class contains 40-60 images. Table 1 shows some of the dataset leaves.
Our experiments were implemented in MATLAB R2014a (64-bit version) on a personal computer equipped with an Intel® Core™ i5-2410M processor. In addition to the functions and tools within MATLAB, all the experiments use ompbox1, ompbox10 multi-threaded C-coded loops and mtimesx_20110223 at the sparse coding level. MATLAB does not always implement the most efficient algorithms for memory access, and MATLAB does not always take full advantage of symmetric and conjugate cases. MTIMESX3 attempts to do both of these to the fullest extent possible, and in some cases can outperform MATLAB by 3x-4x in speed.
Table 1: Examples of leaf dataset images
3.2 Experimental method
The experiments formalize the sparse representation and data dictionary learning for all images in each class i = 1, 2, ..., c, using extracted patches Pi of different sizes (16×16, 32×32, 60×60). In these experiments we study the relation between patch size and recognition rate, and we study the execution time. Testing was done on 32 different leaf types, each consisting of about 60 leaf images. In particular, we noted 100% accuracy for different leaf types using a small data dictionary size. A few leaf images were removed from the experiments because the extracted features were not sufficient. The results of all experiments are discussed next.
Figure 2: The proposed leaf image recognition steps
3.3 Experimental Results
3.3.1 Recognition rate:
The first set of experiments measures the recognition rate achieved by our proposed image recognition system. The following paragraphs explain these experiments.
To study whether increasing the number of images used in learning is useful, we performed the first experiment using a few images (10 images of size 1200×1600) to build the data dictionary and then doubled the number of images. In this experiment we extracted the leaf boundary and used different patch sizes (16×16, 32×32) to extract the patches. The goal is to obtain the full data dictionary of all available images with large image dimensions, using appropriate patches, while maintaining the number of images required for learning without increasing the computational load. Table 2 presents examples of boundary leaf images.
Table 2: Examples of leaf images in the cropped stage
The patch function runs in one of two ways. It either extracts all available patches in the cropped leaf image or extracts a fixed number of patches. The latter is needed when the image size is large and the patch size is small, for example 16 × 16: the extracted patches may then exceed one million samples, which overloads the computation of the sparse code, so a certain number of patches to extract must be set. In the experiments with patch size 32 × 32, the number of patches ranges from 10000 up to 1 million, depending on the number of training images.
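A quick count shows why small patches on large images can exceed one million samples. The sketch below counts the complete patches that fit in an uncropped 1200×1600 image; the stride values are assumptions, since the stride used in the experiments is not stated, and cropped leaf images would yield fewer patches.

```python
def count_patches(height, width, patch_size, stride):
    """Number of complete patch_size x patch_size patches at the given stride."""
    if patch_size > height or patch_size > width:
        return 0
    rows = (height - patch_size) // stride + 1
    cols = (width - patch_size) // stride + 1
    return rows * cols


# Uncropped 1200x1600 image with 16x16 patches.
dense = count_patches(1200, 1600, 16, stride=1)    # every pixel offset: 1185 * 1585
tiled = count_patches(1200, 1600, 16, stride=16)   # non-overlapping:   75 * 100
```

Under the stride-1 assumption a single image already produces about 1.9 million patches, whereas non-overlapping tiling gives 7500 per image, which motivates capping the number of extracted patches.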
The tests were performed at three levels. In the first level, the testing set contains leaf images used in the learning stage. In the second level, the testing set contains leaf images of the same class that were not used in the learning stage. In the third level, the testing set contains similar leaf images from different classes.
The results of the previous experiment were obtained by applying the proposed algorithm to two data dictionaries: dictionary 1, which is built from a number of training images, and dictionary 2, which is built from double the number of images in dictionary 1.
The results showed that the number of training images affects the recognition rate. As the number of training images increases, the recognition rate increases until the data dictionary is complete with the features of the leaf; adding further training images then overloads the computation and the comparison time, and the recognition rate decreases.
Our proposed algorithm also works on the same dataset used in [1,3,4,5,8]. Table 3 lists the recognition rates obtained by bag-of-words (BoW) presented in [1], sparse coding (SC) presented in [1], the probabilistic neural network (PNN) based approach presented in [3], the move median centers (MMC) method presented in [4], and the proposed method. In [1], 30 images are randomly selected in the learning stage for building the OSDL, using the sparse representation of features extracted with SIFT, with length n=128 for each descriptor, which is a small dimension yet takes a long time to compute, while the PNN approach proposed in [3] used 1,800 training images for neural network training, which also takes a long time.
In Table 3, the recognition rates obtained by Wu et al. [3] and Du et al. [4] were reported in [3], while Jou-Ken [1] reported that BoW and SC each performed better than [3] and [4]. Compared with the BoW-based approach, the main advantage of the SC-based approach is that classifiers do not need to be re-trained when a new leaf image class is added, while in the BoW-based approach both the codebook and the SVM classifier must be re-trained.
It can be observed from Table 3 that the proposed method is comparable with the six existing approaches used for comparison (BoW and SC [1], PNN [3], MMC [4], HPNN [8], and CShape [5]).
Table 3: Recognition rates of the compared methods

Method             Rate (%)
BoW [1]            94.38
SC [1]             95.47
PNN [3]            90.31
MMC [4]            91
CShape [5]         94.62
HPNN [8]           93.75
Proposed method    96
3.3.2 Accelerating time:
The second set of experiments focuses on reducing the time needed to deal with large images and large patch sizes in image recognition. The following paragraphs explain these experiments.
In the previous experiment we dealt with large images and patch sizes of 16×16 and 32×32. Here we increase the patch size to 60×60 and repeat the building of the data dictionary and the correlation for the same training and testing images. Table 4 presents the results of this experiment compared with the previous experimental results.
Table 4: Comparison of the experimental results using various patch sizes

Patch size   OSDL size (40 training images)   Building time   File size (.mat)   Recognition rate (%)
16x16        256x100000                       18000 s         600 KB             96
32x32        1024x100000                      54000 s         2.22 MB            96.5
60x60        3600x100000                      208800 s        4.88 MB            94
With this large patch size, constructing the sparse dictionary is slow, and because the amount of training data is limited, the dictionary is weak; the data are perhaps insufficient to train a full dictionary.
We also include the result presented in [18], “Trainlets: Dictionary Learning in High Dimensions”. That paper proposed a modification of the wavelet transform by constructing two-dimensional separable cropped wavelets. Its data dictionary is learned using SGD ideas in the dictionary learning task, and it is used to perform an image restoration demonstration. The experiments were performed on patches of size 64×64 using ODL to build the data dictionary, with a runtime of 120 hours, to show the ODL scheme on higher-dimensional signals. In our experiment the runtime is 58 hours for the 60×60 patch size.
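The figures in Table 4 can be cross-checked with simple arithmetic: the number of rows of each dictionary equals the squared patch side, and the building times in seconds convert to the hours quoted in the text. The snippet below is only a consistency check on the reported numbers; the variable names are illustrative.

```python
SECONDS_PER_HOUR = 3600

# Patch side length -> reported dictionary building time in seconds (Table 4).
building_time_seconds = {16: 18000, 32: 54000, 60: 208800}

summary = []
for patch_side, seconds in building_time_seconds.items():
    atom_length = patch_side ** 2           # rows of the dictionary (N)
    hours = seconds / SECONDS_PER_HOUR      # building time in hours
    summary.append((patch_side, atom_length, hours))

for patch_side, atom_length, hours in summary:
    print(f"{patch_side}x{patch_side}: atom length {atom_length}, {hours:.0f} h")
# prints:
# 16x16: atom length 256, 5 h
# 32x32: atom length 1024, 15 h
# 60x60: atom length 3600, 58 h
```

The 208800 s entry corresponds to the 58 hours quoted for the 60×60 patch size.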
4. CONCLUSIONS AND FUTURE WORK
In this paper we proposed a new method for recognizing leaves using the OSDL method. We evaluated the performance of our proposed method according to the recognition rate and the execution time. The recognition rate achieved was 96%, and the time acceleration achieved was 62%. Several findings based on the experiments are:
1) Increasing the number of training images increases the recognition rate until the full data dictionary is reached. After that, adding training images overloads the testing stage.
2) When the patch size increases, the number of non-zero entries in the sparse representation increases, and so does the computational time, as shown in Figure 4.
3) Increasing the patch size yields a limited number of patches, which is insufficient to train a full dictionary and decreases the recognition rate, as shown in Figure 3.
Figure 3: Performance comparison between patch sizes in terms of recognition rate
Figure 4: Performance comparison between patch sizes in terms of execution time
This paper shows that dictionary learning can be up-scaled to tackle a new level of large signal dimensions. In future research we will attempt to achieve better results by improving:
1) The contour-cropped method, to extract all features related to the main object from all training images.
2) The computing method, for example by using parallel computing to accelerate the execution time, so that a full data dictionary can be built from more training images.
REFERENCES