A Friendly Introduction to RGP

Author: Flasch, Oliver
Year: 2013
Source: https://cos.bibl.th-koeln.de/files/32/Flas13b.pdf
CIplus publication series (Schriftenreihe CIplus), Volume 2/2013
Editors: T. Bartz-Beielstein, W. Konen, H. Stenzel, B. Naujoks
A Friendly Introduction to RGP
Oliver Flasch
Faculty of Computer and Engineering Sciences
Cologne University of Applied Sciences, 51643 Gummersbach, Germany
TR 2/2013. ISSN 2194-2870
Abstract. RGP is a genetic programming system based on, as well as
fully integrated into, the R environment. The system implements classical
tree-based genetic programming as well as other variants including,
for example, strongly typed genetic programming and Pareto genetic
programming. It strives for high modularity through a consistent
architecture that allows the customization and replacement of every algorithm
component, while maintaining accessibility for new users by adhering to
the "convention over configuration" principle. Performance-critical
sections have efficient implementations in C, making the system suitable for
real-world application. Typical GP applications are supported by
well-known R idioms. For example, symbolic regression via GP is supported
by the same "formula interface" as linear regression in R.
This text provides a friendly introduction to RGP, a flexible system for
genetic programming (GP) in the R environment for statistical computing. After
section 1 introduces GP in the abstract and section 2 sets the stage with
typical applications of GP in general and RGP in particular, section 3 outlines
the range and depth of RGP's features. RGP is a large package that can be
daunting to the first-time user. To help getting started, section 4 provides a
set of hands-on tutorials, beginning with simple tasks, including getting RGP up
and running in an existing R installation, up to topics like symbolic
regression. The outlook in section 5 gives hints on where to go from here,
including references to GP literature as well as RGP's comprehensive online
documentation and web resources.
1 Genetic Programming
GP is a collection of techniques from evolutionary computing (EC) for the
automatic generation of computer programs that perform a user-defined task
[Poli et al., 2008, Banzhaf et al., 1998]. Starting with a high-level problem
definition, GP creates a population of random programs that are progressively
refined through variation and selection until a satisfactory solution is found.
An important advantage of GP is that no prior knowledge concerning the
solution structure is needed. Another advantage is the representation of
solutions as terms of a formal language (symbolic expressions), i.e. in a form
accessible to human reasoning. The main drawback of GP is its high
computational cost, due to the potentially infinitely large search space of
symbolic expressions. On the other hand, the recent availability of fast
multi-core systems has enabled the practical application of GP in many
real-world application areas. This has led to the development of a variety of
software frameworks for GP, including DataModeler, Discipulus, ECJ, Eureqa,
and GPTIPS.
All of these systems are complex aggregates of algorithms for solving not only
GP-specific tasks, such as solution creation, variation, and evaluation, but
also more general EC tasks, like single- and multi-objective selection, and
even largely general tasks like the design of experiments, data pre-processing,
result analysis and visualization. Packages like Matlab, Mathematica, and R
[R Development Core Team, 2009] already provide solutions for the more general
tasks, greatly simplifying the development of GP systems based on these
environments and also lowering the barrier of entry for users who already know
the underlying package.
RGP¹ is based on the R environment for several reasons. Firstly, there seems
to be a beneficial trend towards employing statistical methods in the analysis
and design of evolutionary algorithms, including modern GP variants [Sun et al.,
2009, Bartz-Beielstein et al., 2010]. Secondly, R's open development model has
led to the free availability of R packages for most methods from statistics and
many methods from EC. Also, the free availability of R itself makes RGP
accessible to a wide audience. Thirdly, the R language supports "computing on
the language", which greatly simplifies symbolic computation inherent in most GP
operations. In addition, parallel execution of long-running GP runs is easily
supported by the R package.
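
To make "computing on the language" concrete, the following minimal sketch uses
only base R (it does not call RGP). It treats an expression as data, edits one
of its subtrees, and then turns the result into an ordinary function. The
variable names are illustrative assumptions.

```r
# An unevaluated R expression is an ordinary object that can be inspected.
expr <- quote(x * x + 1)
class(expr)        # "call"
expr[[1]]          # the function symbol `+` at the root of the expression

# Replace the third element (the constant 1) by another expression.
expr[[3]] <- quote(sin(x))
deparse(expr)      # "x * x + sin(x)"

# Evaluate the modified expression for a concrete value of x ...
eval(expr, list(x = 2))

# ... or install it as the body of a regular R function.
f <- function(x) NULL
body(f) <- expr
f(2)
```

GP variation operators need exactly this kind of subtree replacement on
expressions, which is why a language with first-class expressions is convenient
here.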
2 Application Areas
GP in general, and RGP as a modular GP system in particular, has a wide
array of possible application areas. Basically, GP is an evolutionary search
heuristic for arbitrary symbolic expressions, i.e. mathematical or logical
formulas. A non-exhaustive list of RGP applications includes:
– Symbolic Regression: Given a set of measurement data divided into dependent
  and independent variables, symbolic regression can discover the functional
  relationship between dependent and independent variables. This relationship
  is represented as a symbolic expression, which can be used to gain insight
  into the data-generating process or system (system identification), and as a
  model to predict the values of dependent variables for unseen values of
  independent variables (inter- and extrapolation). Figure 1 provides a simple
  example.
– Feature Selection: Not all independent variables must have an influence on
  the values of the dependent variables. In many practical applications, only
  a small subset of independent variables affects the dependent variables. The
  task then is to identify this subset, which can be done by GP in a very
  robust fashion.
– Automatic Programming: As computer programs are symbolic expressions,
  GP can be used for automatic programming, which explains the name of the
  method. This requires a set of program building blocks and a fitness function
  that assigns a numerical quality measure to each candidate program. For
  small programs describing core algorithm components, this approach already
  works in practice.
– General Expression Search: The applicability of GP even goes beyond
  automatic programming. The method can be used to discover all structures
  that are representable by symbolic expressions of moderate complexity.
  Examples include electrical circuits, antenna designs, processing networks in
  manufacturing and logistics, and many others.
¹ The RGP package and documentation are available at rsymbolic.org.
The RGP system is flexible enough to be applied in nearly all possible GP
application areas. It has already been successfully applied in such diverse
areas as support vector machine kernel generation for machine learning,
surrogate model ensemble generation for engineering optimization, and time
series prediction for water resource management applications.
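
The abstract notes that symbolic regression in RGP uses the same "formula
interface" as linear regression in R. As a reminder of what that interface
looks like, here is a small base R sketch with lm on synthetic data. The data,
seed, and coefficients are assumptions made for illustration, and no RGP
function is called.

```r
# Synthetic measurement data: y depends linearly on x, plus a little noise.
set.seed(1)
d <- data.frame(x = seq(0, 10, length.out = 50))
d$y <- 2 * d$x + 1 + rnorm(50, sd = 0.05)

# The formula y ~ x names the dependent variable on the left-hand side
# and the independent variable(s) on the right-hand side.
fit <- lm(y ~ x, data = d)
coef(fit)                                    # intercept and slope estimates

# The fitted model predicts the dependent variable for unseen inputs.
predict(fit, newdata = data.frame(x = 12))
```

In linear regression the structure of the model is fixed and only its
coefficients are estimated. Symbolic regression is given the same kind of
formula and data but searches for the structure of the expression as well.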
[Figure 1 plot: "Symbolic Regression of a Damped Oscillator"; budget = 5M
tournaments, function set = {+, −, *, /, sin, cos, tan}; axes x and y; true
law f(t) = x0 · exp(−δ · t) · sin(ω · t + phi0).]
Fig. 1: Symbolic regression of the governing law of a damped oscillator: RGP
enables symbolic regression via genetic programming. This example shows how RGP
is used to find the governing physical law of a damped oscillator. In contrast
to other regression methods, the solution is expressed as a mathematical formula
accessible to human interpretation and validation. In this figure, the true
oscillator law and behaviour are shown in dashed red, the solution found by RGP
is shown in solid black.

3 Features
To give an idea of the extent and limits of RGP's feature set, this section
provides a non-exhaustive overview of the system. Detailed documentation of
all functionality, including examples, can be found in the online help of the
package.
3.1 Solution Representation
RGP represents candidate solutions, i.e. GP individuals, as R expressions that
can be directly evaluated by the R interpreter. This allows the whole spectrum
of functions available in R to be used as building blocks for GP. Because R
expressions are internally represented as trees, RGP may be seen as a
tree-based GP system. However, the individual representation can be easily
replaced together with the associated variation and evaluation operators, if an
alternative representation is found to be more effective for a given
application [?].
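
The tree structure of R expressions can be explored with base R alone. The
sketch below walks an expression recursively to compute its node count and
depth. The helper names exprSize and exprDepth are hypothetical and are not
part of RGP's API.

```r
# Number of nodes in an expression tree: every call counts as one inner node,
# every symbol or constant counts as one leaf.
exprSize <- function(e) {
  if (!is.call(e)) return(1)
  1 + sum(vapply(as.list(e)[-1], exprSize, numeric(1)))
}

# Depth of an expression tree: a leaf has depth 1.
exprDepth <- function(e) {
  if (!is.call(e)) return(1)
  1 + max(0, vapply(as.list(e)[-1], exprDepth, numeric(1)))
}

e <- quote(sin(2 * x) + x * x)
as.list(e)                 # root function `+` followed by its two subtrees
exprSize(e)                # 8
exprDepth(e)               # 4
eval(e, list(x = 0))       # the same object is directly evaluable: 0
```

Note that parentheses are themselves calls in R, so an expression written with
explicit parentheses contains additional `(` nodes.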
Besides classical (untyped) GP, strongly typed GP is supported by a type
system based on simply typed lambda calculus [Barendregt et al., 1992]. A
distinctive feature of RGP's typed tree representation is the support for
function-defining subtrees, i.e. anonymous functions or lambda abstractions. In
combination with a type system supporting function types, this allows the
integration of common higher-order functions like folds, mappings, and
convolutions into the set of GP building blocks, greatly increasing RGP's
applicability in many "non-classical" GP application areas.
RGP also includes Rrules, a rule-based translator for transforming R
expressions. This mechanism can be used to simplify GP individuals as part of
the evolution process as a means to reduce bloat, or just to simplify solution
expressions for presentation and later use. The default rule base implements
simplification of arithmetic expressions. Rrules can be easily extended to
simplify expressions containing user-defined operators and functions.
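
The idea of rule-based simplification can be illustrated with a few hard-coded
arithmetic rewrite rules in base R. This is only a sketch of the general
technique under that assumption. It is not the Rrules implementation, and
simplifyArith is a hypothetical name.

```r
# Simplify an arithmetic expression bottom-up with a handful of rewrite rules.
simplifyArith <- function(e) {
  if (!is.call(e)) return(e)
  # Simplify all argument subtrees first, then rebuild the call.
  e <- as.call(c(list(e[[1]]), lapply(as.list(e)[-1], simplifyArith)))
  if (length(e) != 3 || !is.name(e[[1]])) return(e)  # binary operators only
  op <- as.character(e[[1]])
  a <- e[[2]]
  b <- e[[3]]
  isNum <- function(v, k) is.numeric(v) && length(v) == 1 && v == k
  if (op == "+" && isNum(a, 0)) return(b)            # 0 + b -> b
  if (op == "+" && isNum(b, 0)) return(a)            # a + 0 -> a
  if (op == "-" && isNum(b, 0)) return(a)            # a - 0 -> a
  if (op == "*" && isNum(a, 1)) return(b)            # 1 * b -> b
  if (op == "*" && isNum(b, 1)) return(a)            # a * 1 -> a
  # a * 0 -> 0 (assumes finite operands; it is not valid for Inf or NaN)
  if (op == "*" && (isNum(a, 0) || isNum(b, 0))) return(0)
  # Constant folding: both operands are numbers, so evaluate the call now.
  if (op %in% c("+", "-", "*", "/") && is.numeric(a) && is.numeric(b)) {
    return(eval(e))
  }
  e
}

simplified <- simplifyArith(quote(x * 1 + 0 * y + 2 * 3))
deparse(simplified)    # "x + 6"
```

Applying such rules during evolution shrinks individuals without changing the
function they compute, which is how simplification counteracts bloat.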
3.2 GP Operators
RGP provides default implementations for several initialization, variation,
and selection operators. The system offers clear interfaces for user-defined
operators, as well as the possibility to replace the evolutionary algorithm
used for GP search with user-defined variants, without the need to rewrite
other functionality.
Initialization. Individual initialization can be performed by the conventional
grow and full strategies of tree building. When using strongly typed GP, the
provided individual initialization strategies respect type constraints and will
create only well-typed expressions. Initialization strategies may be freely
combined, e.g. to implement the well-known ramped-half-and-half strategy.
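
The grow and full strategies, and their combination into ramped half-and-half,
can be sketched for untyped expressions in base R. The function names, the
early-stopping probability of 0.3, and the constant distribution are
assumptions for illustration. RGP's own initialization operators are not
called here.

```r
# Build a random expression tree of at most maxDepth levels.
# "full" expands every branch down to maxDepth; "grow" may stop early.
randomExpr <- function(maxDepth, method = c("grow", "full"),
                       funcs = c("+", "-", "*"), vars = "x") {
  method <- match.arg(method)
  terminal <- function() {
    if (runif(1) < 0.5) as.name(sample(vars, 1)) else round(rnorm(1), 2)
  }
  if (maxDepth <= 1 || (method == "grow" && runif(1) < 0.3)) {
    return(terminal())
  }
  call(sample(funcs, 1),
       randomExpr(maxDepth - 1, method, funcs, vars),
       randomExpr(maxDepth - 1, method, funcs, vars))
}

# Depth of an expression tree (a leaf has depth 1).
depthOf <- function(e) {
  if (!is.call(e)) return(1)
  1 + max(vapply(as.list(e)[-1], depthOf, numeric(1)))
}

# Ramped half-and-half: cycle through depth limits 2..maxDepth (maxDepth >= 2)
# and alternate between the full and grow strategies at each depth limit.
rampedHalfAndHalf <- function(n, maxDepth) {
  lapply(seq_len(n), function(i) {
    depth <- 2 + ((i %/% 2) %% (maxDepth - 1))
    randomExpr(depth, if (i %% 2 == 0) "grow" else "full")
  })
}

set.seed(123)
population <- rampedHalfAndHalf(8, maxDepth = 4)
vapply(population, depthOf, numeric(1))
```

Mixing both strategies over a range of depth limits yields an initial
population with varied tree shapes and sizes.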
Variation. RGP includes classical and type-safe subtree crossover operators.
Also, several classical and type-safe mutation operators are provided. The
variation pipeline can be freely configured by combining several mutation and
recombination operators to be applied in parallel or consecutively, with freely
configurable probabilities.
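
Classical (untyped) subtree crossover and mutation can be sketched on plain R
expressions as follows. All function names and the default probabilities are
hypothetical choices for this illustration. The sketch does not use RGP's
operators and ignores type constraints.

```r
# Enumerate all nodes of an expression as index paths from the root.
nodePaths <- function(e, path = integer(0)) {
  paths <- list(path)
  if (is.call(e)) {
    for (i in seq_along(e)[-1]) {
      paths <- c(paths, nodePaths(e[[i]], c(path, i)))
    }
  }
  paths
}

# Read the subtree found at a path.
getSubtree <- function(e, path) {
  for (i in path) e <- e[[i]]
  e
}

# Return a copy of e in which the subtree at path is replaced by value.
setSubtree <- function(e, path, value) {
  if (length(path) == 0) return(value)
  e[[path[1]]] <- setSubtree(e[[path[1]]], path[-1], value)
  e
}

# Subtree crossover: a random subtree of a is replaced by a random subtree of b.
crossover <- function(a, b) {
  pa <- sample(nodePaths(a), 1)[[1]]
  pb <- sample(nodePaths(b), 1)[[1]]
  setSubtree(a, pa, getSubtree(b, pb))
}

# Subtree mutation: a random subtree of a is replaced by a given new subtree.
mutate <- function(a, newSubtree) {
  setSubtree(a, sample(nodePaths(a), 1)[[1]], newSubtree)
}

# A small variation pipeline applying both operators consecutively,
# each with its own probability.
vary <- function(a, b, pCross = 0.9, pMut = 0.1) {
  child <- if (runif(1) < pCross) crossover(a, b) else a
  if (runif(1) < pMut) child <- mutate(child, quote(x))
  child
}

set.seed(1)
child <- vary(quote(x * x + 1), quote(sin(x) - 2))
child
```

Because R objects are copied on modification, the parents remain unchanged and
can take part in further variation steps.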
Selection. The system provides several single- and multi-objective selection
operators. Other selection strategies can be easily added by the user.
Multi-objective selection is supported via the EMOA package.² The
multi-objective search strategy optimizes solution quality while, at the same
time, controlling solution complexity and population diversity. For this
purpose, RGP implements multiple complexity measures for GP individuals.
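
The core of multi-objective selection is Pareto dominance: one individual
dominates another if it is no worse in every objective and strictly better in
at least one. The following base R sketch finds the non-dominated individuals
for two minimized objectives. The function name and the example values are
made up for illustration, and neither EMOA nor RGP is used.

```r
# Return the row indices of all non-dominated individuals.
# objectives: numeric matrix, one row per individual, all columns minimized.
nonDominated <- function(objectives) {
  n <- nrow(objectives)
  dominated <- logical(n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i != j &&
          all(objectives[j, ] <= objectives[i, ]) &&
          any(objectives[j, ] < objectives[i, ])) {
        dominated[i] <- TRUE
        break
      }
    }
  }
  which(!dominated)
}

# Five hypothetical individuals with a fitting error and an expression size.
objs <- cbind(error = c(0.10, 0.20, 0.05, 0.20, 0.30),
              size  = c(9, 5, 15, 7, 3))
nonDominated(objs)   # 1 2 3 5 (individual 4 is dominated by individual 2)
```

Keeping the whole non-dominated set, rather than only the most accurate
individual, is what lets the search trade accuracy against complexity.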
Fig. 2: RGP's graphical user interface for symbolic regression: Although RGP is
basically a command-line driven system, much like the underlying R environment,
graphical user interfaces are provided where they ease interaction and
exploration. The graphical user interface for symbolic regression allows the
direct manipulation of the most important GP parameters.
² The EMOA Evolutionary Multiobjective Optimization Algorithm toolbox for R is
available at http://git.datensplitter.net/cgit/emoa.
3.3 Analysis and Visualization
The RGP system provides tools for the analysis and visualization of GP
individuals and populations. GP individuals, i.e. symbolic expressions, can be
visualized as trees (in multiple levels of detail), as formulas in mathematical
notation, as points in a Pareto plot, or as plots of their input/output
behaviour. GP populations can be visualized as forests of schematic trees, as
Pareto plots, or as variable presence charts.
As RGP is based on R, a vast array of statistical tools for analyzing GP
individuals, GP populations and GP system performance is readily available.
For example, integration with the SPOT package for sequential parameter
optimization allows the automatic tuning of critical GP algorithm parameters.
The RGP online documentation provides examples for typical applications of each
visualization and analysis technique.
Although RGP is basically a command-line driven system, like the underlying
R package, graphical user interfaces are provided where they ease interaction
and exploration. The graphical user interface for symbolic regression (see
figure 2) allows direct manipulation of the most important GP parameters.
4 Tutorials
To help getting started with RGP, this section provides a set of hands-on
tutorials, beginning with simple tasks, including getting RGP up and running
in an existing R installation, up to advanced topics like strongly typed
genetic programming. All tutorials are meant to be followed stepwise in a
running R session.
4.1 Installation
RGP is available as an R package on the Comprehensive R Archive Network
(CRAN), making installation extremely simple. To install RGP and all its
dependencies, issue the following command in a running R session:
> install.packages("rgp")
A prompt asking you to select a CRAN mirror will appear if it is the first time
an R package is installed in your R installation. Just select a mirror location
near you. The installation of RGP may take some time, as dependencies are
downloaded and compilation steps are performed.
4.2 Getting Started
This tutorial provides an interactive walkthrough of solving a simple symbolic
modelling problem with GP. Only basic low-level RGP functionality is used;
high-level convenience functions are intentionally avoided to make each step in
the modelling process clear and explicit.
In this first example, we configure RGP to create polynomial approximations
of the sine function. To make RGP's functionality available in a running R
session, the package has to be loaded via the library command:
> library("rgp")
Defining the GP Search Space. In RGP, candidate solutions are represented as
regular R functions. The bodies of these functions are built from a set of
input variables, a set of constants, and a set of function symbols. The members
of these sets are often referred to as GP building blocks. In other words,
these three sets define the symbolic expression search space.
As our example task is the approximation of the sine function with
polynomials, we create a function symbol set containing only addition,
multiplication, and subtraction.
> functionSet1 <- functionSet("+", "*", "-")
We then create a set of input variables containing just the symbol x. Thereby
we restrict the search space to univariate functions, i.e. functions of one
variable:
> inputVariableSet1 <- inputVariableSet("x")
Finally, we create a set of constants. Constants are not created directly, but
via constant factory functions. Each time a constant has to be created during
GP search, RGP calls a constant factory function. Here we use a single constant
factory that returns constants from a normal distribution:
> constantFactorySet1 <- constantFactorySet(function() rnorm(1))
Defining the Fitness Function. The fitness function, or objective function,
associates a numerical fitness value with a candidate solution. RGP relies on
the fitness function to direct its evolutionary search. The fitness function
defines the problem to be solved by GP. As already mentioned, in this example,
we will use RGP to find functions approximating the sine function in the
interval interval1 = [−π, π]. We sample this interval in steps of size 0.1:
> interval1 <- seq(from = -pi, to = pi, by = 0.1)
> fitnessFunction1 <- function(f)
+   rmse(f(interval1), sin(interval1))
By default, RGP minimizes fitness values, so lower values should be associated
with better solutions. Here, we use the root mean squared error (RMSE) of a
given sine approximation against the true sine function as a fitness function.³
³ The problem defined here is a typical symbolic regression problem. RGP also
features a simple interface for symbolic regression, which is introduced in the
next tutorial on symbolic regression.
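
To get a feeling for the fitness values defined above without running a GP
search, the fitness function can be applied to hand-written candidates. The
sketch below works in plain R. It defines a local rmse as a stand-in so that it
runs without the rgp package, and the candidate functions (truncated Taylor
polynomials of the sine) are assumptions chosen for illustration.

```r
# Local stand-in for an RMSE helper, so this sketch runs in plain R.
rmse <- function(a, b) sqrt(mean((a - b)^2))

interval1 <- seq(from = -pi, to = pi, by = 0.1)
fitnessFunction1 <- function(f) rmse(f(interval1), sin(interval1))

# Hand-written candidates using only the building blocks +, * and -,
# plus numeric constants.
linear  <- function(x) x
taylor3 <- function(x) x - (1 / 6) * x * x * x
taylor5 <- function(x) x - (1 / 6) * x * x * x + (1 / 120) * x * x * x * x * x

fitnessFunction1(sin)       # 0: the target itself has perfect fitness
fitnessFunction1(linear)
fitnessFunction1(taylor3)
fitnessFunction1(taylor5)   # the smallest of the three polynomial candidates
```

Each additional Taylor term lowers the error on this interval, so a GP run that
minimizes this fitness is rewarded for discovering higher-order polynomial
terms.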
Bibliography
Wolfgang Banzhaf, Frank D. Francone, Robert E. Keller, and Peter Nordin.
  Genetic programming: an introduction: on the automatic evolution of computer
  programs and its applications. Morgan Kaufmann Publishers Inc., San
  Francisco, CA, USA, 1998. ISBN 1-55860-510-X.
Henk Barendregt, S. Abramsky, D. M. Gabbay, T. S. E. Maibaum, and H. P.
  Barendregt. Lambda calculi with types. In Handbook of Logic in Computer
  Science, pages 117–309. Oxford University Press, 1992.
Thomas Bartz-Beielstein, Marco Chiarandini, Luis Paquete, and Mike Preuss,
  editors. Experimental Methods for the Analysis of Optimization Algorithms.
  Springer, Berlin, Heidelberg, New York, 2010.
Riccardo Poli, William B. Langdon, and Nicholas Freitag McPhee. A field
  guide to genetic programming. Published via http://lulu.com and freely
  available at http://www.gp-field-guide.org.uk, 2008. URL
  http://www.gp-field-guide.org.uk. (With contributions by J. R. Koza).
R Development Core Team. R: A Language and Environment for Statistical
  Computing. R Foundation for Statistical Computing, Vienna, Austria, 2009.
  URL http://www.R-project.org. ISBN 3-900051-07-0.
Yi Sun, Daan Wierstra, Tom Schaul, and Juergen Schmidhuber. Efficient natural
  evolution strategies. In GECCO '09: Proceedings of the 11th Annual conference
  on Genetic and evolutionary computation, pages 539–546, New York, NY,
  USA, 2009. ACM. ISBN 978-1-60558-325-9.

Contact / Imprint
These publications appear as part of the "CIplus" publication series. All
publications in this series can be accessed at
www.ciplus-research.de
or at
http://opus.bsz-bw.de/fhk/index.php?la=de
Cologne, January 2012
Editors
Prof. Dr. Thomas Bartz-Beielstein,
Prof. Dr. Wolfgang Konen,
Prof. Dr. Horst Stenzel,
Dr. Boris Naujoks
Institute of Computer Science,
Faculty of Computer Science and Engineering Science,
Cologne University of Applied Sciences,
Steinmüllerallee 1,
51643 Gummersbach
url: www.ciplus-research.de
Managing editor and contact / Contact editor's office
Prof. Dr. Thomas Bartz-Beielstein,
Institute of Computer Science,
Faculty of Computer Science and Engineering Science,
Cologne University of Applied Sciences,
Steinmüllerallee 1, 51643 Gummersbach
phone: +49 2261 8196 6391
url: http://www.gm.fh-koeln.de/~bartz/
eMail: thomas.bartz-beielstein@fh-koeln.de
ISSN (online) 2194-2870