
Building Ensembles of Surrogate Models by Optimal Convex Combination

Authors: Friese, Martina; Bartz-Beielstein, Thomas; Emmerich, Michael
Year: 2016
Source: https://cos.bibl.th-koeln.de/files/348/Frie16aCos.pdf
CIplus, Volume 4/2016
BUILDING ENSEMBLES OF SURROGATE
MODELS BY OPTIMAL CONVEX COMBINATION
Martina Friese and Thomas Bartz-Beielstein
SPOTSeven Lab, TH Köln
Steinmüllerallee 1, 51643 Gummersbach, Germany
{martina.friese|thomas.bartz-beielstein}@th-koeln.de
Michael Emmerich
LIACS, Leiden University
Niels Bohrweg 1, 2333CA Leiden, The Netherlands
[email protected]
Abstract: When using machine learning techniques to learn a function
approximation from given data, it is often difficult to select the right
modeling technique. In many real-world settings no preliminary knowledge
about the objective function is available. Then it might be beneficial if
the algorithm could learn all models by itself and select the model that
suits the problem best. This approach is known as automated model
selection. In this work we propose a generalization of this approach. It
combines the predictions of several models into one more accurate ensemble
surrogate model. This approach is studied in a fundamental way, by first
evaluating minimalistic ensembles of only two surrogate models in detail
and then proceeding to ensembles with three and more surrogate models. The
results show to what extent combinations of models can perform better than
single surrogate models and provide insights into the scalability and
robustness of the approach. The study focuses on multi-modal function
topologies, which are important in surrogate-assisted global optimization.
Keywords: Function Approximation, Surrogate Models, Model Selection,
Ensemble Methods, Global Optimization
1. Introduction
Surrogate models are mathematical functions that, based on a sample of
known function values, approximate the behavior of the original function
while being cheaper to evaluate. In the field of optimization on expensive
objective functions it is state of the art to use surrogate models to get
an idea of the objective function landscape with fewer evaluations of the
objective function. Expert systems like SPOT [1] come with a large variety
of models to choose from when initiating an optimization process. The
choice of the right model determines the quality of the optimization
process.
Often expert knowledge is needed to decide which model to select for a
given problem. If there is no preliminary knowledge about the objective
function it might be beneficial if the algorithm could learn all by itself
which model suits the problem best. This can be done by evaluating
different models on test data a priori and using a statistical model
selection approach to select the most promising model.
Some observations suggest that there might also be a benefit in linearly
combining predictors from several models into a more accurate predictor.
Fig. 1 shows such a case. Predictions with two different (Kriging) models
are shown together with results obtained by a convex combination of the
predictors of these models. Different errors seem to be compensated by the
combined model's predictions.
Such occurrences indicate that a predictor based on a single modeling
approach is not always the best choice. On the other hand, complicated
expressions based on multiple predictors might not be a good choice
either, due to overfitting and lack of transparency. Using convex
combinations of predictors from available models seems to be a 'smart'
compromise. Given surrogate models $\hat{y}_i: \mathbb{R}^d \to \mathbb{R}$,
$i = 1, \dots, s$, by a convex combination of models we understand a model
given by $\sum_{i=1}^{s} \alpha_i \hat{y}_i$ with $\sum_{i=1}^{s} \alpha_i = 1$
and $\alpha_i \geq 0$, $i = 1, \dots, s$. Finding an optimal convex
combination of models can be viewed as a generalization of model selection.
The selection of a single model is a special case with only one positive
coefficient and the other coefficients zero.
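The definition can be illustrated with a minimal sketch. The helper name `convex_combination` and the two toy predictors are assumptions made for illustration only; they are not part of the paper or of SPOT.

```python
from typing import Callable, Sequence

Model = Callable[[Sequence[float]], float]


def convex_combination(models: Sequence[Model], alphas: Sequence[float],
                       tol: float = 1e-9) -> Model:
    """Return the model x -> sum_i alpha_i * y_hat_i(x)."""
    if len(models) != len(alphas):
        raise ValueError("need exactly one weight per model")
    if any(a < 0.0 for a in alphas) or abs(sum(alphas) - 1.0) > tol:
        raise ValueError("weights must be non-negative and sum to 1")

    def ensemble(x: Sequence[float]) -> float:
        return sum(a * m(x) for a, m in zip(alphas, models))

    return ensemble


# Hypothetical toy predictors: one overestimates x[0], one underestimates it.
def over(x):
    return x[0] + 1.0


def under(x):
    return x[0] - 1.0


mixture = convex_combination([over, under], [0.5, 0.5])
# Model selection is the special case with a single positive coefficient.
selection = convex_combination([over, under], [1.0, 0.0])
print(mixture([2.0]), selection([2.0]))  # prints 2.0 3.0
```

Because only predictions are combined, the base models can be of any type as long as they can be called on a point.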
This paper investigates the idea of using convex combinations of
predictions of different models (model mixtures) to gain more accurate
predictions. Focusing on the predictions rather than on implementation
specifics when combining models gives us the ability to combine models
without further considering the type of the model, making the approach
very flexible. The main research questions are:
(Q-1) Can convex combinations of predictors improve on (single) model
selection?
(Q-2) Given the answer is positive, what are explanations of the observed
behavior?
(Q-3) How can a system be built that finds the optimal convex combination
of predictions on training data?
Figure 1: The black line marks the actual objective function value. The
dots show the results obtained in a leave-one-out cross-validation. Blue
and red dots mark the predictions of single models. The green dots show
predictions obtained with an optimal linear combination of the two
predictors.
In order to answer these questions, detailed empirical studies are
conducted, starting from simple examples and advancing to more complex
ones. To improve readability, this paper follows a non-standard structure,
where the discussion of experimental results directly follows the
introduction of the modeling extensions.
The paper is structured as follows: Section 2 discusses the general
approach and related work. Section 3 provides technical preliminaries for
the subsequent experiments. Section 4 introduces the idea of model
mixtures and explores binary model mixtures. Section 5 provides a more
detailed analysis of binary model mixtures. Section 6 extends the analysis
to ternary model mixtures, and Section 7 provides first results and
techniques for enabling mixtures of a larger number of models. Section 8
discusses the main results and future research directions.
2. General Approach and Related Work
Basing a decision or building a prediction on multiple opinions is common
practice in everyday life. It happens in a democratic government, or when
the audience in a TV show is asked for help. One might also use it when
trying to form an opinion on a topic that is new. Naturally, such tools
have already found their way into statistical prediction and machine
learning.
In statistics and machine learning an ensemble is a prediction model built
from several prediction models. A comprehensive introduction to
ensemble-based approaches in decision making is given in [6] and [4].
Generally, there are two groups of ensemble approaches: the approaches of
the first group, the so-called single-evaluation approaches, only choose
and build one single model, whereas the approaches of the second group,
the so-called multi-evaluation approaches, build all models and use the
derived information to decide which output to use. For each of these two
groups, several model selection strategies can be implemented. Well-known
strategies are:
Round robin and randomized choosing are the most simplistic
implementations of ensemble-based strategies. In the former approach, the
models are chosen in a circular order independent of their previously
achieved gain. In the latter approach, the model to be used in each step
is selected randomly from the list of available models. The previous
success of the model is not a decision factor.
Greedy strategies choose the model that provided the best function value
so far, while the SoftMax strategy uses a probability vector, where each
element represents the probability for a corresponding model to be chosen
[8]. The probability vector is updated depending on the reward received
for the chosen models.
Ranking strategies try to combine the responses of all meta models into
one response to which all meta models contributed, rather than choosing
one response.
Bagging combines results from randomly generated training sets and can
also be used in function approximation, whereas
Boosting combines several weak learners into a strong one in a stochastic
setting.
Weighted averaging approaches do not choose a specific model's result but
rather combine the results by averaging. Since bad models should not
deteriorate the overall result, a weighting scheme is introduced. Every
model's result for a single design point is weighted by its overall error;
the sum over all models yields the final value assigned to the design
point. In stacking, several trained models are combined and trained again
by a stacking algorithm. A typical example of a successful weighted
average model is Random Forests [3].
Convex combinations of surrogate models used in this paper can be viewed
as a special case of weighted averaging models, albeit we propose here
global optimization instead of re-training for finding the best convex
combination of models. Moreover, the analysis in this paper aims for a
transparent presentation of results using mixture analysis and focuses on
multimodal function approximation, which is an important application in
surrogate-assisted global optimization.
3. Preliminaries
3.1 Surrogate Models
By a surrogate model, we will understand here a function
$\hat{y}: \mathbb{R}^d \to \mathbb{R}$ that is an approximation to the
original function $y: \mathbb{R}^d \to \mathbb{R}$ and is learned from a
finite set of evaluations of the original function. A typical application
of surrogate models is to provide fast approximations of functions that
are expensive to evaluate, for instance functions based on costly computer
simulations. Kriging surrogate models were used in our study. A set of
three different kernels was used to implement the ensemble strategies.
Following the definitions from [7], the correlation models can be
described as follows. We consider stationary correlations of the form
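$$R(\theta, w, x) = \prod_{j=1}^{n} R(\theta_j, w_j - x_j).$$
The first model uses the exponential kernel
$$R(\theta_j, w_j - x_j) = \exp(-\theta_j |w_j - x_j|),$$
the second model uses a Gaussian kernel
$$R(\theta_j, w_j - x_j) = \exp(-\theta_j |w_j - x_j|^2),$$
whereas the third model is based on the spline correlation function
$$R(\theta_j, w_j - x_j) = \zeta(\xi_j) \quad \text{with} \quad \xi_j = \theta_j |w_j - x_j|,$$
$$\zeta(\xi_j) = \begin{cases}
1 - 15\xi_j^2 + 30\xi_j^3 & \text{for } 0 \leq \xi_j \leq 0.2 \\
1.25(1 - \xi_j)^3 & \text{for } 0.2 < \xi_j < 1 \\
0 & \text{for } \xi_j \geq 1.
\end{cases}$$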
The variables and $\theta$ are hyperparameters estimated by likelihood
maximization.
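The three correlation models can be written down directly. The following is a minimal sketch in plain Python; the function names are illustrative assumptions and this is not the DACE or SPOT implementation. Hyperparameter estimation is left out, so the `thetas` are simply passed in.

```python
import math


def corr_exp(theta, d):
    """Exponential kernel: exp(-theta * |d|)."""
    return math.exp(-theta * abs(d))


def corr_gauss(theta, d):
    """Gaussian kernel: exp(-theta * |d|^2)."""
    return math.exp(-theta * d * d)


def corr_spline(theta, d):
    """Cubic spline kernel zeta(xi) with xi = theta * |d|."""
    xi = theta * abs(d)
    if xi <= 0.2:
        return 1.0 - 15.0 * xi ** 2 + 30.0 * xi ** 3
    if xi < 1.0:
        return 1.25 * (1.0 - xi) ** 3
    return 0.0


def correlation(kernel, thetas, w, x):
    """Product over all dimensions j of kernel(theta_j, w_j - x_j)."""
    r = 1.0
    for theta_j, w_j, x_j in zip(thetas, w, x):
        r *= kernel(theta_j, w_j - x_j)
    return r
```

All three kernels return 1 for identical points and decay with distance; the spline kernel reaches exactly 0 once $\theta_j |w_j - x_j| \geq 1$ in any dimension.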
Table 1: Gaussian landscape generator options

Parameter  Description                                             Value
n          Dimension                                               2 - 10
m          Number of peaks                                         10 - 40
l          Lower bounds of the region, where peaks are generated   {0; 0}
u          Upper bounds of the region, where peaks are generated   {5; 5}
max        Max function value                                      100
           Ratio between global and local optima                   0.8
3.2 Objective Functions
For generating test functions we used the Max-Set of Gaussian Landscape
Generator (MSG) [5], which can be used to set up problem instances for
continuous, bound-constrained optimization problems. It uses the maximum
of $m$ weighted Gaussian functions
$$G(x) = \max_{i \in \{1, 2, \dots, m\}} \left( w_i g_i(x) \right),$$
where $g: \mathbb{R}^n \to \mathbb{R}$ denotes an n-dimensional Gaussian
function
$$g(x) = \left( \frac{\exp\left( -\frac{1}{2} (x - \mu) \Sigma^{-1} (x - \mu)^T \right)}{(2\pi)^{n/2} |\Sigma|^{1/2}} \right)^{1/n},$$
$\mu$ is an n-dimensional vector of means, and $\Sigma$ is an
$(n \times n)$ covariance matrix. The mean of each Gaussian corresponds to
an optimum on the landscape and the location of all optima is known. The
global optimum is the one with the largest value. For the generation of
the objective function the spotGlgCreate method of the SPOT package has
been used. Implementation details are presented in [2]. The options used
for our experiments are shown in Table 1. With the parameter n the
dimension of the objective function is specified. The lower and upper
bounds (l and u, respectively) specify the region where the peaks are
generated. The value max specifies the function value of the global
optimum, while the maximum function value of all other peaks is limited by
the ratio between the global and the local optima.
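To make the landscape definition concrete, here is a minimal sketch of $G(x)$ in plain Python. It is not the spotGlgCreate implementation: as a simplifying assumption the covariance matrices are taken to be diagonal (one standard deviation per dimension), and the peak weights in the example are chosen by hand so that the peak heights are 100 and 80.

```python
import math


def gaussian_component(x, mu, sigma):
    """g(x) for a diagonal covariance with standard deviations `sigma`."""
    n = len(x)
    quad = sum(((xj - mj) / sj) ** 2 for xj, mj, sj in zip(x, mu, sigma))
    # For a diagonal covariance, |Sigma|^(1/2) is the product of the sigmas.
    norm = (2.0 * math.pi) ** (n / 2.0) * math.prod(sigma)
    return (math.exp(-0.5 * quad) / norm) ** (1.0 / n)


def msg(x, peaks):
    """G(x) = max_i w_i * g_i(x); `peaks` holds (w, mu, sigma) triples."""
    return max(w * gaussian_component(x, mu, sigma) for w, mu, sigma in peaks)


# With unit sigmas, g(mu) = 1 / sqrt(2 * pi), so this factor sets peak heights.
unit = math.sqrt(2.0 * math.pi)
peaks = [
    (100.0 * unit, [1.0, 1.0], [1.0, 1.0]),  # global optimum, value 100
    (80.0 * unit, [4.0, 4.0], [1.0, 1.0]),   # local optimum, value 80
]
print(msg([1.0, 1.0], peaks), msg([4.0, 4.0], peaks))
```

Because the landscape is a maximum over components, each optimum sits at a component mean as long as no other component dominates there, which is why the optima locations are known in advance.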
4. Binary Ensembles
This section analyses models which combine only two models. Convex
combinations of models will be referred to as ensemble models, while the
original models will be referred to as base models. We focus on positive
weights, since we do not want to select models that make predictions which
are anti-correlated with the results.
A sample of points (design) is evaluated on the objective function (MSG,
for parameters see Table 1). For the sampling of the points a Latin
hypercube design featuring 40 design points is generated. The two base
models are Kriging with exponential correlation function (referred to as
a) and Gaussian correlation function (referred to as b). Both base models
are fitted to the data and then asked to do a prediction on the test data.
The predictions $\hat{y}$ of the ensemble models are calculated as linear
combinations of the predictions of the base models.
Given a weight $\alpha_n$, where
$\alpha_n \in \{0.0, 0.1, 0.2, \dots, 0.9, 1.0\}$, the ensemble models can
be defined as the linear combinations of the models a and b as follows:
$$\hat{y}_n = \alpha_n \times \hat{y}_a + (1 - \alpha_n) \times \hat{y}_b \qquad (1)$$
The models are evaluated by calculating the root mean squared error (RMSE)
of the predictions made during a leave-one-out cross-validation on the 40
design points.
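Eq. (1) and the RMSE evaluation can be sketched as follows. The sketch assumes that the leave-one-out predictions of both base models are already available as lists; the Kriging fit itself is not shown, and the toy data are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observations and predictions."""
    return math.sqrt(
        sum((p - y) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)
    )


def binary_ensemble_rmse(y_true, pred_a, pred_b, steps=10):
    """RMSE of alpha*y_a + (1-alpha)*y_b for alpha = 0, 1/steps, ..., 1."""
    scores = {}
    for k in range(steps + 1):
        alpha = k / steps
        pred_n = [alpha * a + (1.0 - alpha) * b for a, b in zip(pred_a, pred_b)]
        scores[alpha] = rmse(y_true, pred_n)
    return scores


# Hypothetical cross-validation output: model a overestimates by 1,
# model b underestimates by 1 at every design point.
y = [1.0, 2.0, 3.0, 4.0]
pred_a = [v + 1.0 for v in y]
pred_b = [v - 1.0 for v in y]

scores = binary_ensemble_rmse(y, pred_a, pred_b)
best_alpha = min(scores, key=scores.get)
print(best_alpha, scores[best_alpha])  # prints 0.5 0.0
```

The entries for alpha = 1.0 and alpha = 0.0 are the base models a and b themselves, so model selection is covered by the same grid.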
Since randomness has been induced into the experiment by using the Latin
hypercube design, the evaluation process has been repeated 50 times. With
each model returning one prediction for each design point in every
repetition this results in a total of 2,000 prediction values (40 design
points × 50 repetitions) for each model.
To get a first quick insight into the result data, the rankings of the
RMSEs have been calculated for each repetition. The models with α = 0.6,
α = 0.8 and α = 0.9 dominate this comparison, each performing best 8 out
of 50 times. The base models a and b performed best in only four and two
cases out of 50, respectively. An ensemble model with positive weights
never performed worst.
In order to achieve some comparability between the RMSEs of different
repetitions, all RMSEs have been scaled repetition-wise to values between
zero and one, so that the scaled RMSE of the best model in one repetition
is always zero and the scaled RMSE of the worst model of one repetition is
always one. Figure 2 shows the boxplot over these scaled RMSEs. It can be
seen that the model a (exponential) performs worst in most of the cases,
since its median value is one; only some outliers come close to zero.
Model b (Gaussian) shows a large variation in its performance. It has been
the best as well as the worst performing model, each at least once. Its
median and mean performances are average in comparison with all models
evaluated.
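The repetition-wise scaling is a plain min-max normalisation within one repetition. A minimal sketch follows; the model labels and RMSE values are hypothetical, and mapping a repetition in which all models tie to zero is an assumption, since the text does not cover that case.

```python
def scale_repetition(rmse_by_model):
    """Scale the RMSEs of one repetition so that best -> 0 and worst -> 1."""
    lo = min(rmse_by_model.values())
    hi = max(rmse_by_model.values())
    if hi == lo:
        # Assumption: if all models tie, every scaled value is set to 0.
        return {name: 0.0 for name in rmse_by_model}
    return {name: (v - lo) / (hi - lo) for name, v in rmse_by_model.items()}


# Hypothetical RMSEs of three models in a single repetition.
print(scale_repetition({"a": 4.0, "0.5": 2.0, "b": 3.0}))
# prints {'a': 1.0, '0.5': 0.0, 'b': 0.5}
```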
Figure 2: Boxplot over the scaled RMSEs of all models. The models are
defined by an α-weighted linear combination of the two base models. The
results of the base models are depicted on the outer rows and colored red
(exponential kernel) and blue (Gaussian kernel), respectively. All linear
combinations are drawn in between. The model combination chosen as best,
with α = 0.6, is colored green. The mean value of each result bar is
marked by a dot.
A parabolic tendency can be seen in the performances. This indicates that
linear combinations of the models are indeed beneficial. Due to the convex
combination of the predictors, a single prediction by the ensemble model
cannot be worse than the worse of the two base model predictions, but it
might be better than both. For a single prediction, an ensemble can only
be better than both base models if one model overestimates and the other
model underestimates the original function value. In the experiment this
happens in 649 out of 2000 cases.
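The over- and underestimation argument can be made explicit for a single design point. In this sketch, $e_a$, $e_b$ and $e_n$ are notation introduced here (not taken from the paper) for the signed errors of the two base models and of the ensemble from Eq. (1).

```latex
\begin{align*}
e_a &= \hat{y}_a - y, \qquad e_b = \hat{y}_b - y, \\
e_n &= \hat{y}_n - y = \alpha_n e_a + (1 - \alpha_n) e_b,
      \qquad 0 \le \alpha_n \le 1, \\
|e_n| &\le \alpha_n |e_a| + (1 - \alpha_n) |e_b| \le \max(|e_a|, |e_b|), \\
e_a e_b \ge 0 \;&\Rightarrow\;
      |e_n| = \alpha_n |e_a| + (1 - \alpha_n) |e_b| \ge \min(|e_a|, |e_b|).
\end{align*}
```

The third line shows that the ensemble error never exceeds the larger base error. The last line shows that with errors of equal sign the ensemble cannot beat the better base model at that point, so a strict improvement over both requires errors of opposite sign. Both statements hold per design point; an RMSE aggregated over many points is not covered by this argument.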
As a consistent method for evaluating the performance and automatically
choosing the best model the following approach is proposed: Model-wise
mean, median and 3rd quartile values are calculated. The resulting values
are ranked and the rankings summed up to one final ranking. The model that
achieved the lowest value is recommended as best choice. In Figure 2 the
model recommended as best choice by this method is colored green.
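The rank-sum recommendation can be sketched with the standard library. Two details are assumptions, since the text does not specify them: the third quartile uses the "inclusive" method of `statistics.quantiles`, and tied values share the lowest rank. The model labels and data are hypothetical.

```python
import statistics


def summary(values):
    """Return (mean, median, 3rd quartile) of one model's scaled RMSEs."""
    q3 = statistics.quantiles(values, n=4, method="inclusive")[2]
    return (statistics.mean(values), statistics.median(values), q3)


def recommend(scaled_rmse):
    """Rank models on each statistic, sum the ranks, return the lowest sum."""
    stats = {name: summary(vals) for name, vals in scaled_rmse.items()}
    total = dict.fromkeys(stats, 0)
    for k in range(3):
        for name in stats:
            # Rank 1 = smallest value; ties share the lowest rank (assumption).
            smaller = sum(other[k] < stats[name][k] for other in stats.values())
            total[name] += 1 + smaller
    best = min(total, key=total.get)
    return best, total


# Hypothetical scaled RMSEs of three models over four repetitions.
data = {
    "a": [1.0, 0.9, 1.0, 0.8],
    "0.5": [0.0, 0.1, 0.2, 0.0],
    "b": [0.5, 0.4, 1.0, 0.0],
}
best, total = recommend(data)
print(best, total)  # prints 0.5 {'a': 9, '0.5': 3, 'b': 6}
```

Summing ranks over three statistics rewards models that are good on average and also have a well-behaved upper tail.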
5. Detailed Analysis on Transparent Test Cases
It can clearly be stated that for this first experiment setup the
combination of two models is beneficial for the overall prediction. In
this section we take a closer look at possible explanations for the
successful result. Are there problem features that encourage using
ensembles, and is this result generalizable?
Figure 8: The plot shows the results of the same experiment setup as
presented in Sec. 6. The optimal linear combination has been searched with
a simple (1+1)-Evolution Strategy with 1/5th success rule. Again, each
circle depicts the performance results of one model. The three base models
are located on the corners of the triangle; models gained by linear
combinations of only two models are located on the outer border. Circles
in the inner area of the triangle show the results of models that were
gained by linear combinations of all three base models. The size of the
circles denotes the mean RMSE value, the color the standard deviation. The
model proposed as best choice is marked by an additional white circle.
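does not meet the requirements needed for a valid weights vector, the
resulting vector has been adjusted in 3 steps:
1) If $\min(\alpha, \beta, \gamma) < 0$ then $(\alpha, \beta, \gamma) := (\alpha, \beta, \gamma) - \min(\alpha, \beta, \gamma)$, component-wise,
2) $(\alpha, \beta, \gamma) := (\alpha, \beta, \gamma) / (\alpha + \beta + \gamma)$,
3) Round the values α, β, γ to two decimal places so that α + β + γ = 1.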
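The three adjustment steps can be sketched as a small repair function. Two points are assumptions because the text leaves them open: a vector that sums to zero after step 1 is replaced by uniform weights, and any rounding remainder in step 3 is assigned to the largest weight. The name `repair_weights` is hypothetical.

```python
def repair_weights(weights):
    """Turn an arbitrary real vector into valid two-decimal convex weights."""
    w = list(weights)

    # Step 1: shift so that no component is negative.
    low = min(w)
    if low < 0.0:
        w = [v - low for v in w]

    # Step 2: normalise to sum 1 (assumption: uniform weights if the sum is 0).
    total = sum(w)
    if total <= 0.0:
        w = [1.0 / len(w)] * len(w)
    else:
        w = [v / total for v in w]

    # Step 3: round to hundredths; working in integers keeps the sum exact.
    hundredths = [round(v * 100) for v in w]
    largest = max(range(len(w)), key=hundredths.__getitem__)
    hundredths[largest] += 100 - sum(hundredths)  # assumption: fix on largest
    return [h / 100 for h in hundredths]


print(repair_weights([0.5, -0.25, 0.25]))  # prints [0.6, 0.0, 0.4]
print(repair_weights([1.0, 1.0, 1.0]))     # prints [0.34, 0.33, 0.33]
```

Note that step 1 always drives the smallest weight to zero whenever a negative component occurs, so a repaired vector then lies on the border of the weight simplex.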
For this experiment we allowed a maximum of 100 individuals to be
evaluated. Within this budget, the 35th individual evaluated was already
the best individual found in this run. Figure 8 depicts the results of
this optimization step. As before, the best individual is marked by a
white circle.
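A (1+1)-Evolution Strategy with 1/5th success rule for the weight search can be sketched as follows. This is not the authors' implementation: the step-size factor 0.85, the adaptation window of 10 generations, the initial step size, the simplified repair (shift and normalise only, without rounding) and the toy loss are all assumptions chosen for illustration.

```python
import math
import random


def repair(w):
    """Shift to non-negative values and normalise to sum 1 (no rounding)."""
    low = min(w)
    if low < 0.0:
        w = [v - low for v in w]
    total = sum(w)
    return [v / total for v in w] if total > 0.0 else [1.0 / len(w)] * len(w)


def one_plus_one_es(loss, start, budget=100, sigma=0.3, window=10,
                    factor=0.85, seed=1):
    """Minimise `loss` over weight vectors using at most `budget` evaluations."""
    rng = random.Random(seed)
    parent = repair(list(start))
    parent_loss = loss(parent)              # evaluation 1
    successes = 0
    for generation in range(1, budget):     # evaluations 2 .. budget
        child = repair([v + rng.gauss(0.0, sigma) for v in parent])
        child_loss = loss(child)
        if child_loss < parent_loss:        # elitist (1+1) selection
            parent, parent_loss = child, child_loss
            successes += 1
        if generation % window == 0:        # 1/5th success rule
            if successes > window / 5:
                sigma /= factor             # many successes: larger steps
            elif successes < window / 5:
                sigma *= factor             # few successes: smaller steps
            successes = 0
    return parent, parent_loss


# Hypothetical toy problem: three predictors with constant errors +1, -1, +3.
y = [1.0, 2.0, 3.0, 4.0]
preds = [[v + 1.0 for v in y], [v - 1.0 for v in y], [v + 3.0 for v in y]]


def ensemble_rmse(w):
    total = 0.0
    for i, target in enumerate(y):
        pred = sum(wk * p[i] for wk, p in zip(w, preds))
        total += (pred - target) ** 2
    return math.sqrt(total / len(y))


start = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
weights, best_loss = one_plus_one_es(ensemble_rmse, start)
print(weights, best_loss)
```

Since the parent is only replaced by a strictly better child, the returned loss is never worse than that of the starting weights; how close it gets to the optimum within 100 evaluations depends on the seed and the step-size settings.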

8. Discussion and Outlook
Reconsidering the research questions from Sec. 1, we can state that linear
combinations of predictors can generate better results than model
selection (Q-1). A system which finds optimal linear combinations was
presented in Sec. 4. The corresponding experiments were extended to a
larger scale in Sec. 6. The results from these experiments further support
our statement that combining models leads to better results. Finally, in
Sec. 7, we proposed a method to include even more base models in the
system. For the same experiment setup as used before, a solution of
comparable performance quality has been found with an even smaller number
of ensemble model evaluations. With this method the foundation has been
created for a larger system including all available models.
Although research question (Q-3) could be answered positively, a complete
answer to question (Q-2) could not be given in this study. Explanations of
the observed behavior require further research. Ideas and questions that
were not investigated so far include:
Experiments featuring more base models, also including other types of models.
Extensive analysis of the influence of objective function attributes on the experiment outcome. The results of Sec. 5.2 suggest that particularly piecewise assembled functions might be of special interest.
Studies also allowing operations other than simple linear combinations.
Conception of a procedure that includes our method of ensemble building in a sequential optimization process.
Summarizing, this preliminary study presents valuable and new findings in
the field of ensemble-based modeling. We developed a smart and simple
strategy for combining different modeling approaches. It uses a (linear)
combination of the predicted values and is easily applicable in many
modeling situations where several models are available. Especially if the
user does not know which model to choose, a linear combination might be a
promising approach. The weights in the linear model can shed some light on
the relevance of certain models and illustrate which model is active.
Ternary plots (as shown in Figure 7) can be used to illustrate the
progress of the optimization process. However, since determination of
optimal weights in the linear model is a non-linear optimization problem,
we cannot guarantee the optimality of the proposed weights. All in all,
this experimental study presents some important findings about the
behavior of an ensemble-based approach that defines an interesting
direction of research.
References
[1] T. Bartz-Beielstein. SPOT: An R package for automatic and interactive
tuning of optimization algorithms by sequential parameter optimization.
Technical Report 05/10, Research Center CIOP (Computational Intelligence,
Optimization and Data Mining), Cologne University of Applied Science,
Faculty of Computer Science and Engineering Science, 2010. Comments:
Related software can be downloaded from
http://cran.r-project.org/web/packages/SPOT/index.html.
[2] T. Bartz-Beielstein. How to Create Generalizable Results. In
J. Kacprzyk and W. Pedrycz, editors, Springer Handbook of Computational
Intelligence, pages 1127–1142. Springer Berlin Heidelberg, Berlin,
Heidelberg, 2015.
[3] L. Breiman. Random forests. Mach. Learn., 45(1):5–32, Oct. 2001.
[4] M. Friese, M. Zaefferer, T. Bartz-Beielstein, O. Flasch, P. Koch,
W. Konen, and B. Naujoks. Ensemble-Based Optimization and Tuning
Algorithms. In F. Hoffmann and E. Hüllermeier, editors, Proceedings 21.
Workshop Computational Intelligence, pages 119–134. Universitätsverlag
Karlsruhe, 2011.
[5] M. Gallagher and B. Yuan. A general-purpose tunable landscape
generator. IEEE Trans. Evolutionary Computation, 10(5):590–603, 2006.
[6] R. Polikar. Ensemble based systems in decision making. Circuits and
Systems Magazine, IEEE, 6(3):21–45, 2006.
[7] S. N. Lophaven, H. B. Nielsen, and J. Søndergaard. DACE - A Matlab
Kriging Toolbox. Technical report, Technical University of Denmark, 2002.
[8] R. S. Sutton and A. G. Barto. Introduction to Reinforcement Learning.
MIT Press, Cambridge, MA, USA, 1st edition, 1998.
CIplus, Volume 4/2016
Martina Friese and Thomas Bartz-Beielstein, Technische Hochschule Köln
Michael Emmerich, LIACS, Leiden University
March 2016
Responsibility for the content of this publication lies with the authors.