References
RESULTS
METHODOLOGY
Advancing Heliostat Aiming Strategies in Solar Tower Plants

Jose A. Carballo (a), Javier Bonilla (a), N.C. Cruz (b), Jesús Fernández-Reche (a), J.D. Álvarez (c), Antonio Avila-Marín (a), M. Berenguel (c)

(a) CIEMAT - Plataforma Solar de Almería (PSA), Ctra. de Senés km 4.5, E-04200 Tabernas, Almería, Spain
(b) Dept. of Computer Engineering, Automation, and Robotics. Faculty of Education, Economics, and Technology of Ceuta, University of Granada
(c) CIESOL, Solar Energy Research Center, Joint Institute University of Almería - CIEMAT, Department of Informatics, E-04120, Almería, Spain

Aiming strategy – Motivation
A major challenge in solar tower (ST) plants is optimizing heliostat
aiming to maximize energy capture while protecting receiver integrity
and lifetime.
We previously proposed a model-free deep Reinforcement Learning
(RL) approach [1] using the Soft Actor–Critic (SAC) algorithm. The task
is framed as continuous control, where an SAC agent interacts with a
SolarPilot/CoPylot model of the CESA-1 field, learning through
simulated episodes. Rewards promote absorbed power while
penalizing spillage, excessive motion, and unsafe flux peaks, ensuring
policies converge to safe and efficient solutions.
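
The reward structure described above can be illustrated with a minimal Python sketch. It is not the reward used in [1]: the function name, the normalization by incident power, and every weight value are assumptions chosen only to show how one gain term and three penalties might be combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the poster does not report numeric values.
    spillage: float = 1.0
    motion: float = 0.1
    flux: float = 5.0


def aiming_reward(absorbed_kw, spillage_kw, motion_rad, peak_flux_kw_m2,
                  flux_limit_kw_m2, incident_kw, weights=RewardWeights()):
    """Return a scalar reward for one simulation step (illustrative only)."""
    if incident_kw <= 0:
        return 0.0  # no sun, nothing to reward or penalize
    # Gain: fraction of the incident power absorbed by the receiver.
    gain = absorbed_kw / incident_kw
    # Penalty: fraction of the incident power that misses the receiver.
    spill = spillage_kw / incident_kw
    # Penalty: only the part of the peak flux above the limit counts.
    excess = max(0.0, peak_flux_kw_m2 - flux_limit_kw_m2) / flux_limit_kw_m2
    return (gain
            - weights.spillage * spill
            - weights.motion * motion_rad
            - weights.flux * excess)
```

With these assumed weights, a step that exceeds the flux limit is rewarded less than an otherwise identical step that stays below it, which is the kind of pressure towards safe policies that the text describes.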
The agent evaluates plant states and provides optimized aiming in
real time, enabling fully automatic, adaptive strategies that
outperform static fixed-point methods.
The present work focuses on enhancing training efficiency and
incorporating realistic solar variability to ensure robust, transferable
strategies for current and future ST plants.
Location
[1] J. A. Carballo, J. Bonilla, N. C. Cruz, J. Fernández-Reche, J.D. Álvarez, A. Avila-Marin, & M. Berenguel.
Reinforcement learning for heliostat aiming: Improving the performance of Solar Tower plants. Applied Energy,
377, 124574. 2025. doi: https://doi.org/10.1016/j.apenergy.2024.124574
[2] E. Liang, R. Liaw, R. Nishihara, P. Moritz, R. Fox & J. Gonzalez. Ray RLlib: A composable and scalable
reinforcement learning library. arXiv preprint arXiv:1712.09381, 85, 245. 2025.
[3] Wagner, M.J., Wendelin, T. (2018). SolarPILOT: A power tower solar field layout and characterization tool,
Solar Energy, Vol. 171, pp. 185-196, ISSN 0038-092X, https://doi.org/10.1016/j.solener.2018.06.063.
Building on previous results, we adopted the Ray framework to overcome the main
limitations of our initial SAC approach. The implementation relies on Ray RLlib [2] for
scalable distributed training, Ray Tune for large-scale hyperparameter optimization, and
Ray Train for efficient execution.
The optimization process (Fig. 2) launches multiple parallel trainings with different
configurations. Each training consists of episodes made of simulation steps, where the
agent interacts with the environment, collects rewards, and updates actor–critic networks.
Performance is evaluated through the mean episode return (R), and promising
configurations are refined iteratively to converge towards robust policies.
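
The loop of scoring configurations by mean episode return and keeping the promising ones can be sketched in plain Python. The poster does not say which search scheduler was used, so the simple truncation selection below is a stand-in and not the Ray Tune setup. The toy `run_training` function, the candidate learning rates, and the keep fraction are all hypothetical.

```python
import random
from statistics import mean


def mean_episode_return(episode_returns):
    """R: average of the total reward collected in each episode."""
    return mean(episode_returns)


def run_training(config, n_episodes, rng):
    """Stand-in for one training run; returns one value per episode.

    Hypothetical toy model: returns peak when the learning rate is 3e-4.
    A real run would step the heliostat-field environment instead.
    """
    quality = -abs(config["lr"] - 3e-4) * 1e3
    return [quality + rng.gauss(0.0, 0.05) for _ in range(n_episodes)]


def refine(configs, rounds=3, keep=0.5, n_episodes=20, seed=0):
    """Keep the best fraction of configurations after each round."""
    rng = random.Random(seed)
    survivors = list(configs)
    for _ in range(rounds):
        scored = [(mean_episode_return(run_training(c, n_episodes, rng)), c)
                  for c in survivors]
        # Highest mean episode return first.
        scored.sort(key=lambda item: item[0], reverse=True)
        n_keep = max(1, int(len(scored) * keep))
        survivors = [c for _, c in scored[:n_keep]]
    return survivors


# Assumed candidate learning rates, for illustration only.
candidates = [{"lr": lr} for lr in (1e-5, 1e-4, 3e-4, 1e-3, 3e-3, 1e-2)]
best = refine(candidates)
```

In a distributed setting the inner list comprehension is the part that would run in parallel, one training per worker, while the ranking step stays cheap and centralized.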
The custom environment, developed in Python in the previous work and integrating
SolarPilot/CoPylot [3], simulates 114 heliostats of the CESA-1 field. It has been modified to
incorporate real PSA irradiance data, exposing the agent to variability and cloud
transients.
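
One way to expose an agent to measured irradiance is to replay a randomly chosen window of the record in each episode. The sketch below assumes that design; the class name `DniReplay`, its interface, and the sample values are hypothetical and do not come from the PSA data set or the authors' environment.

```python
import random


class DniReplay:
    """Replays windows of a measured DNI series, one value per simulation step."""

    def __init__(self, dni_w_m2, episode_len, seed=None):
        if episode_len < 2:
            raise ValueError("episode_len must be at least 2")
        if episode_len > len(dni_w_m2):
            raise ValueError("episode longer than the available record")
        self._dni = list(dni_w_m2)
        self._len = episode_len
        self._rng = random.Random(seed)
        self._start = 0
        self._step = 0

    def reset(self):
        """Pick a random start so each episode sees different sky conditions."""
        self._start = self._rng.randrange(len(self._dni) - self._len + 1)
        self._step = 0
        return self._dni[self._start]

    def step(self):
        """Advance one step and return (dni, done)."""
        self._step += 1
        dni = self._dni[self._start + self._step]
        done = self._step == self._len - 1
        return dni, done


# Made-up record in W/m2 with a cloud dip in the middle.
record = [850, 860, 300, 120, 700, 880]
replay = DniReplay(record, episode_len=4, seed=1)
window = [replay.reset()]
done = False
while not done:
    value, done = replay.step()
    window.append(value)
```

Because windows are contiguous slices of the record, the ramp rates of cloud transients are preserved, which would not be the case if DNI values were sampled independently at each step.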
All experiments were executed on a 225-core workstation at PSA, enabling large-scale
parallel sampling and efficient policy evaluation. Best-performing policies were validated
with annual simulations, assessing energy yield, reward trajectories, and flux distribution.
This workflow ensures robustness and transferability of the learned strategies to real ST
plant operation.
Conclusions and future work
Fig 1.- RL in heliostat aiming strategy
Fig 2.- Optimization and training process
Fig 3.- Software
The adoption of Ray RLlib greatly improved both efficiency and
performance. Training time was reduced from ~9 days for a single SAC
agent to <4 days for 500 parallel configurations, enabling systematic
exploration of network architectures, learning rates, and exploration
strategies.
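
As a rough reading of these figures, the short calculation below compares wall-clock time per trained configuration before and after. It uses only the three numbers quoted above and treats "<4 days" as exactly 4, so the result is a lower bound. It also assumes the runs are comparable in length and ignores hardware differences between the two setups, which the poster does not detail.

```python
single_agent_days = 9.0   # ~9 days for one SAC agent (from the text)
parallel_days = 4.0       # upper bound, the text says <4 days
n_configs = 500           # parallel configurations (from the text)

# Time the old setup would need to train the same number of configurations
# one after another, under the assumption of equal run length.
sequential_days = single_agent_days * n_configs

# Lower bound on the throughput gain, since parallel_days is an upper bound.
throughput_gain = sequential_days / parallel_days
```

Under these assumptions the sequential equivalent is 4500 days, so the throughput gain is at least about 1125x. The figure is only indicative, since it depends on how the individual runs were budgeted.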
This large-scale search enhanced policy robustness, ensuring
generalization under realistic DNI variability, including cloud transients.
Optimized agents achieved a 9.1% annual gain in absorbed power,
surpassing the 8.8% improvement of our earlier study.
These results confirm the technical superiority and industrial readiness of
distributed RL for heliostat aiming optimization in solar tower plants.
Fig 4.- Optimization/training results
This work has been funded by the ASTERIx-CAESar project, which is funded by the European
Union under Grant Agreement Nº 101122231. Views and opinions expressed are
however those of the author(s) only and do not necessarily reflect those of the European
Union or CINEA. Neither the European Union nor the granting authority can be held
responsible for them.
Integrating Ray RLlib with real DNI data advances RL-based heliostat
aiming, achieving scalability, robustness, and industrial feasibility. The
distributed framework reduces training time while maintaining stable
performance under solar variability and cloud transients.
Future work will focus on field validation at the CESA-1 plant and the use
of transfer learning to adapt pre-trained policies to new sites and receiver
designs, enabling broader deployment of adaptive aiming strategies
across solar tower plants.