Corresponding author: Awolesi Abolanle Ogunboyo
Copyright © 2025 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.
Generative policy models for autonomous governance in edge AI
Awolesi Abolanle Ogunboyo *
Independent researcher.
World Journal of Advanced Research and Reviews, 2025, 27(01), 1394-1398
Publication history: Received on 01 June 2025; revised on 08 July 2025; accepted on 10 July 2025
Article DOI: https://doi.org/10.30574/wjarr.2025.27.1.2608
Abstract
As AI increasingly migrates to edge environments characterized by decentralization, limited resources, and real-time
demands, the need for autonomous governance mechanisms has become paramount. This study introduces Generative
Policy Models (GPMs), a novel class of transformer-based generative reinforcement learning frameworks designed for
self-evolving policy generation in edge AI settings. By synthesizing policies without reliance on labeled data or central
supervision, GPMs enable autonomous swarms, adaptive IoT networks, and mission-critical edge systems to operate
efficiently and intelligently. Three simulated environments (UAV swarms, smart traffic control, and IoT
resource allocation) were used to evaluate GPM performance. Results demonstrate that GPMs surpass traditional RL
baselines in decision latency, adaptability, and policy novelty, confirming their suitability for real-world decentralized
systems. This work fills a critical gap in the literature by merging generative AI with edge autonomy and paves the way
for resilient, explainable, and self-governing AI infrastructures.
Keywords: Generative reinforcement learning; Edge AI; Autonomous governance; Transformer models; Decentralized
policy; Self-evolving systems
1. Introduction
The evolution of Artificial Intelligence (AI) has gradually shifted from centralized, data-intensive environments to
decentralized, resource-constrained edge computing infrastructures (Duan et al., 2022; Walia et al., 2023). In an era
where edge AI governs autonomous drones, vehicular networks, and smart IoT environments, the central question
arises: How can such distributed systems govern themselves autonomously without continuous cloud or human
supervision?
As edge devices increasingly make autonomous decisions in volatile and mission-critical environments, self-evolving
governance strategies become paramount (Kurt et al., 2021). This research introduces and evaluates a novel framework
of Generative Policy Models (GPMs) that leverage generative reinforcement learning (GRL) to autonomously construct,
adapt, and evolve decision policies at the edge. The demand for autonomous governance stems from the inability of
static rule-based systems to adapt to the dynamic and unpredictable conditions typical of edge deployments
(Golpayegani et al., 2024).
Moreover, traditional cloud-dependent governance strategies are infeasible given the data privacy concerns, latency
constraints, and connectivity instability inherent in edge architectures (Rajapakse et al., 2023). Generative policy models address these
challenges by utilizing self-supervised reinforcement learning to generate optimal policies without relying on extensive
labeled datasets or cloud intervention.
The following hypotheses drive this research:
• Generative reinforcement learning models can autonomously learn governance policies in edge environments
with minimal supervision,
• Such models are capable of evolving adaptively in response to dynamic operational contexts, and
• GPMs outperform traditional reinforcement learning and static policy baselines in edge-based autonomous
decision-making.
By integrating policy generation with lightweight transformer architectures and edge-specific constraints such as
computational load and latency thresholds, as researched by Shuvo et al. (2022) and Li et al. (2023), this work
contributes a strategic leap in enabling autonomous swarms of devices and self-adaptive systems in the Internet of
Things (IoT). Also, it investigates the feasibility, performance, and generalizability of GPMs within edge AI scenarios,
thereby outlining their implications for self-regulating edge networks, smart city frameworks, and critical infrastructure
monitoring systems.
2. Literature Review
The intersection of edge computing and AI governance has seen rapid growth, driven by the proliferation of intelligent
devices and the need for low-latency decision-making (Bourechak et al., 2023; Hemmati et al., 2024). However,
traditional edge AI models rely on supervised learning frameworks or cloud-supported policy management (Hussain et
al., 2020); recent advances have highlighted the importance of decentralization and autonomy in policy generation (Cao,
2022; Zhu et al., 2024).
Edge environments, typified by limited connectivity, heterogeneous architectures, and high variability, necessitate
adaptable and self-regulating policy frameworks. Generative models, particularly Variational Autoencoders (VAEs),
Generative Adversarial Networks (GANs), and transformer-based architectures like GPT, have revolutionized many
domains, including natural language processing, image synthesis, and protein structure prediction (Bengesi et al., 2024).
When used with reinforcement learning (RL), these generative systems form a new paradigm known as Generative
Reinforcement Learning (GRL), capable of synthesizing policies based on learned representations rather than
predefined datasets (Cao et al., 2023; Chen et al., 2024). Furthermore, according to Chen et al. (2024), policy learning
through deep reinforcement learning (DRL) has been explored in edge computing scenarios to optimize energy usage,
task offloading, and network scheduling. However, these approaches often require frequent retraining and do not
generalize well across devices or tasks. Meta-RL and federated RL have attempted to address these gaps by enabling
knowledge transfer across agents, but the generative capacity for novel policy emergence remains under-explored (Hu
et al., 2023; Wu et al., 2024).
This study identifies a key gap in the existing literature: the lack of a scalable, generative framework for
autonomous policy creation in decentralized environments. By embedding policy evolution capabilities within edge AI,
GPMs address the critical need for explainable, adaptive, and efficient governance mechanisms; this positions the
current research at the frontier of integrating generative modeling with edge-centric autonomy.
3. Methodology
This study adopted a mixed-methods experimental design combining simulated edge AI environments and empirical
evaluation of policy model performance. The central methodological framework uses a transformer-based policy
generator built on generative reinforcement learning.
• Research Design: Three edge scenarios were modeled: (i) a UAV swarm surveillance mission, (ii) real-time
traffic signal control in a smart city setting, and (iii) resource allocation in a distributed IoT sensor network.
Each scenario was simulated using the OpenAI Gym and customized edge RL environments with imposed
constraints such as low compute availability, intermittent connectivity, and variable latency.
• Data Collection: Synthetic datasets were generated through simulations incorporating realistic operational
metrics, such as drone battery levels, signal strengths, and bandwidth utilization. The initial training of
generative policy agents used an offline dataset generated by expert heuristics. Subsequent learning phases
relied on reinforcement signals (e.g., latency minimization, success rate maximization) without further
supervision.
• Model Implementation: A lightweight transformer architecture was implemented to generate policy tokens
conditioned on encoded environment states; the reward function was dynamically adjusted per task to reflect
performance metrics and resource constraints. Furthermore, a multi-agent reinforcement learning approach
was used to simulate collaborative edge agents, while the comparative baselines included DQN, PPO, and
rule-based systems.
• Analytical Techniques: The models were evaluated on convergence time, reward stability, policy novelty
(measured by Levenshtein distance between generated policies), and resource efficiency. Statistical
significance was verified using ANOVA and Wilcoxon tests, while ablation studies assessed the contribution of
transformer layers, self-attention, and generative replay mechanisms.
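As an illustration only, the per-task reward adjustment described under Model Implementation might take a form like the following sketch. It assumes the reward combines a task-success term with penalties for exceeding a latency budget and a compute budget. The class name, function name, weights, and budget values are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRewardConfig:
    """Hypothetical per-task weights; none of these values come from the article."""
    success_weight: float      # reward for completing the task objective
    latency_weight: float      # penalty per millisecond above the latency budget
    compute_weight: float      # penalty per unit of compute above the budget
    latency_budget_ms: float   # assumed latency threshold for the task
    compute_budget: float      # assumed compute budget (arbitrary units)


def shaped_reward(cfg: TaskRewardConfig, success: bool,
                  latency_ms: float, compute_used: float) -> float:
    """Combine a task-success term with penalties for exceeding edge budgets."""
    # Only the excess over each budget is penalised; staying within budget costs nothing.
    latency_excess = max(0.0, latency_ms - cfg.latency_budget_ms)
    compute_excess = max(0.0, compute_used - cfg.compute_budget)
    reward = cfg.success_weight if success else 0.0
    reward -= cfg.latency_weight * latency_excess
    reward -= cfg.compute_weight * compute_excess
    return reward


# Illustrative configurations for two scenarios (all numbers are made up).
UAV_CFG = TaskRewardConfig(1.0, 0.01, 0.5, latency_budget_ms=50.0, compute_budget=1.0)
IOT_CFG = TaskRewardConfig(1.0, 0.002, 1.0, latency_budget_ms=200.0, compute_budget=0.25)

if __name__ == "__main__":
    # Same outcome, different task: the UAV configuration penalises the latency
    # overshoot, while the IoT configuration penalises the extra compute instead.
    print(shaped_reward(UAV_CFG, True, latency_ms=80.0, compute_used=0.5))
    print(shaped_reward(IOT_CFG, True, latency_ms=80.0, compute_used=0.5))
```

Keeping the weights in a small configuration object is one way to switch reward definitions between scenarios without touching the learning loop. Whether the study used a similar separation is not stated.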
4. Results
The generative policy models (GPMs) demonstrated superior adaptability and performance across all edge AI
environments tested. In the UAV surveillance scenario, GPMs achieved a 22% increase in mission completion rates
compared to PPO baselines while reducing average decision latency by 15 ms. In smart traffic control simulations, the
GPMs reduced average vehicle wait time by 18% relative to static policies.
Resource utilization efficiency was markedly higher for GPMs due to their ability to anticipate and proactively
reconfigure task strategies. Across all three simulations, the transformer-based policy generators maintained over 90%
policy validity post-training and showed minimal degradation under constrained resource availability. Policy novelty
scores indicated high generative diversity, enabling the agents to discover effective but non-obvious policy patterns.
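The study reports policy novelty as a Levenshtein distance between generated policies but does not give the exact formula. A minimal sketch, assuming each policy is a sequence of discrete tokens and that novelty is the mean pairwise edit distance normalised by the longer sequence, could look like this. The function names, the normalisation, and the example tokens are assumptions made for illustration.

```python
from itertools import combinations
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two token sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def novelty_score(policies: Sequence[Sequence[str]]) -> float:
    """Mean pairwise edit distance, normalised by the longer sequence length."""
    pairs = list(combinations(policies, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for p, q in pairs:
        longest = max(len(p), len(q))
        total += levenshtein(p, q) / longest if longest else 0.0
    return total / len(pairs)


if __name__ == "__main__":
    # Hypothetical policy token sequences for a UAV agent.
    generated = [
        ["hover", "scan", "return"],
        ["hover", "scan", "relay", "return"],
        ["climb", "scan", "return"],
    ]
    print(novelty_score(generated))
```

Under this definition a score near 0 means the generated policies are nearly identical, and a score near 1 means they share almost no tokens in common positions.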
The empirical evaluation confirmed hypotheses (1) and (2), as GPMs functioned autonomously without human
supervision and evolved dynamically in response to environmental feedback. Hypothesis (3) was supported by
statistically significant improvements in reward accumulation and policy robustness compared to traditional RL
models.
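The article names Wilcoxon tests without saying which variant was applied or how runs were paired. As a sketch under the assumption of a paired signed-rank test over matched random seeds, the statistic and an exact two-sided p-value can be computed as follows. The function names and the reward values are hypothetical, and exhaustive enumeration is only practical for small samples; a library routine such as `scipy.stats.wilcoxon` would normally be used instead.

```python
from itertools import product
from typing import List, Sequence, Tuple


def signed_ranks(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Rank |x_i - y_i| (average ranks for ties), then reattach each sign."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [r if d > 0 else -r for r, d in zip(ranks, diffs)]


def wilcoxon_exact(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (W_plus, exact two-sided p-value) for paired samples x and y."""
    sr = signed_ranks(x, y)
    abs_ranks = [abs(r) for r in sr]
    w_plus = sum(r for r in sr if r > 0)
    centre = sum(abs_ranks) / 2  # mean of W+ under the null hypothesis
    observed = abs(w_plus - centre)
    # Under the null, every sign pattern is equally likely; enumerate all 2^n.
    extreme = 0
    total = 0
    for signs in product((0, 1), repeat=len(abs_ranks)):
        w = sum(r for r, s in zip(abs_ranks, signs) if s)
        total += 1
        if abs(w - centre) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / total


if __name__ == "__main__":
    # Made-up mean episode rewards for eight matched seeds.
    gpm = [10.5, 11.2, 9.8, 12.1, 10.9, 11.7, 10.2, 11.0]
    ppo = [9.9, 10.1, 9.6, 10.8, 10.0, 10.3, 9.7, 10.2]
    print(wilcoxon_exact(gpm, ppo))
```

In this made-up example every one of the eight paired differences favours the first sample, so W+ takes its maximum value of 36 and the exact two-sided p-value is 2/256, about 0.0078.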
5. Discussion
The findings align with prior assertions on the limitations of traditional RL in dynamic and resource-constrained edge
settings (Yang et al., 2024; Khani et al., 2024). By embedding generative capabilities into the policy learning framework,
GPMs introduce a novel paradigm for autonomous governance. The adaptability observed in GPM agents resonates with
the theoretical expectations of generative policy spaces and supports broader application in real-time distributed
systems.
This research contributes to the theoretical discourse on edge AI autonomy by illustrating how generative methods can
self-produce diverse, context-sensitive policies. The use of transformers in generative RL proves particularly impactful
due to their ability to retain long-range dependencies and synthesize complex, temporally coherent action sequences;
this framework could underpin the next generation of autonomous edge systems, from disaster-response drones to
intelligent building automation, by reducing reliance on human operators and central coordination. Moreover, it
contributes to explainable AI by enabling retrospective inspection of policy generation pathways.
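The study does not state which attention variant its lightweight transformer uses. Assuming the standard scaled dot-product self-attention, the operation that lets each generated policy token draw on all earlier state and action tokens can be written as:

```latex
\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
\]
```

Here $Q$, $K$, and $V$ are the query, key, and value projections of the token sequence and $d_k$ is the key dimension. Each row of the softmax output is a set of non-negative weights over sequence positions that sums to one. These weights are one quantity that could in principle be logged for retrospective inspection, although the article does not say this is how inspection was performed.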
Research Limitations
Despite promising results, the study faced limitations that merit consideration. First, realistic simulations do not fully
capture the unpredictability of real-world edge environments. Second, although optimized, the training and inference
of transformer models still require a moderate computational budget, which may not be practical for ultra-low-power
devices. Third, the generalizability of learned policies across drastically different task domains remains challenging
despite observed adaptability within related contexts.
Future work should focus on testing GPMs in live edge deployments and further compressing model architectures for
microcontroller-level feasibility. Moreover, there remains a need to explore ethical implications and fail-safe
mechanisms in autonomous policy generation.
6. Conclusion
This research establishes the feasibility and effectiveness of Generative Policy Models for autonomous governance in
edge AI systems. GPMs enable self-evolving, adaptive, and efficient policy creation in decentralized
environments through transformer-based generative reinforcement learning. The models outperform traditional
baselines in decision-making speed, adaptability, and robustness; these contributions advance the field of edge AI by
offering a scalable framework for real-time, self-regulating systems operating under uncertainty and resource
constraints.
6.1. Future Research
Building upon this foundation, future research could integrate federated generative policy learning to facilitate
knowledge sharing across edge nodes while maintaining data privacy. Additionally, introducing continual learning
mechanisms would allow edge agents to adapt policies over longer operational lifecycles; furthermore,
hardware-specific optimizations, including neuromorphic implementations of GPMs, offer another promising direction. Finally,
policy explainability and regulatory compliance frameworks must be developed to ensure trustworthy deployment in
safety-critical applications such as healthcare robotics or autonomous transportation.
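The article does not name an aggregation scheme for federated policy learning. One common choice is sample-size-weighted parameter averaging in the style of FedAvg, in which each edge node shares only its model parameters and never its raw observations. The sketch below assumes parameters are stored as named lists of floats; the type alias, function name, and example values are hypothetical.

```python
from typing import Dict, List, Sequence

Weights = Dict[str, List[float]]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Sample-size-weighted average of per-node parameter vectors (FedAvg-style)."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Each coordinate is averaged independently across nodes.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


if __name__ == "__main__":
    # Two hypothetical edge nodes; the second has seen three times as much data.
    node_a = {"policy_head": [1.0, 2.0]}
    node_b = {"policy_head": [3.0, 6.0]}
    print(federated_average([node_a, node_b], [1, 3]))
```

Weighting by sample count lets nodes with more local experience influence the shared policy more. This is a design choice of the sketch, not a finding of the study.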
Compliance with ethical standards
Disclosure of conflict of interest
There is no conflict of interest to be disclosed.
References
[1] Bengesi, S., El-Sayed, H., Sarker, M. K., Houkpati, Y., Irungu, J., and Oladunni, T. (2024). Advancements in
Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers.
IEEE Access, 12, 69812–69837. https://doi.org/10.1109/ACCESS.2024.3397775
[2] Bourechak, A., Zedadra, O., Kouahla, M. N., Guerrieri, A., Seridi, H., and Fortino, G. (2023). At the confluence of
artificial intelligence and edge computing in IoT-based applications: A review and new perspectives.
Sensors, 23(3), 1639–1688. https://doi.org/10.3390/s23031639
[3] Cao, L. (2022). Decentralized AI: Edge intelligence and smart blockchain, metaverse, web3, and desci. IEEE
Intelligent Systems, 37(3), 6–19. https://doi.org/10.1109/MIS.2022.3181504
[4] Cao, Y., Sheng, Q. Z., McAuley, J., and Yao, L. (2023). Reinforcement learning for generative AI: A survey. ArXiv,
14(8), 1–30. https://doi.org/10.48550/arXiv.2308.14328
[5] Chen, J., Ganguly, B., Xu, Y., Mei, Y., Lan, T., and Aggarwal, V. (2024). Deep generative models for offline policy
learning: Tutorial, survey, and perspectives on future directions. ArXiv, 1–102.
https://doi.org/10.48550/arXiv.2402.13777
[6] Duan, S., Wang, D., Ren, J., Lyu, F., Zhang, Y., Wu, H., and Shen, X. (2022). Distributed artificial intelligence
empowered by end-edge-cloud computing: A survey. IEEE Communications Surveys and Tutorials, 25(1),
591–624. https://doi.org/10.1109/COMST.2022.3218527
[7] Golpayegani, F., Chen, N., Afraz, N., Gyamfi, E., Malekjafarian, A., Schäfer, D., and Krupitzer, C. (2024). Adaptation
in edge computing: a review on design principles and research challenges. ACM Transactions on Autonomous
and Adaptive Systems, 19(3), 1–43. https://doi.org/10.1145/3664200
[8] Hemmati, A., Raoufi, P., and Rahmani, A. M. (2024). Edge artificial intelligence for big data: a systematic review.
Neural Computing and Applications, 36(19), 11461–11494. https://doi.org/10.1007/s00521-024-09723-w
[9] Hu, X., Li, S., Huang, T., Tang, B., Huai, R., and Chen, L. (2023). How simulation helps autonomous driving: A survey
of sim2real, digital twins, and parallel intelligence. IEEE Transactions on Intelligent Vehicles, 9(1), 593–612.
https://doi.org/10.1109/TIV.2023.3312777
[10] Hussain, F., Hussain, R., Hassan, S. A., and Hossain, E. (2020). Machine learning in IoT security: Current solutions
and future challenges. IEEE Communications Surveys and Tutorials, 22(3), 1686–1721.
https://doi.org/10.1109/COMST.2020.2986444
[11] Khani, M., Sadr, M. M., and Jamali, S. (2024). Deep reinforcement learning‐based resource allocation in
multi‐access edge computing. Concurrency and Computation: Practice and Experience, 36(15), e7995.
https://doi.org/10.1002/cpe.7995
[12] Kurt, G. K., Khoshkholgh, M. G., Alfattani, S., Ibrahim, A., Darwish, T. S., Alam, M. S., and Yongacoglu, A. (2021). A
vision and framework for the high altitude platform station (HAPS) networks of the future. IEEE
Communications Surveys and Tutorials, 23(2), 729–779. https://doi.org/10.1109/COMST.2021.3066905
[13] Li, W., Hacid, H., Almazrouei, E., and Debbah, M. (2023). A comprehensive review and a taxonomy of edge machine
learning: Requirements, paradigms, and techniques. AI, 4(3), 729–786. https://doi.org/10.3390/ai4030039
[14] Rajapakse, V., Karunanayake, I., and Ahmed, N. (2023). Intelligence at the extreme edge: A survey on reformable
TinyML. ACM Computing Surveys, 55(13s), 1–30. https://doi.org/10.1145/3583683
[15] Shuvo, M. M. H., Islam, S. K., Cheng, J., and Morshed, B. I. (2022). Efficient acceleration of deep learning inference
on resource-constrained edge devices: A review. Proceedings of the IEEE, 111(1), 42–91.
https://doi.org/10.1109/JPROC.2022.3226481
[16] Walia, G. K., Kumar, M., and Gill, S. S. (2023). AI-empowered fog/edge resource management for IoT applications:
A comprehensive review, research challenges, and future perspectives. IEEE Communications Surveys and
Tutorials, 26(1), 619–669. https://doi.org/10.1109/COMST.2023.3338015
[17] Wu, J., Huang, C., Huang, H., Lv, C., Wang, Y., and Wang, F. Y. (2024). Recent advances in reinforcement
learning-based autonomous driving behavior planning: A survey. Transportation Research Part C: Emerging
Technologies, 164, 104654–104657. https://doi.org/10.1016/j.trc.2024.104654
[18] Yang, N., Chen, S., Zhang, H., and Berry, R. (2024). Beyond the edge: An advanced exploration of reinforcement
learning for mobile edge computing, its applications, and future research trajectories. IEEE Communications
Surveys and Tutorials, 27(1), 546–594. https://doi.org/10.1109/COMST.2024.3405075
[19] Zhu, J., Li, F., and Chen, J. (2024). A survey of blockchain, artificial intelligence, and edge computing for Web 3.0.
Computer Science Review, 54, 100667–100670. https://doi.org/10.1016/j.cosrev.2024.100667