
Data-Driven Orchestration for Distributed RAN Intelligent Controller Placement in 6G Networks

Author: Hashemi Nezhad, Elham; Di Maio, Antonio; Braun, Torsten
Publisher: Zenodo
DOI: 10.48620/78477
Source: https://zenodo.org/records/17670482/files/Data_Driven_Orchestration_for_Distributed_RAN_Intelligent_Controller_Placement_in_6G_Networks.pdf
Elham HashemiNezhad, University of Bern, Bern, Switzerland
Antonio Di Maio, University of Bern, Bern, Switzerland
Torsten Braun, University of Bern, Bern, Switzerland
ABSTRACT
Open Radio Access Network (O-RAN) brings an innovative approach to address the issues of controller placement in large-scale 6G networks. In this work, we introduce a Reinforcement Learning (RL) algorithm for decentralized RAN Intelligent Controller orchestration in 6G networks, which leverages the online learning capabilities of a multi-agent RL system. Our method achieves around 42-66% lower user latency and 9-14% higher user packet delivery ratio compared to state-of-the-art baselines in a broad range of simulated scenarios.
ACM Reference Format:
Elham HashemiNezhad, Antonio Di Maio, and Torsten Braun. 2025. Data-Driven Orchestration for Distributed RAN Intelligent Controller Placement in 6G Networks. In The 40th ACM/SIGAPP Symposium on Applied Computing (SAC '25), March 31-April 4, 2025, Catania, Italy. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3672608.3707972
1 INTRODUCTION
The evolution beyond 5G and the development of sixth-generation (6G) wireless networks call for an architectural transformation that supports service heterogeneity, coordinates multi-connectivity technologies, and enables on-demand service deployment. The O-RAN architecture includes two RAN Intelligent Controllers (RICs) that perform management and control of the network at two time scales: the Near-Real-Time (Near-RT) RIC operates between 10 ms and 1 s, and the Non-Real-Time (Non-RT) RIC operates at more than 1 s [3]. Optimal placement of controllers has a prominent effect on minimizing this response time. Distributing a minimum number of controllers at optimal locations to complete control functions promptly is known as the Controller Placement Problem (CPP) [1]. Distributed controllers are necessary to enhance network performance and ensure the sustainability of user connections. A single controller poses a single point of failure, compromising network reliability and availability. Several works have tackled the problem of distributed controller placement in different types of networks. Almeida et al. [2] proposed a RIC Orchestrator (RIC-O) to optimize the deployment of the Near-RT RIC components. The orchestrator seeks to place Near-RT RIC components on the closest edge computing nodes based on the least cost in an O-RAN, so it reduces latency and improves cost efficiency using
a greedy strategy. The approach also employs a monitoring system for the control loop. However, convergence may occur when all edge nodes host Near-RT RIC components, potentially increasing latency, particularly in large, dynamic 6G networks. Wu et al. [6] proposed a Deep Q-Network (DQN) approach based on deep RL for controller placement in Software-Defined Networks (SDN). This method optimizes latency and load-balancing metrics by optimally placing the controller and adjusting the switch-controller mapping according to the flow fluctuations. However, this method does not consider the number of controllers. Lyu et al. [5] proposed a fully distributed orchestration to minimize SDN's time-average cost by stochastically optimizing controllers' on-demand activation, an adaptive association of controllers and switches, and real-time request processing and dispatching. However, RL for intelligent and decentralized decision-making would be more efficient in meeting the dynamic, complex, and high-demand environment of 6G networks.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
SAC '25, March 31-April 4, 2025, Catania, Italy
©2025 Copyright held by the owner/author(s).
ACM ISBN 979-8-4007-0629-5/25/03
https://doi.org/10.1145/3672608.3707972

The main contributions of this research to tackle the mentioned issues are summarized as follows.
• Develop a multi-agent RL approach for distributed controllers to optimize transmission power allocation, thereby reducing average latency and maximizing packet delivery ratio across the network.
• Design a decentralized orchestration framework for efficient controller deployment and orchestration management using a multi-agent RL system.
2 METHODOLOGY
We assume the system operates within a Radio Access Network (RAN) deployment adhering to O-RAN specifications, as represented in Figure 1. The network topology is modeled as an undirected graph $G=(V,E)$, where $V=\{v_1,\dots,v_{|V|}\}$ represents a set of network devices. Moreover, we consider a set of edges $E=\{e_1,\dots,e_{|E|}\}$ representing the physical links between pairs of nodes in $V$. Each link in $E$ is characterized by its link latency $L_{ij}$ between devices $v_i\in V$ and $v_j\in V$. We assume that each base station in $V$ must adopt the optimal transmission power $P_i^t$ [W] towards its $i$-th connected User Equipment (UE). Our system model contains a set $\mathcal{C}\subseteq V$ of controllers, each of which is in charge of power allocation for the $N_c$ UEs associated with all base stations managed by controller $c$ (i.e., the controller domain). The average transmission power $P_t^c$ selected by controller $c$ for all its $N_c$ managed users at time step $t$ is defined as $P_t^c=\frac{1}{N_c}\sum_{i\in[N_c]}P_i^t$. A set $\mathcal{O}\subseteq V$ of orchestrators partitions the network into contiguous domains using a Voronoi-tessellation [4] approach. Each domain $K_o$ groups the RAN nodes in $V$ that have the lowest latency to the orchestrator $o\in\mathcal{O}$. Orchestrators are tasked with dynamically deploying controllers $\mathcal{C}_o$ on a set $K_o\subseteq V$ of possible deployment locations.
Figure 1: Example of deployment of controllers and orchestrators over the modeled physical network infrastructure
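The domain partitioning described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "lowest latency to the orchestrator" means the shortest-path latency over the link latencies $L_{ij}$, and the function names (`shortest_latencies`, `partition_domains`, `average_power`) and the toy graph are hypothetical.

```python
import heapq

def shortest_latencies(adj, source):
    """Dijkstra over an undirected graph; adj[u] = {v: link latency L_uv}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def partition_domains(adj, orchestrators):
    """Assign each node to the orchestrator with the lowest path latency.

    Assumption: ties go to the orchestrator listed first.
    """
    dist = {o: shortest_latencies(adj, o) for o in orchestrators}
    domains = {o: [] for o in orchestrators}
    for v in adj:
        best = min(orchestrators, key=lambda o: dist[o].get(v, float("inf")))
        domains[best].append(v)
    return domains

def average_power(powers):
    """Average transmission power of one controller over its N_c managed UEs."""
    return sum(powers) / len(powers)

# Hypothetical 4-node line topology with orchestrators at both ends.
adj = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 2.0},
    "c": {"b": 2.0, "d": 1.0},
    "d": {"c": 1.0},
}
domains = partition_domains(adj, ["a", "d"])
print(domains)
print(average_power([1.0, 2.0, 3.0]))
```

Each resulting domain plays the role of $K_o$, the candidate set on which orchestrator $o$ may deploy controllers.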
2.1 Controller Operation
Each controller allocates the transmission power to each user in its domain by leveraging a local RL agent. This power allocation process can be modeled as a sequential decision-making problem $(s_c,a_c,r_c)(t)$, where each controller adjusts its action $a_c(t)\in\mathcal{A}_c(t)$ in the environment, i.e., the transmission power allocated to each user at each time step $t$, based on the current system's state $s_c(t)\in\mathcal{S}_c(t)$ and a reward function $r_c(t)\in\mathcal{R}_c(t)$. We define the controller's state, action, and reward as follows.
2.1.1 Controller State. At each time step $t$, every controller builds the local state $s_c(t)$ by collecting system metrics such as the latency vector $L_c(t-1)$ and the SNR vector $\rho_c(t-1)$, which contain information about the communication latency between the controller and all managed UEs and the SNR received by all managed UEs at time step $t-1$. Moreover, the controller also collects an average transmission power vector $\mathbf{P}_t=(P_t^1,\dots,P_t^{|\mathcal{C}|})$ through inter-controller connections at every time step $t$, which contains the latest average transmission power selected by all controllers (itself and all others) at time step $t$. As the number of base stations and users $N_c$ managed by the generic controller $c$ varies over time, the dimension of the state space $\mathcal{S}_c(t)$ that contains the state $s_c(t)\in\mathcal{S}_c(t)$ is also time-varying, with $\mathcal{S}_c(t)\subseteq\mathbb{R}^{2N_c+|\mathcal{C}|}$. Equation 1 formally characterizes the $c$-th controller's state $s_c(t)$ at every time step $t\in\mathbb{N}$.
$$s_c(t)=\big(L_c(t-1),\ \rho_c(t-1),\ \mathbf{P}_{t-1}\big) \tag{1}$$
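As a rough illustration of Equation 1, the state can be assembled as a flat vector of length $2N_c+|\mathcal{C}|$. The function name `controller_state`, the units, and the sample values below are assumptions made for this sketch only.

```python
def controller_state(latency, snr, avg_powers):
    """Concatenate L_c(t-1), rho_c(t-1) and P_{t-1} into one flat state vector."""
    if len(latency) != len(snr):
        raise ValueError("latency and SNR vectors must both have N_c entries")
    return [*latency, *snr, *avg_powers]

# Hypothetical values: N_c = 3 managed UEs, |C| = 2 controllers.
state = controller_state(
    latency=[0.012, 0.020, 0.015],  # seconds, one entry per managed UE
    snr=[18.0, 12.5, 15.0],         # dB, one entry per managed UE
    avg_powers=[0.8, 1.1],          # watts, one entry per controller
)
print(len(state))  # 2 * N_c + |C|
```

Because $N_c$ changes over time, a practical agent would likely need padding or masking to keep a fixed observation size; the paper does not say how this is handled.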
2.1.2 Controller Action. Each controller $c$ must determine the transmission power for each user in its control domain by executing a local controller policy $\pi_c$. We define the controller agent's action $a_c(t)\in\mathcal{A}_c(t)=[0,P_{\max})^{N_c}$, where $P_{\max}$ represents hardware or regulatory limitations of base stations on the maximum transmission power.
2.1.3 Controller Reward. The reward function $r_c(t)\in\mathcal{R}_c(t)=\mathbb{R}_+$ of the generic controller $c$ at time step $t$ is the average of the packet delivery ratios $Q_i^t$ of all UEs in the $c$-th controller's domain at time step $t$ (Equation 2), which the controller seeks to maximize.
$$r_c(t)=\frac{1}{N_c}\sum_{i\in[N_c]}Q_i^t \tag{2}$$
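The action space and reward above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: clipping is only one possible way to keep raw policy outputs inside $[0,P_{\max})^{N_c}$, and the names `clip_action` and `controller_reward` are hypothetical.

```python
import math

def clip_action(powers, p_max):
    """Project raw per-UE powers onto the half-open box [0, P_max)^{N_c}."""
    # Largest float strictly below P_max (math.nextafter needs Python 3.9+).
    upper = math.nextafter(p_max, 0.0)
    return [min(max(p, 0.0), upper) for p in powers]

def controller_reward(delivery_ratios):
    """Equation 2: mean packet delivery ratio over the controller's UEs."""
    if not delivery_ratios:
        raise ValueError("the controller must manage at least one UE")
    return sum(delivery_ratios) / len(delivery_ratios)

# Hypothetical raw policy outputs in watts, with P_max = 2.0 W.
action = clip_action([-0.5, 0.7, 3.0], p_max=2.0)
reward = controller_reward([0.5, 0.75, 1.0])
print(action, reward)
```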
2.2 Orchestrator Operation
We propose a decentralized orchestration framework where orchestrators employ a multi-agent RL algorithm to deploy controllers; the decisions at each time step rely on the outcomes of previous steps. The process can be represented using the tuple $(s_o,a_o,r_o)(t)$, where each orchestrator performs an action $a_o(t)\in\mathcal{A}_o(t)$ on the environment to select controller nodes at each time step $t$ based on the current system's state $s_o(t)\in\mathcal{S}_o(t)$ and obtains a reward $r_o(t)\in\mathcal{R}_o(t)$. The orchestrator's state, action, and reward are defined as follows.
2.2.1 Orchestrator State. Each orchestrator builds its state $s_o(t)$ by gathering the controller-user latency vector $L_o^{\mathrm{cu}}(t-1)$ and the orchestrator user-count vector $N_o(t-1)$ at time step $t-1$. Each agent also observes the latency between the orchestrator and all managed controllers in the previous time step as the orchestrator-controller latency vector $L_o^{\mathrm{oc}}(t-1)$, so that it can deploy controllers with the lowest possible orchestrator-controller latency at time step $t$. Each orchestrator also collects the number-of-controllers vector $\mathbf{C}_{t-1}=(|\mathcal{C}_1|,\dots,|\mathcal{C}_{|\mathcal{O}|}|)_{t-1}\in\mathbb{N}^{|\mathcal{O}|}$ through inter-orchestrator connections, which represents the latest number of controllers managed by each orchestrator $o$ at time step $t-1$. Equation 3 describes the $o$-th orchestrator's state $s_o(t)$ at every time step $t\in\mathbb{N}$.
$$s_o(t)=\big(L_o^{\mathrm{cu}}(t-1),\ N_o(t-1),\ L_o^{\mathrm{oc}}(t-1),\ \mathbf{C}_{t-1}\big) \tag{3}$$
2.2.2 Orchestrator Action. The action $a_o(t)$ of a single orchestrator agent is to select controller nodes in the orchestrator domain as a binary decision by executing a local orchestrator policy $\pi_o$. The agent favors nodes with lower average user latency, more users, and lower latency to the orchestrator: a node selected as a controller is assigned 1, and every other node is assigned 0. The orchestrator agent's action is defined as $a_o(t)\in\{0,1\}^{|K_o|}$, a binary vector that marks each candidate node as a controller or a non-controller.
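2.2.3 Orchestrator Reward. The reward function $r_o(t)$ in the RL framework of each orchestrator is designed to minimize user latency within the controller's domain and the latency between the controller and its orchestrator node (Equation 4), where $L_o^{\mathrm{cu}}(t)$ denotes the controller-user latency vector and $L_o^{\mathrm{oc}}(t)$ the orchestrator-controller latency vector.
$$r_o(t)=-\lVert L_o^{\mathrm{cu}}(t)\rVert-\lVert L_o^{\mathrm{oc}}(t)\rVert \tag{4}$$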
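The binary orchestrator action and the reward of Equation 4 can be illustrated with the sketch below. The paper does not state which norm is used, so the Euclidean norm here is an assumption, as are the function names and the sample latencies.

```python
import math

def selected_controllers(candidates, action):
    """Decode a binary action over the candidate set K_o into controller nodes."""
    if len(candidates) != len(action):
        raise ValueError("action needs one bit per candidate node")
    return [node for node, bit in zip(candidates, action) if bit == 1]

def l2_norm(vector):
    """Euclidean norm; the choice of norm is an assumption of this sketch."""
    return math.sqrt(sum(x * x for x in vector))

def orchestrator_reward(ctrl_user_latency, orch_ctrl_latency):
    """Equation 4: negative sum of the norms of the two latency vectors."""
    return -l2_norm(ctrl_user_latency) - l2_norm(orch_ctrl_latency)

# Hypothetical domain with four candidate nodes.
nodes = ["n1", "n2", "n3", "n4"]
chosen = selected_controllers(nodes, [1, 0, 0, 1])
reward = orchestrator_reward([3.0, 4.0], [6.0, 8.0])
print(chosen, reward)
```

Lower latencies in either vector move the reward closer to zero, which is the direction the orchestrator policy is trained to favor.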
3 EXPERIMENTAL EVALUATION
In our evaluation, we conduct simulations to verify the performance of our method in terms of latency, transmission power, and packet delivery ratio. We compare three environments under the same network conditions and topology: Single Orchestrator - Single Controller (SOSC) and Single Orchestrator - Distributed Controllers (SODC), which are both implemented by a single-agent system, and the proposed method, which uses a multi-agent system.
Figure 2: The average user latency $L_N$ across the number $N$ of User Equipment
In our experiment, we assume that each node $i\in V$ is located at a random position $(x_i,y_i)$ on a 2D plane, and we define $d_{ij}$ as the Euclidean distance between the positions of nodes $i$ and $j$. The latency between two nodes $i$ and $j$ is defined as $L_{ij}=L_{ij}^{\mathrm{p}}+L_{ij}^{\mathrm{t}}+L_{ij}^{\mathrm{q}}+L_{ij}^{\mathrm{c}}$. The propagation latency $L_{ij}^{\mathrm{p}}=\frac{d_{ij}}{\eta_{ij}}$ between two nodes is the ratio between their distance $d_{ij}$ and the speed of electromagnetic waves $\eta_{ij}$ in the transmission medium between nodes $i$ and $j$. The transmission latency is $L_{ij}^{\mathrm{t}}=\frac{S}{R_{ij}}$, where $S$ [bit] is the packet size and $R_{ij}$ [bit/s] is the transmission rate. $L_{ij}^{\mathrm{q}}$ represents the queuing latency, i.e., the time a packet spends in a queue before it can be transmitted. Finally, the processing latency $L_{ij}^{\mathrm{c}}$ reflects the time required for the devices to analyze and route packets.
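The four-component latency model can be evaluated directly, as in the following sketch. The function name `link_latency` and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow; queuing and processing latencies are passed in as given numbers because the paper does not specify how they are generated.

```python
import math

def link_latency(pos_i, pos_j, wave_speed, packet_bits, rate_bps,
                 queuing=0.0, processing=0.0):
    """Total latency L_ij = propagation + transmission + queuing + processing."""
    distance = math.dist(pos_i, pos_j)      # Euclidean distance d_ij
    propagation = distance / wave_speed     # L^p_ij = d_ij / eta_ij
    transmission = packet_bits / rate_bps   # L^t_ij = S / R_ij
    return propagation + transmission + queuing + processing

# Toy numbers: d_ij = 5, so propagation = 2.5; transmission = 1.0.
total = link_latency((0.0, 0.0), (3.0, 4.0), wave_speed=2.0,
                     packet_bits=1000, rate_bps=1000,
                     queuing=0.25, processing=0.25)
print(total)
```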
We also define the packet delivery ratio in each user transmission as $Q_i^t=e^{-\alpha d_i N_c}$, which is used to simulate the packet delivery ratio $Q_i^t$ based on $d_i$, the distance between the $i$-th UE and its associated base station, and the controller node load $N_c$ at time step $t$. The coefficient $\alpha\in(0,+\infty)$ jointly controls the impact of distance and load on packet delivery.
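A short sketch of this delivery-ratio model shows how distance and controller load interact. The function name and the value of $\alpha$ are assumptions for illustration; the paper does not report the value it uses.

```python
import math

def packet_delivery_ratio(distance, controller_load, alpha):
    """Q_i^t = exp(-alpha * d_i * N_c); equals 1 at zero distance or zero load."""
    if alpha <= 0:
        raise ValueError("alpha must be strictly positive")
    return math.exp(-alpha * distance * controller_load)

# Hypothetical alpha; the ratio drops as the controller serves more UEs.
alpha = 0.01
ratios = [packet_delivery_ratio(0.5, n_c, alpha) for n_c in (10, 50, 100)]
print(ratios)
```

Under this model, splitting users across more controllers lowers $N_c$ per controller and therefore raises $Q_i^t$ for a UE at the same distance.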
The nodes in $V$ are randomly deployed in a normalized unit square $[0,1]^2$ in our simulation environment. We consider the number of users $N\in\{50,100,\dots,500\}$ to compare the performance of the selected baselines under different numbers of users in the system. We implement our RL-based methods with Python and use Ray RLlib to train the Proximal Policy Optimization (PPO) algorithm, which optimizes policy performance for orchestrators and controllers.
Figure 2 demonstrates that the proposed method consistently achieves lower user latency than both SOSC and SODC as the number of UEs increases. The proposed method's average user latency $L_N$ is almost 42% lower than SODC and around 66% lower than SOSC at the UE level. The proposed method attains the lowest average latency because each controller is placed at the lowest latency from the users in its controller domain, minimizing delays in their users' transitions.
Figure 3 shows the packet delivery ratio $Q_i^t$ for different numbers of UEs: the proposed method consistently achieves a ratio approximately 9% higher than SODC and 14% higher than SOSC. This outcome arises from using distributed controllers and decentralized orchestration, which effectively balance the workload among controllers. By deploying controllers with the lowest latency from their users, the communication and decision-making delay between the controllers and the users is minimized, leading to quicker delivery of packets.
Figure 3: Average user packet delivery ratio $Q_i^t$ across the number $N$ of User Equipment
4 CONCLUSION
This paper addresses the Near-RT RIC placement problem, crucial for managing UEs in the O-RAN architecture. We propose a multi-agent approach where decentralized orchestrators place controllers to minimize latency, sharing data to optimize controller placement within each domain. Controllers act as agents, adjusting user transmission power based on latency and Signal-to-Noise Ratio (SNR) observations. Extensive experiments show that the proposed method reduces latency and improves packet delivery ratios, outperforming state-of-the-art baselines.
ACKNOWLEDGMENTS
This work was funded by the SNS-JU 6G Cloud project under the European Union's Horizon Europe Research and Innovation Programme under Grant Agreement No. 101139073.
REFERENCES
[1] Mohammad Abdel-Rahman, Emadeldin Mazied, Fahid Hassan, Kory Teague, Atheer Al-Shaggah, Allen MacKenzie, Scott Midkiff, and Kleber V Cardoso. 2023. A Stochastic Optimization Framework for Joint RAN Intelligent Controller Placement and RAN Nodes Assignment in O-RAN Networks. Authorea Preprints (2023).
[2] Gabriel Matheus Almeida, Gustavo Zanatta Bruno, Alexandre Huff, Matti Hiltunen, Elias Procopio Duarte, Cristiano Bonato Both, and Kleber Vieira Cardoso. 2024. RIC-O: Efficient Placement of a Disaggregated and Distributed RAN Intelligent Controller With Dynamic Clustering of Radio Nodes. IEEE Journal on Selected Areas in Communications 42, 2 (2024), 446–459. https://doi.org/10.1109/JSAC.2023.3336159
[3] Leonardo Bonati, Salvatore D'Oro, Michele Polese, Stefano Basagni, and Tommaso Melodia. 2021. Intelligence and Learning in O-RAN for Data-Driven NextG Cellular Networks. IEEE Communications Magazine 59, 10 (2021), 21–27.
[4] Zakhar Kabluchko and Christoph Thäle. 2021. The Typical Cell of a Voronoi Tessellation on the Sphere. Discrete & Computational Geometry 66, 4 (2021), 1330–1350.
[5] Xinchen Lyu, Chenshan Ren, Wei Ni, Hui Tian, Ren Ping Liu, and Y Jay Guo. 2018. Multi-Timescale Decentralized Online Orchestration of Software-Defined Networks. IEEE Journal on Selected Areas in Communications 36, 12 (2018), 2716–2730.
[6] Yiwen Wu, Sipei Zhou, Yunkai Wei, and Supeng Leng. 2020. Deep reinforcement learning for controller placement in software defined network. In IEEE INFOCOM 2020-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, 1254–1259.