Agora: A Distributed Language Model Framework With API-Call Support for Integrated Climate Forecasting

Author: Udrescu, Alexandra; Popovici, Dan-Matei

Publisher: Zenodo

DOI: 10.1109/ACCESS.2025.3568028

Source: https://zenodo.org/records/17659653/files/Agora_A_Distributed_Language_Model_Framework_With_API-Call_Support_for_Integrated_Climate_Forecasting.pdf

Received 6 April 2025, accepted 24 April 2025, date of publication 8 May 2025, date of current version 19 May 2025.
Digital Object Identifier 10.1109/ACCESS.2025.3568028
Agora: A Distributed Language Model Framework With API-Call Support for Integrated Climate Forecasting
ALEXANDRA UDRESCU AND DAN-MATEI POPOVICI
Computer Science Department, National University of Science and Technology POLITEHNICA Bucharest, 060042 Bucharest, Romania
Corresponding author: Dan-Matei Popovici (matei.popovici@upb.ro)
This work was supported in part by the European Union through the FUTURAL Project-Empowering the FUTure through innovative Smart Solutions for rURAL areas (HORIZON EUROPE) under Project 101083958, and in part by Unitatea Executivă pentru Finanţarea Învăţământului Superior, a Cercetării, Dezvoltării şi Inovării (UEFISCDI) through the Project FUTURAL-Soluţii inteligente inovatoare pentru zonele rurale (Orizont Europa Instituţii) under Project 020234823. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.
ABSTRACT We introduce Agora, a Generative AI-driven system that delivers expert answers and recommendations on climate and agriculture, transforming complex data into clear, natural language explanations. While built for the rural domain, Agora is highly adaptable and can be deployed across various domain applications. It operates as a "mixture-of-experts" language model system, selectively utilizing multiple fine-tuned large language models for inference. By dynamically integrating external data through API calls, Agora ensures real-time, contextually relevant responses. Agora is built for extensibility: it seamlessly integrates new APIs and domains without requiring a full system retrain. Developed entirely with open-source large language models from the LLaMA family, Agora remains open and adaptable, allowing anyone to extend and enhance its capabilities. Optimized for accessibility, Agora runs efficiently on commodity GPUs without compromising performance. By eliminating the need for expensive hardware like NVIDIA's A100, it makes text generation more affordable and widely accessible. Agora outperforms closed-source models, achieving 78% accuracy on our question-answering benchmark. This result is achieved via dynamic API integration, which pulls in real-time external data, making responses more adaptive, precise, and context-aware.
INDEX TERMS API-call orchestration, API-call support, forecast generation, large language model, model fine-tuning, natural language processing.
I. INTRODUCTION
In many Eastern European countries, particularly Romania, a significant gap exists between technological advancements and the practices prevalent in rural areas. Despite the widespread availability of Internet connectivity (Romania ranks within the top 54 of 140 countries for mobile Internet speed [1]), access to expert data, smart services, and cutting-edge information remains low in these regions.

A report [2] from the Black Sea Basin Programme revealed that approximately one million people in Romania, along with their families, are disconnected from modern advancements. This is further supported by the statistics shown in Figure 1. The same report highlights that 97% of Romanian farms are micro- and subsistence enterprises, typically family owned, covering up to 10 ha. These farms employ at least half the agricultural workforce available in Romania. Although specialized weather forecasts, seasonal cropping information, and insights into the effects of climate change are readily available [3],[4],[5], many potential beneficiaries in rural Romania are not utilizing them.

The associate editor coordinating the review of this manuscript and approving it for publication was Arianna D'Ulizia.
This disconnect arises because many rural users are unfamiliar with the technologies underpinning these services. The process of installing and navigating apps, datasets and API clients, coupled with the growing complexity of technical interfaces such as satellite and radar projections or multimodel forecasts, can be daunting. Therefore, converting complex information into a clear, accurate, and specific text-based format is a key obstacle to the widespread adoption of modern smart farming practices in the near future.

2025 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ VOLUME 13, 2025
A. Udrescu, D.-M. Popovici: Agora: A Distributed Language Model Framework With API-Call Support

FIGURE 1. Awareness about smart farming applications [2].
As part of a broader effort to support rural communities [6], our objective was to develop a system capable of accessing expert data from diverse third-party sources, reasoning about such data, and delivering actionable conclusions and recommendations. By leveraging the power of Large Language Models (LLMs), which excel in processing, understanding, and generating human language, we aim to provide rural communities with easy access to expert knowledge and insights without requiring deep technical expertise.

LLMs, built on the groundbreaking Transformer architecture [7], have become state-of-the-art tools, attracting unprecedented attention and research effort, and are now widely adopted across diverse applications. They power chatbots capable of human-like conversations and improve programming tools with features such as code generation and explanation. The latest language models, with hundreds of billions of parameters, demonstrate advanced reasoning abilities and can perform logical reasoning to some extent.
However, language models are not out-of-the-box solutions for many applications, including ours. LLMs suffer from three limitations: train-data dependency (they are constrained by the data they have been trained on, which limits their reasoning abilities, as illustrated in [8]); in-context reasoning limitations (they struggle with fundamental reasoning tasks such as calculating minimum, maximum, or average values across data series, and often fail to decide when to perform factual checks instead of performing text generation); and prohibitive training costs (they demand extensive memory and GPU resources, with fine-tuning costs starting at hundreds of dollars and escalating to hundreds of thousands for training from scratch).
In this paper, we introduce Agora, an advanced system powered by language models specifically designed to address queries related to weather forecasts, climate, and crop management, while effectively overcoming the challenges mentioned earlier. Agora can invoke third-party APIs to improve its answers. Moreover, it can efficiently orchestrate multiple API calls, allowing it to trigger additional requests when needed to retrieve supplementary data, thereby ensuring more comprehensive and accurate answers to user queries. For instance, to answer a question such as:

Can the cultivation of tomatoes thrive in the climatic conditions around the city of Piteşti, Romania?

Agora recognizes that simply retrieving data on optimal sowing conditions for tomatoes is insufficient. To deliver a precise response, it also initiates a secondary call to obtain historical climate data for Piteşti. In another example:

What is the coolest month in the city of Sibiu?

Agora retrieves monthly temperature averages for the given location and sends them to an aggregation API to perform the min operation over the given interval.
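
The "coolest month" query can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the function names `aggregate` and `coolest_month` are hypothetical, and the temperature values are made-up placeholders rather than real climate records for Sibiu.

```python
# Illustrative sketch only: names and values are assumptions, not Agora's API.

# Hypothetical monthly temperature averages (degrees Celsius, rounded to integers).
MONTHLY_AVG_TEMP = {
    "January": -3, "February": -1, "March": 4, "April": 10,
    "May": 15, "June": 18, "July": 20, "August": 19,
    "September": 15, "October": 9, "November": 4, "December": -1,
}


def aggregate(op, values):
    """Stand-in for an aggregation API supporting min, max and avg over a list."""
    operations = {
        "min": min,
        "max": max,
        "avg": lambda v: sum(v) / len(v),
    }
    if op not in operations:
        raise ValueError(f"unsupported aggregation: {op}")
    return operations[op](values)


def coolest_month(monthly):
    """Delegate the min operation to the aggregation step, then map it back to a month."""
    lowest = aggregate("min", list(monthly.values()))
    return next(month for month, temp in monthly.items() if temp == lowest)


print(coolest_month(MONTHLY_AVG_TEMP))
```

The point of the split is that the language model never has to compute the minimum itself; it only has to decide that an aggregation call is needed and pass the retrieved series to it.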
In addition to answering questions accurately, Agora is highly modular. Instead of relying on a single large-scale model, it utilizes smaller domain-specific language models, each tailored to handle specific types of data. These models are coordinated by a large interviewer model that integrates and processes their outputs.

The domain-specific models that we trained focused on agriculture, weather, and climate data. However, Agora is flexible and can be deployed for a wide range of tasks, with the ability to scale and incorporate additional domains as needed.

Most importantly, by distributing expertise across multiple models, Agora is scalable. Using multiple models with a relatively low number of parameters, it supports training and inference that can be efficiently performed on commodity GPUs.
Agora addresses a key gap in current research, which is largely focused on training massive, general-purpose models with broad knowledge and billions of parameters. These large-scale systems, often proprietary, are out of reach for academia, small businesses, and other resource-constrained users. They come with usage costs and lock users into closed ecosystems. In contrast, we believe the future lies in smaller, high-expertise models tailored to specific domains. These models can be trained and deployed on modest hardware, are easier to evaluate, and rely on expert data to produce reliable, targeted responses. Agora is built with this vision in mind.

In Section II we outline the challenges encountered during the development of Agora and explain how its design effectively addresses each of these issues. We explain Agora's implementation in Section III. In Section IV we discuss model training and evaluation. In Section V we provide an in-depth review of the existing approaches that follow similar directions. In Section VI we discuss limitations and finally, in Section VII we present conclusions and future work.
II. DESIGN CHALLENGES WHEN BUILDING AGORA
To begin designing Agora, we first explored the existing domain-specific models relevant to our areas of interest. The development of large language models (LLMs) has made significant progress in fields such as agriculture [9] and climate science [4],[10]. Such models were created by opting for a base language model, which may be either open-source (e.g., the LLaMA [11] family) or proprietary (e.g., the GPT-4 family). Next, two main approaches are employed. The first relies on (i) prompt engineering, where the model is guided by instructions and/or examples to answer domain-specific questions, as seen in [4]. This may involve augmenting prompts with relevant dataset fragments and explanatory information to improve response accuracy. Alternatively, (ii) the model can be fine-tuned using a dataset of example queries and the corresponding target answers.
Our experience has shown that prompt engineering tends to perform well on large models (with more than 100 billion parameters). Fine-tuning can yield good results with much smaller models (e.g., the LLaMA2 7B model [11]). However, both methods are inherently constrained by their reliance on data encountered during training, which limits their reasoning abilities to that information.
Retrieval-Augmented Generation (RAG) [12] is a common approach to addressing this limitation. RAG operates in two phases: first, a retriever analyzes the user query and retrieves relevant data from a database or external source. Second, these data are appended to the user query as context, which the LLM then uses to generate its response.
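
The two RAG phases can be sketched in a few lines of Python. This is a deliberately simplified illustration: the retriever below ranks documents by plain word overlap, which is a stand-in chosen for brevity and is not BM25 or any retriever used by the systems cited here; the function names and the prompt layout are likewise assumptions.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, documents, k=1):
    """Phase 1: rank documents by word overlap with the query (a toy retriever)."""
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Phase 2: attach the retrieved text to the query as context for the LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


docs = [
    "Tomatoes are sown in spring.",
    "Sibiu is a city in Romania.",
]
print(build_prompt("Where is Sibiu?", docs))
```

A keyword-style retriever of this kind works when the answer sits in a passage that shares words with the question; it has no way to select the right slice of a numerical table.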
We found that typical retrieval components in RAG are often simple, typically employing basic search methods such as BM25 [13], which use keyword matching and do not scale well to complex numerical datasets. For example, considering the query from the previous section: "What is the coolest month in the city of Sibiu?", a BM25-based retriever struggled to identify the specific dataset slice necessary to answer this query. At best, the entire dataset can be appended to a user query. However, this approach is impractical because language models have a limited context window (e.g., LLaMA 2's 4096-token limit is equivalent to approximately 3000 words).
To address these issues, we examined methods that support invoking external APIs during text generation, such as [8]. The authors proposed a language model that has been trained to selectively pause text generation, invoke an API and incorporate the response into ongoing text generation. This methodology, which includes dataset creation and the resulting model, is termed Toolformer. Toolformer leverages the inherent reasoning abilities of the small-size 6B-parameter GPT-J model [14] to automatically annotate an existing dataset with API calls. Subsequently, the dataset is used to fine-tune the same model to teach it where to perform API calls and how to select the appropriate parameters. Further details are presented in Section III.
By experimenting with Toolformer we identified five criteria that were essential in creating Agora:
1) API-call support: Our system needs to learn when and how to perform external API invocations in order to support its answer-generation process.
2) API-call orchestration: In many cases, API responses may influence the construction of subsequent API calls. For instance, for many user queries, we first gather data regarding monthly precipitation, then average it, and based on the resulting value search for crops whose optimal conditions match those precipitation averages. The system needs to support such dependencies between API calls.
3) API scalability: New data sources exposed via new APIs should be easy to integrate without requiring an entire retraining of the system.
4) System scalability: Training the entire system should be achievable on commodity GPUs, with minimal costs.
5) Open-source reliance: The system should exclusively utilize open-source models for inference, allowing for easy adaptation and implementation in various environments.

FIGURE 2. Agora architecture.
Of these five criteria, Toolformer satisfies only criteria 1, 4 and 5. In addition, we found that criteria 2 and 3 conflict when attempting to create a single language model. As the type of orchestration between API calls becomes more elaborate, smaller models tend to under-perform, and the data size required for training increases exponentially with the number of APIs. Therefore, in order to build Agora, we experimented with the idea of creating a multi-model system that separates the task of properly addressing API calls and responses from the task of API-call orchestration.
III. AGORA ARCHITECTURE
The Agora was a central public space in ancient Greece where citizens gathered to discuss public matters and make decisions. Our system is inspired by Agora's concept. It consists of a collection of models $E_1, \dots, E_n$, henceforth called experts, together with a special model $M_I$, which we term the interviewer model.
Whenever a user query is received, $M_I$, as well as all the experts, generate one token at a time simultaneously on the same input prompt. At specific points during this process, when necessary, $M_I$ delegates control to an expert model $E_i$ to continue the response. This process is illustrated in Figure 2, which shows the Interviewer model coordinating with three expert models. Each expert is trained to invoke API calls to retrieve data from third-party repositories, databases, or other external sources. For complex APIs, we train a dedicated model to handle those interactions. In contrast, for simple APIs that can be learned from examples, a single expert is trained to manage multiple call types.

FIGURE 3. Agora architecture.

Once an expert model $E_i$ completes its task, control returns to $M_I$, which resumes text generation until another expert is required or the response is complete. To clarify this text generation workflow, we provide an example in Figure 4. The user question is shown in gray. The text generated by $M_I$ is illustrated in blue, whereas that generated with the aid of expert $E_1$ is shown in orange. $E_1$ is the weather forecast and climate model. To properly address this question, $M_I$ notices that temperature information is necessary and uses $E_1$ to generate the appropriate text.
Figure 3 illustrates how the Interviewer model orchestrates API interactions between two expert models. In Step 1, the Interviewer delegates the initial part of the response to the first expert. This expert issues an API call and uses the retrieved data to generate its contribution (Step 2). In Step 4, the Interviewer invokes a second expert to continue the response. This second expert relies on the output of the first to construct a valid API call, highlighting a dependency between the two experts. Such dependencies can span multiple experts and occur in arbitrary sequences.
A concrete example of this interaction is shown in Figure 5, which involves two experts, $E_1$ and $E_2$. $E_1$ specializes in climate-related data, while $E_2$ focuses on crop-related knowledge. As before, the question is shown in grey. Since climate data is needed first, the Interviewer starts text generation with $E_1$. When crop-specific information becomes relevant, $E_2$ is engaged to continue the response. Finally, $M_I$ synthesizes the complete answer based on the outputs from both experts.

Next we discuss how each type of model (interviewer/experts) has been built.
A. EXPERT MODELS
Expert models are trained independently and their task is to perform accurate calls to a designated API (or APIs). In this work we consider four APIs: (i) current date, which is necessary when queries use time expressions relative to the current moment in time, such as now, today, tomorrow; (ii) weather & climate, which retrieves weather forecasts as well as queries related to precipitation, wind and temperature records for up to 10 years in the past; (iii) crops, which retrieves agricultural data regarding optimal planting periods and precipitation requirements from a collection of data sources; and (iv) aggregation, which computes maximal, minimal and average values over lists of integers. We trained three experts to handle these four APIs. Each expert model was trained via fine-tuning, starting from the same base LLaMA-3-8B-Instruct model [15].

FIGURE 4. Generating answers with Agora.
LLaMA-3-8B-Instruct is already fine-tuned for instruction-following [15], making it a strong foundation for task-specific dialogue and assistant behavior. This means better alignment and usability out of the box, especially for structured, guided outputs like ours. Despite its smaller size, it matches or outperforms larger models such as LLaMA-2-13B and GPT-3.5, while remaining lightweight enough to run on a single commodity GPU [15]. Just as importantly, LLaMA-3 is open-source, giving us the flexibility to fine-tune and deploy the model privately, ensuring compliance with sensitive data requirements.
The API call representation is inspired by [8]. For a given API $a$, a call is a pair $(a, i_1, \dots, i_n)$ where $a$ designates the call name, while $i_1, \dots, i_n$ designate the $n$ parameters of the call. API calls are encoded as arbitrary word sequences identified only with special tokens marking the start and end of a call, as illustrated below:

<API_CALL_a> a(i_1, ..., i_n) </API_CALL_a>

Thus, during text generation, whenever the expert model predicts <API_CALL_a> as the next token to be generated, the following steps occur:
• The model continues generation internally until the complete sequence of the call has been generated, ending when </API_CALL_a> is produced;
• The actual call a(i_1, ..., i_n) of API $a$ is performed and the response sequence is retrieved and appended to the current context.
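
The two steps above can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: it assumes a single generic `<API_CALL> ... </API_CALL>` marker pair instead of per-API tokens, and the registry entries `current_date` and `aggregate_min` are hypothetical stubs with fixed or trivial behavior.

```python
import re

# Assumed marker format: one generic pair of start/end tokens around name(args).
CALL_PATTERN = re.compile(r"<API_CALL>\s*(\w+)\((.*?)\)\s*</API_CALL>")

# Hypothetical API stubs; a real system would call external services here.
REGISTRY = {
    "current_date": lambda: "2025-05-08",  # fixed value for reproducibility
    "aggregate_min": lambda *xs: str(min(int(x) for x in xs)),
}


def execute_call(generated):
    """Find a completed call in the generated text, run it, append the response."""
    match = CALL_PATTERN.search(generated)
    if match is None:
        return generated  # no call was produced; nothing to do
    name, raw_args = match.group(1), match.group(2)
    args = [arg.strip() for arg in raw_args.split(",") if arg.strip()]
    response = REGISTRY[name](*args)
    # The response becomes part of the context used for subsequent tokens.
    return generated + " " + response


print(execute_call("The minimum is <API_CALL> aggregate_min(4, 2, 9) </API_CALL>"))
```

In this sketch the model's only responsibilities are producing a well-formed call and deciding where it belongs in the text; the computation itself happens outside the model.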
Through fine-tuning, expert models $E_i$ learn how to construct valid API calls, and by carefully designing the training data, they also learn when API calls should take place. In Section IV we go into more detail on our approach to training experts as well as dataset instrumentation. More details regarding the API syntax can be found in Appendix A.

FIGURE 5. Agora with two experts.
B. LIMITATIONS WHEN BUILDING ONE EXPERT ACROSS MULTIPLE APIS
Using careful instrumentation of the training dataset [8], a model can be trained to perform calls to multiple APIs. However, this approach does not scale well when the number $n$ and the complexity of the different supported APIs increase. Suppose $a$ (see footnote 1) is the number of fine-tuning examples that a model needs in order to accurately learn one API type (e.g. the Weather API). If a completely new and independent API type (say Crops) is to be added, another $b$ examples would suffice. However, if we would like to capture possible dependencies between the relative position of one API call with respect to the other in the text (e.g. many Crop calls are likely to have a Weather call prior to them, and Crop answers may influence how the Weather calls are being performed), then the training data needs to have entries where one call occurs in relation to another (order $O(a \times b)$ entries), as well as entries where calls occur independently. Hence, the complete training dataset for learning the two APIs with dependencies between them is as follows:

$$O(a \cdot b + a + b) \qquad (1)$$

This illustrates the API-call scalability problem highlighted in Section II. As the number $n$ of APIs increases, the dataset size required to learn them increases exponentially with respect to $n$.
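
To give a feel for the magnitudes in Eq. (1), the short Python sketch below plugs in concrete numbers. It drops the constants hidden by the O-notation and assumes, purely for illustration, that both APIs need 10,000 examples each (the upper bound the paper reports for $a$; the value for $b$ is an assumption).

```python
def joint_dataset_size(a, b):
    """Entries needed by one model covering two dependent APIs, following Eq. (1)."""
    return a * b + a + b


def separate_dataset_size(a, b):
    """Entries needed when each API is learned by its own independent expert."""
    return a + b


a = b = 10_000  # assumed example counts per API
print(joint_dataset_size(a, b))     # one model, dependencies captured in the data
print(separate_dataset_size(a, b))  # two independent experts
```

Under these assumptions the joint dataset is dominated by the $a \cdot b$ term, which is what motivates moving the dependency handling out of the training data and into a separate coordinating model.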
Moreover, the dependencies between API calls are not solely positional. Consider the example in Fig. 5. To assess whether peach trees are suitable for planting in the area of Constanta, we need to have knowledge about precipitation averages in Constanta from the Weather API. More generally, a response from an API call may influence the manner in which another subsequent call is performed. Training data size is not the only limitation: as the number of fine-tuning examples increases, small models such as LLaMA-3-8B are no longer capable of sustaining such intricate correlations, and their association performance degrades.

Footnote 1: Our experience shows that $a \le 10000$. More details in Section IV.

Thus, our solution replaces the "single model" scenario, as well as the massive dataset required for fine-tuning, with several smaller expert models, each adapted to a given domain via a straightforward fine-tuning task. To correlate the text generated by such experts, we require another dedicated model.
C. THE INTERVIEWER MODEL
The interviewer model ($M_I$) was trained specifically to handle the task of expert model moderation. More specifically, $M_I$ is responsible for deciding when an expert LLM should be used to generate parts of the answer, as well as for integrating these parts in the overall answer, whenever necessary. We implemented several options to achieve this moderation. The first, termed sequential control, is shown in Fig. 6. Informally, in this setup, "the interviewer asks an expert". $M_I$ generates the token sequence $x_1, \dots, x_n$. Next, it decides to use expert $E_1$ to continue the answer. This is achieved by generating a special token $\mathrm{CTRL}_i$ ($\mathrm{CTRL}_1$ in Fig. 6). Expert model $E_1$ uses $x_1, \dots, x_n$ as the initial context (i.e. the sequence of previously generated tokens). It generates tokens $y_{n+2}, \dots, y_{n+m}$, followed by a STOP token that returns the control to $M_I$.

FIGURE 6. Sequential control passing between models.
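
Sequential control passing can be sketched as a small loop in Python. The sketch is an illustration under assumptions: the "models" are scripted token streams rather than LLMs, and the names `scripted`, `run_sequential` and the `EOS` end marker are hypothetical. Note how the control token occupies position $n+1$, so the expert's first token lands at position $n+2$, matching the indexing used above.

```python
STOP = "STOP"  # returns control to the interviewer
EOS = "EOS"    # assumed end-of-answer marker for the interviewer


def scripted(tokens):
    """Stand-in for a language model: ignores the context, replays a fixed script."""
    stream = iter(tokens)
    return lambda context: next(stream)


def run_sequential(interviewer, experts, max_tokens=50):
    """Interviewer generates until it emits CTRL_i; expert i then runs until STOP."""
    context = []
    active = None  # None means the interviewer holds control
    for _ in range(max_tokens):
        if active is None:
            token = interviewer(context)
            if token == EOS:
                break
            context.append(token)
            if token in experts:  # a control token names the expert to run
                active = token
        else:
            token = experts[active](context)
            context.append(token)
            if token == STOP:
                active = None
    return context


interviewer = scripted(["The", "forecast", "CTRL_1", "so", "plant", "now", EOS])
experts = {"CTRL_1": scripted(["is", "sunny", STOP])}
print(run_sequential(interviewer, experts))
```

The single shared `context` list is the important part: the expert continues from everything the interviewer wrote, and the interviewer later continues from everything the expert wrote.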
When a user formulates a question related to plant cultivation, $M_I$ might accurately decide to allow the Crops expert model to continue generation. However, our initial experiments showed that $M_I$ does not always exhibit context sensitivity. Oftentimes, the expert is more capable of assessing the context and determining when to start "talking" by generating its own $\mathrm{CTRL}_i$ token. Therefore, we also included another scenario, parallel generation, illustrated in Figure 7. Here, the expert model steps in and interrupts the Interviewer. To achieve this, we have all the expert models continuously generate tokens. As before, token $y_k$ is generated by an expert based on the history of tokens $x_1, \dots, x_{k-1}$, which represent the current context. When an expert generates a token $\mathrm{CTRL}_i$, it preempts the Interviewer. All subsequent tokens up until STOP are part of the user's answer. In Figure 7, all tokens that the user does not see are indicated by dashed lines.

This situation often occurs when an expert decides to perform an API call. We observed that in almost all scenarios such a decision is contextually correct and should be prioritized over the text generated by the Interviewer.

FIGURE 7. Parallel generation with multiple models.
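
The preemption rule of parallel generation can be sketched as follows. This is a toy illustration, not the paper's implementation: the interviewer and the weather expert are hand-written rules over the last context token, the filler token `"..."` stands for expert output that is discarded, and all names are hypothetical.

```python
STOP = "STOP"
EOS = "EOS"  # assumed end-of-answer marker


def interviewer(context):
    """Rule-based stand-in for the interviewer model."""
    last = context[-1] if context else None
    rules = {None: "Yearly", "Yearly": "rain", "rain": "is", STOP: "suffices"}
    return rules.get(last, EOS)


def weather_expert(context):
    """Rule-based stand-in for an expert that wants to speak after the word 'rain'."""
    last = context[-1] if context else None
    rules = {"rain": "CTRL_weather", "CTRL_weather": "600mm", "600mm": STOP}
    return rules.get(last, "...")  # filler proposals are simply dropped


def run_parallel(interviewer, experts, max_steps=50):
    """All models propose a token each step; an expert's CTRL token preempts."""
    context, visible, active = [], [], None
    for _ in range(max_steps):
        if active is not None:  # an expert currently holds control
            token = experts[active](context)
            context.append(token)
            if token == STOP:
                active = None
            else:
                visible.append(token)
            continue
        proposals = {name: model(context) for name, model in experts.items()}
        candidate = interviewer(context)
        preempting = [n for n, tok in proposals.items() if tok == f"CTRL_{n}"]
        if preempting:  # the expert's decision wins over the interviewer's token
            active = preempting[0]
            context.append(f"CTRL_{active}")
        elif candidate == EOS:
            break
        else:
            context.append(candidate)
            visible.append(candidate)
    return context, visible


print(run_parallel(interviewer, {"weather": weather_expert}))
```

In the run above the interviewer's proposed token "is" is discarded because the expert claimed control at the same step, which mirrors the prioritization described in the text.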
Finally, our experience has also shown that the expert-generated text needs to be processed before output. For this reason we introduced Parallel generation with moderation (see Figure 8). Here, we apply a text transformation function $g$ to the sequence of tokens $y_{n+2}, \dots, y_{n+m}$ generated by an expert.

FIGURE 8. Parallel generation with moderation.

If $g$ is the identity function ($g(s) = s$), then the text generated by the expert is unmodified. If $g(\cdot) = \epsilon$ (the empty string), then the entire sequence generated by the expert is effectively hidden from the user. However, this sequence will still be part of the context which $M_I$ uses to generate text. In Figure 8, note that token $x_{n+m+2}$ is generated based on the sequence of tokens $x_1, x_2, \dots, x_n, y_{n+2}, \dots, y_{n+m}$, to which the expert $E_1$ contributed with $y_{n+2}, \dots, y_{n+m}$. This form of hiding tokens is particularly helpful in dealing with API calls that produce tabular data as a response. For example, the Weather expert is trained to generate such API calls. We want this type of information to be in the context of the Interviewer to draw a conclusion, but not be explicit to the user. We illustrate this situation using the example shown in Fig. 9.
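
The effect of the moderation function $g$ can be shown with a short Python sketch. It assumes, for simplicity, that texts are plain strings joined by a space; the names `identity`, `hide` and `moderate` are hypothetical.

```python
def identity(text):
    """g(s) = s: the expert's text reaches the user unchanged."""
    return text


def hide(text):
    """g(s) = empty string: the expert's text is withheld from the user."""
    return ""


def moderate(interviewer_text, expert_text, g):
    """Return (context seen by the interviewer, text shown to the user)."""
    context = interviewer_text + " " + expert_text            # always complete
    visible = (interviewer_text + " " + g(expert_text)).strip()
    return context, visible


table = "| day | temp |"  # placeholder for tabular API output
print(moderate("Tomorrow in Sibiu:", table, hide))
print(moderate("Tomorrow in Sibiu:", table, identity))
```

Whatever $g$ does, the context is unchanged, so the interviewer can still reason over the hidden table when it writes its conclusion.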
After starting a sentence, $M_I$ decides to yield the context to the Current date expert. The control tokens were omitted for brevity. The expert performs a call to retrieve the current date. This call takes no parameters. The actual call will be hidden from the user (illustrated with white boxes in Figure 9), but kept in the context window. After the expert has finished the call, control is resumed by $M_I$, which switches to the Weather expert. At this point, the current context contains the location as well as the date, which will be used by the second expert model to construct its weather-related API call. Once the call has finished, text generation control returns to $M_I$, which uses the time and forecast information to produce its conclusion.

This example highlights several key traits of our approach:
• We use API calls not only to generate text answers, but also to add contextual information flexibly. By using $g$, we keep this information in the context so that the Interviewer can reason about it and at the same time hide it from the user. This is akin to dynamic generation of query-dependent prompts.
• The example in Figure 9 also illustrates the dependency between the answer given by the Current date API and the construction of the subsequent Weather call. In practice we find many such dependencies, sometimes cascading over three or more calls. For instance, we might need to fetch the current date, based on it identify weather-related data, then perform an average over the result and finally fetch crop-related information based on that average.
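
The cascade described in the last bullet (date, then weather, then average, then crops) can be sketched as a chain of function calls in Python. Everything below is a hypothetical stand-in: the four functions are stubs, the precipitation values are made up, and the crop ranges are illustrative numbers, not agronomic data.

```python
from datetime import date


def current_date():
    """Stub for the Current date API (fixed value for reproducibility)."""
    return date(2025, 5, 8)


def monthly_precipitation(location, year):
    """Stub for the Weather API: made-up monthly precipitation in mm."""
    return [40, 35, 45, 60, 80, 90, 85, 70, 55, 50, 45, 45]


def aggregate_avg(values):
    """Stub for the Aggregation API."""
    return sum(values) / len(values)


def crops_for_precipitation(avg_mm):
    """Stub for the Crops API: illustrative (min, max) monthly precipitation ranges."""
    ranges = {"wheat": (30, 60), "maize": (50, 80), "rice": (100, 200)}
    return sorted(name for name, (lo, hi) in ranges.items() if lo <= avg_mm <= hi)


def recommend(location):
    """Each call's arguments depend on the response of the previous call."""
    today = current_date()
    rain = monthly_precipitation(location, today.year - 1)  # needs the date
    average = aggregate_avg(rain)                            # needs the weather data
    return crops_for_precipitation(average)                  # needs the average


print(recommend("Sibiu"))
```

In Agora these dependencies are resolved through the shared context rather than through explicit function composition, but the data flow between calls has the same shape.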
IV. TRAINING AND EVALUATING AGORA
Agora was built as a result of an iterative refinement process, in which we explored different designs to achieve our goals. As mentioned in Section II, these were: (i) API-call support, (ii) the ability to support dependencies between calls, i.e. API-call orchestration, (iii) the ability to easily integrate new APIs (API scalability), (iv) the ability to train the entire system on commodity GPUs (system scalability) and (v) open-source reliance.
A. FINE-TUNING A SINGLE MODEL FOR API-CALL SUPPORT
Our first step was to apply the Toolformer methodology [8], which consists of creating a single model fine-tuned to address our different types of API calls. Toolformer selectively inserts API calls into a large dataset, enabling the model to learn when and how to generate them.
1) DATASET
Our methodology diverged from that in [8] because of the unavailability of such an existing dataset. Agricultural texts and datasets generally lack the sufficient temporal and spatial data required for accurate weather and climate API calls, with relevant examples occurring too infrequently for queries on optimal sowing conditions.
FIGURE 9. Illustrating the usage of the function g.

To build training data for each of our APIs (Current Date, Weather & Climate, Crops, and Aggregation), we used GPT-4o to generate separate datasets of question-answer pairs annotated with API calls. Ensuring diversity in these datasets was essential for the expert models to generalize effectively. We began by writing hand-crafted question templates that varied in complexity, involving between one and three experts, and capturing different types of dependencies across domains. GPT-4o was then used to produce semantic variations of these templates, using a range of prompts to encourage diverse language styles and phrasings. Finally, GPT-4o instantiated each template by filling in specific details, such as locations, crop types, or climate conditions, to create fully concrete questions. Using the selected parameters, we constructed a correct set of API calls, queried the relevant data and asked GPT-4o to generate an answer, together with the necessary API calls, resulting in a complete question-answer pair.

The prompts used for each API are provided in Appendix B-A, along with more details on the additional processing performed on the synthetically generated examples.
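
The template-instantiation step can be pictured with a small Python sketch. It is a simplified stand-in: in the paper this step is carried out by GPT-4o, whereas the sketch fills slots mechanically with `itertools.product`; the template text, the parameter lists and the call syntax `weather(location=...)` and `crops(name=...)` are all hypothetical.

```python
from itertools import product

# Hypothetical hand-crafted template with two slots.
TEMPLATE = "Can {crop} be grown around {city}?"


def instantiate(template, crops, cities):
    """Fill every slot combination and attach the API calls implied by the slots."""
    examples = []
    for crop, city in product(crops, cities):
        examples.append({
            "question": template.format(crop=crop, city=city),
            # Assumed call syntax, for illustration only.
            "api_calls": [
                f"<API_CALL> weather(location={city}) </API_CALL>",
                f"<API_CALL> crops(name={crop}) </API_CALL>",
            ],
        })
    return examples


for example in instantiate(TEMPLATE, ["tomatoes", "peaches"], ["Sibiu", "Constanta"]):
    print(example["question"], example["api_calls"])
```

Because the parameters are chosen before the question is written, the correct API calls for each example are known by construction, which is what later makes the training pairs reliable.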
2) TRAINING
We chose a relatively small, state-of-the-art open-source LLM, specifically LLaMA-3-8B-Instruct, because the LLaMA-3 family consistently outperforms other models of similar size. We performed fine-tuning using LoRA [16] and QLoRA [17] adapters, to fit within GPU memory limitations. The exact hyper-parameters that we used are presented in Appendix B-C. To achieve objective (iv), we selected the NVIDIA AD102 GeForce RTX 4090 GPU, a high-performance, cost-effective, and readily available piece of hardware on the market. We utilized three such GPUs, each with 24,576 MB of available VRAM, accessed via the CUDA API.
3) EVALUATION
We began by identifying the main categories of user-relevant questions, with a focus on complex queries that require integrating information across multiple domains. This analysis included mapping out all possible dependency relationships between API calls. Our findings show that weather and climate data often serve as foundational inputs, with crop-related queries typically depending on both of these, as well as the current date. Aggregation API calls tend to rely on the results of prior API responses, such as those from weather, climate, or crop services. Based on these insights, we designed a comprehensive set of question templates that systematically reflect the full range of possible dependencies. Using an approach similar to that described in Section IV-A1, we generated diverse questions grounded in these templates. We then used GPT-4o, guided by the prompts described in the Appendix, to produce user queries evenly distributed across the identified categories, including examples involving only a single API call. These queries are distinct from those used in training (see Section IV-A1) and were not seen by the model during fine-tuning.
We subjected each of the systems under scrutiny to these questions and manually graded the answers on a scale of 1 to 5. Grades 3 to 5 are assigned to answers where all API calls are correct or only part of them, but the overall answer and underlying reasoning are valid. Grades 1 and 2 refer to answers in which API calls are invalid and dependencies are misidentified.

For each query included in our evaluation, we have deterministic knowledge of both the specific API call(s) that need to be invoked and the dependencies between them. This predefined structure eliminates any ambiguity in the evaluation process, allowing us to assess each response with complete certainty and ensuring a rigorous and objective accuracy analysis.

The results are shown in Figure 10. Accuracy refers to the percentage of grades from three to five out of the total, while Percentage of perfect answers refers to the percentage of grades of five out of the total.

FIGURE 10. Agora performance compared to other approaches.
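
The two reported metrics follow directly from the grading scale and can be computed as in the Python sketch below. The function name `summarize` and the sample grades are illustrative assumptions; they are not the paper's evaluation data.

```python
def summarize(grades):
    """Return (accuracy, percentage of perfect answers) for a list of 1-5 grades."""
    if not grades or any(g not in (1, 2, 3, 4, 5) for g in grades):
        raise ValueError("grades must be a non-empty list of integers from 1 to 5")
    total = len(grades)
    accuracy = 100 * sum(1 for g in grades if g >= 3) / total  # grades 3, 4, 5
    perfect = 100 * sum(1 for g in grades if g == 5) / total   # grade 5 only
    return accuracy, perfect


# Made-up grades for ten answers.
print(summarize([5, 4, 3, 2, 1, 5, 5, 3, 1, 4]))
```

Since perfect answers are a subset of accurate ones, the second figure can never exceed the first.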
Our first experiments consisted of applying the Toolformer [8] methodology to our training set, with our choice of language model. Although the results were promising (first row in Figure 10), and better results could be obtained by increasing the size of the dataset or that of the model, our observation was that the single model lacked the ability to associate multiple API calls, even though it had knowledge of each available API. Our attempts at instrumenting the dataset to capture API call dependencies revealed the API-call scalability problem discussed in Section III-B.
B. AGORA - INTERVIEWER WITH MULTIPLE EXPERTS
To enhance performance and add modularity (i.e. our
objective (iii)), we introduced the Agora system, which
distributes the API-call generation tasks across expert models
and assigns dependency handling and the generation of
conclusions to the Interviewer model. We used the same
strategy as before to generate the datasets for the training of
experts.
In our first Agora iteration, we used the same 8B base
model for experts and the Interviewer. The training procedure
for each expert model follows the details in Section IV-A2.
Table 1 outlines the APIs managed by each expert.
TABLE 1. The number of examples used to train each expert model.
For the interviewer model we used a system prompt
that contained a short description of each API, a grammar
showcasing their syntax and few-shot examples of their use.
This system prompt, which can be found in the Appendix,
enables the interviewer to understand how different APIs can
be linked or combined when a user query lacks sufficient
information for a single API call. Moreover, using a system
prompt ensures the modularity of the Agora system because
adding or removing one or more APIs requires only updating
this prompt.
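The sketch below illustrates why a prompt-based design keeps the system modular: the prompt is assembled from a registry of API descriptions, so adding or removing an API changes only the registry. The `ApiSpec` type, the API names, and the call syntax are hypothetical placeholders; the actual prompt is the one given in the Appendix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiSpec:
    """Hypothetical description of one API exposed to the Interviewer."""
    name: str
    description: str
    syntax: str            # grammar-like signature shown to the model
    examples: tuple = ()   # few-shot usage examples


def build_system_prompt(apis):
    """Assemble an Interviewer system prompt from the registered APIs."""
    lines = ["You can call the following APIs and combine their results:"]
    for api in apis:
        lines.append(f"- {api.name}: {api.description}")
        lines.append(f"  syntax: {api.syntax}")
        for example in api.examples:
            lines.append(f"  example: {example}")
    return "\n".join(lines)


# Placeholder entries (assumptions); not the APIs or grammar used in the paper.
registry = [
    ApiSpec("current_date", "returns today's date", "current_date()"),
    ApiSpec(
        "weather",
        "returns weather data for a place and interval",
        "weather(city, start, end)",
        ("weather('Cluj-Napoca', '2025-05-01', '2025-05-03')",),
    ),
]
prompt = build_system_prompt(registry)

# Removing an API only shrinks the registry, and therefore only the prompt.
prompt_without_weather = build_system_prompt(registry[:1])
```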
We observed a significant improvement in performance
and Agora successfully chained multiple API calls, thus
achieving objective (ii). The results are shown in Figure 10
(second column).
Our subsequent objective was to further increase the
system performance. We experimented with a 70B
interviewer model along with 8B expert models. This led to
strong performance results, as shown in Figure 10 (third
column), because the larger model’s enhanced reasoning
abilities enabled it to better understand the available APIs and
combine them. Simultaneously, new expert models can be
trained independently, with no intervention required for the
existing ones, and with minimal API descriptions that need to
be added to the Interviewer’s system prompt, thus achieving
objective (iii).
However, running inference on the 70B interviewer
model with an NVIDIA AD102 is not feasible because of
insufficient GPU memory, requiring us to switch to a more
capable A100 with 80GB of memory.
1) MEMORY CONSTRAINTS DURING INFERENCE
Using LoRA [16] and QLoRA [17] adapters during
fine-tuning, each expert can be formed by combining a base model
with a small adapter, which can be easily plugged in or
removed as needed (see Figure 11).
This strategy significantly enhances memory efficiency
and reduces GPU memory requirements by up to three
times compared with traditional fine-tuning methods, as
reported in [16]. This approach offers several advantages for
Agora’s architecture, which, in principle, is designed for
n+1 models: n experts together with the Interviewer.
Because all n+1 models must run for each generated token,
we can achieve this by loading the n+1 base models on a
sufficiently large GPU (or multiple GPUs).
However, we can achieve better memory utilization by
loading one base model on a GPU, together with k adapters,
one for each expert. This means that on each GPU, we can run
the inference from k different experts sequentially by simply
swapping out the corresponding LoRA adapters, as illustrated
in Figure 11. This method reduces the number of base
models that need to be loaded simultaneously, thus saving
computational resources at the expense of increased time
owing to sequential loading.
Additionally, when the interviewer model is significantly
larger than the expert models, we can load the interviewer
onto a dedicated GPU. The experts, which are much smaller
than the Interviewer, can be evenly distributed across the
remaining GPUs.
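One way to picture this layout is the placement sketch below. It assumes GPU 0 is reserved for the Interviewer and that experts are spread round-robin over the remaining GPUs, each of which holds one shared base model plus the adapters assigned to it. The function name, the labels, and the round-robin rule are assumptions made for illustration; the paper does not prescribe a specific assignment policy.

```python
def plan_placement(experts, num_gpus):
    """Map GPU index -> {"base": label, "adapters": [expert names]}.

    GPU 0 is dedicated to the Interviewer; every other GPU loads one
    expert base model and the LoRA adapters assigned to it.
    """
    if num_gpus < 2:
        raise ValueError("need one GPU for the Interviewer and one for experts")
    plan = {0: {"base": "interviewer-base", "adapters": []}}
    for gpu in range(1, num_gpus):
        plan[gpu] = {"base": "expert-base", "adapters": []}
    for index, name in enumerate(experts):
        gpu = 1 + index % (num_gpus - 1)  # round-robin over the expert GPUs
        plan[gpu]["adapters"].append(name)
    return plan


# Three experts on three GPUs: one GPU ends up with two adapters that must
# take turns, the other with a single adapter.
plan = plan_placement(["E1", "E2", "E3"], num_gpus=3)
for gpu, slot in plan.items():
    print(gpu, slot["base"], slot["adapters"])
```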
FIGURE 11. Agora performance compared to other approaches.
Figure 11 showcases several scenarios used during our
evaluation. In Scenario 1, our smaller GPUs run inference
in parallel, an arrangement we used to assess the initial
version of Agora. As the evaluation moved to the larger
70B Interviewer model, we transitioned to more powerful
80GB A100 GPUs. Scenarios 2 and 3 demonstrate how base
models and adapters are distributed across available memory.
In Scenario 3, for example, a single GPU holds one base
model and two adapters, enabling inference for two expert
models, E2 and E3, which generate tokens sequentially. The
adapter-switching time is approximately 20 times shorter
than the time needed to generate a token. Despite hardware
limitations, Scenarios 2 and 3 illustrate that our multi-model
system can still operate efficiently, though with reduced
performance due to sequential inference.
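A rough back-of-the-envelope model of this trade-off is sketched below. It assumes one adapter switch per expert turn and a switch cost of one twentieth of the per-token time, as stated above; the per-token time and token counts are made-up inputs, and the model ignores everything else (prompt processing, batching, memory transfers).

```python
def sequential_time(tokens_per_expert, token_time, switch_ratio=20.0):
    """Time for experts sharing one GPU: each turn pays one adapter switch."""
    switch_time = token_time / switch_ratio
    return sum(switch_time + n * token_time for n in tokens_per_expert)


def parallel_time(tokens_per_expert, token_time):
    """Time when every expert has its own GPU: the slowest expert dominates."""
    return max(n * token_time for n in tokens_per_expert)


# Hypothetical workload: two experts producing 40 and 60 tokens at 50 ms/token.
tokens = [40, 60]
seq = sequential_time(tokens, token_time=0.05)
par = parallel_time(tokens, token_time=0.05)
switch_share = (len(tokens) * 0.05 / 20.0) / seq
print(f"sequential={seq:.3f}s parallel={par:.3f}s switch share={switch_share:.2%}")
```

Under these assumptions the slowdown comes almost entirely from generating tokens one expert after another, not from swapping adapters.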
C. AGORA COMPRESSION
The primary limitation of the Agora system, as described in
Section IV-B, is the substantial resource demands necessary
to achieve optimal performance. This is mainly because of
the need to load both the LLaMA-3-70B-Instruct model
(Interviewer) and the experts’ base model,
LLaMA-3-8B-Instruct, resulting in excessive memory consumption. The
memory overhead surpassed the capabilities of NVIDIA
AD102 GPUs alone. To mitigate this, we developed a new,
standalone model, which we trained using the responses
provided by Agora. We call this model a ‘‘compression
model’’, because unlike Agora, it is a single language model,
but is able to reproduce, and even enhance the performance
of Agora. To achieve this, we proceeded as follows:
• We started with a dataset of questions and used Agora
to generate answers. We employed a 70B interviewer
and 8B expert models. This dataset includes examples
demonstrating individual API usage as well as examples
of chained API calls.
• Agora has great but not perfect accuracy, hence we
carefully filtered it to retain only question-answer pairs
with highly accurate multi-API call examples. The final
dataset contains 17,000 such entries.
• We fine-tuned a single LLaMA-3-8B-Instruct model with
this dataset, resulting in our compression model.
The pipeline for training the Agora compression model,
relying on the previous stages we have described, is illustrated
in Figure 12.
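The filtering step of this pipeline can be sketched as follows. The paper does not specify how accuracy was judged during filtering, so the `quality` score, its threshold, and the prompt/completion field names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AgoraSample:
    question: str
    answer: str    # full Agora response, including its API calls
    quality: int   # hypothetical 1-5 quality score from a review step


def build_compression_dataset(samples, min_quality=5):
    """Keep only high-quality pairs and format them for supervised fine-tuning."""
    return [
        {"prompt": s.question, "completion": s.answer}
        for s in samples
        if s.quality >= min_quality
    ]


# Toy inputs; the real dataset described above contains 17,000 entries.
samples = [
    AgoraSample("Q1", "A1 with two chained calls", 5),
    AgoraSample("Q2", "A2 with a wrong dependency", 2),
]
dataset = build_compression_dataset(samples)
print(len(dataset), "of", len(samples), "samples kept")
```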
FIGURE 12. All steps required to build the Agora compression model.
While this model demonstrated the best performance
overall, even slightly surpassing Agora with the 70B
interviewer (Fig. 10), there were some trade-offs. The compression
model:
• cannot be fine-tuned independently of Agora, as there
is no existing dataset suitable for this task. Instead, the
required dataset must be generated directly using Agora,
or a similar tool.
• sacrifices modularity (i.e. objective (iii)), meaning it
cannot, by itself, accommodate new APIs without
resorting to Agora as previously described.
However, we found that fine-tuning the newer versions of a
compression model is an acceptable compromise for various
applications. The fine-tuning, performed on an NVIDIA
A100, takes under two hours.
APPENDIX B
PROMPTS FOR BUILDING AGORA
A. PROMPTS FOR BUILDING TRAINING DATASETS FOR
EXPERTS
Below, we present the prompts used to generate datasets for
each API, along with sample outputs from the generated data
for clarity.
Crops and Aggregation: The question-answer pairs in
the datasets for the crops and aggregation APIs were
generated directly using GPT-4, without any additional
filtering, by employing the prompts shown in Figures 18,
19 and 20. Each API is linked to a corresponding back-end
function, which either returns the current date or queries an
internal database for sowing conditions.
For handling the current date and weather & climate APIs,
we generated the question and the parameters of the calls
using GPT-4o. We began by taking a list of Romanian cities
along with their latitude and longitude, which were later
needed when interacting with the weather and climate API.
FIGURE 18. Prompt for generating dataset for Crops expert dataset.
FIGURE 19. Prompt for generating dataset for Aggregation expert
dataset.
We wrote by hand 190 different time expressions that we
considered plausible to be used in a query. Our API provides,
by design, hourly data for intervals of up to three days, daily
data for intervals of up to two months, and monthly data
for longer time spans. If the user’s query includes the name
of a country, it is passed as a parameter in the call. If no
country is specified, the parameter value defaults to ‘‘???’’,
in which case the system assumes the query refers to the
largest city with the given name. In this situation, the country
name is retrieved automatically from a database. We filtered
out examples where data retrieval from our sources was
unsuccessful and then tasked GPT-4o-mini with adjusting
the current date and all related dates in each example. This
ensured that the final dataset does not contain examples
tied exclusively to the original creation date. In addition to
fully annotated examples, we realized the need for more
examples focusing solely on the weather & climate API call
parameters. To boost the model’s ability to select the correct
parameters for this API, we generated additional examples
that contained only the question and the call parameters,
omitting the final result and text reasoning about the retrieved
data.
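The resolution rule described above (hourly, daily, or monthly data depending on the length of the interval) can be sketched as follows. The function name is hypothetical, and treating "two months" as 62 days is an assumption, since the exact cut-off is not stated.

```python
from datetime import datetime, timedelta

HOURLY_MAX = timedelta(days=3)   # hourly data for intervals of up to three days
DAILY_MAX = timedelta(days=62)   # assumption: "two months" taken as 62 days


def pick_granularity(start, end):
    """Return the data resolution for the interval [start, end]."""
    if end < start:
        raise ValueError("end must not precede start")
    span = end - start
    if span <= HOURLY_MAX:
        return "hourly"
    if span <= DAILY_MAX:
        return "daily"
    return "monthly"


print(pick_granularity(datetime(2025, 1, 1), datetime(2025, 1, 3)))
print(pick_granularity(datetime(2025, 1, 1), datetime(2025, 2, 1)))
print(pick_granularity(datetime(2025, 1, 1), datetime(2025, 6, 1)))
```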
FIGURE 20. Prompt for generating dataset for Weather & climate expert
dataset.
Table 4 outlines the number of examples generated for
training each expert.
TABLE 4. The number of examples used to train each expert model.
B. PROMPTS FOR CROSS-DOMAIN QUESTIONS
To generate questions that require API calls from both the
Weather & Climate and Aggregation experts, we used the
prompt shown in Figure 21. We used similar prompts for all
other cross-domain questions.
FIGURE 21. Prompt for generating dataset for Aggregation expert
dataset.
Table 5 shows the number of examples generated for each
combination of API calls that we deemed necessary and that
involve at least two experts.
TABLE 5. The number of examples with combinations of API calls.
C. FINE-TUNING HYPER-PARAMETERS
We applied the same training settings consistently across all
our fine-tuning experiments.
A constant learning rate (LR) was suboptimal, so we
adopted a cosine LR schedule with a maximum value of
1e-4, which provided better convergence. Due to restricted
memory, we used a batch size of 1 and trained the model
for 2 epochs with a maximum input sequence length of
2048 tokens. We applied gradient clipping at 0.3 and set the
weight decay to 0.1 to stabilize the training process.
For the LoRA hyperparameters, we have set both alpha and
rank (r) to 32.
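For reference, the settings above can be collected as in the sketch below, together with a plain cosine schedule. The dictionary keys are hypothetical names rather than options of a particular training library, and the schedule assumes no warm-up and a decay to zero, neither of which is stated in the text.

```python
import math

# Values taken from the text; the key names are illustrative only.
TRAINING_SETTINGS = {
    "max_learning_rate": 1e-4,
    "batch_size": 1,
    "epochs": 2,
    "max_sequence_length": 2048,
    "gradient_clipping": 0.3,
    "weight_decay": 0.1,
    "lora_alpha": 32,
    "lora_rank": 32,
}


def cosine_lr(step, total_steps, max_lr=1e-4, min_lr=0.0):
    """Cosine decay from max_lr at step 0 to min_lr at total_steps."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, midpoint, and end of a hypothetical 1000-step run.
for step in (0, 500, 1000):
    print(step, f"{cosine_lr(step, 1000):.2e}")
```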
ACKNOWLEDGMENT
Views and opinions expressed are however those of the
author(s) only and do not necessarily reflect those of
the European Union or the European Research Executive
Agency. Neither the European Union nor the granting
authority can be held responsible for them.
REFERENCES
[1] Invest Romania. (2022). Internet Infrastructure. Accessed: Jul. 22, 2024.
[Online]. Available: https://investromania.gov.ro/web/internet-infrastructure/
[2] Business Agency Association (BAA), ‘‘Jointly preparing the conditions
in the agricultural and connected sectors in the BSB area for the digital
transformation (BSB smart farming),’’ Dunarea de Jos, Univ. Galati,
Galati, Romania, Tech. Rep., 2021. Accessed: Dec. 12, 2024.
[3] S. Rezayi, Z. Liu, Z. Wu, C. Dhakal, B. Ge, C. Zhen, T. Liu, and
S. Li, ‘‘AgriBERT: Knowledge-infused agricultural language models for
matching food and nutrition,’’ in Proc. 31st Int. Joint Conf. Artif. Intell.,
Jul. 2022, pp. 5150–5156, doi: 10.24963/ijcai.2022/715.
[4] N. Koldunov and T. Jung, ‘‘Local climate services for all, courtesy of large
language models,’’ Commun. Earth Environ., vol. 5, no. 1, p. 13, Jan. 2024,
doi: 10.1038/s43247-023-01199-1.
[5] T. T. Nguyen, J. Brandstetter, A. Kapoor, J. K. Gupta, and A. Grover,
‘‘ClimaX: A foundation model for weather and climate,’’ in Proc. 40th
Int. Conf. Mach. Learn., Jan. 2023, pp. 1–14.
[6] Futural Project. (2024). Futural Project–agriculture and Climate.
Accessed: Sep. 14, 2024. [Online]. Available: https://futural-project.eu/
[7] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez,
Ł. Kaiser, and I. Polosukhin, ‘‘Attention is all you need,’’ Adv. Neural Inf.
Process. Syst., vol. 30, pp. 5998–6008, Jun. 2017.
[8] T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomelí,
L. Zettlemoyer, N. Cancedda, and T. Scialom, ‘‘Toolformer: Language models can
teach themselves to use tools,’’ in Proc. Adv. Neural Inf. Process. Syst.,
vol. 36, Jan. 2024, pp. 1–16.
[9] (2024). KissanAI. Accessed: Jul. 22, 2024. [Online]. Available:
https://kissan.ai/
[10] B. Silva, L. Nunes, R. Estevão, V. Aski, and R. Chandra, ‘‘GPT-4 as an
agronomist assistant? Answering agriculture exams using large language
models,’’ 2023, arXiv:2310.06225.
[11] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux,
T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar, A. Rodriguez,
A. Joulin, E. Grave, and G. Lample, ‘‘LLaMA: Open and efficient foundation
language models,’’ 2023, arXiv:2302.13971.
[12] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler,
M. Lewis, W.-t. Yih, T. Rocktäschel, S. Riedel, and D. Kiela,
‘‘Retrieval-augmented generation for knowledge-intensive NLP tasks,’’ in Proc. Adv.
Neural Inf. Process. Syst., Jan. 2020, pp. 9459–9474.
[13] S. Robertson and H. Zaragoza, ‘‘The probabilistic relevance
framework: BM25 and beyond,’’ Found. Trends Inf. Retr., vol. 3, no. 4,
pp. 333–389, 2009.
[14] B. Wang and A. Komatsuzaki. (2021). Gpt-j-6b: A 6 Billion Parameter
Autoregressive Language Model. Accessed: Aug. 12, 2023. [Online].
Available: https://github.com/kingoflolz/mesh-transformer-jax
[15] A. Alford. (2024). Meta Releases Llama 3 Open-Source LLM. Accessed:
May 7, 2024. [Online]. Available:
https://www.infoq.com/news/2024/05/meta-llama-3/
[16] J. E. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, and W. Chen,
‘‘LoRA: Low-rank adaptation of large language models,’’ in Proc. Int.
Conf. Learn. Represent., Jan. 2021, pp. 1–16.
[17] T. Dettmers, A. Pagnoni, A. Holtzman, and L. Zettlemoyer, ‘‘QLoRA:
Efficient finetuning of quantized LLMs,’’ in Proc. Adv. Neural Inf. Process.
Syst., Jan. 2023, pp. 10088–10115.
[18] (2024). Agri1. Accessed: Sep. 4, 2024. [Online]. Available:
https://www.agri1.ai/en/
[19] S. Chen, G. Long, J. Jiang, D. Liu, and C. Zhang, ‘‘Foundation models for
weather and climate data understanding: A comprehensive survey,’’ 2023,
arXiv:2312.03014.
[20] J. Devlin, M. Chang, K. Lee, and K. Toutanova, ‘‘BERT: Pre-training
of deep bidirectional transformers for language understanding,’’ in
Proc. NAACL-HLT, Minneapolis, MN, USA, Jan. 2019, vol. 1, no. 2,
pp. 4171–4186.
[21] Y. Gao, Y. Xiong, X. Gao, K. Jia, J. Pan, Y. Bi, Y. Dai, J. Sun, M. Wang, and
H. Wang, ‘‘Retrieval-augmented generation for large language models: A
survey,’’ 2023, arXiv:2312.10997.
[22] L. Yuan, Y. Chen, X. Wang, Y. Fung, H. Peng, and H. Ji, ‘‘CRAFT:
Customizing LLMs by creating and retrieving from specialized toolsets,’’
in Proc. 12th Int. Conf. Learn. Represent., Jan. 2023, pp. 1–16.
[23] Y. Qin, S. Liang, Y. Ye, K. Zhu, L. Yan, Y. Lu, Y. Lin, X. Cong, X. Tang,
B. Qian, S. Zhao, R. Tian, R. Xie, J. Zhou, M. Gerstein, D. Li, Z. Liu,
and M. Sun, ‘‘ToolLLM: Facilitating large language models to master
16000+ real-world Apis,’’ in Proc. 12th Int. Conf. Learn. Represent.,
Jan. 2024, pp. 1–19. [Online]. Available:
https://openreview.net/forum?id=dHng2O0Jjr
[24] S. Gao, Z. Shi, M. Zhu, B. Fang, X. Xin, P. Ren, Z. Chen, J. Ma, and Z. Ren,
‘‘Confucius: Iterative tool learning from introspection feedback by
easy-to-difficult curriculum,’’ in Proc. AAAI Conf. Artif. Intell., Mar. 2024, vol. 38,
no. 16, pp. 18030–18038.
[25] S. G. Patil, T. Zhang, X. Wang, and J. E. Gonzalez, ‘‘Gorilla: Large
language model connected with massive Apis,’’ 2023, arXiv:2305.15334.
[26] R. Yang, S. Lin, Y. Li, S. Zhao, Y. Ge, X. Li, and Y. Shan, ‘‘GPT4Tools:
Teaching large language model to use tools via self-instruction,’’ in Proc.
Adv. Neural Inf. Process. Syst., vol. 36, Jan. 2023, pp. 1–17.
[27] W.-L. Chiang, Z. Li, Z. Lin, Y. Sheng, Z. Wu, H. Zhang, L. Zheng,
S. Zhuang, Y. Zhuang, J. E. Gonzalez, I. Stoica, and E. P. Xing. (2023).
Vicuna: An Open-source Chatbot Impressing Gpt-4 With 90% Chatgpt
Quality. Accessed: Sep. 12, 2023. [Online]. Available:
https://lmsys.org/blog/2023-03-30-vicuna/
[28] Q. Tang, Z. Deng, H. Lin, X. Han, Q. Liang, B. Cao, and L. Sun,
‘‘ToolAlpaca: Generalized tool learning for language models with
3000 simulated cases,’’ 2023, arXiv:2306.05301.
[29] C. Qian, C. Han, Y. R. Fung, Y. Qin, Z. Liu, and H. Ji, ‘‘CREATOR:
Tool creation for disentangling abstract and concrete reasoning of large
language models,’’ 2023, arXiv:2305.14318.
[30] T. Cai, X. Wang, T. Ma, X. Chen, and D. Zhou, ‘‘Large language models
as tool makers,’’ in Proc. 12th Int. Conf. Learn. Represent., Jan. 2023,
pp. 1–14.
[31] S. Hao, T. Liu, Z. Wang, and Z. Hu, ‘‘ToolkenGPT: Augmenting frozen
language models with massive tools via tool embeddings,’’ in Proc. 37th
Conf. Neural Inf. Process. Syst., 2023, pp. 1–13. [Online]. Available:
https://openreview.net/forum?id=BHXsb69bSx
[32] A. Srivastava et al., ‘‘Beyond the imitation game: Quantifying and
extrapolating the capabilities of language models,’’ in Proc. Trans.
Mach. Learn. Res., Jan. 2022, pp. 1–95.
[33] G. Mialon, R. Dessì, M. Lomelí, C. Nalmpantis, R. Pasunuru, R. Raileanu,
B. Rozière, T. Schick, J. Dwivedi-Yu, A. Çelikyilmaz, É. Grave, Y. LeCun,
and T. Scialom, ‘‘Augmented language models: A survey,’’ Trans. Mach.
Learn. Res., Jan. 2023, pp. 1–33.
[34] Y. Talebirad and A. Nadiri, ‘‘Multi-agent collaboration: Harnessing the
power of intelligent LLM agents,’’ 2023, arXiv:2306.03314.
[35] S. Zejiang Shen, H. Lang, B. Wang, Y. Kim, and D. Sontag,
‘‘Learning to decode collaboratively with multiple language models,’’ 2024,
arXiv:2403.03870.
[36] Z. Chai, G. Wang, J. Su, T. Zhang, X. Huang, X. Wang, J. Xu, J. Yuan,
H. Yang, F. Wu, and Y. Yang, ‘‘An expert is worth one token: Synergizing
multiple expert LLMs as generalist via expert token routing,’’ 2024,
arXiv:2403.16854.
[37] X. Xu, M. Li, C. Tao, T. Shen, R. Cheng, J. Li, C. Xu, D. Tao, and T. Zhou,
‘‘A survey on knowledge distillation of large language models,’’ 2024,
arXiv:2402.13116.
[38] A. Madaan, N. Tandon, P. Gupta, S. Hallinan, L. Gao, S. Wiegreffe,
U. Alon, N. Dziri, S. Prabhumoye, Y. Yang, S. Welleck, B. P. Majumder,
S. Gupta, A. Yazdanbakhsh, and P. Clark, ‘‘Self-refine: Iterative refinement
with self-feedback,’’ in Proc. Adv. Neural Inf. Process. Syst., vol. 36,
Jan. 2023, pp. 1–24.
[39] Y. Dong, R. Mu, G. Jin, Y. Qi, J. Hu, X. Zhao, J. Meng, W. Ruan,
and X. Huang, ‘‘Building guardrails for large language models,’’ 2024,
arXiv:2402.01822.
ALEXANDRA UDRESCU received the engineering degree in computer
science from the Faculty of Automatic Control and Computers, National
University of Science and Technology POLITEHNICA Bucharest, where
she is currently pursuing the master’s degree. She is a Computer Scientist
specializing in algorithms, formal methods, and machine learning.
DAN-MATEI POPOVICI received the Ph.D. degree from the National
University of Science and Technology POLITEHNICA Bucharest, in 2012.
He was a Research Fellow with the Clausthal University of Technology
and ICUB (University of Bucharest’s Research Institute). He is currently
an Associate Professor with the Computer Science Department, National
University of Science and Technology POLITEHNICA Bucharest. His
research interests include formal verification techniques for computer
networks, computing research education, and NLP using language models.