Introducing Background Temperature to Characterise Hidden Randomness in Large Language Models

Author: Messina, Alberto; Scotta, Stefano

Publisher: Zenodo

DOI: 10.5281/zenodo.17279584

Source: https://zenodo.org/records/17279584/files/Background_Temperature_in_LLMs___arxiV.pdf

Introducing Background Temperature to Characterise Hidden Randomness in Large Language Models
Alberto Messina¹ and Stefano Scotta¹
¹ RAI - Radiotelevisione Italiana, Centre for Research, Technological Innovation and Experimentation (CRITS)
October 6, 2025
Abstract
Even when decoding with temperature $T = 0$, large language models (LLMs) can produce divergent outputs for identical inputs. Recent work by Thinking Machines Lab highlights implementation-level sources of nondeterminism, including batch-size variation, kernel non-invariance, and floating-point non-associativity. In this short note we formalize this behavior by introducing the notion of background temperature $T_{bg}$, the effective temperature induced by an implementation-dependent perturbation process observed even when nominal $T = 0$. We provide clean definitions, show how $T_{bg}$ relates to a stochastic perturbation governed by the inference environment $I$, and propose an empirical protocol to estimate $T_{bg}$ via the equivalent temperature $T_n(I)$ of an ideal reference system. We conclude with a set of pilot experiments run on a representative pool from the major LLM providers that demonstrate the idea and outline implications for reproducibility, evaluation, and deployment.
1 Introduction
A common assumption in LLM deployment is that setting the decoding temperature to $T = 0$ (greedy decoding) ensures determinism. However, empirical evidence shows that output variability persists under nominally deterministic settings. The recent work in [3] argues that nondeterminism in LLM inference often arises from practical systems issues such as varying batch sizes and the lack of batch-invariant kernels, along with floating-point non-associativity and reduction-order effects. This paper proposes a rigorous framing of such effects via the notion of a background temperature.
Contributions. (i) A concise formal model that addresses the phenomenon of nondeterminism as a stochastic effect on the output probability; (ii) a formal definition of background temperature $T_{bg}$; (iii) the outline of a practical protocol to estimate $T_{bg}$; (iv) a set of pilot studies illustrating the concept.
2 Related Work
The recent work by Thinking Machines Lab provides a systems-first analysis of LLM nondeterminism, emphasizing batch-size variation and batch-invariant kernels for inference; they also explain how floating-point non-associativity and reduction ordering contribute to variability [3].
In addition to this work, several recent studies have quantified non-determinism in large language model outputs even under settings intended to be deterministic (e.g. temperature $T = 0$, fixed seeds). For example:
• Atil et al. (2025) [1] systematically evaluate multiple LLMs configured under deterministic settings across zero-shot and few-shot tasks. They observe large accuracy variations (up to 15%) across runs with the same input, and show that even the string outputs are often not identical.
• Song et al. (2024) [10] explore how evaluation practices often ignore variability arising from different decoding configurations (greedy vs sampling). They show that even for greedy decoding, evaluation metrics vary, and that alignment methods can help reduce sampling variance.
• Ouyang et al. (2023) [6] analyze code generation benchmarks and show that many coding tasks produce different code outputs across repeated prompt invocations, even when using $T = 0$. This confirms that deterministic temperature settings do not guarantee output consistency.
These works align closely with observations from Thinking Machines Lab's blog [3] about system-level implementation factors (batch size, kernel non-invariance, floating-point non-associativity, etc.) causing output variation even under nominally deterministic decoding.
While prior work largely documents the existence and magnitude of non-determinism, there remains a gap in formalizing this behavior in terms of an equivalent temperature transformation functional and in proposing standard protocols to measure the effective background randomness. Our work addresses this by introducing the notion of an equivalent temperature $T_n(I)$ and its expectation $T_{bg}$. In the next sections, we transition from formal definitions to a concrete empirical protocol aimed at estimating an equivalent temperature $T_n(I)$ induced by implementation noise, and ultimately the background temperature.
To give a concrete description of what an overall measurement protocol for $T_{bg}$ would look like, we first describe general criteria for selecting prompts and datasets that are sensitive to small perturbations in model behaviour (including general, task-oriented, and adversarial/synthetic prompts). We then introduce the actual measurement protocol, made up of reference runs under known nonzero temperature settings to calibrate output variability. Following this, based on a suite of quantitative metrics - such as exact-match frequency, first-divergence token index, edit distance or string similarity, distributional divergence (e.g. JS or KL) over next-token / top-k probabilistic outputs, and entropy/confidence measures - we finally outline a fitting procedure to infer $T_n(I)$ by minimizing divergence between outputs under noisy $T = 0$ runs and reference nonzero-$T$ runs, and describe how to aggregate over $I$ to compute $T_{bg}$ with statistical confidence.
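3 Preliminaries and Notation
Let $\mathcal{D}$ denote the token vocabulary with size $|\mathcal{D}|$. At generation step $i$, the model produces logits $z \in \mathbb{R}^{|\mathcal{D}|}$ and associated probabilities $P(v) \in [0, 1]$ that the $i$-th token in the sequence is the $v$-th token in $\mathcal{D}$, such that $\sum_{v=1}^{|\mathcal{D}|} P(v) = 1$, via softmax:
$$P(v) = P(\tau_v \mid \tau_{<i}) = \frac{\exp(z_v)}{\sum_{s \in \mathcal{D}} \exp(z_s)} \quad \text{for } v = 1, \dots, |\mathcal{D}|, \tag{1}$$
where $\tau_v$ denotes the $v$-th token in $\mathcal{D}$ and $P(\tau_v \mid \tau_{<i})$ is the probability of generating $\tau_v$ given the sequence of tokens generated up to the $i$-th token. At $T = 0$, the conventional model is greedy decoding by argmax:
$$\tau_i = \arg\max_{\tau \in \mathcal{D}} P(\tau \mid \tau_{<i}). \tag{2}$$
Decoding at temperature $T > 0$ is equivalent to performing the same operation with modified logits $\hat{z} \in \mathbb{R}^{|\mathcal{D}|}$:
$$\hat{P}_T(\tau_i \mid \tau_{<i}) = \frac{\exp \hat{z}_i}{\sum_{s \in \mathcal{D}} \exp \hat{z}_s}. \tag{3}$$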
Then the $i$-th token is distributed as a Categorical random variable depending on the probability distribution above, i.e.
$$\tau_i \sim \mathrm{Categorical}\big(\hat{P}(\tau \mid \tau_{<i})\big). \tag{4}$$
Logits are modified through the randomization effects that are included in the decoding process by the specific LLM implementation. In standard autoregressive language models, the decoding temperature parameter modifies the randomness of next-token selection by adjusting the probability distribution derived from logits. Typically, one scales or transforms the raw (pre-softmax) logits via a temperature parameter and then passes them through softmax to obtain the final distribution for sampling or greedy selection. In general, and as a sufficient assumption for the sake of this work, lower temperatures concentrate probability mass on the most likely tokens, making output more deterministic, while higher temperatures flatten the distribution and increase variability.
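The effect of temperature on the next-token distribution can be illustrated with a minimal sketch. It assumes the common convention $\hat{z} = z / T$, which the text does not fix explicitly, and treats $T = 0$ as the greedy limit; the function names and the toy logits are illustrative only.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Next-token probabilities for logits scaled by 1/temperature.

    Assumption: z_hat = z / T. T = 0 is handled as the greedy limit,
    i.e. all probability mass on the argmax.
    """
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        best = max(range(len(logits)), key=lambda v: logits[v])
        return [1.0 if v == best else 0.0 for v in range(len(logits))]
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

toy_logits = [2.0, 1.5, 0.1]  # toy values, not taken from any model
for t in (0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs], round(entropy(probs), 3))
```

In this sketch the entropy grows with the temperature, which is the only property of the transformation that the rest of the argument relies on.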
Equivalently, this can be seen as the result of applying a suitable temperature transformation functional $F_T$:
$$F_T : \mathbb{R}^{|\mathcal{D}|} \to \mathbb{R}^{|\mathcal{D}|}, \quad \hat{P} = F_T(P), \tag{5}$$
with the ideal identity limit $F_0(P) = P$. Many implementations use temperature $T$ so that the model effectively computes something like $F_T(P)$, a functional transformation of the original token probability vector $P$, where $T = 0$ corresponds (ideally) to purely greedy decoding, and $T > 0$ allows stochastic sampling.
4 Modelling Intrinsic Nondeterminism at $T = 0$
As noted by the authors of [3], real systems exhibit implementation-dependent perturbations even under $T = 0$. Let $I \in \mathcal{I}$ denote the inference environment (batch size and composition, concurrency/load, hardware/backends, kernel choices, numeric precision, reduction ordering, etc.) and $F'_T$ the temperature transformation functional of the real system. We model a perturbation $\epsilon_I$, mapping probability distributions over the set $\mathcal{D}$ to probability distributions over the same set, that alters the effective distribution as:
$$F'_0(P) = \epsilon_I(F_0(P)) \approx \epsilon_I(P). \tag{6}$$
While $\epsilon_I(P)$ may differ only slightly from $F_0(P)$, in regions where multiple tokens have similar probability mass even slight changes can flip the argmax in (2) and thus the emitted token sequence.
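A small numerical sketch makes the argmax-flip mechanism concrete. The first lines show that floating-point addition is not associative; the rest estimates how often a tiny additive perturbation changes the argmax when two logits are nearly tied versus well separated. The uniform noise model, the noise scale, and the toy logits are assumptions chosen for illustration and are not a model of any real kernel.

```python
import random

# Floating-point addition is not associative: the result depends on the order.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right, left, right)

def argmax(values):
    return max(range(len(values)), key=lambda v: values[v])

def flip_rate(logits, noise_scale, trials=10_000, seed=0):
    """Fraction of trials in which tiny additive noise changes the argmax."""
    rng = random.Random(seed)
    baseline = argmax(logits)
    flips = 0
    for _ in range(trials):
        noisy = [z + rng.uniform(-noise_scale, noise_scale) for z in logits]
        flips += argmax(noisy) != baseline
    return flips / trials

near_tie = [3.0000001, 3.0, -1.0]   # top two tokens almost tied
separated = [3.0, 2.0, -1.0]        # clear winner
print(flip_rate(near_tie, 1e-6), flip_rate(separated, 1e-6))
```

With a clear winner the perturbation never changes the selected token, while with a near-tie a perturbation many orders of magnitude smaller than the logits does so frequently.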
5 Equivalent Temperature and Background Temperature
We posit that the perturbation in (6) behaves as if decoding were performed by an inference-environment-free (ideal) system at a nonzero equivalent temperature $T_n(I)$:
$$F'_0(P) \approx \epsilon_I(P) \approx F_{T_n(I)}(P). \tag{7}$$
This motivates the following definition.
Definition (Background temperature). The background temperature of an LLM implementation is the expected equivalent temperature induced by the inference environment under nominal $T = 0$:
$$T_{bg} \triangleq \mathbb{E}_{I \in \mathcal{I}}[T_n(I)]. \tag{8}$$
Intuitively, $T_{bg}$ captures the implicit randomness in a deployment stack even when the user selects $T = 0$.
6 Estimating $T_n(I)$ and $T_{bg}$ Empirically
The problem with the definition given in (8) is that the inference-environment-free (ideal) system may not be at hand. In fact, the key challenge in estimating $T_n(I)$ is that it requires comparing to a perfect, deterministic reference - which may not exist in practice. To make $T_n(I)$ calibration feasible without an unattainable ideal, one can first identify a quasi-ideal environment: for example, by using inference pipelines with batch-invariant kernels (in normalization, matrix multiplication, attention), fixed numeric precision, minimal or single-request concurrency, and deterministic configuration flags. Thinking Machines Lab demonstrates that replacing standard kernels with batch-invariant ones drastically reduces output divergence under zero temperature [3]. Similarly, [9] show that floating-point non-associativity and asynchronous parallel reductions are major sources of run-to-run variability, and that enforcing deterministic alternatives significantly stabilizes inference and scientific computing pipelines. Based on this evidence, one can anchor measurement of $T_n(I)$ relative to such quasi-ideal baselines, or employ multiple such baselines (differing in hardware, kernel implementation, or precision) to absorb uncertainty. Further, measuring various output statistical distributions (rather than only output strings) allows matching of environments $I$ to baselines via statistical divergence metrics, reducing sensitivity to rare argmax flips. Reporting $T_n$ together with such baseline variances yields operationally meaningful estimates even in the absence of a perfect oracle.
Another practical way to assess the background temperature of an online model (e.g. ChatGPT) is to use a local installation of another model (e.g. Llama) as a benchmark reference. The local model must be configured to be as deterministic and stable as possible: fixed precision, consistent batch sizes, kernel implementations that do not alter behavior when batch shape changes, deterministic reduction orders, disabled non-deterministic/autotuned operations. This reference becomes a baseline environment that approximates "ideal behavior". Then, by comparing output distributions from the online model versus those from the stable local model, one can compute how far the online model's behavior diverges, for example via measures like Jensen-Shannon divergence or KL divergence. By finding what temperature setting of the local model would make its distribution match the diverged distribution of the online model, it is possible to infer an equivalent temperature for the online model in that environment. Repeated across many prompts and local configurations, this yields an estimate of the online model's background temperature, together with uncertainty bounds. This method avoids relying on an unattainable perfect system by using the best stable reference one can build.
With these considerations in mind, we can outline a practical protocol to estimate $T_n(I)$ and $T_{bg}$, which is pictorially described in Figure 1.
6.1 Prompt Sets and Datasets
The first element of the protocol is a relevant prompt set $\Pi$, an element of the theoretical set $\mathcal{P}$ of all possible combinations of prompts. To explore the full range of behavior of the system under test, the suggestion is to use a diverse evaluation suite, e.g.:
• General generation prompts (short/long, common/rare vocab).
• Task benchmarks: QA (e.g., SQuAD [7] / TriviaQA [4]), summarization, cloze, and short-format classification. Code-generation prompts if applicable.
• Edge/adversarial prompts: long contexts, rare tokens, near-ties among top-$k$ token probabilities.
• Synthetic prompts engineered to create finely balanced next-token choices.
Figure 1: Measuring protocol.
6.2 Controlling the Inference Environment $I$
Run repeated inference (e.g., $M \ge 50$ per prompt) at $T = 0$ while varying $I$ along axes known to influence nondeterminism:
• Batch structure: batch size, e.g. $\in \{1, 2, 4, 8, 16, 32, \dots\}$; co-batching with other prompts vs. serial.
• Concurrency/load: single request vs. many simultaneous requests.
• Hardware/backends: GPU types, CPU vs. GPU, precision (fp16/bf16/fp32), kernel implementations (batch-invariant vs. standard).
• Numerics: reduction order, deterministic flags in frameworks, fused vs. unfused kernels.
For remote systems, for which it may be impossible or impractical to govern the inference environment, one can assume that prolonged and repeated operation is a good way to sample the statistical distribution of the inference environment.
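For a locally controlled system, the set of environments to visit can be enumerated as a Cartesian product of the axes above. The sketch below only builds the list of configurations; the axis names and values are hypothetical placeholders, and how a configuration is actually applied to an inference stack is left open.

```python
from itertools import product

# Hypothetical axes; names and values are placeholders, not a fixed schema.
axes = {
    "batch_size": [1, 2, 4, 8, 16, 32],
    "co_batching": ["serial", "mixed"],
    "concurrency": ["single", "many"],
    "precision": ["fp16", "bf16", "fp32"],
    "kernels": ["standard", "batch_invariant"],
}

# One dict per environment I, covering every combination of axis values.
environments = [dict(zip(axes, combo)) for combo in product(*axes.values())]

M = 50  # repeated runs per prompt in each environment (lower bound suggested above)
print(len(environments), "environments,", len(environments) * M, "runs per prompt")
print(environments[0])
```

A full grid grows quickly, so in practice one might sample a subset of these configurations rather than run all of them.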
6.3 Reference Runs at Known Temperatures
Under a stable environment $I_{\mathrm{stable}}$, e.g. a local anchor system, run the same prompts at a grid of $T \in \{0, \dots, 1, \dots\}$ to build a mapping between $T$ and output-variability statistics. As noted earlier, this stable environment can either be a specific configuration of the system under test or another anchor used as reference. Given that the anchor configuration is supposed to be stable as far as the inference environment is concerned, a lower number $K$ of runs for each prompt in the prompt set should suffice.
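The mapping between $T$ and a variability statistic can be sketched with a toy stand-in for the anchor system. The "model" below is an assumption made purely for illustration: it uses fixed per-step logits, whereas a real autoregressive model conditions each step on the tokens already generated. The statistic is the largest fraction of identical outputs among $K$ runs.

```python
import math
import random
from collections import Counter

def sample_sequence(step_logits, temperature, rng):
    """Sample one token index per step from fixed per-step logits (toy model)."""
    tokens = []
    for logits in step_logits:
        if temperature == 0:
            # Greedy decoding: always take the argmax.
            tokens.append(max(range(len(logits)), key=lambda v: logits[v]))
            continue
        top = max(logits)
        weights = [math.exp((z - top) / temperature) for z in logits]
        tokens.append(rng.choices(range(len(logits)), weights=weights)[0])
    return tuple(tokens)

def simulated_exact_match(step_logits, temperature, runs, seed=0):
    """Largest fraction of identical outputs among `runs` sampled sequences."""
    rng = random.Random(seed)
    outputs = [sample_sequence(step_logits, temperature, rng) for _ in range(runs)]
    return Counter(outputs).most_common(1)[0][1] / runs

toy_logits = [[2.0, 1.0, 0.0], [0.5, 0.4, 0.1], [3.0, 0.0, 0.0]]  # invented values
for t in (0, 0.1, 0.5, 1.0):
    print(t, simulated_exact_match(toy_logits, t, runs=32))
```

On a real anchor system the same loop would call the model instead of `sample_sequence`, once per prompt and per grid temperature.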
6.4 Variability Metrics
The key element of the protocol is the set of metrics used to obtain the association between the sought-for background temperature parameter of the system under test and the reference measurements on the anchor system. Since $T_{bg}$ is thought of as a generic high-level account of the system's nondeterminism, metrics should be content-agnostic. Furthermore, since different systems are trained independently, it is practically certain that the same prompt would produce different outputs even under strict deterministic configurations. For example, for each prompt, and across the $M$ (or $K$) runs of Figure 1, compute process parameters such as:
• Exact-match rate: fraction of runs producing identical strings for the same prompt.
• First-divergence index: position of the first token mismatch across pairs of runs.
• Edit distance: first-order and second-order statistics between different outputs.
• Distributional divergence: e.g., symmetrized KL or JS divergence between empirical next-token distributions (top-$k$) across runs.
• Entropy of next-token distributions.
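The first three metrics in the list can be sketched in a few lines. This is an illustrative implementation under simple assumptions: outputs are compared as exact strings, tokens are obtained by whitespace splitting (a real pipeline would use the model's tokenizer), and the edit distance is the standard Levenshtein distance.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pvariance

def exact_match_fraction(outputs):
    """Largest fraction of runs that produced the same string."""
    return Counter(outputs).most_common(1)[0][1] / len(outputs)

def first_divergence_index(a, b):
    """Index of the first differing token, or None if the sequences are identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None if len(a) == len(b) else min(len(a), len(b))

def edit_distance(a, b):
    """Levenshtein distance between two token sequences (or strings)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]

# Toy outputs for one prompt (invented strings).
runs = ["the cat sat", "the cat sat", "the dog sat", "the cat sat"]
tokens = [r.split() for r in runs]
distances = [edit_distance(a, b) for a, b in combinations(tokens, 2)]

print(exact_match_fraction(runs))                     # fraction of identical runs
print(first_divergence_index(tokens[0], tokens[2]))   # first mismatching position
print(mean(distances), pvariance(distances))          # first/second-order statistics
```

Each function returns one number per prompt (or per pair of runs), so collecting them over the prompt set gives the empirical distributions used in the rest of the protocol.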
Then, for each variability metric computed across the runs, construct a multidimensional distribution $f$ that captures the values of the variability metrics of the system considered. In particular, we denote by $f_T(I_{\mathrm{stable}})$ and $g(I)$ respectively the distribution of the variability metrics of the reference system when the temperature is $T$ and of the system under test set at temperature 0. Note that these distributions depend on multiple factors, including the specific LLMs used; for notational simplicity, we omit these dependencies.
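6.5 Estimators of $T_n$ and $T_{bg}$
As explained in Section 6, the ideal reference system does not exist. However, it is possible to estimate $T_n$ using some reference model running in an environment $I_{\mathrm{stable}}$ that is as stable as possible. In particular, for a reference LLM $\ell$, it is possible to compute an estimator $\hat{T}^{\ell}_n = \hat{T}^{\ell}_n(I, \Pi)$ of $T_n$, for each $I$ in the set of environments considered $\tilde{\mathcal{I}} \subseteq \mathcal{I}$ and each $\Pi$ in the set of all the collections of prompts considered $\tilde{\mathcal{P}} \subseteq \mathcal{P}$, as
$$\hat{T}^{\ell}_n = \arg\min_{T \ge 0} D\big(f_T(I_{\mathrm{stable}}),\, g(I)\big), \tag{9}$$
where $D$ is a chosen divergence (e.g., JS or KL divergence, or a weighted combination) applied to the variability distributions $g(I)$ and $f_T(I_{\mathrm{stable}})$, corresponding respectively to the system under test and to the reference system based on $\ell$ (see Section 6.4). Therefore, it is possible to compute an estimate $\hat{T}_{bg} = \hat{T}_{bg}(\ell)$ of $T_{bg}$ from the individual estimates $\hat{T}_n$, as
$$\hat{T}_{bg}(\ell) = \frac{1}{|\tilde{\mathcal{I}}|} \frac{1}{|\tilde{\mathcal{P}}|} \sum_{I \in \tilde{\mathcal{I}}} \sum_{\Pi \in \tilde{\mathcal{P}}} \hat{T}^{\ell}_n(I, \Pi), \tag{10}$$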
where $|\tilde{\mathcal{I}}|$ and $|\tilde{\mathcal{P}}|$ denote, respectively, the number of all the $I$ and $\Pi$ considered. To further improve robustness, we repeat the same process across a set of different reference LLMs $\mathcal{L}$ and take the average¹
$$\bar{T}_{bg} = \frac{1}{|\mathcal{L}|} \sum_{\ell \in \mathcal{L}} \hat{T}_{bg}(\ell), \tag{11}$$
where $|\mathcal{L}|$ denotes the number of different $\ell$ (LLMs) used. Theoretically, as the set of reference LLMs $\mathcal{L}$, prompts, environments, and variability metrics grows, we can expect $\bar{T}_{bg}$ to converge to the true $T_{bg}$, as defined in (8).
¹ Beyond the average estimate, the availability of multiple reference LLMs and configurations also allows the computation of higher-order moments and confidence intervals, providing a more precise characterization of the uncertainty associated with this kind of estimate.
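The estimators in (9)-(11) can be sketched as a grid search followed by two averages. The sketch assumes the variability metric is a single number in $[0, 1]$ per prompt, that the continuous arg min in (9) is replaced by a minimum over a finite temperature grid, and that $D$ is the Jensen-Shannon divergence between histograms; all names and the toy numbers are hypothetical.

```python
import math

def histogram(values, bins=10):
    """Normalised histogram of values in [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]

def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, which lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def estimate_equivalent_temperature(reference, observed, bins=10):
    """Eq. (9) on a finite grid: the T whose reference distribution is closest to g."""
    g = histogram(observed, bins)
    return min(reference, key=lambda t: js_divergence(histogram(reference[t], bins), g))

def average(values):
    values = list(values)
    return sum(values) / len(values)

# Toy exact-match fractions per prompt (invented numbers, illustration only).
reference = {            # f_T for one reference LLM, keyed by grid temperature
    0.0: [1.0, 1.0, 1.0, 1.0],
    0.1: [1.0, 0.9, 0.8, 1.0],
    0.5: [0.4, 0.3, 0.5, 0.2],
}
observed = {             # g for each (environment, prompt set) pair at T = 0
    ("env_a", "prompts_1"): [1.0, 1.0, 0.9, 0.8],
    ("env_b", "prompts_1"): [1.0, 1.0, 1.0, 1.0],
}

t_n = {key: estimate_equivalent_temperature(reference, g) for key, g in observed.items()}
t_bg_per_llm = [average(t_n.values())]   # Eq. (10), here for a single reference LLM
t_bg = average(t_bg_per_llm)             # Eq. (11) over the reference LLMs
print(t_n, t_bg)
```

Because the search is restricted to the grid, the resolution of the estimate is bounded by the grid spacing, which is one reason to sample low temperatures densely.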
6.6 Engineering to Reduce $T_{bg}$
Once the background temperature $T_{bg}$ of a given system is available, several mechanisms can be put in place to mitigate its effect. For example, empirical and systems work suggests several interventions:
• Batch-invariant kernels for core ops (matmul, attention, RMSNorm) to prevent batch-shape-dependent numerics [3].
• Deterministic reductions and stable accumulation orders where feasible [9].
• Consistent pipelines: fix kernel configs across shapes; avoid opportunistic algorithm switching that alters reduction paths [8].
• Deterministic flags in frameworks and careful precision selection [9].
• Operational controls: cap concurrency or shape buckets to reduce co-batching variability [3].
Ablation studies can further determine which intervention has the largest impact on the background temperature. This transforms the outlined protocol into an iterative practice aimed at controlling the nondeterministic characteristics of the system in use, as opposed to a mere observation of an empirical phenomenon.
7 Pilot Experiments
In this section, we present some experiments to validate the theory presented in this work. In particular, in Section 7.1, we present a simple pipeline for estimating the background temperature of a given model. After that, we present additional experiments that could clarify and add elements to analyze the background temperature.
7.1 Basic pipeline for estimating $T_{bg}$
Here we perform a pilot experiment to estimate $T_{bg}$ of the OpenAI model gpt-4.1-nano accessed via the Microsoft Azure AI services with temperature $T = 0$ (i.e., considering it as System B in Figure 1). Note that, since the model is used via a third-party service, we cannot control the inference environment $I$ but only the temperature. The prompt set $\Pi$ used is composed of the first 200 questions of the dataset truthful_qa² (see [5]).
The reference LLM $\ell$, playing the role of System A in Figure 1, is the Hugging Face LLM SmolLM3-3B³ (see [2]). As outlined in previous sections, we selected representative temperature values $\Theta$ sampled in increments of 0.01 from 0 to 0.2, in increments of 0.05 from 0.2 to 0.5, and in increments of 0.1 from 0.5 to 1, i.e.
$$\Theta = \{0, 0.01, \dots, 0.19, 0.2, 0.25, \dots, 0.45, 0.5, 0.6, \dots, 0.9, 1\}.$$
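As a small illustration, the grid described above can be generated programmatically. The sketch builds the values from integer hundredths, a choice made here to avoid accumulating floating-point error when stepping by 0.01; the function name is hypothetical.

```python
def build_temperature_grid():
    """Grid with step 0.01 on [0, 0.2], step 0.05 up to 0.5, and step 0.1 up to 1."""
    hundredths = (
        list(range(0, 21))           # 0.00, 0.01, ..., 0.20
        + list(range(25, 51, 5))     # 0.25, 0.30, ..., 0.50
        + list(range(60, 101, 10))   # 0.60, 0.70, ..., 1.00
    )
    return [h / 100 for h in hundredths]

theta = build_temperature_grid()
print(len(theta), theta[:3], theta[-3:])
```

Built this way, the grid contains 32 distinct temperatures.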
For each $T \in \Theta$, we generated $K = 32$ responses, limited to 32 tokens, with the reference LLM for each of the 200 prompts in $\Pi$. As variability metric (see Section 6.4), we used the exact-match fraction, i.e. for each temperature considered and each prompt in $\Pi$, we computed the maximum fraction of identical answers among the 32 generated. In this way, for each $T \in \Theta$ we obtained 200 values in the interval $[1/32, 1]$, which constitute the discrete distribution $f_T$ of the exact-match fraction for that temperature in the answers given by the reference LLM.
In Figure 2, these distributions are graphically represented, showing how the density estimate shifts from a delta concentrated at 1 when the temperature is 0 - indicating that all answers are identical - to a distribution with most of its mass near 0, indicating that the answers tend to be unique.
² https://huggingface.co/datasets/truthfulqa/truthful_qa
³ https://huggingface.co/HuggingFaceTB/SmolLM3-3B
Figure 2: Distribution of exact-match fractions obtained from the reference LLM answers. Top row (from left to right): histograms representing the distributions $f_0$, $f_{0.2}$ and $f_1$. Bottom row: kernel density estimates of the exact-match fraction for all sampled temperatures in $\Theta$. Note that for $T = 0$, the density is represented as a vertical line because all answers are identical, so the density is entirely concentrated at 1, forming a Dirac delta.
After computing the reference distributions $f_T$ for $T \in \Theta$ of the chosen variability measure, we computed the same for the model for which we want to estimate $T_{bg}$, i.e., gpt-4.1-nano, accessed via the Microsoft Azure AI services. To do this, we prompted the model 100 times for each of the 200 prompts in $\Pi$, this time setting the temperature at $T = 0$ and limiting the answers to 32 tokens, as done for the reference system. Then, analogously to the procedure for the reference system, for each prompt in $\Pi$ we computed the maximum fraction of identical answers provided by gpt-4.1-nano. These 200 values, in $[1/100, 1]$, form the discrete distribution $g$ (see Figure 3) that we need to compare with the reference distributions computed in System A (see (9)).
Figure 3: Discrete distribution $g$ of the fraction of identical answers given by the LLM under test, gpt-4.1-nano, for the prompts in $\Pi$. The distribution is shown both as histograms (with the $y$-axis on the left) and as a kernel density estimate (with the $y$-axis on the right).
In order to compare the discrete distributions of observations, $f_T$ for $T \in \Theta$ and $g$, we chose to use the Kolmogorov-Smirnov (K-S) distance, which is equal to 0 for identical distributions and 1 for completely different ones. The computed values of the K-S distance are reported in Figure 4.
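For reference, the two-sample K-S distance is the largest gap between the two empirical cumulative distribution functions. The sketch below is a plain-Python version written for illustration, with invented toy samples; it is not the code used in the experiment, and a library routine such as SciPy's `ks_2samp` could be used instead.

```python
def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    worst = 0.0
    for x in sorted(set(a) | set(b)):
        # Advance each pointer past all values <= x to get the CDF at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        worst = max(worst, abs(i / len(a) - j / len(b)))
    return worst

# Toy exact-match fractions (invented numbers, illustration only).
reference = {
    0.0: [1.0, 1.0, 1.0, 1.0],
    0.2: [1.0, 0.75, 0.5, 1.0],
    1.0: [0.25, 0.25, 0.5, 0.25],
}
observed = [1.0, 1.0, 0.75, 0.5]

distances = {t: ks_distance(values, observed) for t, values in reference.items()}
best_t = min(distances, key=distances.get)
print(distances, best_t)
```

The temperature with the smallest distance is the grid value selected by the matching step.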
From the values in Table 4 (b), we can conclude that the estimator of $T_{bg}$ found in this experiment is $\hat{T}_{bg}(\ell) = 0.05$ (which, in this simple case, coincides with $\bar{T}_{bg}$), as this is the case where $f_T$ is closest to $g$, considering only the reference distributions computed from $T \in \Theta$. Figure 5 shows the two matching histograms. Ideally, this experiment should be repeated using a wider range of $T$ values - especially lower ones - more prompts, looser token limits, and different variability metrics (see Sections 6.4 and 6.5). However, the purpose of this pilot experiment was simply to demonstrate the full procedure to estimate $T_{bg}$.
7.2 Extending the reference model set $\mathcal{L}$
One of the possibilities for making the estimate of $T_{bg}$ more robust is to add reference models, i.e. extend the set $\mathcal{L}$ introduced in Section 6.5. In particular, we used the LLM Llama-3.2-3B-Instruct⁴ and made it answer 32 times to the same 200 prompts (the same set
⁴ https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct