UoL-UPF at TSAR 2025 Shared Task: A Generate-and-Select Approach for Readability-Controlled Text Simplification

Author: Hayakawa, Akio; Khallaf, Nouran; Sharoff, Serge; Saggion, Horacio

Publisher: Zenodo

DOI: 10.5281/zenodo.17698365

Source: https://zenodo.org/records/17698365/files/2025.tsar-1.16.pdf

Proceedings of the Fourth Workshop on Text Simplification, Accessibility and Readability (TSAR 2025), pages 193–210
November 4-9, 2025 ©2025 Association for Computational Linguistics
UoL-UPF at TSAR 2025 Shared Task: A Generate-and-Select Approach for Readability-Controlled Text Simplification
Akio Hayakawa (1), Nouran Khallaf (2), Serge Sharoff (2), Horacio Saggion (1)
(1) Universitat Pompeu Fabra, (2) University of Leeds
{akio.hayakawa,horacio.saggion}@upf.edu
{N.Khallaf,S.Sharoff}@leeds.ac.uk
Abstract
The TSAR 2025 Shared Task on Readability-Controlled Text Simplification focuses on simplifying English paragraphs written at an advanced level (B2 or higher) and rewriting them to target CEFR levels (A2 or B1). The challenge is to reduce linguistic complexity without sacrificing coherence or meaning. We developed three complementary approaches based on large language models (LLMs). The first approach (Run 1) generates a diverse set of paragraph-level simplifications. It then applies filters to enforce CEFR alignment, preserve meaning, and encourage diversity, and finally selects the candidates with the lowest perceived risk. The second (Run 2) performs simplification at the sentence level, combining structured prompting, coreference resolution, and explainable AI techniques to highlight influential phrases, with candidate selection guided by automatic and LLM-based judges. The third hybrid approach (Run 3) integrates both strategies by pooling paragraph- and sentence-level simplifications, and subsequently applying the identical filtering and selection architecture used in Run 1. In the official TSAR evaluation, the hybrid system ranked 2nd overall, while its component systems also achieved competitive results.
1 Introduction
Text Simplification aims to make complex texts more accessible to a broad audience, including language learners and individuals with reading difficulties (Saggion, 2017; Al-Thanyyan and Azmi, 2021). However, many traditional approaches fail to meet the diverse needs of readers at different proficiency levels. To address this, the field has moved towards targeted simplification, which aims to adapt the complexity of a text to a specific reader's needs, rather than just simplifying it for a general audience (Barayan et al., 2025; Säuberli et al., 2024). This requires defining specific proficiency targets, and the Common European Framework of Reference for Languages (CEFR) has been widely used for this purpose (Imperial et al., 2025). Also, the majority of text simplification research has focused on the sentence level, while largely overlooking the more practical scenario of paragraph-level simplification. The TSAR 2025 Shared Task on Readability-Controlled Text Simplification is situated within this context, challenging participants to simplify paragraphs originally at B2 level or above to target levels of A2 and B1 (Alva-Manchego et al., 2025).
In this paper, we propose and validate a Generate-and-Select approach that does not rely on a single best prompt, model, or simplification strategy. Our primary goal was to achieve a high score on a key evaluation metric: similarity to the reference text. The official evaluation, conducted only automatically, was based on three metrics: CEFR compliance, output-to-original similarity (Meaning Preservation), and output-to-reference similarity. While the first two could be calculated by participants themselves, the reference texts were not provided. Our system therefore aimed for a high output-to-reference similarity.
To achieve this, we developed a powerful generate-and-select pipeline based on paragraph-level simplification (Run 1) as our core approach. This system first generates a diverse set of candidates, then filters them to create a high-quality candidate pool, and finally applies Minimum Bayes Risk (MBR) decoding (Bickel and Doksum, 1977) to select the optimal output. As demonstrated by Heineman et al. (2024), the diversity of candidates is crucial for enhancing the quality of MBR decoding. To further improve its performance, we introduced a sentence-level system (Run 2). While weaker on its own, this secondary system successfully injected structural diversity into our candidate pool. Our final, hybrid system (Run 3) combines the candidate pools from both Run 1 and Run 2. It then processes this combined pool using the same pipeline as Run 1 to select the optimal output.
Our approach proved highly effective in the shared task. Among 48 submissions from 20 international teams, our hybrid system (Run 3) and core system (Run 1) placed 2nd and 3rd overall. Notably, Run 3 and Run 1 ranked 1st and 2nd on reference text similarity, respectively, confirming the success of our primary objective.
However, our success also revealed an inherent limitation of the evaluation metric we focused on optimizing. Our case study highlights that while the metric is designed to capture deep semantic similarity, its scores can still be influenced by surface-level features. This can be misleading, as lexical overlap can sometimes outweigh semantic factuality in the score.
The main contributions of this paper are:
• We present a Generate-and-Select pipeline that successfully maximizes reference similarity.
• We demonstrate that even a weak system can contribute the diversity needed for a powerful selection pipeline.
• We analyse the limitations of the evaluation metric we focused on optimizing.
The experimental setup is available on GitHub.[1]
2 Our pipeline
Our submission consists of three systems (Runs 1-3). Our core approach, which achieved 3rd place overall, is presented as Run 1. While our primary objective is to achieve a high output-to-reference similarity, we also aim to attain satisfactory scores in the other metrics, namely CEFR compliance and meaning preservation.
2.1 Run 1: Paragraph-Level MBR System
Run 1 is our primary system, designed to maximize the similarity between system outputs and reference texts through a multi-stage pipeline. As shown in Figure 1, the core approach is a three-stage process. We first generate a diverse set of candidates, and then select a high-quality subset by applying CEFR and Meaning Preservation filtering. Finally, we apply MBR decoding to select the output with the lowest risk.
2.1.1 Diverse Candidate Generation
The process starts with generating a large set of initial simplification candidates for each source paragraph and its corresponding target CEFR level. To ensure a rich and varied candidate pool, this generation process employs two key diversity strategies: multi-prompting and multi-model.
[Figure 1: System Architecture of Run 1 and 3. Run 1: 4 LLMs x 4 prompts x 5 trials produce 80 diverse candidates. Run 3 additionally combines sentence-level simplifications from three LLMs (3^(n_sents) combinations) into up to 80 further candidates. Both runs then apply CEFR filtering (CEFR-aligned candidates), Meaning Preservation Ranking + Diversity-Aware Selection (candidate pool of up to 20), and MBR decoding (final output).]
[1] https://github.com/ahaya3776/tsar2025sharedtask-uol-upf
• Multi-Prompting: We prepare four types of prompts, with three of them automatically generated by an LLM. Our prompts include two inductive prompts derived from trial data, a deductive prompt based on CEFR-adapted simplification rules, and a standard few-shot prompt. (See Appendix A for the details.)
• Multi-Model: The prompts above are run across four auto-regressive large language models (LLMs), GPT-4.1-mini,[2] gpt-oss-20b (OpenAI, 2025), Gemma-3-4b-it (Gemma, 2025), and Qwen-2.5-14b-it (Qwen, 2025), to capture the unique simplification tendencies of each model.
For each combination of prompt and LLM, we performed five simplification trials, using five separate API calls or five different seeds. As a result, we generated 80 candidates per simplification instance (4 LLMs x 4 prompts x 5 trials). See Appendix C for the hyperparameter settings.
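
The 4 x 4 x 5 grid described above can be enumerated as follows. This is only an illustrative sketch: the prompt labels `P1`..`P4`, the function name `generation_jobs`, and the use of the trial index as a seed are assumptions made here, not details from the released code.

```python
from itertools import product

# Model names as listed in the paper; prompt labels are placeholders.
MODELS = ["GPT-4.1-mini", "gpt-oss-20b", "Gemma-3-4b-it", "Qwen-2.5-14b-it"]
PROMPTS = ["P1", "P2", "P3", "P4"]
N_TRIALS = 5


def generation_jobs(models=MODELS, prompts=PROMPTS, n_trials=N_TRIALS):
    """Enumerate one (model, prompt, trial) job per candidate to generate."""
    return list(product(models, prompts, range(n_trials)))


jobs = generation_jobs()
print(len(jobs))  # 4 models x 4 prompts x 5 trials
```

Each job would then be sent to the corresponding LLM once, yielding one candidate per job for a given source paragraph and target level.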
2.1.2 Candidate Pool Construction
After the generation stage, we filter, rank, and select from the initial set of candidates. This process creates an optimized candidate pool of up to 20 simplifications for MBR decoding.
[2] https://openai.com/index/gpt-4-1/
1. CEFR Filtering: First, we label the CEFR level (A1, A2, B1, B2, C1, and C2) of all candidates and obtain the minimum difference from the target CEFR level. Given the large number of candidates, this minimum difference is almost always zero (i.e., at least one candidate matches the target CEFR level). We then retain only the candidates that have this minimum difference. CEFR levels are labeled using the classification models used in the official shared task evaluation.
2. Meaning Preservation Ranking: The remaining CEFR-compliant candidates are ranked by their semantic similarity to the original source paragraph. We use MeaningBERT (Beauchemin et al., 2023), following the official evaluation.
3. Diversity-Aware Selection: From this ranked list, we build the final pool with a maximum size of 20. We select candidates primarily based on the previous ranking. However, to maximize the benefits of MBR decoding, which requires a diverse candidate pool (Heineman et al., 2024), we apply a filter to ensure structural diversity. A candidate is added to the pool only if its BLEU (Papineni et al., 2002) against every candidate already in the pool is below a threshold of 0.5.
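
The three steps above can be sketched as a single function. This is a minimal illustration under stated assumptions: `cefr_of`, `meaning_score`, and `overlap` are caller-supplied stand-ins for the CEFR classifier, MeaningBERT, and BLEU, and the function name `build_pool` is hypothetical.

```python
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def build_pool(candidates, target, cefr_of, meaning_score, overlap,
               max_size=20, threshold=0.5):
    """Filter, rank, and select candidates for MBR decoding.

    cefr_of(text) -> CEFR label, meaning_score(text) -> similarity to the
    source, overlap(a, b) -> surface overlap in [0, 1]. All three are
    stand-ins (assumptions) for the models used in the paper.
    """
    if not candidates:
        return []
    target_idx = CEFR_LEVELS.index(target)
    # 1. CEFR filtering: keep only candidates at the minimum level distance.
    distance = {c: abs(CEFR_LEVELS.index(cefr_of(c)) - target_idx)
                for c in candidates}
    best = min(distance.values())
    aligned = [c for c in candidates if distance[c] == best]
    # 2. Meaning preservation ranking: most similar to the source first.
    ranked = sorted(aligned, key=meaning_score, reverse=True)
    # 3. Diversity-aware selection: skip near-duplicates of pooled items.
    pool = []
    for cand in ranked:
        if len(pool) >= max_size:
            break
        if all(overlap(cand, kept) < threshold for kept in pool):
            pool.append(cand)
    return pool
```

Passing the scorers as arguments keeps the sketch independent of any particular BLEU or MeaningBERT implementation; a real run would plug in the official evaluation models.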
2.1.3 MBR Decoding
Finally, we apply MBR decoding to the constructed pool. MBR selects the single candidate that maximizes the expected utility function against all other candidates in the set. For the utility function, we again use MeaningBERT, measuring the pairwise similarity between candidates. The candidate with the highest average similarity score against the other candidates is selected as the final output. The final output $\hat{y}_{\mathrm{MBR}}$ can be expressed as:
\[
\hat{y}_{\mathrm{MBR}} = \operatorname*{arg\,max}_{y \in H} \left( \mathbb{E}_{H}\left[ \mathbb{E}_{y' \in H}\left[ u(y, y') \right] \right] \right), \qquad (1)
\]
where $H$ is a candidate pool and $u(y, y')$ is a utility function, defined as MeaningBERT$(y, y')$.
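
A minimal sketch of the selection rule in Eq. (1) follows. The token-level Jaccard utility is only a toy stand-in for MeaningBERT, and the names `mbr_select` and `token_jaccard` are assumptions introduced for illustration. The sketch averages over the other candidates only, as the prose describes.

```python
def token_jaccard(a, b):
    """Toy utility: Jaccard overlap of token sets (stand-in for MeaningBERT)."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


def mbr_select(pool, utility=token_jaccard):
    """Return the candidate with the highest mean utility against the others."""
    if not pool:
        raise ValueError("candidate pool is empty")
    if len(pool) == 1:
        return pool[0]

    def expected_utility(i):
        # Average pairwise utility of candidate i against every other one.
        others = [utility(pool[i], pool[j])
                  for j in range(len(pool)) if j != i]
        return sum(others) / len(others)

    best_index = max(range(len(pool)), key=expected_utility)
    return pool[best_index]


pool = ["the cat sat on the mat", "the cat sat on a mat", "a dog barked"]
print(mbr_select(pool))
```

Because the winner is the candidate closest on average to the rest of the pool, the choice tends toward a consensus output rather than an outlier.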
2.2 Run 2: Sentence-level Simplification
Our second system approaches the task at the sentence level. Prior work has shown that long, coreferential sentences with dense terminology are a key source of difficulty for readers and are best addressed through targeted edits rather than global rewrites (Siddharthan, 2006; Shardlow, 2014; Štajner and Popović, 2016; Barayan et al., 2025). Run 2 therefore investigates whether explicit linguistic control, applied locally at the sentence level, can better align outputs with CEFR levels while preserving meaning (for the system architecture, see Appendix E). By simplifying sentences independently, while still highlighting the most important phrases, we aim to produce outputs that are both controlled and interpretable. Run 2 consists of the following steps.
Preprocessing. Each paragraph is first segmented into sentences and normalised for coreference. We replace ambiguous pronominal references (e.g., he, she, they, it) with their antecedents using AllenNLP's coreference system (Lee et al., 2017) and the spaCy-compatible coref module (Honnibal et al., 2020). This produces a list of self-contained sentences that can be simplified independently.
Highlighting influential phrases. To identify which parts of a sentence contribute most to linguistic complexity, we apply Integrated Gradients (IG) (Sundararajan et al., 2017). We apply Captum's LayerIntegratedGradients (Miglani et al., 2023) over the embedding layer of a sentence-based CEFR classifier (Barayan et al., 2025), using a padded baseline sequence and integrating gradients with respect to the "complexity" logit. Token-level attribution scores are aggregated into multi-word phrases (NP, VP, ADJP, PP) using spaCy chunks. The top-K phrases (default K=6) are retained by absolute score. These influential phrases are exported as (type, phrase, score) triples and injected into the simplification prompt (see Appendix B.1). This allows the LLM to focus on which terms to simplify or gloss.
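
The aggregation and top-K step can be illustrated without the classifier itself. In the sketch below, token attributions and phrase spans are given as plain inputs; summing token scores within a phrase is an assumption (the paper does not state the aggregation function), and `top_k_phrases` is a hypothetical name.

```python
def top_k_phrases(token_scores, spans, k=6):
    """Aggregate token attributions into phrases and keep the top-k.

    token_scores: one attribution score per token.
    spans: (phrase_type, start, end, text) tuples with token offsets,
           end exclusive, e.g. as produced by a chunker.
    Returns (type, phrase, score) triples sorted by absolute score.
    """
    triples = []
    for phrase_type, start, end, text in spans:
        # Assumption: a phrase's score is the sum of its token scores.
        score = sum(token_scores[start:end])
        triples.append((phrase_type, text, score))
    triples.sort(key=lambda t: abs(t[2]), reverse=True)
    return triples[:k]


scores = [0.1, 0.5, -0.9, 0.2, 0.05]
spans = [("NP", 0, 2, "the committee"),
         ("VP", 2, 3, "promulgated"),
         ("NP", 3, 5, "new rules")]
for triple in top_k_phrases(scores, spans, k=2):
    print(triple)
```

Ranking by absolute value keeps phrases with strongly negative attributions as well, matching the "retained by absolute score" description.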
The same influential phrases have another role in the evaluator step, in which the metric verifies whether these spans are preserved in the simplified output. In this way, IG attributions serve a dual purpose: guiding generation and informing evaluation.
Simplification strategies. We guide the models with strategies inspired by intralingual translation and Easy-to-Read (E2R) English (Khallaf et al., 2025). These include explanation (adding glosses), modulation (one idea per sentence), synonymy (simple words), syntactic changes (splitting clauses), and omission (dropping non-essential details).
Prompting and candidate generation. We prompt three LLMs, LLaMA-3-8B (Dubey et al., 2024), GPT-4o (OpenAI, 2023), and Mistral-7B (Jiang et al., 2023), to generate simplifications for CEFR levels A1, A2, and B1 in a single response. Prompts enforce constraints on meaning preservation, correctness of entities and numbers, readability (shorter sentences, simple words), and strict formatting with explicit level tags (see Appendix B.1).
Automatic and hybrid judging. Candidate outputs are scored by an automatic judge that integrates eight complementary signals (see Table 7 in Appendix F). These include semantic similarity based on sentence embeddings and entailment (Williams et al., 2018), key-phrase coverage from IG attributions, entity and number fidelity using spaCy (Honnibal et al., 2020), readability targets derived from average sentence length (ASL) and Flesch Reading Ease (Flesch, 1948), lexical simplification (syllable reduction), fluency via language model perplexity (Jurafsky and Martin, 2023), compression ratio, and sentence/format control.
We combine these heterogeneous metrics with a weighted geometric mean, which is widely used in multi-criteria evaluation (Mohapatra and Kumar, 2015; Dodd et al., 2021). When two candidates score within a small margin, we invoke a Hybrid Auto+LLM (HAI) judge, which queries a second LLM (GPT-4o or LLaMA) to make a pairwise choice with justification. We pass the original, the target level, and the top-K candidates (prefiltered by the auto judge) to this second LLM, which returns a winner index and a reason (see Appendix B.2). After simplification, sentences are re-stitched into the level-tagged blocks (<B1>, <A2>, <A1>).
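
The score combination and the tie check can be sketched as below. The signal names, weights, the `eps` floor, and the `margin` value of 0.02 are illustrative assumptions; the paper only states that a weighted geometric mean is used and that the LLM judge is invoked when scores fall within a small margin.

```python
import math


def weighted_geometric_mean(scores, weights, eps=1e-9):
    """Combine per-signal scores in [0, 1] with a weighted geometric mean.

    scores, weights: dicts keyed by signal name. eps floors zero scores so
    the logarithm stays defined (an assumption of this sketch).
    """
    total_weight = sum(weights[name] for name in scores)
    log_sum = sum(weights[name] * math.log(max(scores[name], eps))
                  for name in scores)
    return math.exp(log_sum / total_weight)


def needs_llm_judge(score_a, score_b, margin=0.02):
    """True when two automatic scores are too close to separate reliably."""
    return abs(score_a - score_b) < margin


signals = {"semantic": 0.9, "readability": 0.6, "fluency": 0.8}
weights = {"semantic": 2.0, "readability": 1.0, "fluency": 1.0}
print(round(weighted_geometric_mean(signals, weights), 3))
```

Unlike an arithmetic mean, the geometric mean drops sharply when any single signal is near zero, so a candidate cannot compensate for a badly failed criterion with high scores elsewhere.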
2.3 Run 3: Hybrid MBR System
Our best-performing system, Run 3, uses the same pipeline as Run 1 but starts with a more diverse set of initial candidates that also draws on Run 2. In addition to the 80 candidates generated in Run 1, we incorporate candidates based on the sentence-level simplification in Run 2. As shown in Figure 1, we generate candidates based on Run 2 by concatenating sentence-level simplifications. For each sentence in an original paragraph, three simplified sentences are generated by three different LLMs. The combination of simplified sentences results in 3^(n_sentences) potential paragraph variants, from which we randomly sample up to 80 candidates. From this combined set of up to 160 candidates, the final output is selected through the identical process described for Run 1.
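
A possible way to draw up to 80 paragraph variants from the 3^(n_sentences) combinations without materialising all of them is sketched below. Sampling distinct combinations, the fixed seed, joining sentences with a space, and the name `sample_paragraph_variants` are all assumptions of this sketch.

```python
import random


def sample_paragraph_variants(sentence_options, max_candidates=80, seed=0):
    """Sample distinct paragraph variants from per-sentence alternatives.

    sentence_options: one list per source sentence, each holding the
    simplified versions produced by the different LLMs.
    """
    rng = random.Random(seed)
    total = 1
    for options in sentence_options:
        total *= len(options)
    n = min(max_candidates, total)
    # Each integer in [0, total) encodes one combination in mixed radix.
    variants = []
    for index in rng.sample(range(total), n):
        chosen = []
        for options in sentence_options:
            index, pick = divmod(index, len(options))
            chosen.append(options[pick])
        variants.append(" ".join(chosen))
    return variants


options = [["A1.", "A2.", "A3."], ["B1.", "B2.", "B3."]]
print(len(sample_paragraph_variants(options)))  # 3^2 = 9 variants available
```

For a paragraph with five sentences there are 243 combinations, so the function would return 80 of them; for very short paragraphs it returns every combination.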
Team           CEFR RMSE   Sim Orig   Sim Ref   Total Rank
EhiMeNLP       0.000       .902       .845      1
UoL-UPF (3)    0.000       .856       .857      2
UoL-UPF (1)    0.000       .849       .856      3
HIT-YOU        0.158       .852       .835      4
Archaeology    0.122       .779       .804      11
ounlp          0.755       .855       .849      14
SQUREL         1.153       .979       .819      23
UoL-UPF (2)    0.693       .808       .827      -
Table 1: Representative results from 44 runs from 20 teams. The best performance for each metric is shown in red. Run 2 is an unofficial result due to a parsing error, and its estimated rank is around 20th.
                 A2             B1
Model            Num.   Sim     Num.   Sim
GPT-4.1-mini     24     .841    13     .865
gpt-oss-20b      31     .831    17     .902
Gemma-3-4b       16     .840    12     .862
Qwen-2.5-14b     26     .862    36     .877
Sentence-lv      3      .730    22     .860

                 A2             B1
Prompt           Num.   Sim     Num.   Sim
Prompt 1         19     .839    20     .872
Prompt 2         30     .838    15     .908
Prompt 3         24     .831    23     .867
Prompt 4         24     .866    20     .874
Sentence-lv      3      .730    22     .860
Table 2: Distribution of models and prompts selected as a final candidate in Run 3, with output-to-reference similarity scores by MeaningBERT.
3 Results and Discussions
Table 1 shows the official results of the shared task. The hybrid system (Run 3) is ranked 2nd, while the core system (Run 1) is 3rd overall. Furthermore, our systems placed 1st (tied, full marks) on CEFR alignment, and 1st and 2nd on output-to-reference similarity. This result confirms the success of our pipeline combining filtering and MBR decoding, which achieves high output-to-reference similarity while maintaining the other metrics.
Table 2 shows the distribution of selected candidates in Run 3, categorized by their source. The selections were generally distributed evenly across target levels and our various prompts, models, and granularities. The only exception is the sentence-level approach for the A2 target. This implies that adding explanations, often observed in simplification for lower proficiency levels, is hard to achieve via a sentence-level approach. This overall diversity was the key to the success of our MBR-based selection pipeline.
                             A2             B1
Ablation                     Orig   Ref     Orig   Ref
Run 3                        .836   .840    .876   .874
w/o Sent. lv (≡ Run 1)       .824   .837    .874   .875
w/o MPR, DAS, MBR            .756   .779    .817   .822
w/o MPR, DAS                 .815   .830    .850   .858
w/o DAS                      .849   .834    .891   .869
w/o MBR (Random)             .789   .793    .841   .832
w/o MBR (Highest MP)         .896   .814    .919   .858
w/ smaller MBR (size=10)     .853   .838    .888   .873
Table 3: MeaningBERT scores between outputs and original (Orig) and reference (Ref), as an ablation study of the processes after the CEFR filtering. MPR and DAS refer to Meaning Preservation Ranking and Diversity-Aware Selection, respectively.
Furthermore, we conducted an ablation study, shown in Table 3. As described above, final outputs are selected through Meaning Preservation Ranking, Diversity-Aware Selection, and MBR decoding after the CEFR filtering. The study shows that each of these steps contributed to improving output-to-reference similarity. Notably, MBR decoding boosted it, while increasing the candidate pool size produced only a negligible gain.
This success also highlights an important characteristic of our method. Figure 2 illustrates the distribution of MeaningBERT scores of CEFR-aligned candidates for one example instance. While the final output shows the highest output-to-original similarity, several candidates show higher output-to-reference similarity. This observation confirms that MBR decoding is designed to minimize the risk of selecting a low-scoring candidate, not to select one with the maximum expected score. As a result, final outputs are often conservative.
Despite prioritizing output-to-reference similarity, we acknowledge that over-reliance on this metric can be problematic. Our qualitative analysis shows limited agreement between scores and human judgments. Specifically, instances containing semantic errors or complex vocabulary (yellow in the scatter plot) are often over-evaluated by the metric when they are structurally similar to the reference. On the other hand, structural changes, such as sentence splitting, are penalized even if beneficial.
[Figure 2: Scatter plot of CEFR-aligned candidates for a single instance. Axes represent similarity scores between output and original/reference (Output-to-Original Similarity, Output-to-Reference Similarity). Circles are the ones selected for the candidate pool, and the diamond is the final output through MBR decoding. Legend: Excellent, Very Good, Poor, Other; MBR Output, Candidate Pool, Filtered Out. Colors align with Table 5 and Table 6, in which we manually judged simplification quality.]
Our case study suggests that MeaningBERT often fails to capture the value of features such as sentence splitting, synonym choice, and moral or pragmatic clarity, rewarding surface overlap instead of genuine accessibility (Barayan et al., 2025). We provide the full analysis in Appendix D.
4 Conclusion
In this paper, we presented our Generate-and-Select framework for the TSAR 2025 shared task, which achieved 2nd and 3rd place overall. Our core approach utilized a diverse candidate pool from multiple LLMs and prompts, with MBR decoding for robust selection.
Our primary contribution is demonstrating that our Generate-and-Select framework is highly effective. We showed that its strength lies in prioritizing the diversity of candidates, which allowed even a weaker system (our sentence-level Run 2) to make a contribution to the final performance by injecting variety.
Finally, our analysis shows that while our pipeline is robust, its limitation in a single-reference context highlights the need for selection methods that can better handle unpredictable simplifications.

Lay Summary
The UoL-UPF team participated in the TSAR 2025 Shared Task. The goal of this shared task was to rewrite difficult English texts into simple texts at a specific level.
We tried an idea we call the Generate-and-Select approach. In this approach, first, we used LLMs to generate many versions of simple texts. We used different LLMs and prompts, so there were a lot of options to choose from. This variety was a key part of our idea. Next, we selected the best option from these simple texts. We built a system to check all the simple texts. This system had some filtering processes. For example, one filter only selected texts that were similar to the original difficult texts. After these filtering processes, we only had high-quality options. Finally, from these high-quality options, we selected the lowest-risk option as the final result.
Our system performed very well, and was ranked 2nd out of 48 systems. This great result showed that our idea was a good one. Through this project, we learned some very important things. It is true that our generate-and-select approach works well, especially when the quality of generated texts is judged by a computer. However, we cannot always trust the computer judge. In our study, some simple texts were good according to the computer judge, but not according to a human judge.
Limitations
The primary limitation of this work is its reliance on generating a diverse set of candidates. While the LLMs we employed are relatively small and thus do not require excessive computational resources, the time and cost associated with obtaining the final outputs cannot be disregarded. Therefore, our generate-and-select framework would be unsuitable for real-time text simplification.
Also, this shared task relies on automatic evaluation metrics. While our system achieved high scores, we did not conduct a manual evaluation with human participants to confirm whether the outputs are genuinely more readable and understandable for the target readers. Such manual evaluation, with Likert scoring or reading comprehension questions, would be necessary to validate the real-world effectiveness of our simplifications.
Acknowledgments
This document is part of a project that has received funding from the European Union's Horizon Europe research and innovation program under Grant Agreement No. 101132431 (iDEM Project). The views and opinions expressed in this document are solely those of the author(s) and do not necessarily reflect the views of the European Union. Neither the European Union nor the granting authority can be held responsible for them. The University of Leeds (UOL) was funded by UK Research and Innovation (UKRI) under the UK government's Horizon Europe funding guarantee (Grant Agreement No. 10103529).
Also, this work is partially financed by the Ministerio de Ciencia, Innovación y Universidades, Agencia Estatal de Investigaciones: project CPP2023-010780 funded by MICIU/AEI/10.13039/501100011033 and by FEDER, UE ("Habilitando Modelos de Lenguaje Responsables e Inclusivos"). Horacio Saggion also receives support from the Spanish State Research Agency under the Maria de Maeztu Units of Excellence Programme (CEX2021-001195-M) and from the Departament de Recerca i Universitats de la Generalitat de Catalunya (ajuts SGR-Cat 2021).
References
Suha S Al-Thanyyan and Aqil M Azmi. 2021. Automated text simplification: a survey. ACM Computing Surveys (CSUR), 54(2):1–36.
Fernando Alva-Manchego, Regina Stodden, Joseph Marvin Imperial, Abdullah Barayan, Kai North, and Harish Tayyar Madabushi. 2025. Findings of the TSAR 2025 shared task on readability-controlled text simplification. In Proceedings of the Fourth Workshop on Text Simplification, Accessibility, and Readability (TSAR 2025), Suzhou, China. Association for Computational Linguistics.
Abdullah Barayan, Jose Camacho-Collados, and Fernando Alva-Manchego. 2025. Analysing zero-shot readability-controlled sentence simplification. In Proceedings of the 31st International Conference on Computational Linguistics (COLING), pages 6762–6781. Association for Computational Linguistics.
David Beauchemin, Horacio Saggion, and Richard Khoury. 2023. Meaningbert: assessing meaning preservation between sentences. Frontiers in Artificial Intelligence, 6:1223924.
P.J. Bickel and K.A. Doksum. 1977. Mathematical Statistics: Basic Ideas and Selected Topics. Prentice Hall.
Ben Dodd, Betty van Aken, Paul Röttger, and Isabelle Augenstein. 2021. AUTORANK: A systematic approach to benchmark and compare machine learning models. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 170–185, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Abhimanyu Dubey, Rohan Taori, Alexei Baevski, and 1 others. 2024. The llama 3 herd of models. Preprint, arXiv:2407.21783.
Rudolf Flesch. 1948. A new readability yardstick. Journal of Applied Psychology, 32(3):221–233.
Gemma. 2025. Gemma 3 technical report. Preprint, arXiv:2503.19786.
David Heineman, Yao Dou, and Wei Xu. 2024. Improving minimum Bayes risk decoding with multi-prompt. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 22525–22545, Miami, Florida, USA. Association for Computational Linguistics.
Matthew Honnibal, Ines Montani, Sofie Van Landeghem, and Adriane Boyd. 2020. spaCy: Natural language understanding with bloom embeddings, convolutional neural networks and incremental parsing. Software documentation.
Joseph Marvin Imperial, Abdullah Barayan, Regina Stodden, Rodrigo Wilkens, Ricardo Munoz Sanchez, Lingyun Gao, Melissa Torgbi, Dawn Knight, Gail Forey, Reka R Jablonkai, and 1 others. 2025. Universalcefr: Enabling open multilingual research on language proficiency assessment. Preprint, arXiv:2506.01419.
Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. 2023. Mistral 7B. Preprint, arXiv:2310.06825.
Dan Jurafsky and James H. Martin. 2023. Speech and language processing (3rd ed. draft): Chapter on language modeling. Online draft.
Nouran Khallaf, Carlo Eugeni, and Serge Sharoff. 2025. Reading between the lines: A dataset and a study on why some texts are tougher than others. Preprint, arXiv:2501.01796. Published at Writing Aids at the Crossroads of AI, Cognitive Science and NLP (WR-AI-CogS), COLING 2025, Abu Dhabi.
Kenton Lee, Luheng He, Mike Lewis, and Luke Zettlemoyer. 2017. End-to-end neural coreference resolution. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 188–197, Copenhagen, Denmark. Association for Computational Linguistics.
Vivek Miglani, Aobo Yang, Aram Markosyan, Diego Garcia-Olano, and Narine Kokhlikyan. 2023. Using Captum to explain generative language models. In Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023), pages 165–173, Singapore. Association for Computational Linguistics.
Prasanta Kumar Mohapatra and Suresh Kumar. 2015. A multi-criteria decision making method based on weighted geometric mean. International Journal of Applied Decision Sciences, 8(2):133–148.
OpenAI. 2023. GPT-4 technical report. Preprint, arXiv:2303.08774.
OpenAI. 2025. gpt-oss-120b & gpt-oss-20b model card. Preprint, arXiv:2508.10925.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania, USA. Association for Computational Linguistics.
Qwen. 2025. Qwen2.5 technical report. Preprint, arXiv:2412.15115.
Horacio Saggion. 2017. Automatic Text Simplification, volume 10 of Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers.
Andreas Säuberli, Franz Holzknecht, Patrick Haller, Silvana Deilen, Laura Schiffl, Silvia Hansen-Schirra, and Sarah Ebling. 2024. Digital comprehensibility assessment of simplified texts among persons with intellectual disabilities. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pages 1–11.
Matthew Shardlow. 2014. A survey of automated text simplification. International Journal of Advanced Computer Science and Applications (IJACSA), 4(1):58–70.
Advaith Siddharthan. 2006. Syntactic simplification and text cohesion. Research on Language and Computation, 4(1):77–109.
Sanja Štajner and Maja Popović. 2016. Can text simplification improve machine translation? In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC), pages 172–178. European Language Resources Association.
Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2017. Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning, pages 3319–3328, Sydney, Australia. PMLR.
Adina Williams, Nikita Nangia, and Samuel Bowman. 2018. A broad-coverage challenge corpus for sentence understanding through inference. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1112–1122, New Orleans, Louisiana. Association for Computational Linguistics.
A Run 1: Prompts for Paragraph-level Simplification
We used four simplification prompts for LLMs. Two of these were based on an inductive approach, which involved extracting simplification features from trial data to create instructions as a prompt. To do this, the following prompt was given to GPT-4.1-mini.
You will be given several pairs of paragraphs. Each pair is composed of an original paragraph and a simplified version for CEFR {lv} readers. Your task is to analyze these pairs to find the general patterns of simplification and write an instruction for LLMs to simplify paragraphs similarly. Include observations on information or phrasing that remains unchanged. Do not include examples that contain text parts in given paragraphs. Only output your final prompt.
Original: {Original Paragraph 1 for the target CEFR level}
Reference: {Reference Paragraph 1 for the target CEFR level}
Original: {Original Paragraph 2 for the target CEFR level}
Reference: {Reference Paragraph 2 for the target CEFR level}
...
After several trials, we picked the following two types of prompts for each level, with some minor arrangements.
Prompt 1 : A2
Simplify paragraphs for CEFR A2 readers by following these guidelines:
1. Use short, clear sentences with simple grammar structures (mostly present and past simple).
2. Replace complex or abstract vocabulary with common, concrete words; explain any necessary technical terms briefly and clearly.
3. Remove or reduce detailed numerical data, statistics, or nuanced concepts unless essential; when included, present numbers simply and round if appropriate.
4. Avoid idiomatic expressions, figurative language, and complex sentence forms like passive voice or embedded clauses.
5. Focus on main ideas and essential facts; omit detailed background information, speculation, or subtle distinctions unless they support comprehension.
6. Use explicit cause-effect and temporal connectors (e.g., because, so, but, then, now) to clarify relationships.
7. Maintain logical and coherent flow with clear topic introductions and simple sequencing.
8. Preserve proper names, key terms, and notable facts that are central to understanding.
9. When appropriate, add brief, straightforward definitions or explanations for less familiar concepts.
10. Use active voice predominantly and ensure the subject of sentences is clear.
11. Replace pronouns that may confuse with explicit nouns where needed.
12. Retain the overall meaning and important details but adapt phrasing to be direct and concrete.
13. Introduce examples to illustrate points simply, using familiar or relatable contexts.
14. Do not assume prior knowledge; present background information in simple terms if required.
15. Where opinion or interpretation appears, present it clearly and simply, often using direct statements like "people say" or "some think."
16. Use simple punctuation and avoid complex structures such as long lists or parenthetical asides.
By following these patterns, produce an accessible, easy-to-read version of a paragraph that preserves the core message and key details for A2-level readers. Provide only the simplified paragraph without any explanation or justification.
# Original:
{Original Paragraph}
# Simplified:
Prompt 1: B1
Simplify paragraphs for CEFR B1 readers by following these guidelines:
1. Use simple vocabulary and expressions: Replace complex or formal words and phrases with more common, everyday alternatives, while keeping the meaning intact.
2. Shorten and clarify sentences: Break long, complex sentences into shorter, clearer ones. Use straightforward sentence structures, avoiding passive voice or complicated clauses.
3. Explain or define less familiar terms: When necessary, introduce brief explanations or definitions for technical, cultural, or less common concepts within the text to aid understanding.
4. Retain key information and facts: Keep all essential data, figures, names, and core ideas from the original text, ensuring the main message is preserved.
5. Rephrase for explicitness and clarity: Make implied meanings more explicit, and clarify references to pronouns or abstract concepts.
6. Maintain original factual content and sequence: Do not omit major details or reorder information in ways that change the logical flow or significance.
7. Use familiar synonyms and phrases: Prefer words and expressions that are frequently used at intermediate English level rather than academic or highly technical language.
8. Simplify complex concepts without oversimplifying: Present difficult ideas in more accessible language but avoid losing the nuance or accuracy of the original content.
9. Use concrete examples or context where helpful: When abstract concepts might confuse, add brief relatable examples or contextual cues to aid comprehension.
10. Preserve unchanged proper nouns and names: Keep names of people, places, events, titles, and specific terms as in the original to maintain accuracy and recognition.
11. Avoid idiomatic or culture-specific expressions unless explained: Replace or explain idioms and culturally specific references that might not be understood by B1 learners.
12. Retain the original tone and intent as much as possible:** The simplification should respect the author's purpose, tone, and the overall style, aiming for clarity rather than casualness.
In summary, simplify language and sentence structure, clarify meaning, explain or define unfamiliar terms, keep all important facts and details, and ensure the text remains coherent and faithful to the original. Provide only the simplified paragraph without any explanation or justification.
# Original:
{Original Paragraph}
# Simplified:
Prompt 2: A2
Simplify paragraphs for CEFR A2 readers by following these guidelines:
1. **Vocabulary and Grammar:**
- Use very common, everyday words and simple sentence structures.
- Avoid idioms, metaphors, or abstract expressions.
- Prefer present tense or simple past tense; avoid complex verb forms.
- Use short sentences, often one idea per sentence.
2. **Sentence Structure:**
- Break long, complex sentences into multiple shorter sentences.
- Use basic conjunctions (and, but, so, because) to connect ideas simply.
- Avoid passive voice where possible; use active voice instead.
3. **Information Selection and Clarity:**
- Retain all key factual information from the original paragraph.
- Remove or rephrase any statistics or figures only if they might confuse the reader, but generally keep numbers with simple explanations.
- Explain or define any technical terms or names using simple language or familiar examples.
- Avoid unnecessary detail or background information unless it helps understanding.
4. **Rephrasing and Simplification:**
- Replace complex nouns or phrases with simple equivalents or brief explanations.
- Make implicit information explicit if needed.
- Use examples or explanations to clarify concepts that might be unfamiliar.
- Use repetition and restatement to reinforce understanding without changing meaning.
5. **Tone and Style:**
- Use a neutral, clear, and straightforward tone.
- Address the reader more directly and simply when appropriate.
- Keep the original meaning, emphasis, and main points intact.
6. **Preserving Key Proper Nouns and Data:**
- Keep proper names (people, places, organizations, titles) unchanged but briefly explain their significance if needed.
- Maintain important dates, measurements, and specific figures, simplifying explanations around them.
7. **Avoid Removing Content Entirely:**
- Instead of deleting difficult or nuanced content, re-express it in accessible language.
- Questions or rhetorical devices in the original can be kept but simplified and clarified.
By applying these principles, transform original paragraphs into clear, accessible text suitable for A2-level readers while preserving essential information and intent. Provide only the simplified paragraph without any explanation or justification.
# Original:
{Original Paragraph}
# Simplified:
Prompt 2: B1
Simplify paragraphs for CEFR B1 readers by following these guidelines:
1. **Vocabulary and Sentence Structure:**
- Use common, everyday words instead of specialized or complex vocabulary.
- Prefer simple sentence structures; break longer or compound sentences into shorter ones.
- Replace abstract or complex terms with concrete, clearer expressions or brief explanations.
- Use active voice where possible and avoid idiomatic expressions or cultural references that may be unclear.
2. **Information Presentation:**
- Keep all key factual information and core ideas intact to preserve the original meaning.
- Present numbers, dates, and statistics clearly, often repeating or rephrasing for clarity.
- When technical or unfamiliar terms appear, define or explain them briefly but simply.
- Remove less essential details only if they do not affect overall comprehension; otherwise, retain the main content fully.
3. **Clarification and Explicitness:**
- Make implicit information explicit where needed.
- Where the original contains pronouns or references that may be unclear, replace or clarify them.
- Use clear cause-and-effect or chronological connectors (e.g., "because," "so," "however," "since then") to improve coherence.
4. **Tone and Style:**
- Maintain a neutral, informative, and accessible tone appropriate for learners.
- Avoid complex or figurative language; use straightforward, literal expressions.
- When original tone includes subtle nuance, simplify but try to retain the intended emphasis or attitude if important.
5. **Phrasing and Repetition:**
- Some proper nouns, dates, and well-known names remain unchanged to preserve identity and context.
- Common phrases and definitions that clarify the subject often get added or slightly expanded to aid understanding.
- Sentences may be reworded but often echo the original information closely, sometimes repeating key ideas with slight reformulation for clarity.
In summary, simplify vocabulary and grammar, clarify potentially difficult concepts, maintain all essential facts, and ensure readability and coherence for intermediate English learners without omitting important content. Provide only the simplified paragraph without any explanation or justification.
# Original:
{Original Paragraph}
# Simplified:
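All four prompts above end with the same `# Original:` / `{Original Paragraph}` / `# Simplified:` scaffold. As a minimal sketch of how such a template could be filled before being sent to a model, the helper below substitutes the placeholder with the input paragraph. The function name `fill_prompt` and the constant `PLACEHOLDER` are hypothetical and are not taken from the authors' code.

```python
# Hypothetical helper: fills the "{Original Paragraph}" slot of a prompt template.
PLACEHOLDER = "{Original Paragraph}"


def fill_prompt(template: str, paragraph: str) -> str:
    """Return the template with the placeholder replaced by the paragraph.

    str.replace is used instead of str.format so that any other braces
    in the guidelines are left untouched.
    """
    if PLACEHOLDER not in template:
        raise ValueError("template has no '{Original Paragraph}' placeholder")
    return template.replace(PLACEHOLDER, paragraph.strip())


if __name__ == "__main__":
    # Only the shared scaffold is shown here; the guidelines would precede it.
    scaffold = "# Original:\n{Original Paragraph}\n# Simplified:"
    print(fill_prompt(scaffold, "  The committee postponed the decision.  "))
```

Ending the prompt with `# Simplified:` leaves the model to continue directly with the simplified paragraph, which matches the instruction to provide no explanation or justification.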
One of our four prompts was also generated by an LLM, based on a deductive approach without providing trial data. We had GPT-4.1-mini generate these prompts using the following prompt.
You are an expert in language education and have a deep understanding of CEFR. Your task is to generate a list of rules that an LLM can use to simplify a given English text for each of the CEFR A2 and B1 level. The simplification must
Case | Model (Prompt) | Candidate (Non-Selected) | Sim. Orig. | Sim. Ref.
NS1. Gemma-3-4b (Prompt 4)
Candidate: Terrie Sharp is a very popular writer of stories. She has won the Oliver Crime Writer Award two times. She grew up in a place in Glasgow with many crimes. She learned to be quiet and not get into trouble. She also learned how to open car doors with a knife. But she used this only to write her books. Now she is not quiet. She goes on TV shows like Police Today and to writing events in the UK.
Sim. Orig.: 0.828; Sim. Ref.: 0.802
Linguist's note: This version uses short, clear sentences. The phrase "writer of stories" avoids the business-like idea of "books sell well." It keeps the violent detail but explains it as only used for writing. Automatic metrics mark it down because the wording differs from the reference, but it is still very suitable for A2.
NS2. Gemma-3-4b (Prompt 4)
Candidate: Terrie Sharp is a famous writer who can write many stories. She won the Oliver Crime Writer Award two times. She was born in a part of Glasgow with many crimes. She learned to be quiet and not have problems. She also learned how to open car doors with a knife. But now she uses it only for her books. Today she talks a lot on TV shows like Police Today and at writing events in the UK.
Sim. Orig.: 0.848; Sim. Ref.: 0.797
Linguist's note: The word "famous" is clearer than "popular," and "not have problems" is a good replacement for the idiom "stay out of trouble." The sentences are simple and easy to follow. The only weakness is a small redundancy ("can write many stories"). Metrics lower the score because of different words, not because of difficulty.
NS3. Gemma-3-4b (Prompt 1)
Candidate: Terrie Sharp is a popular writer. She has won the Oliver Crime Writer Award twice. She was born in a part of Glasgow with many crimes. She learned to be quiet and not have problems. She also learned how to open car doors with a knife. Now she only uses this for her stories. Today she talks on TV shows like Police Today and at writing events in the UK.
Sim. Orig.: 0.830; Sim. Ref.: 0.830
Linguist's note: The sentences are clear and short. The idiom "stay out of trouble" is replaced with the simple "not have problems." All key facts are kept. Metrics give only average credit, even though this is strong A2 material.
NS4. Gemma-3-4b (Prompt 1)
Candidate: Terrie Sharp is a popular writer. She has won the Oliver Crime Writer Award twice. She was born in a part of Glasgow with many crimes. She learned to be quiet and not have problems. She also learned how to open car doors with a knife. Now she only uses this for her stories. She talks on TV shows like Police Today and goes to writing events in the UK.
Sim. Orig.: 0.848; Sim. Ref.: 0.762
Linguist's note: This version has the same strengths as NS3. The similarity score is low because it says "events" instead of "conferences," but "events" is actually easier for A2 learners.
NS5. Gemma-3-4b (Prompt 1)
Candidate: Terrie Sharp is a famous writer. She writes good stories. She has won the Oliver Crime Writer Award two times. She was born in a part of Glasgow with many crimes. She learned to be quiet and not have problems. She also learned how to open car doors with a knife. She now uses this only to write her books. She goes on TV shows like Police Today and to writing events in the UK.
Sim. Orig.: 0.853; Sim. Ref.: 0.816
Linguist's note: The words "famous" and "writes good stories" are simple and transparent. The sentences are split clearly. The meaning is faithful to the original. Metrics reduce the score only because the wording does not match the reference.
NS6. Gemma-3-4b (Prompt 2)
Candidate: Terrie Sharp is a popular writer. She can tell good stories. She won the Oliver Crime Writer Award twice. She was born in a part of Glasgow with many crimes. She learned to be quiet and not have problems. She also learned how to open car doors with a knife. She only uses this for her books now. She is not quiet anymore. She talks on TV shows like Police Today and goes to writing events in the UK.
Sim. Orig.: 0.854; Sim. Ref.: 0.804
Linguist's note: The time marker "now" makes the story easier to follow. The contrast "not quiet anymore" is clear. The only issue is the abstract noun "knowledge," which remains a bit hard for A2. Metrics again don't give full credit for these improvements.
NS7. Gemma-3-4b (Prompt 2)
Candidate: Terrie Sharp is a popular writer. She can tell good stories. She won the Oliver Crime Writer Award twice. She was born in a part of Glasgow with many crimes. She learned to be quiet and not have problems. She also learned how to open car doors with a knife. She uses this only to write her books now. She is not quiet anymore. She talks on TV shows like Police Today and goes to writing events in the UK.
Sim. Orig.: 0.849; Sim. Ref.: 0.780
Linguist's note: Very similar to NS6: short sentences, easy connectors, clear contrast. The similarity score is lower because of harmless word changes.
NS8. Gemma-3-4b (Prompt 3)
Candidate: Terrie Sharp is a famous writer. She can tell good stories. She has won the Oliver Crime Writer Award twice. She was born in a part of Glasgow with many crimes. She learned to be quiet and not have problems. She also learned how to open car doors with a knife. She only uses this to write her books now. She goes on TV shows like Police Today and to writing events in the UK.
Sim. Orig.: 0.852; Sim. Ref.: 0.786
Linguist's note: The verbs "goes on" and "uses" are clear and concrete. Sequencing is simple and effective. Metrics score it lower because the words differ from the reference.
NS9. GPT-4.1-mini (Prompt 4)
Candidate: Terrie Sharp is a famous writer. She has won the Oliver Crime Writer Award two times. She was born in Glasgow, a place with a lot of crime. She learned to be quiet and not get into trouble. She also learned to open car doors with a knife. She only used this to write her books. Now, she talks on TV shows like Police Today and goes to writing meetings in the UK.
Sim. Orig.: 0.828; Sim. Ref.: 0.834
Linguist's note: This version is very clear and accurate. Words like "famous" and "meetings" are easy for A2 learners. The sequencing is simple and the contrast is clear. Automatic metrics give it a lower score because of different wording, but it is an excellent A2 simplification.
NS10. GPT-4.1-mini (Prompt 1)
Candidate: Terrie Sharp is a famous writer who has won the Oliver Crime Writer Award two times. She was born in a part of Glasgow with a lot of crime. She learned to be quiet and avoid trouble. She also learned how to open car doors with a knife. She used this only to write her books. Now, she talks on TV shows like Police Today and goes to writing meetings in the UK.
Sim. Orig.: 0.842; Sim. Ref.: 0.770
Linguist's note: This version handles the idiom well ("avoid trouble"), and the word "meetings" is culturally simple at A2. The content is faithful and the style is easy to read. The lower similarity score only reflects useful word changes, not quality loss.
Table 6: Case study analysis of non-selected outputs that were linguistically strong but scored lower on automatic metrics. Rows shaded red are judged (very good) and rows shaded green are judged (excellent). These examples show that metrics often mark down simplifications that use common words (e.g., famous vs. popular, meetings vs. conferences) and concrete phrasing, even though they better match CEFR A2 descriptors.
E Sentence Simplification Architecture: Run 2
[Figure 3: flowchart of the Run 2 pipeline in four stages. Recoverable box contents are listed below.]
Preprocessing: Input paragraph → Sentence segmentation (regex sent_split or spaCy) → Coreference resolution (AllenNLP + spaCy-compatible) → Self-contained sentences.
Prompting & Generation: Structured prompt (A1/A2/B1) with constraints on meaning, entities/numbers, readability, optional bracketed glosses, and strict format. Integrated Gradients (IG) on a CEFR classifier (Captum LIG, m=100 steps) → Top-K influential phrases as (type, phrase, score). Decoding: T ≤ 0.3, p = 0.9, ≤ 180 tokens per sentence, stop at next tag. Generate with three LLMs: LLaMA-3-8B, GPT-4o (API), Mistral-7B.
Scoring & Selection: Automatic per-candidate scoring with 8 signals: 1) Meaning (embedding cosine + MNLI); 2) Key-info coverage (IG phrases); 3) Entity/number/unit fidelity; 4) Readability vs CEFR (ASL + FRE); 5) Lexical simplification gain; 6) Fluency (PPL → [0,1]); 7) Compression control; 8) Sentence control & format gate. Weighted geometric mean: core × control, clamp [0,1]. Rank candidates / top-K. LLM Judge? Either Auto top-1, or Send top-K with context + level (GPT-4o or LLaMA) → Winner index + reason. Policy: Override with LLM, or Blend (weighted). → Per-sentence winner (A1/A2/B1).
Stitch & Analysis: Write winners per sentence → Stitch by level → paragraphs (<B1> <A2> <A1> blocks). Compare judge types & backends: agreement, drift, distribution shifts. → Final Output.
Figure 3: Run 2 preprocessing (segmentation, coreference) and IG attribution; CEFR-controlled prompting/decoding across three LLMs; automatic judge (8 signals) with weighted geometric mean; optional LLM-as-Judge with policy; stitching and comparative analysis.
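The figure describes the automatic judge as a weighted geometric mean computed as "core × control" and clamped to [0,1]. A minimal sketch of that combination follows. It assumes that the two "control" signals (compression control, sentence/format control) form one group and the remaining six form the "core" group; that split, the weights, the epsilon floor, and the function names are all hypothetical, since the source does not specify them.

```python
import math


def weighted_geometric_mean(scores, weights, eps=1e-9):
    """Weighted geometric mean of scores in [0, 1].

    Scores are floored at eps (an assumption) so that log() is defined;
    a near-zero signal still pulls the result close to zero.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    log_sum = sum(weights[name] * math.log(max(scores[name], eps)) for name in scores)
    return math.exp(log_sum / total_weight)


def combine_signals(core, control, weights):
    """Multiply the core and control means, then clamp to [0, 1]."""
    score = weighted_geometric_mean(core, weights) * weighted_geometric_mean(control, weights)
    return min(1.0, max(0.0, score))


if __name__ == "__main__":
    # Illustrative values only; not taken from the paper.
    core = {"meaning": 0.9, "coverage": 0.8, "fidelity": 1.0,
            "readability": 0.7, "lexical_gain": 0.6, "fluency": 0.75}
    control = {"compression": 1.0, "format": 1.0}
    weights = {name: 1.0 for name in list(core) + list(control)}
    print(round(combine_signals(core, control, weights), 3))
```

A geometric mean, unlike an arithmetic mean, lets a single failing signal (for example, lost meaning) sink a candidate even when the other signals are high, which is one plausible reason to prefer it for a judge of this kind.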
F Evaluation Metrics for Sentence Simplification: Run 2
Metric | Description
Meaning preservation | Embedding cosine similarity plus bidirectional entailment probabilities (MNLI) to assess whether the simplified sentence preserves the meaning of the source.
Key information coverage | Checks whether the top-K influential phrases identified by IG are present in the simplified output (case-insensitive matching).
Entity, number, and unit fidelity | Compares named entities with spaCy (set F1). Numbers are greedily matched one-to-one if units agree, allowing an absolute error within max(1%, 10^-6).
Readability vs. CEFR | Combines average sentence length (ASL) and Flesch Reading Ease (FRE) (Flesch, 1948), normalised to CEFR targets: A1 (ASL ≈ 10, FRE ≥ 0.80), A2 (15, 0.70), B1 (20, 0.60).
Lexical simplification gain | Reduction in average syllables per word compared to the source. A small bonus is given for inline glosses (e.g., "[a simple meaning]").
Fluency | Language model perplexity mapped to [0,1] (Jurafsky and Martin, 2023); lower perplexity means higher fluency. If no LM is provided, a neutral score of 0.75 is assigned.
Compression control | Ratio of simplified to original word counts, normalised to the target range 0.6–1.0. Penalises outputs that are too short or too verbose.
Sentence/format control | Encourages keeping sentence count close to the source (ratio 0.7–1.1). Rejects empty outputs or those exceeding 1200 characters.
Table 7: Evaluation signals used by the automatic judge. Each metric is normalised to [0,1] and combined by a weighted geometric mean.
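Two of the signals in Table 7 are stated precisely enough to sketch: the greedy one-to-one number matching with a tolerance of max(1%, 10^-6), and the compression ratio with target range 0.6–1.0. The code below is an illustrative sketch only. It reads "1%" as 1% of the source value, assumes numbers arrive already extracted as (value, unit) pairs, and uses a linear falloff outside the target range; none of these choices, nor the function names, are confirmed by the source.

```python
def numbers_match(src_value, out_value, rel=0.01, abs_floor=1e-6):
    """True if the absolute error is within max(1% of source, 1e-6) (assumed reading)."""
    return abs(src_value - out_value) <= max(rel * abs(src_value), abs_floor)


def number_fidelity(source, output):
    """Greedy one-to-one matching of (value, unit) pairs; returns the matched share.

    Each output number can satisfy at most one source number, and units
    must agree exactly for a match.
    """
    if not source:
        return 1.0  # nothing to preserve (assumption)
    remaining = list(output)
    matched = 0
    for value, unit in source:
        for i, (out_value, out_unit) in enumerate(remaining):
            if out_unit == unit and numbers_match(value, out_value):
                matched += 1
                del remaining[i]
                break
    return matched / len(source)


def compression_control(n_source_words, n_output_words, low=0.6, high=1.0):
    """Score 1.0 inside the target ratio range, with a linear falloff outside it.

    The falloff shape is hypothetical; the source only says that outputs
    that are too short or too verbose are penalised.
    """
    if n_source_words == 0:
        return 0.0
    ratio = n_output_words / n_source_words
    if low <= ratio <= high:
        return 1.0
    if ratio < low:
        return max(0.0, ratio / low)
    return max(0.0, 1.0 - (ratio - high))


if __name__ == "__main__":
    src = [(100.0, "km"), (5.0, "kg")]
    out = [(100.5, "km"), (5.0, "g")]  # second unit differs, so it cannot match
    print(number_fidelity(src, out), compression_control(100, 80))
```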