RoBERTaSense-FACIL: A Technical Report and
Model Selection Study of Meaning Preservation
in Easy-to-Read Spanish Texts
Isam Diab-Lozano¹ and Mari Carmen Suárez-Figueroa¹
¹ Ontology Engineering Group (OEG), Universidad Politécnica de Madrid, Spain
Abstract
This technical report presents RoBERTaSense-FACIL, a Spanish model based on
RoBERTa designed to evaluate meaning preservation in Easy-to-Read (E2R) text
adaptations. The report includes a comparative study of three approaches to
determine the most reliable architecture for the task. Based on the results,
RoBERTa-base-bne fine-tuned on a balanced dataset of positives and hard
negatives achieves the best performance and is adopted as the final model,
hereafter referred to as RoBERTaSense-FACIL. The report documents the dataset
construction, negative generation strategies, fine-tuning pipeline, evaluation
metrics, and error analysis, providing a complete description of the model and
its training process.
1 Introduction
Ensuring accessible information for people with cognitive disabilities is a
crucial component of inclusive communication. Equal opportunities and universal
access to information are recognised as fundamental rights¹. However, certain
groups, particularly those with cognitive or intellectual disabilities,
experience significant difficulties in reading comprehension. Enhancing
cognitive accessibility is therefore essential to promote active participation
in domains such as politics, education, employment, and culture.
To support this goal, the Easy-to-Read (E2R) methodology was developed and
formalised in standards such as the Spanish UNE 153101:2018 [1] and other
European guidelines [2, 3]. E2R provides linguistic and design recommendations
to improve comprehension and goes beyond simply simplifying vocabulary or
summarising content. It allows structural and lexical transformations and may
introduce supporting elements that are not present in the original text [4].
¹ Convention on the Rights of Persons with Disabilities (United Nations, 2006).
Available at: https://www.ohchr.org/en/instruments-mechanisms/instruments/convention-rights-persons-disabilities
A key challenge in E2R is ensuring that adaptations preserve the intended
meaning of the source text. Meaning preservation [5, 6, 7] refers to the extent
to which an adapted version conveys the same overall message and communicative
intent as the original. Although structural and lexical changes are allowed,
the adaptation must still retain the core ideas. Reliable tools for evaluating
meaning preservation are limited, particularly for Spanish and for
accessibility-oriented text adaptation.
To address this gap, this report presents a comparative study of three
model-based approaches applied to Spanish. We fine-tuned and evaluated three
different architectures on a dataset of original and E2R-adapted text pairs,
with the goal of determining which model best captures semantic equivalence in
the context of cognitive accessibility. The models evaluated are:
• MeaningBERT [8]: a model trained to assess semantic similarity and meaning
  preservation, originally designed for English.
• RoBERTa-base-bne [9]: a monolingual Spanish RoBERTa model fine-tuned using a
  methodology inspired by MeaningBERT.
• RoBERTa-base-bne with BERTScore fine-tuning: a Spanish adaptation of
  BERTScore [7] for evaluating text similarity.
Based on the comparative evaluation presented in this report, the
best-performing model, RoBERTa-base-bne fine-tuned on a balanced dataset of
positive and hard-negative pairs, is selected as the final system for assessing
meaning preservation². Throughout this report, we refer to this final model as
RoBERTaSense-FACIL. Before fine-tuning, we use the name RoBERTa-base-bne; after
fine-tuning, RoBERTaSense-FACIL denotes the resulting model ready for practical
use.
2 State of the Art
In the context of Automatic Text Simplification³ (ATS), both human and
automatic evaluation methods have been explored, including metrics commonly
used in machine translation and summarisation tasks, such as BLEU [11], ROUGE
[12], SARI [13] and METEOR [14]. Nevertheless, these metrics often fail to
capture the semantic richness and subtle changes in meaning introduced during
the adaptation process.
² The datasets and scripts used in this work cannot be made publicly available
due to privacy restrictions. Access may be granted upon request.
³ Text adaptation always aims to transform texts to meet the needs of a specific
audience, while text simplification tends to reduce textual complexity and does
not always consider the target reader's profile [10].
Recent surveys on ATS [15, 16] have shown that BLEU and SARI are the most
widely used automatic metrics. However, BLEU has been found to correlate
negatively with textual simplicity, making it unsuitable for ATS, while SARI
focuses mostly on lexical simplifications and minor reordering, failing to
capture deeper structural changes. ROUGE, despite being useful in
summarisation, is rarely used in ATS. Readability formulas such as Flesch
Reading Ease [17] or Flesch-Kincaid Grade Level [18] also present significant
limitations: they are language-dependent, disregard layout and user-related
variables, and are not designed for cognitive accessibility.
Furthermore, regarding automatic text adaptation into E2R, there is currently
no standardised and universally accepted evaluation system. This gap makes it
difficult to compare adaptation methods or to reliably assess which version of
a text best meets accessibility standards. As a result, researchers often
combine multiple metrics. Manual evaluations, typically based on Likert scales
assessing grammar, simplicity, and meaning preservation, remain a common
approach, but they are time-consuming, subjective, and often conducted by
experts rather than target users. Moreover, in the E2R context, it is crucial
to involve validators, that is, individuals with reading comprehension
difficulties, in order to ensure that adapted texts effectively serve their
intended audience [4].
In terms of semantic evaluation, BERTScore [7] has gained traction for its use
of contextual embeddings to estimate similarity between original and simplified
texts. It is, however, unsupervised and trained for English. Meanwhile,
supervised models like MeaningBERT [8], fine-tuned on human similarity ratings,
show promising results in English but lack Spanish counterparts. This points to
a research gap in the development of language-specific models for supervised
evaluation of meaning preservation, particularly in accessibility-focused
adaptations.
3 Model Selection
The initial choice to fine-tune MeaningBERT was based on its specialisation in
meaning preservation, which aligns closely with the objective of this study:
evaluating whether automatically adapted Easy-to-Read texts in Spanish maintain
the core meaning of their original versions. MeaningBERT was originally
developed as a fine-tuned BERT model specifically designed to measure semantic
similarity with a focus on meaning retention. However, its training was
conducted exclusively on English datasets, which significantly impacted its
performance when applied to Spanish text pairs.
A critical limitation emerged from the use of the bert-base-uncased tokenizer,
which is tightly coupled to English lexical and morphological patterns. During
preliminary evaluations, we observed that this tokenizer failed to handle
Spanish inputs adequately: many common Spanish words were fragmented into
multiple subwords or misrepresented altogether. As a result, the model
generated weak semantic representations and produced low similarity scores,
even when text pairs were near-identical in meaning. This finding highlights
the importance of ensuring language alignment not only at the model level but
also at the tokenization level when applying pretrained architectures
cross-lingually.
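
The fragmentation effect described above can be quantified as the average
number of subword pieces per word (often called fertility). The sketch below is
only an illustration and not the procedure used in this report:
`subword_fertility` and `toy_english_tokenizer` are hypothetical names, and the
toy tokenizer merely imitates an English-centric vocabulary so that the example
runs without downloading a model.

```python
from typing import Callable, Iterable, List


def subword_fertility(words: Iterable[str],
                      tokenize: Callable[[str], List[str]]) -> float:
    """Average number of subword pieces produced per input word."""
    word_list = list(words)
    if not word_list:
        raise ValueError("need at least one word")
    total_pieces = sum(len(tokenize(word)) for word in word_list)
    return total_pieces / len(word_list)


def toy_english_tokenizer(word: str) -> List[str]:
    """Hypothetical stand-in for an English-centric subword tokenizer.

    Words found in a tiny 'English' vocabulary stay whole; every other word
    is cut into 3-character pieces to imitate heavy fragmentation.
    """
    vocab = {"the", "house", "reading", "easy"}
    if word.lower() in vocab:
        return [word.lower()]
    return [word[i:i + 3] for i in range(0, len(word), 3)]


english_words = ["the", "house", "reading", "easy"]
spanish_words = ["la", "accesibilidad", "lectura", "ciudadanos"]

english_fertility = subword_fertility(english_words, toy_english_tokenizer)
spanish_fertility = subword_fertility(spanish_words, toy_english_tokenizer)
print(english_fertility, spanish_fertility)
```

With the Hugging Face transformers library, the `tokenize` method of a loaded
tokenizer could be passed as the `tokenize` argument, which would allow the
bert-base-uncased tokenizer and the RoBERTa-base-bne tokenizer to be compared
on the same Spanish word list.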
To address this issue, a second experiment was conducted using
RoBERTa-base-bne, a Spanish language model pretrained on large-scale Spanish
corpora. This model was fine-tuned using a binary classification setup similar
to MeaningBERT, but with a tokenizer specifically optimized for the Spanish
language. The shift to a native Spanish model substantially improved the
tokenization quality, which in turn enhanced the model's ability to detect
fine-grained semantic equivalence. The improved scores obtained in this setting
validated the hypothesis that language-specific pretraining and tokenization
are essential for tasks involving subtle meaning comparison in non-English
texts.
Finally, to explore alternative evaluation strategies beyond binary
classification, a third experiment was designed following the principles of the
BERTScore metric. Traditionally, BERTScore is an unsupervised evaluation method
that computes cosine similarity between contextual token embeddings, typically
relying on English models like roberta-base. Instead, our approach implemented
a supervised regression framework built upon RoBERTa-base-bne, in which the
model was trained to predict continuous content preservation scores. The
training objective used a Mean Squared Error (MSE) loss.
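
For reference, the unsupervised matching step that BERTScore builds on can be
sketched in a few lines. This is a simplified illustration under stated
assumptions: the vectors are toy values rather than model embeddings, the
function names are hypothetical, and the optional IDF weighting and baseline
rescaling of the original metric are omitted. It is not the supervised
regression model described above.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match(ref: List[Vector],
                 cand: List[Vector]) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) from greedy cosine matching.

    Recall averages, over reference tokens, the best similarity to any
    candidate token; precision does the same from the candidate side.
    """
    if not ref or not cand:
        raise ValueError("both token lists must be non-empty")
    recall = sum(max(cosine(r, c) for c in cand) for r in ref) / len(ref)
    precision = sum(max(cosine(c, r) for r in ref) for c in cand) / len(cand)
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Toy 2-dimensional "embeddings" (illustrative values, not model outputs).
reference = [[1.0, 0.0], [0.0, 1.0]]
candidate = [[1.0, 0.0]]
precision, recall, f1 = greedy_match(reference, candidate)
print(precision, recall, f1)
```

In this toy case the candidate covers only one of the two reference tokens, so
recall is lower than precision, which is the kind of information loss a
dropout-style distortion would produce.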
4 Dataset Construction
The expert-annotated dataset used for fine-tuning consisted exclusively of
positive pairs, each containing an original Spanish text and its corresponding
E2R adaptation. These adaptations were validated by experts to ensure maximum
meaning preservation (label = 1). Although this dataset provides high-quality
positive examples, it is insufficient for supervised training of binary or
regression models, as no negative instances are available from expert
annotation alone.
To enable robust supervised learning, it was therefore necessary to
automatically generate hard negatives. These are artificial pairs that maintain
surface similarity to legitimate E2R adaptations, but introduce structural or
semantic distortions that alter the meaning. Hard negatives force the model to
distinguish between subtle cases of meaning preservation and meaning
alteration. Our design draws on prior work in contrastive learning and data
augmentation [19, 20, 21, 22] to ensure diversity and controlled difficulty.
5 Hard Negative Generation and Final Dataset
To enable supervised learning, the expert-annotated dataset, containing only
positive E2R adaptations, was extended with automatically generated hard
negatives. These negatives resemble valid adaptations at the surface level
while introducing structural or semantic distortions. The goal is to force
models to distinguish subtle meaning changes rather than relying on trivial
lexical cues.
5.1 Types of Hard Negatives
We define five categories of hard negatives, each representing a distinct form
of structural or semantic distortion. These categories follow established
perturbation families in contrastive learning and semantic augmentation:
• Sentence Shuffle: A structural distortion in which the sentences of an
  adaptation appear in an incorrect order. Although the lexical content remains
  intact, the narrative coherence is disrupted and the meaning is altered.
  Shuffling-based distortions are widely used in contrastive learning to weaken
  structural cues [19, 22].
• Sentence Dropout: A deletion-based distortion where one or more sentences are
  removed from the adaptation. This reduces information content while
  preserving surface-level fluency, producing subtle losses of meaning.
  Deletion-based perturbations are common in augmentation frameworks such as
  EDA [22] and ConSERT [19].
• Mismatch: A cross-textual distortion where an original text is paired with
  the adaptation of a different story. Although both texts remain independently
  coherent, their semantic correspondence is broken. This follows
  derangement-based negative sampling used in sentence similarity modelling
  [21].
• Paraphrase-based Negatives: A semantic distortion in which the adaptation
  remains lexically simple and fluent but meaning is altered through omissions,
  polarity shifts, light contradictions, or changes in quantitative
  information. Paraphrase-based perturbations are widely used in text
  simplification and translation augmentation [23].
• Natural Language Inference (NLI) Contradictions: A meaning-level distortion
  where the adaptation expresses a proposition that contradicts the content or
  intent of the original text. These cases preserve surface similarity while
  inverting core semantic information, following practices in hard-negative
  mining for natural language inference [20].
Together, these strategies generate structural (shuffle, dropout), semantic
(paraphrasing, NLI), and cross-textual (mismatch) distortions, forming a
diverse and challenging negative space.
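
The two structural distortions (shuffle and dropout) can be sketched as
follows. This is a minimal illustration, not the report's code: the regex-based
`split_sentences` is a naive stand-in for a proper sentence segmenter, all
function names are hypothetical, and forcing at least one removal in
`sentence_dropout` is an assumption made here so that the output always differs
from the input.

```python
import random
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def sentence_shuffle(text: str, rng: random.Random) -> str:
    """Return the text with its sentences in a different order."""
    sentences = split_sentences(text)
    if len(set(sentences)) < 2:
        return text  # nothing to reorder
    shuffled = sentences[:]
    while shuffled == sentences:
        rng.shuffle(shuffled)
    return " ".join(shuffled)


def sentence_dropout(text: str, p: float, rng: random.Random) -> str:
    """Drop each sentence with probability p, keeping at least one."""
    sentences = split_sentences(text)
    if len(sentences) < 2:
        return text  # cannot drop without emptying the text
    kept = [s for s in sentences if rng.random() >= p]
    if len(kept) == len(sentences):
        # Assumption: force one removal so the pair is a true negative.
        kept.pop(rng.randrange(len(kept)))
    if not kept:
        kept = [rng.choice(sentences)]
    return " ".join(kept)


rng = random.Random(0)
adaptation = "Ana vive en Madrid. Ella trabaja en una biblioteca. Le gusta leer."
print(sentence_shuffle(adaptation, rng))
print(sentence_dropout(adaptation, 0.2, rng))
```

Passing an explicit `random.Random` instance keeps the perturbations
reproducible, which matters when the negatives are regenerated for later
experiments.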
5.2 Data Generation Pipeline
The complete negative-generation process was implemented in Python 3.11. The
pipeline followed these steps:
1. Load positives: Expert-adapted original/E2R pairs were imported from Excel
   files, labelled as 1, and tagged as positive.
2. Paraphrase generation: For each E2R adaptation, paraphrase-based negatives
   were generated with OpenAI GPT-4.1-mini⁴ (temperature 0.7, n = 1), keeping
   lexical simplicity while introducing controlled meaning distortions. Outputs
   were cached for reproducibility.
3. NLI contradiction mining: The model
   somosnlp-hackathon-2022/bertin-roberta-base-zeroshot-esnli⁵ classified
   candidates as entailment, neutral, or contradiction. Pairs with
   contradiction probability ≥ 0.5 were retained.
4. Surface perturbations:
   • Shuffle: The Natural Language Processing (NLP) library spaCy⁶
     (es_core_news_lg) was used for sentence segmentation, followed by random
     permutation.
   • Dropout: Each sentence was removed with probability p = 0.2.
   • Mismatch: A derangement algorithm ensured that each original text was
     paired with the adaptation of another story.
5. Balanced sampling: For each positive, a matching negative was selected, and
   negative types were equalised across categories.
6. Final assembly: Positives and negatives were concatenated, shuffled with a
   fixed seed (42), and exported to Excel with metadata (Label, neg_type). The
   final dataset was fully balanced.
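
The mismatch, balanced-sampling, and final-assembly steps can be sketched as
below. This is an illustrative reading of the steps above, not the authors'
script: the row layout (dictionaries with `original`, `adaptation`, `Label`,
`neg_type`), the function names, and the rule of drawing an equal number of
negatives per type are assumptions. The derangement uses simple rejection
sampling, and Excel export is left out.

```python
import random
from typing import Dict, List

Row = Dict[str, object]


def derangement(n: int, rng: random.Random) -> List[int]:
    """Random permutation of range(n) with no fixed point (needs n >= 2)."""
    if n < 2:
        raise ValueError("a derangement needs at least two items")
    while True:
        perm = list(range(n))
        rng.shuffle(perm)
        if all(i != j for i, j in enumerate(perm)):
            return perm


def mismatch_pairs(originals: List[str], adaptations: List[str],
                   rng: random.Random) -> List[Row]:
    """Pair every original with the adaptation of a different story."""
    perm = derangement(len(originals), rng)
    return [{"original": originals[i], "adaptation": adaptations[j],
             "Label": 0, "neg_type": "mismatch"}
            for i, j in enumerate(perm)]


def assemble_balanced(positives: List[Row],
                      negatives_by_type: Dict[str, List[Row]],
                      seed: int = 42) -> List[Row]:
    """Sample equally from each negative type, then shuffle with a seed.

    Assumption: positives are trimmed so that both classes end up with
    the same number of rows.
    """
    rng = random.Random(seed)
    types = sorted(negatives_by_type)
    per_type = len(positives) // len(types)
    negatives: List[Row] = []
    for neg_type in types:
        negatives.extend(rng.sample(negatives_by_type[neg_type], per_type))
    dataset = positives[:per_type * len(types)] + negatives
    rng.shuffle(dataset)
    return dataset


originals = ["o0", "o1", "o2", "o3"]
adaptations = ["a0", "a1", "a2", "a3"]
positives = [{"original": o, "adaptation": a, "Label": 1,
              "neg_type": "positive"} for o, a in zip(originals, adaptations)]
mismatches = mismatch_pairs(originals, adaptations, random.Random(42))
print(len(assemble_balanced(positives, {"mismatch": mismatches})))
```

Rejection sampling is adequate here because roughly one random permutation in
three is a derangement, so only a few retries are expected.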
5.3 Quality Filtering Results
To ensure the negatives were sufficiently challenging, lexical similarity with
the original E2R texts was measured using BLEU and ROUGE-L. Extremely
low-similarity pairs (BLEU < 0.05, ROUGE-L < 0.25) were removed.
Negative Type            BLEU     ROUGE-L
Sentence Dropout         0.0770   0.3044
Domain Mismatch          0.1175   0.3537
NLI Contradiction        0.0895   0.2948
Improper Paraphrasing    0.1123   0.3465
Sentence Shuffle         0.1191   0.2623
Table 1: Lexical similarity (BLEU, ROUGE-L) of hard negatives relative to their
original E2R adaptations. Higher values indicate surface overlap despite
meaning distortion.
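
A lexical-overlap filter of this kind can be sketched as follows. The code is a
simplified illustration: `rouge_l_f1` computes an equal-weight F-measure over
the longest common subsequence of whitespace tokens, which may differ from the
exact ROUGE-L variant used in this work, and `is_too_dissimilar` assumes that a
pair is discarded only when both scores fall below their thresholds (the text
does not state how the two conditions are combined). All names are
hypothetical, and BLEU is taken as a precomputed input.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """LCS-based F1 over lowercased whitespace tokens."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def is_too_dissimilar(bleu: float, rouge_l: float,
                      min_bleu: float = 0.05,
                      min_rouge_l: float = 0.25) -> bool:
    """Assumed rule: discard when BOTH scores are below their thresholds."""
    return bleu < min_bleu and rouge_l < min_rouge_l


score = rouge_l_f1("el gato duerme", "el gato negro duerme")
print(round(score, 4), is_too_dissimilar(0.01, 0.10))
```

Under this rule, a negative with the average scores reported for Sentence
Dropout in Table 1 (BLEU 0.0770, ROUGE-L 0.3044) would be kept.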
⁴ https://platform.openai.com/docs/models/gpt-4.1-mini
⁵ https://huggingface.co/somosnlp-hackathon-2022/bertin-roberta-base-zeroshot-esnli
⁶ https://spacy.io/
6 Model Architectures and Fine-Tuning
Three different architectures were fine-tuned and evaluated to determine which
approach best captures meaning preservation in Spanish E2R adaptations. All
models share a 12-layer Transformer architecture with a hidden size of 768, but
differ in their training objectives, language coverage, and intended tasks.
• MeaningBERT: A 12-layer, 110M-parameter model originally trained to predict
  semantic similarity and meaning preservation. Although conceptually aligned
  with the task, it was developed for English, raising concerns about
  cross-lingual robustness.
• RoBERTa-base-bne: A 12-layer, 125M-parameter monolingual Spanish RoBERTa
  model. It was fine-tuned using a binary classification setup (0/1 meaning
  preservation), following a methodology comparable to MeaningBERT.
• RoBERTa-BERTScore: A Spanish RoBERTa-base-bne model fine-tuned using a
  regression objective. Instead of binary labels, it predicts a continuous
  meaning preservation score based on BERTScore similarity.
The main fine-tuning hyperparameters of each model are shown in Table 2.
Model                Epochs   Learning Rate   Batch Size   Loss Function
MeaningBERT          5        2e-5            8            CrossEntropyLoss
RoBERTa-base-bne     5        2e-5            8            CrossEntropyLoss
RoBERTa-BERTScore    5        2e-5            8            Mean Squared Error (MSE)
Table 2: Fine-tuning hyperparameters of the three evaluated models.
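
The two loss functions in Table 2 can be written out in plain Python to make
the difference between the classification and regression objectives concrete.
This is a didactic sketch with hypothetical function names; in practice the
losses would come from the training framework rather than hand-written code.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one example (numerically stable form)."""
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum_exp - logits[label]


def mean_squared_error(predictions: Sequence[float],
                       targets: Sequence[float]) -> float:
    """Average squared difference between predicted and target scores."""
    if not predictions or len(predictions) != len(targets):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared) / len(squared)


# A binary classifier that is completely undecided (equal logits).
undecided_loss = cross_entropy([0.0, 0.0], label=1)
# A regressor predicting 0.5 and 1.0 for two pairs whose target is 1.0.
regression_loss = mean_squared_error([0.5, 1.0], [1.0, 1.0])
print(undecided_loss, regression_loss)
```

An undecided binary classifier has a cross-entropy of ln 2 (about 0.693). If
the evaluation loss reported for the classifiers is this same cross-entropy,
that value is a useful chance-level reference point when reading the results.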
7 Results
The performance of the three fine-tuned models was evaluated using multiple
metrics to capture different dimensions of meaning preservation. Table 3
summarises the global performance across classification and regression
settings.
Model                Eval Loss   Accuracy   F1 Score   ROC-AUC   Pearson   MSE
MeaningBERT          0.6916      0.5310     0.6936     0.536     -         -
RoBERTa-base-bne     0.4381      0.8064     0.8408     0.825     -         -
RoBERTa-BERTScore    0.1406      -          -          -         0.660     0.141
Table 3: Evaluation metrics of the three fine-tuned models. Dashes indicate
metrics that do not apply to a model's setting (classification or regression).
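
For readers who want to recompute metrics of the kind shown in Table 3, the
definitions can be sketched in plain Python. The function names are
hypothetical and the inputs below are toy values, not the evaluation data of
this report; `roc_auc` uses the pairwise-ranking definition, counting ties as
one half.

```python
import math
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined if either input is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


labels = [1, 1, 0, 0]
predictions = [1, 0, 0, 1]
print(accuracy(labels, predictions), f1_score(labels, predictions))
print(roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Accuracy, F1, and ROC-AUC require class labels or class scores, whereas Pearson
correlation and MSE compare continuous predictions with continuous targets,
which is why the two groups of columns are filled for different models.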
To complement the global metrics, we conducted a deeper quantitative and
qualitative analysis of the error patterns observed in the three models. This
includes confusion matrices and an examination of error distribution across
hard-negative categories.
The confusion matrices in Figure 1 show the following:
• MeaningBERT shows strong confusion between the two classes, especially
  misclassifying meaning-preserving pairs (label 1) as negatives.
• RoBERTa-base-bne exhibits near-perfect classification with very few false
  positives or false negatives.
• RoBERTa-BERTScore performs better on intermediate meaning levels but
  struggles when forced into strict binary classification.
Figure 1: Confusion matrices of the three fine-tuned models: (a) MeaningBERT,
(b) RoBERTa-base-bne, (c) RoBERTa-BERTScore.
To better understand how hard negatives affected model behaviour, we examined
false positives and false negatives across categories.
• MeaningBERT displays low variance but systematic bias. The model collapses
  toward the negative class: it produces 0 false positives and 12 false
  negatives, i.e., it fails to detect any of the positives (FNR = 100%;
  FPR = 0%). Errors are concentrated on the same class, indicating poor
  cross-lingual transfer and weak sensitivity to meaning preservation in
  Spanish.
• RoBERTa-base-bne shows higher variance but balanced behaviour. It produces 14
  false positives out of 60 negatives (FPR ≈ 23.3%) and 0 false negatives
  (FNR = 0%). Most false positives arise from mismatch and difficult paraphrase
  cases, where surface-level similarity misleads the model.
• RoBERTa-BERTScore presents medium variance with a slightly more permissive
  decision boundary. It outputs 18 false positives (FPR ≈ 30%) and 0 false
  negatives. Predictions cluster between 0.50 and 0.70, stabilising positives
  but increasing ambiguity in mismatch and some dropout/rephrase items.
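
The error rates quoted in the list above follow directly from the confusion
counts. The short sketch below recomputes them; the function names are
hypothetical, and the true-negative and true-positive counts are derived from
the figures given in the text (for example, 60 negatives minus 14 false
positives leaves 46 true negatives).

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """Share of actual negatives that were predicted as positive."""
    return fp / (fp + tn)


def false_negative_rate(fn: int, tp: int) -> float:
    """Share of actual positives that were predicted as negative."""
    return fn / (fn + tp)


# RoBERTa-base-bne: 14 false positives out of 60 negatives.
fpr_classifier = false_positive_rate(fp=14, tn=46)
# RoBERTa-BERTScore: 18 false positives, reported FPR of about 30%.
fpr_regression = false_positive_rate(fp=18, tn=42)
# MeaningBERT: 12 false negatives and no detected positives.
fnr_meaningbert = false_negative_rate(fn=12, tp=0)

print(round(fpr_classifier, 3), round(fpr_regression, 3), fnr_meaningbert)
```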
In general, MeaningBERT's errors are systematic and caused by poor
generalisation to Spanish, while RoBERTa-base-bne and RoBERTa-BERTScore exhibit
distributed errors dominated by false positives on structurally deceptive hard
negatives. Of the two Spanish models, RoBERTa-base-bne is the more conservative
(lower FPR), while the regression-based RoBERTa-BERTScore is more permissive.
Figure 2: Distribution of prediction errors across hard-negative categories.