
Automatic Evaluation of Open-Ended Questions in MOOCs using a Named-Entity Recognition (NER) System based on the Hidden Markov Model

Author: De Almeida, G. M.; Naukkarinen, J.; Li, C.; Matthews, S.; Jantunen, T.; Datta, S.; Kuparinen, K.; Alobaid, F.; Vakkilainen, E.
Publisher: Zenodo
DOI: 10.5281/zenodo.17631544
Source: https://zenodo.org/records/17631544/files/SEFI2025_098.pdf
Practice Paper
Recommended citation: De Almeida, G. M., Naukkarinen, J., Li, C., Matthews, S.,
Jantunen, T., Datta, S., Kuparinen, K., Alobaid, F., & Vakkilainen, E. (2025).
Automatic Evaluation of Open-Ended Questions in MOOCs using a Named-Entity
Recognition (NER) System based on the Hidden Markov Model. In Kangaslampi,
R., Langie, G., Järvinen, H.-M., & Nagy, B. (Eds.), SEFI 53rd Annual Conference.
European Society for Engineering Education (SEFI), Tampere, Finland. DOI:
10.5281/zenodo.17631544.
This Conference Paper is brought to you for open access by the 53rd Annual Conference
of the European Society for Engineering Education (SEFI) at Tampere University in
Tampere, Finland. This work is licensed under a Creative Commons
Attribution-NonCommercial-Share Alike 4.0 International License.
AUTOMATIC EVALUATION OF OPEN-ENDED QUESTIONS IN
MOOCs USING A NAMED-ENTITY RECOGNITION (NER) SYSTEM
BASED ON THE HIDDEN MARKOV MODEL
GM de Almeida a,1, J Naukkarinen b, C Li c, S Matthews d, T Jantunen e,
S Datta f, K Kuparinen g, F Alobaid h, E Vakkilainen i
a LUT University, Lappeenranta, Finland, 0000-0002-2898-5177
b LUT University, Lappeenranta, Finland, 0000-0001-6029-5515
c LUT University, Lappeenranta, Finland, 0000-0001-9400-3998
d LUT University, Lappeenranta, Finland
e Lappeenranta City Hall, Lappeenranta, Finland
f Digiotouch, Tallinn, Estonia, 0000-0002-2239-2194
g LUT University, Lappeenranta, Finland, 0000-0002-3373-8234
h LUT University, Lappeenranta, Finland, 0000-0003-1221-3567
i LUT University, Lappeenranta, Finland, 0000-0002-7472-3522
Conference Key Areas: Digital tools and AI in engineering education, Open and
online education for engineers
Keywords: MOOC, Open-ended question, Automatic evaluation, Named-Entity
Recognition (NER), Hidden Markov Model (HMM)
ABSTRACT
This work investigates the automatic evaluation of open-ended questions in Massive
Open Online Courses (MOOCs) for higher education. The challenge of dealing with
natural language makes closed-ended questions commonly used. On the other
hand, spontaneous human expression can generate additional useful feedback for
teachers and students. We use a Named Entity Recognition (NER) system based on
the Hidden Markov Model (HMM) statistical technique for this purpose. The data
comes from a MOOC pilot test conducted with 179 engineering bachelor’s students.
The performance of the HMM-NER system was above 95% for both Precision and
Recall metrics. In practice, all student responses in the test data were assessed
correctly and confidently. Human-based steps in this process have proven to be
crucial for a successful application. This highlights the role of the teacher, as
technology alone cannot provide satisfactory solutions.
1 Corresponding Author
GM de Almeida
gustavo.de.almeida@lut.fi
1 INTRODUCTION
Massive Open Online Courses (MOOCs) have seen an exponential increase in higher
education in recent years (Moore & Blackmon, 2022). One characteristic of this
digital learning modality is the usual need for automatic assessment of student tasks.
This issue often leads to the use of closed-ended questions, e.g. multiple-choice.
The opportunity to use open-ended questions, such as essays, can allow for the
exploration of other aspects of the teaching-learning environment (Gobbo et al.,
2023). This can provide additional insights and feedback on actual learning for both
teachers and students. The question that arises is how to perform the necessary
automatic assessment in MOOCs, considering natural language-based responses.
The answer lies in the computational area of Natural Language Processing (NLP)
(Jurafsky & Martin, 2008), the branch of AI that deals with natural language data.
This work explores Named Entity Recognition (NER) (Chinchor & Robinson, 1997), a
type of NLP approach, for the automatic assessment of open-ended questions in the
context of MOOCs. The NER system is based on the Hidden Markov Model (HMM)
statistical technique, commonly used for e.g. speech recognition (Rabiner, 1990). To
our knowledge, HMM-NER systems have not been used for this purpose in MOOCs.
2 CONTEXT AND PRACTICAL WORK
In a recent work (Almeida et al., 2025), we proposed a framework for evaluating
open-ended questions. In short, a pilot test was conducted using a developed
MOOC with 179 engineering bachelor’s students. The framework was demonstrated
using one of its open questions (Figure 1). First, the question is divided into parts. In
this case, Part 1 ("CPS Structure") and Part 2 ("CPS Application"). Second, each
part is subdivided into Sub-Questions (SQ), related to basic concepts. For Part 1:
SQ1 ("Physical World"), SQ2 ("ICT"; Information and Communication Technology)
and SQ3 ("Digital World"), and for Part 2: SQ1 ("Goal") and SQ2 ("Gain"). For
example, SQ1 refers to whether there is any mention of the "Physical World" in the
student’s response. The expected answer considered Bloom's Taxonomy with a
matrix of rubrics (Anderson & Krathwohl, 2000), aligned with the Intended Learning
Outcomes (ILOs). A positive answer scores SQ1 as “Correct”; otherwise, “Incorrect”.
This is applied to all sub-questions. Next, Part 1 is considered “Complete” if all of its
sub-questions (SQ1-SQ3) were considered “Correct”. Otherwise, “Partially Correct” if
only some of the sub-questions, or “Insufficient” if none of them. This is applied to all
parts. In the end, if all parts were considered “Complete”, the question is considered
“Correct”. Otherwise, “Partially Correct” if only some of the parts, or “Incorrect” if
none of them. This work investigates the automatic evaluation of open questions,
based on this previous framework, using the same question as a case study.
Example:
Sub-Question (SQ): “Correct” “Incorrect” “Correct” “Correct” “Correct”
Part: “Incomplete” “Complete”
Final assessment: “Partially Correct”
Fig. 1. Framework for automatic evaluation of open-ended questions (previous work)
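The aggregation rules of the framework can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation, and the function names `score_part` and `score_question` are hypothetical. It follows the wording of the running text, in which a part with only some correct sub-questions is labelled "Partially Correct" (the example in Figure 1 labels this case "Incomplete"), and it assumes that a question is "Partially Correct" when only some of its parts are "Complete".

```python
def score_part(sub_scores):
    """Aggregate sub-question scores ("Correct"/"Incorrect") into a part score."""
    n_correct = sum(1 for s in sub_scores if s == "Correct")
    if n_correct == len(sub_scores):
        return "Complete"
    if n_correct > 0:
        return "Partially Correct"
    return "Insufficient"


def score_question(parts):
    """Aggregate part scores into the final assessment of the question."""
    part_scores = [score_part(p) for p in parts]
    n_complete = sum(1 for s in part_scores if s == "Complete")
    if n_complete == len(part_scores):
        return "Correct"
    if n_complete > 0:
        return "Partially Correct"
    return "Incorrect"


# Example of Figure 1: Part 1 has SQ1-SQ3, Part 2 has SQ1-SQ2.
part1 = ["Correct", "Incorrect", "Correct"]
part2 = ["Correct", "Correct"]
final = score_question([part1, part2])
print(score_part(part1), score_part(part2), final)
```

With the sub-question scores of the Figure 1 example, the sketch yields the same final assessment as the figure.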
3 CONCEPTS
3.1 Named Entity Recognition (NER)
Named Entity Recognition (NER) is used to locate and classify named entities
mentioned, for example, in textual data (Chinchor & Robinson, 1997). For example,
in the sentence: "The university has been working on Learning Analytics.",
"university" is an example of a named (mentioned) entity belonging to the entity tag
(category): "Location". NER has been widely used in many Natural Language
Processing (NLP) applications, for example, information extraction, information
retrieval, machine translation, and sentiment analysis (Goyal et al., 2018).
In the present work, the set of entity tags is given by {"Physical World", "ICT",
"Digital World", "Goal", "Gain"}, corresponding to the five sub-questions (SQ1 to
SQ5) of the case study question (section 2). If a named entity is identified for each
sub-question, the student’s answer is considered “Correct” (Figure 1).
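As a minimal sketch of this rule (assuming the tag sequence is already available from the NER system; `score_sub_questions` is a hypothetical helper name), each sub-question is scored "Correct" when its entity tag occurs at least once in the predicted sequence:

```python
ENTITY_TAGS = ["Physical World", "ICT", "Digital World", "Goal", "Gain"]


def score_sub_questions(tag_sequence):
    """Map a predicted tag sequence to one score per sub-question (SQ1-SQ5)."""
    found = set(tag_sequence)
    return {tag: "Correct" if tag in found else "Incorrect" for tag in ENTITY_TAGS}


# Tag sequence of the example sentence in Table 2 ("Outside" = no entity tag).
tags = ["Outside"] * 4 + ["Physical World"] + ["Outside"] * 3 + ["Physical World"]
scores = score_sub_questions(tags)
print(scores)
```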
3.2 Hidden Markov Model (HMM) based NER System
Hidden Markov Model (HMM) is a statistical model for sequential pattern recognition
(Rabiner, 1990). It can be seen as an extension of Markov chains, since in addition
to the probabilistic transition between the states (of the Markov chain), the
relationship between states and emissions (system outputs) is also probabilistic (that
is, hidden). The choice of HMM in this work was precisely due to its explicit
probabilistic modeling that favors interpretability, essential for feedback purposes to
students and teachers. The initial applications were in speech recognition around the
1970s. Since then, it has been applied in many other areas such as human activity
recognition, bioinformatics, and network analysis (Mor et al., 2021). More information
about the theory of HMMs can be found, for example, in Rabiner (1990).
A discrete HMM is defined by five elements $(S, O, \pi, A, B)$, as shown in Table 1 (1st
and 2nd columns). The 3rd column contextualizes them in NER applications using the
case study of this work. Since the system outputs in this case are given by a
sequence of words (student response), which are discrete symbols belonging to a
finite set (vocabulary), the matrix $B$ refers to discrete probability distributions, which
characterizes a discrete HMM.
Table 1. Elements of a discrete HMM in the context of a NER application

| Element | Description | An example in this work |
|---|---|---|
| States ($S$) | Each state of the Markov chain represents an entity tag. | $S$ = {"Physical World", "ICT", "Digital World", "Goal", "Gain"} |
| Observations ($O$) | Sequence of words (student response). | $O$ = {"An", "example", "of", "a", "process", "is", "the", "pulp", "process"} |
| Transition probability matrix ($A$) | Probability of moving from one state to another. $A(i,j) = P(S_t = j \mid S_{t-1} = i)$ | $A$("Physical World", "ICT") = $P$("ICT" $\mid$ "Physical World") |
| Emission probability matrix ($B$) | Probability of a state generating (that is, emitting) a specific observation (word). $B(i,w) = P(W_t = w \mid S_t = i)$ | $B$("ICT", "digital") = $P$("digital" $\mid$ "ICT") |
| Initial probability vector ($\pi$) | Probability of a state occurring at the beginning of a sentence. $\pi(i) = P(S_1 = i)$ | $\pi$("Physical World") = $P$("Physical World") |
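To make the roles of π, A and B concrete, the sketch below stores them as Python dictionaries for a reduced model with only two states and computes the joint probability of a word sequence and a tag sequence, that is, π(s1)·B(s1, w1) multiplied by A(s(t-1), s(t))·B(s(t), w(t)) for each later position t. All probability values are invented for illustration and are not taken from the trained model of this work.

```python
states = ["Outside", "Physical World"]

# Initial probabilities pi(i) = P(S_1 = i)
pi = {"Outside": 0.9, "Physical World": 0.1}

# Transition probabilities A(i, j) = P(S_t = j | S_{t-1} = i)
A = {
    "Outside": {"Outside": 0.8, "Physical World": 0.2},
    "Physical World": {"Outside": 0.7, "Physical World": 0.3},
}

# Emission probabilities B(i, w) = P(W_t = w | S_t = i)
B = {
    "Outside": {"the": 0.5, "pulp": 0.4, "process": 0.1},
    "Physical World": {"the": 0.05, "pulp": 0.05, "process": 0.9},
}


def joint_probability(words, tags):
    """P(words, tags) under the HMM (pi, A, B)."""
    prob = pi[tags[0]] * B[tags[0]][words[0]]
    for t in range(1, len(words)):
        prob *= A[tags[t - 1]][tags[t]] * B[tags[t]][words[t]]
    return prob


words = ["the", "pulp", "process"]
p = joint_probability(words, ["Outside", "Outside", "Physical World"])
print(p)
```

Each row of A and B sums to one, as required for probability distributions.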
One use of HMM as a generative model is to seek to reveal the underlying process
(sequence of states) responsible for generating the system outputs (sequence of
observations). In this work, the observations are given by a sequence of words ($O$ =
{student response}) and each state is associated with an entity tag in particular ($S$ =
{"Physical World", "ICT", "Digital World", "Goal", "Gain"}). Considering the objective
of this work to automatically evaluate open-ended questions in MOOCs, an HMM
$(\pi, A, B)$ will compute a sequence of entity tags (model output) for a sequence of
words (model input). In other words, the idea with an HMM-based NER system is to
find the most likely sequence of entity tags given a sequence of words (student
response), with one entity tag for each word in the sequence. That is, in the end,
each word in the student's response is classified into an entity tag. Table 2 (1st and
2nd columns) shows an example of the input-output association of an HMM in the
context of an HMM-based NER system. The resulting NER tag sequence can be
directly used for automatic evaluation of student responses using the proposed
framework for automatic evaluation of open-ended questions (section 2). The term
"Outside" is used when a word does not match any previously defined entity tag.
Table 2. Input-output association in the context of an HMM-based NER system

| HMM input (in original values, that is, words): sequence of words | HMM output: sequence of entity tags | HMM input (in index values, one for each word in the 1st column of this table): sequence of indices |
|---|---|---|
| "An" | "Outside" | 1 |
| "example" | "Outside" | 2 |
| "of" | "Outside" | 3 |
| "a" | "Outside" | 4 |
| "process" | "Physical World" | 5 |
| "is" | "Outside" | 6 |
| "the" | "Outside" | 7 |
| "pulp" | "Outside" | 8 |
| "process" | "Physical World" | 5 |
4 METHODOLOGY
Figure 2 shows the methodology adopted in this work, where the starting point is the
original set of student responses. There are four main tasks. (1) Text preparation
cleans the original textual data using word normalization, punctuation removal, and
word lowercasing. (2) Reference definition manually evaluates all student responses
to use as a reference and then establishes sets of named entities (that is,
vocabularies, one for each sub-question; section 2) that are used to generate the
sequences of entity tags for the sequences of words (cleaned student responses). In
this sense, an additional state for words not associated with any entity tag is
categorized as "Outside" (that is, outside of any vocabulary). (3) Model training
obtains the discrete HMM model, that is, the set of parameters ($(\pi, A, B)$; Table 1)
through an iterative estimation process using the Baum-Welch algorithm (Rabiner,
1990). Using the hold-out approach, 85% of the student responses are used for this
purpose, after a word-to-index conversion. (4) Model testing uses the remaining 15%
of the student responses. Words not present in the training data and in turn in the
emission probability matrix ($B$) need to be handled in some way (shown in Table 3).
The resulting trained HMM-based NER system is then used to generate the most
likely sequences of entity tags for the input sequences of words (student responses
in the test data not used during model training). This computation usually employs
the Viterbi algorithm (Rabiner, 1990). The obtained result is compared to the target
sequences of entity tags to evaluate the quality of the system using the confusion
matrix and derived metrics. Once validated, the HMM-based NER system can be
used for automatic evaluation of open-ended questions, as proposed in this work.
Fig. 2. Methodology steps.
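The decoding step can be sketched as follows. This is a compact pure-Python version of the Viterbi algorithm working in log-space; it is not the Matlab code used in this work, and the two-state toy parameters are invented for illustration.

```python
import math


def viterbi(words, states, pi, A, B):
    """Return the most likely state (entity tag) sequence for `words`."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path ending in state s at step t
    best = [{s: log(pi[s]) + log(B[s].get(words[0], 0.0)) for s in states}]
    back = []
    for w in words[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[-1][r] + log(A[r][s]))
            scores[s] = best[-1][prev] + log(A[prev][s]) + log(B[s].get(w, 0.0))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Backtrack from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


states = ["Outside", "Physical World"]
pi = {"Outside": 0.9, "Physical World": 0.1}
A = {"Outside": {"Outside": 0.8, "Physical World": 0.2},
     "Physical World": {"Outside": 0.7, "Physical World": 0.3}}
B = {"Outside": {"the": 0.5, "pulp": 0.4, "process": 0.1},
     "Physical World": {"the": 0.05, "pulp": 0.05, "process": 0.9}}

tags = viterbi(["the", "pulp", "process"], states, pi, A, B)
print(tags)
```

Working with log-probabilities avoids numerical underflow on long responses, since products of many small probabilities become sums.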
Table 3 lists the set of hyperparameters that are combined to obtain candidates for
the HMM-based NER system. Eight candidates are generated, of which one is
selected by performance comparison using the test data. The first hyperparameter
refers to using representatives of all words (provided by the text corpus, that is, all
student responses) in the training data. This is similar to the handling of minimum
and maximum values in prediction problems in machine learning. Second, two
approaches were used to deal with unseen words, that is, words that appear in the
test data but not in the training data. Laplace Smoothing addresses the counting
problem, more specifically, the zero frequency (probability) issue. In the context of
NER, a general observation called "unknown" is created and given a small
probability value, being included in the emission probability matrix ($B$). When an
unseen word appears, it is then assigned as "unknown". Word Embedding maps
words to vectors. A pre-trained model provided by the fastText library (Mikolov et al.,
2018) was adopted, which returns a 300-dimension real vector. The "unknown" word
vector is then compared to all word vectors in the vocabularies and replaced by the
closest word, considering semantic relations. Regarding the third hyperparameter,
two distance measures were used for such comparison, Euclidean distance and
cosine similarity. The Matlab computing environment was used in this work.
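The Laplace smoothing option can be illustrated with a count-based estimate of the emission matrix B from tagged sequences. Note that this sketch uses simple relative-frequency counting with add-one smoothing rather than the Baum-Welch procedure mentioned above, and the names `estimate_emissions`, `emission` and `UNKNOWN` are hypothetical. An extra "unknown" observation is added to the vocabulary so that unseen words receive a small non-zero probability.

```python
from collections import Counter, defaultdict

UNKNOWN = "unknown"


def estimate_emissions(tagged_sequences, alpha=1.0):
    """Estimate B(i, w) from (word, tag) pairs with add-alpha smoothing."""
    counts = defaultdict(Counter)
    vocab = {UNKNOWN}
    for sequence in tagged_sequences:
        for word, tag in sequence:
            counts[tag][word] += 1
            vocab.add(word)
    B = {}
    for tag, c in counts.items():
        total = sum(c.values()) + alpha * len(vocab)
        B[tag] = {w: (c[w] + alpha) / total for w in vocab}
    return B


def emission(B, tag, word):
    """Look up B(tag, word), mapping unseen words to the "unknown" symbol."""
    return B[tag].get(word, B[tag][UNKNOWN])


train = [[("the", "Outside"), ("pulp", "Outside"), ("process", "Physical World")],
         [("a", "Outside"), ("process", "Physical World")]]
B = estimate_emissions(train)
print(emission(B, "Physical World", "process"), emission(B, "Physical World", "mill"))
```

In this toy data the seen word "process" gets probability 3/7 under "Physical World", while the unseen word "mill" falls back to the "unknown" entry with probability 1/7 instead of zero.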
Table 3. Hyperparameter set explored during HMM training

| Task | Hyperparameter | Value |
|---|---|---|
| [1] Model training | Presence of representatives of all words in the training data | (1) No verification; (2) At least one representative of each vocabulary word (named entity) is present in the training data |
| [2] Model training (in case of (1)) and Model testing | How to deal with unseen words, that is, words that are not present in the predefined vocabularies | (1) Laplace smoothing; (2) Word embeddings |
| [3] Model testing (given the use of word embeddings in [2]) | Distance metric between unseen words (test data) and vocabulary words (training data) | (1) Euclidean distance; (2) Cosine similarity |
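Combining the two values of each hyperparameter in Table 3 gives the eight candidates. A minimal sketch of this enumeration is shown below; the dictionary keys are hypothetical names, and the order of the combinations is not claimed to match the model numbering used in the results.

```python
from itertools import product

grid = {
    "representatives_in_training": ["no verification", "at least one per vocabulary word"],
    "unseen_word_handling": ["Laplace smoothing", "word embeddings"],
    "distance_metric": ["Euclidean distance", "cosine similarity"],
}

# One candidate configuration per combination of hyperparameter values.
candidates = [dict(zip(grid, values)) for values in product(*grid.values())]

for number, config in enumerate(candidates, start=1):
    print(number, config)
```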
5 RESULTS AND INSIGHTS
5.1 Text preparation
The pilot test (section 2) allowed students to choose between different learning
paths. Of the total of 179 students who carried out the activity, 97 answered the
question used as a case study in this work (Figure 1). This step had the main
objective of reducing the words to a root form, so that the inflected words could be
analysed as a single term. We adopted the process of lemmatization, which reduces
the words to their dictionary forms, e.g. "sent" to "send" and "according" to "accord".
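A minimal sketch of this preparation step is given below. It lowercases the text, removes punctuation, and applies a tiny hand-written lemma table; the table `LEMMAS` and the function `prepare` are hypothetical stand-ins for a full lemmatizer, which is not specified in this work.

```python
import string

# Hypothetical lemma table; a real system would use a lemmatizer instead.
LEMMAS = {"sent": "send", "according": "accord", "processes": "process"}


def prepare(response):
    """Lowercase, strip punctuation, and lemmatize a student response."""
    cleaned = response.lower().translate(str.maketrans("", "", string.punctuation))
    return [LEMMAS.get(word, word) for word in cleaned.split()]


tokens = prepare("Data is sent, according to the Processes!")
print(tokens)
```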
5.2 Reference definition
First, all 97 students’ responses were manually scored according to the framework
shown in Figure 1. This human-based evaluation was used as a reference labelling
to assess the quality of the HMM-NER system. This initial analysis was then used to
construct five vocabularies, one for each sub-question (SQ1-SQ5, which refers to
{"Physical World", "ICT", "Digital World", "Goal", "Gain"}, respectively) (section 2
and Figure 1). For example, the vocabulary for the entity tag: "Physical World",
contains twelve words: {"assembly", "equipment", "facility", …, "tool"}. Both tasks
are crucial and completely teacher-dependent and their settings should be aligned
with the Intended Learning Outcomes (ILO). In this work, we consider the first level
of Bloom’s taxonomy: “Remember”, combined with a set of rubrics: {“Correct”,
“Partially Correct”, “Incorrect”}. The set of sub-questions defines the set of entity
tags. In this way, we construct the sequences of entity tags, which will be used as
input information to train the HMM models along with the sequences of words
(student responses). Table 2 shows an example of a sequence of entity tags (2nd
column) derived from a sequence of words (1st column).
5.3 Model training
The training dataset consists of 82 (85% of 97) student responses of varying lengths.
Given the discrete nature of the HMM model, the words were converted to indices
(3rd column of Table 2). As an illustration, the word "process" that appears twice
receives the same index (“5”). Then, both the integer-valued word sequences and
the corresponding entity tag sequences are ready for the model training phase. Eight
candidate HMM models are obtained by varying the hyperparameter set (Table 3).
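The word-to-index conversion can be sketched as follows (the helper `build_index` is a hypothetical name). Indices are assigned in order of first appearance, so a repeated word keeps the index it received the first time, which reproduces the 3rd column of Table 2.

```python
def build_index(sequences):
    """Assign a 1-based index to each distinct word, in order of first appearance."""
    index = {}
    for sequence in sequences:
        for word in sequence:
            index.setdefault(word, len(index) + 1)
    return index


words = ["An", "example", "of", "a", "process", "is", "the", "pulp", "process"]
index = build_index([words])
encoded = [index[w] for w in words]
print(encoded)  # [1, 2, 3, 4, 5, 6, 7, 8, 5]
```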
5.4 Model testing
This section presents the results for the 15 student responses of varying lengths in
the test dataset. It is crucial to check the behaviour of a model with data not used
during its training. Table 4 summarizes the results for all eight candidate HMM-based
NER systems (hyperparameter values specified by the gray cells). Values refer to
averages of the ten runs of each candidate model. Two usual metrics are shown,
Precision and Recall, which are calculated based on the confusion matrix. Given the
class imbalance issue, where classes are given by entity tags with different sizes,
the overall accuracy metric is not appropriate. Precision [$= TP/(TP+FP)$] refers to
the rate of true positives out of all positive predictions and Recall [$= TP/(TP+FN)$]
refers to the rate of true positives out of all actual positives, where $TP$, $FP$, and
$FN$ are the numbers of true positives, false positives, and false negatives,
respectively. The range of both metrics is [0,1] and the closer to 1, the better. Values
below 0.95 are shown in red. Recall values are generally close to 1, meaning that
true entity tags are usually classified correctly, as desired. However, this is not the
case for Precision, whose lowest value is 0.32 (model 5 and entity tag "ICT"),
meaning a relatively higher rate of false positives. That is, a portion of the predictions
in one entity tag actually belong to another entity tag. For example, considering
model 5, of all the "ICT" predictions, only 32% are actually true. This also occurs for
"Digital World". Unlike the Recall metric, for Precision, there are improvements in
the model (i.e., reduction in the false positive rate) for some hyperparameter values.
The best result was given by model 7 (bold values in Table 4), which uses [1]
representatives of all words in the training data, [2] word embeddings to deal with
unseen words (the most significant factor among all three), and [3] the Euclidean
distance metric to select the representative word for unseen words. All values of the
Precision and Recall metrics in this case are considerably high, above 0.95, as
desired. An example of using the word embedding approach is given by replacing
the unseen word "build" (which appeared in the test data but not in the training data)
with the vocabulary word "develop" (present in the training data) in one of the model
runs. Other examples are "security" with "safety" and "path" with "route".
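The replacement of an unseen word by its closest vocabulary word can be sketched as below. The three-dimensional vectors are invented for illustration (the pre-trained fastText vectors have 300 dimensions), and `closest_word` is a hypothetical helper name. Cosine similarity is turned into a distance (one minus the similarity) so that both measures can be minimised in the same way.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def closest_word(vector, vocabulary_vectors, distance=euclidean):
    """Return the vocabulary word whose vector is closest to `vector`."""
    return min(vocabulary_vectors, key=lambda w: distance(vector, vocabulary_vectors[w]))


# Toy embeddings (invented values).
vocabulary_vectors = {
    "develop": [0.9, 0.1, 0.0],
    "safety": [0.0, 0.9, 0.2],
    "route": [0.1, 0.0, 0.9],
}
unseen = {"build": [0.8, 0.2, 0.1], "security": [0.1, 0.8, 0.3]}

for word, vector in unseen.items():
    print(word, "->", closest_word(vector, vocabulary_vectors),
          closest_word(vector, vocabulary_vectors, cosine_distance))
```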
The Macro-Precision and Macro-Recall metrics are also shown, which summarize
the corresponding individual values into a single one. For example, Macro-Precision
is given by $\sum_i \mathit{Precision}_i / N$, with $i = 1, 2, \ldots, N$, where $N$ is the
number of entity tags. Both metrics are close to 1 for model 7, as desired.
Therefore, the final HMM-NER system (model 7) was able to address both issues,
namely, correctly predicting entity tags (high Recall values) and at the same time,
given an entity tag prediction, ensuring high confidence that it is correct (high Precision
values). Both metrics are crucial to achieve a reliable model for automating open
question evaluation. In practice, this means that all student responses in the test
data were assessed correctly and confidently, compared to the reference
(teacher-based) labelling (Figure 2). This also shows the benefit of a hyperparameter grid
search. Similar results were obtained for other open questions (not shown) in the
same pilot test (section 2), contributing to the validation of the proposal of this work.
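The per-tag and macro-averaged metrics can be computed from a reference and a predicted tag sequence as sketched below. The two short sequences are invented for illustration, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(reference, predicted, tags):
    """Per-tag Precision = TP/(TP+FP) and Recall = TP/(TP+FN)."""
    metrics = {}
    for tag in tags:
        tp = sum(1 for r, p in zip(reference, predicted) if r == tag and p == tag)
        fp = sum(1 for r, p in zip(reference, predicted) if r != tag and p == tag)
        fn = sum(1 for r, p in zip(reference, predicted) if r == tag and p != tag)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        metrics[tag] = (precision, recall)
    return metrics


tags = ["ICT", "Goal", "Outside"]
reference = ["ICT", "Outside", "Goal", "Outside", "Outside", "Goal"]
predicted = ["ICT", "ICT", "Goal", "Outside", "Outside", "Goal"]

metrics = precision_recall(reference, predicted, tags)
macro_precision = sum(p for p, _ in metrics.values()) / len(tags)
macro_recall = sum(r for _, r in metrics.values()) / len(tags)
print(metrics, macro_precision, macro_recall)
```

In this toy case one "Outside" word is wrongly predicted as "ICT", which halves the Precision of "ICT" while leaving its Recall at 1, the same pattern described for the Laplace-smoothing candidates.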
One current issue concerns cheating with the use of AI writers. Together with
pedagogical strategies, automatic detection tools can also be used. As an example
in this work, entities in students’ answers can be compared with those in the
vocabularies (section 4). This can be measured by the probability of generating the
sequence of entities ($P(O \mid \text{HMM-NER system})$) of the student’s answer ($O$).
Relatively low values may suggest the use of AI writers.
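The probability P(O | HMM-NER system) can be obtained with the forward algorithm, which sums over all possible tag sequences. The sketch below is a pure-Python version with invented two-state toy parameters; it only illustrates the computation and does not propose a threshold for flagging answers.

```python
def forward_probability(words, states, pi, A, B):
    """P(words | model): sum over all state sequences (forward algorithm)."""
    alpha = {s: pi[s] * B[s].get(words[0], 0.0) for s in states}
    for w in words[1:]:
        alpha = {
            s: sum(alpha[r] * A[r][s] for r in states) * B[s].get(w, 0.0)
            for s in states
        }
    return sum(alpha.values())


states = ["Outside", "Physical World"]
pi = {"Outside": 0.9, "Physical World": 0.1}
A = {"Outside": {"Outside": 0.8, "Physical World": 0.2},
     "Physical World": {"Outside": 0.7, "Physical World": 0.3}}
B = {"Outside": {"the": 0.5, "pulp": 0.4, "process": 0.1},
     "Physical World": {"the": 0.05, "pulp": 0.05, "process": 0.9}}

likely = forward_probability(["the", "pulp", "process"], states, pi, A, B)
unlikely = forward_probability(["process", "process", "the"], states, pi, A, B)
print(likely, unlikely)
```

Under these toy parameters the first word order is roughly twice as probable as the second, which is the kind of relative comparison the paragraph above refers to.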
Table 4. Final results in average values (as a result of ten runs): Comparison between all
eight candidate HMM-based NER systems (best model highlighted in bold; values below
0.95 in red; standard deviation shown only for the best model due to lack of space)
[Hyperparameter columns (Table 3): representative words in the training data (No / Yes); handling of unseen words (Laplace smoothing / Word embedding); distance measure for word embeddings (Euclidean distance / Cosine similarity). Gray-cell shading per model not recoverable.]

Precision (by entity tag):

| HMM-based NER system | "Physical World" | "ICT" | "Digital World" | "Goal" | "Gain" | "Outside" | Macro-Precision (by model) |
|---|---|---|---|---|---|---|---|
| 1 | 0.90 | 0.40 | 0.72 | 0.94 | 0.96 | 1.00 | 0.82 |
| 2 | 0.90 | 0.37 | 0.64 | 0.94 | 0.94 | 1.00 | 0.80 |
| 3 | 0.99 | 0.95 | 0.97 | 0.96 | 0.96 | 0.99 | 0.97 |
| 4 | 0.97 | 0.96 | 0.95 | 0.97 | 0.97 | 0.99 | 0.97 |
| 5 | 0.88 | 0.32 | 0.70 | 0.96 | 0.94 | 1.00 | 0.80 |
| 6 | 0.89 | 0.34 | 0.74 | 0.96 | 0.94 | 1.00 | 0.81 |
| 7 (best) | 0.99 ± 0.01 | 1.00 ± 0.00 | 1.00 ± 0.00 | 0.96 ± 0.03 | 0.96 ± 0.02 | 1.00 ± 0.00 | 0.98 ± 0.01 |
| 8 | 0.97 | 0.98 | 1.00 | 0.95 | 0.98 | 1.00 | 0.98 |

Recall (by entity tag):

| HMM-based NER system | "Physical World" | "ICT" | "Digital World" | "Goal" | "Gain" | "Outside" | Macro-Recall (by model) |
|---|---|---|---|---|---|---|---|
| 1 | 0.99 | 0.98 | 0.99 | 0.97 | 0.95 | 0.93 | 0.97 |
| 2 | 1.00 | 0.98 | 0.98 | 0.97 | 0.96 | 0.91 | 0.97 |
| 3 | 0.99 | 0.98 | 0.98 | 0.98 | 0.97 | 0.99 | 0.98 |
| 4 | 1.00 | 0.99 | 0.99 | 0.99 | 0.97 | 0.99 | 0.99 |
| 5 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.92 | 0.98 |
| 6 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.92 | 0.99 |
| 7 (best) | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 | 0.99 ± 0.01 | 0.99 ± 0.00 | 1.00 ± 0.00 |
| 8 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 1.00 |
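As a check on how the macro-averaged columns relate to the per-tag values, the Macro-Precision of model 1 can be recomputed from its six per-tag Precision values in Table 4:

```latex
\text{Macro-Precision}_{\text{model 1}}
  = \frac{0.90 + 0.40 + 0.72 + 0.94 + 0.96 + 1.00}{6}
  = \frac{4.92}{6}
  = 0.82
```

The result agrees with the Macro-Precision reported for model 1.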
6 CONCLUSIONS AND IMPLICATIONS
This work investigates the automatic evaluation of open-ended questions using a
Named Entity Recognition (NER) system based on the Hidden Markov Model (HMM)
statistical technique. The performance of the final HMM-NER system was above
95% for both Precision and Recall metrics, showing great potential for further
educational use. An advantage of HMM is its explicit probabilistic nature, which
favors interpretability and therefore feedback for teachers and students.
For a broader use of this work, a Learning Management System (LMS) could be
used or a web platform could be developed as a front-end for the teacher. One issue
with data-driven applications is maintaining their accuracy over time. In this work,
this means constant review of the vocabularies (section 4) by the teacher, as well as
retraining the HMM model with new student responses periodically, for example.
This approach makes it possible to use open questions in assessment even with
large courses with limited teacher resources. As part of formative assessment, the
feedback enabled by the HMM-NER system can support and direct student learning,
for example by clarifying misunderstandings and knowledge gaps or adapting
learning paths. The use of open questions in summative assessment focuses
students’ attention on deeper learning rather than rote learning or developing
guessing techniques, which are usually encouraged by multiple-choice questions. In
conclusion, it is important to highlight the pedagogical role of the teacher throughout
this process, as technology alone cannot provide satisfactory solutions.
7 ACKNOWLEDGEMENTS
This work was developed within the scope of the THREADING-CO2 project, funded
by the EU Horizon Europe research and innovation programme (grant 101092257).