PLAYABILITY PREDICTION IN DIGITAL GUITAR LEARNING USING
INTERPRETABLE STUDENT AND SONG REPRESENTATIONS
Manuel Müllerschön  Anssi Klapuri  Marcelo Rodríguez
Christian Cardin
Yousician Oy, Helsinki, Finland
ABSTRACT
Digital music learning applications have become a popular option for
self-guided learning of musical instruments. Personalization of the
learning curriculum in such applications hinges on two essential
components: the learning unit (song arrangement) and the learner
(student). While previous research has focused extensively on
quantifying and characterizing the musical content, learner
representation remains largely unexplored. In this paper, we introduce
interpretable representations of these components, in the context of
digital guitar learning.
We propose a methodology to embed song arrangements and individual
guitar students into a shared, interpretable skill vector space. To
achieve that, we employ an automated profiling technique for guitar
tablatures, generating granular vocabulary and difficulty descriptors.
We validate the effectiveness of these representations by predicting
the proportion of onsets played correctly by students, using a
large-scale dataset from an online guitar learning platform.
Our results demonstrate that models leveraging the combined
representation of students and song arrangements outperform informed
baselines and show better predictive accuracy when compared to models
using either representation individually. These findings underscore the
value of joint learner-song arrangement representations for digital
music learning.
1. INTRODUCTION
In 1984, Benjamin Bloom published the landmark paper The 2 Sigma
Problem - The Search for Methods of Group Instruction as Effective as
One-to-One Tutoring [1]. In his research, Bloom validated quantitatively
what many teachers subconsciously knew to be true: individualized
learning supervision can drastically improve the learning experience.
The effect was significant; students in his research cohort performed
two standard deviations better than those taught in a conventional
classroom setting.

© M. Müllerschön, A. Klapuri, M. Rodríguez and Christian Cardin.
Licensed under a Creative Commons Attribution 4.0 International License
(CC BY 4.0). Attribution: M. Müllerschön, A. Klapuri, M. Rodríguez and
Christian Cardin, “Playability Prediction in Digital Guitar Learning
using Interpretable Student and Song Representations”, in Proc. of the
26th Int. Society for Music Information Retrieval Conf., Daejeon, South
Korea, 2025.
Forty years later, self-guided learning on digital learning platforms
enjoys tremendous popularity [2]. While these education providers offer
many advantages (flexibility, ubiquitousness, price), they still largely
miss out on the educational boon of personalized learning.
Remedying that and building meaningful personalization hinges on
understanding: Who is the student, and what is their experience? And,
what are the characteristics of the learning units on the platform? In
the case of guitar students, that would entail the need to characterize
both the learning units and the skill level of the students. Previous
research has focused on the former [3–6]. Comparatively less work has
gone into the latter, likely due to the unavailability of sizeable
longitudinal datasets. Our work is motivated by this research gap.
The context of this paper is Yousician, a digital guitar learning
platform that provides several arrangements of each song at varying
levels of difficulty. We propose a methodology to match song
arrangements to the skill level of the student, aiming to improve
learning outcomes [7,8]. Our approach consists of three main elements:
1. We employ an automated profiling technique for song arrangements,
   annotating each note or chord with educationally relevant tags and
   numerical difficulty estimates.
2. When the student plays a song arrangement on the platform, we assess
   the correctness of their playing at each point, in real time. This
   allows us to translate the tags and difficulty estimates into
   evidence of skill. The information is accumulated into a data
   structure that we call the student experience profile.
3. We have a large record of students who attempted to play song
   arrangements on the learning platform. The data includes the student
   experience profile at the time of play. With it, we train a model
   that can predict the success rate of a certain student in playing a
   certain song arrangement for the first time. We call this task
   personalized playability prediction.
Our interest lies in transparent, interpretable representations of
student skill and arrangement difficulty. Having that enables students
of the learning platform to understand
what skill they are learning, and the aspects that make a song
arrangement difficult.
2. RELATED WORK
In MIR there has been significant attention paid to the educational
characterization and difficulty estimation of song arrangements for
piano [3, 6, 9], and guitar [4, 5]. Velez Vazquez et al. introduce the
concept of playability prediction for guitar, predicting expert-assigned
difficulties of chord arrangements, though not personalizing it to the
individual student [4]. With regards to the student, there is an
established field of study on performance assessment, with analysis
mostly tied to a single performance [10–12].
Research in educational psychology has studied the problem of knowledge
tracing, which seeks to model a student’s state of understanding
[13,14]. The work has historically centered around domains such as
mathematics and language. Recent work has adapted knowledge tracing to
music learning [15]. Beyond knowledge tracing, educational psychology
provides us with relevant insights into designing self-guided learning
environments [16]. In addition, we draw inspiration for designing
digital learning platforms from music learning research, such as
dynamics of instrument practice at home [7] and institutional music
learning [8].
If we go beyond the domain of music education, there is a broad body of
research on educational recommender systems: their design, application,
and evaluation [17–20]. While many educational recommender systems
address institutional learning settings with scarce data, some address
the personalization of digital learning platforms. Smith et al.
developed a co-creative AI for the music and code learning platform
EarSketch, providing personalized feedback and recommendations to
learners [21]. Notably, the language learning app Duolingo developed an
educational recommender system that models both the student and the
learning unit, which improved learning outcomes [22].
3. PROFILING TABLATURES WITH TIMED
TAGS AND DIFFICULTY ESTIMATES
In our digital learning platform, songs are arranged at different
difficulty levels and oriented towards different styles (e.g., melodic
playing, accompaniment). The arrangement’s notation can be assumed to
specify, at least, the notes to be played, their musical timing, and the
string, fret, and finger to articulate them.
Figure 1 depicts our profiling system in action. The profiler proceeds
by scanning the input tablature, note by note. Tag descriptors are
categorical, aiming to provide a vocabulary of learning outcomes.
Difficulty descriptors are vectors, aiming to quantify psychomotor
difficulty. All descriptors are time-stamped, making a tablature profile
a tabular, time-series hybrid. In the following, we elaborate, in turn,
on each aspect.
Figure 1: Profiling example for 'Under The Bridge' by RHCP.

Tags: Each tag is defined in collaboration with our team of educators.
Tagging automation needs to scale to both new tag definitions, as well
as tag definition changes. We sided with a rule-based system, as it
provided a clean way to address definition changes. The resulting tag
set matches the wording used to introduce music concepts and playing
techniques in instructional videos, making for instances of both nominal
data (e.g. 'cowboy chords') and ordinal data (e.g. 'small jump', 'big
jump'). The complete tag set has 650 unique tags. In this paper, we
experiment with three tag subsets: (a) instrumental technique, with 37
tags (e.g. 'slide', 'thick strings', 'partial barre'), (b) rhythm, with
20 tags (e.g. '16ths', 'off-beat', 'shuffles'), and (c) chord shapes,
with 256 tags (e.g. 'xx0232', which is an open D chord voicing, in
standard tuning). The latter subset contains the most common open and
movable guitar chord shapes in our catalogue.
Psychomotor difficulty estimates

Fretting hand (6):  Posture discomfort, fretting force, hand relocation
                    speed, finger reallocation speed, wrist
                    repositioning speed, finger repositioning speed
Plucking hand (4):  Hand relocation speed, pick repositioning speed,
                    string muting coordination, meter-picking alignment

Table 1: Numerical difficulty estimates used in this paper.
Difficulties: We aim to automatically describe five factors that may
complicate music performance: fretting discomfort (e.g., when pressing
strings), hand posture discomfort (e.g., when holding a chord), movement
inefficiency (e.g., when transitioning from one point on the fretboard
to another), positioning inefficiency (e.g., when rearranging fingers so
that they are placed on top of the strings that need to be pressed), and
coordination inefficiency (e.g., when a picking pattern does not align
with metric subdivisions).
Proceedings of the 26th ISMIR Conference, Daejeon, Korea, September 21-25, 2025

To that end, we implement models that, given an input tablature, can
predict articulation symbols. We make one of such models to predict
(fretting-hand) fingering, based on [23], and another to predict
plucking direction (down or up), based on [24]. Both frame articulation
prediction as a search problem, where good fingers/plucking choices are
those that minimize cumulative biomechanical cost. To solve the search
problem efficiently, dynamic programming is used. Our references are
instances of proof-of-principle work. In both cases, we curated
validation databases, so as to extend their processing capacity (e.g.,
have [23] support polyphonic input), and fine-tune their biomechanical
interpretability. Table 1 specifies the full vector. Describing the
inner workings of each model is beyond the scope of this paper. Refer to
[25] for details.
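To make the search formulation concrete, the following is a minimal sketch of a dynamic-programming search that picks one articulation choice per note so that the cumulative cost is minimal. It is not the model of [23], [24], or [25]: the function name `min_cost_articulation` and the `local_cost` and `transition_cost` callbacks are hypothetical placeholders for whatever biomechanical costs a real system would define.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def min_cost_articulation(
    notes: Sequence[int],
    choices: Sequence[int],
    local_cost: Callable[[int, int], float],
    transition_cost: Callable[[int, int], float],
) -> Tuple[float, List[int]]:
    """Pick one choice per note so that the summed cost is minimal.

    local_cost(note, choice) and transition_cost(prev_choice, choice)
    are hypothetical stand-ins for biomechanical cost terms.
    """
    if not notes:
        return 0.0, []
    # best[c]: cheapest cumulative cost of any sequence ending in choice c
    best: Dict[int, float] = {c: local_cost(notes[0], c) for c in choices}
    back: List[Dict[int, int]] = []
    for note in notes[1:]:
        new_best: Dict[int, float] = {}
        pointers: Dict[int, int] = {}
        for c in choices:
            # cheapest predecessor for ending the sequence in choice c
            prev = min(choices, key=lambda p: best[p] + transition_cost(p, c))
            new_best[c] = best[prev] + transition_cost(prev, c) + local_cost(note, c)
            pointers[c] = prev
        best = new_best
        back.append(pointers)
    # backtrack from the cheapest final choice
    last = min(choices, key=lambda c: best[c])
    total = best[last]
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return total, path
```

With two choices (say, 0 for a downstroke and 1 for an upstroke) and a toy cost that penalises strokes that disagree with the metric position, the search recovers strict alternation on a run of consecutive subdivisions. The cost of the search grows linearly with the number of notes, which is the efficiency argument for dynamic programming.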
4. REPRESENTING STUDENT SKILL AND
ARRANGEMENT DIFFICULTY
Personalization requires modelling the interrelationship between
students and learning units. In Section 3, we showed how to represent
the content of song arrangements, with note-level granularity. We also
have access to evaluated performances, with matching granularity. This
section describes how, from that data, we make a joint representational
space for students and arrangements.
4.1 Student Experience Profile (SXP)
The student is represented in a data structure that we call the student
experience profile (SXP). The core component of an SXP is an experience
vector, in which evidence of skill is accumulated. We describe the
experience vector in section 4.1.1, and describe how we extend its raw
capacity in section 4.1.2.
4.1.1 Student Experience Vector
Any time a student chooses to play an arrangement, the application
records the evaluation of each note/chord performed. Given that the
platform offers different play modalities, such as practice and looping,
we refer to plays as student-arrangement interactions, to preserve
generality. Both evaluated performance and tags/difficulties are
time-stamped, with high precision, so we can accurately align the two.
For this paper, we test a 433-dimensional experience vector. Each
element corresponds to either a tag or a difficulty. Tags account for
256 + 37 + 20 dimensions, corresponding to chords, instrumental
technique, and rhythm. Difficulties account for 120 dimensions.
Difficulties are continuous variables, which we quantise to an integer
scale of ten steps. We reserve 100 dimensions to host them, and reserve
10 + 10 dimensions to host their cumulative totals.
Experience Accumulation: When the student plays correctly, the vector
values are incremented at the positions corresponding to the tags and
difficulties (that temporally align with the note/chord played).
Henceforth, we refer to an evaluated note/chord instance as onset, so as
to see success/mistake as an event in time, which may have to do with
more than just the note/chord instance that was played at that point.
For music units that in fact span one onset, for instance, a chord
strum, we accumulate experience based on the evaluation of that single
onset. For music units that span more than a single onset, such as
rhythms, we use a sliding window of eight onsets, and accumulate
experience only if the share of correctly performed onsets exceeds 75%.
This design choice was made to be conservative about experience
accumulation. For difficulties, we follow a slightly different logic.
For each onset position in the arrangement, we have access to the
difficulty vector d, specified in Table 1. If an onset is played
correctly and has a difficulty d_i > 0 for dimension i, then the
experience vector dimension that corresponds to d_i is incremented.

Figure 2: Plucking hand, 'pick repositioning speed' difficulty
distribution over the entire arrangement corpus. Computed quantile
borders are depicted as vertical dotted lines.
The above process allows us to project each performance of a student
onto the experience vector space, where each dimension can be expressed
in words that students can intuitively understand, and relate to a skill
necessary for guitar playing.
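The accumulation rules can be sketched as follows. This is a simplified illustration under stated assumptions, not the production data structure: the `Onset` class, the `difficulty_slot` index layout (one ten-bin histogram per difficulty, placed after the tags), and the function names are hypothetical, and the 10 + 10 cumulative-total dimensions are left untouched.

```python
from dataclasses import dataclass, field
from typing import List

N_TAGS = 256 + 37 + 20   # chord, technique, and rhythm tags (from the text)
N_DIFF = 10              # difficulty estimates listed in Table 1
N_STEPS = 10             # quantised difficulty scale, 1..10
DIM = 433                # experience vector size tested in the paper


@dataclass
class Onset:
    correct: bool                                   # evaluation of the onset
    tags: List[int] = field(default_factory=list)   # tag indices in [0, N_TAGS)
    # quantised difficulty per estimate; 0 means "not present at this onset"
    difficulty: List[int] = field(default_factory=lambda: [0] * N_DIFF)


def difficulty_slot(i: int, step: int) -> int:
    """Hypothetical layout: histogram bin `step` of difficulty `i`."""
    return N_TAGS + i * N_STEPS + (step - 1)


def accumulate(onsets: List[Onset]) -> List[int]:
    """Collected experience for one student-arrangement interaction."""
    vec = [0] * DIM
    for onset in onsets:
        if not onset.correct:
            continue  # only correctly played onsets count as evidence of skill
        for tag in onset.tags:
            vec[tag] += 1
        for i, step in enumerate(onset.difficulty):
            if step > 0:  # d_i > 0: the difficulty is present at this onset
                vec[difficulty_slot(i, step)] += 1
    return vec


def window_passes(evaluations: List[bool], threshold: float = 0.75) -> bool:
    """Multi-onset units: the share of correct onsets must exceed 75%."""
    return sum(evaluations) / len(evaluations) > threshold
```

Calling `accumulate` with every onset marked as correct yields the maximum possible experience for the same interaction, since the update rule is identical and only the evaluations differ.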
Difficulty Quantisation: Each difficulty dimension has different ranges
and different distributions. We quantise their range to 1-10, using
quantile normalisation. Quantile normalisation is done for each
difficulty separately, using the entirety of our catalogue. Figure 2
shows an example.
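As an illustration of quantile-based quantisation, the sketch below fits nine decile borders on a set of raw values for one difficulty dimension and maps a value to a step between 1 and 10. The function names are hypothetical, and the choice of the "inclusive" quantile method is an assumption; the paper does not state how borders are interpolated.

```python
from bisect import bisect_left
from statistics import quantiles
from typing import List


def fit_quantile_borders(values: List[float], n_steps: int = 10) -> List[float]:
    """Return the n_steps - 1 inner quantile borders of one difficulty."""
    return quantiles(values, n=n_steps, method="inclusive")


def quantise(value: float, borders: List[float]) -> int:
    """Map a raw difficulty value to an integer step in 1..len(borders) + 1."""
    return bisect_left(borders, value) + 1
```

Because the borders follow the empirical distribution, each step receives roughly the same share of the catalogue, regardless of how skewed the raw values are.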
Figure 3: Visualization of the student experience profile with different
temporal decay factors.
4.1.2 Views and Temporality of Experience
Using the experience accumulation procedure described above, we derive a
pair of experience vectors: the collected experience c, and the maximum
possible experience m (had the student played the arrangement
perfectly). Doing so opens the possibility to model not only first-order
effects, such as experience gain, but also second-order ones, such as
the consistency with which skill is demonstrated (via elementwise
division c(s) ⊘ m(s), for instance).
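The consistency view can be sketched as an elementwise division of the two vectors. Returning 0.0 where the student has had no opportunity yet (m = 0) is an assumption made here for illustration; the paper does not specify how empty dimensions are handled.

```python
from typing import List, Sequence


def consistency(collected: Sequence[float], maximum: Sequence[float]) -> List[float]:
    """Elementwise c / m, with 0.0 where no experience was possible (m == 0)."""
    return [c / m if m > 0 else 0.0 for c, m in zip(collected, maximum)]
```

A value of 0.75 in a dimension then reads as "the skill was demonstrated in three out of four opportunities".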
Recency plays an important role in skill development [26], so we
introduce a temporal dimension to the SXP, illustrated in Figure 3.
Instead of just one pair of experience vectors that represent the entire
learning journey of the student on the platform, we use four different
vector
pairs that represent different time intervals. When a student plays an
arrangement on the platform, each of the four vector pairs is updated.
We introduce decay rates ("forgetting factors") to the experience
components. Each day, the cumulative experience components are
multiplied by their decay rates (0.998, 0.9775, and 0.91), approximating
half-life times of the experience of one year, one month, and one week,
respectively. For our experiments, we also keep a no-decay component.
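The relation between a daily decay rate and its half-life can be checked with a short sketch. A rate r applied once per day halves a value after ln(0.5) / ln(r) days, which for the three rates above gives roughly 346, 30, and 7 days. The dictionary keys and function names below are hypothetical labels, not identifiers from the platform.

```python
import math
from typing import Dict, List

# Daily multipliers from the text; "none" is the no-decay component.
DECAY_RATES: Dict[str, float] = {
    "year": 0.998,
    "month": 0.9775,
    "week": 0.91,
    "none": 1.0,
}


def half_life_days(rate: float) -> float:
    """Days until a value halves under daily multiplication by `rate`."""
    if rate >= 1.0:
        return math.inf  # no decay: the value never halves
    return math.log(0.5) / math.log(rate)


def apply_daily_decay(components: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Multiply each cumulative component by its own daily decay rate."""
    return {
        name: [value * DECAY_RATES[name] for value in vector]
        for name, vector in components.items()
    }
```

New experience is added to every component when the student plays, so the components differ only in how quickly older evidence fades.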
The resulting SXP can be seen as a 4·2·433 tensor. It makes for a
thorough representation of cumulative experience, while retaining all
the benefits of a fixed-size representation: It can be accessed,
updated, and stored in an efficient and scalable way.
4.2 Arrangement Characterisation (AC)
In order to characterize arrangements, we project them onto the same
433-dimensional skill vector space that we use for the student
experience representation. To do that, we take the vector that would
result from a perfect and complete playthrough of the arrangement,
accumulating skill in the same manner as done for the SXPs. That results
in an interpretable pairing: SXP values represent experience gained, AC
values represent experience needed. We normalize AC values by the number
of onsets in the arrangement. That makes them scale-invariant, which has
tested better (than length) in difficulty prediction [4, 25].
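The normalisation step can be sketched as below, assuming the vector of a perfect playthrough is already available. The function name is hypothetical. The point of the example is the scale invariance: an arrangement and the same material repeated twice map to the same characterisation.

```python
from typing import List, Sequence


def arrangement_characterisation(
    perfect_play_vector: Sequence[float], n_onsets: int
) -> List[float]:
    """Divide the perfect-playthrough vector by the number of onsets."""
    if n_onsets <= 0:
        raise ValueError("an arrangement needs at least one onset")
    return [value / n_onsets for value in perfect_play_vector]
```

After normalisation, each value can be read as the share of onsets at which a given tag or difficulty bin occurs.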
5. PREDICTING STUDENT PERFORMANCE
Systems in sections 3 and 4 are created to serve multiple
personalisation tasks. In this paper, we test their efficacy on the task
of recommending song arrangements. We know from music learning research
that skill-difficulty mismatches have a negative impact on student
learning outcomes, especially in self-guided learning environments
[7, 8]. If we can meet students at their level, both enjoyment and
learning outcomes can be assumed to improve. This experimental task can
hence (a) help us validate the utility of the SXP and AC
representations, and (b) move towards more meaningful personalization in
the platform.
5.1 Success Ratio as Prediction Target
Quantifying the outcome of student-arrangement interactions is
non-trivial. An established criticism in the space of educational
recommender systems is that traditional accuracy metrics are overvalued,
and little attention is paid to whether the target variable actually
correlates with learning outcome [20].
Figure 4: Distribution of the success ratio over the data set of
student-arrangement interactions.

With that in mind, we settled on using the success ratio of
student-arrangement interaction. The success ratio is computed by
dividing the number of successfully played onsets by the total number of
played onsets. Our reasons for this choice are twofold. First, success
ratios are used to model learning outcomes in knowledge tracing theory
[27]. Second, we know that the success ratio correlates with student
sentiments. Analyzing a large-scale set of student-arrangement
interactions, including explicit student feedback, we found that the
success ratio strongly correlates with the sentiment of the student. The
interactions that were rated relaxing or boring had a mean success ratio
of 86%. Interactions that students rated inspiring or fun had an average
of 80%, while the interactions rated stressful or discouraging had a
mean success ratio of 61%. This suggests that if we are able to reliably
predict the success ratio of a given interaction, we can suggest
arrangements in the 70-80% range, which is more likely to motivate
students.
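The prediction target and the suggested recommendation band can be written down in a few lines. The function names are hypothetical, and treating both band limits as inclusive is an assumption.

```python
from typing import Sequence


def success_ratio(evaluations: Sequence[bool]) -> float:
    """Successfully played onsets divided by all played onsets."""
    if not evaluations:
        raise ValueError("no onsets were played")
    return sum(evaluations) / len(evaluations)


def in_target_band(predicted_ratio: float, low: float = 0.70, high: float = 0.80) -> bool:
    """True if a predicted success ratio falls into the 70-80% range."""
    return low <= predicted_ratio <= high
```

For example, three correct onsets out of four give a ratio of 0.75, which lies inside the band, whereas the 61% mean reported for stressful or discouraging interactions lies below it.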
5.2 Personalized Playability Prediction Data Set
We call the above-described task personalized playability prediction,
adding the student representation to previous efforts [4]. We
constructed a large-scale labeled dataset from historical data,
containing the SXP, AC, observed success ratio, and relevant interaction
metadata. Recognizing that the SXP continuously evolves over a student’s
learning journey, the dataset creation required reconstructing SXP
snapshots at the time of playing, in order to produce complete and
accurate observations analogous to those available at inference time.
Figure 4 shows the distribution of the success ratio in our dataset.
To limit the amount of noise in the training set, we use three sampling
constraints. First, we choose only to consider prima vista interactions,
to isolate the data from practice effects. Second, we require that the
included interactions exceed a certain minimum duration (20 onsets), to
avoid introducing noise through students quitting the arrangement due to
technical issues or disliking the song. Third, we maximize the diversity
of ways in which students access arrangements (lessons, search,
rankings), to avoid collinearities.
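The first two sampling constraints amount to a simple filter over interaction records, sketched below. The `Interaction` fields are hypothetical names, and reading the minimum duration as "at least 20 onsets" is an assumption; the third constraint (diversity of access paths) is a sampling decision and is not modelled here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Interaction:
    student_id: str       # hypothetical field names for illustration
    arrangement_id: str
    n_onsets: int         # number of onsets the student played
    first_attempt: bool   # True for a prima vista interaction


def keep_for_training(interaction: Interaction, min_onsets: int = 20) -> bool:
    """Keep prima vista interactions that reach the minimum duration."""
    return interaction.first_attempt and interaction.n_onsets >= min_onsets


def filter_interactions(interactions: Iterable[Interaction]) -> List[Interaction]:
    return [item for item in interactions if keep_for_training(item)]
```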
The resulting data set spans 2,316,880 interactions from November 2022
to June 2024, including more than 200,000 guitar students. Due to the
extensive nature of the SXP snapshots and ACs, the dataset exceeds 30
gigabytes.
6. EXPERIMENTS AND RESULTS
6.1 Ablation Using an XGB Regressor
We made several explicit design decisions while conceiving the SXP and
AC. Although informed by domain knowledge, we are aware that these might
not result in optimal data structures for personalized playability
prediction, specifically. So we first conducted an ablation study to
understand the relevance of our input and parameterizations for this
task. We used a gradient-boosting (XGB) regression model. Its tree-based
modeling approach tends to be less vulnerable to dataset imbalances and
input normalization, both issues that we know exist in our dataset.
Also, the XGB regressor is an established industry standard for solving
real-world regression tasks. For comparison, we tested a multivariate
linear regression model. It converged towards predicting the dataset
mean, so we instead directly use predicting the mean as a baseline.

Id  Ablation study                        Input dim  RMSE             MAE              R2
1   Predict-the-mean baseline             –          0.2329 ±0.0007   0.1906 ±0.0006   0.0000 ±0.0000
2   Chord tags                            1024       0.1880 ±0.0011   0.1431 ±0.0009   0.3486 ±0.0073
3   Instrument tech. and rhythmic tags    172        0.1532 ±0.0004   0.1109 ±0.0004   0.5677 ±0.0031
4   Difficulty histograms                 480        0.1489 ±0.0009   0.1053 ±0.0007   0.5915 ±0.0048
5   Full vector                           1732       0.1484 ±0.0008   0.1066 ±0.0006   0.5939 ±0.0045
6   Inst.+rhythmic tags & difficulties    708        0.1474 ±0.0007   0.1037 ±0.0005   0.5996 ±0.0045
7   Arrangement representation only       120        0.1983 ±0.0008   0.1535 ±0.0007   0.2752 ±0.0039
8   Student representation only           360        0.2091 ±0.0007   0.1640 ±0.0006   0.1940 ±0.0064

Table 2: Ablation study on feature subset combinations, using XGB regression.

Id  Regressor            Input dim  Encoded dim  RMSE             MAE              R2
1   EncoderNN [SXP]      6928       256          0.2175 ±0.0011   0.1722 ±0.0009   0.1308 ±0.0044
2   EncoderNN [SXP+AC]   6928       256 + 120    0.1643 ±0.0015   0.1211 ±0.0012   0.5040 ±0.0062
3   ShallowNN [SXP+AC]   480        –            0.1701 ±0.0020   0.1275 ±0.0058   0.4663 ±0.0133
4   TunedXGB [SXP+AC]    480        –            0.1474 ±0.0007   0.1037 ±0.0005   0.5996 ±0.0045

Table 3: Regressor candidates for personalized playability prediction task
Preliminary experiments revealed that omitting the intermediate decaying
factors has no meaningful impact on XGB performance. Hence, for our
ablation experiment, we input an SXP with (a) non-decaying experience
(only c), and (b) fastest-decaying experience (c and m), as well as the
AC. The full input dimensionality is 4·433 = 1732.
In our ablation study, we ran a cross-validation experiment on feature
subsets. For each variant, we selected the corresponding slices of the
experience vectors from both the SXP and AC. To account for the
dimensionality change, we tuned the XGB regressor parameters before each
experiment. Table 2 shows the results.
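For reference, the three reported metrics and the predict-the-mean baseline can be sketched in plain Python as follows. The function names are hypothetical, and the sketch uses the standard definitions of RMSE, MAE, and R2; it is not the evaluation code used for Table 2.

```python
import math
from typing import List, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_baseline(y_train: Sequence[float], n_predictions: int) -> List[float]:
    """Predict the training mean for every interaction."""
    return [sum(y_train) / len(y_train)] * n_predictions
```

By definition, a model that always predicts the mean of the evaluated targets has an R2 of zero, which is why the baseline serves as the reference point for the explained variance of the other variants.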
We can see that using any slice of the skill vectors renders an
improvement upon predicting the mean (2-4), and that the difficulty
histograms seem to contribute the strongest to the model’s accuracy.
Adding all of the vector slices together (5) does not improve the
accuracy significantly. Instead, dropping the large set of chord tags
(6) renders the best result, although not significantly better than
using only the histograms. Importantly, combining the information of the
SXP and AC renders drastically better results than using either of the
two on their own (model 4 vs. models 7–8).
6.2 Autoencoder and Encoder-Based Regressors
The ablation results led us to question the degree of redundancy in our
representations. To assess that, we trained an autoencoder on the SXPs.
We iterated on latent spaces of different sizes to understand how much
compression we can apply to the student representation. The autoencoder
was trained in a self-supervised manner. We trained on a separate set of
unlabeled SXPs, so as to preserve the labeled dataset for training
regression models that use the representations learned by the
autoencoder.
For these experiments, we augmented the four cumulative memory
components introduced in Section 4.1.2 with four sequential memory
components, which represent the experience of the student within their
last four active days. The resulting experience profile is a tensor of
size 8·2·433 = 6928.
We ran training iterations on five different autoencoder architectures,
each tasked with reconstructing the original 6928-dimensional SXP from a
latent space representation of 1048, 524, 256, 128, and 64 dimensions,
respectively. The accuracy of the decoder remained high for the first 3
models. Only when constraining the latent space to 128 dimensions did we
observe a significant drop in accuracy.
Given that compressing SXP to 256 dimensions (5% of its original size)
retains useful information, we set out to test whether the SXP
embeddings can serve as the trunk of a neural network regressor. This
could provide an alternative to feature selection, as we could rely
entirely on compression to condense SXPs for playability prediction.
We first built a predictor that uses only the SXP. We take the encoder
component of the autoencoder and add a shallow neural network to the
latent space layer. Then, we conduct end-to-end training on the labeled
dataset. Results are shown in Table 3 (1). Next, we implemented an
adaptation of this model where, in addition to the 256-dimensional
embedding, we plugged scale-normalized arrangement characterization into
the model and trained it on the labeled dataset (2). Then, as a
benchmark, we applied a shallow feed-forward neural network to the
difficulty histogram features identified in the ablation study (3).
Lastly, we indicate the performance of the XGB regression model
on the same set of features.
We can see that the embedding carries significant explanatory power.
Combining the SXP embedding with the arrangement representation leads to
an accuracy that lies clearly above the baseline, and beats out the
feature-engineered benchmark. We can also see that the XGB regression
model outperforms all.
6.3 Model Evaluation and Error Analysis
Experiments in sections 6.1 and 6.2 show that the XGB regressor, variant
6 in Table 2, performs best. This model uses a combination of
instrumental technique tags, rhythmic tags, and difficulty histograms.
We prefer this model not only because of its predictive performance, but
also because of its lightweight architecture, low computational cost,
and relatively small input dimensionality.
To further validate the model’s suitability and performance, we
conducted a detailed error analysis. First, we examined the distribution
of the model’s predictions. Results indicate that the predictions
closely mirrored the actual distribution of the target variable, without
undesirable convergence toward the mean.
Figure 5: Mean absolute error of predictions for mastery level of the
student. The mastery level contains the highest successfully played song
level of the students.
Next, we analyzed the distribution of prediction errors across student
mastery levels. The mastery level contains the highest successfully
played song level of the students, where human educators assign song
levels that range between 0 and 15. Results are shown in Figure 5. The
analysis revealed that errors were generally evenly distributed, and
slightly elevated at skill level 0 (absolute beginners) and level 15
(expert players). This pattern is understandable since, for beginners
(level 0), there is only a limited amount of student history available,
making accurate predictions challenging. Similarly, level 15 represents
highly skilled students tackling difficult arrangements. As the data set
only contains prima vista interactions, predicting the outcome of such
complex interactions is notoriously difficult.
Additionally, we investigated prediction errors across
student-arrangement interactions, stratified by the true success ratio.
Results are shown in Figure 6. Errors are notably higher in the lower
range (0.0–0.3), which aligns with expected gameplay dynamics: When a
student cannot play a song, the interaction ends automatically, shifting
the prediction task from assessing performance to estimating how long
they play before failing. The prediction target hence becomes highly
volatile, explaining the error increase.
Figure 6: Predictive performance of the XGBRegression model, stratified
by true success ratios
7. CONCLUSION
In this paper, we explored personalisation in digital guitar learning.
We introduced a fully automated methodology for representing both
student experience and arrangement difficulty, and showed that combining
these representations enables accurate prediction of prima vista
performance. Our analysis highlighted the significant contribution of
difficulty histograms to prediction accuracy. We also found that student
representations could be compressed to just 5% of their original
dimensionality with minimal accuracy loss, facilitating efficient
inference.
Future work will extend beyond this initial use case, exploring the
potential of these representations across different personalization
scenarios, all with the aim of enhancing self-guided music learning.
8. ACKNOWLEDGEMENTS
We thank the anonymous reviewers for the many insightful comments and
suggestions.
9. REFERENCES
[1] B. S. Bloom, “The 2 sigma problem: The search for methods of group
instruction as effective as one-to-one tutoring,” Educational
Researcher, vol. 13, pp. 4–16, 1984.
[2] R. C. Rodriguez and V. Marone, “Guitar learning, pedagogy, and
technology: A historical outline,” Social Sciences and Education
Research Review, vol. 8, no. 2, Dec. 2021.
[3] P. Ramoneda, V. E. Eremenko, A. D’Hooge, E. Parada-Cabaleiro, and
X. Serra, “Towards explainable and interpretable musical difficulty
estimation: A parameter-efficient approach,” in ISMIR, 2024,
pp. 520–528.
[4] M. A. V. Vásquez, M. Baelemans, J. Driedger, W. Zuidema, and
J. A. Burgoyne, “Quantifying the ease of playing song chords on the
guitar,” in ISMIR, 2023, pp. 725–732.
[5] S. Ariga, S. Fukayama, and M. Goto, “Song2guitar: A
difficulty-aware arrangement system for generating
guitar solo covers from polyphonic audio of popular music,” in ISMIR,
2017, pp. 568–574.
[6] V. Sébastien, H. Ralambondrainy, O. Sébastien, and N. Conruyt,
“Score analyzer: Automatically determining scores difficulty level for
instrumental e-learning,” in ISMIR, 2012, pp. 571–576.
[7] S. Fukuda, Y. Fukuda, M. Hosoda, A. Motomura, E. Sasao,
M. Matsubara, and M. Niitsuma, “Field Study on Children’s Home Piano
Practice: Developing a Comprehensive System for Enhanced
Student-Teacher Engagement,” in ISMIR, 2024, pp. 381–388.
[8] A. Kavcic Pucihar, K. Habe, B. Rotar Pance, and M. Laure, “The key
reasons for dropout in slovenian music schools – a qualitative study,”
Frontiers in Psychology, vol. 15, 05 2024.
[9] P. Ramoneda, N. C. Tamer, V. Eremenko, X. Serra, and M. Miron,
“Score difficulty analysis for piano performance education based on
fingering,” International Conference on Acoustics, Speech and Signal
Processing, pp. 201–205, 2022.
[10] Y. Ju, C. Y. Wu, B. C. Lorenzo, J. Yang, J. Deng, F. Fan, and
S. Lui, “End-to-End Automatic Singing Skill Evaluation Using
Cross-Attention and Data Augmentation for Solo Singing and Singing With
Accompaniment,” in ISMIR, 2024.
[11] A. Lerch, C. Arthur, K. A. Pati, and S. Gururani, “Music
Performance Analysis: A Survey,” ISMIR, 2019.
[12] Y. Jiang, “Expert and novice evaluations of piano performances:
Criteria for computer-aided feedback,” in ISMIR, 2023, pp. 367–374.
[13] A. T. Corbett and J. R. Anderson, “Knowledge tracing: Modeling the
acquisition of procedural knowledge,” User Modeling and User-Adapted
Interaction, vol. 4, no. 4, pp. 253–278, 12 1994.
[14] C.-K. Yeung and H. Kong, “Deep-IRT: Make Deep Learning Based
Knowledge Tracing Explainable Using Item Response Theory,” EDM 2019 -
Proceedings of the 12th International Conference on Educational Data
Mining, pp. 683–686, 4 2019.
[15] Y. Zhang, Y. Zhang, W. Xu, Z. Wang, and J. Sun, “SingPAD: A
Knowledge Tracing Dataset Based on Music Performance Assessment,”
pp. 332–340, 2024.
[16] W. Varela, P. C. Abrami, and R. Upitis, “Self-regulation and music
learning: A systematic review,” Psychology of Music, vol. 44, no. 1,
pp. 55–74, 2016.
[17] F. L. da Silva, B. K. Slodkowski, K. K. A. da Silva, and
S. C. Cazella, “A systematic literature review on educational
recommender systems for teaching and learning: research trends,
limitations and opportunities,” Education and Information Technologies,
vol. 28, no. 3, pp. 3289–3328, 3 2023.
[18] M. Murtaza, Y. Ahmed, J. A. Shamsi, F. Sherwani, and M. Usman,
“AI-Based Personalized E-Learning Systems: Issues, Challenges, and
Solutions,” IEEE Access, vol. 10, pp. 81 323–81 342, 2022.
[19] S. S. Khanal, P. W. Prasad, A. Alsadoon, and A. Maag, “A
systematic review: machine learning based recommendation systems for
e-learning,” Education and Information Technologies, vol. 25, no. 4,
pp. 2635–2664, 7 2020.
[20] M. Erdt, A. Fernández, and C. Rensing, “Evaluating Recommender
Systems for Technology Enhanced Learning: A Quantitative Survey,” IEEE
Transactions on Learning Technologies, vol. 8, no. 4, pp. 326–344,
10 2015.
[21] J. Smith, E. Truesdell, J. Freeman, B. Magerko, K. Boyer, and
T. Mcklin, “Modeling music and code knowledge to support a co-creative
ai agent for education,” in ISMIR, 2020, pp. 134–141.
[22] K. Bicknell, C. Brust, and B. Settles, “How Duolingo’s AI Learns
what you Need to Learn: The language-learning app tries to emulate a
great human tutor,” IEEE spectrum, vol. 60, no. 3, pp. 28–33, 3 2023.
[23] G. Hori and S. Sagayama, “Minimax viterbi algorithm for hmm-based
guitar fingering decision,” in ISMIR, 2016, pp. 448–453.
[24] E. T. de Lima and G. L. Ramalho, “On rhythmic pattern extraction
in bossa nova music,” in ISMIR, 2008, pp. 641–646.
[25] M. Rodríguez and A. Klapuri, “Educational profiling of guitar
tablature: Tools to foster self-guided learning,” in Submitted to CMMR,
2025.
[26] T. Savion-Lemieux and V. B. Penhune, “The effects of practice and
delay on motor skill learning and retention,” Experimental Brain
Research, vol. 161, pp. 423–431, 2005.
[27] S. Shen, Q. Liu, Z. Huang, Y. Zheng, M. Yin, M. Wang, and E. Chen,
“A survey of knowledge tracing: Models, variants, and applications,”
IEEE Transactions on Learning Technologies, 2024.