A Systematic Evaluation of Real-Time Audio Score Following for Piano Performance

Author: Jiyun Park; Carlos Eduardo Cancino-Chacón; Suhit Chiruthapudi; Juhan Nam

Publisher: Zenodo

DOI: 10.5281/zenodo.17706341

Source: https://zenodo.org/records/17706341/files/000011.pdf

MATCHMAKER: AN OPEN-SOURCE LIBRARY FOR REAL-TIME PIANO
SCORE FOLLOWING AND SYSTEMATIC EVALUATION
Jiyun Park¹* Carlos Cancino-Chacón²*
Suhit Chiruthapudi² Juhan Nam¹
¹ Graduate School of Culture Technology, KAIST, South Korea
² Institute of Computational Perception, Johannes Kepler University Linz, Austria
{june,juhan.nam}@kaist.ac.kr,
{carlos.cancino_chacon,suhit.chiruthapudi}@jku.at
ABSTRACT
Real-time music alignment, also known as score following, is a fundamental MIR task
with a long history and is essential to many interactive applications. Despite its
importance, there has not been a unified open framework for comparing models, largely
due to the inherent complexity of real-time processing and the language- or
system-dependent implementations. In addition, low compatibility with the existing MIR
environment has made it difficult to develop benchmarks using large datasets available
in recent years. While new studies based on established methods (e.g., dynamic
programming, probabilistic models) have emerged, most evaluations compare models only
within the same family or on small sets of test data. This paper introduces Matchmaker,
an open-source Python library for real-time music alignment that is easy to use and
compatible with modern MIR libraries. Using this, we systematically compare methods
along two dimensions: music representations and alignment methods. We evaluated our
approach on a large test set of solo piano music from the (n)ASAP, Batik, and
Vienna4x22 datasets with a comprehensive set of metrics to ensure robust assessment.
Our work aims to establish a benchmark framework for score-following research while
providing a practical tool that developers can easily integrate into their applications.
1. INTRODUCTION
Real-time music alignment, also known as score following, is the task of aligning
performance data to the corresponding position in the musical score in real time. Ever
since it was first introduced independently by Roger Dannenberg [1] and Barry Vercoe [2]
over 40 years ago, music alignment has become one of the fundamental MIR tasks.
* Equal contribution.
© J. Park, C. Cancino-Chacón, S. Chiruthapudi and J. Nam. Licensed under a Creative
Commons Attribution 4.0 International License (CC BY 4.0). Attribution: J. Park,
C. Cancino-Chacón, S. Chiruthapudi and J. Nam, “Matchmaker: An Open-Source Library for
Real-Time Piano Score Following and Systematic Evaluation”, in Proc. of the 26th Int.
Society for Music Information Retrieval Conf., Daejeon, South Korea, 2025.
Score following is a necessary component of many interactive applications, such as
automatic accompaniment systems [3–6], automatic page turning [7, 8], lyrics alignment
or singing-voice tracking [9–11], audiovisual/multimodal systems [6, 12], and
visualizations [13]. Music alignment began as real-time score following [1,2,14–17] but,
by the mid-90s, had diverged into online and offline methods (see, e.g., early offline
work by Desain et al. [18]).
From its early use on monophonic sources like voice [17] and wind instruments, score
following has grown to support polyphonic instruments such as piano, ensemble, and even
full orchestral performances [17, 19–21]. Research has also expanded across input
modalities of the performance, with systems operating on audio or MIDI, and score
representations including string format, symbolic score, and sheet image [22].
The score following challenge [23] in MIREX laid the foundation for formalizing the
evaluation framework, introducing important metrics that account for real-time
constraints. However, many subsequent studies have been developed in different
environments, ranging from system-dependent [24,25] to language-dependent [26,27]
implementations, often tailored to specific use cases and without publicly shared source
code. As a result, implementations became fragmented across platforms, making it
difficult to extend, reproduce, or compare methods in a unified setting. This has
hindered the development of a unified evaluation framework, and comparisons of methods
or features on shared datasets remain rare, limiting generalizability and
reproducibility.
In this paper, we address these challenges by proposing a unified, open framework for
the evaluation and benchmarking of real-time audio-based score following. Considering
public datasets that offer a range of difficulty levels, multiple renditions, and
precise beat-level annotations, we base our evaluation on three representative piano
performance datasets. We implement this framework as an open-source Python package
called Matchmaker,¹ which allows real-time execution of representative baselines of
score following algorithms. In addition to benchmarking, it supports audio device input
and has been validated in application contexts through a standalone demo system.
¹ https://github.com/pymatchmaker/matchmaker
2. A CONCEPTUAL FRAMEWORK FOR SCORE
FOLLOWING
As a way to organize and compare the components of systems for score following, we
follow the structure proposed by Müller [28]. This framework consists of three core
components: (1) input music representations, (2) features, and (3) online alignment
algorithms.
2.1 Music Representation
Score following aligns a fixed reference derived from musical scores with a
time-evolving input from a performance. The score can take various symbolic formats
(e.g., MIDI, MusicXML) or sheet images, and is typically converted into an intermediate
representation such as synthesized audio or event sequences. The performance input may
be given as either audio or MIDI, each with distinct representational and computational
characteristics. Audio input is continuous and latency-sensitive, while MIDI is discrete
and event-based. Instrumental factors also affect alignment design: polyphonic or
discrete-pitch instruments (e.g., piano) differ from continuous-pitch sources (e.g.,
violin, voice). Multi-instrument recordings pose further challenges due to timbral
overlap and source ambiguity.
2.2 Features
Chroma features are the most commonly used in music synchronization, with many variants
of their computation [29–32]. Other works also use various spectral features such as
constant-Q transforms (CQT) [27, 33], non-negative matrix factorization (NMF)-based [34]
or spectral template [35] for improved polyphonic alignment. Beyond spectral
representations, context-aware features such as onset-based feature [36] or
beat-synchronous frames have been introduced to capture temporally salient events useful
for alignment. Later work explored learned features, including feedforward mappings
[27], semi-supervised decompositions like NMF, and more recent neural approaches [37].
While these offer richer contextual information, they often rely on fixed-length inputs
and introduce latency, making real-time usage more challenging.
2.3 Alignment Algorithms
Two major families of alignment algorithms have been used in score following: dynamic
programming and probabilistic models.
The dynamic programming approach, especially dynamic time warping (DTW), aligns two
sequences by minimizing cumulative cost. Its online variant, On-Line Time Warping (OLTW)
[38], enables causal alignment within a fixed-size window. Variants include windowed
[39], parallel [40], and constrained DTW [40, 41], as well as tempo-aware extensions
[21,42].
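As a minimal sketch of the dynamic programming idea (not Matchmaker's implementation), the following computes the offline DTW cumulative cost between two feature sequences. The function name `dtw_cost` and the Euclidean local cost are assumptions chosen for illustration. OLTW differs in that it fills the same kind of accumulated-cost table incrementally, inside a bounded window, as performance frames arrive.

```python
import math


def dtw_cost(score_feats, perf_feats):
    """Total DTW cost between two sequences of equal-dimension feature vectors.

    Hypothetical helper for illustration; the local cost (Euclidean distance)
    is an assumption, not the cost used in the paper.
    """
    n, m = len(score_feats), len(perf_feats)
    # acc[i][j] = minimal cumulative cost of aligning the first i score frames
    # with the first j performance frames.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(score_feats[i - 1], perf_feats[j - 1])
            acc[i][j] = local + min(
                acc[i - 1][j],      # advance in the score only
                acc[i][j - 1],      # advance in the performance only
                acc[i - 1][j - 1],  # advance in both
            )
    return acc[n][m]
```

A performance that merely repeats a frame (a local slowdown) still aligns at zero cost, which is the property that makes DTW suitable for expressive timing.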
Probabilistic state-space models offer an alternative by treating alignment as latent
state inference under uncertainty [24,29, 43]. HMM-based systems model each note as a
sequence of states (e.g., attack–steady–release), with extensions including semi-Markov
[44], hybrid [19], and Bayesian variants [45]. Kalman filter models and switching
state-space systems [46,47] further incorporate tempo dynamics, while particle filters
[12,29] handle multimodal uncertainty in real time.
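To make the idea of latent state inference concrete, here is a hedged sketch of one causal forward (filtering) update for a discrete left-to-right HMM. The helper names `forward_step` and `left_to_right`, the stay probability, and the toy likelihood values are all assumptions for illustration; they do not describe the HMM shipped with Matchmaker or any of the cited systems.

```python
def forward_step(prior, transition, likelihood):
    """One filtering update: predict with the transition model, then reweight.

    prior[i]         belief over score states before the new frame
    transition[i][j] probability of moving from state i to state j
    likelihood[j]    likelihood of the new observation under state j
    """
    n = len(prior)
    predicted = [sum(prior[i] * transition[i][j] for i in range(n)) for j in range(n)]
    unnorm = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("observation has zero likelihood under every state")
    return [p / total for p in unnorm]


def left_to_right(n_states, p_stay=0.5):
    """Transition matrix in which a state either stays or advances by one."""
    trans = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        trans[i][i] = p_stay
        trans[i][i + 1] = 1.0 - p_stay
    trans[-1][-1] = 1.0  # the final state is absorbing
    return trans
```

Starting from `[1.0, 0.0, 0.0]` and observing a frame that favors the second state moves most of the belief to that state, while the third state stays unreachable in a single step. The estimated score position at each frame can then be read off as the most probable state.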
Other paradigms include early string-matching algorithms [1] and reinforcement
learning-based approaches for multimodal or visual score alignment [48].
3. IMPLEMENTATION
3.1 Python Package Structure
Matchmaker is an open source Python package that implements representative real-time
music alignment algorithms within a modular, extensible framework. Figure 1 illustrates
the overview of the package and the whole pipeline.
The current version of Matchmaker provides two types of algorithms: 1) online time
warping, with two variants: OLTWDixon, based on the methods proposed in [38,49], and
OLTWArzt, based on [21,50]; and 2) an HMM-based algorithm, similar to the one used in
[3,47]. A full description of the algorithms and their parameters can be found in the
supplementary Appendix.²
Matchmaker supports two main usage scenarios: (1) live streaming mode using the audio
device and (2) simulation mode, which processes a performance file as input. Figure 2
shows an example of running live streaming mode with the default setting. The
AudioStream object handles the input stream by chunking the audio with overlapping
windows to avoid padding artifacts. Both the synthesized score audio and the performance
audio are passed to a Processor object that performs feature extraction. The extracted
features are pushed into a queue and consumed by the OnlineAlignment object, which runs
the alignment methods in real time. Matchmaker accepts a musical score in any symbolic
music format (MusicXML, MIDI, MEI, etc.) supported by partitura.³ The returned output is
the current position in the score, represented in beats as a musical unit according to
the time signature of the piece. A more detailed description and the API documentation
of the package are available online.⁴
3.2 Design and Implementation Details
We provide a simple and user-friendly interface to run the score following with minimal
setup. As shown in Figure 2, users can instantiate a Matchmaker object with a score file
and execute a run that iterates over the estimated score position of each step. To
streamline real-time processing, the AudioStream class is implemented as a context
manager that automatically handles stream initialization and teardown. Furthermore, the
alignment process is designed as a generator, enabling users to receive score positions
concurrently while the alignment is in progress.
² https://pymatchmaker.github.io/ismir2025_supplementary_materials/
³ https://github.com/CPJKU/partitura
⁴ https://pymatchmaker.readthedocs.io/
Proceedings of the 26th ISMIR Conference, Daejeon, Korea, September 21-25, 2025
Figure 1. Overview of the score following package
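from matchmaker import Matchmaker

# Live streaming mode with the default setting
mm = Matchmaker(
    score_file="path/to/score.musicxml",
    input_type="audio",
)
# run() yields the estimated score position (in beats) at each step
for current_position in mm.run():
    print(current_position)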
Figure 2. A code example of running Matchmaker in a live streaming mode.
This design allows for efficient real-time integration without requiring users to manage
multiple threads, buffers, or callbacks explicitly.
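The combination of a context manager and a generator described above can be sketched in a few lines. This is a toy illustration under stated assumptions: `ToyStream` and `follow` are hypothetical names, the "frames" are plain numbers, and none of this reproduces Matchmaker's AudioStream or alignment classes.

```python
class ToyStream:
    """Stand-in for an audio stream that yields pre-defined feature frames."""

    def __init__(self, frames):
        self.frames = list(frames)
        self.is_open = False

    def __enter__(self):
        self.is_open = True   # a real stream would open the audio device here
        return self

    def __exit__(self, exc_type, exc, tb):
        self.is_open = False  # teardown runs even if the consumer raises
        return False          # do not swallow exceptions

    def __iter__(self):
        return iter(self.frames)


def follow(stream, step):
    """Yield an updated score position after every incoming frame."""
    position = 0.0
    with stream as s:
        for frame in s:
            position = step(position, frame)
            yield position
```

Because `follow` is a generator, a caller can react to each position as soon as it is produced, and the stream is closed automatically once iteration finishes.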
While the online mode uses a multi-threaded queue for asynchronous audio buffering, the
simulation mode processes audio chunks in advance within a single-threaded setup. By
decoupling real-time I/O concerns from core alignment evaluation, it is intended to
avoid variability from Python version, OS-level threading, or queuing delays, ensuring a
consistent and reproducible benchmarking environment. In addition, OLTWArzt is
implemented for efficiency in Cython [51], a superset of Python designed for C-like
performance by incorporating C data types and optimizing the execution of Python code.
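The difference between the two modes can be illustrated with the standard library alone. In this sketch, `run_online`, `run_simulation`, and `produce` are hypothetical names, and `align_step` stands for any per-frame alignment update; the actual threading layout inside Matchmaker may differ.

```python
import queue
import threading

_SENTINEL = object()  # marks the end of the stream


def produce(chunks, q):
    """Producer thread: push frames into the queue as they become available."""
    for chunk in chunks:
        q.put(chunk)
    q.put(_SENTINEL)


def run_online(chunks, align_step):
    """Online-style loop: a background thread feeds a queue, the caller consumes."""
    q = queue.Queue()
    worker = threading.Thread(target=produce, args=(chunks, q), daemon=True)
    worker.start()
    out = []
    while True:
        item = q.get()  # blocks until the next frame arrives
        if item is _SENTINEL:
            break
        out.append(align_step(item))
    worker.join()
    return out


def run_simulation(chunks, align_step):
    """Simulation-style loop: all chunks are known in advance, single thread."""
    return [align_step(chunk) for chunk in chunks]
```

Both loops produce the same alignment output for the same input; only the timing behavior differs, which is why the single-threaded variant is the more reproducible choice for benchmarking.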
4. EXPERIMENTS
4.1 Datasets
We use three public piano performance datasets: (n)ASAP [52], Batik [53] and Vienna 4x22
[54], each of them offering complementary characteristics for benchmarking score
following. (n)ASAP, a subset of the MAESTRO dataset including note-level score
alignments, includes expressive performances of technically demanding solo piano pieces,
offering high difficulty and stylistic diversity. We use only the pieces in the MAESTRO
v2 test split. Vienna4x22 provides 22 distinct renditions of each of four relatively
easy pieces, which makes it suitable for testing robustness to interpretive variation.
The Batik dataset contains recordings of 12 Mozart sonatas by a single pianist with the
longest average piece duration among the three datasets, enabling evaluation across
long-form classical repertoire.
We use ground-truth beat-level annotations provided with the (n)ASAP dataset, and
extract equivalent annotations for Batik and Vienna4x22 from the .match files [55],
which contain note-wise score–performance alignments. In addition, we incorporate the
difficulty levels of each piece based on G. Henle Publishers,⁵ which provides a 1-to-9
grading scale. The pieces used in our experiments span levels 4 through 9, representing
a diverse set of works above intermediate level. Table 1 provides the detailed
statistics of the datasets.

| Dataset | #Pieces | #Perf | #Beats | #Notes | Dur (h) | Difficulty |
|---|---|---|---|---|---|---|
| (n)ASAP | 43 | 59 | 26,329 | 100,958 | 2.65 | 6.53 |
| Batik | 30 | 30 | 18,789 | 102,421 | 2.85 | 5.67 |
| Vienna | 4 | 88 | 13,728 | 43,656 | 2.24 | 4.88 |
| Total | 77 | 177 | 58,846 | 247,035 | 7.74 | 6.11 |

Table 1. Datasets used in the evaluation.
We only included performances in the experiment that recorded an MAE of less than 100 ms
in the offline test, using the synctoolbox⁶ with Chroma & DLNCO features. The evaluation
was conducted on 184 performances across 93 pieces, totaling over 58,000 beats and
247,000 notes, with an overall duration of 7.74 hours of performances and a piece-wise
average difficulty of 6.11.
4.2 Experiment Settings
We conducted all evaluations under simulation-based conditions to ensure
reproducibility. Live testing was avoided due to variability introduced by room
acoustics and hardware setup, which complicates fair comparison across systems. The
accuracy tests were carried out on an Intel i9-9900K CPU (16 cores @ 3.6 GHz), Python
3.9, with a sample rate of 44.1 kHz and a frame rate of 30, chosen to balance latency
and alignment accuracy. We tested chromagram, mel-spectrogram, constant-Q transform
(CQT), mel-frequency cepstral coefficients (MFCCs) [56] and a simple STFT-based
onset-sensitive representation similar to the one used in Dixon [38], which we name
log-spectral energy (LSE). While results for all features were evaluated, we report
detailed latency and accuracy metrics for the best-performing configuration of each
model. To account for hardware variability, latency was measured in multiple setups: an
Intel i9-9900K, an Apple M4 MacMini, and an Apple M2 Pro MacBook, with the reported
latency values averaged across these devices.
⁵ https://www.henle.de/Levels-of-Difficulty/
⁶ https://github.com/meinardmueller/synctoolbox
Figure 3. Two examples of error calculation using the mapping function. (a) shows a
one-to-many alignment at the evaluation point, while (b) illustrates a skipped
alignment.
4.3 Preprocessing
In the preprocessing step (see Fig. 1), the symbolic scores are synthesized to audio
using FluidSynth, provided by partitura. Since MusicXML often lacks tempo markings, we
set the synthesis tempo to each performance’s average (rounded to the nearest 20 BPM),
assuming performers follow approximate tempo indications.
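The tempo choice above amounts to a small piece of arithmetic, sketched below. How the average tempo is obtained is not specified in the text, so `average_bpm` (mean inter-beat interval over the beat annotations) is an assumption, as are both function names and the choice to round ties upward.

```python
def average_bpm(beat_times):
    """Average tempo implied by a list of beat onset times in seconds.

    Assumption: tempo is derived from the first and last annotated beat.
    """
    if len(beat_times) < 2:
        raise ValueError("need at least two beats to estimate a tempo")
    return 60.0 * (len(beat_times) - 1) / (beat_times[-1] - beat_times[0])


def round_tempo(avg_bpm, step=20):
    """Round a tempo to the nearest multiple of `step` BPM (ties round up)."""
    return max(step, step * int(avg_bpm / step + 0.5))
```

For instance, a performance averaging 112 BPM would be synthesized at 120 BPM, and one averaging 68 BPM at 60 BPM.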
To generate beat annotations for the synthesized score audio, we computed beat positions
using the synthesis tempo and the score’s time signature. For compound meters (e.g.,
6/8, 9/8, and 12/8), we adopted (n)ASAP’s beat annotation rules (counting them as two,
three, and four beats per measure, respectively) across all datasets to align score-side
annotations with performance annotations. Based on the synthesized audio, we then
extract the features using the same Processor used in the online phase, but precompute
them offline for the entire score sequence.
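The beat-counting rule for compound meters can be written down directly. The sketch below assumes a constant synthesis tempo expressed in counted beats per minute, a piece that starts on a downbeat at time zero, and no meter changes; `beats_per_measure` and `beat_times` are hypothetical helper names.

```python
def beats_per_measure(numerator, denominator):
    """Beats per measure, counting 6/8, 9/8 and 12/8 as 2, 3 and 4 beats."""
    if denominator == 8 and numerator in (6, 9, 12):
        return numerator // 3
    return numerator


def beat_times(n_measures, numerator, denominator, bpm):
    """Beat onset times in seconds for a score rendered at a constant tempo."""
    seconds_per_beat = 60.0 / bpm
    total_beats = n_measures * beats_per_measure(numerator, denominator)
    return [k * seconds_per_beat for k in range(total_beats)]
```

Two measures of 6/8 at 120 BPM therefore yield four beat annotations, spaced half a second apart.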
5. EVALUATION
Evaluating score following is challenging due to causality, timing precision, and output
latency. Since the MIREX challenge [23] provided foundational metrics, later studies
introduced alternative evaluation strategies including beat-level evaluations or
asynchrony [3], reflecting the task’s frequent integration with automatic accompaniment
systems.
In this work, we adopt two complementary evaluation perspectives. First, we evaluate in
the performance domain, where errors are measured in milliseconds based on ground-truth
annotations aligned to the audio. This approach is commonly used in audio-to-score
alignment research and enables precise, frame-level evaluation, since the annotations
directly reflect the actual timing of the performance. Second, we also evaluate in the
score domain measured in beat units as suggested in [29,57], which better reflects the
nature of score following as a task of predicting the corresponding score position at
each moment of the performance.
Figure 4. Defined delay types of the system. Only system delay is considered in the
experiment.
5.1 Evaluation Metrics
We select evaluation metrics mostly adapted from the score following MIREX benchmark
[23] and audio-to-score alignment (ASA) metrics [57]. We use Alignment Rate (AR) within
a tolerance range of $|\theta_e|$, varying from 50 ms to 2000 ms. We also compute
Absolute Errors (AE), both in milliseconds and in beats, from which we derive the
Average Absolute Error (AAE) and Median Absolute Error (MAE), along with the standard
deviation $\sigma_e$. To further characterize the distribution of errors, we report
kurtosis and skewness, which capture the peakedness and asymmetry of the non-absolute
error distribution, respectively. In addition, we report the average latency
$\mu_{lat}$, defined as the system delay from the detection time to the end of
inference. Unlike total latency, this excludes hardware latency and is composed of two
parts: (i) feature processing and (ii) execution of the online alignment algorithm for
each frame step (see Fig. 4). Errors exceeding 2 seconds (or 2 beats in the score
domain) are excluded from AE calculations, including both AAE and MAE, to avoid
distortion from unbounded tracking failures. We report AR in two ways. The averaged
piece-wise AR is a common measure, while the total AR reflects the proportion of
successfully aligned beat events across the entire dataset. The latter avoids
overrepresentation of shorter pieces and provides a more balanced view of overall
performance.
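The metrics above can be sketched with the standard library. Several details are assumptions not fixed by the text: the function names, the use of the population standard deviation, computing the deviation over absolute errors, and the toy error values in the usage note.

```python
import statistics


def alignment_rate(abs_errors, tol):
    """Fraction of events whose absolute error is within the tolerance."""
    return sum(e <= tol for e in abs_errors) / len(abs_errors)


def summarize(errors, cutoff=2000.0):
    """AAE, median absolute error (MAE) and spread, excluding |e| > cutoff."""
    kept = [abs(e) for e in errors if abs(e) <= cutoff]
    return {
        "AAE": statistics.fmean(kept),
        "MAE": statistics.median(kept),
        "std": statistics.pstdev(kept),  # assumption: population std
    }


def piecewise_and_total_ar(pieces, tol):
    """Return (piece-wise AR, total AR) for a dict of piece -> signed errors."""
    per_piece = [
        alignment_rate([abs(e) for e in errs], tol) for errs in pieces.values()
    ]
    pooled = [abs(e) for errs in pieces.values() for e in errs]
    return statistics.fmean(per_piece), alignment_rate(pooled, tol)
```

With a short piece that is tracked perfectly and a longer one that is tracked half the time, the piece-wise AR (0.75) exceeds the total AR (about 0.67), which illustrates how averaging over pieces gives short pieces extra weight.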
To evaluate runtime latency under simulation, we measure two components: the average
duration of extracting features from incoming audio frames, and the time taken by the
alignment process to consume features and predict score positions. Specifically, the
latency was computed from the moment audio was read to the time the score position was
predicted, excluding hardware I/O delays. This two-step measurement allows for
standardized latency reporting independent of the hardware setup.
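A two-stage latency measurement of this kind could look like the following sketch. The names `timed` and `mean_latency` are hypothetical, and `extract` and `align` stand for any per-frame feature extractor and alignment step; the paper's actual instrumentation is not shown in the text.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - start) * 1000.0


def mean_latency(frames, extract, align):
    """Average per-frame latency (ms) of feature extraction and alignment."""
    feat_ms, align_ms = [], []
    for frame in frames:
        feats, t_feat = timed(extract, frame)
        _, t_align = timed(align, feats)
        feat_ms.append(t_feat)
        align_ms.append(t_align)
    n = len(frames)
    return sum(feat_ms) / n, sum(align_ms) / n
```

Timing each stage separately, and only after the audio frame is already in memory, keeps device I/O out of the reported numbers.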
| Dataset | Method | AAE (ms) ↓ ± σ | MAE (ms) ↓ | Skew. | Kurt. | AR ≤50 ms | AR ≤100 ms | AR ≤500 ms | AR ≤1000 ms | AR ≤2000 ms | Total AR ↑ (≤2000 ms, %) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (n)ASAP | OLTWDixon | 189.55 ± 281.55 | 97.09 | 3.20 | 17.97 | 40.3 | 58.5 | 82.5 | 88.3 | 92.0 | 89.4 |
| (n)ASAP | OLTWArzt | 183.56 ± 263.95 | 91.18 | 0.75 | 11.79 | 44.1 | 58.3 | 84.8 | 92.0 | 95.1 | 92.8 |
| (n)ASAP | HMM | 487.73 ± 423.27 | 346.01 | 0.18 | 3.33 | 15.6 | 22.2 | 37.5 | 43.8 | 43.8 | 43.8 |
| Batik | OLTWDixon | 186.97 ± 262.55 | 104.40 | 3.75 | 24.70 | 28.2 | 51.7 | 82.1 | 85.2 | 87.6 | 89.4 |
| Batik | OLTWArzt | 193.36 ± 269.13 | 107.15 | 1.00 | 12.63 | 35.9 | 53.0 | 82.2 | 87.4 | 90.3 | 89.7 |
| Batik | HMM | 693.63 ± 376.58 | 641.77 | 0.11 | 0.98 | 4.5 | 10.8 | 34.0 | 46.2 | 64.2 | 61.9 |
| Vienna4x22 | OLTWDixon | 285.43 ± 390.82 | 132.73 | 1.57 | 5.90 | 26.6 | 43.2 | 72.4 | 80.0 | 85.5 | 82.5 |
| Vienna4x22 | OLTWArzt | 300.41 ± 368.70 | 152.51 | 0.50 | 3.93 | 33.2 | 44.5 | 73.3 | 84.3 | 86.7 | 86.7 |
| Vienna4x22 | HMM | 439.64 ± 427.02 | 319.13 | 0.15 | 3.79 | 23.5 | 33.3 | 51.1 | 57.1 | 63.0 | 75.9 |

Table 2. Evaluation results on three datasets using different score-following methods.
The AR columns with a threshold report the piece-wise alignment rate (AR, % ↑), measured
as the average over pieces, while the total AR indicates the global proportion of
aligned beat events across the entire dataset. All tests were conducted with STFT-based
Chroma as features.

5.2 Alignment Mapping Function
Given the alignment path, the alignment mapping function is applied to transfer the beat
positions on one axis (either performance or score) to another axis to compute the
alignment error. Due to the local, stepwise nature of real-time alignment, the resulting
path is not necessarily monotonic and may contain multiple correspondents or skipped
positions, depending on the implementation and purpose of the methods. Unlike linear
interpolation methods commonly used in offline audio-to-score alignment, which assume
continuous mappings, our evaluation relies only on predictions made prior to or at each
evaluation time point. To reflect this, we define the mapping function as follows:

$$\hat{u}_k = \min\{\, u_i \mid (u_i, t_i) \in W,\ t_i = \max\{ t_j \mid t_j \le k \} \,\},$$

where $W = \{(u_i, t_i)\}$ is the warping path expressed in the frame indices: $u_i$ is
the score-rendered-audio frame index and $t_i$ is the performance-audio frame index. The
inner max finds the latest performance frame $t_i$ not exceeding the current frame $k$,
and the outer min selects the smallest score frame $u_i$ among those alignments. This
mapping relies solely on past or current frames to maintain causality. It handles
skipped or one-to-many mappings and avoids any interpolation methods that depend on
future frames.
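The mapping function of Section 5.2 translates almost literally into code. The sketch below assumes the warping path is given as a list of (score frame, performance frame) pairs; the name `map_to_score_frame` and the choice to return `None` before any frame has been aligned are illustrative assumptions.

```python
def map_to_score_frame(path, k):
    """Causal mapping from performance frame k to a score frame.

    path: list of (u_i, t_i) pairs (score frame index, performance frame index).
    Returns None if no performance frame at or before k has been aligned yet.
    """
    past = [t for _, t in path if t <= k]
    if not past:
        return None
    latest = max(past)                               # inner max over t_j <= k
    return min(u for u, t in path if t == latest)    # outer min over u_i
```

For the path `[(0, 0), (1, 1), (2, 1), (3, 1), (5, 4)]`, frame 1 has a one-to-many alignment and maps to score frame 1 (the smallest candidate), while frames 2 and 3 were skipped by the tracker and fall back to the same value without looking ahead to frame 4.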
6. RESULTS
Table 2 presents a comparison of alignment methods based on performance-domain
evaluation, measured in milliseconds. All methods exhibit positive skewness in error
distribution, reflecting the expected lag of the beat estimates in real-time alignment.
The overall results show that the OLTW-based method outperforms the HMM baseline across
all datasets in both alignment accuracy and coverage. While OLTWDixon and OLTWArzt show
comparable MAE depending on the dataset, OLTWArzt consistently achieves higher coverage
(Total AR), suggesting that it is more robust against overall failures. The difference
likely stems from OLTWDixon skipping uncertain regions, while OLTWArzt’s
“backward-forward” strategy corrects early misalignments and enhances coverage. Despite
having the lowest AR, the HMM shows the lowest skewness and kurtosis primarily because
significant errors (>2 s) are excluded from the summary statistics and its “sticky”
behavior to linger in the same state in local regions tends to narrow the error
distribution.
Table 3 presents an evaluation comparison in beat units, offering a tempo-normalized
perspective. The overall trend mirrors the performance-domain results in milliseconds,
but these results are standardized across tempi. For the OLTW variants, AAE remains
around 0.3 beats, with median values typically below 0.2. Total AR is consistently lower
than the 2000 ms-based metric, reflecting that most pieces have tempi above 60 BPM,
where two beats span less than two seconds.

| Dataset | Method | AAE ↓ (beats) ± σ | MAE ↓ (beats) | AR ↑ (%) |
|---|---|---|---|---|
| (n)ASAP | OLTWDixon | 0.22 ± 0.27 | 0.13 | 83.4 |
| (n)ASAP | OLTWArzt | 0.27 ± 0.30 | 0.16 | 85.2 |
| (n)ASAP | HMM | 0.80 ± 0.54 | 0.66 | 76.9 |
| Batik | OLTWDixon | 0.20 ± 0.27 | 0.11 | 88.9 |
| Batik | OLTWArzt | 0.29 ± 0.34 | 0.18 | 88.8 |
| Batik | HMM | 0.80 ± 0.38 | 0.67 | 59.3 |
| Vienna4x22 | OLTWDixon | 0.31 ± 0.33 | 0.19 | 78.3 |
| Vienna4x22 | OLTWArzt | 0.37 ± 0.38 | 0.24 | 84.0 |
| Vienna4x22 | HMM | 0.76 ± 0.78 | 0.51 | 70.3 |

Table 3. Beat-level evaluation results including total alignment rate (AR) (%).

| Feature type | MAE (ms) | Feature process latency (ms) | Alignment method | Online alignment latency (ms) |
|---|---|---|---|---|
| Chroma | 265.50 | 3.05 | OLTWDixon | 1.22 |
| mel | 297.92 | 3.40 | OLTWArzt | 0.07 |
| CQT | 341.25 | 42.58 | HMM | 3.59 |
| LSE | 241.85 | 0.91 | | |
| MFCC | 931.81 | 2.58 | | |

Table 4. Comparison of feature types and alignment methods in terms of alignment error
(MAE) and latency. LSE is the log-spectral energy feature that was adopted in [38].
Latency values are averaged over the hardware setups evaluated in Section 4.
In addition, a comparison of various feature types and the latencies of the alignment
methods is reported in Table 4. Among the features, log-spectral energy (LSE) shows the
lowest MAE (241.85 ms) and delay (0.91 ms), indicating strong performance with minimal
overhead. In contrast, CQT and MFCC yield higher MAE, with CQT also requiring
considerable extraction time (42.58 ms), which limits its real-time suitability. For
alignment methods, OLTWArzt achieves the lowest latency (0.07 ms), whereas HMM shows
noticeably higher delay (3.59 ms) due to its computational complexity. These results
highlight a trade-off between alignment accuracy and runtime efficiency, with LSE and
OLTWArzt providing a favorable balance for low-latency use.

Figure 5. A scatter plot of mean absolute error (MAE) and Henle’s difficulty level in
(n)ASAP and Batik dataset. The MAE results are from OLTWArzt.
The results also reveal characteristics of the datasets. While the overall alignment
performance between (n)ASAP and Batik is comparable, Vienna4x22 shows noticeably higher
error variance and kurtosis. This reflects the dataset’s unique structure (22 diverse
renditions of each of only four pieces), leading to substantial variability in
expressive timing, articulation, and interpretation. These variations present additional
challenges for score following and result in heavier-tailed error distributions, as seen
in the higher kurtosis values.
7. DISCUSSIONS
Figure 5 further illustrates the relationship between musical difficulty and alignment
accuracy for (n)ASAP and Batik. We observe a moderate positive correlation (r = 0.24,
p = 0.022) between MAE and the annotated difficulty levels, indicating that technically
more demanding pieces tend to produce larger alignment errors. Vienna4x22 was excluded
from this analysis due to its use of short excerpts, which makes consistent difficulty
grading unreliable.
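For readers who want to reproduce this kind of analysis on their own results, a correlation coefficient between per-piece error and difficulty can be computed as sketched below. The text does not state which correlation measure was used, so the Pearson coefficient here is an assumption, and `pearson_r` is a hypothetical helper; no data from the paper is reproduced.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

A value near +1 or -1 indicates a nearly linear relationship, whereas a value such as the reported 0.24 indicates a weak-to-moderate tendency with considerable scatter.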
To further understand how alignment behaviors differ between methods, Figure 6
illustrates an example alignment result comparing OLTWArzt (left) and HMM (right).
Although OLTWArzt smoothly follows the beat events, the HMM warping path shows frequent
horizontal segments, indicating the “sticky” tendency to stay near note onsets,
reflecting its state-based formulation that emphasizes onset transitions. This leads to
cases where it lingers on sustained notes and becomes locally stuck, showing limited
forward momentum. The corresponding region (highlighted in yellow) exhibits changes in
harmony, note density, and dynamics compared to the preceding passage, which provides
sufficient contrast for the score follower to recover.
Lastly, we found that not only the choice of evaluation metrics, but also how alignment
errors are computed (Section 5.2) can affect accuracy results to a meaningful extent.
Small differences in error calculation sometimes led to noticeable shifts in reported
accuracy.
Figure 6. Two examples of alignment path with beat positions: (left) OLTWArzt, (right)
HMM.
8. USE CASES AND APPLICATIONS
To demonstrate the practicality of our package, we built a lightweight web application
that runs locally with real-time audio input or pre-recorded files. Built with
websocket-based communication, the system responds quickly enough to ensure minimal
perceptual delay. Our companion website includes a video demonstration and a link to the
source code. This application aims to help researchers test their own score following
models in an interactive setting. Beyond the web demo, our package is also used as the
score following module in the ACCompanion [3], a real-time accompaniment system. These
applications demonstrate the versatility of our framework and validate its utility in
interactive music scenarios.
9. CONCLUSIONS AND FUTURE WORK
We presented a systematic framework for real-time audio-based score following as the
open-source Python package Matchmaker. It supports live and simulation-based evaluation
with baseline models, enabling reproducible benchmarking across datasets and features.
Experiments on three public piano datasets show that the OLTWArzt variant achieves the
highest performance and that the onset-sensitive spectral feature (LSE) outperforms
chroma in both accuracy and latency. However, the current framework is limited in its
support of tempo models commonly integrated with HMM-based score followers, which may
partly explain the limited performance of the HMM baseline. Also, recent works often
include learned features or multimodal input, which poses a new challenge to evaluate.
Although our evaluation was limited to classical piano, extending Matchmaker to other
instruments and genres requires only adapting the proper datasets and feature extraction
modules. Future work will extend the framework to support a wider variety of instruments
and musical styles, and include additional feature representations, advanced tempo
modeling, and multimodal inputs.
10. ACKNOWLEDGMENTS
This work has been supported by the Austrian Science Fund (FWF), grant agreement PAT
8820923 (“Rach3: A Computational Approach to Study Piano Rehearsals”). Additionally,
this work was supported by the National Research Foundation of Korea (NRF) grant funded
by the Korea government (MSIT) under Grant RS-2023-NR077289.
11. REFERENCES
[1] R. B. Dannenberg, “An On-Line Algorithm for Real-Time Accompaniment,” in Proceedings
of the International Computer Music Conference (ICMC ’84), Paris, France, 1984.
[2] B. Vercoe, “The Synthetic Performer in the Context of Live Performance,” in
Proceedings of the International Computer Music Conference (ICMC ’84), Paris, France,
1984.
[3] C. Cancino-Chacon, S. Peter, P. Hu, E. Karystinaios, F. Henkel, F. Foscarin,
N. Varga, and G. Widmer, “The ACCompanion: Combining Reactivity, Robustness, and Musical
Expressivity in an Automatic Piano Accompanist,” in Proceedings of the International
Joint Conference on Artificial Intelligence (IJCAI-23), Macao, SAR, China, May 2023,
arXiv:2304.12939 [cs, eess]. [Online]. Available: http://arxiv.org/abs/2304.12939
[4] K. Armstrong, T.-C. Hung, J.-X. Huang, and Y.-W. Liu, “Real-time piano accompaniment
model trained on and evaluated according to human ensemble characteristics,” in
Proceedings of the Sound and Music Computing (SMC), Porto, Portugal, 2024.
[5] C. Raphael, “Music Plus One and Machine Learning,” in Proceedings of the 27th
International Conference on Machine Learning (ICML 2010), Haifa, Israel, 2010.
[6] A. Maezawa, “I Got Rhythm, so Follow Me More: Modeling Score-Dependent Timing
Synchronization in a Piano Duet,” in Proceedings of the Sound and Music Computing
Conference (SMC 2024), Porto, Portugal, 2024.
[7] A. Arzt, G. Widmer, and S. Dixon, “Automatic Page Turning for Musicians via
Real-Time Machine Listening,” in Proceedings of the European Conference on Artificial
Intelligence (ECAI), 2008.
[8] F. Henkel, S. Schwaiger, and G. Widmer, “Fully Automatic Page Turning on Real
Scores,” in Extended Abstracts for the Late-Breaking Demo Session of the 22nd
International Society for Music Information Retrieval Conference (ISMIR 2021), Online,
2021, arXiv:2111.06643 [cs]. [Online]. Available: http://arxiv.org/abs/2111.06643
[9] C. Brazier and G. Widmer, “Towards Reliable Real-time Opera Tracking: Combining
Alignment with Audio Event Detectors to Increase Robustness,” in Proceedings of the
Sound and Music Computing Conference, Online, 2020, arXiv:2006.11033 [cs, eess].
[Online]. Available: http://arxiv.org/abs/2006.11033
[10] J. Pa k, S. Yong, T. Kwon, and J. Nam, “A Real-
Time Ly ics Alignmen Sys em Using Ch oma and
Phone ic Fea u es o Classical Vocal Pe o mance,” in
ICASSP 2024 - 2024 IEEE In e na ional Con e ence
on Acous ics, Speech and Signal P ocessing (ICASSP).
Seoul, Ko ea, Republic o : IEEE, Ap . 2024, pp.
1371–1375. [Online]. A ailable: h ps://ieeexplo e.
ieee.o g/documen /10445926/
[11] R. Gong, P. Cu illie , N. Obin, and A. Con , “Real-
ime audio- o-sco e alignmen o singing oice based
on melody and ly ic in o ma ion,” in In e speech
2015. ISCA, Sep. 2015, pp. 3312–3316. [Online].
A ailable: h ps://www.isca-a chi e.o g/in e speech_
2015/gong15_in e speech.h ml
[12] T. O suka, K. Nakadai, T. Takahashi, T. Oga a, and
H. G. Okuno, “Real-Time Audio- o-Sco e Alignmen
Using Pa icle Fil e o Coplaye Music Robo s,”
EURASIP Jou nal on Ad ances in Signal P ocessing,
ol. 2011, no. 1, p. 384651, Dec. 2011. [Online].
A ailable: h ps://asp-eu asipjou nals.sp inge open.
com/a icles/10.1155/2011/384651
[13] O. La illo , C. Cancino-Chacon, and C. B azie ,
“Real-Time Visualisa ion o Fugue Played by a S ing
Qua e ,” in P oceedings o he Sound and Music Com-
pu ing Con e ence (SMC 2020), Online, 2020.
[14] R. Dannenbe g and B. Mon -Reynaud, “Following
an Imp o isa ion in Real Time,” in P oceedings o
he In e na ional Compu e Music Con e ence, Cham-
paign/U bana, Illinois, USA, 1987.
[15] R. B. Dannenbe g and H. Mukaino, “New Techniques
o Enhanced Quali y o Compu e Accompanimen ,”
in P oceedings o he In e na ional Compu e Music
Con e ence, Cologne, Ge many, 1988.
[16] B. Ve coe and M. Pucke e, “Syn he ic Rehea sal:
T aining he Syn he ic Pe o me ,” in P oceedings o
he In e na ional Compu e Music Con e ence (ICMC
’85), Vancou e , BC, Canada, 1985.
[17] M. Pucke e, “Sco e ollowing using he sung oice,”
in P oceedings o he In e na ional Compu e Music
Con e ence (ICMC ’95), Ban , AB, Canada, 1995.
[18] P. Desain, H. Honing, and H. Heijink, “Robust Score-Performance Matching: Taking Advantage of Structural Information,” in Proceedings of the International Computer Music Conference (ICMC), Thessaloniki, Greece, 1997.
[19] C. Raphael and Y. Gu, “Orchestral Accompaniment for a Reproducing Piano,” in Proceedings of the International Computer Music Conference (ICMC 2009), Montreal, Canada, 2009.
Proceedings of the 26th ISMIR Conference, Daejeon, Korea, September 21-25, 2025
[20] M. Prockup, D. Grunberg, A. Hrybyk, and Y. E. Kim, “Orchestral Performance Companion: Using Real-Time Audio to Score Alignment,” IEEE MultiMedia, vol. 20, no. 2, pp. 52–60, Apr. 2013. [Online]. Available: http://ieeexplore.ieee.org/document/6530591/
[21] A. Arzt and G. Widmer, “Real-Time Music Tracking Using Multiple Performances as a Reference,” in Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR 2015), Malaga, Spain, 2015.
[22] F. Henkel, S. Balke, M. Dorfer, and G. Widmer, “Score Following as a Multi-Modal Reinforcement Learning Problem,” Transactions of the International Society for Music Information Retrieval, vol. 2, no. 1, pp. 67–81, Nov. 2019. [Online]. Available: http://transactions.ismir.net/articles/10.5334/tismir.31/
[23] A. Cont, D. Schwarz, N. Schnell, and C. Raphael, “Evaluation of Real-Time Audio-to-Score Alignment,” in Music Information Retrieval Evaluation eXchange (MIREX 2007), Vienna, Austria, 2007.
[24] A. Cont, “Antescofo: Anticipatory Synchronization and Control of Interactive Parameters in Computer Music,” in Proceedings of the International Computer Music Conference (ICMC ’08), Belfast, Ireland, 2008.
[25] J. Echeveste, P. Cuvillier, and A. Cont, “Improved Synchronization of a Pre-Recorded Music Accompaniment on a User’s Music Playing,” U.S. Patent US 2023/0 082 086 A1, Mar., 2023.
[26] S. Dixon and G. Widmer, “MATCH: A Music Alignment Tool Chest,” in Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR 2005), London, UK, 2005.
[27] C. Joder, S. Essid, and G. Richard, “Learning Optimal Features for Polyphonic Audio-to-Score Alignment,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2118–2128, Oct. 2013. [Online]. Available: http://ieeexplore.ieee.org/document/6525340/
[28] M. Muller, F. Kurth, and T. Roder, “Towards an Efficient Algorithm for Automatic Score-to-Audio Synchronization,” in Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR 2004), Barcelona, Spain, 2004.
[29] Z. Duan and B. Pardo, “A state space model for online polyphonic audio-score alignment,” in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Prague, Czech Republic: IEEE, May 2011, pp. 197–200. [Online]. Available: http://ieeexplore.ieee.org/document/5946374/
[30] P.-W. Chou, F.-N. Lin, K.-N. Chang, and H.-Y. Chen, “A Simple Score Following System for Music Ensembles Using Chroma and Dynamic Time Warping,” in Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval. Yokohama Japan: ACM, Jun. 2018, pp. 529–532. [Online]. Available: https://dl.acm.org/doi/10.1145/3206025.3206090
[31] M. Muller, “Music Synchronization,” in Fundamentals of Music Processing. Cham: Springer International Publishing, 2021, pp. 119–170. [Online]. Available: https://link.springer.com/10.1007/978-3-030-69808-9_3
[32] M. Pérez Fernández, H. Kirchhoff, and X. Serra, “A comparison of pitch chroma extraction algorithms,” in Proceedings of the 19th Sound and Music Computing Conference (SMC/JIM/IFC 2022). Saint-Étienne, France: SMC Network, 2022. [Online]. Available: https://doi.org/10.5281/zenodo.6573082
[33] C.-T. Chen, J.-S. R. Jang, W.-S. Liu, and C.-Y. Weng, “An efficient method for polyphonic audio-to-score alignment using onset detection and constant Q transform,” in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Shanghai: IEEE, Mar. 2016, pp. 2802–2806. [Online]. Available: http://ieeexplore.ieee.org/document/7472188/
[34] J. J. Carabias-Orti, F. J. Rodriguez-Serrano, P. Vera-Candeas, and N. Ruiz-Reyes, “An Audio to Score Alignment Framework Using Spectral Factorization and Dynamic Time Warping,” in Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR 2015), Malaga, Spain, 2015.
[35] F. Korzeniowski and G. Widmer, “Refined Spectral Template Models for Score Following,” in Proceedings of the Sound and Music Computing Conference (SMC 2013), Stockholm, Sweden, 2013.
[36] S. Ewert, M. Muller, and P. Grosche, “High resolution audio synchronization using chroma onset features,” in 2009 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2009, pp. 1869–1872.
[37] A. Pillay, “A Neural Score Follower for Computer Accompaniment of Polyphonic Musical Instruments,” Master’s thesis, Carnegie Mellon University, Pittsburgh, PA, USA, 2024.
[38] S. Dixon, “An On-Line Time Warping Algorithm for Tracking Musical Performances,” in Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI 05), Edinburgh, Scotland, 2005.
[39] R. Macrae and S. Dixon, “Polyphonic Score Following Using on-Line Time Warping,” in Music Information Retrieval Evaluation eXchange (MIREX 2008), Philadelphia, USA, 2008.
[40] F. J. Rodriguez-Serrano, P. Vera-Candeas, and J. J. Carabias-Orti, “A Real-Time Score Follower for MIREX 2015,” in Music Information Retrieval Evaluation eXchange (MIREX 2015), Malaga, Spain, 2015.
[41] J. J. Carabias, F. J. Rodriguez, and P. Vera, “A Real-Time NMF-Based Score Follower for MIREX 2012,” in Music Information Retrieval Evaluation eXchange (MIREX 2012), Porto, Portugal, 2012.
[42] A. Arzt, “Real-Time Music Tracking Using Tempo-Aware on-Line Dynamic Time Warping,” in Proceedings of the Vienna Talk on Musical Acoustics (VITA), Vienna, Austria, 2010.
[43] P. Cano, A. Loscos, and J. Bonada, “Score-Performance Matching using HMMs,” in Proceedings of the International Computer Music Conference, Beijing, China, 1999.
[44] E. Nakamura, P. Cuvillier, and A. Cont, “Autoregressive Hidden Semi-Markov Model of Symbolic Music Performance for Score Following,” in Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR 2015), Malaga, Spain, 2015. [Online]. Available: https://archives.ismir.net/ismir2015/paper/000015.pdf
[45] C. Raphael, “A Bayesian Network for Real-Time Musical Accompaniment,” in Proceedings of the 14th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 2001, pp. 1433–1439. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2001/file/2b0f658cbffd284984fb11d90254081f-Paper.pdf
[46] R. Yamamoto, S. Sako, and T. Kitamura, “Real-Time Audio to Score Alignment Using Segmental Conditional Random Fields and Linear Dynamical System,” in Music Information Retrieval Evaluation eXchange (MIREX 2012), Porto, Portugal, 2012.
[47] Y. Jiang and C. Raphael, “Score Following with Hidden Tempo Using a Switching State-Space Model,” in Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR 2020), Online, 2020.
[48] M. Dorfer, F. Henkel, and G. Widmer, “Learning to Listen, Read, and Follow: Score Following as a Reinforcement Learning Game,” in Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR 2018), Paris, France, 2018.
[49] S. Dixon, “Live Tracking of Musical Performances using On-Line Time Warping,” in Proceedings of the 8th International Conference on Digital Audio Effects (DAFx’05), Madrid, Spain, 2005.
[50] A. Arzt and G. Widmer, “Towards Effective Any-Time Music Tracking,” in Proceedings of the Starting AI Researchers Symposium (STAIRS), held at ECAI 2010, Lisbon, Portugal, 2010.
[51] S. Behnel, R. Bradshaw, C. Citro, L. Dalcin, D. S. Seljebotn, and K. Smith, “Cython: The best of both worlds,” Computing in Science & Engineering, vol. 13, no. 2, pp. 31–39, 2011.
[52] S. D. Peter, C. E. Cancino-Chacón, F. Foscarin, A. P. McLeod, F. Henkel, E. Karystinaios, and G. Widmer, “Automatic Note-Level Score-to-Performance Alignments in the ASAP Dataset,” Transactions of the International Society for Music Information Retrieval, vol. 6, no. 1, pp. 27–42, Jun. 2023. [Online]. Available: http://transactions.ismir.net/articles/10.5334/tismir.149/
[53] P. Hu and G. Widmer, “The Batik-plays-Mozart Corpus: Linking Performance to Score to Musicological Annotations,” in Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), 2023.
[54] W. Goebl, “The vienna 4x22 piano corpus,” 1999. [Online]. Available: http://dx.doi.org/10.21939/4X22
[55] F. Foscarin, E. Karystinaios, S. D. Peter, C. Cancino-Chacón, M. Grachten, and G. Widmer, “The match file format: Encoding alignments between scores and performances,” in Proceedings of the Music Encoding Conference (MEC 2022), Halifax, Canada.
[56] C. Brazier and G. Widmer, “Addressing the Recitative Problem in Real-time Opera Tracking,” in Proceedings of Frontiers of Research in Speech and Music FRSM 2020, Online, Oct. 2020, arXiv:2010.11013 [eess].
[57] A. Morsi and X. Serra, “Bottlenecks and solutions for audio to score alignment research.” in Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022), 2022, pp. 272–279.
