FRETBOARDFLOW: A DUAL-MODEL APPROACH TO OPTIMIZE CHORD
VOICINGS ON THE GUITAR FRETBOARD
Marcel A. Vélez Vásquez1 Mariëlle Baelemans1 Jonathan Driedger2 John Ashley Burgoyne1
1ILLC, University of Amsterdam, the Netherlands 2Chordify, Groningen, the Netherlands
[email protected]
ABSTRACT
Smoothly transitioning between chords on the guitar can be a major challenge for beginners, especially when they rely on just a few common chord diagrams. Yet many chords can be played in multiple ways (i.e., voicings), which can facilitate more comfortable hand movements on the fretboard. To address this, we present the FretboardFlow dataset, featuring 97 recordings of 35 songs made with a hexaphonic pickup to capture multiple chord voicings as performed by expert guitarists. Our dataset builds upon the GuitarSet pipeline, incorporating a Python translation of Prätzlich et al.'s KAMIR algorithm for interference reduction, to produce automated hexaphonic transcriptions. These capture harmonic structure and performance-driven voicing choices that implicitly reflect muscle memory and ergonomic habits, providing a rich resource for analyzing real-world chord transitions.
To predict the most convenient voicing within progressions, we propose a dual-model approach integrating both chord and voicing history, and loss functions well-suited to the flexible nature of voicings. Our research expands on prior chord prediction work by incorporating expert-recorded voicing variations of the same progressions and introducing a novel machine learning approach to fretboard navigation. We publicly release this dataset as a living resource to support data-driven exploration of context-aware guitar instructions.
1. INTRODUCTION
Online guitar-tablature and learning platforms like Ultimate Guitar [1] and Chordify provide millions of users access to extensive rhythm guitar song libraries. These websites often display multiple ways to play a given chord symbol, known as chord voicings (e.g., Ultimate Guitar suggests 27 ways to voice a G major chord). Yet they rarely guide learners in choosing between candidate voicings [2], nor do they adapt subsequent chord voicings to reflect prior playing decisions. As a result, novice players often default to a single, commonly taught chord voicing, even though voicings significantly affect tonal character, physical comfort, and voice leading [2–4]. For instance, a player might struggle with frequent shifts between open and barre chords, when consistently using barre voicings could yield smoother transitions across an entire progression. Without contextual guidance, such inefficient choices can hinder playability [5] and may go unnoticed by beginners [6].

© M.A. Vélez Vásquez, M.C.E. Baelemans, J. Driedger, and J.A. Burgoyne. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: M.A. Vélez Vásquez, M.C.E. Baelemans, J. Driedger, and J.A. Burgoyne, “FretboardFlow: A Dual-Model Approach to Optimize Chord Voicings on the Guitar Fretboard”, in Proc. of the 26th Int. Society for Music Information Retrieval Conf., Daejeon, South Korea, 2025.
Selecting appropriate voicings can be challenging and time-consuming at any skill level. Beginners, in particular, often face awkward transitions due to initial fingering choices [7–9]. As players gain experience, they develop an intuition for convenient voicings for a given progression [2]. Automatic systems that can recommend context-sensitive voicings could help build this intuition. However, current computational approaches and datasets do not yet fully address these nuanced decisions.
From a computational perspective, prior work [10] suggests that incorporating a longer history of previously played voicings can lead to smoother transitions. However, this insight remains underexplored in current voicing prediction systems and educational tools. Meanwhile, existing community-based datasets such as DadaGP [11] offer large-scale chord transcriptions but do not explicitly capture multiple voicings of the same progression. As a result, they overlook the nuanced transitions that are essential for modeling naturalistic guitar performance. This gap highlights the need for a dedicated resource that systematically captures varied, context-sensitive voicings.
In this paper, we introduce FretboardFlow, a curated dataset of expert-played rhythm guitar performances featuring up to five voicing variations of 35 songs, totaling 97 hexaphonic pickup recordings. We build on the Billboard Playability dataset and McGill Billboard dataset [12, 13] by employing a hexaphonic pickup to record individual strings, ensuring we can distinguish closely related chords that otherwise sound similar. We further propose a dual-model voicing prediction approach that combines information from both chord-symbol sequences and the history of previously played voicings, encouraging smoother transitions. Finally, we introduce a proper scoring rule, which encourages the model to capture the probabilistic nature of voicing choices rather than enforcing a single “correct” voicing.
2. RELATED WORK
Prior work has mainly focused on understanding and developing voicings for musical instruments, particularly the piano [14–16]. Notably, Nakamura et al. employed data-driven techniques to advance piano transcription and performance analysis, leveraging the PIG Dataset, which includes fingering annotations from multiple pianists [15]. Their results show the value of multi-performer datasets for modeling how different musicians approach chord structures. Similarly, Srivatsan and Berg-Kirkpatrick introduced a checklist-based reinforcement learning model for piano fingering prediction utilizing a representation based on relative note positions [17]. Their approach significantly enhanced the fluency and playability of predicted fingerings, and their reevaluation of standard metrics highlighted the complexity of modeling comfortable performance.
While these piano-focused methods showcase the effectiveness of data-driven approaches, research on the guitar’s unique challenges remains limited. Existing work has examined guitar fingering capture and tracking, often using computer vision or specialized sensors [6, 18, 19]. However, these efforts typically focus on real-time detection rather than documenting alternative voicings. Although large-scale tablature datasets such as the DadaGP dataset [11] and those used in MIREX transcription tasks [20–22] provide extensive chord data, they seldom focus on modeling multiple valid ways to fret the same chord or progression.
Recent work by Vélez Vásquez et al. [12] has made significant strides in formalizing the concept of playability for rhythm guitar. Through interviews with teachers and players, they developed a rubric encompassing seven criteria related to playability, three of which directly concern chord voicings. Their findings emphasize the need for deeper analysis of how alternative fingerings and fretboard positions affect overall ease of performance. This focus aligns with long-standing pedagogical traditions that emphasize intentional finger placement as fundamental both to technical fluency and musical expressivity [23, 24]. Fingering, defined as the specific assignment of fingers to frets and strings, has been shown to significantly influence both the physical execution and the perceived difficulty of a musical passage [6, 19, 25]. A critical factor in this regard is the amount of hand movement required between chords [7], which can often be minimized when players have access to a broader vocabulary of chord voicings. In this context, a tool capable of suggesting suitable and potentially novel voicings tailored to the current progression could greatly improve both playability and learning outcomes.
While recent work has laid the groundwork for data-driven chord voicing prediction (most recently by d’Hooge et al. [10], who show that incorporating the previous chord diagram improves voicing prediction), the field still lacks balanced, high-quality datasets specifically designed for voicing prediction, as well as models that effectively take longer chord history into account. To address these gaps, we introduce a performance-based dataset that captures expert voicing decisions, including multiple variations of the same chord in real musical contexts. Alongside this, we explore a dual-model prediction framework that integrates both chord symbols and prior voicing history, enabling more context-aware modeling of voicing transitions.
3. FRETBOARDFLOW DATASET
Predicting chord voicings beyond the most common positions around the lower frets of the guitar neck requires a dataset that captures multiple feasible voicings of the same chord progression. As illustrated in Figure 1B, the community-based DadaGP dataset features high fretboard activity in the lower positions, reflecting a bias toward open chords (where strings are played without being fretted on the neck). While such shapes are commonly used by novice guitarists, they may lead to suboptimal transitions, for example by requiring frequent shifts between open and barre chords when staying in barre shapes might offer more convenient movement. Moreover, while individual chords can be played in many ways, not all voicings fit together in the context of a progression. Capturing only a single voicing per progression risks leading models to memorize a one-to-one mapping between a progression and a voicing, rather than considering any appropriate alternatives. To complement existing resources like DadaGP, we introduce FretboardFlow, a curated dataset of expert performances that captures natural voicing choices across the fretboard (see Figure 1A). By including multiple voicings of the same progression, FretboardFlow enables the development of models that can learn context-aware fretboard usage.
Our FretboardFlow dataset builds on the Billboard Playability Dataset [12], a curated subset of the broader McGill Billboard Dataset [13] comprising 200 songs from the Billboard Hot 100. The Playability dataset provides detailed chord transcriptions and annotations across seven difficulty criteria (three of which relate directly to voicing) but uses a fixed chord voicing for each chord when predicting playability. This simplification limits the exact modeling of the progressions, since the dataset’s experts might have used different voicings during playability annotations. In designing FretboardFlow, we expanded this dataset by recording multiple expert-performed voicings of the same chord progressions, with an emphasis on natural movement on the fretboard. Our selection of songs draws from the Billboard Playability set specifically for its diversity in, and annotations of, chord difficulty levels; as a result, the dataset includes a range of chord types and playing techniques. While prior work [10] used datasets such as DadaGP [11], which provides large-scale community-contributed Guitar Pro files, these datasets emphasize open-position voicings (see Figure 1B) and are not specifically designed for chord voicing prediction. In contrast, FretboardFlow aims to provide richer data for learning natural voicings across the fretboard.
To construct the FretboardFlow dataset, we recorded multiple performances by expert guitarists, resulting in a subset of 35 songs selected from the Billboard Playability dataset (see Section 3.1). Each song was recorded in one to five different voicing variations, yielding a total of 97 performances that reflect alternatives one might viably play.
Capturing transitions between chords in specific voicings presents inherent challenges. While the fingering of a chord voicing plays a critical role in playability [24], recording precise finger placement would require complex motion capture or computer vision setups. Instead, we focus on capturing which frets are pressed on each string, disregarding which finger is used, to strike a balance between precision and practicality. Using a guitar equipped with a hexaphonic pickup, we are able to determine the string and fret of each note in a chord, enabling accurate reconstruction of fretboard positions even in cases where audio alone might be ambiguous.

Figure 1. Fretboard heatmaps illustrating fret usage across two datasets. (A) Heatmap of fret usage across all hexaphonic recordings in the FretboardFlow dataset. (B) Heatmap of fret usage in d’Hooge et al.’s chord diagram subset [10] of the DadaGP dataset [11]. The x-axis represents fret positions, starting with the open string. The y-axis corresponds to the six guitar strings, with the bottom row representing the low E string and the top row the high e string. Fret usage in (A) is more evenly distributed up to the 12th fret, suggesting expressive coverage of the fretboard. In contrast, (B) shows usage concentrated around the first three frets, with some activity near frets 5 and 7.

Proceedings of the 26th ISMIR Conference, Daejeon, Korea, September 21-25, 2025
FretboardFlow captures how voicings may vary both across songs and within the same chord-symbol progressions. More information about variation in song version count per contributor is available in the supplementary material. We release this dataset as a living resource, with plans to expand it over time, to support further research in voicing prediction, playability modeling, and personalized music instruction.
3.1 Song selection
We asked the participants to aim for at least three different voicings per song, instructing them to perform the pieces as is, without use of a capo. However, understanding the challenges some pieces pose, guitarists were permitted to simplify their performances by omitting inversions or added intervals, provided they informed us for accurate annotation. In practice, it only occurred for one song that one participant did not feel comfortable playing the inversion/added interval for more than one take.
To ensure a diverse range of difficulties within our dataset and prevent multiple participants from accidentally choosing the same song, we assigned each participant three very easy, five easy, ten difficult, and three very difficult pieces, based on their scores for the Chord Fingering Difficulty (CFD) criterion in the Billboard Playability dataset. Unlike the other categories, ‘very difficult’ pieces were strategically distributed among multiple participants. This exception was made because these pieces, due to their complexity, offer valuable insights into advanced playing techniques, and only two of the participants opted to practice them.
3.2 Participant instructions
Participants were provided with a web-based interface to practice songs and familiarize themselves ahead of the recording session. They selected songs from a list annotated with difficulty ratings from the Billboard Playability dataset [12], allowing them to choose material appropriate to their skill and interest.
Once participants were familiar with the songs, recordings were made using the same web interface and the hardware setup described in Section 3.3. Participants were encouraged to focus on producing feasible and musically coherent voicings, rather than flawless takes. While the contributors had the option to practice at slower tempos, only one performance was recorded in half-time.
Four expert guitarists contributed. Expert A was instrumental in creating 56 recordings, averaging 3.2 variations per song, including two versions of the most difficult song, Stevie Wonder’s “Do I Do”, without using simplified or half-speed versions. Expert B contributed 8 recordings, averaging 2.7 voicings per song, including 3 simplified versions and no half-speed recordings. Expert C contributed 19 recordings, averaging 2.1 voicings per song, with no simplifications but one half-speed recording. Expert D contributed 14 recordings, averaging 2.5 voicings per song, also without employing simplifications or half-speed.
3.3 Recording setup
Our recording setup, audio cleanup, and pitch estimation were inspired by the GuitarSet pipeline [26]. We used a custom-built hexaphonic pickup to record each individual guitar string. The pickup, built by Ubertar, uses six individual magnets, one per string, to capture separate audio channels. This pickup had to be soldered to a 7-pin cable and connected to a custom-made breakout box (also from Ubertar), allowing us to route each string’s signal independently. An additional standard output cable was used to transfer the regular mono output of the guitar itself.
To capture the 7 audio streams, we used a Focusrite Scarlett 18i20 (3rd Gen), where each stream was recorded as a separate track. For the guitar, we used a high-end Breedlove Concerto steel-string acoustic guitar for all recordings, attaching the hexaphonic pickup in the soundhole using the acoustic adapter from Ubertar.
Although each string was recorded using individual magnets, the magnets were sensitive enough that audio bleed occurred between adjacent strings. To address this, we wrote a Python reimplementation of the KAMIR algorithm [27], originally developed in MATLAB, which we used to reduce interference. After isolating each signal, we applied pYIN [28] for pitch extraction.
Post-processing and correction of the extracted pitch data were carried out using a combination of automated and manual methods. Focusing on chord voicings over precise onsets, we diverged from the GuitarSet pipeline. Rather than using Tony [29], we used Ableton Live 12 to enable multi-track MIDI editing and alignment. Pitch data was quantized to four evenly spaced positions per bar, and only unique phrases within each song were corrected (including uniqueness of voicings).
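The quantization to four positions per bar can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' workflow (which relied on Ableton Live 12 for editing): the function names, the representation of onsets in beats, and the 4/4 default are all hypothetical.

```python
from collections import defaultdict


def quantize_to_bar_grid(onset_beats, beats_per_bar=4, positions_per_bar=4):
    """Snap an onset (in beats from the start of the song) to the nearest grid slot.

    Returns (bar_index, slot_index) with slot_index in range(positions_per_bar).
    Assumption: onsets are already expressed in beats, not seconds.
    """
    step = beats_per_bar / positions_per_bar  # grid spacing in beats
    slot = round(onset_beats / step)          # nearest slot counted from the song start
    return divmod(slot, positions_per_bar)


def group_notes(notes, beats_per_bar=4, positions_per_bar=4):
    """Collect (onset_beats, string_index, fret) notes into one voicing per grid slot."""
    grid = defaultdict(dict)
    for onset, string, fret in notes:
        key = quantize_to_bar_grid(onset, beats_per_bar, positions_per_bar)
        grid[key][string] = fret  # a later note on the same string overwrites an earlier one
    return dict(grid)
```

Grouping per slot means that slightly staggered string onsets of a strummed chord end up in the same voicing, which matches the stated focus on voicings rather than precise onsets.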
Further details, including software tools, annotation choices, difficult cases, and model examples, are available on GitHub.1
4. FRETBOARDFLOW ANALYSIS AND
COMPARISON TO DADAGP
A primary focus of our dataset is its emphasis on multiple feasible chord voicings, rather than a single valid shape. While each voicing in the dataset has been performed, the absence of a particular voicing in our dataset does not imply it is unfeasible. In our recordings, 7.24% of the chords contain inversions and 6.50% include added intervals, slightly higher than the 4.90% and 2.35%, respectively, found in DadaGP [11], suggesting a larger share of less common chord forms.
We further analyzed fretboard usage through heatmaps (Figures 1A and 1B). FretboardFlow displays a more uniform spread of fret usage, extending well beyond the seventh fret, while DadaGP shows a strong bias toward open strings and the second and third frets, indicating a narrower range of chord positions.
An n-gram analysis (unigram through tetragram) further emphasizes the variety of voicing options in FretboardFlow. Even at the trigram and tetragram levels, chord progressions frequently appear in multiple voicings, whereas DadaGP shows heavier reliance on a single voicing per progression. As shown in Figure 2, FretboardFlow consistently contains a higher proportion of progressions with multiple unique voicing variants compared to DadaGP across all n-gram levels. We define distinct n-gram variants as any chord sequence in which at least one voicing differs; thus, two progressions with the same chord labels but only one altered voicing still count as separate variants. This highlights how FretboardFlow complements existing datasets by providing not only multiple chord shapes for individual chords but also meaningful variation within chord sequences, a core goal of our dataset.

1 https://github.com/Marcel-Velez/FretboardFlow
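The variant definition above can be made concrete with a short counting sketch. It is a minimal illustration, assuming each performance is a list of (chord label, voicing) pairs with hashable voicings; the function names and this data layout are hypothetical and not taken from the released code.

```python
from collections import defaultdict


def count_voicing_variants(performances, n):
    """Count distinct voicing sequences for every chord-label n-gram.

    performances: list of performances, each a list of (chord_label, voicing) pairs.
    voicing: any hashable value, e.g. a tuple of six frets (None for a muted string).
    Two windows with the same labels count as separate variants if at least one
    voicing differs, following the definition in the text.
    """
    variants = defaultdict(set)
    for performance in performances:
        for start in range(len(performance) - n + 1):
            window = performance[start:start + n]
            labels = tuple(label for label, _ in window)
            voicings = tuple(voicing for _, voicing in window)
            variants[labels].add(voicings)
    return {labels: len(seen) for labels, seen in variants.items()}


def variant_histogram(counts):
    """Proportion of n-grams that have exactly k distinct voicing variants."""
    histogram = defaultdict(int)
    for k in counts.values():
        histogram[k] += 1
    total = len(counts)
    return {k: c / total for k, c in sorted(histogram.items())}
```

Running `variant_histogram(count_voicing_variants(data, n))` for n = 1 to 4 would yield the kind of per-panel proportions summarized in Figure 2.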
5. DUAL-MODEL APPROACH
Predicting the optimal chord voicing requires combining information from two distinct data types. The model must interpret past voicing fret positions, capturing the temporal relation between chord transitions, while simultaneously processing the history of chord symbols and the upcoming chord. First, it encodes the voicing history, a sequence of fretboard matrices that capture fret positions over time, allowing it to model natural transitions. Second, it processes the chord symbols, represented as many-hot encoded vectors (e.g., root, quality, bass), capturing the harmonic context. Our solution is a novel approach, illustrated in Figure 3. It consists of two subnetworks, each processing one of the data streams, whose outputs are combined with a single linear layer.
The upper chord-symbol subnetwork processes the chord-symbol sequence up to time step t. We encode each chord symbol into a structured representation that captures the root note, chord quality, and bass note. Following the approach in [12, 30], we use an LSTM or DeepGRU to model these chord-symbol transitions. By using an LSTM (or alternatively, a DeepGRU with attention), the network can effectively learn from arbitrary-length context, which supports the musical correctness of the prediction.
In parallel, the lower voicing-history subnetwork deals with the fretboard positions used in previous time steps (up to t−1). This sequential encoding captures the physical transitions between chords. For this subnetwork we again use an LSTM or DeepGRU to model these transitions. The aim is to encourage convenient voicings.
After both subnetworks process their respective inputs, we concatenate their latent representations and feed the result through a single linear layer, outputting a distribution over possible frets for time step t.
5.1 Implementation
We slightly alter the LSTM-based framework used by [12, 30], which demonstrated promising results for both piano and guitar playability prediction. Specifically, we adopt a bidirectional LSTM variant to reflect that guitarists often know the full chord progression beforehand and can choose voicings that best fit the entire sequence. By considering information from both preceding and subsequent chords, the model more closely approximates the way guitarists choose transitions along a progression. Additionally, we explore the implementation of a DeepGRU-like architecture for both subnetworks [31]. This model shares similarities with the LSTM but incorporates an attention module, allowing the model to attend to distinct parts of the input sequence.

[Figure 2: four histogram panels (Uni-gram, Bi-gram, Tri-gram, Tetra-gram); x-axis: Number of Unique Voicing Variants; y-axis: Proportion (Ratio); legend: FretboardFlow, DadaGP.]
Figure 2. Histograms comparing the number of unique chord variants and chord progressions of lengths 2, 3, and 4 in the large community-based DadaGP dataset [11] and our curated FretboardFlow dataset. Each subplot displays the distribution of unique variants (x-axis: number of variants; y-axis: ratio). DadaGP shows a strong bias toward a single variant, while FretboardFlow consistently demonstrates higher ratios of progressions with multiple variations.

Figure 3. Overview of the dual-subnetwork architecture for chord voicing prediction. One subnetwork encodes the chord-symbol sequence, while the other processes voicing history as fretboard matrices. The outputs are concatenated and passed to a linear layer to predict the next voicing.
To handle chord symbols effectively, we encode each symbol into a structured vector representation in a similar fashion to Koops et al. and McFee et al. [32, 33]. Specifically, our representation consists of three pitch-class vectors spanning the 12 semitones of the Western musical octave: the root note encoding, a one-hot vector indicating which pitch class serves as the chord root; the quality and added intervals, a many-hot vector denoting the valid pitch classes for a given quality and additions; and the bass note encoding, a similarly encoded one-hot vector indicating the bass note (to account for inversions). These are then concatenated to serve as input to the chord-symbol subnetwork.
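As an illustration of this three-part encoding, the sketch below builds a 36-dimensional vector for a chord symbol. It rests on several assumptions that the text does not settle: the quality vector is expressed in absolute pitch classes (it could equally be relative to the root), only a handful of qualities are listed, and the names `PITCH_CLASSES`, `QUALITY_INTERVALS`, and `encode_chord` are hypothetical.

```python
# Pitch-class numbers with C = 0 (a common convention, assumed here).
PITCH_CLASSES = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10, "B": 11,
}

# Intervals in semitones above the root; an illustrative subset of chord qualities.
QUALITY_INTERVALS = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "7": (0, 4, 7, 10),
    "maj7": (0, 4, 7, 11),
    "min7": (0, 3, 7, 10),
}


def one_hot(index, size=12):
    vector = [0] * size
    vector[index] = 1
    return vector


def encode_chord(root, quality, bass=None, added=()):
    """Return root one-hot + quality many-hot + bass one-hot (length 36).

    added: extra intervals in semitones above the root (e.g. 14 for an added ninth).
    bass: note name of the bass; defaults to the root when there is no inversion.
    """
    root_pc = PITCH_CLASSES[root]
    bass_pc = PITCH_CLASSES[bass] if bass is not None else root_pc
    quality_vec = [0] * 12
    for interval in QUALITY_INTERVALS[quality] + tuple(added):
        quality_vec[(root_pc + interval) % 12] = 1
    return one_hot(root_pc) + quality_vec + one_hot(bass_pc)
```

For example, `encode_chord("G", "maj", bass="B")` marks G as the root, the pitch classes D, G, and B in the quality segment, and B in the bass segment, which is how a first-inversion G major chord would be distinguished from the root-position one.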
To encode each chord voicing, we use a binary matrix representation of the fretboard. Each string’s state is represented by a one-hot vector of size n_frets + 2, where we reserve one index for open strings and another for muted strings. In our recordings, the guitar has 21 frets, leading to a 6×23 matrix per chord. This layout captures the physical placement of the voicing, enabling our sequence model to learn transitions from one chord shape to the next.
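The 6×23 layout can be sketched as follows. The column ordering is an assumption made for illustration (index 0 for an open string, 1 to 21 for frets, 22 for a muted string); the text fixes only the sizes, and the helper names are hypothetical.

```python
N_FRETS = 21
N_STRINGS = 6
ROW_SIZE = N_FRETS + 2        # frets 1..21 plus one open and one muted slot
MUTED_INDEX = N_FRETS + 1     # assumed position of the "muted" slot


def encode_voicing(frets):
    """Encode a voicing as a 6 x 23 binary matrix (list of one-hot rows).

    frets: six entries ordered from the low E string to the high e string;
           an int fret number (0 = open string) or None for a muted string.
    """
    if len(frets) != N_STRINGS:
        raise ValueError("expected one entry per string")
    matrix = []
    for fret in frets:
        if fret is None:
            index = MUTED_INDEX
        elif 0 <= fret <= N_FRETS:
            index = fret
        else:
            raise ValueError(f"fret {fret} is outside 0..{N_FRETS}")
        row = [0] * ROW_SIZE
        row[index] = 1
        matrix.append(row)
    return matrix


def decode_voicing(matrix):
    """Invert encode_voicing: recover the fret (or None) from each one-hot row."""
    frets = []
    for row in matrix:
        index = row.index(1)
        frets.append(None if index == MUTED_INDEX else index)
    return frets
```

Keeping separate open and muted slots lets the representation distinguish a string that rings open from one that is not played at all, which matters for both the sounding pitches and the hand shape.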
We train the networks with both cross-entropy loss and mean-squared-error loss, which are equivalent to the logarithmic and Brier scoring rules for probabilistic forecast comparison, in the sense of Gneiting and Raftery [34]. Specifically, these loss functions are proper scoring rules, which are convex and guarantee that minimal loss would be achieved only when the network accurately reflects the probabilities of an expert choosing different possible fingerings in each context (as opposed to a fixed ground truth).
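The propriety of these two rules can be checked numerically with a toy example. The sketch below is not the training code; it simply computes the expected logarithmic and Brier scores of a forecast under an assumed "true" distribution over three candidate voicings, with hypothetical function names and made-up probabilities.

```python
import math


def log_score(probs, outcome):
    """Logarithmic score (cross-entropy for a single observed outcome)."""
    return -math.log(probs[outcome])


def brier_score(probs, outcome):
    """Brier score: squared error between the forecast and the one-hot outcome."""
    return sum((p - (1.0 if i == outcome else 0.0)) ** 2 for i, p in enumerate(probs))


def expected_score(score, forecast, truth):
    """Average score of `forecast` when outcomes are drawn from `truth`."""
    return sum(q * score(forecast, k) for k, q in enumerate(truth) if q > 0)


# Illustrative numbers only: experts pick voicing 0 in 60% of cases, and so on.
truth = [0.6, 0.3, 0.1]
overconfident = [0.98, 0.01, 0.01]   # behaves as if only one voicing were correct

for score in (log_score, brier_score):
    honest = expected_score(score, truth, truth)
    forced = expected_score(score, overconfident, truth)
    print(f"{score.__name__}: honest={honest:.3f} overconfident={forced:.3f}")
```

Under both rules the forecast that matches the true distribution obtains the lower expected score, which is the property that makes test loss informative when several voicings are legitimately possible.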
6. EXPERIMENTS AND RESULTS
We trained and tested on two datasets: FretboardFlow (the new dataset presented in this work) and DadaGP [11] filtered to contain only chord diagram data (as in [10]). We also tested a data augmentation approach in line with previous studies [10, 35]. We transposed each chord symbol and its voicing up or down by semitone steps until the transposition would either introduce an open string or exceed the 15th fret. This helps mitigate chord-key biases and multiplies the size of each dataset by a factor of more than 7.
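One possible reading of this augmentation rule is sketched below. The stopping conditions follow the text (no new open strings, nothing above the 15th fret), but the handling of voicings that already contain open strings is an assumption: here they are simply left untransposed. The function names are hypothetical.

```python
MAX_FRET = 15


def transpose_voicing(frets, shift):
    """Shift every fretted string by `shift` frets; return None if not allowed.

    A shift is rejected when a fretted string would land on fret 0 (which would
    introduce an open string) or above MAX_FRET. Muted strings (None) stay muted.
    """
    shifted = []
    for fret in frets:
        if fret is None:
            shifted.append(None)
            continue
        new_fret = fret + shift
        if new_fret < 1 or new_fret > MAX_FRET:
            return None
        shifted.append(new_fret)
    return tuple(shifted)


def augment(root_pc, frets):
    """Return the original (root pitch class, voicing) pair plus all allowed transpositions.

    Assumption: voicings that already use open strings are not transposed, because
    moving an open string up the neck changes the hand shape.
    """
    original = (root_pc, tuple(frets))
    fretted = [f for f in frets if f is not None]
    if not fretted or 0 in fretted:
        return [original]
    results = [original]
    for direction in (1, -1):
        shift = direction
        while (voicing := transpose_voicing(frets, shift)) is not None:
            results.append(((root_pc + shift) % 12, voicing))
            shift += direction
    return results
```

With this reading, a barre shape near the nut yields many upward transpositions and few or no downward ones, which is one way the dataset size could grow by the reported factor.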
In addition to our own dual models, we also trained and tested d’Hooge et al.’s MLP model and baseline model [10] as reference points, using the same vocabulary as for our dual models. We ran all combinations of model (Bi-LSTM dual model, DeepGRU dual model, MLP [10], or baseline [10]), dataset (FretboardFlow, augmented FretboardFlow, DadaGP, or augmented DadaGP), loss function (cross-entropy or MSE), and three different history lengths for the dual models (1, 3, or 7). We used five-fold cross-validation, with one fold (20%) reserved for validation and another fold (20%) for testing in each run. Early stopping was triggered if the validation loss did not improve by at least 0.001 within two epochs.
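The evaluation protocol can be sketched in a framework-agnostic way. The fold rotation below (fold r for testing, fold r+1 for validation) and the round-robin assignment of items to folds are assumptions; the text specifies only the 60/20/20 proportions and the early-stopping threshold. All names are hypothetical.

```python
def fold_assignments(n_items, n_folds=5):
    """Yield (train, validation, test) index lists for each cross-validation run."""
    folds = [list(range(i, n_items, n_folds)) for i in range(n_folds)]
    for run in range(n_folds):
        test_fold = run
        val_fold = (run + 1) % n_folds
        train = [
            index
            for k, fold in enumerate(folds)
            if k not in (test_fold, val_fold)
            for index in fold
        ]
        yield train, folds[val_fold], folds[test_fold]


class EarlyStopping:
    """Stop when validation loss has not improved by min_delta for `patience` epochs."""

    def __init__(self, min_delta=0.001, patience=2):
        self.min_delta = min_delta
        self.patience = patience
        self.best = float("inf")
        self.stale_epochs = 0

    def should_stop(self, val_loss):
        if self.best - val_loss >= self.min_delta:
            self.best = val_loss       # meaningful improvement: reset the counter
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience
```

In a training loop, `should_stop` would be called once per epoch with the current validation loss, and the test fold would be touched only after training ends.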
Model | Dataset | History length | Loss type | Loss ↓ | F1 ↑ | F1_P ↑ | F1_SF ↑ | Ease of transition ↑ | Unplayability ↓
Bi-LSTM | FretboardFlow+ | 3 | Cross | 2.80 ±0.01 | 0.13 ±0.00 | 0.72 ±0.01 | 0.56 ±0.01 | 0.06 ±0.00 | 0.47 ±0.01
DeepGRU | FretboardFlow+ | 3 | Cross | 1.20 ±0.19 | 0.21 ±0.00 | 0.75 ±0.05 | 0.25 ±0.00 | 0.06 ±0.01 | 0.53 ±0.04
DeepGRU | FretboardFlow+ | 1 | Cross | 1.04 ±0.04 | 0.20 ±0.00 | 0.80 ±0.01 | 0.24 ±0.00 | 0.08 ±0.00 | 0.46 ±0.02
DeepGRU | FretboardFlow | 1 | Cross | 1.51 ±0.09 | 0.19 ±0.01 | 0.73 ±0.04 | 0.27 ±0.00 | 0.11 ±0.01 | 0.27 ±0.04
MLP | FretboardFlow+ | 1 | Cross | 2.62 ±0.00 | 0.71 ±0.00 | 0.82 ±0.00 | 0.71 ±0.00 | 0.07 ±0.00 | 0.50 ±0.01
Baseline | FretboardFlow+ | 0 | Cross | 2.91 ±0.00 | 0.41 ±0.00 | 0.74 ±0.01 | 0.41 ±0.00 | 0.04 ±0.00 | 0.62 ±0.03
Bi-LSTM | DadaGP+ | 3 | Cross | 2.50 ±0.01 | 0.14 ±0.00 | 0.93 ±0.00 | 0.84 ±0.01 | 0.06 ±0.00 | 0.55 ±0.01
DeepGRU | DadaGP+ | 3 | Cross | 0.49 ±0.00 | 0.27 ±0.00 | 0.93 ±0.00 | 0.30 ±0.00 | 0.06 ±0.00 | 0.53 ±0.00
DeepGRU | DadaGP | 3 | Cross | 0.77 ±0.08 | 0.30 ±0.01 | 0.87 ±0.02 | 0.36 ±0.02 | 0.20 ±0.01 | 0.09 ±0.02
MLP | DadaGP+ | 1 | Cross | 2.58 ±0.00 | 0.75 ±0.00 | 0.86 ±0.00 | 0.75 ±0.00 | 0.06 ±0.00 | 0.62 ±0.00
Baseline | DadaGP+ | 0 | Cross | 2.76 ±0.00 | 0.57 ±0.00 | 0.79 ±0.00 | 0.57 ±0.00 | 0.02 ±0.00 | 0.86 ±0.01
DeepGRU | DadaGP+ | 3 | MSE | 0.01 ±0.00* | 0.22 ±0.00 | 0.92 ±0.00 | 0.25 ±0.01 | 0.06 ±0.00 | 0.53 ±0.00
MLP | DadaGP+ | 1 | MSE | 0.03 ±0.00* | 0.73 ±0.00 | 0.85 ±0.00 | 0.73 ±0.00 | 0.06 ±0.00 | 0.62 ±0.00
Baseline | DadaGP+ | 0 | MSE | 0.03 ±0.00* | 0.52 ±0.00 | 0.77 ±0.00 | 0.52 ±0.00 | 0.01 ±0.00 | 0.89 ±0.02
Table 1. Best configurations of our models compared to prior work and baseline on FretboardFlow and DadaGP. Test loss is our primary metric, while additional metrics (F1_P, F1_SF, unplayability, transition ease) ensure comparability with previous literature. Full results are in the supplementary material. Note: Test losses are not directly comparable across loss types.

The most prominent results of our experiments appear in Table 1, and a complete list is provided in the supplemental material. We consider test loss to be the best evaluation metric for this task, because of the scoring-rule properties mentioned above: only proper scoring rules can fully take into account the natural variation in experts’ voicing preferences. We also include the evaluation suite of d’Hooge et al. [10], however, to facilitate comparison across the literature. This suite includes the following metrics, which together capture both musical accuracy and physical playability: naïve F1 measures exact voicing matches between predicted and reference chord; pitch-class F1 (F1_P) considers whether the chord’s set of pitches (ignoring octaves) aligns with the target chord symbol; string-fret F1 (F1_SF) evaluates correctness on a per-string basis, capturing partial matches if only some string-fret assignments differ; unplayability indicates unplayable voicings, based on the anatomical score from [36]; and ease of transition rates how comfortable it is to move from one chord to the next, based on the chord change metric proposed by [37].
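To make the first three metrics more tangible, the sketch below scores a single predicted voicing. It is a simplified, hypothetical reading: the exact definitions and aggregation follow d'Hooge et al. [10] and may differ, standard tuning is assumed, and the unplayability and ease-of-transition scores of [36, 37] are not reproduced.

```python
# MIDI numbers of the open strings in standard tuning (E2 A2 D3 G3 B3 E4); assumed.
OPEN_STRING_MIDI = (40, 45, 50, 55, 59, 64)


def pitch_classes(frets):
    """Pitch classes (0 = C) sounded by a voicing; None marks a muted string."""
    return {
        (OPEN_STRING_MIDI[string] + fret) % 12
        for string, fret in enumerate(frets)
        if fret is not None
    }


def string_fret_pairs(frets):
    """Set of (string index, fret) pairs for all sounding strings."""
    return {(string, fret) for string, fret in enumerate(frets) if fret is not None}


def f1(predicted, reference):
    """F1 score between two sets."""
    if not predicted and not reference:
        return 1.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def voicing_scores(predicted, reference, target_pitch_classes):
    """Exact match, pitch-class F1 against the chord symbol, and string-fret F1."""
    return {
        "exact": float(tuple(predicted) == tuple(reference)),
        "pitch_class_f1": f1(pitch_classes(predicted), set(target_pitch_classes)),
        "string_fret_f1": f1(string_fret_pairs(predicted), string_fret_pairs(reference)),
    }
```

Predicting a barre G major where the reference is an open G major illustrates why the metrics diverge: the pitch classes match the chord symbol fully, while the exact match fails and only two of the six string-fret pairs coincide.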
6.1 Performance on FretboardFlow
Rows 1–6 of Table 1 present key results on original and augmented FretboardFlow (+). For FretboardFlow+, we compare the best LSTM (history 3) with DeepGRUs of history 1 and 3. The DeepGRU (history 1) on FretboardFlow is included to assess the effect of augmentation. MLP and Baseline serve as references.
All DeepGRU variants achieve substantially lower test loss than the bi-LSTM, MLP, and baseline, highlighting their effectiveness in modeling chord voicings. While the LSTM outperforms the baseline on test loss, it underperforms the MLP, which performs best on F1_P, indicating better pitch-class prediction. GRUs improve on the baseline in F1_P on FretboardFlow+, but not on string-fret F1. The LSTM slightly surpasses the baseline on string-fret F1 but still underperforms the MLP. GRUs, especially with history 1, offer better ease-of-transition and lower unplayability than MLP and Baseline. Only MLP performs strongly on naïve F1, showing its advantage in replicating exact chord shapes.
6.2 Performance on DadaGP
Rows 7–11 of Table 1 show top LSTM and GRU models on DadaGP+, along with GRU on original DadaGP and MLP/Baseline for reference. History 3 performed best for both LSTM and GRU. GRU achieves much lower test loss than LSTM, indicating better overall predictions, while LSTM yields higher F1_SF. LSTM also outperforms MLP and Baseline on most metrics (except naïve F1), suggesting stronger, but not exact, voicing replication. On non-augmented DadaGP, GRU has slightly higher test loss but improved ease-of-transition and lower unplayability, indicating that augmentation improves voicing diversity but may reduce ergonomic feasibility.
The final rows compare MSE-trained GRU, MLP, and Baseline. Though loss values are not directly comparable across loss types, among the MSE-trained models the GRU achieves the lowest test loss and highest F1_P, outperforming MLP and Baseline by ≥0.07. It also shows lower unplayability and competitive ease-of-transition. When compared to cross-entropy, the MSE-based GRU maintains similar pitch-class F1 but performs worse on naïve F1 and F1_SF, suggesting that while MSE captures pitch correctness equally well, it sacrifices precision in exact voicing and string-fret alignment.
7. CONCLUSION AND FUTURE WORK
To predict comfortable chord voicings within progressions on the guitar, we proposed a dual-model architecture with two parallel subnetworks, either Bi-LSTMs or DeepGRUs, that integrate both chord and voicing history. Leveraging a loss function well-suited to the flexible nature of guitar voicings, our approach improves over prior work by utilizing longer-term context and better modeling voicing variation.
We also introduced FretboardFlow, a new dataset of expert-recorded chord progressions featuring natural voicing variation. While the DadaGP dataset yields lower absolute test losses (most likely due to DadaGP’s large scale and lower variation in voicing), our models, particularly those using DeepGRU, achieve significantly lower loss than prior models on both datasets, more effectively capturing the subtleties of voicing variations.
Finally, our codebase includes a re-implementation of the KAMIR algorithm [27] for more general use.
As a publicly released, living dataset, FretboardFlow supports data-driven research into fretboard navigation. Future work will explore integrating ergonomic constraints directly into the objective function, expanding the dataset, and refining long-context modeling.
8. ACKNOWLEDGMENT
We thank Jeanine, Pepijn, Julian, and Anders for their time practicing and recording multiple voicing versions of the songs. This research was supported by the Dutch Research Council (NWO) as part of the project InDeep (NWA.1292.19.399).
9. ETHICS STATEMENT
This study was approved by the Ethics committee of the Faculty of Humanities at the University of Amsterdam (Reference: FGW-2325_2024). Participants were recruited through informal outreach within our personal and professional networks. All participants provided informed consent and received appropriate compensation of €2.50 per 15 minutes, in accordance with the ethics committee guidelines. A data management plan was filed and is available upon request to ensure that the collected data is handled responsibly and in compliance with institutional guidelines.
10. REFERENCES
[1] Ultimate Guitar, “UG Community @ ultimate-guitar.com,”
https://www.ultimate-guitar.com/forum/,
accessed: 2025-02-05.
[2] J. De Souza, “Fretboard transformations,” Journal of
Music Theory, vol. 62, no. 1, pp. 1–39, 2018.
[3] T. Koozin, “Guitar voicing in pop-rock music: A
performance-based analytical approach,” Music Theory
Online, vol. 17, no. 3, 2011.
[4] D. Huron, “Tone and voice: A derivation of the rules of
voice-leading from perceptual principles,” Music
Perception, vol. 19, no. 1, pp. 1–64, 2001.
[5] M. C. E. Baelemans, M. A. Vélez Vásquez, and J. A.
Burgoyne, “Defining playability in musical performance:
Cognitive factors and implications for automated
song difficulty estimation,” in Proceedings of
SysMus23, 2023, pp. 77–79.
[6] G. Hori and S. Sagayama, “Minimax Viterbi algorithm
for HMM-based guitar fingering decision,” in Proceedings
of the 17th International Society for Music Information
Retrieval Conference, 2016, pp. 448–453.
[7] S. Ariga, S. Fukayama, and M. Goto, “Song2guitar:
A difficulty-aware arrangement system for generating
guitar solo covers from polyphonic audio of popular
music,” in Proceedings of the 18th International Society
for Music Information Retrieval Conference, 2017, pp.
568–574.
[8] S. Ariga, M. Goto, and K. Yatani, “Strummer: An
interactive guitar chord practice system,” in Proceedings of
the IEEE International Conference on Multimedia and
Expo, 2017, pp. 1057–1062.
[9] N. d. S. Cunha, A. Subramanian, and D. Herremans,
“Generating guitar solos by integer programming,” Journal
of the Operational Research Society, vol. 69, no. 6,
pp. 971–985, 2018.
[10] A. d’Hooge, L. Bigo, K. Déguernel, and N. Martin,
“Guitar chord diagram suggestion for western popular
music,” in Sound and Music Computing Conference,
2024.
[11] P. Sarmento, A. Kumar, C. J. Carr, Z. Zukowski,
M. Barthet, and Y.-H. Yang, “DadaGP: A dataset of tokenized
GuitarPro songs for sequence models,” in Proceedings
of the 22nd International Society for Music Information
Retrieval Conference, 2021.
[12] M. A. V. Vásquez, M. Baelemans, J. Driedger, W. H.
Zuidema, and J. A. Burgoyne, “Quantifying the ease
of playing song chords on the guitar,” in Proceedings
of the 24th International Society for Music Information
Retrieval Conference, 2023, pp. 725–732.
[13] J. A. Burgoyne, J. Wild, and I. Fujinaga, “An expert
ground truth set for audio chord recognition and music
analysis,” in Proceedings of the 12th International
Society for Music Information Retrieval Conference, vol. 11,
2011, pp. 633–638.
[14] E. Nakamura and K. Yoshii, “Statistical piano reduction
controlling performance difficulty,” APSIPA Transactions
on Signal and Information Processing, vol. 7, no.
e13, 2018.
[15] E. Nakamura, Y. Saito, and K. Yoshii, “Statistical
learning and estimation of piano fingering,” Information
Sciences, vol. 517, pp. 68–85, 2020.
[16] P. Ramoneda, D. Jeong, V. Eremenko, N. C. Tamer,
M. Miron, and X. Serra, “Combining piano performance
dimensions for score difficulty classification,” Expert
Systems with Applications, vol. 238B, no. 121776, 2024.
[17] N. Srivatsan and T. Berg-Kirkpatrick, “Checklist
models for improved output fluency in piano fingering
prediction,” in Proceedings of the 23rd International
Society for Music Information Retrieval Conference, 2022,
pp. 525–531.
[18] W. H. Elashmawi, J. Emad, A. Serag, K. Khaled,
A. Yehia, K. Mohamed, H. Sobeah, and A. Ali, “A
novel approach for improving guitarists’ performance
using motion capture and note frequency recognition,”
Applied Sciences, vol. 13, no. 10, p. 6302, 2023.
[19] G. Hori, “Three-level model for fingering decision of
string instruments,” in Proceedings of the 15th
International Symposium on Computer Music Multidisciplinary
Research, 2021, pp. 93–98.
[20] M. McVicar, R. Santos-Rodríguez, Y. Ni, and T. De Bie,
“Automatic chord estimation from audio: A review of
the state of the art,” IEEE/ACM Transactions on Audio,
Speech, and Language Processing, vol. 22, no. 2, pp.
556–575, 2014.
[21] R. M. Bittner, J. J. Bosch, D. Rubinstein,
G. Meseguer-Brocal, and S. Ewert, “A lightweight
instrument-agnostic model for polyphonic note transcription and
multipitch estimation,” in Proceedings of the IEEE
International Conference on Acoustics, Speech and Signal
Processing, 2022, pp. 781–785.
[22] E. J. Humphrey and J. P. Bello, “From music audio to
chord tablature: Teaching deep convolutional networks
to play guitar,” in Proceedings of the IEEE International
Conference on Acoustics, Speech and Signal Processing,
2014, pp. 6974–6978.
[23] R. J. Sherrod, A guide to the fingering of music for the
guitar. The University of Arizona, 1981.
[24] C. Czerny, Letters to a young lady, on the art of playing
the pianoforte. R. Cocks, 1842, (translated by J. A.
Hamilton).
[25] G. Hori, H. Kameoka, and S. Sagayama, “Input-output
HMM applied to automatic arrangement for guitars,”
Information and Media Technologies, vol. 8, no. 2, pp.
477–484, 2013.
[26] Q. Xi, R. M. Bittner, J. Pauwels, X. Ye, and J. P. Bello,
“GuitarSet: A dataset for guitar transcription,” in
Proceedings of the 19th International Society for Music
Information Retrieval Conference, 2018, pp. 453–460.
[27] T. Prätzlich, R. M. Bittner, A. Liutkus, and M. Müller,
“Kernel additive modeling for interference reduction in
multi-channel music recordings,” in Proceedings of the
IEEE International Conference on Acoustics, Speech
and Signal Processing, 2015, pp. 584–588.
[28] M. Mauch and S. Dixon, “pYIN: A fundamental
frequency estimator using probabilistic threshold
distributions,” in Proceedings of the IEEE International
Conference on Acoustics, Speech and Signal Processing, 2014,
pp. 659–663.
[29] M. Mauch, C. Cannam, R. Bittner, G. Fazekas,
J. Salamon, J. Dai, J. Bello, and S. Dixon, “Computer-aided
melody note transcription using the Tony software:
Accuracy and efficiency,” in Proceedings of the First
International Conference on Technologies for Music Notation
and Representation, 2015.
[30] P. Ramoneda, N. C. Tamer, V. Eremenko, X. Serra,
and M. Miron, “Score difficulty analysis for piano
performance education based on fingering,” in Proceedings
of the IEEE International Conference on Acoustics,
Speech and Signal Processing, 2022, pp. 201–205.
[31] M. Maghoumi and J. J. LaViola, “DeepGRU: Deep
gesture recognition utility,” in Advances in Visual
Computing: 14th International Symposium on Visual
Computing. Berlin: Springer, 2019, pp. 16–31.
[32] H. V. Koops, W. B. de Haas, J. Bransen, and A. Volk,
“Automatic chord label personalization through deep
learning of shared harmonic interval profiles,” Neural
Computing and Applications, vol. 32, no. 4, pp. 929–939,
2020.
[33] B. McFee and J. P. Bello, “Structured training for
large-vocabulary chord recognition,” in Proceedings of the
18th International Society for Music Information
Retrieval Conference, 2017, pp. 188–194.
[34] T. Gneiting and A. E. Raftery, “Strictly proper scoring
rules, prediction, and estimation,” Journal of the
American Statistical Association, vol. 102, no. 477, pp.
359–378, 2007.
[35] M. McVicar, S. Fukayama, and M. Goto, “AutoGuitarTab:
Computer-aided composition of rhythm and
lead guitar parts in the tablature space,” IEEE/ACM
Transactions on Audio, Speech, and Language Processing,
vol. 23, no. 7, pp. 1105–1117, 2015.
[36] K. A. Wortman and N. Smith, “CombinoChord: A guitar
chord generator app,” in Proceedings of the 11th Annual
IEEE Computing and Communication Workshop
and Conference, 2021, pp. 0785–0789.
[37] K. Yazawa, K. Itoyama, and H. G. Okuno, “Automatic
transcription of guitar tablature from audio signals in
accordance with player’s proficiency,” in Proceedings
of the IEEE International Conference on Acoustics,
Speech and Signal Processing, 2014, pp. 3122–3126.