AUTOMATIC MELODY REDUCTION VIA SHORTEST PATH FINDING
Ziyu Wang¹,² Yuxuan Wu¹ Roger B. Dannenberg³ Gus Xia¹
¹Music X Lab, MBZUAI ²New York University ³Carnegie Mellon University
{ziyu.wang, yuxuan.wu, gus.xia}@mbzuai.ac.ae, [email protected]
ABSTRACT
Melody reduction, as an abstract representation of musical compositions, serves not only as a tool for music analysis but also as an intermediate representation for structured music generation. Prior computational theories, such as the Generative Theory of Tonal Music, provide insightful interpretations of music, but they are not fully automatic and usually limited to the classical genre. In this paper, we propose a novel and conceptually simple computational method for melody reduction using a graph-based representation inspired by principles from computational music theories, where the reduction process is formulated as finding the shortest path. We evaluate our algorithm on pop, folk, and classical genres, and experimental results show that the algorithm produces melody reductions that are more faithful to the original melody and more musically coherent than other common melody downsampling methods. As a downstream task, we use melody reductions to generate symbolic music variations. Experiments show that our method achieves higher quality than state-of-the-art style transfer methods.¹
1. INTRODUCTION
Maintaining structural coherence in long-term music generation is a fundamental challenge. One approach to addressing this challenge is through hierarchical models, which rely on extracting high-level abstractions to enable cascaded generative processes [1–4]. These abstractions provide a coarser-grained view of musical structure, capturing essential long-range dependencies. In existing approaches, abstractions are typically explicitly defined (e.g., chord progression or phrase labels) or learned through unsupervised methods (e.g., latent codes via an autoencoder). Yet, so far, they have not been able to capture a fundamental musical structure: the melodic flow (how a melody evolves and resolves within a phrase), which remains too nuanced to be explicitly labeled and too challenging for unsupervised learning to reliably identify.

¹ Music samples of melody reduction and variation can be found at https://auto-melody-reduction.github.io/AMRA-demo/. We release the code at https://github.com/ZZWaang/melody-reduction-algo.

© Z. Wang, Y. Wu, R. Dannenberg, and G. Xia. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Z. Wang, Y. Wu, R. Dannenberg, and G. Xia, “Automatic Melody Reduction via Shortest Path Finding”, in Proc. of the 26th Int. Society for Music Information Retrieval Conf., Daejeon, South Korea, 2025.
From a musicology perspective, melodic flow can be represented through melody reduction, which preserves the structural essence of a melody [5, 6]. However, most existing approaches regard melody reduction as a by-product of analysis, typically represented by hierarchical structures such as trees for further interpretation [7, 8]. In this context, reduction is not a fixed transformation but rather a subjective and demonstrative projection of the analysis procedure. This inherent ambiguity makes melody reduction not only difficult to evaluate but also challenging to use as a practical representation [9–11]. In this work, we explore how melody reduction can be approximated using structural heuristics, aiming to make the concept more accessible and useful for music generation.
To this end, we propose an algorithm for automatic melody reduction. The algorithm uses the graph representation of a melody phrase and regards all possible reductions as graph paths. The intuition behind the algorithm is that if we define a cost function consistent with guiding principles underlying most reduction theories, an ideal reduction should be the path with the least cost. Specifically, we consider two principles. First, the subsequent notes in an ideal melody reduction usually reveal a simple structure, e.g., a prolongation (unison) or a linear progression (step-wise motion) [5]. Second, an ideal melody reduction usually includes notes of higher significance in terms of pitch, rhythm, and harmony [6, 12]. We define edge costs based on these principles and use the shortest-path algorithm to find the melody reduction [13]. The resulting path is subsequently post-processed into an actual melody.
We evaluate the proposed algorithm in pop, folk, and classical music genres, showing that it yields reductions that are often perceived as more faithful to the original melody and more musically coherent compared to other melody downsampling methods. We also introduce variation generation as a downstream application, in which we train a melody generation model conditioned on reductions. The reductions extracted with our algorithm are shown to yield higher-quality variations compared to baselines.
2. RELATED WORK
In this section, we review three realms of related work: 1) cognitive theories about music reduction, 2) the algorithmic implementation of music theories, and 3) the importance of melody reduction in downstream applications.
In the history of cognitive music theory, a shared methodology of music analysis is to use a reduced melody to represent the abstract melodic flow [5, 6, 14]. Schenkerian analysis involves a recursive reduction process to turn a music composition into the fundamental structure [5]; and the Generative Theory of Tonal Music (GTTM) further formalizes the grammar in Schenkerian analysis [6]. These studies highlight that melody reduction is an effective representation in the cognition process, and reduction is highly related to the considerations of note connection, harmonic context, pitch importance, etc., which usually imply tension and relaxation in different music scopes.

[Figure 1: The overview of the proposed melody reduction algorithm. Steps shown: 1. Convert to the graph representation; 2. Find the shortest path; 3. Post-process to melody reduction. The graph panels plot notes x1 to x14 by Onset (Measure) against Pitch over the chords C:maj, G:maj, A:min, E:min, with edges colored by edge cost and the shortest path highlighted.]
There are several attempts to turn these theories into algorithms. Kirlin et al. propose a framework for automatic Schenkerian analysis [15], and Hamanaka et al. design an interactive software to implement GTTM using machine-learning techniques [9, 16, 17]. Other approaches reduce melodies recursively by assigning weights to notes [7, 8]. However, a quality gap remains between automatic analyses and human interpretations. Moreover, the algorithms usually require score-notation level data (e.g., MusicXML format), are genre-specific, and are not open-sourced. A recent computational music analysis points out that since our music preference is hard to express in formal grammar, such formal systems tend to have a broad search space of music analyses [18]. This motivates us to pursue an intuitive alternative: we directly approximate melody reduction based on cognitive preference without building a formal system.
Although melody reduction is mostly studied in an analytical scope, recent advances in deep learning also show that melody reduction representation is beneficial to structured long-term music generation. Previously, melody reduction was usually implicitly modeled by surrogate features such as down-sampled melody statistics [1], melody contour [19], or implicit latent representations [20]. The recent hierarchical music generation methodology shows that using an explicitly defined melody reduction, long-term music generation can be tackled more elegantly and effectively [3]. The algorithm we propose aims to establish a foundation for such future studies.
3. METHODOLOGY
In this section, we introduce the proposed melody reduction algorithm in detail. A diagram of our algorithm is shown in Figure 1. Section 3.1 introduces the data attributes and the graph representation of a melody. Section 3.2 defines the edge types of the graph, and Section 3.3 defines the edge cost. Finally, we discuss the melody reduction post-processing operations in Section 3.4.
Proceedings of the 26th ISMIR Conference, Daejeon, Korea, September 21-25, 2025
3.1 Graph Representation of a Melody
The input of the algorithm is a sequence of notes, denoted by $x_1, \ldots, x_N$, and an underlying chord progression, denoted by $c_1, \ldots, c_K$. A melody can be represented by a directed graph $G(V, E)$, where the melody notes are regarded as graph nodes $V := \{x_i\}_{i=1}^{N}$, and temporal relations of notes can be represented by edges $E := \{x_i \to x_{i+1}\}_{i=1}^{N-1}$.
We consider the onset, pitch, and duration attributes of a note $x_i$. These are denoted by $\mathrm{Onset}(x_i)$, $\mathrm{Pitch}(x_i)$, and $\mathrm{Dur}(x_i)$, respectively. The onset and duration should be quantized by beat locations, and the pitch is represented by MIDI note numbers from 0 to 127. Additionally, a chord is represented by a 12-d binary chroma vector, i.e., $c_i \in \{0, 1\}^{12}$, and we define $\mathrm{Chord}(x_i) \in \{c_1, \ldots, c_K\}$ to indicate chord membership of the note $x_i$. In our algorithm, we heuristically detect anticipation-like cases (a type of non-chord tone) and regard these notes as belonging to the next chord.
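The inputs described above can be held in a small data structure. The following is a minimal sketch, not the authors' released code: the `Note` class, the `chroma` helper, and the example values are hypothetical names and data chosen for illustration, with onsets and durations assumed to be measured in beats and indices 0-based.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    onset: float  # Onset(x_i), in beats (assumed unit)
    pitch: int    # Pitch(x_i), MIDI note number 0..127
    dur: float    # Dur(x_i), in beats (assumed unit)
    chord: int    # index k of the chord c_k the note belongs to


def chroma(pitch_classes):
    """Build a 12-d binary chroma vector from a set of pitch classes."""
    vec = [0] * 12
    for pc in pitch_classes:
        vec[pc % 12] = 1
    return tuple(vec)


def original_edges(notes):
    """E: edges x_i -> x_{i+1} between temporally adjacent notes."""
    return [(i, i + 1) for i in range(len(notes) - 1)]


# Illustrative data only: three notes over a C major and a G major chord.
chords = [chroma([0, 4, 7]), chroma([7, 11, 2])]
notes = [Note(0.0, 64, 1.0, 0), Note(1.0, 62, 1.0, 0), Note(4.0, 62, 2.0, 1)]
edges = original_edges(notes)
```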
3.2 Edge Definition
In the proposed algorithm, we regard both the original melody and reduced melodies as paths from $x_1$ to $x_N$. The original melody uses edges in $E$, whereas a reduction uses shortcut edges. To this end, we define an augmented edge set $E^* := \{x_i \to x_j \mid i < j\}$, representing all causal edges. If an edge $x_i \to x_j$ is selected in the reduction process, it means the melodic movement from $x_i$ to $x_j$ is more significant than all other movements $x_{i'} \to x_{j'}$ inside the time range, i.e., $i \le i' < j' \le j$ and $(i, j) \neq (i', j')$.
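As a small illustration of the augmented edge set, the sketch below enumerates all forward edges for a melody of n notes and checks whether a candidate sequence of note indices is a valid path from the first to the last note. The function names are hypothetical and indices are 0-based here, so the path runs from 0 to n - 1.

```python
from itertools import combinations


def augmented_edges(n):
    """E*: every forward edge (i, j) with i < j among n notes."""
    return list(combinations(range(n), 2))


def is_valid_path(path, n):
    """A reduction candidate starts at note 0, ends at note n - 1,
    and only moves forward in time."""
    return (
        len(path) >= 1
        and path[0] == 0
        and path[-1] == n - 1
        and all(a < b for a, b in zip(path, path[1:]))
    )
```

For n notes the augmented set contains n(n - 1)/2 edges, so the original melody (every adjacent edge) and any reduction (some shortcut edges) are both paths in the same graph.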
We categorize an edge $x_i \to x_j \in E^*$ into six categories. The first three categories are the most fundamental, which correspond to three main ways of melody reduction in Schenkerian analysis: prolongation, linear progression, and arpeggiation [21]. Note that the edges are strictly defined below, and we only borrow the terms for implication:
• Prolongational Edge (PE): $x_j$ prolongs $x_i$ with the same pitch. For example, a PE can potentially remove a neighbor tone. Mathematically, a PE satisfies $\mathrm{Pitch}(x_i) = \mathrm{Pitch}(x_j)$ and $\mathrm{Onset}(x_j) - \mathrm{Onset}(x_i) < D$.
• Linear Edge (LE): the interval between $x_i$ and $x_j$ is a second. For example, an LE can potentially mark a significant melodic movement. Mathematically, an LE satisfies $|\mathrm{Pitch}(x_i) - \mathrm{Pitch}(x_j)| \in \{1, 2\}$ and $\mathrm{Onset}(x_j) - \mathrm{Onset}(x_i) < D$.
• Arpeggiation Edge (AE): the interval between $x_i$ and $x_j$ is larger than a (compound) second and $x_i$ and $x_j$ are within the same chord. For example, an AE can potentially mark an elaboration of harmony. Mathematically, an AE satisfies $|\mathrm{PitchClass}(x_i) - \mathrm{PitchClass}(x_j)| \in \{3, 4, 5, 6, 7, 8, 9\}$ and $\mathrm{Chord}(x_i) = \mathrm{Chord}(x_j)$.
In some melody compositions, pitches that span an octave are also regarded as a smooth connection. In Schenkerian analysis, this is explained by imaginary continuo: although the two tones span an octave in the current realization, they are close in other imaginary realizations. We define two types of imaginary edges accordingly:
• Imaginary Prolongational Edge (IPE): $x_j$ prolongs $x_i$ with the same pitch class. Mathematically, an IPE is not a PE and satisfies $\mathrm{PitchClass}(x_i) = \mathrm{PitchClass}(x_j)$ and $\mathrm{Onset}(x_j) - \mathrm{Onset}(x_i) < D$.
• Imaginary Linear Edge (ILE): the interval between $x_i$ and $x_j$ (or its inversion) is a compound second. Mathematically, an ILE is not an LE and satisfies $|\mathrm{PitchClass}(x_i) - \mathrm{PitchClass}(x_j)| \in \{1, 2, 10, 11\}$ and $\mathrm{Onset}(x_j) - \mathrm{Onset}(x_i) < D$.
Finally, all the rest of the edges in $E^*$ belong to the final category. This is to ensure the graph is connected so that there must exist at least one path from $x_1$ to $x_N$:
• Unclassified Edges (UE): the rest of the edges.
In the above definition, $\mathrm{Pitch}(\cdot)$, $\mathrm{Onset}(\cdot)$, and $\mathrm{Chord}(\cdot)$ are previously defined in Section 3.1. $\mathrm{PitchClass}(x) := \mathrm{Pitch}(x) \bmod 12$, and $\mathrm{Chord}(x_i) = \mathrm{Chord}(x_j)$ if and only if $x_i$ and $x_j$ are within the interval of a single chord. In our experiment, the temporal threshold $D$ is set to 2 measures.
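The six edge categories can be written as a single classification function. This is a minimal sketch under stated assumptions rather than the authors' implementation: `Note` and `classify_edge` are hypothetical names, onsets are assumed to be in beats, and the threshold D of 2 measures is assumed to equal 8 beats (4/4 time).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    onset: float  # in beats (assumed unit)
    pitch: int    # MIDI note number
    chord: int    # index of the chord interval containing the note


def classify_edge(xi, xj, D=8.0):
    """Return the edge type of xi -> xj: PE, LE, AE, IPE, ILE, or UE.

    D = 8.0 beats assumes 2 measures of 4/4.
    """
    near = (xj.onset - xi.onset) < D
    dp = abs(xi.pitch - xj.pitch)              # interval in semitones
    dpc = abs(xi.pitch % 12 - xj.pitch % 12)   # pitch-class difference
    if near and dp == 0:
        return "PE"
    if near and dp in (1, 2):
        return "LE"
    if 3 <= dpc <= 9 and xi.chord == xj.chord:
        return "AE"
    if near and dpc == 0:             # same pitch class, but not a PE
        return "IPE"
    if near and dpc in (1, 2, 10, 11):  # compound second, but not an LE
        return "ILE"
    return "UE"
```

Because PE and LE are tested first, the later pitch-class checks automatically satisfy the "is not a PE" and "is not an LE" conditions of the imaginary edges.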
3.3 Edge Cost Definition
We define the edge cost function of $x_i \to x_j$ so that a more significant edge will have a smaller cost. The edge cost function considers three aspects: 1) the function of different edge types, 2) the temporal distance of an edge, and 3) the note importance.²
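First, we define the tonal cost, denoted by $c_{\mathrm{tonal}}(x_i \to x_j)$, prioritizing prolongational and linear edges in the melody reduction. Formally,
$$
c_{\mathrm{tonal}}(x_i \to x_j) :=
\begin{cases}
0.1, & \text{if } x_i \to x_j \text{ is a PE}, \\
0.3, & \text{if } x_i \to x_j \text{ is an LE}, \\
1.5, & \text{if } x_i \to x_j \text{ is an AE}, \\
1.0, & \text{if } x_i \to x_j \text{ is an IPE}, \\
1.3, & \text{if } x_i \to x_j \text{ is an ILE}, \\
3.0, & \text{if } x_i \to x_j \text{ is a UE}.
\end{cases}
\tag{1}
$$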
Then, we define the temporal cost, denoted by $c_{\mathrm{temp}}(x_i \to x_j)$, to measure the distance from index $i$ to index $j$. We set the hyperparameter $\eta = 1.6$ to achieve an ideal degree of reduction. A larger $\eta$ results in too little reduction and a smaller $\eta$ makes the reduction too coarse:
$$
c_{\mathrm{temp}}(x_i \to x_j) := (j - i)^{\eta}. \tag{2}
$$
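The tonal and temporal costs of Eqs. (1) and (2) translate directly into a lookup table and a power function. The sketch below is illustrative only; the names `TONAL_COST`, `temporal_cost`, and `unweighted_edge_cost` are hypothetical, and the note importance factor is deliberately left out.

```python
# Tonal cost per edge type, Eq. (1).
TONAL_COST = {"PE": 0.1, "LE": 0.3, "AE": 1.5, "IPE": 1.0, "ILE": 1.3, "UE": 3.0}

ETA = 1.6  # temporal exponent, Eq. (2)


def temporal_cost(i, j, eta=ETA):
    """c_temp(x_i -> x_j) = (j - i) ** eta, with i < j note indices."""
    return float(j - i) ** eta


def unweighted_edge_cost(i, j, edge_type, eta=ETA):
    """Sum of temporal and tonal cost, before note importance is applied."""
    return temporal_cost(i, j, eta) + TONAL_COST[edge_type]
```

With η = 1.6, an edge that skips one note has temporal cost 2^1.6 ≈ 3.03, compared with a total of 2 for two adjacent edges, so a shortcut is only selected when its tonal cost and the importance of its target note make up the difference.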
Besides the two cost functions on edges, we introduce a note importance factor, denoted by $\alpha(x_i)$, to ensure structurally important notes are more likely to be selected. Particularly, $\alpha(x_i)$ is a product of four terms:
$$
\alpha(x) := \alpha_p(x)\,\alpha_o(x)\,\alpha_d(x)\,\alpha_h(x), \tag{3}
$$
where $\alpha_p(x_i)$ denotes pitch importance, $\alpha_o(x_i)$ denotes onset importance, $\alpha_d(x_i)$ denotes duration importance, and $\alpha_h(x_i)$ denotes harmony importance.

² Currently, the costs are empirically specified based on domain knowledge and preliminary analysis. Estimating them from data is left for future work.
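1. Pitch Importance. Higher and lower pitches are usually more significant in a melody and should be given a smaller weight factor:
$$
\alpha_p(x_i) := 0.1 \times \left(0.5 - \frac{|\mathrm{Pitch}(x_i) - p_{\mathrm{mid}}|}{p_{\max} - p_{\mathrm{mid}}}\right) + 1, \tag{4}
$$
where $p_{\max}$ and $p_{\min}$ are maximum and minimum pitch values and $p_{\mathrm{mid}} = (p_{\max} + p_{\min})/2$.
2. Onset Importance. The notes having higher metrical importance should be given a smaller weight factor:
$$
\alpha_o(x_i) :=
\begin{cases}
0.85, & \mathrm{Onset}(x_i) \in \mathrm{DB}, \\
0.95, & \mathrm{Onset}(x_i) \in \mathrm{B}, \\
1.05, & \mathrm{Onset}(x_i) \in \mathrm{B}/2, \\
1.15, & \mathrm{Onset}(x_i) \in \mathrm{B}/4.
\end{cases}
\tag{5}
$$
Here, DB, B, B/2, and B/4 represent downbeat, beat, eighth-note, and sixteenth-note positions, respectively (if under the 4/4 time signature).
3. Duration Importance. Longer notes should be given a smaller weight factor:
$$
\alpha_d(x_i) :=
\begin{cases}
0.85, & \mathrm{Dur}(x_i) \ge \text{half note}, \\
0.95, & \mathrm{Dur}(x_i) \ge \text{quarter note}, \\
1.05, & \mathrm{Dur}(x_i) \ge \text{8th note}, \\
1.15, & \mathrm{Dur}(x_i) \ge \text{16th note}.
\end{cases}
\tag{6}
$$
4. Harmony Importance. A chord tone should be given a smaller weight factor than non-chord tones:
$$
\alpha_h(x_i) :=
\begin{cases}
0.85, & x_i \text{ is a chord tone}, \\
1.15, & \text{otherwise}.
\end{cases}
\tag{7}
$$
Here, "$x_i$ is a chord tone" strictly means $\mathrm{PitchClass}(x_i)$ is in $\mathrm{Chord}(x_i)$. So, an anticipation is regarded as a chord tone of the next chord (see Section 3.1).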
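The four importance terms can be sketched as small functions. This is an illustrative sketch under several assumptions, not the authors' code: all function names are hypothetical; onsets and durations are in beats with a quarter-note beat in 4/4; overlapping cases are resolved by taking the first match from the top; and the pitch term follows one reading of Eq. (4), namely 0.1 × (0.5 − |Pitch − p_mid| / (p_max − p_mid)) + 1, with an added guard for melodies that contain a single pitch.

```python
def alpha_p(pitch, p_min, p_max):
    """Pitch importance under the assumed reading of Eq. (4):
    0.95 at the extremes, 1.05 at the middle of the range."""
    p_mid = (p_max + p_min) / 2
    span = p_max - p_mid
    # Guard for a single-pitch melody (an assumption, not in the text).
    ratio = abs(pitch - p_mid) / span if span > 0 else 0.0
    return 0.1 * (0.5 - ratio) + 1.0


def alpha_o(onset, beats_per_bar=4):
    """Onset importance, Eq. (5); onset in beats, 4/4 assumed."""
    if onset % beats_per_bar == 0:
        return 0.85  # downbeat
    if onset % 1 == 0:
        return 0.95  # beat
    if onset % 0.5 == 0:
        return 1.05  # eighth-note position
    return 1.15      # sixteenth-note position


def alpha_d(dur):
    """Duration importance, Eq. (6); dur in beats, first match wins."""
    if dur >= 2:
        return 0.85  # half note or longer
    if dur >= 1:
        return 0.95  # quarter note
    if dur >= 0.5:
        return 1.05  # eighth note
    return 1.15      # sixteenth note


def alpha_h(pitch, chord_chroma):
    """Harmony importance, Eq. (7); chord_chroma is a 12-d binary vector."""
    return 0.85 if chord_chroma[pitch % 12] else 1.15


def alpha(pitch, onset, dur, chord_chroma, p_min, p_max):
    """Note importance factor, Eq. (3): product of the four terms."""
    return (alpha_p(pitch, p_min, p_max) * alpha_o(onset)
            * alpha_d(dur) * alpha_h(pitch, chord_chroma))
```

Since every term lies between 0.85 and 1.15, a structurally strong note (highest pitch, downbeat, long, chord tone) scales the cost of its incoming edges down to roughly 0.58, which makes edges into that note cheaper.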
Finally, the total edge cost is defined as a summation of tonal and temporal cost, modulated by the note importance factor:
$$
c(x_i \to x_j) = \alpha(x_j)\,\bigl[c_{\mathrm{temp}}(x_i \to x_j) + c_{\mathrm{tonal}}(x_i \to x_j)\bigr]. \tag{8}
$$
Thus, the melody reduction can be achieved by running a shortest-path algorithm to find the shortest path from $x_1$ to $x_N$.
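Because every edge in the augmented set points forward in time, the graph is acyclic, and the shortest path can be found with a simple dynamic program over note indices. The sketch below uses that swapped-in technique; it is not necessarily the routine used in the released code. The `cost` argument is a hypothetical callable returning the total edge cost of Eq. (8) for 0-based indices i < j.

```python
def shortest_path(n, cost):
    """Least-cost path from note 0 to note n - 1 over all forward edges.

    cost(i, j) must return the edge cost for i < j. Returns (path, total).
    Runs in O(n^2) time, visiting each edge of E* exactly once.
    """
    best = [float("inf")] * n   # best[j]: least cost of reaching note j
    prev = [None] * n           # prev[j]: predecessor of j on that path
    best[0] = 0.0
    for j in range(1, n):
        for i in range(j):
            candidate = best[i] + cost(i, j)
            if candidate < best[j]:
                best[j], prev[j] = candidate, i
    # Walk the predecessor links back from the last note.
    path = [n - 1]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    return path[::-1], best[n - 1]
```

As a toy check with purely temporal costs on four notes, `shortest_path(4, lambda i, j: (j - i) ** 2)` keeps every note, while `shortest_path(4, lambda i, j: (j - i) ** 0.5)` jumps straight from the first note to the last, which mirrors the stated effect of a larger or smaller η.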
3.4 Post-Processing
After we find the shortest path, we use a rule-based post-processing method to arrange the selected notes in the path into a melody reduction. The maximum resolution of the reduction is a quarter note, in the style of a fifth-species counterpoint [22].
[Figure 2: An illustration of post-processing operations. The figure contains a table of candidate rhythmic patterns, indexed by the number of notes in a chord and the chord duration (beat), with some combinations marked N/A, and a score example of the post-processing steps over the underlying chords: 1. An output shortest path; 2. Remove prolongations inside a chord; 3. Apply rhythmic patterns; 4. Add suspensions to prolongations.]
Figure 2 shows the detailed procedure. First, the nodes in the shortest path are allocated into chord bins, with each bin corresponding to a distinct chord. In each chord bin, the notes are given a fixed rhythm template (see the table at the bottom of Figure 2), ensuring the notes within a bin collectively span the entire duration of their associated chord. In this process, notes linked by a prolongational edge are merged into a single note. If the number of nodes in a chord bin exceeds the length of the chord, a random selection of notes will be omitted. Finally, the prolongational edges between two chords are marked with suspension. Note that notes serving as anticipations are allocated to the bin of the subsequent chord.
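The binning, merging, and omission steps can be sketched as follows. This is a rough illustration under assumptions, not the authors' procedure: `bin_and_prune` is a hypothetical name; path notes are given as (pitch, chord index) pairs in temporal order; consecutive equal pitches inside a bin stand in for notes linked by a prolongational edge; the chord length in beats is used as the bin capacity because the maximum resolution is a quarter note; and the rhythm templates and suspension marking from Figure 2 are not reproduced.

```python
import random
from itertools import groupby


def bin_and_prune(path_notes, chord_len_beats, rng=None):
    """Allocate path notes to chord bins, merge repeated pitches, and
    randomly omit notes when a bin holds more notes than beats.

    path_notes: list of (pitch, chord_index) in temporal order.
    chord_len_beats: mapping from chord_index to chord length in beats.
    """
    rng = rng or random.Random(0)  # seeded for reproducibility
    bins = {}
    for chord_idx, group in groupby(path_notes, key=lambda note: note[1]):
        pitches = [pitch for pitch, _ in group]
        # Merge runs of the same pitch into a single note.
        merged = [pitch for pitch, _ in groupby(pitches)]
        capacity = int(chord_len_beats[chord_idx])
        if len(merged) > capacity:
            keep = sorted(rng.sample(range(len(merged)), capacity))
            merged = [merged[k] for k in keep]  # temporal order preserved
        bins[chord_idx] = merged
    return bins
```

For example, `bin_and_prune([(60, 0), (60, 0), (62, 0), (64, 1)], {0: 4, 1: 4})` yields one bin with pitches 60 and 62 and a second bin with pitch 64.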
4. EXPERIMENTS
In Section 4.1, we evaluate the proposed algorithm through a subjective listening test. In Section 4.2, we show and evaluate a melody reduction example as a case study. Finally, we evaluate the effectiveness of melody reduction in downstream music generation tasks in Section 4.3.
4.1 Subjective Evaluation of Melody Reduction
Unlike tasks with clear ground truths, melody reduction is inherently subjective and style-dependent. Existing theories, such as GTTM or Schenkerian analysis, provide interpretive hierarchies rather than prescriptive outcomes [6, 21]. Finding a reduction typically involves pruning a tree at variable depths, often informed by human judgment. Moreover, such theories are primarily suited to classical music.
[Figure 3: Subjective evaluation results of melody reduction quality across three genres. Panels: (a) Pop Genre; (b) Folk Genre; (c) Classical Genre.]
Given these challenges, we adopt a subjective listening test to better capture the perceptual and musical quality of melody reductions. We cover three music genres: pop, folk, and classical. We sample melodies from the POP909 dataset [23], the Nottingham dataset [24], and the GTTM database [25] for the pop, folk, and classical genres, respectively. We compare with two representative baselines commonly used for melody reduction as feature extraction in music generation:
• Downsampling on Observations (DS-OBS): From a statistical perspective, the melody is downsampled to a sequence of half notes, each representing the most common pitch in the 2-beat music segment [1].
• Downsampling in Latent Space (DS-LS): EC2-VAE [26] learns disentangled latent representations of the pitch contour and rhythmic pattern of 2-measure music segments as $z_p$ and $z_r$, respectively, which enables downsampling in the latent space of rhythm patterns. Specifically, we encode the pitch contour $z_p$ of the data and decode it together with a downsampled rhythm $z_r$ to get the melody reduction of every 2-measure segment.
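The DS-OBS baseline is simple enough to sketch directly. The code below is an illustrative assumption of how such a downsampler might look, not the baseline's actual implementation: `ds_obs` is a hypothetical name, the melody is assumed to be given as one MIDI pitch per sixteenth-note step (with `None` for rests), and "most common" is assumed to mean the pitch occupying the most steps within each 2-beat segment.

```python
from collections import Counter


def ds_obs(pitch_grid, steps_per_segment=8):
    """Downsample a melody to one pitch per 2-beat segment.

    pitch_grid: MIDI pitch per sixteenth-note step, None for a rest.
    steps_per_segment: 8 sixteenth steps = 2 beats (assumed grid).
    Returns one pitch (or None for an all-rest segment) per segment.
    """
    reduction = []
    for start in range(0, len(pitch_grid), steps_per_segment):
        segment = pitch_grid[start:start + steps_per_segment]
        sounding = [p for p in segment if p is not None]
        if sounding:
            reduction.append(Counter(sounding).most_common(1)[0][0])
        else:
            reduction.append(None)
    return reduction
```

Because this rule only counts how long each pitch sounds, a short but metrically and harmonically important note can be outvoted by a longer neighbor within the same segment.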
For the subjective test, we randomly select four 8-measure melodies from each genre. Each participant listens to at least 3 groups of melody reductions for each genre. In each group, participants are presented with the original melody first, followed by the melody reductions generated by the proposed algorithm, downsampling, and latent representation recombination in a randomized order. Participants are asked to rate the quality of the melody reduction on a 5-point Likert scale, where 1 indicates the worst quality and 5 indicates the best, in terms of three criteria: (1) Melody Faithfulness: how well the melody reduction preserves the original music information. (2) Harmonic Coherency: how well the melody reduction fits the underlying chord progression. (3) Overall Musicality: the overall music quality of the melody reduction.
A total of 45 subjects (26 females and 19 males) participated in the survey, of whom over 70% have music education experience of at least 2 years. The results are reported in Figure 3, where the heights of bars represent means of the ratings and the error bars represent the standard error computed by within-subject ANOVA [27]. The proposed algorithm is significantly preferred over both baselines in all genres and criteria (p < 0.05), except for melody faithfulness in the pop genre, where the difference shows a positive but marginal trend (p < 0.075).

[Figure 4: Comparison of the original melody, melody reductions from the proposed method, and the baselines. We highlight the phrases in the top row. Rows: Original, Ours, DS-OBS, DS-LS. Highlighted phrases: Phrase A, Phrase B, Phrase C.]
4.2 A Case Study of Melody Reduction
We provide a case analysis of melody reduction comparing the proposed algorithm and the baselines introduced in Section 4.1, as shown in Figure 4. The original melody is shown in the top row, followed by the three melody reductions generated by the proposed algorithm and baselines.
Both the proposed method and DS-OBS can mostly capture the correct melody flow, such as important passing tones like C4 in the first measure. In subtle situations such as the second measure, where Phrase A ends its downward music flow with the downbeat chord tone B♭3 and lingers at F4 until the transition to Phrase B, DS-OBS fails to preserve B♭3, as it is overshadowed by the long duration of F4. In contrast, the proposed algorithm successfully captures B♭3 by paying attention to its harmonic and rhythmic importance, as well as the imaginary prolongational edge between B♭3 and B♭4 in the third measure. DS-LS only captures the pitch contour but introduces several unwanted non-chord tones. This example demonstrates the effectiveness of the proposed algorithm for melody reduction.
[Figure 5: Comparison of variations of an example melody. Here Diff. denotes the conditional diffusion model trained to generate melody variations from melody reductions. Positive comments are highlighted in red, negatives in blue. Rows and annotations: DS-OBS + Diff. (novel ideas about rhythm, flat pitch variations, abrupt non-scale tone); Ours + Diff. (novel ideas about pitch and rhythm, faithful melody flow); EC2-VAE Sampling (unfaithful melody flow, few pitch and rhythm variations).]
4.3 Downstream Task: Generating Melody Variations
We believe melody reduction can serve as a useful representation of structural information in downstream tasks. In this section, we demonstrate one such application in a melody variation generation task. The task uses the reduction of a melody as input, and outputs variations faithful to the original melody. While we do not claim a strong causal link between reduction quality and generation quality, our intuition is that an accurate reduction better reflects the underlying melodic and harmonic context, which in turn supports more coherent and musically grounded generation.
To this end, we train a diffusion model to generate melody variations from melody reductions provided by the proposed algorithm. We use a similar model design and training settings as the leadsheet generation model in [3] and train the model on the POP909 dataset. Similarly, we train the model using melody reductions by DS-OBS. We also generate melody variations by sampling in the latent space of $z_p$ and $z_r$ of EC2-VAE for comparison. For all methods, we randomly sample four outputs per input and select the most representative one to use in listening tests.
Figure 5 shows a group of melody variation examples. It can be seen that the variation model trained with the proposed melody reductions not only maintains the original melody flow but also introduces novel ideas in pitch and rhythm. The model trained with DS-OBS also preserves the pitch contour, but tends to have flat pitch variations. The variation generated by sampling from the latent space of EC2-VAE changes the original melody flow in an unwanted way, and does not introduce rich variations.
We evaluate the melody variations on the test set of POP909 using a subjective listening test with the same participants as in Section 4.1. Each participant listens to at least three groups of melody variations, where participants are first presented with the original melody, followed by the three variations in random order. Participants are asked to rate the quality of the melody variations and of the original melody by human composers on three criteria: Naturalness, Creativity, and Musicality [28]. The results are reported in Figure 6, with the same computation as in Section 4.1, including mean ratings and statistical significance tests. Our method is consistently preferred over the two baselines in terms of creativity and overall musicality (p < 0.05), and remains competitive in naturalness.

[Figure 6: Subjective results of melody variations.]
5. CONCLUSION
To sum up, this work proposed a novel and useful algorithm for melody reduction, filling the gap between the need to capture melody flow for long-term and hierarchical music generation and the lack of all-genre off-the-shelf tools for melody reduction. The proposed algorithm finds the optimal melody reduction by finding the shortest path in a graph representation of the melody, considering the tonal, temporal, and note importance factors. Subjective experiments demonstrated that our method outperforms baselines in a variety of musical styles. We also demonstrated the effectiveness of the melody reduction algorithm in melody variation generation through subjective evaluation. In the future, we plan to tackle reduction that captures latent polyphony and hierarchical structure, and explore the application of the proposed algorithm in a broader range of music generation tasks. While the current algorithm contains ad-hoc parameters, future work could also explore learning these directly from data.
6. REFERENCES
[1] S. Dai, Z. Jin, C. Gomes, and R. B. Dannenberg, “Controllable deep melody generation via hierarchical music structure representation,” in Proceedings of the 22nd International Society for Music Information Retrieval Conference, ISMIR 2021, Online, November 7-12, 2021, J. H. Lee, A. Lerch, Z. Duan, J. Nam, P. Rao, P. van Kranenburg, and A. Srinivasamurthy, Eds., 2021, pp. 143–150. [Online]. Available: https://archives.ismir.net/ismir2021/paper/000017.pdf
[2] S. Wei, G. Xia, Y. Zhang, L. Lin, and W. Gao, “Music phrase inpainting using long-term representation and contrastive loss,” in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022, Virtual and Singapore, 23-27 May 2022. IEEE, 2022, pp. 186–190. [Online]. Available: https://doi.org/10.1109/ICASSP43922.2022.9747817
[3] Z. Wang, L. Min, and G. Xia, “Whole-song hierarchical generation of symbolic music using cascaded diffusion models,” in The Twelfth International Conference on Learning Representations, 2024. [Online]. Available: https://openreview.net/forum?id=sn7CYWyavh
[4] P. Dhariwal, H. Jun, C. Payne, J. W. Kim, A. Radford, and I. Sutskever, “Jukebox: A generative model for music,” CoRR, vol. abs/2005.00341, 2020. [Online]. Available: https://arxiv.org/abs/2005.00341
[5] H. Schenker, Free Composition (Der freie Satz). New York: Longman, 1979, translated and edited by Ernst Oster.
[6] F. Lerdahl and R. S. Jackendoff, A Generative Theory of Tonal Music, reissue, with a new preface. MIT Press, 1996.
[7] N. Orio and A. Rodà, “A measure of melodic similarity based on a graph representation of the music structure,” in Proceedings of the 10th International Society for Music Information Retrieval Conference, ISMIR 2009, Kobe International Conference Center, Kobe, Japan, October 26-30, 2009, K. Hirata, G. Tzanetakis, and K. Yoshii, Eds. International Society for Music Information Retrieval, 2009, pp. 543–548. [Online]. Available: http://ismir2009.ismir.net/proceedings/OS7-1.pdf
[8] F. Simonetta, F. Carnovalini, N. Orio, and A. Rodà, “Symbolic music similarity through a graph-based representation,” in Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion, Wrexham, United Kingdom, September 12-14, 2018, S. Cunningham and R. Picking, Eds. ACM, 2018, pp. 26:1–26:7. [Online]. Available: https://doi.org/10.1145/3243274.3243301
[9] S. Tojo, K. Hirata, and M. Hamanaka, “Computational reconstruction of cognitive music theory,” New Gener. Comput., vol. 31, no. 2, pp. 89–113, 2013. [Online]. Available: https://doi.org/10.1007/s00354-013-0202-7
[10] R. Groves, “Automatic melodic reduction using a supervised probabilistic context-free grammar,” in Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR 2016, New York City, United States, August 7-11, 2016, M. I. Mandel, J. Devaney, D. Turnbull, and G. Tzanetakis, Eds., 2016, pp. 775–781. [Online]. Available: https://wp.nyu.edu/ismir2016/wp-content/uploads/sites/2294/2016/07/274_Paper.pdf
[11] S. Ni-Hahn, W. Xu, Z. Yin, R. Zhu, S. Mak, Y. Jiang, and C. Rudin, “A new dataset, notation software, and representation for computational schenkerian analysis,” in Proceedings of the 25th International Society for Music Information Retrieval Conference, ISMIR 2024, San Francisco, California, USA and Online, November 10-14, 2024, B. Kaneshiro, G. J. Mysore, O. Nieto, C. Donahue, C. A. Huang, J. H. Lee, B. McFee, and M. C. McCallum, Eds., 2024, pp. 866–873. [Online]. Available: https://doi.org/10.5281/zenodo.14877467
[12] S. Ahlbäck, “Melody beyond notes: A study of melody cognition,” Ph.D. dissertation, Göteborgs universitet, 2004.
[13] J. Y. Yen, “Finding the k shortest loopless paths in a network,” Management Science, vol. 17, no. 11, pp. 712–716, 1971.
[14] E. Narmour, The analysis and cognition of basic melodic structures: The implication-realization model. University of Chicago Press, 1990.
[15] P. B. Kirlin and P. E. Utgoff, “A framework for automated schenkerian analysis,” in ISMIR 2008, 9th International Conference on Music Information Retrieval, Drexel University, Philadelphia, PA, USA, September 14-18, 2008, J. P. Bello, E. Chew, and D. Turnbull, Eds., 2008, pp. 363–368. [Online]. Available: http://ismir2008.ismir.net/papers/ISMIR2008_229.pdf
[16] M. Hamanaka, K. Hirata, and S. Tojo, “σGTTM III: Learning-based time-span tree generator based on pcfg,” in International Symposium on Computer Music Multidisciplinary Research. Springer, 2015, pp. 387–404.
[17] M. Hamanaka, K. Hirata, and S. Tojo, “deepGTTM-II: Automatic generation of metrical structure based on deep learning technique,” in 13th Sound and Music Conference, 2016, pp. 221–249.
[18] C. Finkensiep and M. Rohrmeier, “Modeling and inferring proto-voice structure in free polyphony,” in Proceedings of the 22nd International Society for Music Information Retrieval Conference, ISMIR 2021, Online, November 7-12, 2021, J. H. Lee, A. Lerch, Z. Duan, J. Nam, P. Rao, P. van Kranenburg, and A. Srinivasamurthy, Eds., 2021, pp. 189–196. [Online]. Available: https://archives.ismir.net/ismir2021/paper/000023.pdf
[19] K. Chen, C. Wang, T. Berg-Kirkpatrick, and S. Dubnov, “Music sketchnet: Controllable music generation via factorized representations of pitch and rhythm,” in Proceedings of the 21st International Society for Music Information Retrieval Conference, ISMIR 2020, Montreal, Canada, October 11-16, 2020, J. Cumming, J. H. Lee, B. McFee, M. Schedl, J. Devaney, C. McKay, E. Zangerle, and T. de Reuse, Eds., 2020, pp. 77–84. [Online]. Available: http://archives.ismir.net/ismir2020/paper/000146.pdf
[20] D. von Rütte, L. Biggio, Y. Kilcher, and T. Hofmann, “FIGARO: controllable music generation using learned and expert features,” in The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023. [Online]. Available: https://openreview.net/pdf?id=NyR8OZFHw6i
[21] A. C. Cadwallader, D. Gagné, and F. Samarotto, Analysis of tonal music: a schenkerian approach, 1998.
[22] M. Clementi, C. Tausig, and K. F. Weitzmann, Gradus ad parnassum. Peters, 2010.
[23] Z. Wang, K. Chen, J. Jiang, Y. Zhang, M. Xu, S. Dai, G. Bin, and G. Xia, “Pop909: A pop-song dataset for music arrangement generation,” in Proceedings of 21st International Conference on Music Information Retrieval, ISMIR, 2020.
[24] E. Foxley, “Nottingham database,” 2011.
[25] M. Hamanaka, “Gttm database,” https://gttm.jp/gttm/database/, 2009.
[26] R. Yang, D. Wang, Z. Wang, T. Chen, J. Jiang, and G. Xia, “Deep music analogy via latent representation disentanglement,” in Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR 2019, Delft, The Netherlands, November 4-8, 2019, A. Flexer, G. Peeters, J. Urbano, and A. Volk, Eds., 2019, pp. 596–603. [Online]. Available: http://archives.ismir.net/ismir2019/paper/000072.pdf
[27] H. Scheffe, The analysis of variance. John Wiley & Sons, 1999, vol. 72.
[28] H. Chu, J. Kim, S. Kim, H. Lim, H. Lee, S. Jin, J. Lee, T. Kim, and S. Ko, “An empirical study on how people perceive ai-generated music,” in Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022, pp. 304–314.