
RISE: Music Rearrangement for Realtime Intensity Synchronization With Exercise

Author: Alexander Wang; Chris Donahue; Dhruv Jain
Publisher: Zenodo
DOI: 10.5281/zenodo.17706321
Source: https://zenodo.org/records/17706321/files/000002.pdf
RISE: ADAPTIVE MUSIC PLAYBACK FOR REALTIME INTENSITY
SYNCHRONIZATION WITH EXERCISE
Alexander Wang
University of Michigan
Chris Donahue
Carnegie Mellon University
Dhruv Jain
University of Michigan
ABSTRACT
We propose a system to adapt a user’s music to their exercise by aligning high-energy music segments with intense intervals of the workout. Listening to music during exercise can boost motivation and performance. However, the structure of the music may be different from the user’s natural phases of rest and work, causing users to rest longer than needed while waiting for a motivational section, or lose motivation mid-work if the section ends too soon. To address this, our system, called RISE, automatically estimates the intense segments in music and uses cutpoint-based music rearrangement techniques to dynamically extend and shorten different segments of the user’s song to fit the ongoing exercise routine. Our system takes as input the rest and work durations to guide adaptation. Currently, this is determined either via a pre-defined plan or manual input during the workout. We evaluated RISE with 12 participants who compared our system to a non-adaptive music baseline while exercising in our lab. Participants found our rearrangements seamless, intensity estimation accurate, and many recalled moments when intensity alignment helped them push through their workout.
1. INTRODUCTION
The right music can transform an ordinary workout into a more engaging and motivating experience. Research has shown that listening to music while working out can improve physical performance and reduce perceived exertion [1]. Consequently, many commercial applications offer music recommendations to fit the user’s exercise routines [2, 3]. However, current applications do not consider musical structure, treating songs as uniform in intensity. This can lead to suboptimal scenarios, such as a brief calm section of an otherwise high-energy song playing just as the listener begins weightlifting. Conventional music playback remains static and unresponsive to the listener’s dynamic behavior and context, highlighting the need for adaptive music systems that customize the listening experience in real time.
© A. Wang, C. Donahue, and D. Jain. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: A. Wang, C. Donahue, and D. Jain, “RISE: Adaptive Music Playback for Realtime Intensity Synchronization with Exercise”, in Proc. of the 26th Int. Society for Music Information Retrieval Conf., Daejeon, South Korea, 2025.
We propose RISE, a real-time intensity synchronization system that adapts the playback of user-selected music to better align with exercise plans (Figure 1). RISE integrates several Music Information Retrieval (MIR) techniques to segment songs, estimate segment intensity, and identify cutpoints, enabling seamless looping and skipping to align segment durations with user exercise. RISE operates in two scenarios: in the guided scenario, it adapts a music recording to fit a predefined workout plan, while in the unguided scenario, it adapts in real-time to spontaneous exercise changes, offering flexibility at the potential cost of adaptation quality. In this work, we focus on the music analysis and adaptive playback, assuming that the ground truth exercise state is given as input to the system.
We validate our solution through a two-pronged evaluation. First, we assessed the seamlessness of our system’s adaptations by generating excerpts from 270 songs and conducting an in-lab study in which participants rated transition naturalness. Results showed that our system’s adaptations were perceived as seamless, scoring similarly to unmodified excerpts. Second, in a controlled user study, 12 participants compared exercising with adaptive music to nonadaptive music. Participants found our adaptations seamless and intensity segmentation accurate. They appreciated how the alignment of music with workout enhanced motivation and overall experience, particularly when extending high-energy segments helped them push through.
In summary, our work makes two primary contributions. First, we propose a novel music adaptation system for exercises by applying several existing MIR techniques. Second, we offer insights from two user studies demonstrating the utility of music adaptation for exercise. Moreover, as the first to explore cutpoint-based rearrangement techniques for adapting music to exercise, our work lays the foundation for future research in applying adaptive music systems to scenarios beyond exercise.¹
¹ Supplemental video: https://youtu.be/XZLB p 6Lgg.
Figure 1. We present RISE, a real-time music intensity synchronization system for exercise. Our system takes in user music and workout phases as input and aligns the high-intensity segments of the music with the user’s exertion phases (and vice versa), to enhance the workout experience.
2. RELATED WORK
Context-aware music systems enrich user experience by dynamically adapting music based on usage contexts, a focus of MIR research emphasizing user-centered applications [4–6]. For example, prior work has explored adaptive music for enhancing driving experiences [7, 8], notification experiences [9, 10], and narrative experiences [11, 12]. Given the well-documented benefits of music in exercise contexts [1], prior work has also explored music adaptations for workouts, including song recommendation [2, 13, 14] and tempo adjustments based on running pace and heart rate [15–19].
While these approaches personalize music at the song level, they overlook the motivational affordances of specific segments within a song. Relaxing music is better suited for recovery, while upbeat, rhythmically prominent music stimulates physical activity [20]. These variations in intensity also occur within a single song, where high-energy sections like the ’drop’ in electronic music feel particularly exciting [21] and exercisers naturally synchronize bursts of effort with these peaks [22]. We leverage these insights to develop music adaptation algorithms that dynamically adjust individual segments within a song, aligning high-intensity music with peak exertion.
RISE combines several MIR techniques for structure-aware adaptive playback. We use the All-in-one package [23] for music structure analysis, automatically segmenting songs into sections (e.g., intro, verse). Then, we use Spleeter [24] for source separation, isolating individual audio components (e.g., drum tracks) to estimate each segment’s drum prominence. Finally, we use cutpoint identification [25] to pinpoint pairs of transition points where cutting from one to the other wouldn’t sound abrupt. This enables RISE to dynamically skip or loop musical segments.
3. RISE
We present RISE, a system that synchronizes music intensity with workout intensity to enhance the exercise experience (Figure 2). RISE takes in a user’s music (audio) and workout (planned intervals in the guided setting or current status in the unguided setting) as input. The music is preprocessed to label high-intensity segments (Section 3.1) and points of transition (Section 3.2). This information is then used for real-time music adaptation by our Unity-based system for workout alignment (Section 3.3).
3.1 Preprocessing - Estimating Intense Segments in Music
Informed by prior work, we select chorus and instrumental sections as the high-intensity segments, as they are often perceived as more energetic and motivating [21, 22]. To ensure rhythmic drive [26, 27], we filter out sections lacking in drum presence to avoid arrangements where percussion is intentionally dropped or simplified for contrast.
We estimate these segments by leveraging two existing music analysis systems. First, we use a music structure analysis system [23] to segment each song into a set of functional sections $S = \{s_1, s_2, \ldots, s_N\}$ and extract beat timestamps $B = \{b_1, b_2, \ldots, b_M\}$. Here, $S$ represents a partition of the song into $N$ sections, where $s_n$ marks the start timestamp of a section $n$ and $s_{n+1}$ marks both its end and the start of the next section. $B$ is the time-ordered set of detected beat times. Functional section labels (e.g., chorus, instrumental, verse, bridge) are assigned to each section by the system.
To measure drum prominence, we first extract the drum track using a source separation system [24]. Given an audio input $x$, we define the isolated drum signal as $d := \mathrm{SourceSepDrums}(x)$. We then compute LUFS loudness [28] at beat-level intervals. For each beat index $m$, we define its loudness as:
$$L_d(m) := \mathrm{LUFS}(\mathrm{Segment}(d, b_m, b_{m+1})),$$
where $\mathrm{Segment}(d, b_m, b_{m+1})$ extracts the segment of the drum audio between consecutive beats $b_m$ and $b_{m+1}$. The average drum loudness of a section $s_n$ is calculated by averaging the loudness of all beats in the segment:
$$L_d(s_n) := \frac{1}{|B_n|} \sum_{m \in B_n} L_d(m),$$
where $B_n \subseteq \{1, \ldots, M\}$ is the set of beat indices that satisfy $s_n \le b_m < s_{n+1}$.
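
The section-level average above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes the per-beat loudness values $L_d(m)$ have already been measured (for example with a LUFS meter), and the function name `section_drum_loudness`, its arguments, and the choice to return negative infinity for a section without beats are all hypothetical.

```python
from bisect import bisect_left


def section_drum_loudness(section_starts, track_end, beat_times, beat_loudness):
    """Average per-beat drum loudness within each section.

    section_starts: sorted section start times s_1..s_N (seconds)
    track_end:      end time of the last section (seconds)
    beat_times:     sorted beat timestamps b_1..b_M (seconds)
    beat_loudness:  assumed precomputed L_d(m) per beat, same length as beat_times
    """
    bounds = list(section_starts) + [track_end]
    averages = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        lo = bisect_left(beat_times, start)  # first beat with b_m >= s_n
        hi = bisect_left(beat_times, end)    # first beat with b_m >= s_{n+1}
        values = beat_loudness[lo:hi]        # beats with s_n <= b_m < s_{n+1}
        # Assumption: a section with no beats gets -inf so it never counts as loud.
        averages.append(sum(values) / len(values) if values else float("-inf"))
    return averages


# Hypothetical example: two sections, five beats.
print(section_drum_loudness([0, 10], 20, [1, 5, 9, 10, 15],
                            [-20, -22, -24, -10, -14]))
```

Note that a beat falling exactly on a section boundary is assigned to the later section, matching the half-open interval $s_n \le b_m < s_{n+1}$.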
Proceedings of the 26th ISMIR Conference, Daejeon, Korea, September 21-25, 2025
[Figure 2: block diagram with the labels Input (user music, user exercise plan); Python pre-processing (source separation, music structure analysis, beat tracking, audio loudness meter, intra-segment cutpoints; intermediate data: isolated drum track, section type, beat timestamps, high-intensity segments); Unity real-time adaptation (segments, beat times, and intra-segment transition points); Output (rearranged music).]
Figure 2. System overview. RISE takes user music and exercise plans as input, preprocesses the music to identify intense segments and intra-segment cutpoints, and sends this information to Unity for real-time adaptation.
We precompute a per-song loudness threshold $\tau$, where any segment with a loudness above $\tau$ is considered high intensity, and otherwise we consider it low intensity. Threshold $\tau$ is computed relative to the loudest section in the song: $\tau = \max_{s_n \in S} L_d(s_n) - \delta$, where $\delta$ is a constant that determines the relative threshold. In our implementation, we set $\delta = 5$ decibels. If four or more consecutive segments are labeled high-intensity, we reassign the segment with the lowest loudness as low-intensity and merge consecutive segments with the same intensity labels. Ultimately, our analysis induces a partition of the full track into $\le N$ segments and associated binary intensity labels $S' = \{(s'_1, i_1), (s'_2, i_2), \ldots\}$, where $s'_n$ are delineating timestamps and $i_n \in \{\mathrm{Low}, \mathrm{High}\}$.
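
As a rough illustration of the labeling step, the following Python sketch applies the relative threshold, breaks up long runs of high-intensity sections, and merges neighbors with equal labels. It is a sketch under stated assumptions rather than the system's code: the function name `label_segments` and its parameters are hypothetical, the chorus/instrumental label filter from the start of this section is omitted, and each long run is demoted only once without re-checking the remaining sub-runs.

```python
def label_segments(section_starts, loudness, delta=5.0, max_run=4):
    """Return merged (start_time, label) pairs with label in {"Low", "High"}.

    section_starts: start time of each section, s_1..s_N
    loudness:       average drum loudness L_d(s_n) of each section
    delta:          relative threshold in decibels (5 in the text)
    max_run:        run length of High sections that triggers a demotion
    """
    tau = max(loudness) - delta  # tau = max L_d(s_n) - delta
    labels = ["High" if value > tau else "Low" for value in loudness]

    # For each run of >= max_run consecutive High sections, demote the quietest.
    i = 0
    while i < len(labels):
        if labels[i] != "High":
            i += 1
            continue
        j = i
        while j < len(labels) and labels[j] == "High":
            j += 1
        if j - i >= max_run:
            quietest = min(range(i, j), key=lambda k: loudness[k])
            labels[quietest] = "Low"
        i = j

    # Merge consecutive sections that share a label; keep the first start time.
    merged = []
    for start, label in zip(section_starts, labels):
        if not merged or merged[-1][1] != label:
            merged.append((start, label))
    return merged


# Hypothetical example: six sections with made-up loudness values.
print(label_segments([0, 10, 20, 30, 40, 50],
                     [-30.0, -12.0, -11.0, -10.0, -13.0, -25.0]))
```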
3.2 Preprocessing - Estimating Intra-Segment Cutpoints
To better align intense segments of music with high-intensity workout phases, we estimate a set of cutpoints to facilitate seamless adaptation. A cutpoint is a pair of timestamps in the music recording, consisting of a starting timestamp and a destination timestamp. The objective is to estimate cutpoints that allow smooth musical transitions, such that if playback jumps from the start of a cutpoint to its destination, users experience minimal disruption.
We derive an initial set of cutpoints using the approach proposed by Plachouras and Miron [25], which analyzes recurrence matrices encoding the self-similarity of musical beats. Cutpoints are identified by detecting diagonals in these matrices that correspond to repeated patterns, pinpointing transitions between musically coherent sections. This results in an initial set of candidate cutpoints:
$$C = \{(c_i^{\mathrm{orig.}}, c_i^{\mathrm{dest.}}) \in B \times B\}$$
where $C$ is the set of all estimated cutpoint pairs, and each cutpoint consists of a start time $c_i^{\mathrm{orig.}}$ and an end time $c_i^{\mathrm{dest.}}$, both aligned to detected beat timestamps $B$.
In prior work, cutpoints have been applied to rearrange music to fit external constraints, such as video duration [29]. These approaches allow cutpoints to cross section boundaries, maximizing flexibility at the potential cost of playback naturalness.
To prioritize naturalness in our system, we enforce an intra-segment constraint, ensuring that cutpoints only jump within a segment as opposed to across segments. We define the filtered set of intra-segment cutpoints as:
$$C' := \{(c_i^{\mathrm{orig.}}, c_i^{\mathrm{dest.}}) \in C \mid \exists s'_n \in S',\ s'_n \le c_i^{\mathrm{orig.}}, c_i^{\mathrm{dest.}} < s'_{n+1}\}$$
This guarantees that every cutpoint’s start and destination timestamps fall within the same functional section $s'_n$, preserving the structural integrity of the music.
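
The intra-segment constraint amounts to a simple filter, sketched below in Python. The function name `filter_intra_segment` is hypothetical, cutpoints are assumed to be `(origin, destination)` pairs in seconds, and the first segment is assumed to start at or before every timestamp.

```python
from bisect import bisect_right


def filter_intra_segment(cutpoints, segment_starts):
    """Keep only cutpoints whose origin and destination share a segment.

    cutpoints:      iterable of (origin, destination) timestamps
    segment_starts: sorted segment start times s'_1, s'_2, ...
    """
    def segment_index(t):
        # Index n such that s'_n <= t < s'_{n+1}
        return bisect_right(segment_starts, t) - 1

    return [(origin, dest) for origin, dest in cutpoints
            if segment_index(origin) == segment_index(dest)]


# Hypothetical example with segments starting at 0 s, 30 s and 60 s.
candidates = [(25, 5), (35, 10), (50, 32), (30, 59), (29.9, 30)]
print(filter_intra_segment(candidates, [0, 30, 60]))
```

In this made-up example, `(35, 10)` and `(29.9, 30)` are discarded because they cross the boundary at 30 s.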
3.3 Adaptation With Cutpoints
In addition to music audio and exercise plan, our real-time adaptation system takes as input the following information estimated during pre-processing: (1) musical segments and corresponding intensities, (2) seamless cutpoints, and (3) beat timestamps. To adapt music in real time, we define a state machine that governs playback behavior. The system operates in one of three possible states (Figure 3):
• Loop State: If the system determines that a segment should be extended, it selects a cutpoint $c_i$ where $c_i^{\mathrm{orig.}} > \mathrm{current}$ and $c_i^{\mathrm{dest.}} < \mathrm{current}$, looping previously played sections to increase the duration of the current intensity segment.
• Skip State: If a segment duration needs to be shortened, the system selects a cutpoint $c_i$ where $c_i^{\mathrm{orig.}} > \mathrm{current}$ and $c_i^{\mathrm{dest.}} > c_i^{\mathrm{orig.}}$, skipping forward to reduce the segment duration.
• Unmodified State: If no transition is required, the music plays continuously without alteration.
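
The cutpoint conditions for the three states can be sketched as follows. This is a minimal illustration, not the Unity implementation: the function names are hypothetical, and the tie-breaking rule (take the candidate whose origin comes up soonest) is an assumption that the text does not specify.

```python
def loop_candidates(cutpoints, current):
    """Cutpoints that jump back: origin ahead of playback, destination behind it."""
    return [(o, d) for o, d in cutpoints if o > current and d < current]


def skip_candidates(cutpoints, current):
    """Cutpoints that jump forward: origin ahead of playback, destination after origin."""
    return [(o, d) for o, d in cutpoints if o > current and d > o]


def next_jump(state, cutpoints, current):
    """Return the (origin, destination) jump to schedule, or None."""
    if state == "loop":
        candidates = loop_candidates(cutpoints, current)
    elif state == "skip":
        candidates = skip_candidates(cutpoints, current)
    else:  # "unmodified": play on without alteration
        return None
    # Assumption: prefer the cutpoint whose origin is reached first.
    return min(candidates, key=lambda c: c[0], default=None)


# Hypothetical cutpoints (seconds) with playback currently at 25 s.
cuts = [(40, 10), (20, 5), (30, 50), (45, 60)]
print(next_jump("loop", cuts, 25))
print(next_jump("skip", cuts, 25))
print(next_jump("unmodified", cuts, 25))
```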
[Figure 3: diagram comparing original music and rearranged output for segments A, B, C. Looping: transition to earlier location, segment extended through repetition. Skipping: transition to later location, segment shortened through removal.]
Figure 3. Visualization of adaptation modes. Top: Loop mode extends a segment by jumping back. Bottom: Skip mode shortens a segment by jumping forward.
Playback state transitions are determined dynamically based on the user’s exercise plan, enabling the system to adjust segment duration in real time.
3.3.1 Filter-based transitions
Sometimes, the system cannot immediately transition to the target intensity because there are no available cutpoints that match the desired transition. In these cases, we use a filter-based transition, gradually removing high-frequency content before the transition and restoring it afterward, similar to DJ fade techniques. These transitions are noticeable and less seamless compared to cutpoint transitions. We currently apply filter-based transitions only in unguided mode (Section 3.4.1), when the exercise calls for a timely transition to a high-intensity music state.
3.4 Usage Modes
Workout habits may vary greatly across different types of exercises: weight training can require minutes of rest to fully recover, and the actual timing can vary greatly depending on how exhausted the user is. Guided interval workouts, on the other hand, emphasize short bursts followed by short rest periods that are strictly timed to maximize time efficiency. Motivated by this observation, we designed two usage modes for two scenarios: an unguided mode where the user is free to rest as long as they need, and a guided mode where the user follows a predefined exercise plan. We detail the system design of each mode below.
[Figure 4: four scenarios. (1) User working when music is low intensity: skipping (skip to high intensity). (2) User working when music is high intensity: looping (enter loop cycle). (3) User (started) resting when music is high intensity: skipping (break from loop cycle). (4) User resting when music is low intensity: unmodified (play normally until user starts working).]
Figure 4. Adaptation mode for different scenarios.
3.4.1 Unguided use
The unguided mode allows users to freely choose start times and work/rest durations, but sometimes sacrifices adaptation quality by using filter-based transitions when no cutpoints are immediately available. The system takes real-time workout state as input and adjusts the music accordingly (binary: work/rest). Currently, users manually indicate state changes by pressing a button. As depicted in Figure 4, RISE transitions between playback states depending on the current workout status:
1. Starting exercise during a low-intensity segment: The system enters skip mode to quickly transition to a high-intensity segment.
2. During exercise in a high-intensity segment: The system activates loop mode to sustain high-intensity music until the user begins resting.
3. Starting rest in a high-intensity segment: The system disables looping and switches to skip mode to exit high-intensity segments.
4. During rest in a low-intensity segment: The system enters unmodified mode, allowing the music to play naturally until the user resumes exercising.
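
The four cases above reduce to a lookup on two binary inputs. The Python sketch below is illustrative only; the function name `unguided_state` and the string labels are hypothetical, and the filter-based fallback from Section 3.3.1 is not modeled.

```python
def unguided_state(user_working: bool, music_high: bool) -> str:
    """Map the binary workout state and current music intensity to a playback state."""
    if user_working and not music_high:
        return "skip"        # case 1: skip ahead to a high-intensity segment
    if user_working and music_high:
        return "loop"        # case 2: sustain high-intensity music
    if not user_working and music_high:
        return "skip"        # case 3: leave the high-intensity segment
    return "unmodified"      # case 4: play naturally until work resumes


for working in (True, False):
    for high in (False, True):
        print(working, high, unguided_state(working, high))
```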
3.4.2 Guided use
The guided mode provides seamless adaptations but requires a precise exercise plan. The system takes two durations (in seconds) for work and rest. We iterate through available cutpoints and select the transition that results in an adaptation closest to the desired duration. We allow one transition per segment to maximize naturalness. The results are close to the specified duration but rarely perfect (e.g., adjusting a 50-second segment to 32 seconds for a 30-second target). We fade songs in at the beginning and out at the end, adjusting the intro to match the rest duration.
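
The selection step can be sketched as a search over single jumps. The sketch assumes that a jump from origin $o$ to destination $d$ is taken exactly once, so the played duration of the segment changes by $o - d$ (positive for a loop, negative for a skip); the function name `best_single_transition` and the option of keeping the segment unmodified when no cutpoint helps are likewise assumptions.

```python
def best_single_transition(seg_start, seg_end, cutpoints, target):
    """Pick at most one cutpoint so the played duration is closest to target.

    Returns (cutpoint or None, resulting duration in seconds).
    """
    natural = seg_end - seg_start
    best, best_duration = None, natural  # baseline: play the segment unmodified
    for origin, dest in cutpoints:
        inside = seg_start <= origin < seg_end and seg_start <= dest < seg_end
        if not inside:
            continue  # intra-segment cutpoints only
        # Jumping back (dest < origin) replays origin - dest seconds;
        # jumping forward (dest > origin) removes dest - origin seconds.
        duration = natural + (origin - dest)
        if abs(duration - target) < abs(best_duration - target):
            best, best_duration = (origin, dest), duration
    return best, best_duration


# Hypothetical cutpoints for a 50-second segment and a 30-second target.
print(best_single_transition(0, 50, [(10, 28), (5, 15), (40, 20)], 30))
```

With these made-up cutpoints the closest achievable duration is 32 seconds, which mirrors the example in the text where a 50-second segment is adjusted to 32 seconds for a 30-second target.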
4. QUANTITATIVE EVALUATION
We conducted an in-lab quantitative listening evaluation to assess the seamlessness of transitions.
4.1 Procedure
We collected 270 different songs, 30 songs each from nine different official Spotify workout playlists with different genre preferences. 13 songs were removed from the study because they had similar intensity throughout the entire song or had no available intra-segment cutpoints. For the remaining 257 songs, we generated two 10-second audio clips per song, one with a transition and one unmodified. Each clip is selected from a random segment of the song. We randomized the transition timing to occur between two and eight seconds within the 10-second clip.
This evaluation was performed with six human listeners. Each clip was randomly assigned to two listeners who rated transition naturalness on a scale of 1-5 (1 = very jarring, 5 = very seamless/unnoticeable). We compare the ratings of modified and unmodified clips using a paired t-test.
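
For readers unfamiliar with the test, a paired t statistic can be computed as sketched below. This is a generic illustration using only the Python standard library: the ratings in the example are invented, and how the two listener ratings per clip were combined before testing is not stated in the text, so one rating per clip is assumed here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(ratings_a, ratings_b):
    """Return (t statistic, degrees of freedom) for paired samples.

    Assumes the per-pair differences are not all identical
    (otherwise the standard deviation is zero).
    """
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # mean difference over its standard error
    return t, n - 1


# Invented ratings for four songs: clip with transition vs. unmodified clip.
t_stat, dof = paired_t([4, 5, 3, 4], [5, 5, 4, 4])
print(round(t_stat, 3), dof)
```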
4.2 Results
We found no statistically significant differences between the clips with transitions (M = 4.3/5, SD = 1.1) and the baseline clips (M = 4.5/5, SD = 1.0), t(256) = -1.63, p = .10, suggesting that our transitions are highly seamless and comparable to the unmodified clips. We observed a small decrease in the average rating of transition clips (4.5 vs. 4.3) for two reasons. First, the beat detection algorithm is not perfect. We found instances where the transitions were not perfectly aligned, causing a slight jump in rhythm. Second, familiarity with a song influenced the detection of transitions. Raters noted that, even when transitions were completely natural, they perceived differences in expected progression (e.g., altered lyrics) in songs they knew well. We also found that unmodified clips did not receive perfect scores. This was due to structural elements such as syncopated rhythms and abrupt breaks, intended to surprise the listener, being perceived as transitions by raters, despite these elements being part of the original compositions.
5. USER STUDY
We conducted a user study to explore how users experience RISE in both guided and unguided exercise settings.
5.1 Study Design
Participants exercised to both unmodified music and our adaptive system across two blocks: guided interval training and unguided weight training. Within each block, they experienced both adaptive and non-adaptive conditions, with order fully counterbalanced. Each condition included a brief tutorial, 8 minutes of exercise, and optional rest. Participants selected two songs from a curated pool of 25 tracks, played identically across all conditions. The adaptive system modified the music in response to user activity, while the non-adaptive version left the music unchanged. The full session lasted approximately 90 minutes.
Guided interval training. Participants performed interval exercises of their choice (e.g., jumping jacks) according to a 40s work / 30s rest schedule. For the non-adaptive system, these intervals are strict. For the adaptive system, the actual time may vary by seconds depending on the available transitions in each section. Instructions were shown on a screen with countdown visuals and sounds, modeled after popular workout timer videos [30].
Unguided weight training. Participants used dumbbells to perform any free-form weight exercises (e.g., bicep curls). In adaptive conditions, they verbally indicated when they were about to begin or end a work segment, allowing the researcher to input music adaptation state changes. No interface was shown.
Interview procedure. After introducing the study and obtaining participant demographic information and consent, we gave them a verbal description of our system and recorded their first impressions through a short interview. After experiencing all conditions, we showed participants how the system operated by replaying the music to them along with visualizations of both segment intensity labels and cutpoint transitions. This was done at the end of the study to avoid priming participants to focus on specific manipulations and to evaluate whether the adaptations were perceptible without guidance. The study concluded with a semi-structured exit interview for verbal feedback.
Participant and apparatus. We recruited 12 participants (5 female, 7 male, age M = 24.8 years; SD = 2.1) from a local university. Participants regularly exercise (3: 1-2x/week, 7: 3-4x/week, 2: 5-6x/week) for considerable durations (1: 15-30 min, 4: 30-45 min, 1: 45-60 min, 5: 1-2 hours, 1: 2 hours+) in a variety of exercise types (8: steady cardio, 3: interval training, 10: strength training, 4: other). The study took place in a controlled lab environment with music played through speakers for participant safety. Participants received $50 for their participation.
5.2 Findings
Interview transcripts were thematically analyzed using a coding reliability approach [31] with the Recal2 tool [32]. The analysis resulted in an average raw accuracy of 86.1% and an average Krippendorff’s alpha of 0.72 between two raters, indicating acceptable agreement (α > 0.66). We present key findings below.
Participants were excited by the premise of their music adapting to their workouts. After we explained the study to participants but before they experienced our system, we asked participants about their initial reactions to the description of the study and our proposed system. Some participants had already made manual efforts to align their workout with music (n = 8), such as coordinating specific music sections with their workout (n = 4). All participants expressed that music intensity alignment with workouts could be helpful to them. Participants believed alignment can increase motivation, or help them feel more energized and “pumped” during the workout (n = 11).
Estimated high-intensity segments aligned with their intuitions. We showed participants a visualization of our system’s estimated high-intensity music segments after the study, and all participants found the results to be aligned with their expectations (n = 12).
Participants found our cutpoint adaptation technique seamless, subverting their expectations. Before the study, we shared a high-level description of our workout adaptation system with participants. Many expressed initial skepticism that the modifications might detract from the naturalness of the music playback (n = 6). Out of the participants who had concerns regarding the naturalness of music, all but one remarked that their concerns were overturned after experiencing our system (n = 5). While the filter-based modifications were regarded as more noticeable, most participants either did not notice or barely noticed any cutpoint-based modifications until they were shown a visualization at the end of the study (n = 10).
Participants appreciated the extra motivation our alignment approach provided. All but one participant found the intensity alignment to enhance their workout experience (n = 11), giving them more motivation to push harder in their workouts. P2 noted that when the music intensity peaked just as they were about to reach failure in their workout, it helped them push through and get an extra one or two reps. Similarly, P4 recalled an instance where they expected the music to shift to a lower-intensity segment in the middle of their work interval, but instead the upbeat segment repeated, providing them the energy to push through the remainder of their workout.
Participants varied in their sensitivity to both music modifications and alignment with exercise. For instance, P8 barely noticed music changes but felt their movements were more consistent in the adaptive condition, while the nonadaptive one felt chaotic. In contrast, P3, who regularly coordinates music with workouts, immediately noticed misalignments in the nonadaptive condition and even felt the urge to stop the music when high-energy segments played during their rest. Sensitivity could also depend on exercise. P6 mentioned that they treated music as background during simple exercises but valued music alignment for motivation during more challenging exercises, such as the bench press. This indicates the need to better understand how different modes of music engagement influence tolerance and demand for system-driven changes.
Participants preferred adaptive music but identified issues with the unguided experience. Out of 12 participants, 11 preferred our adaptive system over nonadaptive music: 6 preferred it in both scenarios, while 5 favored it only in the guided setting. Those who preferred the system only in the guided setting found two main issues with the unguided experience. First, the input method, requiring them to notify the researcher before exertion, was distracting and impractical (n = 7). Second, while cutpoint transitions were deemed seamless and often unnoticeable, filter transitions disrupted the natural flow of the music. These filter transitions were especially common during exercises with longer work durations and shorter rest periods, where excessive looping also occurred, leading to unnatural adaptations of the music. As a result, participants recommended improvements in system customizability to support different workout habits (e.g., long work, short rest) and music adaptation preferences (n = 8), manipulation seamlessness (n = 4), and expanding the available music pool or allowing user input (n = 3).
6. LIMITATION AND FUTURE WORK
Our study revealed several limitations, including users finding manual input impractical and filter transitions disrupting. We propose future work below:
Fully automating the system with sensing technologies. To eliminate the need for manual interaction during workouts, we propose automating the system using real-time sensing technologies. By integrating activity recognition, the system could automatically detect exercise phases without manual input. The system could also model user behavior and predict upcoming exercise states based on past patterns to prepare adaptation plans in advance.
Incorporating other types of modifications. While cutpoints enable seamless transitions, they may be unavailable when timely adaptation is required. Future work could complement our segment-sensitive approach with existing techniques, such as pace adjustment and song recommendation, to compensate for minor time discrepancies. Additionally, exploring audio inpainting with generative models could enable smooth transitions between arbitrary segments, providing greater flexibility and precision.
Determining high-intensity segments. RISE currently aligns drum-prominent chorus and instrumental sections with user exertion. While this approach worked well in our study, future work can further explore how segment-level musical variations influence exercise and how these effects may differ by genre or listener preference.
7. CONCLUSION
We present RISE, a novel system that adapts music to align with exercise phases. A user study involving 12 participants revealed that, despite initial skepticism, most users appreciated the alignment and preferred it over nonadaptive music for their workouts. Our work represents a step towards expanding the design space of adaptive music, making tailored music experiences, once limited to video games and precomposed soundtracks, applicable to real-world scenarios like workouts.
8. REFERENCES
[1] P. C. Terry, C. I. Karageorghis, M. L. Curran, O. V. Martin, and R. L. Parsons-Smith, “Effects of music in exercise and sport: A meta-analytic review.” Psychological bulletin, vol. 146, no. 2, p. 91, 2020.
[2] S. Mitroff, “Hitting the pavement with spotify running (hands-on) - cnet,” https://www.cnet.com/tech/services-and-software/hitting-the-pavement-with-spotify-running-hands-on/, 2015.
[3] Apple, “Apple fitness+ - apple,” https://www.apple.com/apple-fitness-plus/, 2024.
[4] P. Knees, M. Schedl, and M. Goto, “Intelligent user interfaces for music discovery: The past 20 years and what’s to come.” in Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR 2019, Delft, The Netherlands, November 4-8, 2019, 2019, pp. 44–53.
[5] M. Goto, “Grand challenges in music information research,” in Dagstuhl Follow-Ups: Multimodal Music Processing, M. Müller, M. Goto, and M. Schedl, Eds. Dagstuhl Publishing, 2012, pp. 217–225.
[6] M. Schedl and A. Flexer, “Putting the user in the center of music information retrieval.” in Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR 2012, Mosteiro S.Bento Da Vitória, Porto, Portugal, October 8-12, 2012, 2012, pp. 385–390.
[7] M. Kari, T. Grosse-Puppendahl, A. Jagaciak, D. Bethge, R. Schütte, and C. Holz, “Soundsride: Affordance-synchronized music mixing for in-car audio augmented reality,” in The 34th Annual ACM Symposium on User Interface Software and Technology, 2021, pp. 118–133.
[8] L. Baltrunas, M. Kaminskas, B. Ludwig, O. Moling, F. Ricci, A. Aydin, K.-H. Lüke, and R. Schwaiger, “Incarmusic: Context-aware music recommendations in a car,” in E-Commerce and Web Technologies: 12th International Conference, EC-Web 2011, Toulouse, France, August 30-September 1, 2011. Proceedings 12. Springer, 2011, pp. 89–100.
[9] A. Wang, Y. F. Cheng, and D. Lindlbauer, “Maringba: Music-adaptive ringtones for blended audio notification delivery,” in Proceedings of the CHI Conference on Human Factors in Computing Systems, CHI 2024, Honolulu, HI, USA, May 11-16, 2024. New York, NY, USA: Association for Computing Machinery, 2024.
[10] A. Wang, D. Lindlbauer, and C. Donahue, “Towards music-aware virtual assistants,” in Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, UIST 2024, Pittsburgh, PA, USA, October 13-16, 2024. New York, NY, USA: Association for Computing Machinery, 2024.
[11] J. Shriram, M. Tapaswi, and V. Alluri, “Sonus texere! automated dense soundtrack construction for books using movie adaptations,” in Proceedings of the 23rd International Society for Music Information Retrieval Conference, ISMIR 2022, Bengaluru, India, December 4-8, 2022, 2022, pp. 535–542.
[12] S. Rubin, F. Berthouzoz, G. Mysore, W. Li, and M. Agrawala, “Underscore: musical underlays for audio stories,” in Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology, ser. UIST ’12. New York, NY, USA: Association for Computing Machinery, 2012, p. 359–366.
[13] N. Masahiro, H. Takaesu, H. Demachi, M. Oono, and H. Saito, “Development of an automatic music selection system based on runner’s step frequency,” in ISMIR 2008, 9th International Conference on Music Information Retrieval, Drexel University, Philadelphia, PA, USA, September 14-18, 2008, 2008, pp. 193–198.
[14] G. T. Elliott and B. Tomlinson, “Personalsoundtrack: context-aware playlists that adapt to user pace,” in CHI’06 extended abstracts on Human factors in computing systems, 2006, pp. 736–741.
[15] B. Moens, L. van Noorden, and M. Leman, “D-jogger: Syncing music with walking,” in 7th Sound and music computing Conference. Universidad Pompeu Fabra, 2010, pp. 451–456.
[16] J. Hockman, M. M. Wanderley, and I. Fujinaga, “Real-time phase vocoder manipulation by runner’s pace.” in NIME, 2009, pp. 90–93.
[17] N. Oliver and L. Kreger-Stickles, “Papa: Physiology and purpose-aware automatic playlist generation.” in ISMIR 2006, 7th International Conference on Music Information Retrieval, 2006, pp. 250–253.
[18] B. van der Vlist, C. Bartneck, and S. Mäueler, “mobeat: Using interactive music to guide and motivate users during aerobic exercising,” Applied psychophysiology and biofeedback, vol. 36, pp. 135–145, 2011.
[19] Y. Chen, C.-C. Chen, L.-C. Tang, and W.-H. Chieng, “Enhancing running exercise with iot, blockchain, and heart rate adaptive running music,” IEEE Access, 2024.
[20] C. I. Karageorghis and D. Holland, “Music in the exercise domain: A review and synthesis (part ii),” International Review of Sport and Exercise Psychology, vol. 5, no. 1, pp. 67–84, 2012.
[21] A. Turrell, A. R. Halpern, and A.-H. Javadi, “When tension is exciting: an electroencephalogram exploration of excitement in music,” bioRxiv, p. 637983, 2019.
[22] D.-L. Priest and C. I. Karageorghis, “A qualitative investigation into the characteristics and effects of music accompanying exercise,” European physical education review, vol. 14, no. 3, pp. 347–366, 2008.
[23] T. Kim and J. Nam, “All-in-one metrical and functional structure analysis with neighborhood attentions on demixed audio,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023.
[24] R. Hennequin, A. Khlif, F. Voituret, and M. Moussallam, “Spleeter: a fast and efficient music source separation tool with pre-trained models,” Journal of Open Source Software, vol. 5, no. 50, p. 2154, 2020.
[25] C. Plachouras and M. Miron, “Music rearrangement using hierarchical segmentation,” in ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023, pp. 1–5.
[26] C.-W. Li and C.-G. Tsai, “The presence of drum and bass modulates responses in the auditory dorsal pathway and mirror-related regions to pop songs,” Neuroscience, vol. 562, pp. 24–32, 2024.
[27] G. Madison, “Experiencing groove induced by music: consistency and phenomenology,” Music perception, vol. 24, no. 2, pp. 201–208, 2006.
[28] C. J. Steinmetz and J. D. Reiss, “pyloudnorm: A simple yet flexible loudness meter in python,” in 150th AES Convention, 2021.
[29] Adobe, “Remix in premiere pro,” https://helpx.adobe.com/premiere-pro/using/remix-audio-in-premiere-pro.html, 2024.
[30] W. M. W. Timer, “Interval timer with music | 40 sec rounds 30 sec rest | mix 107,” YouTube video, 2021, accessed: September 1, 2024. [Online]. Available: https://www.youtube.com/watch?v=lnBOQnc_p-E
[31] R. E. Boyatzis, Transforming qualitative information: Thematic analysis and code development. Sage, 1998.
[32] D. Freelon, “Recal2: Reliability for 2 coders,” http://dfreelon.org/utils/recalfront/recal2/, 2010.