PATIENCE X: Extended artistic expression in AI-assisted music composition

Author: Macdonald, Lois

Publisher: Zenodo

DOI: 10.5281/zenodo.17306326

Source: https://zenodo.org/records/17306326/files/53.pdf

Proceedings of the 6th Conference on AI Music Creativity (AIMC 2025),
Brussels, Belgium, September 10th-12th
PATIENCE X: Extended artistic expression in AI-assisted music composition
Lois Macdonald
School of Digital Arts (SODA)
Manchester Metropolitan University
[email protected]
Abstract
PATIENCE X is a collection of seven audio compositions exploring the
integration of machine learning tools into compositional and performative
workflows for alt-pop. This practice-led research examines real-world AI-assisted
music composition. Through qualitative experimentation with selected AI tools
and original datasets of female vocals collected by the researcher, PRiSM
SampleRNN emerged as most artistically effective for generating unique samples
for selection and manipulation. The findings emphasise the need for a balance
between artistic control and algorithmic unpredictability, ensuring AI functions
as a creative catalyst rather than a tool merely for extensive replication.
Additionally, accessibility remains critical. For AI tools to be practically useful to
musicians, they must accommodate non-coding artists without compromising
usability and creative flow. Composed between 2022 and 2023, PATIENCE X
serves as a document of accessible systems during this period as well as
demonstrating their sonic aesthetics. Through the process of discovery the project
led to the formation of a new artist persona, ‘PATIENCE’, a genuine
representation of extended musical expression through AI use.
1 Introduction
This 10-minute performance presents an arrangement of compositions from PATIENCE X, a body
of work created using unique vocal samples generated largely via PRiSM SampleRNN. These
samples are derived from original datasets featuring female vocals, and the resulting tracks
embody an exploration of the authentic integration of AI systems into a compositional workflow
for electronic alt-pop music. Through this practice-led research a new artist persona,
‘PATIENCE’, came into existence, the personification of the human/machine vocalisations
generated throughout the project. As a musician with little prior knowledge of coding I aim to
contribute to the discourse around the implementation of AI tools into creative music composition
workflows in terms of accessibility, authorship, and artistic control. Here, I outline the creative
process, contextual influences, and the methodological framework behind this work.
2 Context
The work relates to the theme of AIMC 2025 through exploration of original datasets that uncover
new sonic materials and drive extended creative expression in music composition (and related
artistic fields). The project also addresses the issue of accessibility of AI tools for musicians and
how these tools can be integrated authentically into existing practices, both in studio and live
workflows.
There is a growing body of research exploring the gap between technological innovation and its
practical adoption by artists. Sturm and Ben-Tal (2017) address this issue by inviting musicians to
engage critically with CharRNN and FolkRNN. Similarly, Ma, Sargen, De Roure, and Howard
(2024) examine the application of the PRiSM SampleRNN model by conservatoire-based artists
who possess a degree of prior familiarity with the technology. However, access to and
autonomous engagement with such tools remains limited for musicians operating outside of
specialist contexts. This performance offers an alternative perspective by reflecting on the
application of PRiSM SampleRNN from an artist situated within the alt-pop genre. As a creative
project built from raw vocal audio datasets processed through neural networks, PATIENCE X is in
dialogue with works such as Second Self (2019) by Dadabots, Reeps One and Bell Labs,
PROTO (2019) by Holly Herndon, and Future Chorus (2023), curated by Eleni Ikoniadou.
3 Methodology
Taking an iterative approach, I initially experimented with 17 AI tools, each with various methods
of audio generation. These were: AIVA, Beatbot.ai, ChatGPT, Elf Tech (Grimes, 2023), Genny,
Holly+, Kits.ai, Lovo, Magenta Studio, MaxMSP (ml.star), PRiSM SampleRNN, RAVE, Soundful,
These Lyrics Do Not Exist, TTS Maker, Wekinator, and Word2Wave. Tools were assessed based
on three criteria: accessibility (technical skill and cost), creative potential, and their practical ability
to support and sustain flow (Csikszentmihalyi, 1990). I found MIDI and data-driven tools
introduced a mechanical or predictable quality to the outputs whilst producing extensive variations
with little usable content. In contrast, working with a more selective amount of raw audio clips
produced by neural networks was more productive. I was able to access new sonic materials that
introduced me to a timbral space that is both emotionally resonant and uncanny. This was
characterised by microtonal warbles, glitches, and vocal anomalies that evoke a primal quality. This
is where I was able to connect creatively with these tools.
PRiSM SampleRNN, Elf Tech, TTS Maker and ChatGPT were used for the final compositions.
Most significantly, PRiSM SampleRNN was used to create a library of 249 unique samples.
Five original datasets of 30 minutes each were used to generate these samples.
Table 1: Datasets used
Dataset   Description
1         Electric guitar (written and performed by artist)
2         Synth improvisation (MicroKorg, Korg MS20 and Roland TB3)
3         Spoken storytelling (lead artist telling personal anecdotes in conversational tone)
4         "Vocal dérive" (improvised vocalisations by lead artist)
5         Mixed vocal set (10 minutes each from lead artist and two additional female vocalists performing original melodies)
Datasets 1 and 2 yielded limited musical interest as the outputs were similar to the original audio.
The datasets using female vocals (3, 4, 5) provided varied and anomalous results that initiated a
creative impulse, with outputs from dataset 5 the most musically engaging. Created from three
different female voices in a similar range, the audio files used were stylistically coherent enough to
generate fluid sounds, whilst small details such as accent, phrasing and articulation allowed the
model to form a more diverse output than with a single vocal. In total, over 100 minutes of audio
files were generated from these datasets. Musically interesting fragments of these sonic outputs
were then edited into short samples and categorised by theme to create a unique sample library to
compose with, e.g. ‘Kick’, ‘Hits’, ‘Rhythm’, ‘Singing’. The limitation of working with these
selected samples was essential to the creative development of the project.
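The categorisation step can be illustrated with a short sketch. The Python fragment below is only an illustration under stated assumptions, not the workflow used in the project: the file names, the "<theme>_<id>.wav" naming convention and the `build_library` helper are all hypothetical. It shows one way edited samples might be grouped by theme so that a library can be browsed by category while composing.

```python
from collections import defaultdict
from pathlib import PurePath

# Hypothetical file names: the "<theme>_<id>.wav" convention is an assumption
# made for this sketch, not the naming used in the project.
EXAMPLE_FILES = [
    "kick_001.wav",
    "kick_002.wav",
    "hits_001.wav",
    "rhythm_001.wav",
    "singing_001.wav",
]


def build_library(filenames):
    """Group sample file names by the theme prefix before the first underscore."""
    library = defaultdict(list)
    for name in filenames:
        theme = PurePath(name).stem.split("_", 1)[0].capitalize()
        library[theme].append(name)
    # Sort themes and files so the listing is stable and easy to browse.
    return {theme: sorted(files) for theme, files in sorted(library.items())}


library = build_library(EXAMPLE_FILES)
for theme, files in library.items():
    print(f"{theme}: {len(files)} sample(s)")
```

A flat theme-to-files mapping of this kind keeps the selection deliberately constrained, in keeping with the limitation described above.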
Audio 1: Example of audio generated from dataset 3 (Conversational)
Audio 2: Example of audio generated from dataset 4 (Vocal Dérive, single vocal)
Audio 3: Example of audio generated from dataset 5 (Mixed vocal)
Audio 4: Example of percussive audio generated from dataset 5 (Mixed vocal)
In addition to generating samples using PRiSM SampleRNN, Elf Tech was used to create vocables
from nonsensical or percussive sounds. ChatGPT was used to write lyrics ‘collaboratively’, and
TTS Maker was used to voice these lyrics where needed.
Audio 5: Example of original sample from PRiSM SampleRNN
Audio 5.1: Example of original sample from PRiSM SampleRNN with distortion
Audio 5.2: Example of original sample from PRiSM SampleRNN with distortion, processed with Elf Tech
Tracks were composed using Ableton Live or Logic Pro X. The resulting works are performed
using Ableton Live, hardware drum machine Roland TR8 (uploading unique samples) and the
inclusion of female vocalists to further characterise the sounds and persona of ‘PATIENCE’.
4 Conclusion
In adopting an open-ended, exploratory approach, I have engaged in practitioner-led processes that
allow a nuanced brokering of our relationship with emerging technologies. Sitting adjacent to
formalised academic research, this practice-based method plays a critical role in shaping the
design of future research agendas.
Through exploring the outputs of the stated systems, I was drawn toward the generation and
manipulation of human-like sounds. This manifested either through the expansion of my own
vocal characteristics by blending them with those of other performers, or through the
personification of percussive and melodic elements taken from generated clips. As a result, the
project’s title evolved into a new artistic persona, ‘PATIENCE’, embodying the uncanny, hybrid
quality of the voice-based samples. This alter ego has led to several new projects that extend past
the initial intentions of the research.
References
Grimes (2023) ‘Elf Tech’. Available at: https://elf.tech/connect [Accessed: 5 May 2023]
ChatGPT (2022) Available at: https://chatgpt.com/ [Accessed: 14 December 2022]
Csikszentmihalyi, M., 1988. The flow experience and its significance for human psychology. Optimal
experience: Psychological studies of flow in consciousness, 2, pp.15-35.
Herndon, H. (2019) Proto. Available at: Spotify [Accessed: 10 May 2022]
Ikoniadou, Eleni et al. (2023) Future Chorus, Digital album compilation. Available at: Spotify [Accessed: 10
October 2023]
Sturm, B. L. & Ben-Tal, O. (2017) “Taking the Models back to Music Practice: Evaluating Generative
Transcription Models built using Deep Learning”, Journal of Creative Music Systems 2(1).
doi: https://doi.org/10.5920/JCMS.2017.09
Ma, B., Sargen, E., De Roure, D. and Howard, E., 2024. Learning to Learn: A Reflexive Case Study of
PRiSM SampleRNN. AIMC 2024 (09/09-11/09)
Melen, C. (2020) PRiSM SampleRNN. Available at: https://www.rncm.ac.uk/research/research-centres-rncm/prism/prism-collaborations/prism-samplernn/ [Accessed: 4 November 2021]
Dadabots, Reeps One and Bell Labs (2019) Second Self. Available at:
https://www.youtube.com/watch?v=q981cTdL0Y&t=67s [Accessed: 11 March 2022]
TTS Maker (2023) Available at: https://ttsmaker.com/ [Accessed: 21 April 2023]
