A New Ecosystem for Early Music Studies
COST ACTION 21161
MAKING CORPUS CREATION IN EARLY
MUSIC REWARDING AND EFFECTIVE
FINDING THE OPTIMUM BETWEEN
STANDARDISATION AND AUTONOMY
EDITED BY
Frans Wiering (Utrecht University, Netherlands)
WITH THE CONTRIBUTIONS OF
Erik Bergwall (Uppsala University, Sweden), Marnix van Berchum (Huygens Institute, Netherlands), Werner Goebl (University of Music and Performing Arts Vienna, Austria), Peter van Kranenburg (Utrecht University, Netherlands), David Lewis (Goldsmiths, University of London, UK), Anna Plaksin (Paderborn University, Germany), Esperanza Rodríguez-García (Universidad Complutense de Madrid, Spain), David J. Smith (Northumbria University, UK), Mirjam Visscher (Utrecht University, Netherlands), and David Weigl (University of Music and Performing Arts Vienna, Austria)
HOW TO CITE THIS TEXT
Frans Wiering (ed.), Making Corpus Creation in Early Music Rewarding and Effective: Finding the Optimum Between Standardisation and Autonomy (Utrecht University, 2025), 73 pp., https://doi.org/10.5281/zenodo.17543932.
Abstract
Several early music projects, such as the Stanford Josquin Project, have demonstrated the potential of attaining valuable new musicological insights using a corpus-based approach. However, the available musical corpora tend to be relatively small and exhibit considerable variation in encoding practices. Aspiring corpus researchers are confronted with a lack of suitable data, which needs to be addressed before they can embark on their proper research.
The EarlyMuse Short Term Scientific Mission CORSICA has surveyed the current state of corpus creation and digital editing in early music. Based on this information, it has developed a vision for the future of corpus building in this field, which aims to speed up the production of digital encodings while respecting the autonomy of the encoders and acknowledging their efforts. This is important because much high-quality encoding is carried out outside the field of professional musicology, and engaging citizen scientists could help address the current shortage of research data. The CORSICA team’s vision is informed not only by a study of the available data, standards and technologies, but also by Human-Computer Interaction, placing human goals and values before the creation of technology and work processes. The core of the vision is that successful corpus creation must be an inclusive endeavour in terms of both technology and human participation. The report concludes with an implementation plan outlining the initial steps required to realise the vision.
Table of contents
1 INTRODUCTION
2 PROBLEM DESCRIPTION
3 CORPORA AND TECHNOLOGIES
3.1 AN OVERVIEW OF EARLY MUSIC CORPORA AND ENCODING SYSTEMS
3.2 OPTICAL MUSIC RECOGNITION: HOPES AND REALITIES
3.3 METADATA AND INTEROPERABILITY
3.4 PROVENANCE
4 LOOKING AT CORPORA FROM A SCHOLARLY VIEWPOINT
4.1 SYSTEMATIC CORPORA CREATION: WHERE DO WE COME FROM AND WHERE ARE WE NOW?
4.2 AIMS AND CHALLENGES OF AN ANALYST
4.3 CRITICAL EDITING OF MUSIC AND THE MONUMENTAL/COLLECTED EDITION IN THE DIGITAL AGE
4.4 HOW TO SELECT A REPRESENTATIVE CORPUS FOR HISTORICAL STUDY
5 BEYOND THE SCHOLARLY COMMUNITY
5.1 CPDL CONTRIBUTORS: A CASE STUDY
5.2 PARTICIPATION AND COLLABORATION IN CORPUS CREATION
6 EARLY MUSIC ENCODING: A PACT ANALYSIS
6.1 BRIEF INTRODUCTION TO PACT
6.2 PEOPLE
6.3 ACTIVITIES
6.4 CONTEXTS
6.5 TECHNOLOGIES
6.6 CONCLUSION
7 NARROWING THE GULF BETWEEN PEOPLE AND SYSTEMS IN MUSIC ENCODING
7.1 INTERVIEWS
7.2 PERSONAS
7.3 ROLES
8 VISION OF THE FUTURE OF CORPUS BUILDING
9 IMPLEMENTATION OF THE VISION
REFERENCES
APPENDICES
APPENDIX 1: CORSICA PROPOSAL
APPENDIX 2: EARLY MUSIC CORPORA
1 Introduction
This report presents a vision for the future of corpus creation and digital editing within the early music domain. It was developed in the EarlyMuse Short-Term Scientific Mission CORSICA (an acronym of Creation Of eaRly muSIc CorporA).[1] In a nutshell, the problem addressed by this report is that there isn’t enough data available for systematic computational analysis of early music. Since the number of potential participants in encoding projects is limited, we need to devise strategies that make data creation more effective, while at the same time the participants keep their sense of autonomy and feel respected for their contribution. Chapter 2 explains this problem further. The following chapters summarise the current state of early music corpus creation and digital editing from three perspectives: technology (Chapter 3), music research (Chapter 4) and community (Chapter 5). Chapter 6 organises the main insights from the state of the art by means of a PACT analysis, a framework for exploring a design situation from a human-centred perspective. Our human-centred approach is further elaborated in Chapter 7 by looking into the motivations, skills and work practices of professional and amateur corpus creators. This section also proposes a high-level model of collaborative projects. Based on the collected insights, the CORSICA team created a vision for the future of corpus creation (Chapter 8). As a starting point for the implementation of this vision, we present several recommendations to the EarlyMuse community as well as the wider community of researchers, practitioners and enthusiasts of early music (Chapter 9).
Throughout the report we highlight two areas in particular:
• various dimensions that describe the variety that can be observed in the early music encoding domain;
• human aspects of corpus creation such as uses, requirements, motivations, expertise and communities.
These relate to our view that diversity of approaches, motivations and technologies is an integral aspect of the music encoding domain and that it is more productive and rewarding to embrace this diversity rather than strive for unification.
This report was written by the STSM participants and researchers from the Music Information Processing group of Utrecht University. Frans Wiering acted as the general editor of the document. Authors of the chapters and sections are named below the title. Unnamed text was written by Frans Wiering. The content of Chapter 6 onwards represents the ideas and inputs of the entire group, not just the person who wrote the text.
[1] The proposal for this STSM is reproduced in Appendix 1. There is also a blog post about the STSM meeting in Utrecht in May 2024: https://earlymuse.eu/publications/blog/creation-of-early-music-corpora-corsica-activities-and-a-workshop-about-building-collections-of-digital-editions/ (accessed 31 October 2025).
2 Problem description
In a nutshell, the main problem CORSICA addresses is: we would like to conduct corpus research on early music but there aren’t enough suitable encodings. This description needs some unpacking, though. First, by early music we understand music composed before c. 1700 and secondly, we focus on polyphonic music. Despite the great variety of musics that fall within this description, we believe that these share enough commonalities with respect to editing and encoding to separate them from monophonic and later music and to treat them together.
The term ‘corpus’ is often used for large, bounded collections of texts or compositions, as in Corpus mensurabilis musicae, a series of scholarly editions covering a large cross-section of late-medieval and Renaissance music.[2] A more precise description stems from linguistics:
corpora are balanced, often stratified collections of authentic, ‘real world’, text of speech or writing that aim to represent a given linguistic variety. Today, corpora are generally machine-readable data collections.[3]
Here we already encounter a first dimension in the encoding domain, between creation of digital editions on the one hand, and corpus creation for analytical purposes on the other. What connects the two is the availability of a machine-readable, manipulable encoding of the music that in the former scenario is rendered (usually) as a PDF, and in the latter is fed into analytical software such as music21.[4]
For corpus research the availability of encodings is obviously a crucial matter. Whereas textual corpora have been created at a large scale and are relatively homogeneous, musical corpora are comparatively small and display a considerable variety in encoding practices. They were often created during research projects with a specific focus, which then impacted encoding decisions. For pre-1700 polyphonic music, the situation is especially complicated: few corpora contain more than several hundreds of items; different types of music notation such as mensural notation and tablature pose specific encoding problems; and transcription to modern notation can be supported in multiple ways.
Some of the consequences of this situation are:
• available corpora are often not a representative selection from the known repertoire, making bias in the corpus almost inevitable (see Section 4.4);
• quality and interoperability issues increase when using multiple corpora (see Section 3.3);
• choosing and mastering technologies involves a substantial intellectual challenge, at the expense of scarce research resources;
• off-the-beaten-track research requires an excessive amount of encoding effort, while studying that which has already been studied by others is easy;
• PhD students and early-career researchers do not have sufficient resources to embark on careers as digital musicologists.
[2] See https://www.corpusmusicae.com/cmm.htm (accessed 1 August 2025).
[3] ‘Corpus Linguistics,’ Wikimedia Foundation, https://en.wikipedia.org/wiki/Corpus_linguistics (accessed 23 October 2025). The definition from this article is based on C.F. Meyer, English Corpus Linguistics: An Introduction, 2nd ed. (Cambridge University Press, 2023), Section 1.1, ‘Defining a Corpus.’
[4] Music21 is ‘a Python-based toolkit for computer-aided musicology’, see https://www.music21.org/music21docs/ (accessed 1 August 2025).
On the other hand, there are some important positive developments:
• the emergence of MEI (Music Encoding Initiative[5]) as a versatile musicological encoding system with a growing user community (see Sections 3.1 and 4.1);
• improvements in Optical Music Recognition of early music (see Section 3.2);
• the general practice of musicologists to use music notation software to create their transcriptions;
• the widespread practice of online sharing of editions created by both professional and amateur musicians, and by citizen scientists possessing significant musicological expertise (see Sections 5.1 and 5.2);
• the digital turn in musicology, accelerated by the pandemic.[6]
The solution to the lack of encodings has often been sought in standardisation and training. These may indeed boost productivity, especially in the context of funded, large-scale projects such as Polish Musical Heritage.[7] But elsewhere such endeavours have met with limited success for at least four interconnected reasons: complexity and limitations of tools; required time investment for learning; loss of already created work; and most importantly loss of autonomy. Since small-scale, autonomous projects are the norm for both music researchers and citizen scientists, it seems logical to take their autonomy, motivation and variety of practices as a starting point for designing a new ecology of early music corpus creation.[8] This means first understanding why and how encoders (which to a large extent means creators of digital editions) do their work. Next comes the question if, how, and under what conditions encoders would be willing to share their work, and how their contributions should be recognised. The last question is how their work could be effectively coordinated and supported with suitable tooling and processes. This document will present the first iteration in the elaboration of these ideas.
[5] https://music-encoding.org/ (accessed 1 August 2025).
[6] F. Wiering and C. Inskip, ‘The Impact of the Pandemic on Musicologists’ Use of Technology,’ Digital Humanities Quarterly 19, no. 2 (2025), https://dhq.digitalhumanities.org/vol/19/2/000786/000786.html (accessed 23 October 2025).
[7] https://polish.musicsources.pl/en (accessed 1 August 2025).
[8] It is standard practice in Human Computer Interaction to investigate human aspects of system design first and start designing processes and technology only after those are sufficiently understood. See e.g. D. Benyon, Designing User Experience (Pearson UK, 2019).
3 Corpora and technologies
The aim of this chapter is to give an overview of the current state of corpus creation from a primarily technological viewpoint: the availability of data and metadata, the software and the encoding formats used for data production. Section 3.1 presents an overview of the available early music encodings and their formats. There is considerable variation in technical approaches, making conversion between formats an issue. Section 3.2 describes Optical Music Recognition, a collection of technologies that potentially could reduce the need for tedious data entry work. Encodings are useless when we do not know what precisely they represent: metadata (Section 3.3) ideally records such information extensively, in a way that is shared between different resources. Section 3.4 describes how provenance information could be recorded by means of Data-Envelopes.
3.1 An overview of early music corpora and encoding systems
Music encoding and corpus building have been part of computational musicology since the 1960s and 1970s. The legendary Princeton Josquin Project aimed to encode the composer’s complete output in IML (Intermediary Music Language).[9] Numerous other projects produced larger or smaller collections using a variety of encoding systems.[10] Unfortunately, through a near-perfect storm involving paradigm changes in both musicology and computing as well as personal circumstances and career choices of key researchers, only minimal traces of the encoding efforts from these early projects have survived. The main exceptions are the monophonic musical incipits of the RISM Catalog:[11] the work of encoding these in PAEC (Plaine and Easie Code) started in the 1960s and continues today.
The next stage in music encoding was shaped by the personal computer revolution and the emerging Internet. The MuseData and Humdrum corpora that go back to the 1980s are still available for research and KernScores (the current name of the main Humdrum corpus) is still growing. This is also when music enthusiasts started to create MIDI collections that later developed into resources like the Classical MIDI Files.
Today, encodings of early music can be found on numerous websites. Those that are known to the STSM participants are listed in Appendix 2, with a brief description of their content, estimated number of items, encoding system(s), URLs, and some other information. A comprehensive overview of music corpora research is lacking and very likely some useful collections of early music encodings have escaped our notice.
[9] A. Mendel, ‘Some Preliminary Attempts at Computer-Assisted Style Analysis in Music,’ Computers and the Humanities 4, no. 2 (1969): 41–52, https://www.jstor.org/stable/30199321.
[10] E. Selfridge-Field, ed., Beyond MIDI: The Handbook of Musical Codes (The MIT Press, 1998).
[11] See https://rism.online/ and https://opac.rism.info/ (accessed 24 October 2025).

| Name | Description | Documentation |
| --- | --- | --- |
| ABC | Encoding system designed to notate music in plain text format, often used for encoding tunes but also for encoding complete scores | https://abcnotation.com/wiki/abc:standard |
| CMME | XML-based representation of mensural notation, supporting transcription to CMN | https://cmme.org/data/music/cmme.xsd (XML Schema) |
| Humdrum | Family of music representations in tabular format, such as **kern for CMN, **fret for tablatures and **mens for mensural notation | https://www.humdrum.org/rep/index.html; https://doc.verovio.humdrum.org/humdrum/mens/ for **mens |
| IML | Encoding language for representing CMN on punch cards, developed for the Princeton Josquin Project | Robison (1967) |
| Lilypond | LaTeX-like typesetting language for music notation: early music notations and tablature are also supported | https://lilypond.org/doc/Documentation/notation/ |
| MEI | XML-based representation for encoding music notation documents, customisable to a variety of notation types | https://music-encoding.org/guidelines/ |
| MIDI | Binary format for storing music information as messages to an electronic instrument. Despite its limitations, it is also used for data exchange between music printing programs and as storage format | https://midi.org/midi-1-0-detailed-specification (version 1.0, 1996); https://midimusic.github.io/tech/midispec.html (version 1.1, 1999) |
| MuseData | Text-based encoding system for the logical content of scores in CMN | https://www.ccarh.org/publications/books/beyondmidi/online/musedata/ |
| MusicXML | XML format for sharing sheet music between applications, with support for mensural notation and tablature | https://www.w3.org/2021/06/musicxml40/ (version 4.0, 2021) |
| PAEC | System for representing music notation with typewriter symbols, mostly used for encoding incipits | https://www.iaml.info/plaine-easie-code |
| Tab | Encoding system for typesetting lute music | https://www.cs.dartmouth.edu/~wbc/lute/AboutTab.html |
| Tabcode | ASCII format for French and Italian lute tablature | https://igor.gold.ac.uk/isms/ecolm/?page=TabCode |

Table 1: Music encoding formats mentioned in this report.
Most scholarly corpora, such as the Stanford-based Josquin Research Project (JRP), the Electronic Medieval Music Score Archive Project (EMMSAP) and Electronic Linked Annotated Unified Tablature Edition (E-LAUTE), have a rather specific focus and contain from fewer than a hundred to a few thousand encodings. The largest corpora stem from individual or collective projects by citizen scientists who edit vocal and lute music. Helping the community by sharing music editions seems to be an important motivation for contributing. Reusable encodings are generally a by-product of their effort and may not be made available to others. There is a wide range of editorial approaches, and the quality of the work is rather uneven. Nevertheless, the best and most active contributors to Choral Public Domain Library (CPDL) and similar online collections have produced substantial numbers of editions and encodings that offer an interesting potential for musicological research. For the scholarly perspective on several of these corpora see Section 4.1.
Numerous encoding formats are employed in the corpora (see Table 1 for a survey). MuseScore, one of the most popular software tools for music editing, supports the open encoding format MusicXML. Other formats are regularly used as well, especially in music research, such as MEI (Music Encoding Initiative; with extensive support for critical editions and active development of markup for non-standard notation forms), Humdrum, Lilypond, CMME (Computerized Mensural Music Editing; for mensural notation) and various systems for encoding lute tablature. Many projects support MIDI as a format for playback and data exchange. As a storage format it has significant shortcomings, for example the inability to encode lyrics.
The basic task of entering music into the computer can be time-consuming, error-prone and (depending on one’s motivation) tedious, even when encoding only a limited set of properties. Good tooling (Graphical User Interface design in particular) can obviously make a difference. Optical Music Recognition (OMR) may offer a solution in some cases (see Section 3.2 for an in-depth discussion), but even the best OMR output currently needs substantial error correction and post-processing.
Different communities of encoders can be identified, each with its own mix of aims, tools, encoding format and publication practices (e.g. MEI, CPDL, Lilypond, Tab). While it is unlikely (for both intellectual and social reasons) that these communities will eventually merge, it makes sense to study how and to what extent one community’s output can be another community’s input. Many resources offer encodings in multiple formats that are automatically derived from the storage format. Tools for conversion of encodings include:
• LuteConv,[12] a command-line tool for conversion between lute tablature encoding formats, also available as a Web interface and API for ease of use;[13]
[12] https://bitbucket.org/bayleaf/luteconv/ (accessed 24 October 2025).
[13] https://luteconv.mdw.ac.at/ (accessed 24 October 2025).
and this composition is not one of his more famous works, we assume that the results of our case study are indicative of the situation concerning metadata about Renaissance music.
We queried a number of data sources to find other sources and editions of this composition. The results can be summarized as follows.
• Répertoire International des Sources Musicales (RISM)
○ The composition is in the RISM, but not this particular source (Codex Lerma)
• Digital Image Archive of Medieval Music (DIAMM)
○ The Codex Lerma is in the DIAMM database
○ The list of works in Codex Lerma is incomplete
○ There is a dead link to the scan at Utrecht University Library
• Wikidata[45]
○ The composer is present
○ The composition is not present
• Muziekweb[46]
○ The composer is present
○ Three instances of a recording are present
• MusicBrainz[47]
○ The composer is present
○ The list of works by Lassus is incomplete
• Choral Public Domain Library (CPDL)[48]
○ Five editions of this composition are present
○ Only one includes a reference to a source
• International Music Score Library Project (IMSLP) / Petrucci Music Library[49]
○ The composer is present
○ The list of works is incomplete
○ There are 3 editions and 4 arrangements of this composition.
○ There is a dead link to CPDL caused by a spelling variation (‘amerai’ instead of ‘aimerai’).
During this exercise, one particular disappointment was that there does not seem to exist a machine-readable data source that gives all compositions of such a prominent composer as Lassus, although there exists a Lasso Verzeichnis (LV) in which our composition is number 39 (as reported in IMSLP).[50]
[45] https://www.wikidata.org/ (accessed 31 October 2025).
[46] https://www.muziekweb.nl/ (accessed 31 October 2025).
[47] https://musicbrainz.org/ (accessed 31 October 2025).
[48] https://www.cpdl.org/ (accessed 31 October 2025).
[49] https://imslp.org/ (accessed 31 October 2025).
[50] H. Leuchtmann, Orlando di Lasso. Seine Werke in zeitgenössischen Drucken 1555–1687 (Bärenreiter, 2001), 3 vols.
These results show that even for a prominent composer such as Lassus it is not possible to automatically retrieve the requested overviews from the various data sources, despite the promises of Linked Open Data. The level of integration at machine-readable level is disappointing.
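The case-study findings can also be written down in a small machine-readable form, which makes the gaps between data sources easy to query. The following Python sketch is illustrative only: the dictionary layout, the field names and the helper `sources_with` are assumptions introduced here, not part of any of the services mentioned, and `None` marks information that the case study does not report.

```python
from typing import Optional

# Findings per data source, transcribed from the case study.
# True = reported present, False = reported absent, None = not reported.
# The field names "composer" and "composition" are hypothetical labels.
FINDINGS: dict[str, dict[str, Optional[bool]]] = {
    "RISM":        {"composer": None, "composition": True},
    "DIAMM":       {"composer": None, "composition": None},
    "Wikidata":    {"composer": True, "composition": False},
    "Muziekweb":   {"composer": True, "composition": True},   # present as recordings
    "MusicBrainz": {"composer": True, "composition": None},
    "CPDL":        {"composer": None, "composition": True},   # five editions
    "IMSLP":       {"composer": True, "composition": True},   # editions and arrangements
}


def sources_with(field: str, value: Optional[bool] = True) -> list[str]:
    """Return the sorted names of sources whose finding for `field` is `value`."""
    return sorted(
        name for name, finding in FINDINGS.items() if finding[field] is value
    )


if __name__ == "__main__":
    print("Composition found in:    ", sources_with("composition"))
    print("Composition missing from:", sources_with("composition", False))
    print("Not reported:            ", sources_with("composition", None))
```

Keeping "absent" and "not reported" apart matters here: the case study shows that a missing answer is often a limitation of the data source rather than evidence about the repertoire.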
3.4 Provenance
Marnix van Berchum
Provenance information provides the ‘history’ of the content one is using at the moment, tracing all intermediate steps of its state. Basically, it models the changes of the state of the content, answering the questions on the agents involved in the changes: Who?, How?, What?, When?, Why?, and Where?. Provenance information gives the context for (re)use and enables the reproduction of research results. Two types of provenance can be distinguished: provenance of the content and technical provenance. Figure 2 shows, as example, a simplified view of where provenance information ‘lives’ in the context of early music editions. At the level of a corpus provenance information should include information on how the corpus came into being and which selections were made by whom (for more activities which can be captured as provenance, see Section 6.3).
Figure 2. Sources of provenance information.
Three stages of the edition are discerned: the Source, a Transcription, and the (final) Edition. At each stage relevant elements should be captured and documented:
1. Source: Which source or sources is/are used (siglum, description, identifier, etc.)? Is the physical item or a (digital) reproduction used? If a reproduction: how is the source reproduced, with what technology, settings, etc.? Information on the source may extend to detailed analysis of the codicological and paleographical status of a source.
2. Transcription: What is the format of the transcription (paper, MEI, **kern, etc.)? Who made the transcription, etc. (i.e. metadata of the transcription)?
3. Edition: What is the format of the edition (paper, MEI, **kern, etc.)? Who made the edition, etc. (i.e. metadata of the edition)?
Other provenance information maps the (technical) processes, when moving from one stage to the other. It includes answers to the Who?, How?, What?, When?, Why?, and Where? questions:
1. How is the transcription made: manually or (semi)automatically with OMR? Which software is used, with which version? Which choices are made in the transcription (e.g. note values, error correction, etc.)? Why and on what basis are transcription conventions used?
2. What conversions are made between the transcription and the edition (e.g. from MEI to PDF)? Which software is used, with which version? Are more editorial choices made, and if so, by whom, on what basis etc.?
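The stage metadata and process questions listed above could be captured in a simple record structure. The Python sketch below is a minimal illustration under stated assumptions: the class names, field names and all example values are hypothetical and do not reproduce the Data-Envelopes schema or any existing provenance tool.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Stage:
    """Metadata for one stage: source, transcription or edition."""
    name: str              # "source", "transcription" or "edition"
    format: str            # e.g. "digital reproduction", "MEI", "PDF"
    made_by: str           # person or institution responsible
    description: str = ""  # e.g. siglum, identifier, reproduction settings


@dataclass
class ProcessStep:
    """Answers to the provenance questions for a move between two stages."""
    from_stage: str
    to_stage: str
    who: str
    how: str
    what: str
    when: str              # ISO 8601 date string
    why: str
    where: str
    software: str = ""
    software_version: str = ""


@dataclass
class ProvenanceRecord:
    stages: list[Stage] = field(default_factory=list)
    steps: list[ProcessStep] = field(default_factory=list)

    def missing_steps(self) -> list[tuple[str, str]]:
        """List consecutive stage pairs that have no documented process step."""
        documented = {(s.from_stage, s.to_stage) for s in self.steps}
        names = [s.name for s in self.stages]
        return [pair for pair in zip(names, names[1:]) if pair not in documented]

    def to_json(self) -> str:
        """Serialise the whole record for storage alongside the encoding."""
        return json.dumps(asdict(self), indent=2)


# Placeholder values only; none of these describe a real edition.
record = ProvenanceRecord(
    stages=[
        Stage("source", "digital reproduction", "Example Library", "siglum EXAMPLE-1"),
        Stage("transcription", "MEI", "Example Encoder"),
        Stage("edition", "PDF", "Example Editor"),
    ],
    steps=[
        ProcessStep(
            "source", "transcription",
            who="Example Encoder", how="manual transcription",
            what="note values halved", when="2025-01-01",
            why="project transcription conventions", where="Example Institute",
        ),
    ],
)

if __name__ == "__main__":
    print(record.missing_steps())  # stage transitions still lacking documentation
    print(record.to_json())
```

A check such as `missing_steps` is one way to make gaps visible early: in this example the move from transcription to edition has not yet been documented.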
Figure 3. Modules available in the Huygens Data-Envelopes, taken from Luthra and Eskevich.[51]
Examples of implementation of provenance information, at the larger scale of a corpus, can be found at the Huygens Institute for History and Culture of the Netherlands.[52] In case of provenance of content, addressing ‘the particular needs and challenges of the cultural heritage field’ with regard to the heterogeneous nature (complex, diverse, uncertain, incomplete etc.) of humanities data, the data management department of this research institute developed the so-called ‘Data-Envelopes’.[53] This ‘documentation framework’ captures all intricacies of the data ‘while combining machine-readability and user-friendliness’. Furthermore, the data-envelopes take into account the legal and ethical sensitivities of the data. The combination of information is aimed at the potential (re)use of the data, by both human and machine agents. It is ‘structured into modular sections, each designed to encapsulate different facets of the dataset in a systematic manner’, see Figure 3.
[51] M. Luthra and M. Eskevich, ‘Data-Envelopes for Cultural Heritage: Going beyond Datasheets,’ in Proceedings of the Workshop on Legal and Ethical Issues in Human Language Technologies @ LREC-COLING 2024, eds. I. Siegert and K. Choukri (ELRA and ICCL, 2024): 52–65, https://aclanthology.org/2024.legal-1.9 (accessed 25 October 2025).
[52] https://www.huygens.knaw.nl/en/ (accessed 25 October 2025).
Figure 4. Flow diagram of provenance trails generated in the REPUBLIC project.
For the capture of technical provenance information an implementation was devised in the REPUBLIC project.[54] The goal of the project was to ‘digitally unlock all resolutions (decisions) that the States General [of the Netherlands] took during their existence as an autonomous political entity (1576–1796).’ These resolutions can be found in a scanned version of the originals in the National Archives and are made available in an online search interface (relating to the scope of this paper, it can be considered as a fully searchable digital corpus of resolutions).[55] All data processing steps were tracked, making ‘it possible to trace down to the smallest detail the decisions we have made in the entire process from scanning the resolution books to displaying them in the web interface’.[56] Figures 4 and 5 show a schematic (‘flow’) and detailed view (‘data trail’) of the provenance trails generated in the project. In the latter view the different questions (Who?, How?, What? etc.) can be clearly seen, including links to external resources, for the individual involved (ORCID) and the script used.
[53] See the summarising article on this work: M. Luthra and M. Eskevich, ‘Data-Envelopes for Cultural Heritage.’ All following quotes taken from this article.
[54] See https://goetgevonden.nl/ (accessed 25 October 2025).
[55] The online interface is available at https://app.goetgevonden.nl (accessed 25 October 2025).
Figure 5. Detailed view of a provenance trail generated in the REPUBLIC project. Developed by Team Structured Data / Digital Infrastructure department of the KNAW Humanities Cluster, https://di.huc.knaw.nl/home-en.html.
[56] See https://goetgevonden.nl/en/the-method/provenance/ (accessed 25 October 2025). The software developed by the shared Digital Infrastructure department / Team Structured Data is available at https://github.com/knaw-huc/provenance (accessed 25 October 2025).
4 Looking at corpora from a scholarly viewpoint
This chapter describes the dealings of scholars with corpora. Section 4.1 on systematic corpus creation resumes the treatment of corpora from Section 3.1 through an in-depth discussion of some important scholarly corpora of early music from the perspective of the researcher who wishes to use them. Section 4.2 continues this perspective with an analysis of analytical tasks and the challenges that an analyst might face in the execution of those tasks. A different but related perspective emerges in Section 4.3, namely that of critical editing in the digital age. Finally, Section 4.4 presents the perspective of the researcher who, in the absence of a complete encoding of everything, still wants to assemble a representative selection of compositions for their research.
4.1 Systematic corpora creation: where do we come from and where are we now?
Esperanza Rodríguez-García
The impulse to assemble comprehensive music corpora is deeply embedded in musicology. From its institutional beginnings, the field has prioritised the creation of cohesive collections of music. Notably, the emergence of musicology as a scientific discipline in the mid-nineteenth century was closely linked to the publication of monumental editions, such as the Bach-Gesellschaft Ausgabe (from 1851), the Händel-Gesellschaft Edition (from 1858), and the Opera omnia Ioannis Petraloysii Praenestini (from 1862), among the earliest of their kind. Over a century later, the rise of computational musicology has echoed this same impetus, especially as ‘big data’ methods gained traction in the 2000s.[57] This trend has only accelerated with the popularisation of digital editing tools, online open repositories, and increasingly sophisticated analytical technologies. Collaborative efforts between historical musicologists and information technologists have resulted in more accessible digital tools, fostering broader scholarly engagement with digital corpora.
My di ec expe ience wi h such co po a s ems om my ‘Coo dina o o Sac ed
Repe o ies’ ole a he A chi e o Ibe ian Polyphony.
58
This small and s ill-incomple e
eposi o y, which con ains 130 wo ks o Ibe ian o igin, was de eloped as pa o he
p ojec The Ana omy o La e 15 h- and Ea ly 16 h-Cen u y Ibe ian Polyphonic Music
(2016–2019), di ec ed by João Ped o d’Al a enga a Uni e sidade No a de Lisboa,
Po ugal. The A chi e o e s c i ical edi ions in PDF o ma accompanied by ex ensi e
me ada a, hough i does no cu en ly include encoded music.
57
S. Tuppen et al., ‘Library Catalogue Records as a Research Resource: Introducing “a Big
Data History of Music,”’ Fontes Artis Musicae 63 (2016): 67–88,
https://doi.org/10.1353/fam.2016.0011.
58
https://iberianpolyphony.fcsh.unl.pt/ (accessed 25 October 2025).
Nonetheless, it provided the foundation for a dataset of Iberian and Franco-Flemish
motets ca. 1500, created to explore stylistic traits within these repertories and to
investigate the authorship of two anonymous motets. In collaboration with Cory
McKay (Marianopolis College), we analysed this dataset using jSymbolic for
feature-based comparisons.
59
The resulting corpus consists of 175 motets encoded as MIDI
files.
60
These were compiled by reusing edited materials from the Archive and
supplementing them with works drawn from other repositories. The Iberian motets
were modelled on editions produced by the Archive. The Franco-Flemish items were
adapted from the Josquin Research Project. Newly produced scores were explicitly
created to fill gaps in the repertoire. Given the heterogeneous nature of the sources,
we followed a workflow aligned with the methodologies and standards outlined by
Julie Cumming, Cory McKay, et al. in their seminal 2018 article, which systematises
best practices for building interoperable symbolic corpora of pre-1600 Western
music.
61
Throughout the encoding process, the crucial principle was consistency.
Whatever their origin, all the items were encoded as MusicXML files using Sibelius.
They were manually adjusted and revised to create the master files and then
exported as MIDI. Finally, they were double-checked with jSymbolic to detect any
possible inconsistency.
As a historical musicologist focusing on analysis, my assessment will concentrate on
scholarly corpora that provide encoded outputs suitable for computational analysis. I
will highlight the most notable examples without attempting to be exhaustive but
rather aiming to underscore trends and challenges in the field.
The earliest scholarly initiatives appeared in the first decade of the 21st century.
Similar to the formation of traditional corpora of critical editions, they were discrete,
uncoordinated and responded to various objectives.
One such effort was the Tomás Luis de Victoria database, a personal project by
Nancho Álvarez, mainly intended for performance.
62
While it does not qualify as a
scholarly repository, it merits mention here as a representative case that illustrates
the limitations of non-academic corpora for scholarly purposes.
Started around 2002 and significantly updated in 2009, the project initially focused
on the works of Tomás Luis de Victoria, eventually expanding to include
59
E. Rodríguez-García and C. McKay, ‘Composer Attribution of Renaissance Motets: A Case
Study Using Statistical Features and Machine Learning,’ in The Anatomy of Iberian Polyphony
around 1500, eds. E. Rodríguez-García and J. P. d’Alvarenga (Reichenberger 2021): 401–38.
60
E. Rodríguez-García and C. McKay, ‘Composer Attribution of Renaissance Motets (Iberian
Polyphony around 1500): MIDIs and Extracted Features,’ dataset,
https://doi.org/10.5281/zenodo.4027957.
61
J. Cumming et al., ‘Methodologies for Creating Symbolic Corpora of Western Music Before
1600,’ in Proceedings of the 19th International Society for Music Information Retrieval
Conference, eds. E. Gómez et al. (2018): 348–354,
https://archives.ismir.net/ismir2018/paper/000026.pdf (accessed 26 October 2025).
62
https://victoria.uma.es/ (accessed 26 October 2025).
compositions by Morales, Guerrero, Vásquez, and other Spanish Renaissance
composers. An impressive undertaking, it currently offers over 1,500 encoded works.
However, the editions are not critical, and the encodings lack standardised metadata
essential for robust scholarly analysis. Furthermore, its reliance on LilyPond and MIDI
encoding formats presents challenges for interoperability and reusability. Another
early example is the Computerised Mensural Music Editing Project (CMME), created
in 2006 by Theodor Dumitrescu and Marnix van Berchum.
63
Although it only
contained 59 encoded pieces,
64
the project was groundbreaking in its approach to
handling the complexities of mensural notation. However, its tailored encoding
format (.cmme) limits compatibility with other formats.
65
The disjointed landscape of digital music notation in the early 2000s – described by
Hankinson et al. (2011) as ‘highly fragmented’ – began to be more systematically
addressed in 2010.
66
The Josquin Research Project, launched in 2010 and led by
Jesse Rodin and Craig Sapp, represents a turning point in standardising
procedures.
67
Designed with interoperability and reusability criteria in mind, it
houses 902 works by Josquin and his contemporaries, offering a broad range of
formats. In addition to the standard encoding trio of MusicXML, MEI and MIDI, the
database includes Humdrum, MuseData, NoteArray, and JSON Piano Roll. It also
features MP3 and PDF formats with/without editorial accidentals alongside an
embedded analytical tool. It has helped establish a widely accepted model of what
academically oriented music corpora should provide.
Similarly, Tasso in Music, directed by Emiliano Ricciardi and Craig Sapp, launched in
2015 (and still ongoing), contains a collection of 778 music settings of Tasso’s
poetry.
68
The project also offers a variety of encoding formats and capabilities.
Another lesser-known dataset, the Symbolically Encoded Il Lauro Secco (SEILS)
directed by Emilia Parada-Cabaleiro, is worth mentioning.
69
It comprises 30 pieces in
both early and modern notation, amounting to 150 codified scores. Its range of
formats is even more expansive, catering to both analysis and OMR applications:
LilyPond, MusicXML, MIDI, Finale, **kern, **mens, MEI, agnostic, semantic, and PDF
formats in both white mensural and modern notation. However, its hosting on
GitHub – while an accessible platform – makes it somewhat more challenging to
discover than a standard webpage, hindering findability.
63
http://www.cmme.org/ (accessed 26 October 2025).
64
And around 200 pieces in the GitHub repository
https://github.com/tdumitrescu/cmme-music (accessed 26 October 2025).
65
But see Fiala et al., ‘A New XML Conversion Process.’
66
A. Hankinson et al., ‘The Music Encoding Initiative as a Document-Encoding Framework,’ in
Proceedings of the 12th International Society for Music Information Retrieval Conference, eds.
A. Klapuri and C. Leider (2011): 293–298,
https://archives.ismir.net/ismir2011/paper/000011.pdf (accessed 26 October 2025).
67
https://josquin.stanford.edu/ (accessed 26 October 2025).
68
https://www.tassomusic.org/ (accessed 26 October 2025).
69
https://github.com/SEILSdataset/SEILSdataset (accessed 26 October 2025).
The 1520s Project, directed by Ben Ory and Craig Sapp, is another step ahead.
70
Started in 2019, it contains 400 music items from ca. 1510–40, encoded in MEI,
MusicXML, MIDI, and Humdrum. The project features an exquisite and thorough
treatment of the metadata and a sophisticated array of analytical tools.
Launched in 2023, E-LAUTE is the newest project in the field and one of the most
ambitious. It aims to produce a comprehensive digital edition of lute tablatures
written in German cypher. Supported by research agencies in Austria, Germany, and
Switzerland, the project brings together a substantial interdisciplinary team under
the leadership of Kateryna Schöning, Irene Holzer, and Martin Kirnbauer. The current
focus is on tablatures dating from 1450 to 1550, with plans to extend coverage to
works up to 1600 in subsequent phases, ultimately reaching a corpus of roughly
2,000 pages. The project demonstrates carefully considered choices in its use of
technological standards (MEI and Web technologies), its balance between scholarly
and performance-oriented perspectives, and its commitment to public
engagement – most notably through opportunities for collaborative editing.
71
This progress represents a decisive move toward greater standardisation and
accessibility in creating corpora, addressing the earlier fragmentation challenges.
The importance of supporting as broad a range of formats as possible has been
systematically articulated by Julie Cumming, Cory McKay, et al. in the
aforementioned article.
Despite this general trend, a significant group of projects developed around the
Centre d’études supérieures de la Renaissance and Haverford College – directed by
Philippe Vendrix and Richard Freedman – have taken a more focused approach,
offering only MEI and PDF files. These include The Lost Voices (2012),
72
CRIM
(2018),
73
and Gesualdo Online (2019).
74
The first two are particularly strong on
analysis, and the latter exemplifies innovative dynamic editions. Although well suited
to the specific aims of these projects, the restricted choice of formats can pose
challenges for interoperability, as converting MEI into other symbolic formats remains
technically challenging.
Ideally, a digital corpus should consist of a representative selection of works curated
for a specific scholarly objective, systematically edited and encoded per the
established musicological conventions. It should provide PDF scores alongside
symbolic files in as many formats as feasible, enriched with comprehensive
metadata. This level of openness and structure not only enhances scholarly utility but
70
https://1520s-project.org/about/ (accessed 26 October 2025).
71
https://e-laute.info/ (accessed 26 October 2025).
72
http://digitalduchemin.org/ (accessed 26 October 2025).
73
https://crimproject.org/ (accessed 26 October 2025).
74
https://ricercar.gesualdo-online.cesr.univ-tours.fr/ (accessed 26 October 2025).
also aligns with the FAIR principles:
75
making data Findable, Accessible,
Interoperable, and Reusable, thereby ensuring its long-term relevance and usability.
4.2 Aims and challenges of an analyst
Anna Plaksin
A key task of the analyst of a corpus is mapping the rationale behind the creation of
a corpus onto the goal of its interrogation. This is especially true when
preexisting corpora or digital early music resources are reused and reaggregated.
Data collections are gathered based on different goals, encompassing the spectrum
of research-driven and curation-driven aims.
76
The role of a corpus in research can
be understood as a model of the world we would like to gain knowledge of.
According to Herbert Stachowiak’s definition, a model is always a representation of
something. It generally does not capture all the attributes of the original it represents,
and it always represents the original not per se but for a specific context or under the
restriction of certain operations.
77
Following this definition, a corpus
is always a representation of something in a different medium. It is not the original or
a copy; it is a reduction. And it always aims at a special context – it only represents
the original for a certain (limited) use. That means the re-use of a corpus in a different
context is not guaranteed, and it is necessary to check whether the corpus and its
assumptions fit the research question, because both the creation rationales and
the activities encompass nearly infinite possibilities.
Activities that involve the interrogation of early music corpora may be grouped into
three categories:
1. Search / Retrieval: Finding resources by certain criteria.
E.g. finding teaching resources by composer, genre, or provenance or sheet
music for performance for a certain setting, vocal range or difficulty.
These scenarios are mainly focused on metadata searches and aim at finding
specific resources within a large collection.
2. Content-based analytical operations.
E.g. melodic search, automated music analysis or training an algorithm for
automatic composition.
These activities need machine-readable representations of music. They
work fine with straightforward representations of modern transcriptions, e.g.
as MIDI files, MusicXML or **kern.
75
https://www.go-fair.org/fair-principles/ (accessed 26 October 2025).
76
See J. Flanders and F. Jannidis, eds., The Shape of Data in Digital Humanities: Modeling
Texts and Text-Based Resources (Routledge, 2019), 86–89.
77
See H. Stachowiak, Allgemeine Modelltheorie (Springer, 1973), 131 .
expense of the others; the second is to conflate the texts. However, in reality we are
dealing with a spectrum with diplomatic transcription at one end and critical edition
Figure 8. Beginning of critical commentary to Peter Philips, Dolorosa Pavan. © Copyright
1999 by The Musica Britannica Trust and Stainer & Bell Ltd. Reproduced from Peter
Philips: Complete Keyboard Music, edited by David J. Smith, Musica Britannica
Vol. 75 (p.196) by permission of Stainer & Bell Ltd, London, www.stainer.co.uk. All rights
reserved.
at the other. The conflation of sources is the norm for Musica Britannica volumes of
keyboard music, and indeed for some seventeenth-century instrumental
collections.
91
At the other end of the spectrum there are source-based editions such
as the one of the Fitzwilliam Virginal Book by Jon Baxendale and Francis Knights which,
while scholarly in terms of its introductory material, is not critical – indeed, the editors
91
For example, the Le Strange manuscripts, especially GB-Lbl, Add. MSS 39550–4, record
every difference between his versions of music for viol consort and those in manuscripts
owned by his friends, regardless of whether it affects how the music sounds. See Peter
Philips and Richard Dering: Consort Music, ed. David J. Smith, Musica Britannica Vol. 101
(Stainer & Bell, 2016), xxiv–xxviii, 186–187.
describe it as a ‘pseudo-facsimile’,
92
and it preserves many graphical features of the
manuscript, including the beaming of the shorter note-values and idiosyncratic
notation of semibreves as two tied minims. The editors make minimal changes to the
text, correcting obvious errors, and make suggestions concerning, for example,
accidentals, but they do not refer to other sources of pieces contained in the
Fitzwilliam Virginal Book: given the scale of the manuscript, this is understandable.
The focus here is on the source, rather than on the composer.
Most editions sit somewhere in the middle of the spectrum: a source edition – as
opposed to transcription – would correct obvious mistakes by reference to
concordant sources; a composer edition usually has a main, principal, base text.
Players have a use for both kinds of edition: for example, someone making a
recording of the Fitzwilliam Virginal Book would need a source-based edition, but a
concert of keyboard music by Peter Philips might require a composer-based one. An
ideal edition, then, would be able to cater for both. However, traditional,
paper-based publications cannot provide editions of all the sources alongside a critical
edition combining them because of the constraints of space. This is where the
dynamic, digital edition of the future comes in: the two editorial approaches can be
reconciled, with source- and composer-editions coexisting in the same virtual space.
A critical edition needs to record all the information in one form or another, and the
co-existence of source editions and composer editions makes this easier to achieve.
During the STSM in Utrecht, we thought a great deal in terms of ‘layers’. In the
context of scholarly, critical music editions, encoding would involve a minimum of
three levels, each building on the previous one:
• Diplomatic Source Transcription = Transparency: all sources should be
encoded to produce diplomatic source transcriptions so that the user is able
to refer back to the ‘raw material’.
• Critical Source Editions: critical editions of the sources where the main text is
changed only where absolutely necessary and by reference to the concordant
sources.
• Critical Composer Editions: the focus is on the composer so, following the
example of seventeenth-century sources such as the Le Strange consort
manuscripts, a base text forms the foundation of an edition where there is
greater flexibility in combining readings from different sources.
In the digital domain, such a multifaceted edition could follow the model of existing
collected editions: at a click of a mouse, the textual commentary could appear but
using music notation rather than text. Notes could be coloured on screen to alert the
user to disagreement between sources. However, it would also be feasible to display
texts in parallel in a manner not dissimilar to Figure 6, or to use IIIF
93
to create links
92
The Fitzwilliam Virginal Book, eds. J. Baxendale and F. Knights (Lyrebird Music, 2020) vol. 1,
xxxi.
93
International Image Interoperability Framework, https://iiif.io/ (accessed 26 October 2025).
between a bar in the edition and the corresponding place on a digitised image of
the source.
94
The latter is more easily envisaged for source editions, and has been
implemented in some of the Polish scores: see, for example, Giovanni Matteo Asola’s
Ave Sanctissima Maria where double-clicking on a note brings up an image of the
corresponding system in its source. The same piece may be used to illustrate how
digital editions can facilitate the co-existence of original and modern notation: at
the click of a mouse the display may be changed from original clefs, note-values
(including stemming), part-names, spelling and orthography to a modern score with
modern clefs, added barlines, and modern part-names, spelling and orthography.
Different users of sheet keyboard music have different priorities, and there is
disagreement over the extent to which graphical aspects of the notation (such as the
beaming of quavers and semiquavers) have implications for performance practice. A
digital edition would therefore be able to take the user from a version with original
notation (number of staff lines, clefs, beaming, placement of accidentals) to a
modern one. There could be similar options for downloading PDFs for printing or
use on a tablet.
For a digital edition along the lines suggested above to be possible, there is a need
for corpus creation. All texts of each piece need to be encoded and ideally the
entire contents of each manuscript so that the edition becomes a research tool to
explore scribal interaction with the music. A digital edition such as is being proposed
here shifts the focus away from composers and on to scribes and users of
manuscripts in the seventeenth century. On the face of it, IIIF links to images might
suggest that not everything needs to be encoded. However, a player who believes
that beaming of smaller note values is indicative of phrasing would want it preserved
in a source edition. System-breaks can be vital to understanding placement of
accidentals and can provide valuable evidence of inter-source relationships.
Furthermore, anything that remains within a digitised image of a source without
being encoded becomes ‘invisible’ to a machine and therefore makes the data less
useful when analysing the music. One significant advantage of a scholarly digital
edition is that it can be more than the visual presentation of musical notation: the
data required for the creation of a multidimensional, dynamic edition as envisaged
above can be manipulated for a range of other purposes. Whereas a research
project involving a particular research question may involve making compromises in
what to include, leaving out any elements that will not be needed in addressing it, a
digital edition needs to remain as open and flexible as possible since it needs to
meet the needs of a range of users. As part of the Utrecht STSM, a number of
interviews were conducted (see Section 7.1), and one participant commented that
‘you don’t know what it is that you need to know until you need to know it’. A
monumental edition need not have a specific goal in mind, so needs to involve
94
https://polishscores.org/?id=16xx:900 (accessed 26 October 2025).
encoding as much as possible in order to make it useful to the widest range of
audiences.
Manually encoding music from scratch is labour-intensive. However, the use of
existing digital material speeds things up immeasurably. Any existing transcription or
edition of a piece can serve as the basis: OMR can be used if necessary to create a
MusicXML file that can then be tidied up in a notation software package before
being exported as XML and converted to MEI. The MusicXML file can be edited
multiple times to generate the various layers needed for a dynamic digital edition.
However, it is still a mammoth effort, especially if the encoding is to include the
degree of detail suggested. There is a trade-off to be made between speed and
quantity on the one hand and inclusivity on the other. Some projects have involved
the beneficiaries – the user communities – in the creation of a resource, but this raises
issues of Intellectual Property and copyright when working with established series,
not to mention quality assurance.
During the STSM, Mark Gotham gave an illuminating presentation on encoding
German Lieder and string quartets. Although the repertoire is later, we benefited
from his experience in crowd-sourcing the encoding of nineteenth-century editions
which seemed to have been highly successful. By using such editions as sources, he
avoided issues of IP and copyright. However, Julie Cumming and Cory McKay
reflected on their projects, including ELVIS and SIMSSA. Their experience of
crowd-sourced encodings of music which were intended primarily for analysis differed
markedly from that of Mark mainly because the sources used as their basis varied:
some encodings employed reduced note values whereas others did not. They were
sceptical about using crowdsourcing for the creation of encoded corpora of music
and preferred smaller but accurate and consistent datasets to large ones. The
contrast between the two perspectives illustrated the trade-off between quality and
scale. A project involving the creation of a new series of critical editions might well
involve users in their creation. Ironically, the lack of lute music in monumental
editions such as Musica Britannica has led to the creation of digital resources by
enthusiasts which can now provide the source transcriptions needed for critical
editions of, for example, lute music by Dowland.
95
However, the issues of IP and
copyright are that much greater for a well-established monumental edition,
especially when new editions are made from existing material. In the case of existing
series, there is scope to involve user communities in the design of a future digital
critical edition.
It may be unrealistic to expect a user necessarily to pay for an edition if they have
contributed to its creation. This is perhaps where there needs to be a distinction
between data and edition: an encoded manuscript is a digital representation of our
95
T. Crawford, ‘A Digital Corpus for Exploring the Lute Music of John Dowland (1563–1626),’
Journal of New Music Research 53, no. 3–4 (2025): 1–11,
https://doi.org/10.1080/09298215.2025.2486139.
shared heritage so perhaps should not be monetised, whereas an edition arising
from it is invested with the intellectual property of whoever created it. Clearly, were
crowdsourcing to be used for encoding texts in the creation of a critical edition for
an established series, there would need to be absolute transparency surrounding
the arrangement.
There is an obvious need to be able to combine data for editions of English
keyboard music made by several individual editors or editorial teams, especially
when one is working on a composer, and another on an individual source. For
scholarly, critical editions of keyboard music to flourish in the digital sphere, there
needs to be a degree of standardisation, consistency and transparency. In the longer
term, however, it is possible to envisage combining datasets of this keyboard
repertoire with those of other kinds of music – maybe traditional, folk music? – to
explore broader questions of dissemination and transmission.
4.4 How to select a representative corpus for historical study
Mirjam Visscher and Frans Wiering
The first problem one encounters when doing corpus-based music research is the
creation of the corpus itself, for the composition of the corpus is a critical factor in
the validity of the research outcomes. A suitable corpus minimally meets the
following criteria:
• it fits the research question: it must contain the features that are relevant for the
research;
• it is representative of the music under investigation, in terms of spread over
different strata such as genre, country and composers.
The latter problem is the topic of this section. It seems that most corpus
research (insofar as it doesn’t study a closed corpus, such as the complete output of
a composer) uses an approach of convenience sampling, making the most of the
available datasets and/or the (usually limited) amount of time for data creation. We
are not aware of a solid method to select compositions for inclusion in a musical
corpus. However, in computational linguistics a comprehensive, straightforward
method has been defined by Biber, which can be transferred to the music domain.
96
Biber’s approach has three important steps:
• define the boundaries of the population;
• define the organisation within the population;
• set a sampling frame.
We will illustrate this approach with a short example. For this we define the
boundaries of the population as all Western music surviving in manuscript or print
dated between 1500 and 1700. The organisation within the population refers to
96
D. Biber, ‘Representativeness in Corpus Design,’ Literary and Linguistic Computing 8, no. 4
(1993): 243–257, https://doi.org/10.1093/llc/8.4.243.
properties known as strata such as the composer, genre, secular or sacred nature,
instrumentation, country, language and date. The choice of strata relates to the
research question: a representative corpus reflects the distribution of compositions
over the selected strata of the population.
Figure 9. Conceptual population breakdown of all music there ever was (dark blue)
into the partially overlapping categories known (yellow), known & lost (red) and
underrepresented (green). The visible dark blue area represents the unknown
repertoire, whose size we do not know.
Figure 9 shows that the population has some problematic aspects. First, it is already
a selection from all Western music there ever was in the 16th and 17th centuries. We
know very substantial amounts of music were lost. In some cases, we know that
compositions and sources existed in the past, but there is an unknown amount of
music that has vanished without a trace. There is no reason to assume that this was a
purely random process in which every piece stood the same chance of surviving, so
we can be sure this has influenced the distributions of strata. There is some
specialised research into the estimation of numbers of items lost without a trace,
97
but we are not aware of methods for doing so at a large scale.
98
Information on the
known losses is scattered over the literature: it would be helpful if a centralised
inventory of these could be created. A second problematic aspect is the occurrence
97
M.S. Cuthbert, ‘Tipping the Iceberg: Missing Italian Polyphony from the Age of Schism,’
Musica Disciplina 54 (2009): 39–74, https://www.jstor.org/stable/25750547; M. Kestemont et
al., ‘Forgotten Books: The Application of Unseen Species Models to the Survival of Culture,’
Science 375, issue 6582 (2022): 765–769, https://www.doi.org/10.1126/science.abl7655.
98
Recently, however, it was estimated how many Gregorian chants have disappeared, based
on the chant distributions in the CANTUS database (https://cantusdatabase.org/, accessed
30 October 2025): J. Hajič, jr. and F.C. Moss, ‘Knowing When to Stop: Insights from Ecology
for Building Catalogues, Collections, and Corpora,’ in DLfM ’25: Proceedings of the 12th
International Conference on Digital Libraries for Musicology, ed. E. De Luca (Association for
Computing Machinery, 2024): 90–94, https://doi.org/10.1145/3748336.3748347.
of anonymous compositions. This is not a negligible category: in RISM, 18% of the
items are anonymous and in CPDL 10% of the compositions are anonymous.
Anonymous works are often hard to date, and their geographical origin may also be
less easy to identify. The distribution of anonymous composers is not uniform
geographically; we know for instance that the majority of 17th-century Scottish
compositions are by an anonymous composer.
99
They thus pose a serious problem
for creating a representative selection.
Third, not taking underrepresented groups into account may introduce additional
bias. There are catalogues of music by underrepresented groups, such as the ones in
Table 2. Such lists aren’t the final answer, however, since the attention to the
underrepresented may lead to overrepresentation and other underrepresented
groups may be missing. Being explicit about such problems, even if they cannot be
solved, is the best strategy.
category                          dataset name                    url
Iberian polyphony                 Archive of Iberian Polyphony    https://iberianpolyphony.fcsh.unl.pt/
Female composers                  BIG LIST of Women Composers     https://donne-uk.org/the-big-list/
Spanish and New World Polyphony   Books of Hispanic Polyphony     https://hispanicpolyphony.eu
Poland, 16th-20th century         Polish digital scores           https://polishscores.org/
Spanish Renaissance               Tomás Luis de Victoria          https://www.uma.es/victoria/index.html
Table 2. Sample catalogues of underrepresented groups.
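The estimation of items lost without a trace, mentioned under the first problematic aspect above, can be illustrated with a small sketch. It uses the Chao1 estimator, one well-known unseen species model, in which each work is treated as a ‘species’ and each surviving copy as a ‘sighting’. The function name and the toy copy counts are hypothetical, and this is an illustration under those assumptions rather than the procedure followed in any of the cited studies.

```python
def chao1(copies_per_work):
    """Estimate the total number of works, including those lost without a trace.

    copies_per_work holds, for every surviving work, the number of surviving copies.
    """
    counts = [c for c in copies_per_work if c > 0]
    observed = len(counts)                          # works that survive at all
    singletons = sum(1 for c in counts if c == 1)   # works known from one copy
    doubletons = sum(1 for c in counts if c == 2)   # works known from two copies
    if doubletons > 0:
        unseen = singletons ** 2 / (2 * doubletons)
    else:
        # Bias-corrected form, used when no work survives in exactly two copies.
        unseen = singletons * (singletons - 1) / 2
    return observed + unseen


# Hypothetical toy data: eight surviving works and their numbers of copies.
surviving_copies = [1, 1, 1, 1, 2, 2, 3, 5]
estimate = chao1(surviving_copies)
print(f"observed: {len(surviving_copies)}, estimated total: {estimate}")
```

The more works survive in only a single copy relative to those surviving in two, the larger the estimated number of works that have disappeared entirely. The estimate is a lower bound and rests on assumptions about the survival process that may not hold for music sources.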
The third and last step in Biber’s methodology is creating a sampling frame. Such a
frame consists of one or more lists with attributes and their distribution over the
population. Table 3 gives two simple examples of sampling frames created from
RISM.
In practice sampling frames will be hierarchical (for example genre and country). After
the sampling frame has been created, the actual corpus can be selected by
randomly choosing compositions that fill the ‘slots’ in the frame. How many need to
be selected depends on the minimum number of pieces per slot. In the end, we
have a sample that is substantially smaller than the population but nevertheless is
representative of it by the chosen criteria.
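The selection step described above can be sketched in a few lines of Python. The sketch assumes that the population is available as a list of catalogue records, that each slot of the frame is a combination of strata values, and that slots are filled in proportion to their share of the population with a minimum number of pieces per slot. The function names, the record fields and the toy catalogue are hypothetical, and proportional allocation is only one possible reading of the method.

```python
import random
from collections import defaultdict


def build_frame(population, strata):
    """Group catalogue records into slots keyed by their strata values."""
    frame = defaultdict(list)
    for record in population:
        frame[tuple(record[s] for s in strata)].append(record)
    return frame


def draw_corpus(population, strata, target_size, min_per_slot=1, seed=0):
    """Draw a random sample whose slots mirror the population proportions."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    frame = build_frame(population, strata)
    corpus = []
    for slot in sorted(frame):
        records = frame[slot]
        share = len(records) / len(population)
        quota = max(min_per_slot, round(target_size * share))
        quota = min(quota, len(records))  # a slot cannot yield more than it holds
        corpus.extend(rng.sample(records, quota))
    return corpus


# Hypothetical toy catalogue: 60 motets and 40 masses, split over two countries.
catalogue = [
    {"id": i, "genre": "motet" if i < 60 else "mass",
     "country": "IT" if i % 2 == 0 else "ES"}
    for i in range(100)
]
corpus = draw_corpus(catalogue, strata=("genre", "country"), target_size=20)
```

With a hierarchical frame of genre and country, each of the four slots in this toy catalogue contributes pieces in proportion to its size. Raising `min_per_slot` protects small slots at the cost of a larger corpus and a slight overrepresentation of rare categories.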
99
D. Coney, ‘The David Melville Bassus Partbook and Scottish Music Culture at the Turn of
the 17th Century,’ paper presented at the International Medieval and Renaissance Music
Conference, Durham, U.K., July 2025.
composer       %       genre            %
Anonymous      18.1    Sacred song      12.9
Palestrina     0.7     Opera            7.1
Lassus         0.4     Lied             4.4
Lully          0.3     Motet            4.0
Purcell        0.2     Song             3.7
Marenzio       0.1     Keyboard piece   3.6
Byrd           0.1     Mass             3.4
Victoria       0.1     Sonata           3.1
Gallus         0.1     Aria             3.1
Frescobaldi    0.1     Cantata          3.0
Table 3. Sampling frames of composers and genres based on the RISM records (top 10).
5 Beyond the scholarly community
As has been repeatedly suggested in the previous chapters, the involvement of
contributors from outside the scholarly community in corpus creation (in the widest
sense) is quite important. Where possible the connection between music
researchers, professional musicians and citizen scientists deserves to be fostered.
Section 5.1 illustrates this by means of an analysis of the contributions to CPDL, a
major online resource of choral music. In Section 5.2 digital editing is studied as a
collaborative process where contributors are motivated to perform certain tasks:
these tasks should be designed in such a way that they reduce technical complexity
and allow focus on the musical task at hand.
5.1 CPDL contributors: a case study
Mirjam Visscher and Frans Wiering
Early 2024, we investigated the productivity of the contributors by scraping the
contents of the site. At the time 60,868 editions of works by 4,753 composers were
available. As our focus is on corpus building and reusability of encodings, we looked
into the prevailing file format for sharing encodings, MusicXML. Our dataset contains
48,847 MusicXML files by 1,609 contributors, accompanied by metadata.
Figure 10. Top 25 CPDL contributors (anonymised) and the years during which they
were active. The contributors account for 69% of the total number of 48,874 MusicXML
files contributed to CPDL.
First, we looked into activity and productivity of the top 25 contributors. Their efforts
account for 33,836 files or 69% of the total number of MusicXML files (see Figure 10).
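A concentration figure of this kind can be derived from file-level metadata with a short script. The sketch below assumes a list with one contributor name per MusicXML file. The function name and the toy data are hypothetical and do not reproduce the scraping or analysis code used for this study.

```python
from collections import Counter


def top_share(contributors, n):
    """Return the file count of the n most productive contributors and their share."""
    counts = Counter(contributors)                  # files per contributor
    top_total = sum(files for _, files in counts.most_common(n))
    return top_total, top_total / sum(counts.values())


# Hypothetical toy data: one entry per file, naming its contributor.
files = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
top_files, share = top_share(files, n=2)
print(f"top 2 contributors: {top_files} files, {share:.0%} of the total")

# Arithmetic check on the figures reported above.
reported_share = 33_836 / 48_847
print(f"reported top-25 share: {reported_share:.1%}")
```

Applied to the figures reported here, 33,836 of 48,847 files corresponds to a share of about 69%.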
Figure 11. Productivity metrics over the years.
Figure 11 shows how over time new contributors joined the CPDL platform and that
the long-term trend of the number of pieces added per year is going upward while
the average productivity of the contributions seems stable. From a business
economics perspective, these metrics are a sign of healthy continuity.
Figure 12. Composition years (smoothed) of the contributions by the top 10 CPDL
contributors (anonymised).
The processing of the data depends on the kind of desired output, with two main
categories: analytical results or music. The former entails the use of specialised music
software such as music21, software for machine learning or data analysis such as
WEKA, or generic programming languages such as Python. Music as output implies the
ability to perform tasks such as conversion to CMN, transposition, choosing types of
editorial detail, adaptation to performance circumstances, and sonification.
Both user and maker perspectives spawn an extended workflow of tasks that differ in
complexity and expertise needed. In principle, it seems possible to split these up in
subtasks that can be carried out by workers with the right combination of qualities.
However, even the simplest tasks are likely to require accuracy, concentration, and
understanding of the content. Therefore, it is crucial to design user interfaces in such
a way that they optimally support the task at hand and do not distract or hinder it, in
other words, with full attention to usability aspects that are sadly often ignored.
6.4 Contexts
Three types of context are distinguished in the PACT model: physical environment,
social context, and organisational context. Considering the first, most of the work
probably takes place in offices, libraries and home studies using a desktop
computer. As an alternative setting one might think of a mobile device, on which
microtasks can be executed in a collaborative scenario, for example when travelling
by public transport or while waiting in a queue. This would require a completely
different approach to task and interface design.
In terms of social context, most workers, even if they do their editing and encoding
in isolation, are part of a community of practice (for example Lilypond) that shares
the same tools and publication platforms and where they can ask each other for
help. When designing a new corpus creation project, it is imperative not only to
focus on tools and tasks, but to actively design and foster the social environment in
which the work takes place as well. If workers don’t feel connected, supported and
valued, they will go elsewhere.
Organisational context brings in several interrelated factors that influence the long-term
success of a project:
• organisational embedding: projects are generally short-term, but are they
endorsed and supported by organisations with a longer lifespan?
• storage: where are the data hosted, how are they backed up?
• technical sustainability: will data and interface for accessing them remain
available and if necessary updated after the project?
• intellectual sustainability: are sustainable formats used and is there
documentation that records provenance of the data and encoding decisions, so
that future users understand the potential and limitations of the data?
• financial sustainability: what does it cost to maintain the environment and how is
income generated to do so?
Other organisational factors include:
• intellectual property: can data be lawfully shared, and is everyone’s share in
creating them adequately recognised?
• relation to other initiatives, such as digitisation projects of libraries and archives,
cataloguing endeavours (RISM, CANTUS, DIAMM), and division of labour with
related projects;
• interoperability with the Linked Open Data world, for example by means of
shared identifiers for various entities like persons and compositions;
• observation of FAIR principles.
The subtext here is that, although crowdsourcing and employing citizen scientists
seem like cheap strategies to assemble a lot of data, corpus creation remains an
expensive activity in a world where funding for cultural heritage and humanities
research is scarce. When available, funding needs to be spent wisely and effectively,
and when not available, strong and at the same time realistic cases need to be made
to funding agencies to convince them that music corpus creation is a worthy
destination for their money.
6.5 Technologies
In the PACT framework, Technologies are divided into four categories: input, output,
communication and content. We will skip input and communication,[106] important
though they may be, since at this rather generic level of PACT analysis, little can be
said about these that is specific to our problem. This is different for output, as this
category describes the variety of products that corpus creation may aim at. Briefly
summarised, the following dimensions have been identified:
• data oriented or edition oriented encoding;
• selective or near-complete encoding, with the qualification that complete
encoding is unattainable and a well-articulated decision must be made about
where to stop;
• encodings and editions range from source-oriented to composer-oriented
approaches;
• whether symbols (e.g. ligatures) or semantics (e.g. pitches and durations) are
encoded;
• the notation output can be static, allow basic manipulation such as reformatting
or selection of voices, or can be dynamic, affording notation-altering
manipulations such as transcription and transposition;
• whether the intended medium of the end product is paper or digital;
• reusability of the encodings, ranging from none to basic features to complete
encodings.
[106] Input is mainly about input devices. For our purpose one could think of scanners and
(music) keyboards as non-standard input devices. Communication addresses matters such as
network connections, protocols, and transmission speed.
This range of output options implies a corresponding range of technical decisions in
the design of the musical content. Such decisions take the form of computational
models, abstract structures that capture aspects of the music in a formal logical
structure. As explained in 4.2, a model is a reduction of the original for a certain
purpose, and it is important to understand what is included in the model as well as
what isn’t. Aspects of corpus creation that relate to content and modelling are:
• encoding system(s) used in the system; conversion
• quality and accuracy of the encoding
• notation form of the source (mensural, tablature, CMN)
• choice of features to encode
• inclusion of digital images (potentially linking of image and encoding via IIIF)
• metadata format describing source and encoding process
Metadata design in particular seems an underdeveloped area: in practice the choice
seems to be either to use a minimalistic approach and supply little beyond
composer and title, or to work with a verbose format such as the TEI/MEI header. There
is a need for developing a middle ground, where metadata profiles are defined for
specific repertoires but can still be shared at a higher level.
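One way to picture such a middle ground is a small shared core of fields that every encoding supplies, extended by repertoire-specific profiles. The Python sketch below is only an illustration under that assumption: the class `MetadataProfile`, the profile names and all field names are hypothetical and are not taken from TEI/MEI or from any existing project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetadataProfile:
    """A named set of required fields that may extend a more general profile."""
    name: str
    required: frozenset
    parent: Optional["MetadataProfile"] = None

    def all_required(self):
        # Fields required here plus everything inherited from the shared level.
        inherited = self.parent.all_required() if self.parent else frozenset()
        return inherited | self.required

    def missing(self, record):
        # Names of required fields that are absent or empty in a metadata record.
        return sorted(f for f in self.all_required() if not record.get(f))


# Hypothetical shared core: slightly more than composer and title.
CORE = MetadataProfile("core", frozenset({"composer", "title", "source", "encoder"}))
# Hypothetical repertoire-specific profile built on top of the core.
LUTE = MetadataProfile("lute tablature", frozenset({"tablature_type", "tuning"}), parent=CORE)

record = {"composer": "J. Dowland", "title": "Lachrimae", "tuning": "vieil ton"}
print(LUTE.missing(record))
```

Because every repertoire profile inherits the core fields, records from different repertoires stay comparable at the shared level while still documenting what is specific to each.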
6.6 Conclusion
This PACT analysis can have multiple purposes. It can serve as a framework to
describe current projects, identify opportunities for improving them, compare them
and find chances for collaboration or data exchange. It can also be seen as a set of
questions and attention points for designers of future music corpus creation
environments that they can use as a starting point for user research and
requirements gathering.
An important generic observation about all PACT dimensions is that accurate
documentation, not just of the specific encoding task but of the entire project, is
vital, otherwise the user of the project output is left in the dark with respect to factors
that fundamentally shape the encodings. Such documentation should include:
• musical documentation, describing the aim and content of the project, as well as
criteria for inclusion/exclusion of compositions;
• technical documentation about the system design and implementation;
• maker documentation, explaining the creation workflow in the system;
• user documentation, explaining how the encodings can be manipulated: it
should also discuss design choices and limitations that may influence the
interpretation of the outcomes;
• metadata documentation, explaining how the encoding should be formally
described.
In all documentation it is vital not just to present all the options (which sadly is what
most user manuals document) but to give task oriented, practical advice that
matches the problem the encoder is struggling with.
7 Narrowing the gulf between people and systems in music encoding
In our activities, we focused on the people involved in the creation and use of
corpora in three ways: interviews (Section 7.1), the creation of personas (Section 7.2),
and the analysis of roles in collaborative corpus creation (Section 7.3).
7.1 Interviews
We conducted nine interviews, six with musicologists and three with citizen
scientists, to learn more about encoders’ motivations and experiences. We present a
short overview of their responses.
Multiple respondents mention that discovery is an important motivation for their
work, as is sharing their discoveries with others. Another motivation is altruism:
doing something for other musicologists or performers, or giving back to the
community in exchange for the use of other people’s materials. Other motivations
include obsessive collecting, editing as a way of gaining deep knowledge of the
music, inaccessibility of expensive scholarly editions, and leaving something behind
for future generations. All respondents see PDF as the primary format for
distribution, though several mention sharing encodings; one of them does not want
to share encodings for fear of others interfering with their work. There are some
interesting thoughts though on the added value of digital editions, such as support
for multiple completions of the same work (from an interviewee working with
polyphony fragments), better visibility of sources in the edition, different ‘viewpoints’
on the edition for different user groups, searching for intertextual relationships,
support for annotation, and finally quantitative analysis.
Respondents generally expect acknowledgment of their work and appreciate
feedback on their scores. No one expects material reward, but some would have a
problem with their work being re-used in a commercial setting and have licensed
their work accordingly. Some respondents plan to share their work at a later stage
and wonder about the right platform to do so. One quite interesting suggestion is to
create a centralised place (irreverently described as a ‘dumpster’) where researchers
could deposit their encodings for others to use, after having finished researching
them.
Quality is a major concern for all. Everyone works directly from primary sources (we
know that many encoders work from modern editions, but not our respondents) and
aims to create accurate scores. Better access to sources is desirable, especially for
underrepresented materials such as organ tablatures. Generally, they apply
guidelines or principles, formally or informally. Such principles can be very concise
(‘reversibility, traceability, transparency’) and sometimes evolve over time.
Sometimes formal guidelines are followed. Critical commentaries range from none
to very extensive texts. One respondent suggests a peer approval process for
contributions to CPDL-like resources.
Our respondents come from several communities of practice (e.g. Humdrum, MEI,
Lilypond, MuseScore, Finale, digital lutenists) and seem to be generally happy with
the functionalities that their platform offers, minor complaints excepted. At times they
mention limitations of other platforms. Technical help seems to be mostly available,
and the citizen scientists have easy access to musicological expertise. It is not
uncommon for digitised sources and unpublished transcriptions to be shared with
peers. One participant is involved in a large-scale funded project and works with
multiple colleagues. Others work individually and rather interact with their
communities on an ad-hoc basis.
7.2 Personas
Anna Plaksin and Frans Wiering
We also used our experiences to create personas - fictional representations of
potential users - each with a short profile including biographic background,
experiences, motivations, and frustrations. This should help us to create a shared
understanding of potential communities and a shared vision of the future of corpus
building. In teams of two, we created five different personas with varying
backgrounds. We envisioned people between their early 20s and their mid-70s,
some of them professional musicians, others music enthusiasts with varying
aspirations. Envisioned experiences, motivations, and frustrations were already
astonishingly diverse, ranging from seeking material to play from to community,
visibility of their own work, and recognition (for a compact survey see Table 4).
However, the effect of creating these personas did not lie so much in their details but
in their impact on our discussions. Motivations and frustrations of potential users
increasingly guided our decisions and led us to take the consequences of design
choices into account. It contrasted with our earlier mapping of the many modalities of
creating or using digital corpora of early music by giving us a perspective from
within this system, showing potential considerations for choosing a certain direction
instead of others. Moreover, it presented us with the diversity of aims and needs that
are placed on technology and must be balanced in its use. Simply put: the creation
of music content can be a rewarding activity for people from diverse backgrounds
and with diverse technological expertise. Therefore, encoding systems and related
technologies need to provide different modes of engagement, e.g. addressing
different levels of interest or commitment. This refers back to the very core of
user-centred design. As Don Norman pointed out already in 1986:[107] humans use
technology to achieve their goals. In their use of technology, they face two gulfs: the
Gulf of Execution when figuring out how something operates, and the Gulf of
Evaluation when evaluating if an action has brought them closer to their goal. The
design of technological systems is always concerned with narrowing this gulf, either
by bringing people closer to technology or by bringing technology closer to people.
[107] D. A. Norman, ‘Cognitive Engineering,’ in User Centered System Design: New Perspectives
on Human-Computer Interaction, eds. S.W. Draper and D.A. Norman (Erlbaum, 1986): 31–61.
name, age | background | motivation | frustrations
David (c. 60) | choral conductor, has little interest in technology | find repertoire that makes singers happy, help others in same situation | existing editions always unsuitable
Dirk (73) | retired physics teacher, plays recorder and composes, unhappy marriage | publish transcriptions and compositions online, wants recognition for his efforts | web site doesn’t get much traffic, maintenance is tedious
Elizabeth (late 60s) | retired geography teacher, single, amateur singer | learn more about the music she sings, find new music | wary of technology, really needs sense of community
Sebastian (mid 40s) | tech savvy, plays violin in amateur orchestra | save money by creating his own scores | expensive orchestral materials
Tanya (23) | music student, basic MuseScore user | publish transcriptions, write her own music, working with peers | considers herself a non-computer person, worried things go wrong
Table 4. Personas created during the workshop.
With regard to standardisation and data quality, this leads to a significant question about
the role of technological systems for corpus creation: do we want to bring people
closer to the technology by teaching them, e.g., what ‘good data’ should look like?
Or do we want to bring technology closer to the people who are using it? As the shift
in our discussions based on the personas showed, aiming for the second option
shifts the role of technology towards a system that helps negotiating people’s aims
and needs, for example by providing conversion procedures or multi-faceted access
points. With such an approach there would be less need for them to modify their
technical skills and preferences, while their intrinsic motivation to create codes and
editions would remain unaffected.
7.3 Roles
We also discussed the collaborative process of music encoding in one of our
sessions in connection with the ideas presented in Section 5.2. The aim was to create
a generic workflow with tasks and roles that could be used as a template for
concrete encoding projects. Assuming a rather high level of organisation in the
project, we initially produced a quite elaborate diagram, which was subsequently
reduced to the simple model of Figure 15 to fit a broader range of approaches.
Figure 15. High-level workflow for music encoding projects.
The model contains four main tasks, and multiple subtasks:
• instigation: setting the goals of the project; selecting compositions; creating
a template for encodings; defining the project’s success criteria;
• gathering: collecting the materials to be used: metadata; digitised and
physical sources; digital and physical editions; reusable encodings;
• editing: creating the encodings by means of OMR, manual transcription,
encoding, and/or code conversion; integrating the raw transcription(s) into an
edited encoding; correction and validation of the edited encoding;
• publishing: making the validated encodings available to the users.
This could be just a linear process, but more realistically it would involve feedback
steps from publication to the three other tasks:
• to editing: integrate feedback from the users in the encodings;
• to gathering: enrich the encoding with newly-selected source materials, or re-use
the encodings as base text for transcriptions of other versions of the same
work, particularly close concordances;
• to instigation: check the result against the goals and success criteria, potentially
instigating a next cycle through the workflow.
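The four tasks and the feedback steps can be written down as a small directed graph. The Python sketch below is one possible reading of Figure 15, not an implementation from the project; the names `Task`, `LINEAR_ORDER`, `FEEDBACK` and `next_steps` are introduced here purely for illustration.

```python
from enum import Enum


class Task(Enum):
    INSTIGATION = "instigation"
    GATHERING = "gathering"
    EDITING = "editing"
    PUBLISHING = "publishing"


# The linear process: each task hands over to the next one.
LINEAR_ORDER = [Task.INSTIGATION, Task.GATHERING, Task.EDITING, Task.PUBLISHING]

# Feedback steps lead from publication back to the three other tasks.
FEEDBACK = {Task.PUBLISHING: [Task.EDITING, Task.GATHERING, Task.INSTIGATION]}


def next_steps(task):
    """Return the tasks that may follow `task`: the linear successor, then feedback targets."""
    position = LINEAR_ORDER.index(task)
    forward = LINEAR_ORDER[position + 1:position + 2]  # empty for the last task
    return forward + FEEDBACK.get(task, [])


for task in Task:
    print(task.value, "->", [t.value for t in next_steps(task)])
```

Keeping the feedback edges in a separate mapping makes it easy to model a purely linear project (an empty `FEEDBACK`) alongside the more realistic cyclic one.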
In addition to the tasks, a number of roles is defined:
• instigator of the project: performs the tasks necessary to start off the project
(see above);
• transcriber of sources: creates the raw encodings of the selected sources;
• computational specialist: in charge of OMR, conversion and other
computational processes;
• editor: creates an edited encoding (or digital edition) from the raw
encodings and the edition template; publishes edition after validation;
• validator: takes care of proofreading and quality checking;
• edition user: uses the materials for their own purposes and gives feedback to
the project.
Leaving edition users aside, each role asks for a different set and/or level of
expertise. For example, for transcribers a good working knowledge of music
notation and a basic understanding of editing would suffice, while the editors need
to possess both at a higher level. It depends on the ambition, scale, scope,
embedding, funding and lifespan of a project whether multiple (or even all) roles are
relegated to a single person, or whether each role is divided between multiple persons. In
the case of transcribers and editors the latter seems obvious, but large, structured
projects are likely to have teams of instigators, computational specialists and
validators.
8 Vision of the future of corpus building
Increasingly, musicians and music researchers engaged in early music do their work
in the digital environment. Music is searched, transcribed and analysed by means of
specialised software and services, editions are shared online, and research results
are published in digital journals, conference proceedings, and books. Yet if we
overview the digital landscape of early music, the dominant impression is one of
relative fragmentation and scarcity of resources. This is especially true for what is
arguably the fundamental precondition of most digital work in the field: the
availability of music notation in digital, computer-processible form. Currently one
cannot assume that for a given research problem in early music sufficient musical
data containing the necessary features are available that can be analysed to solve it.
Often there is such an investment in data creation and curation needed that a
computational approach becomes infeasible. Therefore, a comprehensive approach
to accelerate corpus creation in early music is urgently needed, aiming at a better
coverage of the repertoire in a form that ideally is ready to be used, or, more
realistically, needs a limited amount of work to make it ready for use.
We consider corpora and corpus creation to cover a broad range of products and
activities, including the production of analytical encodings that contain only a few
specific musical features, the encoding of the original notation of sources, and the
creation of digital editions that meet scholarly and/or practical requirements. These
activities have numerous commonalities in terms of how they are carried out on a
day-to-day basis. They involve dealing with sources, notation forms, and editorial
interventions, as well as working with similar technologies and encoding systems.
Furthermore, they face the same problem of limited resources. There is so much
work to be done with only limited financial means and human effort that sharing
whatever can be shared between initiatives becomes imperative.
Corpus creation must thus be an inclusive endeavour. This applies not just to
exchanging encodings, but to participation and technology as well. Music lovers and
citizen scientists put a tremendous amount of effort into encoding music for their
editions, often with impressive results. Professionals and amateurs use a variety of
technologies for creating their encodings that fit their working methods and aims
well. Any effort to coordinate their activities should start with a recognition of their
motivations, musical skills, and technical choices. The challenge is to devise
coordination mechanisms on top of these that respect their autonomy, yet help to
spend resources wisely and stimulate participation.
Challenges do not end when the encoding is done. It must be possible to find
encodings; they must possess attributes that document what they contain, how they
were made and how they relate to other items in the digital ecosystem of early
music. And they must remain accessible in the foreseeable future, on sustainable
infrastructures. In the rest of this section, we will elaborate this vision in a number of
concrete aims. In the next section we will describe steps to realise some of these
aims.
Coordination
• Create a group that acts as a glue between encoding and digital editing
projects, collects experiences, advises ongoing projects and devises
guidelines for the creation and sustainability of corpora. The current
CORSICA team could form the core of this group.
• Create an inventory of encodings that are suitable for musicological work or
as input for corpus creation (provisionally called RIEM – Répertoire
International des Encodages Musicaux).
• Create a sustainable, decentralised platform with a small corpus of encoded
scores, with the aim to grow little by little. This could also function as a
repository for encodings created in short-term or small-scale projects lacking
a publication platform.
• Create a metadata service that is reliable, in one place and being maintained,
that provides unique IDs for various entities and can be incorporated into
IMSLP, CPDL, and other already existing platforms.
• Create a unified user interface across the more important resources.
Participation
• Instead of building new communities, having a high risk of failure, identify the
good (or promising) existing communities and build from there.
• Encoding of early music sources is democratised and is public domain.
• Mechanisms exist for acknowledging everyone’s contribution in shared work,
such that they closely align with motivations of the participants. They should
feel comfortable in contributing to the corpus.
• Digital encoding/editing is part of the musicological curriculum: students
learn to encode, to produce digital editions and to analyse encodings.
• Enable creation of encoding campaigns: identify an encoding challenge,
create a workflow, divide the tasks over participants and integrate the work in
a collective result.
Tools and support
• Develop an easy-to-use mobile-first editor for encoding to allow
crowd-sourcing efforts.
• Design methods for as seamless and easy as possible conversion between
encoding formats and notations.
• Tech workflows, such as for the encoding campaigns described above, are
low-threshold, easy to use and freely available.
• Create a well-designed front end for distributing data and editions.
• Successful prototypes of music encoding software can be turned into stable
and usable products.
Notes in Computer Science, 14809. Springer, 2024.
https://doi.org/10.1007/978-3-031-70552-6_2.
Robison, T.D. ‘IML-MIR: A Data-Processing System for the Analysis of Music.’ In Elektronische
Datenverarbeitung in der Musikwissenschaft. Edited by H. Heckmann. Gustav Bosse
Verlag, 1967.
Rodríguez-García, E., and C. McKay. ‘Composer Attribution of Renaissance Motets: A Case
Study Using Statistical Features and Machine Learning.’ In The Anatomy of Iberian
Polyphony around 1500. Edited by E. Rodríguez-García and J. P. d’Alvarenga.
Reichenberger, 2021.
Rodríguez-García, E., and C. McKay. ‘Composer Attribution of Renaissance Motets (Iberian
Polyphony around 1500): MIDIs and Extracted Features.’ Data set.
https://doi.org/10.5281/zenodo.4027957.
Roselló, A., E. Fuentes-Martínez, M. Alfaro-Contreras, D. Rizo, and J. Calvo-Zaragoza. ‘Source-
Free Domain Adaptation for Optical Music Recognition.’ In Document Analysis and
Recognition – ICDAR 2024. Edited by E.H. Barney Smith, M. Liwicki, and L. Peng. Lecture
Notes in Computer Science, 14809. Springer, 2024.
https://doi.org/10.1007/978-3-031-70552-6_2.
Selfridge-Field, E., ed. Beyond MIDI: The Handbook of Musical Codes. The MIT Press, 1998.
Smith, D.J. ‘The Instrumental Music of Peter Philips.’ PhD diss., University of Oxford, 1994.
https://ora.ox.ac.uk/objects/uuid:d1af3140-2553-4f58-8437-a1eee66d7f13/files/m06f60f301b6cca42daeeca7b3508cd0a.
Smith, D.J., and A. Woolley. ‘Editing Purcell’s Keyboard Music: Some Reflections on
Collected Editions, Past and Future.’ Journal of New Music Research 53, no. 3–4 (2024):
277–296. https://doi.org/10.1080/09298215.2025.2472612.
Stachowiak, H. Allgemeine Modelltheorie. Springer, 1973.
Tuggener, L., R. Emberger, A. Ghosh, et al. ‘Real World Music Object Recognition.’
Transactions of the International Society for Music Information Retrieval 7, no. 1 (2024):
1–14. https://doi.org/10.5334/tismir.157.
Tuppen, S., S. Rose, and L. Drosopoulou. ‘Library Catalogue Records as a Research Resource:
Introducing “a Big Data History of Music”.’ Fontes Artis Musicae 63 (2016): 67–88.
https://doi.org/10.1353/fam.2016.0011.
Umbreit, J., and S. Schumann. ‘OMR on Early Music Sources at the Bavarian State Library with
MuRET – Prototyping, Automating, Scaling.’ In Proceedings of the 6th Workshop on
Reading Music Systems. Edited by J. Calvo-Zaragoza, A. Pacha, and E. Shatri.
https://doi.org/10.48550/arXiv.2411.15741.
Wiering, F. ‘Digital Critical Editions of Music: A Multidimensional Model.’ In Modern Methods
for Musicology: Prospects, Proposals, and Realities. Edited by T. Crawford and L. Gibson.
Ashgate, 2009.
Wiering, F., and C. Inskip. ‘The Impact of the Pandemic on Musicologists’ Use of Technology.’
Digital Humanities Quarterly 19, no. 2 (2025).
https://dhq.digitalhumanities.org/vol/19/2/000786/000786.html.
Wikimedia Foundation. ‘Corpus Linguistics.’ https://en.wikipedia.org/wiki/Corpus_linguistics.
Appendices
Appendix 1: CORSICA proposal
Name of host
Frans Wiering
Host Institution
Utrecht University, Department of Information and Computing Sciences, Music
Information Computing group
Role of host and host institution
Host will guide the STSM project, plan activities, organise online preparation
meetings, the workshop, and follow-up meeting(s) if these are needed. The host
institution will provide working space for the workshop. In addition to the host,
1-2 researchers from the host institution will participate in the workshop.
Time period
Workshop will take place in the week of 13-17 May 2024. Online activities will take
place in the weeks before and immediately after the workshop.
Number of guests
c. 7 guests, from multiple countries
Working group
WG2, Sources; WG3, Publications
Objective of STSM project
Analysis of large textual corpora has revealed new insights in many historical fields.
Projects such as ECOLM and the Stanford Josquin Project show that a corpus-based
approach also works in the music domain. There are important differences though.
Whereas textual corpora have been created at a large scale and are relatively
homogeneous, musical corpora are comparatively small and show a considerable
variety in encoding practices. For pre-1700 polyphonic music, the situation is
especially complicated: few corpora contain more than several hundreds of items;
different types of music notation such as mensural notation and tablature pose
specific encoding problems; and transcription to modern notation can be supported
in a number of ways.
Some of the consequences of this state of affairs are:
• Available corpora are not a representative selection of the known repertoire;
• Off-the-beaten-track research requires an excessive amount of encoding effort;
• Quality and interoperability issues increase when using multiple corpora;
• PhD students and early-career researchers do not have sufficient resources to
embark on a career as digital musicologists.
On the other hand, there are some important positive developments:
• The emergence of MEI (Music Encoding Initiative) as a versatile ‘musicological’
encoding system with a growing community of users;
• Improvements in Optical Music Recognition of early music;
• The general practice of musicologists to use music notation software to create
their transcriptions;
• A growing practice of online sharing of editions created by both professional and
amateur musicians.
In the past, the solution to the lack of encodings has often been sought in
standardisation and training. Such endeavours have often met with limited success
for at least four interconnected reasons: limited tool support; required time
investment for learning; loss of already created work; and most importantly loss of
autonomy. The logical alternative is therefore to consider autonomy and variety of
practices as starting points for a new paradigm of corpus creation. Some elements of
this are:
• Contribute encodings with minimal effort;
• Metadata for effective documentation of encoding practices;
• Interoperability rather than standardisation;
• A set of intuitive tools for common tasks;
• Practical guidelines for sharing and exchanging encodings;
• Involving music professionals and lay experts in the encoding process;
• Making participation rewarding and fun.
Realising these (and other necessary) ambitions will take a multi-year effort, which
could take its first steps towards maturity in the environment of the EarlyMuse COST
action. We propose this STSM project named ‘Creation Of eaRly muSIc CorporA’
(CORSICA) as the first of these steps. A small group of experts will come together to:
• Survey the current state of the art in corpus creation of early music;
• Identify bottlenecks and opportunities;
• Create a vision document for accelerating corpus creation;
• Create an implementation plan with practical recommendations.
Relation to WG and Action goals
The project is related to the following goals and objectives:
MoU, Secondary objective 3: preservation of musical heritage and accessibility of
resources. As a logical next step after digitisation, corpus creation increases deep
access to sources and offers a way of preserving the content, not just the
appearance, of a source.
MoU, Secondary objective 3: creating a multidisciplinary research network. By
creating large corpora in a more systematic fashion, it becomes easier to use these
data in an interdisciplinary context, where researchers may not have the specialist
knowledge that is currently necessary to employ corpora fruitfully. Large corpora also
allow a wide range of research problems to be addressed, so that a network of
researchers with multiple interests can emerge around them. For early-career
researchers this is particularly beneficial as they are no longer obliged to create their
own digital research environment from scratch.
GAPG 3 (WG2): describe ways of identifying and describing endangered musical
sources (including in Ukraine). There may be a tenuous relation to this goal, in that
encoding can be seen as protection of the content. For example, community
initiatives that aim to digitise their own musical heritage could be supported by
guidelines that make this as straightforward as possible and at the same time
compatible with scholarly approaches.
GAPG 4 (WG3): Imagining innovative publication models. Several connections seem
relevant:
• The increasing importance of reproducibility of research. Publishing a corpus, or
referring to an existing corpus as underlying data, makes research easier to
reproduce;
• Corpora could be seen as publications: one’s contributions to a corpus become
part of one’s academic portfolio;
• Encodings relate to digital editions in two ways: they can be used as input to an
editorial process in which they are further refined, for example by collating
multiple sources of the same work; or the edited work can be stored as an
analysable encoding.
GAPG 6 (WG2): establish an information exchange protocol with RISM. CORSICA has
a deep connection to RISM. RISM may provide essential metadata that improve the
quality, usability and accessibility of the compositions in a corpus. It may also play an
essential role in the selection of items to be encoded and in addressing questions of
representativeness. Conversely, the act of encoding will produce new metadata that
need to be compliant with RISM and that may in some cases be used to correct or
enhance the RISM data itself. In fact, corpus creation is an ideal case study for
GAPG 6 itself.
Expected outcomes / deliverables
The following outcomes are expected:
• A report describing the current state of the art in corpus
creation for early music
• A vision on how to accelerate corpus creation
• A plan for implementing this vision
Short summary
CORSICA (Creation Of eaRly muSIc CorporA) will address the
shortage of large-scale corpora of encoded early music.
Existing corpora are generally small (usually fewer than 1000
items) and display a variety of encoding practices. This makes
large-scale computational analysis of the music currently
unfeasible. Experts in the field will come together in a one-week
workshop in Utrecht (Netherlands) to survey the current state
of early music encoding; to identify bottlenecks and
opportunities (in terms of technologies as well as
communities); to create a vision for the future; and to define
practical steps towards accelerating corpus creation in early
music.
Appendix 2: Early music corpora
CORPUS DESCRIPTIONS
| corpus name | musical content | data | creation method | url |
|---|---|---|---|---|
| Accessible Lute Music | Lute music from Renaissance and Baroque | accessible corpus | citizen science | https://wp.lutemusic.org/ |
| Archive of Iberian Polyphony | Iberian polyphony from the Renaissance | digital editions, non-public encodings | academic research | https://iberianpolyphony.fcsh.unl.pt/ |
| Choral Public Domain Library (CPDL) | Choral music of any age | encodings available for many works | citizen science | https://www.cpdl.org/wiki/ |
| Citations: The Renaissance Imitation Mass (CRIM) | 16th c. imitation masses and their models | accessible corpus | academic research | https://crimproject.org/ |
| Classical Archives | Western composers, 14-20th century | accessible corpus | citizen science | https://www.classicalarchives.com/midi.html |
| Computerized Mensural Music Editing (CMME) | 15-16th c. polyphony, with works by Josquin, Berchem and others | accessible corpus | academic research | http://www.cmme.org/; https://github.com/tdumitrescu/cmme-music |
| Electronic Corpus of Lute Music (ECOLM) | Lute music from Renaissance and Baroque | accessible corpus, password protected | academic research | https://ecolm.org |
| Electronic Linked Annotated Unified Tablature Edition (E-LAUTE) | Lute music from the German-speaking areas, mainly in German lute tablature | accessible corpus | academic research | https://e-laute.info/ |
| Electronic Medieval Music Score Archive Project (EMMSAP) | 14th century music | accessible corpus | academic research | https://github.com/cuthbertLab/emmsap |
| ELVIS database | Contains multiple subcollections, some also existing as separate corpora. Includes compositions printed in Glarean’s Dodekachordon, works by Palestrina, Victoria and others | accessible corpus | academic research | https://github.com/ELVIS-Project |
| Furnace and Fugue | Digital edition of Atalanta fugiens (1618), music composed by Michael Meier | digital editions, non-public encodings | academic research | https://furnaceandfugue.org/ |
| Gaffurius Codices Online (MCE) | Polyphony from c. 1500, with works by Compere, Weerbeke, Gaffurius and others | digital editions, non-public encodings | academic research | https://www.gaffurius-codices.ch/s/portal/page/editions |
| Gaspar Online Edition | Collected works of Gaspar van Weerbeke | accessible corpus | academic research | http://www.gaspar-van-weerbeke.sbg.ac.at/gaspar-online-edition |
| Gesualdo online | Complete works of Gesualdo | accessible corpus | academic research | https://ricercar.gesualdo-online.cesr.univ-tours.fr/ |
| Goldberg Corpus | 15th and early 16th century music | non-public encodings | citizen science | https://www.goldbergstiftung.org/en/ |
| Johannes Tinctoris Complete Practical Works | Practical works of Johannes Tinctoris, 3 pieces by Dufay, and 2 by Busnois | accessible corpus | academic research | https://earlymusictheory.org/Tinctoris/Music/ |
| John Robinson/Lute Society editions | Lute music from Renaissance and Baroque | accessible corpus | citizen science | https://github.com/TimCrawford/jhr_repo |
| Josquin La Rue Secure Duo Dataset | Duos from masses by Josquin and La Rue | accessible corpus | academic research | https://github.com/ELVIS-Project/mass-duos-corpus-josquin-larue/tree/Methodologies-for-Creating-Symbolic-Music-Corpora |
| Josquin Research Project (JRP) | Works by Josquin and contemporaries | accessible corpus | academic research | https://josquin.stanford.edu/ |
| KernScores | Western composers, 14-20th century, with works by Frescobaldi, Monteverdi, Corelli and others | accessible corpus | mix | http://kern.ccarh.org/ |
| Kunst der Fuge | Western composers, 14-20th century | accessible corpus | citizen science | https://kunstderfuge.com/midi.htm |
| Lassus Tricinium Project | Tricinia by Orlande and Rudolph de Lassus | accessible corpus | academic research | https://lassus.mh-freiburg.de/; https://github.com/WolfgangDrescher/lassus-geistliche-psalmen |
| Lost Voices | 16th c. French chansons | accessible corpus | academic research | http://digitalduchemin.org/ |
| Marenzio Online Digital Edition (MODE) | Works by Marenzio | non-public encodings | academic research | http://www.marenzio.org/index.xhtml |
| Measuring polyphony | 14th century motets | accessible corpus | academic research | https://measuringpolyphony.org/; https://github.com/MeasuringPolyphony/mp-music-files |
| Miami Publication Server (University of Münster) | Digital editions of early 17th c. sacred music by Crüger and Textorius | accessible corpus | academic research | https://miami.uni-muenster.de/ |
| Music21 corpus | Western composers, 14-20th century | accessible corpus | mix | https://www.music21.org/music21docs/about/referenceCorpus.html |
| Neuma | Contains multiple subcollections, with works by Josquin, Ockeghem, Dufay, Frescobaldi and others | accessible corpus | academic research | http://neuma.huma-num.fr/ |
| Polish Digital Scores | Polish music, 16th-20th century | accessible corpus | academic research | https://polishscores.org/ |
| Single Interface for Music Score Searching and Analysis (SIMSSA) | Heterogeneous corpus, aggregated from various other corpora | accessible corpus | academic research | https://simssa.ca/activities/corpora-and-datasets/ |
| Symbolically Encoded Il Lauro Secco (SEILS) | Madrigals by various composers published in Il Lauro Secco (1582) | accessible corpus | academic research | https://github.com/SEILSdataset |
| Tasso in Music | Musical settings of Tasso’s poems, with works by Monteverdi, Marenzio, Wert, Cifra, others | accessible corpus | academic research | https://www.tassomusic.org/ |
| The 1520s Project | 1520s polyphony, with works by Willaert, Senfl, Verdelot and others | accessible corpus | academic research | https://1520s-project.org/about/ |
| Tomás Luis de Victoria | Spanish music from Morales onward, including works by Guerrero, Victoria, Ceballos, Vasquez | accessible corpus | citizen science | https://www.uma.es/victoria/index.html |
| Verovio Humdrum Viewer | Various encodings, mostly from other projects such as 1, 8, 9, 25 | accessible corpus | academic research | https://verovio.humdrum.org |
CORPUS SIZES AND ENCODINGS
Supported formats are indicated by ‘x’; storage formats from which other formats are derived are
indicated by ‘o’.
Format columns, in order: ASCII tablature, CMME, Humdrum, Lilypond, MEI, MIDI, MP3, MuseData, MusicXML/Finale, PDF, Sibelius, Other. The marks for each corpus are listed in column order with empty cells omitted, so a mark's position in the list does not identify its format column.

| corpus name | items before 1700 | format marks |
|---|---|---|
| Accessible Lute Music | c. 17000 | o x x |
| Archive of Iberian Polyphony | 130 | x o x o |
| Choral Public Domain Library (CPDL) | c. 30000 | x x x x x |
| Citations: The Renaissance Imitation Mass (CRIM) | c. 300 | o x |
| Classical Archives | several 100s | o |
| Computerized Mensural Music Editing (CMME) | c. 260 | o |
| Electronic Corpus of Lute Music (ECOLM) | 1619 | o |
| Electronic Linked Annotated Unified Tablature Edition (E-LAUTE) | 60, growing | o |
| Electronic Medieval Music Score Archive Project (EMMSAP) | 3600 | x |
| ELVIS database | c. 1000 | x x x x |
| Furnace and Fugue | 50 | o x o |
| Gaffurius Codices Online (MCE) | c. 55 | o |
| Gaspar Online Edition | c. 100 | x x o |
| Gesualdo online | 222 | o x |
| Goldberg Corpus | 3600 | o x |
| Johannes Tinctoris Complete Practical Works | c. 50 | x o |
| John Robinson/Lute Society editions | c. 8000 | o x x |
| Josquin La Rue Secure Duo Dataset | 77 | x x x o |
| Josquin Research Project (JRP) | 902 | o x x x x x x x |
| KernScores | c. 350 | o x |
| Kunst der Fuge | several 100s | o |
| Lassus Tricinium Project | 50 | o x |
| Lost Voices | c. 350 | x x o |
| Marenzio Online Digital Edition (MODE) | 32 | o x |
| Measuring polyphony | 61 | o x x x |
| Miami Publication Server (University of Münster) | c. 230 | x o |
| Music21 corpus | several 100s | x x x |
| Neuma | c. 600 | x x x |
| Polish Digital Scores | 1740 | o x x x x x |
| Single Interface for Music Score Searching and Analysis (SIMSSA) | ??? | x x x x x x |
| Symbolically Encoded Il Lauro Secco (SEILS) | 30 | x o x x x |
| Tasso in Music | 778 | o x x x x x x x |
| The 1520s Project | 491 | x x x x x o |
| Tomás Luis de Victoria | c. 1000 | o x x |
| Verovio Humdrum Viewer | c. 1500 | o x x |
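
The statement in the short summary that existing corpora usually hold fewer than 1000 items can be checked against the item counts listed above. The following Python sketch is illustrative only and is not part of the CORSICA survey: the names `ITEM_COUNTS`, `parse_count` and `summarise` are hypothetical, approximate entries such as "c. 300" or "60, growing" are assumed to be readable as their first number, and vague entries ("several 100s", "???") are left out of the tally.

```python
import re
from typing import Dict, Optional, Tuple

# Item counts transcribed from the "items before 1700" column of the table above.
ITEM_COUNTS: Dict[str, str] = {
    "Accessible Lute Music": "c. 17000",
    "Archive of Iberian Polyphony": "130",
    "CPDL": "c. 30000",
    "CRIM": "c. 300",
    "Classical Archives": "several 100s",
    "CMME": "c. 260",
    "ECOLM": "1619",
    "E-LAUTE": "60, growing",
    "EMMSAP": "3600",
    "ELVIS database": "c. 1000",
    "Furnace and Fugue": "50",
    "Gaffurius Codices Online": "c. 55",
    "Gaspar Online Edition": "c. 100",
    "Gesualdo online": "222",
    "Goldberg Corpus": "3600",
    "Johannes Tinctoris Complete Practical Works": "c. 50",
    "John Robinson/Lute Society editions": "c. 8000",
    "Josquin La Rue Secure Duo Dataset": "77",
    "Josquin Research Project": "902",
    "KernScores": "c. 350",
    "Kunst der Fuge": "several 100s",
    "Lassus Tricinium Project": "50",
    "Lost Voices": "c. 350",
    "MODE": "32",
    "Measuring polyphony": "61",
    "Miami Publication Server": "c. 230",
    "Music21 corpus": "several 100s",
    "Neuma": "c. 600",
    "Polish Digital Scores": "1740",
    "SIMSSA": "???",
    "SEILS": "30",
    "Tasso in Music": "778",
    "The 1520s Project": "491",
    "Tomás Luis de Victoria": "c. 1000",
    "Verovio Humdrum Viewer": "c. 1500",
}


def parse_count(text: str) -> Optional[int]:
    """Return the first integer in a count string, or None for vague entries."""
    # Assumption: "several 100s" and "???" carry no usable figure.
    if "100s" in text or "?" in text:
        return None
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


def summarise(counts: Dict[str, str], threshold: int = 1000) -> Tuple[int, int]:
    """Return (corpora with a usable figure, corpora strictly below threshold)."""
    known = 0
    small = 0
    for raw in counts.values():
        value = parse_count(raw)
        if value is None:
            continue
        known += 1
        if value < threshold:
            small += 1
    return known, small


if __name__ == "__main__":
    known, small = summarise(ITEM_COUNTS)
    print(f"{small} of {known} corpora with a usable figure hold fewer than 1000 items")
```

Under these assumptions, 21 of the 31 corpora with a usable figure fall below 1000 items; the three "several 100s" entries, if counted, would most likely add to the smaller group.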