Open source software and workflows
A driving force of innovation in life sciences
Authors

ELIXIR Hub
Despoina Sousoni
Mihail Anton
Andrew Smith

University of Duisburg-Essen
Mahnoor Shahid
Lydia Adungo
Hannes Rothe
Contents

Open source software tools and workflows to support innovation
How open source tools support bioinformatics companies
Why companies are turning to open source software and workflows
Strategic approaches to open source software and workflows in industry
Example strategies from our interviewed companies
How companies integrate impact metrics with open source software and workflows
Global community collaboration and contributions
Galaxy’s global reach across life science domains
Maintenance activity reflected in tool availability
Open source sustainability: operationalising collaboration and innovation
Methods
Key terms

Open source software
Software with source code available to the public to inspect, modify, enhance and redistribute, according to the chosen license. More precisely, this concept is referred to as ‘free and open source software’[1] (FOSS).

Workflows
In this report, ‘workflow’ refers to a computational workflow – a special kind of software expressed in a specific language targeted at handling multi-step, multi-code data pipelines, data analyses and other data-handling operations, especially through the efficient use of computational resources to transform data inputs into desired outputs. Workflows increasingly incorporate machine learning models and are critical to integrating and deploying software code and data analysis[2].

Workflow management systems
Specialised software systems that help scientists design, manage, execute and partially or fully automate their computational workflows from start to finish.

[1] https://en.wikipedia.org/wiki/Free_and_open-source_software
[2] Wilkinson, S.R., Aloqalaa, M., Belhajjame, K. et al. Applying the FAIR Principles to computational workflows. Sci Data 12, 328 (2025). https://doi.org/10.1038/s41597-025-04451-9
Open source software tools and workflows to support innovation

Large, complex datasets are fundamental to life science research, and extracting meaningful insights from them is essential to developing new products and services. This requires robust analytical workflows and sustainably maintained, professional-grade software. These tools enable data aggregation, analysis and infrastructure support, while promoting reproducibility, portability, trust and productivity, often in alignment with FAIR principles[2,3]. While open source software (OSS) has been an important pillar of research and development for decades, the increasing ubiquity of machine learning (ML) has further emphasised the importance of workflows, as manual tasks increasingly shift towards algorithm-driven processes[4,5].

Life scientists in industry and academia use open source software and workflows (OSSW) to build analytical pipelines tailored to specific research purposes, including clinical studies and the development of diagnostic or therapeutic applications. In turn, these scientists often contribute expertise to design, validate and enrich open workflows, fostering a community of practice and a wealth of open knowledge with diverse applications and methodologies. Such collective efforts produce invaluable assets for research across industry and academia.

Investigating current practices in OSSW is crucial to understanding trends and developing best practices for sharing and using open software and workflows. For this report, we conducted a qualitative analysis based on 15 interviews with representatives of bioinformatics companies, alongside a quantitative analysis of empirical data collected from GitHub and the Galaxy platform. The findings underscore the importance of OSSW in driving innovation, fostering collaboration and enhancing transparency in research.

[3] FAIR Research Software Principles (https://everse.software/RSQKit/fair_rs), Research Software Quality Kit (https://doi.org/10.5281/zenodo.14892767)
[4] Baird, A., & Maruping, L.M. (2021). The next generation of research on IS use: A theoretical framework of delegation to and from agentic IS artifacts. MIS Quarterly, 45(1). https://doi.org/10.25300/MISQ/2021/15882
[5] Stelmaszak, M., Möhlmann, M., & Sørensen, C. (2024). When algorithms delegate to humans: Exploring human-algorithm interaction at Uber. MIS Quarterly, 49(1). https://doi.org/10.25300/MISQ/2024/17911
How open source tools support bioinformatics companies

OSS plays a crucial role in research across academia and industry. Over the past decade, OSS has transformed industries such as bioinformatics, biotechnology[6] and pharmaceuticals[7] by increasing reliance on professionals skilled in software engineering, research data management and AI, giving rise to so-called ‘invert firms’[8]. In these firms, employees not only bring technical expertise but also equip companies with open tools and workflows that accelerate product and service innovation. Many organisations now strategically build their business and operational models around open source, while others contribute to and use them extensively[9]. As with open data initiatives, OSS has played an important role in enhancing openness and transparency[10], and in shaping new companies in the bioinformatics domain[11].

Advanced computational analysis – particularly involving ML – has become critical in life science product and service development[12]. Workflows encode algorithmic approaches to load and transform data, train, test, package and deploy ML models. Also, they facilitate the monitoring and governance of ML models, enhancing reproducibility and trust[13,14]. As the role of ML becomes more pivotal[15], these workflows are becoming a key part of OSSW.

[6] Vassilakopoulou, P., Skorve, E., & Aanestad, M. (2019). Enabling openness of valuable information resources: Curbing data subtractability and exclusion. Information Systems Journal, 29(4), 768–786. https://doi.org/10.1111/isj.12191
[7] Priego, L.P., & Wareham, J. (2024). Data Commoning in the Life Sciences. MIS Quarterly, 48(2). https://doi.org/10.25300/MISQ/2023/17439
[8] Parker, G., Van Alstyne, M., & Jiang, X. (2017). Platform ecosystems. MIS Quarterly, 41(1), 255-266. http://dx.doi.org/10.2139/ssrn.2861574
[9] Shaikh, M., & Vaast, E. (2016). Folding and unfolding: Balancing openness and transparency in open source communities. Information Systems Research, 27(4), 813-833. https://doi.org/10.1287/isre.2016.0646
[10] Lifshitz-Assaf, H. (2018). Dismantling knowledge boundaries at NASA: The critical role of professional identity in open innovation. Administrative Science Quarterly, 63(4), 746-782. https://doi.org/10.1177/0001839217747876
[11] Rothe, H., Lauer, K.B., Talbot-Cooper, C., & Sivizaca Conde, D.J. (2023). Digital entrepreneurship from cellular data: How omics afford the emergence of a new wave of digital ventures in health. Electronic Markets, 33(1), 48. https://doi.org/10.1007/s12525-023-00669-w
[12] Lou, B., & Wu, L. (2021). AI on Drugs: Can Artificial Intelligence Accelerate Drug Development? Evidence from a Large-Scale Examination of Bio-Pharma Firms. Management Information Systems Quarterly, 45(3), 1451-1482. http://dx.doi.org/10.2139/ssrn.3524985
[13] Sculley, D., Holt, G., Golovin, D., et al. (2015). Hidden technical debt in machine learning systems. Advances in neural information processing systems, 28. https://dl.acm.org/doi/10.5555/2969442.2969519
[14] Walsh, I., Fishman, D., Garcia-Gasulla, D., et al. (2021). DOME: recommendations for supervised machine learning validation in biology. Nature Methods, 18(10), 1122-1127. https://doi.org/10.1038/s41592-021-01205-4
[15] Xu, Y., Liu, X., Cao, X., et al. (2021). Artificial intelligence: A powerful paradigm for scientific research. The Innovation, 2(4). https://doi.org/10.1016/j.xinn.2021.100179
“Open source code, workflows and software – developed and maintained by the global community – play a crucial role in highly specialised, innovation-driven fields like drug discovery and development.” – Interviewee, Ardigen

“Open source tools and public databases, being state of the art and widely validated by the scientific community, facilitate rapid advancements in research with minimal costs, particularly in the field of drug discovery.” – Interviewee, Knowing01s

“For a resource-constrained startup, adopting standardised tools and workflows allows us to focus on our value proposition, rather than diverting resources to baseplate code and infrastructure. By integrating open source tools, which benefit from continuous community support and evolution, we can significantly reduce development overhead and accelerate our time to market.” – Interviewee, Cambrium
Why companies are turning to open source software and workflows

Productivity through cost reduction
Open source tools boost productivity by providing code at little or no cost. Developers benefit from the collective expertise of global communities, and access to OSSW significantly improves the efficiency of private code development[16,17] by reducing development time and expense – ultimately driving product improvements or service quality.

“OSS enhances innovation by enabling the company to build on existing tools, thereby reducing development time. Also, exposure to open source tools fosters skill development.” – Interviewee, SeQone
Trust and credibility
Legitimacy is critical in life sciences, due to stringent regulatory requirements and significant risks associated with many applications. By adopting OSS, companies can foster trust through the use of community-vetted OSSW. This approach not only showcases their capacity to develop high-quality code in an open environment, but also enhances credibility by aligning with prominent research institutions[18].

“Community-vetted tools offer reliability, as bugs are often caught and fixed by a broader user base.” – Interviewee, Cambrium
[16] Eilhard, J., & Ménière, Y. (2009). A look inside the forge: Developer productivity and spillovers in open source projects. SSRN Electronic Journal, 1316772; http://dx.doi.org/10.2139/ssrn.1316772
[17] Perez-Riverol, Y., Bittremieux, W., Noble, W.S., et al. (2025). Open-Source and FAIR Research Software for Proteomics. Journal of Proteome Research, 24(5), 2222-2234. https://doi.org/10.1021/acs.jproteome.4c01079
[18] Marsan, J., Carillo, K.D.A., & Negoita, B. (2020). Entrepreneurial actions and the legitimation of free/open source software services. Journal of Information Technology, 35(2), 143–160. https://doi.org/10.1177/0268396219886879
[19] Faraj, S., von Krogh, G., Monteiro, E., & Lakhani, K.R. (2016). Special section introduction – Online community as space for knowledge flows. Information Systems Research, 27(4), 668–684. https://doi.org/10.1287/isre.2016.0682
[20] Fürstenau, D., Baiyere, A., Schewina, K., Schulte-Althoff, M., & Rothe, H. (2023). Extended generativity theory on digital platforms. Information Systems Research, 34(4), 1686–1710. https://doi.org/10.1287/isre.2023.1209
Productivity through community engagement
Adopting open source code allows developers to tap into the insights and feedback of a broad, diverse community. Developers can use reported OSSW issues or limitations to refine and validate their solutions[19]. This collaborative environment not only drives innovation, but also accelerates the overall improvement and reliability of software, often leading to the discovery of novel approaches[20].

“Building a community of experts dedicated to contributing to open source tools demands significant commitment. In return, transparency builds trust and the company gains recognition from the community for its research expertise, fostering commercial partnerships.” – Interviewee, Helical AI
Customer acquisition and visibility
When companies contribute to OSSW, they engage directly with established communities on platforms like GitHub or Galaxy. This proactive participation increases their reach and connects them with prospective users and clients[9].

“The company uses open source tools and data when relevant because they are well documented and peer reviewed, and this helps the company build credibility with clients, especially in the academic world, where transparency is valued.” – Interviewee, Saphetor
Transparency
Open workflows provide full transparency in how code, data and ML models are handled and executed. These benefits extend to software testing and quality assurance, ensuring consistency throughout the software development cycle. With computational approaches and ML increasingly permeating life sciences, the ability to monitor and control data and models is increasingly critical. Public tracking, community discussions and collaborative maintenance of workflow changes further strengthen reproducibility. Sharing and version tracking workflows openly or among communities via registries such as WorkflowHub promotes transparent reuse and reliability of published results[21].

“Integrating ML models into the same open framework makes them highly interchangeable; reusable workflows reduce redundant coding.” – Interviewee, Helical AI

“Using open source tools like Nextflow and nf-core allows organisations to run best-practice pipelines off the shelf, which are reproducible and can be executed on various infrastructures.” – Interviewee, Seqera
[21] Gustafsson, O.J.R., Wilkinson, S.R., Bacall, F. et al. WorkflowHub: a registry for computational workflows. Sci Data 12, 837 (2025). https://doi.org/10.1038/s41597-025-04786-3
[22] Allen, J.P. (2012). Democratizing business software: Small business ecosystems for open source applications. Communications of the Association for Information Systems, 30(1), 28. https://doi.org/10.17705/1CAIS.03028
[23] Lindberg, A., Berente, N., Howison, J., & Lyytinen, K. (2024). Discursive Modulation in Open Source Software: How Online Communities Shape Novelty and Complexity. Management Information Systems Quarterly, 48(4), 1395–1422. https://doi.org/10.25300/MISQ/2023/16872
Mission-driven goals
OSS has long been associated with democratisation, enabling broad audiences, including under-resourced groups and small businesses, to use, share, modify and contribute to high-quality software[22]. For some companies, engagement with OSS is not just a technical choice, but a commitment to broader societal impact. By embedding their mission and values into code or licensing agreements, these organisations help shape the future of products and services[23].

“Employees are motivated by the company’s mission to enable open science and open source development, and enjoy working in an environment that values transparency and community contribution.” – Anonymous
Galaxy’s global reach across life science domains

The Galaxy platform is a cornerstone of bioinformatics and life science data analysis, providing a collaborative space to execute, develop and disseminate workflows and software tools[25]. Behind this diverse platform of tools lies an intricate web of sharing patterns, contributor behaviour and evolving tool maintenance practices, all essential to understanding how knowledge and technology propagate across the platform. While Galaxy’s primary focus is life sciences, its global distribution – supported by a network of OSSW repositories on GitHub – offers a clear overview of the diversity of tools used in this domain.

To shed light on how workflows and tools are shared within Galaxy, we analysed the platform’s ecosystem, identifying the key trends and behaviours that define its community-driven model. The basis of this analysis is a curated dataset of around 6800 Galaxy ToolShed repositories[26] owned by 636 distinct ToolShed owners. In Galaxy, each tool is typically developed and maintained in a dedicated repository, mapped into 15 distinct primary domains. Because many tools span multiple research domains, we used a weighted distribution (rather than simple headcounts) to identify the most prominent domains[27].

[25] The Galaxy Community. The Galaxy platform for accessible, reproducible, and collaborative data analyses: 2024 update, Nucleic Acids Research, 2024, https://doi.org/10.1093/nar/gkae410
[26] This dataset includes all publicly-available repositories from the Galaxy ToolShed as of 26 February 2025, retrieved via the ToolShed API.
[27] The weights correspond to the probabilities generated by the topic model, where each probability indicates the likelihood that a repository belongs to a specific category.
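The difference between a weighted distribution and a simple headcount can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used for this report: the repository names and probabilities are invented, and `weighted_distribution` and `headcount_distribution` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical topic-model output: category probabilities per repository.
# Names and numbers are illustrative only, not taken from the dataset.
topic_weights = {
    "repo_a": {"Sequence assembly & alignment": 0.7, "Genomic & variant analysis": 0.3},
    "repo_b": {"Genomic & variant analysis": 0.6, "Formatting & conversion": 0.4},
    "repo_c": {"Formatting & conversion": 0.9, "Sequence assembly & alignment": 0.1},
}


def weighted_distribution(weights):
    """Share of each category when every repository contributes its probabilities."""
    totals = defaultdict(float)
    for probs in weights.values():
        for category, p in probs.items():
            totals[category] += p
    grand_total = sum(totals.values())
    return {category: total / grand_total for category, total in totals.items()}


def headcount_distribution(weights):
    """Share of each category when every repository counts once, for its top category."""
    counts = defaultdict(int)
    for probs in weights.values():
        counts[max(probs, key=probs.get)] += 1
    n = len(weights)
    return {category: count / n for category, count in counts.items()}


weighted = weighted_distribution(topic_weights)
headcount = headcount_distribution(topic_weights)
```

In this toy input the headcount gives each category an equal share, while the weighted version also credits the secondary domains a repository touches, which is the reason given above for preferring it.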
Key insight
Galaxy is not merely a collection of isolated tools; it is a dynamic, interconnected ecosystem where tools are frequently integrated across diverse life sciences subdomains. This integration drives collaboration and cross-functional reuse in ways that simple metrics cannot fully capture.
[Figure: Weighted distribution of Galaxy tools by category. Area is scaled to tool counts (legend: 1000 tools, 100 tools). Labels shown: Sequence assembly & alignment; Genomic & variant analysis; Small contribution categories; Formatting & conversion; Data preprocessing & cleaning; Workflow automation & meta-tools; Metagenomics & microbiome analysis; Feature extraction & annotation; Visualization & reporting tools; Quantification & statistical analysis; Machine learning & predictive models; Molecular structure & chemoinformatics; Other; Data integration & merging; Network & pathway analysis; Structural biology & simulation tools; Text & literature mining. Percentage labels, in the order extracted: 24.8%, 16.4%, 15.3%, 15.2%, 10.1%, 9.4%, 4.7%, 4.0%, 3.1%, 2.8%, 2.6%, 2.0%, 1.3%, 1.3%, 0.9%, 0.7%, 0.5%.]
Top 50 most downloaded tools by category

To assess the impact of Galaxy’s most popular tools, we analysed the top 50 most-downloaded tools and their primary categories. Each tool was assigned to its primary category based on the highest weight, giving a clear perspective on where each tool predominantly belongs[28]. The analysis revealed a classic long-tail distribution observed in other platform ecosystems, where some ‘superstar’ workflows are complemented by a long tail of other workflows with low to moderate levels of use.

The most downloaded tools span diverse categories, showing how Galaxy supports a wide range of scientific needs. The most frequently downloaded repositories belong to categories such as Workflow Automation & Meta-tools, Formatting & Conversion and Genomic & Variant Analysis. Standout tools – including FastQC (Genomic & Variant Analysis), data_manager_manual (Workflow Automation) and collection_column_join (Data Integration & Merging) – have become indispensable in many research pipelines. Their widespread adoption highlights the Galaxy community’s emphasis on reliability, automation and effective data management.

[28] For each tool, the topic model generates a probability distribution across several potential categories. The tool is then assigned exclusively to the single category that received the highest probability score.
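The primary-category assignment and the top-N selection described here can be sketched in a few lines. This is an illustration only: the tool records, download counts and probabilities are invented, and `primary_category` and `top_tools_by_category` are hypothetical helper names rather than functions from the study’s code.

```python
# Hypothetical records: total downloads and topic-model probabilities per tool.
tools = [
    {"name": "tool_a", "downloads": 120_000,
     "weights": {"Formatting & conversion": 0.55, "Data integration & merging": 0.45}},
    {"name": "tool_b", "downloads": 90_000,
     "weights": {"Genomic & variant analysis": 0.8, "Sequence assembly & alignment": 0.2}},
    {"name": "tool_c", "downloads": 5_000,
     "weights": {"Genomic & variant analysis": 0.6, "Formatting & conversion": 0.4}},
    {"name": "tool_d", "downloads": 300,
     "weights": {"Text & literature mining": 0.9, "Other": 0.1}},
]


def primary_category(weights):
    """Return the single category with the highest probability.

    On an exact tie, max() keeps the first category encountered.
    """
    return max(weights, key=weights.get)


def top_tools_by_category(records, n):
    """Group the n most-downloaded tools by their primary category."""
    ranked = sorted(records, key=lambda r: r["downloads"], reverse=True)[:n]
    grouped = {}
    for record in ranked:
        grouped.setdefault(primary_category(record["weights"]), []).append(record["name"])
    return grouped


top3 = top_tools_by_category(tools, 3)
```

Unlike the weighted distribution, this exclusive assignment discards a tool’s secondary categories, which suits a ranking where each tool should appear exactly once.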
[Figure: Top 50 most downloaded tools by category. Each tool is linked to its primary category; width is scaled to downloads (legend: 50000 downloads). Categories shown: Sequence assembly & alignment; Machine learning & predictive models; Feature extraction & annotation; Genomic & variant analysis; Formatting & conversion; Data preprocessing & cleaning; Workflow automation & meta-tools; Data integration & merging; Text & literature mining. Tools shown: cufflinks, collection_column_join, diff, ncbi_blast_plus, package_atlas_3_10, samtools_calmd, smagexp_datatypes, trimmomatic, package_readline_6_3, samtools_mpileup, bam_to_sam, column_maker, cutadapt, data_manager_fetch_genome_dbkeys_all_fasta, data_manager_sam_fasta_index_builder, fastp, multiqc, package_bzlib_1_0, package_libpng_1_6_7, package_ncurses_5_9, package_ncurses_6_0, package_zlib_1_2_8, query_tabular, sam_to_bam, samtools_rmdup, tabular_to_fasta, bedtools, deseq2, fastqc, featurecounts, package_fastqc_0_11_4, samtools_stats, blast_datatypes, bowtie2, bwa, data_manager_bwa_mem_index_builder, picard, rgrnastar, samtools_flagstat, samtools_idxstats, samtools_slice_bam, samtools_sort, compose_text_param, text_processing, a_selenium_test_repo, data_manager_bowtie2_index_builder, data_manager_manual, package_fontconfig_2_11_1, package_samtools_1_2.]
Maintenance activity reflected in tool availability

Our analysis of OSSW on Galaxy shows that active maintenance correlates closely with tool availability on Galaxy instances – a prominent trend across all subdomains. The most popular subdomains (by installations) also demonstrate a strong pattern of frequent updates and active revisions. Notably, there is a strong positive correlation (0.88) between the number of tool revisions and total downloads (typically via installations on Galaxy instances).
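A correlation of this kind can be computed as sketched below. The report does not state which coefficient was used, so Pearson’s r is an assumption here; the revision and download figures are invented, and `pearson_r` is a hypothetical helper written out by hand so the arithmetic is visible.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / math.sqrt(spread_x * spread_y)


# Illustrative values only: revisions and downloads for five imaginary categories.
revision_counts = [1, 2, 3, 4, 5]
download_totals = [10, 25, 30, 38, 60]
r = pearson_r(revision_counts, download_totals)
```

A coefficient near 1 indicates that categories with more revisions also tend to have more downloads; it does not by itself show that revisions cause downloads.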
Trust is earned through consistent maintenance and refinement to meet evolving needs. In essence, we see that there is trust in tools that keep evolving. Regular revisions that enhance or expand functionality, combined with continuous integration systems, appear to foster confidence among Galaxy administrators and the broader community. This likely contributes to higher adoption rates, supporting the growth and impact identified in the independently-commissioned report on the sustainability of Galaxy[29].

[29] Jain, S. (2025). Galaxy Sustainability Report (Version 1). Zenodo. https://doi.org/10.5281/zenodo.16030329
Key message
Active maintenance fosters trust – and trust drives usage. For example, SeQone, which provides an end-to-end solution for clinicians and biologists, releases new versions monthly based on customer feedback. It also maintains a dedicated support team and robust security infrastructure, adding value far beyond algorithms. Similarly, another company highlights that clients ‘very much appreciate’ stable software, so much so that they are willing to ‘pay for keeping it stable in the long run’. This underscores the direct link between reliability, ongoing maintenance and user adoption.
“In our industry, trust and transparency are paramount, and while we share similarities with our competitors, our unique advantage lies in our specialised skillset, customised solutions and active participation in open-source communities that build trust with our customers and the community.” – Interviewee, Ardigen
[Figure: Revision count vs. downloads. Revisions and tool downloads correlate in different research categories. Four scatter panels (Data processing; Sequence & genomic analysis; Advanced analytics & biology; Visualization & workflow), each plotting times downloaded (0.0 to 1.4, scale 1e6) against revision count (0 to 4000). Legend categories: Data integration & merging; Machine learning & predictive models; Other; Network & pathway analysis; Visualization & reporting tools; Text & literature mining; Quantification & statistical analysis; Molecular structure & chemoinformatics; Structural biology & simulation tools; Formatting & conversion; Data preprocessing & cleaning; Sequence assembly & alignment; Metagenomics & microbiome analysis; Feature extraction & annotation; Genomic & variant analysis; Workflow automation & meta-tools.]
Top ten contributing ToolShed owners in the Galaxy platform

An analysis of ownership patterns within the Galaxy platform and its tools shows a striking concentration of OSS development. The top ten contributing ToolShed owners are responsible for an impressive 56.6% of the tools in the ToolShed. However, most of these are groups of people, which explains their broad activity across categories[29]. This finding shows that small core groups of ToolShed owners play a significant role in sustaining and advancing open source ecosystems[30], highlighting the importance of collaborative, long-term maintenance of high-quality tools.

[30] Hoffmann, M., Nagle, F., & Zhou, Y. (2024). The Value of Open Source Software. Harvard Business School Strategy Unit Working Paper, (24-038). https://dx.doi.org/10.2139/ssrn.4693148
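A concentration figure such as the top-ten share can be derived from a list of repository owners, as in the sketch below. It assumes one owner entry per repository and treats each repository as one tool; the owner labels are placeholders and `top_owner_share` is a hypothetical helper name.

```python
from collections import Counter


def top_owner_share(owners, n):
    """Fraction of repositories held by the n owners with the most repositories."""
    if not owners:
        raise ValueError("owners must not be empty")
    counts = Counter(owners)
    return sum(count for _, count in counts.most_common(n)) / len(owners)


# Placeholder data: one entry per repository, naming its owner.
repository_owners = (
    ["owner_a"] * 5 + ["owner_b"] * 3 + ["owner_c"] + ["owner_d"]
)
share_top_two = top_owner_share(repository_owners, 2)
```

Here the two most active owners hold 8 of 10 repositories, so the share is 0.8; applying the same calculation with n = 10 to the full owner list is what a top-ten share expresses.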
[Figure: Top ten contributing ToolShed owners and the categories they contribute to (scale legend: 1000 tools). ToolShed owners shown: recetox, rnateam, nml, ecology, ebi-gxa, q2d2, devteam, galaxyp, bgruening, iuc. Categories shown: Network & pathway analysis; Sequence assembly & alignment; Machine learning & predictive models; Quantification & statistical analysis; Feature extraction & annotation; Genomic & variant analysis; Molecular structure & chemoinformatics; Formatting & conversion; Data preprocessing & cleaning; Visualization & reporting tools; Workflow automation & meta-tools; Structural biology & simulation tools; Metagenomics & microbiome analysis; Data integration & merging; Other; Text & literature mining. Caption: The top ten contributing ToolShed owners in Galaxy are not confined to a single niche; instead, they actively contribute to a wide range of essential domains.]
Open source sustainability: operationalising collaboration and innovation

OSSW have become integral to the life sciences, underpinning everything from basic research to commercial applications. Organisations of all sizes rely on OSSW to develop products and services. The question is no longer whether to use open source, but how to deploy it strategically to ensure sustainability, security and innovation.

This study demonstrates that industry approaches to OSSW differ based on the extent to which companies align their core products and services with OSSW and integrate community contributions into their business models. Sustainable OSSW adoption requires a combination of open collaboration and robust operational models.

Companies that prioritise OSSW and community engagement often see benefits in market growth, talent acquisition and quality improvement. Others integrate OSSW selectively to reduce costs or accelerate development, balancing flexibility with strategic independence. All approaches, however, face common challenges: ongoing maintenance costs, the need to adapt to evolving technologies and appropriate IP protection.
“Open source technology, combined with the operational backing of managed services, give development teams significant leverage. This approach allows them to capitalise on mature, widely-adopted open source tools, reducing the burden of foundational development and maintenance. At the same time, the peace of mind offered by managed services frees them to concentrate their energy and resources on building a product.” – Interviewee, Cambrium

“Despite heavy reliance on open source tools, modifications are always needed to match specific requirements.” – Interviewee, Saphetor

“Finding the balance between the needs of the community and the specific requirements of individual customers, without diverging from the main open source values of the company, can be complicated but is highly feasible.” – Interviewee, Seqera
There is growing recognition that sustainable and secure open science practices are essential. Collaboration remains the most efficient and cost-effective path forward, benefiting all stakeholders, even in commercially-driven environments. Community-driven projects like Nextflow exemplify the power of collective effort in creating robust, open solutions that benefit the research community, regardless of sector.

Challenges around sustainability, scalability and security persist, prompting innovation in both open source adoption and revenue generation. As one Saphetor interviewee noted, ‘I want science to be as open as possible for the good of science and society as a whole’. This statement reflects a broader ethos: open science and specifically, in this case, OSS, is crucial to advancing knowledge and benefiting society.

The examples in this report show it is possible to embed open source into business models, while balancing community benefit with commercial viability. This means investing in active maintenance, adopting common standards and developing governance models that reward both contribution and consumption.
ELIXIR plays a central role in this ecosystem. Through initiatives like the Research Software Ecosystem[31], a project that centralises high-quality OSSW metadata, ELIXIR members curate and connect metadata of computational biology tools, workflows and libraries, making them more FAIR. Horizon Europe projects[32,33] led by ELIXIR aim to promote the adoption and implementation of software best practices[34] – particularly for OSSW – and to elevate research software, including OSSW, to a central role in the scientific process. These efforts strengthen reproducibility and trust, reduce duplication and lower integration costs, fueling the growth of OSSW.

For the life sciences community, this is an opportunity to move beyond isolated codebases towards a connected, sustainable knowledge infrastructure. By collaborating across academia, industry and infrastructures, we can ensure open source remains a catalyst for discovery, a driver of efficiency and a foundation for long-term competitiveness. ELIXIR will continue to act as a convenor, standards-setter and enabler in this space, fostering both the openness that accelerates science and the structures that keep it sustainable.

[31] https://research-software-ecosystem.github.io
[32] https://everse.software/about/objectives
[33] https://elixir-europe.org/about-us/how-funded/eu-projects/steers
[34] https://elixir-europe.org/about-us/how-funded/eu-projects/steers/wp3
Methods

Computational analysis
The Galaxy platform dataset was collected from the Galaxy ToolShed (26 February 2025) via the ToolShed API. Due to limited data access, only five metrics were available for analysis: tool name, total downloads, revision count, unique owners and description. Based on their descriptions, tools were grouped into 15 primary categories using a topic modelling approach. Their primary categories were generated through the modelling process and were not derived from the Galaxy taxonomy.

To assess company engagement and collaboration, 531 public GitHub repositories from 15 interviewed companies were extracted using the GitHub GraphQL API. The analysis compared how companies of different sizes engage with the open source community, focusing on key metrics such as average stars, forks, commits and contributors.
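The comparison by company size can be sketched as a simple group-and-average step. Everything in this block is illustrative: the query is a minimal example of what a GitHub GraphQL request for public repositories might look like (it is not the query used in this study and is not executed here), commit and contributor counts would need additional requests, and the sample records and the `average_metrics_by_size` helper are hypothetical.

```python
from statistics import mean

# Minimal example query, shown for orientation only; check field names
# against the current GitHub GraphQL schema before relying on it.
REPO_QUERY = """
query ($login: String!, $cursor: String) {
  organization(login: $login) {
    repositories(first: 100, privacy: PUBLIC, after: $cursor) {
      pageInfo { hasNextPage endCursor }
      nodes { name stargazerCount forkCount }
    }
  }
}
"""

METRICS = ("stars", "forks", "commits", "contributors")

# Hypothetical, hand-written records standing in for extracted repository data.
repositories = [
    {"company_size": "small", "stars": 10, "forks": 2, "commits": 100, "contributors": 3},
    {"company_size": "small", "stars": 30, "forks": 4, "commits": 300, "contributors": 5},
    {"company_size": "large", "stars": 200, "forks": 50, "commits": 1000, "contributors": 20},
]


def average_metrics_by_size(repos):
    """Average each metric over the repositories of each company-size group."""
    grouped = {}
    for repo in repos:
        grouped.setdefault(repo["company_size"], []).append(repo)
    return {
        size: {metric: mean(row[metric] for row in rows) for metric in METRICS}
        for size, rows in grouped.items()
    }


averages = average_metrics_by_size(repositories)
```

Averaging per repository, as done here, weights prolific companies more heavily; averaging per company first would be an equally defensible design and may give different results.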
To analyse global collaboration through GitHub, we identified the geographic location of each company, whereas the location of each contributor was extracted from their public GitHub profile. The intensity of collaboration from a specific region was measured by counting the total number of contributors originating from that location.
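Counting contributors per location can be sketched as below. The input pairs are invented, `contributors_by_location` is a hypothetical helper, and two choices are assumptions rather than details from the study: each login is counted once, and locations are only lower-cased and trimmed. Profile locations are free text, so mapping them to regions would need a further normalisation or geocoding step that is not shown.

```python
from collections import Counter


def contributors_by_location(contributors):
    """Count unique contributors per location, skipping profiles without one.

    `contributors` is an iterable of (login, location) pairs, where location
    may be None or an empty string.
    """
    seen_logins = set()
    counts = Counter()
    for login, location in contributors:
        if not location or login in seen_logins:
            continue
        seen_logins.add(login)
        counts[location.strip().lower()] += 1
    return counts


# Invented sample: (login, free-text profile location).
sample_contributors = [
    ("alice", "Berlin"),
    ("bob", "berlin "),
    ("carol", None),
    ("dave", "Cambridge, UK"),
    ("alice", "Berlin"),  # same person contributing to a second repository
]
location_counts = contributors_by_location(sample_contributors)
```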
This study did not differentiate between CI/CD workflows and analytical or ML workflows; we note their different purposes and some shared and distinct features.

All data generated and analysed in this study are publicly available. The complete dataset, including metadata and documentation, is accessible through Zenodo at https://doi.org/10.5281/zenodo.17637118.
Interview study
From a pool of 678 relevant companies founded between 2014 and 2024, we contacted 258 organisations with accessible founder or management contacts in the health data sector. Fifteen companies agreed to interviews between November 2024 and March 2025. Sessions (30–60 minutes) were held via Zoom or in person at major conferences such as BioTechX 2024 and the Festival of Genomics 2025. Participating companies were categorised by size – small, mid and large – based on their LinkedIn employee counts. In line with EU Recommendation 2003/361, we considered a company as small (fewer than 50 employees and/or less than €10m turnover), medium (fewer than 250 employees and/or less than €50m turnover) and large (above these thresholds).
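The size categories can be expressed as a small classification function. This sketch uses the headcount thresholds only, on the assumption that LinkedIn employee counts were the deciding input; the turnover criteria are deliberately left out, and `company_size` is a hypothetical helper name.

```python
def company_size(employees):
    """Classify a company as small, medium or large from its employee count.

    Thresholds follow the headcount limits quoted above (50 and 250);
    turnover is not considered in this simplified version.
    """
    if employees < 0:
        raise ValueError("employee count cannot be negative")
    if employees < 50:
        return "small"
    if employees < 250:
        return "medium"
    return "large"


sizes = {count: company_size(count) for count in (12, 49, 50, 249, 250)}
```

The boundary values show the intended behaviour: 49 employees is still small, 50 is medium, and 250 is large.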
The interviews followed a set of questions that had been developed around workflows and OSS regarding use, active sharing and business-model relevance (value creation, delivery, capture). Responses were analysed based on thematic coding to interpret the data.

Interviewees represented their companies throughout, and they reviewed all quoted material. All quotations are used with company consent, and most companies permitted direct attribution. The interviewees that only wanted to share their personal views and experiences preferred to be referenced as anonymous.

We note a sampling bias – most companies had some existing knowledge of ELIXIR, potentially increasing their likelihood of engagement in open source activities. This bias may have influenced our interpretation, particularly regarding the open source ingestion strategy.

While the interviews addressed OSS and workflows, we adopted a flexible approach to the definition of FOSS. This allowed for a wide range of discussions on how companies capture value from OSS without needing to provide it for free.
Credits and Acknowledgements

With special thanks to all who contributed including the interviewed companies and those within the ELIXIR community who provided helpful feedback.

Hinxton, UK, November 2025
Published under the CC BY 4.0 licence.
http://doi.org/10.5281/zenodo.17570133

ELIXIR HUB
[email protected]
South Building
Wellcome Genome Campus
Hinxton
CB10 1SD, UK