Some Tips on Modernizing Legacy Software

Author: Koufos, Alexander
Publisher: Zenodo
DOI: 10.5281/zenodo.17280173
Source: https://zenodo.org/records/17280173/files/US_RSE_25_Poster.pdf
Alex Koufos (he/him)
Stanford University & Joint Science Operations Center
Overview
Legacy codebases often pose considerable challenges, including a scarcity of tests, monolithic
architecture, inadequate documentation, and a lack of standardized development practices.
These obstacles typically arise from the absence of formal testing, complicating efforts to
maintain and update the code for compatibility with newer hardware and software. Consequently,
writing unit tests often necessitates refactoring the code, which involves restructuring it
without compromising its original functionality. However, this raises a critical question: How can
one refactor without having tests in place to verify functionality?
This poster will explore the incorporation of software better practices, with a particular
emphasis on testing, to facilitate the modernization of legacy codebases. By adopting a
structured approach to testing, our aim is to enhance the reliability and maintainability of the code
while preserving the current functionality and stability of software that produces science-ready
data. The discussions will focus on practical strategies for implementing testing frameworks,
establishing a culture of continuous integration, and ensuring that updates do not disrupt the
production of critical scientific outputs. Through these efforts, we can create a more robust
and adaptable codebase that meets the demands of modern scientific research.
Why Good Research Requires Better Software Practices
What Makes Good Research?
According to The National Academies of Sciences, Engineering, and Medicine, reproducibility (i.e.
computational reproducibility) is defined “as obtaining consistent computational results using the
same input data, computational steps, methods, code, and conditions of analysis” [8].
Reproducibility is the cornerstone of reliable research, and most research cannot be done without software.
Thus, better software practices are even more important in today’s research environment.
A Failure to Reproduce?
Figure 1 shows researchers' ability to reproduce results within their fields. In all research
communities, researchers struggled to reproduce the results of others, and even themselves.
Figure 1. Failure to Reproduce Published Results (Baker, M. 2016) [1]. Higher values represent a greater failure in
reproducing experiments.
In most fields, researchers have been unable to reproduce ≈50% of their own work!
Modernization of Legacy Code
Version Control & Simple Workflow
• Initialize a Git repo (if missing); enforce a feature-branch → PR model
• Require at least one passing test before merging
• Tag releases that correspond to reproduced scientific results
Perform Code Reviews
• Review every change for style, test adequacy, and documentation updates
• Use checklists: “Is there a test?”, “Did we update docs?”, “Did we preserve behavior?”
Incremental Documentation
• Document why a module exists, not just what it does
• Store docs alongside code (Markdown in docs/ or docstrings)
• Link test cases to the corresponding documentation sections
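One possible way to keep the "why", the "what", and a linked test in one place is a docstring with an embedded doctest. The sketch below is illustrative only: the function name and the rationale text are hypothetical and do not come from any particular codebase.

```python
def clip_negative_counts(counts):
    """Replace negative detector counts with zero.

    Why this exists (hypothetical rationale): the legacy reader subtracts a
    dark-frame estimate, which can push low counts below zero, and later
    stages take a square root of every value.

    The example below doubles as a test that doctest can run:

    >>> clip_negative_counts([3, -1, 0])
    [3, 0, 0]
    """
    return [max(count, 0) for count in counts]
```

If this lived in a file such as `your_module.py` (a placeholder name), running `python -m doctest -v your_module.py` would execute the example, so the documentation and its test cannot drift apart silently.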
Leverage Existing Manual Tests
• Harvest scripts, example runs, or “known to be correct” output files
• Convert them into automated regression tests (e.g. pytest fixtures)
• Compare new runs against baseline results across OSes, compilers, and hardware
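The conversion from a trusted output file to an automated regression test might look like the sketch below. It is an illustration under stated assumptions, not the author's setup: `legacy_mean_flux` is a hypothetical stand-in for a legacy routine, the JSON baseline format is an arbitrary choice, and the relative tolerance is there because results can differ slightly across OSes, compilers, and hardware.

```python
import json
import math
from pathlib import Path


def legacy_mean_flux(samples):
    """Hypothetical stand-in for a legacy routine whose output is trusted."""
    return sum(samples) / len(samples)


def save_baseline(path: Path, values: dict) -> None:
    """Record known-good outputs once, from a trusted run."""
    path.write_text(json.dumps(values, indent=2, sort_keys=True))


def compare_to_baseline(path: Path, new_values: dict, rel_tol: float = 1e-9) -> list:
    """Return the sorted keys that are missing or differ beyond rel_tol."""
    baseline = json.loads(path.read_text())
    mismatches = []
    for key in sorted(set(baseline) | set(new_values)):
        if key not in baseline or key not in new_values:
            mismatches.append(key)
        elif not math.isclose(baseline[key], new_values[key], rel_tol=rel_tol):
            mismatches.append(key)
    return mismatches


def test_mean_flux_matches_baseline(tmp_path):
    # tmp_path is pytest's built-in temporary-directory fixture.
    # In a real repository the baseline file would be committed, not written here.
    baseline_file = tmp_path / "baseline.json"
    save_baseline(baseline_file, {"mean_flux": 2.0})
    result = {"mean_flux": legacy_mean_flux([1.0, 2.0, 3.0])}
    assert compare_to_baseline(baseline_file, result) == []
```

Returning the list of mismatched keys, rather than a bare pass/fail, makes it easier to see which quantity drifted when a run on a new platform disagrees with the baseline.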
Establish a Testing Framework
• Unit tests for pure functions (aim ≥70% coverage)
• Integration tests that stitch modules together (use historic input/output pairs)
• System/acceptance tests that validate end-to-end scientific results
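As a rough illustration of the first two layers, the sketch below unit-tests a pure function and then checks a two-step pipeline against historic input/output pairs. Every name and number is hypothetical; the values were chosen so that the floating-point results are exact, whereas real scientific data would usually need a tolerance-based comparison.

```python
def normalize(values):
    """Pure function: scale values so that they sum to 1."""
    total = sum(values)
    if total == 0:
        raise ValueError("values must not sum to zero")
    return [v / total for v in values]


def cumulative(values):
    """Pure function: running sum of the input."""
    out, running = [], 0.0
    for v in values:
        running += v
        out.append(running)
    return out


def pipeline(values):
    """Two modules stitched together, as the legacy program would run them."""
    return cumulative(normalize(values))


# Hypothetical historic input/output pairs harvested from earlier trusted runs.
HISTORIC_PAIRS = [
    ([1.0, 1.0, 2.0], [0.25, 0.5, 1.0]),
    ([2.0, 2.0], [0.5, 1.0]),
]


def test_normalize_unit():
    assert normalize([1.0, 3.0]) == [0.25, 0.75]


def test_normalize_rejects_zero_total():
    try:
        normalize([0.0, 0.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError")


def test_pipeline_integration():
    for given, expected in HISTORIC_PAIRS:
        assert pipeline(given) == expected
```

The functions follow pytest's `test_` naming convention, so pytest would collect them without extra configuration. A system or acceptance test would follow the same pattern, but drive the whole program from its real entry point.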
Continuous Integration (CI) Pipeline
• Automate linting, unit-test execution, and cross-platform builds on each push
• Decide when it’s best to run the full integration suite on representative datasets
• Visual dashboards (GitHub Actions, GitLab CI, Jenkins) keep the team aware of breakages
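Whichever CI service is used, it can help to have one entry-point script that runs the same checks locally and on each push. The sketch below is one way to do that and is not tied to any CI product. The tool choices (ruff for linting, pytest for tests, both invoked with `python -m`) are assumptions, and the script name mentioned afterwards is hypothetical.

```python
import subprocess
import sys


def run_checks(checks):
    """Run each (name, command) pair in order; return the names that failed."""
    failed = []
    for name, command in checks:
        completed = subprocess.run(command)
        status = "ok" if completed.returncode == 0 else "FAILED"
        print(f"[{status}] {name}")
        if completed.returncode != 0:
            failed.append(name)
    return failed


# Assumed tools: ruff and pytest must be installed in the active environment.
CHECKS = [
    ("lint", [sys.executable, "-m", "ruff", "check", "."]),
    ("unit tests", [sys.executable, "-m", "pytest", "-q"]),
]


def main():
    """Exit status 0 only if every check passed, so a CI job can gate on it."""
    return 1 if run_checks(CHECKS) else 0
```

A CI job would then only need to call the script (for example a file named `ci_checks.py` that ends with `sys.exit(main())`). Slower integration suites on representative datasets could be kept in a separate list and run on a schedule rather than on every push.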
Map & Modularize
• Identify functional “chunks” and their dependencies
• Sketch a dependency graph; look for natural separation points
• Refactor incrementally: extract a module → write tests → lock behavior
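For a codebase that happens to be written in Python, a first dependency sketch can be generated rather than drawn by hand. The example below is a minimal sketch under that assumption: it reads top-level `import` statements with the standard `ast` module and orders the modules with `graphlib` (Python 3.9+) so that modules with no local dependencies come first. It ignores relative imports and dynamic imports, and all function names are illustrative.

```python
import ast
from graphlib import TopologicalSorter


def local_imports(source: str, known_modules: set) -> set:
    """Names of known local modules that this source text imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Relative imports such as "from . import x" have no module name
            # and are skipped in this sketch.
            found.add(node.module.split(".")[0])
    return found & known_modules


def dependency_graph(sources: dict) -> dict:
    """Map each module name to the set of local modules it depends on."""
    known = set(sources)
    return {name: local_imports(text, known) - {name} for name, text in sources.items()}


def extraction_order(graph: dict) -> list:
    """Dependencies first: early entries are candidates to extract and test first."""
    return list(TopologicalSorter(graph).static_order())
```

Here `sources` is expected to map module names to their source text, for example built by reading each `.py` file in the repository. A cycle in the graph makes `TopologicalSorter` raise `graphlib.CycleError`, which is itself a useful signal that two chunks are not yet separable.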
Reproducibility Checks
• Pick a published result, re-run it on the refactored code, and capture the output
• Treat the reproduced result as a golden integration test
• Reference the “Improving reproducibility through better software practices” tutorial for
concrete steps
Useful Links and References
[1] M. Baker. 1,500 scientists lift the lid on reproducibility. Nature, 533:452–454, 2016. doi: 10.1038/533452a.
[2] Better Scientific Software (BSSw). Testing Resources. https://bssw.io/items?topic=testing, 2024.
[3] N. U. Eisty, D. E. Bernholdt, A. Koufos, D. J. Luet, and M. Mundt. Ten essential guidelines for building high-quality research software, 2025. URL https://arxiv.org/abs/2507.16166.
[4] GitKraken. GitKraken - Git Flow Page. https://www.gitkraken.com/learn/git/git-flow#github-flow, 2022.
[5] Intersect - Research Software Engineer Training. Testing Lesson. https://intersect-training.org/testing-lesson/, 2022.
[6] A. Koufos. Software Best Practices for Reproducible Open Science. Apr. 2024. doi: 10.5281/zenodo.10994996.
[7] A. Koufos, M. Mundt, and N. Eisty. Software Testing Practices for Reproducible Open Science. Dec. 2024. doi: 10.5281/zenodo.14291633.
[8] Committee on Reproducibility and Replicability in Science. Reproducibility and Replicability in Science. National Academies Press, 2019. ISBN 978-0-309-48616-3. doi: 10.17226/25303.
[9] Stack Overflow. Stack Overflow Survey. https://survey.stackoverflow.co/2022/#version-control-version-control-system-prof, 2022.
[10] US-RSE. United States Research Software Engineer Association (US-RSE). https://us-rse.org, 2017.
Practical Tips & Common Pitfalls
Tip | Why It Matters
Avoid massive style changes while refactoring | • Style changes can mask regressions • Keeps the focus on functional correctness
Introduce a linter early (e.g., ruff, clang-format) | • Enforces a consistent style without manual effort • Reduces cognitive load for reviewers
Remove dead code before adding tests; if possible, run static analysis | • Shrinks the surface area that needs testing • Helps with readability of the codebase
Use disposable branches; merge into main ⇐⇒ all tests pass | • Allows bold architectural changes • Limits risk to the stable/release version of the software
Document “known-issues” alongside fixes | • Future contributors see the rationale behind workarounds • Improves onboarding speed
Cross-platform CI | • Guarantees portability of scientific pipelines • Increases the ability to reproduce results
Take the First Step Today
• Start small: pick one module, write a single unit test, and commit the change.
• Make testing visible: add a badge (“Tests ✓ 85%”) to the repository README.
• Share success: present a short demo at your next lab meeting and show how a failing test
caught a regression.
• Iterate: repeat the cycle, gradually expanding coverage and modularity.
Bottom line
Even modest, test-driven refactoring yields measurable confidence gains, faster onboarding, and
reproducible science.
Acknowledgments
RSE work was supported by NASA Contract NAS5-02139 (SDO/HMI) and NASA Cooperative Agreement 80NSSC22M0162 (COFFIES
DRIVE Science Center). I would also like to acknowledge the United States Research Software Engineer Association (US-RSE) [10] for supporting
research software engineering and RSEs more broadly. A special thanks to Schmidt Sciences for funding many, including myself, to attend
the USRSE’25 Conference. Lastly, I’d like to thank Nasir Eisty, Miranda Mundt, David Luet, and David Bernholdt for conversations about
better practices, including some about legacy software, for our paper on arXiv [3].
Check out the digital version of this poster from our Zenodo collection of
USRSE’25 papers and posters!
Alex Koufos: akou [email protected]o d.edu US-RSE’25 GitHub: exo icd