Who Do You Think You Are? Creating RSE Personas from GitHub Interactions in Research Software Repositories…

Author: Anderson, Felicity
Publisher: Zenodo
DOI: 10.5281/zenodo.17337652
Source: https://zenodo.org/records/17337652/files/RSE_Personas_HiRSE_2025-10-09.pdf
Who Do You Think You Are?
Creating RSE Personas from GitHub Interactions in Mid-Size Research Software (RS) Repositories
Flic Anderson, Dr Julien Sindt, Prof Neil Chue Hong
(EPCC, University of Edinburgh)
Preprint: doi.org/10.48550/arXiv.2510.05390
RSE Personas:
Patterns of collaborative Research Software (RS) Repository interactions on GitHub.
Why?
If we can name something, we can think, talk about, and change it!
Research Objectives:
•Re-find initial clusters from pilot study (High and Low Interactivity)...
•Applying further clustering to identify additional high-resolution personas within these low-resolution groupings
•Proving whether high-responsibility type interactions (Issue Assignment, PR Closure) strongly influence RSE Persona creation
Pilot Study Clusters
•Pilot study exploring where and how I might find personas... (45 repos, 791 repo-individuals)
•Originally expected personas in 4 quadrants
•UIT: Unique Interaction Types
oInteraction Variety
•MRC: Mean Repository Contribution
oInteraction Volume
•Low to High Axes
•Found good evidence for 2...
http://dx.doi.org/10.5281/zenodo.14988656
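As a rough illustration of the two pilot-study axes, the sketch below computes UIT and MRC for one repo-individual. It rests on assumptions that the slides do not state: RC for an interaction type is taken to be the individual's share of that type's events in the repository, MRC is taken to be the plain mean of those shares over six types, and the interaction-type names and all counts are hypothetical.

```python
# Hypothetical interaction-type names; the slides only say there are six types.
INTERACTION_TYPES = (
    "commit_created", "issue_created", "issue_closed",
    "issue_assigned", "pr_created", "pr_closed",
)


def unique_interaction_types(counts):
    """UIT: how many interaction types the individual used at least once."""
    return sum(1 for t in INTERACTION_TYPES if counts.get(t, 0) > 0)


def repository_contribution(counts, repo_totals):
    """RC per type: the individual's share (in percent) of the repo's events."""
    rc = {}
    for t in INTERACTION_TYPES:
        total = repo_totals.get(t, 0)
        rc[t] = 100.0 * counts.get(t, 0) / total if total else 0.0
    return rc


def mean_repository_contribution(counts, repo_totals):
    """MRC: mean of the per-type RC values (assumed definition)."""
    rc = repository_contribution(counts, repo_totals)
    return sum(rc.values()) / len(INTERACTION_TYPES)


# Made-up counts for one repo-individual and for their repository as a whole.
person = {"issue_created": 2, "pr_created": 1}
repo = {"commit_created": 400, "issue_created": 50, "issue_closed": 40,
        "issue_assigned": 20, "pr_created": 25, "pr_closed": 25}

uit = unique_interaction_types(person)
mrc = mean_repository_contribution(person, repo)
print(uit, round(mrc, 2))  # prints: 2 1.33
```

With these made-up counts the individual sits in the low-variety, low-volume corner of the UIT/MRC plane: two of six interaction types used, and a mean share of about 1.33%.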
Data Analysis:
Per-Repo-Individual data on 6 Interaction Types
Variables focus on Variety and Volume of interactions
Initial clustering / analysis, then re-cluster subsets for better resolution

Hierarchical Clustering generated 3 initial clusters based on Interactivity Groupings
Longer vertical heights from node mean greater differences from alternative branch.
[Figure: dendrogram; branches labelled High Interactivity, Low Interactivity, Moderate Interactivity]
Calinski-Harabasz (CH) Index (1974) used to evaluate N cluster options without overfitting
http://dx.doi.org/10.5281/zenodo.14988656
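To make the role of the CH index concrete, here is a minimal pure-Python sketch of the standard Calinski-Harabasz ratio (between-cluster dispersion over within-cluster dispersion, each divided by its degrees of freedom). The toy data and the candidate labelings are invented for illustration; this is not the analysis pipeline behind the slides.

```python
def calinski_harabasz(points, labels):
    """Between-cluster over within-cluster dispersion; higher is better."""
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of points")

    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    between = 0.0
    within = 0.0
    for rows in clusters.values():
        c = centroid(rows)
        between += len(rows) * sq_dist(c, overall)
        within += sum(sq_dist(r, c) for r in rows)
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


# Toy 1-D data with two obvious groups, scored under two candidate labelings.
toy = [(0.0,), (2.0,), (10.0,), (12.0,)]
candidates = {
    2: [0, 0, 1, 1],  # two clusters
    3: [0, 0, 1, 2],  # three clusters: the right-hand group is split
}
scores = {k: calinski_harabasz(toy, labs) for k, labs in candidates.items()}
print(scores)  # prints: {2: 50.0, 3: 25.5}
```

The two-cluster labeling scores higher here, which shows how the index penalises an extra split that does not buy much additional compactness.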
Principal Component Analysis
PCA Eigenvectors explain:
X: 81.36%
Y: 5.54% ...
Z: 4.58% of variance in data
[Figure legend: High, Moderate, Low]
Feature Importance Analysis
Highest Importance Values:
X: RC PR Closed (39.81)
Y: RC Commit Created (27.74)
Z: RC Issue Closed (46.08)
[Figure legend: High, Moderate, Low]
3 levels of general interactivity groupings (considering both variety and volume): High, Moderate, Low Interactivity
Low Interactivity Personas
Key Splits: UIT, MRC
Ephemeral Contributor (91.80%), Very Low Interactivity (2:1)
•Very low mean UIT (1.56) and MRC (0.15%)
•Narrow and shallow contributions!
•Low net impact: ~no net issues / PR interactions (-0.08%; 0.17%)
•"ephemeral": only 3 mean interaction days and 0.23% RC, across an interaction period of 124 days (~4 months)
•May visit only to request a fix or new feature (higher issue creation than commits or closure of issues), but plenty of them: contribute lots of ideas
Occasional Contributor (5.54%), Low Interactivity (2:0)
•UIT mean is 3.42%, Mean MRC 3.41%
•low general contributions across moderate variety...
•Weak net RC Issue Closure (-2.03%) but weak net Creation of PRs (3.45%)
•~4% of all interaction days in their repo (mean 43), therefore "occasional contributor" across interaction period of 802.65 days (2.20 years)
•Higher frequency than rarer but higher-interaction personas, so still important to significant RS development within their projects
UIT: Unique Interaction Types [min=1, max=6]; RC: Repository Contribution [percentage]; MRC: Mean Repository Contribution [normalized percentage]
Moderate Interactivity Personas
Key Splits: PRs Closed, (Net) Created-Closed Issues RC
Project Organiser (1.55%), Low-Moderate Interactivity (0:0)
•Moderately high mean UIT: 4.88; MRC still low: 14.85%
•Varied (wide) but very shallow engagement with the repo...
•Only 10.70% RC of Commit Creation, but 22.71% of Assignment of Issue Tickets (range: 12.01%)
•High assignment but relatively low Issue/PR closure rates may mean "keep me in the loop"?
•Moderate time involvement (119.97 interaction days across 3.85 years)
•Matches 'low MRC, high UIT - Project Manager' role initially expected at outset of pilot study (renamed)
•Focus on managing projects and development effort instead of active development role?
Moderate Contributor (0.50%), Moderate Interactivity (0:1)
•High mean UIT (5.45) and moderate MRC: 31.32%
•Varied but not too deep contributions generally
•RC ranges moderate too: 22.34%
•Focussed on net Closure! RC values for PR and Issue Closure nearly double equivalent Creation types, leading to net Issue Closure of -43.40%; net closure of PRs of -19.25%
•278 Interaction days, across 2140 days (5.86 years) Interaction Period
•Uses dev management features frequently, but also does the dev work required to close items: 30.13% RC Commit Creation (c.f. Project Organisers!)
High Interactivity Personas
Key Splits: RC Issue Assignment, RC Commits Created
Low-Process Closer (0.12%), Moderate-High Interactivity (1:2)
•High UIT 5.15; moderate MRC 42.54%, but...
•Strong behaviour preference away from Assignment/Issue Creation (11.93% and 16.12%) towards PR Closure (84.13%)!
•Highest net PR closure (37.04% created minus closed 84.13% = net -47.10%) and strong net ticket closure (16.12% created minus closed 72.68% = net -56.56%) due to low opening rates
•Showing up! 347.72 mean interaction days; Interaction Period 2573 days (7.05 years); 53.5% of repositories' total
•Likely a fixer, keener on 'getting things done' by closing existing tickets/PRs than opening new ones ("low-process")
Low-Coding Closer (0.29%), High Interactivity (1:0)
•UIT 5.43; MRC 50.08%
•High variety of interactions, good volume!
•More consistent RCs than Low-Process Closers (range 41.97% c.f. 72.20%!)
•Low commit creation RC: 32.88%, (therefore "low-coding") ...but...
•Still high closure rates?! 74.85% Issue Closure and 71.64% PR Closure
•Present! 286.35 interaction days on average; Interaction Period 2362 days (6.47 years); 42.47% of repo's total days
•May be triaging PRs and Issue Tickets, closing duplicated/irrelevant items or working on items needing no commit creation to resolve them?
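The "net" figures quoted for the Low-Process Closer follow the pattern "RC created minus RC closed", so a negative value means more closing than opening. The short sketch below recomputes them from the rounded percentages on the slide; the function name and data layout are illustrative only. Recomputing from the rounded inputs gives -47.09 for PRs rather than the quoted -47.10, presumably because the slide's net value was derived from unrounded data (an assumption).

```python
def net_rc(created_pct, closed_pct):
    """Net RC in percentage points: share created minus share closed."""
    return created_pct - closed_pct


# Rounded RC values for the Low-Process Closer persona, as quoted on the slide.
low_process_closer = {
    "PR": {"created": 37.04, "closed": 84.13},
    "Issue": {"created": 16.12, "closed": 72.68},
}

net = {
    kind: round(net_rc(rc["created"], rc["closed"]), 2)
    for kind, rc in low_process_closer.items()
}
print(net)  # prints: {'PR': -47.09, 'Issue': -56.56}
```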
Active Contributor (0.20%), Very High Interactivity (1:1)
•Highest mean UIT (5.88) and highest MRC (69.10%)
•High variety and deep volume of contributions!
•Highest Issue Ticket Assignment of all personas (77.09% of assignments in their repos to these repo-individuals)
•Greatest net closure of PRs: -82.07% (RC Created Minus Closed Issues)
•Over 65% of all Interaction Days in their repo by them (397 Interaction Days) across Interaction Period of nearly 7 years (2523.62 days)
•High usage of development management features (issue tickets, PRs, assignment) AND impressive codebase contributions through commits
•Important core members of their repos
•Matches hypothesised persona!
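For readers who want the seven personas side by side, the sketch below collects the headline numbers from the slides into a small data structure. Two things are assumptions rather than statements from the slides: the percentage after each persona name is read as that persona's share of repo-individuals, and the bracketed pair such as 2:1 is kept as an opaque cluster label. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    name: str
    grouping: str
    label: str        # cluster label as printed on the slides, e.g. "2:1"
    share_pct: float  # assumed: share of repo-individuals in this persona
    mrc_pct: float    # Mean Repository Contribution quoted for the persona


PERSONAS = [
    Persona("Ephemeral Contributor", "Very Low Interactivity", "2:1", 91.80, 0.15),
    Persona("Occasional Contributor", "Low Interactivity", "2:0", 5.54, 3.41),
    Persona("Project Organiser", "Low-Moderate Interactivity", "0:0", 1.55, 14.85),
    Persona("Moderate Contributor", "Moderate Interactivity", "0:1", 0.50, 31.32),
    Persona("Low-Process Closer", "Moderate-High Interactivity", "1:2", 0.12, 42.54),
    Persona("Low-Coding Closer", "High Interactivity", "1:0", 0.29, 50.08),
    Persona("Active Contributor", "Very High Interactivity", "1:1", 0.20, 69.10),
]

# The quoted shares add up to 100%, which supports reading them as a partition.
total_share = round(sum(p.share_pct for p in PERSONAS), 2)

# Sorting by MRC reproduces the low-to-high interactivity ordering used above.
by_mrc = [p.name for p in sorted(PERSONAS, key=lambda p: p.mrc_pct)]

print(total_share)
print(by_mrc == [p.name for p in PERSONAS])
```

Laid out this way, the skew is easy to see: under the share reading, more than 97% of repo-individuals fall into the two low-interactivity personas, while the three high-interactivity personas together make up well under 1%.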
Limitations
Variable Selection
•UIT too simplistic, MRC is ok summary (with caveats)?
•"High Responsibility" Interaction Types (such as Assignment or PR Closure) important
•Commit Classification Methods (Vasilescu et al., 2014 or Hattori-Lanza, 2008) not different (commit size, file type, or message key words)
RS Repos vs Projects
•Forks discounted to focus on collaborative coding, but Kalliamvakou et al., 2016 include all forks, working at 'project' level
•'Offline' work and external tools...
Skewed towards Best Practices?
•Zenodo research repository: repos polished before publishing?
Vasilescu et al., 2014: http://dx.doi.org/10.1007/s10664-013-9244-1
Hattori-Lanza, 2008: http://dx.doi.org/10.1109/ASEW.2008.4686322
Kalliamvakou et al., 2016: http://dx.doi.org/10.1007/s10664-015-9393-5