
Benchmarking Darshan and Recorder for HPC I/O Profiling and Tracing

Author: Zhu, Zhaobin; Derstroff, Leonie; Neuwirth, Sarah
Publisher: Zenodo
DOI: 10.5281/zenodo.17654733
Source: https://zenodo.org/records/17654733/files/REXIO2025-Paper2-Neuwirth.pdf
Zhaobin Zhu, Leonie Derstroff, Sarah M. Neuwirth
Johannes Gutenberg University Mainz, Germany
neuwi [email protected]
REX-IO 2025 Workshop, IEEE CLUSTER, September 2025
Background & Motivation
Why HPC I/O Analysis Matters
Benchmarking Darshan and Recorder for HPC I/O Profiling and Tracing • ©Sarah M. Neuwirth • Johannes Gutenberg University Mainz
Factors influencing I/O performance at each layer:
Application
• Number of processes
• Request sizes
• Access patterns
• I/O operation
• Data volume
Network
• Message sizes
• Network topology
• Network paths
• Network type
File System
• Type of file system
• Disk types
• Stripe sizes
• File hierarchy
• Shared access
[Figure: HPC I/O stack across compute paradigms and storage paradigms. Applications (including TensorFlow, Pandas, and virtual machines) sit on top of high-level I/O libraries, low-level I/O libraries (MPI-IO), and an I/O forwarding layer; below are the parallel file system, object store, DAOS, and cloud store (S3), backed by HDD, SSD, NVM, tape, and RAID. Panel labels: Diverse Workloads, Diverse Software, Diverse Architecture, Diverse Hardware.]
• HPC workflows are increasingly data-intensive
• I/O bottlenecks limit scalability and throughput
• Profiling & tracing tools are essential for optimization
Background & Motivation
Landscape of I/O Tools
• Many tools, different design philosophies
• Challenge: Which tool to use for which purpose?
Tool          Languages
Darshan       C, Python, Perl
Recorder      C, C++, Python
Score-P       C, C++, Fortran
TAU           C, C++, Java
Scalasca 2.x  C, C++
Beacon        C, Python, JavaScript
SIOX          C, C++, Python
DFTracer      C, C++, Python
[The slide also marks each tool's support for Profiling, Tracing, and Monitoring; the per-tool check marks did not survive extraction.]
Background & Motivation
Problem Statement
Multiple performance analysis tools for I/O are available
Tools produce different metrics & perspectives
Users struggle to compare or combine results
Our questions:
− How do Darshan and Recorder differ in practice?
− What are their trade-offs in scalability vs. detail?
Tools in Focus

Tools in Focus
Darshan and Recorder in a Nutshell

Output
  Darshan: Darshan binary
  Recorder: Recorder binary
Analysis Tools / Format
  Darshan: Darshan logs, Darshan util scripts, Gauge, DXT Explorer (trace analysis)
  Recorder: Recorder logs, recorder-viz, Chromium trace file, Parquet
Default Instrumentation
  Darshan: LD_PRELOAD
  Recorder: LD_PRELOAD
Interface Categorization
  Darshan: MPI-IO, POSIX, HDF5, PnetCDF, Lustre, and STDIO
  Recorder: MPI-IO, POSIX, HDF5, PnetCDF, NetCDF
I/O Grouping
  Darshan:
  • Functions are grouped into read & write for each module
  • Calls that are neither read nor write are not displayed in the trace
  • Original function names are not retained in the DXT
  Recorder: Original function names are retained, with additional categories such as module and operation type (read / write)
I/O Trace Scope
  Darshan: Trace only covers intercepted I/O read / write operations
  Recorder: Several POSIX functions are intercepted, along with several other MPI calls that do not include MPI_Init and MPI_Finalize
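
Both tools use LD_PRELOAD as their default instrumentation. As a minimal sketch of what that means in practice, the helper below builds an environment that preloads a tool's shared library before launching a command. The library path and the function names are hypothetical placeholders, not taken from either tool's documentation; under an MPI launcher the variable would additionally have to reach every rank.

```python
import os
import subprocess

# Hypothetical install location; substitute the shared library of the tool you use.
HYPOTHETICAL_TOOL_LIB = "/opt/io-tools/lib/libiotool.so"


def preload_env(tool_library, base_env=None):
    """Return a copy of the environment with tool_library first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD")
    # ld.so accepts a colon-separated list; keep anything already preloaded.
    env["LD_PRELOAD"] = tool_library if not existing else f"{tool_library}:{existing}"
    return env


def run_instrumented(command, tool_library=HYPOTHETICAL_TOOL_LIB):
    """Run command (a list of arguments) with the tool library preloaded."""
    return subprocess.run(command, env=preload_env(tool_library), check=False)
```

Because the library is injected at load time, the application does not need to be recompiled; the dynamic linker resolves calls such as open or write to the tool's wrappers first.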
Recorder: Parallel I/O Tracing Tool
• Tracing tool at function-call level
• Captures POSIX, MPI-IO, full HDF5 (749 calls)
• Includes timestamps, parameters, process-level details
Darshan: I/O Characterization Tool
• Lightweight profiling tool
• Aggregated statistics (counts, sizes, bandwidth)
• Optional DXT module for detailed POSIX/MPI-IO traces
• Produces compact summaries, low runtime overhead
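
The contrast between aggregated statistics and function-call-level tracing can be illustrated with a small sketch. The call records and both functions below are hypothetical and only mimic the two styles of output; they do not reproduce either tool's log format or counters.

```python
from collections import defaultdict

# Hypothetical intercepted calls: (rank, function, bytes, start_s, end_s)
CALLS = [
    (0, "open", 0, 0.00, 0.01),
    (0, "pwrite", 4096, 0.01, 0.03),
    (0, "pwrite", 4096, 0.03, 0.05),
    (0, "pread", 8192, 0.05, 0.06),
    (0, "close", 0, 0.06, 0.07),
]

READS = {"read", "pread"}
WRITES = {"write", "pwrite"}


def profile(calls):
    """Profiling-style view: a few aggregated counters per operation class."""
    stats = defaultdict(lambda: {"count": 0, "bytes": 0, "time": 0.0})
    for _rank, func, nbytes, start, end in calls:
        kind = "read" if func in READS else "write" if func in WRITES else "other"
        stats[kind]["count"] += 1
        stats[kind]["bytes"] += nbytes
        stats[kind]["time"] += end - start
    return dict(stats)


def trace(calls):
    """Tracing-style view: one record per call, original name and timestamps kept."""
    return [
        {"rank": rank, "func": func, "bytes": nbytes, "start": start, "end": end}
        for rank, func, nbytes, start, end in calls
    ]
```

The profile stays the same size no matter how many calls are made, which is why it scales well; the trace grows with every call but preserves ordering, names, and timing for each one.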
Tools in Focus
Darshan and Recorder: Key Differences
Snyder, S., 2022. Darshan: Enabling Insights into HPC I/O Behavior. ECP Community BoF Days.
Wang, C. et al., 2025. Recorder: Comprehensive Parallel I/O Tracing and Analysis.
Tools in Focus
Comparison of Intercepted Functions

Both tools
  POSIX: open, open64, creat, creat64, dup, dup2, fileno, read, write, pread, pwrite, pread64, pwrite64, readv, writev, lseek, lseek64, __xstat(+64), __lxstat(+64), __fxstat(+64), mmap(+64), fsync, fdatasync, close, rename, fopen(+64), fdopen, fclose, fwrite, fprintf, fread, fseek, fseeko, fflush
  MPI-IO: PMPI_File_{close, iread_at, iread, iread_shared, iwrite_at, iwrite, iwrite_shared, open, read_all_begin, read_all, read_at_all_begin, read_at_all, read_at, read, read_ordered_begin, read_ordered, read_shared, set_view, sync, write_all_begin, write_all, write_at_all_begin, write_at_all, write_at, write, write_ordered_begin, write_ordered, write_shared}
  HDF5: H5Fcreate, H5Fopen, H5Fflush, H5Fclose, H5Dcreate1/2, H5Dopen1/2, H5Dread, H5Dwrite, H5Dclose, H5Oopen, H5Oclose, H5Dflush, H5Oopen_by_addr, H5Oopen_by_idx, H5Oopen_by_token
Darshan only
  POSIX: __open_2, openat(+64), dup3, mkstemp, mkostemp, mkstemps, mkostemps, preadv(+64/2), pwritev(+64/2), aio_read/write(+64), aio_return(+64), lio_listio(+64), freopen(+64), fputc, putw, fputs, printf, vprintf, vfprintf, fgetc, getw, _IO_getc, _IO_putc, __isoc99_fscanf, fscanf, vfscanf, fgets, fseeko64, fsetpos(+64), rewind
  MPI-IO: –
  HDF5: –
Recorder only
  POSIX: msync, getcwd, mkdir, rmdir, chdir, link, unlink, linkat, symlink(+at), readlink(+at), chmod, chown(+lchown), utime, opendir, readdir, closedir, rewinddir, __xmknod(+at), fcntl, pipe, mkfifo, umask, access, faccessat, tmpfile, truncate, ftruncate, ftell, remove, ftello
  MPI-IO: PMPI_File_{set_size, seek, seek_shared, get_size, iwrite_at_all, iwrite_all}
  HDF5: Recorder provides full HDF5 API coverage
* Functions are taken from the tools' source code. Functions in blue color are categorized as STDIO/ISO-C by the tools.
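
The three-way split used in the comparison (both tools, Darshan only, Recorder only) is plain set arithmetic over the wrapper lists. The sketch below shows the idea on a small hand-picked subset of POSIX functions from the comparison; the variable and function names are illustrative, and the full lists would have to be extracted from each tool's source code.

```python
# Small illustrative subsets; not the complete wrapper lists of either tool.
DARSHAN_POSIX = {"open", "read", "write", "pread", "pwrite", "close",
                 "dup3", "mkstemp", "openat"}
RECORDER_POSIX = {"open", "read", "write", "pread", "pwrite", "close",
                  "mkdir", "unlink", "ftruncate"}


def compare_coverage(first, second):
    """Split two sets of intercepted function names into shared and exclusive parts."""
    return {
        "both": sorted(first & second),
        "first_only": sorted(first - second),
        "second_only": sorted(second - first),
    }


coverage = compare_coverage(DARSHAN_POSIX, RECORDER_POSIX)
for group, functions in coverage.items():
    print(f"{group}: {', '.join(functions)}")
```

In this subset, the functions exclusive to Recorder are metadata operations (mkdir, unlink, ftruncate), the kind of calls that matter for metadata-heavy workloads.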
Evaluation
Related Work & Context
Fragmented Tooling Landscape
• Siloed Tool Views: Each tool sees a layer. None explain the whole system.
=> In case of I/O: App-level profilers (e.g., Darshan) vs. system tools (LDMS, DCDB)
• Workflow Scripts Instead of Workflows: Custom scripts per experiment = unscalable, unrepeatable.
=> Benchmarking tools (e.g., iperf, sockperf) often require client/server logic incompatible with SLURM
• Need: integrated multi-layer explainability
Category: Application-level
  Examples: Darshan, Recorder, Score-P
  Strengths: Fine-grained function tracing and lightweight profiling
  Limitations: No visibility into system-wide interactions
Category: System-level
  Examples: LDMS, DCDB, TACCStats
  Strengths: Aggregated I/O performance metrics
  Limitations: Cannot correlate application performance with system metrics
Category: End-to-end
  Examples: Ganglia, Nagios, Apollo
  Strengths: Holistic view of system utilization
  Limitations: Lacks deep profiling at kernel and network levels

Related Work & Context
Mango-IO: I/O Metrics Consistency Analysis
• Problem: metrics differ between tools → comparability issue
• Solution: Mango-IO converts Darshan/Recorder traces → OTF2 for consistency
• Finding: once normalized, discrepancies shrink
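
Why normalization helps comparability can be sketched as follows. This is not Mango-IO's implementation and does not produce OTF2; the record layouts, field names, and helper functions are hypothetical. One source already groups calls into read/write, the other keeps original function names, and both are mapped to one common event schema before metrics are computed.

```python
# Hypothetical record shapes; real Darshan DXT and Recorder logs look different.
GROUPED_RECORDS = [  # calls already grouped into read/write
    {"rank": 0, "op": "write", "length": 4096, "start": 0.01, "end": 0.03},
    {"rank": 0, "op": "read", "length": 8192, "start": 0.05, "end": 0.06},
]
NAMED_RECORDS = [  # original function names retained
    {"rank": 0, "func": "open", "nbytes": 0, "start": 0.00, "end": 0.01},
    {"rank": 0, "func": "pwrite", "nbytes": 4096, "start": 0.01, "end": 0.03},
    {"rank": 0, "func": "pread", "nbytes": 8192, "start": 0.05, "end": 0.06},
]

FUNC_TO_OP = {"read": "read", "pread": "read", "write": "write", "pwrite": "write"}


def from_grouped(records):
    """Map grouped read/write records to the common event schema."""
    return [{"rank": r["rank"], "op": r["op"], "bytes": r["length"],
             "start": r["start"], "end": r["end"]} for r in records]


def from_named(records):
    """Map per-function records to the common schema, dropping non-data calls."""
    events = []
    for r in records:
        op = FUNC_TO_OP.get(r["func"])
        if op is not None:
            events.append({"rank": r["rank"], "op": op, "bytes": r["nbytes"],
                           "start": r["start"], "end": r["end"]})
    return events


def bytes_by_op(events):
    """Compute the same metric on normalized events, regardless of their origin."""
    totals = {}
    for event in events:
        totals[event["op"]] = totals.get(event["op"], 0) + event["bytes"]
    return totals
```

Once both sources are expressed in the same schema, a metric such as bytes per operation is computed by a single function, so any remaining difference reflects what each tool captured rather than how each tool defines the metric.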
Liem, Radita, Sebastian Oeste, Jay Lofstead, and Julian Kunkel. Mango-IO: I/O Metrics Consistency Analysis. In 2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops), pp. 18-24. IEEE, 2023.
Related Work & Context
XIO: eXplainable I/O
Current tools answer what, not why
XIO proposes Master Architectural Plan (MAP) + DataCrumbs
Neuwirth, S., Devarajan, H., Wang, C., and Lofstead, J., 2025. XIO: Toward eXplainable I/O for HPC Systems. SSDBM'25.
Related Work & Context
XIO: eXplainable I/O
DataCrumbs: Low-Overhead Multi-Layer Profiling for enabling Explainable I/O
Different kernel stack calls can help identify buffered vs. unbuffered read calls.
DataCrumbs: eBPF-based, kernel + user tracing => causal explanations
Conclusions
Conclusions
Recommendations
Use Darshan when profiling needs to scale across many nodes, runtime perturbation must be minimal, and aggregated I/O summaries suffice.
Choose Recorder for fine-grained, trace-level inspection or when working with applications using complex or layered I/O stacks like HDF5.
When system-level behavior (e.g., metadata-heavy workloads, STDIO filtering) matters or needs control, Recorder's fine-grained policies are more adaptable than Darshan's fixed aggregation model.
Future: Hybrid workflows combining both (or other tools); consistency via Mango-IO?

Conclusions
Summary & Outlook
Tools are not Interchangeable
• Workload & goal matter
• Profiling vs. tracing vs. explainability: each gives a different truth
Metadata Standardization
• Can we converge on trace metadata schemas across tools?
• How do we ensure trace context is captured and preserved?
From Metrics to Meaning
• What constitutes a verified insight?
• Can we establish common ground between correctness verification and performance validation?
Vision: Integrated, Multi-layer Ecosystem for I/O Analysis
• Combining: Efficiency (Darshan-like), Detail (Recorder-like), Consistency (e.g., Mango-IO), Explainability (XIO/DataCrumbs)
Thank you for your Attention!
Dr. Sarah M. Neuwirth
Professor of Computer Science
Johannes Gutenberg University Mainz
Email: neuwi [email protected]
Website: https://www.hpca-group.de/
NHR South-West HPC Center: https://nhrsw.de/