
Integrating Diverse Research Data into one Repository: The BASS metadata.xlsx Crawler

Author: Ribas-Ribas, Mariana; Bibi, Riaz; Wurl, Oliver; Badewien, Thomas H.; Feenders, Christoph
Publisher: Zenodo
DOI: 10.5281/zenodo.17292147
Source: https://zenodo.org/records/17292147/files/20251001_RibasRibas_BASSDM.pdf
Biogeochemical processes and Air–sea exchange in the Sea-Surface microlayer (BASS)
PD Dr. Mariana Ribas Ribas
FAIR in Action: Open RDM Integration
01.10.2025, Göttingen
Multidisciplinary Research Network Driving BASS
DFG Research Unit: BASS
Multidisciplinary observational, experimental and modelling data
Heterogeneous Data Formats and Workflows
Sample Collection → Processing → Analysis → Storage → Sharing
Mesocosm and Wind Wave Tunnel study
Helgoland joint field campaign
Baltic Sea campaign
Research cruise Meteor expedition
Discrete Observational Data
Remote Sensing Data
Satellite observations
Research aircraft sensors
High-Resolution Data
In situ sensors continuous data
Autonomous platforms
Micro profiler
Heterogeneous Sources of Data
Different subprojects use different sampling strategies, instruments, formats, and vocabularies
Why was integration hard?
• Tabular data: XLSX, CSV, TXT
• Documents: PDF
• Images: JPG, PNG, TIFF
• Videos: MP4
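
As a small illustration of how these formats could be sorted before ingestion, the sketch below maps file extensions to the four groups listed above. The extension table comes from the list. The category names and the `classify` function are assumptions made for this example and are not part of the BASS tooling.

```python
from pathlib import Path

# Extension-to-category table built from the format list above.
# The category labels are illustrative, not an official vocabulary.
CATEGORIES = {
    "tabular": {".xlsx", ".csv", ".txt"},
    "document": {".pdf"},
    "image": {".jpg", ".png", ".tiff"},
    "video": {".mp4"},
}


def classify(filename: str) -> str:
    """Return the category of a file based on its extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    for category, extensions in CATEGORIES.items():
        if suffix in extensions:
            return category
    # Anything not in the table is flagged rather than guessed.
    return "unknown"
```

Files that fall outside the table come back as `"unknown"`, so unexpected formats can be reviewed by hand instead of being filed silently.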
How can we bring all these into one database in a FAIR and integrated way?
LinkAhead
• Fully open-source software (AGPL 3)
• Linked data files and metadata
• Easy and powerful search → find data across subprojects instantly
• Semantic core: Fast and easy tuning of the data model
• LinkAhead can be easily extended or modified according to user requirements
• Integrates into existing workflows (This BASS phase could be integrated in next)
• API available to link electronic lab notebook
• Automated export → FAIR-ready metadata to repositories (e.g., PANGAEA)
The Bridge: LinkAhead + Metadata.xlsx Crawler
BASS Metadata.xlsx Concept
• One central metadata spreadsheet template for all SPs
• Excel-based template – no extra software needed
• Familiar tool (Excel) – lowering the barrier to adoption
• Aligned with FAIR and repository (PANGAEA/LinkAhead)
• Ensures standardized structure while staying flexible
• Collects essential metadata:
  • Dataset title, description, keywords
  • Instrument/method
  • Spatial/temporal coverage
  • Contact/PI information
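
The essential metadata listed above can be pictured as one record per dataset. The sketch below is a hypothetical Python representation. The class name `DatasetMetadata`, the attribute names and the helper `missing_fields` are assumptions for illustration, since the actual column names of the BASS template are not given here.

```python
from dataclasses import dataclass, asdict


@dataclass
class DatasetMetadata:
    """One row of the (hypothetical) metadata template."""
    title: str
    description: str
    keywords: list[str]
    instrument_or_method: str
    spatial_coverage: str
    temporal_coverage: str
    contact: str


def missing_fields(record: DatasetMetadata) -> list[str]:
    """Return the names of fields that were left empty, in template order."""
    return [name for name, value in asdict(record).items() if not value]
```

A record with an empty description, for instance, would be reported as `["description"]`, which is the kind of feedback a submitter needs before the sheet is passed on.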
The Crawler
• Crawler automatically reads submitted metadata.xlsx files
• Transforms Excel rows into standardized metadata (JSON/XML)
• Standard fields (parameter, unit, instrument, date, location)
• Optional fields: Add extra properties (e.g., calibration ID, serial no.) → without changing the template
• Prepares FAIR-ready metadata for LinkAhead / PANGAEA
• Detects errors and missing values, and enforces controlled vocabularies (e.g., units, methods)
• Benefits:
  • Reduces manual curation workload
  • Ensures consistency across datasets
  • Early QA/QC of metadata before submission
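
The crawler steps above (map standard fields, carry optional fields along, flag missing values, check vocabularies, emit JSON) can be sketched in Python. This is a minimal illustration under stated assumptions, not the BASS crawler itself. The function names `process_row` and `crawl`, the `ALLOWED_UNITS` set and the `optional` key are invented for this example. Rows are passed in as plain dictionaries, so reading the workbook (for example with openpyxl or pandas) is left out.

```python
import json

# Standard fields named on the slide.
STANDARD_FIELDS = ("parameter", "unit", "instrument", "date", "location")

# Hypothetical controlled vocabulary for units; the real list is not given.
ALLOWED_UNITS = {"µmol/L", "°C", "m/s"}


def process_row(row: dict) -> tuple[dict, list[str]]:
    """Map one spreadsheet row to a metadata record plus a list of problems."""
    record, errors = {}, []
    for name in STANDARD_FIELDS:
        value = row.get(name)
        if value is None or str(value).strip() == "":
            errors.append(f"missing value: {name}")
        else:
            record[name] = value
    unit = record.get("unit")
    if unit is not None and unit not in ALLOWED_UNITS:
        errors.append(f"unit not in controlled vocabulary: {unit}")
    # Any extra, non-empty column is kept as an optional property,
    # so the template itself never has to change.
    extras = {key: value for key, value in row.items()
              if key not in STANDARD_FIELDS and value not in (None, "")}
    if extras:
        record["optional"] = extras
    return record, errors


def crawl(rows: list[dict]) -> tuple[str, dict]:
    """Return valid records as JSON text and a report keyed by row number."""
    records, report = [], {}
    for index, row in enumerate(rows, start=1):
        record, errors = process_row(row)
        if errors:
            report[index] = errors
        else:
            records.append(record)
    return json.dumps(records, ensure_ascii=False, indent=2), report
```

In this sketch only rows without problems reach the JSON output. Rows with problems appear in the report instead, which mirrors the idea of early QA/QC before anything is submitted to LinkAhead or PANGAEA.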
Scientists fill Excel → Crawler Processes → LinkAhead → PANGAEA
“Final” Data Management System
Lessons Learned
• FAIR in practice: Metadata harmonization makes data Findable and Interoperable
• Training and guidance are essential – clear instructions needed for consistent metadata
• Automation reduces errors – crawler catches missing values and enforces vocabularies
• Challenges:
  • Convincing scientists to fill templates properly
  • Balance required → completeness vs. user-friendliness is a trade-off
  • Controlled vocabularies still evolving
Integration achieved: 9 SPs feeding into one database
Reusable model: This template + crawler workflow can scale to other consortia