FernUniversität in Hagen
CATALPA – Center of Advanced Technology for Assisted Learning and Predictive Analytics
The Marie Kondō Principle for Research Data in Education
Niels Seidel
Hagen, 2025/11/18
Image source: https://learn.konmari.com/
Books
WHAT IS THE PROBLEM WITH RESEARCH DATA?
MAKE A DATA MANAGEMENT PLAN
DATA KONMARI 101
DATA USE AND PROTECTION
METADATA AND PRESENTATION
PUBLISH RESEARCH DATA
DATA SHARING
SENTIMENTAL DATA
FAIR DATA FASHION
RESEARCH DATA POLICIES
RDM TOOLS
LIVING WITH RDM
WHAT IS THE PROBLEM WITH RESEARCH DATA?
Where I’ve started …
Where we are …
Melkamu Jate, Wabi; Striewe, Michael (2023): Umgang der DELFI-Community mit Forschungsdaten und Softwareartefakten - Eine Erhebung auf Basis der Tagungsbände im Zeitraum 2018-2022. 21. Fachtagung Bildungstechnologien (DELFI). DOI: 10.18420/delfi2023-27. Bonn: Gesellschaft für Informatik e.V. pp. 167-172.
Where we are …
Haim, A., Shaw, S., & Heffernan, N. (2023). How to Open Science: A Principle and Reproducibility Review of the Learning Analytics and Knowledge Conference. LAK23: 13th International Learning Analytics and Knowledge Conference, 156–164. https://doi.org/10.1145/3576050.3576071
LAK’22 + LAK’23
The Data-Driven Revolution in Education
Phases of digitization in education (Wahlster, 2017)
1. Phase
2. Phase
2018 BERT
…
2019 GPT-2
…
2025 GPT-5.1
Source: S. Joksimović, V. Kovanović, S. Dawson. The Journey of Learning Analytics. HERDSA Review of Higher Education
Developing Comprehensive Data Management Plans (DMPs)
How?
- make data management a work package in your project
- make DMP creation a joint task
- use a living document
- don’t use any forms and DMP tools
- blend structures and questions of multiple DMP templates to find a suitable structure for your project
- update and revise the document regularly
DATA KONMARI 101
Signs of Data Chaos
Research Data
● poor structure
○ inappropriate folder structure
○ inconsistent naming schemes
○ proprietary file formats
○ compressed data
● cluttered
○ unclear separation between raw data and processed data
○ work-in-progress data
○ secondary data such as paper drafts
○ redundant data, e.g., for multiple platforms (e.g., SPSS, R, Python)
○ conflicting file versions
○ overly bulky tables
○ unrelated comments
● lack of documentation
○ no general overview
○ no metadata or codebooks
Special Case: Research Software
● dedicated to personal use only
● multi-purpose tools
● install instructions missing
● no license declared
● unclear maturity level
● files/folders missing
● passwords mentioned
Data Konmari 101
1. Tidy everything up at once, in a short time and perfectly.
2. All data to be tidied up is collected in one place.
3. Decide what to keep based on the question: Does it make me happy when this data is used in this way?
4. Every file that is kept is assigned its place.
5. All files must be stored there correctly.
- Reserve time exclusively for data management
- Radical inventory instead of continuous chaos
- Establish a clean, sustainable system once and for all
Data Konmari 101
- Gather all data: local hard drives, external storage devices, cloud services, email attachments, USB sticks
- Create a central overview: What actually exists?
- Make data visible that is lying dormant in forgotten folders
- Create catalogues: What data records, analyses and raw data are available?
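The inventory step could be started with a small script. The following is a minimal sketch, not a prescribed tool: it walks a set of folders and writes one catalogue row per file. The function name `build_catalogue` and the column names are assumptions made for illustration only.

```python
import csv
import os
from datetime import datetime, timezone
from pathlib import Path


def build_catalogue(roots, out_csv):
    """Walk every folder in `roots` and write one CSV row per file found."""
    rows = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = Path(dirpath) / name
                try:
                    stat = path.stat()
                except OSError:
                    continue  # skip unreadable entries such as broken links
                rows.append({
                    "path": str(path),
                    "suffix": path.suffix.lower(),
                    "size_bytes": stat.st_size,
                    "modified_utc": datetime.fromtimestamp(
                        stat.st_mtime, tz=timezone.utc
                    ).isoformat(timespec="seconds"),
                })
    rows.sort(key=lambda row: row["path"])
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(
            fh, fieldnames=["path", "suffix", "size_bytes", "modified_utc"]
        )
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `build_catalogue([...], "catalogue.csv")` with the folders of a project (local disk, mounted cloud folder, USB stick) gives a single table that can be sorted by suffix, size or date to see what actually exists.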
Data Konmari 101
✅ Keep if data …
- ensures reproducibility (of published results)
- must be retained for legal/contractual reasons
- is unique and cannot be re-collected
- has potential for future research questions
❌ Remove if data is …
- a duplicate or outdated version
- test data or the result of failed analyses
- (raw data already available in processed form)
- scientifically/methodologically incorrect
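Of the removal criteria, only byte-identical duplicates can be found mechanically. Outdated versions or failed analyses still need a human decision. A minimal sketch, assuming all candidate files sit under one folder; the function names are illustrative:

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under `root` whose contents are identical."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_hash[sha256_of(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

The result is a list of candidate groups to review, not a deletion list: which copy to keep depends on where it is stored and who relies on it.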
Data Konmari 101
- Define a clear folder structure and storage locations
- Storage locations by function:
  - Active work: locally on the computer
  - Collaboration: Coscine, GitLab, GitHub
  - Long-term archiving: (institutional repository)
  - Publication: OSF, Zenodo, subject-specific repositories
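A clear folder structure is easier to keep when it is created by a script rather than by hand. The sketch below is one possible layout. The folder names in `PROJECT_LAYOUT` are assumptions for illustration, not taken from the slides, and should be adapted to the project.

```python
from pathlib import Path

# Hypothetical layout; the folder names are illustrative only.
PROJECT_LAYOUT = [
    "data/raw",        # untouched original data
    "data/processed",  # cleaned or derived data
    "code",
    "docs",
    "results",
]


def create_project_skeleton(base_dir):
    """Create the folders in PROJECT_LAYOUT and a README stub in each of them."""
    base = Path(base_dir)
    created = []
    for rel in PROJECT_LAYOUT:
        folder = base / rel
        folder.mkdir(parents=True, exist_ok=True)
        readme = folder / "README.md"
        if not readme.exists():  # never overwrite an existing README
            readme.write_text(
                f"# {rel}\n\nWhat, when, who, how: to be filled in.\n",
                encoding="utf-8",
            )
        created.append(folder)
    return created
```

Separating `data/raw` from `data/processed` addresses the unclear separation between raw and processed data listed among the signs of data chaos. The script can be run repeatedly without overwriting existing files.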
Data Konmari 101
- File names: Descriptive and consistent (YYYY-MM-DD_projectname_description_v01.csv)
- README files: In each project folder (what, when, who, how)
- Metadata: Use standards (Dublin Core, DataCite)
- Version control: Git for code, documented versions for data
- Backup rule: 3-2-1 (3 copies, 2 media, 1 external)
- Access control: Who is allowed to view/edit what?
- Data management plan (DMP)
- Software management plan (SMP)
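The file naming scheme from the first bullet can be checked automatically. This is a minimal sketch under stated assumptions: the scheme is read as date, project name, description and a two-digit version, and the restriction to lowercase letters, digits and hyphens inside each part is an added assumption, not something the slide specifies.

```python
import re
from datetime import date, datetime

# Assumed reading of the scheme YYYY-MM-DD_projectname_description_v01.csv
FILENAME_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2})_(?P<project>[a-z0-9-]+)"
    r"_(?P<description>[a-z0-9-]+)_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)"
)


def build_filename(day, project, description, version, ext):
    """Compose a file name that follows the scheme."""
    return f"{day.isoformat()}_{project}_{description}_v{version:02d}.{ext}"


def is_valid_filename(name):
    """Check the overall shape and that the date part is a real calendar date."""
    match = FILENAME_PATTERN.fullmatch(name)
    if match is None:
        return False
    try:
        datetime.strptime(match.group("date"), "%Y-%m-%d")
    except ValueError:
        return False
    return True
```

For example, `build_filename(date(2025, 11, 18), "demo", "survey-export", 1, "csv")` yields `2025-11-18_demo_survey-export_v01.csv`, where the project and description are hypothetical. A name such as `final_FINAL2.xlsx` is rejected by `is_valid_filename`.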
DATA PROTECTION FOR DATA USE
Data Protection for Data Use
Record of Processing Activities (VVT)
- purpose, data/categories, process, persons/parties, deletion
- technical-organizational measures
Informed consent
- OPT-IN
- Example: https://aple.fernuni-hagen.de/admin/tool/policy/viewall.php#policy-6
Without informed consent? (see full text)
Data Protection for Data Use
Ensure k-anonymity
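A dataset is k-anonymous if every record shares its combination of quasi-identifier values with at least k-1 other records. The check below is a minimal sketch of that definition. Which columns count as quasi-identifiers is a judgment the researcher has to make, and the column names and toy records here are made up for illustration.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group, i.e. the k the dataset satisfies."""
    if not records:
        return 0
    return min(group_sizes(records, quasi_identifiers).values())


def violating_groups(records, quasi_identifiers, k):
    """Return the value combinations that occur fewer than k times."""
    sizes = group_sizes(records, quasi_identifiers)
    return {combo: n for combo, n in sizes.items() if n < k}


# Made-up toy records, not real data.
toy_records = [
    {"age_group": "20-29", "program": "CS", "grade": 1.7},
    {"age_group": "20-29", "program": "CS", "grade": 2.3},
    {"age_group": "30-39", "program": "CS", "grade": 1.0},
    {"age_group": "30-39", "program": "CS", "grade": 3.0},
    {"age_group": "30-39", "program": "Law", "grade": 2.0},
]
```

With `["age_group", "program"]` as quasi-identifiers, the toy records are only 1-anonymous because one combination occurs once. `violating_groups` names the combinations that would need to be generalized or suppressed before sharing.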
Simulated data
- generate data with a distribution similar to the original dataset
- = truly anonymous
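How simulated data is generated depends on the dataset. As a deliberately simple sketch, the functions below resample each column on its own: numeric columns from a normal distribution fitted to the original values, categorical columns according to the observed frequencies. Both modelling choices are assumptions. Correlations between columns are lost, and a real project would likely use a dedicated synthesis method.

```python
import random
import statistics
from collections import Counter


def simulate_numeric(values, n, rng):
    """Draw n values from a normal distribution fitted to `values`."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # needs at least two original values
    return [rng.gauss(mean, sd) for _ in range(n)]


def simulate_categorical(values, n, rng):
    """Draw n categories with probabilities equal to their observed shares."""
    counts = Counter(values)
    categories = list(counts)
    weights = [counts[category] for category in categories]
    return rng.choices(categories, weights=weights, k=n)
```

Passing a seeded generator such as `random.Random(42)` as `rng` makes the simulated dataset reproducible, which helps when it is published alongside analysis code.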
Data Protection for Data Use
Replacing first and last names
# requires the third-party package names-dataset
from names_dataset import NameDataset
print(NameDataset().search('Frodo'))
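Once names have been identified, they can be replaced by stable pseudonyms so that the same person keeps the same label throughout a dataset. The sketch below assumes the list of names is already known (for example from a course roster). The keyed-hash approach, the function names and the `P-` prefix are illustrative choices, not the method used in the talk.

```python
import hashlib
import hmac
import re


def pseudonym(name, secret_key, prefix="P"):
    """Derive a stable pseudonym from a name and a secret key (bytes)."""
    normalized = name.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    return f"{prefix}-{digest[:8]}"  # shortened for readability


def replace_names(text, names, secret_key):
    """Replace every whole-word occurrence of the given names in one pass."""
    if not names:
        return text
    # Longest names first so that full names win over their parts.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in ordered) + r")\b")
    return pattern.sub(lambda m: pseudonym(m.group(0), secret_key), text)
```

The secret key has to be stored separately from the data. Anyone holding it can re-derive the mapping, so the result is pseudonymized rather than anonymous, and the shortened digest may collide in very large cohorts.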
Video image distortion
(see manual of VLC)
Disguise voice
- Audacity: Pitch -20% > Distortion
Pixelate faces
- Gimp: select > filters/blur > pixelate
METADATA AND PRESENTATION
Metadata, presentation, use
DataCite metadata standard
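The DataCite schema defines six mandatory properties: identifier, creator, title, publisher, publication year and resource type. A minimal sketch of a completeness check follows. The key names are modelled on DataCite's JSON representation and should be verified against the official schema. The record uses the DOI, author and year given in this deck, while the publisher is inferred from the DOI prefix and the resource type value is an assumption.

```python
REQUIRED_FIELDS = (
    "doi",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "types",
)

# Example record for this talk; values marked below are assumptions.
record = {
    "doi": "10.5281/zenodo.17668427",
    "creators": [{"name": "Seidel, Niels"}],
    "titles": [
        {"title": "Das Marie Kondō Prinzip für Forschungsdaten im Bildungswesen"}
    ],
    "publisher": "Zenodo",  # inferred from the DOI prefix
    "publicationYear": 2025,
    "types": {"resourceTypeGeneral": "Text"},  # assumption, not confirmed
}


def missing_fields(rec):
    """Return the mandatory fields that are absent or empty in `rec`."""
    return [field for field in REQUIRED_FIELDS if not rec.get(field)]
```

Running `missing_fields` before uploading a dataset catches incomplete descriptions early. It does not validate the values themselves, for example whether the DOI resolves.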
Software
- e.g. mod_longpage readme.md incl. SMP
Data and data analysis
- notebooks: Jupyter notebooks, R Markdown
- web app: R Shiny Apps (gallery)
- sandboxes: learnr (R)
FAIR principles for AI Models
- open datasets incl. notebooks to explore the data
- baseline model + trained model for multiple platforms incl. notebooks
- HDF5 or ROOT file format
PUBLISHING RESEARCH DATA
Publishing Research Data
Service                 Data types                       DOI  File size limit  API
Zenodo                  files, tables, software, models  yes  50 GB            yes
CMU DataShop            *                                no   -                yes
Open Science Framework  *                                yes  -                no
GitHub                  free source code                 no   2 GB             yes
re3data
- registry of research data repositories
- should have a declared plan to enable permanent accessibility
Data Journals
- List of general data journals, e.g. for software: Software Impacts, J. Open Source Software, JORS
anonymous.4open.science
Summary & Outlook
Summary
- Unique RDM challenges in educational research
- Efforts with DMP, Record of Processing Activities, informed consent, and metadata descriptions are feasible
- Clean research data is a prerequisite for collaboration
- Data sharing is possible despite privacy obligations
- RDM works best as a team
Outlook
- NFDI via Verbund Forschungsdaten Bildung
- COSCINE.nrw
- RDM policy
Q&A
Thank you!
doi: 10.5281/zenodo.17668427
Dr. Niels Seidel
Notes and references
Wilkinson, M. D. et al. The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3, 160018, https://doi.org/10.1038/sdata.2016.18 (2016).
Wilkinson, M. D. et al. A design framework and exemplar metrics for FAIRness. Scientific Data 5, 180118, https://doi.org/10.1038/sdata.2018.118 (2018).
Tso, et al., "The R Journal: Advancing Reproducible Research by Publishing R Markdown Notebooks as Interactive Sandboxes Using the learnr Package", The R Journal, 2022
NFDI: https://www.nfdi.de/konsortien/
Khalil, M., & Prinsloo, P. (2025). The lack of generalisability in learning analytics research: why, how does it matter, and where to? Proceedings of the 15th International Learning Analytics and Knowledge Conference, 170–180. https://doi.org/10.1145/3706468.3706489
Liu, Q., & Khalil, M. (2023). Understanding privacy and data protection issues in learning analytics using a systematic review. British Journal of Educational Technology, 54(6), 1715–1747. https://doi.org/10.1111/bjet.13388
Wahlster, W. (2017, June). Künstliche Intelligenz als Treiber der zweiten Digitalisierungswelle. IM+io, 4.