Outlook
• main limitation: all three tools have a steep learning curve, and familiarity with the command line is useful
◦ thus, we assume the main audience to be data scientists or tech-savvy researchers
• we are currently implementing practical demonstrations - follow our work for updates in this regard
• I am very interested in talking about open science in clinical research as well as implementing reproducible research projects to reduce inequality in healthcare - find me at the conference or contact me regarding these topics:
datalad
• what is it?
◦ organizes files as datasets, based on git and git-annex
• solves:
◦ version control for all your code/data/etc.
◦ recording provenance of files, including which code was used to generate them
◦ use git branches or tags as entry points to reproduce experiments
• alternatives:
◦ git (problem: no tracking of arbitrarily large files, no direct link from code to data)
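To make "recording provenance of files, including which code was used to generate them" concrete, here is a minimal Python sketch of the idea. It is not how datalad works internally: the helper names `sha256` and `run_and_record` and the file `result.txt` are hypothetical, chosen only for illustration. As far as we know, datalad offers this kind of record through its `datalad run` command and stores it in the git history.

```python
import hashlib
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def sha256(path: Path) -> str:
    """Return the SHA-256 checksum of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_and_record(cmd: list[str], outputs: list[str], cwd: Path) -> dict:
    """Run a command, then link it to checksums of the files it produced."""
    subprocess.run(cmd, cwd=cwd, check=True)
    return {
        "cmd": cmd,
        "outputs": {name: sha256(cwd / name) for name in outputs},
    }


# Hypothetical analysis step: a one-line script that writes result.txt.
with tempfile.TemporaryDirectory() as tmp:
    record = run_and_record(
        [sys.executable, "-c",
         "from pathlib import Path; Path('result.txt').write_text('42')"],
        outputs=["result.txt"],
        cwd=Path(tmp),
    )

print(json.dumps(record, indent=2))
```

Storing such a record next to the data is what allows a later reader to ask "which command produced this file?" and to re-run it.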
Box 2: literate programming
instead of code with comments, write prose interjected with code (focus on humans)
related concepts:
• tangling - automatic extraction of source code from the literate programming document into source files
• transclusion - opposite approach, include viewable references to (parts of) source files in the literate programming document
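Tangling can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function `tangle` and the sample text `LABBOOK` are hypothetical, and every source block is collected by language. Org-mode ships its own command for this (`org-babel-tangle`), which is controlled by `:tangle` header arguments and behaves differently in detail.

```python
import re
from collections import defaultdict

# Matches "#+begin_src <lang> [header args]" up to the next "#+end_src".
SRC_BLOCK = re.compile(
    r"^[ \t]*#\+begin_src[ \t]+(?P<lang>\S+)[^\n]*\n"
    r"(?P<body>.*?)"
    r"^[ \t]*#\+end_src[ \t]*$",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def tangle(org_text: str) -> dict[str, str]:
    """Collect the bodies of all source blocks, grouped by language."""
    sources = defaultdict(list)
    for match in SRC_BLOCK.finditer(org_text):
        sources[match.group("lang")].append(match.group("body"))
    return {lang: "".join(bodies) for lang, bodies in sources.items()}


LABBOOK = """\
Some plain text description.
#+begin_src sh
datalad save -m "save initial status"
#+end_src
#+begin_src R :session 1
library("gtsummary")
#+end_src
#+begin_src sh
datalad save -m "Add table and image"
#+end_src
"""

for lang, code in tangle(LABBOOK).items():
    print(f"--- {lang} ---")
    print(code, end="")
```

The prose stays in the document, and the code that a machine needs ends up in one string per language, ready to be written to source files.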
Emacs org-mode
• what is it?
◦ plain text-based tool for outlining, note-taking, spreadsheets, project planning, ...
◦ polyglot (see Box 1) and literate (see Box 2) programming via a notebook-like environment for >70 languages
◦ one-click publishing as LaTeX, ODT, a modern website, ...
• solves:
◦ human-readable "glue" for different reproducibility components
◦ integrate typical human-readable texts like research hypotheses, ethics declarations, etc. into reproducible pipelines
• alternatives:
◦ not plain-text, limited languages: Polyglot Notebooks in VS Code
Limitations
GitHub, Mastodon, Bluesky
The power of open polyglot plain text tooling for reproducible AI research
Alva Seltmann (1, 2), Christian Eggeling (1, 2)
(1) Institute of Applied Optics and Biophysics, Friedrich Schiller University Jena, Jena, Germany
(2) Leibniz Institute of Photonic Technology, Jena, Germany
Introduction
flake.nix
{
  description = "A very basic flake";
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs?ref=nixos-25.05";
  };
  outputs =
    { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages."x86_64-linux";
    in
    {
      devShells."x86_64-linux".default = pkgs.mkShell {
        packages =
          with pkgs; [
            ((emacsPackagesFor emacs).emacsWithPackages(epkgs: [
              epkgs.emacs-jupyter
            ]))
            datalad
            python313
            R
          ]
          ++ (with pkgs.python313Packages; [
            seaborn
            jupyterlab
          ])
          ++ (with pkgs.rPackages; [
            gtsummary
          ]);
        shellHook = ''
          echo "welcome to the shell!"
        '';
        MY_ENVVAR = "custom_envvar";
      };
    };
}
labbook.org
Some plain text description, with typical markup features like
*bold* or /italic/
#+begin_src sh
datalad create --force
datalad save -m "save initial status"
#+end_src
#+RESULTS:
add(ok): labbook.org (file)
add(ok): flake.nix (file)
save(ok): (dataset)
action summary:
  add (ok: 2)
  save (ok: 1)
#+begin_src R :session 1
library("gtsummary")
trial |>
  tbl_summary(include = c(age, response), by = trt) |>
  as_tibble()
#+end_src
#+RESULTS:
| Age            | 46 (37, 60) | 48 (39, 56) |
| Unknown        | 7           | 4           |
| Tumor Response | 28 (29%)    | 33 (34%)    |
| Unknown        | 3           | 4           |
#+begin_src jupyter-python :session /jpy:localhost#8888:py1
import seaborn as sns
df = sns.load_dataset("penguins")
sns.swarmplot(data=df, x="body_mass_g", y="sex", hue="species")
#+end_src
#+RESULTS:
[[file:./.ob-jupyter/98fd83785222d93b70dc4691db24c0d50af5a992.png]]
#+begin_src sh
datalad save -m "Add table and image"
#+end_src
#+RESULTS:
etc...
here, we connect to R directly, but for Python we connect to a jupyter session (via header arguments)
record datalad commands and output together with R and Python scripts
output plot is saved as a file, inline image is displayed by org-mode
plain text is interpreted by Emacs org-mode, making the document interactive
shell script to execute after environment creation; set custom environment variables
nix package manager
• goal: methods reproducibility = ability to record and implement all experimental and computational procedures with the same data and tools, obtaining the same result
• several sub-problems have to be solved: setting up a reproducible environment, tracking and versioning generated code and data, ...
◦ currently, no one best-practice tool exists
◦ the problem gets harder for polyglot programming (see Box 1)
• we propose the combination of three existing tools, each solving specific sub-problems of methods reproducibility
◦ all tools are popular in certain communities - here we highlight their synergies
• what is it?
◦ declarative environment (see example on the right)
◦ here: using the flakes feature → creates a lock file of all dependencies of the project, making it reproducible
• solves:
◦ no more "it works on my machine"
◦ easy and deterministic setup of more complicated tools (looking at you, Emacs!)
◦ no more dependency hell (looking at you, Python!)
◦ multiple package versions for different projects are possible
◦ sharing projects - very easy to run just from e.g. a GitLab repository link
• alternatives:
◦ only Python: uv, poetry, venv
◦ only Python and R: conda
• existing wrappers for ease of use:
◦ rix https://github.com/ropensci/rix
declare system packages, Emacs packages, Python packages, R packages, etc.
declare system - works for all UNIX systems and using WSL2 on Windows
declare package source (nixpkgs has >120,000 software packages - custom sources possible)
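The lock file created by the flakes feature is plain JSON, so it can be inspected with a few lines of Python. The sketch below assumes the usual `flake.lock` layout (a `nodes` object whose entries carry a `locked` section with a `rev`), as we understand it. `SAMPLE_LOCK` is a hand-written stand-in: its `rev` and `narHash` values are made-up placeholders, not real pins, and `pinned_inputs` is a hypothetical helper name.

```python
import json

# Hand-written stand-in for a flake.lock; rev and narHash are placeholders.
SAMPLE_LOCK = json.loads("""
{
  "nodes": {
    "nixpkgs": {
      "locked": {
        "type": "github",
        "owner": "nixos",
        "repo": "nixpkgs",
        "rev": "0000000000000000000000000000000000000000",
        "narHash": "sha256-PLACEHOLDER"
      },
      "original": {
        "type": "github",
        "owner": "nixos",
        "repo": "nixpkgs",
        "ref": "nixos-25.05"
      }
    },
    "root": {"inputs": {"nixpkgs": "nixpkgs"}}
  },
  "root": "root",
  "version": 7
}
""")


def pinned_inputs(lock: dict) -> dict[str, str]:
    """Map each locked input to the exact revision it is pinned to."""
    return {
        name: node["locked"]["rev"]
        for name, node in lock["nodes"].items()
        if "rev" in node.get("locked", {})
    }


print(pinned_inputs(SAMPLE_LOCK))
```

The `original` section keeps the moving reference written in flake.nix (here the `nixos-25.05` branch), while `locked` records the one revision that was actually used. That pin is what makes the environment repeatable on another machine.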
Box 1: polyglot programming
use multiple programming languages in one software / data science project to leverage their strengths for different tasks
example scenario: train a classifier for two groups → use R for creating reproducible summary tables of the group variables (e.g. via gtsummary) → use Python for reproducible model training (e.g. via tensorflow.keras and mlflow)
Synergies
easy setup of Emacs org-mode and datalad when reproducing projects or experiments
reproducible polyglot environment (hard to achieve otherwise!)
Synergies
plain text files (and code) lend themselves to git-based versioning - a struggle for e.g. jupyter notebooks - while being feature-rich when viewed inside Emacs
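Why line-based versioning suits plain text better than notebooks can be illustrated with a small Python sketch. It compares the same one-line code change in a plain text file and in a simplified, notebook-like JSON layout that stores the execution counter and the rendered output next to the code. The helper names and the placeholder image strings are hypothetical, and the JSON is a reduced stand-in rather than the full notebook format.

```python
import difflib
import json


def changed_lines(old: str, new: str) -> int:
    """Count lines that a line-based diff reports as added or removed."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


def notebook_json(source: str, execution_count: int, image: str) -> str:
    """Serialize one code cell in a simplified notebook-like JSON layout."""
    cell = {
        "cell_type": "code",
        "execution_count": execution_count,
        "source": source.splitlines(keepends=True),
        "outputs": [
            {"output_type": "display_data", "data": {"image/png": image}}
        ],
    }
    return json.dumps({"cells": [cell]}, indent=1)


OLD = 'import seaborn as sns\nsns.swarmplot(data=df, x="body_mass_g", y="sex")\n'
NEW = (
    'import seaborn as sns\n'
    'sns.swarmplot(data=df, x="body_mass_g", y="sex", hue="species")\n'
)

plain_changes = changed_lines(OLD, NEW)
notebook_changes = changed_lines(
    notebook_json(OLD, 1, "PLACEHOLDER-IMAGE-A"),
    notebook_json(NEW, 2, "PLACEHOLDER-IMAGE-B"),
)
print("plain text:", plain_changes, "notebook-like JSON:", notebook_changes)
```

In the plain text file only the edited line shows up in the diff. In the notebook-like file the execution counter and the embedded output change as well, which is the noise that makes notebook histories hard to review.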
Find me: LinkedIn