
Agentic Coding for STEM Research

Author: Cardoen, Ben
Publisher: Zenodo
DOI: 10.5281/zenodo.17708614
Source: https://zenodo.org/records/17708614/files/slides.pdf
Agentic coding for STEM research
Ben Cardoen
School of Mathematics, University of Birmingham
2025-11-20
Slides licensed under CC BY 4.0 – Ben Cardoen, 2025
Introduction
Wisdom versus Knowledge
Success is Learning from Failure
The Right Way(s) (tm)
Where to go from here
Acknowledgements
References
Introduction
Disclaimer:
This is not a pro or against AI presentation.
I am documenting my findings on how I make it work for me, in my use cases.
That may or may not work for you.
I am not claiming the workflow tips capture the latest research, purely my experience.
There can be errors, so use wisely.
Setting the Scene
My Research: I focus on designing novel scalable algorithms that capture what cannot be solved or quantified, revealing mechanism.
Background
PhD/MSc/BSc in Comp Sci.
C++, MIPS, Julia, Python, Prolog, Java, Haskell, …
Parallel computing, high performance computing, biomedical imaging, causal mechanisms, signal processing on graphs
Use AI tools in everything except email and messages, if the use case warrants it.
Final writeup is still my own: persuasion, intent, accuracy, style and semantics matter.
What we’ll cover
Patterns, not instructions
I can’t show the really powerful examples (publication), the trivial examples don’t have learning value
We will explore how to unlock potential
This slide deck is part of my AI-generated workflow, so it’s the perfect example (the template, not the content).
Coding ~ Papers, slides, proofs, graphs, geometry, …, not just ‘code’.
Productivity gains
These are my experiences, I don’t claim these transfer.
~2022-2023 (2-3%): Copilot for undocumented code
Fall 2023 (3.5%): Interdisciplinary science translation, updating student assignments/quizzes (versioning questions)
Spring 2024 (4%): LaTeX + semantic rephrasing (Writefull) speeds up writing/review.
Summer 2024 (5%): Python generation & plotting mostly solved for solved problems
Winter 2024 (6%): Related work searches reliable enough to verify with sampling, not redoing it from scratch.
Spring 2025 (10%): Parsing funder requirements, journal scope/selection, Google search replaced by Perplexity for 99%. Brainstorming/reasoning
Summer 2025 (15%): Linear Algebra/Complexity reasoning/proofs reliable enough for sketches. Julia generation works 99.9% of the time (even with 2-week old API)
Fall 2025 (20%): Claude system administration (supervised) + Julia CA. Computational geometry becomes possible for non-research level tasks
Perceived productivity != Actual productivity. Measure, log, quantify, reflect.
Wisdom versus Knowledge
Brooks distinguishes essential from accidental complexity (Brooks 1987).
Essential:
What is the best time and space complexity for recomputing the first k eigenvalues of a sparse adjacency matrix after modifying 1 edge?
Will pre-order or post-order converge faster in a random walk of trees?
Is my new distance measure a metric?
How can I detect extreme value causality?
What mechanism explains misfolded protein accumulation in neurons?
Is ageing related to complexity?
How do I secure funding?
Accidental:
LaTeX fails to compile because of 1 missing } at page 225 of your thesis, and doesn’t tell you what page.
Dependency nightmares, cluster/Cloud failures, untested code
Figures (and figure layout), making slides
Programming Language limitations, Interdisciplinary jargon
Variable aliasing, use after free, data races, broken APIs, vendor lock-in, …
Wisdom is what you need to solve future essential problems. Knowledge is what, but not how, you learned from past problems. Subtle difference.

What is productivity?
Adaptive Decision-Making Through Information-Theoretic Learning
The cost model:
Every AI attempt costs $c = c_g + c_v$ (generation + verification).
Success probability at attempt $n$: $p_n = p_\infty - (p_\infty - p_1)\, r^{\,n-1}$, where $q_n = 1 - p_n$ (the base of the exponent did not survive extraction; $r$ with $0 < r < 1$ is assumed here).
$p_1$ reflects your upfront investment: query design, model selection, context.
High $p_1$ requires excellent foundations.
Expected cost:
$$E[C(k)] = \sum_{n=1}^{k} c\, n\, p_n \prod_{i=1}^{n-1} q_i \;+\; \left(c k + C_{\mathrm{man}}\right) \prod_{i=1}^{k} q_i$$
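The expected cost on this slide, E[C(k)] = sum over n = 1..k of c * n * p_n * prod(q_1..q_{n-1}), plus (c * k + C_man) * prod(q_1..q_k), can be evaluated numerically to compare attempt budgets k. The following is a minimal Python sketch under stated assumptions: the function names, the geometric schedule with decay factor `r`, the reading of `c_man` as a manual fallback cost, and every numeric value are illustrative and not taken from the slides.

```python
from math import prod


def success_probs(p1, p_inf, r, k):
    """Assumed geometric schedule: p_n = p_inf - (p_inf - p1) * r**(n - 1)."""
    return [p_inf - (p_inf - p1) * r ** (n - 1) for n in range(1, k + 1)]


def expected_cost(p, c, c_man):
    """Expected cost E[C(k)] for a budget of k = len(p) attempts.

    p[n-1] is the success probability of attempt n, c the cost of one
    attempt (generation + verification), and c_man the cost paid when
    all k attempts fail (interpreted here as doing the task manually).
    """
    q = [1.0 - pn for pn in p]
    k = len(p)
    # Success at attempt n: n attempts were paid, the first n-1 all failed.
    success_term = sum(
        c * n * p[n - 1] * prod(q[: n - 1]) for n in range(1, k + 1)
    )
    # All k attempts failed: pay for them, then fall back to manual work.
    failure_term = (c * k + c_man) * prod(q)
    return success_term + failure_term


if __name__ == "__main__":
    # Hypothetical numbers: compare budgets k = 1..5.
    for k in range(1, 6):
        p = success_probs(p1=0.5, p_inf=0.8, r=0.5, k=k)
        print(k, round(expected_cost(p, c=1.0, c_man=10.0), 3))
```

Scanning k this way shows where an extra attempt stops paying for itself under the chosen numbers; with real logs, the measured per-attempt success rates can replace the assumed schedule.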
Sequential adaptation: The naive approach
After task $j$: stochastic update based on task outcome:
✓ Success → increase trust → $k_{j+1} = K_{\max}$
✗ Failure → decrease trust → $k_{j+1} = \max(1, k_j - 1)$
$$k_{j+1} = \begin{cases} K_{\max}, & \text{if task } j \text{ succeeds (prob. } S(k_j) = 1 - \prod_{i=1}^{k_j} q_i\text{)} \\ \max(1, k_j - 1), & \text{if task } j \text{ fails (prob. } F(k_j) = \prod_{i=1}^{k_j} q_i\text{)} \end{cases}$$
But is this the right way to learn?
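The naive rule on this slide (a success resets the attempt budget to K_max, a failure shrinks it by one down to a floor of 1) is easy to simulate. The sketch below assumes hypothetical function names and a user-supplied list `q` of per-attempt failure probabilities; it illustrates the update rule only and is not the author's code.

```python
import random
from math import prod


def naive_update(k, succeeded, k_max):
    """Success resets the budget to k_max; failure shrinks it by one (floor 1)."""
    return k_max if succeeded else max(1, k - 1)


def simulate(q, k_max, n_tasks, seed=0):
    """Apply the naive rule over n_tasks tasks and return the budget history.

    q[i] is the assumed failure probability of attempt i + 1, so a task run
    with budget k fails with probability prod(q[:k]).
    """
    rng = random.Random(seed)
    k = k_max
    history = []
    for _ in range(n_tasks):
        history.append(k)
        fail_prob = prod(q[:k])
        succeeded = rng.random() >= fail_prob
        k = naive_update(k, succeeded, k_max)
    return history


if __name__ == "__main__":
    # Hypothetical failure probabilities for attempts 1..4.
    print(simulate(q=[0.6, 0.5, 0.4, 0.3], k_max=4, n_tasks=20))
```

Note the feedback loop in this rule: a smaller budget makes the next failure more likely, which shrinks the budget further.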
The information-theoretic view: Failure is the strongest signal
Let $H$ = entropy of your mental model (uncertainty about what works)
Information gained from outcomes:
Success: $I_{\text{success}} \approx 0$
Confirms existing model, low information
Risk: model overfits to your query style
Failure: $I_{\text{failure}} \gg I_{\text{success}}$
Reveals boundaries, constraints, failure modes
Maximum entropy reduction: $\Delta H \propto -\log p_{\text{failure}}$
Controlled failure = deliberate exploration:
Strategy improvement $\propto \max_{\text{tasks}} I(\text{outcome}; \text{model capabilities})$
Push models to failure → align mental models; failure near boundary → strongest update signal
Reframe adaptation: Failure → high information → improve strategy (not abandon)
In reality the previous expectation model is intractable: it is a sequential formulation of the Byzantine generals problem.
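The claim that a failure carries more information than an expected success follows from the surprisal -log p of an outcome. The sketch below is a deliberately simplified illustration: it treats your mental model as a single predicted success probability per task, which is an assumption of this example and not something the slides state. Function names are hypothetical.

```python
from math import log2


def surprisal_bits(p):
    """Information content -log2(p), in bits, of an outcome with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -log2(p)


def outcome_entropy_bits(p_success):
    """Entropy of the success/failure outcome: expected information per task."""
    outcomes = (p_success, 1.0 - p_success)
    return sum(p * surprisal_bits(p) for p in outcomes if p > 0.0)


if __name__ == "__main__":
    p_success = 0.95  # hypothetical: you are fairly sure the task will work
    print("success:", surprisal_bits(p_success), "bits")
    print("failure:", surprisal_bits(1.0 - p_success), "bits")
    print("expected:", outcome_entropy_bits(p_success), "bits")
```

Under this simplification the expected information per task peaks when success and failure are equally likely, which matches the slide's advice to probe tasks near the capability boundary.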
The Mental Model: Intent-driven autonomy.
Wisdom needs a mental model.
Black box model approach is not efficient
But you don’t need to know
Transformer architecture
How KL divergence works
2-way communication of ‘intent’
Understand how models ‘see the world’
Effective interaction boils down to entropy minimization by both
Two-way calibration: You build a mental model of AI capabilities, but the AI also models your standards within conversations. Consistent expectations create stable co-adaptation.
Ask yourself
Where do models ‘fail’?
What, really, is a ‘hallucination’?
When do I trigger it?
Which documents stay in memory? How long?
GPT-5 will automatically reload your source files if you use the git connector?
If you never cared about accuracy in code snippets, do you expect that to affect next day’s queries? Across chats? Across models?
Adding TOC helps, yes/no. What about 2-column format?
If you ask for algorithm X, do you get the optimal, the easy, or the outdated version, and why?
How do models encode very large sets of information? Does it matter?
Plan your exits
Complex research level task: Simulate high fidelity HIV1 capsid coat with nm accuracy
Coding agent (Claude) and Research agent (GPT-5) agreed on plan + logs
Both warned that task 4 would have 60% chance of success, as it was identified as research class.
Completed 3/5 tasks, then stuck
Hit subtle issue in combination of de-novo implementation of two algorithms from papers
4-5 debugging attempts at high level (using logs)
Consensus to abort, confidence to resolve ~20%
Time exploring dead ends is also time saved
You want to have the ability to answer fast: Can X be done?
Right hand side shows the exact minimum reproducer, showing mesh corruption (edge case selected)

Lost in Translation
Translation can work, but you need to guide the model and understand where it can go wrong. Not low-level, high level.

Python (C/row-major family), source:
    result = 0.0
    for i in range(rows):
        for j in range(cols):
            result += matrix[i][j] * 2.0

❌ Direct translation to column-major family (here Julia), order of magnitude performance regression:
    result = 0.0
    for i in 1:rows
        for j in 1:cols
            result += matrix[i, j] * 2.0
        end
    end

✅ Cache-optimized (note: GPT-5/Claude get this right the first time):
    result = 0.0
    for j in 1:cols
        for i in 1:rows
            result += matrix[i, j] * 2.0
        end
    end

Mathematical notation of a simple, stable function:
$$f(p, q) = \frac{p}{1 + (p - q)}, \quad p, q \in \mathbb{R},\ (p - q) \neq -1$$

Python implementation of the formula can be numerically very unstable:
    def f(p, q):  # AI generated, if p ~ q this can have catastrophic cancellation
        return p / (1 + (p - q))

    def f_stable(p, q):  # AI generated + told to watch out for stability
        denominator = (1.0 - q) + p
        return p / denominator

Related translation pitfalls to guide models on:
Indexing conventions (0-based vs 1-based)
Integer division semantics (floor vs truncate), float/Int
Pass-by-value vs pass-by-reference (mutation), threading model.
Short-circuit evaluation order
Type promotion and coercion rules
Remember the tests you needed?
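Several of the pitfalls listed above can be pinned down as small executable checks before asking a model to translate code. The sketch below does this from the Python side; the helper names `trunc_div`, `to_one_based` and `scale_in_place` are hypothetical and exist only for this illustration.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division rounding toward zero, as in C, Java and Julia's div()."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def to_one_based(i: int) -> int:
    """Convert a 0-based (Python) index to a 1-based (Julia) index."""
    return i + 1


def scale_in_place(xs, factor):
    """Mutates the caller's list: Python passes a reference to the same object."""
    for i in range(len(xs)):
        xs[i] *= factor


# Floor and truncation agree for positive operands ...
assert 7 // 2 == 3 and trunc_div(7, 2) == 3
# ... and differ for negative quotients: Python floors, C-style truncates.
assert -7 // 2 == -4
assert trunc_div(-7, 2) == -3

# "/" is true division in Python 3, even for two ints (type promotion).
assert 7 / 2 == 3.5

# Indexing: the first element is index 0 in Python and 1 in Julia.
values = [10, 20, 30]
assert values[0] == 10 and to_one_based(0) == 1

# Mutation is visible to the caller.
scale_in_place(values, 2)
assert values == [20, 40, 60]
```

Handing checks like these to the model together with the source gives it a concrete target for the semantics that a line-by-line translation tends to lose.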
You get what you push for
At heart, still optimization solvers.
Lack native algorithmic backtracking (no explicit search trees), but can backtrack when scaffolded
Give explicit capability: “if stuck, ask me/revert to step 2.3” or use agentic retry loops
Modern models (o1/o3) + chain-of-thought enable backtracking-like behavior
Be VERY careful with costly resources (files, network), models have time/compute budgets.
If it has to choose between slow network (but latest file) and memory (out of date), it may have to pick the out of date ‘hallucination’.
Make PDFs that LLMs can parse properly, or use markdown. If you make it harder to extract information, the model can’t extract enough information in time, leading to poor semantic reasoning on incomplete information. You then see that as hallucinations in the extreme case.
Making PDFs parseable:
    % --- Parse-friendly add-ons (toggle on for analysis builds) ---
    \usepackage[tagged=true,activate=true,interwordspace=true]{tagpdf} % tagged PDF structure
    \usepackage{accsupp} % \BeginAccSupp... ActualText ... for key equations
    \usepackage[numbered]{bookmark} % robust outlines (pairs with hyperref)
    \usepackage[none]{hyphenat} % avoid hyphenation (improves text extraction)
    \usepackage[final]{microtype} % then optionally: \DisableLigatures[f]{encoding=*}
    \usepackage{axessibility} % auto /ActualText for math (disable if it clashes)
    % Optional microtype tweak:
    % \DisableLigatures[f]{encoding = *}
Add TOC.
No columns, single column.
No multifile LaTeX: if you share LaTeX, flatten it.
When in doubt, both LaTeX and PDF, or QMD.
Get that first query (where you share the file) just right.
Do Not: “Summarize this”
Do: Contrast Section 2.3.1 in terms of consistency with conclusion, then create a structured summary of the narrative flow between methods and results.
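The advice to flatten multi-file LaTeX before sharing it can be automated with a few lines of Python. This is a minimal sketch, not a complete tool: the function name `flatten_latex` is hypothetical, it assumes include paths are relative to the directory of the top-level file, and it does not skip commented-out `\input` lines or handle `\includeonly`.

```python
import re
from pathlib import Path

# Matches \input{name} and \include{name}; \includegraphics is not matched
# because the brace must follow the command name directly.
INPUT_RE = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def flatten_latex(path, root=None, _stack=()):
    """Return the text of `path` with input/include commands inlined recursively.

    Assumption: included paths are resolved relative to `root`, which
    defaults to the directory containing the top-level file.
    """
    path = Path(path)
    root = path.parent if root is None else Path(root)
    resolved = path.resolve()
    if resolved in _stack:
        raise ValueError(f"circular include: {path}")

    def inline(match):
        name = match.group(1).strip()
        if not name.endswith(".tex"):
            name += ".tex"
        return flatten_latex(root / name, root, _stack + (resolved,))

    return INPUT_RE.sub(inline, path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python flatten.py main.tex > flat.tex
    if len(sys.argv) == 2:
        sys.stdout.write(flatten_latex(sys.argv[1]))
```

The resulting single file can be shared alongside the PDF, so the model does not have to resolve includes within its time budget.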
The never get to it problems
We all have problems that are annoying, can be solved, but will take an unknown amount of time, and are not time critical.
Example
Firefox freezes at random moments ~2-10s.
No evidence in logs
System health fine
No changes introduced
Manual: set timed logs, use iotop monitoring, then search for listed bugs in kernel, filesystem, firefox, … ~5+ hrs.
Claude:
I give a 2 sentence description, ask it to give me 4 questions it needs answered to start
10 minutes processing, automated search of iotop + syscall frequency (very very low level logs),
finds seemingly unrelated kernel bug, checks nvme firmware/fs configuration, argues why this is the cause, proposes 2-3 solutions.
Does it always work? No, but I have 1 min to start it, it can’t damage my system (it never needed sudo), and I have 1 minute to confirm, another to fix.
The Right Way(s) (tm)
Plan so execution is almost not needed anymore
Coding agent (CA) ⇆ Research Agent (RA) ⇆ You
You translate intent + semantics rules + feasibility criteria
RA –> modular tasks + verifiable objectives for CA
CA asks clarification, three of you need to agree
If it’s not worth planning, why is it worth doing?
Right hand side: Task 3/5 of a proof of concept protein coat of HIV1, workplan for Claude Code CA made by GPT-5 based on our conversations.

Leave a trail
Export conversations to Github
Keep Agent logs in Github
Restart from logs (or switch agents between modes)
Multiple agents can coordinate using logs
Research agents can’t debug live, but they can using logs.
Research agents can revisit the conversations (is this really the best algorithm?)
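One lightweight way to keep a trail that agents can restart from is an append-only log file committed next to the code. The sketch below uses JSON Lines; the file layout, field names and function names are assumptions made for this example, not a format the slides prescribe.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_log(log_path, agent, task, status, note=""):
    """Append one entry to a JSON Lines log and return it."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "task": task,
        "status": status,
        "note": note,
    }
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry


def read_log(log_path):
    """Return all log entries in order; an absent file yields an empty list."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def next_open_task(log_path, tasks):
    """Return the first task without a 'done' entry, or None if all are done."""
    done = {e["task"] for e in read_log(log_path) if e["status"] == "done"}
    return next((t for t in tasks if t not in done), None)
```

A restarted (or different) agent can call `next_open_task` to find where to resume, and a research agent can read the `note` fields to review decisions without live access to the session.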
Know Thyself
LLMs do not (?) have reflection, you do. Use it.
Be aware of what you want to hear.
Explain that that is a problem and how to solve it.
What’s your blind spot? (Ask a research agent with memory who it thinks you are)
Describe your observations on my strengths and weaknesses (solely from memory) as a research fellow, and excluding any personal information. Use a single paragraph, do not leak project ideas.
Flip the table
Let the agent ask you questions.
Questions force you to think
Agent will ask questions it needs to reduce entropy, saving you 2-3 answers.
Dynamic beats static (or why instructions are harmful)
You cannot ask a model if it’s hallucinating (~Liar’s paradox), you can ask a model what would make it hallucinate
Hallucinations are just suboptimal solutions
Ask for annotated answers (see below), not just sources, ask how much time it needs for what it will need
Instructions can be too static, so leave a high level backtracking path: “If instructions block you from what you think is optimal, activate OVERRIDE protocol”
Model learns from you, not instructions.
When in doubt, ask. Below I’m asking Codex what its current configuration is (global/local).