Resea ch Au oma ion
wi h Agen ic LLMs
A man Khala yan1
Session mode a o : Anna Jacyszyn2
In e disciplina y Colloquium on Digi alisa ion o Resea ch, FIZ Ka ls uhe, 4 Sep embe 2025
(1) Leibniz Ins i u e o As ophysics Po sdam (AIP)
(2) FIZ Ka ls uhe - Leibniz Ins i u e o In o ma ion In as uc u e
Pho os and eco ding
Pixabay, s e_phania
2
In e disciplina y Colloquium on Digi alisa ion o Resea ch, A man Khala yan (AIP), FIZ Ka ls uhe, 4 Sep embe 2025
www.you ube.com/@DiT aRe
3
In e disciplina y Colloquium on Digi alisa ion o Resea ch, A man Khala yan (AIP), FIZ Ka ls uhe, 4 Sep embe 2025
Resea ch Au oma ion wi h Agen ic
LLMs
DiT aRe In e disciplina y Colloquium on Digi alisa ion o Resea ch, 04.09.2025
D . A man Khala yan
Resea che a eScience/Supe compu ing/IT
Leibniz-Ins i u ü As ophysik Po sdam (AIP), Ge many
1
1
Scien i ic li e ( op o down)
2
Idea Collabo a e Publish
Reading P o o yping
Rep oduce
esul s o
o he s
Sha e da a Plo s
Da a access De elop
so wa e Funding Ha dwa e In as uc u e
Scien i ic li e ( op o down)
3
Idea Collabo a e Publish
Reading P o o yping
Rep oduce
esul s o
o he s
Sha e da a Plo s
Da a access De elop
so wa e Funding Ha dwa e In as uc u e
Scien i ic li e ( op o down)
4
Idea Collabo a e Publish
Reading P o o yping
Rep oduce
esul s o
o he s
Sha e da a Plo s
Da a access De elop
so wa e Funding Ha dwa e In as uc u e
Scien i ic li e ( op o down)
5
Idea Collabo a e Publish
Reading P o o yping
Rep oduce
esul s o
o he s
Sha e da a Plo s
Da a access De elop
so wa e Funding Ha dwa e In as uc u e
Scien i ic li e ( op o down)
6
Idea Collabo a e Publish
Reading P o o yping
Rep oduce
esul s o
o he s
Sha e da a Plo s
Da a access De elop
so wa e Funding Ha dwa e In as uc u e
Wa ning
E e y hing you see in his p esen a ion abou ools o LLMs may
al eady be ou da ed by onigh !
The LLM ield is mo ing oo as !
I need you o al a en ion
"A en ion is All You Need"
New keywo ds
•AGI - A i icial Gene al In elligence compa able o human capabili ies
•ASI - A i icial Supe in elligence, mo e capable han humans
•…
•T ans o me – s a e o a NLP, wi h a en ion mechanisms
•LLM – La ge Language Model
•GPT - Gene a i e P e- ained T ans o me , is a special LLM
•Cha GPT - is a LLM model c ea ed by OpenAI
A new e m added since 17 Jan 2025
•Reasoning - i in ol es p ocessing in o ma ion, making in e ences, and
gene a ing cohe en esponses based on lea ned pa e ns.
•Chain o hough s – easoning p ocess “moni o ing”
T ans o me s wi h a en ion
(by Google esea ch eam 2017)
T ans o me s wi h a en ion mechanisms a e he ounda ional
a chi ec u e o Gene a i e P e- ained T ans o me s (GPTs).
These models le e age sel -a en ion o p ocess inpu da a,
allowing hem o weigh he impo ance o di e en wo ds in a
sequence and gene a e cohe en ex .
Each laye o LLM is a ans o me
Con ex leng h
Simple ANN
How he cha gp .com wo ks?
p omp
How many s a s in he sky?
oken ec o ep esen a ion
okenize
The numbe o s a s isi ble in he nigh sky a ies depending on condi ions such as lig h pollu ion a nd a mosphe ic cla i y. On a clea , da k nigh , wi hou
ligh pollu ion, he human eye can see be ween 2,500 and 5,000 s a s om a single loca ion.
Howe e , he Milky Way galaxy, which is he galaxy we eside in, con ains an es ima ed 100 o 400 billion s a s. Beyond he Milky Way, he ….
gene a ed ex
ein o cemen aining,
based on use da a
Tex ual e c.
aining da a
language model
neu al ne
I e a i e oken
gene a ion
p obabilis ic
choices
E e y use p omp is landing in
he company da abase
LLM ma ke
is HUGE
(2024)
The e olu iona y ee o mode n LLMs. Sou ce "Ha nessing he Powe o LLMs in P ac ice: A
Su ey on Cha GPT and Beyond" a ailable on a Xi
LLM sizing(aging e y quickly)
1-100 TRILLION?
New E a o …
o o
New pape om OpenAI eam on
Ma hema ical p oblems!
Le ’s Ve i y S ep by S ep
ained a model o achie e a new s a e-
o - he-a in ma hema ical p oblem
sol ing by ewa ding each co ec s ep
o easoning (“p ocess supe ision”)
ins ead o simply ewa ding he co ec
inal answe (“ou come supe ision”).
In addi ion o boos ing pe o mance
ela i e o ou come supe ision, p ocess
supe ision also has an impo an
alignmen bene i : i di ec ly ains he
model o p oduce a chain-o - hough
ha is endo sed by humans
h ps://openai.com/ esea ch/imp o ing-ma hema ical-
easoning-wi h-p ocess-supe ision
S awbe y is ou om OpenAI:o1
“Thinking” Time
quali y
200$/m PhD le el
esea che ??
O1 qui e hyped
One p omp based coding
AI esea che s on he way…
Then his happened: F ee DeepSeek-R1
Then his happened: F ee DeepSeek-R1
•DeepSeek-R1
•Open weigh : 671B – min >320GB VRAM, >500GB RAM
•Reasoning:
The model i s uses chain-o - hough easoning o hink abou
he p oblem. Only once i inishes hinking i s a s ou pu ing he
answe
New keywo ds
•RAG - Re ie al-Augmen ed Gene a ion, combines e ie al o
ex e nal in o ma ion wi h gene a i e models o enhance esponses
•Agen ic LLM - La ge Language Models designed o pe o m asks
au onomously
•Agen s - Sys ems o models capable o au onomous ac ion and ool
calling o achie e speci ied goals
Why we canno use i o own science?
•I hallucina es a lo o he speci ic asks (knowledge cu )
•The answe s a e agile, changes i s mind, use can nega e he
answe s easily
Ques ions a e emains:
•How us able a e he answe s?
•No clea way o p oo he esul s, we need accu acy.
•Wha is he in as uc u e behind he scenes?
•No in o ma ion on de ail p omp ?
Main Challenge: P omp s
•T ained knowledge base is limi ed!!!
I, Robo mo ie scene
No e: a p omp enginee ge s abou +150-400K$/y
sala y
Le ’s c ea e an appealing
p ess elease image.
Pool o Agen s
Planne
Valida o
Use inpu :
Show an example o REANA
pipeline o S a ho se
da ase s om 2024, CMD
da a is ex ac ed and plo ed
in e ac i ely
Resul s:
•Sou ce code
•REANA wo k low
•Da a
•Plo s
Knowledge base:
Speci ic ules
Planne
Knowledge base:
Da a que y
Execu o
Knowledge base:
Sou ce Code
Code
Knowledge base:
Valida ions
Tes e s
Knowledge base:
Pape s
Resea che
Knowledge base:
In as uc u e
Execu o
LLM Agen s
Resea ch Assis an LLM
Code
Execu o
Tes e s
Resea che
Execu o
Code
T ans e ing spec oscopic s ella labels o
217 million Gaia DR3 XP s a s wi h SHBoos ,
by Khala yan, Ande s, e al. (2024), aa51427-
24,a Xi :2407.06963
Inpu om pape
Pool o Agen s
Planne
Valida o
Use inpu :
Show an example o REANA
pipeline o S a ho se
da ase s om 2024, CMD
da a is ex ac ed and plo ed
in e ac i ely
Resul s:
•Sou ce code
•REANA wo k low
•Da a
•Plo s
Knowledge base:
Speci ic ules
Planne
Knowledge base:
Da a que y
Execu o
Knowledge base:
Sou ce Code
Code
Knowledge base:
Valida ions
Tes e s
Knowledge base:
Pape s
Resea che
Knowledge base:
In as uc u e
Execu o
LLM Agen s
Resea ch Assis an LLM
Code
Execu o
Tes e s
Resea che
Execu o
Code
En ico Tom Leonha d S auss (BS s uden No 2024- Feb 2025)
Enhancing Da a Wo k lows and Rep oducibili y wi h LLM Agen s
An expe imen : Ge a s uden , explo e LLMs
P o A.G üning
(HOST, S alsund)
Tom
Cus om Cha bo s o speci ic opic
•You a e a help ul assis an o
gi ing he daily ips on py hon.
e e y ime when we ask
some hing you should gi e a ips
and icks on py hon language o
some o lib a ies in andom
o de : pandas, ma plo lib,
numpy
•Keep examples as sho as
possible i use asking abou
o he hems jus answe :
"I am an assis an o py hon
ips“
•No gua an ee answe s will ela e
o o he languages o code is
100% co ec .
An expe imen : Ge a s uden , explo e LLMs
Knowledge
Base Supe iso T ue
False
LLM
Simple models <70B
Unable o answe ques ions on
REANA wi hou knowledgebase
h ps://gi hub.com/e ls auss/bachelo - hesis-public
An expe imen : add Agen s o LLMs
Knowledge
Base Supe iso T ue
False
LLM
An expe imen : Adding RAG and ac s check
Knowledge
Base Supe iso T ue
False
LLM
Final cha wo k low
Thank you o you a en ion
Nex Colloquium
6
In e disciplina y Colloquium on Digi alisa ion o Resea ch, A man Khala yan (AIP), FIZ Ka ls uhe, 4 Sep embe 2025
DiT aRe Symposium
■2-3 Decembe 2025
■ZKM Ka ls uhe
■Sessions on:
○Knowledge Rep esen a ion and AI
○Law and E hics in Digi alisa ion o Resea ch
○Resea ch In as uc u es
○Impac on Science and Socie y
■Call o pos e s
h ps://www.di a e.de/en/symposium-2025
7
In e disciplina y Colloquium on Digi alisa ion o Resea ch, A man Khala yan (AIP), FIZ Ka ls uhe, 4 Sep embe 2025
www.di a e.de/en
Thank you o joining!
S ay connec ed
■DiT aRe
○Websi e: www.di a e.de/en
○Email: di a e@ iz-ka ls uhe.de
○LinkedIn: www.linkedin.com/company/di a e
○Mas odon: social.ki .edu/@DiT aRe
○YouTube: www.you ube.com/@DiT aRe
○Zenodo: zenodo.o g/communi ies/di a e
■Discussion o um: www.di a e.de/en/ o um
■Newsle e : www.di a e.de/en/newsle e
8