Michael Zimmermann Group
Prediction of gut-bacterial drug biotransformation
Mahnoor Zulfiqar1, Ting-Hao Kuo1, Eleonora Mastrorilli1, Mariia Beliaeva1, Petra Scepanovic2, Michael Zimmermann1*
1Molecular Systems Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany.
2F. Hoffmann-La Roche, Roche Pharmaceutical Research and Early Development pRED, Basel, Switzerland
@zimmermannlab.bsky.social
Motivation
• Create and share a high-quality, AI-ready and FAIR-compliant dataset for gut bacterial drug biotransformation
• Develop machine learning models to predict bacterial biotransformation from reactive functional groups
• Validate the prediction model via an active learning loop and integrate gut microbial biotransformation prediction into the early drug development stage
Aim
• Prioritize feature subsets linked to drug–enzyme interactions.
• Explore graph-based models and product-derived features.
• Incorporate bacterial strain–specific insights to improve predictions.
• Address data gaps by enriching ChEMBL with microbial biotransformation data.
zedmahnoor
zmahnoor14 EMBL Heidelberg
Meyerhofstraße 1 · 69117 Heidelberg · Germany
www.embl.org/groups/zimmermann
Outlook and Future work
Highlights
Data Representation
527 amides within drugs
385 / 142
Radius=13
Amide Biotransformation Classifier
Machine Learning Workflow
GLPG1837
Drugs not biotransformed by bacteria
Drugs biotransformed by bacteria
UMAP of current dataset within DrugBank
Iproniazid, Levosulpiride
Sulfinpyrazone
~ 8 bacteria
Acecainide
~ 2 bacteria
Calpeptin
~ 39 bacteria
Example amide-containing drug structures and bacterial biotransformation
UMAP 1
UMAP 2
DrugBank
Biotransformation = 0
Biotransformation = 1
UMAP of amide-containing drugs, labelled by biotransformation label and data split
At radius 13 around the carbonyl C of the amide, the whole molecule is covered, and this radius gives the highest MCC score of 0.54 using pattern fingerprints as the model features.
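The idea of growing an environment around the carbonyl carbon until the whole molecule is covered can be illustrated with a breadth-first search over the bond graph. This is a minimal sketch, not the poster's pipeline: the function names, the hand-written bond list and the toy molecule (acetanilide) are assumptions chosen for illustration, and no cheminformatics toolkit is used.

```python
from collections import deque


def atoms_within_radius(bonds, start, radius):
    """Return {atom_index: bond_distance} for atoms at most `radius` bonds from `start`."""
    # Build an undirected adjacency list from the list of bonded atom pairs.
    neighbours = {}
    for a, b in bonds:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if distance[atom] == radius:
            continue  # do not expand beyond the requested radius
        for nxt in neighbours.get(atom, ()):
            if nxt not in distance:
                distance[nxt] = distance[atom] + 1
                queue.append(nxt)
    return distance


def coverage(bonds, start, radius):
    """Fraction of the molecule's atoms reached within `radius` bonds of `start`."""
    all_atoms = {atom for bond in bonds for atom in bond}
    return len(atoms_within_radius(bonds, start, radius)) / len(all_atoms)


# Hypothetical toy molecule: heavy atoms of acetanilide, CH3-C(=O)-NH-phenyl.
# Indices: 0 CH3, 1 carbonyl C, 2 O, 3 N, 4-9 phenyl ring carbons.
ACETANILIDE_BONDS = [(0, 1), (1, 2), (1, 3), (3, 4),
                     (4, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 4)]
CARBONYL_C = 1

if __name__ == "__main__":
    for r in (1, 3, 5, 13):
        print(r, coverage(ACETANILIDE_BONDS, CARBONYL_C, r))
```

For this small molecule the coverage already reaches 1.0 at radius 5; larger drugs need a larger radius before every atom lies inside the environment, which is consistent with a radius as large as 13 covering whole molecules.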
Example drug coverage at radius 13 around the carbonyl carbon from one direction
Carbonyl C
13th atom
(Substructure) Pattern Fingerprint based Model performance
Amide-containing drug distribution
Model results for Morgan Fingerprints + Physicochemical properties
Model results for Morgan Fingerprints
Scores: ROC AUC, Balanced accuracy, MCC
The low MCC scores from Morgan fingerprints and physicochemical properties (averages from cross-validation, shown in the bar plot above) indicate that drug biotransformation cannot be reliably predicted using the full chemical structure or the complete set of physicochemical descriptors.
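To make the reported metrics concrete, the sketch below computes MCC and balanced accuracy directly from confusion-matrix counts. The function names and the class counts (385 negatives, 142 positives) are hypothetical, chosen only to mimic an imbalanced set of 527 compounds; they are not results from the poster.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


def accuracy(tp, tn, fp, fn):
    """Plain accuracy, shown only for contrast with the two metrics above."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    # Hypothetical classifier that always predicts "not biotransformed".
    counts = dict(tp=0, tn=385, fp=0, fn=142)
    print("accuracy          ", round(accuracy(**counts), 2))
    print("balanced accuracy ", balanced_accuracy(**counts))
    print("MCC               ", mcc(**counts))
```

A majority-class predictor looks acceptable on plain accuracy but scores 0.5 balanced accuracy and 0 MCC, which is why MCC values near 0.2 point to weak predictive signal on imbalanced data.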
MCC = 0.54
ExtraTreesClassifier
Logistic Regression
Random Forest
Random Forest + SMOTE
Support Vector Machine
XGBoost
XGBoost + SMOTE
[Figure: bar plots of cross-validation ROC AUC, balanced accuracy and MCC scores for each classifier, one panel per feature set]
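Two of the compared models are paired with SMOTE, which oversamples the minority class by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that interpolation step only; `smote_sketch`, its parameters and the toy points are assumptions, and it is not the implementation used for the poster or the one in any particular library.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create `n_new` synthetic points by interpolating between minority samples.

    `minority` is a list of equal-length numeric tuples (at least two points).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        u = rng.random()                    # interpolation weight in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors for the minority class.
    toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote_sketch(toy_minority, n_new=3):
        print(point)
```

Because every synthetic point lies on a segment between two existing minority samples, oversampling should be applied to the training folds only, so that no interpolated copies of test compounds leak into evaluation.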
Extra Trees Classifier