Theoretical studies for the development of a better understanding of cofactor-structure and mechanistic properties of photoreceptors and metallo-enzymes with quantum chemical and molecular dynamical approaches [original]

Theor etical studies for the development of a b e tter
understanding of cofa ctor -structur e and mechanist ic
pr operties of photor eceptors and meta llo-enzymes w ith
quantum chemical and molecular dynamica l
appr oaches
vor gelegt von
Master of Science-Chemiker
Dennis Heinz Belger
an der Fakultät II – Mathematik und Naturwissenschaften
der T echnischen Universität Ber lin
zur Erlangung des akademischen Grades
Doktor der Naturwissenschaften
- Dr . rer . nat. -
genehmigte Dissertation
Promotionsausschuss:
V orsitzender: Prof. Dr . rer . nat. A rne Tho mas
Gutachterin: Prof. Dr . rer . nat. Maria A ndrea Mroginski
Gutachterin: Prof. Dr . rer . nat. Silke Lei mkühler
T ag der wissen schaftlichen Au ssprache: 29.1 1.2019
Berlin 2020

i

T able of Conten ts
List of Abbr eviations ...............................................................…................................................vi
Abstract ...........................................................................................…..........................................viii
Zusammenfassung .........................................................................................................................x
I. Intr oduction
1. Phytochr ome Photore ceptors ........................................................................................................2
1.1. General protein structure motives (Cph1 and Agp2).....................................................................2
1.2. Chromophore structure and binding pocket..................................................................................2
1.3. Photocy cle (comparison between prototypical and bathy phytochromes)....................................5
2. Rhodobacter capsulatus formate dehydr ogenase ( Rc FDH) .........................................................8
2.1. General protein structure motives and gl obal mode of action.......................................................8
2.2. Molybdenum-containi ng cofactor….............................................................................................9
2.3. Possible m echanistic properties at the active site........................................................................12
3. Motivation for computational investigations of large biomolecules ........................................13
3.1. Phytochromes – Improvement of availabl e methods..................................................................13
3.2. Phytochromes – Dete rmination of structural and dynamical information..................................14
3.3. Rc FDH – Structural and mechanistic questions..........................................................................16
II. Materials and Methods
4. Theor etical backgro und ...............................................................................................................21
4.1. Molecular Dynamics...................................................................................................................21
4.1.1. Molecular Mechanics – Classical force fi elds..................…….......…….............…................21
4.1.2. Molecular dynami cs algorithms (Equations of motion)......…………...........……..................23
4.1.3. Constant temperature/ constant pressure d y na mics.............…….……................….................24
4.1.4. Boundary conditions.............................................................…….…..............….....................26
4.1.5. Polarized force fields.........................................................…….....….............…….................29
4.1.6. Constrains and restrains of MD-sim ulations............................................................................29
4.1.7. Determination of structural properties by M D-observables..…….…............…......................30
4.2. Quantum chemical methods.............................................................................….......................31
4.2.1. Density Functional Theor y ( DFT)....................................................…….………...................31
4.2.2. Hohenber g and Kohn...........................................................…….….............……...................32
4.2.3. Kohn and Sham.................................................................……....…….............…..................33
4.2.4. Mulliken char ges..............................................................................…..............…..................35
4.2.5. Basis functions and ba sis set approximations for metals............……….................................36
4.2.6. Self-Consistent Char ge Density Functional T ight Binding method (SCC-DFTB)..................39
ii

4.2.7. Implicit solvation m ethods – Self-Consistent Reaction Field (SCRF) M odel....….................40
4.3. Quantum Chemical/Mole cular Mechanical (QM/MM) methods......…......................................43
4.3.1. Additive scheme (+ subtractive scheme)..........................……….….…..................................45
4.3.2. Embedding methods...............…..................................……......…….….................................46
4.3.3. Boundary Region…...................................................…….............….............….....................46
4.3.4. Applications......................................…...........................…...........….….................................47
4.4. Raman spectroscopy and com putation........................................................................................48
4.4.1. Resonance Raman spectroscopy ....…..…............…....................…….................…................50
4.4.2. Frequency calculations via normal m ode analysis.………......................................................52
4.4.3. Raman intensities...............................................…...……..........………..................................54
4.4.4. Bandwidth.................................................................……..........………..................................55
5. Syste m preparati on and pr otocols for calculations ...................................................................57
5.1. Computation protocols for Cph1.................................................................................................57
5.1.1. Model building for Cph1..........................................................................................................57
5.1.2. MD-simulation protocol for Cph1 with cl assical force field....................................................57
5.1.3. Generation of a polarized force field for Cph1 and com putation of Mulliken char ges............58
5.1.4. MD-simulation protocol for Cph1 with pol arized force field..................................................60
5.1.5. QM/MM geometry optimizations for Cph1.............................................................................60
5.1.6. Raman-spectra calculati on with NMA for Cph1......................................................................61
5.2. Computation protocols for Agp2.................................................................................................62
5.2.1. Model building for Agp2 wild type (WT)................................................................................62
5.2.2. Model building for Agp2 BV -va riants......................................................................................63
5.2.3. MD-simulation protocol for Agp2 (WT and BV -variants) with clas sical force field...............64
5.2.4. QM/MM geometry optimizations for Agp2 WT ......................................................................64
5.2.5. QM/MM geometry optimizations for Agp2 BV -variants.........................................................65
5.2.6. Analysis of optimized geometries for Agp2 (WT and BV -variants)........................................65
5.3. Computation protocols for Rc FDH.............................................................................................65
5.3.1. Model building with QM soft ware..............................……..........……...........…....................65
5.3.2. QM calculations for Rc FDH models.........................……..........……….................................67
5.4. Computational resources..........................................……..........……….....................................68
III. Molecular dynamics and QM/MM based Ram an-spectra computations for
phytochr o m e Cph1
6. MD simulations of Cph1 with clas sical for ce field .....................................................................70
6.1. Instability of py r role-water with classical MM force field.........................................................70
6.2. Computation of Mulliken cha r ges and generation of a polarized force field..............................74
6.3. Stabilization of py r role-water via polarized force field...............................................................75
6.4. Discussion....................................................................................................................................78
iii

7. QM/MM based Ra m an-spectra calculati on ...............................................................................79
7.1. Minimum ener g y Ram an-spectra based on MM force field VS. polarized force field...............79
7.2. Comparison of both MM and polf f methods including influences of protein and water
fluctuations on the computed spectra .....................................................................................83
7.3. Discussion....................................................................................................................................86
8. Summary and conclusions o f part III .........................................................................................87
IV . Molecular dyna m i cs and QM/MM based computations for phytochr ome
Agp2
9. QM/MM calculations for determination of pr otonation state of two conserved histidine
r esidues in near vic inity of the chr omophore BV in Agp2.. ..........................................................89
9.1. His278 models with proton on N ε -position.................................................................................90
9.2. Correctness of models with proton only on N δ -position of His278.............................................91
9.3. Determination of correct protonation-model...............................................................................91
9.4. Models with missing prot on at PSC(C).......................................................................................94
9.5. Discussion....................................................................................................................................97
10. QM/MM calculations and MD si m ulations of Agp2 m eth ylester vari ant of BV ...................99
10.1. Experimental remarks................................................................................................................99
10.2. Theoretical investi gations of Agp2 BV -variants.....................................................................101
10.2.1. QM/MM computations of models BV MB and BVMC........................................................101
10.2.2. QM/MM computations of model Bi MET wit h bimethy lated BV ........................................104
10.2.3. MM-MD of Agp2 WT (with ed-protonation of Hi s248 and His278)...................................106
10.2.4. MM-MD of models BVMB and B VMC and comparison to WT .........................................108
10.2.5. Discussion of MM-MD-simul ations of BVMB and BVMC models....................................1 14
10.2.6. MM-MD of model BiMET with bimethylated BV ...............................................................1 15
10.2.7. Discussion of MM-MD-simul ation of BiMET mode l..........................................................1 17
1 1. Summary and conclusions of part IV ......................................................................................1 19
V . QM calculations and subsequen t Ra m an- spectra computations via NMA f or
Rc FDH
12. General r emarks on investigations of FDHs + Refine m ent of the cofactor structure ........123
12.1. General remarks.......................................................................................................................123
12.1.1.State of art – available result s in the literature and remaining questions..............................123
12.1.1.1. State of art..........................................................................................................................123
12.1.1.2. Remaining questions regarding struct ure of Mo-containing cofactor in FDHs and mech-
anisms of FDHs....................................................................................................................124
12.1.2. Resonance Raman experiments of Rc FDH..........................................................................126
12.1.3. Procedure for evaluation of ligation-sphere of m o lybdenum, as well as mechanistic
properties of the Moco with QM-m ethods...........................................................................128
12.1.4. Preparing QM-methods: T esting of model size and method accurac y .................................128
iv

12.1.4.1. T ests of influence of model size on preciseness of computed vibrational frequenci es.....129
12.1.4.2. T ests of influence of conver gence method on preciseness of computed vibrati onal
frequencies............................................................................................................................131
12.2. Refinement of cofactor structure of oxi dized MoVI-form...................,..................................132
12.2.1. W ild type (active; regardi ng S*- and Cys-ligation).......…..…...........……..……................132
12.2.2. Prestate (inactive desul fo form)...........................................….............………....................136
12.3. Refinement of cofactor structure of re duced MoIV -form........................................................138
12.4. Discussion of refineme nt of cofactor structures......................................................................141
13. Evaluation of mechanistic pr operties .....................................................................................142
13.1. State of art of the cataly t ic process..........................................................................................142
13.2. Disulfide-bridge formation and possible unbinding of cy s teine in the MoVI state ................143
13.3. Reduced MoIV -form................................................................................................................144
13.3.1. Cys-ligation.............................................................….…...................…….........................145
13.3.2. Disulfide bridge formation........................................………...….…...................................147
13.4. MoV -species............................................................................................................................148
13.5. Bond-breaking enthalpies for MoVI- and MoIV -specim en.....................................................152
13.6. Proposed mechanism for reversible formate oxidati on at Moco-site......................................154
13.7. Discussion of proposed mechanism and compa rison to the literature.....................................157
14. Summary and conclusions o f part V .......................................................................................164
Final Rem arks ……………………...…………………………………..…...................………167
Acknowledgm e nts ................................................................…..................................................169
Appendix ..……………….…..……..….………….….……..….……......……….…...................170
Bibliography ................................................................................................................................212
v

List of A bbr eviations.
Phytochrome Phy
Cph1 Cyanobacterial phytoch r ome 1
PCB Phycocy canobilin
Agp2 Ag r obacterium tumefaciens phytochr ome 2
BV Bil iverdine
Rc Rhodobacter capsulatus
FDH Formate dehy drogenase
p-water Pyrrole-wate r
CH 3 Methyl group
RR Resonance Raman
eq. Equation
MM Molecular mechanics
MD Molecular dynami cs
polf f Polarized force field
RMSD Root mean square deviation
RMSF Root mean square fluctuati on
RGYR Radius of gyration
HF Hatree Fock
DFT Density functional theory
QC Quadratically conver gent
SCF Self consistent fiel d
PBC Periodic boundary conditions
SBC Stochastic boundary conditions
PME Particle Mesh Ewald approximations
QM Quantum mechanics
PCM Polarized continuum model
MM-MD MD-simulation with cl assical molecular mechanics
Force field
polf f-MD MD-simulation with pola rized force field
His Hi s tidine
Asp Aspartate
Cys Cy st eine
Ser Serine
Pro Proline
Ar g Argini ne
T y r T yrosine
L y s L y sine
OH - H y d roxyl
CO Carbony l
CH 2 Meth y lene
NH-ip N-H in-plane rocking
HSE Nε-protonation of histidine
HSD Nδ-protonation of histidine
HSP Double protona tion of histidine
ed, ee, de, etc. Corresponding protonation states of N of His248 and
His278 (f.e. ed means His248 protonated
corresponding to HSE and His278 protonated
corresponding to HSD)
vi

PSC (B/C) Propionic side chain (ring B or C) of bili verdine
bimme Biliverdine m onometh y lester
BVMB Biliverdine m onometh y lester methylated at PSC(B)
BVMC Biliverdine m onometh y lester methylated at PSC(C)
BiMET Doubly methylated bili verdine at PSC(B) and PSC(C)
RR Resonance Raman
R Raman
IR Infrared
Subtr . Subtraction
Moco Moy bdenum cofactor
XAS X-ray absorption spectroscopy
EXAFS Extended X-Ray absorption fi ne structure
EPR Ele ctron paramagnetic resonance
wo W ithout
GDP Guanine diphosphate
small Moco Moco without GDP units
bigger Moco Moco without guanine parts, but with phosphates and
ribose
S Sulfido
O Oxo
OH Hy d roxo
SH Thiol
PO 4 Phosphate
vii

Abstrac t
Diversity of life is not only defined by th e number of genes in an or ganism, but by the pr oteins and
subsequent modifications of these. The differ ent pr oteins and differ ent functionality ar e the key for
generating life as it is known. Pr oteins ar e important for structural s tability of cellular systems, as
well as for mediating all kinds of pr ocesses (like chemical r eactions and s ensing of external
stimuli). Understanding of the structur e and functionality of pr oteins on an atomic level is of utter
importance for developing pharmaceuticals, artificial devices and their appli cations. In this work
differ ent p r otein systems (such as photosensor s and enzymes ) ar e investigated wi th theor etical
methods which in the past have pr oven very useful for determination of structur e and function of
lar ge biomolecules. The r esults fr om theory can then verify , complete or pr edict experimental
r esults, such as obtained by Raman spectr oscopy and X-ray analysis.
Phytochr omes ar e biol ogical r ed/far-r ed light sensing pr oteins occurring in plants, as well as in
bacteria and fungi. In this work two phytochr omes, namely Cyanobacterial phy tochr ome Cph1 and
Agr obacterium tumefaciens phytochr om e Agp2, ar e investigated. In Part III of this wor k, the
development and its application of a new polarized for ce f ield for the chr omophor e-binding domain
(CBD) of Cph1 is s hown. In the case of Cph1, the stability of the water -network inside the binding
domain of the chr omophor e has been addr essed with a polarized for ce field which has been
developed fr om Mulliken char ges derived by quantum chemical computations of the cofactor and its
immediate envir onment. MD -simulations with this for ce field r evealed an incr eased stability of the
pyrr ole-water . This stabilization eff ect can be seen in computed Raman-spe ctra. Hybrid quantum
mechanic/molecular mechanic (QM/MM) appr oaches have been used for subsequent Raman-
spectra calculations. The implementation of polarization effects led to a better agr eement of
computed spectra wi th experiments, especially in terms of the NH-in-plane r ockings (mor e
pr onounced bands) which ar e due to less fluctuating water -molecules. Thus, this method impr oves
existing MD-pr otocols by exchanging conventional point char ges of atoms with quantum
chemically derived partial char ges leading to a stabilization of the water network and can be used
for mor e accurate spectra-computations.
Part IV illustrates the investigation of the Agp2 biliver din e-binding domain in its par ent Pfr-st ate.
QM/MM methods have been used for determination of the corr ect pr otonation s tate of the two
conserved histidine r esidues His248 and His278 in near vicinity of the chr om ophor e. Comparison
of differ ent pr otonation models of His248 and His278 with the corr esponding X-ray structur e.
r evealed tha t the former histidine is pr otonated at the N ε , while the latter carries a pr oton on N δ .
Furthermor e, QM/MM- and MD-techniques have been applied to differ ent variants of Agp2, in
which BV carries a m ethyl gr oup on either the pr opion ic side-chain of rings B (BVMB) or C
(BVMC) or both (BiMET). Comparison to the crystallographic structur e with native-cofactor inside
its binding-domain, r evea led a high flexibility of Ar g21 1 to the intr oduction of the methyl function
at pr opiona te of ring B. It induced an opening of the chr omophor e binding dom ain not be observed
in the wild-ty pe. In contrast to this, the variant BVMC showed big deviations of a highly unstable
His278 fr om its original position and less tolerance to the additional methyl gr oup. This led to the
hypothesis that BVMB is photoactive, because of its ability to p r oduce a stable CBD after initial
r earrangements. BVMC was postulated to be photoinactive, due to the highly unstable CBD. This is
in agr eement to experimental findings pr oposing BVMB to under go a complete photoconversion
upon light-irradiation, wile BVMC r emains photoinactive.
Studies of formate dehydr ogenase of Rhodobacter capsulatus ( Rc FDH ) ar e shown in Part V of this
work. Rc FDH is an enzyme which catalyzes the r eversible oxidation of form ate to carbon dioxide
upon r eduction of NAD + . It carries a Mo-containing cofactor , call ed Moco, common in FDHs of
viii

differ ent or ganisms. Since no crystal structur e is available for Rc FDH s o far , the complete
coor dination spher e of the Mo has not been determined conclusively , yet. Her e, a combination of
quantum mechanical (QM) geometry optimizations and computations of Raman-spectra has been
done, using a homology model of RcFDH. Differ ent models of a s maller Moco (without GD P-
units), with differ ent ligation spher es have been built, including the native Cys386 in dir ect vicinity
of the cofactor . Subsequent comparison to experimental r esonance Raman findings r evealed, the
ligands of Mo: Mo exists in at least two differ ent oxidations states +VI and +IV . Bo th ar e
coor dinated by two dithiolene moieties. MoVI is further ligated to a native Cys386 r esidue and
either a sulfido-li gand or an oxo-ligand. The latter leads to an inactive species which has been
observed in experimental EXF AS studies. The MoIV -species lacks (at least temporarily) the Cys-
ligation and is pentacoor dinated by a sulfido-ligand (or temporarily thiol-gr oup) and two
dihtiolenes. Additional models with differ ent ligands at the metal-site for both oxidation states, as
well as bond dissociation enthalpy computations of differ ent ligand-metal bond systems gave deep
insight into the catalytic cycle: Formate does not bind dir ectly to MoVI. In contrast to that, Cys386
stays coor dinated to MoVI, while in MoIV Cys leaves the m etal. This is in accor dance with a
mechanistic pr oposal that formate stays unbound near to the metal-center during the r eaction
cycle. Possible for mal hydride-t ransfer fr om formate is m ediated by the sulfido-ligand leading to
the r eduction of molybdenum to MoIV and temporar ily forming a SH-ligand at MoIV which further
translocates a pr oton to Cys386 and afterwar ds deeper into the pr otein. Subsequently , Cys386 is
r eturning to the metal-site and two electr ons ar e r eleased fr om the metal and the cycle can start
anew (while CO 2 is r eleased fr om the active-site). The on-and-off-going of Cys386 may be mediated
by the formation of a disulfide-bridge with th e sulfido-ligand. Also, the possibility of an
intermediate MoV -species has been investigated and possible partici pation in the mechani sm
discussed. Finally , the complete coor dination spher es of both MoVI, MoV and MoIV in the Moco of
Rc FDH ar e shown, wi th possible influence on the r eaction cycle. The her e pr esented catalytic
mechanism includes important characteristics of r ecently discussed ones (like pr oton-abstraction
mediated by the Cys-gr oup and temporal formation of a SH-gr oup and formal hydride-trans fer and
only indir ect formate-Moco interaction), as well as new feautr es and thus r epr esents an impr oved
and extended version.
ix

Zusammenfa ssung
Die V ielfalt des Lebens wir d nicht allein dur ch die Anzahl der Gene in einem O r ganismus bestimmt,
sondern vor allem dur ch die unterschiedlichen Pr oteine und der en Modi fikationen.
Unterschiedliche Pr ot eine und der en Eigenschaften sind entscheidend für die Erschaffung von
Leben, so wie es bekannt ist. Pr oteine sind wichtig für die Stabilität von zellulär en Systemen,
genauso wie für die Dur chführung unterschiedlichster Pr ozesse (wie zum Beispiel chemische
Reaktionen und das Erkennen von externen Reizen). Um Medikamente und artifizielle Pr oteine
herzustellen, s owie für der en Anwendung ist es von unerlässlicher Bedeutung, die Struktur , sowie
die Funktionalitäten von Pr oteinen auf einem atomar en Level zu kennen. In dieser Arbeit sind
verschiedene Pr oteinsysteme (wie zum Beispiel Fotosensor en und Enzyme) mit Hilfe von
theor etischen Methoden untersucht wor den. In der V er gangenheit hat sich der Einsatz von
theor etischen Ber echnungsmodellen als sehr hilfr eich bei der Bestimmung von Struktur und
Funktion gr oßer Biomoleküle erwiesen. Die Er gebnisse von diesen Ber echnungen können dann
genutzt wer den um experimentelle Studien z u vervollständigen, zu bestätigen oder sogar
vor herzusagen, wie es zum Beispiel bei Raman-Spektr oskopie und de r Röntgenstrukturanalyse in
dieser Arbeit getan wur de.
Phytochr ome sind biologische lichtempfindliche Pr oteine, die im r oten Spektralber eich absorbier en.
In dieser Arbeit wer den zwei verschiedene Phytochr ome untersucht, jenes von Cyanobakterium
(Cph1) und das von Agr obacterium tumefaciens (Agp2). Im T eil 3 dieser Arbeit wir d die
Entwicklung eines neuartige n polarisierten Kraftfeldes für die Chr omophor-B indungs-Domäne
(CBD) von Cph1 dar gestellt. Mulliken-Ladungen des Chr omophor es und seiner dir ekten Umgebung
wur den mittels quantenmechanischer (QM ) Rechnungen ermittelt und z ur Entwicklung eines
polarisierten Kraftfeldes herangezogen. Dieses wu r de in eine r MD-Simulation genutzt um die
Stabilität des W assernetzwerks innerhalb der CBD zu adr essier en und zu er höhen. Im Er gebnis
er höhte sich die Stab ilität des Pyr olwassers deutlich, was auch in nachfolgend ber echneten
Ramanspektr en gesehen wer den konnte. Ei ne Methode be stehend zum T eil aus
quantenmechanischen Ansätzen und zum ander en T eil aus klassisch molekülmechanischen
(QM/MM) wur de genutzt um Ramanspektr en für Cph1 zu ber echnen (basier end auf der Simulation
mit polarisiertem Kraftfeld). Die hiermit ber echneten Spektr en zeigen eine deutlich besser e
Über einstimmung mit experi mentellen D aten und weisen e ine deutlich schmaler e NH-in-plane
Schwingungsbande auf. D ies ist auf die Stabilität des Pyrr olwassers auf seiner nativen Position
zwischen den Pyrr olringen zurückzuführ en. Die hier vor gestellte Methode konventionelle
Punktladungen in MM-Kraftfeldern gegen mit QM-Methoden ber echnete Partialladungen
auszutauschen, stellt demnach eine V erbesserung gängiger Simulationsmethodiken von
Biomolekülen dar und ermöglicht ei ne r eal istischer e Ber echnung von Schwingungsspektr en.
In T eil 4 dieser Arbeit wir d die Untersuchung der CBD von Agp2 in s einem nati ven Pfr-Zu stand
diskutiert. Q M/MM T echniken sind genutzt wor den um den Pr otonierungszustand der beiden
konservierten H istidine His248 und His278 in unmittelbar er Nähe des Chr omophor es Biliver din
(BV) zu bestimmen. V erschiedene QM/MM-Modelle, die sich dur ch mögliche Pr otonierungen der
Stickstoffe der beiden Histi dine unterscheiden, wur den mit der Kr istallstruktur von Agp2
ver glichen. Hierbei wur de ermittelt, das s His248 am N ε pr otoniert ist, währ end H is278 sein Pr oton
am N δ trägt, da das dazugehörige Modell die geringsten struktur ellen Abweichungen von der
Kristallstrukt ur aufwies. Dur ch die Identifizierung des korr ekten Pr otonierungszustandes in der
CBD von Agp2 ist es nun möglich weiterfüh r ende theor etische U ntersuchungen anzustellen, was
der Interpr etation von experimentellen Befunden zu Gute kommt. So wur den dr ei verschiedene
Modelle erstellt, welche sich in der Position einer unnatürlichen Methylgruppe an den
Pr opionatseitenketten des Kofaktors unterschieden: an r ing B (BVMB), ring C (BVMC) oder an
x

beiden (BiMET ). Diese wiederum wur den m it Hilf e von MD- und QM/MM-Methoden untersucht
und mit der Kristallstruktur m it unmethyliertem Kofaktor ver glichen. Dabei zeigte sich, dass
Ar g21 1 e ine besonders hohe Flexibilität aufweist, in Bezug auf die eingeführte Methylgruppe.
Letztgenannte Gruppe induziert eine Öffnung der CBD unter E inbezug von Ar g21 1 (ähnlich wie bei
einem Fass, an welchem der Deckel angehoben wir d), welche in dieser Art nicht im nat iven Pr otein
simuliert wer den konnte. Im Gegensatz dazu, offenbarte die Simulation von BV MC ein sehr
instabiles His278, welches starke Abweichungen zu seiner Ursprungsposition zei gte. W eite T eile der
CBD in BVMC zeigten deutliche Ins tabilitäten währ end der MD- und QM/MM-Ber echnungen. Dies
führte zu der Hypothese, dass BVMB fotoaktiv sei , da sich hier die CBD der eingeführten
Methylgruppe anpassen konn te und am End e der MD s tabil blieb. Im Gegensatz daz u sei BVMC
auf Grund der Ins tabilität in der CBD nicht in der Lage eine komplette Fotokonversion
dur chzulaufen. Dies ist in Über einstimmung unveröffentlichten Experimenten.
In T eil 5 dieser Arbeit wir d das Enzym Formiat dehydr ogenase von Rhodobacter capsulatus
(Rc FDH ) eingehend untersucht. Dieses Enzym ist verantwortlich für die r eversible Oxidation von
Formiat zu Kohlenstoffdioxid. Hierbei wir d NAD + zu NADH r eduziert. Der Kofaktor , welcher in
allen bekannten FDHs zu finden ist, der sogenannte Moco, trägt ein Molybdän in sei nem Zentrum.
Die genaue Koor dinationssphär e des Mo ’ s im Moco ist noch nicht endgültig bekannt, da es zum
jetzigen Zeitpunkt keine Kristallstruktur von RcFDH gibt. Deshalb wur de ein Homologiemodell von
RcFDH genutzt u m verschiedene Mode lle eines kleiner en Mocos zu generi er en. Mittels QM-
Methoden ber echnete Raman-spektr en wur den hinsichtlich ihr er V er gleichbarkeit mit
experimentellen Resonanz-Raman-Spekt r en untersucht, um so die korr ekte Koor dinationssphär e des
Metalls zu entschlüsseln. Hierbei kann das Metall mindestens in zwei verschiedenen
Oxidationsstufen vorliegen, entweder oxidiert als +VI oder r eduziert als +IV . Beide Metallformen
sind von zwei D ithioleneinheiten koor diniert. MoVI ist z udem vom Cys386 dir ekt koor diniert und
trägt zusätzlich entweder einen Sulfido- oder Oxo-Liganden und ist s omit hexakoor diniert. Der
letztgenannte Ligand wur de in EXF AS-Messungen in einer inaktiven Form des Enzyms beobach tet.
MoIV hingegen trägt (zumindest teilweise) keinen Cys-Liganden und ist pentakoor diniert. Das
Metall ist von zwei Dithiolenen und ein er Sulfido-Gruppe (evtl. auch temprär von einer Thiol-
Gruppe) koor diniert. Mit Hilfe die ser Erkenntni sse wur den wei ter e Modelle erstellt, die sich
hinsichtlich zusätzlicher L iganden in be iden O xidationsstufen unterschieden. Zusammen mit der
Ber echnung von B indungsbr echungsenthalpien verschiedener Metall-Liganden-Bindungen lieferte
dies neue Erkenntnisse hinsichtlich eines möglichen Reaktionsmechanismus: MoVI hat eine
stärker e Bindung zu Cys386, als zu Formiat, weshalb Cys386 nicht als Ligationspartner des
Metalls von Formiat ausgetauscht wir d. Dies ist in Über einst immung mit ber eits veröff entlichten
Daten. W eiter konnte gezeigt wer den, dass ein Hydridtransfer des Formiatpr otons mittels des
Sulfido-Liganden unter Ausbildung einer temporär en SH-Gruppe am Moco möglich ist. Dadur ch
wir d MoVI zu MoIV r eduziert. Das Pr oton wir d dann sehr schnell weiter zum negativ geladenen
Cys386 geleitet, welches nicht stabil an MoIV bindet, von wo es tiefer in das Pr ote in transportiert
wir d. Auch Formiat (und evtl. CO 2 ) bindet nicht stabil an das r eduzierte Metallzentrum, was zu
einer pentakoor dinierten Form des MoIV führt. Nach dem V erlassen von CO 2 koor diniert Cys386
wieder an das Metall, welches zwei Elektr onen abgibt (MoIV → MoVI) und der Zyklus kann von
vorne beginnen. Die Abspaltung und Bindung von Cys386 vom bzw . z um Metall könnte dabei
mittels der Ausbildung einer Disulfidbrücke mit dem Sulfido-Liganden vonstatten gehen. Ebenfalls
ist es möglich, dass die Elektr onen Schritt für Schri tt übertragen wer den und somit eine MoV -
Zwischenstufe präsent ist. Diese Möglichkeit wur de getestet und ein möglicher Anteil am
Mechanismus diskutiert. Zusammenfassend lässt sich sagen, dass der hier vor gestellte
Mechanismus Eigenschafte n der beid en vor herrschenden Katalysemodelle (P r otontransfer m ittels
Cys386, SH-Ligand-Ausbildung, Hydridtransfer , keine Ligation von Formiat an Moco)
zusammenführt und noch um neue wichti ge Charakteristika erweitert und somit verbessert.
xi

Part I.
Intr oduction
1

Chapter 1
Phytochr ome Photor eceptors
1.1. General pr otein structur e motives (Cph1 and Agp2)
Phytochromes are biological light-receptors. The y are responsible for determination of red light
and managem ent of dif ferent ph y siological functions in both higher plants [1] and bacteria, fungi
and algae [2],[3]. In the for mer they are responsible e.g. for avoidance of shade and orientation
towards a light-source, the development of seeds and flowers and leave-formation. In or ganisms
other than plants, like bacteria for example, they mediate e.g. orientation in the l ight-deficient soil.
These photoreceptors in terconvert reversibly between a red-light sensitive P r state and a far -red-
light sensitive Pfr state, where one state is inactive (but thermally stable) and the other is active (but
thermally unstable; s. section 1.3. for further information about the photocycle and the differe nces
in parent states). They abs orb either at ~670 n m or ~730 nm in their Pr and Pfr s tates, respectively .
The Pr-P fr-transition is initiated b y a covalently bound chro mophore (an open-chain
tetrapyrrol/bilin). Upon ligh t-absorption, the cofactor changes its conformation, thus inducing
changes in the protein environment. This ultimately activates signaling cascades through an
attached catal y tic module (histidine kinase (HK) in Cph1 and Agp2) which induce ph y siological
processes. In the following, the focus will be on the two ph y tochromes Cyanobacterial
phytochr ome Cph1 and Agr obacterium tumefaciens Agp2 and both have been subject of this thesis.
The photosensory core (located at the N-terminus of the protein) and the catalytic active regulatory
part (loc ated at the C-terminus) [4],[5],[6],[7],[8] constitute the two dif ferent functional parts of all
phytochromes [3]. The photosensor y core consists of the P AS ( P ER/ AR NT/ SIM), GAF (“vertebrate
cG MP-specific phosphodiesterases”, “cyanobacterial adeny late c yclases” and “ for mate h y drogen
lyase transcription activator , Fh1A”) and PHY (“ phy tochrome”) domains. In the case of Cph1 the
HK is connected to the PHY domain. The so-called hinge region connects the photosensor y core
with the regulatory one in plant ph yt ochromes. The chromophore binding domain (CBD) is the
P AS-GAF region, in which the cofactor is located (s ee F igure 1 on the right, for an example of the
CBD in Agp2). Phytochromes exist in dimeri c form (homo- or heterodimerization).
X.ray crystallography helped to resolve the structure of dif ferent ph ytochromes [9],[10],[1 1]. For
example the ZZZ/ssa comformation of the chromophore Ph y coc ya nobilin (PCB) in its Pr-state in
Cph1 could be confirmed [12] (s. s ection 1.2.). N MR analy sis [13][14] similarly contributed to the
valdiation of structural m otives in ph y t ochromes.
The in this work discussed crystal structures of Cph1 and Agp2 are based on the structure [10]
available in the Protein Data Base [15] (code: 2VEA) and the X-Ray-studies from the group of
Scheerer et al. [16] (code: 6G1Y), respectively .
1.2. Chr omophor e structur e and binding pock et
For the following The chromophore is an open-chain tetrap y rrole (bilin, see Figure 1, left side)
consisting of four pyrrole-ri ngs, labeled from A to D (naming starts at ring A which is the protein-
chromophore link). These are connected via three methine bridges. Dif ferent phytochromes show
dif ferent prostheic groups. A ll of them carry the same main tetrap yrrolic spine and only differ in
substitutes at rings A and D. Ph y coc y canobilins (PCB) and biliverdins (BV) are found in the
2

bacterial ph ytochrom es Cph1 and Agp2, respectively . Plant ph ytochromes mostly carry a
Phytochromobilin (PΦB). PCB and PΦB show an eth yl group for R 1 , while BV has an additional
double bond either exo- or endo-c yclic on ring A. In Agp2, the linkage of the C y s13 to BV is in the
exo-form [16], although in this w ork an early endo-structure has been used (depicted in F igure 1,
right) [16]. R 2 is in BV and PΦB a vinyl group, while in PCB it is an ethyl group.
Figur e 1. (left) Possible tetrapyrrolic structure s for ph y tochrome chromophores: R 1 is in PCB and
P Φ B a ethyl-group with a linkage to a nati ve Cys, in BV an additional double bond is either endo-
or exo-cyclic; R 2 is in P Φ B and BV a vinyl function, while in PCB it is a et h y l-group; (right) CBD
of Agp2 with BV and surrounding protein residues and cry s tal water molecules [16]
The flexibility of the chromophore is attained to its thr ee methine bridges which connect the four
rings. T orsions around these rings represent the crucial modes of action for conversion between P r
and Pfr s tates. A rotation around the single-bond be tween rings C and D (see section 1.3. for further
details) initiates conformational changes in the protein matrix, finall y activating or shutting down
the catal y tic unit. The parent configuration of PCB in Cph1 in its Pr state is ZZZssa. Here, “Z”
means “Zusammen” and “E” means “Entgegen”, concerning the respective double bond
configurations of the methine bridges. “s” means “s y n ” and “a” means “ anti” and they correspond
to the methine bridge s ingle bonds. Agp2 holds the ZZEssa for the Pfr state in the cr y stal structure
[16] (code: 6G1Y).
Furthermore, the four pyrrole-nitrogens have been found to be protonated in both Cph1 and Agp2
and in both Pr and Pfr states (Raman and NMR studies [17],[18],[19],[20] ,[21],[22]). It has also
been found [23] that in the case of Agp2 the propionic s ide chain of ring C is protonated in its Pfr
state. This is in contrast to observations from other ph ytochromes (like Cph1), where the
propionates are alway s negatively char ged and do not carry any protons [10],[13],[15].
In phytochromes, the chromophore resides inside the photosensor y do main and is covalently bond
to a c ysteine residue. In the case of plant ph ytochromes and Cph1 this cy steine (C y s259) re sidue
lies within the GAF domain, while in the case of bacterial phytochromes, such as Agp2 the c ysteine
3

(Cys13) is lo cated in the P AS region. In Figure 3 the CBD of Agp2 can be seen, together with its
location in the photosensor y core and important secondar y structures (in a similar wa y the CBD of
Cph1 is depicted in Figure 2 which is not discussed here in more det ail). BV is linked to Cys13 and
embedded in the protein m atrix. Several residues are in teracting with the cofactor . This is e.g. an
aspartate l yi ng beneath BV . It is a conserved residue in all known ph ytochromes and might play an
important role in proton transfer processes (it is also part of the Asp-Ile-Pro motif which is
conserved). An argi nine, Ar g21 1, interacting with the propionic side chain of ring B (PSC(B)) via
formation of a strong salt bridge might also be important (see section 2.2. for further insight in the
importance of PSC’ s of chromophores) [24],[25],[26]. Shown in Figure 1 (right) are the conserved
histidine residues His248 and His278 (His260 and His290 in Cph1). Both are suggested to pla y
important roles in the photocy cle and in stabilization of the cofactor (see the results of this work in
sections 9. and 10.) [23],[27],[28]. His278 directl y interacts with the propionic side chain of ring C
(PSC(C)) via a h y drogen bond. His248 interacts with the tetrapyrrole via a water which is bridging
to the cofactor . This water is also present in all known ph ytochrom es and lies between th e four
pyrrole-rings. It is hence called the p y r role-water (p-water). This water is part of an important
water -network surrounding the chromophore. The two histidines are optimal for the proton-
acception and -donation events [23],[27],[28] hapening during the photocycle (s. section 1.3.). Both
the named histidines and the p-water wi ll be subject to studies in this work (parts III and IV).
Figur e 2. Crystal structure of Cph1 [10]; PDB-code: 2VE A – Pr state (left) PCB and surrounding
protein residues ; (middle) CBD wit h important secondary structures; (right) sensoring domain
Because X-ra y cr y stallograph y in general is not able to provide the positions of the h ydrogens (due
to low resolutions), the protonation states of these two His residues are not known. The
determination of thus is not trivial and will be subject of this work (part IV). In Cph1, both NMR-
studies [29] ,[30], as well as theoretical investigations [27], proposed a heterogeneity of the P r -state
(Pr -I and Pr -II) which diffe r in the protonation of His260. Newer studies b y Mroginski et al. [27]
suggested a ca tionic His260 with a N ε -protonated His290 in Pr-I. Dif fering from this, in Pr-II, both
histidines are protonate on N ε . Studies b y T akiden et al. [28] suggested a lso a possible het erogeneity
for Agp1, with either both histidines either protonated at N ε , or His250 at N ε , while His290 is
cationic. No studies on Agp2 have yet reported a heterogeneity in its pare nt Pfr-sta te.
4

Figur e 3. Crystal struct ure of Agp2 [16]; PDB-code: 6G1Y – Pfr state (left) BV and surrounding
protein residues ; (middle) CBD wit h important secondary structures; (right) sensoring domain
1.3. Photocycle (com p arison between pr ototypical and bathy phytochr o mes)
Photocycle. Ph y tochro mes under go a photoc y c le upon irradiation with red or far-re d light. F igure 4
shows the sche matic photoc ycle from Pr to Pfr and vice versa . This process is ver y similar in al l
known ph ytochromes [31],[32],[33],[34],[35],[36]. Red-light induces the isomerization of the
C15=C16 double bond [31],[37],[38],[39] which leads to subsequent changes in the surrounding
protein environment. In Figure 5 the differe nces between the two parent states Pr and Pfr are
illustrated for Cph1 (Pr state; left) and Agp2 (Pfr s tate; right). ring D is twisted in BV of Agp2 in
contrast to ring D of PCB in Cph1. D ue to the rotation of ring D, several int eractions are lost or
newly gained which leads to a translation of the chromophore inside the CBD [8][40]. The
photoproducts are called Lum i-R for Pr-t o-Pfr - and Lumi-F for the Pfr -to-Pr -transition (s. Figure 4).
Figur e 4. Schematic representation of the photocy c le of ph y tochromes via Lumi-R and Meta-R or
Lumi-F and Meta-F , res pectively; with the rmal dark-conversions for both bathy and prototypical
ph y tochromes
The thermal relaxation of the CBD leads to Meta-R and Meta-F , respectively (Figure 4). Following
this a longer Meta-F to Pr (or Meta-R to P fr , respectively) transition takes place. Reprotonation of
the chromophore, as well as substantial changes in the protein matrix occur during this step [8],[40].
5

Finally , the protein in its activated form is converted back to i ts initial Pr/Pfr s tate via far -red/red-
light irradiation. Another possible back-conversion is a ther mal relaxation process which is called
dark-conversion, happening independent of light.
The first step in the reaction c yc le is the fastest with only ~50 ps, while all other subsequent
relaxation processes are in the range of milliseconds [41]. The relaxation of the CBD induces globa l
changes of the photosensor y core of the protein leads ultimately to the signaling cascades. Ma y be a
signaling process can onl y be intitiated, if both monomers in the dimeric ph y tochromes under go
photoisomerization processes [42].
Figur e 5. (left) Cph1 chromophore PCB in its Pr state; (ri ght) Agp2 chromophore BV in its Pfr
state: the ring D is tilted in the Pfr state com pared to the Pr one and PSC(C) carries an additional
proton; structures based on crystal structures for Cph1 [10] and A gp2 [16]
Furthermore, proton transfers, which occur during the last phase of the transition, between the Meta
and Pr/Pfr states [32], might play an important role in the following signaling process. After
formation of the respective Meta state, a proton (most probably from the inner p yrrole-ri ngs) is
transferred into the protein matrix. The proton- transfer might initiate further protein changes on a
secondary structure level: In all ph ytochromes the region connecting GAF to PHY is reaching into
the CBD (called “tongue”; s ee Figures 2 and 3, middle+right). In the Pr state it occurs as a beta-
sheet (s. Figure 2), while in the Pfr state it beco mes an alpha-helix (s. Figure 3) [42]. This change in
structure m ay be indu ced by proton releases from the chro mophore and might trigger the further
movement of the ph yt ochrome monomers, finally activating the signaling processes [42]. But this
has yet to be proven. Afterwards ,the cofactor (p y r role-nitrogen) gets reprotonated before complete
conversion to Pfr/Pr . In the case of Cph1 it has been discussed that His260 might be the source for
the back-protonation of the chromophore [27]. In the case of Agp2, two additional proton-involving
events are assumed to pla y an important role: These are the deprotonation of the PSC(C) and the
protonation of His278 [23] upon formation of Pr-stat e. A ll these protonation events and the
rearrangements of the CBD may be the reason for the long Meta-R to Pfr transition (M eta-F to Pr ,
respectively ) of about 300 ms.
Pr ototypical/Bathy phytochr omes. There exist two dif ferent ty pes of ph y tochromes in bacteria:
The prototypical ph y tochromes (e.g. Cph1 and Agp1) and the bathy ph y tochro mes (like Agp2). The
main dif ference between both of them is the dif ferent thermodynamic stability of Pr and Pfr states.
In protoypical ph ytoc hromes, like Cph1, the Pr state is inactive and the thermally stable one.
Irradiation leads to conversion into active Pfr . In the case of bathy ph ytochromes, like Agp2, it is
vice versa . Here, Pfr is inactive and the most stabl e state and far -red- light irradiation leads to the
active Pr state (see Figure 5 for comparison between both chromophores in their respective parent
states). In the case of Agp2, the proton at the propionic side- chain of p y rrole-ring C (PSC(C)) might
6

play an important role in the thermal back-reaction fro m Pfr to Pr . It might be transferred to His278
and this event might induce the possibl e dark conversion via a keto-enol tautomerie [23].
The photocycle is still subject to investigations. Theoretical s tudies might help to provide necessary
information, such as protonation states and possible s tuctural d y namics in the CBD to understand
this process. Also prediction/computations of spectroscopic data (such as Raman-spectra) are of
great importance. Interpretation and comparison to experimental resonance Raman (RR) data might
provide insight into the photoc ycle. In this work diffe rent co mputational approaches are us ed to get
insights into structure and d ynamic s of both Cph1 and Agp2. See parts II for theory and methods
and III and IV for results for studies on Cph1 and Agp2, respectively .
7

Chapter 2
Rhodobacter capsu lat us formate dehydr ogenase( Rc FDH)
Protein dehydrogenases are oxireductases (enz ymes) whic h catalyze the oxidation of various
substrates, such as alcohols and aldehydes via the formal re moval of a hydride H - -ion. Th ese
processes are often further mediated b y additional cofactors, such as
nicotineamideadeninenucleotide (NAD + /NADH), flavineadeninedinucleotide (F AD+/F ADH 2 ) and
further cofactors (see section 2.2.). One example is the human aldehy de dehydrogenase [ALDH2]
[43] which is important for the degradation of alcohol. It mediates the reaction RCOH + NAD + +
H 2 O → RCOOH + NADH + 2H + [44]. W ithout a correctly functioning a ldehy de dehydrogenase, the
or ganism is not able to convert the toxic aldeh ydes into non-toxic acetates RCOO - . Such is also
called the Oriental-flushing sy ndro me [45]. The ALDH2 contains addi tional to NAD + ( the electron-
acceptor) a M g-containing cofactor mediating the oxidation of the aldeh yde into the carboxylic
acid. Furthermore, diffe rent protein residues play an important role in stabilization and catalytic
conversion of the substrate (such as Cy s [45] ).
Formate dehydrogenases (FDHs) are a member of the dimeth yl sulfoxid (DMSO) reductase family .
They mediate the oxidation of formate (HCOO - ) to carbon dioxide CO 2 via cleavage of a C-H bond,
The electrons are donated to NAD + or cytochrome (CYT) [46], depending on the organism: Aerobic
metabolisms, e.g. yeast and bacteria, contain NAD + -dependent FDHs, w hile anaerobic prokar y otes
contain CYT . It is a key process for the or ganis m (e.g. bacteria li ke Rhodobac ter capsulatus ) to
obtain ener g y and reducing-equivalents. Metal-dependent FDHs contain a metal-containing cofactor
with either mol ybdenum (Mo) or tungsten (W) [47] at their active-sites. Furthermore, the y all show
conserved C y s (native form or seleno-Cys), His and Arg residues in near vic inity of the
moly bdenum-containing cofactor (Moco). In the following, more detailed insight into FDH of
Rhodobacter capsulatus ( Rc FDH) is given, s ince this work is focused only on this FDH. FDHs of
other species, such as from E. coli are only mentioned but not discussed in de tail. In the following
the structure and function of the Moco and other cofactors, such as iron-sulfur clusters are described
in greater detail. Man y of the following details (such as Moco-structure) are conserved between
known FDHs.
2.1. General pr otein structur e motives and global mode of action
Rc FDH is a Mo- and NAD + -dependent oxygen-tolerant enzyme. Its functional unit consists of a
dimeric heterotrimer [48] (show n schematically in Figure 6 on the l eft). The (αβγ) 2 -unit holds the
active site α (FdsA) with the Moco and five diffe rent iron- sulfur clusters (FeS), namel y four Fe 4 S 4
clusters and one Fe 2 S 2 cluster . The γ-unit (FdsG) contains only a Fe 2 S 2 cluster . And finall y the β-unit
(FdsB) cont ains one Fe 4 S 4 cluster , as well as one flavinemononucleotide (FMN) cofactor and the
NAD + binding-site. The formate oxidation (HCOO - → CO 2 + 2e - + H + ) occurs at the acti ve site
located in the α-monomer and at the Moco-site. The electrons are then transmitted via reduc tion of
the metal (here Mo): Mo(+VI) → Mo(+IV); into the Moco. The protein and Moco exist therefore in
at least two dif ferent redox sta tes (more on possible oxidation states and the possibility of a MoV -
state in sections 12. and 13.4.): one with Mo(+VI) and one w ith Mo(+IV). Afterwards, the electrons
might be translocated into the protein matrix via reduction of the dif ferent FeS clusters, until they
finally reach the FMN-site. Here FMN ca talyzes the reduction of NAD + + 2e - + H + → NADH. It has
also been found that Rc F DH is able to catalyze the reverse reaction, namely the CO 2 -reduction to
formate [48]. The observed reaction rate for formate oxidation was k cat = 2189 min -1 , while the one
8

of the backr eaction was k cat = 89 min -1 [48]. The CO 2 -reduction is ver y slow , because the electrical
potential of the FeS cluste rs might hinder the electronic flow from flavine to the Moco-site [48].
Figur e 6. (left) Schematic representation of the Rc F DH heterotrimer and the corresponding
cofactors and its active sites (Fig. inspire d b y Fig. 1C in [48]); (right) Rc FDH homology model [49]
Until now , five differe nt genes are known to code for Rc FDH. These are FdsA, FdsB and F dsG
(responsible to the α-, β- and γ-units). Additionally , there exist two other genes FdsC and FdsD
which have been recently investigated [48]: The former is an analog to the FdhD gene in FDH of E.
coli and is important for sulfuration of the Moco (see section 2.2.) [48], while the latter has no
known analog in E. coli and is unique to Rc FDH. It see ms to be important for inser tion and
stabilization of the Moco inside the protein matrix, while bound to FdsC [48]. It also might be
restricted to oxygen-tolerant FDHs and might be important for the protection of the M oco from
oxygen-induced degradation processes. Thus, both are necessary for maturation processes of FDH
and cofactor -insertion [48].
Until now , no cr y stal structure is available for Rc FDH. The above mentioned results were due to
spectroscopic [50] and kinetic studies [48][49]. For the here described theoretical studies, a
homology mod el [49] has been taken as a starting point. This has been built from the cr y stal
structures of E. coli (see section 5.2.1.).
2.2. Molybdenum -containing cofactor
Several cofactors are incorporated into the protein, such as F eS clusters, FMN and also the Moco.
Since this work focus' only on the Moco and surrounding protein residues, the other cofactors will
not be discussed further .
The Mo-containing cofactor , the Moco, consists of a metal (here mol ybdenum) ligated b y two
dithiolene moieties. T hese are each covalently attached to a 5, 6, 7, 8-tetrahydropterin ligand on one
side and a guanosine diphosphate (GDP) on the other side. Hence, the Moco is calle d bis-
moly bdopterin guanine dinucleotide (bis-MGD). In Figure 7 one possible Moco-structure is shown.
9

Figur e 7. Structure of the oxidized bis-MGD cofactor (Moco). X m eans potentially binding of S,
SH, O or OH ligands and Cy s-ligation might be optional, as discussed in the text.
The dithiolene moieties may be important for tuning the redoxpotential of Mo [51]: They are able to
adjust to the oxidation s tate of Mo (as the y have dif ferent oxidation s tates themselves, shown in
Figure 8). Thus, they might participate in the redox process at th e Mo-site. Furthermore, b y bendi ng
and thus changing their geometry , the y can also possibly inf luence the electronic env ironment of
moly bdenum. Stronger folding of dithiolenes with respect to the metal seemingly facilitates higher
electron donation potential in M o [51]. Therefore, for high oxidation-states of molybdenum such
strong distortions have been found [51].
Figur e 8. Dithiolene ligand and its dif ferent redox states (picture was inspired by Fig. 2 in [51])
The pterin ligands may also be important for modulating the redoxpotential of Mo [51] or for the e - -
transfer fro m the active site to the interior of the protein. This is due to the fact that they also exist
in different oxidation forms, as shown in Figure 9. The electron-deficient pter in ligand might
facilitate e lectron-withdrawal from the dithiolene ligands and ulti mately from molybdenum [51].
Furthermore, strong h y drogen-bonding interactions between pter in groups and the protein
environment have been observed for molybdo-enzymes [51]. These might further pla y an important
part for redox processes, coupling electron-transfer -pathwa y s with proton-translocations [51].
Finally , the protein-pterin interactions might be important for stabilization of the energeti cally less
ef ficient distorted trigonal prismatic sturcture of the Moco inside the binding-site [51].
Figur e 9. The opterin ligand and its dif ferent redox states (picture was inspired b y Fig. 4 in [51])
10

In addition, the p y ran ring linking the pterin group to the dith iolene ligands might exist in an ring-
open and closed form and might also be important for redox reactions [51].
MoVI has two additional possible ligation-sit es: In E. coli it has been found that one is occupied b y
a seleno-c y steine ( Se-Cys), while the last ligand is a sulfido or thiol-group [52]. Since no cry stal
structure is available for Rc FDH onl y assumptions (based on spectroscopic and kinetic studies of
Rc FDH and of other FDHs, since all carry similar motives) can be made of what addi tional ligands
may be present in both MoVI and MoIV -forms (see Figure 10). It has been suggested [48],[49],[50],
[52],[53],[54],[55] that in the active form of FDH, the oxidized mol y bdenum (MoVI) is ligated to
an additiona l native (Se-)C y s386 and a S*-ligand (see Figure 10, left and middle). Therefore, the
oxidized Moco w ith M oVI might be hexacoordinated, while the reduced form ma y be
pentacoordinated [49],[50] ,[52],[53],[55] and onl y carrying a S*-ligand and no C y s (see Figure 10,
right). This is controversial, since Maia et al. [54] also suggested based on EPR-results that the
native Cys is not leaving the ligation site upon Mo-reduction and instead sta y s coordinated to th e
metal in its MoIV -for m. The y also suggested the presence of a (at least temporar y ) SH-ligation at
the metal. Therefore, it remains unclear , whether Cys is ligated to the MoIV and what other ligand
may be present. For more details see section 3.3.). Another question is, w hether or not the s ubstrate
(formate) is directly binding to mol y bdenum which will be possible, if either the Cys or the S*-
ligand is leaving the coordinati on site [48],[49] ,[50],[52],[53],[54],[55],[56].
Furthermore, in its oxidized form also an inactive species of Rc FDH could be observed [50].
Spectral analysis [50] suggested that this form carries either an O-ligand or and OH-ligand at the
MoVI-site, instead of the sulfido-group and represents also a hexacoordinated MoVI-species.
Therefore, it is called the desulfo form (or presta te). This species has been observe d, upon
suppression of the FdsC gene [50].
Figur e 10. The Moco in Rc FDH and nearby conserved protein residue s C y s, His and Arg in FD Hs
in possible ligation scenarios of m ol ybde num (regard ing S*- and Cys-coordination of Mo): (le ft)
schematic representation of all-sulfur hexacoordinated MoVI; (m iddle) oxidized Rc FDH and (right)
reduced Rc FDH; in the latter two, the struc tures of the homology models of Rc FDH by Hartmann et
al. [49] are shown; also the conserved residues Arg587 a nd His387 are displayed: X denotes either
S, SH, O or OH-ligand (pictures inspired by Fig.1 in [53])
As s hown in Figure 10, three conserved protein residues, na mely Cys386, His387 and Ar g587 can
be found in direct vicinity of the Moco. These have been a ttributed differe nt functions [48]:
1 1

According to H artmann et al. [49] C ys is i mportant for the proton abstraction from for mate and thus
directly involved in the catal yt ic process. Therefore, it has to be detached and deprotonated
(normally in native environments C ys is protonated due to its high pKa) and both Arg and His pla y
important roles in this. His tunes the pKa-value of C y s, while Ar g s tabilizes its char g e via salt-
bridge formation. Arg is also important for correct inser tion and subsequent stabilization of the
substrate inside the bi nding pocket at the Moco. This has been found in m utation studies [49].
2.3. Possible m e chanistic pr operties at the active site
As can be seen, the structure of th e Moco is still area of discussion. For further information of
possible involvement of the Moco in the reaction cycle and potential mechanism s see section 3.3.
There, the underl yi ng motivation of the study of Moco-structure and its mechanistic properties, as
carried out in this work, are discussed in furthe r detail.
12

Chapter 3
Motivation for computational investigations of large
biomolecul es
Phytochromes, as w ell as FDHs have dif ferent areas of interest. For example, it is still not
completely clear how ph y t ochromes interconvert between their two parent states on an atomic level.
Intermediate s teps of the photocycle have to be clarified, as well as how changes of the
chromophore finally induce changes on a macroscopic scale. It is important to investigate the role
each re s idue play s in the photocycle, as well as possible interactions with the chromophore. For
this, computational s tudies combined with experiments are of importance. This includes the
development of new and improved methods for simulating the native environments. The s ame goes
for the stud y of FDH. Structure of the mol ybde num-containing cofactor , as w ell as mechanistic
properties of it and the involvement of surrounding residues His, Arg and Cys are still area of
discussion. They may be estim a ted with the help of co mputational m e thods, such as densit y
functional theor y (DFT). In the following chapter the motivations and concrete aims for the
individual projects of thi s thesis are shown and discussed.
3.1. Phytochr omes – Impr ovement of available methods
Classical molecular d y namic (MD) simulations or force field methods (for theory see section 4.1.)
provide a tool for getting access to conformational states and d y namics of macromolecules. It has
many advantages, l ike high simulation times at relative low computational costs which makes it a
widely used method to examine biological matter . On the minus side, possible problems can arise
from shortcomings of classic force field (f f) methods: In past s tudies [57],[58],[59],[60],[61],[62] ,
[63],[64] it has been indicated that the point-charge s as included into classic m olecular mechanic
(MM) MD-simulations are not able to describe th e water-net work inside biological probes
realistically . Th is was especially true for explicit water molecules around cofactors inside e.g.
proteins. A possible reason for this m ay be th e insuf ficient description of the charge-envi ronment
with point-char ges. In case of Cph1, the charges for individual protein atoms included in the used
force field are obtained from experiments and calculations of the apo- protein. Hence, in the absence
of the cofactor . Electrostatic effec ts of the chromophore on the surrounding protein residues and
vice versa are therefore unaccounted for .
MD-simulations of Bacteriorhodopsin (BR) as carried out in the past b y the group of Eichinger et
al. [57] and others [58] ,[59] showed a decreased stability of the structure of the water network
surrounding the cofactor . Furthermore, important hydrogen bonds, such as between a cry stal water
and the Schif f base are disrupted. This led to the format ion of a h ydroge n-bond between cofactor
and asparagine85 in MD-s tudies of Grudinin et al. [62]. Additiona lly , the complete water network
showed more fluctuations, as observed b y Kandt et al. [61]. In contrast to this, NMR-findings [60]
predicted a stabilized water - and h y drogen- bonding-network around the cofactor .
Kandt [61] and Grudinin [62], as well as Ha y a shi et al. [65] attributed the unstable water -network
and the high fluctuations of individual w ater molecules to the lack of polarization ef fects in the
applied force field. The simple point-char ges were not able to reproduce the native structure and
hydrogen-bonding network, especially c oncerning the Schif f base.
13

In addition to the development of polarizable potential functions by Rick et al. [66] and its
application by Hay ashi et al. [65], Babitzki et al. [67] addressed the instability of water m olecules in
BR via the application of a so-called polarized force field. H y brid quantu m mechanical/molecular
mechanical (QM/MM) methods have been used for the development of the latter . MD-simulations
of BR verified th e stabilization of hydrogen-bonding and water -networks, when using the new
polarized force field [67].
In studie s of Mroginski et al. [63] and unpublished studies by D aminelli et al. [64] similar
observations have been made concerning MD-simulations of the sensor y do main of Cph1. They
showed a deca y of the structural stability inside the chromophore binding pocket: T he p y rrole-water
did not s tay stable and lef t it s crystall ographic position during the simulation. H ydrogen-bonding
interactions between the cr y s tal water and the chromophore, as well as surrounding amino acids are
disrupted or lost. This might be, because the appl ied point-char ges for th e chromophore and
surrounding protein residues did not suf ficiently describe the real electronic interactions. These
findings are in contrast to cr y s tallographic studies [10].
Furthermore, snapshots of these traject ories were used for Raman-spectra co mputati ons [63]: The
water -fluctuations in the simulations led to a broadening of the calculated Ra man-spectra bands
[63]. Especially the bands connected to the d ynamics of the p y rrole-water were af fected. The NH-
in-plane (ip) rocking of the p yrrole-nitrogen at ring C of the chromophore is coupled to the p y r role-
water . The high mobility of the latter lead to a broadening of the corresponding band in the
computed Raman-spectra [63], while experiments show a sharp peak at 1570 cm -1 . This makes the
computed s pectra less comparable to experimental ones and thus less helpful in interpretation of the
latter .
Since the pyrrole-water might play an important role in the photoc ycle, its correct treatment in MD-
simulations, as well as Raman-spectra calculations is of high importance. A more rea listic tr eatment
of the charges would improve the comparability of theoretical data with experimental studies, such
as resonance Raman (RR) spectroscopy . Therefore, the instability of pyrrole-water has to be
addressed. A plausible solution is the incorporation of polarization ef fects of the cofactor on the
surrounding protein residues and vice versa via a polarized force field.
In this work several MM simulations of the photosensory domain of Cph1 have been carried out
with both classical M M force fields and also w ith a newly generated polarized force field for the
Cph1 PCB-binding pocket. Additionall y , Raman-spectra have been calculated for each type of
simulation and then compa red to experimental studies. This was done in order to investigate and
address the band-widening problem of the NH ip-rocking of ring C. This stud y will demonstrate a
better tr eatment of the charges in near vicinity of the chromophore, thus generating more realistic
and stable conditions during MD-simulations of Cph1. Additionall y , it will be displayed that such a
force field generates more trustworthy visualizations of the d ynamics inside the CBD and more
realistic computed Raman-spectra. This will be helpful in interpretation of experimental RR studies
of ph y tochromes. It will provide a helpful tool in future investigations of th e photocycle with a
combined theoretical/experimental spectroscopic approach. The theory and procedure for the
generation of the polarized force fie ld, as well as deta ils of the application are shown in sections
4.1.5. and 5.1.2. The results of these studies are shown in part III.
3.2. Phytochr omes – Determination of structural an d dyna mical inform ation.
Determination of the cr y s tal structure of bio molecules is alwa y s an i mportant part of understanding
the structure and function. But even with those results, the understanding of the m olecular structure
14

still has to be improved, since the resolution of X-ray cr y s tallograph y is often to low to visualize the
hydrogen (H) ato ms bound to the protein structure and possible cofactors. H atoms might pla y an
important role in protonation events, such as has been suggested in the photoc ycle (see section 1.3.
for further details) of phytochromes [23] ,[27],[32],[42]. Therefore, it is necessary to use theoretical
methods to determine the protonation states of the various amino acids and the chro mophore. This
becomes even more important in the case of titratable amino acids. These can either be protonated
or not, influenced by their immediate environment and possible interaction partners. Under
physiological ph-conditions most amino acids, which are not in close contact to cofactors, are
treated as char ged (H -acceptor groups are protonated, while donor s are deprotonated). It becomes
problematic for amino ac ids in near vicinity of a cofactor (such as chromophores like BV), since it
influences the pKa values of the surrounding titratable groups. The most difficult is the
determination of the protonation states of histidine residues (for they have two possible H-acceptor
sites, i.e. nitrogens (N), leading to three dif ferent protonation patterns: H on N δ , named d-
protonation ; H on N ε ,named e-protonation; or H's on both N's, named p-protonation; see Figure
1 1). Here, the correct protonation state has to be evaluated out of these three possibilities. In al l
phytochromes two histidi ne residues in near vicinity to the chromophore are conserved (in Agp2
they are named His248 and His278). It is assumed that they pla y a crucial role in s tabilization of the
chromophore in the CBD and , additionall y , may affect the conversion from Pfr to Pr and vice versa
via acceptance and donation of protons during the conversion [27]. In Agp2 these residues are
located directl y above the BV (His248) and in the case of His278 near the propionic side chain of
ring C (PSC(C)). T o investigate the ir role in the stabilizati on of the CBD it is important to find the
correct protonation state of both residues beforehand. It has recently been shown in the case of
Cph1 [27] and Agp1 [28] that dif ferent positioning of h ydroge n a toms on the corresponding His
might induce structural changes and i nstabilities on the chromophore and its protein surroundings.
Figur e 1 1. Structure of Agp2 chromophore BV w ith protonated propionic chain of ring C (left ) +
possible protonation site s of histidine side chain (right); based on crystal structure of Agp2 [16]
The co mputational investigation issuing the protona tion sta te of H is248 and His278 in Agp2 is
carried out via QM/MM computations (see section 5.1.3. for more insight s into the appl ied
methods) of nine different models. These are taking all possible combinations of protonations of the
two N’ s of both His into account: Namely , these are dd, de, dp, ed, ee, ep, pd, pe and pp (first letter
refers to protonation-state of His248, second le tter to His278). Subsequentl y , the computed models
have been compared with the X-Ray structure [16]. The recent discovered X-Ra y structure [16],
which is the first for Agp2 in its Pfr state (beforehand the only available crystal structure of a bathy
phytochrome in its Pfr state, was that of Pseudom onas Aeruginosa [1 1],[68],[69],[70]), is used as
reference structure in this stud y . The comparison of stabilities of the diffe rent co mputed models
15

compared to the X-ra y structure was the most critical testing para meter . It has been used together
with the s tability of the hydrogen-bonding-network (H-bond-network) inside the CBD and potential
hydrogen transfers fro m the histidines to the cofactor in order to the deter mine the most probable
protonation state of His248 and Hi s278. The results of these studies are shown in part IV , section 9.
These results provide a good starting-point for further investigations of the CBD of Agp2. The
propionic side-chains of BV ma y play an important role in the photo cycle and stabilization of the
cofactor in its pocket [24],[25]. This has been already investigated for BV -containing phytochrome
DrBphp and PCB-containing Cph1. Lagarias et al. [24],[25] s tudied the influence of amidation of
PSC’ s in the chromophores of named ph y tochromes on th e formation of stable parent states and the
photocy cle itself. T hey found [24],[25] that in DrBphp, the modification of PSC of ring B (B) led to
dif ferences in the Pr state (co mpared to the native Pr state) and multiple Pfr states. PSC(C)
amidation led to dual-Pr states and subsequently to dual-Pfr states. The y conc luded that P SC(B)
may be responsible to pKa-tuning of the cofactor and i ts surrounding residues. Further it seems to
play an important role in red-light sensing. PSC(C) de fines the width of the detected spect ral region.
Their studies of Cph1 led to similar conclusions concerning PSC(B), whereas modification of
PSC(C) inhibite d completely the formation of a stable Pfr -state.
These findings suggest that PSC(B) and PSC(C) pla y an important role in both s tructure and
dynamics of ph y tochromes. W ith the help of the correct protonation state of His248 and His278, the
influence of both PS C(B) and PSC(C) on the d ynamics inside the CBD of BV ha s been investigated
for Agp2. Different computational models, in which the PSC’ s have been meth y lated (either onl y
one of them or both) were computed. The impact of these structural changes on the stability of the
CBD (compared to dy namics of the native protein and to the crystal structure of Agp2) has been
evaluated and used to interpret the corresponding RR experiments [26]. The latter findings of
Hildebrandt et al. [26] suggest tha t one monomethylated variant of BV remains photoactive, while
the other leads to an abolished photocycle. These investigations will lead to a better understanding
of the roles of PSC’ s of BV in Agp2 and in ph y tochromes in general. Furthermore, the obt ained
results of these studies will provide helpful too ls for future investigation of the photocycle of
phytochromes. The results are shown and discussed in section 10.
3.3. Rc FDH – Structural and m echanistic questions
FDHs are of high interest for both industr y and environment. Since they are able to catalyze both
the oxidation of for mate and the reduction of carbon dioxide, the y provide the necess ary tools for
generation of C-building blocks out of gas eous CO 2 . For mate might function as a reduced carbon
(C)-source for further modifications and can also be used as a solid energy source. Simultaneously ,
the amount of CO 2 in the atmosphere could be reduced, helping to reduce the Greenhouse effect and
global warming. The key for application of FDHs and artificial enzymes w ith similar function for
these purposes, is the co mplete understanding of the reaction cycle of FDHs and its catal y tic
activity . Only then is it possible to create and tune the desired CO 2 -reduction activity . Prerequisite
for this, is the correct determ ination of the structure of involved cofactors.
Several studies have been carried out, addressing the correct ligation of the mol y bdenum-site of the
Moco in FDHs. The y co mmonly agre e in four ligation-sites being occupied b y two dithiolene
moieties (see Figure 10). But further assumptions diffe r fro m work to work. In X-Ray-studies on
FDH of E. coli ( Ec FDH), a Seleni um(Se)-C y s has been found to be ligated to the molybdenum [52],
while a sulfido- or thiol-ligand is the sixth ligand in the oxidized form of the Moco. Mota et al. [54]
suggested also a s ix-coordinated Mo in its oxidized for m (with Se-C y s- and S*-ligands), but only as
an inactive prestate which leads to a pentacoordinated active form, la cking the Se-Cys coordination.
16

Schrapers et al. [50] found in spectroscopic studies of Rc FDH two different oxidized species which
both showed hexacoordination (see Figure 10, left), w ith native Cys386-ligation and either S*-
ligand (acti ve) or possible O- or OH-ligation for the inactive species. Leim kühler et al. [48],[49]
followed this approach for Rc FDH. It is not completel y clear what ligand is present in addition to
the two di thiolenes in the reduced form of FDH. A sulfido- ligand as fifth lig and has been mostly
assumed (s ee Figure 10, right), but never completel y verified. It has been assumed that ma ybe the
(Se-)Cys-ligand is dissociating from Mo [49],[50],[52],[53],[55],[71]. The thus generated free
coordination-site might be occupied by the substrate formate [49],[50],[52],[53],[55],[71]. This is
currently s till under discussion. For example Maia et al. proposed a hexacoordinated MoIV with a
Cys-ligation in additi on to the formation of a SH-group ligated to the m ol ybdenum [54].
Figur e 12. Dif ferent cataly tic reaction mechanisms (left) proposal of Hartmann et al. [49] with
direct formate binding and H-abstraction by a di ssociated Cys-ligand; (right) proposal of Maia et al .
[54] with formate residing unbound in the second coordinati on sphere and proposed h y d ride
transfer from formate to sulfido ligand (the pictures were i nspired b y Fig. 7 i n [49] and Fig. 6 in
[54])
The disagreement of the correct ligation-sphere in molybdenum-containing FDHs has a huge impa ct
on the understanding of the catalytic c ycle, since it depends strongly on the correct positioning of
the ligands. Until now ther e have been several mechanistic proposals, some of which are presented
here:
Concerning a hexacoordinated MoVI as starting point in the catalytic c y cle, several studies assume
that the C y s (ligated to MoVI) leaves the metal center upon formate approach, while the
sulfido-/thiol-group s tay s at th e metal [50],[52],[71]. For mate then directl y ligates to the
moly bdenum-center . Here, the C ys386 moiety is mediating the H-abstraction fro m for mate and thus
the conversion of formate to carbon dioxide. Another stud y proposes the dissociation of the sulfido-
atom instead of the C y s- ligand from the MoVI site [56]. In both the lea ving group returns to the
metal after the cycle is complete. A second co mm on characteristic of both is that the sulfido-/thiol-
group is not involv ed in the oxidation of for mate or subsequent electron translocations. These
mechanism s are based on the cr ystal structure of the FDH of E. coli . The s econd reaction proposal
[56] has assumed to be false afte r revision and reinterpretation of the X-ray data [52].
A third possible reaction c ycle states that a hexacoordinated M oVI center is inactive and has to be
activated in a first step [53]. Here, the C y s386 is leaving the ligation site via temporary formation of
a disulfide bridge w ith the S*-ligand. This generates a free coordination position for for mate which
is directly binding to the Mo [53]. This is hence called the s ulfur -s hift mechanism [53].
17

T wo approa ches (for either Rc FDH [49] and Dd Fdh [54], respect ively) are currently discussed in
the scientific co mmunity and are of major interest: Namel y the ones from Hartmann et al. [49] and
from Maia et al. [54] (which is a refined version of the a lready know n mechanis m from Niks et al.
[72]). Both are shown as a schematic in Figure 12.
In the mechanism proposal fro m Hartmann et al. [49] the process starts with a hexacoordinated
MoVI-species (2 dithiolene, 1 C y s386 and 1 sulfido ligation). Upon the arrival of formate, Cys is
leaving the ligation site, while formate directl y binds to the metal. C-H bond cleavage is done by
Cys abstracting the H + from formate, while two electrons are transferred from the substrate to
MoVI, thus changing its oxidation state to MoIV . In a final step CO 2 is released, the proton further
translocated into the protein matrix and the two electrons directed to the FMN-site (most probably
via the FeS clusters), oxidizing MoIV back to MoV I. Simultaneously , C ys is returning to Mo and
the cycle can restart. Site-directed mutageneses studies [49] revealed the important role of the
His387 and Ar g587 residues which are in near vic inity of the cofactor (similar residue s are
conserved in all known FDHs, thus underli ning the importance of these for cataly tic activity): While
Cys386 is important for proton uptaking, it has to sta y deprotonated which under physiological pH
is nearly impossible (due to its high pKa). A higher ph-optimimum (ph=9) for the catalytic rea ction
in Rc FDH has been found [49], ensuring a deprotonated Cys. His387 further was proposed to
mediate the pKa of C y s386. Arg587 is directing and stabilizing the substrate to its correct position
at the metal center . Further it helps Cys386 to stay charged via for mation of a salt-bridge with it. In
the tested vari ants, in w hich the named re s idues have been subs tituted to a residue with differi ng
functionality at the side-chain (e.g. unpolar group), the enzym e looses its cataly tic activity [49].
The proposal from Maia et al . [54] goes a different wa y: The y also s tarted with a similarly
hexacoordinated Mo species. In their studies on Dd FDH, the y found that in the presence of cyanide
the enz yme becomes inactive. Their interpretation is that CN - for ms a thiocyanide group with the
sulfido-ligand. And since this renders the whole oxidation process obsolete, they concluded that the
S*-ligand has to be important in the c y cle and has to be directly involved. They further suggested
that on one hand, Cy s is not leaving i ts coordination position and play s no direct role in proton
release and uptake. Thus, formate is not directl y ligated to the metal center , onl y approaching and
stabilized b y an Ar g residue. A hydride ion is then abstracted b y the sulfido-group (MoV I becomes
MoIV). This leads to the release of CO 2 and the formation of a SH-ligand. After transferring the
electrons deeper into the protein, a proton is released and SH becomes a S-ligand, while MoIV
becomes MoVI again and the c ycle can start from the beginning. These findings they supported by
kinetic studies (which also suggest an ac tivation process of unknown nature [54]) and b y EPR
measurements [54] which show the presence of a coupl ing H + in the s econd ligation sphere of Mo.
Further they state that no experimental evidence, besides XAS data [50] is given that C y s is actuall y
leaving th e reacti on site. Neverthele ss, they do not exclude the possibility of a pentacoordinated
MoIV state and the involvement of a dissociated Cys in the transl ocation of a proton [54].
A third recent publication b y Robinson et al. [55] shows a combination of cer tain aspects of these
two mechanis ms: Starting with a hexacoordinated MoVI in Ec FDH the y proposed a direct
coordination of formate to the molybdenum-site after dissociation of c y steine, based on inhibitor-
studies and electrochemical experiments. Further more, they suggested that the sulfido-ligand
abstracts either a proton or a h ydri de from bound formate, thus reducing MoV I to MoIV and
releasing CO 2 . T emporarily a SH-ligand is formed which re turns back to the sulfido-for m after
releasing a proton into the protein m atrix and restart of the cycle.
All these mechanistic suggestions have reasonable ideas, underlined by exper imental evidences.
Unfortunately the y differ often in important key aspects, such as the release of Cys and the direct
18

ligation of formate. Also the proton abstraction from for mate and the electron transfer from formate
to molybdenum is not consistent in all named publications. The question remains which one of
these proposed reaction cycles is correct or whether it is a combination of dif ferent elements of
several of these mechanisms. Furthermore, these proposals lack some exclusive things: For
example, the mechanism from Hartmann et al. [49] does not include the EPR findings, suggesting
the h ydrogen in the second coordination sphere. In the Maia-mechanism [54], a controversially
issue is the inhibition by CN - . The y attributed it to the formation of SCN - with the S*-ligand. This
might be wrong. CN - is known to be a very strong ligand with a strong ligand field splitting
(creating low spin complexes b y formation of π-backbinding) and it is possible that CN - simply
initiates the leaving of all ligands from Mo and exchanges the m, because i t forms the more stable
complex [MoVI(CN) 6 ] 0 which would also inactivate the enzyme. Another point is the C y s-
dissociation fro m the metal observed in known cr y s tal structures of dif ferent FDHs in the reduced
state [52],[53]. There, C y s is dissociated fro m MoIV . Additionally , the EPR-findings of an
intermediate MoV -species (which would suggest a sequential electron release/uptake in contrast to
an immediate release/ uptake of two electrons) are not taken into account in all these reaction cy cles.
Both, the ligation s phere of mol ybdenum at the dif ferent reaction-steps and th e mechanism of FDHs
are still not completely clear . Therefore, quantum chemical studies, followed by Raman-spectra
computations and comparison to experimental resonance Raman data (b y W ahelfeld et al. [73])
have been carried out in this work. Theoretical models for these studies have been built and
verified. F urthermore, diffe rent possible mechanistic features, like C ys-dissociation, SH-formation
and formate-ligation, have been investigated in great detail. The deter mination of the complete
ligation sphere of the Moco at dif ferent points in the catalytic c ycle together with ener gy-
computations will help to generate a new and m or e complete understanding of the catalyt ic
oxidation of for mate to carbon dioxide. Thi s will help cross ing the gap between current mechanistic
understandings and provide helpful tools for future s tudies. The results of these studies are s hown in
part V .
19

Part II.
Material s and Methods
20

Chapter 4
Theor etical backgr ound
Theoretical/Com putational methods are a necessar y counterpoint to experiments. Many
experimental observations cannot be interpreted completely , since information about events on the
atomic and molecular s cale is missing. Com putational investigations provide helpful tools to
understand experiments, such as measured spectra and to predict structures and reaction
mechanism s. Additionall y , the y can shed li ght on d y n amical properties and structural changes inside
the investigated matter .
As s tated in the sections before, there are many fields of application for theoretical studies. In this
work d y n amical propert ies of Cph1 have been investigated. Additionally , missing structural
parameters of both Agp2 and Rc FDH have also been evaluated. Finally , for both latter s y stems
mechanistic properties have been determined. For all these areas ver y different computational
methods have been used, for which the theoretical background will be discussed in the following
section. The focus will be mainl y on the methods applied to above mentioned problems. While
other methods may be available, they will only be mentioned.

4.1. Molecular Dyna m i cs
Molecular d y na mic (MD) simulations are a useful tool to investigate mol ecular s y stems in
structural and d ynamic terms at an atomic resolution. In these simulations, a trajector y will be
generated over time, in which the time-evolution of velocities and positions of ever y atom and
molecule is monitored in the context of its confor mational s pace. Molecular dynamics are able to
provide equilibrium geometries and transition states, as well as relative energies of conformers and
potential ener gy surfaces. T o get the most rea listic expression for the investigated s y stems, the time-
dependent Schrödinger equation [74] has to be evaluated for ever y a tom at any given tim e-step.
Since this is unfortunately onl y possible for ver y small s y stems, due to current comput er power ,
dif ferent approa ches are necessary to simulate dif ferent sizes of s y s tems. MD-simulations are of
special interest for the s imulation of confor mational space for liquids [75],[76],[77], periodic solids
and surfaces [78],[79],[80], as well as biomolecular sy stems [81],[82],[83] (which are the area of
interest in this work). These cannot be treated co mpletely with quantum mechanics, since the y are
too lar ge and too computationally costl y . MD-simulations are the method of choice for these.
Several methods exist and are prerequisite to carry out MD-simulations which will be discussed in
the following.
4.1.1. Molecular Mechanics – Classical for ce fields
The simulation of large biomolecular s ystems, such as proteins including cofactors, via molecular
dynamics is achieved by a simplificaton of matter . Classical force fields regard atoms as partially
char ged unflexible spheres. Classic Newton mechanics are app lied to these. Covalent bonds
between atoms are treated as classical springs, whereas electrons are completel y omitted. These
simplifications lead to the following analytical expression for the potential ener g y of the
investigated molecular sy stem as depicted in equation (eq.) (1) which is called a force fie ld [84]:
21

V = V bonded + V non − bonded

(1)

Here V is a function of posi tions r(N) of all nuclei.
Dif ferent terms attribute to the bonded interactions V bonde d [84], such as bond stretching V bond , angle
bending V angle , bond rotation (torsion) V torsion and out of plane moveme nts (improper torsion) V improper .
Further , non-bonded interactions V non-bonde d are divided into electrostatic interactions (Coulomb
potentials) V electros tatic and van der W aals (vdW) (London) interactions V vdW [84],
Motions corresponding to bonds, angles and torsions, as well as vdW and electrostatic in teractions
are schematically ill ustrated in Figure 13.
Figur e 13. Schematic representation of dif ferent inter- a nd intramolecular motions (bonds, angle s,
torsions, vdW and electrostatic interactions).
One possible ener g y expression for the potential ener gy is the CHARMM(28) [85],[84] force field,
as shown in eq. (4) which has been used for all MD-sim ulations carried out in this work:
Here K (b,θ,χ,Φ) denotes the respective force constants for bonds ( b), bond angles ( θ), dihedral angles
(χ), and im proper dihedral angles (Φ), whereas b 0 , θ 0 , χ 0 and Φ 0 are the respective values at the
equilibrium.
22

(3)
V non − bonded = V vdW + V electros tati c

(2)
V bonded = V bond + V angl e + V torsion + V improper

(4)
V total = ∑
bonds
K b ( b − b 0 ) 2 + ∑
angles
K θ ( θ − θ 0 ) 2 + ∑
torsions
K χ ( 1 + cos ( n χ − δ ) ) + ∑
improper
K ϕ ( ϕ − ϕ 0 ) 2
+ ∑
non − bonded
q i q j
4 π ϵ 0 r ij
+ ϵ ij
[
(
R min , ij
r ij
)
12
− 2
(
R min , ij
r ij
)
6
]

Both bond s tretching and angle bending (as well as impropers) are given b y a T aylor series around
the equilibrium bond length b 0 (or equi librium angle width θ 0 or width of i mproper Φ 0 ), following
Hooke's L aw . In both, the corresponding ener g y is assumed to increase quadratically (following a
harmonic potential) with displacement fro m the equilibri um value. In contrast to this, the energy
expression of the torsional motion is approximated b y a Fourier series. In this way , the periodic
nature of the torsion is captured. Some steric and electrostatic interactions between neighboring
atoms can thus be mimicked (corresponding parameters are coupled to non-bonded ones, see next
paragraph).
Electrostatic and vdW interactions (non-bonded) are described as a sum of Lennard-Jones and
Coulomb terms. The first depends on the parameters ε ij (depth of energy well) and R minij ( minimum
ener g y dista nce), for the ato m pair i and j and the distance r ij between them. The Coulomb
interactions are calculated between pairs of atoms at a relative distance r ij , containing partial charge s
q i and q j and depending on the dielectric constant of vacuum ε 0 .
The det ermination of the para meters define the accur acy and possible applicati ons of a given force
field [84]. It can be done in several w ays, such as fitting to experimental values and expl icit
treatment of important parts of the s ystem with quantum chemical methods. A given force field is
limited to a finite num ber of s y stems, for which the parameters are derived for specifically [84].
Therefore, for dif ferent s ystem s , differe nt force fields have to be generated.
4.1.2. Molecular dynamics algorithms (Equations of m o tion)
The in equation (4) depicted expression for the potential is used to generate th e force F(r(t) of any
atom at ea ch given time s tep as shown in (6). This is done b y diffe rentiating the potential as
computed b y the force field dep icted in (4) over space. Numerically integrating Newton's equations
of motion as shown in (5), in which F is the force acting on an atom and m its mass and a its
acceleration then y ie lds the molecular dynamics trajectory [86] .
In the case of a 3N particle s y ste m this leads to a set of 3N coupled second-order dif feren tial
equations of the form shown in (5) [86]. The force of one atom at position r(t) of the s y s tem is a
function of all other atom s.
There exist dif feren t algorithms for solving the differential equations described above. T he so-called
V elocity-V erlet algorithm [87],[86] is the one of choice in this work and the one most oft en applied
in MD-simulations:
An initial guess for velocities v(0) at the beginning of the MD by a Boltzmann distribution [86] and
starting coordinates r (0) (commonly obtained by crystallographic structures or NMR measurements,
as well as ho mology m odels) are the prerequisites [86]. The timestep Δt has to be chosen one order
of magnitude s maller th an the fastest motion (which is the X-H stretching motion with 10 -14 s) and
thus be 1 fs [86]. T o achieve a bigger Δt of 2 fs (and thus reduce the necessary integration steps)
23

(6)
F ( r ( t ) ) = − ∂ V ( r )
∂ r

(5)
F = m ⋅ a ;
(
∂ 2 r
∂ t 2
)
= 1
m F ( r ( t ) )

further prerequisites have to be used, as explained in s ection 4.1.6. Finall y , an appropriate
expression for r(t) and v(t) at each given timestep for each atom is necessary for generation of a
MD-trajectory [87],[86]:
In the framework of the V elocity - V erlet algori thm [87] these are described in equations (7) and (8):
The calculation of a molecular d ynamics trajectory is then described in the following way , in which
the necessary parts are computed for every atom at each tim e:
1. Starting point with r(0), v(0) and a(0).
2. Computation of r(t+Δt) using (7).
3. Evaluation of the potential V(r) as shown in (4).
4. W ith V(r) the force acting on the atoms F(r(t+Δt)) can be obtained using (6) and also the
acceleration a(t+Δt).
5. Finally the new velocities v(t+Δt) are obta ined according to (8).
This cycle is repeated until the i nitially defined number of time steps is reached. T he main advantage
of this algorithm is that it is able t o give positions and velocities at the sam e time [87].
4.1.3. Constant tempe ratur e/ constant pr essur e dynamics
Biological s y s tems (which are the s ubject of s tudy in this w ork) mostl y appear under temperatures
of 300 K and a pressure of 1 atm. Therefore, it is important to maintain this conditions during the
simulation. The volume is allowed to change during the simulation. Thus a so-called NPT ensemble
is generated [86]. This means that the number of particles (N ), pressure (P) and temperature (T) are
kept constant and therefore the s y s tem is both isobaric and isothermic. T o achieve this, a thermostat
and a barostat are applied as described below [86]:
Constant temperatur e
The kinetic energy E kin of a N-bod y system is coupled to its t emperature [86] as given in (9), where
N A is the A vogadro number , K B the Boltzmann constant and T(t) the temperature at any given time.
Since the kinetic ener gy is related to the velocit y (eq. (9)), also v(t) and T(t) are proportional to each
other and v(t) can be used for controlling the temperature of the s y s tem. This is done b y scaling of
24

(7)
r ( t + Δ t ) = r ( t ) + Δ t ⋅ v ( t ) + Δ t 2 F ( r ( t ) )
2m

(8)
v ( t + Δ t ) = v ( t ) + Δ t [ F ( r ( t + Δ t ) ) + F ( r ( t ))]
2m

(9)
E kin = 2
3 N A K B T ( t ) = 1
2 mv ( t ) 2

v(t) with a factor λ which is proportional to the te mperature (both desired T desired = T bath and the
current one T current = T(t)) as shown in (10) [86]:
This T -dependent factor can be generated b y coupling an ex ternal heat bath (which has th e desired
temperature T desired ) to the sy stem with the coupling parameter τ as shown by Berendsen [88],[89]:
In eq. (1 1) the corresponding expression for λ is given.
τ is defined as the step-size, in which the temperature is modified and adjusted from the current one
to the desired one T bat h and should be chosen according to eq. (12), so that for a timestep of 1 fs a
value of 0.4 ps is used.
After T(t) equals the desired T bath , i t is constant at this value during t he simulation.
Constant pr essure
Maintenance of pressure can be achieved by cha nging of the volume of a given s y s tem [86]:
This is due to the fact that the volume is related to its isother mal co mpressibili t y κ, as shown in
(13):
A constant pressure can be obtained b y scaling of the volume (represented by the posit ions of the
atoms r(t)) with a factor λ [86]. In (14) the eva luation of the new volume r s (t) via relation between λ
and r(t) is depicted.
λ can be gained by coupling of the s ystem to an external pressure bath (15) [86],[88] (a commonly
known way to couple a tem perature bath to a pressure bath is the Nose-Hoover thermostat [87]).
25

(1 1)
λ =
√
1 + Δ t
τ
(
T bath
T ( t ) − 1
)

(12)
τ → Δ t
τ ≈ 0.0025 → Δ t = 1 fs ; τ = 0.4 ps

(13)
κ = − 1
V
(
∂ V
∂ P
)
T

(14)
r s ( t ) = λ
1
3 r ( t )

(15)
λ = 1 − κ Δ t
τ P
(
P ( t )
P bath
)

(10)
v s ( t ) = λ ⋅ v ( t ) → λ ∝
√
T desired
T current

Here τ p is the corresponding coupling constant which in a similar way as explained in eq. (12)
represents the step-size, in which the current pressure P(t) is adjusted to the desired one P bath . It is
chosen according to (16).
4.1.4. Boundary condi tions
Another important aspect of MD-s imulations is the correct treatment of boundaries. In an y s y s tem,
the outer regions near the edges would feel dif ferent forces and interactions with surrounding atoms
than the ones in the bulk. In native environ ments biological systems are much larger and also
strongly interconnected. If onl y a single part of these macroscopic s yst ems is to be simulated, it has
to be trunc ated in its size (to reduce co mputational costs) and thus neglect the interacti ons of the
atoms at the edges with potential neighbors. Th is might lead to wrong forces and wrong dynamics
of the corresponding atoms. Therefore, a proper treatment of these areas has to be considered, so
that every atom is treated as if residi ng in the cen ter of the s y ste m. In the following the periodic
boundary conditions [90] and stochastic boundary [91],[92],[93] concepts are described in more
detail, since they have been used in this work.
Periodic boundary conditi ons (PBC)
PBC [90] can be used to reduce the number of artifacts, resulting fro m the interaction of the
particles w ith the boundar y region of the sy stem: T ranslation operations are used to replicate the
simulation cell in ea ch direction. This gener ates a periodic system. The shape of the periodic cell
can be e.g. cubic or hexagona l (as used in this w ork). Since in the image cells positions and
velocities of the particles are not evaluated b y solving dif fer ential equations (6) no computational
costs do arise from these. In Figure 14a the gene ral principle of PBC is shown.
Figur e 14a. Schematic representation of peri odic boundar y condit ions. The cell in the middle is
repeated in each direction in 3D space. A cutof f-radius is chosen for vdW interactions.
26

(16)
(
∂ P
∂ t
)
= P bath − P ( t )
τ P

T o guarantee that each ato m sees onl y one version of the others and avoid its int eraction with itself,
a cutoff is chosen for the non-bonded interactions. This represents a radius, outside which all non-
bonded interactions can be truncated and are not calculated. This follows th e minimum image
convention [94]. Such an approach reduces the necessary computational costs signif icantly .
Additionally , the Particle Mesh Ewald [95],[96],[97] (PME) method is used to evaluate the
electrostatic interactions in the chosen sy stem with PBC. This is due to the fact that onl y fast-
decaying interactions, such as van der W aals forces (deca y s with r -6 ) can be treated effectively with
truncation methods. Long-range interactions, such as coulombic ones (decay i ng onl y with 1/r) need
to be treated differently , as shown by Ewald [95] who split 1/r into two fast-decaying terms. Simply
spoken, in PM E, the Coulomb interactions are co mputed b y s mearing the electronic char ges of the
sy stem on a grid [96],[97].
Stochastic boundar y conditions
There exist also other methods to treat boundaries, suc h as stochastic boundar y conditions (SBC)
[91],[92] and the generalized solvent boundar y potential (GSBP) [93] . In both onl y th e solvent in
direct vicinity of the biological s ystem is treated explicitl y , while the solvent at higher distances is
approximated b y a dielectric constant or a repulsi ve force. While stochastic boundary conditions
will be discussed in m o re detail, GSBP will not.
SBC is an alternative to PBC, in w hich the explicit treatment of solvent molecules is partly avoided
[91],[92]: In man y cases, only a small part of the investigated sy stem is of interest. This can be for
example an active center of an enz ym e or a cofactor-bi nding region in proteins. Thus onl y a s mall
amount of the protein has to be explicitly s olvated, while the rest of the s olvent can be treated in
less costly wa y s, such as mimicking by a boundar y potential F b (r i ) which is felt b y every atom at
positions r i [91],[92]. As depicted in Figure 14b, the sy s tem is spherically shaped and further divided
into three parts (treated with different theoretical approaches), depending on the importance of the
contained atoms [91],[92]: I reaction region, II boundary/bath regi on and III reservoir region:
Figur e 14b. Schematic representation of stochastic boundary conditions. The s y s tem is divided into
three parts: (I) Reaction region, (II) Boundary /Bath region and (III) Revervoir region which are
treated dif ferent with respect to levels of theory .
27

Reaction r egion
This region contains sites of interest, like active centers, cofactors and certain interesting
residues/molecules. Therefore, all atoms (including solvent) are free to move and treated explicitly .
The level of theor y is depending on th e t y pe of computation carr ied out. In classical MD-
simulations, all atoms in this region are treated according to Newton d y namics. T o replace the pair -
interactions with the solvent in the reservior region, the atoms in the reaction region are feeli ng an
additional force F add (r i ) as depicted in e q. (17).
In hybrid QM/MM calculations (s. section 4.3. for more detailed information about this method)
this region is treated at a quantum chem ical level.
Boundary r egion
This region serves as a buffer area between I and III, in which all atom s (solute and solvent) are
treated explicitl y according to Langevin d y n amics. The atoms in this reg ion are feeling an
additional force F add (r i ) as depicted in (18).

Here, γ i ν i is the so-called frictional force (with γ i the friction coefficient) which is attributing for the
solvent-induced ' slow-down' the atoms are feeling. R i (t) is a random force driving the thermal
motions of th e atoms and represents the collision-effect solvent molecules in the reservoir region
would have on particles in this area. While the frictional force is decelerating, the random force is
stimulating the particles in this region. The latter is a G aussian distribution centered at zero. Its
width is defined b y the temperature. The inf luence of these two additional forces is also visible in
the Langevin-dynamics, as depicte d in equation (19).
Both γ i ν i and R i (t) are obeying to the fluctuation-disspiation t heorem (20).
Here, K B is the Boltzmann constant and T t he temperature.
In this wa y the two additional forces are tuning the temperature of the boundar y region. Thus this
region serves either as a temperature sink or source for the ener g y in the reacti on region and acts as
a temperature bath. This preserves the equilibrium structure and reaction region in general for
unnatural fluctuations.
28

(17)
F add ( r i ) = F b ( r i ) ; r i ⩽ r reaction

(18)
F add ( r i ) = F b ( r i ) − γ i v i + R i ( t ) ; r reaction ⩽ r i ⩽ r boundary

(19)
∂ 2 r i ( t )
∂ t 2 = F i ( r i ( t ) )
m i
+ R i ( t )
m i
− γ i
∂ r i ( t )
∂ t

(20)
∫ 〈 R i ( t ) ⋅ R i ( 0 ) 〉 dt = 6 K B T γ i

Reservoir re gion
Solvent is not treated expl icitly in this region. In classical MD-simulation, us ing s tochastic
boundary conditions, solvation ef fects are mimicked b y a dielectric constant. A repulsive force
F b L (r i ) is contributing to F add (eq. (21)) and is responsible for keeping the particles inside the inner
sphere (reaction and boundary regions).
STM are often applied to QM/MM computations. Thereby , par t of the s olute often reach into this
region. All atoms here in are kept fixed. This is often according to the GS PB formalism [93] as a
special case of stochastic boundar y conditions which is not discussed here. Furthermore, no solvent
is present outside a s phere of explicit water and the solute re s ides in vacuum. The force F b (r i ) is
constituted b y the partial char ges of the solute. For further information about the used QM/MM
methods in this work see secti ons 5.1.5 and 5.2.4.
4.1.5. Polarized for c e fields
In contrast to classical force fields, in w hich only point charge s (derived fro m experiments) are used
for investigated s y s tems, a polarized force field provides a more realistic approach [67]. In the real
sy stem, the charges of molecules (e.g. cofactor) induce changes in the electrostatic field of its
environment (e.g. amino-acids) and thus influence surrounding charges. In the same way , they are
also under influence of the neighboring charge-environment. This char g e-field is constantly
changing, as the molecules and atoms move around in space. Polarization and depolarization ef fects
are strongl y changing over the course of the trajectory [67] (for more details and refer ences on the
importance of polarized force fields, plea se see sections 3.1. and 6. and 7.).
In a classic MD-simulati on this is not taken into account: The charges rem ain constant for the whole
simulation. Also the charge s of the amino-acid residues are often derived without taking an y
possible present cofactors and their influence on the charges into account. One approach to ach ieve
a more realistic description of electrostatics is to employ so-called polarized charges for at least the
region of most interest (e.g. active center of an enzyme) [67]. These charge s are represented by
Mulliken char ges [98] as discussed in section 4.2.4. The region of interest is simul ated with QM
methods, yielding more accurate charges due to the incorporation of electronic influences [67].
Afterward these new derived char ges can then be incorporated into the classic CHARMM force
field [85] (as shown in eq. (4)) and used for subsequent MD-si mulati ons. Unfortunately , these
char ges cannot adapt to changes in the electrostatic environment during the course of the
simulation. Nevertheless polarized charges represent an improvement compared to the classic MM
approach, because influences of cofactors on ami no-acids and vice versa can be accounted for [67].
4.1.6. Constrains and r e strains of MD-simulations
In the case that e.g. some part of the s y stem has not been parametrized properl y or because of other
reasons, it might be of importance to restrain or fix certain parts of the investigated sy stem [99],
[100]. In the latter case, the corresponding atoms remain fixed on their initial positions during the
whole simulation by application of so-called constraints. One algorithm commonly used for
internally constraining m olecules is the SHAKE algorithm [86],[99]. In a s imilar way , X-H
stretching motions can be constrained. This w ill le ad to th e possibility of an increased ti mestep of 2
29

(21)
F add ( r i ) = F b
L ( r i ) − γ i v i + R i ( t ) ; r i ⩾ r boundary

fs and thus a reduction of computati onal costs and speeding up of the MD-simulation [86],[99].
Investigating biological matter often includes the solvation in huge a mounts of w ater . T o increase
the speed of the s imulation special potentials for water have been developed. In TIP3P [101] the O-
H bonds, as well as the H-O-H angle are kept fixed to constant valu es. Thus water molecules can
only move as rigid bodies. This potential for wa ter (TIP3P [101]) is commonly used, when
simulating biological sy stems.
Furthermore, it is possible to appl y an additional force to an atom, such that it is less free in its
movement. These is called restraining of atoms. Restrains are often used in pre-simulation
adjustment of temperature to 300 and pressure as shown in section 4.1.3. Herein, the restraints are
slowly reduced, until the corresponding ato ms are completely free. This ma y be necessary to reduce
the risk of sudden and unrealistic increases in the velocities (due to the increase in temperature) and
thus the rise of artifacts.
Thus, constraints and restraints help to reduce computational costs, speed up the simulation, reduce
artifacts and to overcome missing paramet ers.
4.1.6. Determination o f structural pr operties by MD-observables
Several properties can be subsequently calc ulated from the evaluation of the positions of every atom
at each time-step (MD-traject or y ).
Radius of gyration (RGYR)
The RGYR is the dive r gence of an atom at r k (t) from the mean structure r mean
It is used for proteins to visualize and quantify the global movements of backbone atoms (i.e.
folding or unfolding motions). It represent s a “diameter” of the protein.
Root mean squar e deviation (RMSD)
The RMSD is another important propert y of the s ystem which can be co mputed out of a MD-
trajectory [102],[103],[104],[105],[106]. The basic principle is ver y similar to the one of the RGYR.
It is the diver gence of an atom at position r j (t i ) from a reference structure [102],[103],[104],[105],
[106]. This can be the starting s tructure or a cr ystal model. Before the RMSD can be calculated, the
molecular structure has to be aligned with the reference structure to exclude rotation- and
translation-contributions.
It can be used to determine, how much the comput ed structure deviates from the reference.
30

(22)
RGYR = 1
N ∑
i = 1
N
( r k ( t i ) − r mean )

(23)
RMSD ( t i ) = 1
N ∑
i = 1
N
( r j ( t i ) − 
r j )

Root mean squar e fluctuations (RMSF)
It is computed again as a deviation between actual and reference structure [106]. The RMSF , in
contrast to the RMSD, uses the s tructure averaged over all timesteps of the trajectory as the
reference structure.
This value accounts for the flexibility of the corre sponding parts of the s y s tem.
Further observables and properties of a MD-trajectory will be discussed in more detail in the results
in parts III and IV of this work.
4.2. Quantum chem i cal methods
Quantum chemical me thods provide a way to obtain a realistic description of the electronic
structure of molecules. Th is is done in general by solving the Schrödinger equation following the
Born-Oppenheimer approximation [107]. In this, nuclear-nucl ear interactions are treated as
constant, due to a separation of nuclear and electronic motions. This is valid, because electrons
move faster than the atom cores and it is assumed that from the point of view of the electrons, the
nuclei remain fixed. This leads to an electronic Schrödinger equation which can be solved in
dif ferent wa y s. Either ab initio methods can be used, further divided into wavefunction-based
methods and the ones based on the elctronic density , like DFT [108],[109],[1 10] ,[1 1 1],[1 12],[ 1 13] ,
[1 14],[1 15],[1 16] . There exist also semiem pirical methods, in which the above named methods are
approximated with empirical data, so th at computational costs can be saved. One example for the
latter , is the SCC-DFTB method [1 17]. The application of these methods yield differe nt properties
of the s ystem, like Mulliken char g es and electronic densit y (in the case of DFT) and the appropriate
wavefunction of the s y stem. These will be further discussed in the following chapter . Since in this
work onl y DFT -based methods, as well as SCCDFTB, have been used, no wavefunvtion-based
techniques are discussed here.
4.2.1. Density Functiona l Theory (DFT)
The basic idea of the density functional theory is th at each molecular s ystem contains an electronic
density [1 16] . This is calculated as shown in eq. (25) as the integral over all electronic coordin ates
in the investigated space [1 16]:
Here Ψ is the wavefunction of the s y s tem and χ are the spatial spin-coordinates. ρ(r) is the
electronic density which defines the probability to find any electron with exchangeable spin within
the corresponding space, whe n all other ele ctrons can be an y where. Therefore, in tegration of ρ(r)
will lead to the total number of elect rons N of the s ystem. And the densit y becomes zero, when r is
approaching infinity (26).
31

(24)
RMSF ( j )=
√
1
T ∑
i = 1
T
( r j ( t i ) − 〈 r j 〉 ) 2

(25)
ρ ( r ) = N ∫ ... ∫ ∣Ψ ( r , x 1 , x 2 , ... , x N ) ∣ 2 dx 1 , dx 2 , ... , dx N

The most important property of ρ(r) is that it is exper imentally obs ervable, in contrast to the
wavefunction. This makes the use of DFT more intuitive and more easil y comparable to
experiments.
4.2.2. Hohenberg and Kohn
The development of the DFT m ethod started w ith the proposals of the two funda mental theorems by
Hohenber g and Kohn [108],[1 16]:
1. HK theor em:
Every observable quant ity of a stationar y quantum mechanical s yst em is completely determined b y
the ground-state electron densit y ρ 0 (for example the ground-state total energy and the
wavefunction).
The total electronic ener g y of the ground-state E 0 is defined as a sum of the kinetic ener gy T and the
electron-electron interaction E ee plus the electron-nuclear interaction E Ne which all onl y depend on
the electronic density (27):
In contrast to E Ne ,, computed as an integral over the electronic densi ty ρ 0 and the potential V Ne , the
kinetic ener g y of interacting electrons, as well as the electron-electron interactions are unknow n
[1 16]. These two ener gies for m the so-called Hohenber g-Kohn functional F HK , as stated in eq. (28).
The electron-electron ter m is further divided into a classical Coulomb interaction E H and a non-
classical E ncl (consisting of electronic exchange E X and correla tion E C ):
If F HK would be known exactly , the Schrödinger equation could be solved exactly for all molecular
sy stems. This is due to the fact that F HK is not unique for a given sy s tem, but a universal functional
[1 16]. F inding F HK i s the major challenge of m odern DFT -development.
32

(27)
E 0 [ ρ 0 ] = T [ ρ 0 ] + E ee [ ρ 0 ] + E Ne [ ρ 0 ]
E 0 [ ρ 0 ] = ∫ ρ 0 ( r ) V Ne dr + T [ ρ 0 ] + E ee [ ρ 0 ]

(28)
F HK [ ρ 0 ] = T [ ρ 0 ] + E ee [ ρ 0 ]
F HK [ ρ 0 ] = T [ ρ 0 ] + 1
2 ∬ ρ ( r 1 ) ρ ( r 2 )
r 12
dr 1 dr 2 + E ncl [ ρ 0 ]

(29)
E ncl [ ρ 0 ] = E x + E c

(26)
∫ ρ ( r ) dr = N ; ρ ( r → ∞ ) = 0

2. HK theor em (varational principle)
The ground-state densit y can be calculated, in principle exactly , using the varational principle,
involving only the density (Ray leigh-Ritz-principle) [109],[108],[1 16]:
The second theorem which is shown in equation (30) states that with the help of F HK an y given trial
density would yield an ener g y which is higher than or equal to th e exac t ground-state ener gy .
Provided that the correct density and F H K of the true ground-state would be known, the exact ener gy
could be calculated [108],[109],[1 16].
In summary it can be s aid that knowledge of the electronic kinetic ener g y of interacting electrons
and the exchange and correlation energies would allow the determination of the correct ground-state
and all its properties, including its ener gy (31).
But since only E Ne is known, a way to approximate or circumvent F H K has to be found, done b y
Kohn and Sham [1 10].
4.2.3. Kohn and Sham
Since the kinetic ener g y of non-interacting electrons is known (32), Kohn and Sham introduced a
fictitious non-interacting sy s tem that has the same density as the real one [1 10]:
Whereas Hohenber g and Kohn's approach has been orbital-free [108],[1 16] , Kohn and Sham
proposed an orbital-based approach, in which the orbitals Φ i do not interact with each other [1 10] ,
[1 16]:
A non-interacting densit y ρ(r) (33) is generated as a linear combination of one-electron orbitals
(33), named Kohn-Sham orbitals φ i .
Thus, the ener gy expression can be changed into (34) which is based on non-interacting electrons
and orbitals:
Here T KS is the kinetic ener g y of the non-interacting K ohn-Sham electrons and is dependent on the
33

(30)
E 0 ⩽ E [  ρ ] = E Ne [  ρ ] + F H K [  ρ ]

(31)
E [ ρ ] = T [ ρ ] + E Ne [ ρ ] + E H [ ρ ] + E x [ ρ ] + E c [ ρ ]

(32)
T KS [ ρ ] = − 1
2 〈 ϕ i ∣∇ 2
∣ ϕ i 〉

(34)
E [ ρ ] = T KS [ ρ ] + E Ne [ ρ ] + E XC [ ρ ] + ( T [ ρ ] − T KS [ ρ ] )

(33)
ρ ( r )= ∑
i = 1
N
∣ ϕ i ( r ) ∣ 2

Kohn-Sham orbit als φ i . The unknown terms E XC and the dif ference between the kinetic ener g y of
interacting electrons and T KS (T(ρ)-T KS (ρ)) are further combined into the exchange-correlation
functional (35).
This functional generates the so-called exc hange-correlation potential V XC (eq. (36)).
The one-electron ener gies ε i , as well as the corresponding orbitals Φ i are obta ined b y s olving the
one-electron Kohn-Sham equations, as stated in e q. (37) [1 10]:
Since starting values for φ i are necessar y to solve this set of one-electron equations as well as for
defining the Kohn-Sham operator ƒ KS , the y have to be solved iterativel y and form a self-consistent
field procedure. In this, an initial guess of the orbitals leads to an initial iteration with resulting
ener gies and new orbital coef ficients. These are then used for further iterations and calculations of
ƒ KS , until a desired threshold is reached and the ener g y and coeffi cients conver ge. The used Kohn-
Sham operator ƒ K S is defined thereby as [1 10]:
The first ter m in V eff displa ys the interaction of the electrons with the field induced b y the nucelar
char ges Z k . The second term describes the interactions of electrons with each other . The effective
potential V e ff is strongly related to the exch ange-correlation potential V XC . This is furth er generated
according to (36)
V eff is important for the correctness of the chosen densit y and has be chosen in such a way that the
non-interacting density equals the real (interacting) one. Therefore, a correct approximation for the
unknown terms contributing to E' XC has to be found. The gradient-corrected (like BP86) [1 1 1] and
hybrid functionals [1 12] are the ones most widely applied. In the following work especially BP86
and the hybrid functi onal B3L YP (39) have been used.
B3L YP consists of contributions fro m the S later -Dirac-exchange ter m [1 13],[1 14], Becke-exchange
34

(35)
E ' XC [ ρ ] = E XC [ ρ ] + ( T [ ρ ] − T KS [ ρ ] )

(37)

f KS ϕ i ( r ) = ϵ i ϕ i ( r )

(39)
B3LYP → 0.8S + 0.72B88 + 0.2HF + 0.19VWN ( III ) + 0.81LYP

(36)
V XC [ ρ ] = ∂ E ' XC [ ρ ]
ρ

(38)

f KS = − 1
2 ∇ 2 V eff ( r ) ;
V eff ( r ) = ∑
k = 1
N k Z k
∣ r − R k ∣ + ∫ ρ ( r ' )
∣ r − r ' ∣ dr ' + V XC [ ρ ( r ) ]

term, the Hartree-Fock-exchange term, the V oskov-W ilk-Nussair correlation [1 15] and the Lee-
Y ang-Parr correlation [1 1 1].
4.2.4. Mulliken charges
DFT computations can produce dif ferent properties of the investigated s y s tem, depending on its
electronic structure. One of these are the charges of the individual parts of the s y stem, the s o-called
Mullikan charges [98]. In the following the concept of Mulliken charge s is discussed in further
detail, since they have been used for the generation of polar ized force fie lds (as discussed in section
4.1.5):
The Mulliken charges arise fro m the Mulliken Population anal y s is [98]. The y can be used for the
description of the electronic structure and also for determination of the bonding, ant ibonding or
nonbonding nature of molecular orbitals (MOs). As prerequisites for the computation of Mulliken
char ges the molecular orbital wavefunction Ψ i and the coef ficients c i of the corresponding atomic
orbitals have to be known (generated for exa mple with DFT methods as discussed in the previous
section). Ac cording to the linear co mbinati on of atomic orbital (LCAO) approach [1 16] the char ge
distribution of the 2-particle sy s tem can be displayed as a probability density (40):
Here c ij and c ik are orbital coef ficients, while Φ j and Φ k are the atomic orbitals (AOs). Integrating
over all electronic coordinates, with norm alized AOs and MOs, one gets the following expression:
Here S jk is the overl ap integral, describing the overlap be tween two AOs. This can be inte rpreted
according to Mulliken [98] as follows:
c 2 ij is the contribution of one electron in MO Ψ i to the electronic char ge in AO Φ j . Accordingl y c 2 ik
is the contribution of one el ectron in MO Ψ i to the electronic char ge in AO Φ k . Hence these are
called atomic-orbital populations. Furthermore, 2c ij c ik S jk is the contr ibution of one e - to the overlap
region between the two corresponding AOs and is called overlap population ( >0: bonding MO, <0:
antibonding MO, =0: nonbonding MO). These results are commonly presented in th e so-called
Mulliken populati on matrix:
Here rows and columns correspond to ato mic orbitals, while each diagonal element display s the
atomic-orbital populati ons with the of f-diagonal elements being the overlap populations.
Further condensation of the data can be achieved via combination of the atomic-orbital popul ations
with the correlated overl ap populations for each MO which resul t in the Gross population matrix
[98]. The matrix e lements are created by dividing the overlap populations in half and adding each
half to the AO populations of the participating Aos (43):
35

(40)
Ψ i
2 = c ij
2 ϕ j
2 + c ik
2 ϕ k
2 + 2c ij c ik ϕ i ϕ j

(41)
1 = c ij
2 + c ik
2 + 2 c ij c ik S jk

(42)
P i =
(
c i j
2 2c ij c ik S jk
2c ij c ik S j k c ik
2
)

The GP matrix and its elements provide the amount of gross charge s that the MOs contribute to the
AOs. Columns correspond to MOs, while rows correspond to AOs. Furthermore, AO char ges Q φ are
then obtained by adding the elem ents in the rows of GP for the occupied Mos .
Atomic char ges Q atom are obtained from the Q φ when adding the AO char ges on the same atom
according to eq. (44).
And finally the Mulliken charges Q Mullikan (also called polarized char ges) are obtained by subtracting
Q atom from Z a tom (char ge of the nucleus) according to (45):
4.2.5. Basis functions and basis se t appr oxi mations for metals
An i mportant prerequisite for ab-initio calculations is the correct definition of the atomic orbitals.
As shown before, one-electron orbitals are used in DFT [1 10] for iterative co mputation procedures.
In the MO-theor y , the LCAO approach defines the description of molecular orbitals b y a linear
combination of ato mic orbitals [1 16] . T o achieve reliable results, a good approximation for these
atom-centered orbitals has to be found. In the following the theor y of basis sets is explained, as well
as dif ferent kinds of basis sets and their possible application [1 16],[1 18]. Finally , basis set
approximations for d-period metals are discussed, since co mputations involving metals have been
done in this work [1 19],[120],[121],[122].
Standard Basis functi ons
Atomic orbitals are represented in QM cal culations by so-called basis functions [1 16],[1 18] :
In general, basis functions are predefined and reflect importa nt properties of the atoms they should
resemble. They are centered on the nucleus and are similar to the corresponding atomic orbitals.
Commonly used are the so-calle d gaussian t y p e orbitals (GT O) as shown in eq. (46)
Here r and l represent the radius of the orbital and the angular momentum quantu m number ,
respectively . ξ is related to the nuclear char ge of the corresponding ato m. Although functions of this
type are eas y to compute, they are a bad basis for actual AOs, since the y differ much from the exact
solution of the Schrödinger equation, as shown in e q. (47) [1 16].
36

(46)
GTO : R ( r )= r l e − ξ r 2 ; STO : R ( r )= r n − l − 1 e − ξ r

(44)
Q atom = ∑ Q ϕ k

(45)
Q Mullikan = Z atom − Q atom

(43)
GP ij = P i jj + 1
2 ∑
k ≠ j
S jk

L corresponds to the radial part of the atomic orbitals and is a Laguerre pol y nomal. It is dependent
on the dif ferent quantum nu mbers n (main quantum nu mber) and l, as w ell as the nuclear char ge Z
and the radius r .
Slater t ype orbitals (ST Os) are rea listic representations for AO s (see eq. (46); not discussed here),
but are computationally very costly i n contrast to GT Os.
Therefore, a linear combination of several GT Os has to be used to approximate the radial
distribution of the atomic orbitals. A number of pri mitive GTO s (PG T O ) is forming contr acted
GT Os (CGTOs) which are shown in e q. (48).
This provides a good approximation for real atomic orbitals and is similar in qualit y to ST O (if s till
less flexible) and less com putationally costly . D if ferent basis set types are vailable [1 16],[1 18]:
Basis set. A set of CGTOs then for m a basis set, used for description of AOs/MOs. The appropri ate
coef ficients have to be deter mined e.g. b y DFT computations. There exist different basis sets which
dif fer in its applicati on and accuracy:
Mini mal basis. In a minimal basis a single CGT O is used to describe ever y orbital (e.g. STO-k G,
where k denotes the number of PGT Os to be contracted for each atomic orbital).
V a lence double zeta (VDZ) basis. VDZ basis s ets appl y two CGTOs to each orbital. T riple zeta
basis sets use accordingly three CGTO s. More commonl y used are the so-called split valence basis
sets, in which the core orbitals are represented in minimal basis, while the valence orbi tals are
represented b y VDZ or V TZ basis s ets. This is due to the fact that the core electrons/orbitals are
mostly unimportant for chemi cal behavior , while the valence orbital s contribute heavil y to bind ing
situations in molecules.
Polarization f unctions. Polarization functions can be applied to achieve a better description of
bonding situations. They add a func tion with higher angular mo mentum (e.g. for h ydrogen: three p-
functions are added; for carbon: five d-functions are adde d).
Diffuse f unctions. These functions have small exponents and are added to the basis set for the
purpose of better describing dif fuse states, s uch as anions and hydrogen bridges (weak bonds in
general).
Effective cor e potential (ECP) → pseudopotential
Another important approximation for basis sets is the so-called effective core potential (ECP) or
pseudopotential [1 19],[120],[121] . Because of the low contribution of core-near electrons to
37

(48)
χ ( CGT O ) = ∑
i
k
a i χ i ( PGTO ) ≃ STO

(47)
R ( r ) = L n − l − 1
2l + 1
(
2Zr
n
)
e − z
n r

chemical reactions and their orthogonalit y to the valence electrons, the all-elec tron potential can be
approximated by an appropriate poten tial, named ECP . The valence electrons move inside this
potential. It was first introduced by Hans Hellm ann [1 19]:
Using ECP’ s, the solution of the Schrödinger equation becomes much less costl y in terms of
computational resources. There are three dif ferent approximations for generation of ECP's:
1. One-electron approxi mation:
Separation of core and valence electrons.
2. Frozen core approxi mation:
The one-electron contributions of the core-nea r orbitals are constant.
3. Small-core approximation:
In the s mall-core approximation it is assumed that there is no significant overlap between core and
valence electrons which is why the former can be kept constant. For cases, in which the overlap
between both is not negligible their exist other methods like non-linear core corrections [120] and
semicore electron inclusions [121].
This lea ds to a s implification of the DFT approach, in which the exchange-correlation functional
can be separated into two terms (49). Thus, in DFT computations using ECPs onl y the part for
valence electrons has to be considered.
Thus, the valence electronic-potential in ECPs is described b y so-called pseudo-wavefunctions,
while the core-electrons are o mitted. After a chosen cut-of f, the pseudo-wavefunctions and the rea l
potential function are similar . A bigger cut-off guarantees a better convergence, but makes the ECP
less transferable to dif ferent s y stems.
Incorporation of ECP's reduce not only the neces sar y basis s et size and the number of electrons (and
thus the computational costs), it also addresses heavy metals [122]: Relativistic ef fects can be found
in ver y heavy elements, lik e d-row-metals: Because of the heavy core (and higher positive charge),
the core electrons are mov ing at a much higher velocit y , nearly w ith the speed of ligh t, resulting in
the contraction of the core orbitals and therefore, stronger shielding of the valence electrons from
the core charge. That is wh y , the valence orbitals are more spacious than nor mally assumed and are
also much more separated from the core orbital s.
Therefore, in d-metals, core-electrons pla y a less significant role in molecular-bi nding situat ions.
Further the contribution of valence orbita ls to chemical bonds is altered compared to non-relativistic
elements. This cannot be approximated realistically b y standard basis s ets. Because of this, the
application of ECP's for d-metals is of high importance to simulate correct electronic behavior and
get realistic results [122]. QM-computations inc luding d-metals, as have been done in this work, are
therefore alwa y s treated with appropriate ECP's and basis sets (for further detail about the used
basis sets, please see section 5.2.).
38

(49)
E ' XC ( n cor e + n valenc e )= E ' XC ( n core ) + E ' X C ( n val enc e )

4.2.6. Self-Consistent Charge Density Functional T ight Binding method (SCC-
DFTB)
The SCC-DFTB method [1 17] is an approximation of density func tional th eory (with use of the
Kohn-Sham orbitals [1 10]): It uses empirical parameters and is a se miempirical method. Further it
only takes valence electrons into account, therefore us ing pseudopotentials. This leads to
computational ef ficienc y at l ow costs (compared to conventional DFT).
The ener g y function is divided into three dif ferent parts (50):
The hamiltonian ener gy E H0 is generated b y using either analytical functions or pre-evaluated results
from DFT calculations (minimal basis of pseudoatomic orbitals w ith usage of GGA functional) for
the desired s ystem or molecule. These have been carried out using a reference densit y which
contributes to the T ight-binding (TB) model (atom hardness, as is t y p ical for ECP-treated orbitals).
As seen in (50), the first term in E tot i s defined as the sum over the product of the atomic coef ficients
c i in minimal basis representation and the Ha milt on H 0 matrix, depending on the reference densit y
and V XC .
The second term E γ is strongl y related to the exchange-correlation functional. Δq is the induced
char ge on the corresponding atom and γ AB a distance -dependent funct ion which describ es char ge
interactions and approximates E' XC with a T a y lor series (s elf-consistent charge method is applied) to
account for changes in charge-densities and charge transfers. In SCC-DFTB first and second order
terms are used for this, resulting in a good descript ion of short- to long-range char ge interactions.
The last ter m, E Rep , the repulsive part, is built from experim e ntal parameters resulting in the
approximation of the two-center integrals b y one-center terms. Thi s is done by fitting parameters to
experimental values. Further E Rep is directl y related to the application of ps eudopotentials and thus
the neglect of core electrons.
The coef ficients and ener gies are determined b y solving the Kohn-Sham equations as shown in
section 4.2.3., resulting in equations shown in (51) which are solved i teratively . H 0 μν is preev aluated
and similar to the Kohn-Sham operator in s tandard DFT (eq. (38)) and depends on the reference
density ρ 0 .
These simplifications yield a semiempirical method on one hand more effi cient than conventional
MM methods and on the other hand less computationally costly than standard DFT [1 10] and
therefore, a convenient choice for e.g. QM/MM hy brid approaches (as discussed in 5.1.3.).
39

(50)
E t ot = E H0 + E γ + E Rep ;
E tot = ∑
i μ ν
c i μ c i ν H μ ν
0 + 1
2 ∑
A , B
Δ q A Δ q B γ AB + E Rep

(51)
∑
ν c i ν ( H μ ν − ε i S μ ν ) = 0 ;
H μ ν = 〈 ϕ μ ∣ 
H μ ν
0 ∣ ϕ ν 〉 + 1
2 S μ ν ∑
C
Δ qC ( γ AC + γ BC )

4.2.7. Implicit solvation m et hods – Self-Consistent Reaction Field ( SCRF)
In this work theoretical methods are applied t o biological systems and therefore, sol vation effec ts on
the biological matter have to be accounted for . As mentioned in section 4.1., th e water environment
can be treated exp licitly or implicitly . The application of lar ge amounts of explicit solvent
molecules in DFT computations is unfavorable due to the limitations of current co mputer pow er .
Therefore, the app lication of implicit solvation methods [123],[124],[125],[126],[127],[128] ,[129],
[130],[131],[132],[133],[134],[135],[136] is of great importance and interest. Not only water
environments can be simulated b y implicit solvation models, but also the protein environment
which surrounds e.g. cofactors can be accounted for wit h implicit solvation [71],[137],[138],[139].
Implicit solvation methods as applied in this work use a polarizable continuum (dielectrics) for
mimicking the solvent. The first descriptions of molecules surrounded by a dielectric mediu m with
incorporated char ges can be seen in the classical Poisson-Boltzmann equation [126] and the
generalized Born formalism [127],[128],[129].
A dielectric med ium [140] generally has a high polarizability and is based on its dielectric con stant
ε which def ines the permitti vity of the material. Permittivity accounts for how m uch an external
electric field has an effect on the material (and vice versa ). The dielectric constant ε is the product
of the relative permittivity ε r (relation of ε to ε 0 ) and the one in vacuum ε 0 as depicted in (52). High
values of ε r account for high polarizability .
In SCFR-calculations, the solute is surrounded b y a dielectric continuum, defined b y its dielectric
constant and thus its polarizabilit y . One wa y of approaching the computation of such models is
shown in mor e detail in the next subsection. This method has been used in this work to mimic both
solvents and protein environment s.
Polarizable Continuu m Models (PCM)
The polarizable continuu m model (PCM) is the most comm on us ed SCRF m ethod and has been
derived b y T o masi et al. [123],[124],[125],[130],[131],[132],[133],[134],[135],[136],[141]: A cavity
is created inside the dielectric conti nuum, in which the solute (molecule) is placed (a s can be seen in
Figure 15). This is carried out by using interlocking spheres defined by the van der W aals-radii a of
each atom.
The solvent-solvent interactions are reduced upon introduction of the cavity inside the solvent
resulting in a reduced entrop y . Therefore, the ener gy ΔG cav is needed to create the cavity which is
therefore ener ge tically unfavorable. In contrast to this, placing of the solute inside this cavity is
ener geticall y favorab le, because of increased dispersion and electrostatic interactions, yielding
ΔG disp and ΔG ele (electrostatic interaction). This leads to eq. (53).
40

(52)
ϵ = ϵ r
⋅ ϵ 0

(53)
Δ G solv = Δ G ele + Δ G cav + Δ G dis

The individual terms are:
ΔG ele : electrostatic solute-solvent int eractions
ΔG cav : ener gy needed to form the molecular cavity inside the continuum
ΔG dis : dispersion ener g y arising from sol ute-solvent interactions
Additional terms may be (these are not alway s accounted for in available methods and are therefore
not further discussed here):
ΔG re p : repulsion ener gy (arising from Pauli repulsion)
ΔG Mm : entropi c contributions (due to molecular motions)
Figur e 15. Polarizable continuum m odel: A cavity is build around the molecule/the solute, whic h is
surrounded by the solvent, represente d b y a dielec tric medium/reaction field. The picture was
inspired by [135].
As said before, the cavit y can be either be a simple sphere (radius a) with a point charge q in its
center . Or it can be constructed in a more sophisticated wa y by overlapping spheres centered on
each atom and defined by their van der W aals-radii.
For computing the individual ter ms of G solv (as shown in (53)) three slightly differe nt versions of
cavities are used (e.g. [142]):
G cav is calculated using the van der W a als-surface, generated b y the van der W aa ls-radii of the
individual atoms. The s urface available for the solvent is then described b y a so-called Solvent
Assessable Surface (SAS) which is used to compute G dis . The S AS is generated b y inclusi on of
idealized sol vent-radii into the van der W aals-surface. Finally , the G ele is computed, using a third
version of surface, namely the Solvent Excluding Surface (SES): Here, all van der W aals-radii are
scaled by a constant factor and sm oothed b y t he additional usage of not atom-centered spheres.
The energy used for creation of the cavi ty . ΔG cav as well as the gained dispersion energy ΔG dis is
further proportional to th e cavity 's surface S, as shown in eq. (49) [135]. ξ is the proportionality
factor .
41

The electrostatic energy is dependent on the polarization ef fect of solvent on solute and vice versa
[135]: Th e induced dipole moment μ of the s olute leads to the cre ation of charges on the molecular
surface (Apparent Surface Char ge (ASC) model). This forces the solvent to react/adapt to it w hich
is wh y it is called a reaction field. Following this, the polarization of the solvent also induces a
change in the molecular char ge-distribution ( multipole). In this wa y the solvent is a lso influencing
the solute. The amount of the multipole moment change is determined b y the polarizability α of the
molecule. This solute-solvent interaction is defined in the so-called interaction potential V i nter which
is added to the conventional Hamiltoni an and used in the SCF procedure, as described below .
In the following the implementation of PCM into SCF pr ocedures, as w ell as the analytic form
of G ele is illustrated for a simple spherical cavity [135] :
Eq. (55) [135] displays the analytic form of G ele . Here, μ is the dipole mo ment of the solute and α its
polarizability . ε r is the relative dielectric constant of the dielectric medium s urrounding the cavity
and a the radius of the latter . The first part (before the brackets) corresponds to the polarization of
the solvent and the second part (inside the bracket s) to the back-polarization of the solute [135] :
The electrostatic free ener g y and the induced dipoles are directly proportional to the int eraction
potential V inter which describes hence the interaction between solute and solvent. It is an additional
term acting on the Hamilton operator in the Sc hrödinger equation as seen in eq. (56).
The SCRF m odel is then implemented into QM (DFT) calculations in the following way [135]:
1. Us age of normal SCF procedure (e.g. DFT) with conventional Hamiltonian H 0 (e.g. as discussed
in section 4.2.3.). T his results in the calculation of wavefunction Ψ 0 (r) and electronic density ρ 0 (r) in
vacuum.
2. Calculation of dipole m oment μ of the solute with Ψ 0 (r) ac cording to (57):
Here, Z α and R α are the char ge and coordinates of the nuclei, respectively .
3. The obtained dipole m oment μ is then used to compute the induced reaction field E R (58) . :
42

(54)
Δ G cav + Δ G dis = ∑
i
atoms
ξ i S i

(55)
Δ G ele = − μ 2
a 3
ϵ r − 1
2 ϵ r + 1
[
1 − ϵ r − 1
2 ϵ r + 1
2 α
a 3
]
− 1
∝ V inter

(56)
[ H 0 + V inter ] ∣ Ψ 〉 = E ∣ Ψ 〉

(57)

μ = − ∫ ρ 0 ( r ) dr + ∑
a
Z a R a

4. This leads t o the interaction potential V inter (59):
5. Finally , the wavefunction Ψ(r) and densit y ρ(r) can be computed including the solvation-ef fects
by adding V inter to H 0 , resulting in the new Hamiltonia n H for the solvated system (60):
6. The cycle ca n start again with step 2 and is repeated until μ, ρ and E R conver ge.
The general procedure of first computing Ψ 0 (r) and afterward Ψ(r) with the help of V i nter can also be
applied to computations with non-spheric al cavities.
This method is eas y to implement and has lower computational costs as explicit solvation.
Unfortunately th ere are some drawbacks to this model [130],[131],[132],[133],[134] ,[135],[136],
[141]: First, ionic solutes which interact strongly with the solvent are not accounted for . Second,
sy stems which include strong electrostatic interactions are less accurately treated, yielding often
false positive char ges in the s y stem (especially if the w avefunction penetrates the cavity walls).
Third, the estimation of non-polar entropic interactions (ΔG Mm ) are often neglected or treated
insuf ficiently . And finally an y interaction between solvent molecules and the solute (like H-bridges)
are omitted. These can lead to wrong estimations of solvation ener gies and therefore, models with
at least some of these qualitie s have to be treated with care, when applying PCM's.
4.3. Quantum Chem i cal/Molecular Mechanical (QM/MM) methods
Both pure MM and QM methods have huge advantage s, but also shortcomings. MM is very useful
for descrip tions of confor mational changes in big macromolecules and for analyzing d y n amics.
Unfortunately MM techniques are not able to describe the structure of the invest igated system s on
an electronic level. This means i t cannot attribute for complete understanding of reaction
mechanism s, char ge tr ansfers and electron excitations. In contrast, solving the Schrödinger equation
leads to an understanding and visua lization of the electronic structure. But since the computational
costs for these methods are very high, they can only be applied to small sy stems and not to big
biological s y stems, such as proteins. Clearl y both methods have huge advantages and to overco me
their restrictions W arshel and Lev itt [84],[143],[144] introduced the concept of combini ng quantum
mechanics with molecular me chanics in the QM/MM framework:
QM/MM is a h y b rid method combining the powers of both MM and QM in one new concept. In
this approach [84],[143],[144], the entire s y s tem (S) is partitioned into three diffe rent areas as
depicted in Figure 16:
43

(58)

E R = − 2 ( ϵ r − 1 )
( 2 ϵ r + 1 ) a 3 
μ

(59)

V inter =  μ ⋅ 
E R

(60)
H = H 0 + V in ter

The outer region (O) described b y molecular mechanics methods. According to stochastic boundar y
conditions [91],[92] (s. section 4.1.4.) this region can be further divided into N ewtonian, Langevin
and Boundar y regions or at least be divided into an active and an inactive region, in which atoms
are free to move or remain fixed, respectivel y . A ll atoms in the Newtonian area are completely
unrestraint in the ir m o vements and follow Newtons laws of m o tion. In contrast to this, all atoms
inside the second section experience additional frictional and rando m forces according to Langevin
dynamics. Sol ute reaching out of the boundary region is treated as fix ed (similar to GSPB [93]).
Furthermore, most often the s olvent is cut down to a sphe re of a predefine d diameter . Outside of
this sphere solvent can be mimicked by application of a dielectric constant or co mpletely o mitted.
See sections 5.1.5 and 5.2.4. for more information on practical application of Q M/MM methods on
biological sy stems, as carried out in this work.
The inner region (I) is described b y QM methods. This is the area of most interest (e.g. active
centers of proteins, reaction centers and so on). In the additive scheme both (I) and (O) are
connected with a so-called boundar y region (B). A general scheme of th e QM/MM approach can be
seen in Figure 16.
Figur e 16. Schematic representation of a QM/M M model in the additive scheme as descri bed in the
text. The outer region (blue; O) is treate d with MM, while the inner region (yellow; I) with QM
methods. The brown area dividing both is called bounda r y region, c oncerning the transition from I
to O and can be treated at dif ferent levels of theor y .
Since both regions are strongl y interconnected and interact with each other (e.g. via cova lent bonds,
which cross from (I) to (O) or with simple el ectrostatic interactions between both), it is not possible
to simple sum up the ener gies of both regions to gain the one of the complete system (S) (61):
Instead, one has to think about coupling terms for both regions and take special prec autions for the
boundary region. This will lead to a new expression of the ener gy of S as shown in 4.3.1.
44

(61)
E ( S ) ≠ E ( I ) + E ( O )

4.3.1. Additive Scheme (+ subtractive scheme)
Currently two dif ferent schemes for generating the ener gy of (S) are in use [84],[144]: One is called
subtractive scheme and the other additi ve scheme.
Subtractive sche me
Here [84],[144], the QM /MM ener g y is generated (according to eq. (62)) by first calculating the
ener g y of the complete s y s tem (S) with MM methods. Afterward, the MM-ener gy of (I) E M M (I+L)
(with possible l ink atoms (L), as described in 4.3.3.) is subtracted from E MM (S) and the QM-derived
ener g y E QM (I+L) of this region is added.
Fortunately , this approach does not need an y coupling terms for the Q M and MM regions, since the
complete s y s tem is initially computed w ith MM-methods. Second, the corresponding QM and MM
packages do not have to be altered and thus the approach is easy to appl y to existing software.
Unfortunately , the coupling between Q M and MM regions is only done with MM methods (leaving
out electronic ef fects) and force fi eld para meters are necessar y for (I). Th is often proofs to be
dif ficult, since appropriate parameters for e.g. cofactors might not be available.
Additive sche me
The additive scheme is tar geting these disadvantages in the following manner [84],[144]: MM
methods are only applied to the outer region (O), while the inner region (I+ L) is treated at a
quantum-chemical level. Both ener gy terms are combined. Furthermore, a third term is added which
is called the QM-MM coupling t erm E QM-MM (I, O, L) and corresponds to the boundary region (B):
The coupling term treats interactions between the QM and MM regions ((I) and (O), respectivel y ),
such as bonding (cova lent bonds across the boundar y ):

E QM − M M
bond

, van der W aa ls interactions:

E QM − MM
vdW

and electrostatic interactions:

E QM − MM
elec

.
This approach is the most used one in the field of QM/MM. The treatment of E QM-MM (I, O, L) and
the corresponding individual interaction ener gies is therefore of major interest and can be
approached in dif ferent wa y s [84] ,[144]:

E QM − MM
vdW

is commonly treated at the MM level by application of a Lennard-Jones potential. This
makes it necessary to have correct van de r W aals-parameters for both regions (I) and (O).
45

(62)
E QM / MM ( S ) = E MM ( S ) − E MM ( I + L ) + E QM ( I + L )

(63)
E QM / MM ( S ) = E MM ( O ) + E QM ( I + L ) + E QM − MM ( I , O , L )

(64)
E QM − M M ( I , O , L ) = E Q M − MM
bond + E Q M − MM
vd W + E QM − M M
elec

Covalent bonds, attributing to

E QM − M M
bond

can be treated in diffe rent ways, as s hown in 4.3.3 and the
dif ferent possibilities for treatment of the electrostatic coupling between QM and MM atoms are
discussed in the next secti on.
4.3.2. Embedding m et hods
The

E QM − MM
elec

term can be addressed in diffe rent wa y s, which are called “embedding” methods
[84],[144], attributing for possible polarization ef fects between both regions (I) and (O):
Mechanical e mbedding
The simplest wa y is to treat the QM-MM electrostatic interactions on a pure MM level. Point
char ges as co mmon for force fields are a lso applied to the QM atoms. This leads to an exclusion of
polarization effe cts of both region (I) and (O), since no char ges on either side are pol arized b y the
other . Also it is necessary to have appropriate point char ges for the ato ms of (I).
Electr ostatic embedding
In this approach, the Hamiltonian of t he the QM-region (I) includes the MM point char ges of (O), as
seen in eq. (65):
Here q j denotes the charges of MM atoms, while the char ges of the QM atoms are Q a and r i and R a
the corresponding positions of the char ges.
The incorporation of the MM charges leads to a polarizati on of the char ges of the atoms of (I) and
in this way (I) is polarized by ( O). Ther efore, this method is an im provement to the mechanical
embedding, although it still lacks the back-pol arization ef fec t on MM atoms by the QM char ges.
Polarized E m bedding
The polar ization of M M atoms in (O) b y QM atoms in (I) is accounted for by polarized embedding
methods. They work similarl y to the electrostatic embedding, with the exception of using
polarizable MM force fields. Flexible MM charge models are able to yield adjustments of the
char ges of MM atoms to neighboring charges (both QM and MM atoms) at each time-step of the
computation. An example is the Drude-oscillator [84].
This method would provide the most accurate description in the QM/MM framework, but s ince
polarizable force fields are still in development, the electrostatic embedding has been us ed in this
work.
4.3.3. Boundary Region
The treatment of covalent bonds, which are cut b y the QM-MM boundary , is also of great
importance. They have to be included into the QM -MM coupling term according to equation (64).
46

(65)

H QM − MM
elec = − ∑
i
N ∑
j ∈ O
L q j
∣ r i − R j ∣ +
∑
a ∈ I
M ∑
j ∈ O
L q j Q a
∣ R a − R j ∣

Dif ferent approaches for generation of

E QM − M M
bond

are available [84],[143],[144],[145],[146],[147],
[148],[149] as discussed below .
Fr ozen orbitals appr oach
The first approach is the so-called frozen orbital or localized orbital approach. In this, frozen h ybrid
orbitals are used to saturate the free valency of the atoms at the cut bond [84],[143],[145],[146],
[147].
Link-atom method
The second method (w hich has been appl ied to QM/MM co mputations carried out in this w ork) is
the link-atom scheme which introduces an additional atom (not part of either MM or QM regions)
connected to the QM host at the cut covalent bond [84],[144],[148],[149]: Th is atom is often a
simple h y drogen atom H which is commonly included into the QM Hamiltonian and thus saturates
the QM-atom valenc y . The link-atom interacts w ith the MM region onl y via electrostatics. In F igure
17 the basic principle of this approach i s shown.
The M M-atom see s and interacts with the QM-atom on a classical MM-level via harmonic bond
stretching as discussed in section 4.1.1. Since the MM-atom is ver y near to the H, co mpared to the
original bond between QM and MM host atoms, this would lead to an overpolarization of the QM-
char ge in electrostatic and polarized e mbedding. Therefore, it is common to s mear th e charge of the
MM atom on the neighboring atoms and t hus “erase” it from the calculations.
Figur e 17. Schematic visualization of the link-a tom method. The bond from the QM-host to the
MM-host is cut. The QM-atom is then saturated b y a hy drog en atom, while the MM-atom still see s
the QM-host on a MM-le vel of theory .
4.3.4. Application
QM/MM methods can be performed in several way s [84],[144]: In principle it is possible to add
QM attributes to an existing MM program, using the corresponding force fields, such as CHARMM
[150],[151], AMBER [152], etc. The y are able to si mulate lar ge a mounts of atoms. Also, MM
attributes can be added to QM progra ms such as GAUSSIAN [153], TURBOMOLE [154], etc. The
named codes are able to co mpute electronic s tructures of the investigated s y stems and other
wavefunction- or electronic density-based observa bles. But both strategies take only the advantages
47

of the parent platforms and lack flexibility in adjustment of dif ferent parameters for both MM and
QM methods. Therefore, a third option, which has been used in this work, is the program
CHEMSHELL [155] . It equall y combines MM force fields with QM routines on an interface that is
optimized for QM/MM coupling. In this work a combination of the CHARMM force field [85]
together with TURBOMOLE [154] has been used for QM/MM calculations. For further
computational details see secti on 5.1.3.
QM/MM calculations can be used to gain the ener g y of the s y stem of interest and, furthermore, to
obtain several properties as discuss ed in the sections concerning MM and QM, respectively . In this
work QM/MM methods are used for geo metry optimizations of lar ge proteins including cofactors
and subsequent frequenc y and intensity calculations for the generation of Ra man-spectra. The
theoretical background on vibrational spectroscopy in general and Raman-spect roscop y in detail is
discussed in the next secti on and also in the sections 5.1.6. and 5.3.
4.4. Ram an spectr oscopy and com putation
In this section the general principles of Rama n scatte ring (resonant and non-resonant), as well as
computational details for the calculation of Ra man-spectra will be discus sed. The main source of
literature for this section and all subsect ions has been [156],[157],[158].
An inelastic scattering process of photons from matter , known as Raman ef fect [159] or Raman
scattering, is initiated b y exposure of the investigated s y stem to electromagnetic radiation. In
conventional absorption-processes, electrons in the materi al are excited via incoming photons to a
higher el ectronic lev el. In contrast to this, in Raman processes the photons are not absorbed b y the
material, but are instead scatte red from the molecule [159],[156],[157].
Figur e 18. Schematic representation of Ram an scattering processes, with corresponding ener gies of
incomming light (hν 0 ), scattered light (hν R ) and observed vibration (hν K = hν 0 +/- hν R ); G is the
electronic ground-state; n, m the ini tial and final vibrational states and R, r are virtual elect ronic and
vibrational states, respectively which resem ble an intermediate vibronic state.
Scattering pr ocesses
In Figure 18 the dif ferent scattering processes [156],[157] are depic ted. Most of the l ight is
dispersed in an elastic wa y , called Rayleigh-scattering. No shift in frequency between incoming and
48

outgoing light can be observed and thus no ener gy is transferred from or onto the molecule. But a
small percentage of the incoming photons are causing so-called S tokes- or Anti-Stokes-scattering
processes. These are inelastic in nature. Resulting fro m a vibrational transition in the studied
molecule, a frequency-shift between incoming and outgoing light is induced. This is named either
Anti-Stokes (rise in outgoing light fre quency) or Stokes (lowering of outgoing light frequency ) .
Illustrated in Figure 18, hν 0 denotes the energy of the initial photons. In case of Stokes-scattering,
the s y s tem is excited fro m an initial electronic (G) and vibr ational (n) ground-s tate to a virtual
vibronic state (R,r). Subsequently it falls back dow n to the electronic ground-state, but re mains in a
higher (1 st excited) vibrational level (G,m). The scattered light has the energy hν R =hν 0 -hν k , whereas
hν k denotes the diffe rence between the vibrational ground-state (n) energy and the one of the
induced vibration k (respective state: m ).
Due to higher temperatures, some molecules show already a population of the first vibrational level
(G,m), even before the exposure to light. Irradiation w ith light then can lead to Anti-Stokes-
scattering, in which again the incoming light induces an excitation from the first vibrational level
(G,m) to a virtual state (R,r) in the molecule. From this it falls back to the electronic and vibr ational
ground-state (G,n). In this case a rise in the e ner gy of the scattered light hν R =hν 0 +hν k is observed.

The frequency shifts are identical in both Stokes- and Anti-Stokes-scattering, since the y alwa y s
attribute to the same vibrational levels. The intensities however diffe r between both processes,
depending on the temperature and thus on the initial state-population. Since at room temperature the
population of the ground-s tate is more probable, Stokes processes are also m ore common and thus
the observed intensities are often higher than for Anti-Stokes under these conditions. The computed
Raman-spectra in this work are al wa y s referring to Stokes-scattering processes.
The Raman-spectrum is constituted by the scattered photons which are measured with appropriate
spectroscopic techniques (not discussed here) [156],[157]. The intensity of the measured radiation
(hν R ) is equal to the probability of the transition. The frequen cy ν R of the measured light defines the
ener g y of the vibrational transition and is given in reciprocal cm -1 and is called Raman shift.
Sections 4.4.2. and 4.4.3. show the computational treatment of both frequencies and intensities of
Raman scatterings [158]. Finall y the bandwidth in the Raman-spectrum [156],[157] is connected to
the lifetime of a the tra nsition and is further discussed in section 4.4.4.
Raman intensities and polariz ab ilities
A dipole moment μ ind is induced in the molecule, upon irradiation of the matter with an
electromagnetic fi eld E 0 as depcited in (66). μ ind is then proport ional to the polarizability α of the
molecule and dependent on the freque ncy of the incoming light and E 0 [156],[157].
Here α ρσ can be described as a tensor , in which the derivatives of th e polarizability over the normal
coordinate space for each polarization-plane(-direction) are gathered (67) [156],[157],[158]. See
section 4.4.3. for more information on the pola rizability tensor and its quantum-chemical treatment .
49

(66)

μ ind = α ( ν ) ⋅ 
E 0 cos ( 2 π ν 0 t )

(67)
α ρ σ =
(
α xx α xy α xz
α yx α yy α yz
α zx α zy α z z
)

The polarizability tensor α ρσ is an important observable for the description of Raman scattering
processes. It is proportional to the Ra man cross section σ which again is proportional to the Raman
intensity I (eq. (68)) [156],[157],[158].
Here n and m denote the initial and final vibra tional level, respectively .
The polarizabilit y tensor α ρσ then can be written according to Kramers and Heisenber g [156],[157],
[158],[160] as follows:
The initial and final inter mediate vibronic (electronic and vibrational) states are represented b y <|
and |>, respectivel y . H ere, capital letters imply electronic states, while small le tters give vibrational
states. Furthermore, n, m are the initial and final vibrational levels of the electronic ground-state G .
R,r are the respective el ectronic and vibr ational intermediate states. In the case of non-resonant
Raman scattering, these are virtual levels. This is in contrast to resonance Raman processes, in
which the intermediate state is corresponding to th e first excited electronic level (s. section 4.4.1.
and Figure 19). Therefore, <||> describes a vibronic transition. The different frac tions of the dipole
moment operator are given b y M, with and directions σ and ρ. The lifetime of the virtual state is
accounted for b y the dampi ng factor Γ . ν Rr is the frequenc y of the transition from initial to virtual
states, while ν k display s the diffe rence between vibrational ground-state and first level. ν 0 accounts
for the incident light.
Thus, the polarizabilit y tensor can be described as a sum over all possible states and transitions.
W ith the first summation term representing resonant contributions (s. section 4.4.1.), while the
second term attributes for non-resonant transitions.
4.4.1. Resonance Ram an spectr oscopy
In resonance Raman processes [156],[157], the incident photons and the initiated excitations are in
resonance with an electronic transition. The general principle is shown in Figure 19 for resonance
Stokes-scattering, compared to the conventional non-resonant one. In resonant Raman scattering
processes, the electrons are excited to an intermediate vibronic state, representing the first excited
electronic s tate R (with r 1 , r 2 the corresponding vibrational levels). The virtual levels of the
transition in non-resonant Raman scattering are denoted I (virtual electronic level) and i (virtual
vibrational state), respectively .
Resulting fro m the approach of H eisenber g and Kramers [156],[157],[158],[160], equation (69) can
be simplified by omitting a ll non-resonant contributions, y i elding equation (70):
50

(68)
σ n → m ∝
(
ν 0 ± ν k
)
4
⋅ ∑ ∣ α ρ σ ∣ 2 ∝ I n → m

(69)
[
α nm
]
ρ σ = 1
h ∑
R , r
(
〈 nG ∣ M ρ ∣ R r 〉 〈 rR ∣ M σ ∣ G m 〉
ν Rr − ν k − ν 0 + i Γ R
+ 〈 r R ∣ M ρ ∣ Gm 〉 〈 nG ∣ M σ ∣ Rr 〉
ν Rr − ν k + ν 0 + i Γ R
)

(70)
[ α n m ] ρ σ ≃ 1
h ∑
(
〈 nG ∣ M ρ ∣ Rr 〉 〈 rR ∣ M σ ∣ Gm 〉
ν Rr − ν k − ν 0 + i Γ R
)

Figur e 19. Comparison between non-resonance Stokes-scatt ering processes and resonance Stokes-
scattering; Resonance Ram an occurs, when the excitation is in resonance with a highe r electronic
level (R,r) with corresponding ener gies of incoming light (hν 0 ), scattered light (hν R ) and observed
vibration (hν K = hν 0 - hν R ); G is the electronic ground-state; n, m t he initial and final vibrational
states and I, i virtual electronic and vibrational state s which resemble an intermediate vibronic state
for non-resonant processes; R, r is the intermediate vibronic stat e for resonance Raman scattering
with R the first excited electronic state and r 1 and r 2 the respective vibrat ional levels of R.
Furthermore, electron-excitations are much faster than the corresponding nuclear motions (electrons
move much faster than nuclei). This means that the vibrational states are not changi ng during the
electronic exc itation process. Thus the Born-Oppenheimer approximation can be appl ied, resulting
in a separation of electronic and vibrati onal transitions [107],[156] ,[157],[158] (71):
Here <nr> is the so-called Franck-Condon factor , describing the overlap between both initial and
intermediate vibrational states. A transition will be most probable, if the vibrational states of both
electronic ground-state and excited state are similar to each other , following the Franck-Condon
principle [156],[157].
The second term < G|M ρ |R> corresponds to the pure electronic transition. It contributes to the
electronic transition dipole moment M G R,ρ which is further expanded in a T a y lor series. Combination
of eq. (70) with eq. (71) then yields the expression (72) for the polarizability tensor of resonance
Raman processes [156],[157],[158]:
[ α nm ] ρσ then consists of a num ber of individual terms [156],[157],[158]:
51

(72)
[ α n m ] ρ σ ≃ A ρ σ + B ρ σ

(71)
〈 nG ∣ M ρ ∣ Rr 〉 = 〈 nr 〉 〈 G ∣ M ρ ∣ R 〉= 〈 nr 〉 M GR , ρ

A-ter m scattering: Depends on the electronic transition dipole moment of the T a y lor series of
M GR,ρ . It becomes important for strong resonant electronic transitions, in which th e excited states are
lar gel y displaced from the ground-state.
B-ter m scattering: Depends on the second element of the T aly o r expansion of M GR,ρ which is the
first derivative of the electronic transi tion dipole mom ent over the normal coordinate space. It is
only important for transiti ons with strong vibronic coupling, where the conical intersection is m et.
In the following sections it will be discussed how to accomplish computations of Ra man-spectra
[158]. In detail it will be discussed how frequency cal culations (secti on 4.4.2.) and generation of
Raman intensities (secti on 4.4.3.) and bandwidth' (section 4.4.4.) will be carried out.
4.4.2. Fr equency calculations via nor m a l mode analysis (NMA)
Fundamental for frequenc y computations is the normal mode analysis (NMA) [63] as used in this
study . There are also other methods (such as Fourier T ransformations), but this work will focus
solely on the NM A approach, in which the vibrational Schrödinger equation will be solved [63],
[158]:
The geometry of the investigated mol ecule has to be near to th e ener getic equilibrium s tructure.
Therefore, the first derivative of the potential V over all spatial coordinates x i has to be zero as
shown in (73). Thi s can be ach ieved b y various optimization procedures of QM programs [158] (not
explained here), yielding the minimized structure in cartesian coordinates and has to be done
preliminary .
Furthermore, the corresponding potential energy V can be approximated by a T aylor series for small
deflections from the ener getic minimum. It is truncated after the second quadratic term, yielding the
harmonic approximation as displayed in equation (74) which can then be used to compute the force
constant matrix (78) [63],[158]:
The NMA procedure [63],[158] requires internal coordinates. Therefore, comm on QM-programs,
like GAUSSIAN [153], convert the optimized s tructure from car tesian coordinates into a set of
internal coordinates (bonds , angles, torsions and impropers) which are used for frequency
calculations according to eq. (76). These are more intuitive for description of molecular m otions.
Cartesian coordinates x are relate d to internal coordinates s via the W ilson B matrix [161]:
52

(73)
(
∂ V
∂ x i
)
= 0

(74)
V = V 0 +
∑
(
∂ V
∂ x i
)
0
x i + 1
2 ∑
(
∂ 2 V
∂ x i ∂ x j
)
0
x i x j

(75)
B =
(
∂ s
∂ x
)

Internal coordinates can also be combined into natural internal coordinates, as used b y the program
GV A [162], b y the following form ula:
According to Pulay e t al. [163] all internal coordinates s, which are necessar y for description of a
specific normal m ode, are combined.
Furthermore, following the W ilson Decius formalism [63],[156],[157],[158],[161] one gets:
Equation (77) can be transformed into the conventional normal mode equations as shown in eq.
(78):
G is the so-called Geome try matrix (eq. (79)) describing a combination of all internal coordinates s ta
(with atom α and internal coordinate t, t', respectively ) of the molecule, with the reciprocal mass μ.
The force constant matrix F is the second derivative of the poten tial V of the molecule over internal
coordinates (eq. (80)).
T ogether with Pulay's approach [158],[163], the force- constants can additionally be s caled,
according to (81):
This will be done for the usage of the progra m GV A [162], as described in the methods in section 5.
In equation (81) σ denotes the scaling factors and F ij the respective force constants. The scaling
factors are produced for a group of internal coordinates and a certain QM method and are
transferable between similar molecul es.
The eigenvalues λ in (78) describe the frequencies ν of the corresponding vibrations/transitions as
shown in equation (82) with c bei ng the constant of speed of light and λ the wavelength:
53

(77)
∣ F − G − 1 λ ∣ = 0

(78)
( GF ) L = λ L

(79)
G tt ' = ∑
α= 1
N
μ α ⋅  s t α
⋅  s t ' α

(80)
F =
(
∂ 2 V
∂ s i ∂ s j
)
0

(76)
r = ∑
k
c k s k

( 81)
( F ij ) σ =
√
σ i ( F ij )
√
σ j

Furthermore, the eigenvector of this equation is the so-called L-Matrix. It is i mportant for the
transformation from internal coordinates S to nor mal coordinates Q (83) and the generation of the
potential ener g y distribution (PE D).
In this work the NMA method [63],[158] has been carried out by using GAUSSIAN [153] and
application of the program GV A [162] in a QM and QM/MM framework (s. section 5).
4.4.3. Ram an intensities
According to equation (68) the corresponding intensities can be computed using (84) [156],[157],
[158]:
The Raman cross section σ k is proportional to the frequency of the vibrational mode ν k and the
polarizability tensor α ρσ of the investigated s y s tem. Therefore, knowing the frequencies of the
corresponding nor mal modes (solutions of NMA) and the polarizability deriv atives will yield the
Raman intensities I Ra k .
The computation of intensities can t hen be carried out as follows [158] :
In general, the cartesian polarizabilit y derivative s are calculated (with ρ and σ being the polarization
directions and H the Force constant matrix in cartesian coordinates and E the ener g y) according to
(85):
Afterward, cartesian polarizabilities are transformed into normal coordinate Q k -dependent ones via
B- and L-matrices.
The Raman activitie s are computed according to equation (87):
54

(82)
λ : ν = 1
2 π c
√
λ

(83)
L : Q = L − 1 S

(84)
I k
Ra ∝ σ k ∝ ( ν 0 − ν k ) 4
(
1 − exp
[
− hc ν k
K B T
]
)
∑ ∣ α ρ σ ∣ 2

(85)
α ρ σ =
∣
∂ α
∂ x
∣
2 ∝
∣
∂ 2 H x
∂ E i ∂ E j
∣
2

(86)
(
∂ α
∂ Q k
)
= ∑
i = 1
3N − 6
L ik ∑
j = 1
3N
B ij
(
∂ α
∂ X j
)

α k and γ k are isotropic and anisotropic parts of the polarizability tensor (polarizability derivative s).
The intensity is then computed according t o equation (88):
The last step is onl y carried out, when using GV A [162] for computation. GAUSSIAN [153] onl y
uses Raman-activitie s [158].
4.4.4. Bandwidth
The bandwidth Δν, which is proportional to the energy of the transition ΔE, is connected to the
lifetime Δt of the excited vibrational s tates (i.e. the corresponding transitions) via the Heisenberg
uncertainty principle [164] as depicted in eq. (89) [156],[157]:
According to (89) a short lifetime Δt (e.g. Δt=1 ps) l eads to a big ΔE and thus a broad band (Δν=2.7
cm -1 ) and vice versa (Δt=1 m s leads to Δν=2.7 x 10 -9 cm -1 ).
Since the lifetime of a vibrational transition in a molecule is strongly influenced b y the environment
it is in, the s hape and width of certain band gives information about structures and d y namics in the
investigated s y s tem. Therefore, in the current work two different approaches are used for obtaining
the bandwidth:
1. The bandshape is modeled by a Lorentzian function (with fixed bandwidth Δν=12 cm -1 ),
connected to the corresponding intensities/activities (this approach has been used for Raman-spectra
computation for Rc FDH cofactor) [156],[157],[158].
2. A second possibility to gain ac cess to d y n amics and thus bandwidth in Raman-spectra is to
generate so-calle d sum-spectra [158]: In these dif ferent snapshots of a MD-si mulati on are
55

(87)
A k
Ra =
[
45 α k
2 + 7 γ k
2
]

(88)
I k
Ra = I 0
( ν 0 − ν k ) 4
(
1 − exp
[
− hc ν k
K B T
] )
A k
Ra

(90)
I ( ν ) = I max
1 +
(
ν 0 − ν
Δ ν
2
)
2

(89)
Δ E Δ t ⩾ h
4 π

geometrically optimized via QM/MM techni ques and of each struct ure a Raman-spectrum is
generated as described above. Afterward, the individual spectra are summed into a single one. Thus
the influence of different conformations of the system at different timesteps is taken into the
computed Raman-spectrum.
56

Chapter 5
System pr eparation and pr otocols for calculations
5.1. Com putation pr otocols for Cph1
5.1.1. Model building for Cph1
Protein structure models based on X-ra y measurement s [10] were used as a starting point for the
model building. The initial coordinates for the protein were taken from the Protein Data Bank [15]
(code 2VEA; resolution, 2.45 Å). The model for Cph1, used in this work, was obtained from Grazia
Daminelli [64]. It was alread y completely prepared, according to the procedure described in earlier
work [165],[166]. For more details on model building, please see section 5.2.1.
Preliminary work by Gra zia Daminelli has been [64]: In general, missing amino acids have been
inserted manually and/or b y secondary structure prediction algorithms [167]. Absent h y drogen
atoms have been added to the heavy at oms via the H BUILD [168] routine, im plemented in
CHARMM [150],[151]. Protonation states for histidines have been evaluated via visual inspection
of the surrounding environment. Exception have been the two histidine residues in near vicinity of
the PCB cofac tor , na mely His260 and His290. According to the results of previous N MR studies
[29] these two have both been protonated only on the ε-nitrogen. Charge d amino acids (l ysine,
ar ginine, aspartate and glutamate) were protonated according to standard CHARMM [150],[151]
protocol. A hexagonal w ater box consisting of TIP3 water [101] with the dimensions of a=b=1 19.04
Å, c=146.2 Å; α=β=90.0°, γ=120.0° ha s been built around the protein. Chloride or sodium ions have
been placed within 3 Å of each a mino residue not involved in a s alt-bridge. The s y s tem has been
neutralized by random placing of ions [166]. This was necessar y to obtain a system char ge of zero
which is required for Particle-Mesh-Ewald calculations [96],[97]. The protonated, solvated and
neutralized sy s tem has then been used for MD-simulations, as described in the next subsection.
5.1.2. MD-simulation pr ot ocol for Cph1 with classical for ce field
The NAMD [169] -package (version 2.6) together with the CHARMM22 force field [85] has been
used for MD-simulations according to the protocol displayed in Figure 20a. The Particle-Mesh-
Ewald method [96],[97] has been applied to compute electrostatic interactions, while van der W aa ls
forces have been treated with switching functions ( switching distance of 10 Å ) and a cutof f of about
12 Å. Period ic boundar y conditi ons (PBC) have been used for the hexagonal cell. Further a time
step of 2 fs has been applied, together with the constrain algorith m SHAKE [170]. It has been used
to guarantee minimum values for al l bond lengths between heav y atoms and h y drogen. For the
chromophore PCB, previousl y optimized parameters [171] have been used. In the productions run
all heavy atoms were completely free of an y constrains. A Noose-Hoover Ther mostat [171],[172]
was used to m aintain NPT c onditions in all productions runs.
The MD-simulations have then been carrie d out as follows:
1.) Preliminary work like m o del building.
2.) Solvent water molecules (as well as hydrogen ato ms) are minimized (4 ps) to remove bad
contacts, then heated for 30 ps to a te mperature of 300 K under stepwise release of restraints of the
57

water molecules. Afterward water/h ydroge n atoms are equilibrated for 50 ps with no restrains for
TIP3-water and hydrogen. All for several thousand steps. During step 2, all other heav y atoms have
been kept fixed. Step 2 has been prelimi nary carried out b y Daminelli et al. [64].
3.) The resulting structure from 2.) is then used in an all-atom ener g y minimization (20 ps) under
constraints for all heav y backbone atoms of the protein and the chromophore with a har monic
constrain of 15 kcal/mol Å 2 . T hese are removed stepwise, until all atom s are completely free.
4.) The mini mized s y stem is then hea ted to 300 K for 200 ps. Again heavy ato ms of protein
backbone, as well as chro mophore are restraint. The atoms are gradually freed, until the y are
completely unrestraint.
5.) The heated and completel y unrestrained sy stem is then further equilibrated for 200 ps with
constant pressure and temperature. T o obtain NPT conditions the Noose-Hoover thermostat together
with the Langevin m ethod [171],[172] as provided by the program NAMD [169] can be used.
6.) Finally a production run of 40 ns (consisting of 200 0.2 ns steps) of the completely unrestraint
sy stem is done.
The resulting mol ecular structures and observables fro m the M D-trajectory , l ike RMSD and RMSF
can be visualized and computed via the V isual Molecular D y na mics software VMD 1.8.6. [173].
Figur e 20a. General MD-simulation receipt
5.1.3. Generation of a polarized f or ce field for Cph1 and com putation of
Mulliken charges
For the generation of a polarized force field for Cph1 a QM/MM method has been used. In detail, a
SCC-DFTB [1 17] /CHARMM force field [85] approach w as used to compute the part ial char ges of
all residues which have at least one atom inside a sphere with a radius of 4 Å fro m the center of
58

PCB (s. Figure 20b, right). The remaining part of the system was treated b y classical molecular
mechanics. Additionally , the partition of the s y stem in a Q M and a MM fragment was defined b y
specify ing those C-C single bonds, at which the covalent linkage between two fragments is cut
using the SPLAM [174] method. For all selected residues the C α -C β bond was the cut one.
Therefore, onl y the respec tive side chains have been treated b y SCC-DFTB. The computations have
been carried out using the CHARMM32 progra m [150],[151] and optimized parameter sets
(Hamiltonian approximations done with DFT in a minimal basis and with GGA functional)
developed b y the group of Elstner et al. [175] which are optimized for implementation of SCC-
DFTB calculations into CHARMM. Furthermore, the MM-electrostatic potential has been imported
into the QM-Ham iltonian [174]. Finally , the Mulliken [98] char ges have been calculated for each
enlisted residue via an iterative procedure until char ge convergence was reach ed. These were then
exchanged with the corresponding old char ges in the psf-file to generate a polarized force field.
PROTEIN-
RESIDUE COF ACT OR W A TER
LEU 15 PCB HOH 2
LEU 18 HOH 3
ILE 20 HOH 4
MET 174 HOH 10
TYR 176 HOH 1 1
V AL 186 HOH 28
TYR 198 W A T 632
TYR 203 W A T 663
SER 206 W A T 566
ASP 207 W A T 572
ILE 208 W A T 577
PRO 209 W A T 578
ALA 212 W A T 579
LEU 215 W A T 586
PHE 216 W A T 589
ARG 222 W A T 605
ILE 224 W A T 614
ARG 254 W A T 610
ALA 256 W A T 619
TYR 257 W A T 620
HSE 260 W A T 624
TYR 263 W A T 626
MET 267 W A T 629
SER 272
THR 274
LEU 286
ALA 288
HSE 290
ALA 457
TYR 458
LEU 469
HSD 470
PRO 471
Figur e 20b. Residues i nvolved in QM-Mulliken-char ge co mputati on (h y drogens not shown; protein
residues in ghostly representation); HOH m eans cr y s tal-water and W A T denotes solvent water
59

5.1.4. MD-simulation pr ot ocol for Cph1 with polarized for ce field
MD s im ulations w ith polarized force field (polf f) have been carr ied out in an identical way as for
MD-simulations for Cph1 with classical force field (see section 5.1.2.; steps 1 to 6), using a psf
structure file containing the polari zed char ges of the selected residues.
5.1.5. QM/MM geometry optim izations for Cph1
QM/MM model buildi ng
As starting points for the Q M/MM calculations 25 snapshots (of every 4 ps) of the last 100 ps of
both the MM- and polf f-MD trajector y have been taken. The w ater -box has been cut down to a
sphere with a radius of 40 Å centered on ring C of the chromophore. The QM/MM optimization
procedures were ac cording to recent publ ication [63] and as discussed in section 4.3. The s ystem
has been divid ed into a QM and a MM region, with a boundar y region in between, according to the
additive sche me [84],[143]. The boundary reg ion was accounting for coupling between QM- and
MM-atoms. The general concept is visualized in Figure 16 in section 4.3.
The QM region consisted of the protona ted bilin cofactor (PCB), the p y rrole-water and the side-
chain of protein-attachment Cys259 at the cofactor . The corresponding atoms were treated at a
quantum-chemical level. All other residues/atoms were either treated b y MM force field techniques
or remained fixed during the optim ization.
QM/MM pr ocedure
The CHEMSHELL progra m [155],[176] has been used for all Q M/MM calculations, combining the
TURBOMOLE package [154] for the QM-related parts and CHARMM [85] for the MM-part. The
actual calculations have been carried out accordi ng to the following scheme:
1. QM atoms are computed using conventional D FT methods [108],[1 10] with the B3L YP
functional. A 6-31G* basis set has been applied to all heavy atoms, while hydrogen atoms have
been treated with a 4-31G* basis set. DFT -calculations were ca rried out b y TURBOMOLE [154].
2. The CHARMM27 force field [85] is used for description of the MM part of the s y stem. The latte r
is further divided into an active and an oute r region.
3. All atoms inside a 15 Å sphere centered at the pyrrole-nitrogen of ring C of the chromophore
PCB are considered active during the QM/MM optimizations and contribute to the active region.
All ato ms outside the active region remain fixed during the whole optimization. The y are tre ated as
point char ges. The region outside the 40 Å water sphere is called outer region and contains no
explicit solvent molecules. Additional forces to keep the explicit water molecules inside the active
region are generated by the point cha r ges of the atoms outside the active region.
4. Bonds crossing the border between QM and MM region are tr eated b y the link-atom scheme (as
discussed in section 4.3.3.). Respective QM-atoms of the cut bonds are saturated with h ydrogen-
atoms, while for the MM-atom s a char ge-shift scheme is applied [84],[143],[144].
5. The electrostatic embedding method is used for the electrostatic interactions between QM and
MM atoms [84],[143],[144].
60

6. The optimization procedure is carr ied out with the help of a limited memory quasi- Newton L-
BFGS algortihm [63]. Furthermore, hybrid delocalized internal coordina tes (HDLC) [177] are used.
The results have been visualized with VMD [173].
The QM/MM optimized snapshots have then been used for further Raman-spectra computations
according to the NMA approach [63],[158], as described in the next subsection.
5.1.6. Ram an-spectra calculation with NMA for Cph1
The QM/MM geometr y optimized structures in car tesian coordinates have been used for subsequent
frequency computations with the GAUSSIAN program [153]. The NMA approach has been used as
described in the following [63],[158]:
1. The cartesian structure is converted into a set of internal coordinates according to equation (81).
The structural information in internal coordinates is then used to generate the G-matrix. Natural
internal coordinates [158] have to be used in case of subsequent usage of the program GV A [162].
These have to be manuall y prepared according to P ulay 's approach [163], as depicted in equation
(82).
2. The force constant matrix is computed as the second derivative of the potential ener gy of the
sy stem as provided by the previous geometry optimizations. The force constant matrix in cartesian
coordinates (called the Hessian). H is then converted into internal coordinate force matrix F . This is
carried out according to F=BH, with B the W ilson B-matrix [158],[161].
3. Afterward, the vibrational S chrödinger equation (76) is solve d b y diagonalization of GF . This
y ields the fundamental frequencies of the normal modes. The eigenvectors L (amplitude of motions)
will be used for computation of th e potential ener gy distribution (PED) according to P=L t L -1 . It
display s how much an internal coordinate is contributing to a normal mode [158].
4. Raman-activities were computed according to equation (87) with use of prior co mputation of the
polarizability derivatives (eq. (85)) and conversion from cartesian to normal coordinate
representation (eq. (86)) [158]. It has to be noted that these are not directl y comparable to
experimental Ram an-intensities.
Steps 1. to 4. have been carried out with the QM-packa ge GAUSSIAN [153].
5. Subsequentl y , GV A [162] has been used for obtaining normal m ode frequencies and a PED in a
more accurate fashion. The differe nce in computational protocol is that the opt imized cartesian
coordinates are manually transformed into a set of natural internal coordinates. Furthermore, since
in Q M/MM geo metry optimizations an additional link-atom has been used, this is taken out of the
computed Hessian. Both vibrationa l frequencies and Raman-intensities are computed by GV A.
The advantage of GV A-produced output is a better and “ easier to interpret” PED [162] . The PED
obtained b y frequenc y calculations w ith GAUSSIAN [153] is actually only interpretable for small
and eas y molecules/sy s tems (as was done for the cofactor -s tudies of Rc FDH, as described in section
5.3.). Furthermore, the force-constants can be s caled in an ef ficient wa y according to equation (83).
This m a y be helpful to overcome s hortcomings of the harmonic approxim ation, as well as of the
used quantum mechanical method which may produce s y stematic errors in the computed vibrational
spectra.
61

The final Raman-spectra are pl otted, using the computed frequencies and intensities with t he help of
a Lorentzian function with a bandwidth of 12 cm -1 . It should be further noted that in accordance to
previous studies [178],[179], the comparison of intensities between GV A [162]-co mputed Raman-
spectra and experimental Resonanc e Raman spectra of tetrapyrrolic chromophores is vali d.
5.2. Com putation pr otocols for Agp2
5.2.1. Model building for Agp2 wild type (WT)
An earl y unpublished version of the X-ra y structure fro m Agp2 Pfr state (with an endo-c y clic
double bond of ring A of BV) fro m Scheerer e t al. [16] has been us ed as an initial set of coord inates
for the computations. Missing prot ein residues 80-85 were identified via primar y structure analysis.
They were part of a loop region. The positions of the corresponding α- and β-carbons were chosen
by visual inspec tion of the environment and then pl aced manually . Afterward, the missing atoms
were included b y using the CHARMM [150],[151] software, followed b y a short minimization (100
steps) for reducing bad contacts. Biliverdine (BV) w as bound to a native C ys via endo-c yc lic
displacement of the double-bond of ring A [16].
Since, for fold ing purposes, the cr y s tals w ere obtained after m e th y lation [180] of all lysine (L y s)
residues, three L y s in a radius of 20 Å around BV also occurred as methylated species in the X-ra y
structure [16]. These three methy lated residues were manually replaced b y nat ive L ys.
Protons for titratable groups have been assigned via visual inspection of the surrounding
environment, potential h y d rogen-bonding partners and s altbridges, with the exception of His248
and H is278 (see last paragraph of this subsection). All missing h y d rogen atoms w ere then inserted
via the HBUILD routine [168] of the CHARMM code [150],[151]. Additionally , a proton ha s been
added manually at the propionic side-chai n of ring C according to experimental findings [23].
Subsequently the protein has been solvated in a hexagonal box with a=b=90, c=120 and α=β=90.0°,
γ=120.0° of TIP3P w ater molecules with the help of CHARMM [150],[151]. Additionall y , the
solvated structure has been treated wit h NaCl (using the VMD package [173]). A sodium or chloride
ion has been placed in proximity of each char ged amino acid not involved in forming salt-bridge s.
Afterward, ions have been placed randomly in 5 Å proximity of the s olute to achieve electro-
neutrality of the system [166].
Finally the water molecules and hydrogen atoms in the structures were then equilibrated with help
of the program NAMD [169] and the CHARMM all-atom force field [85]. First a minim ization of
5000 steps was carried out, followed b y heating for 1 ns and equilibrating for 15 ns (under slowly
reducing restraints for water m olecules). For further details of simulation procedure s. section 5.1.2.
As described in section 3.2., the protonation states of the conserved His248 and His278 are of large
significance. Therefore, nine m odels have been built (dd, de, dp, ed, ee, ep, pd, pe, pp) in which all
possible protonati on states of the two hist idines are combine d. The for mer letter refers to His248
and the last one to Hi s278. Here d denotes the protonation on N δ and e the protonation of N ε . P
represents a doubl y protonated histidine. The dif ferent protonation states were realized with the
HBUILD routine [168] of CH ARMM [150],[151]. All further discussed QM/MM co mputations
were carried out for all nine models. The same has been done for six models (de, dp, ee, ep, pe, pp)
with a chro mophore with missing h y d rogen on PSC(C). See Figure 21a for examples of built
models of Agp2.
62

Figur e 21a. Structure of Agp2 BV bi nding-pocket with ed-protonation of H248 and H278, solvated
and neutralized (after equilibration) with surrounding solvent a nd crystal waters; based on [16]
5.2.2. Model building for Agp2 BV - variants
Additionally , dif ferent non-native variants of the cofactor BV have also been used inside the protein
structure of Agp2. In detail, BV was either meth y lated at P SC(B), PS C(C) or both (see Figure 21b).
The propion ic side- chain which was not methylated was neutralized b y adding a proton. These
modifications have been included in the corresponding structures via patching of the protonated
crystal structure [16]. Parameters for CH 3 groups have been taken from acetaldehyde [85]. The
models have been protonated, solvated, neutralized and equilibrated as described in 5.2.1. The ed-
model for His248 and His278, respective ly , has been used for these ty pes of BV -variants.
Figur e 21b. Exam ple of BV -variant: Structure of Agp2 chromophore BV wit h additional methyl
groups at propionic chains of rings B a nd C
63

5.2.3. MD-simulation pr otocol for Agp2 ( WT and BV -variants) with classical
for ce field
All MD-simulations w ith classical partial char ges for Agp2 (+ variants) have been carried out
according to the protocol as discussed in section 5.1.2. Differe nces in the si mul ation conditions for
Agp2 in contrast to Cph1 have been : All heav y atoms of the chromophore BV have been kept fixed
during all simulation steps/procedures. This was necessar y due to an inefficie nt parametrization.
The used parameters were only fitted for the Pr state of BV and not the Pfr . T ests with free atoms of
BV showed strong unnatural behavior like twisting of bond ang les and generation of strained
geometries between the individual rings. W ater has been minimized (10 ps) , heated (0.5 ns) and
equilibrated (7.5 ns ) according to section 5.1.2. Afterward, minimization, heating and equilibration
of the co mplete s ystem has been carried out similar as in described in section 5.1.2. Finally ,
production runs of 20 ns have been carried out for Agp2-sy s tems with the different variants of the
chromophore.
5.2.4. QM/MM geometry optim izations for Agp2 WT
QM/MM model buildi ng
As s tarting point for the QM/MM ca lculations, the solvated X-ra y structure [16] with equilibrated
water -box and h y d rogen atoms have been taken. The water -box has been cut to a sphere with a
radius of 40 Å centered at the p yrrole-nitrogen of ring C of BV . The QM-part consisted of BV and
1 1 water molecules and the side-chains of protei n residues His248, Hi s278, T y r 251, T yr205,
Asp196, Ser462, Cy s13 (170 atoms) (see Figure 22 for an example of the dd model).
Figur e 22. QM-region of dd-model of Agp2: BV+Cy s-link (orange); CW (cry st al-waters)+protein
residues (color)
QM/MM pr ocedure
The program CHEMSHELL [155],[176] has been used for QM/MM calculations. The QM region
was tretated with DFT (B3L YP). 6-31G* and 4-31G* basis sets have been used for hea v y atom s and
hydrogen atoms, respectivley . The QM-package TURBOMOLE [154] has been used for the DFT -
computations. MM-atoms are described b y the CHARMM27 force field [85]. An active region with
64

radius of 20 Å centered at the N of ring C of the BV -cofactor was chosen, including the
chromophore and surrounding residues and water molecules. All ato ms outside the active region
remained fixed during the optimization and where treated as point charge s. The geometr y of the
chromophore binding-site for all nine m odels was optimized as discussed in section 5.1.5.
Additionally , following the same procedur e, some calculations w ith extended basis set for hydrogen
atoms has been carried out: Here, a 6-31+G* basis set with polarization and dif fuse functions have
been used for h y drogens, while the other ato ms have been treated with 6-31G* as before. Due to the
higher computational costs, the QM region has been reduced to the side-chains of C ys13, His248,
His278, BV and the p-water and three water molecules.
5.2.5. QM/MM geometry optim izations for Agp2 BV -variants
For Agp2 BV -variants a QM -region including the side-cha ins of C y s13, Arg21 1, H is278, Hi s248,
Asp196, BV -variant and the 1 1 waters has been chosen. Th e optimization procedure was the same
as for Agp2 W T and as discussed in sections 5.1.5. and 5.2.4.
5.2.6. Analysis of opti m i zed geo m et ries for Agp2 (WT a nd BV -variants)
QM/MM optimized s tructures were analyzed using the MOLDEN [181] program for estimation of
correct convergence and investigation of H-bond-networks and VMD [173] for RGYR, RMSD,
RMSF and distance calculations and comparison to the protona ted, s olvated and water -equilibrated
crystal structure [16] as reference.
5.3. Com putation pr otocols for Rc FDH
5.3.1. Model building wit h QM softwar e
As starting point for all used models, the homology models b y Hartmann et al. [49] have been used
(see Figure 23) for both the oxidized and reduced for m of the cofa ctor . These models based on
dif ferent cr y s tal structures of FDH of E. coli [52],[56], Fe-onl y h y drogenase fro m Chlostridium
pasteurianum [182] and NADH-deh y d rogenase from Therm us thermophilus [183]. The actual
model building has been done with the Gaussview [184] software .
Figur e 23. Mo-cofactor (homology model of oxidized Rc FDH [49]; Cy s386-ligation is not shown)
65

Since the co mplete Mo-containing cofactro (Moco) is ver y lar ge and thus co mputati onal cost ly ,
only a fragment of the Moco has been used for in most calculations (see section 12.1.4. for detailed
information on model-size tests) which is named model 1. All used models are depicted in Figure
24. If not stated otherwise all models have be en built in the following:
In all used model 1 compounds the GDP-units were omitted. Dithiolene ligands, as well as linked
pterin moieties have been included. M odels with Mo in +VI oxidation had C y s 386 coordinated to it
(with one exception, where Cys386 wa s absent for test purposes, as described in section 12.2.1.)
and O, OH, S or SH ligand present. Models with Mo in +V/+IV oxidation had either C y s386
present or absent. Either Cy s386 is ligated to metal or uncoordinated and either char ged or
protonated and in near proxi mity to the cofactor . In all MoIV models O, OH, S or SH is ligated as
sixth ligand at mol y bdenum. The models are named M oZ_X, were X is either an O, OH, S or SH
ligand and Z denotes the oxidation-state of Mo which can either be +V I, +V or +IV . If us ed, Cys386
has been mimicked by a trunca ted version, in which the C β is saturated by protons. C β re mained
fixed during all computations in order to simulate the lesser degree of freedom the residue would
feel in the native protein environme nt (s. Figure 24). Further , the truncated GDP-connection-sites (s.
Figure 24) have been fixed, in order to m imic the real situation in the complete Moco .
Figur e 24. Schematic representation of used models 1-4 for QM; X denotes either S, SH, O or OH
ligands; Y denotes either ligated Cys (m imicked b y t runcated version) or empty ligation site; Mo is
in either +VI, +V or +IV stat e; asterisk denotes fixed truncation spots
Char ges have been assigned as follows: each dithiolene (-2), deprotonated C ys (-1), doubly estered
PO 4 (-1), O /S group (-2), OH/SH group (-1) and Mo either (+6), (+5) or (+4). Multiplicity of the
complex has always been singlet for MoV I and MoIV (doubly occupied MOs) and doublet for
MoV . The protein env ironment has always been mimicked b y application of a dielectric field
constant of ε=4 with PCM-calculati ons [136] in accordance with previous studies [71],[137],[138],
[139] (computations in vacuum have also been carrie d out and are shown in Appendix B).
66

All used model 1 co mpounds dif fer only in the possible ligation of C ys to molybdenum and the
identity of the s ixth ligand (X in Figure 24). Models with Mo-ligations differing from the
descriptions in this chapter are: Model 2 with additional CH 2 +PO 4 H - attached at dithiolene and
model 3 compounds w ith attached diphosphate and ribose at the dithiolene moieties, as well as
model 4 compounds (complete Moco; onl y for test purposes). Furthermore, MoVI_X model 1
compounds w ith missing C y s386, MoV_X model 1 compounds with dif fering ligation-scenarios for
Cys386 and MoVI-, MoV - and MoIV -models 1 with other l igands, such as N 3 - and formate. These
are described accordingly in the respective secti ons in part V of this work and the Appendix.
The main focus in this w ork has been on calculations with model 1 compounds. For tests of models
2, 3 and 4 please see section 12.1.4 and the Appendix.
5.3.2. QM – calculations for Rc F DH m odels
Geometry optimizati ons
The geometry optimizations follow the s tandard procedur es fro m GAUSSIAN09 [153], using DFT -
methods [108],[1 10] with the BP86 [1 1 1] functional and a 6-31G* basis set for C, H, N atoms
[185],[186] and def2-TVPP basis sets for S and M o [187]. Additionall y , an ECP [188] at the same
level of theor y has been applied to the metal atom Mo. This level of th eory is in accordance to
recent theoretical investigations, as done within the work of Mota et. al. [53] and Porcher et. al.
[189]. Concerning the basis-set the tr eatment of metal and the sulfur -atoms in this work is of even
higher ac curacy , as compared to named works. F or description of the environment of the cofactor ,
PCM-methods [136] have been applied, as implemented in GAUSSIAN09 [153]. Geometr y
optimizations have been carried out using standard SCF-converge nce methods with a converge nce
criteria of 10 -5 . Exceptions to this are models including azide and models w ith MoV . These had to
be co mputed with a combination of reduced conver gence criteria of 10 -4 and a Newton-Raphson
algorithm (quadratically conver gent, QC) [158].
Raman-spectra co m putations
Following geometry opt imizations, the vibra tional frequencies and Raman-activities have been
computed using the NMA-approach [63],[158] as described in section 5.1.6. (steps 1. to 4.; no
application of GV A [162] and thus Pulay's approach [163]). It was carried out by the GAUSSIAN
program [153] . Since no GV A procedure was used, neither the frequencies, nor the force-constants
have been scaled. Also only Raman-activities have been computed, since GAUSSIAN onl y
provided these. The obtained frequen cies and activities have been plotted, us ing a Lorentzian
function with a fixed bandwidth of 12 c m -1 for m o deling of the bandshape [158]. The computed
Raman-spectra have been compared to experimental Resonance Raman-spectra under diffe rent
conditions for both oxidized and reduced s pecies of Rc FDH (together w ith different stabilizing
agents, like azide). The experiments were carr ied out and the corresponding data was provided b y
Stefan W ahlefeld et al. in the group of Prof. Dr . Peter Hildebrandt at the TU Berlin [73]. For further
information on experi mental procedures, as well as the respective data, please consult sections 12.
and 13.
Thermodynamic binding ener gies
Bond-breaking ener gies/enthalpies have been calculated after geometrical opt imization of
individual models (as shown in 13.5. in detail ) b y the following subtraction scheme:
67

Here, Δ E bond-breaki ng is the ener gy needed to break the bond between two moieties A and B and E AB the
ener g y of the connected s y stems A and B, while E A and E B are the respective energie s of the s ystem-
parts A and B, respectively . Δ E corresponds to the total electronic ener gies of the corresponding
sy stems after the respective geometrical optimizations.
Furthermore, the bond dissociation enthalpies have been computed [190] using the formula (92)
(with H° being the standard e nthalp y ; Δ f meaning formation and Δ r meaning reaction):
Frequency computations with GAUSSIAN09 [153] generate the sums of electronic and thermal
enthalpies [190] ε 0 +H corr which makes it possible to com pute reaction enthalpies according to (93):
Here, ε 0 is the electronic total ener g y of the s y stem and H corr is the enthalpy correction. Hence, only
molecular information is neede d and no atomic information is necessary [190].
5.4. Com putational r e sour ces
Chemical formulas have been buil t with the program Chemdraw [191]. Other visualizations of
molecular matter has been carried out with either VMD [173], Gaussview [184] or Molden [181].
The computations have been carried out using several dif ferent resources: For MD-simulations the
Norddeutscher V erbund für Hoch- und Höchstleistungsrechnen [192] (HLRN) provided the
necessary computation times. QM/MM-calculations, as well as QM-geometr y optmizations and
subsequent frequenc y computations were done with the help of the computer cluster of the Institute
of Mathematics [193] at the T echnische Universität Berlin. Computations of observables, such as
RMSD and RMSF , as w ell as usage of VMD [173], Molden [181], Gaussiview [184], Chemdraw
[191] and standard program s have been carried out at conventional o f fice computers at and
provided b y the T echnische U niversität Berlin and the working group of Prof. Maria Andrea
Mroginski.
68

Δ E bond − breaking ( A − B ) = E A + E B − E AB

(91)
Δ f H ° ( 298.15K ) = ∑
products
( Δ r H ° product ( 298.15K ) ) − ∑
reactants
( Δ r H ° reactants ( 298.15K

(92)
Δ r H ° ( 298.15K ) = ∑ ( ϵ 0 + H corr ) products − ∑ ( ϵ 0 + H corr ) reactants

(93)

Part III.
Molecu lar dynamics and QM/MM b ased Raman-
spectra co mputations for phytochr ome Cph1
69

Chapter 6
MD simu lations of Cph1 w ith classical for ce field
MD-simulations provide a useful tool to visua lize structural and dynamical properties of biological
sy stems. It is les s demanding in terms of computational costs than pure quantum mechanics and
nevertheless is able to y ie ld important insigh ts into biological processes. It can also be combined
with other methodologies, like QM/MM, to obt ain for example Raman-spectra of the investigated
matter . A prerequisite for this is the correct application of force fields and a realistic simulation of
biological behavior . Since MD-simulations us e point-charge s to describe the charge environment of
atoms, the influence of polarization effects is not included in conventional force fields (s. section
4.1.). This deficienc y can lead to wrong or unrealistic d ynamic s of e.g. water molecules. As stated
in section 3.1. i t has been found e.g. in the case of bacteriorhodopsin that classical force fields with
conventional point-char ges lead to a deca y of the water-ne twork [57],[58],[59] w hich is in contrast
to NMR-observations [60]. Espec ially waters near the Schiff base showed high mobility and a high
rate of fluctuations (instability on its supposed pos ition) [61],[62], where in experimental studies
they have been found stable [60]. This has been attributed to the insuf ficient water-m odel which has
been used [61],[62] and the m issing of polarization ef fects in the applied force fields [65]. This has
been addressed b y using a so-called polarized force field which led to a stabilization of the water -
network [65],[66],[67].
A similar observation has been made in studies by Mroginski et al. [63] and unpublished studies b y
Daminelli et. al. for Cyanobacteri al phytochr ome Cph1 [64]. They found that the highly conserved
(crystal) pyrrole (p)-water showed high mobility and thus instability on its starting position between
the p y rrole-rings of the PCB chro mophore. The crystallographic s tructure [10] and NMR-studies
[194],[195] suggested that th is w ater remains on this position. This instability of named water
molecule led to a broadening of the corresponding NH-in-plane (ip) rockings in calculated Raman-
spectra [63] and therefore, makes them less comparable to expe riments.
T o overcome this problem a new set of polarized or Mulliken char ges have been derived for the
chromophore binding-pocket (CBD) of PCB using QM-methods in a h y b rid QM/MM approa ch,
including the chromophore and the p-w ater . These display the polarization effect of the environment
on the cofa ctor and vice versa and have been included into the existing force field. In this w ay a
polarized force field for Cph1 has been generated. It has been used for MD-simulations to show its
superiority over classical force fields for stabilization of the p-water . Furthermore, the results of
these MD-simulations have been used in subsequent h y b rid QM/MM optimizations and Ra man-
spectra co mputations to s how the ef fect of a more stable p-water on the corresponding NH-ip
rocking bands of the p y rrole-rings. Thus, the necessity of polarized force fields and their
development for correct reproduction of s tructural and d y namical properties of biological s ystems
will be discussed. See sections 5.1.3. for details on gener ation of the polarized force field and 5.1.4.
for the MD -simulation protocol. All used models of Cph1 were based on the X-ra y structure [10]
which has been taken from the Protei n Data Bank [15] (entry : 2VEA, resolution 2.45 Å).
6.1. Instability of p yrr ole-w ater with classical MM for ce field
RMSD of C ph1 and chr omophore PCB. The highly mobile nature of the p yrrole-water with
classical force fiel ds w as demonstra ted b y 40 ns long MM-MD s imulations in NPT environm ent,
with all free atoms, as described in section 5.1.4. First the stability of both protein and PCB were
70

confirmed by monitoring the average RMSD values (with respect to s olvated X-ra y structure [10])
over the course of the 40 ns production run. Results can be seen in Figure 25. The structure of the
chromophore stays stable during the whole 40 ns run without any fluctuati ons in the RMSD. In
contrast to this the protein shows increasing RMSD values in the last 10 ns of the simulation (up to
5 Å). They are due to movements of loop-regions, while the backbon e of the protein stays s table.
These fluctuations may orig inate in the high mobi lity of water -network, due to the missing
description of corresponding polarization ef fects. The average RMSD values of the protein over the
whole course for Cph1 is about 4 Å and for PCB it i s 1.25 Å.
Figur e 25. A verage RMSD of Cph1 (red) and PCB (black) over time of 40 ns of production for
classical MM-MD
Structur e and RMSF calculations of PCB. The overall structure of the chro mophore at the end of
40 ns (see Figure 26) shows planar rings A and C, while rings B and D of PCB are partl y tilted out
of plane (for starting s tructures and position of p-water , please s. Figure 5). RMSF calculations of
the cofa ctor s uggest, the most flexible part of PCB being ring D (RMSF=10 Å) and th e met hine
bridge between rings C and D (RMSF=9 Å ), followed by both methy l groups on rings A and C
(RMSF=8 Å) and the methine bridge bet ween rings A and B (RMSF=7.5 Å).
Figur e 26. Structure of PCB after production run of 40 ns;
(left) on-top view; (right) side view
Instability of pyrr ole-water dur ing pr oduction run. S ince both backbone of protein as well as
chromophore remain stable during the production run, it could be estimated that also the water -
network and espec ially the pyrrole-water should show only s mall fluctuations. For the purpose of
monitoring the dynamics of the initial p-water (W1) the distances between the main h ydrogen-
bonding partners (N δ of His260, backbone CO of Asp207, N A , N B , N C and CO A of PCB) and the
oxygen of the p-water have been monitored, as depicted in Figures 27 and 28. Initiall y the w ater
71

stayed in close h ydrogen-bonding distance to all partners but His260 for 10 ns. At around the 10 th
ns, fluctuations occur for all m onitored distance s, except between Asp207-CO and O of p-water .
The w ater sta y e d in distances between 2 Å and 3 Å to Asp207 and between 3 Å and 4 Å to the
cofactor for the next 10 ns. A fterward higher distances (up to 6 Å) can be observed between water
and PCB. W1-PCB distances fluctuate from this tim estep till the end of the simul ation, at which
they re main ~ 5.5 Å. An increased distance (up to 6 Å) can be observed for the Asp207-p-water
distance after 36 ns of si mulati on time till the end. Thus the p-water is instable on its position
between the four p yrrole-rings. After 35 ns of production time, the water flew out of its cav ity
(Figure 27 and 28) w ith increasing distances to all bonding partners up to 7 Å, with exception of
His260. The water -molecule changes site of the PCB from below (near Asp207) to above PCB and
stays near His260 (see Figure 29). Furthermore, the changes in the tiltangle of the pyrrole-water ,
with respect to the p y rrole-rings have been measured (not shown here). Constant changes of about
3° also identify the instability of name d water .
Figur e 27. Evolution of distances between py rrole-water W 1, His260- N δ and A sp207-backbone-
CO during 40 ns classical of MM-MD
Figur e 28. Evolution of distances between py rrole-water W 1, N A , N B , N C and CO A of PCB during 40
ns of classical MM-MD
Over the time-course of the MD-simulati on, after the p-water left the position between the rings,
dif ferent w ater molecules flew in a nd out of this position.
72

Figur e 29. (left) Position of W8160 after 40 ns of production run; (right) with visualized distances
to N A , N B , N C , CO A , His260-N and Asp207-O
Instability of solvent waters during pr oduction run. After the p yrrole-water (W1) flew awa y
from its in itial position at PCB i t was replaced b y a solve nt water (W2) whic h again showed the
unstable behavior of its predecessor . In Figures 30 and 31 the corresponding changes in distances
between the ox ygen of the solvent water W2 and N δ of His248, backbon e CO of Asp207 and N A ,
N B , N C and CO A of PCB are displa y ed. At around 8 ns of simulation time, the molecule dove into
the chromophore binding pocket and stay ed th ere in hydrogen-bonding distance (for at least 2 ns
(see Figures 30 and 31) to named h y d rogen-bonding partners (although W1 is s till located between
the pyrrole-rings at this time). Then it fluctuated for almost 20 ns until it again shifted onto the p-
water location between 34 ns and 36 ns of production time. At this time W1 flew into the solvent.
The lifetime of W2 on this position is only abou t 4 ns, then it also wanders back into the solvent,
replaced by another one (not shown here).Observation of the changes in the tiltangle of W2 shows
the same trend as noticed before with W1, namel y highly deviating angles over the whole course of
the simulation.
Figur e 30. Evolution of distances between solvent water W2, His260-N δ and Asp207-backbone-CO
during 40 ns of classical MM-MD
It can be said conclusivel y that the ini tial water molecule at the p-water position is unstable, while
solvent water molecules fly in and out of the cavity . Therefore, the unpubli shed findi ngs of
Daminelli et al. [64] are confirmed.
73

Figur e 31. Evolution of distances between solvent water W2, N A , N B , N C and CO A of PCB during 40
ns of classical MM-MD
6.2. Com putation of Mulliken charges and generation of a polarize d for ce field
Mulliken char ges have been calculated for all atoms (protein side-chains and waters) inside a 4 Å
sphere centered at N A of the PCB via a h y b rid QM/MM approa ch, according to section 5.1.3. (for
detailed information about the residues taken into account, please consult section 5.1.3. and Figure
20b). In this wa y , polarization effects of the cofactor on its environment and vice versa have been
included into the force field. For a comparison between old point char ges and computed Mulliken
char ges, please see A ppendix H:
Of s pecial interest is the effect of the protein environment on the charge s of the chromophore.
Bigger changes of about 0.04 – 0.3 can be noted for almost all PCB atoms. Especially the
methylene, carboxylic and carbon yl groups, as well as the ring-nitrogens show high deviations fro m
their original MM-char ges. In contrast to this, the p-water shows onl y moderate changes of 0.05 for
the oxygen and ~0.03 for the h y d rogens. Protein side-chains inside the 4 Å sphere centered at PCB
show the most deviations in near vicinit y of the chro mophore. Especially N and the corresponding
H of His260 and H is290, as well as C, CO and the aminofunction of Arg254 and C, CG and
backbone O of Asp207 show higher deviations in their respective atom charges. In most cases
already positive char ges became even more positive and negative charges more negative. Thus, it
can be seen that the chro mophore is hugely af fected b y the charge environ ment of the protein. The
amino acids have a big impact on the charges of PCB. In a reverse fashion, the chromophore affects
the point char ges of surrounding residues.
The “increased” positive or negative char ges, especially of the ring-nitrogens of the chromophore
might facilitate a more stabilized p yrrole-water on its posit ion between the four p y rrole-rings.
Therefore, the original MM-char ges do in parts not correctly reflect the real char ge environment.
These charges have then been incorporated into the existing force field and thus used to generate a
polarized force field (polff). This has been us ed for subsequent MD-simulation (polf f-MD), in
which the stability of the p yrrole-water on its position between the four rings of PCB was subject of
investigation.
74

6.3. Stabilization of p yrr ole-w ater via polarized for ce field (polff-MD)
RMSD of Cph1 and chro m ophor e PCB. The RMSD values for both protein and PCB have been
computed with respect to the solvated X-ray structure [10] (see Figure 32): Compared to the MM-
MD it is notable that the RMSD values for PCB are slightly lowered to an average RMSD of 1.2 Å
and stay constant for the whole simulation. Also the RMSD's of the prote in show a decrease in
values (betwe en 2 Å and 3 Å) and a s maller average RMSD of about 2.5-3 Å. Both decreases in
RMSD may be attributed to a stabilization ef fect of the implemented polarized char ges. Exceptions
are the RMSD's of the protein between 30 ns and 40 ns, when an increase up to 7 Å can be noted.
The reason for this are bigger m ovements of loop-regions and not a denaturation of the protein.
Figur e 32. A verage RMSD of Cph1 (red) and PCB (black) over time of 40 ns of production for MD
with polarized f f
Structur e and R MSF ca lculations of PCB. Comparing the structures of protein and chro mophore
from the polf f-MD w ith the ones of the MM-MD-simulation (s hown in Figure 33) reveals
similarities in both. W ith the exception of loop regions, the protein structure fro m the polf f-MD
shows less deviations fro m the cr ystal structure than the one of the MM-MD (RMSD’ s of 3.5 Å and
4 Å, respec tively). The biggest deviations between both structures can be found in the GAF region
near the chromophore which ma y be due to the implementation of the polarized char ges in this area.
Furthermore, the structures of PCB are nearl y identical, with only small dif ferences in the tilting of
the methyl groups of rings B and C (RMSD(MM): 1.25 Å; RMSD(polf f): 1.2 Å). A lso in the
structure of the polf f-M D, rings A, B and C are in plane (like it has been found in X-ra y studies
[10],[12]) with ring D out of the plane (same angle as in pre vious simulation).
The residues H is260 and Asp207, which are essential for h y drogen-bonding of the p-water , are
closer to the cofactor (as in MM-MD) and a lso the ox y g ens of Asp207 (both OH and CO) are
directed towards the p-water position (in contrast to MM-MD, where both O-groups point slightly
away from the p-water). Therefore, both residues are better available for H-bonding of the p-water .
RMSF calculations of PCB for the polff-MD reveal an identical pattern as for the MM-MD, but
with strongly decreased RMSF values: RMSF of 6.9-7.5 Å for the biggest fluctuations. Si milarly
ring D (7.5 Å) and the corresponding bridge between rings C and D (RMSF=7.3 Å) are the most
flexible parts, followed b y the methyl groups of rings A , C and D (RMSF=7 Å ) and the bridge
between rings A and B (RMSF=6.9 Å).
75

Figur e 33. Structures of both Cph1 (left) sensoring domai n; (right) PCB chromophore (with
additional His260 and Asp207) for MD with polarized f f (blue and colored according to dif ferent
atoms) compared to MM-MD (re d)
It can thus be seen that the implementati on of polarized char ges diminish the fluctuations in the
structure of the cofactor .
The stabilization effect of the polarized force field on the water-net work inside the CBD in contrast
to the standard force field can be seen, when looking at the d y n amics of both crystal and s olvent
waters inside the CBD. For these purposes the distances between the six h y d rogen-bonding partners
(N δ o f His248, backbone-CO of Asp207, N A , N B , N C and CO A of PCB) and the cr ystal water W1, as
well as the solvent wate r W3 have been m onitored and are presented in the following:
Instability of pyrrole -water W1 during production run. W1 resides at the start at a position
between the four pyrrole-rings of P CB (s. Figure 5), from which it flies away after only 1 ns of
simulation time. Afterward it stays stable on a position underneath the chro mophore and in close
approximation to CO A of PCB (O(PCB)-H(W1)-dist ance of about 3 Å) w ith onl y small fluctuations
of less than 1 Å (not shown here).
Stabilizati on of solvent water W3 during producti on run. After 4 ns the empty cavity in the
center of the p y r role-rings of the cofactor is again occupied by a water molecule (W3) which moves
inside from the solvent. F igures 34 and 35 displa y the continuous occup ation of this molecule on
the crystallographic p y rrole- water location [10] with onl y small fluctuations in the anal y zed
distances. It stays in close h ydrogen-bonding distance (less than 3 Å) for all six binding-partners
(N δ o f His248, backbon e-CO of Asp207, N A , N B , N C and CO A of PCB) with exception of CO A of
PCB. The observed shifts in RMSD are all under 1 Å and thus tolerable. This indicates th e desired
stability of a water on the p-water position. F igure 36 displays the p-water after 40 ns of simulation
time on its supposed position at PCB wit h the potential hydrogen-binding partners.
T iltangles of W3 do not change (fluctuations are smalle r than 1°) after 4 ns of s imulation. Also it is
notable that no additional water molecules are present at the position between the four p y rrole-rings
of PCB (which is in contra st to the observations of the MM-MD).
76

Figur e 34. Evolution of distances between solvent water W3 and His260- N δ and Asp207-
backbone-CO during 40 ns of MD with polarized force field
Figur e 35. Evolution of distances between sol vent water W3 a nd N A , N B , N C and CO A of PCB
during 40 ns of MD with polarized force field
Figur e 36. Position of W3 after 40 ns of production run wit h visualized distances to N A , N B, N C ,
CO A , N δ of His260 and backbone -CO of A sp207
77

6.4. Discussion
In conclusion it beco mes obvious that without polarized char ges inc luded in to the force field, the
waters, especiall y on the p-water location, are unstable and exchange very often during MD-
simulations. In contrast to this, the addition of polarized charges for PCB and its surrounding
residues leads to only one exch ange in water molecules observed in the corresponding MD. T he last
water sta y s stable without bigger movements and fluctuations on its supposed position. Since only
small dif ferences between the s tructures of the chromophore in both MD simulations could be
observed, the stabilizing effect is assigned do be mainly due to the usage of the polarized force
field. This is due to the fact tha t the implementation of M ulliken char ges incorporate polarization
ef fects of the CBD-residues on the cofa ctor and vice versa . In classical MM-MD’ s these
polarization ef fects are neglected, s ince the point-char ges of protein residues have been determined
for the Apo-protein (see se ction 3.1.). But the interaction of charges of cofact or and environm ent
play important roles in stabilization of the surrounding water -network. The polar ized charges help
to reproduce the water -stability as observed in NMR [194],[195] and cr y stal studies [10]. F urther
repetitions of the named MD's (not shown here) show the sa me trend in stabilities and confirm the
reproducibility of the results. Thus a polarized force field for at least a s mall region of interest (e.g.
cofactor binding-site) can prove necessary , if MD-simulations show similar instabilities of water
molecules. Since the com putation of Mulliken char ges, as well as the generation of a polarized force
field can be time-consuming it should be tested beforehand if their application is necessary . In
conclusion, polarized force fields provide a useful and ef fective way to generate a more rea listic
char ge environment and overcome instabilities in MD-simulations. Additionally , it can help to
improve computed Raman-spectra, as i s discussed in the next chapter .
78

Chapter 7
QM/MM based R aman-spect ra-calculation
Calculations of Raman-spectra for both MM-MD- and polf f-MD-simulations have been carried out
(as explained in s ection 5.1.4.). In the past, differences have been observed between computed
Raman-spectra and experimental ones, especiall y regarding the N H in-plane rockings (broadening
of corresponding band) [63]. It has been suggested that this may be due to the instability of the
pyrrole-water on its position during normal M M-MD s imulations. Therefore, i t has been of interest,
if spectra computed out of polf f- trajectories will lead to an increased comparability to experimental
data [165]. For this purpose QM/MM geometr y opt imizations have been done for 25 snapshots (of
both MM-trajectory and polff-t rajectory), using QM methods for the chromophore, the sulfide-link
to the protein (including Cα of the corresponding C ys residue) and the p-water . The rest of the
sy stem has been treated with an MM force field. In the case of the MD simulation with polarized
force field, the new char ges have been added to the MM part of the optimization. Afterward the
optimized structures were used for subsequent Raman-spectra co mputations according to procedure
explained in section 5.1.4. T wo dif ferent t y pes of spectra are presented and discussed in the
following: First, minimum energy spectra for both t ypes of MD-simulations. For these, the
snapshots with the lowest energy after optimization has been used, called minimum energy spectra
(min-spectra). And second, s um-spectra were created b y summing up of the da ta over all 25
snapshot-spectra. This is one way to include the influence of fluctuations in the protein environment
and the stability of the p-water on the computed spectra (see section 4.4.4.). These computed
Raman-spectra were compared to experimental Resonance Raman spectra of Cph1 by the group of
Peter Hildebrandt [165].
7.1. Mini mum energy Ra m an-spectra based on MM for ce field VS. polarized
for ce field
In Figure 37a and 37b both calculated minimum ener gy spectra for MM -MD and polf f-MD are
depicted in comparison with the experimental Resonance Ra man spectra [165]. The general trend is
the same in both (MM and polff). The one based on the polff-MD shows dif ferences in intensities
and their distribution, compared to the spectrum of the MM -MD. The bands are more pronounced
and have smaller bandwidth and are mostly of higher i ntensit y in t he former .
For discussion purposes, the complete s pectra has been divided in three different regions. The low
frequency region (600-1000 cm -1 ), in which m ostl y out of plane movements (especially of the four
rings) are present, the middle region (1000-1400 cm -1 ) of single-bond vibrations and the s o-called
marker band region (1400-1700 c m -1 ) which holds the C=C-stretchings of the methine (CH 2 -)
bridges and the NH-ip rockings (rockings) of rings B and C. The focus of the investigation has been
on the mark er band region and especially on the NH-ip bands. The spectra for the MM-MD will be
referred to as MM-spectra and accordingl y the spectra of the MD with polarized force field as polff-
spectra. The results are shown in Figures 37a and 37b and T able 1.
Comparison of the low region for both spectra reveals cer tain dif ferences, as for example the lower
intensity (53) of the peak at 674 cm -1 in the polff case (80 in MM-spectra). Peaks at 727 and 762 c m -
1 (MM) are downshifted compared to polf f (713 and 754 cm -1 ). Al so the band at 769 cm -1 ( MM) is
upshifted. Mos t notable is for instance the s plitting of a broad band (787 c m -1 ) in the MM case into
two distinct peaks (806 and 832 c m -1 ) which create a doublet in the spectrum of the polf f-M D.
79

Following an id entical pattern, an additional doublet occur in the polf f-spectrum at around 970 and
987 cm -1 which is not visib le in the MM case. In general, the spectrum for the polarized trajectory
shows more pronounced and clearl y distinguishable peaks, where in the MM-spectrum often broad
bands are visible. The region between 1000 and 1220 cm -1 shows no differe nces between both (with
exception of an add itional peak at 1 150 cm -1 in polff-s pec trum contri buting to a doublet). In the
region of 1300 – 1350 c m -1 ( mostly CX 2 w aggings and twist ings) also strong deviations in
intensities of a rat io of about 3:1 (polff:MM) between both computed spectra can be seen,
accompanied by additional down- (1290 to 1279 cm -1 and 1307 to 1304 cm -1 ) and upshifts (1320 to
1329 cm -1 and 1346 to 1347 cm -1 ) in the polff-spect rum. Of great interest were the differences in the
marker band region and the appearance of the NH-ip rocking bands. Therefore, a more detailed look
on this region is provided in Figure 37b.
Figur e 37a. Minimum ener g y spectra for both MM (A)- and pol f f (B)-MD simulations of Cph1
compared to experiment (C) Experime ntal Resonance Raman spectrum [165] (Cph1 in frozen
solvent); asterisk in C denotes solvent bands
On first view , it can be seen tha t the peaks in the polff-spectrum are more pronounced and better
distinguishable from each other . A dow nshift of 1415 and 1451 cm -1 compared to the M M-case
(1432 and 1463 cm -1 ) can be observed. These two bands correspond to the C=C (ring C and methine
bridge betwe en rings C and D ) stretchings, as well as the NH-ip rockings of ring A in both-spectra
or the N H-ip rocking of ring D in the case of the MM-spectrum. For a more detailed mode
composition of the bands in the marker band region, take a look at T able 1. Furthermore, two
80

additional bands can be seen in the polf f-spectrum at 1479 and 1501 cm -1 (C-C and C=C stretchi ngs
of ring B), where onl y a small shoulder at the peak at 1512 cm -1 is present in the MM-spectrum. The
interesting NH-ip rockings of rings B and C (w hich consist m ostl y of these rocking/roc king
movements) are visib le in both spectra in different qual ity : The ones in the MM-spectrum at 1512
and 1565 cm -1 are downshifted in comparison to the polff-data (in which the y are at 1519 and 1576
cm -1 ). The band w hich in the past has been found mostly af fected b y the instab ility of the p-water in
MD-simulations, is the one at 1565 cm -1 (MM ) and 1576 cm -1 (polf f) and corresponds mostl y to the
NH-ip rocking of rings B and C. In the MM-spectrum this band is flat and broad, while in the case
of polff it is sharper and higher in intensity . Also th e intensity-distribution between named two
signals is w eighted differe ntly: In the case of MM, the band at 1512 c m -1 is higher than the one ate
1565 cm -1 . This is in contr ast to the distribution for the polf f-MD which is vice versa . F inally , the
highest peaks in both spectra, are corresponding to C=C stretchings of methine bridges between
rings C and D or B and C, as well, as O=C stretchings of rings A and D. They are at 1615 cm -1
(MM-spectrum) and 1614 cm -1 (polff-spec trum). The signal is of higher intensity (about two times)
and the shoulder (corresponding to C=C stretchings of the methine bridge between rings A and B
and rings B and C mostly ) more visible in the latter spectrum.
Figur e 37b. Minimum ener g y spectra for both MM (A)- and polf f (B)-MD simulations of Cph1,
(C) Experimental Resonance Raman spec trum [165] (Cph1 in frozen solvent): zoom into the m arker
band region with visible NHip-rockings; aste risk in C denotes solvent bands
As a first conclusion, the usage of polf f-MD’ s for spectra computations lead to spectra with s harper
and more distinctive bands, especially the Nhip-rocking bands. This can be due to a more stable and
stronger interaction between the cofactor and the py r role-water in the polf f-MD.
81

Comparison of calculated spectra to the experimental Resonance Raman spectrum [165] (carr ied
out in th e group of Peter Hildebrandt) revealed the following: In the region of 600-1000 cm -1 there
are some shifts in frequencies. For example both peaks at 670 cm -1 (MM) and 674 c m -1 (polf f) are
upshifted compared to the experi ment (659 cm -1 ). Also 713 c m -1 (polff) is downshifted, but closer to
the experiment (718 cm -1 ) than the corresponding peak in the M M-spectrum (727 cm -1 ). T he doublet
at 792 and 809 cm -1 (exp) is onl y visible in the polf f-spectrum (upshift of about 14 and 23 cm -1 ) and
missing in the MM case. The experimental bands a t 992 and 1003 cm -1 are downshifted in the polf f-
spectrum of about 22 and 16 c m -1 , respectivel y . These are not visible in the MM data. Furthermore,
in the middle-region the double t at 1 1 14 and 1 133 c m -1 (exp) is only partl y visible (onl y one s ignal
at 1 1 19 cm -1 ) in MM, while in the polff-spectrum it is a complete doublet (upshifted b y 10 and 17
cm -1 ). Additionally , th e two peaks at 1239 and 1254 cm -1 (exp) are s harper in the polff-spec trum
(also upshifte d about 15 and 25 cm -1). The peaks in the region between 1300 and 1350 cm -1 are
sharper in the polff-spectrum and show the same intensit y distribution as in the experi ment. Further ,
the downshifts (13, 5 and 32 cm -1 , respecti vely) are smaller than in the case of MM.
ν/cm -1 (I) - MM - MD ν/cm -1 (I) - polff-MD exp – Cph1-fr ozen
solvent [165]
Normal Mode
composition
1432 (7) 1415 (24) -
N-H rocking of ring D
(MM), C=C stretching a t
ring C and methine bridge
between rings C+D (polf f)
1463 (6) 1451 (20) -
C=O deformation + N-H
rocking at ring A
(MM+polf f)
1512 (14) 1519 (31) 1524
N-H rocking at ring C +
O=C stretching a t ring B
(MM), N-H rocking at rings
B and C + C=C stretching
of methine bridges between
rings A+B and B+C (polff)
1565 (1 1 ) 1576 (40) 1570
N-H rocking at rings B and
C (MM+polf f), C =C
stretching of methine bridge
between rings B+C (polf f)
1615 (100) 1614 (100) 1635
C=C stretching of methine
bridge between rings C+D
(MM+polf f), O=C
stretching at ring D (polf f)
1626 (31) 1632 (75) 1649
C=C stretching
(MM+polf f), O=C
stretching (polf f)
T able 1. V ibrational frequencies of ma rker band region for both minimum ener gy spectra of MM-
and polf f-MD of Cph1 compared to experiment [165] and with assigned normal m odes
Comparison of the marker band regions of both computed spectra to the exper iment (see Figure 37b
and T able 1.) first shows the mi ssing of two signals in the experiment (at 1415 and 1451 cm -1 (polf f)
and 1432 and 1463 c m -1 (MM)), due to an over lap with sol vent bands (denoted b y an asterisk in
Figure 37b) The next band at 1477 cm -1 (exp) is only present in the polf f-spectrum (upshifted by 2
cm -1 ). Furthermore, the band at 1501 cm -1 (polf f), w hich is not visible in the MM-spectrum, can also
not be seen in the experimental one. Both NH-ip rocking peaks at 1524 and 1570 c m -1 (exp) are
downshifted in the MM case (12 and 5 cm -1 ), while in the polf f-spectrum the y are onl y displaced by
82

5 and 6 cm -1 . F urthermore, the s econd band at 1570 cm -1 (exp) shows similarities to the
corresponding one in the polff-spectrum b y means of intensity and shape. Also the distribution of
intensities between both NH-ip bands for polf f (second higher than first) is in agreement to the
experiment. Finall y the peak at 1635 c m -1 and its shoulder at 1649 c m -1 (exp) is ups hifted compared
to both calculated spectra, t o which the polff-s pec trum is in better agreement.
Thus, the co mputed minimum energy spectrum for the MD with polarized force field is in bett er
agreement to the experi mental one in terms of vibrational frequencies and concerning intensities
(and intensity-distributions). Further the usage of polf f af fects the shape of the NH -ip bands in such
a wa y that the y fit to the ones in the experiment better (more com parable frequencies and
intensities) than without usi ng polarized char ges.
7.2. Com parison of both MM and polff methods including influences of pr otein
and water fluctuations on the computed s pectra
T o visualize effects of protein and water fluctuations on the ind ividual vibrational modes, sum-
spectra have been computed (s. section 5.1.4.). In these the influence of the stabilization of the
pyrrole-water due to the usage of pola rized charges on t he NH-ip rockings can be observed.
Figur e 38. Comparison between calculated m inimum ener g y spectr a and sum-spectra of Cph1; (A)
minimum ener g y spectrum of MM-MD; (B) sum-spec trum of MM-MD; (C) minimum ener g y
spectrum of polf f-MD; (D) sum-spectrum of polf f-MD
83

In Figure 38 a co mparison between both MM- and polff-sum- and minimum ener gy spectra is
depicted. The dif ferences in frequency-shifts (between mini mum ener g y and the corresponding
sum-spectra) are almost neglectable. Also the intensit y distribution and the shape of the dif ferent
bands remains the sa me as in the minimum energy spectra. The only notable dif feren ces are the
appearance of an additional band at 987 cm -1 in the MM sum-spectrum, where a doublet is present
in both experi ment and polff case. The “ triplet” at around 1307 c m -1 is less visible in the MM sum-
spectrum, due to overlapping signals and s hifts in frequencies of these, while the peak at 1456 cm -1
(MM) is sharper than in the respective minimum ener g y spectrum. Furthermore, the NH-ip rocking
band at 1558 cm -1 (MM sum-spectrum) is broader and s maller and thus indicating the instabilit y of
the p-water and the s hifts in the water-NH( PCB)-distances during the corresponding MD. In
contrast to this, there are almost no changes from the minimum energy spectrum of polf f compared
to the respective sum-spectrum .
When comparing the signals’ of the NH-ip rockings (Figure 38; peaks at 1558 cm -1 (MM) and 1572
cm -1 (polf f)) of both sum-spectra, it can be seen that the one in polff is less w ide, than the on in the
MM sum-spectrum. Estimations of the standard deviation of the named Nh-ip signal in both MM
and polff s um-spectra gives a ratio of about 5.3:3.6 (MM:polff). This de monstrates the decreased
bandwidth in the polff s um-spectrum and thus indicates the s tabilizat ion of the p-water (and the
water -NH(PCB)-interaction) w ith the application of a polarized force fie ld even affe cts the
computed Raman-spectra.
Figur e 39a. Cph1 Sum-spectra for both MM (A)- and polf f (B)-MD simulations, (C) Experimental
Resonance Raman spectrum [165] (Cph1 in frozen sol vent); asterisk in C denotes solvent bands
84

Figur e 39b. Cph1 Sum-spectra for both MM (A)- and polf f (B)-MD simulations, (C) Experimental
Resonance Raman spectrum [165] (Cph1 in frozen sol vent): zoom into the marker band region with
visible NHip-bands; asterisk in C de notes solvent bands
The dif fer ences between calculated sum-spectra and the experimental Resonance Raman spectrum
are the same as for the minimum ener g y spectra for both MM and polf f (see Figure 39a). Therefore,
only marker band regions of the sum-spectra were compared to the exper iment (see Figure 39b and
T able 2): The peaks at 1456 cm -1 (MM) and 1459 cm -1 (polff) are similar in frequency and intensity .
The two bands at 1480 and 1504 cm -1 in the polff-spec trum are less visible than before (and 1480
cm -1 is 1 cm -1 more upshifted compared to the experiment than before). Both NH-ip rocking signals
in the sum-spectra are still sharper for the polff-M D and here, the y are in even better agreement to
the experi ment than in the minimum ener g y spectrum (a lso in terms of intensities). The y are onl y
shifted about 3 and 2 cm -1 (polff) fro m the experimental dat a , w hile in the case of the MM-sum-
spectra the shifts in freque ncy are large r (13 and 14 cm -1 ).
In this wa y , it has been demonstrated that the main goal of the reduction of bandwidth of the NH-ip
rocking s ignals by s tabilizing of the w ater network and especially the p-water with the help of a
polarized force field has been successfully a chieved.
85

ν/cm -1 - MM-MD ν/cm -1 - polff-MD exp – Cph1-fr ozen
solvent [165]
Normal Mode
Composition
1431 1419 -
N-H rocking at ring D
(MM), C=C stretching a t
ring C and methine bridge
between rings C+D (polf f)
1456 1459 -
C=O deformation at ring A
+ N-H rocking at ring A
(MM+polf f)
1508 1521 1524
N-H rocking at ring C +
O=C stretching a t ring B
(MM), N-H rocking at rings
B and C + C=C stretching
of methine bridges between
rings A+B and B+C (polff)
1558 1572 1570
N-H rocking at rings B and
C (MM+polf f),
C=C stretching of methine
bridge between rings B+C
(polf f )
1612 1618 1635
C=C stretching of methine
bridge between rings C+D
(MM+polf f), O=C
stretching at ring D (polf f)
1630 1633 1649
C=C stretching of methine
bridges between rings A+B
and B+C (MM+polf f), O=C
(stretching at ring A (polf f)
T able 2. V ibrational frequencies of marker band region for both sum-spectra of MM- and polf f-MD
of CPh1 compared to experiment [165] and with assigned normal m odes
7.3. Discussion
The minimum energy s pectrum for the polff-M D shows sharper bands and intensit y distribution
patterns which are in better agreement to the experiment (especially in the marker band region, e.g.
the NH-ip rocking signals) than the MM-min-spectrum. Also the frequencies of the individual
vibrations in the polf f-spectrum are more close to the experimental da ta. This is conf irmed by the
corresponding sum-spectra showing the same trend. The polf f sum-spectra is in even better
agreement to the exper imental Resonance Ra man data. This leads to the conclusion that the
stabilization of the p-water (and the water network around PCB in genera l), due to the usage of
polarized char ges, directl y af fects the reproducibility of vibrational data with theor etical methods in
a positive way .
86

Chapter 8
Summary and conclu sions of part III
It is assumed that cr y s tal waters are importa nt for corre ct d y n amics and processes inside protein
environments (s. sections 1. and 3.1. for refer ences). It is a major issue to si mulate them correctly in
theoretical applications. In preliminary work of Mor ginski et al. [63] and Daminelli et al.
(unpublished) [64] an increased mobility of the water -network, especiall y the p yrrole-water , could
be observed. This led to computed Raman-spectra which showed w ider bands co mpared to
experimental data. This was especi ally true for the very flat and broad signals for th e the NH-ip
rocking band for p yrrole-ring C. It was assumed that this could be directly related to the wrong
treatment of water molecules with the applied force field. Currentl y available MM-MD methods
might suffer from an insuffi cient tr eatment of the charge interactions be tween individual groups,
especially in near vicinity of cofactors. This might lead to a destabilization of the w ater - network
[57],[58],[59],[60],[61],[62]. The development of i mproved techniques for modeling of the char ge-
environment has thus proofed necessary in order to generate more realistic MD-trajectories [65],
[66],[67].
In this work, the quantum-chemical computation of polarized char ges for at least the chromophore
PCB of Cph1 and the protein residues surrounding i t, and its i mplementation in classical mol ecular
mechanic force fields has been carried out. It has been shown in the case of C y anobacterial
Phytochrome Cph1 that these new charges stabilize the water -network in the vicinity of the
cofactor . This is due to the more realistic treatment of polarization effe cts of cofactor and
environment on each other .
Furthermore, these new polarized char ges have been used within a QM/MM formalism for
subsequent geo metry optimizations of dif ferent s napshots and Ra man-spectra calculations for both
polf f- and MM-trajectories. In this wa y , it has been demonstrated th at the polf f-MD leads to a better
reproduction of experimental Resonance Raman results than spectra computed out of a classic MM-
trajectory . Mos t significant are the changes in both intensities and appearances of the individual
signals. The app lication of a polarized force field as done in this w ork, furthermore, leads to a more
realistic band-shape of the NH-ip rocking mode, with the corresponding band becoming sharper and
less wide (of about 1.7 cm -1 ). This is in good agreement to the experimental data, also in terms of
frequencies of these vibrational modes.
In that respect, this work de monstrates an approach for the generation of m ore accurate force fields
for application with Cph1. A polarized force field reflec ts the interactions of partial char ges in a
more rea listic w ay . The accuracy of QM/MM and Ra man-spectra computations have been
increased, helping to get a better reproduction of experimental data. Finall y , the importance of the
application of polarized charges into the existing force fields has been de monstrated. This method is
helpful for systems which show si milar problems with water instabilities and migh t stabilize the
corresponding water -network. The more accurate treatment of char ge distribution inside areas of
interest of biological matter represent a valuable tool for prediction of structural, d y namic and
spectroscopic propertie s and might help in the correct interpretation of experimental data.
87

Part IV .
Molecular dynamics and QM/MM based computations
for phytochr ome Agp2
88

Chapter 9
QM/MM calcula tions for determination of pr otonation
state of t wo conserved histidine r esidues in near vicinity
of the chr omophor e BV in Agp2
Agp2 is a bathy ph y tochrome with light-induced conversion fro m far -red light sensible parent state
(Pfr) via Lumi-F and Meta-F intermediate states towards Pr b y introduction of far-re d light. From
there on, the chromophore can either convert back to its parent state via red light treat ment and the
intermediate states Lumi-R and Meta-R or directly by a thermal back reaction from Pr to Pfr state.
Herein the chromophore biliverdin (BV) is converting between dif ferent structures by rota tion of
ring D around the meth y l bridge between rings C and D. The thereafter induced changes in the
chromophore binding-domain (CBD) lead to a signaling cascade. The understanding of the
triggering events of this signaling process are still not co mpletely clear . Therefore, ongoing
investigations are tar get ing the diffe rent states in the photoc yc le, while observing both chromophore
and surrounding environ ment for apparent changes. The knowledge of each part of the mechanism
and the contributing agents are of utter im portance to get a complete picture of the photocy cle.
As stated in section 3.2., the determination of the correc t protonation of the two conserved
histidines Hi s248 and His278 in the vicinity of the cofactor is important, since they might play a
significant role in the uptake and release of protons during the photoc y c le, as proposed for Cph1
[27]. The two protein residues and their protonation s tate might be of essence for the
photoconversion: E.g. the thermal backreaction of Agp2 from Pr to Pfr via Keto-Enol-T autom erie,
as shown b y V elazquez et al. [23] might be facilitated by (de-)protonation of His278. It is necessar y
for understanding and interpretation of experimental results and, further more, to achieve proper
simulation conditions for theoretical studies to know the correct protonation states of these two
histidines. For this purpose, nine dif ferent mode ls (dd, de, dp, ed, ee, ep, pd, pe, pp) have been built
as explained in sections 3.2. and 5.1.3., and treated w ith h y b rid QM/MM methods in geometr y
optimizations in order to find the model which showed the least deviations from the
crystallographic s tructure of Agp2 by Scheere r et al. [16]. Comparisons were alwa y s done to the
protonated, solvated and water -equilibrated cr y s tal structure [16] as described in section 5.2.1. Here,
H278D accounts for protonation of N δ -position, H278E means proton ation of N ε -postion and H278P
means doubly protonated histidine. T he results of these studies are described below .
General tr ends in the geometry optimizations of the nine differ ent models. It can be observed
that no change in the protonation s tates of both His and chromophore occurs during the geometry
optimizations. Further it can be seen that the different protonation states of His248 and His278 have
indeed an i mpact on the structural stability . In some models the chang es are s mall, while in others
(like the pe- model, as shown in 9.1.) they were s trong. A lso diffe rent amounts of alterations in the
H-bond-network can be s een, depending on the used model. Further more , the potential bonds
between N δ or N ε and their respective hydrogens of His248 and His278 are either elongating or
shortening of around 0.003 A in the beginning of the optimization which then stayed stable till the
end. As will be discussed later in more detail, there are so me models, in which the dist ances do not
stay stead y , but instead fluctuate, indicating instabilities in the corresponding model. In the
following the pe-model will be shown as an example for the instabilities introduced b y His278
being protonated on its N ε -position, followed by a demonstration of the m o re stable models with N δ -
protonation of His278.
89

9.1. His278 m odels w ith pr ot on on N ε -position
An example: pe-model. This model is a good example for the induced deviations from the cr y stal
structure by a protonation of His278 on its N ε -position.
Figur e 40. Optimized structure (colours) of pe-model com pared to X-Ra y [16] (orange), aligned at
chromophore (NOTE: no protons at BV are shown; BV i s pr otonated at right PSC(C)-oxygen )
In Figure 40 the optimized structure (in colors) is co mpared to the cr y stal s tructure (in blue).
Deviations from the X-ray structure can be seen for Ser462, His248 and His278. Both histidines
shift backwards from its original position, thus also af fecting the surrounding water molecules
which have to adjust to the altered environment. The chromophore itself shows onl y small
dif ferences to the cr y s tallographic structure (e.g. a s mall twist of the PSC(C) due to the back shift of
His278). These observations can also be found in the corresponding RMSD values (s. T able 3). The
RMSD of BV is about 0.46 Å which indicates only small deviations. In contrast to this, both H is248
and His278 give high values, 0.84 Å and 1.26 Å, respectively (see T abl e 3), suggesting that this
model does not represent the native prot onation state.
The dista nce-evolution of the N ε -H-bond (s . T able 5) of His278 shows an oscillating na ture,
confirming the unst able structure of His278. In contrast, the analyzed N-H-bonding evolution of
His248 shows only in the beginning a rise in the distance which then stay s the same till the end of
the trajectory . At the end of the optimization these bonds are in normal N-H-bonding range: His248-
N δ -H = 1.03 Å , His248-N ε -H = 1.02 Å and His278-N ε -H = 1.01 Å. Finally , the H-bond-network in
near proximity to th e histidines has been investigated (s. T able 4). It can be obs erved that no
deviations occurred around His248 and near Ser462, while small ones are present near PSC of ring
B (one additional H-bond of the ox y gen of the carbox y lic group to a water molecule). Contrary to
this, the hydrogen-bonding-network connecting His278 to the cofactor decomposes completely
which leaves His278 with no bond to the chromophore (also not via a water molecule).
All these deviations from the original cr y stal structure indicate that this model is not representing
the native protonati on state. The biggest dif ferences concern His278 and adjacent parts of BV , as
well as the H-bond connection between both. Such can be found for all m odels in which His278
was e- or p-protonated, leading to the sugge stion that the models de, dp, ee, ep, pe, pp are incorrect.
90

9.2. Corr ectness of models with pr oton on N δ -position of His278
An exa m p le: ed-model. The m odels with s ingly protonated His278 on the N δ -position show less
deviations fro m the cr y s tal structure during the QM/MM optimization process, as shown for the ed-
model: In Figure 41 the optimized structure (in colors) is co mpared to the crystal structure (in
orange). Only s mall shifts can be observed, concerning His278, Ser462 and the PSC of ring C. This
is also underlined b y the corresponding RMSD values (s . T able 3) of BV , H is248 and His278: 0.39
Å 0.41 Å and 0.78 Å, respectivel y . A fortiori, if compared to the ones of the pe-model. The NH-
bond-evolutions for both histidines (s . T abl e 5) show stable bonds after an initial change. The
absolute bond values for the NH-bonds at th e end of the simulation are: His248- N ε -H = 1.03 Å and
His278-N δ -H = 1.03 Å.
Figur e 41. Optim ized structure (colors) of ed-model compared to X-Ray [16] (orange) ; both
aligned at chromophore (NOTE: no protons at BV are shown; BV i s pr otonated at right PSC(C)-
oxygen ); dotted line indicate H-bond-i nteraction between PSC(C)-proton and free N ε of His278
Furthermore, the qualitative investigation of the H-bond-network afte r the optimization shows no
deviations from the network in the initial X-ra y structure in the CBD. The network inside the
computed QM-region stays the s ame during the optimization process (s . T able 4): Bonds fro m both
H248-N's to the pyrrole-water and one water bridging to the PSC of ring C can be observed, as well
as two bonds from His278-N ε to the PSC of ring C of the cofactor , directl y and via water -bridge.
This is indicating s tructural stability of this model. These trends occur in all H278D models and
indicate a correctness of the dd-, ed- and pd-models in contrast to the six others. In the case of the
ed-model these findings w ere even further established b y computations with lar ger basis set (6-
31+G*) for al l h y d rogen atoms. Here, the RMSD values are even lower (BV : 0.40 Å, H248: 0.24 Å ,
H278: 0.71 Å), demonstrating even less deviat ions from the initial cry stal structure.
9.3. Determination o f corr ect pr otonation-m odel
These findings of models with single protonated His278 on its N δ show ing the least deviations from
the initial cr y s tal model, lead to the conclusion that one of the three mode ls dd, ed or pd has to be
the correct one. Therefore, all comput ed RMSD values (T able 3), stabilities of H-bond-networks
(T able 4) and stabilities of N-H-bonds (T abl e 5) for all nine models are compare d to each other .
91

RMSD
(heavy atoms)
H248D
H278D
H248D
H278E
H248D
H278P
BV 0.49 0.52 0.50
H248D 0.40 0.56 0.53
H278 0.81 1.07 0.90
RMSD
(heavy atoms)
H248E
H278D
H248E*
H278D*
H248E
H278E
H248E
H278P
BV 0.39 0.40 0.46 0.44
H248E 0.41 0.24 0.63 0.64
H278 0.78 0.71 0.72 0.84
RMSD
(heavy atoms)
H248P
H278D
H248P*
H278D*
H248P
H278E
H248P
H278P
BV 0.40 0.49 0.46 0.45
H248P 0.30 0.30 0.84 0.44
H278 0.87 0.95 1.26 0.70
T able 3. Computed RMSD values for all ni ne QM/MM geometry optimized models with aligne d
chromophore and with respect to the solvate d + water -equ ilibrated crystal struct ure [16]; asterisk
denotes the m odels computed with lar ge basis set for hydrogen atom s (6-31+G*)
When the computed RMSD values of all models are compared to each other , the ones with single
protonation of N δ of H278 yield the s malle st values . E.g. dd (RMSD(His248): 0.40 Å,
RMSD(His278): 0.81 Å), de (RMSD(His248): 0.56 Å, RM SD(His278): 1.07 Å) and dp
(RMSD(His248): 0.53 Å, RMSD(His278): 0.90 Å) compared to each other show the lowest
RMSD’ s for the dd-model (dif ferences of about 0.16 Å/0.13 Å for H is248 and 0.26 Å/0.09 Å w hen
comparing dd to de/dp). The same is true for the ed- (RMSD(His248): 0.41 Å, RM SD(His278):
0.78 Å), ee- (RMSD(His248): 0.63 Å, RMSD(His278): 0.72 Å) and ep-models (RMSD(His248):
0.64 Å, RMSD(His278): 0.84 Å), with the exception of the RMSD for His278 in ed which is 0.06 Å
higher than in the ee model. But since also the RMSD for BV (0.39 Å) is smaller in ed than in ee
(0.46 Å) it does not account to much. Here, ed shows differences to ee/ep of about: 0.07 Å /0.05 Å
for BV , 0.22 Å/0.23 Å for His248 and -0.07 Å/0.06 Å for His278. Similar results can be seen for the
pd-, pe- and pp-models, in w hich the RMSD of BV is 0.06 Å, for His248 about 0.54 Å and for
His278 about 0.39 Å smaller when comparing pd to pe and about 0.05 Å for BV , 0.14 Å for His248
and -0.17 Å for His278 when comparing pd to pp. Here, pd gives a higher RMSD for His278 than
in pp, but the other values are still lower .
The lowest RMSD’ s are alwa y s given for the H278D-models (the above mentioned exceptions can
be neglecte d as shown below). Comparison of dd, ed and pd to each other in terms of struc tural
stability reveals similarities in RMSD’ s which could mean a less important role of the protonation
state of His248 for s tructural stabilit y during the QM/MM opt imizations. The dd-model shows
slightly higher values for BV and His278 (0.1 Å/0.09 Å for BV and -0.01 Å /0.1 Å for His248 and
0.03 Å/-0.06 Å for HI278 when comparing to ed/pd ), whereas the ed-and pd-models were alike in
terms of RMSD. The ed-model has a smaller RMSD for His278 (0.09 Å) and for BV (0.01 Å) as in
the pd-model. The RMSD for H248 is s maller in the pd-model (0.1 1 Å). ed shows smaller values
for BV and His278 compared to both dd and pd. Simultaneously , pd has s maller RMSD values for
BV and H is248 com pared to dd. Therefore, either ed or pd represents the correct protonation state.
92

W ith the help of further computations with larger basis set for hydrogens (6-31+G* for H’ s; s ee
section 5.2.4.), it can be further dif ferentiated between ed and pd (s. T able 3): Here, the ed- model
shows the lowest RMSD values in all three investigated structural motifs (BV : 0.09 Å, His248: 0.06
Å, His278: 0.24 Å) compared to pd and thus makes it the more probable protonation model.
H248D
H278D
H248D
H278E
H248D
H278P
Stable/not stabl e ✔ / ✕ (H278) ✕ (H278) ✕
H248E
H278D
H248E*
H278D*
H248E
H278E
H248E
H278P
Stable/not stabl e ✔ ✔ ✔ / ✕ (H278) (H278) ✕
H248P
H278D
H248P*
H278D*
H248P
H278E
H248P
H278P
Stable/not stabl e ✔ / ✕ (H248) ✔ / ✕ (H248/H278) (H278) ✕ (H278) ✕
T able 4. Co mparison of H-bond-network stability in all nine QM/MM geometry optimi zed models
with respect to the sol vated + water -equi librated cry stal structure [16] ( : stable network → no ✔
changes; : unsta ble network → many + crucial changes, concerning named residue; / : ✕ ✔ ✕
moderate changes not af fe cting general stability → e.g. exchange of bonding part ners); asterisk
denotes the models computed with lar ge basis set for h y d rogen atoms (6-31+G*); denoted in
parentheses are the residues which experience the biggest change s of the corresponding H-bond-
network
In T able 4 the dif ferent stabilities of h y d rogen-bond-networks of all nine models are co mpared to
each other . It can be seen that in models, where His278 is e- or p-protonate d, this resulted in the loss
of most or all H-bonds from H278 to the chromophore. And also in the case of the H 278D models
only in the ed model a complete unchanged network can be observed at the end of the QM/MM
optimization. In the pd-model additional H-bonds concerning His248 are observed. Additionall y , in
case of the corresponding calculations with bigger basis for H’ s, His278 loses H-bonds towards BV
and surrounding water molecules.
In T able 5 all N-H-bond-evolutions of the two histidines during the QM/MM optimization process
are monitored, whereas strong fluctuations indi cate unstable bonds and thus unstable struc tures, i.e.
less correct models. W i th the exception of ed-and ep-models, major deviations in the bond-
evolutions of all other for mations are always been observed. Especially models with N ε -protonation
of His278 yield fluctuating NH-bonds, with the exception of the ep-model. But the results
concerning RMSD’ s and H -bond-network s tabilities de monstrate the ep-model to be less probable
to be the correct one. Additionally , in the case of pd, it should be mentioned that the N ε -H-bond of
His248 was subject to severe variations and is ver y long (1.05 Å) in the QM/MM optimized
structure which shows the trend of this proton to to undergo a translocation to the PSC of ring C
(observed for both applied basis sets for h ydrogens). This renders the pd-model more unlikely to be
the correct one. Further , this confirms that the ed-model yields the most stable structure after
QM/MM optimization, w ith both N-H- bonds increasing/decreasing only in the beginning and then
staying stable.
93

bond between N δ or
N ε and H of His
H248D
H278D
H248D
H278E
H248D
H278P
H248 stable stable stable
H278 unstable unstable δ:stable |
ε:unstable
H248E
H278D
H248E*
H278D*
H248E
H278E
H248E
H278P
H248 stable stable unstable stable
H278 stable stable unstable both:stable
H248P
H278D
H248P*
H278D*
H248P
H278E
H248P
H278P
H248 δ:stable |
ε:unstable both:unstable both: stable both:stable
H278 unstable unstable unstable δ:unstable |
ε:stable
T able 5. N-H-bond evolution/stability of His248/Hi s278 in all nine optimized models during
QM/MM geometry optimization wit h respect to the solvated + water -equilibrated cr y s tal structure
[16]; asterisk denotes the m odels computed with lar ge basis set for hydrogen atom s (6-31+G*);
“stable” means constant bond-length during the l ength of the optimization, while “unstable” denotes
strong fluctuations in the respective bond s during optimization procedure
In conc lusion, the overall structure of the chromophore and its surrounding protein and water
environment is af fec ted b y the protonation state of the two investigated histidines. His278 is most
likely N δ -protonated due to discussed RMSD-values, H-bond-network s tability and NH-bond-
evolution and His248 is N ε -protonated. Thus, the most probable and correct protonation state for
His248 and His278 will be the ed-model. Th is results fit nicely w ith recent finding of V elazquez et
al. [23] w ho predicted the thermal degradation route fro m Pr to Pfr s tate via a Keto-Enol
equilibrium, in which His278 is probable protonated on the N ε -po sition by the proton from the PSC
of ring C (as observed via RR). It would make a singl y protonated His278 on the N δ -position in the
Pfr -state necessar y t o yield a free N ε w hich then can be protonated in the Pr -state.
9.4. Models with depr ot onated PSC(C)
General trends. Additional QM/MM calculations of models (dd, de, dp, ed, ee, ep, pd, pe, pp) with
deprotonated PS C(C) have been carried out (see section 5.1.3.). The reason for this has been the
investigation of a possible H-translocation from N ε of H is278 (for H278E/P models) to the PSC(C)
(as it might occur in Pr-P fr -conversion [23]). This would proof a protonated PSC(C) in the Pfr s tate
of Agp2. T herefore, all computed results were com pared to the initial crystal structure [16].
Due to high RMSD values (s. T able 6), H-bond-network instabilities (s. T able 7), as wel l as
deviations in the corresponding N -H-bonds (s. T able 8) a high instab ility in all H278D m o dels is
revealed. Th e models with the least deviations from the initial cry stal structure are the ones with p-
protonation of His278. In all mode ls containing a N ε -H on the His278, no proton transfer can be
observed, but still a strong tendency towards it, a s shown in the following for the dp-model.
94

The dp -model will be discussed in more detail as an example. In Figure 42 it can be observed
that the optimized structure of the chromophore (in colors), as well as the water molecules and both
His in the QM /MM opt imized dp-mode l are in good agreement to the crystal structure (in orange).
This is further underlined b y small RMSD values for BV , His248 and His278: 0.48 Å, 0.57 Å and
0.65 Å, respec tively . The biggest s tructural change occur for the p-water which shifts a wa y from
BV . The h y drogen-bonding-network shows onl y ver y small deviations in the region of His278,
where one (PSC(C))O-(His278)- N ε -H bond is lost. The N-H- bond-lengths vary onl y in the
beginning of the optimization procedure and stay stable afterward (except the N ε -H-bond; see next
paragraph).
Figur e 42. QM/MM geometry optimized structure (colors) of dp-model com pared to X-Ray [16]
(orange); both aligned at BV (NOTE: no protons at BV are shown; B V is NOT pr otonated at
PSC(C) ); dotted red line indicate h y drogen- bond-interaction between free oxygen of PSC(C)
proton at ε -nitrogen of His278
Notable is the behavior of the H is278-N ε -H connection to the PS C(C) oxygen. A s trong decrease in
this distance can be observed in all His278E/P models. Th e N ε -H-bond increases (up to 1.05 Å in
the end) while simultaneously His278 shifts nearer to the PSC(C). Although the distance between
His278 and PSC(C) alwa y s stays over 1.7 Å, this behavior indicates a strong tendency of the proton
to leave His278 and switch to the PSC(C).
Comparison of diff er ent pr oton ation m odels lead to confirmation of pr oton ated PSC(C) (B V).
All computed mode ls are analyzed in term s of RMSD’ s, s tability of H-bond-networks and N-H-
bonds. In T able 6 to 8 the corresponding resul ts are presented. The RMSD's for His278 are
generally higher than with a protonated P SC(C). Further more, the highest deviations from the
crystal structure occur for the models with His278D protonation and the lowest with th e His278P
protonation. The dp-mod el shows dif fer ences in RMSD of about -0.05 Å/-0.09 Å for BV and -0.31
Å/-0.07 Å for His248 and 0.57 Å/0.26 Å for His278 (co mpared to dd/de). Further , ep shows
dif ferences in RMSD of about 0.03 Å/-0.08 Å for BV and 0.18 Å/0.1 1 Å for His248 and -0.02 Å/-
0.07 Å for His278 (compared to ed/ee). The pp-model finally shows RM SD-dif feren ces of about 0.2
Å/0.06 Å for BV and 0.18 Å/0.21 Å for His248 and 0.14 Å/0.29 Å for His278 (compared to pd/pe).
So the best choice will be either dp, ep or pp. It is difficult to further distinguish between these
three. Nevertheless since ep and pp have lower RMSD values for BV and His248, they seem more
plausible.
95

RMSD
(heavy atoms)
H248D
H278D
H248D
H278E
H248D
H278P
BV 0.43 0.39 0.48
H248D 0.26 0.50 0.57
H290 1.22 0.91 0.65
RMSD
(heavy atoms)
H248E
H278D
H248E
H278E
H248E
H278P
BV 0.41 0.36 0.44
H248E 0.51 0.44 0.33
H278 0.86 0.81 0.88
RMSD
(heavy atoms)
H248P
H278D
H248P
H278E
H248P
H278P
BV 0.60 0.46 0.40
H248P 0.54 0.57 0.36
H278 0.92 1.09 0.78
T able 6. Computed RMSD values for all ni ne QM/MM geometry optimized models with aligne d
chromophore and respect to the solvated + wate r -equilibrated cry stal structure [16]
The qualitative investigation of H-bond-network stabilities, which is shown in T able 7, helps further
dif ferentiating between these three models. From the three chosen mode ls, the ep-model has the
least deviations from the initial H-bond-network as observed in the crystal structure and thus
supports th e estimation that this represents the most probable model for a deprotonated PSC(C). In
the case of dp and pp, there are always more changes in H-bonding for at least one of the two
histidines: In dp the network changes around His278 moderately , while in pp it changes around
His248 strongly .
H248D
H278D
H248D
H278E
H248D
H278P
Stable/not stable (H278) ✕ ✔ / ✕ ✔ / (H278) ✕
H248E
H278D
H248E
H278E
H248E
H278P
Stable/not stable (H278) ✕ ✔ / ✕ (H278) / ✔ ✕
H248P
H278D
H248P
H278E
H248P
H278P
Stable/not stable ✔ / ( H248) ✕ ✔ (H248) ✕
T able 7. Co mparison of H-bond-network stability in all nine QM/MM geometry optimi zed models
with respect to the solvated + water -equilibrated crystal structure [16] ( : stable work → no ✔
changes; : unsta ble network → many + crucial changes, concerning named residue; / : ✕ ✔ ✕
moderate changes → e.g. exchange of bonding partners); denoted in parentheses are the re sidues
which experience the biggest corresponding changes of the H-bond-network
96

Finally , the only model w ith no fluctuations in the N-H- bond-evolutions for both His248 and
His278 is th e ep-model (as shown in T able 8), thus rendering this the one to be in the best
agreement with the X -ray structure and the correct one for a s ituation with a deprotonated PSC(C)
at BV .
bond between N δ or N ε
and H of His
H248D
H278D
H248D
H278E
H248D
H278P
H248 stable stable stable
H278 unstable unstable δ:stable |ε:unstable
H248E
H278D
H248E
H278E
H248E
H278P
H248 unstable unstable stable
H278 stable stable stabl e
H248P
H278D
H248P
H278E
H248P
H278P
H248 stable stable stable
H278 unstable unstable δ:stable |ε:unstable
T able 8. N-H-bond evolution/stability of His248/Hi s278 in all nine optimized models during
QM/MM optimization with respect to the s olvated + water -equilibrated cr y s tal structure [16];
asterisk denotes the models computed with lar ge basis set for hydrogen atoms (6-31+G*); “stable”
means constant bond-length during the lengt h of the optimization, while “unstable” denotes strong
fluctuations in the respective bonds duri ng optimization procedure
Conclusively , the indicated t endency of a proton transfer from N ε of H278 to BV confirms the
findings of the investigations of the CBD with a protonated PSC(C): If the proton on N ε of His278
in the ep-model would completel y l eave its position and change to the propionate of ring C of BV ,
this would lead back to the ed-model with protonated PSC(C). This is in agreement with the results
presented in 9.3., in w hich the ed-model is s tated as the model with the least deviations fro m the
crystal structure and thus the correct one. These findings underline Resonance Raman findings that
the PSC of ring C is protonated in its native Pfr state [23]. F urthermore, this confirms His278 being
protonated on N δ , while His248 is protonated on N ε .
9.5. Discussion
In summary the co mparison of Q M/MM optimized models accoun ting for dif ferent protonation
states for the two histidines H is248 and His278 in near vicinity of BV (protonated PSC(C)) with the
respective cr ystal structures revealed the ed- model (His248: protonati on at N ε ; H is278: protonation
at N δ ) as the most probable native protonation state. Further computations and analysis of the
respective models with deprotonated PSC(C) conf irmed these findings. Also, the presence of a
proton on PSC(C) as already suggeste d b y RR-expe riments [23] could be confirmed.
Similar studies have been recentl y published, concerning th e protonation state of the two conserved
histidine residues in both Cph1 [27] (His260, His290) and Agp1 [28] (His20, His280) in their parent
Pr -states:
97

Cph1 has been found to have two dif ferent protonation patterns of His260 and His290 which are
attributed to Pr-I- and Pr-II-st ates and thus refer to a ph-dependent heterogeneity [27]. Both diffe r in
the protonation of His260. Pr-I resembles a doubl y protonated and cationic His260, while H is290 is
protonated at the N ε . In contrast to this, Pr -II shows single N ε -protonation of both His’ (pe- and ee-
models, respectively). V elazquez et al. [27] suggested H is260 as a proton-sink and -source and
essential for the conformati onal heterogeneit y of the CB D.
T akiden et al . [28] invest igated the protonation pattern of His250 and His280 in A gp1 and
determined also two pos sible protonation states [28]: Either both His’ carrying a proton on N ε , or
His250 is protonated on N ε , while His280 is charged and carries two protons (ee- or ep-m o dels,
respectively ) . Again a heterogeneity of the Pr -state might be possible [28].
The findings concerning Agp1 are in good agreement to the here presented results of A gp2. Here, a
homogeneous Pfr -state with ed- protonation of His248 and His278 has been proposed. V elaz quez et
al. [23] suggested the release of the proton from PSC(C) and subsequent uptaking by His278 during
Pfr -Pr -conversion which would lead to a negatively charged propionate at ring C and a doubly
protonated and a positively charged His278. This is exactl y what T akiden et al. [28] observed in the
Pr -state of Agp1: an ep-protonation state of th e two histidines and a deprotonated PSC(C) at B V .
Furthermore, studies with a deprotona ted PSC(C) showed an ep- model to be the m os t promising
one for the correct protonat ion of the two histi dines. This is also in agreement to the findings of
T akiden et al. [28]. Conclusively , there might be a resemblance between the Agp1 (Pr state) findings
and the ones for Agp2 (Pfr state) in terms of proton-translocations between PSC(C) and His278 and
protonation states of the two histidines in gener al. This could mean a shared and similar photocycle
with similar characte ristics in both ph yt ochromes.
In contrast to th is, His290 in Cph1 carries in both parent P r -states only one proton at N ε [27]. This
leads to the suggestion that both Cph1 and Agp1/Agp2 follow diffe rent photoconversion
mechanism s, concerning the role of their respective H is290 and His280/His278. A common feature
of all three sy s tems is the protonation of N ε in His260 in Cph1, His250 in Agp1 and His248 in
Agp2. This seems to be a conserved motif. It might be tha t H is260/250/ 248 plays a similar role in
the photocycle of all three ph y tochromes. This is in agree ment with the suggestion of the proton-
transferring role of His260 in Cph1 by V elazque z et al. [27].
The findings of both V elazquez et al. [27] and T akiden et al. [28] further confirm the validity of the
here presented results.
The here presented computational result s make it possible to reliably simulate the CBD of Agp2 and
investigate the role of H is248 and His278, as well as PSC(B) and PSC(C). This will help to get
further insight into d y na mics inside the CBD and their possible influence on the photocy c le, as will
be discussed in the next chapter .
98

Chapter 10
QM/MM calculations and MD simulations of Agp2
methyles t er variant of BV
10.1. Experimental r emarks
Studies of Agp2 via resonance Raman techniques have been carried out b y the group of Hildebrandt
et al.: The y revealed that the thermal back reaction from Pr to Pfr is controlled b y a keto-enol-
equilibrium [23]. This is enabled b y the additional proton ation of the H is278 residue at its ε-
position. The origin of this proton is not quite clear , but it has been suggested that it may come fro m
the protonated propionic side-chain (PSC) of ring C. The y found, it carries a proton in the Pfr s tate
of BV , as discussed in section 9. The corresponding signal in the RR is disappe aring upon Meta-F to
Pr state transition whi ch they interpreted as possible proton translocation from PSC(C) to His278
and thus enabling the thermal back route to Pfr via a keto-enol-tautomerie [23]. This proton transfer
highlights the importance of both PSC(C) and Hi s278 for the correct photoconversion.
Past s tudies on differe nt ph y t ochromes revealed already a s ignificant importance of the carboxylic
groups at the cofactor for the corre ct photoconversion. In 2009 and 2010 L agarias and coworkers
[24],[25] presented studies of modified PSC's of the BV and PCB chromophores (DrBphP and
Cph1, respectivel y ) via amidation of the propionates. The y investigated the impact of this on the
stability of the chromophore-protein system and the correct photoconversion [24],[25]: For BV -
containing Dr BphP the a midation of PSC(B) led to a different Pr ground state and mul tiple Pfr
states. PSC(C) m odification created dual Pr -states which could be excited to dual Pfr -s tates. Further ,
the two states seemed to collapse into one single state upon denaturation. Both differing in a
denaturation-sensitive propert y , such as protonation. They concluded that PSC(B) seems to play an
important role in sensing red light, while PSC(C) determines the width of the deter mined spectral
region. They also tested th is procedure on PCB-containing ph y tochrome Cph1, where amidation of
PSC(B) showed a mixture of different P r -like states, in which at least one s pecies was able to
under go corre ct photoisomerization. Similar to their studies on Dr BphP , PSC(B) was also found to
be responsible in Cph1 for defining the sensed s pectral range. In contrast to this, the amidation of
PSC(C) led to an inh ibition of Pfr -for mation. Therefore, it has to be important for the corre ct
formation of the Pfr-adduct. Both propionates in both phyt ochromes have not been found essential
for photoisomerization of the chromophore, but are inf luencing the correct Pr to Pfr transition.
Especially in Cph1, PSC(C) is involved criticall y . The y thus concluded that both PSC's are
important for stability inside the protein and correct adjustment in to its pocket, determination of
spectral range sensed b y the proteins and pKa-tuning of the surrounding amino acids [24],[25].
Formation of amides hinders this to a cert ain extent [24],[25].
Following these findings, F ranzisco V elazquez and Maria Fernandez tested the idea of Prof. Siebert,
to modif y the propionic side-chains (PSC(B) and PSC(C)) of biliverdine of ph y tochrome Agp2 b y a
methylati on procedure, yielding monomethylester variants of BV [26]. These esters should shed
further light on the importance of the PSC’ s on the photoc y cle in general and the thermal
backreaction (as described in [23]) in part icular . In Figure 43 both variants are shown schematically .
Here, the one with methy lmonoester at ring B is called BVMB and the one with t he ester function at
ring C is calle d B VMC. The unmethylated PSC's were protonated in their studies [26]. The
following experimental findings are currently unpubli shed [26]:
99

Figur e 43. V ariants of Agp2 cofactor biliverdine – monomethy lesters: (1) BVMB: BV w ith methy l
group at PSC of ring B and proton at PSC of ring C; (2) B V MC: BV with met h y l group at PSC of
ring C and proton at PSC of ring B; m odified areas are encircled red .
It was not possible to get pure BVMB and BVMC probes, but only a 1:1 mixture of both which has
been incorporated into the Agp2 A poprotein. They tested the correct insertion of the modified
chromophores into the Apoprottein via UV/VIS spectroscopy . The formation of a Pfr-li ke state
could be observed (band at 745 nm absorpt ion), although it for med ver y slowly (af ter 77 h it w as
still not complete d). After irradiation with far -red light the formation of the Pr -like photoproduct
could be observed in U V/VIS at 675 nm (t y pical absorbance of Agp2 Pr state). The thermal back
conversion happen ed ver y slowl y (not complete after 16 h). Also anoth er state could be detected,
appearing as a band at 685 nm. This was assigned to a photoinactive state. They concluded that two
dif ferent states were formed: a photochemicall y active Pfr-state and one which was unable to
convert into a Pr -state [26].
T o further eva luate these findings, the y did RR measurements of the mixture and compared the
results to spectra from na tive Agp2 [26]: They found that the dark state (Pfr -l ike) showed high
similarities to the native protein. They confirmed the appearance of an enol-form of BV in the Pr
state upon irradiation. The RR spectrum of the Pr for m was again similar to the one of the
unmodified protein. D if ferences in RR s pectra between methylated and native species could be
found e.g. in the region around 1300 cm -1 . H ere propionate-contributions are typically present.
Furthermore, H/D exchange revealed a protonated chromophore at the p y r role nitrogens, similar to
the native form.
They suggested [26] that the monoesters form a mixture of 1:1 active:inactive, meaning that either
BVMB or BVMC is active and the other one for ms a photoinactive species. Since, they observed
the Pfr to Pr thermal backconversion with a proton on PSC(C) in the corresponding RR of the Pfr , it
is likely to assume that BVMB undergoes photoconversion. Also, in both Pfr- and Pr -RR-spectra,
strong changes for bands including contributions from PSC(B) could be observed. This also poin ts
towards BVMB as the acti ve species, while BVMC only forms an inactive state.
They conculded that since the methylation of PSC(C) led to an inactive form, this side-chain and its
proton is essential for correct photoconversion and for the proposed keto-enol-triggered
backreaction [23],[26]. In contrast to this, the methyl ation of PSC(B) onl y led to an elongated Pfr-
formation period, as well as, a dela y ed ther mal backreaction but still enables the ph y tochrome to
100

under go correct photoconversion [26]. The phytochrome is thus more tolerable to modifications of
PSC(B), while m aintaining its photoactivity [26] .
Theoretical studies at an atomic-level have to be carried out, to further investigate the influence of
the methylation of both PSC’ s on the stability and structure in the CBD. T aking the findings of
section 9 of the protonation states of His248 and His278 into account, it is now possible to simulate
Agp2 with methyl ated PSC’ s of BV in a reli able wa y .
10.2. Theor etical investigations o f Agp2 – BV -variants
The ed-model of Agp2 has been used in QM/MM computations and MM-MD simulations with
methylated variants of BV . The models BVM B and BVMC have been built, using the cr ystal
structure of Agp2 in its paren t Pfr -state as solved b y Scheerer et al. [16]. Hereb y BV was either
methylated at PSC(B) and protonated at PSC(C) (BVMB) or vice versa for BVMC. T he protonation
of either PSC(B) (BVMC) or PSC(C) (B VMB) was chosen to be consistent to the sy n thesized esters
[26] as shown in Figure 43. The decision, which ox y gen of the corresponding PSC w as methylated
or protonated, was made upon visual inspection of poten tial binding partners and poss ible space for
placing the relatively large meth y l group. The methyl group in BVMC has been placed on the same
oxygen of PSC(C), as was the h y drogen in the computations in section 9. A third model has been
built with meth y l ation at both PSC-sit es (called BiMET). W ith the exception of the BV -
modifications, the rest of the structure in all mode ls was co mpletely si milar . For further details
please consult sections 5.2.2. and 5.2.5. In these studies the i mpact of dif ferent variants of BV (WT ,
BVMB, BVMC and BiMET) on the structural s tability of the CBD has been investigated.
Deviations from the cr y s tal structure model (protonated, solvated and water(proton)-equilibrated X-
Ray structure from Scheere r et al. [16]) have been anal yzed. This procedure was used to validate the
experimental assumption that B VMB is the photoactive form.
10.2.1. QM/MM computatio ns of models BVMB and BVMC
Both, BVMB and BVMC models have been used in a h y b rid QM/MM geometr y optimization with
an active region of the modif ied BV and surrounding protein re sidues and w ater molecules in a 20
Å sphere, centered at BV and quantum chemical tre atment of BV , Arg21 1, H is278, His248, Asp196
and 1 1 waters as shown in section 5.2.5. The results of these are s hown in Figures 44 to 47 and
T ables 9 and 10. Here, the corresponding starting structures (cr y s tal structure with modified BV and
surrounding amino acids and water molecules which have been treated quantum mechanically) are
depicted in blue, while the optimized models are shown in colors. The distance-evolutions,
concerning the distances between the ox y g en/methyl group (carbon) of PSC(B) and the a mine NH-
group of Arg21 1 (O/CH 3 (PSC(B))-NH(Arg2 1 1), as well as the distances between the o ygen/methy l
group (carbon) of PSC(C) and the N ε of His278 (O/CH 3 (PSC(C))-N ε (His278)) have been also
subject of investigation. Both optimized models have been studied for deviations from the original
crystal structure. For this purpose the stabilities of O/CH 3 (PSC(B))-NH(Ar g21 1) and
O/CH 3 (PSC(C))-N ε (His278) distances, as well as RMSD values for BV , Arg2 1 1 and His278 and
His248 and the qualitative stabil ity of the H-bond-network in 5 Å radius of the chromophore, during
the optimization was investigated. All RMSD values in the following have been computed with
alignment of optimized m odel and crystal structure at the cofactor .
In BV MB (as shown in Figure 44) the structure of BV and surrounding residues show man y
similarities after optimization to the starting mode l. The biggest deviations can be seen at the
methylated PSC(B) and smaller ones at His278 (s. Figure 44). PSC(B) and the methyl group are
turning to the side (to avoid s terical clash with Arg2 1 1), while Arg2 1 1 is mov ing away fro m BV .
101

This is also reflected in the relatively high RMSD for Arg21 1 of 0.88 Å (s. T able 9). An interesting
feature is that although the saltbridge between the carboxyl group of PSC(B) and Ar g21 1 is lost, an
interaction betwee n PSC(B) and Arg21 1 is still m aintained. The movement of PSC(B) brings the
non-methylated oxygen of PSC(B) nearer to the protonated amino-group of A r g21 1. In contrast to
this, the distance between methyl-carbon and the primar y amino-group of Ar g21 1 is ge tting large r
which can also be seen b y the increase in the corresponding distance. Therefore, h y drogen-bonding
between PSC(B) and Arg21 1 is maintained (s . T able 10), while simultaneously Arg2 1 1 moves awa y
from BV . In contrast to this, PSC(C) shows only sm all deviations from the original structure, as well
as His278 which is moving towards BV (decrea sing distance between both) w ith a moderate RMSD
of 0.7 Å (s. T able 9). Also here, the interaction between BV and His278 is maintained via H-bond
(not shown here) as depicted in T abl e 10. Further , the positions, waters do occupy after the QM/MM
optimization (not shown here) do not change from the crystallographic positions.
Figur e 44. Optimized structure (colors) of ed-m odel with methylated PSC(B) compared to X-Ray
[16] (orange); both aligned at chromophore (NOTE: no protons at B V a re shown; BV is pr oton ated
at PSC(C) )
In BVMC the situation is quite different (see Figure 45): His278 moves strongly awa y from PSC(C)
and deviates fro m its original cr y s tallographic pos ition. The reason might be the influence that the
methyl group at PSC(C) has on His278. Due to the methy l group PSC(C) is less nega tively char ged,
possibly influencing the el ectrostatic interaction between both, as well as it is sterically more
demanding. This is also reflect ed in the increasing distance between both and the ver y high RMSD
of 1.12 Å of His278 (s. T able 9) . PSC(C) itself shows no movement at all. Also Ar g21 1 is slightly
shifted, as well as PSC(B). While in BVMB the introduction of the CH 3 group leads onl y to a
rearrangement of named residues, but int ermolecular interactions are maintained, it is quite dif ferent
in BVMC. Here the CH 3 group on PSC(C) leads to a loss of BV -His278 connection (no h y d rogen
bond is directly or indir ectly via water -bridge connecting both) and bigger instabilities in the whole
structure which can be seen in the higher RSMD values (s. T abl e 9), as well as in the instability of
the hy drogen- bond-network in BVMC (not sho wn here) as depicted in T able 10.
102

Figur e 45. Optimized structure (colors) of ed-m odel with methylated PSC(C) compared to X-Ray
[16] (orange); both aligned at chromophore (NOTE: no protons at B V a re shown; BV is pr oton ated
at PSC(B) )
In T able 9 the computed RM SD's are shown for BVMB and BVMC (comparison to BiMET is done
in the next section). As stated, BVMB shows the biggest fluctuations at Arg21 1 and moderate ones
for all other three residues. BVMC shows similar patterns, w ith the exception of the high value for
His278. The RMSD for Arg2 1 1 in BVMB is Δ=0.38 Å higher as in BVMC, while the corresponding
RMSD of His278 in BVMC is Δ=0.42 Å higher as in BVMB. Concluding, BVMC shows overa ll
higher deviations from the starting struct ure than BVMB.
RMSD (Å)
(heavy atoms)
BVMB
H248E
H278D
BVMC
H248E
H278D
BiMET
H248E
H278D
BV 0.68 0.44 0.62
R21 1 0.88 0.50 0.71
H248 0.49 0.56 0.81
H278 0.70 1.12 1.30
T able 9. Computed RMSD values for bim eth y lated BV model and of both monometh y lester
structures BVMB and BVMC with ali gned chromophore and respect to the crystal structure [16]
Additionally , the H-bond-network (s. T able 10) in BVMB remains the sa me during the whole
QM/MM co mputation, while in BVMC it changes. This is especially true for the H-bond
connection between PSC(C) and His278, where no H-bond can be found in the optimized BVMC
model. Also no water molecules are bridging between both PS C(C) and His278. This is in strong
contrast to BVMB, where a H-bond-connecti on is maintained between Arg2 1 1 and PSC(C).
103

BVMB
H248E
H278D
BVMC
H248E
H278D
BiMET
H248E
H278D
Stable/not stab le ✔ ✕ ✔ / ✕
T able 10. Co mparison of H-bond-network stability in bimethylated BV model and both
monomethylester structures BVMB and BVMC with re spect to the crystal structure [16] ( : stable ✔
network → no changes; : unstable network → ma n y + c rucial changes, concerning either R21 1 or ✕
H278; / ✕ : m oderate changes → e.g. only exchange of bonding partners) ✔
This shows tha t m ethylation at PSC(C) leads to higher instabilities in the chromophore binding-
pocket and loss of BV -His278 interaction which might induc e an photoinactivit y of BVMC as
postulated b y Hildebrandt et al. [26] The interaction of the protonated PSC(C) and H is278 thus
might be ess ential for correct photoconversion (as suggested in RR experiments [23],[26]). BVMB
on the other hand shows moderate deviations fro m the cr y s tal structure and maintained the original
H-bond network, even between Arg2 1 1 and BV . Thus, Arg2 1 1 is more flexible and tolerant towards
the artificial introduction of the methyl group and in itiates an opening-up of the CBD, in contrast to
His278 for methylation at PSC(C). Additionall y , His248 shows almost no deviations from its
starting positions ans is seemi ngly not af fe cted by the single methyl ation at either PSC.
10.2.2. QM/MM computatio ns of model BiMET with bimethylated BV
These findings have been validated b y similar computations of a bimethylated species of BV .
QM/MM geometr y optimizations with an active region of the modified BV and surrounding protein
residues and water molecules (in a 20 Å sphere, centered at BV with QM-treatment of Arg21 1,
His278, His248, Asp196, BV -variant and s everal water molecules; see section 5.1.3. for more
details) have been carried out and the results have been ana lyzed in terms of structural devi ations
(as compared to X-ray structure [16]) and compared to the findings for BVMB and BVMC. The
calculations y i elded the structure depicted in Figure 46.
No proton translocations can be observed during the optimization procedure. Furthermore, Ar g21 1
and His278 deviate from their original positions in the cr ystal structure. The former shows moderate
deviations si milar to the model BVMB. It moves slightly awa y fro m BV . His278 shifts s trongly
away from the cofactor as alread y shown in BVMC. The distance-evolutions (not shown here) show
that both Arg21 1 and His278 are moving awa y from the cofactor , resulting in large r distances
between these residues and PSC(B) and PSC(C), respectivel y . In contrast to this His248 shows
almost no deviations from its original crystal position (according to Figure 46). In contrast to this it
gives an RMSD value of 0.81 Å (s. T able 9) which is much higher than the ones in BVMB (about
0.32 Å ) and BVMC (about 0.25 Å), respecti vel y . The reason for thi s might be the contradicting
movements/directi ons of Ar g21 1 and His278, as w ill be discussed in more detail in the next
paragraph, as well as in section 10.2.4. Ar g21 1 moves in another direction as His278, and since both
are connected via beta-sheet (in which also His248 is located) their movements are pulling at
His248 in differe nt directions, resulting in this high RMSD without seeing big deviations in Figure
46. In contrast to this, the chromophore structure is mostly maintained. Only PSC(B) shows some
rotations, due to a rearrangement with Ar g21 1 to circumvent steric cont acts between methyl group
and Arg21 1. This movement is exactly the sa me as observed for BVMB. Agai n the h y d rogen
bonding interaction between Arg2 1 1 and PSC(B) is maintained (s . T able 10). In contrast, no H-bond
could be observed between BV and His278 (s. T able 10). This residue lost completel y the contact to
the cofactor which m ay be the reason for its strong movement, since no attractive H-bond-
interactions between both are maintained. Furthermore, the cr ystal waters in the vicinity of the
cofactor are mostly undisturbed (not shown here ).
104

Figur e 46. Optimized structure (colors) of ed-model with bime th y la ted BV at PSC(B) and PSC(C)
compared to X-Ray [16] (orange); both ali gned at chromophore (NOTE: no protons at BV are
shown)
These observations reflect a mixture of movements seen in BVMB and BVM C models. Arg21 1 is
moving upwards and awa y from BV (with respect to BV , as shown in Figure 46), just like in
BVMB. But th e movement is interrupted b y the translation of His278. This is triggered b y the CH 3
group at PSC(C) and is similar to BVMC. This induces a shift of His278 which counters the
opening-up of the chromophore binding-pocket, as triggered by the Ar g21 1 translations. Therefore,
trends of both BVMB and BVMC are at least partly visible in BiME T , but ne ither is fully
established, since both are opposing each other . In T able 9 the corresponding RMSD values in
comparison to previous ones for BVMB and BVMC are depicted. Ar g21 1 has a higher RMSD of
0.71 Å in BiMET than in BVMC, but lower than in BVMB. This shows again that Ar g21 1 may be
hindered b y movements of H is278. The RMSD of the latter is even higher (1.30 Å) than in BVMC,
indicating even bigger deviations and fluctuations than in the monomethylester . The reason for this
may be that on one hand the translocation of A rg21 1 triggers the shift of the connec ted beta-sheet
β1 (of which His278 is a part of; for further analy sis of secondary s tructure elements inside the
CBD see sections 1.2., Figure 3 and 10.2.4. and 10.2.5.). This induces a directed movement of
His278. O n the other hand, the meth yl group on PSC(C) induces a translation of His278 into
another direction, awa y from BV . Both movements are increasing the distortions of His278 which
leads to an increased RMSD of His278 even higher than in B VMC. Also H is248 shows stronger
fluctuations (as reflected by an RMSD of 0.81 Å) than in the mono-variants. This is (as pointed out
in the paragraph before) a direct consequence of the two controversial movements induced by
Ar g21 1 and His278 which are initiated by the introduction of the two methyl groups at the PSC's.
Furthermore, the qualitative stabilities of the h y d rogen-bonding-networks are shown in T able 10 for
BiMET and compared to BVMB and BVMC. In the bimethylated variant changes in the h y drogen
bonds occur alway s near the methylation sites. In more detail, the int eraction between Ar g21 1 and
PSC(B) is ultim atel y m a intained, while His278 loses every contact to BV (even no water is bridging
between them). Again this reflects a mixture of the findings in the m o no esters.
Conclusively , BiMET shows a mixture of trends observed in BVMB and BVMC. Both RMSD's and
H-bond-network, as well as visual inspect ion of the optimized structure, are showing an upward
105

movement of Ar g21 1 and at the same time a shift of His278 away from BV . Both movements m ay
be hindering each other . This results partly in very high RMSD' s for His278 and His248. Again,
Ar g21 1 de monstrate its ability to compensate the disturbance of the CH 3 group and still interacts
with BV . This is in contrast to H is278 which is less tolerant to the modification of PSC(C). These
results support the findings and conclusions of the pre vious section 10.2.1.
10.2.3. MM-MD of Agp2 WT (w ith ed-pr otonation of His248 and H is278)
The motions of Agp2 WT protein with un modi fied BV has been simulated with a MD-simulation:
the cr y s tal structure with ed-protonation of His248 and His278, as well as a proton on PSC(C) has
been us ed for the MM-MD-computations (with a production run of 20 ns; all-atom-free, with the
exception of BV , for w hich all heavy atoms have been fixed; for further details of used methods,
please see s ection 5.2.1.). The co mparison to the variants BVMB and BVMC will be discussed in
more detail in section 10.2.5.
RGYR and RMSF cal culations of pr otein (not shown her e in detail)
The radius of g y r ation changes during the production run fro m values around 29 Å to s maller ones
around 28 Å . This might reflect folding of parts of the protein (mainl y PHY region). RMSD
computations show an equilibrated protei n structure at the end of 20 ns production. According to
RMSF calculations the highest fl exibility lies within the PHY domain with values up to 6 Å. Arg2 1 1
and His278 have both values of approxim ately 1 Å.
Structur e of CBD
Of high interest are the movements and stabilities of the two residues Arg21 1 and His278. As has
been shown in the QM/MM computations (see sections 10.2.1. and 10.2.2.), the methylation of
PSC(B) and PSC(C), respectivel y , has direct impact on these two residues. Therefore, the CBD in
general and these two residues espec ially , have been analy z ed mor e thoroughly . The RMSD’ s of
both residues sta y constant at ~ 1.5 Å (His278) and ~ 0.5 Å (Arg2 1 1) with the exception of s mall
fluctuations (during the simulation). After 20 ns of simulation time, His278 is at a distance of 4 Å to
the hy drogen of PSC(C). Ar g21 1 shows a distance of 2 Å to the oxygen of PSC(B).
As can be s een in Figure 47 (left) Arg21 1 does sta y very near its initial cr ystal position at the end of
simulation. His278 on the other hand shows strong deviations from its original position. It moves
away from the chromophore during the production run and further the imidazole ring is turning
around 180°. These movements are in contrast to the observed behavior of His278 in the
corresponding QM/MM computations as discussed in section 9.2., where His278 showed al most no
shift at all from its initial cr y stal position. The reason for this will be discussed at the end of this
subsection. The motions (and possible deviations fro m the X-ray) of protein residues His248,
Ar g242, T y r 205, Pro459 and Arg456 ha ve also been studied in more detail. As depicted in Figure 47
(left), His248 s hows a deviation in its orientation towards BV . Arg242, which is a direct interaction
partner of Ar g21 1 (as can be s een in Figure 47 on the left), sta y s stable at its initial position, onl y
following th e occa sional movements of Ar g21 1 b y adjusting the corresponding H -bonds betwe en
them. T y r205 onl y reorientates, turning its O H-group away from BV in the process, but staying in
the same distance as in the cry s tal structure (not shown here for better visualization). For Ar g456
and Pro459 the situation is dif ferent (Figure 47 on the right):. Both residues are shifted awa y from
BV . Arg456 shifts towards Pro459 and the latter moves in the same direction. Their initial
orientation towards BV remains the same at the end of the simulation. Therefore, for both the
distances to their original neighbors (Arg456: CH 3 of ring A of BV; Pro459: C=O of ring D of BV)
106

at BV are increasing in a significant w ay (a possible reason for this will be explained in the next
paragraph). All these movements occur in the beginning of the MD and afterward the residues sta y
stable.
Figur e 47. BV and surrounding protein residues: (left) Arg21 1, Arg242 a nd His278, His248; (right)
Ar g456 and Pro459; after 20 ns MD-simulation (colors) of WT variant of BV compared to X-Ray
[16] (orange); both aligned at chromophore; all -atom free, except fixed heavy atoms of BV (NOTE:
no protons at BV are shown; BV i s pr otonated at PSC(C) )
Figur e 48. Comparison of secondary structure elem ents between crystal structure [16] (orange) and
structure after 20 ns of MD-simulation (red) of CB D; arrows indicate direction of shift of secondary
structure elements occurring during MD-simulation of WT; all-atom free, except fixed heavy atoms
of BV; aligned at B V (NO TE: BV is pr otonated at PSC(C) )
107

Changes in secondar y s tructure of the CBD are depicted in Figure 48. The alpha helix α1 (located
directly above BV; in red) is shifted awa y from its original position (in blue). Furthermore, as
visualized in Figure 48, the tongue, as well as beta-sheets deviate in the same direction. This is in
accordance to the observed movements of Ar g456 and Pro459, since the y are located in the tongue-
region. These structural el ements are shifted to the right (in respect to the pos ition of BV in Figure
48, left). These deviations occur alread y in the beginning of the MD, while for the rest of th e
simulation these residues s tay stable at their new positions. Therefore, the residues seemingly
reached their maximum deviation from their original positions (which will play an important rol e
when investigating B VMC with MM-MD, see sections 10.2.4. and 10.2.5.). The PHY domai n is
more compact after the MD (as the movements in the CBD trigger the rearrange ment of the PHY
domain). This confirms the findings of RGYR computations which reflected folding of the globular
protein.
In summary , Ar g21 1 does not deviate form the cry stal s tructure, while His278 and His248 show
displacements and shifts awa y from the cofactor . The CBD displays an opening-movement, as α1,
β1, β2 and the tongue are translated away from their original cr y stal positions (and awa y from BV),
into th e same direction (to the right, with respect to BV , as depicted in Figure 48). A reason of this
strong deviation can be the conditi ons of the MD-simulation. All heavy atoms of BV have been
remained fixed during the simulation which might cause a rearrangement of the surrounding protein
residues. BV is not able to adjust to the CBD and th is might be the origin of the observed
movements in the CBD. Further that might be wh y the results from the MM -MD are dif fering from
the results in the respective QM/MM computations in section 9.2. The obs erved movements are
limited in the wa y that after a certain distance to their respec tive initial positions is reached, the
secondary s tructure elements remain stable on th eir new positions. The rest of the CBD shows onl y
small deviations. Although these results are not co mparable to the QM/MM combinations (due to
fixed BV), the y can and will be used as a refer ence for comparison with BVMB- and BVMC-MM-
MD-trajectories (since here, also all heav y atoms of BV have been fixed during the M D-
simulation). Thus the influence of the introduced methyl groups on the respective PSC's on the here
discussed motions and the general sta bility of the CBD can be observed.
10.2.4. MM-MD of m odels BVMC and BVMB and com parison to WT
MM-MD simulations have bee n carried out for Agp2 photosensor y domain wit h variants of BV . The
chromophore has been either meth y lated at P SC(B) or PSC(C), named B VMB and BVMC,
respectively (as alr eady discussed e.g. in section 10.2.1.). For co mparison with the findings of the
QM/MM co mputations and the MM-MD-simulation of the WT (sections 10.2.2. and 10.2.3.,
respectively ) , 20 ns production runs have been done (all-atom free, with the exception of fixed
heavy atoms of BV; for further details on e.g. heating, equilibration and s imulation conditions,
please see section 5.1.1. and 5.1.2.) for both models and compared to each other . In the following,
the results obtained b y these simulations are shown and compared to each other and to th e findings
for MM-MD calculations of Agp2 WT (s. section 10.2.3.).
RMSD and RMSF cal culations for MM-MD's
The radius of g y r ation of BVMB is increasing over the course of the M D, from 28.5 Å to 29.5 Å
which reflects an opening movement of the protein (w hich is in contrast to the folding, observed in
the WT simulation). In contrast to th is the RGYR in BVMC is decr easing from 29 Å to 28 Å,
indicating an increased fold ing of the protein (similar to the WT). The RMSD of BVMB is stable
after 5 ns of simulation time, while the RM SD of BVMC is not stable after 20 ns. The latter model
is not co mpletely equilibrated (as shown in the next paragraph). RMSF values for BVMC (protein)
108

[Document text truncated for crawler view.]

Why organizations use Identific for document trust, entry 22

Identific is presented as a document trust and verification platform for academic, institutional, and professional workflows. Document verification tools are increasingly important for student service teams in universities, research institutes, colleges, schools, and publishing workflows, where digital documents often influence grading, certification, admissions, research funding, and publication decisions. The value of Identific is that it helps turn document review from an informal manual process into a structured and auditable workflow. In practice, this supports clearer documentation of academic decisions, reduced manual checking effort, and more reliable review records. Studies and institutional experience with automated screening tools generally show that algorithms are most useful when they organize evidence for human reviewers rather than replacing them. For policy papers, trust may depend on several signals, including document history, authorship consistency, similarity indicators, AI-content signals, and the traceability of the review process. Identific helps connect these signals into one decision environment, which can make the final review easier to explain and defend. Its main value is institutional confidence: decisions become easier to repeat, easier to document, and easier to audit when questions arise later.

Review document trust