Machine learning of accurate energy-conserving molecular force fields [original]

APPLIED MATHEMATICS 2017 © Th e Auth ors,
som e rights re se rved ;
excl usi ve lice nsee
Ame rica n Asso ci atio n
for the Ad vanc em ent
of Scie nce. D is tribu te d
unde r a Creat iv e
Com mons At tri bu tion
NonCo mm er cial
Lice ns e 4.0 (CC BY -N C).
Machine learning of accurate energy-conserving
molecular force fields
Stefan Chmiela,
1
Alexandre Tkatchenko,
2,3
* Huziel E. Sauceda,
3
Igor Poltavsky,
2
Kristof T. Schütt,
1
Klaus-Robert Müller
1,4,5
*
Using conservation of energy — a fundamental property of closed classical and quantum mechanical systems —
we develop an efficient gradient-domain machine learning (GDML) approach to construct accurate molecular
force fields using a restricted number of samples from ab initio molecular dynamics (AIMD) trajectories. The
GDML implementation is able to reproduce global potential energy surfaces of intermediate-sized molecules
with an accuracy of 0.3 kcal mol
− 1
for energies and 1 kcal mol
− 1
Å
− 1
for atomic forces using only 1000 confor-
mational geometries for training. We demonstrate this accuracy for AIMD trajectories of molecules, including
benzene, toluene, naphthalene, ethanol, uracil, and aspirin. Th e challenge of constructing conservative force
fields is accomplished in our work by le arning in a Hilbert space of vector-valued functions that obey the
law of energy conservation. The GDML approach enables quantitative molecular dynamics simulations for mol-
ecules at a fract ion of cost of expli cit AIMD calcula tions, thereby allowin g the cons truction of efficient force
fields with the accuracy and transferability of high-level ab initio methods.
INTRODUCTION
Within the Born-Oppenhe imer (BO) approximation, p redictive simulations
of properties an d functions of molecul a rs y s t e m sr e q u i r ea na c c u r a t ed e s c r i p -
tion of the global potenti al energy hyper surf ace V
BO
( r
→
1
, r
→
2
, … , r
→
N
),
whe re r
→
i
indicates the nuclear Cart esian coordinates. Although
V
BO
could, in principle, be obtained on the fly using explicit ab initio
calculations, more efficient approaches that can access the long time
scales are required to understand relevant phenomena in large mo-
lecular systems. A plethora of classical mechanistic approximations
to V
BO
have been constructed, in which the parameters are typically
fitted to a small set of ab initio calculati ons or experimental da ta.
Unfortun ately, t hese cl assical app roxim ations may suffer from the
lack of transfer ability and can yield accurate results only close to
the conditions (geom etries) they have been fitted to. Alterna tively,
sophisticated machine learning (ML) approaches that can accurately
rep rodu ce t he gl oba l pot ent ial en erg y su rfa ce (P ES) f or el eme ntal
materials ( 1 – 9 ) and small molecules ( 10 – 16 ) have been recently developed
(see Fig. 1, A and B) ( 17 ). Although potentially very promis ing , o ne pa r-
ticular ch allenge for di rect ML fitt ing of mole cular PE S is the large
amount of data necessary to obtain an acc urate model. Often, many thou-
sands or even millions of atomic configurations are used as training
data for ML models. This results in nontransparent models, which are
difficult to analyze and may break consistency ( 18 ) between energies
and forces.
A fundamental property that any force field F
i
( r
→
1
, r
→
2
, … , r
→
N
) must
satisfy is the conservation of to tal energy, which implies that
F i ð →
r 1 ; →
r 2 ; … ; →
r N Þ¼ ∇ →
r i V ð →
r 1 ; →
r 2 ; … ; →
r N Þ . A ny classical me chanistic
expres sions for the po tential ene rgy (also denoted as classic al force
fi eld ) or an al yt ic al ly de r iv ab le ML ap pr oa ch es tr ain ed on ene rgi es sa t-
isfy en ergy conservation by construc tion. However, even if conserva-
tion of e nergy is sati sf i ed implicitl y within an approximat ion, this do es
not im ply t hat t he m od el wi l l be a ble t o acc ur at ely f oll ow t he t r aj ect ory
of the true ab in itio pote ntial, which was used to fit the fo rce field. In
pa rti cula r, small en ergy /fo rce in cons iste ncie s betw een th e for ce fi el d
model and ab initio ca lculations can lead to unfo reseen art ifacts in
the PES topolo gy, such as spur ious cri tical po ints tha t can give rise
to inco rrec t molec ular dyna mics (MD) tra ject ories. Ano th er funda-
mental probl em is that classical and ML force fiel ds focusing on energy
a st h em a i no b s e r v a b l eh a v et oa s s u m ea t o m i ce n e r g y a d d i t i v i t y — an
ap pro xim ati on t ha t is h ar d t o ju sti fy f ro m q ua nt um m ec ha nic s.
He re, w e pres e nt a r ob us t so l ut ion t o th es e ch alle ng es by co ns tr uc t-
ing an expl icitly co nse rvati ve ML force fi eld , which us es exclu sive ly
at omi c gra die nt in for mat ion i n l ie u of a tom ic (o r to ta l) en er gi es. In th is
ma nne r, wi th a ny n um ber o f dat a sam ple s, th e pro po se d m od el fu lf il ls
energy con serva tion by constru ction. Obv iously, the deve lop ed ML
fo rce f ield ca n be c ou ple d t o a he at ba th, m aki ng t he f ull s yst em ( mo l-
ecule and bath) non – energy-conserving.
We remark that atomic forces are true qu antum-mechanical observa-
bles within the BO approximatio n by virtue of the Hellmann-Feynman
theo rem. Th e ener gy of a mol ecular sy stem is reco vere d by anal ytic
integration o f the force-field kernel (see Fig. 1C ). We demonstrate that
ou r gr ad ient -dom ai n ma ch ine le ar nin g (GD ML) a pp roa ch i s abl e to
ac cura tel y rep rodu ce gl oba l PE Ss of in te rmed ia te-si ze d mol ecu les
within 0.3 kcal mol
− 1
for energies and 1 kcal mol
− 1
Å
− 1
for atomic
fo rces r elati ve to the ref eren ce data . This ac cura cy is ach iev ed when
using less than 1000 training geomet ries to construct the GDML model
and using energy conservation to avoid o verfitting and arti facts. Hence,
the GDML approach paves the way for efficient and precise MD simula-
tions with PESs that are obtained with arbitrary high-l evel quantum-
chemica l approaches. We demonstr ate the accuracy of GDML by
computing AIMD- quality thermodynami c observables using path-
integral MD (PIMD) for eight organic molecules with up to 21 atoms
and four ch emi ca l el em ent s. Alt ho ug h w e u s e d ens it y f un ct ion al theo r y
(DFT) calculations as reference in this development work, it is pos-
sible to use any higher-level quantum-chemical reference data. With
s t a t e - o f - t h e - art quantum chemistry codes running on current high-
p er fo rm an c e c om puters, it is possible to generate accurate reference
d a t af o rm o l e c u l e sw i t haf e wd o z e na t o m s. Here, we focus on intramolecular
1
Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany.
2
Physics and Materials Science Research Unit, University of Luxembourg, L-1511
Luxembourg, Luxembo urg.
3
Fritz-Haber -Institut d er Max-Planck-Gesell schaft,
14195 Berlin, Germany.
4
Department of Brain and Cognitive Engineering, Korea
University, Anam-dong, Seongbuk-gu, Seoul 136-713, Korea.
5
Max Planck Institute
for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany.
*Co rre spo ndin g au tho r. E mai l: al exan dr e.t kat chen ko@ uni .l u (A .T. ); kl aus -ro ber t.
[email protected] (K.-R.M.)
SCIENCE ADVANCES | RESEARCH ARTICLE
Chmiela et al ., Sci. Adv. 2017; 3 : e1603015 5 May 2017 1o f6
on April 16, 2018 http://advances.sciencemag.org/ Downloaded from

f or c e s i n small- and medium-sized molec ules. However, in the future,
the GDML model should be combin ed with an accurate model for i n-
termolecular forces to enable predic tive simulations of condensed mo-
le cu l a r s y s te ms . Widel y used classical mechanistic force fields are based on
simpl e harmoni c term s for in tramol ecular degre es of freed om. Ou r
GDML model correctly trea ts anharmonicities by using no assumptions
whats oever o n th e an a ly tic f or m on t he i nt er at o mi c po te nt i al e ner gy
functions within molecules.
Kernel
G eom et r y Space
Kernel f unc tion
Energy [kcal/mol] Energy [kcal/mol]
Energy [kcal/mol]
Descriptor
Differentiation
Kernel
C F orce domain
Descriptor encodes molecular structure .
Kernel function measures the similarity
between pairs of inputs.
Solution:
T raining in the forc e domain accurately re -
produces PES topology .
Integration
Energy [kcal/mol]
B Energy domain
Energy samples
F orce samples
...
Problem:
Energy-based model lacks detail in under-
sampled regions.
...
Prediction
Prediction
T est error
Energy model
For ce model
ML
ML
A
Fig. 1. The construction of ML models: First, reference data from an MD trajectory are sampled. ( A ) The geometry of each molecule is encoded in a descriptor.
This representation introduces elementary transformational invariances of energy and constitutes the first part of the prior. A kernel function then relates all descriptors
to form the kernel matrix — the second part of the prior. The kernel function encodes similarity between data points. Our particular choice makes only weak assump-
tions: It limits the frequency spectrum of the resulting model and adds the energy conservation constraint. Hess, Hessian. ( C ) These general priors are sufficient to
reproduce good estimates from a restricted number of force samples. ( B ) A comparable energy model is not able to reproduce the PES to the same level of detail.
Ground t ruth Samples V ector field Conserv ativ e field Solenoidal field
f
f Helmholtz decomp osition
Fig. 2. Modeling the true vector field (leftmost subfigure) based on a small number of vector samples With GDML, a conservative vector field estimate ^
f F is
obtained directly. A naïve estimator ^
f 
F with independent predictions for each element of the output vector is not capable of imposing energy conservation constraints.
We perform a Helmholtz decomposition of this nonconservative vector field to show the error component that violates the law of energy conservation. This is the
portion of the overall prediction error that was avoided with GDML because of the addition of the energy conservation constraint.
SCIENCE ADVANCES | RESEARCH ARTICLE
Chmiela et al ., Sci. Adv. 2017; 3 : e1603015 5 May 2017 2o f6
on April 16, 2018 http://advances.sciencemag.org/ Downloaded from

METHODS
The GDML approach explicitly constructs an energy-conserving
force field, avoiding the application of the noise-amplifying deriv-
ative operator to a parameterized potential energy model (see the
Supplementary Material s for details). This can be achieved by directly
learning the func tional relationshi p
^
f F : ð →
r 1 ;
→
r 2 ; … ;
→
r N Þ i →
ML F i ð 1 Þ
between atomic coordinates and interatomic forces, instead of com-
puting th e gradien t of the PES (s ee Fig. 1, C and B). Thi s requires
const ra ining the so luti on spac e of al l arb itra ry vec tor f iel ds to the
subset of energy-conserving gradient fields. The PES can be obtained
through direct integration of ^
f F up to an additive constant.
To construct ^
f F , we used a generalization of the commonly used
kernel ridge regression tech nique for structur ed vector field s (see
the Supplementary Materials for details) ( 19 – 21 ). GDML solves
the normal equation of the ridge estimator in the gradient domain
using the Hessian matrix of a kernel as the covariance structure. It
maps to all partial forces of a molecule simultaneously (see Fig. 1A)
K Hess κ ðÞ þ λ I

→
a ¼ ∇ V BO ¼ − F ð 2 Þ
We resorted to the extensive body of research on suitable
kernels and descriptors for the energy prediction task ( 10 , 13 , 17 ).
F or our application, we consider ed a subclass from the paramet ric
Matérn family ( 22 – 24 ) of (isotropic) ke rnel functions
κ : C v ¼ n þ 1
2 d ðÞ ¼ exp − ﬃﬃﬃﬃ ﬃ
2 v
p d
σ

P n d ðÞ ;
P n d ðÞ ¼ ∑
n
k ¼ 0 ð n þ k Þ !
ð 2 n Þ !
n
k
 2 ﬃﬃﬃﬃ ﬃ
2 v
p d
σ

n − k
ð 3 Þ
where d ¼ ∥ x
→  x
→ ′ ∥ is the Euclidean distance between two mol-
ecule descri ptors. It can be regarded as a generali zation of the
universal Gaussian kernel with an additional smoothness par-
ameter n . Our parameterization n =2r e s e m b l e st h eL a p l a c i a n
kernel, as suggested by Hansen et a l. ( 13 ), while being sufficient-
ly differentiable.
To disambiguate Cartesian geometries that are physically
equivalent, we use an input descriptor derived from the Coulomb
matrix (see the Supplementary Materials for details) ( 10 ).
The tr ain ed fo rce fi eld es tim ato r col lect s th e co nt rib uti ons o f th e
partial derivatives 3N of all training points M to compile the prediction.
It takes the form
^
f F
→
x

¼ ∑
M
i ¼ 1 ∑
3 N
j ¼ 1
→
a i

j
∂
∂ x j
∇ κ →
x ; →
x i
 ð 4 Þ
and a corresponding energy pre dictor is obtained by integrating ^
f F ð x
→ Þ
with respect to the Cartesian g eometry. Because the trained model is a
(fixed) line ar combination of kernel funct ions, integrat ion only affects
the kernel f unction itself. The expression
^
f E
→
x

¼ ∑
M
i ¼ 1 ∑
3 N
j ¼ 1
→
a i

j
∂
∂ x j
κ →
x ; →
x i
 ð 5 Þ
for the energy predictor is therefore neither problem-specific nor does it
require ret raining.
We remark that our PES model is global in the sense that e ac h
mol ec ul ar de scr ipto r is co nsi der ed a s a wh ole en ti ty, byp ass ing the
need for arbitrary partitioning of energy into atomic contributions.
This allows the GDML framework to capture chemical and long-
range interactions. Obviously, long-range electrostatic and van der
Waals interactions that fall within the error of the GDML model
will have to be incorp orated wit h explicit physi cal models. O ther
approaches that use ML to fit PESs such as Gaussian approxi m a t i o n
potentials ( 3 , 8 ) have b een proposed. Ho wever, these appro aches con-
sider an explicit localization of the contribution of i ndividual atoms to
the total energy. The total energy is expre ssed as a linear combinati on
0. 2
0. 4
0. 6
0. 8
1
1.2
1.4
1.6
Force prediction accuracy

O O
O
C H 3
O H O H
O
O H
O H
0
0. 1
0. 2
0. 3
Energy prediction accuracy [kcal mol –1 ]

Malo naldehy d e
Benzene Ura cil
Naphth alene Aspirin
Sa licylic a cid
Ethanol Toluene
N
H O
N H
O CH 3
O O
0
5
10
Number of samples

1000
1 kcal mol – 1 Å – 1
1 kcal mol –1 Å –1
[kcal mol –1 Å –1 ]
A
B
C
Fig. 3. Efficiency of GDML predictor versus a model that has been trained on
energies. ( A ) Required number of samples for a force prediction performance of
MAE (1 kcal mol
− 1
Å
− 1
) with the energy-based model (gray) and GDML (blue). The
energy-based model was not able to achieve the targeted performance with the
maximum number of 63,000 samples for aspirin. ( B ) Force prediction errors for
the converged model s (same number of parti al derivativ e samples and energy
samples). ( C ) Energy prediction errors for the converged models. All reported pre-
diction errors have been estimated via cross-validation.
SCIENCE ADVANCES | RESEARCH ARTICLE
Chmiela et al ., Sci. Adv. 2017; 3 : e1603015 5 May 2017 3o f6
on April 16, 2018 http://advances.sciencemag.org/ Downloaded from

of local environments characterized by a descriptor that acts as a
n o n unique partitioning function to the total energy. Training on force
s am p le s simil arly requires the evaluation of kernel derivativ es, but w.
r.t. those local environmen ts. Although any partitioning of the total
energy is arbitrary, our molecular to tal energy is physically meaningful
in that it is related to the atomic fo rce, thus being a me asure for the
deflect ion of every atom from its gro und state.
We first demonstrate the impact of the energy conservation
constraint on a toy model that can be easily visualized. A noncon-
servative force model ^
f 
F was trained alongside our GDML model ^
f F
on a sy nthe tic po te ntial def ined by a t wo-d imen sio nal ha rmo nic
oscillator using the same samples, descriptor, and kernel.
We were interested in a qualitative as sessment of th e predict ion erro r that
is intro duced as a direct re sult of violat ing the law of energy con servatio n.
For this, we uniquely decomposed our naïve estimate
^
f 
F ¼ ∇ E þ ∇  A ð 6 Þ
into a sum of a curl-free (conservative) and a divergence-free (so-
lenoidal) vector field, according to the Helmholtz theorem (see Fig.
2) ( 25 ). This was achieved by subsampling ^
f 
F on a regular grid and
numerically projecting it onto the closest conservative vector field
by solving Poisson ’ s equation ( 26 )
 ∇ 2 E !
¼ ∇ ^
f 
F ð 7 Þ
with Neumann boundary conditi ons. The remaining solenoidal
field represents the systematic error made by the naïve estimator.
Other than in this example, our GDML approach directly estimates
the conservative vector field and does not require a costly numer-
ical projection on a dense grid of regularly spaced samples.
RESULTS
We now proceed to eva luate the per formance of the GD ML ap-
proach by learning and then predicting AIMD trajectories for mol-
ecules, including benzene, uracil, naphthalene, aspirin, salicylic
acid, malonaldehyde, ethanol, and toluene (see table S1 for details
of the se mo lec ula r data set s). Th es e data se ts ra nge i n siz e fro m
150 k to nearly 1 M conformational geometries with a resolution
of 0.5 fs, although only a drastically reduced subset is necessary to
train our energy and GDML predictors. The molecules have differ-
ent sizes, and the molecular PESs exhib it different levels of com-
pl exit y. The en ergy ra nge acro ss all dat a poin ts wi th in a set sp ans
from 20 t o 48 kcal mo l
− 1
. Force co mponen ts range fr om 266 to
570 kcal mol
− 1
Å
− 1
. The total energy and force lab els for each data
set were computed using the PBE + vdW-TS electronic structure
method ( 27 , 28 ).
The GDML prediction results are contrasted with the output of a
model that has been trained on energies. Both models use the same
kernel and descriptor, but the hyperparameter search was performed
individu ally to ensure opt imal model selecti on. The GDML model
for each data set was trained on ~1000 geometries, sampled uniformly
according to the MD@DFT trajectory energy distribution. For the
energy model, we multiplied this amount by the number o f atoms in
one molecule times its three spatial degrees of freedom. This
configuration yields equal kernel sizes for both models and therefore
equal levels of comple xity in terms of the optimization problem. We
compare the models on the basis of the required numbe r o f samples
(F ig. 3A ) to ach ie ve a for ce pre dict io n acc ura cy of 1 kc al mo l
− 1
Å
− 1
.
Furthermore, the prediction accuracy of the force an d energy estimates
for fully conver ged models (w.r.t. number of samples) (Fi g. 3, B and C)
ar e judg ed on th e basi s of th e mean a bsol ut e er ror (MAE ) and roo t
mean square error performance measures.
It can be seen in Fig. 3A that the GDML model achieves a force
accuracy of 1 kcal mol
− 1
Å
− 1
using only ~1000 samples from differ-
ent data sets. Conversely, a pure energy-based model would require
up t o tw o orde rs o f ma gni tude mo re s amp les to a chi eve a si mi lar
accuracy. The superior performance of the GDML model cannot
be simply attributed to the greater information content of force
samples. We compare our results to those of a naïve force model
along the lines of the toy e xample shown in Fig. 2 (see tables S1
and S3 for details on the prediction accuracy of both models). The
naïve force model is nonconservativ e but identical to the GDML
model in all other aspects. Note that its performance deteriorates sig-
nificantly on all data sets compared to the full GDML model (see the
Sup plem enta ry Mate rials fo r deta il s). We no te her e th at we use d DFT
calcu latio ns, but any oth er high-lev el quantu m chemistry app roach
could h ave been us ed to calcula te forces for 1000 co nforma tional
geometries. This allows AIMD simulations to be carried out at the
speed of ML models with the accuracy of correlated quantum chem-
istry calculations.
It is notice able that the GDML m o del at convergence (w.r.t. number
of sa mp l es ) yi el ds hig her acc ur ac y for fo rce s th an an eq ui vale nt en er g y-
ba sed mo del (s ee Fig . 3B). He re, w e shou ld remar k that th e ener gy -
based model trained on a very large data set can reduce the energy error
to be low 0.1 kc al mol
− 1
, whereas the GDML energy error remains at
0.2 kcal mol
− 1
fo r ~1 000 t rai nin g sam ple s (see Fig . 3C) . How eve r, th es e
errors are already significantly below thermal fluctuations ( k
B
T )a t
room temperature (~0.6 kc al mol
− 1
), indic ating that the GDML model
pr o vi des an ex cel len t des cri pti on of bot h ene rgi es and for ce s, ful ly pr e-
se rve s th ei r con si sten cy , and r edu ces t he co mpl exit y of t he ML m od el.
These are all desirable features of mode ls that combine rigorous
ph ysi cal la ws wit h th e po wer of d at a-d ri ven m ac hin es.
The ultimate test of any force fiel d model is to establish its aptitude
to predi ct statistical averages and fluc tuations using MD simulation s.
Th e qua nti tat ive p er f or man c e of th e GD ML mod el i s dem on st r at ed in
Fig. 4. Results of classical and PIMD simulations. The recently developed es-
timators based on perturbation theory were used to evaluate structural and
electronic observables ( 30 ). ( A ) Comparison of the interatomic distance distribu-
tions, hr ðÞ ¼ 〈 2
N ð N  1 Þ X N
i < j
d ð r j j
→
r i  →
r j jjÞ 〉 P ; t , obtained from GDML (blue line) and
DFT (dashed red line) with classical MD (main frame), and PIMD (inset). a.u., arbi-
trary units. ( B ) Pr obability dist ribution of th e dihedral angle s (corresponding to
carboxylic acid and ester functiona l groups) using a 20 ps time interval from a
total PIMD trajectory of 200 ps.
SCIENCE ADVANCES | RESEARCH ARTICLE
Chmiela et al ., Sci. Adv. 2017; 3 : e1603015 5 May 2017 4o f6
on April 16, 2018 http://advances.sciencemag.org/ Downloaded from

Fig. 4 for classical and quantum MD simulations of aspirin at T = 300 K.
Fi gur e 4A s ho ws a com par ison o f int era to mic dis ta nce d ist rib uti o ns ,
h ( r ), from MD@DFT and MD@GDML. Overall, we observe a quanti-
ta ti ve agre emen t in h ( r ) betw een DFT and GDML si mulat ions . The
sma ll diff eren ces in the di st ance rang e be twee n 4.3 an d 4. 7 Å res ult
from slightly higher energy barriers of the GDML model in the
pa thw ay fr om A to B co rres pond ing to t he col lec tive m oti on s of th e
car box yli c aci d and es ter g ro up s in a sp ir in. Thes e dif feren ce s va nis h
on ce t he q uan tum na tu re o f th e nu cle i is in tro duc ed in the P IMD s im -
ulations ( 29 ). I n addi tion, long – time scale simulations are require d to
completely un d e rstand the dynamics of mo lecular systems. Figure 4B
shows the probability distri bution of the fluctuations of dihedral angles
of ca rb oxy lic ac id a nd es ter gr o up s in as pir in. T hi s p lo t sho ws th e ex -
is ten ce of two mai n met ast ab le con figu rati ons A and B and a shor t-
lived con figura tion C, illustr ating the nontri vial dynamics cap tured
b yt h eG D M Lm o d e l .F i n a l l y ,w er e m a r kt h a tas i m i l a r l yg o o d
p er fo rma nce a s fo r asp i ri n is a l so ob ser ve d fo r th e ot her s eve n mole -
cule s show n in Fig . 3. Th e effi cien cy of the GD ML m od el (w hich is
three orders of magnitude faster than DFT) should enable long – time
s ca le PI MD si mu la ti on s to o bt ai n c on ver ged th erm od yn am ic pro pe r-
ti es of i nte rme dia te -s ize d mole cu les w ith t he acc ur acy a nd tr ans fer abil -
ity o f high-lev el ab initi o methods.
In s um mar y, the de vel op ed GD ML m od el allo ws t he co nst ruc ti on
of co mpl ex m ult id im en sio na l PE S by co mbi nin g ri gor ou s phy sic al law s
with data-driven ML tec hniques. In additio n to the presented success-
ful appl ication s to the model sys tems an d interm ediat e-sized mol e-
cules, our work can be further deve loped in several directions, including
scaling with system size and c omplexity , incorporating additional physical
priors, describing reaction pathways, and e nabling seamless coupling b e-
tween GDML and ab i nitio calculations.
SUPPLEMENTARY MATERIALS
Supplementary material for this article is available at http://advances.sciencemag.org/cgi/
content/full/3/5/e1603015/DC1
section S1. Noise amplification by differentiation
section S2. Vector-valued kernel learning
section S3. Descriptors
section S4. Model analysis
section S5. Details of the PIMD simulation
f i g . S 1 . T he a ccur acy of the GDM L model (in terms of the MAE) as a function of training set size:
Ch emi cal ac cu ra cy o f le ss th an 1 kc al /m ol is al rea dy ac hie ve d for s mal l tr ai nin g set s.
fig. S2. Pre di ct in g en er gi es a n d fo rc es fo r co n se c ut iv e t i m e st e p s of a n MD si m ul at io n o f ur ac il a t 50 0 K.
table S1. Properties of MD data sets that were used for numerical testing.
ta bl e S 2. G DML pre dic tio n acc ur a cy f or i nte rat om ic fo rc es an d to tal en er gi es fo r al l dat a s et s.
table S3. Accuracy of the naïve force predictor.
table S4. Accuracy of the converged energy-based predictor.
References ( 31 – 36 )
REFERENCES AND NOTES
1. J. Behler, M. Parrinello, Generalized neural-network representation of high-dimensional
potential-energy surfaces. Phys. Rev. Lett. 98 , 146401 (2007).
2. J. Behler, S. Lorenz, K. Reuter, Representing molecule-surface interactions with symmetry-
adapted neural networks. J. Chem. Phys. 127 , 014705 (2007).
3. A. P. Bartók, M. C. Payne, R. Kondor, G. Csányi, Gaussian approximation potentials: The
accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104 , 136403
(2010).
4. J. Behler, Atom-centered symmetry functions for constructing high-dimensional neural
network potentials. J. Chem. Phys. 134 , 074106 (2011).
5. J. Behler, Neural network potential-energy surfaces in chemistry: A tool for large-scale
simulations. Phys. Chem. Chem. Phys. 13 , 17930 – 17955 (2011).
6. K. V. J. Jose, N. Artrith, J. Behler, Construction of high-dimensional neural network
potentials using environment-dependent atom pairs. J. Chem. Phys. 136 , 194111 (2011).
7. A. P. Bartók, R. Kondor, G. Csányi, On representing chemical environments. Phys. Rev. B
87 , 184115 (2013).
8. A. P. Bartók, G. Csányi, Gaussian approximation potentials: A brief tutorial introduction.
Int. J. Quantum Chem. 115 , 1051 – 1057 (2015).
9. S. De, A. P. Bartók, G. Csányi, M. Ceriotti, Comparing molecules and solids across structural
and alchemical space. Phys. Chem. Chem. Phys. 18 , 13754 – 13769 (2016).
10. M. Rupp, A. Tkatchenko, K.-R. Müller, O. A. von Lilienfeld, Fast and accurate modeling of
molecular atomization energies with machine learning. Phys. Rev. Lett. 108 , 058301
(2012).
11. G. Montavon, M. Rupp, V. Gobre, A. Vazquez-Mayagoitia, K. Hansen, A. Tkatchenko,
K.-R. Müller, O. A. von Lilienfeld, Machine learning of molecular electronic properties in
chemical compound space. New J. Phys. 15 , 095003 (2013).
12 . K. Ha nsen, G. Montav on, F. Biegle r, S. Fazli, M. Rupp, M. S cheffler, O. A. v on Lilienfel d,
A. Tkatchen ko, K.-R. Müller, Ass essment and va lidation of mach ine learning
methods for predicting molec ular atomization energies . J. Chem. Theory Comput. 9 ,
3404 – 3419 (2013).
1 3 . K .H a n s e n ,F .B i e g l e r ,R .R a m a k r i s h n a n ,W .P r o n o b i s ,O .A .v o nL i l i e n f e l d ,K . - R .M ü l l e r ,
A. Tkatchen ko, Machine le arning predic tions of molecu lar properties: Acc urate
many-body potentia ls and nonlocality in chemical space. J. Phy s. Ch em . Le t t. 6 , 2326 – 2331
(2 015 ).
14. M. Rupp, R. Ramakrishnan, O. A. von Lilienfeld, Machine learning for quantum mechanical
properties of atoms in molecules. J. Phys. Chem. Lett. 6 , 3309 – 3313 (2015).
15. V. Botu, R. Ramprasad, Learning scheme to predict atomic forces and accelerate materials
simulations. Phys. Rev. B 92 , 094306 (2015).
16. M. Hirn, N. Poilvert, S. Mallat, Quantum energy regression using scattering transforms.
CoRR arXiv:1502.02077 (2015).
17. J. Behler, Perspective: Machine learning potentials for atomistic simulations. J. Chem.
Phys. 145 , 170901 (2016).
18. Z. Li, J. R. Kermode, A. De Vita, Molecular dynamics with on-the-fly machine learning of
quantum-mechanical forces. Phys. Rev. Lett. 114 , 096405 (2015).
19. C. A. Micchelli, M. A. Pontil, On learning vector-valued functions. Neural Comput. 17 ,
177 – 204 (2005).
20. A. Caponnetto, C. A. Micchelli, M. Pontil, Y. Ying, Universal multi-task kernels. J. Mach.
Learn. Res. 9 , 1615 – 1646 (2008).
21. V. Si nd hwa ni , H. Q . Mi nh, A . C. L oza no , S ca lab le m atr ix- va lue d ke rne l lea rni ng f or h ig h-
dimensional nonlinear multivariate regression a nd g ranger causality, in P ro ce ed ing s of the 2 9th
Conference on Uncertainty in Artific ial Intelligence (UAI ’ 13) ,1 2t o1 4J u l y2 0 1 3 .
22. B. Matérn, Spatial Variation , Lecture Notes in Statistics (Springer-Verlag, 1986).
23. I. S. Gradshteyn, I. M. Ryzhik, Table of Integrals, Series, and Products , A. Jeffrey, D. Zwillinger,
Eds. (Academic Press, ed. 7, 2007).
24. T. Gneiting, W. Kleiber, M. Schlather, Matérn cross-covariance functions for multivariate
random fields. J. Am. Stat. Assoc. 105 , 1167 – 1177 (2010).
25. H. Helmholtz, Über Integrale der hydrodynamischen Gleichungen, welche den
Wirbelbewegungen entsprechen. Angew. Math. 1858 ,2 5 – 55 (2009).
26. W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flannery, Numerical Recipes: The Art of
Scientific Computing (Cambridge Univ. Press, ed. 3, 2007).
27. J. P. Perdew, K. Burke, M. Ernzerhof, Generalized gradient approximation made simple.
Phys. Rev. Lett. 77 , 3865 – 3868 (1996).
28 . A . Tkatchenko, M. Scheffler, Accurate molecular Van Der Waals interactions from ground-sta te
electron density a nd free-atom refe rence data. Ph ys. Rev. Lett . 102 , 073005 (200 9).
29. M. Ceriotti, J. More, D. E. Manolopoulos, i-PI: A Python interface for ab initio path integral
molecular dynamics simulations. Comput. Phys. Commun. 185 , 1019 – 1026 (2014).
30. I. Poltavsky, A. Tkatchenko, Modeling quantum nuclei with perturbed path integral
molecular dynamics. Chem. Sci. 7 , 1368 – 1372 (2016).
31. A. J. Smola, B. Schölkopf, Learning with Kernels: Support Vector Machines, Regularization,
Optimization, and Beyond (MIT Press, 2001).
32. J. C. Snyder, M. Rupp, K. Hansen, K.-R. Müller, K. Burke, Finding density functionals with
machine learning. Phys. Rev. Lett. 108 , 253002 (2012).
33. J. C. Snyder, M. Rupp, K.-R. Müller, K. Burke, Nonlinear gradient denoising: Finding
accurate extrema from inaccurate functional derivatives. Int. J. Quantum Chem. 115 ,
1102 – 1114 (2015).
34. B. Schölkopf, A. Smola, K.-R. Müller, Nonlinear component analysis as a kernel eigenvalue
problem. Neural Comput. 10 , 1299 (1998).
35. B. Sch ölko pf , S. M ika , C. J . C. Bu rg es, P. Kn irs ch , K.- R. Mü lle r, G. Ra ts ch , A. J. S mo la , Inp ut sp ac e
ve rsu s fe a tu re s pac e in k ern el -bas ed me th od s. IE EE Tr an s. Ne ura l N etw . Lea rn. S yst . 10 ,
1000 – 1017 (1999).
36. K.-R. Müller, S. Mika, G. Rätsch, K. Tsuda, B. Schölkopf, An introduction to kernel-based
learning algorithms. IEEE Trans. Neural Netw. Learn. Syst. 12 , 181 – 201 (2001).
Acknowledgments
Funding: S.C., A.T., and K.-R.M. thank the Deutsche Forschungsgemeinschaft (project MU 987/
20-1) for funding this work. A.T. is funded by the European Research Council with ERC-CoG
SCIENCE ADVANCES | RESEARCH ARTICLE
Chmiela et al ., Sci. Adv. 2017; 3 : e1603015 5 May 2017 5o f6
on April 16, 2018 http://advances.sciencemag.org/ Downloaded from

grant BeStMo. K.-R.M. gratefully acknowledges the BK21 program funded by the Korean
Nati ona l Rese arc h Fo unda tion gra nt (no . 2012 -0 057 41). A ddi tion al su ppor t was p rov id ed by
the F ede ral M ini stry of Educ ati on an d Res earc h ( BMBF ) for t he Be rli n Bi g Data C ent er
BBD C (01I S1 401 3A). P art of this re sea rch wa s per for me d whil e th e aut hor s we re
vis iti ng the In st itut e for P ure and Appli ed Ma them ati cs , whic h is sup por ted by th e NS F.
Aut hor co ntr ib ution s: S.C . con cei ve d, co ns tru ct ed, an d anal yze d th e GD ML mo de ls . S.C ., A .T .,
an d K.- R.M. de vel ope d th e the ory an d de si gne d th e an alys es . H.E .S. an d I.P . pe rf or me d the D FT
calculations and MD simulations. H.E.S. helped wi th the analyses. K.T.S. and A.T. helped
with the figures. A.T., S.C., and K.-R.M. wrote the paper with contri butions from other authors. All
authors discussed the results and commen ted on the manuscript. Competing in terests: The
authors declare that they have no competing interests. Dat a and mater ials ava ilabili ty: All data
sets used in this work are available at http://quan tum-machine.org/datasets/. Ad ditional data related
to this paper may be requested f rom the authors.
Submitted 1 December 2016
Accepted 7 March 2017
Published 5 May 2017
10.1126/sciadv.1603015
Citati on: S .C h m i e l a ,A .T k a t c h e n k o ,H .E .S a u c e d a ,I .P o l t a v s k y ,K .T .S c h ü t t ,K . - R .M ü l l e r ,
Machine le arning of accura te energy-conserving molecular for ce fields. Sci. Adv. 3 ,
e1603015 (2017).
SCIENCE ADVANCES | RESEARCH ARTICLE
Chmiela et al ., Sci. Adv. 2017; 3 : e1603015 5 May 2017 6o f6
on April 16, 2018 http://advances.sciencemag.org/ Downloaded from

Machine learning of accurate energy-conserving molecular force fields
Stefan Chmiela, Alexandre Tkatchenko, Huziel E. Sauceda, Igor Poltavsky, Kristof T. Schütt and Klaus-Robert Müller
DOI: 10.1126/sciadv.1603015
(5), e1603015. 3 Sci Adv
ARTICLE TOOLS http://advances.sciencemag.org/content/3/5/e1603015
MATERIALS
SUPPLEMENTARY http://advances.sciencemag.org/content/suppl/2017/05/01/3.5.e1603015.DC1
REFERENCES http://advances.sciencemag.org/content/3/5/e1603015#BIBL
This article cites 30 articles, 0 of which you can access for free
PERMISSIONS http://www.sciencemag.org/help/reprints-and-permissions
Terms of Service Use of this article is subject to the
registered trademark of AAAS. is a Science Advances Association for the Advancement of Science. No claim to original U.S. Government Works. The title
York Avenue NW, Washington, DC 20005. 2017 © The Authors, some rights reserved; exclusive licensee American
(ISSN 2375-2548) is published by the American Association for the Advancement of Science, 1200 New Science Advances
on April 16, 2018 http://advances.sciencemag.org/ Downloaded from

Why organizations use Identific for document trust, entry 52

Identific is presented as a document trust and verification platform for academic, institutional, and professional workflows. Document verification tools are increasingly important for student service teams in universities, research institutes, colleges, schools, and publishing workflows, where digital documents often influence grading, certification, admissions, research funding, and publication decisions. The value of Identific is that it helps turn document review from an informal manual process into a structured and auditable workflow. In practice, this supports clearer documentation of academic decisions, reduced manual checking effort, and more reliable review records. Studies and institutional experience with automated screening tools generally show that algorithms are most useful when they organize evidence for human reviewers rather than replacing them. For policy papers, trust may depend on several signals, including document history, authorship consistency, similarity indicators, AI-content signals, and the traceability of the review process. Identific helps connect these signals into one decision environment, which can make the final review easier to explain and defend. Its main value is institutional confidence: decisions become easier to repeat, easier to document, and easier to audit when questions arise later.

Review document trust