Parameter Estimation for Semilinear
Stochastic Partial Differential Equations
submitted by
M. Sc.
Gregor Pasemann
ORCID: 0000-0002-8107-6685
to Faculty II (Mathematics and Natural Sciences)
of Technische Universität Berlin
for the award of the academic degree
Doctor of Natural Sciences
Dr. rer. nat.
approved dissertation
Doctoral committee:
Chair: Prof. Dr. Tobias Breiten
Reviewer: Prof. Dr. Wilhelm Stannat
Reviewer: Prof. Dr. Annika Lang
Date of the scientific defense: 27.08.2021
Berlin 2021
Abstract
The problem of parametric drift estimation for semilinear stochastic partial
differential equations (SPDE) is considered based on a maximum–likelihood
approach. The diffusivity of such models is estimated in finite time based
on a single trajectory with high resolution in space. This is implemented
by observing either a large number of Fourier modes (spectral approach), a
large number of spatial point evaluations of the process (discretized spectral
approach) or a convolution with a kernel of small diameter (local approach).
Asymptotic properties of different estimators within these observation
schemes are discussed, based on a spatial regularity analysis of
the solution to the underlying SPDE. Examples of the general theory include
reaction-diffusion equations, Burgers equation and equations of Cahn–Hilliard
type. Special emphasis is put on the issue of model misspecification,
with respect to either the drift or the driving noise.
The theoretical results are supported by a numerical simulation.
As an extension, the case of simultaneous diffusivity and reaction parameter
estimation from spectral observations is treated in the context of stochastic
activator-inhibitor models. This is applied to experimental observations
of the actin marker concentration within Dictyostelium discoideum giant
cells, whose spatiotemporal dynamics is described as a stochastic
FitzHugh–Nagumo system. The performance of different estimators is compared on
synthetic data from numerical simulation as well as real data.
Zusammenfassung (Summary)
This thesis is concerned with parametric drift estimation for semilinear
stochastic partial differential equations (SPDE) based on a
maximum-likelihood approach. The diffusivity of such models is estimated in
finite time from the observation of a single path with high spatial
resolution. This high spatial resolution is formalized by a large number of
eigenfrequencies (spectral approach), a large number of observed point
evaluations (discretized spectral approach) or a convolution with a kernel of
small diameter (local approach). For different estimators, the asymptotic
properties within these observation models are analyzed. The basis for this
is a precise determination of the spatial regularity of the solution to the
underlying SPDE. Examples of the general theory are reaction-diffusion
equations, the Burgers equation and equations of Cahn–Hilliard type.
Furthermore, misspecifications of the underlying model are treated, with
respect to both the drift term and the stochastic term.
The theory is supported by numerical simulations.
As an extension of the preceding theory, simultaneous estimation of diffusion
and reaction parameters within the spectral approach is considered in the
context of stochastic activator-inhibitor models. This is applied to
experimental observation data of the actin marker concentration in
Dictyostelium discoideum cells, where a description as a stochastic
FitzHugh–Nagumo system is assumed. The results of different estimators are
compared for simulations and experimental data.
Acknowledgments
I am grateful to numerous people, with whom I discussed and from whom I
learned during the last years. Especially, I would like to thank
my PhD supervisor Wilhelm Stannat for the support and guidance
through my doctorate,
Annika Lang for examining this dissertation,
Tobias Breiten for chairing the scientific defense,
all my coauthors, from whom I had the chance to learn a lot: Sergio
Alonso, Randolf Altmeyer, Carsten Beta, Igor Cialenco, Sven Flemming,
Hyun-Jung Kim, Wilhelm Stannat (in alphabetical order),¹
the anonymous referees for their helpful comments on the preprints and
publications,
the SFB 1294 “Data Assimilation” as well as the Technical University
Berlin (TU Berlin) for providing all the resources needed in order to
conduct this work,²
Igor Cialenco and the Illinois Institute of Technology (IIT) for the
hospitality during a research stay from January to March 2020,
Sven Flemming for preparing and providing the experimental giant cell
data used in Section 6.2,
Randolf Altmeyer for giving helpful feedback on the final draft of this
manuscript,
Markus Reiß for introducing me to the field of statistics for SPDEs,
my family for supporting me and believing in my project.
¹ During my time as a PhD student, I worked on four projects [PS20], [ACP20],
[PFA+21], [CKP21]. The first three works are the basis for this dissertation.
² This research has been funded by the Deutsche Forschungsgemeinschaft (DFG) -
Project-ID 318763901 - SFB1294.
Contents
1 Introduction
2 The Spectral Approach
2.1 The Setting
2.2 Spatial Regularity
2.3 Diffusivity Estimation
2.4 Discussion of Examples
2.4.1 Linear Perturbations
2.4.2 Reaction-Diffusion Equations
2.4.3 Burgers Equation
2.4.4 Equations of Cahn-Hilliard Type
2.4.5 Robustness under Model Misspecification
2.5 Numerical Illustration
2.6 The Case of Systems
3 Extended Noise Models for the Spectral Approach
3.1 The Case of Ornstein–Uhlenbeck Noise
3.1.1 Covariance Structure and Asymptotic Behavior
3.1.2 The Maximum–Likelihood Approach
3.1.3 Diffusivity Estimation
3.1.4 Correlation Decay Estimation
3.2 The Case of Integrated Noise
3.3 Structure of the Dispersion Operator
4 Discretization of the Spectral Approach
5 The Local Approach
6 Diffusivity Estimation for Activator-Inhibitor Models
6.1 Joint Diffusivity and Reaction Parameter Estimation
6.1.1 The General Case
6.1.2 The Statistically Linear Case
6.1.3 A Model of FitzHugh–Nagumo Type
6.2 Application to Cell Data
6.2.1 Evaluation of Simulated Data
6.2.2 Evaluation of Real Data
6.2.3 Evaluation of a Cell Population
6.2.4 The Effective Unstable Zero
6.2.5 The Effective Diffusivity Outside the Cell
7 Further Research
A Limit Theorems
B Additional Proofs
B.1 Proof of Proposition 2.17
B.2 Proof of Lemma 5.5
B.3 Proof of Proposition 6.6
List of Figures
Notation
Bibliography
Chapter 1
Introduction
Statistical inference for stochastic partial differential equations (SPDEs) is an
emerging field within the topic of statistics for stochastic processes. Formally,
an SPDE can be described as an evolution equation
$$\mathrm{d}X_t = \mathcal{A}(X_t)\,\mathrm{d}t + B(X_t)\,\mathrm{d}W_t \tag{1.1}$$
on an infinite-dimensional state space, with suitable initial condition. Here,
$\mathcal{A}$ determines the drift, $B$ acts as a dispersion operator and $W$ is a
cylindrical Wiener process. Further details are given below and in Section 2.1.
It is natural to use an SPDE in order to describe the spatiotemporal
dynamics of phenomena such as pattern formation or traveling waves, see
[SS16] and references therein for the propagation of action potentials in neu-
roscience, or [FFAB20] for actin dynamics within D. discoideum giant cells.
Such models may arise in different ways, for example by deriving them from
first principles, or as a phenomenological description, by adding noise to
a deterministic partial differential equation. A widely used class of mod-
els is given by stochastic reaction–diffusion systems, whose drift component
combines localized reaction dynamics with diffusive coupling in space, i.e.
$\mathcal{A} = \nabla\cdot(\theta\nabla) + F$, where $\theta$ describes the (possibly inhomogeneous and
anisotropic) diffusivity. The reaction term $F$ can reflect detailed knowledge
of the underlying (e.g. biophysical) processes, or it may be a minimal model
capable of reproducing certain features found in the observations. Different
approaches may be used to describe observed patterns, and it is desirable to
apply statistical methods in order to understand the advantages and limits
of different models.
Despite their great variety, a common feature of many SPDE models is
the presence of diffusive forcing. Consequently, a precise understanding of
the diffusivity $\theta$ is crucial. As a natural first approximation, the diffusivity
can be assumed to be homogeneous and isotropic, i.e. $\theta > 0$ is just a positive
number. In this case, $\nabla\cdot(\theta\nabla) = \theta\Delta$, where $\Delta$ is the Laplacian. Depending
on the amount and quality of the data at hand, this may be refined. For
example, in [AR21], an estimation theory for a stochastic heat equation with
inhomogeneous diffusivity is developed.
Context and Literature
In order to improve the understanding of a random field with spatial and
temporal extension based on observed data, different techniques from statis-
tical inference can be applied to different classes of models. We mention two
related approaches, which complement the modeling ansatz outlined above:
Apart from starting directly with an SPDE model, a Gaussian field with spec-
ified covariance function may be imposed, see e.g. [Yin93, Moh97]. Here, the
focus lies on features of the covariance rather than the dynamics of the tem-
poral evolution. Another related concept is a partial differential equation
with boundary noise, as studied in [BST80, AB88, MP07].
Literature surveys concerning statistical inference for SPDEs are given
in [Lot09, Cia18].¹ A classical problem concerns the estimation of unknown
parameters of the underlying model based on observed data [IH81]. Most
of the literature on statistics for SPDEs is related to parameter estima-
tion, which we are primarily interested in. Further central tasks that have
been studied include hypothesis testing [CX14, CX15], nonparametric esti-
mation [HL00a, HL00b, PR02, HT21a] and Bayesian inference [Bis99, PR00,
Bis02, CCG20]. Another different but related topic is stochastic filtering
(see [LS77, LS01, BC09] for the general theory and e.g. [Ouv78, RR20] for
infinite-dimensional processes). In [Lot04], the solution to a parabolic equa-
tion is interpreted as the observation process of a hidden parameter, which is
assumed to satisfy itself a stochastic differential equation. Estimation of un-
known quantities of the signal SPDE with maximum–likelihood methods in
the context of stochastic filtering is treated in [BB84, AS88, Aih92, Aih98b].
¹ A continuously updated list of references can be found at the webpage:
https://sites.google.com/prod/view/stats4spdes/
Inference on stochastic (ordinary) differential equations, SDEs for short,
is well-established, with a huge body of literature. We refer to [Kut04] for
a detailed analysis of various statistical questions for ergodic diffusion pro-
cesses, which can be analyzed through their large time behavior. It is natural
to consider the large time regime also for the case of SPDEs, with infinite
dimensional state space, exploiting ergodicity properties of the process. This
has been done by [Log84, KL85, KL86, Moh94] for parabolic equations (see
also [Aih98a]), and [Jan20, Jan21] for damped wave equations. The case of
fractional noise is treated in [MP07, MT13, KM19].
However, it has been noted in [HKR93, HR95] that it is possible to re-
cover certain drift parameters (for example, the diffusivity of a stochastic heat
equation) even in finite time. In fact, the presence of unbounded operators
implies that the measures on path space generated by the process in finite
time for different drift parameters can be singular. This is in strong contrast
to the case of SDEs with finite dimensional state space, where Girsanov’s
theorem assures that the measures on path space are absolutely continu-
ous. While this fragmentation of the path space into the domains of singular
measures can lead to analytical difficulties (e.g. due to the lack of a dom-
inating measure), it is helpful from a statistical point of view as it implies
the identifiability of the parameters: Given an observation X, one only has
to determine the measure whose support contains X. Of course, in practice
this is achieved by substituting the state space by some finite-dimensional ap-
proximation and studying the asymptotics of classical estimation techniques
based on that discretization.
The methods and works on parameter estimation for SPDEs can be cat-
egorized according to the observation scheme they are based on. Different
discretization schemes can be applied in space and time.
We focus on the idealized assumption that the process is observed con-
tinuously in time up to a fixed T > 0. However, there are various works on
the temporally discrete setting, see e.g. [Mar03, PR96, PR97, CDVK20] for
maximum likelihood-type drift estimation, or [BT19, BT20, Cho20, Cho19,
CD20, KU21b, KU21a, TTV14] for different approaches based on temporal
power variations.
Under full spatial observations, it is possible to study the limit of small
noise intensity, as done by [Hue99, IK99, IK00, IK01]. On the other hand,
partial knowledge on the process in space can be modeled in different ways:
The spectral approach is based on observing an increasing number of eigen-
frequencies of the process, which are associated to the highest order linear
differential operator appearing in the drift. For the (finite) set of observed
modes, maximum–likelihood techniques can be applied. This approach has
been introduced by [HKR93, HR95, Hue93] and extended by a large num-
ber of subsequent works. If the remaining drift and dispersion terms are
linear and commute with the highest order drift term, the SPDE decou-
ples to a system of independent one-dimensional processes. Even in the
non-diagonalizable case, similar techniques can be applied, see [Lot96, LR99,
LR00, Lot03]. Further, in [HLR97], estimators derived from Galerkin approx-
imations instead of spectral projections are discussed. In [LL10b, LL10a],
hyperbolic equations are considered. Trajectory fitting estimators have been
analyzed in [CGH18]. Different noise models are treated in [CL09, CCG20]
(multiplicative noise), [CLP09, Cia10, Hui14, Kří20] (additive and multi-
plicative fractional noise) and [CKL20] (space-only noise).
For spatially discrete observations, modeled as a set of spatial point eval-
uations of the solution process, a natural approach is to consider power vari-
ations in space and analyze the asymptotic behavior as the mesh size of the
underlying spatial grid tends to zero. This has been done in [PT07, CH20]
(for a stochastic heat equation), [MKT19a, MKT19b] (for a stochastic frac-
tional heat equation), [CKL20, CK22] (in the context of space-only noise)
as well as [SST20] (for a wave equation driven by fractional noise). A joint
spatiotemporal variation is considered in [HT21b, HT21a].
The new local approach, pioneered in [AR21], considers spatially discrete
observations as local averages rather than point evaluations of the process.
The process is weighted by some localized kernel, which is determined by
the measurement device and its resolution. The high frequency regime from
the spectral approach is substituted by a “shrinking kernel” regime that cor-
responds to high precision measurements. Interestingly, the diffusivity of a
stochastic heat equation is identifiable from one single localized observation
in finite time.
Most literature on SPDE parameter estimation is concerned with linear
equations. An early treatment of nonlinear systems appears in [Hue93, Chap-
ter 4], where estimators derived from Galerkin approximations are studied in
the scope of general maximum likelihood theory [IH81], based on a similar
analysis for ergodic diffusion processes [Kut04]. However, in this setting, the
observations are not a functional of the full process $X$, but rather a
finite-dimensional Markovian approximation to $X$. In [GM02], consistency of the
maximum likelihood estimator for a class of controlled stochastic reaction–
diffusion equations is shown in the large time regime (see also [DMPD00],
where the case that the nonlinear term depends linearly on its parameters
is treated separately). For finite observation horizon T, the first study of
parameter estimation from spectral observations of a nonlinear process has
been given in [CGH11] for the stochastic Navier–Stokes equations. The first
result on nonparametric estimation of a reaction term based on discrete ob-
servations is given in the recent work [HT21a], where the authors consider
one-dimensional stochastic reaction–diffusion equations.
So far, there are few works on application of SPDE parameter estima-
tion techniques to experimental data. We point out [Unn89, KUP91], where
parameter estimation for SPDE models arising in groundwater hydrology is
studied with a formal maximum likelihood method, and diffusion and advection
coefficients are calibrated from data.² Very recently, [ABJR21] considers
stochastic cell repolarization models in the context of the local approach.
Main Results and Outline
In this work, we consider drift parameter estimation for semilinear SPDEs.
We adapt the spectral and local approach to the general semilinear setting.
Based on a maximum-likelihood ansatz, we construct consistent estimators
for the unknown diffusivity and study their asymptotic properties in finite
time in detail. Depending on the specific setting, we are able to obtain op-
timal convergence rates as well as asymptotic normality, which allows for
the construction of confidence intervals. General results are given in Theo-
rem 2.11 and 2.12 for the spectral approach, and Theorem 5.8 for the local
approach. Examples are studied in Section 2.4. Our theory depends on a
precise understanding of the spatial regularity of the solution $X$ and related
processes. Special emphasis is put on robustness of the estimators to model
misspecification, either in the reaction term (Theorem 2.27) or in the speci-
fication of the dispersion operator (Theorem 3.22).
The general theory is supported by numerical evidence for the case of a
stochastic Allen–Cahn equation (Section 2.5).
² Within a different setting, [Moh00] considers a random advection-diffusion equation
in order to describe soil plutonium data.
We study how the results can be transferred to SPDEs with different
noise models that are relevant for applications. For Ornstein–Uhlenbeck
driven models, a detailed characterization is given in Theorem 3.7. As a new
feature, even the rate of temporal correlation decay of such models can be
identified in finite time (Theorem 3.10). On the other hand, the estimators
are very sensitive to deviations from the semimartingale setting, as shown
for the case of processes driven by integrated Wiener noise (Theorem 3.15).
In addition, we study to what extent the asymptotic results of the spec-
tral approach can be recovered if spatially discrete observations rather than
Fourier modes are available. Under mild conditions on the domain and
scheme of point observations, we obtain bounds on the convergence rate of a
discretized spectral estimator as the mesh size tends to zero (Theorem 4.3).
Under additional assumptions on the underlying geometry, these bounds are
optimal (Theorem 4.7), in the sense that they are consistent with results
from related literature.
Finally, we extend the spectral estimation theory to the case of joint
diffusivity and reaction parameter estimation (Theorem 6.1 and 6.4), with
special emphasis on stochastic activator–inhibitor models. The results are
applied to experimental D. discoideum giant cell observations in Section 6.2,
where we discuss the effective diffusivity of intracellular actin concentration.
This work is based on the papers [PS20], [ACP20], [PFA+21] and addi-
tional new material. It is structured as follows:
Chapter 2 is based on [PS20] and develops the spectral approach for
general semilinear models. In fact, the results from [PS20] are extended
by means of a different approach to higher regularity (as in [ACP20]).
Chapters 3 and 4 are new and not based on previous publications. In
Chapter 3, different noise models for the spectral approach are worked
out, based on SPDE models arising in biophysics literature. Special
emphasis is put on the case of Ornstein–Uhlenbeck noise. Chapter 4
concerns the adaptation of the asymptotic results from the spectral
approach to the case that the solution process $X$ is observed not via
its Fourier modes, but discretized in space.
Chapter 5 is based on [ACP20]. Diffusivity estimation for semilinear
SPDE models is treated from the perspective of the recently introduced
local approach. A crucial tool is higher $L^p$-regularity of the solution
process.
Chapter 6 is based on [PFA+21]. Diffusivity and reaction parameters
are jointly estimated in the scope of the spectral approach, and the
results are applied to simulated and experimental cell data.
A First Example
We outline the general procedure with a simple example. For a Gelfand
triple $V \subset H \subset V^*$, consider the equation
$$\mathrm{d}X_t = \theta A(X_t)\,\mathrm{d}t + B(X_t)\,\mathrm{d}W_t, \quad X_0 \in H, \tag{1.2}$$
with unknown $\theta > 0$, where $A: V \to V^*$ is a possibly nonlinear operator,
$W$ is a cylindrical Wiener process, and $B$ maps $V$ into the space of
Hilbert–Schmidt operators on $H$. This corresponds to $\mathcal{A} = \theta A$ in (1.1). In this
example, $\theta$ should be seen as the overall drift intensity rather than diffusivity.
Assume that (1.2) is well-posed, e.g. under monotonicity and coercivity
assumptions on $A$ and $B$ [LR15]. Now, given a sequence of linear projection
operators $(P_N)_{N \in \mathbb{N}}$ with finite-dimensional range on $H$, the dynamics for
$X^N := P_N X$ is given by
$$\mathrm{d}X_t^N = \theta A_N(X_t)\,\mathrm{d}t + B_N(X_t)\,\mathrm{d}W_t, \tag{1.3}$$
where $A_N(X) := P_N A(X)$ and $B_N(X) := P_N B(X)$. Note that in general
$X^N$ ceases to be Markovian. Assume that $X^N$, $A_N(X)$ and $B_N(X)$
are observed. Let $B_N(X_t) B_N(X_t)^T$ be invertible for $0 \le t \le T$ (interpreted
as an operator acting on the range of $P_N$), and set
$B_N(X_t)^+ := B_N(X_t)^T (B_N(X_t) B_N(X_t)^T)^{-1}$. This is the Moore–Penrose pseudoinverse³
of the operator $B_N(X_t)$. Then a natural estimator for $\theta$ is given by
$$\hat{\theta}_N = \frac{\int_0^T \langle (B_N(X_t) B_N(X_t)^T)^{-1} A_N(X_t), \mathrm{d}X_t^N \rangle}{\int_0^T \| B_N(X_t)^+ A_N(X_t) \|^2 \,\mathrm{d}t}. \tag{1.4}$$
This estimator can be either derived from a maximum likelihood approach
or directly justified by the decomposition
$$\hat{\theta}_N - \theta = \frac{\int_0^T \langle B_N(X_t)^+ A_N(X_t), \mathrm{d}W_t \rangle}{\int_0^T \| B_N(X_t)^+ A_N(X_t) \|^2 \,\mathrm{d}t}. \tag{1.5}$$
³ For operators between Hilbert spaces, the Moore–Penrose pseudoinverse is defined
analogously to the finite-dimensional case, cf. [WD01]. See e.g. [LS01, Chapter 13] for
properties of the pseudoinverse in the finite-dimensional case.
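The decomposition (1.5) can be checked in two lines. The following worked computation is an added sketch, not part of the original argument; it abbreviates $A_N = A_N(X_t)$ and $B_N = B_N(X_t)$ and uses only the projected dynamics (1.3), the estimator (1.4) and the definition of $B_N^+$.

```latex
\begin{align*}
\langle (B_N B_N^T)^{-1} A_N, \mathrm{d}X_t^N \rangle
  &= \theta \, \langle (B_N B_N^T)^{-1} A_N, A_N \rangle \,\mathrm{d}t
     + \langle (B_N B_N^T)^{-1} A_N, B_N \,\mathrm{d}W_t \rangle \\
  &= \theta \, \| B_N^+ A_N \|^2 \,\mathrm{d}t
     + \langle B_N^+ A_N, \mathrm{d}W_t \rangle ,
\end{align*}
% The second line uses B_N^+ = B_N^T (B_N B_N^T)^{-1}, which gives
% \| B_N^+ a \|^2 = \langle (B_N B_N^T)^{-1} a, a \rangle
% for every a in the range of P_N.
```

Integrating over $[0, T]$ and dividing by $\int_0^T \| B_N^+ A_N \|^2 \,\mathrm{d}t$ turns the first term into $\theta$ and the second into the right-hand side of (1.5).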
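Set $I_N := \int_0^T \| B_N(X_t)^+ A_N(X_t) \|^2 \,\mathrm{d}t$. According to Theorem A.1, $\hat{\theta}_N$ is
a consistent estimator as $N \to \infty$ which is asymptotically normal with rate
$(\mathbb{E} I_N)^{1/2}$, i.e.
$$(\mathbb{E} I_N)^{1/2} (\hat{\theta}_N - \theta) \xrightarrow{d} \mathcal{N}(0, 1), \tag{1.6}$$
whenever $I_N^{-1} \xrightarrow{\mathbb{P}} 0$ and $I_N / \mathbb{E} I_N \xrightarrow{\mathbb{P}} 1$ as $N \to \infty$. An explicit expression in $N$
of the convergence rate $(\mathbb{E} I_N)^{1/2}$ will depend on the particular projection
operators $P_N$.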
This discussion outlines the argument for maximum-likelihood based
estimation theory for SPDEs with parametric drift terms. Some comments on
this approach are in order:
(i) Closability of the observation scheme: In practice, it is unlikely that all
three quantities $X^N$, $A_N(X)$ and $B_N(X)$ are observed. Usually there
is just access to $X^N$. This problem can be addressed in different ways:
Certain model assumptions can be imposed, e.g. that $B(X) \equiv B$ is
constant and known. Also, the generating model and the observation
scheme can be aligned in the sense that e.g. $A(X^N) = A_N(X)$, at least
up to negligible terms. This is the basic idea behind the spectral
approach, where $A$ and $P_N$ commute. When considering spatially discrete
observations in Chapter 4, such a commutativity relation does not hold,
and we have to deal with an additional bias.
(ii) Model refinement: In this scenario, $\theta$ represents the overall drift
intensity of the dynamics. However, it is desirable to refine this model,
either by investigating a parameter linked to a specific part of the drift
(e.g. spatial diffusion, as outlined above), or by considering multipara-
metric drift terms based on specific model knowledge. We address both
questions in the sequel, with the main focus on diffusivity estimation.
(iii) Robustness: In the case that (1.2) is misspecified but close to the true
generating dynamics, it is desirable that the asymptotic results transfer.
We will look at robustness of the estimation procedure to misspecifica-
tion in the drift and noise terms.
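To make the estimator (1.4) concrete, the following minimal sketch simulates the simplest diagonal case and evaluates the estimator from $N$ observed modes. Everything specific here is an assumption made for illustration and not taken from the text: $A$ is linear with eigenvalues $\lambda_k = k^2$ of $-A$, the dispersion is constant and diagonal with entries $\sigma \lambda_k^{-\gamma}$, the path is generated by an Euler–Maruyama scheme, and the function name `spectral_mle` and its parameters are hypothetical.

```python
import numpy as np


def spectral_mle(theta=1.0, sigma=1.0, gamma=0.5, N=30, T=1.0, dt=1e-4, seed=0):
    """Estimate theta from N simulated modes of dX = theta*A*X dt + B dW.

    Illustrative assumptions (not from the text): -A has eigenvalues k^2,
    B is diagonal with entries sigma * lambda_k^(-gamma), X_0 = 0, and the
    path is generated by an Euler-Maruyama scheme with step dt.
    """
    rng = np.random.default_rng(seed)
    lam = np.arange(1, N + 1, dtype=float) ** 2   # eigenvalues of -A
    b = sigma * lam ** (-gamma)                   # diagonal entries of B_N
    x = np.zeros(N)                               # modes of X^N at time t
    num = 0.0                                     # numerator of (1.4)
    den = 0.0                                     # denominator of (1.4)
    for _ in range(int(round(T / dt))):
        a_n = -lam * x                            # A_N(X_t) in the eigenbasis
        dw = np.sqrt(dt) * rng.standard_normal(N)
        dx = theta * a_n * dt + b * dw            # increment of X^N, cf. (1.3)
        num += np.sum(a_n / b**2 * dx)            # <(B B^T)^{-1} A_N, dX^N>
        den += np.sum((a_n / b) ** 2) * dt        # |B^+ A_N|^2 dt
        x = x + dx
    return num / den


if __name__ == "__main__":
    print(spectral_mle())
```

Because the increments are used exactly as simulated, the discrete sums satisfy the analogue of the decomposition (1.5), so the estimation error is a martingale term divided by the discretized $I_N$; increasing `N` makes the denominator grow and the estimate concentrate around `theta`.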
Chapter 2
The Spectral Approach
This chapter is an adaptation of the statements and results from [PS20].
The aim of this chapter is to develop an estimation theory for the dif-
fusivity of a semilinear SPDE driven by additive noise within the spectral
approach to statistical inference for SPDEs.
In the spectral approach, a finite number $N$ of Fourier modes of the
solution $X$ to an SPDE is observed, usually continuously in time, and the
asymptotic properties of estimators derived from these observations are
determined as $N$ tends to infinity. This ansatz has been pioneered by [HKR93,
Hue93, HR95], where it has been noted that the coefficient of an unbounded
drift operator of an SPDE can be identified in finite time, in strong
contrast to the finite-dimensional case of a stochastic (ordinary) differential
equation. In [CGH11], statistical inference for the stochastic Navier–Stokes
equations with additive noise in two dimensions has been considered, which
is the first treatment of parameter estimation in finite time for a nonlinear
system within the spectral approach. The results and methods therein have
been extended to general semilinear equations in [PS20], on which the present
chapter is based.
In Section 2.1, we discuss the semilinear SPDE model which will be used
throughout this chapter, together with some auxiliary results. Section 2.2
is concerned with optimal spatial regularity of the solution process X. In
Section 2.3, we discuss the maximum-likelihood approach to diffusivity esti-
mation and prove asymptotic results for the estimators derived from that ap-
proach. Examples including reaction-diffusion equations, the Burgers equa-
tion and equations of Cahn-Hilliard type are treated in Section 2.4. The
impact of a misspecified drift term is analyzed in the same section. A
numerical validation of the theory is given in Section 2.5 for the stochastic
Allen-Cahn equation. Systems consisting of an observed component and an
unobserved component are handled in Section 2.6.
2.1 The Setting
Let $H$ be a separable Hilbert space with scalar product $\langle \cdot, \cdot \rangle$ and $A: D(A) \subset H \to H$
a densely defined, closed operator that is self-adjoint and negative definite
with compact resolvent. For $s \ge 0$, let $H^s := D((-A)^{s/2}) \subset H$ be the domain
of the fractional Laplacian, equipped with the norm $\| \cdot \|_s = \| (-A)^{s/2} \cdot \|_H$. For
$s < 0$, let $H^s$ be the completion of $H$ w.r.t. the norm $\| \cdot \|_s$ given by the same
term. For each $s \ge 0$, $H^s \subset H \subset H^{-s}$ forms a Gelfand triple. The dual
pairing between $H^s$ and $H^{-s}$ is again denoted by $\langle \cdot, \cdot \rangle$. Set $V := H^1$, then
$V^*$ can be identified with $H^{-1}$. For $\theta > 0$, denote by $t \mapsto e^{t\theta A}$, $t > 0$, the
$C_0$-semigroup generated by $\theta A$. Let $(\Phi_k)_{k \in \mathbb{N}}$ be an orthonormal basis of $H$
consisting of eigenfunctions of $-A$, such that the corresponding sequence
of (positive) eigenvalues $(\lambda_k)_{k \in \mathbb{N}}$ is ordered increasingly, taking into account
multiplicities. Let $P_N: H \to H$ be the orthogonal projection onto the span
of the first $N$ eigenfunctions $\Phi_1, \dots, \Phi_N$. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space
and $(\mathcal{F}_t)_{t \ge 0}$ a right-continuous complete filtration, then $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ is
called a stochastic basis.
In this chapter, we consider a semilinear SPDE of the form
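$$\mathrm{d}X_t = \theta A X_t \,\mathrm{d}t + F(X)(t)\,\mathrm{d}t + B\,\mathrm{d}W_t \tag{2.1}$$
together with initial condition $X_0 \in H$, where $W$ is a cylindrical Wiener
process on $H$, $B = \sigma(-A)^{-\gamma}$ for some $\sigma, \gamma > 0$ is assumed to be of
Hilbert–Schmidt type, $F: C(0, T; H) \supset D(F) \to L^1(0, T; H)$ is a nonlinear operator
and $\theta > 0$ is an unknown parameter.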
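A pair of adapted processes $(X, W)$, defined on some stochastic basis
$(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$, with $X \in D(F) \subset C(0, T; H)$ a.s. and $W$ a cylindrical
Wiener process, is a solution to (2.1) in the analytically and probabilistically
weak sense if for all $v \in D(A) = H^2$ a.s.
$$\langle v, X_t \rangle = \langle v, X_0 \rangle + \theta \int_0^t \langle Av, X_r \rangle \,\mathrm{d}r + \int_0^t \langle v, F(X)(r) \rangle \,\mathrm{d}r + \langle v, B W_t \rangle.$$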
We always assume the following:
(W) There is a solution in the analytically and probabilistically weak sense
to (2.1) in $C(0, T; H)$, which is unique in the sense of probability law.
By means of the Yamada-Watanabe theorem (see e.g. [LR15, Theorem
E.0.8]), pathwise uniqueness implies uniqueness in the sense of probability
law. In most examples, $F$ will be of the form $F(X)(t) = F(X_t)$, i.e. (by
abuse of notation) $F: H \supset D(F) \to H$, such that $F$ extends to an operator
$V \to V^*$. In this case, well-posedness of (2.1) can be handled in the context
of the variational approach [LR15], cf. Section 2.4.¹
Condition (W) alone imposes very little spatial regularity and should be
considered as a minimal requirement that serves as a baseline for stronger
regularity properties of $X$. In fact, a detailed analysis of higher regularity
for $X$ will be crucial for our statistical analysis, cf. Section 2.2. There, we
need the representation of $X$ as a mild solution:
$$X_t = e^{t\theta A} X_0 + \int_0^t e^{(t-r)\theta A} F(X)(r)\,\mathrm{d}r + \int_0^t e^{(t-r)\theta A} B\,\mathrm{d}W_r,$$
where the first integral is understood in the Bochner sense, and the second
integral is a stochastic convolution.
Remark 2.1. In general, analytically weak and mild solutions are not
equivalent. However, if a.s. $X, F(X) \in L^1(0, T; H)$, it can be shown that
analytically weak solutions are also mild solutions, see Proposition G.0.5 (i) and
Remark G.0.6 in [LR15]. These conditions are satisfied in our setting.
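For two sequences of positive numbers $(a_N)_{N \in \mathbb{N}}$ and $(b_N)_{N \in \mathbb{N}}$, we write
$a_N \sim b_N$ if $a_N / b_N \to 1$ and $a_N \asymp b_N$ if $a_N / b_N \to C$ for some $C > 0$. Further
we write $a_N \lesssim b_N$ if $a_N \le C b_N$ for some $C > 0$, $a_N \prec b_N$ if $a_N / b_N \to 0$ and
$a_N \prec_p b_N$ if there is $\epsilon > 0$ such that $N^{\epsilon} a_N / b_N \to 0$. If the sequences are
random, then these limits are meant in the almost sure sense, unless stated
otherwise.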
We always assume that there are $\Lambda, \beta > 0$ such that the sequence of
eigenvalues of $-A$ has polynomial growth:
$$\lambda_k \sim \Lambda k^{\beta} \tag{2.2}$$
¹ There is a vast literature on well-posedness and regularity for SPDEs, see [DPZ14,
LR15] and references therein. In [Pes95], existence and uniqueness of semilinear equations
on Banach spaces is considered. Stochastic reaction–diffusion equations are studied in
detail in [Cer01]. See [Kry96, vNVW12] for a treatment of maximal $L^p$-regularity.
for $k \to \infty$. Note that with this notation, the condition that $B$ is of
Hilbert–Schmidt type is equivalent to
$$\gamma > \frac{1}{2\beta}. \tag{2.3}$$
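The equivalence behind (2.3) is a one-line series comparison. The following worked computation is an added sketch; it assumes the reading $B = \sigma(-A)^{-\gamma}$ of the dispersion operator, so that $B\Phi_k = \sigma \lambda_k^{-\gamma} \Phi_k$, together with the eigenvalue growth (2.2).

```latex
\begin{align*}
\| B \|_{\mathrm{HS}}^2
  = \sum_{k \ge 1} \| B \Phi_k \|_H^2
  = \sigma^2 \sum_{k \ge 1} \lambda_k^{-2\gamma},
\qquad
\lambda_k^{-2\gamma} \,\big/\, \bigl( \Lambda^{-2\gamma} k^{-2\beta\gamma} \bigr) \to 1
\quad (k \to \infty).
\end{align*}
% By limit comparison with \sum_k k^{-2\beta\gamma}, the series is finite
% if and only if 2\beta\gamma > 1, i.e. \gamma > 1/(2\beta).
```

For instance, with $\beta = 2$ (the growth of the Dirichlet Laplacian eigenvalues on an interval) the condition reads $\gamma > 1/4$.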
We close this section by stating some auxiliary results.
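Lemma 2.2. For $s_1 \le s_2$ and $X \in H$:
(i) $\| P_N X \|_{s_2}^2 \le \lambda_N^{s_2 - s_1} \| P_N X \|_{s_1}^2$.
(ii) $\| (I - P_N) X \|_{s_1}^2 \le \lambda_{N+1}^{s_1 - s_2} \| (I - P_N) X \|_{s_2}^2$.
(iii) If $X \in H^{s_1}$, then $\| e^{r\theta A} X \|_{s_2}^2 \le C_{s_2 - s_1} r^{-(s_2 - s_1)} \| X \|_{s_1}^2$ for some $C_{s_2 - s_1} > 0$
and all $r > 0$.
Proof. All properties are clear from the spectral decomposition
$\| Z \|_s^2 = \sum_{k=1}^{\infty} \lambda_k^s \langle Z, \Phi_k \rangle^2$, $s \in \mathbb{R}$, $Z \in H$. In (iii), we can choose the constant
$C_{s_2 - s_1} = \theta^{-(s_2 - s_1)} \sup_{y > 0} e^{-2y} y^{s_2 - s_1}$.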
The statements (i) and (ii) are bounds of Bernstein and Jackson type,
respectively (cf. [Sha71, BL76]). Statement (iii) is a smoothing property of
the semigroup, see [Paz83, Lun95].
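The three bounds of Lemma 2.2 can be checked numerically on a truncated eigenbasis. The following is a minimal sketch, not part of the original argument: the eigenvalues $\lambda_k = k^2$, the truncation level `K`, the coefficients `z` and the values of `theta` and `r` are illustrative assumptions, and the constant in (iii) is the supremum evaluated in closed form.

```python
import numpy as np

def h_norm_sq(coeffs, lam, s):
    """Squared H_s norm: sum_k lam_k**s * <Z, Phi_k>**2."""
    return float(np.sum(lam ** s * coeffs ** 2))

rng = np.random.default_rng(1)
K, N = 200, 50                      # truncation level and projection index (illustrative)
lam = np.arange(1, K + 1) ** 2.0    # hypothetical eigenvalues lam_k = k**2
z = rng.standard_normal(K) / lam    # coefficients of some element Z
s1, s2 = 0.5, 1.5

low = np.where(np.arange(K) < N, z, 0.0)   # P_N Z
high = z - low                             # (I - P_N) Z

# (i) Bernstein-type bound on the low modes, lam[N - 1] is lambda_N
bernstein_ok = h_norm_sq(low, lam, s2) <= lam[N - 1] ** (s2 - s1) * h_norm_sq(low, lam, s1)
# (ii) Jackson-type bound on the high modes, lam[N] is lambda_{N+1}
jackson_ok = h_norm_sq(high, lam, s1) <= lam[N] ** (s1 - s2) * h_norm_sq(high, lam, s2)

# (iii) smoothing: sup_{y>0} exp(-2y) * y**a is attained at y = a/2
theta, r = 0.7, 0.01
a = s2 - s1
C = theta ** (-a) * np.exp(-a) * (a / 2.0) ** a
smoothed = np.exp(-r * theta * lam) * z    # coefficients of exp(r*theta*A) Z
smoothing_ok = h_norm_sq(smoothed, lam, s2) <= C * r ** (-a) * h_norm_sq(z, lam, s1)
```

All three flags evaluate to true for any coefficient vector, since each bound holds mode by mode.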
2.2 Spatial Regularity
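In this section, we describe the precise regularity of $X$, and in particular, the
excess regularity of its nonlinear part. In order to do so, we apply a classical
splitting argument and write $X = \bar X + \tilde X$, where $\bar X$ solves
\[
\mathrm{d}\bar X_t = \theta A \bar X_t\,\mathrm{d}t + B\,\mathrm{d}W_t \tag{2.4}
\]
with initial condition $\bar X_0 = 0$, and $\tilde X$ solves the random PDE
\[
\mathrm{d}\tilde X_t = \big(\theta A \tilde X_t + F(\bar X + \tilde X)(t)\big)\,\mathrm{d}t, \quad \tilde X_0 = X_0, \tag{2.5}
\]
which reads as
\[
\tilde X_t = e^{t\theta A} X_0 + \int_0^t e^{(t-r)\theta A} F(\bar X + \tilde X)(r)\,\mathrm{d}r \tag{2.6}
\]
in the mild formulation.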
We infer higher regularity for $\tilde X$ by means of the conditions $(F_{s,\eta})$ and
$(F^v_{s,\eta})$ below, which rely on the representation of $\tilde X$ as a mild or weak solution,
respectively. The weak solution approach has been used in [CGH11, PS20] for
the spectral observation scheme, whereas the mild representation has been
applied in [ACP20] in the context of the local observation scheme, cf. Chapter
5. As the mild approach yields larger excess regularity in our context, we
focus mainly on $(F_{s,\eta})$. However, the weak approach using $(F^v_{s,\eta})$ will be
crucial in the examples in order to reach the level of regularity where the
mild approach can be applied.
Here and in the sequel, we write $X^N := P_N X$ as well as $\bar X^N := P_N \bar X$ and
$\tilde X^N := P_N \tilde X$. These projected processes satisfy
\[
\mathrm{d}\bar X^N_t = \theta A \bar X^N_t\,\mathrm{d}t + B\,\mathrm{d}W^N_t, \quad \bar X^N_0 = 0,
\]
\[
\mathrm{d}\tilde X^N_t = \theta A \tilde X^N_t\,\mathrm{d}t + P_N F(X)(t)\,\mathrm{d}t, \quad \tilde X^N_0 = X^N_0,
\]
where $W^N := P_N W$. In this section, we do not need the precise form (2.4)
for the dynamics of $\bar X$, but only its spatial regularity. In this sense, the
conclusions remain valid for Chapters 3–6, where the assumptions on the
noise term are changed.
In order to quantify the regularity of $X$, we need the following spaces:
\[
R(s) := L^{\infty}(0,T;H_s), \tag{2.7}
\]
\[
R_E(s) := \bigcap_{p \ge 1} L^p(\Omega, L^{\infty}(0,T;H_s)), \tag{2.8}
\]
i.e. $R(s)$ is a normed space with norm $\|X\|_{R(s)} = \sup_{0 \le t \le T} \|X_t\|_s$, and $R_E(s)$
is a locally convex space with $X \in R_E(s)$ if and only if for all $p \ge 1$:
\[
\mathbb{E}\Big[\sup_{0 \le t \le T} \|X_t\|_s^p\Big] < \infty. \tag{2.9}
\]
While $X \in R(s)$ a.s. suffices for the purpose of diffusivity estimation, in
many examples, the stronger statement $X \in R_E(s)$ can be shown.
In order to conduct our regularity analysis, we need that for some $\eta > 0$
and $s \in \mathbb{R}$, the regularity of $F(X)$ differs from that of $X$ by $2-\eta$ derivatives,
in the following sense:
$(F_{s,\eta})$ There is $\epsilon > 0$ and a monotone, locally bounded function $g: [0,\infty) \to [0,\infty)$
such that for all $X \in R(s)$:
\[
\|F(X)\|_{R(s+\eta+\epsilon-2)} \le g\big(\|X\|_{R(s)}\big). \tag{2.10}
\]
In general, the function $g$ has polynomial growth, cf. Section 2.4. If
$F(X)(t) = F(X_t)$, i.e. $F$ acts on every point in time separately, then it is
sufficient for (2.10) that for each $0 \le t \le T$ and $X_t \in H_s$:
\[
\|F(X_t)\|_{s+\eta+\epsilon-2} \le g\big(\|X_t\|_s\big). \tag{2.11}
\]
The proof of the next proposition is inspired by similar estimates in
[DPDT94].
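Proposition 2.3. Let $s \in \mathbb{R}$, $\eta > 0$. Assume that $(F_{s,\eta})$ is true.
(i) If $\bar X, \tilde X \in R(s)$ and $X_0 \in H_{s+\eta}$, then $\tilde X \in R(s+\eta)$ a.s.
(ii) If $\bar X, \tilde X \in R_E(s)$ and $X_0 \in L^p(\Omega, H_{s+\eta})$ for any $p \ge 1$ and if the
function $g$ from $(F_{s,\eta})$ is of the form $g(x) = C(1+x)^b$ for some $C, b \ge 0$,
then $\tilde X \in R_E(s+\eta)$.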
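Proof.
(i) We have by Lemma 2.2 (iii):
\begin{align*}
\|\tilde X^N_t\|_{s+\eta}
&\le \|e^{t\theta A} X^N_0\|_{s+\eta} + \int_0^t \big\|e^{(t-r)\theta A} P_N F(\bar X + \tilde X)(r)\big\|_{s+\eta}\,\mathrm{d}r \\
&\lesssim \|X^N_0\|_{s+\eta} + \int_0^t (t-r)^{-1+\epsilon/2} \big\|P_N F(\bar X + \tilde X)(r)\big\|_{s+\eta+\epsilon-2}\,\mathrm{d}r \\
&\le \|X_0\|_{s+\eta} + \big\|F(\bar X + \tilde X)\big\|_{R(s+\eta+\epsilon-2)} \int_0^t (t-r)^{-1+\epsilon/2}\,\mathrm{d}r \\
&\le \|X_0\|_{s+\eta} + g\big(\|\bar X + \tilde X\|_{R(s)}\big)\, \frac{2}{\epsilon}\, t^{\epsilon/2},
\end{align*}
thus, uniformly in $N \in \mathbb{N}$,
\[
\sup_{0 \le t \le T} \|\tilde X^N_t\|_{s+\eta} \lesssim \|X_0\|_{s+\eta} + \frac{2}{\epsilon}\, T^{\epsilon/2}\, g\big(\|\bar X\|_{R(s)} + \|\tilde X\|_{R(s)}\big), \tag{2.12}
\]
and the right-hand side is finite by assumption.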
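(ii) By (2.12), for any $p \ge 1$,
\[
\mathbb{E}\Big[\sup_{0 \le t \le T} \|\tilde X^N_t\|^p_{s+\eta}\Big] \lesssim \mathbb{E}\|X_0\|^p_{s+\eta} + \mathbb{E}\Big[\big(1 + \|\bar X\|_{R(s)} + \|\tilde X\|_{R(s)}\big)^{pb}\Big]
\]
and further,
\[
\mathbb{E}\Big[\big(1 + \|\bar X\|_{R(s)} + \|\tilde X\|_{R(s)}\big)^{pb}\Big] \lesssim 1 + \mathbb{E}\Big[\|\bar X\|^{pb}_{R(s)} + \|\tilde X\|^{pb}_{R(s)}\Big],
\]
which is finite by assumption. This proves the claim.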
We say that $s^* \in \mathbb{R}$ is the optimal regularity for $\bar X$ if a.s. $\bar X \in R(s)$ for
all $s < s^*$ and $\bar X \notin R(s)$ for all $s > s^*$.
Proposition 2.4. Let $\eta > 0$, let $s^*$ be the optimal regularity of $\bar X$, let $s_0 < s^*$
such that $(F_{s,\eta})$ is true for each $s_0 \le s < s^*$.
(i) If a.s. $X \in R(s_0)$ and $X_0 \in H_{s^*+\eta}$, then a.s. $X \in R(s)$ for $s < s^*$ as
well as $X \notin R(s)$ for $s > s^*$, and further a.s. $\tilde X \in R(s+\eta)$ for $s < s^*$.
(ii) If $X \in R_E(s_0)$, $X_0 \in L^p(\Omega, H_{s^*+\eta})$ for $p \ge 1$ and $\bar X \in R_E(s)$ for $s < s^*$,
and if the function $g$ from $(F_{s,\eta})$ is of the form $g(x) = C(1+x)^b$ for
some $C, b > 0$, then $X \in R_E(s)$ for $s < s^*$, $X \notin R_E(s)$ for $s > s^*$, and
$\tilde X \in R_E(s+\eta)$ for $s < s^*$.
Proof. For (i), note that the statements $X \in R(s)$, $\tilde X \in R(s+\eta)$ for $s < s^*$
follow inductively from Proposition 2.3. Further, if $X \in R(s)$ for some $s > s^*$
with positive probability, then $\bar X = X - \tilde X \in R(s \wedge (s^*+\eta/2))$ with positive
probability, in contradiction to the optimality of $s^*$. The reasoning for (ii) is
similar.
If it is possible to set $s_0 = 0$ in Proposition 2.4, standard existence results
for SPDEs can be used as a starting point for inferring higher regularity.
In contrast, if $(F_{s,\eta})$ does not hold for $s = 0$, we have to prove first
that $X \in R(s_0)$ for some $s_0 > 0$. This can be achieved by modifying the
regularity induction. Typically, the variational approach for SPDEs yields
well-posedness of the paths of $X$ in spaces of the form
\[
R_v(s) := L^{\infty}(0,T;H_{s-1}) \cap L^2(0,T;H_s). \tag{2.13}
\]
It is still possible to find a condition on $F$ which allows for an analogue of
Proposition 2.3. Here, we restrict to the case $F(X)(t) = F(X_t)$.
$(F^v_{s,\eta})$ There is a locally bounded $g: [0,\infty) \to [0,\infty)$ such that for $X \in H_s$:
\[
\|F(X)\|^2_{s+\eta-2} \le \big(1 + \|X\|^2_s\big)\, g\big(\|X\|_{s-1}\big). \tag{2.14}
\]
The next result extends a similar argument for the stochastic Navier–Stokes
equations from [CGH11].
Proposition 2.5. Let $s \in \mathbb{R}$, $\eta > 0$ such that $(F^v_{s,\eta})$ holds true and a.s.
$X_0 \in H_{s+\eta-1}$. If a.s. $\bar X, \tilde X \in R_v(s)$, then $\tilde X \in R_v(s+\eta)$ a.s.
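Proof. With $P_N H \simeq \mathbb{R}^N$, $\tilde X^N = P_N \tilde X$ is a process in $C^1(0,T;\mathbb{R}^N)$. The
chain rule gives for $0 \le t \le T$:
\[
\|\tilde X^N_t\|^2_{s+\eta-1} = \|P_N X_0\|^2_{s+\eta-1} + 2 \int_0^t \big\langle \tilde X^N_r, \theta A \tilde X^N_r + P_N F(X_r) \big\rangle_{s+\eta-1}\,\mathrm{d}r,
\]
and consequently,
\[
\sup_{0 \le t \le T} \|\tilde X^N_t\|^2_{s+\eta-1} + 2\theta \int_0^T \|\tilde X^N_t\|^2_{s+\eta}\,\mathrm{d}t
\lesssim \|X_0\|^2_{s+\eta-1} + 2 \int_0^T \big\langle \tilde X^N_t, P_N F(X_t) \big\rangle_{s+\eta-1}\,\mathrm{d}t.
\]
The last term can be estimated as
\begin{align*}
2 \int_0^T \big\langle \tilde X^N_t, P_N F(X_t) \big\rangle_{s+\eta-1}\,\mathrm{d}t
&\le 2 \int_0^T \|\tilde X^N_t\|_{s+\eta}\, \|P_N F(X_t)\|_{s+\eta-2}\,\mathrm{d}t \\
&\lesssim \theta \int_0^T \|\tilde X^N_t\|^2_{s+\eta}\,\mathrm{d}t + \frac{1}{2\theta} \int_0^T \|F(X_t)\|^2_{s+\eta-2}\,\mathrm{d}t.
\end{align*}
Finally, using $(F^v_{s,\eta})$ and $X = \bar X + \tilde X \in R_v(s)$,
\begin{align*}
\sup_{0 \le t \le T} \|\tilde X^N_t\|^2_{s+\eta-1} + \theta \int_0^T \|\tilde X^N_t\|^2_{s+\eta}\,\mathrm{d}t
&\lesssim \|X_0\|^2_{s+\eta-1} + \frac{1}{2\theta} \int_0^T \|F(X_t)\|^2_{s+\eta-2}\,\mathrm{d}t \\
&\le \|X_0\|^2_{s+\eta-1} + \frac{1}{2\theta} \sup_{0 \le t \le T} g\big(\|X_t\|_{s-1}\big) \int_0^T \big(1 + \|X_t\|^2_s\big)\,\mathrm{d}t < \infty.
\end{align*}
Thus $(\tilde X^N)_{N \in \mathbb{N}}$ is uniformly bounded in $R_v(s+\eta)$, and the claim follows.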
Analogously to Proposition 2.4, given $\eta > 0$ and $s_0 < s^*$ such that $s^*$
denotes the optimal regularity of $\bar X$ and a.s. $X \in R_v(s_0)$ and $X_0 \in H_{s^*+\eta-1}$,
if $(F^v_{s,\eta})$ holds for all $s_0 \le s < s^*$, then $X \in R_v(s)$ and $\tilde X \in R_v(s+\eta)$ for all
$s < s^*$. In particular, $X \in R(s-1)$ for all $s < s^*$. The latter statement can
be used as a starting point for Proposition 2.4.
Finally, we will need a pathwise regularity statement for stochastic integrals.
If a.s. $U \in R(s)$, then $t \mapsto \langle (-A)^{s/2} U_t, \cdot \rangle$ has values in the space of
Hilbert–Schmidt operators from $H$ to $\mathbb{R}$, and $\int_0^T \langle (-A)^{s/2} U_t, \mathrm{d}W_t \rangle$ is
well-defined [DPZ14]. Provided that $U \in L^2(\Omega \times [0,T]; H_s)$, Itô's isometry implies
that this integral is approximated in $L^2(\Omega; \mathbb{R})$ by $\int_0^T \langle (-A)^{s/2} U^N_t, \mathrm{d}W_t \rangle$ as
$N \to \infty$, where $U^N = P_N U$. By a classical stopping argument, we have even
almost sure convergence, together with a quantification of the divergence rate
of the rescaled approximants (cf. Lemma 2.2 (i)), in the following sense:
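Lemma 2.6. Let $s, s' \in \mathbb{R}$ with $s < s'$. For every process $U$ with a.s.
$U \in R(s')$, it holds that a.s.
\[
\lim_{N \to \infty} \int_0^T \big\langle (-A)^{s/2} U^N_t, \mathrm{d}W_t \big\rangle = \int_0^T \big\langle (-A)^{s/2} U_t, \mathrm{d}W_t \big\rangle, \tag{2.15}
\]
and for every $a > 0$, a.s.
\[
\lim_{N \to \infty} \lambda_N^{-a/2} \int_0^T \big\langle (-A)^{(s+a)/2} U^N_t, \mathrm{d}W_t \big\rangle = 0. \tag{2.16}
\]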
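Proof. For $K \in \mathbb{N}$, let $\tau_K := \inf\{0 \le t \le T;\ \sup_{0 \le r \le t} \|U_r\|^2_{s'} \ge K\} \wedge T$. We
abbreviate $Z^N_s(t) := \int_0^t \langle (-A)^{s/2} U^N_r, \mathrm{d}W_r \rangle$ and $Z_s(t) := \int_0^t \langle (-A)^{s/2} U_r, \mathrm{d}W_r \rangle$.
Let $\epsilon > 0$ and $p \ge 4/(\beta(s'-s))$. The Burkholder–Davis–Gundy inequality
and Lemma 2.2 (ii) give
\begin{align*}
\mathbb{P}\big(|Z_s(\tau_K) - Z^N_s(\tau_K)| > \epsilon\big)
&\lesssim \epsilon^{-p}\, \mathbb{E}\Big[\Big(\int_0^{\tau_K} \|(I-P_N)U_t\|^2_s\,\mathrm{d}t\Big)^{p/2}\Big] \\
&\le \epsilon^{-p} \lambda_{N+1}^{-p(s'-s)/2}\, \mathbb{E}\Big[\Big(\int_0^{\tau_K} \|U_t\|^2_{s'}\,\mathrm{d}t\Big)^{p/2}\Big] \\
&\le \epsilon^{-p} (KT)^{p/2} \lambda_{N+1}^{-p(s'-s)/2} \lesssim N^{-2}.
\end{align*}
The Borel–Cantelli lemma implies that $Z^N_s(\tau_K) \to Z_s(\tau_K)$ a.s. Similarly,
using Lemma 2.2 (i),
\begin{align*}
\mathbb{P}\big(\lambda_N^{-a/2} |Z^N_{s+a}(\tau_K)| > \epsilon\big)
&\lesssim \epsilon^{-p} \lambda_N^{-ap/2}\, \mathbb{E}\Big[\Big(\int_0^{\tau_K} \|U^N_t\|^2_{s+a}\,\mathrm{d}t\Big)^{p/2}\Big] \\
&\le \epsilon^{-p} \lambda_N^{-p(s'-s)/2}\, \mathbb{E}\Big[\Big(\int_0^{\tau_K} \|U^N_t\|^2_{s'}\,\mathrm{d}t\Big)^{p/2}\Big] \\
&\le \epsilon^{-p} (KT)^{p/2} \lambda_N^{-p(s'-s)/2} \lesssim N^{-2},
\end{align*}
where we w.l.o.g. assume that $s + a > s'$ (otherwise take $s'$ to be smaller).
Again by the Borel–Cantelli lemma, $\lambda_N^{-a/2} Z^N_{s+a}(\tau_K) \to 0$ a.s.
Consequently, (2.15), (2.16) are true on the set $A_K := \{\tau_K = T\}$. The
claim follows as $\bigcup_{K \in \mathbb{N}} A_K$ has probability one.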
2.3 Diffusivity Estimation
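For $k \in \mathbb{N}$, we set $\bar x^{(k)} := \langle \bar X, \Phi_k \rangle$. Then $(\bar x^{(k)})_{k \in \mathbb{N}}$ are independent
one-dimensional Ornstein–Uhlenbeck processes that solve
\[
\mathrm{d}\bar x^{(k)}_t = -\theta \lambda_k \bar x^{(k)}_t\,\mathrm{d}t + \sigma \lambda_k^{-\gamma}\,\mathrm{d}W^{(k)}_t, \tag{2.17}
\]
$\bar x^{(k)}_0 = 0$, where $(W^{(k)})_{k \in \mathbb{N}}$ are independent Brownian motions. $\bar x^{(k)}$ has the
explicit representation
\[
\bar x^{(k)}_t = \sigma \lambda_k^{-\gamma} \int_0^t e^{-\theta \lambda_k (t-r)}\,\mathrm{d}W^{(k)}_r, \tag{2.18}
\]
and consequently,
\[
\mathbb{E}\big[(\bar x^{(k)}_t)^2\big] = \frac{\sigma^2}{2\theta} \big(1 - e^{-2\theta \lambda_k t}\big) \lambda_k^{-2\gamma-1}. \tag{2.19}
\]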
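Lemma 2.7. For any $s < s^* := 1 + 2\gamma - \beta^{-1}$, it holds a.s. $\bar X \in R(s)$.
Proof. For $0 < \alpha < 1/2 \wedge (s^*-s)/2$,
\begin{align*}
\int_0^T t^{-2\alpha} \big\|(-A)^{s/2} e^{t\theta A} B\big\|^2_{HS}\,\mathrm{d}t
&= \sigma^2 \sum_{k=1}^{\infty} \lambda_k^{s-2\gamma} \int_0^T t^{-2\alpha} e^{-2\theta \lambda_k t}\,\mathrm{d}t \\
&\le \sigma^2 \sum_{k=1}^{\infty} \lambda_k^{s-2\gamma} \int_0^{\infty} \Big(\frac{r}{2\theta\lambda_k}\Big)^{-2\alpha} e^{-r}\,\frac{\mathrm{d}r}{2\theta\lambda_k} \\
&\lesssim \Gamma(1-2\alpha) \sum_{k=1}^{\infty} \lambda_k^{s-2\gamma-1+2\alpha}
\lesssim \sum_{k=1}^{\infty} k^{\beta(s-2\gamma-1+2\alpha)},
\end{align*}
where $\Gamma$ denotes the Gamma function. The last sum is finite since
$\beta(s-2\gamma-1+2\alpha) < -1$ for $\alpha < (s^*-s)/2$. Now, by [DPZ14, Theorem 5.11],
$(-A)^{s/2} \bar X \in R(0)$, i.e. $\bar X \in R(s)$ a.s.
In fact, the proof of [DPZ14, Theorem 5.11] shows that even $(-A)^{s/2} \bar X \in R_E(0)$
in the situation of the proof of Lemma 2.7, thus $\bar X \in R_E(s)$ for $s < s^*$.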
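Proposition 2.8. With $s^* = 1 + 2\gamma - \beta^{-1}$, let $\eta > 0$, $s_0 < s^*$ such that $(F_{s,\eta})$
holds for any $s_0 \le s < s^*$. Assume that a.s. $X \in R(s_0)$, $X_0 \in H_{s^*+\eta}$. Then
a.s. for any $s > s^*$:
\[
\int_0^T \big\|(-A)^{s/2} X^N_t\big\|^2\,\mathrm{d}t \sim C_s N^{1+\beta(s-2\gamma-1)} \tag{2.20}
\]
with
\[
C_s = \frac{\sigma^2 T \Lambda^{s-2\gamma-1}}{2\theta\,(1+\beta(s-2\gamma-1))}. \tag{2.21}
\]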
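Proof. Integrating (2.19), we see that
\[
\mathbb{E}\int_0^T (\bar x^{(k)}_t)^2\,\mathrm{d}t \sim \frac{\sigma^2 T}{2\theta}\, \lambda_k^{-2\gamma-1}, \tag{2.22}
\]
thus, using $\lambda_k \sim \Lambda k^{\beta}$,
\begin{align*}
\mathbb{E}\int_0^T \big\|(-A)^{s/2} \bar X^N_t\big\|^2\,\mathrm{d}t
&= \sum_{k=1}^{N} \lambda_k^{s}\, \mathbb{E}\int_0^T (\bar x^{(k)}_t)^2\,\mathrm{d}t
\sim \frac{\sigma^2 T}{2\theta} \sum_{k=1}^{N} \lambda_k^{s-2\gamma-1} \\
&\sim \frac{\sigma^2 T \Lambda^{s-2\gamma-1}}{2\theta\,(1+\beta(s-2\gamma-1))}\, N^{1+\beta(s-2\gamma-1)}.
\end{align*}
Lemma A.2 (ii), with $X_k(t) = \lambda_k^{s/2} \bar x^{(k)}_t$ in the notation therein, immediately
gives that (2.20) is true for $X^N$ replaced by $\bar X^N$. Now, for any $0 < \epsilon < \eta$, by
Proposition 2.4:
\[
\int_0^T \big\|(-A)^{s/2} \tilde X^N_t\big\|^2\,\mathrm{d}t \le \lambda_N^{s-s^*-\eta+\epsilon} \int_0^T \|\tilde X^N_t\|^2_{s^*+\eta-\epsilon}\,\mathrm{d}t \lesssim N^{1+\beta(s-2\gamma-1-\eta+\epsilon)}.
\]
This is negligible compared to the right-hand side of (2.20), and the claim
follows from expanding the square on the left-hand side of (2.20) together
with the Cauchy–Schwarz inequality for the mixed term.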
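The constant in (2.21) comes from comparing a power sum with an integral. As a quick sanity check, the following sketch compares the normalized partial sum with the claimed limit. It assumes exactly polynomial eigenvalues $\lambda_k = \Lambda k^{\beta}$ (an idealization of (2.2)) and drops the common prefactor $\sigma^2 T/(2\theta)$; the function names are illustrative.

```python
import numpy as np

def limit_constant(s, gamma, beta, Lambda):
    """Lambda**e / (1 + beta*e) with e = s - 2*gamma - 1, i.e. C_s without sigma^2*T/(2*theta)."""
    e = s - 2.0 * gamma - 1.0
    return Lambda ** e / (1.0 + beta * e)

def normalized_partial_sum(N, s, gamma, beta, Lambda):
    """sum_{k<=N} lam_k**e divided by N**(1 + beta*e), for lam_k = Lambda * k**beta."""
    e = s - 2.0 * gamma - 1.0
    k = np.arange(1, N + 1, dtype=float)
    lam = Lambda * k ** beta
    return float(np.sum(lam ** e) / N ** (1.0 + beta * e))
```

For instance, with `s = 2 + 2 * gamma` the exponent is `e = 1`, and for `beta = 2` the partial sum reduces to a sum of squares, whose ratio to `N**3` tends to `Lambda / 3`.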
Remark 2.9. In the setting of the previous proposition, it is immediate that
for the limit case $s = s^*$, we have a.s.
\[
\int_0^T \big\|(-A)^{s^*/2} X^N_t\big\|^2\,\mathrm{d}t \sim \frac{\sigma^2 T}{2\theta}\, \Lambda^{-\beta^{-1}} \ln(N), \tag{2.23}
\]
with obvious changes in the proof. In particular, as the right-hand side diverges,
$X \notin R(s^*)$, and the regularity from Lemma 2.7 is optimal.
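Next, we derive three maximum–likelihood type estimators for $\theta$ (cf.
[CGH11]). The projected process $X^N = P_N X$ induces a measure $\mathbb{P}^{N,T}_{\theta}$ on
the path space $C(0,T;P_N H) \simeq C(0,T;\mathbb{R}^N)$ for each value of the diffusivity
$\theta > 0$. If we fix an arbitrary reference parameter $\theta_0 > 0$ and assume that
each of the measures $(\mathbb{P}^{N,T}_{\theta})_{\theta>0}$ is absolutely continuous with respect to $\mathbb{P}^{N,T}_{\theta_0}$,
we obtain a likelihood that we can use for statistical inference. According to
[LS77, Section 7.6.4], the log–likelihood is formally given by
\begin{align*}
\ln \frac{\mathrm{d}\mathbb{P}^{N,T}_{\theta}}{\mathrm{d}\mathbb{P}^{N,T}_{\theta_0}}(X^N)
&= \frac{1}{\sigma^2} \int_0^T \big\langle (\theta-\theta_0) A X^N_t, (-A)^{2\gamma}\,\mathrm{d}X^N_t \big\rangle \\
&\quad - \frac{1}{2\sigma^2} \int_0^T \big\langle (\theta-\theta_0) A X^N_t, (-A)^{2\gamma} \big[(\theta+\theta_0) A X^N_t + 2 P_N F(X)(t)\big] \big\rangle\,\mathrm{d}t.
\end{align*}
This is rigorous if $P_N F = F P_N$, otherwise it should be considered as a natural
(but heuristic) approach. Maximizing for $\theta$ yields the following maximum
likelihood–type estimator: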
θfull
N:= RT
0(A)1+2αXN
t,dXN
t
RT
0(A)1+αXN
t2dt+RT
0(A)1+2αXN
t, PNF(X)(t)dt
RT
0(A)1+αXN
t2dt,
(2.24)
27
where we substituted γby an additional parameter α. This estimator de-
pends on PNF(X)and is therefore not closed in XN. It can be modified as
follows:
ˆ
θpart
N:= RT
0(A)1+2αXN
t,dXN
t
RT
0(A)1+αXN
t2dt+RT
0(A)1+2αXN
t, PNF(XN)(t)dt
RT
0(A)1+αXN
t2dt.
(2.25)
Finally, the nonlinear term can be left out completely:
ˆ
θlin
N:= RT
0(A)1+2αXN
t,dXN
t
RT
0(A)1+αXN
t2dt.(2.26)
Note that the stochastic integral appearing in the numerator of each of
the estimators has a robust representation
\begin{align*}
\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, \mathrm{d}X^N_t \big\rangle
&= \sum_{k=1}^{N} \lambda_k^{1+2\alpha} \int_0^T x^{(k)}_t\,\mathrm{d}x^{(k)}_t \\
&= \frac{1}{2} \sum_{k=1}^{N} \lambda_k^{1+2\alpha} \Big( (x^{(k)}_T)^2 - (x^{(k)}_0)^2 - \sigma^2 \lambda_k^{-2\gamma} T \Big),
\end{align*}
so it is a function of a single trajectory of $X^N$ alone.
The aim of this section is to study the asymptotic properties of these
estimators as $N \to \infty$. Note that one cannot directly apply the general
theory for maximum likelihood estimation, as exposed e.g. in [IH81], because
for $P_N F \ne F P_N$, none of these estimators is the true MLE.
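To make the estimators concrete, here is a minimal simulation sketch for the purely linear case $F = 0$, in which all three estimators coincide with (2.26). It is an illustration under stated assumptions and not the numerical scheme used elsewhere in this work: the eigenvalues are taken to be $\lambda_k = k^2$ (so $\beta = 2$, $\Lambda = 1$), the modes are sampled on a time grid from the exact Ornstein–Uhlenbeck transition law, and the time integral in the denominator is approximated by the trapezoidal rule. The function names are hypothetical.

```python
import numpy as np

def simulate_ou_modes(theta, sigma, gamma, lam, T, n_steps, rng):
    """Sample dx = -theta*lam*x dt + sigma*lam**(-gamma) dW, x_0 = 0, on an equidistant grid."""
    dt = T / n_steps
    decay = np.exp(-theta * lam * dt)   # one-step mean reversion factor
    # standard deviation of the exact one-step OU transition
    step_sd = sigma * lam ** (-gamma) * np.sqrt((1.0 - decay ** 2) / (2.0 * theta * lam))
    x = np.zeros((n_steps + 1, lam.size))
    for i in range(n_steps):
        x[i + 1] = decay * x[i] + step_sd * rng.standard_normal(lam.size)
    return x

def theta_lin(x, lam, sigma, gamma, alpha, T):
    """Linear estimator (2.26); the stochastic integral uses the robust representation."""
    dt = T / (x.shape[0] - 1)
    ito = 0.5 * (x[-1] ** 2 - x[0] ** 2 - sigma ** 2 * lam ** (-2.0 * gamma) * T)
    numerator = -np.sum(lam ** (1.0 + 2.0 * alpha) * ito)
    # trapezoidal rule for int_0^T (x_t)^2 dt, separately for each mode
    energy = dt * (np.sum(x ** 2, axis=0) - 0.5 * (x[0] ** 2 + x[-1] ** 2))
    denominator = np.sum(lam ** (2.0 + 2.0 * alpha) * energy)
    return float(numerator / denominator)

rng = np.random.default_rng(0)
lam = np.arange(1, 31) ** 2.0           # N = 30 modes, assumed eigenvalues k**2
path = simulate_ou_modes(theta=1.0, sigma=1.0, gamma=0.6, lam=lam, T=1.0, n_steps=20000, rng=rng)
estimate = theta_lin(path, lam, sigma=1.0, gamma=0.6, alpha=0.6, T=1.0)
```

Note that the numerator only needs the initial and terminal values of each mode, while the denominator requires the whole path; the time step should be small compared to $1/(\theta\lambda_N)$.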
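Lemma 2.10. Let $s \in \mathbb{R}$, $\epsilon > 0$. For any process $U \in R(s)$, we have a.s.:
\[
\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N U_t \big\rangle\,\mathrm{d}t \lesssim N^{\frac{1}{2} + \beta\left(2\alpha - \gamma + \frac{1}{2} - \frac{s}{2} + \frac{\epsilon}{2}\right)}. \tag{2.27}
\]
In particular, if $s = s^* - 2 + \eta - \epsilon$ for some $\eta > 0$, then
\[
\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N U_t \big\rangle\,\mathrm{d}t \lesssim N^{1 + \beta\left(2\alpha - 2\gamma + 1 - \frac{\eta}{2} + \epsilon\right)}. \tag{2.28}
\]
Proof. This follows from
\begin{align*}
\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N U_t \big\rangle\,\mathrm{d}t
&\le \Big( \int_0^T \|X^N_t\|^2_{2+4\alpha-s}\,\mathrm{d}t \int_0^T \|P_N U_t\|^2_s\,\mathrm{d}t \Big)^{1/2} \\
&\lesssim \Big( \lambda_N^{2+4\alpha-s-s^*+\epsilon} \int_0^T \|X^N_t\|^2_{s^*-\epsilon}\,\mathrm{d}t \Big)^{1/2} \\
&\lesssim \lambda_N^{2\alpha - \gamma - \frac{s}{2} + \frac{1}{2} + \frac{\beta^{-1}}{2} + \frac{\epsilon}{2}}
\lesssim N^{\frac{1}{2} + \beta\left(2\alpha - \gamma + \frac{1}{2} - \frac{s}{2} + \frac{\epsilon}{2}\right)}.
\end{align*}
An estimator $\hat\theta_N$ for $\theta$ is called strongly consistent if a.s. $\hat\theta_N \to \theta$.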
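Theorem 2.11. Let $\eta > 0$, $s_0 \in \mathbb{R}$ such that $(F_{s,\eta})$ is true for $s_0 \le s < s^*$.
Assume that $X_0 \in H_{s^*+\eta}$ and $X \in R(s_0)$. Let $\alpha > \gamma - (1+\beta^{-1})/4$.
(i) $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are strongly consistent estimators of $\theta$.
(ii) $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal:
\[
N^{\frac{1+\beta}{2}} \big(\hat\theta^{\mathrm{full}}_N - \theta\big) \xrightarrow{d} \mathcal{N}(0, \Sigma), \tag{2.29}
\]
where
\[
\Sigma = \frac{2\theta\,(1+\beta(2\alpha-2\gamma+1))^2}{T\Lambda\,(1+\beta(4\alpha-4\gamma+1))}. \tag{2.30}
\]
(iii) If $\eta > 1 + \beta^{-1}$, then $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are asymptotically normal as in
(2.29). Otherwise, for any $a < \beta\eta/2$,
\[
\hat\theta^{\mathrm{part}}_N = \theta + o(N^{-a}) \tag{2.31}
\]
almost surely, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.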
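Proof.
(i) This is a consequence of (ii) and (iii).
(ii) Plugging in the dynamics of $X^N$ into $\hat\theta^{\mathrm{full}}_N$, we obtain
\begin{align*}
\hat\theta^{\mathrm{full}}_N - \theta
&= -\sigma\, \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, \mathrm{d}W^N_t \rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t} \\
&= -\sigma\, \frac{C^{1/2}_{2+4\alpha-2\gamma}\, N^{1/2+\beta(2\alpha-2\gamma+1/2)}}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t}\, M^N_T,
\end{align*}
where $M^N_t = C^{-1/2}_{2+4\alpha-2\gamma}\, N^{-1/2-\beta(2\alpha-2\gamma+1/2)} \int_0^t \langle (-A)^{1+2\alpha-\gamma} X^N_r, \mathrm{d}W^N_r \rangle$ is
a local martingale. By Proposition 2.8 with $s = 2+4\alpha-2\gamma$ it holds
$\langle M^N \rangle_T \to 1$ in probability, thus $M^N_T / \sqrt{\langle M^N \rangle_T} \to \mathcal{N}(0,1)$ in distribution
as $N \to \infty$ by Theorem A.1. An application of Slutsky's lemma
together with Proposition 2.8 with $s = 2+2\alpha$ gives
\[
N^{\frac{1+\beta}{2}} \big(\hat\theta^{\mathrm{full}}_N - \theta\big) \xrightarrow{d} \mathcal{N}\big(0, \sigma^2 C_{2+4\alpha-2\gamma} / C^2_{2+2\alpha}\big). \tag{2.32}
\]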
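(iii) We write
\begin{align*}
\hat\theta^{\mathrm{part}}_N - \theta &= -\sigma\, \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, \mathrm{d}W^N_t \rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t} - \mathrm{bias}_N(X) + \mathrm{bias}_N(X^N), \\
\hat\theta^{\mathrm{lin}}_N - \theta &= -\sigma\, \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, \mathrm{d}W^N_t \rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t} - \mathrm{bias}_N(X),
\end{align*}
where, with $Y = X$ or $Y = X^N$,
\[
\mathrm{bias}_N(Y) = \frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, P_N F(Y)(t) \rangle\,\mathrm{d}t}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t}. \tag{2.33}
\]
Let $\epsilon > 0$ and $s = s^* + \eta - 2 - \epsilon = 2\gamma - 1 - \beta^{-1} + \eta - \epsilon$. Using condition
$(F_{s^*-2\epsilon,\eta})$, we have that $F(X) \in R(s)$. By Lemma 2.10,
\[
\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N F(Y)(t) \big\rangle\,\mathrm{d}t \lesssim N^{1+\beta\left(2\alpha-2\gamma+1-\frac{\eta}{2}+\epsilon\right)},
\]
and using Proposition 2.8, a.s.
\[
\mathrm{bias}_N(Y) \sim C^{-1}_{2+2\alpha}\, N^{-1-\beta(2\alpha-2\gamma+1)} \int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N F(Y)(t) \big\rangle\,\mathrm{d}t
\lesssim N^{-\frac{\beta}{2}(\eta-2\epsilon)}.
\]
Consequently, $N^{a}\, \mathrm{bias}_N(Y) \to 0$ a.s. for any $a < \beta\eta/2$.
Now, if $\eta > 1 + \beta^{-1}$, let in addition $\epsilon < (\eta - 1 - \beta^{-1})/2$, then we
see that a.s. $N^{(1+\beta)/2}\, \mathrm{bias}_N(Y) \to 0$, and asymptotic normality follows
from an application of the Slutsky lemma.
Otherwise, in case $\eta \le 1 + \beta^{-1}$, we have by Lemma 2.6 (setting
$a = 2+4\alpha-2\gamma-s^*+\epsilon'$ for any $\epsilon' > 0$ in the notation therein) that
\[
N^{b}\, \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, \mathrm{d}W^N_t \rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t} \to 0 \tag{2.34}
\]
almost surely for any $b < (1+\beta)/2$, and (2.31) is immediate.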
Any asymptotically normal estimator in Theorem 2.11 allows for the construction of
asymptotic confidence intervals for $\theta$ by using quantiles of the approximating
normal distribution $\mathcal{N}(\theta, N^{-(1+\beta)}\Sigma)$ for fixed $N \in \mathbb{N}$. However, the asymptotic
variance $\Sigma$ depends linearly on the unknown parameter $\theta$. Thus, in
order to construct asymptotic confidence intervals for the diffusivity that do
not depend on unknown quantities, the variance itself has to be estimated
consistently using any of the three estimators. This is justified by Slutsky's
lemma. Alternatively, a variance-stabilizing transform can be used [vdV98,
Section 3.2].
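The plug-in construction described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the unknown $\theta$ in (2.30) is replaced by the point estimate, and $\beta$, $\Lambda$, $\gamma$ and $T$ are assumed to be known.

```python
from statistics import NormalDist

def asymptotic_variance(theta, alpha, gamma, beta, T, Lambda):
    """Sigma from (2.30)."""
    num = 2.0 * theta * (1.0 + beta * (2.0 * alpha - 2.0 * gamma + 1.0)) ** 2
    den = T * Lambda * (1.0 + beta * (4.0 * alpha - 4.0 * gamma + 1.0))
    return num / den

def plug_in_interval(theta_hat, N, alpha, gamma, beta, T, Lambda, level=0.95):
    """Asymptotic confidence interval, with theta replaced by theta_hat inside Sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)          # two-sided normal quantile
    sigma_hat = asymptotic_variance(theta_hat, alpha, gamma, beta, T, Lambda)
    half_width = z * (sigma_hat / N ** (1.0 + beta)) ** 0.5
    return theta_hat - half_width, theta_hat + half_width
```

For $\alpha = \gamma$, $\beta = 2$ and $T = \Lambda = 1$, the variance is $6\theta$, so an estimate $\hat\theta = 1$ from $N = 30$ modes yields a 95% interval of width roughly $0.058$.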
Consistency of any of the three estimators implies that the measures on
$C(0,T;H)$ generated by $X$ for different values of $\theta > 0$ are mutually singular.
In the setting of this section, it is possible to determine the precise rate
of almost sure convergence of the estimators by a law of iterated logarithm:
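Theorem 2.12. Let $\eta > 0$, $s_0 \in \mathbb{R}$ such that $(F_{s,\eta})$ is true for $s_0 \le s < s^*$.
Assume $X_0 \in H_{s^*+\eta}$ and $X \in R(s_0)$. Let $\alpha > \gamma - (1+\beta^{-1})/4$. Then a.s.
\[
\limsup_{N \to \infty} \frac{N^{\frac{1+\beta}{2}}}{\sqrt{\ln(\ln(N))}} \big(\hat\theta^{\mathrm{full}}_N - \theta\big) = \sqrt{2\Sigma} \tag{2.35}
\]
with $\Sigma$ as in (2.30). If $\eta > 1 + \beta^{-1}$, then (2.35) is true for $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ as
well.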
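Proof. We write
\[
\bar M^N_t := \int_0^t \big\langle (-A)^{1+2\alpha-\gamma} \bar X^N_r, \mathrm{d}W^N_r \big\rangle, \qquad
\tilde M^N_t := \int_0^t \big\langle (-A)^{1+2\alpha-\gamma} \tilde X^N_r, \mathrm{d}W^N_r \big\rangle.
\]
Then
\[
\bar M^N_T = \sum_{k=1}^{N} \lambda_k^{1+2\alpha-\gamma} \int_0^T \bar x^{(k)}_t\,\mathrm{d}W^{(k)}_t =: \sum_{k=1}^{N} Z_k.
\]
We show that we can apply the law of iterated logarithm for independent, not
necessarily identically distributed random variables from [Wit85]$^2$ to $(Z_k)_{k \in \mathbb{N}}$.
To this end, we write
\[
s_N := \Big( \mathbb{E}\Big[ \sum_{k=1}^{N} Z_k^2 \Big] \Big)^{\frac{1}{2}}
= \Big( \mathbb{E} \int_0^T \big\|(-A)^{1+2\alpha-\gamma} \bar X^N_t\big\|^2\,\mathrm{d}t \Big)^{\frac{1}{2}}.
\]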
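Then clearly $s_N \to \infty$ and $s_N \sim s_{N+1}$. Using the Burkholder–Davis–Gundy
inequality, Jensen's inequality, Gaussianity of $\bar x^{(k)}$ and (2.19), we have
\begin{align*}
\mathbb{E}\Big[ \Big| \int_0^T \bar x^{(k)}_t\,\mathrm{d}W^{(k)}_t \Big|^3 \Big]
&\lesssim \mathbb{E}\Big[ \Big( \int_0^T (\bar x^{(k)}_t)^2\,\mathrm{d}t \Big)^{\frac{3}{2}} \Big]
\lesssim \mathbb{E} \int_0^T \big|\bar x^{(k)}_t\big|^3\,\mathrm{d}t \\
&\lesssim \int_0^T \mathbb{E}\big[(\bar x^{(k)}_t)^2\big]^{\frac{3}{2}}\,\mathrm{d}t
\lesssim \big(\lambda_k^{-2\gamma-1}\big)^{\frac{3}{2}}.
\end{align*}
Consequently,
\[
\sum_{k=1}^{N} \frac{\mathbb{E}[|Z_k|^3]}{s_k^3}
\lesssim \sum_{k=1}^{N} \Big( \frac{\lambda_k^{2\alpha-2\gamma+1/2}}{s_k} \Big)^3
\lesssim \sum_{k=1}^{N} k^{3\beta(2\alpha-2\gamma+1/2) - 3/2 - 3\beta(2\alpha-2\gamma+1/2)}
= \sum_{k=1}^{N} \frac{1}{k^{3/2}},
\]
which converges for $N \to \infty$. Therefore all the conditions from [Wit85] are
satisfied, and using $\ln \ln s_N^2 \sim \ln \ln N$, we conclude that a.s.
$^2$ Although it is sufficient for our purposes, this law of iterated logarithm can be further
weakened, see e.g. [Wit87] and [Che93] for a discussion.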
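\[
\limsup_{N \to \infty} \frac{1}{\sqrt{2 \ln \ln N}}\, \frac{\bar M^N_T}{\sqrt{\langle \bar M^N \rangle_T}}
= \limsup_{N \to \infty} \frac{\bar M^N_T}{\sqrt{2 s_N^2 \ln \ln s_N^2}} = 1.
\]
In particular, $\limsup_{N \to \infty} N^{(1+\beta)/2} (\ln \ln N)^{-1/2}\, \sigma \bar M^N_T / \int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t = (2\Sigma)^{1/2}$.
Further, by Lemma 2.6 (with $a = 2+4\alpha-2\gamma-s$ in the notation therein),
$N^{(1+\beta)/2} \tilde M^N_T / \int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t \to 0$ almost surely, where
we have used that $\tilde X \in R(s^*+\eta-\epsilon)$ for every $\epsilon > 0$. With
$\hat\theta^{\mathrm{full}}_N - \theta = -\sigma (\bar M^N_T + \tilde M^N_T) / \int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t$, (2.35) is proven
(the sign is immaterial, since the same argument applies to $(-Z_k)_{k \in \mathbb{N}}$). The statement
concerning $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ follows from the proof of Theorem 2.11 (iii).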
Remark 2.13.
(i) A direct calculation shows that the asymptotic variance is minimal for
$\alpha = \gamma$. In this case, $\Sigma = 2\theta(1+\beta)/(T\Lambda)$. However, the estimators are
robust to the case $\alpha \ne \gamma$, when the spatial regularity of $X$ is wrongly
specified.
(ii) In fact, continuous observation on $[0,T]$ of any of the modes $x^{(k)}$ allows one
to reconstruct $\gamma$ precisely via the quadratic variation $\langle x^{(k)} \rangle_T = \sigma^2 \lambda_k^{-2\gamma} T$,
if $\sigma$ and $\lambda_k$ are known.
(iii) $\Sigma$ depends linearly on $T^{-1}$, i.e. observation on a large time interval
improves the estimate. This corresponds to the large time asymptotics
with rate $T^{1/2}$, which is well-known from statistics for stochastic
differential equations under ergodicity assumptions, see e.g. [Kut04].
(iv) Following the formalism of the (heuristic) maximum–likelihood approach,
the term $\sigma^{-2} \int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t$ (for $\alpha = \gamma$) can be considered the
observed Fisher information.
(v) The convergence rate of $\hat\theta^{\mathrm{full}}_N$ is upper bounded by $N^{-1/2}$ independently
of the dimension $d$.
(vi) The convergence rate (2.31) cannot be improved for $\hat\theta^{\mathrm{lin}}_N$, see Section
2.4.1.
(vii) In [PS20], a condition of the type $(F^v_{s,\eta})$ has been used to infer higher
regularity of $\tilde X$. Compared to $(F_{s,\eta})$, this condition is more restrictive
and yields lower excess regularity $\eta$ in examples (cf. Lemma 2.22 below).
Consequently, the lower bounds on the convergence rate of $\hat\theta^{\mathrm{lin}}_N$
were too pessimistic. An additional Lipschitz condition on $F$ has been
used in order to reduce the asymptotic behavior of $\hat\theta^{\mathrm{part}}_N$ directly to that
of $\hat\theta^{\mathrm{full}}_N$, leading to better convergence rates of $\hat\theta^{\mathrm{part}}_N$ compared to $\hat\theta^{\mathrm{lin}}_N$,
namely, the same rates as stated in Theorem 2.11. In the present formalism
used in this chapter, this additional Lipschitz condition is no
longer necessary for statistical purposes, as both $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ obtain
the mentioned rates using $(F_{s,\eta})$ alone.
(viii) If $F$ satisfies $(F_{s,\eta})$ and $P_N F - P_N F P_N$ satisfies $(F_{s,\eta'})$ for some $\eta' > \eta$,
then the convergence rate of $\hat\theta^{\mathrm{part}}_N$ can be further improved. This is
trivially the case if $[P_N, F] := P_N F - F P_N = 0$; in this case $\hat\theta^{\mathrm{part}}_N$
coincides with $\hat\theta^{\mathrm{full}}_N$.
(ix) Consider the situation that $A$ and $F$ are (pseudo-) differential operators
on a domain $\mathcal{D} \subset \mathbb{R}^d$. Theorem 2.11 implies that in order to
identify $\theta$ in finite time, it is sufficient that $F$ is of lower order compared
to the leading order drift term $\theta A$ (this is discussed in detail in
Section 2.4). However, parameters describing the intensity of lower order
terms may be identified in finite time as well. Consequently, it is
possible that $\theta$ remains identifiable if $F$ is of higher order than $\theta A$. In
[HR95], a characterization for linear $F$ that commute with $A$ is given:
$\theta$ is consistently (and asymptotically normally) estimated by an estimator
of the type $\hat\theta^{\mathrm{full}}_N$ if and only if $\mathrm{order}(A) \ge (\mathrm{order}(\theta A + F) - d)/2$,
i.e. $\mathrm{order}(F) \le 2\,\mathrm{order}(A) + d$. This has been extended by subsequent
works on the spectral approach, cf. [LR99, LR00] for the case of
non-commuting operators.
(x) As only pathwise properties of the nonlinear process $\tilde X$ are needed, $F$
may, in fact, depend on the realization $\omega$.
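Item (ii) of Remark 2.13 can be illustrated on simulated data. The sketch below is an assumption-laden illustration: a single mode is sampled on a fine grid from the exact Ornstein–Uhlenbeck transition law, the quadratic variation is replaced by the realized sum of squared increments, and $\sigma$ and $\lambda_k \ne 1$ are taken as known. The function names are hypothetical.

```python
import math
import random

def simulate_mode(theta, sigma, gamma, lam, T, n_steps, seed=0):
    """Sample one mode dx = -theta*lam*x dt + sigma*lam**(-gamma) dW, x_0 = 0, on a grid."""
    rnd = random.Random(seed)
    dt = T / n_steps
    decay = math.exp(-theta * lam * dt)
    step_sd = sigma * lam ** (-gamma) * math.sqrt((1.0 - decay ** 2) / (2.0 * theta * lam))
    path = [0.0]
    for _ in range(n_steps):
        path.append(decay * path[-1] + step_sd * rnd.gauss(0.0, 1.0))
    return path

def gamma_from_quadratic_variation(path, sigma, lam, T):
    """Solve <x>_T = sigma^2 * lam^(-2*gamma) * T for gamma, using realized quadratic variation."""
    qv = sum((b - a) ** 2 for a, b in zip(path, path[1:]))
    return -math.log(qv / (sigma ** 2 * T)) / (2.0 * math.log(lam))

mode_path = simulate_mode(theta=1.0, sigma=1.0, gamma=0.6, lam=4.0, T=1.0, n_steps=50000)
gamma_hat = gamma_from_quadratic_variation(mode_path, sigma=1.0, lam=4.0, T=1.0)
```

With discrete observations the realized quadratic variation is only an approximation, so the reconstruction of $\gamma$ is exact only in the limit of vanishing time step.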
2.4 Discussion of Examples
Next, we discuss the validity of condition $(F_{s,\eta})$ and the resulting statements
concerning diffusivity estimation for models with different nonlinear term $F$.
These examples are by no means exhaustive. Note that more complicated
nonlinearities can be decomposed into their elementary building blocks in
the following sense:
Lemma 2.14. Let $s \in \mathbb{R}$, $\eta > 0$.
(i) If $F_1, F_2$ satisfy $(F_{s,\eta})$, then the same is true for $F_1 + F_2$.
(ii) Let $\eta' > 0$ and $s' := s + \eta - 2$. If $F$ satisfies $(F_{s,\eta})$ and $G$ satisfies
$(F_{s',\eta'})$, then $G \circ F$ satisfies $(F_{s,\eta+\eta'-2})$.
Proof. All statements are clear from (2.10).
Consequently, a broad class of models where the present theory is applicable
can be constructed from elementary components, e.g. polynomial nonlinearities
(or related reaction terms), differential operators acting in spatial
direction (advection or fractional diffusion), integration in time (delay terms
as in [DPZ14, Example 5.6]). In the next sections, we consider different
models in detail.
2.4.1 Linear Perturbations
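For $r < 2$ and $c \in \mathbb{R}$, consider
\[
\mathrm{d}X_t = \theta A X_t\,\mathrm{d}t + c(-A)^{r/2} X_t\,\mathrm{d}t + B\,\mathrm{d}W_t
\]
with initial condition $X_0 \in H_{s^*+2}$. Here, $F(X) = c(-A)^{r/2} X$. If $c \ne 0$, then
$F: H_{s+r} \to H_s$ is an isomorphism for any $s \in \mathbb{R}$. In particular, $(F_{s,\eta})$ is true
for all $s \in \mathbb{R}$ and $\eta < 2 - r$. In this setting, $\hat\theta^{\mathrm{full}}_N$ coincides with $\hat\theta^{\mathrm{part}}_N$. We
have:
Theorem 2.15. Let $\alpha > \gamma - (1+\beta^{-1})/4$. Then $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal
as in (2.29). Furthermore:
(i) If $r < 1 - \beta^{-1}$, then $\hat\theta^{\mathrm{lin}}_N$ is asymptotically normal as in (2.29).
(ii) If $r = 1 - \beta^{-1}$, then
\[
N^{\frac{1+\beta}{2}} \big(\hat\theta^{\mathrm{lin}}_N - \theta\big) \xrightarrow{d} \mathcal{N}(-\kappa, \Sigma), \tag{2.36}
\]
where
\[
\kappa = c\,\Lambda^{r/2-1}\, \frac{1+\beta(2\alpha-2\gamma+1)}{1+\beta(2\alpha-2\gamma+r/2)} \tag{2.37}
\]
and $\Sigma$ is given by (2.30).
(iii) If $r > 1 - \beta^{-1}$, then a.s.
\[
N^{\beta(1-r/2)} \big(\hat\theta^{\mathrm{lin}}_N - \theta\big) \to -\kappa. \tag{2.38}
\]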
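Proof. (i) is a direct consequence of Theorem 2.11. For (ii), (iii), it suffices to
understand the exact asymptotics of the bias term involving $F$ in the setting
of Theorem 2.11. Due to $\alpha > \gamma - (1+\beta^{-1})/4$ together with $r \ge 1 - \beta^{-1}$,
it holds $1 + \beta(2\alpha-2\gamma+r/2) > 0$. Using Proposition 2.8 and the notation
therein, it holds a.s.
\begin{align*}
\frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, P_N F(X)(t) \rangle\,\mathrm{d}t}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t}
&= c\, \frac{\int_0^T \|(-A)^{1/2+r/4+\alpha} X^N_t\|^2\,\mathrm{d}t}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,\mathrm{d}t} \\
&\sim \frac{c\, C_{1+r/2+2\alpha}\, N^{1+\beta(2\alpha-2\gamma+r/2)}}{C_{2+2\alpha}\, N^{1+\beta(2\alpha-2\gamma+1)}} \\
&= c\,\Lambda^{r/2-1}\, \frac{1+\beta(2\alpha-2\gamma+1)}{1+\beta(2\alpha-2\gamma+r/2)}\, N^{-\beta(1-r/2)}.
\end{align*}
Now (ii) is immediate, and (2.38) holds w.r.t. convergence in probability.
Finally, Lemma 2.6 yields almost sure convergence in (2.38) by the same
argument used in the proof of Theorem 2.11 (iii).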
Remark 2.16. In particular, the convergence rate (2.31) for $\hat\theta^{\mathrm{lin}}_N$ as stated
in Theorem 2.11 cannot be improved.
In case $\beta = 2/d$ for $d \in \mathbb{N}$, the critical condition $r < 1 - \beta^{-1}$ is equivalent
to $r < 1 - d/2$, i.e. the critical order of $F$ decreases with the dimension. Note
that $r$ is allowed to be negative here. We highlight two cases, which will be
refined in the next sections:
• Perturbation of order zero ($r = 0$): In $d = 1$, $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are asymptotically
normal. In $d = 2$, $\hat\theta^{\mathrm{lin}}_N$ still converges to $\theta$ with optimal rate.
In $d \ge 3$, the convergence rate of $\hat\theta^{\mathrm{lin}}_N$ declines.
• Perturbation of order one ($r = 1$): In any dimension $d \ge 1$ the convergence
rate of $\hat\theta^{\mathrm{lin}}_N$ declines compared to that of $\hat\theta^{\mathrm{full}}_N$, but all estimators
stay consistent.
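The constant $\kappa$ from (2.37) can be cross-checked against a deterministic proxy for the bias term. The following sketch is heuristic: it replaces both time integrals in the bias by their leading-order expectations from (2.22) for the unperturbed modes, and it assumes exactly polynomial eigenvalues $\lambda_k = \Lambda k^{\beta}$. The function names are illustrative.

```python
import numpy as np

def kappa(c, r, alpha, gamma, beta, Lambda):
    """Limit constant (2.37)."""
    top = 1.0 + beta * (2.0 * alpha - 2.0 * gamma + 1.0)
    bottom = 1.0 + beta * (2.0 * alpha - 2.0 * gamma + r / 2.0)
    return c * Lambda ** (r / 2.0 - 1.0) * top / bottom

def rescaled_bias_proxy(N, c, r, alpha, gamma, beta, Lambda):
    """N**(beta*(1-r/2)) * c * sum lam**(r/2+shift) / sum lam**(1+shift), shift = 2*alpha-2*gamma."""
    k = np.arange(1, N + 1, dtype=float)
    lam = Lambda * k ** beta
    shift = 2.0 * alpha - 2.0 * gamma
    ratio = np.sum(lam ** (r / 2.0 + shift)) / np.sum(lam ** (1.0 + shift))
    return float(N ** (beta * (1.0 - r / 2.0)) * c * ratio)
```

For a first-order perturbation in $d = 1$ (`r = 1`, `beta = 2`, `alpha = gamma`, `c = Lambda = 1`), the proxy equals $3N/(2N+1)$, which tends to $\kappa = 3/2$.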
2.4.2 Reaction-Diffusion Equations
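Let $d \ge 1$, and let $\mathcal{D} \subset \mathbb{R}^d$ be a bounded domain with smooth boundary. Let
$f: \mathbb{R} \to \mathbb{R}$ be a real-valued function. Consider
\[
\mathrm{d}X_t = \theta \Delta X_t\,\mathrm{d}t + f(X_t)\,\mathrm{d}t + B\,\mathrm{d}W_t, \tag{2.39}
\]
together with initial condition $X_0$ such that $\mathbb{E}\|X_0\|^p_{s^*+2} < \infty$ is true for
any $p \ge 1$. W.l.o.g. we assume Dirichlet boundary conditions, i.e. $X_t = 0$ on
the boundary $\partial\mathcal{D}$ for $0 \le t \le T$. Set $H = L^2(\mathcal{D})$. The leading order linear
operator $A$ is given by $\Delta: D(\Delta) \to H$, where $D(\Delta) = W^{2,2}(\mathcal{D}) \cap W^{1,2}_0(\mathcal{D})$.
The regularity scale is given by $H_s = D((-\Delta)^{s/2})$, such that $H_s$ consists of
functions of $L^2$-Sobolev regularity $s$. It is always true that $W^{s,2}_0(\mathcal{D}) \subset H_s \subset W^{s,2}(\mathcal{D})$,
$s \ge 0$. For $s \in \mathbb{N}$, a precise characterization of $H_s$ in terms of
boundary trace operators can be given [Tho06, Lemma 3.1]. Further, it is
well-known that $\lambda_k \sim \Lambda k^{\beta}$ with $\beta = 2/d$, see [Wey11] or [Shu01, Section
13.4].
We consider either of the following two structural assumptions:
(i) $f$ is a polynomial of odd degree and negative leading coefficient, i.e.
for $m \in 2\mathbb{N} - 1$, there are $a_0, \dots, a_m \in \mathbb{R}$ with $a_m < 0$ such that
\[
f(x) = a_m x^m + \dots + a_1 x + a_0. \tag{2.40}
\]
(ii) $f$ is a bounded smooth function with bounded derivatives of any order:
\[
f \in C^{\infty}_b(\mathbb{R}). \tag{2.41}
\]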
Reaction–Diffusion models exhibit a broad variety of different dynamical
features. Nonetheless, in terms of diffusivity estimation, they can be treated
in a unified way, as explained in Theorem 2.20 below.
Proposition 2.17. Let $d \le 3$ and $\gamma > d/4 + 1/2$. Consider either of the
following two situations:
• Let $f$ be a polynomial as in (2.40). If $d = 3$, assume additionally that
$f$ is at most of third order (i.e. $m \le 3$).
• Let $f$ be a smooth function with bounded derivatives as in (2.41).
Then there is a unique solution $X$ to (2.39) with $X \in R_E(s)$ for some $s > d/2$.
The proof relies on [LR15] and is given in Appendix B.1. The condition
$\gamma > d/4 + 1/2$ means that $B$ is a Hilbert–Schmidt operator from $H$ into
$V$. This is needed because in the proof of Proposition 2.17, coercivity is
verified directly in $V$ instead of $H$. However, there are situations where this
requirement can be relaxed to $\gamma > d/4$, i.e. $B$ is of Hilbert–Schmidt type
from $H$ into $H$:
Proposition 2.18. Let $d = 1$, $\gamma > 1/4$ and $f(x) = x - x^3$.
(i) There is a unique solution $X \in R_v(1)$ to (2.39).
(ii) $(F^v_{s,\eta})$ holds for $s = 1$ and $\eta = 1$. In particular, even $X \in R(1)$.
Note that condition $(F^v_{s,\eta})$ instead of $(F_{s,\eta})$ is used in order to prove the
basic regularity result $X \in R(1)$. From there, $(F_{s,\eta})$ can be used to infer
higher regularity.
Proof.
(i) This is a special case of [LR15, Example 5.1.8].
(ii) $(F^v_{s,\eta})$ is true due to
\begin{align*}
\|f(X)\|^2_{s+\eta-2} = \|f(X)\|^2_{L^2(\mathcal{D})}
&\lesssim \|X\|^2_{L^2(\mathcal{D})} + \|X\|^6_{L^6(\mathcal{D})}
\lesssim \|X\|^2_1 + \|X\|^6_{1/3} \\
&\le \|X\|^2_1 + \|X\|^2_1 \|X\|^4_0 = \|X\|^2_1 \big(1 + \|X\|^4_0\big),
\end{align*}
where we used $\|X^3\|^2_{L^2(\mathcal{D})} = \|X\|^6_{L^6(\mathcal{D})}$ together with the Sobolev embedding
$H_{1/3} \subset L^6(\mathcal{D})$ in $d = 1$ [AF03]. Proposition 2.5 implies
$\tilde X \in R_v(2) \subset R(1)$. Together with $\bar X \in R(1)$ due to $\gamma > 1/4$, this
implies the claim.
Since $d = 1$ in the previous proposition, we can rephrase the existence
result: In particular, we have $X \in R(s)$ for some $s > d/2$, exactly as in
Proposition 2.17. This condition $s > d/2$ means that $X$ has values in a
Sobolev space that is embedded into the space of continuous functions on $\mathcal{D}$.
This is a natural starting point for inductively applying $(F_{s,\eta})$, cf. Proposition
2.4, as the next result illustrates:
Proposition 2.19.
(i) Let $f$ be a polynomial as in (2.40). Then $(F_{s,\eta})$ holds for any $s > d/2$
and $0 < \eta < 2$.
(ii) Let $f \in C^{\infty}_b(\mathbb{R})$. Then $(F_{s,\eta})$ holds for any $s \in [0,1] \cup [d/2,\infty)$ and
$0 < \eta < 2$.
Proof.
(i) For $s > d/2$ the Sobolev spaces $W^{s,2}(\mathcal{D})$ are closed under multiplication
(see e.g. [AF03, Theorem 4.39], [Tri10a, p. 146]). Therefore,
\[
\|f(X)\|_s \lesssim \sum_{k=1}^{m} |a_k|\, \|X^k\|_s \lesssim (1 + \|X\|_s)^m
\]
for $X \in H_s$.
(ii) The case $s = 0$ is trivial since $\|f(X)\|^2_{L^2(\mathcal{D})} \le |\mathcal{D}| \sup_{y \in \mathbb{R}} |f(y)|^2 < \infty$,
so let $s > 0$. Set $\tilde f := f - f(0)$. By Theorem A from [AF92] and
the discussion thereafter, there is $C > 0$ such that $\|\tilde f(X)\|_s \le C(1 + \|X\|_s)^{1 \vee s}$
for $s \in (0,1] \cup [d/2,\infty)$, and the claim is immediate.
In particular, for each of the examples considered in this section, it is true
that a.s. $X \in R(s)$ and $\tilde X \in R(s+2)$ for any $s < s^*$. Therefore, by Theorem
2.11, we obtain the following result concerning diffusivity estimation:
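Theorem 2.20. Let $\alpha > \gamma - (d+2)/8$. Then $\hat\theta^{\mathrm{full}}_N$ satisfies
\[
N^{\frac{1}{2}+\frac{1}{d}} \big(\hat\theta^{\mathrm{full}}_N - \theta\big) \xrightarrow{d} \mathcal{N}\Big(0, \frac{2\theta\,(d+4\alpha-4\gamma+2)^2}{T\Lambda d\,(d+8\alpha-8\gamma+2)}\Big). \tag{2.42}
\]
If $d = 1$, the same is true for $\hat\theta^{\mathrm{part}}_N$, $\hat\theta^{\mathrm{lin}}_N$. In $d = 2$, $\hat\theta^{\mathrm{part}}_N$ is consistent with
optimal rate, i.e. $\hat\theta^{\mathrm{part}}_N = \theta + o(N^{-a})$ for any $a < 1$, and the same is true for
$\hat\theta^{\mathrm{lin}}_N$.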
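The variance in (2.42) is the general expression (2.30) evaluated at $\beta = 2/d$. The following lines spell out this substitution as a consistency check; they also recover the rate and the condition on $\alpha$. Nothing beyond $\beta = 2/d$ is assumed.

```latex
\begin{align*}
\Sigma
  &= \frac{2\theta\,\big(1+\tfrac{2}{d}(2\alpha-2\gamma+1)\big)^2}
          {T\Lambda\,\big(1+\tfrac{2}{d}(4\alpha-4\gamma+1)\big)}
   = \frac{2\theta\, d^{-2}\,(d+4\alpha-4\gamma+2)^2}
          {T\Lambda\, d^{-1}\,(d+8\alpha-8\gamma+2)}
   = \frac{2\theta\,(d+4\alpha-4\gamma+2)^2}
          {T\Lambda\, d\,(d+8\alpha-8\gamma+2)}, \\
\frac{1+\beta}{2} &= \frac{1}{2}+\frac{1}{d}, \qquad
\gamma - \frac{1+\beta^{-1}}{4} = \gamma - \frac{d+2}{8}.
\end{align*}
```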
It is clear that the coefficients $(a_k)_{0 \le k \le m}$ in (2.40) may depend on $x \in \mathcal{D}$,
in such a way that $a_k \in H_{s^*}$ for $0 \le k \le m$. This does not change the proof
of $(F_{s,\eta})$ for $s < s^*$ in Proposition 2.19, thus Theorem 2.20 remains valid in
that case.
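Remark 2.21. It is straightforward to include an advection term of the form
$F_{\mathrm{ad}}(X) = \nabla \cdot (Xv) = \mathrm{div}(Xv)$ to the reaction–diffusion equation, where
$v: \mathcal{D} \to \mathbb{R}^d$ is a vector field with components $v^{(i)} \in H_s$ for some $s > d/2$.
More precisely, assume that the nonlinearity of the equation $F = F_{\mathrm{re}} + F_{\mathrm{ad}}$
splits into a reaction term $F_{\mathrm{re}}(X) = f(X)$ as before, and an advection
term $F_{\mathrm{ad}}$ as described above. It is clear that $\mathrm{div}$ maps $(H_s)^d$ into $H_{s-1}$ for
any $s \in \mathbb{R}$.$^3$ Furthermore, if $X \in H_s$ and $v \in (H_s)^d$ for $s > d/2$, then
$\|X v^{(i)}\|_s \lesssim \|X\|_s \|v^{(i)}\|_s$ for $1 \le i \le d$, and consequently,
$\|F_{\mathrm{ad}}(X)\|_{s-1} \lesssim \|X\|_s \sum_{i=1}^{d} \|v^{(i)}\|_s$, i.e. $F_{\mathrm{ad}}$ (and consequently $F = F_{\mathrm{re}} + F_{\mathrm{ad}}$) satisfies $(F_{s,\eta})$
with any $\eta < 1$. In terms of diffusivity estimation, this means that
$\hat\theta^{\mathrm{part}}_N = \theta + o(N^{-a})$ for any $a < 1/d$, and similarly for $\hat\theta^{\mathrm{lin}}_N$, whereas $\hat\theta^{\mathrm{full}}_N$ is asymptotically
normal with rate $N^{-1/2-1/d}$. It cannot be expected that this loss in convergence
rate (compared to $\hat\theta^{\mathrm{full}}_N$) can be improved for $\hat\theta^{\mathrm{lin}}_N$, cf. Section 2.4.1.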
2.4.3 Burgers Equation
Let $d = 1$ and $\mathcal{D} = [0, L] \subset \mathbb{R}$ for some $L > 0$. Consider
\[
\mathrm{d}X_t = \theta \Delta X_t\,\mathrm{d}t - X_t \partial_x X_t\,\mathrm{d}t + B\,\mathrm{d}W_t \tag{2.43}
\]
with initial condition $X_0 \in L^p(\Omega, H_{s^*+1})$ for any $p \ge 1$, and Dirichlet boundary
conditions. The nonlinearity is given by
\[
F(X) = -X \partial_x X = -\partial_x \Big( \frac{1}{2} X^2 \Big).
\]
The spaces $H = L^2(\mathcal{D})$ and $(H_s)_{s \in \mathbb{R}}$ are as in Section 2.4.2. As a special
case of [LR15, Example 5.1.8], (2.43) has a unique solution in $R_v(1)$. Higher
regularity can be inferred in a stepwise manner as follows:
Lemma 2.22.
(i) $F$ satisfies $(F^v_{s,\eta})$ with $\eta = 3/8$ for $s = 1$.
(ii) $F$ satisfies $(F^v_{s,\eta})$ with $\eta = 1/2$ for any $s > 1$.
(iii) $F$ satisfies $(F_{s,\eta})$ for all $\eta < 1$ and $s > 1/2$.
In particular, $X \in R_v(s)$ and $\tilde X \in R_v(s+1/2)$ for all $s < s^*$. If additionally
$s^* > 3/2$ (i.e. $\gamma > 1/2$), then $X \in R(s)$ and $\tilde X \in R(s+1)$ for any $s < s^*$.
$^3$ For $s \in \mathbb{Z}$ this is obvious, for general $s$ use the exact interpolation theorem [AF03,
Theorem 7.23].
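Proof. In the following calculations we use repeatedly the interpolation inequality
$\|X\|_{r s_1 + (1-r) s_2} \le \|X\|^{r}_{s_1} \|X\|^{1-r}_{s_2}$ for $s_1, s_2 \in \mathbb{R}$ and $0 < r < 1$, further
the algebra property $\|XY\|_s \lesssim \|X\|_s \|Y\|_s$ for $s > 1/2$ and the Sobolev embedding
$H_{1/4} \subset L^4(\mathcal{D})$ in $d = 1$. These estimates are standard and can be
found, e.g., in [AF03].
(i) We have
\begin{align*}
\|F(X)\|^2_{s+\eta-2} = \frac{1}{4} \|X^2\|^2_{3/8}
&\le \|X^2\|_{3/4} \|X^2\|_{L^2(\mathcal{D})}
\lesssim \|X\|^2_{3/4} \|X\|^2_{L^4(\mathcal{D})} \\
&\lesssim \|X\|^2_{3/4} \|X\|^2_{1/4}
\le \|X\|^{3/2}_1 \|X\|^{1/2}_0 \|X\|^{1/2}_1 \|X\|^{3/2}_0
= \|X\|^2_1 \|X\|^2_0,
\end{align*}
so $(F^v_{s,\eta})$ is satisfied as stated.
(ii) For $s > 1$ and $\eta = 1/2$,
\[
\|F(X)\|^2_{s+\eta-2} = \frac{1}{4} \|X^2\|^2_{s-1/2} \lesssim \|X\|^4_{s-1/2} \le \|X\|^2_s \|X\|^2_{s-1},
\]
so condition $(F^v_{s,\eta})$ holds.
(iii) For $s > 1/2$ and $\eta < 1$, with $\epsilon = 1 - \eta$,
\[
\|F(X)\|_{s+\eta-2+\epsilon} = \frac{1}{2} \|X^2\|_s \lesssim \|X\|^2_s,
\]
so $(F_{s,\eta})$ holds.
Concerning the regularity of $X$, we know already that $X \in R_v(1)$. By (i)
and Proposition 2.5, $\tilde X \in R_v(1+3/8)$. As $\bar X \in R(s)$ for all $s < s^*$, we get
$X \in R_v(s)$ for $s < s^* \wedge (1+3/8)$, and in particular, there is $s > 1$ with
$X \in R_v(s)$. Now (ii) and repeated use of Proposition 2.5 yields $X \in R_v(s)$
and $\tilde X \in R_v(s+1/2)$ for any $s < s^*$. Further, $X \in R(s-1)$ for all $s < s^*$
because $R_v(s) \subset R(s-1)$. In case $s^* > 3/2$, we have $X \in R(s)$ for some
$s > 1/2$, and Proposition 2.4 gives $X \in R(s)$, $\tilde X \in R(s+1)$ for all $s < s^*$.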
This Lemma, together with Theorem 2.11, yields the asymptotic properties
of $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ for the Burgers equation:
Theorem 2.23. Let $\gamma > 1/2$ and $\alpha > \gamma - 3/8$. Then $\hat\theta^{\mathrm{full}}_N$ satisfies
\[
N^{\frac{3}{2}} \big(\hat\theta^{\mathrm{full}}_N - \theta\big) \xrightarrow{d} \mathcal{N}\Big(0, \frac{2\theta\,(3+4\alpha-4\gamma)^2}{T\Lambda\,(3+8\alpha-8\gamma)}\Big). \tag{2.44}
\]
Further, $\hat\theta^{\mathrm{part}}_N = \theta + o(N^{-a})$ for $a < 1$, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.
Remark 2.24. It is possible to apply Theorem 2.11 to the stochastic Navier–Stokes
equations driven by additive noise in dimension $d = 2$ with unknown
viscosity. In this case, we reobtain the results from [CGH11]. It has been
conjectured in [Cia18] that these results apply also to the stochastic Burgers
equation.
2.4.4 Equations of Cahn-Hilliard Type
For $d \ge 1$, fix a bounded domain $\mathcal{D} \subset \mathbb{R}^d$ with smooth boundary. Let
$f: \mathbb{R} \to \mathbb{R}$ be a real-valued function. Consider
\[
\mathrm{d}X_t = -\theta \Delta^2 X_t\,\mathrm{d}t - \Delta f(X_t)\,\mathrm{d}t + B\,\mathrm{d}W_t \tag{2.45}
\]
with initial condition $X_0$ and boundary conditions $\nabla X \cdot \nu = 0$, $\nabla(\Delta X) \cdot \nu = 0$,
where $\nu: \partial\mathcal{D} \to \mathbb{R}^d$ is the unit normal vector pointing outwards from the domain $\mathcal{D}$. We
formalize this setting as in [LR15, p. 172 ff.]: Set $H = L^2(\mathcal{D})$, and let $V$ be
the closure in $W^{2,2}(\mathcal{D})$ of $\{u \in C^4(\bar{\mathcal{D}}) \mid \nabla u \cdot \nu = 0,\ \nabla(\Delta u) \cdot \nu = 0 \text{ on } \partial\mathcal{D}\}$.
Considering the Gelfand triple $V \subset H \simeq H^* \subset V^*$, we have that $A = -\Delta^2$
is a bounded operator $V \to V^*$. As before, we set $H_s := D((-A)^{s/2})$. Our
standing assumption is $X_0 \in H_{s^*+1}$.
Note that the regularity counting in this section differs from the convention
from the previous examples, because the leading drift term in (2.45) is
of order four: This means that $H_s$ is a closed subspace of $W^{2s,2}(\mathcal{D})$. Furthermore,
in this case we have $\beta = 4/d$, i.e. $\lambda_k \sim \Lambda k^{4/d}$, see [Shu01, Section 13.4].
We additionally introduce the “classical” regularity spaces $H'_s := D((-\Delta)^{s/2})$
that have been used in the previous sections. It is necessary to specify which
regularity scale we are using when we speak about condition $(F_{s,\eta})$.
Proposition 2.25. Let $s \in \mathbb{R}$, $\eta > 0$, and set $s' := 2s$, $\eta' := 2\eta$. If $f$
satisfies $(F_{s',\eta'})$ with respect to the scale of Hilbert spaces $(H'_r)_{r \in \mathbb{R}}$, then $F$,
given by $F(X) = -\Delta f(X)$, satisfies $(F_{s,\eta})$ with respect to the scale of Hilbert
spaces $(H_r)_{r \in \mathbb{R}}$.
Proof. Choose ϵ > 0and g: [0,)[0,)as in (Fs). Then
∥−f(X)Hs+η+ϵ/22=f(X)H
s+η+ϵ2gXH
s=gXHs.
For example, let d2and assume that fis of the form
\[
f(x) = cx + \phi(x) \tag{2.46}
\]
for $c\in\mathbb{R}$ and $\phi\in C_b^\infty(\mathbb{R})$. In particular, $f$ is globally Lipschitz continuous. By [LR15, Example 5.2.27], there is a unique solution $X$ to (2.45) with a.s. $X\in R_v(1)\cap R(0)$. As a consequence of Proposition 2.25 and Proposition 2.19 (ii), $F = -\Delta f$ satisfies $(F_{s,\eta})$ for any $s\ge 0$ with $\eta < 1$ in the regularity scale $(H_r)_{r\in\mathbb{R}}$. By Proposition 2.4 we conclude $X\in R(s)$ and $\tilde X\in R(s+1)$ for any $s < s^*$. Therefore, we have:
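Theorem 2.26. Let $\gamma > d/8$ and $\alpha > \gamma - (d+4)/16$. Then $\hat\theta^{\mathrm{full}}_N$ satisfies
\[
N^{\frac12+\frac2d}\big(\hat\theta^{\mathrm{full}}_N - \theta\big) \xrightarrow{d} \mathcal{N}\left(0, \frac{2\theta(d+8\alpha-8\gamma+4)^2}{T\Lambda d\,(d+16\alpha-16\gamma+4)}\right). \tag{2.47}
\]
Further, $\hat\theta^{\mathrm{part}}_N = \theta + o(N^{-a})$ for all $a < 2/d$, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.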
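The limiting variances in (2.44) and (2.47) are easy to evaluate numerically. The following is a minimal sketch; the function names and the sample values for $\theta$, $T$, $\Lambda$, $\alpha$, $\gamma$ are illustrative assumptions and do not come from the text. For $d = 2$ the Cahn-Hilliard case has $\beta = 4/d = 2$, so (2.47) should reduce to (2.44), which the last lines check.

```python
import math


def sigma_burgers(theta, T, Lam, alpha, gamma):
    """Asymptotic variance on the right-hand side of (2.44)."""
    return 2 * theta * (3 + 4 * alpha - 4 * gamma) ** 2 / (
        T * Lam * (3 + 8 * alpha - 8 * gamma))


def sigma_cahn_hilliard(theta, T, Lam, d, alpha, gamma):
    """Asymptotic variance on the right-hand side of (2.47)."""
    return 2 * theta * (d + 8 * alpha - 8 * gamma + 4) ** 2 / (
        T * Lam * d * (d + 16 * alpha - 16 * gamma + 4))


# Illustrative parameter values (assumptions, chosen only for the demo).
theta, T, Lam, alpha, gamma = 0.02, 1.0, math.pi ** 2, 0.4, 0.4
print(sigma_burgers(theta, T, Lam, alpha, gamma))
print(sigma_cahn_hilliard(theta, T, Lam, 2, alpha, gamma))
```

With $\alpha = \gamma$ the Burgers variance simplifies to $6\theta/(T\Lambda)$, which gives a quick sanity check on the implementation.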
2.4.5 Robustness under Model Misspecification
Assume that the true dynamics of a process $X$ is given by
\[
dX_t = \theta AX_t\,dt + F(X)(t)\,dt + G(X)(t)\,dt + B\,dW_t \tag{2.48}
\]
with smooth initial condition $X_0$ and $F, G: C(0,T;H)\supseteq D(F)\cap D(G)\to L^1(0,T;H)$, where $D(F)\cap D(G)$ is the common domain of $F$ and $G$. We assume that (2.48) is well-posed in $R(s_0)$ for some $0\le s_0 < s^*$, and that $F$ satisfies $(F_{s,\eta_F})$ for some $\eta_F > 0$ and all $s_0\le s < s^*$. Assume further that $X_0\in H_{s^*+\eta_F}$. We are interested in the robustness of $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ with respect to the misspecification $G$. In this section, all three estimators are given by the same terms as in Section 2.3. In particular, $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{part}}_N$ include knowledge of $F$ but not of $G$.
Theorem 2.27. Let $\alpha > \gamma - (1 + 1/\beta)/4$.
(i) If $G$ satisfies $(F_{s,\eta_G})$ for some $\eta_G > 0$ and $s_0\le s < s^*$, then $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are consistent.
(ii) If $G$ satisfies $(F_{s,\eta_G})$ for some $\eta_G > 1 + 1/\beta$ and $s_0\le s < s^*$, then all statements on the asymptotic properties of $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ from Theorem 2.11 transfer to the present case.
(iii) If $F+G$ satisfies $(F_{s,\eta_{F+G}})$ for some $\eta_{F+G} > 1 + 1/\beta$ and $s_0\le s < s^*$, then $\hat\theta^{\mathrm{lin}}_N$ is asymptotically normal as in (2.29).

Proof. Note that in any of these cases, $F+G$ satisfies $(F_{s,\eta})$ for $s_0\le s < s^*$ and $\eta = \eta_F\wedge\eta_G$, so (2.20) remains true. Thus, all claims follow from a straightforward modification of the proof of Theorem 2.11, taking into account the additional bias of the form (2.33), with $F$ replaced by $G$ therein, coming from the nonlinear term $G(X)$.

The excess regularity $\eta_{F+G}$ of $F+G$ clearly satisfies $\eta_{F+G}\ge\eta_F\wedge\eta_G$, but due to cancellation effects, $\eta_{F+G}$ may be larger than $\eta_F\wedge\eta_G$.
Remark 2.28.
(i) The preceding examples show that a large class of nonlinearities $G$ satisfies $(F_{s,\eta_G})$ for some $\eta_G > 0$.
(ii) As $G$ is assumed to be unknown (or intractable), it does not make sense to construct a modified estimator that takes into account the shift coming from $G$ in order to improve the convergence rate. Rather, $G$ and its impact on diffusivity estimation should be understood qualitatively.
(iii) The typical situation can be described as follows: Let $F_{\mathrm{true}}$ be the true nonlinearity of the underlying process, which is either unknown or too complex to be handled directly. Instead, an approximate nonlinear term $F_{\mathrm{approx}}$ is used to model the dynamics of $X$. In this case $F = F_{\mathrm{approx}}$, and $G = F_{\mathrm{true}} - F_{\mathrm{approx}}$ is the remainder that describes the deviation from the true model. The excess regularity $\eta_G$ associated with $G$ measures the quality of the approximate model $F_{\mathrm{approx}}$ for diffusivity estimation.
(iv) For example, $F_{\mathrm{approx}}$ may be the linearization of a nonlinear model $F_{\mathrm{true}}$. In this case, $\eta_G$ is related to the linearization procedure.
(v) If $X$ is the solution to a reaction–diffusion equation (with possible advection term) as in Section 2.4.2, the excess regularity of $G$ encodes the order of the model misspecification as a differential operator. For example, if only reaction terms (of order zero) are misspecified, but the advection mechanism (of order one) is known very precisely, then we have $\eta_G < 2$. If the description of the advection term is wrong, then $\eta_G < 1$.
(vi) In particular, for diffusivity estimation, precise knowledge on the trans-
port term is more important than precise knowledge on the reaction
term.
2.5 Numerical Illustration
We simulate the Allen-Cahn equation [CA77]
\[
dX_t = \theta\Delta X_t\,dt + (X_t - X_t^3)\,dt + (-\Delta)^{-\gamma}\,dW_t \tag{2.49}
\]
on $D = [0,1]$ with Dirichlet boundary conditions and initial condition $x\mapsto\sin(\pi x)$. We discretize the equation in Fourier space and simulate $N_0 = 100$ Fourier modes by a linear-implicit Euler scheme until $T = 1$. The temporal and spatial step sizes are set to $\Delta t = 2.5\times10^{-5}$ and $\Delta x = 5\times10^{-4}$, respectively. The diffusivity is given by $\theta = 0.02$. We generate $M = 1000$ Monte Carlo simulations for each of the choices $\gamma = 0.4$ and $\gamma = 0.8$. In either case, we set $\alpha = \gamma$. A detailed discussion on numerical simulation for SPDEs can be found in [LPS14].
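The simulation setup can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code used for Figure 2.1: the function name `simulate_and_estimate`, the collocation grid size `n_grid`, the reduced defaults ($N_0 = 20$ modes, $\Delta t = 10^{-4}$) and the approximation of the time integrals by Itô-type left-point sums are all choices made here for brevity. The estimators are the discretized versions of $\hat\theta^{\mathrm{lin}}_N$ and $\hat\theta^{\mathrm{full}}_N$ with $N = N_0$.

```python
import numpy as np


def simulate_and_estimate(theta=0.02, gamma=0.4, alpha=None, n_modes=20,
                          n_grid=128, dt=1e-4, T=1.0, seed=0):
    """Linear-implicit Euler scheme for (2.49) in the Dirichlet sine basis,
    returning discretized versions of (theta_full, theta_lin)."""
    alpha = gamma if alpha is None else alpha
    rng = np.random.default_rng(seed)
    k = np.arange(1, n_modes + 1)
    lam = (np.pi * k) ** 2                      # eigenvalues of -Laplacian on [0, 1]
    xg = np.arange(1, n_grid) / n_grid          # interior collocation points
    E = np.sqrt(2.0) * np.sin(np.pi * np.outer(k, xg))  # e_k(x_j)

    def reaction(coef):
        u = coef @ E                            # field values on the grid
        return (E @ (u - u ** 3)) / n_grid      # projection of u - u^3 onto the modes

    x = np.zeros(n_modes)
    x[0] = 1.0 / np.sqrt(2.0)                   # sin(pi x) = e_1 / sqrt(2)
    w1 = lam ** (1 + 2 * alpha)
    w2 = lam ** (2 + 2 * alpha)
    num_dx = num_f = den = 0.0
    for _ in range(int(round(T / dt))):
        f = reaction(x)
        dW = rng.normal(0.0, np.sqrt(dt), n_modes)
        # implicit in the linear part, explicit in reaction and noise
        x_new = (x + dt * f + lam ** (-gamma) * dW) / (1.0 + dt * theta * lam)
        num_dx += np.sum(w1 * x * (x_new - x))  # Ito sum for <(-A)^{1+2a} X, dX>
        num_f += np.sum(w1 * x * f) * dt        # <(-A)^{1+2a} X, P_N F(X)> dt
        den += np.sum(w2 * x ** 2) * dt         # |(-A)^{1+a} X|^2 dt
        x = x_new
    theta_lin = -num_dx / den
    theta_full = theta_lin + num_f / den
    return theta_full, theta_lin


full, lin = simulate_and_estimate(seed=1)
print(f"theta_full = {full:.4f}, theta_lin = {lin:.4f}")
```

Repeating the call over many seeds and several values of `n_modes` gives an empirical counterpart of the consistency and rate plots; the coarse defaults are meant for quick experiments only.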
By Theorem 2.20, all three estimators $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are asymptotically normal. In Figure 2.1, the simulation results concerning consistency, convergence rate and asymptotic distribution are shown. Whereas $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{part}}_N$ perform as predicted, $\hat\theta^{\mathrm{lin}}_N$ seems to exhibit non-asymptotic effects. Apparently, this depends on the impact of the noise on the dynamics, which is controlled by $\gamma$. In fact, the value of $\gamma$ has two effects: It determines the spatial regularity of the noise (and consequently of $X$), but it also has an impact on the overall noise intensity via the magnitude of $\lambda_1^{-\gamma}$, i.e. the largest eigenvalue of $(-\Delta)^{-\gamma}$. Our interpretation is that irregular noise from low values of $\gamma$ tends to mask the effect of the nonlinearity. Put differently, nonlinear effects have a larger impact under smooth noise with smaller amplitude.

We further mention that for even larger values of $\gamma$ (e.g. $\gamma = 1.3$), the estimated value from $\hat\theta^{\mathrm{lin}}_N$ is mostly negative and therefore not related to the true diffusivity. On the other hand, $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{part}}_N$ remain consistent. It is possible that this effect depends on the number of Fourier modes $N_0$ used in the simulation.
2.6 The Case of Systems
Consider a partially observed system of the form
\[
\begin{aligned}
dX^O_t &= \theta AX^O_t\,dt + F^O(X^O_t, X^U_t)\,dt + B^O\,dW^O_t,\\
dX^U_t &= F^U(X^O_t, X^U_t)\,dt + B^U\,dW^U_t,
\end{aligned} \tag{2.50}
\]
together with initial conditions $X^O_0$, $X^U_0$. Here, $X^O$ denotes the observed component and $X^U$ the unobserved component of the dynamics. We want to estimate the unknown diffusivity $\theta$ of the observed component.

More precisely, let $H^O$, $H^U$ be two Hilbert spaces, and let $A: D(A)\subset H^O\to H^O$ be a densely defined, closed, negative definite and self-adjoint operator on $H^O$ with compact resolvent, whose eigenvalue sequence $(\lambda_k)_{k\in\mathbb{N}}$ satisfies (2.2). $F^O$ and $F^U$ are nonlinear operators defined on a subset $D(F)$ of $H^O\times H^U$, with values in $H^O$ and $H^U$, respectively. $W^O$ and $W^U$ are independent cylindrical Wiener processes on $H^O$ and $H^U$, and $B^O$, $B^U$ are Hilbert–Schmidt operators on $H^O$ and $H^U$. We assume $B^O = \sigma^O(-A)^{-\gamma}$ for some $\gamma > 1/(2\beta)$, where $\sigma^O > 0$ is the noise intensity in the observed component.

$P_N: H^O\to H^O$ denotes the projection onto the span of the first $N$ eigenfunctions $\Phi_1,\dots,\Phi_N$ of $A$. We write $X = (X^O, X^U)$ and $H = H^O\times H^U$. Let $\mathcal{A}: D(A)\times H^U\to H$ be the operator given by $\mathcal{A}(x,y) = (Ax, 0)$, define $F: D(F)\to H$ by means of $F(u,v) = (F^O(u,v), F^U(u,v))$ and $B: H\to H$ via $B(u,v) = (B^Ou, B^Uv)$. Finally, $W = (W^O, W^U)$ is a cylindrical Wiener process on $H$. Then $X$ satisfies
\[
dX_t = \theta\mathcal{A}X_t\,dt + F(X_t)\,dt + B\,dW_t. \tag{2.51}
\]
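In order to capture the regularity of $X$, we extend the notation from Section 2.2 and set for $s\in\mathbb{R}$:
\[
\begin{aligned}
H_s &:= D((-A)^{s/2}), &(2.52)\\
\mathcal{H}_s &:= D((-\mathcal{A})^{s/2}) = D((-A)^{s/2})\times H^U, &(2.53)\\
R(s) &:= L^\infty(0,T;H_s), &(2.54)\\
\mathcal{R}(s) &:= L^\infty(0,T;\mathcal{H}_s). &(2.55)
\end{aligned}
\]
In analogy to condition $(F_{s,\eta})$, we need a condition on the observed part $F^O$ of $F$:

$(F^{\mathrm{sys}}_{s,\eta})$ There is $\epsilon > 0$ and a continuous increasing function $g:[0,\infty)\to[0,\infty)$ such that for all $X\in\mathcal{R}(s)$:
\[
\|F^O(X)\|_{R(s+\eta+\epsilon-2)} \le g\big(\|X\|_{\mathcal{R}(s)}\big). \tag{2.56}
\]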
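The splitting argument concerns only the observed part: We write $X^O = \bar X^O + \tilde X^O$, where $\bar X^O$, $\tilde X^O$ satisfy
\[
\begin{aligned}
d\bar X^O_t &= \theta A\bar X^O_t\,dt + B^O\,dW^O_t, &(2.57)\\
d\tilde X^O_t &= \theta A\tilde X^O_t\,dt + F^O(X)\,dt, &(2.58)
\end{aligned}
\]
with $\bar X^O_0 = 0$, $\tilde X^O_0 = X^O_0$.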
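In analogy to Proposition 2.3 and Proposition 2.4, we have

Proposition 2.29. Let $\eta > 0$. If $(F^{\mathrm{sys}}_{s,\eta})$ holds for $s\in\mathbb{R}$ such that a.s. $X\in\mathcal{R}(s)$ and $X^O_0\in H_{s+\eta}$, then $\tilde X^O\in R(s+\eta)$. In particular, if $s_0 < s^*$ such that $(F^{\mathrm{sys}}_{s,\eta})$ holds for $s_0\le s < s^*$, $\bar X^O\in R(s)$ for $s < s^*$, $X\in\mathcal{R}(s_0)$ and $X^O_0\in H_{s^*+\eta}$ almost surely, then $X\in\mathcal{R}(s)$ and $\tilde X^O\in R(s+\eta)$ for $s < s^*$.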
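Adapting the estimators from Section 2.3 to the present situation, we define
\[
\hat\theta^{\mathrm{full}}_N := -\frac{\int_0^T\langle(-A)^{1+2\alpha}P_NX^O_t, dP_NX^O_t\rangle_{H^O}}{\int_0^T\|(-A)^{1+\alpha}P_NX^O_t\|^2_{H^O}\,dt} + \frac{\int_0^T\langle(-A)^{1+2\alpha}P_NX^O_t, P_NF^O(X)\rangle_{H^O}\,dt}{\int_0^T\|(-A)^{1+\alpha}P_NX^O_t\|^2_{H^O}\,dt}.
\]
In fact, $\hat\theta^{\mathrm{full}}_N$ is not a function of the observed process $P_NX^O$ alone, as it depends on $X^U$ via $X$. Consequently, we define
\[
\begin{aligned}
\hat\theta^{\mathrm{part}}_N &:= -\frac{\int_0^T\langle(-A)^{1+2\alpha}P_NX^O_t, dP_NX^O_t\rangle_{H^O}}{\int_0^T\|(-A)^{1+\alpha}P_NX^O_t\|^2_{H^O}\,dt} + \frac{\int_0^T\langle(-A)^{1+2\alpha}P_NX^O_t, P_NF^O(P_NX^O_t, 0)\rangle_{H^O}\,dt}{\int_0^T\|(-A)^{1+\alpha}P_NX^O_t\|^2_{H^O}\,dt},\\
\hat\theta^{\mathrm{lin}}_N &:= -\frac{\int_0^T\langle(-A)^{1+2\alpha}P_NX^O_t, dP_NX^O_t\rangle_{H^O}}{\int_0^T\|(-A)^{1+\alpha}P_NX^O_t\|^2_{H^O}\,dt}.
\end{aligned}
\]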
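Analogously to Theorem 2.11, the following result is proven:

Theorem 2.30. Let $\gamma > 1/(2\beta)$ and $\eta > 0$, $s_0 < s^*$ such that $(F^{\mathrm{sys}}_{s,\eta})$ holds for $s_0\le s < s^*$. Assume a.s. $X\in\mathcal{R}(s_0)$ and $X^O_0\in H_{s^*+\eta}$. Let $\alpha > \gamma - (1 + 1/\beta)/4$. Then $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are strongly consistent as $N\to\infty$, and $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal as in (2.29). If $\eta > 1 + 1/\beta$, the same is true for $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$; otherwise $\hat\theta^{\mathrm{part}}_N = \theta + o(N^{-a})$ for each $a < \beta\eta/2$, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.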
Example 2.31. The theory from this section is applicable to a stochastic FitzHugh–Nagumo system [Fit61, NAY62], whose activator component is observed:
\[
\begin{aligned}
dU_t &= \theta\Delta U_t\,dt + \big(k_1U_t(1-U_t)(U_t-a) - k_2V_t\big)\,dt + B^O\,dW^O_t,\\
dV_t &= \epsilon(bU_t - V_t)\,dt + B^U\,dW^U_t,
\end{aligned}
\]
with initial condition $U_0$, $V_0$, on $D = [0,L]$, $L > 0$, with Neumann boundary conditions. The reaction parameters are $k_1, k_2, \epsilon, b > 0$ and $a\in(0,1)$. The state space for both components is $H^O = H^U = L^2(D)$. Assume⁴ that $1/2 < s^* < 2$ and $X = (U,V)\in\mathcal{R}(s)$ for $s < s^*$. We verify condition $(F^{\mathrm{sys}}_{s,\eta})$ for $F^O$. Here, $F^O(U,V) = k_1U(1-U)(U-a) - k_2V$. The first term is a polynomial that can be treated exactly as in Section 2.4.2, with obvious changes in the notation of the norm, resulting in $\eta < 2$. Concerning the second term $-k_2V$, we can find $\epsilon > 0$ such that $\|V\|_{s+\eta+\epsilon-2} = \|V\|_{L^2(D)}$ if and only if $\eta < 2-s$, and this equality leads to (2.56) as well. In total, $(F^{\mathrm{sys}}_{s,\eta})$ holds for $F^O$ with $\eta < 2-s$ for $1/2 < s < s^*$. This proves that $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal, and $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are consistent with convergence rate bounded by $N^{-a}$ for $a < 2-s^*$. This result can be refined if the optimal regularity of the inhibitor is taken into account. In fact, under regularity assumptions on the inhibitor noise, all three estimators will be asymptotically normal. We refer to Chapter 6, where a similar FitzHugh–Nagumo system is studied in greater detail.

Finally, we note that if $\sigma^O = 0$, i.e. if only the unobserved component is driven by noise, other methods need to be employed. We come back to that case in Remark 3.16 below.

⁴ In fact, $X\in\mathcal{R}(0)$ together with $U\in R_v(1)$ can be shown as in [SS15], and a direct modification of Proposition 2.18 gives $X\in\mathcal{R}(1)$ in this case. Note that under the Hilbert–Schmidt assumption $\gamma > d/4$, we necessarily have $s^* > 1$. Higher regularity in $\mathcal{R}(s)$ for all $s < s^*$ follows from $(F^{\mathrm{sys}}_{s,\eta})$.
Figure 2.1: Left column contains results for $\gamma = 0.4$, right column for $\gamma = 0.8$. (top row) Red line: Median from $M = 1000$ realizations of $\hat\theta^{\mathrm{full}}_N$. The blue region is bounded above by the 97.5-percentile and below by the 2.5-percentile. Black solid line is plotted at the true value $\theta = 0.02$, dashed line plotted at zero. (middle row) The mean squared error (MSE), given by $M^{-1}\sum_{k=1}^M(\hat\theta_N(k) - \theta)^2$, is plotted, where $\hat\theta_N(k)$ is the $k$-th realization of either of the estimators $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ or $\hat\theta^{\mathrm{lin}}_N$. Black line corresponds to the squared true theoretical rate $N\mapsto(\Sigma^{1/2}N^{-3/2})^2$, with $\Sigma$ from (2.30). (bottom row) Histogram for the standardized values $\Sigma^{-1/2}N^{3/2}(\hat\theta_N - \theta)$ at $N = 20$, where $\hat\theta_N$ is either of the three estimators. The width of each bin is 0.4. Outliers outside the interval $[-5,5]$ are put into the leftmost and rightmost bin, respectively.
Chapter 3
Extended Noise Models for the
Spectral Approach
In the last chapter, we studied semilinear SPDE models driven by spatially correlated but temporally white noise. Nonetheless, in many situations it is desirable to include temporal correlation in an SPDE. There are different ways to achieve this: A standard approach is using fractional
noise, as studied e.g. in [MP08, MT13, KM19] in the large time regime,
[TTV14, MKT19a, MKT19b, SST20] in a spatial and/or temporal infill
regime, or [CLP09, Kří20] in the spectral approach. Fractional noise im-
pacts the temporal regularity of the solution process and can be used to
model long-range dependence, see e.g. [Tud13] for a discussion of SPDEs
driven by such noise.
However, in applications, there are further common approaches to include temporal correlation, which have received little attention in the literature concerning statistical inference for SPDEs. As an important example, in models appearing in the biophysics literature [ASB18, FFAB20, MFF+20], integrated Ornstein–Uhlenbeck noise is used.¹ While the presence of certain dynamical
properties such as separation of phases or traveling waves may not be affected
by substituting Brownian noise by integrated Ornstein–Uhlenbeck noise (or
vice versa), the precise specification becomes important when it comes to the
quantitative analysis of data. Motivated by these works, we study the cases
of Ornstein–Uhlenbeck noise and integrated noise separately.
¹ As a motivation, note that the one-dimensional integrated Ornstein–Uhlenbeck process
serves as an alternative model (besides the Wiener process) for describing the movement
of a Brownian particle, see e.g. [HL84, Chapter 2] for a discussion.
Ornstein–Uhlenbeck noise is treated in Section 3.1. From the mathemat-
ical point of view, the statistical properties of Ornstein–Uhlenbeck driven
models have not yet been investigated. We study this model both from the
perspective of parameter estimation under Ornstein–Uhlenbeck assumption
as well as from the point of view of model misspecification, where white
noise is used in the description but the true dynamics is Ornstein–Uhlenbeck
driven.
Integrated noise is studied in Section 3.2. Its statistical analysis can
be reduced to the case of semimartingale-type noise. In addition, integrated
noise provides a simple example of a model that cannot be handled by simply
using the estimators from Chapter 2 without further modification.
Finally, in Section 3.3, we consider more general dispersion operators,
which allows us to handle a certain type of multiplicative noise.
3.1 The Case of Ornstein–Uhlenbeck Noise
In this section we consider a semilinear SPDE driven by Ornstein–Uhlenbeck noise. We develop a hierarchical estimation theory for the diffusivity $\theta$ and the temporal correlation decay $\mu$ and compare the results to the white noise case; in particular, we consider the case of model misspecification in the noise. Our setting in this section is as follows:
\[
\begin{aligned}
dX_t &= \theta AX_t\,dt + F(X)(t)\,dt + dV_t, &(3.1)\\
dV_t &= -\mu V_t\,dt + B\,dW_t, &(3.2)
\end{aligned}
\]
with initial conditions $X_0$ and $V_0$. Without loss of generality, we assume $V_0 = 0$. As before, $W$ is a cylindrical Wiener process, $B = \sigma(-A)^{-\gamma}$ for some $\gamma > 1/(2\beta)$ and $\sigma > 0$, and $\theta > 0$ is the diffusivity. Further, $\mu\in\mathbb{R}$ is an additional parameter related to the temporal correlation length of the driving noise $V$. In this section we always assume $\mu\neq 0$, otherwise the equations reduce to the white noise model from Section 2.3. Additionally, we assume w.l.o.g. that $\mu\neq\pm\theta\lambda_k$ for all $k\in\mathbb{N}$. (Otherwise replace $A$ with $A + \epsilon I$ for some $\epsilon > 0$, where $I: H\to H$ is the identity operator, and substitute $F$ by $F - \epsilon I$. The additional perturbation is of order zero.) This will be used in Lemma 3.2 below.
Remark 3.1. Our model is compatible with a different natural approach to Ornstein–Uhlenbeck driven SPDEs, namely:
\[
dX_t = \theta AX_t\,dt + F(X)(t)\,dt + B\,dW^{(\mu)}_t \tag{3.3}
\]
with initial condition $X_0$, where $W^{(\mu)}$ is a cylindrical Ornstein–Uhlenbeck process in the sense that $w^{(\mu,k)} := \langle W^{(\mu)},\Phi_k\rangle$ are independent Ornstein–Uhlenbeck processes of the form
\[
dw^{(\mu,k)}_t = -\mu w^{(\mu,k)}_t\,dt + dW^{(k)}_t \tag{3.4}
\]
for independent Wiener processes $(W^{(k)})_{k\in\mathbb{N}}$. If $B = \sigma(-A)^{-\gamma}$, this model can be reduced to (3.1), (3.2) by setting $V = BW^{(\mu)}$.
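The linearized model is given by
\[
\begin{aligned}
d\bar X_t &= \theta A\bar X_t\,dt + dV_t, &(3.5)\\
dV_t &= -\mu V_t\,dt + B\,dW_t &(3.6)
\end{aligned}
\]
with $\bar X_0 = V_0 = 0$. As in the previous chapter, we set $\tilde X := X - \bar X$, then $\tilde X$ satisfies the random PDE (2.5).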
3.1.1 Covariance Structure and Asymptotic Behavior
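As before, we set $\bar x^{(k)} = \langle\bar X,\Phi_k\rangle_H$ and $v^{(k)} = \langle V,\Phi_k\rangle_H$. The processes $(\bar x^{(k)}, v^{(k)})$, $k\in\mathbb{N}$, are independent centered Gaussian processes with
\[
\begin{aligned}
d\bar x^{(k)}_t &= \big(-\theta\lambda_k\bar x^{(k)}_t - \mu v^{(k)}_t\big)\,dt + \sigma\lambda_k^{-\gamma}\,dW^{(k)}_t, &(3.7)\\
dv^{(k)}_t &= -\mu v^{(k)}_t\,dt + \sigma\lambda_k^{-\gamma}\,dW^{(k)}_t, &(3.8)
\end{aligned}
\]
and $\bar x^{(k)}_0 = v^{(k)}_0 = 0$, where $(W^{(k)})_{k\in\mathbb{N}}$ are independent Brownian motions.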
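Lemma 3.2. With $\mu\neq 0$ and $\mu\neq\pm\theta\lambda_k$, we have the explicit representation
\[
\begin{aligned}
\bar x^{(k)}_t &= \frac{\sigma\lambda_k^{-\gamma}}{\theta\lambda_k-\mu}\int_0^t\Big(\theta\lambda_ke^{-\theta\lambda_k(t-r)} - \mu e^{-\mu(t-r)}\Big)\,dW^{(k)}_r, &(3.9)\\
v^{(k)}_t &= \sigma\lambda_k^{-\gamma}\int_0^te^{-\mu(t-r)}\,dW^{(k)}_r. &(3.10)
\end{aligned}
\]
Furthermore, for $0\le r\le t$,
\[
\begin{aligned}
E[\bar x^{(k)}_r\bar x^{(k)}_t] &= \frac{\sigma^2\lambda_k^{-2\gamma}}{(\theta\lambda_k-\mu)^2}\Big(\frac{\theta\lambda_k}{2}\big(e^{-\theta\lambda_k(t-r)} - e^{-\theta\lambda_k(t+r)}\big) + \frac{\mu}{2}\big(e^{-\mu(t-r)} - e^{-\mu(t+r)}\big)\\
&\qquad + \frac{\mu\theta\lambda_k}{\theta\lambda_k+\mu}\big(e^{-\mu t-\theta\lambda_kr} + e^{-\theta\lambda_kt-\mu r} - e^{-\mu(t-r)} - e^{-\theta\lambda_k(t-r)}\big)\Big),\\
E[v^{(k)}_rv^{(k)}_t] &= \frac{\sigma^2\lambda_k^{-2\gamma}}{2\mu}\big(e^{-\mu(t-r)} - e^{-\mu(t+r)}\big),\\
E[v^{(k)}_r\bar x^{(k)}_t] &= \frac{\sigma^2\lambda_k^{-2\gamma}}{\theta\lambda_k-\mu}\Big(\frac{\theta\lambda_k}{\theta\lambda_k+\mu}\big(e^{-\theta\lambda_k(t-r)} - e^{-\theta\lambda_kt-\mu r}\big) - \frac12\big(e^{-\mu(t-r)} - e^{-\mu(t+r)}\big)\Big),\\
E[\bar x^{(k)}_rv^{(k)}_t] &= \frac{\sigma^2\lambda_k^{-2\gamma}}{\theta\lambda_k-\mu}\Big(\frac{\theta\lambda_k}{\theta\lambda_k+\mu}\big(e^{-\mu(t-r)} - e^{-\mu t-\theta\lambda_kr}\big) - \frac12\big(e^{-\mu(t-r)} - e^{-\mu(t+r)}\big)\Big).
\end{aligned}
\]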
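Proof. Fix $k\in\mathbb{N}$. With $Z = (\bar x^{(k)}, v^{(k)})^T$ we have $dZ_t = A_ZZ_t\,dt + B_Z\,dW^{(k)}_t$, where $A_Z:\mathbb{R}^2\to\mathbb{R}^2$, $B_Z:\mathbb{R}\to\mathbb{R}^2$ are linear mappings given by
\[
A_Z = \begin{pmatrix}-\theta\lambda_k & -\mu\\ 0 & -\mu\end{pmatrix},\qquad B_Z = \sigma\lambda_k^{-\gamma}\begin{pmatrix}1\\ 1\end{pmatrix}.
\]
It is straightforward to verify that $A_Z = SD_ZS^{-1}$ with
\[
D_Z = \begin{pmatrix}-\theta\lambda_k & 0\\ 0 & -\mu\end{pmatrix},\qquad S = \begin{pmatrix}1 & \mu\\ 0 & -\theta\lambda_k+\mu\end{pmatrix}.
\]
It follows that for $t\in\mathbb{R}$,
\[
e^{tA_Z}B_Z = Se^{tD_Z}S^{-1}B_Z = \frac{\sigma\lambda_k^{-\gamma}}{\theta\lambda_k-\mu}\begin{pmatrix}\theta\lambda_ke^{-\theta\lambda_kt} - \mu e^{-\mu t}\\ (\theta\lambda_k-\mu)e^{-\mu t}\end{pmatrix}.
\]
Now, (3.9), (3.10) are clear from $Z_t = \int_0^te^{(t-r)A_Z}B_Z\,dW^{(k)}_r$, and the covariance terms follow from Itô's isometry.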
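In particular, we have
\[
\begin{aligned}
E[(\bar x^{(k)}_t)^2] &= \frac{\sigma^2\lambda_k^{-2\gamma}}{(\theta\lambda_k-\mu)^2}\Big(\frac{\theta\lambda_k}{2}\big(1-e^{-2\theta\lambda_kt}\big) + \frac{\mu}{2}\big(1-e^{-2\mu t}\big) - \frac{2\mu\theta\lambda_k}{\theta\lambda_k+\mu}\big(1-e^{-(\theta\lambda_k+\mu)t}\big)\Big), &(3.11)\\
E[(v^{(k)}_t)^2] &= \frac{\sigma^2\lambda_k^{-2\gamma}}{2\mu}\big(1-e^{-2\mu t}\big), &(3.12)\\
E[v^{(k)}_t\bar x^{(k)}_t] &= \frac{\sigma^2\lambda_k^{-2\gamma}}{\theta\lambda_k-\mu}\Big(\frac{\theta\lambda_k}{\theta\lambda_k+\mu}\big(1-e^{-(\theta\lambda_k+\mu)t}\big) - \frac12\big(1-e^{-2\mu t}\big)\Big). &(3.13)
\end{aligned}
\]
From the above calculations (or the elementary observation that each $v^{(k)}$ is a classical one-dimensional Ornstein–Uhlenbeck process) it follows that $\lim_{t\to\infty}E[v^{(k)}_tv^{(k)}_{t+d}]/(E[(v^{(k)}_t)^2]E[(v^{(k)}_{t+d})^2])^{1/2} = e^{-\mu d}$ for $d\ge 0$, so $\mu$ describes the rate of exponential decay of the autocorrelation function of each noise mode in the stationary regime. Hence the name "temporal correlation decay" for $\mu$.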
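As a sanity check on (3.11), the closed form can be compared with a direct quadrature of the squared kernel from (3.9), which is what Itô's isometry prescribes. The sketch below is illustrative only: the function names, the midpoint rule and the sample parameter values are assumptions made here.

```python
import numpy as np


def var_xbar(t, theta, mu, lam, sigma=1.0, gamma=0.4):
    """Closed form (3.11) for E[(xbar_t^(k))^2], with lam = lambda_k."""
    a = theta * lam
    pref = sigma ** 2 * lam ** (-2 * gamma) / (a - mu) ** 2
    return pref * (a / 2 * (1 - np.exp(-2 * a * t))
                   + mu / 2 * (1 - np.exp(-2 * mu * t))
                   - 2 * mu * a / (a + mu) * (1 - np.exp(-(a + mu) * t)))


def var_xbar_quadrature(t, theta, mu, lam, sigma=1.0, gamma=0.4, n=200_000):
    """Ito isometry: integrate the squared kernel of (3.9) by the midpoint rule."""
    a = theta * lam
    r = (np.arange(n) + 0.5) * t / n
    kernel = sigma * lam ** (-gamma) / (a - mu) * (
        a * np.exp(-a * (t - r)) - mu * np.exp(-mu * (t - r)))
    return np.sum(kernel ** 2) * t / n


# Illustrative values (assumptions): third Dirichlet mode on [0, 1], mu = 1.5.
args = dict(t=1.0, theta=0.02, mu=1.5, lam=(3 * np.pi) ** 2)
print(var_xbar(**args), var_xbar_quadrature(**args))
```

The two numbers should agree up to quadrature error; the same pattern can be used for (3.12) and (3.13) with the corresponding kernels.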
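Lemma 3.3. It holds a.s. $\bar X\in R(s)$ for any $s < s^* := 1 + 2\gamma - 1/\beta$.

Proof. The reasoning is similar to that in Lemma 2.7. Define $C_1, C_2: H\to H$ via $C_1\Phi_k := [\sigma\theta\lambda_k^{-\gamma+1}/(\theta\lambda_k-\mu)]\Phi_k$ and $C_2\Phi_k := -[\mu\sigma\lambda_k^{-\gamma}/(\theta\lambda_k-\mu)]\Phi_k$. Note that both operators are of Hilbert–Schmidt type due to $\gamma > 1/(2\beta)$. Using (3.9), we write $\bar X_t = \int_0^te^{(t-r)\theta A}C_1\,dW_r + \int_0^te^{-\mu(t-r)}C_2\,dW_r =: \bar X^{(1)}_t + \bar X^{(2)}_t$, where, as before, $t\mapsto e^{t\theta A}$ is the $C_0$-semigroup generated by $\theta A$. We prove the claim for both stochastic integrals separately: For $s < s^* = 1 + 2\gamma - 1/\beta$ and $0 < \alpha < \min\{1/2, (s^*-s)/2\}$,
\[
\begin{aligned}
\int_0^Tt^{-2\alpha}\|(-A)^{s/2}e^{t\theta A}C_1\|^2_{HS}\,dt &= \sigma^2\theta^2\sum_{k=1}^\infty\frac{\lambda_k^{s-2\gamma+2}}{(\theta\lambda_k-\mu)^2}\int_0^Tt^{-2\alpha}e^{-2\theta\lambda_kt}\,dt\\
&\lesssim \sum_{k=1}^\infty\lambda_k^{s-2\gamma}\int_0^\infty\lambda_k^{2\alpha-1}r^{-2\alpha}e^{-r}\,dr\\
&\lesssim \sum_{k=1}^\infty k^{\beta(s+2\alpha-2\gamma-1)} < \infty.
\end{aligned}
\]
By [DPZ14, Theorem 5.11], $(-A)^{s/2}\bar X^{(1)}\in C(0,T;H)$, i.e. $\bar X^{(1)}\in R(s)$ almost surely. With regard to $\bar X^{(2)}$, we have for any $s < s^*+1 = 2 + 2\gamma - 1/\beta$ and $0 < \alpha < 1/2$,
\[
\begin{aligned}
\int_0^Tt^{-2\alpha}\|(-A)^{s/2}e^{-\mu t}C_2\|^2_{HS}\,dt &= \mu^2\sigma^2\int_0^Tt^{-2\alpha}e^{-2\mu t}\,dt\sum_{k=1}^\infty\frac{\lambda_k^{s-2\gamma}}{(\theta\lambda_k-\mu)^2}\\
&\lesssim \sum_{k=1}^\infty\lambda_k^{s-2\gamma-2} \lesssim \sum_{k=1}^\infty k^{\beta(s-2\gamma-2)} < \infty,
\end{aligned}
\]
and consequently, again by [DPZ14, Theorem 5.11], it follows that a.s. $\bar X^{(2)}\in R(s)$. This proves the claim.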
The next proposition implies that a.s. $\bar X\notin L^2(0,T;H_s)$ for any $s > s^*$. In particular, the optimal regularity $s^*$ is the same as in the white noise case from Section 2.3.
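Proposition 3.4. Set $C^{(\pm)}_T := T \pm (1-e^{-2\mu T})/(2\mu)$.

(i) As $k\to\infty$, we have the following asymptotic expansions:
\[
\begin{aligned}
E\int_0^T(\bar x^{(k)}_t)^2\,dt &\sim \frac{\sigma^2T}{2\theta}\lambda_k^{-2\gamma-1}, &(3.14)\\
E\int_0^T(v^{(k)}_t)^2\,dt &\sim \frac{\sigma^2C^{(-)}_T}{2\mu}\lambda_k^{-2\gamma}, &(3.15)\\
E\int_0^Tv^{(k)}_t\bar x^{(k)}_t\,dt &\sim \frac{\sigma^2C^{(+)}_T}{2\theta}\lambda_k^{-2\gamma-1}, &(3.16)\\
E\int_0^T\Big(\int_0^t\bar x^{(k)}_r\,dr\Big)^2dt &\sim \frac{\sigma^2C^{(-)}_T}{2\mu\theta^2}\lambda_k^{-2\gamma-2}, &(3.17)\\
E\int_0^T\bar x^{(k)}_t\int_0^t\bar x^{(k)}_r\,dr\,dt &\sim \frac{\sigma^2(1-e^{-2\mu T})}{4\mu\theta^2}\lambda_k^{-2\gamma-2}. &(3.18)
\end{aligned}
\]

(ii) As $N\to\infty$, we have a.s.
\[
\begin{aligned}
\int_0^T\|(-A)^{s/2}\bar X^N_t\|^2\,dt &\sim \frac{\sigma^2T\Lambda^{s-2\gamma-1}}{2\theta(1+\beta(s-2\gamma-1))}N^{1+\beta(s-2\gamma-1)}, &(3.19)\\
\int_0^T\|(-A)^{s/2}V^N_t\|^2\,dt &\sim \frac{\sigma^2C^{(-)}_T\Lambda^{s-2\gamma}}{2\mu(1+\beta(s-2\gamma))}N^{1+\beta(s-2\gamma)}, &(3.20)\\
\int_0^T\langle(-A)^sV^N_t,\bar X^N_t\rangle\,dt &\sim \frac{\sigma^2C^{(+)}_T\Lambda^{s-2\gamma-1}}{2\theta(1+\beta(s-2\gamma-1))}N^{1+\beta(s-2\gamma-1)}, &(3.21)\\
\int_0^T\Big\|\int_0^t(-A)^{s/2}\bar X^N_r\,dr\Big\|^2dt &\sim \frac{\sigma^2C^{(-)}_T\Lambda^{s-2\gamma-2}}{2\mu\theta^2(1+\beta(s-2\gamma-2))}N^{1+\beta(s-2\gamma-2)}, &(3.22)\\
\int_0^T\Big\langle(-A)^s\bar X^N_t,\int_0^t\bar X^N_r\,dr\Big\rangle\,dt &\sim \frac{\sigma^2(1-e^{-2\mu T})\Lambda^{s-2\gamma-2}}{4\mu\theta^2(1+\beta(s-2\gamma-2))}N^{1+\beta(s-2\gamma-2)}, &(3.23)
\end{aligned}
\]
whenever $s$ is such that the right-hand side diverges. All statements remain true if the left-hand side is replaced by its expected value.

(iii) Let $\eta > 0$ and $s_0 < s^* = 1 + 2\gamma - 1/\beta$. Assume $X_0\in H_{s^*+\eta}$. If $F$ satisfies $(F_{s,\eta})$ for all $s_0\le s < s^*$ and $X\in R(s_0)$ a.s., then (3.19) and (3.21) remain true if $\bar X^N$ is replaced by $X^N$.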
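The passage from the mode-wise expansion (3.14) to the rate in (3.19) rests on the elementary asymptotics $\sum_{k\le N}k^p\sim N^{1+p}/(1+p)$ for $p > -1$. The following sketch illustrates this for expected values only; it assumes the idealized spectrum $\lambda_k = \Lambda k^\beta$ exactly (the text only requires this asymptotically), and the function names and parameter values are illustrative.

```python
import numpy as np


def partial_sum_3_14(N, s, theta, sigma, T, Lam, beta, gamma):
    """Sum over k <= N of lambda_k^s times the leading term of (3.14),
    using the idealized spectrum lambda_k = Lam * k**beta (an assumption)."""
    k = np.arange(1, N + 1, dtype=float)
    lam = Lam * k ** beta
    return np.sum(lam ** s * sigma ** 2 * T / (2 * theta) * lam ** (-2 * gamma - 1))


def rhs_3_19(N, s, theta, sigma, T, Lam, beta, gamma):
    """Right-hand side of (3.19); requires a divergent exponent 1 + p > 0."""
    p = beta * (s - 2 * gamma - 1)
    if 1 + p <= 0:
        raise ValueError("right-hand side of (3.19) does not diverge for this s")
    return (sigma ** 2 * T * Lam ** (s - 2 * gamma - 1)
            / (2 * theta * (1 + p)) * N ** (1 + p))


# Illustrative values (assumptions): beta = 2, alpha = gamma = 0.4, s = 2 + 2*alpha.
params = dict(s=2.8, theta=0.02, sigma=1.0, T=1.0, Lam=np.pi ** 2, beta=2.0, gamma=0.4)
for N in (10, 100, 10_000):
    print(N, partial_sum_3_14(N, **params) / rhs_3_19(N, **params))
```

The printed ratios approach one as $N$ grows, in line with the claim that the statements hold for the expected values.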
Remark 3.5. Comparing (3.19) and (3.22), we see that $\int_0^\cdot\bar X_r\,dr$ exhibits more spatial regularity than $\bar X$, namely one derivative in the scale of Sobolev spaces $(H_s)_{s\in\mathbb{R}}$. From the point of view of a deterministic heat equation, one may expect that one temporal derivative corresponds to two spatial derivatives. This does not apply here due to nontrivial interactions with the noise.
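Proof.

(i) First, (3.14), (3.15) and (3.16) follow from integrating the expressions (3.11), (3.12) and (3.13). Further, (3.17) and (3.18) are a direct consequence of $\int_0^t\bar x^{(k)}_r\,dr = (v^{(k)}_t - \bar x^{(k)}_t)/(\theta\lambda_k)$ and (3.14), (3.15) and (3.16).

(ii) First, if the left-hand side is replaced by its expected value, we use (i) together with $\lambda_k\sim\Lambda k^\beta$ and the series expansion of every term, for example,
\[
E\int_0^T\langle(-A)^sV^N_t,\bar X^N_t\rangle\,dt = \sum_{k=1}^N\lambda_k^sE\int_0^Tv^{(k)}_t\bar x^{(k)}_t\,dt, \tag{3.24}
\]
and similar expansions for the other terms. The claim is immediate in that case. It remains to prove that the claim is still true for every realization of the left-hand side outside a set of measure zero. Now, (3.19) and (3.20) follow directly from Lemma A.2 (ii), setting $X_k(t) = \lambda_k^{s/2}\bar x^{(k)}_t$ and $X_k(t) = \lambda_k^{s/2}v^{(k)}_t$, respectively, in the notation therein. For (3.21), the argument is similar: From Lemma 3.2, we see that
\[
\begin{aligned}
A_k &:= \sup_{0\le r,t\le T}\big|E[v^{(k)}_rv^{(k)}_t]\big| \lesssim \lambda_k^{-2\gamma},\\
B_k &:= \sup_{0\le r,t\le T}\big|E[v^{(k)}_r\bar x^{(k)}_t]\big| \lesssim \lambda_k^{-2\gamma-1},\\
C_k &:= \sup_{0\le t\le T}\int_0^t\big|E[\bar x^{(k)}_r\bar x^{(k)}_t]\big|\,dr \lesssim \lambda_k^{-2\gamma-2}.
\end{aligned}
\]
Set $Y_k = \lambda_k^s\int_0^Tv^{(k)}_t\bar x^{(k)}_t\,dt$. Then by means of the Wick theorem [Jan97, Theorem 1.28], applied to the mixed moment $E[v^{(k)}_tv^{(k)}_r\bar x^{(k)}_t\bar x^{(k)}_r]$,
\[
\begin{aligned}
\mathrm{Var}(Y_k) &= \lambda_k^{2s}\int_0^T\int_0^T\Big(E[v^{(k)}_tv^{(k)}_r\bar x^{(k)}_t\bar x^{(k)}_r] - E[v^{(k)}_t\bar x^{(k)}_t]E[v^{(k)}_r\bar x^{(k)}_r]\Big)\,dr\,dt\\
&= 2\lambda_k^{2s}\int_0^T\int_0^t\Big(E[v^{(k)}_tv^{(k)}_r]E[\bar x^{(k)}_t\bar x^{(k)}_r] + E[v^{(k)}_t\bar x^{(k)}_r]E[v^{(k)}_r\bar x^{(k)}_t]\Big)\,dr\,dt\\
&\le 2\lambda_k^{2s}\big(TA_kC_k + T^2B_k^2/2\big) \lesssim \lambda_k^{2s-4\gamma-2}.
\end{aligned}
\]
Now, we see that
\[
\sum_{N=1}^\infty\frac{\mathrm{Var}(Y_N)}{\big(\sum_{k=1}^NEY_k\big)^2} \lesssim \sum_{N=1}^\infty\frac{1}{N^2} < \infty,
\]
and (3.21) follows from the strong law of large numbers [Shi96, Theorem IV.3.2]. Now, (3.22) and (3.23) follow from (3.19), (3.20), (3.21) via
\[
\theta\int_0^tA\bar X^N_r\,dr = \bar X^N_t - V^N_t. \tag{3.25}
\]

(iii) By Lemma 3.3, condition $(F_{s,\eta})$ for $F$ and Proposition 2.4, $\tilde X\in R(s+\eta)$ for each $s < s^*$. The analogue of (3.19) follows as in the white noise case in Proposition 2.8. For the analogue of (3.21), let $s > s^*$ and $\epsilon > 0$ with $\eta - 2(s-s^*) < \epsilon < \eta$. Then
\[
\begin{aligned}
\Big|\int_0^T\big\langle(-A)^sV^N_t,\tilde X^N_t\big\rangle\,dt\Big| &\le \sqrt{\int_0^T\|V^N_t\|^2_{2s-s^*-\eta+\epsilon}\,dt}\,\sqrt{\int_0^T\|\tilde X^N_t\|^2_{s^*+\eta-\epsilon}\,dt}\\
&\lesssim N^{(1+\beta(2s-s^*-\eta+\epsilon-2\gamma))/2} = N^{1+\beta(s-2\gamma-1-\eta/2+\epsilon/2)}
\end{aligned}
\]
by (3.20). Note that the latter exponent is positive due to $\epsilon > \eta - 2(s-s^*)$. Furthermore, $1+\beta(s-2\gamma-1-\eta/2+\epsilon/2) < 1+\beta(s-2\gamma-1)$, such that
\[
\begin{aligned}
\int_0^T\langle V^N_t, X^N_t\rangle_s\,dt &= \int_0^T\langle V^N_t,\bar X^N_t\rangle_s\,dt + \int_0^T\big\langle V^N_t,\tilde X^N_t\big\rangle_s\,dt\\
&\sim \frac{\sigma^2C^{(+)}_T\Lambda^{s-2\gamma-1}}{2\theta(1+\beta(s-2\gamma-1))}N^{1+\beta(s-2\gamma-1)}.
\end{aligned}
\]
This concludes the proof.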
Let $J$ denote the Bochner integral operator, i.e. $JZ(t) = \int_0^tZ_r\,dr$ for $Z\in L^1(0,T;H_s)$, $s\in\mathbb{R}$. It is desirable to transfer also (3.22) to the nonlinear case, i.e. to substitute $\bar X$ by $X$ therein. In order to do so, we have to strengthen the condition on $F$. In addition to $(F_{s,\eta})$ for $F$, we need:

$(F^J_{s,\eta})$ One of the following two conditions holds:
(i) $F$ satisfies $(F_{s,1+\eta})$.
(ii) There is an operator $G$ that satisfies $(F_{s,\eta})$ such that $JF = GJ$.
Lemma 3.6. Let $\eta > 0$ and $s_0 < s^*$ such that $F$ satisfies $(F_{s,\eta})$ and $(F^J_{s,\eta})$ for all $s_0\le s < s^*$. Assume a.s. $X\in R(s_0)$ and $X_0\in H_{s^*+\eta}$. Then (3.22) remains true if $\bar X^N$ is replaced by $X^N$. Furthermore, in this case, $(JF)(X)\in R(s-1+\eta)$ for $s < s^*$.
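Proof. Lemma 3.3 and Proposition 2.4 yield $\tilde X\in R(s+\eta)$ and therefore also $J\tilde X\in R(s+\eta)$ for $s < s^*$. We distinguish the two cases from $(F^J_{s,\eta})$ and prove that $J\tilde X\in R(s+1+\eta)$, $s < s^*$, in either case:

(i) If, in fact, $F$ satisfies even $(F_{s,1+\eta})$, another application of Proposition 2.4 proves $\tilde X, J\tilde X\in R(s+1+\eta)$ for $s < s^*$.

(ii) If $JF = GJ$, where $G$ satisfies $(F_{s,\eta})$, we proceed as in the proof of Proposition 2.3: We know that $J\bar X\in R(s)$ for any $s < s^*+1$ due to Proposition 3.4. Let $s < s^*+1$ such that $J\tilde X\in R(s)$, this is the case e.g. for $s = s^*$. Then also $JX\in R(s)$, and
\[
\begin{aligned}
\|J\tilde X^N_t\|_{s+\eta} &\le \int_0^t\big\|e^{(t-r)\theta A}\big(P_N(JF)(X)(r) + X^N_0\big)\big\|_{s+\eta}\,dr\\
&\lesssim \int_0^t(t-r)^{-1+\epsilon/2}\big\|(JF)(X)(r) + X_0\big\|_{s-2+\eta+\epsilon}\,dr\\
&\le \Big(\sup_{0\le r\le T}\|(GJ)(X)(r)\|_{s-2+\eta+\epsilon} + \|X_0\|_{s+\eta}\Big)\int_0^t(t-r)^{-1+\epsilon/2}\,dr\\
&\lesssim \Big(\sup_{0\le r\le T}\|JX_r\|_s + \|X_0\|_{s+\eta}\Big)\frac{2}{\epsilon}T^{\epsilon/2} < \infty,
\end{aligned}
\]
so $J\tilde X\in R(s+\eta)$. Iterating this argument proves $J\tilde X\in R(s+1+\eta)$ for all $s < s^*$.

In particular, for $s > 2 + 2\gamma - 1/\beta$ and any $\epsilon > 0$:
\[
\begin{aligned}
\int_0^T\Big\|\int_0^t(-A)^{s/2}\tilde X^N_r\,dr\Big\|^2dt &\le \lambda_N^{s-s^*-1-\eta+\epsilon}\int_0^T\Big\|\int_0^t(-A)^{(s^*+1+\eta-\epsilon)/2}\tilde X^N_r\,dr\Big\|^2dt\\
&\lesssim \lambda_N^{s-2-2\gamma+1/\beta-\eta+\epsilon} \lesssim N^{1+\beta(s-2\gamma-2-\eta+\epsilon)},
\end{aligned}
\]
where we assume w.l.o.g. that the exponent is positive. As a consequence,
\[
\int_0^T\Big\|\int_0^t(-A)^{s/2}X^N_r\,dr\Big\|^2dt \sim \int_0^T\Big\|\int_0^t(-A)^{s/2}\bar X^N_r\,dr\Big\|^2dt,
\]
and (3.22) holds with $\bar X$ replaced by $X$. Finally, $AJ\tilde X\in R(s-1+\eta)$, thus $JF(X) = \tilde X - \theta AJ\tilde X - X_0\in R(s-1+\eta)$ for $s < s^*$.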
3.1.2 The Maximum–Likelihood Approach
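Heuristically, the log-likelihood is given by [LS77, Section 7.6.4]:
\[
\begin{aligned}
\ln\frac{dP^{N,T}_{(\theta,\mu)}}{dP^{N,T}_{(\theta_0,\mu_0)}}(X^N) &= \frac{1}{\sigma^2}\int_0^T\big\langle a(\theta,\mu) - a(\theta_0,\mu_0), (-A)^{2\gamma}\,dX^N_t\big\rangle\\
&\quad - \frac{1}{2\sigma^2}\int_0^T\big\langle a(\theta,\mu) - a(\theta_0,\mu_0), (-A)^{2\gamma}\big(a(\theta,\mu) + a(\theta_0,\mu_0)\big)\big\rangle\,dt\\
&\quad - \frac{1}{\sigma^2}\int_0^T\big\langle a(\theta,\mu) - a(\theta_0,\mu_0), (-A)^{2\gamma}P_NF(X)(t)\big\rangle\,dt,
\end{aligned} \tag{3.26}
\]
where we abbreviate
\[
a(\theta,\mu) = \theta AX^N_t - \mu X^N_t + \mu X^N_0 + \mu\theta\int_0^tAX^N_r\,dr + \mu\int_0^tP_NF(X)(r)\,dr.
\]
As before, this is rigorous if $P_NF = FP_N$. Maximizing for the unknown parameter $\theta$ for known $\mu$ yields the maximum likelihood-type estimator, written with the abbreviation $Z^{N,\kappa}_t := (-A)^{\kappa}X^N_t + \mu\int_0^t(-A)^{\kappa}X^N_r\,dr$:
\[
\begin{aligned}
\hat\theta^{\mathrm{ref}}_N &= -\frac{\int_0^T\big\langle Z^{N,1+2\alpha}_t, dX^N_t\big\rangle}{\int_0^T\|Z^{N,1+\alpha}_t\|^2\,dt} - \mu\frac{\int_0^T\big\langle Z^{N,1+2\alpha}_t, X^N_t - X^N_0 - \int_0^tP_NF(X)(r)\,dr\big\rangle\,dt}{\int_0^T\|Z^{N,1+\alpha}_t\|^2\,dt}\\
&\quad + \frac{\int_0^T\big\langle Z^{N,1+2\alpha}_t, P_NF(X)(t)\big\rangle\,dt}{\int_0^T\|Z^{N,1+\alpha}_t\|^2\,dt},
\end{aligned} \tag{3.27}
\]
whereas maximizing for unknown $\mu$ and known $\theta$ yields
\[
\hat\mu^{\mathrm{ref}}_N = -\frac{\int_0^T\langle(-A)^{2\alpha}V^N_t, dX^N_t\rangle}{\int_0^T\|(-A)^\alpha V^N_t\|^2\,dt} - \theta\frac{\int_0^T\langle(-A)^{1+2\alpha}V^N_t, X^N_t\rangle\,dt}{\int_0^T\|(-A)^\alpha V^N_t\|^2\,dt} + \frac{\int_0^T\langle(-A)^{2\alpha}V^N_t, P_NF(X)(t)\rangle\,dt}{\int_0^T\|(-A)^\alpha V^N_t\|^2\,dt}, \tag{3.28}
\]
where
\[
V^N_t = X^N_t - X^N_0 - \theta\int_0^tAX^N_r\,dr - \int_0^tP_NF(X)(r)\,dr
\]
is a functional of $X^N$ and $P_NF(X)$.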
In both estimators, we substituted $\gamma$ by a contrast parameter $\alpha\in\mathbb{R}$, as before. Clearly, setting $\theta = \hat\theta^{\mathrm{ref}}_N$ and $\mu = \hat\mu^{\mathrm{ref}}_N$ in the above expressions leads to a (nonlinear) system of equations for the maximum likelihood estimators in the case that both parameters are unknown. However, we are interested in a hierarchical approach of first estimating $\theta$ independently of $\mu$ and then estimating $\mu$ based on the previous estimator of $\theta$, exploiting the asymptotic properties of the terms appearing in $\hat\theta^{\mathrm{ref}}_N$ and $\hat\mu^{\mathrm{ref}}_N$. This will be explained in detail in the following sections. The hierarchical approach is insightful for two reasons:

(i) From the point of view of model misspecification, the diffusivity estimators from Section 2.3 still work if the driving noise $V$ exhibits temporal correlation which is not accounted for in the model, as long as the temporal regularity of $X$ is not affected (as it is in the case of fractional or integrated noise, cf. Section 3.2).

(ii) The hierarchical approach is (at least asymptotically) as good as the direct approach in the following sense: Let $A = \Delta$ be the Laplacian. In dimension $d = 1$, Theorem 3.7 below shows that the hierarchical approach leads to an estimator for $\theta$ which is agnostic to $\mu$, and which has the same asymptotic properties as the reference estimator $\hat\theta^{\mathrm{ref}}_N$ with known $\mu$. In $d = 2$, the optimal convergence rate is still preserved. Further, by Theorem 3.10 below, a hierarchical estimator for $\mu$ behaves asymptotically as $\hat\mu^{\mathrm{ref}}_N$ with known $\theta$ whenever $d\le 3$.
3.1.3 Diffusivity Estimation
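As explained above, it is reasonable to consider a simplified estimator for $\theta$ which is obtained by formally setting $\mu = 0$ in $\hat\theta^{\mathrm{ref}}_N$:
\[
\hat\theta^{\mathrm{sim}}_N = -\frac{\int_0^T\langle(-A)^{1+2\alpha}X^N_t, dX^N_t\rangle}{\int_0^T\|(-A)^{1+\alpha}X^N_t\|^2\,dt} + \frac{\int_0^T\langle(-A)^{1+2\alpha}X^N_t, P_NF(X)(t)\rangle\,dt}{\int_0^T\|(-A)^{1+\alpha}X^N_t\|^2\,dt}. \tag{3.29}
\]
Formally, this is the same estimator as $\hat\theta^{\mathrm{full}}_N$ in the white noise setting. In particular, $\hat\theta^{\mathrm{sim}}_N$ does not depend on $\mu$. Of course, the other estimators $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ from Section 2.3 can be used, too, with conditions on the excess regularity $\eta$ of $F$ as in Theorem 2.11.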
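Theorem 3.7. Let $\gamma > 1/(2\beta)$. Assume that there is $\eta > 0$ and $s_0 < s^*$ such that $(F_{s,\eta})$ is true for $s_0\le s < s^*$, and that $X\in R(s_0)$ and $X_0\in H_{s^*+\eta}$ a.s. Let $\alpha > \gamma - (1 + 1/\beta)/4$.

(i) If $\mu$ is known, $\hat\theta^{\mathrm{ref}}_N$ is asymptotically normal as in the white noise case:
\[
N^{\frac{1+\beta}{2}}\big(\hat\theta^{\mathrm{ref}}_N - \theta\big) \xrightarrow{d} \mathcal{N}(0,\Sigma), \tag{3.30}
\]
where $\Sigma$ is given by (2.30).

(ii) If $\beta > 1$, then $\hat\theta^{\mathrm{sim}}_N$ is asymptotically normal as in (3.30). If $\beta = 1$, then
\[
N\big(\hat\theta^{\mathrm{sim}}_N - \theta\big) \xrightarrow{d} \mathcal{N}(m,\Sigma), \tag{3.31}
\]
where
\[
m = \Big(\mu + \frac{1}{2T}\big(1-e^{-2\mu T}\big)\Big)\frac{1+\beta(2\alpha-2\gamma+1)}{\Lambda(1+\beta(2\alpha-2\gamma))}. \tag{3.32}
\]
If $\beta < 1$, then
\[
N^\beta\big(\hat\theta^{\mathrm{sim}}_N - \theta\big) \xrightarrow{a.s.} m. \tag{3.33}
\]

If $\eta > 1 + 1/\beta$, then (3.30), (3.31), (3.33) for $\beta > 1$, $\beta = 1$, $\beta < 1$, respectively, hold for $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ as well. If $\eta\le 1 + 1/\beta$, then a.s. $\hat\theta^{\mathrm{part}}_N = \theta + o(N^{-a})$ for $a < \beta(\eta/2\wedge 1)$, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.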
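Proof.

(i) Plugging the dynamics of $X^N$ into (3.27), we obtain
\[
\hat\theta^{\mathrm{ref}}_N - \theta = -\sigma\frac{\int_0^T\big\langle(-A)^{1+2\alpha-\gamma}X^N_t + \mu\int_0^t(-A)^{1+2\alpha-\gamma}X^N_r\,dr, dW^N_t\big\rangle}{\int_0^T\big\|(-A)^{1+\alpha}X^N_t + \mu\int_0^t(-A)^{1+\alpha}X^N_r\,dr\big\|^2\,dt}.
\]
Set $C_s = \sigma^2T\Lambda^{s-2\gamma-1}/(2\theta(1+\beta(s-2\gamma-1)))$, whenever $s > 1 + 2\gamma - 1/\beta$. By Proposition 2.4, $\tilde X\in R(s+\eta)$ for $s < s^*$. As $J$ maps $R(s+\eta)$ into itself, the same is true for $J\tilde X$. From Proposition 3.4 (ii), comparing the rates of (3.19) and (3.22), we get immediately
\[
\begin{aligned}
I^N_s &:= \int_0^T\Big\|(-A)^{s/2}X^N_t + \mu\int_0^t(-A)^{s/2}X^N_r\,dr\Big\|^2dt\\
&\sim \int_0^T\Big\|(-A)^{s/2}\bar X^N_t + \mu\int_0^t(-A)^{s/2}\bar X^N_r\,dr\Big\|^2dt \sim C_sN^{1+\beta(s-2\gamma-1)}.
\end{aligned} \tag{3.34}
\]
We write
\[
\hat\theta^{\mathrm{ref}}_N - \theta =: -\sigma\frac{C^{1/2}_{2+4\alpha-2\gamma}N^{1/2+\beta(2\alpha-2\gamma+1/2)}}{I^N_{2+2\alpha}}M^N_T,
\]
such that $(M^N)_{N\in\mathbb{N}}$ is a sequence of local martingales with $\langle M^N\rangle_T\xrightarrow{P}1$. According to Theorem A.1, it follows that $M^N_T\xrightarrow{d}\mathcal{N}(0,1)$, and making use of Slutsky's lemma, we see that
\[
N^{\frac{1+\beta}{2}}\big(\hat\theta^{\mathrm{ref}}_N - \theta\big) \xrightarrow{d} \mathcal{N}\left(0, \frac{\sigma^2C_{2+4\alpha-2\gamma}}{C^2_{2+2\alpha}}\right). \tag{3.35}
\]

(ii) We decompose $\hat\theta^{\mathrm{sim}}_N$ as follows:
\[
\hat\theta^{\mathrm{sim}}_N - \theta = \mu\frac{\int_0^T\langle(-A)^{1+2\alpha}X^N_t, V^N_t\rangle\,dt}{\int_0^T\|(-A)^{1+\alpha}X^N_t\|^2\,dt} - \sigma\frac{\int_0^T\langle(-A)^{1+2\alpha-\gamma}X^N_t, dW^N_t\rangle}{\int_0^T\|(-A)^{1+\alpha}X^N_t\|^2\,dt}.
\]
As before, the second term converges in distribution to $\mathcal{N}(0,\Sigma)$ with rate $N^{(1+\beta)/2}$. Using Proposition 3.4 (iii), we have
\[
\mu\frac{\int_0^T\langle(-A)^{1+2\alpha}X^N_t, V^N_t\rangle\,dt}{\int_0^T\|(-A)^{1+\alpha}X^N_t\|^2\,dt}N^\beta \xrightarrow{a.s.} \frac{\mu C^{(+)}_T(1+\beta(2\alpha-2\gamma+1))}{T\Lambda(1+\beta(2\alpha-2\gamma))},
\]
and the right-hand side equals $m$. This yields the claim in the case $\beta > 1$ and $\beta = 1$, and for $\beta < 1$ note that by Lemma 2.6, we have almost surely $\int_0^T\langle(-A)^{1+2\alpha-\gamma}X^N_t, dW^N_t\rangle \lesssim N^{1/2+\beta(1/2+2\alpha-2\gamma)}$, and the claim follows in this case as well. The statements regarding $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are straightforward, taking into account an additional bias term of order $N^{-b}$ for $b < \beta\eta/2$ as in Theorem 2.11, coming from the nonlinearity.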
Remark 3.8. $\hat\theta^{\mathrm{sim}}_N$ is identical to $\hat\theta^{\mathrm{full}}_N$ from Section 2.3. Thus, Theorem 3.7 is informative for the case that the true noise model is of Ornstein–Uhlenbeck type, but the diffusivity estimator is derived under a white noise assumption. In the reference case $\beta = 2/d$, this translates as follows: In $d = 1$, $\hat\theta^{\mathrm{sim}}_N$ is asymptotically normal with optimal convergence rate. In particular, $\hat\theta^{\mathrm{full}}_N$ from Section 2.3 is asymptotically robust to noise misspecification of Ornstein–Uhlenbeck type. In $d = 2$, $\hat\theta^{\mathrm{sim}}_N$ converges to a non-centered normal distribution, still with optimal rate. In $d\ge 3$, $\hat\theta^{\mathrm{sim}}_N$ is still consistent for $\theta$, but its convergence rate is no longer optimal.
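The case distinction of Theorem 3.7 (ii) and the bias constant (3.32) can be summarized in a short helper. This is an illustrative sketch: the function names and the string labels are assumptions made here, and the regime is read off from $\beta = 2/d$ as in the reference case of the remark.

```python
import math


def bias_constant(mu, T, Lam, beta, alpha, gamma):
    """The constant m from (3.32)."""
    return ((mu + (1 - math.exp(-2 * mu * T)) / (2 * T))
            * (1 + beta * (2 * alpha - 2 * gamma + 1))
            / (Lam * (1 + beta * (2 * alpha - 2 * gamma))))


def theta_sim_regime(beta):
    """Exponent of N and type of limit for theta_sim, following Theorem 3.7 (ii)."""
    if beta > 1:
        return (1 + beta) / 2, "centered normal"
    if beta == 1:
        return 1.0, "non-centered normal with mean m"
    return beta, "almost sure limit m"


for d in (1, 2, 3):
    rate, limit = theta_sim_regime(2 / d)
    print(f"d = {d}: rate N^{rate:.3f}, limit: {limit}")
```

For $d = 1, 2, 3$ this reproduces the three situations described in the remark: optimal rate with centered limit, optimal rate with a bias $m$, and the slower rate $N^{\beta}$.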
3.1.4 Correlation Decay Estimation
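In contrast to the case of diffusivity estimation, we cannot just set the nuisance parameter $\theta$ in $\hat\mu^{\mathrm{ref}}_N$ to zero: According to Proposition 3.4, the term $\theta\int_0^t(-A)^{1+2\alpha}X^N_r\,dr$ dominates the denominator of (3.28). As a consequence, estimation of $\mu$ depends on knowledge (or precise estimation) of $\theta$. In the sequel, we set
\[
\begin{aligned}
V^{\mathrm{full}}_t(\vartheta) &:= X_t - X_0 + \vartheta\int_0^t(-A)X_r\,dr - \int_0^tF(X)(r)\,dr, &(3.36)\\
V^{\mathrm{lin}}_t(\vartheta) &:= X_t - X_0 + \vartheta\int_0^t(-A)X_r\,dr, &(3.37)
\end{aligned}
\]
then $V^{\mathrm{full},N}(\vartheta) = P_NV^{\mathrm{full}}(\vartheta)$ and $V^{\mathrm{lin},N}(\vartheta) = P_NV^{\mathrm{lin}}(\vartheta)$ are given by the same terms, with $X$ and $F(X)$ replaced by $X^N$ and $P_NF(X)$. Further, we set for $\vartheta > 0$:
\[
\begin{aligned}
\hat\mu^{\mathrm{full}}_N(\vartheta) &:= -\frac{\int_0^T\big\langle(-A)^{2\alpha}V^{\mathrm{full},N}_t(\vartheta), dX^N_t\big\rangle}{\int_0^T\|(-A)^\alpha V^{\mathrm{full},N}_t(\vartheta)\|^2\,dt} - \vartheta\frac{\int_0^T\big\langle(-A)^{1+2\alpha}V^{\mathrm{full},N}_t(\vartheta), X^N_t\big\rangle\,dt}{\int_0^T\|(-A)^\alpha V^{\mathrm{full},N}_t(\vartheta)\|^2\,dt}\\
&\quad + \frac{\int_0^T\big\langle(-A)^{2\alpha}V^{\mathrm{full},N}_t(\vartheta), P_NF(X)(t)\big\rangle\,dt}{\int_0^T\|(-A)^\alpha V^{\mathrm{full},N}_t(\vartheta)\|^2\,dt}
\end{aligned} \tag{3.38}
\]
and
\[
\hat\mu^{\mathrm{lin}}_N(\vartheta) := -\frac{\int_0^T\big\langle(-A)^{2\alpha}V^{\mathrm{lin},N}_t(\vartheta), dX^N_t\big\rangle}{\int_0^T\|(-A)^\alpha V^{\mathrm{lin},N}_t(\vartheta)\|^2\,dt} - \vartheta\frac{\int_0^T\big\langle(-A)^{1+2\alpha}V^{\mathrm{lin},N}_t(\vartheta), X^N_t\big\rangle\,dt}{\int_0^T\|(-A)^\alpha V^{\mathrm{lin},N}_t(\vartheta)\|^2\,dt}. \tag{3.39}
\]
Note that if $\vartheta = \theta$ is the true diffusivity, then $\hat\mu^{\mathrm{full}}_N(\theta) = \hat\mu^{\mathrm{ref}}_N$ as given in (3.28). If $\vartheta$ is close to $\theta$, then $V^{\mathrm{full}}(\vartheta)$ and $V^{\mathrm{lin}}(\vartheta)$ should be seen as an approximation of $V$. This is formalized as follows: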
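Lemma 3.9. Let $\eta > 0$, $s_0\in\mathbb{R}$ such that $F$ satisfies $(F_{s,\eta})$ and $(F^J_{s,\eta})$ for $s_0\le s < s^*$. Assume $X\in R(s_0)$ and $X_0\in H_{s^*+\eta}$. Let $(\theta_N)_{N\in\mathbb{N}}$ be a sequence of estimators for $\theta$ which is a.s. consistent. Then, for $s > 2\gamma - 1/\beta$,
\[
\int_0^T\big\|(-A)^{\frac s2}\big(V^{\mathrm{full},N}_t(\theta_N) - V^N_t\big)\big\|^2\,dt = o\big(N^{1+\beta(s-2\gamma)}\big). \tag{3.40}
\]
In particular,
\[
\int_0^T\big\|(-A)^{\frac s2}V^{\mathrm{full},N}_t(\theta_N)\big\|^2\,dt \sim \int_0^T\big\|(-A)^{\frac s2}V^N_t\big\|^2\,dt \sim C^{OU}_sN^{1+\beta(s-2\gamma)}, \tag{3.41}
\]
where $C^{OU}_s = \sigma^2C^{(-)}_T\Lambda^{s-2\gamma}/(2\mu(1+\beta(s-2\gamma)))$. The same statements are true for $V^{\mathrm{lin},N}(\theta_N)$.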
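Proof. Let $\epsilon > 0$. If $s > s^*-1+\eta-\epsilon$, using $JF(X) \in R(s^*-1+\eta-\epsilon)$ by Lemma 3.6, we have
\[
\int_0^T\Big\|(-A)^{s/2}\int_0^tP_NF(X)(r)\,\mathrm{d}r\Big\|^2\mathrm{d}t
\le \lambda_N^{s-s^*+1-\eta+\epsilon}\int_0^T\Big\|(-A)^{\frac{s^*-1+\eta-\epsilon}{2}}\int_0^tP_NF(X)(r)\,\mathrm{d}r\Big\|^2\mathrm{d}t
\lesssim N^{\beta(s-s^*+1-\eta+\epsilon)} = N^{1+\beta(s-2\gamma-\eta+\epsilon)},
\]
thus for $\epsilon$ sufficiently small, this grows slower than $N^{1+\beta(s-2\gamma)}$. If $s < s^*-1+\eta-\epsilon$, the left-hand side is even bounded uniformly in $N$. The case $s = s^*-1+\eta-\epsilon$ can be avoided by substituting $\epsilon \mapsto \epsilon/2$. Using that $X = X_0 + \theta JAX + JF(X) + V$, we see that
\[ V^{\mathrm{full},N}_t(\theta) = V^N_t, \qquad V^{\mathrm{lin},N}_t(\theta) - V^N_t = \int_0^tP_NF(X)(r)\,\mathrm{d}r, \]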
and consequently,
ZT
0(A)s
2Vlin,N
t(θ)VN
t
2dtZT
0(A)s
2Zt
0
PNF(X)(r)dr
2
dt
pN1+β(s2γ),
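and the same estimate is trivially satisfied for $V^{\mathrm{full},N}$ instead of $V^{\mathrm{lin},N}$. Next, again by Lemma 3.6, we have
\[
\int_0^T\big\|(-A)^{s/2}\big(V^{\mathrm{full},N}_t(\theta_N) - V^{\mathrm{full},N}_t(\theta)\big)\big\|^2\,\mathrm{d}t
= (\theta_N-\theta)^2\int_0^T\Big\|(-A)^{\frac{s+2}{2}}\int_0^tX^N_r\,\mathrm{d}r\Big\|^2\mathrm{d}t
\sim (\theta_N-\theta)^2\theta^{-2}C^{OU}_sN^{1+\beta(s-2\gamma)},
\]
and the same is true for $V^{\mathrm{lin},N}$ instead of $V^{\mathrm{full},N}$. As $\theta_N$ is a consistent estimator for $\theta$, the right-hand side is negligible compared to $N^{1+\beta(s-2\gamma)}$. Now (3.40) follows by simple norm estimates, and (3.41) is a direct consequence of (3.40).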
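Theorem 3.10. Let $\eta > 0$, $s_0 \in \mathbb{R}$ such that $F$ satisfies $(F_{s,\eta})$ and $(F^J_{s,\eta})$ for $s_0 \le s < s^*$. Let $X \in R(s_0)$ and $X_0 \in H^{s^*+\eta}$ almost surely. Let $\alpha > \gamma - 1/(4\beta)$.
(i) If $\beta > 1/2$, then
\[ N^{1/2}\big(\hat\mu^{\mathrm{full}}_N\circ\hat\theta^{\mathrm{sim}}_N - \mu\big) \xrightarrow{d} \mathcal{N}(0,\Sigma_\mu) \tag{3.42} \]
with
\[ \Sigma_\mu = \frac{4\mu^2\,(1+\beta(2\alpha-2\gamma))^2}{(e^{-2\mu T}-1+2\mu T)\,(1+\beta(4\alpha-4\gamma))}. \tag{3.43} \]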
(ii) If β= 1/2, then
N1
2ˆµfull
Nˆ
θsim
Nµd
N(mµ/θ, Σµ),(3.44)
where mis given by (3.32).
(iii) If β < 1/2, then
Nβˆµfull
Nˆ
θsim
Nµa.s.
θ.(3.45)
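If $F$ satisfies $(F_{s,\bar\eta})$ for some $\bar\eta > 1$ and all $s_0 \le s < s^*$, then $\hat\mu^{\mathrm{lin}}_N\circ\hat\theta^{\mathrm{lin}}_N$ is a consistent estimator for $\mu$. If $\bar\eta > 1+1/\beta$, (i), (ii), (iii) remain true for $\hat\mu^{\mathrm{lin}}_N\circ\hat\theta^{\mathrm{lin}}_N$. Otherwise, $\hat\mu^{\mathrm{lin}}_N\circ\hat\theta^{\mathrm{lin}}_N = \mu + o(N^{-b})$ for every $b < (\beta(\bar\eta-1)/2)\wedge\beta$.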
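Proof. We write $\theta_N = \hat\theta^{\mathrm{sim}}_N$ for short. Expanding $\hat\mu^{\mathrm{full}}_N\circ\theta_N$ by plugging in the dynamics of $X^N$, we see that
\[
\hat\mu^{\mathrm{full}}_N\circ\theta_N
= \mu\,\frac{\int_0^T\langle(-A)^{2\alpha}V^{\mathrm{full},N}_t(\theta_N),V^N_t\rangle\,\mathrm{d}t}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{full},N}_t(\theta_N)\|^2\,\mathrm{d}t}
- \sigma\,\frac{\int_0^T\langle(-A)^{2\alpha-\gamma}V^{\mathrm{full},N}_t(\theta_N),\mathrm{d}W^N_t\rangle}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{full},N}_t(\theta_N)\|^2\,\mathrm{d}t}
=: [I]_N - [II]_N. \tag{3.46}
\]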
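For the second term, one would like to apply Theorem A.1 as in the previous cases. Note, however, that $\theta_N$ depends on the whole trajectory of $(X_t)_{0\le t\le T}$, so the integrand in the stochastic integral is not adapted. Nonetheless, as $V^{\mathrm{full},N}_t(\theta_N)$ is an affine function of $\theta_N$, this issue is easy to avoid by decomposing the integrand as $V^{\mathrm{full},N}_t(\theta_N) = V^{\mathrm{full},N}_t(\theta) + (\theta_N-\theta)\int_0^t(-A)X^N_r\,\mathrm{d}r$:
\[
\sigma\,\frac{\int_0^T\langle(-A)^{2\alpha-\gamma}V^{\mathrm{full},N}_t(\theta_N),\mathrm{d}W^N_t\rangle}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{full},N}_t(\theta_N)\|^2\,\mathrm{d}t}
= \sigma\,\frac{\int_0^T\langle(-A)^{2\alpha-\gamma}V^{\mathrm{full},N}_t(\theta),\mathrm{d}W^N_t\rangle}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{full},N}_t(\theta_N)\|^2\,\mathrm{d}t}
+ (\theta_N-\theta)\,\sigma\,\frac{\int_0^T\langle(-A)^{1+2\alpha-\gamma}\int_0^tX^N_r\,\mathrm{d}r,\mathrm{d}W^N_t\rangle}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{full},N}_t(\theta_N)\|^2\,\mathrm{d}t}
=: [IIa]_N + (\theta_N-\theta)[IIb]_N.
\]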
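Now we rescale both terms separately with the square root of the quadratic variation processes of their stochastic integrals and apply Theorem A.1, using $\alpha > \gamma - 1/(4\beta)$ together with Lemma 3.9 and Proposition 3.4 (iii). This yields
\[ N^{1/2}[IIa]_N \xrightarrow{d} \mathcal{N}\big(0,\ \sigma^2C^{OU}_{4\alpha-2\gamma}/(C^{OU}_{2\alpha})^2\big), \tag{3.47} \]
\[ N^{1/2}[IIb]_N \xrightarrow{d} \mathcal{N}\big(0,\ \sigma^2C^{OU}_{4\alpha-2\gamma}/(\theta C^{OU}_{2\alpha})^2\big). \tag{3.48} \]
As $\theta_N \to \theta$ almost surely, (3.47) holds for $[II]_N$ instead of $[IIa]_N$ as well.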
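The term $[I]_N$ can be treated as follows:
\[
[I]_N - \mu = \mu\,\frac{\int_0^T\langle(-A)^{2\alpha}V^{\mathrm{full},N}_t(\theta_N),\,V^N_t - V^{\mathrm{full},N}_t(\theta_N)\rangle\,\mathrm{d}t}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{full},N}_t(\theta_N)\|^2\,\mathrm{d}t}
\]
\[
= -(\theta_N-\theta)\,\mu\,\theta_N\,\frac{\int_0^T\|(-A)^{1+\alpha}\int_0^tX^N_r\,\mathrm{d}r\|^2\,\mathrm{d}t}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{full},N}_t(\theta_N)\|^2\,\mathrm{d}t}
- (\theta_N-\theta)\,\mu\,\frac{\int_0^T\langle(-A)^{1+2\alpha}X^N_t,\int_0^tX^N_r\,\mathrm{d}r\rangle\,\mathrm{d}t}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{full},N}_t(\theta_N)\|^2\,\mathrm{d}t}
\]
\[
\quad + (\theta_N-\theta)\,\mu\,\frac{\int_0^T\langle(-A)^{1+2\alpha}\big(X^N_0+\int_0^tP_NF(X)(r)\,\mathrm{d}r\big),\int_0^tX^N_r\,\mathrm{d}r\rangle\,\mathrm{d}t}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{full},N}_t(\theta_N)\|^2\,\mathrm{d}t}
=: [Ia]_N + [Ib]_N + [Ic]_N.
\]
For the first term, writing $\|\cdot\|_s = \|(-A)^{s/2}\cdot\|$,
\[
[Ia]_N \sim -(\theta_N-\theta)\,\mu\,\theta\,\frac{\int_0^T\|\int_0^tX^N_r\,\mathrm{d}r\|^2_{2+2\alpha}\,\mathrm{d}t}{\int_0^T\|V^{\mathrm{full},N}_t(\theta_N)\|^2_{2\alpha}\,\mathrm{d}t}
\sim -\frac{\mu}{\theta}(\theta_N-\theta).
\]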
For [Ib]N, the Cauchy-Schwarz inequality is used: If α > γ + 1/41/(2β),
ZT
0XN
t,Zt
0
XN
rdr1+2α
dt
v
u
u
tZT
0XN
t2
1/2+2αdtZT
0Zt
0
XN
rdr
2
3/2+2α
dt
N1+β(2α2γ1/2)
pN1+β(2α2γ)ZT
0Zt
0
XN
rdr
2
2+2α
dt.
If α < γ + 1/41/(2β), the left-hand side is bounded uniformly in N,
and for α=γ+ 1/41/(2β)replace αby α+ 1/8under the square root
term in an additional norm estimate, and continue as before. In any case,
|[Ib]N| p[Ia]N. Finally, consider [Ic]N. Let 0< ϵ < η. We can neglect
X0Hs+η, which has larger spatial regularity than JF(X)R(s1 +
ηϵ). Then, if α > γ 1/(2β) + η/4ϵ/4, we have with the abbreviation
¯r:= 2 + 4αs+ 1 η+ϵ:
|[Ic]N|θNθrRT
0Rt
0PNF(X)(r)dr
2
s1+ηϵdtRT
0Rt
0XN
rdr
2
¯rdt
RT
0(A)αVfull,N
t(θN)
2dt
θNθNβ
2(ηϵ),
and |[Ic]N| p[Ia]N. The case αγ1/(2β)+η/4ϵ/4is treated as before.
Putting things together, we have shown $[I]_N - \mu \sim -(\theta_N-\theta)\mu/\theta$. Now (i), (ii), (iii) for $\hat\mu^{\mathrm{full}}_N\circ\theta_N$ follow from the asymptotic behavior of $\theta_N = \hat\theta^{\mathrm{sim}}_N$ from Theorem 3.7.
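Now, tracing the proof for $\hat\mu^{\mathrm{lin}}_N\circ\theta_N$ (with $V^{\mathrm{lin}}$ instead of $V^{\mathrm{full}}$ and $\theta_N = \hat\theta^{\mathrm{lin}}_N$ instead of $\theta_N = \hat\theta^{\mathrm{sim}}_N$), there are two additional bias terms that have to be controlled. First, (3.46) is replaced by
\[
\hat\mu^{\mathrm{lin}}_N\circ\theta_N = [I]_N - [II]_N - \frac{\int_0^T\langle(-A)^{2\alpha}V^{\mathrm{lin},N}_t(\theta_N),P_NF(X)(t)\rangle\,\mathrm{d}t}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{lin},N}_t(\theta_N)\|^2\,\mathrm{d}t}
=: [I]_N - [II]_N - [III]_N
\]
with $V^{\mathrm{lin}}$ instead of $V^{\mathrm{full}}$ in the definition of $[I]_N$ and $[II]_N$. $[II]_N$ is treated exactly as before. Due to $(F_{s,\bar\eta})$ we have $F(X) \in R(s-2+\bar\eta)$ for all $s < s^*$. Let w.l.o.g. $2\alpha - s^* + 2 - \bar\eta > 0$; this can be achieved by choosing $\bar\eta > 1$ smaller if necessary. Then, for any $\epsilon > 0$,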
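\[
|[III]_N| \le \left(\frac{\int_0^T\|(-A)^{\alpha}P_NF(X)(t)\|^2\,\mathrm{d}t}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{lin},N}_t(\theta_N)\|^2\,\mathrm{d}t}\right)^{1/2}
\le \lambda_N^{\frac{2\alpha-s^*+2-\bar\eta+\epsilon}{2}}\left(\frac{\int_0^T\|(-A)^{\frac{s^*-2+\bar\eta-\epsilon}{2}}P_NF(X)(t)\|^2\,\mathrm{d}t}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{lin},N}_t(\theta_N)\|^2\,\mathrm{d}t}\right)^{1/2}
\]
\[
\lesssim N^{-\frac12-\frac{\beta}{2}(2\alpha-2\gamma)+\frac12+\frac{\beta}{2}(2\alpha-2\gamma+1-\bar\eta+\epsilon)} = N^{-\frac{\beta}{2}(\bar\eta-1-\epsilon)},
\]
and for $\bar\eta > 1$ and sufficiently small $\epsilon > 0$, this term converges to zero. If $\bar\eta > 1+1/\beta$, then even $N^{1/2}|[III]_N| \lesssim N^{-\beta(\bar\eta-1-1/\beta-\epsilon)/2} \to 0$ for $N \to \infty$.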
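Furthermore, the decomposition of $[I]_N$ changes: If we set w.l.o.g. $X_0 = 0$, then we obtain
\[
[I]_N - \mu = [Ia]_N + [Ib]_N - \mu\,\frac{\int_0^T\langle(-A)^{2\alpha}V^{\mathrm{lin},N}_t(\theta_N),\int_0^tP_NF(X)(r)\,\mathrm{d}r\rangle\,\mathrm{d}t}{\int_0^T\|(-A)^{\alpha}V^{\mathrm{lin},N}_t(\theta_N)\|^2\,\mathrm{d}t}
=: [Ia]_N + [Ib]_N + [Id]_N,
\]
where again $V^{\mathrm{full}}$ has been substituted by $V^{\mathrm{lin}}$ in every term. The term $[Id]_N$ is treated exactly as $[III]_N$ (note that, in fact, $JF(X)$ exhibits even larger spatial regularity than $F(X)$). The claims for the case $\bar\eta > 1+1/\beta$ are now immediate. For the remaining case $\bar\eta \le 1+1/\beta$, note that by Theorem 3.7, $\hat\theta^{\mathrm{lin}}_N = \theta + o(N^{-a})$ for $a < \beta\bar\eta/2\wedge\beta$, whereas $|[III]_N| = o(N^{-b})$ for $b < \beta(\bar\eta-1)/2$. This concludes the proof.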
Remark 3.11.
(i) $\Sigma_\mu$ is minimal for $\alpha = \gamma$. In this case, $\Sigma_\mu = 4\mu^2/(e^{-2\mu T}-1+2\mu T)$. In particular, $\Sigma_\mu \sim 2\mu/T$ for $\mu \to \infty$, i.e. for large $\mu$, the asymptotic variance grows linearly in $\mu$. Further, $\lim_{\mu\to0}\Sigma_\mu = 2/T^2$, i.e. for small $\mu$, the asymptotic variance does not depend on $\mu$, but a large observation time $T$ is even more beneficial.
(ii) Similar to the case of diffusivity $\theta$, for finite-dimensional systems, the temporal correlation decay $\mu$ is not identifiable in finite time. For example, if $F = 0$, (3.26) describes the true likelihood for the $N$-dimensional system, and the measures on path space are absolutely continuous for different $\mu$.
(iii) If $X^N$ and $P_NF(X)$ are observed, both $\hat\mu^{\mathrm{full}}_N\circ\hat\theta^{\mathrm{sim}}_N$ and $\hat\mu^{\mathrm{lin}}_N\circ\hat\theta^{\mathrm{lin}}_N$ are valid estimators. If $P_NF(X)$ is not observed, only $\hat\mu^{\mathrm{lin}}_N\circ\hat\theta^{\mathrm{lin}}_N$ is feasible.
(iv) From the proof of Theorem 3.10, it is clear that $\hat\mu^{\mathrm{ref}}_N = \hat\mu^{\mathrm{full}}_N(\theta)$ with the true diffusivity $\theta$ is asymptotically normal as in (3.42).
(v) In particular, the hierarchical approach using $\hat\mu^{\mathrm{full}}_N\circ\hat\theta^{\mathrm{sim}}_N$ is asymptotically as good as the direct maximum-likelihood approach with known $\theta$ whenever $\beta > 1/2$. If $\beta = 2/d$, this means $d \le 3$.
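The limits stated in Remark 3.11 (i) can be checked numerically with a few lines. The following is a minimal sketch of formula (3.43); the function name `sigma_mu` and the default arguments are illustrative choices, not notation from the text.

```python
import math

def sigma_mu(mu, T, alpha=0.0, gamma=0.0, beta=1.0):
    """Asymptotic variance (3.43).

    Uses expm1 so that e^{-2 mu T} - 1 + 2 mu T stays accurate for small mu.
    """
    delta = 2.0 * alpha - 2.0 * gamma
    time_factor = math.expm1(-2.0 * mu * T) + 2.0 * mu * T
    return 4.0 * mu**2 * (1.0 + beta * delta) ** 2 / (time_factor * (1.0 + 2.0 * beta * delta))

# Small mu: the variance approaches 2 / T^2.
small = sigma_mu(1e-4, T=2.0)
# Large mu: the variance behaves like 2 mu / T.
large_ratio = sigma_mu(50.0, T=2.0) / (2.0 * 50.0 / 2.0)
```

Since $(1+\beta\delta)^2/(1+2\beta\delta) = 1 + \beta^2\delta^2/(1+2\beta\delta)$ with $\delta = 2\alpha-2\gamma$, any choice $\alpha \ne \gamma$ (with $1+2\beta\delta > 0$) increases the value returned by `sigma_mu`.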
Example 3.12. We close this section with a short discussion on the validity
of the additional condition (FJ
s,η).
(i) If Fsatisfies (Fs,η)with excess regularity η > 1, then (FJ
s,η)holds with
η=η1. In particular, the theory is applicable to reaction–diffusion
equations as in Section 2.4.2
(ii) If Fsatisfies (Fs,η)and JF =FJ, then (FJ
s,η)holds. For example, if
F(Z) = (A)rZfor some r < 2.
(iii) Let $\mathcal{D}$ be a bounded domain with smooth boundary and $A = \Delta$ the Laplacian. Let $s > d/2$. We extend Remark 2.21 as follows: Given a (possibly time-dependent) vector field $v:\mathcal{D}\times[0,T]\to\mathbb{R}^d$ with components $v^{(i)} = Jw^{(i)}$ for some $w^{(i)} \in R(s)$, consider the advection term $F(Z)(t) = \nabla\cdot(Z_tv_t)$. This term belongs to neither of the previous examples (i), (ii). We show that it satisfies $(F^J_{s,\eta})$ for any $\eta < 1$. To this end, we use the integration by parts formula $J(Jf\cdot g) = Jf\cdot Jg - J(f\cdot Jg)$ for $f, g \in R(s)$, where multiplication is understood pointwise for $x \in \mathcal{D}$.² Define $\bar G^{(i)}(Z) := v^{(i)}\cdot Z - J(w^{(i)}\cdot Z)$ and $G(Z) := \nabla\cdot\bar G(Z) = \sum_{i=1}^d\partial_{x_i}\bar G^{(i)}(Z)$. Then clearly $JF = GJ$. Further, $\bar G^{(i)}$ satisfies $(F_{s,\eta})$ for $\eta < 2$, thus $G$ satisfies $(F_{s,\eta})$ with $\eta < 1$.
² Note that for $f, g \in R(s)$ all terms appearing are well-defined, and for any $x \in \mathcal{D}$, the (multiplicative, bounded) point evaluation operator $\delta_x$ reduces the formula to the one-dimensional integration by parts.
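The one-dimensional integration by parts formula $J(Jf\cdot g) = Jf\cdot Jg - J(f\cdot Jg)$ mentioned in the footnote can be illustrated numerically. The sketch below approximates $J$ by a cumulative trapezoidal rule; the helper names `J` and `ibp_residual` and the test functions are hypothetical choices.

```python
import numpy as np

def J(y, dt):
    """Cumulative trapezoidal approximation of (Jy)(t) = int_0^t y(r) dr."""
    out = np.zeros_like(y)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1])) * dt
    return out

def ibp_residual(f, g, dt):
    """Difference between both sides of J(Jf * g) = Jf * Jg - J(f * Jg)."""
    lhs = J(J(f, dt) * g, dt)
    rhs = J(f, dt) * J(g, dt) - J(f * J(g, dt), dt)
    return lhs - rhs

t = np.linspace(0.0, 1.0, 20001)
residual = ibp_residual(np.cos(3.0 * t), np.exp(-t), t[1] - t[0])
```

The residual is of the order of the quadrature error only, which reflects that the identity is the product rule applied to $Jf\cdot Jg$.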
3.2 The Case of Integrated Noise
In this section we consider the case that the solution process is driven by an integrated semimartingale. Such a process has pathwise Hölder regularity $3/2-\epsilon$ in time for every $\epsilon > 0$. Apart from providing a non-standard noise model for semilinear SPDEs that is used in applications, this type of noise can arise from partially observed systems driven by Brownian noise, as explained in Remark 3.16.
A better understanding of the impact of different noise types on statistical questions may help to decide for (or against) them. This noise model provides a simple example of a model misspecification in which the natural estimator $\hat\theta^{\mathrm{lin}}_N$, derived under the assumption of martingale noise, is no longer consistent. This is proven in Theorem 3.15.
For a model driven by an integrated noise term $\int_0^tW_r\,\mathrm{d}r$ rather than $W_t$, with dispersion operator $B = \sigma(-A)^{-\gamma}$, the resulting equation reads as
\[ \mathrm{d}X_t = \theta AX_t\,\mathrm{d}t + F(X)(t)\,\mathrm{d}t + BW_t\,\mathrm{d}t \tag{3.49} \]
together with initial condition $X_0$. W.l.o.g. we assume that the Wiener process starts in zero; different (e.g. random) initial conditions can be absorbed into $F$. Note that $X$ is the solution to a random PDE of the form
\[ \partial_tX_t = \theta AX_t + F(X)(t) + BW_t. \tag{3.50} \]
In particular, $Y_t := \partial_tX_t$ satisfies
\[ \mathrm{d}Y_t = \theta AY_t\,\mathrm{d}t + S_F(Y)(t)\,\mathrm{d}t + B\,\mathrm{d}W_t \tag{3.51} \]
with initial condition $Y_0 = \theta AX_0 + F(X)(0)$, where
\[ S_F := \partial_t\circ F\circ(J+X_0), \tag{3.52} \]
and $J$ is the Bochner integral operator $(JX)(t) = \int_0^tX_r\,\mathrm{d}r$. We make this precise as follows: Let $s \in \mathbb{R}$. For $f \in C(0,T;H^s)$ such that $\langle f,\Phi_k\rangle \in C^1(0,T;\mathbb{R})$ for $k \ge 1$, let $\partial_tf$ be given by $\langle\partial_tf,\Phi_k\rangle = \partial_t\langle f,\Phi_k\rangle$ whenever it exists. Define $C^1_\Phi(0,T;H^s) \subset C(0,T;H^s)$ to be the subspace of functions $f$ such that $\partial_tf$ exists in the previously explained sense and belongs to $L^\infty(0,T;H^s)$. It is clear that $J$ maps $L^\infty(0,T;H^s)$ into $C^1_\Phi(0,T;H^s)$, while $\partial_t$ maps $C^1_\Phi(0,T;H^s)$ into $L^\infty(0,T;H^s)$. For operators of the form $F: C^1_\Phi(0,T;H^0) \supset D(F) \to C^1_\Phi(0,T;H^0)$, (3.52) is meaningful.
It is clear that the solution $X$ to (3.49) and $Y$ to (3.51) contain the same amount of statistical information, as both processes can be transferred into each other by means of the operators $J$ and $\partial_t$.
If the nonlinear operator $S_F$ satisfies $(F_{s,\eta})$, we can connect to the theory from Section 2.3. Note that if $X_0$ is random, $S_F$ will depend on the realization $\omega$, but this does not alter any of the (pathwise) arguments. Condition $(F_{s,\eta})$ for $S_F$ can be deduced naturally from similar conditions on $F$ itself, for example, in the case of reaction terms:
Lemma 3.13. Let D Rdbe a bounded domain with smooth boundary,
A= the Laplacian operator and F(X) = f(X)for a differentiable function
f:RR. W.l.o.g. let Xsatisfy Dirichlet boundary conditions. If X0Hs
and fsatisfies (Fs,η)for some s > d/2and any 0< η < 2, then the same is
true for SF.
Proof. In that case, SF(Z) = tf(JZ +X0) = f(JZ +X0)Zfor any Z
R(s). Choose ϵ > 0and a monotonous function gas in condition (Fs,η)for
f. W.l.o.g. let s+η2> d/2(otherwise substitute η < 2by a larger value)
and ϵ2η. Then:
∥SF(Z)R(s+η+ϵ2) sup
0tTf(JZt+X0)s+η+ϵ2sup
0tTZs
sup
0tT
g(JZts+X0s)ZR(s)
g(TZR(s)+X0s)ZR(s),
which proves the claim.
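Furthermore, if $\partial_t$ commutes with $F$, e.g. if $F$ itself is a linear differential operator acting in spatial direction, then $(F_{s,\eta})$ for $S_F$ immediately reduces to $(F_{s,\eta})$ for $F$.
In total, the whole theory as developed for noise of semimartingale type transfers if $Y = \partial_tX$ is considered instead of $X$. For example, a maximum likelihood-type estimator for the case of integrated white noise is given by
\[
\hat\theta^{\mathrm{rescaled}}_N = -\frac{\int_0^T\langle(-A)^{1+2\alpha}\partial_tX^N_t,\mathrm{d}(\partial_tX^N)_t\rangle}{\int_0^T\|(-A)^{1+\alpha}\partial_tX^N_t\|^2\,\mathrm{d}t}
+ \frac{\int_0^T\langle(-A)^{1+2\alpha}\partial_tX^N_t,P_N(S_F)(\partial_tX)(t)\rangle\,\mathrm{d}t}{\int_0^T\|(-A)^{1+\alpha}\partial_tX^N_t\|^2\,\mathrm{d}t}. \tag{3.53}
\]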
It is obvious that the same reduction technique applies if Xis driven by
integrated Ornstein–Uhlenbeck noise instead of integrated white noise.
Remark 3.14. Taking the time derivative of $X$ amounts to a rescaling of the Hölder regularity of $X$ to be $1/2-\epsilon$ in time (for all $\epsilon > 0$), such that semimartingale theory can be applied. A similar approach is possible for fractional noise with Hurst index $0 < H < 1$. In this case, the temporal regularity rescaling can be done by applying a kernel instead of taking the derivative. Based on that observation, a Girsanov transform for SODEs driven by fractional Brownian motion $B^H$ can be derived by considering a surrogate semimartingale [NVV99, KLBR00, TV07, Mis08]. This allows for likelihood-based inference. In [CLP09, Cia10] this approach is used for parameter estimation for SPDEs driven by additive and multiplicative fractional noise.
In the case of integrated noise, it is interesting to see how model misspecification changes the behavior of the estimator. Namely, assume that $\hat\theta^{\mathrm{full}}_N$ is given as in Section 2.3, but the dynamics of $X$ is generated by integrated noise. Then even in the simplest possible case, i.e. if $X$ satisfies a linear equation with $X_0 = 0$, $\hat\theta^{\mathrm{full}}_N$ is not consistent:
Theorem 3.15. Let $X_0 = 0$, $F = 0$. It holds that $\hat\theta^{\mathrm{full}}_N \to 0$ almost surely.
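Proof. First note that
\[ \hat\theta^{\mathrm{full}}_N - \theta = -\sigma\,\frac{\int_0^T\langle(-A)^{1+2\alpha-\gamma}X^N_t,W^N_t\rangle\,\mathrm{d}t}{\int_0^T\|(-A)^{1+\alpha}X^N_t\|^2\,\mathrm{d}t}. \tag{3.54} \]
With $x^{(k)} = \langle X,\Phi_k\rangle$, it holds that
\[ \mathrm{d}x^{(k)}_t = -\theta\lambda_kx^{(k)}_t\,\mathrm{d}t + \sigma\lambda_k^{-\gamma}W^{(k)}_t\,\mathrm{d}t \]
with independent Wiener processes $W^{(k)}$, and consequently,
\[ x^{(k)}_t = \sigma\lambda_k^{-\gamma}\int_0^te^{-\theta\lambda_k(t-r)}W^{(k)}_r\,\mathrm{d}r. \]
A straightforward calculation yields
\[
\mathbb{E}[(x^{(k)}_t)^2] = 2\sigma^2\lambda_k^{-2\gamma}\int_0^t\int_0^re^{-\theta\lambda_k(t-r)}e^{-\theta\lambda_k(t-r')}\,r'\,\mathrm{d}r'\,\mathrm{d}r
= \frac{2\sigma^2}{\theta^2\lambda_k^{2\gamma+2}}\left(\frac{t}{2} - \frac{3}{4\theta\lambda_k} - \frac{e^{-2\theta\lambda_kt}}{4\theta\lambda_k} + \frac{e^{-\theta\lambda_kt}}{\theta\lambda_k}\right),
\]
thus
\[ \mathbb{E}\int_0^T(x^{(k)}_t)^2\,\mathrm{d}t \sim \frac{T^2\sigma^2}{2\theta^2}\lambda_k^{-2\gamma-2}. \]
Summing over the first $N$ modes and using Lemma A.2 (ii), we get a.s.
\[ \int_0^T\|(-A)^{1+\alpha}X^N_t\|^2\,\mathrm{d}t \sim \frac{T^2\sigma^2\Lambda^{2\alpha-2\gamma}}{2\theta^2(1+\beta(2\alpha-2\gamma))}N^{1+\beta(2\alpha-2\gamma)}. \tag{3.55} \]
Furthermore,
\[
\mathbb{E}[x^{(k)}_tW^{(k)}_t] = \sigma\lambda_k^{-\gamma}\int_0^te^{-\theta\lambda_k(t-r)}\,r\,\mathrm{d}r
= \frac{\sigma}{\theta\lambda_k^{\gamma+1}}\left(t - \frac{1}{\theta\lambda_k}\big(1-e^{-\theta\lambda_kt}\big)\right),
\]
and consequently,
\[ \mathbb{E}\int_0^Tx^{(k)}_tW^{(k)}_t\,\mathrm{d}t \sim \frac{T^2\sigma}{2\theta}\lambda_k^{-\gamma-1}. \]
By summing up, we obtain
\[ \mathbb{E}\int_0^T\langle(-A)^{1+2\alpha-\gamma}X^N_t,W^N_t\rangle\,\mathrm{d}t \sim \frac{T^2\sigma\Lambda^{2\alpha-2\gamma}}{2\theta(1+\beta(2\alpha-2\gamma))}N^{1+\beta(2\alpha-2\gamma)}. \tag{3.56} \]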
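Finally, using the Wick theorem as in [Jan97, Theorem 1.28],
\[
\mathrm{Var}\left(\int_0^Tx^{(k)}_tW^{(k)}_t\,\mathrm{d}t\right)
= \int_0^T\int_0^T\mathbb{E}[x^{(k)}_tx^{(k)}_rW^{(k)}_tW^{(k)}_r]\,\mathrm{d}r\,\mathrm{d}t - \left(\int_0^T\mathbb{E}[x^{(k)}_tW^{(k)}_t]\,\mathrm{d}t\right)^2
\]
\[
= \int_0^T\int_0^T\Big(\mathbb{E}[x^{(k)}_tx^{(k)}_r]\,\mathbb{E}[W^{(k)}_tW^{(k)}_r] + \mathbb{E}[x^{(k)}_tW^{(k)}_r]\,\mathbb{E}[x^{(k)}_rW^{(k)}_t]\Big)\,\mathrm{d}r\,\mathrm{d}t
\]
\[
\le 2\int_0^T\int_0^T\sqrt{rt\,\mathbb{E}[(x^{(k)}_r)^2]\,\mathbb{E}[(x^{(k)}_t)^2]}\,\mathrm{d}r\,\mathrm{d}t
\le 2T^2\,\mathbb{E}\int_0^T(x^{(k)}_t)^2\,\mathrm{d}t \lesssim \lambda_k^{-2\gamma-2},
\]
and in particular,
\[
\sum_{N=1}^\infty\frac{\mathrm{Var}\left[\lambda_N^{1+2\alpha-\gamma}\int_0^Tx^{(N)}_tW^{(N)}_t\,\mathrm{d}t\right]}{\left(\mathbb{E}\int_0^T\langle(-A)^{1+2\alpha-\gamma}X^N_t,W^N_t\rangle\,\mathrm{d}t\right)^2}
\lesssim \sum_{N=1}^\infty\frac{N^{\beta(4\alpha-4\gamma)}}{N^{2+\beta(4\alpha-4\gamma)}} < \infty,
\]
such that by the strong law of large numbers [Shi96, Theorem IV.3.2], (3.56) holds a.s. for $\int_0^T\langle(-A)^{1+2\alpha-\gamma}X^N_t,W^N_t\rangle\,\mathrm{d}t$. Now, from (3.54), we see that
\[ \hat\theta^{\mathrm{full}}_N - \theta \xrightarrow{a.s.} -\theta, \]
which implies the claim.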
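The closed-form second moment used in the proof can be cross-checked numerically. The sketch below compares it with a midpoint-rule evaluation of $\sigma^2\lambda_k^{-2\gamma}\int_0^t\int_0^te^{-\theta\lambda_k(t-r)}e^{-\theta\lambda_k(t-r')}\min(r,r')\,\mathrm{d}r'\,\mathrm{d}r$, which equals the double integral above by symmetry. The function names and the parameter values are hypothetical.

```python
import numpy as np

def second_moment_closed_form(t, theta, lam, sigma=1.0, gamma=0.0):
    """E[(x_t^{(k)})^2] as stated in the proof of Theorem 3.15."""
    a = theta * lam
    bracket = t / 2 - 3 / (4 * a) - np.exp(-2 * a * t) / (4 * a) + np.exp(-a * t) / a
    return 2 * sigma**2 / (theta**2 * lam ** (2 * gamma + 2)) * bracket

def second_moment_quadrature(t, theta, lam, sigma=1.0, gamma=0.0, n=1000):
    """Midpoint rule for the covariance double integral, using E[W_r W_r'] = min(r, r')."""
    a = theta * lam
    r = (np.arange(n) + 0.5) * t / n
    weights = np.exp(-a * (t - r))
    cov = np.minimum.outer(r, r)
    return sigma**2 * lam ** (-2 * gamma) * (weights @ cov @ weights) * (t / n) ** 2

exact = second_moment_closed_form(1.0, theta=1.0, lam=3.0)
approx = second_moment_quadrature(1.0, theta=1.0, lam=3.0)
```

Both values agree up to the quadrature error, which supports the sign pattern of the four terms in the bracket.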
Remark 3.16. Integrated noise appears naturally if one considers systems such that the first component is observed, but only the second component is driven by noise. More precisely, the linear system
\[ \mathrm{d}X^O_t = \theta A_{11}X^O_t\,\mathrm{d}t + A_{12}X^U_t\,\mathrm{d}t, \tag{3.57} \]
\[ \mathrm{d}X^U_t = A_{21}X^O_t\,\mathrm{d}t + A_{22}X^U_t\,\mathrm{d}t + B_2\,\mathrm{d}W_t, \tag{3.58} \]
with $X^O_0 = 0$, $X^U_0 = 0$ and unknown $\theta$, can be formally rewritten as
\[ \mathrm{d}X^O_t = \theta AX^O_t\,\mathrm{d}t + F(X^O)(t)\,\mathrm{d}t + BW_t\,\mathrm{d}t, \tag{3.59} \]
where $A = A_{11}$, $B = A_{12}B_2$ and $F(X) = A_{12}A_{22}A_{12}^{-1}X + A_{12}A_{21}JX - \theta A_{12}A_{22}A_{12}^{-1}A_{11}JX$. Depending on the form of $A_{11}, A_{12}, A_{21}, A_{22}$ and $B_2$, this reasoning can be made rigorous. In order to neglect $F$, the regularity of all terms appearing in the (linear) system (3.57), (3.58) can be evaluated directly, or it can be shown that $S_F = F$ satisfies $(F_{s,\eta})$. If either of these approaches is feasible, the reduction to the theory from Section 2.3 as described above is applicable.
The extension of this setting to semilinear systems is possible by a regularity argument as in the previous sections, decomposing both components into their linearization and nonlinear remainder.
3.3 Structure of the Dispersion Operator
Set $\bar B = \sigma(-A)^{-\gamma}$. We call $\bar B$ the reference dispersion operator. In all models we have considered so far, we used $\bar B$ as dispersion operator. Now we study to which extent this assumption can be relaxed. W.l.o.g. we study only the white noise case. More precisely, consider
\[ \mathrm{d}X_t = \theta AX_t\,\mathrm{d}t + F(X_t)\,\mathrm{d}t + B(X_t)\,\mathrm{d}W_t \tag{3.60} \]
with initial condition $X_0$. Let $L_2(H)$ denote the space of Hilbert–Schmidt operators on $H$, with norm $\|\cdot\|_{HS}$. We demand that $B$ maps its domain $D(B) \subset H$ into $L_2(H)$. In direct analogy to (W), our standing assumption is well-posedness of (3.60) in the sense of a unique probabilistically and analytically weak solution in $C(0,T;H)$. Within the variational approach, as exposed in [LR15], this can be shown under Lipschitz and growth conditions on $B$ (and additional mild conditions on $F$). Here and in the sequel, we write $\tilde B(Z) = B(Z) - \bar B$ for the deviation from the reference dispersion operator, i.e. we consider lower order (possibly multiplicative) noise of the form
\[ B(Z) = \sigma(-A)^{-\gamma} + \tilde B(Z). \tag{3.61} \]
In order to transfer the results from the reference case, $B$ must be asymptotically similar to $\bar B$ in the following sense:
(Nγ
η)There is a locally bounded b: [0,)[0,)such that for ZH
and kN:
e
B(Z)TΦk
2
Hpb(ZH)λ12γη
k.(3.62)
This is a natural condition, as shown in the next lemma:
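Lemma 3.17. Let $\eta > 0$. If for $Z \in H$, $\tilde B(Z)$ is a linear bounded operator mapping $H$ into $H^r$ for some $r > 1+2\gamma+\eta$ such that the operator norm satisfies
\[ \|\tilde B(Z)\|^2_{H\to H^r} \le b(\|Z\|_H), \tag{3.63} \]
then condition $(N^\gamma_\eta)$ is satisfied.
Proof. In that case,
\[
\|\tilde B(Z)^T\Phi_k\|^2_H \le \|\tilde B(Z)^T(-A)^{r/2}\|^2_{H\to H}\,\|(-A)^{-r/2}\Phi_k\|^2_H
= \|(-A)^{r/2}\tilde B(Z)\|^2_{H\to H}\,\lambda_k^{-r}.
\]
Now $(-A)^{r/2}: H^r \to H$ is an isometry, and therefore
\[ \|\tilde B(Z)^T\Phi_k\|^2_H \le \|\tilde B(Z)\|^2_{H\to H^r}\,\lambda_k^{-r} \le b(\|Z\|_H)\,\lambda_k^{-r}, \]
which implies the claim.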
Example 3.18. Diagonal noise of the form B(Xk=bk(Xkfor func-
tions bk:HR,kN. Such diagonal dispersion terms have been con-
sidered e.g. in [CCG20], or in [CKL20] in the context of space-only noise.
Here, condition (Nγ
η)simplifies to
bk(Z)γ
kσ2pb(ZH)λ1η
k,(3.64)
which amounts to fast asymptotic equivalence of the modes bkand λγ
k.
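As before, let $\bar X$ be the solution to
\[ \mathrm{d}\bar X_t = \theta A\bar X_t\,\mathrm{d}t + \bar B\,\mathrm{d}W_t \tag{3.65} \]
with $\bar X_0 = 0$, and $\tilde X := X - \bar X$. In order to control the regularity of $\tilde X$, we extend the splitting argument as follows: Define $\bar X^F$ to be the solution of
\[ \mathrm{d}\bar X^F_t = \theta A\bar X^F_t\,\mathrm{d}t + B(X_t)\,\mathrm{d}W_t \tag{3.66} \]
with $\bar X^F_0 = 0$, such that $\tilde X^F := X - \bar X^F$ satisfies
\[ \mathrm{d}\tilde X^F_t = \theta A\tilde X^F_t\,\mathrm{d}t + F(X_t)\,\mathrm{d}t \tag{3.67} \]
with $\tilde X^F_0 = 0$. It follows from Proposition 3.19 below that $\bar X^F$ is well-posed. Next, with $\bar X^B := \bar X$, the process $\tilde X^B := \bar X^F - \bar X^B$ satisfies
\[ \mathrm{d}\tilde X^B_t = \theta A\tilde X^B_t\,\mathrm{d}t + \tilde B(X_t)\,\mathrm{d}W_t, \tag{3.68} \]
$\tilde X^B_0 = 0$. This means that the nonlinear process $\tilde X = \tilde X^F + \tilde X^B$ consists of two components, which contain the nonlinear behavior in the drift and the dispersion, respectively.
As before, we write $s^* = 1+2\gamma-1/\beta$.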
Proposition 3.19. Let $\gamma > 1/(2\beta)$ and $\eta > 0$. Under condition $(N^\gamma_\eta)$, we have $\tilde X^B \in R(s+\eta)$ for any $s < s^*$.
Proof. First, note that for any ZH, the operator (A)(s+η)/2e
B(Z)is a
Hilbert–Schmidt operator on H:
(A)(s+η)/2e
B(Z)
2
HS =
X
k=1 e
B(Z)T(A)(s+η)/2Φk
2
H
=
X
k=1
λs+η
ke
B(Z)TΦk
2
H
b(ZH)
X
k=1
λs+η12γηϵ
k
X
k=1
λ1ϵ
k<
for some ϵ > 0due to condition (Nγ
η). In particular, using XC(0, T;H)
a.s., we see that e
B
t:= (A)(s+η)/2e
B(Xt)is uniformly bounded in ∥·∥HS for
t[0, T]. Now, Theorem 4.2.4 of [LR15] implies that
dYt=θAYtdt+e
B
tdWt,(3.69)
Y0= 0, has a unique solution in C(0, T;H). As a consequence, Y=
(A)(s+η)/2e
XB, and the claim follows.
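Together with Proposition 2.3 (with $\bar X$ replaced by $\bar X^F = \bar X + \tilde X^B$ therein), we immediately obtain:
Theorem 3.20. Let $\gamma > 1/(2\beta)$, $s \in \mathbb{R}$ and $\eta > 0$. Assume $X_0 \in H^{s+\eta}$. If $(F_{s,\eta})$ and $(N^\gamma_\eta)$ are true and $X \in R(s)$, then $\tilde X = \tilde X^F + \tilde X^B \in R(s+\eta)$ almost surely.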
Remark 3.21. If in the proof of Proposition 3.19 it can be shown that Y=
(A)(s+η)/2e
XBhas continuous trajectories not only in Hbut also in V, then
we can conclude even e
XBR(s+η+1). In view of Lemma 3.17, this seems
natural: There, e
B(Z)must map Hinto Hrfor r > 1 + 2γ+η, whereas ¯
B
maps Hinto H2γ. In this sense, 1+ηshould be expected to be the “true” excess
regularity instead of η. However, in the general setting it is not clear if Y
has continuous trajectories in V, although there are sufficient criteria known
in literature. For example, in the case of additive noise B(Z)B, according
to Theorem 5.11 from [DPZ14] we have that YC(0, T;V)if the integral
RT
0t2αeA(A)(s+η+1)/2(B¯
B)2
HS dtis finite for some 0< α < 1/2.
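In particular, if $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are given by (2.24), (2.25) and (2.26), the results on diffusivity estimation from Theorem 2.11 transfer directly to the model studied in this section:
Theorem 3.22. Let $\gamma > 1/(2\beta)$, $\eta > 0\vee(1/\beta-1)$ and $s_0 \ge 0$, assume $X_0 \in H^{s^*+\eta}$ and $X \in R(s_0)$. Let $(N^\gamma_\eta)$ and $(F_{s,\eta})$ be true for $s_0 \le s < s^*$. Let $\alpha > \gamma-1/4$. Then the following asymptotic statements are true:
(i) $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal as in (2.29), and if $\eta > 1+1/\beta$, the same is true for $\hat\theta^{\mathrm{part}}_N$, $\hat\theta^{\mathrm{lin}}_N$.
(ii) In the case $\eta \le 1+1/\beta$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are consistent in probability with convergence rate $N^a$, $a < \beta\eta/2$, i.e.
\[ N^a\big(\hat\theta^{\mathrm{part}}_N-\theta\big) \xrightarrow{P} 0, \qquad N^a\big(\hat\theta^{\mathrm{lin}}_N-\theta\big) \xrightarrow{P} 0. \tag{3.70} \]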
Proof. By Theorem 3.20, RT
0(A)s/2XN
t2
Hdtsatisfies the asymptotics from
(2.20) whenever s > s. Now, by means of (Nγ
η)and XC(0, T;H),
ZT
0e
B(Xt)T(A)1+2αXN
t
2
HdtZT
0 N
X
k=1
λ1+2α
kx(k)
te
B(Xt)TΦkH!2
dt
psup
0tT
b(XtH)ZT
0 N
X
k=1
λ1/2+2αγη/2
kx(k)
t!2
dt
N
N
X
k=1
λ1+4α2γη
kZT
0
(x(k)
t)2dt
=NZT
0(A)1
2+2αγη
2XN
t
2
Hdt. (3.71)
In case $\alpha > \gamma+(\eta-1/\beta)/4$ we have $1+4\alpha-2\gamma-\eta > s^*$, and the latter term is dominated by $N^{2+\beta(4\alpha-4\gamma-\eta)}$. On the other hand, if $\alpha < \gamma+(\eta-1/\beta)/4$, the last integral converges, and the latter term is dominated by $N$. The case $\alpha = \gamma+(\eta-1/\beta)/4$ can be ignored by substituting $\eta \mapsto \eta-\epsilon$ for some small $\epsilon > 0$. In any of these cases, the right-hand side is negligible compared to $N^{1+\beta(1+4\alpha-4\gamma)}$, where we take into account $\eta > 1/\beta-1$ and $\alpha > \gamma-1/4$.
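In particular, using $B(X_t) = \sigma(-A)^{-\gamma}+\tilde B(X_t)$ and expanding the squared norm, we have a.s.
\[
\int_0^T\|B(X_t)^T(-A)^{1+2\alpha}X^N_t\|^2_H\,\mathrm{d}t \sim \sigma^2\int_0^T\|(-A)^{1+2\alpha-\gamma}X^N_t\|^2_H\,\mathrm{d}t
\sim \sigma^2C_{2+4\alpha-2\gamma}N^{1+\beta(1+4\alpha-4\gamma)}
\]
because the condition $\eta > 1/\beta-1$ ensures that $2+\beta(4\alpha-4\gamma-\eta) < 1+\beta(1+4\alpha-4\gamma)$, i.e. the remaining terms are of lower order. Consequently, the local martingale
\[ M^N_T := C^{-1/2}_{2+4\alpha-2\gamma}\,\sigma^{-1}N^{-1/2-\beta(1+4\alpha-4\gamma)/2}\int_0^T\langle B(X_t)^T(-A)^{1+2\alpha}X^N_t,\mathrm{d}W_t\rangle \]
is such that $\langle M^N\rangle_T \to 1$ a.s., and according to Theorem A.1 and the Slutsky lemma,
\[
N^{\frac{1+\beta}{2}}\big(\hat\theta^{\mathrm{full}}_N-\theta\big) = -N^{\frac{1+\beta}{2}}\frac{\int_0^T\langle B(X_t)^T(-A)^{1+2\alpha}X^N_t,\mathrm{d}W^N_t\rangle}{\int_0^T\|(-A)^{1+\alpha}X^N_t\|^2_H\,\mathrm{d}t}
= -\sigma C^{1/2}_{2+4\alpha-2\gamma}\frac{N^{1+\beta(1+2\alpha-2\gamma)}}{\int_0^T\|(-A)^{1+\alpha}X^N_t\|^2_H\,\mathrm{d}t}\,M^N_T
\]
converges to a normal distribution with mean zero and variance as given by (2.30). The remaining claims for $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ follow verbatim as in Theorem 2.11 (note that the conditions on $\alpha$ in this theorem are even more restrictive than in Theorem 2.11).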
Remark 3.23.
(i) In comparison with Theorem 2.11, there are two additional restrictions:
The excess regularity ηmust exceed 1 1, and further α > γ 1/4,
which is always stronger than the condition on αfrom Theorem 2.11.
Both inequalities are related to the control of the non–diagonal elements
in the noise term. In the setting of Example 3.18, both restrictions can
be avoided: Namely, in that case, (3.71) in the proof of Theorem 3.22
is substituted by
ZT
0e
B(Xt)T(A)1+2αXN
t
2
Hdt
=
N
X
k=1
λ2+4α
kZT
0bk(Xt)λγ
k2(x(k)
t)2dt
pZT
0(A)1/2+2αγη/2XN
t2
Hdt,
which is always dominated by N1+β(1+4α4γ)if η > 0and α > γ (1 +
1)/4, and the rest of the proof is identical. Note that if A= is
the Laplacian on a bounded domain, then β= 2/d, and the additional
condition η > 1 1is void in dimension d2.
(ii) If e
B(X)Tmaps Hsinto Hs+2γfor all sRwith e
B(X)THsHs+2γ
bs(XH)for some locally bounded bs, then it is straightforward to see
that for sR,(A)1+2αγXR(s)implies B(X)T(A)1+2αX
R(s). In this case, (3.70) can be strengthened to almost sure conver-
gence using Lemma 2.6, as in the proof of Theorem 2.11.
Chapter 4
Discretization of the Spectral
Approach
In this section we adapt the spectral approach to the case that the observa-
tions consist of a set of point evaluations of the process Xin space instead of
Fourier modes. It is determined how much spatial information is needed in
order to reconstruct the spectral asymptotics from Theorem 2.11, depending
on the regularity of the process.
By now, there is plenty of literature on statistical inference for SPDEs
based on spatially and/or temporally discretized observations. Various works
are based on the asymptotic analysis of power variations, either in time
[BT19, BT20, Cho20, Cho19, CD20, KU21a, KU21b], in space [CKL20,
CK22, SST20, CKP21], in time and space [PT07, CH20, MKT19a], or extend-
ing the approach to a combined spatiotemporal variation [HT21b, HT21a].
Within the spectral approach, however, there seems to be almost no rigorous
attempt to quantify the amount of spatial information needed to recover its
asymptotics for diffusivity estimation. We are aware only of [Hue93, p. 44ff.],
where this topic is sketched shortly (but without rigorous proof) in d= 1,
using first-order integral approximations. On the other hand, [CDVK20] con-
siders the discretization in time of the maximum likelihood estimator from
the spectral approach.
As in the previous sections, we consider a semilinear SPDE of the form
\[ \mathrm{d}X_t = \theta AX_t\,\mathrm{d}t + F(X)(t)\,\mathrm{d}t + B\,\mathrm{d}W_t \tag{4.1} \]
with initial condition $X_0$, where $A$ is a closed, densely defined, negative definite and self-adjoint operator with compact resolvent, $B = \sigma(-A)^{-\gamma}$ is of Hilbert–Schmidt type, and $F$ satisfies $(F_{s,\eta})$ for some $\eta > 0$ and $s_0 \le s < s^*$. W.l.o.g. we set $\sigma = 1$. We always assume that (4.1) is well-posed with $X \in R(s)$ for $s < s^*$. As we are interested in spatially discrete point evaluations, we have to make the abstract setting from Chapter 2 more specific: Let $\mathcal{D} \subset \mathbb{R}^d$ be a bounded domain with smooth boundary, such that the state space for $X$ is given by $H = L^2(\mathcal{D})$. For simplicity, we assume that $X$ satisfies Dirichlet boundary conditions. The eigenvalues $(\lambda_k)_{k\in\mathbb{N}}$ of $-A$ are assumed to satisfy
\[ \lambda_k \sim \Lambda k^{D/d} \tag{4.2} \]
for some $D > 0$, called the order of the operator $A$. It is well-known that (4.2) is true if $A$ is a (pseudo-)differential operator of order $D$ [Shu01]. For example, for $A = \Delta$ we have $D = 2$ and for $A = -\Delta^2$ we have $D = 4$, cf. Section 2.4. Note that $B$ is a Hilbert–Schmidt operator if and only if $\gamma > d/(2D)$.
For any $h > 0$, let $M_h \in \mathbb{N}$ and $(x^{(h)}_i)_{i=1,\dots,M_h} \subset \mathcal{D}$. Define the evaluation operator $E_h: C(\mathcal{D}) \to \mathbb{R}^{M_h}$ via $(E_hf)_i := f(x^{(h)}_i)$. Then each component of $E_h$ is a bounded multiplicative linear form on $C(\mathcal{D})$. We write $\langle\cdot,\cdot\rangle_{(h)}$ for the Euclidean scalar product on $\mathbb{R}^{M_h}$.
In order to apply $E_h$ to (4.1), we need $AX$ to have values in $C(\mathcal{D})$. Therefore, we make the standing assumption
\[ s^* > 2, \qquad AX \in L^\infty(0,T;C(\mathcal{D})). \tag{4.3} \]
For example, if $A = \Delta$, the latter condition holds if $s^* > d/2+2$, i.e. $\gamma > 1/2+d/2$, by means of the Sobolev embedding theorem. However, in many situations it is not necessary to use the Sobolev embedding theorem in order to prove continuity in space. For example, if the eigenfunctions $\Phi_k$ are uniformly bounded in $x \in \mathcal{D}$, $k \in \mathbb{N}$, then $s^* > 2$ already implies $AX \in L^\infty(0,T;C(\mathcal{D}))$, see Lemma 5.5 and Proposition 5.6 below. We note that for general $A$, $s^* > 2$ if and only if $\gamma > d/(2D)+1/2$.
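The parameter conditions of this setting are easy to tabulate. The following sketch assumes $s^* = 1+2\gamma-1/\beta$ with $\beta = D/d$ (as suggested by (4.2)); the function names are hypothetical, and exact rational arithmetic is used to avoid rounding at the thresholds.

```python
from fractions import Fraction

def s_star(gamma, d, D):
    """Critical regularity s* = 1 + 2*gamma - d/D (assuming beta = D/d)."""
    return 1 + 2 * gamma - Fraction(d, D)

def noise_is_hilbert_schmidt(gamma, d, D):
    """B = (-A)^(-gamma) is Hilbert-Schmidt iff gamma > d / (2D)."""
    return gamma > Fraction(d, 2 * D)

def satisfies_s_star_gt_2(gamma, d, D):
    """First part of the standing assumption (4.3)."""
    return s_star(gamma, d, D) > 2

# Example: Laplacian (D = 2) in dimension d = 1 with gamma = 2.
example = s_star(Fraction(2), d=1, D=2)
```

For the Laplacian in $d = 1$ with $\gamma = 2$ this gives $s^* = 9/2$, so both the Hilbert–Schmidt condition and $s^* > 2$ hold.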
Similarly, we always assume that the terms $E_hF(X)$ and $E_hBW$ are well-defined. If $A = \Delta$, the former is true e.g. if $F$ is of the form $F(X) = f(X)$ for a function $f:\mathbb{R}\to\mathbb{R}$ by (4.3), and the latter can be enforced e.g. by imposing the additional bound $\gamma > d/2$, such that $BW \in L^2(0,T;W^{d/2,2}(\mathcal{D}))$, and spatial point evaluations are well-defined again by means of the Sobolev embedding theorem. In this situation, $E_hX$ satisfies
\[ \mathrm{d}E_hX_t = \theta E_hAX_t\,\mathrm{d}t + E_hF(X)(t)\,\mathrm{d}t + E_hB\,\mathrm{d}W_t. \tag{4.4} \]
Let r> r0such that rsD/2and r<(s1)D/2. Fix a
Banach space BrHfor each r< r r, such that Br2Br1for r1< r2.
Further, let h>0. We need the following conditions:
(D0)For r< r r,Bris a Banach algebra, and BrC(D).
(D1)For any r< r r:
ΦkBrkr/d.(4.5)
(D2)For any 0< h < h, there are real numbers w(h)
1, . . . , w(h)
Mhsuch that for
any r< r rand ZBr,
ZD
Zdx
Mh
X
i=1
w(h)
i(EhZ)ihrZBr.(4.6)
Condition (D2)relies on higher order quadrature formulas, reflecting the reg-
ularity of X. In the examples, the scale (Br)r<rrwill consist of Hölder
spaces or L2-based Sobolev spaces. Since these spaces are supposed to mea-
sure the regularity of X, we need
XL(0, T;Br) for r < sD/2.(4.7)
This is immediate if Brcoincides with H2r/D, otherwise it has to be proven
separately. In all our examples, this will be valid.
Remark 4.1. We emphasize that the index of the regularity space $H^s$ from Chapter 2 does not count spatial derivatives, but fractional powers of $(-A)$. If $D \ne 2$, this is not the same. This is why the factor $D/2$ arises e.g. in (D0) and related relations below.
Denote by WhRMh×Mhthe diagonal matrix with entries (w(h)
i)1iMh
and set
Eh=WhEh.(4.8)
Note that for Z1, Z2C(D), it holds Eh(Z1Z2) = EhZ1EhZ2in the sense
of componentwise multiplication, and therefore:
⟨WhEhZ1, EhZ2(h)=
Mh
X
i=1
w(h)
iEh(Z1Z2).(4.9)
With that notation, a direct consequence of (D0),(D1)and (D2)is
Φk, Z⟨WhEhΦk, EhZ(h)hrkr/d ZBr.(4.10)
We use the following discretized version of A:
A(s/2)
h,N :=
N
X
k=1
λs/2
k(EhΦk)(EhΦk)T.(4.11)
Based on these considerations, we want to adapt the maximum-likelihood
based estimator ˆ
θlin
Nto the case of spatially discrete observations. Its natural
discrete analogue ˆ
θdiscr
h,N is given in the present setting as follows:
ˆ
θdiscr
h,N =RT
0DA(1+2α)
h,N (EhXt),d(EhXt)E(h)
RT
0DA(2+2α)
h,N EhXt,EhXtE(h)dt
.(4.12)
The error decomposition of ˆ
θdiscr
h,N reads as:
ˆ
θdiscr
h,N =
θRT
0DA(1+2α)
h,N EhXt,EhAXtE(h)dt
RT
0DA(2+2α)
h,N EhXt,EhXtE(h)dt
RT
0DA(1+2α)
h,N EhXt,EhF(X)(t)E(h)dt
RT
0DA(2+2α)
h,N EhXt,EhXtE(h)dtRT
0DA(1+2α)
h,N EhXt,EhBdWtE
RT
0DA(2+2α)
h,N EhXt,EhXtE(h)dt
=θ+θI(2)
h,N (2 + 2α)IN(2 + 2α)+IN(2 + 2α)I(1)
h,N (2 + 2α)
I(1)
h,N (2 + 2α)
F(1+2α)
h,N F(1+2α)
N+F(1+2α)
N
I(1)
h,N (2 + 2α)pM(h,N)T
I(1)
h,N (2 + 2α)
M(h,N)
T
pM(h,N)T
,
(4.13)
where we abbreviate
IN(s) = ZT
0(A)s/2XN
t2dt, (4.14)
I(1)
h,N (s) = ZT
0DA(s)
h,N EhXt,EhXtE(h)dt, (4.15)
I(2)
h,N (s) = ZT
0DA(s1)
h,N EhXt,Eh(A)XtE(h)dt, (4.16)
F(1+2α)
N=ZT
0(A)1+2αXN
t, PNF(X)(t)dt, (4.17)
F(1+2α)
h,N =ZT
0DA(1+2α)
h,N EhXt,EhF(X)(t)E(h)dt, (4.18)
M(h,N)
T=ZT
0DA(1+2α)
h,N EhXt,Eh(A)γdWtE,(4.19)
and (M(h,N)
t)t0is a local martingale with quadratic variation
M(h,N)T=ZT
0(A)γ(Eh)A(1+2α)
h,N EhXt
2
Hdt. (4.20)
Proposition 4.2. Assume that (D0),(D1),(D2)hold. Let R0and s > s.
(i) If we have
hpN2
dK(1)
d,D,R(γ), K(1)
d,D,R(γ) := 4Dγ + 2Dd+ 2dR
4Dγ + 2D2d,(4.21)
then a.s. I(1)
h,N (s)IN(s)NRIN(s),(4.22)
and in particular, I(1)
h,N (s)IN(s).
(ii) If we have
hpN2
dK(2)
d,D,R(γ), K(2)
d,D,R(γ) := 4Dγ 2Dd+ 2dR
4Dγ 2D2d,(4.23)
then a.s. I(2)
h,N (s)IN(s)NRIN(s),(4.24)
and in particular, I(2)
h,N (s)IN(s).
(iii) Let α > (γ(1 + d/D)/4) (γ1/2). If
hpN2
dK(M)
d,D (γ), K(M)
d,D (γ) := 2Dγ
2Dγ d,(4.25)
then a.s. M(h,N)TIN(2 + 4α2γ)as N .
(iv) Let α > (γ(1d/D)/4)(η/41)(η/4), where ηis as in (Fs,η).
If hpN2K(2)
d,D,R(γ)/d, then a.s.
F(1+2α)
h,N F(1+2α)
NNRIN(s).(4.26)
It holds K(1)
d,D,R(γ)< K(2)
d,D,R(γ). If R= 0, then K(1)
d,D,R(γ)< K(M)
d,D (γ), and if
R > 1/2, then K(M)
d,D (γ)< K(2)
d,D,R(γ).
Note that the denominators in (4.21), (4.23), (4.25) are positive if and only if $s^* > 0$, $s^* > 2$, $s^* > 1$, resp., which is satisfied by (4.3).
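The ordering of the three constants can be checked for concrete parameters. The sketch below implements the formulas from (4.21), (4.23) and (4.25) with exact rational arithmetic; the function names `K1`, `K2`, `KM` and the sample parameters are illustrative assumptions.

```python
from fractions import Fraction

def K1(gamma, d, D, R):
    """Constant K^(1)_{d,D,R}(gamma) from (4.21)."""
    return Fraction(4 * D * gamma + 2 * D - d + 2 * d * R) / Fraction(4 * D * gamma + 2 * D - 2 * d)

def K2(gamma, d, D, R):
    """Constant K^(2)_{d,D,R}(gamma) from (4.23); requires s* > 2."""
    return Fraction(4 * D * gamma - 2 * D - d + 2 * d * R) / Fraction(4 * D * gamma - 2 * D - 2 * d)

def KM(gamma, d, D):
    """Constant K^(M)_{d,D}(gamma) from (4.25)."""
    return Fraction(2 * D * gamma) / Fraction(2 * D * gamma - d)

# Laplacian (D = 2) in d = 1 with gamma = 2, so that s* = 9/2 > 2.
g, d, D = Fraction(2), 1, 2
values = (K1(g, d, D, 0), K2(g, d, D, 0), KM(g, d, D), K2(g, d, D, 1))
```

For these parameters one obtains $K^{(1)} = 19/18$, $K^{(2)} = 11/10$ and $K^{(M)} = 8/7$ at $R = 0$, and $K^{(2)} = 13/10$ at $R = 1$, in line with the stated relations.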
Proof. We write for $r, s, h > 0$, $N \in \mathbb{N}$ and $Z \in L^2(0,T;H)$:
\[ L^{(h,N)}_{s,r}(Z) := h^r\sum_{k=1}^N\lambda_k^{s+r/D}\int_0^T|\langle\Phi_k,Z_t\rangle|\,\mathrm{d}t. \tag{4.27} \]
Now let Z(1) L(0, T;Br1)and Z(2) L(0, T;Br2). Then
ZT
0DA(s)
h,N EhZ(1)
t,EhZ(2)
tE(h)dt
=
N
X
k=1
λs
kZT
0DWhEhΦk, EhZ(1)
tE(h)DWhEhΦk, EhZ(2)
tE(h)dt
as well as
ZT
0D(A)sPNZ(1)
t, PNZ(2)
tEdt=
N
X
k=1
λs
kZT
0DΦk, Z(1)
tEDΦk, Z(2)
tEdt.
Consequently, using |ab AB|≤|aA||b|+|a||bB|+|aA||bB|for
a, b, A, B R, together with (4.10), we obtain
ZT
0D(A)sPNZ(1)
t, PNZ(2)
tEdtZT
0DA(s)
h,N EhZ(1)
t,EhZ(2)
tE(h)dt
N
X
k=1
λs
kZT
0hr1kr1/d Z(1)
tBr1DΦk, Z(2)
tE
+hr2kr2/d Z(2)
tBr2DΦk, Z(1)
tE+hr1kr1/d Z(1)
tBr1dt
sup
0tTZ(1)
tBr1
L(h,N)
s,r1(Z(2)) + sup
0tTZ(2)
tBr2
L(h,N)
s,r2(Z(1))
+Tsup
0tTZ(1)
tBr1
sup
0tTZ(2)
tBr2
hr1+r2
N
X
k=1
λs
kk(r1+r2)/d
L(h,N)
s,r1(Z(2)) + L(h,N)
s,r2(Z(1)) + hr1+r2
N
X
k=1
k(sD+r1+r2)/d
L(h,N)
s,r1(Z(2)) + L(h,N)
s,r2(Z(1)) + hr1+r2N1+(sD+r1+r2)/d.
Thus, in order to bound the approximation error stemming from spatial discretization, we have to control the terms $L^{(h,N)}_{s,r_1}(Z^{(2)})$, $L^{(h,N)}_{s,r_2}(Z^{(1)})$ and
\[ L^{\mathrm{rest}}_{s,r_1,r_2} := h^{r_1+r_2}N^{1+(sD+r_1+r_2)/d}. \tag{4.28} \]
Based on this consideration, we prove the different cases separately. We will repeatedly use Jensen's inequality in the form $\sum_{k=1}^Na_k^{1/2} \le (N\sum_{k=1}^Na_k)^{1/2}$ (in fact, if $a_k \sim k^r$ for some $r > -1$, then both sides grow as $N^{1+r/2}$). Further, note that for $0 < a \le A$, the function $(A+x+y)/(a+x)$ is decreasing in $x > -a$ and increasing in $y \in \mathbb{R}$. This implies the relation between $K^{(1)}_{d,D,R}(\gamma)$, $K^{(2)}_{d,D,R}(\gamma)$ and $K^{(M)}_{d,D}(\gamma)$ from the statement, as well as similar relations used in the estimates below.
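(i) In order to control $I^{(1)}_{h,N}(s)$, we set $Z^{(1)} = Z^{(2)} = X$. For $\epsilon > 0$, let $r = r_1 = r_2 := s^*D/2-\epsilon$. W.l.o.g. assume that $\epsilon < Ds^*$. Then
\[
L^{(h,N)}_{s,r}(X) \le \sqrt{T}\,h^r\sum_{k=1}^N\left(\lambda_k^{2s+2r/D}\int_0^T(x^{(k)}_t)^2\,\mathrm{d}t\right)^{1/2}
\lesssim h^r\left(N\sum_{k=1}^N\lambda_k^{2s+2r/D}\int_0^T(x^{(k)}_t)^2\,\mathrm{d}t\right)^{1/2}
\]
\[
= h^r\left(N\int_0^T\|(-A)^{s+r/D}X^N_t\|^2\,\mathrm{d}t\right)^{1/2}
\lesssim h^r\left(N^{2+\frac{D}{d}(2s+\frac{2r}{D}-2\gamma-1)}\right)^{1/2} = h^rN^{1+\frac{D}{d}(s+\frac{r}{D}-\gamma-\frac12)},
\]
where we made use of Proposition 2.8. This is possible due to $0 < \epsilon < Ds^*$, i.e. $s+r/D > s^*/2$. Now by assumption (4.21), we can choose $\epsilon > 0$ small enough such that
\[ h \lesssim N^{-\frac{2}{d}\cdot\frac{4D\gamma+2D-d+2Rd-2\epsilon}{4D\gamma+2D-2d-4\epsilon}}. \tag{4.29} \]
In particular, with $I_N(s) = \int_0^T\|(-A)^{s/2}X^N_t\|^2\,\mathrm{d}t \sim N^{1+D(s-2\gamma-1)/d}$, and $r = s^*D/2-\epsilon = D(1+2\gamma)/2-d/2-\epsilon$, it follows that $L^{(h,N)}_{s,r}(X) \lesssim N^{-R}I_N(s)$. It remains to bound $L^{\mathrm{rest}}_{s,r,r}$: We have $L^{\mathrm{rest}}_{s,r,r} \lesssim N^{-R}I_N(s)$ whenever
\[ h \lesssim N^{-\frac{2}{d}\cdot\frac{4D\gamma+2D-d+Rd-2\epsilon}{4D\gamma+2D-2d-4\epsilon}}, \]
and this follows from (4.29) for any $R \ge 0$. In total, we have shown that $|I^{(1)}_{h,N}(s)-I_N(s)| \lesssim N^{-R}I_N(s)$, and in particular, $I^{(1)}_{h,N}(s) = I_N(s)+(I^{(1)}_{h,N}(s)-I_N(s)) \sim I_N(s)$.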
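(ii) Here, $Z^{(1)} = X$, $Z^{(2)} = (-A)X$, $r_1 = s^* D/2 - \epsilon$ and $r_2 = (s^* - 2)D/2 - \epsilon$ for some $\epsilon > 0$, where w.l.o.g. $\epsilon < D(s - 2)$. The terms can be controlled as follows:
\[
\begin{aligned}
L^{(h,N)}_{s-1,r_1}((-A)X) &\le \sqrt{T}\, h^{r_1} \sum_{k=1}^{N} \left( \lambda_k^{2s-2+2r_1/D} \int_0^T (\lambda_k x^{(k)}_t)^2 \, dt \right)^{1/2} \\
&\lesssim h^{r_1} \left( N \sum_{k=1}^{N} \lambda_k^{2s+2r_1/D} \int_0^T (x^{(k)}_t)^2 \, dt \right)^{1/2},
\end{aligned}
\]
which is the same term appearing in (i). As $K^{(1)}_{d,D,R}(\gamma) < K^{(2)}_{d,D,R}(\gamma)$, (4.23) implies that $L^{(h,N)}_{s-1,r_1}((-A)X) \lesssim N^{-R} I_N(s)$. Further,
\[
\begin{aligned}
L^{(h,N)}_{s-1,r_2}(X) &\le \sqrt{T}\, h^{r_2} \sum_{k=1}^{N} \left( \lambda_k^{2s-2+2r_2/D} \int_0^T (x^{(k)}_t)^2 \, dt \right)^{1/2} \\
&\lesssim h^{r_2} \left( N \sum_{k=1}^{N} \lambda_k^{2s-2+2r_2/D} \int_0^T (x^{(k)}_t)^2 \, dt \right)^{1/2} \\
&= h^{r_2} \left( N \int_0^T \big\| (-A)^{s-1+r_2/D} X^N_t \big\|^2 \, dt \right)^{1/2} \\
&\lesssim h^{r_2} \left( N^{2 + \frac{D}{d}\left(2s - 2 + \frac{2r_2}{D} - 2\gamma - 1\right)} \right)^{1/2} = h^{r_2} N^{1 + \frac{D}{d}\left(s - \gamma - \frac{3}{2} + \frac{r_2}{D}\right)}.
\end{aligned}
\]
In the last line we have used $s > s^*$ and $\epsilon < D(s - 2)$, i.e. $s - 1 + r_2/D > s^*/2$, such that Proposition 2.8 is applicable. As before, we can choose $\epsilon > 0$ small enough such that
\[
h \lesssim N^{-\frac{2}{d} \cdot \frac{4D\gamma - 2D - d + 2dR - 2\epsilon}{4D\gamma - 2D - 2d - 4\epsilon}}. \tag{4.30}
\]
Using $r_2 = (s^* - 2)D/2 - \epsilon = (-1 + 2\gamma)D/2 - d/2 - \epsilon$ and again the asymptotics of $I_N(s)$, we conclude $L^{(h,N)}_{s-1,r_2}(X) \lesssim N^{-R} I_N(s)$. Finally, with (4.28), it is clear that $L^{\mathrm{rest}}_{s-1,r_1,r_2} \lesssim N^{-R} I_N(s)$ whenever
\[
h \lesssim N^{-\frac{2}{d} \cdot \frac{4D\gamma - d + dR - 2\epsilon}{4D\gamma - 2d - 4\epsilon}}, \tag{4.31}
\]
which is a consequence of (4.30) for $R \ge 0$. Now the claim follows as in (i).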
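(iii) First note that for $i, j \in \mathbb{N}$ and $R_{ij} := \delta_{ij} - \langle W_h E_h \Phi_i, E_h \Phi_j \rangle_{(h)}$, we have by (4.10) and (D1):
\[
|R_{ij}| \lesssim (h\, i^{1/d} j^{1/d})^q \tag{4.32}
\]
for each $\underline{r} < q \le \overline{r}$. We expand the quadratic variation of $M^{(h,N)}$: For $j \in \mathbb{N}$, we have by definition of $A^{(1+2\alpha)}_{h,N}$:
\[
\big\langle \Phi_j, (-A)^{-\gamma} (E_h)^* A^{(1+2\alpha)}_{h,N} E_h X_t \big\rangle
= \lambda_j^{-\gamma} \sum_{k=1}^{N} \lambda_k^{1+2\alpha} \langle W_h E_h \Phi_j, E_h \Phi_k \rangle_{(h)} \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)},
\]
and consequently,
\[
\begin{aligned}
\langle M^{(h,N)} \rangle_T &= \int_0^T \big\| (-A)^{-\gamma} (E_h)^* A^{(1+2\alpha)}_{h,N} E_h X_t \big\|^2 \, dt \\
&= \sum_{j=1}^{\infty} \lambda_j^{-2\gamma} \int_0^T \left( \sum_{k=1}^{N} \lambda_k^{1+2\alpha} \langle W_h E_h \Phi_j, E_h \Phi_k \rangle_{(h)} \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)} \right)^2 dt \\
&= \sum_{j=1}^{\infty} \lambda_j^{-2\gamma} \sum_{k,l=1}^{N} \lambda_k^{1+2\alpha} \lambda_l^{1+2\alpha} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)} \langle W_h E_h \Phi_l, E_h X_t \rangle_{(h)} \, dt \\
&\qquad\qquad \times \langle W_h E_h \Phi_j, E_h \Phi_k \rangle_{(h)} \langle W_h E_h \Phi_j, E_h \Phi_l \rangle_{(h)} \\
&= \sum_{k=1}^{N} \lambda_k^{2+4\alpha-2\gamma} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)}^2 \, dt \\
&\quad - 2 \sum_{k,l=1}^{N} \lambda_k^{1+2\alpha-2\gamma} \lambda_l^{1+2\alpha} R_{kl} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)} \langle W_h E_h \Phi_l, E_h X_t \rangle_{(h)} \, dt \\
&\quad + \sum_{k,l=1}^{N} \lambda_k^{1+2\alpha} \lambda_l^{1+2\alpha} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)} \langle W_h E_h \Phi_l, E_h X_t \rangle_{(h)} \, dt \times \sum_{j=1}^{\infty} \lambda_j^{-2\gamma} R_{jk} R_{jl} \\
&=: A_1 - 2A_2 + A_3.
\end{aligned}
\]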
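We have $2 + 4\alpha - 2\gamma > s^*$ due to $\alpha > \gamma - (1 + d/D)/4$, and further, $K^{(1)}_{d,D,R}(\gamma) < K^{(M)}_{d,D}(\gamma)$ (with $R = 0$ in the first term), so part (i) yields
\[
A_1 = I^{(1)}_{h,N}(2 + 4\alpha - 2\gamma) \sim I_N(2 + 4\alpha - 2\gamma).
\]
It remains to find bounds for $A_2$ and $A_3$. We start with the latter term. For $\underline{r} < q < D\gamma - d/2 = (s^* - 1)D/2$, (4.32) and the Cauchy–Schwarz inequality give
\[
|A_3| \lesssim h^{2q} \left( \sum_{k=1}^{N} \left( \lambda_k^{2+4\alpha} k^{2q/d} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)}^2 \, dt \right)^{1/2} \right)^2 \times \sum_{j=1}^{\infty} \lambda_j^{-2\gamma} j^{2q/d},
\]
where the last sum is finite. Again using (i), it follows that
\[
\begin{aligned}
|A_3| &\lesssim h^{2q} N \sum_{k=1}^{N} \lambda_k^{2+4\alpha+2q/D} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)}^2 \, dt \\
&= h^{2q} N\, I^{(1)}_{h,N}(2 + 4\alpha + 2q/D) \\
&\lesssim h^{2q} N\, I_N(2 + 4\alpha + 2q/D) \\
&\lesssim h^{2q} N^{2 + \frac{D}{d}\left(2 + 4\alpha + \frac{2q}{D} - 2\gamma - 1\right)},
\end{aligned}
\]
where we have used $2 + 4\alpha + 2q/D > 2 + 4\alpha - 2\gamma > s^*$. Now choose $q = (s^* - 1)D/2 - \epsilon = D\gamma - d/2 - \epsilon$ for some $\epsilon > 0$, and in addition let $\epsilon$ be small enough such that by (4.25),
\[
h \lesssim N^{-\frac{2}{d} \cdot \frac{2D\gamma - \epsilon}{2D\gamma - d - 2\epsilon}}. \tag{4.33}
\]
Then we immediately obtain $|A_3| \ll I_N(2 + 4\alpha - 2\gamma)$. The bound on $A_2$ is similar. With (4.32) and $\underline{r} < q \le \overline{r}$,
\[
\begin{aligned}
|A_2| &\lesssim h^{q} \sum_{k=1}^{N} \left( \lambda_k^{2+4\alpha-4\gamma} k^{2q/d} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)}^2 \, dt \right)^{1/2} \\
&\qquad \times \sum_{l=1}^{N} \left( \lambda_l^{2+4\alpha} l^{2q/d} \int_0^T \langle W_h E_h \Phi_l, E_h X_t \rangle_{(h)}^2 \, dt \right)^{1/2} \\
&=: h^{q} B_1 B_2.
\end{aligned}
\]
The sums $B_1$ and $B_2$ are treated as before, using part (i). For $B_1$,
\[
\begin{aligned}
B_1 &\lesssim \left( N \sum_{k=1}^{N} \lambda_k^{2+4\alpha-4\gamma+\frac{2q}{D}} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)}^2 \, dt \right)^{1/2} \\
&= \left( N\, I^{(1)}_{h,N}(2 + 4\alpha - 4\gamma + 2q/D) \right)^{1/2} \\
&\lesssim \left( N^{2 + \frac{D}{d}\left(1 + 4\alpha - 6\gamma + \frac{2q}{D}\right)} \right)^{1/2} = N^{1 + \frac{D}{d}\left(\frac{1}{2} + 2\alpha - 3\gamma + \frac{q}{D}\right)},
\end{aligned}
\]
if $2 + 4\alpha - 4\gamma + 2q/D > s^*$, which is the case for $q = s^* D/2 \wedge \overline{r}$ due to $\alpha > \gamma - 1/2$. Similarly $B_2 \lesssim N^{1 + D(1/2 + 2\alpha - \gamma + q/D)/d}$. Therefore,
\[
|A_2| \lesssim h^{q} N^{2 + \frac{D}{d}\left(1 + 4\alpha - 4\gamma + \frac{2q}{D}\right)}.
\]
Now (4.33) implies
\[
h \lesssim N^{-\frac{2}{d} \cdot \frac{2D\gamma + D}{2D\gamma - d + D}}. \tag{4.34}
\]
With $q = s^* D/2 \wedge \overline{r}$, we get $|A_2| \ll I_N(2 + 4\alpha - 2\gamma)$. Putting things together,
\[
\langle M^{(h,N)} \rangle_T = A_1 - 2A_2 + A_3 \sim I_N(2 + 4\alpha - 2\gamma) - 2A_2 + A_3 \sim I_N(2 + 4\alpha - 2\gamma).
\]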
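(iv) Set $Z^{(1)} = X$, $Z^{(2)} = F(X)$, $r_1 = s^* D/2 - \epsilon$ and $r_2 = (s^* - 2 + \eta)D/2 - \epsilon$. Let $\epsilon$ be small enough such that
\[
h \lesssim N^{-\frac{2}{d} \cdot \frac{4D\gamma + 2D - d + 2dR - D\eta}{4D\gamma + 2D - 2d - 4\epsilon}}, \tag{4.35}
\]
\[
h \lesssim N^{-\frac{2}{d} \cdot \frac{4D\gamma - 2D - d + D\eta + 2dR - 2\epsilon}{4D\gamma - 2D - 2d + 2D\eta - 4\epsilon}}, \tag{4.36}
\]
\[
h \lesssim N^{-\frac{2}{d} \cdot \frac{8D\gamma - 2d + D\eta + 2dR - 4\epsilon}{8D\gamma - 4d + 2D\eta - 8\epsilon}}, \tag{4.37}
\]
which is possible due to $h \lesssim_p N^{-2K^{(2)}_{d,D,R}(\gamma)/d}$. Since $\alpha > \eta/4 - 1$, it holds that $1 + 2\alpha + (r_1 - r_2)/D = 2 + 2\alpha - \eta/2 > 0$, and consequently,
\[
\begin{aligned}
L^{(h,N)}_{1+2\alpha,r_1}(F(X)) &\lesssim h^{r_1} \left( N \sum_{k=1}^{N} \lambda_k^{2+4\alpha+\frac{2r_1}{D}} \int_0^T \langle \Phi_k, F(X)(t) \rangle^2 \, dt \right)^{1/2} \\
&= h^{r_1} \left( N \int_0^T \big\| (-A)^{1+2\alpha+\frac{r_1}{D}} F(X)(t) \big\|_H^2 \, dt \right)^{1/2} \\
&\lesssim h^{r_1} \left( N \lambda_N^{2+4\alpha+\frac{2r_1 - 2r_2}{D}} \int_0^T \big\| (-A)^{\frac{r_2}{D}} F(X)(t) \big\|_H^2 \, dt \right)^{1/2} \\
&\lesssim h^{r_1} N^{\frac{1}{2} + \frac{D}{d}\left(1 + 2\alpha + \frac{r_1 - r_2}{D}\right)} = h^{r_1} N^{\frac{1}{2} + \frac{D}{d}\left(1 + 2\alpha + \frac{2 - \eta}{2}\right)},
\end{aligned}
\]
and a direct calculation using (4.35) and $r_1 = (1 + 2\gamma)D/2 - d/2 - \epsilon$ yields $L^{(h,N)}_{1+2\alpha,r_1}(F(X)) \lesssim N^{-R} I_N(2 + 2\alpha)$. Further, we have $\alpha > -\eta/4$, and if $\epsilon < (4\alpha + \eta)D/2$, it follows that $2 + 4\alpha + 2r_2/D > s^*$, thus
\[
\begin{aligned}
L^{(h,N)}_{1+2\alpha,r_2}(X) &\lesssim h^{r_2} \left( N \sum_{k=1}^{N} \lambda_k^{2+4\alpha+\frac{2r_2}{D}} \int_0^T (x^{(k)}_t)^2 \, dt \right)^{1/2} \\
&= h^{r_2} \left( N \int_0^T \big\| (-A)^{1+2\alpha+\frac{r_2}{D}} X^N_t \big\|_H^2 \, dt \right)^{1/2} \\
&\lesssim h^{r_2} N^{1 + \frac{D}{d}\left(\frac{1}{2} + 2\alpha - \gamma + \frac{r_2}{D}\right)},
\end{aligned}
\]
and (4.36) together with $r_2 = (-1 + 2\gamma + \eta)D/2 - d/2 - \epsilon$ gives $L^{(h,N)}_{1+2\alpha,r_2}(X) \lesssim N^{-R} I_N(2 + 2\alpha)$. Finally, (4.37) can be reformulated as $L^{\mathrm{rest}}_{1+2\alpha,r_1,r_2} \lesssim N^{-R} I_N(2 + 2\alpha)$. This finishes the proof.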
Motivated by the previous proposition, we define
\[
K_{d,D}(\gamma) := K^{(2)}_{d,D,\frac{1}{2}+\frac{D}{2d}}(\gamma) = \frac{4D\gamma - D}{4D\gamma - 2D - 2d}.
\]
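Theorem 4.3. In the setting of this section, assume (D0), (D1) and (D2), and further
\[
h \lesssim_p N^{-\frac{2}{d} K_{d,D}(\gamma)}. \tag{4.38}
\]
With $\eta$ from $(F_{s,\eta})$, let $\alpha > (\gamma - (1 + d/D)/4) \vee (\gamma - 1/2) \vee (\eta/4 - 1) \vee (-\eta/4)$.
(i) If $\eta > 1 + d/D$, then
\[
N^{\frac{1}{2} + \frac{D}{2d}} \big( \hat{\theta}^{\mathrm{discr}}_{h,N} - \theta \big) \xrightarrow{d} \mathcal{N}(0, \Sigma) \tag{4.39}
\]
as $N \to \infty$, $h \to 0$, where $\Sigma$ is given by (2.30).
(ii) If $\eta \le 1 + d/D$, then
\[
N^{a} \big( \hat{\theta}^{\mathrm{discr}}_{h,N} - \theta \big) \xrightarrow{P} 0 \tag{4.40}
\]
for any $a < D\eta/(2d)$ as $N \to \infty$, $h \to 0$.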
Proof. This follows directly from the decomposition of $\hat{\theta}^{\mathrm{discr}}_{h,N}$ as in (4.13): By Theorem A.1 and Proposition 4.2 (iii), $M^{(h,N)}_T / (\langle M^{(h,N)} \rangle_T)^{1/2} \xrightarrow{d} \mathcal{N}(0, 1)$. Further, $N^{1/2 + D/(2d)} (\langle M^{(h,N)} \rangle_T)^{1/2} / I^{(1)}_{h,N}(2 + 2\alpha) \to \Sigma^{1/2}$. Next, the term $F^{(1+2\alpha)}_N / I^{(2)}_{h,N}(2 + 2\alpha) \sim F^{(1+2\alpha)}_N / I_N(2 + 2\alpha)$ is treated exactly as in Theorem 2.11, which leads to the case distinction $\eta > 1 + d/D$ compared to $\eta \le 1 + d/D$. Finally, all other terms from (4.13) converge to zero with rate $N^{-1/2 - D/(2d)}$ by Proposition 4.2, and the claim follows from the Slutsky lemma.
Remark 4.4.
(i) For fixed $d \ge 1$ and $D > 0$, we have $K_{d,D}(\gamma) \to 1$ for $\gamma \to \infty$ (or equivalently $s^* \to \infty$). This means that for large spatial regularity of $X$, a spatial precision of order $h \lesssim_p N^{-2/d}$ is sufficient in order to transfer the asymptotic results from the classical spectral approach.
(ii) On a bounded domain in dimension $d$, one typically has $M \asymp h^{-d}$ point observations. (For example, let $\mathcal{D}$ be a hypercube in $\mathbb{R}^d$, where the point evaluation grid consists of points which are aligned along the coordinate lines in an equidistant way.) In the large regularity setting from the previous comment, this number of point observations leads to the relation
\[
N^2 \lesssim_p M. \tag{4.41}
\]
This relation does not depend on the dimension $d$. In this sense, a given resolution level $N$ can be recovered from a dimension-independent number of point observations (if the spatial regularity of $X$ is large) within the spectral approach.
(iii) On the level of the estimator for diffusivity, this means the following: In the setting of the previous comment, let additionally $F = 0$ for simplicity, such that $\hat{\theta}^{\mathrm{lin}}_N$ converges to $\theta$ with optimal rate $N^{-1/2 - D/(2d)}$. In terms of the number $M$ of point observations, this corresponds to the convergence rate $M^{-1/4 - D/(4d)}$ for $\hat{\theta}^{\mathrm{discr}}_{h,N}$ (neglecting terms of arbitrarily small polynomial order in $N$ in the relation $N^2 \lesssim_p M$). This rate is upper bounded by $M^{-1/4}$. While this bound decays rather slowly in $M$, it holds uniformly in $d$.¹ Therefore, sparse observations in a high-dimensional setting can still yield reasonable results.
(iv) Below, we explain how to further improve this bound on the convergence rate of $\hat{\theta}^{\mathrm{discr}}_{h,N}$ in $M$ by tightening (4.10).
(v) We point out that $I^{(1)}_{h,N}(s)$ need not be a good approximation for $I_N(s)$. In fact, according to Proposition 4.2, the absolute error $|I^{(1)}_{h,N}(s) - I_N(s)|$ may even diverge, but slowly compared to the energy $I_N(s)$ itself. The same is true for the other approximation terms.
(vi) We highlight that there is no assumption on the shape of $\mathcal{D}$ or the distribution of the point evaluations within $\mathcal{D}$ other than the integral approximation property from (D2). To the best of our knowledge, Theorem 4.3 is the first rigorous asymptotic result for diffusivity estimation based on such general discrete point evaluation schemes.
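The quantities in Remark 4.4 are easy to tabulate. The following is a minimal sketch, not part of the original text: the function names `K_dD` and `rate_exponents` are hypothetical, and the positivity check on the denominator is an added assumption (the formula is only meaningful when $\gamma$ is large enough).

```python
def K_dD(gamma, d, D):
    """K_{d,D}(gamma) = (4*D*gamma - D) / (4*D*gamma - 2*D - 2*d).

    Assumption (not stated in the text): gamma is large enough that the
    denominator is positive.
    """
    denom = 4 * D * gamma - 2 * D - 2 * d
    if denom <= 0:
        raise ValueError("gamma too small: denominator must be positive")
    return (4 * D * gamma - D) / denom


def rate_exponents(d, D):
    """Return (a_N, a_M): the error decays like N**(-a_N), resp. M**(-a_M).

    a_N is the spectral rate exponent 1/2 + D/(2d); a_M follows from the
    large-regularity relation N^2 ~ M of (4.41), up to arbitrarily small
    polynomial corrections.
    """
    a_N = 0.5 + D / (2 * d)
    a_M = 0.25 + D / (4 * d)
    return a_N, a_M


if __name__ == "__main__":
    for gamma in (2, 5, 50, 500):
        print(gamma, K_dD(gamma, d=2, D=2))   # approaches 1 as gamma grows
    print(rate_exponents(d=1, D=2))
```

For growing `gamma` the printed values of `K_dD` approach 1, which is the large regularity regime of Remark 4.4 (i); `a_M` stays above 1/4 for every dimension `d`.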
¹ Note, however, that for fixed $\gamma$, the deviation from the large spatial regularity setting still depends on $d$ in the term $K_{d,D}(\gamma)$, i.e. the regularity of $X$ needed to approach the large regularity regime grows with $d$.
Next, we consider different cases in which higher order approximation estimates allow us to connect to the assumptions from Theorem 4.3.
Example 4.5 (Quadrature formulas in $d = 1$). Let $L > 0$ and $\mathcal{D} = [0, L]$. Further, let $A = -(-\Delta)^{D/2}$. With Dirichlet boundary conditions, we have $\Phi_k(x) = \sqrt{2} \sin(\pi k x / L)$ and $\lambda_k = (\pi k / L)^D$. For $k \in \mathbb{N}_0$, equip the space of $k$ times differentiable functions $C^k(\mathcal{D})$ with the norm $\|f\|_{C^k} = \sum_{i=0}^{k} \|\partial_x^i f\|_\infty$. Further, for $r > 0$ let $\mathcal{C}^r(\mathcal{D})$ be the Hölder–Zygmund space with the norm $\|\cdot\|_{\mathcal{C}^r} = \|\cdot\|_\infty + |\cdot|_{\mathcal{C}^r}$. Here, $|f|_{\mathcal{C}^r} = \sup_{x \in \mathcal{D},\, 0 < h < 1 \text{ with } x + Kh \in \mathcal{D}} h^{-r} |\Delta_h^K f(x)|$ for any $K > r$, where $\Delta_h f = f(\cdot + h) - f$ is the difference operator. Different choices of $K > r$ lead to equivalent norms. For integer $r \in \mathbb{N}$, the spaces $C^r(\mathcal{D})$ and $\mathcal{C}^r(\mathcal{D})$ do not coincide, but the former is a subspace of the latter. See e.g. [Tri10a, Tri10b] or [GN15, Chapter 4] for further details on these spaces. We will use that the Hölder–Zygmund spaces can be identified as interpolation spaces between $C(\mathcal{D})$ and $C^k(\mathcal{D})$ for any $k \in \mathbb{N}$, see e.g. [Lun95, Chapter 1] for details. The scale of Banach spaces $(B_r)_{r > \underline{r}}$ is given by $(\mathcal{C}^r(\mathcal{D}))_{r > 0}$ with $\underline{r} = 0$. The upper bound $\overline{r}$ is arbitrary. By Lemma 5.5 below, (4.7) is true. It is clear that the $\mathcal{C}^r(\mathcal{D})$ are Banach algebras [Tri10a, Section 2.8.3], so (D0) is trivially satisfied. Further, for $r > 0$,
\[
\begin{aligned}
\|\Phi_k\|_{\mathcal{C}^r} \lesssim 1 + |\Phi_k|_{\mathcal{C}^r} &\le 1 + \sup_{x \in \mathbb{R},\, h > 0} h^{-r} |(\Delta_h^K \Phi_k)(x)| \\
&= 1 + \sup_{x' \in \mathbb{R},\, h' > 0} k^r (h')^{-r} |(\Delta_{h'}^K \Phi_1)(x')| \lesssim k^r,
\end{aligned}
\]
where we substituted $x' = kx$ and $h' = kh$. Thus, (D1) is satisfied.
Fix $\overline{h} > 0$. For $0 < h < \overline{h}$, let $M_h \in \mathbb{N}$ such that $M_h \asymp h^{-1}$ for $h \to 0$. Let $\pi^{(h)} = \{x^{(h)}_1, x^{(h)}_2, \dots, x^{(h)}_{M_h}\}$ be a partition of $M_h$ points in $[0, L]$. Let $E_h$ be the point evaluation operator associated to $\pi^{(h)}$. We consider quadrature formulas of the form $Q^{(h)}(f) = \sum_{i=1}^{M_h} w^{(h)}_i f(x^{(h)}_i)$ for some weights $w^{(h)}_i \in \mathbb{R}$. Let $k \in \mathbb{N}$, $k > \overline{r}$. Typically, $Q^{(h)}$ satisfies an error estimate of the form
\[
\left| \int_{\mathcal{D}} f \, dx - Q^{(h)}(f) \right| \lesssim M_h^{-k} \|f\|_{C^k} \tag{4.42}
\]
for $f \in C^k(\mathcal{D})$. Examples include the composite Newton–Cotes formulas of order $k$ on equidistant partitions, or Gaussian quadrature formulas, where $\|\cdot\|_{C^k}$ can even be replaced by an $L^2$-based Sobolev norm. This is well-known, see for example [QSS00]. The right-hand side of (4.42) can be bounded by $h^k \|f\|_{C^k}$ (up to a constant), and the exact interpolation theorem [AF03, Theorem 7.23], applied to the operator $\int_{\mathcal{D}} \cdot \, dx - Q^{(h)}$, extends the resulting estimate to all $0 \le r \le \overline{r}$, where the norm on the right-hand side of (4.42) is replaced by $\|\cdot\|_{\mathcal{C}^r}$. Thus (D2) holds with the weight matrix $W_h$ determined by the quadrature weights $w^{(h)}_i$. Consequently, Theorem 4.3 is applicable in this setting.
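To make the quadrature setting of Example 4.5 concrete, the following sketch approximates the Fourier coefficient $\langle \Phi_k, f \rangle$ by a weighted sum over point evaluations. It is an illustration under assumptions not taken from the text: composite Simpson weights on an equidistant grid, $L = 1$, and the test function $f(x) = x(1 - x)$, whose sine coefficients are known in closed form. All function names are hypothetical.

```python
import numpy as np


def simpson_weights(M, L=1.0):
    """Composite Simpson weights on M + 1 equidistant nodes in [0, L] (M even)."""
    if M % 2:
        raise ValueError("M must be even")
    w = np.ones(M + 1)
    w[1:-1:2] = 4.0          # odd interior nodes
    w[2:-1:2] = 2.0          # even interior nodes
    return w * (L / M) / 3.0


def discrete_coefficient(f, k, M, L=1.0):
    """Quadrature approximation of <Phi_k, f> with Phi_k = sqrt(2) sin(pi k x / L)."""
    x = np.linspace(0.0, L, M + 1)
    phi_k = np.sqrt(2.0) * np.sin(np.pi * k * x / L)
    return np.sum(simpson_weights(M, L) * phi_k * f(x))


def exact_coefficient(k):
    """<Phi_k, f> for f(x) = x(1 - x) on [0, 1]: sqrt(2) * 2 (1 - (-1)^k) / (pi k)^3."""
    return np.sqrt(2.0) * 2.0 * (1.0 - (-1.0) ** k) / (np.pi * k) ** 3


if __name__ == "__main__":
    f = lambda x: x * (1.0 - x)
    for k in (1, 3, 9, 15):
        err = abs(discrete_coefficient(f, k, M=64) - exact_coefficient(k))
        print(k, err)
```

Printing the errors for a fixed grid and increasing `k` gives a feel for how the discretization error of a Fourier mode depends on the mode index, which is the effect quantified by (4.10).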
Example 4.6 (Finite element method in d2).Let A=(∆)D/2, set
Br=H2r/D =Wr,2(D), let r=d/2and rNarbitrary. Condition (D0)
is immediate, and for (D1), note that ΦkBr=Φk2r/D =λr/D
kkr/d. In
order to describe the discretization operator Ehand the approximation prop-
erty (D2), we make use of results from the theory of finite elements. The
finite element method is a standard approach from numerical analysis with
a huge body of literature, we follow the exposition from [Cia02]. As we are
interested in point evaluations, we only consider Lagrange finite elements.
Let K0be a compact reference domain (typically a simplex or cube in ddi-
mensions) with non-empty interior. Fix rpoints y(1)
0, . . . , y(r)
0K0, and
let P0be a r-dimensional space of polynomial functions defined on K0, such
that for p0P0,p0(y(i)
0) = 0 for 1irimplies p0= 0. Then there are r
polynomials p(1)
0, . . . , p(r)
0P0such that p(i)
0(y(j)
0) = δij, and the interpolation
operator Π0:C(K0)P0, given by (P0f)(x) = Pr
i=1 f(y(i)
0)p(i)
0(x), is well-
defined and acts as the identity on P0. Now we partition the domain Dinto
a family (Kj)j=1,...,L of compact domains with open interior (which overlap
only on their boundaries), such that there is a diffeomorphism Fj:K0Kj
for 1jL. Let Pjconsist of the pullback of functions p0P0via F1
j,
i.e. Pj={p0F1
j|p0P0}, and set y(i)
j:= Fj(y(i)
0). The interpolation
operator on Kjis given by Πjf= 0(fFj)) F1
j. Typically, the Fj
are affine functions, which leads to a partition of polygonal domains D, but
also curved elements Kjare possible (see e.g. [Zlá73], [Cia02, Chapter 4.3]),
which allow to handle a smooth boundary of D. We assume mild compati-
bility criteria on the partition of D: The images of the faces of K0under
the Fjhave disjoint (d1)-dimensional interior or coincide. Further, for
fC(D), the interpolation polynomials Πjf|Kjcoincide on the boundaries
of the Kj, such that there is a well-defined interpolation operator Πacting
on C(D). For each 1jL, let hjdenote the diameter of Kjand ρj
the diameter of the largest ball contained in Kj. We assume that there is a
constant C > 0such that hjjCfor all j. Finally, let h:= max1jLhj
be the mesh size of the partition of D.
Let {x(h)
i}1iMhbe the set of all Mhpoints y(k)
jin the interior of D,
where 1jL,1kr, and the labeling is arbitrary. Using Dirichlet
boundary conditions, we can neglect evaluations at the boundary of the do-
main by assuming f= 0 on D. Now, Ehis the operator mapping fC(D)
to the vector of point evaluations f(x(h)
i)for 1iMh. The weights
w(h)
iare given by w(h)
i:= RDΠfidxfor any function fiC(D)that satisfies
fi(x(h)
j) = δij,1i, j Mh, and vanishes on D.
We denote by |f|2
k,2=P|α|=k|αf|2
L2(D)the L2-Sobolev seminorm of order
k, where αis a multi-index. It is well-known that for 0kr,
fΠfL2(D)hk|f|k,2(4.43)
see e.g. [Cia02, Theorem 3.2.1] for the affine case. Together with the obvious
estimate |f|k,2 fBkand
ZD
fdx
Mh
X
i=1
w(h)
if(x(h)
i) fΠfL1(D)fΠfL2(D),(4.44)
we obtain that (4.6) is true for integer r, and the exact interpolation theorem
[AF03, Theorem 7.23], applied to the operator IΠ, extends this estimate
to general 0rr. In particular, (D2)is true.
In total, Theorem 4.3 is applicable in this setting. Note that the finite
element method provides a very flexible approach to discretizing D, which
allows to handle point evaluations schemes way beyond a rectangular grid.
Finally, we outline a possibility to further improve the results from The-
orem 4.3 and Remark 4.4. An explicit understanding of the discretization
error of the Fourier modes, which is not based on the universal approxima-
tion error from (D2), can improve the bounds on the number of spatial points
needed in order to recover the spectral approach from discrete observations.
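Our standing assumption for the rest of this section is that for each $h > 0$, the vectors $E_h \Phi_1, \dots, E_h \Phi_{M_h} \in \mathbb{R}^{M_h}$ are linearly independent. Consider the operator
\[
T_h Z := \sum_{k=1}^{M_h} \langle E_h Z, E_h \Phi_k \rangle_{(h)} \Phi_k \tag{4.45}
\]
for $Z \in C(\mathcal{D})$. It clearly satisfies
\[
\langle T_h Z, \Phi_k \rangle = \langle E_h Z, E_h \Phi_k \rangle_{(h)} \tag{4.46}
\]
for $1 \le k \le M_h$. In particular, if $\langle E_h \Phi_k, E_h \Phi_\ell \rangle_{(h)} = \langle \Phi_k, \Phi_\ell \rangle$ for $1 \le k, \ell \le M_h$, then the left-hand side of (4.46) can be replaced with $\langle E_h T_h Z, E_h \Phi_k \rangle_{(h)}$, and the linear independence of $(E_h \Phi_k)_{1 \le k \le M_h}$ implies that $T_h Z$ coincides with $Z$ at the observation points. In this case, $T_h$ is the interpolation operator at the given observation points associated to the basis functions $(\Phi_k)_{1 \le k \le M_h}$.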
Instead of (D2), we assume the following:
$(D_2')$ For any $0 < h < \overline{h}$, $\underline{r} < r \le \overline{r}$ and $Z \in B_r$,
\[
\|Z - T_h Z\|_{L^2(\mathcal{D})} \lesssim h^r \|Z\|_{B_r}. \tag{4.47}
\]
This immediately implies
\[
\big| \langle Z, \Phi_k \rangle - \langle E_h Z, E_h \Phi_k \rangle_{(h)} \big| = |\langle Z - T_h Z, \Phi_k \rangle| \le \|Z - T_h Z\|_{L^2(\mathcal{D})} \lesssim h^r \|Z\|_{B_r} \tag{4.48}
\]
for $1 \le k \le M_h$. If $h \lesssim_p N^{-1/d}$ for $N \to \infty$ and $h \to 0$, then (4.48) is true for all $1 \le k \le N$ at least asymptotically. Note that (4.48) is an improvement over (4.10) by a factor $k^{r/d}$.
This leads to an additional gain in rate in the proof of Proposition 4.2. As a consequence, in Theorem 4.3, condition (4.38) can be relaxed:
Theorem 4.7. In the present setting, let (D0), (D1) and $(D_2')$ hold. Then there is a function $K'_{d,D}$, with $K'_{d,D}(\gamma) \to 1$ for $\gamma \to \infty$, such that (4.38) can be substituted by
\[
h \lesssim_p N^{-(1/d) K'_{d,D}(\gamma)} \tag{4.49}
\]
without changing the conclusions of Theorem 4.3.
It follows that (4.41) can be improved to
\[
N \lesssim_p M, \tag{4.50}
\]
i.e. $N$ point evaluations suffice in order to recover the spectral resolution level $N$ for diffusivity estimation in the large regularity regime (neglecting terms of arbitrarily small polynomial order). Consequently, the convergence rate of $\hat{\theta}^{\mathrm{discr}}_{h,N}$ is described by $M^{-1/2 - D/(2d)}$ in Remark 4.4.
We shortly compare our result to related literature. Note that the con-
struction of ˆ
θdiscr
h,N is independent of the dispersion intensity σ, which may be
treated as unknown. In fact, in [HT21b] it is shown that under spatially
and temporally discrete observations of a stochastic heat equation driven by
space-time white noise in $d = 1$, the parameters $\theta$ and $\sigma$ can be jointly estimated at rate $M^{-3/2} = M^{-1/2 - D/(2d)}$ if the observation scheme is balanced,
or if the resolution in time exceeds that of a balanced observation scheme. Of
course, as we work with time-continuous observations, we may always assume
arbitrarily high resolution in time. In this sense, Theorem 4.7 is compatible
with [HT21b]. On the other hand, if σis treated as known, the observation of
the process continuously in time at a single point in space suffices to recover
θ, see e.g. [PT07, CH20].
In contrast to the integral approximation estimate from (D2), bounds
on the approximation error of Thas in (D
2)seem to be harder to obtain.
An important example is given by a uniform observation grid on a periodic
domain:
Example 4.8. Let $d = 1$, let w.l.o.g. $\mathcal{D}$ be an interval of length $2\pi$, and consider periodic (instead of Dirichlet) boundary conditions. Then we can identify $\mathcal{D} \cong \mathbb{R}/(2\pi\mathbb{Z})$. Let $B_r = W^{r,2}(\mathcal{D})$. The observation grid is assumed to be spatially uniform. In this case, $\langle E_h \Phi_k, E_h \Phi_\ell \rangle_{(h)} = \langle \Phi_k, \Phi_\ell \rangle$ for $1 \le k, \ell \le M_h$, so $T_h$ is a trigonometric interpolation operator, and $(D_2')$ holds [KO79]. This can be extended to rectangular domains with a uniform point evaluation grid in larger dimension $d \ge 2$ [Pas80]. See also [CHQZ88, Chapter 9], [QSS00, Chapter 10.9], [SV02, Chapter 8] for discussions of trigonometric interpolation.
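The discrete orthogonality used in Example 4.8 can be checked numerically. The sketch below is an illustration under assumptions that are not in the text: a real Fourier basis on $\mathbb{R}/(2\pi\mathbb{Z})$, an odd number $M$ of grid points, and equal quadrature weights $2\pi/M$ in the discrete inner product. The function names are hypothetical.

```python
import numpy as np


def fourier_basis_on_grid(M):
    """Evaluate the first M real orthonormal Fourier modes on a uniform grid.

    Returns the grid x (M points on [0, 2 pi)) and an M x M array whose rows
    are the evaluated basis functions. M is assumed odd in this sketch.
    """
    if M % 2 == 0:
        raise ValueError("use an odd number of grid points in this sketch")
    x = 2.0 * np.pi * np.arange(M) / M
    rows = [np.full(M, 1.0 / np.sqrt(2.0 * np.pi))]
    for k in range(1, (M - 1) // 2 + 1):
        rows.append(np.cos(k * x) / np.sqrt(np.pi))
        rows.append(np.sin(k * x) / np.sqrt(np.pi))
    return x, np.array(rows)


def discrete_gram(M):
    """Gram matrix of the evaluated basis w.r.t. the equally weighted inner product."""
    _, E = fourier_basis_on_grid(M)
    w = 2.0 * np.pi / M
    return w * E @ E.T


def interpolate_at_grid(Z, M):
    """Values of T_h Z at the grid points, with T_h as in (4.45)."""
    x, E = fourier_basis_on_grid(M)
    w = 2.0 * np.pi / M
    coeffs = w * E @ Z(x)        # discrete inner products with the basis
    return x, E.T @ coeffs       # sum_k coeff_k * Phi_k(x_i)
```

On a uniform grid the Gram matrix is the identity up to rounding, and `interpolate_at_grid` reproduces the values of `Z` at the grid points, so $T_h$ acts as a trigonometric interpolation operator in this setting.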
Nonetheless, even for non-uniform observation point grids in d= 1, the
situation is less clear. Recent works [Aus16, AT17] indicate that the trigono-
metric interpolation operator on a non-uniform grid has diminished approx-
imation power, at least in ∥·∥, with a convergence rate depending on the
deviation from the uniform point grid. While in this case we cannot expect
that Thand the trigonometric interpolation operator coincide, this gives a
hint that the validity of (D
2)can be more involved than (D2).
It is an interesting question for further research if (4.50) and the resulting
(optimal) convergence rate in Mfor diffusivity estimators can be achieved in
non-rectangular domains in dimension d2that do not arise as the tensor
product of one-dimensional intervals, or if it is possible to find a domain D
and an observation point distribution within Dsuch that (4.41) cannot be
improved.
Chapter 5
The Local Approach
This chapter is an adaptation of material from [ACP20].
The local approach to parameter estimation for SPDEs is a recent de-
velopment different from (and in some sense complementary to) the spectral
approach. It has been introduced in [AR21] for the stochastic heat equation.
[ACP20] generalizes the theory to semilinear models and [ABJR21] applies
the local approach to the stochastic Meinhardt model. The novelty from
the local approach is its observation scheme. It is assumed that a spatially
localized average of the solution process Xis observed on [0, T], which is for-
mally realized as the convolution with a compactly supported kernel. This
is a physically realistic assumption in many cases. As the support of the
kernel shrinks (corresponding to observing Xwith high resolution at a point
in space) the true diffusivity can be recovered.
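Let $\mathcal{D} \subset \mathbb{R}^d$ be a bounded open domain with smooth boundary. For $s \in \mathbb{N}$, $p \ge 1$ denote by $W^{s,p}(\mathcal{D})$, $W^{s,p}_0(\mathcal{D})$ the Sobolev spaces as in [AF03], and for non-integer $s \ge 0$, their complex interpolation spaces. Let $\Delta$ be the Laplacian, with domain $W^{2,p}(\mathcal{D}) \cap W^{1,p}_0(\mathcal{D})$ in $L^p(\mathcal{D})$ (see e.g. [GT01, Chapter 9.6]). For $s \ge 0$, $p \ge 1$, let $H^{s,p}(\mathcal{D}) := D_p((-\Delta)^{s/2}) \subset L^p(\mathcal{D})$ be the maximal domain of definition of the fractional Laplacian $(-\Delta)^{s/2}$ acting as a closed, densely defined operator on $L^p(\mathcal{D})$, cf. [Yag10, Chapter 16]. $H^{s,p}(\mathcal{D})$ is equipped with the norm $\|\cdot\|_{s,p} := \|(-\Delta)^{s/2} \cdot\|_{L^p(\mathcal{D})}$ and is a closed subspace of $W^{s,p}(\mathcal{D})$ for any $s \ge 0$ and $p \ge 1$. For $s < 1$, $H^{s,p}(\mathcal{D})$ is the completion of $L^p(\mathcal{D})$ w.r.t. the norm $\|\cdot\|_{s,p}$. It is clear that for $s, s' \in \mathbb{R}$ and $p \ge 1$, $(-\Delta)^{s'/2}$ maps $H^{s+s',p}(\mathcal{D})$ into $H^{s,p}(\mathcal{D})$. We write $R_p(s) := L^\infty(0, T; H^{s,p}(\mathcal{D}))$. These spaces are equipped with their natural norm $\|Z\|_{R_p(s)} = \sup_{0 \le t \le T} \|Z_t\|_{s,p}$.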
Finally, let $\Delta_0$ be the Laplacian as a closed, densely defined operator on $L^2(\mathbb{R}^d)$. See e.g. [Tri10a, Tri10b] for more details on function spaces.
In this chapter, we consider a semilinear SPDE
\[
dX_t = \theta \Delta X_t \, dt + F(X)(t) \, dt + B \, dW_t \tag{5.1}
\]
together with Dirichlet boundary conditions $X_t = 0$ on $\partial\mathcal{D}$ for all $0 \le t \le T$, and initial condition $X_0$. As in the previous chapters, $F : C(0, T; L^2(\mathcal{D})) \supseteq D(F) \to L^1(0, T; L^2(\mathcal{D}))$ is a nonlinear operator. $W$ is a cylindrical Wiener process, and $B$ is a dispersion operator of Hilbert–Schmidt type, such that $B : L^2(\mathcal{D}) \to H^{2\gamma,2}(\mathcal{D})$ is a linear isomorphism for some $\gamma > d/4$.¹ Further assumptions on $B$ will be given below in condition (LB).
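In order to formalize the local asymptotics, we define for $\delta > 0$, $x_0 \in \mathcal{D}$ and $Z \in L^2(\mathbb{R}^d)$:
\[
\mathcal{D}_{\delta,x_0} := \delta^{-1}(\mathcal{D} - x_0) = \{\delta^{-1}(x - x_0) \mid x \in \mathcal{D}\}, \tag{5.2}
\]
\[
Z_{\delta,x_0}(x) := \delta^{-d/2} Z(\delta^{-1}(x - x_0)). \tag{5.3}
\]
Then $Z_{\delta,x_0} \in L^2(\mathbb{R}^d)$, and $(\cdot)_{\delta,x_0}$ maps $L^2(\mathcal{D}_{\delta,x_0})$ onto $L^2(\mathcal{D})$ with
\[
\|Z_{\delta,x_0}\|_{L^2(\mathcal{D})} = \|Z\|_{L^2(\mathcal{D}_{\delta,x_0})} \tag{5.4}
\]
for all $\delta > 0$, $x_0 \in \mathcal{D}$. More generally, for any $1 < q < \infty$ and $Z \in L^q(\mathcal{D}_{\delta,x_0})$, (5.3) implies
\[
\begin{aligned}
\|Z_{\delta,x_0}\|_{L^q(\mathcal{D})} &= \left( \int_{\mathcal{D}} |Z_{\delta,x_0}(x)|^q \, dx \right)^{1/q}
= \left( \int_{\mathcal{D}} \Big( \delta^{-\frac{dq}{4} + \frac{d}{2}} \big(|Z|^{q/2}\big)_{\delta,x_0}(x) \Big)^2 dx \right)^{1/q} \\
&= \delta^{-\frac{d}{2} + \frac{d}{q}} \big\| (|Z|^{q/2})_{\delta,x_0} \big\|^{2/q}_{L^2(\mathcal{D})}
= \delta^{-\frac{d}{2} + \frac{d}{q}} \big\| |Z|^{q/2} \big\|^{2/q}_{L^2(\mathcal{D}_{\delta,x_0})} \\
&= \delta^{-\frac{d}{2} + \frac{d}{q}} \|Z\|_{L^q(\mathcal{D}_{\delta,x_0})}
\end{aligned} \tag{5.5}
\]
for $\delta > 0$, $x_0 \in \mathcal{D}$. Note that the spaces $\mathcal{D}_{\delta,x_0}$ can be seen as non-asymptotic tangential spaces of $\mathcal{D}$ at $x_0$: Formally, for $\delta \to 0$, the tangential space $T_{x_0}\mathcal{D} \cong \mathbb{R}^d$ is recovered. The behavior of the fractional Laplacian under localization is given by the following result: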
1In fact, it suffices to have (5.18) for all s > 1+2γd/2.
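Lemma 5.1 ([AR21], [ACP20]). For $s \in \mathbb{R}$, $p \ge 2$, $\delta > 0$, $x_0 \in \mathcal{D}$ and $Z \in H^{s,p}(\mathcal{D}_{\delta,x_0})$, we have
\[
(-\Delta)^{s/2} Z_{\delta,x_0} = \delta^{-s} \big( (-\Delta)^{s/2} Z \big)_{\delta,x_0}. \tag{5.6}
\]
For $s \in 2\mathbb{N}$, this is a consequence of the chain rule.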
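The scaling relations (5.4) and (5.5) can be verified numerically in $d = 1$. The following is a small sketch under assumptions not taken from the text: $\mathcal{D} = (0, 1)$, the profile $Z(y) = (1 - y^2)^3$ on $|y| < 1$, and plain Riemann sums on two independent grids. The names `bump`, `lq_norm` and `check_scaling` are hypothetical.

```python
import numpy as np


def bump(y):
    """Compactly supported profile Z(y) = (1 - y^2)^3 for |y| < 1, zero otherwise."""
    return np.where(np.abs(y) < 1.0, (1.0 - y**2) ** 3, 0.0)


def lq_norm(values, dx, q):
    """Riemann-sum approximation of the L^q norm on a uniform grid."""
    return (np.sum(np.abs(values) ** q) * dx) ** (1.0 / q)


def check_scaling(delta=0.1, x0=0.5, q=3.0, n=200001):
    """Return both sides of (5.5) for d = 1 and D = (0, 1)."""
    d = 1
    x = np.linspace(0.0, 1.0, n)                               # grid on D
    y = np.linspace(-x0 / delta, (1.0 - x0) / delta, n // 2)   # grid on D_{delta,x0}
    Z_loc = delta ** (-d / 2) * bump((x - x0) / delta)         # definition (5.3)
    lhs = lq_norm(Z_loc, x[1] - x[0], q)
    rhs = delta ** (-d / 2 + d / q) * lq_norm(bump(y), y[1] - y[0], q)
    return lhs, rhs
```

Calling `check_scaling(q=2.0)` illustrates the isometry (5.4), while other values of `q` exhibit the additional factor $\delta^{-d/2 + d/q}$ from (5.5).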
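Here and in the sequel, we fix a kernel $K \in W^{2,2}(\mathbb{R}^d)$ with compact support. We identify $K_{\delta,x_0}$, defined via (5.3), with its restriction to $\mathcal{D}$. Then $K_{\delta,x_0} \in W^{2,2}(\mathcal{D})$ [LM72, Remark 8.1]. For small enough $\delta$, the support of $K_{\delta,x_0}$ is compactly contained in $\mathcal{D}$. In particular, the boundary trace operators of $K_{\delta,x_0}$ of any differential order are zero on $\partial\mathcal{D}$. Then, clearly, $K_{\delta,x_0} \in H^{2,2}(\mathcal{D})$. W.l.o.g. we restrict to that case in the sequel.
Now, by assumption, we observe a local average of $X$, namely $X$ tested against $K_{\delta,x_0}$. In addition, we also need $X$ tested against $\Delta K_{\delta,x_0}$:
\[
X^K_{\delta,x_0} := \langle X, K_{\delta,x_0} \rangle_{L^2(\mathcal{D})} = \int_{\mathcal{D}} X K_{\delta,x_0} \, dx, \tag{5.7}
\]
\[
X^\Delta_{\delta,x_0} := \langle X, \Delta K_{\delta,x_0} \rangle_{L^2(\mathcal{D})} = \int_{\mathcal{D}} X \Delta K_{\delta,x_0} \, dx. \tag{5.8}
\]
It is immediate from (5.1) that the dynamics of $X^K_{\delta,x_0}$ is determined by the one-dimensional stochastic differential equation
\[
dX^K_{\delta,x_0}(t) = \theta X^\Delta_{\delta,x_0}(t) \, dt + \langle F(X)(t), K_{\delta,x_0} \rangle \, dt + \|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})} \, dW_{\delta,x_0}(t) \tag{5.9}
\]
with initial condition $X^K_{\delta,x_0}(0) = \langle X_0, K_{\delta,x_0} \rangle_{L^2(\mathcal{D})}$, where the process $W_{\delta,x_0} := \langle B^* K_{\delta,x_0}, W \rangle / \|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})}$ is a one-dimensional Brownian motion.
If $X^K_{\delta,x_0}$ and $X^\Delta_{\delta,x_0}$ are observed on $[0, T]$, the natural MLE-type estimator, called augmented MLE [AR21], is given by
\[
\hat{\theta}_{\delta,x_0} = \frac{\int_0^T X^\Delta_{\delta,x_0}(t) \, dX^K_{\delta,x_0}(t)}{\int_0^T X^\Delta_{\delta,x_0}(t)^2 \, dt}. \tag{5.10}
\]
By means of (5.9), $\hat{\theta}_{\delta,x_0}$ can be decomposed as follows:
\[
\hat{\theta}_{\delta,x_0} - \theta = \frac{\int_0^T X^\Delta_{\delta,x_0}(t) \langle F(X)(t), K_{\delta,x_0} \rangle_{L^2(\mathcal{D})} \, dt}{\int_0^T X^\Delta_{\delta,x_0}(t)^2 \, dt}
+ \|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})} \frac{\int_0^T X^\Delta_{\delta,x_0}(t) \, dW_{\delta,x_0}(t)}{\int_0^T X^\Delta_{\delta,x_0}(t)^2 \, dt}. \tag{5.11}
\]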
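A time-discretized version of the augmented MLE (5.10) can be sketched as follows. This is an illustration under assumptions that go beyond the text: the linear case $F = 0$ in $d = 1$ on $(0, 1)$ with $B = \sigma(-\Delta)^{-\gamma}$, a truncated spectral representation with exact Ornstein–Uhlenbeck updates for the modes, the kernel profile $(1 - u^2)^3$, and Itô sums in place of the continuous-time integrals. The function names and all default parameter values are hypothetical, and the accuracy of the Itô sums depends on the time step being small relative to $1/(\theta \lambda_k)$ for the retained modes.

```python
import numpy as np


def augmented_mle(x_K, x_Lap, dt):
    """Ito-sum version of (5.10) from samples on a uniform time grid.

    x_K holds samples of <X, K_delta>, x_Lap samples of <X, Laplace K_delta>.
    """
    dX = np.diff(x_K)                      # increments of the local average
    num = np.sum(x_Lap[:-1] * dX)          # integrand at the left endpoint
    den = np.sum(x_Lap[:-1] ** 2) * dt
    return num / den


def simulate_local_observations(theta, delta, x0=0.5, T=0.1, n_steps=20000,
                                n_modes=32, sigma=1.0, gamma=0.5, seed=0):
    """Simulate both local observation processes for the linear equation."""
    rng = np.random.default_rng(seed)
    k = np.arange(1, n_modes + 1)
    lam = (np.pi * k) ** 2                           # Dirichlet eigenvalues on (0, 1)
    x = np.linspace(0.0, 1.0, 4001)
    dx = x[1] - x[0]
    u = (x - x0) / delta
    K = delta ** -0.5 * np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 3, 0.0)
    Phi = np.sqrt(2.0) * np.sin(np.pi * np.outer(k, x))
    kappa = Phi @ K * dx                             # <K_delta, Phi_k>
    dt = T / n_steps
    decay = np.exp(-theta * lam * dt)                # exact OU transition
    noise_sd = sigma * lam ** (-gamma) * np.sqrt((1.0 - decay**2) / (2.0 * theta * lam))
    modes = np.zeros(n_modes)                        # X_0 = 0
    x_K = np.empty(n_steps + 1)
    x_Lap = np.empty(n_steps + 1)
    for n in range(n_steps + 1):
        x_K[n] = kappa @ modes
        x_Lap[n] = -(lam * kappa) @ modes            # <X, Laplace K> = -sum lam_k kappa_k x_k
        modes = decay * modes + noise_sd * rng.standard_normal(n_modes)
    return x_K, x_Lap, dt
```

A typical call would be `x_K, x_Lap, dt = simulate_local_observations(theta=1.0, delta=0.1)` followed by `augmented_mle(x_K, x_Lap, dt)`. Note that the estimator itself does not use `sigma` or `gamma`, in line with the remark that the dispersion intensity may be treated as unknown.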
Note that
\[
I_{\delta,x_0} := \|B^* K_{\delta,x_0}\|^{-2}_{L^2(\mathcal{D})} \int_0^T X^\Delta_{\delta,x_0}(t)^2 \, dt
\]
can be interpreted as the observed Fisher information. We need the following conditions on the dispersion $B$ and the kernel $K$:²
(LB)There is a family of bounded linear operators (Bδ,x0)δ0mapping L2(Rd)
into itself, such that
B(∆)γZδ,x0= (B
δ,x0Z)δ,x0(5.12)
for ZC(Rd)with support in Dδ,x0and δ > 0, as well as
B
δ,x0ZB
0,x0ZL2(Rd)0(5.13)
for ZL2(Rd)as δ0.
(LK)There is e
KW2γ+2,2(Rd)with compact support such that K=
(∆)γe
K.
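(LΨ) We have
\[
\big\| B^*_{0,x_0} (-\Delta_0)^{\lceil\gamma\rceil - \gamma} \tilde{K} \big\|_{L^2(\mathbb{R}^d)} > 0, \tag{5.14}
\]
\[
\Psi\big( (-\Delta_0)^{\lceil\gamma\rceil - \gamma} \tilde{K} \big) > 0, \tag{5.15}
\]
where $\Psi(Z) = \int_0^\infty \| B^*_{0,x_0} e^{r\Delta_0} \Delta_0 Z \|^2_{L^2(\mathbb{R}^d)} \, dr$.
It is straightforward to see that $\Psi(Z) < \infty$ for $Z \in W^{2,2}(\mathbb{R}^d) \cap L^1(\mathbb{R}^d)$: As $B^*_{0,x_0}$ is bounded, we can w.l.o.g. assume that $B^*_{0,x_0} = I$. Let $G_t$ for $t > 0$ be the heat kernel given by $e^{t\Delta_0} Z = G_t * Z$. It is elementary to verify that $\|G_t\|_{L^2(\mathbb{R}^d)} \lesssim t^{-d/4}$, given that $G_t$ is normalized in $L^1$ as a function. Using standard semigroup properties as stated e.g. in [EN00], we see that $\int_0^t \| e^{r\Delta_0} \Delta_0 Z \|^2_{L^2(\mathbb{R}^d)} \, dr = \frac{1}{2} \langle e^{2t\Delta_0} Z, \Delta_0 Z \rangle_{L^2(\mathbb{R}^d)} + \frac{1}{2} \|\nabla Z\|^2_{L^2(\mathbb{R}^d)}$. Further, using Young's inequality for convolution products, we can estimate $| \langle e^{2t\Delta_0} Z, \Delta_0 Z \rangle_{L^2(\mathbb{R}^d)} | \le \|G_{2t}\|_{L^2(\mathbb{R}^d)} \|Z\|_{L^1(\mathbb{R}^d)} \|\Delta_0 Z\|_{L^2(\mathbb{R}^d)} \lesssim t^{-d/4}$, which converges to zero for $t \to \infty$.
In (5.15), our standing assumption is $(-\Delta_0)^{\lceil\gamma\rceil - \gamma} \tilde{K} \in W^{2,2}(\mathbb{R}^d) \cap L^1(\mathbb{R}^d)$. This is certainly the case if $\gamma \in \mathbb{N}$.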
2These are the assumptions B,Kand ND from [ACP20].
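Example 5.2. It is clear that $B = \sigma(-\Delta)^{-\gamma}$ for $\sigma > 0$ satisfies (LB). In this case, we trivially have $B^*_{\delta,x_0} Z = \sigma Z$ for all $\delta \ge 0$ and $Z \in L^2(\mathbb{R}^d)$. In addition, the above calculation shows $\Psi(Z) = \frac{\sigma^2}{2} \|\nabla Z\|^2_{L^2(\mathbb{R}^d)}$. This can be generalized, for example, to smooth space-dependent $\sigma : \mathcal{D} \to [0, \infty)$, see [ACP20, Example 1].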
As before, we reduce the asymptotic analysis of ˆ
θδ,x0to the linear case
F= 0 by means of the splitting argument X=¯
X+e
X, where ¯
Xsatisfies (5.1)
with F= 0 and X0= 0, and e
Xsolves the random PDE (2.5) with initial
condition e
X0=X0. The terms ¯
XK
δ,x0and e
XK
δ,x0are defined analogously to
(5.8).
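Lemma 5.3. Under (LB), (LK) and (LΨ), we have the following asymptotics as $\delta \to 0$:
(i) $\int_0^T \bar{X}^\Delta_{\delta,x_0}(t)^2 \, dt \sim C_{\mathrm{loc}} \delta^{-2+4\gamma}$, where $C_{\mathrm{loc}} = T \theta^{-1} \Psi\big( (-\Delta_0)^{\lceil\gamma\rceil - \gamma} \tilde{K} \big)$.
(ii) $\|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})} \sim C^B_{\mathrm{loc}} \delta^{2\gamma}$, where $C^B_{\mathrm{loc}} = \| B^*_{0,x_0} (-\Delta_0)^{\lceil\gamma\rceil - \gamma} \tilde{K} \|_{L^2(\mathbb{R}^d)}$.
In particular, if $F = 0$ and $X_0 = 0$, $I_{\delta,x_0} \sim (C^B_{\mathrm{loc}})^{-2} C_{\mathrm{loc}} \delta^{-2}$ as $\delta \to 0$.
Proof. (i) is a direct consequence of [ACP20, Proposition 22], and (ii) is shown in the proof of [ACP20, Proposition 2].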
This lemma is sufficient to obtain the asymptotic properties of ˆ
θδ,x0in
case F= 0 and X0= 0. In order handle the full semilinear model, we need
higher regularity of e
X. However, in contrast to the spectral approach, we
can make use of higher Lp–type regularity in the spaces Hs,p(D)instead of
mere L2–type regularity. This is further explained in Remark 5.10 below. In
order to exploit Lpregularity, we have to modify Condition (Fs,η)as follows:
(Fp
s,η)There is ϵ > 0and a monotonous, locally bounded function g: [0,)
[0,), such that for all ZRp(s):
F(Z)Rp(s+η+ϵ2) gZRp(s).(5.16)
In the Markovian case F(X)(t) = F(Xt), (5.16) holds if
F(Z)s+η+ϵ2,p g(Zs,p)(5.17)
for all ZHs,p(D).3
3Condition (Fp
s,η)in the form (5.17) is Assumption Afrom [ACP20].
Proposition 5.4. Let sR,p2and η > 0. Assume that X0Hs+η,p(D)
and ¯
X, e
XRp(s). If (Fp
s,η)is true, then e
XRp(s+η).
Proof. Verbatim as in Proposition 2.3, using the norm ∥·∥s,p instead of ∥·∥s.
It remains to understand the regularity of the linear process ¯
X. For p2,
set s
p:= sup{sR|¯
XRp(s) a.s.}, and further s
:= infp2s
p, as well as
s:= s
2. As in the previous chapters, we have s= 1 + 2γd/2. This can
be seen as follows: As B(∆)γ:L2(D)L2(D)is a linear isomorphism
by assumption, we have that
ZT
0(∆)s/2eB2
HS dt=(5.18)
is true if and only if it is true for Breplaced by (∆)γ, and this holds for
s1+2γd/2. This proves s1+2γd/2, and the opposite inequality
is shown as in Lemma 2.7. It is clear that s
psfor all p2due to
Hs,p(D)Hs,2(D)for sR, and therefore s
s. On the other hand,
we have the Sobolev embedding Hs,2(D)Hsd/2,p(D)for all sR,p2,
and thus s
sd/2. So 0ss
d/2is the possible regularity gap
for the linear process ¯
X.
Lemma 5.5. If supkNΦkL(D)<, then s
=s.
The proof is given in Appendix B.2. The condition supkNΦkL(D)<
is true e.g. in d= 1, but in general, only a bound of the form ΦkL(D)
λ(d1)/4
kcan be given, and this bound cannot be improved without further
restrictions on D, see [Gri02]. Proposition 5.4 together with Lemma 5.5
yields the precise Lp–excess regularity of e
X:
Proposition 5.6. Let η > 0,s0< s
such that (Fp
s,η)is true for any
s0s < sand p2. Let XR2(s0)and X0Hs+η,p(D)for any p2.
Then we have XRp(s)and e
XRp(s+η)for all s<s
and p2. In
particular, with
η:= η(ss
),(5.19)
we have e
XRp(s+η)for all s < s,p2.
Proof. We distinguish the cases (s
s0)d/2and (s
s0)< d/2. In the
former case, by using Proposition 5.4 iteratively, we have XR2(s)for all
s<s, and by the Sobolev embedding theorem, XRp(s0)for any p2.
Applying Proposition 5.4 a second time repeatedly yields XRp(s)and
e
XRp(s+η)for all s < s
and p2, which implies the claim. In the case
(s
s0)< d/2, we use a similar inductive argument: If for some p2it
holds that XRp(s0), then a repeated application of Proposition 5.4 gives
XRp(s)for any s<s
s
p, and by the Sobolev embedding theorem, it
follows that XRp(s0)for any p < p< dp/(d2(s
s0)). In particular,
there is a constant c > 1such that for all p2we can choose p=p(p)in
such a way that p/p c. Repeating this step we obtain XRp(s)for all
s < s
and p2. A final application of Proposition 5.4 yields e
XRp(s+η)
for all s < s
and p2, which implies the claim in this case, too.
The excess regularity of e
Xcan be used to show that the terms related to
Fand e
Xappearing in (5.11) are of lower order:
Lemma 5.7. Under the conditions from Proposition 5.6 and (LK), if η>0,
the following is true with η:= η(1 + d/4) and any ϵ > 0:
(i) It holds
ZT
0e
XK
δ,x0(t)2dtδ2+4γ+2(ηϵ),
and in particular
ZT
0
XK
δ,x0(t)2dtZT
0
¯
XK
δ,x0(t)2dtClocδ2+4γ.(5.20)
(ii) It holds
ZT
0
XK
δ,x0(t)F(X)(t), Kδ,x0L2(D)dtδ2+4γ+(ηϵ).(5.21)
Proof. With e
Kas in (LK), i.e. K= (∆)γe
K, we first show that for any
q > 1and r0,
sup
01(∆)r/2e
KLq(Dδ,x0)<.(5.22)
This is obvious for r2N0, as (∆)r/2e
Khas compact support in this case,
and in particular, sup01(∆)r/2e
KLq(Dδ,x0)=(∆)r/2e
KLq(Rd)<.
For general r0, we use the Gagliardo–Nirenberg inequality [BM18] on the
reference domain D: For 0< δ 1, with R:= 2r/2 2N0,
(∆)r/2e
KLq(Dδ,x0)=δd
2d
q+r(∆)r/2e
Kδ,x0Lq(D)
δd
2d
q+r(∆)R/2e
Kδ,x0
2r+R
2
Lq(D)(∆)R/2+1 e
Kδ,x0
rR
2
Lq(D)
(∆)R/2e
K
2r+R
2
Lq(Dδ,x0)(∆)R/2+1 e
K
rR
2
Lq(Dδ,x0)
where we used (5.5) and Lemma 5.1 repeatedly. Consequently, using that
(5.22) is true for the exponents Rand R+ 2, we obtain that (5.22) is true
for all r0.
Next, by choice of η, we have η<1 + d/2, thus 2γ+ 2 sη>0,
and therefore γ+ 1 (s+η)/2>0for all s < s. Using (5.5) and (5.22),
(∆)1(s+η)/2Kδ,x0Lq(D)
=δd
2+d
q(∆)γ+1(s+η)/2e
KLq(Dδ,x0)
δd
2+d
q.(5.23)
After these preparations, we can prove the statements of the lemma.
(i) Now let s<sand p2, with q=p/(p1) 2. By Proposition 5.6,
Lemma 5.1 and (5.23), we have for all s<s
ZT
0e
XK
δ,x0(t)2dt=ZT
0De
Xt,Kδ,x0E2
L2(D)dt
=ZT
0D(∆)(s+η)/2e
X, (∆)1(s+η)/2Kδ,x0E2
L2(D)dt
Tsup
0tTe
X
2
s+η,p (∆)1(s+η)/2Kδ,x0
2
Lq(D)
δ4+2(s+η)(∆)1(s+η)/2Kδ,x0
2
Lq(D)
δ4+2(s+η)d+2d/q.
Now, for any ϵ>0, with s:= sϵ= 1+2γd/2ϵand p:= 2d/ϵ,
such that 11/q = 1/p =ϵ/(2d), we have
ZT
0e
XK
δ,x0(t)2dtδ2+4γ+2η2d(11/q)2ϵ=δ2+4γ+2η3ϵ,
which implies the claim with ϵ= 3ϵ/2, using Lemma 5.3.
(ii) Condition (Fp
s,η)together with Proposition 5.6 gives F(X)Rp(s+η
2) for all s < s
and p2, which is the same as F(X)Rp(s+η2)
for all s < sand p2. Let ϵ>0,s=sϵ,p= 2d/ϵand
q=p/(p1). Similar as in (i), we estimate
ZT
0F(X)(t), Kδ,x02
L2(D)dt
=ZT
0D(∆)s+η2
2F(X)(t),(∆)s+η2
2Kδ,x0E2
L2(D)dt
TF(X)2
Rp(s+η2) (∆)1s+η
2Kδ,x0
2
Lq(D)
δ2+4γ+2η3ϵ.
Finally, an application of the Hölder inequality in time gives (5.21).
We can now state and prove the main result of this chapter:
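Theorem 5.8. Let $x_0 \in \mathcal{D}$. Assume that (LB), (LK) and (LΨ) hold. Let $\eta > 0$, $s_0 < s^*$ such that $(F^p_{s,\eta})$ is satisfied for all $s_0 \le s < s^*$ and $p \ge 2$. Let a.s. $X \in R_2(s_0)$ and $X_0 \in H^{s^*+\eta,p}(\mathcal{D})$ for all $p \ge 2$. If $\eta^* = \eta - (s^* - s^*_\infty) > 0$, then the following is true: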
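(i) $\hat{\theta}_{\delta,x_0}$ is a consistent estimator for $\theta$, i.e. $\hat{\theta}_{\delta,x_0} \xrightarrow{P} \theta$ as $\delta \to 0$.
(ii) If $\eta^* > 1$, then
\[
\delta^{-1} \big( \hat{\theta}_{\delta,x_0} - \theta \big) \xrightarrow{d} \mathcal{N}(0, \Sigma_{\mathrm{loc}}), \tag{5.24}
\]
where
\[
\Sigma_{\mathrm{loc}} = \frac{\theta \big\| B^*_{0,x_0} (-\Delta_0)^{\lceil\gamma\rceil - \gamma} \tilde{K} \big\|^2_{L^2(\mathbb{R}^d)}}{T\, \Psi\big( (-\Delta_0)^{\lceil\gamma\rceil - \gamma} \tilde{K} \big)}. \tag{5.25}
\]
In the case $\eta^* \le 1$, it holds
\[
\delta^{-a} \big( \hat{\theta}_{\delta,x_0} - \theta \big) \xrightarrow{P} 0 \tag{5.26}
\]
for all $a < \eta^*$.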
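Proof. We use the representation of $\hat{\theta}_{\delta,x_0} - \theta$ as given in (5.11). First, we set $M^{(\delta)}_T := C_{\mathrm{loc}}^{-1/2} \delta^{1-2\gamma} \int_0^T X^\Delta_{\delta,x_0}(t) \, dW_{\delta,x_0}(t)$. Then $\langle M^{(\delta)} \rangle_T \to 1$ in probability as $\delta \to 0$, and Theorem A.1 implies that $M^{(\delta)}_T \to \mathcal{N}(0, 1)$ in distribution. An application of Slutsky's lemma together with Lemma 5.3 (ii) and Lemma 5.7 (i) gives
\[
\delta^{-1} \|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})} \frac{\int_0^T X^\Delta_{\delta,x_0}(t) \, dW_{\delta,x_0}(t)}{\int_0^T X^\Delta_{\delta,x_0}(t)^2 \, dt} \xrightarrow{d} \mathcal{N}(0, \Sigma_{\mathrm{loc}}).
\]
Next, again by Lemma 5.7, for any $\epsilon > 0$,
\[
\frac{\int_0^T X^\Delta_{\delta,x_0}(t) \langle F(X)(t), K_{\delta,x_0} \rangle_{L^2(\mathcal{D})} \, dt}{\int_0^T X^\Delta_{\delta,x_0}(t)^2 \, dt} \lesssim \delta^{\eta^* \wedge (1 + d/4) - \epsilon}.
\]
Using (5.11), this implies (ii). Finally, (i) is a consequence of (ii).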
As an important example, if B=σ(∆)γfor some σ > 0and γN,
then we immediately obtain due to B0,x0Z=σZ for ZL2(Rd):
Σloc =
2θe
K
2
L2(Rd)
Te
K
2
L2(Rd)
.(5.27)
Example 5.9. The results from Theorem 5.8 can be applied to the models
from Section 2.4. For simplicity, we assume that s
=s(which is true, for
example, in d= 1). In this case, the effective excess regularity ηcoincides
with the optimal excess regularity η.
(i) Linear Perturbations. If F(X) = c(∆)r/2for some cR,r < 2,
then (Fp
s,η)is true for all sR,p2and η < 2r. Thus, if r < 1,
then ˆ
θδ,x0is asymptotically normal as in (5.24). Otherwise, ˆ
θδ,x0is
consistent as in (5.26). In particular, perturbations up to order 1are
negligible, with first order perturbations being the critical case.
(ii) Reaction–Diffusion Equations. If F(X) = f(x), where fis either
a polynomial as in (2.40) or a bounded smooth function with bounded
derivatives of any order as in (2.41), then (Fp
s,η)holds for any p2,
s > d/p and η < 2. This can be seen verbatim as in Proposition
2.19, using the fact that Hs,p(D)is closed under multiplication if sp >
d(cf. [AF03, Theorem 4.39]) for the case of polynomial f, as well
as bounds on composition operators [AF92] for the case fC
b(R).
Consequently, ˆ
θδ,x0is asymptotically normal as in (5.24).
(iii) Burgers Equations. In d= 1, if F(X) = XxX=x(X2/2),
then exactly as in Lemma 2.22 (iii) it can be shown that (Fp
s,η)is true
for p2,s > 1/p and η < 1. In particular, ˆ
θδ,x0is consistent as in
(5.26), i.e. δa(ˆ
θδ,x0θ)0in probability for all a < 1. In fact, for
this particular model, it is possible to prove that the first term in (5.11),
representing the bias from neglecting the effect of F, converges to zero
in probability even with rate δinstead of δafor a < 1, which means
that asymptotic normality as in (5.24) transfers to ˆ
θδ,x0for the one–
dimensional Burgers equation (see [ACP20, Theorem 11] for details).
Remark 5.10 (Lp–regularity in the spectral approach).Lp–regularity has
been a crucial tool in determining the excess regularity ηof e
Xin the local
approach. It is a natural question to ask if Lp–regularity can improve the spec-
tral approach as well. In order to make the two approaches comparable, we
assume that a single Fourier mode of X(instead of the first NFourier modes)
is observed in the spectral approach. For simplicity, let F(X) = (∆)r,r < 2
and B=σ(∆)γ. The natural MLE–type estimator, neglecting information
on F, is given by
ˆ
θmode
N=RT
0λ1+2γ
Nx(N)
tdx(N)
t
RT
0λ2+2γ
N(x(N)
t)2dt=RT
0x(N)
tdx(N)
t
λNRT
0(x(N)
t)2dt(5.28)
with x(N)=X, ΦNL2(D)as before. This is the canonical analogon to ˆ
θlin
N
(with α=γ) for single mode observations, and it corresponds to ˆ
θδ,x0if
Kδ,x0is replaced by ΦN. By (2.22) with respect to the linear operator A=
θ∆+(∆)r, together with Lemma A.2 (i) (setting X
k=λ1+γ
kx(k)therein
and taking into account Var RT
0(x(k)
t)2dtλ4γ3
k, see e.g. [Lot09, Theorem
2.1] or [PS20, Lemma 4.1]), we have λ2+2γ
NRT
0(x(N)
t)2dt(σ2TΛ/2θ)N2/d in
probability. In particular, if there is η>0such that
λ2γ
NZT
0F(X)(t),ΦN2
L2(D)dtN2
d2η,(5.29)
a decomposition as in (5.11) yields
If η>1/d :N1
dˆ
θmode
Nθd
N 0,2θ
TΛ,(5.30)
if η1/d :Naˆ
θmode
NθP
0 for a < η.(5.31)
We have that (Fp
s,η)is true for any sR,p2and η < 2r, thus
F(X)Rp(sr)for all s < sand p2(if s
=s). Now, analogously
to the proof of Lemma 5.7, the Hölder inequality gives
λ2γ
NZT
0F(X)(t),ΦN2
L2(D)dtN2
d+12η
d+ϵΦN2
Lq(D)
for all q > 1and ϵ > 0, so with q= 2 we can choose η=η/d 1/2ϵ/2in
(5.29). In particular, if η > 1 + 1 = 1 + d/2, then η>1/d, and ˆ
θmode
Nis
asymptotically normal, in accordance with Theorem 2.11.
Now, in order to exploit higher Lp–regularity of Xin the spectral ap-
proach, the term ΦNLq(D)has to converge to zero if N , with conver-
gence rate possibly depending on 1q2. By interpolation, it suffices to
understand the border case q= 1 in order to obtain a bound on the rate. But
in general, such convergence to zero does not hold: For example, if d= 1 and
D= [0,1], we have ΦN(x) = 2 sin(Nπx), and ΦNL1(D)= 22 is inde-
pendent of N. For general domains, results from literature on L1–bounds for
the eigenvalues of the Laplacian [vdBHV15, Vog15] do not improve the triv-
ial estimate supNNΦNL1(D)<in terms of the convergence rate in N.
Even more, [Sog15, BS17] indicate that improved L1–bounds (along eigen-
value subsequences) on compact manifolds without boundary are related to
concentration of mass of the eigenfunctions along geodesics. 4
4Note the apparent asymmetry between the (optimal) bound ΦNL(D)λ(d1)/4
N
In contrast, Kδ,x0Lq(D)δd
2+d
qby (5.5) if Khas compact support,
which improves the convergence rate in δas q1.
Remark 5.11.
(i) If the linear process ¯
Xhas optimal Lp–regularity, i.e. if s
=s, then
η=η, and the convergence rate of ˆ
θδ,x0in Theorem 5.8 is determined
directly by the excess regularity ηcoming from (Fp
s,η).
(ii) In view of condition (LK), it is natural to consider also the case γ= 0.
Indeed, it is possible to include that case without further modification
as long as (5.1) is well-posed. This has been done in [ABJR21] in the
context of the stochastic Meinhardt model. For instance, the results
from [DPZ14, Chapter 7] show that in d= 1, reaction-diffusion equa-
tions driven by space-time white noise can be given a meaning in the
mild sense. If γ= 0, then condition (LK)is void, whereas for positive
γ,(LK)suggests that Kapproximates higher order derivatives of X
instead of point evaluations.
(iii) The convergence rate δof ˆ
θδ,x0can be recovered from the spectral ap-
proach: We can easily see how the asymptotic variance Σfrom The-
orem 2.11 for the estimator ˆ
θfull
Nbehaves if the domain Dis replaced
by a shrunk domain D1/δ,x0.Σdepends linearly on Λ1, and this con-
stant is characterized by λkΛk2/d. An explicit term for Λcan be
found e.g. in [Shu01, Section 13.4], and using the notation therein, it
holds Λ = V2/d
1, where V1depends linearly on |D|. It is clear that
D1,x0δd|D|. Consequently, Λδ2, and finally, Σδ2. This is
in accordance with Theorem 5.8.
[Gri02] and the trivial bound ΦNL1(D)1. If the latter cannot be improved, this means
that it is not possible to recover ΦNL2(D)1from interpolating the L1and Lcases.
Indeed, as pointed out in [Gri92, SS07] (see also references therein), for p2, optimal Lp
bounds for (linear combinations of) ΦNdo not simply follow from interpolation between
the L2and Lcases.
Chapter 6
Diffusivity Estimation for
Activator-Inhibitor Models
This chapter is an adaptation of material from [PFA+21].
We apply the theory of parameter estimation for semilinear SPDEs to
a particular test case from cell biology, concerning the dynamical behavior
of actin concentration in D. discoideum giant cells. The actin cytoskeleton
plays a crucial role in different processes such as motility of amoeboid cells
[BBPSP14]. In spite of its complex filamental structure, at the length scale
of the cell itself it may be reasonably approximated by a scalar field, repre-
senting concentration. Intracellular actin is capable of generating traveling
waves. In [FFAB20], this has been described by an SPDE of FitzHugh–
Nagumo type, which is coupled to a phase field representing the boundary of
the cell. In order to increase the spatial resolution of experimental data, it
is possible to artificially merge various cells to form a so-called giant cell, see
[GEW+14]. In particular, this allows one to observe the spatiotemporal actin
dynamics within a cell away from the cell boundary. In this case, the describing
model can be simplified by neglecting the phase field.
The reaction model employed in [FFAB20] in order to describe the actin
dynamics is a minimal model capable of generating traveling waves rather
than a detailed representation of the biochemical reaction pathway. Conse-
quently, it is natural to ask to what extent the true dynamics is described by
that model. In order to provide a first step towards answering that question,
we extend the theory of diffusivity estimation for semilinear SPDEs from
Chapter 2, including simultaneous estimation of reaction parameters. To
this end, we assume that the nonlinearity is given as a parametrized term.
This can be interpreted as qualitative a priori knowledge on the behavior of
the reaction terms, without knowing the magnitude of the involved parameters
quantitatively. Based on these considerations, we compare the effective
diffusivity, given as the value of one of several related estimators, on
simulated and experimental data, in order to understand the effects of the
reaction model.
In Section 6.1, we discuss joint diffusivity and reaction parameter esti-
mation for semilinear SPDEs, extending the results from Chapter 2. We put
special emphasis on the statistically linear case, i.e. when the nonlinearity
depends linearly on its parameters. Finally, we state and discuss the regu-
larity properties of an activator-inhibitor model, which is closely related to
[FFAB20]. In Section 6.2, we apply the estimation theory from Section 6.1
to simulated and real data described by that activator-inhibitor model.
6.1 Joint Diffusivity and Reaction Parameter
Estimation
6.1.1 The General Case
We extend the model from Chapter 2 by allowing the nonlinear term $F$ to
depend on additional parameters $\theta_1, \dots, \theta_K$, $K > 0$, which we call reaction
parameters in the sequel:
$$\mathrm{d}X_t = \theta_0 A X_t\,\mathrm{d}t + F_{\theta_{1:K}}(X)(t)\,\mathrm{d}t + B(X_t)\,\mathrm{d}W_t \tag{6.1}$$
with initial condition $X_0$, where we write $\theta_{1:K} = (\theta_1, \dots, \theta_K)$ for short. Further,
we write $\theta = (\theta_0, \dots, \theta_K)$ for the complete parameter vector. We fix the
parameter space $\Theta \subset \mathbb{R}^{K+1}$, which encodes our usual standing assumption
$\theta_0 > 0$, as well as possible restrictions on the reaction parameters coming
from the bifurcation structure of (6.1) together with a priori knowledge on
the dynamical regime. It is possible that an estimator for $\theta$ takes values
outside $\Theta$; in that case it should be considered void. The dispersion operator
$B$ is assumed to satisfy $(N^\gamma_\eta)$ for some $\gamma > d/4$ and $\eta > 0$. In order to take
the reaction parameters into account when controlling the nonlinearity in the
drift term, we have to extend condition $(F_{s,\eta})$:
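$(F^{\mathrm{par}}_{s,\eta})$ There are continuous functions $g : [0,\infty) \to [0,\infty)$ and $c : \mathbb{R}^K \to [0,\infty)$
and there is $\epsilon > 0$ such that for $Z \in R(s)$:
$$\|F_{\theta_{1:K}}(Z)\|_{R(s+\eta+\epsilon-2)} \le c(\theta_1, \dots, \theta_K)\, g\big(\|Z\|_{R(s)}\big). \tag{6.2}$$
W.l.o.g. we always assume that $g$ is increasing. The non-Markovian
nature of $F$ is crucial in our main example, as explained in Section 6.1.3.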
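Setting formally $B = \sigma(-A)^{-\gamma}$, by [LS77, Section 7.6.4], the log-likelihood
for $X^N$ is heuristically given by
$$\begin{aligned}
\ln \frac{\mathrm{d}P^{N,T}_{\theta}}{\mathrm{d}P^{N,T}_{\bar\theta}}
&= \frac{1}{\sigma^2}\int_0^T \big\langle (\theta_0 - \bar\theta_0) A X^N_t,\ (-A)^{2\gamma}\,\mathrm{d}X^N_t \big\rangle \\
&\quad + \frac{1}{\sigma^2}\int_0^T \big\langle P_N F_{\theta_{1:K}}(X)(t) - P_N F_{\bar\theta_{1:K}}(X)(t),\ (-A)^{2\gamma}\,\mathrm{d}X^N_t \big\rangle \\
&\quad - \frac{1}{2\sigma^2}\int_0^T \Big\langle (\theta_0 - \bar\theta_0) A X^N_t + P_N F_{\theta_{1:K}}(X)(t) - P_N F_{\bar\theta_{1:K}}(X)(t), \\
&\qquad\qquad (-A)^{2\gamma}\big[(\theta_0 + \bar\theta_0) A X^N_t + P_N F_{\theta_{1:K}}(X)(t) + P_N F_{\bar\theta_{1:K}}(X)(t)\big] \Big\rangle\,\mathrm{d}t,
\end{aligned}$$
where $\bar\theta = (\bar\theta_0, \dots, \bar\theta_K) \in \Theta$ is an arbitrary reference parameter.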
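Maximizing this term for $\theta_0, \dots, \theta_K$ simultaneously leads to the
corresponding likelihood equations
$$\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t,\ \mathrm{d}X^N_t \big\rangle
= -\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t,\ \theta_0 (-A) X^N_t - P_N F_{\theta_{1:K}}(X)(t) \big\rangle\,\mathrm{d}t,$$
$$\int_0^T \big\langle \partial_{\theta_i} P_N F_{\theta_{1:K}}(X)(t),\ (-A)^{2\alpha}\,\mathrm{d}X^N_t \big\rangle
= -\int_0^T \big\langle (-A)^{2\alpha}\,\partial_{\theta_i} P_N F_{\theta_{1:K}}(X)(t),\ \theta_0 (-A) X^N_t - P_N F_{\theta_{1:K}}(X)(t) \big\rangle\,\mathrm{d}t$$
for $i = 1, \dots, K$, where as before we substituted $\gamma$ by a free parameter
$\alpha$. Without further mentioning it, we assume that there is a solution to
these equations. We fix any solution and call it a maximum likelihood-type
estimator $\hat\theta^N$ for our problem, with components $\hat\theta^N_0, \dots, \hat\theta^N_K$. In general, the
likelihood equations cannot be solved explicitly.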
Now, depending on the specific form of $F$ as well as the eigenvalue
asymptotics of $A$, it may happen that not all reaction parameters (if any) are
identifiable in finite time. This means that $\hat\theta^N_i$ does not necessarily converge to $\theta_i$
as $N \to \infty$ if $i \ge 1$. Therefore, we put our main focus on diffusivity estimation,
i.e. identifying $\theta_0$, and analyze the impact of the reaction parameters on that
problem. In order to control $\hat\theta^N_{1:K}$ when studying the asymptotics of $\hat\theta^N_0$, it
suffices that the reaction parameter estimators are bounded in probability (or
tight), i.e. $\sup_{N \in \mathbb{N}} P(|\hat\theta^N_i| > M) \to 0$ for $M \to \infty$ and all $1 \le i \le K$. From
the likelihood equations it is clear that $\hat\theta^N_0$ can be written as
$$\hat\theta^N_0 = -\frac{\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t,\ \mathrm{d}X^N_t \big\rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|_H^2\,\mathrm{d}t}
+ \frac{\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t,\ P_N F_{\hat\theta^N_{1:K}}(X)(t) \big\rangle\,\mathrm{d}t}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|_H^2\,\mathrm{d}t}, \tag{6.3}$$
even if this is not explicit due to the presence of $\hat\theta^N_{1:K}$.
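As an illustration of the decomposition (6.3), the following sketch evaluates both terms from discretely sampled Fourier coefficients. It is not part of the original text: the function names `theta_lin` and `bias_term`, the array layout, and the left-endpoint (Itô-type) Riemann sums are assumptions made for illustration, with `lam` denoting the eigenvalues of $-A$.

```python
import numpy as np

def theta_lin(x, lam, dt, alpha=0.0):
    """First term of (6.3): the linear estimator from spectral data.

    x   : array (n_t + 1, N), x[n, k] = k-th Fourier coefficient of X at time n*dt
    lam : array (N,), eigenvalues of -A (assumed positive)
    """
    dx = np.diff(x, axis=0)      # increments X_{t_{n+1}} - X_{t_n}
    xl = x[:-1]                  # left endpoints, matching the Ito integral
    num = -np.sum(lam ** (1 + 2 * alpha) * xl * dx)
    den = np.sum(lam ** (2 + 2 * alpha) * xl ** 2) * dt
    return num / den

def bias_term(x, f, lam, dt, alpha=0.0):
    """Second term of (6.3); f[n, k] holds the coefficients of P_N F(X)(t_n)."""
    xl = x[:-1]
    num = np.sum(lam ** (1 + 2 * alpha) * xl * f[:-1]) * dt
    den = np.sum(lam ** (2 + 2 * alpha) * xl ** 2) * dt
    return num / den
```

The sum `theta_lin(...) + bias_term(...)` corresponds to (6.3) once the nonlinearity is evaluated at the estimated reaction parameters. Note that `theta_lin` is unchanged if `x` is multiplied by a constant, whereas `bias_term` in general is not.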
Theorem 6.1. Let γ > 1/(2β)and further s00,η > 0(1 1) such
that (Nγ
η)and (Fpar
s,η )for s0s < sare true. Assume X0Hs+ηand
XR(s0). Let α > γ 1/4, and assume that (ˆ
θN
i)NNare bounded in
probability for 1iK. Then:
(i) If η > 1+1, then ˆ
θN
0is asymptotically normal as in (2.29).
(ii) If η1 + 1, then ˆ
θN
0is consistent in probability with rate Na,
a < βη/2, as in (3.70).
Proof. By Theorem 3.22, the claim is true with ˆ
θN
0replaced by ˆ
θlin
Ngiven by
(2.26). Thus, by (6.3) it suffices to control the bias term involving Fˆ
θN
1:K(X).
As in the proof of Theorem 2.11, we see that by means of (Fpar
s,η ),
ZT
0D(A)1+2αXN
t, PNFˆ
θN
1:K(X)(t)Edtpc(ˆ
θN
1:K)N1+β(2α2γ+1η/2),
and consequently, for (i),
N1+β
2RT
0D(A)1+2αXN
t, PNFˆ
θN
1:K(X)(t)Edt
RT
0(A)1+αXN
t2
Hdtpc(ˆ
θN
1:K)Nβ(1+1η)/2
almost surely. The right-hand side converges to zero in probability because
η > 1 + 1 and ˆ
θN
1:Kare bounded in probability. This implies (i). The case
(ii) is similar.
6.1.2 The Statistically Linear Case
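In this section, let $F$ depend linearly on its parameters:¹
$$F_{\theta_{1:K}}(X) = \bar F(X) + \sum_{i=1}^{K} \theta_i F_i(X) \tag{6.4}$$
for functions $F_1, \dots, F_K, \bar F : C(0,T;H) \supseteq D(F) \to L^1(0,T;H)$. We set
$D_\alpha(F) := \{Z \in D(F) \mid F_1(Z), \dots, F_K(Z) \in L^2(0,T;H_{2\alpha})\}$. Identifiability of
$\theta_1, \dots, \theta_K$ is ensured by a non-degeneracy condition:
$(I_\alpha)$ The terms $F_1(Z), \dots, F_K(Z)$ are linearly independent in $L^2(0,T;H_{2\alpha})$
for every $Z \in D_\alpha(F)$ which is not constant in $t \in [0,T]$.
As a consequence of $(I_\alpha)$,
$$\int_0^T \|(-A)^{\alpha} F_i(X)(t)\|_H^2\,\mathrm{d}t > 0 \tag{6.5}$$
for $i = 1, \dots, K$. In order to unify notation in the sequel, we define
$$F_0(X) := AX. \tag{6.6}$$
The maximum likelihood equations simplify to
$$A^{(\alpha)}_N(X)\,\hat\theta^N = b^{(\alpha)}_N(X), \tag{6.7}$$
where
$$A^{(\alpha)}_N(X)_{i,j} = \int_0^T \big\langle (-A)^{\alpha} P_N F_i(X)(t),\ (-A)^{\alpha} P_N F_j(X)(t) \big\rangle\,\mathrm{d}t,$$
$$b^{(\alpha)}_N(X)_i = -\int_0^T \big\langle (-A)^{2\alpha} P_N F_i(X)(t),\ P_N \bar F(X)(t) \big\rangle\,\mathrm{d}t
+ \int_0^T \big\langle (-A)^{2\alpha} P_N F_i(X)(t),\ \mathrm{d}X^N_t \big\rangle.$$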
¹ A comparable setup has been considered in [Hue93, Chapter 3] for linear SPDEs in
the spectral approach with similar arguments as given below, and in [DMPD00, Section
3] for semilinear SPDEs in the large time regime.
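In particular,
$$A^{(\alpha)}_N(X)_{0,0} = \int_0^T \|(-A)^{1+\alpha} X^N_t\|_H^2\,\mathrm{d}t. \tag{6.8}$$
Further, it is immediate that for $i, j = 0, \dots, K$:
$$\big|A^{(\alpha)}_N(X)_{i,j}\big| \le \sqrt{A^{(\alpha)}_N(X)_{i,i}\,A^{(\alpha)}_N(X)_{j,j}}. \tag{6.9}$$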
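The linear system (6.7) can be assembled directly from Fourier coefficients. The sketch below is illustrative only: the function name `solve_linear_mle`, the array layout, and the left-endpoint Riemann sums are assumptions, `lam` stands for the eigenvalues of $-A$, and `Fbar` for the parameter-free part of the nonlinearity in (6.4).

```python
import numpy as np

def solve_linear_mle(F, Fbar, dX, lam, dt, alpha=0.0):
    """Assemble and solve the linear system (6.7) in Fourier coordinates.

    F    : array (K + 1, n_t, N), F[i, n, k] = k-th coefficient of P_N F_i(X)(t_n),
           with F[0] corresponding to F_0(X) = A X
    Fbar : array (n_t, N), coefficients of the parameter-free part
    dX   : array (n_t, N), increments X_{t_{n+1}} - X_{t_n} of the observed modes
    lam  : array (N,), eigenvalues of -A (assumed positive)
    """
    w = lam ** (2.0 * alpha)                        # weights from (-A)^{2 alpha}
    A = np.einsum("itk,jtk,k->ij", F, F, w) * dt    # Gram matrix of the F_i
    b = (np.einsum("itk,tk,k->i", F, dX, w)         # stochastic integral term
         - np.einsum("itk,tk,k->i", F, Fbar, w) * dt)
    theta_hat = np.linalg.solve(A, b)
    return theta_hat, A, b
```

Since `A` is a Gram matrix, it is symmetric and its entries satisfy the Cauchy-Schwarz bound (6.9); it is invertible as long as the discretized terms are linearly independent.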
In order to connect to Theorem 6.1 and prove that the reaction parameter
estimators $\hat\theta_{1:K}$ are bounded in probability, we have to control the
rate of the determinant of $A^{(\alpha)}_N(X)$, whose square root is the volume of the
$(K+1)$-dimensional parallelepiped spanned by $P_N F_0(X), \dots, P_N F_K(X)$ in
$L^2(0,T;H_{2\alpha})$. In order to do so, we choose $\alpha$ in such a way that $P_N F_0(X) =
AX^N$ diverges in $L^2(0,T;H_{2\alpha})$, while $P_N F_1(X), \dots, P_N F_K(X)$ converge. This
way, $AX^N$ becomes asymptotically orthogonal to the latter terms and determines
the rate of volume growth. This is formalized in the next lemma.
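Lemma 6.2. Let $s_0 \ge 0$, $\eta > 0$ such that $X \in R(s_0)$, $X_0 \in H_{s^*+\eta}$ a.s.
and $(N^\gamma_\eta)$ as well as $(F_{s,\eta})$ for $s_0 \le s < s^*$ are true. Let $\alpha \in \mathbb{R}$ such that
$\gamma - (1 + \beta^{-1})/2 < \alpha < \gamma + (\eta - 1 - \beta^{-1})/2$. Under condition $(I_\alpha)$, there are
$N_0 \in \mathbb{N}$ and $c, C > 0$ such that
$$c \int_0^T \|(-A)^{1+\alpha} X^N_t\|_H^2\,\mathrm{d}t \le \det A^{(\alpha)}_N(X) \le C \int_0^T \|(-A)^{1+\alpha} X^N_t\|_H^2\,\mathrm{d}t$$
uniformly in $N \ge N_0$, almost surely. In particular,
$$\det A^{(\alpha)}_N(X) \asymp N^{1+\beta(2\alpha-2\gamma+1)}. \tag{6.10}$$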
Proof. First, α < γ + (η11)/2implies 2α < s+η2, thus Fi(X)
R(2α)for i= 1, . . . , K. Thus, for these i, we have limN→∞ A(α)
N(X)i,i =
RT
0(A)αFi(X)(t)2
Hdt < . In particular, A(α)
N(X)i,i are positive and
finite for i= 1, . . . , K and large enough N. Using (6.9), we have
det A(α)
N(X)(K+ 1)!
K
Y
i=0
A(α)
N(X)i,i
A(α)
N(X)0,0=ZT
0(A)1+αXN
t2
Hdt.
For brevity, we use the notation ⟨·,·⟩αfor the scalar product on L2(0, T;H2α)
and ∥·∥αfor the corresponding norm. We will prove that
lim inf
N→∞ det  PNFi(X)
PNFi(X)α
,PNFj(X)
PNFj(X)ααi,j=0,...,K!>0.(6.11)
First note that by condition (Iα), this is true if the matrix in (6.11) is sub-
stituted by its (0,0)-minor, i.e. such that 1i, j K.
Let ϵ > 0, let MNsuch that (IPM)Fi(X)α< ϵ Fi(X)for
i= 1, . . . , K. This is possible because PNFi(X)Fi(X)in L2(0, T;H2α).
Then for i= 1, . . . , K and N > M:
PNF0(X)
PNF0(X)α
,PNFi(X)
PNFi(X)αα
=PMF0(X)
PNF0(X)α
,PMFi(X)
PNFi(X)αα
+(PNPM)F0(X)
PNF0(X)α
,(PNPM)Fi(X)
PNFi(X)αα
.
Taking into account PNFi(X)α Fi(X)αand PNF0(X)α in
the first term as well as the Cauchy-Schwarz inequality for the second term,
lim sup
N→∞ PNF0(X)
PNF0(X)α
,PNFi(X)
PNFi(X)ααlim sup
N→∞
(PNPM)Fi(X)α
PNFi(X)α
(IPM)Fi(X)α
Fi(X)α
< ϵ.
As ϵ > 0is arbitrary, we see that the (0, i)-entry of the matrix in (6.11)
converges to zero for i1. Expanding the determinant in (6.11) in the first
column and using the non-degeneracy of the (0,0)-minor, we conclude that
(6.11) is true.
If DNis the diagonal matrix with i-th diagonal entry A(α)
N(X)i,i, (6.11) is
equivalent to lim infN→∞ det(D1/2
NA(α)
N(X)D1/2
N)>0. In particular,
ZT
0(A)1+αXN
t2
Hdt|det (DN)|det A(α)
N(X).(6.12)
Finally, (2.20) implies (6.10), and all statements in the lemma are proven.
Proposition 6.3. Let s00,η > 0such that XR(s0),X0Hs+η
a.s. and (Nγ
η)as well as (Fs,η)for every s0s<shold. Let αR
with γ(1 + 1)/4< α < γ + (η11)/4(η11)/2and
γ1/4< α γ. Under condition (Iα), the sequences (ˆ
θN
i)NN,i= 1, . . . , K
are bounded in probability.
Proof. First note that every admissible αRis also admissible in Lemma
6.2. W.l.o.g. we can restrict to the case B(X)σ(A)γdue to α > γ1/4,
as explained in Theorem 3.22. Define
¯
b(α)
N(X)i:= σZT
0(A)2αγPNFi(X)(t),dWN
t.
By Lemma 6.2, A(α)
N(X)is invertible for all NN0. Plugging in the dynam-
ics of Xinto the stochastic integral appearing in each component of b(α)
N(X),
it is immediate that
ˆ
θNθ=A(α)
N(X)1¯
b(α)
N(X).(6.13)
For simplicity of notation, denote the entries of A(α)
N(X)by ai,j, the entries of
A(α)
N(X)1by ai,j and the entries of ¯
b(α)
N(X)by ¯
bi,i, j = 0, . . . , K. All terms
implicitly depend on N. W.l.o.g. assume that ai,i >0for i= 0, . . . , K,
which is guaranteed by (Iα)for large enough N. Then the i-th component
of ˆ
θNθreads as
ˆ
θN
iθi=
K
X
j=0
ai,j¯
bj=1
det A(α)
N(X)
K
X
j=0
¯
bj(1)i+jdet (Aj,i),
where Aj,i is the matrix obtained from erasing the j-th row and the i-th
column in A(α)
N(X). By means of ai,j ai,iaj,j,
ˆ
θN
iθi1
det A(α)
N(X)
K
X
j=0 ¯
bjK!
ai,iaj,j
K
Y
=0 |aℓ,ℓ|
K
X
j=0 ¯
bj
ai,iaj,j
,
where we have used Lemma 6.2. Next, α < γ + (η11)/4implies
Fj(X)R(4α2γ+ϵ)for some ϵ > 0. By Lemma 2.6, limN→∞ ¯
bj<
a.s. for j= 1, . . . , K. Thus, for these j,¯
bj/aj,j is bounded almost surely
and thus in probability. Finally, taking into account α > γ (1 + 1)/4,
Nβ(γα)¯
b0
a0,0
=σNβ(γα)v
u
u
tRT
0(A)1+2αγXN
t2
Hdt
RT
0(A)1+αXN
t2
Hdt
×RT
0(A)1+2αγXN
t,dWN
t
qRT
0(A)1+2αγXN
t2
Hdt
,
which converges to a normal distribution by (2.20) together with the choice
of α > γ (1+1)/4, Theorem A.1, and the Slutsky lemma. Consequently,
as αγ, we see that ¯
b0/a0,0is bounded in probability, too. In total, for
i= 1, . . . , K,|ˆ
θN
iθi|is bounded a.s. by the sum of random variables that
are bounded in probability, so (ˆ
θN
i)NNitself is bounded in probability.
In particular, consider the case $A = \Delta$ in dimension $d \le 2$, where $F$ is
a reaction term that satisfies $(F^{\mathrm{par}}_{s,\eta})$ for all $\eta < 2$. Combining Theorem 6.1
with Proposition 6.3, we obtain:
Theorem 6.4. Let $d \le 2$, $\gamma > d/4$, $s_0 \ge 0$ such that $(N^\gamma_\eta)$ and $(F^{\mathrm{par}}_{s,\eta})$ hold
for $s_0 \le s < s^*$ and $0 < \eta < 2$. Let a.s. $X_0 \in H_{s^*+2}$ and $X \in R(s_0)$.
(i) In $d = 1$, let $\gamma - 1/4 < \alpha \le \gamma$ such that $(I_\alpha)$ holds. Then $\hat\theta^N_0$ is
asymptotically normal as in (2.29).
(ii) In $d = 2$, let $\gamma - 1/4 < \alpha < \gamma$ such that $(I_\alpha)$ holds. Then $\hat\theta^N_0$ converges
in probability to $\theta_0$ with rate $N^{-a}$ for every $a < 1$.
Remark 6.5. The condition on $\alpha$ can be relaxed. For example, if $B =
\sigma(-A)^{-\gamma}$, only $\alpha > \gamma - (1 + d/2)/4$ is needed, and a similar result for
dimension $d = 3$ can be stated, cf. Remark 3.23.
Theorem 6.4 is applicable to the activator-inhibitor model explained in
the next section.
6.1.3 A Model of FitzHugh–Nagumo Type
For $L_1, \dots, L_d > 0$, let $D = [0, L_1] \times \dots \times [0, L_d] \subset \mathbb{R}^d$ be a bounded domain.
Motivated by [FFAB20], we consider an activator–inhibitor model of the form
$$\mathrm{d}U_t = \Big(D_U \Delta U_t + k_1 U_t (u_0 - U_t)\big(U_t - u_0\, a(\|U_t\|_{L^2(D)})\big) - k_2 V_t\Big)\,\mathrm{d}t + B\,\mathrm{d}W_t, \tag{6.14}$$
$$\mathrm{d}V_t = \big(D_V \Delta V_t + \epsilon (b U_t - V_t)\big)\,\mathrm{d}t, \tag{6.15}$$
together with initial conditions $U_0, V_0$ and periodic boundary conditions.
Consequently, in this setting, the state space is given by $H := L^2(D)$, and
$H_1 = \bar W^{1,2}(D) = \{u \in W^{1,2}(D) \mid \int_D u\,\mathrm{d}x = 0\}$. Here and in the sequel we
consider only the case $B = \sigma(-\Delta)^{-\gamma}$ for $\sigma > 0$. The parameters $D_U, D_V > 0$
are the diffusivity constants for the activator $U$ and inhibitor $V$, resp. The
parameters $k_1, k_2, u_0, \epsilon, b$ are supposed to be positive. Finally, $a : \mathbb{R} \to \mathbb{R}$
is a bounded continuously differentiable function with bounded derivative.
The boundedness of $a$ is not essential and can be modeled in practice with
a cutoff. For constant $a$, this is the spatially extended FitzHugh–Nagumo
model. We mention that a careful choice of the function $a$ can have a
stabilizing effect on the dynamics. We also introduce an additional parameter
$\bar a \in (0,1)$, which will describe the effective long-time average of $a(U)$. The
initial conditions are assumed to be sufficiently regular, i.e. $E[\|U_0\|^p_{s^*}] < \infty$,
$E[\|V_0\|^p_{s^*+2}] < \infty$ for all $p \ge 2$ and $s^* = 1 + 2\gamma - d/2$. This model is well-posed
in dimension $d \le 3$:
Proposition 6.6. Let $d \le 3$ and $\gamma > d/4 + 1/2$. Then there exists a unique
solution $(U, V)$ to (6.14), (6.15) with $U \in R_E(s)$ and $V \in R_E(s+2)$ for any
$s < s^*$.
The proof is given in Appendix B.3.
This model is used to describe cell data in Section 6.2, where we assume
that the observation $X$ is given by the activator concentration $X = U$. The
activator dynamics in this model can be reduced to (6.1) as follows: First,
with the variation of constants formula, $V$ is determined by $U$ via
$$V_t = e^{t(D_V \Delta - \epsilon I)} V_0 + \epsilon b \int_0^t e^{(t-r)(D_V \Delta - \epsilon I)} U_r\,\mathrm{d}r, \tag{6.16}$$
where $I$ is the identity operator acting on $L^2(D)$ and $t \mapsto e^{t(D_V \Delta - \epsilon I)}$ is the
semigroup generated by $D_V \Delta - \epsilon I$. For simplicity, we assume $V_0 = 0$ here
and in the sequel. Now, the nonlinearity $F$ is given by
$$F_{\theta_1,\theta_2,\theta_3}(X)(t) = \theta_1 F_1(X_t) + \theta_2 F_2(X_t) + \theta_3 F_3(X)(t), \tag{6.17}$$
where
$$\theta_1 = k_1 u_0 \bar a, \quad \theta_2 = k_1, \quad \theta_3 = k_2 \epsilon b, \tag{6.18}$$
and
$$F_1(X_t) = -\frac{a(\|X_t\|_{L^2(D)})}{\bar a}\, X_t (u_0 - X_t), \tag{6.19}$$
$$F_2(X_t) = X_t^2 (u_0 - X_t), \tag{6.20}$$
$$F_3(X)(t) = -\int_0^t e^{(t-r)(D_V \Delta - \epsilon I)} X_r\,\mathrm{d}r. \tag{6.21}$$
This matches the statistically linear case (6.4) with $\bar F = 0$. Note that $F_3$
acts on the trajectory of $X$, such that the dynamics of $X$ is not Markovian,
even if the joint dynamics of $(U, V)$ is Markovian. Finally, $\theta_0 = D_U$ is the
diffusivity of the activator.
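The memory term $F_3$ can be evaluated mode by mode, since the semigroup acts on the $k$-th eigenfunction of $-\Delta$ by multiplication with $e^{-t(D_V \lambda_k + \epsilon)}$. The following sketch is an illustration under stated assumptions rather than the procedure used in the text: the function name `F3_modes` is hypothetical, the trajectory is treated as piecewise constant between sampling times, and the sign convention $F_3 = -\int_0^t \dots$ is chosen such that $\theta_3 F_3 = -k_2 V_t$ with $\theta_3 = k_2 \epsilon b$, as required by (6.14) and (6.16).

```python
import numpy as np

def F3_modes(x, lam, dt, D_V, eps):
    """Fourier coefficients of F_3(X)(t_n) from those of X.

    x   : array (n_t, N), x[n, k] = k-th Fourier coefficient of X at time n*dt
    lam : array (N,), eigenvalues of -Delta (nonnegative)
    Assumes x is constant on each interval [t_{n-1}, t_n) (left value).
    """
    mu = D_V * lam + eps            # decay rate of mode k under D_V*Delta - eps*I
    decay = np.exp(-mu * dt)        # semigroup over one time step
    gain = (1.0 - decay) / mu       # exact integral of exp(-mu*r) over one step
    conv = np.zeros(x.shape, dtype=float)
    for n in range(1, x.shape[0]):
        # propagate the accumulated convolution and add the newest contribution
        conv[n] = decay * conv[n - 1] + gain * x[n - 1]
    return -conv
```

The recursion avoids recomputing the full convolution at every time step, so the cost is linear in the number of frames.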
6.2 Application to Cell Data
Next, we apply the theory of joint diffusivity and reaction parameter
estimation to simulated and real giant cell data. Our main assumption is that the
data is generated by the FitzHugh–Nagumo model from Section 6.1.3. While
this is certainly the case for the numerical simulation (up to a discretization
error), it is less clear for data coming from microscopy observation.
As a first approximation, when estimating parameters of the process, we
always assume that the function $a$ is constant, $a \equiv \bar a$, where the latter value
is known or unknown. In this case, $F_1$ is replaced by
$$\tilde F_1(X_t) = -X_t (u_0 - X_t). \tag{6.22}$$
For simulated data, this models a misspecification of the true generating
dynamics. However, this is not severe, as $a(\|X_t\|_{L^2(D)})$ tends to oscillate around
its effective value. In this sense, if the real data is modeled accurately by
the FitzHugh–Nagumo model from Section 6.1.3, this additional assumption
will have little impact.
In order to compare the effect of different model assumptions on diffu-
sivity estimation, we construct a hierarchy of estimators, starting from the
purely linear case and taking into account an increasing number of features
from the FitzHugh–Nagumo model. The assumptions displayed here refer to
the description of the data used to perform parameter estimation, not the
generating process itself.
• $\hat\theta^{\mathrm{lin},N}_0$ is the estimator for $\theta_0$ coming from the assumption of a purely
linear model, i.e. a stochastic heat equation. In this case, $F = 0$, and
there are no other drift parameters to be estimated.
• $\hat\theta^{\mathrm{pol},N}_0$ is the diffusivity estimator based on a stochastic Schlögl (or
Nagumo) model [Sch72], i.e.
$$F(X) = k_1 X (u_0 - X)(X - \bar a u_0) = \theta_1 \tilde F_1(X) + \theta_2 F_2(X),$$
where both reaction parameters $\theta_1$ and $\theta_2$ are assumed to be known.
This model is capable of generating spatially extending phase transitions
for the concentration of $X$, and it arises formally from taking
$\epsilon \to 0$ in the stochastic FitzHugh–Nagumo model.
• $\hat\theta^{\mathrm{full},N}_0$ is the diffusivity estimator under the assumption of a full
FitzHugh–Nagumo model, i.e.
$$F(X) = k_1 X (u_0 - X)(X - \bar a u_0) - k_2 \epsilon b \int_0^{\cdot} e^{(\cdot - r)(D_V \Delta - \epsilon I)} X_r\,\mathrm{d}r
= \theta_1 \tilde F_1(X) + \theta_2 F_2(X) + \theta_3 F_3(X).$$
This model can generate traveling waves as observed in the cell data.
Again, all reaction parameters are assumed to be known.
While $\hat\theta^{\mathrm{full},N}_0$ incorporates the full model, the assumption of known
reaction parameters will be too strong. We further relax this assumption by
estimating an increasing number of reaction parameters simultaneously:
• $\hat\theta^{2,N}_0$ is the diffusivity estimator based on the full FitzHugh–Nagumo
model as $\hat\theta^{\mathrm{full},N}_0$, but with unknown $\theta_1$.
• $\hat\theta^{3,N}_0$ additionally treats $\theta_2$ as unknown.
• $\hat\theta^{4,N}_0$ treats all reaction parameters $\theta_1, \theta_2, \theta_3$ as unknown.
The superscript denotes the number of unknown parameters, including the
diffusivity $\theta_0$. Thus, $\hat\theta^{4,N}_0$ is an estimator for $\theta_0$ which uses qualitative, but
little quantitative knowledge on the generating process. While all estimators
are consistent with optimal rate as $N \to \infty$ by Theorem 6.4, their
performance in the non-asymptotic setting may vary strongly.
For all estimators we have described here, we set α= 0. This is reason-
able if the driving noise of the activator component is close to being white
noise.
The linear estimator $\hat\theta^{\mathrm{lin},N}_0$ is the same as $\hat\theta^{\mathrm{lin}}_N$ from Chapter 2, and it is
given by (2.26). In contrast to the other estimators considered here, it is scale
invariant in the sense that for any $c > 0$, the substitution $X \mapsto cX$ leaves
the resulting estimator $\hat\theta^{\mathrm{lin},N}_0$ invariant. It is clear that this invariance does
not hold if nonlinearities are taken into account. While the remaining
estimators use detailed information on the nonlinear model, their performance
depends on a careful calibration of the intensity of the input data in order
to match the fixed points of the third order polynomial in the reaction term.
In fact, this is a source of additional uncertainty. While the advantages of a
good reaction model clearly outweigh the benefit from scale invariance for
simulated data, as we will see in Section 6.2.1, this is less clear for real data,
where the reaction model may not fully capture the underlying dynamics,
see Section 6.2.2.
In this section, we work on two-dimensional rectangular domains of the
form $D = [0, L_1] \times [0, L_2]$ with $L_1, L_2 > 0$. In particular, the eigenfunctions
in $H$ of $-\Delta$ with periodic boundary conditions are given by $\Phi_{k,\ell}(x_1, x_2) =
\varphi_k(x_1/L_1)\,\varphi_\ell(x_2/L_2)$ for $(k, \ell) \in \mathbb{Z}^2$, where $\varphi_k(x) = \cos(2\pi k x)$ for $k \le 0$
and $\varphi_k(x) = \sin(2\pi k x)$ for $k > 0$. The eigenvalues are given by $\lambda_{k,\ell} =
4\pi^2 (k^2/L_1^2 + \ell^2/L_2^2)$. As before, we choose a reordering $r : \mathbb{N} \to \mathbb{Z}^2 \setminus \{0\}$ such
that $\lambda_N = \lambda_{r(N)}$ is increasing, with corresponding eigenfunction $\Phi_N = \Phi_{r(N)}$,
where we exclude the case $\lambda_{0,0} = 0$.
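A reordering of this kind can be computed by enumerating index pairs in a finite box and sorting by eigenvalue. The sketch below is only an illustration; the function names `ordered_modes` and `Phi` and the truncation parameter `kmax` are assumptions. The ordering is reliable only for eigenvalues below the smallest eigenvalue outside the box, and ties between equal eigenvalues are broken arbitrarily (here by enumeration order).

```python
import numpy as np

def ordered_modes(L1, L2, kmax):
    """Eigenvalues 4*pi^2*(k^2/L1^2 + l^2/L2^2) for |k|, |l| <= kmax,
    sorted increasingly, excluding (k, l) = (0, 0)."""
    ks = np.arange(-kmax, kmax + 1)
    k, l = np.meshgrid(ks, ks, indexing="ij")
    k, l = k.ravel(), l.ravel()
    keep = (k != 0) | (l != 0)                 # drop the zero eigenvalue
    k, l = k[keep], l[keep]
    lam = 4 * np.pi**2 * (k**2 / L1**2 + l**2 / L2**2)
    order = np.argsort(lam, kind="stable")     # ties keep enumeration order
    return lam[order], np.stack([k[order], l[order]], axis=1)

def Phi(k, l, x1, x2, L1, L2):
    """Unnormalized eigenfunction Phi_{k,l}: cosine for index <= 0, sine for index > 0."""
    def phi(m, x):
        return np.cos(2 * np.pi * m * x) if m <= 0 else np.sin(2 * np.pi * m * x)
    return phi(k, x1 / L1) * phi(l, x2 / L2)
```

For a square domain, each eigenvalue appears with multiplicity at least four, so the first $N$ modes should be chosen with care if a symmetric truncation is desired.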
6.2.1 Evaluation of Simulated Data
[Figure 6.1: four panels, each showing diffusivity (m²/s, axis scale 1e−13) against N from 0 to 800. Top left: $\theta_0$, $\hat\theta^{\mathrm{lin},N}_0$, $\hat\theta^{\mathrm{pol},N}_0$, $\hat\theta^{\mathrm{full},N}_0$. Top right: $\theta_0$ and $\hat\theta^{\mathrm{full},N}_0$ for $\bar a = 0.1$, $0.2$, $0.3$. Bottom left: $\theta_0$, $\hat\theta^{2,N}_0$, $\hat\theta^{3,N}_0$, $\hat\theta^{4,N}_0$. Bottom right: $\theta_0$ and $\hat\theta^{2,N}_0$ on the full domain, on a section, and on the periodified section.]
Figure 6.1: Performance of different diffusivity estimators on a numerical
simulation as $N$ gets large. Solid black line is plotted at the true value of
$\theta_0 = 1 \times 10^{-13}\,\mathrm{m}^2/\mathrm{s}$, dashed black line is plotted at zero. We restrict to $N \ge 25$
in order to avoid artifacts.
We simulate the system (6.14), (6.15) on a two-dimensional square with
side length $L = 75$ and periodic boundary conditions, starting from zero
initial conditions and with $\theta_0 = D_U = 0.1$. We use an explicit finite
difference scheme with spatial and temporal increments $\Delta x = 0.375$ and
$\Delta t = 0.01$, respectively. In order to mitigate the impact of the initial
conditions, we observe every 100th frame of the simulation in the shifted interval
$[T_0, T_1)$, where $T_0 = 500$ and $T_1 = 700$. The remaining drift parameters are
$D_V = 0.02$, $k_1 = k_2 = 1$, $u_0 = 1$, $\epsilon = 0.02$, $b = 0.2$. In the noise, we set $\gamma = 0$
and $\sigma = 0.1$. The unstable zero of the reaction potential is determined by
a(z) = 0.5b+0.5(z/(0.33u0L2)1). In order to compare the simulation to
real data, the unit length and unit time in these specifications are interpreted
as 1 µm and 1 s, respectively.
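For orientation, one step of an explicit finite difference scheme for (6.14), (6.15) might look as follows. This is a minimal sketch and not the simulation code behind the figures: the function names are hypothetical, the function $a$ is replaced by an illustrative constant `a=0.15`, and space-time white noise ($\gamma = 0$) is discretized by independent Gaussians scaled with $\sigma\sqrt{\Delta t}/\Delta x$, which is a common convention in two dimensions but an assumption here.

```python
import numpy as np

def laplacian_periodic(U, dx):
    """Five-point discrete Laplacian with periodic boundary conditions."""
    return (np.roll(U, 1, 0) + np.roll(U, -1, 0)
            + np.roll(U, 1, 1) + np.roll(U, -1, 1) - 4.0 * U) / dx**2

def euler_step(U, V, rng, dt=0.01, dx=0.375, D_U=0.1, D_V=0.02,
               k1=1.0, k2=1.0, u0=1.0, eps=0.02, b=0.2, sigma=0.1, a=0.15):
    """One explicit Euler step for activator U and inhibitor V (constant a assumed)."""
    react = k1 * U * (u0 - U) * (U - u0 * a) - k2 * V
    # discretized space-time white noise on a grid cell of area dx^2 (assumption)
    noise = sigma * np.sqrt(dt) / dx * rng.standard_normal(U.shape)
    U_new = U + (D_U * laplacian_periodic(U, dx) + react) * dt + noise
    V_new = V + (D_V * laplacian_periodic(V, dx) + eps * (b * U - V)) * dt
    return U_new, V_new
```

With the increments stated above, the grid has $75/0.375 = 200$ points per side, and the diffusive stability number $4 D_U \Delta t/\Delta x^2 \approx 0.028$ is far below one, so the explicit scheme is not restricted by the diffusion term.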
In Figure 6.1 (top left), the performance of $\hat\theta^{\mathrm{lin},N}_0$, $\hat\theta^{\mathrm{pol},N}_0$ and $\hat\theta^{\mathrm{full},N}_0$ with
$\bar a = 0.1$ is compared. The linear estimator severely underestimates the true
diffusivity. This can be explained as follows: The data exhibits steep
concentration gradients at the wave fronts, which are interpreted by $\hat\theta^{\mathrm{lin},N}_0$ as
coming from low diffusive forcing. In contrast, the estimator $\hat\theta^{\mathrm{pol},N}_0$, which
takes into account the bistable polynomial from the reaction term, heavily
overestimates the true value of $\theta_0$. As an explanation, while this estimator
is able to account for the phase transition at the wave front, the
concentration decay due to the inhibitor in the data is interpreted as fast diffusion.
Finally, $\hat\theta^{\mathrm{full},N}_0$ performs best of these three estimators, incorporating
knowledge on the full reaction model. Note, however, that in the simulation, $a$ is
not constant, such that even $\hat\theta^{\mathrm{full},N}_0$ does not have perfect information on the
dynamics of $X$. Rather, $a(\|X_t\|_{L^2(D)})$ oscillates around a value slightly larger
than 0.15 in the simulation. Figure 6.1 (top right) shows the sensitivity of
$\hat\theta^{\mathrm{full},N}_0$ to $\bar a$. Different a priori assumptions on $\bar a$ have a large impact on the
value of the estimator, even for large $N$. In contrast, Figure 6.1 (bottom left)
shows the performance of the estimators $\hat\theta^{2,N}_0$, $\hat\theta^{3,N}_0$ and $\hat\theta^{4,N}_0$, which treat the
reaction parameters as unknown. All of them determine the true value of $\theta_0$
rapidly, even if $a$ is misspecified in their description of the dynamics.
Apart from the form of the nonlinearity $F$, the behavior of $X$ at the
boundary impacts the performance of the estimators. In Figure 6.1 (bottom
right), we evaluate $\hat\theta^{2,N}_0$ on the original domain $D$ of 200 × 200 pixels, on
the restriction of $X$ to a subdomain of 75 × 75 pixels and its periodification,
as described below, on a square of 150 × 150 pixels. When evaluated on a
subdomain instead of $D$, the estimate deteriorates. A possible explanation
is given as follows: The assumption of periodic boundary conditions on the
subdomain leads to discontinuities of $X$ at its boundary. As before, these
discontinuities can be interpreted as steep gradients, which the diffusivity
estimator translates into low diffusivity present in the data.
As a remedy, we use a hands-on approach and periodify the data in the
sense that we take four copies of the data, mirror them on both coordinate
axes and glue them together in such a way that the resulting field is contin-
uous and fills a square with double side length compared to D, with periodic
boundary conditions. This way, we avoid the discontinuities, but the result-
ing field still does not satisfy the dynamics of Xat the boundaries of the
original domain D. Further, due to the introduced redundancies and change
in spatial extension, the estimators based on the original and periodified data
should be compared for different values of N.
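The mirroring construction described above can be sketched in a few lines. The function name `periodify` and the array convention (the last two axes are the spatial axes) are assumptions for illustration; boundary pixels are duplicated at the seams, which is one of several possible ways to glue the copies.

```python
import numpy as np

def periodify(X):
    """Glue four mirrored copies of a frame into a field of double side length.

    X : array (..., n1, n2); the last two axes are the spatial axes.
    The result has shape (..., 2*n1, 2*n2), is continuous across the seams
    and matches up under periodic continuation.
    """
    top = np.concatenate([X, X[..., :, ::-1]], axis=-1)     # mirror left-right
    return np.concatenate([top, top[..., ::-1, :]], axis=-2)  # mirror top-bottom
```

Because the function acts on the last two axes only, it can be applied to a whole stack of frames at once, e.g. an array of shape `(n_frames, n1, n2)`.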
While periodification seems to be a natural ad-hoc approach to deal with
the difficulties arising at the boundary, its performance will depend on the
specific situation, and it should be used with care. In order to understand
its performance better, a systematic study is needed.
[Figure 6.2: two panels, each showing diffusivity (m²/s, axis scale 1e−13) against N from 0 to 800. Left: $\theta_0$ and $\hat\theta^{\mathrm{lin},N}_0$ for $\sigma = 0.05$, $0.1$, $0.2$. Right: $\theta_0$ and $\hat\theta^{2,N}_0$ for $\sigma = 0.05$, $0.1$, $0.2$.]
Figure 6.2: Comparison of (left) $\hat\theta^{\mathrm{lin},N}_0$ and (right) $\hat\theta^{2,N}_0$ for different noise
intensity levels. We restrict to $N \ge 25$ in both panels. As before, the solid
black line is plotted at the true value $\theta_0 = 1 \times 10^{-13}\,\mathrm{m}^2/\mathrm{s}$, and the dashed black
line is plotted at zero.
In Figure 6.2, the effect of changing the noise intensity $\sigma$ on diffusivity estimation is shown. We simulate additional trajectories with $\sigma = 0.05$ and $\sigma = 0.2$, on which $\hat\theta_0^{\mathrm{lin},N}$ and $\hat\theta_0^{2,N}$ are evaluated. The former estimator is agnostic to the reaction model, whereas the latter includes the full FitzHugh–Nagumo model. While $\hat\theta_0^{2,N}$ performs well regardless of the noise intensity, the behavior of $\hat\theta_0^{\mathrm{lin},N}$ is heavily influenced by the noise level, with large noise intensity leading to better results. In this sense, a large noise intensity hides the effect of the nonlinear term. This is in accordance with the Monte–Carlo simulation for the stochastic Allen–Cahn equation in Section 2.5. We note that for $\sigma = 0.2$, the simulation is no longer capable of generating traveling waves due to large fluctuations in the driving noise.
Figure 6.3: (top row) Performance of $\hat\theta_0^{\mathrm{lin},N}$ on spatially smoothed observations (no filter, $\sigma_f = 1 \times 10^{-7}$ m, $\sigma_f = 2 \times 10^{-7}$ m). (bottom row) Performance of different diffusivity estimators ($\hat\theta_0^{\mathrm{lin},N}$, $\hat\theta_0^{2,N}$, $\hat\theta_0^{3,N}$, $\hat\theta_0^{4,N}$) on the data, without spatial smoothing. The data is considered (left column) without and (right column) with periodification. Horizontal axis: $N$; vertical axis: diffusivity in m²/s. Dashed line is plotted at zero. As before, we restrict to $N \geq 25$ in all panels.
6.2.2 Evaluation of Real Data
We first describe the data we are working with. See [PFA+21, Appendix B] for a description of the experimental setup.² Each observation consists of a sequence of rectangular frames of varying length, resolution and aspect ratio, describing the observed intensity of an actin marker within a D. discoideum giant cell. The regions considered lie completely inside the cell, i.e. no cell boundaries appear in the data. The concentration of the actin marker is given by grey values ranging from 0 to 255 at each pixel. When evaluating the data, this intensity is standardized to the interval $[0, 1]$, such that absence of the activator and the maximal activator intensity match the stable fixed points 0 and $u_0 = 1$, respectively.³ As the data exhibits traveling waves, it is assumed that the actin concentration (or actin marker concentration) can be described by (6.14), (6.15).
² The giant cell data used in this chapter has been provided by Sven Flemming.
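The standardization step can be illustrated as follows. This is a minimal sketch, assuming 8-bit grey values stored as a list of rows; the helper name `standardize` is hypothetical.

```python
def standardize(frame, max_grey=255):
    """Rescale grey values from {0, ..., max_grey} to the interval [0, 1].

    After rescaling, a grey value of 0 corresponds to the stable fixed
    point 0 and a grey value of max_grey to the stable fixed point u0 = 1.
    """
    return [[value / max_grey for value in row] for row in frame]


if __name__ == "__main__":
    print(standardize([[0, 51], [102, 255]]))
```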
In this section, we estimate the diffusivity on a single giant cell observa-
tion. The analysis of a population of cells is postponed to Section 6.2.3.
As a first consistency check, we consider the behavior of the data set under convolution. For $k \in L^1(D)$, let $T_k : L^2(D) \to L^2(D)$ be the convolution operator given by $(T_k f)(y) = \int_D k(y - x) f(x)\,dx$, where $k$ and $f$ are identified with their periodic continuation. It is well-known that $T_k \Delta = \Delta T_k$. In particular, if $X$ is generated by a (stochastic, perturbed) heat equation with (homogeneous and isotropic) diffusivity $\theta_0$, the same is true for $T_k X$, although the structure of the nonlinear term and the noise changes. Thus, if the describing model is reasonable, it is to be expected that the estimated diffusivity of $T_k X$ is close to that of $X$. We use a family of kernels $k = k(\sigma_f)$ for $\sigma_f > 0$, which are constructed as a Gaussian density with standard deviation $\sigma_f$, truncated to the rectangle $[-L_1/2, L_1/2] \times [-L_2/2, L_2/2]$ and normalized in $L^1(D)$. We use the standard deviations $\sigma_f = 1 \times 10^{-7}$ m and $\sigma_f = 2 \times 10^{-7}$ m. The performance of $\hat\theta_0^{\mathrm{lin},N}$ on data smoothed with $k(\sigma_f)$ is shown in Figure 6.3 (top left) without periodification, and in Figure 6.3 (top right) for the periodified data. While not in perfect alignment, the estimator graphs are very close. For the data without periodification, the decrease of $\hat\theta_0^{\mathrm{lin},N}$ in $N$ is slightly more pronounced. For the periodified data, which cannot be expected to satisfy (6.14), (6.15) on the boundaries of the four sub-patches of its enlarged domain, the graphs of the estimators are nonetheless almost indistinguishable. In this sense, periodification seems to retain convolution invariance. In total, these results support the hypothesis that the data is generated by a stochastic partial differential equation with diffusive forcing stemming from a second order differential operator.
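The smoothing step and the commutation property $T_k \Delta = \Delta T_k$ can be checked on a grid. The following is a sketch under simplifying assumptions: the field lives on a periodic pixel grid with spacing `dx`, the convolution is evaluated by a direct sum rather than by a fast transform, and the Laplacian is replaced by the periodic five-point stencil (which, like $\Delta$, is translation invariant). All function names are hypothetical.

```python
import math


def gaussian_kernel(nx, ny, dx, sigma_f):
    """Gaussian density with standard deviation sigma_f on a periodic grid.

    The grid covers exactly one period, so the density is automatically
    truncated to the rectangle [-L1/2, L1/2] x [-L2/2, L2/2]. The kernel is
    normalized such that its discrete L1 norm (sum times dx^2) equals one.
    """
    def offset(i, n):
        # signed distance of grid index i from the origin on a periodic grid
        return (i if i <= n // 2 else i - n) * dx

    k = [[math.exp(-(offset(i, nx) ** 2 + offset(j, ny) ** 2) / (2.0 * sigma_f ** 2))
          for j in range(ny)] for i in range(nx)]
    mass = sum(sum(row) for row in k) * dx * dx
    return [[v / mass for v in row] for row in k]


def convolve_periodic(k, f, dx):
    """Discrete version of (T_k f)(y) = int_D k(y - x) f(x) dx."""
    nx, ny = len(f), len(f[0])
    return [[dx * dx * sum(k[(i - a) % nx][(j - b) % ny] * f[a][b]
                           for a in range(nx) for b in range(ny))
             for j in range(ny)] for i in range(nx)]


def laplacian_periodic(f, dx):
    """Five-point stencil Laplacian with periodic boundary conditions."""
    nx, ny = len(f), len(f[0])
    return [[(f[(i + 1) % nx][j] + f[(i - 1) % nx][j]
              + f[i][(j + 1) % ny] + f[i][(j - 1) % ny] - 4.0 * f[i][j]) / dx ** 2
             for j in range(ny)] for i in range(nx)]
```

Applying `convolve_periodic` before or after `laplacian_periodic` gives the same field up to rounding, which is the discrete counterpart of the identity used above. For frames of realistic size, a transform-based convolution would be the natural choice; the direct sum is used here only to keep the sketch free of dependencies.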
Now we proceed to the nonlinear reaction model. Based on the performance of the estimators from Section 6.2.1, we compare $\hat\theta_0^{\mathrm{lin},N}$ with $\hat\theta_0^{2,N}$, $\hat\theta_0^{3,N}$ and $\hat\theta_0^{4,N}$, which incorporate knowledge on the full FitzHugh–Nagumo model. The performance on cell data is shown in Figure 6.3 (bottom left), and the performance on periodified data is shown in Figure 6.3 (bottom right). Interestingly, the model-free estimator $\hat\theta_0^{\mathrm{lin},N}$ behaves similarly to $\hat\theta_0^{3,N}$ and $\hat\theta_0^{4,N}$, which are the most flexible estimators we consider and which do not fix the reaction rate corresponding to the bistable potential in the drift. This pattern also appears in different sample cells. In terms of diffusivity estimation, the detailed reaction model does not seem to yield additional benefit, in contrast to the case of simulated data from Section 6.2.1. This can be seen as a hint that the FitzHugh–Nagumo model, while being capable of generating traveling waves, misses additional features of the true intracellular dynamics. For example, it may be helpful to consider models which mimic the biophysical reaction pathway more closely. On the other hand, $\hat\theta_0^{2,N}$, which fixes the parameters describing the reaction intensities in advance, deviates from the other estimators, but finally approaches them. Further evaluations suggest that changing $u_0$ does not alter the general picture.
The performance of the estimators on the periodified sample is similar to the case of the original sample. In accordance with the discussion from the previous section, the estimated diffusivity increases in that case.
³ Such calibration is necessary for all estimators except $\hat\theta_0^{\mathrm{lin},N}$.
6.2.3 Evaluation of a Cell Population
We consider a population of 36 giant cell observations, as described in the previous section. The spatial extension of each data set is clipped in such a way that only the interior dynamics is captured, i.e. no cell boundaries appear in the data. As a consequence, the spatial resolution varies within the cell population. It is natural to assume that the range of possible $N$ that yields meaningful results grows with the resolution of the sample. In general, while the estimate will be more precise for large $N$, discretization effects depending on the spatial resolution will render arbitrarily large $N$ useless. In order to find a reasonable tradeoff, we apply the following heuristics: If each frame within a data set consists of $r_x \times r_y$ pixels, we set $N_{\mathrm{stop}} = \lfloor 4 r_x r_y / R^2 \rfloor$, where $R$ is a parameter representing the number of pixels needed in order to extract meaningful information on $[0, 2\pi]$ by testing with a sine function. For example, if $r_x = r_y = R$, then $N_{\mathrm{stop}} = 4$, and only the first four eigenfunctions $\Phi_{\pm 1, \pm 1}$ are taken into account, whose period is $R$ pixels in both dimensions. We choose $R = 12$ for the cell population and $R = 24$ for the periodified population. Further, we set $N_{\mathrm{const}} = 899$ and evaluate the estimator $\hat\theta_0^{3,N}$ at $N = N_{\mathrm{const}}$ and $N = N_{\mathrm{stop}}$. Results are shown in Figure 6.4. Note that $N_{\mathrm{stop}}$ encodes the resolution of the frames within a sample.⁴ We see that choosing $N$ based on the spatial resolution decorrelates the estimate for $\theta_0$ from the resolution of the frames. Further, the estimated diffusivities for all samples considered have a similar magnitude. This indicates that the concept of effective diffusivity can be useful for statistical analysis on cell samples of the same or possibly different populations.
Figure 6.4: The estimator $\hat\theta_0^{3,N}$ evaluated at $N = N_{\mathrm{const}}$ (left column) and $N = N_{\mathrm{stop}}$ (right column) is plotted against $N_{\mathrm{stop}}$ for a sample of 36 observations (top row, $N_{\mathrm{stop}} = r_x r_y / 36$) and their periodification (bottom row, $N_{\mathrm{stop}} = r_x r_y / 144$). Vertical axis: diffusivity in m²/s. The least square regression lines are plotted in red. The p-value in each display comes from a two-sided t-test with null hypothesis that the slope is zero: $p = 0.06$ (top left), $p = 0.75$ (top right), $p = 0.01$ (bottom left), $p = 0.27$ (bottom right).
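The stopping heuristic is simple enough to state as code. The sketch below assumes that the quotient is rounded down to an integer number of modes; the function name `n_stop` is hypothetical.

```python
def n_stop(rx, ry, R):
    """Resolution-dependent number of modes, 4 * rx * ry / R**2 rounded down.

    rx, ry: frame size in pixels; R: number of pixels per period required
    to resolve a sine function (R = 12 for the original frames, R = 24 for
    the periodified frames in the text).
    """
    return (4 * rx * ry) // (R * R)


if __name__ == "__main__":
    print(n_stop(12, 12, 12))     # frame of R x R pixels: four eigenfunctions
    print(n_stop(100, 80, 12))    # original frame
    print(n_stop(200, 160, 24))   # periodified frame, doubled side lengths
```

Since periodification doubles both side lengths, doubling $R$ at the same time leaves $N_{\mathrm{stop}}$ unchanged, so the original and the periodified sample are evaluated at the same number of modes under this rule.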
In addition to the inhomogeneous spatial resolution, the number of frames (i.e. the temporal resolution) varies within the population. However, the estimate tends to stabilize in time, such that this does not pose a problem. We further note that the population is not homogeneous with respect to the side length $\Delta x$ of a pixel and the temporal distance $\Delta t$ between two frames. Further tests indicate that the estimated diffusivity correlates with the characteristic diffusivity $\Delta x^2 / \Delta t$. However, a detailed analysis of the resulting effects, including the impact of discretization and possible scale dependence of the diffusivity, is beyond the scope of the present work.
⁴ In fact, $N_{\mathrm{stop}}$ grows like the number of pixels in each frame. If each pixel is interpreted as a point evaluation of the underlying process (rather than a local average), this is in accordance with Example 4.8 and (4.50).
6.2.4 The Effective Unstable Zero
Figure 6.5: Estimated unstable fixed point $\hat\theta_1^{2,N}$, plotted against $N$, for simulated data (left; $T = 68.0$, $T = 134.0$, $T = 199.0$) and real data (right; $T = 260.02$, $T = 518.04$, $T = 775.09$). Frames up to time $T$ (in seconds) are used to calculate $\hat\theta_0^{2,N}$, starting from the first frame in the sample. As before, we restrict to $N \geq 25$.
When using $\hat\theta_0^{2,N}$ in order to estimate the diffusivity $\theta_0$, we simultaneously obtain an estimate $\hat\theta_1^{2,N}$ for $\theta_1$ by solving (6.7). As $\theta_1 = k_1 u_0 \bar a$ and $k_1 = u_0 = 1$ by assumption, we can identify $\theta_1$ with the effective unstable zero $\bar a$ from the reaction term. In Figure 6.5, the performance of this estimate is displayed for simulated data and an experimentally observed sample.
The term $a(\|X_t\|_{L^2(D)})$ oscillates around an effective value slightly larger than 0.15 in the numerically simulated trajectory. Even if this value is approximated, we see that the quality of the estimate does not improve with increasing $N$. Indeed, this cannot be expected, as the reaction term is of order zero: It is known [HR95] that the maximum likelihood estimate of the coefficient of a linear order zero perturbation to a stochastic heat equation converges only with logarithmic rate in dimension $d = 2$. On the other hand, the long-time behavior can also be considered, including an increasing number of frames in the evaluation. The left-hand panel in Figure 6.5 shows that the effective value is approached with larger $T$.⁵
In the case of real data, the results fall into the interval $(0, 1)$ and are rather stable. This indicates that the “effective unstable fixed point under the reaction model $F$”, defined as the value at which $\hat\theta_1^{2,N}$ stabilizes, can be used in a meaningful way for statistically evaluating spatiotemporal cell data exhibiting traveling waves.
6.2.5 The Effective Diffusivity Outside the Cell
Figure 6.6: (left) Performance of $\hat\theta_0^{\mathrm{lin},N}$ on a data set consisting of pure noise outside the cell (horizontal axis: $N$; vertical axis: diffusivity in m²/s). (right) Comparison of the energy $A_N(X)_{0,0}$ of a measurement inside and outside the cell, with the same spatiotemporal extension (horizontal axis: $N$; vertical axis: energy). In both panels, the dashed line is plotted at zero. As before, we restrict to $N \geq 25$.
When formally applying the estimation procedure to a data set consisting of pure noise, i.e. a region of a microscopy data set where no part of the cell is present, we obtain a result that can be named “effective diffusivity outside the cell” or “effective diffusivity of the noise”. Here, we restrict ourselves to the use of $\hat\theta_0^{\mathrm{lin},N}$, i.e. $F = 0$. The result is shown in the left panel of Figure 6.6. Comparing with Figure 6.3, the effective diffusivity of a pure noise observation can even exceed the value obtained from a region inside a cell.⁶ It is important to understand the order of magnitude of this effective value, as well as its impact on diffusivity estimation within the cell.
⁵ Typically, under ergodicity assumptions, consistency with convergence rate $T^{1/2}$ can be expected for estimators of maximum likelihood type if $T$ increases, see e.g. the monograph [Kut04] for SDEs, [Log84, KL85] for linear SPDEs. In [DMPD00, GM02], large time consistency is proven in the context of semilinear SPDEs.
We derive the magnitude of the effective diffusivity outside the cell heuristically: A pixel can be described by a weighted indicator function of a square within $D$, where the weight describes the intensity. In the case of pure noise, the instantaneous disappearance of such a pixel in the next frame can be interpreted as fast diffusion within the time between two frames. In order to understand the magnitude of $\theta_0$ needed for that effect, we approximate the indicator function of the pixel by a Gaussian density. For $t > 0$, let $\varphi_t : \mathbb{R}^2 \to \mathbb{R}$ be the centered Gaussian density in two dimensions with covariance matrix $tI$. This density attains its maximum at $x = 0$, with $\varphi_t(0) = 1/(2\pi t)$. Let $\Delta t$ be the time between two frames, let $\Delta x$ be the side length of a pixel within each frame. We set $\sigma_0 = \Delta x / 2$. In this case, the distance between the inflection points of the one-dimensional marginals of $\varphi_{\sigma_0^2}$ matches the side length of a pixel, and we take $\varphi_{\sigma_0^2}$ as an approximation for the pixel. After time $\Delta t$, the heat semigroup on $\mathbb{R}^2$ with diffusivity $\theta_0$ maps $\varphi_{\sigma_0^2}$ to $\varphi_{\sigma_0^2 + 2\theta_0 \Delta t}$. Now let the decay of the maximal value of the density, $\varphi_{\sigma_0^2}(0) / \varphi_{\sigma_0^2 + 2\theta_0 \Delta t}(0)$, be at least as large as some threshold $b > 0$, i.e. $(\sigma_0^2 + 2\theta_0 \Delta t)\,\sigma_0^{-2} \geq b$. This is equivalent to
$$\theta_0 \geq \frac{b - 1}{8}\,\frac{\Delta x^2}{\Delta t}. \tag{6.23}$$
In the data sample from Figure 6.6 (left), we have $\Delta x = 2.08 \times 10^{-7}$ m and $\Delta t = 0.97$ s. The decay factor $b$ depends on the particular noise pixel and its intensity within the data set. Reasonable values are given for $b \leq 30$. For example, if $b = 30$, then $\theta_0 \geq 1.6 \times 10^{-13}$ m²/s, if $b = 20$, then $\theta_0 \geq 1 \times 10^{-13}$ m²/s, and if $b = 15$, then $\theta_0 \geq 7.8 \times 10^{-14}$ m²/s. This matches the order of the observed diffusivity from Figure 6.6 (left): For example, if we apply the stopping rule from Section 6.2.3 to this case, i.e. $N_{\mathrm{stop}} = \lfloor 4 r_x r_y / R^2 \rfloor$ with $R = 12$, then we obtain $N_{\mathrm{stop}} = 165$ and $\hat\theta_0^{\mathrm{lin},N} = 1.36 \times 10^{-13}$ m²/s for $N = N_{\mathrm{stop}}$, in accordance with the heuristic derivation in this section.
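The numbers quoted for the bound (6.23) can be reproduced with a short calculation. The sketch below is only an arithmetic check; the function names are hypothetical, and the pixel size and frame distance are the values quoted above.

```python
def diffusivity_lower_bound(b, dx, dt):
    """Right-hand side of (6.23): (b - 1) / 8 * dx**2 / dt."""
    return (b - 1) / 8.0 * dx ** 2 / dt


def peak_decay(theta, dx, dt):
    """Ratio of the Gaussian peak before and after diffusing for time dt.

    The pixel is approximated by a Gaussian with variance sigma0**2, where
    sigma0 = dx / 2; the heat semigroup increases the variance by 2*theta*dt.
    """
    sigma0_sq = (dx / 2.0) ** 2
    return (sigma0_sq + 2.0 * theta * dt) / sigma0_sq


if __name__ == "__main__":
    dx, dt = 2.08e-7, 0.97  # pixel side length (m), time between frames (s)
    for b in (30, 20, 15):
        theta = diffusivity_lower_bound(b, dx, dt)
        print(b, theta, peak_decay(theta, dx, dt))
```

At the bound, the peak decay equals $b$ exactly, which confirms that (6.23) is the threshold form of the condition $(\sigma_0^2 + 2\theta_0 \Delta t)\,\sigma_0^{-2} \geq b$.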
⁶ This observation also applies to the cell sample used in the right panel of Figure 6.6.
We have seen that the observed diffusivity of the noise can be larger than the estimated diffusivity of the signal within the cell. Nonetheless, the noise described here does not interfere with the diffusivity estimate of the signal. This can be explained as follows: Assume that the signal process $X^{\mathrm{sig}}$ is perturbed by measurement noise $W^{\mathrm{meas}}$. This means that inside the cell, we observe $X = X^{\mathrm{sig}} + W^{\mathrm{meas}}$ instead of $X = X^{\mathrm{sig}}$ as supposed previously, while outside the cell, only $X = W^{\mathrm{meas}}$ is observed. It is instructive to consider the energy $A_N(X)_{0,0}$ for both cases separately. This is done in the right panel of Figure 6.6. The value of $A_N(W^{\mathrm{meas}})_{0,0}$ outside the cell is orders of magnitude below the energy within the cell, at least at the resolution level we consider. Consequently, $A_N(X^{\mathrm{sig}} + W^{\mathrm{meas}})_{0,0}$ is indistinguishable from $A_N(X^{\mathrm{sig}})_{0,0}$. Thus, it does not make a difference if $\hat\theta_0^{\mathrm{lin},N}$ is evaluated on $X^{\mathrm{sig}} + W^{\mathrm{meas}}$ or on $X^{\mathrm{sig}}$ itself, and the measurement noise has very little impact on the estimated diffusivity of the signal.
However, from a mathematical perspective, adding noise to $X^{\mathrm{sig}}$ has an impact on its regularity, such that the theoretical properties of $\hat\theta_0^{\mathrm{lin},N}$ for $N \to \infty$ will change, depending on the precise model assumptions.
Chapter 7
Further Research
As exposed in the introduction, statistical inference for SPDEs is a source
for diverse mathematical research. The field is continuously expanding, and
it keeps incorporating new models and methods. In this work, we consid-
ered parameter estimation for semilinear equations in different asymptotic
regimes, together with possible model misspecification. To conclude, we give
a list of further interesting mathematical questions related to the topic of
this work. This list is by no means exhaustive, and it should be considered
a suggestion for possible further research.
• Beyond semilinear models, one can consider quasilinear equations, e.g. with state-dependent diffusivity.
• Further types of model misspecification can be studied: For example, this includes the effect of an inhomogeneous or anisotropic diffusivity on the estimators from Chapter 2.
• The impact of measurement noise can be analyzed systematically.
• Apart from the spatially discretized Laplacian used in Chapter 4, which is based on a Fourier decomposition, it is interesting to consider the classical discretization on a three-point stencil or five-point stencil (in dimension one or two), and to study the asymptotics as $h \to 0$.
• In the context of Chapter 4, it remains open if the rates from Theorem 4.7 can be achieved for domains with arbitrary geometry.
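To illustrate the stencil discretization mentioned in the list above, the following sketch compares the eigenvalues of the three-point stencil on a periodic grid of $[0, 2\pi]$ with those of the Laplacian. It relies only on the standard identity that $\sin(kx)$ is an eigenfunction of the stencil with eigenvalue $-(2/h^2)(1 - \cos(kh))$, which tends to $-k^2$ as $h \to 0$. The function names are hypothetical, and the sketch makes no statement about the estimators themselves.

```python
import math


def stencil_eigenvalue(k, h):
    """Eigenvalue of the periodic three-point stencil for the mode sin(k x)."""
    return -(2.0 / h ** 2) * (1.0 - math.cos(k * h))


def apply_stencil(u, h):
    """Apply (u[j+1] - 2 u[j] + u[j-1]) / h**2 with periodic wrap-around."""
    m = len(u)
    return [(u[(j + 1) % m] - 2.0 * u[j] + u[(j - 1) % m]) / h ** 2
            for j in range(m)]


if __name__ == "__main__":
    k = 3
    for m in (16, 64, 256, 1024):
        h = 2.0 * math.pi / m
        # the printed eigenvalue approaches -k**2 as the mesh size h decreases
        print(m, stencil_eigenvalue(k, h))
```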
Appendix A
Limit Theorems
The following martingale central limit theorem is a special case of [LS89,
Theorem 5.5.4 (I)], [JS03, Theorem VIII.2.4].
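Theorem A.1. Let $(M^N)_{N \in \mathbb{N}}$ be a sequence of real-valued continuous local martingales with $M^N_0 = 0$, and let $T > 0$ be such that $\langle M^N \rangle_T \xrightarrow{\mathbb{P}} 1$ for $N \to \infty$. Then $M^N_T \xrightarrow{d} \mathcal{N}(0, 1)$ as $N \to \infty$.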
We will repeatedly use the following version of the law of large numbers,
which exploits Gaussianity:
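Lemma A.2. Let $(X_k)_{k \in \mathbb{N}}$ be independent centered Gaussian processes on $[0, T]$, set $Y_k := \int_0^T X_k(t)^2\,dt$ and $Z_N = \sum_{k=1}^N Y_k$.
(i) If $\mathrm{Var}(Y_k) \ll (\mathbb{E} Y_k)^2$ as $k \to \infty$, then $Y_k / \mathbb{E} Y_k \xrightarrow{\mathbb{P}} 1$.
(ii) If $\mathbb{E} Y_k \sim C k^{\alpha}$ as $k \to \infty$ for some constants $C > 0$, $\alpha \in \mathbb{R}$, then $Z_N / \mathbb{E} Z_N \xrightarrow{\text{a.s.}} 1$ as $N \to \infty$.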
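Proof. The first statement is a direct consequence of the Markov inequality:
$$\mathbb{P}\left( \left| \frac{Y_k}{\mathbb{E} Y_k} - 1 \right| > \epsilon \right) \leq \frac{\mathrm{Var}(Y_k)}{\epsilon^2 (\mathbb{E} Y_k)^2} \to 0$$
for every $\epsilon > 0$. Now we prove (ii). As the $(X_k)$ are Gaussian with mean zero, the Wick theorem [Jan97, Theorem 1.28] gives
$$\mathrm{Var}(Y_k) = \int_0^T \int_0^T \Big( \mathbb{E}[X_k(t) X_k(t) X_k(s) X_k(s)] - \mathbb{E}[X_k(t) X_k(t)]\,\mathbb{E}[X_k(s) X_k(s)] \Big)\,ds\,dt \tag{A.1}$$
$$= 2 \int_0^T \int_0^T \mathbb{E}[X_k(t) X_k(s)]^2\,ds\,dt \leq 2 \left( \mathbb{E} \int_0^T X_k(t)^2\,dt \right)^2 = 2 (\mathbb{E} Y_k)^2.$$
W.l.o.g. assume $\mathbb{E} Y_1 > 0$, such that the denominator in the following estimates is positive. (Otherwise, if $\mathbb{E} Y_1 = 0$, then (A.1) implies $Y_1 = 0$ almost surely, and $Y_1$ does not contribute to $Z_N$.) We have for $\alpha > -1$:
$$\frac{\mathrm{Var}(Y_k)}{(\mathbb{E} Z_k)^2} \leq \frac{2 (\mathbb{E} Y_k)^2}{\left( \sum_{i=1}^k \mathbb{E} Y_i \right)^2} \lesssim \frac{2 C^2 k^{2\alpha}}{\left( C (\alpha + 1)^{-1} k^{\alpha + 1} \right)^2} \lesssim \frac{1}{k^2}.$$
Similarly, for $\alpha = -1$,
$$\frac{\mathrm{Var}(Y_k)}{(\mathbb{E} Z_k)^2} \lesssim \frac{1}{k^2 \ln(k)^2}.$$
Finally, if $\alpha < -1$, the denominator $(\mathbb{E} Z_k)^2$ converges for $k \to \infty$, and
$$\frac{\mathrm{Var}(Y_k)}{(\mathbb{E} Z_k)^2} \lesssim (\mathbb{E} Y_k)^2 \lesssim k^{2\alpha} \lesssim \frac{1}{k^2}.$$
In any case, we have $\sum_{k=1}^{\infty} \mathrm{Var}(Y_k) / (\mathbb{E} Z_k)^2 < \infty$, and the strong law of large numbers [Shi96, Theorem IV.3.2] implies the claim.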
Appendix B
Additional Proofs
B.1 Proof of Proposition 2.17
We prove the statement in two parts.
Lemma B.1. In the situation of Proposition 2.17, there is a unique solution $X$ to (2.39) that satisfies $X \in R_E(s)$ for $s = 1$.
Proof. This is an application of the arguments in [LR15, Theorem 5.1.3]. More precisely, we show that the assumptions (H1) (continuity) and (H2′) (monotonicity) therein are satisfied in the Gelfand triple $W_0^{1,2}(D) \subset L^2(D) \simeq L^2(D)^* \subset W_0^{1,2}(D)^*$ (i.e. $H^1 \subset H^0 \subset H^{-1}$), whereas (H3) (coercivity) and (H4′) (boundedness) are satisfied in the shifted triple $W^{2,2}(D) \cap W_0^{1,2}(D) \subset W_0^{1,2}(D) \simeq W_0^{1,2}(D)^* \subset L^2(D)$ (i.e. $H^2 \subset H^1 \subset H^0$). First note that in all cases considered, $\partial_x f : \mathbb{R} \to \mathbb{R}$ is bounded from above, and thus there is $c > 0$ such that for any $X \in L^2(D)$ and $Y : D \to \mathbb{R}$:
$$\langle \partial_x f(Y) X, X \rangle_{L^2(D)} \leq c \|X\|^2_{L^2(D)}. \tag{B.1}$$
Since $\gamma > d/4 + 1/2$, $(-\Delta)^{1/2} B$ is a Hilbert–Schmidt operator on $H$, i.e. the dispersion operator $B$ is a Hilbert–Schmidt operator from $H$ to $H^1$. As $B$ is constant, it suffices to test (H1), (H2′), (H3), (H4′) only for the drift of the SPDE (2.39).
(H1) If $f$ is a polynomial, this is a trivial consequence of the binomial theorem. On the other hand, if $f \in C_b^{\infty}(\mathbb{R})$ and $u, v, w \in H^1$, then $t \mapsto {}_{H^{-1}}\langle \theta \Delta (u + tv) + f(u + tv), w \rangle_{H^1}$ is continuous as a consequence of the linearity of $\Delta$ and the dominated convergence theorem.
(H2′) This is a consequence of (B.1): For $u, v \in H^1$,
$${}_{H^{-1}}\langle \theta \Delta u + f(u) - \theta \Delta v - f(v), u - v \rangle_{H^1} \leq \langle f(u) - f(v), u - v \rangle_{H^0} = \langle \partial_x f(w)(u - v), u - v \rangle_{H^0} \leq c \|u - v\|^2_{H^0}$$
for some $w : D \to \mathbb{R}$.
(H3) For $u \in H^2$,
$${}_{L^2(D)}\langle \theta \Delta u + f(u), u \rangle_{H^2} = -\theta \|u\|^2_{H^2} + \langle f(u), u \rangle_{H^1} = -\theta \|u\|^2_{H^2} + \langle \partial_x f(u) \nabla u, \nabla u \rangle_{L^2(D)} \leq -\theta \|u\|^2_{H^2} + c \|u\|^2_{H^1},$$
where we used (B.1) componentwise in the last inequality.
(H4′) First consider the case that $f$ is a polynomial. W.l.o.g., assume $f(x) = x^k$ for $k \in \mathbb{N}$. Then for $u \in H^2$:
$$\|f(u)\|^2_{L^2(D)} = \|u\|^{2k}_{L^{2k}(D)},$$
and for $d \leq 2$ this term is bounded by $\|u\|^{2k}_{W^{1,2}(D)} = \|u\|^{2k}_{H^1}$ up to a constant. In $d = 3$ this is still true if $k \leq 3$. This proves (H4′). Finally, if $f \in C_b^{\infty}(\mathbb{R})$, then $\|f(u)\|^2_{L^2(D)} \leq |D| \sup_{x \in \mathbb{R}} f(x)^2 < \infty$, and (H4′) is trivially satisfied.
Now as in [LR15, Lemma 5.1.4 and 5.1.5], (H3) and (H4′) imply that there is a sequence of finite dimensional approximations $X^{(n)}$ to the solution $X$ which is bounded uniformly in $L^2(\Omega \times [0, T]; H^{s+1})$ and $L^p(\Omega; L^{\infty}(0, T; H^s))$ for $s = 1$, and such that $\theta \Delta X^{(n)} + f(X^{(n)})$ is bounded uniformly in $L^2(\Omega \times [0, T]; H^{s-1})$ for $s = 1$. In particular, these statements remain true for the (weaker) case $s = 0$. Based on these bounds for $s = 0$, the proof of [LR15, Theorem 5.1.3] transfers verbatim and yields a unique solution $X$ to (2.39) with $X \in R_E(0)$. The stronger a priori bounds ($s = 1$) imply that in fact $X \in R_E(1)$, which concludes the proof.
Lemma B.2. In the situation of Proposition 2.17, there is $s > d/2$ such that $X \in R_E(s)$.
Proof. In $d = 1$, this has been proven in Lemma B.1, so let $d \in \{2, 3\}$. We apply the usual splitting argument and write $X = \bar X + \tilde X$, where $\bar X$ is the solution to (2.39) with $f = 0$. Then $\bar X \in R_E(s)$ for any $s < s^* = 1 + 2\gamma - d/2$, see [DPZ14, Section 5.3]. As $\gamma > d/4 + 1/2$, we have in particular $\bar X \in R_E(2)$. As a consequence, the claim is proven once we know that for all $0 < \eta < 2$,
$$\tilde X \in R_E(\eta). \tag{B.2}$$
(i) For polynomial $f$, we assume w.l.o.g. $f(x) = x^k$, where $k$ is arbitrary in $d = 2$ and $k \leq 3$ in $d = 3$. In this case, $\|f(X)\|_{L^2(D)} = \|X\|^k_{L^{2k}(D)} \lesssim \|X\|^k_{H^1}$. Consequently, $f(X) \in R_E(0)$ because $X \in R_E(1)$. Similar to the proof of Proposition 2.3, we estimate for $0 \leq t \leq T$:
$$\|\tilde X_t\|_{\eta} \leq \|e^{t\theta\Delta} X_0\|_{\eta} + \int_0^t \|e^{(t - r)\theta\Delta} f(X_r)\|_{\eta}\,dr \lesssim \|X_0\|_{\eta} + \frac{2}{2 - \eta}\,T^{1 - \eta/2} \sup_{0 \leq r \leq t} \|f(X_r)\|_{L^2(D)}.$$
As $f(X) \in R_E(0)$ and $\mathbb{E}\|X_0\|^p_{s^* + 2} < \infty$ for any $p \geq 1$, we conclude that (B.2) holds true.
(ii) Let $f \in C_b^{\infty}(\mathbb{R})$. By Proposition 2.19 (ii), condition $(F_{s,\eta})$ holds for $s = 1$ and $0 < \eta < 2$. Thus, Proposition 2.3 (ii) implies (B.2).
This finishes the proof of Proposition 2.17.
B.2 Proof of Lemma 5.5
This section is an adaptation of the proof of [ACP20, Proposition 30], which,
in turn, is a modification of [DPZ14, Theorem 5.25].
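For $s < s^*$ and $\alpha > 0$, let $Y_t^{(s,\alpha)} := \int_0^t (t - r)^{-\alpha} (-\Delta)^{s/2} e^{(t - r)\theta\Delta} B\,dW_r$. First, we prove:
Lemma B.3. For all $s < s^*$, $0 < \alpha < (s^* - s)/2$ and $p \geq 2$, we have a.s. $Y^{(s,\alpha)} \in L^p(0, T; L^p(D))$.
Proof. For $x \in D$, let $\delta_x$ be the point evaluation operator. We have for $x \in D$, $0 \leq t \leq T$, using that $B(-\Delta)^{\gamma}$ is a bounded operator on $L^2(D)$:
$$\mathbb{E}\left[ Y_t^{(s,\alpha)}(x)^2 \right] = \int_0^t r^{-2\alpha} \left\| \delta_x (-\Delta)^{s/2} e^{r\theta\Delta} B \right\|^2_{HS}\,dr$$
$$= \int_0^t r^{-2\alpha} \left\| B(-\Delta)^{\gamma} e^{r\theta\Delta} (-\Delta)^{s/2 - \gamma} \delta_x^* \right\|^2_{L^2(D)}\,dr \tag{B.3}$$
$$\lesssim \int_0^t r^{-2\alpha} \left\| \delta_x (-\Delta)^{s/2} e^{r\theta\Delta} (-\Delta)^{-\gamma} \right\|^2_{HS}\,dr,$$
so w.l.o.g. we restrict to the case $B = (-\Delta)^{-\gamma}$. In that case, together with $\sup_{k \in \mathbb{N}} \|\Phi_k\|_{L^{\infty}(D)} < \infty$, a calculation as in Lemma 2.7 gives
$$\mathbb{E}\left[ Y_t^{(s,\alpha)}(x)^2 \right] = \sum_{k=1}^{\infty} \lambda_k^{s - 2\gamma} \int_0^t r^{-2\alpha} e^{-2\theta\lambda_k r}\,dr\,\Phi_k(x)^2 \lesssim \sum_{k=1}^{\infty} k^{\frac{2}{d}(s - 2\gamma - 1 + 2\alpha)},$$
which is finite¹ for $\alpha < (s^* - s)/2$. As $Y^{(s,\alpha)}$ is Gaussian,
$$\sup_{0 \leq t \leq T,\,x \in D} \mathbb{E}\left[ \left| Y_t^{(s,\alpha)}(x) \right|^p \right] \lesssim \sup_{0 \leq t \leq T,\,x \in D} \mathbb{E}\left[ Y_t^{(s,\alpha)}(x)^2 \right]^{p/2} < \infty.$$
This leads to
$$\mathbb{E} \int_0^T \int_D \left| Y_t^{(s,\alpha)}(x) \right|^p dx\,dt \leq T |D| \sup_{0 \leq t \leq T,\,x \in D} \mathbb{E}\left[ \left| Y_t^{(s,\alpha)}(x) \right|^p \right] < \infty,$$
proving the claim.
Proof of Lemma 5.5. Using the factorization formula [DPZ14, Theorem 5.10], we obtain from Lemma B.3 together with [DPZ14, Proposition 5.9] that $(-\Delta)^{s/2} \bar X \in C(0, T; L^p(D))$ for all $s < s^*$ and $p \geq 2$ such that $1/p < (s^* - s)/2$. This means that $\bar X \in R_p(s)$ for all $p \geq 2$ and $s < s^* - 2/p$. As $p \geq 2$ is arbitrary, this finishes the proof.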
B.3 Proof of Proposition 6.6
The arguments are similar as in Appendix B.1, the main difference being the
new inhibitor component and the concentration dependent unstable zero in
the reaction polynomial. For d2, the proof can be found in [PFA+21].
¹ In particular, the terms involving the point evaluation $\delta_x$ in (B.3) are finite.
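We write $\mathbf{H}^s := H^s \times H^s$ for the regularity spaces describing both components². Similarly to Lemma B.1, we work in the Hilbert space triples $\mathbf{H}^1 \subset \mathbf{H}^0 \subset \mathbf{H}^{-1}$ and $\mathbf{H}^2 \subset \mathbf{H}^1 \subset \mathbf{H}^0$. Further, with $f(u, z) = k_1 u (u_0 - u)(u - u_0 a(z))$, we write $A_1(U, V) = D_U \Delta U + f(U, \|U\|_{L^2(D)}) - k_2 V$ and $A_2(U, V) = D_V \Delta V + \epsilon (bU - V)$ as well as $A(U, V) = (A_1(U, V), A_2(U, V))$. Similarly to (B.1), we have for $U \in L^2(D)$, $Y : D \to \mathbb{R}$ and $z \in \mathbb{R}$:
$$\langle \partial_u f(Y, z) U, U \rangle_{L^2(D)} \leq c \|U\|^2_{L^2(D)}, \tag{B.4}$$
because $a$ is a bounded function. As $B = \sigma (-\Delta)^{-\gamma}$ and $A_2$ is linear, it suffices to consider $A_1$ in order to show (H1), (H2′), (H3), (H4′) from [LR15]. For (H2′), we have to take into account the dependence of $f$ on the overall concentration: Using (B.4),
$${}_{H^{-1}}\langle A_1(U_1, V_1) - A_1(U_2, V_2), U_1 - U_2 \rangle_{H^1}$$
$$\leq \left\langle f(U_1, \|U_1\|_{L^2(D)}) - f(U_2, \|U_2\|_{L^2(D)}), U_1 - U_2 \right\rangle_{L^2(D)} + k_2 \|V_1 - V_2\|_{L^2(D)} \|U_1 - U_2\|_{L^2(D)}$$
$$= \left\langle \partial_u f(Y, \|U_1\|_{L^2(D)})(U_1 - U_2), U_1 - U_2 \right\rangle_{L^2(D)} + \left\langle \partial_z f(U_2, \tilde z)\left( \|U_1\|_{L^2(D)} - \|U_2\|_{L^2(D)} \right), U_1 - U_2 \right\rangle_{L^2(D)} + k_2 \|V_1 - V_2\|_{L^2(D)} \|U_1 - U_2\|_{L^2(D)}$$
$$\lesssim \|U_1 - U_2\|^2_{L^2(D)} + \|V_1 - V_2\|^2_{L^2(D)} + \|\partial_z f(U_2, \tilde z)\|_{L^2(D)} \left| \|U_1\|_{L^2(D)} - \|U_2\|_{L^2(D)} \right| \|U_1 - U_2\|_{L^2(D)}$$
$$\lesssim \left( 1 + \|\partial_z f(U_2, \tilde z)\|_{L^2(D)} \right) \|(U_1, V_1) - (U_2, V_2)\|^2_{L^2(D) \times L^2(D)}$$
for some $Y : D \to \mathbb{R}$ and $\tilde z \in \mathbb{R}$. Further, using that $\partial_z a$ is a bounded function as well as the Sobolev embedding $W^{1,2}(D) \subset L^4(D)$,
$$\|\partial_z f(U_2, \tilde z)\|_{L^2(D)} \lesssim \|U_2 (u_0 - U_2)\|_{L^2(D)} \lesssim \|U_2\|_{L^2(D)} + \|U_2\|^2_{L^4(D)} \lesssim \left( 1 + \|U_2\|_{H^1} \right)^2.$$
Therefore we can take $\rho(U, V) = c (1 + \|U\|_{H^1})^2$ for some $c > 0$ in the notation of [LR15], and (H2′) is satisfied.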
² Note that this is different from the notation in Section 2.6 as the regularity of the inhibitor component is taken into account.
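Now, (H1), (H3) and (H4′) work exactly as in Lemma B.1, with obvious modifications in notation due to the presence of the inhibitor component, taking into account that $a$ is continuous and bounded. As a consequence, we have the following analogue to Lemma B.1:
Lemma B.4. In the situation of Proposition 6.6, there is a unique solution $(U, V)$ to (6.14), (6.15) with $U, V \in R_E(1)$.
We can represent the inhibitor as $V_t = e^{t(D_V \Delta - \epsilon I)} V_0 - \epsilon b F_3(U)(t) = e^{t(D_V \Delta - \epsilon I)} V_0 + \epsilon b \int_0^t e^{(t - r)(D_V \Delta - \epsilon I)} U_r\,dr$. Note that $F_3$ satisfies $(F_{s,\eta})$ for every $s \in \mathbb{R}$ and $\eta < 4$: With $\varepsilon = 2 - \eta/2$,
$$\sup_{0 \leq t \leq T} \|F_3(U)(t)\|_{s + \eta + \varepsilon - 2} \leq \sup_{0 \leq t \leq T} \int_0^t \left\| e^{(t - r)(D_V \Delta - \epsilon I)} U_r \right\|_{s + 2 - \varepsilon} dr \lesssim \sup_{0 \leq t \leq T} \int_0^t (t - r)^{-1 + \varepsilon/2} \|U_r\|_s\,dr \leq \frac{2}{\varepsilon}\,T^{\varepsilon/2} \|U\|_{R(s)}.$$
Further, $\mathbb{E}[\|V_0\|^p_{s^* + 2}] < \infty$ for all $p \geq 2$. Consequently, for all $s < s^*$ and $\varepsilon > 0$, we have $V \in R_E(s + 2 - \varepsilon)$ whenever $U \in R_E(s)$. In particular, from Lemma B.4 it follows that $V \in R_E(3 - \varepsilon)$.
Now, exactly as in Lemma B.2 we see that there is some $s > d/2$ such that $U \in R_E(s)$, taking into account that $a$ is bounded and $V \in R_E(3 - \varepsilon)$ for $\varepsilon > 0$. Finally, it is clear that $U \mapsto f(U, \|U\|_{L^2(D)})$ satisfies $(F_{s,\eta})$ for $d/2 < s < s^*$ and $\eta < 2$, so the same is true for $F(U) = f(U, \|U\|_{L^2(D)}) - k_2 \left( e^{(\cdot)(D_V \Delta - \epsilon I)} V_0 - \epsilon b F_3(U) \right)$. Thus, an application of Proposition 2.4 finishes the proof of Proposition 6.6.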
List of Figures
2.1 Monte-Carlo simulation for the Allen-Cahn equation
6.1 Diffusivity estimation for simulated cell data
6.2 Effect of the noise intensity
6.3 Diffusivity estimation for a giant cell observation
6.4 Diffusivity estimation on a population of giant cells
6.5 Effective unstable zero of the reaction term
6.6 Effective diffusivity outside the cell
All of these plots appear in one of the works on which this dissertation is
based, namely [PS20] (Figure 2.1) and [PFA+21] (Figure 6.1 to 6.6).
Notation
Assumptions
(W)   well-posedness of the SPDE (p. 18)
$(F_{s,\eta})$   regularity bound on the nonlinearity $F$ (p. 21)
$(F^{v}_{s,\eta})$   bound on $F$ in variational spaces $R_v(s)$ (p. 23)
$(F^{p}_{s,\eta})$   $L^p$-regularity bound on the nonlinearity $F$ (p. 108)
$(F^{\mathrm{par}}_{s,\eta})$   regularity bound on parametrized nonlinearity $F$ (p. 119)
$(F^{\mathrm{sys}}_{s,\eta})$   analogue of $(F_{s,\eta})$ for partially observed systems (p. 47)
$(F^{J}_{s,\eta})$   bound for the integrated nonlinearity $JF$ (p. 59)
$(N^{\gamma}_{\eta})$   dispersion $B$ is asymptotically close to $\bar B = (-A)^{-\gamma}$ (p. 78)
$(D_0)$   $B_r$ is an algebra of continuous functions (p. 86)
$(D_1)$   growth bound on the norm of the eigenfunctions (p. 86)
$(D_2)$   error bound for the integral discretization error (p. 86)
$(D_2')$   trigonometric interpolation error (p. 101)
$(L_B)$   local control on the dispersion (p. 107)
$(L_K)$   shape of the kernel (p. 107)
$(L_\Psi)$   non-degeneracy within the local approach (p. 107)
$(I_\alpha)$   linear independence of the nonlinear components (p. 121)
Asymptotics
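$a_N \asymp b_N$   There is $C > 0$ such that $a_N / b_N \to C$ for $N \to \infty$.
$a_N \sim b_N$   $a_N / b_N \to 1$ for $N \to \infty$.
$a_N \lesssim b_N$   There is $C > 0$ such that $a_N \leq C b_N$.
$a_N \ll b_N$   $a_N = o(b_N)$, i.e. $a_N / b_N \to 0$ for $N \to \infty$.
$a_N \ll_p b_N$   There is $\epsilon > 0$ such that $a_N \lesssim b_N N^{-\epsilon}$ (polynomial negligibility).
Similar notation is used for different asymptotic parameters (i.e. $h$, $\delta$).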
Vector Spaces, Norms, Scalar Products
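A (fixed) norm on a Banach space $B$ is denoted by $\|\cdot\|_B$.
If $B$ is even a Hilbert space, the corresponding scalar product is $\langle \cdot, \cdot \rangle_B$.
Frequently used norms are abbreviated:
$H$   Hilbert space, typically $L^2(D)$ (state space)
$\|\cdot\|$, $\langle \cdot, \cdot \rangle$   norm and scalar product on $H$
$H^s$   $D((-A)^{s/2})$ (scale of regularity spaces)
$\|\cdot\|_s$, $\langle \cdot, \cdot \rangle_s$   norm and scalar product on $H^s$
$V$   $H^1 = D((-A)^{1/2})$ (energy space)
$R(s)$   $L^{\infty}(0, T; H^s)$
$R_E(s)$   $\bigcap_{p \geq 1} L^p(\Omega, R(s)) = \bigcap_{p \geq 1} L^p(\Omega, L^{\infty}(0, T; H^s))$ (locally convex space)
$R_v(s)$   $L^{\infty}(0, T; H^{s-1}) \cap L^2(0, T; H^s)$
$\|\cdot\|_{HS}$   Hilbert–Schmidt norm of an operator acting on $H$
$H^{s,p}(D)$   domain of $(-\Delta)^{s/2}$ in $L^p(D)$
$\|\cdot\|_{s,p}$   canonical norm on $H^{s,p}(D)$
$R_p(s)$   $L^{\infty}(0, T; H^{s,p}(D))$
$\|\cdot\|_{(h)}$, $\langle \cdot, \cdot \rangle_{(h)}$   Euclidean norm and scalar product on $\mathbb{R}^{M_h}$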
Estimators
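Temporally white noise (Chapter 2):   $\hat\theta_N^{\mathrm{full}}$, $\hat\theta_N^{\mathrm{part}}$, $\hat\theta_N^{\mathrm{lin}}$ (p. 27 f.)
Ornstein–Uhlenbeck noise (Section 3.1):   $\hat\theta_N^{\mathrm{ref}}$, $\hat\mu_N^{\mathrm{ref}}$ (p. 61); $\hat\theta_N^{\mathrm{sim}}$ (p. 62); $\hat\mu_N^{\mathrm{full}}(\vartheta)$, $\hat\mu_N^{\mathrm{lin}}(\vartheta)$ (p. 65 f.)
Integrated noise (Section 3.2):   $\hat\theta_N^{\mathrm{rescaled}}$ (p. 74)
Discrete observations (Chapter 4):   $\hat\theta_{h,N}^{\mathrm{discr}}$ (p. 87)
Local observations (Chapter 5):   $\hat\theta_{\delta,x_0}$ (p. 106); $\hat\theta_N^{\mathrm{mode}}$ (p. 114)
Joint parameter estimation (Section 6.1):   $\hat\theta^N = (\hat\theta_0^N, \dots, \hat\theta_K^N)$ (p. 119)
Activator–inhibitor model (Section 6.2):   $\hat\theta_0^{\mathrm{lin},N}$, $\hat\theta_0^{\mathrm{pol},N}$, $\hat\theta_0^{\mathrm{full},N}$ (p. 128); $\hat\theta_0^{2,N}$, $\hat\theta_0^{3,N}$, $\hat\theta_0^{4,N}$ (p. 128 f.); $\hat\theta_1^{2,N}$ (p. 137)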
Constants and Further Notation
Fourier decomposition of $A$:
$\lambda_N$   eigenvalue of $-A$
$\Phi_N$   eigenfunction of $A$
$P_N$   projection onto the span of $\Phi_1, \dots, \Phi_N$, defined on any $H^s$
Frequently used constants:
$\beta$   determined by $\lambda_N \asymp N^{\beta}$, usually $\beta = 2/d$
$\Lambda$   proportionality constant given by $\lambda_N \sim \Lambda N^{\beta}$, depends on $D$
$\gamma$   degree of spatial correlation in the noise
$s^*$   optimal regularity of the solution process, i.e. $X \in R(s)$ if and only if $s < s^*$ (usually $s^* = 1 + 2\gamma - \beta^{-1}$)
Further notation:
$J$   Bochner integral operator $f \mapsto Jf = \int_0^{\cdot} f(r)\,dr$
Bibliography
[AB88] S. I. Aihara and A. Bagchi, Parameter Identification for Stochas-
tic Diffusion Equations with Unknown Boundary Conditions,
Appl Math Optim 17 (1988), 15–36.
[ABJR21] Randolf Altmeyer, Till Bretschneider, Josef Janák, and Markus
Reiß, Parameter Estimation in an SPDE Model for Cell Repo-
larisation, arXiv:2010.06340v2 [math.ST] (2021), preprint.
[ACP20] Randolf Altmeyer, Igor Cialenco, and Gregor Pasemann, Param-
eter estimation for semilinear SPDEs from local measurements,
arXiv:2004.14728v2 [math.ST] (2020), preprint.
[AF92] David R. Adams and Michael Frazier, Composition operators on
potential spaces, Proc. Amer. Math. Soc. 114 (1992), no. 1, 155–
165.
[AF03] Robert A. Adams and John J. F. Fournier, Sobolev spaces, second
ed., Pure and Applied Mathematics, vol. 140, Elsevier/Academic
Press, 2003.
[Aih92] S. I. Aihara, Regularized Maximum Likelihood Estimate for an
Infinite-dimensional Parameter in Stochastic Parabolic Systems,
SIAM J. Control and Optimization 30 (1992), no. 4, 745–764.
[Aih98a] S. I. Aihara, Consistency property of extended least-squares parameter
estimation for stochastic diffusion equation, Systems & Control
Letters 34 (1998), 249–256.
[Aih98b] S. I. Aihara, Identification of a Discontinuous Parameter in Stochastic
Parabolic Systems, Appl Math Optim 37 (1998), 43–69.
[AR21] Randolf Altmeyer and Markus Reiß, Nonparametric Estimation
for Linear SPDEs from Local Measurements, Ann. Appl. Probab.
31 (2021), no. 1, 1–38.
[AS88] S. I. Aihara and Y. Sunahara, Identification of an Infinite-
dimensional Parameter for Stochastic Diffusion Equations,
SIAM J. Control and Optimization 26 (1988), no. 5, 1062–1075.
[ASB18] Sergio Alonso, Maike Stange, and Carsten Beta, Modeling ran-
dom crawling, membrane deformation and intracellular polarity
of motile amoeboid cells, PLOS ONE 13 (2018), no. 8, 1–22.
[AT17] Anthony P. Austin and Lloyd N. Trefethen, Trigonometric Inter-
polation and Quadrature in Perturbed Points, SIAM J. Numer.
Anal. 55 (2017), no. 5, 2113–2122.
[Aus16] Anthony P. Austin, Some New Results on and Applications of
Interpolation in Numerical Computation, University of Oxford,
2016, DPhil thesis.
[BB84] A. Bagchi and V. Borkar, Parameter identification in infinite
dimensional linear systems, Stochastics 12 (1984), 201–213.
[BBPSP14] L. Blanchoin, R. Boujemaa-Paterski, C. Sykes, and J. Plastino,
Actin Dynamics, Architecture, and Mechanics in Cell Motility,
Physiol Rev 94 (2014), no. 1, 235–263.
[BC09] A. Bain and D. Crisan, Fundamentals of Stochastic Filtering,
Stochastic Modelling and Applied Probability, vol. 60, Springer
Science+Business Media, LLC, 2009.
[Bis99] J. P. N. Bishwal, Bayes and Sequential Estimation in Hilbert
Space Valued Stochastic Differential Equations, Journal of the
Korean Statistical Society 28 (1999), no. 1, 93–106.
[Bis02] , The Bernstein-von Mises Theorem and Spectral Asymp-
totics of Bayes Estimators for Parabolic SPDEs, J. Austral.
Math. Soc. 72 (2002), 287–298.
[BL76] J. Bergh and J. Löfström, Interpolation Spaces (An Introduc-
tion), Grundlehren der mathematischen Wissenschaften, vol.
223, Springer-Verlag Berlin Heidelberg, 1976.
155
[BM18] Haïm Brezis and Petru Mironescu, Gagliardo–Nirenberg inequal-
ities and non-inequalities: The full story, Ann. Inst. H. Poincaré
Anal. Non Linéaire 35 (2018), no. 5, 1355–1376.
[BS17] Matthew D. Blair and Christopher D. Sogge, Refined and Mi-
crolocal Kakeya-Nikodym Bounds of Eigenfunctions in Higher
Dimensions, Comm. Math. Phys. 356 (2017), no. 2, 501–533.
[BST80] A. Bagchi, R. C. W. Strijbos, and G. Thé, Identification of a
distributed-parameter system with boundary noise, Int. J. Sys-
tems Sci. 11 (1980), no. 1, 49–56.
[BT19] Markus Bibinger and Mathias Trabs, On Central Limit Theo-
rems for Power Variations of the Solution to the Stochastic Heat
Equation, Stochastic Models, Statistics and Their Applications
(A. Steland, E. Rafajłowicz, and O. Okhrin, eds.), Springer Pro-
ceedings in Mathematics & Statistics, vol. 294, Springer, Cham,
2019, pp. 69–84.
[BT20] , Volatility estimation for stochastic PDEs using high-
frequency observations, Stochastic Process. Appl. 130 (2020),
no. 5, 3005–3052.
[CA77] J. W. Cahn and S. M. Allen, A Microscopic Theory for Domain
Wall Motion and its Experimental Verification in Fe-Al Alloy
Domain Growth Kinetics, J. Phys. Colloques 38 (1977), no. C7,
C7–51 C7–54.
[CCG20] Ziteng Cheng, Igor Cialenco, and Ruoting Gong, Bayesian esti-
mations for diagonalizable bilinear SPDEs, Stochastic Process.
Appl. 130 (2020), no. 2, 845–877.
[CD20] Carsten Chong and Robert C. Dalang, Power Variations in Frac-
tional Sobolev Spaces for a Class of Parabolic Stochastic PDEs,
arXiv:2006.15817v1 [math.PR] (2020), preprint.
[CDVK20] Igor Cialenco, Francisco Delgado-Vences, and Hyun-Jung Kim,
Drift estimation for discretely sampled SPDEs, Stoch PDE: Anal
Comp 8(2020), no. 4, 895–920.
156
[Cer01] Sandra Cerrai, Second order PDE’s in finite and infinite dimen-
sion, Lecture Notes in Mathematics, vol. 1762, Springer-Verlag,
Berlin, 2001.
[CGH11] Igor Cialenco and Nathan Glatt-Holtz, Parameter estimation for
the stochastically perturbed Navier-Stokes equations, Stochastic
Process. Appl. 121 (2011), no. 4, 701–724.
[CGH18] Igor Cialenco, Ruoting Gong, and Yicong Huang, Trajectory fit-
ting estimators for SPDEs driven by additive noise, Stat Infer-
ence Stoch Process 21 (2018), no. 1, 1–19.
[CH20] Igor Cialenco and Yicong Huang, A note on parameter estima-
tion for discretely sampled SPDEs, Stochastics and Dynamics
20 (2020), no. 3, 2050016.
[Che93] Xia Chen, On the law of the iterated logarithm for independent
Banach space valued random variables, The Annals of Probabil-
ity 21 (1993), no. 4, 1991–2011.
[Cho19] Carsten Chong, High-frequency analysis of parabolic stochas-
tic PDEs with multiplicative noise: Part I, arXiv:1908.04145v1
[math.PR] (2019), preprint.
[Cho20] , High-frequency analysis of parabolic stochastic PDEs,
The Annals of Statistics 48 (2020), no. 2, 1143 1167.
[CHQZ88] C. Canuto, M. Y. Hussaini, A. Quarteroni, and T. A. Zang,
Spectral Methods in Fluid Dynamics, Springer Series in Compu-
tational Physics, Springer-Verlag Berlin Heidelberg, 1988.
[Cia02] Philippe G. Ciarlet, The Finite Element Method for Elliptic
Problems, Classics in Applied Mathematics, vol. 40, Society for
Industrial and Applied Mathematics, 2002, reprint of the 1978
edition.
[Cia10] Igor Cialenco, Parameter estimation for SPDEs with multiplica-
tive fractional noise, Stoch. Dyn. 10 (2010), no. 4, 561–576.
[Cia18] , Statistical inference for SPDEs: an overview, Stat In-
ference Stoch Process 21 (2018), no. 2, 309–329.
157
[CK22] Igor Cialenco and Hyun-Jung Kim, Parameter estimation for
discretely sampled stochastic heat equation driven by space-only
noise, Stochastic Process. Appl. 143 (2022), 1–30.
[CKL20] Igor Cialenco, Hyun-Jung Kim, and Sergey V. Lototsky, Statis-
tical analysis of some evolution equations driven by space-only
noise, Stat Inference Stoch Process 23 (2020), no. 1, 83–103.
[CKP21] Igor Cialenco, Hyun-Jung Kim, and Gregor Pasemann, Statisti-
cal analysis of discretely sampled semilinear SPDEs: a power
variation approach, arXiv:2103.04211v1 [math.PR] (2021),
preprint.
[CL09] Igor Cialenco and Sergey V. Lototsky, Parameter estimation in
diagonalizable bilinear stochastic parabolic equations, Stat Infer-
ence Stoch Process 12 (2009), no. 3, 203–219.
[CLP09] Igor Cialenco, Sergey V. Lototsky, and Jan Pospíšil, Asymp-
totic properties of the maximum likelihood estimator for stochas-
tic parabolic equations with additive fractional Brownian motion,
Stoch. Dyn. 9(2009), no. 2, 169–185.
[CX14] Igor Cialenco and Liaosha Xu, A note on error estimation for
hypothesis testing problems for some linear SPDEs, Stoch PDE:
Anal Comp 2(2014), no. 3, 408–431.
[CX15] , Hypothesis testing for stochastic PDEs driven by addi-
tive noise, Stochastic Process. Appl. 125 (2015), no. 3, 819–866.
[DMPD00] T. E. Duncan, B. Maslowski, and B. Pasik-Duncan, Adaptive
Control for Semilinear Stochastic Systems, SIAM J. Control Op-
tim. 38 (2000), no. 6, 1683–1706.
[DPDT94] Giuseppe Da Prato, Arnaud Debussche, and Roger Temam,
Stochastic Burgers’ equation, NoDEA 1(1994), no. 4, 389–402.
[DPZ14] Giuseppe Da Prato and Jerzy Zabczyk, Stochastic Equations in
Infinite Dimensions, second ed., Encyclopedia of Mathematics
and its Applications, vol. 152, Cambridge University Press, 2014.
158
[EN00] Klaus-Jochen Engel and Rainer Nagel, One-Parameter Semi-
groups for Linear Evolution Equations, Graduate Texts in Math-
ematics, vol. 194, Springer-Verlag New York, 2000.
[FFAB20] Sven Flemming, Francesc Font, Sergio Alonso, and Carsten Beta,
How cortical waves drive fission of motile cells, Proceedings of
the National Academy of Sciences 117 (2020), no. 12, 6330–6338.
[Fit61] R. FitzHugh, Impulses and Physiological States in Theoretical
Models of Nerve Membrane, Biophys. J. 1(1961), 445–466.
[GEW+14] M. Gerhardt, M. Ecke, M. Walz, A. Stengl, C. Beta, and
G. Gerisch, Actin and PIP3 waves in giant cells reveal the inher-
ent length scale of an excited state, Journal of Cell Science 127
(2014), no. 20, 4507–4517.
[GM02] B. Goldys and B. Maslowski, Parameter Estimation for Con-
trolled Semilinear Stochastic Systems: Identifiability and Con-
sistency, Journal of Multivariate Analysis 80 (2002), 322–343.
[GN15] E. Giné and R. Nickl, Mathematical Foundations of Infinite-
Dimensional Statistical Models, Cambridge Series in Statisti-
cal and Probabilistic Mathematics, Cambridge University Press,
2015.
[Gri92] D. Grieser, Lpbounds for eigenfunctions and spectral projections
of the Laplacian near concave boundaries, ProQuest LLC, Ann
Arbor, MI, 1992, Ph.D. thesis, University of California, Los An-
geles.
[Gri02] , Uniform bounds for eigenfunctions of the Laplacian on
manifolds with boundary, Commun. in Partial Differential Equa-
tions 27 (2002), no. 7-8, 1283–1299.
[GT01] David Gilbarg and Neil S. Trudinger, Elliptic Partial Differential
Equations of Second Order, Classics in Mathematics, Springer-
Verlag Berlin Heidelberg, 2001, reprint of the 1998 edition.
159
[HKR93] M. Huebner, R. Khasminskii, and B. L. Rozovskii, Two Exam-
ples of Parameter Estimation for Stochastic Partial Differen-
tial Equations, Stochastic Processes (S. Cambanis, J. K. Ghosh,
R. L. Karandikar, and P. K. Sen, eds.), Springer-Verlag New
York, 1993, pp. 149–160.
[HL84] W. Horsthemke and R. Lefever, Noise-Induced Transitions (The-
ory and Applications in Physics, Chemistry, and Biology),
Springer Series in Synergetics, Springer-Verlag Berlin Heidel-
berg, 1984.
[HL00a] M. Huebner and S. Lototsky, Asymptotic analysis of a kernel
estimator for parabolic SPDE’s with time-dependent coefficients,
Ann. Appl. Probab. 10 (2000), no. 4, 1246–1258.
[HL00b] , Asymptotic Analysis of the Sieve Estimator for a Class
of Parabolic SPDEs, Scand J Statist 27 (2000), no. 2, 353–370.
[HLR97] M. Huebner, S. Lototsky, and B. L. Rozovskii, Asymptotic Prop-
erties of an Approximate Maximum Likelihood Estimator for
Stochastic PDEs, Statistics and Control of Stochastic Processes
(Yu. M. Kabanov, B. L. Rozovskii, and A. N. Shiryaev, eds.),
World Scientific, 1997, pp. 139–155.
[HR95] M. Huebner and B. L. Rozovskii, On asymptotic properties of
maximum likelihood estimators for parabolic stochastic PDE’s,
Probab. Theory Relat. Fields 103 (1995), no. 2, 143–163.
[HT21a] Florian Hildebrandt and Mathias Trabs, Nonparametric calibra-
tion for stochastic reaction-diffusion equations based on discrete
observations, arXiv:2102.13415v1 [math.ST] (2021), preprint.
[HT21b] , Parameter estimation for SPDEs based on discrete ob-
servations in time and space, Electronic Journal of Statistics 15
(2021), no. 1, 2716–2776.
[Hue93] M. Huebner, Parameter estimation for stochastic differential
equations, ProQuest LLC, Ann Arbor, MI, 1993, Ph.D. thesis,
University of Southern California.
160
[Hue99] , Asymptotic Properties of the Maximum Likelihood Esti-
mator for Stochastic PDEs Disturbed by Small Noise, Statistical
Inference for Stochastic Processes 2(1999), no. 1, 57–68.
[Hui14] Jiang Hui, Moderate deviation for parameter estimator in the
stochastic parabolic equations with additive fractional Brownian
motion, Stochastics and Dynamics 14 (2014), no. 3, 1450002.
[IH81] I. A. Ibragimov and R. Z. Has’minski˘ı, Statistical Estimation
(Asymptotic Theory), Applications of Mathematics, vol. 16,
Springer-Verlag, New York-Berlin, 1981.
[IK99] I. A. Ibragimov and R. Z. Khas’minski˘ı, Estimation Problems for
Coefficients of Stochastic Partial Differential Equations. Part I,
Theory Probab. Appl. 43 (1999), no. 3, 370–387.
[IK00] , Estimation Problems for Coefficients of Stochastic Par-
tial Differential Equations. Part II, Theory Probab. Appl. 44
(2000), no. 3, 469–494.
[IK01] , Estimation Problems for Coefficients of Stochastic Par-
tial Differential Equations. Part III, Theory Probab. Appl. 45
(2001), no. 2, 210–232.
[Jan97] Svante Janson, Gaussian Hilbert Spaces, Cambridge Tracts in
Mathematics, Cambridge University Press, 1997.
[Jan20] Josef Janák, Parameter estimation for stochastic partial differ-
ential equations of second order, Appl. Math. Optim. 82 (2020),
353–397.
[Jan21] , Parameter Estimation for Stochastic Wave Equation
Based on Observation Window, Acta Appl. Math. 172 (2021),
no. 2, 37 pages.
[JS03] Jean Jacod and Albert N. Shiryaev, Limit theorems for stochas-
tic processes, second ed., Grundlehren der Mathematischen Wis-
senschaften, vol. 288, Springer-Verlag, Berlin, 2003.
[KL85] T. Koski and W. Loges, Asymptotic Statistical Inference for a
Stochastic Heat Flow Problem, Statistics & Probability Letters
3(1985), 185–189.
161
[KL86] , On minimum-contrast estimation for Hilbert space-
valued stochastic differential equations, Stochastics 16 (1986),
no. 3-4, 217–225.
[KLBR00] M. L. Kleptsyna, A. Le Breton, and M.-C. Roubaud, Parameter
Estimation and Optimal Filtering for Fractional Type Stochastic
Systems, Stat. Inference Stoch. Process. 3(2000), no. 1-2, 173–
182.
[KM19] P. Kříž and B. Maslowski, Central limit theorems and minimum-
contrast estimators for linear stochastic evolution equations,
Stochastics 91 (2019), 1109–1140.
[KO79] Heinz-Otto Kreiss and Joseph Oliger, Stability of the Fourier
method, SIAM J. Numer. Anal. 16 (1979), no. 3, 421–433.
[Kří20] P. Kříž, A space-consistent version of the minimum-contrast es-
timator for linear stochastic evolution equations, Stochastics and
Dynamics 20 (2020), no. 3, 2050019.
[Kry96] N. V. Krylov, On Lp-Theory of Stochastic Partial Differential
Equations in the Whole Space, SIAM Journal on Mathematical
Analysis 27 (1996), no. 2, 313–340.
[KU21a] Yusuke Kaino and Masayuki Uchida, Adaptive estimator for a
parabolic linear SPDE with a small noise, Jpn J Stat Data Sci
(2021), 29 pages.
[KU21b] Yusuke Kaino and Masayuki Uchida, Parametric estimation for
a parabolic linear SPDE model based on discrete observations, J.
Statist. Plann. Inference 211 (2021), 190–220.
[KUP91] P. Kumar, T. E. Unny, and K. Ponnambalam, Stochastic partial
differential equations in groundwater hydrology (Part 2: Appli-
cation to Borden aquifer), Stochastic Hydrol. Hydraul. 5(1991),
239–251.
[Kut04] Yury A. Kutoyants, Statistical inference for ergodic diffusion
processes, Springer Series in Statistics, Springer-Verlag London
Ltd., 2004.
162
[LL10a] W. Liu and S. V. Lototsky, Estimating speed and damping in the
stochastic wave equation, Stochastic partial differential equations
and applications, Quad. Mat., vol. 25, Dept. Math., Seconda
Univ. Napoli, Caserta, 2010, pp. 191–206.
[LL10b] , Parameter estimation in hyperbolic multichannel mod-
els, Asymptot. Anal. 68 (2010), no. 4, 223–248.
[LM72] J.-L. Lions and E. Magenes, Non-homogeneous boundary value
problems and applications. Vol. I, Die Grundlehren der mathe-
matischen Wissenschaften, vol. 181, Springer-Verlag, New York-
Heidelberg, 1972.
[Log84] Wilfried Loges, Girsanov’s theorem in Hilbert space and an ap-
plication to the statistics of Hilbert space-valued stochastic dif-
ferential equations, Stochastic Process. Appl. 17 (1984), no. 2,
243–263.
[Lot96] S. V. Lototsky, Problems in statistics of stochastic differential
equations, ProQuest LLC, Ann Arbor, MI, 1996, Ph.D. thesis,
University of Southern California.
[Lot03] , Parameter estimation for stochastic parabolic equations:
asymptotic properties of a two-dimensional projection-based es-
timator, Stat. Inference Stoch. Process. 6(2003), no. 1, 65–87.
[Lot04] , Optimal filtering of stochastic parabolic equations, Re-
cent developments in stochastic analysis and related topics
(S. Albeverio, Z.-M. Ma, and M. ckner, eds.), World Scien-
tific, 2004, pp. 330–353.
[Lot09] , Statistical inference for stochastic parabolic equations:
a spectral approach, Publ. Mat. 53 (2009), no. 1, 3–45.
[LPS14] G. J. Lord, C. E. Powell, and T. Shardlow, An Introduction to
Computational Stochastic PDEs, Cambridge Texts in Applied
Mathematics, Cambridge University Press, 2014.
[LR99] S. V. Lototsky and B. L. Rosovskii, Spectral asymptotics of some
functionals arising in statistical inference for SPDEs, Stochastic
Process. Appl. 79 (1999), no. 1, 69–94.
163
[LR00] , Parameter Estimation for Stochastic Evolution Equa-
tions with Non-commuting Operators, Skorokhod’s Ideas in
Probability Theory (V. Korolyuk, N. Portenko, and H. Syta,
eds.), Institute of Mathematics of the National Academy of Sci-
ences of Ukraine, Kiev, 2000, pp. 271–280.
[LR15] Wei Liu and Michael ckner, Stochastic Partial Differen-
tial Equations: An Introduction, Universitext, Springer, Cham,
2015.
[LS77] R. S. Liptser and A. N. Shiryayev, Statistics of Random Pro-
cesses. I (General Theory), Applications of Mathematics, vol. 5,
Springer-Verlag New York, 1977.
[LS89] , Theory of Martingales, Mathematics and its Applica-
tions (Soviet Series), vol. 49, Kluwer Academic Publishers, Dor-
drecht, 1989.
[LS01] , Statistics of Random Processes. II (Applications), 2nd
ed., Applications of Mathematics, vol. 6, Springer-Verlag Berlin
Heidelberg, 2001.
[Lun95] Alessandra Lunardi, Analytic Semigroups and Optimal Reg-
ularity in Parabolic Problems, Modern Birkhäuser Classics,
Birkhäuser/Springer Basel, 1995.
[Mar03] Bo Markussen, Likelihood inference for a discretely observed
stochastic partial differential equation, Bernoulli 9(2003), no. 5,
745–762.
[MFF+20] Eduardo Moreno, Sven Flemming, Francesc Font, Matthias
Holschneider, Carsten Beta, and Sergio Alonso, Modeling cell
crawling strategies with a bistable model: From amoeboid to
fan-shaped cell motion, Physica D: Nonlinear Phenomena 412
(2020), 132591.
[Mis08] Yuliya S. Mishura, Stochastic calculus for fractional Brownian
motion and related processes, Lecture Notes in Mathematics, vol.
1929, Springer-Verlag Berlin Heidelberg, 2008.
164
[MKT19a] Z. Mahdi Khalil and C. A. Tudor, Estimation of the drift parame-
ter for the fractional stochastic heat equation via power variation,
Mod. Stoch. Theory Appl. 6(2019), no. 4, 397–417.
[MKT19b] , On the distribution and q-variation of the solution to the
heat equation with fractional Laplacian, Probab. Math. Statist.
39 (2019), no. 2, 315–335.
[Moh94] J. Mohapl, Maximum Likelihood Estimation in Linear Infinite
Dimensional Models, Stochastic Models 10 (1994), 781–794.
[Moh97] , On Estimation in the Planar Ornstein-Uhlenbeck Pro-
cess, Stochastic Models 13 (1997), 435–455.
[Moh00] , A Stochastic Advection-Diffusion Model for the Rocky
Flats Soil Plutonium Data, Ann. Inst. Statist. Math. 52 (2000),
no. 1, 84–107.
[MP07] B. Maslowski and J. Pospíšil, Parameter Estimates for Linear
Partial Differential Equations with Fractional Boundary Noise,
Communications in Information and Systems 7(2007), no. 1,
1–20.
[MP08] , Ergodicity and Parameter Estimates for Infinite-
Dimensional Fractional Ornstein-Uhlenbeck Process, Appl Math
Optim 57 (2008), 401–429.
[MT13] B. Maslowski and C. A. Tudor, Drift parameter estimation for
infinite-dimensional fractional Ornstein–Uhlenbeck process, Bull.
Sci. math. 137 (2013), 880–901.
[NAY62] A. Nagumo, S. Arimoto, and S Yoshizawa, An Active Pulse
Transmission Line Simulating Nerve Axon, Proc. IRE 50 (1962),
no. 10, 2061–2070.
[NVV99] Ilkka Norros, Esko Valkeila, and Jorma Virtamo, An elementary
approach to a Girsanov formula and other analytical results on
fractional Brownian motions, Bernoulli 5(1999), no. 4, 571–587.
[Ouv78] Jean-Yves Ouvrard, Martingale Projection and Linear Filtering
in Hilbert Spaces. I: The Theory, SIAM J. Control and Opti-
mization 16 (1978), no. 6, 912–937.
165
[Pas80] Joseph E. Pasciak, Spectral and pseudospectral methods for ad-
vection equations, Math. Comp. 35 (1980), no. 152, 1081–1092.
[Paz83] A. Pazy, Semigroups of Linear Operators and Applications to
Partial Differential Equations, Applied Mathematical Sciences,
vol. 44, Springer-Verlag New York, 1983.
[Pes95] Szymon Peszat, Existence and uniqueness of the solution for
stochastic equations on Banach spaces, Stochastics and Stochas-
tics Reports 55 (1995), no. 3-4, 167–193.
[PFA+21] Gregor Pasemann, Sven Flemming, Sergio Alonso, Carsten Beta,
and Wilhelm Stannat, Diffusivity Estimation for Activator–
Inhibitor Models: Theory and Application to Intracellular Dy-
namics of the Actin Cytoskeleton, Journal of Nonlinear Science
31 (2021), no. 59, 1–34.
[PR96] L. Piterbarg and B. Rozovskii, Maximum likelihood estimators
in the equations of physical oceanography, Stochastic Modelling
in Physical Oceanography (R. J. Adler, P. Müller, and B. L.
Rozovskii, eds.), Progress in Probability, vol. 39, Birkhäuser
Boston, 1996, pp. 397–421.
[PR97] , On asymptotic problems of parameter estimation in
stochastic PDE’s: discrete time sampling, Math. Methods
Statist. 6(1997), no. 2, 200–223.
[PR00] B. L. S. Prakasa Rao, Bayes estimation for some stochastic par-
tial differential equations, Journal of Statistical Planning and
Inference 91 (2000), no. 2, 511–524.
[PR02] , Nonparametric Inference for a Class of Stochastic
Partial Differential Equations Based on Discrete Observations,
Sankha: The Indian Journal of Statistics, Series A 64 (2002),
no. 1, 1–15.
[PS20] Gregor Pasemann and Wilhelm Stannat, Drift estimation
for stochastic reaction-diffusion systems, Electronic Journal of
Statistics 14 (2020), no. 1, 547 579.
166
[PT07] Jan Pospíšil and Roger Tribe, Parameter estimates and exact
variations for stochastic heat equations driven by space-time
white noise, Stoch. Anal. Appl. 25 (2007), no. 3, 593–611.
[QSS00] Alfio Quarteroni, Riccardo Sacco, and Fausto Saleri, Numerical
Mathematics, Texts in Applied Mathematics, vol. 37, Springer-
Verlag New York, 2000.
[RR20] S. Reich and P. J. Rozdeba, Posterior contraction rates for non-
parametric state and drift estimation, Foundations of Data Sci-
ence 2(2020), no. 3, 333–349.
[Sch72] F. Schlögl, Chemical Reaction Models for Non-Equilibrium
Phase Transitions, Z. Physik 253 (1972), no. 2, 147–161.
[Sha71] H. S. Shapiro, Topics in Approximation Theory, Lecture Notes in
Mathematics, vol. 187, Springer-Verlag Berlin Heidelberg, 1971.
[Shi96] A. N. Shiryaev, Probability, second ed., Graduate Texts in Math-
ematics, vol. 95, Springer-Verlag, 1996.
[Shu01] M. A. Shubin, Pseudodifferential Operators and Spectral Theory,
second ed., Springer-Verlag Berlin Heidelberg, 2001.
[Sog15] Christopher D. Sogge, Problems related to the concentration of
eigenfunctions, Journées EDP (2015), no. IX, 11 pages.
[SS07] Hart F. Smith and Christopher D. Sogge, On the Lpnorm
of spectral clusters for compact manifolds with boundary, Acta
Math. 198 (2007), no. 1, 107–153.
[SS15] Martin Sauer and Wilhelm Stannat, Lattice approximation for
stochastic reaction diffusion equations with one-sided Lipschitz
condition, Math. Comp. 84 (2015), no. 292, 743–766.
[SS16] , Analysis and approximation of stochastic nerve axon
equations, Math. Comp. 85 (2016), no. 301, 2457–2481.
[SST20] Radomyra Shevchenko, Meryem Slaoui, and C. A. Tudor, Gener-
alized k-variations and Hurst parameter estimation for the frac-
tional wave equation via Malliavin calculus, Journal of Statistical
Planning and Inference 207 (2020), 155–180.
167
[SV02] Jukka Saranen and Gennadi Vainikko, Periodic Integral and
Pseudodifferential Equations with Numerical Approximation,
Springer Monographs in Mathematics, Springer-Verlag Berlin
Heidelberg, 2002.
[Tho06] Vidar Thomée, Galerkin Finite Element Methods for Parabolic
Problems, second ed., Springer Series in Computational Mathe-
matics, vol. 25, Springer-Verlag Berlin Heidelberg, 2006.
[Tri10a] Hans Triebel, Theory of Function Spaces, Modern Birkhäuser
Classics, Birkhäuser Verlag/Springer Basel AG, 2010, reprint of
the 1983 edition.
[Tri10b] , Theory of Function Spaces. II, Modern Birkhäuser Clas-
sics, Birkhäuser Verlag/Springer Basel AG, 2010, reprint of the
1992 edition.
[TTV14] S. Torres, C. A. Tudor, and F. G. Viens, Quadratic variations
for the fractional-colored stochastic heat equation, Electron. J.
Probab. 19 (2014), no. 76, 1–51.
[Tud13] C. A. Tudor, Analysis of Variations for Self-similar Processes (A
Stochastic Calculus Approach), Probability and Its Applications,
Springer International Publishing Switzerland, 2013.
[TV07] C. A. Tudor and F. G. Viens, Statistical aspects of the fractional
stochastic calculus, Ann. Statist. 35 (2007), no. 3, 1183–1212.
[Unn89] T. E. Unny, Stochastic partial differential equations in ground-
water hydrology (Part I: Theory), Stochastic Hydrol. Hydraul. 3
(1989), 135–153.
[vdBHV15] Michiel van den Berg, Rainer Hempel, and Jürgen Voigt, L1-
estimates for eigenfunctions of the Dirichlet Laplacian, J. Spectr.
Theory 5(2015), no. 4, 829–857.
[vdV98] A. W. van der Vaart, Asymptotic Statistics, Cambridge Series
in Statistical and Probabilistic Mathematics, vol. 3, Cambridge
University Press, 1998.
168
[vNVW12] J. van Neerven, M. Veraar, and L. Weis, Maximal Lp-Regularity
for Stochastic Evolution Equations, SIAM J. Math. Anal. 44
(2012), no. 3, 1372–1414.
[Vog15] Hendrik Vogt, L1-estimates for eigenfunctions and heat kernel
estimates for semigroups dominated by the free heat semigroup,
J. Evol. Equ. 15 (2015), no. 4, 879–893.
[WD01] Y. Wei and J. Ding, Representations for Moore-Penrose Inverses
in Hilbert Spaces, Applied Mathematics Letters 14 (2001), 599–
604.
[Wey11] H. Weyl, Über die asymptotische Verteilung der Eigenwerte,
Nachrichten von der Gesellschaft der Wissenschaften zu Göt-
tingen, Mathematisch-Physikalische Klasse (1911), 110–117.
[Wit85] Rainer Wittmann, A General Law of Iterated Logarithm, Z.
Wahrscheinlichkeitstheorie verw. Gebiete 68 (1985), no. 4, 521–
543.
[Wit87] , Sufficient Moment and Truncated Moment Conditions
for the Law of the Iterated Logarithm, Probab. Th. Rel. Fields
75 (1987), no. 4, 509–530.
[Yag10] Atsushi Yagi, Abstract Parabolic Evolution Equations and their
Applications, Springer Monographs in Mathematics, Springer-
Verlag Berlin Heidelberg, 2010.
[Yin93] Z. Ying, Maximum Likelihood Estimation of Parameters under
a Spatial Sampling Scheme, The Annals of Statistics 21 (1993),
no. 3, 1567–1590.
[Zlá73] Miloš Zlámal, Curved elements in the finite element method. I,
SIAM J. Numer. Anal. 10 (1973), no. 1, 229–240.
169