Parameter Estimation for Semilinear Stochastic Partial Differential Equations

submitted by
M. Sc. Gregor Pasemann
ORCID: 0000-0002-8107-6685

to Faculty II – Mathematics and Natural Sciences
of Technische Universität Berlin
in fulfillment of the requirements for the academic degree
Doctor of Natural Sciences
Dr. rer. nat.

approved dissertation

Doctoral committee:
Chair: Prof. Dr. Tobias Breiten
Reviewer: Prof. Dr. Wilhelm Stannat
Reviewer: Prof. Dr. Annika Lang

Date of the scientific defense: 27.08.2021

Berlin 2021

Abstract

The problem of parametric drift estimation for semilinear stochastic partial differential equations (SPDE) is considered based on a maximum-likelihood approach. The diffusivity of such models is estimated in finite time based on a single trajectory with high resolution in space. This is implemented by observing either a large number of Fourier modes (spectral approach), a large number of spatial point evaluations of the process (discretized spectral approach) or a convolution with a kernel of small diameter (local approach). Asymptotic properties of different estimators within these observation schemes are discussed, based on a spatial regularity analysis of the solution to the underlying SPDE. Examples of the general theory include reaction-diffusion equations, Burgers equation and equations of Cahn–Hilliard type. Special emphasis is put on the issue of model misspecification, with respect to either the drift or the driving noise.

The theoretical results are supported by a numerical simulation.

As an extension, the case of simultaneous diffusivity and reaction parameter estimation from spectral observations is treated in the context of stochastic activator-inhibitor models. This is applied to experimental observations of the actin marker concentration within Dictyostelium discoideum giant cells, whose spatiotemporal dynamics is described as a stochastic FitzHugh–Nagumo system. The performance of different estimators is compared on synthetic data from numerical simulation as well as real data.

Summary

This thesis deals with parametric drift estimation for semilinear stochastic partial differential equations (SPDE) on the basis of a maximum-likelihood approach. The diffusivity of such models is estimated in finite time from the observation of a single path with high spatial resolution. This high spatial resolution is formalized by a large number of eigenfrequencies (spectral approach), a large number of observed point evaluations (discretized spectral approach) or a convolution with a kernel of small diameter (local approach). For different estimators, the asymptotic properties within these observation models are analyzed. The basis for this is a precise determination of the spatial regularity of the solution to the underlying SPDE. Examples of the general theory are reaction-diffusion equations, the Burgers equation and equations of Cahn–Hilliard type. Furthermore, misspecifications of the underlying model are treated, with respect to both the drift term and the stochastic term.

The theory is supported by numerical simulations.

As an extension of the preceding theory, the simultaneous estimation of diffusion and reaction parameters in the spectral approach is considered in the context of stochastic activator-inhibitor models. This is applied to experimental observation data of the actin marker concentration in Dictyostelium discoideum cells, where a description as a stochastic FitzHugh–Nagumo system is assumed. The results of different estimators are compared for simulations and experimental data.

Acknowledgments

I am grateful to numerous people, with whom I discussed and from whom I learned during the last years. Especially, I would like to thank

• my PhD supervisor Wilhelm Stannat for the support and guidance through my doctorate,
• Annika Lang for examining this dissertation,
• Tobias Breiten for chairing the scientific defense,
• all my coauthors, from whom I had the chance to learn a lot: Sergio Alonso, Randolf Altmeyer, Carsten Beta, Igor Cialenco, Sven Flemming, Hyun-Jung Kim, Wilhelm Stannat (in alphabetical order),¹
• the anonymous referees for their helpful comments on the preprints and publications,
• the SFB 1294 “Data Assimilation” as well as the Technical University Berlin (TU Berlin) for providing all the resources needed in order to conduct this work,²
• Igor Cialenco and the Illinois Institute of Technology (IIT) for the hospitality during a research stay from January to March 2020,
• Sven Flemming for preparing and providing the experimental giant cell data used in Section 6.2,
• Randolf Altmeyer for giving helpful feedback on the final draft of this manuscript,
• Markus Reiß for introducing me to the field of statistics for SPDEs,
• my family for supporting me and believing in my project.

¹ During my time as a PhD student, I worked on four projects [PS20], [ACP20], [PFA+21], [CKP21]. The first three works are the basis for this dissertation.

² This research has been funded by the Deutsche Forschungsgemeinschaft (DFG) - Project-ID 318763901 - SFB1294.

Contents

1 Introduction
2 The Spectral Approach
2.1 The Setting
2.2 Spatial Regularity
2.3 Diffusivity Estimation
2.4 Discussion of Examples
2.4.1 Linear Perturbations
2.4.2 Reaction-Diffusion Equations
2.4.3 Burgers Equation
2.4.4 Equations of Cahn-Hilliard Type
2.4.5 Robustness under Model Misspecification
2.5 Numerical Illustration
2.6 The Case of Systems
3 Extended Noise Models for the Spectral Approach
3.1 The Case of Ornstein–Uhlenbeck Noise
3.1.1 Covariance Structure and Asymptotic Behavior
3.1.2 The Maximum–Likelihood Approach
3.1.3 Diffusivity Estimation
3.1.4 Correlation Decay Estimation
3.2 The Case of Integrated Noise
3.3 Structure of the Dispersion Operator
4 Discretization of the Spectral Approach
5 The Local Approach
6 Diffusivity Estimation for Activator-Inhibitor Models
6.1 Joint Diffusivity and Reaction Parameter Estimation
6.1.1 The General Case
6.1.2 The Statistically Linear Case
6.1.3 A Model of FitzHugh–Nagumo Type
6.2 Application to Cell Data
6.2.1 Evaluation of Simulated Data
6.2.2 Evaluation of Real Data
6.2.3 Evaluation of a Cell Population
6.2.4 The Effective Unstable Zero
6.2.5 The Effective Diffusivity Outside the Cell
7 Further Research
A Limit Theorems
B Additional Proofs
B.1 Proof of Proposition 2.17
B.2 Proof of Lemma 5.5
B.3 Proof of Proposition 6.6
List of Figures
Notation
Bibliography

Chapter 1

Introduction

Statistical inference for stochastic partial differential equations (SPDEs) is an emerging field within the topic of statistics for stochastic processes. Formally, an SPDE can be described as an evolution equation

\[ \mathrm{d}X_t = A(X_t)\,\mathrm{d}t + B(X_t)\,\mathrm{d}W_t \tag{1.1} \]

on an infinite-dimensional state space, with suitable initial condition. Here, $A$ determines the drift, $B$ acts as a dispersion operator and $W$ is a cylindrical Wiener process. Further details are given below and in Section 2.1.

It is natural to use an SPDE in order to describe the spatiotemporal dynamics of phenomena such as pattern formation or traveling waves, see [SS16] and references therein for the propagation of action potentials in neuroscience, or [FFAB20] for actin dynamics within D. discoideum giant cells. Such models may arise in different ways, for example by deriving them from first principles, or as a phenomenological description, by adding noise to a deterministic partial differential equation. A widely used class of models is given by stochastic reaction–diffusion systems, whose drift component combines localized reaction dynamics with diffusive coupling in space, i.e. $A = \nabla \cdot (\theta \nabla) + F$, where $\theta$ describes the (possibly inhomogeneous and anisotropic) diffusivity. The reaction term $F$ can reflect detailed knowledge of the underlying (e.g. biophysical) processes, or it may be a minimal model capable of reproducing certain features found in the observations. Different approaches may be used to describe observed patterns, and it is desirable to apply statistical methods in order to understand the advantages and limits of different models.

Despite their great variety, a common feature of many SPDE models is the presence of diffusive forcing. Consequently, a precise understanding of the diffusivity $\theta$ is crucial. As a natural first approximation, the diffusivity can be assumed to be homogeneous and isotropic, i.e. $\theta > 0$ is just a positive number. In this case, $\nabla \cdot (\theta \nabla) = \theta \Delta$, where $\Delta$ is the Laplacian. Depending on the amount and quality of the data at hand, this may be refined. For example, in [AR21], an estimation theory for a stochastic heat equation with inhomogeneous diffusivity is developed.

Context and Literature

In order to improve the understanding of a random field with spatial and temporal extension based on observed data, different techniques from statistical inference can be applied to different classes of models. We mention two related approaches, which complement the modeling ansatz outlined above: Apart from starting directly with an SPDE model, a Gaussian field with specified covariance function may be imposed, see e.g. [Yin93, Moh97]. Here, the focus lies on features of the covariance rather than the dynamics of the temporal evolution. Another related concept is a partial differential equation with boundary noise, as studied in [BST80, AB88, MP07].

Literature surveys concerning statistical inference for SPDEs are given in [Lot09, Cia18].¹ A classical problem concerns the estimation of unknown parameters of the underlying model based on observed data [IH81]. Most of the literature on statistics for SPDEs is related to parameter estimation, which we are primarily interested in. Further central tasks that have been studied include hypothesis testing [CX14, CX15], nonparametric estimation [HL00a, HL00b, PR02, HT21a] and Bayesian inference [Bis99, PR00, Bis02, CCG20]. A different but related topic is stochastic filtering (see [LS77, LS01, BC09] for the general theory and e.g. [Ouv78, RR20] for infinite-dimensional processes). In [Lot04], the solution to a parabolic equation is interpreted as the observation process of a hidden parameter, which is itself assumed to satisfy a stochastic differential equation. Estimation of unknown quantities of the signal SPDE with maximum-likelihood methods in the context of stochastic filtering is treated in [BB84, AS88, Aih92, Aih98b].

¹ A continuously updated list of references can be found at the webpage: https://sites.google.com/prod/view/stats4spdes/

Inference on stochastic (ordinary) differential equations, SDEs for short, is well-established, with a huge body of literature. We refer to [Kut04] for a detailed analysis of various statistical questions for ergodic diffusion processes, which can be analyzed through their large time behavior. It is natural to consider the large time regime also for the case of SPDEs, with infinite-dimensional state space, exploiting ergodicity properties of the process. This has been done by [Log84, KL85, KL86, Moh94] for parabolic equations (see also [Aih98a]), and [Jan20, Jan21] for damped wave equations. The case of fractional noise is treated in [MP07, MT13, KM19].

However, it has been noted in [HKR93, HR95] that it is possible to recover certain drift parameters (for example, the diffusivity of a stochastic heat equation) even in finite time. In fact, the presence of unbounded operators implies that the measures on path space generated by the process in finite time for different drift parameters can be singular. This is in strong contrast to the case of SDEs with finite-dimensional state space, where Girsanov’s theorem assures that the measures on path space are absolutely continuous. While this fragmentation of the path space into the domains of singular measures can lead to analytical difficulties (e.g. due to the lack of a dominating measure), it is helpful from a statistical point of view as it implies the identifiability of the parameters: Given an observation $X$, one only has to determine the measure whose support contains $X$. Of course, in practice this is achieved by substituting the state space by some finite-dimensional approximation and studying the asymptotics of classical estimation techniques based on that discretization.

The methods and works on parameter estimation for SPDEs can be categorized according to the observation scheme they are based on. Different discretization schemes can be applied in space and time.

We focus on the idealized assumption that the process is observed continuously in time up to a fixed $T > 0$. However, there are various works on the temporally discrete setting, see e.g. [Mar03, PR96, PR97, CDVK20] for maximum likelihood-type drift estimation, or [BT19, BT20, Cho20, Cho19, CD20, KU21b, KU21a, TTV14] for different approaches based on temporal power variations.

Under full spatial observations, it is possible to study the limit of small noise intensity, as done by [Hue99, IK99, IK00, IK01]. On the other hand, partial knowledge of the process in space can be modeled in different ways:

The spectral approach is based on observing an increasing number of eigenfrequencies of the process, which are associated with the highest order linear differential operator appearing in the drift. For the (finite) set of observed modes, maximum-likelihood techniques can be applied. This approach has been introduced by [HKR93, HR95, Hue93] and extended by a large number of subsequent works. If the remaining drift and dispersion terms are linear and commute with the highest order drift term, the SPDE decouples into a system of independent one-dimensional processes. Even in the non-diagonalizable case, similar techniques can be applied, see [Lot96, LR99, LR00, Lot03]. Further, in [HLR97], estimators derived from Galerkin approximations instead of spectral projections are discussed. In [LL10b, LL10a], hyperbolic equations are considered. Trajectory fitting estimators have been analyzed in [CGH18]. Different noise models are treated in [CL09, CCG20] (multiplicative noise), [CLP09, Cia10, Hui14, Kří20] (additive and multiplicative fractional noise) and [CKL20] (space-only noise).

For spatially discrete observations, modeled as a set of spatial point evaluations of the solution process, a natural approach is to consider power variations in space and analyze the asymptotic behavior as the mesh size of the underlying spatial grid tends to zero. This has been done in [PT07, CH20] (for a stochastic heat equation), [MKT19a, MKT19b] (for a stochastic fractional heat equation), [CKL20, CK22] (in the context of space-only noise) as well as [SST20] (for a wave equation driven by fractional noise). A joint spatiotemporal variation is considered in [HT21b, HT21a].

The new local approach, pioneered in [AR21], considers spatially discrete observations as local averages rather than point evaluations of the process. The process is weighted by some localized kernel, which is determined by the measurement device and its resolution. The high frequency regime from the spectral approach is substituted by a “shrinking kernel” regime that corresponds to high precision measurements. Interestingly, the diffusivity of a stochastic heat equation is identifiable from one single localized observation in finite time.

Most literature on SPDE parameter estimation is concerned with linear equations. An early treatment of nonlinear systems appears in [Hue93, Chapter 4], where estimators derived from Galerkin approximations are studied in the scope of general maximum likelihood theory [IH81], based on a similar analysis for ergodic diffusion processes [Kut04]. However, in this setting, the observations are not a functional of the full process $X$, but rather a finite-dimensional Markovian approximation to $X$. In [GM02], consistency of the maximum likelihood estimator for a class of controlled stochastic reaction–diffusion equations is shown in the large time regime (see also [DMPD00], where the case that the nonlinear term depends linearly on its parameters is treated separately). For finite observation horizon $T$, the first study of parameter estimation from spectral observations of a nonlinear process has been given in [CGH11] for the stochastic Navier–Stokes equations. The first result on nonparametric estimation of a reaction term based on discrete observations is given in the recent work [HT21a], where the authors consider one-dimensional stochastic reaction–diffusion equations.

So far, there are few works on the application of SPDE parameter estimation techniques to experimental data. We point out [Unn89, KUP91], where parameter estimation for SPDE models arising in groundwater hydrology is studied with a formal maximum likelihood method, and diffusion and advection coefficients are calibrated from data.² Very recently, [ABJR21] considers stochastic cell repolarization models in the context of the local approach.

Main Results and Outline

In this work, we consider drift parameter estimation for semilinear SPDEs. We adapt the spectral and local approach to the general semilinear setting. Based on a maximum-likelihood ansatz, we construct consistent estimators for the unknown diffusivity and study their asymptotic properties in finite time in detail. Depending on the specific setting, we are able to obtain optimal convergence rates as well as asymptotic normality, which allows for the construction of confidence intervals. General results are given in Theorems 2.11 and 2.12 for the spectral approach, and Theorem 5.8 for the local approach. Examples are studied in Section 2.4. Our theory depends on a precise understanding of the spatial regularity of the solution $X$ and related processes. Special emphasis is put on robustness of the estimators to model misspecification, either in the reaction term (Theorem 2.27) or in the specification of the dispersion operator (Theorem 3.22).

The general theory is supported by numerical evidence for the case of a stochastic Allen–Cahn equation (Section 2.5).

² Within a different setting, [Moh00] considers a random advection-diffusion equation in order to describe soil plutonium data.

We study how the results can be transferred to SPDEs with different noise models that are relevant for applications. For Ornstein–Uhlenbeck driven models, a detailed characterization is given in Theorem 3.7. As a new feature, even the rate of temporal correlation decay of such models can be identified in finite time (Theorem 3.10). On the other hand, the estimators are very sensitive to deviations from the semimartingale setting, as shown for the case of processes driven by integrated Wiener noise (Theorem 3.15).

In addition, we study to what extent the asymptotic results of the spectral approach can be recovered if spatially discrete observations rather than Fourier modes are available. Under mild conditions on the domain and scheme of point observations, we obtain bounds on the convergence rate of a discretized spectral estimator as the mesh size tends to zero (Theorem 4.3). Under additional assumptions on the underlying geometry, these bounds are optimal (Theorem 4.7), in the sense that they are consistent with results from related literature.

Finally, we extend the spectral estimation theory to the case of joint diffusivity and reaction parameter estimation (Theorems 6.1 and 6.4), with special emphasis on stochastic activator–inhibitor models. The results are applied to experimental D. discoideum giant cell observations in Section 6.2, where we discuss the effective diffusivity of intracellular actin concentration.

This work is based on the papers [PS20], [ACP20], [PFA+21] and additional new material. It is structured as follows:

• Chapter 2 is based on [PS20] and develops the spectral approach for general semilinear models. In fact, the results from [PS20] are extended by means of a different approach to higher regularity (as in [ACP20]).
• Chapters 3 and 4 are new and not based on previous publications. In Chapter 3, different noise models for the spectral approach are worked out, based on SPDE models arising in biophysics literature. Special emphasis is put on the case of Ornstein–Uhlenbeck noise. Chapter 4 concerns the adaptation of the asymptotic results from the spectral approach to the case that the solution process $X$ is observed not via its Fourier modes, but discretized in space.
• Chapter 5 is based on [ACP20]. Diffusivity estimation for semilinear SPDE models is treated from the perspective of the recently introduced local approach. A crucial tool is higher $L^p$-regularity of the solution process.
• Chapter 6 is based on [PFA+21]. Diffusivity and reaction parameters are jointly estimated in the scope of the spectral approach, and the results are applied to simulated and experimental cell data.

A First Example

We outline the general procedure with a simple example. For a Gelfand triple $V \subset H \subset V^*$, consider the equation

\[ \mathrm{d}X_t = \theta A(X_t)\,\mathrm{d}t + B(X_t)\,\mathrm{d}W_t, \quad X_0 \in H, \tag{1.2} \]

with unknown $\theta > 0$, where $A : V \to V^*$ is a possibly nonlinear operator, $W$ is a cylindrical Wiener process, and $B$ maps $V$ into the space of Hilbert–Schmidt operators on $H$. This corresponds to a drift of the form $\theta A$ in (1.1). In this example, $\theta$ should be seen as the overall drift intensity rather than diffusivity.

Assume that (1.2) is well-posed, e.g. under monotonicity and coercivity assumptions on $A$ and $B$ [LR15]. Now, given a sequence of linear projection operators $(P_N)_{N \in \mathbb{N}}$ with finite-dimensional range on $H$, the dynamics for $X^N := P_N X$ is given by

\[ \mathrm{d}X^N_t = \theta A_N(X_t)\,\mathrm{d}t + B_N(X_t)\,\mathrm{d}W_t, \tag{1.3} \]

where $A_N(X) := P_N A(X)$ and $B_N(X) := P_N B(X)$. Note that in general $X^N$ ceases to be Markovian. Assume that $X^N$, $A_N(X)$ and $B_N(X)$ are observed. Let $B_N(X_t) B_N(X_t)^T$ be invertible for $0 \le t \le T$ (interpreted as an operator acting on the range of $P_N$), and set $B_N(X_t)^+ := B_N(X_t)^T (B_N(X_t) B_N(X_t)^T)^{-1}$. This is the Moore–Penrose pseudoinverse³ of the operator $B_N(X_t)$. Then a natural estimator for $\theta$ is given by

\[ \hat{\theta}_N = \frac{\int_0^T \langle (B_N(X_t) B_N(X_t)^T)^{-1} A_N(X_t), \mathrm{d}X^N_t \rangle}{\int_0^T \| B_N(X_t)^+ A_N(X_t) \|^2 \,\mathrm{d}t}. \tag{1.4} \]

This estimator can be derived from a maximum-likelihood approach or justified directly by the decomposition

\[ \hat{\theta}_N - \theta = \frac{\int_0^T \langle B_N(X_t)^+ A_N(X_t), \mathrm{d}W_t \rangle}{\int_0^T \| B_N(X_t)^+ A_N(X_t) \|^2 \,\mathrm{d}t}. \tag{1.5} \]

³ For operators between Hilbert spaces, the Moore–Penrose pseudoinverse is defined analogously to the finite-dimensional case, cf. [WD01]. See e.g. [LS01, Chapter 13] for properties of the pseudoinverse in the finite-dimensional case.
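The decomposition (1.5) can be checked by inserting the dynamics (1.3) into the numerator of (1.4). The following computation is a sketch; the shorthand $a_t := A_N(X_t)$ and $b_t := B_N(X_t)$ is introduced here only for readability and is not notation from the text. It uses that $(b_t b_t^T)^{-1}$ is symmetric and that $b_t^+ = b_t^T (b_t b_t^T)^{-1}$.

\[
\begin{aligned}
\langle (b_t b_t^T)^{-1} a_t, \mathrm{d}X^N_t \rangle
  &= \theta \langle (b_t b_t^T)^{-1} a_t, a_t \rangle \,\mathrm{d}t
     + \langle (b_t b_t^T)^{-1} a_t, b_t \,\mathrm{d}W_t \rangle \\
  &= \theta \| b_t^+ a_t \|^2 \,\mathrm{d}t + \langle b_t^+ a_t, \mathrm{d}W_t \rangle,
\end{aligned}
\]

since $\| b_t^+ a_t \|^2 = \langle (b_t b_t^T)^{-1} b_t b_t^T (b_t b_t^T)^{-1} a_t, a_t \rangle = \langle (b_t b_t^T)^{-1} a_t, a_t \rangle$. Integrating over $[0, T]$ and dividing by $\int_0^T \| b_t^+ a_t \|^2 \,\mathrm{d}t$ gives $\hat{\theta}_N = \theta$ plus the martingale term on the right-hand side of (1.5).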

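Set $I_N := \int_0^T \| B_N(X_t)^+ A_N(X_t) \|^2 \,\mathrm{d}t$. According to Theorem A.1, $\hat{\theta}_N$ is a consistent estimator as $N \to \infty$ which is asymptotically normal with rate $(\mathbb{E} I_N)^{-1/2}$, i.e.

\[ (\mathbb{E} I_N)^{1/2} (\hat{\theta}_N - \theta) \xrightarrow{d} \mathcal{N}(0, 1), \tag{1.6} \]

whenever $I_N^{-1} \to 0$ and $I_N / \mathbb{E} I_N \to 1$ as $N \to \infty$. An explicit expression in $N$ of the convergence rate $(\mathbb{E} I_N)^{-1/2}$ will depend on the particular projection operators $P_N$.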
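A minimal numerical sketch of the estimator (1.4) may help to fix ideas. It rests on assumptions that are not part of the example above: $A$ is taken to be the Dirichlet Laplacian on $(0, 1)$ with eigenvalues $\lambda_k = (\pi k)^2$, the dispersion operator is $B = \sigma(-A)^{-\gamma}$ (diagonal in the same basis), $P_N$ projects onto the first $N$ eigenfunctions, and the time integrals are replaced by Itô-type sums on a fine grid. In this diagonal case, (1.4) reduces to $\hat{\theta}_N = -\sum_k \lambda_k^{1+2\gamma} \int x_k \,\mathrm{d}x_k \,/\, \sum_k \lambda_k^{2+2\gamma} \int x_k^2 \,\mathrm{d}t$ for the modes $x_k = \langle X, \Phi_k \rangle$. All function and parameter names are illustrative.

```python
import numpy as np


def simulate_modes(theta, sigma, gamma, n_modes, T, n_steps, rng):
    """Simulate the first n_modes Fourier modes of a linear SPDE (assumed model).

    Each mode solves dx_k = -theta*lam_k*x_k dt + sigma*lam_k**(-gamma) dW_k
    with x_k(0) = 0. The exact Ornstein-Uhlenbeck transition is used, so the
    simulation itself has no time-discretization error.
    """
    lam = (np.pi * np.arange(1, n_modes + 1)) ** 2  # assumed eigenvalues of -A
    dt = T / n_steps
    decay = np.exp(-theta * lam * dt)
    noise_sd = sigma * lam ** (-gamma) * np.sqrt((1.0 - decay**2) / (2.0 * theta * lam))
    x = np.zeros((n_steps + 1, n_modes))
    for i in range(n_steps):
        x[i + 1] = decay * x[i] + noise_sd * rng.standard_normal(n_modes)
    return lam, dt, x


def spectral_estimator(lam, dt, x, gamma):
    """Discretized version of (1.4) for the diagonal model above."""
    dx = np.diff(x, axis=0)
    # Ito sums: the integrand is evaluated at the left endpoint of each step.
    numerator = -np.sum(lam ** (1 + 2 * gamma) * np.sum(x[:-1] * dx, axis=0))
    denominator = np.sum(lam ** (2 + 2 * gamma) * np.sum(x[:-1] ** 2, axis=0)) * dt
    return numerator / denominator


rng = np.random.default_rng(0)
lam, dt, x = simulate_modes(theta=1.0, sigma=0.5, gamma=0.5,
                            n_modes=10, T=1.0, n_steps=50_000, rng=rng)
theta_hat = spectral_estimator(lam, dt, x, gamma=0.5)
print(f"estimated theta: {theta_hat:.3f}")
```

Note that $\sigma$ cancels in the ratio, so the sketch does not need to know the noise level. The time step has to be small compared with $1/(\theta \lambda_N)$, since otherwise the Itô sums carry a visible discretization bias in the highest modes.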

This discussion outlines the argument for maximum-likelihood based estimation theory for SPDEs with parametric drift terms. Some comments on this approach are in order:

(i) Closability of the observation scheme: In practice, it is unlikely that all three quantities $X^N$, $A_N(X)$ and $B_N(X)$ are observed. Usually there is just access to $X^N$. This problem can be addressed in different ways: Certain model assumptions can be imposed, e.g. that $B(X) \equiv B$ is constant and known. Also, the generating model and the observation scheme can be aligned in the sense that e.g. $A(X^N) = A_N(X)$, at least up to negligible terms. This is the basic idea behind the spectral approach, where $A$ and $P_N$ commute. When considering spatially discrete observations in Chapter 4, such a commutativity relation does not hold, and we have to deal with an additional bias.

(ii) Model refinement: In this scenario, $\theta$ represents the overall drift intensity of the dynamics. However, it is desirable to refine this model, either by investigating a parameter linked to a specific part of the drift (e.g. spatial diffusion, as outlined above), or by considering multiparametric drift terms based on specific model knowledge. We address both questions in the sequel, with the main focus on diffusivity estimation.

(iii) Robustness: In the case that (1.2) is misspecified but close to the true generating dynamics, it is desirable that the asymptotic results transfer. We will look at robustness of the estimation procedure to misspecification in the drift and noise terms.

Chapter 2

The Spectral Approach

This chapter is an adaptation of the statements and results from [PS20].

The aim of this chapter is to develop an estimation theory for the diffusivity of a semilinear SPDE driven by additive noise within the spectral approach to statistical inference for SPDEs.

In the spectral approach, a finite number $N$ of Fourier modes of the solution $X$ to an SPDE is observed, usually continuously in time, and the asymptotic properties of estimators derived from these observations are determined as $N$ tends to infinity. This ansatz has been pioneered by [HKR93, Hue93, HR95], where it has been noted that the coefficient of an unbounded drift operator of an SPDE can be identified in finite time, in strong contrast to the finite-dimensional case of a stochastic (ordinary) differential equation. In [CGH11], statistical inference for the stochastic Navier–Stokes equations with additive noise in two dimensions has been considered, which is the first treatment of parameter estimation in finite time for a nonlinear system within the spectral approach. The results and methods therein have been extended to general semilinear equations in [PS20], on which the present chapter is based.

In Section 2.1, we discuss the semilinear SPDE model which will be used throughout this chapter, together with some auxiliary results. Section 2.2 is concerned with optimal spatial regularity of the solution process $X$. In Section 2.3, we discuss the maximum-likelihood approach to diffusivity estimation and prove asymptotic results for the estimators derived from that approach. Examples including reaction-diffusion equations, the Burgers equation and equations of Cahn-Hilliard type are treated in Section 2.4. The impact of a misspecified drift term is analyzed in the same section. A numerical validation of the theory is given in Section 2.5 for the stochastic Allen-Cahn equation. Systems consisting of an observed component and an unobserved component are handled in Section 2.6.

2.1 The Setting

Let $H$ be a separable Hilbert space with scalar product $\langle \cdot, \cdot \rangle$ and $A : D(A) \to H$ a densely defined, closed operator that is self-adjoint and negative definite with compact resolvent. For $s \ge 0$, let $H_s := D((-A)^{s/2}) \subseteq H$ be the domain of the fractional Laplacian, equipped with the norm $\| \cdot \|_s = \| (-A)^{s/2} \cdot \|_H$. For $s < 0$, let $H_s$ be the completion of $H$ w.r.t. the norm $\| \cdot \|_s$ given by the same expression. For each $s \ge 0$, $H_s \subseteq H \subseteq H_{-s}$ forms a Gelfand triple. The dual pairing between $H_s$ and $H_{-s}$ is again denoted by $\langle \cdot, \cdot \rangle$. Set $V := H_1$, then $V^*$ can be identified with $H_{-1}$. For $\theta > 0$, denote by $t \mapsto e^{t \theta A}$, $t > 0$, the $C_0$-semigroup generated by $\theta A$. Let $(\Phi_k)_{k \in \mathbb{N}}$ be an orthonormal basis of $H$ consisting of eigenfunctions of $-A$, such that the corresponding sequence of (positive) eigenvalues $(\lambda_k)_{k \in \mathbb{N}}$ is ordered increasingly, taking into account multiplicities. Let $P_N : H \to H$ be the orthogonal projection onto the span of the first $N$ eigenfunctions $\Phi_1, \dots, \Phi_N$. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $(\mathcal{F}_t)_{t \ge 0}$ a right-continuous complete filtration, then $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ is called a stochastic basis.

In this chapter, we consider a semilinear SPDE of the form

\[ \mathrm{d}X_t = \theta A X_t \,\mathrm{d}t + F(X)(t)\,\mathrm{d}t + B\,\mathrm{d}W_t \tag{2.1} \]

together with initial condition $X_0 \in H$, where $W$ is a cylindrical Wiener process on $H$, $B = \sigma (-A)^{-\gamma}$ for some $\sigma, \gamma > 0$ is assumed to be of Hilbert–Schmidt type, $F : C(0, T; H) \supseteq D(F) \to L^1(0, T; H)$ is a nonlinear operator and $\theta > 0$ is an unknown parameter.

A pair of adapted processes $(X, W)$, defined on some stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$, with $X \in D(F) \subseteq C(0, T; H)$ a.s. and $W$ a cylindrical Wiener process, is a solution to (2.1) in the analytically and probabilistically weak sense if for all $v \in D(A) = H_2$ a.s.

\[ \langle v, X_t \rangle = \langle v, X_0 \rangle + \theta \int_0^t \langle A v, X_r \rangle \,\mathrm{d}r + \int_0^t \langle v, F(X)(r) \rangle \,\mathrm{d}r + \langle v, B W_t \rangle. \]

We always assume the following:

(W) There is a solution in the analytically and probabilistically weak sense to (2.1) in $C(0, T; H)$, which is unique in the sense of probability law.

By means of the Yamada–Watanabe theorem (see e.g. [LR15, Theorem E.0.8]), pathwise uniqueness implies uniqueness in the sense of probability law. In most examples, $F$ will be of the form $F(X)(t) = F(X_t)$, i.e. (by abuse of notation) $F : H \supseteq D(F) \to H$, such that $F$ extends to an operator $V \to V^*$. In this case, well-posedness of (2.1) can be handled in the context of the variational approach [LR15], cf. Section 2.4.¹

Condition (W) alone imposes very little spatial regularity and should be considered as a minimal requirement that serves as a baseline for stronger regularity properties of $X$. In fact, a detailed analysis of higher regularity for $X$ will be crucial for our statistical analysis, cf. Section 2.2. There, we need the representation of $X$ as a mild solution:

\[ X_t = e^{t \theta A} X_0 + \int_0^t e^{(t-r) \theta A} F(X)(r)\,\mathrm{d}r + \int_0^t e^{(t-r) \theta A} B\,\mathrm{d}W_r, \]

where the first integral is understood in the Bochner sense, and the second integral is a stochastic convolution.
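The mild formulation also suggests a simple way to simulate the Fourier modes of $X$ numerically. The following is a sketch of an exponential Euler scheme under assumptions that are not taken from the text: the nonlinearity is frozen over each time step, the stochastic convolution of each mode is sampled exactly, and as an illustrative example the Allen–Cahn nonlinearity $F(u) = u - u^3$ is projected onto the sine basis $\Phi_k(x) = \sqrt{2} \sin(k \pi x)$ on $(0, 1)$ by midpoint quadrature. It is not claimed to be the scheme used for the numerical results of this work; all names are hypothetical.

```python
import numpy as np


def exponential_euler(c0, lam, theta, drift, noise_amp, T, n_steps, rng):
    """Advance the spectral coefficients c_k = <X, Phi_k> over [0, T].

    One step applies the mild formula with the nonlinearity frozen at time t:
      c_k(t+dt) = e^{-theta*lam_k*dt} c_k(t)
                  + (1 - e^{-theta*lam_k*dt}) / (theta*lam_k) * drift(c)_k
                  + exact Gaussian increment of the stochastic convolution.
    noise_amp holds the diagonal entries of B (e.g. sigma * lam**(-gamma)).
    """
    dt = T / n_steps
    decay = np.exp(-theta * lam * dt)
    phi = (1.0 - decay) / (theta * lam)  # integral of e^{-theta*lam*s} over [0, dt]
    noise_sd = noise_amp * np.sqrt((1.0 - decay**2) / (2.0 * theta * lam))
    c = np.array(c0, dtype=float)
    for _ in range(n_steps):
        c = decay * c + phi * drift(c) + noise_sd * rng.standard_normal(c.size)
    return c


def allen_cahn_drift(n_modes, n_grid=256):
    """Return c -> coefficients of P_N F(u), F(u) = u - u**3 (illustrative choice)."""
    x = (np.arange(n_grid) + 0.5) / n_grid          # midpoint grid on (0, 1)
    k = np.arange(1, n_modes + 1)
    basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(k, x))  # shape (n_modes, n_grid)

    def drift(c):
        u = c @ basis                                # evaluate u on the grid
        return basis @ (u - u**3) / n_grid           # quadrature of <F(u), Phi_k>

    return drift


n_modes = 8
lam = (np.pi * np.arange(1, n_modes + 1)) ** 2
rng = np.random.default_rng(0)
c_T = exponential_euler(np.zeros(n_modes), lam, theta=0.1,
                        drift=allen_cahn_drift(n_modes),
                        noise_amp=0.5 * lam ** (-0.5), T=1.0, n_steps=2000, rng=rng)
print(c_T)
```

The linear part is treated exactly, so the step size is restricted only by the accuracy of freezing the nonlinearity, not by the stiffness of the high modes.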

Remark 2.1. In general, analytically weak and mild solutions are not equivalent. However, if a.s. $X, F(X) \in L^1(0, T; H)$, it can be shown that analytically weak solutions are also mild solutions, see Proposition G.0.5 (i) and Remark G.0.6 in [LR15]. These conditions are satisfied in our setting.

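For two sequences of positive numbers $(a_N)_{N \in \mathbb{N}}$ and $(b_N)_{N \in \mathbb{N}}$, we write $a_N \asymp b_N$ if $a_N / b_N \to 1$ and $a_N \sim b_N$ if $a_N / b_N \to C$ for some $C > 0$. Further we write $a_N \lesssim b_N$ if $a_N \le C b_N$ for some $C > 0$, $a_N \ll b_N$ if $a_N / b_N \to 0$ and $a_N \ll_p b_N$ if there is $\epsilon > 0$ such that $N^{\epsilon} a_N / b_N \to 0$. If the sequences are random, then these limits are meant in the almost sure sense, unless stated otherwise.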

We always assume that there are $\Lambda, \beta > 0$ such that the sequence of eigenvalues of $-A$ has polynomial growth:

\[ \lambda_k \asymp \Lambda k^{\beta} \tag{2.2} \]

for $k \to \infty$. Note that with this notation, the condition that $B$ is of Hilbert–Schmidt type is equivalent to

\[ \gamma > \frac{1}{2\beta}. \tag{2.3} \]

¹ There is a vast literature on well-posedness and regularity for SPDEs, see [DPZ14, LR15] and references therein. In [Pes95], existence and uniqueness of semilinear equations on Banach spaces is considered. Stochastic reaction–diffusion equations are studied in detail in [Cer01]. See [Kry96, vNVW12] for a treatment of maximal $L^p$-regularity.

We close this section by stating some auxiliary results.

Lemma 2.2. For $s_1 \le s_2$ and $X \in H$:

(i) $\|P_N X\|_{s_2}^2 \le \lambda_N^{s_2-s_1} \|P_N X\|_{s_1}^2$.

(ii) $\|(I-P_N)X\|_{s_1}^2 \le \lambda_{N+1}^{s_1-s_2} \|(I-P_N)X\|_{s_2}^2$.

(iii) If $X \in H^{s_1}$, then $\|e^{r\theta A}X\|_{s_2}^2 \le C_{s_2-s_1}\, r^{-(s_2-s_1)} \|X\|_{s_1}^2$ for some $C_{s_2-s_1} > 0$ and all $r > 0$.

Proof. All properties are clear from the spectral decomposition $\|Z\|_s^2 = \sum_{k=1}^{\infty} \lambda_k^s \langle Z, \Phi_k\rangle^2$, $s \in \mathbb{R}$, $Z \in H^s$. In (iii), we can choose the constant $C_{s_2-s_1} = \sup_{y>0} e^{-2y} y^{s_2-s_1} / \theta^{s_2-s_1}$.

The statements (i) and (ii) are bounds of Bernstein and Jackson type, respectively (cf. [Sha71, BL76]). Statement (iii) is a smoothing property of the semigroup, see [Paz83, Lun95].
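The Bernstein and Jackson bounds of Lemma 2.2 (i) and (ii) can be checked numerically on a truncated coefficient vector. The following is only a sketch: the spectrum $\lambda_k = (\pi k)^2$, the random test vector and the function names are assumptions made for illustration and do not appear in the text.

```python
import numpy as np

def sobolev_norm_sq(coeffs, eigvals, s):
    """Squared H^s norm via the spectral decomposition: sum_k lambda_k^s <Z, Phi_k>^2."""
    return float(np.sum(eigvals ** s * coeffs ** 2))

def check_lemma_2_2(n_modes=200, N=50, s1=0.5, s2=1.5, seed=0):
    """Return (Bernstein bound holds, Jackson bound holds) for one random vector, s1 <= s2."""
    rng = np.random.default_rng(seed)
    k = np.arange(1, n_modes + 1)
    eigvals = (np.pi * k) ** 2            # assumed spectrum: Lambda = pi^2, beta = 2
    coeffs = rng.standard_normal(n_modes) / k ** 2   # hypothetical test vector
    low = np.where(k <= N, coeffs, 0.0)   # coefficients of P_N X
    high = coeffs - low                   # coefficients of (I - P_N) X
    lam_N, lam_N1 = eigvals[N - 1], eigvals[N]
    tol = 1 + 1e-12                       # guard against floating-point rounding
    bernstein = (sobolev_norm_sq(low, eigvals, s2)
                 <= tol * lam_N ** (s2 - s1) * sobolev_norm_sq(low, eigvals, s1))
    jackson = (sobolev_norm_sq(high, eigvals, s1)
               <= tol * lam_N1 ** (s1 - s2) * sobolev_norm_sq(high, eigvals, s2))
    return bool(bernstein), bool(jackson)
```

Both bounds hold term by term in the spectral sum, so the check is expected to succeed for any admissible choice of `s1 <= s2`.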

2.2 Spatial Regularity

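In this section, we describe the precise regularity of $X$, and in particular, the excess regularity of its nonlinear part. In order to do so, we apply a classical splitting argument and write $X = \bar X + \widetilde X$, where $\bar X$ solves

\[ d\bar X_t = \theta A \bar X_t\,dt + B\,dW_t \tag{2.4} \]

with initial condition $\bar X_0 = 0$, and $\widetilde X$ solves the random PDE

\[ d\widetilde X_t = \big(\theta A \widetilde X_t + F(\bar X + \widetilde X)(t)\big)\,dt, \quad \widetilde X_0 = X_0, \tag{2.5} \]

which reads as

\[ \widetilde X_t = e^{t\theta A} X_0 + \int_0^t e^{(t-r)\theta A} F(\bar X + \widetilde X)(r)\,dr \tag{2.6} \]

in the mild formulation.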

We infer higher regularity for $\widetilde X$ by means of the conditions $(F_{s,\eta})$ and $(F^v_{s,\eta})$ below, which rely on the representation of $\widetilde X$ as a mild or weak solution, respectively. The weak solution approach has been used in [CGH11, PS20] for the spectral observation scheme, whereas the mild representation has been applied in [ACP20] in the context of the local observation scheme, cf. Chapter 5. As the mild approach yields larger excess regularity in our context, we focus mainly on $(F_{s,\eta})$. However, the weak approach using $(F^v_{s,\eta})$ will be crucial in the examples in order to reach the level of regularity where the mild approach can be applied.

Here and in the sequel, we write $X^N := P_N X$ as well as $\bar X^N := P_N \bar X$ and $\widetilde X^N := P_N \widetilde X$. These projected processes satisfy

\[ d\bar X^N_t = \theta A \bar X^N_t\,dt + B\,dW^N_t, \quad \bar X^N_0 = 0, \]
\[ d\widetilde X^N_t = \theta A \widetilde X^N_t\,dt + P_N F(X)(t)\,dt, \quad \widetilde X^N_0 = X^N_0, \]

where $W^N := P_N W$. In this section, we do not need the precise form (2.4) for the dynamics of $\bar X$, but only its spatial regularity. In this sense, the conclusions remain valid for Chapters 3 – 6, where the assumptions on the noise term are changed.

In order to quantify the regularity of $X$, we need the following spaces:

\[ R(s) := L^{\infty}(0,T;H^s), \tag{2.7} \]
\[ R_E(s) := \bigcap_{p \ge 1} L^p(\Omega, L^{\infty}(0,T;H^s)), \tag{2.8} \]

i.e. $R(s)$ is a normed space with norm $\|X\|_{R(s)} = \sup_{0\le t\le T} \|X_t\|_s$, and $R_E(s)$ is a locally convex space with $X \in R_E(s)$ if and only if for all $p \ge 1$:

\[ \mathbb{E} \sup_{0\le t\le T} \|X_t\|_s^p < \infty. \tag{2.9} \]

While $X \in R(s)$ a.s. suffices for the purpose of diffusivity estimation, in many examples, the stronger statement $X \in R_E(s)$ can be shown.

In order to conduct our regularity analysis, we need that for some $\eta > 0$ and $s \in \mathbb{R}$, the regularity of $F(X)$ differs from that of $X$ by $2-\eta$ derivatives, in the following sense:

$(F_{s,\eta})$ There is $\epsilon > 0$ and a monotone, locally bounded function $g: [0,\infty) \to [0,\infty)$ such that for all $X \in R(s)$:

\[ \|F(X)\|_{R(s+\eta+\epsilon-2)} \le g(\|X\|_{R(s)}). \tag{2.10} \]

In general, the function $g$ has polynomial growth, cf. Section 2.4. If $F(X)(t) = F(X_t)$, i.e. $F$ acts on every point in time separately, then it is sufficient for (2.10) that for each $0 \le t \le T$ and $X_t \in H^s$:

\[ \|F(X_t)\|_{s+\eta+\epsilon-2} \le g(\|X_t\|_s). \tag{2.11} \]

The proof of the next proposition is inspired by similar estimates in [DPDT94].

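Proposition 2.3. Let $s \in \mathbb{R}$, $\eta > 0$. Assume that $(F_{s,\eta})$ is true.

(i) If $\bar X, \widetilde X \in R(s)$ and $X_0 \in H^{s+\eta}$, then $\widetilde X \in R(s+\eta)$ a.s.

(ii) If $\bar X, \widetilde X \in R_E(s)$ and $X_0 \in L^p(\Omega, H^{s+\eta})$ for any $p \ge 1$ and if the function $g$ from $(F_{s,\eta})$ is of the form $g(x) = C(1+x)^b$ for some $C, b \ge 0$, then $\widetilde X \in R_E(s+\eta)$.

Proof.

(i) We have by Lemma 2.2 (iii):

\[ \|\widetilde X^N_t\|_{s+\eta} \le \|e^{t\theta A} X^N_0\|_{s+\eta} + \int_0^t \|e^{(t-r)\theta A} P_N F(\bar X + \widetilde X)(r)\|_{s+\eta}\,dr \]
\[ \lesssim \|X^N_0\|_{s+\eta} + \int_0^t (t-r)^{-1+\epsilon/2} \|P_N F(\bar X + \widetilde X)(r)\|_{s+\eta+\epsilon-2}\,dr \]
\[ \lesssim \|X_0\|_{s+\eta} + \|F(\bar X + \widetilde X)\|_{R(s+\eta+\epsilon-2)} \int_0^t (t-r)^{-1+\epsilon/2}\,dr \]
\[ \lesssim \|X_0\|_{s+\eta} + g\big(\|\bar X + \widetilde X\|_{R(s)}\big)\, \frac{2}{\epsilon}\, t^{\epsilon/2}, \]

thus, uniformly in $N \in \mathbb{N}$,

\[ \sup_{0\le t\le T} \|\widetilde X^N_t\|_{s+\eta} \lesssim \|X_0\|_{s+\eta} + \frac{2}{\epsilon} T^{\epsilon/2} g\big(\|\bar X\|_{R(s)} + \|\widetilde X\|_{R(s)}\big), \tag{2.12} \]

and the right-hand side is finite by assumption.

(ii) By (2.12), for any $p \ge 1$,

\[ \mathbb{E} \sup_{0\le t\le T} \|\widetilde X^N_t\|_{s+\eta}^p \lesssim \mathbb{E}\|X_0\|_{s+\eta}^p + \mathbb{E}\Big[\big(1 + \|\bar X\|_{R(s)} + \|\widetilde X\|_{R(s)}\big)^{pb}\Big] \]

and further,

\[ \mathbb{E}\Big[\big(1 + \|\bar X\|_{R(s)} + \|\widetilde X\|_{R(s)}\big)^{pb}\Big] \lesssim 1 + \mathbb{E}\|\bar X\|_{R(s)}^{pb} + \mathbb{E}\|\widetilde X\|_{R(s)}^{pb}, \]

which is finite by assumption. This proves the claim.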

We say that $s^* \in \mathbb{R}$ is the optimal regularity for $\bar X$ if a.s. $\bar X \in R(s)$ for all $s < s^*$ and $\bar X \notin R(s)$ for all $s > s^*$.

Proposition 2.4. Let $\eta > 0$, let $s^*$ be the optimal regularity of $\bar X$, let $s_0 < s^*$ such that $(F_{s,\eta})$ is true for each $s_0 \le s < s^*$.

(i) If a.s. $X \in R(s_0)$ and $X_0 \in H^{s^*+\eta}$, then a.s. $X \in R(s)$ for $s < s^*$ as well as $X \notin R(s)$ for $s > s^*$, and further a.s. $\widetilde X \in R(s+\eta)$ for $s < s^*$.

(ii) If $X \in R_E(s_0)$, $X_0 \in L^p(\Omega, H^{s^*+\eta})$ for $p \ge 1$ and $\bar X \in R_E(s)$ for $s < s^*$, and if the function $g$ from $(F_{s,\eta})$ is of the form $g(x) = C(1+x)^b$ for some $C, b > 0$, then $X \in R_E(s)$ for $s < s^*$, $X \notin R_E(s)$ for $s > s^*$, and $\widetilde X \in R_E(s+\eta)$ for $s < s^*$.

Proof. For (i), note that the statements $X \in R(s)$, $\widetilde X \in R(s+\eta)$ for $s < s^*$ follow inductively from Proposition 2.3. Further, if $X \in R(s)$ for some $s > s^*$ with positive probability, then $\bar X = X - \widetilde X \in R(s \wedge (s^* + \eta/2))$ with positive probability, in contradiction to the optimality of $s^*$. The reasoning for (ii) is similar.

If it is possible to set $s_0 = 0$ in Proposition 2.4, standard existence results for SPDEs can be used as a starting point for inferring higher regularity. In contrast, if $(F_{s,\eta})$ does not hold for $s = 0$, we have to prove first that $X \in R(s_0)$ for some $s_0 > 0$. This can be achieved by modifying the regularity induction. Typically, the variational approach for SPDEs yields well-posedness of the paths of $X$ in spaces of the form

\[ R^v(s) := L^{\infty}(0,T;H^{s-1}) \cap L^2(0,T;H^s). \tag{2.13} \]

It is still possible to find a condition on $F$ which allows for an analogue of Proposition 2.3. Here, we restrict to the case $F(X)(t) = F(X_t)$.

$(F^v_{s,\eta})$ There is a locally bounded $g: [0,\infty) \to [0,\infty)$ such that for $X \in H^s$:

\[ \|F(X)\|_{s+\eta-2}^2 \le (1 + \|X\|_s^2)\, g(\|X\|_{s-1}). \tag{2.14} \]

The next result extends a similar argument for the stochastic Navier–Stokes equations from [CGH11].

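Proposition 2.5. Let $s \in \mathbb{R}$, $\eta > 0$ such that $(F^v_{s,\eta})$ holds true and a.s. $X_0 \in H^{s+\eta-1}$. If a.s. $\bar X, \widetilde X \in R^v(s)$, then $\widetilde X \in R^v(s+\eta)$ a.s.

Proof. With $P_N H \simeq \mathbb{R}^N$, $\widetilde X^N = P_N \widetilde X$ is a process in $C^1(0,T;\mathbb{R}^N)$. The chain rule gives for $0 \le t \le T$:

\[ \|\widetilde X^N_t\|_{s+\eta-1}^2 = \|P_N X_0\|_{s+\eta-1}^2 + 2\int_0^t \big\langle \widetilde X^N_r, \theta A \widetilde X^N_r + P_N F(X_r) \big\rangle_{s+\eta-1}\,dr, \]

and consequently,

\[ \sup_{0\le t\le T} \|\widetilde X^N_t\|_{s+\eta-1}^2 + 2\theta \int_0^T \|\widetilde X^N_t\|_{s+\eta}^2\,dt \le \|X_0\|_{s+\eta-1}^2 + 2\int_0^T \big\langle \widetilde X^N_t, P_N F(X_t) \big\rangle_{s+\eta-1}\,dt. \]

The last term can be estimated as

\[ 2\int_0^T \big\langle \widetilde X^N_t, P_N F(X_t) \big\rangle_{s+\eta-1}\,dt \le 2\int_0^T \|\widetilde X^N_t\|_{s+\eta} \|P_N F(X_t)\|_{s+\eta-2}\,dt \le \theta \int_0^T \|\widetilde X^N_t\|_{s+\eta}^2\,dt + \frac{1}{2\theta} \int_0^T \|F(X_t)\|_{s+\eta-2}^2\,dt. \]

Finally, using $(F^v_{s,\eta})$ and $X = \bar X + \widetilde X \in R^v(s)$,

\[ \sup_{0\le t\le T} \|\widetilde X^N_t\|_{s+\eta-1}^2 + \theta \int_0^T \|\widetilde X^N_t\|_{s+\eta}^2\,dt \le \|X_0\|_{s+\eta-1}^2 + \frac{1}{2\theta} \int_0^T \|F(X_t)\|_{s+\eta-2}^2\,dt \]
\[ \le \|X_0\|_{s+\eta-1}^2 + \frac{1}{2\theta} \sup_{0\le t\le T} g(\|X_t\|_{s-1}) \int_0^T (1 + \|X_t\|_s^2)\,dt < \infty. \]

Thus $(\widetilde X^N)_{N\in\mathbb{N}}$ is uniformly bounded in $R^v(s+\eta)$, and the claim follows.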

Analogously to Proposition 2.4, given $\eta > 0$ and $s_0 < s^*$ such that $s^*$ denotes the optimal regularity of $\bar X$ and a.s. $X \in R^v(s_0)$ and $X_0 \in H^{s^*+\eta-1}$, if $(F^v_{s,\eta})$ holds for all $s_0 \le s < s^*$, then $X \in R^v(s)$ and $\widetilde X \in R^v(s+\eta)$ for all $s < s^*$. In particular, $X \in R(s-1)$ for all $s < s^*$. The latter statement can be used as a starting point for Proposition 2.4.

Finally, we will need a pathwise regularity statement for stochastic integrals. If a.s. $U \in R(s)$, then $t \mapsto \langle (-A)^{s/2} U_t, \cdot\rangle$ has values in the space of Hilbert–Schmidt operators from $H$ to $\mathbb{R}$, and $\int_0^T \langle (-A)^{s/2} U_t, dW_t\rangle$ is well-defined [DPZ14]. Provided that $U \in L^2(\Omega \times [0,T]; H^s)$, Itô's isometry implies that this integral is approximated in $L^2(\Omega;\mathbb{R})$ by $\int_0^T \langle (-A)^{s/2} U^N_t, dW_t\rangle$ as $N \to \infty$, where $U^N = P_N U$. By a classical stopping argument, we even have almost sure convergence, together with a quantification of the divergence rate of the rescaled approximants (cf. Lemma 2.2 (i)), in the following sense:

Lemma 2.6. Let $s, s' \in \mathbb{R}$ with $s < s'$. For every process $U$ with a.s. $U \in R(s')$, it holds that a.s.

\[ \lim_{N\to\infty} \int_0^T \big\langle (-A)^{s/2} U^N_t, dW_t \big\rangle = \int_0^T \big\langle (-A)^{s/2} U_t, dW_t \big\rangle, \tag{2.15} \]

and for every $a > 0$, a.s.

\[ \lim_{N\to\infty} \lambda_N^{-a/2} \int_0^T \big\langle (-A)^{(s+a)/2} U^N_t, dW_t \big\rangle = 0. \tag{2.16} \]

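Proof. For $K \in \mathbb{N}$, let $\tau_K := \inf\{0 \le t \le T;\ \sup_{0\le r\le t} \|U_r\|_{s'}^2 \ge K\} \wedge T$. We abbreviate $Z^N_s(t) := \int_0^t \langle (-A)^{s/2} U^N_r, dW_r\rangle$ and $Z_s(t) := \int_0^t \langle (-A)^{s/2} U_r, dW_r\rangle$. Let $\epsilon > 0$ and $p \ge 4/(\beta(s'-s))$. The Burkholder–Davis–Gundy inequality and Lemma 2.2 (ii) give

\[ \mathbb{P}\big(|Z_s(\tau_K) - Z^N_s(\tau_K)| > \epsilon\big) \le \epsilon^{-p}\, \mathbb{E}\Big[\Big(\int_0^{\tau_K} \|(I-P_N)U_t\|_s^2\,dt\Big)^{p/2}\Big] \]
\[ \le \epsilon^{-p} \lambda_{N+1}^{p(s-s')/2}\, \mathbb{E}\Big[\Big(\int_0^{\tau_K} \|U_t\|_{s'}^2\,dt\Big)^{p/2}\Big] \le \epsilon^{-p} (KT)^{p/2} \lambda_{N+1}^{p(s-s')/2} \ll N^{-2}. \]

The Borel–Cantelli lemma implies that $Z^N_s(\tau_K) \to Z_s(\tau_K)$ a.s. Similarly, using Lemma 2.2 (i),

\[ \mathbb{P}\big(\lambda_N^{-a/2} |Z^N_{s+a}(\tau_K)| > \epsilon\big) \le \epsilon^{-p} \lambda_N^{-ap/2}\, \mathbb{E}\Big[\Big(\int_0^{\tau_K} \|U^N_t\|_{s+a}^2\,dt\Big)^{p/2}\Big] \]
\[ \le \epsilon^{-p} \lambda_N^{p(s-s')/2}\, \mathbb{E}\Big[\Big(\int_0^{\tau_K} \|U^N_t\|_{s'}^2\,dt\Big)^{p/2}\Big] \le \epsilon^{-p} (KT)^{p/2} \lambda_N^{p(s-s')/2} \ll N^{-2}, \]

where we w.l.o.g. assume that $s + a > s'$ (otherwise take $s'$ to be smaller). Again by the Borel–Cantelli lemma, $\lambda_N^{-a/2} Z^N_{s+a}(\tau_K) \to 0$ a.s.

Consequently, (2.15), (2.16) are true on the set $A_K := \{\tau_K = T\}$. The claim follows as $\bigcup_{K\in\mathbb{N}} A_K$ has probability one.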

2.3 Diffusivity Estimation

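For $k \in \mathbb{N}$, we set $\bar x^{(k)} := \langle \bar X, \Phi_k\rangle$. Then $(\bar x^{(k)})_{k\in\mathbb{N}}$ are independent one-dimensional Ornstein–Uhlenbeck processes that solve

\[ d\bar x^{(k)}_t = -\theta\lambda_k \bar x^{(k)}_t\,dt + \sigma\lambda_k^{-\gamma}\,dW^{(k)}_t, \tag{2.17} \]

$\bar x^{(k)}_0 = 0$, where $(W^{(k)})_{k\in\mathbb{N}}$ are independent Brownian motions. $\bar x^{(k)}$ has the explicit representation

\[ \bar x^{(k)}_t = \sigma\lambda_k^{-\gamma} \int_0^t e^{-\theta\lambda_k(t-r)}\,dW^{(k)}_r, \tag{2.18} \]

and consequently,

\[ \mathbb{E}\big[(\bar x^{(k)}_t)^2\big] = \frac{\sigma^2}{2\theta} \big(1 - e^{-2\theta\lambda_k t}\big) \lambda_k^{-2\gamma-1}. \tag{2.19} \]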
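For completeness, the step from (2.18) to (2.19) can be spelled out. This is a standard application of Itô's isometry, written here as a short worked computation rather than quoted from the text:

\[ \mathbb{E}\big[(\bar x^{(k)}_t)^2\big] = \sigma^2\lambda_k^{-2\gamma} \int_0^t e^{-2\theta\lambda_k(t-r)}\,dr = \sigma^2\lambda_k^{-2\gamma}\, \frac{1 - e^{-2\theta\lambda_k t}}{2\theta\lambda_k} = \frac{\sigma^2}{2\theta}\big(1 - e^{-2\theta\lambda_k t}\big)\lambda_k^{-2\gamma-1}. \]

In particular, for fixed $t > 0$ the second moment of the $k$-th mode decays like $\lambda_k^{-2\gamma-1}$ as $k \to \infty$.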

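Lemma 2.7. For any $s < s^* := 1 + 2\gamma - 1/\beta$, it holds a.s. $\bar X \in R(s)$.

Proof. For $0 < \alpha < 1/2 \wedge (s^*-s)/2$,

\[ \int_0^T t^{-2\alpha} \big\|(-A)^{s/2} e^{t\theta A} B\big\|_{HS}^2\,dt = \sigma^2 \sum_{k=1}^{\infty} \lambda_k^{s-2\gamma} \int_0^T t^{-2\alpha} e^{-2\theta\lambda_k t}\,dt \]
\[ \lesssim \sigma^2 \sum_{k=1}^{\infty} \lambda_k^{s-2\gamma} \int_0^{\infty} \Big(\frac{r}{2\theta\lambda_k}\Big)^{-2\alpha} e^{-r}\, \frac{dr}{2\theta\lambda_k} \lesssim \Gamma(1-2\alpha) \sum_{k=1}^{\infty} \lambda_k^{s-2\gamma-1+2\alpha} \lesssim \sum_{k=1}^{\infty} k^{\beta(s-2\gamma-1+2\alpha)}, \]

where $\Gamma$ denotes the Gamma function. The last sum is finite since $\beta(s-2\gamma-1+2\alpha) < -1$ for $\alpha < (s^*-s)/2$. Now, by [DPZ14, Theorem 5.11], $(-A)^{s/2}\bar X \in R(0)$, i.e. $\bar X \in R(s)$ a.s.

In fact, the proof of [DPZ14, Theorem 5.11] shows that even $(-A)^{s/2}\bar X \in R_E(0)$ in the situation of the proof of Lemma 2.7, thus $\bar X \in R_E(s)$ for $s < s^*$.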

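Proposition 2.8. With $s^* = 1 + 2\gamma - 1/\beta$, let $\eta > 0$, $s_0 < s^*$ such that $(F_{s,\eta})$ holds for any $s_0 \le s < s^*$. Assume that a.s. $X \in R(s_0)$, $X_0 \in H^{s^*+\eta}$. Then a.s. for any $s > s^*$:

\[ \int_0^T \big\|(-A)^{s/2} X^N_t\big\|^2\,dt \asymp C_s N^{1+\beta(s-2\gamma-1)} \tag{2.20} \]

with

\[ C_s = \frac{\sigma^2 T \Lambda^{s-2\gamma-1}}{2\theta(1 + \beta(s-2\gamma-1))}. \tag{2.21} \]

Proof. Integrating (2.19), we see that

\[ \mathbb{E}\int_0^T (\bar x^{(k)}_t)^2\,dt \asymp \frac{\sigma^2 T}{2\theta} \lambda_k^{-2\gamma-1}, \tag{2.22} \]

thus, using $\lambda_k \asymp \Lambda k^{\beta}$,

\[ \mathbb{E}\int_0^T \big\|(-A)^{s/2} \bar X^N_t\big\|^2\,dt = \sum_{k=1}^{N} \lambda_k^s\, \mathbb{E}\int_0^T (\bar x^{(k)}_t)^2\,dt \asymp \frac{\sigma^2 T}{2\theta} \sum_{k=1}^{N} \lambda_k^{s-2\gamma-1} \asymp \frac{\sigma^2 T \Lambda^{s-2\gamma-1}}{2\theta(1+\beta(s-2\gamma-1))} N^{1+\beta(s-2\gamma-1)}. \]

Lemma A.2 (ii), with $X^*_k(t) = \lambda_k^{s/2} \bar x^{(k)}_t$ in the notation therein, immediately gives that (2.20) is true for $X^N$ replaced by $\bar X^N$. Now, for any $0 < \epsilon < \eta$, by Proposition 2.4:

\[ \int_0^T \big\|(-A)^{s/2} \widetilde X^N_t\big\|^2\,dt \lesssim \lambda_N^{s-s^*-\eta+\epsilon} \int_0^T \big\|\widetilde X^N_t\big\|_{s^*+\eta-\epsilon}^2\,dt \lesssim N^{1+\beta(s-2\gamma-1-\eta+\epsilon)}. \]

This is negligible compared to the right-hand side of (2.20), and the claim follows from expanding the square on the left-hand side of (2.20) together with the Cauchy–Schwarz inequality for the mixed term.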
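The deterministic part of Proposition 2.8, namely that the expected energy of the linear part grows like $C_s N^{1+\beta(s-2\gamma-1)}$, can be checked numerically. The sketch below assumes exact eigenvalues $\lambda_k = \Lambda k^{\beta}$ (the text only assumes this asymptotically) and uses the exact time integral of (2.19); the function names are hypothetical.

```python
import numpy as np

def expected_energy(N, s, gamma, beta, Lam, theta, sigma, T):
    """E int_0^T ||(-A)^{s/2} Xbar^N_t||^2 dt, summing the time integral of (2.19) over k <= N."""
    k = np.arange(1, N + 1, dtype=float)
    lam = Lam * k ** beta                 # assumption: lambda_k = Lambda * k^beta exactly
    # int_0^T (1 - exp(-2 theta lam t)) dt in closed form
    time_int = T - (1.0 - np.exp(-2.0 * theta * lam * T)) / (2.0 * theta * lam)
    return float(np.sum(lam ** (s - 2.0 * gamma - 1.0) * sigma ** 2 / (2.0 * theta) * time_int))

def leading_term(N, s, gamma, beta, Lam, theta, sigma, T):
    """C_s * N^{1 + beta (s - 2 gamma - 1)} with C_s as in (2.21)."""
    e = s - 2.0 * gamma - 1.0
    C_s = sigma ** 2 * T * Lam ** e / (2.0 * theta * (1.0 + beta * e))
    return C_s * N ** (1.0 + beta * e)
```

For parameters with $s > s^*$, the ratio `expected_energy(N, ...) / leading_term(N, ...)` should approach one as `N` grows; the almost sure statement (2.20) additionally needs the concentration argument from Lemma A.2.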

Remark 2.9. In the setting of the previous proposition, it is immediate that for the limit case $s = s^*$, we have a.s.

\[ \int_0^T \big\|(-A)^{s^*/2} X^N_t\big\|^2\,dt \asymp \frac{\sigma^2 T}{2\theta \Lambda^{1/\beta}} \ln(N), \tag{2.23} \]

with obvious changes in the proof. In particular, as the right-hand side diverges, $X \notin R(s^*)$, and the regularity from Lemma 2.7 is optimal.

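Next, we derive three maximum–likelihood type estimators for $\theta$ (cf. [CGH11]). The projected process $X^N = P_N X$ induces a measure $\mathbb{P}^{N,T}_{\theta}$ on the path space $C(0,T;P_N H) \simeq C(0,T;\mathbb{R}^N)$ for each value of the diffusivity $\theta > 0$. If we fix an arbitrary reference parameter $\theta_0 > 0$ and assume that each of the measures $(\mathbb{P}^{N,T}_{\theta})_{\theta>0}$ is absolutely continuous with respect to $\mathbb{P}^{N,T}_{\theta_0}$, we obtain a likelihood that we can use for statistical inference. According to [LS77, Section 7.6.4], the log–likelihood is formally given by

\[ \ln \frac{d\mathbb{P}^{N,T}_{\theta}}{d\mathbb{P}^{N,T}_{\theta_0}}(X^N) = \frac{1}{\sigma^2} \int_0^T \big\langle (\theta-\theta_0) A X^N_t, (-A)^{2\gamma}\,dX^N_t \big\rangle - \frac{1}{2\sigma^2} \int_0^T \big\langle (\theta-\theta_0) A X^N_t, (-A)^{2\gamma} \big[(\theta+\theta_0) A X^N_t + 2 P_N F(X)(t)\big] \big\rangle\,dt. \]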

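This is rigorous if $P_N F = F P_N$, otherwise it should be considered as a natural (but heuristic) approach. Maximizing for $\theta$ yields the following maximum likelihood–type estimator:

\[ \hat\theta^{\mathrm{full}}_N := -\frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, dX^N_t\rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt} + \frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, P_N F(X)(t)\rangle\,dt}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt}, \tag{2.24} \]

where we substituted $\gamma$ by an additional parameter $\alpha$. This estimator depends on $P_N F(X)$ and is therefore not closed in $X^N$. It can be modified as follows:

\[ \hat\theta^{\mathrm{part}}_N := -\frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, dX^N_t\rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt} + \frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, P_N F(X^N)(t)\rangle\,dt}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt}. \tag{2.25} \]

Finally, the nonlinear term can be left out completely:

\[ \hat\theta^{\mathrm{lin}}_N := -\frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, dX^N_t\rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt}. \tag{2.26} \]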

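Note that the stochastic integral appearing in the numerator of each of the estimators has a robust representation

\[ \int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, dX^N_t \big\rangle = \sum_{k=1}^{N} \lambda_k^{1+2\alpha} \int_0^T x^{(k)}_t\,dx^{(k)}_t = \frac{1}{2} \sum_{k=1}^{N} \lambda_k^{1+2\alpha} \Big( (x^{(k)}_T)^2 - (x^{(k)}_0)^2 - \sigma^2 \lambda_k^{-2\gamma} T \Big), \]

so it is a function of a single trajectory of $X^N$ alone.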
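To illustrate how the robust representation is used in practice, the following sketch evaluates the estimator (2.26) on simulated data. It is not the numerical setup of the thesis: it assumes the purely linear case $F = 0$, $X_0 = 0$, eigenvalues $\lambda_k = (\pi k)^2$, an exact Ornstein–Uhlenbeck transition for each mode as in (2.17), and a trapezoidal rule for the time integral in the denominator. All function names are hypothetical.

```python
import numpy as np

def simulate_modes(N, theta, sigma, gamma, T, n_steps, rng):
    """Simulate the first N modes of (2.17) on a uniform time grid with exact OU transitions."""
    k = np.arange(1, N + 1)
    lam = (np.pi * k) ** 2                       # assumed eigenvalues of -A
    dt = T / n_steps
    decay = np.exp(-theta * lam * dt)
    # standard deviation of the exact one-step transition of each mode
    step_sd = sigma * lam ** (-gamma) * np.sqrt((1.0 - decay ** 2) / (2.0 * theta * lam))
    x = np.zeros((n_steps + 1, N))
    noise = rng.standard_normal((n_steps, N))
    for i in range(n_steps):
        x[i + 1] = decay * x[i] + step_sd * noise[i]
    return lam, x, dt

def theta_lin(lam, x, dt, sigma, gamma, alpha):
    """Estimator (2.26): numerator from the robust representation, denominator by trapezoid rule."""
    T = dt * (x.shape[0] - 1)
    ito = 0.5 * (x[-1] ** 2 - x[0] ** 2 - sigma ** 2 * lam ** (-2.0 * gamma) * T)
    numerator = -np.sum(lam ** (1.0 + 2.0 * alpha) * ito)
    sq = x ** 2
    time_int = dt * (sq[1:-1].sum(axis=0) + 0.5 * (sq[0] + sq[-1]))
    denominator = np.sum(lam ** (2.0 + 2.0 * alpha) * time_int)
    return float(numerator / denominator)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    lam, x, dt = simulate_modes(N=15, theta=1.0, sigma=1.0, gamma=0.5, T=1.0, n_steps=50_000, rng=rng)
    print(theta_lin(lam, x, dt, sigma=1.0, gamma=0.5, alpha=0.5))
```

The time step has to resolve the fastest mode, i.e. `dt` should be small compared to $1/(\theta\lambda_N)$; otherwise the discretized denominator no longer approximates the continuous-time integral.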

The aim of this section is to study the asymptotic properties of these estimators as $N \to \infty$. Note that one cannot directly apply the general theory for maximum likelihood estimation, as exposed e.g. in [IH81], because for $P_N F \neq F P_N$, none of these estimators is the true MLE.

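Lemma 2.10. Let $s \in \mathbb{R}$, $\epsilon > 0$. For any process $U \in R(s)$, we have a.s.:

\[ \int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N U_t \big\rangle\,dt \lesssim N^{\frac{1}{2} + \beta(2\alpha - \gamma + \frac{1}{2} - \frac{s}{2} + \frac{\epsilon}{2})}. \tag{2.27} \]

In particular, if $s = s^* - 2 + \eta - \epsilon$ for some $\eta > 0$, then

\[ \int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N U_t \big\rangle\,dt \lesssim N^{1 + \beta(2\alpha - 2\gamma + 1 - \frac{\eta}{2} + \epsilon)}. \tag{2.28} \]

Proof. This follows from

\[ \int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N U_t \big\rangle\,dt \le \Big( \int_0^T \|X^N_t\|_{2+4\alpha-s}^2\,dt \int_0^T \|P_N U_t\|_s^2\,dt \Big)^{1/2} \lesssim \Big( \lambda_N^{2+4\alpha-s-s^*+\epsilon} \int_0^T \|X^N_t\|_{s^*-\epsilon}^2\,dt \Big)^{1/2} \]
\[ \lesssim \lambda_N^{2\alpha - \gamma - \frac{s}{2} + \frac{1}{2} + \frac{\beta^{-1}}{2} + \frac{\epsilon}{2}} \lesssim N^{\frac{1}{2} + \beta(2\alpha - \gamma + \frac{1}{2} - \frac{s}{2} + \frac{\epsilon}{2})}. \]

An estimator $\hat\theta_N$ for $\theta$ is called strongly consistent if a.s. $\hat\theta_N \to \theta$.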

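Theorem 2.11. Let $\eta > 0$, $s_0 \in \mathbb{R}$ such that $(F_{s,\eta})$ is true for $s_0 \le s < s^*$. Assume that $X_0 \in H^{s^*+\eta}$ and $X \in R(s_0)$. Let $\alpha > \gamma - (1 + 1/\beta)/4$.

(i) $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are strongly consistent estimators of $\theta$.

(ii) $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal:

\[ N^{\frac{1+\beta}{2}} \big(\hat\theta^{\mathrm{full}}_N - \theta\big) \xrightarrow{d} \mathcal{N}(0, \Sigma), \tag{2.29} \]

where

\[ \Sigma = \frac{2\theta (1 + \beta(2\alpha - 2\gamma + 1))^2}{T\Lambda (1 + \beta(4\alpha - 4\gamma + 1))}. \tag{2.30} \]

(iii) If $\eta > 1 + \beta^{-1}$, then $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are asymptotically normal as in (2.29). Otherwise, for any $a < \beta\eta/2$,

\[ \hat\theta^{\mathrm{part}}_N = \theta + o(N^{-a}) \tag{2.31} \]

almost surely, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.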

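Proof.

(i) This is a consequence of (ii) and (iii).

(ii) Plugging in the dynamics of $X^N$ into $\hat\theta^{\mathrm{full}}_N$, we obtain

\[ \hat\theta^{\mathrm{full}}_N - \theta = -\sigma \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, dW^N_t\rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt} = -\sigma \frac{C_{2+4\alpha-2\gamma}^{1/2} N^{1/2+\beta(2\alpha-2\gamma+1/2)}}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt}\, M^N_T, \]

where $M^N_t = C_{2+4\alpha-2\gamma}^{-1/2} N^{-1/2-\beta(2\alpha-2\gamma+1/2)} \int_0^t \langle (-A)^{1+2\alpha-\gamma} X^N_r, dW^N_r\rangle$ is a local martingale. By Proposition 2.8 with $s = 2 + 4\alpha - 2\gamma$ it holds $\langle M^N\rangle_T \to 1$ in probability, thus $M^N_T / \sqrt{\langle M^N\rangle_T} \to \mathcal{N}(0,1)$ in distribution as $N \to \infty$ by Theorem A.1. An application of Slutsky's lemma together with Proposition 2.8 with $s = 2 + 2\alpha$ gives

\[ N^{\frac{1+\beta}{2}} \big(\hat\theta^{\mathrm{full}}_N - \theta\big) \xrightarrow{d} \mathcal{N}\big(0, \sigma^2 C_{2+4\alpha-2\gamma} / C_{2+2\alpha}^2\big). \tag{2.32} \]

(iii) We write

\[ \hat\theta^{\mathrm{part}}_N - \theta = -\sigma \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, dW^N_t\rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt} - \mathrm{bias}_N(X) + \mathrm{bias}_N(X^N), \]
\[ \hat\theta^{\mathrm{lin}}_N - \theta = -\sigma \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, dW^N_t\rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt} - \mathrm{bias}_N(X), \]

where, with $Y = X$ or $Y = X^N$,

\[ \mathrm{bias}_N(Y) = \frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, P_N F(Y)(t)\rangle\,dt}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt}. \tag{2.33} \]

Let $\epsilon > 0$ and $s = s^* + \eta - 2 - \epsilon = 2\gamma - 1 - \beta^{-1} + \eta - \epsilon$. Using condition $(F_{s^*-2\epsilon,\eta})$, we have that $F(X) \in R(s)$. By Lemma 2.10,

\[ \int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N F(Y)(t) \big\rangle\,dt \lesssim N^{1+\beta(2\alpha-2\gamma+1-\frac{\eta}{2}+\epsilon)}, \]

and using Proposition 2.8, a.s.

\[ \mathrm{bias}_N(Y) \asymp C_{2+2\alpha}^{-1} N^{-1-\beta(2\alpha-2\gamma+1)} \int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N F(Y)(t) \big\rangle\,dt \lesssim N^{-\frac{\beta}{2}(\eta-2\epsilon)}. \]

Consequently, $N^a\,\mathrm{bias}_N(Y) \to 0$ a.s. for any $a < \beta\eta/2$.

Now, if $\eta > 1 + 1/\beta$, let in addition $\epsilon < (\eta - 1 - 1/\beta)/2$, then we see that a.s. $N^{(1+\beta)/2}\,\mathrm{bias}_N(Y) \to 0$, and asymptotic normality follows from an application of the Slutsky lemma.

Otherwise, in case $\eta \le 1 + 1/\beta$, we have by Lemma 2.6 (setting $a = 2 + 4\alpha - 2\gamma - s^* + \epsilon'$ for any $\epsilon' > 0$ in the notation therein) that

\[ N^b\, \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, dW^N_t\rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt} \to 0 \tag{2.34} \]

almost surely for any $b < (1+\beta)/2$, and (2.31) is immediate.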

Any asymptotically normal estimator in Theorem 2.11 allows one to construct asymptotic confidence intervals for $\theta$ by using quantiles of the approximating normal distribution $\mathcal{N}(\theta, N^{-(1+\beta)}\Sigma)$ for fixed $N \in \mathbb{N}$. However, the asymptotic variance $\Sigma$ depends linearly on the unknown parameter $\theta$. Thus, in order to construct asymptotic confidence intervals for the diffusivity that do not depend on unknown quantities, the variance itself has to be estimated consistently using any of the three estimators. This is justified by Slutsky's lemma. Alternatively, a variance-stabilizing transform can be used [vdV98, Section 3.2].

Consistency of any of the three estimators implies that the measures on $C(0,T;H)$ generated by $X$ for different values of $\theta > 0$ are mutually singular.

In the setting of this section, it is possible to determine the precise rate of almost sure convergence of the estimators by a law of the iterated logarithm:
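The plug-in construction described here can be sketched as follows. The code assumes that $T$, $\Lambda$, $\beta$, $\alpha$ and $\gamma$ are known and that `theta_hat` comes from an asymptotically normal estimator; the function names are hypothetical, and the variance formula is (2.30) with $\theta$ replaced by its estimate.

```python
import math
from statistics import NormalDist

def asymptotic_variance(theta, T, Lam, beta, alpha, gamma):
    """Sigma from (2.30); requires alpha > gamma - (1 + 1/beta)/4."""
    a = 1.0 + beta * (2.0 * alpha - 2.0 * gamma + 1.0)
    b = 1.0 + beta * (4.0 * alpha - 4.0 * gamma + 1.0)
    if b <= 0.0:
        raise ValueError("need alpha > gamma - (1 + 1/beta)/4")
    return 2.0 * theta * a ** 2 / (T * Lam * b)

def confidence_interval(theta_hat, N, T, Lam, beta, alpha, gamma, level=0.95):
    """Plug-in interval based on the approximation N(theta, N^{-(1+beta)} Sigma)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    sigma_hat = asymptotic_variance(theta_hat, T, Lam, beta, alpha, gamma)
    half_width = z * math.sqrt(sigma_hat) * N ** (-(1.0 + beta) / 2.0)
    return theta_hat - half_width, theta_hat + half_width
```

For example, `confidence_interval(1.0, 100, 1.0, 1.0, 2.0, 0.5, 0.5)` returns an interval centred at the estimate whose width shrinks like $N^{-(1+\beta)/2}$.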

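Theorem 2.12. Let $\eta > 0$, $s_0 \in \mathbb{R}$ such that $(F_{s,\eta})$ is true for $s_0 \le s < s^*$. Assume $X_0 \in H^{s^*+\eta}$ and $X \in R(s_0)$. Let $\alpha > \gamma - (1 + 1/\beta)/4$. Then a.s.

\[ \limsup_{N\to\infty} \frac{N^{\frac{1+\beta}{2}}}{\sqrt{\ln(\ln(N))}} \big|\hat\theta^{\mathrm{full}}_N - \theta\big| = \sqrt{2\Sigma} \tag{2.35} \]

with $\Sigma$ as in (2.30). If $\eta > 1 + 1/\beta$, then (2.35) is true for $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ as well.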

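Proof. We write

\[ \bar M^N_t := \int_0^t \big\langle (-A)^{1+2\alpha-\gamma} \bar X^N_r, dW^N_r \big\rangle, \qquad \widetilde M^N_t := \int_0^t \big\langle (-A)^{1+2\alpha-\gamma} \widetilde X^N_r, dW^N_r \big\rangle. \]

Then

\[ \bar M^N_T = \sum_{k=1}^{N} \lambda_k^{1+2\alpha-\gamma} \int_0^T \bar x^{(k)}_t\,dW^{(k)}_t =: \sum_{k=1}^{N} Z_k. \]

We show that we can apply the law of the iterated logarithm for independent, not necessarily identically distributed random variables from [Wit85]² to $(Z_k)_{k\in\mathbb{N}}$. To this end, we write

\[ s_N := \Big( \mathbb{E}\Big[\sum_{k=1}^{N} Z_k^2\Big] \Big)^{1/2} = \Big( \mathbb{E}\int_0^T \big\|(-A)^{1+2\alpha-\gamma} \bar X^N_t\big\|^2\,dt \Big)^{1/2}. \]

Then clearly $s_N \to \infty$ and $s_N \asymp s_{N+1}$. Using the Burkholder–Davis–Gundy inequality, Jensen's inequality, Gaussianity of $\bar x^{(k)}$ and (2.19), we have

\[ \mathbb{E}\Big[\Big|\int_0^T \bar x^{(k)}_t\,dW^{(k)}_t\Big|^3\Big] \lesssim \mathbb{E}\Big[\Big(\int_0^T (\bar x^{(k)}_t)^2\,dt\Big)^{3/2}\Big] \lesssim \mathbb{E}\int_0^T \big|\bar x^{(k)}_t\big|^3\,dt \lesssim \int_0^T \mathbb{E}\big[(\bar x^{(k)}_t)^2\big]^{3/2}\,dt \lesssim \big(\lambda_k^{-2\gamma-1}\big)^{3/2}. \]

Consequently,

\[ \sum_{k=1}^{N} \frac{\mathbb{E}[|Z_k|^3]}{s_k^3} \lesssim \sum_{k=1}^{N} \Big(\frac{\lambda_k^{2\alpha-2\gamma+1/2}}{s_k}\Big)^3 \lesssim \sum_{k=1}^{N} k^{3\beta(2\alpha-2\gamma+1/2) - 3/2 - 3\beta(2\alpha-2\gamma+1/2)} = \sum_{k=1}^{N} k^{-3/2}, \]

which converges for $N \to \infty$. Therefore all the conditions from [Wit85] are satisfied, and using $\ln\ln s_N^2 \asymp \ln\ln N$, we conclude that a.s.

\[ \limsup_{N\to\infty} \frac{\bar M^N_T}{\sqrt{2\ln\ln N}\,\sqrt{\langle \bar M^N\rangle_T}} = \limsup_{N\to\infty} \frac{\bar M^N_T}{\sqrt{2 s_N^2 \ln\ln s_N^2}} = 1. \]

In particular, $\limsup_{N\to\infty} N^{(1+\beta)/2} (\ln\ln N)^{-1/2}\, \bar M^N_T / \int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt = (2\Sigma)^{1/2}/\sigma$. Further, by Lemma 2.6 (with $a = 2 + 4\alpha - 2\gamma - s^*$ in the notation therein), $N^{(1+\beta)/2}\, \widetilde M^N_T / \int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt \to 0$ almost surely, where we have used that $\widetilde X \in R(s^* + \eta - \epsilon)$ for every $\epsilon > 0$. With $\hat\theta^{\mathrm{full}}_N - \theta = -\sigma(\bar M^N_T + \widetilde M^N_T) / \int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt$, (2.35) is proven. The statement concerning $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ follows from the proof of Theorem 2.11 (iii).

² Although it is sufficient for our purposes, this law of the iterated logarithm can be further weakened, see e.g. [Wit87] and [Che93] for a discussion.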

Remark 2.13.

(i) A direct calculation shows that the asymptotic variance is minimal for $\alpha = \gamma$. In this case, $\Sigma = 2\theta(1+\beta)/(T\Lambda)$. However, the estimators are robust to the case $\alpha \neq \gamma$, when the spatial regularity of $X$ is wrongly specified.

(ii) In fact, continuous observation on $[0,T]$ of any of the modes $x^{(k)}$ allows one to reconstruct $\gamma$ precisely via the quadratic variation $\langle x^{(k)}\rangle_T = \sigma^2 \lambda_k^{-2\gamma} T$, if $\sigma$ and $\lambda_k$ are known.

(iii) $\Sigma$ depends linearly on $T^{-1}$, i.e. observation on a large time interval improves the estimate. This corresponds to the large time asymptotics with rate $T^{-1/2}$, which is well-known from statistics for stochastic differential equations under ergodicity assumptions, see e.g. [Kut04].

(iv) Following the formalism of the (heuristic) maximum–likelihood approach, the term $\sigma^{-2} \int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt$ (for $\alpha = \gamma$) can be considered the observed Fisher information.

(v) The convergence rate of $\hat\theta^{\mathrm{full}}_N$ is upper bounded by $N^{-1/2}$ independently of the dimension $d$.

(vi) The convergence rate (2.31) cannot be improved for $\hat\theta^{\mathrm{lin}}_N$, see Section 2.4.1.

(vii) In [PS20], a condition of the type $(F^v_{s,\eta})$ has been used to infer higher regularity of $\widetilde X$. Compared to $(F_{s,\eta})$, this condition is more restrictive and yields lower excess regularity $\eta$ in examples (cf. Lemma 2.22 below). Consequently, the lower bounds on the convergence rate of $\hat\theta^{\mathrm{lin}}_N$ were too pessimistic. An additional Lipschitz condition on $F$ has been used in order to reduce the asymptotic behavior of $\hat\theta^{\mathrm{part}}_N$ directly to that of $\hat\theta^{\mathrm{full}}_N$, leading to better convergence rates of $\hat\theta^{\mathrm{part}}_N$ compared to $\hat\theta^{\mathrm{lin}}_N$, namely, the same rates as stated in Theorem 2.11. In the formalism used in this chapter, this additional Lipschitz condition is no longer necessary for statistical purposes, as both $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ obtain the mentioned rates using $(F_{s,\eta})$ alone.

(viii) If $F$ satisfies $(F_{s,\eta})$ and $P_N F - P_N F P_N$ satisfies $(F_{s,\eta'})$ for some $\eta' > \eta$, then the convergence rate of $\hat\theta^{\mathrm{part}}_N$ can be further improved. This is trivially the case if $[P_N, F] := P_N F - F P_N = 0$; in this case $\hat\theta^{\mathrm{part}}_N$ coincides with $\hat\theta^{\mathrm{full}}_N$.

(ix) Consider the situation that $A$ and $F$ are (pseudo-)differential operators on a domain $\mathcal{D} \subset \mathbb{R}^d$. Theorem 2.11 implies that in order to identify $\theta$ in finite time, it is sufficient that $F$ is of lower order compared to the leading order drift term $\theta A$ (this is discussed in detail in Section 2.4). However, parameters describing the intensity of lower order terms may be identified in finite time as well. Consequently, it is possible that $\theta$ remains identifiable if $F$ is of higher order than $\theta A$. In [HR95], a characterization for linear $F$ that commute with $A$ is given: $\theta$ is estimated consistently (and asymptotically normally) by an estimator of the type $\hat\theta^{\mathrm{full}}_N$ if and only if $\mathrm{order}(A) \ge (\mathrm{order}(\theta A + F) - d)/2$, i.e. $\mathrm{order}(F) \le 2\,\mathrm{order}(A) + d$. This has been extended by subsequent works on the spectral approach, cf. [LR99, LR00] for the case of non-commuting operators.

(x) As only pathwise properties of the nonlinear process $\widetilde X$ are needed, $F$ may, in fact, depend on the realization $\omega \in \Omega$.
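The "direct calculation" behind Remark 2.13 (i) can be sketched as follows; the substitution $u$ is introduced here only for illustration. With $u := 2\beta(\alpha - \gamma)$, formula (2.30) reads

\[ \Sigma(u) = \frac{2\theta}{T\Lambda} \cdot \frac{(1+\beta+u)^2}{1+\beta+2u}, \qquad \frac{d}{du}\, \frac{(1+\beta+u)^2}{1+\beta+2u} = \frac{2u\,(1+\beta+u)}{(1+\beta+2u)^2}. \]

On the admissible range $1+\beta+2u > 0$ (equivalent to $\alpha > \gamma - (1+1/\beta)/4$) we have $1+\beta+u > 0$, so the derivative has the sign of $u$. Hence $\Sigma$ is minimal at $u = 0$, i.e. $\alpha = \gamma$, with minimal value $2\theta(1+\beta)/(T\Lambda)$.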

2.4 Discussion of Examples

Next, we discuss the validity of condition $(F_{s,\eta})$ and the resulting statements concerning diffusivity estimation for models with different nonlinear term $F$. These examples are by no means exhaustive. Note that more complicated nonlinearities can be decomposed into their elementary building blocks in the following sense:

Lemma 2.14. Let $s \in \mathbb{R}$, $\eta > 0$.

(i) If $F_1, F_2$ satisfy $(F_{s,\eta})$, then the same is true for $F_1 + F_2$.

(ii) Let $\eta' > 0$ and $s' := s + \eta - 2$. If $F$ satisfies $(F_{s,\eta})$ and $G$ satisfies $(F_{s',\eta'})$, then $G \circ F$ satisfies $(F_{s,\eta+\eta'-2})$.

Proof. All statements are clear from (2.10).

Consequently, a broad class of models where the present theory is applicable can be constructed from elementary components, e.g. polynomial nonlinearities (or related reaction terms), differential operators acting in spatial direction (advection or fractional diffusion), integration in time (delay terms as in [DPZ14, Example 5.6]). In the next sections, we consider different models in detail.

2.4.1 Linear Perturbations

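For $r < 2$ and $c \in \mathbb{R}$, consider

\[ dX_t = \theta A X_t\,dt + c(-A)^{r/2} X_t\,dt + B\,dW_t \]

with initial condition $X_0 \in H^{s^*+2}$. Here, $F(X) = c(-A)^{r/2} X$. If $c \neq 0$, then $F: H^{s+r} \to H^s$ is an isomorphism for any $s \in \mathbb{R}$. In particular, $(F_{s,\eta})$ is true for all $s \in \mathbb{R}$ and $\eta < 2 - r$. In this setting, $\hat\theta^{\mathrm{full}}_N$ coincides with $\hat\theta^{\mathrm{part}}_N$. We have:

Theorem 2.15. Let $\alpha > \gamma - (1 + 1/\beta)/4$. Then $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal as in (2.29). Furthermore:

(i) If $r < 1 - 1/\beta$, then $\hat\theta^{\mathrm{lin}}_N$ is asymptotically normal as in (2.29).

(ii) If $r = 1 - 1/\beta$, then

\[ N^{\frac{1+\beta}{2}} \big(\hat\theta^{\mathrm{lin}}_N - \theta\big) \xrightarrow{d} \mathcal{N}(\kappa, \Sigma), \tag{2.36} \]

where

\[ \kappa = -c\Lambda^{r/2-1}\, \frac{1 + \beta(2\alpha - 2\gamma + 1)}{1 + \beta(2\alpha - 2\gamma + r/2)} \tag{2.37} \]

and $\Sigma$ is given by (2.30).

(iii) If $r > 1 - 1/\beta$, then a.s.

\[ N^{\beta(1-r/2)} \big(\hat\theta^{\mathrm{lin}}_N - \theta\big) \to \kappa. \tag{2.38} \]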

Proof. (i) is a direct consequence of Theorem 2.11. For (ii), (iii), it suffices to understand the exact asymptotics of the bias term involving $F$ in the setting of Theorem 2.11. Due to $\alpha > \gamma - (1 + 1/\beta)/4$ together with $r \ge 1 - 1/\beta$, it holds $1 + \beta(2\alpha - 2\gamma + r/2) > 0$. Using Proposition 2.8 and the notation therein, it holds a.s.

\[ \frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, P_N F(X)(t)\rangle\,dt}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt} = c\, \frac{\int_0^T \|(-A)^{1/2+r/4+\alpha} X^N_t\|^2\,dt}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2\,dt} \asymp c\, \frac{C_{1+r/2+2\alpha} N^{1+\beta(2\alpha-2\gamma+r/2)}}{C_{2+2\alpha} N^{1+\beta(2\alpha-2\gamma+1)}} \]
\[ = c\Lambda^{r/2-1}\, \frac{1 + \beta(2\alpha - 2\gamma + 1)}{1 + \beta(2\alpha - 2\gamma + r/2)}\, N^{-\beta(1-r/2)}. \]

Now (ii) is immediate, and (2.38) holds w.r.t. convergence in probability. Finally, Lemma 2.6 yields almost sure convergence in (2.38) by the same argument used in the proof of Theorem 2.11 (iii).

Remark 2.16. In particular, the convergence rate (2.31) for $\hat\theta^{\mathrm{lin}}_N$ as stated in Theorem 2.11 cannot be improved.

In case $\beta = 2/d$ for $d \in \mathbb{N}$, the critical condition $r < 1 - 1/\beta$ is equivalent to $r < 1 - d/2$, i.e. the critical order of $F$ decreases with the dimension. Note that $r$ is allowed to be negative here. We highlight two cases, which will be refined in the next sections:

• Perturbation of order zero ($r = 0$): In $d = 1$, $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are asymptotically normal. In $d = 2$, $\hat\theta^{\mathrm{lin}}_N$ still converges to $\theta$ with optimal rate. In $d \ge 3$, the convergence rate of $\hat\theta^{\mathrm{lin}}_N$ declines.

• Perturbation of order one ($r = 1$): In any dimension $d \ge 1$ the convergence rate of $\hat\theta^{\mathrm{lin}}_N$ declines compared to that of $\hat\theta^{\mathrm{full}}_N$, but all estimators stay consistent.
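The case distinction of Theorem 2.15 for $\hat\theta^{\mathrm{lin}}_N$ can be summarised in a small helper. This is only an illustrative sketch: the function name and the regime labels are made up here, and the returned number is the exponent $\rho$ of the rate $N^{-\rho}$ read off from (2.29), (2.36) and (2.38).

```python
import math

def lin_estimator_regime(r, beta):
    """Classify the behaviour of the linear estimator for F(X) = c (-A)^{r/2} X with c != 0.

    Returns a (label, rho) pair, where N^{-rho} is the convergence rate.
    """
    if r >= 2:
        raise ValueError("Theorem 2.15 requires r < 2")
    critical = 1.0 - 1.0 / beta           # critical order 1 - 1/beta
    clt_rate = (1.0 + beta) / 2.0         # rate exponent of the central limit theorem (2.29)
    if math.isclose(r, critical, abs_tol=1e-12):
        return "normal with asymptotic bias kappa", clt_rate     # case (ii)
    if r < critical:
        return "asymptotically normal", clt_rate                 # case (i)
    return "bias-dominated", beta * (1.0 - r / 2.0)              # case (iii)
```

With $\beta = 2/d$, calling `lin_estimator_regime(0.0, 2.0 / d)` for $d = 1, 2, 3$ reproduces the three situations listed for perturbations of order zero.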

2.4.2 Reaction-Diffusion Equations

Let $d \ge 1$, and let $\mathcal{D} \subset \mathbb{R}^d$ be a bounded domain with smooth boundary. Let $f: \mathbb{R} \to \mathbb{R}$ be a real-valued function. Consider

\[ dX_t = \theta\Delta X_t\,dt + f(X_t)\,dt + B\,dW_t, \tag{2.39} \]

together with initial condition $X_0$ such that $\mathbb{E}\|X_0\|_{s^*+2}^p < \infty$ is true for any $p \ge 1$. W.l.o.g. we assume Dirichlet boundary conditions, i.e. $X_t = 0$ on the boundary $\partial\mathcal{D}$ for $0 \le t \le T$. Set $H = L^2(\mathcal{D})$. The leading order linear operator $A$ is given by $\Delta: D(\Delta) \to H$, where $D(\Delta) = W^{2,2}(\mathcal{D}) \cap W^{1,2}_0(\mathcal{D})$. The regularity scale is given by $H^s = D((-\Delta)^{s/2})$, such that $H^s$ consists of functions of $L^2$-Sobolev regularity $s$. It is always true that $W^{s,2}_0(\mathcal{D}) \subseteq H^s \subseteq W^{s,2}(\mathcal{D})$, $s \ge 0$. For $s \in \mathbb{N}$, a precise characterization of $H^s$ in terms of boundary trace operators can be given [Tho06, Lemma 3.1]. Further, it is well-known that $\lambda_k \asymp \Lambda k^{\beta}$ with $\beta = 2/d$, see [Wey11] or [Shu01, Section 13.4].

We consider either of the following two structural assumptions:

(i) $f$ is a polynomial of odd degree and negative leading coefficient, i.e. for $m \in 2\mathbb{N} - 1$, there are $a_0, \dots, a_m \in \mathbb{R}$ with $a_m < 0$ such that

\[ f(x) = a_m x^m + \dots + a_1 x + a_0. \tag{2.40} \]

(ii) $f$ is a bounded smooth function with bounded derivatives of any order:

\[ f \in C^{\infty}_b(\mathbb{R}). \tag{2.41} \]

Reaction–Diffusion models exhibit a broad variety of different dynamical

features. Nonetheless, in terms of diffusivity estimation, they can be treated

in a unified way, as explained in Theorem 2.20 below.

Proposition 2.17. Let $d \le 3$ and $\gamma > d/4 + 1/2$. Consider either of the following two situations:

• Let $f$ be a polynomial as in (2.40). If $d = 3$, assume additionally that $f$ is at most of third order (i.e. $m \le 3$).

• Let $f$ be a smooth function with bounded derivatives as in (2.41).

Then there is a unique solution $X$ to (2.39) with $X \in R_E(s)$ for some $s > d/2$.

The proof relies on [LR15] and is given in Appendix B.1. The condition $\gamma > d/4 + 1/2$ means that $B$ is a Hilbert–Schmidt operator from $H$ into $V$. This is needed because in the proof of Proposition 2.17, coercivity is verified directly in $V$ instead of $H$. However, there are situations where this requirement can be relaxed to $\gamma > d/4$, i.e. $B$ is of Hilbert–Schmidt type from $H$ into $H$:

Proposition 2.18. Let $d = 1$, $\gamma > 1/4$ and $f(x) = x - x^3$.

(i) There is a unique solution $X \in R^v(1)$ to (2.39).

(ii) $(F^v_{s,\eta})$ holds for $s = 1$ and $\eta = 1$. In particular, even $X \in R(1)$.

Note that condition $(F^v_{s,\eta})$ instead of $(F_{s,\eta})$ is used in order to prove the basic regularity result $X \in R(1)$. From there, $(F_{s,\eta})$ can be used to infer higher regularity.

Proof.

(i) This is a special case of [LR15, Example 5.1.8].

(ii) $(F^v_{s,\eta})$ is true due to

\[ \|f(X)\|_{s+\eta-2}^2 = \|f(X)\|_{L^2(\mathcal{D})}^2 \lesssim \|X\|_{L^2(\mathcal{D})}^2 + \|X\|_{L^6(\mathcal{D})}^6 \lesssim \|X\|_1^2 + \|X\|_{1/3}^6 \lesssim \|X\|_1^2 + \|X\|_1^2 \|X\|_0^4 = \|X\|_1^2 \big(1 + \|X\|_0^4\big), \]

where we used $\|X^3\|_{L^2(\mathcal{D})}^2 = \|X\|_{L^6(\mathcal{D})}^6$ together with the Sobolev embedding $H^{1/3} \subset L^6(\mathcal{D})$ in $d = 1$ [AF03]. Proposition 2.5 implies $\widetilde X \in R^v(2) \subset R(1)$. Together with $\bar X \in R(1)$ due to $\gamma > 1/4$, this implies the claim.

Since $d = 1$ in the previous proposition, we can rephrase the existence result: In particular, we have $X \in R(s)$ for some $s > d/2$, exactly as in Proposition 2.17. This condition $s > d/2$ means that $X$ has values in a Sobolev space that is embedded into the space of continuous functions on $\mathcal{D}$. This is a natural starting point for inductively applying $(F_{s,\eta})$, cf. Proposition 2.4, as the next result illustrates:

Proposition 2.19.

(i) Let $f$ be a polynomial as in (2.40). Then $(F_{s,\eta})$ holds for any $s > d/2$ and $0 < \eta < 2$.

(ii) Let $f \in C^{\infty}_b(\mathbb{R})$. Then $(F_{s,\eta})$ holds for any $s \in [0,1] \cup [d/2,\infty)$ and $0 < \eta < 2$.

Proof.

(i) For $s > d/2$ the Sobolev spaces $W^{s,2}(\mathcal{D})$ are closed under multiplication (see e.g. [AF03, Theorem 4.39], [Tri10a, p. 146]). Therefore,

\[ \|f(X)\|_s \lesssim \sum_{k=0}^{m} |a_k| \|X\|_s^k \lesssim (1 + \|X\|_s)^m \]

for $X \in H^s$.

(ii) The case $s = 0$ is trivial since $\|f(X)\|_{L^2(\mathcal{D})}^2 \le |\mathcal{D}| \sup_{y\in\mathbb{R}} |f(y)|^2 < \infty$, so let $s > 0$. Set $\tilde f := f - f(0)$. By Theorem A from [AF92] and the discussion thereafter, there is $C > 0$ such that $\|\tilde f(X)\|_s \le C(1 + \|X\|_s)^{1 \vee s}$ for $s \in (0,1] \cup [d/2,\infty)$, and the claim is immediate.

In particular, for each of the examples considered in this section, it is true that a.s. $X \in R(s)$ and $\widetilde X \in R(s+2)$ for any $s < s^*$. Therefore, by Theorem 2.11, we obtain the following result concerning diffusivity estimation:

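Theorem 2.20. Let $\alpha > \gamma - (d+2)/8$. Then $\hat\theta^{\mathrm{full}}_N$ satisfies

\[ N^{\frac{1}{2}+\frac{1}{d}} \big(\hat\theta^{\mathrm{full}}_N - \theta\big) \xrightarrow{d} \mathcal{N}\Big(0, \frac{2\theta(d + 4\alpha - 4\gamma + 2)^2}{T\Lambda d(d + 8\alpha - 8\gamma + 2)}\Big). \tag{2.42} \]

If $d = 1$, the same is true for $\hat\theta^{\mathrm{part}}_N$, $\hat\theta^{\mathrm{lin}}_N$. In $d = 2$, $\hat\theta^{\mathrm{part}}_N$ is consistent with optimal rate, i.e. $\hat\theta^{\mathrm{part}}_N = \theta + o(N^{-a})$ for any $a < 1$, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.

It is clear that the coefficients $(a_k)_{0\le k\le m}$ in (2.40) may depend on $x \in \mathcal{D}$, in such a way that $a_k \in H^{s^*}$ for $0 \le k \le m$. This does not change the proof of $(F_{s,\eta})$ for $s < s^*$ in Proposition 2.19, thus Theorem 2.20 remains valid in that case.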
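The variance in (2.42) is simply (2.30) evaluated at $\beta = 2/d$. As a short worked check (not part of the original argument):

\[ 1 + \tfrac{2}{d}(2\alpha - 2\gamma + 1) = \frac{d + 4\alpha - 4\gamma + 2}{d}, \qquad 1 + \tfrac{2}{d}(4\alpha - 4\gamma + 1) = \frac{d + 8\alpha - 8\gamma + 2}{d}, \]

so that

\[ \Sigma = \frac{2\theta}{T\Lambda} \cdot \frac{(d + 4\alpha - 4\gamma + 2)^2}{d^2} \cdot \frac{d}{d + 8\alpha - 8\gamma + 2} = \frac{2\theta(d + 4\alpha - 4\gamma + 2)^2}{T\Lambda d(d + 8\alpha - 8\gamma + 2)}. \]

Likewise, the rate $N^{(1+\beta)/2}$ becomes $N^{1/2+1/d}$, and the condition $\alpha > \gamma - (1+1/\beta)/4$ becomes $\alpha > \gamma - (d+2)/8$.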

Remark 2.21. It is straightforward to include an advection term of the form $F_{ad}(X) = \nabla \cdot (Xv) = \mathrm{div}(Xv)$ in the reaction–diffusion equation, where $v: \mathcal{D} \to \mathbb{R}^d$ is a vector field with components $v^{(i)} \in H^s$ for some $s > d/2$. More precisely, assume that the nonlinearity of the equation $F = F_{re} + F_{ad}$ splits into a reaction term $F_{re}(X) = f(X)$ as before, and an advection term $F_{ad}$ as described above. It is clear that $\mathrm{div}$ maps $(H^s)^d$ into $H^{s-1}$ for any $s \in \mathbb{R}$.³ Furthermore, if $X \in H^s$ and $v \in (H^s)^d$ for $s > d/2$, then $\|Xv^{(i)}\|_s \lesssim \|X\|_s \|v^{(i)}\|_s$ for $1 \le i \le d$, and consequently, $\|F_{ad}(X)\|_{s-1} \lesssim \|X\|_s \sum_{i=1}^{d} \|v^{(i)}\|_s$, i.e. $F_{ad}$ (and consequently $F = F_{re} + F_{ad}$) satisfies $(F_{s,\eta})$ with any $\eta < 1$. In terms of diffusivity estimation, this means that $\hat\theta^{\mathrm{part}}_N = \theta + o(N^{-a})$ for any $a < 1/d$, and similarly for $\hat\theta^{\mathrm{lin}}_N$, whereas $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal with rate $N^{-1/2-1/d}$. It cannot be expected that this loss in convergence rate (compared to $\hat\theta^{\mathrm{full}}_N$) can be improved for $\hat\theta^{\mathrm{lin}}_N$, cf. Section 2.4.1.

2.4.3 Burgers Equation

Let $d = 1$ and $\mathcal{D} = [0, L] \subset \mathbb{R}$ for some $L > 0$. Consider

\[ dX_t = \theta\Delta X_t\,dt - X_t \partial_x X_t\,dt + B\,dW_t \tag{2.43} \]

with initial condition $X_0 \in L^p(\Omega, H^{s^*+1})$ for any $p \ge 1$, and Dirichlet boundary conditions. The nonlinearity is given by

\[ F(X) = -X\partial_x X = \partial_x\Big(-\frac{1}{2} X^2\Big). \]

The spaces $H = L^2(\mathcal{D})$ and $(H^s)_{s\in\mathbb{R}}$ are as in Section 2.4.2. As a special case of [LR15, Example 5.1.8], (2.43) has a unique solution in $R^v(1)$. Higher regularity can be inferred in a stepwise manner as follows:

Lemma 2.22.

(i) $F$ satisfies $(F^v_{s,\eta})$ with $\eta = 3/8$ for $s = 1$.

(ii) $F$ satisfies $(F^v_{s,\eta})$ with $\eta = 1/2$ for any $s > 1$.

(iii) $F$ satisfies $(F_{s,\eta})$ for all $\eta < 1$ and $s > 1/2$.

In particular, $X \in R^v(s)$ and $\widetilde X \in R^v(s + 1/2)$ for all $s < s^*$. If additionally $s^* > 3/2$ (i.e. $\gamma > 1/2$), then $X \in R(s)$ and $\widetilde X \in R(s+1)$ for any $s < s^*$.

³ For $s \in \mathbb{Z}$ this is obvious, for general $s$ use the exact interpolation Theorem [AF03, Theorem 7.23].

Proof. In the following calculations we use repeatedly the interpolation inequality $\|X\|_{rs_1+(1-r)s_2} \lesssim \|X\|_{s_1}^r \|X\|_{s_2}^{1-r}$ for $s_1, s_2 \in \mathbb{R}$ and $0 < r < 1$, further the algebra property $\|XY\|_s \lesssim \|X\|_s \|Y\|_s$ for $s > 1/2$ and the Sobolev embedding $H^{1/4} \subset L^4(\mathcal{D})$ in $d = 1$. These estimates are standard and can be found, e.g., in [AF03].

(i) We have

\[ \|F(X)\|_{s+\eta-2}^2 = \frac{1}{4} \|X^2\|_{3/8}^2 \lesssim \|X^2\|_{3/4} \|X^2\|_{L^2(\mathcal{D})} \lesssim \|X\|_{3/4}^2 \|X\|_{L^4(\mathcal{D})}^2 \lesssim \|X\|_{3/4}^2 \|X\|_{1/4}^2 \lesssim \|X\|_1^{3/2} \|X\|_0^{1/2} \|X\|_1^{1/2} \|X\|_0^{3/2} = \|X\|_1^2 \|X\|_0^2, \]

so $(F^v_{s,\eta})$ is satisfied as stated.

(ii) For $s > 1$ and $\eta = 1/2$,

\[ \|F(X)\|_{s+\eta-2}^2 = \frac{1}{4} \|X^2\|_{s-1/2}^2 \lesssim \|X\|_{s-1/2}^4 \lesssim \|X\|_s^2 \|X\|_{s-1}^2, \]

so condition $(F^v_{s,\eta})$ holds.

(iii) For $s > 1/2$ and $\eta < 1$, with $\epsilon = 1 - \eta$,

\[ \|F(X)\|_{s+\eta-2+\epsilon} = \frac{1}{2} \|X^2\|_s \lesssim \|X\|_s^2, \]

so $(F_{s,\eta})$ holds.

Concerning the regularity of $X$, we know already that $X \in R^v(1)$. By (i) and Proposition 2.5, $\widetilde X \in R^v(1 + 3/8)$. As $\bar X \in R(s)$ for all $s < s^*$, we get $X \in R^v(s)$ for $s < s^* \wedge (1 + 3/8)$, and in particular, there is $s > 1$ with $X \in R^v(s)$. Now (ii) and repeated use of Proposition 2.5 yields $X \in R^v(s)$ and $\widetilde X \in R^v(s + 1/2)$ for any $s < s^*$. Further, $X \in R(s-1)$ for all $s < s^*$ because $R^v(s) \subset R(s-1)$. In case $s^* > 3/2$, we have $X \in R(s)$ for some $s > 1/2$, and Proposition 2.4 gives $X \in R(s)$, $\widetilde X \in R(s+1)$ for all $s < s^*$.

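This lemma, together with Theorem 2.11, yields the asymptotic properties of $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ for the Burgers equation:

Theorem 2.23. Let $\gamma>1/2$ and $\alpha>\gamma-3/8$. Then $\hat\theta^{\mathrm{full}}_N$ satisfies

\[ N^{\frac32}\big(\hat\theta^{\mathrm{full}}_N-\theta\big)\xrightarrow{d}\mathcal{N}\Big(0,\frac{2\theta(3+4\alpha-4\gamma)^2}{T\Lambda(3+8\alpha-8\gamma)}\Big). \tag{2.44} \]

Further, $\hat\theta^{\mathrm{part}}_N=\theta+o(N^{-a})$ for $a<1$, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.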

Remark 2.24. It is possible to apply Theorem 2.11 to the stochastic Navier–

Stokes equations driven by additive noise in dimension d= 2 with unknown

viscosity. In this case, we reobtain the results from [CGH11]. It has been

conjectured in [Cia18] that these results apply also to the stochastic Burgers

equation.

2.4.4 Equations of Cahn-Hilliard Type

For $d\ge1$, fix a bounded domain $\mathcal{D}\subset\mathbb{R}^d$ with smooth boundary. Let $f:\mathbb{R}\to\mathbb{R}$ be a real-valued function. Consider

\[ dX_t=-\theta\Delta^2X_t\,dt-\Delta f(X_t)\,dt+B\,dW_t \tag{2.45} \]

with initial condition $X_0$ and boundary conditions $\nabla X\cdot\nu=0$, $\nabla(\Delta X)\cdot\nu=0$, where $\nu:\partial\mathcal{D}\to\mathbb{R}^d$ is the outward unit normal vector of the domain $\mathcal{D}$. We formalize this setting as in [LR15, p. 172 ff.]: Set $H=L^2(\mathcal{D})$, and let $V$ be the closure in $W^{2,2}(\mathcal{D})$ of $\{u\in C^4(\mathcal{D})\mid\nabla u\cdot\nu=0,\ \nabla(\Delta u)\cdot\nu=0\text{ on }\partial\mathcal{D}\}$. Considering the Gelfand triple $V\subset H\simeq H^*\subset V^*$, we have that $A=-\Delta^2$ is a bounded operator $V\to V^*$. As before, we set $H_s:=D((-A)^{s/2})$. Our standing assumption is $X_0\in H_{s^*+1}$.

Note that the regularity counting in this section differs from the convention of the previous examples, because the leading drift term in (2.45) is of order four: This means that $H_s$ is a closed subspace of $W^{2s,2}(\mathcal{D})$. Furthermore, in this case we have $\beta=4/d$, i.e. $\lambda_k\asymp\Lambda k^{4/d}$, see [Shu01, Section 13.4]. We additionally introduce the "classical" regularity spaces $H^\Delta_s:=D((-\Delta)^{s/2})$ that have been used in the previous sections. It is necessary to specify which regularity scale we are using when we speak about condition $(F_{s,\eta})$.

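Proposition 2.25. Let $s\in\mathbb{R}$, $\eta>0$, and set $s':=2s$, $\eta':=2\eta$. If $f$ satisfies $(F_{s',\eta'})$ with respect to the scale of Hilbert spaces $(H^\Delta_r)_{r\in\mathbb{R}}$, then $F$, given by $F(X)=-\Delta f(X)$, satisfies $(F_{s,\eta})$ with respect to the scale of Hilbert spaces $(H_r)_{r\in\mathbb{R}}$.

Proof. Choose $\epsilon>0$ and $g:[0,\infty)\to[0,\infty)$ as in $(F_{s',\eta'})$. Then

\[ \|-\Delta f(X)\|_{H_{s+\eta+\epsilon/2-2}}=\|f(X)\|_{H^\Delta_{s'+\eta'+\epsilon-2}}\le g\big(\|X\|_{H^\Delta_{s'}}\big)=g\big(\|X\|_{H_s}\big). \]

For example, let $d\le2$ and assume that $f$ is of the form

\[ f(x)=cx+\phi(x) \tag{2.46} \]

for $c\in\mathbb{R}$ and $\phi\in C^\infty_b(\mathbb{R})$. In particular, $f$ is globally Lipschitz continuous. By [LR15, Example 5.2.27], there is a unique solution $X$ to (2.45) with a.s. $X\in R_v(1)\subset R(0)$. As a consequence of Proposition 2.25 and Proposition 2.19 (ii), $F=-\Delta f$ satisfies $(F_{s,\eta})$ for any $s\ge0$ with $\eta<1$ in the regularity scale $(H_r)_{r\in\mathbb{R}}$. By Proposition 2.4 we conclude $X\in R(s)$ and $\tilde X\in R(s+1)$ for any $s<s^*$. Therefore, we have:

Theorem 2.26. Let $\gamma>d/8$ and $\alpha>\gamma-(d+4)/16$. Then $\hat\theta^{\mathrm{full}}_N$ satisfies

\[ N^{\frac12+\frac2d}\big(\hat\theta^{\mathrm{full}}_N-\theta\big)\xrightarrow{d}\mathcal{N}\Big(0,\frac{2\theta(d+8\alpha-8\gamma+4)^2}{T\Lambda d(d+16\alpha-16\gamma+4)}\Big). \tag{2.47} \]

Further, $\hat\theta^{\mathrm{part}}_N=\theta+o(N^{-a})$ for all $a<2/d$, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.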
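The limiting variances in (2.44) and (2.47) appear to be two instances of one expression in the eigenvalue growth rate $\beta$. The following minimal sketch assumes the general form $\Sigma=2\theta(1+\beta(2\alpha-2\gamma+1))^2/(T\Lambda(1+\beta(4\alpha-4\gamma+1)))$, which is inferred here from the ratio in (3.35) because (2.30) is not reproduced in this part of the text. All function names are hypothetical and serve only as an illustration.

```python
import math


def asymptotic_variance(theta, T, Lam, beta, alpha, gamma):
    """Assumed general form of Sigma (inferred from (3.35), not quoted from (2.30))."""
    num = 2.0 * theta * (1.0 + beta * (2 * alpha - 2 * gamma + 1)) ** 2
    den = T * Lam * (1.0 + beta * (4 * alpha - 4 * gamma + 1))
    return num / den


def burgers_variance(theta, T, Lam, alpha, gamma):
    """Variance in (2.44); corresponds to beta = 2 (d = 1, second-order operator)."""
    return 2.0 * theta * (3 + 4 * alpha - 4 * gamma) ** 2 / (
        T * Lam * (3 + 8 * alpha - 8 * gamma))


def cahn_hilliard_variance(theta, T, Lam, d, alpha, gamma):
    """Variance in (2.47); corresponds to beta = 4 / d (fourth-order operator)."""
    return 2.0 * theta * (d + 8 * alpha - 8 * gamma + 4) ** 2 / (
        T * Lam * d * (d + 16 * alpha - 16 * gamma + 4))


def rate_exponent(beta):
    """Exponent of N in the normalization N^{(1+beta)/2}."""
    return (1.0 + beta) / 2.0
```

With $\beta=2$ the exponent is $3/2$, and with $\beta=4/d$ it is $1/2+2/d$, which matches the normalizations in Theorems 2.23 and 2.26.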

2.4.5 Robustness under Model Misspecification

Assume that the true dynamics of a process $X$ is given by

\[ dX_t=\theta AX_t\,dt+F(X)(t)\,dt+G(X)(t)\,dt+B\,dW_t \tag{2.48} \]

with smooth initial condition $X_0$ and $F,G:C(0,T;H)\supseteq D(F)\cap D(G)\to L^1(0,T;H)$, where $D(F)\cap D(G)$ is the common domain of $F$ and $G$. We assume that (2.48) is well-posed in $R(s_0)$ for some $0\le s_0<s^*$, and that $F$ satisfies $(F_{s,\eta_F})$ for some $\eta_F>0$ and all $s_0\le s<s^*$. Assume further that $X_0\in H_{s^*+\eta_F}$. We are interested in the robustness of $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ with respect to the misspecification $G$. In this section, all three estimators are given by the same terms as in Section 2.3. In particular, $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{part}}_N$ include knowledge of $F$ but not of $G$.

Theorem 2.27. Let $\alpha>\gamma-(1+1/\beta)/4$.

(i) If $G$ satisfies $(F_{s,\eta_G})$ for some $\eta_G>0$ and $s_0\le s<s^*$, then $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are consistent.

(ii) If $G$ satisfies $(F_{s,\eta_G})$ for some $\eta_G>1+1/\beta$ and $s_0\le s<s^*$, then all statements on the asymptotic properties of $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ from Theorem 2.11 transfer to the present case.

(iii) If $F+G$ satisfies $(F_{s,\eta_{F+G}})$ for some $\eta_{F+G}>1+1/\beta$ and $s_0\le s<s^*$, then $\hat\theta^{\mathrm{lin}}_N$ is asymptotically normal as in (2.29).

Proof. Note that in any of these cases, $F+G$ satisfies $(F_{s,\eta})$ for $s_0\le s<s^*$ and $\eta=\eta_F\wedge\eta_G$, so (2.20) remains true. Thus, all claims follow from a straightforward modification of the proof of Theorem 2.11, taking into account the additional bias of the form (2.33), with $F$ replaced by $G$ therein, coming from the nonlinear term $G(X)$.

The excess regularity $\eta_{F+G}$ of $F+G$ clearly satisfies $\eta_{F+G}\ge\eta_F\wedge\eta_G$, but due to cancellation effects, $\eta_{F+G}$ may be larger than $\eta_F\wedge\eta_G$.

Remark 2.28.

(i) The preceding examples show that a large class of nonlinearities $G$ satisfies $(F_{s,\eta_G})$ for some $\eta_G>0$.

(ii) As $G$ is assumed to be unknown (or intractable), it does not make sense to construct a modified estimator that takes into account the shift coming from $G$ in order to improve the convergence rate. Rather, $G$ and its impact on diffusivity estimation should be understood qualitatively.

(iii) The typical situation can be described as follows: Let $F_{\mathrm{true}}$ be the true nonlinearity of the underlying process, which is either unknown or too complex to be handled directly. Instead, an approximate nonlinear term $F_{\mathrm{approx}}$ is used to model the dynamics of $X$. In this case $F=F_{\mathrm{approx}}$, and $G=F_{\mathrm{true}}-F_{\mathrm{approx}}$ is the remainder that describes the deviation from the true model. The excess regularity $\eta_G$ associated with $G$ measures the quality of the approximate model $F_{\mathrm{approx}}$ for diffusivity estimation.

(iv) For example, $F_{\mathrm{approx}}$ may be the linearization of a nonlinear model $F_{\mathrm{true}}$. In this case, $\eta_G$ is related to the linearization procedure.

(v) If $X$ is the solution to a reaction–diffusion equation (with possible advection term) as in Section 2.4.2, the excess regularity of $G$ encodes the order of the model misspecification as a differential operator. For example, if only reaction terms (of order zero) are misspecified, but the advection mechanism (of order one) is known very precisely, then we have $\eta_G<2$. If the description of the advection term is wrong, then $\eta_G<1$.

(vi) In particular, for diffusivity estimation, precise knowledge of the transport term is more important than precise knowledge of the reaction term.

2.5 Numerical Illustration

We simulate the Allen–Cahn equation [CA77]

\[ dX_t=\theta\Delta X_t\,dt+(X_t-X_t^3)\,dt+(-\Delta)^{-\gamma}\,dW_t \tag{2.49} \]

on $\mathcal{D}=[0,1]$ with Dirichlet boundary conditions and initial condition $x\mapsto\sin(\pi x)$. We discretize the equation in Fourier space and simulate $N_0=100$ Fourier modes by a linear-implicit Euler scheme until $T=1$. The temporal and spatial step sizes are set to $\Delta t=2.5\times10^{-5}$ and $\Delta x=5\times10^{-4}$, respectively. The diffusivity is given by $\theta=0.02$. We generate $M=1000$ Monte Carlo simulations for each of the choices $\gamma=0.4$ and $\gamma=0.8$. In either case, we set $\alpha=\gamma$. A detailed discussion of numerical simulation for SPDEs can be found in [LPS14].
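The simulation setup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for Figure 2.1: it assumes the eigenpairs $\lambda_k=(\pi k)^2$, $\Phi_k(x)=\sqrt2\sin(k\pi x)$ of $-\Delta$ on $[0,1]$, noise intensity $\sigma=1$, evaluation of the cubic term on a grid with projection by a Riemann sum, and a left-point (Itô) discretization of the integrals in $\hat\theta^{\mathrm{lin}}_N$. The function names and the reduced default resolution are hypothetical choices made to keep the run time short.

```python
import numpy as np


def simulate_allen_cahn_modes(theta=0.02, gamma=0.4, n_modes=30, n_grid=256,
                              dt=2e-4, n_steps=5000, seed=0):
    """Linear-implicit Euler scheme for the Fourier modes of (2.49)."""
    rng = np.random.default_rng(seed)
    k = np.arange(1, n_modes + 1)
    lam = (np.pi * k) ** 2                        # eigenvalues of -Laplacian
    x = np.arange(1, n_grid) / n_grid             # interior grid points
    S = np.sqrt(2.0) * np.sin(np.pi * np.outer(k, x))   # S[k, j] = Phi_k(x_j)
    coeff = np.zeros(n_modes)
    coeff[0] = 1.0 / np.sqrt(2.0)                 # sin(pi x) = Phi_1(x) / sqrt(2)
    path = np.empty((n_steps + 1, n_modes))
    path[0] = coeff
    for n in range(n_steps):
        u = coeff @ S                             # field values on the grid
        f = S @ (u - u ** 3) / n_grid             # modes of the reaction term
        dW = rng.normal(scale=np.sqrt(dt), size=n_modes)
        # linear part implicit, reaction term and noise explicit
        coeff = (coeff + dt * f + lam ** (-gamma) * dW) / (1.0 + theta * lam * dt)
        path[n + 1] = coeff
    return lam, path


def theta_lin_estimator(lam, path, dt, n_obs, alpha):
    """Discretized version of the linear estimator based on the first n_obs modes."""
    x = path[:-1, :n_obs]                         # left endpoints (Ito sums)
    dx = np.diff(path[:, :n_obs], axis=0)
    lam_n = lam[:n_obs]
    numerator = np.sum(lam_n ** (1 + 2 * alpha) * x * dx)
    denominator = np.sum(lam_n ** (2 + 2 * alpha) * x ** 2) * dt
    return -numerator / denominator
```

For instance, `lam, path = simulate_allen_cahn_modes()` followed by `theta_lin_estimator(lam, path, dt=2e-4, n_obs=20, alpha=0.4)` evaluates the estimator at $N=20$ for one trajectory. A Monte Carlo study as described in the text would repeat this over many seeds.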

By Theorem 2.20, all three estimators $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are asymptotically normal. In Figure 2.1, the simulation results concerning consistency, convergence rate and asymptotic distribution are shown. Whereas $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{part}}_N$ perform as predicted, $\hat\theta^{\mathrm{lin}}_N$ seems to exhibit non-asymptotic effects. Apparently, this depends on the impact of the noise on the dynamics, which is controlled by $\gamma$. In fact, the value of $\gamma$ has two effects: It determines the spatial regularity of the noise (and consequently of $X$), but it also has an impact on the overall noise intensity via the magnitude of $\lambda_1^{-\gamma}$, i.e. the largest eigenvalue of $(-\Delta)^{-\gamma}$. Our interpretation is that irregular noise from low values of $\gamma$ tends to mask the effect of the nonlinearity. Put differently, nonlinear effects have a larger impact under smooth noise with smaller amplitude.

We further mention that for even larger values of $\gamma$ (take $\gamma=1.3$), the estimated value from $\hat\theta^{\mathrm{lin}}_N$ is mostly negative and therefore not related to the true diffusivity. On the other hand, $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{part}}_N$ remain consistent. It is possible that this effect depends on the number of Fourier modes $N_0$ used in the simulation.

2.6 The Case of Systems

Consider a partially observed system of the form

\[ \begin{aligned} dX^O_t&=\theta AX^O_t\,dt+F^O(X^O_t,X^U_t)\,dt+B^O\,dW^O_t,\\ dX^U_t&=F^U(X^O_t,X^U_t)\,dt+B^U\,dW^U_t, \end{aligned} \tag{2.50} \]

together with initial conditions $X^O_0$, $X^U_0$. Here, $X^O$ denotes the observed component and $X^U$ the unobserved component of the dynamics. We want to estimate the unknown diffusivity $\theta$ of the observed component.

More precisely, let $H^O$, $H^U$ be two Hilbert spaces, and let $A:D(A)\to H^O$ be a densely defined, closed, negative definite and self-adjoint operator on $H^O$ with compact resolvent, whose eigenvalue sequence $(\lambda_k)_{k\in\mathbb{N}}$ satisfies (2.2). $F^O$ and $F^U$ are nonlinear operators defined on a subset $D(\mathcal{F})$ of $H^O\oplus H^U$, with values in $H^O$ and $H^U$, respectively. $W^O$ and $W^U$ are independent cylindrical Wiener processes on $H^O$ and $H^U$, and $B^O$, $B^U$ are Hilbert–Schmidt operators on $H^O$ and $H^U$. We assume $B^O=\sigma^O(-A)^{-\gamma}$ for some $\gamma>1/(2\beta)$, where $\sigma^O>0$ is the noise intensity in the observed component.

$P_N:H^O\to H^O$ denotes the projection onto the span of the first $N$ eigenfunctions $\Phi_1,\dots,\Phi_N$ of $A$. We write $X=(X^O,X^U)$ and $\mathcal{H}=H^O\oplus H^U$. Let $\mathcal{A}:D(A)\oplus H^U\to\mathcal{H}$ be the operator given by $\mathcal{A}(x,y)=(Ax,0)$, define $\mathcal{F}:D(\mathcal{F})\to\mathcal{H}$ by means of $\mathcal{F}(u,v)=(F^O(u,v),F^U(u,v))$ and $\mathcal{B}:\mathcal{H}\to\mathcal{H}$ via $\mathcal{B}(u,v)=(B^Ou,B^Uv)$. Finally, $\mathcal{W}=(W^O,W^U)$ is a cylindrical Wiener process on $\mathcal{H}$. Then $X$ satisfies

\[ dX_t=\theta\mathcal{A}X_t\,dt+\mathcal{F}(X_t)\,dt+\mathcal{B}\,d\mathcal{W}_t. \tag{2.51} \]

In order to capture the regularity of $X$, we extend the notation from Section 2.2 and set for $s\in\mathbb{R}$:

\[ H_s:=D((-A)^{s/2}), \tag{2.52} \]
\[ \mathcal{H}_s:=D((-\mathcal{A})^{s/2})=D((-A)^{s/2})\oplus H^U, \tag{2.53} \]
\[ R(s):=L^\infty(0,T;H_s), \tag{2.54} \]
\[ \mathcal{R}(s):=L^\infty(0,T;\mathcal{H}_s). \tag{2.55} \]

In analogy to condition $(F_{s,\eta})$, we need a condition on the observed part $F^O$ of $\mathcal{F}$:

$(F^{\mathrm{sys}}_{s,\eta})$ There is $\epsilon>0$ and a continuous increasing function $g:[0,\infty)\to[0,\infty)$ such that for all $X\in\mathcal{R}(s)$:

\[ \|F^O(X)\|_{R(s+\eta+\epsilon-2)}\le g\big(\|X\|_{\mathcal{R}(s)}\big). \tag{2.56} \]

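The splitting argument concerns only the observed part: We write $X^O=\bar X^O+\tilde X^O$, where $\bar X^O$, $\tilde X^O$ satisfy

\[ d\bar X^O_t=\theta A\bar X^O_t\,dt+B^O\,dW^O_t, \tag{2.57} \]
\[ d\tilde X^O_t=\theta A\tilde X^O_t\,dt+F^O(X)\,dt, \tag{2.58} \]

with $\bar X^O_0=0$, $\tilde X^O_0=X^O_0$.

In analogy to Proposition 2.3 and Proposition 2.4, we have

Proposition 2.29. Let $\eta>0$. If $(F^{\mathrm{sys}}_{s,\eta})$ holds for $s\in\mathbb{R}$ such that a.s. $X\in\mathcal{R}(s)$ and $X^O_0\in H_{s+\eta}$, then $\tilde X^O\in R(s+\eta)$. In particular, if $s_0<s^*$ is such that $(F^{\mathrm{sys}}_{s,\eta})$ holds for $s_0\le s<s^*$, $\bar X^O\in R(s)$ for $s<s^*$, $X\in\mathcal{R}(s_0)$ and $X^O_0\in H_{s^*+\eta}$ almost surely, then $X\in\mathcal{R}(s)$ and $\tilde X^O\in R(s+\eta)$ for $s<s^*$.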

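Adapting the estimators from Section 2.3 to the present situation, we define

\[ \hat\theta^{\mathrm{full}}_N:=-\frac{\int_0^T\langle(-A)^{1+2\alpha}P_NX^O_t,dP_NX^O_t\rangle_{H^O}}{\int_0^T\|(-A)^{1+\alpha}P_NX^O_t\|^2_{H^O}\,dt}+\frac{\int_0^T\langle(-A)^{1+2\alpha}P_NX^O_t,P_NF^O(X)\rangle_{H^O}\,dt}{\int_0^T\|(-A)^{1+\alpha}P_NX^O_t\|^2_{H^O}\,dt}. \]

In fact, $\hat\theta^{\mathrm{full}}_N$ is not a function of the observed process $P_NX^O$ alone, as it depends on $X^U$ via $X$. Consequently, we define

\[ \hat\theta^{\mathrm{part}}_N:=-\frac{\int_0^T\langle(-A)^{1+2\alpha}P_NX^O_t,dP_NX^O_t\rangle_{H^O}}{\int_0^T\|(-A)^{1+\alpha}P_NX^O_t\|^2_{H^O}\,dt}+\frac{\int_0^T\langle(-A)^{1+2\alpha}P_NX^O_t,P_NF^O(P_NX^O_t,0)\rangle_{H^O}\,dt}{\int_0^T\|(-A)^{1+\alpha}P_NX^O_t\|^2_{H^O}\,dt}, \]

\[ \hat\theta^{\mathrm{lin}}_N:=-\frac{\int_0^T\langle(-A)^{1+2\alpha}P_NX^O_t,dP_NX^O_t\rangle_{H^O}}{\int_0^T\|(-A)^{1+\alpha}P_NX^O_t\|^2_{H^O}\,dt}. \]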

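Analogously to Theorem 2.11, the following result is proven:

Theorem 2.30. Let $\gamma>1/(2\beta)$ and $\eta>0$, $s_0<s^*$ such that $(F^{\mathrm{sys}}_{s,\eta})$ holds for $s_0\le s<s^*$. Assume a.s. $X\in\mathcal{R}(s_0)$ and $X^O_0\in H_{s^*+\eta}$. Let $\alpha>\gamma-(1+1/\beta)/4$. Then $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are strongly consistent as $N\to\infty$, and $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal as in (2.29). If $\eta>1+1/\beta$, the same is true for $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$; otherwise $\hat\theta^{\mathrm{part}}_N=\theta+o(N^{-a})$ for each $a<\beta\eta/2$, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.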

Example 2.31. The theory from this section is applicable to a stochastic FitzHugh–Nagumo system [Fit61, NAY62], whose activator component is observed:

\[ \begin{aligned} dU_t&=\theta\Delta U_t\,dt+\big(k_1U_t(1-U_t)(U_t-a)-k_2V_t\big)\,dt+B^O\,dW^O_t,\\ dV_t&=\epsilon(bU_t-V_t)\,dt+B^U\,dW^U_t, \end{aligned} \]

with initial conditions $U_0$, $V_0$, on $\mathcal{D}=[0,L]$, $L>0$, with Neumann boundary conditions. The reaction parameters are $k_1,k_2,\epsilon,b>0$ and $a\in(0,1)$. The state space for both components is $H^O=H^U=L^2(\mathcal{D})$. Assume⁴ that $1/2<s^*<2$ and $X=(U,V)\in\mathcal{R}(s)$ for $s<s^*$. We verify condition $(F^{\mathrm{sys}}_{s,\eta})$ for $F^O$. Here, $F^O(U,V)=k_1U(1-U)(U-a)-k_2V$. The first term is a polynomial that can be treated exactly as in Section 2.4.2, with obvious changes in the notation of the norm, resulting in $\eta<2$. Concerning the second term $-k_2V$, we can find $\epsilon>0$ such that $\|V\|_{s+\eta+\epsilon-2}=\|V\|_{L^2(\mathcal{D})}$ if and only if $\eta<2-s$, and this equality leads to (2.56) as well. In total, $(F^{\mathrm{sys}}_{s,\eta})$ holds for $F^O$ with $\eta<2-s^*$ for $1/2<s<s^*$. This proves that $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal, and $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are consistent with convergence rate bounded by $N^{-a}$ for $a<2-s^*$. This result can be refined if the optimal regularity of the inhibitor is taken into account. In fact, under regularity assumptions on the inhibitor noise, all three estimators will be asymptotically normal. We refer to Chapter 6, where a similar FitzHugh–Nagumo system is studied in greater detail.

Finally, we note that if $\sigma^O=0$, i.e. if only the unobserved component is driven by noise, other methods need to be employed. We come back to that case in Remark 3.16 below.

⁴ In fact, $X\in\mathcal{R}(0)$ together with $U\in R_v(1)$ can be shown as in [SS15], and a direct modification of Proposition 2.18 gives $X\in\mathcal{R}(1)$ in this case. Note that under the Hilbert–Schmidt assumption $\gamma>d/4$, we necessarily have $s^*>1$. Higher regularity in $\mathcal{R}(s)$ for all $s<s^*$ follows from $(F^{\mathrm{sys}}_{s,\eta})$.

Figure 2.1: Left column contains results for $\gamma=0.4$, right column for $\gamma=0.8$. (top row) Red line: Median from $M=1000$ realizations of $\hat\theta^{\mathrm{full}}_N$. The blue region is bounded above by the 97.5-percentile and below by the 2.5-percentile. Black solid line is plotted at true value $\theta=0.02$, dashed line plotted at zero. (middle row) The mean squared error (MSE), given by $M^{-1}\sum_{k=1}^M(\hat\theta_N(k)-\theta)^2$, is plotted, where $\hat\theta_N(k)$ is the $k$-th realization of either of the estimators $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ or $\hat\theta^{\mathrm{lin}}_N$. Black line corresponds to the squared true theoretical rate $N\mapsto(\Sigma^{1/2}N^{-3/2})^2$, with $\Sigma$ from (2.30). (bottom row) Histogram for the standardized values $\Sigma^{-1/2}N^{3/2}(\hat\theta_N-\theta)$ at $N=20$, where $\hat\theta_N$ is either of the three estimators. The width of each bin is 0.4. Outliers outside the interval $[-5,5]$ are put into the leftmost and rightmost bin, respectively.

Chapter 3

Extended Noise Models for the

Spectral Approach

In the last chapter, we studied semilinear SPDE models driven by spa-

tially correlated but temporally white noise. Nonetheless, in many situ-

ations it is desirable to include temporal correlation to an SPDE. There

are different ways to achieve this: A standard approach is using fractional

noise, as studied e.g. in [MP08, MT13, KM19] in the large time regime,

[TTV14, MKT19a, MKT19b, SST20] in a spatial and/or temporal infill

regime, or [CLP09, Kří20] in the spectral approach. Fractional noise im-

pacts the temporal regularity of the solution process and can be used to

model long-range dependence, see e.g. [Tud13] for a discussion of SPDEs

driven by such noise.

However, in applications, there are further common approaches to include temporal correlation, which have received little attention in the literature concerning statistical inference for SPDEs. As an important example, in models appearing in the biophysics literature [ASB18, FFAB20, MFF+20], integrated Ornstein–Uhlenbeck noise is used.¹ While the presence of certain dynamical properties such as separation of phases or traveling waves may not be affected by substituting Brownian noise by integrated Ornstein–Uhlenbeck noise (or vice versa), the precise specification becomes important when it comes to the quantitative analysis of data. Motivated by these works, we study the cases of Ornstein–Uhlenbeck noise and integrated noise separately.

¹ As a motivation, note that the one-dimensional integrated Ornstein–Uhlenbeck process serves as an alternative model (besides the Wiener process) for describing the movement of a Brownian particle, see e.g. [HL84, Chapter 2] for a discussion.

Ornstein–Uhlenbeck noise is treated in Section 3.1. From the mathemat-

ical point of view, the statistical properties of Ornstein–Uhlenbeck driven

models have not yet been investigated. We study this model both from the

perspective of parameter estimation under Ornstein–Uhlenbeck assumption

as well as from the point of view of model misspecification, where white

noise is used in the description but the true dynamics is Ornstein–Uhlenbeck

driven.

Integrated noise is studied in Section 3.2. Its statistical analysis can

be reduced to the case of semimartingale-type noise. In addition, integrated

noise provides a simple example of a model that cannot be handled by simply

using the estimators from Chapter 2 without further modification.

Finally, in Section 3.3, we consider more general dispersion operators,

which allows us to handle a certain type of multiplicative noise.

3.1 The Case of Ornstein–Uhlenbeck Noise

In this section we consider a semilinear SPDE driven by Ornstein–Uhlenbeck noise. We develop a hierarchical estimation theory for the diffusivity $\theta$ and the temporal correlation decay $\mu$ and compare the results to the white noise case. In particular, we consider the case of model misspecification in the noise. Our setting in this section is as follows:

\[ dX_t=\theta AX_t\,dt+F(X)(t)\,dt+dV_t, \tag{3.1} \]
\[ dV_t=-\mu V_t\,dt+B\,dW_t, \tag{3.2} \]

with initial conditions $X_0$ and $V_0$. Without loss of generality, we assume $V_0=0$. As before, $W$ is a cylindrical Wiener process, $B=\sigma(-A)^{-\gamma}$ for some $\gamma>1/(2\beta)$ and $\sigma>0$, and $\theta>0$ is the diffusivity. Further, $\mu\in\mathbb{R}$ is an additional parameter related to the temporal correlation length of the driving noise $V$. In this section we always assume $\mu\neq0$, since otherwise the equations reduce to the white noise model from Section 2.3. Additionally, we assume w.l.o.g. that $\mu\neq\pm\theta\lambda_k$ for all $k\in\mathbb{N}$. (Otherwise replace $A$ with $A+\epsilon I$ for some $\epsilon>0$, where $I:H\to H$ is the identity operator, and substitute $F$ by $F-\epsilon I$. The additional perturbation is of order zero.) This will be used in Lemma 3.2 below.

Remark 3.1. Our model is compatible with a different natural approach to Ornstein–Uhlenbeck driven SPDEs, namely:

\[ dX_t=\theta AX_t\,dt+F(X)(t)\,dt+B\,dW^{(\mu)}_t \tag{3.3} \]

with initial condition $X_0$, where $W^{(\mu)}$ is a cylindrical Ornstein–Uhlenbeck process in the sense that $w^{(\mu,k)}:=\langle W^{(\mu)},\Phi_k\rangle$ are independent Ornstein–Uhlenbeck processes of the form

\[ dw^{(\mu,k)}_t=-\mu w^{(\mu,k)}_t\,dt+dW^{(k)}_t \tag{3.4} \]

for independent Wiener processes $(W^{(k)})_{k\in\mathbb{N}}$. If $B=\sigma(-A)^{-\gamma}$, this model can be reduced to (3.1), (3.2) by setting $V=BW^{(\mu)}$.

The linearized model is given by

\[ d\bar X_t=\theta A\bar X_t\,dt+dV_t, \tag{3.5} \]
\[ dV_t=-\mu V_t\,dt+B\,dW_t \tag{3.6} \]

with $\bar X_0=V_0=0$. As in the previous chapter, we set $\tilde X:=X-\bar X$; then $\tilde X$ satisfies the random PDE (2.5).

3.1.1 Covariance Structure and Asymptotic Behavior

As before, we set ¯x(k)=¯

X, ΦkHand v(k)=⟨V, Φk⟩H. The processes

(¯x(k), v(k)),k∈N, are independent centered Gaussian processes with

d¯x(k)

t= (−θλk¯x(k)

t−µv(k)

t)dt+σλ−γ

kdW(k)

t,(3.7)

dv(k)

t=−µv(k)

tdt+σλ−γ

kdW(k)

t,(3.8)

and ¯x(k)

0=v(k)

0= 0, where (W(k))k∈Nare independent Brownian motions.

Lemma 3.2. With µ= 0 and µ=±θλk, we have the explicit representation

¯x(k)

t=σλ−γ

θλk−µZt

0θλke−θλk(t−r)−µe−µ(t−r)dW(k)

r,(3.9)

v(k)

t=σλ−γ

kZt

e−µ(t−r)dW(k)

r.(3.10)

Furthermore, for 0≤r≤t,

E[¯x(k)

r¯x(k)

t] = σ2λ−2γ

(θλk−µ)2θλk

2(e−θλk(t−r)−e−θλk(t+r))

+µ

2(e−µ(t−r)−e−µ(t+r))

+µθλk

θλk+µ(e−µt−θλkr+e−θλkt−µr −e−µ(t−r)−e−θλk(t−r)),

E[v(k)

rv(k)

t] = σ2λ−2γ

2µ(e−µ(t−r)−e−µ(t+r)),

E[v(k)

r¯x(k)

t] = σ2λ−2γ

θλk−µθλk

θλk+µ(e−θλk(t−r)−e−θλkt−µr)

−1

2(e−µ(t−r)−e−µ(t+r)),

E[¯x(k)

rv(k)

t] = σ2λ−2γ

θλk−µθλk

θλk+µ(e−µ(t−r)−e−µt−θλkr)

−1

2(e−µ(t−r)−e−µ(t+r)).

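Proof. Fix $k\in\mathbb{N}$. With $Z=(\bar x^{(k)},v^{(k)})^T$ we have $dZ_t=A_ZZ_t\,dt+B_Z\,dW^{(k)}_t$, where $A_Z:\mathbb{R}^2\to\mathbb{R}^2$, $B_Z:\mathbb{R}\to\mathbb{R}^2$ are linear mappings given by

\[ A_Z=\begin{pmatrix}-\theta\lambda_k&-\mu\\0&-\mu\end{pmatrix},\qquad B_Z=\sigma\lambda_k^{-\gamma}\begin{pmatrix}1\\1\end{pmatrix}. \]

It is straightforward to verify that $A_Z=SD_ZS^{-1}$ with

\[ D_Z=\begin{pmatrix}-\theta\lambda_k&0\\0&-\mu\end{pmatrix},\qquad S=\begin{pmatrix}1&\mu\\0&-\theta\lambda_k+\mu\end{pmatrix}. \]

It follows that for $t\in\mathbb{R}$,

\[ e^{tA_Z}B_Z=Se^{tD_Z}S^{-1}B_Z=\frac{\sigma\lambda_k^{-\gamma}}{\theta\lambda_k-\mu}\begin{pmatrix}\theta\lambda_ke^{-\theta\lambda_kt}-\mu e^{-\mu t}\\(\theta\lambda_k-\mu)e^{-\mu t}\end{pmatrix}. \]

Now, (3.9), (3.10) are clear from $Z_t=\int_0^te^{(t-r)A_Z}B_Z\,dW^{(k)}_r$, and the covariance terms follow from Itô's isometry.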
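The diagonalization step in the proof can be checked numerically. The sketch below compares the closed form of $e^{tA_Z}B_Z$ with a matrix exponential computed independently by a truncated Taylor series with scaling and squaring. The parameter values and the helper names `expm_taylor` and `closed_form` are illustrative assumptions and do not come from the text.

```python
import numpy as np


def expm_taylor(M, squarings=12, terms=12):
    """Matrix exponential via Taylor series of M / 2^squarings, then repeated squaring."""
    A = M / 2 ** squarings
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for j in range(1, terms + 1):
        term = term @ A / j
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E


def closed_form(t, theta, lam, mu, sigma, gamma):
    """Closed form of exp(t A_Z) B_Z from the proof of Lemma 3.2."""
    a = theta * lam
    pref = sigma * lam ** (-gamma) / (a - mu)
    return pref * np.array([a * np.exp(-a * t) - mu * np.exp(-mu * t),
                            (a - mu) * np.exp(-mu * t)])


# Illustrative parameters (assumed): fifth Dirichlet eigenvalue on [0, 1].
theta, lam, mu, sigma, gamma, t = 0.02, (5 * np.pi) ** 2, 1.5, 1.0, 0.4, 0.7
A_Z = np.array([[-theta * lam, -mu], [0.0, -mu]])
B_Z = sigma * lam ** (-gamma) * np.array([1.0, 1.0])
S = np.array([[1.0, mu], [0.0, -theta * lam + mu]])
D_Z = np.diag([-theta * lam, -mu])

numeric = expm_taylor(t * A_Z) @ B_Z
```

Here `numeric` should agree with `closed_form(t, theta, lam, mu, sigma, gamma)` up to rounding, and `S @ D_Z @ np.linalg.inv(S)` should reproduce `A_Z`.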

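In particular, we have

\[ \mathbb{E}[(\bar x^{(k)}_t)^2]=\frac{\sigma^2\lambda_k^{-2\gamma}}{(\theta\lambda_k-\mu)^2}\Big[\frac{\theta\lambda_k}{2}\big(1-e^{-2\theta\lambda_kt}\big)+\frac{\mu}{2}\big(1-e^{-2\mu t}\big)-\frac{2\mu\theta\lambda_k}{\theta\lambda_k+\mu}\big(1-e^{-(\theta\lambda_k+\mu)t}\big)\Big], \tag{3.11} \]

\[ \mathbb{E}[(v^{(k)}_t)^2]=\frac{\sigma^2\lambda_k^{-2\gamma}}{2\mu}\big(1-e^{-2\mu t}\big), \tag{3.12} \]

\[ \mathbb{E}[v^{(k)}_t\bar x^{(k)}_t]=\frac{\sigma^2\lambda_k^{-2\gamma}}{\theta\lambda_k-\mu}\Big[\frac{\theta\lambda_k}{\theta\lambda_k+\mu}\big(1-e^{-(\theta\lambda_k+\mu)t}\big)-\frac12\big(1-e^{-2\mu t}\big)\Big]. \tag{3.13} \]

From the above calculations (or the elementary observation that each $v^{(k)}$ is a classical one-dimensional Ornstein–Uhlenbeck process) it follows that $\lim_{t\to\infty}\mathbb{E}[v^{(k)}_tv^{(k)}_{t+d}]/(\mathbb{E}[(v^{(k)}_t)^2]\,\mathbb{E}[(v^{(k)}_{t+d})^2])^{1/2}=e^{-\mu d}$ for $d\ge0$, so $\mu$ describes the rate of exponential decay of the autocorrelation function of each noise mode in the stationary regime. Hence the name "temporal correlation decay" for $\mu$.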
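As a sanity check of (3.11) to (3.13), Itô's isometry can be evaluated numerically: each second moment equals the integral over $[0,t]$ of a product of the deterministic kernels appearing in (3.9) and (3.10). The sketch below compares the closed forms with a trapezoidal quadrature. The parameter values and function names are assumptions made for illustration, and `c` abbreviates $\sigma\lambda_k^{-\gamma}$.

```python
import numpy as np


def kernel_x(u, a, mu, c):
    """Integrand of (3.9) as a function of u = t - r, with a = theta * lambda_k."""
    return c / (a - mu) * (a * np.exp(-a * u) - mu * np.exp(-mu * u))


def kernel_v(u, mu, c):
    """Integrand of (3.10) as a function of u = t - r."""
    return c * np.exp(-mu * u)


def trapezoid(y, u):
    """Composite trapezoidal rule (written out to avoid NumPy version differences)."""
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(u)))


def var_x_closed(t, a, mu, c):      # (3.11)
    return c ** 2 / (a - mu) ** 2 * (
        a / 2 * (1 - np.exp(-2 * a * t)) + mu / 2 * (1 - np.exp(-2 * mu * t))
        - 2 * mu * a / (a + mu) * (1 - np.exp(-(a + mu) * t)))


def var_v_closed(t, mu, c):         # (3.12)
    return c ** 2 / (2 * mu) * (1 - np.exp(-2 * mu * t))


def cov_vx_closed(t, a, mu, c):     # (3.13)
    return c ** 2 / (a - mu) * (
        a / (a + mu) * (1 - np.exp(-(a + mu) * t)) - 0.5 * (1 - np.exp(-2 * mu * t)))


# Illustrative parameters (assumed)
theta, lam, mu, sigma, gamma, t = 0.02, (5 * np.pi) ** 2, 1.5, 1.0, 0.4, 1.0
a, c = theta * lam, sigma * lam ** (-gamma)
u = np.linspace(0.0, t, 200001)
var_x_num = trapezoid(kernel_x(u, a, mu, c) ** 2, u)
var_v_num = trapezoid(kernel_v(u, mu, c) ** 2, u)
cov_vx_num = trapezoid(kernel_v(u, mu, c) * kernel_x(u, a, mu, c), u)
```

The three quadrature values should match `var_x_closed`, `var_v_closed` and `cov_vx_closed` evaluated at the same parameters.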

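Lemma 3.3. It holds a.s. that $\bar X\in R(s)$ for any $s<s^*:=1+2\gamma-1/\beta$.

Proof. The reasoning is similar to that in Lemma 2.7. Define $C_1,C_2:H\to H$ via $C_1\Phi_k:=[\sigma\theta\lambda_k^{-\gamma+1}/(\theta\lambda_k-\mu)]\Phi_k$ and $C_2\Phi_k:=-[\mu\sigma\lambda_k^{-\gamma}/(\theta\lambda_k-\mu)]\Phi_k$. Note that both operators are of Hilbert–Schmidt type due to $\gamma>1/(2\beta)$. Using (3.9), we write $\bar X_t=\int_0^te^{(t-r)\theta A}C_1\,dW_r+\int_0^te^{-\mu(t-r)}C_2\,dW_r=:\bar X^{(1)}_t+\bar X^{(2)}_t$, where, as before, $t\mapsto e^{t\theta A}$ is the $C_0$-semigroup generated by $\theta A$. We prove the claim for both stochastic integrals separately: For $s<s^*=1+2\gamma-1/\beta$ and $0<\alpha<\min\{1/2,(s^*-s)/2\}$,

\[ \int_0^Tt^{-2\alpha}\big\|(-A)^{s/2}e^{t\theta A}C_1\big\|^2_{HS}\,dt=\sigma^2\theta^2\sum_{k=1}^\infty\frac{\lambda_k^{s-2\gamma+2}}{(\theta\lambda_k-\mu)^2}\int_0^Tt^{-2\alpha}e^{-2\theta\lambda_kt}\,dt\lesssim\sum_{k=1}^\infty\lambda_k^{s-2\gamma}\int_0^\infty\lambda_k^{2\alpha-1}r^{-2\alpha}e^{-r}\,dr\lesssim\sum_{k=1}^\infty k^{\beta(s+2\alpha-2\gamma-1)}<\infty. \]

By [DPZ14, Theorem 5.11], $(-A)^{s/2}\bar X^{(1)}\in C(0,T;H)$, i.e. $\bar X^{(1)}\in R(s)$ almost surely. With regard to $\bar X^{(2)}$, we have for any $s<s^*+1=2+2\gamma-1/\beta$ and $0<\alpha<1/2$,

\[ \int_0^Tt^{-2\alpha}\big\|(-A)^{s/2}e^{-\mu t}C_2\big\|^2_{HS}\,dt=\mu^2\sigma^2\int_0^Tt^{-2\alpha}e^{-2\mu t}\,dt\sum_{k=1}^\infty\frac{\lambda_k^{s-2\gamma}}{(\theta\lambda_k-\mu)^2}\lesssim\sum_{k=1}^\infty\lambda_k^{s-2\gamma-2}\lesssim\sum_{k=1}^\infty k^{\beta(s-2\gamma-2)}<\infty, \]

and consequently, again by [DPZ14, Theorem 5.11], it follows that a.s. $\bar X^{(2)}\in R(s)$. This proves the claim.

The next proposition implies that a.s. $\bar X\notin L^2(0,T;H_s)$ for any $s>s^*$. In particular, the optimal regularity $s^*$ is the same as in the white noise case from Section 2.3.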

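Proposition 3.4. Set $C^{(\pm)}_{T,\mu}:=T\pm(1-e^{-2\mu T})/(2\mu)$.

(i) As $k\to\infty$, we have the following asymptotic expansions:

\[ \mathbb{E}\int_0^T(\bar x^{(k)}_t)^2\,dt\asymp\frac{\sigma^2T}{2\theta}\lambda_k^{-2\gamma-1}, \tag{3.14} \]
\[ \mathbb{E}\int_0^T(v^{(k)}_t)^2\,dt\asymp\frac{\sigma^2C^{(-)}_{T,\mu}}{2\mu}\lambda_k^{-2\gamma}, \tag{3.15} \]
\[ \mathbb{E}\int_0^Tv^{(k)}_t\bar x^{(k)}_t\,dt\asymp\frac{\sigma^2C^{(+)}_{T,\mu}}{2\theta}\lambda_k^{-2\gamma-1}, \tag{3.16} \]
\[ \mathbb{E}\int_0^T\Big(\int_0^t\bar x^{(k)}_r\,dr\Big)^2dt\asymp\frac{\sigma^2C^{(-)}_{T,\mu}}{2\mu\theta^2}\lambda_k^{-2\gamma-2}, \tag{3.17} \]
\[ \mathbb{E}\int_0^T\bar x^{(k)}_t\int_0^t\bar x^{(k)}_r\,dr\,dt\asymp\frac{\sigma^2(1-e^{-2\mu T})}{4\mu\theta^2}\lambda_k^{-2\gamma-2}. \tag{3.18} \]

(ii) As $N\to\infty$, we have a.s.

\[ \int_0^T\big\|(-A)^{s/2}\bar X^N_t\big\|^2dt\asymp\frac{\sigma^2T\Lambda^{s-2\gamma-1}}{2\theta(1+\beta(s-2\gamma-1))}N^{1+\beta(s-2\gamma-1)}, \tag{3.19} \]
\[ \int_0^T\big\|(-A)^{s/2}V^N_t\big\|^2dt\asymp\frac{\sigma^2C^{(-)}_{T,\mu}\Lambda^{s-2\gamma}}{2\mu(1+\beta(s-2\gamma))}N^{1+\beta(s-2\gamma)}, \tag{3.20} \]
\[ \int_0^T\big\langle(-A)^sV^N_t,\bar X^N_t\big\rangle\,dt\asymp\frac{\sigma^2C^{(+)}_{T,\mu}\Lambda^{s-2\gamma-1}}{2\theta(1+\beta(s-2\gamma-1))}N^{1+\beta(s-2\gamma-1)}, \tag{3.21} \]
\[ \int_0^T\Big\|\int_0^t(-A)^{s/2}\bar X^N_r\,dr\Big\|^2dt\asymp\frac{\sigma^2C^{(-)}_{T,\mu}\Lambda^{s-2\gamma-2}}{2\mu\theta^2(1+\beta(s-2\gamma-2))}N^{1+\beta(s-2\gamma-2)}, \tag{3.22} \]
\[ \int_0^T\Big\langle(-A)^s\bar X^N_t,\int_0^t\bar X^N_r\,dr\Big\rangle\,dt\asymp\frac{\sigma^2(1-e^{-2\mu T})\Lambda^{s-2\gamma-2}}{4\mu\theta^2(1+\beta(s-2\gamma-2))}N^{1+\beta(s-2\gamma-2)}, \tag{3.23} \]

whenever $s$ is such that the right-hand side diverges. All statements remain true if the left-hand side is replaced by its expected value.

(iii) Let $\eta>0$ and $s_0<s^*=1+2\gamma-1/\beta$. Assume $X_0\in H_{s^*+\eta}$. If $F$ satisfies $(F_{s,\eta})$ for all $s_0\le s<s^*$ and $X\in R(s_0)$ a.s., then (3.19) and (3.21) remain true if $\bar X^N$ is replaced by $X^N$.

Remark 3.5. Comparing (3.19) and (3.22), we see that $\int_0^\cdot\bar X_r\,dr$ exhibits more spatial regularity than $\bar X$, namely one derivative in the scale of Sobolev spaces $(H_s)_{s\in\mathbb{R}}$. From the point of view of a deterministic heat equation, one may expect that one temporal derivative corresponds to two spatial derivatives. This does not apply here due to nontrivial interactions with the noise.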

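Proof.

(i) First, (3.14), (3.15) and (3.16) follow from integrating the expressions (3.11), (3.12) and (3.13). Further, (3.17) and (3.18) are a direct consequence of $\int_0^t\bar x^{(k)}_r\,dr=(v^{(k)}_t-\bar x^{(k)}_t)/(\theta\lambda_k)$ and (3.14), (3.15) and (3.16).

(ii) First, if the left-hand side is replaced by its expected value, we use (i) together with $\lambda_k\asymp\Lambda k^\beta$ and the series expansion of every term, for example,

\[ \mathbb{E}\int_0^T\big\langle(-A)^sV^N_t,\bar X^N_t\big\rangle\,dt=\sum_{k=1}^N\lambda_k^s\,\mathbb{E}\int_0^Tv^{(k)}_t\bar x^{(k)}_t\,dt, \tag{3.24} \]

and similar expansions for the other terms. The claim is immediate in that case. It remains to prove that the claim is still true for every realization of the left-hand side outside a set of measure zero. Now, (3.19) and (3.20) follow directly from Lemma A.2 (ii), setting $X^*_k(t)=\lambda_k^{s/2}\bar x^{(k)}_t$ and $X^*_k(t)=\lambda_k^{s/2}v^{(k)}_t$, respectively, in the notation therein. For (3.21), the argument is similar: From Lemma 3.2, we see that

\[ A_k:=\sup_{0\le r,t\le T}\mathbb{E}[v^{(k)}_rv^{(k)}_t]\lesssim\lambda_k^{-2\gamma},\qquad B_k:=\sup_{0\le r,t\le T}\mathbb{E}[v^{(k)}_r\bar x^{(k)}_t]\lesssim\lambda_k^{-2\gamma-1},\qquad C_k:=\sup_{0\le t\le T}\int_0^t\mathbb{E}[\bar x^{(k)}_r\bar x^{(k)}_t]\,dr\lesssim\lambda_k^{-2\gamma-2}. \]

Set $Y_k=\lambda_k^s\int_0^Tv^{(k)}_t\bar x^{(k)}_t\,dt$. Then by means of the Wick theorem [Jan97, Theorem 1.28], applied to the mixed moment $\mathbb{E}[v^{(k)}_tv^{(k)}_r\bar x^{(k)}_t\bar x^{(k)}_r]$,

\[ \begin{aligned} \mathrm{Var}(Y_k)&=\lambda_k^{2s}\int_0^T\int_0^T\Big(\mathbb{E}[v^{(k)}_tv^{(k)}_r\bar x^{(k)}_t\bar x^{(k)}_r]-\mathbb{E}[v^{(k)}_t\bar x^{(k)}_t]\,\mathbb{E}[v^{(k)}_r\bar x^{(k)}_r]\Big)\,dr\,dt\\ &=2\lambda_k^{2s}\int_0^T\int_0^t\Big(\mathbb{E}[v^{(k)}_tv^{(k)}_r]\,\mathbb{E}[\bar x^{(k)}_t\bar x^{(k)}_r]+\mathbb{E}[v^{(k)}_t\bar x^{(k)}_r]\,\mathbb{E}[v^{(k)}_r\bar x^{(k)}_t]\Big)\,dr\,dt\\ &\le2\lambda_k^{2s}\big(TA_kC_k+T^2B_k^2/2\big)\lesssim\lambda_k^{2s-4\gamma-2}. \end{aligned} \]

Now, we see that

\[ \sum_{N=1}^\infty\frac{\mathrm{Var}\,Y_N}{\big(\sum_{k=1}^N\mathbb{E}Y_k\big)^2}\lesssim\sum_{N=1}^\infty\frac{1}{N^2}<\infty, \]

and (3.21) follows from the strong law of large numbers [Shi96, Theorem IV.3.2]. Now, (3.22) and (3.23) follow from (3.19), (3.20), (3.21) via

\[ \theta\int_0^tA\bar X^N_r\,dr=\bar X^N_t-V^N_t. \tag{3.25} \]

(iii) By Lemma 3.3, condition $(F_{s,\eta})$ for $F$ and Proposition 2.4, $\tilde X\in R(s+\eta)$ for each $s<s^*$. The analogue of (3.19) follows as in the white noise case in Proposition 2.8. For the analogue of (3.21), let $s>s^*$ and $\epsilon>0$ with $\eta-2(s-s^*)<\epsilon<\eta$. Then

\[ \Big|\int_0^T\big\langle(-A)^sV^N_t,\tilde X^N_t\big\rangle\,dt\Big|\le\sqrt{\int_0^T\|V^N_t\|^2_{2s-s^*-\eta+\epsilon}\,dt}\,\sqrt{\int_0^T\|\tilde X^N_t\|^2_{s^*+\eta-\epsilon}\,dt}\lesssim N^{(1+\beta(2s-s^*-\eta+\epsilon-2\gamma))/2}=N^{1+\beta(s-2\gamma-1-\eta/2+\epsilon/2)} \]

by (3.20). Note that the latter exponent is positive due to $\epsilon>\eta-2(s-s^*)$. Furthermore, $1+\beta(s-2\gamma-1-\eta/2+\epsilon/2)<1+\beta(s-2\gamma-1)$, such that

\[ \int_0^T\langle V^N_t,X^N_t\rangle_s\,dt=\int_0^T\langle V^N_t,\bar X^N_t\rangle_s\,dt+\int_0^T\langle V^N_t,\tilde X^N_t\rangle_s\,dt\asymp\frac{\sigma^2C^{(+)}_{T,\mu}\Lambda^{s-2\gamma-1}}{2\theta(1+\beta(s-2\gamma-1))}N^{1+\beta(s-2\gamma-1)}. \]

This concludes the proof.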
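The passage from the mode-wise expansions in (i) to the sums in (ii) rests on an elementary power-sum asymptotic. As a worked illustration (a sketch, reading $\lambda_k\asymp\Lambda k^\beta$ as $\lambda_k/(\Lambda k^\beta)\to1$, and writing $p$ for a generic exponent that is not part of the original notation), the expected value of the left-hand side of (3.19) can be expanded as follows:

```latex
% Power sum: for \beta p > -1,
\sum_{k=1}^{N} \lambda_k^{p}
  \asymp \Lambda^{p} \sum_{k=1}^{N} k^{\beta p}
  \asymp \Lambda^{p}\, \frac{N^{1+\beta p}}{1+\beta p}.

% Applied to (3.14) with p = s - 2\gamma - 1:
\mathbb{E}\int_0^T \big\|(-A)^{s/2}\bar X^N_t\big\|^2 \,dt
  = \sum_{k=1}^{N} \lambda_k^{s}\, \mathbb{E}\int_0^T (\bar x^{(k)}_t)^2 \,dt
  \asymp \frac{\sigma^2 T}{2\theta} \sum_{k=1}^{N} \lambda_k^{s-2\gamma-1}
  \asymp \frac{\sigma^2 T \Lambda^{s-2\gamma-1}}{2\theta\,(1+\beta(s-2\gamma-1))}\,
         N^{1+\beta(s-2\gamma-1)}.
```

The condition $\beta p>-1$ is the requirement in Proposition 3.4 (ii) that the right-hand side diverges. The same computation with $p=s-2\gamma$ and $p=s-2\gamma-2$ gives the exponents and constants in (3.20), (3.22) and (3.23).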

Let $J$ denote the Bochner integral operator, i.e. $JZ(t)=\int_0^tZ_r\,dr$ for $Z\in L^1(0,T;H_s)$, $s\in\mathbb{R}$. It is desirable to transfer also (3.22) to the nonlinear case, i.e. to substitute $\bar X$ by $X$ therein. In order to do so, we have to strengthen the condition on $F$. In addition to $(F_{s,\eta})$ for $F$, we need:

$(F^J_{s,\eta})$ One of the following two conditions holds:

(i) $F$ satisfies $(F_{s,1+\eta})$.

(ii) There is an operator $G$ that satisfies $(F_{s,\eta})$ such that $JF=GJ$.

Lemma 3.6. Let $\eta>0$ and $s_0<s^*$ such that $F$ satisfies $(F_{s,\eta})$ and $(F^J_{s,\eta})$ for all $s_0\le s<s^*$. Assume a.s. $X\in R(s_0)$ and $X_0\in H_{s^*+\eta}$. Then (3.22) remains true if $\bar X^N$ is replaced by $X^N$. Furthermore, in this case, $(J\circ F)(X)\in R(s-1+\eta)$ for $s<s^*$.

Proof. Lemma 3.3 and Proposition 2.4 yield $\tilde X\in R(s+\eta)$ and therefore also $X\in R(s+\eta)$ for $s<s^*$. We distinguish the two cases from $(F^J_{s,\eta})$ and prove that $J\tilde X\in R(s+1+\eta)$, $s<s^*$, in either case:

(i) If, in fact, $F$ satisfies even $(F_{s,1+\eta})$, another application of Proposition 2.4 proves $\tilde X,J\tilde X\in R(s+1+\eta)$ for $s<s^*$.

(ii) If $JF=GJ$, where $G$ satisfies $(F_{s,\eta})$, we proceed as in the proof of Proposition 2.3: We know that $J\bar X\in R(s)$ for any $s<s^*+1$ due to Proposition 3.4. Let $s<s^*+1$ be such that $J\tilde X\in R(s)$; this is the case e.g. for $s=s^*$. Then also $JX\in R(s)$, and

\[ \begin{aligned} \big\|J\tilde X^N_t\big\|_{s+\eta}&\le\int_0^t\Big\|e^{(t-r)\theta A}\big(P_N(J\circ F)(X)(r)+X^N_0\big)\Big\|_{s+\eta}\,dr\\ &\lesssim\int_0^t(t-r)^{-1+\epsilon/2}\big\|(J\circ F)(X)(r)+X_0\big\|_{s-2+\eta+\epsilon}\,dr\\ &\lesssim\Big(\sup_{0\le r\le T}\|(G\circ J)(X)(r)\|_{s-2+\eta+\epsilon}+\|X_0\|_{s+\eta}\Big)\int_0^t(t-r)^{-1+\epsilon/2}\,dr\\ &\lesssim\Big(\sup_{0\le r\le T}\|JX_r\|_s+\|X_0\|_{s+\eta}\Big)\frac{2}{\epsilon}T^{\epsilon/2}<\infty, \end{aligned} \]

so $J\tilde X\in R(s+\eta)$. Iterating this argument proves $J\tilde X\in R(s+1+\eta)$ for all $s<s^*$.

In particular, for $s>2+2\gamma-1/\beta$ and any $\epsilon>0$:

\[ \int_0^T\Big\|\int_0^t(-A)^{s/2}\tilde X^N_r\,dr\Big\|^2dt\le\lambda_N^{s-s^*-1-\eta+\epsilon}\int_0^T\Big\|\int_0^t(-A)^{(s^*+1+\eta-\epsilon)/2}\tilde X^N_r\,dr\Big\|^2dt\lesssim\lambda_N^{s-2-2\gamma+1/\beta-\eta+\epsilon}\lesssim N^{1+\beta(s-2\gamma-2-\eta+\epsilon)}, \]

where we assume w.l.o.g. that the exponent is positive. As a consequence,

\[ \int_0^T\Big\|\int_0^t(-A)^{s/2}X^N_r\,dr\Big\|^2dt\asymp\int_0^T\Big\|\int_0^t(-A)^{s/2}\bar X^N_r\,dr\Big\|^2dt, \]

and (3.22) holds with $\bar X$ replaced by $X$. Finally, $AJ\tilde X\in R(s-1+\eta)$, thus $JF(X)=\tilde X-\theta AJ\tilde X-X_0\in R(s-1+\eta)$ for $s<s^*$.

3.1.2 The Maximum–Likelihood Approach

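Heuristically, the log-likelihood is given by [LS77, Section 7.6.4]:

\[ \begin{aligned} \ln\frac{d\mathbb{P}^{N,T}_{(\theta,\mu)}}{d\mathbb{P}^{N,T}_{(\theta_0,\mu_0)}}(X^N)={}&\frac{1}{\sigma^2}\int_0^T\big\langle a(\theta,\mu)-a(\theta_0,\mu_0),(-A)^{2\gamma}\,dX^N_t\big\rangle\\ &-\frac{1}{2\sigma^2}\int_0^T\big\langle a(\theta,\mu)-a(\theta_0,\mu_0),(-A)^{2\gamma}\big(a(\theta,\mu)+a(\theta_0,\mu_0)\big)\big\rangle\,dt\\ &-\frac{1}{\sigma^2}\int_0^T\big\langle a(\theta,\mu)-a(\theta_0,\mu_0),(-A)^{2\gamma}P_NF(X)(t)\big\rangle\,dt, \end{aligned} \tag{3.26} \]

where we abbreviate

\[ a(\theta,\mu)=\theta AX^N_t-\mu X^N_t+\mu X^N_0+\mu\theta\int_0^tAX^N_r\,dr+\mu\int_0^tP_NF(X)(r)\,dr. \]

As before, this is rigorous if $P_NF=FP_N$. Maximizing for the unknown parameter $\theta$ for known $\mu$ yields the maximum likelihood–type estimator

\[ \begin{aligned} \hat\theta^{\mathrm{ref}}_N={}&-\frac{\int_0^T\big\langle(-A)^{1+2\alpha}X^N_t+\mu\int_0^t(-A)^{1+2\alpha}X^N_r\,dr,\,dX^N_t\big\rangle}{\int_0^T\big\|(-A)^{1+\alpha}X^N_t+\mu\int_0^t(-A)^{1+\alpha}X^N_r\,dr\big\|^2dt}\\ &-\frac{\mu\int_0^T\big\langle(-A)^{1+2\alpha}\big(X^N_t+\mu\int_0^tX^N_r\,dr\big),\,X^N_t-X^N_0-\int_0^tP_NF(X)(r)\,dr\big\rangle\,dt}{\int_0^T\big\|(-A)^{1+\alpha}X^N_t+\mu\int_0^t(-A)^{1+\alpha}X^N_r\,dr\big\|^2dt}\\ &+\frac{\int_0^T\big\langle(-A)^{1+2\alpha}X^N_t+\mu\int_0^t(-A)^{1+2\alpha}X^N_r\,dr,\,P_NF(X)(t)\big\rangle\,dt}{\int_0^T\big\|(-A)^{1+\alpha}X^N_t+\mu\int_0^t(-A)^{1+\alpha}X^N_r\,dr\big\|^2dt}, \end{aligned} \tag{3.27} \]

whereas maximizing for unknown $\mu$ and known $\theta$ yields

\[ \hat\mu^{\mathrm{ref}}_N=-\frac{\int_0^T\big\langle(-A)^{2\alpha}V^N_t,dX^N_t\big\rangle}{\int_0^T\|(-A)^\alpha V^N_t\|^2dt}-\frac{\theta\int_0^T\big\langle(-A)^{1+2\alpha}V^N_t,X^N_t\big\rangle\,dt}{\int_0^T\|(-A)^\alpha V^N_t\|^2dt}+\frac{\int_0^T\big\langle(-A)^{2\alpha}V^N_t,P_NF(X)(t)\big\rangle\,dt}{\int_0^T\|(-A)^\alpha V^N_t\|^2dt}, \tag{3.28} \]

where

\[ V^N_t=X^N_t-X^N_0-\theta\int_0^tAX^N_r\,dr-\int_0^tP_NF(X)(r)\,dr \]

is a functional of $X^N$ and $P_NF(X)$.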

In both estimators, we substituted $\gamma$ by a contrast parameter $\alpha\in\mathbb{R}$, as before. Clearly, setting $\theta=\hat\theta^{\mathrm{ref}}_N$ and $\mu=\hat\mu^{\mathrm{ref}}_N$ in the above expressions leads to a (nonlinear) system of equations for the maximum likelihood estimators in the case that both parameters are unknown. However, we are interested in a hierarchical approach of first estimating $\theta$ independently of $\mu$ and secondly estimating $\mu$ based on the previous estimator of $\theta$, exploiting the asymptotic properties of the terms appearing in $\hat\theta^{\mathrm{ref}}_N$ and $\hat\mu^{\mathrm{ref}}_N$. This will be explained in detail in the following sections. The hierarchical approach is insightful for two reasons:

(i) From the point of view of model misspecification, the diffusivity estimators from Section 2.3 still work if the driving noise $V$ exhibits temporal correlation which is not accounted for in the model, as long as the temporal regularity of $X$ is not affected (as in the case of fractional or integrated noise, cf. Section 3.2).

(ii) The hierarchical approach is (at least asymptotically) as good as the direct approach in the following sense: Let $A=\Delta$ be the Laplacian. In dimension $d=1$, Theorem 3.7 below shows that the hierarchical approach leads to an estimator for $\theta$ which is agnostic to $\mu$, and which has the same asymptotic properties as the reference estimator $\hat\theta^{\mathrm{ref}}_N$ with known $\mu$. In $d=2$, the optimal convergence rate is still preserved. Further, by Theorem 3.10 below, a hierarchical estimator for $\mu$ behaves asymptotically as $\hat\mu^{\mathrm{ref}}_N$ with known $\theta$ whenever $d\le3$.

3.1.3 Diffusivity Estimation

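As explained above, it is reasonable to consider a simplified estimator for $\theta$ which is obtained by formally setting $\mu=0$ in $\hat\theta^{\mathrm{ref}}_N$:

\[ \hat\theta^{\mathrm{sim}}_N=-\frac{\int_0^T\big\langle(-A)^{1+2\alpha}X^N_t,dX^N_t\big\rangle}{\int_0^T\|(-A)^{1+\alpha}X^N_t\|^2dt}+\frac{\int_0^T\big\langle(-A)^{1+2\alpha}X^N_t,P_NF(X)(t)\big\rangle\,dt}{\int_0^T\|(-A)^{1+\alpha}X^N_t\|^2dt}. \tag{3.29} \]

Formally, this is the same estimator as $\hat\theta^{\mathrm{full}}_N$ in the white noise setting. In particular, $\hat\theta^{\mathrm{sim}}_N$ does not depend on $\mu$. Of course, the other estimators $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ from Section 2.3 can be used, too, with conditions on the excess regularity $\eta$ of $F$ as in Theorem 2.11.

Theorem 3.7. Let $\gamma>1/(2\beta)$. Assume that there is $\eta>0$ and $s_0<s^*$ such that $(F_{s,\eta})$ is true for $s_0\le s<s^*$, and that $X\in R(s_0)$ and $X_0\in H_{s^*+\eta}$ a.s. Let $\alpha>\gamma-(1+1/\beta)/4$.

(i) If $\mu$ is known, $\hat\theta^{\mathrm{ref}}_N$ is asymptotically normal as in the white noise case:

\[ N^{\frac{1+\beta}{2}}\big(\hat\theta^{\mathrm{ref}}_N-\theta\big)\xrightarrow{d}\mathcal{N}(0,\Sigma), \tag{3.30} \]

where $\Sigma$ is given by (2.30).

(ii) If $\beta>1$, then $\hat\theta^{\mathrm{sim}}_N$ is asymptotically normal as in (3.30). If $\beta=1$, then

\[ N\big(\hat\theta^{\mathrm{sim}}_N-\theta\big)\xrightarrow{d}\mathcal{N}(m,\Sigma), \tag{3.31} \]

where

\[ m=\Big(\mu+\frac{1}{2T}\big(1-e^{-2\mu T}\big)\Big)\frac{1+\beta(2\alpha-2\gamma+1)}{\Lambda(1+\beta(2\alpha-2\gamma))}. \tag{3.32} \]

If $\beta<1$, then

\[ N^\beta\big(\hat\theta^{\mathrm{sim}}_N-\theta\big)\xrightarrow{a.s.}m. \tag{3.33} \]

If $\eta>1+1/\beta$, then (3.30), (3.31), (3.33) for $\beta>1$, $\beta=1$, $\beta<1$, respectively, hold for $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ as well. If $\eta\le1+1/\beta$, then a.s. $\hat\theta^{\mathrm{part}}_N=\theta+o(N^{-a})$ for $a<\beta(\eta/2\wedge1)$, and the same is true for $\hat\theta^{\mathrm{lin}}_N$.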

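Proof.

(i) Plugging the dynamics of $X^N$ into (3.27), we obtain

$$\hat\theta^{\mathrm{ref}}_N - \theta = -\sigma\, \frac{\int_0^T \left\langle (-A)^{1+2\alpha-\gamma} X^N_t + \mu \int_0^t (-A)^{1+2\alpha-\gamma} X^N_r \,\mathrm{d}r,\ \mathrm{d}W^N_t \right\rangle}{\int_0^T \left\| (-A)^{1+\alpha} X^N_t + \mu \int_0^t (-A)^{1+\alpha} X^N_r \,\mathrm{d}r \right\|^2 \mathrm{d}t}.$$

Set $C_s = \sigma^2 T \Lambda^{s-2\gamma-1} / (2\theta(1 + \beta(s - 2\gamma - 1)))$, whenever $s > 1 + 2\gamma - 1/\beta$. By Proposition 2.4, $\tilde X \in R(s+\eta)$ for $s < s^*$. As $J$ maps $R(s+\eta)$ into itself, the same is true for $J\tilde X$. From Proposition 3.4 (ii), comparing the rates of (3.19) and (3.22), we get immediately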

s:= ZT

0(−A)s/2XN

t+µZt

(−A)s/2XN

rdr

dt(3.34)

≍ZT

0(−A)s/2¯

t+µZt

(−A)s/2¯

rdr

dt≍CsN1+β(s−2γ−1).

We write

θref

N−θ=: −σC1/2

2+4α−2γN1/2+β(2α−2γ+1/2)

2+2α

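such that $(M^N)_{N \in \mathbb{N}}$ is a sequence of local martingales with $\langle M^N \rangle_T \to 1$. According to Theorem A.1, it follows that $M^N \xrightarrow{d} \mathcal{N}(0, 1)$, and making use of Slutsky's lemma, we see that

$$N^{(1+\beta)/2}\,(\hat\theta^{\mathrm{ref}}_N - \theta) \xrightarrow{d} \mathcal{N}\!\left(0,\ \frac{\sigma^2 C_{2+4\alpha-2\gamma}}{C_{2+2\alpha}^2}\right). \tag{3.35}$$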

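(ii) We decompose $\hat\theta^{\mathrm{sim}}_N$ as follows:

$$\hat\theta^{\mathrm{sim}}_N - \theta = \mu\, \frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, V^N_t \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2 \,\mathrm{d}t} - \sigma\, \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, \mathrm{d}W^N_t \rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2 \,\mathrm{d}t}.$$

As before, the second term converges in distribution to $\mathcal{N}(0, \Sigma)$ with rate $N^{-(1+\beta)/2}$. Using Proposition 3.4 (iii), we have

$$N^{\beta}\, \mu\, \frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, V^N_t \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2 \,\mathrm{d}t} \xrightarrow{a.s.} \frac{\mu\, C^{(+)}_{T,\mu}\, (1 + \beta(2\alpha - 2\gamma + 1))}{T \Lambda\, (1 + \beta(2\alpha - 2\gamma))},$$

and the right-hand side equals $m$. This yields the claim in the cases $\beta > 1$ and $\beta = 1$. For $\beta < 1$, note that by Lemma 2.6, we have almost surely $\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, \mathrm{d}W^N_t \rangle \lesssim N^{1/2+\beta(1/2+2\alpha-2\gamma)}$, and the claim follows in this case as well. The statements regarding $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are straightforward, taking into account an additional bias term of order $N^{-b}$ for $b < \beta\eta/2$ as in Theorem 2.11, coming from the nonlinearity.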

Remark 3.8. $\hat\theta^{\mathrm{sim}}_N$ is identical to $\hat\theta^{\mathrm{full}}_N$ from Section 2.3. Thus, Theorem 3.7 is revelatory for the case that the true noise model is of Ornstein–Uhlenbeck type, but the diffusivity estimator is derived under a white noise assumption. In the reference case $\beta = 2/d$, this translates as follows: In $d = 1$, $\hat\theta^{\mathrm{sim}}_N$ is asymptotically normal with optimal convergence rate. In particular, $\hat\theta^{\mathrm{full}}_N$ from Section 2.3 is asymptotically robust to noise misspecification of Ornstein–Uhlenbeck type. In $d = 2$, $\hat\theta^{\mathrm{sim}}_N$ converges to a non-centered normal distribution, still with optimal rate. In $d \ge 3$, $\hat\theta^{\mathrm{sim}}_N$ is still consistent for $\theta$, but its convergence rate is no longer optimal.
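The constants appearing in the proof of Theorem 3.7 can be evaluated numerically. The sketch below computes $C_s$ and the variance $\sigma^2 C_{2+4\alpha-2\gamma} / C_{2+2\alpha}^2$ from (3.35), compares it with the expression obtained by cancelling common factors by hand (elementary algebra, not a formula quoted from the text), and encodes the three regimes of Remark 3.8 for $\beta = 2/d$. Function names and the parameter values in the usage are illustrative assumptions.

```python
def C(s, sigma, theta, T, Lam, beta, gamma):
    """C_s from the proof of Theorem 3.7; needs s > 1 + 2*gamma - 1/beta."""
    e = s - 2 * gamma - 1
    return sigma**2 * T * Lam**e / (2 * theta * (1 + beta * e))

def asymptotic_variance(alpha, sigma, theta, T, Lam, beta, gamma):
    """sigma^2 * C_{2+4a-2g} / C_{2+2a}^2, the variance in (3.35)."""
    args = (sigma, theta, T, Lam, beta, gamma)
    return sigma**2 * C(2 + 4 * alpha - 2 * gamma, *args) / C(2 + 2 * alpha, *args) ** 2

def closed_form(alpha, theta, T, Lam, beta, gamma):
    """Same quantity after cancelling sigma, T and powers of Lam by hand."""
    k = alpha - gamma
    return 2 * theta * (1 + beta * (2 * k + 1)) ** 2 / (T * Lam * (1 + beta * (4 * k + 1)))

def regime(d):
    """Qualitative behaviour of theta_sim for beta = 2/d, following Remark 3.8."""
    beta = 2 / d
    if beta > 1:
        return "centered normal limit, rate N^((1+beta)/2)"
    if beta == 1:
        return "non-centered normal limit, rate N"
    return "bias-dominated, rate N^beta"
```

Note that $\sigma$ drops out of the variance, so the limiting distribution in (3.35) does not depend on the noise intensity.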

3.1.4 Correlation Decay Estimation

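In contrast to the case of diffusivity estimation, we cannot just set the nuisance parameter $\theta$ in $\hat\mu^{\mathrm{ref}}_N$ to zero: According to Proposition 3.4, the term $\theta \int_0^t (-A)^{1+2\alpha} X^N_r \,\mathrm{d}r$ dominates the denominator of (3.28). As a consequence, estimation of $\mu$ depends on knowledge (or precise estimation) of $\theta$. In the sequel, we set

$$V^{\mathrm{full}}_t(\vartheta) := X_t - X_0 + \vartheta \int_0^t (-A) X_r \,\mathrm{d}r - \int_0^t F(X)(r) \,\mathrm{d}r, \tag{3.36}$$

$$V^{\mathrm{lin}}_t(\vartheta) := X_t - X_0 + \vartheta \int_0^t (-A) X_r \,\mathrm{d}r, \tag{3.37}$$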

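then $V^{\mathrm{full},N}(\vartheta) = P_N V^{\mathrm{full}}(\vartheta)$ and $V^{\mathrm{lin},N}(\vartheta) = P_N V^{\mathrm{lin}}(\vartheta)$ are given by the same terms, with $X$ and $F(X)$ replaced by $X^N$ and $P_N F(X)$. Further, we set for $\vartheta > 0$:

$$\hat\mu^{\mathrm{full}}_N(\vartheta) := -\frac{\int_0^T \langle (-A)^{2\alpha} V^{\mathrm{full},N}_t(\vartheta), \mathrm{d}X^N_t \rangle}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\vartheta)\|^2 \,\mathrm{d}t} - \vartheta\, \frac{\int_0^T \langle (-A)^{1+2\alpha} V^{\mathrm{full},N}_t(\vartheta), X^N_t \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\vartheta)\|^2 \,\mathrm{d}t} + \frac{\int_0^T \langle (-A)^{2\alpha} V^{\mathrm{full},N}_t(\vartheta), P_N F(X)(t) \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\vartheta)\|^2 \,\mathrm{d}t} \tag{3.38}$$

and

$$\hat\mu^{\mathrm{lin}}_N(\vartheta) := -\frac{\int_0^T \langle (-A)^{2\alpha} V^{\mathrm{lin},N}_t(\vartheta), \mathrm{d}X^N_t \rangle}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{lin},N}_t(\vartheta)\|^2 \,\mathrm{d}t} - \vartheta\, \frac{\int_0^T \langle (-A)^{1+2\alpha} V^{\mathrm{lin},N}_t(\vartheta), X^N_t \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{lin},N}_t(\vartheta)\|^2 \,\mathrm{d}t}. \tag{3.39}$$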
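The surrogate noise path $V^{\mathrm{lin},N}(\vartheta)$ from (3.37) is straightforward to compute mode by mode from discrete observations. The following sketch approximates the time integral by the trapezoidal rule; the data layout and the function name are assumptions for illustration. On a noise-free linear path the surrogate vanishes (up to quadrature error) exactly when $\vartheta$ equals the true diffusivity, which is used as a sanity check.

```python
import math

def v_lin(x, lam, dt, vartheta):
    """Discretized V^{lin,N}_t(vartheta) from (3.37).

    x[i][k] is the k-th Fourier mode at time i*dt (hypothetical layout),
    lam[k] the k-th eigenvalue of -A.
    """
    n_modes = len(lam)
    integral = [0.0] * n_modes      # running value of int_0^t lambda_k x_k(r) dr
    v = [[0.0] * n_modes]           # V_0 = 0
    for i in range(1, len(x)):
        row = []
        for k, l in enumerate(lam):
            integral[k] += 0.5 * dt * l * (x[i - 1][k] + x[i][k])  # trapezoidal rule
            row.append(x[i][k] - x[0][k] + vartheta * integral[k])
        v.append(row)
    return v

# Noise-free path with true diffusivity theta = 0.5.
theta, dt, n_steps = 0.5, 1e-3, 1000
lam = [1.0, 4.0]
path = [[math.exp(-theta * l * i * dt) for l in lam] for i in range(n_steps + 1)]
residual_true = max(abs(c) for row in v_lin(path, lam, dt, theta) for c in row)
residual_wrong = max(abs(c) for row in v_lin(path, lam, dt, 0.6) for c in row)
```

With a misspecified $\vartheta$ the residual is of order $|\vartheta - \theta|$ times the integrated term, which is the affine dependence exploited in the proof of Theorem 3.10.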

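Note that if $\vartheta = \theta$ is the true diffusivity, then $\hat\mu^{\mathrm{full}}_N(\theta) = \hat\mu^{\mathrm{ref}}_N$ as given in (3.28). If $\vartheta$ is close to $\theta$, then $V^{\mathrm{full}}(\vartheta)$ and $V^{\mathrm{lin}}(\vartheta)$ should be seen as an approximation of $V$. This is formalized as follows: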

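Lemma 3.9. Let $\eta > 0$, $s_0 \in \mathbb{R}$ such that $F$ satisfies $(F_{s,\eta})$ and $(F^J_{s,\eta})$ for $s_0 \le s < s^*$. Assume $X \in R(s_0)$ and $X_0 \in H_{s^*+\eta}$. Let $(\theta_N)_{N \in \mathbb{N}}$ be a sequence of estimators for $\theta$ which is a.s. consistent. Then, for $s > 2\gamma - 1/\beta$,

$$\int_0^T \left\| (-A)^{s/2} \left( V^{\mathrm{full},N}_t(\theta_N) - V_t \right) \right\|^2 \mathrm{d}t = o\!\left(N^{1+\beta(s-2\gamma)}\right). \tag{3.40}$$

In particular,

$$\int_0^T \left\| (-A)^{s/2} V^{\mathrm{full},N}_t(\theta_N) \right\|^2 \mathrm{d}t \asymp \int_0^T \left\| (-A)^{s/2} V_t \right\|^2 \mathrm{d}t \asymp C^{\mathrm{OU}}_s N^{1+\beta(s-2\gamma)}, \tag{3.41}$$

where $C^{\mathrm{OU}}_s = \sigma^2 C^{(-)}_{T,\mu} \Lambda^{s-2\gamma} / (2\mu(1 + \beta(s - 2\gamma)))$. The same statements are true for $V^{\mathrm{lin},N}(\theta_N)$.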

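Proof. Let $\epsilon > 0$. If $s > s^* - 1 + \eta - \epsilon$, using $JF(X) \in R(s^* - 1 + \eta - \epsilon)$ by Lemma 3.6, we have

$$\int_0^T \left\| (-A)^{s/2} \int_0^t P_N F(X)(r) \,\mathrm{d}r \right\|^2 \mathrm{d}t \le \lambda_N^{s-s^*+1-\eta+\epsilon} \int_0^T \left\| (-A)^{(s^*-1+\eta-\epsilon)/2} \int_0^t P_N F(X)(r) \,\mathrm{d}r \right\|^2 \mathrm{d}t \lesssim N^{\beta(s-s^*+1-\eta+\epsilon)} = N^{1+\beta(s-2\gamma-\eta+\epsilon)},$$

thus for $\epsilon$ sufficiently small, this grows slower than $N^{1+\beta(s-2\gamma)}$. If $s < s^* - 1 + \eta - \epsilon$, the left-hand side is even bounded uniformly in $N$. The case $s = s^* - 1 + \eta - \epsilon$ can be avoided by substituting $\epsilon \mapsto \epsilon/2$. Using that $X = X_0 + \theta JAX + JF(X) + V$, we see that

$$V^{\mathrm{full},N}_t(\theta) = V_t, \qquad V^{\mathrm{lin},N}_t(\theta) - V_t = \int_0^t F(X)(r) \,\mathrm{d}r,$$

and consequently,

$$\int_0^T \left\| (-A)^{s/2} \left( V^{\mathrm{lin},N}_t(\theta) - V^N_t \right) \right\|^2 \mathrm{d}t \lesssim \int_0^T \left\| (-A)^{s/2} \int_0^t P_N F(X)(r) \,\mathrm{d}r \right\|^2 \mathrm{d}t \ll_p N^{1+\beta(s-2\gamma)},$$

and the same estimate is trivially satisfied for $V^{\mathrm{full},N}$ instead of $V^{\mathrm{lin},N}$. Next, again by Lemma 3.6, we have

$$\int_0^T \left\| (-A)^{s/2} \left( V^{\mathrm{full},N}_t(\theta_N) - V^{\mathrm{full},N}_t(\theta) \right) \right\|^2 \mathrm{d}t = (\theta_N - \theta)^2 \int_0^T \left\| (-A)^{(s+2)/2} \int_0^t X^N_r \,\mathrm{d}r \right\|^2 \mathrm{d}t \asymp (\theta_N - \theta)^2\, \theta^{-2}\, C^{\mathrm{OU}}_s N^{1+\beta(s-2\gamma)},$$

and the same is true for $V^{\mathrm{lin},N}$ instead of $V^{\mathrm{full},N}$. As $\theta_N$ is a consistent estimator for $\theta$, the right-hand side is negligible compared to $N^{1+\beta(s-2\gamma)}$. Now (3.40) follows by simple norm estimates, and (3.41) is a direct consequence of (3.40).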

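Theorem 3.10. Let $\eta > 0$, $s_0 \in \mathbb{R}$ such that $F$ satisfies $(F_{s,\eta})$ and $(F^J_{s,\eta})$ for $s_0 \le s < s^*$. Let $X \in R(s_0)$ and $X_0 \in H_{s^*+\eta}$ almost surely. Let $\alpha > \gamma - 1/(4\beta)$.

(i) If $\beta > 1/2$, then

$$N^{1/2} \left( \hat\mu^{\mathrm{full}}_N(\hat\theta^{\mathrm{sim}}_N) - \mu \right) \xrightarrow{d} \mathcal{N}(0, \Sigma_\mu) \tag{3.42}$$

with

$$\Sigma_\mu = \frac{4\mu^2 (1 + \beta(2\alpha - 2\gamma))^2}{(e^{-2\mu T} - 1 + 2\mu T)(1 + \beta(4\alpha - 4\gamma))}. \tag{3.43}$$

(ii) If $\beta = 1/2$, then

$$N^{1/2} \left( \hat\mu^{\mathrm{full}}_N(\hat\theta^{\mathrm{sim}}_N) - \mu \right) \xrightarrow{d} \mathcal{N}(-m\mu/\theta, \Sigma_\mu), \tag{3.44}$$

where $m$ is given by (3.32).

(iii) If $\beta < 1/2$, then

$$N^{\beta} \left( \hat\mu^{\mathrm{full}}_N(\hat\theta^{\mathrm{sim}}_N) - \mu \right) \xrightarrow{a.s.} -\frac{m\mu}{\theta}. \tag{3.45}$$

If $F$ satisfies $(F_{s,\bar\eta})$ for some $\bar\eta > 1$ and all $s_0 \le s < s^*$, $\hat\mu^{\mathrm{lin}}_N(\hat\theta^{\mathrm{lin}}_N)$ is a consistent estimator for $\mu$. If $\bar\eta > 1 + 1/\beta$, (i), (ii), (iii) remain true for $\hat\mu^{\mathrm{lin}}_N(\hat\theta^{\mathrm{lin}}_N)$. Otherwise, $\hat\mu^{\mathrm{lin}}_N(\hat\theta^{\mathrm{lin}}_N) = \mu + o(N^{-b})$ for every $b < (\beta(\bar\eta - 1)/2) \wedge \beta$.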

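Proof. We write $\theta_N = \hat\theta^{\mathrm{sim}}_N$ for short. Expanding $\hat\mu^{\mathrm{full}}_N(\theta_N)$ by plugging in the dynamics of $X^N$, we see that

$$\hat\mu^{\mathrm{full}}_N(\theta_N) = \mu\, \frac{\int_0^T \langle (-A)^{2\alpha} V^{\mathrm{full},N}_t(\theta_N), V^N_t \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\theta_N)\|^2 \,\mathrm{d}t} - \sigma\, \frac{\int_0^T \langle (-A)^{2\alpha-\gamma} V^{\mathrm{full},N}_t(\theta_N), \mathrm{d}W^N_t \rangle}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\theta_N)\|^2 \,\mathrm{d}t} =: [I]_N - [II]_N. \tag{3.46}$$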

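For the second term, one would like to apply Theorem A.1 as in the previous cases. Note, however, that $\theta_N$ depends on the whole trajectory of $(X_t)_{0 \le t \le T}$, so the integrand in the stochastic integral is not adapted. Nonetheless, as $V^{\mathrm{full},N}_t(\theta_N)$ is an affine function of $\theta_N$, this issue is easy to avoid by decomposing the integrand as $V^{\mathrm{full},N}_t(\theta_N) = V^{\mathrm{full},N}_t(\theta) + (\theta_N - \theta) \int_0^t (-A) X^N_r \,\mathrm{d}r$:

$$\sigma\, \frac{\int_0^T \langle (-A)^{2\alpha-\gamma} V^{\mathrm{full},N}_t(\theta_N), \mathrm{d}W^N_t \rangle}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\theta_N)\|^2 \,\mathrm{d}t} = \sigma\, \frac{\int_0^T \langle (-A)^{2\alpha-\gamma} V^{\mathrm{full},N}_t(\theta), \mathrm{d}W^N_t \rangle}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\theta_N)\|^2 \,\mathrm{d}t} + (\theta_N - \theta)\, \sigma\, \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} \int_0^t X^N_r \,\mathrm{d}r, \mathrm{d}W^N_t \rangle}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\theta_N)\|^2 \,\mathrm{d}t} =: [IIa]_N + (\theta_N - \theta)[IIb]_N.$$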

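Now we rescale both terms separately with the square root of the quadratic variation processes of their stochastic integrals and apply Theorem A.1, using $\alpha > \gamma - 1/(4\beta)$ together with Lemma 3.9 and Proposition 3.4 (iii). This yields

$$N^{1/2} [IIa]_N \xrightarrow{d} \mathcal{N}\!\left(0,\ \sigma^2 C^{\mathrm{OU}}_{4\alpha-2\gamma} / (C^{\mathrm{OU}}_{2\alpha})^2\right), \tag{3.47}$$

$$N^{1/2} [IIb]_N \xrightarrow{d} \mathcal{N}\!\left(0,\ \sigma^2 C^{\mathrm{OU}}_{4\alpha-2\gamma} / (\theta\, C^{\mathrm{OU}}_{2\alpha})^2\right). \tag{3.48}$$

As $\theta_N \to \theta$ almost surely, (3.47) holds for $[II]_N$ instead of $[IIa]_N$ as well.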

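The term $[I]_N$ can be treated as follows:

$$[I]_N - \mu = \mu\, \frac{\int_0^T \langle (-A)^{2\alpha} V^{\mathrm{full},N}_t(\theta_N),\ V^N_t - V^{\mathrm{full},N}_t(\theta_N) \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\theta_N)\|^2 \,\mathrm{d}t}$$

$$= -(\theta_N - \theta)\, \mu\, \theta_N\, \frac{\int_0^T \| (-A)^{1+\alpha} \int_0^t X^N_r \,\mathrm{d}r \|^2 \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\theta_N)\|^2 \,\mathrm{d}t} - (\theta_N - \theta)\, \mu\, \frac{\int_0^T \langle (-A)^{1+2\alpha} X^N_t, \int_0^t X^N_r \,\mathrm{d}r \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\theta_N)\|^2 \,\mathrm{d}t} + (\theta_N - \theta)\, \mu\, \frac{\int_0^T \langle (-A)^{1+2\alpha} ( X^N_0 + \int_0^t P_N F(X)(r) \,\mathrm{d}r ), \int_0^t X^N_r \,\mathrm{d}r \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\theta_N)\|^2 \,\mathrm{d}t}$$

$$=: [Ia]_N + [Ib]_N + [Ic]_N.$$

For the first term,

$$[Ia]_N \asymp -(\theta_N - \theta)\, \mu\, \theta\, \frac{\int_0^T \| \int_0^t X^N_r \,\mathrm{d}r \|^2_{2+2\alpha} \,\mathrm{d}t}{\int_0^T \| V^{\mathrm{full},N}_t(\theta_N) \|^2_{2\alpha} \,\mathrm{d}t} \asymp -\frac{\mu}{\theta} (\theta_N - \theta).$$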

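For $[Ib]_N$, the Cauchy–Schwarz inequality is used: If $\alpha > \gamma + 1/4 - 1/(2\beta)$,

$$\left| \int_0^T \left\langle X^N_t, \int_0^t X^N_r \,\mathrm{d}r \right\rangle_{1+2\alpha} \mathrm{d}t \right| \le \sqrt{ \int_0^T \|X^N_t\|^2_{1/2+2\alpha} \,\mathrm{d}t \int_0^T \left\| \int_0^t X^N_r \,\mathrm{d}r \right\|^2_{3/2+2\alpha} \mathrm{d}t } \lesssim N^{1+\beta(2\alpha-2\gamma-1/2)} \ll_p N^{1+\beta(2\alpha-2\gamma)} \lesssim \int_0^T \left\| \int_0^t X^N_r \,\mathrm{d}r \right\|^2_{2+2\alpha} \mathrm{d}t.$$

If $\alpha < \gamma + 1/4 - 1/(2\beta)$, the left-hand side is bounded uniformly in $N$, and for $\alpha = \gamma + 1/4 - 1/(2\beta)$ replace $\alpha$ by $\alpha + 1/8$ under the square root term in an additional norm estimate, and continue as before. In any case, $|[Ib]_N| \ll_p [Ia]_N$. Finally, consider $[Ic]_N$. Let $0 < \epsilon < \eta$. We can neglect $X_0 \in H_{s^*+\eta}$, which has larger spatial regularity than $JF(X) \in R(s^* - 1 + \eta - \epsilon)$. Then, if $\alpha > \gamma - 1/(2\beta) + \eta/4 - \epsilon/4$, we have with the abbreviation $\bar r := 2 + 4\alpha - s^* + 1 - \eta + \epsilon$:

$$|[Ic]_N| \lesssim |\theta_N - \theta|\, \frac{\sqrt{ \int_0^T \| \int_0^t P_N F(X)(r) \,\mathrm{d}r \|^2_{s^*-1+\eta-\epsilon} \,\mathrm{d}t \int_0^T \| \int_0^t X^N_r \,\mathrm{d}r \|^2_{\bar r} \,\mathrm{d}t }}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{full},N}_t(\theta_N)\|^2 \,\mathrm{d}t} \lesssim |\theta_N - \theta|\, N^{-\frac{\beta}{2}(\eta - \epsilon)},$$

and $|[Ic]_N| \ll_p [Ia]_N$. The case $\alpha \le \gamma - 1/(2\beta) + \eta/4 - \epsilon/4$ is treated as before. Putting things together, we have shown $[I]_N - \mu \asymp -(\theta_N - \theta)\mu/\theta$. Now (i), (ii), (iii) for $\hat\mu^{\mathrm{full}}_N(\theta_N)$ follow from the asymptotic behavior of $\theta_N = \hat\theta^{\mathrm{sim}}_N$ from Theorem 3.7.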

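Now, tracing the proof for $\hat\mu^{\mathrm{lin}}_N(\theta_N)$ (with $V^{\mathrm{lin}}$ instead of $V^{\mathrm{full}}$ and $\theta_N = \hat\theta^{\mathrm{lin}}_N$ instead of $\theta_N = \hat\theta^{\mathrm{sim}}_N$), there are two additional bias terms that have to be controlled. First, (3.46) is replaced by

$$\hat\mu^{\mathrm{lin}}_N(\theta_N) = [I]_N - [II]_N - \frac{\int_0^T \langle (-A)^{2\alpha} V^{\mathrm{lin},N}_t(\theta_N), P_N F(X)(t) \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{lin},N}_t(\theta_N)\|^2 \,\mathrm{d}t} =: [I]_N - [II]_N - [III]_N$$

with $V^{\mathrm{lin}}$ instead of $V^{\mathrm{full}}$ in the definition of $[I]_N$ and $[II]_N$. $[II]_N$ is treated exactly as before. Due to $(F_{s,\bar\eta})$ we have $F(X) \in R(s - 2 + \bar\eta)$ for all $s < s^*$. Let w.l.o.g. $2\alpha - s^* + 2 - \bar\eta > 0$; this can be achieved by choosing $\bar\eta > 1$ smaller if necessary. Then, for any $\epsilon > 0$,

$$|[III]_N| \le \left( \frac{\int_0^T \|(-A)^{\alpha} P_N F(X)(t)\|^2 \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{lin},N}_t(\theta_N)\|^2 \,\mathrm{d}t} \right)^{1/2} \le \lambda_N^{(2\alpha - s^* + 2 - \bar\eta + \epsilon)/2} \left( \frac{\int_0^T \|(-A)^{(s^*-2+\bar\eta-\epsilon)/2} P_N F(X)(t)\|^2 \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{lin},N}_t(\theta_N)\|^2 \,\mathrm{d}t} \right)^{1/2}$$

$$\lesssim N^{-\frac{1}{2} - \frac{\beta}{2}(2\alpha - 2\gamma) + \frac{1}{2} + \frac{\beta}{2}(2\alpha - 2\gamma + 1 - \bar\eta + \epsilon)} = N^{-\frac{\beta}{2}(\bar\eta - 1 - \epsilon)},$$

and for $\bar\eta > 1$ and sufficiently small $\epsilon > 0$, this term converges to zero. If $\bar\eta > 1 + 1/\beta$, then even $N^{1/2} |[III]_N| \lesssim N^{-\beta(\bar\eta - 1 - 1/\beta - \epsilon)/2} \to 0$ for $N \to \infty$. Furthermore, the decomposition of $[I]_N$ changes: If we set w.l.o.g. $X_0 = 0$, then we obtain

$$[I]_N - \mu = [Ia]_N + [Ib]_N - \mu\, \frac{\int_0^T \langle (-A)^{2\alpha} V^{\mathrm{lin},N}_t(\theta_N), \int_0^t P_N F(X)(r) \,\mathrm{d}r \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{\alpha} V^{\mathrm{lin},N}_t(\theta_N)\|^2 \,\mathrm{d}t} =: [Ia]_N + [Ib]_N + [Id]_N,$$

where again $V^{\mathrm{full}}$ has been substituted by $V^{\mathrm{lin}}$ in every term. The term $[Id]_N$ is treated exactly as $[III]_N$ (note that, in fact, $JF(X)$ exhibits even larger spatial regularity than $F(X)$). The claims for the case $\bar\eta > 1 + 1/\beta$ are now immediate. For the remaining case $\bar\eta \le 1 + 1/\beta$, note that by Theorem 3.7, $\hat\theta^{\mathrm{lin}}_N = \theta + o(N^{-a})$ for $a < (\beta\bar\eta/2) \wedge \beta$, whereas $|[III]_N| = o(N^{-b})$ for $b < \beta(\bar\eta - 1)/2$. This concludes the proof.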

Remark 3.11.

(i) $\Sigma_\mu$ is minimal for $\alpha = \gamma$. In this case, $\Sigma_\mu = 4\mu^2 / (e^{-2\mu T} - 1 + 2\mu T)$. In particular, $\Sigma_\mu \asymp 2\mu/T$ for $\mu \to \infty$, i.e. for large $\mu$, the asymptotic variance grows linearly in $\mu$. Further, $\lim_{\mu \to 0} \Sigma_\mu = 2/T^2$, i.e. for small $\mu$, the asymptotic variance does not depend on $\mu$, but a large observation time $T$ is even more beneficial.

(ii) Similar to the case of diffusivity $\theta$, for finite-dimensional systems, the temporal correlation decay $\mu$ is not identifiable in finite time. For example, if $F = 0$, (3.26) describes the true likelihood for the $N$-dimensional system, and the measures on path space are absolutely continuous for different $\mu$.

(iii) If $X^N$ and $P_N F(X)$ are observed, both $\hat\mu^{\mathrm{full}}_N(\hat\theta^{\mathrm{sim}}_N)$ and $\hat\mu^{\mathrm{lin}}_N(\hat\theta^{\mathrm{lin}}_N)$ are valid estimators. If $P_N F(X)$ is not observed, only $\hat\mu^{\mathrm{lin}}_N(\hat\theta^{\mathrm{lin}}_N)$ is feasible.

(iv) From the proof of Theorem 3.10, it is clear that $\hat\mu^{\mathrm{ref}}_N = \hat\mu^{\mathrm{full}}_N(\theta)$ with the true diffusivity $\theta$ is asymptotically normal as in (3.42).

(v) In particular, the hierarchical approach using $\hat\mu^{\mathrm{full}}_N(\hat\theta^{\mathrm{sim}}_N)$ is asymptotically as good as the direct maximum-likelihood approach with known $\theta$ whenever $\beta > 1/2$. If $\beta = 2/d$, this means $d \le 3$.
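The limiting statements in Remark 3.11 (i) can be checked numerically from (3.43). The sketch below evaluates $\Sigma_\mu$ and probes the small-$\mu$ and large-$\mu$ regimes; the function name and the parameter values are illustrative assumptions. `math.expm1` is used only to avoid cancellation in $e^{-2\mu T} - 1$ for small $\mu T$.

```python
import math

def sigma_mu(mu, T, beta, alpha, gamma):
    """Asymptotic variance Sigma_mu from (3.43)."""
    k = alpha - gamma
    denom = math.expm1(-2 * mu * T) + 2 * mu * T   # e^{-2 mu T} - 1 + 2 mu T
    return 4 * mu**2 * (1 + 2 * beta * k) ** 2 / (denom * (1 + 4 * beta * k))

T = 1.0
small = sigma_mu(1e-4, T, beta=2.0, alpha=0.5, gamma=0.5)   # should be close to 2 / T^2
large = sigma_mu(1e4, T, beta=2.0, alpha=0.5, gamma=0.5)    # should be close to 2 mu / T
shifted = sigma_mu(1.0, T, beta=2.0, alpha=0.7, gamma=0.5)  # alpha != gamma
optimal = sigma_mu(1.0, T, beta=2.0, alpha=0.5, gamma=0.5)  # alpha == gamma
```

The comparison of `shifted` and `optimal` illustrates the minimality at $\alpha = \gamma$: the factor $(1 + 2\beta k)^2 / (1 + 4\beta k)$ with $k = \alpha - \gamma$ equals $1 + 4\beta^2 k^2 / (1 + 4\beta k)$, which is at least one.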

Example 3.12. We close this section with a short discussion on the validity of the additional condition $(F^J_{s,\eta})$.

(i) If $F$ satisfies $(F_{s,\eta})$ with excess regularity $\eta > 1$, then $(F^J_{s,\eta'})$ holds with $\eta' = \eta - 1$. In particular, the theory is applicable to reaction–diffusion equations as in Section 2.4.2.

(ii) If $F$ satisfies $(F_{s,\eta})$ and $JF = FJ$, then $(F^J_{s,\eta})$ holds. For example, if $F(Z) = (-A)^r Z$ for some $r < 2$.

(iii) Let $\mathcal{D}$ be a bounded domain with smooth boundary and $A = \Delta$ the Laplacian. Let $s > d/2$. We extend Remark 2.21 as follows: Given a (possibly time-dependent) vector field $v : \mathcal{D} \times [0, T] \to \mathbb{R}^d$ with components $v^{(i)} = J w^{(i)}$ for some $w^{(i)} \in R(s)$, consider the advection term $F(Z)(t) = \nabla \cdot (Z_t v_t)$. This term belongs to neither of the previous examples (i), (ii). We show that it satisfies $(F^J_{s,\eta})$ for any $\eta < 1$. To this end, we use the integration by parts formula $J(Jf \cdot g) = Jf \cdot Jg - J(f \cdot Jg)$ for $f, g \in R(s)$, where multiplication is understood pointwise for $x \in \mathcal{D}$.² Define $\bar G^{(i)}(Z) := v^{(i)} \cdot Z - J(w^{(i)} \cdot Z)$ and $G(Z) := \nabla \cdot \bar G(Z) = \sum_{i=1}^d \partial_{x_i} \bar G^{(i)}(Z)$. Then clearly $JF = GJ$. Further, $\bar G^{(i)}$ satisfies $(F_{s,\eta})$ for $\eta < 2$, thus $G$ satisfies $(F_{s,\eta})$ with $\eta < 1$.

² Note that for $f, g \in R(s)$ all terms appearing are well-defined, and for any $x \in \mathcal{D}$, the (multiplicative, bounded) point evaluation operator $\delta_x$ reduces the formula to the one-dimensional integration by parts.

3.2 The Case of Integrated Noise

In this section we consider the case that the solution process is driven by an integrated semimartingale. Such a process has pathwise Hölder regularity $3/2 - \epsilon$ in time for every $\epsilon > 0$. Apart from providing a non-standard noise model for semilinear SPDEs that is used in applications, this type of noise can arise from partially observed systems driven by Brownian noise, as explained in Remark 3.16.

A better understanding of the impact of different noise types on statistical questions may help to decide for (or against) them. This noise model provides a simple example of a model misspecification in which the natural estimator $\hat\theta^{\mathrm{lin}}_N$, derived under the assumption of martingale noise, is no longer consistent. This is proven in Theorem 3.15.

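For a model driven by an integrated noise term $\int_0^t W_r \,\mathrm{d}r$ rather than $W_t$, with dispersion operator $B = \sigma(-A)^{-\gamma}$, the resulting equation reads as

$$\mathrm{d}X_t = \theta A X_t \,\mathrm{d}t + F(X)(t) \,\mathrm{d}t + B W_t \,\mathrm{d}t \tag{3.49}$$

together with initial condition $X_0$. W.l.o.g. we assume that the Wiener process starts in zero; different (e.g. random) initial conditions can be absorbed into $F$. Note that $X$ is the solution to a random PDE of the form

$$\partial_t X_t = \theta A X_t + F(X)(t) + B W_t. \tag{3.50}$$

In particular, $Y_t := \partial_t X_t$ satisfies

$$\mathrm{d}Y_t = \theta A Y_t \,\mathrm{d}t + S_F(Y)(t) \,\mathrm{d}t + B \,\mathrm{d}W_t \tag{3.51}$$

with initial condition $Y_0 = \theta A X_0 + F(X)(0)$ (evaluate (3.50) at $t = 0$, using $W_0 = 0$), where

$$S_F := \partial_t \circ F \circ (J + X_0), \tag{3.52}$$

and $J$ is the Bochner integral operator $(JX)(t) = \int_0^t X_r \,\mathrm{d}r$. We make this precise as follows: Let $s \in \mathbb{R}$. For $f \in C(0, T; H_s)$ such that $\langle f, \Phi_k \rangle \in C^1(0, T; \mathbb{R})$ for $k \ge 1$, let $\partial_t f$ be given by $\langle \partial_t f, \Phi_k \rangle = \partial_t \langle f, \Phi_k \rangle$ whenever it exists. Define $C^1_\Phi(0, T; H_s) \subset C(0, T; H_s)$ to be the subspace of functions $f$ such that $\partial_t f$ exists in the previously explained sense and belongs to $L^\infty(0, T; H_s)$. It is clear that $J$ maps $L^\infty(0, T; H_s)$ into $C^1_\Phi(0, T; H_s)$, while $\partial_t$ maps $C^1_\Phi(0, T; H_s)$ into $L^\infty(0, T; H_s)$. For operators of the form $F : C^1_\Phi(0, T; H_0) \supset D(F) \to C^1_\Phi(0, T; H_0)$, (3.52) is meaningful.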

It is clear that the solution $X$ to (3.49) and $Y$ to (3.51) contain the same amount of statistical information, as both processes can be transferred into each other by means of the operators $J$ and $\partial_t$.

If the nonlinear operator $S_F$ satisfies $(F_{s,\eta})$, we can connect to the theory from Section 2.3. Note that if $X_0$ is random, $S_F$ will depend on the realization $\omega \in \Omega$, but this does not alter any of the (pathwise) arguments. Condition $(F_{s,\eta})$ for $S_F$ can be deduced naturally from similar conditions on $F$ itself, for example, in the case of reaction terms:

Lemma 3.13. Let $\mathcal{D} \subset \mathbb{R}^d$ be a bounded domain with smooth boundary, $A = \Delta$ the Laplacian operator and $F(X) = f(X)$ for a differentiable function $f : \mathbb{R} \to \mathbb{R}$. W.l.o.g. let $X$ satisfy Dirichlet boundary conditions. If $X_0 \in H_s$ and $f'$ satisfies $(F_{s,\eta})$ for some $s > d/2$ and any $0 < \eta < 2$, then the same is true for $S_F$.

Proof. In that case, $S_F(Z) = \partial_t f(JZ + X_0) = f'(JZ + X_0) Z$ for any $Z \in R(s)$. Choose $\epsilon > 0$ and a monotone function $g$ as in condition $(F_{s,\eta})$ for $f'$. W.l.o.g. let $s + \eta - 2 > d/2$ (otherwise substitute $\eta < 2$ by a larger value) and $\epsilon \le 2 - \eta$. Then:

$$\|S_F(Z)\|_{R(s+\eta+\epsilon-2)} \lesssim \sup_{0 \le t \le T} \|f'(JZ_t + X_0)\|_{s+\eta+\epsilon-2}\, \sup_{0 \le t \le T} \|Z_t\|_s \le \sup_{0 \le t \le T} g(\|JZ_t\|_s + \|X_0\|_s)\, \|Z\|_{R(s)} \le g(T \|Z\|_{R(s)} + \|X_0\|_s)\, \|Z\|_{R(s)},$$

which proves the claim.

Furthermore, if $\partial_t$ commutes with $F$, e.g. if $F$ itself is a linear differential operator acting in spatial direction, then $(F_{s,\eta})$ for $S_F$ immediately reduces to $(F_{s,\eta})$ for $F$.

In total, the whole theory as developed for noise of semimartingale type transfers if $Y = \partial_t X$ is considered instead of $X$. For example, a maximum likelihood-type estimator for the case of integrated white noise is given by

$$\hat\theta^{\mathrm{rescaled}}_N = -\frac{\int_0^T \langle (-A)^{1+2\alpha} \partial_t X^N_t, \mathrm{d}(\partial_t X^N)_t \rangle}{\int_0^T \|(-A)^{1+\alpha} \partial_t X^N_t\|^2 \,\mathrm{d}t} + \frac{\int_0^T \langle (-A)^{1+2\alpha} \partial_t X^N_t, P_N (S_F)(\partial_t X)(t) \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{1+\alpha} \partial_t X^N_t\|^2 \,\mathrm{d}t}. \tag{3.53}$$

It is obvious that the same reduction technique applies if $X$ is driven by integrated Ornstein–Uhlenbeck noise instead of integrated white noise.

Remark 3.14. Taking the time derivative of $X$ amounts to a rescaling of the Hölder regularity of $X$ to be $1/2 - \epsilon$ in time (for all $\epsilon > 0$), such that semimartingale theory can be applied. A similar approach is possible for fractional noise with Hurst index $0 < H < 1$. In this case, the temporal regularity rescaling can be done by applying a kernel instead of taking the derivative. Based on that observation, a Girsanov transform for SODEs driven by fractional Brownian motion $B^H$ can be derived by considering a surrogate semimartingale [NVV99, KLBR00, TV07, Mis08]. This allows for likelihood-based inference. In [CLP09, Cia10] this approach is used for parameter estimation for SPDEs driven by additive and multiplicative fractional noise.

In the case of integrated noise, it is interesting to see how model misspecification changes the behavior of the estimator. Namely, assume that $\hat\theta^{\mathrm{full}}_N$ is given as in Section 2.3, but the dynamics of $X$ is generated by integrated noise. Then even in the simplest possible case, i.e. if $X$ satisfies a linear equation with $X_0 = 0$, $\hat\theta^{\mathrm{full}}_N$ is not consistent:

Theorem 3.15. Let $X_0 = 0$, $F = 0$. It holds that $\hat\theta^{\mathrm{full}}_N \to 0$ almost surely.

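Proof. First note that

$$\hat\theta^{\mathrm{full}}_N - \theta = -\sigma\, \frac{\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, W^N_t \rangle \,\mathrm{d}t}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2 \,\mathrm{d}t}. \tag{3.54}$$

With $x^{(k)} = \langle X, \Phi_k \rangle$, it holds that

$$\mathrm{d}x^{(k)}_t = -\theta \lambda_k x^{(k)}_t \,\mathrm{d}t + \sigma \lambda_k^{-\gamma} W^{(k)}_t \,\mathrm{d}t$$

with independent Wiener processes $W^{(k)}$, and consequently,

$$x^{(k)}_t = \sigma \lambda_k^{-\gamma} \int_0^t e^{-\theta\lambda_k(t-r)} W^{(k)}_r \,\mathrm{d}r.$$

A straightforward calculation yields

$$\mathbb{E}[(x^{(k)}_t)^2] = 2\sigma^2 \lambda_k^{-2\gamma} \int_0^t \int_0^r e^{-\theta\lambda_k(t-r)} e^{-\theta\lambda_k(t-r')}\, r' \,\mathrm{d}r' \,\mathrm{d}r = \frac{2\sigma^2}{\theta^2 \lambda_k^{2\gamma+2}} \left( \frac{t}{2} - \frac{3}{4\theta\lambda_k} - \frac{e^{-2\theta\lambda_k t}}{4\theta\lambda_k} + \frac{e^{-\theta\lambda_k t}}{\theta\lambda_k} \right),$$

thus

$$\mathbb{E} \int_0^T (x^{(k)}_t)^2 \,\mathrm{d}t \asymp \frac{T^2 \sigma^2}{2\theta^2}\, \lambda_k^{-2\gamma-2}.$$

Summing over the first $N$ modes and using Lemma A.2 (ii), we get a.s.

$$\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2 \,\mathrm{d}t \asymp \frac{T^2 \sigma^2 \Lambda^{2\alpha-2\gamma}}{2\theta^2 (1 + \beta(2\alpha - 2\gamma))}\, N^{1+\beta(2\alpha-2\gamma)}. \tag{3.55}$$

Furthermore,

$$\mathbb{E}[x^{(k)}_t W^{(k)}_t] = \sigma \lambda_k^{-\gamma} \int_0^t e^{-\theta\lambda_k(t-r)}\, r \,\mathrm{d}r = \frac{\sigma}{\theta \lambda_k^{\gamma+1}} \left( t - \frac{1}{\theta\lambda_k} \left( 1 - e^{-\theta\lambda_k t} \right) \right),$$

and consequently,

$$\mathbb{E} \int_0^T x^{(k)}_t W^{(k)}_t \,\mathrm{d}t \asymp \frac{T^2 \sigma}{2\theta}\, \lambda_k^{-\gamma-1}.$$

By summing up, we obtain

$$\mathbb{E} \int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, W^N_t \rangle \,\mathrm{d}t \asymp \frac{T^2 \sigma \Lambda^{2\alpha-2\gamma}}{2\theta (1 + \beta(2\alpha - 2\gamma))}\, N^{1+\beta(2\alpha-2\gamma)}. \tag{3.56}$$

Finally, using the Wick theorem as in [Jan97, Theorem 1.28],

$$\mathrm{Var}\left( \int_0^T x^{(k)}_t W^{(k)}_t \,\mathrm{d}t \right) = \int_0^T \int_0^T \mathbb{E}[x^{(k)}_t x^{(k)}_r W^{(k)}_t W^{(k)}_r] \,\mathrm{d}r \,\mathrm{d}t - \left( \int_0^T \mathbb{E}[x^{(k)}_t W^{(k)}_t] \,\mathrm{d}t \right)^2$$

$$= \int_0^T \int_0^T \left( \mathbb{E}[x^{(k)}_t x^{(k)}_r]\, \mathbb{E}[W^{(k)}_t W^{(k)}_r] + \mathbb{E}[x^{(k)}_t W^{(k)}_r]\, \mathbb{E}[x^{(k)}_r W^{(k)}_t] \right) \mathrm{d}r \,\mathrm{d}t$$

$$\le 2 \int_0^T \int_0^T \sqrt{ r t\, \mathbb{E}[(x^{(k)}_r)^2]\, \mathbb{E}[(x^{(k)}_t)^2] } \,\mathrm{d}r \,\mathrm{d}t \le 2 T^2\, \mathbb{E} \int_0^T (x^{(k)}_t)^2 \,\mathrm{d}t \lesssim \lambda_k^{-2\gamma-2},$$

and in particular,

$$\sum_{N=1}^\infty \frac{\mathrm{Var}\left[ \lambda_N^{1+2\alpha-\gamma} \int_0^T x^{(N)}_t W^{(N)}_t \,\mathrm{d}t \right]}{\left( \mathbb{E} \int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, W^N_t \rangle \,\mathrm{d}t \right)^2} \lesssim \sum_{N=1}^\infty \frac{N^{\beta(4\alpha-4\gamma)}}{N^{2+\beta(4\alpha-4\gamma)}} < \infty,$$

such that by the strong law of large numbers [Shi96, Theorem IV.3.2], (3.56) holds a.s. for $\int_0^T \langle (-A)^{1+2\alpha-\gamma} X^N_t, W^N_t \rangle \,\mathrm{d}t$. Now, from (3.54), together with (3.55) and (3.56), we see that

$$\hat\theta^{\mathrm{full}}_N - \theta \xrightarrow{a.s.} -\theta,$$

which implies the claim.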
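The two closed-form moments used in the proof of Theorem 3.15 can be cross-checked against direct numerical quadrature of the defining integrals, using only $\mathrm{Cov}(W_r, W_{r'}) = r \wedge r'$. The following is a sketch with illustrative parameter values; the function names and the midpoint-rule discretization are assumptions, not part of the text. The last lines reproduce the limit $-\theta$ from the ratio of the constants in (3.56) and (3.55).

```python
import math

def second_moment(t, lam, theta, sigma, gamma):
    """Closed form of E[(x_t^{(k)})^2] from the proof."""
    a = theta * lam
    bracket = t / 2 - 3 / (4 * a) - math.exp(-2 * a * t) / (4 * a) + math.exp(-a * t) / a
    return 2 * sigma**2 / (theta**2 * lam ** (2 * gamma + 2)) * bracket

def cross_moment(t, lam, theta, sigma, gamma):
    """Closed form of E[x_t^{(k)} W_t^{(k)}] from the proof."""
    a = theta * lam
    return sigma / (theta * lam ** (gamma + 1)) * (t - (1 - math.exp(-a * t)) / a)

def second_moment_quadrature(t, lam, theta, sigma, gamma, n=400):
    """Midpoint rule for sigma^2 lam^{-2g} * double integral of e^{-a(t-r)} e^{-a(t-r')} min(r, r')."""
    a, h = theta * lam, t / n
    pts = [(i + 0.5) * h for i in range(n)]
    kern = [math.exp(-a * (t - r)) for r in pts]
    total = sum(kern[i] * kern[j] * min(pts[i], pts[j]) for i in range(n) for j in range(n))
    return sigma**2 * lam ** (-2 * gamma) * total * h * h

def cross_moment_quadrature(t, lam, theta, sigma, gamma, n=4000):
    """Midpoint rule for sigma lam^{-g} * integral of e^{-a(t-r)} r dr."""
    a, h = theta * lam, t / n
    total = sum(math.exp(-a * (t - (i + 0.5) * h)) * (i + 0.5) * h for i in range(n))
    return sigma * lam ** (-gamma) * total * h

params = dict(t=1.0, lam=4.0, theta=0.5, sigma=1.0, gamma=1.0)
m2_closed, m2_numeric = second_moment(**params), second_moment_quadrature(**params)
mx_closed, mx_numeric = cross_moment(**params), cross_moment_quadrature(**params)

# Ratio of the leading constants in (3.56) and (3.55), inserted into (3.54):
T, sigma, theta = 1.0, 1.0, 0.5
limit = -sigma * (T**2 * sigma / (2 * theta)) / (T**2 * sigma**2 / (2 * theta**2))
```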

Remark 3.16. Integrated noise appears naturally if one considers systems such that the first component is observed, but only the second component is driven by noise. More precisely, the linear system

$$\mathrm{d}X^O_t = \theta A_{11} X^O_t \,\mathrm{d}t + A_{12} X^U_t \,\mathrm{d}t, \tag{3.57}$$

$$\mathrm{d}X^U_t = A_{21} X^O_t \,\mathrm{d}t + A_{22} X^U_t \,\mathrm{d}t + B_2 \,\mathrm{d}W_t, \tag{3.58}$$

with $X^O_0 = 0$, $X^U_0 = 0$ and unknown $\theta$, can be formally rewritten as

$$\mathrm{d}X^O_t = \theta A X^O_t \,\mathrm{d}t + F(X^O)(t) \,\mathrm{d}t + B W_t \,\mathrm{d}t, \tag{3.59}$$

where $A = A_{11}$, $B = A_{12} B_2$ and $F(X) = A_{12} A_{22} A_{12}^{-1} X + A_{12} A_{21} JX - \theta A_{12} A_{22} A_{12}^{-1} A_{11} JX$. Depending on the form of $A_{11}$, $A_{12}$, $A_{21}$, $A_{22}$ and $B_2$, this reasoning can be made rigorous. In order to neglect $F$, the regularity of all terms appearing in the (linear) system (3.57), (3.58) can be evaluated directly, or it can be shown that $S_F = F$ satisfies $(F_{s,\eta})$. If either of these approaches is feasible, the reduction to the theory from Section 2.3 as described above is applicable.

The extension of this setting to semilinear systems is possible by a regularity argument as in the previous sections, decomposing both components into their linearization and nonlinear remainder.

3.3 Structure of the Dispersion Operator

Set $\bar B = \sigma(-A)^{-\gamma}$. We call $\bar B$ the reference dispersion operator. In all models we have considered so far, we used $\bar B$ as dispersion operator. Now we study to which extent this assumption can be relaxed. W.l.o.g. we study only the white noise case. More precisely, consider

$$\mathrm{d}X_t = \theta A X_t \,\mathrm{d}t + F(X_t) \,\mathrm{d}t + B(X_t) \,\mathrm{d}W_t \tag{3.60}$$

with initial condition $X_0$. Let $L_2(H)$ denote the space of Hilbert–Schmidt operators on $H$, with norm $\|\cdot\|_{HS}$. We demand that $B$ maps its domain $D(B) \subseteq H$ into $L_2(H)$. In direct analogy to (W), our standing assumption is well-posedness of (3.60) in the sense of a unique probabilistically and analytically weak solution in $C(0, T; H)$. Within the variational approach, as exposed in [LR15], this can be shown under Lipschitz and growth conditions on $B$ (and additional mild conditions on $F$). Here and in the sequel, we write $\tilde B(Z) = B(Z) - \bar B$ for the deviation from the reference dispersion operator, i.e. we consider lower order (possibly multiplicative) noise of the form

$$B(Z) = \sigma(-A)^{-\gamma} + \tilde B(Z). \tag{3.61}$$

In order to transfer the results from the reference case, $B$ must be asymptotically similar to $\bar B$ in the following sense:

$(N^\gamma_\eta)$ There is a locally bounded $b : [0, \infty) \to [0, \infty)$ such that for $Z \in H$ and $k \in \mathbb{N}$:

$$\left\| \tilde B(Z)^T \Phi_k \right\|^2_H \ll_p b(\|Z\|_H)\, \lambda_k^{-1-2\gamma-\eta}. \tag{3.62}$$

This is a natural condition, as shown in the next lemma:

This is a natural condition, as shown in the next lemma:

Lemma 3.17. Let η > 0. If for Z∈H,e

B(Z)is a linear bounded operator

mapping Hinto Hrfor some r > 1+2γ+ηsuch that the operator norm

satisfies

e

B(Z)

H→Hr≤b(∥Z∥H),(3.63)

then condition (Nγ

η)is satisfied.

Proof. In that case,

e

B(Z)TΦk

H≲e

B(Z)T(−A)r/2

H→H(−A)−r/2Φk2

≲(−A)r/2e

B(Z)

H→Hλ−r

Now (−A)r/2:Hr→His an isometry, and therefore

e

B(Z)TΦk

H≲e

B(Z)

H→Hr

λ−r

k≲b(∥Z∥H)λ−r

which implies the claim.

Example 3.18. Diagonal noise of the form $B(X)\Phi_k = b_k(X)\Phi_k$ for functions $b_k : H \to \mathbb{R}$, $k \in \mathbb{N}$. Such diagonal dispersion terms have been considered e.g. in [CCG20], or in [CKL20] in the context of space-only noise. Here, condition $(N^\gamma_\eta)$ simplifies to

$$\left| b_k(Z) / \lambda_k^{-\gamma} - \sigma \right|^2 \ll_p b(\|Z\|_H)\, \lambda_k^{-1-\eta}, \tag{3.64}$$

which amounts to fast asymptotic equivalence of the modes $b_k$ and $\lambda_k^{-\gamma}$.

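As before, let $\bar X$ be the solution to

$$\mathrm{d}\bar X_t = \theta A \bar X_t \,\mathrm{d}t + \bar B \,\mathrm{d}W_t \tag{3.65}$$

with $\bar X_0 = 0$, and $\tilde X := X - \bar X$. In order to control the regularity of $\tilde X$, we extend the splitting argument as follows: Define $\bar X^F$ to be the solution of

$$\mathrm{d}\bar X^F_t = \theta A \bar X^F_t \,\mathrm{d}t + B(X_t) \,\mathrm{d}W_t \tag{3.66}$$

with $\bar X^F_0 = 0$, such that $\tilde X^F := X - \bar X^F$ satisfies

$$\mathrm{d}\tilde X^F_t = \theta A \tilde X^F_t \,\mathrm{d}t + F(X_t) \,\mathrm{d}t \tag{3.67}$$

with $\tilde X^F_0 = 0$. It follows from Proposition 3.19 below that $\bar X^F$ is well-posed. Next, with $\bar X^B := \bar X$, the process $\tilde X^B := \bar X^F - \bar X^B$ satisfies

$$\mathrm{d}\tilde X^B_t = \theta A \tilde X^B_t \,\mathrm{d}t + \tilde B(X_t) \,\mathrm{d}W_t, \tag{3.68}$$

$\tilde X^B_0 = 0$. This means that the nonlinear process $\tilde X = \tilde X^F + \tilde X^B$ consists of two components, which contain the nonlinear behavior in the drift and the dispersion, respectively.

As before, we write $s^* = 1 + 2\gamma - 1/\beta$.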

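Proposition 3.19. Let $\gamma > 1/(2\beta)$ and $\eta > 0$. Under condition $(N^\gamma_\eta)$, we have $\tilde X^B \in R(s + \eta)$ for any $s < s^*$.

Proof. First, note that for any $Z \in H$, the operator $(-A)^{(s+\eta)/2} \tilde B(Z)$ is a Hilbert–Schmidt operator on $H$:

$$\left\| (-A)^{(s+\eta)/2} \tilde B(Z) \right\|^2_{HS} = \sum_{k=1}^\infty \left\| \tilde B(Z)^T (-A)^{(s+\eta)/2} \Phi_k \right\|^2_H = \sum_{k=1}^\infty \lambda_k^{s+\eta} \left\| \tilde B(Z)^T \Phi_k \right\|^2_H \lesssim b(\|Z\|_H) \sum_{k=1}^\infty \lambda_k^{s+\eta-1-2\gamma-\eta-\epsilon} \lesssim \sum_{k=1}^\infty \lambda_k^{-1/\beta-\epsilon} < \infty$$

for some $\epsilon > 0$ due to condition $(N^\gamma_\eta)$. In particular, using $X \in C(0, T; H)$ a.s., we see that $\tilde B^*_t := (-A)^{(s+\eta)/2} \tilde B(X_t)$ is uniformly bounded in $\|\cdot\|_{HS}$ for $t \in [0, T]$. Now, Theorem 4.2.4 of [LR15] implies that

$$\mathrm{d}Y_t = \theta A Y_t \,\mathrm{d}t + \tilde B^*_t \,\mathrm{d}W_t, \tag{3.69}$$

$Y_0 = 0$, has a unique solution in $C(0, T; H)$. As a consequence, $Y = (-A)^{(s+\eta)/2} \tilde X^B$, and the claim follows.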

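Together with Proposition 2.3 (with $\bar X$ replaced by $\bar X^F = \bar X + \tilde X^B$ therein), we immediately obtain:

Theorem 3.20. Let $\gamma > 1/(2\beta)$, $s \in \mathbb{R}$ and $\eta > 0$. Assume $X_0 \in H_{s+\eta}$. If $(F_{s,\eta})$ and $(N^\gamma_\eta)$ are true and $X \in R(s)$, then $\tilde X = \tilde X^F + \tilde X^B \in R(s + \eta)$ almost surely.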

Remark 3.21. If in the proof of Proposition 3.19 it can be shown that Y=

(−A)(s+η)/2e

XBhas continuous trajectories not only in Hbut also in V, then

we can conclude even e

XB∈R(s+η+1). In view of Lemma 3.17, this seems

natural: There, e

B(Z)must map Hinto Hrfor r > 1 + 2γ+η, whereas ¯

maps Hinto H2γ. In this sense, 1+ηshould be expected to be the “true” excess

regularity instead of η. However, in the general setting it is not clear if Y

has continuous trajectories in V, although there are sufficient criteria known

in literature. For example, in the case of additive noise B(Z)≡B, according

to Theorem 5.11 from [DPZ14] we have that Y∈C(0, T;V)if the integral

0t−2αetθA(−A)(s+η+1)/2(B−¯

B)2

HS dtis finite for some 0< α < 1/2.

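In particular, if $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are given by (2.24), (2.25) and (2.26), the results on diffusivity estimation from Theorem 2.11 transfer directly to the model studied in this section:

Theorem 3.22. Let $\gamma > 1/(2\beta)$, $\eta > 0 \vee (1/\beta - 1)$ and $s_0 \ge 0$, assume $X_0 \in H_{s^*+\eta}$ and $X \in R(s_0)$. Let $(N^\gamma_\eta)$ and $(F_{s,\eta})$ be true for $s_0 \le s < s^*$. Let $\alpha > \gamma - 1/4$. Then the following asymptotic statements are true:

(i) $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal as in (2.29), and if $\eta > 1 + 1/\beta$, the same is true for $\hat\theta^{\mathrm{part}}_N$, $\hat\theta^{\mathrm{lin}}_N$.

(ii) In the case $\eta \le 1 + 1/\beta$, $\hat\theta^{\mathrm{part}}_N$ and $\hat\theta^{\mathrm{lin}}_N$ are consistent in probability with convergence rate $N^{-a}$, $a < \beta\eta/2$, i.e.

$$N^a \left( \hat\theta^{\mathrm{part}}_N - \theta \right) \xrightarrow{P} 0, \qquad N^a \left( \hat\theta^{\mathrm{lin}}_N - \theta \right) \xrightarrow{P} 0. \tag{3.70}$$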

Proof. By Theorem 3.20, RT

0(−A)s/2XN

t2

Hdtsatisfies the asymptotics from

(2.20) whenever s > s∗. Now, by means of (Nγ

η)and X∈C(0, T;H),

0e

B(Xt)T(−A)1+2αXN

t

Hdt≤ZT

0 N

k=1

λ1+2α

kx(k)

te

B(Xt)TΦkH!2

≪psup

0≤t≤T

b(∥Xt∥H)ZT

0 N

k=1

λ1/2+2α−γ−η/2

kx(k)

t!2

≲N

k=1

λ1+4α−2γ−η

kZT

(x(k)

t)2dt

=NZT

0(−A)1

2+2α−γ−η

2XN

t

Hdt. (3.71)

In case α > γ +(η−1/β)/4we have 1+4α−2γ−η > s∗, and the latter term

is dominated by N2+β(4α−4γ−η). On the other hand, if α < γ + (η−1/β)/4,

the last integral converges, and the latter term is dominated by N. The case

α=γ+(η−1/β)/4can be ignored by substituting η7→ η−ϵfor some small

ϵ > 0. In any of these cases, the right-hand side is negligible compared to

N1+β(1+4α−4γ), where we take into account η > 1/β −1and α > γ −1/4.

In particular, using B(Xt) = σ(−A)−γ+e

B(Xt)and expanding the squared

norm, we have a.s.

0B(Xt)T(−A)1+2αXN

t2

Hdt≍σ2ZT

0(−A)1+2α−γXN

t2

Hdt

≍σ2C2+4α−2γN1+β(1+4α−4γ)

because the condition η > 1/β −1ensures that 2 + β(4α−4γ−η)<

1+β(1+4α−4γ), i.e. the remaining terms are of lower order. Consequently,

the local martingale

T:= C−1/2

2+4α−2γσ−1N−1/2−β(1+4α−4γ)/2ZT

0B(Xt)T(−A)1+2αXN

t,dWt

is such that ⟨MN⟩T→1a.s., and according to Theorem A.1 and the Slutsky

lemma,

N1+β

2(ˆ

θfull

N−θ) = −N1+β

2RT

0B(Xt)T(−A)1+2αXN

t,dWN

t

0∥(−A)1+αXN

t∥2

Hdt

=−σC1/2

2+4α−2γN1+β(1+2α−2γ)

0∥(−A)1+αXN

t∥2

HdtMN

converges to a normal distribution with mean zero and variance as given by

(2.30) The remaining claims for ˆ

θpart

Nand ˆ

θlin

Nfollow verbatim as in Theorem

2.11 (note that the condition on αin this theorem are even more restrictive

than in Theorem 2.11).

Remark 3.23.

(i) In comparison with Theorem 2.11, there are two additional restrictions:

The excess regularity ηmust exceed 1/β −1, and further α > γ −1/4,

which is always stronger than the condition on αfrom Theorem 2.11.

Both inequalities are related to the control of the non–diagonal elements

in the noise term. In the setting of Example 3.18, both restrictions can

be avoided: Namely, in that case, (3.71) in the proof of Theorem 3.22

is substituted by

0e

B(Xt)T(−A)1+2αXN

t

Hdt

k=1

λ2+4α

kZT

0bk(Xt)−λ−γ

k2(x(k)

t)2dt

≪pZT

0(−A)1/2+2α−γ−η/2XN

t2

Hdt,

which is always dominated by N1+β(1+4α−4γ)if η > 0and α > γ −(1 +

1/β)/4, and the rest of the proof is identical. Note that if A= ∆ is

the Laplacian on a bounded domain, then β= 2/d, and the additional

condition η > 1/β −1is void in dimension d≤2.

(ii) If e

B(X)Tmaps Hsinto Hs+2γfor all s∈Rwith ∥e

B(X)T∥Hs→Hs+2γ≤

bs(∥X∥H)for some locally bounded bs, then it is straightforward to see

that for s∈R,(−A)1+2α−γX∈R(s)implies B(X)T(−A)1+2αX∈

R(s). In this case, (3.70) can be strengthened to almost sure conver-

gence using Lemma 2.6, as in the proof of Theorem 2.11.

Chapter 4

Discretization of the Spectral

Approach

In this section we adapt the spectral approach to the case that the observations consist of a set of point evaluations of the process $X$ in space instead of Fourier modes. It is determined how much spatial information is needed in order to reconstruct the spectral asymptotics from Theorem 2.11, depending on the regularity of the process.

By now, there is plenty of literature on statistical inference for SPDEs

based on spatially and/or temporally discretized observations. Various works

are based on the asymptotic analysis of power variations, either in time

[BT19, BT20, Cho20, Cho19, CD20, KU21a, KU21b], in space [CKL20,

CK22, SST20, CKP21], in time and space [PT07, CH20, MKT19a], or extend-

ing the approach to a combined spatiotemporal variation [HT21b, HT21a].

Within the spectral approach, however, there seems to be almost no rigorous

attempt to quantify the amount of spatial information needed to recover its

asymptotics for diffusivity estimation. We are aware only of [Hue93, p. 44ff.],

where this topic is sketched shortly (but without rigorous proof) in d= 1,

using first-order integral approximations. On the other hand, [CDVK20] con-

siders the discretization in time of the maximum likelihood estimator from

the spectral approach.

As in the previous sections, we consider a semilinear SPDE of the form

$$\mathrm{d}X_t = \theta A X_t \,\mathrm{d}t + F(X)(t) \,\mathrm{d}t + B \,\mathrm{d}W_t \tag{4.1}$$

with initial condition $X_0$, where $A$ is a closed, densely defined, negative definite and self-adjoint operator with compact resolvent, $B = \sigma(-A)^{-\gamma}$ is of Hilbert–Schmidt type, and $F$ satisfies $(F_{s,\eta})$ for some $\eta > 0$ and $s_0 \le s < s^*$. W.l.o.g. we set $\sigma = 1$. We always assume that (4.1) is well-posed with $X \in R(s)$ for $s < s^*$. As we are interested in spatially discrete point evaluations, we have to make the abstract setting from Chapter 2 more specific: Let $\mathcal{D} \subset \mathbb{R}^d$ be a bounded domain with smooth boundary, such that the state space for $X$ is given by $H = L^2(\mathcal{D})$. For simplicity, we assume that $X$ satisfies Dirichlet boundary conditions. The eigenvalues $(\lambda_k)_{k \in \mathbb{N}}$ of $-A$ are assumed to satisfy

$$\lambda_k \asymp \Lambda k^{D/d} \tag{4.2}$$

for some $D > 0$, called the order of the operator $A$. It is well-known that (4.2) is true if $A$ is a (pseudo-)differential operator of order $D$ [Shu01]. For example, for $A = \Delta$ we have $D = 2$ and for $-\Delta^2$ we have $D = 4$, cf. Section 2.4. Note that $B$ is a Hilbert–Schmidt operator if and only if $\gamma > d/(2D)$.

For any $h > 0$, let $M_h \in \mathbb{N}$ and $(x^{(h)}_i)_{i=1,\dots,M_h} \subset \mathcal{D}$. Define the evaluation operator $E_h : C(\mathcal{D}) \to \mathbb{R}^{M_h}$ via $(E_h f)_i := f(x^{(h)}_i)$. Then each component of $E_h$ is a bounded multiplicative linear form on $C(\mathcal{D})$. We write $\langle \cdot, \cdot \rangle_{(h)}$ for the Euclidean scalar product on $\mathbb{R}^{M_h}$.

In order to apply $E_h$ to (4.1), we need $AX$ to have values in $C(\mathcal{D})$. Therefore, we make the standing assumption

$$ s^* > 2, \qquad AX \in L^\infty(0, T; C(\mathcal{D})). \tag{4.3} $$

For example, if $A = \Delta$, the latter condition holds if $s^* > d/2 + 2$, i.e. $\gamma > 1/2 + d/2$, by means of the Sobolev embedding theorem. However, in many situations it is not necessary to use the Sobolev embedding theorem in order to prove continuity in space. For example, if the eigenfunctions $\Phi_k$ are uniformly bounded in $x \in \mathcal{D}$, $k \in \mathbb{N}$, then $s^* > 2$ already implies $AX \in L^\infty(0, T; C(\mathcal{D}))$, see Lemma 5.5 and Proposition 5.6 below. We note that for general $A$, $s^* > 2$ if and only if $\gamma > d/(2D) + 1/2$.

Similarly, we always assume that the terms $E_h F(X)$ and $E_h BW$ are well-defined. If $A = \Delta$, the former is true e.g. if $F$ is of the form $F(X) = f(X)$ for a function $f : \mathbb{R} \to \mathbb{R}$ by (4.3), and the latter can be enforced e.g. by imposing the additional bound $\gamma > d/2$, such that $BW \in L^2(0, T; W^{d/2,2}(\mathcal{D}))$, and spatial point evaluations are well-defined again by means of the Sobolev embedding theorem. In this situation, $E_h X$ satisfies

$$ dE_h X_t = \theta E_h A X_t\,dt + E_h F(X)(t)\,dt + E_h B\,dW_t. \tag{4.4} $$

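Let $r^* > r_* \ge 0$ such that $r^* \ge s^* D/2$ and $r_* < (s^* - 1)D/2$. Fix a Banach space $B_r \subset H$ for each $r_* < r \le r^*$, such that $B_{r_2} \subset B_{r_1}$ for $r_1 < r_2$. Further, let $h^* > 0$. We need the following conditions:

(D0) For $r_* < r \le r^*$, $B_r$ is a Banach algebra, and $B_r \subseteq C(\mathcal{D})$.

(D1) For any $r_* < r \le r^*$:

$$ \|\Phi_k\|_{B_r} \lesssim k^{r/d}. \tag{4.5} $$

(D2) For any $0 < h < h^*$, there are real numbers $w_1^{(h)}, \dots, w_{M_h}^{(h)}$ such that for any $r_* < r \le r^*$ and $Z \in B_r$,

$$ \left| \int_{\mathcal{D}} Z\,dx - \sum_{i=1}^{M_h} w_i^{(h)} (E_h Z)_i \right| \lesssim h^r \|Z\|_{B_r}. \tag{4.6} $$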

Condition (D2) relies on higher order quadrature formulas, reflecting the regularity of $X$. In the examples, the scale $(B_r)_{r_* < r \le r^*}$ will consist of Hölder spaces or $L^2$-based Sobolev spaces. Since these spaces are supposed to measure the regularity of $X$, we need

$$ X \in L^\infty(0, T; B_r) \quad \text{for } r < s^* D/2. \tag{4.7} $$

This is immediate if $B_r$ coincides with $H_{2r/D}$, otherwise it has to be proven separately. In all our examples, this will be valid.

Remark 4.1. We emphasize that the index of the regularity space $H_s$ from Chapter 2 does not count spatial derivatives, but fractional powers of $(-A)$. If $D \neq 2$, this is not the same. This is why the factor $D/2$ arises e.g. in (D0) and related relations below.

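Denote by $W_h \in \mathbb{R}^{M_h \times M_h}$ the diagonal matrix with entries $(w_i^{(h)})_{1 \le i \le M_h}$ and set

$$ \mathcal{E}_h = W_h E_h. \tag{4.8} $$

Note that for $Z_1, Z_2 \in C(\mathcal{D})$, it holds $E_h(Z_1 Z_2) = E_h Z_1 E_h Z_2$ in the sense of componentwise multiplication, and therefore:

$$ \langle W_h E_h Z_1, E_h Z_2 \rangle_{(h)} = \sum_{i=1}^{M_h} w_i^{(h)} \big(E_h(Z_1 Z_2)\big)_i. \tag{4.9} $$

With that notation, a direct consequence of (D0), (D1) and (D2) is

$$ \left| \langle \Phi_k, Z \rangle - \langle W_h E_h \Phi_k, E_h Z \rangle_{(h)} \right| \lesssim h^r k^{r/d} \|Z\|_{B_r}. \tag{4.10} $$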

We use the following discretized version of $-A$:

$$ A^{(s/2)}_{h,N} := \sum_{k=1}^{N} \lambda_k^{s/2} (E_h \Phi_k)(E_h \Phi_k)^T. \tag{4.11} $$
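The objects introduced so far can be assembled numerically. The following is a minimal sketch, assuming $d = 1$, $\mathcal{D} = [0, 1]$, $A = \Delta$ with Dirichlet boundary conditions, a uniform interior grid and trapezoidal weights; the function names `dirichlet_grid` and `discretized_operator` are hypothetical and chosen for illustration only. In this particular setting the defect $\delta_{kl} - \langle W_h E_h \Phi_k, E_h \Phi_l \rangle_{(h)}$ vanishes up to rounding for $k, l \le M_h$, which is a special feature of the uniform sine grid and not of general evaluation schemes.

```python
import numpy as np

def dirichlet_grid(M, N, D=2.0):
    """Uniform interior grid on [0, 1] with the first N Dirichlet sine modes."""
    x = np.arange(1, M + 1) / (M + 1)                    # observation points x_i^{(h)}
    w = np.full(M, 1.0 / (M + 1))                        # quadrature weights w_i^{(h)}
    k = np.arange(1, N + 1)
    Phi = np.sqrt(2.0) * np.sin(np.pi * np.outer(x, k))  # column k-1 holds E_h Phi_k
    lam = (np.pi * k) ** D                               # eigenvalues lambda_k
    return x, w, Phi, lam

def discretized_operator(Phi, lam, s):
    """M x M matrix sum_k lam_k^s (E_h Phi_k)(E_h Phi_k)^T."""
    return (Phi * lam ** s) @ Phi.T

x, w, Phi, lam = dirichlet_grid(M=50, N=10)
gram = Phi.T @ (w[:, None] * Phi)   # entries <W_h E_h Phi_k, E_h Phi_l>_(h)
R = np.eye(10) - gram               # discretization defect of the first 10 modes
A1 = discretized_operator(Phi, lam, 1.0)
v = w * Phi[:, 2]                   # W_h E_h Phi_3
energy = v @ A1 @ v                 # reduces to lambda_3 because the defect vanishes
```

On non-uniform grids or with other weights, `R` is in general nonzero and is exactly the quantity controlled by (4.10).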

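Based on these considerations, we want to adapt the maximum-likelihood based estimator $\hat\theta^{\mathrm{lin}}_N$ to the case of spatially discrete observations. Its natural discrete analogue $\hat\theta^{\mathrm{discr}}_{h,N}$ is given in the present setting as follows:

$$ \hat\theta^{\mathrm{discr}}_{h,N} = - \frac{\int_0^T \big\langle A^{(1+2\alpha)}_{h,N} (\mathcal{E}_h X_t), d(\mathcal{E}_h X_t) \big\rangle_{(h)}}{\int_0^T \big\langle A^{(2+2\alpha)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h X_t \big\rangle_{(h)}\,dt}. \tag{4.12} $$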
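A hedged numerical sketch of (4.12) follows. It replaces the time integrals by Itô-type left-point sums on a uniform time grid, which is an assumption of this illustration (the text works with time-continuous observations), and it uses a noise-free heat equation on $[0, 1]$ with three active modes as a toy input. The function name `theta_discr` and all parameter values are hypothetical.

```python
import numpy as np

def theta_discr(obs, dt, w, Phi, lam, alpha):
    """Left-point (Ito-type) sum approximation of the discrete estimator.

    obs : array (n_t + 1, M) of point evaluations E_h X_{t_i}
    dt  : time step of the uniform time grid
    w   : quadrature weights w_i^{(h)}, shape (M,)
    Phi : array (M, N), column k-1 holds E_h Phi_k
    lam : eigenvalues lambda_k, shape (N,)
    """
    coef = (obs * w) @ Phi           # <W_h E_h Phi_k, E_h X_{t_i}>_(h)
    incr = np.diff(coef, axis=0)     # increments of these coefficients in time
    num = -np.sum(lam ** (1 + 2 * alpha) * coef[:-1] * incr)
    den = np.sum(lam ** (2 + 2 * alpha) * coef[:-1] ** 2) * dt
    return num / den

# Toy data: noise-free solution of dX = theta * Laplace(X) dt with three modes.
M, N, theta, dt, T = 40, 3, 0.5, 1e-4, 0.1
x = np.arange(1, M + 1) / (M + 1)
w = np.full(M, 1.0 / (M + 1))
k = np.arange(1, N + 1)
Phi = np.sqrt(2.0) * np.sin(np.pi * np.outer(x, k))
lam = (np.pi * k) ** 2
t = np.arange(0.0, T + dt / 2, dt)
modes = np.array([1.0, 0.5, 0.25]) * np.exp(-theta * np.outer(t, lam))
obs = modes @ Phi.T                  # point evaluations of X on the grid
estimate = theta_discr(obs, dt, w, Phi, lam, alpha=0.0)
```

In this noise-free toy case the estimate differs from `theta` only through the time discretization; with noise, the martingale term in the error decomposition enters as well.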

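The error decomposition of $\hat\theta^{\mathrm{discr}}_{h,N}$ reads as:

$$ \begin{aligned}
\hat\theta^{\mathrm{discr}}_{h,N}
&= - \frac{\theta \int_0^T \big\langle A^{(1+2\alpha)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h A X_t \big\rangle_{(h)}\,dt}{\int_0^T \big\langle A^{(2+2\alpha)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h X_t \big\rangle_{(h)}\,dt}
 - \frac{\int_0^T \big\langle A^{(1+2\alpha)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h F(X)(t) \big\rangle_{(h)}\,dt}{\int_0^T \big\langle A^{(2+2\alpha)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h X_t \big\rangle_{(h)}\,dt}
 - \frac{\int_0^T \big\langle A^{(1+2\alpha)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h B\,dW_t \big\rangle_{(h)}}{\int_0^T \big\langle A^{(2+2\alpha)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h X_t \big\rangle_{(h)}\,dt} \\
&= \theta + \theta\, \frac{I^{(2)}_{h,N}(2+2\alpha) - I_N(2+2\alpha) + I_N(2+2\alpha) - I^{(1)}_{h,N}(2+2\alpha)}{I^{(1)}_{h,N}(2+2\alpha)}
 - \frac{F^{(1+2\alpha)}_{h,N} - F^{(1+2\alpha)}_N + F^{(1+2\alpha)}_N}{I^{(1)}_{h,N}(2+2\alpha)}
 - \frac{\sqrt{\langle M^{(h,N)} \rangle_T}}{I^{(1)}_{h,N}(2+2\alpha)} \cdot \frac{M^{(h,N)}_T}{\sqrt{\langle M^{(h,N)} \rangle_T}}
\end{aligned} \tag{4.13} $$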

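where we abbreviate

$$ I_N(s) = \int_0^T \big\| (-A)^{s/2} X^N_t \big\|^2\,dt, \tag{4.14} $$

$$ I^{(1)}_{h,N}(s) = \int_0^T \big\langle A^{(s)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h X_t \big\rangle_{(h)}\,dt, \tag{4.15} $$

$$ I^{(2)}_{h,N}(s) = \int_0^T \big\langle A^{(s-1)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h (-A) X_t \big\rangle_{(h)}\,dt, \tag{4.16} $$

$$ F^{(1+2\alpha)}_N = \int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N F(X)(t) \big\rangle\,dt, \tag{4.17} $$

$$ F^{(1+2\alpha)}_{h,N} = \int_0^T \big\langle A^{(1+2\alpha)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h F(X)(t) \big\rangle_{(h)}\,dt, \tag{4.18} $$

$$ M^{(h,N)}_T = \int_0^T \big\langle A^{(1+2\alpha)}_{h,N} \mathcal{E}_h X_t, \mathcal{E}_h (-A)^{-\gamma}\,dW_t \big\rangle_{(h)}, \tag{4.19} $$

and $(M^{(h,N)}_t)_{t \ge 0}$ is a local martingale with quadratic variation

$$ \langle M^{(h,N)} \rangle_T = \int_0^T \big\| (-A)^{-\gamma} (\mathcal{E}_h)^* A^{(1+2\alpha)}_{h,N} \mathcal{E}_h X_t \big\|_H^2\,dt. \tag{4.20} $$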

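Proposition 4.2. Assume that (D0), (D1), (D2) hold. Let $R \ge 0$ and $s > s^*$.

(i) If we have

$$ h \ll_p N^{-\frac{2}{d} K^{(1)}_{d,D,R}(\gamma)}, \qquad K^{(1)}_{d,D,R}(\gamma) := \frac{4D\gamma + 2D - d + 2dR}{4D\gamma + 2D - 2d}, \tag{4.21} $$

then a.s.

$$ \big| I^{(1)}_{h,N}(s) - I_N(s) \big| \ll N^{-R} I_N(s), \tag{4.22} $$

and in particular, $I^{(1)}_{h,N}(s) \asymp I_N(s)$.

(ii) If we have

$$ h \ll_p N^{-\frac{2}{d} K^{(2)}_{d,D,R}(\gamma)}, \qquad K^{(2)}_{d,D,R}(\gamma) := \frac{4D\gamma - 2D - d + 2dR}{4D\gamma - 2D - 2d}, \tag{4.23} $$

then a.s.

$$ \big| I^{(2)}_{h,N}(s) - I_N(s) \big| \ll N^{-R} I_N(s), \tag{4.24} $$

and in particular, $I^{(2)}_{h,N}(s) \asymp I_N(s)$.

(iii) Let $\alpha > (\gamma - (1 + d/D)/4) \vee (\gamma - 1/2)$. If

$$ h \ll_p N^{-\frac{2}{d} K^{(M)}_{d,D}(\gamma)}, \qquad K^{(M)}_{d,D}(\gamma) := \frac{2D\gamma}{2D\gamma - d}, \tag{4.25} $$

then a.s. $\langle M^{(h,N)} \rangle_T \asymp I_N(2 + 4\alpha - 2\gamma)$ as $N \to \infty$.

(iv) Let $\alpha > (\gamma - (1 - d/D)/4) \vee (\eta/4 - 1) \vee (-\eta/4)$, where $\eta$ is as in $(F_{s,\eta})$. If $h \ll_p N^{-2K^{(2)}_{d,D,R}(\gamma)/d}$, then a.s.

$$ \big| F^{(1+2\alpha)}_{h,N} - F^{(1+2\alpha)}_N \big| \ll N^{-R} I_N(s). \tag{4.26} $$

It holds $K^{(1)}_{d,D,R}(\gamma) < K^{(2)}_{d,D,R}(\gamma)$. If $R = 0$, then $K^{(1)}_{d,D,R}(\gamma) < K^{(M)}_{d,D}(\gamma)$, and if $R > 1/2$, then $K^{(M)}_{d,D}(\gamma) < K^{(2)}_{d,D,R}(\gamma)$.

Note that the denominators in (4.21), (4.23), (4.25) are positive if and only if $s^* > 0$, $s^* > 2$, $s^* > 1$, resp., which is satisfied by (4.3).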
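The relations between the three constants can be checked by direct arithmetic. The sketch below evaluates (4.21), (4.23) and (4.25) for the illustrative choice $d = 1$, $D = 2$, $\gamma = 2$; the formula $s^* = 1 + 2\gamma - d/D$ is read off from the identity $s^* D/2 = D(1 + 2\gamma)/2 - d/2$ used in the proof below, and the function names are hypothetical.

```python
def s_star(d, D, gamma):
    """Regularity index s* = 1 + 2*gamma - d/D (as used in the proof)."""
    return 1 + 2 * gamma - d / D

def K1(d, D, R, gamma):
    """Constant from (4.21)."""
    return (4 * D * gamma + 2 * D - d + 2 * d * R) / (4 * D * gamma + 2 * D - 2 * d)

def K2(d, D, R, gamma):
    """Constant from (4.23)."""
    return (4 * D * gamma - 2 * D - d + 2 * d * R) / (4 * D * gamma - 2 * D - 2 * d)

def KM(d, D, gamma):
    """Constant from (4.25)."""
    return 2 * D * gamma / (2 * D * gamma - d)

d, D, gamma = 1, 2, 2.0
regularity = s_star(d, D, gamma)           # 4.5, so the standing assumption s* > 2 holds
k1_R0 = K1(d, D, 0.0, gamma)               # 19/18
k2_R0 = K2(d, D, 0.0, gamma)               # 11/10
km = KM(d, D, gamma)                       # 8/7
k2_R1 = K2(d, D, 1.0, gamma)               # 13/10
```

For these values, $K^{(1)} < K^{(2)}$, $K^{(1)}_{d,D,0} < K^{(M)}$ and $K^{(M)} < K^{(2)}_{d,D,1}$, in line with the statement above.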

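Proof. We write for $r', s', h > 0$, $N \in \mathbb{N}$ and $Z \in L^2(0, T; H)$:

$$ L^{(h,N)}_{s',r'}(Z) := h^{r'} \sum_{k=1}^{N} \lambda_k^{s'+r'/D} \int_0^T |\langle \Phi_k, Z_t \rangle|\,dt. \tag{4.27} $$

Now let $Z^{(1)} \in L^\infty(0, T; B_{r_1})$ and $Z^{(2)} \in L^\infty(0, T; B_{r_2})$. Then

$$ \int_0^T \big\langle A^{(s')}_{h,N} \mathcal{E}_h Z^{(1)}_t, \mathcal{E}_h Z^{(2)}_t \big\rangle_{(h)}\,dt = \sum_{k=1}^{N} \lambda_k^{s'} \int_0^T \big\langle W_h E_h \Phi_k, E_h Z^{(1)}_t \big\rangle_{(h)} \big\langle W_h E_h \Phi_k, E_h Z^{(2)}_t \big\rangle_{(h)}\,dt $$

as well as

$$ \int_0^T \big\langle (-A)^{s'} P_N Z^{(1)}_t, P_N Z^{(2)}_t \big\rangle\,dt = \sum_{k=1}^{N} \lambda_k^{s'} \int_0^T \big\langle \Phi_k, Z^{(1)}_t \big\rangle \big\langle \Phi_k, Z^{(2)}_t \big\rangle\,dt. $$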

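Consequently, using $|ab - AB| \le |a - A||b| + |a||b - B| + |a - A||b - B|$ for $a, b, A, B \in \mathbb{R}$, together with (4.10), we obtain

$$ \begin{aligned}
&\left| \int_0^T \big\langle (-A)^{s'} P_N Z^{(1)}_t, P_N Z^{(2)}_t \big\rangle\,dt - \int_0^T \big\langle A^{(s')}_{h,N} \mathcal{E}_h Z^{(1)}_t, \mathcal{E}_h Z^{(2)}_t \big\rangle_{(h)}\,dt \right| \\
&\quad \lesssim \sum_{k=1}^{N} \lambda_k^{s'} \int_0^T \Big( h^{r_1} k^{r_1/d} \big\| Z^{(1)}_t \big\|_{B_{r_1}} \big| \big\langle \Phi_k, Z^{(2)}_t \big\rangle \big|
 + h^{r_2} k^{r_2/d} \big\| Z^{(2)}_t \big\|_{B_{r_2}} \big| \big\langle \Phi_k, Z^{(1)}_t \big\rangle \big| \\
&\qquad\qquad + h^{r_1} k^{r_1/d} \big\| Z^{(1)}_t \big\|_{B_{r_1}} h^{r_2} k^{r_2/d} \big\| Z^{(2)}_t \big\|_{B_{r_2}} \Big)\,dt \\
&\quad \lesssim \sup_{0 \le t \le T} \big\| Z^{(1)}_t \big\|_{B_{r_1}} L^{(h,N)}_{s',r_1}(Z^{(2)}) + \sup_{0 \le t \le T} \big\| Z^{(2)}_t \big\|_{B_{r_2}} L^{(h,N)}_{s',r_2}(Z^{(1)}) \\
&\qquad + T \sup_{0 \le t \le T} \big\| Z^{(1)}_t \big\|_{B_{r_1}} \sup_{0 \le t \le T} \big\| Z^{(2)}_t \big\|_{B_{r_2}} h^{r_1+r_2} \sum_{k=1}^{N} \lambda_k^{s'} k^{(r_1+r_2)/d} \\
&\quad \lesssim L^{(h,N)}_{s',r_1}(Z^{(2)}) + L^{(h,N)}_{s',r_2}(Z^{(1)}) + h^{r_1+r_2} \sum_{k=1}^{N} k^{(s'D+r_1+r_2)/d} \\
&\quad \lesssim L^{(h,N)}_{s',r_1}(Z^{(2)}) + L^{(h,N)}_{s',r_2}(Z^{(1)}) + h^{r_1+r_2} N^{1+(s'D+r_1+r_2)/d}.
\end{aligned} $$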

Thus, in order to bound the approximation error stemming from spatial discretization, we have to control the terms $L^{(h,N)}_{s',r_1}(Z^{(2)})$, $L^{(h,N)}_{s',r_2}(Z^{(1)})$ and

$$ L^{\mathrm{rest}}_{s',r_1,r_2} := h^{r_1+r_2} N^{1+(s'D+r_1+r_2)/d}. \tag{4.28} $$

Based on this consideration, we prove the different cases separately. We will repeatedly use Jensen's inequality in the form $\sum_{k=1}^{N} a_k^{1/2} \le \big( N \sum_{k=1}^{N} a_k \big)^{1/2}$ (in fact, if $a_k \asymp k^r$ for some $r > -1$, then both sides grow as $N^{1+r/2}$). Further, note that for $0 < a \le A$, the function $(A + x + y)/(a + x)$ is decreasing in $x > -a$ and increasing in $y \in \mathbb{R}$. This implies the relation between $K^{(1)}_{d,D,R}(\gamma)$, $K^{(2)}_{d,D,R}(\gamma)$ and $K^{(M)}_{d,D}(\gamma)$ from the statement, as well as similar relations used in the estimates below.

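(i) In order to control $I^{(1)}_{h,N}(s)$, we set $Z^{(1)} = Z^{(2)} = X$. For $\epsilon > 0$, let $r = r_1 = r_2 := s^* D/2 - \epsilon$. W.l.o.g. assume that $\epsilon < D s^*$. Then

$$ \begin{aligned}
L^{(h,N)}_{s,r}(X) &\lesssim \sqrt{T}\, h^r \sum_{k=1}^{N} \left( \lambda_k^{2s+2r/D} \int_0^T (x^{(k)}_t)^2\,dt \right)^{1/2} \\
&\lesssim h^r \left( N \sum_{k=1}^{N} \lambda_k^{2s+2r/D} \int_0^T (x^{(k)}_t)^2\,dt \right)^{1/2} \\
&\lesssim h^r \left( N \int_0^T \big\| (-A)^{s+r/D} X^N_t \big\|^2\,dt \right)^{1/2} \\
&\lesssim h^r \left( N^{2 + \frac{D}{d}\left(2s + \frac{2r}{D} - 2\gamma - 1\right)} \right)^{1/2} = h^r N^{1 + \frac{D}{d}\left(s + \frac{r}{D} - \gamma - \frac{1}{2}\right)},
\end{aligned} $$

where we made use of Proposition 2.8. This is possible due to $0 < \epsilon < D s^*$, i.e. $s + r/D > s^*/2$. Now by assumption (4.21), we can choose $\epsilon > 0$ small enough such that

$$ h \ll N^{-\frac{2}{d} \cdot \frac{4D\gamma + 2D - d + 2Rd - 2\epsilon}{4D\gamma + 2D - 2d - 4\epsilon}}. \tag{4.29} $$

In particular, with $I_N(s) = \int_0^T \| (-A)^{s/2} X^N_t \|^2\,dt \sim N^{1 + D(s - 2\gamma - 1)/d}$, and $r = s^* D/2 - \epsilon = D(1 + 2\gamma)/2 - d/2 - \epsilon$, it follows that $L^{(h,N)}_{s,r}(X) \ll N^{-R} I_N(s)$. It remains to bound $L^{\mathrm{rest}}_{s,r,r}$: We have $L^{\mathrm{rest}}_{s,r,r} \ll N^{-R} I_N(s)$ whenever

$$ h \ll N^{-\frac{2}{d} \cdot \frac{4D\gamma + 2D - d + Rd - 2\epsilon}{4D\gamma + 2D - 2d - 4\epsilon}}, $$

and this follows from (4.29) for any $R \ge 0$. In total, we have shown that $| I^{(1)}_{h,N}(s) - I_N(s) | \lesssim N^{-R} I_N(s)$, and in particular, $I^{(1)}_{h,N}(s) = I_N(s) + (I^{(1)}_{h,N}(s) - I_N(s)) \asymp I_N(s)$.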

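(ii) Here, $Z^{(1)} = X$, $Z^{(2)} = (-A)X$, $r_1 = s^* D/2 - \epsilon$ and $r_2 = (s^* - 2)D/2 - \epsilon$ for some $\epsilon > 0$, where w.l.o.g. $\epsilon < D(s^* - 2)$. The terms can be controlled as follows:

$$ \begin{aligned}
L^{(h,N)}_{s-1,r_1}((-A)X) &\lesssim \sqrt{T}\, h^{r_1} \sum_{k=1}^{N} \left( \lambda_k^{2s-2+2r_1/D} \int_0^T (\lambda_k x^{(k)}_t)^2\,dt \right)^{1/2} \\
&\lesssim h^{r_1} \left( N \sum_{k=1}^{N} \lambda_k^{2s+2r_1/D} \int_0^T (x^{(k)}_t)^2\,dt \right)^{1/2},
\end{aligned} $$

which is the same term appearing in (i). As $K^{(1)}_{d,D,R}(\gamma) < K^{(2)}_{d,D,R}(\gamma)$, (4.23) implies that $L^{(h,N)}_{s-1,r_1}((-A)X) \ll N^{-R} I_N(s)$. Further,

$$ \begin{aligned}
L^{(h,N)}_{s-1,r_2}(X) &\lesssim \sqrt{T}\, h^{r_2} \sum_{k=1}^{N} \left( \lambda_k^{2s-2+2r_2/D} \int_0^T (x^{(k)}_t)^2\,dt \right)^{1/2} \\
&\lesssim h^{r_2} \left( N \sum_{k=1}^{N} \lambda_k^{2s-2+2r_2/D} \int_0^T (x^{(k)}_t)^2\,dt \right)^{1/2} \\
&\lesssim h^{r_2} \left( N \int_0^T \big\| (-A)^{s-1+r_2/D} X^N_t \big\|^2\,dt \right)^{1/2} \\
&\lesssim h^{r_2} \left( N^{2 + \frac{D}{d}\left(2s - 2 + \frac{2r_2}{D} - 2\gamma - 1\right)} \right)^{1/2} = h^{r_2} N^{1 + \frac{D}{d}\left(s - \gamma - \frac{3}{2} + \frac{r_2}{D}\right)}.
\end{aligned} $$

In the last line we have used $s > s^*$ and $\epsilon < D(s^* - 2)$, i.e. $s - 1 + r_2/D > s^*/2$, such that Proposition 2.8 is applicable. As before, we can choose $\epsilon > 0$ small enough such that

$$ h \ll N^{-\frac{2}{d} \cdot \frac{4D\gamma - 2D - d + 2dR - 2\epsilon}{4D\gamma - 2D - 2d - 4\epsilon}}. \tag{4.30} $$

Using $r_2 = (s^* - 2)D/2 - \epsilon = (-1 + 2\gamma)D/2 - d/2 - \epsilon$ and again the asymptotics of $I_N(s)$, we conclude $L^{(h,N)}_{s-1,r_2}(X) \ll N^{-R} I_N(s)$. Finally, with (4.28), it is clear that $L^{\mathrm{rest}}_{s-1,r_1,r_2} \ll N^{-R} I_N(s)$ whenever

$$ h \ll N^{-\frac{2}{d} \cdot \frac{4D\gamma - d + dR - 2\epsilon}{4D\gamma - 2d - 4\epsilon}}, \tag{4.31} $$

which is a consequence of (4.30) for $R \ge 0$. Now the claim follows as in (i).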

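(iii) First note that for $i, j \in \mathbb{N}$ and $R_{ij} := \delta_{ij} - \langle W_h E_h \Phi_i, E_h \Phi_j \rangle_{(h)}$, we have by (4.10) and (D1):

$$ |R_{ij}| \lesssim (h\, i^{1/d} j^{1/d})^q \tag{4.32} $$

for each $r_* < q \le r^*$. We expand the quadratic variation of $M^{(h,N)}$: For $j \in \mathbb{N}$, we have by definition of $A^{(1+2\alpha)}_{h,N}$:

$$ \big\langle \Phi_j, (-A)^{-\gamma} (\mathcal{E}_h)^* A^{(1+2\alpha)}_{h,N} \mathcal{E}_h X_t \big\rangle = \lambda_j^{-\gamma} \sum_{k=1}^{N} \lambda_k^{1+2\alpha} \langle \mathcal{E}_h \Phi_j, E_h \Phi_k \rangle_{(h)} \langle E_h \Phi_k, \mathcal{E}_h X_t \rangle_{(h)}, $$

and consequently,

$$ \begin{aligned}
\langle M^{(h,N)} \rangle_T &= \int_0^T \big\| (-A)^{-\gamma} (\mathcal{E}_h)^* A^{(1+2\alpha)}_{h,N} \mathcal{E}_h X_t \big\|^2\,dt \\
&= \sum_{j=1}^{\infty} \lambda_j^{-2\gamma} \int_0^T \left( \sum_{k=1}^{N} \lambda_k^{1+2\alpha} \langle W_h E_h \Phi_j, E_h \Phi_k \rangle_{(h)} \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)} \right)^2 dt \\
&= \sum_{j=1}^{\infty} \lambda_j^{-2\gamma} \sum_{k,l=1}^{N} \lambda_k^{1+2\alpha} \lambda_l^{1+2\alpha} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)} \langle W_h E_h \Phi_l, E_h X_t \rangle_{(h)}\,dt \\
&\qquad \times \langle W_h E_h \Phi_j, E_h \Phi_k \rangle_{(h)} \langle W_h E_h \Phi_j, E_h \Phi_l \rangle_{(h)} \\
&= \sum_{k=1}^{N} \lambda_k^{2+4\alpha-2\gamma} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)}^2\,dt \\
&\quad - 2 \sum_{k,l=1}^{N} \lambda_k^{1+2\alpha-2\gamma} \lambda_l^{1+2\alpha} R_{kl} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)} \langle W_h E_h \Phi_l, E_h X_t \rangle_{(h)}\,dt \\
&\quad + \sum_{k,l=1}^{N} \lambda_k^{1+2\alpha} \lambda_l^{1+2\alpha} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)} \langle W_h E_h \Phi_l, E_h X_t \rangle_{(h)}\,dt \sum_{j=1}^{\infty} \lambda_j^{-2\gamma} R_{jk} R_{jl} \\
&=: A_1 - 2A_2 + A_3.
\end{aligned} $$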

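We have $2 + 4\alpha - 2\gamma > s^*$ due to $\alpha > \gamma - (1 + d/D)/4$, and further, $K^{(1)}_{d,D,R}(\gamma) < K^{(M)}_{d,D}(\gamma)$ (with $R = 0$ in the first term), so part (i) yields

$$ A_1 = I^{(1)}_{h,N}(2 + 4\alpha - 2\gamma) \asymp I_N(2 + 4\alpha - 2\gamma). $$

It remains to find bounds for $A_2$ and $A_3$. We start with the latter term. For $r_* < q < D\gamma - d/2 = (s^* - 1)D/2$, (4.32) and the Cauchy–Schwarz inequality give

$$ |A_3| \lesssim h^{2q} \left( \sum_{k=1}^{N} \left( \lambda_k^{2+4\alpha} k^{2q/d} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)}^2\,dt \right)^{1/2} \right)^2 \sum_{j=1}^{\infty} \lambda_j^{-2\gamma} j^{2q/d}, $$

where the last sum is finite. Again using (i), it follows that

$$ \begin{aligned}
|A_3| &\lesssim h^{2q} N \sum_{k=1}^{N} \lambda_k^{2+4\alpha+2q/D} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)}^2\,dt \\
&\lesssim h^{2q} N I^{(1)}_{h,N}(2 + 4\alpha + 2q/D) \\
&\lesssim h^{2q} N I_N(2 + 4\alpha + 2q/D) \\
&\lesssim h^{2q} N^{2 + \frac{D}{d}\left(2 + 4\alpha + \frac{2q}{D} - 2\gamma - 1\right)},
\end{aligned} $$

where we have used $2 + 4\alpha + 2q/D > 2 + 4\alpha - 2\gamma > s^*$. Now choose $q = (s^* - 1)D/2 - \epsilon = D\gamma - d/2 - \epsilon$ for some $\epsilon > 0$, and in addition let $\epsilon$ be small enough such that by (4.25),

$$ h \ll N^{-\frac{2}{d} \cdot \frac{2D\gamma - \epsilon}{2D\gamma - d - 2\epsilon}}. \tag{4.33} $$

Then we immediately obtain $|A_3| \ll I_N(2 + 4\alpha - 2\gamma)$. The bound on $A_2$ is similar. With (4.32) and $r_* < q \le r^*$,

$$ |A_2| \lesssim h^q \sum_{k=1}^{N} \left( \lambda_k^{2+4\alpha-4\gamma} k^{2q/d} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)}^2\,dt \right)^{1/2} \sum_{l=1}^{N} \left( \lambda_l^{2+4\alpha} l^{2q/d} \int_0^T \langle W_h E_h \Phi_l, E_h X_t \rangle_{(h)}^2\,dt \right)^{1/2} =: h^q B_1 B_2. $$

The sums $B_1$ and $B_2$ are treated as before, using part (i). For $B_1$,

$$ \begin{aligned}
B_1 &\lesssim \left( N \sum_{k=1}^{N} \lambda_k^{2+4\alpha-4\gamma+2q/D} \int_0^T \langle W_h E_h \Phi_k, E_h X_t \rangle_{(h)}^2\,dt \right)^{1/2} \\
&\lesssim \left( N I^{(1)}_{h,N}(2 + 4\alpha - 4\gamma + 2q/D) \right)^{1/2} \\
&\lesssim \left( N^{2 + \frac{D}{d}\left(1 + 4\alpha - 6\gamma + \frac{2q}{D}\right)} \right)^{1/2} = N^{1 + \frac{D}{d}\left(\frac{1}{2} + 2\alpha - 3\gamma + \frac{q}{D}\right)},
\end{aligned} $$

if $2 + 4\alpha - 4\gamma + 2q/D > s^*$, which is the case for $q = s^* D/2 \le r^*$ due to $\alpha > \gamma - 1/2$. Similarly $B_2 \lesssim N^{1 + D(1/2 + 2\alpha - \gamma + q/D)/d}$. Therefore,

$$ |A_2| \lesssim h^q N^{2 + \frac{D}{d}\left(1 + 4\alpha - 4\gamma + \frac{2q}{D}\right)}. $$

Now (4.33) implies

$$ h \ll N^{-\frac{2}{d} \cdot \frac{2D\gamma + D}{2D\gamma - d + D}}. \tag{4.34} $$

With $q = s^* D/2 \le r^*$, we get $|A_2| \ll I_N(2 + 4\alpha - 2\gamma)$. Putting things together,

$$ \langle M^{(h,N)} \rangle_T = A_1 - 2A_2 + A_3 \asymp I_N(2 + 4\alpha - 2\gamma) - 2A_2 + A_3 \asymp I_N(2 + 4\alpha - 2\gamma). $$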

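(iv) Set $Z^{(1)} = X$, $Z^{(2)} = F(X)$, $r_1 = s^* D/2 - \epsilon$ and $r_2 = (s^* - 2 + \eta)D/2 - \epsilon$. Let $\epsilon$ be small enough such that

$$ h \ll N^{-\frac{2}{d} \cdot \frac{4D\gamma + 2D - d + 2dR - D\eta}{4D\gamma + 2D - 2d - 4\epsilon}}, \tag{4.35} $$

$$ h \ll N^{-\frac{2}{d} \cdot \frac{4D\gamma - 2D - d + D\eta + 2dR - 2\epsilon}{4D\gamma - 2D - 2d + 2D\eta - 4\epsilon}}, \tag{4.36} $$

$$ h \ll N^{-\frac{2}{d} \cdot \frac{8D\gamma - 2d + D\eta + 2dR - 4\epsilon}{8D\gamma - 4d + 2D\eta - 8\epsilon}}, \tag{4.37} $$

which is possible due to $h \ll_p N^{-2K^{(2)}_{d,D,R}(\gamma)/d}$. Since $\alpha > \eta/4 - 1$, it holds that $1 + 2\alpha + (r_1 - r_2)/D = 2 + 2\alpha - \eta/2 > 0$, and consequently,

$$ \begin{aligned}
L^{(h,N)}_{1+2\alpha,r_1}(F(X)) &\lesssim h^{r_1} \left( N \sum_{k=1}^{N} \lambda_k^{2+4\alpha+2r_1/D} \int_0^T \langle \Phi_k, F(X)(t) \rangle^2\,dt \right)^{1/2} \\
&= h^{r_1} \left( N \int_0^T \big\| (-A)^{1+2\alpha+\frac{r_1}{D}} F(X)(t) \big\|_H^2\,dt \right)^{1/2} \\
&\le h^{r_1} \left( N \lambda_N^{2+4\alpha+2(r_1-r_2)/D} \int_0^T \big\| (-A)^{\frac{r_2}{D}} F(X)(t) \big\|_H^2\,dt \right)^{1/2} \\
&\lesssim h^{r_1} N^{\frac{1}{2} + \frac{D}{d}\left(1 + 2\alpha + \frac{r_1 - r_2}{D}\right)} = h^{r_1} N^{\frac{1}{2} + \frac{D}{d}\left(1 + 2\alpha + \frac{2 - \eta}{2}\right)},
\end{aligned} $$

and a direct calculation using (4.35) and $r_1 = (1 + 2\gamma)D/2 - d/2 - \epsilon$ yields $L^{(h,N)}_{1+2\alpha,r_1}(F(X)) \ll N^{-R} I_N(2 + 2\alpha)$. Further, we have $\alpha > -\eta/4$, and if $\epsilon < (4\alpha + \eta)D/2$, it follows that $2 + 4\alpha + 2r_2/D > s^*$, thus

$$ \begin{aligned}
L^{(h,N)}_{1+2\alpha,r_2}(X) &\lesssim h^{r_2} \left( N \sum_{k=1}^{N} \lambda_k^{2+4\alpha+2r_2/D} \int_0^T (x^{(k)}_t)^2\,dt \right)^{1/2} \\
&= h^{r_2} \left( N \int_0^T \big\| (-A)^{1+2\alpha+\frac{r_2}{D}} X^N_t \big\|_H^2\,dt \right)^{1/2} \\
&\lesssim h^{r_2} N^{1 + \frac{D}{d}\left(\frac{1}{2} + 2\alpha - \gamma + \frac{r_2}{D}\right)},
\end{aligned} $$

and (4.36) together with $r_2 = (-1 + 2\gamma + \eta)D/2 - d/2 - \epsilon$ gives $L^{(h,N)}_{1+2\alpha,r_2}(X) \ll N^{-R} I_N(2 + 2\alpha)$. Finally, (4.37) can be reformulated as $L^{\mathrm{rest}}_{1+2\alpha,r_1,r_2} \ll N^{-R} I_N(2 + 2\alpha)$. This finishes the proof.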

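Motivated by the previous proposition, we define

$$ K_{d,D}(\gamma) := K^{(2)}_{d,D,\frac{1}{2}+\frac{D}{2d}}(\gamma) = \frac{4D\gamma - D}{4D\gamma - 2D - 2d}. $$

Theorem 4.3. In the setting of this section, assume (D0), (D1) and (D2), and further

$$ h \ll_p N^{-\frac{2}{d} K_{d,D}(\gamma)}. \tag{4.38} $$

With $\eta$ from $(F_{s,\eta})$, let $\alpha > (\gamma - (1 + d/D)/4) \vee (\gamma - 1/2) \vee (\eta/4 - 1) \vee (-\eta/4)$.

(i) If $\eta > 1 + d/D$, then

$$ N^{\frac{1}{2}+\frac{D}{2d}} \big( \hat\theta^{\mathrm{discr}}_{h,N} - \theta \big) \xrightarrow{d} \mathcal{N}(0, \Sigma) \tag{4.39} $$

as $N \to \infty$, $h \to 0$, where $\Sigma$ is given by (2.30).

(ii) If $\eta \le 1 + d/D$, then

$$ N^{a} \big( \hat\theta^{\mathrm{discr}}_{h,N} - \theta \big) \xrightarrow{P} 0 \tag{4.40} $$

for any $a < D\eta/(2d)$ as $N \to \infty$, $h \to 0$.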

Proof. This follows directly from the decomposition of $\hat\theta^{\mathrm{discr}}_{h,N}$ as in (4.13): By Theorem A.1 and Proposition 4.2 (iii), $M^{(h,N)}_T / (\langle M^{(h,N)} \rangle_T)^{1/2} \to \mathcal{N}(0, 1)$. Further, $N^{1/2+D/(2d)} (\langle M^{(h,N)} \rangle_T)^{1/2} / I^{(1)}_{h,N}(2 + 2\alpha) \to \Sigma^{1/2}$. Next, the term $F^{(1+2\alpha)}_N / I^{(2)}_{h,N}(2 + 2\alpha) \asymp F^{(1+2\alpha)}_N / I_N(2 + 2\alpha)$ is treated exactly as in Theorem 2.11, which leads to the case distinction $\eta > 1 + d/D$ compared to $\eta \le 1 + d/D$. Finally, all other terms from (4.13) converge to zero with rate $N^{-1/2-D/(2d)}$ by Proposition 4.2, and the claim follows from the Slutsky lemma.

Remark 4.4.

(i) For fixed $d \ge 1$ and $D > 0$, we have $K_{d,D}(\gamma) \to 1$ for $\gamma \to \infty$ (or equivalently $s^* \to \infty$). This means that for large spatial regularity of $X$, a spatial precision of order $h \ll_p N^{-2/d}$ is sufficient in order to transfer the asymptotic results from the classical spectral approach.

(ii) On a bounded domain in dimension $d$, one typically has $M \sim h^{-d}$ point observations. (For example, let $\mathcal{D}$ be a hypercube in $\mathbb{R}^d$, where the point evaluation grid consists of points which are aligned along the coordinate lines in an equidistant way.) In the large regularity setting from the previous comment, this number of point observations leads to the relation

$$ N^2 \ll_p M. \tag{4.41} $$

This relation does not depend on the dimension $d$. In this sense, a given resolution level $N$ can be recovered from a dimension-independent number of point observations (if the spatial regularity of $X$ is large) within the spectral approach.

(iii) On the level of the estimator for diffusivity, this means the following: In the setting of the previous comment, let additionally $F = 0$ for simplicity, such that $\hat\theta^{\mathrm{lin}}_N$ converges to $\theta$ with optimal rate $N^{-1/2-D/(2d)}$. In terms of the number $M$ of point observations, this corresponds to the convergence rate $M^{-1/4-D/(4d)}$ for $\hat\theta^{\mathrm{discr}}_{h,N}$ (neglecting terms of arbitrarily small polynomial order in $N$ in the relation $N^2 \ll_p M$). This rate is upper bounded by $M^{-1/4}$. While this bound decays rather slowly in $M$, it holds uniformly in $d$.¹ Therefore, sparse observations in a high-dimensional setting can still yield reasonable results.

(iv) Below, we explain how to further improve this bound on the convergence rate of $\hat\theta^{\mathrm{discr}}_{h,N}$ in $M$ by tightening (4.10).

(v) We point out that $I^{(1)}_{h,N}(s)$ need not be a good approximation for $I_N(s)$. In fact, according to Proposition 4.2, the absolute error $|I^{(1)}_{h,N}(s) - I_N(s)|$ may even diverge, but slowly compared to the energy $I_N(s)$ itself. The same is true for the other approximation terms.

(vi) We highlight that there is no assumption on the shape of $\mathcal{D}$ or the distribution of the point evaluations within $\mathcal{D}$ other than the integral approximation property from (D2). To the best of our knowledge, Theorem 4.3 is the first rigorous asymptotic result for diffusivity estimation based on such general discrete point evaluation schemes.
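The statements in items (i) to (iii) reduce to simple arithmetic on exponents, which the following sketch makes explicit. It assumes $M \sim h^{-d}$ and ignores the arbitrarily small polynomial corrections hidden in $\ll_p$; the function names are hypothetical.

```python
def K_dD(d, D, gamma):
    """Constant K_{d,D}(gamma) = (4*D*gamma - D) / (4*D*gamma - 2*D - 2*d)."""
    return (4 * D * gamma - D) / (4 * D * gamma - 2 * D - 2 * d)

def points_exponent(d, D, gamma):
    """Exponent b such that roughly M ~ N^b points are required.

    From h ~ N^{-(2/d) K} and M ~ h^{-d} one gets M ~ N^{2 K}.
    """
    return 2 * K_dD(d, D, gamma)

def rate_exponent_in_M(d, D):
    """Exponent a of the rate M^{-a} in the large regularity regime (N^2 ~ M)."""
    return 0.25 + D / (4 * d)

k_small = K_dD(d=1, D=2, gamma=2.0)      # 1.4
k_large = K_dD(d=1, D=2, gamma=1e6)      # close to 1
b_large = points_exponent(1, 2, 1e6)     # close to 2, i.e. N^2 << M
a_heat_1d = rate_exponent_in_M(1, 2)     # 0.75
a_heat_10d = rate_exponent_in_M(10, 2)   # 0.3, still above the uniform bound 1/4
```

The last two values illustrate that the exponent decreases with $d$ but never drops below $1/4$, which is the dimension-uniform bound mentioned in (iii).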

¹ Note, however, that for fixed $\gamma$, the deviation from the large spatial regularity setting still depends on $d$ in the term $K_{d,D}(\gamma)$, i.e. the regularity of $X$ needed to approach the large regularity regime grows with $d$.

Next, we consider different cases in which higher order approximation estimates allow us to connect to the assumptions from Theorem 4.3.

Example 4.5 (Quadrature formulas in $d = 1$). Let $L > 0$ and $\mathcal{D} = [0, L]$. Further, let $A = -(-\Delta)^{D/2}$. With Dirichlet boundary conditions, we have $\Phi_k(x) = \sqrt{2}\sin(\pi k x/L)$ and $\lambda_k = (\pi k/L)^D$. For $k \in \mathbb{N}_0$, equip the space of $k$ times differentiable functions $C^k(\mathcal{D})$ with the norm $\|f\|_{C^k} = \sum_{i=0}^{k} \|\partial_x^i f\|_\infty$. Further, for $r > 0$ let $\mathcal{C}^r(\mathcal{D})$ be the Hölder–Zygmund space with the norm $\|\cdot\|_{\mathcal{C}^r} = \|\cdot\|_\infty + |\cdot|_{\mathcal{C}^r}$. Here, $|f|_{\mathcal{C}^r} = \sup_{x \in \mathcal{D},\, 0 < h < 1 \text{ with } x + Kh \in \mathcal{D}} h^{-r} |\Delta_h^K f(x)|$ for any $K > r$, where $\Delta_h f = f(\cdot + h) - f$ is the difference operator. Different choices of $K > r$ lead to equivalent norms. For integer $r \in \mathbb{N}$, the spaces $C^r(\mathcal{D})$ and $\mathcal{C}^r(\mathcal{D})$ do not coincide, but the former is a subspace of the latter. See e.g. [Tri10a, Tri10b] or [GN15, Chapter 4] for further details on these spaces. We will use that the Hölder–Zygmund spaces can be identified as interpolation spaces between $C(\mathcal{D})$ and $C^k(\mathcal{D})$ for any $k \in \mathbb{N}$, see e.g. [Lun95, Chapter 1] for details. The scale of Banach spaces $(B_r)_{r > r_*}$ is given by $(\mathcal{C}^r(\mathcal{D}))_{r > 0}$ with $r_* = 0$. The upper bound $r^*$ is arbitrary. By Lemma 5.5 below, (4.7) is true. It is clear that the $\mathcal{C}^r(\mathcal{D})$ are Banach algebras [Tri10a, Section 2.8.3], so (D0) is trivially satisfied. Further, for $r > 0$,

$$ \begin{aligned}
\|\Phi_k\|_{\mathcal{C}^r} \lesssim 1 + |\Phi_k|_{\mathcal{C}^r} &\lesssim 1 + \sup_{x \in \mathbb{R},\, h > 0} h^{-r} \big| (\Delta_h^K \Phi_k)(x) \big| \\
&= 1 + \sup_{x' \in \mathbb{R},\, h' > 0} k^r (h')^{-r} \big| (\Delta_{h'}^K \Phi_1)(x') \big| \lesssim k^r,
\end{aligned} $$

where we substituted $x' = kx$ and $h' = kh$. Thus, (D1) is satisfied.

Fix $h^* > 0$. For $0 < h < h^*$, let $M_h \in \mathbb{N}$ such that $M_h \sim h^{-1}$ for $h \to 0$. Let $\pi^{(h)} = \{x_0^{(h)}, x_1^{(h)}, \dots, x_{M_h-1}^{(h)}\}$ be a partition of $M_h$ points in $[0, L]$. Let $E_h$ be the point evaluation operator associated to $\pi^{(h)}$. We consider quadrature formulas of the form $Q^{(h)}(f) = \sum_{i=1}^{M_h} w_i^{(h)} f(x_i^{(h)})$ for some weights $w_i^{(h)} \in \mathbb{R}$. Let $k^* \in \mathbb{N}$, $k^* > r^*$. Typically, $Q^{(h)}$ satisfies an error estimate of the form

$$ \left| \int_{\mathcal{D}} f\,dx - Q^{(h)}(f) \right| \lesssim M_h^{-k^*} \|f\|_{C^{k^*}} \tag{4.42} $$

for $f \in C^{k^*}(\mathcal{D})$. Examples include the composite Newton–Cotes formulas of order $k^*$ on equidistant partitions, or Gaussian quadrature formulas, where $\|\cdot\|_{C^{k^*}}$ can even be replaced by an $L^2$-based Sobolev norm. This is well-known, see for example [QSS00]. The right-hand side of (4.42) can be bounded by $h^{k^*} \|f\|_{C^{k^*}}$ (up to a constant), and the exact interpolation theorem [AF03, Theorem 7.23], applied to the operator $\int_{\mathcal{D}} \cdot\,dx - Q^{(h)}$, extends the resulting estimate to all $0 \le r \le r^*$, where the norm on the right-hand side of (4.42) is replaced by $\|\cdot\|_{\mathcal{C}^r}$. Thus (D2) holds with the weight matrix $W_h$ determined by the quadrature weights $w_i^{(h)}$. Consequently, Theorem 4.3 is applicable in this setting.
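An error estimate of the type (4.42) can be observed numerically. The sketch below, given for illustration only, builds the weights of the composite trapezoidal and composite Simpson rules on an equidistant partition of $[0, 1]$ and estimates the empirical convergence order by halving the mesh size; the test integrand $e^x$ and the function names are assumptions of this example.

```python
import numpy as np

def composite_rule(n, rule):
    """Nodes and weights of a composite Newton-Cotes rule with n subintervals of [0, 1]."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    if rule == "trapezoid":
        w = np.full(n + 1, h)
        w[[0, -1]] = h / 2
    elif rule == "simpson":
        if n % 2:
            raise ValueError("Simpson's rule needs an even number of subintervals")
        w = np.full(n + 1, 2 * h / 3)
        w[1::2] = 4 * h / 3
        w[[0, -1]] = h / 3
    else:
        raise ValueError(f"unknown rule: {rule}")
    return x, w

def quadrature_error(f, exact, n, rule):
    """Absolute error of the quadrature sum against the exact integral."""
    x, w = composite_rule(n, rule)
    return abs(exact - w @ f(x))

def observed_order(f, exact, n, rule):
    """Empirical order: log2 of the error ratio when the mesh size is halved."""
    return np.log2(quadrature_error(f, exact, n, rule)
                   / quadrature_error(f, exact, 2 * n, rule))

order_trap = observed_order(np.exp, np.e - 1.0, 16, "trapezoid")
order_simp = observed_order(np.exp, np.e - 1.0, 16, "simpson")
```

For this smooth integrand the observed orders are close to 2 and 4, respectively; the weights `w` play the role of the diagonal of $W_h$.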

Example 4.6 (Finite element method in $d \ge 2$). Let $A = -(-\Delta)^{D/2}$, set $B_r = H_{2r/D} = W^{r,2}(\mathcal{D})$, let $r_* = d/2$ and $r^* \in \mathbb{N}$ arbitrary. Condition (D0) is immediate, and for (D1), note that $\|\Phi_k\|_{B_r} = \|\Phi_k\|_{2r/D} = \lambda_k^{r/D} \lesssim k^{r/d}$. In order to describe the discretization operator $E_h$ and the approximation property (D2), we make use of results from the theory of finite elements. The finite element method is a standard approach from numerical analysis with a huge body of literature; we follow the exposition from [Cia02]. As we are interested in point evaluations, we only consider Lagrange finite elements. Let $K_0$ be a compact reference domain (typically a simplex or cube in $d$ dimensions) with non-empty interior. Fix $r^*$ points $y_0^{(1)}, \dots, y_0^{(r^*)} \in K_0$, and let $P_0$ be an $r^*$-dimensional space of polynomial functions defined on $K_0$, such that for $p_0 \in P_0$, $p_0(y_0^{(i)}) = 0$ for $1 \le i \le r^*$ implies $p_0 = 0$. Then there are $r^*$ polynomials $p_0^{(1)}, \dots, p_0^{(r^*)} \in P_0$ such that $p_0^{(i)}(y_0^{(j)}) = \delta_{ij}$, and the interpolation operator $\Pi_0 : C(K_0) \to P_0$, given by $(\Pi_0 f)(x) = \sum_{i=1}^{r^*} f(y_0^{(i)}) p_0^{(i)}(x)$, is well-defined and acts as the identity on $P_0$. Now we partition the domain $\mathcal{D}$ into a family $(K_j)_{j=1,\dots,L}$ of compact domains with open interior (which overlap only on their boundaries), such that there is a diffeomorphism $F_j : K_0 \to K_j$ for $1 \le j \le L$. Let $P_j$ consist of the pullback of functions $p_0 \in P_0$ via $F_j^{-1}$, i.e. $P_j = \{p_0 \circ F_j^{-1} \mid p_0 \in P_0\}$, and set $y_j^{(i)} := F_j(y_0^{(i)})$. The interpolation operator on $K_j$ is given by $\Pi_j f = (\Pi_0(f \circ F_j)) \circ F_j^{-1}$. Typically, the $F_j$ are affine functions, which leads to a partition of polygonal domains $\mathcal{D}$, but also curved elements $K_j$ are possible (see e.g. [Zlá73], [Cia02, Chapter 4.3]), which make it possible to handle a smooth boundary of $\mathcal{D}$. We assume mild compatibility criteria on the partition of $\mathcal{D}$: The images of the faces of $K_0$ under the $F_j$ have disjoint $(d-1)$-dimensional interior or coincide. Further, for $f \in C(\mathcal{D})$, the interpolation polynomials $\Pi_j f|_{K_j}$ coincide on the boundaries of the $K_j$, such that there is a well-defined interpolation operator $\Pi$ acting on $C(\mathcal{D})$. For each $1 \le j \le L$, let $h_j$ denote the diameter of $K_j$ and $\rho_j$ the diameter of the largest ball contained in $K_j$. We assume that there is a constant $C > 0$ such that $h_j/\rho_j \le C$ for all $j$. Finally, let $h := \max_{1 \le j \le L} h_j$ be the mesh size of the partition of $\mathcal{D}$.

Let $\{x_i^{(h)}\}_{1 \le i \le M_h}$ be the set of all $M_h$ points $y_j^{(k)}$ in the interior of $\mathcal{D}$, where $1 \le j \le L$, $1 \le k \le r^*$, and the labeling is arbitrary. Using Dirichlet boundary conditions, we can neglect evaluations at the boundary of the domain by assuming $f = 0$ on $\partial\mathcal{D}$. Now, $E_h$ is the operator mapping $f \in C(\mathcal{D})$ to the vector of point evaluations $f(x_i^{(h)})$ for $1 \le i \le M_h$. The weights $w_i^{(h)}$ are given by $w_i^{(h)} := \int_{\mathcal{D}} \Pi f_i\,dx$ for any function $f_i \in C(\mathcal{D})$ that satisfies $f_i(x_j^{(h)}) = \delta_{ij}$, $1 \le i, j \le M_h$, and vanishes on $\partial\mathcal{D}$.

We denote by $|f|_{k,2}^2 = \sum_{|\alpha| = k} \|\partial^\alpha f\|_{L^2(\mathcal{D})}^2$ the $L^2$-Sobolev seminorm of order $k$, where $\alpha$ is a multi-index. It is well-known that for $0 \le k \le r^*$,

$$ \|f - \Pi f\|_{L^2(\mathcal{D})} \lesssim h^k |f|_{k,2} \tag{4.43} $$

see e.g. [Cia02, Theorem 3.2.1] for the affine case. Together with the obvious estimate $|f|_{k,2} \le \|f\|_{B_k}$ and

$$ \left| \int_{\mathcal{D}} f\,dx - \sum_{i=1}^{M_h} w_i^{(h)} f(x_i^{(h)}) \right| \le \|f - \Pi f\|_{L^1(\mathcal{D})} \lesssim \|f - \Pi f\|_{L^2(\mathcal{D})}, \tag{4.44} $$

we obtain that (4.6) is true for integer $r$, and the exact interpolation theorem [AF03, Theorem 7.23], applied to the operator $I - \Pi$, extends this estimate to general $0 \le r \le r^*$. In particular, (D2) is true.

In total, Theorem 4.3 is applicable in this setting. Note that the finite element method provides a very flexible approach to discretizing $\mathcal{D}$, which makes it possible to handle point evaluation schemes far beyond a rectangular grid.

Finally, we outline a possibility to further improve the results from Theorem 4.3 and Remark 4.4. An explicit understanding of the discretization error of the Fourier modes, which is not based on the universal approximation error from (D2), can improve the bounds on the number of spatial points needed in order to recover the spectral approach from discrete observations. Our standing assumption for the rest of this section is that for each $h > 0$, the vectors $E_h \Phi_1, \dots, E_h \Phi_{M_h} \in \mathbb{R}^{M_h}$ are linearly independent. Consider the operator

$$ T_h Z := \sum_{k=1}^{M_h} \langle \mathcal{E}_h Z, E_h \Phi_k \rangle_{(h)} \Phi_k \tag{4.45} $$

for $Z \in C(\mathcal{D})$. It clearly satisfies

$$ \langle T_h Z, \Phi_k \rangle = \langle \mathcal{E}_h Z, E_h \Phi_k \rangle_{(h)} \tag{4.46} $$

for $1 \le k \le M_h$. In particular, if $\langle \mathcal{E}_h \Phi_k, E_h \Phi_\ell \rangle_{(h)} = \langle \Phi_k, \Phi_\ell \rangle$ for $1 \le k, \ell \le M_h$, then the left-hand side of (4.46) can be replaced with $\langle \mathcal{E}_h T_h Z, E_h \Phi_k \rangle_{(h)}$, and the linear independence of $(E_h \Phi_k)_{1 \le k \le M_h}$ implies that $T_h Z$ coincides with $Z$ at the observation points. In this case, $T_h$ is the interpolation operator at the given observation points associated to the basis functions $(\Phi_k)_{1 \le k \le M_h}$. Instead of (D2), we assume the following:

(D2*) For any $0 < h < h^*$, $r_* < r \le r^*$ and $Z \in B_r$,

$$ \|Z - T_h Z\|_{L^2(\mathcal{D})} \lesssim h^r \|Z\|_{B_r}. \tag{4.47} $$

This immediately implies

$$ \big| \langle Z, \Phi_k \rangle - \langle \mathcal{E}_h Z, E_h \Phi_k \rangle_{(h)} \big| = |\langle Z - T_h Z, \Phi_k \rangle| \le \|Z - T_h Z\|_{L^2(\mathcal{D})} \lesssim h^r \|Z\|_{B_r} \tag{4.48} $$

for $1 \le k \le M_h$. If $h \ll_p N^{-1/d}$ for $N \to \infty$ and $h \to 0$, then (4.48) is true for all $1 \le k \le N$ at least asymptotically. Note that (4.48) is an improvement over (4.10) by a factor $k^{r/d}$.

This leads to an additional gain in rate in the proof of Proposition 4.2. As a consequence, in Theorem 4.3, condition (4.38) can be relaxed:

Theorem 4.7. In the present setting, let (D0), (D1) and (D2*) hold. Then there is a function $K^*_{d,D}$, with $K^*_{d,D}(\gamma) \to 1$ for $\gamma \to \infty$, such that (4.38) can be substituted by

$$ h \ll_p N^{-(1/d) K^*_{d,D}(\gamma)} \tag{4.49} $$

without changing the conclusions of Theorem 4.3.

It follows that (4.41) can be improved to

$$ N \ll_p M, \tag{4.50} $$

i.e. $N$ point evaluations suffice in order to recover the spectral resolution level $N$ for diffusivity estimation in the large regularity regime (neglecting terms of arbitrarily small polynomial order). Consequently, the convergence rate of $\hat\theta^{\mathrm{discr}}_{h,N}$ is described by $M^{-1/2-D/(2d)}$ in Remark 4.4.

We briefly compare our result to related literature. Note that the construction of $\hat\theta^{\mathrm{discr}}_{h,N}$ is independent of the dispersion intensity $\sigma$, which may be treated as unknown. In fact, in [HT21b] it is shown that under spatially and temporally discrete observations of a stochastic heat equation driven by space-time white noise in $d = 1$, the parameters $\theta$ and $\sigma$ can be jointly estimated at rate $M^{-3/2} = M^{-1/2-D/(2d)}$ if the observation scheme is balanced, or if the resolution in time exceeds that of a balanced observation scheme. Of course, as we work with time-continuous observations, we may always assume arbitrarily high resolution in time. In this sense, Theorem 4.7 is compatible with [HT21b]. On the other hand, if $\sigma$ is treated as known, the observation of the process continuously in time at a single point in space suffices to recover $\theta$, see e.g. [PT07, CH20].

In contrast to the integral approximation estimate from (D2), bounds on the approximation error of $T_h$ as in (D2*) seem to be harder to obtain. An important example is given by a uniform observation grid on a periodic domain:

Example 4.8. Let $d = 1$, let w.l.o.g. $\mathcal{D}$ be an interval of length $2\pi$, and consider periodic (instead of Dirichlet) boundary conditions. Then we can identify $\mathcal{D} \simeq \mathbb{R}/(2\pi\mathbb{Z})$. Let $B_r = W^{r,2}(\mathcal{D})$. The observation grid is assumed to be spatially uniform. In this case, $\langle \mathcal{E}_h \Phi_k, E_h \Phi_\ell \rangle_{(h)} = \langle \Phi_k, \Phi_\ell \rangle$ for $1 \le k, \ell \le M_h$, so $T_h$ is a trigonometric interpolation operator, and (D2*) holds [KO79]. This can be extended to rectangular domains with a uniform point evaluation grid in larger dimension $d \ge 2$ [Pas80]. See also [CHQZ88, Chapter 9], [QSS00, Chapter 10.9], [SV02, Chapter 8] for discussions of trigonometric interpolation.
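The discrete orthogonality underlying this example can be verified directly. The following is a minimal sketch, assuming an odd number $M = 2n + 1$ of equidistant points on $[0, 2\pi)$, constant weights $2\pi/M$ and the real trigonometric basis up to frequency $n$; the helper name `fourier_basis` and the test function $e^{\sin y}$ are illustrative choices, not taken from the text.

```python
import numpy as np

def fourier_basis(x, n):
    """Real L^2(0, 2*pi)-orthonormal trigonometric basis up to frequency n.

    Returns an array of shape (len(x), 2*n + 1) with one basis function per column.
    """
    cols = [np.full_like(x, 1.0 / np.sqrt(2 * np.pi))]
    for k in range(1, n + 1):
        cols.append(np.cos(k * x) / np.sqrt(np.pi))
        cols.append(np.sin(k * x) / np.sqrt(np.pi))
    return np.stack(cols, axis=1)

n = 8
M = 2 * n + 1
x = 2 * np.pi * np.arange(M) / M        # uniform periodic observation grid
w = 2 * np.pi / M                       # constant quadrature weight
Phi = fourier_basis(x, n)               # column k holds the grid values of one mode

gram = w * Phi.T @ Phi                  # weighted discrete inner products of the modes

Z = lambda y: np.exp(np.sin(y))         # smooth periodic test function
coeffs = w * Phi.T @ Z(x)               # weighted discrete coefficients of Z
ThZ_on_grid = Phi @ coeffs              # T_h Z evaluated at the observation points

y = np.linspace(0.0, 2 * np.pi, 400)
sup_error = np.max(np.abs(fourier_basis(y, n) @ coeffs - Z(y)))
```

Here `gram` is the identity up to rounding, `ThZ_on_grid` reproduces the samples of `Z`, and `sup_error` is small because the Fourier coefficients of the test function decay rapidly.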

Nonetheless, even for non-uniform observation point grids in $d = 1$, the situation is less clear. Recent works [Aus16, AT17] indicate that the trigonometric interpolation operator on a non-uniform grid has diminished approximation power, at least in $\|\cdot\|_\infty$, with a convergence rate depending on the deviation from the uniform point grid. While in this case we cannot expect that $T_h$ and the trigonometric interpolation operator coincide, this gives a hint that the validity of (D2*) can be more involved than (D2).

It is an interesting question for further research if (4.50) and the resulting (optimal) convergence rate in $M$ for diffusivity estimators can be achieved in non-rectangular domains in dimension $d \ge 2$ that do not arise as the tensor product of one-dimensional intervals, or if it is possible to find a domain $\mathcal{D}$ and an observation point distribution within $\mathcal{D}$ such that (4.41) cannot be improved.

Chapter 5

The Local Approach

This chapter is an adaptation of material from [ACP20].

The local approach to parameter estimation for SPDEs is a recent development different from (and in some sense complementary to) the spectral approach. It has been introduced in [AR21] for the stochastic heat equation. [ACP20] generalizes the theory to semilinear models and [ABJR21] applies the local approach to the stochastic Meinhardt model. The novelty of the local approach is its observation scheme. It is assumed that a spatially localized average of the solution process $X$ is observed on $[0, T]$, which is formally realized as the convolution with a compactly supported kernel. This is a physically realistic assumption in many cases. As the support of the kernel shrinks (corresponding to observing $X$ with high resolution at a point in space) the true diffusivity can be recovered.

Let $\mathcal{D} \subset \mathbb{R}^d$ be a bounded open domain with smooth boundary. For $s \in \mathbb{N}$, $p \ge 1$ denote by $W^{s,p}(\mathcal{D})$, $W_0^{s,p}(\mathcal{D})$ the Sobolev spaces as in [AF03], and for non-integer $s \ge 0$, their complex interpolation spaces. Let $\Delta$ be the Laplacian, with domain $W^{2,p}(\mathcal{D}) \cap W_0^{1,p}(\mathcal{D})$ in $L^p(\mathcal{D})$ (see e.g. [GT01, Chapter 9.6]). For $s \ge 0$, $p \ge 1$, let $H^{s,p}(\mathcal{D}) := D_p((-\Delta)^{s/2}) \subset L^p(\mathcal{D})$ be the maximal domain of definition of the fractional Laplacian $(-\Delta)^{s/2}$ acting as a closed, densely defined operator on $L^p(\mathcal{D})$, cf. [Yag10, Chapter 16]. $H^{s,p}(\mathcal{D})$ is equipped with the norm $\|\cdot\|_{s,p} := \|(-\Delta)^{s/2} \cdot\|_{L^p(\mathcal{D})}$ and is a closed subspace of $W^{s,p}(\mathcal{D})$ for any $s \ge 0$ and $p \ge 1$. For $s < 1$, $H^{s,p}(\mathcal{D})$ is the completion of $L^p(\mathcal{D})$ w.r.t. the norm $\|\cdot\|_{s,p}$. It is clear that for $s, s' \in \mathbb{R}$ and $p \ge 1$, $(-\Delta)^{s/2}$ maps $H^{s'+s,p}(\mathcal{D})$ into $H^{s',p}(\mathcal{D})$. We write $R_p(s) := L^\infty(0, T; H^{s,p}(\mathcal{D}))$. These spaces are equipped with their natural norm $\|Z\|_{R_p(s)} = \sup_{0 \le t \le T} \|Z_t\|_{s,p}$. Finally, let $\Delta_0$ be the Laplacian as a closed, densely defined operator on $L^2(\mathbb{R}^d)$. See e.g. [Tri10a, Tri10b] for more details on function spaces.

In this chapter, we consider a semilinear SPDE

$$ dX_t = \theta \Delta X_t\,dt + F(X)(t)\,dt + B\,dW_t \tag{5.1} $$

together with Dirichlet boundary conditions $X_t = 0$ on $\partial\mathcal{D}$ for all $0 \le t \le T$, and initial condition $X_0$. As in the previous chapters, $F : C(0, T; L^2(\mathcal{D})) \supseteq D(F) \to L^1(0, T; L^2(\mathcal{D}))$ is a nonlinear operator. $W$ is a cylindrical Wiener process, and $B$ is a dispersion operator of Hilbert–Schmidt type, such that $B : L^2(\mathcal{D}) \to H^{2\gamma,2}(\mathcal{D})$ is a linear isomorphism for some $\gamma > d/4$.¹ Further assumptions on $B$ will be given below in condition (LB).

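In order to formalize the local asymptotics, we define for $\delta > 0$, $x_0 \in \mathcal{D}$ and $Z \in L^2(\mathbb{R}^d)$:

$$ \mathcal{D}_{\delta,x_0} := \delta^{-1}(\mathcal{D} - x_0) = \{\delta^{-1}(x - x_0) \mid x \in \mathcal{D}\}, \tag{5.2} $$

$$ Z_{\delta,x_0}(x) := \delta^{-d/2} Z(\delta^{-1}(x - x_0)). \tag{5.3} $$

Then $Z_{\delta,x_0} \in L^2(\mathbb{R}^d)$, and $(\cdot)_{\delta,x_0}$ maps $L^2(\mathcal{D}_{\delta,x_0})$ onto $L^2(\mathcal{D})$ with

$$ \|Z_{\delta,x_0}\|_{L^2(\mathcal{D})} = \|Z\|_{L^2(\mathcal{D}_{\delta,x_0})} \tag{5.4} $$

for all $\delta > 0$, $x_0 \in \mathcal{D}$. More generally, for any $1 < q < \infty$ and $Z \in L^q(\mathcal{D}_{\delta,x_0})$, (5.3) implies

$$ \begin{aligned}
\|Z_{\delta,x_0}\|_{L^q(\mathcal{D})} &= \left( \int_{\mathcal{D}} |Z_{\delta,x_0}(x)|^q\,dx \right)^{1/q}
= \left( \int_{\mathcal{D}} \Big( \delta^{-\frac{dq}{4}+\frac{d}{2}} \big(|Z|^{q/2}\big)_{\delta,x_0}(x) \Big)^2 dx \right)^{1/q} \\
&= \delta^{-\frac{d}{2}+\frac{d}{q}} \Big\| \big(|Z|^{q/2}\big)_{\delta,x_0} \Big\|_{L^2(\mathcal{D})}^{2/q}
= \delta^{-\frac{d}{2}+\frac{d}{q}} \Big\| |Z|^{q/2} \Big\|_{L^2(\mathcal{D}_{\delta,x_0})}^{2/q} \\
&= \delta^{-\frac{d}{2}+\frac{d}{q}} \|Z\|_{L^q(\mathcal{D}_{\delta,x_0})}
\end{aligned} \tag{5.5} $$

for $\delta > 0$, $x_0 \in \mathcal{D}$. Note that the spaces $\mathcal{D}_{\delta,x_0}$ can be seen as non-asymptotic tangential spaces of $\mathcal{D}$ at $x_0$: Formally, for $\delta \to 0$, the tangential space $T_{x_0}\mathcal{D} \simeq \mathbb{R}^d$ is recovered. The behavior of the fractional Laplacian under localization is given by the following result:

¹ In fact, it suffices to have (5.18) for all $s > 1 + 2\gamma - d/2$.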

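Lemma 5.1 ([AR21], [ACP20]). For $s \in \mathbb{R}$, $p \ge 2$, $\delta > 0$, $x_0 \in \mathcal{D}$ and $Z \in H^{s,p}(\mathcal{D}_{\delta,x_0})$, we have

$$ (-\Delta)^{s/2} Z_{\delta,x_0} = \delta^{-s} \big( (-\Delta)^{s/2} Z \big)_{\delta,x_0}. \tag{5.6} $$

For $s \in 2\mathbb{N}$, this is a consequence of the chain rule.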
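The scaling relations (5.4), (5.5) and the case $s = 2$ of (5.6) can be checked numerically in a simple setting. The sketch below is a toy illustration, assuming $d = 1$, a Gaussian profile on the whole real line (so boundary effects are ignored), Riemann sums for the integrals and central differences for the second derivative; the helper name `localize` and all numerical values are hypothetical.

```python
import numpy as np

def localize(Z, delta, x0, d=1):
    """Return the function Z_{delta,x0}(x) = delta^{-d/2} Z((x - x0) / delta)."""
    return lambda x: delta ** (-d / 2) * Z((x - x0) / delta)

Z = lambda y: np.exp(-y ** 2)
minus_lap_Z = lambda y: (2 - 4 * y ** 2) * np.exp(-y ** 2)   # -Z'' in closed form

delta, x0, q = 0.05, 0.3, 4
x = np.linspace(x0 - 10 * delta, x0 + 10 * delta, 20001)
dx = x[1] - x[0]
y, dy = (x - x0) / delta, dx / delta                          # rescaled coordinates
Zloc = localize(Z, delta, x0)(x)

# (5.4): the L^2 norm is invariant under localization.
l2_local = np.sqrt(np.sum(Zloc ** 2) * dx)
l2_ref = np.sqrt(np.sum(Z(y) ** 2) * dy)

# (5.5): the L^q norm picks up the factor delta^{-d/2 + d/q}.
lq_local = (np.sum(np.abs(Zloc) ** q) * dx) ** (1 / q)
lq_pred = delta ** (-1 / 2 + 1 / q) * (np.sum(np.abs(Z(y)) ** q) * dy) ** (1 / q)

# (5.6) with s = 2: -d^2/dx^2 of Z_{delta,x0} versus delta^{-2} (-Z'')_{delta,x0}.
lhs = -(Zloc[2:] - 2 * Zloc[1:-1] + Zloc[:-2]) / dx ** 2
rhs = delta ** (-2) * localize(minus_lap_Z, delta, x0)(x[1:-1])
rel_err = np.max(np.abs(lhs - rhs)) / np.max(np.abs(rhs))
```

The first two comparisons agree up to rounding by construction of the grids, while `rel_err` measures the finite-difference error in the chain-rule identity.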

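Here and in the sequel, we fix a kernel $K \in W^{2,2}(\mathbb{R}^d)$ with compact support. We identify $K_{\delta,x_0}$, defined via (5.3), with its restriction to $\mathcal{D}$. Then $K_{\delta,x_0} \in W^{2,2}(\mathcal{D})$ [LM72, Remark 8.1]. For small enough $\delta$, the support of $K_{\delta,x_0}$ is compactly contained in $\mathcal{D}$. In particular, the boundary trace operators of $K_{\delta,x_0}$ of any differential order are zero on $\partial\mathcal{D}$. Then, clearly, $K_{\delta,x_0} \in H^{2,2}(\mathcal{D})$. W.l.o.g. we restrict to that case in the sequel.

Now, by assumption, we observe a local average of $X$, namely $X$ tested against $K_{\delta,x_0}$. In addition, we also need $X$ tested against $\Delta K_{\delta,x_0}$:

$$ X^{K}_{\delta,x_0} := \langle X, K_{\delta,x_0} \rangle_{L^2(\mathcal{D})} = \int_{\mathcal{D}} X K_{\delta,x_0}\,dx, \tag{5.7} $$

$$ X^{\Delta K}_{\delta,x_0} := \langle X, \Delta K_{\delta,x_0} \rangle_{L^2(\mathcal{D})} = \int_{\mathcal{D}} X \Delta K_{\delta,x_0}\,dx. \tag{5.8} $$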

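It is immediate from (5.1) that the dynamics of $X^{K}_{\delta,x_0}$ is determined by the one-dimensional stochastic differential equation

$$ dX^{K}_{\delta,x_0}(t) = \theta X^{\Delta K}_{\delta,x_0}(t)\,dt + \langle F(X)(t), K_{\delta,x_0} \rangle\,dt + \|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})}\,dW_{\delta,x_0}(t) \tag{5.9} $$

with initial condition $X^{K}_{\delta,x_0}(0) = \langle X_0, K_{\delta,x_0} \rangle_{L^2(\mathcal{D})}$, where the process $W_{\delta,x_0} := \langle B^* K_{\delta,x_0}, W \rangle / \|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})}$ is a one-dimensional Brownian motion.

If $X^{\Delta K}_{\delta,x_0}$ and $X^{K}_{\delta,x_0}$ are observed on $[0, T]$, the natural MLE-type estimator, called augmented MLE [AR21], is given by

$$ \hat\theta_{\delta,x_0} = \frac{\int_0^T X^{\Delta K}_{\delta,x_0}(t)\,dX^{K}_{\delta,x_0}(t)}{\int_0^T X^{\Delta K}_{\delta,x_0}(t)^2\,dt}. \tag{5.10} $$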
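In practice, the two local observation processes are only available on a time grid. The following sketch approximates (5.10) by left-point sums; this time discretization is an assumption of the illustration and not part of the text, and the function name `augmented_mle` as well as the synthetic input are hypothetical. The toy input is noise-free and without nonlinearity, so the increments of $X^K$ equal $\theta X^{\Delta K}\,dt$ exactly.

```python
import numpy as np

def augmented_mle(XK, XDK, dt):
    """Left-point (Ito-type) sum approximation of the augmented MLE.

    XK  : samples of the local average X^K on a uniform time grid, shape (n + 1,)
    XDK : samples of X^{Delta K} on the same grid, shape (n + 1,)
    dt  : time step
    """
    num = np.sum(XDK[:-1] * np.diff(XK))   # approximates int X^{Delta K} dX^K
    den = np.sum(XDK[:-1] ** 2) * dt       # approximates int (X^{Delta K})^2 dt
    return num / den

# Synthetic, noise-free input: dX^K = theta * X^{Delta K} dt on the grid.
theta, dt = 0.7, 1e-3
t = np.arange(0.0, 1.0 + dt / 2, dt)
XDK = 2.0 + np.cos(t)
XK = np.concatenate([[0.0], np.cumsum(theta * XDK[:-1] * dt)])
estimate = augmented_mle(XK, XDK, dt)
```

With noise and a nonlinear term present, the two additional contributions in the error decomposition below no longer vanish, and the estimate deviates from `theta` accordingly.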

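By means of (5.9), $\hat\theta_{\delta,x_0}$ can be decomposed as follows:

$$ \hat\theta_{\delta,x_0} - \theta = \frac{\int_0^T X^{\Delta K}_{\delta,x_0}(t) \langle F(X)(t), K_{\delta,x_0} \rangle_{L^2(\mathcal{D})}\,dt}{\int_0^T X^{\Delta K}_{\delta,x_0}(t)^2\,dt} + \|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})} \frac{\int_0^T X^{\Delta K}_{\delta,x_0}(t)\,dW_{\delta,x_0}(t)}{\int_0^T X^{\Delta K}_{\delta,x_0}(t)^2\,dt}. \tag{5.11} $$

Note that

$$ I_{\delta,x_0} := \|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})}^{-2} \int_0^T X^{\Delta K}_{\delta,x_0}(t)^2\,dt $$

can be interpreted as the observed Fisher information. We need the following conditions on the dispersion $B$ and the kernel $K$:²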

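(LB) There is a family of bounded linear operators $(B_{\delta,x_0})_{\delta \ge 0}$ mapping $L^2(\mathbb{R}^d)$ into itself, such that

$$ B^* (-\Delta)^{\gamma} Z_{\delta,x_0} = (B^*_{\delta,x_0} Z)_{\delta,x_0} \tag{5.12} $$

for $Z \in C^\infty(\mathbb{R}^d)$ with support in $\mathcal{D}_{\delta,x_0}$ and $\delta > 0$, as well as

$$ \big\| B^*_{\delta,x_0} Z - B^*_{0,x_0} Z \big\|_{L^2(\mathbb{R}^d)} \to 0 \tag{5.13} $$

for $Z \in L^2(\mathbb{R}^d)$ as $\delta \to 0$.

(LK) There is $\tilde{K} \in W^{2\lceil\gamma\rceil+2,2}(\mathbb{R}^d)$ with compact support such that $K = (-\Delta)^{\lceil\gamma\rceil} \tilde{K}$.

(LΨ) We have

$$ \big\| B^*_{0,x_0} (-\Delta_0)^{\lceil\gamma\rceil-\gamma} \tilde{K} \big\|_{L^2(\mathbb{R}^d)} > 0, \tag{5.14} $$

$$ \Psi_\Delta\big( (-\Delta_0)^{\lceil\gamma\rceil-\gamma} \tilde{K} \big) > 0, \tag{5.15} $$

where $\Psi_\Delta(Z) = \int_0^\infty \big\| B^*_{0,x_0} e^{r\Delta_0} \Delta_0 Z \big\|_{L^2(\mathbb{R}^d)}^2\,dr$.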

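It is straightforward to see that $\Psi_\Delta(Z) < \infty$ for $Z \in W^{2,2}(\mathbb{R}^d) \cap L^1(\mathbb{R}^d)$: As $B^*_{0,x_0}$ is bounded, we can w.l.o.g. assume that $B^*_{0,x_0} = I$. Let $G_t$ for $t > 0$ be the heat kernel given by $e^{t\Delta_0} Z = G_t * Z$. It is elementary to verify that $\|G_t\|_{L^2(\mathbb{R}^d)} \lesssim t^{-d/4}$, given that $G_t$ is normalized in $L^1$ as a function. Using standard semigroup properties as stated e.g. in [EN00], we see that

\[ \int_0^t \| e^{r\Delta_0} \Delta_0 Z \|^2_{L^2(\mathbb{R}^d)}\,dr = \tfrac12 \langle e^{2t\Delta_0} Z, \Delta_0 Z \rangle_{L^2(\mathbb{R}^d)} + \tfrac12 \| \nabla Z \|^2_{L^2(\mathbb{R}^d)}. \]

Further, using Young's inequality for convolution products, we can estimate

\[ \big| \langle e^{2t\Delta_0} Z, \Delta_0 Z \rangle_{L^2(\mathbb{R}^d)} \big| \le \|G_t\|_{L^2(\mathbb{R}^d)} \|Z\|_{L^1(\mathbb{R}^d)} \|\Delta_0 Z\|_{L^2(\mathbb{R}^d)} \lesssim t^{-d/4}, \]

which converges to zero for $t \to \infty$.

In (5.15), our standing assumption is $(-\Delta_0)^{\lceil\gamma\rceil-\gamma} \tilde K \in W^{2,2}(\mathbb{R}^d) \cap L^1(\mathbb{R}^d)$. This is certainly the case if $\gamma \in \mathbb{N}$.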
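The limiting identity $\Psi_\Delta(Z) = \tfrac12 \|\nabla Z\|^2_{L^2(\mathbb{R}^d)}$ for $B^*_{0,x_0} = I$ can be checked numerically. The sketch below is illustrative only: it works in $d = 1$, approximates the whole space by a large periodic grid, truncates the $r$-integral at a hypothetical cutoff `r_max`, and uses a Gaussian test function $Z(x) = e^{-x^2/2}$, for which $\tfrac12\|\nabla Z\|^2 = \sqrt{\pi}/4$. The function name `psi_delta` and all grid parameters are assumptions made for this example.

```python
import numpy as np

def psi_delta(z, dx, r_max=200.0, n_r=4000):
    """Approximate int_0^{r_max} ||e^{r Lap} Lap Z||^2 dr on a periodic grid (B* = I)."""
    n = z.size
    xi = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)     # angular frequencies
    z_hat = np.fft.fft(z)
    # Nonuniform r-grid, refined near r = 0 where the integrand is largest.
    r = r_max * np.linspace(0.0, 1.0, n_r) ** 3
    # The Fourier symbol of e^{r Lap} Lap is -xi^2 * exp(-r xi^2); Parseval gives the L2 norm.
    sq_norms = np.array([
        np.sum(np.abs(xi ** 2 * np.exp(-rr * xi ** 2) * z_hat) ** 2) for rr in r
    ]) * dx / n
    # Trapezoidal rule on the nonuniform grid.
    return float(np.sum(0.5 * (sq_norms[1:] + sq_norms[:-1]) * np.diff(r)))

x = np.linspace(-40.0, 40.0, 4096, endpoint=False)
dx = x[1] - x[0]
z = np.exp(-x ** 2 / 2.0)

psi_numeric = psi_delta(z, dx)
half_grad_norm = 0.5 * np.sum(np.gradient(z, dx) ** 2) * dx
exact = np.sqrt(np.pi) / 4.0
```

Both `psi_numeric` and `half_grad_norm` should agree with `exact` up to discretization and truncation error; the remaining tail of the $r$-integral decays like $r_{\max}^{-3/2}$ for this test function.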

² These are the assumptions B, K and ND from [ACP20].

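Example 5.2. It is clear that $B = \sigma(-\Delta)^{-\gamma}$ for $\sigma > 0$ satisfies (LB). In this case, we trivially have $B_{\delta,x_0} Z = \sigma Z$ for all $\delta \ge 0$ and $Z \in L^2(\mathbb{R}^d)$. In addition, the above calculation shows $\Psi_\Delta(Z) = \frac{\sigma^2}{2} \|\nabla Z\|^2_{L^2(\mathbb{R}^d)}$. This can be generalized, for example, to smooth space-dependent $\sigma : \mathcal{D} \to [0,\infty)$, see [ACP20, Example 1].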

As before, we reduce the asymptotic analysis of $\hat\theta_{\delta,x_0}$ to the linear case $F = 0$ by means of the splitting argument $X = \bar X + \tilde X$, where $\bar X$ satisfies (5.1) with $F = 0$ and $X_0 = 0$, and $\tilde X$ solves the random PDE (2.5) with initial condition $\tilde X_0 = X_0$. The terms $\bar X^{\Delta K}_{\delta,x_0}$ and $\tilde X^{\Delta K}_{\delta,x_0}$ are defined analogously to (5.8).

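Lemma 5.3. Under (LB), (LK) and (LΨ), we have the following asymptotics as $\delta \to 0$:

(i) $\int_0^T \bar X^{\Delta K}_{\delta,x_0}(t)^2\,dt \asymp C_{\mathrm{loc}}\,\delta^{-2+4\gamma}$, where $C_{\mathrm{loc}} = T\theta^{-1}\,\Psi_\Delta\big((-\Delta_0)^{\lceil\gamma\rceil-\gamma}\tilde K\big)$.

(ii) $\|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})} \asymp C^{B}_{\mathrm{loc}}\,\delta^{2\gamma}$, where $C^{B}_{\mathrm{loc}} = \|B^*_{0,x_0}(-\Delta_0)^{\lceil\gamma\rceil-\gamma}\tilde K\|_{L^2(\mathbb{R}^d)}$.

In particular, if $F = 0$ and $X_0 = 0$, then $I_{\delta,x_0} \asymp (C^{B}_{\mathrm{loc}})^{-2}\,C_{\mathrm{loc}}\,\delta^{-2}$ as $\delta \to 0$.

Proof. (i) is a direct consequence of [ACP20, Proposition 22], and (ii) is shown in the proof of [ACP20, Proposition 2].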

This lemma is sufficient to obtain the asymptotic properties of $\hat\theta_{\delta,x_0}$ in the case $F = 0$ and $X_0 = 0$. In order to handle the full semilinear model, we need higher regularity of $\tilde X$. However, in contrast to the spectral approach, we can make use of higher $L^p$–type regularity in the spaces $H^{s,p}(\mathcal{D})$ instead of mere $L^2$–type regularity. This is further explained in Remark 5.10 below. In order to exploit $L^p$ regularity, we have to modify Condition $(F_{s,\eta})$ as follows:

$(F^p_{s,\eta})$ There is $\epsilon > 0$ and a monotone, locally bounded function $g : [0,\infty) \to [0,\infty)$, such that for all $Z \in R_p(s)$:

\[ \|F(Z)\|_{R_p(s+\eta+\epsilon-2)} \le g\big(\|Z\|_{R_p(s)}\big). \tag{5.16} \]

In the Markovian case $F(X)(t) = F(X_t)$, (5.16) holds if

\[ \|F(Z)\|_{s+\eta+\epsilon-2,p} \lesssim g(\|Z\|_{s,p}) \tag{5.17} \]

for all $Z \in H^{s,p}(\mathcal{D})$.³

³ Condition $(F^p_{s,\eta})$ in the form (5.17) is Assumption A from [ACP20].

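Proposition 5.4. Let $s \in \mathbb{R}$, $p \ge 2$ and $\eta > 0$. Assume that $X_0 \in H^{s+\eta,p}(\mathcal{D})$ and $\bar X, \tilde X \in R_p(s)$. If $(F^p_{s,\eta})$ is true, then $\tilde X \in R_p(s+\eta)$.

Proof. Verbatim as in Proposition 2.3, using the norm $\|\cdot\|_{s,p}$ instead of $\|\cdot\|_s$.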

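It remains to understand the regularity of the linear process $\bar X$. For $p \ge 2$, set $s^*_p := \sup\{ s \in \mathbb{R} \mid \bar X \in R_p(s) \text{ a.s.} \}$, and further $s^*_\infty := \inf_{p \ge 2} s^*_p$, as well as $s^* := s^*_2$. As in the previous chapters, we have $s^* = 1 + 2\gamma - d/2$. This can be seen as follows: As $B^*(-\Delta)^{\gamma} : L^2(\mathcal{D}) \to L^2(\mathcal{D})$ is a linear isomorphism by assumption, we have that

\[ \int_0^T \big\| (-\Delta)^{s/2} e^{t\theta\Delta} B \big\|^2_{HS}\,dt = \infty \tag{5.18} \]

is true if and only if it is true for $B$ replaced by $(-\Delta)^{-\gamma}$, and this holds for $s \ge 1 + 2\gamma - d/2$. This proves $s^* \le 1 + 2\gamma - d/2$, and the opposite inequality is shown as in Lemma 2.7. It is clear that $s^*_p \le s^*$ for all $p \ge 2$ due to $H^{s,p}(\mathcal{D}) \subset H^{s,2}(\mathcal{D})$ for $s \in \mathbb{R}$, and therefore $s^*_\infty \le s^*$. On the other hand, we have the Sobolev embedding $H^{s,2}(\mathcal{D}) \subset H^{s-d/2,p}(\mathcal{D})$ for all $s \in \mathbb{R}$, $p \ge 2$, and thus $s^*_\infty \ge s^* - d/2$. So $0 \le s^* - s^*_\infty \le d/2$ is the possible regularity gap for the linear process $\bar X$.

Lemma 5.5. If $\sup_{k \in \mathbb{N}} \|\Phi_k\|_{L^\infty(\mathcal{D})} < \infty$, then $s^*_\infty = s^*$.

The proof is given in Appendix B.2. The condition $\sup_{k \in \mathbb{N}} \|\Phi_k\|_{L^\infty(\mathcal{D})} < \infty$ is true e.g. in $d = 1$, but in general, only a bound of the form $\|\Phi_k\|_{L^\infty(\mathcal{D})} \lesssim \lambda_k^{(d-1)/4}$ can be given, and this bound cannot be improved without further restrictions on $\mathcal{D}$, see [Gri02]. Proposition 5.4 together with Lemma 5.5 yields the precise $L^p$–excess regularity of $\tilde X$: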

Proposition 5.6. Let $\eta > 0$, $s_0 < s^*_\infty$ such that $(F^p_{s,\eta})$ is true for any $s_0 \le s < s^*$ and $p \ge 2$. Let $X \in R_2(s_0)$ and $X_0 \in H^{s^*+\eta,p}(\mathcal{D})$ for any $p \ge 2$. Then we have $X \in R_p(s)$ and $\tilde X \in R_p(s+\eta)$ for all $s < s^*_\infty$ and $p \ge 2$. In particular, with

\[ \eta_\infty := \eta - (s^* - s^*_\infty), \tag{5.19} \]

we have $\tilde X \in R_p(s + \eta_\infty)$ for all $s < s^*$, $p \ge 2$.

Proof. We distinguish the cases $(s^*_\infty - s_0) \ge d/2$ and $(s^*_\infty - s_0) < d/2$. In the former case, by using Proposition 5.4 iteratively, we have $X \in R_2(s)$ for all $s < s^*$, and by the Sobolev embedding theorem, $X \in R_p(s_0)$ for any $p \ge 2$. Applying Proposition 5.4 repeatedly a second time yields $X \in R_p(s)$ and $\tilde X \in R_p(s+\eta)$ for all $s < s^*_\infty$ and $p \ge 2$, which implies the claim. In the case $(s^*_\infty - s_0) < d/2$, we use a similar inductive argument: If for some $p \ge 2$ it holds that $X \in R_p(s_0)$, then a repeated application of Proposition 5.4 gives $X \in R_p(s)$ for any $s < s^*_\infty \le s^*_p$, and by the Sobolev embedding theorem, it follows that $X \in R_{p'}(s_0)$ for any $p < p' < dp/(d - 2(s^*_\infty - s_0))$. In particular, there is a constant $c > 1$ such that for all $p \ge 2$ we can choose $p' = p'(p)$ in such a way that $p'/p \ge c$. Repeating this step we obtain $X \in R_p(s)$ for all $s < s^*_\infty$ and $p \ge 2$. A final application of Proposition 5.4 yields $\tilde X \in R_p(s+\eta)$ for all $s < s^*_\infty$ and $p \ge 2$, which implies the claim in this case, too.

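The excess regularity of $\tilde X$ can be used to show that the terms related to $F$ and $\tilde X$ appearing in (5.11) are of lower order:

Lemma 5.7. Under the conditions from Proposition 5.6 and (LK), if $\eta_\infty > 0$, the following is true with $\eta' := \eta_\infty \wedge (1 + d/4)$ and any $\epsilon > 0$:

(i) It holds

\[ \int_0^T \tilde X^{\Delta K}_{\delta,x_0}(t)^2\,dt \lesssim \delta^{-2+4\gamma+2(\eta'-\epsilon)}, \]

and in particular

\[ \int_0^T X^{\Delta K}_{\delta,x_0}(t)^2\,dt \asymp \int_0^T \bar X^{\Delta K}_{\delta,x_0}(t)^2\,dt \asymp C_{\mathrm{loc}}\,\delta^{-2+4\gamma}. \tag{5.20} \]

(ii) It holds

\[ \int_0^T X^{\Delta K}_{\delta,x_0}(t)\,\langle F(X)(t), K_{\delta,x_0} \rangle_{L^2(\mathcal{D})}\,dt \lesssim \delta^{-2+4\gamma+(\eta'-\epsilon)}. \tag{5.21} \]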

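Proof. With $\tilde K$ as in (LK), i.e. $K = (-\Delta)^{\lceil\gamma\rceil}\tilde K$, we first show that for any $q > 1$ and $r \ge 0$,

\[ \sup_{0 < \delta \le 1} \big\| (-\Delta)^{r/2} \tilde K \big\|_{L^q(\mathcal{D}_{\delta,x_0})} < \infty. \tag{5.22} \]

This is obvious for $r \in 2\mathbb{N}_0$, as $(-\Delta)^{r/2}\tilde K$ has compact support in this case, and in particular, $\sup_{0<\delta\le1} \|(-\Delta)^{r/2}\tilde K\|_{L^q(\mathcal{D}_{\delta,x_0})} = \|(-\Delta)^{r/2}\tilde K\|_{L^q(\mathbb{R}^d)} < \infty$. For general $r \ge 0$, we use the Gagliardo–Nirenberg inequality [BM18] on the reference domain $\mathcal{D}$: For $0 < \delta \le 1$, with $R := 2\lfloor r/2 \rfloor \in 2\mathbb{N}_0$,

\[
\begin{aligned}
\big\|(-\Delta)^{r/2}\tilde K\big\|_{L^q(\mathcal{D}_{\delta,x_0})}
&= \delta^{\frac d2 - \frac dq + r}\,\big\|(-\Delta)^{r/2}\tilde K_{\delta,x_0}\big\|_{L^q(\mathcal{D})} \\
&\lesssim \delta^{\frac d2 - \frac dq + r}\,\big\|(-\Delta)^{R/2}\tilde K_{\delta,x_0}\big\|^{\frac{2-r+R}{2}}_{L^q(\mathcal{D})}\,\big\|(-\Delta)^{R/2+1}\tilde K_{\delta,x_0}\big\|^{\frac{r-R}{2}}_{L^q(\mathcal{D})} \\
&\lesssim \big\|(-\Delta)^{R/2}\tilde K\big\|^{\frac{2-r+R}{2}}_{L^q(\mathcal{D}_{\delta,x_0})}\,\big\|(-\Delta)^{R/2+1}\tilde K\big\|^{\frac{r-R}{2}}_{L^q(\mathcal{D}_{\delta,x_0})},
\end{aligned}
\]

where we used (5.5) and Lemma 5.1 repeatedly. Consequently, using that (5.22) is true for the exponents $R$ and $R+2$, we obtain that (5.22) is true for all $r \ge 0$.

Next, by choice of $\eta'$, we have $\eta' < 1 + d/2$, thus $2\gamma + 2 - s^* - \eta' > 0$, and therefore $\lceil\gamma\rceil + 1 - (s+\eta')/2 > 0$ for all $s < s^*$. Using (5.5) and (5.22),

\[
\begin{aligned}
\big\|(-\Delta)^{1-(s+\eta')/2} K_{\delta,x_0}\big\|_{L^q(\mathcal{D})}
&= \delta^{-\frac d2 + \frac dq}\,\big\|(-\Delta)^{\lceil\gamma\rceil+1-(s+\eta')/2}\tilde K\big\|_{L^q(\mathcal{D}_{\delta,x_0})} \\
&\lesssim \delta^{-\frac d2 + \frac dq}.
\end{aligned}
\tag{5.23}
\]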

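After these preparations, we can prove the statements of the lemma.

(i) Now let $s < s^*$ and $p \ge 2$, with $q = p/(p-1) \le 2$. By Proposition 5.6, Lemma 5.1 and (5.23), we have for all $s < s^*$

\[
\begin{aligned}
\int_0^T \tilde X^{\Delta K}_{\delta,x_0}(t)^2\,dt
&= \int_0^T \big\langle \tilde X_t, \Delta K_{\delta,x_0} \big\rangle^2_{L^2(\mathcal{D})}\,dt \\
&= \int_0^T \big\langle (-\Delta)^{(s+\eta')/2}\tilde X, (-\Delta)^{1-(s+\eta')/2} K_{\delta,x_0} \big\rangle^2_{L^2(\mathcal{D})}\,dt \\
&\le T \sup_{0 \le t \le T} \big\|\tilde X\big\|^2_{s+\eta',p}\,\big\|(-\Delta)^{1-(s+\eta')/2} K_{\delta,x_0}\big\|^2_{L^q(\mathcal{D})} \\
&\lesssim \delta^{-4+2(s+\eta')}\,\big\|(-\Delta)^{1-(s+\eta')/2} K_{\delta,x_0}\big\|^2_{L^q(\mathcal{D})} \\
&\lesssim \delta^{-4+2(s+\eta')-d+2d/q}.
\end{aligned}
\]

Now, for any $\epsilon' > 0$, with $s := s^* - \epsilon' = 1 + 2\gamma - d/2 - \epsilon'$ and $p := 2d/\epsilon'$, such that $1 - 1/q = 1/p = \epsilon'/(2d)$, we have

\[ \int_0^T \tilde X^{\Delta K}_{\delta,x_0}(t)^2\,dt \lesssim \delta^{-2+4\gamma+2\eta'-2d(1-1/q)-2\epsilon'} = \delta^{-2+4\gamma+2\eta'-3\epsilon'}, \]

which implies the claim with $\epsilon = 3\epsilon'/2$, using Lemma 5.3.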

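(ii) Condition $(F^p_{s,\eta})$ together with Proposition 5.6 gives $F(X) \in R_p(s+\eta-2)$ for all $s < s^*_\infty$ and $p \ge 2$, which is the same as $F(X) \in R_p(s+\eta_\infty-2)$ for all $s < s^*$ and $p \ge 2$. Let $\epsilon' > 0$, $s = s^* - \epsilon'$, $p = 2d/\epsilon'$ and $q = p/(p-1)$. Similarly to (i), we estimate

\[
\begin{aligned}
\int_0^T \langle F(X)(t), K_{\delta,x_0} \rangle^2_{L^2(\mathcal{D})}\,dt
&= \int_0^T \big\langle (-\Delta)^{\frac{s+\eta'-2}{2}} F(X)(t), (-\Delta)^{-\frac{s+\eta'-2}{2}} K_{\delta,x_0} \big\rangle^2_{L^2(\mathcal{D})}\,dt \\
&\lesssim T\,\|F(X)\|^2_{R_p(s+\eta'-2)}\,\big\|(-\Delta)^{1-\frac{s+\eta'}{2}} K_{\delta,x_0}\big\|^2_{L^q(\mathcal{D})} \\
&\lesssim \delta^{-2+4\gamma+2\eta'-3\epsilon'}.
\end{aligned}
\]

Finally, an application of the Hölder inequality in time gives (5.21).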

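We can now state and prove the main result of this chapter:

Theorem 5.8. Let $x_0 \in \mathcal{D}$. Assume that (LB), (LK) and (LΨ) hold. Let $\eta > 0$, $s_0 < s^*$ such that $(F^p_{s,\eta})$ is satisfied for all $s_0 \le s < s^*$ and $p \ge 2$. Let a.s. $X \in R_2(s_0)$ and $X_0 \in H^{s^*+\eta,p}(\mathcal{D})$ for all $p \ge 2$. If $\eta_\infty = \eta - (s^* - s^*_\infty) > 0$, then the following is true:

(i) $\hat\theta_{\delta,x_0}$ is a consistent estimator for $\theta$, i.e. $\hat\theta_{\delta,x_0} \xrightarrow{\mathbb{P}} \theta$ as $\delta \to 0$.

(ii) If $\eta_\infty > 1$, then

\[ \delta^{-1}\big(\hat\theta_{\delta,x_0} - \theta\big) \xrightarrow{d} \mathcal{N}(0, \Sigma_{\mathrm{loc}}), \tag{5.24} \]

where

\[ \Sigma_{\mathrm{loc}} = \frac{\theta\,\big\|B^*_{0,x_0}(-\Delta_0)^{\lceil\gamma\rceil-\gamma}\tilde K\big\|^2_{L^2(\mathbb{R}^d)}}{T\,\Psi_\Delta\big((-\Delta_0)^{\lceil\gamma\rceil-\gamma}\tilde K\big)}. \tag{5.25} \]

In the case $\eta_\infty \le 1$, it holds

\[ \delta^{-a}\big(\hat\theta_{\delta,x_0} - \theta\big) \xrightarrow{\mathbb{P}} 0 \tag{5.26} \]

for all $a < \eta_\infty$.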

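Proof. We use the representation of $\hat\theta_{\delta,x_0} - \theta$ as given in (5.11). First, we set $M^{(\delta)}_T := C_{\mathrm{loc}}^{-1/2}\,\delta^{1-2\gamma} \int_0^T X^{\Delta K}_{\delta,x_0}(t)\,dW_{\delta,x_0}(t)$. Then $\langle M^{(\delta)} \rangle_T \to 1$ in probability as $\delta \to 0$, and Theorem A.1 implies that $M^{(\delta)}_T \to \mathcal{N}(0,1)$ in distribution. An application of Slutsky's lemma together with Lemma 5.3 (ii) and Lemma 5.7 (i) gives

\[ \delta^{-1}\,\|B^* K_{\delta,x_0}\|_{L^2(\mathcal{D})}\,\frac{\int_0^T X^{\Delta K}_{\delta,x_0}(t)\,dW_{\delta,x_0}(t)}{\int_0^T X^{\Delta K}_{\delta,x_0}(t)^2\,dt} \xrightarrow{d} \mathcal{N}(0, \Sigma_{\mathrm{loc}}). \]

Next, again by Lemma 5.7, for any $\epsilon > 0$,

\[ \frac{\int_0^T X^{\Delta K}_{\delta,x_0}(t)\,\langle F(X)(t), K_{\delta,x_0} \rangle_{L^2(\mathcal{D})}\,dt}{\int_0^T X^{\Delta K}_{\delta,x_0}(t)^2\,dt} \lesssim \delta^{\eta_\infty \wedge (1+d/4) - \epsilon}. \]

Using (5.11), this implies (ii). Finally, (i) is a consequence of (ii).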

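As an important example, if $B = \sigma(-\Delta)^{-\gamma}$ for some $\sigma > 0$ and $\gamma \in \mathbb{N}$, then we immediately obtain due to $B_{0,x_0} Z = \sigma Z$ for $Z \in L^2(\mathbb{R}^d)$:

\[ \Sigma_{\mathrm{loc}} = \frac{2\theta\,\|\tilde K\|^2_{L^2(\mathbb{R}^d)}}{T\,\|\nabla \tilde K\|^2_{L^2(\mathbb{R}^d)}}. \tag{5.27} \]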
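Formula (5.27) is easy to evaluate numerically for a given $\tilde K$. The following is a minimal sketch in $d = 1$ on a uniform grid; the function name `sigma_loc` and the finite-difference gradient are illustrative assumptions. The Gaussian used below is not compactly supported and therefore does not satisfy (LK); it serves only as a closed-form sanity check, since for $\tilde K(x) = e^{-x^2/2}$ one has $\|\tilde K\|^2 = \sqrt{\pi}$ and $\|\nabla \tilde K\|^2 = \sqrt{\pi}/2$, so that (5.27) gives $4\theta/T$.

```python
import numpy as np

def sigma_loc(k_tilde, dx, theta, T):
    """Evaluate (5.27) in d = 1 from samples of K-tilde on a uniform grid (illustrative)."""
    k_tilde = np.asarray(k_tilde, dtype=float)
    norm_sq = np.sum(k_tilde ** 2) * dx                         # ||K~||^2 in L2
    grad_norm_sq = np.sum(np.gradient(k_tilde, dx) ** 2) * dx   # ||grad K~||^2 in L2
    return 2.0 * theta * norm_sq / (T * grad_norm_sq)

x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
value = sigma_loc(np.exp(-x ** 2 / 2.0), dx, theta=0.5, T=2.0)
expected = 4.0 * 0.5 / 2.0   # closed form 4 * theta / T for the Gaussian
```

Note that $\sigma$ does not enter: the noise level cancels between numerator and denominator of (5.25).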

Example 5.9. The results from Theorem 5.8 can be applied to the models from Section 2.4. For simplicity, we assume that $s^*_\infty = s^*$ (which is true, for example, in $d = 1$). In this case, the effective excess regularity $\eta_\infty$ coincides with the optimal excess regularity $\eta$.

(i) Linear Perturbations. If $F(X) = c(-\Delta)^{r/2}$ for some $c \in \mathbb{R}$, $r < 2$, then $(F^p_{s,\eta})$ is true for all $s \in \mathbb{R}$, $p \ge 2$ and $\eta < 2 - r$. Thus, if $r < 1$, then $\hat\theta_{\delta,x_0}$ is asymptotically normal as in (5.24). Otherwise, $\hat\theta_{\delta,x_0}$ is consistent as in (5.26). In particular, perturbations up to order 1 are negligible, with first order perturbations being the critical case.

(ii) Reaction–Diffusion Equations. If $F(X) = f(X)$, where $f$ is either a polynomial as in (2.40) or a bounded smooth function with bounded derivatives of any order as in (2.41), then $(F^p_{s,\eta})$ holds for any $p \ge 2$, $s > d/p$ and $\eta < 2$. This can be seen verbatim as in Proposition 2.19, using the fact that $H^{s,p}(\mathcal{D})$ is closed under multiplication if $sp > d$ (cf. [AF03, Theorem 4.39]) for the case of polynomial $f$, as well as bounds on composition operators [AF92] for the case $f \in C^\infty_b(\mathbb{R})$. Consequently, $\hat\theta_{\delta,x_0}$ is asymptotically normal as in (5.24).

(iii) Burgers Equations. In $d = 1$, if $F(X) = -X\partial_x X = -\partial_x(X^2/2)$, then exactly as in Lemma 2.22 (iii) it can be shown that $(F^p_{s,\eta})$ is true for $p \ge 2$, $s > 1/p$ and $\eta < 1$. In particular, $\hat\theta_{\delta,x_0}$ is consistent as in (5.26), i.e. $\delta^{-a}(\hat\theta_{\delta,x_0} - \theta) \to 0$ in probability for all $a < 1$. In fact, for this particular model, it is possible to prove that the first term in (5.11), representing the bias from neglecting the effect of $F$, converges to zero in probability even with rate $\delta$ instead of $\delta^a$ for $a < 1$, which means that asymptotic normality as in (5.24) transfers to $\hat\theta_{\delta,x_0}$ for the one-dimensional Burgers equation (see [ACP20, Theorem 11] for details).

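Remark 5.10 ($L^p$–regularity in the spectral approach). $L^p$–regularity has been a crucial tool in determining the excess regularity $\eta$ of $\tilde X$ in the local approach. It is a natural question to ask if $L^p$–regularity can improve the spectral approach as well. In order to make the two approaches comparable, we assume that a single Fourier mode of $X$ (instead of the first $N$ Fourier modes) is observed in the spectral approach. For simplicity, let $F(X) = (-\Delta)^r$, $r < 2$ and $B = \sigma(-\Delta)^{-\gamma}$. The natural MLE–type estimator, neglecting information on $F$, is given by

\[ \hat\theta^{\mathrm{mode}}_N = -\frac{\int_0^T \lambda_N^{1+2\gamma} x^{(N)}_t\,dx^{(N)}_t}{\int_0^T \lambda_N^{2+2\gamma} (x^{(N)}_t)^2\,dt} = -\frac{\int_0^T x^{(N)}_t\,dx^{(N)}_t}{\lambda_N \int_0^T (x^{(N)}_t)^2\,dt} \tag{5.28} \]

with $x^{(N)} = \langle X, \Phi_N \rangle_{L^2(\mathcal{D})}$ as before. This is the canonical analogue of $\hat\theta^{\mathrm{lin}}$ (with $\alpha = \gamma$) for single mode observations, and it corresponds to $\hat\theta_{\delta,x_0}$ if $K_{\delta,x_0}$ is replaced by $\Phi_N$. By (2.22) with respect to the linear operator $A = \theta\Delta + (-\Delta)^r$, together with Lemma A.2 (i) (setting $X^*_k = \lambda_k^{1+\gamma} x^{(k)}$ therein and taking into account $\mathrm{Var}\big(\int_0^T (x^{(k)}_t)^2\,dt\big) \lesssim \lambda_k^{-4\gamma-3}$, see e.g. [Lot09, Theorem 2.1] or [PS20, Lemma 4.1]), we have $\lambda_N^{2+2\gamma} \int_0^T (x^{(N)}_t)^2\,dt \asymp \frac{\sigma^2 T \Lambda}{2\theta} N^{2/d}$ in probability. In particular, if there is $\eta' > 0$ such that

\[ \lambda_N^{2\gamma} \int_0^T \langle F(X)(t), \Phi_N \rangle^2_{L^2(\mathcal{D})}\,dt \lesssim N^{\frac2d - 2\eta'}, \tag{5.29} \]

a decomposition as in (5.11) yields

\[ \text{if } \eta' > 1/d: \quad N^{\frac1d}\big(\hat\theta^{\mathrm{mode}}_N - \theta\big) \xrightarrow{d} \mathcal{N}\Big(0, \frac{2\theta}{T\Lambda}\Big), \tag{5.30} \]

\[ \text{if } \eta' \le 1/d: \quad N^{a}\big(\hat\theta^{\mathrm{mode}}_N - \theta\big) \xrightarrow{\mathbb{P}} 0 \quad \text{for } a < \eta'. \tag{5.31} \]

We have that $(F^p_{s,\eta})$ is true for any $s \in \mathbb{R}$, $p \ge 2$ and $\eta < 2 - r$, thus $F(X) \in R_p(s - r)$ for all $s < s^*$ and $p \ge 2$ (if $s^*_\infty = s^*$). Now, analogously to the proof of Lemma 5.7, the Hölder inequality gives

\[ \lambda_N^{2\gamma} \int_0^T \langle F(X)(t), \Phi_N \rangle^2_{L^2(\mathcal{D})}\,dt \lesssim N^{\frac2d + 1 - \frac{2\eta}{d} + \epsilon}\,\|\Phi_N\|^2_{L^q(\mathcal{D})} \]

for all $q > 1$ and $\epsilon > 0$, so with $q = 2$ we can choose $\eta' = \eta/d - 1/2 - \epsilon/2$ in (5.29). In particular, if $\eta > 1 + 1/\beta = 1 + d/2$, then $\eta' > 1/d$, and $\hat\theta^{\mathrm{mode}}_N$ is asymptotically normal, in accordance with Theorem 2.11.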

Now, in order to exploit higher $L^p$–regularity of $X$ in the spectral approach, the term $\|\Phi_N\|_{L^q(\mathcal{D})}$ has to converge to zero as $N \to \infty$, with convergence rate possibly depending on $1 \le q \le 2$. By interpolation, it suffices to understand the border case $q = 1$ in order to obtain a bound on the rate. But in general, such convergence to zero does not hold: For example, if $d = 1$ and $\mathcal{D} = [0,1]$, we have $\Phi_N(x) = \sqrt{2}\sin(N\pi x)$, and $\|\Phi_N\|_{L^1(\mathcal{D})} = 2\sqrt{2}/\pi$ is independent of $N$. For general domains, results from the literature on $L^1$–bounds for the eigenfunctions of the Laplacian [vdBHV15, Vog15] do not improve the trivial estimate $\sup_{N \in \mathbb{N}} \|\Phi_N\|_{L^1(\mathcal{D})} < \infty$ in terms of the convergence rate in $N$. Even more, [Sog15, BS17] indicate that improved $L^1$–bounds (along eigenvalue subsequences) on compact manifolds without boundary are related to concentration of mass of the eigenfunctions along geodesics.⁴
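The one-dimensional example can be verified directly. The sketch below approximates $\|\Phi_N\|_{L^1([0,1])}$ for $\Phi_N(x) = \sqrt{2}\sin(N\pi x)$ by a midpoint rule; the function name and the number of quadrature points per half-wave are illustrative choices.

```python
import numpy as np

def l1_norm_sine_eigenfunction(n, samples_per_halfwave=2000):
    """Midpoint-rule approximation of the L1([0,1]) norm of sqrt(2) * sin(n * pi * x)."""
    m = n * samples_per_halfwave
    x = (np.arange(m) + 0.5) / m                  # midpoints of a uniform partition of [0, 1]
    phi = np.sqrt(2.0) * np.sin(n * np.pi * x)
    return float(np.mean(np.abs(phi)))            # interval length is 1, so the mean is the integral

norms = [l1_norm_sine_eigenfunction(n) for n in (1, 5, 50)]
exact = 2.0 * np.sqrt(2.0) / np.pi
```

All entries of `norms` agree with `exact` up to quadrature error, so no decay in $N$ is available from the $L^1$ norm in this case.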

In contrast, $\|K_{\delta,x_0}\|_{L^q(\mathcal{D})} \lesssim \delta^{-\frac d2 + \frac dq}$ by (5.5) if $K$ has compact support, which improves the convergence rate in $\delta$ as $q \to 1$.

Remark 5.11.

(i) If the linear process $\bar X$ has optimal $L^p$–regularity, i.e. if $s^*_\infty = s^*$, then $\eta_\infty = \eta$, and the convergence rate of $\hat\theta_{\delta,x_0}$ in Theorem 5.8 is determined directly by the excess regularity $\eta$ coming from $(F^p_{s,\eta})$.

(ii) In view of condition (LK), it is natural to consider also the case $\gamma = 0$. Indeed, it is possible to include that case without further modification as long as (5.1) is well-posed. This has been done in [ABJR21] in the context of the stochastic Meinhardt model. For instance, the results from [DPZ14, Chapter 7] show that in $d = 1$, reaction-diffusion equations driven by space-time white noise can be given a meaning in the mild sense. If $\gamma = 0$, then condition (LK) is void, whereas for positive $\gamma$, (LK) suggests that $K$ approximates higher order derivatives of $X$ instead of point evaluations.

(iii) The convergence rate $\delta$ of $\hat\theta_{\delta,x_0}$ can be recovered from the spectral approach: We can easily see how the asymptotic variance $\Sigma$ from Theorem 2.11 for the estimator $\hat\theta^{\mathrm{full}}_N$ behaves if the domain $\mathcal{D}$ is replaced by a shrunk domain $\mathcal{D}_{1/\delta,x_0}$. $\Sigma$ depends linearly on $\Lambda^{-1}$, and this constant is characterized by $\lambda_k \asymp \Lambda k^{2/d}$. An explicit term for $\Lambda$ can be found e.g. in [Shu01, Section 13.4], and using the notation therein, it holds $\Lambda = V_1^{-2/d}$, where $V_1$ depends linearly on $|\mathcal{D}|$. It is clear that $|\mathcal{D}_{1/\delta,x_0}| \sim \delta^d |\mathcal{D}|$. Consequently, $\Lambda \sim \delta^{-2}$, and finally, $\Sigma \sim \delta^2$. This is in accordance with Theorem 5.8.

⁴ Note the apparent asymmetry between the (optimal) bound $\|\Phi_N\|_{L^\infty(\mathcal{D})} \lesssim \lambda_N^{(d-1)/4}$ [Gri02] and the trivial bound $\|\Phi_N\|_{L^1(\mathcal{D})} \lesssim 1$. If the latter cannot be improved, this means that it is not possible to recover $\|\Phi_N\|_{L^2(\mathcal{D})} \equiv 1$ from interpolating the $L^1$ and $L^\infty$ cases. Indeed, as pointed out in [Gri92, SS07] (see also references therein), for $p \ge 2$, optimal $L^p$–bounds for (linear combinations of) $\Phi_N$ do not simply follow from interpolation between the $L^2$ and $L^\infty$ cases.

Chapter 6

Diffusivity Estimation for

Activator-Inhibitor Models

This chapter is an adaptation of material from [PFA+21].

We apply the theory of parameter estimation for semilinear SPDEs to

a particular test case from cell biology, concerning the dynamical behavior

of actin concentration in D. discoideum giant cells. The actin cytoskeleton

plays a crucial role in different processes such as motility of amoeboid cells

[BBPSP14]. In spite of its complex filamental structure, at the length scale

of the cell itself it may be reasonably approximated by a scalar field, repre-

senting concentration. Intracellular actin is capable of generating traveling

waves. In [FFAB20], this has been described by an SPDE of FitzHugh–

Nagumo type, which is coupled to a phase field representing the boundary of

the cell. In order to increase the spatial resolution of experimental data, it is possible to artificially merge various cells to form a so-called giant cell, see [GEW+14]. In particular, this allows one to observe the spatiotemporal actin dynamics within a cell away from the cell boundary. In this case, the describing model can be simplified by neglecting the phase field.

The reaction model employed in [FFAB20] in order to describe the actin

dynamics is a minimal model capable of generating traveling waves rather

than a detailed representation of the biochemical reaction pathway. Conse-

quently, it is natural to ask to what extent the true dynamics is described by

that model. In order to provide a first step towards answering that question,

we extend the theory of diffusivity estimation for semilinear SPDEs from

Chapter 2, including simultaneous estimation of reaction parameters. To

this end, we assume that the nonlinearity is given as a parametrized term.

This can be interpreted as qualitative a priori knowledge on the behavior of the reaction terms, without knowing the magnitude of the involved parameters quantitatively. Based on these considerations, we compare the effective diffusivity, given as the value of one of several related estimators, on simulated and experimental data, in order to understand the effects of the reaction model.

In Section 6.1, we discuss joint diffusivity and reaction parameter esti-

mation for semilinear SPDEs, extending the results from Chapter 2. We put

special emphasis on the statistically linear case, i.e. when the nonlinearity

depends linearly on its parameters. Finally, we state and discuss the regu-

larity properties of an activator-inhibitor model, which is closely related to

[FFAB20]. In Section 6.2, we apply the estimation theory from Section 6.1

to simulated and real data described by that activator-inhibitor model.

6.1 Joint Diffusivity and Reaction Parameter

Estimation

6.1.1 The General Case

We extend the model from Chapter 2 by allowing the nonlinear term $F$ to depend on additional parameters $\theta_1, \dots, \theta_K$, $K > 0$, which we call reaction parameters in the sequel:

\[ dX_t = \theta_0 A X_t\,dt + F_{\theta_{1:K}}(X)(t)\,dt + B(X_t)\,dW_t \tag{6.1} \]

with initial condition $X_0$, where we write $\theta_{1:K} = (\theta_1, \dots, \theta_K)$ for short. Further, we write $\theta = (\theta_0, \dots, \theta_K)$ for the complete parameter vector. We fix the parameter space $\Theta \subset \mathbb{R}^{K+1}$, which encodes our usual standing assumption $\theta_0 > 0$, as well as possible restrictions on the reaction parameters coming from the bifurcation structure of (6.1) together with a priori knowledge on the dynamical regime. It is possible that an estimator for $\theta$ takes values outside $\Theta$; in that case, it should be considered void. The dispersion operator $B$ is assumed to satisfy $(N^\gamma_\eta)$ for some $\gamma > d/4$ and $\eta > 0$. In order to take the reaction parameters into account when controlling the nonlinearity in the drift term, we have to extend condition $(F_{s,\eta})$:

$(F^{\mathrm{par}}_{s,\eta})$ There are continuous functions $g : [0,\infty) \to [0,\infty)$ and $c : \mathbb{R}^K \to [0,\infty)$ and there is $\epsilon > 0$ such that for $Z \in R(s)$:

\[ \|F_{\theta_{1:K}}(Z)\|_{R(s+\eta+\epsilon-2)} \le c(\theta_1, \dots, \theta_K)\,g\big(\|Z\|_{R(s)}\big). \tag{6.2} \]

W.l.o.g. we always assume that gis increasing. The non-Markovian

nature of Fis crucial in our main example, as explained in Section 6.1.3.

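Setting formally $B = \sigma(-A)^{-\gamma}$, by [LS77, Section 7.6.4], the log-likelihood for $X^N$ is heuristically given by

\[
\begin{aligned}
\ln \frac{d\mathbb{P}^{N,T}_{\theta}}{d\mathbb{P}^{N,T}_{\bar\theta}}
&= \frac{1}{\sigma^2} \int_0^T \big\langle (\theta_0 - \bar\theta_0) A X^N_t, (-A)^{2\gamma}\,dX^N_t \big\rangle \\
&\quad + \frac{1}{\sigma^2} \int_0^T \big\langle P_N F_{\theta_{1:K}}(X)(t) - P_N F_{\bar\theta_{1:K}}(X)(t), (-A)^{2\gamma}\,dX^N_t \big\rangle \\
&\quad - \frac{1}{2\sigma^2} \int_0^T \Big\langle (\theta_0 - \bar\theta_0) A X^N_t + P_N F_{\theta_{1:K}}(X)(t) - P_N F_{\bar\theta_{1:K}}(X)(t), \\
&\qquad\qquad (-A)^{2\gamma} \big[ (\theta_0 + \bar\theta_0) A X^N_t + P_N F_{\theta_{1:K}}(X)(t) + P_N F_{\bar\theta_{1:K}}(X)(t) \big] \Big\rangle\,dt,
\end{aligned}
\]

where $\bar\theta = (\bar\theta_0, \dots, \bar\theta_K) \in \Theta$ is an arbitrary reference parameter.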

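Maximizing this term for $\theta_0, \dots, \theta_K$ simultaneously leads to the corresponding likelihood equations

\[
\begin{aligned}
-\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, dX^N_t \big\rangle
&= \int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, \theta_0 (-A) X^N_t - P_N F_{\theta_{1:K}}(X)(t) \big\rangle\,dt, \\
\int_0^T \big\langle \partial_{\theta_i} P_N F_{\theta_{1:K}}(X)(t), (-A)^{2\alpha}\,dX^N_t \big\rangle
&= -\int_0^T \big\langle (-A)^{2\alpha} \partial_{\theta_i} P_N F_{\theta_{1:K}}(X)(t), \theta_0 (-A) X^N_t - P_N F_{\theta_{1:K}}(X)(t) \big\rangle\,dt
\end{aligned}
\]

for $i = 1, \dots, K$, where as before we substituted $\gamma$ by a free parameter $\alpha$. Without further mentioning it, we assume that there is a solution to these equations. We fix any solution and call it a maximum likelihood–type estimator $\hat\theta^N$ for our problem, with components $\hat\theta^N_0, \dots, \hat\theta^N_K$. In general, the likelihood equations cannot be solved explicitly.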

Now, depending on the specific form of $F$ as well as the eigenvalue asymptotics of $A$, it may happen that not all reaction parameters (if any) are identifiable in finite time. This means that $\hat\theta^N_i$ does not necessarily converge to $\theta_i$ as $N \to \infty$ if $i \ge 1$. Therefore, we put our main focus on diffusivity estimation, i.e. identifying $\theta_0$, and analyze the impact of the reaction parameters on that problem. In order to control $\hat\theta^N_{1:K}$ when studying the asymptotics of $\hat\theta^N_0$, it suffices that the reaction parameter estimators are bounded in probability (or tight), i.e. $\sup_{N \in \mathbb{N}} \mathbb{P}(|\hat\theta^N_i| > M) \to 0$ for $M \to \infty$ and all $1 \le i \le K$. From the likelihood equations it is clear that $\hat\theta^N_0$ can be written as

\[ \hat\theta^N_0 = -\frac{\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, dX^N_t \big\rangle}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2_H\,dt} + \frac{\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N F_{\hat\theta^N_{1:K}}(X)(t) \big\rangle\,dt}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2_H\,dt}, \tag{6.3} \]

even if this is not explicit due to the presence of $\hat\theta^N_{1:K}$.

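Theorem 6.1. Let $\gamma > 1/(2\beta)$ and further $s_0 \ge 0$, $\eta > 0 \vee (1/\beta - 1)$ such that $(N^\gamma_\eta)$ and $(F^{\mathrm{par}}_{s,\eta})$ for $s_0 \le s < s^*$ are true. Assume $X_0 \in H_{s^*+\eta}$ and $X \in R(s_0)$. Let $\alpha > \gamma - 1/4$, and assume that $(\hat\theta^N_i)_{N \in \mathbb{N}}$ are bounded in probability for $1 \le i \le K$. Then:

(i) If $\eta > 1 + 1/\beta$, then $\hat\theta^N_0$ is asymptotically normal as in (2.29).

(ii) If $\eta \le 1 + 1/\beta$, then $\hat\theta^N_0$ is consistent in probability with rate $N^{-a}$, $a < \beta\eta/2$, as in (3.70).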

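Proof. By Theorem 3.22, the claim is true with $\hat\theta^N_0$ replaced by $\hat\theta^{\mathrm{lin}}_N$ given by (2.26). Thus, by (6.3) it suffices to control the bias term involving $F_{\hat\theta^N_{1:K}}(X)$. As in the proof of Theorem 2.11, we see that by means of $(F^{\mathrm{par}}_{s,\eta})$,

\[ \int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N F_{\hat\theta^N_{1:K}}(X)(t) \big\rangle\,dt \ll_p c(\hat\theta^N_{1:K})\,N^{1+\beta(2\alpha-2\gamma+1-\eta/2)}, \]

and consequently, for (i),

\[ N^{\frac{1+\beta}{2}}\,\frac{\int_0^T \big\langle (-A)^{1+2\alpha} X^N_t, P_N F_{\hat\theta^N_{1:K}}(X)(t) \big\rangle\,dt}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2_H\,dt} \ll_p c(\hat\theta^N_{1:K})\,N^{\beta(1+1/\beta-\eta)/2} \]

almost surely. The right-hand side converges to zero in probability because $\eta > 1 + 1/\beta$ and $\hat\theta^N_{1:K}$ are bounded in probability. This implies (i). The case (ii) is similar.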
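The exponent bookkeeping behind both cases can be spelled out as follows. This is a worked restatement under the assumption that the denominator in (6.3) grows like $N^{1+\beta(2\alpha-2\gamma+1)}$, which is the rate that also appears in (6.10); constants and the factor $c(\hat\theta^N_{1:K})$ are suppressed.

```latex
\begin{align*}
\frac{N^{1+\beta(2\alpha-2\gamma+1-\eta/2)}}{N^{1+\beta(2\alpha-2\gamma+1)}}
  &= N^{-\beta\eta/2}, \\
N^{\frac{1+\beta}{2}} \cdot N^{-\beta\eta/2}
  &= N^{\beta(1+1/\beta-\eta)/2}
  \longrightarrow 0 \quad\text{iff}\quad \eta > 1 + 1/\beta, \\
N^{a} \cdot N^{-\beta\eta/2}
  &\longrightarrow 0 \quad\text{iff}\quad a < \beta\eta/2 .
\end{align*}
```

The second line corresponds to case (i), where the bias is negligible at the central limit scale $N^{(1+\beta)/2}$, and the third line to the rate stated in case (ii).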


6.1.2 The Statistically Linear Case

In this section, let $F$ depend linearly on its parameters:¹

\[ F_{\theta_{1:K}}(X) = F^*(X) + \sum_{i=1}^{K} \theta_i F_i(X) \tag{6.4} \]

for functions $F_1, \dots, F_K, F^* : C(0,T;H) \supset D(F) \to L^1(0,T;H)$. We set $D_\alpha(F) := \{ Z \in D(F) \mid F_1(Z), \dots, F_K(Z) \in L^2(0,T;H_{2\alpha}) \}$. Identifiability of $\theta_1, \dots, \theta_K$ is ensured by a non-degeneracy condition:

$(I_\alpha)$ The terms $F_1(Z), \dots, F_K(Z)$ are linearly independent in $L^2(0,T;H_{2\alpha})$ for every $Z \in D_\alpha(F)$ which is not constant in $t \in [0,T]$.

As a consequence of $(I_\alpha)$,

\[ \int_0^T \|(-A)^{\alpha} F_i(X)(t)\|^2_H\,dt > 0 \tag{6.5} \]

for $i = 1, \dots, K$. In order to unify notation in the sequel, we define

\[ F_0(X) := AX. \tag{6.6} \]

The maximum likelihood equations simplify to

\[ A^{(\alpha)}_N(X)\,\hat\theta^N = b^{(\alpha)}_N(X), \tag{6.7} \]

where

\[
\begin{aligned}
A^{(\alpha)}_N(X)_{i,j} &= \int_0^T \big\langle (-A)^{\alpha} P_N F_i(X)(t), (-A)^{\alpha} P_N F_j(X)(t) \big\rangle\,dt, \\
b^{(\alpha)}_N(X)_i &= -\int_0^T \big\langle (-A)^{2\alpha} P_N F_i(X)(t), P_N F^*(X)(t) \big\rangle\,dt + \int_0^T \big\langle (-A)^{2\alpha} P_N F_i(X)(t), dX^N_t \big\rangle.
\end{aligned}
\]

¹ A comparable setup has been considered in [Hue93, Chapter 3] for linear SPDEs in the spectral approach with similar arguments as given below, and in [DMPD00, Section 3] for semilinear SPDEs in the large time regime.

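In particular,

\[ A^{(\alpha)}_N(X)_{0,0} = \int_0^T \big\|(-A)^{1+\alpha} X^N_t\big\|^2_H\,dt. \tag{6.8} \]

Further, it is immediate that for $i, j = 0, \dots, K$:

\[ \big| A^{(\alpha)}_N(X)_{i,j} \big| \le \sqrt{A^{(\alpha)}_N(X)_{i,i}\,A^{(\alpha)}_N(X)_{j,j}}. \tag{6.9} \]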
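In the statistically linear case, the estimator is obtained by assembling the Gram matrix $A^{(\alpha)}_N(X)$ and the vector $b^{(\alpha)}_N(X)$ and solving the linear system (6.7). The following is a minimal sketch for time-discretized spectral data; the function name, the array layout and the left-endpoint discretization of the stochastic integral are assumptions made for illustration. The synthetic check uses noise-free increments, for which the true parameter vector solves (6.7) exactly.

```python
import numpy as np

def solve_likelihood_equations(F, F_star, dX, lam, alpha, dt):
    """Assemble and solve the linear system (6.7) from discretized mode data (illustrative).

    F      : array (K+1, n_t, N); F[i, t, k] is the k-th mode of F_i(X)(t), with F[0] = A X
    F_star : array (n_t, N), modes of F*(X)(t)
    dX     : array (n_t, N), increments of the first N modes of X
    lam    : array (N,), eigenvalues of -A
    """
    w = lam ** (2.0 * alpha)                          # weights induced by (-A)^{2 alpha}
    A = np.einsum("itk,jtk,k->ij", F, F, w) * dt      # Gram matrix A_N^{(alpha)}(X)
    b = np.einsum("itk,tk,k->i", F, dX - F_star * dt, w)
    return np.linalg.solve(A, b)

# Synthetic noise-free check with hypothetical data.
rng = np.random.default_rng(0)
n_t, N, dt, alpha = 200, 8, 1e-3, 0.25
lam = np.arange(1, N + 1, dtype=float) ** 2
x = rng.normal(size=(n_t, N))                         # stand-in for the modes of X
F = np.stack([-lam * x, rng.normal(size=(n_t, N)), rng.normal(size=(n_t, N))])
F_star = np.zeros((n_t, N))
theta_true = np.array([0.5, 1.5, -0.7])
dX = (np.einsum("i,itk->tk", theta_true, F) + F_star) * dt
theta_hat = solve_likelihood_equations(F, F_star, dX, lam, alpha, dt)
```

In applications, solvability of the system depends on the conditioning of the Gram matrix, which is the subject of Lemma 6.2 below in the asymptotic regime.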

In order to connect to Theorem 6.1 and prove that the reaction parameter estimators $\hat\theta_{1:K}$ are bounded in probability, we have to control the rate of the determinant of $A^{(\alpha)}_N(X)$, whose square root is the volume of the $(K+1)$-dimensional parallelepiped spanned by $P_N F_0(X), \dots, P_N F_K(X)$ in $L^2(0,T;H_{2\alpha})$. In order to do so, we choose $\alpha$ in such a way that $P_N F_0(X) = A X^N$ diverges in $L^2(0,T;H_{2\alpha})$, while $P_N F_1(X), \dots, P_N F_K(X)$ converge. This way, $A X^N$ becomes asymptotically orthogonal to the latter terms and determines the rate of volume growth. This is formalized in the next lemma.

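Lemma 6.2. Let $s_0 \ge 0$, $\eta > 0$ such that $X \in R(s_0)$, $X_0 \in H_{s^*+\eta}$ a.s. and $(N^\gamma_\eta)$ as well as $(F_{s,\eta})$ for $s_0 \le s < s^*$ are true. Let $\alpha \in \mathbb{R}$ such that $\gamma - (1 + 1/\beta)/2 < \alpha < \gamma + (\eta - 1 - 1/\beta)/2$. Under condition $(I_\alpha)$, there are $N_0 \in \mathbb{N}$ and $c, C > 0$ such that

\[ c \int_0^T \big\|(-A)^{1+\alpha} X^N_t\big\|^2_H\,dt \le \det A^{(\alpha)}_N(X) \le C \int_0^T \big\|(-A)^{1+\alpha} X^N_t\big\|^2_H\,dt \]

uniformly in $N \ge N_0$, almost surely. In particular,

\[ \det A^{(\alpha)}_N(X) \sim N^{1+\beta(2\alpha-2\gamma+1)}. \tag{6.10} \]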

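Proof. First, $\alpha < \gamma + (\eta - 1 - 1/\beta)/2$ implies $2\alpha < s^* + \eta - 2$, thus $F_i(X) \in R(2\alpha)$ for $i = 1, \dots, K$. Thus, for these $i$, we have $\lim_{N \to \infty} A^{(\alpha)}_N(X)_{i,i} = \int_0^T \|(-A)^{\alpha} F_i(X)(t)\|^2_H\,dt < \infty$. In particular, $A^{(\alpha)}_N(X)_{i,i}$ are positive and finite for $i = 1, \dots, K$ and large enough $N$. Using (6.9), we have

\[ \det A^{(\alpha)}_N(X) \le (K+1)! \prod_{i=0}^{K} A^{(\alpha)}_N(X)_{i,i} \lesssim A^{(\alpha)}_N(X)_{0,0} = \int_0^T \big\|(-A)^{1+\alpha} X^N_t\big\|^2_H\,dt. \]

For brevity, we use the notation $\langle \cdot, \cdot \rangle_\alpha$ for the scalar product on $L^2(0,T;H_{2\alpha})$ and $\|\cdot\|_\alpha$ for the corresponding norm. We will prove that

\[ \liminf_{N \to \infty} \det\left( \left( \left\langle \frac{P_N F_i(X)}{\|P_N F_i(X)\|_\alpha}, \frac{P_N F_j(X)}{\|P_N F_j(X)\|_\alpha} \right\rangle_\alpha \right)_{i,j=0,\dots,K} \right) > 0. \tag{6.11} \]

First note that by condition $(I_\alpha)$, this is true if the matrix in (6.11) is substituted by its $(0,0)$-minor, i.e. such that $1 \le i, j \le K$.

Let $\epsilon > 0$, let $M \in \mathbb{N}$ such that $\|(I - P_M) F_i(X)\|_\alpha < \epsilon\,\|F_i(X)\|_\alpha$ for $i = 1, \dots, K$. This is possible because $P_N F_i(X) \to F_i(X)$ in $L^2(0,T;H_{2\alpha})$. Then for $i = 1, \dots, K$ and $N > M$:

\[
\begin{aligned}
\left\langle \frac{P_N F_0(X)}{\|P_N F_0(X)\|_\alpha}, \frac{P_N F_i(X)}{\|P_N F_i(X)\|_\alpha} \right\rangle_\alpha
&= \left\langle \frac{P_M F_0(X)}{\|P_N F_0(X)\|_\alpha}, \frac{P_M F_i(X)}{\|P_N F_i(X)\|_\alpha} \right\rangle_\alpha \\
&\quad + \left\langle \frac{(P_N - P_M) F_0(X)}{\|P_N F_0(X)\|_\alpha}, \frac{(P_N - P_M) F_i(X)}{\|P_N F_i(X)\|_\alpha} \right\rangle_\alpha.
\end{aligned}
\]

Taking into account $\|P_N F_i(X)\|_\alpha \to \|F_i(X)\|_\alpha$ and $\|P_N F_0(X)\|_\alpha \to \infty$ in the first term as well as the Cauchy-Schwarz inequality for the second term,

\[ \limsup_{N \to \infty} \left| \left\langle \frac{P_N F_0(X)}{\|P_N F_0(X)\|_\alpha}, \frac{P_N F_i(X)}{\|P_N F_i(X)\|_\alpha} \right\rangle_\alpha \right| \le \limsup_{N \to \infty} \frac{\|(P_N - P_M) F_i(X)\|_\alpha}{\|P_N F_i(X)\|_\alpha} \le \frac{\|(I - P_M) F_i(X)\|_\alpha}{\|F_i(X)\|_\alpha} < \epsilon. \]

As $\epsilon > 0$ is arbitrary, we see that the $(0,i)$-entry of the matrix in (6.11) converges to zero for $i \ge 1$. Expanding the determinant in (6.11) in the first column and using the non-degeneracy of the $(0,0)$-minor, we conclude that (6.11) is true.

If $D_N$ is the diagonal matrix with $i$-th diagonal entry $A^{(\alpha)}_N(X)_{i,i}$, (6.11) is equivalent to $\liminf_{N \to \infty} \det\big(D_N^{-1/2} A^{(\alpha)}_N(X) D_N^{-1/2}\big) > 0$. In particular,

\[ \int_0^T \big\|(-A)^{1+\alpha} X^N_t\big\|^2_H\,dt \lesssim |\det(D_N)| \lesssim \det A^{(\alpha)}_N(X). \tag{6.12} \]

Finally, (2.20) implies (6.10), and all statements in the lemma are proven.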

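Proposition 6.3. Let $s_0 \ge 0$, $\eta > 0$ such that $X \in R(s_0)$, $X_0 \in H_{s^*+\eta}$ a.s. and $(N^\gamma_\eta)$ as well as $(F_{s,\eta})$ for every $s_0 \le s < s^*$ hold. Let $\alpha \in \mathbb{R}$ with $\gamma - (1 + 1/\beta)/4 < \alpha < \gamma + (\eta - 1 - 1/\beta)/4 \vee (\eta - 1 - 1/\beta)/2$ and $\gamma - 1/4 < \alpha \le \gamma$. Under condition $(I_\alpha)$, the sequences $(\hat\theta^N_i)_{N \in \mathbb{N}}$, $i = 1, \dots, K$ are bounded in probability.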

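Proof. First note that every admissible $\alpha \in \mathbb{R}$ is also admissible in Lemma 6.2. W.l.o.g. we can restrict to the case $B(X) \equiv \sigma(-A)^{-\gamma}$ due to $\alpha > \gamma - 1/4$, as explained in Theorem 3.22. Define

\[ \bar b^{(\alpha)}_N(X)_i := \sigma \int_0^T \big\langle (-A)^{2\alpha-\gamma} P_N F_i(X)(t), dW^N_t \big\rangle. \]

By Lemma 6.2, $A^{(\alpha)}_N(X)$ is invertible for all $N \ge N_0$. Plugging the dynamics of $X$ into the stochastic integral appearing in each component of $b^{(\alpha)}_N(X)$, it is immediate that

\[ \hat\theta^N - \theta = A^{(\alpha)}_N(X)^{-1}\,\bar b^{(\alpha)}_N(X). \tag{6.13} \]

For simplicity of notation, denote the entries of $A^{(\alpha)}_N(X)$ by $a_{i,j}$, the entries of $A^{(\alpha)}_N(X)^{-1}$ by $a^{i,j}$ and the entries of $\bar b^{(\alpha)}_N(X)$ by $\bar b_i$, $i, j = 0, \dots, K$. All terms implicitly depend on $N$. W.l.o.g. assume that $a_{i,i} > 0$ for $i = 0, \dots, K$, which is guaranteed by $(I_\alpha)$ for large enough $N$. Then the $i$-th component of $\hat\theta^N - \theta$ reads as

\[ \hat\theta^N_i - \theta_i = \sum_{j=0}^{K} a^{i,j}\,\bar b_j = \frac{1}{\det A^{(\alpha)}_N(X)} \sum_{j=0}^{K} \bar b_j\,(-1)^{i+j} \det(A_{j,i}), \]

where $A_{j,i}$ is the matrix obtained from erasing the $j$-th row and the $i$-th column in $A^{(\alpha)}_N(X)$. By means of $|a_{i,j}| \le \sqrt{a_{i,i}\,a_{j,j}}$,

\[ \big|\hat\theta^N_i - \theta_i\big| \le \frac{1}{\det A^{(\alpha)}_N(X)} \sum_{j=0}^{K} |\bar b_j|\,\frac{K!}{\sqrt{a_{i,i}\,a_{j,j}}} \prod_{\ell=0}^{K} |a_{\ell,\ell}| \lesssim \sum_{j=0}^{K} \frac{|\bar b_j|}{\sqrt{a_{i,i}\,a_{j,j}}}, \]

where we have used Lemma 6.2. Next, $\alpha < \gamma + (\eta - 1 - 1/\beta)/4$ implies $F_j(X) \in R(4\alpha - 2\gamma + \epsilon)$ for some $\epsilon > 0$. By Lemma 2.6, $\lim_{N \to \infty} \bar b_j < \infty$ a.s. for $j = 1, \dots, K$. Thus, for these $j$, $\bar b_j / \sqrt{a_{j,j}}$ is bounded almost surely and thus in probability. Finally, taking into account $\alpha > \gamma - (1 + 1/\beta)/4$,

\[ N^{\beta(\gamma-\alpha)}\,\frac{\bar b_0}{\sqrt{a_{0,0}}} = \sigma N^{\beta(\gamma-\alpha)} \sqrt{\frac{\int_0^T \|(-A)^{1+2\alpha-\gamma} X^N_t\|^2_H\,dt}{\int_0^T \|(-A)^{1+\alpha} X^N_t\|^2_H\,dt}} \times \frac{\int_0^T \big\langle (-A)^{1+2\alpha-\gamma} X^N_t, dW^N_t \big\rangle}{\sqrt{\int_0^T \|(-A)^{1+2\alpha-\gamma} X^N_t\|^2_H\,dt}}, \]

which converges to a normal distribution by (2.20) together with the choice of $\alpha > \gamma - (1 + 1/\beta)/4$, Theorem A.1, and the Slutsky lemma. Consequently, as $\alpha \le \gamma$, we see that $\bar b_0 / \sqrt{a_{0,0}}$ is bounded in probability, too. In total, for $i = 1, \dots, K$, $|\hat\theta^N_i - \theta_i|$ is bounded a.s. by the sum of random variables that are bounded in probability, so $(\hat\theta^N_i)_{N \in \mathbb{N}}$ itself is bounded in probability.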

In particular, consider the case $A = \Delta$ in dimension $d \le 2$, where $F$ is a reaction term that satisfies $(F^{\mathrm{par}}_{s,\eta})$ for all $\eta < 2$. Combining Theorem 6.1 with Proposition 6.3, we obtain:

Theorem 6.4. Let $d \le 2$, $\gamma > d/4$, $s_0 \ge 0$ such that $(N^\gamma_\eta)$ and $(F^{\mathrm{par}}_{s,\eta})$ hold for $s_0 \le s < s^*$ and $0 < \eta < 2$. Let a.s. $X_0 \in H_{s^*+2}$ and $X \in R(s_0)$.

(i) In $d = 1$, let $\gamma - 1/4 < \alpha \le \gamma$ such that $(I_\alpha)$ holds. Then $\hat\theta^N_0$ is asymptotically normal as in (2.29).

(ii) In $d = 2$, let $\gamma - 1/4 < \alpha < \gamma$ such that $(I_\alpha)$ holds. Then $\hat\theta^N_0$ converges in probability to $\theta_0$ with rate $N^{-a}$ for every $a < 1$.

Remark 6.5. The condition on $\alpha$ can be relaxed. For example, if $B = \sigma(-A)^{-\gamma}$, only $\alpha > \gamma - (1 + d/2)/4$ is needed, and a similar result for dimension $d = 3$ can be stated, cf. Remark 3.23.

Theorem 6.4 is applicable to the activator-inhibitor model explained in the next section.

6.1.3 A Model of FitzHugh–Nagumo Type

For $L_1, \dots, L_d > 0$, let $\mathcal{D} = [0, L_1] \times \dots \times [0, L_d] \subset \mathbb{R}^d$ be a bounded domain. Motivated by [FFAB20], we consider an activator–inhibitor model of the form

\[ dU_t = \Big( D_U \Delta U_t + k_1 U_t (u_0 - U_t)\big(U_t - u_0\,a(\|U_t\|_{L^2(\mathcal{D})})\big) - k_2 V_t \Big)\,dt + B\,dW_t, \tag{6.14} \]

\[ dV_t = \big( D_V \Delta V_t + \epsilon (b U_t - V_t) \big)\,dt, \tag{6.15} \]

together with initial conditions $U_0, V_0$ and periodic boundary conditions. Consequently, in this setting, the state space is given by $H := L^2(\mathcal{D})$, and $H_1 = \bar W^{1,2}(\mathcal{D}) = \{ u \in W^{1,2}(\mathcal{D}) \mid \int_{\mathcal{D}} u\,dx = 0 \}$. Here and in the sequel we consider only the case $B = \sigma(-\Delta)^{-\gamma}$ for $\sigma > 0$. The parameters $D_U, D_V > 0$ are the diffusivity constants for the activator $U$ and inhibitor $V$, respectively. The parameters $k_1, k_2, u_0, \epsilon, b$ are supposed to be positive. Finally, $a : \mathbb{R} \to \mathbb{R}$ is a bounded continuously differentiable function with bounded derivative. The boundedness of $a$ is not essential and can be modeled in practice with a cutoff. For constant $a$, this is the spatially extended FitzHugh–Nagumo model. We mention that a careful choice of the function $a$ can have a stabilizing effect on the dynamics. We also introduce an additional parameter $\bar a \in (0,1)$, which will describe the effective long–time average of $a(\|U\|)$. The initial conditions are assumed to be sufficiently regular, i.e. $\mathbb{E}[\|U_0\|^p_{s^*}] < \infty$, $\mathbb{E}[\|V_0\|^p_{s^*+2}] < \infty$ for all $p \ge 2$ and $s^* = 1 + 2\gamma - d/2$. This model is well-posed in dimension $d \le 3$:

Proposition 6.6. Let $d \le 3$ and $\gamma > d/4 + 1/2$. Then there exists a unique solution $(U, V)$ to (6.14), (6.15) with $U \in R_E(s)$ and $V \in R_E(s+2)$ for any $s < s^*$.

The proof is given in Appendix B.3.

This model is used to describe cell data in Section 6.2, where we assume

that the observation Xis given by the activator concentration X=U. The

activator dynamics in this model can be reduced to (6.1) as follows: First,

with the variation of constants formula, $V$ is determined by $U$ via

\[
V_t = e^{t(D_V\Delta - \epsilon I)} V_0 + \epsilon b \int_0^t e^{(t-r)(D_V\Delta - \epsilon I)} U_r\,dr, \tag{6.16}
\]

where $I$ is the identity operator acting on $L^2(D)$ and $t \mapsto e^{t(D_V\Delta - \epsilon I)}$ is the semigroup generated by $D_V\Delta - \epsilon I$. For simplicity, we assume $V_0 = 0$ here and in the sequel. Now, the nonlinearity $F$ is given by

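\[
F_{\theta_1,\theta_2,\theta_3}(X)(t) = \theta_1 F_1(X_t) + \theta_2 F_2(X_t) + \theta_3 F_3(X)(t), \tag{6.17}
\]

where

\[
\theta_1 = k_1 u_0 \bar a, \quad \theta_2 = k_1, \quad \theta_3 = k_2 \epsilon b, \tag{6.18}
\]

and

\[
F_1(X_t) = -\frac{a(\|X_t\|_{L^2(D)})}{\bar a}\, X_t (u_0 - X_t), \tag{6.19}
\]

\[
F_2(X_t) = X_t^2 (u_0 - X_t), \tag{6.20}
\]

\[
F_3(X)(t) = -\int_0^t e^{(t-r)(D_V\Delta - \epsilon I)} X_r\,dr. \tag{6.21}
\]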

This matches the statistically linear case (6.4) with F∗= 0. Note that F3

acts on the trajectory of X, such that the dynamics of Xis not Markovian,

even if the joint dynamics of (U, V )is Markovian. Finally, θ0=DUis the

diffusivity of the activator.
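
As a quick sanity check of the decomposition (6.17)–(6.20), the following sketch compares the cubic reaction term from (6.14) with $\theta_1 F_1 + \theta_2 F_2$ at a single scalar value. It is only an illustration: the function names are hypothetical, `x` stands for a pointwise value of $X_t$, and `a` stands for the scalar $a(\|X_t\|_{L^2(D)})$, which is treated here as a given number.

```python
def reaction_direct(x, k1, u0, a):
    """Cubic reaction term k1 * x * (u0 - x) * (x - u0 * a) from (6.14)."""
    return k1 * x * (u0 - x) * (x - u0 * a)


def reaction_decomposed(x, k1, u0, a, a_bar):
    """Same term written as theta1 * F1 + theta2 * F2, cf. (6.17)-(6.20)."""
    theta1 = k1 * u0 * a_bar               # (6.18)
    theta2 = k1                            # (6.18)
    f1 = -(a / a_bar) * x * (u0 - x)       # (6.19), pointwise
    f2 = x ** 2 * (u0 - x)                 # (6.20), pointwise
    return theta1 * f1 + theta2 * f2


if __name__ == "__main__":
    # Illustrative values only; a_bar cancels, so any a_bar in (0, 1) works.
    for x in (-0.3, 0.0, 0.2, 0.7, 1.1):
        lhs = reaction_direct(x, k1=1.0, u0=1.0, a=0.17)
        rhs = reaction_decomposed(x, k1=1.0, u0=1.0, a=0.17, a_bar=0.15)
        print(x, lhs, rhs)
```

The auxiliary parameter $\bar a$ cancels in $\theta_1 F_1$, so the two expressions agree for every admissible choice of $\bar a$.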

6.2 Application to Cell Data

Next, we apply the theory of joint diffusivity and reaction parameter estima-

tion to simulated and real giant cell data. Our main assumption is that the

data is generated by the FitzHugh–Nagumo model from Section 6.1.3. While

this is certainly the case for the numerical simulation (up to a discretization

error), it is less clear for data coming from microscopy observation.

As a first approximation, when estimating parameters of the process, we

always assume that the function $a$ is constant, $a \equiv \bar a$, where the latter value is known or unknown. In this case, $F_1$ is replaced by

\[
\tilde F_1(X_t) = -X_t (u_0 - X_t). \tag{6.22}
\]

For simulated data, this models a misspecification of the true generating dy-

namics. However, this is not severe, as a(∥Xt∥L2(D))tends to oscillate around

its effective value. In this sense, if the real data is modeled accurately by

the FitzHugh–Nagumo model from Section 6.1.3, this additional assumption

will have little impact.


In order to compare the effect of different model assumptions on diffu-

sivity estimation, we construct a hierarchy of estimators, starting from the

purely linear case and taking into account an increasing number of features

from the FitzHugh–Nagumo model. The assumptions displayed here refer to

the description of the data used to perform parameter estimation, not the

generating process itself.

• $\hat\theta_0^{\mathrm{lin},N}$ is the estimator for $\theta_0$ coming from the assumption of a purely linear model, i.e. a stochastic heat equation. In this case, $F = 0$, and there are no other drift parameters to be estimated.

• $\hat\theta_0^{\mathrm{pol},N}$ is the diffusivity estimator based on a stochastic Schlögl (or Nagumo) model [Sch72], i.e.

\[
F(X) = k_1 X (u_0 - X)(X - \bar a u_0) = \theta_1 \tilde F_1(X) + \theta_2 F_2(X),
\]

where both reaction parameters $\theta_1$ and $\theta_2$ are assumed to be known. This model is capable of generating spatially extending phase transitions for the concentration of $X$, and it arises formally from taking $\epsilon \to 0$ in the stochastic FitzHugh–Nagumo model.

• $\hat\theta_0^{\mathrm{full},N}$ is the diffusivity estimator under the assumption of a full FitzHugh–Nagumo model, i.e.

\[
F(X) = k_1 X (u_0 - X)(X - \bar a u_0) - k_2 \epsilon b \int_0^{\cdot} e^{(\cdot - r)(D_V\Delta - \epsilon I)} X_r\,dr = \theta_1 \tilde F_1(X) + \theta_2 F_2(X) + \theta_3 F_3(X).
\]

This model can generate traveling waves as observed in the cell data. Again, all reaction parameters are assumed to be known.

While $\hat\theta_0^{\mathrm{full},N}$ incorporates the full model, the assumption of known reaction parameters will be too strong. We further relax this assumption by estimating an increasing number of reaction parameters simultaneously:

• $\hat\theta_0^{2,N}$ is the diffusivity estimator based on the full FitzHugh–Nagumo model as $\hat\theta_0^{\mathrm{full},N}$, but with unknown $\theta_1$.

• $\hat\theta_0^{3,N}$ additionally treats $\theta_2$ as unknown.

• $\hat\theta_0^{4,N}$ treats all reaction parameters $\theta_1, \theta_2, \theta_3$ as unknown.

The superscript denotes the number of unknown parameters, including the diffusivity $\theta_0$. Thus, $\hat\theta_0^{4,N}$ is an estimator for $\theta_0$ which uses qualitative, but little quantitative knowledge on the generating process. While all estimators are consistent with optimal rate as $N \to \infty$ by Theorem 6.4, their performance in the non-asymptotic setting may vary strongly.

For all estimators we have described here, we set α= 0. This is reason-

able if the driving noise of the activator component is close to being white

noise.

The linear estimator $\hat\theta_0^{\mathrm{lin},N}$ is the same as $\hat\theta_N^{\mathrm{lin}}$ from Chapter 2, and it is given by (2.26). In contrast to the other estimators considered here, it is scale invariant in the sense that for any $c > 0$, the substitution $X \mapsto cX$ leaves the resulting estimator $\hat\theta_0^{\mathrm{lin},N}$ invariant. It is clear that this invariance does not hold if nonlinearities are taken into account. While the remaining estimators use detailed information on the nonlinear model, their performance depends on a careful calibration of the intensity of the input data in order to match the fixed points of the third order polynomial in the reaction term. In fact, this is a source of additional uncertainty. While the advantages of a good reaction model clearly outweigh the benefit from scale invariance for simulated data, as we will see in Section 6.2.1, this is less clear for real data, where the reaction model may not fully capture the underlying dynamics, see Section 6.2.2.

In this section, we work on two-dimensional rectangular domains of the form $D = [0, L_1] \times [0, L_2]$ with $L_1, L_2 > 0$. In particular, the eigenfunctions in $H$ of $-\Delta$ with periodic boundary conditions are given by $\Phi_{k,\ell}(x_1, x_2) = \varphi_k(x_1/L_1)\,\varphi_\ell(x_2/L_2)$ for $(k, \ell) \in \mathbb{Z}^2$, where $\varphi_k(x) = \cos(2\pi k x)$ for $k \le 0$ and $\varphi_k(x) = \sin(2\pi k x)$ for $k > 0$. The eigenvalues are given by $\lambda_{k,\ell} = 4\pi^2 (k^2/L_1^2 + \ell^2/L_2^2)$. As before, we choose a reordering $r : \mathbb{N} \to \mathbb{Z}^2 \setminus \{0\}$ such that $\lambda_N = \lambda_{r(N)}$ is increasing, with corresponding eigenfunction $\Phi_N = \Phi_{r(N)}$, where we exclude the case $\lambda_{0,0} = 0$.
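
The reordering $r$ can be made concrete with a short sketch. The function below is a hypothetical helper, not taken from the evaluation code behind this chapter: it enumerates the index pairs $(k, \ell) \neq (0, 0)$ in a finite box, computes $\lambda_{k,\ell}$ and sorts by eigenvalue. Ties between equal eigenvalues are broken by the index pair, which is an arbitrary choice made only for reproducibility.

```python
import math


def ordered_modes(L1, L2, n_modes):
    """Return the first n_modes entries ((k, l), eigenvalue), sorted by eigenvalue.

    The eigenvalues are 4 * pi^2 * (k^2 / L1^2 + l^2 / L2^2); the zero mode
    (0, 0) is excluded. Ties are broken by the index pair (an assumption).
    """
    # A box of this half-width contains at least n_modes modes whose
    # eigenvalues are smaller than those of every mode outside the box.
    k_max = (n_modes + 1) // 2 + 1
    modes = []
    for k in range(-k_max, k_max + 1):
        for l in range(-k_max, k_max + 1):
            if (k, l) == (0, 0):
                continue
            lam = 4.0 * math.pi ** 2 * (k ** 2 / L1 ** 2 + l ** 2 / L2 ** 2)
            modes.append(((k, l), lam))
    modes.sort(key=lambda item: (item[1], item[0]))
    return modes[:n_modes]


if __name__ == "__main__":
    # Illustrative call on a square with side length 75.
    for index, lam in ordered_modes(75.0, 75.0, 8):
        print(index, lam)
```

Since many eigenvalues coincide on a square domain, the reordering is not unique; any tie-breaking rule yields an admissible $r$.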

6.2.1 Evaluation of Simulated Data

We simulate the system (6.14), (6.15) on a two-dimensional square with

side length L= 75 and periodic boundary conditions, starting from zero


[Figure 6.1, four panels: estimated diffusivity (m²/s, axis scaled by 1e−13) against $N$ from 0 to 800. Top left: $\hat\theta_0^{\mathrm{lin},N}$, $\hat\theta_0^{\mathrm{pol},N}$, $\hat\theta_0^{\mathrm{full},N}$. Top right: $\hat\theta_0^{\mathrm{full},N}$ for $\bar a = 0.1$, $0.2$, $0.3$. Bottom left: $\hat\theta_0^{2,N}$, $\hat\theta_0^{3,N}$, $\hat\theta_0^{4,N}$. Bottom right: $\hat\theta_0^{2,N}$ on the full domain, on a section, and on the periodified section.]

Figure 6.1: Performance of different diffusivity estimators on a numerical simulation as $N$ gets large. Solid black line is plotted at the true value of $\theta_0 = 1 \times 10^{-13}$ m²/s, dashed black line is plotted at zero. We restrict to $N \ge 25$ in order to avoid artifacts.

initial conditions and with θ0=DU= 0.1. We use an explicit finite

difference scheme with spatial and temporal increment ∆x= 0.375 and

∆t= 0.01, respectively. In order to mitigate the impact of the initial condi-

tions, we observe every 100th frame of the simulation in the shifted interval

[T0, T1), where T0= 500 and T1= 700. The remaining drift parameters are

DV= 0.02, k1=k2= 1, u0= 1, ϵ = 0.02, b = 0.2. In the noise, we set γ= 0

and σ= 0.1. The unstable zero of the reaction potential is determined by

a(z) = 0.5−b+0.5(z/(0.33u0L2)−1). In order to compare the simulation to

real data, the unit length and unit time in these specifications are interpreted as 1 µm and 1 s, respectively.
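
Some quantities implied by these specifications can be checked with a few lines of arithmetic. The sketch below is an illustration under a stated assumption: the text only says that an explicit finite difference scheme is used, so the stability bound $D\,\Delta t/\Delta x^2 \le 1/4$ quoted in the comments is the classical condition for a forward Euler step with the five-point Laplacian applied to the pure diffusion part, and it may not be the exact criterion relevant for the scheme actually used. All variable names are hypothetical.

```python
# Simulation parameters as stated in the text (units: micrometers, seconds).
L = 75.0             # side length of the square domain
dx = 0.375           # spatial increment
dt = 0.01            # temporal increment
D_U = 0.1            # activator diffusivity theta_0
D_V = 0.02           # inhibitor diffusivity
frame_stride = 100   # every 100th simulation step is observed
T0, T1 = 500.0, 700.0


def diffusion_number(D, dt, dx):
    """Dimensionless ratio D * dt / dx^2 of an explicit diffusion step."""
    return D * dt / dx ** 2


pixels_per_side = round(L / dx)              # grid points per side
frame_spacing = frame_stride * dt            # time between observed frames
n_frames = round((T1 - T0) / frame_spacing)  # frames in [T0, T1)

if __name__ == "__main__":
    print("grid:", pixels_per_side, "x", pixels_per_side)
    print("time between observed frames:", frame_spacing)
    print("number of observed frames:", n_frames)
    # Assumed criterion: forward Euler with five-point Laplacian needs <= 1/4.
    print("diffusion number (U):", diffusion_number(D_U, dt, dx))
    print("diffusion number (V):", diffusion_number(D_V, dt, dx))
    # theta_0 = 0.1 um^2/s converted to m^2/s.
    print("theta_0 in m^2/s:", D_U * 1e-12)
```

In particular, the grid has $200 \times 200$ points, and $\theta_0 = 0.1$ µm²/s corresponds to $1 \times 10^{-13}$ m²/s.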

In Figure 6.1 (top left), the performance of $\hat\theta_0^{\mathrm{lin},N}$, $\hat\theta_0^{\mathrm{pol},N}$ and $\hat\theta_0^{\mathrm{full},N}$ with $\bar a = 0.1$ is compared. The linear estimator severely underestimates the true diffusivity. This can be explained as follows: The data exhibits steep concentration gradients at the wave fronts, which are interpreted by $\hat\theta_0^{\mathrm{lin},N}$ as coming from low diffusive forcing. In contrast, the estimator $\hat\theta_0^{\mathrm{pol},N}$, which takes into account the bistable polynomial from the reaction term, heavily overestimates the true value of $\theta_0$. As an explanation, while this estimator is able to account for the phase transition at the wave front, the concentration decay due to the inhibitor in the data is interpreted as fast diffusion. Finally, $\hat\theta_0^{\mathrm{full},N}$ performs best of these three estimators, incorporating knowledge on the full reaction model. Note, however, that in the simulation, $a$ is not constant, such that even $\hat\theta_0^{\mathrm{full},N}$ does not have perfect information on the dynamics of $X$. Rather, $a(\|X_t\|_{L^2(D)})$ oscillates around a value slightly larger than 0.15 in the simulation. Figure 6.1 (top right) shows the sensitivity of $\hat\theta_0^{\mathrm{full},N}$ to $\bar a$. Different a priori assumptions on $\bar a$ have a large impact on the value of the estimator, even for large $N$. In contrast, Figure 6.1 (bottom left) shows the performance of the estimators $\hat\theta_0^{2,N}$, $\hat\theta_0^{3,N}$ and $\hat\theta_0^{4,N}$, which treat the reaction parameters as unknown. All of them determine the true value of $\theta_0$ rapidly, even if $a$ is misspecified in their description of the dynamics.

Apart from the form of the nonlinearity $F$, the behavior of $X$ at the boundary impacts the performance of the estimators. In Figure 6.1 (bottom right), we evaluate $\hat\theta_0^{2,N}$ on the original domain $D$ of $200 \times 200$ pixels, on the restriction of $X$ to a subdomain of $75 \times 75$ pixels and its periodification, as described below, on a square of $150 \times 150$ pixels. When evaluated on a subdomain instead of $D$, the estimate deteriorates. A possible explanation is given as follows: The assumption of periodic boundary conditions on the subdomain leads to discontinuities of $X$ at its boundary. As before, these discontinuities can be interpreted as steep gradients, which the diffusivity estimator translates into low diffusivity present in the data.

As a remedy, we use a hands-on approach and periodify the data in the

sense that we take four copies of the data, mirror them on both coordinate

axes and glue them together in such a way that the resulting field is contin-

uous and fills a square with double side length compared to D, with periodic

boundary conditions. This way, we avoid the discontinuities, but the result-

ing field still does not satisfy the dynamics of Xat the boundaries of the

original domain D. Further, due to the introduced redundancies and change

in spatial extension, the estimators based on the original and periodified data

should be compared for different values of N.
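
The gluing step described above can be sketched on a single frame as follows. This is a minimal illustration with a frame stored as a list of rows; the function name is hypothetical, and the real evaluation may arrange the four mirrored copies differently.

```python
def periodify(frame):
    """Mirror a 2D frame on both axes and glue the four copies together.

    `frame` is a list of rows of equal length. The result has twice as many
    rows and columns, its values agree across the inner seams, and its last
    row and column agree with the first ones, so that it extends
    periodically without jumps.
    """
    # Upper half: the frame next to its left-right mirror image.
    top = [list(row) + list(row)[::-1] for row in frame]
    # Lower half: the upper half mirrored top-to-bottom.
    bottom = [row[:] for row in reversed(top)]
    return top + bottom


if __name__ == "__main__":
    small = [[1, 2, 3],
             [4, 5, 6]]
    for row in periodify(small):
        print(row)
```

For a frame of $75 \times 75$ pixels this produces a field of $150 \times 150$ pixels, in line with the sizes quoted for Figure 6.1 (bottom right).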


While periodification seems to be a natural ad-hoc approach to deal with

the difficulties arising at the boundary, its performance will depend on the

specific situation, and it should be used with care. In order to understand

its performance better, a systematic study is needed.

[Figure 6.2, two panels: estimated diffusivity (m²/s, axis scaled by 1e−13) against $N$ from 0 to 800, for $\sigma = 0.05$, $\sigma = 0.1$ and $\sigma = 0.2$. Left: $\hat\theta_0^{\mathrm{lin},N}$. Right: $\hat\theta_0^{2,N}$.]

Figure 6.2: Comparison of (left) $\hat\theta_0^{\mathrm{lin},N}$ and (right) $\hat\theta_0^{2,N}$ for different noise intensity levels. We restrict to $N \ge 25$ in both panels. As before, the solid black line is plotted at the true value $\theta_0 = 1 \times 10^{-13}$ m²/s, and the dashed black line is plotted at zero.

In Figure 6.2, the effect of changing the noise intensity $\sigma$ on diffusivity estimation is shown. We simulate additional trajectories with $\sigma = 0.05$ and $\sigma = 0.2$, on which $\hat\theta_0^{\mathrm{lin},N}$ and $\hat\theta_0^{2,N}$ are evaluated. The former estimator is agnostic to the reaction model, whereas the latter includes the full FitzHugh–Nagumo model. While $\hat\theta_0^{2,N}$ performs well regardless of the noise intensity, the behavior of $\hat\theta_0^{\mathrm{lin},N}$ is heavily influenced by the noise level, with large noise intensity leading to better results. In this sense, a large noise intensity hides the effect of the nonlinear term. This is in accordance with the Monte–Carlo simulation for the stochastic Allen–Cahn equation in Section 2.5. We note that for $\sigma = 0.2$, the simulation is no longer capable of generating traveling waves due to large fluctuations in the driving noise.


[Figure 6.3, four panels: estimated diffusivity (m²/s, axis scaled by 1e−13) against $N$ from 0 to 800. Top row: $\hat\theta_0^{\mathrm{lin},N}$ with no filter, with $\sigma_f = 1 \times 10^{-7}$ m and with $\sigma_f = 2 \times 10^{-7}$ m. Bottom row: $\hat\theta_0^{\mathrm{lin},N}$, $\hat\theta_0^{2,N}$, $\hat\theta_0^{3,N}$, $\hat\theta_0^{4,N}$.]

Figure 6.3: (top row) Performance of $\hat\theta_0^{\mathrm{lin},N}$ on spatially smoothed observations. (bottom row) Performance of different diffusivity estimators on the data, without spatial smoothing. The data is considered (left column) without and (right column) with periodification. Dashed line is plotted at zero. As before, we restrict to $N \ge 25$ in all panels.

6.2.2 Evaluation of Real Data

We first describe the data we are working with. See [PFA+21, Appendix B] for a description of the experimental setup.² Each observation consists of a sequence of rectangular frames of varying length, resolution and aspect ratio, describing the observed intensity of an actin marker within a D. discoideum giant cell. The regions considered lie completely inside the cell, i.e. no cell boundaries appear in the data. The concentration of the actin marker is given by grey values ranging from 0 to 255 at each pixel. When evaluating the data, this intensity is standardized to the interval $[0, 1]$, such that absence of the activator and the maximal activator intensity match the stable fixed points $0$ and $u_0 = 1$, respectively.³ As the data exhibits traveling waves, it is assumed that the actin concentration (or actin marker concentration) can be described by (6.14), (6.15).

² The giant cell data used in this chapter has been provided by Sven Flemming.

In this section, we estimate the diffusivity on a single giant cell observa-

tion. The analysis of a population of cells is postponed to Section 6.2.3.

As a first consistency check, we consider the behavior of the data set under convolution. For $k \in L^1(D)$, let $T_k : L^2(D) \to L^2(D)$ be the convolution operator given by $(T_k f)(y) = \int_D k(y - x) f(x)\,dx$, where $k$ and $f$ are identified with their periodic continuation. It is well-known that $\Delta \circ T_k = T_k \circ \Delta$. In particular, if $X$ is generated by a (stochastic, perturbed) heat equation with (homogeneous and isotropic) diffusivity $\theta_0$, the same is true for $T_k X$, although the structure of the nonlinear term and the noise changes. Thus, if the describing model is reasonable, it is to be expected that the estimated diffusivity of $T_k X$ is close to that of $X$. We use a family of kernels $k = k(\sigma_f)$ for $\sigma_f > 0$, which are constructed as a Gaussian density with standard deviation $\sigma_f$, truncated to the rectangle $[-L_1/2, L_1/2] \times [-L_2/2, L_2/2]$ and normed in $L^1(D)$. We use the standard deviations $\sigma_f = 1 \times 10^{-7}$ m and $\sigma_f = 2 \times 10^{-7}$ m.

The performance of $\hat\theta_0^{\mathrm{lin},N}$ on data smoothed with $k(\sigma_f)$ is shown in Figure 6.3 (top left) without periodification, and in Figure 6.3 (top right) for the periodified data. While not in perfect alignment, the estimator graphs are very close. For the data without periodification, the decrease of $\hat\theta_0^{\mathrm{lin},N}$ in $N$ is slightly more pronounced. For the periodified data, which cannot be

expected to satisfy (6.14), (6.15) on the boundaries of the four sub-patches

of its enlarged domain, the graphs of the estimators are nonetheless almost

indistinguishable. In this sense, periodification seems to retain convolution

invariance. In total, these results support the hypothesis that the data is

generated by a stochastic partial differential equation with diffusive forcing

stemming from a second order differential operator.

Now we proceed to the nonlinear reaction model. Based on the performance of the estimators from Section 6.2.1, we compare $\hat\theta_0^{\mathrm{lin},N}$ with $\hat\theta_0^{2,N}$, $\hat\theta_0^{3,N}$ and $\hat\theta_0^{4,N}$, which incorporate knowledge on the full FitzHugh–Nagumo model. The performance on cell data is shown in Figure 6.3 (bottom left), and the performance on periodified data is shown in Figure 6.3 (bottom right). Interestingly, the model-free estimator $\hat\theta_0^{\mathrm{lin},N}$ behaves similarly to $\hat\theta_0^{3,N}$ and $\hat\theta_0^{4,N}$, which are the most flexible estimators we consider and which do not fix the reaction rate corresponding to the bistable potential in the drift. This pattern also appears in different sample cells. In terms of diffusivity estimation, the detailed reaction model does not seem to yield additional benefit, in contrast to the case of simulated data from Section 6.2.1. This can be seen as a hint that the FitzHugh–Nagumo model, while being capable of generating traveling waves, misses additional features of the true intracellular dynamics. For example, it may be helpful to consider models which mimic the biophysical reaction pathway more closely. On the other hand, $\hat\theta_0^{2,N}$, which fixes the parameters describing the reaction intensities in advance, deviates from the other estimators, but finally approaches them. Further evaluations suggest that changing $u_0$ does not alter the general picture.

³ Such calibration is necessary for all estimators except $\hat\theta_0^{\mathrm{lin},N}$.

The performance of the estimators on the periodified sample is similar to the case of the original sample. In accordance with the discussion from

the previous section, the estimated diffusivity increases in that case.

6.2.3 Evaluation of a Cell Population

We consider a population of 36 giant cell observations, as described in the

previous section. The spatial extension of each data set is clipped in such

a way that only the interior dynamics is captured, i.e. no cell boundaries

appear in the data. As a consequence, the spatial resolution varies within

the cell population. It is natural to assume that the range of possible Nthat

yields meaningful results grows with the resolution of the sample. In general,

while the estimate will be more precise for large N, discretization effects

depending on the spatial resolution will render arbitrarily large Nuseless. In

order to find a reasonable tradeoff, we apply the following heuristics: If each frame within a data set consists of $r_x \times r_y$ pixels, we set $N_{\mathrm{stop}} = \lfloor 4 r_x r_y / R^2 \rfloor$, where $R$ is a parameter representing the number of pixels needed in order to extract meaningful information on $[0, 2\pi]$ by testing with a sine function. For example, if $r_x = r_y = R$, then $N_{\mathrm{stop}} = 4$, and only the first four eigenfunctions $\Phi_{\pm 1, \pm 1}$ are taken into account, whose period is $R$ pixels in both dimensions. We choose $R = 12$ for the cell population and $R = 24$ for the periodified population. Further, we set $N_{\mathrm{const}} = 899$ and evaluate the estimator $\hat\theta_0^{3,N}$ at $N = N_{\mathrm{const}}$ and $N = N_{\mathrm{stop}}$. Results are shown in Figure 6.4. Note that


[Figure 6.4, four panels: estimated diffusivity (m²/s, axis scaled by 1e−13, range 0 to 1) against $N_{\mathrm{stop}}$ (range 200 to 800). Top row: $N_{\mathrm{stop}} = r_x r_y / 36$, with $p = 0.06$ and $p = 0.75$ in reading order. Bottom row: $N_{\mathrm{stop}} = r_x r_y / 144$, with $p = 0.01$ and $p = 0.27$ in reading order.]

Figure 6.4: The estimator $\hat\theta_0^{3,N}$ evaluated at $N = N_{\mathrm{const}}$ (left column) and $N = N_{\mathrm{stop}}$ (right column) is plotted against $N_{\mathrm{stop}}$ for a sample of 36 observations (top row) and their periodification (bottom row). The least square regression lines are plotted in red. The p-value in each display comes from a two-sided t-test with null hypothesis that the slope is zero.

$N_{\mathrm{stop}}$ encodes the resolution of the frames within a sample.⁴ We see that choosing $N$ based on the spatial resolution decorrelates the estimate for $\theta_0$

from the resolution of the frames. Further, the estimated diffusivities for all

samples considered have a similar magnitude. This indicates that the concept

of effective diffusivity can be useful for statistical analysis on cell samples of

the same or possibly different populations.
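
The stopping heuristic is a one-line computation. The sketch below uses a hypothetical function name and integer arithmetic for the floor; the frame sizes in the example call are illustrative and not taken from the data.

```python
def n_stop(rx, ry, R):
    """Stopping index floor(4 * rx * ry / R^2) for frames of rx x ry pixels."""
    return (4 * rx * ry) // (R * R)


if __name__ == "__main__":
    # The case rx = ry = R from the text.
    print(n_stop(12, 12, 12))
    # Hypothetical frame sizes, with R = 12 (original) and R = 24 (periodified).
    print(n_stop(60, 80, 12), n_stop(120, 160, 24))
```

Periodification doubles both side lengths, so using $R = 24$ instead of $R = 12$ leaves $N_{\mathrm{stop}}$ unchanged for a given sample, which is why the horizontal axes in both rows of Figure 6.4 cover the same range.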

In addition to the inhomogeneous spatial resolution, the number of frames (i.e. the temporal resolution) varies within the population. However, the estimate tends to stabilize in time, such that this does not pose a problem. We further note that the population is not homogeneous with respect to the side length $\Delta x$ of a pixel and the temporal distance $\Delta t$ between two frames. Further tests indicate that the estimated diffusivity correlates with the characteristic diffusivity $\Delta x^2 / \Delta t$. However, a detailed analysis of the resulting effects, including the impact of discretization and possible scale dependence of the diffusivity, is beyond the scope of the present work.

⁴ In fact, $N_{\mathrm{stop}}$ grows like the number of pixels in each frame. If each pixel is interpreted as a point evaluation of the underlying process (rather than a local average), this is in accordance with Example 4.8 and (4.50).

6.2.4 The Effective Unstable Zero

[Figure 6.5, two panels: estimated unstable fixed point (range 0 to 1) against $N$ from 0 to 800. Left: $\hat\theta_1^{2,N}$ for $T = 68.0$, $134.0$, $199.0$. Right: $\hat\theta_1^{2,N}$ for $T = 260.02$, $518.04$, $775.09$.]

Figure 6.5: Estimated unstable fixed point for simulated data (left) and real data (right). Frames up to time $T$ (in seconds) are used to calculate $\hat\theta_1^{2,N}$, starting from the first frame in the sample. As before, we restrict to $N \ge 25$.

When using $\hat\theta_0^{2,N}$ in order to estimate the diffusivity $\theta_0$, we simultaneously obtain an estimate $\hat\theta_1^{2,N}$ for $\theta_1$ by solving (6.7). As $\theta_1 = k_1 u_0 \bar a$ and $k_1 = u_0 = 1$ by assumption, we can identify $\theta_1$ with the effective unstable zero $\bar a$ from the reaction term. In Figure 6.5, the performance of this estimate is displayed for simulated data and an experimentally observed sample.

The term $a(\|X_t\|_{L^2(D)})$ oscillates around an effective value slightly larger than 0.15 in the numerically simulated trajectory. Even if this value is approximated, we see that the quality of the estimate does not improve with increasing $N$. Indeed, this cannot be expected, as the reaction term is of order zero: It is known [HR95] that the maximum likelihood estimate of the coefficient of a linear order zero perturbation to a stochastic heat equation converges only with logarithmic rate in dimension $d = 2$. On the other hand, also the long-time behavior can be considered, including an increasing number of frames into the evaluation. The left-hand panel in Figure 6.5 shows that the effective value is approached with larger $T$.⁵

In the case of real data, the results fall into the interval (0,1) and are

rather stable. This indicates that the “effective unstable fixed point under

the reaction model F”, defined as the value at which ˆ

θ2,N

1stabilizes, can be

used in a meaningful way for statistically evaluating spatiotemporal cell data

exhibiting traveling waves.

6.2.5 The Effective Diffusivity Outside the Cell

[Figure 6.6, two panels against $N$ from 0 to 800. Left: $\hat\theta_0^{\mathrm{lin},N}$, diffusivity (m²/s, axis scaled by 1e−13, range −0.5 to 3.0). Right: energy (axis scaled by 1e17), $A_N(X)_{0,0}$ inside the cell and $A_N(X)_{0,0}$ outside the cell.]

Figure 6.6: (left) Performance of $\hat\theta_0^{\mathrm{lin},N}$ on a data set consisting of pure noise outside the cell. (right) Comparison of the energy of a measurement inside and outside the cell, with the same spatiotemporal extension. In both panels, the dashed line is plotted at zero. As before, we restrict to $N \ge 25$.

When formally applying the estimation procedure to a data set consisting of pure noise, i.e. a region of a microscopy data set where no part of the cell is present, we obtain a result that can be named “effective diffusivity outside the cell” or “effective diffusivity of the noise”. Here, we restrict ourselves to the use of $\hat\theta_0^{\mathrm{lin},N}$, i.e. $F = 0$. The result is shown in the left panel of Figure 6.6. Comparing with Figure 6.3, the effective diffusivity of a pure noise observation can even exceed the value obtained from a region inside a cell.⁶ It is important to understand the order of magnitude of this effective value, as well as its impact on diffusivity estimation within the cell.

⁵ Typically, under ergodicity assumptions, consistency with convergence rate $T^{-1/2}$ can be expected for estimators of maximum likelihood type if $T$ increases, see e.g. the monograph [Kut04] for SDEs, [Log84, KL85] for linear SPDEs. In [DMPD00, GM02], large time consistency is proven in the context of semilinear SPDEs.

We derive the magnitude of the effective diffusivity outside the cell heuristically: A pixel can be described by a weighted indicator function of a square within $D$, where the weight describes the intensity. In the case of pure noise, the instantaneous disappearance of such a pixel in the next frame can be interpreted as fast diffusion within the time between two frames. In order to understand the magnitude of $\theta_0$ needed for that effect, we approximate the indicator function of the pixel by a Gaussian density. For $t > 0$, let $\phi_t : \mathbb{R}^2 \to \mathbb{R}$ be the centered Gaussian density in two dimensions with covariance matrix $tI$. This density attains its maximum at $x = 0$, with $\phi_t(0) = 1/(2\pi t)$. Let $\Delta t$ be the time between two frames, let $\Delta x$ be the side length of a pixel within each frame. We set $\sigma_0 = \Delta x / 2$. In this case, the distance between the inflection points of the one-dimensional marginals of $\phi_{\sigma_0^2}$ matches the side length of a pixel, and we take $\phi_{\sigma_0^2}$ as an approximation for the pixel. After time $\Delta t$, the heat semigroup on $\mathbb{R}^2$ with diffusivity $\theta_0$ maps $\phi_{\sigma_0^2}$ to $\phi_{\sigma_0^2 + 2\theta_0 \Delta t}$. Now let the decay of the maximal value of the density, $\phi_{\sigma_0^2}(0) / \phi_{\sigma_0^2 + 2\theta_0 \Delta t}(0)$, be at least as large as some threshold $b > 0$, i.e. $(\sigma_0^2 + 2\theta_0 \Delta t)/\sigma_0^2 \ge b$. This is equivalent to

\[
\theta_0 \ge \frac{b - 1}{8}\,\frac{\Delta x^2}{\Delta t}. \tag{6.23}
\]

In the data sample from Figure 6.6 (left), we have $\Delta x = 2.08 \times 10^{-7}$ m and $\Delta t = 0.97$ s. The decay factor $b$ depends on the particular noise pixel and its intensity within the data set. Reasonable values are given for $b \le 30$. For example, if $b = 30$, then $\theta_0 \ge 1.6 \times 10^{-13}$ m²/s, if $b = 20$, then $\theta_0 \ge 1 \times 10^{-13}$ m²/s, and if $b = 15$, then $\theta_0 \ge 7.8 \times 10^{-14}$ m²/s. This matches the order of the observed diffusivity from Figure 6.6 (left): For example, if we apply the stopping rule from Section 6.2.3 to this case, i.e. $N_{\mathrm{stop}} = \lfloor 4 r_x r_y / R^2 \rfloor$ with $R = 12$, then we obtain $N_{\mathrm{stop}} = 165$ and $\hat\theta_0^{\mathrm{lin},N} = 1.36 \times 10^{-13}$ m²/s for $N = N_{\mathrm{stop}}$, in accordance with the heuristic derivation in this section.
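
The numbers quoted above can be reproduced from the condition $(\sigma_0^2 + 2\theta_0 \Delta t)/\sigma_0^2 \ge b$ with $\sigma_0 = \Delta x / 2$. The following sketch solves this condition for $\theta_0$; the function name is hypothetical, and the inputs are the values of $\Delta x$ and $\Delta t$ stated for the sample from Figure 6.6 (left).

```python
def diffusivity_threshold(b, dx, dt):
    """Smallest theta_0 with (sigma0^2 + 2 * theta_0 * dt) / sigma0^2 >= b.

    Here sigma0 = dx / 2, so the result equals (b - 1) * dx^2 / (8 * dt).
    """
    sigma0_sq = (dx / 2.0) ** 2
    return (b - 1.0) * sigma0_sq / (2.0 * dt)


if __name__ == "__main__":
    dx = 2.08e-7   # pixel side length in meters
    dt = 0.97      # time between two frames in seconds
    for b in (30, 20, 15):
        print(b, diffusivity_threshold(b, dx, dt))
```

Up to rounding, this gives approximately $1.6 \times 10^{-13}$, $1.1 \times 10^{-13}$ and $7.8 \times 10^{-14}$ m²/s for $b = 30$, $20$ and $15$, consistent with the values stated above.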

⁶ This observation also applies to the cell sample used in the right panel of Figure 6.6.


We have seen that the observed diffusivity of the noise can be larger than the estimated diffusivity of the signal within the cell. Nonetheless, the noise described here does not interfere with the diffusivity estimate of the signal. This can be explained as follows: Assume that the signal process $X^{\mathrm{sig}}$ is perturbed by measurement noise $W^{\mathrm{meas}}$. This means that inside the cell, we observe $X = X^{\mathrm{sig}} + W^{\mathrm{meas}}$ instead of $X = X^{\mathrm{sig}}$ as supposed previously, while outside the cell, only $X = W^{\mathrm{meas}}$ is observed. It is revelatory to consider the energy $A_N(X)_{0,0}$ for both cases separately. This is done in the right panel of Figure 6.6. The value of $A_N(W^{\mathrm{meas}})_{0,0}$ outside the cell is orders of magnitude below the energy within the cell, at least at the resolution level we consider. Consequently, $A_N(X^{\mathrm{sig}} + W^{\mathrm{meas}})_{0,0}$ is indistinguishable from $A_N(X^{\mathrm{sig}})_{0,0}$. Thus, it does not make a difference if $\hat\theta_0^{\mathrm{lin},N}$ is evaluated on $X^{\mathrm{sig}} + W^{\mathrm{meas}}$ or on $X^{\mathrm{sig}}$ itself, and the measurement noise has very little impact on the estimated diffusivity of the signal.

However, from a mathematical perspective, adding noise to $X^{\mathrm{sig}}$ has an impact on its regularity, such that the theoretical properties of $\hat\theta_0^{\mathrm{lin},N}$ for $N \to \infty$ will change, depending on the precise model assumptions.


Chapter 7

Further Research

As exposed in the introduction, statistical inference for SPDEs is a source

for diverse mathematical research. The field is continuously expanding, and

it keeps incorporating new models and methods. In this work, we consid-

ered parameter estimation for semilinear equations in different asymptotic

regimes, together with possible model misspecification. To conclude, we give

a list of further interesting mathematical questions related to the topic of

this work. This list is by no means exhaustive, and it should be considered

a suggestion for possible further research.

•Beyond semilinear models, one can consider quasilinear equations, e.g.

with state-dependent diffusivity.

•Further types of model misspecification can be studied: For example,

this includes the effect of an inhomogeneous or anisotropic diffusivity

on the estimators from Chapter 2.

•The impact of measurement noise can be analyzed systematically.

•Apart from the spatially discretized Laplacian used in Chapter 4, which

is based on a Fourier decomposition, it is interesting to consider the

classical discretization on a three-point stencil or five-point stencil (in

dimension one or two), and to study the asymptotics as h→0.

•In the context of Chapter 4, it remains open if the rates from Theorem

4.7 can be achieved for domains with arbitrary geometry.


Appendix A

Limit Theorems

The following martingale central limit theorem is a special case of [LS89,

Theorem 5.5.4 (I)], [JS03, Theorem VIII.2.4].

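Theorem A.1. Let $(M^N)_{N \in \mathbb{N}}$ be a sequence of real-valued continuous local martingales with $M^N_0 = 0$, let $T > 0$ such that $\langle M^N \rangle_T \xrightarrow{P} 1$ for $N \to \infty$. Then $M^N_T \xrightarrow{d} \mathcal{N}(0, 1)$ as $N \to \infty$.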

We will repeatedly use the following version of the law of large numbers,

which exploits Gaussianity:

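Lemma A.2. Let $(X_k^*)_{k \in \mathbb{N}}$ be independent centered Gaussian processes on $[0, T]$, set $Y_k^* := \int_0^T X_k^*(t)^2\,dt$ and $Z_N^* = \sum_{k=1}^N Y_k^*$.

(i) If $\mathrm{Var}(Y_k^*) \ll (E Y_k^*)^2$ as $k \to \infty$, then $Y_k^* / E Y_k^* \xrightarrow{P} 1$.

(ii) If $E Y_k^* \asymp C k^\alpha$ as $k \to \infty$ for some constants $C > 0$, $\alpha \in \mathbb{R}$, then $Z_N^* / E Z_N^* \xrightarrow{\text{a.s.}} 1$ as $N \to \infty$.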

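Proof. The first statement is a direct consequence of the Markov inequality:

\[
P\left( \left| \frac{Y_k^*}{E Y_k^*} - 1 \right| > \epsilon \right) \le \frac{\mathrm{Var}(Y_k^*)}{\epsilon^2 (E Y_k^*)^2} \to 0
\]

for every $\epsilon > 0$. Now we prove (ii). As the $(X_k^*)$ are Gaussian with mean zero, the Wick theorem [Jan97, Theorem 1.28] gives

\begin{align*}
\mathrm{Var}(Y_k^*) &= \int_0^T \int_0^T \Bigl( E[X_k^*(t) X_k^*(t) X_k^*(s) X_k^*(s)] - E[X_k^*(t) X_k^*(t)]\, E[X_k^*(s) X_k^*(s)] \Bigr)\,ds\,dt \tag{A.1} \\
&= 2 \int_0^T \int_0^T E[X_k^*(t) X_k^*(s)]^2\,ds\,dt \le 2 \left( E \int_0^T X_k^*(t)^2\,dt \right)^2 = 2 (E Y_k^*)^2.
\end{align*}

W.l.o.g. assume $E Y_1^* > 0$, such that the denominator in the following estimates is positive. (Otherwise, if $E Y_1^* = 0$, then (A.1) implies $Y_1^* = 0$ almost surely, and $Y_1^*$ does not contribute to $Z_N^*$.) We have for $\alpha > -1$:

\[
\frac{\mathrm{Var}(Y_k^*)}{(E Z_k^*)^2} \le \frac{2 (E Y_k^*)^2}{\left( \sum_{i=1}^k E Y_i^* \right)^2} \asymp \frac{2 C^2 k^{2\alpha}}{\left( C (\alpha + 1)^{-1} k^{\alpha + 1} \right)^2} \lesssim \frac{1}{k^2}.
\]

Similarly, for $\alpha = -1$,

\[
\frac{\mathrm{Var}(Y_k^*)}{(E Z_k^*)^2} \lesssim \frac{1}{k^2 \ln(k)^2}.
\]

Finally, if $\alpha < -1$, the denominator $(E Z_k^*)^2$ converges for $k \to \infty$, and

\[
\frac{\mathrm{Var}(Y_k^*)}{(E Z_k^*)^2} \lesssim (E Y_k^*)^2 \lesssim k^{2\alpha} \ll \frac{1}{k^2}.
\]

In any case, we have $\sum_{k=1}^\infty \mathrm{Var}(Y_k^*) / (E Z_k^*)^2 < \infty$, and the strong law of large numbers [Shi96, Theorem IV.3.2] implies the claim.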


Appendix B

Additional Proofs

B.1 Proof of Proposition 2.17

We prove the statement in two parts.

Lemma B.1. In the situation of Proposition 2.17, there is a unique solution

Xto (2.39) that satisfies X∈RE(s)for s= 1.

Proof. This is an application of the arguments in [LR15, Theorem 5.1.3]. More precisely, we show that the assumptions (H1) (continuity) and (H2′) (monotonicity) therein are satisfied in the Gelfand triple $W_0^{1,2}(D) \subset L^2(D) \simeq L^2(D)^* \subset W_0^{1,2}(D)^*$ (i.e. $H_1 \subset H_0 \subset H_{-1}$), whereas (H3) (coercivity) and (H4′) (boundedness) are satisfied in the shifted triple $W^{2,2}(D) \cap W_0^{1,2}(D) \subset W_0^{1,2}(D) \subset L^2(D)$ (i.e. $H_2 \subset H_1 \subset H_0$). First note that in all cases considered, $\partial_x f : \mathbb{R} \to \mathbb{R}$ is bounded from above, and thus there is $c > 0$ such that for any $X \in L^2(D)$ and $Y : D \to \mathbb{R}$:

\[
\langle \partial_x f(Y) X, X \rangle_{L^2(D)} \le c \|X\|_{L^2(D)}^2. \tag{B.1}
\]

Since $\gamma > d/4 + 1/2$, $(-\Delta)^{1/2} B$ is a Hilbert–Schmidt operator on $H$, i.e. the dispersion operator $B$ is a Hilbert–Schmidt operator from $H$ to $H_1$. As $B$ is constant, it suffices to test (H1), (H2′), (H3), (H4′) only for the drift of the SPDE (2.39).

(H1) If $f$ is a polynomial, this is a trivial consequence of the binomial theorem. On the other hand, if $f \in C_b^\infty(\mathbb{R})$ and $u, v, w \in H_1$, then $t \mapsto {}_{H_{-1}}\langle \theta \Delta (u + tv) + f(u + tv), w \rangle_{H_1}$ is continuous as a consequence of the linearity of $\Delta$ and the dominated convergence theorem.


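(H2′) This is a consequence of (B.1): For $u, v \in H_1$,

\begin{align*}
{}_{H_{-1}}\langle \theta \Delta u + f(u) - \theta \Delta v - f(v), u - v \rangle_{H_1} &\le \langle f(u) - f(v), u - v \rangle_{H_0} \\
&= \langle \partial_x f(w)(u - v), u - v \rangle_{H_0} \\
&\le c \|u - v\|_{H_0}^2
\end{align*}

for some $w : D \to \mathbb{R}$.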

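(H3) For $u \in H_2$,

\begin{align*}
{}_{L^2(D)}\langle \theta \Delta u + f(u), u \rangle_{H_2} &= -\theta \|u\|_{H_2}^2 + \langle f(u), u \rangle_{H_1} \\
&= -\theta \|u\|_{H_2}^2 + \langle \partial_x f(u) \nabla u, \nabla u \rangle_{L^2(D)} \\
&\le -\theta \|u\|_{H_2}^2 + c \|u\|_{H_1}^2,
\end{align*}

where we used (B.1) componentwise in the last inequality.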

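(H4′) First consider the case that $f$ is a polynomial. W.l.o.g., assume $f(x) = x^k$ for $k \in \mathbb{N}$. Then for $u \in H_2$:

$$\|f(u)\|^2_{L^2(\mathcal{D})} = \|u\|^{2k}_{L^{2k}(\mathcal{D})},$$

and for $d \le 2$ this term is bounded by $\|u\|^{2k}_{W^{1,2}(\mathcal{D})} = \|u\|^{2k}_{H_1}$ up to a constant. In $d = 3$ this is still true if $k \le 3$. This proves (H4′). Finally, if $f \in C^\infty_b(\mathbb{R})$, then $\|f(u)\|^2_{L^2(\mathcal{D})} \le |\mathcal{D}|\sup_{x\in\mathbb{R}} f(x)^2 < \infty$, and (H4′) is trivially satisfied.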
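The dimension restriction in (H4′) reflects the Sobolev embedding $W^{1,2}(\mathcal{D}) \subset L^{2k}(\mathcal{D})$. As a minimal sketch, assuming a bounded domain with sufficiently regular boundary, the admissible polynomial degrees can be read off from the critical exponent $2d/(d-2)$; the helper names below are hypothetical and not part of the text.

```python
import math


def critical_sobolev_exponent(d: int) -> float:
    """Largest q such that W^{1,2} embeds into L^q on a bounded regular domain in R^d.

    For d <= 2 every finite q is admissible, which is encoded here as infinity.
    """
    if d < 1:
        raise ValueError("dimension must be a positive integer")
    if d <= 2:
        return math.inf
    return 2.0 * d / (d - 2.0)


def polynomial_degree_admissible(k: int, d: int) -> bool:
    """True if ||u^k||_{L^2} = ||u||_{L^{2k}}^k is controlled by ||u||_{W^{1,2}}^k."""
    return 2 * k <= critical_sobolev_exponent(d)


for d in (1, 2, 3):
    admissible = [k for k in range(1, 7) if polynomial_degree_admissible(k, d)]
    print(f"d = {d}: admissible degrees among 1..6 are {admissible}")
```

In $d = 3$ the critical exponent is $6$, so $2k \le 6$ recovers the condition $k \le 3$ stated above.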

Now as in [LR15, Lemma 5.1.4 and 5.1.5], (H3) and (H4′) imply that there is a sequence of finite-dimensional approximations $X^{(n)}$ to the solution $X$ which is bounded uniformly in $L^2(\Omega \times [0,T]; H_{s+1})$ and $L^p(\Omega; L^\infty(0,T;H_s))$ for $s = 1$, and such that $\theta\Delta X^{(n)} + f(X^{(n)})$ is bounded uniformly in $L^2(\Omega \times [0,T]; H_{s-1})$ for $s = 1$. In particular, these statements remain true for the (weaker) case $s = 0$. Based on these bounds for $s = 0$, the proof of [LR15, Theorem 5.1.3] transfers verbatim and yields a unique solution $X$ to (2.39) with $X \in R_E(0)$. The stronger a priori bounds ($s = 1$) imply that in fact $X \in R_E(1)$, which concludes the proof.

Lemma B.2. In the situation of Proposition 2.17, there is $s > d/2$ such that $X \in R_E(s)$.


Proof. In $d = 1$, this has been proven in Lemma B.1, so let $d \in \{2, 3\}$. We apply the usual splitting argument and write $X = \bar{X} + \widetilde{X}$, where $\bar{X}$ is the solution to (2.39) with $f = 0$. Then $\bar{X} \in R_E(s)$ for any $s < s^* = 1 + 2\gamma - d/2$, see [DPZ14, Section 5.3]. As $\gamma > d/4 + 1/2$, we have in particular $\bar{X} \in R_E(2)$. As a consequence, the claim is proven once we know that for all $0 < \eta < 2$,

$$X \in R_E(\eta). \qquad \text{(B.2)}$$

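(i) For polynomial $f$, we assume w.l.o.g. $f(x) = x^k$, where $k$ is arbitrary in $d = 2$ and $k \le 3$ in $d = 3$. In this case, $\|f(X)\|_{L^2(\mathcal{D})} = \|X\|^k_{L^{2k}(\mathcal{D})} \lesssim \|X\|^k_{H_1}$. Consequently, $f(X) \in R_E(0)$ because $X \in R_E(1)$. Similar to the proof of Proposition 2.3, we estimate for $0 \le t \le T$:

$$\begin{aligned}
\|\widetilde{X}_t\|_\eta &\le \|e^{t\theta\Delta}X_0\|_\eta + \int_0^t \|e^{(t-r)\theta\Delta}f(X_r)\|_\eta\,dr \\
&\lesssim \|X_0\|_\eta + \frac{2}{2-\eta}T^{1-\eta/2}\sup_{0\le r\le t}\|f(X_r)\|_{L^2(\mathcal{D})}.
\end{aligned}$$

As $f(X) \in R_E(0)$ and $\mathbb{E}\|X_0\|^p_{s^*+2} < \infty$ for any $p \ge 1$, we conclude that (B.2) holds true.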

(ii) Let $f \in C^\infty_b(\mathbb{R})$. By Proposition 2.19 (ii), condition $(F_{s,\eta})$ holds for $s = 1$ and $0 < \eta < 2$. Thus, Proposition 2.3 (ii) implies (B.2).
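The factor $\frac{2}{2-\eta}T^{1-\eta/2}$ in step (i) can be traced back to a smoothing bound for the heat semigroup. The following is a sketch only, assuming the standard estimate $\|e^{\tau\theta\Delta}g\|_\eta \lesssim \tau^{-\eta/2}\|g\|_{L^2(\mathcal{D})}$ for $\tau > 0$, which is not restated in the proof above:

```latex
\int_0^t \bigl\| e^{(t-r)\theta\Delta} f(X_r) \bigr\|_\eta \, dr
  \lesssim \sup_{0 \le r \le t} \| f(X_r) \|_{L^2(\mathcal{D})} \int_0^t (t-r)^{-\eta/2} \, dr
  = \frac{2}{2-\eta} \, t^{1-\eta/2} \sup_{0 \le r \le t} \| f(X_r) \|_{L^2(\mathcal{D})} .
```

The time integral is finite precisely because $\eta < 2$, which matches the range of $\eta$ in (B.2), and $t \le T$ gives the stated bound.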

This finishes the proof of Proposition 2.17.

B.2 Proof of Lemma 5.5

This section is an adaptation of the proof of [ACP20, Proposition 30], which,

in turn, is a modification of [DPZ14, Theorem 5.25].

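For $s < s^*$ and $\alpha > 0$, let $Y^{(s,\alpha)}_t := \int_0^t (t-r)^{-\alpha}(-\Delta)^{s/2}e^{(t-r)\theta\Delta}B\,dW_r$. First, we prove:

Lemma B.3. For all $s < s^*$, $0 < \alpha < (s^*-s)/2$ and $p \ge 2$, we have a.s. $Y^{(s,\alpha)} \in L^p(0,T;L^p(\mathcal{D}))$.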


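Proof. For $x \in \mathcal{D}$, let $\delta_x$ be the point evaluation operator. We have for $x \in \mathcal{D}$, $0 \le t \le T$, using that $B^*(-\Delta)^\gamma$ is a bounded operator on $L^2(\mathcal{D})$:

$$\begin{aligned}
\mathbb{E}\bigl[Y^{(s,\alpha)}_t(x)^2\bigr] &= \int_0^t r^{-2\alpha}\bigl\|\delta_x(-\Delta)^{s/2}e^{r\theta\Delta}B\bigr\|^2_{HS}\,dr \\
&= \int_0^t r^{-2\alpha}\bigl\|B^*(-\Delta)^\gamma e^{r\theta\Delta}(-\Delta)^{s/2-\gamma}\delta_x^*\bigr\|^2_{L^2(\mathcal{D})}\,dr \qquad \text{(B.3)} \\
&\lesssim \int_0^t r^{-2\alpha}\bigl\|\delta_x(-\Delta)^{s/2}e^{r\theta\Delta}(-\Delta)^{-\gamma}\bigr\|^2_{HS}\,dr,
\end{aligned}$$

so w.l.o.g. we restrict to the case $B = (-\Delta)^{-\gamma}$. In that case, together with $\sup_{k\in\mathbb{N}}\|\Phi_k\|_{L^\infty(\mathcal{D})} < \infty$, a calculation as in Lemma 2.7 gives

$$\mathbb{E}\bigl[Y^{(s,\alpha)}_t(x)^2\bigr] = \sum_{k=1}^{\infty}\lambda_k^{s-2\gamma}\int_0^t r^{-2\alpha}e^{-2\theta\lambda_k r}\,dr\,\Phi_k(x)^2 \lesssim \sum_{k=1}^{\infty}k^{\frac{2}{d}(s-2\gamma-1+2\alpha)},$$

which is finite¹ for $\alpha < (s^*-s)/2$. As $Y^{(s,\alpha)}$ is Gaussian,

$$\sup_{0\le t\le T,\,x\in\mathcal{D}}\mathbb{E}\bigl[|Y^{(s,\alpha)}_t(x)|^p\bigr] \lesssim \sup_{0\le t\le T,\,x\in\mathcal{D}}\mathbb{E}\bigl[Y^{(s,\alpha)}_t(x)^2\bigr]^{p/2} < \infty.$$

This leads to

$$\mathbb{E}\int_0^T\!\!\int_{\mathcal{D}}|Y^{(s,\alpha)}_t(x)|^p\,dx\,dt \le T|\mathcal{D}|\sup_{0\le t\le T,\,x\in\mathcal{D}}\mathbb{E}\bigl[|Y^{(s,\alpha)}_t(x)|^p\bigr] < \infty,$$

proving the claim.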
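The Gaussian step relies on the fact that every absolute moment of a centred Gaussian variable is a fixed multiple of a power of its second moment. A minimal sketch of the standard identity $\mathbb{E}|Z|^p = (\mathbb{E}Z^2)^{p/2}\,2^{p/2}\,\Gamma((p+1)/2)/\sqrt{\pi}$ is given below; the function name `gaussian_abs_moment` is illustrative and does not appear in the text.

```python
import math


def gaussian_abs_moment(p: float, second_moment: float) -> float:
    """E|Z|^p for a centred Gaussian Z with E[Z^2] = second_moment."""
    if p <= -1.0:
        raise ValueError("the p-th absolute moment is finite only for p > -1")
    if second_moment < 0.0:
        raise ValueError("the second moment must be non-negative")
    # Constant depending on p only; it equals 1 for p = 2 and 3 for p = 4.
    constant = 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)
    return constant * second_moment ** (p / 2.0)


for p in (2, 3, 4, 6):
    print(f"p = {p}: E|Z|^p / (E Z^2)^(p/2) = {gaussian_abs_moment(p, 1.0):.4f}")
```

Since the constant depends on $p$ alone, a uniform bound on the second moment in $(t, x)$ transfers directly to a uniform bound on the $p$-th moment.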

Proof of Lemma 5.5. Using the factorization formula [DPZ14, Theorem 5.10],

we obtain from Lemma B.3 together with [DPZ14, Proposition 5.9] that

$(-\Delta)^{s/2}\bar{X} \in C(0,T;L^p(\mathcal{D}))$ for all $s < s^*$ and $p \ge 2$ such that $1/p < (s^*-s)/2$. This means that $\bar{X} \in R_p(s)$ for all $p \ge 2$ and $s < s^* - 2/p$. As $p \ge 2$ is arbitrary, this finishes the proof.

B.3 Proof of Proposition 6.6

The arguments are similar to those in Appendix B.1, the main difference being the new inhibitor component and the concentration-dependent unstable zero in the reaction polynomial. For $d \le 2$, the proof can be found in [PFA+21].

¹ In particular, the terms involving the point evaluation $\delta_x$ in (B.3) are finite.


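We write $\mathcal{H}_s := H_s \oplus H_s$ for the regularity spaces describing both components². Similarly to Lemma B.1, we work in the Hilbert space triples $\mathcal{H}_1 \subset \mathcal{H}_0 \subset \mathcal{H}_{-1}$ and $\mathcal{H}_2 \subset \mathcal{H}_1 \subset \mathcal{H}_0$. Further, with $f(u,z) = k_1u(u_0-u)(u-u_0a(z))$, we write $A_1(U,V) = D_U\Delta U + f(U,\|U\|_{L^2(\mathcal{D})}) - k_2V$ and $A_2(U,V) = D_V\Delta V + \epsilon(bU - V)$ as well as $A(U,V) = (A_1(U,V), A_2(U,V))$. Similarly to (B.1), we have for $U \in L^2(\mathcal{D})$, $Y : \mathcal{D} \to \mathbb{R}$ and $z \in \mathbb{R}$:

$$\langle \partial_u f(Y,z)U, U\rangle_{L^2(\mathcal{D})} \le c\|U\|^2_{L^2(\mathcal{D})}, \qquad \text{(B.4)}$$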

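because $a$ is a bounded function. As $B = \sigma(-\Delta)^{-\gamma}$ and $A_2$ is linear, it suffices to consider $A_1$ in order to show (H1), (H2′), (H3), (H4′) from [LR15]. For (H2′), we have to take into account the dependence of $f$ on the overall concentration: Using (B.4),

$$\begin{aligned}
&{}_{H_{-1}}\langle A_1(U_1,V_1) - A_1(U_2,V_2), U_1 - U_2\rangle_{H_1} \\
&\quad\lesssim \bigl\langle f(U_1,\|U_1\|_{L^2(\mathcal{D})}) - f(U_2,\|U_2\|_{L^2(\mathcal{D})}), U_1 - U_2\bigr\rangle_{L^2(\mathcal{D})} + k_2\|V_1 - V_2\|_{L^2(\mathcal{D})}\|U_1 - U_2\|_{L^2(\mathcal{D})} \\
&\quad\lesssim \bigl\langle \partial_u f(Y,\|U_1\|_{L^2(\mathcal{D})})(U_1 - U_2), U_1 - U_2\bigr\rangle_{L^2(\mathcal{D})} \\
&\qquad + \bigl\langle \partial_z f(U_2,\tilde{z})\bigl(\|U_1\|_{L^2(\mathcal{D})} - \|U_2\|_{L^2(\mathcal{D})}\bigr), U_1 - U_2\bigr\rangle_{L^2(\mathcal{D})} + k_2\|V_1 - V_2\|_{L^2(\mathcal{D})}\|U_1 - U_2\|_{L^2(\mathcal{D})} \\
&\quad\lesssim \|U_1 - U_2\|^2_{L^2(\mathcal{D})} + \|V_1 - V_2\|^2_{L^2(\mathcal{D})} + \|\partial_z f(U_2,\tilde{z})\|_{L^2(\mathcal{D})}\,\bigl|\|U_1\|_{L^2(\mathcal{D})} - \|U_2\|_{L^2(\mathcal{D})}\bigr|\,\|U_1 - U_2\|_{L^2(\mathcal{D})} \\
&\quad\lesssim \bigl(1 + \|\partial_z f(U_2,\tilde{z})\|_{L^2(\mathcal{D})}\bigr)\|(U_1,V_1) - (U_2,V_2)\|^2_{L^2(\mathcal{D})\oplus L^2(\mathcal{D})}
\end{aligned}$$

for some $Y : \mathcal{D} \to \mathbb{R}$ and $\tilde{z} \in \mathbb{R}$. Further, using that $\partial_z a$ is a bounded function as well as the Sobolev embedding $W^{1,2}(\mathcal{D}) \subset L^4(\mathcal{D})$,

$$\|\partial_z f(U_2,\tilde{z})\|_{L^2(\mathcal{D})} \lesssim \|U_2(u_0 - U_2)\|_{L^2(\mathcal{D})} \lesssim \|U_2\|_{L^2(\mathcal{D})} + \|U_2\|^2_{L^4(\mathcal{D})} \lesssim \bigl(1 + \|U_2\|_{H_1}\bigr)^2.$$

Therefore we can take $\rho(U,V) = c(1 + \|U\|_{H_1})^2$ for some $c > 0$ in the notation of [LR15], and (H2′) is satisfied.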
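Two elementary inequalities are used implicitly in the last two steps of the chain of estimates for (H2′). As a sketch, with $\|\cdot\|$ denoting the $L^2(\mathcal{D})$ norm and with constants that are not spelled out in the text, they read:

```latex
k_2 \, \|V_1 - V_2\| \, \|U_1 - U_2\|
  \le \frac{k_2}{2} \bigl( \|V_1 - V_2\|^2 + \|U_1 - U_2\|^2 \bigr),
\qquad
\bigl| \, \|U_1\| - \|U_2\| \, \bigr| \le \|U_1 - U_2\| .
```

The first is Young's inequality and the second is the reverse triangle inequality. Combined with $\|U_1 - U_2\|^2 \le \|(U_1,V_1) - (U_2,V_2)\|^2_{L^2(\mathcal{D})\oplus L^2(\mathcal{D})}$, they give the final line of the estimate.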

² Note that this differs from the notation in Section 2.6, as the regularity of the inhibitor component is taken into account.


Now, (H1), (H3) and (H4′) work exactly as in Lemma B.1, with obvious modifications in notation due to the presence of the inhibitor component, taking into account that $a$ is continuous and bounded. As a consequence, we have the following analogue of Lemma B.1:

Lemma B.4. In the situation of Proposition 6.6, there is a unique solution $(U, V)$ to (6.14), (6.15) with $U, V \in R_E(1)$.

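We can represent the inhibitor as $V_t = e^{t(D_V\Delta - \epsilon I)}V_0 - \epsilon bF_3(U)(t) = e^{t(D_V\Delta - \epsilon I)}V_0 + \epsilon b\int_0^t e^{(t-r)(D_V\Delta - \epsilon I)}U_r\,dr$. Note that $F_3$ satisfies $(F_{s,\eta})$ for every $s \in \mathbb{R}$ and $\eta < 4$: With $\varepsilon = 2 - \eta/2$,

$$\begin{aligned}
\sup_{0\le t\le T}\|F_3(U)(t)\|_{s+\eta+\varepsilon-2} &\lesssim \sup_{0\le t\le T}\int_0^t \bigl\|e^{(t-r)(D_V\Delta - \epsilon I)}U_r\bigr\|_{s+2-\varepsilon}\,dr \\
&\lesssim \sup_{0\le t\le T}\int_0^t (t-r)^{-1+\varepsilon/2}\|U_r\|_s\,dr \lesssim \frac{2}{\varepsilon}T^{\varepsilon/2}\|U\|_{R(s)}.
\end{aligned}$$

Further, $\mathbb{E}[\|V_0\|^p_{s^*+2}] < \infty$ for all $p \ge 2$. Consequently, for all $s < s^*$ and $\varepsilon > 0$, we have $V \in R_E(s+2-\varepsilon)$ whenever $U \in R_E(s)$. In particular, from Lemma B.4 it follows that $V \in R_E(3-\varepsilon)$.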
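The factor $(t-r)^{-1+\varepsilon/2}$ reflects the smoothing property of the semigroup. On a single eigenmode with eigenvalue $\lambda > 0$ of $-\Delta$, gaining $2a$ orders of regularity over a time span $\tau$ costs at most $\sup_{\lambda>0}\lambda^a e^{-\lambda\tau} = (a/(e\tau))^a$, and here $a = 1 - \varepsilon/2$. The sketch below checks this scalar identity numerically under the simplifying assumptions of unit diffusivity and no damping term; the function names and the grid parameters are illustrative only.

```python
import math


def smoothing_constant(a: float, tau: float) -> float:
    """Closed-form supremum over lam > 0 of lam**a * exp(-lam * tau), attained at lam = a / tau."""
    if a <= 0.0 or tau <= 0.0:
        raise ValueError("a and tau must be positive")
    return (a / (math.e * tau)) ** a


def grid_supremum(a: float, tau: float, n_points: int = 20_000, lam_max: float = 200.0) -> float:
    """Maximum of lam**a * exp(-lam * tau) over a uniform grid in (0, lam_max]."""
    step = lam_max / n_points
    return max((i * step) ** a * math.exp(-i * step * tau) for i in range(1, n_points + 1))


epsilon = 0.5
a = 1.0 - epsilon / 2.0  # exponent of the singularity (t - r)^(-a)
tau = 0.1
print(smoothing_constant(a, tau), grid_supremum(a, tau))
```

Because $a < 1$ for every $\varepsilon > 0$, the resulting singularity $(t-r)^{-a}$ is integrable, which is what produces the factor $\frac{2}{\varepsilon}T^{\varepsilon/2}$.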

Now, exactly as in Lemma B.2 we see that there is some $s > d/2$ such that $U \in R_E(s)$, taking into account that $a$ is bounded and $V \in R_E(3-\varepsilon)$ for $\varepsilon > 0$. Finally, it is clear that $U \mapsto f(U,\|U\|_{L^2(\mathcal{D})})$ satisfies $(F_{s,\eta})$ for $d/2 < s < s^*$ and $\eta < 2$, so the same is true for $F(U) = f(U,\|U\|_{L^2(\mathcal{D})}) - k_2\bigl(e^{(\cdot)(D_V\Delta - \epsilon I)}V_0 - \epsilon bF_3(U)\bigr)$. Thus, an application of Proposition 2.4 finishes the proof of Proposition 6.6.


List of Figures

2.1 Monte-Carlo simulation for the Allen-Cahn equation (p. 50)
6.1 Diffusivity estimation for simulated cell data (p. 130)
6.2 Effect of the noise intensity (p. 132)
6.3 Diffusivity estimation for a giant cell observation (p. 133)
6.4 Diffusivity estimation on a population of giant cells (p. 136)
6.5 Effective unstable zero of the reaction term (p. 137)
6.6 Effective diffusivity outside the cell (p. 138)

All of these plots appear in one of the works on which this dissertation is based, namely [PS20] (Figure 2.1) and [PFA+21] (Figures 6.1 to 6.6).


Notation

Assumptions

(W): well-posedness of the SPDE (p. 18)
$(F_{s,\eta})$: regularity bound on the nonlinearity $F$ (p. 21)
$(F^{v}_{s,\eta})$: bound on $F$ in variational spaces $R_v(s)$ (p. 23)
$(F^{p}_{s,\eta})$: $L^p$-regularity bound on the nonlinearity $F$ (p. 108)
$(F^{\mathrm{par}}_{s,\eta})$: regularity bound on parametrized nonlinearity $F$ (p. 119)
$(F^{\mathrm{sys}}_{s,\eta})$: analogue of $(F_{s,\eta})$ for partially observed systems (p. 47)
$(F^{J}_{s,\eta})$: bound for the integrated nonlinearity $JF$ (p. 59)
$(N^{\gamma}_{\eta})$: dispersion $B$ is asymptotically close to $\bar{B} = (-A)^{-\gamma}$ (p. 78)
$(D_0)$: $B_r$ is an algebra of continuous functions (p. 86)
$(D_1)$: growth bound on the norm of the eigenfunctions (p. 86)
$(D_2)$: error bound for the integral discretization error (p. 86)
$(D_2^*)$: trigonometric interpolation error (p. 101)
$(L_B)$: local control on the dispersion (p. 107)
$(L_K)$: shape of the kernel (p. 107)
$(L_\Psi)$: non-degeneracy within the local approach (p. 107)
$(I_\alpha)$: linear independence of the nonlinear components (p. 121)


Asymptotics

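$a_N \sim b_N$: There is $C > 0$ such that $a_N/b_N \to C$ for $N \to \infty$.
$a_N \asymp b_N$: $a_N/b_N \to 1$ for $N \to \infty$.
$a_N \lesssim b_N$: There is $C > 0$ such that $a_N \le Cb_N$.
$a_N \ll b_N$: $a_N = o(b_N)$, i.e. $a_N/b_N \to 0$ for $N \to \infty$.
$a_N \ll_p b_N$: There is $\epsilon > 0$ such that $a_N \ll b_N N^{-\epsilon}$ (polynomial negligibility).

Similar notation is used for different asymptotic parameters (i.e. $h$, $\delta$).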

Vector Spaces, Norms, Scalar Products

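A (fixed) norm on a Banach space $B$ is denoted by $\|\cdot\|_B$.

If $B$ is even a Hilbert space, the corresponding scalar product is $\langle\cdot,\cdot\rangle_B$.

Frequently used norms are abbreviated:

$H$: Hilbert space, typically $L^2(\mathcal{D})$ (state space)
$\|\cdot\|$, $\langle\cdot,\cdot\rangle$: norm and scalar product on $H$
$H_s$: $D((-A)^{s/2})$ (scale of regularity spaces)
$\|\cdot\|_s$, $\langle\cdot,\cdot\rangle_s$: norm and scalar product on $H_s$
$V$: $H_1 = D((-A)^{1/2})$ (energy space)
$R(s)$: $L^\infty(0,T;H_s)$
$R_E(s)$: $\bigcap_{p\ge1}L^p(\Omega,R(s)) = \bigcap_{p\ge1}L^p(\Omega,L^\infty(0,T;H_s))$ (locally convex space)
$R_v(s)$: $L^\infty(0,T;H_{s-1}) \cap L^2(0,T;H_s)$
$\|\cdot\|_{HS}$: Hilbert–Schmidt norm of an operator acting on $H$
$H^{s,p}(\mathcal{D})$: domain of $(-\Delta)^{s/2}$ in $L^p(\mathcal{D})$
$\|\cdot\|_{s,p}$: canonical norm on $H^{s,p}(\mathcal{D})$
$R_p(s)$: $L^\infty(0,T;H^{s,p}(\mathcal{D}))$
$\|\cdot\|_{(h)}$, $\langle\cdot,\cdot\rangle_{(h)}$: Euclidean norm and scalar product on $\mathbb{R}^{M_h}$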


Estimators

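Temporally white noise (Chapter 2): $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{part}}_N$, $\hat\theta^{\mathrm{lin}}_N$ (p. 27 f.)
Ornstein–Uhlenbeck noise (Section 3.1): $\hat\theta^{\mathrm{ref}}_N$, $\hat\mu^{\mathrm{ref}}_N$ (p. 61); $\hat\theta^{\mathrm{sim}}_N$ (p. 62); $\hat\mu^{\mathrm{full}}_N(\vartheta)$, $\hat\mu^{\mathrm{lin}}_N(\vartheta)$ (p. 65 f.)
Integrated noise (Section 3.2): $\hat\theta^{\mathrm{rescaled}}_N$ (p. 74)
Discrete observations (Chapter 4): $\hat\theta^{\mathrm{discr}}_{h,N}$ (p. 87)
Local observations (Chapter 5): $\hat\theta_{\delta,x_0}$ (p. 106); $\hat\theta^{\mathrm{mode}}_N$ (p. 114)
Joint parameter estimation (Section 6.1): $\hat\theta^N = (\hat\theta^N_0, \cdots, \hat\theta^N_K)$ (p. 119)
Activator–inhibitor model (Section 6.2): $\hat\theta^{\mathrm{lin},N}_0$, $\hat\theta^{\mathrm{pol},N}_0$, $\hat\theta^{\mathrm{full},N}_0$ (p. 128); $\hat\theta^{2,N}_0$, $\hat\theta^{3,N}_0$, $\hat\theta^{4,N}_0$ (p. 128 f.); $\hat\theta^{2,N}_1$ (p. 137)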

Constants and Further Notation

Fourier decomposition of A:

$\lambda_N$: eigenvalue of $-A$
$\Phi_N$: eigenfunction of $-A$
$P_N$: projection onto the span of $\Phi_1, \dots, \Phi_N$, defined on any $H_s$

Frequently used constants:

$\beta$: determined by $\lambda_N \sim N^\beta$, usually $\beta = 2/d$
$\Lambda$: proportionality constant given by $\lambda_N \asymp \Lambda N^\beta$, depends on $\mathcal{D}$
$\gamma$: degree of spatial correlation in the noise
$s^*$: optimal regularity of the solution process, i.e. $X \in R(s)$ if and only if $s < s^*$ (usually $s^* = 1 + 2\gamma - 1/\beta$)

Further notation:

$J$: Bochner integral operator $f \mapsto Jf = \int_0^{\cdot} f(r)\,dr$


Bibliography

[AB88] S. I. Aihara and A. Bagchi, Parameter Identification for Stochas-

tic Diffusion Equations with Unknown Boundary Conditions,

Appl Math Optim 17 (1988), 15–36.

[ABJR21] Randolf Altmeyer, Till Bretschneider, Josef Janák, and Markus

Reiß, Parameter Estimation in an SPDE Model for Cell Repo-

larisation, arXiv:2010.06340v2 [math.ST] (2021), preprint.

[ACP20] Randolf Altmeyer, Igor Cialenco, and Gregor Pasemann, Param-

eter estimation for semilinear SPDEs from local measurements,

arXiv:2004.14728v2 [math.ST] (2020), preprint.

[AF92] David R. Adams and Michael Frazier, Composition operators on

potential spaces, Proc. Amer. Math. Soc. 114 (1992), no. 1, 155–

165.

[AF03] Robert A. Adams and John J. F. Fournier, Sobolev spaces, second

ed., Pure and Applied Mathematics, vol. 140, Elsevier/Academic

Press, 2003.

[Aih92] S. I. Aihara, Regularized Maximum Likelihood Estimate for an

Infinite-dimensional Parameter in Stochastic Parabolic Systems,

SIAM J. Control and Optimization 30 (1992), no. 4, 745–764.

[Aih98a] S. I. Aihara, Consistency property of extended least-squares parameter estimation for stochastic diffusion equation, Systems & Control Letters 34 (1998), 249–256.

[Aih98b] S. I. Aihara, Identification of a Discontinuous Parameter in Stochastic Parabolic Systems, Appl Math Optim 37 (1998), 43–69.


[AR21] Randolf Altmeyer and Markus Reiß, Nonparametric Estimation

for Linear SPDEs from Local Measurements, Ann. Appl. Probab.

31 (2021), no. 1, 1–38.

[AS88] S. I. Aihara and Y. Sunahara, Identification of an Infinite-

dimensional Parameter for Stochastic Diffusion Equations,

SIAM J. Control and Optimization 26 (1988), no. 5, 1062–1075.

[ASB18] Sergio Alonso, Maike Stange, and Carsten Beta, Modeling ran-

dom crawling, membrane deformation and intracellular polarity

of motile amoeboid cells, PLOS ONE 13 (2018), no. 8, 1–22.

[AT17] Anthony P. Austin and Lloyd N. Trefethen, Trigonometric Inter-

polation and Quadrature in Perturbed Points, SIAM J. Numer.

Anal. 55 (2017), no. 5, 2113–2122.

[Aus16] Anthony P. Austin, Some New Results on and Applications of

Interpolation in Numerical Computation, University of Oxford,

2016, DPhil thesis.

[BB84] A. Bagchi and V. Borkar, Parameter identification in infinite

dimensional linear systems, Stochastics 12 (1984), 201–213.

[BBPSP14] L. Blanchoin, R. Boujemaa-Paterski, C. Sykes, and J. Plastino,

Actin Dynamics, Architecture, and Mechanics in Cell Motility,

Physiol Rev 94 (2014), no. 1, 235–263.

[BC09] A. Bain and D. Crisan, Fundamentals of Stochastic Filtering,

Stochastic Modelling and Applied Probability, vol. 60, Springer

Science+Business Media, LLC, 2009.

[Bis99] J. P. N. Bishwal, Bayes and Sequential Estimation in Hilbert

Space Valued Stochastic Differential Equations, Journal of the

Korean Statistical Society 28 (1999), no. 1, 93–106.

[Bis02] J. P. N. Bishwal, The Bernstein-von Mises Theorem and Spectral Asymptotics of Bayes Estimators for Parabolic SPDEs, J. Austral. Math. Soc. 72 (2002), 287–298.

[BL76] J. Bergh and J. Löfström, Interpolation Spaces (An Introduc-

tion), Grundlehren der mathematischen Wissenschaften, vol.

223, Springer-Verlag Berlin Heidelberg, 1976.


[BM18] Haïm Brezis and Petru Mironescu, Gagliardo–Nirenberg inequal-

ities and non-inequalities: The full story, Ann. Inst. H. Poincaré

Anal. Non Linéaire 35 (2018), no. 5, 1355–1376.

[BS17] Matthew D. Blair and Christopher D. Sogge, Refined and Mi-

crolocal Kakeya-Nikodym Bounds of Eigenfunctions in Higher

Dimensions, Comm. Math. Phys. 356 (2017), no. 2, 501–533.

[BST80] A. Bagchi, R. C. W. Strijbos, and G. Thé, Identification of a

distributed-parameter system with boundary noise, Int. J. Sys-

tems Sci. 11 (1980), no. 1, 49–56.

[BT19] Markus Bibinger and Mathias Trabs, On Central Limit Theo-

rems for Power Variations of the Solution to the Stochastic Heat

Equation, Stochastic Models, Statistics and Their Applications

(A. Steland, E. Rafajłowicz, and O. Okhrin, eds.), Springer Pro-

ceedings in Mathematics & Statistics, vol. 294, Springer, Cham,

2019, pp. 69–84.

[BT20] Markus Bibinger and Mathias Trabs, Volatility estimation for stochastic PDEs using high-frequency observations, Stochastic Process. Appl. 130 (2020), no. 5, 3005–3052.

[CA77] J. W. Cahn and S. M. Allen, A Microscopic Theory for Domain

Wall Motion and its Experimental Verification in Fe-Al Alloy

Domain Growth Kinetics, J. Phys. Colloques 38 (1977), no. C7,

C7–51 – C7–54.

[CCG20] Ziteng Cheng, Igor Cialenco, and Ruoting Gong, Bayesian esti-

mations for diagonalizable bilinear SPDEs, Stochastic Process.

Appl. 130 (2020), no. 2, 845–877.

[CD20] Carsten Chong and Robert C. Dalang, Power Variations in Frac-

tional Sobolev Spaces for a Class of Parabolic Stochastic PDEs,

arXiv:2006.15817v1 [math.PR] (2020), preprint.

[CDVK20] Igor Cialenco, Francisco Delgado-Vences, and Hyun-Jung Kim,

Drift estimation for discretely sampled SPDEs, Stoch PDE: Anal

Comp 8 (2020), no. 4, 895–920.


[Cer01] Sandra Cerrai, Second order PDE’s in finite and infinite dimen-

sion, Lecture Notes in Mathematics, vol. 1762, Springer-Verlag,

Berlin, 2001.

[CGH11] Igor Cialenco and Nathan Glatt-Holtz, Parameter estimation for

the stochastically perturbed Navier-Stokes equations, Stochastic

Process. Appl. 121 (2011), no. 4, 701–724.

[CGH18] Igor Cialenco, Ruoting Gong, and Yicong Huang, Trajectory fit-

ting estimators for SPDEs driven by additive noise, Stat Infer-

ence Stoch Process 21 (2018), no. 1, 1–19.

[CH20] Igor Cialenco and Yicong Huang, A note on parameter estima-

tion for discretely sampled SPDEs, Stochastics and Dynamics

20 (2020), no. 3, 2050016.

[Che93] Xia Chen, On the law of the iterated logarithm for independent

Banach space valued random variables, The Annals of Probabil-

ity 21 (1993), no. 4, 1991–2011.

[Cho19] Carsten Chong, High-frequency analysis of parabolic stochas-

tic PDEs with multiplicative noise: Part I, arXiv:1908.04145v1

[math.PR] (2019), preprint.

[Cho20] Carsten Chong, High-frequency analysis of parabolic stochastic PDEs, The Annals of Statistics 48 (2020), no. 2, 1143–1167.

[CHQZ88] C. Canuto, M. Y. Hussaini, A. Quarteroni, and T. A. Zang,

Spectral Methods in Fluid Dynamics, Springer Series in Compu-

tational Physics, Springer-Verlag Berlin Heidelberg, 1988.

[Cia02] Philippe G. Ciarlet, The Finite Element Method for Elliptic

Problems, Classics in Applied Mathematics, vol. 40, Society for

Industrial and Applied Mathematics, 2002, reprint of the 1978

edition.

[Cia10] Igor Cialenco, Parameter estimation for SPDEs with multiplica-

tive fractional noise, Stoch. Dyn. 10 (2010), no. 4, 561–576.

[Cia18] Igor Cialenco, Statistical inference for SPDEs: an overview, Stat Inference Stoch Process 21 (2018), no. 2, 309–329.


[CK22] Igor Cialenco and Hyun-Jung Kim, Parameter estimation for

discretely sampled stochastic heat equation driven by space-only

noise, Stochastic Process. Appl. 143 (2022), 1–30.

[CKL20] Igor Cialenco, Hyun-Jung Kim, and Sergey V. Lototsky, Statis-

tical analysis of some evolution equations driven by space-only

noise, Stat Inference Stoch Process 23 (2020), no. 1, 83–103.

[CKP21] Igor Cialenco, Hyun-Jung Kim, and Gregor Pasemann, Statisti-

cal analysis of discretely sampled semilinear SPDEs: a power

variation approach, arXiv:2103.04211v1 [math.PR] (2021),

preprint.

[CL09] Igor Cialenco and Sergey V. Lototsky, Parameter estimation in

diagonalizable bilinear stochastic parabolic equations, Stat Infer-

ence Stoch Process 12 (2009), no. 3, 203–219.

[CLP09] Igor Cialenco, Sergey V. Lototsky, and Jan Pospíšil, Asymp-

totic properties of the maximum likelihood estimator for stochas-

tic parabolic equations with additive fractional Brownian motion,

Stoch. Dyn. 9 (2009), no. 2, 169–185.

[CX14] Igor Cialenco and Liaosha Xu, A note on error estimation for

hypothesis testing problems for some linear SPDEs, Stoch PDE:

Anal Comp 2 (2014), no. 3, 408–431.

[CX15] Igor Cialenco and Liaosha Xu, Hypothesis testing for stochastic PDEs driven by additive noise, Stochastic Process. Appl. 125 (2015), no. 3, 819–866.

[DMPD00] T. E. Duncan, B. Maslowski, and B. Pasik-Duncan, Adaptive

Control for Semilinear Stochastic Systems, SIAM J. Control Op-

tim. 38 (2000), no. 6, 1683–1706.

[DPDT94] Giuseppe Da Prato, Arnaud Debussche, and Roger Temam,

Stochastic Burgers’ equation, NoDEA 1 (1994), no. 4, 389–402.

[DPZ14] Giuseppe Da Prato and Jerzy Zabczyk, Stochastic Equations in

Infinite Dimensions, second ed., Encyclopedia of Mathematics

and its Applications, vol. 152, Cambridge University Press, 2014.


[EN00] Klaus-Jochen Engel and Rainer Nagel, One-Parameter Semi-

groups for Linear Evolution Equations, Graduate Texts in Math-

ematics, vol. 194, Springer-Verlag New York, 2000.

[FFAB20] Sven Flemming, Francesc Font, Sergio Alonso, and Carsten Beta,

How cortical waves drive fission of motile cells, Proceedings of

the National Academy of Sciences 117 (2020), no. 12, 6330–6338.

[Fit61] R. FitzHugh, Impulses and Physiological States in Theoretical

Models of Nerve Membrane, Biophys. J. 1 (1961), 445–466.

[GEW+14] M. Gerhardt, M. Ecke, M. Walz, A. Stengl, C. Beta, and

G. Gerisch, Actin and PIP3 waves in giant cells reveal the inher-

ent length scale of an excited state, Journal of Cell Science 127

(2014), no. 20, 4507–4517.

[GM02] B. Goldys and B. Maslowski, Parameter Estimation for Con-

trolled Semilinear Stochastic Systems: Identifiability and Con-

sistency, Journal of Multivariate Analysis 80 (2002), 322–343.

[GN15] E. Giné and R. Nickl, Mathematical Foundations of Infinite-

Dimensional Statistical Models, Cambridge Series in Statisti-

cal and Probabilistic Mathematics, Cambridge University Press,

2015.

[Gri92] D. Grieser, $L^p$ bounds for eigenfunctions and spectral projections

of the Laplacian near concave boundaries, ProQuest LLC, Ann

Arbor, MI, 1992, Ph.D. thesis, University of California, Los An-

geles.

[Gri02] D. Grieser, Uniform bounds for eigenfunctions of the Laplacian on manifolds with boundary, Commun. in Partial Differential Equations 27 (2002), no. 7-8, 1283–1299.

[GT01] David Gilbarg and Neil S. Trudinger, Elliptic Partial Differential

Equations of Second Order, Classics in Mathematics, Springer-

Verlag Berlin Heidelberg, 2001, reprint of the 1998 edition.


[HKR93] M. Huebner, R. Khasminskii, and B. L. Rozovskii, Two Exam-

ples of Parameter Estimation for Stochastic Partial Differen-

tial Equations, Stochastic Processes (S. Cambanis, J. K. Ghosh,

R. L. Karandikar, and P. K. Sen, eds.), Springer-Verlag New

York, 1993, pp. 149–160.

[HL84] W. Horsthemke and R. Lefever, Noise-Induced Transitions (The-

ory and Applications in Physics, Chemistry, and Biology),

Springer Series in Synergetics, Springer-Verlag Berlin Heidel-

berg, 1984.

[HL00a] M. Huebner and S. Lototsky, Asymptotic analysis of a kernel

estimator for parabolic SPDE’s with time-dependent coefficients,

Ann. Appl. Probab. 10 (2000), no. 4, 1246–1258.

[HL00b] M. Huebner and S. Lototsky, Asymptotic Analysis of the Sieve Estimator for a Class of Parabolic SPDEs, Scand J Statist 27 (2000), no. 2, 353–370.

[HLR97] M. Huebner, S. Lototsky, and B. L. Rozovskii, Asymptotic Prop-

erties of an Approximate Maximum Likelihood Estimator for

Stochastic PDEs, Statistics and Control of Stochastic Processes

(Yu. M. Kabanov, B. L. Rozovskii, and A. N. Shiryaev, eds.),

World Scientific, 1997, pp. 139–155.

[HR95] M. Huebner and B. L. Rozovskii, On asymptotic properties of

maximum likelihood estimators for parabolic stochastic PDE’s,

Probab. Theory Relat. Fields 103 (1995), no. 2, 143–163.

[HT21a] Florian Hildebrandt and Mathias Trabs, Nonparametric calibra-

tion for stochastic reaction-diffusion equations based on discrete

observations, arXiv:2102.13415v1 [math.ST] (2021), preprint.

[HT21b] Florian Hildebrandt and Mathias Trabs, Parameter estimation for SPDEs based on discrete observations in time and space, Electronic Journal of Statistics 15 (2021), no. 1, 2716–2776.

[Hue93] M. Huebner, Parameter estimation for stochastic differential

equations, ProQuest LLC, Ann Arbor, MI, 1993, Ph.D. thesis,

University of Southern California.


[Hue99] M. Huebner, Asymptotic Properties of the Maximum Likelihood Estimator for Stochastic PDEs Disturbed by Small Noise, Statistical Inference for Stochastic Processes 2 (1999), no. 1, 57–68.

[Hui14] Jiang Hui, Moderate deviation for parameter estimator in the

stochastic parabolic equations with additive fractional Brownian

motion, Stochastics and Dynamics 14 (2014), no. 3, 1450002.

[IH81] I. A. Ibragimov and R. Z. Has’minskiĭ, Statistical Estimation

(Asymptotic Theory), Applications of Mathematics, vol. 16,

Springer-Verlag, New York-Berlin, 1981.

[IK99] I. A. Ibragimov and R. Z. Khas’minskiĭ, Estimation Problems for

Coefficients of Stochastic Partial Differential Equations. Part I,

Theory Probab. Appl. 43 (1999), no. 3, 370–387.

[IK00] I. A. Ibragimov and R. Z. Khas’minskiĭ, Estimation Problems for Coefficients of Stochastic Partial Differential Equations. Part II, Theory Probab. Appl. 44 (2000), no. 3, 469–494.

[IK01] I. A. Ibragimov and R. Z. Khas’minskiĭ, Estimation Problems for Coefficients of Stochastic Partial Differential Equations. Part III, Theory Probab. Appl. 45 (2001), no. 2, 210–232.

[Jan97] Svante Janson, Gaussian Hilbert Spaces, Cambridge Tracts in

Mathematics, Cambridge University Press, 1997.

[Jan20] Josef Janák, Parameter estimation for stochastic partial differ-

ential equations of second order, Appl. Math. Optim. 82 (2020),

353–397.

[Jan21] Josef Janák, Parameter Estimation for Stochastic Wave Equation Based on Observation Window, Acta Appl. Math. 172 (2021), no. 2, 37 pages.

[JS03] Jean Jacod and Albert N. Shiryaev, Limit theorems for stochas-

tic processes, second ed., Grundlehren der Mathematischen Wis-

senschaften, vol. 288, Springer-Verlag, Berlin, 2003.

[KL85] T. Koski and W. Loges, Asymptotic Statistical Inference for a

Stochastic Heat Flow Problem, Statistics & Probability Letters

3 (1985), 185–189.


[KL86] T. Koski and W. Loges, On minimum-contrast estimation for Hilbert space-valued stochastic differential equations, Stochastics 16 (1986), no. 3-4, 217–225.

[KLBR00] M. L. Kleptsyna, A. Le Breton, and M.-C. Roubaud, Parameter

Estimation and Optimal Filtering for Fractional Type Stochastic

Systems, Stat. Inference Stoch. Process. 3 (2000), no. 1-2, 173–182.

[KM19] P. Kříž and B. Maslowski, Central limit theorems and minimum-

contrast estimators for linear stochastic evolution equations,

Stochastics 91 (2019), 1109–1140.

[KO79] Heinz-Otto Kreiss and Joseph Oliger, Stability of the Fourier

method, SIAM J. Numer. Anal. 16 (1979), no. 3, 421–433.

[Kří20] P. Kříž, A space-consistent version of the minimum-contrast es-

timator for linear stochastic evolution equations, Stochastics and

Dynamics 20 (2020), no. 3, 2050019.

[Kry96] N. V. Krylov, On $L_p$-Theory of Stochastic Partial Differential

Equations in the Whole Space, SIAM Journal on Mathematical

Analysis 27 (1996), no. 2, 313–340.

[KU21a] Yusuke Kaino and Masayuki Uchida, Adaptive estimator for a

parabolic linear SPDE with a small noise, Jpn J Stat Data Sci

(2021), 29 pages.

[KU21b] Yusuke Kaino and Masayuki Uchida, Parametric estimation for

a parabolic linear SPDE model based on discrete observations, J.

Statist. Plann. Inference 211 (2021), 190–220.

[KUP91] P. Kumar, T. E. Unny, and K. Ponnambalam, Stochastic partial

differential equations in groundwater hydrology (Part 2: Application to Borden aquifer), Stochastic Hydrol. Hydraul. 5 (1991),

239–251.

[Kut04] Yury A. Kutoyants, Statistical inference for ergodic diffusion

processes, Springer Series in Statistics, Springer-Verlag London

Ltd., 2004.


[LL10a] W. Liu and S. V. Lototsky, Estimating speed and damping in the

stochastic wave equation, Stochastic partial differential equations

and applications, Quad. Mat., vol. 25, Dept. Math., Seconda

Univ. Napoli, Caserta, 2010, pp. 191–206.

[LL10b] W. Liu and S. V. Lototsky, Parameter estimation in hyperbolic multichannel models, Asymptot. Anal. 68 (2010), no. 4, 223–248.

[LM72] J.-L. Lions and E. Magenes, Non-homogeneous boundary value

problems and applications. Vol. I, Die Grundlehren der mathe-

matischen Wissenschaften, vol. 181, Springer-Verlag, New York-

Heidelberg, 1972.

[Log84] Wilfried Loges, Girsanov’s theorem in Hilbert space and an ap-

plication to the statistics of Hilbert space-valued stochastic dif-

ferential equations, Stochastic Process. Appl. 17 (1984), no. 2,

243–263.

[Lot96] S. V. Lototsky, Problems in statistics of stochastic differential

equations, ProQuest LLC, Ann Arbor, MI, 1996, Ph.D. thesis,

University of Southern California.

[Lot03] S. V. Lototsky, Parameter estimation for stochastic parabolic equations: asymptotic properties of a two-dimensional projection-based estimator, Stat. Inference Stoch. Process. 6 (2003), no. 1, 65–87.

[Lot04] S. V. Lototsky, Optimal filtering of stochastic parabolic equations, Recent developments in stochastic analysis and related topics (S. Albeverio, Z.-M. Ma, and M. Röckner, eds.), World Scientific, 2004, pp. 330–353.

[Lot09] S. V. Lototsky, Statistical inference for stochastic parabolic equations: a spectral approach, Publ. Mat. 53 (2009), no. 1, 3–45.

[LPS14] G. J. Lord, C. E. Powell, and T. Shardlow, An Introduction to

Computational Stochastic PDEs, Cambridge Texts in Applied

Mathematics, Cambridge University Press, 2014.

[LR99] S. V. Lototsky and B. L. Rosovskii, Spectral asymptotics of some

functionals arising in statistical inference for SPDEs, Stochastic

Process. Appl. 79 (1999), no. 1, 69–94.


[LR00] S. V. Lototsky and B. L. Rosovskii, Parameter Estimation for Stochastic Evolution Equations with Non-commuting Operators, Skorokhod’s Ideas in Probability Theory (V. Korolyuk, N. Portenko, and H. Syta, eds.), Institute of Mathematics of the National Academy of Sciences of Ukraine, Kiev, 2000, pp. 271–280.

[LR15] Wei Liu and Michael Röckner, Stochastic Partial Differen-

tial Equations: An Introduction, Universitext, Springer, Cham,

2015.

[LS77] R. S. Liptser and A. N. Shiryayev, Statistics of Random Pro-

cesses. I (General Theory), Applications of Mathematics, vol. 5,

Springer-Verlag New York, 1977.

[LS89] R. S. Liptser and A. N. Shiryayev, Theory of Martingales, Mathematics and its Applications (Soviet Series), vol. 49, Kluwer Academic Publishers, Dordrecht, 1989.

[LS01] R. S. Liptser and A. N. Shiryayev, Statistics of Random Processes. II (Applications), 2nd ed., Applications of Mathematics, vol. 6, Springer-Verlag Berlin Heidelberg, 2001.

[Lun95] Alessandra Lunardi, Analytic Semigroups and Optimal Reg-

ularity in Parabolic Problems, Modern Birkhäuser Classics,

Birkhäuser/Springer Basel, 1995.

[Mar03] Bo Markussen, Likelihood inference for a discretely observed

stochastic partial differential equation, Bernoulli 9 (2003), no. 5,

745–762.

[MFF+20] Eduardo Moreno, Sven Flemming, Francesc Font, Matthias

Holschneider, Carsten Beta, and Sergio Alonso, Modeling cell

crawling strategies with a bistable model: From amoeboid to

fan-shaped cell motion, Physica D: Nonlinear Phenomena 412

(2020), 132591.

[Mis08] Yuliya S. Mishura, Stochastic calculus for fractional Brownian

motion and related processes, Lecture Notes in Mathematics, vol.

1929, Springer-Verlag Berlin Heidelberg, 2008.


[MKT19a] Z. Mahdi Khalil and C. A. Tudor, Estimation of the drift parame-

ter for the fractional stochastic heat equation via power variation,

Mod. Stoch. Theory Appl. 6 (2019), no. 4, 397–417.

[MKT19b] Z. Mahdi Khalil and C. A. Tudor, On the distribution and q-variation of the solution to the heat equation with fractional Laplacian, Probab. Math. Statist. 39 (2019), no. 2, 315–335.

[Moh94] J. Mohapl, Maximum Likelihood Estimation in Linear Infinite

Dimensional Models, Stochastic Models 10 (1994), 781–794.

[Moh97] J. Mohapl, On Estimation in the Planar Ornstein-Uhlenbeck Process, Stochastic Models 13 (1997), 435–455.

[Moh00] J. Mohapl, A Stochastic Advection-Diffusion Model for the Rocky Flats Soil Plutonium Data, Ann. Inst. Statist. Math. 52 (2000), no. 1, 84–107.

[MP07] B. Maslowski and J. Pospíšil, Parameter Estimates for Linear

Partial Differential Equations with Fractional Boundary Noise,

Communications in Information and Systems 7 (2007), no. 1,

1–20.

[MP08] B. Maslowski and J. Pospíšil, Ergodicity and Parameter Estimates for Infinite-Dimensional Fractional Ornstein-Uhlenbeck Process, Appl Math Optim 57 (2008), 401–429.

[MT13] B. Maslowski and C. A. Tudor, Drift parameter estimation for

infinite-dimensional fractional Ornstein–Uhlenbeck process, Bull.

Sci. math. 137 (2013), 880–901.

[NAY62] J. Nagumo, S. Arimoto, and S. Yoshizawa, An Active Pulse

Transmission Line Simulating Nerve Axon, Proc. IRE 50 (1962),

no. 10, 2061–2070.

[NVV99] Ilkka Norros, Esko Valkeila, and Jorma Virtamo, An elementary

approach to a Girsanov formula and other analytical results on

fractional Brownian motions, Bernoulli 5 (1999), no. 4, 571–587.

[Ouv78] Jean-Yves Ouvrard, Martingale Projection and Linear Filtering

in Hilbert Spaces. I: The Theory, SIAM J. Control and Opti-

mization 16 (1978), no. 6, 912–937.


[Pas80] Joseph E. Pasciak, Spectral and pseudospectral methods for ad-

vection equations, Math. Comp. 35 (1980), no. 152, 1081–1092.

[Paz83] A. Pazy, Semigroups of Linear Operators and Applications to

Partial Differential Equations, Applied Mathematical Sciences,

vol. 44, Springer-Verlag New York, 1983.

[Pes95] Szymon Peszat, Existence and uniqueness of the solution for

stochastic equations on Banach spaces, Stochastics and Stochas-

tics Reports 55 (1995), no. 3-4, 167–193.

[PFA+21] Gregor Pasemann, Sven Flemming, Sergio Alonso, Carsten Beta,

and Wilhelm Stannat, Diffusivity Estimation for Activator–

Inhibitor Models: Theory and Application to Intracellular Dynamics

of the Actin Cytoskeleton, Journal of Nonlinear Science

31 (2021), no. 59, 1–34.

[PR96] L. Piterbarg and B. Rozovskii, Maximum likelihood estimators

in the equations of physical oceanography, Stochastic Modelling

in Physical Oceanography (R. J. Adler, P. Müller, and B. L.

Rozovskii, eds.), Progress in Probability, vol. 39, Birkhäuser

Boston, 1996, pp. 397–421.

[PR97] L. Piterbarg and B. Rozovskii, On asymptotic problems of parameter estimation in

stochastic PDE’s: discrete time sampling, Math. Methods

Statist. 6 (1997), no. 2, 200–223.

[PR00] B. L. S. Prakasa Rao, Bayes estimation for some stochastic partial

differential equations, Journal of Statistical Planning and

Inference 91 (2000), no. 2, 511–524.

[PR02] B. L. S. Prakasa Rao, Nonparametric Inference for a Class of Stochastic

Partial Differential Equations Based on Discrete Observations,

Sankhyā: The Indian Journal of Statistics, Series A 64 (2002),

no. 1, 1–15.

[PS20] Gregor Pasemann and Wilhelm Stannat, Drift estimation

for stochastic reaction-diffusion systems, Electronic Journal of

Statistics 14 (2020), no. 1, 547–579.

[PT07] Jan Pospíšil and Roger Tribe, Parameter estimates and exact

variations for stochastic heat equations driven by space-time

white noise, Stoch. Anal. Appl. 25 (2007), no. 3, 593–611.

[QSS00] Alfio Quarteroni, Riccardo Sacco, and Fausto Saleri, Numerical

Mathematics, Texts in Applied Mathematics, vol. 37, Springer-

Verlag New York, 2000.

[RR20] S. Reich and P. J. Rozdeba, Posterior contraction rates for non-

parametric state and drift estimation, Foundations of Data Science

2 (2020), no. 3, 333–349.

[Sch72] F. Schlögl, Chemical Reaction Models for Non-Equilibrium

Phase Transitions, Z. Physik 253 (1972), no. 2, 147–161.

[Sha71] H. S. Shapiro, Topics in Approximation Theory, Lecture Notes in

Mathematics, vol. 187, Springer-Verlag Berlin Heidelberg, 1971.

[Shi96] A. N. Shiryaev, Probability, second ed., Graduate Texts in Mathematics,

vol. 95, Springer-Verlag, 1996.

[Shu01] M. A. Shubin, Pseudodifferential Operators and Spectral Theory,

second ed., Springer-Verlag Berlin Heidelberg, 2001.

[Sog15] Christopher D. Sogge, Problems related to the concentration of

eigenfunctions, Journées EDP (2015), no. IX, 11 pages.

[SS07] Hart F. Smith and Christopher D. Sogge, On the $L^p$ norm

of spectral clusters for compact manifolds with boundary, Acta

Math. 198 (2007), no. 1, 107–153.

[SS15] Martin Sauer and Wilhelm Stannat, Lattice approximation for

stochastic reaction diffusion equations with one-sided Lipschitz

condition, Math. Comp. 84 (2015), no. 292, 743–766.

[SS16] Martin Sauer and Wilhelm Stannat, Analysis and approximation of stochastic nerve axon

equations, Math. Comp. 85 (2016), no. 301, 2457–2481.

[SST20] Radomyra Shevchenko, Meryem Slaoui, and C. A. Tudor, Generalized

k-variations and Hurst parameter estimation for the fractional

wave equation via Malliavin calculus, Journal of Statistical

Planning and Inference 207 (2020), 155–180.

[SV02] Jukka Saranen and Gennadi Vainikko, Periodic Integral and

Pseudodifferential Equations with Numerical Approximation,

Springer Monographs in Mathematics, Springer-Verlag Berlin

Heidelberg, 2002.

[Tho06] Vidar Thomée, Galerkin Finite Element Methods for Parabolic

Problems, second ed., Springer Series in Computational Mathematics,

vol. 25, Springer-Verlag Berlin Heidelberg, 2006.

[Tri10a] Hans Triebel, Theory of Function Spaces, Modern Birkhäuser

Classics, Birkhäuser Verlag/Springer Basel AG, 2010, reprint of

the 1983 edition.

[Tri10b] Hans Triebel, Theory of Function Spaces. II, Modern Birkhäuser Classics,

Birkhäuser Verlag/Springer Basel AG, 2010, reprint of the

1992 edition.

[TTV14] S. Torres, C. A. Tudor, and F. G. Viens, Quadratic variations

for the fractional-colored stochastic heat equation, Electron. J.

Probab. 19 (2014), no. 76, 1–51.

[Tud13] C. A. Tudor, Analysis of Variations for Self-similar Processes (A

Stochastic Calculus Approach), Probability and Its Applications,

Springer International Publishing Switzerland, 2013.

[TV07] C. A. Tudor and F. G. Viens, Statistical aspects of the fractional

stochastic calculus, Ann. Statist. 35 (2007), no. 3, 1183–1212.

[Unn89] T. E. Unny, Stochastic partial differential equations in groundwater

hydrology (Part I: Theory), Stochastic Hydrol. Hydraul. 3

(1989), 135–153.

[vdBHV15] Michiel van den Berg, Rainer Hempel, and Jürgen Voigt, $L^1$-estimates

for eigenfunctions of the Dirichlet Laplacian, J. Spectr.

Theory 5 (2015), no. 4, 829–857.

[vdV98] A. W. van der Vaart, Asymptotic Statistics, Cambridge Series

in Statistical and Probabilistic Mathematics, vol. 3, Cambridge

University Press, 1998.

[vNVW12] J. van Neerven, M. Veraar, and L. Weis, Maximal $L^p$-Regularity

for Stochastic Evolution Equations, SIAM J. Math. Anal. 44

(2012), no. 3, 1372–1414.

[Vog15] Hendrik Vogt, $L^1$-estimates for eigenfunctions and heat kernel

estimates for semigroups dominated by the free heat semigroup,

J. Evol. Equ. 15 (2015), no. 4, 879–893.

[WD01] Y. Wei and J. Ding, Representations for Moore-Penrose Inverses

in Hilbert Spaces, Applied Mathematics Letters 14 (2001), 599–604.

[Wey11] H. Weyl, Über die asymptotische Verteilung der Eigenwerte,

Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen,

Mathematisch-Physikalische Klasse (1911), 110–117.

[Wit85] Rainer Wittmann, A General Law of Iterated Logarithm, Z.

Wahrscheinlichkeitstheorie verw. Gebiete 68 (1985), no. 4, 521–543.

[Wit87] Rainer Wittmann, Sufficient Moment and Truncated Moment Conditions

for the Law of the Iterated Logarithm, Probab. Th. Rel. Fields

75 (1987), no. 4, 509–530.

[Yag10] Atsushi Yagi, Abstract Parabolic Evolution Equations and their

Applications, Springer Monographs in Mathematics, Springer-

Verlag Berlin Heidelberg, 2010.

[Yin93] Z. Ying, Maximum Likelihood Estimation of Parameters under

a Spatial Sampling Scheme, The Annals of Statistics 21 (1993),

no. 3, 1567–1590.

[Zlá73] Miloš Zlámal, Curved elements in the finite element method. I,

SIAM J. Numer. Anal. 10 (1973), no. 1, 229–240.
