Document [original]

Numerische Mathematik (2020) 145:655–692

https://doi.org/10.1007/s00211-020-01123-1

Numerische

Mathematik

Adaptive stochastic Galerkin FEM for lognormal coefficients

in hierarchical tensor representations

Martin Eigel1·Manuel Marschall1·Max Pfeﬀer2·Reinhold Schneider3

Received: 2 August 2018 / Revised: 17 January 2020 / Published online: 24 June 2020

Abstract

Stochastic Galerkin methods for non-affine coefficient representations are known to

cause major difficulties from theoretical and numerical points of view. In this work, an

adaptive Galerkin FE method for linear parametric PDEs with lognormal coefficients

discretized in Hermite chaos polynomials is derived. It employs problem-adapted

function spaces to ensure solvability of the variational formulation. The inherently

high computational complexity of the parametric operator is made tractable by using

hierarchical tensor representations. For this, a new tensor train format of the lognormal

coefficient is derived and verified numerically. The central novelty is the derivation of

a reliable residual-based a posteriori error estimator. This can be regarded as a unique

feature of stochastic Galerkin methods. It allows for an adaptive algorithm to steer

the refinements of the physical mesh and the anisotropic Wiener chaos polynomial

degrees. For the evaluation of the error estimator to become feasible, a numerically effi-

cient tensor format discretization is developed. Benchmark examples with unbounded

lognormal coefficient fields illustrate the performance of the proposed Galerkin dis-

cretization and the fully adaptive algorithm.

Mathematics Subject Classification 35R60 ·47B80 ·60H35 ·65C20 ·65N12 ·

65N22 ·65J10

BReinhold Schneider

[email protected]

Martin Eigel

[email protected]

Manuel Marschall

[email protected]

Max Pfeffer

max.pfef[email protected]

1Weierstrass Institute, Mohrenstrasse 39, 10117 Berlin, Germany

2Max Planck Institute for Mathematics in the Sciences, Inselstr. 22, 04103 Leipzig, Germany

3TU Berlin, Straße des 17. Juni 136, 10623 Berlin, Germany

123

656 M. Eigel et al.

1 Introduction

In the thriving field of uncertainty quantification (UQ), efficient numerical methods

for the approximate solution of random PDEs have been a topic of vivid research. As

common benchmark problem, one often considers the Darcy equation (as a model for

flow through a porous medium) with different types of random coefficients in order to

assess the efficiency of a numerical approach. Two important properties are the length

of the expansion of random fields, which often directly translates to the number of

independent random variables describing the variability in the model, and the type of

dependence on these random variables. The affine case with uniform random variables

has been studied extensively, since it represents a rather simple model which can easily

be treated with standard methods. Opposite to that, the lognormal case with Gaussian

random variables is quite challenging, from the analytical as well as numerical point

of view. A theory for the solvability of linear elliptic PDEs with respective unbounded

coefficients (and hence a lack of uniform ellipticity) in a variational formulation was

only developed recently in [25,41,49]. Computationally, the problems quickly become

difficult or even intractable with many stochastic dimensions, which might be required

to accurately represent the stochasticity in the random field expansion. This paper is

concerned with the development of an efficient numerical method for this type of

problems.

While popular sample-based Monte Carlo methods obtain dimension-independent

convergence rates, these are rather low despite often encountered higher regularity

of the parameter dependence. Moreover, such methods can only be used to evaluate

functionals of the solution (QoIs = quantities of interest) and an a posteriori error

control usually is not feasible reliably. Some recent developments in this field can e.g.

be found in [32,33,36,38] for the model problem with a lognormal coefficient. Some

ideas on a posteriori adaptivity for Monte Carlo methods can e.g. be found in [11,20].

An alternative are functional (often called spectral) approximations, which for

instance are obtained by Stochastic Collocation (SC) [2,43,44], the related Multi-

level Quadraure (MLQ) [30,31] and Stochastic Galerkin (SG1.) methods. The latter

in particular is popular in the engeneering sciences since it can be perceived as an

extension of classical finite element methods (FEM). These approaches provide a

complete parameter to solution map based on which e.g. statistical moments of the

stochastic solution can be evaluated. Notably, the regularity of the solution can be

exploited in order to obtain quasi-optimal convergence rates. However, the number of

random variables and nonlinear parameter representations have a significant impact on

the computational feasibility and techniques for a model order reduction are required.

Collocation methods with pointwise evaluations in the parameter space are usually

constructed either based on some a priori knowledge or by means of an iterative

refinement algorithm which takes into account the hierarchical surplus on possible

new discretization levels. While these approaches work reasonably well, methods for

a reliable error control do not seem immediate since the approximation relies only on

interpolation properties. Nevertheless, for the affine case and under certain assump-

tions, first ideas were recently presented in [28].

1We usually use SGFEM for Stochastic Galerkin FEM

123

Adaptive stochastic Galerkin FEM for lognormal… 657

The computationally more elaborate Stochastic Galerkin methods carry out an

orthogonal projection with respect to the energy norm onto a discrete space which is

usually spanned by a tensor basis consisting of FE basis functions in physical space

and polynomial chaos polynomials orthogonal with respect to the joint parameter

measure in (stochastic) parameter space. The use of global polynomials is justified

by the high (analytic) regularity of the solution map with respect to the parameters

[8,9,34]. However, in particular the large computational cost of Galerkin methods

make adaptivity and model reduction techniques a necessity.

In order to achieve this, different paths have been pursued successfully. As a first

approach, sparse approximations as in [15,16,19]or[4,5,10] with either a residual

based or a hierarchical a posteriori error estimators can be computed. Here, the aim is

to restrict an exponentially large discrete basis to the most relevant functions explictly

by iteratively constructing a quasi-optimal subset. In [16], convergence of the adaptive

algorithm could be shown. Moreover, adjoint based error estimators are considered in

[6,48].

As a second approach, an adaptive discretization in hierarchical tensor represen-

tations can be derived as described in [21]. These modern compression formats have

lately been investigated intensively in the numerical analysis community [3,29,45,46].

It has been examined that with appropriate assumptions the curse of dimensionality

can be alleviated, particularly so when employed with typical random PDE problems

in UQ, see [12,13] for examples with sample-based reconstruction strategies. Such

representations can be understood as an implicit model order reduction technique,

closely related to (but more general than e.g.) reduced basis methods.

In the mentioned adaptive approaches, the FE mesh for the physical space and

the parametric polynomial chaos space are adapted automatically with respect to the

considered problem. In the case of tensor approximations, also the ranks of the repre-

sentation are adjusted.

However, in all adaptive SGFEM research, so far only the affine case with uni-

form elliptic coefficient has been considered. In this paper, we extend the ASGFEM

approach developed in [21] to the significantly more involved case of lognormal

(non-affine) coefficients. This poses several severe complications analytically and

numerically. Analytical aspects have recently been tackled in [25,26,34,41]. Numer-

ically, in particular computationally efficient Galerkin methods are quite diffucult to

construct for this case and have not been devised. Compression techniques and adaptiv-

ity most certainly are required in order to make these problems tractable with SGFEM,

as described in this paper. Of particular interest is the construction of a computable

a posteriori error estimator, which also greatly benefits from using tensor formats. In

order to obtain a well-posed discretization, problem adapted spaces according to the

presentation in [49] are used.

Main contributions of this work are a representation of the coefficient in the tensor

train (TT) format, the operator discretization in tensor format and the derivation of an

reliable residual based error estimatator. This then serves as the basis for an adaptive

algorithm which steers all discretization parameters of the SGFEM. The performance

of the proposed method is demonstrated with some benchmark problems. Here, the

used field models are not artificially bounded or shifted away from zero.

123

658 M. Eigel et al.

We point out that, to our knowledge, an SGFEM for the lognormal case so far

has only been practically computed in the weighted function space setting in [42]

for a small 1d model as proof of concept. Moreover, there has not been any adaptive

numerical method with reliable a posteriori error estimation as derived in this work.

However, we note that our approach relies on the assumption that the coefficient is

discretized sufficiently accurately and hence the related discretization error can be

neglected. In practice, this can be ensured with high probability by sampling the error

of the discrete coefficient. Additionally, since constants in the error bound can become

quite large, we interpret the error estimate as a refinement indicator.

It should be noted that a functional adaptive evluation of the forward map allows

for the derivation of an explicit adaptive Bayesian inversion with functional tensor

representations as in [17]. The results of the present work lay the ground for a similar

approach with a Gaussian prior assumption. This will be the topic of future research.

Moreover, the described approach enables to construct SGFEM with arbitrary densi-

ties (approximated in hierarchical tensor formats). This generalization should also be

examined in more detail in further research. Lastly, while sparse discretizations seem

infeasible for the lognormal coefficient, a transformation [51] yields a convection-

diffusion formulation of the problem with affine parameter dependence, which then

again is amenable to an adaptive sparse SGFEM. This direction is examined in [14].

The structure of this paper is as follows: We first introduce the setting of parametric

PDEs with our model problem in Sect. 2. The variational formulation in problem

dependent weighted function spaces and the finite element (FE) setting are described in

Sect. 3. The employed tensor formats and the tensor representations of the coefficient

and the differential operator are examined in Sect. 4. As a central part, in Sect. 5

we derive the a posteriori error estimators and define a fully adaptive algorithm in

Sect. 6, including efficient ways to compute error indicators for physical and stochastic

refinement. Numerical examples in Sect. 7illustrate the performance of the presented

method and conclude the paper.

2 Setting and discretization

In the following, we introduce the considered model problem formally, present its weak

formulation and describe the employed discretizations in finite dimensional function

spaces. We closely follow the presentations in [26,34,49] regarding the lognormal

problem in problem-dependent function spaces. In [21], a related formulation for the

solution and evaluation of a posteriori error estimators for parametric PDEs with affine

coefficient fields in hierarchical tensor formats is derived.

2.1 Model problem

We assume some bounded domain D⊂Rd,d=1,2,3, with Lipschitz boundary ∂D

and a probability space (Ω, F,P).ForP-almost all ω∈Ω, we consider the random

elliptic problem

123

Adaptive stochastic Galerkin FEM for lognormal… 659

−div(a(x,ω)∇u(x,ω))=f(x)in D,

u(x,ω)=0on∂D.(2.1)

The coefficient a:D×Ω→ Rdenotes a lognormal, isotropic diffusion coefficient,

i.e., log(a)is an isotropic Gaussian random field.

Remark 2.1 The source term f∈L2(D)is assumed deterministic. However, it would

not introduce fundamental additional difficulties to also model fand the boundary

conditions as stochastic fields not correlated to the coefficient a(x,ω) as long as

appropriate integrability of the data is given.

For the coefficient a(x,ω)of (2.1), we assume a Karhunen-Loève type expansion

of b:= log(a)of the form

b(x,ω)=∞



=1

b(x)Y(ω), x∈D,P- almost all ω∈Ω.

Here, the parameter vector Y=(Y)∈Nconsists of independent standard normal

Gaussian random variables in R. Then, by passing to the image space (RN,B(RN), γ )

with the Borel σ-algebra B(RN)of all open sets of RNand the Gaussian product

measure

γ:= 

∈N

γwith γ:= γ1:= N1:= N(0,1)

and dγ1(y)=1

√2πexp(−y2

/2)dy,

we can consider the parameter vector y=(y)∈N=(Y(ω))∈N,ω∈Ω.

For any sequence β∈1(N)with

β:= bL∞(D)and β=(β)∈N,

we define the set

Γβ:= y∈RN:∞



=1

β|y|<∞.

The set Γβof admissible parameter vectors is γ-measurable and of full measure.

Lemma 2.2 ([34, Lemma 2.1]) For any sequence β∈1(N), there holds

Γβ∈B(RN)and γ(Γ

β)=1.

123

660 M. Eigel et al.

For any y∈Γβ, we define the deterministic parametric coefficient

a(x,y)=exp(b(x,y)) =exp ∞



=1

b(x)y,x∈D.(2.2)

This series converges in L∞(D)for all y∈Γβ.

Lemma 2.3 ([34, Lemma 2.2]) For all y ∈Γβ, the diffusion coefficient (2.2)is well-

defined and satisfies

0<ˇa(y):= ess inf

x∈Da(x,y)≤a(x,y)≤ess sup

x∈D

a(x,y)=: ˆa(y)<∞,

with

ˆa(y)≤exp ∞



=1

β|y|and ˇa(y)≥exp −∞



=1

β|y|.

Due to Lemmas 2.2 and 2.3 , we consider Γ=Γβas the parameter space instead

of RN. By Lemma 2.3, the stochastic coefficient a(x,y)is well defined, bounded from

above and admits a positive lower bound for almost all y∈Γ. Thus, the equations (2.1)

and (2.2) have a unique solution u(y)∈Xfor almost all y∈Γ.

Let X:= H1

0(D)denote the closed subspace of functions in the Sobolev space

H1(D)with vanishing boundary values in the sense of trace and define the norm

vX:= D|∇v(x)|2dx1/2

We denote by ·,· = ·,·X∗,Xthe duality pairing of X∗and Xand consider f∈

L2(D)as an element of the dual X∗=H−1(D).

For any y∈Γ, the variational formulation of (2.1) reads: find u(y)∈Xsuch that

B(u(y), v;y)=f,v,for all v∈X,(2.3)

where B:X×X×Γ→Ris defined by

B(u(y), v;y):= D

a(x,y)∇u(y)·∇vdx.

Hence, pathwise existence and uniqueness of the solution u(y)is obtained by the

Lax-Milgram lemma due to uniform ellipticity for any fixed y∈Γ. In particular, for

all y∈Γ,(2.3), it holds

u(y)≤ 1

ˇa(y)fX∗,

123

Adaptive stochastic Galerkin FEM for lognormal… 661

with some 0 <ˇa(y)≤a(x,y)on D. The integration of (2.3) over Γwith respect

to the standard normal Gaussian measure γdoes not lead to a well-defined problem

since the coefficient a(x,y)is not uniformly bounded in y∈Γand not bounded away

from zero. Hence, a more involved approach has to be pursued, which is elaborated

in Sect. 3.2. Alternative results for this equation were presented in [25,26,49].

The formulation of (2.1) as a parametric deterministic elliptic problem with solution

u(y)∈Xfor each parameter y∈Γreads

−div(a(x,y)∇u(x,y)) =f(x)for x∈D,u(x,y)=0forx∈∂D.

3 Variational formulation and discretization

This section is concerned with the introduction of appropriate function spaces required

for the discretization of the model problem. In particular, a problem-adapted proba-

bility measure is introduced which allows for a well-defined formulation of the weak

problem rescaled polynomial chaos basis.

3.1 Problem-adapted function spaces

Let Fbe the set of finitely supported multi-indices

F:= {μ∈N∞

0;|supp μ|<∞} where supp μ:= {m∈N;μm= 0}.

A full tensor index set of order M∈Nis defined by

Λ:= {(μ1,...,μM,0,...)∈F:μm=0,...,dm−1,m=1,...,M}

≃Λ1×...×ΛM×{0}... ⊂F,

with complete index sets of size dmgiven by

Λm:= {0,...,dm−1},m=1,...,M.

For any such subset Λ⊂F, we define supp Λ:= μ∈Λsupp μ⊂N.

We denote by (Hn)∞

n=0the orthonormal basis of L2(R,γ

m)=L2(R,γ

1)with

respect to the standard Gaussian measure consisting of Hermite polynomials Hnof

degree n∈N0on R. An orthogonal basis of L2(Γ , γ ) is obtained by tensorization

of the univariate polynomials, see [49,50]. To reduce notation, we drop the explicit

dependency on the sigma-algebra which is always assumed to be rich enough. For any

multi-index μ∈F, the tensor product polynomial Hμ:= ∞

m=1Hμmin y∈Γis

expressed as the finite product

Hμ(y)=∞



m=1

Hμm(ym)=

m∈supp μ

Hμm(ym).

123

662 M. Eigel et al.

For practical computations, an analytic expression for the triple product of Hermite

polynomials can be used.

Lemma 3.1 ([40,50]) For μ, ν, ξ ∈F,m∈N, it holds

HνmHμm=

min(νm,μm)



ξm=0

κνm,μm,νm+μm−2ξmHνm+μm−2ξm,

where for ηm=νm+μm−2ξm

κνm,μm,ηm:= R

Hνm(ym)Hμm(ym)Hηm(ym)dγm(ym)

=⎧

⎨

⎩

√νm!μm!ηm!

ξm!(νm−ξm)!(μm−ξm)!,νm+μm−ηmis even and

|νm−μm|≤ηm≤νm+μm,

0,otherwise.

Lemma 3.2 ([22,41]) Let Y ∼N1,t∈Rand X =exp(tY)∈L2(R,γ). The expan-

sion of X =exp(tY)in Hermite polynomials is given by

X=exp(tY)=

n∈N0

cnHnwith cn=tn

√n!exp(t2/2).

We recall some results from [49] required in our setting. Let σ=(σm)m∈N∈

exp(1(N)) and define

γσ:= ∞



m=1

γσm:= ∞



m=1

Nσ2

m:= ∞



m=1

N(0,σ2

m).

Then, dNσ2

m=ζσ,mdN1where

ζσ,m(ym):= 1

σm

exp −1

2(σ−2

m−1)y2

m

is the one-dimensional Radon–Nikodym derivative of γσmwith respect to γm, i.e., the

respective probability density. We assume that the sequence σdepends exponentially

on β=(βm)m∈Nand some ∈R, namely

σm() := exp(βm), m∈N,

and define

γ:= γσ() and ζ,m:= ζσ(),m.

123

Adaptive stochastic Galerkin FEM for lognormal… 663

By multiplication, this yields the multivariate identity

dγ(y)=ζ(y)dγ(y)with ζ(y)=∞



m=1

ζ,m(ym).

A basis of orthonormal polynomials with respect to the weighted measure γcan be

defined by the transformation

τ:R∞→R∞,(ym)m∈N→ e−βmymm∈N.

Then, for all v∈L2(Γ , γ ),

Γ

v(y)dγ(y)=Γ

v(τ(y)) dγ(y).

We define the scaled Hermite polynomials Hτ

μ:= Hμ◦τ.

Remark 3.3 Lemmas 3.1 and 3.2 are also valid with the transformed multivariate Her-

mite polynomials Hτ. In particular, κξm,νm,μmdoes not change under transformation

and the expansion in Lemma 3.2 holds by substituting t∈Rwith σmtin the corre-

sponding dimension m∈N.

3.2 Weak formulation in problem-dependent spaces

In order to obtain a well-posed variational formulation of (2.1)onL2(Γ , γ ;X),we

follow the approach in [49] and introduce a measure γwhich is stronger than γand

assume integrability of fwith respect to this measure. For >0 and 0 ≤θ<1, let

the bilinear form Bθ :Vθ ×Vθ →Rbe given by

Bθ(w, v) := ΓD

a(x,y)∇w(x,y)·∇v(x,y)dxdγθ(y). (3.1)

The solution space is then defined as the Hilbert space

Vθ := {v:Γ→XB(Γ )-measurable ;Bθ(v, v) < ∞},

endowed with the inner product Bθ(·,·), the induced energy norm vθ :=

Bθ(v, v)1/2for v∈Vθ and the respective dual pairing ·,·θ between V∗

θ and

Vθ. The different employed spaces are related as follows.

Lemma 3.4 ([49, Proposition 2.43]) For 0<θ<1,

L2(Γ , γ;X)⊂Vθ ⊂L2(Γ , γ ;X)

are continuous embeddings.

123

664 M. Eigel et al.

It can be shown that the bilinear form Bθ(·,·)is Vθ-elliptic in the sense of the

following Lemma.

Lemma 3.5 ([49, Lemma 2.41, 2.42]) For w, v ∈L2(Γ , γ;X),

|Bθ(w, v)|≤ˆcϑwL2(Γ ,γ;X)vL2(Γ ,γ;X)(3.2)

and for v∈L2(Γ , γ ;X),

Bθ(v, v) ≥ˇcϑv2

L2(Γ ,γ ;X).(3.3)

Moreover, for f∈X∗we define the continuous linear form

Fθ(v) := ΓD

f(x)v(x,y)dxdγθ(y).

For Fθ ∈V∗

θ,(3.2) and (3.3) in particular lead to the unique solvability of the

variational problem in Vθ,

Bθ(u,v)=Fθ(v) for all v∈Vθ,

and u∈Vθ is the unique solution of (2.1).

3.3 Deterministic discretization

We discretise the deterministic space Xby a conforming finite element space

Xp(T):= span{ϕi}N

i=1⊂Xof degree pon some simplicial regular mesh Tof

domain D with the set of faces S(i.e., edges for d=2) and basis functions ϕi.

In order to circumvent complications due to an inexact approximation of boundary

values, we assume that Dis a polygon. We denote by Pp(T)the space of piecewise

polynomials of degree pon the triangulation T. The assumed FE discretization with

Lagrange elements of order pthen satisfies Xp(T)⊂Pp(T)∩C(T). For any element

T∈Tand face F∈S, we set the entity sizes hT:= diam Tand hF:= diam F.Let

nFdenote the exterior unit normal on any face F. The jump of some χ∈H1(D;Rd)

on F=T1∩T2in normal direction [[χ]]Fis then defined by

[[χ]]F:= χ|T1·nF−χ|T2·nF.

By ωTand ωFwe denote the element and facet patches defined by the union of all

elements which share at least a vertex with Tor F, respectively. Consequently, the

Clément interpolation operator I:X→Xp(T)satisfies

v−IvL2(T)≤cThT|v|X,ωTfor T∈T,

v−IvL2(F)≤cSh1/2

F|v|X,ωFfor F∈S,

123

Adaptive stochastic Galerkin FEM for lognormal… 665

where the seminorms |·|

X,ωTand |·|

X,ωFare the restrictions of ·

Xto ωTand

ωF, respectively.

The fully discrete approximation space with |Λ|<∞is given by

VN:= VN(Λ;T,p):= vN(x,y)=

μ∈Λ

vN,μ(x)Hτθ

μ(y);vN,μ ∈Xp(T),

and it holds VN(Λ;T,p)⊂Vθ. The Galerkin projection of uis the unique uN∈

VN(Λ;T,p)which satisfies

Bθ(uN,vN)=Fθ(vN)for all vN∈VN(Λ;T,p).

We define a tensor product interpolation operator I:L2(Γ , γ ;X)→VN(Λ;T,p)

for v=μ∈FvμHμ∈L2(Γ , γ ;X)by setting

Iv:= 

μ∈Λ

(Ivμ)Hμ.

For v∈Vϑ(Λ) := v=μ∈ΛvμHμ;vμ∈Xand all T∈T, this yields the

interpolation estimate

(id −I)vL2(Γ ,γ ;L2(T)) =⎛

⎝Γ

μ∈Λ

(id −I)vμHμ(y)2

L2(T)dγ(y)⎞

⎠

1/2

=⎛

⎝

μ∈Λ(id −I)vμ2

L2(T)⎞

⎠

1/2

≤cThT⎛

⎝Γ

μ∈Λ

vμHμ(y)2

X,ωT

dγ(y)⎞

⎠

1/2

=cThT|v|Vϑ,ωT.(3.4)

Likewise, on the edges F∈Swe derive

v−IvL2(Γ ,γ ;L2(F)) ≤cSh1/2

F|v|Vϑ,ωF.

Here,

|v|2

Vϑ,ωT:= Γ|v(y)|2

X,ωTdγ(y),

|v|2

Vϑ,ωF:= Γ|v(y)|2

X,ωFdγ(y).

123

666 M. Eigel et al.

Theorem 3.6 ([49, Theorem 2.45]). If f ∈Lp(Γ , γ;X∗)for p >2, the Galerkin

projection uN∈VNsatisfies

u−uNL2(Γ ,γ ;X)≤ˆcϑ

ˇcϑ

inf

vN∈VN(Λ;T,p)u−vNL2(Γ ,γ;X).

Remark 3.7 It should be noted that the constant ˆcϑ

ˇcϑ tends to ∞as →{0,∞} and

for θ→{0,1}, see Remark 2.46 in [49]. This is expected, as the problem is ill-posed

in these limits. In order to obtain reasonable upper error bounds, the parameters have

hence to be chosen judiciously. A more detailed investigation of an optimal parameter

choice is postponed to future research.

4 Decomposition of the operator

In this section, we introduce the discretization of the operator in an appropriate tensor

format. For this, an efficient representation of the non-linear coefficient is derived. We

first introduce basic aspects of the employed Tensor Train (TT) format.

4.1 The tensor train format

We only provide a brief overview of the notation used regarding the tensor train

representation. For further details, we refer the reader to [3,21] and the references

therein.

Any function wN∈VN(Λ;T,p)can be written as

wN=

N−1



i=0

μ∈Λ

W(i,μ)ϕ

iHτθ

μ.

Thus, the discretization space is isomorphic to the tensor space, namely

VN(Λ;T,p)≃RN×d1×···×dM.

The tensor Wgrows exponentially with the order M, which constitutes the so called

curse of dimensionality. We employ a low-rank decomposition of the tensor for a

dimension reduction. In this paper, we adhere to the Tensor Train (TT) format for

tensor decomposition [46]. This seems reasonable, as the components (of the operator

and hence the solution) are of decreasing importance due to the decay of the coefficient

functions bmand therefore we can expect decreasing ranks in the tensor train format.

Nevertheless, other tensor formats are also feasible in principle.

The TT representation of a tensor W∈RN×d1×···×dMis given as

W(i,μ

1,...,μM)=



k1=1···



kM=1

W0(i,k1)



m=1

Wm(km,μ

m,km+1).

123

Adaptive stochastic Galerkin FEM for lognormal… 667

For simplicity of notation, we set r0=rM+1=1. If all dimensions rmare min-

imal, then this is called the TT decomposition of Wand rankTT(W):= r=

(1,r1,...,rM,1)is called the TT rank of W. The TT decomposition always exists

and it can be computed in polynomial time using the hierarchical SVD (HSVD)

[35]. A truncated HSVD yields a quasi-optimal approximation in the Frobenius norm

[27,29,39,46]. Most algebraic operations can be performed efficiently in the TT format

[3].

Once the function wNhas a low-rank TT decomposition, it is advisable to obtain a

similar representation for the Galerkin operator on VN(Λ;T,p)in order to allow for

efficient tensor solvers. For the lognormal coefficient a(x,y), this can only be done

approximately.

Later, it will be useful to express the storage complexity of a tensor train. We

distinguish the degrees of freedom given by the tensor train representation and the

full (uncompressed) degrees of freedom. For a tensor U∈Rq0×...×qLof TT-rank

r=(1,r1,...,rL,1), the dimension of the low-rank tensor manifold is given by

tt-dofs(U):=

L−1



=1

(rqr+1−r2

+1)+rLqL,(4.1)

while the dimension of the full tensor space and hence its representation is

full-dofs(U):=



=0

q.

One can conclude from (4.1) that the complexity of tensor trains depend only

linearly on the dimension, i.e., we have to store

O(Lˆqˆr2), ˆr=max {r}ˆq=max {q0,...,qL}

entries instead of O(ˆqL), which is much smaller for moderate TT-ranks r.

4.2 TT representation of the non-linear coefficient

We approximate the coefficient

a(x,y)=exp ∞



=1

b(x)y=∞



=1

eb(x)y(4.2)

using the coefficient splitting algorithm described in [47]. This results in a discretized

coefficient on a tensor set

Δ={(ν1,...,νL,0,...)∈F:ν=0,...,q−1,=1,...,L}

123

668 M. Eigel et al.

with TT-rank s=(1,s1,...,sL,1). Here, we exploit Lemma 3.2, i.e., the fact that

every factor of (4.2) has a Hermite expansion of the form

exp(b(x)y)=∞



ν=0

c()

ν(x)Hτθ

ν(y)(4.3)

with

c()

ν(x)=(b(x)σ(ϑ))ν

√ν!exp((b(x)σ(ϑ))2/2).

The procedure is as follows: First, we fix an adequate quadrature rule for solving

the involved integrals by choosing quadrature points χq∈Dand weights wq∈Rfor

q=1,...,Pquad. We begin the discretization at the right most side and define the

correlation matrix

CL(νL,ν

L):=

Pquad



q=1

c(L)

νL(χq)c(L)

ν

(χq)wq

≈D

c(L)

νL(x)c(L)

ν

(x)dx

for νL,ν

L=0,...,qL−1. This means that we have truncated the expansion (4.3)

according to the tensor set Δ, which yields an approximation of the factors. This

matrix is symmetric and positive semidefinite and it therefore admits an eigenvalue

decomposition

CL(νL,ν

L)=



kL=1

λkLAL(kL,νL)AL(kL,ν

L).

This yields reduced basis functions

˜c(L)

kL(χq):=

qL−1



νL=0

AL(kL,νL)c(L)

νL(χq)

for kL=1,...,sLthat we can store explicitly at the quadrature points of the integral.

If we choose sL=qLthen this is just a transformation without any reduction.

We proceed successively for =L−1,...,1 by defining correlation matrices

C(ν,k+1,ν

L,k

+1):=

Pquad



q=1

c()

ν(χq)˜c(+1)

k+1(χq)c()

ν



(χq)˜c(+1)

k

+1

(χq)wq

123

Adaptive stochastic Galerkin FEM for lognormal… 669

Algorithm 1 Algorithm for coefficient splitting.

Input: Coefficients c(1)

ν1,...,c(L)

νL,ν

=0,...,q−1,=1,...,L;

ranks s1,...,sL;

quadrature rule (χq,w

q), q=1,...,Pquad;

Output: TT Tensor components A1,...,AL;

Init sL+1=1;˜c(L+1)

1≡1

for =L,…,1do

Arrange correlation matrix C:

C(ν,k+1,ν

,k

+1):=

Pquad



q=1

c()

ν(χq)˜c(+1)

k+1(χq)c()

ν



(χq)˜c(+1)

k

+1

(χq)wq;

Compute eigenvalue decomposition:

C(ν,k+1,ν

,k

+1)=

s



k=1

λkA(k,ν

,k+1)A(k,ν

,k

+1);

Set reduced basis functions:

˜c

k(χq)=

q−1



ν=0

s+1



k+1=1

A(k,ν

,k+1)c

ν(χq)˜c+1

k+1(χq);

end for

return A1,...,AL;

with eigenvalue decompositions

C(ν,k+1,ν

L,k

+1)=

q



k=1

λkA(k,ν

,k+1)A(k,ν

,k

+1)

and the resulting reduced basis functions at the quadrature points

˜ck()(χp)=

q−1



ν=0

s



k+1=1

A(k,ν

,k+1)c()

ν(χq)˜c(+1)

k+1(χq),

see Algorithm 1.

This results in a first component

a0[k1](χq):= ˜c(1)

k1(χq)

for k1=1,...,s1. Note that on the one hand, it is possible to evaluate this component

at any point x∈Dby converting the reduced basis functions back into their original

form by means of the tensor components Aand the coefficient functions c()

ν.More

123

670 M. Eigel et al.

Fig. 1 A coefficient splitting for L=4

specifically,

a0[k1](x)=



ν1=0



k2=1

A1(k1,ν

1,k2)c(1)

ν1(x)˜c(2)

k2(x)

=...



k2=1···



kL=1



=1q−1



ν=0

A(k,ν

,k)c()

ν(x).

On the other hand, each original coefficient function is approximated by the reduced

basis representation

c(L)

νL(x)≈



kL=1

AL(kL,νL)˜c(L)

kL(x),

c()

ν(x)˜c(+1)

k+1≈

s



k=1

A(k,ν

,k+1)˜c()

kfor all =L−1,...,1.

This approximation is exact if the ranks s=(s1,...,sL)are full.

By the described procedure we obtain an approximate discretization aΔ,s≈ain a

TT-like format that is continuous in the first component and that has the decomposition

aΔ,s(x,y)=



k1=1···



kL=1

a0[k1](x)

ν∈Δ



=1

A(k,ν

,k+1)Hτθ

ν(y),(4.4)

see Fig. 1.

On VN(Λ;T,p)the linear Galerkin operator Ais in the TT matrix or matrix product

operator (MPO) format:

A(i,μ, j,μ

)=



k1=1···



kM=1

A0(i,j,k1)



m=1

Am(km,μ

m,μ



m,km+1), (4.5)

123

Adaptive stochastic Galerkin FEM for lognormal… 671

where

A0(i,j,k1)=D

a0[k1](x)∇ϕi·∇ϕjdx

and for all m=1,...,M−1

Am(km,μ

m,μ



m,km+1)=

qm−1



νm=0

Am(km,ν

m,km+1)

×R

Hτθ

νmHτθ

μmHτθ

μ

mdγθ,m(ym)

qm−1



νm=0

Am(km,ν

m,km+1)κμm,μ

m,νm

and

AM(km,μM,μ

M)=

qM−1



νM=0

sm+1



km+1=1···



kL=1

AM(km,νM,km+1)κμM,μM,νM



=m+1

A(k,0,k+1).

Since the integral over the triple product κμm,μ

m,νm=0 for all νm>2max(μm,μ



m),

it is sufficient to set q=2d−1 for all =1,...,L. If the rank sof the decomposition

of the coefficient is full, then the discretised coefficient aΔ,sis exact on the discrete

space VN(up to quadrature errors).

However, this is generally infeasible as the rank would grow exponentially with M.

Therefore, a truncation of the rank becomes necessary and the coefficient is only an

approximation. We assume in the following that the error that is due to this approx-

imation of the coefficient is small. A thorough estimation of this error is subject to

future research.

Remark 4.1 A similar approach to decomposing the coefficient has been chosen in [23]

where the knowledge of the eigenfunctions of the covariance operator was assumed a

priori. This means that one has an orthogonal basis also for the deterministic part in x

and all that remains to do is to decompose the coefficient tensor for this basis represen-

tation. This is also done using some quadrature and the L2-error of this approximation

can be estimated.

According to the discretization of the coefficient, we introduce the splitting of the

operator,

A(v) =A+(v) +A−(v),

123

672 M. Eigel et al.

with the active and inactive operators A+:VN(Λ;T,p)→VN(Λ;T,p)∗and

A−:VN(Λ;T,p)→VN(Λ;T,p)∗defined by

A+(v) := −div(aΔ,s∇v),

A−(v) := −div(a−aΔ,s)∇v,(4.6)

for v∈VN(Λ;T,p).

The above considerations yield that the approximate operator A+can be represented

by the following tensor product structure

A+=



k1=1···



k=1

A0[k1]⊗A1[k1,k2]⊗···⊗AL[kL],

with the operator components

A0[k1]:X→X∗,A0[k1](vx)=−div(a0[k1]∇vx),

and for all =1,...,L,

A[k,k+1]:L2(R,γ

ϑ,m)→L2(R,γ

ϑ,m)

A[k,k+1](vy)=

q−1



ν=0

A(k,ν

,k+1)Hτθ

νvy.

5 Error estimates

This section is concerned with the derivation of a reliable a posteriori error estimator

based on the stochastic residual. In comparison to the derivation in [15,16,21], the log-

normal coefficient requires a more involved approach directly related to the employed

weighted function spaces introduced in Sect. 3. In theory, an additional error occurs

because of the discretization of the coefficient which we assume to be negligible. The

developed adaptive algorithm makes possible a computable a posteriori steering of the

error components by a refinement of the FE mesh and the anisotropic Hermite poly-

nomial chaos of the solution. The efficient implementation is due to the formulation

in the TT format the ranks of which are also set adaptively.

The definition of the operators as in (4.6) leads to a decomposition of the residual

R(v) =R+(v) +R−(v),

with

R+(v) := f−A+(v), R−(v) := −A−(v).

123

Adaptive stochastic Galerkin FEM for lognormal… 673

The discrete solution wN∈VNreads

wN=

N−1



i=0

μ∈Λ

W(i,μ)ϕ

iHτθ

μ.

We assume that the operator is given in its approximate semi-discrete form A+and

aim to estimate the energy error

u−wN2

A+=ΓD

aΔ,s|∇(u−wN)|2dxdγϑ(y).

Remark 5.1 As stated before, we assume that the error that results from approximating

the coefficient is small. Estimation of this error is subject to future research. Work in

this direction has e.g. been carried out in [7,23]. Additionally, we require that the

bounds (3.2) and (3.3) still hold, possibly with different constants ˆc+

ϑ and ˇc+

ϑ.This

is for example guaranteed if aΔ,sis positive, i.e., if

aΔ,s(x,y)>0∀x∈D,y∈Γ.

Then, since the approximated coefficient is polynomial in y, the arguments in

Lemma 3.5 yield the same constants

ˆc+

ϑ =ˆcϑ,ˇc+

ϑ =ˇcϑ.

We recall Theorem 5.1 from [15] and also provide the proof for the sake of a

complete presentation. Note that the result allows for non-orthogonal approximations

wN∈VN.

Theorem 5.2 Let VN⊂Vϑ a closed subspace and wN∈VN, and let uNdenote the

A+Galerkin projection of u onto VN. Then it holds

u−wN2

A+≤sup

v∈Vθ\{0}

|R+(wN), (id −I)vθ|

ˇc+

θvL2(Γ ,γ ;X)+cIuN−wNA+2

+uN−wN2

A+.

Here, Idenotes the Clément interpolation operator in (3.4) and cIis the operator

norm of id −Iwith respect to the energy norm ·A+. The constant ˇc+

θ is derived from

the assumed coercivity of the bilinear form induced by A+similar to (3.2) and (3.3).

Proof Due to Galerkin orthogonality of uN, it holds

u−wN2

A+=u−uN2

A++uN−wN2

A+.

123

674 M. Eigel et al.

By the Riesz representation theorem, the first part is

u−uNA+=sup

v∈Vθ\{0}

|R+(uN), vθ|

vA+

We now utilise the Galerkin orthogonality and introduce the bounded linear map

I:Vθ →VNto obtain

u−uNA+=sup

v∈Vθ\{0}

|R+(uN), (id −I)vθ|

vA+

Since we do not have access to the Galerkin solution uN, we reintroduce wN

u−uNA+≤sup

v∈Vθ\{0}

|R+(wN), (id −I)vθ|

vA+

+|R+(uN)−R+(wN), (id −I)vθ|

vA+

≤sup

v∈Vθ\{0}

|R+(wN), (id −I)vθ|

vA+

+uN−wNA+(id −I)vA+

vA+

≤sup

v∈Vθ\{0}

|R+(wN), (id −I)vθ|

vA++cIwN−uNA+.

We apply the coercivity of the operator A+to the denominator, which yields the

desired result. For the last inequality, we used the boundedness of Iin the energy

norm by defining the constant as the operator norm

cI:= sup

v∈Vθ\{0}

(id −I)vA+

vA+

Since the product of the Hermite polynomials for each m=1,...,Mhas degree

at most qm+dm−2, it is useful to define the index set

Ξ:= Δ+Λ:= η=(η1,...,ηL,0,...):

ηm=0,...,qm+dm−2,m=1,...,M;

η=0,...,q−1,=M+1,...,L.

Then, the residual can be split into an active and an inactive part by using the tensor

sets Ξand Λ,

R+(wN)=f−A+(wN)

123

Adaptive stochastic Galerkin FEM for lognormal… 675

=f+

η∈Ξ

div res(·,η)Hτθ

=R+,Λ(wN)+R+,Ξ\Λ(wN),

with

R+,Λ(wN)=f+

η∈Λ

div res(·,η)Hτθ

η,

R+,Ξ\Λ(wN)=

η∈Ξ\Λ

div res(·,η)Hτθ

η,

where div res(·,η)∈X∗for all η∈Ξ.

For all η∈Ξ, the function res is given as

res(x,η)=



k1=1···



kM=1



k

1=1···



k

L=1

res0[k1,k

1](x)

×M



m=1

Rm(km,k

m,η

m,km+1,k

m+1)



=M+1

A(k

,η

,k

+1)

with continuous first component

res0[k1,k

1](x)=

N−1



i=0

a0[k

1](x)W0(i,k1)∇ϕi(x)

and stochastic components for m=1,...,M,

Rm(km,k

m,η

m,km+1,k

m+1)

qm−1



νm=0

dm−1



μm=0

A(k

m,ν

m,k

m+1)Wm(km,μ

m,km+1)κμm,νm,ηm.

The function res is again a TT tensor with continuous first component with TT ranks

rmsmfor m=1,...,Mand sfor =M+1,...,L. The physical dimensions are

dm+qm−2 for all m=1,...,Mand d−1for=M+1,...,L.

The above considerations suggest that the error can be decomposed into errors that

derive from the respective approximations in the deterministic domain, the parametric

domain and in the ranks. This is indeed the case, as we will see in the following. In a

nutshell, if uNis the Galerkin solution in VNand uΛis the Galerkin solution in the semi-

discrete space V(Λ), then the deterministic error errdet =uΛ−uNA+corresponds

to the error of the active residual R+,Λ,theparametric error errparam =u−uΛA+

corresponds to the inactive residual R+,Ξ\Λand the error made by restricting the

ranks is the error in the discrete space errdisc(wN)=uN−wN2

A+, see Fig. 2for an

illustration.

123

676 M. Eigel et al.

Fig. 2 An illustration of the different errors

5.1 Deterministic error estimation

We define the deterministic error estimator

estdet(wN):= 

T∈T

est2

det,T(wN)+

F∈S

est2

det,F(wN)1/2

estdet,T(wN):= hTf+div σθ

Λ(wN)ζϑL2(Γ ,γ ;L2(T)),

estdet,F(wN):= h1/2

F[[σθ

Λ(wN)]]FζϑL2(Γ ,γ ;L2(F)),

where the flow is given by the residual contributions

σθ

Λ:= 

η∈Λ

res(·,η)Hτθ

η.

This estimates the active residual as follows.

Proposition 5.3 For any v∈Vϑ and any wN∈VN, it holds

|R+,Λ(wN), (id −I)vθ|

vL2(Γ ,γ ;X)≤cdet estdet(wN).

Proof By localization to the elements of the triangulation Tand integration by parts,

R+,Λ(wN), (id −I)vθ

=Γ

T∈TT

f(id −I)v−aΔ,s∇wN·∇(id −I)vdxdγϑ(y)

123

Adaptive stochastic Galerkin FEM for lognormal… 677

=

T∈TΓTf+div σθ

Λ(wN)(id −I)vζϑ(y)dxdγ(y)

+

F∈SΓF[[σθ

Λ(wN)]]F(id −I)vζϑ(y)d(x)dγ(y).

The Cauchy-Schwarz inequality yields

|R+,Λ(wN), (id −I)vθ|

≤

T∈Tf+div σθ

Λ(wN)ζϑL2(Γ ,γ ;L2(T))(id −I)vL2(Γ ,γ ;L2(T))

+

F∈S[[σθ

Λ(wN)]]FζϑL2(Γ ,γ ;L2(F))(id −I)vL2(Γ ,γ ;L2(F)).

With the interpolation properties (3.4) we obtain

|R+,Λ(wN), (id −I)vθ|

≤

T∈T

hTcTf+div σθ

Λ(wN)ζϑL2(Γ ,γ ;L2(T))|v|Vϑ,ωT

+

F∈S

h1/2

FcS[[σθ

Λ(wN)]]FζϑL2(Γ ,γ ;L2(F))|v|Vϑ,ωF.

Since the overlaps of the patches ωTand ωFare bounded uniformly, a Cauchy-

Schwarz estimate leads to

|R+,Λ(wN), (id −I)vθ|≤cdet estdet(wN)vL2(Γ ,γ ;X).

Here, the constant cdet depends on the properties of the interpolation operator (3.4).

Remark 5.4 Note that an L2-integration of the residual, which is an element of the dual

space V∗

ϑ, is possible since the solution consists of finite element functions. These

are piecewise polynomial and thus smooth on each element T∈T.

5.2 Tail error estimation

The parametric or tail estimator is given by

estparam(wN):= ⎛

⎝ΓD

η∈Ξ\Λ

res(x,η)Hτθ

η(y)ζ

ϑ(y)2

dxdγ(y)⎞

⎠

1/2

and bounds the parametric error as follows.

123

678 M. Eigel et al.

Proposition 5.5 For any v∈Vϑ and any wN∈VN, it holds

|R+,Ξ\Λ(wN), (id −I)vθ|

vL2(Γ ,γ ;X)≤estparam(wN).

Proof.Recall that R+,Ξ\Λ(wN), Ivθ =0 since Iv∈VN.

Instead of factorizing out the L∞-norm of the diffusion coefficient as in [15,16,21],

we use the Cauchy-Schwarz inequality to obtain

R+,Ξ\Λ(wN), vθ

=ΓD

η∈Ξ\Λ

res(x,η)Hτθ

η(y)·∇v(x,y)ζ

ϑ(y)dxdγ(y)

≤ΓD

η∈Ξ\Λ

res(x,η)Hτθ

η(y)ζ

ϑ(y)2

dxdγ(y)vL2(Γ ,γ ;X)

=estparam(wN)vL2(Γ ,γ ;X).

5.3 Algebraic error estimation

In order to define the algebraic error estimator, we need to state the linear basis change

operator that translates integrals over two Hermite polynomials in the measure γϑ to

the measure γ:

Hϑ→0:RN×d1×···×dM→RN×d1×···×dM,

Hϑ→0:= Z0⊗Z1⊗···⊗ZM,

Z0(i,j):= D∇ϕi·∇ϕjdx,

Zm(μm,μ



m):= R

Hτθ

μm(ym)Hτθ

μ

m(ym)dγm(ym)for all m=1,...,M.

This yields the estimator

estdisc(wN):= (A(W)−F)H−1/2

ϑ→02(RN×d1×···×dM).(5.1)

Proposition 5.6 For any wN∈VNand the Galerkin solution uN∈VN, it holds

uN−wNA+≤(ˇc+

ϑ)−1estdisc(wN).

Proof For vN=N−1

i=0μ∈ΛV(i,μ)ϕ

iHτθ

μ∈VN, it holds

ΓD∇vN·∇vNdxdγ(y)=VHϑ→0,V=VH1/2

ϑ→02

2(RN×d1×···×dM).

123

Adaptive stochastic Galerkin FEM for lognormal… 679

With this and using the coercivity of A+, we can see that

wN−uN2

A+=A+(wN−uN), (wN−uN)θ

=AW−F,W−U

=(AW−F)H−1/2

ϑ→0,(W−U)H1/2

ϑ→0

≤(AW−F)H−1/2

ϑ→02(RN×d1×···×dM)wN−uNL2(Γ ,γ ;X)

≤(ˇc+

ϑ)−1(AW−F)H−1/2

ϑ→02(RN×d1×···×dM)wN−uNA+

and thus

wN−uNA+≤(ˇc+

ϑ)−1estdisc(wN).

5.4 Overall error estimation

A combination of the above estimates yields an overall error estimator.

Corollary 5.7 For any wN∈VN, the energy error can be bounded by

u−wN2

A+≤(ˇc+

ϑ)−2estall(wN)2

with the error estimator given by

estall(wN)2:= cdet estdet(wN)+estparam(wN)+cIestdisc(wN)2

+estdisc(wN)2.

Remark 5.8 In order to get suitable measures for the estimators, the squared density

ζ2

ϑ appears, which upon scaling with

cσ:=



=1

σ2−σ2



again is a Gaussian measure with standard deviation σ=(σ 

)1≤≤Lfor

σ

:= σ

2−σ2



First, this adds a restriction on ϑsuch that the argument in the square root is positive.

This is fulfilled if exp(2ϑα1)<2, since (αm)1≤m≤Mis a decreasing series, which

can be ensured for some ϑsmall enough. Second, it is important to check whether the

123

680 M. Eigel et al.

new measure is weaker or stronger than γϑ [49], i.e., which space contains the other.

Since

σ

=σ

2−σ2



=exp(ϑα)

2−exp(2ϑα)≥exp(ϑα)=σ,

functions that are integrable with respect to the measure γϑ are not necessarily inte-

grable with respect to the squared measure. However, since fis independent of the

parameters and A+(wN)∈X∗⊗Y(Ξ) has a polynomial chaos expansion of finite

degree, the residual R+(wN)is integrable over the parameters for any Gaussian mea-

sure and therefore it is also integrable with respect to the squared measure.

5.5 Efficient computation of the different estimators

The error estimators can be calculated efficiently in the TT format. For each element

T∈Tof the triangulation, the residual estimator is given by

estdet,T(wN)2=h2

Tf+div σθ

Λ(wN)ζϑ2

L2(Γ ,γ ;L2(T))

=h2

TΓTf+

η∈Λ

div res(x,η)Hτθ

η2

ζ2

ϑ dxdγ(y)

=h2

T(f,f)L2(T)Γ

ζ2

ϑ dγ(y)

+2h2

T

η∈Λ

(f,div res(x,η))L2(T)Γ

Hτθ

ηζ2

ϑdγ(y)

+

η∈Λ

η∈Λ

(div res(x,η),div res(x,η

))L2(T)

×Γ

Hτθ

ηHτθ

ηζ2

ϑdγ(y).

A complication of the change of the measure to γand the involved weight ζ2

ϑ is the

fact that the shifted Hermite polynomials Hτθ are not orthogonal with respect to this

measure. However, this property can be restored easily by calculating the basis change

integrals beforehand. This results in another tensor product operator that is defined

element-wise for η, η∈Ξby

H(η, η):= ˜

Z1(η1,η



1)···˜

ZL(ηL,η

L),

Z(η,η



):= Γ

Hτθ

ηHτθ

η



ζ2

ϑ,dγ(y).

123

Adaptive stochastic Galerkin FEM for lognormal… 681

This operator encodes the basis change to the squared measure and can be inserted in

order to calculate the scalar product. With this, the estimator takes the form

estdet,T(wN)2=h2

T(f,f)L2(T)Γ

ζ2

ϑ dγ(y)

+2h2

T

η∈Λ˜

H(η, 0)( f,div res(x,η))L2(T)

+

η∈Λ

η∈Λ

H(η, η)(div res(x,η),div res(x,η

))L2(T).

Since ˜

His a tensor product operator, this summation can be done component-wise,

i.e., performing a matrix-vector multiplication of every component of the operator ˜

with the corresponding component of the tensor function r.

Similarly, for the jump over the edge Fwe obtain the estimator

estdet,F(wN)2=hF

η∈Λ

η∈Λ

H(η, η)([[res(x,η)]]F,[[res(x,η

)]]F)L2(F).

Analogously to the affine case dealt with in [21], both of these estimators can then be

computed efficiently in the TT format. The parametric error estimator estparam(wN)

can be estimated in a similar way.

To gain additional information about the residual influence of certain stochastic

dimensions, we sum over specific index sets. Let

Ξm:= {(η1,...,dm,...,ηM,0,...)∈F:xη=0,...,d−1,

=1,...,Z

m,...,M},

where the strike through means that takes all values but m. For every m=1,2,...,

and wN∈VNwe define

estparam,m(wN)2:= ΓD

ζ2

ϑ

η∈Ξm

res(x,η)Hτθ

η2

dxdγ(y).

Using the same arguments and notation as above, we can simplify

estparam,m(wN)2=D

η∈Ξnres(x,η)·

η∈Ξm

H(η, η)res(x,η

)dx.

These operations, including the calculation of the discrete error estimator (5.1), can

be executed efficiently in the TT format.

123

682 M. Eigel et al.

6 Fully adaptive algorithm

With the derived error estimators of the preceding sections, it is possible to refine all

discretization parameters accordingly. As discussed before, the deterministic estima-

tor assesses the error that arises from the finite element method. The discrete error

estimator evaluates the error made by a low rank approximation. The rest of the error

is estimated by the parametric error estimator.

The adaptive algorithm described in this section is similar to the algorithms pre-

sented in [15,16,21]. Given some mesh T, a fixed polynomial degree p, a finite tensor

set Λ⊂F, and a start tensor Wwith TT rank r, we assume that a numerical approx-

imation wN∈VNis obtained by a function

N←Solve[Λ, T,r,W]

where the superscript +always denotes the updated object. In our implementation,

we used the preconditioned ALS algorithm but other solution algorithms are feasible

as well. The lognormal error indicators from Sect. 5and thus the overall upper bound

estall(wN)in Corollary 5.7 are computed by the methods

(estdet,T(wN))T∈T,estdet(wN)←Estimatex[wN,Λ,T,p],

(estparam,m(wN))m∈N,estparam(wN)←Estimatey[wN,Λ],

estdisc(wN)←EstimateALS[wN],

estall(wN)←Estimateall[estdet,estparam,estdisc].

Depending on which error is largest, either the mesh is refined, or the index set Λ

is enlarged, or the rank rof the solution is increased. This is done as follows:

•If the deterministic error estdet(wN)outweighs the others, we mark the elements

T∈Tthat have the largest error estdet,T(wN)until the sum of errors on the

elements in this marked subset Tmark ⊂Texceeds a certain ratio 0 <ϑ<1. This

is called the Dörfler marking strategy



T∈Tmark

estdet,T(wN)≥ϑestdet(wN).

We denote this procedure by

Tmark ←Markx[ϑ, (estdet,T(wN,Λ))

T∈T,estdet(wN,Λ,T)].

The elements in this subset are subsequently refined by

T+←Refinex[T,Tmark].

•In case the parametric error estparam(wN)dominates, we use estimators

estparam,m(wN)in order to determine which components need to be refined. Here,

123

Adaptive stochastic Galerkin FEM for lognormal… 683

Algorithm 2 The TTASGFEM algorithm.

Input: solution wNwith solution tensor Wand rank r;

mesh Twith degrees p;

index set Λ;

Dörfler marking ratio ϑ;

accuracy TTASGFEM;

Output: New solution w+

Nwith new solution tensor W+;

updated mesh T+;

updated index set Λ+;

Init w+

N=wN;W+=W;T+=T;Λ+=Λ;r+=r;estall =∞;

repeat

N←Solve[Λ+,T+,r+,W+]

(estdet,T)T∈T+,estdet ←Estimatex[w+

N,Λ

+,T+,p]

(estparam,m)m∈N,estparam ←Estimatey[w+

N,Λ

estdisc ←EstimateALS[w+

estall ←Estimateall[estdet,estparam,estdisc]

if estdet =max{estdet,estparam,estdisc}then

Tmark ←Markx[ϑ, (estdet,T)T∈T+,estdet]

T+←Refinex[T,Tmark]

else if estparam =max{estdet,estparam,estdisc}then

Imark ←Marky[(estparam,m)m∈N,estparam]

Λ+←Refiney[Λ, Imark]

else

W+←RefineTT[W+]

end if

until estall <

TTASGFEM

we also mark until the Dörfler property is satisfied, that is, we obtain a subset

Imark ⊂Nsuch that



m∈Imark

estparam,m(wN)≥ϑestparam(wN).

This is the marking

Imark ←Marky[ϑ, (estparam,m(wN))m∈N,estparam(wN)],

and we refine by increasing d+

m←dm+1form∈Imark

Λ+←Refiney[Λ, Imark].

•Finally, if estdisc(wN)exceeds the other errors, we simply add a random tensor

of rank 1 to the solution tensor W. This increases all TT ranks of Wby 1 almost

surely. It would also be possible to add an approximation of the discrete residual

as this is also in TT format. However, since the ALS algorithm will be performed

after the refinement, the advantage of this rather costly improvement has shown

to be negligible [52]. Thus we get

W+←RefineTT[W].

123

684 M. Eigel et al.

A single iteration step of the adaptive algorithm returns either a refined T+or Λ+

or the tensor format solution with increased rank W+and then solves the problem with

these properties. This is done repeatedly until the overall error estimator errall(wN)

is sufficiently small, i.e., defined error bound TTASGFEM or a maximum problem size

is reached. This procedure is given by the function TTASGFEM, displayed in Algo-

rithm 2. Note that we omit the dependence of wNfor the estimators in the algorithm.

The upper error bounds directly follow from Corollary 5.7.

7 Numerical experiments

In this section we examine the performance of the proposed adaptive algorithm with

some benchmark problems. The computations are carried out with the open source

framework ALEA [18]. The FE discretization is based on the open source package

FEniCS [24]. The shown experiments are similar to the ones of the predecessor

paper [21] in Section 7. As before, the model equation (2.1) is computed for different

lognormal coefficients on the square domain. The derived error estimator is used as

a refinement indicator. Of particular interest is the observed convergence of the true

(sampled) expected energy error and the behaviour of the error indicator. Moreover,

we comment on the complexity of the coefficient discretization.

7.1 Evaluation of the error

The accuracy of the computed approximation is determined by a Monte Carlo estima-

tion. For this, a set of NMC independent samples y(i)NMC

i=1of the stochastic parameter

vector is considered. By the tensor structure of the probability measure γ=m∈Nγ1,

each entry of the vector valued sample y(i)is drawn with respect to N1. The parametric

solution wN∈VN(Λ;T,p)obtained by the adaptive algorithm at a sample y(i),is

compared to the corresponding (deterministic) sampled solution u(y(i))∈Xp(!

T)on

a much finer finite element space subject to a reference triangulation !

T, obtained from

uniform refinements of the finest adaptively created mesh, up to the point where we

obtain at least 105degrees of freedom with Lagrange elements of order p=3. For

computations, the truncation parameter applied to the infinite sum in (2.2), controlling

the length of the affine field representation, is set to M=100. The mean squared error

is determined by a Monte Carlo quadrature,

Γu(y)−wN(y)2

Xdγ(y)≈1

NMC



i=1u(y(i))−wN(y(i))2

Xp(!

T).

Note that the sample reference quantity u(y)exhibits approximation solely by a finite

element method and does not involve the computation of a unified reference solution.

A choice of NMC =250 proved to be sufficient to obtain a reliable estimate in our

experiments.

123

Adaptive stochastic Galerkin FEM for lognormal… 685

7.2 The stochastic model problem

In the numerical experiments, we consider the stationary diffusion problem (2.1)on

the square domain D=(0,1)2.Asin[15,16,19,21], the expansion coefficients of the

stochastic field (2.2)aregivenby

bm(x):= γm−σcos(2π1(m)x1)cos(2π2(m)x2), for m≥1.(7.1)

We chose the scaling γ=0.9.

Moreover,

1(m)=m−k(m)(k(m)+1)/2 and 2(m)=k(m)−1(m),

with k(m)=−1/2+√1/4+2m, i.e., the coefficient functions bmenumerate all

planar Fourier sine modes in increasing total order. To illustrate the influence which

the stochastic coefficient plays in the adaptive algorithm, we examine the expansion

with varying decay, setting σin (7.1) to different values. The computations are carried

out with conforming FE spaces of polynomial degree 1 and 3.

The fully discretized problem is solved in the TT format using the Alternating Least

Squares (ALS) algorithm as introduced in [21]. Other algorithms like Riemannian

optimization are also feasible [1,37]. The ALS has a termination threshold of 10−12

that needs to be reached in the Frobenius norm of the difference of two successive

iteration results.

For our choice of the coefficient field, the introduced weights in (3.1) are set to

=1 and θ=0.1.

7.3 Tensor train representation of the coefficient

Since the tensor approximation of the coefficient is the starting point for the dis-

cretization of the operator, we begin with an examination of the rank dependence of

the coefficient approximation scheme given in Algorithm 1. For this, we fix the multi-

index set such that the incorporated Hermite polynomials are of degree 15 in each

stochastic dimension, i.e.,

Δ={ν∈F:ν=0,...,16,=1,...,L}.

As a benchmark, we chose a slow decay with σ=2 and set the quadrature rule2

to exactly integrate polynomials of degree 7 with a grid such that the number of

quadrature nodes is at least Pquad =104. In the following, the relative root mean

squared (RRMS) error

E(a,aΔ,s):= E[a−aΔ,s2

L2(D)/a2

L2(D)]1/2

2The statements are with regard to the respective FE discretization in FEniCS.

123

686 M. Eigel et al.

Table 1 Comparison of different coefficient field approximations with fixed stochastic parameter set of

order L={10,50,100}, fixed stochastic polynomial degree 15 and different rank configurations having

maximal rank smax ={10,20,50,100}

smax E(a,aΔ,s)tt-dofs(A) full-dofs(A) CPU-time (s)

L=10

10 1.21 ×10−32.04 ×1063.02 ×1017 1.7

20 6.25 ×10−44.12 ×1064.56

50 6.18 ×10−41.05 ×10716.52

100 6.18 ×10−42.15 ×10770.29

L=50

10 1.17 ×10−32.11 ×1063.26 ×1065 9.14

20 3.26 ×10−44.36 ×10613.56

50 5.98 ×10−51.20 ×10750.60

100 5.16 ×10−52.75 ×107247.63

L=100

10 1.16 ×10−32.18 ×1065.25 ×10125 19.09

20 3.20 ×10−44.66 ×10627.16

50 6.33 ×10−51.38 ×107100.53

100 7.47 ×10−63.50 ×107498.47

is compared to the growth in degrees of freedom with respect to various tensor ranks.

Numerically, this expression is computed by a Monte Carlo approximation as described

in Sect. 7.1 with the reference mesh !

By denoting A=(A0,A1,...,AL)the component tensors of aΔ,sin (4.4), where

A0[xk,k1]=a0[k1](xk)corresponds to the evaluation of a0[k1]at every node xkof

T, we can apply the notion of tt-dofs (4.1) to the discrete tensor train coefficient. To

highlight the crucial choice of the rank parameter s=(1,s1,...,sL,1)in aΔ,s,we

let the maximal attainable rank smax vary.

It can be seen in Table 1that a higher rank yields a more accurate approximation

of the coefficient, as long as the stochastic expansion Lis long enough. For small

numbers of L, the RRMS does not decrease any further than 6 ×10−4, even when

increasing the maximal attainable rank. Otherwise, for up to L=100 terms in the

affine coefficient field parametrisation, a small rank parameter of smax =10 gives a

reasonable RRMS error of 1 ×10−3which can be improved by increasing smax.

It should be pointed out that the used approximation only becomes feasible com-

putationally by the employed tensor format since without low-rank representation the

amount of degrees of freedom grows exponentially in der number of expansion dimen-

sions. Furthermore, since the tensor complexity and arithmetic operations depend only

linearly on the number of expansion dimensions, the computational time scales lin-

early as illustrated in the last column of Table 1showing the average run time of 10

runs.3

3Run on an 8GB Intel Core i7-6500 laptop.

123

Adaptive stochastic Galerkin FEM for lognormal… 687

Fig. 3 Relative sampled mean

squared H1(D)error for fixed

stochastic dimension L=5and

adaptive refinement of the

physical mesh. Each setting is

shown along its respective

overall error estimator as defined

in Corollary 5.7 and plotted

against the total number of

degrees of freedom in the TT

representation. Considered are

FE approximations of order

p=1andp=3

7.4 Adaptivity in physical space

As a first example for an adaptive discretization, we examine the automatic refinement

of the physical FE space only. For this, the stochastic expansion of the coefficient is

fixed to some small value L=5in(4.4), which also is assumed for the corresponding

reference solution. The considered polynomial (Hermite chaos) degree of the approx-

imation in each dimension is fixed to d1=... =d5=10, which can be considered

sufficiently large for the respective problem, given an algebraic decay rate of σ=2

in the coefficient.

Although this experiment does not illustrate the performance of the complete algo-

rithm, it nevertheless depicts the varying convergence rates for different polynomial

orders in the FE approximation. The rank of the tensor train approximation is fixed to

rmax =10, which is sufficient due to the low stochastic dimensions.

It can be observed in Fig. 3that the error estimator follows the rate of convergence

of the sampled error. Moreover, the higher-order FE method exhibits a higher conver-

gence rate as expected. This could also be observed in the (much simpler) affine case

scrutinized in the preceding works [15,16,19,21].

7.5 Fully adaptive algorithm

The fully adaptive algorithm described in Algorithm 2is instantiated with a small

initial tensor with full tensor rank r1=2 consisting of a single M=1 stochastic

component discretized with a linear polynomial d1=2 and a physical mesh with

|T|=32 elements for linear ansatz functions p=1 and |T|=8forp=3. The

marking parameter is set to ϑ=0.5.

Figure 4depicts the real (sampled) mean squared energy error and the corresponding

overall error estimator for FE discretizations of degrees p=1,3. On the left-hand

side of the figure, a lower decay rate (σ=2) in the coefficient representation is used,

resulting in more stochastic modes to be relevant for an accurate approximation. The

right-hand side shows results for a faster decay rate with σ=4. As expected, the

convergence rate for the p=3 FE discretizations is faster than with p=1FE.

123

688 M. Eigel et al.

Fig. 4 Relative, sampled, mean squared H1(D)error for the fully adaptive refinement. Each setting is

shown along its respective overall error estimator as defined in Corollary 5.7 and plotted against the total

number of degrees of freedom in the representation. Considered are finite elements approximation of order

p=1andp=3 and slow decay (σ=2, left) and fast decay (σ=4, right)

Table 2 Stochastic expansion length, maximal polynomial chaos degrees, tensor ranks and degrees of

freedom for the physical and (compressed) stochastic discretizations as well as for the operator in TT

format for selected iteration steps in the fully adaptive algorithm

Iteration Md

max rmax m-dofs(WN) tt-dofs(WN) op-dofs(WN)

Finite element order p=1 and slow coefficient decay σ=2

5/37 1 1 2 292 584 873

15/37 2 2 5 1577 7900 73,965

20/37 3 4 8 2330 18,656 170,247

30/37 5 4 13 6586 85,,847 678,382

37/37 6 5 19 6586 126,330 734,880

Finite element order p=1 and fast coefficient decay σ=4

5/22 1 1 2 302 604 1200

10/22 1 3 3 941 2826 11,196

15/22 1 4 5 2951 14,755 44,115

22/22 2 5 9 9608 86,499 962,022

Finite elements of order p=1 are used to solve the problem with coefficient decay rates of σ∈{2,4}

Moreover, for the simpler problem with fast decay, we achieve a smaller error with a

comparable number of degrees of freedom than in the harder slow decay setting.

Remark 7.1 Note that we neglect the (large) factor ˇc+

θ in Corollary 5.7, which depends

on the choice of weights θand of the discrete space. A detailed analysis of how to

optimally select these weights is outside the scope of this article.

We conclude the numerical observations with Table 2and 3to display the behaviour

of the fully adaptive algorithm. To allow for more insights, we redefine the physical

mesh resolution as m-dofs(WN), which is the number of FE degrees of freedom for

the respective parametric solution WN. Moreover, we define the number of degrees

123

Adaptive stochastic Galerkin FEM for lognormal… 689

Table 3 Stochastic expansion length, maximal polynomial chaos degrees, tensor ranks and degrees of

freedom for the physical and (compressed) stochastic discretizations as well as for the operator in TT

format for selected iteration steps in the fully adaptive algorithm

Iteration Md

max rmax m-dofs(WN) tt-dofs(WN) op-dofs(WN)

Finite element order p=3 and slow coefficient decay σ=2

5/65 1 2 2 139 417 816

20/65 5 3 12 241 3040 37,537

35/65 7 6 21 403 9856 130,799

50/65 9 11 33 568 27,662 239,273

65/65 16 11 46 1078 76,173 353,153

Finite element order p=3 and fast coefficient decay σ=4

5/67 1 2 2 322 966 2226

20/67 2 7 7 991 9974 101,110

35/67 4 13 15 1207 20,423 250,913

50/67 6 21 19 1909 39,998 434,669

67/67 6 34 22 2404 61,920 607,076

Finite elements of order p=3 are used to solve the problem with coefficient decay rates of σ∈{2,4}

of freedom incorporated in the operator (4.5) generated to obtain the current solution.

For a tensor train operator of dimension A∈R(N×N)×(q1×q1)×...×(qL×qL)and rank

s=(s1,...,sL), we define

op-dofs(A):= N2s1−s2

L−1



=1

(sq2

s+1−s2

+1)+sLq2

Note that, using a sparse representation of the operator in the first (physical) compo-

nent, it is possible to reduce number of op-dofs.

Tables 2and 3highlight some iteration steps and the employed stochastic expansion

length M, the maximal polynomial degree dmax (which was usually naturally attained

in the first stochastic component), the maximal tensor train rank rmax (again most of

the time this rank is the separation rank of the physical space and the first stochastic

dimension) and the corresponding degrees of freedom. Notably, for the fast coefficient

decay σ=4, the stochastic expansion is very short since most of the randomness is due

to the first few stochastic parameters. It is reasonable that the respective parameter

dimensions require higher polynomial degrees in the approximation. For the more

involved slow decay σ=2 setting we observe a larger stochastic expansion of the

solution and larger tensor ranks.

Acknowledgements Open Access funding provided by Projekt DEAL. M.P. and R.S. were supported by the

DFG project ERA Chemistry; additionally R.S. was supported through Matheon by the Einstein Foundation

Berlin.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which

permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give

appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence,

123

690 M. Eigel et al.

and indicate if changes were made. The images or other third party material in this article are included

in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If

material is not included in the article’s Creative Commons licence and your intended use is not permitted

by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the

References

1. Absil, P.-A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton

University Press, Princeton (2008)

2. Babuška, I., Nobile, F., Tempone, R.: A stochastic collocation method for elliptic partial differential

equations with random input data. SIAM J. Numer. Anal. 45(3), 1005–1034 (2007)

3. Bachmayr, M., Schneider, R., Uschmajew, A.: Tensor networks and hierarchical tensors for the solution

of high-dimensional partial differential equations. Found. Comput. Math. 16(6), 1423–1472 (2016)

4. Bespalov, A., Powell, C.E., Silvester, D.: Energy norm a posteriori error estimation for parametric

operator equations. SIAM J. Sci. Comput. 36(2), A339–A363 (2014)

5. Bespalov, A., Silvester, D.: Efficient adaptive stochastic Galerkin methods for parametric operator

equations. SIAM J. Sci. Comput. 38(4), A2118–A2140 (2016)

6. Bryant, C.M., Prudhomme, S., Wildey, T.: Error decomposition and adaptivity for response surface

approximations from pdes with parametric uncertainty. SIAM/ASA J. Uncertain. Quant. 3(1), 1020–

1045 (2015)

7. Charrier, J.: Strong and weak error estimates for elliptic partial differential equations with random

coefficients. SIAM J. Numer. Anal. 50(1), 216–246 (2012)

8. Cohen, A., DeVore, R., Schwab, C.: Convergence rates of best N-term Galerkin approximations for a

class of elliptic sPDEs. Found. Comput. Math. 10(6), 615–646 (2010)

9. Cohen, A., Devore, R., Schwab, C.: Analytic regularity and polynomial approximation of parametric

and stochastic elliptic PDE’s. Anal. Appl. (Singapore) 9(1), 11–47 (2011)

10. Crowder, A.J., Powell, C.E., Bespalov, A.: Efficient adaptive multilevel stochastic Galerkin approxi-

mation using implicit a posteriori error estimation. arXiv preprint arXiv:1806.05987 (2018)

11. Detommaso, G., Dodwell, T., Scheichl, R.: Continuous level Monte Carlo and sample-adaptive model

hierarchies. arXiv preprint arXiv:1802.07539 (2018)

12. Dolgov, S., Khoromskij, B.N., Litvinenko, A., Matthies, H.G.: Polynomial chaos expansion of random

coefficients and the solution of stochastic partial differential equations in the tensor train format.

SIAM/ASA J. Uncertain. Quant. 3(1), 1109–1135 (2015)

13. Dolgov, S., Scheichl, R.: A hybrid alternating least squares–tt cross algorithm for parametric pdes.

arXiv preprint arXiv:1707.04562 (2017)

14. Eigel, M., Martin, G., Claude, J.: Adaptive stochastic Galerkin FEM for a log-transformed PDE with

non-affine unbounded random coefficients (2018). (in preparation)

15. Eigel, M., Gittelson, C.J., Schwab, C., Zander, E.: Adaptive stochastic Galerkin FEM. Comput. Methods

Appl. Mech. Eng. 270, 247–269 (2014)

16. Eigel, M., Gittelson, C.J., Schwab, C., Zander, E.: A convergent adaptive stochastic Galerkin finite

element method with quasi-optimal spatial meshes. ESAIM Math. Model. Numer. Anal. 49(5), 1367–

1398 (2015)

17. Eigel, M., Marschall, M., Schneider, R.: Sampling-free bayesian inversion with adaptive hierarchical

tensor representations. Inverse Prob. 34(3), 035010 (2018)

18. Eigel, M., Marschall, M., Zander, E.: alea—A python framework for spectral methods and low-rank

approximations in uncertainty quantification. https://bitbucket.org/aleadev/alea

19. Eigel, M., Merdon, C.: Local equilibration error estimators for guaranteed error control in adaptive

stochastic higher-order Galerkin finite element methods. SIAM/ASA J. Uncertain. Quant. 4(1), 1372–

1397 (2016)

20. Eigel, M., Merdon, C., Neumann, J.: An adaptive multilevel monte carlo method with stochastic bounds

for quantities of interest with uncertain data. SIAM/ASA J. Uncertain. Quant. 4(1), 1219–1245 (2016)

21. Eigel, M., Pfeffer, M., Schneider, R.: Adaptive stochastic Galerkin FEM with hierarchical tensor

representations. Numerische Mathematik 136(3), 765–803 (2017)

123

Adaptive stochastic Galerkin FEM for lognormal… 691

22. Ernst, O.G., Mugler, A., Starkloff, H.-J., Ullmann, E.: On the convergence of generalized polynomial

chaos expansions. ESAIM Math. Model. Numer. Anal. 46(2), 317–339 (2012)

23. Espig, M., Hackbusch, W., Litvinenko, A., Matthies, H.G., Wähnert, P.: Efficient low-rank approx-

imation of the stochastic Galerkin matrix in tensor formats. Comput. Math. Appl. 67(4), 818–829

(2014)

24. FEniCS project—automated solution of differential equations by the finite element method. http://

fenicsproject.org

25. Galvis, J., Sarkis, M.: Approximating infinity-dimensional stochastic Darcy’s equations without uni-

form ellipticity. SIAM J. Numer. Anal. 47(5), 3624–3651 (2009)

26. Gittelson, C.J.: Stochastic Galerkin discretization of the log-normal isotropic diffusion problem. Math.

Models Methods Appl. Sci. 20(2), 237–263 (2010)

27. Grasedyck, L.: Hierarchical singular value decomposition of tensors. SIAM J. Matrix Anal. Appl.

31(4):2029–2054 (2009/10)

28. Guignard, D., Nobile, F.: A posteriori error estimation for the stochastic collocation finite element

method. Technical report (2017)

29. Hackbusch, W., Schneider, R.: Tensor spaces and hierarchical tensor representations. In: Extraction of

Quantifiable Information from Complex Systems, volume 102 of Lecture Notes in Computer Science

Engineering, pp. 237–261. Springer, Cham (2014)

30. Harbrecht, H., Peters, M., Siebenmorgen, M.: Multilevel accelerated quadrature for pdes with log-

normally distributed diffusion coefficient. SIAM/ASA J. Uncertain. Quant. 4(1), 520–551 (2016)

31. Harbrecht, H., Peters, M., Siebenmorgen, M.: On the quasi-monte carlo method with halton points for

elliptic pdes with log-normal diffusion. Math. Comput. 86(304), 771–797 (2017)

32. Herrmann, L., Schwab, C.: QMC algorithms with product weights for lognormal-parametric, elliptic

pdes (2017)

33. Herrmann, L., Schwab, C.: Multilevel quasi-monte carlo integration with product weights for elliptic

pdes with lognormal coefficients. Techncial Report 2017–2019, Seminar for Applied Mathematics.

ETH Zürich, Switzerland (2017)

34. Hoang, V.H., Schwab, C.: N-term Wiener chaos approximation rate for elliptic PDEs with lognormal

Gaussian random inputs. Math. Models Methods Appl. Sci. 24(4), 797–826 (2014)

35. Holtz, S., Rohwedder, T., Schneider, R.: On manifolds of tensors of fixed TT-rank. Numer. Math.

120(4), 701–731 (2012)

36. Kazashi, Y.: Quasi-monte carlo integration with product weights for elliptic pdes with log-normal

coefficients. arXiv preprint arXiv:1701.05974 (2017)

37. Kressner, D., Steinlechner, M., Vandereycken, B.: Low-rank tensor completion by Riemannian opti-

mization. BIT 54(2), 447–468 (2014)

38. Kuo, F., Scheichl, R., Schwab, C., Sloan, I., Ullmann, E.: Multilevel quasi-monte carlo methods for

lognormal diffusion problems. Math. Comput. 86(308), 2827–2860 (2017)

39. Loève, M.: Probability theory. II. Graduate Texts in Mathematics, vol. 46, 4th edn. Springer, New York

(1978)

40. Malliavin, P.: Stochastic Analysis, vol. 313. Springer, Berlin (2015)

41. Mugler, A.: Verallgemeinertes polynomielles Chaos zur Lösung stationärer Diffusionsprobleme mit

zufälligen Koeffizienten. PhD thesis, BTU Cottbus (2013)

42. Mugler, A., Starkloff, H.-J.: On the convergence of the stochastic Galerkin method for random elliptic

partial differential equations. ESAIM Math. Model. Numer. Anal. 47(5), 1237–1263 (2013)

43. Nobile, F., Tempone, R., Webster, C.G.: An anisotropic sparse grid stochastic collocation method for

partial differential equations with random input data. SIAM J. Numer. Anal. 46(5), 2411–2442 (2008)

44. Nobile, F., Tempone, R., Webster, C.G.: A sparse grid stochastic collocation method for partial differ-

ential equations with random input data. SIAM J. Numer. Anal. 46(5), 2309–2345 (2008)

45. Nouy, A.: Low-rank methods for high-dimensional approximation and model order reduction. In:

Benner, P., Cohen, A., Ohlberger, M., Willcox, K. (eds.) Model reduction and approximation, pp.

171–226. SIAM, Philadelphia, PA (2017)

46. Oseledets, I.V., Tyrtyshnikov, E.E.: Breaking the curse of dimensionality, or how to use SVD in many

dimensions. SIAM J. Sci. Comput. 31(5), 3744–3759 (2009)

47. Pfeffer, M.: Tensor methods for the numerical solution of high-dimensional parametric partial differ-

ential equations. PhD thesis, TU Berlin Dissertation No. 19533, Technische Universität Berlin (2018)

48. Prudhomme, S., Bryant, C.M.: Adaptive surrogate modeling for response surface approximations with

application to bayesian inference. Adv. Model. Simul. Eng. Sci. 2(1), 22 (2015)

123

692 M. Eigel et al.

49. Schwab, C., Gittelson, C.J.: Sparse tensor discretizations of high-dimensional parametric and stochastic

PDEs. Acta Numer. 20, 291–467 (2011)

50. Ullmann, E.: Solution strategies for stochastic finite element discretizations. PhD thesis, TU

Bergakademie Freiberg (2008)

51. Ullmann, E., Elman, H.C., Ernst, O.G.: Efficient iterative solvers for stochastic Galerkin discretizations

of log-transformed random diffusion problems. SIAM J. Sci. Comput. 34(2), A659–A682 (2012)

52. Uschmajew, A., Vandereycken, B.: Line-search methods and rank increase on low-rank matrix varieties.

In: NOLTA2014, Luzern, Switzerland, September 14–18, 2014, pp. 52–55. IEICE, Tokyo (2014)

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps

and institutional affiliations.

123