SPECTRAL ERROR BOUNDS FOR HERMITIAN INEXACT
KRYLOV METHODS†
UTE KANDLER∗AND CHRISTIAN SCHR¨
ODER∗
Abstract. We investigate the convergence behavior of inexact Krylov methods for the approx-
imation of a few eigenvectors or invariant subspaces of a large, sparse Hermitian matrix. Bounds
on the distance between an exact invariant subspace and a Krylov subspace and between an exact
invariant subspace and a Ritz space are presented. Using the first bound we analyze the question:
if a few iteration steps have been taken without convergence, how many more iterations have to be
performed to achieve a preset tolerance. The second bound provides a measure on the approximation
quality of a computed Ritz space. Traditional bounds of these quantities are particularly sensitive
to the gap between the wanted eigenvalues and the remaining spectrum. Here this gap is allowed
to be small by considering how well the exact invariant subspace is contained in a slightly larger
approximated invariant subspace. Moreover, numerical experiments confirm the applicability of the
given bounds.
Key words. Hermitian eigenvalue problem, inexact Krylov method, convergence analysis,
Krylov relation, Ritz pair
AMS subject classifications. 65F15, 65G99
1. Introduction. The Hermitian eigenvalue problem consists of solving the
equation
Ax =λx
for λ∈R,x∈Cn\ {0}, where A=AH∈Cn×nis given. Here we consider the
case when Ais large and data-sparse and only a small subset of exterior eigenvalues
and the corresponding invariant subspace are desired. Under these conditions the
most prominent iterative methods are Krylov subspace methods that search in Krylov
subspaces
Kk:= Kk(A, v1) := span(v1, Av1, A2v1, . . . , Ak−1v1)
for approximations of eigenvectors or invariant subspaces.
The application we have in mind is the identification of ground states of quan-
tum systems. Mathematically this amounts to is an Hermitian eigenvalue problem.
A ground state is an eigenvector corresponding to the smallest eigenvalue, whereas
eigenvectors corresponding to the second (third, forth,...) smallest eigenvalues are
called first (second, third,...) excited states. Note that these eigenvalues can be very
close to each other which can lead to complications in the determination of the ground
state [25]. The dimension of these quantum systems is often extremely large; it is 2d
where dis the number of particles of the system (and d > 50 is not uncommon) [11,25].
Hence, already a single vector might not fit into memory of even a large computing
cluster when stored in standard element-by-element format. For that reason vectors
have to be stored in a data sparse way, such as, e.g., in the tensor train or the hierarchi-
cal tensor format [9,10,21]. These formats, however, entail the drawback that vector
operations like matrix-vector multiplication, vector addition and vector scaling are
†This work was supported by German research council, DFG under project “Scalable Numerical
Methods for Adiabatic Quantum Preparation”
∗TU Berlin, Str. des 17. Juni 136, 10623 Berlin, GERMANY,
({kandler,schroed}@math.tu-berlin.de)
1
2Ute Kandler Christian Schr¨oder
inexact, i.e., only approximations of the intended quantities are available. Using inex-
act vector operations inside Krylov methods result in inexact Krylov methods. Other
occurrences include inexact solves in a shift-invert setting (when A= (A−σI)−1) or
mixed precision arithmetic (when computations are carried out in double precision,
but vectors are stored in single precision).
Backward error analysis shows that inexact Krylov subspace methods can be
interpreted as exact Krylov methods applied to a perturbed matrix A+E[14,29,33].
In general, an inexact Krylov method will generate orthonormal Vk+1 = [Vk, vk+1] =
[v1, . . . , vk+1]∈Cn×k+1,bk+1 ∈Ck, and Bk=BH
k∈Ck×ksuch that a Krylov relation
of the form
(1.1) (A+E)Vk=VkBk+vk+1bH
k+1
holds. Equation (1.1) takes the role of the familiar Arnoldi relation AVk=VkHk+
hk+1,kvk+1eT
k. The backward error matrix E=EH∈Cn×nis unknown, only a bound
on its norm ∥E∥2is usually available. Note that using the Lanczos process Bkwould
be tridiagonal, but other methods generate a full Hermitian Bk[14]. Relation (1.1)
implies that Vkis a basis of a Krylov subspace
˜
Kk:= Kk(A+E, ˜v1)
of the Hermitian matrix A+Eclose to Afor some ˜v1∈ran (Vk).
In this paper the following questions are addressed:
•How well is a desired invariant subspace Xof the original matrix Aapproxi-
mated by the Krylov subspace ˜
Kkof a perturbed matrix A+Eor by a Ritz
space ˜
Yin ˜
Kk?
In our notation a tilde indicates a quantity that corresponds to A+Einstead
of to A. E.g., ˜
Kkis a Krylov subspace of A+Eand ˜
Yis a Ritz space of A+E
in ˜
Kk.
•How can the sensitivity of the error bounds with respect to a small gap between
the wanted and the remaining eigenvalues be avoided?
We allow the approximation ˜
Yto be of larger dimension than X. In other
words, we alter the question from ”How close is Xto ˜
Y?” to ”How well is X
contained in ˜
Y?”. This enables us in contrast traditional bounds, e.g., [7,13,
31], to treat small gaps between the wanted and the remaining eigenvalues.
•What is a suitable measure for the quality of the approximate invariant sub-
space?
We use the angle of inclusion of Xin ˜
Yfor nonzero subspaces X,˜
Y ⊂ Cn
which is defined by
(1.2) ∠↷
max(X,˜
Y) := max
x∈X, x=0 ∠(x, ˜
Y).
A small angle ∠↷
max(X,˜
Y) does not mean that Xis close to ˜
Y, but indi-
cates that Xis almost contained in ˜
Y. In numerous applications this is all
that is needed [1]. When dim(X)≤dim( ˜
Y), the angle ∠↷
max(X,˜
Y) coincides
with the well-known maximal canonical angle [5,37], otherwise (i.e., when
dim(X)>dim( ˜
Y)) it is π/2. This unsymmetric formulation is reasonable,
because for example a 2-dimensional space can be approximately contained in
a 3-dimensional space but obviously not vice versa. As a nice consequence it
also means that our bounds hold in the trivial case when dim(X)>dim( ˜
Y).
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 3
The bounds presented in this paper use (1.1) as a starting point. We distinguish
between bounds on the distance to Krylov subspaces and to Ritz spaces. In the first
case we achieve an a priori result (requiring ∥E∥2only) and in the second case an a
posteriori result (requiring Vk, Bkand bk+1 in addition to ∥E∥2). In both cases the
bounds also require the exact eigenvalues of A. Next to being interesting in their own
right the bounds may also be beneficial inside algorithms as stopping criterion when
combined with certain spectral estimation techniques, cf. [2,22].
Our theory heavily relies on spectral perturbation theory. Classical works in this
field include [7,12,22,24,34]. The distance of an approximate invariant subspaces from
an exact one is considered in [13,20] (based on the residual) and in [31] (based on
the quality of a surrounding search space). For the a priori setting we will generalize
a classic result [23, Theorem 6.3] that bounds the angle between an eigenvector and
the kth Krylov subspace. Generalizations for block Krylov methods of this classic
result can be found in [18,27]. [3] presents bounds for the angle between an invariant
subspace of Aand the Krylov subspace Kk(A, v1) generated by a Krylov method
with polynomial restarts. Other works considering inexact Krylov subspace methods
include [19,28,29,36], where only the matrix vector multiplication is assumed to be
inexact. Finally we mention [33] where the backward error for an approximate Krylov
subspace is analyzed.
The paper is structured as follows: Section 2introduces more notation and basic
or preliminary results. The main results of the paper are presented in sections 3and 4.
In Section 3we present bounds on the distance to Krylov subspaces of a perturbed
matrix A+E, while in Section 4a bound on the approximation quality of Ritz spaces
is discussed. Numerical examples illustrating our findings are presented in Section 5.
Finally we offer some concluding remarks in Section 6.
2. Notation and preliminary results. In this section we introduce some no-
tation and collect basic results, grouped by topic. Throughout this paper Cm×n
denotes the set of complex m×nmatrices and Cnthe n-dimensional complex vector
space. A matrix norm ∥ · ∥ is called unitarily invariant if ∥UAQ∥=∥A∥for any
matrix A∈Cn×mand any unitary matrices U∈Cm×mand Q∈Cn×n. Prominent
unitarily invariant matrix norms are the 2-norm denoted by ∥.∥2and the Frobenius
norm denoted by ∥.∥F.
For a set Λ ⊂Rand A∈Cn×nwe define
spread(Λ) := max
λ1,λ2∈Λ|λ1−λ2|and ran(A) := {Ax |x∈Cn},
where the range of Aequals the column space of the matrix A. We indicate by 0 the
null matrix, by Inthe n×nidentity matrix and by ekits kth column. The cardinality
of a discrete set S, denoted |S|, is the number of elements in S. The envelope env(M)
of a set M ⊂ R∪{∞,−∞} is defined by the smallest interval that contains M.
2.1. The angle of inclusion. For nonzero subspaces X,Y ⊂ Cnthe angle of
inclusion of Xin Yis defined by (1.2) (we replace ˜
Yby Yhere for ease of notation)
where
∠(x, Y) := min
y∈Y
y=0
∠(x, y),and ∠(x, y) := arccos (|xHy|
∥x∥2∥y∥2)∈[0,π
2].
For matrices X, Y we define ∠↷
max(X, Y ) := ∠↷
max(ran (X),ran (Y)).
4Ute Kandler Christian Schr¨oder
Intuitively, the angle of inclusion (1.2) provides a measure of how well Xis con-
tained in Y. If the angle of inclusion is small then for every x∈ X there is a y∈ Y that
is close to x, i.e., ∥x−y∥2/∥x∥2is small. In particular, we have ∠↷
max(X,Y) = 0 if and
only if X ⊂ Y and ∠↷
max(X,Y) reaches π/2 if Xcontains a direction orthogonal to Y.
In particular the latter is the case whenever dim(X)>dim(Y). Using these rationale
the angle of inclusion can be extended to zero-dimensional spaces: ∠↷
max({0},Y) := 0
for any (zero or non-zero) Y, and ∠↷
max(X,{0}) := π/2 for any non-zero X.
We also stress that while the angle of inclusion is non-symmetric in general , i.e.,
∠↷
max(X,Y)=∠↷
max(Y,X), it is symmetric whenever dim(X) = dim(Y). In the this
case ∠↷
max(X,Y) = ∠↷
max(Y,X) coincides with the maximal canonical angle between
Xand Y.
As a side note we mention that there is also a minimal angle, which is defined by
∠min(X,Y) := minx∈X ,x=0 ∠(x, Y), but this concept does not play a role in this paper.
For more details on subspace angles see [6,15,37]. We state some useful properties of
inclusion angles in the following lemma and theorem.
Lemma 2.1. Let X,Y,Zbe subspaces of Cn. Then
i. ∠↷
max(X,Z)≤∠↷
max(X,Y) + ∠↷
max(Y,Z)(triangle inequality);
ii. ∠↷
max(X,Y)≥∠↷
max(X,Z)
whenever Y ⊂ Z;
iii. ∠↷
max(X,Y) = ∠↷
max(Y⊥,X⊥)
where X⊥,Y⊥⊂Cnare the orthogonal complements of X,Yin Cn;
iv. ∠↷
max(X,Y) =
0if dim(X) = 0
arccos (σdim(X)(XHY))if 1≤dim(X)≤dim(Y)
π
2if dim(X)>dim(Y)
,
where X, Y are any orthonormal bases of X,Yrespectively, i.e. X:= ran(X),
Y:= ran(Y), and σi(XHY)denotes the i-th largest singular value of XHY;
v. ∠↷
max(X,Y⊥∩Z)≤∠↷
max(X ⊕Y,Z)
whenever X ⊥ Y.
Proof.i. We distinguish three cases. a) It is easy to check that the inequality holds
whenever at least one of X,Y,Zis zero. b) If dim(X)>dim(Y) or dim(Y)>dim(Z)
at least one angle on the right-hand side reaches π
2and there is nothing to prove.
c) Otherwise, i.e., if 1 ≤dim(X)≤dim(Y)≤dim(Z) the definition of ∠↷
max(X,Y)
coincides with the definition of angles between subspaces in [37]. Hence in this case
the proof in [37, pp. 275] applies to our definition as well.
ii. This is a special case of part i. with ∠↷
max(Y,Z) = 0.
iii. We distinguish three cases. a) If X ⊂ Y or, equivalently, Y⊥⊂ X⊥then both
angles are zero. (This covers the cases that X= 0 or Y⊥= 0.) b) If dim(X)>dim(Y)
or, equivalently, dim(Y⊥)>dim(X⊥) then both angles are π/2. (This covers the cases
that Y= 0 or X⊥= 0.) c) If 1 ≤dim(X)≤dim(Y) see [16].
iv. The first case (dim(X) = 0) holds by definition. The last case (dim(X)>
dim(Y)) was discussed above. For the middle case (1 ≤dim(X)≤dim(Y)) see [17, p.
2010].
v. W.l.o.g. Xis non-zero (otherwise the angle on the left hand side is zero
and there is nothing to prove). Also, w.l.o.g. dim(X)≤dim(Y⊥∩ Z) (otherwise
dim(X)>dim(Y⊥∩Z) implies
dim(X ⊕Y) = dim(X) + dim(Y)>dim(Y⊥∩Z) + dim(Y)
≥(dim(Y⊥) + dim(Z)−n) + dim(Y) = dim(Z),
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 5
so both angles in the claimed inequality are π/2 and there is nothing to prove). So
1≤dim(X)≤dim(Y⊥∩Z) which implies that Zand Y⊥are non-zero. W.l.o.g., also
Yis non-zero (otherwise both angles in the claimed inequality reduce to ∠↷
max(X,Z)
and there is nothing to prove). Finally we can assume w.l.o.g. that dim(X ⊕Y) =
dim(X)+dim(Y)≤dim(Z) (otherwise the angle on the right-hand side in the claimed
inequality is π/2 and there is nothing to prove).
Let X∈Cn×dim(X),Y∈Cn×dim(Y),Z=∈Cn×dim(Z)be orthonormal bases of
X,Y,Z, respectively, where Z= [Z1, Z2] is chosen such that ran(Z1) = Y⊥∩Z. Then
[X, Y ] is an orthonormal basis of X⊕Y, since X ⊥ Y. Moreover, W:= [X, Y ]H[Z1, Z2]
is of the form [W11 W12
0W22 ], where W11 =XHZ1does not have less columns than rows
(since 1 ≤dim(X)≤dim(Y⊥∩Z)) and W22 =YHZ2has full column rank. Also W
does not have less columns than rows (since 1 ≤dim(X ⊕Y)≤dim(Z)). Hence, with
part iv. we have
cos ∠↷
max(X,Y⊥∩Z) = σdim(X)(W11) and σdim(X⊕Y)(W) = cos ∠↷
max(X ⊕Y,Z).
Using the monotonically decreasing behavior of the cosine in the interval [0,π
2] all
that remains to prove is σdim(X)(W11)≥σdim(X⊕Y)(W). We distinguish two cases. If
W22 has more rows than columns, then the rows of Wmust be linearly dependent,
i.e., σdim(X⊕Y)(W) = σmin(W) = 0. Otherwise W22 is square and we have by the
interlacing property of eigenvalues of Hermitian matrices [22] that
σdim(X)(W11)2=λdim(X)(WH
11W11)≥λdim(X)+dim(Y)([WH
11W11 ∗
∗ ∗])
=λdim(X)+dim(Y)(WHW) = σdim(X⊕Y)(W)2.
Theorem 2.2. [30, Theorem 2.7] Let X∈Cn×m, X⊥∈Cn×n−m, Z ∈Cn−m×m
such that [X, X⊥]is unitary. Let X:= ran(X)and ˜
X:= ran(X+X⊥Z). Then
tan ∠↷
max(X,˜
X) = ∥Z∥2.
2.2. The spectrum and its perturbation. The set of eigenvalues of a matrix
A, i.e., its spectrum, is denoted by eig(A). A subspace X ∈ Cnis called an invariant
subspace of A∈Cn×nif AX ⊂ X. When Ais Hermitian, choosing orthonormal bases
X∈Cn×m, X⊥∈Cn×n−mof an invariant subspace Xand its orthogonal complement
X⊥, respectively, gives rise to a block spectral decomposition of Aof the form
(2.1) [X, X⊥]HA[X, X⊥] = [A11 0
0A22]
with A11 ∈Cm×m,A22 ∈Cn−m×n−mand eig(A) = eig(A11)∪eig(A22). The invariant
subspace Xis called simple if eig(A11)∩eig(A22) = ∅.
The following result is known as Weyl’s theorem for the 2-norm, e.g., [34, Corollary
4.10], [12], and as Hoffman-Wielandt theorem for the Frobenius norm, e.g., [12,38].
Theorem 2.3. Let A,E∈Cn×nbe Hermitian. Let λ1≥λ2≥... ≥λnbe the
eigenvalues of Aand ˜
λ1≥˜
λ2≥. . . ≥˜
λnthe eigenvalues of A+E. For ∗ ∈ {2, F }
we have
∥diag(λ1, . . . , λn)−diag(˜
λ1, . . . , ˜
λn)∥∗≤ ∥E∥∗.
6Ute Kandler Christian Schr¨oder
The following theorem is a generalization of the well-known Davis-Kahan tan(θ)
theorem which imposes, compared to the original formulation [7, Theorem 6.3] relaxed
conditions on the spectrum.
Theorem 2.4. [20, Theorem 2] Let A∈Cn×nbe a Hermitian matrix and let
X= [X1, X2, X3]be unitary so that XHAX = diag(A11, A22, A33)is block diagonal,
where Xi∈Cn×ni,Aii ∈Cni×nifor i= 1,2,3with n1+n2+n3=n. Let ˜
X3∈Cn×n3
have orthogonal columns and let R=A˜
X3−˜
X3˜
A3, where ˜
A3=˜
XH
3A˜
X3. Suppose
that eig(A11)lies in [a, b]and eig( ˜
A3)lies in the union of (−∞, a −δ]and [b+δ, ∞).
Then
(2.2) tan ∠↷
max(˜
X3,[X2, X3]) ≤∥R∥2
δ.
Remark 2.5. In [20, Theorem 2] XHAX was assumed diagonal (instead of just
block diagonal). Our formulation holds, because the change of bases that diagonalizes
A11,A22, and A33 does not influence the subspace angles.
The standard Davis-Kahan tan(θ) theorem is obtained for n2= 0.
2.3. Gap between eigenvalues and separation of subspaces. We define
the gap between closed sets Λ1, Λ2⊂Rand square matrices A,Brespectively, by
gap(Λ1,Λ2) := min
λ1∈Λ1,λ2∈Λ2|λ1−λ2|and gap(A, B) := gap(eig(A),eig(B)).
Note that with (2.1) gap(A11, A22) = gap (eig(A11),eig(A)\eig(A11)) if ran(X) is a
simple invariant subspace. The gap provides a natural way of describing the distance
between two spectra. A similar quantity, also measuring in some sense the distance
between the spectra of two arbitrary square matrices A∈Cn×n, B ∈Cm×m, is given
by the separation
(2.3) sep∗(A, B) := min
Z∈Cm×n
∥Z∥∗=1 ∥AZ −ZB∥∗,where ∗ ∈ {2, F }.
Furthermore, the separation between two subspaces X,Y ∈ Cnwith respect to a
matrix A∈Cn×nis defined by
sepA,∗(X,Y) := sep∗(XHAX, Y HAY ),
where Xand Yare any orthonormal bases for Xand Y. This quantity is well defined
because the norms used in (2.3) are unitarily invariant, cf. [31]. For Hermitian matrices
the sep and the gap operators are related as follows.
Lemma 2.6. Let A11 ∈Cn×nand A22 ∈Cm×mbe Hermitian. Then
2
πgap(A11, A22)≤sep2(A11, A22)≤gap(A11, A22) = sepF(A11, A22).
Proof. The first inequality is discussed in [4, p.15] leveraging a function-theoretical
result in [35]. The second inequality is obtained by restricting Zin (2.3) to the form
Z=uivH
j, where ui∈Cnand vj∈Cmare a unit (right) eigenvector of A11 and a
unit (left) eigenvector of A22, respectively. The last relation is stated in [34, Theorem
3.1].
Remark 2.7. The second inequality of Lemma 2.6 is also valid in the case of
non-Hermitian matrices A11 and A22. In that case the words “right” and “left” put
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 7
in parentheses in the above proof become important. The first and the last relation
in Lemma 2.6 also hold for normal A11 and A22 (if 2
πis replaced by 4
π), but do not
carry over to the non-normal case.
The following theorem is a corollary of Theorem 2.4 and provides a measure how
well the invariant subspace ran(X1) of Ais contained in the invariant subspace of
larger dimension ran([ ˜
X1,˜
X2]) of a perturbed matrix A+E. Note that it is even
applicable if the eigenvalues corresponding to ran(X1) are not well separated from
the remaining ones.
Theorem 2.8. Let A, E ∈Cn×nbe Hermitian. Let Xi,˜
Xi∈Cn×nifor i= 1,2,3
with n1+n2+n3=nbe such that X:= [X1, X2, X3],˜
X:= [ ˜
X1,˜
X2,˜
X3]are unitary
and
(2.4) XHAX = diag(A11, A22, A33),˜
XH(A+E)˜
X= diag( ˜
A11,˜
A22,˜
A33)
are block diagonal.
i) If gap(env(eig(A11)),eig( ˜
A33)) >∥E∥2then
tan ∠↷
max (X1,[˜
X1,˜
X2])≤∥E∥2
gap(A11,˜
A33)−∥E∥2
.
ii) If gap(env(eig(A11)),eig(A33)) >2∥E∥2and max
˜
λ∈eig( ˜
A33)
min
λ∈eig(A33)|˜
λ−λ| ≤ ∥E∥2
then
tan ∠↷
max (X1,[˜
X1,˜
X2])≤∥E∥2
gap(A11, A33)−2∥E∥2
.
iii) If gap(env(eig( ˜
A11)),eig( ˜
A33)) >2∥E∥2and max
λ∈eig(A11)min
˜
λ∈eig( ˜
A11)|˜
λ−λ| ≤
∥E∥2then
tan ∠↷
max (X1,[˜
X1,˜
X2])≤∥E∥2
gap( ˜
A11,˜
A33)−2∥E∥2
.
Proof. Defining ˜
A3:= ˜
XH
3A˜
X3=˜
XH
3(A+E)˜
X3−˜
XH
3E˜
X3=˜
A33 −˜
XH
3E˜
X3we
get
R:= A˜
X3−˜
X3˜
A3= (A+E)˜
X3−E˜
X3−˜
X3(˜
A33 −˜
XH
3E˜
X3)
=−E˜
X3+˜
X3˜
XH
3E˜
X3=−(I−˜
X3˜
XH
3)E˜
X3,
which implies ∥R∥2≤ ∥E∥2. Since ˜
A3and ˜
A33 differ by ˜
XH
3E˜
X3their eigen-
values differ by at most ∥E∥2by Theorem 2.3, implying gap(eig(A11),eig( ˜
A3)) ≥
gap(eig(A11),eig( ˜
A33)) −∥E∥2. Together with the assumption
gap (env(eig(A11)),eig( ˜
A33))>∥E∥2
we deduct that there is no eigenvalue of ˜
A3in the interval [λmin(A11)−gap(A11,˜
A3),
λmax(A11) + gap(A11,˜
A3)], where λmin(A11) := min(eig (A11)) and λmax(A11) :=
max (eig(A11)). Hence Theorem 2.4 is applicable and leads to
∠↷
max(˜
X3,[X2, X3]) ≤∥R∥2
gap(A11,˜
A3)≤∥E∥2
gap(A11,˜
A3)≤∥E∥2
gap(A11,˜
A33)−∥E∥2
.
8Ute Kandler Christian Schr¨oder
From Lemma 2.1.iii it follows that
∠↷
max(X1,[˜
X1,˜
X2]) = ∠↷
max([ ˜
X1,˜
X2]⊥, X⊥
1) = ∠↷
max(˜
X3,[X2, X3]),
which concludes the proof of part i).
For part ii) we use that by assumption max˜
λ∈eig( ˜
A33)minλ∈eig(A33)|˜
λ−λ| ≤ ∥E∥2
for every eigenvalue of ˜
A33 there exists an eigenvalue of A33 that is no further away
than ∥E∥2. This implies gap(A11, A33)≥gap(A11,˜
A33)−∥E∥2. Together with part
i) we obtain the assertion. Analogously, also part iii) follows from part i).
Remark 2.9. The assumption in part ii) that max˜
λ∈eig( ˜
A33)minλ∈eig(A33)|˜
λ−
λ| ≤ ∥E∥2and the similar condition in part iii) are not very restrictive. By Theo-
rem 2.3, the eigenvalues of Adiffer from the corresponding ones of A+Eby at most
∥E∥2. So the assumption is mainly a restriction on the distribution of eig(A+E) into
eig( ˜
A11), eig( ˜
A22), and eig( ˜
A33).
2.4. Ritz and Krylov subspaces. Let A∈Cn×n,V ⊂ Cnbe a subspace,
Y∈Cn×k, and M∈Ck×k. The pair (M, Y ) is called a Ritz pair of Awith respect to
Vif it satisfies the Galerkin condition
AY −Y M ⊥ V and ran(Y)⊂ V.
A subspace Yis called a Ritz space of Ain Vif for some basis Y∈Cn×kof Y
there is an M∈Cdim(Y)×dim(Y)such that (M, Y ) is a Ritz pair of Awith respect
to V. The eigenvalues of such an Mare well-defined and are called Ritz values of A
corresponding to Y.
AKrylov relation of Aof order kis a relation of the form
(2.5) AVk=VkBk+vk+1bH
k+1,
where Bk∈Ck×k,bk+1 ∈Ck, and the columns of [Vk, vk+1]∈Cn×k+1 are orthonor-
mal.
Relation (2.5) implies that ran(Vk) is a Krylov subspace of A. But it does not
imply that ran(Vi) = ran([v1, . . . , vi]) is a Krylov subspace of Afor i<k. It does
if Bkis Hessenberg and bk+1 =bk+1,kek. The following two theorems are the main
basis of our results.
Theorem 2.10. [24, Theorem 6.3] [23] Let the eigenvalues λiof the Hermitian
matrix Abe ordered decreasingly. Then the angle between the exact eigenvector zi
associated with λiand the k-th Krylov subspace Kksatisfies the inequality
(2.6) tan ∠(zi,Kk)≤θi
ψk−i(1 + 2ηi)tan ∠(v1, zi),
where
(2.7) θ1= 1, θi=
i−1
∏
j=1
λj−λn
λj−λi
for i > 1, ηi=λi−λi+1
λi+1 −λn
and ψk−iis the Chebychev polynomial of degree k−i.
Theorem 2.11. [31, Theorem 2] Let Xbe an eigenspace of A∈Cn×nand let
Kbe a subspace in Cn. Let Ybe a Ritz space in Kand let Y⊥be the orthogonal
complement of Yin K. Then
(2.8) sin ∠↷
max(X,Y)≤sin ∠↷
max(X,K)√1 + ∥P A(I−P)∥2
2
sepA,2(Y⊥,X)2,
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 9
where Pis the orthogonal projector onto K.
Remark 2.12. In the original formulation of Theorem 2.11 in [31] it is required
that Xand Yare of the same dimension. However, the proof given there is actually
valid for dim(X)= dim(Y).
3. Distance to inexact Krylov subspaces. Here we present our main result
concerning how well an invariant subspace X1of Ais contained in a search space ˜
Kk
which is a Krylov subspace of a perturbed matrix A+E. We consider a situation
where literations of an inexact Krylov subspace method have been carried out but
the desired eigenpair approximations cannot yet be considered converged. This leads
to the question of how many more iterations have to be carried out until convergence
can be expected. The theorem uses nested subsets of eigenvalues. An illustration is
provided in Figure 1.
Theorem 3.1. Let A∈Cn×nbe Hermitian with eigenvalues λ1≥λ2≥. . . ≥λn.
Let J1⊆J2⊆J3⊆J4be nested nonempty subsets of {1,2, . . . , n}so that max(J3)< n
and J2,J3consist of consecutive integers.
Denote by JL:= {1,2, . . . , min(J3)−1}the leading and by JT:= {max(J3)+1, . . . , n}
the trailing indices of J3.
Let Λi:= {λj:j∈Ji}and Λ−i={λj:j∈ {1, . . . n}\Ji}for i∈ {1,...,4, L, T}.
Let X1and X4be invariant subspaces of Acorresponding to Λ1and Λ4, respectively.
Let k > max(J3).
For j= 1, . . . , k let ˜
Kj:= Kj(A+E, ˜v1)for some Hermitian E∈Cn×nand some
˜v1∈Cn.
For i∈ {2,3}let ˜
Xibe an invariant subspace of A+Ecorresponding to {λj(A+E) :
j∈Ji}such that ˜
X2⊂˜
X3.
If 2∥E∥2<min {gap(Λ1,Λ−2),gap(Λ2,Λ−3),gap(Λ3,Λ−4)}then for every l= 1, . . . ,
k−|JL|, we have
∠↷
max (X1,˜
Kk)≤∠↷
max (˜
X2,˜
Kk)+δ1,2
≤arctan (ϱk,l ·tan ∠↷
max(˜
X3,˜
Kl))+δ1,2
(3.1)
≤arctan (ϱk,l ·tan≤π
2(∠↷
max(X4,˜
Kl) + δ3,4))+δ1,2
(3.2)
where tan≤π
2(α) := tan(min{α, π
2})and
ϱk,l := (∑
i∈J2
˜
θ2
i
ψk−l−|ΛL|(1 + 2˜ηi)2)1
2
, δi,j := arctan (∥E∥2
gap(Λi,Λ−j)−2∥E∥2),
(3.3)
˜
θi:= ∏
j∈JL
|λj−λn|+ 2∥E∥2
|λj−λi|−2∥E∥2
>0,˜ηi:= gap(λi,ΛT)−2∥E∥2
spread(ΛT)+2∥E∥2
>0,
(3.4)
where ψjdenotes the Chebychev polynomial of degree j.
Proof. The proof consists of three parts, proving that
i) ∠↷
max(X1,˜
Kk)≤∠↷
max(˜
X2,˜
Kk) + δ1,2,
ii) tan ∠↷
max(˜
X2,˜
Kk)≤ϱk,l ·tan ∠↷
max(˜
X3,˜
Kl), and
iii) ∠↷
max(˜
X3,˜
Kl)≤min(π
2,∠↷
max(X4,˜
Kl) + δ3,4), respectively.
Then the claim follows because of the monotonicity of the tangent in [0,π
2].
10 Ute Kandler Christian Schr¨oder
λ(A)
λ1
λn
Λ1
Λ2
Λ3
Λ4
ΛT⊂Λ−3ΛL⊂Λ−3
Λ−1Λ−1
Λ−2Λ−2
Λ−4Λ−4
gap−4,3
gapT,2
gap−2,1gap1,−2
gap2,L
gap3,−4
Fig. 1.Illustration of the nested spectral subsets, here gap(Λi,Λj) = min{gapij ,gapji}
Part i) Using Theorem 2.8 ii) and the definition of δ1,2, respectively, we have
(3.5) tan ∠↷
max(X1,˜
X2)≤∥E∥2
gap(Λ1,Λ−2)−2∥E∥2
= tan δ1,2.
Hence, by the triangle inequality, Lemma 2.1.i, we get
(3.6) ∠↷
max (X1,˜
Kk)≤∠↷
max(X1,˜
X2) + ∠↷
max (˜
X2,˜
Kk)≤δ1,2+∠↷
max (˜
X2,˜
Kk).
Part iii) The proof of part iii) is similar to that of part i) and also uses a triangle
inequality, this time in the form of
∠↷
max (˜
X3,˜
Kl)≤∠↷
max(˜
X3,X4) + ∠↷
max (X4,˜
Kl)≤δ3,4+∠↷
max (X4,˜
Kl)
where we used Theorem 2.8 iii) for the second inequality. Also, any subspace angle
can at most be π
2by definition.
Part ii) If dim( ˜
Kl)<dim( ˜
X3) = |J3|then ∠↷
max(˜
X3,˜
Kl) = π
2and the right-hand
side is infinite. Thus there is nothing to prove in this case. Hence we may assume
that dim( ˜
Kl)≥ |J3|in the following.
We start by considering the angles between ˜
Kkand eigenvectors of A+E. So,
let ˜x1, . . . , ˜xnbe unit eigenvectors of A+Ecorresponding to the eigenvalues ˜
λ1≥
˜
λ2≥. . . ≥˜
λnsuch that ˜
X3= span{˜xj:j∈J3}and ˜
X2= span{˜xj:j∈J2}. Let
˜
ΛT:= {˜
λj:j∈JT}.
For i∈J2let Vi:= span{˜xj:j∈J3, j =i}=˜
X3∩span(˜xi)⊥. We note
that ˜
Kl∩V⊥
i, being an intersection of a (≥ |J3|)-dimensional and an (n−|J3|+ 1)-
dimensional subspace, is at least one-dimensional. Thus there exists a nonzero vector
vi∈˜
Kl∩V⊥
isuch that ∠(˜xi, vi) = ∠(˜xi,˜
Kl∩V⊥
i). Let
Ai:= A+E−∑
j∈J3
j=i
(˜
λj−˜
λn)˜xj˜xH
j.
Note that the matrices Aiand A+Ehave the same eigenvectors. Also the eigenvalues
˜
λjfor j∈({1, . . . , n}\J3)∪ {i}coincide. The remaining eigenvalues of A+Eare
moved to ˜
λn∈˜
ΛT. Then Theorem 2.10, applied to Aiand vi, yields
(3.7) tan ∠(˜xi,Kk−l+1(Ai, vi)) ≤ρi:= θi
ψk−l−|JT|(1 + 2ηi)tan ∠(˜xi, vi),
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 11
with
(3.8) θi=∏
j∈JL
|˜
λj−˜
λn|
|˜
λj−˜
λi|, ηi=|˜
λi−˜
λmin(JT)|
|˜
λmin(JT)−˜
λn|=gap(˜
λi,˜
ΛT)
spread(˜
ΛT).
Let gi∈ Kk−l+1(Ai, vi) be such that ∠(˜xi, gi) = ∠(˜xi,Kk−l+1(Ai, vi)). W.l.o.g. gi
is scaled such that ˜xH
igi= 1 (otherwise, if ˜xH
igi= 0 we can rescale gi, if ˜xH
igi= 0
it follows with Lemma 2.1.vthat π/2 = ∠(˜xi, gi) = ∠(˜xi,˜
Kl∩V⊥
i)≤∠↷
max(˜
X3,˜
Kl),
i.e., ∠↷
max(˜
X3,˜
Kl) = π/2 such that the right-hand side of part ii) is infinite, i.e., in
this case is nothing to prove). Note that since vi⊥ Vi, we have Aivi= (A+E)vi
and Aivi⊥ Vi. Thus by induction Kj(Ai, vi) = Kj(A+E, vi) and Kj(Ai, vi)⊥ Vifor
every j= 1,2, . . .. Hence, gi⊥ Vithus there exist gL,i ∈C|JL|and gT,i ∈C|JT|such
that
gi= [˜x1, . . . , ˜x|JL|]gL,i + ˜xi+ [˜xmin(JT), . . . , ˜xn]gT,i.
Using Theorem 2.2, the definition of gi, and (3.7) leads to∥[gT
L,i, gT
T,i]T∥2=tan ∠(gi,˜xi)
≤ρi.Since the subspaces ˜
Kjare nested and vi∈˜
Kl, we have Kk−l+1(A+E, vi)⊂
Kk(A+E, ˜v1) = ˜
Kk. Thus gi∈˜
Kk. Since ˜
Kkis independent of i, it contains gifor
all i∈J2. Thus it contains the subspace ran(G) where
G:= [gmin(J2), . . . , gmax(J2)] = [˜x1, . . . , ˜xn]
GL
0
I|J2|
0
GT
with GL:= [gL,min(J2), . . . , gL,max(J2)]∈C|JL|×|J2|,GT:= [gT,min(J2), . . . , gT,max(J2)]
∈C|JT|×|J2|and the upper and lower zero blocks are of format (min(J2)−min(J3)) ×
|J2|and (max(J3)−max(J2))×|J2|, respectively. Using Lemma 2.1.ii and Theorem 2.2
we have
(3.9) tan ∠↷
max(˜
X2,˜
Kk)≤tan ∠↷
max(˜
X2,ran(G)) =
[GL
GT]
2
.
Further transformations yield
(3.10)
[GL
GT]
2≤
[GL
GT]
F≤(∑
i∈J2
ρ2
i)1
2
.
Applying Theorem 2.3, we have for i=j:|λi−λj|−2∥E∥2≤ |˜
λi−˜
λj| ≤ |λi−λj|+
2∥E∥2. Thus θi≤˜
θiand ηi≥˜ηi, for θi, ηias in (3.8) and ˜
θi,˜ηias in (3.4). Also, ˜
θi
and ˜ηiare positive by the assumed bound on ∥E∥2. Moreover, ψk−l−|JL|(1 + 2ηi)>
ψk−l−|JL|(1 + 2˜ηi), because Chebychev polynomials are positive and monotonically
increasing in [1; ∞). Furthermore, since ˜
X3= span(˜xi)⊕Viwe have by Lemma 2.1.v
that ∠(˜xi,˜
Kl∩V⊥
i)≤∠↷
max(˜
X3,˜
Kl), hence ∠(˜xi, vi)≤∠↷
max(˜
X3,˜
Kl). Thus
(3.11) ρi≤˜
θi
ψk−l−|JL|(1 + 2˜ηi)tan ∠↷
max(˜
X3,˜
Kl).
12 Ute Kandler Christian Schr¨oder
Combining (3.9), (3.10), and (3.11) we have
(3.12) tan ∠↷
max(˜
X2,˜
Kk)≤(∑
i∈J2
˜
θ2
i
ψk−l−|JL|(1 + 2˜ηi)2)1
2
tan ∠↷
max(˜
X3,˜
Kl)
=ϱk,l tan ∠↷
max(˜
X3,˜
Kl).
This concludes the proof of part ii) and of the whole theorem.
The following remarks are in order.
Remark 3.2. Theorem 3.1 somewhat resembles Theorem 2.10, but holds several
improvements. i) The search space is chosen as Krylov subspace of a perturbed matrix
A+E(instead of Aitself). Thus the setting of inexact Krylov methods is covered.
ii) Instead of just eigenvectors we consider invariant subspaces. This allows to treat
clusters of eigenvalues as a whole. iii) The dimension lof the Krylov subspace on
the right-hand side is allowed to be larger than one. (Note that ∠(v1, zi) in (2.6)
could be written as ∠(vi,K1).) This is necessary for the theorem to be meaningful
as for l < dim(X4) the right-hand side is infinite. Moreover, this might be useful if
information about the angle ∠↷
max(˜
X3,˜
Kl) or ∠↷
max(X4,˜
Kl) is available for some l. iv)
Theorem 3.1 is still useful even if the wanted eigenvalues Λ1are not well separated
from the rest of the spectrum, i.e., if gap(Λ1,Λ−1)−2∥E∥2is tiny or even negative.
This is achieved by considering four possibly different (but nested) sets of eigenvalues,
Λ1, . . . , Λ4. All that matters is that the gaps between Λiand the complement of Λi+1,
i.e., gap(Λ1,Λ−2), gap(Λ2,Λ−3), gap(Λ3,Λ−4), are well larger than 2∥E∥2. We stress
in particular that gap(Λi,Λ−i) may even be zero, i.e., Xiis not required to be simple.
Finally, Theorem 3.1 is a generalization of Theorem 2.10 and reduces to the latter in
case ∥E∥= 0, l= 1, J1=J2=J3=J4={i}.
Remark 3.3. The constant ϱk,l decreases exponentially fast as kgrows. Indeed,
from ψk(x) = cosh(k·arcosh(x)) for |x| ≥ 1, cosh(x) = 1
2(ex+e−x)>1
2ex, and
arcosh(x) = ln(x+√x2−1) for x≥1 we deduct that ψk(1 + 2η)>1
2(1 + 2η+
2√η+η2)k. This indicates at least linear convergence of ϱk,l towards zero.
Remark 3.4. In order to get an idea of the qualitative behavior of bound (3.2)
we consider the following. For small arguments 0 < a ≪1 we have arctan(a)≈aand
arctan(a)≤a. Since in the later stages of the iteration ϱk,l is small we may replace
arctan(·) by the identity in (3.2). Doing so, taking logarithms, and substituting
γi:= 1 + 2˜ηi+ 2√˜ηi+ ˜η2
i,τl:= tan≤π
2(∠↷
max(X4,˜
Kl) + δ3,4)results in
log (∠↷
max (X1,˜
Kk))≤log
τl(∑
i∈J2
˜
θ2
i
1
4γ2(k−l−|ΛL|)
i)1
2
+δ1,2
.
Moreover, using log(a+b)≈max(log(a),log(b)) and some algebraic transformations
the right hand side can be approximated by
max (log(δ1,2),log (τl)+max
i∈J2(log(˜
θi)−log (1
2)+(l+|ΛL|) log(γi)−klog(γi))).
On the other hand, in the beginning of the iteration the argument of the arctan
in (3.2) is usually large and the only save bound for ∠↷
max (X1,˜
Kk)is π
2. Together
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 13
k
init.
phase
convergence
phase
stagnation
phase
αi−kβi
log(δ1,2)
log(π
2)
log ∠y
max(X1,˜
Kk)
Fig. 2.Qualitative shape of the bound (3.2), see Remark 3.4
with the previous consideration we get
(3.13) log (∠↷
max (X1,˜
Kk))⪅min (log (π
2),max (log(δ1,2),max
i∈J2
(αi−kβi)))
with αi= log (τl) + log(˜
θi)−log(1
2)+(l+|ΛL|) log(γi), βi= log(γi)>0, and a⪅b
meaning that “ais usually smaller than b, it can be a bit larger, but not much”.
So, the logarithm of ∠↷
max(X1,˜
Kk) can be approximately bounded by a set of
straight lines, two horizontal and a few decreasing ones, see Figure 2for an illustration.
We can identify three phases. i) During the first iteration steps all the decreasing lines
lie above the π
2-level. Hence the minimum is assumed by log (π
2). We call this phase
the initialization phase. ii) In the following iteration steps the decreasing lines fall
below the π
2-level and become the important terms. Here the largest of the decreasing
lines defines the bound. We obtain a strict monotonically decreasing, piecewise linear,
convex shape. This phase is called the convergence phase. It is well possible that only
one line dominates for the whole convergence phase, i.e., there is no bend, see the
numerical examples in Section 5. iii) In the last phase the maximum is assumed
by log(δ1,2). We denote the last section by stagnation phase because the angle of
inclusion does not decrease anymore.
Remark 3.5. Theorem 3.1 offers some freedom in choosing the index sets Ji.
J1is fixed to the indices of the wanted eigenvalues. J2, J3, J4are free and could thus
be chosen to minimize the bounds (3.1), (3.2). Here the various relations between
the index sets J1, J2, J3, J4influence the constants δ12,˜
θi, ˜ηiand δ34 in different ways.
For example extending J2leads to a larger gap(Λ1,Λ−2) and thus to a decreased δ12.
of (3.2). Similar extending J3with respect to J2will improve ˜
θiand ˜ηi. Finally
extending J4with respect to J3will improve δ34.
Remark 3.6. Theorem 3.1 still holds if the eigenvalues are sorted in increasing
order, λ1≤λ2≤. . . ≤λn. This can be seen by applying the theorem to −Aand using
that all the constants ˜
θi,˜ηi, δi,j and ϱk,l are invariant under negating the eigenvalues.
Remark 3.7. Although this case is irrelevant in practice, we mention that
Theorem 3.1 still holds if the denominator of ˜ηiis zero, i.e., if ∥E∥2= 0 = spread(ΛT).
In this case ˜ηi=∞for all i∈J2and ϱk,l = 0. Then ∠↷
max(X1,˜
Kk)≤δ1,2for all
14 Ute Kandler Christian Schr¨oder
k > l +|JL|.
Remark 3.8. Theorem 3.1 relates four angles to one another: ∠↷
max(X1,˜
Kk),
∠↷
max(˜
X2,˜
Kk), ∠↷
max(˜
X3,˜
Kl), and ∠↷
max(X4,˜
Kl). The relation ”looks nicest“ when
restricted to the first and the last of these. (Especially since the middle two angles
involve invariant subspaces of A+Ewhich seem hard to obtain in practice.) The
reason to include all four angles in Theorem 3.1 is that i) ∠↷
max(˜
X2,˜
Kk) appears in
the following Theorem 4.1 and ii) ∠↷
max(˜
X3,˜
Kl) may actually be bounded in practice.
Remark 3.9. The formulation of Theorem 3.1 and Figure 1may seem to
suggest that the desired eigenvalues can be in the center of the spectrum (and the
theorem does hold in this case). However, the constants ˜
θi,i∈J2rapidly grow for
an increasing number of leading eigenvalues |JL|. Hence the theorem is useful only
if fairly exterior eigenvalues are wanted (say, when |JL|is not larger than two or
three). For example considering the matrix constructed by the MATLAB command
A= diag(100 :−1:1), ∥E∥2= 10−10 and choose Λ2= Λ3= Λ4= Λ1.Then for
•Λ1={90,91, . . . , 100}we get ˜
θmax = 1,
•Λ1={90,91}we get ˜
θmax = 1.731 ·1012,
•Λ1={50,51}we get ˜
θmax = 5.045 ·1028
with ˜
θmax := maxi∈J2˜
θi. Hence, already if we are interested in the 9th and 10th
largest eigenvalue of ATheorem 3.1 becomes little meaningful. Note, if largest Λ3
contains all the largest eigenvalues then ˜
θmax = 1.
4. Distance to inexact Ritz spaces. So far we have considered how well the
exact eigenspace X1is contained in the search space ˜
Kk. If ∠↷
max(X1,˜
Kk) is small then
there is a subspace of ˜
Kkthat is close to X1. However, in practice this subspace in ˜
Kk
is not known and X1is approximated by a Ritz space ˜
Yin ˜
Kk. Hence, in this section
we address the question: How much worse is the distance to a Ritz space compared
to the distance to the whole search space?
Theorem 4.1. Let A, E, Λ1,Λ2,Λ−2,X1,˜
X2and δ1,2be defined as in Theo-
rem 3.1.
Let Bk∈Ck×k,bk+1 ∈Ck, and Vk∈Cn×kbe such that the Krylov relation (2.5)for
A+Eis satisfied and ˜
Kk= ran(Vk).
Let ˜
Ybe a Ritz space of A+Ein ˜
Kk.
Let ˜
Mbe the Ritz values corresponding to ˜
Yand let ˜
M−be the set of k−dim( ˜
Y)
remaining Ritz values, with ∥E∥2<gap( ˜
M−,Λ2).
Then
(4.1)
∠↷
max(X1,˜
Y)≤δ1,2+ arcsin≤1(√1 + π2∥bk+1∥2
2
4(gap( ˜
M−,Λ2)−∥E∥2)2sin ∠↷
max(˜
X2,˜
Kk))
where arcsin≤1(x) := arcsin(min{1, x}).
Proof. By Lemma 2.1.iwe have
(4.2) ∠↷
max(X1,˜
Y)≤∠↷
max(X1,˜
X2) + ∠↷
max(˜
X2,˜
Y).
Now we treat each term of the right-hand side separately. Using Theorem 2.8 ii)
and (3.3), respectively, as in Theorem 3.1, the first one is bounded by
(4.3) ∠↷
max(X1,˜
X2)≤arctan ∥E∥2
gap(Λ1,Λ−2)−2∥E∥2
=δ1,2.
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 15
Applying Theorem 2.11 to the second right-hand side angle in (4.2) results in
(4.4) ∠↷
max(˜
X2,˜
Y)≤arcsin≤1(√1 + ∥P(A+E)(I−P)∥2
2
sepA+E,2(˜
Y⊥,˜
X2)2sin ∠↷
max(˜
X2,˜
Kk)),
where Pis the orthogonal projector onto ˜
Kkand ˜
Y⊥is the orthogonal complement
of ˜
Yin ˜
Kk. Choosing V⊥∈Cn×n−ksuch that [Vk, V⊥] is unitary, we have for the
numerator ∥P(A+E)(I−P)∥2
2that
(4.5) ∥P(A+E)(I−P)∥2
2=∥VH
k(A+E)V⊥∥2
2=∥VH
⊥(A+E)Vk∥2
2
=∥VH
⊥VkBk+VH
⊥vk+1bH
k+1∥2
2=∥bk+1∥2
2,
where relation (1.1) and the fact that A+Eis Hermitian was used. Moreover, applying
Lemma 2.6 to the denominator sepA+E,2(˜
Y⊥,˜
X2)2leads to
sepA+E,2(˜
Y⊥,˜
X2)2≥4
π2gap( ˜
M−,˜
Λ2)2≥4
π2(gap( ˜
M−,Λ2)−∥E∥2)2,
with ˜
Λ2:= {λj(A+E) : j∈J2}where J2is defined as in Theorem 3.1. Together
with (4.5) we obtain
(4.6)
∠↷
max(˜
X2,˜
Y)≤arcsin≤1
v
u
u
t1 + π2∥bk+1∥2
2
4(gap( ˜
M−,Λ2)−∥E∥2)2sin ∠↷
max(˜
X2,˜
Kk)
.
Finally, inserting (4.3) and (4.6) into (4.2) completes the proof.
Remark 4.2. Again, in order to deal with small gaps between the wanted and
unwanted eigenvalues, we have to allow the Ritz space ˜
Yto be of larger dimension
than the wanted invariant subspace X1.
Remark 4.3. One distinction between the Theorems 3.1 and 4.1 is the amount
of information necessary to evaluate the bounds corresponding to the kth step of the
Krylov method. Theorem 3.1 is an a priori result as it only needs the information of
the lth step (with l < k). In contrast, Theorem 4.1 is an a posteriori result, because
the vector bk+1 and the Ritz values at the kth step are required. It is unclear how an
a priori result bounding ∠↷
max(X1,˜
Y) could look like.
5. Numerical results. In this section we verify the spectral error bounds pre-
sented in sections 3and 4using two different test matrices. The first one is constructed
to evaluate how changes in the eigenvalue distribution influence the behavior of the
bounds. The second test matrix comes from the application mentioned in the intro-
duction.
To obtain the Krylov relation (1.1) we apply an exact Arnoldi method, cf. [8,32],
to a matrix A+E, where Eis chosen as a Hermitian, random matrix of prescribed
norm.
5.1. Academic problem. We use a diagonal matrix Abuilt by the MATLAB
command
(5.1) A=diag([9,9−g,7,6,5,4,4−g,2,rand(1,992)]),
where g∈Ris a parameter. This is a 1000 ×1000 matrix with the eigenvalues
9,9−g, 7,6,5,4,4−g, 2 and additionally 992 eigenvalues in the interval (0,1). We
16 Ute Kandler Christian Schr¨oder
0 5 10 15 20 25 30 35 40 45
10−13
10−11
10−9
10−7
10−5
10−3
10−1
101
number of Arnoldi steps k
true value, ε=10−4
true value, ε=10−7
true value, ε=10−11
bound (3.2), ε=10−4
bound (3.2), ε=10−7
bound (3.2), ε=10−11
(a) case (A), Λ2= Λ3= Λ4= Λ1
0 5 10 15 20 25 30 35 40 45
10−13
10−11
10−9
10−7
10−5
10−3
10−1
101
number of Arnoldi steps k
true value, ε=10−4
true value, ε=10−7
true value, ε=10−11
bound (3.2), ε=10−4
bound (3.2), ε=10−7
bound (3.2), ε=10−11
(b) case (B), Λ1= Λ2⊂Λ3= Λ4
Fig. 3.Experiment 1: True angle of inclusion ∠↷
max(X1,˜
Kk)and corresponding bound (3.2)
with ε:= ∥E∥2/∥A∥2∈ {10−3,10−7,10−11}
are interested in the eigenspace corresponding to the eigenvalues Λ1={9−g, 7,6,5,4}.
Thus for g∈[0,2], gis the gap between the wanted and the unwanted eigenvalues
and if gis small Λ1is not well separated from the remaining spectrum of A.
Experiment 1. The first experiment explores the influence of the norm ∥E∥2of
the perturbation on the development of ∠↷
max(X1,˜
Kk) over the course of the Arnoldi
iteration. We choose g= 1 in (5.1), so the wanted and the unwanted eigenvalues
are well separated. We distinguish two cases (A) Λ2= Λ3= Λ4= Λ1and (B)
Λ1= Λ2⊂Λ3= Λ4.
For case (A) ∠↷
max(X1,˜
Kk) is plotted as black lines for ∥E∥2∈ {10−3,10−7,10−11}·
∥A∥2in Figure 3(a) . It can be observed that the angle decreases during the iteration,
although we search for eigenspaces of Ain a Krylov subspace of A+E(and not of A).
Convergence sets in after an initial phase and is linear with convergence rate indepen-
dent of ∥E∥2. But ∠↷
max(X1,˜
Kk) decreases not to zero but only to a limiting accuracy
that is on the order of ∥E∥2/∥A∥2. We see that all the curves agree in the sense that
the three phases mentioned in Remark 3.4 can be identified. The lighter lines show
the bounds of Theorem 3.1, where we choose l= 10, and Λ2= Λ3= Λ4= Λ1be-
cause the gap(Λ1,Λ−1) is large. The bound (3.2) starts to exist after the initialization
phase and correctly predicts the convergence and stagnation phases. In the stagna-
tion phase the bound overestimates the limiting accuracy only by a value of about 6.
The convergence rate (the αin ∠↷
max(X1,˜
Kk)≤α·∠↷
max(X1,˜
Kk+1) ) is overestimated
to be 0.41 (true value 0.12, both roughly computed from the observed values). This
amounts to a slow down factor of 2.5≈log(0.12)/log(0.41), i.e., reducing the angle
by a certain amount takes two and a half times less iterations than predicted by the
bound.
The convergence rate of bound (3.2) is determined by ˜ηi. According to Remark 3.5
the constant ˜ηican be improved by extending subset J3with respect to J2. This is
confirmed by Figure 3(b) where ∠↷
max(X1,˜
Kk) is plotted for Λ1= Λ2and Λ3= Λ4=
Λ1∪{9,4−a, 2}and for the same values of ∥E∥2as before. We see that during the
convergence phase the convergence rate of the true value and of the bound almost
coincide (true: 0.11, bound: 0.12).
Experiment 2. Here we investigate the sensitivity of bound (3.2) with respect
to the gap between the wanted and the remaining eigenvalues. We use the same Λ1,
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 17
case gin (5.1)δ1,2
(lower
is better)
δ3,4
(lower
is better)
˜
θmax
(lower
is better)
˜ηmin
(higher
is better)
overest.
of lim.
accuracy
slow
down
factor
(A) 104·∥E∥210−410−49·1062·10−70.33 1.96
(A) 102·∥E∥210−210−29·1082·10−90.40 2.41
(A) 2.1·∥E∥21.47 1.47 9 ·1011 2·10−12 1.00 1.5·105
(B) 2.1·∥E∥25·10−11 5·10−11 1.00 1.00 0.23 1.47
(B) 10−2·∥E∥25·10−11 5·10−11 1.00 1.00 0.23 1.49
(C) 10−2·∥E∥25·10−11 9·10−11 1.00 3.00 0.11 1.00
Table 1
Experiment 2: Constants of the bound (3.2)for case (A) (Λ1= Λ2= Λ3= Λ4), case (B)
(Λ1⊂Λ2= Λ3= Λ4) and case (C) Λ1⊂Λ2⊂Λ3= Λ4(the numbers are correct to leading digit)
g∈ {10−2,2.1,102,104}·∥E∥2and distinguish between the three cases (A) Λ1= Λ2=
Λ3= Λ4, (B) Λ1⊂Λ2= Λ3= Λ4and (C) Λ1⊂Λ2⊂Λ3= Λ4.
For the first case because of the small gap between the wanted and the unwanted
eigenvalues the bound (3.2) can be expected to be very loose. This is confirmed by
Figure 4(a), where the true angle ∠↷
max(X1,˜
Kk) and the bound (3.2) are depicted
for ∥E∥2= 10−11∥A∥2,l= 20 and g∈ {2.1,102,104} ·∥E∥2for g= 10−2∥E∥2the
assumptions of Theorem 3.1 are not fulfilled and the bound does not exist. To be
precise, as we consider three different matrices we would have to plot three different
true angles. However, (apart from a minimal deviation in the starting point of the
convergence phase by 2 −3 iteration steps) the three true angles behave the same. So
for the sake of readability, only one true angle (for g= 2.1·∥E∥2) is depicted. The
figure illustrates that in case of a small gap choosing Λ2,Λ3,Λ4to coincide with Λ1
results in a rather loose bound. In particular, we see that the smaller the gap between
the wanted and the remaining eigenvalues the less meaningful bound (3.2) becomes.
The reason for the poor quality of the bound are the large constants δ1,2,δ3,4,˜
θmax
and the small ˜ηmin (see Table 5.1 where ˜
θmax := maxi∈J2{˜
θi}and ˜ηmin := mini∈J2{˜ηi}.
As mentioned in Remark 3.5 Λ2, . . . , Λ4may be chosen to optimize δ1,2,δ3,4,˜
θmax
and ˜ηmin in (3.2).
In case (B) we consider again g= 2.1·∥E∥2, but choose now Λ2= Λ3= Λ4=
Λ1∪{9,4−g} ⊃ Λ1. The constants stated in the second part of Table 5.1 improve
substantially and lead to a much sharper bound (3.2) compared to case (A), see
Figure 4(b). The bound predicts the actual behavior of ∠↷
max(X1,˜
Kk), i.e., all three
phases are reflected by the bound. We use this setting to analyze the influence of the
parameter lon our bound. Choosing l∈ {15,22,30}we see that the larger the number
of iteration steps lthat already have been carried out, the sharper the corresponding
bound in the convergence phase. In this phase the convergence rate of the bound
underestimates the true value (≈0.11) to be 0.23, which amounts to a slow down
factor of 1.47. Afterwards in the stagnation phase the bounds overestimate the actual
value only by a constant factor of about 3 (independent of l).
The constants of bound (3.2), stated in the second row of case (B) in Table 5.1,
indicate that even for a matrix where the gap between the wanted and the remaining
eigenvalues is smaller than the norm of the perturbation, i.e., g= 10−2∥E∥2the
bound (3.2) is meaningful.
We note that in case (C) the bound is further improved choosing Λ1,Λ2as in
18 Ute Kandler Christian Schr¨oder
0 5 10 15 20 25 30 35 40 45 50
10−11
10−9
10−7
10−5
10−3
10−1
101
number of Arnoldi steps k
true value, ε=10−11
bound (3.2), a=2.1 · kEkk2
bound (3.2), a=102· kEkk2
bound (3.2), a=104· kEkk2
(a) g∈ {2.1,102,104} · ∥E∥2,l= 20 and Λ2=
Λ3= Λ4= Λ1
0 5 10 15 20 25 30 35 40 45 50
10−11
10−9
10−7
10−5
10−3
10−1
101
number of Arnoldi steps k
true value, ε=10−11
bound (3.2), l=15
bound (3.2), l=22
bound (3.2), l=30
(b) g= 2.1· ∥E∥2,l∈ {15,22,30}and Λ2=
Λ3= Λ4⊃Λ1.
Fig. 4.Experiment 2: True angle of inclusion ∠↷
max(X1,˜
Kk)and its bounds (3.2)
case (B) and Λ3= Λ4= Λ2∪{2}. As stated in Remark 3.5 this improves ˜ηiwhich
leads to more accurate convergence rate of the bound. Here the convergence rate of
the bound and of the true value almost coincide, i.e., the slow down factor is about
1.00.
Experiment 3. In the third experiment we investigate how well the exact invariant
subspace X1is included in a Ritz space ˜
Yusing Theorem 4.1, i.e., we are interested
in the quality of the computed subspace ˜
Y. The subspace of interest X1still corre-
sponds to the eigenvalues Λ1:= {9−g, 7,6,5,4}. We choose Λ2:= Λ1∪ {9,4−g},
∥E∥2= 10−13∥A∥2and g= 10−11, i.e., the gap between the wanted and the re-
maining eigenvalues is very small. The approximate eigenpair ( ˜
M, ˜
Y) is obtained via
calculating a Ritz pair of A+E.
In this experiment we again distinguish between two cases: In case (A) we consider
the Ritz space corresponding to the 2nd to 6th largest Ritz values, i.e., the subspace
is of the same dimension as X1and the wanted eigenvalues Λ1are not well separated
from the unselected Ritz values ˜
M−. This leads to a very slow convergence of the
angle ∠↷
max(X1,˜
Y). Referring to Figure 5∠↷
max(X1,˜
Y) does not start to converge until
the 27th iteration step and decreases only to 10−3afterwards. Moreover, the angle
∠↷
max(X1,˜
Y) wiggles around after the convergence phase, which is also caused by the
insufficient separation of the wanted and remaining Ritz values. The bound (4.1)
captures all three phases correctly, although it lags behind 5 iteration steps. After
the convergence phase the bound overestimates the true value by a factor of about
1.5.
In case (B) the subspace ˜
Ycorresponds to the 8 largest Ritz values. Now, there
is a sufficiently large gap between the wanted eigenvalues Λ1und the unselected Ritz
values ˜
M−. We see in Figure 5that now the angle ∠↷
max(X1,˜
Y) starts to converge
earlier after 22 iteration steps and decreases afterwards to limiting accuracy on the
order of ∥E∥2/∥A∥2. The bound (4.1) captures all three phases initialization, con-
vergence and stagnation very well. During the convergence phase the convergence
rates of the bound and the true value differ only by 0.005 and afterwards the bound
overestimates the true value by a constant factor of about 35.
5.2. Ising model. Returning to the motivation of this paper we consider as
a second example the one-dimensional quantum Ising model in the presence of a
transverse field as defined in [25,26]. The Hamiltonian of a system of dtwo-level
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 19
0 5 10 15 20 25 30 35 40 45 50
10−18
10−16
10−14
10−12
10−10
10−8
10−6
10−4
10−2
100
102
number of Arnoldi steps k
true ∠y
max(X1,˜
Y) (dim(X1) = dim(˜
Y))
true ∠y
max(X1,˜
Y) (dim(X1)<dim(˜
Y))
bound (4.1) of ∠y
max(X1,˜
Y) (dim(X1) = dim(˜
Y))
bound (4.1) of ∠y
max(X1,˜
Y) (dim(X1)<dim(˜
Y))
Fig. 5.Experiment 3: True angles of inclusion ∠↷
max(X1,˜
Y)and corresponding bounds (4.1)
subsystems (qubits) is defined by
(5.2) A:= (1 −s)
d
∑
i=1
Ai+s
d
∑
i=1
BiBi+1
where s∈[0,1] and for i= 1, . . . , d
Ai:= −I2(i−1) ⊗[0 1
1 0]⊗I2(d−i), Bi:= −I2(i−1) ⊗[1 0
0−1]⊗I2(d−i)
and Bd+1 := B1. The dimension of the Hamiltonian Ais n= 2dgrowing exponentially
in the number of qubits. In the following experiments we choose d= 10, n= 1024
and s= 0.4. The spectrum for these values is depicted in Figure 6. We are interested
in the ground state as well as in the first and second excited state, i.e., we want to
approximate the subspace X1corresponding to Λ1consisting of the three smallest
eigenvalues of A. So J1={1,2,3}.
In Figure 6(b), we see that there is no gap between the third smallest and the
fourth smallest eigenvalue of A, i.e. Λ1is not well separated from the remaining
eigenvalues.
Experiment 4. In this experiment we illustrate the bound of Theorem 3.1. In the
first case we choose Λ2= Λ3= Λ4to consist of the four smallest eigenvalues of A,
i.e., J2=J3=J4={1, . . . , 4}and ∥E∥2= 10−11∥A||2. In Figure 7(a) the true angle
of inclusion ∠↷
max(X1,˜
Kk) and the bound (3.2) for l∈ {60,75,90}are plotted.
Qualitatively we obtain the same result as in Experiment 2 (B). The bound (3.2)
predicts all three phases initialization, convergence and stagnation of ∠↷
max(X1,˜
Kk).
As for the first test matrix in Experiment 2 we see that the larger the number of
previous iterations l, the sharper the corresponding bound in the first iteration steps.
The convergence rate is overestimated to be 0.77 (true value: 0.55). This amounts a
slow down factor of 2.31. After the convergence phase all three bounds overestimates
the true value by a constant factor of approximately 30.
20 Ute Kandler Christian Schr¨oder
0100 200 300 400 500 600 700 800 900 1,0001,100
−10
−8
−6
−4
−2
0
2
4
6
8
10
(a) all eigenvalues
0 2 46 8 10 12 14 16 18 20 22 24 26 28 30
−7
−6.5
−6
−5.5
−5
−4.5
−4
(b) the 30 smalles eigenvalues
Fig. 6.Eigenvalue distribution of the Ising model (5.2)with d= 10 and s= 0.4
0 20 40 60 80 100 120 140
10−11
10−9
10−7
10−5
10−3
10−1
101
number of Arnoldi steps k
true ε=10−11
bound (3.2) l=60
bound (3.2) l=75
bound (3.2) l=90
(a) Λ1⊂Λ2= Λ3= Λ4and l∈ {60,75,90}
0 20 40 60 80 100 120 140
10−11
10−9
10−7
10−5
10−3
10−1
101
number of Arnoldi steps k
true ε=10−11
bound (3.2) l=40
(b) Λ1⊂Λ2⊂Λ3⊂Λ4and l= 40
Fig. 7.Experiment 4: The angle of inclusion ∠↷
max(X1,˜
Kk)and its bounds (3.2)
In the second case we improve the bound by choosing l= 40 and Λ1⊂Λ2⊂
Λ3⊂Λ4, more precisely Λ1, Λ2, Λ3, Λ4consists of the 3, 4, 7, 11 smallest eigenvalues
of A, respectively. With this choice of the subsets the bound (3.2) is much sharper
compared to the first case, see Figure 7(b). Also the convergence rate of the bound
improved and the slow down factor is reduced to 1.31.
Experiment 5. In the last experiment we examine how well the exact subspace
X1of Ais included in the computed Ritz space ˜
Y, using bound (4.1). Again we
distinguish between dim(X1) = dim( ˜
Y) and dim(X1)<dim( ˜
Y), where in the first
case the Ritz space corresponds to the three smallest and in the second case to the
four smallest Ritz values.
In the first case there is an insufficient separation between the wanted eigenvalues
und the unwanted Ritz values. Figure 8shows that for dim(X1) = dim( ˜
Y) the angle
∠↷
max(X1,˜
Y) does not converge and starts to wiggle after 100 steps of the algorithm.
This behavior is caused by the dense distribution of the eigenvalues of the matrix A.
More precisely, since the third and the fourth eigenvalue of Ais multiple the exact
subspace corresponding to the three smallest eigenvalues can not be approximated by
a Ritz space of the perturbed matrix A+E.
Since the third and the fourth eigenvalue of Ais multiple in general the condition
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 21
0 20 40 60 80 100 120 140
10−14
10−12
10−10
10−8
10−6
10−4
10−2
100
102
number of Arnoldi steps k
angle of inclusion
∠y
max(X1,˜
Y) (dim(X1) = dim(˜
Y))
∠y
max(X1,˜
Y) (dim(X1)<dim(˜
Y))
bound (3.13) of ∠y
max(X1,˜
Y) (dim(X1) = dim(˜
Y))
bound (3.13) of ∠y
max(X1,˜
Y) (dim(X1)<dim(˜
Y))
Fig. 8.Experiment 5: True angles of inclusion ∠↷
max(X1,˜
Y)and corresponding bounds (4.1)
∥E∥2<gap( ˜
M−,Λ2) of bound (4.1) is violated. Only during the initial phase the
bound (4.1) can be computed because there the approximation of Ritz values is still
very inaccurate such that for a few iteration steps the condition on gap( ˜
M−,Λ2)
holds.
In the second case, the subspace ( ˜
Y) is slightly enlarged, which leads to an im-
provement of the convergence behavior of ∠↷
max(X1,˜
Y). Referring to Figure 8the
angle ∠↷
max(X1,˜
Y) starts to converge after 83 iteration. The bound (4.1) captures
this behavior correctly and overestimates the true value after the convergence phase
by a constant factor of 15.
In summary, also for the bound (4.1) we obtain qualitatively the same result as
for the first test matrix.
6. Conclusions. We have investigated the convergence of inexact Krylov sub-
space methods, where we bound the distance of an exact invariant subspace to an
inexact Krylov subspace and to an inexact Ritz space therein. The first bound ad-
dresses the question: If literations have been carried out without convergence, how
many more iteration steps of an inexact Krylov subspace method are necessary to
ensure convergence to a certain tolerance.
The second bound addresses the question: How much worse is the distance of
an invariant subspace to a Ritz space compared to its distance to the whole search
space? In particular, we have bounded the angle of inclusion of Xin an inexact Ritz
space ˜
Yin an a posteriori setting.
As special features our bounds can handle the presence of perturbations and a
small gap between the wanted and the unwanted eigenvalues. This bound needs the
information of the l-th step of the iteration, the 2-norm of the perturbation and the
exact eigenvalues of A. Here the key idea was to enlarge the Ritz space in order to
guarantee a sufficiently large gap between the remaining Ritz values and the wanted
22 Ute Kandler Christian Schr¨oder
eigenvalues. Finally, in the numerical tests the bounds have been confirmed to cor-
rectly predict the trends of the convergence curves.
Acknowledgments. We thank Volker Mehrmann, Michael Karow, and Ag-
nieszka Miedlar (all TU Berlin) for insightful discussions on the topic and constructive
comments on early versions of the manuscript.
REFERENCES
[1] A. C. Antoulas,Approximation of large-scale dynamical systems, vol. 6 of Advances in Design
and Control, SIAM, Philadelphia, PA, 2005.
[2] C. A. Beattie,Harmonic Ritz and Lehmann bounds, Electron. Trans. Numer. Anal., 7 (1998),
pp. 18–39. Large scale eigenvalue problems (Argonne, IL, 1997).
[3] C. A. Beattie, M. Embree, and D. C. Sorensen,Convergence of polynomial restart Krylov
methods for eigenvalue computations, SIAM Rev., 47 (2005), pp. 492–515.
[4] R. Bhatia and P. Rosenthal,How and why to solve the operator equation AX −XB =Y,
Bull. London Math. Soc., 29 (1997), pp. 1–21.
[5] ˚
A. Bj¨
orck,Numerical methods for least squares problems, SIAM, Philadelphia, 1996.
[6] S. B¨
orm and C. Mehl,Numerical Methods for Eigenvalue Problems, De Gruyter Graduate
Lectures, Berlin/Boston, 2012.
[7] C. Davis and W. M. Kahan,The rotation of eigenvectors by a perturbation. III, SIAM J.
Numer. Anal., 7 (1970), pp. 1–46.
[8] G. H. Golub and C. F. Van Loan,Matrix computations, Johns Hopkins Studies in the
Mathematical Sciences, Johns Hopkins University Press, Baltimore, MD, third ed., 1996.
[9] L. Grasedyck, D. Kressner, and C. Tobler,A literature survey of low-rank tensor approx-
imation techniques, GAMM-Mitt., 36 (2013), pp. 53–78.
[10] S. Holtz, T. Rohwedder, and R. Schneider,On manifolds of tensors of fixed TT-rank,
Numer. Math., 120 (2012), pp. 701–731.
[11] T. Huckle, K. Waldherr, and T. Schulte-Herbr¨
uggen,Computations in quantum tensor
networks, Linear Algebra Appl., 438 (2013), pp. 750–781.
[12] I. C. F. Ipsen,Relative perturbation results for matrix eigenvalues and singular values, in
Acta numerica, 1998, vol. 7 of Acta Numer., Cambridge Univ. Press, Cambridge, 1998,
pp. 151–201.
[13] Z. Jia and G. W. Stewart,An analysis of the Rayleigh-Ritz method for approximating
eigenspaces, Math. Comp., 70 (2001), pp. 637–647.
[14] U. Kandler and C. Schr¨
oder,Backward error analysis of an inexact Arnoldi method using
a certain Gram Schmidt variant, Preprint 10-2013, TU Berlin, 2013.
[15] A. Kie lbasi´
nski and H. Schwetlick,Numerische lineare Algebra, vol. 18 of Mathematik
f¨ur Naturwissenschaft und Technik [Mathematics for Science and Technology], VEB
Deutscher Verlag der Wissenschaften, Berlin, 1988. Eine computerorientierte Einf¨uhrung.
[A computer-oriented introduction].
[16] A. Knyazev, A. Jujunashvili, and M. Argentati,Angles between infinite dimensional sub-
spaces with applications to the Rayleigh-Ritz and alternating projectors methods, J. Funct.
Anal., 259 (2010), pp. 1323–1345.
[17] A. V. Knyazev and M. E. Argentati,Principal angles between subspaces in an A-based
scalar product: algorithms and perturbation estimates, SIAM J. Sci. Comput., 23 (2002),
pp. 2008–2040 (electronic).
[18] R. B. Lehoucq,Implicitly restarted Arnoldi methods and subspace iteration, SIAM J. Matrix
Anal. Appl., 23 (2001), pp. 551–562 (electronic).
[19] R. B. Lehoucq and K. Meerbergen,Using generalized Cayley transformations within an
inexact rational Krylov sequence method, SIAM J. Matrix Anal. Appl., 20 (1999), pp. 131–
148 (electronic).
[20] Y. Nakatsukasa,The tan θtheorem with relaxed conditions, Linear Algebra Appl., 436 (2012),
pp. 1528–1534.
[21] I. V. Oseledets,Tensor-train decomposition, SIAM J. Sci. Comput., 33 (2011), pp. 2295–2317.
[22] B. N. Parlett,The symmetric eigenvalue problem, Prentice-Hall Inc., Englewood Cliffs, N.J.,
1980. Prentice-Hall Series in Computational Mathematics.
[23] Y. Saad,On the rates of convergence of the Lanczos and the block-Lanczos methods, SIAM J.
Numer. Anal., 17 (1980), pp. 687–706.
[24] , Numerical methods for large eigenvalue problems, Classics in Applied Mathematics,
Society for Industrial and Applied Mathematics, rev. ed. ed., 2011.
Analysis of inexact Arnoldi methods for Hermitian eigenvalue problems 23
[25] G. Schaller,Adiabatic preparation without quantum phase transitions, Phys. Rev. A, 78
(2008), p. 032328.
[26] R. Sch¨
utzhold and G Schaller,Adiabatic quantum algorithms as quantum phase transitions:
First versus second order, Phys. Rev. A, 74 (2006), p. 060304.
[27] V. Simoncini,Ritz and pseudo-Ritz values using matrix polynomials, in Proceedings of
the Fourth Conference of the International Linear Algebra Society (Rotterdam, 1994),
vol. 241/243, 1996, pp. 787–801.
[28] , Variable accuracy of matrix-vector products in projection methods for eigencomputa-
tion, SIAM J. Numer. Anal., 43 (2005), pp. 1155–1174.
[29] V. Simoncini and D. B. Szyld,Theory of inexact Krylov subspace methods and applications
to scientific computing, SIAM J. Sci. Comput., 25 (2003), pp. 454–477 (electronic).
[30] G. W. Stewart,Error and perturbation bounds for subspaces associated with certain eigen-
value problems, SIAM Rev., 15 (1973), pp. 727–764.
[31] , A generalization of Saad’s theorem on Rayleigh-Ritz approximations, Linear Algebra
Appl., 327 (2001), pp. 115–119.
[32] , Matrix algorithms. Vol. II, Society for Industrial and Applied Mathematics (SIAM),
Philadelphia, PA, 2001. Eigensystems.
[33] , Backward error bounds for approximate Krylov subspaces, Linear Algebra Appl., 340
(2002), pp. 81–86.
[34] G. W. Stewart and J.-G. Sun,Matrix perturbation theory, Computer Science and Scientific
Computing, Academic Press Inc., Boston, MA, 1990.
[35] B. Sz.-Nagy,¨
Uber die Ungleichung von H. Bohr, Math. Nachr., 9 (1953), pp. 255–259.
[36] J. van den Eshof and G. L. G. Sleijpen,Inexact Krylov subspace methods for linear systems,
SIAM J. Matrix Anal. Appl., 26 (2004), pp. 125–153 (electronic).
[37] P. ˚
A. Wedin,On angles between subspaces of a finite dimensional inner product space, in
Matrix Pencils, vol. 973 of Lecture Notes in Mathematics, Springer Berlin Heidelberg,
1983, pp. 263–285.
[38] H. Wielandt,An extremum property of sums of eigenvalues., Proc. Am. Math. Soc., 6 (1955),
pp. 106–110.