Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
SIAM J. MATRIX ANAL. APPL.c
2014 Society for Industrial and Applied Mathematics
Vol. 35, No. 2, pp. 599–618
ON A PERTURBATION BOUND FOR INVARIANT SUBSPACES OF
MATRICES∗
MICHAEL KAROW†AND DANIEL KRESSNER‡
Abstract. Given a nonsymmetric matrix A, we investigate the effect of perturbations on an
invariant subspace of A. The result derived in this paper differs from Stewart’s classical result and
sometimes yields tighter bounds. Moreover, we provide norm estimates for the remainder terms in
well-known perturbation expansions for invariant subspaces, eigenvectors, and eigenvalues.
Key words. invariant subspaces, perturbation theory, pseudospectra, quadratic matrix equation
AMS subject classifications. Author must provide
DOI. 10.1137/130912372
1. Introduction. A subspace X⊂Cnof a matrix A∈Cn×nis called invariant
if it satisfies
(1.1) AX⊂X.
In this paper, we reconsider the classical question of estimating the impact of pertur-
bations in Aon X.
Suppose that the columns X∈Cn×kform an orthonormal basis of X. Then (1.1)
implies the existence of A11 ∈Ck×ksuch that AX =XA11. The eigenvalues of A11 are
independent of the choice of basis and constitute the spectrum of the restriction of Ato
X. Extending Xto a unitary matrix [X, X⊥] leads to the block Schur decomposition
(1.2) AX, X⊥=X,X⊥A11 A12
0A22.
Note that this implies Λ(A)=Λ(A11)∪Λ(A22), where Λ(·) denotes the spectrum of
a matrix. Throughout this paper, we will assume
(1.3) Λ(A11)∩Λ(A22)=∅.
This is a necessary and sufficient condition for the Lipschitz continuity of Xwith
respect to perturbations in A[9, Thm. 15.5.1]. (Note that continuity requires a sub-
stantially weaker condition [9, Thm. 15.2.1].) Hence, if (1.3) holds, adding a small
perturbation A→ A+Eimplies a change in the invariant subspace that is asymp-
totically proportional to E.
Various bounds on the change of invariant subspaces under perturbations of A
have been derived, notably by Davis and Kahan [6], Stewart [18, 19], Demmel [8],
and Sun [24]. In the general nonsymmetric case, these bounds are valid only as long
as Eremains sufficiently small. A minimal requirement is that the separation con-
dition (1.3) remains valid under perturbations. In the language of pseudospectra,
this means that Eshould stay below the critical perturbation level εfor which
∗Received by the editors March 7, 2013; accepted for publication (in revised form) by B. Sutton
March 17, 2014; published electronically May 13, 2014.
http://www.siam.org/journals/simax/35-2/91237.html
†Institut f¨ur Mathematik, TU Berlin, 10623 Berlin, Germany (k[email protected]berlin.de).
‡MATHICSE, EPF Lausanne, CH-1015 Lausanne, Switzerland (daniel.kressner@epfl.ch).
599
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
600 MICHAEL KAROW AND DANIEL KRESSNER
the components of the ε-pseudospectrum containing Λ(A11)andΛ(A22)firstmeet
each other [1]. It turns out that some existing perturbation results are unnecessarily
restrictive and require Eto stay significantly below this critical level. The main con-
tribution of this paper consists of a novel perturbation bound for invariant subspaces;
see Theorem 3.1 below. To derive this bound, we employ pseudospectral techniques
in the analysis of a quadratic matrix equation.
The rest of this paper is organized as follows. In section 2, we introduce the basic
tools for the perturbation analysis of invariant subspaces and recall some existing
results. Section 3 contains the statement and proof of our main result, a new pertur-
bation bound for invariant subspaces. In section 3.2, it is shown that this bound is
sharp for a 2 ×2 example. Section 3.3 discusses a variation of the main result based
on the block diagonalization of A, while section 3.5 provides a comparison to existing
perturbation bounds. In section 4, the former is used to quantify existence conditions
and remainder terms for well-known eigenvalue and eigenvector expansions.
2. Preliminaries and existing results. The goal of this section is to summa-
rize some existing perturbation results for invariant subspaces and introduce notation,
needed in the rest of the paper, on the way. Let us first recall some basic tools used
in the perturbation analysis.
2.1. Separation between matrices. The condition (1.3) can be quantified
by the separation between A11 and A22. Based on Varah’s original definition [27],
Demmel [8] has proposed
sepλ(A11,A
22):=sup{ε>0|Λ(A11 +E11)∩Λ(A22 +E22)=∅
∀E11,E
22 with max{E112,E222}≤ε}.
(2.1)
This definition has an important interpretation in terms of ε-pseudospectra, defined as
Λε(M)=z∈C|z∈Λ(M+E)forsomeE∈Cn×nwith E2≤ε
for a matrix M∈Cn×n[26]. The separation (2.1) is the minimum value of εsuch
that Λε(A11)∩Λε(A22) is nonempty. This interpretation yields the expression
sepλ(A11,A
22) = inf
λ∈Cmax σmin(A11 −λI),σ
min(A22 −λI),
where σmin(·) denotes the smallest singular value of a matrix. Based on this expres-
sion, an algorithm for computing sepλhas been developed by Gu and Overton [11].
Although sepλis not an appropriate measure to quantify the perturbation of
invariant subspaces, it will still play a role in our derivations. Stewart [19] has intro-
duced a different notion of separation based on the observation that (1.3) is satisfied
if and only if the Sylvester operator
T:C(n−k)×k→C(n−k)×k,T:Z→ ZA11 −A22Z
is nonsingular. The separation of A11 and A22 with respect to an arbitrary norm ·
is defined as
(2.2) sep(A11,A
22):= min
Z=1 T(Z)=min
Z=1 ZA11 −A22Z.
Then sep(A11,A
22)= 0 if and only if (1.3) holds. Examples in [8, 27] show that the
quantity sep(A11,A
22) can be magnitudes smaller than sepλ(A11,A
22).
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
ON A PERTURBATION BOUND FOR INVARIANT SUBSPACES 601
In the following, sepF(A11,A
22)=min
ZF=1 ZA11 −A22ZFdenotes the sep-
aration with respect to the Frobenius norm. An efficient algorithm for estimating
sepF(A11,A
22) can be derived from the inverse power method [4]. Various lower and
upper bounds for sepF(A11,A
22) can be found in [22, 23, 28]. Lower bounds for other
norms are discussed in [29].
Proposition 2.1 below relates sep with sepλand the distance between the spectra
of A11 and A22. Most of its statements are well known; we have included the proof for
the convenience of the reader. Recall that a norm ·is said to be unitarily invariant
if UZV=Zfor all unitary matrices U, V of compatible size. A norm is unitarily
invariant if and only if
(2.3) XZY≤X2ZY2
for all matrices of compatible size.
Proposition 2.1. Let A11,E
11 ∈Ck×k,A22,E
22 ∈C(n−k)×(n−k),andλ∈C.
Let δ=min{|λ−μ||λ∈Λ(A11),μ∈Λ(A22)}denote the distance between the spectra
Λ(A11)and Λ(A22).
(a) For any norm, sep(A11,A
22)≤δ.
(b) For any unitarily invariant norm,
(i) sep(λI, A
22)=σmin(λI −A22),
(ii) sep(A11 +E11,A
22 +E22)≥sep(A11,A
22)−E112−E222,
(iii) sep(A11,A
22)≤2·sepλ(A11,A
22)≤δ.
(c) If A11 and A22 are normal matrices, then sepF(A11,A
22)=
2·sepλ(A11,A
22)=δ.
Proof. Consider vectors xand y∗such that x2=y2=1,y∗A11 =λy∗,and
A22x=μxfor λ∈Λ(A11), μ∈Λ(A22). Let Z=xy∗.ThenZA11−A22Z=(λ−μ)Z.
This implies (a). To show the statements of (b), let E11 =1
2(μ−λ)yy∗and E22 =
1
2(λ−μ)xx∗.Theny∗(A11 +E11)=1
2(λ+μ)y∗and (A22 +E22)x=1
2(λ+μ)x.Thus,
Λ(A11 +E11)∩Λ(A22 +E22)=∅for perturbations satisfying E112=E222=
|λ−μ|/2. This yields the second inequality in (b)(iii). The statement of (b)(ii) follows
from the inequality
Z(A11 +E11)−(A22 +E22)Z≥ZA11 −A22Z−(E112+E222)Z.
In particular, if max{E112,E22} <sep(A11,A
22)/2, then sep(A11 +E11,A
22 +
E22)>0. Thus, Λ(A11 +E11)∩Λ(A22 +E22)=∅. This yields the first inequality of
(b)(iii).
Next, we show (c). Suppose that A11 and A22 are normal, and let U, V be unitary
matrices such that A11 =Udiag(λ1,...,λ
k)U∗,A22 =Vdiag(μ1,...,μ
n−k)V∗,where
λjand μiare the eigenvalues of A11 and A22, respectively. Let W=V∗ZU =[wij ].
Then
ZA11 −A22Z2
F=V∗(ZA11 −A22Z)U2
F
=Wdiag(λ1,...,λ
k)−diag(μ1,...,μ
n−k)W)2
F
=[(λj−μi)wij ]2
F=
ij |λi−μj|2|wij |2
≥δ2W2
F=δ2Z2
F.
Thus, sepF(A11,A
22)≥δ. Combined with (b)(iii), this yields (c).
It remains to prove (b)(i). Let (u, v) be a pair of normalized singular vectors be-
longing to the smallest singular value σmin of λI−A22 such that (λI−A22)v=σminu,
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
602 MICHAEL KAROW AND DANIEL KRESSNER
and u2=v2.Thenvu∗=uu∗,sincevu∗and uu∗have the same singular val-
ues 1,0,...,0. Setting Z=vu∗/uu∗we obtain Z=1and(λI−A22)Z=σmin.
Thus sep(λI,A
22)≤σmin. On the other hand,
sep(λI,A
22)=min
Z=0 (λI−A22)Z
Z=min
W=0 W
(λI−A22)−1W
≥min
W=0 W
(λI−A22)−12W=σmin.
2.2. Invariant subspaces and a quadratic matrix equation. Let us con-
sider a general matrix A∈Cn×nand partition
A=A11 A12
A21 A22,A
11 ∈Ck×k,A
22 ∈C(n−k)×(n−k).
For given Z∈C(n−k)×k, consider the similarity transformation
(2.4) I0
−ZI
AI0
ZI
=A11 +A12ZA
12
A21 +A22Z−ZA11 −ZA12ZA
22 −ZA12
which becomes block upper triangular if and only if the quadratic matrix equation
(2.5) 0 = f(A, Z):=A21 +A22Z−ZA11 −ZA12Z
is satisfied. This implies Lemma 2.2 below. Note that a subspace Yof row vectors is
called a left invariant subspace if YA⊂Y.
Lemma 2.2. Using the notation introduced above, the following statements are
equivalent:
(i) The columns of [IZ
]span a right invariant subspace of Asuch that
AI
Z=I
Z(A11 +A12Z).
(ii) The rows of [−ZI]span a left invariant subspace of Asuch that
−ZI
A=(A22 −ZA12)−ZI
.
(iii) The quadratic matrix equation (2.5) is satisfied.
2.3. An asymptotic result. Perturbation bounds that are asymptotically valid
as E→0 can be obtained in a relatively straightforward way from truncating per-
turbation expansions. For invariant subspaces, such expansions have been discussed
in [5, 14, 25]. In the following, we will illustrate this approach.
Lemma 2.3. Given A∈Cn×n, suppose that there exists Z∈C(n−k)×ksuch that
f(A, Z)=0,with fdefined as in (2.5).IfΛ(A11 +A12Z)∩Λ(A22 −ZA12)=∅,
then there exist an open neighborhood E⊂Cn×nof 0and an open neighborhood
Z⊂C(n−k)×kof Zsuch that for each E∈Ethe equation f(A+E,ZE)=0has a
unique solution ZE∈Z. Moreover, ZEdepends holomorphically on Eand admits
the first-order expansion
ZE=Z+T−1
ZE21+O(E2)
with the Sylvester operator TZ:ΔZ→ ΔZ(A11 +A12Z)−(A22 −ZA12)ΔZ.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
ON A PERTURBATION BOUND FOR INVARIANT SUBSPACES 603
Proof. Clearly, fis a holomorphic function in the entries of Aand Z.The
derivative of fwith respect to the variable Zequals the Sylvester operator −TZ.
Since A22 −ZA12 and A11 +A12Zhave disjoint spectra, the operator TZis invertible.
Thus, the lemma follows from the implicit function theorem [15].
The way Lemma 2.3 is stated will be convenient for later purposes. However, for
the sake of an asymptotic result, we may assume without loss of generality that the
unperturbed matrix Ais already in block triangular form,
(2.6) A=A11 A12
0A22,A
11 ∈Ck×k,A
22 ∈C(n−k)×(n−k).
By Lemma 2.2, this is equivalent to requiring that X=[
Ik
0] spans an invariant
subspace of A. Combining the statement of Lemma 2.3 with Lemma 2.2 then yields
the following result.
Corollary 2.4. Let Abe in block triangular form (2.6) and assume Λ(A11)∩
Λ(A22)=∅. Then, for every Ewith Esufficiently small, there exists an invariant
subspace XE= span[ I
ZE]of A+Esuch that ZEadmits the first-order expansion
(2.7) ZE=T−1E21+O(E2),T:Z→ ZA11 −A22Z.
Recently, Stewart [20] derived bounds on Efor which ZE, as a function of E,
is Fr´echet differentiable.
Once we have obtained ZE, there are different ways of comparing the two invariant
subspaces
X=spanI
0,XE=spanI
ZE
of the matrices Aand A+E, respectively. If σ1≥σ2≥ ··· ≥ σkdenote the
singular values of ZE, then the ith canonical angle between Xand XEis given by
θi(X,XE) = arctan σi.Defining Θ(X,XE) := diag(θ1,...,θ
k), it is well known [21,
sect. II.4] that sin(Θ(X,XE))generates a metric on the k-dimensional subspaces
of Cnfor any unitarily invariant matrix norm ·. However, it is sometimes more
convenient to simply use
(2.8) ZE=tan(Θ(X,XE))
for measuring the distance, which remains close to sin(Θ(X,XE))as long as ZE
is small. The first-order result
(2.9) tan(Θ(X,XE))=ZE=E21
sep(A11,A
22)+O(E2)
is now readily obtained from Corollary 2.4. This also confirms that sep(A11,A
22)−1
is the condition number of X[2].
2.4. Nonasymptotic results. The derivation of nonasymptotic results requires
a more careful study of the quadratic matrix equation f(A+E,ZE) = 0 with fas
in (2.5) and
(2.10) A+E=A11 +E11 A12 +E12
E21 A22 +E22.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
604 MICHAEL KAROW AND DANIEL KRESSNER
In particular, Ais assumed to be block upper triangular. Then Stewart’s result [18,
Thm. 4.1] see also [21, Thm. 2.7]) reads as follows.
Theorem 2.5. Let ·denote a consistent family of norms (i.e. XY≤
XYwhenever the matrix product XY is defined). Let A+E∈Cn×nbe parti-
tioned as in (2.10) and set
sE:= sep(A11,A
22)−E11−E22.
If sE>0and
(2.11) E21(A12+E12)<s
2
E/4,
then there exists a unique solution ZEof f(A+E,ZE)=0satisfying
(2.12) ZE≤ 2E21
sE+s2
E−4E21(A12+E12)<2E21
sE
.
Interestingly, Theorem 2.5 can be derived directly from the Newton–Kantorovich
theorem; see Appendix A.
Remark 2.6. For a unitarily invariant family of norms ·, the result of Theo-
rem 2.5 still holds if sEis replaced with ˜sE=sep(A11,A
22)−E112−E222,E12
is replaced with E122,andA12is replaced with A122. See Appendix A.
A different analysis for the case of the Frobenius norm has been given by
Demmel [8]; see also [16].
Theorem 2.7 (see [16, Thm. 1.15]). Using the notation of Theorem 2.5, assume
that
(2.13) EF<sepF(A11,A
22)
4P2
,
where Pdenotes the spectral projector of Abelonging to Λ(A11). Then there exists a
unique solution ZEof f(A+E,ZE)=0satisfying
ZEF<4EF
sepF(A11,A
22)−4P2EF
.
A comparison of Theorems 2.5 and 2.7 with our results is given in section 3.5.
3. Main result. Our main result admits the use of different norms for measuring
the perturbation Eand the solution Z. More specifically, we consider two norms
·:C(n−k)×k→Rand |||·|
||:Cn×n→Rthat satisfy the inequalities
(N1) Z2≤Z,(N2) UZV≤U2ZV2,
(N3) E2≤|
||E|||,(N4) XEY≤X2||
|E|||Y2
for all E∈Cn×n,Z∈C(n−k)×k,X∈Ck×n,Y ∈Cn×k,U∈C(n−k)×(n−k),and
V∈Ck×k. Condition (N2) is equivalent to requiring that ·is unitarily invariant.
In particular, there is a symmetric gauge function Φ such that
Z=Φ(σ1(Z),...,σ
min{k,n−k}(Z)),
where σ1(Z)≥ ··· ≥ σmin{k,n−k}(Z) are the singular values of Zin nonincreas-
ing order; see [3, Thm. IV.2.1]. Condition (N1) is equivalent to requiring that
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
ON A PERTURBATION BOUND FOR INVARIANT SUBSPACES 605
Φ(1,0,...,0) ≥1. If we now define |||E|||for E∈Cn×nby the same formula as
Z,thatis
|
||E|||=Φ(σ1(E),...,σ
min{k,n−k}(E)),
then conditions (N1)–(N4) are all satisfied. It is important to note that only min{k,n−
k}of the nsingular values of Eare involved in the definition of |||E|||.
We remark that ||
|·|
||need not be unitarily invariant. For instance, the conditions
(N1)–(N4) are also valid if ·=·Fis the Frobenius norm and |||E|||=i,j |eij|.
Theorem 3.1. Consider the block triangular matrix
A=A11 A12
0A22,A
11 ∈Ck×k,A
22 ∈C(n−k)×(n−k),
and assume that Λ(A11)∩Λ(A22)=∅.Let·and |||·|
||be norms on C(n−k)×kand
Cn×n, respectively, that satisfy conditions (N1)–(N4).Let
s:= sep(A11,A
22)= min
Z=1 ZA11 −A22Z.
For ε≥0define g(ε)=ε(ε+A122)and let ρ≥0be such that g(ρ)=s
2, i.e.,
ρ=1
2(s2+A122
2−A122).Finally,letBρ:= {E∈Cn×n||
||E|||<ρ}. Then the
following statements hold:
(a) For all ε≥0,Λε(A)⊆Λg(ε)(A11)∪Λg(ε)(A22).
(b) If ε<ρ,thenΛg(ε)(A11)∩Λg(ε)(A22)=∅.
(c) There exists a unique holomorphic function
BρE−→ ZE∈C(n−k)×k
with the following properties:
(i) The columns of [IZ
E]span a right invariant subspace XEof A+E.
(ii) The rows of [−ZEI]span a left invariant subspace YEof A+E.
(iii) The spectrum of the restriction of A+Eto XEis contained in the
pseudospectrum Λg(E2)(A11). The spectrum of the restriction of A+E
to YEis contained in the pseudospectrum Λg(E2)(A22).
(iv) The matrix ZEsatisfies
(3.1) ZE≤ 2|||E|||
s+s2−4|||E|||(|||E|||+A122)≤2
s|||E|||
as well as
(3.2) ZE−T−1E21≤ 6
s2E2||
|E|||,
where T:Z→ ZA11 −A22Z.
Remark 3.2. Inequality (3.2) gives a bound for the remainder O(E2)inthe
first-order expansion (2.7).
Remark 3.3. The first bound in (3.1) looks quite complicated. A slightly weaker
but more appealing estimate for ZEis
(3.3) ZE≤|||E|||
s1+|||E|||
ρ,
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
606 MICHAEL KAROW AND DANIEL KRESSNER
which can be derived as follows. Setting ψ()=2/(s+s2−4(+A12)) the first
bound in (3.1) can be written as ZE≤ψ(|||E|||)|||E|||. A direct computation yields
ψ()≥0. Thus, ψis a convex function. It follows that ψ()≤ψ(0) + (/ρ)(ψ(ρ)−
ψ(0)) = s−1(1+(/ρ)).This shows (3.3).
3.1. Proof of Theorem 3.1. Statement (a) of Theorem 3.1 has been shown
by Grammont and Largillier [10, Proposition 3.1]; see also [13]. Statement (b) is a
consequence of Proposition 2.1(b)(iii). It remains to prove statement (c), in particular
the upper bound (3.1). For this purpose, we will first derive an auxiliary result showing
that this upper bound and a certain lower bound are mutually exclusive. The main
part of the proof then consists of ruling out the lower bound by a continuity argument.
In the following, we will write Zinstead of ZEfor notational convenience.
Lemma 3.4. With the notation and assumptions stated in Theorem 3.1, define
r+(ε)=2ε/s+s2−4ε(ε+A122),r
−(ε)=2ε/s−s2−4ε(ε+A122).
If |||E|||≤ρand f(A+E,Z)=0,thenZ≤r+(|||E|||)or Z≥r−(|||E|||).
Proof. By direct computation, it can be verified that r−(ε)andr+(ε)arethe
zeros of the quadratic polynomial pε(r):=(ε+A122)r2−sr+ε. Since its leading
coefficient is positive, we have
(3.4) pε(r)<0⇔r+(ε)<r<r
−(ε).
The assumption Λ(A11)∩Λ(A22)=∅of Theorem 3.1 implies that the Sylvester
operator T:Z→ ZA11 −A22Zis nonsingular, and hence s=T−1−1. The equation
f(A+E,Z) = 0 can be written as
T(Z)=[−ZI]EI
Z−ZA12Z
or, equivalently,
Z=T−1[−ZI]EI
Z−ZA12Z.
Using (2.3) and the fact that [−ZI]2=[IZ
]2=1+Z2
2≤1+Z2,
we conclude that Z≤|||E|||(1 + Z2)+A122Z2/s, which is equivalent to
0≤p|
|
|E|
|
|(Z). Thus, the claim follows from (3.4). Note that the upper bound
Z≤r+(|||E|||) in Lemma 3.4 is equivalent to the first inequality in (3.1).
Lemma 3.5. For E∈B
ρlet IEdenote the set of t∈[0,1] such that there exists
amatrixZwith the following properties:
(α)f(A+tE, Z)=0;
(β)Λ
1(t)⊆Λg(E2)(A11),whereΛ1(t):=Λ(A11 +t(E11 +E12Z));
(γ)Λ
2(t)⊆Λg(E2)(A22),whereΛ2(t):=Λ(A22 +t(E22 −ZE21));
(δ)Z≤r+(|||E|||).
Then 1∈IE.
Proof. The proof of the statement proceeds via analytic continuation in three
steps.
Step 1. Since the conditions (α)–(δ) hold for t=0andZ= 0, it follows that
0∈IE.
Step 2. We now claim that there exists ε>0 such that [ˆ
t, ˆ
t+ε)⊂IEfor any
ˆ
t∈IEwith ˆ
t<1.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
ON A PERTURBATION BOUND FOR INVARIANT SUBSPACES 607
Let ˆ
Zbe such that f(A+ˆ
tE, ˆ
Z) = 0. The pseudospectra Λg(E2)(A11)and
Λg(E2)(A22) are disjoint by Theorem 3.1(b). Thus Λ1(ˆ
t)andΛ
2(ˆ
t)aredisjoint,too.
Hence, Lemma 2.3 applied to A+ˆ
tE implies that there exist ε>0 and a holomorphic
function
[ˆ
t, ˆ
t+ε)t−→ Zt∈Cm×l
such that f(A+tE, Zt)=0andZˆ
t=ˆ
Z. We may assume that ˆ
t+ε<1. The set
Λ1(t) is the spectrum of A+tE restricted to the right invariant subspace [IZ
t].
Thus,
Λ1(t)⊂Λ(A+tE)⊂Λg(|
|
|E|
|
|)(A11)∪Λg(|
|
|E|
|
|)(A22).
However, since the latter pseudospectra are disjoint closed sets it follows from Λ1(ˆ
t)⊆
Λg(|
|
|E|
|
|)(A11) and the continuity of eigenvalues that Λ1(t)⊆Λg(|
|
|E|
|
|)(A11) for all t∈
[ˆ
t, ˆ
t+ε). Analogously, we conclude that Λ2(t)⊆Λg(|
|
|E|
|
|)(A22) for all t∈[ˆ
t, ˆ
t+ε).
It remains to verify property (δ). By Lemma 3.4, we have for every tthat Zt≤
r+(t|||E|||)≤r+(|||E|||)orZt≥r−(t|||E|||)≥r−(|||E|||). Since the former inequality holds
for t=ˆ
tand r+(|||E|||)<r
−(|||E|||), the continuity of t→ Ztimplies that Zt≤r+(|||E|||)
for all t∈[ˆ
t, ˆ
t+ε). This establishes the claim.
Step 3. We now claim that the set IEis closed.
Let (tj) be a sequence in IEwith limit t∗. Then there exists a sequence (Zj)
such that the pairs (tj,Z
j) satisfy (α)–(δ). In particular Zj≤r+(|||E|||) for all j.
By compactness the sequence (Zj) has a convergent subsequence. Let Z∗denote its
limit. Then (t∗,Z
∗)satisfies(α)–(δ). In particular, the properties (β)and(γ)for
(t∗,Z
∗) are consequences of the continuity of eigenvalues and the closedness of the
pseudospectra Λg(E2)(A11),Λg(E2)(A22). This establishes the claim.
From Step 3, together with Steps 1 and 2, it follows that 1 = sup IE∈IE.
ProofofTheorem3.1(c). By applying Lemma 2.2 to A+E, each of the conditions
(c)(i) and (c)(ii) of Theorem 3.1 is equivalent to property (α) of Lemma 3.5 for t=1.
Again by Lemma 2.2, condition (c)(iii) is equivalent to the properties (β)and(γ)for
t= 1. Thus, Lemma 3.5 establishes the existence of Zsatisfying (c)(i)–(c)(iii) as well
as the first inequality in (3.1).
Next we prove uniqueness of any such matrix Z. For this purpose, suppose that
f(A+E,Z)=f(A+E, ˜
Z)=0. Then
0=f(A+E,Z)−f(A+E, ˜
Z)
=E21 +(A22 +E22)Z−Z(A11 +E11)−Z(A12 +E12)Z
−E21 +(A22 +E22)˜
Z−˜
Z(A11 +E11)−˜
Z(A12 +E12)˜
Z
=(A22 +E22 −˜
Z(A12 +E12))
=: ˜
A22
(Z−˜
Z)−(Z−˜
Z)(A11 +E11 +(A12 +E12)Z)
=: ˜
A11
.
Since both Zand ˜
Zsatisfy properties (β)and(γ)fort= 1, the spectra Λ( ˜
A22)and
Λ( ˜
A11) are disjoint. Thus, Z−˜
Z=0.
In summary, we have shown the existence and uniqueness of a function E→ ZE
satisfying conditions (i)–(iii) in Theorem 3.1. Lemma 2.3 implies that this function is
holomorphic. It remains to prove the inequality (3.2). The relation f(A+E,ZE)=0
is equivalent to
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
608 MICHAEL KAROW AND DANIEL KRESSNER
ZE=T−1(E21 +E22ZE−ZEE11 −ZEE12ZE).
Furthermore, by (3.1) we have ZE≤2|||E|||/s < 2ρ/s ≤1. Thus,
ZE−T−1E21=T−1(E22ZE−ZEE11 −ZEE12ZE)
≤1
s(2E2ZE+E2ZE2)
≤3
sE2ZE≤ 6
s2E2||
|E|||.
This concludes the proof of Theorem 3.1.
3.2. The 2×2case. In this section we show for a 2×2 example that the bounds
in Theorem 3.1 are sharp. Let
A=−s/2c
0s/2,E=0ε
−ε0,s>0,c,ε≥0.
Then sis the separation of the diagonal elements of Aand E2=ε.Furthermore,
let ρ=1
2(√s2+c2−c). Then ρ(ρ+c)=s2/4. The ε-pseudospectrum of Acan be
calculated as
Λε(A)={z∈C|σmin(zI −A)≤ε}
=z∈C|1
2z−s
2+z+s
22+c2−z−s
2−z+s
22+c2≤ε
;
see also [13]. By Theorem 3.1(a),
Λε(A)⊆D
√ε(ε+c)(−s/2) ∪D
√ε(ε+c)(s/2),(3.5)
where Dr(z)⊂Cdenotes the closed disk of radius r≥0 around z∈C.Ifε<ρ,
then ε(ε+c)<s/2 and the disks in (3.5) are disjoint. Hence, the pseudospectrum
Λε(A) has two connected components. For ε=ρthe disks in (3.5) touch each other
at 0 ∈C. From (3.5) it follows that also 0 ∈σρ(A). Hence, σε(A) has only one
connected component for ε≥ρ. The eigenvalues of A+Eare
λ±(ε)=±1
2s2−4ε(ε+c).
These eigenvalues lie close to the boundary of Λε(A). The situation is illustrated in
Figure 1, where the shaded regions represent pseudospectra, the circles represent the
boundaries of the disks D√ε(ε+c)(±s/2), and the dots mark the eigenvalues λ±(ε).
A right eigenvector to the eigenvalue λ−(ε) is given by [1 zε],wherezε=2ε/(s+
s2−4ε(ε+c)). If ε<ρ, then the eigenvalues are real and distinct. If ε=ρ,then
A+Eis similar to a Jordan block. More specifically, in this case we have
A+E=1−2/s
2ε/s 001
00
1−2/s
2ε/s 0−1
.
If ε>ρ, then the eigenvalues λ±(ε) are purely imaginary. Note that the function
ε→ zεis not differentiable at ε=ρ.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
ON A PERTURBATION BOUND FOR INVARIANT SUBSPACES 609
−s/2 0 s/2
−s/2
0
s/2
(ε(ε+c))1/2
Case ε < 0.5*((s2+c2)1/2−c)
−s/2 0 s/2
−s/2
0
s/2
(ε(ε+c))1/2
Case ε = 0.5*((s2+c2)1/2−c)
−s/2 0 s/2
−s/2
0
s/2
(ε(ε+c))1/2
Case ε > 0.5*((s2+c2)1/2−c)
Fig. 1.ε-pseudospectra of a 2×2matrix for three values of ε.
3.3. A bound in terms of the spectral decomposition. The purpose of this
section is to derive a bound based on the block diagonalization of A. We assume the
setting of Theorem 3.1. In particular, Ais block triangular and Λ(A11)∩Λ(A22)=∅.
Then there exists a unique R∈Ck×(n−k)such that RA22 −A11R=A12.Wehave
AR
I=R
IA22.
Hence, the columns of [RI]span a right invariant subspace Xcof Awhich is
complementary to X= range([I0]). The projector onto Xalong Xcis the spectral
projector P=I−R
00
.Let
(3.6) p=1+R2
2,κ=p+R2=p+p2−1,G=IR/p
0I/p.
Then p=P2,κ=G2G−12is the condition number of G[8] and
A=Gdiag(A11,A
22)G−1,G
−1=I−R
0pI.
Furthermore, we have
R2= (tan ϕ)−1,p=(sinϕ)−1,κ= (tan ϕ
2)−1,
where ϕis the smallest angle between the subspaces Xand Xc; see [7].
With these preparations we are in a position to state and prove the following
theorem.
Theorem 3.6. Let Aand sbe defined as in Theorem 3.1,andlet·and |||·|
||
be unitarily invariant norms on C(n−k)×kand Cn×n, respectively, satisfying the con-
ditions (N1)–(N4).LetRand κbe defined as above. If |
||E|||<s/(2κ), then there
exists a unique WE∈C(n−k)×k, depending holomorphically on Ewith the following
properties:
(i) The columns of [IR
0I][ I
WE]span a right invariant subspace XEof A+E.
(ii) The rows of [−WEI][ I−R
0I]span a left invariant subspace YEof A+E.
(iii) The spectrum of the restriction of A+Eto XEis contained in the pseu-
dospectrum ΛκE2(A11). The spectrum of the restriction of A+Eto YEis
contained in the pseudospectrum ΛκE2(A22).
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
610 MICHAEL KAROW AND DANIEL KRESSNER
(iv) The matrix WEsatisfies
(3.7) WE≤2κ
sp|||E|||
and
(3.8) WE−T−1(E21)≤6κ2
ps
2E2|
||E|||,
where T:Z→ ZA11 −A22Z.
(v) Let ZE=WE(I+RWE)−1=(I+WER)−1WE. Then the columns of
[IZ
E]span XE, the rows of [−ZEI]span the subspace YE,and
(3.9) ZE≤
2κ
p|||E|||
s−2κ
pR2|
||E|||.
Proof.Letˆ
A= diag(A11,A
22)and ˆ
E=G−1EG.ThenA+E=G(ˆ
A+ˆ
E)G−1.
Furthermore, the equivalences
(3.10) (ˆ
A+ˆ
E)U=UL ⇔(A+E)(GU)=(GU)L,
V(ˆ
A+ˆ
E)=MV ⇔(VG
−1)(A+E)=M(VG
−1)
hold for any U∈Cn×k,L∈Ck×k,V∈C(n−k)×n,M∈C(n−k)×(n−k).Thus,the
columns of GU span a right invariant subspace of A+Eif and only if the columns of
Uspan a right invariant subspace ˆ
A+ˆ
E. Furthermore, the rows of VG
−1span a left
invariant subspace of A+Eif and only if the rows of Vspan a left invariant subspace
ˆ
A+ˆ
E. Suppose ||
|E|||<s/(2κ). Then, since |||·|
||is unitarily invariant,
(3.11) |
||ˆ
E|
||≤κ|||E|||<s/2.
Hence, according to Theorem 3.1 there exists a unique Zˆ
E∈C(n−k)×kdepending
holomorphically on ˆ
Ewith the following properties:
(i)Thecolumnsof[IZ
ˆ
E]span a right invariant subspace Xˆ
Eof ˆ
A+ˆ
E.
(ii)Therowsof[−Zˆ
EI] span a left invariant subspace Yˆ
Eof ˆ
A+ˆ
E.
(iii) The spectrum of the restriction of ˆ
A+ˆ
Eto Xˆ
Eis contained in the pseu-
dospectrum Λˆ
E2(A11). The spectrum of the restriction of A+Eto Yˆ
Eis
contained in the pseudospectrum Λˆ
E2(A22).
(iv)ThematrixZˆ
Esatisfies
Zˆ
E≤2
s|||ˆ
E|||as well as Zˆ
E−T−1(ˆ
E21)≤ 6
s2ˆ
E2|||ˆ
E|||.
Let U=[IZ
ˆ
E],V=[−Zˆ
EI], and WE=Zˆ
E/p.Then
IR
0II
WE=GU, −WEII−R
0I=VG
−1.
Hence, the claims (i)–(iv) follow from (i)–(iv), the equivalences (3.10), the inequality
(3.11), and the fact that ˆ
E21 =pE
21.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
ON A PERTURBATION BOUND FOR INVARIANT SUBSPACES 611
Now, redefine ZEas ZE:= WE(I+RWE)−1=(I+RWE)−1WE. The first
statements of claim (v) follow from the identities
IR
0II
WE=I
ZE(I+RWE),−WEII−R
0I=(I+WER)−ZEI.
Since ZE=WE(I+RWE)−1≤WE/(1 −R2WE) the inequality (3.9) is a
consequence of (3.7).
Remark 3.7. If Eand the underlying norms fulfill the assumptions of both
Theorem 3.1 and Theorem 3.6, then the solution matrices ZEestablished in these
theorems are identical. This follows from the uniqueness statement in Theorem 3.1.
Note that the bound (3.1) is tighter than the bound (3.9).
3.4. The case of the Frobenius norm. We now specialize our bounds to the
case that the underlying norm is the Frobenius norm ·Fon C(n−k)×k. To this end,
we define a unitarily invariant norm on Cn×nby
(3.12) EF,k =⎛
⎝
min{k,n−k}
j=1
σj(E)2⎞
⎠
1/2
,
where σ1(E)≥σ2(E)≥ ··· ≥ σn(E) denote the singular values of E∈Cn×nin
nonincreasing order. Note that EF,k ≤EFwith equality if and only if rank E≤
k. Since conditions (N1)–(N4) are satisfied for ·=·Fand |||·|
||=·F,k,wehave
the following corollary to Theorems 3.1 and 3.6.
Corollary 3.8. Let A∈Cn×nbe partitioned as in Theorem 3.1.LetR,
κ,andpbe defined as above, and let s=sep
F(A11,A
22). Furthermore, let ρ=
1
2(s2+A122
2−A122).
(i) If EF,k <ρ, then then the matrix ZEin Theorem 3.1 satisfies
(3.13) ZEF≤s
2EF,k.
(ii) If EF,k <s
2κ, then the matrix ZEin Theorem 3.6 satisfies
(3.14) ZEF≤
2κ
pEF,k
s−2κ
pR2EF,k
.
If Esatisfies both conditions, EF,k <ρand EF,k <s
2κ, then the matrices ZEin
(i) and (ii) are identical.
3.5. Comparison with existing results.
Comparison with Theorem 2.7.Since always EF,k ≤EF,R2≤p,and
κ≤2p, the bound (3.14) is an improvement of Demmel’s bound in Theorem 2.7.
For the particular case that Ais block diagonal (i.e., A12 =0)wehavethat
κ=p=1,R= 0, and (3.14) as well as (3.13) state
ZEF≤2EF,k
sif EF,k ≤s
2,
where s=sep
F(A11,A
22). In contrast, Theorem 2.7 yields
ZEF≤4EF
sif EF≤s
4.
On the other hand, if R2is large, then the constants in the bound (3.14) and
in Demmel’s bound are nearly identical since then R2≈pand 2κ/p ≈4.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
612 MICHAEL KAROW AND DANIEL KRESSNER
Comparison with Theorem 2.5.It is not that easy to compare our bound (3.1)
with Stewart’s bound (2.12) from Theorem 2.5. First, the bound (3.1) holds under
the condition |||E|||(|||E|||+A12)<s
2/4, while Stewart’s bound requires E21(E12+
A12)<s
2
E/4, where s=sep(A11,A
22)andsE=s−E11−E22. Hence, these
bounds have a different range of applicability. The advantage of the bound in Theo-
rem 3.1 over Stewart’s result is that it has the separation sin the denominator instead
of the smaller number sE.
On the other hand, (3.1) works with the norm of the whole matrix Eand Stewart’s
bound involves the norms of the blocks Eij , which are smaller than |||E|||.Inparticular,
Stewart’s bound properly predicts ZE=0whenE21 = 0. Our bound (3.1) does not
reflect this fact.
Following advice by Stewart, we performed several numerical experiments with
random matrices and perturbations. Not surprisingly, in these experiments Stewart’s
bound is observed to be tighter than ours in most cases when |||E|||is small compared
to s/2. The same observation is made when A12is not significantly smaller than s.
It turns out that our new bound becomes advantageous when A12 =0,theper-
turbations are measured in the spectral norm, and |||E|||is not small compared with s/2.
To demonstrate this, let us consider an n×nblock diagonal matrix Awith k×kand
(n−k)×(n−k) diagonal blocks. With A12 = 0, the perturbation bounds depend on
Aonly via the separation sand they scale nearly proportionally with 1/s. It therefore
suffices to consider one value for s,say,s= 1. We have then chosen ε∈(0,s/2] and
generated a large set of random perturbations (10,000 for n= 7 and 1000 for n= 50)
having normally distributed entries and norm ε. The plots in Figure 2 show the
percentage of cases our bound is smaller than Stewart’s bound. Note that “strong”
refers to the stronger bounds (i.e., the first inequalities) while “weak” refers to the
weaker but simpler bounds (i.e., the second inequalities) in (2.12) and (3.1). We have
considered the spectral norm as well as the Frobenius norm for measuring pertur-
bations. In the latter case, we have used the tighter bound (A.4) instead of (2.12)
and set |||E|||:= EF,k =σ1(E)2+···+σ2
min{k,n−k}(E). For the spectral norm,
Figures 2(a) and (c) show that our new bound performs better for |
||E|||close to s/2,
especially when kis larger. For the Frobenius norm, the new bound seems to be
better on average only for k=1andwhen|
||E|||is close to s/2.
The following example illustrates that Stewart’s bound can become very conser-
vative for |
||E|||≈s/2.
Example 3.9. For A11,A
22 ∈C2×2with disjoint spectra consider
A=A11 0
0A22,E
t=st
2
⎡
⎢
⎢
⎣
1000
0000
0100
0001
⎤
⎥
⎥
⎦=Et
11 Et
12
Et
21 Et
22,
where 0 ≤t<1ands=sep
2(A11,A
22) is the separation with respect to the spectral
norm. Then Et2=Et
112=Et
222=Et
212=st/2, Et
122=0,andsE=
s−Et
112−Et
222=s(1−t). If ·=|||·|
||=·2, then our bound (3.1) and Stewart’s
bound in Theorem 2.5 are both applicable for t<1. As ttends to 1, Stewart’s bound
tends to infinity, while our bound (3.1) tends to 1. If s=sep
F(A11,A
22) denotes the
separation with respect to the Frobenius norm and |||·|
||=·F,2, then our bound (3.1)
and Stewart’s bound in Theorem 2.5 are both applicable for t<1/√2. As ttends
to 1/√2, Stewart’s bound tends to 1/(23/2−2) ≈1.207, while our bound (3.1) tends
to 1.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
ON A PERTURBATION BOUND FOR INVARIANT SUBSPACES 613
0 0.1 0.2 0.3 0.4 0.5
0
20
40
60
80
100
norm of random perturbation E
%new bound is better
strong (k=1)
weak (k=1)
strong (k=2)
weak (k=2)
strong (k=3)
weak (k=3)
(a) n= 7, spectral norm
0 0.1 0.2 0.3 0.4 0.5
0
20
40
60
80
100
norm of random perturbation E
%new bound is better
strong (k=1)
weak (k=1)
strong (k=2)
weak (k=2)
strong (k=3)
weak (k=3)
(b) n= 7, Frobenius norm
0 0.1 0.2 0.3 0.4 0.5
0
20
40
60
80
100
norm of random perturbation E
%new bound is better
strong (k=1)
weak (k=1)
strong (k=2)
weak (k=2)
strong (k=20)
weak (k=20)
(c) n= 50, spectral norm
0 0.1 0.2 0.3 0.4 0.5
0
20
40
60
80
100
norm of random perturbation E
%new bound is better
strong (k=1)
weak (k=1)
strong (k=2)
weak (k=2)
(d) n= 50, Frobenius norm
Fig. 2.Performance of new perturbation bound (3.1) compared to Stewart’s bounds (2.12)
(spectral norm) or (A.4) (Frobenius norm) for block diagonal matrices and random perturbations.
4. The case of a simple eigenvalue. The theorem below gives the second-
order expansion of a simple eigenvalue as well as the first-order expansion of the
associated eigenvector. These expansions are well known and can be found, e.g.,
in [17]. Our novel contributions consist of the existence regions and the bounds for
the remainders in the expansion. Below, M†and Mdenote the Moore–Penrose
inverse and the Drazin inverse of M∈Cn×n, respectively. Furthermore, θ(x, y)=
arccos(|x∗y|/(x2y2)) denotes the angle between the one-dimensional subspaces
Cxand Cy.
Theorem 4.1. Let x0∈Cnbe a normalized right eigenvector of A∈Cn×n
belonging to a simple eigenvalue λ0∈C(i.e., Ax0=λ0x0,x02=1). Let y0∈C
be a left eigenvector such that y∗
0A=λ0y∗
0and y∗
0x0=1. Moreover, let
c=x∗
0(λ0I−A)2,
P0=I−x0x∗
0,
p0=y02,
κ0=y02+y02
2−1,
s0=σn−1(P0(λ0I−A)),
where σn−1(·)denotes the second smallest singular value.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
614 MICHAEL KAROW AND DANIEL KRESSNER
(i) If E2<1
2(s2
0+c2−c), then there exists an eigenvector xEof A+E
depending holomorphically on Esuch that x∗
0xE=1and
xE=(I+[P0(λI −A)]†E)x0+ξE
for some ξE∈Cnwith
ξE2≤6
s2
0E2
2.
Furthermore,
xE−x02=tan(θ(x0,x
E)) ≤2E2/s0.
(ii) If E2<s
0/(2κ0), then there exists an eigenvector ˜xEof A+Edepending
holomorphically on Esuch that y∗
0˜xE=1and
˜xE=x0+(λ0I−A)Ex0+˜
ξE
for some ˜
ξE∈Cnwith
(4.1) ˜
ξE2≤6κ2
0
s2
0E2
2.
Furthermore,
˜xE−x02≤2κ0
s0E2and tan(θ(x0,˜xE)) ≤
2κ0
p0E2
s0−2κ0
p0p2
0−1E2
.
(4.2)
The corresponding eigenvalue of A+Esatisfies
λE=λ0+y∗
0E˜xE
(4.3)
=λ0+y∗
0Ex0+E
=λ0+y∗
0Ex0+y∗
0E(λ0I−A)Ex0+˜
E
with E=y∗
0(˜xE−x0)and ˜
E=y∗
0E˜
ξE. We have
(4.4) |E|≤2p0κ0
s0E2
2and |˜
E|≤6p0κ2
0
s2
0E3
2.
Proof. After a unitary similarity transformation we may assume that
A=λ0A12
0A22,A
22 −λ0Inonsingular,x
0=1
0,y
∗
0=[1 r]∈C1×n,
where r=A12(λ0I−A22)−1.Furthermore,
(λ0I−A)=0A12(λ0I−A22)−2
0(λ0I−A22)−1=0r(λ0I−A22)−1
0(λ0I−A22)−1,
P0(λ0I−A22)†=00
0λ0I−A22†
=00
0(λ0I−A22)−1.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
ON A PERTURBATION BOUND FOR INVARIANT SUBSPACES 615
The statements of the theorem are obtained by specializing Theorems 3.1 and 3.6 to
the case A11 =λ0,·=|||·|
||=·2. In this case we have the following identities:
T(z)=(λI −A22)z,
T−1(E21)=(λI−A22)−1E21,
A122=x∗
0(λ0I−A)2=c,
p=1+r2
2=y02=p0,
κ=1+r2
2+r2=y02+y02
2−1=κ0,
sep2(λ0I,A22)=σmin(λ0I−A22)=σn−1(P0(λ0I−A22)) = s0.
Let E∈Cn×nwith E2<1
2(s2
0+c2−c). According to Theorem 3.1 there exists
a vector zE∈Cn−1depending holomorphically on Esuch that xE=[1 z
E]is an
eigenvector of A+E,zE2≤2
s0E2and zE−T−1(E21)2≤6E2
2/s2
0. Clearly,
x∗
0xE=1andzE2=tan(θ(x0,x
E)). It is straightforward to verify that
ξE2=xE−(I+[P0(λI −A)]†E)x02=zE−T−1(E21)2.
This concludes the proof of (i).
To show (ii), suppose that E2<κ
0/(2s0). According to Theorem 3.6 there ex-
ists a vector wE∈Cn−1depending holomorphically on Esuch that ˜xE=1r
0I1
wE
is an eigenvalue of A+Eand
wE2≤2κ0
p0s0E2as well as wE−T−1(E21)2≤6κ2
0
p0s2
0E2
2.
It is easily verified that
˜xE−x0=r
IwEand
˜
ξE=˜xE−(x0+(λ0I−A)Ex0)=r
I(wE−T−1(E21)).
This yields (4.1) and the first inequality in (4.2), since [rI]2=p0.Wehave
˜xE=(1+rwE)1(1+rwE)−1w
E.Thus, tan(θ(x0,˜xE)) = (1+ rwE)−1wE2≤
wE2/(1 −r2wE2). This implies the second inequality in (4.2). Equation (4.3)
follows by multiplying the identity (A+E)˜xE=λE˜xEwith y∗
0. The estimates in
(4.4) are then obvious.
Let A∈Cn×nbe a normal matrix. Then we can take y0=x0in Theorem
4.1. Moreover, the separation s0=σn−1(P0(λ0I−A)) = σn−1(λ0I−A)equals
the distance of λ0to the set Λ(A)\{λ0}, and the identities a=0,p0=κ0=1,
[P0(λ0I−A)]†=(λ0I−A)†=(λ0I−A)hold. We thus have the following corollary
to Theorem 4.1.
Corollary 4.2. Let A∈Cn×nbe a normal matrix and consider a normalized
eigenvector x0∈Cnbelonging to a simple eigenvalue λ0of A.Lets0denote the
distance of λ0to the rest of the spectrum of A0, that is, s0=min{|λ0−ν|:ν∈
Λ(A),ν =λ0}.LetE∈Cn×nwith E2<s
0/2. Then there exists an eigenvector
xEof A+Edepending holomorphically on Esuch that x∗
0xE=1and
xE=(I+(λI −A)†E)x0+ξE
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
616 MICHAEL KAROW AND DANIEL KRESSNER
for some ξE∈Cnwith ξE2≤6E2
2/s2
0.Furthermore, xE−x02=tan(θ(x0,x
E)) ≤
2E2/s0.The associated eigenvalue satisfies
λE=λ0+x∗
0ExE=λ0+x∗
0Ex0+E=λ0+y∗
0Ex0+x∗
0E(λ0I−A)†Ex0+˜
E
with |E|≤2E2
2/s0and |˜
E|≤6E3
2/s2
0.
5. Conclusions. By establishing a link to the coalescence of pseudospectral
components, we have derived a new perturbation bound for invariant subspaces. As
the bound turns out to be sharp for a 2×2 example, no further obvious improvement
of the bound seems to be possible. Moreover, we establish a novel bound for the
remainder term of a well-known perturbation expansion. Even the (modified) special-
ization of this remainder bound to the case of individual eigenvectors and eigenvalues
appears to be new. We believe that such bounds for remainder terms are important;
e.g., they can be used for quantifying the validity for condition numbers frequently
used in practice, e.g., in MATLAB and LAPACK [2].
As a side result, we have shown that Stewart’s classical result on the perturbation
of invariant subspaces is a direct consequence of the Newton–Kantorovich theorem.
We believe that there is some interest in this observation, as it may more easily allow
for extensions of Stewart’s result to different settings.
Appendix A. Stewart’s result via the Newton--Kantorovich theorem. In
this section, we show that Theorem 2.5 is a special case of the Newton–Kantorovich
theorem formulated in [12, p. 536].
Theorem A.1. Let E,Zbe Banach spaces and let f:Z→Ebe twice continu-
ously differentiable in a sufficiently large neighborhood Ωof Z∈Z. Suppose that there
exists a linear operator T:Z→Ehaving a continuous inverse T−1and satisfying the
following conditions:
T−1(F(Z))≤η,(A.1)
T−1◦F(Z)−I≤δ,(A.2)
T−1◦F(˜
Z)≤K∀˜
Z∈Ω.(A.3)
If δ<1and h:= ηK
(1−δ)2<1
2, then there exists a solution ZEof F(ZE)=0such that
ZE−Z≤r0with r0:= 2η
(1 −δ)(1 + √1−2h).
We apply Theorem A.1 to the setting of section 2.4:
0=F(Z):=−f(A+E,Z)=−E21 +Z(A11 +E11)−(A22 +E22)Z+Z(A12 +E12)Z
with E=Z=C(n−k)×k,Z=0,andT:Z→ ZA11 −A22Z. In the following, ·
denotes a consistent family of matrix norms.
Condition (A.1). We have
T−1(F(0))=T−1(E21)≤E21
s=: η,
where s=sep(A11,A
22).
Condition (A.2). From
F(0) : Z→ (A22 +E22)Z−Z(A11 +E11)=T(Z)+T(Z)
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
ON A PERTURBATION BOUND FOR INVARIANT SUBSPACES 617
with T(Z):=E22 ·Z−Z·E11, it follows that
T−1◦F(0) −I=T−1◦T≤E11+E22
s=: δ.
Condition (A.3). Since the second derivative of fis constant, it immediately
follows that
T−1◦F(˜
Z)≤2A12 +E12
s≤2A12+E12
s=: K.
Summary. Setting sE=s−E11−E22, we finally obtain
h=ηK
(1 −δ)2=2E21(A12+E12)
s2
E
r0=2η
(1 −δ)(1 + √1−2h)=2E21
sE+s2
E−4E21(A12+E12).
Theorem A.1 now states the existence of a solution ZEto F(Z)=−f(A+E,ZE)=0
with ZE≤r0if δ<1andh<1
2. This coincides precisely with the statement of
Theorem 2.5.
Extension. The statement of Theorem 2.5 can be improved when we assume that
·is unitarily invariant. In this case, the quantities δ, K in the derivation above can
be replaced by the potentially smaller quantities
δ=E112+E222
s,K=2
A122+E122
s.
Consequently, the bound of Theorem 2.5 becomes
(A.4) ZE≤ 2E21
sE+s2
E−4E21(A122+E122)<2E21
sE
under the condition E21(A122+E122)<s
2
E/4 with sE=s−E112−E222.
Acknowledgments. The authors thank Pete Stewart for very helpful discus-
sions. Parts of this work were prepared while the first author was visiting FIM (In-
stitute for Mathematical Research) at ETH Zurich. The generous hospitality of FIM
is gratefully acknowledged.
REFERENCES
[1] R. Alam and S. Bora,On stable eigendecompositions of matrices,SIAMJ.MatrixAnal.
Appl., 26 (2005), pp. 830–848.
[2] Z. Bai, J. W. Demmel, and A. McKenney,On computing condition numbers for the non-
symmetric eigenproblem, ACM Trans. Math. Software, 19 (1993), pp. 202–223.
[3] R. Bhatia,Matrix Analysis, Springer-Verlag, New York, 1997.
[4] R. Byers,A LINPACK-style condition estimator for the equation AX −XBT=C, IEEE
Trans. Automat. Control, 29 (1984), pp. 926–928.
[5] F. Chatelin,Spectral Approximation of Linear Operators, Academic Press, New York, 1983.
[6] C. Davis and W. M. Kahan,The rotation of eigenvectors by a perturbation. III, SIAM J.
Numer. Anal., 7 (1970), pp. 1–46.
[7] J. W. Demmel,The condition number of equivalence transformations that block diagonalize
matrix pencils, SIAM J. Numer. Anal., 20 (1983), pp. 599–610.
[8] J. W. Demmel,Computing stable eigendecompositions of matrices, Linear Algebra Appl.,
79 (1986), pp. 163–193.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
618 MICHAEL KAROW AND DANIEL KRESSNER
[9] I. Gohberg, P. Lancaster, and L. Rodman,Invariant Subspaces of Matrices with Applica-
tions, Classics in Appl. Math. 51, SIAM, Philadelphia, 2006.
[10] L. Grammont and A. Largillier,On -spectra and stability radii, J. Comput. Appl. Math.,
147 (2002), pp. 453–469.
[11] M. Gu and M. L. Overton,An algorithm to compute Sepλ, SIAM J. Matrix Anal. Appl.,
28 (2006), pp. 348–359.
[12] L. V. Kantorovich and G. P. Akilov,Functional Analysis, 2nd ed., Pergamon Press, Oxford,
UK, 1982.
[13] M. Karow,Inclusion theorems for pseudospectra of block triangular matrices, in preparation.
[14] T. Kato,Perturbation Theory for Linear Operators, Classics in Math., Springer-Verlag, Berlin,
1995.
[15] S. G. Krantz and H. R. Parks,The Implicit Function Theorem,Birkh¨auser, Boston, 2003.
[16] D. Kressner,Numerical Methods for General and Structured Eigenvalue Problems, of Lect.
Notes Comput. Sci. Eng. 16, Springer-Verlag, Berlin, 2005.
[17] C. D. Meyer and G. W. Stewart,Derivatives and perturbations of eigenvectors,SIAMJ.
Numer. Anal., 25 (1988), pp. 679–691.
[18] G. W. Stewart,Error bounds for approximate invariant subspaces of closed linear operators,
SIAM J. Numer. Anal., 8 (1971), pp. 796–808.
[19] G. W. Stewart,Error and perturbation bounds for subspaces associated with certain eigen-
value problems, SIAM Rev., 15 (1973), pp. 727–764.
[20] G. W. Stewart,Smooth local bases for perturbed eigenspaces, UMIACS-TR-2012-08, CS-TR-
4010, May 2012, Department of Computer Science, University of Maryland.
[21] G. W. Stewart and J.-G. Sun,Matrix Perturbation Theory, Academic Press, New York,
1990.
[22] J.-G. Sun,Estimation of the separation of two matrices, J. Comput. Math., 2 (1984),
pp. 189–200.
[23] J.-G. Sun,Estimation of the separation of two matrices. II, J. Comput. Math., 3 (1985),
pp. 19–26.
[24] J.-G. Sun,Perturbation expansions for invariant subspaces, Linear Algebra Appl., 153 (1991),
pp. 85–97.
[25] J.-G. Sun,Stability and Accuracy: Perturbation Analysis of Algebraic Eigenproblems, Techni-
cal report UMINF 98-07, Department of Computing Science, University of Ume˚a, Ume˚a,
Sweden, 1998.
[26] L. N. Trefethen and M. Embree,Spectra and Pseudospectra. The Behavior of Nonnormal
Matrices and Operators, Princeton University Press, Princeton, NJ, 2005.
[27] J. M. Varah,On the separation of two matrices, SIAM J. Numer. Anal., 16 (1979),
pp. 216–222.
[28] H. Xu,Bounds on the separation of two matrices, J. Fudan Univ. Nat. Sci., 33 (1994), pp. 413–
420.
[29] S.-F. Xu,Lower bound estimation for the separation of two matrices, Linear Algebra Appl.,
262 (1997), pp. 67–82.
Downloaded 12/14/17 to 130.149.176.172. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php