Document [original]

Linear Stable Sampling Rate: Optimality of 2D Wavelet

Reconstructions from Fourier Measurements

Ben Adcock, Anders C. Hansen, Gitta Kutyniok, and Jackie Ma

March 1, 2014

Abstract

In this paper we analyze two-dimensional wavelet reconstructions from Fourier samples within the frame-

work of generalized sampling. For this, we consider both separable compactly-supported wavelets and boundary

wavelets. We prove that the number of samples that must be acquired to ensure a stable and accurate reconstruc-

tion scales linearly with the number of reconstructing wavelet functions. We also provide numerical experiments

that corroborate our theoretical results.

1 Introduction

A problem that appears in multiple disciplines is the reconstruction of an object from linear measurements.

One special situation of particular importance which we will focus on in this paper are Fourier measurements.

This particular reconstruction problem occurs in numerous applications such as Fourier optics, radar imaging,

magnetic resonance imaging (MRI) and X-ray CT (the latter after application of the Fourier slice theorem). One

of the main issues is that we are only able to acquire finitely many samples, since we cannot process an infinite

amount of information in practice. The reconstruction of an object from a finite collection of Fourier samples can

be obtained by a truncated Fourier series. However, this typically leads to undesirable effects such as the Gibbs

phenomenon, which are wild oscillation near points of discontinuity.

Beside the Gibbs phenomenon, the convergence of the Fourier series (in the Euclidean norm) is slow. Con-

versely, wavelets bases are well-known to achieve much better results (see [22, Ch. 9]). Indeed, wavelets have

much better localization properties than the standard Fourier transform, leading to a better detection of image

features. For this reason, wavelets have found widespread used in compression and denoising. For example the

algorithm of JPEG 2000 is based on wavelets. Moreover, they come equipped with fast algorithms which are of

great importance in today’s age of technology. Wavelets also play a pivotal role in biomedical imaging, with an

example being the technique of wavelet encoding in MRI (see [19, 23, 29, 30]).

The issue, however, is that physical samplers such as an MRI scanner naturally yield Fourier measurements,

not wavelet coefficients. Thus in order to exploit the power of wavelets, we need a reconstruction algorithm

capable of producing wavelet coefficients given a fixed set of Fourier measurements.

1.1 Generalized sampling

Generalized sampling is a framework for this problem developed by two of the authors in a series of papers

[2, 3, 4, 5] and based on past work of Unser and Aldroubi [27, 28], Eldar [15] and Hrycak and Gröchenig [18].

The theory allows for stable and accurate reconstructions in an arbitrary reconstruction system of choice given

fixed measurements with respect to another system.

The problem of reconstructing wavelet coefficients from Fourier samples is an important example of this ab-

stract framework. Mathematically, the reconstruction problem can be modeled in a separable Hilbert space H,

resulting in an infinite-dimensional linear algebra problem. Generalized sampling provides a faithful discretiza-

tion of such a problem. The stable sampling rate (see Section 2 for details) is fundamental characteristic within

generalized sampling that determines how many samples are needed in order to obtain stable and accurate recon-

structions with a given number of reconstruction elements. It is therefore vital that this rate be determined for

important instances of generalized sampling.

1.2 Our contribution

In [7] the respective authors proved the linearity of the stable sampling rate for one-dimensional compactly

supported wavelets based on finitely many Fourier samples. This means, up to a constant, one needs the same

number of samples as reconstruction elements. Our results extend the previous one to dimension two, although

higher dimensional results can be obtained in a straightforward manner. This is an important extension, since

most of the above applications involve two- or three-dimensional images. The crucial part that makes our result

non-trivial is the allowance of non-diagonal scaling matrices neglecting straightforward arguments for separable

two dimensional wavelets from 1D to 2D. Moreover, we will not only prove the linearity for standard two-

dimensional separable wavelets, but also for two-dimensional boundary wavelets which are of particular interest

for smooth images. This case was not considered in [7] but was addressed recently in [1] for the case on 1D

nonuniform Fourier samples. Here, for simplicity, we consider only uniform samples but in the 2D setting.

At this stage we note that other higher dimensional concepts, such as curvelets and shearlets, can provide

better approximations rates for cartoon-like-images; a specific class of functions ([10], [20]). However, this

paper serves as an extension of known 1D results [7]. It thus provides a necessary first step in the study of

reconstructions from Fourier samples within the context of generalized sampling in higher-dimensional settings.

We shall discuss shearlets in an upcoming paper.

Let us now make one further remark. The reader may at this stage wonder why, given a vector yof Fourier

samples of a 2D image, one cannot simply form the vector x=U−1

dft y, and then form z=Vdwtxand hope that z

would represent wavelet coefficients of the function fto be reconstructed (here Udft and Vdwt denote the discrete

Fourier and wavelet transforms respectively). Unfortunately, xrepresents a discretization of the truncated Fourier

series of f. Thus, ignoring the wavelet crime [25, p. 232] for a moment, we find that zrepresents the wavelet

coefficients of the truncated Fourier series and not the wavelet coefficients of fitself (taking the wavelet crime

into account, zwould actually be an approximation to the wavelet coefficients of the truncated Fourier series).

Thus, zwill typically have lost all the decay properties of the original wavelet coefficients. Moreover, if we map

zback to the image domain we get x=V−1

dwtzand thus we do not gain anything as xis the discretized truncated

Fourier series.

This paper is about getting the actual wavelet coefficients of ffrom the Fourier samples, thus preserving

all the decay properties of the original coefficients. This is done using generalized sampling. We show that the

number of samples that must be acquired to ensure a stable and accurate reconstruction scales linearly with the

number of reconstructing wavelet functions. This means that one can reconstruct a function from its Fourier

coefficients yet get error bounds on the reconstruction (up to a constant) in terms of the decay properties of

the wavelet coefficients. Put in short: seeing Fourier coefficients of a function is asymptotically as good as

seeing the wavelet coefficients directly. As we will see in the numerical experiments, the generalized sampling

reconstruction provides a substantial gain over the classical Fourier reconstruction.

The outline of the remainder of this paper is as follows. In Section 2 we give a more elaborate introduction

into generalized sampling and present the main results of this method. Furthermore, we introduce the stable

sampling rate to which our main focus is devoted. Having presented the framework we are dealing with, we

introduce the wavelet reconstruction systems and the Fourier sampling systems in Section 3. The main results

are then presented in Section 4. Finally we demonstrate our theoretical results in application by presenting some

numerical experiments in Section 5. Proofs of the main results are given in Section 6.

2 Generalized Sampling

In this section, we recall the main definitions and results of the methodology of generalized sampling from

[2, 3, 4, 5]. For this, we start by introducing a general model situation for reconstruction from samples with

associated quality measures.

2.1 General Setting

Let Hbe a (separable) Hilbert space Hwith an inner product h·,·i, which will be our ambient space throughout

this section. For modeling the acquisition of samples, let S ⊂ H be a closed subspace and {sk}k∈N⊂ H be an

orthonormal basis for S. We will refer to {sk}k∈Nas the sampling system and Sas the sampling space. For a

signal f∈ H, we then assume that the associated samples (also called measurements) are given by

m(f)k:= hf, ski, k ∈N.(2.1)

Based on the measurements m(f)=(hf, ski)k∈N, we aim to reconstruct the original signal f. To be able to

utilize some prior knowledge concerning the initial signal f, we also require a carefully chosen reconstruction

system {ri}i∈N⊂ H. The space R= span{ri:i∈N}, in which fis assumed to lie or be well approximated

in, is then referred to as the corresponding reconstruction space.

Since in an algorithmic realization, only finitely many samples – and likewise a finite linear combination of

reconstruction elements – is possible, we also introduce the finite-dimensional spaces

SM= span{s1, . . . , sM}, M ∈N

and

RN= span{r1, . . . , rN}, N ∈N.

Thus, the reconstruction problem can now be phrased as follows: Given samples m(f)1, . . . , m(f)Mof an initial

signal f∈ H, compute a good approximation to fin the reconstruction space RN. Aiming to compare different

methodologies for solving this problem, we next formally introduce the notion of reconstruction method.

Definition 2.1. Let a sampling system {sk}k∈Nand reconstruction spaces RN,N∈Nbe defined as before.

Further, let f∈ H and let M, N ∈N. Then some mapping

FN,M :H −→ RN

is called reconstruction method, if, for every f∈ H, the signal FN,M (f)depends only on m(f)1, . . . , m(f)M,

where the samples m(f)jare defined in (2.1).

We can now strengthen the reconstruction problem in the following way: Given the dimension of the recon-

struction space N, how many samples Mare required to obtain a stable and optimally accurate reconstruction

method? This intuitive phrasing will be made precise in the next subsections.

2.2 Quality Measures for Reconstruction Methods

We start by introducing two quality measures for reconstruction methods that analyze the degree of approxima-

tion within the reconstruction space and robustness for reconstruction. For this, throughout this subsection, let

{sk}k∈Nbe a sampling system, and let RN,N∈Nbe reconstruction spaces.

The first measure quantifies the closeness of the reconstruction to the ‘best’ reconstruction in the sense of the

orthogonal projection onto the reconstruction space. In the sequel, for the orthogonal projection onto a closed

subspace U, we will always utilize the notation PU:H −→ U.

Definition 2.2 ([6]).Let FN,M :H −→ RNbe a reconstruction method. Moreover, let µ=µ(FN,M )>0be

the least number such that

kf−FN,M (f)k ≤ µkf−PRN(f)kfor all f∈ H.

Then we call µthe quasi-optimality constant of FN,M . If no such constant exists, then we write µ=∞. If µis

small, we say FN,M is quasi-optimal.

The second measure quantifies stability in the sense of robustness against perturbations.

Definition 2.3 ([6]).Let FN,M :H −→ RNbe a reconstruction method. The (absolute) condition number

κ=κ(FN,M )>0is then given by

κ= sup

f∈H

lim

ε&0sup

g∈H,

0<km(g)k`2≤εkFN,M (f+g)−FN,M (f)k

km(g)k`2,

where m(g)=(m(g)1, . . . , m(g)M,0, . . .). If κis small, then we say FN,M is well conditioned, otherwise FN,M

is called ill-conditioned.

We now merge both definitions in order to have only one single measure for a reconstruction method.

Definition 2.4 ([6, 7]).Let FN,M :H −→ RNbe a reconstruction method. The reconstruction constant of FN,M

is defined as

C(FN,M ) = max{µ(FN,M ), κ(FN,M )},

where µ(FN,M )is the quasi-optimality constant and κ(FN,M )is the (absolute) condition number.

2.3 The Infimum Cosine Angle

Intuitively, the angle between the sampling and reconstruction space should play a role in determining the recon-

struction constant for a given reconstruction method. As we will see, this will be in particular the case for the

reconstruction method of generalized sampling. To prepare those results, in this subsection, we first introduce

a particularly useful notion of angle. The concept of principal angles between two Euclidean subspaces is well

known in the literature. However, for arbitrary closed subspaces of a Hilbert space many different notions of an

angle exist. We exemplarily mention the Friedrichs angle and the Dixmier angle (cf. [12, 14]). For our analysis,

the notion of infimum cosine angle – utilized, for instance, in [8, 28] – will be the most appropriate. It is defined

as follows.

Definition 2.5. Let U, V be closed subspaces of H. Then the infimum cosine angle cos(ω(U, V )) between Uand

Vis defined by

cos(ω(U, V )) = inf

u∈U,

kuk=1 kPVuk, ω(U, V )∈[0, π/2].

We remark that the infimum cosine angle is not symmetric in general. For example, if Uand Vare two

non-trivial closed subspaces of Hand U6=Vwith U⊂V, then cos(ω(U, V )) = 1, whereas cos(ω(V, U)) = 0.

The following general characterization of pairs of closed subspaces for which the infimum cosine angle is not

symmetric was proven in [9].

Lemma 2.6 ([9]).For two non-trivial closed subspaces U, V ⊂ H, we have cos(ω(U, V )) 6= cos(ω(V, U)) if

and only if one of these quantities is zero and the other is positive.

A positive infimum cosine angle between the sampling and reconstruction space will be crucial for enabling

reconstruction at all. The next theorem provides a characterization of subspaces which admit a positive infimum

cosine angle.

Theorem 2.7 ([26]).Let U, V be closed subspaces of H. Then, we have that cos(ω(U, V )) >0if and only if

U∩V⊥={0}and U+V⊥is closed in H.

This characterization gives rise to the following definition.

Definition 2.8 ([6, 8]).If cos(ω(U, V ⊥)) >0for two closed subspaces U, V of H, then we say Uand Vobey

the subspace condition. In this case, we define the associated (oblique) projection

PU,V :U⊕V−→ H

with range of Pequal to Uand kernel of Pequal to V.

We wish to mention that oblique projections are customarily used in sampling theory, such as, for instance,

in [8, 15, 16, 28].

2.4 Reconstruction Method of Generalized Sampling

We are now ready to introduce the method of generalized sampling. To this end, we will always assume that

the reconstruction space Rand the sampling space S⊥fulfill the subspace condition. In other words, we have

cos(ω(R,S)) >0. For any M∈N, let PSMbe the orthogonal projection given by

PSM:H −→ SM, h 7→

k=1hh, skisk.

This enables us to formally define the reconstruction method of generalized sampling.

Definition 2.9. For f∈ H and N, M ∈N, we define the reconstruction method of generalized sampling

GN,M :H → RNby

hPSMGN,M (f), rji=hPSMf, rji, j = 1, . . . , N. (2.2)

We also refer to GN,M (f)as the generalized sampling reconstruction of f.

We emphasize that this is indeed a reconstruction method in the sense of Definition 2.1, since the right-

hand side of (2.2) only depends on the given samples (hf, ski)M

k=1 and not on fitself. Moreover, generalized

sampling is a linear reconstruction method. Algorithmically, to determine GN,M (f), i.e., solving (2.2), can be

phrased as the numerical linear algebra problem of computing the coefficients α[N]= (α1, . . . , αN)∈CNas

the least-squares solution of

U[M,N]α[N]=m(f)[M],where U[M,N]=





u11 . . . u1N

.....

uM1. . . uMN





, uij =hrj, sii.

Note that, if U[M,N]is well-conditioned this requires O(MN)operations in general. However, in the case of

Fourier samples and wavelet reconstruction one can make use of the Fast Fourier Transform and the discrete

wavelet transform and thus reduce this figure down to only O(Mlog M)operations.

The philosophy of generalized sampling is to allow the number of samples Mto grow independently of the

fixed number of reconstruction elements N. This flexibility of Mand Nis crucial for stable reconstruction. The

next theorem guarantees the existence of the reconstruction for any f∈ H provided that the number of samples

Mis large enough.

Theorem 2.10 ([6]).Let N∈N. Then, there exists an M0∈N, such that, for every f∈ H, (2.2) has a unique

solution GN,M (f)for all M≥M0. In particular, we then have

GN,M =PRN,(PSM(RN))⊥.

Moreover, the smallest M0is the least number such that cos(ω(RN,SM0)) >0.

It is a priori not clear how to find M∈Nlarge enough such that cos(ω(RN,SM)) >0, or even determine

the smallest such value M0∈N. In the next subsection, it will turn out that this is intimately related to the

reconstruction constant defined in Definition 2.4, and will lead to the notion of a stable sampling rate.

2.5 Stable Sampling Rate

In Subsection 2.2, we introduced the reconstruction constant as the main quality measure for stable and accurate

reconstructions. Intriguingly, we can now relate this notion to the infimum cosine angle in the case of generalized

sampling as reconstruction method.

Theorem 2.11 ([6]).Retaining the definitions and notations from Subsection 2.4, for all f∈ H, we have

kGN,M (f)k ≤ 1

cos(ω(RN,SM))kfk,

and

kf−PRNfk ≤ kf−GN,M (f)k ≤ 1

cos(ω(RN,SM))kf−PRNfk.

In particular, these bounds are sharp. Moreover, the reconstruction constant of generalized sampling C(GN,M )

obeys

C(GN,M ) = µ(GN,M ) = κ(GN,M ) = 1

cos(ω(RN,SM)).

Hence, in order to obtain stable and accurate reconstructions, it is both necessary and sufficient to control the

angle between RNand SM. This leads us to the definition of the stable sampling rate.

Definition 2.12. For N∈Nand θ > 1, the stable sampling rate is defined as

Θ(N, θ) = min {M∈N:C(GN,M )< θ}.

Note that the stable sampling rate is of interest, since it determines the number of samples required for

guaranteed, quasi-optimal and numerically stable reconstructions.

We next aim to compare generalized sampling to other reconstruction methods. For this, we first introduce a

class of reconstruction methods, which recover signals from the reconstruction space exactly.

Definition 2.13. A reconstruction method FN,M :H −→ RNis called perfect, if FN,M (f) = fwhenever

f∈ RN.

Note that any reconstruction method with finite quasi-optimality constant is automatically perfect. The next

result proves that generalized sampling is in the sense superior to any other perfect reconstruction method that its

condition number as well as even its reconstruction constant is smaller.

Theorem 2.14 ([6]).Let M≥N, and let FN,M :H −→ RNbe a linear or non linear perfect reconstruction

method. Then

κ(GN,M )≤κ(FN,M ),

where GN,M is the reconstruction method of generalized sampling. In particular, we have

C(GN,M )≤C(FN,M ).

Next, we aim to compare the quasi-optimality constant of generalized sampling to other reconstruction meth-

ods. For this, assume that the stable sampling rate of generalized sampling is linear in N, i.e., Θ(N, θ) = O(N)

as N→ ∞, and assume that there exist constants Cf, Df, γf>0depending on the initial signal f∈ H such

that

CfN−γf≤ kf−PRNfk ≤ DfN−γf,for all N∈N.(2.3)

The following result then shows that the error of any reconstruction method FMcan be only up to a constant

better than the error of generalized sampling reconstruction.

Theorem 2.15 ([6]).Suppose that the stable sampling rate Θ(N, θ)is linear in N, i.e. Θ(N, θ) = O(N)as

N→ ∞. Let f∈ H be fixed and let

FM: (m(f)1, . . . , m(f)M)7→ FM(f)∈ Rψf(M),

be a reconstruction method, where ψf:N−→ Nwith ψf(M)≤λM for some λ > 0. Assume that (2.3) holds.

Then, for any θ > 1, there exist constants d(θ)∈(0,1) and c(θ, Cf, Df)>0such that

kf−Gd(θ)M,M (f)k ≤ c(θ, Cf, Df)kf−FM(f)||,for all M∈N,

where GN,M denotes the generalized sampling reconstruction method.

3 Linear Sampling Rate for Compactly Supported 2D Wavelets

As already elaborated upon in the introduction, the situation of taking the Fourier basis as sampling system is of

particular importance to applications. A very common choice for a reconstruction system in imaging sciences are

2D wavelets, predominantly of compact support due to their high spatial localization.

In this section, we will state our main result for the situation of Fourier samples and reconstruction within

a wavelet basis using 2D compactly supported wavelets. More precisely, we will show linearity of the stable

sampling rate for the associated generalized sampling scheme, which shows by Theorem 2.15 the near-optimality

of this reconstruction method. For this, we start by defining first the reconstruction and second the sampling

space, followed by the statement of our main result. Due to its technical nature, we present its proof in a separate

section, namely Section 6.

3.1 Sampling and Reconstruction Spaces

3.1.1 Compactly Supported 2D Wavelets

We start by recalling the notion of scaling matrices.

Definition 3.1. Let Abe a 2×2matrix with non-negative integer entries and eigenvalues greater than one in

modulus. Then we call Aascaling matrix.

For the sake of brevity, in the sequel, we will use the notation

A=λ1λ2

λ3λ4.

Moreover, for the entries of Ajwe write

Aj= λ(j)

1λ(j)

λ(j)

3λ(j)

4!, j ∈N.

Notice that λ(j)

i6= (λi)jin general. For the sake of completeness, we next give the definition of a 2D multireso-

lution analysis (MRA). For more details, we refer to the existing literature, e.g [13, 21].

Definition 3.2. Let Abe a scaling matrix. Then a sequence of closed subspaces (Vj)j∈Zof L2(R2)is called a

multiresolution analysis, if the following properties are satisfied.

i) {0} ⊂ . . . ⊂Vj⊂Vj+1 ⊂. . . ⊂L2(R2),

ii) T

j∈Z

Vj={0},

iii) S

j∈Z

Vj=L2(R2),

iv) f∈Vj⇔f(A·)∈Vj+1,

v) there exists a function φ∈L2(R2)(called scaling function), such that

{φ0,m := φ(·−m) : m∈Z2}

constitutes an orthonormal basis for V0.

The associated wavelet spaces (Wj)j∈Zare then defined by

Vj+1 =Vj⊕Wj.

It is well known that there exist |det A| − 1corresponding compactly supported wavelets ψ1, . . . , ψ|det A|−1

such that

{ψp

j,m := |det A|j/2ψp(Aj·−m) : m= (m1, m2)∈Z2, p = 1,...,|det A|−1}

forms an orthonormal basis for Wjfor each j, see, e.g., [?]. We now consider the decomposition

L2(R2) = V0⊕M

j∈N

Wj,

where

V0:= span{φ0,m :m= (m1, m2)∈Z2}

and

Wj:= span{ψp

j,m :m= (m1, m2)∈Z2, p = 1,...,|det A|−1}, j ∈N.

3.1.2 Reconstruction Space

Our aim is to reconstruct functions that are supported on [0, a]2. To this end, suppose that the scaling function

and wavelet functions are supported in [0, a]2. To mimic the fact that practical applications can only handle finite

systems, we restrict to those functions whose support intersects [0, a]2, i.e., to the systems

Ω1={φ0,m :m= (m1, m2)∈Z2,−a<m1, m2< a}

and

Ω2={ψp

j,m :j∈N∪{0}, m = (m1, m2)∈Z2,−a < m1< a(λ(j)

1+λ(j)

2),

−a<m2< a(λ(j)

3+λ(j)

4), p = 1,...,|det A|−1}.

The reconstruction space Ris then defined as the closed linear span of these functions, which is

R= span{ϕ:ϕ∈Ω1∪Ω2}.

To define the finite-dimensional subspaces RN, we require an ordering for this system. The most natural way to

order Ω1∪Ω2is starting from wavelets at coarsest scale and then continue to higher scales. Within one scale,

one might order the translation (m1, m2)in a lexicographical manner starting from the smallest number up the

largest. More precisely, we fix m1and let m2run, increase m1by one and repeat. This then leads to the following

ordering of Ω1∪Ω2:

{ϕi}i∈N

={φ0,(−a+1,−a+1),...φ0,(−a+1,a−1), φ0,(−a+2,−a+1) ...,φ0,(−a+2,a−1), . . . , φ0,(a−1,a−1),

ψ1

0,(−a+1,−a+1), . . . , ψ1

0,(−a+1,m(0)

2−1), . . . , ψ1

0,(m(0)

1−1,−a+1), . . . , ψ1

0,(m(0)

1−1,m(0)

2−1),...,

ψ|det A|−1

0,(−a+1,−a+1), . . . , ψ|det A|−1

0,(−a+1,m(0)

2−1), . . . , ψ|det A|−1

0,(m(0)

1−1,−m(0)

2+1), . . . , ψ|det A|−1

0,(m(0)

1−1,m(0)

2−1),

ψ1

1,(−a+1,−m(1)

2+1), . . . , ψ1

1,(−a+1,m(1)

2−1), ψ1

1,(−a+2,−m(1)

2+1), . . . , ψ1

1,(−a+2,m(1)

2−1), . . .}(3.1)

where m(j)

1=a(λ(j)

1+λ(j)

2)and m(j)

2=a(λ(j)

3+λ(j)

4). We emphasize that the presented results do not depend

on the specific ordering within the scale. Finally, based on the chosen ordering, we define the reconstruction

space RNby

RN= span{ϕi:i= 1, . . . , N}, N ∈N.

In practise, it is much more common to use as approximation spaces those being generated up to a specific

scale, say, J. To mimic this approach, we coarsen the choices of N∈Nfor which we consider RNsuitably in

the following way. First, we observe that the number of functions being in Rup to a fixed scale J−1∈Nis

NJ= (2a−1)2+ (|det A|−1)

J−1

j=0

(a(λ(j)

1+λ(j)

2+ 1) −1)(a(λ(j)

3+λ(j)

4+ 1) −1).(3.2)

Lemma 3.3. Retaining the definitions and notations above, then NJ≤Ca(λ(J)

1+λ(J)

2)(λ(J)

3+λ(J)

4)for some

constant Cadepending on a.

Proof. The total number of elements NJin the reconstruction space RNJup to a scale J−1is prescribed by the

matrix Aand the support of the scaling function φand the wavelet ψ, respectively. However, since the support

of both φand ψis a square of the form [0, a]2, the total number of elements in the reconstruction space NJis the

same as if we would have generated the wavelet system with the transpose of A. Hence, it is sufficient to prove

NJ.(λ(J)

1+λ(J)

2)(λ(J)

3+λ(J)

4),

where the constant depends on a. We prove the result by induction. For J= 1 it is clearly true. Hence,

(2a−1)2+ (|det A|−1)

J−1

j=0

(a(λ(j)

1+λ(j)

2+ 1) −1)(a(λ(j)

3+λ(j)

4+ 1) −1)

.(λ(J−1)

1+λ(J−1)

2)(λ(J−1)

3+λ(J−1)

4)+(|det A|−1)(a(λ(J−1)

+λ(J−1)

2+ 1) −1)(a(λ(J−1)

3+λ(J−1)

4+ 1) −1)

Now,

AJ= λ(J)

1λ(J)

λ(J)

3λ(J)

4!= λ(J−1)

1λ(J−1)

λ(J−1)

3λ(J−1)

4!λ1λ2

λ3λ4= λ(J−1)

1λ1+λ(J−1)

2λ3λ(J−1)

1λ2+λ(J−1)

2λ4

λ(J−1)

3λ1+λ(J−1)

4λ3λ(J−1)

3λ2+λ(J−1)

4λ4!

which implies

λ(J)

1+λ(J)

2=λ(J−1)

1(λ1+λ2) + λ(J−1)

2(λ3+λ4),

λ(J)

3+λ(J)

4=λ(J−1)

3(λ1+λ2) + λ(J−1)

4(λ3+λ4).

Similar to the construction in [7], we now define the truncated scaling space by

V(a)

0:= span{φ0,m :m= (m1, m2)∈Z2,−a < m1, m2< a}

and the truncated wavelet spaces by

W(a)

j:= span{ψp

j,m :m= (m1, m2)∈Z2,−a < m1< a(λ(j)

1+λ(j)

2),

−a<m2< a(λ(j)

3+λ(j)

4), p = 1,...,|det A|−1}.

The reconstruction space of interest to us is then defined by

RNJ=V(a)

0⊕W(a)

0⊕. . . ⊕W(a)

J−1.

3.1.3 Sampling Space

To define the sampling space consisting of elements of the Fourier basis, we first choose T1, T2>0sufficiently

large such that

R ⊂ L2([−T1, T2]2).

Thus, only functions supported on [−T1, T2]2are relevant to us. Indeed, choosing T1≥a−1and T2≥2a−1

is enough. Note that λ(0)

1=λ(0)

4= 1 and λ(0)

2=λ(0)

3= 0. To allow an arbitrarily dense sampling, for each

ε≤1

T1+T2, we define the sampling vectors by

s(ε)

l=εe2πiεhl,·i ·χ[−T1,T2]2, l ∈Z2.(3.3)

Thus, we sample in each direction with the same sampling rate ε. Based on these sampling vectors, we now

define the sampling space S(ε)by

S(ε)= span ns(ε)

l:l∈Z2o.

The finite-dimensional subspaces S(ε)

M,M= (M1, M2)∈N×N, are then given by

S(ε)

M= span ns(ε)

l:l= (l1, l2)∈Z2,−Mi≤li≤Mi, i = 1,2o.

3.2 Main Result

Our main results concerns the stable sampling rate of the generalized sampling scheme for the sampling spaces

S(ε)

Mand the reconstruction spaces RNJ. By Theorem 2.11, for this, we have to control the infimum cosine angle

between the RNJand S(ε)

M. In particular, we wish to determine M= (M1, M2)∈N×Nsuch that, for given

θ > 1and J−1∈Nthe term cos(ω(RNJ,S(ε)

M)) can be bounded from below by θ−1, where NJdenotes the

total number of reconstruction elements up to scale J−1.

For this to work, we will assume that the scaling matrix does not distort the grid too much. To make this

precise, let IM={(l1, l2)∈Z2:−Mi≤li≤Mi, i = 1,2}, LM= [−M1, M1]×[−M2, M2]for (M1, M2)∈

N2. Then, we assume that the so-called mesh norm δobeys

δ:= max

x∈εA−J(LM)min

k∈εA−J(Z2)kx−y+kk∞<

log 1

√µ(LM)+ 1

4πmax{|L1|,|L2|,|L3|,|L4|},(3.4)

for some εindependent of J, where µdenotes the 2D lebesgue measure and Li, i = 1,2,3,4are bounds that are

obtained by Lemma 6.1. We will also discuss this assumption in Example 3.5 for better understanding.

The following result shows that the stable sampling rate is indeed linear in the considered situation, showing

that this scheme is superior to any other reconstruction method in the sense of Theorem 2.15.

Theorem 3.4. Let NJbe the number of reconstruction elements up to a fixed scale J−1, and let RNJand S(ε)

be the reconstruction space and sampling space, respectively. Furthermore, assume (3.4) is fulfilled. If θ > 1,

then there exists a constant S(θ)independent of Jand εsuch that, if

M1≥λ(J)

1+λ(J)

εS(θ), M2≥λ(J)

2+λ(J)

εS(θ),

then, for M= (M1, M2),

cos(ω(RNJ,S(ε)

M)) ≥1

θ.

In particular, the stable sampling rate obeys Θ(NJ, θ) = O(NJ)as NJ→ ∞ for every fixed θ > 1.

Summarizing, this result shows that the required number of Fourier samples is optimally small up to a constant

when utilizing 2D compactly supported wavelets for the generalized sampling reconstruction. This extends the

result of [7] to the two-dimensional case. We next consider a special, yet widely used choice for scaling matrices,

namely diagonal matrices. This includes two dimensional dyadic wavelets, which are those typically used in

applications. This situation is also considered in the numerical experiments presented in Section 5.

Example 3.5. In this example we want to demonstrate our results and, in particular, give some better understand-

ing of the construction and assumption (3.4). Furthermore, this example shall show that there are large classes of

2D wavelets that fulfill (3.4). For this purpose, let

A=2 0

0 2

be the scaling matrix, which – as mentioned before – gives rise to three wavelet generators. Let φand ψp, p =

1,2,3be two dimensional scaling and wavelet functions with compact support in [0, a]2, a ∈N, which might be

obtained by tensor products of one dimensional scaling and wavelet functions, see [13, 22].

As in Subsection 3.1.2, we define

Ω1:= {φ0,m :m= (m1, m2)∈Z2,|mi|< a, i = 1,2}(3.5)

and

Ω2:= {ψp

j,m :j∈N∪{0}, m = (m1, m2)∈Z2,−a<mi<2ja, i = 1,2, p = 1,2,3},(3.6)

since these are the only functions whose support intersects [0, a]2. Again, in line with our previous approach, we

then define the reconstruction space Rby

R= span{ϕ:ϕ∈Ω1∪Ω2},

order the elements Ω1∪Ω2analogously to (3.1), and set

RN= span{ϕi:i= 1, . . . , N}, N ∈N.

Now, each function ϕin RNcan be represented as a linear combination of scaling functions at highest level J,

in particular, there exist positive integers L1, L2, L3,and L4such that

ϕ=

l1=L3

l2=L4

αl1,l2φJ,(l1,l2), αl1,l2∈C.

This statement will be proven in Lemma 6.1. With a view to (3.4), the explicit expression of the bounds Li, i =

1,2,3,4are highly important. In fact, we should not choose them too large, otherwise (3.4) might not hold. Since

ϕ=

l1=L3

l2=L4hϕ, φJ,(l1,l2)iφJ,(l1,l2),

and is compactly supported, shifting by (l1, l2)far enough leads to the fact that the coefficients become zero. For

the scaling matrix A= diag(2,2) we have (see proof of Lemma 6.1)

L1=L3= 2J(3a−1), L2=L4=−a+ 2J(−a+ 1).

Moreover, the mesh norm δobeys δ≤ε

2J.Hence, for ε=1

4π(3a−1)

δ≤1

4π(3a−1)2J≤

log 2J

ε2+ 1

2π(3a−1)2J,(3.7)

so assumption (3.4) is fulfilled. Thus, we can apply Theorem 3.4 to obtain a constant Sθindependent of J, such

that for

M1, M2=2JSθ

ε

we have

cos(ω(RNJ,S(ε)

M)) >1

θ.

Counting the numbers of elements in the space RNJgives

NJ= (2a−1)2+3

J−1

j=0

(2ja+a−1)2= (22J−1)a2+6a(a−1)(2J−1)+3J(a−1)2+(2a−1)2=O(22J),(3.8)

which means the number of samples scales linearly with the number of reconstruction elements.

4 Linear Sampling Rate for 2D Boundary Wavelets

In applications, typically signals on a bounded domain are considered such as on the interval [0,1]. Aiming

to avoid artifacts at the boundaries x= 0,1, in [11] (see also [22]) boundary wavelets were introduced. The

considered wavelet system then consists of interior wavelets, which do not touch the boundary, and the just

mentioned boundary wavelets, leading to a system with an associated multiresolution analysis and prescribed

vanishing moments even at the boundary. Similar to the classical wavelet construction, this system can be lifted

to 2D by tensor products.

In this section, we will show that in fact also this (boundary) wavelet system allows a linear stable sampling

rate, though with a proof differing significantly from the one of the ‘classical wavelet case’ due to the different

structure at the highest scale. For this, we start with formally introducing 2D boundary wavelets – which we use

as an expression for the whole wavelet system – followed by our choice of reconstruction and sampling space.

4.1 Construction of Boundary Wavelets

We start with the 1D construction of wavelets on the interval as introduced in [11]. For this, let φbe a compactly

supported Daubechies scaling function with pvanishing moments. It is well known that φmust then have a

support of size 2p−1. By a shifting argument, we can assume that supp (φ)=[−p+ 1, p]. In order to properly

define interior wavelets and boundary wavelets, the scale need to be large enough – i.e., the support of the

wavelets need to be small enough – to be able to distinguish between wavelets whose support fully lie in [0,1]

and wavelets whose support intersect the boundary x= 0 and, likewise, x= 1.

Therefore, we now let j∈Nsuch that 2p≤2j. Then there exist 2j−2pinterior scaling functions, i.e.,

scaling functions which have support inside [0,1], defined by

φb

j,n =φj,n = 2j/2φ(2j·−n),for p≤n < 2j−p.

Depending on boundary scaling functions {φleft

n}n=0,...,p−1and {φright

n}n=0,...,p−1, which we will introduce

below, the pleft boundary scaling functions are defined by

φb

j,n = 2j/2φleft

n(2j·),for 0≤n < p,

and the pright boundary scaling functions are

φb

j,n = 2j/2φright

2j−1−n(2j(·−1)),for 2j−p≤n < 2j.

We remark that this leads to 2jscaling functions in total, which is the number of original scaling functions (φj,n)n

that intersect [0,1].

We next sketch the idea of the construction of boundary scaling functions {φleft

n}n=0,...,p−1as well as

{φright

n}n=0,...,p−1following [11], to the extent to which we require it in our proofs. One starts by defining

edge functions e

φkon the positive axis [0,∞)by

φk(x) =

2p−2

n=0 n

kφ(x+n−p+ 1), k = 0, . . . , p −1,

such that these edge functions are orthogonal to the interior scaling functions and such that they together generate

all polynomials up to degree p−1. After performing a Gram-Schmidt procedure one obtains the left boundary

functions φleft

k, k = 0, . . . , p−1. The right boundary functions are then – after some minor adjustments – obtained

by reflecting the left boundary functions. This construction from [11] allows one to obtain a multiresolution

analysis.

Theorem 4.1 ([11]).If 2j≥2p, then {φb

j,n}n=0,...,2j−1is an orthonormal basis for a space Vb

jthat is nested,

i.e.

j⊂Vb

j+1,

and complete, i.e.

[

j≥log22p

j=L2[0,1].

Next, we define an orthonormal basis for the wavelet space Wb

j, which is as usual defined as the orthogonal

complement of Vb

jin Vb

j+1. For this, let ψbe the corresponding wavelet function to φwith pvanishing moments

and supp ψ= [−p+ 1, p]. Similar to the construction of the scaling functions, we will obtain interior wavelets

and boundary wavelets, which then constitutes the set of wavelets in the interval. Again based on a careful choice

of boundary wavelets (ψleft

n)nand (ψright

k)k, for which we refer to [11], we define 2j−2pinterior wavelets by

ψb

j,n =ψj,n = 2j/2ψ(2j·−n),for p≤n < 2j−p,

pleft boundary wavelets

ψb

j,n = 2j/2ψleft

n(2j·),for 0≤n < p,

and pright boundary wavelets

ψb

j,n = 2j/2ψright

2j−1−n(2j(·−1)),for 2j−p≤n < 1j.

Summarizing, the following result hold for these wavelet functions.

Theorem 4.2 ([11]).Let 2J≥2p. Then the following properties hold:

i) {ψb

J,n}n=0,...,2J−1is an orthonormal basis for Wb

ii) L2[0,1] can be decomposed as

L2[0,1] = Vb

J⊕Wb

J+1 ⊕Wb

J+2 ⊕. . . =Vb

∞

j=J

iii) {φb

J,m}m=0,...,2J−1,{ψb

j,n}j≥J,n=0,...,2j−1is an orthonormal basis for L2[0,1].

iv) If φ, ψ ∈Cr[0,1], then {φb

j,m}m=0,...,2J−1,{ψb

j,n}j≥J,n=0,...,2j−1is an unconditional basis for Cs[0,1]

for all s < r.

As mentioned before, this system gives rise to a 2D system by tensor products, i.e., by the standard 2D

separable wavelet construction. In particular, this 2D system again constitutes an MRA, see [22].

4.2 Sampling and Reconstruction Space

4.2.1 Reconstruction Space

For defining the reconstruction space, we will now assume that the region of interest is [0,1]2instead of [0, a]2

with a∈N, as previously chosen. Our starting point is a 1D compactly supported Daubechies scaling function

with pvanishing moments (cf. Subsection 4.1). Recall that Daubechies wavelets have at least the following

frequency decay,

φ(ξ)|.1

1 + |ξ|,as |ξ|→∞.(4.1)

Let ψbe a corresponding wavelet with pvanishing moments, and let φband ψbbe the wavelets on the interval

as introduced in the previous subsection. Let J0be the smallest number such that 2J0≥2p. Then the associated

2D scaling functions are of the form

φb

J0,(n1,n2):= φb

J0,n1⊗φb

J0,n2,0≤n1, n2≤2J0−1

and, for 0≤n1, n2≤2j−1with j≥J0, the 2D wavelet functions are defined by

ψb,k

j,(n1,n2):= 









φb

j,n1⊗ψb

j,n2, j ≥J0, k = 1,

ψb

j,n1⊗φb

j,n2, j ≥J0, k = 2,

ψb

j,n1⊗ψb

j,n2, j ≥J0, k = 3.

Next, let

Ω1={φb

J0,(n1,n2): 0 ≤n1, n2≤2J0−1}

and

Ω2={ψb,k

j,(n1,n2):j=J0,1, . . . , J −1,0≤n1, n2≤2j−1, k = 1,2,3}.

Then, for a fixed scale J, and for NJ= 22J, the reconstruction space RNJis given by

RNJ= span{ϕi:ϕi∈Ω1∪Ω2, i = 1, . . . , NJ}.(4.2)

An ordering can be obtained analogously as in Subsection 3.1.2.

Numbers for Haar and θ−1= 0.45

J NJMfor ε= 1/2Mfor ε= 1/3

0 4 25 25

1 16 81 169

2 64 289 625

3 256 1089 2401

4 1024 4225 9409

Numbers for Daubechies and θ−1= 0.45

J NJMfor ε= 1/7Mfor ε= 1/8

0 100 225 289

1 292 841 1089

2 880 3249 4225

3 2908 12769 16641

4 10408 - -

Table 1: Number of reconstruction elements and samples for Haar and Daubechies-4. Note that these numbers

predict the jumps in Figure 1.

4.2.2 Sampling Space

The sampling space can be chosen similar to Subsection 3.1.3, i.e., for ε≤1, we set

s(ε)

l=εe2πiεhl,·i ·χ[0,1]2, l ∈Z2,

and for M= (M1, M2)∈N×N

S(ε)

M= span ns(ε)

l:l= (l1, l2)∈Z2,−Mi≤li≤Mi, i = 1,2o.(4.3)

4.3 Main Result

Our main result of this section states the linearity of the stable sampling rate for boundary wavelets, whose proof

is presented in Section 7.

Theorem 4.3. Let J∈N,ε≤1, and θ > 1. Further, let S(ε)

Mand RNJbe defined as 4.3 and (4.2), respectively.

Then

inf

ϕ∈RNJ

kϕk=1

kPS(ε)

ϕk ≥ 1

for M= (dSθ/εe2J,dSθ/εe2J)∈N×N, where Sθis a constant independent of J. In particular, the stable

sampling rate obeys Θ(NJ, θ) = O(NJ)as NJ→ ∞ for every fixed θ > 1

Remark 4.4. This result will be proved independently of Theorem 3.4, since boundary wavelets do constitute an

MRA but at highest scaling level, the space VJcontains more than one generating function φ. It uses the reflected

functions as well.

5 Numerical Experiments

In this section we numerically demonstrate the linearity of the stable sampling rate as stated in Theorem 3.4. We

will also demonstrate how this combines with generalized sampling in practice. In particular, given this linearity,

reconstructing from Fourier samples in smooth boundary wavelets will give an error decaying according to the

smoothness and the number of vanishing moments.

In this section we consider dyadic scaling matrices

A=2 0

0 2.(5.1)

Furthermore, our focus are separable wavelets, i.e. wavelets that are obtained by tensor products of one dimen-

sional scaling functions and one dimensional wavelet functions, respectively. Scaling matrices of the form (5.1)

preserve the separability.

0 200 400 600 800 1000 1200

500

1000

1500

2000

2500

3000

3500

4000

4500

N − Number of reconstruction elements

M − Number of samples

(N,M) − Graph of the SSR

Slope Mmax/Nmax

(a) Θ(N, θ)for Haar. Computed for J= 4 up to N= 1024

with ε= 1/2and θ−1= 0.45.

0 200 400 600 800 1000 1200

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

N − Number of reconstruction elements

M − Number of samples

(N,M) − Graph of the SSR

Slope Mmax/Nmax

(b) Θ(N, θ)for Haar. Computed for J= 4 up to N= 1024

with ε= 1/3and θ−1= 0.45.

0 500 1000 1500 2000 2500 3000

2000

4000

6000

8000

10000

12000

14000

N − Number of reconstruction elements

M − Number of samples

(N,M) − Graph of the SSR

Slope Mmax/Nmax

N= 2908 with ε= 1/7and θ−1= 0.45.

0 500 1000 1500 2000 2500 3000

2000

4000

6000

8000

10000

12000

14000

16000

18000

N − Number of reconstruction elements

M − Number of samples

(N,M) − Graph of the SSR

Slope Mmax/Nmax

(d) Θ(N, θ)for Daubechies-4. Computed for J= 3 up to

N= 2908 with ε= 1/8and θ−1= 0.45.

Figure 1: Stable sampling rate Θ(N, θ)for two dimensional dyadic Haar wavelets and two dimensional dyadic

Daubechies-4 wavelets.

5.1 Linearity examples with Haar and Daubechies-4 wavelets

We use the description of Section 3.1 and Example 3.5 in order to perform numerical experiments for some

known wavelets. This gives the reconstruction space R= span{ϕ:ϕ∈Ω1∪Ω2}where Ω1and Ω2are defined

in (3.5) and (3.6) respectively. We order the reconstruction space Rin the same manner as presented at the end

of Section 3. In (3.8) we counted the number of reconstruction elements up to level J−1, which leads to

NJ= (22J−1)a2+ 6a(a−1)(2J−1) + 3J(a−1)2+ (2a−1)2(5.2)

many elements, which is asymptotically of order 22J.

We test Haar wavelets and 2D Daubechies-4 wavelets. Figure 1 shows the linear behaviour of the stable

sampling rate for these two types of wavelet generators. By a small abuse of notation, we also write Mfor the

total number of samples. In our analysis in Section 3 we estimated the angle cos(ω(RN,S(ε)

M)) (recall that is

the sampling rate) with respect to some fixed θ > 1. In fact we computed Msuch that

inf

ϕ∈RN

kϕk=1 kPS(ε)

ϕk ≥ 1

Figure 2: Reconstruction of the function f1(x, y) = cos2(x) exp(−y). The second row shows an 8 times zoomed-

in version of the upper left corner. Left: original function. Middle: truncated Fourier series with 2562Fourier

coefficients. Right: generalized sampling with Daubechies-3 wavelets computed from the same Fourier coeffi-

cients.

holds. We proved that Mis up to a constant of the same size as NJ. Figure 1 shows the stable sampling rate (in

blue)

Θ(N, θ) = min M∈N,cos(ω(RN,SM)) ≥1

θ

and the linear function f(in red) given by

f(N) = Mmax

Nmax

where Nmax is the maximum value of Nused in the experiment and Mmax = Θ(Nmax, θ). We computed the

stable sampling rate up to level J= 4. Note the (significant) jumps of the stable sampling rate occur whenever

N∈Ncrosses the scaling level NJ, J = 0,...,4. In the Haar case these are

N0= 4, N1= 16, N2= 64, N3= 256, N4= 1024,

see (5.2). Note that a= 1 in the Haar case. In particular, the jumps are linear, suggesting a linear stable sampling

rate. Figure 1 (b), (c), and (d) are interpreted similarly. However, the theoretical results are asymptotic results.

Therefore, it should not be surprising that the stable sampling rate is below the linear line in some cases. It aligns

asymptotically.

5.2 Fourier samples and boundary wavelet reconstruction

In this example we will demonstrate the efficiency of generalized sampling given the established linearity of the

stable sampling rate. In particular, suppose that fis a function we want to recover from its Fourier information.

It is smooth, however, not periodic – a problem that occurs for example in electron microscopy and also in MRI.

This causes the classical Fourier reconstruction to converge slowly, yet a smooth boundary wavelet basis will

give much faster convergence (see [22, Ch. 9]). As discussed, the issue is that we are given Fourier samples,

not wavelet coefficients. However, this is not a problem in view of the linearity of the stable sampling rate. In

Figure 3: Reconstruction of the function f2(x, y) = (1 + x2)(2y−12).Upper left: truncated Fourier series with

5122Fourier coefficients. Middle left: 8 times zoomed-in version of the upper figure. Lower left: error committed

by the truncated Fourier series. Upper right: generalized sampling with Daubechies-3 wavelets computed from

the same 5122Fourier coefficients. Middle right: 8 times zoomed-in version of the upper figure. Lower right:

error committed by generalized sampling.

particular, if f∈Ws(0,1), where Ws(0,1) denotes the usual Sobolev space, and PRNdenotes the projection

onto the space RNof the first Nboundary wavelets (see (4.2)), then

kf−PRNfk=O(N−s), N → ∞,

provided that the wavelet has sufficiently many vanishing moments. Now, if GN,M (f)∈ RNis the generalized

sampling solution from Definition 2.9 given MFourier coefficients, and Mis chosen according to the stable

sampling rate then

kf−GN,M (f)k=O(N−s) = O(M−s), N → ∞.

Hence, we obtain the same convergence rate up to a constant, by simply postprocessing the given samples.

To illustrate this advantage we will consider the following two functions:

f1(x, y) = cos2(x) exp(−y), f2(x, y) = (1 + x2)(2y−12).

In Figure 2 we have shown the results for f1and compared the classical Fourier reconstruction with the general-

ized sampling reconstruction. Both examples use exactly the same samples, however, note the pleasant absence

of the Gibb’s ringing in the generalized sampling case. The same experiment is carried out for f2in Figure 3,

however, here we have displayed the reconstructions in 3Din order to visualize the error.

6 Proof of Theorem 3.4

The proof of Theorem 3.4 is somewhat technical, wherefore we divide the proof into several steps. First, in

Subsection 6.1, the overall structure of the proof is presented, and the respective details can then be found in

Subsection 6.1.

6.1 Structure of the Proof

Let ε∈(0,1/(T1+T2)] and θ > 1. Then we have to prove that

inf

ϕ∈RNJ

kϕk=1

kPS(ε)

ϕk ≥ 1

θ,(6.1)

for

M= (M1, M2) = λ(J)

1+λ(J)

εS(θ),λ(J)

2+λ(J)

εS(θ)!,

and some S(θ)independent of J. To this end, let ϕ∈ RNbe such that kϕk= 1. Since the sampling functions

form an orthonormal basis of S(ε)

M, we obtain

kPS(ε)

ϕk2=

l1=−M1

l2=−M2|hϕ, s(ε)

li|2, l = (l1, l2).(6.2)

By Lemma 6.1, which relies mainly on the underlying MRA structure, we can write ϕin terms of scaling func-

tions at highest scale, i.e.,

ϕ=

l1=L3

l2=L4

αl1,l2φJ,(l1,l2),

for some L1, L2, L3, L4∈Z. Lemma 6.3, which is proven by direct computations, then shows that

hϕ, s(ε)

li=ε|det A|−J/2Φ(ε(A−J)Tl)b

φ(ε(A−J)Tl),

where Φis a trigonometric polynomial of the form

Φ(z) =

m1=L3

m2=L4

αm1,m2e−2πihz,mi, z ∈R2, m = (m1, m2).

Using (6.2), we conclude that

kPS(ε)

ϕk2=

l1=−M1

l2=−M2

ε2|det A|−J|Φ(ε(A−J)Tl)|2|b

φ(ε(A−J)Tl)|2, l = (l1, l2).

By Theorem 6.10, there exist some ε0>0and S(θ)=S(θ)

1, S(θ)

2∈N×Nthat does not depend on Jsuch

that M1

l1=−M1

l2=−M2

ε2

0|det A−J||Φ(ε0(A−J)Tl)|2|b

φ(ε0(A−J)Tl)|2≥kΦk2

θ2

for

M=1

ε0

(AJ)Te

S(θ)=



λ(J)

1+λ(J)

ε0S(θ)

λ(J)

2+λ(J)

ε0S(θ)

2

.

Since

kΦk2=

m1=L3

m2=−L4|βm1,m2|2=kϕk2= 1,

we obtain

kPS(ε0)

ϕk2≥1

θ2.

Finally, Lemma 6.11 implies that a change of εonly changes the constant, showing that (6.1) is true for any

ε∈(0,1/(T1+T2)], thereby proving Theorem 3.4.

6.2 Auxiliary results

In this section we will prove the results mentioned in Subsection 6.1, which are required for completing the proof

of Theorem 3.4. The following result is an extension of a result from [6] to the two dimensional setting.

Lemma 6.1. Let J∈Nand ϕ∈V(a)

0⊕W(a)

0⊕. . . ⊕W(a)

J−1. Then there exist L1, L2, L3, L4∈Zdependent of

Jsuch that

ϕ=

l1=L3

l2=L4

αl1,l2φJ,(l1,l2).

Proof. Let ϕ∈V(a)

0⊕W(a)

0⊕. . . ⊕W(a)

J−1. Then ϕhas an expansion of the following form

ϕ=

m1=a−1

m1=−a+1

m2=a−1

m2=−a+1

αm1,m2φ0,(m1,m2)+

|det A|−1

p=1

J−1

j=0

a(λj

1+λj

2)−1

m1=−a+1

a(λj

3+λj

4)−1

m2=−a+1

βp

j,(m1,m2)ψp

j,(m1,m2).(6.3)

Since V(a)

0⊂V0and W(a)

j⊂Wjfor j∈Nand the sequence (Vj)j∈Zforms an MRA, it follows that

V(a)

0⊕W(a)

0⊕. . . ⊕W(a)

J−1⊂VJ.

Since an orthonormal basis for VJis given by the functions {φJ,m :m∈Z2}, for each |mi|< a, i = 1,2, we

have

φ0,(m1,m2)=X

l∈Z2hφ0,(m1,m2), φJ,liφJ,l

Moreover, since φ0,(m1,m2)and φJ,l are compactly supported, we obtain

hφ0,(m1,m2), φJ,li 6= 0,

−a+ (−a+ 1)(λ(J)

1+λ(J)

2)≤l1≤(2a−1)(λ(J)

1+λ(J)

and

−a+ (−a+ 1)(λ(J)

3+λ(J)

4)≤l2≤(2a−1)(λ(J)

3+λ(J)

4).

This follows by a straightforward computation from the support conditions of φ0,(m1,m2)and φJ,l together with

|mi|< a, i = 1,2. Similarly, we have hψp

j,(m1,m2), φJ,(l1,l2)i 6= 0 if

Case I: det Aj≥0

−a−λ(J)

(−a+ 1)λ(j)

4+λ(j)

2(a(λ(j)

3+λ(j)

4)−1)

det Aj+λ(J)

(−a+ 1)λ(j)

1+ (a(λ(j)

1+λ(j)

2)−1)λ(j)

det Aj

< l1< λ(J)

a(λ(j)

4−λ(j)

2)+(a(λ(j)

1+λ(j)

2)−1)λ(j)

4+ (a−1)λ(j)

det Aj+

+λ(J)

a(λ(j)

1−λ(j)

3)+(a(λ(j)

3+λ(j)

4)−1)λ(j)

1+ (a−1)λ(j)

det Aj

and

−a−λ(J)

(−a+ 1)λ(j)

4+λ(j)

2(a(λ(j)

3+λ(j)

4)−1)

det Aj+λ(J)

(−a+ 1)λ(j)

1+ (a(λ(j)

1+λ(j)

2)−1)λ(j)

det Aj

< l2< λ(J)

a(λ(j)

4−λ(j)

2)+(a(λ(j)

1+λ(j)

2)−1)λ(j)

4+ (a−1)λ(j)

det Aj+

+λ(J)

a(λ(j)

1−λ(j)

3)+(a(λ(j)

3+λ(j)

4)−1)λ(j)

1+ (a−1)λ(j)

det Aj.

Case II: det Aj<0

−a−λ(J)

(−a+ 1)λ(j)

4+λ(j)

2(a(λ(j)

3+λ(j)

4)−1)

det Aj+λ(J)

(−a+ 1)λ(j)

1+ (a(λ(j)

1+λ(j)

2)−1)λ(j)

det Aj

> l1> λ(J)

a(λ(j)

4−λ(j)

2)+(a(λ(j)

1+λ(j)

2)−1)λ(j)

4+ (a−1)λ(j)

det Aj+

+λ(J)

a(λ(j)

1−λ(j)

3)+(a(λ(j)

3+λ(j)

4)−1)λ(j)

1+ (a−1)λ(j)

det Aj

and

−a−λ(J)

(−a+ 1)λ(j)

4+λ(j)

2(a(λ(j)

3+λ(j)

4)−1)

det Aj+λ(J)

(−a+ 1)λ(j)

1+ (a(λ(j)

1+λ(j)

2)−1)λ(j)

det Aj

> l2> λ(J)

a(λ(j)

4−λ(j)

2)+(a(λ(j)

1+λ(j)

2)−1)λ(j)

4+ (a−1)λ(j)

det Aj+

+λ(J)

a(λ(j)

1−λ(j)

3)+(a(λ(j)

3+λ(j)

4)−1)λ(j)

1+ (a−1)λ(j)

det Aj.

Minimizing the lower bounds with respect to j∈ {0, . . . , J −1}and maximizing the upper bounds with respect

to j, respectively, yields the claim.

The following lemma is well known (see [22]).

Lemma 6.2. Let f∈L2(R2). Then {f(·−m) : m∈Z2}is an orthonormal system if and only if

m∈Z2|b

f(ξ+m)|2= 1 for almost every ξ∈R2.

Finally, one more technical lemma is needed.

Lemma 6.3. Let Abe a scaling matrix, J∈Z, and m= (m1, m2)∈Z2. Further, let ϕ∈span{φJ,m :m∈

Z2}be compactly supported in [−T1, T2]2, and let L1, L2, L3, L4∈Zbe such that

ϕ=

m1=L3

m2=L4

αm1,m2φJ,(m1,m2), αm∈C.

Then, for all l∈Z2,

hϕ, s(ε)

li=ε|det A|−J/2Φ(ε(A−J)Tl)b

φ(ε(A−J)Tl),

where s(ε)

lis defined in (3.3) and Φis the trigonometric polynomial given by

Φ(z) = X

L3≤m1≤L1,

L4≤m2≤L2,

αme−2πihz,mi, z ∈R2, m = (m1, m2).

Proof. Since ϕis supported in [−T1, T2]2, we obtain

hϕ, s(ε)

li=εZ

ϕ(x)e−2πiεhl,xi·χ[−T1,T2]2dx

=εbϕ(εl)

=ε L2

m1=L1

m2=L3

αm1,m2φJ,(m1,m2)!∧

(εl)

=ε

m1=L1

m2=L3

αm1,m2φJ,(m1,m2)∧(εl)

=ε|det A|−J/2Φ(ε(A−J)Tl)b

φ(ε(A−J)Tl).

This proves the claim.

6.3 Theorem 6.10

The proof of Theorem 6.10 requires a particular estimate (Proposition 6.9) for the norm of trigonometric polyno-

mials depending on their evaluations on a particular grid whose mesh norm and associated Voronoi regions come

also into play.

6.3.1 Mesh Norm

We start with the definition of a mesh norm for the situation we are faced with. A mesh norm can be interpreted

as the largest distance between neighboring nodes.

Definition 6.4. Let Λ⊂Z2be an integer grid of the form

Λ := (l1, l2)∈Z2:−Mi≤li≤Mi, i = 1,2, M1, M2∈N,

and let Abe a scaling matrix. Set

Ω := ΛA:= A([−M1, M1]×[−M2, M2]) ⊂R2,

and define a metric ρon Ωby

ρ: Ω ×Ω−→ R+,(x, y)7→ min

k∈A(Z2)kx−y+kk∞.

The mesh norm of {xl∈Ω : l∈Λ}is then defined as

δ:= max

x∈Ωmin

l∈Λρ(xl, x),

where xl:= A·l, l ∈Λdenote the nodes in Ω.

Before we can continue, we require some notions and results on Voronoi regions and trigonometric polyno-

mials. Our first result shows that if the distance between the nodes {xl}lconverges to zero, the mesh norm of the

entire grid Ωconverges to zero.

Lemma 6.5. Let ε > 0and Λ⊂Z2be defined by

Λ := (l1, l2)∈Z2:−Mi≤li≤Mi, i = 1,2,

where M1, M2∈N. Furthermore, suppose Ais a scaling matrix, and let Ω(ε):= ΛεA. If ε−→ 0, then

δ(ε)−→ 0, where δ(ε)denotes the mesh norm of {εAl ∈Ω(ε):l∈Λ}.

Proof. If ε→0, then

εA(Λ) = {εAl :l∈Λ} −→ {(0,0)}

with respect to the standard Euclidean distance. Furthermore, for xl:= εAl, we have

lim

ε→0δ(ε)= lim

ε→0max

x∈Ω(ε)min

l∈Λmin

k∈εA(Z2)kx−xl+kk∞≤lim

ε→0max

x∈Ω(ε)min

l∈Λkx−xlk∞.(6.4)

Since x, xl∈Ω(ε)for all l∈Λ, we obtain

lim

ε→02 max

x∈Ω(ε)min

l∈Λkx−xlk∞≤lim

ε→0max

x,y∈Ω(ε)kx−yk∞.

Inserting this estimate into (6.4) yields

lim

ε→0δ(ε)≤lim

ε→0max

x,y∈Ω(ε)kx−yk∞≤lim

ε→0diam Ω(ε)= lim

ε→0diam ΛεA = lim

ε→0diam εΛA= lim

ε→0εdiam ΛA

| {z }

<∞

= 0,

where diam(F)denotes the diameter of a set F⊂Rd, i.e.,

diam(F) = sup

x,y∈F

d2(x, y),

and d2denotes the Euclidean metric on Rd. Since the mesh norm is always non-negative, the lemma is proven.

6.3.2 Voronoi Regions

The next result studies the volume of the Voronoi regions associated to the previously considered grid Λwith

respect to the metric ρdefined in Definition 6.4. We start by formally defining the notion of Voronoi region in

our setting.

Definition 6.6. Let Ω⊂R2, and let (xl)l∈Λbe a sequence in R2. Then we refer to the sets (Vl)l∈Λdefined by

Vl:= {x∈Ω : ρ(x, xl)≤ρ(x, xk)for all k6=l}

as Voronoi regions with respect to Λ,Ω,ρ, and xl.

We can now state the previously announced result.

Lemma 6.7. Let M1, M2∈Nand

Λ := (l1, l2)∈Z2:−Mi≤li≤Mi, i = 1,2.

Moreover, let Ω = ΛId, where Id denotes the 2×2-identity matrix, and let (Vl)l∈Λbe the Voronoi regions with

respect to Λ,Ω,ρ, and l. Then, for all l∈Λ,

µ(Vl)≤1,

where µdenotes the 2D Lebesgue measure.

Proof. Notice that the Voronoi regions (Vl)l∈Λare in fact rectangles, since the grid is an integer grid with a

constant step-size. Hence, for each l∈Λ,

µ(Vl) = al1,l2·bl1,l2, al1,l2, bl1,l2∈R,(6.5)

where al1,l2denotes the width and bl1,l2the height of the rectangle Vl.

Towards a contradiction, assume that Vldoes contain two different nodes xkand xlwith k6=l. This implies

06=ρ(xk, xl)≤ρ(xl, xl) = 0,

which is a contradiction. Thus, we can conclude that

al1,l2≤ρ(xl1+1,l2, xl1,l2)and bl≤ρ(xl1,l2, xl1,l2+1),

which, by (6.5), proves the claim.

We next obtain a slight generalization of the previous result.

Lemma 6.8. Let Abe a (linear) bijective transformation acting on R2with matrix representation

A=λ1λ2

λ3λ4,

and let Λand (Vl)l∈Λbe defined as in Lemma 6.7. Then

µ(A(Vl)) ≤ |det A|.

Proof. The result follows from Lemma 6.7 by an integration by substitution.

6.3.3 Trigonometric Polynomials

The next theorem is an adapted version of a result presented in [24, Thm. 3.2] which again is a reformulation of

a result proven by Gröchenig in [17].

Proposition 6.9. Let J, L1, L2, L3, L4∈Zsuch that L1≥L3and L2≥L4, and let Φa trigonometric

polynomial of the form

Φ(z) =

m1=L3

m2=L4

αm1,m2e−2πihz,mi, z ∈R2, m = (m1, m2).

Further, let the grid Λbe defined as in Lemma 6.7, let Abe a scaling matrix, and let Ω(ε):= ΛεA−J

for ε > 0.

Set xl:= ε(A−J)Tl,l∈Λ. If the mesh norm δof {xl∈Ω(ε):l∈Λ}obeys

δ <

log 1

√µ(Ω(ε))+ 1

2πmax{|L1|,|L2|,|L3|,|L4|},

where µis the 2D Lebesgue measure, then there exists a positive constant C(δ, L1, L2, L3, L4)such that

C(δ, L1, L2, L3, L4)kΦk ≤ X

l∈Λ

ε2|det A−J||Φ(ε(A−J)Tl|2!1

where

C(δ, L1, L2, L3, L4) = 1−e2πδ max{|L1|,|L2|,|L3|,|L4|} −1qµ(Ω(ε))

and

kfk:= ZΩ(ε)|f(x)|2dx1/2

, f ∈L2(Ω(ε)).

Proof. We first observe that by the hypotheses, the constant C(δ, L1, L2, L3, L4)is indeed positive.

Second, let (Vl)l∈Λbe the Voronoi regions with respect to Λ,Ω(ε), and ρ. For l∈Λ, we define the weights

ωl:= µ(Vl). As in Lemma 6.8, integration by substitution yields

ωl=µ(Vl)≤ε2|det A−J|.

Hence it suffices to prove the existence of a constant C(δ, L1, L2, L3, L4)>0such that

C(δ, L1, L2, L3, L4)kΦk ≤ X

l∈Λ

ωl|Φ(ε(A−J)Tl|2!1

,(6.6)

For this, we first observe that

l∈Λ

ωl|Φ(ε(A−J)Tl|2!1

=

Z

Ω(ε)X

l∈Λ|Φ(ε(A−J)Tl|2



=X

l∈Λ

Φ(xl)χVl.(6.7)

By the (inverse) triangle inequality,

X

l∈Λ

Φ(xl)χVl≥ kΦk−Φ−X

l∈Λ

Φ(xl)χVl.(6.8)

Hence, we require an upper bound for kΦ−Pl∈ΛΦ(xl)χVlk. By Taylor expansion and Bernstein’s inequality,

Φ−X

l∈Λ

Φ(xl)χVl

Ω(ε)Φ(x)−X

l∈Λ

Φ(xl)χVl(x)

Ω(ε)X

l∈Λ

Φ(x)χVl(x)−Φ(xl)χVl(x)

≤X

l∈ΛZ

|Φ(x)−Φ(xl)|2dx

≤X

l∈ΛZ

X

α∈N2\{(0,0)}

α!kx−xlkα|DαΦ(x)|

≤X

l∈ΛZ

X

α∈N2\{(0,0)}

α!δ|α|(max{|L1|,|L2|,|L3|,|L4|}π)|α|kΦk

≤X

α∈N2\{(0,0)}

α!δ|α|(max{|L1|,|L2|,|L3|,|L4|}π)|α|kΦk

·X

l∈ΛZ

1dx

Since the Voronoi regions build a partition of Ω(ε),

l∈Λ

µ(Vl) = µ(Ω(ε)),

and we can continue by

Φ−X

l∈Λ

Φ(xl)χVl

≤e2πδ max{|L1|,|L2|,|L3|,|L4|} −1

2kΦk2µ(Ω(ε)).(6.9)

Using (6.9) and (6.8), we obtain

X

l∈Λ

Φ(xl)χVl≥ kΦk−Φ−X

l∈Λ

Φ(xl)χVl

≥ kΦk−kΦke2πδ max{|L1|,|L2|,|L3|,|L4|} −1qµ(Ω(ε))

=kΦk1−e2πδ max{|L1|,|L2|,|L3|,|L4|} −1qµ(Ω(ε)).

Combining this estimate with (6.7) proves (6.6).

Finally, we can state and prove Theorem 6.10, which is one main ingredient for the proof of Theorem 3.4 in

Subsection 6.1.

Theorem 6.10. Let L1, L2, L3, L4∈Zsuch that L1≥L3and L2≥L4and αm1,m2∈C, and let Φbe the

trigonometric polynomial

Φ(·,·) =

m1=L3

m2=L4

αm1,m2e−2πi(·)m1e−2πi(·)m2.

Further, let A=λ1λ2

λ3λ4be a scaling matrix, J∈Na maximal scale, and Λ⊂Z2the grid defined by

Λ := (l1, l2)∈Z2:−Mi≤li≤Mi, i = 1,2,

where M1, M2∈N. If there exists ε≤1/(T1+T2)independent of Jsuch that δfulfills

δ <

log 1

√µ(Ω(ε))+ 1

4πmax{|L1|,|L2|,|L3|,|L4|},

then there exists e

S(θ)= (S(θ)

1, S(θ)

2)∈N×N, independent of J, such that, for all θ > 1, we have

l∈Λ

ε2|det A−J||Φ(ε(A−J)Tl)|2|b

φ(ε(A−J)Tl)|2≥kΦk2

2θ2,

for M= (M1, M2) = 1

ε(AJ)Te

S(θ)and scaling function φ.

Proof. Since 0<ε<1, there exists an m∈N, such that

m+ 1 ≤ε < 1

Set ε=1

m+1 , and note that (3.4) still holds, since the logarithm is monotonically increasing.

Next, for some S∈Nto be determined later, let

M1

M2:= 1

ε(AJ)TS

S=1

ε λ(J)

1+λ(J)

λ(J)

2+λ(J)

4!S, where as before AJ= λ(J)

1λ(J)

λ(J)

3λ(J)

4!.

For l= (l1, l2)∈Z2, we then obtain

l∈Λ

ε2|det A−J||Φ(ε(A−J)Tl)|2|b

φ(ε(A−J)Tl)|2

M1−1

l1=−M1

M2−1

l2=M2

ε2|det A−J||Φ(ε(A−J)Tl)|2|b

φ(ε(A−J)Tl)|2

λ(J)

1+λ(J)

ε−1

l1=0

λ(J)

2+λ(J)

ε−1

l2=0

S−1

s=−S

S−1

t=−Sε2|det A−J|·Φε(A−J)Tl+1

ε(AJ)Ts

t

·b

φε(A−J)Tl+1

ε(AJ)Ts

t

2,(6.10)

Integer periodicity of the trigonometric polynomial implies that, for all s∈Z,

Φε(A−J)Tl+1

ε(AJ)Ts

t= Φ ε(A−J)Tl+s

t= Φ ε(A−J)Tl.

Therefore, by (6.10)

l∈Λ

ε2|det A−J||Φ(ε(A−J)Tl)|2|b

φ(ε(A−J)Tl)|2

λ(J)

1+λ(J)

ε−1

l1=0

λ(J)

2+λ(J)

ε−1

l2=0 ε2|det A−J||Φ(ε(A−J)Tl)|2

S−1

s=−S

S−1

t=−Sb

φ(A−J)Tl+s

t

2.(6.11)

Let θ > 1. Then, by Lemma 6.2, there exists S(θ)∈Nsuch that

S(θ)−1

s=−S(θ)

S−1

t=−Sb

φ(A−J)Tl+s

t

≥1

θ.(6.12)

We now choose S:= S(θ). Combining (6.11) and (6.12) yields

l∈Λ

ε2|det A−J||Φ(ε(A−J)Tl)|2|b

φ(ε(A−J)Tl)|2

≥

λ(J)

1+λ(J)

ε−1

l1=0

λ(J)

2+λ(J)

ε−1

l2=0

ε2|det A−J||Φ(ε(A−J)Tl)|2·1

θ.(6.13)

Next, we apply Proposition 6.9 to (6.13) to obtain

l∈Λ

ε2|det A−J||Φ(ε(A−J)Tl)|2|b

φ(ε(A−J)Tl)|2

≥kΦk1−e2πδ max{|L1|,|L2|,|L3|,|L4|} −1qµ(Ω(ε))2

·1

Since

1+ 1−e2πδ max{|L1|,|L2|,|L3|,|L4|}qµ(Ω(ε))

≥1 + 



1−e2πmax{|L1|,|L2|,|L3|,|L4|}

log





√µ(Ω(ε))

+1





4πmax{|L1|,|L2|,|L3|,|L4|} 



qµ(Ω(ε))

= 1 + 1−qµ(Ω(ε))+11/2!qµ(Ω(ε))

= 1 + qµ(Ω(ε))−µ(Ω(ε))3/2+µ(Ω(ε))1/2

≥1−µ(Ω(ε))3/4

≥1−ε3/4det A−J3/4

≥1

we can conclude

l∈Λ

ε2|det A−J||Φ(ε(A−J)Tl)|2|b

φ(ε(A−J)Tl)|2≥kΦk2

2θ.

Hence the theorem is proven for

M1=λ(J)

1+λ(J)

εS(θ)and M2=λ(J)

2+λ(J)

εS(θ).

6.3.4 Lemma 6.11

We mention that this result is proved in the same manner as the one dimensional result from [6].

Lemma 6.11. For γ > 1and ε1, ε2∈(0,1/(T1+T2)], let θ(γ), C(γ)>1be such that

θ(γ)2−16

π4(C(γ)−1)2−s1−1

θ(γ)>1

γ.(6.14)

If there exists M= (M1, M2)∈N×Nsuch that

inf

ϕ∈RN

kϕk=1 kPS(ε1)

ϕk ≥ 1

θ(γ),(6.15)

then

inf

ϕ∈RN

kϕk=1 kPS(ε2)

ϕk ≥ 1

γ,

whenever K= (K1, K2) = lC(γ)M1ε1

ε2m,lC(γ)M2ε1

ε2m.

Proof. First, notice that for any γ∈(0,1), there exist θ(γ)and C(γ)such that (6.14) is fulfilled. Now, let γ > 1

and ε2>0. Then, by (6.15),

inf

ϕ∈RN

kϕk=1 kPS(ε2)

ϕk ≥ inf

ϕ∈RN

kϕk=1 kPS(ε2)

PS(ε1)

ϕk−kPS(ε2)

P⊥

S(ε1)

ϕk

≥inf

ϕ∈RN

kϕk=1 kPS(ε2)

PS(ε1)

ϕk−s1−1

θ(γ)!.(6.16)

In order to estimate kPS(ε2)

PS(ε1)

ϕk, we decompose this term by

kPS(ε2)

PS(ε1)

ϕk2=kPS(ε1)

ϕk2−kP⊥

S(ε2)

PS(ε1)

ϕk2.(6.17)

Thus, we require a suitable upper bound for kP⊥

S(ε2)

PS(ε1)

ϕk2. For this, let

IM={l= (l1, l2)∈Z2:−Mi/2≤li≤Mi/2−1, i = 1,2},

IK={j= (j1, j2)∈Z2:−Ki/2≤ji≤Ki/2−1, i = 1,2},

and for the complementary sets in Z2we write

M={l= (l1, l2)∈Z2:l /∈IM},

K={j= (j1, j2)∈Z2:j /∈IK}.

Then, we have

kP⊥

S(ε2)

PS(ε1)

ϕk2=X

j∈Ic

hPS(ε1)

ϕ, s(ε2)

jis(ε2)

j

j∈Ic

KX

l∈IMhϕ, s(ε1)

lihs(ε1)

l, s(ε2)

ji

≤X

j∈Ic

K X

l∈IM|hϕ, s(ε1)

li|2X

l∈IM|hs(ε1)

l, s(ε2)

ji|2!

≤X

j∈Ic

l∈IM|hs(ε1)

l, s(ε2)

ji|2.(6.18)

Now, for ε= max{ε1, ε2}, we obtain

|hs(ε1)

l, s(ε2)

ji| =

ε1·ε2·Z

[−1

2ε,1

2ε]2

e2πiε1hl,xie2πiε2hj,xidx

≤

ε1ε2

π2(ε1l1−ε2j1)(ε1l2−ε2j2).

Therefore, by using (6.18),

kP⊥

S(ε2)

PS(ε1)

ϕk2≤X

j∈Ic

l∈IM

ε1ε2

π2(ε1l1−ε2j1)(ε1l2−ε2j2)

Assuming Ki=C(γ)Miε1

ε2, i = 1,2, we can continue by

kP⊥

S(ε2)

PS(ε1)

ϕk2≤ε1ε2

π22M1M2X

(j1,j2)/∈IK

|(ε1M1

2−ε2j1)(ε1M2

2−ε2j2)|2

≤ε1ε2

π22M1M2

ε2

2|(ε1M1−ε2K1)(ε1M2−ε2K2)|

≤ε1

π22M1M2

|(ε1M1−ε2K1)(ε1M2−ε2K2)|

≤ε1

π22M1M2

|(ε1M1−C(γ)M1ε1)(ε1M2−C(γ)M2ε1)|

=16

(π2(C(γ)−1))2.

Thus, by (6.17),

kPS(ε2)

PS(ε1)

ϕk2≥1

θ(γ)2−16

(π2(C(γ)−1))2

which, using (6.16), yields

inf

ϕ∈RN

kϕk=1 kPS(ε2)

ϕk ≥ s1

θ(γ)2−16

(π2(C(γ)−1))2−r1−1

θ.

The lemma is proved.

7 Proof of Theorem 4.3

The following lemma will be used in the upcoming proof Theorem 4.3. A one dimensional analogue can be found

in [6]. The proof extends straightforwardly and we omit it here.

Lemma 7.1. Let A1, A2, A3, and A4∈Z, A1≤A2, A3≤A4. Moreover, let L1, L2∈Nsuch that 2L1≥

A2−A1+ 1 and 2L2≥A4−A3+ 1. Then the trigonometric polynomial

Φ(z1, z2) =

k=A1

l=A3

αk,le2πikz1e2πilz2

satisfies

2L1−1

m=0

2L2−1

n=0

4L1L2Φm

2L1

2L2

k=A1

l=A3|αk,l|2.

Proof of Theorem 4.3. Let ϕ∈ RNsuch that kϕk= 1. Then ϕcan be expanded as

ϕ=

2J0−1

n1,n2=0

αJ0,(n1,n2)φJ0,(n1,n2)+

k=1

J−1

j=J0

2j−1

n1,n2=0

βk

j,(n1,n2)ψb,k

j,(n1,n2)(7.1)

We will now use the nestedness of the two dimensional MRA that is generated by the one dimensional wavelet

system {φb

J0,m}m=0,...,2J−1,{ψb

j,n}j≥J0,n=0,...,2j−1. In particular, we have

Vb,2

j⊂Vb,2

j+1, j ≥J0,

and

Vb,2

j+1 =Vb,2

j⊕Wb,2

j, j ≥J0,

with Vb,2

j=Vb

j⊗Vb

jand Wb,2

j= (Vb

j⊗Wb

j)⊕(Wb

j⊗Vb

j)⊕(Wb

j⊗Wb

j). Loosely speaking, due to

the MRA embedding properties we can expand functions from the reconstructions space into scaling functions

(φb

J,(n1,n2))n1,n2at highest scale. Since the left boundary functions can be constructed by translates of the initial

scaling function φand the right scaling function can be obtained by reflecting the left boundary functions. The

reflected function will be denoted by φ#. This gives us in (7.1)

ϕ=

2J−p−1

n1,n2=0

αn1,n2φ(2J·−n) +

n1,n2=2J−p

βn1,n2φ#(2J·−n),(7.2)

where only finitely many αn1,n2and βn1,n2are non-zero. Now, for any l= (l1, l2)∈Z2we obtain by basic

properties of the Fourier transform

hϕ, s(ε)

li=

2J−p−1

n1,n2=0

αn1,n2

2Je−2πiεhn,2−Jlib

φε

2Jl+

n1,n2=2J−p

βn1,n2

2Je−2πiεhn,2−Jlic

φ#ε

2Jl.(7.3)

For the sake of brevity, we shall write in the following

Φ1(z) =

2J−p−1

n1,n2=0

αn1,n2

2Je−2πiεhn,zi,

Φ2(z) =

n1,n2=2J−p

βn1,n2

2Je−2πiεhn,zi.(7.4)

By our assumptions on the scaling function

φ(ξ1, ξ2)|.1

(1 + |ξ1|)(1 + |ξ2|),(7.5)

and by the same argument

φ#(ξ1, ξ2)|.1

(1 + |ξ1|)(1 + |ξ2|).(7.6)

Using (7.4), (7.5), and (7.6) in (7.3) yields

l/∈IM

|hϕ, s(ε)

li|2≤X

l/∈IMε

2JΦ1ε

2Jlb

φε

2Jl

2+X

l/∈IMε

2JΦ2ε

2Jlc

φ#ε

2Jl

+ 2 

X

l/∈IMε

2JΦ1ε

2Jlb

φε

2Jl

2



1/2

X

l/∈IMε

2JΦ2ε

2Jlc

φ#ε

2Jl

2



1/2

.(7.7)

We assume 2J/ε ∈Nand the number of samples M= (M1, M2)∈N×Nis

Mi=S·2J

ε, i = 1,2

where Sis some positive constant. Now, (l1, l2)/∈IMif

Case I: |l1|> M1and |l2|< M2,

Case II: |l1|< M1and |l2|> M2, or

Case III: |l1|> M1and |l2|> M2.

It is sufficient to consider the sum for Case I. Case II can be obtained by symmetry and Case III yields a

smaller sum. For K= 2J/ε, we have

|l2|<M2X

|l1|>M1ε

2JΦ1ε

2Jlb

φε

2Jl

K−1

KΦ1j1

K,j2

K

|s2|<S X

|s1|>S b

φj1

K+s1,j2

K+s2

K−1

KΦ1j1

K,j2

K

|s2|<S X

|s1|>S

(1 + |j1/K +s1|)2

(1 + |j2/K +s2|)2

≤C1

K−1

KΦ1j1

K,j2

K

S.(7.8)

By Lemma 7.1 we obtain in (7.9)

|l2|>M2X

|l1|>M1ε

2JΦ1ε

2Jlb

φε

2Jl

2≤C1

2J−1

n1,n2=0 |αn1,n2|21

S.(7.9)

Since the functions form an orthonormal basis and kϕk= 1 we have

n1,n2∈N|αn1,n2|2+X

n1,n2∈N|βn1,n2|2= 1

and hence

n1,n2∈N|αn1,n2|2≤1

Similarly, one shows

|l2|>M2X

|l1|>M1ε

2JΦ2ε

2Jlc

φ#ε

2Jl

2≤C2X

n1,n2∈N|βn1,n2|21

S.(7.10)

Therefore, in (7.7) we have that

|l2|<M2X

|l1|>M1

|hϕ, s(ε)

li|2≤C1X

n1,n2∈N|αn1,n2|21

S+C2X

n1,n2∈N|βn1,n2|21

S+

+ 2 

C1X

n1,n2∈N|αn1,n2|21

S



1/2

C2X

n1,n2∈N|βn1,n2|21

S



1/2

≤√2C1+√2C2

√S2

Now, for θ > 1choosing Slarge enough, such that

√2C1+√2C2

√S2

≤θ2−1

3θ2

gives the claim.

8 Acknowledgements

BA acknowledges support from the NSF DMS grant 1318894. ACH acknowledges support from a Royal Society

University Research Fellowship as well as the UK Engineering and Physical Sciences Research Council (EPSRC)

grant EP/L003457/1. GK was supported in part by the Einstein Foundation Berlin, by Deutsche Forschungsge-

meinschaft (DFG) Grant KU 1446/14, by the DFG Collaborative Research Center TRR 109 “Discretization in

Geometry and Dynamics”, and by the DFG Research Center MATHEON “Mathematics for key technologies” in

Berlin. JM acknowledges support from the Berlin Mathematical School.

References

[1] B. Adcock, M. Gataric, and A. C. Hansen. On stable reconstructions from univariate nonuniform Fourier

measurements. arXiv:1310.7820, 2013.

[2] B. Adcock and A. C. Hansen. Generalized sampling and infinite-dimensional compressed sensing. Techni-

cal report, 2011.

[3] B. Adcock and A. C. Hansen. A generalized sampling theorem for stable reconstructions in arbitrary bases.

J. Fourier Anal. Appl., 18(4):685–716, 2012.

[4] B. Adcock and A. C. Hansen. Stable reconstructions in Hilbert spaces and the resolution of the Gibbs

phenomenon. Appl. Comput. Harmon. Anal., 32(3):357–388, 2012.

[5] B. Adcock, A. C. Hansen, E. Herrholz, and G. Teschke. Generalized sampling: extension to frames and

inverse and ill-posed problems. Inverse Problems, 29(1):015008, 27, 2013.

[6] B. Adcock, A. C. Hansen, and C. Poon. Beyond consistent reconstructions: optimality and sharp bounds

for generalized sampling, and application to the uniform resampling problem. SIAM J. Math. Anal.,

45(5):3132–3167, 2013.

[7] B. Adcock, A. C. Hansen, and C. Poon. On optimal wavelet reconstructions from fourier samples: Linearity

and universality of the stable sampling rate. Appl. Comput. Harmon. Anal., to appear.

[8] A. Aldroubi. Oblique projections in atomic spaces. Proc. Amer. Math. Soc., 124(7):2051–2060, 1996.

[9] S. Bishop, C. Heil, Y. Y. Koo, and J. K. Lim. Invariances of frame sequences under perturbations. Linear

Algebra Appl., 432(6):1501–1574, 2010.

[10] E. J. Candés and D. L. Donoho. New tight frames of curvelets and optimal representations of objects with

piecewise c? singularities. Comm. Pure Appl. Math, pages 219–266, 2002.

[11] A. Cohen, I. Daubechies, and P. Vial. Wavelet bases on the interval and fast algorithms. Appl. Comput.

Harmon. Anal, (1):54–81, 1993.

[12] G. Corach and A. Maestripieri. Products of orthogonal projections and polar decompositions. Linear

Algebra Appl., 434(6):1594–1609, 2011.

[13] I. Daubechies. Ten lectures on wavelets, volume 61 of CBMS-NSF Regional Conference Series in Applied

Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1992.

[14] F. Deutsch. Best approximation in inner product spaces. CMS Books in Mathematics/Ouvrages de Mathé-

matiques de la SMC, 7. Springer-Verlag, New York, 2001.

[15] Y. C. Eldar. Sampling with arbitrary sampling and reconstruction spaces and oblique dual frame vectors. J.

Fourier Anal. Appl., 9(1):77–96, 2003.

[16] Y. C. Eldar and T. Werther. General framework for consistent sampling in Hilbert spaces. Int. J. Wavelets

Multiresolut. Inf. Process., 3(4):497–509, 2005.

[17] K. Gröchenig. Reconstruction algorithms in irregular sampling. Math. Comp., 59(181–1924), 1992.

[18] T. Hrycak and K. Gröchenig. Pseudospectral Fourier reconstruction with the modified inverse polynomial

reconstruction method. J. Comput. Phys., 229(3):933–946, 2010.

[19] D. M. H. Jr. and J. B. Weaver. Two applications of wavelet transforms in magnetic resonance imaging. IEEE

Transactions on Information Theory, 38(2):840–860, 1992.

[20] G. Kutyniok and W.-Q. Lim. Compactly supported shearlets are optimally sparse. Journal of Approximation

Theory, 163(11):1564–1589, 2011.

[21] P. Maass. Families of orthogonal 2d wavelets. SIAM J. Math. Anal., pages 1454–1481, 1996.

[22] S. Mallat. A wavelet tour of signal processing. Elsevier/Academic Press, Amsterdam, third edition, 2009.

The sparse way, With contributions from Gabriel Peyré.

[23] L. Panych. Theoretical comparison of fourier and wavelet encoding in magnetic resonance imaging.

15(2):141–53, 02 1996.

[24] D. Potts and M. Tasche. Numerical stability of nonequispaced fast Fourier transforms. J. Comput. Appl.

Math., 222(2):655–674, 2008.

[25] G. Strang and T. Nguyen. Wavelets and filter banks. Wellesley-Cambridge Press, 1997.

[26] W.-S. Tang. Oblique projections, biorthogonal Riesz bases and multiwavelets in Hilbert spaces. Proc. Amer.

Math. Soc., 128(2):463–473, 2000.

[27] M. Unser. Sampling–50 years after Shannon. Proc. IEEE, 88(4):569–587, 2000.

[28] M. Unser and A. Aldroubi. A general sampling theory for nonideal acquisition devices. IEEE Transactions

on Signal Processing, 42(11):2915–2925, 1994.

[29] M. Unser and A. Aldroubi. A review of wavelets in biomedical applications. Proc. IEEE, 84(4):626–638,

1996.

[30] M. Unser, A. Aldroubi, and A. F. Laine. Guest editorial: wavelets in medical imaging. IEEE Trans. Med.

Imaging, 22(3):285–288, 2003.