
Received: 6 September 2022 Accepted: 18 November 2022

DOI: 10.1002/pamm.202200005

Measure concentration and the Schrödinger equation

Harry Yserentant1,∗

1Technische Universität Berlin, Institut für Mathematik, 10623 Berlin, Germany

This talk pursued the aim of representing the solutions of the electronic Schrödinger equation as traces of higher-dimensional functions. This allows one to decouple the electron-electron interaction potential, but comes at the price of a degenerate elliptic operator replacing the Laplace operator on the higher-dimensional space. The surprising observation is that this operator can, without much loss, again be substituted by the Laplace operator, and the more successfully the larger the system under consideration is. This is due to a nontrivial concentration of measure phenomenon that has much to do with the random projection theorem known from probability theory. The observation can, for example, serve as a building block for the construction of iterative methods that map sums of products of orbitals and geminals onto functions of the same type.

1 Introductory remarks

The electronic Schrödinger equation establishes a connection between chemistry and physics. It describes systems of finitely many electrons that interact with a given, fixed set of nuclei and among each other. The neglect of the motion of the nuclei can be heuristically justified by the fact that the nuclei are much heavier than the electrons and therefore move on a different time scale. A complete mathematical justification is, of course, a considerably more difficult task. For a system of $N$ electrons moving in the field of $K$ nuclei of charge $Z_\nu$ clamped at the positions $a_\nu$, the corresponding Hamilton operator reads

$$H = -\frac{1}{2}\sum_{i=1}^{N}\Delta_i \;-\; \sum_{i=1}^{N}\sum_{\nu=1}^{K}\frac{Z_\nu}{\|x_i-a_\nu\|} \;+\; \frac{1}{2}\sum_{\substack{i,j=1\\ i\neq j}}^{N}\frac{1}{\|x_i-x_j\|}. \tag{1.1}$$

It acts on functions that depend on the variables $x_1, x_2, \ldots, x_N$ associated with the positions of the electrons in the three-dimensional space. To get rid of the problems with the electron-electron interaction terms, it is an obvious idea to introduce the differences of the electron positions as additional variables. This idea was already applied successfully in the early days of quantum mechanics to calculate the ground state of the helium atom and paved the way for the general acceptance of quantum mechanics in its present form. To be precise, let

$$m = 3\times N, \qquad n = 3\times N + 3\times\frac{(N-1)N}{2}, \tag{1.2}$$

and let the vectors in $\mathbb{R}^m$ and $\mathbb{R}^n$, respectively, be partitioned into subvectors in the position space $\mathbb{R}^3$. Let the subvectors of the vectors in $\mathbb{R}^m$ and the first $N$ of the subvectors of the vectors in $\mathbb{R}^n$ be labeled by the indices $i = 1, \ldots, N$ and the remaining subvectors of the vectors in $\mathbb{R}^n$ by the index pairs $(i,j)$, with components $i, j = 1, \ldots, N$ and $i<j$. We then try to represent the solutions of the electronic Schrödinger equation in the form $u(x) = U(Tx)$, where the matrix $T$ maps the vectors $x$ in $\mathbb{R}^m$ into the vectors $y = Tx$ in $\mathbb{R}^n$ with the subvectors

$$y_i = x_i, \qquad y_{ij} = \frac{x_i - x_j}{\sqrt{2}} \tag{1.3}$$

and $U$ is a function from $\mathbb{R}^n$ to $\mathbb{R}$ that can ideally be well approximated, say, by a sum of products of orbitals, functions depending only on the position $y_i = x_i$ of a single electron, and geminals, functions of the variables $y_{ij}$ associated with the differences $x_i - x_j$ of the electron positions. The scaling factor $\sqrt{2}$ is not an absolute must, but will considerably simplify the later presentation. In this new set of variables, the potential decouples completely and splits into a sum of terms that each depend on only one of the components $y_i$ or $y_{ij}$. The question is what happens with the first term in the Hamilton operator, which is associated with the kinetic energy of the electrons. The talk tried to answer this question. For background information and references to the literature, see [1] and [2], the papers on which the talk was essentially based. A detailed exposition of the results presented in this short review, including complete proofs, can be found in [3].
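The change of variables (1.3) can be made concrete with a small numerical sketch. The following Python code assembles the matrix $T$ for a given number $N$ of electrons and applies it to a sample position vector. The function name `build_T`, the ordering of the index pairs, and the use of NumPy are assumptions made for illustration only; they are not taken from the talk.

```python
import numpy as np
from itertools import combinations

def build_T(N):
    """Assemble the (n x m) matrix T of (1.3) for N electrons (illustrative helper)."""
    m = 3 * N
    pairs = list(combinations(range(N), 2))    # index pairs (i, j) with i < j
    n = m + 3 * len(pairs)                     # n = 3N + 3N(N-1)/2, see (1.2)
    T = np.zeros((n, m))
    T[:m, :m] = np.eye(m)                      # first N subvectors: y_i = x_i
    I3 = np.eye(3)
    for k, (i, j) in enumerate(pairs):
        rows = slice(m + 3 * k, m + 3 * k + 3)
        T[rows, 3 * i:3 * i + 3] = I3 / np.sqrt(2)    # contribution +x_i / sqrt(2)
        T[rows, 3 * j:3 * j + 3] = -I3 / np.sqrt(2)   # contribution -x_j / sqrt(2)
    return T

N = 4
T = build_T(N)
x = np.random.default_rng(0).standard_normal(3 * N)  # sample electron positions
y = T @ x                                             # y = Tx with subvectors y_i, y_ij
```

For $N = 4$ the matrix has $n = 30$ rows and $m = 12$ columns; the first twelve components of `y` reproduce `x`, the remaining ones hold the scaled differences.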

2 Trace functions and the Laplace operator

To keep the presentation simple, we are in the following mainly concerned with functions $U:\mathbb{R}^n\to\mathbb{R}$, $n$ a potentially high if not very high dimension, that possess a then also unique representation

$$U(y) = \Big(\frac{1}{\sqrt{2\pi}}\Big)^{n}\int \widehat{U}(\omega)\,e^{\,i\omega\cdot y}\,d\omega \tag{2.1}$$

∗Corresponding author: e-mail [email protected]berlin.de

This is an open access article under the terms of the Creative Commons Attribution License, which permits use,

distribution and reproduction in any medium, provided the original work is properly cited.

PAMM ·Proc. Appl. Math. Mech. 2022;22:1 e202200005. www.gamm-proceedings.com 1 of 6

2 of 6 Section 26: Modelling, analysis and simulation of molecular systems

in terms of an integrable function $\widehat{U}$, their Fourier transform. Such functions are by the Riemann-Lebesgue theorem uniformly continuous and vanish at infinity. The space $W_0(\mathbb{R}^n)$ of these functions is under the norm

$$\|U\| = \Big(\frac{1}{\sqrt{2\pi}}\Big)^{n}\int |\widehat{U}(\omega)|\,d\omega \tag{2.2}$$

a Banach space and even a Banach algebra. Let $T$ be a still arbitrary $(n\times m)$-matrix of full rank $m<n$ and let

$$u:\mathbb{R}^m\to\mathbb{R}: x\to U(Tx) \tag{2.3}$$

be the trace of a function $U\in W_0(\mathbb{R}^n)$. As the functions in $W_0(\mathbb{R}^n)$ are uniformly continuous, the same holds for their traces. As there is a constant $c$ with $\|x\|\le c\,\|Tx\|$, the trace functions (2.3) vanish at infinity, too. The next lemma gives a criterion for the existence of partial derivatives of the trace functions, where we use the common multi-index notation.

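Lemma 2.1 Let $U$ be a function in $W_0(\mathbb{R}^n)$, let a multi-index $\alpha$ be given, and let, with $T^t$ the transpose of $T$, the functions

$$\omega\to (i\,T^t\omega)^{\beta}\,\widehat{U}(\omega), \quad \beta\le\alpha, \tag{2.4}$$

be integrable. The trace function (2.3) of this function $U$ then possesses the partial derivative

$$(D^{\alpha}u)(x) = \Big(\frac{1}{\sqrt{2\pi}}\Big)^{n}\int (i\,T^t\omega)^{\alpha}\,\widehat{U}(\omega)\,e^{\,i\omega\cdot Tx}\,d\omega, \tag{2.5}$$

which is, like $u$ itself, uniformly continuous and vanishes at infinity.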

For partial derivatives of order one, this can be shown with the help of the dominated convergence theorem applied to the

corresponding difference quotients. For partial derivatives of higher order, the proposition follows by induction.

Let $W_0^2(T)$ be the space of the functions $U\in W_0(\mathbb{R}^n)$ with finite (semi)-norm

$$|U|_T = \Big(\frac{1}{\sqrt{2\pi}}\Big)^{n}\int \|T^t\omega\|^2\,|\widehat{U}(\omega)|\,d\omega, \tag{2.6}$$

where $\|\cdot\|$ denotes again the Euclidean norm. The traces of these functions are by Lemma 2.1 twice continuously differentiable. Let $L: W_0^2(T)\to W_0(\mathbb{R}^n)$ be the degenerate elliptic differential operator given by

$$(LU)(y) = \Big(\frac{1}{\sqrt{2\pi}}\Big)^{n}\int \|T^t\omega\|^2\,\widehat{U}(\omega)\,e^{\,i\omega\cdot y}\,d\omega. \tag{2.7}$$

For the functions $U\in W_0^2(T)$ and their traces (2.3), by Lemma 2.1

$$-(\Delta u)(x) = (LU)(Tx) \tag{2.8}$$

holds. That is, $-\Delta u$ is itself the trace of a higher-dimensional function $LU$. The matrix $T^t$ maps $\mathbb{R}^n$ into the lower-dimensional, and in the case of the matrix given by (1.3) much lower-dimensional, $\mathbb{R}^m$. The dimension $n-m$ of its kernel is in such cases much higher than the dimension $m$ of its range. All the more surprising is the fact that the Euclidean norm of $T^t\omega$ is, for the matrix $T$ assigned to the Schrödinger equation, on most of $\mathbb{R}^n$ almost equal to the Euclidean norm of $\omega$ itself. The fraction of the vectors $\omega$ on the unit sphere of $\mathbb{R}^n$ for which the Euclidean norm of $T^t\omega$ differs from one by more than a given small amount tends exponentially to zero as the number of electrons goes to infinity. This is due to a nontrivial concentration of measure phenomenon, which has a lot to do with the random projection theorem (see Lemma 5.3.2 in [4], for example) from probability theory. It means that the operator (2.7) behaves more or less like the negative Laplace operator

−(∆U)(y) = 1

√2πnZ∥ω∥2b

U(ω) eiω·ydω(2.9)

applied to functions Udefined on the higher-dimensional space.
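The identity (2.8) can be checked numerically for a simple example. The sketch below assumes the Gauss function $U(y) = \exp(-\|y\|^2/2)$, whose Fourier transform in the sense of (2.1) is $\exp(-\|\omega\|^2/2)$, and a small, randomly chosen matrix $T$. It uses that the symbol $\|T^t\omega\|^2 = \omega\cdot TT^t\omega$ in (2.7) corresponds to the differential operator $L = -\sum_{k,l}(TT^t)_{kl}\,\partial^2/\partial y_k\partial y_l$, which for this particular $U$ can be evaluated in closed form. The choice of $U$, of $T$, and all function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 2
T = rng.standard_normal((n, m))      # arbitrary example matrix of full rank m < n

def U(y):
    # Gauss function on R^n (assumed example)
    return np.exp(-0.5 * (y @ y))

def LU(y):
    # L U = -sum_{k,l} (T T^t)_{kl} d^2 U / dy_k dy_l, evaluated for the Gaussian:
    # d^2 U / dy_k dy_l = (y_k y_l - delta_kl) U
    return (np.trace(T @ T.T) - np.sum((T.T @ y) ** 2)) * U(y)

def u(x):
    # trace function (2.3)
    return U(T @ x)

def neg_laplace_fd(f, x, h=1e-4):
    # negative Laplacian by central second differences in each coordinate
    total = 0.0
    for k in range(x.size):
        e = np.zeros_like(x)
        e[k] = h
        total += (f(x + e) - 2.0 * f(x) + f(x - e)) / h**2
    return -total

x = 0.5 * rng.standard_normal(m)
lhs = neg_laplace_fd(u, x)           # -(Delta u)(x), approximated
rhs = LU(T @ x)                      # (L U)(T x), closed form
```

Up to the discretization error of the difference quotients, `lhs` and `rhs` should agree.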

3 The underlying measure concentration effect

Let $n>m$ and let $A$ be a real $(m\times n)$-matrix of rank $m$. The kernel of such a matrix has the dimension $n-m$ and hence can be a large subspace of $\mathbb{R}^n$. Nevertheless, the set of all $x$ for which

$$\|Ax\| \ge \delta\,\|A\|\,\|x\| \tag{3.1}$$


holds fills, in the high-dimensional case, often almost the complete $\mathbb{R}^n$ once $\delta$ falls below a certain bound; the norms are, as in the previous section, the Euclidean norms on $\mathbb{R}^m$ and $\mathbb{R}^n$, respectively, and the matrix norm is the assigned spectral norm. To describe this effect in more detail, we introduce on $\mathbb{R}^n$ the probability measure

$$\lambda(M) = \frac{1}{n\nu_n}\int_{M\cap S^{n-1}} d\eta, \tag{3.2}$$

where $\nu_n$ is the volume of the $n$-dimensional unit ball and $n\nu_n$ is the area of the unit sphere $S^{n-1}$. We apply it to the sectors consisting of the $x\in\mathbb{R}^n$ for which $\|Ax\| < \delta\,\|A\|\,\|x\|$ holds to measure their opening angle. For orthogonal projections, matrices with one as the only singular value, the measure of these sectors possesses a closed integral representation.

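Theorem 3.1 Let the $(m\times n)$-matrix $P$, $m<n$, be an orthogonal projection. Then

$$\lambda\big(\{x \mid \|Px\| < \delta\,\|x\|\}\big) = F(\delta), \quad 0\le\delta<1, \tag{3.3}$$

holds, where the function $F(\delta) = F(m,n;\delta)$ is defined by the integral expression

$$F(\delta) = \frac{2\,\Gamma(n/2)}{\Gamma(m/2)\,\Gamma((n-m)/2)}\int_0^{\delta}(1-t^2)^{\alpha}\,t^{m-1}\,dt \tag{3.4}$$

and the exponent $\alpha\ge -1/2$ is given by

$$\alpha = \frac{n-m-2}{2}. \tag{3.5}$$

It takes nonnegative values for dimensions $n\ge m+2$.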

If the difference of the dimensions $n$ and $m$ is even, the function (3.4) is, for even $m$, an even and, for odd $m$, an odd polynomial of degree $n-2$ in $\delta$. A closed representation of these polynomials is given in [2]. For practical purposes, it is, however, more advantageous not to rely on such representations and to evaluate $F(\delta)$ numerically by means of a quadrature rule. As $F$ takes the value $F(1) = 1$, $F(\delta) = F(\delta)/F(1)$ holds; the constant prefactor in (3.4) cancels in this quotient of two integrals, so that there is no need to evaluate the Gamma function.
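The quadrature-based evaluation just described can be sketched as follows. The code approximates the integral in (3.4) over $[0,\delta]$ and over $[0,1]$ by a Gauss-Legendre rule and forms the quotient, so that no Gamma function is needed. The choice of the Gauss-Legendre rule, the number of nodes, and the function name `F` are assumptions for illustration; the sketch is meant for moderate dimensions with $n\ge m+2$, since for very large dimensions the sharply peaked integrand would call for a more careful treatment.

```python
import numpy as np

def F(m, n, delta, num_nodes=200):
    """Approximate F(m, n; delta) from (3.4) as a quotient of two integrals."""
    alpha = (n - m - 2) / 2.0                       # exponent (3.5)
    nodes, weights = np.polynomial.legendre.leggauss(num_nodes)

    def integral(b):
        # Gauss-Legendre rule transformed from [-1, 1] to [0, b]
        t = 0.5 * b * (nodes + 1.0)
        return 0.5 * b * np.sum(weights * (1.0 - t**2) ** alpha * t ** (m - 1))

    return integral(delta) / integral(1.0)          # prefactor of (3.4) cancels

value = F(2, 6, 0.5)
```

For $m = 2$ and $n = 6$ the exponent is $\alpha = 1$ and the quotient reduces to the even polynomial $2\delta^2 - \delta^4$ of degree $n-2 = 4$, which gives a simple consistency check.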

The function (3.4) always represents a lower bound for the area ratios under consideration.

Theorem 3.2 Let $A$ be a nonvanishing matrix of dimension $m\times n$, $m<n$. Then one has

$$\lambda\big(\{x \mid \|Ax\| < \delta\,\|A\|\,\|x\|\}\big) \ge F(m,n;\delta). \tag{3.6}$$

Upper bounds for the area ratios depend in general on the singular values of the matrix, in the extreme case on its condition number, the ratio of its maximum and its minimum singular value. This is fortunately not the case for the matrices $A = T^t$ assigned to the Schrödinger equation. The Euclidean norm of the vector $Tx\in\mathbb{R}^n$ is given by

$$\|Tx\|^2 = \sum_{i=1}^{N}\|x_i\|^2 + \frac{1}{4}\sum_{i=1}^{N}\sum_{j=1}^{N}\|x_i-x_j\|^2 \tag{3.7}$$

or, after rearrangement, with the rank three map $T_0x = x_1 + x_2 + \cdots + x_N$, by

$$\|Tx\|^2 = \frac{N+2}{2}\,\|x\|^2 - \frac{1}{2}\,\|T_0x\|^2. \tag{3.8}$$

The $(m\times m)$-matrix $T^tT$ has therefore only two distinct eigenvalues, the eigenvalue $1$ of multiplicity $3\times 1 = 3$ and the eigenvalue $(N+2)/2$ of multiplicity $3\times(N-1)$, that is, $m-3$. The singular values of the matrix $T^t$ are therefore

$$\sigma_i = 1 \ \text{ for } i\le 3, \qquad \sigma_i = \sqrt{\frac{N+2}{2}} \ \text{ for } i\ge 4. \tag{3.9}$$

The spectral norm of the matrix $T^t$ is $\sqrt{(N+2)/2}$.
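The spectral structure stated in (3.9) is easy to confirm numerically for a small particle number. The following sketch builds $T$ from (1.3) as a Kronecker product of an $N$-particle incidence-type matrix with the $3\times 3$ identity and computes the eigenvalues of $T^tT$. The compact Kronecker construction, the value $N = 5$, and the variable names are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

N = 5
pairs = list(combinations(range(N), 2))          # index pairs (i, j) with i < j
D = np.zeros((len(pairs), N))
for k, (i, j) in enumerate(pairs):
    D[k, i], D[k, j] = 1 / np.sqrt(2), -1 / np.sqrt(2)   # row for y_ij in (1.3)

# each scalar entry acts on a subvector in R^3, hence the Kronecker product with I_3
T = np.kron(np.vstack([np.eye(N), D]), np.eye(3))

eigenvalues = np.linalg.eigvalsh(T.T @ T)        # eigenvalues of T^t T, ascending
singular_values = np.sqrt(eigenvalues)           # singular values of T^t, see (3.9)
```

The three smallest eigenvalues should equal one and the remaining $3(N-1)$ eigenvalues should equal $(N+2)/2$, up to rounding.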


Fig. 1: The functions F(δ) = F(m, n; δ) from equation (3.4) for the dimensions m = 2k, k = 1, ..., 16, and n = 2m.

Theorem 3.3 Let $A$ be a matrix of dimension $m\times n$, $m<n$, with singular values $\sigma_k = \sigma_m$ for $k > m_0$. The corresponding area ratios then satisfy the estimate

$$\lambda\big(\{x \mid \|Ax\| < \delta\,\|A\|\,\|x\|\}\big) \le F(m-m_0,n;\delta). \tag{3.10}$$

The next theorem describes the limit behavior of the function (3.4) when the dimensions increase and tend to infinity. The subsequent estimates are expressed in terms of the function

$$\phi(\vartheta) = \vartheta\,\exp\Big(\frac{1-\vartheta^2}{2}\Big). \tag{3.11}$$

It increases strictly on the interval $0\le\vartheta\le 1$, attains its maximum value one at the point $\vartheta = 1$, and from there decreases strictly again to its limit value zero.

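Theorem 3.4 Let $\xi$ be the square root of the dimension ratio $m/n$. For $\delta<\xi$ and $\xi<\delta$, respectively, one then has

$$0\le F(m,n;\delta)\le \phi\Big(\frac{\delta}{\xi}\Big)^{m}, \qquad 0\le 1-F(m,n;\delta)\le \phi\Big(\frac{\delta}{\xi}\Big)^{m}. \tag{3.12}$$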
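To get a feeling for how quickly the bounds (3.12) decay, one can simply tabulate $\phi(\delta/\xi)^m$ for a fixed dimension ratio and growing $m$. The sketch below does this for the ratio $m/n = 1/2$ and a value $\delta$ slightly below $\xi$; the function names `phi` and `tail_bound` and the sample values are assumptions chosen for illustration.

```python
import numpy as np

def phi(theta):
    # the function (3.11)
    return theta * np.exp((1.0 - theta**2) / 2.0)

def tail_bound(m, n, delta):
    # right-hand side of (3.12): bounds F for delta < xi and 1 - F for delta > xi
    xi = np.sqrt(m / n)
    return phi(delta / xi) ** m

delta = 0.6                                     # below xi = sqrt(1/2), about 0.707
bounds = [tail_bound(m, 2 * m, delta) for m in (10, 100, 1000)]
```

Since $\phi(\vartheta) < 1$ for every $\vartheta \ne 1$, the entries of `bounds` decrease geometrically in $m$.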

If the dimension ratio $m/n = \delta_0^2$ is kept fixed or at least tends to $\delta_0^2$, the functions (3.4) thus tend to a step function with jump discontinuity at $\delta_0$. Figure 1 reflects this behavior. We therefore summarize our findings once more as follows and relate them to the prospective jump positions.

Theorem 3.5 Let $A$ be a nonvanishing matrix of dimension $m\times n$, $m<n$, and let $\xi$ be the square root of the dimension ratio $m/n$. For $\vartheta>1$, one then has

$$\lambda\big(\{x \mid \|Ax\| \ge \vartheta\,\xi\,\|A\|\,\|x\|\}\big) \le \phi(\vartheta)^{m}. \tag{3.13}$$

The theorem states in particular that the norm of $Ax$ exceeds the value $\xi\,\|A\|\,\|x\|$ by more than a moderate factor $\vartheta>1$ only on a very small, de facto negligible sector, an observation that is of great importance for the analysis of iterative methods. Under the much more restrictive assumptions from Theorem 3.3, Theorem 3.5 possesses a counterpart for values $\vartheta<1$.

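Theorem 3.6 Let $n>m$ and let $A$ be a nonvanishing $(m\times n)$-matrix with singular values $\sigma_k = \sigma_m$ for $k>m_0$. If $m' = m-m_0$ and $\xi'$ is the square root of $m'/n$, then

$$\lambda\big(\{x \mid \|Ax\| < \vartheta\,\xi'\,\|A\|\,\|x\|\}\big) \le \phi(\vartheta)^{m'} \tag{3.14}$$

holds for all $\vartheta$ in the interval $0<\vartheta<1$.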


Fig. 2: Upper bounds for the probability that the condition (4.1) is violated as a function of ε for N = 8, ..., 64 electrons.

4 Back to the Laplace operator

The last two theorems apply, because of (3.9), to the matrices $A = T^t$ assigned to the Schrödinger equation. They form the basis of our argumentation. Let $0<\varepsilon<1$. For a randomly chosen vector $\omega$ in the frequency space, the probability that

$$(1-\varepsilon)\,\xi'\,\|T^t\|\,\|\omega\| \le \|T^t\omega\| < (1+\varepsilon)\,\xi\,\|T^t\|\,\|\omega\| \tag{4.1}$$

holds is by these two theorems, because of $\phi(1\pm\varepsilon) < \exp(-c\,\varepsilon^2)$ with $c = -\ln(\phi(2))$, at least

$$1 - 2\exp(-c\,\varepsilon^2 m'), \tag{4.2}$$

where the dimensions (1.2) and the exponent $m' = 3\times(N-1)$ depend on the number $N$ of particles, the quantities $\xi$ and $\xi'$ are the square roots of the dimension ratio $m/n$ and of $m'/n$, and the constants

$$\xi'\,\|T^t\| = \sqrt{1-\frac{2}{N(N+1)}}, \qquad \xi\,\|T^t\| = \sqrt{1+\frac{1}{N+1}} \tag{4.3}$$

enclose the value one and tend to one as $N$ goes to infinity. The fraction of the vectors $\omega$ on the unit sphere of $\mathbb{R}^n$ for which the Euclidean norm of $T^t\omega$ differs from one by more than a given small amount thus tends, as claimed, exponentially to zero as the number of electrons goes to infinity. A quantitatively significantly better lower bound than (4.2) for the probability that (4.1) holds can be directly derived from Theorem 3.2 and Theorem 3.3. In terms of the function (3.4), it reads

$$F(m,n;(1+\varepsilon)\xi) - F(m',n;(1-\varepsilon)\xi') \tag{4.4}$$

and deviates for increasing particle number less and less from the probability

$$F(m,n;(1+\varepsilon)\xi) - F(m,n;(1-\varepsilon)\xi) \tag{4.5}$$

that an orthogonal projection from $\mathbb{R}^n$ to $\mathbb{R}^m$ maps a randomly chosen unit vector to a vector of length between $(1-\varepsilon)\xi$ and $(1+\varepsilon)\xi$. For some small to medium size systems, Fig. 2 shows the resulting upper bounds for the probability that the condition (4.1) is violated. Since the products of the matrix $T^t$ with vectors $e\in\mathbb{R}^n$ in the three-dimensional subspaces assigned to the particle positions and their differences have the norm $\|T^te\| = \|e\|$, they satisfy the condition

$$\xi'\,\|T^t\|\,\|e\| < \|T^te\| < \xi\,\|T^t\|\,\|e\| \tag{4.6}$$

and thus, independent of $\varepsilon$, the condition (4.1). This establishes a link to hyperbolic cross spaces and thereby indirectly also to tensor product approximations, and not least to the mixed regularity of electronic wave functions [5]. The inequality tells us that the Fourier transforms of functions in the corresponding hyperbolic cross spaces are inherently concentrated in the subregions of the frequency space on which the norm of $T^t\omega$ does not differ much from that of $\omega$.
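The concentration effect behind (4.1) and (4.2) can also be observed in a small Monte Carlo experiment. The sketch below relies on the singular values (3.9) and on the rotation invariance of the uniform measure on the unit sphere: $\|T^t\omega\|$ has the same distribution as the norm of the first $m$ components of a uniformly distributed unit vector, weighted by the singular values. The sampling approach, the sample size, the function name `check_concentration`, and the parameter values are assumptions for illustration and do not reproduce the computations behind Fig. 2.

```python
import numpy as np

def check_concentration(N, eps, samples=20000, seed=0):
    """Empirical probability of (4.1) and the lower bound (4.2) for N electrons."""
    m = 3 * N
    n = m + 3 * N * (N - 1) // 2              # dimensions (1.2)
    m1 = 3 * (N - 1)                          # m' in the text
    sigma = np.ones(m)
    sigma[3:] = np.sqrt((N + 2) / 2.0)        # singular values (3.9) of T^t
    norm_Tt = sigma.max()                     # spectral norm of T^t
    lower = (1 - eps) * np.sqrt(m1 / n) * norm_Tt
    upper = (1 + eps) * np.sqrt(m / n) * norm_Tt

    rng = np.random.default_rng(seed)
    eta = rng.standard_normal((samples, n))
    eta /= np.linalg.norm(eta, axis=1, keepdims=True)   # uniform on the unit sphere
    r = np.linalg.norm(eta[:, :m] * sigma, axis=1)      # distributed like ||T^t omega||

    fraction = np.mean((lower <= r) & (r < upper))
    c = -np.log(2.0 * np.exp(-1.5))                     # c = -ln(phi(2))
    bound = 1.0 - 2.0 * np.exp(-c * eps**2 * m1)
    return fraction, bound

fraction, bound = check_concentration(N=8, eps=0.25)
```

The empirical `fraction` should lie well above the value `bound` from (4.2), in line with the remark that (4.2) is quantitatively rather pessimistic.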

What does all this mean? For high electron numbers, at the latest when statistical physics comes into play, the norm of $T^t\omega$ is almost equal to that of $\omega$ for all $\omega$ outside of a tiny, negligible sector. This makes it possible to replace the operator (2.7) in such cases by the negative Laplace operator (2.9) and the original Hamilton operator in the corresponding sense by a decoupled Hamilton operator acting upon the higher-dimensional functions. But also for moderate particle numbers, the negative Laplace operator (2.9) remains a good approximation to the operator (2.7), surely good enough to serve as a building block for the construction of rapidly convergent iterative methods to determine the lowermost eigenvalues of the Hamilton operator (1.1).


Acknowledgements Open access funding enabled and organized by Projekt DEAL.

References

[1] H. Yserentant, Numer. Math. 146, 219–238 (2020).

[2] H. Yserentant, SIAM J. Matrix Anal. Appl. 43, 464–478 (2022).

[3] H. Yserentant, The Laplace operator, measure concentration, Gauss functions, and quantum mechanics, https://arxiv.org/abs/2208.03957.

[4] R. Vershynin, High-Dimensional Probability (Cambridge University Press, Cambridge, UK, 2018).

[5] H. Yserentant, ESAIM: M2AN 45, 803–824 (2011).