Adjoint-based optimization of sound reinforcement including non-uniform flow [original]

Adjoint-based optimization of sound reinforcement including non-uniform flow

Lewin Stein, Florian Straube, Jörn Sesterhenn, et al.

Citation: The Journal of the Acoustical Society of America 146, 1774 (2019); doi: 10.1121/1.5126516

View online: https://doi.org/10.1121/1.5126516

View Table of Contents: https://asa.scitation.org/toc/jas/146/3

Published by the Acoustical Society of America

ARTICLES YOU MAY BE INTERESTED IN

Compressive synthetic aperture sonar imaging with distributed optimization

The Journal of the Acoustical Society of America 146, 1839 (2019); https://doi.org/10.1121/1.5126862

A deep learning method for grid-free localization and quantification of sound sources

The Journal of the Acoustical Society of America 146, EL225 (2019); https://doi.org/10.1121/1.5126020

Perceptual audio processing stethoscope

The Journal of the Acoustical Society of America 146, 1769 (2019); https://doi.org/10.1121/1.5126226

On the limitations of sound localization with hearing devices

The Journal of the Acoustical Society of America 146, 1732 (2019); https://doi.org/10.1121/1.5126521

Machine learning in acoustics: Theory and applications

The Journal of the Acoustical Society of America 146, 3590 (2019); https://doi.org/10.1121/1.5133944

Interior near-field acoustic holography based on finite elements

The Journal of the Acoustical Society of America 146, 1758 (2019); https://doi.org/10.1121/1.5126513

Adjoint-based optimization of sound reinforcement including

non-uniform flow

Lewin Stein,

1,a)

Florian Straube,

J€

orn Sesterhenn,

Stefan Weinzierl,

and Mathias Lemke

Computational Fluid Dynamics Group, TU Berlin, M€uller-Breslau-Str. 15, 10623 Berlin, Germany

Audio Communication Group, TU Berlin, Einsteinufer 17c, 10587 Berlin, Germany

(Received 14 May 2019; revised 10 August 2019; accepted 23 August 2019; published online 26

September 2019)

The determination of optimal geometric arrangements and electronic drives of loudspeaker arrays in

sound reinforcement applications is an ill-posed inverse problem. This paper introduces an innovative

method to determine complex driving functions, also considering complex environmental conditions.

As an alternative to common frequency domain methods, the authors present an adjoint-based

approach in the time domain: Acoustic sources are optimized in order to generate a given target sound

field. Instead of the Helmholtz equation, the full non-linear Euler equations are considered. This

enables an easier treatment of non-uniform flow and boundary conditions. As proof of concept, a

circular and a linear monopole array are examined. For the latter, the environmental conditions include

wind and thermal stratification. For all examples, the method is able to provide appropriate driving

functions. V

C2019 Author(s). All article content, except where otherwise noted, is licensed under a

Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

https://doi.org/10.1121/1.5126516

[KGS] Pages: 1774–1785

I. INTRODUCTION

Sound reinforcement aims at synthesizing specified

sound fields for the entire audio bandwidth. The generation

of the target sound fields can be controlled by the placement

and the electronic drive of different loudspeakers. Gain and

delay are the two parameters—sometimes called excitation

coefficients or feeding coefficients—that have to be ascer-

tained for electronic optimization of the loudspeaker array

radiation. Both parameters are related to the amplitude and

phase of a complex driving function. They are typically

computed separately for each frequency (Betlehem and

Withers, 2012;Coleman et al., 2014;Feistel et al., 2013;

Straube et al., 2018;Terrell and Sandler, 2010;Thompson,

2009;van Beuningen and Start, 2000) using frequency

domain methods.

This article presents a new method to find loudspeaker

positions and their driving functions for application in sound

reinforcement. The focus of this work is the determination of

optimized driving functions—also in case of non-uniform flow.

For this purpose, the authors present an adjoint-based

method in the time domain. It is based on the full non-linear

Euler equations and the corresponding adjoint, which are

solved by means of computational aeroacoustic (CAA) tech-

niques. Since the Euler equations allow us to consider wind

and thermal stratification, applications of the current work

may include stadiums or open-air events (Fig. 1). Here, they

are considered in a model form.

In fluid mechanics, the use of the adjoint has proven to

be an effective approach for determining various model

parameters (Giles and Pierce, 2000;Jameson, 1995).

Utilizing corresponding methods for acoustic problems is

convenient as the adjoint compressible Euler and Navier-

Stokes equations (Lemke et al., 2014;Lemke and

Sesterhenn, 2016) also capture acoustics, both for a station-

ary environment and complex base flow.

The paper is organized as follows: In Sec. II the adjoint-

based method is presented and tailored for the application in

sound reinforcement. Section III introduces the concept of val-

idation and methods. Section IV presents exemplary results to

verify the applicability of the adjoint-based method for circular

and linear source arrays—for the latter with a velocity and a

temperature profile. Section Vgives a summary.

II. ADJOINT APPROACH

In contrast to frequency domain approaches, which are

common in sound field generation as well as sound rein-

forcement and which are based on an integral representation

of the homogeneous wave equation for discrete source distri-

butions (Feistel et al., 2009;Meyer, 1984;Meyer and

Schwenke, 2003;van Beuningen and Start, 2000), the

adjoint-based method makes use of a more general represen-

tation of the wave propagation in the time domain.

A. Adjoint equations

The adjoint equations can be defined in a continuous or

discrete manner. For the sake of simplicity, they are intro-

duced in discrete version (Giles and Pierce, 2000). A matrix-

vector notation is used. The vector space is the full solution

in space and time. This section is based on Lemke (2015).

Adjoint equations arise by a so-called objective function

J, which is defined by the product between a geometric

weight gand the system state q:

J¼gTq;g;q2Rn:(1)

Electronic mail: [email protected]

1774 J. Acoust. Soc. Am. 146 (3), September 2019 V

CAuthor(s) 2019.0001-4966/2019/146(3)/1774/12

The system state qcorresponds to the solution of the govern-

ing system

Aq ¼s;A2Rnn;s2Rn;(2)

with Aas governing operator and sas source terms on the

right-hand side. In terms of optimization of Jby means of s,

the equation has to be solved for every different s. In order

to reduce the computational effort, the adjoint equation can

be used,

ATq¼g;(3)

with the adjoint variable q*. By

J¼gTq¼ATq



Tq¼qTAq ¼qTs;(4)

an expression is found, which enables the computation of J

without the need to solve the system for every discrete s.

After solving the adjoint equation, the objective can be

determined by a simple and computationally cheap scalar

product. Thus, the adjoint approach enables the efficient

computation of gradients for Jwith respect to s.

B. Objective and adjoint-based gradient

According to the intended application—the sound field

generation in the time domain—the objective function Jis

defined in space and time with dX¼dxidt:

J¼1

2ððqqtarget

ðÞ

2dX:(5)

The variable q¼½.;uj;pcomprises all quantities which are

necessary for a state description of the governing system,

defined by the Euler equations. The variable q

target

denotes

the desired system state, .the density, u

the velocity in the

direction x

and pthe pressure, which is used solely for eval-

uation of the objective

J¼1

2ððpptarget

ðÞ

2rdX:(6)

The additional weight rðxi;tÞdefines where and when the

objective is evaluated. In practice, the objective function is

supplemented by additional constraints, e.g., power con-

sumption, and a regularization term (Bewley, 2001), which

are omitted here for the sake of clarity. Optimal sound rein-

forcement is realized if Jreaches a minimum.

The minimum is to be achieved under the constraint

that the Euler equations Eare satisfied. Similar to Eq. (2),

the following system is introduced:

EðqÞq¼s;(7)

abbreviating the Euler equations

.uj

c1

Aþ@xi

.ui

.uiujþpdij

uipc

c1

Aui@xi

A¼

s.uj

(8)

with cas heat capacity ratio. The summation convention

applies. See Lemke et al. (2014) for details on the

formulation.

The terms s¼½s.;s.uj;spon the right side of the equa-

tions characterize sources for mass, momentum, and energy.

They enable control of the system state, respectively, the solu-

tion of the equations. The general goal is to obtain a solution

of the Euler equations, which matches q

target

, respectively,

target

in an optimal sense, by adapting s.

FIG. 1. (Color online) “Mayan Warrior” sound system at the “Burning Man” festival in Nevada’s Black Rock desert (with friendly permission of the Mayan

Warrior Collective, http://mayanwarrior.mx). Open-air conditions including wind and thermal stratification, such as a hot ground and a cool breeze above con-

sidered in Sec. III D, are examples, where sound optimization with non-uniform flow matters.

J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1775

The sources scan be interpreted as sound sources or

loudspeakers. Monopole sources can be described solely by

sources s

in the pressure equation. Thus, an optimization of

scorresponds to an optimization of the loudspeakers’ output

signals.

In order to use the adjoint approach for optimizing s,it

is necessary to linearize the objective function Eq. (6) and

the governing system Eq. (8). This results in

dJ¼ððpptarget

ðÞ

|ﬄﬄﬄﬄﬄﬄﬄﬄﬄ{zﬄﬄﬄﬄﬄﬄﬄﬄﬄ}

¼g

dpdX(9)

and

Elinðq0Þdq¼ds;(10)

with the now defined weight g¼ðpptargetÞr. Combining

the linearized system and the objective in a Lagrangian man-

ner by introducing the (adjoint) multiplier q*, leads to

dJ¼gTdqqTElinðq0ÞdqdsðÞ

|ﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄ{zﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄ}

¼0

¼qTdsþdqTgET

linðq0Þq



:(11)

For the sake of simplicity, the integrals are not shown. The

adjoint equations result in

gET

linðq0Þq¼0;(12)

with q¼½.;u

j;p, similar to Eq. (3). See Lemke (2015,

p. 19) for a detailed derivation of the adjoint Euler equations,

which are given by

@tq¼~

AðBiÞT@xiq@xiðCiÞTqþ~

Ci@xicg

(13)

with ~

A¼ðATÞ1and ~

Cias resorting

q

adCi

ab@xicb¼q

adqj

@Ci

@qj

@xicb;(14)

abbreviated as dqj

jb@xicb. The matrices A,B

, and C

result

from the linearization of the Euler Eq. (8) and are given in

the Appendix.

The change of the objective function reads

dJ¼qTds;(15)

similar to Eq. (4). Thus, the adjoint solution can be inter-

preted as a gradient of Jwith respect to the source terms s,

rsJ¼q:(16)

Initial and boundary conditions of the adjoint Euler equa-

tions as well as the derivation of the adjoint compressible

Navier-Stokes equations are discussed in Lemke (2015).

C. Iterative determination of the driving function

The adjoint-based gradient is used in an iterative man-

ner. First, the Euler equations are solved forward in time

with s

¼0. Subsequently, the adjoint equations are calcu-

lated backward in time deploying the direct solution and the

weight g. Based on the adjoint solution, the gradient rsJis

determined and used to update the source distribution s

snþ1¼snþarsJ;(17)

with adenoting an appropriate step size and nthe iteration

number. The gradient is calculated for the whole computing

region and the entire simulation time, but only evaluated at

desired loudspeaker positions. While convergence is not

accomplished in the objective function, the process is repeated

with the current s

.Pleasenotethatonlys

is optimized in this

work. Thus, only monopole sources are considered.

The resulting optimal sis deconvolved with the consid-

ered primitive input signal and the loudspeaker properties

resulting in an optimal driving for every considered loud-

speaker. See Fig. 3for an overview of the process.

The proposed technique optimizes towards local

extrema. Detecting global optima is not ensured. The com-

putational costs of the adjoint-based approach are indepen-

dent of the number of loudspeakers and their arrangement.

They only depend on the size of the computational domain,

which includes the entire space from the loudspeakers to the

most distant listening position, and the considered frequency

range, as higher frequencies demand a higher spatial and

temporal resolution. Using high-accuracy discretization

schemes, e.g., compact schemes, a resolution of four to 12

points per wavelength will usually be sufficient, depending

on the required accuracy.

Figure 2shows the convergence behavior of the objec-

tive functions of all optimization runs conducted in the

following. Throughout this work, all CAA solutions are the

result of 15 iteration steps. Depending on the line array setup

and the type of background flow the objective is reduced

between one and two orders of magnitude.

Utilizing a line search strategy with three forward and one

adjoint backward calculation per iteration step (quadratic inter-

polation), the total CPU time after 15 iterations adds up to

roughly 200 CPU-hours (details of the employed numerical

resolution follow in Sec. III B). On a local machine with 16

cores (AMD EPYC 7351, 32 GB RAM each), these hours cor-

respond to a total runtime of approximately half a day.

FIG. 2. (Color online) The convergence of the objectives of all three con-

ducted optimization runs. Section II C explains the details about the underly-

ing adjoint optimization strategy.

1776 J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al.

III. VALIDATION SETUP

A. Introduction of the validation concept

As validation, the adjoint-based method (Sec. II)is

applied to different test cases (Secs. IV A,IV B,IV C). To

obtain a reference solution for all test cases, i.e., a sound field

generated by prescribed speakers with fixed positions for

given input signals, we use a complex directivity point source

(CDPS) model. In the frequency domain, the CDPS model

(Feistel et al., 2009;Meyer, 1984;Meyer and Schwenke,

2003;van Beuningen and Start, 2000) calculates the analytic

sound field in free space without any background flow, by

superimposing the impact of all speakers (for its setup, see

Sec. III C).

The basic validation procedure is as follows: First, we

specify a test case consisting of circularly or linearly aligned

speakers and an input signal. Throughout this paper, all speak-

ers are assumed as monopoles. For the exact speaker locations

please refer to Figs. 6or 12. As favorable input signal—to

analyze an entire frequency range—we select a sine sweep

with an exponential frequency increase over time (Fig. 4)in

all cases. Below 1.3kHz and above 2.7 kHz the sweep has a

fade in and fade out period. At the end of the sweep there is

1 ms silence. The exponential sine sweep is chosen as a popu-

lar signal for loudspeaker measurements (Mueller and

Massarani, 2001), though other input signals could be also

selected. Here, always the same sweep signal is prescribed for

all speakers. However, to impose a random variation between

the speakers, the amplitudes and the phases among all utilized

speakers are varied.

Second, we determine the entire four-dimensional (three

spatial and one temporal dimension) reference sound field

ref for the specified speakers and input signals with the

CDPS model (Sec. III C). To convert the sound fields from

frequency to temporal domain, we perform a discrete

Fourier transform. Snapshots of the reference fields are visu-

alized by Figs. 7and 12.

Third, we utilize the adjoint CAA solver (for its numeri-

cal setup see Sec. III B) to independently recalculate the

input signals for each speaker. As input for the adjoint

CAA solver the speaker locations and the four-dimensional

reference field p0

ref, which is used as the optimization target

(ptarget ¼p0

ref þp1) within the spatial mask r[Eq. (6)], are

provided. The ambient pressure is denoted by p1.

Fourth, the quality of the numerically identified speaker

signals is evaluated by contrasting them against the original

input signals of the CDPS model (reference). As further vali-

dation, the numerically reproduced (optimized) sound field

opt ¼popt p1is compared with the reference p0

ref at repre-

sentative microphone locations.

Taken together, we check to what extent the adjoint

approach is able to identify the optimal excitation signals for

a given sound field with a given speaker alignment.

To keep the discussion clear of special features (e.g.,

initial transient) of adjoint-based optimization, the startup

and the final decay process are excluded from the following

discussion. We thus restrict the validation to a frequency

interval of the sweep between 1.3 and 2.7 kHz (see vertical

black lines of Figs. 5or 8). If needed, one can modify the

FIG. 3. Iterative process for determining optimal driving functions: Starting with an initial guess for the loudspeakers s

¼0, the governing equations are solved

forward in time. The result is compared to the desired reference state, while the actual difference drives the adjoint equations, which are solved backward in

time. Based on the adjoint solution, the gradient for rspJis used to update the actual forcing. Once convergence is reached, deconvolution of optimal swith the

predefined input signal and the speaker characteristics results in optimal drives. Computationally intensive steps including CAA methods are marked in gray.

FIG. 4. Exponential sine sweep with a monotonically increasing frequency in

the time domain—the loudspeaker input signal used throughout this work. All

other speakers use the same signal sequence varying amplitude and phase.

J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1777

transient process and consider a broader frequency band, by

extending the duration of the sweep signal.

B. Setup of the adjoint-based CAA solver in time

domain

The governing equations Eq. (8) are discretized by finite

differences. A sixth-order accurate compact symmetric deriv-

ative stencil is used (Lele, 1992). The sound reinforcement

area under consideration is 0 <xi<1:6 m for all directions

and is discretized by means of a uniform grid resolved by

197 197 99 grid positions. This relatively small spatial

domain does not reflect a real outdoor sound reinforcement

situation. However, it enables us to complete an optimization

run with 15 iteration steps within the above mentioned com-

putational time. At the same time all physical relevant phe-

nomena such as an inhomogeneous sound pressure level

(SPL) distribution, a wide frequency spectrum, and a strati-

fied background flow can be fully studied.

Large scale computations can be realized using a paral-

lel cluster as the problem is fully parallelizable. The problem

scales with the number of discretization positions used to

resolve space and time. The computational complexity

remains the same. For larger setups stability problems are

not expected. Thus, for a large-scale venue slice with

70 15 m

, the required amount of CPUh will increase to

about 45 000 CPUh per loop, which can be handled on a par-

allel cluster in reasonable time and with reasonable costs.

The physical time span t

¼0mstot

end

¼26 ms is sepa-

rated into 1250 time steps, corresponding to a sampling rate

of 48 kHz and a maximal Courant–Friedrichs–Lewy condi-

tion (Wendt, 2009) equal to 0.94. An explicit fourth-order

Runge-Kutta scheme (Wendt, 2009) is employed for time-

wise integration. All boundaries are treated as non-reflective

using so-called “characteristic boundary conditions”

(Poinsot and Lele, 1992;Stein, 2019). In order to suppress

reflections, a quadratic sponge layer is added to all bound-

aries, acting on a side margin with a width of 0.3 m (Mani,

2012). To ensure stability, an implicit filter of sixth order is

used at each time step (Gaitonde and Visbal, 2000). The

adjoint equations are solved using the same discretization.

The corresponding boundary conditions are discussed in

Lemke (2015).

To realize an ambient speed of sound c

¼343 m/s of an

ideal gas with the adiabatic index c¼1.4, the ambient values

for the density and the pressure are set to .1¼1.21 kg/m

and p

¼101 325 Pa, respectively, with R

¼287 the specific

gas constant of air. Please note that for the case with non-

homogeneous flow field, the speed of sound is not constant,

see Sec. IV C.

C. Setup of the CDPS model in frequency domain

To determine the reference sound field, the CDPS model

is taken to calculate the transfer functions from every source to

each considered grid position of the sound reinforcement area

in the frequency domain. These transfer functions result from

the superposition of the impact of each source, e.g., monopole

sources as in this article, or sources with modeled as well as

measured loudspeaker directivity data. The transfer functions

comprise the reference driving functions—the electronic filters

which individually control the different inputs for each

source—and the acoustic transfer functions—the spatial

source–receiver sound wave propagation. Subsequently, the

transfer functions are transformed to the time domain using the

discrete inverse Fourier transform. The sound field is calcu-

lated by convolving the input time signal and the transfer func-

tions to obtain the sound pressure for each considered grid

position and for each considered time step.

Air is assumed to be homogeneous and dissipation-less

with a constant speed of sound c

¼343 m/s. The spatial loca-

tions, provided by the finite differences nodes of the CAA

solver, for which the sound field is to be calculated, have to be

located in the acoustical far-field of every single source.

D. Inhomogeneous velocity and temperature profile

We specify the wind and temperature profiles in this

section, which are utilized in Sec. IV C as test case for non-

uniform background flow.

An atmospheric boundary layer profile with constant

shear stress can be approximated by the log-law (Richards

and Hoxey, 1993):

uðyÞ¼uþ

jln 1 þy



;(18)

with u

the friction velocity, y

a roughness parameter, and j

¼0.4 the Von Karman constant. The roughness parameter in

case of a landscape with bushes is roughly y

¼0.1 (Leclerc

and Foken, 2014). A maximum wind speed of uðymaxÞ

¼15 m/s at the edge y

max

¼1.6 m of the computational

domain is selected, which also defines the friction velocity

using Eq. (18).

In case of a compressible turbulent boundary layer, the

temperature profile and the density profile is related to the

mean velocity profile u(y) [Eq. (18)] by the Crocco-

Busemann relation (White, 2006)

FIG. 5. The instantaneous root mean square error [RMSE, Eq. (20)] between

the pressure of the CAA solver and the CDPS model over the frequency of

the exponential sine sweep (see case in Sec. IV A). The two vertical black

lines mark the frequency range under consideration of the sweep between

1.3 and 2.7 kHz, excluding the switch on and off procedure of the adjoint-

based solver.

1778 J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al.

TðyÞ¼TgþTag Tg

ðÞ

uðyÞ

T0

c1

uðyÞ



;

q¼qw

T;(19)

being Tg¼T1292:80 K the ground temperature. The the-

oretically corresponding adiabatic ground temperature T

given by the assumption of linear heat flux at the ground,

i.e., the condition @yTjy¼0¼ðTmax TgÞ=ymax. The selected

reference temperature Tmax ¼TðymaxÞ¼Tg20 K at the

top edge defines T

. Evaluating Eqs. (18) and (19) provides

the velocity and temperature profiles depicted in Fig. 6.

These are utilized as background mean profiles in Sec. IV C.

IV. VALIDATION EXAMPLES

Sections IV A and IV B present a circular and a linear

array in the case of no flow. As a subsequent step, the linear

array case is perturbed by both a representative inhomoge-

neous velocity and temperature background field (Sec. IV C).

A. Circular monopole array

As a first validation example, we present the sound field

of a circular array, consisting of six speakers, orientated in

the x1x2-plane around the center (0.8, 0.8, 0.8) m of the

domain, and with a diameter of 0.8 m. A pressure snapshot

of the CAA time stepper is shown in Fig. 7. The highest SPL

occurs at three o’clock for the first speaker, decreasing

counter-clockwise from the second to the sixth speaker (an

arbitrary choice done here).

After 15 iterations of the adjoint-based optimization, the

objective function is decreased by about two orders of mag-

nitude (Fig. 2). The normalized instantaneous root mean

square error (RMSE) between the CAA solution p0

opt and the

CDPS reference p0

ref of Fig. 5,

RMSE

RMSðp0

refÞ¼ﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ

Xrðp0

ref p0

optÞ

2Xrp0

ref



s(20)

falls below 7% within the examined frequency interval.

Therein, Pis the sum over all spatial locations and ris the

weight function [Eq. (6)], which excludes in particular the

sponge (boundary) region, starting at a distance of 0.3 m

to each edge (Fig. 7). There, the CAA solution is not valid

(Fig. 7). However, the small RMSE of Fig. 5implicitly

validates the quality of non-reflective boundary conditions:

any spurious reflections would cross the r-area, thus, increas-

ing the RMSE.

Beside the comparison in the time domain, the ampli-

tudes and phases are contrasted in the frequency domain

below by analyzing the driving signals of the field sources of

the CDPS model and the CAA solver.

Naturally, the amplitude of the pressure source s

of the

right-hand-side of the CAA equation [Eq. (8)] is not directly

comparable to the resulting pressure field. However, both are

physically related by a phase shift of p/2 and an amplitude

conversion factor, which is discussed in the Appendix. In

contrast, the driving signal of the CDPS model is directly

proportional (phase and amplitude) to the acoustic pressure

field. In order to compare the CDPS and the CAA driving

signals in a physically equivalent manner, we will always

apply the phase and amplitude conversion just stated

throughout the paper.

Figures 8and 9depict the input gains and the phases

provided by the CAA solver. Due to the used exponential

sweep (cf. Fig. 5), higher frequencies have smaller time

shares of the overall signal reflected by a decaying gain with

increasing frequency in Fig. 8. The phases (Fig. 9) feature

periodic 2pshifts as expected.

Figures 10 and 11 show the gain and the phase differ-

ences between the CAA solver and the CDPS model within

the frequency interval from 1.3 to 2.7 kHz. The maximum

gain deviation error at the interval edges is 60.2 dB and less

FIG. 6. (Color online) Exemplary velocity and temperature profiles in a

windy desert at sunset [result of Eqs. (18),(19)]. The height x

of the three

microphone positions—examined in Sec. IV C—is marked with magenta

crosses.

FIG. 7. (Color online) CAA solution of the sound field radiated by six

speakers in a circular arrangement. A snapshot of pressure fluctuations of

the central x1x2-plane (x

¼0.8 m, where all speakers are located) is shown.

The locations of the six monopole sources of pressure are marked by black

crosses. Isolines of the objective weight function r[Eq. (6)] at values of

0.01, 0.5, and 0.99 are denoted by white, dashed circles (thin lines). The

sponge region is indicated by a white, dashed box (thick line) at a distance

of 0.3 m to the edge of the computational domain.

J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1779

than 60.1 dB between 1.6 and 2.7 kHz for all six speakers

(Fig. 10). Hence, also the relative pressure levels between all

six speakers are correctly captured. The phase deviation

between the CAA and CDPS solution corresponds to a phase

shift of less than half a percent (phase shift in rad/2p) within

the whole frequency interval (Fig. 11). In both figures the

adjoint-based CAA solver achieves best optimization results

for the loudest speaker, while the largest deviations are

apparent for the quietest speaker, indicating varying sensitiv-

ities on the objective.

Thus, we can summarize that the adjoint-based CAA

solver is able to provide driving functions for the circular

monopole array, which are in magnitude and phase almost

identical to the reference.

B. Linear monopole array

While horizontally arranged loudspeaker arrays are typi-

cally used for sound reproduction, vertical arrays are mostly

used for sound reinforcement. Thus, the second validation

example is a linear array with five vertically aligned mono-

pole speakers. Again the same exponential sweep test signal

is used. The equidistant speaker alignment is located on the

left side (all located at x

¼0.4 m), see Fig. 12. The loudest

speaker is located at the bottom (x

¼0.4 m), while the qui-

etest speaker is located at the top position (x

¼1.2 m).

For the linear array, we compare not only the source

signals of the CAA solution and the reference but also the gen-

erated sound pressure at three selected positions in the target

area (“mic 1”–“mic 3”). This extra comparison has two rea-

sons: First, microphone signals are another measure of the

quality of the obtained sound field. Second, the CDPS model

cannot deal with non-uniform background flow, as present in

Sec. IV C. Consequently, there are no reference signals avail-

able to compare with. Thus, after the following discussion of

the gains and the phases of the source signals, the amplitudes

and phases recorded by the microphones are discussed, too.

Qualitatively, both the source gain (Fig. 13) and the

source phase (Fig. 14) provided by the CAA solver for the

linear array are similar to those of the circular array (cf.

Figs. 8and 9). This is expected since the same exponential

sweep is reused (Fig. 4). The main difference between the

linear and the circular array (Sec. IV A) is the gain offset

between the speakers. The phase differs outside the investi-

gated frequency interval only [switch on process, discrete

Fourier transform (DFT) windowing].

The deviation between the CAA and the CDPS solution

are comparable to those in Sec. IV A. The maximal gain devia-

tion in Fig. 15 is below 0.1 dB. The maximal phase deviation

in Fig. 16 is less than half a percent (phase difference in rad/

2p). Again the loudest speakers are the most accurate.

Figures 17 and 18 display the SPL and the phase differ-

ences, respectively, at three representative microphone loca-

tions marked by the magenta crosses in Fig. 12, located at

FIG. 9. CAA solution of the input phase of the loudest speaker of the circu-

lar array. All other speakers show similar behavior. The two vertical black

lines define the frequency interval of the sweep between 1.3 and 2.7 kHz.

FIG. 10. (Color online) Difference between the CAA and the CDPS narrow-

band input gains of the speakers for the circular array.

FIG. 11. (Color online) Difference between the CAA and the CDPS input

phases of the speakers for the circular array.

FIG. 8. (Color online) CAA solution of the input gains for all speakers of

the circular array. The SPL is normalized at the loudest frequency of the first

speaker.

1780 J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al.

[1.1, 1.1, 0.8] m (mic 1), [0.8, 0.8, 0.8] m (mic 2) and [0.73,

0.63, 0.8] m (mic 3). In contrast to the speaker signals, the

deviation of the microphone SPLs is increased by a factor of

10 up to 1 dB at the frequency interval edges and the devia-

tion of the microphone phase is increased by a factor of 8 up

to 4%. These deviations are also representative of other

microphone locations. An explanation of the more accurate

speaker signal in comparison to the microphone signal is as

follows: The adjoint-based speaker signal results from an

comparatively more robust integral measure J[Eq. (6)]of

all spatial locations of r. In contrast, the pressure signal at

a microphone represents a single spatial point only. The dis-

crete Fourier transform windowing artifacts due to the short

available signal length of 26 ms are another explanation,

which especially affects the beginning and end of the sweep,

i.e., very low and very high frequencies.

Having confirmed that also a linear array setup can be

optimized, the geometrical flexibility of the adjoint approach

is demonstrated. Beside the signal of the speakers itself also

the resulting pressure fields at different microphone locations

match the CDPS reference.

C. Linear monopole array with velocity and

temperature profile

In the following the linear loudspeaker setup of Sec. IV B

is extended by an inhomogeneous velocity and temperature

profile, cf. Sec. III D. Inspired by a sunset festival situation in

a windy desert (cf. Fig. 1) we assume a hot ground with colder

air blowing above. Subsequently, the impact of these outdoor

FIG. 13. (Color online) CAA solution of the input gains for all speakers of

the linear array. The two vertical black lines define the sweep frequency

interval between 1.3 and 2.7 kHz.

FIG. 14. CAA solution of the input phase of the loudest speaker of the linear

array. All other speakers show similar behavior.

FIG. 15. (Color online) Difference between the CAA and the CDPS signal

gains of the speakers for the linear array.

FIG. 12. (Color online) CAA solution of the sound field radiated by a

strictly vertically stacked line array with five monopole sources. A snapshot

of pressure fluctuations of the central x1x2-plane (x

¼0.8 m, where all

speakers and microphones are located) is shown. All white dashed lines are

discussed in the caption of Fig. 6. The three virtual microphone positions

are visualized by magenta crosses: Microphone one is placed at [1.1, 1.1,

0.8] m, microphone two at [0.8, 0.8, 0.8] m, and microphone three is located

at [0.73, 0.63, 0.8] m.

FIG. 16. (Color online) Difference between the CAA and the CDPS signal

phases of the speakers for the linear array.

J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1781

conditions on sound propagation is discussed. The end of this

section deals with the optimization of the linear array in order

to reinforce the original target field by compensating the

impact of the velocity and the temperature profile.

Using the identical source signals as in Sec. IV B,and

the background flow profiles described in Sec. III D signifi-

cantly alter the sound propagation. At the ground x

¼0,

there is no flow and the speed of sound is equal to Sec. IV B.

However, with increasing wall distance x

>0, both, the

incoming wind and the decreasing sound speed delay the

sound propagation in the positive x

direction further and fur-

ther. In the time domain the resulting delayed pressure signal

at the mic 1 location is depicted as a dashed-dotted red line in

Fig. 19, while the original reference signal is colored solid

black. The time delay s

between the cases with and without

flow of the arrival time at a “mic i” can be estimated by

sðLx1;c1;c;uÞ¼Lx1=ðcþuÞ

zﬄﬄﬄﬄﬄﬄﬄ}|ﬄﬄﬄﬄﬄﬄﬄ{

with flow

Lx1=c1

zﬄﬄﬄ}|ﬄﬄﬄ{

no flow

;

s1ð0:70 m;343 m=s;333:6m=s;13:2m=sÞ¼0:144 ms;

s2ð0:40 m;343 m=s;335:5m=s;11:6m=sÞ¼0:069 ms;

s3ð0:33 m;343 m=s;336:7m=s;10:5m=sÞ¼0:050 ms;

(21)

being Lx1the streamwise distance between speaker and mic.

The seven-time step shift [7/(48 kHz)] between the extrema

of the reference and the delayed curve in Fig. 19 matches s

In addition to the time domain representation, the time

shift is visible as a spectral phase shift in Fig. 20. A time

shift scorresponds to a spectral phase shift of D/=2p¼fs.

Hence, for the first approximation, we find a linear depen-

dency on the frequency at the microphone locations, origi-

nating from the time delays fs1;fs2;and fs3as a black,

blue, and green solid line, respectively. For example, at

2 kHz the time shift s

leads to a phase shift of 13.8%, fully

consistent with the CAA value (blue solid line). As expected,

the largest phase shift of around 35% occurs for the highest

frequency of mic 1. For setups with more speakers and a

larger audience region stronger deviations from the desired

sound field are expected.

FIG. 19. (Color online) Comparison of the pressure amplitudes in the time

domain at “mic 1” in the case with and without a flow (and temperature)

profile, and each case with and without enabled optimization.

FIG. 20. (Color online) Difference between the disturbed CAA and the ref-

erence CDPS phases of the three microphones for the linear array. Always

the same flow disturbance of a combined velocity and temperature back-

ground profile is used (Sec. III D). Solid lines correspond to non-optimized

CAA solutions, whereas dash-dotted lines are optimized CAA solutions.

FIG. 17. (Color online) Difference between the SPLs of the CAA and the

CDPS solver at three microphone locations described by the legend and

illustrated by Fig. 11.

FIG. 18. (Color online) Difference between the acoustic pressure phases of

the CAA and the CDPS solver at three microphone locations described by

the legend and illustrated by Fig. 11.

1782 J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al.

Strictly speaking, the simplified explanation of the intro-

duced time delay is only considering the effects of wave

propagation in x

-direction assuming a single speaker. More

generally, superposition effects of all five speakers including

fully three-dimensional wave propagation within a boundary

layer mean flow leads to more complex phenomena and pos-

sible local outliers of the predominantly linear phasing. One

example of this non-linear phasing occurs at mic 3 (Fig. 20).

Beside the phase shift, the amplitude error at the exam-

ined mic positions is increased between plus and minus 3 dB

as visible in Figs. 19 and 21. An explanation might be a con-

structive or destructive interference between the waves of

the five speakers, which also explains the spatial dependency

of this effect. Because of the locally varying wind and tem-

perature, the acoustic path lengths between a microphone

and each speaker differ. Among the three microphones pre-

sented here, mic 3 with non-linear phasing is the most sensi-

tive to this interference effect and displays the largest

amplitude change in Fig. 21.

After discussing the effects of a non-uniform mean flow

on sound propagation, now its optimization is addressed.

Using the adjoint-based approach, the previous no-flow

speaker signals sare modified in order to reinforce the origi-

nal target field thereby compensating the effects of the non-

uniform background. The resulting optimized sound fields

are measured at the same mic positions as before.

In the time domain (Fig. 19) the optimized dashed green

curve is clearly shifted towards the black reference curve. In

the frequency domain (dashed lines of Fig. 20) the relative

phase error (dashed lines) is at least halved. The SPL differ-

ences (dashed lines of Fig. 21) display only small or no

(frequency dependent) improvement compared to the not

optimized curves (solid lines). Meanwhile a SPL decrease

over all frequencies of up to 2 dB can be seen, with the only

exception of mic 3 above 2.5 kHz. In comparison, the phases

are proportionally further restored by the optimization algo-

rithm than the SPL. This is probably caused by the fact that a

phase shift of pin Eq. (6) has a greater impact on the objective

in contrast to a perturbation of the pressure magnitude.

It is worth noting that the deviation between the opti-

mized and the reference solution is only a measure to which

extent the adjoint solver is able to reproduce a given target

field with the speakers available under the present conditions

(wind, temperature effects). It is not a quality criterion of the

sound reinforcement system itself. However, there are two

possibilities, to include quality criteria in the optimization

process. First, most easily, by predefining the target field

according to the criteria. Or second, by extending the objec-

tive function Eq. (6) to meet additional criteria—besides the

sound field reproduction alone. Potential criteria among

others could be a homogeneous SPL coverage and avoidance

of undesired radiation, or a smooth spectral distribution.

In the present paper, the driving signals were optimized

for predefined fixed speaker locations. However, in general,

the adjoint approach is also capable to identify the optimal

speaker locations, possibly suggesting a more efficient setup

than the original alignment to reinforce the same reference

field, see Lemke et al. (2017) for a preliminary test to iden-

tify source locations in two dimensions.

V. CONCLUSION

An adjoint-based framework for determination of optimal

driving functions of monopole speakers for sound reinforce-

ment applications was presented. The time-domain approach

allows the consideration of wind and other environmental con-

ditions like thermal stratification. It has been shown that the

approach is able to determine optimal driving functions, i.e.,

gain and delay, for different array configurations.

In practice, the obtained driving functions, including

environmental conditions, can be pre-computed, e.g., for dif-

ferent wind intensities. Based on on-site measurements the

sound field generation setup can then be switched between

the different functions.

In a next step, the adjoint-based approach will be ana-

lyzed in the context of complex speaker directivities. For

this purpose, monopole synthesis models are particularly

promising because the numerical effort of the adjoint-based

approach is independent of the number of sources.

ACKNOWLEDGMENTS

The authors gratefully acknowledge financial support by

the Deutsche Forschungsgemeinschaft (DFG) within the

projects Nos. LE 3888/2-1 and WE 4057/16-1.

APPENDIX

1. Conversion between the pressure forcing term of

the CDPS model and the CAA solver

While the CDPS model directly enforces the acoustic

pressure of a monopole speaker, the CAA solver indirectly

adds a source term s

to the pressure equation of Eq. (8).

FIG. 21. (Color online) Difference between the disturbed CAA and the

reference CDPS SPLs of the three microphones for the linear array with the

same flow disturbance as in Fig. 19. Again solid lines correspond to non-

optimized CAA solutions, whereas dash-dotted lines are optimized CAA

solutions.

J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1783

This section discusses, how to relate s

of the CAA solver

with the CDPS pressure signal of the monopole.

If the Euler equations [Eq. (8)] of the CAA solver

are linearized around a homogeneous background flow, the

wave equation

tp0c2

0Dp0¼sðy;tÞ(A1)

follows, with the right-hand-side force term

sðy;tÞ¼ c1

ðÞ

@tspðy;tÞ:(A2)

The general solution of this inhomogeneous wave equation

in free space is known as

p0ðx;tÞ¼ 1

4pðR3

sðy;tjxyj=c0Þ

jxyjdy:(A3)

To ensure numerical stability, the CAA solver uses a distrib-

uted Gaussian speaker shape

spðy;tÞ¼As

c1eðyy0Þ2r2eixðtÞt;(A4)

instead of monopole source, being x(t) the time-dependent

frequency function of the test signal. Strictly speaking, the

CAA speakers should be called Gaussian. Throughout the

present work, a point-like Gauss in relation to the shortest

acoustic wavelength (3 kHz) is used: r<k. Hence the CAA

speakers are approximate monopoles.

Combining the latter equations and executing the time

derivative provides

p0ðx;tÞ¼iðR3

spðy;sÞ@txðsÞs



jxyjdy;(A5)

being sðtÞ¼tjxyj=c0the delayed time between source

and observer position. Hence, the time derivative of the

eixðtÞtterm of the test signal causes a constant phase shift of

p(imaginary i) and a time, i.e., frequency dependent ampli-

tude relationship between the CAA source s

and the result-

ing pressure signal p0ðx;tÞ.

Alternatively, if both the CDPS signal s

CDPS

and the

CAA forcing s

are available, the frequency-dependent

amplitude relation is given by the relation jF½sCDPSðxÞj=

jF½spðxÞj, being Fthe discrete Fourier transform. This

amplitude relation is universal for all speakers. In the present

paper, the average of all speakers is taken into account.

Furthermore, one can show, that for low Mach numbers

M<0.1, the conversion curve is independent of the back-

ground flow. Hence, as soon as the amplitude conversion

curve is calculated for one signal and Gaussian speaker

shape without flow, it can also be applied in cases including

wind profiles or thermal stratification.

2. Adjoint equations

As discussed in the main part, linearization of the

governing Euler equations with respect to density, velocity,

and pressure around a given base state q

with q¼q0þdq

results in

@tAdqþ@xiBidqþCi@xidqþdCi@xic¼ds:(A6)

Therein, the summation convention applies. The lineariza-

tion matrices are given by

A¼

1 000 0

u1q00 0

u20q00

u300q0

0 000 1

c1

;

B1¼

u1q00 0

12qu100 1

u1u2qu2qu100

u1u3qu30qu10

0cp

c100cu1

c1

;

B2¼

u20q00

u1u2qu2qu100

202qu201

u2u30qu3qu20

00cp

c10cu2

c1

;

B3¼

u300 q0

u1u3qu30qu10

u2u30qu3qu20

3002qu31

000cp

c1

cu3

c1

;

Ci¼

0000 0

0000ui

;

dCi¼

0000 0

0000dui

The full adjoint Navier-Stokes equations, in particular,

the friction terms, are derived and discussed in Lemke

(2015). The two-dimensional adjoint Euler-equations can be

found in Lemke et al. (2017).

Betlehem, T., and Withers, C. (2012). “Sound field reproduction with energy

constraint on loudspeaker weights,” IEEE Trans. Audio, Speech, Lang.

Process. 20(8), 2388–2392.

Bewley, T. R. (2001). “Flow control: New challenges for a new renais-

sance,” Progr. Aerosp. Sci. 37(1), 21–58.

1784 J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al.

Coleman, P., Jackson, P. J. B., Olik, M., Møller, M., Olsen, M., and

Pedersen, J. A. (2014). “Acoustic contrast, planarity and robustness of

sound zone methods using a circular loudspeaker array,” J. Acoust. Soc.

Am. 135(4), 1929–1940.

Feistel, S., Sempf, M., K€

ohler, K., and Schmalle, H. (2013). “Adapting loud-

speaker array radiation to the venue using numerical optimization of FIR

filters,” in Proc. of the 135th Audio Eng. Soc. Conv., New York, paper

#8937.

Feistel, S., Thompson, A., and Ahnert, W. (2009). “Methods and limitations

of line source simulation,” J. Audio Eng. Soc. 57(6), 379–402.

Gaitonde, D. V., and Visbal, M. R. (2000). “Pade-type higher-order bound-

ary filters for the Navier-Stokes equations,” AIAA J. 38, 2103–2112.

Giles, M., and Pierce, N. (2000). “An introduction to the adjoint approach to

design,” Flow, Turbul. Combust. 65, 393–415.

Jameson, A. (1995). “Optimum aerodynamic design using CFD and control

theory,” AIAA Paper 1729, 124–131.

Leclerc, M. Y., and Foken, T. (2014). Footprints in Micrometeorology and

Ecology (Springer, Berlin).

Lele, S. K. (1992). “Compact finite difference schemes with spectral-like

resolution,” J. Comput. Phys. 103(1), 16–42.

Lemke, M. (2015). “Adjoint based data assimilation in compressible flows

with application to pressure determination from PIV data,” Ph.D. thesis,

Technische Universit€

at Berlin, Berlin.

Lemke, M., Reiss, J., and Sesterhenn, J. (2014). “Adjoint based optimisation

of reactive compressible flows,” Combust. Flame 161(10), 2552–2564.

Lemke, M., and Sesterhenn, J. (2016). “Adjoint-based pressure determina-

tion from PIV data in compressible flows—Validation and assessment

based on synthetic data,” Eur. J. Mech. – B Fluids 58, 29–38.

Lemke, M., Straube, F., Schultz, F., Sesterhenn, J., and Weinzierl, S. (2017).

“Adjoint-based time domain sound reinforcement,” in Audio Engineering

Society Conference: 2017 AES International Conference on Sound

Reinforcement pen Air Venues.

Mani, A. (2012). “Analysis and optimization of numerical sponge layers as

a nonreflective boundary treatment,” J. Comput. Phys. 231(2), 704–716.

Meyer, D. G. (1984). “Computer simulation of loudspeaker directivity,”

J. Audio Eng. Soc. 32(5), 294–315.

Meyer, P., and Schwenke, R. (2003). “Comparison of the directional point

source model and BEM model for arrayed loudspeakers,” Proc. Inst.

Acoust. 25(4), 1–10.

Mueller, S., and Massarani, P. (2001). “Transfer-function measurement with

sweeps,” J. Audio Eng. Soc. 49(6), 443–471.

Poinsot, T., and Lele, S. (1992). “Boundary conditions for direct simulations

of compressible viscous flows,” J. Comput. Phys. 101, 104–129.

Richards, P., and Hoxey, R. (1993). “Appropriate boundary conditions for

computational wind engineering models using the k-turbulence model,”

J. Wind Eng. Indust. Aerodyn. 46-47, 145–153.

Stein, L. (2019). “Simulation and modeling of a Helmholtz resonator under

grazing turbulent flow,” Ph.D. thesis, Technische Universit€

at Berlin,

Berlin.

Straube, F., Schultz, F., Makarski, M., and Weinzierl, S. (2018). “Mixed

analytical-numerical filter design for optimized electronic control of line

source arrays,” J. Audio Eng. Soc. 66(9), 690–702.

Terrell, M., and Sandler, M. (2010). “Optimising the controls of a homoge-

nous loudspeaker array,” in Proc. of the 129th Audio Eng. Soc. Conv., San

Francisco, paper #8159.

Thompson, A. (2009). “Improved methods for controlling touring loud-

speaker arrays,” in Proc. of the 127th Audio Eng. Soc. Conv., New York,

paper #7828.

van Beuningen, G. W. J., and Start, E. W. (2000). “Optimizing directivity

properties of DSP controlled loudspeaker arrays,” Proc. Inst. Acoust.:

Reproduced Sound 22(6), 17–37.

Wendt, J. F. (2009). Computational Fluid Dynamics: An Introduction, 3rd

ed. (Springer, New York).

White, F. (2006). Viscous Fluid Flow, 3rd ed. (McGraw Hill, New York).

J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1785