scieee Science in your language
[en] (orig)
Adjoint-based optimization of sound reinforcement including non-uniform flow
Lewin Stein, Florian Straube, Jörn Sesterhenn, et al.
Citation: The Journal of the Acoustical Society of America 146, 1774 (2019); doi: 10.1121/1.5126516
View online: https://doi.org/10.1121/1.5126516
View Table of Contents: https://asa.scitation.org/toc/jas/146/3
Published by the Acoustical Society of America
ARTICLES YOU MAY BE INTERESTED IN
Compressive synthetic aperture sonar imaging with distributed optimization
The Journal of the Acoustical Society of America 146, 1839 (2019); https://doi.org/10.1121/1.5126862
A deep learning method for grid-free localization and quantification of sound sources
The Journal of the Acoustical Society of America 146, EL225 (2019); https://doi.org/10.1121/1.5126020
Perceptual audio processing stethoscope
The Journal of the Acoustical Society of America 146, 1769 (2019); https://doi.org/10.1121/1.5126226
On the limitations of sound localization with hearing devices
The Journal of the Acoustical Society of America 146, 1732 (2019); https://doi.org/10.1121/1.5126521
Machine learning in acoustics: Theory and applications
The Journal of the Acoustical Society of America 146, 3590 (2019); https://doi.org/10.1121/1.5133944
Interior near-field acoustic holography based on finite elements
The Journal of the Acoustical Society of America 146, 1758 (2019); https://doi.org/10.1121/1.5126513
Adjoint-based optimization of sound reinforcement including
non-uniform flow
Lewin Stein,
1,a)
Florian Straube,
2
J
orn Sesterhenn,
1
Stefan Weinzierl,
2
and Mathias Lemke
1
1
Computational Fluid Dynamics Group, TU Berlin, Muller-Breslau-Str. 15, 10623 Berlin, Germany
2
Audio Communication Group, TU Berlin, Einsteinufer 17c, 10587 Berlin, Germany
(Received 14 May 2019; revised 10 August 2019; accepted 23 August 2019; published online 26
September 2019)
The determination of optimal geometric arrangements and electronic drives of loudspeaker arrays in
sound reinforcement applications is an ill-posed inverse problem. This paper introduces an innovative
method to determine complex driving functions, also considering complex environmental conditions.
As an alternative to common frequency domain methods, the authors present an adjoint-based
approach in the time domain: Acoustic sources are optimized in order to generate a given target sound
field. Instead of the Helmholtz equation, the full non-linear Euler equations are considered. This
enables an easier treatment of non-uniform flow and boundary conditions. As proof of concept, a
circular and a linear monopole array are examined. For the latter, the environmental conditions include
wind and thermal stratification. For all examples, the method is able to provide appropriate driving
functions. V
C2019 Author(s). All article content, except where otherwise noted, is licensed under a
Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
https://doi.org/10.1121/1.5126516
[KGS] Pages: 1774–1785
I. INTRODUCTION
Sound reinforcement aims at synthesizing specified
sound fields for the entire audio bandwidth. The generation
of the target sound fields can be controlled by the placement
and the electronic drive of different loudspeakers. Gain and
delay are the two parameters—sometimes called excitation
coefficients or feeding coefficients—that have to be ascer-
tained for electronic optimization of the loudspeaker array
radiation. Both parameters are related to the amplitude and
phase of a complex driving function. They are typically
computed separately for each frequency (Betlehem and
Withers, 2012;Coleman et al., 2014;Feistel et al., 2013;
Straube et al., 2018;Terrell and Sandler, 2010;Thompson,
2009;van Beuningen and Start, 2000) using frequency
domain methods.
This article presents a new method to find loudspeaker
positions and their driving functions for application in sound
reinforcement. The focus of this work is the determination of
optimized driving functions—also in case of non-uniform flow.
For this purpose, the authors present an adjoint-based
method in the time domain. It is based on the full non-linear
Euler equations and the corresponding adjoint, which are
solved by means of computational aeroacoustic (CAA) tech-
niques. Since the Euler equations allow us to consider wind
and thermal stratification, applications of the current work
may include stadiums or open-air events (Fig. 1). Here, they
are considered in a model form.
In fluid mechanics, the use of the adjoint has proven to
be an effective approach for determining various model
parameters (Giles and Pierce, 2000;Jameson, 1995).
Utilizing corresponding methods for acoustic problems is
convenient as the adjoint compressible Euler and Navier-
Stokes equations (Lemke et al., 2014;Lemke and
Sesterhenn, 2016) also capture acoustics, both for a station-
ary environment and complex base flow.
The paper is organized as follows: In Sec. II the adjoint-
based method is presented and tailored for the application in
sound reinforcement. Section III introduces the concept of val-
idation and methods. Section IV presents exemplary results to
verify the applicability of the adjoint-based method for circular
and linear source arrays—for the latter with a velocity and a
temperature profile. Section Vgives a summary.
II. ADJOINT APPROACH
In contrast to frequency domain approaches, which are
common in sound field generation as well as sound rein-
forcement and which are based on an integral representation
of the homogeneous wave equation for discrete source distri-
butions (Feistel et al., 2009;Meyer, 1984;Meyer and
Schwenke, 2003;van Beuningen and Start, 2000), the
adjoint-based method makes use of a more general represen-
tation of the wave propagation in the time domain.
A. Adjoint equations
The adjoint equations can be defined in a continuous or
discrete manner. For the sake of simplicity, they are intro-
duced in discrete version (Giles and Pierce, 2000). A matrix-
vector notation is used. The vector space is the full solution
in space and time. This section is based on Lemke (2015).
Adjoint equations arise by a so-called objective function
J, which is defined by the product between a geometric
weight gand the system state q:
J¼gTq;g;q2Rn:(1)
a)
Electronic mail: [email protected]
1774 J. Acoust. Soc. Am. 146 (3), September 2019 V
CAuthor(s) 2019.0001-4966/2019/146(3)/1774/12
The system state qcorresponds to the solution of the govern-
ing system
Aq ¼s;A2Rnn;s2Rn;(2)
with Aas governing operator and sas source terms on the
right-hand side. In terms of optimization of Jby means of s,
the equation has to be solved for every different s. In order
to reduce the computational effort, the adjoint equation can
be used,
ATq¼g;(3)
with the adjoint variable q*. By
J¼gTq¼ATq

Tq¼qTAq ¼qTs;(4)
an expression is found, which enables the computation of J
without the need to solve the system for every discrete s.
After solving the adjoint equation, the objective can be
determined by a simple and computationally cheap scalar
product. Thus, the adjoint approach enables the efficient
computation of gradients for Jwith respect to s.
B. Objective and adjoint-based gradient
According to the intended application—the sound field
generation in the time domain—the objective function Jis
defined in space and time with dX¼dxidt:
J¼1
2ððqqtarget
ðÞ
2dX:(5)
The variable q¼½.;uj;pcomprises all quantities which are
necessary for a state description of the governing system,
defined by the Euler equations. The variable q
target
denotes
the desired system state, .the density, u
j
the velocity in the
direction x
j
and pthe pressure, which is used solely for eval-
uation of the objective
J¼1
2ððpptarget
ðÞ
2rdX:(6)
The additional weight rðxi;tÞdefines where and when the
objective is evaluated. In practice, the objective function is
supplemented by additional constraints, e.g., power con-
sumption, and a regularization term (Bewley, 2001), which
are omitted here for the sake of clarity. Optimal sound rein-
forcement is realized if Jreaches a minimum.
The minimum is to be achieved under the constraint
that the Euler equations Eare satisfied. Similar to Eq. (2),
the following system is introduced:
EðqÞq¼s;(7)
abbreviating the Euler equations
@t
.
.uj
p
c1
0
B
B
@1
C
C
Aþ@xi
.ui
.uiujþpdij
uipc
c1
0
B
B
@1
C
C
Aui@xi
0
0
p
0
@1
A¼
s.
s.uj
sp
0
@1
A;
(8)
with cas heat capacity ratio. The summation convention
applies. See Lemke et al. (2014) for details on the
formulation.
The terms s¼½s.;s.uj;spon the right side of the equa-
tions characterize sources for mass, momentum, and energy.
They enable control of the system state, respectively, the solu-
tion of the equations. The general goal is to obtain a solution
of the Euler equations, which matches q
target
, respectively,
p
target
in an optimal sense, by adapting s.
FIG. 1. (Color online) “Mayan Warrior” sound system at the “Burning Man” festival in Nevada’s Black Rock desert (with friendly permission of the Mayan
Warrior Collective, http://mayanwarrior.mx). Open-air conditions including wind and thermal stratification, such as a hot ground and a cool breeze above con-
sidered in Sec. III D, are examples, where sound optimization with non-uniform flow matters.
J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1775
The sources scan be interpreted as sound sources or
loudspeakers. Monopole sources can be described solely by
sources s
p
in the pressure equation. Thus, an optimization of
scorresponds to an optimization of the loudspeakers’ output
signals.
In order to use the adjoint approach for optimizing s,it
is necessary to linearize the objective function Eq. (6) and
the governing system Eq. (8). This results in
dJ¼ððpptarget
ðÞ
r
|fflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflffl}
¼g
dpdX(9)
and
Elinðq0Þdq¼ds;(10)
with the now defined weight g¼ðpptargetÞr. Combining
the linearized system and the objective in a Lagrangian man-
ner by introducing the (adjoint) multiplier q*, leads to
dJ¼gTdqqTElinðq0ÞdqdsðÞ
|fflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflffl}
¼0
¼qTdsþdqTgET
linðq0Þq

:(11)
For the sake of simplicity, the integrals are not shown. The
adjoint equations result in
gET
linðq0Þq¼0;(12)
with q¼½.;u
j;p, similar to Eq. (3). See Lemke (2015,
p. 19) for a detailed derivation of the adjoint Euler equations,
which are given by
@tq¼~
AðBiÞT@xiq@xiðCiÞTqþ~
Ci@xicg
hi
(13)
with ~
A¼ðATÞ1and ~
Cias resorting
q
adCi
ab@xicb¼q
adqj
@Ci
ab
@qj
@xicb;(14)
abbreviated as dqj
~
Ci
jb@xicb. The matrices A,B
i
, and C
i
result
from the linearization of the Euler Eq. (8) and are given in
the Appendix.
The change of the objective function reads
dJ¼qTds;(15)
similar to Eq. (4). Thus, the adjoint solution can be inter-
preted as a gradient of Jwith respect to the source terms s,
rsJ¼q:(16)
Initial and boundary conditions of the adjoint Euler equa-
tions as well as the derivation of the adjoint compressible
Navier-Stokes equations are discussed in Lemke (2015).
C. Iterative determination of the driving function
The adjoint-based gradient is used in an iterative man-
ner. First, the Euler equations are solved forward in time
with s
0
¼0. Subsequently, the adjoint equations are calcu-
lated backward in time deploying the direct solution and the
weight g. Based on the adjoint solution, the gradient rsJis
determined and used to update the source distribution s
n
:
snþ1¼snþarsJ;(17)
with adenoting an appropriate step size and nthe iteration
number. The gradient is calculated for the whole computing
region and the entire simulation time, but only evaluated at
desired loudspeaker positions. While convergence is not
accomplished in the objective function, the process is repeated
with the current s
n
.Pleasenotethatonlys
p
is optimized in this
work. Thus, only monopole sources are considered.
The resulting optimal sis deconvolved with the consid-
ered primitive input signal and the loudspeaker properties
resulting in an optimal driving for every considered loud-
speaker. See Fig. 3for an overview of the process.
The proposed technique optimizes towards local
extrema. Detecting global optima is not ensured. The com-
putational costs of the adjoint-based approach are indepen-
dent of the number of loudspeakers and their arrangement.
They only depend on the size of the computational domain,
which includes the entire space from the loudspeakers to the
most distant listening position, and the considered frequency
range, as higher frequencies demand a higher spatial and
temporal resolution. Using high-accuracy discretization
schemes, e.g., compact schemes, a resolution of four to 12
points per wavelength will usually be sufficient, depending
on the required accuracy.
Figure 2shows the convergence behavior of the objec-
tive functions of all optimization runs conducted in the
following. Throughout this work, all CAA solutions are the
result of 15 iteration steps. Depending on the line array setup
and the type of background flow the objective is reduced
between one and two orders of magnitude.
Utilizing a line search strategy with three forward and one
adjoint backward calculation per iteration step (quadratic inter-
polation), the total CPU time after 15 iterations adds up to
roughly 200 CPU-hours (details of the employed numerical
resolution follow in Sec. III B). On a local machine with 16
cores (AMD EPYC 7351, 32 GB RAM each), these hours cor-
respond to a total runtime of approximately half a day.
FIG. 2. (Color online) The convergence of the objectives of all three con-
ducted optimization runs. Section II C explains the details about the underly-
ing adjoint optimization strategy.
1776 J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al.
III. VALIDATION SETUP
A. Introduction of the validation concept
As validation, the adjoint-based method (Sec. II)is
applied to different test cases (Secs. IV A,IV B,IV C). To
obtain a reference solution for all test cases, i.e., a sound field
generated by prescribed speakers with fixed positions for
given input signals, we use a complex directivity point source
(CDPS) model. In the frequency domain, the CDPS model
(Feistel et al., 2009;Meyer, 1984;Meyer and Schwenke,
2003;van Beuningen and Start, 2000) calculates the analytic
sound field in free space without any background flow, by
superimposing the impact of all speakers (for its setup, see
Sec. III C).
The basic validation procedure is as follows: First, we
specify a test case consisting of circularly or linearly aligned
speakers and an input signal. Throughout this paper, all speak-
ers are assumed as monopoles. For the exact speaker locations
please refer to Figs. 6or 12. As favorable input signal—to
analyze an entire frequency range—we select a sine sweep
with an exponential frequency increase over time (Fig. 4)in
all cases. Below 1.3kHz and above 2.7 kHz the sweep has a
fade in and fade out period. At the end of the sweep there is
1 ms silence. The exponential sine sweep is chosen as a popu-
lar signal for loudspeaker measurements (Mueller and
Massarani, 2001), though other input signals could be also
selected. Here, always the same sweep signal is prescribed for
all speakers. However, to impose a random variation between
the speakers, the amplitudes and the phases among all utilized
speakers are varied.
Second, we determine the entire four-dimensional (three
spatial and one temporal dimension) reference sound field
p0
ref for the specified speakers and input signals with the
CDPS model (Sec. III C). To convert the sound fields from
frequency to temporal domain, we perform a discrete
Fourier transform. Snapshots of the reference fields are visu-
alized by Figs. 7and 12.
Third, we utilize the adjoint CAA solver (for its numeri-
cal setup see Sec. III B) to independently recalculate the
input signals for each speaker. As input for the adjoint
CAA solver the speaker locations and the four-dimensional
reference field p0
ref, which is used as the optimization target
(ptarget ¼p0
ref þp1) within the spatial mask r[Eq. (6)], are
provided. The ambient pressure is denoted by p1.
Fourth, the quality of the numerically identified speaker
signals is evaluated by contrasting them against the original
input signals of the CDPS model (reference). As further vali-
dation, the numerically reproduced (optimized) sound field
p0
opt ¼popt p1is compared with the reference p0
ref at repre-
sentative microphone locations.
Taken together, we check to what extent the adjoint
approach is able to identify the optimal excitation signals for
a given sound field with a given speaker alignment.
To keep the discussion clear of special features (e.g.,
initial transient) of adjoint-based optimization, the startup
and the final decay process are excluded from the following
discussion. We thus restrict the validation to a frequency
interval of the sweep between 1.3 and 2.7 kHz (see vertical
black lines of Figs. 5or 8). If needed, one can modify the
FIG. 3. Iterative process for determining optimal driving functions: Starting with an initial guess for the loudspeakers s
p
¼0, the governing equations are solved
forward in time. The result is compared to the desired reference state, while the actual difference drives the adjoint equations, which are solved backward in
time. Based on the adjoint solution, the gradient for rspJis used to update the actual forcing. Once convergence is reached, deconvolution of optimal swith the
predefined input signal and the speaker characteristics results in optimal drives. Computationally intensive steps including CAA methods are marked in gray.
FIG. 4. Exponential sine sweep with a monotonically increasing frequency in
the time domain—the loudspeaker input signal used throughout this work. All
other speakers use the same signal sequence varying amplitude and phase.
J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1777
transient process and consider a broader frequency band, by
extending the duration of the sweep signal.
B. Setup of the adjoint-based CAA solver in time
domain
The governing equations Eq. (8) are discretized by finite
differences. A sixth-order accurate compact symmetric deriv-
ative stencil is used (Lele, 1992). The sound reinforcement
area under consideration is 0 <xi<1:6 m for all directions
and is discretized by means of a uniform grid resolved by
197 197 99 grid positions. This relatively small spatial
domain does not reflect a real outdoor sound reinforcement
situation. However, it enables us to complete an optimization
run with 15 iteration steps within the above mentioned com-
putational time. At the same time all physical relevant phe-
nomena such as an inhomogeneous sound pressure level
(SPL) distribution, a wide frequency spectrum, and a strati-
fied background flow can be fully studied.
Large scale computations can be realized using a paral-
lel cluster as the problem is fully parallelizable. The problem
scales with the number of discretization positions used to
resolve space and time. The computational complexity
remains the same. For larger setups stability problems are
not expected. Thus, for a large-scale venue slice with
70 15 m
2
, the required amount of CPUh will increase to
about 45 000 CPUh per loop, which can be handled on a par-
allel cluster in reasonable time and with reasonable costs.
The physical time span t
0
¼0mstot
end
¼26 ms is sepa-
rated into 1250 time steps, corresponding to a sampling rate
of 48 kHz and a maximal Courant–Friedrichs–Lewy condi-
tion (Wendt, 2009) equal to 0.94. An explicit fourth-order
Runge-Kutta scheme (Wendt, 2009) is employed for time-
wise integration. All boundaries are treated as non-reflective
using so-called “characteristic boundary conditions”
(Poinsot and Lele, 1992;Stein, 2019). In order to suppress
reflections, a quadratic sponge layer is added to all bound-
aries, acting on a side margin with a width of 0.3 m (Mani,
2012). To ensure stability, an implicit filter of sixth order is
used at each time step (Gaitonde and Visbal, 2000). The
adjoint equations are solved using the same discretization.
The corresponding boundary conditions are discussed in
Lemke (2015).
To realize an ambient speed of sound c
1
¼343 m/s of an
ideal gas with the adiabatic index c¼1.4, the ambient values
for the density and the pressure are set to .1¼1.21 kg/m
3
and p
1
¼101 325 Pa, respectively, with R
s
¼287 the specific
gas constant of air. Please note that for the case with non-
homogeneous flow field, the speed of sound is not constant,
see Sec. IV C.
C. Setup of the CDPS model in frequency domain
To determine the reference sound field, the CDPS model
is taken to calculate the transfer functions from every source to
each considered grid position of the sound reinforcement area
in the frequency domain. These transfer functions result from
the superposition of the impact of each source, e.g., monopole
sources as in this article, or sources with modeled as well as
measured loudspeaker directivity data. The transfer functions
comprise the reference driving functions—the electronic filters
which individually control the different inputs for each
source—and the acoustic transfer functions—the spatial
source–receiver sound wave propagation. Subsequently, the
transfer functions are transformed to the time domain using the
discrete inverse Fourier transform. The sound field is calcu-
lated by convolving the input time signal and the transfer func-
tions to obtain the sound pressure for each considered grid
position and for each considered time step.
Air is assumed to be homogeneous and dissipation-less
with a constant speed of sound c
1
¼343 m/s. The spatial loca-
tions, provided by the finite differences nodes of the CAA
solver, for which the sound field is to be calculated, have to be
located in the acoustical far-field of every single source.
D. Inhomogeneous velocity and temperature profile
We specify the wind and temperature profiles in this
section, which are utilized in Sec. IV C as test case for non-
uniform background flow.
An atmospheric boundary layer profile with constant
shear stress can be approximated by the log-law (Richards
and Hoxey, 1993):
uðyÞ¼uþ
jln 1 þy
yr

;(18)
with u
þ
the friction velocity, y
r
a roughness parameter, and j
¼0.4 the Von Karman constant. The roughness parameter in
case of a landscape with bushes is roughly y
r
¼0.1 (Leclerc
and Foken, 2014). A maximum wind speed of uðymaxÞ
¼15 m/s at the edge y
max
¼1.6 m of the computational
domain is selected, which also defines the friction velocity
u
þ
using Eq. (18).
In case of a compressible turbulent boundary layer, the
temperature profile and the density profile is related to the
mean velocity profile u(y) [Eq. (18)] by the Crocco-
Busemann relation (White, 2006)
FIG. 5. The instantaneous root mean square error [RMSE, Eq. (20)] between
the pressure of the CAA solver and the CDPS model over the frequency of
the exponential sine sweep (see case in Sec. IV A). The two vertical black
lines mark the frequency range under consideration of the sweep between
1.3 and 2.7 kHz, excluding the switch on and off procedure of the adjoint-
based solver.
1778 J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al.
TðyÞ¼TgþTag Tg
ðÞ
uðyÞ
u0
T0
c1
2
uðyÞ
c0

2
;
q¼qw
Tw
T;(19)
being Tg¼T1292:80 K the ground temperature. The the-
oretically corresponding adiabatic ground temperature T
ag
is
given by the assumption of linear heat flux at the ground,
i.e., the condition @yTjy¼0¼ðTmax TgÞ=ymax. The selected
reference temperature Tmax ¼TðymaxÞ¼Tg20 K at the
top edge defines T
0
. Evaluating Eqs. (18) and (19) provides
the velocity and temperature profiles depicted in Fig. 6.
These are utilized as background mean profiles in Sec. IV C.
IV. VALIDATION EXAMPLES
Sections IV A and IV B present a circular and a linear
array in the case of no flow. As a subsequent step, the linear
array case is perturbed by both a representative inhomoge-
neous velocity and temperature background field (Sec. IV C).
A. Circular monopole array
As a first validation example, we present the sound field
of a circular array, consisting of six speakers, orientated in
the x1x2-plane around the center (0.8, 0.8, 0.8) m of the
domain, and with a diameter of 0.8 m. A pressure snapshot
of the CAA time stepper is shown in Fig. 7. The highest SPL
occurs at three o’clock for the first speaker, decreasing
counter-clockwise from the second to the sixth speaker (an
arbitrary choice done here).
After 15 iterations of the adjoint-based optimization, the
objective function is decreased by about two orders of mag-
nitude (Fig. 2). The normalized instantaneous root mean
square error (RMSE) between the CAA solution p0
opt and the
CDPS reference p0
ref of Fig. 5,
RMSE
RMSðp0
refÞ¼ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
Xrðp0
ref p0
optÞ
hi
2Xrp0
ref

2
s(20)
falls below 7% within the examined frequency interval.
Therein, Pis the sum over all spatial locations and ris the
weight function [Eq. (6)], which excludes in particular the
sponge (boundary) region, starting at a distance of 0.3 m
to each edge (Fig. 7). There, the CAA solution is not valid
(Fig. 7). However, the small RMSE of Fig. 5implicitly
validates the quality of non-reflective boundary conditions:
any spurious reflections would cross the r-area, thus, increas-
ing the RMSE.
Beside the comparison in the time domain, the ampli-
tudes and phases are contrasted in the frequency domain
below by analyzing the driving signals of the field sources of
the CDPS model and the CAA solver.
Naturally, the amplitude of the pressure source s
p
of the
right-hand-side of the CAA equation [Eq. (8)] is not directly
comparable to the resulting pressure field. However, both are
physically related by a phase shift of p/2 and an amplitude
conversion factor, which is discussed in the Appendix. In
contrast, the driving signal of the CDPS model is directly
proportional (phase and amplitude) to the acoustic pressure
field. In order to compare the CDPS and the CAA driving
signals in a physically equivalent manner, we will always
apply the phase and amplitude conversion just stated
throughout the paper.
Figures 8and 9depict the input gains and the phases
provided by the CAA solver. Due to the used exponential
sweep (cf. Fig. 5), higher frequencies have smaller time
shares of the overall signal reflected by a decaying gain with
increasing frequency in Fig. 8. The phases (Fig. 9) feature
periodic 2pshifts as expected.
Figures 10 and 11 show the gain and the phase differ-
ences between the CAA solver and the CDPS model within
the frequency interval from 1.3 to 2.7 kHz. The maximum
gain deviation error at the interval edges is 60.2 dB and less
FIG. 6. (Color online) Exemplary velocity and temperature profiles in a
windy desert at sunset [result of Eqs. (18),(19)]. The height x
2
of the three
microphone positions—examined in Sec. IV C—is marked with magenta
crosses.
FIG. 7. (Color online) CAA solution of the sound field radiated by six
speakers in a circular arrangement. A snapshot of pressure fluctuations of
the central x1x2-plane (x
3
¼0.8 m, where all speakers are located) is shown.
The locations of the six monopole sources of pressure are marked by black
crosses. Isolines of the objective weight function r[Eq. (6)] at values of
0.01, 0.5, and 0.99 are denoted by white, dashed circles (thin lines). The
sponge region is indicated by a white, dashed box (thick line) at a distance
of 0.3 m to the edge of the computational domain.
J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1779
than 60.1 dB between 1.6 and 2.7 kHz for all six speakers
(Fig. 10). Hence, also the relative pressure levels between all
six speakers are correctly captured. The phase deviation
between the CAA and CDPS solution corresponds to a phase
shift of less than half a percent (phase shift in rad/2p) within
the whole frequency interval (Fig. 11). In both figures the
adjoint-based CAA solver achieves best optimization results
for the loudest speaker, while the largest deviations are
apparent for the quietest speaker, indicating varying sensitiv-
ities on the objective.
Thus, we can summarize that the adjoint-based CAA
solver is able to provide driving functions for the circular
monopole array, which are in magnitude and phase almost
identical to the reference.
B. Linear monopole array
While horizontally arranged loudspeaker arrays are typi-
cally used for sound reproduction, vertical arrays are mostly
used for sound reinforcement. Thus, the second validation
example is a linear array with five vertically aligned mono-
pole speakers. Again the same exponential sweep test signal
is used. The equidistant speaker alignment is located on the
left side (all located at x
1
¼0.4 m), see Fig. 12. The loudest
speaker is located at the bottom (x
2
¼0.4 m), while the qui-
etest speaker is located at the top position (x
2
¼1.2 m).
For the linear array, we compare not only the source
signals of the CAA solution and the reference but also the gen-
erated sound pressure at three selected positions in the target
area (“mic 1”–“mic 3”). This extra comparison has two rea-
sons: First, microphone signals are another measure of the
quality of the obtained sound field. Second, the CDPS model
cannot deal with non-uniform background flow, as present in
Sec. IV C. Consequently, there are no reference signals avail-
able to compare with. Thus, after the following discussion of
the gains and the phases of the source signals, the amplitudes
and phases recorded by the microphones are discussed, too.
Qualitatively, both the source gain (Fig. 13) and the
source phase (Fig. 14) provided by the CAA solver for the
linear array are similar to those of the circular array (cf.
Figs. 8and 9). This is expected since the same exponential
sweep is reused (Fig. 4). The main difference between the
linear and the circular array (Sec. IV A) is the gain offset
between the speakers. The phase differs outside the investi-
gated frequency interval only [switch on process, discrete
Fourier transform (DFT) windowing].
The deviation between the CAA and the CDPS solution
are comparable to those in Sec. IV A. The maximal gain devia-
tion in Fig. 15 is below 0.1 dB. The maximal phase deviation
in Fig. 16 is less than half a percent (phase difference in rad/
2p). Again the loudest speakers are the most accurate.
Figures 17 and 18 display the SPL and the phase differ-
ences, respectively, at three representative microphone loca-
tions marked by the magenta crosses in Fig. 12, located at
FIG. 9. CAA solution of the input phase of the loudest speaker of the circu-
lar array. All other speakers show similar behavior. The two vertical black
lines define the frequency interval of the sweep between 1.3 and 2.7 kHz.
FIG. 10. (Color online) Difference between the CAA and the CDPS narrow-
band input gains of the speakers for the circular array.
FIG. 11. (Color online) Difference between the CAA and the CDPS input
phases of the speakers for the circular array.
FIG. 8. (Color online) CAA solution of the input gains for all speakers of
the circular array. The SPL is normalized at the loudest frequency of the first
speaker.
1780 J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al.
[1.1, 1.1, 0.8] m (mic 1), [0.8, 0.8, 0.8] m (mic 2) and [0.73,
0.63, 0.8] m (mic 3). In contrast to the speaker signals, the
deviation of the microphone SPLs is increased by a factor of
10 up to 1 dB at the frequency interval edges and the devia-
tion of the microphone phase is increased by a factor of 8 up
to 4%. These deviations are also representative of other
microphone locations. An explanation of the more accurate
speaker signal in comparison to the microphone signal is as
follows: The adjoint-based speaker signal results from an
comparatively more robust integral measure J[Eq. (6)]of
all spatial locations of r. In contrast, the pressure signal at
a microphone represents a single spatial point only. The dis-
crete Fourier transform windowing artifacts due to the short
available signal length of 26 ms are another explanation,
which especially affects the beginning and end of the sweep,
i.e., very low and very high frequencies.
Having confirmed that also a linear array setup can be
optimized, the geometrical flexibility of the adjoint approach
is demonstrated. Beside the signal of the speakers itself also
the resulting pressure fields at different microphone locations
match the CDPS reference.
C. Linear monopole array with velocity and
temperature profile
In the following the linear loudspeaker setup of Sec. IV B
is extended by an inhomogeneous velocity and temperature
profile, cf. Sec. III D. Inspired by a sunset festival situation in
a windy desert (cf. Fig. 1) we assume a hot ground with colder
air blowing above. Subsequently, the impact of these outdoor
FIG. 13. (Color online) CAA solution of the input gains for all speakers of
the linear array. The two vertical black lines define the sweep frequency
interval between 1.3 and 2.7 kHz.
FIG. 14. CAA solution of the input phase of the loudest speaker of the linear
array. All other speakers show similar behavior.
FIG. 15. (Color online) Difference between the CAA and the CDPS signal
gains of the speakers for the linear array.
FIG. 12. (Color online) CAA solution of the sound field radiated by a
strictly vertically stacked line array with five monopole sources. A snapshot
of pressure fluctuations of the central x1x2-plane (x
3
¼0.8 m, where all
speakers and microphones are located) is shown. All white dashed lines are
discussed in the caption of Fig. 6. The three virtual microphone positions
are visualized by magenta crosses: Microphone one is placed at [1.1, 1.1,
0.8] m, microphone two at [0.8, 0.8, 0.8] m, and microphone three is located
at [0.73, 0.63, 0.8] m.
FIG. 16. (Color online) Difference between the CAA and the CDPS signal
phases of the speakers for the linear array.
J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1781
conditions on sound propagation is discussed. The end of this
section deals with the optimization of the linear array in order
to reinforce the original target field by compensating the
impact of the velocity and the temperature profile.
Using the identical source signals as in Sec. IV B,and
the background flow profiles described in Sec. III D signifi-
cantly alter the sound propagation. At the ground x
2
¼0,
there is no flow and the speed of sound is equal to Sec. IV B.
However, with increasing wall distance x
2
>0, both, the
incoming wind and the decreasing sound speed delay the
sound propagation in the positive x
1
direction further and fur-
ther. In the time domain the resulting delayed pressure signal
at the mic 1 location is depicted as a dashed-dotted red line in
Fig. 19, while the original reference signal is colored solid
black. The time delay s
i
between the cases with and without
flow of the arrival time at a “mic i can be estimated by
sðLx1;c1;c;uÞ¼Lx1=ðcþuÞ
zfflfflfflfflfflfflffl}|fflfflfflfflfflfflffl{
with flow
Lx1=c1
zfflfflffl}|fflfflffl{
no flow
;
s1ð0:70 m;343 m=s;333:6m=s;13:2m=sÞ¼0:144 ms;
s2ð0:40 m;343 m=s;335:5m=s;11:6m=sÞ¼0:069 ms;
s3ð0:33 m;343 m=s;336:7m=s;10:5m=sÞ¼0:050 ms;
(21)
being Lx1the streamwise distance between speaker and mic.
The seven-time step shift [7/(48 kHz)] between the extrema
of the reference and the delayed curve in Fig. 19 matches s
1
.
In addition to the time domain representation, the time
shift is visible as a spectral phase shift in Fig. 20. A time
shift scorresponds to a spectral phase shift of D/=2p¼fs.
Hence, for the first approximation, we find a linear depen-
dency on the frequency at the microphone locations, origi-
nating from the time delays fs1;fs2;and fs3as a black,
blue, and green solid line, respectively. For example, at
2 kHz the time shift s
2
leads to a phase shift of 13.8%, fully
consistent with the CAA value (blue solid line). As expected,
the largest phase shift of around 35% occurs for the highest
frequency of mic 1. For setups with more speakers and a
larger audience region stronger deviations from the desired
sound field are expected.
FIG. 19. (Color online) Comparison of the pressure amplitudes in the time
domain at “mic 1” in the case with and without a flow (and temperature)
profile, and each case with and without enabled optimization.
FIG. 20. (Color online) Difference between the disturbed CAA and the ref-
erence CDPS phases of the three microphones for the linear array. Always
the same flow disturbance of a combined velocity and temperature back-
ground profile is used (Sec. III D). Solid lines correspond to non-optimized
CAA solutions, whereas dash-dotted lines are optimized CAA solutions.
FIG. 17. (Color online) Difference between the SPLs of the CAA and the
CDPS solver at three microphone locations described by the legend and
illustrated by Fig. 11.
FIG. 18. (Color online) Difference between the acoustic pressure phases of
the CAA and the CDPS solver at three microphone locations described by
the legend and illustrated by Fig. 11.
1782 J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al.
Strictly speaking, the simplified explanation of the intro-
duced time delay is only considering the effects of wave
propagation in x
1
-direction assuming a single speaker. More
generally, superposition effects of all five speakers including
fully three-dimensional wave propagation within a boundary
layer mean flow leads to more complex phenomena and pos-
sible local outliers of the predominantly linear phasing. One
example of this non-linear phasing occurs at mic 3 (Fig. 20).
Beside the phase shift, the amplitude error at the exam-
ined mic positions is increased between plus and minus 3 dB
as visible in Figs. 19 and 21. An explanation might be a con-
structive or destructive interference between the waves of
the five speakers, which also explains the spatial dependency
of this effect. Because of the locally varying wind and tem-
perature, the acoustic path lengths between a microphone
and each speaker differ. Among the three microphones pre-
sented here, mic 3 with non-linear phasing is the most sensi-
tive to this interference effect and displays the largest
amplitude change in Fig. 21.
After discussing the effects of a non-uniform mean flow
on sound propagation, now its optimization is addressed.
Using the adjoint-based approach, the previous no-flow
speaker signals sare modified in order to reinforce the origi-
nal target field thereby compensating the effects of the non-
uniform background. The resulting optimized sound fields
are measured at the same mic positions as before.
In the time domain (Fig. 19) the optimized dashed green
curve is clearly shifted towards the black reference curve. In
the frequency domain (dashed lines of Fig. 20) the relative
phase error (dashed lines) is at least halved. The SPL differ-
ences (dashed lines of Fig. 21) display only small or no
(frequency dependent) improvement compared to the not
optimized curves (solid lines). Meanwhile a SPL decrease
over all frequencies of up to 2 dB can be seen, with the only
exception of mic 3 above 2.5 kHz. In comparison, the phases
are proportionally further restored by the optimization algo-
rithm than the SPL. This is probably caused by the fact that a
phase shift of pin Eq. (6) has a greater impact on the objective
in contrast to a perturbation of the pressure magnitude.
It is worth noting that the deviation between the opti-
mized and the reference solution is only a measure to which
extent the adjoint solver is able to reproduce a given target
field with the speakers available under the present conditions
(wind, temperature effects). It is not a quality criterion of the
sound reinforcement system itself. However, there are two
possibilities, to include quality criteria in the optimization
process. First, most easily, by predefining the target field
according to the criteria. Or second, by extending the objec-
tive function Eq. (6) to meet additional criteria—besides the
sound field reproduction alone. Potential criteria among
others could be a homogeneous SPL coverage and avoidance
of undesired radiation, or a smooth spectral distribution.
In the present paper, the driving signals were optimized
for predefined fixed speaker locations. However, in general,
the adjoint approach is also capable to identify the optimal
speaker locations, possibly suggesting a more efficient setup
than the original alignment to reinforce the same reference
field, see Lemke et al. (2017) for a preliminary test to iden-
tify source locations in two dimensions.
V. CONCLUSION
An adjoint-based framework for determination of optimal
driving functions of monopole speakers for sound reinforce-
ment applications was presented. The time-domain approach
allows the consideration of wind and other environmental con-
ditions like thermal stratification. It has been shown that the
approach is able to determine optimal driving functions, i.e.,
gain and delay, for different array configurations.
In practice, the obtained driving functions, including
environmental conditions, can be pre-computed, e.g., for dif-
ferent wind intensities. Based on on-site measurements the
sound field generation setup can then be switched between
the different functions.
In a next step, the adjoint-based approach will be ana-
lyzed in the context of complex speaker directivities. For
this purpose, monopole synthesis models are particularly
promising because the numerical effort of the adjoint-based
approach is independent of the number of sources.
ACKNOWLEDGMENTS
The authors gratefully acknowledge financial support by
the Deutsche Forschungsgemeinschaft (DFG) within the
projects Nos. LE 3888/2-1 and WE 4057/16-1.
APPENDIX
1. Conversion between the pressure forcing term of
the CDPS model and the CAA solver
While the CDPS model directly enforces the acoustic
pressure of a monopole speaker, the CAA solver indirectly
adds a source term s
p
to the pressure equation of Eq. (8).
FIG. 21. (Color online) Difference between the disturbed CAA and the
reference CDPS SPLs of the three microphones for the linear array with the
same flow disturbance as in Fig. 19. Again solid lines correspond to non-
optimized CAA solutions, whereas dash-dotted lines are optimized CAA
solutions.
J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1783
This section discusses, how to relate s
p
of the CAA solver
with the CDPS pressure signal of the monopole.
If the Euler equations [Eq. (8)] of the CAA solver
are linearized around a homogeneous background flow, the
wave equation
@2
tp0c2
0Dp0¼sðy;tÞ(A1)
follows, with the right-hand-side force term
sðy;tÞ¼ c1
ðÞ
@tspðy;tÞ:(A2)
The general solution of this inhomogeneous wave equation
in free space is known as
p0ðx;tÞ¼ 1
4pðR3
sðy;tjxyj=c0Þ
jxyjdy:(A3)
To ensure numerical stability, the CAA solver uses a distrib-
uted Gaussian speaker shape
spðy;tÞ¼As
4p
c1eðyy0Þ2r2eixðtÞt;(A4)
instead of monopole source, being x(t) the time-dependent
frequency function of the test signal. Strictly speaking, the
CAA speakers should be called Gaussian. Throughout the
present work, a point-like Gauss in relation to the shortest
acoustic wavelength (3 kHz) is used: r<k. Hence the CAA
speakers are approximate monopoles.
Combining the latter equations and executing the time
derivative provides
p0ðx;tÞ¼iðR3
spðy;sÞ@txðsÞs

jxyjdy;(A5)
being sðtÞ¼tjxyj=c0the delayed time between source
and observer position. Hence, the time derivative of the
eixðtÞtterm of the test signal causes a constant phase shift of
p(imaginary i) and a time, i.e., frequency dependent ampli-
tude relationship between the CAA source s
p
and the result-
ing pressure signal p0ðx;tÞ.
Alternatively, if both the CDPS signal s
CDPS
and the
CAA forcing s
p
are available, the frequency-dependent
amplitude relation is given by the relation jF½sCDPSðxÞj=
jF½spðxÞj, being Fthe discrete Fourier transform. This
amplitude relation is universal for all speakers. In the present
paper, the average of all speakers is taken into account.
Furthermore, one can show, that for low Mach numbers
M<0.1, the conversion curve is independent of the back-
ground flow. Hence, as soon as the amplitude conversion
curve is calculated for one signal and Gaussian speaker
shape without flow, it can also be applied in cases including
wind profiles or thermal stratification.
2. Adjoint equations
As discussed in the main part, linearization of the
governing Euler equations with respect to density, velocity,
and pressure around a given base state q
0
with q¼q0þdq
results in
@tAdqþ@xiBidqþCi@xidqþdCi@xic¼ds:(A6)
Therein, the summation convention applies. The lineariza-
tion matrices are given by
A¼
1 000 0
u1q00 0
u20q00
u300q0
0 000 1
c1
2
6
6
6
6
6
6
6
6
4
3
7
7
7
7
7
7
7
7
5
;
B1¼
u1q00 0
u2
12qu100 1
u1u2qu2qu100
u1u3qu30qu10
0cp
c100cu1
c1
2
6
6
6
6
6
6
6
6
4
3
7
7
7
7
7
7
7
7
5
;
B2¼
u20q00
u1u2qu2qu100
u2
202qu201
u2u30qu3qu20
00cp
c10cu2
c1
2
6
6
6
6
6
6
6
6
4
3
7
7
7
7
7
7
7
7
5
;
B3¼
u300 q0
u1u3qu30qu10
u2u30qu3qu20
u2
3002qu31
000cp
c1
cu3
c1
2
6
6
6
6
6
6
4
3
7
7
7
7
7
7
5
;
Ci¼
0000 0
0000 0
0000 0
0000 0
0000ui
2
6
6
6
6
6
6
4
3
7
7
7
7
7
7
5
;
dCi¼
0000 0
0000 0
0000 0
0000 0
0000dui
2
6
6
6
6
6
6
4
3
7
7
7
7
7
7
5
:
The full adjoint Navier-Stokes equations, in particular,
the friction terms, are derived and discussed in Lemke
(2015). The two-dimensional adjoint Euler-equations can be
found in Lemke et al. (2017).
Betlehem, T., and Withers, C. (2012). “Sound field reproduction with energy
constraint on loudspeaker weights,” IEEE Trans. Audio, Speech, Lang.
Process. 20(8), 2388–2392.
Bewley, T. R. (2001). “Flow control: New challenges for a new renais-
sance,” Progr. Aerosp. Sci. 37(1), 21–58.
1784 J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al.
Coleman, P., Jackson, P. J. B., Olik, M., Møller, M., Olsen, M., and
Pedersen, J. A. (2014). “Acoustic contrast, planarity and robustness of
sound zone methods using a circular loudspeaker array,” J. Acoust. Soc.
Am. 135(4), 1929–1940.
Feistel, S., Sempf, M., K
ohler, K., and Schmalle, H. (2013). “Adapting loud-
speaker array radiation to the venue using numerical optimization of FIR
filters,” in Proc. of the 135th Audio Eng. Soc. Conv., New York, paper
#8937.
Feistel, S., Thompson, A., and Ahnert, W. (2009). “Methods and limitations
of line source simulation,” J. Audio Eng. Soc. 57(6), 379–402.
Gaitonde, D. V., and Visbal, M. R. (2000). “Pade-type higher-order bound-
ary filters for the Navier-Stokes equations,” AIAA J. 38, 2103–2112.
Giles, M., and Pierce, N. (2000). “An introduction to the adjoint approach to
design,” Flow, Turbul. Combust. 65, 393–415.
Jameson, A. (1995). “Optimum aerodynamic design using CFD and control
theory,” AIAA Paper 1729, 124–131.
Leclerc, M. Y., and Foken, T. (2014). Footprints in Micrometeorology and
Ecology (Springer, Berlin).
Lele, S. K. (1992). “Compact finite difference schemes with spectral-like
resolution,” J. Comput. Phys. 103(1), 16–42.
Lemke, M. (2015). “Adjoint based data assimilation in compressible flows
with application to pressure determination from PIV data,” Ph.D. thesis,
Technische Universit
at Berlin, Berlin.
Lemke, M., Reiss, J., and Sesterhenn, J. (2014). “Adjoint based optimisation
of reactive compressible flows,” Combust. Flame 161(10), 2552–2564.
Lemke, M., and Sesterhenn, J. (2016). “Adjoint-based pressure determina-
tion from PIV data in compressible flows—Validation and assessment
based on synthetic data,” Eur. J. Mech. B Fluids 58, 29–38.
Lemke, M., Straube, F., Schultz, F., Sesterhenn, J., and Weinzierl, S. (2017).
“Adjoint-based time domain sound reinforcement,” in Audio Engineering
Society Conference: 2017 AES International Conference on Sound
Reinforcement pen Air Venues.
Mani, A. (2012). “Analysis and optimization of numerical sponge layers as
a nonreflective boundary treatment,” J. Comput. Phys. 231(2), 704–716.
Meyer, D. G. (1984). “Computer simulation of loudspeaker directivity,”
J. Audio Eng. Soc. 32(5), 294–315.
Meyer, P., and Schwenke, R. (2003). “Comparison of the directional point
source model and BEM model for arrayed loudspeakers,” Proc. Inst.
Acoust. 25(4), 1–10.
Mueller, S., and Massarani, P. (2001). “Transfer-function measurement with
sweeps,” J. Audio Eng. Soc. 49(6), 443–471.
Poinsot, T., and Lele, S. (1992). “Boundary conditions for direct simulations
of compressible viscous flows,” J. Comput. Phys. 101, 104–129.
Richards, P., and Hoxey, R. (1993). “Appropriate boundary conditions for
computational wind engineering models using the k-turbulence model,”
J. Wind Eng. Indust. Aerodyn. 46-47, 145–153.
Stein, L. (2019). “Simulation and modeling of a Helmholtz resonator under
grazing turbulent flow,” Ph.D. thesis, Technische Universit
at Berlin,
Berlin.
Straube, F., Schultz, F., Makarski, M., and Weinzierl, S. (2018). “Mixed
analytical-numerical filter design for optimized electronic control of line
source arrays,” J. Audio Eng. Soc. 66(9), 690–702.
Terrell, M., and Sandler, M. (2010). “Optimising the controls of a homoge-
nous loudspeaker array,” in Proc. of the 129th Audio Eng. Soc. Conv., San
Francisco, paper #8159.
Thompson, A. (2009). “Improved methods for controlling touring loud-
speaker arrays,” in Proc. of the 127th Audio Eng. Soc. Conv., New York,
paper #7828.
van Beuningen, G. W. J., and Start, E. W. (2000). “Optimizing directivity
properties of DSP controlled loudspeaker arrays,” Proc. Inst. Acoust.:
Reproduced Sound 22(6), 17–37.
Wendt, J. F. (2009). Computational Fluid Dynamics: An Introduction, 3rd
ed. (Springer, New York).
White, F. (2006). Viscous Fluid Flow, 3rd ed. (McGraw Hill, New York).
J. Acoust. Soc. Am. 146 (3), September 2019 Stein et al. 1785