Stefan Weinzierl, Steffen Lepa, Martin Thiering
The Language of Rooms: From Perception to
Cognition to Aesthetic Judgment
Open Access via institutional repository of Technische Universität Berlin
Document type
Book chapter | Accepted version
(i. e. final author-created version that incorporates referee comments and is the version accepted for
publication; also known as: Author’s Accepted Manuscript (AAM), Final Draft, Postprint)
This version is available at
https://doi.org/10.14279/depositonce-15239
Citation details
Weinzierl, S.; Lepa, S.; Thiering, M. (2020) The Language of Rooms: From Perception to Cognition to
Aesthetic Judgment. In: Blauert, J., Braasch, J. (eds) The Technology of Binaural Understanding. pp. 435–454
(Modern Acoustics and Signal Processing). Springer, Cham. https://doi.org/10.1007/978-3-030-00386-9_15.
Terms of use
This work is protected by copyright and/or related rights. You are free to use this work in any way permitted by
the copyright and related rights legislation that applies to your usage. For other uses, you must obtain
permission from the rights-holder(s).
Abstract Rooms are not perceptual objects themselves; they can only be perceived
through their effect on the presented signal, the sound source, and the human receiver.
An overview is provided of different approaches to identifying the qualities and the
dimensions of “room acoustical impression”, which have resulted in psychological
measuring instruments for room acoustical evaluation from the audience perspective.
It will be outlined how the psychoacoustic aspects of room acoustical perception are
embedded in a socio-cultural practice that leads to an aesthetic judgment on the
quality of performance venues for music and speech.
1 Language and Perception
The aim of this contribution is to highlight the relationship between the characteris-
tics of performance venues for music and speech and the language which is used to
describe them. On the one hand, an overview of different approaches to using lan-
guage as a “measuring instrument” for the qualities of these spaces will be provided.
On the other hand, there is an interest in what conclusions can be drawn from the
language used with respect to the characteristics of these spaces, the listeners using
this language, and the perceptual and cognitive processes involved.
These relationships will be looked at through the lens of a theoretical frame
describing the relationship between cultural artifacts (performance spaces), their
perception, and their linguistic encoding. This frame model, combining elements
of perceptual psychology and cognitive linguistics, assumes a perceptual front end,
where an external acoustical signal is transformed into neural activity by auditory
sensory organs. For hearing, this process takes place in the inner ear, where sound
pressure transmitted by the outer and middle ear is transduced into neural signals. In
a perceptual back end, these signals are integrated across different sensory modalities
and activate concepts, i.e. mental representations corresponding to abstract classes
of objects, which “tie our past experiences to our present interactions with the world”
(Murphy 2004, p. 1) and allow us, for example, to classify an audiovisual experience
in a certain social setting as a “concert” and a spatial environment of a specific size,
design, and acoustical properties as a “concert hall”. The result of this comparison
of sensory information with preconfigured categories (concepts) is called a percept.
The size and the structure of the concept repertoire as well as the matching process
with the sensory input depend on many personal and situational factors, including the
knowledge, the experience, the expectations, the motivation, and the attention of the
listener. Accordingly, the percept, such as “a successful concert in an acoustically
appropriate environment”, depends as much on these situational factors and the
conceptual repertoire of the individual listener as it depends on the sensory input at
this specific moment.
Stefan Weinzierl, Steffen Lepa, Martin Thiering
Fachgebiet Audiokommunikation, Technische Universität Berlin, Berlin, Germany
Corresponding author: S. Weinzierl
The relevance of language in this process has two important aspects. First, lan-
guage is—not the only, but the most important—“metrological” access to human
perception. From paired comparisons, similarity judgments, sorting tasks, multidi-
mensional scaling, semantic scales, to vocabulary profiling and related qualitative and
quantitative analyses: most studies of the auditory properties of performance venues
and sound description in general (Susini et al. 2011) have relied on language-related
tasks borrowed from the repertoire of methods of experimental psychology and quan-
titative and qualitative social research. Second, the language used can itself provide
information about the speaker’s conceptual representation of the world (Evans and
Green 2007, p. 5), such as about the spatial environments where listeners perceive
music or speech. The taxonomic organization and the privileged level of categorization
that are used in everyday language about spatial concepts are the results of
preceding experiences, and they also shape the lens through which new experiences are
observed. The preferred vocabulary about performance venues does not only reflect
the knowledge and the professional experience, e.g., of expert versus lay listeners;
it is also assumed to have a direct impact on their instantaneous perception and the
respective mental models (for an introduction to mental models see Johnson-Laird
1983; for a description of frame theory see Minsky 1977). This kind of “linguistic
relativism”, which has been evidenced for the languages of different ethnic groups by
many empirical observations in cognitive linguistics (Dabrowska and Divjak 2020;
Dancygier 2017; Everett 2013; Levinson 2020; Thiering 2018), also occurs on a small
scale in lay versus expert language, or when the languages of groups with
different kinds of expertise are compared, such as music listeners versus musicians.
Fig. 1 Musical events as a cultural, social, visual and acoustic experience. The language to describe
performance venues for music and speech reflects each of these domains. The image shows the
concept for a new concert hall, to be opened in Munich (© Cukrowicz Nachbaur Architekten)
2 Linguistic Inventories as a Basis for Psychological
Measuring Instruments in Room Acoustics
Throughout the first half of the 20th century, the investigation of room acoustical envi-
ronments was mainly focused on the effects of reverberation, with its dependence on
frequency, and with its control through volume and absorption according to Sabine’s
formula (Sabine 1900). Psychological experiments were already conducted by Sabine
himself. In 1902, he invited a number of musical experts to the then recently built
New England Conservatory of Music to judge the acoustic quality of piano instruc-
tion rooms while seat cushions were successively added to the rooms in order to
reduce their reverberation times. Sabine observed that the listeners judged all rooms
to be acoustically optimal if the reverberation time was within a quite narrow range
of tolerance, from which he concluded a common taste of surprising accuracy
for the acoustical conditions of musical performance venues (Sabine 1906). This
unexpected consensus on the appropriate acoustical conditions for classical concert
venues, which would hardly have been observed 100 years earlier (Weinzierl 2002),
can only be interpreted as the result of a cultural process which accompanied the
emergence of public concert life, and which was largely completed around 1900
(Tkaczyk and Weinzierl 2019).
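Sabine's formula, mentioned above, relates reverberation time to room volume and total absorption. As a minimal sketch, assuming metric units and the commonly quoted constant of 0.161 s/m, the effect of successively adding absorptive seat cushions can be illustrated as follows. The room volume, the base absorption and the absorption per cushion are hypothetical values chosen for illustration; they are not taken from Sabine's experiments.

```python
def sabine_rt60(volume_m3: float, absorption_m2: float) -> float:
    """Reverberation time in seconds by Sabine's formula, T = 0.161 * V / A.

    volume_m3: room volume in cubic metres.
    absorption_m2: total equivalent absorption area in square metres.
    """
    if volume_m3 <= 0 or absorption_m2 <= 0:
        raise ValueError("volume and absorption must be positive")
    return 0.161 * volume_m3 / absorption_m2


# Hypothetical small instruction room (illustrative values only).
ROOM_VOLUME_M3 = 200.0
BASE_ABSORPTION_M2 = 20.0
ABSORPTION_PER_CUSHION_M2 = 0.5  # assumed, not a measured value

for n_cushions in (0, 10, 20, 40):
    total = BASE_ABSORPTION_M2 + n_cushions * ABSORPTION_PER_CUSHION_M2
    print(f"{n_cushions:2d} cushions: T = {sabine_rt60(ROOM_VOLUME_M3, total):.2f} s")
```

Each added cushion increases the absorption area and therefore shortens the predicted reverberation time, which is the mechanism Sabine used to step through different acoustic conditions.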
While Sabine asked his subjects only whether the duration of reverberation was
appropriate, the British architect and acoustician Hope Bagenal carried out simi-
lar experiments with musicians as test participants, who were asked to assess the
effect of room acoustic conditions on different sound qualities including “reverbera-
tion” (too long/too short), “tone” (full/bright/rich/soft), “tone” (hard/thin/dead/dull),
“loudness” (sense of power, body of tone), “reinforcement of notes” (even/uneven)
and “conditions”, by which he asked his subjects to name specific halls which resem-
ble the conditions reached (Bagenal 1925). Even if this scheme was not consistently
adhered to by his subjects, it can be considered as the first attempt to have the acous-
tics of concert halls evaluated by a semantic differential, covering the room acoustical
effect on spatial, dynamic and timbral dimensions, and considering the effect of typicality
with respect to prototypical reference halls. The first standard with guidelines
for auditorium acoustics, issued in 1926 by the American Bureau of Standards, how-
ever, was mainly focused on specifying an optimal range of reverberation times for
halls of different sizes, along with recommendations of how to reach these values
(Bureau of Standards 1926).
After 1950, an increasing awareness can be observed that an optimal reverberation
time alone is no guarantee for a successful room-acoustical design, and that “rever-
berance” should not form the only criterion for the perceptual assessment of halls.
The British Broadcasting Corporation (BBC) and the Acoustics Group of the Physical
Society sponsored a number of experiments, where an identical repertoire (Don Juan
by R. Strauss) was performed and recorded in four different British concert halls.
Subjects listening to the different recordings were then asked to produce a ranking
of these halls concerning tonal quality, definition, and overall preference. Consis-
tent rankings, however, could not be observed, and practising musicians exhibited
preferences that were different from other skilled listeners (Somerville 1953).
In a further study, a glossary of 14 acoustic terms was collected, which were,
according to the authors, commonly used to describe the qualities of concert halls and
recording studios (see the entry for Somerville and Gilford (1957) in Table 1). A similar
list of 18 attributes was proposed by Beranek (1914–2016) in his landmark book on
Music, Acoustics and Architecture, along with relations between these perceptual
qualities and physical properties of the hall, which were based on his intuition and
experience (Beranek 1962).
The two lists of attributes, along with a third, originally German list described
below, demonstrate the grown awareness for the multidimensional impact of room
acoustical conditions on the perceived sound qualities in these halls. At the same
time, the studies reflect an awareness for the need to separate the physical and the
perceptual domain more clearly. Somerville and Gilford emphasized that the subject
under investigation is purely aesthetic and therefore must begin and end with human
aesthetic judgments (Somerville and Gilford 1957, p. 171). Nevertheless, their list
is a mix of perceptual and physical items, including aspects such as “scattering”
or “standing wave system” without an obvious equivalent in the perceptual domain.
Beranek’s features, on the other hand, are psychological throughout, at least if aspects
such as “ensemble” and “dynamic range” are understood as “perceived ensemble”
and “perceived dynamic range”.
The way to analyze the results of questionnaire studies constructed on the basis
of these terms was paved by the psychological fundamentals of the use of semantic
differentials (Osgood et al. 1957) and the statistical techniques of multidimensional
scaling (MDS, Torgerson 1952) and factor analysis (Spearman and Jones 1950), all
of which were introduced during the 1950s. One of the first applications of these new
tools was made in a study by Hawkes and Douglas (1971). Sixteen attributes inspired
by Beranek’s list, shown in Table 1, were used for a questionnaire applied in four
different British Concert Halls (with different musical programs and performers), in
the Royal Festival Hall, London, with the newly installed Assisted Resonance system
in different technical settings, and in the Royal Festival Hall at 23 different positions.
With interest in identifying the different dimensions of acoustic experience (Hawkes
and Douglas 1971, p. 249), the authors applied both MDS and factor analyses, finding
4–6 orthogonal factors. The solutions they obtained, however, were different both
for the different stimulus settings and for the different types of analyses, i.e., both
the number of factors and the relation of factors and items were different for each of
the sub-studies. This problem will be further addressed in Sect. 3.
Table 1 Attributes addressed in early investigations on the differential qualities of room-acoustical
environments. The translation of the originally German attributes by Wilkens (1977) was adopted
from Kahle (1995)
No. | Somerville and Gilford (1957) | Beranek (1962)                   | Wilkens (1977)
1   | Balance                       | Intimacy                         | Small/large
2   | Bass masking                  | Liveness                         | Pleasant/unpleasant
3   | Coloration                    | Warmth                           | Unclear/clear
4   | Deadness                      | Loudness of the direct sound     | Soft/hard
5   | Definition                    | Loudness of the reverberant sound | Brilliant/dull
6   | Diffusion                     | Definition/Clarity               | Rounded/pointed
7   | Echoes                        | Brilliance                       | Vigorous/muted
8   | Flutter echoes                | Diffusion                        | Appealing/unappealing
9   | Liveness                      | Balance                          | Blunt/sharp
10  | Pitch changes                 | Blend                            | Diffuse/concentrated
11  | Scattering                    | Ensemble                         | Overbearing/reticent
12  | Singing tone                  | Immediacy of response            | Light/dark
13  | Slap back                     | Texture                          | Muddy/clear
14  | Standing-wave system          | Freedom from echo                | Dry/reverberant
15  | -                             | Freedom from noise               | Weak/strong
16  | -                             | Dynamic range                    | Treble emphasized/not emphasized
17  | -                             | Tonal quality                    | Bass emphasized/not emphasized
18  | -                             | Uniformity throughout the hall   | Beautiful/ugly
19  | -                             | -                                | Soft/loud
While Hawkes and Douglas collected data in the field, i.e., by interviewing con-
certgoers, Lehmann and Wilkens (1980) used an experimental approach by present-
ing dummy-head recordings of the Berlin Philharmonic Orchestra in six different
halls and capturing the assessment of subjects on a semantic differential with 19
different attributes shown in Table 1. These were selected from a list of originally
27 items by eliminating those with an excessive inter-rater variance, indicating an
inconsistent interpretation between subjects. The factor analysis of the ratings delivered
three orthogonal factors, explaining 89% of the total variance. Considering the
weights of the original attributes on these variables, these were interpreted as strength
and extension of the sound source, definition and timbre of the overall sound (Wilkens
1977). In an attempt to overcome the limitations of the experimental approach, which lacks
the ecological validity of live concert situations, Sotiropoulou et al. (1995) used
a questionnaire to be rated at three concerts in two different concert halls in London.
Similar to Lehmann and Wilkens (1980), they started with a larger vocabulary of
about 100 labels, which was reduced to 54 bipolar attributes based on a relevance
rating collected in pretests. Analysing the ratings of about 80 participants by factor
analysis, the authors obtained four factors explaining roughly 66% of the total variance,
which they interpreted as body, clarity, tonal quality, and proximity. In both
experiments it became obvious that linguistic descriptors are not necessarily suitable
as measuring instruments if their meaning is inconsistently interpreted by different
raters or if their immediate relationship to the perceptual object under consideration
is not assured.
The numerous investigations dedicated to finding suitable technical parameters
to predict specific perceptual categories are not the subject of this contribution. Only
some of them are also interesting here because they highlighted the importance of
specific perceptual aspects which did not appear in earlier studies (compare Table 1).
Most importantly, a group of studies emphasized the importance of spatial aspects
of room acoustics, in particular of an increased perceived “source width” and the
perceived acoustic “envelopment” of the auditorium (Barron 1971; Barron and Mar-
shall 1981; Bradley and Soulodre 1995). All of these studies, however, employed
synthetic sound fields created by loudspeakers in the anechoic chamber, and asked
participants to evaluate these qualities as isolated items. Since they were not evalu-
ated as part of a multidimensional measuring instrument, it is not apparent to what
extent they form independent aspects of the room acoustical impression, or whether
they are physically or perceptually correlated to other aspects.
In this context, it is essential to bear in mind that the ratings of two objects can
be correlated because two labels refer to similar perceptual impressions (such as
“loudness” and “strength” of sound), or because different perceptual qualities co-
vary in the physical objects of the stimulus pool. For example, rooms providing more
“reverberance” could—for physical reasons—always provide more “envelopment”,
although the perceptual concepts are clearly different.
After 2000, the study of the perceptual space of room acoustic conditions as a
whole attracted a renewed interest directed to the individual vocabularies used to
describe room acoustic conditions. Several studies, first aiming at the evaluation of
spatial-audio reproduction systems (Berg and Rumsey 2006) and then also on the
perception of natural acoustical environments, included a qualitative part for the ver-
bal elicitation of the terminology and a quantitative part for the statistical analysis of
the generated terms, making it possible to identify clusters of attributes with a similar meaning.
The first of two studies conducted at Aalto University, Helsinki, produced room-acoustical
stimuli by impulse-response measurements of a loudspeaker orchestra
in three different concert halls, encoded in Ambisonics B-Format, processed with
directional audio coding (DirAC, Pulkki 2007) and reproduced by a 16-channel
loudspeaker system. A second study used impulse responses of eight different concert
halls, encoded in Ambisonics B-Format, processed with the spatial impulse-response-rendering
algorithm (SIRR; Merimaa and Pulkki 2005) and reproduced by
a 14-channel loudspeaker system. Analysing the large set of about 100 individual
attributes generated and rated by 20 and 23 participants in the two studies, two and
three main components, respectively, could be extracted, explaining 66% and 67% of the
total variance. These were interpreted as loudness/distance and reverberance in the first
study (Lokki et al. 2011), and as loudness, envelopment and reverberance; bassiness
and proximity; and definition and clarity in the second study (Lokki et al. 2012).
Fig. 2 Wheel of concert hall acoustics (Kuusinen and Lokki 2017)
As a summary of their own and other work, authors of the same group suggested
a “wheel of concert hall acoustics”, including eight main categories and 33 items
to visualize the main perceptual aspects of concert halls with unamplified musical
instrument sounds (Kuusinen and Lokki 2017). The wheel format, which has a longer
tradition in the domain of food quality and sensory evaluation (compare Noble et al.
1987) is a structured and hierarchical form to present a lexicon of different sensory
characteristics. Pedersen and Zacharov (2015) used the wheel to present such a
lexicon for reproduced sound, with the selection of the items and the structure of
the wheel based on hierarchical cluster analysis and measures for discrimination,
reliability, and inter-rater agreement of the individual items, an empirical basis
which was not provided in Kuusinen and Lokki’s original wheel for concert halls
(see Fig. 2).
3 Psychometrics and Scale Development in Room Acoustics
The studies summarized above have, in different ways, confirmed the multidimen-
sional character of room acoustical conditions as a mediator for the sound qualities
of music and speech, and they provided different lists of attributes to describe these
qualities. As underlying dimensions of the room acoustical impression, the loud-
ness or strength and the reverberance of rooms were consistently extracted from the
ratings of these attributes, as well as a factor for the timbre of the room. None of
these studies, however, attempted to construct a standardized measuring instrument
that can be used with different groups of listeners to describe the whole width of the
room-acoustic perception space. From today’s point of view, this was impossible due
to shortcomings in terms of the experimental stimuli, the participating subjects and
the statistical analysis techniques employed in these works. According to modern
standards in social research, any psychological measurement instrument has to meet
established quality criteria in terms of reliability, validity, and invariance concerning
a typical sample of users and stimuli. To establish perceptual dimensions, the investi-
gation has to take care of representativeness and breadth of stimuli, possible hidden
confounders and a proper sample selection in order to prevent bias concerning the
generalizability of results.
A first problem that pertains to most of the early studies in acoustical room impres-
sion measurement is the lack of experimental control concerning the stimuli pre-
sented. Studies that collected ratings in physically existing rooms always risked the
influence of hidden confounders such as the audio content, the visual impression, or
the musical performance, all of which co-vary with the auditory impression of the
room. Presenting the whole breadth of possible room acoustical conditions while
keeping these confounding variables constant seems only possible with state-of-
the-art technologies for auralization. It is, of course, true that such an experimental
approach cannot account for all visual, architectural, and social aspects that constitute
the multi-modal impression of a concert venue. To determine the acoustical
properties of a room, however, these influences act as confounders increasing the
measurement error of the test.
One may ask to what extent full control of all non-acoustic factors makes sense,
since, for example, an interaction between the room-acoustic conditions and the
playing style of musical performers is also present in the real situation and thus not
an experimental artifact in the narrower sense. However, incorporating such inter-
action into the experimental design might, on the one hand, conceal characteristics
of space if, for example, musicians were tempted to compensate for the effect of
space by adjusting their timbre and volume in the opposite direction. On the other
hand, the effect of space on the performer’s playing has proven to be quite individ-
ual. Different musicians react in very different ways to room-acoustic conditions
(Schärer Kalkandjiev and Weinzierl 2013, 2015; Luizard et al. 2020). The consideration
of the interaction of space and performance would, therefore, face the problem
of which reaction pattern should be used as a basis here. Nevertheless, the final
question that must be taken into account regarding the overall aesthetic judgment
of space—“Is the room suitable for a musical content?”—cannot be conclusively
evaluated in the laboratory.
A second challenge lies in the sample of rooms presented in terms of represen-
tativeness. The identification of latent dimensions of room acoustical perception,
i.e., a stable factor-analytic solution of the measured data, which is valid beyond the
specific sample of rooms used in the test, cannot be expected for a too-small set of
stimuli. In order to identify five largely independent perceptual dimensions, a set of
at least $2^5 = 32$ stimuli would be required so that all perceptual qualities can be varied
systematically and independently from each other and hence can be adequately
identified by factor analysis. Furthermore, only with a sufficiently large sample of
rooms, the results can be considered representative of the targeted population of room
acoustical conditions. Comparing these requirements with the sample sizes used in
the studies mentioned above, with typically less than ten rooms, it becomes evident
that neither the dimension of the perceptual space, i.e., the number of latent variables,
nor the structure and interpretation of the adopted factor solution could be reliably
determined.
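The stimulus count given above corresponds to fully crossing two levels of each of five dimensions. As a minimal sketch, assuming each perceptual dimension is reduced to a low (-1) and a high (+1) level, the following enumerates such a design and checks that every pair of dimensions is uncorrelated across the stimulus set. The function names are illustrative and do not come from the studies discussed.

```python
from itertools import product


def two_level_design(n_dimensions: int) -> list[tuple[int, ...]]:
    """All combinations of a low (-1) and a high (+1) level per dimension."""
    return list(product((-1, 1), repeat=n_dimensions))


def columns_are_orthogonal(design) -> bool:
    """True if every pair of dimensions is uncorrelated across the design."""
    n_dims = len(design[0])
    return all(
        sum(row[i] * row[j] for row in design) == 0
        for i in range(n_dims)
        for j in range(i + 1, n_dims)
    )


design = two_level_design(5)
print(len(design))                      # number of stimuli in the full crossing
print(columns_are_orthogonal(design))   # do the dimensions vary independently?
```

In an arbitrary small subset of these combinations, some dimensions will typically co-vary, which is the situation described above for studies with fewer than ten rooms.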
A third challenge is the size of the sample of listeners. In order to reliably assess
the dimensionality of perceptual constructs represented by questionnaire item bat-
teries by factor analyses, it is recommended to use sample sizes of at least 100–200
subjects or at least a sample of three times the number of employed items, even
in the favorable case of a good fit of attributes and factors (high communality). If
this requirement is not met (and it never was in the aforementioned studies), stable
solutions can be expected only for the most critical factors, i.e., those carrying most
of the variance, while all other factors, representing the more subtle aspects of room
acoustic impression, are affected by a large sampling error (MacCallum et al. 1999).
Finally, any psychological measuring instrument requires an analysis of the psy-
chometric qualities of the perceptual constructs and questionnaires based on them
in terms of validity, reliability, and measurement invariance. Techniques and crite-
ria for this purpose have been developed extensively in the social sciences (Vooris
and Clavio 2017). Typical requirements for psychological questionnaire instruments
comprise the use of latent measurement models, the demonstration of convergent and
discriminant validity (“Do the scale’s subdimensions actually measure what they
are supposed to measure and are sub-dimensions sufficiently different from each
other?”), as well as demonstrations of sufficient construct reliability (“How precisely
does the scale measure?”) and measurement invariance (Millsap 2011) across time,
stimuli, and populations of interest (“Are the scale’s measurements independent of
the experimental factors employed?”).
In order to achieve and demonstrate an acceptable degree of validity, reliability,
and measurement invariance, and to deal with different sources of measurement error
(Schmidt and Hunter 1999), psychological scale development today typically relies
on latent-variable models (Loehlin 2004). In this approach, it is assumed that every
manifest measurement of a questionnaire item is, in fact, the expression of underlying
latent psychological constructs. When at least three items are measured for any con-
struct exclusively (“simple structure”), it is possible to not only estimate the degree
of item-measurement error, construct loading and resulting construct reliability, but
also to calculate error-free construct scores and work with these in later analyses. The
latent-variable approach also allows checking for reliability across time (retest relia-
bility), which is considered the most important indicator of reliability for scales since
Cronbach (1947), and invariance across experimental conditions. Since past studies
on room acoustics predominantly drew on principal-component analyses (PCA) and
clustering techniques, where none of these tests is possible (Fabrigar et al. 1999),
only a minority of reliability-, validity-, and invariance-related questions could be
addressed systematically.
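The latent-variable idea described above can be illustrated with a small simulation. This is a sketch under simplifying assumptions (one latent construct, standardized items, normally distributed and uncorrelated errors), with hypothetical loadings; it is not the estimation procedure used in any of the cited studies. Each simulated rating is the sum of a loading times the latent score and an item-specific measurement error.

```python
import math
import random


def simulate_items(loadings, n_raters, seed=0):
    """Simulate standardized item ratings driven by one latent construct.

    Each item is loading * latent score + measurement error, scaled so
    that every item has unit variance.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_raters):
        latent = rng.gauss(0.0, 1.0)
        data.append([
            lam * latent + math.sqrt(1.0 - lam ** 2) * rng.gauss(0.0, 1.0)
            for lam in loadings
        ])
    return data


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


LOADINGS = [0.9, 0.7, 0.5]  # hypothetical construct loadings
ratings = simulate_items(LOADINGS, n_raters=20000)
item_a = [row[0] for row in ratings]
item_b = [row[1] for row in ratings]
# Under this model the expected correlation is the product of the loadings.
print(round(pearson(item_a, item_b), 2), LOADINGS[0] * LOADINGS[1])
```

The observed correlation between two items stays well below 1 even though both reflect the same construct, because each item carries its own measurement error. A latent-variable model separates the two sources of variance.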
4 The Room-Acoustical-Quality Inventory (RAQI)
A multi-stage investigation was conducted by the Audio-Communication Group
at TU Berlin to develop a language-based measuring instrument for the different
qualities of room-acoustical environments for music and speech and to address the
methodological gaps described above (Weinzierl et al. 2018). In a first step, expert
knowledge from different professional domains in room acoustics was acquired by
help of a focus group in order to provide a comprehensive terminology covering
all aspects of the room-acoustical impact on music and speech performances. In a
second step, listening experiments with acoustical experts and non-specialists were
conducted using 35 rooms of different architectural types, different size, and different
average absorption values in order to address the most important types of acoustic
performance venues and their specifics. Different audio content was used, including
solo music, orchestral music, and dramatic speech. The goals of the subsequent
statistical analyses were to:
- Find an exhaustive list of verbal attributes that describes all relevant room acoustical
  properties
- Identify the best-suited items of this list to form a standardized measurement
  instrument
- Analyze the underlying dimensions of room-acoustical impressions
- Construct a measurement instrument based on these dimensions and corresponding
  items
- Demonstrate the reliability of the new instrument across and within raters
- Demonstrate measurement invariance of the new instrument across experimental
  conditions such as audio content type and subject samples
- Demonstrate sufficient discriminant validity of its subdimensions.
In order to realize this in an experimental setting that permitted controlling for any
possible confounders, the study drew on room-acoustical simulation and auralization
by dynamic binaural synthesis. The consensus vocabulary generated by the expert
focus group consisted of 50 perceptual qualities related to the timbre, geometry,
reverberation, temporal behavior, and dynamic behavior of room-acoustical environ-
ments, as well as overall, holistic qualities. While some attributes reflect lower-order
qualities closely related to temporal or spectral properties of the audio signal (“loudness”,
“treble/mid/bass range tone color”, perceived “size” and “width” of sound
sources), other attributes reflect higher-order psychological constructs, supra-modal,
affective, cognitive, aesthetic, or attitudinal aspects such as “clarity”, “intimacy”,
“liveliness”, “speech intelligibility”, “spatial transparency”, or “ease of listening”
(Weinzierl et al. 2018, SuppPub 1). For the listening experiment, binaural room-
impulse response (BRIR) datasets were simulated for 35 rooms at 2 listening posi-
tions for solo music and speech. For the orchestral piece, 25 rooms at 2 listening
positions were selected, leaving out 10 rooms where the stage area would not be
large enough for an orchestra. Thus, in total, 190 room-acoustical conditions (rooms
× listening positions × source characteristics) were simulated for the listening experiment.
Fourteen of these 190 possible stimulus combinations were rated by each of
the 190 participants in a balanced incomplete block design, using 46 items selected
from the focus group terminology.
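The stimulus count above can be retraced with a few lines of arithmetic. The sketch below also shows one simple way to distribute 14 stimuli to each of 190 participants so that every stimulus receives the same number of ratings. The cyclic assignment and its step size are assumptions made for illustration; the text does not specify how the blocks of the balanced incomplete block design were actually constructed.

```python
from collections import Counter

# Counts taken from the text above.
N_PARTICIPANTS = 190
STIMULI_PER_PARTICIPANT = 14

solo_and_speech = 35 * 2 * 2   # rooms x positions x (solo music, speech)
orchestra = 25 * 2 * 1         # rooms x positions x orchestral piece
n_conditions = solo_and_speech + orchestra


def cyclic_blocks(n_stimuli, n_participants, block_size, step=13):
    """Assign stimuli to participants so that each stimulus is rated equally often.

    Hypothetical scheme for illustration; it balances only the number of
    ratings per stimulus, not how often pairs of stimuli share a participant.
    """
    return [
        [(p + j * step) % n_stimuli for j in range(block_size)]
        for p in range(n_participants)
    ]


blocks = cyclic_blocks(n_conditions, N_PARTICIPANTS, STIMULI_PER_PARTICIPANT)
ratings_per_stimulus = Counter(s for block in blocks for s in block)
print(n_conditions, set(ratings_per_stimulus.values()))
```

With 190 participants rating 14 stimuli each, there are 2660 ratings per item in total, so an evenly spread design yields 14 ratings for each of the 190 conditions.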
An exploratory factor analysis (EFA) based on the common factor approach was
conducted to estimate the number of independent latent dimensions contained in the
full-item data matrix. The scree and Kaiser criteria were used as a starting point
for constructing a multidimensional measurement model. For each of the possible
solutions, a series of confirmatory factor analyses (CFA) was conducted to consec-
utively remove single items from the measurement models up to a point where an
implied removal would have led to less than three items per factor, or otherwise, the
overall fit of the measurement model was already good. The latter was read from
Root-Mean-Square Errors of Approximation (RMSEA), Comparative Fit Indices
(CFIs), and Standardized Root-Mean-Square Residual (SRMR) coefficients, as well
as congeneric Construct Reliability (CR), indicating the internal consistency of a
factor construct, and Average Variance Extracted (AVE) indicating how well a factor
explains the scores of its underlying items (Fornell and Larcker 1981).
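The two factor-level coefficients named above can be computed directly from standardized factor loadings. The following is a minimal sketch of the formulas given by Fornell and Larcker (1981), assuming standardized loadings and uncorrelated item errors; the loadings are hypothetical and are not RAQI estimates.

```python
def construct_reliability(loadings):
    """Congeneric construct reliability (CR) from standardized loadings.

    CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    where the error variance of a standardized item is 1 - loading^2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1.0 - lam ** 2 for lam in loadings)
    return squared_sum / (squared_sum + error_variance)


def average_variance_extracted(loadings):
    """Average variance extracted (AVE): the mean squared standardized loading."""
    return sum(lam ** 2 for lam in loadings) / len(loadings)


# Hypothetical three-item factor (illustrative loadings only).
example_loadings = [0.8, 0.7, 0.9]
print(f"CR  = {construct_reliability(example_loadings):.3f}")
print(f"AVE = {average_variance_extracted(example_loadings):.3f}")
```

Higher loadings raise both coefficients. CR also grows with the number of items, whereas AVE depends only on how strongly the items load on average.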
The factor analysis suggests possible solutions with 4, 6 or 9 factors. These can be
interpreted as a general room-acoustical quality factor, strength, reverberance, and
brilliance (4-factor solution), extended by irregular decay and coloration (6-factor
solution), and by clarity, liveliness and intimacy (9-factor solution). The corresponding item batteries consist
of 14, 20, and 29 attributes as shown in Fig. 3. From a statistical point of view, the
6-factor RAQI scale with 20 items is the best compromise between a comprehensive
assessment of the full complexity of room-acoustical impressions and sufficient
statistical independence of the different factors.
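As an illustration of the Kaiser criterion, which served as one starting point for the number of factors, the following minimal sketch counts the eigenvalues of an item correlation matrix that exceed 1. The data are synthetic (two hypothetical latent factors with three items each), and the function name kaiser_count is an assumption made for this example; it is not the authors' analysis pipeline.

```python
import numpy as np

def kaiser_count(data):
    """Number of correlation-matrix eigenvalues above 1, plus the eigenvalues."""
    corr = np.corrcoef(data, rowvar=False)        # items are in columns
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted in descending order
    return int(np.sum(eigenvalues > 1.0)), eigenvalues

# Synthetic ratings: 500 observations, 6 items, 2 hypothetical latent factors.
rng = np.random.default_rng(0)
n_obs = 500
factors = rng.standard_normal((n_obs, 2))
loadings = np.array([[0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
                     [0.0, 0.8], [0.0, 0.8], [0.0, 0.8]])
noise = 0.6 * rng.standard_normal((n_obs, 6))
items = factors @ loadings.T + noise

n_factors, eigenvalues = kaiser_count(items)
print(n_factors)
print(np.round(eigenvalues, 2))
```

In practice such a count is only a first indication, which is why the text combines it with the scree criterion and subsequent confirmatory analyses.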
With strength and reverberance, two of the sub-dimensions are omnipresent in the
room-acoustical literature. Also, clarity and intimacy as additional factors have been
frequently highlighted by previous studies (Hawkes and Douglas 1971; Lokki et al.
2012). With brilliance, coloration, and intimacy appearing as largely independent
factors, it seems that timbre-related qualities play a greater role with more dimensions
than previously assumed. The importance of perceived irregularities in decay and of
liveliness as an independent construct has, however, hardly been considered so far.
Fig. 3 The Room-Acoustical-Quality Inventory (RAQI). Four, six, and nine factors as possible
sub-dimensions of room acoustical impression can be measured with questionnaires containing 14,
20, and 29 items, which are given with corresponding poles. Weights (W) and Intercepts (I) should
be used to measure factors and for structural-equation analysis. Four additional single items with
high retest reliability, which could not be assigned to any of the factors, are given below
In terms of psychometric quality, the factors of the 6-factor RAQI exhibit good
across-rater consistency and within-rater stability. With regard to measurement
invariance, scalar measurement invariance across measurement occasions could be
demonstrated for a rather long interval of approximately 42 days. Scores from all
RAQI sub-dimensions can thus be directly compared across studies as long as exper-
imental conditions and test subject sample are identical. Similar results pertain to
changes in experimental listening position: Scores taken from different listening posi-
tions in the same room did not differ systematically. Although the acoustical transfer
functions might be quite different, as was demonstrated even for minor changes of
the listening position (de Vries et al. 2001), listeners are able to identify the room
and its acoustical properties as a consistent cognitive object.
5 Low Retest Reliabilities of Experts versus Laymen:
A Problem of Language or a Problem of Perception?
As part of the RAQI development study, the listening test was repeated six weeks
later with 88 of the 190 subjects, using identical stimuli. Based on these data,
test-retest reliabilities could be determined, calculated as the correlation of measurements
within individuals across time, as a measure of the precision with which certain
room-acoustical features could be evaluated. For a majority of the 46 items rated by
all participants, the reliabilities turned out to be rather low. Only three items related to
reverberation, “reverberance”, “strength”, and “duration of reverberation”, exceeded
values of r = 0.7, which is usually considered a criterion of good reliability.
Many other items, including popular ones in room acoustics such as “sharpness”
or “transparency”, turned out to be based on somewhat unreliable judgements (r =
0.37/0.43), using the variation over time within subjects as an indicator.
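The retest reliability described above can be sketched as a Pearson correlation between the ratings of one item in the first and in the second session. The example below uses synthetic ratings and a hypothetical helper named retest_reliability; how ratings were paired and aggregated in the actual study is not specified here, so the pairing is an assumption.

```python
import numpy as np

def retest_reliability(first, second):
    """Pearson correlation between paired ratings from two sessions."""
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    return float(np.corrcoef(first, second)[0, 1])

# Synthetic example: a stable component plus session-specific error.
rng = np.random.default_rng(1)
n_pairs = 88                             # one paired rating per subject (assumed)
stable_part = rng.standard_normal(n_pairs)
session_1 = stable_part + 0.5 * rng.standard_normal(n_pairs)
session_2 = stable_part + 0.5 * rng.standard_normal(n_pairs)

r = retest_reliability(session_1, session_2)
print(f"r = {r:.2f}")
```

The larger the session-specific error relative to the stable component, the lower the resulting correlation, which corresponds to the random and transient errors discussed in this section.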
The low stability of most single item scores indicates that room acoustical impres-
sions appear to be strongly influenced by time-varying situational factors, such as
variations in attention, mental efficiency and distraction (random response errors)
and variations in mood, feeling and mindset (transient errors, Schmidt and Hunter
1999). The extent of these psychological measurement errors, however, also depends
on the expertise of the listeners. Since the aforementioned subject sample consisted
of 60 music-interested non-specialists and 28 individuals with professional education
in room acoustics, the relevance of this personal trait could be determined, showing
a mean retest reliability of 0.50 across all items for non-specialists versus 0.59 for
acoustical experts. Since there is no evidence for differences in the sensory perfor-
mance between the two groups, the reasons for this difference have to be sought in
the perceptual back-end, to pick up on a term from Sect. 1.
Laymen could assess properties that are clearly room-related and accessible
to everyday experience, such as the size of the room (r = 0.59 vs. 0.58 for experts
vs. non-experts), the occurrence of echoes (r = 0.67 vs. 0.68) and even the degree
of liking (r = 0.59 vs. 0.62), with the same, sometimes even better reliability than
experts. However, whenever the influence of the acoustical source and the influence
of the room on the same auditory qualities had to be separated, such as when rating
the “brightness” (r = 0.65 vs. 0.49) and the “bass-range characteristic” (r = 0.64
vs. 0.17), or the impact of the room on temporal clarity (r = 0.72 vs. 0.49) and
speech intelligibility (r = 0.70 vs. 0.58), experts clearly had the advantage. In
these cases, the ability to judge reliably depends on the extent to which the
performance and the room are separate cognitive objects attracting differentiated
attention. In addition, experts also cultivate a specialized vocabulary, including
attributes such as comb-filter coloration (r = 0.56 vs. 0.44) or spatial transparency
(r = 0.56 vs. 0.38), which are hardly required to describe everyday experiences with
music and speech.
Hence, the question of whether the higher precision of experts in evaluating the
acoustic properties of rooms is due to a better-trained perception or a more sophisti-
cated vocabulary points to the same interwoven phenomenon: The cognitive perfor-
mance in the separation of source and space requires a more sophisticated vocabulary,
but the sophisticated vocabulary, in turn, can lead to a more differentiated perception.
6 The Perceived Quality of Performance Spaces
Most studies dedicated to room acoustic qualities have, in one way or another, also
examined what makes up the overall quality of performance spaces. To avoid termi-
nological confusion between the two concepts, Blauert has suggested distinguishing
between the “aural quality” of a room and a set of “quality features” making up the
“aural character” of the room (Blauert 2013). In the following, however, we will
stick to a distinction between “quality” and “qualities”, because these are fairly well
established terms in the psychological literature.
Quality, in the sense of liking or preference, was already an issue in early investi-
gations aiming at preferred values for the reverberation time of concert halls (Sabine
1906; Watson 1923; Bagenal 1925; Sabine 1928; Knudsen 1931). Multidimensional
approaches have often tried to find correlations between the rating of individual
attributes or factors and the overall pleasantness (Wilkens 1977), the degree of enjoyment
(Hawkes and Douglas 1971) or the preference (Soulodre and Bradley 1995;
Lokki et al. 2012) of the room acoustical impression. The relation between the rat-
ing of individual qualities and overall quality judgements, however, turned out to
be dependent on many factors beyond the room acoustical properties, such as the
musical repertoire, the musical performance, and the taste of individual listeners.
Some of the studies could even identify different preference groups, one of which
preferred more reverberance while the other preferred more clarity and definition
(Lehmann and Wilkens 1980; Lokki et al. 2012).
The proportion of the variance in overall quality judgments which can be explained
only by acoustic qualities of the rooms themselves was estimated by the authors of
this chapter, based on the ratings of 190 participants, collected in the development of
the Room-Acoustical-Quality Inventory (Weinzierl et al. 2018). Since an unspecific
quality factor turned out to be one dimension in each factor solution, the relationship
of this general factor to the other dimensions could be estimated. For this purpose, the
measurement model of the RAQI was turned into an equivalent structural-equation
model regressing the scores of the quality factor on the other factors to estimate their
influence and the overall explained variance for the quality factor. Not only linear but
also quadratic influences were tested, since psychoacoustic influences on cognitive
percepts often show a U-shaped (or inverted-U-shaped) relationship, for example,
between preference and reverberation, where an optimal range and a decrease in
quality on both sides is the most plausible relation. To account for this, the LMS
approach for non-linear effects in structural-equation models was used (Harring
et al. 2012).
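The idea of testing a quadratic in addition to a linear influence can be illustrated with an ordinary least-squares fit on observed scores. This is a deliberately simplified stand-in: it is not the LMS approach for latent variables used in the study, and the data, coefficients, and variable names below are synthetic assumptions.

```python
import numpy as np

# Synthetic scores: quality peaks at a moderate reverberance value.
rng = np.random.default_rng(2)
reverberance = rng.uniform(-2.0, 2.0, 300)
quality = (0.3 * reverberance - 0.4 * reverberance**2
           + 0.2 * rng.standard_normal(300))

# Fit quality = b0 + b1 * x + b2 * x^2 (polyfit returns the highest power first).
b2, b1, b0 = np.polyfit(reverberance, quality, deg=2)

# A negative quadratic weight indicates an inverted-U relationship,
# whose vertex marks the estimated optimum.
optimum = -b1 / (2.0 * b2)
print(f"linear = {b1:.2f}, quadratic = {b2:.2f}, optimum = {optimum:.2f}")
```

A positive linear weight combined with a negative quadratic weight means that, within the sampled range, the stimuli lay on average below the optimum, while values far above it are again rated lower.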
In the 6-factor version of the RAQI shown in Fig. 3, the model was able to explain
about half of the variance in quality (see Fig. 4). While strength and brilliance scores
exhibit the largest positive influence on quality judgements, a decrease in quality
arises with higher coloration and higher irregular decay values. Reverberance had
both a linear and an inverted-U relationship to quality.
Fig. 4 Structural-Equation Model estimating the influence of five dimensions in the 6-factor
RAQI on the overall quality factor. Path coefficients are beta-weights/correlations. The parameters
indicating the fit of the model are explained in Sect. 4
One should be aware that the influence of the individual factors (indicated by
the beta weights) as well as the explanatory power of this quality model (indicated
by the coefficient of determination, R²) depends significantly on the properties of the
stimulus pool from which it is derived. The less the presented rooms and the music-
and-speech content correspond acoustically to the expectations of the listeners, the
higher will be the proportion of the overall quality that can be explained solely by
acoustic properties. Also, the sign and the value of the linear beta weights initially
only indicate in which direction and to what extent parameters, for example, the
rooms’ reverberance, deviated on average from the perceived optimum as seen from
the listeners’ point of view. In order to obtain values for these relationships that
correspond to a certain cultural practice, it is therefore vital to work with a stimulus
pool that is an adequate representation of this practice.
With the stimulus pool used, fifty percent of the variance in quality could be
explained by perceptual attributes of the room in the presented model. This part of the
variance can be considered as the context-independent part of the overall preference
judgements, not accounting for the musical repertoire, the musical performance, and
the individual taste of the listeners. To explain the preference of music listeners for
specific concert halls in a specific situation, a significantly extended model would
thus be required. Although various potential influencing factors related to the cultural
context of such an overall judgement have been proposed, for example, by Blauert
(2013), who pointed out the importance of the typicality of concert halls as a result
of two hundred years of Western concert culture, a comprehensive model for the
overall aesthetic impression of performance venues, validated by empirical data, has
not yet been proposed.
A promising candidate for such a model could emerge when taking into account
that judgments about concert halls are always embedded in music-cultural practices
in which the music piece, its performance, the performance space and the predispo-
sition of listeners are intricately interwoven, and in their entirety shape the aesthetic
judgment of a musical event. Thus, when music psychology examines the factors
that influence the aesthetic judgment of a concert performance, the spatial and social
context under which that judgment is made will always form a part of that judgment.
Listeners can try to consider individual aspects and, for example, try to analytically
separate the contributions of the sound source and the performance space to the
perceived sound event (Traer and McDermott 2016). However, this separation will
always be incomplete. Hence a reasonable approach will possibly lie in applying
models that have already been proven empirically to explain the aesthetic judgment
of music and music performances also to the evaluation of the venues in which music
is performed.
Such an approach is exemplified by the model of Juslin et al. (2016), shown
schematically in Fig. 5. It assumes that listeners make aesthetic judgments in particular
situations in which they adopt what the authors call an “aesthetic attitude”. It
is, not least, the concert ritual and the concert hall itself that encourage listeners to
adopt this attitude. Once this condition is met, aesthetic processing may be influenced
by several factors in the artwork, the perceiver, and the situation. These influences
are mediated through the perception, cognition, and emotion of the listener. For one
thing, a clear separation of these processes is difficult to draw, and the same musical
cues can be processed perceptually (i.e., as sensory impressions), cognitively (i.e.,
depending on conceptual knowledge) and emotionally (i.e., aroused by other psycho-
physiological mechanisms). However, even more important is the observation that
different listeners use different criteria that determine which of this information and
which of those channels have an impact on the resulting aesthetic judgment.
These criteria can be related to varying degrees to the musical work, to a
performance, and to a performance space. One of these criteria is beauty. Concerning
the performance space, beauty could be understood as the sum of the room acoustical
qualities, for which a multidimensional measuring instrument has been developed
with the RAQI (Fig. 3). Beauty, however, should not be identified with aesthetic value
in general. Other criteria such as the degree of originality of a musical event and the
related performance venue, the skill in its realization, the typicality with respect to
performance traditions, the degree of expression and emotional contagion, and the
message related to the socio-cultural connotations of the musical event, can play
a major role for many listeners. In the study of Juslin et al. (2016), most listeners
appeared to use a small number of three to five criteria in their judgments, and there
were significant individual differences among the listeners, both in how many and
which criteria were used.
It is tempting to assume that it is the individual choice of criteria which may
account not only for the different aesthetic judgments about music as an integrated
experience but also for the different judgments about musical performance venues.
Empirical verification of this assumption could be an essential contribution to a
problem that might have been considered for too long only from a psychoacoustic
perspective.
Fig. 5 A model for the formation of aesthetic judgments about music, according to Juslin et al.
(2016). The analysis of a musical event is channelled through the perception, cognition, and emotion
of the listener. Whether these inputs will affect the resulting aesthetic judgment depends on the
listener’s criteria, which act as filters for the processed information
Acknowledgements The work reported here was produced within the research unit on “Simulation
and Evaluation of Acoustical Environments (SEACEN)”, supported by the Deutsche Forschungs-
gemeinschaft (FOR 1557). The authors are indebted to all colleagues in this project who contributed
to this work, such as D. Ackermann, F. Brinkmann, D. Grigoriev, H. Helmholz, M. Ilse, O. Kokabi,
L. Aspöck, and M. Vorländer, as well as to Jens Blauert for many inspiring discussions on the topic.
Further, we want to thank two anonymous reviewers for their comments on this book chapter.
References
Bagenal, H. 1925. Designing for musical tone. Journal of the Royal Institute of British Architects
32 (20): 625–629.
Barron, M. 1971. The subjective effects of first reflections in concert halls–the need for lateral
reflections. Journal of Sound and Vibration 15 (4): 475–494.
Barron, M., and A.H. Marshall. 1981. Spatial impression due to early lateral reflections in concert
halls: the derivation of a physical measure. Journal of Sound and Vibration 77 (2): 211–232.
Beranek, L.L. 1962. Music, Acoustics & Architecture. New York: Wiley.
Berg, J., and F. Rumsey. 2006. Identification of quality attributes of spatial audio by repertory grid
technique. Journal of the Audio Engineering Society 54 (5): 365–379.
Blauert, J. 2013. Conceptual aspects regarding the qualification of spaces for aural performances.
Acta Acustica united with Acustica 99 (1): 1–13.
Bradley, J.S., and G.A. Soulodre. 1995. Objective measures of listener envelopment. Journal of the
Acoustical Society of America 98 (5): 2590–2597.
Bureau of Standards. 1926. Circular of the Bureau of Standards, No. 300. Architectural acoustics.
Washington: G.P.O. https://archive.org/details/circularofbureau300unse.
Cronbach, L.J. 1947. Test ‘reliability’: Its meaning and determination. Psychometrika 12 (1): 1–16.
https://doi.org/10.1007/BF02289289.
Dabrowska, E., and D. Divjak. Handbook of Cognitive Linguistics. Berlin, Boston: De Gruyter
Mouton.
Dancygier, B. 2017. The Cambridge Handbook of Cognitive Linguistics. Cambridge, UK: Cambridge
University Press.
de Vries, D., E.M. Hulsebos, and J. Baan. 2001. Spatial fluctuations in measures for spaciousness.
Journal of the Acoustical Society of America 110 (2): 947–954.
Evans, V., and M. Green. 2007. Cognitive Linguistics. An Introduction. Edinburgh: Edinburgh
University Press.
Everett, C. 2013. Linguistic Relativity: Evidence Across Languages and Cognitive Domains, vol.
25. Berlin/New York: De Gruyter Mouton.
Fabrigar, L.R., D.T. Wegener, R.C. MacCallum, and E.J. Strahan. 1999. Evaluating the use of
exploratory factor analysis in psychological research. Psychological Methods 4 (3): 272–299.
Fornell, C., and D.F. Larcker. 1981. Evaluating structural equation models with unobservable vari-
ables and measurement error. Journal of Marketing Research 18 (1): 39–50. https://doi.org/10.
1177/002224378101800104.
Harring, J.R., B.A. Weiss, and J.C. Hsu. 2012. A comparison of methods for estimating quadratic
effects in nonlinear structural equation models. Psychological Methods 17 (2): 193–214. https://
doi.org/10.1037/a0027539.
Hawkes, R.J., and H. Douglas. 1971. Subjective acoustic experience in concert auditoria. Acustica
24 (5): 235–250.
Johnson-Laird, P.N. 1983. Mental Models: Towards a Cognitive Science of Language, Inference,
and Consciousness. Cambridge: Harvard University Press.
Juslin, P.N., L.S. Sakka, G.T. Barradas, and S. Liljeström. 2016. No accounting for taste?: Idio-
graphic models of aesthetic judgment in music. Psychology of Aesthetics, Creativity, and the Arts
10 (2): 157–170.
Kahle, E. 1995. Validation d’un modèle objectif de la perception de la qualité acoustique dans
un ensemble de salles de concerts et d’opéras (Validation of a perceptual model of the acoustic
quality in an ensemble of concert halls and opera houses). Le Mans: Le Mans Universite. Ph.D.
thesis.
Knudsen, V.O. 1931. Acoustics of music rooms. Journal of the Acoustical Society of America 2:
434–467.
Kuusinen, A., and T. Lokki. 2017. Wheel of concert hall acoustics. Acta Acustica united with
Acustica 103 (2): 185–188.
Lehmann, P., and H. Wilkens. 1980. Zusammenhang subjektiver Beurteilungen von Konzertsälen
mit raumakustischen Kriterien (Relation between subjective evaluations of concert halls and
room-acoustical criteria). Acustica 45: 256–268.
Levinson, S.C. Space in Language and Cognition: Explorations in Cognitive Diversity. Cambridge:
Cambridge University Press.
Loehlin, J.C. 2004. Latent Variable Models: An Introduction to Factor, Path, and Structural Equation
Analysis. Mahwah: Routledge.
Lokki, T., J. Pätynen, A. Kuusinen, and S. Tervo. 2012. Disentangling preference ratings of concert
hall acoustics using subjective sensory profiles. Journal of the Acoustical Society of America 132
(5): 3148–3161.
Lokki, T., J. Pätynen, A. Kuusinen, H. Vertanen, and S. Tervo. 2011. Concert hall acoustics
assessment with individually elicited attributes. Journal of the Acoustical Society of America 130
(2): 835–849.
Luizard, P., J. Steffens, and S. Weinzierl. 2020. Singing in different rooms: Common or individual
adaptation patterns to the acoustic conditions? Journal of the Acoustical Society of America 147
(2): EL132–EL137.
MacCallum, R.C., K.F. Widaman, S. Zhang, and S. Hong. 1999. Sample size in factor analysis.
Psychological Methods 4 (1): 84–99.
Merimaa, J., and V. Pulkki. 2005. Spatial impulse response rendering I: Analysis and synthesis.
Journal of the Audio Engineering Society 53: 1115–1127.
Millsap, R.E. 2011. Statistical Approaches to Measurement Invariance. New York: Routledge.
Minsky, M. 1977. Frame-system theory. In Thinking. Readings in Cognitive Science, ed. P.N.
Johnson-Laird and P.C. Wason. Cambridge: Cambridge University Press.
Murphy, G. 2004. The Big Book of Concepts. MIT Press.
Noble, A.C., R.A. Arnold, J. Buechsenstein, E.J. Leach, J.O. Schmidt, and P.M. Stern. 1987. Mod-
ification of a standardized system of wine aroma terminology. American Journal of Enology and
Viticulture 38 (2): 143–146.
Osgood, C.E., G.J. Suci, and P.H. Tannenbaum. 1957. The Measurement of Meaning. Urbana, Ill.:
University of Illinois Press.
Pedersen, T.H., and N. Zacharov. 2015. The development of a sound wheel for reproduced sound.
Audio Engineering Society Convention 138, Preprint No. 9310.
Pulkki, V. 2007. Spatial sound reproduction with directional audio coding. Journal of the Audio
Engineering Society 55: 503–516.
Sabine, P.E. 1928. The acoustics of sound recording rooms. Transactions of the Society of Motion
Picture Engineers 12 (35): 809–822.
Sabine, W.C. 1900. Reverberation. The American Architect and Building News. 68: 3–5, 19–22,
35–37, 43–45, 59–61, 75–76, 83–84.
Sabine, W.C. 1906. The accuracy of musical taste in regard to architectural acoustics. Proceedings
of the American Academy of Arts and Sciences 42 (2): 53–58.
Schärer Kalkandjiev, Z., and S. Weinzierl. 2013. The influence of room acoustics on solo music
performance: An empirical case study. Acta Acustica united with Acustica 99: 433–441.
Schärer Kalkandjiev, Z., and S. Weinzierl. 2015. The influence of room acoustics on solo music
performance: An experimental study. Psychomusicology: Music, Mind, and Brain 25 (3): 195–207.
Schmidt, F.L., and J.E. Hunter. 1999. Theory testing and measurement error. Intelligence 27 (3):
183–198. https://doi.org/10.1016/S0160-2896(99)00024-0.
Somerville, T. 1953. Subjective comparisons of concert halls. BBC Quarterly 8: 125–128.
Somerville, T., and C.L.S. Gilford. 1957. Acoustics of large orchestral studios and concert halls.
Proceedings of the IEE 104: 85–97.
Sotiropoulou, A.G., R.J. Hawkes, and D.B. Fleming. 1995. Concert hall acoustic evaluations by
ordinary concert-goers: I, Multi-dimensional description of evaluations. Acta Acustica united
with Acustica 81 (1): 1–9.
Soulodre, G.A., and J.S. Bradley. 1995. Subjective evaluation of new room acoustic measures.
Journal of the Acoustical Society of America 98 (1): 294–301.
Spearman, C., and L.W. Jones. 1950. Human Ability. London: Macmillan.
Susini, P., G. Lemaitre, and S. McAdams. 2011. Psychological Measurement for Sound Description
and Evaluation. In Measurement with Persons: Theory, Methods, and Implementation Areas, eds.
B. Berglund and G. B. Rossi. New York: Psychology Press.
Thiering, M. 2018. Kognitive Semantik und Kognitive Anthropologie: Eine Einführung [Cognitive
semantics and cognitive anthropology: An introduction]. Berlin: De Gruyter.
Tkaczyk, V., and S. Weinzierl. 2019. Architectural acoustics and the trained ear in the arts: A journey
from 1780 to 1830. In The Oxford Handbook of Music Listening in the 19th and 20th Centuries,
eds. C. Thorau and H. Ziemer. New York, NY: Oxford University Press.
Torgerson, W.S. 1952. Multidimensional scaling: I. Theory and method. Psychometrika 17 (4):
401–419.
Traer, J., and J.H. McDermott. 2016. Statistics of natural reverberation enable perceptual separation
of sound and space. Proceedings of the National Academy of Sciences 113 (48): E7856–E7865.
Vooris, R., and G. Clavio. 2017. Scale development. In The International Encyclopedia of Commu-
nication Research Methods. eds. C.S.D.J. Matthes and R.F. Potter. American Cancer Society.
Watson, F.R. 1923. Acoustics of Buildings. New York: John Wiley and Sons.
Weinzierl, S. 2002. Beethovens Konzerträume. Raumakustik und symphonische Aufführungspraxis
an der Schwelle zum modernen Konzertwesen [Beethoven’s concert halls. Room acoustics and
symphonic performance practice on the threshold to modern concert life] (Bochinsky, Frankfurt
am Main).
Weinzierl, S., S. Lepa, and D. Ackermann. 2018. A measuring instrument for the auditory perception
of rooms: The Room Acoustical Quality Inventory (RAQI). Journal of the Acoustical Society of
America 144 (3): 1245–1257.
Wilkens, H. 1977. Mehrdimensionale Beschreibung subjektiver Beurteilungen der Akustik von
Konzertsälen [Multidimensional description of subjective evaluations of the acoustics of concert
halls]. Acustica 38: 10–23.