Proc. of the EAA Joint Symposium on Auralization and Ambisonics, Berlin, Germany, 3-5 April 2014
SCALE MODEL AURALIZATION FOR ART, SCIENCE, AND MUSIC: THE STUPAPHONIC
EXPERIMENT
Brian F.G. Katz
Audio & Acoustics Group
LIMSI-CNRS
Orsay, France
Markus Noisternig
Acoustic and Cognitive Spaces Group
UMR STMS, IRCAM-CNRS-UPMC
Paris, France
Olivier Delarozière
Woodstacker
Champ-au-Beau, France
ABSTRACT
The use of acoustical scale models has been replaced for the most
part by computational models and numerical simulations for room
acoustic studies as well as artificial reverberation units. There
remain, however, a number of acoustical phenomena which are difficult
to address with computer simulations, such as coupled volumes,
diffraction, and complex scattering, due to the computational
complexity and/or calculation time necessary for addressing
such acoustical wave phenomena on the scale of room acoustical
problems, even for small rooms. This paper presents a pilot study
of a rather unique artistic architectural structure consisting of a
self-supporting construction composed of small stacked linear elements.
Acoustically, the structure combines modal behavior, concave
forms, and very regular scattering patterns. An example scale
model has been constructed and studied in order to separate different
construction features and their associated acoustic effects. To
explore the musical interest of this specific acoustic, a
computational platform was created to use the scale model as a
physical convolution reverberation unit for musical performance.
1. INTRODUCTION
With the advent of recording, and dry recording studios, many
efforts have been made to reintroduce reverberation into
studio-recorded music. Some of the first technologies
developed were the use of “echo chambers”, wherein the dry au-
dio captured with the microphone was diffused in a reverberant
environment over loudspeakers, and then recaptured with microphones.
This physical-based artificial reverberation was quite popular,
with examples existing in such famous institutions as Abbey
Road Studios, where the echo chamber was constructed in 1931
(footnotes a, b). Echo chambers are, however, space demanding,
difficult to transport, and not very adjustable. With improvements
in electronics, other physical-electronic reverberation systems have
been developed, such as plate reverberators and spring reverberators.
a http://en.wikipedia.org/wiki/Echo_chamber, last viewed 2013-11-30
b http://audiogeekzine.com/2011/02/the-history-of-echo-echo-chambers-chambers/, last viewed 2013-11-30
With improvements in computer processing power, purely
electronic reverberation became possible, such as using feedback
delay network (FDN) for reverberation processing [1, 2, 3].
These reverberators could be easily adjusted, for example us-
ing perceptual descriptors relying on a simplified model of the
time-frequency energy distribution of parametric FDN [4]. Such
reverberators are however limited, lacking certain realism and
ability to represent unique architectural elements. Additional
increases in computational power allowed for the use of convo-
lution reverberators, using complex impulse responses, either
measured or calculated based on geometrical models such as
ray tracing [5, 6], beam tracing [7, 8, 9], or radiosity [10, 11].
Convolution reverberators capture the fingerprint of a given space,
but require preparation for the acquisition of the impulse responses
(IRs) and allow little flexibility with regard to modifying the room
at time of use, though perceptual control of convolution-based room
simulators is a subject of current study [12].
Scale models, to date, have been used as off-line convolution
reverberators to study architectural spaces [13, 14], but never in a
performance setting. The current study envisages the possibility
of using scale models in the same way as the old “echo cham-
bers” of the early and mid-twentieth century, while allowing for
the creation of complex and unique acoustic spaces rather than
just simple reverberators. Real-time use of one, or several, mod-
els and the ability to dynamically alter source, receiver, and even
room positions and configurations as desired offers a new form of
reverberation and musical expression.
In parallel, the development and exploitation of real-time scale
model convolution offers a number of interesting scientific aspects.
To begin with, there is the basic signal processing challenge of
achieving such a system. The application of real-time physical-based
convolution in scale models, in contrast to off-line convolution
with measured impulse responses of the scale model, offers
the ability to study room excitation by dynamic sources, such
as moving or rotating sources, with perceptual studies. Of specific
interest are perceptual studies concerning musician/room interactions,
which require real-time processing of generated music in coordination
with source dynamics. The effect of dynamic architectures
can also be examined, such as movable panels, or dynamic listener
placement or movement during a performance.
2. ARTISTIC CONTEXT
What child has not dreamed of being able to experience, as a
Lilliputian, a world in miniature: doll houses, electric trains,
miniature circuses ... In the “Stupaphonic” project (footnote 1) that
childhood dream will become a reality for musicians: they will be
able to play to their audience in a space in miniature.
This particular space is based on a special type of structure,
which is at the core of the architectural project Woodstacker [15].
This architectonical type of structure is a solution to the geometri-
cal problem of how to cover a large area by stacking small pieces
of wood without the use of glue or nails. The result is a bottle
shaped three-dimensional rose window (see Figures 1 and 4). This
new building system, based on “stacked laminated” timber struc-
tures, can evoke references to pagodas whose construction also
consists of wooden stacked elements. This stacked architecture,
like the chortens of Tibet, belongs to a family of stacked structures
which are derived from a Buddhist mound-like structure called a
stupa. Stupas originated as pre-Buddhist earthen burial mounds,
like tumuli in Europe. Thus was born the idea of linking our new
project to this ancient and universal architecture.
The stupa is used by Buddhists as a place of meditation. In
the original pre-Buddhist burial mounds, ascetics were buried in a
seated position. The American anthropologist J. Jaynes [16] proposes
that they were buried in this position so that they could continue
to speak to living people. Hearing voices from beyond the grave
suggests some acoustic illusions which are also a part of our device.
In the Woodstacker system, the special geometrical pattern of the
lamellas not only focuses the sound [17] but also acts as a
frequency filter, producing a particular, almost metallic, sound. This
strange acoustic effect is a second reason to link our project to the
stupa as a kind of container for “voices from beyond the grave”,
a “voice granary” (“grenier à voix” in French) to cite the French
writer Pascal Quignard [18].
The larger structures we have currently built can accommo-
date about 30 people for sound experiments (see photograph in
Figure 1). This size limitation is a compromise between the fund-
ing for artistic experiments and the cost of such a construction.
With the project’s evolution we desired a means to quickly experiment
with different architectural structures in a flexible way. The
use of physical scale models, which are powerful tools for architects,
carpenters, and acousticians [19], offered a solution to overcome
the current constraints. Thus, we found a way to invert the
acoustical environment, like turning a glove inside-out, and give
the musician and audience located outside of the building the same
acoustical experience that they could have inside the structure. Us-
ing the acoustic scale effect we are able to drastically reduce the
size of our installation and increase the number of structures per-
formers can play with and turn “stackscapes” (see Figure 2) into
interactive soundscapes.
3. SYSTEM OVERVIEW
To achieve the artistic, acoustic, and audio scheme imagined, a
basic scenario and system architecture were envisioned. One can
1 Stupaphonic: from stupa (from Sanskrit: m., stūpa, literally
meaning “heap”, see a) and phonic (from Ancient Greek φωνή, phōnē,
meaning “voice” or “sound”, see b)
a http://en.wikipedia.org/wiki/Stupa, last viewed 2013-11-30
b http://en.wikipedia.org/wiki/Phonetic, last viewed 2013-11-30
Figure 1: Photo of live performance at StackCamp 2013 featur-
ing Didier Petit (cello) and Emre Gultekin (saz). Champ-au-Beau,
France.
Figure 2: Blue Stackscape, © 2006 O. Delarozière.
imagine a performance area, where the performer is equipped with
one or several microphones. The space is open and large. Near the
musician are one or several acoustic scale models, equipped with
ultrasonic speakers and microphones. Around the musician is the
audience. The sound produced by the musician is captured, trans-
formed to the scale of the model where it is played and recaptured,
then transformed to the full scale of the musician’s performance,
where it is played live to the audience over an electro-acoustic ar-
ray of speakers either on-stage or around the audience.
3.1. Signal Processing
The signal processing chain is depicted in Figure 3. First, the in-
put audio signal is up-sampled to the ultrasound sampling rate,
which is determined by the scaling factor of the model. Sec-
ond, the up-sampled input signal is transposed by the scaling fac-
tor preserving the harmonic structure of the signal. Two imple-
mentations have been tested: a) off-line transposition that allows
for the time-stretching of the signal; b) a real-time implementa-
tion thereof, which compensates for the time-scaling effect using
Figure 3: Signal processing flow chart: (a) instrument signal cap-
ture, transformed audio signal playback; (b) up/down-sampling
with anti-aliasing filters; (c) frequency transposition, and (d) the
Stupaphonic scale model.
a phase vocoder and thus preserves the continuity of the audio
stream.
The off-line version study was carried out using MATLAB®.
Audio samples were processed individually, with no audio streaming
functionality. The basic approach consisted of taking an audio
extract, resampling the audio, modifying the sample rate in order
to apply the scale factor, playing and recording the sample in the
model, and then retransforming the recorded physically-convolved
signal to full scale for audio playback. A code sample for the
described process is provided below:
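fs = 44100;                             % capture sample rate (Hz)
% audiolength: number of samples to record, to be set by the user beforehand
y = wavrecord(audiolength, fs);         % record the dry input signal
fs_max = 192000;                        % sample rate in scale model audio chain (Hz)
scale_desired = 12;                     % scale factor of the model (1:12)
fs_resamp = fs_max/scale_desired;       % full-scale rate equivalent to fs_max (16 kHz)
y_resamp = resample(y, fs_resamp, fs);  % resample, including anti-aliasing filter
% Play and record in the model at fs_max, which applies the scale factor.
% wavplayrecord is taken as-is from the original listing; it appears to be
% a play/record helper rather than a base MATLAB function.
y_convolved = wavplayrecord(y_resamp, fs_max);
wavplay(y_convolved, fs_resamp);        % play the result back at full scale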
In this example, with a maximum sample rate of 192 kHz on
the audio system and a scale factor of 12, the recorded audio track
is resampled to 16 kHz (bandlimited to 8 kHz). This resampled
track is then played back and recorded in the scale model at an ac-
tual sample rate of 192 kHz. The recorded physically-convolved
signal is then played back at an actual sample rate of 16 kHz, or
resampled to the sample rate of the audio device. The simple
redefinition of the sample rate for the audio buffer applies the
scale factor, while the resampling ensures correct anti-aliasing
filtering.
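The arithmetic of this example can be checked with a short sketch. It is written in Python purely for illustration (the original processing used MATLAB); the function name `scale_model_rates` and the 6 s extract duration are hypothetical choices, not taken from the study.

```python
def scale_model_rates(fs_max_hz, scale_factor, duration_s):
    """Sample-rate bookkeeping for playing a full-scale extract in a 1:scale model."""
    # Full-scale sample rate whose buffer, replayed at fs_max, is sped up by scale_factor
    fs_resamp_hz = fs_max_hz / scale_factor
    # Nyquist bandwidth of the resampled full-scale signal
    bandwidth_hz = fs_resamp_hz / 2
    # The same buffer lasts scale_factor times less when replayed at fs_max
    n_samples = round(duration_s * fs_resamp_hz)
    model_duration_s = n_samples / fs_max_hz
    return fs_resamp_hz, bandwidth_hz, model_duration_s


fs_resamp, bandwidth, model_duration = scale_model_rates(192000, 12, 6.0)
print(fs_resamp, bandwidth, model_duration)
```

With the values quoted above (192 kHz, scale factor 12) this gives a 16 kHz full-scale rate and an 8 kHz bandwidth; the hypothetical 6 s extract occupies 0.5 s inside the model.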
While currently tested in single-buffer full convolution, future
studies will evaluate the possibility of applying overlap-add
convolution [20] to this physical-based convolution in order to
allow for real-time operation on audio streams.
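For readers unfamiliar with overlap-add, the following minimal Python sketch shows the principle on a purely digital example: the input is cut into blocks, each block is convolved separately, and the overlapping tails are summed. It is only an illustration under the assumption of a known digital impulse response `h`; in the physical case each block would be played in the model and its recorded response, decay tail included, would take the place of `convolve(block, h)`. All names are hypothetical.

```python
def convolve(x, h):
    """Direct linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def overlap_add(x, h, block_size):
    """Convolve x with h block by block, summing the overlapping tails."""
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        # Each block response is longer than the block; the excess overlaps the next one
        for k, v in enumerate(convolve(block, h)):
            y[start + k] += v
    return y


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
h = [1.0, -0.5, 0.25]
full = convolve(x, h)
blocked = overlap_add(x, h, block_size=3)
print(max(abs(a - b) for a, b in zip(full, blocked)))
```

The block-wise result matches the single-buffer convolution, which is what would make streaming operation possible in principle.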
The real-time version study was conceived as an alternative
to the above approach, employing resampling and transposition
to apply the scaling factor through the use of a high-quality
phase vocoder architecture. Phase vocoder techniques
are typically based on a sinusoidal signal model. The digital
audio sampling rate conversion employed band-limited
interpolation (see e.g. [21, 22]), which can be efficiently implemented
with sinc-function look-up tables. In [23] it was shown that
parametrized phase vocoders can also be applied to non-sinusoidal
signals. However, initial tests showed that the sinusoidal signal
model limits the use of phase vocoders for real-time scale-model
processing at large scale factors. Modified algorithms are the
subject of continuing investigations.
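As an illustration of band-limited interpolation, the Python sketch below evaluates a sampled signal at a fractional index using a Hann-windowed sinc kernel. It is a generic textbook-style interpolator in the spirit of [21, 22], computed directly rather than with look-up tables, and is not the implementation used in this study; the function names and the kernel half-width of 16 samples are assumptions.

```python
import math


def sinc(t):
    """Normalized sinc function sin(pi t) / (pi t)."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)


def bandlimited_sample(x, t, half_width=16):
    """Estimate the signal x at fractional index t with a Hann-windowed sinc kernel."""
    centre = math.floor(t)
    total = 0.0
    for n in range(centre - half_width + 1, centre + half_width + 1):
        if 0 <= n < len(x):
            d = t - n
            # Hann taper limits the kernel to 2 * half_width samples
            window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            total += x[n] * sinc(d) * window
    return total


# Low-frequency test tone: 0.05 cycles per sample, well below the Nyquist limit
freq = 0.05
tone = [math.sin(2 * math.pi * freq * n) for n in range(200)]
t = 100.3
estimate = bandlimited_sample(tone, t)
exact = math.sin(2 * math.pi * freq * t)
print(abs(estimate - exact))
```

Sample-rate conversion by a ratio r then amounts to evaluating such a kernel at indices 0, 1/r, 2/r, and so on, with the cutoff lowered accordingly when down-sampling.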
Figure 4: Woodstacker stacked lamella timber cupola. Champ-
au-Beau, France, 2010. (upper) A winter outside view. (lower)
Section and reflected ceiling plan (scale bar: 0–5 m). © O. Delarozière.
3.2. Scale Model
This preliminary study has been carried out using a single structure
as a test case. The scale 1:1 structure was built in 2008 for a Land
Art exhibition in the highlands of Auvergne, France (see Figure 4).
The original building comprised 366 pieces of Douglas pine wood.
It was 5 m in diameter, 4.5 m high, and weighed 6 tons. This work
called “Vox Granarium” [24] was dedicated to famous ancient fid-
dlers from this area. This installation was dismantled and moved
to Morvan where it was rebuilt in 2010. The Stupaphonic model
is a 1:12 scale model of “Vox Granarium”, 425 mm in diameter
and 322 mm high. This scale was chosen due to material availability
and because it is a traditional doll house scale. Serendipitously,
1:12 is also the Lilliputian people’s scale in Gulliver’s Travels
(footnote 2).
The model was constructed from oak wood, whose lengths and
widths were hand cut with no automatic process used for assembly.
Unlike the full-scale construction, glue was used to fix the
lamellas, and the model was assembled in three parts for ease of
transportation and manipulation (see Figure 5).
Due to the very long time needed to build this model by hand,
subsequent models will probably be built using rapid prototyping
techniques such as laser cutting. This will allow rapid
experimentation with a large variety of structural shapes and
configurations. We are also planning to use other materials, such
as metal or concrete, for special acoustic effects.
2 “... having taken the height of my Body by the help of a Quadrant,
and finding it to exceed theirs in the Proportion of twelve to one ...”
[25, p. 64]
Figure 5: Scale model (1:12) of “Vox Granarium”, highlighting
the tetrahedral acoustic source and 3 modular elements.
Figure 6: Spectral analysis of impulse response (balloon burst
excitation) of the full (1:1) scale “Vox Granarium”, recorded with a
sample rate of 96 kHz. Temporal analysis windows corresponding
to direct/initial (0–10 msec), early decay (10–30 msec), and
late tail (>30 msec).
4. PRELIMINARY TEST
The diffraction acoustic effect discussed in Section 2 can be
observed as a series of high-Q tonal resonances. These resonances
can be observed by comparing the spectral response (magnitude of
the FFT) at different moments in the impulse response. Figure 6
shows the spectral response at three moments in the impulse
response (balloon burst excitation signal employed for room
acoustics measurements [26]) of the full (1:1) scale “Vox Granarium”
(see Figure 4). Resonant peaks are marked in the response.
There is clearly a region of high resonance density over
the frequency range 100–600 Hz, continuing up to 800 Hz.
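The windowed analysis described here can be sketched in Python as follows. The impulse response is synthetic (a single decaying 400 Hz resonance, chosen arbitrarily inside the reported 100–600 Hz region), the 8 kHz sample rate is reduced for brevity, and the three window boundaries follow the direct/initial, early decay, and late tail segments of Figure 6 as read from its caption. None of this reproduces the measured data, and the function names are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(segment):
    """Magnitude of the DFT of a real segment (bins 0..N/2), computed directly."""
    n = len(segment)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(segment)))
            for k in range(n // 2 + 1)]


def peak_frequency(segment, fs):
    """Frequency (Hz) of the strongest DFT bin in the segment."""
    mags = magnitude_spectrum(segment)
    k = max(range(len(mags)), key=mags.__getitem__)
    return k * fs / len(segment)


fs = 8000  # reduced sample rate for this synthetic example (Hz)
n_total = 800  # 100 ms of signal
ir = [math.exp(-i / (0.05 * fs)) * math.sin(2 * math.pi * 400 * i / fs)
      for i in range(n_total)]

# Analysis windows in samples: 0-10 ms, 10-30 ms, and everything after 30 ms
windows = {
    "direct/initial": (0, fs * 10 // 1000),
    "early decay": (fs * 10 // 1000, fs * 30 // 1000),
    "late tail": (fs * 30 // 1000, n_total),
}
peaks = {name: peak_frequency(ir[a:b], fs) for name, (a, b) in windows.items()}
print(peaks)
```

A resonance that persists across all three windows, as the synthetic one does here, is the signature sought in the measured responses.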
The audio system employed for use in the scale model consisted
of DPA 4060 microphones and a custom 3-speaker tetrahedron
(see Figure 5) driven by a Samson Servo amplifier, connected
to an RME Fireface 400 audio interface. While somewhat
unconventional in traditional scale model research, this selection of
pro-audio equipment has been used in previous studies [27, 28] and
has been shown to provide improved signal-to-noise ratio when
compared to more traditional laboratory scale model measurement
architectures. The current hardware exhibits a frequency roll-off
at 50 kHz. This of course imposes a low-pass frequency
limitation for the physical-based convolution. With a scale factor of
12, the upper frequency limit due to this roll-off is on the order of
4.2 kHz, rather than the 8 kHz permissible due to sampling theory.
While suitable for the majority of studies in room acoustics with
scale models, the musical implications of this limit cannot be
ignored. This limit can be raised by improving the upper frequency
limit of the audio chain or selecting a lower scale factor.
Figure 7: Spectrogram of anechoic music extract (upper) and
physical-based convolved music (lower) manually time-aligned.
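The two limits quoted above follow directly from dividing by the scale factor, as the short Python sketch below shows. The function name is hypothetical, and the scale factor of 6 in the second call is only an example of "a lower scale factor", not a configuration tested in this study.

```python
def full_scale_upper_limit(fs_max_hz, hardware_rolloff_hz, scale_factor):
    """Upper full-scale frequency limit of the physical-based convolution."""
    # Limit imposed by sampling theory (Nyquist frequency of the model audio chain)
    nyquist_limit_hz = fs_max_hz / 2 / scale_factor
    # Limit imposed by the transducer and amplifier roll-off
    hardware_limit_hz = hardware_rolloff_hz / scale_factor
    return min(nyquist_limit_hz, hardware_limit_hz)


limit_1_12 = full_scale_upper_limit(192000, 50000, 12)
limit_1_6 = full_scale_upper_limit(192000, 50000, 6)
print(round(limit_1_12), round(limit_1_6))
```

At 1:12 the hardware roll-off dominates (about 4.2 kHz against 8 kHz from sampling theory); halving the scale factor would roughly double the usable bandwidth, at the cost of a larger model.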
An example result of the processing chain can be seen in
Figure 7, which shows the spectrogram of a dry music extract before
and after the physical-based convolution processing. The test
music excerpt was a dry multichannel recording of a Schubert trio
(D.929, op.100), taken from [29] and publicly available (footnote a).
The acoustic timbre of the convolved signal using the scale
model greatly resembled that of the musical experience heard
within the full-scale installation. Even though the processing
steps apply a low-pass filter effect, due predominantly to
transducer and amplifier performance limitations above 50 kHz, the
frequency range where the resonance characteristics of the
structure are apparent is still well within the operating frequency
range of the current signal processing chain for the 1:12 scale.
5. CONCLUSIONS
This paper has presented the foundations of the “Stupaphonic ex-
periment”, an artistic and scientific project which aims to use scale
models as physical-based convolution reverberators. The architec-
tural structure at the center of the project offers specific timbral
qualities which are maintained in the initial tests, despite the fre-
quency limitations of the scale transformations and associated sig-
nal processing chain.
The current example operates in an off-line, or time-deferred
situation. While streaming is currently still being investigated, the
current implementation could still be used in a performance setting
in a live looping context, where the musician could send different
audio samples to different architectures at the same or different
scale factors, effectively changing the size of the “echo chamber”.
a http://c4dm.eecs.qmul.ac.uk/rdr/handle/123456789/27
The development of the real-time processing stage, currently
a subject of study, will allow for exploitation of the proposed
physical-based convolution for studies in room acoustics, specifi-
cally those involving dynamic source, listener, or architectural
elements, as well as dynamic performer/room interactions.
One artistic performance aspect specific to this system is the
potential for cross-scale cross-talk. If the scale models are open
to some degree, then the up-scaled audio will be heard by some of
the audience. At the same time, full scale sounds, such as other
elements of the performance or noise from the audience, can also
be captured in the scale model, and subsequently down-scaled and
played over the reproduction array. According to the Lilliputian
scale factor of 1:12, one can imagine the majority of these sounds
will be shifted to the lower end of the audible range, or into the
subsonic region. However, a high pitched scream with a center
frequency in the 5 kHz third-octave band for example would be
clearly audible when transposed to the 400 Hz third-octave band,
albeit also time stretched to 12 times its original duration. Investi-
gations of these effects, and their possible artistic use, remain the
subject of further studies.
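The scream example can be verified with a few lines of Python. The sketch assumes exact base-2 third-octave band centres referenced to 1 kHz (so the band labelled 400 Hz has an exact centre near 397 Hz) and a hypothetical scream duration of 1 s; the function names are illustrative only.

```python
import math


def downscale(freq_hz, duration_s, scale_factor):
    """Map a full-scale sound captured in the model back through the down-scaling."""
    return freq_hz / scale_factor, duration_s * scale_factor


def nearest_third_octave_centre(freq_hz):
    """Exact base-2 third-octave band centre (re 1 kHz) closest to freq_hz."""
    n = round(3 * math.log2(freq_hz / 1000.0))
    return 1000.0 * 2 ** (n / 3)


freq, duration = downscale(5000.0, 1.0, 12)
band = nearest_third_octave_centre(freq)
print(round(freq, 1), round(band, 1), duration)
```

A 5 kHz component thus lands at about 417 Hz, inside the band nominally labelled 400 Hz, and the hypothetical 1 s scream lasts 12 s after down-scaling.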
6. ACKNOWLEDGMENTS
This study was funded in part by an Action Initiative grant at the
LIMSI-CNRS.
7. REFERENCES
[1] J-M. Jot and A. Chaigne, “Digital delay networks for designing artificial reverberators,” in Proc. 90th AES Convention, Paris, France, Feb. 1991.
[2] G. Garcia, “Optimal Filter Partition for Efficient Convolution with Short Input/Output Delay,” in Proc. 113th AES Convention, Oct. 2002.
[3] M. Noisternig, A. Sontacchi, T. Musil, and R. Höldrich, “A 3D Ambisonic based Binaural Sound Reproduction System,” in Proc. 24th AES Int. Conf., Banff, Canada, June 2003.
[4] J-M. Jot, “Real-time spatial processing of sounds for music, multimedia and interactive human-computer interfaces,” Multimedia Systems, vol. 7, no. 1, pp. 55–69, 1999.
[5] A. Krokstad, S. Strom, and S. Sorsdal, “Calculating the acoustical room response by the use of a ray tracing technique,” J. Sound Vib., vol. 8, no. 1, pp. 118–125, 1968.
[6] M. R. Schroeder, “Digital Simulation of Sound Transmission in Reverberant Spaces,” J. Acoust. Soc. Am., vol. 47, no. 2, pp. 424–431, 1970.
[7] T. A. Funkhouser, I. Carlbom, G. Elko, G. Pingali, M. Sondhi, and J. West, “A beam tracing approach to acoustic modeling for interactive virtual environments,” Proc. ACM Comp. Graphics (SIGGRAPH’98), pp. 21–32, July 1998.
[8] S. Laine, S. Siltanen, T. Lokki, and L. Savioja, “Accelerated beam tracing algorithm,” Applied Acoustics, vol. 70, no. 1, pp. 172–181, 2009.
[9] M. Noisternig, B. F.G. Katz, S. Siltanen, and L. Savioja, “Framework for real-time auralization in architectural acoustics,” Acta Acoust. united with Acust., vol. 94, pp. 1000–1015, 2008.
[10] C. Malcurt, Simulations informatiques pour prédire les critères de qualification acoustique des salles. Comparaison des valeurs mesurées et calculées dans une salle à acoustique variable, Ph.D. thesis, Laboratoire Acoustique Métrologie Instrumentation, Toulouse, France, July 1986.
[11] G. I. Koutsouris, J. Brunskog, C-H. Jeong, and F. Jacobsen, “Combination of acoustical radiosity and the image source method,” J. Acoust. Soc. Am., vol. 133, no. 6, pp. 3963–3974, 2013.
[12] T. Carpentier, T. Szpruch, M. Noisternig, and O. Warusfel, “Parametric control of convolution based room simulators,” in Proc. Int. Symp. on Room Acoust. (ISRA), Toronto, Canada, June 2013.
[13] J-D. Polack, X. Meynial, and V. Grillon, “Auralization in scale models: Processing of impulse response,” J. Audio Eng. Soc, vol. 41, no. 11, pp. 939–945, 1993.
[14] V. Grillon, X. Meynial, and J-D. Polack, “Auralization in small-scale models: Extending the frequency bandwidth,” in Audio Engineering Society Convention 98, Feb. 1995.
[15] O. Delarozière and U. Gleeson, “Woodstacker,” in Architectures autrement : Habiter le monde, M. Culot and A-M. Pirlot, Eds., pp. 46–51. AAM, Brussels, 2005.
[16] J. Jaynes, La naissance de la conscience dans l’effondrement de l’esprit, Presses Universitaires de France, 1994.
[17] B. Katz, O. Delarozière, and P. Luizard, “A ceiling case study inspired by an historical scale model,” in Proc. 8th Int. Conf. on Auditorium Acoust., Institute of Acoustics, Dublin, May 2011, vol. 33, pp. 314–321.
[18] P. Quignard and C. Lapeyre-Desmaison, Pascal Quignard le solitaire : Pascal Quignard, rencontre avec Chantal Lapeyre-Desmaison, Les Flohic éditions, 2006.
[19] O. Delarozière, “Camera tectonica : Hypothèses pour un facsimilé d’architecture,” in Utopia Instrumentalis : Facsimilés au musée - Musée de la Musique, Cité de la musique, Paris, Nov. 2010, pp. 46–56.
[20] A. V. Oppenheim and R. W. Schafer, Digital signal processing, Prentice-Hall, Englewood Cliffs, N.J., 1975, ISBN 0-13-214635-5.
[21] R. W. Schafer and L. R. Rabiner, “A digital signal processing approach to interpolation,” Proceedings of the IEEE, vol. 61, no. 6, pp. 692–702, 1973.
[22] J. O. Smith, III and P. Gossett, “A flexible sampling-rate conversion method,” in Proc. IEEE Int. Conf. Acoust., Speech and Sig. Proc. (ICASSP), 1984, pp. 112–115.
[23] W.-H. Liao, A. Roebel, and A. W. Y. Su, “On stretching gaussian noises with the phase vocoder,” in Proc. of the 15th Int. Conference on Digital Audio Effects (DAFx-12), Sept. 2012.
[24] O. Delarozière, “Vox Granarium,” in Horizons - Rencontres “Arts Nature”, pp. 12–13. Office de Tourisme du Sancy, July 2008.
[25] J. Swift, Part 1. A Voyage to Lilliput, vol. 1 of Travels Into Several Remote Nations of the World, chapter III, pp. 47–64, Printed for Benj. Motte, at the Middle Temple-Gate in Fleet-street, 1726.
[26] J. Pätynen, B. F.G. Katz, and T. Lokki, “Investigations on the balloon as an impulse source,” J. Acoust. Soc. Am., vol. 129, no. 1, pp. EL27–EL33, 2011.
[27] P. Luizard, Les volumes couplés : comportement, conception, et perception dans un contexte de salle de spectacle, Ph.D. thesis, Université Pierre et Marie Curie, Paris, France, 2013.
[28] P. Luizard, M. Otani, J. Botts, L. Savioja, and B. F.G. Katz, “Comparison of sound field measurements and predictions in coupled volumes between numerical methods and scale model measurements,” in Proc. Meetings on Acoustics, Montreal, June 2013, vol. 19, p. (9 pages).
[29] J. Fritsch, “High quality musical audio source separation,” M.S. thesis, UPMC / IRCAM / Telecom ParisTech, 2012.