Fakultät I
Institut für Sprache und Kommunikation
Fachgebiet Audiokommunikation
Ultrasonic Cyborg Hearing
Maximilian Wagenbach
Primary Supervisor: Prof. Dr. Stefan Weinzierl
Secondary Supervisor: Markus Hädrich
Fachgebiet Audiokommunikation
A thesis submitted to Technische Universität Berlin in partial fulfilment of the
requirements for the degree Master of Science in Audiocommunication and -technology.
Technische Universität Berlin
Fakultät I, Institut für Sprache und Kommunikation
25th February, 2022
Statutory Declaration
I hereby declare that I have produced the present work independently and by my own hand,
without unauthorized outside help and using only the sources and aids listed.
Berlin, 22.02.2022
Place, Date Maximilian Wagenbach
Abstract
This thesis explores the feasibility of a useable cybernetic expansion of the human auditory
system into the ultrasonic range. As part of this thesis a fully functional prototype was
built and tested successfully. The hardware, software and algorithm used in the process
of realizing the prototype as well as their technical verification are documented here. A
detailed analysis of the history and current developments in cybernetic implants, with
a particular focus on do-it-yourself cyborgs is also provided. Research on the state of
the art is presented for both digital signal processing techniques for pitch shifting and
ultrasonic hearing devices. Several natural and technical ultrasound sources are listed
as well. Additionally a personal account of the perceptive experience of wearing the
prototype is discussed together with possible future implications of cybernetic implant
technology as a whole. The prototype built for this thesis shows that it is possible to
extend the capabilities of a cochlear implant to enable the perception of ultrasound. This
allows the redefinition of the cochlear implant, from a utility mitigating a “disability”
into an extendable sense broadening the human perception. The algorithm implemented
to perform the pitch shifting was designed specifically to preserve the harmonic overtone
structure of the ultrasonic signal after the transposition. Therefore the prototype’s
real-time implementation is particularly suitable for use as a sensory extension, in
comparison to other ultrasonic hearing devices that do not have these properties. Other
possible future uses of the prototype, without a cochlear implant or as a tool for
research, are also discussed.
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike
4.0 International License.
Table of Contents
Abstract
Table of Contents
List of Figures
List of Equations
1 Introduction
2 Concepts & Context
2.1 Ultrasound
2.2 Cochlear Implants
2.3 Pitch Shifting
2.3.1 History & Usage
2.3.2 State of the Art
2.4 State of the Art in Ultrasonic Hearing
3 Extension of the Senses through Technology
3.1 Historic Context
3.2 DIY Movement
3.3 Plasticity of the Brain
4 Prototype project
4.1 Hardware
4.2 Software
4.3 Algorithm
5 Evaluation
5.1 Technical Evaluation
5.2 Perceptive Experience
5.3 Discussion & Outlook
Acknowledgments
German Summary (Deutsche Zusammenfassung)
Acronyms
Literature
List of Figures
2.1 Frequency ranges and their respective names.
2.2 Spectrogram of the feeding buzz of a Pipistrellus pipistrellus leading up to
a catch towards the right. (Pitched down into the audible range)
2.3 Audible frequencies in the hearing range of different species grouped by
color [1] [2] [3] [4] [5].
2.4 The two parts of a cochlear implant. Implanted component shown on
the left and externally worn sound processor shown on the right. Image
source: Cochlear Ltd.
3.1 X-Ray image of an RFID implant. Image source: dangerousthings.com
3.2 Magnetic implant lifting objects. Image source: dangerousthings.com
3.3 Sensor tattoo measuring pH, glucose and albumin levels. Image source: [6]
3.4 North sense implanted onto the chest. Image source: cyborgnest.net
3.5 Neil Harbisson with his antenna. Image source: wikimedia.org
3.6 Manel De Aguas with his fins. Image source: wikimedia.org
4.1 The two circuit boards of the prototype. The black microcontroller chip
in the middle of the top board is clearly visible as well as the audio shield
board below it.
4.2 The Knowles SPU0410LR5H-QB MEMS microphone (dimensions in mm).
Image source: [7]
4.3 Preliminary ultrasonic free field response normalized to 1 kHz of the
Knowles SPU0410LR5H-QB (taken from its datasheet [7]).
4.4 The full signal flow through the hardware. Analog steps marked in yellow
and digital steps colored blue.
4.5 Octaves and their corresponding fundamental frequency in a twelve-tone
equal temperament. The audible range is marked with a green background.
5.1 Results of the first test. Calculated, expected results on the left, showing
the original octave in yellow, the downshifted octave in blue and the
upshifted octave in red. Measured output octave of the algorithm with
no shift shown second from the left. Measured downshifted output octave
second from the right and measured upshifted output octave on the right.
5.2 Results of the second test utilizing the microphone. Power spectrum of the
shifted output octaves in blue and their expected, precalculated frequency
spectrum in red.
List of Equations
2.1 Relationship between frequency and period duration.
4.1 Formula for determining the frequency width of an FFT bin.
1 Introduction
Perception shapes our reality. Ancient Greek philosophers had already established this
connection between the human perception and the gain of knowledge or truth. One
famous example is Plato’s “Allegory of the cave”, where the perception of shadows reveals
truth about the objects casting the shadows to the observer. The expansion of the human
perception through technical means is the subject of the research whose results I would
like to present in this thesis. The primary human senses, like the visual or auditory
system, are quite sophisticated and complex. They have evolutionarily developed to adapt
to our surrounding environment in order to help us perceive and survive. Nevertheless,
they only grant us a window into a small slice of the electromagnetic and acoustic
wave spectrum. Many natural and technical processes remain imperceptible and hidden
to us. Even though knowledge about them is not strictly necessary for survival, they
can still bring interesting insights and offer a different awareness of the world.
Through technological advances humans have continuously developed and improved
tools to help them overcome those natural limitations and improve their senses. These
extensions of the senses through technology have also helped humankind to achieve feats
previously impossible and gain insights into domains thought to be inaccessible, for
example viewing far away objects, resolving microscopic structures or perceiving light
of invisible wavelengths. This notion still holds true today, as every new evolution in
observation technology offers new methods for discovery and acquiring knowledge.
In recent years observation technology and electronics in general have become a lot
cheaper and more accessible. As a result of the mass production and miniaturization of
silicon based semiconductor chips, sensors and microcontrollers as well as the use of signal
processing algorithms, it has become feasible for interested makers and hackers to build
their own electronic devices affordably at home. This has enabled a growing community
of biohackers and do-it-yourself cyborgs, interested in manipulating or extending their
own senses through technological means, to experiment. The term cyborg describes an
organism composed of both organic and electromechanical body parts. Merging humans
and machines has been a subject in literature and film for over 100 years, but cyborgs in
their modern form were popularized as a topic in science fiction starting from the 1960s
onward. Through the continuous advances in technological development it has recently
become an ongoing topic of growing importance with real world future implications
for humankind. As a cochlear implant wearer, my sensory nervous system is already
connected and stimulated by electronics, which makes me a cyborg according to the
traditional definition. Cochlear implants offer people with profound hearing loss a new
auditory impression, thus allowing them to hear again. During my research I have focused
on extending the hearing capabilities of my cochlear implant into the ultrasonic range. In
this thesis I have documented the development and testing of a prototype that is intended to
transpose ultrasound into the audible range and stream it to the cochlear implant. The
transposition is supposed to be carried out in a way that preserves the sound’s harmonic
relationships to closely mimic the way human hearing would naturally extend into the
ultrasonic range. This would enable a redefinition of the cochlear implant from a way to
mitigate a “disability” to a practical way for extending the human perception beyond its
natural form. This creative, critical approach was chosen to hack my own perception
and gather some practical experience in cybernetic augmentation of the senses under real
world conditions.
First a theory part in chapter 2 provides some background information and context
on topics touched by this thesis, like ultrasound, cochlear implants or pitch shifting.
Then a detailed look at past and current cybernetic implantations and their use of
technology for sensory expansion follows in chapter 3. The implementation part in
chapter 4 describes both the hard- and software of the prototype device I built as well
as the signal processing techniques employed. The final chapter 5 contains an account
of my experience and learnings, documenting how the prototype device feels and works
under real world conditions. Acoustic as well as electric evaluations are recorded in order
to gather properties of the device and verify it is working as intended. Lastly a short
outlook on what the implications of the availability of such devices could mean for the
future is also provided in chapter 5.
2 Concepts & Context
This chapter provides some context and background on concepts and devices that are
mentioned and used in the course of the thesis.
2.1 Ultrasound
Sound can be defined as the propagation of vibrations or acoustical waves through a
medium [8]. For the purpose of this thesis the considered medium is air. A sound source
exerts pressure differences on the air molecules surrounding it, making them vibrate back
and forth [8]. Because of the mass and compressibility of air the compression wave is
able to travel through the medium and spread outwards to reach the listener [8]. If the
sound source exerts periodic pressure differences the sound is perceived to have a tone or
acoustic color [9]. The time it takes for the molecules to complete one oscillation cycle in
such a periodic pressure wave is called the period duration $T$ and is measured in seconds.
The relationship between the frequency $f$ of a sound and its period duration is expressed
in the following equation [10]. The SI unit for frequency is Hertz (Hz).
$$f = \frac{1}{T} \tag{2.1}$$
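
As a small illustration of equation 2.1, the following Python sketch converts between period duration and frequency. The function names are hypothetical and were chosen only for this example; they are not part of the prototype's software.

```python
def frequency_from_period(period_s: float) -> float:
    """Return the frequency in Hz for a period duration in seconds (f = 1 / T)."""
    if period_s <= 0:
        raise ValueError("period duration must be positive")
    return 1.0 / period_s


def period_from_frequency(frequency_hz: float) -> float:
    """Return the period duration in seconds for a frequency in Hz (T = 1 / f)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


if __name__ == "__main__":
    # A tone at 20 kHz completes one oscillation cycle in 50 microseconds.
    print(period_from_frequency(20_000.0))
    # A period duration of 1 ms corresponds to a frequency of 1 kHz.
    print(frequency_from_period(0.001))
```
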
These periodic pressure fluctuations are perceptible by humans through their auditory
system. Shaped by evolution and conditioned by their physiology and
environment, humans are only able to perceive a certain part of the available acoustic
spectrum. In the literature this so-called “human audible range” is usually specified to
range from 16 or 20 Hz up to 20 kHz [8] [10]. Although children and some teenagers
might in fact be able to hear frequencies as high as 20 kHz, for adults from the age of
20 the upper end of the spectrum steadily decreases over the course of their lifetime by
about 1 kHz every decade [10]. Thus most young adults have a hearing range from 20
Hz to 16 kHz, while some elderly people might have an upper limit of 8 kHz or less.
General aging effects, prolonged exposure to loud music, noisy environments as well as
some infectious diseases have adverse effects on the hearing range [10]. Since the
scientifically accepted range is from 20 Hz to 20 kHz, we will use it as the audible range
for humans in this thesis for its ease of calculation and in order to consider all edge cases.
Sound waves with a frequency of less than 20 Hz are known as infrasound [11]. Around
that threshold human hearing makes a gradual transition from a tonal perception to
a vibrational sensation [12]. Infrasound can originate from powerful natural events such
as lightning strikes, volcanic eruptions, earthquakes, avalanches, landslides and other
similar events [12]. Some animals like whales or elephants also use infrasound waves for
communication over large distances. Human-made phenomena such as explosions, large
engines, sonic booms or subwoofer speakers can also be sources of infrasound [12].
Figure 2.1: Frequency ranges and their respective names. (Frequency axis from 1 Hz to
1 GHz, divided into infrasound, human audible sound and ultrasound.)
The frequency range above 20 kHz up to 1 GHz is referred to as ultrasound [11]. Figure
2.1 shows a visualization of the sound frequency ranges and their names. Ultrasound is
encountered in a wide variety of contexts, both in industrial applications and
in nature. A number of animal species, such as bats or mice, use
ultrasound for orientation, communication or hunting. Ultrasound’s significant utility for
industrial use shows in its numerous medical, fabrication and sensing applications.
A few technical applications of ultrasound can also be encountered in regular day-to-day
life. To give an overview, some of the applications from these different fields will be briefly
described here [11].
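
The frequency ranges named above can be expressed as a simple lookup. The following Python sketch uses the limits given in the text (20 Hz, 20 kHz and 1 GHz). The function name, the handling of the exact boundary values and the label for frequencies above 1 GHz are assumptions made for this illustration.

```python
INFRASOUND_UPPER_HZ = 20.0    # below this limit: infrasound
AUDIBLE_UPPER_HZ = 20_000.0   # up to this limit: human audible sound
ULTRASOUND_UPPER_HZ = 1e9     # up to this limit: ultrasound


def classify_frequency(frequency_hz: float) -> str:
    """Return the name of the sound frequency range a frequency falls into."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    if frequency_hz < INFRASOUND_UPPER_HZ:
        return "infrasound"
    # Assumption: the boundary values 20 Hz and 20 kHz count as audible.
    if frequency_hz <= AUDIBLE_UPPER_HZ:
        return "human audible sound"
    if frequency_hz <= ULTRASOUND_UPPER_HZ:
        return "ultrasound"
    # Assumption: the text does not name the range above 1 GHz.
    return "above ultrasound range"


if __name__ == "__main__":
    for hz in (5.0, 440.0, 40_000.0, 5e6):
        print(hz, "->", classify_frequency(hz))
```
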
One of the best-known applications of ultrasound is its use for imaging fetuses
in the womb during pregnancy. To achieve this, ultrasound pulses in the range from
1 MHz to 18 MHz are transmitted through the skin. Their reflections off internal
organs, tissue and the unborn child are detected and an image is reconstructed from these
reflections [13]. In addition to that specific use case ultrasound has become one of the
main general imaging methods used in medical diagnostics. Besides magnetic resonance
imaging (MRI) scans and x-ray images it is one of the few techniques allowing doctors
and physicians to take a look inside the body without invasive surgery. Especially for
delicate areas like imaging unborn children or reproductive organs, which are particularly
sensitive to ionizing radiation, different ultrasound imaging methods are the most common
technique. More advanced signal processing also allows three-dimensional reconstruction
of internal organs or measurements of the blood flow. In medicine ultrasound is not only
used for diagnostics but also has therapeutic uses. High-intensity, focused ultrasound
waves can be utilized to shatter kidney stones or destroy dead, cancerous tissue [14].
Again, this is done in a non-invasive way without the need for surgery, making it a very
convenient and accessible technology.
In manufacturing, fabrication and material testing ultrasound has a great number
of applications as well. While processing plastics, synthetic fabrics, dissimilar metals
or ceramics, ultrasonic sound waves between 15 kHz and 40 kHz can be employed to
permanently join two workpieces together. In a process known as ultrasonic welding
a permanent bond is created by selectively generating hotspots through ultrasonic
vibrations [15]. A related process called ultrasonic soldering uses a specialized soldering
iron to melt a filler material using acoustic energy, which then can be used to join
materials that are hard to solder by conventional means. Another important use case in
manufacturing is the thorough mixing of liquid chemical compounds without the need
for high-speed mixers. The process known as ultrasonication uses ultrasonic vibrations
to generate tiny vacuum bubbles in the medium to be mixed. When these
bubbles collapse, an effect called cavitation violently produces extreme shear forces which
disturb and completely mix the medium on a molecular level. The same cavitation effect
is used in ultrasonic cleaning baths, utilized both in professional settings and in home
appliances [16]. So-called transducers produce these tiny bubbles, typically with
frequencies around 40 kHz. When the bubbles collapse, they remove dirt and residue from
the surface without attacking the surface material itself. Ultrasonic cleaning is often used
for delicate objects like jewelry, watches, coins, surgical instruments or other fine
mechanical objects. Similar to medical applications, ultrasonic imaging methods are also
used in manufacturing for the testing of materials, fabricated components or for quality
assurance of finished products. As a non-destructive testing method it is especially popular
for discovering faults in welding joints, general tears or even hairline cracks. Usually the
frequency range from 1 MHz to 10 MHz is used for that purpose.
A much simpler and more rugged version of the imaging methods can be found in the
application of ultrasound for distance measurements. With bats being the bionic model,
ultrasonic pulses can be utilized to detect the distance of an object. This is achieved by
measuring the time that passes between sending the pulse and
the arrival of its reflection off the object’s surface. By taking into account the speed of
sound of the medium the pulse travels through, the distance can be derived [17]. One
advantage of this technique is its independence from the presence of light, which makes
it an ideal candidate for dark, foggy or opaque environments. It is frequently used to
measure the fluid level in storage tanks as well as in robotics. Recently this technique has
seen a growing utilization in autonomous vehicles for their situational awareness. With
additional signal processing they are even able to achieve basic object detection [18]. A
more sophisticated version of distance measuring is also used in underwater environments.
Known as sonar (acronym for sound navigation and ranging) it has many military
applications and played a key role in submarine warfare during the Second World
War. Sonar uses frequencies from infrasound up to the megahertz range depending on
the desired spatial resolution, range and water conditions. Today sonar is also used in
civilian, scientific or industrial applications like simple depth measurements to avoid
running aground, mapping and analyzing ocean floors or detecting shoals of fish. Another
diagnostics application of ultrasound is the detection of leaks in pressure systems. Gases
rushing through small holes produce ultrasonic sound waves, which can be tracked to
find the location of the hole. This property has been used in numerous industrial settings
on earth as well as in space to detect pressure leaks. Space agencies have equipped their
spacecraft with ultrasound microphones since the 1990s¹. They have successfully been
used on the Space Shuttle, the International Space Station and the Soyuz spacecraft to
identify leaks and avoid further endangerment of the respective crews.
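
The time-of-flight principle behind ultrasonic distance measurement can be sketched in a few lines of Python. The pulse travels to the object and back, so the one-way distance is half of the speed of sound multiplied by the measured round-trip time. The function name and the default speed of sound (roughly 343 m/s for dry air at about 20 °C) are assumptions made for this illustration; a real sensor would also have to account for temperature and the medium.

```python
SPEED_OF_SOUND_AIR_M_S = 343.0  # assumed value for dry air at roughly 20 °C


def distance_from_echo(round_trip_time_s: float,
                       speed_of_sound_m_s: float = SPEED_OF_SOUND_AIR_M_S) -> float:
    """Return the one-way distance in meters to a reflecting object.

    round_trip_time_s is the time between sending the pulse and
    receiving its reflection, so the path is covered twice.
    """
    if round_trip_time_s < 0:
        raise ValueError("round-trip time must not be negative")
    return speed_of_sound_m_s * round_trip_time_s / 2.0


if __name__ == "__main__":
    # An echo arriving 10 ms after the pulse was sent.
    print(distance_from_echo(0.010), "m")
```

Passing a different `speed_of_sound_m_s` adapts the same calculation to other media, for example water in the case of sonar.
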
The outlined industrial ultrasound applications operate in one or more gaseous, liquid
or solid media, illustrating the versatility of ultrasound. Since the goal of the thesis is to
extend the human auditory sense into the ultrasonic range the focus is put on gaseous
air as the medium. For the same reason of extending the existing human auditory range
continuously, the lower part of the ultrasonic spectrum from 20 kHz to 100 kHz will
be the subject of interest. At the same time, most natural ultrasound sources or those
encountered in day-to-day life can be found within that range.
While cats and dogs as well as bats commonly first come to mind with regard to hearing
capabilities above the human level, in fact quite a number of animals are able to perceive
and use ultrasound for communication, situational awareness or orientation purposes.
The animals best known for employing ultrasound though are without a doubt bats.
They have evolved the ability to use ultrasound as their main way of
perceiving the world. Instead of relying primarily on their visual sense like many other
animals do, bats use ultrasound to detect objects, navigate their environment and seek
food. This capability known as “echolocation” allows bats to fly and hunt even in total
darkness [19]. To achieve this feat bats emit calls of short, ultrasonic pulses through
their mouth or nose. These sound waves then reflect off objects around them. Using
their large ears bats gather these reflected sound waves and deduce how far away and
where objects are in relation to them as well as their direction of movement. They
derive these features by processing the sound intensity received by each ear, the stereo
delay, the Doppler shift, spectral peaks as well as other cues they can extract from the
sound [19]. By repeating this process in rapid succession bats are able to construct an
adequate impression of their surroundings. Of course different bat species have evolved
to use different frequency ranges for their ultrasonic pulses, as can be seen in figure 2.3,
to best fit their preferred environment. The calls of some species can be as low as 10
kHz and thus are audible to humans without any sensory extensions, while the upper
end of a few species is well above 100 kHz. It is also interesting to note that the pulses and
calls are not always the same, but vary with flight phase, activity and surroundings [19].
¹ NASA technology spinoff website: https://spinoff.nasa.gov/Spinoff2012/ps_1.html - Accessed: 22.02.2022
In general, orientation calls depend on the environment surrounding the bat. In open
air spaces most bat species use constant frequency (cf) or quasi-constant frequency (qcf)
calls of 10 to 50 ms length, with variations and exceptions for some species [2]. In very
tight air spaces and close to objects most species resort to frequency-modulated (fm)
calls of only a few milliseconds in length, to counteract echoes and improve the temporal
resolution [2]. Besides calls for navigation and orientation, the hunting calls, known as
“feeding buzz”, can also be differentiated acoustically. They differ from the others by
having a much shorter length and time interval between them [2]. The hunting calls get
increasingly shorter and faster while slowly falling in pitch right up to the moment of the
catch, as can be seen in figure 2.2. The third category of bat calls are their social calls.
They sound quite different from the orientation or hunting calls and vary from species to
species [2]. Some can be discerned by having a harmonic or ringing quality, variability in
the pitch or simply by not being repeated very often. Unfortunately they are also not very
well researched and categorized [2].
Figure 2.2: Spectrogram of the feeding buzz of a Pipistrellus pipistrellus leading up to a catch
towards the right. (Pitched down into the audible range)
Interestingly, even though very few insects vocalize using ultrasound, researchers have
observed a number of insect species that are able to hear in the ultrasonic range [4].
The most prominent example is the field cricket, which uses its wide hearing range from 2
kHz to 100 kHz to detect the presence and type of predator using the spectral fingerprint
of the predator’s calls [3]. Researchers were able to observe evasive movements and other
escape behaviours when the crickets were subjected to ultrasonic stimuli [3]. For a growing
list of other species similar protection behaviour when exposed to bat calls has also been
observed. Anti-bat tactics have evolved in crickets, moths, praying mantids, nocturnal
butterflies, green lacewings and possibly flies as well as beetles [4]. The alignment of the
sensitivity and frequencies of the insects’ hearing range and the bats’ vocalization range
shows that the insects’ ability to hear ultrasound and react to it has evolved in response
to evolutionary selection pressure (compare figure 2.3) [4]. Different species demonstrate
different evasive behaviours like flying in the opposite direction, diving towards the
ground, spiraling or stopping any movement [4]. Researchers have also found some
insect species that do vocalize in the ultrasonic range. Among them is the moth, which
produces a very quiet ultrasound signal exclusively for private communication when in
close proximity to females [20]. The research shows that the sound-producing organs
have evolved after the moth’s ears [20]. This suggests in fact that moths have adapted to
use ultrasound for communication as a result of bats hunting them using ultrasound.
Another group of animals observed to emit and use ultrasound are members
of the rodent order. Particularly laboratory rats and laboratory mice have been the subject
of thorough research regarding their ultrasonic vocalizations. In contrast to bats, rats
and mice don’t use ultrasound for echolocation, but for vocalization and communication
among each other. Adult rats emit two types of calls in the ultrasonic spectrum. The
first category is known as the “22-kHz vocalization” and occurs in the range of 18 kHz
to 32 kHz depending on the animal [21]. The second category are short, chirping calls
known in the literature as “50-kHz vocalization” and range from 35 kHz to 70 kHz, again
depending on the physique of the animal [21]. Rats have been observed to utilize
these different calls depending on their social situation, their physical or psychological
demands as well as their environmental situation [22]. The research suggests that the
calls correspond to different affective states of the animals [21]. Broadly speaking the
22-kHz vocalization could be linked to aversive behavioral situations like male-male
aggression, experiences of pain, social defeat, exposure to predators or other distressing
events [22] [21]. It could also be observed that these calls are not simply a spontaneous
vocalization, but can be emitted as a warning call to alert other rats in the burrow of the
presence of a predator [23]. The 50-kHz vocalizations are often observed in conjunction
with nonaversive conditions, for example sexual behaviors, play among adolescent rats,
male fighting contests as well as tickling or petting by experimenters [22]. Research on
these 50-kHz vocalizations suggests that they indicate a positive affective expression of
the rat [21].
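
The two call categories of adult rats described above can be illustrated with a small lookup based only on the frequency ranges quoted from the literature [21]. This Python sketch is a deliberate simplification: the function and constant names are hypothetical, and it ignores further features such as call duration that would matter when classifying real recordings.

```python
# Frequency ranges in kHz as quoted in the text for adult rats [21].
RAT_CALL_RANGES_KHZ = {
    "22-kHz vocalization": (18.0, 32.0),
    "50-kHz vocalization": (35.0, 70.0),
}


def classify_rat_call(frequency_khz: float) -> str:
    """Return the call category whose frequency range contains the given value."""
    for name, (low_khz, high_khz) in RAT_CALL_RANGES_KHZ.items():
        if low_khz <= frequency_khz <= high_khz:
            return name
    # Frequencies outside both ranges are left unclassified.
    return "unclassified"


if __name__ == "__main__":
    for khz in (22.0, 33.5, 50.0):
        print(khz, "kHz ->", classify_rat_call(khz))
```
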
The other rodent species that has been subject to more in-depth studies about its
ultrasonic vocalizations is the laboratory mouse. Mice use the frequency range from 30 kHz
to 110 kHz for their calls [22]. In contrast to rats, mice only emit calls in very specific
situations, primarily during non-aggressive interactions and especially during mating
behaviour [24]. Male mice have been observed to emit calls mostly in the presence of
females [24], while female mice have been recorded to vocalize when they are alone, during
female-female interactions or when they are searching for their offspring [22]. Young
mouse pups send out “isolation calls” when they are getting cold or if they get separated
from their nest [24]. The sounds that laboratory mice produce have been compared to a
rapid series of “chirp-like” syllables [24]. They consist either of single whistles, harmonics,
frequency sweeps, frequency jumps or a combination of those [22]. The calls are described
in the literature as a nonrandom, repeating temporal sequence, thus giving them
characteristics comparable to a song or the chirping of birds [24]. Most of the research
regarding laboratory mice and rats was conducted inside a lab and very few studies took
place in the wild. The little research that was done in the wild however suggests that
their songs and sounds might be quite different from those in lab settings [22]. It seems
that their vocalizations tend to be more expressive and extensive when in their natural
habitat [22]. Additionally the primary subjects for most of the studies on ultrasound
vocalization in rodents have been laboratory rats and mice, with only very few other
species from the rodent order being examined as well [22]. The few other mouse species
that have been studied are shrews and singing mice [25]. However there is a high potential
for many other rodents using ultrasound in one form or another, as some of their
predators, for example birds, reptiles or amphibians, are not able to perceive ultrasound,
making it an ideal tool for covert communication [25]. Recent research on squirrels has
shown that many squirrel species, like ground squirrels, tree squirrels and flying squirrels,
use ultrasonic warning calls around 60 kHz to alert their peers to predators in proximity,
without the predators noticing [25]. As always more research in the field is required and
will likely reveal more rodent species utilizing ultrasound.
Another codependent evolution similar to that between bats and their preferred prey
insects can also be found in cats and dogs versus their rodent prey. However, instead
of the prey adapting to their predator, here the development was reversed. It has been
theorized that the predecessors of dogs as well as cats have developed the ability to
perceive ultrasound in order to detect and hunt rodents [26]. As visible in figure 2.3 their
hearing ranges match up well with the vocalization range of, for example, mice or rats.
Lastly, one more animal group that also makes heavy use of ultrasound are several marine
mammals in the order of whales. Porpoises for example are in fact the animals with
the highest upper hearing limit at around 160 kHz [5]. Among other marine mammals,
porpoises and dolphins employ ultrasound in a technique called bio-sonar for orientation,
which works similarly to other methods of distance measurement described earlier in this
thesis. Because they use water as the medium for their ultrasonic sounds we will not go
into further detail here, since, as mentioned before, air is the medium of interest for this
thesis.
Researchers have also found naturally occurring ultrasound under more unexpected
circumstances. In 1983 the first measurements of trees emitting ultrasound under drought
conditions were recorded [27]. The recorded emissions were in the range of 100 kHz to
1 MHz and lasted between 20 and 100 microseconds [27]. Since then the source of the
ultrasonic emissions has been traced to the water-bearing layer of the trees known as
xylem [28]. The small conduits bringing water up the tree are under enormous negative
pressure in periods of drought [28]. These conditions lead to the formation of cavitations
inside these conduits, which emit ultrasound when violently collapsing [28]. The resulting
effect is comparable to embolisms in the human bloodstream. Because of the conduits’
small safety margin for rupture, these cavitations are now seen as a major factor
influencing the death rate of trees during droughts [28].
Figure 2.3: Audible frequencies in the hearing range of different species grouped by color
[1] [2] [3] [4] [5]. Horizontal axis: frequency in kHz, from 0 to 160. Species shown: porpoise,
dolphin, squirrel, laboratory mouse, laboratory rat, guinea pig, hedgehog, rabbit, cat, dog,
moth, grasshopper, praying mantis, tiger beetle, cricket, Daubenton's bat, common pipistrelle,
common noctule, serotine bat, bats, human.
In everyday life, applications of ultrasound are less common, but can
nevertheless be encountered occasionally. For example, ultrasound can be employed to
communicate with animals whose hearing is sensitive in that range. Dog whistles are
one example of sensible communication using ultrasound. They can be used to train
dogs to associate a certain behaviour with the sound and to execute it. Some dog whistles
operate in the upper end of the human audible range from 16 kHz to 22 kHz, but there
are also models working in the range from 23 kHz to 54 kHz, which has the added benefit
of not disturbing any nearby humans. A less kind form of communication with animals
is the use of ultrasound as a deterrent. Some people use ultrasonic loudspeakers to keep
rodents, insects or other animals off their property. They are especially popular in engine
compartments to protect against martens. The speakers usually work in the range between
40 kHz and 60 kHz. Research has shown that they are not effective at repelling animals,
and their use is undoubtedly not very respectful [29] [30]. Probably the most common
application, which many people have used without realizing that it is powered by ultrasound,
is the parking distance assistant. Almost all modern cars have an assistant feature that
announces the distance to objects behind the car using repeating tone sequences when
driving in reverse. Similar to the industrial distance measurements, the parking assistants
measure the time an ultrasonic pulse takes to travel to an object and reflect back in order to
derive the distance to the object. The distance is then translated into a tone sequence
with a corresponding, varying repetition rate. Usually a frequency around 40 kHz is used,
slightly varying between manufacturers.
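The time-of-flight principle behind such parking assistants can be illustrated with a minimal Python sketch. It is not taken from any manufacturer's implementation: the speed of sound of 343 m/s (air at roughly 20 °C), the function names and the linear mapping from distance to tone repetition interval are all assumptions made purely for illustration.

```python
SPEED_OF_SOUND_AIR = 343.0  # m/s, assumed value for air at roughly 20 °C


def echo_distance(round_trip_seconds: float,
                  speed: float = SPEED_OF_SOUND_AIR) -> float:
    """Distance to a reflecting object, derived from the round-trip time of a pulse."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse travels to the object and back, so the path length is halved.
    return speed * round_trip_seconds / 2.0


def beep_interval(distance_m: float,
                  max_distance_m: float = 1.5,
                  min_interval_s: float = 0.05,
                  max_interval_s: float = 0.6) -> float:
    """Hypothetical linear mapping from distance to the pause between warning tones."""
    clamped = min(max(distance_m, 0.0), max_distance_m)
    return min_interval_s + (max_interval_s - min_interval_s) * clamped / max_distance_m


if __name__ == "__main__":
    d = echo_distance(0.010)  # an echo arriving after 10 ms
    print(f"distance: {d:.3f} m, pause between tones: {beep_interval(d):.2f} s")
```

A pulse that returns after 10 ms corresponds to an object roughly 1.7 m away under these assumptions. Real systems additionally have to compensate for temperature, since the speed of sound in air depends on it.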
An interesting historic use of ultrasound in home appliances was in television remote
controls. In 1957 Robert Adler developed a system which employed ultrasound to adjust
TV settings from a distance [31]. Notably the remote worked completely mechanically,
meaning it did not require batteries. It consisted of a series of metal rods, which vibrated
when struck at the push of a button. These ultrasonic vibrations were then detected
and interpreted by the television to execute different functions like channel switching or
volume control. During the 1970s the system was gradually replaced by infrared
remotes using cheaper electronics.
2.2 Cochlear Implants
Another important component of this thesis is the cochlear implant, which is described
in detail in this section. Cochlear implants are surgically implanted neuroprostheses that
enable people with severe hearing loss to regain a sense of sound. Acoustic waves are
transformed by the cochlear implant into electric impulses that are subsequently
transmitted into the body. The eardrum as well as the auditory ossicles in the middle ear
are bypassed completely and the impulses stimulate the auditory nerve directly. With
adequate training wearers are able to derive good speech comprehension from the new
hearing impression. One main target group for cochlear implants is people whose
hearing loss has become too severe for hearing aids, primarily the elderly. The other main
target group is children who are born deaf. If they receive implants within the first
two years of their life, which is the time when language development usually happens,
deaf children have a good chance of learning spoken language, even at the same rate as
their hearing peers [32]. A third, much smaller group consists of individuals who have lost their
hearing abruptly due to an accident, an illness or another factor.
Currently there are four manufacturers that distribute worldwide: Cochlear Limited from
Australia, Advanced Bionics from the United States (currently owned by the Sonova
Group from Switzerland), MED-EL from Austria as well as Neurelec from France. Implants from the
Chinese manufacturer Nurotron are also available in some parts of the world. Except for
Switzerland there are no national or global registers for cochlear implants, so only rough
estimates of the number of implants in use worldwide can be made². The association
“Deutsche Cochlear Implant Gesellschaft e.V.” estimated in December 2011 that around
300,000 cochlear implant wearers existed worldwide at the time³. In 2016 the UK-based
“Ear Foundation” estimated the number of cochlear implant wearers in the world to be
around 600,000⁴. As of December 2019 the US “National Institute on Deafness and
Other Communication Disorders” approximated 736,900 implanted devices worldwide
according to their website⁵. Another factor adding uncertainty to these numbers is
that some statistics count implanted devices and some count cochlear implant wearers.
These two numbers cannot be easily converted, as some people wear two implants and
others only one. The German medical newspaper “Ärzteblatt”⁶ estimated in 2013 that
there were around 25,000 to 30,000 cochlear implant wearers in Germany. Additionally
they estimated that circa 3,000 implantation surgeries take place each year within
Germany. Extrapolating from these estimates brings the current number closer to 50,000
implant wearers in Germany as of 2021. More recent or more precise numbers are
unfortunately not available.
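The extrapolation for Germany can be retraced with a few lines of Python. This is only a rough sketch of the arithmetic: it assumes that the yearly number of surgeries stays constant at the 2013 estimate and that every surgery adds one new wearer, which ignores second implants for existing wearers, reimplantations and deaths. The function name is chosen here for illustration.

```python
def extrapolate_wearers(base_low: int, base_high: int, surgeries_per_year: int,
                        base_year: int, target_year: int) -> tuple[int, int]:
    """Linear extrapolation of a (low, high) wearer estimate to a later year."""
    # Simplifying assumption: each surgery adds exactly one new wearer.
    added = surgeries_per_year * (target_year - base_year)
    return base_low + added, base_high + added


if __name__ == "__main__":
    low, high = extrapolate_wearers(25_000, 30_000, 3_000, 2013, 2021)
    print(low, high)  # 49000 54000
```

Under these assumptions the range for 2021 comes out at 49,000 to 54,000 wearers, which is consistent with the figure of around 50,000 given above.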
² Compendium produced by the “Wissenschaftlicher Dienst des Deutschen Bundestags” (scientific service of the German parliament): https://www.bundestag.de/resource/blob/562774/3e41a2ce1f41897e55821f878dc37897/wd-9---016-18-pdf-data.pdf - Accessed: 22.02.2022
³ Archived website of the Deutsche Cochlear Implant Gesellschaft e.V.: https://web.archive.org/web/20140411085010/http://schnecke-online.de/informieren/behandlung-und-reha/cochlea-implantat.html - Accessed: 22.02.2022
⁴ Archived information PDF of the Ear Foundation: https://web.archive.org/web/20170711192446/http://www.earfoundation.org.uk/files/download/1221 - Accessed: 22.02.2022
⁵ US NIH website on cochlear implants: https://www.nidcd.nih.gov/health/cochlear-implants - Accessed: 22.02.2022
⁶ Ärzteblatt online article: “Cochlea-Implantate: Wenn Hörgeräte nicht mehr helfen”: https://www.aerzteblatt.de/archiv/136885/Cochlea-Implantate-Wenn-Hoergeraete-nicht-mehr-helfen - Accessed: 22.02.2022
As with any sufficiently successful technology, many different inventors are credited
with the development of the cochlear implant, and indeed many individuals made major
contributions to the creation of the cochlear implant in use today. Historically, the
first written accounts of electricity being used to stimulate the auditory nerve go back
to experiments conducted by Duchenne de Boulogne in 1855 and even further
to works by Alessandro Volta in the early 1800s [33]. The first successful attempt to
implant an electrode into the human auditory system was made by André Djourno and
Charles Eyriès in Paris in 1957. They implanted a single electrode into the remaining
auditory nerve stump of a deaf patient. After the implantation the patient was able to
distinguish between different sound intensities, but the single channel system was too
limited for differentiating between different frequencies or understanding speech. These
first experimental devices failed after only a few weeks, but the promising results inspired
other researchers to pursue the goal. Among them was Dr. William House, who in 1961
implanted the first devices into patients in Los Angeles. While still using a single-channel
electrode, he pioneered many new techniques, like inserting the electrodes into the cochlea
through its round window, and conducted many fundamental examinations. In
1967 his work culminated in the first devices that could be worn outside a lab and were
used by patients for many years [34]. The modern multichannel cochlear implant was
pioneered by two separate research teams [33]. One group in Vienna, Austria was led
by Ingeborg Hochmair and her husband Erwin Hochmair. They developed a system
that transmits impulses through the skin using two coils, which removed
many infection sources, and addressed biocompatibility issues of the materials used in
the implant. In December 1977 they successfully implanted the first 8-channel cochlear
implant. The team later commercialized their prototypes by founding the MED-EL
company. Around the same time Graeme Clark and his team in Melbourne, Australia
also worked on developing a multichannel cochlear implant. They made important
contributions such as designing the electrode array with increased flexibility towards its
end to follow the contour of the cochlea more easily. Their first successful implantation
of a 10-channel electrode took place in August 1978. Later they partnered with the
Nucleus company to sell their cochlear implant commercially. As the implants gained
more attention from the public and implantations became more frequent and
widespread, the companies started to share less of their research progress publicly. However,
one very visible improvement that happened throughout the 1980s and 1990s was the
miniaturization of the sound processor. This was enabled by the general improvements
in electronics and battery manufacturing techniques during that time period. Sound
processors were originally heavy boxes the size of a portable cassette player, like the
classic Sony Walkman, which were connected to the earpiece using cables and had to
be carried around in a pocket. Today they are self-contained units that can be worn
behind the ear and weigh around 10 grams including the rechargeable batteries. The
software running on the sound processors has also been fundamentally reworked many times
since its beginning. New electrode stimulation techniques, different filter settings and
noise reduction algorithms as well as special programs for speech perception, improved
music appreciation and loud environments have since been introduced. At the same time
the electrode arrays and surgery techniques were also continuously improved with the
goal of eliminating possible infection sources, reducing trauma inside the cochlea during
the implantation as well as improving the preservation of possible residual hearing. In
conclusion, even though modern cochlear implants work on the same basic principle as
their predecessors from the 1970s, they have changed many times and improved drastically
over time.
Figure 2.4: The two parts of a cochlear implant. Implanted component shown on the left and
externally worn sound processor shown on the right. Image source: Cochlear Ltd.
Modern cochlear implants are made out of two parts: one that is implanted into the
body and the other one that is worn externally. The implanted part consists of a large
round coil with a small magnet in its center, visible in figure 2.4 as the round element
situated on the top of the left component. It also includes some passive electronics, a
ground electrode and the electrode array. In figure 2.4 the left component houses the
passive electronics in the square center section, the ground electrode is visible as the
shorter tail with the straight end and the electrode array is the longer tail with the curled
tip. The individual electrodes are visible as small dots along the curled tip. During the
implantation surgery the body of the implanted part is countersunk and fixed into
the skull bone underneath the skin just behind and above the affected ear. The ground
electrode is then anchored into the surrounding tissue. A hole is drilled from behind the
ear through the skull and into the middle ear cavity. Through the hole and the cavity
the electrode array is carefully inserted through the cochlea’s round window into the
inner ear. Once successfully placed, the electrode array lies along the outer wall of the
cochlea following its natural curvature.
The external part of the cochlear implant is the sound processor, visible in figure
2.4 as the black component on the right. The bottom part of the sound processor is a
detachable, often rechargeable battery. Above the battery is the main unit which houses
the actual processing electronics as well as one or more microphones, one of which is
visible in figure 2.4 as a small black circular notch right underneath the silver button.
Attached to the sound processor through a short cable is another coil with a small magnet
in its center. The sound processor is worn behind the ear similar to a hearing aid or
a Bluetooth headset. The external coil is placed on top of the implanted coil and the
two magnets keep the coils aligned on top of each other. When their strength is tuned
correctly, the magnets hold the coils in place without squeezing or bruising the skin and
tissue between them. Through this configuration the sound processor is able to transmit
electrical energy and signals through the skin into the body using the homogeneous
electromagnetic field of the two coils. This system eliminates the need for a connector or
socket puncturing through the skin, thus greatly reducing the risk of infections. In 2020
two manufacturers presented new sound processor variants, which combine the sound
processor with the coil into a single unit that is only supported by the magnets. These
single unit processors can be helpful for people that wear glasses or have small or missing
ears. However they have other drawbacks, like difficulties using them while wearing a
helmet or higher risks of loss during highly dynamic situations like impact sports.
To summarize, the signal flow through a cochlear implant is as follows.
An acoustic sound wave is received by the microphones on the sound
processor. The sound processor then digitizes the signal and tries to improve the speech
intelligibility by taking the microphones’ directional characteristics into account, reducing
noise, filtering and applying additional signal processing algorithms. The cleaned signal
is then split into its frequency components and each of those is turned into electrical
impulses. The electrical impulses are directed from the sound processor to the outer coil,
through the skin using the homogeneous field of the two coils and into the inner coil.
From there the signal is led to the electrode array. When the individual electrodes in
the array fire, the energy is absorbed by the auditory nerve surrounding the cochlea.
The triggered nerve sends the signal to the brain, which interprets it as sound. Since
the electrodes are spatially spread out along the array, which lies along the wall of the
cochlea, they stimulate different endings of the auditory nerve, thus creating sounds of
different pitches in the brain. The frequency components that were split by the sound
processor are mapped to the different electrodes in the electrode array, allowing the
wearer to perceive sounds containing different pitches.
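The mapping of frequency bands to electrodes can be sketched in a few lines of Python. This is a simplified illustration and not the algorithm of any actual sound processor: the logarithmic spacing of the bands, the lower limit of 200 Hz, the upper limit of 8 kHz, the function names and the convention that index 0 denotes the lowest band (stimulated by the electrode inserted deepest into the cochlea) are all assumptions made here.

```python
from typing import List, Optional


def band_edges(n_electrodes: int, f_low: float = 200.0,
               f_high: float = 8000.0) -> List[float]:
    """Logarithmically spaced band edges; returns n_electrodes + 1 frequencies in Hz."""
    ratio = f_high / f_low
    return [f_low * ratio ** (i / n_electrodes) for i in range(n_electrodes + 1)]


def electrode_for(frequency: float, edges: List[float]) -> Optional[int]:
    """Index of the band containing the frequency, or None if it lies outside the range.

    Index 0 is the lowest band, assumed to belong to the deepest electrode.
    """
    if frequency < edges[0] or frequency > edges[-1]:
        return None  # outside the range the processor forwards to the implant
    for i in range(len(edges) - 1):
        if frequency < edges[i + 1]:
            return i
    return len(edges) - 2  # frequency equals the upper limit


if __name__ == "__main__":
    edges = band_edges(12)
    for f in (100.0, 250.0, 1000.0, 4000.0, 40000.0):
        print(f, electrode_for(f, edges))
```

In this sketch every component above the upper limit, including all ultrasonic components, is simply discarded, because no band exists for it.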
Alongside the greatly reduced risk of infections, another advantage of the split into
separate external and implanted components is that the parts can be serviced independently.
It is, for example, quite common to upgrade the sound processor without touching
the implanted part, as it is much harder to service. Since maintenance of the implanted part
requires invasive surgery, all manufacturers keep some level of backward compatibility
between their implants and newer models of the sound processors. This allows wearers to
switch to newer sound processor versions without the need for surgery and to
benefit from new developments, which usually result in improved speech comprehension
and situational awareness. Within the last 3 years, which is very recent in the medical
devices industry, manufacturers have also started to provide software updates for their
sound processors. These updates are installed by audiologists on the wearer’s sound
processor. One example for a feature that was introduced through such a software update
is the “ForwardFocus” feature on sound processors of the manufacturer “Cochlear”. When
activated it uses the sound processor’s two microphones to determine the direction a
sound comes from and filters out any sound coming from behind the listener. This has
proven to be very useful and effective in reducing background noise and improving speech
comprehension in noisy environments like bars, restaurants or other group settings [35].
Although a cochlear implant tries to mimic the way the auditory nerve is naturally
stimulated, the resulting hearing impression is quite different from regular hearing.
Audiologists and the literature often talk about an auditory impression rather than hearing.
In part this is due to the fact that the electrodes are not connected directly to the
auditory nerve, but emit more broadly into the tissue surrounded by the auditory nerve.
To be clearly distinguishable from its neighbours, a single frequency band therefore needs
a much larger surface area than the hair cells do. This prevents electrodes from being
spaced closer together on the array, which in turn greatly reduces the implant’s frequency
resolution. Depending on their size and configuration, modern electrode arrays have
between 12 and 24 electrodes [36]. Another factor reducing the available frequency
range is the fact that the electrode array can only be inserted to around 1½ turns of
the cochlea’s 2¾ turns [36]. This is because the decreasing size of the
cochlea towards its inner turns constrains the insertion depth. Inserting the array any further
could rupture the basilar membrane or the outer wall of the cochlea and cause irreparable damage.
It is important to note here that from the wider outer end to the tighter inner end of
the cochlea the auditory nerve’s frequencies are arranged from high to low.
Since the electrode array only reaches about halfway into the cochlea, only the upper
half of the auditory nerve’s frequency spectrum can be stimulated. As a consequence,
the perceived frequencies are remapped significantly compared to the acoustic
frequencies. This results in sounds being perceived much higher than their acoustic
counterparts. Another consequence of the limited available frequency range inside the
cochlea is that the sound processor itself also has to limit the frequency range that it can
forward to the implant. Since speech comprehension is most important for interactions
in daily life, cochlear implants are optimized for detecting and processing the modulation
patterns of speech. Therefore most sound processors only use the range up to around
8 kHz, varying by manufacturer and audiologist settings. The optimization for speech
and the small number of available frequency bands result in a rather poor reproduction
quality for music. While repetitive music gets through fairly well, more complex music
can be difficult to enjoy because its fine frequency structure is hard to replicate. In
order to try to mitigate these drawbacks, digital processing is applied. Unfortunately a
further implication of the limited frequency resolution is that cochlear implants don’t
perform equally well for all spoken languages. Wearers in regions with tonal languages
that rely on small variations in pitch, like Thai, Vietnamese, many variants of Chinese
or most languages from Sub-Saharan Africa, have a much harder time gaining speech
comprehension than wearers in areas where non-tonal languages, like English, German or most
other European languages, are spoken. New electrode stimulation strategies and different
signal processing algorithms might be able to address these current shortcomings [37].
Even though the technology is constantly improving and each new generation of sound
processors enhances the wearer’s hearing abilities, the auditory impression provided
by cochlear implants might be regarded as limited when compared to regular hearing.
However this comparison misses an important point: while the sound processor and the
implant do a lot of sound treatment already and provide the way for sound to get to the
brain, the actual heavy lifting in processing and interpreting the sound happens in the
brain. The extraction of meaning as well as understanding and comprehending spoken
language from the unfamiliar input is the accomplishment of the brain. The brain is
able to achieve this feat due to its plasticity. This property allows the brain to adapt
itself to new circumstances and react to new or different sensory stimulations. Section
3.3 describes the plasticity of the brain in more detail. The process of the brain adjusting
itself to a new input, starting to learn to recognize patterns and extracting meaning from
those, is a rather slow and time-consuming process. Cochlear implant wearers who first
receive their sound processor often describe the sound impressions as beepy, echoic and
atmospheric tones with high-pitched, squeaky, bubbling voice-like sounds that follow the
loudness contour of the words spoken [38]. Personally I have often described the first
impression of voices through a cochlear implant as a mixture between the voices of Mickey
Mouse and R2-D2. Over the weeks following the first activation voices start to emerge
and separate from the atmospheric background sounds [38]. The sensation of the process is
hard to put into words, but as the brain gets more accustomed to the new input and over
time learns new ways to infer information from it, words rise from the background and
voices become clearer and more understandable as well as easier to discern. Even the higher
pitch of all sounds is something the brain gets accustomed to. The brain simply conceals
the change in pitch, similar to how somebody wearing glasses doesn’t notice the rim of
their glasses in their field of view unless they look for it or actively think about it.
The aid of audiologists and speech therapists is crucial in learning to hear and understand
successfully using a cochlear implant. For that reason the implantation is generally
accompanied by post-surgery care of two years or more. During these sessions
the technical settings of the sound processor are tuned, and listening exercises
and speech comprehension tests are performed. Today listening exercises can also be
supplemented by using special smartphone apps. Modern sound processors can connect
to smartphones via Bluetooth or dedicated extension devices to stream audio directly
to the implant, which also allows other sources like audiobooks, TV programs or music
to be used for training at home. The audiologists will watch out for complications and
adjust settings on the sound processor to customize the auditory experience according
to the wearer’s needs. Some of the more important settings are the loudness levels of
the individual electrodes, different sound processing programs, background noise or wind
reduction as well as microphone sensitivity. Generally the success of a cochlear implant
does not primarily depend on the capabilities of the sound processor, implant or number
of electrodes, but on the brain’s ability to adapt to the new auditory input and extract
information from it. The length of this familiarization period depends on a number of
personal and environmental factors. Among the most important factors is the age of the
brain, as it is easier for a child or young adult than an elderly person to relearn hearing.
Another important factor is the duration of the deafness, as the brain shuts down areas it
does not need in order to conserve energy. The learning efforts and restructuring of the
brain are an ongoing process that continues even after the first years post implantation.
Cochlear implants are undoubtedly an extremely helpful and powerful technology that
allows people to regain a possibly lost primary sense. Even though they can offer an
improvement in quality of life and the autonomy that comes with it, cochlear implants
are not without controversy. Especially in the deaf community they have been met with
resistance by some people. To those opposing cochlear implants they are often presented
as the only right solution to a condition they don’t view as problematic. Most deaf
people don’t perceive their deafness as a shortcoming or deficiency that needs fixing, but
simply as another facet of life. They regard their deafness as a part of their identity
which flourishes inside a supportive community, with its own language, culture and
heritage. People opposing cochlear implants often feel that the perception of deafness
as a condition that needs curing is a view that the rest of society pushes on them as
the only solution to fix all their perceived “problems”. Instead of working towards a more
inclusive society, where deaf people don’t have to retreat into their own community,
deafness is labeled as a disability and a problem that needs to be managed and solved.
Listening to the actual problems that deaf people face and implementing less invasive, more
inclusive and often cheaper solutions, like simple pictograms, wider use of closed captions,
required sign language interpreters or text-based communication channels, to name
just a few, is often dismissed, as it would require active change by society.
The impression left behind is that the deaf individual is the one that needs fixing instead
of the society becoming more inclusive. The rejection of cochlear implants can also be
attributed to a rather insensitive approach by some doctors and scientists as well as flaws
and complications in the first generations of cochlear implants combined with training
techniques that had not fully matured yet, which led to earlier implants being much
less successful than they are now. Another fear is that children receiving implants at
an early age and growing up with spoken language will lose their connection to the deaf
community resulting in deaf culture fading away over time.
2.3 Pitch Shifting
Historically pitch shifting has always been an integral part of musical practice, but the
advances in digital audio technology and signal processing algorithms during the past
decades have taken it into entirely new realms of acoustic and musical possibilities. Much
like the change in speed, loudness and expression of a melody or a piece of music, altering
the pitch is one of the fundamental tools a musician can employ to modify music according
to their need and style. In musical terms changing the pitch simply means to shift the
notes of a melody up or down by a fixed interval of semitones. Historically musicians using
acoustic instruments or their voice were able to shift pitch by simply using a different
register of their instrument. If we take a piano as an example for a classical instrument the
player would just move their hands along the keyboard by a given number of keys. In the
context of acoustic instruments it is important to note that altering the pitch of a sound
will also always change its timbre according to the physicality of the instrument. Timbre
is a somewhat hard to define concept incorporating a very complex set of perceptual
attributes, which describe a sound’s “color” or qualities and contribute to its unique
identity [9]. Many different subfields of music and sound studies have defined timbre in
different ways, and especially in recent years a lot of new research conducted in the field
has uncovered new insights and interesting interconnections [39]. With regard to pitch
shifting as discussed in this thesis it is, however, sufficient to use the simplified and dated
timbre model of Hermann von Helmholtz. In 1863 he used Fourier’s theorem to show
that the timbre of a sound heavily depends on the distribution of overtones in the sound’s
spectrum [39]. In this simplified model the fundamental frequency of a sound can be
thought of as responsible for the pitch, while the overtones are responsible for the sound’s
timbre [39]. Within the domain of digital audio effects we are offered the choice to preserve
the characteristic timbre of an instrument when changing the pitch or to transpose the
sound without considering the timbre alteration. If the fundamental frequency is moved
up or down and the overtone spectrum is stretched or contracted accordingly, so that its
distribution relative to the fundamental remains intact, the original timbre is preserved
in the pitched sound [40]. If the entire spectrum is moved by a constant amount, thus
destroying the overtone relationship, the timbre of the sound is modified during this
operation [40]. For unnatural-sounding, but nevertheless musically interesting, effects the
overtone spectrum can also be stretched or contracted independently of the fundamental
frequency to change the timbre of a sound without changing its pitch. As most pitch
shifting algorithms are designed for musical use, preservation of the timbre after the
pitch shift is usually desirable. For the purpose of this thesis, transposing the spectrum
while preserving the harmonic overtone relationship is referred to as “pitch shifting”
(sometimes also called frequency scaling). Transposing the whole spectrum by a constant
amount, thereby destroying the harmonic relationship between the fundamental and the
overtones, is referred to as “frequency shifting”. Unfortunately, in the literature the term
pitch shifting is often incorrectly used as a catch-all term to also describe the process of
frequency shifting.
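The difference between the two operations can be made concrete with a small numerical sketch in Python. It operates on an idealized list of partial frequencies rather than on audio samples, and all function names are chosen here for illustration only.

```python
from typing import List


def harmonics(f0: float, count: int) -> List[float]:
    """Fundamental plus overtones of an idealized harmonic sound, in Hz."""
    return [f0 * k for k in range(1, count + 1)]


def semitones_to_factor(semitones: float) -> float:
    """Frequency ratio of a musical interval in equal temperament."""
    return 2.0 ** (semitones / 12.0)


def pitch_shift(partials: List[float], factor: float) -> List[float]:
    """Pitch shifting (frequency scaling): every partial is multiplied by a factor."""
    return [f * factor for f in partials]


def frequency_shift(partials: List[float], offset_hz: float) -> List[float]:
    """Frequency shifting: every partial is moved by the same constant amount."""
    return [f + offset_hz for f in partials]


def ratios_to_fundamental(partials: List[float]) -> List[float]:
    """Ratio of each partial to the lowest one; 1, 2, 3, ... for a harmonic sound."""
    return [f / partials[0] for f in partials]


if __name__ == "__main__":
    tone = harmonics(220.0, 4)                              # 220, 440, 660, 880 Hz
    scaled = pitch_shift(tone, semitones_to_factor(12))     # one octave up
    shifted = frequency_shift(tone, 220.0)                  # constant offset of 220 Hz
    print(ratios_to_fundamental(scaled))   # [1.0, 2.0, 3.0, 4.0]
    print(ratios_to_fundamental(shifted))  # [1.0, 1.5, 2.0, 2.5]
```

In both cases the lowest partial ends up at 440 Hz, but only the scaled version keeps the integer ratios between the fundamental and its overtones, while the shifted version becomes inharmonic.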
2.3.1 History & Usage
The history of analog electronic pitch altering devices goes back to the American Bell
Labs during the 1930s [
41
], where the Vocoder was developed from experiments with
filterbanks. It was originally used to encrypt voice messages for military communication
during the second world war. Towards the end of the 1970s, analog Vocoders like the
Roland VP-330, the Korg VC-10 or the Moog Vocoder became commercially available
as professional musical equipment. They became widely popular with musicians of
all genres; some examples among many others are Pink Floyd⁷, Phil Collins⁸, Michael
Jackson⁹, Kraftwerk¹⁰ and Jean-Michel Jarre¹¹. The first digital pitch shifting device
was the “Harmonizer”, released by the company Eventide in 1975. Instead of a filter
based approach like the Vocoder, it was built around a circular delay line which used
random access memory on integrated circuits, a technology brand new at the time¹². The
Harmonizer was only able to shift up or down by one octave and had some audible glitches
while crossfading at the edges of its circular buffer. With larger transposition factors the
timbre of the original sound also became increasingly distorted [40]. Nevertheless,
it was still very popular with a lot of musicians. The next major step forward in the
field of pitch shifting was the exploding popularization of digital audio effects in the late
1990s. It was facilitated by advances in chip manufacturing, which resulted in cheaper
components, bringing many new digital rack mounted effects units on the market, as well
as the success and increasing processing power of home computers, leading to many new
effects being developed as software plugins for digital audio workstations (DAWs). Higher
fidelity as well as easier access contributed to the immense success of DAW plugins. So much so
that many older rack mounted effect units were rereleased as plugins. One famous
example is the popular software “Auto-Tune”, which was both sold as an effects unit
and as a plugin. It was released in 1997 and almost instantly became the go-to industry
standard for pitch shifting. While the name of the software “Auto-Tune” has become
synonymous with voice pitch shifting, other pitch altering software like “Melodyne” or
“Elastique” are also used extensively in many different music genres today.
The original intention of the software Auto-Tune was to correct singing that was off-key,
by shifting the inaccurate notes unnoticeably to match the desired ones. Some musicians
⁷ Pink Floyd: Animals; Pigs (Three Different Ones), 1977, London
⁸ Phil Collins: In the Air Tonight, 1981, London
⁹ Michael Jackson: Thriller; P.Y.T. (Pretty Young Thing), 1982, Los Angeles
¹⁰ Kraftwerk: Die Mensch-Maschine; Die Mensch-Maschine, 1978, Germany
¹¹ Jean-Michel Jarre: Zoolook, 1984, France
¹² Eventide’s Harmonizer 50 years anniversary website: https://www.eventideaudio.com/50th-flashback-4-2-h910-harmonizer-the-product/ - Accessed: 22.02.2022
and music critics have criticized the widespread use of Auto-Tune as being a shortcut
to skill, which results in less talented singers. Nevertheless, it continues to be used
extensively. Besides the original intended use, musicians have developed it further into
an effects category of its own, using it as an artistic and stylistic device, rather than
just as a tool for the correction of mistakes. Especially the genres of R&B, hip-hop
and electronic music have been decisively shaped by the effect. Many different effects
altering the expression of the voice can be achieved through pitch or frequency shifting
techniques; among them are: giving the voice a distorted, unnatural feeling (like in the
1998 song “Believe” by “Cher”¹³), making voices sound darker or lighter (as evident in
music by the artist “Fever Ray”¹⁴), giving the voice a dreamy, wavy feel resulting in
a feeling of loneliness and being disconnected (extensively used in R&B and hip-hop
vocals) as well as more extreme alterations of the voice to be choppy, stepped and broken
up (like in “hand crushed by a mallet” by “100 gecs”¹⁵). The effect was used for the first
time in the aforementioned song “Believe” by “Cher”. Besides the use in music, pitch or
frequency shifting is often used in television or cinema productions to give characters an
alien, robot or science fiction sounding voice, as for example in “Battlestar
Galactica” or “Transformers”. It is also commonly used in combination with other effects
to disguise the voices of people wishing to stay anonymous on the record. Pitch shifting
as well as time stretching is also an integral part of any modern DJ software. DJs use
both effects to align the tempo, key or pitch of two songs following each other, to blend
them unnoticeably, as well as for creative and artistic uses. Another field that has
embraced the pitch shifting effect is that of personal karaoke machines.
While karaoke machines are less common in the western world, they are widespread and
highly popular in East Asian countries. Karaoke machines use the effect to offer users a
chance to improve their vocals and make their singing experience more enjoyable.
2.3.2 State of the Art
The state of the art in pitch shifting algorithms using digital signal processing (DSP)
is largely based on research done in the 1990s and early 2000s [40]. There are different
approaches with many variations and different implementations in use, each with their
own strengths and weaknesses [42]. The two most popular approaches will be briefly
outlined here. The first one is known as PSOLA (pitch synchronous overlap and add)
and was developed towards the end of the 1980s [43]. The algorithm operates in the
time domain and works by first splitting the incoming signal into overlapping, windowed
blocks. Then, to alter the pitch, these blocks are spaced further apart (pitching down)
or closer together (pitching up). Afterwards the blocks are combined into a continuous
¹³ Cher: Believe, 1998, United States of America
¹⁴ Fever Ray: If I Had a Heart, 2009, Sweden
¹⁵ 100 gecs: hand crushed by a mallet, 2020, United States of America
signal again using the overlap-add technique. The PSOLA algorithm has a long history
with a variety of different use cases and has been continuously adapted and improved over
time. One of its limitations is that it relies on the source signal being quasi-periodic,
meaning it works best with monophonic sources or solo speakers. Notably, it is also limited to
stretch factors between 0.25 and 2.0. Nowadays, PSOLA has mostly been superseded
by other algorithms and is only rarely used, for example for vocals in live settings, where its
efficiency and low latency, which result from its operation in the time domain, are an advantage.
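The re-spacing of windowed blocks described above can be sketched in a few lines. This is a strongly simplified illustration and not the implementation of any particular PSOLA variant: it assumes that the pitch period is constant and already known in samples, whereas a real implementation has to detect the pitch marks first. The function name `psola_shift` and all signal parameters are hypothetical.

```python
import math


def psola_shift(x, period, factor):
    """Simplified PSOLA for a signal with a known, constant pitch period (in samples)."""
    n = len(x)
    # Hann window spanning two pitch periods, centred on each pitch mark.
    win = [0.5 - 0.5 * math.cos(2 * math.pi * i / (2 * period)) for i in range(2 * period)]
    out = [0.0] * n
    new_period = period / factor  # blocks closer together -> higher pitch
    j = 0
    while True:
        centre = round(j * new_period)  # position of the block in the output
        if centre + period > n:
            break
        # Use the input block nearest in time, so the duration is preserved.
        src = round(centre / period) * period
        if src - period >= 0 and src + period <= n:
            for i in range(2 * period):
                out[centre - period + i] += win[i] * x[src - period + i]  # overlap-add
        j += 1
    return out


# Illustrative input: periodic tone with a period of 100 samples and one overtone.
period = 100
tone = [math.sin(2 * math.pi * i / period) + 0.5 * math.sin(4 * math.pi * i / period)
        for i in range(2000)]
shifted = psola_shift(tone, period, 2.0)  # one octave up: the period becomes 50 samples
```

Because each block keeps its original shape and only the spacing changes, the output repeats every 50 samples while the overall duration stays the same.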
The second algorithm is the “phase vocoder”, which enjoys an even higher popularity
than PSOLA [44]. The algorithm works by transforming a
signal block from the time domain to the frequency domain, using either a filter bank or,
more commonly, the fast Fourier transform (FFT). In the frequency domain the phase of
the signal is then altered according to the pitch shifting factor. Afterwards the shifted
signal is transformed back to the time domain, either by using an oscillator bank or the
inverse fast Fourier transform, together with the overlap-add technique to concatenate the
resulting blocks. This blockwise processing is done continuously by using an overlapping,
sliding window. Working in the frequency domain, the phase vocoder approach results
in a higher latency and increased computational complexity, but is suitable for
signals containing multiple sound sources or complex signals like music. It is very often
used for offline transformations in digital audio workstations, although optimizations and
the increasing speed of computers in the recent decades have also made phase vocoders
suitable for real-time use. Over time the phase vocoder algorithms have seen numerous
general improvements and adjustments for special applications [45]. One of the most
prominent weaknesses of the phase vocoder is its perceptible artifacts, often described as
“phasiness” or “loss of fullness”. These artifacts stem from phase incoherences between
the overlapping blocks, which can be mitigated through additional processing [46].
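The role of the phase can be illustrated with the frequency estimate that phase vocoders commonly derive from the phase difference between two overlapping blocks. The following is a minimal sketch of that single step and not a complete phase vocoder; the sampling rate, block size, hop size and test tone are assumptions chosen for illustration, and a naive single-bin DFT stands in for the FFT.

```python
import cmath
import math


def windowed_bin(x, start, size, k):
    """DFT bin k of the Hann-windowed block x[start:start + size]."""
    acc = 0j
    for n in range(size):
        w = 0.5 - 0.5 * math.cos(2 * math.pi * n / size)
        acc += w * x[start + n] * cmath.exp(-2j * math.pi * k * n / size)
    return acc


def bin_frequency(x, k, size, hop, fs):
    """Estimate the frequency in bin k from the phase advance between two blocks."""
    p0 = cmath.phase(windowed_bin(x, 0, size, k))
    p1 = cmath.phase(windowed_bin(x, hop, size, k))
    expected = 2 * math.pi * k * hop / size  # phase advance of the bin centre
    deviation = (p1 - p0) - expected
    deviation = (deviation + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
    return (expected + deviation) * fs / (2 * math.pi * hop)


fs, size, hop = 48_000, 1024, 256  # assumed analysis parameters
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(size + hop)]
k = round(1000 * size / fs)  # nearest bin; its centre lies at about 984 Hz
estimate = bin_frequency(tone, k, size, hop, fs)

# For resynthesis, the phase of this bin is advanced according to the shifted frequency.
factor = 2.0
phase_increment = 2 * math.pi * factor * estimate * hop / fs
print(round(estimate, 2), round(phase_increment, 3))
```

The estimate recovers the frequency of the tone rather than the centre frequency of the bin, which is what allows the phase of each bin to be advanced consistently when the frequencies are scaled.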
As a matter of fact the basic scientific principles of pitch shifting have stayed relatively
unchanged in the past decades. Most of the improvements to the algorithms in the past
decade have been developed by private companies behind closed doors outside of academia.
The music production industry has developed specialized high quality implementations
based on the algorithms described above, with the goal of producing the best sound with
as little harmonic distortion as possible. As part of these implementations optimizations,
such as reduced amplitude modulation artifacts or preserved exponential drum decays,
were introduced. These implementations still have trouble with scaling factors of large
magnitudes, as they were not designed for that purpose. In recent years, first research
efforts have been made towards a new approach to pitch shifting using machine learning
techniques. Neural networks based on the Google WaveNet or NSynth [47] structures
have been successfully employed to shift pure sine waves by a fixed factor [48]. The
first promising results of using another neural network to correct the pitch of detuned
solo singing voices have also been demonstrated [49]. These techniques are still in their
infancy though and require more research.
2.4 State of the Art in Ultrasonic Hearing
A number of devices made specifically to enable users to hear ultrasound are commercially
available. Their primary intended use is animal observation in scientific,
commercial, conservation or hunting contexts. The most common devices for ultrasonic
hearing are so-called “bat sonars”, made for detecting or tracking bats based on their
calls. Skilled users are even able to classify different bat species by using these calls [2].
Bat sonars are often constructed in a quite straightforward way. Usually they consist
of an ultrasonic microphone and a main device, which does the frequency shifting and
provides a headphone output. Modern devices built with digital processing offer more
advanced features such as recording capabilities or spectrogram visualizations. Simpler
versions using a fully analogue circuit are also available. Indeed the algorithm employed
by almost all bat sonars is a fairly basic frequency shifting algorithm and is much simpler
than the pitch shifting algorithms described in section 2.3.2. It uses a process known as
“heterodyning”, which is very similar to the way FM radios work [40]. Essentially the
ultrasonic signal is multiplied with a sine wave signal and filtered through a bandpass.
This process transforms it into the audible range by shifting the signal, thus destroying
the overtone relationship as described in section 2.3. The rather basic construction
and working principles stand in strong contrast to the price these devices are sold
for. Their prices start in the hundreds of euros and go well into the thousands, even
though the same functionality can be achieved through much cheaper means. Soldering
kits for the analogue circuit and detailed instructions on how to use small single-board
computers like the Raspberry Pi, or various microcontroller boards, for this purpose are available
for purchase¹⁶. There are even some open hardware projects like the “AudioMoth”¹⁷
developing dedicated hardware. However, the technique employed to perform the pitch
alteration is almost always heterodyning.
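The heterodyning principle can be sketched as follows. The sampling rate, the 45 kHz test tone and the 42 kHz mixing frequency are assumptions made for illustration, not the parameters of any actual device, and a simple moving average is used as a crude low-pass stand-in for the bandpass filter mentioned above.

```python
import math


def heterodyne(signal, fs, lo_hz, taps):
    """Multiply the signal with a cosine at lo_hz, then smooth with a moving average."""
    mixed = [s * math.cos(2 * math.pi * lo_hz * n / fs) for n, s in enumerate(signal)]
    # Moving average: keeps the difference frequency, attenuates the sum frequency.
    return [sum(mixed[i:i + taps]) / taps for i in range(len(mixed) - taps + 1)]


def amplitude_at(signal, fs, freq_hz):
    """Amplitude of a single frequency component (single-bin DFT)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / fs) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / fs) for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


fs = 192_000   # assumed sampling rate of the ultrasonic microphone
taps = 32      # assumed filter length
n_out = 1920   # 10 ms of output
ultrasonic = [math.sin(2 * math.pi * 45_000 * n / fs) for n in range(n_out + taps - 1)]
audible = heterodyne(ultrasonic, fs, lo_hz=42_000, taps=taps)

print(amplitude_at(audible, fs, 3_000))   # difference frequency 45 kHz - 42 kHz
print(amplitude_at(audible, fs, 87_000))  # sum frequency, strongly attenuated
```

The multiplication produces components at the difference and at the sum of the two frequencies; after filtering, mainly the audible 3 kHz component remains. Every input frequency is moved down by the same 42 kHz, which is why the overtone relationship is lost.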
Very recently, new advances have been made in the field by Finnish researchers, who
developed a prototype device with multiple microphones, spatial audio processing and a
pitch shifting algorithm based on the phase vocoder [50]. Their prototype allows test
subjects to localize ultrasonic sound sources almost as well as sources in the audible
range. In contrast to their prototype, which uses a specially developed plugin running in
a DAW on a laptop to do the multiple channel audio processing, the prototype developed
for this thesis is much more compact and portable. Since I am single-sided deaf and
have one cochlear implant, my prototype will produce a mono signal, which makes the
¹⁶ For examples see: “Make” magazine (German edition) 6/2020 p. 72 or 4/2021 p. 84
¹⁷ AudioMoth website: https://www.openacousticdevices.info/audiomoth - Accessed: 22.02.2022
proposed spatial localization not applicable. However, the pitch shifting algorithm used
will be similar in nature.
3 Extension of the Senses through Technology
The usage of tools by humans, to manipulate and shape the environment surrounding
them to their advantage, dates back to prehistoric times and is even regarded as one of
the facets of intelligence that sets humans apart from many other animals. Expanding
their capabilities through the application of tools and technology can also be regarded as
a fundamental human trait. It ties in deeply with the human desire to adventure, explore
and improve their situation as well as themselves. Additionally, humans have a long
and deeply rooted history of employing tools as instruments to learn, understand and
gain knowledge about their surroundings. In order to do so, humans have successfully
developed and applied a wide range of different technologies to enable them to extend
their senses into territories that are inaccessible to them through naturally provided
means. Besides the current tools and equipment, the prospect of utilizing implanted
devices for the purpose of sensory extension is gaining popularity as a futuristic possibility.
Transhumanism is one prevalent philosophical school of thought that concerns itself
with the future enhancement of the human body, the extension of the senses and the
prolongation of the human experience through modern technological means. While
many of the transhumanist ideas and arguments are influential on these topics, they are
often theoretical in nature. For the context of this thesis, the focus is shifted towards
practical implementations and experimental applications that are currently achievable.
These applied experiments are commonly consolidated under the name “biohacking”.
People pursuing these goals outside of established institutions sometimes call themselves
do-it-yourself (DIY) cyborgs. The term cyborg in this context refers to the merging
of humans and machines into an organism which is composed of both organic and
electromechanical body parts.
3.1 Historic Context
One of the historically most significant advances in the domain of sensory extension
was the expansion of the human eyesight through the development of the lens. Optical
lenses gather, focus or scatter light, depending on their cut, due to different angles of
refraction between different media like glass, water or air. Its discovery and subsequent
use led to a variety of different sensory extensions, among them glasses and later the
telescope as well as the microscope. The invention of the telescope in the 17th century
enabled astronomers, like Galileo Galilei and his peers, to observe and describe things
that few other people could have imagined seeing at the time. Observing phenomena,
such as craters on the moon’s surface, discovering moons around other planets or new
planets themselves were quite literally revolutionary at the time. Institutions that held
the interpretive authority at the time, like the Catholic Church or some of the reigning
aristocracy, rejected the conclusions that resulted from those new observations, as they
did not align with their view of the world. Since the observations were reproducible
by anybody who had access to the technology of a telescope, the new findings and
the consequences following them prevailed. The invention of the telescope had huge
implications for hunting, navigation and warfare. It also enabled a better understanding
of the universe, by observing the relationship between the earth and other objects in the
night sky. The resulting discoveries and their cultural impact permanently shaped
our understanding and view of the world. This historical example illustrates well that
advancements in broadening the way humans are able to perceive their environment can
have huge scientific and cultural impacts. Another related example from the optical
spectrum is the invention of the microscope. After it became popular in the 17th century,
scholars had the chance for the first time to examine structures and phenomena, that
are too small for human vision to resolve. Similar to the telescope, the invention of
the microscope opened up new worlds for science and technology to explore. Again,
this access had wide ranging consequences, which resulted in improvements for human
life in general. The advances into these small dimensions allowed scientists to discover
and study many things previously unknown, such as red blood cells, bacteria, insects or
mineral deposits in rocks. In turn, these discoveries and observations gave rise to entire
new fields and subcategories of science and medicine, which then led to the development
of pharmaceuticals, germ theory or precision manufacturing techniques, just to name a
few. On a more individual level, the lens was the key component in developing monocles
and later glasses, which helped people with visual impairments to improve their sight
and counter loss in vision [51]. These visual aids allowed humans, for the first time in
history, to counteract shortcomings of their primary sense. Today it is hard to imagine
a world without glasses and the social impact they had cannot be overstated. Being
able to correct refractive errors allows affected people to visually perceive many more
details, become more self-reliant and experience many joys in life they would have missed
otherwise. Once glasses became cheap and available enough for most people to obtain,
their perceived image started to change from a crutch towards a desirable fashion item
with many different styles. Nowadays, technological advances in material science have
produced a variety of different lenses, that can support the visual sense in a number of
different ways besides correcting refractive errors. The most popular use of specialized
lenses are sunglasses shielding the eyes and retina from overexposure and UV light.
Winter sport athletes as well as drivers in snowy conditions use lenses tinted yellow or
orange as well as polarization filters to improve the visibility of edges and contours in
the snow [52]. There are even specialized lenses that allow people with certain types of
color blindness to perceive some of the colors they are missing. A more recent example
of a new sensory extension would be the invention of the handheld infrared camera.
Such cameras work similarly to regular digital cameras for visible light, but have sensors
sensitive to infrared radiation. A false color image is generated from the captured data,
offering a picture of the surroundings’ heat signature. Temperature gradients, as well as
hot and cold areas, can be easily identified visually through differently coloured areas.
Infrared cameras have given researchers a view into the darkness of the night time world.
They have allowed them to study nocturnal animals and wildlife in ways previously
impossible with human eyesight or other cameras. Besides the gain in scientific knowledge,
infrared cameras have also had an impact on human life in more tangible ways, as they allow
firefighters to inspect extinguished houses for pockets of embers or hotspots, to ensure
they are permanently extinguished. This has proven to be far more reliable than keeping
a lookout for the sight of smoulders or waiting for rekindling. Infrared cameras can also
be employed while searching for missing or buried persons as well as identifying insulation
leaks for making buildings more energy efficient.
The historic examples mentioned above show that extending the human
sensory inputs through advances in observation tools and technology offers the possibility
of a deeper understanding of the environment and subsequent improvements in living
conditions. The more recent examples show that this notion continues to be true today.
In fact, every new step in observation technology enables new scientific discovery and the
opportunity for social progress. Most of the developed technologies extending the human
senses are aimed at the visual sense, because it is the sense that humans primarily rely on
to gather information about their surroundings. Some attempts at extending the other
senses or adding new ones have been made by individuals, as will be described in the
next section.
3.2 DIY Movement
As a result of the rapid miniaturization and mass production of silicon based semiconductors,
the chips, sensors and microcontrollers made from them have become very small
and cheap compared to 50, or even just 10, years ago. This continuous development has
made it feasible for interested individuals to design, build and experiment with electronics
in an affordable manner at home. They no longer have to depend on the funding and big
electronic labs of institutions like universities, research facilities or private companies.
The trend has lowered the end consumer price for a single off-the-shelf microcontroller
down to a few euros. Nowadays, getting your own circuit board design printed and
manufactured in one of the respective manufacturing countries often costs between
a couple of euros and a few tens of euros, depending on the size and complexity.
Famous open hardware projects like the “Arduino” or the “Raspberry Pi” made the
occupations of “hacking” and “making” affordable and helped to popularize them among
enthusiasts and interested people [53]. A flourishing scene of tinkerers, makers and
hackers has since evolved with both online and offline aspects, like conventions, fairs
and magazines as well as websites, tutorial videos and blogs. Principles, like sharing
knowledge and designs, helping each other learn, trying to understand the ins and outs of
your devices as well as providing access to cutting edge technology in a playful manner,
are integral elements of the movement. Creating, fixing and working with electronics,
instead of just consuming them, is another of the fundamental aspects.
Within this scene a loose subgroup of people, sometimes referred to as DIY cyborgs
or biohackers, is interested in exploring and working with the intersection of human
perception and technology. The recognition of the intricate connection between perception
and reality is deeply incorporated into their work. They have conducted a variety of
different practical experiments and trials on augmenting and extending the human senses.
This includes both devices worn externally and devices implanted inside the body in
the cybernetic style. It is important to realize that their motivation for sensory extension
is less about the gain of hard scientific knowledge, as it might have been in the past. For
them the extensions are more about connecting with the world in different ways and
experiencing new ways to perceive [54]. Shaping their reality through the alteration of
their perception by using technology is their pursued goal.
For the descriptions in the following paragraphs the focus is put specifically on devices
that are both implanted into the body and extend or augment the human perception.
Other experiments involving externally worn devices, mechanical devices or aesthetic
body modifications are not considered here. A comprehensive list and detailed review of
various different cybernetic enhancement technologies from both the medical and DIY
area can be found in a paper from 2017, including those intentionally left out here [54].
A few smaller cybernetic body modifications popular among the DIY cyborgs have
been carried out many times. Although it does not influence perception directly,
probably the best known and most popular one is the implantation of an NFC or RFID chip
under the skin. An indication illustrating their high demand is the fact that full kits
with all the required tools and accessories are sold online. The chip is usually encased in
a small bio-compatible glass vial and has the dimensions of about 2 mm x 10-14 mm. A
syringe is used to insert the implant, usually into the skin pocket between the thumb
and index finger, as can be seen on the X-ray image in figure 3.1. Different models with
varying ranges and different operating frequencies are in use, with some of the more recent
even including an LED that can blink underneath the skin. Since the technology has
been used to tag animals for decades, it can be considered reasonably safe and matured.
The first implantation of an RFID chip into a human arm was carried out on 24 August
1998 in the United Kingdom, which means over the past decades people had time to
gather experience and further refine the procedure [55]. The applications supported by
this type of implant are similar to those of any other NFC or RFID chip. For example, it can be
used as an electronic identifier to unlock doors, operate machines, control smartphones
or to authorize transactions of cryptocurrency wallets as well as some other payment
cards. An increasing number of public transport providers and businesses like night
clubs (for example in Rotterdam or Barcelona) or fitness studios also use NFC tags for
access control [55]. Alternatively, small amounts of data, in the order of a few hundred
to a few thousand bytes, can also be stored on the implant. The most common uses
of this feature are storing important medical information or a digital business card on
the implant, which can be transferred by simply putting the hand on somebody else’s
smartphone. Very recent developments in the field have seen a more advanced use of
NFC technology’s capabilities. A Swedish company is developing an implant that is able
to measure the body temperature, when prompted through a smartphone, by utilizing
the energy harvesting feature of the NFC antenna. If the results of their research
verify that it works as expected, it would mark the next step in NFC implant technology,
transitioning from access control and data storage towards active sensing. Clinical trials
seem to be still ongoing and there are unfortunately no publications available yet¹⁸.
Although no official statistics exist, it is safe to assume that the second most popular
implant, and one that extends the senses, is a small magnet implanted under the skin.
Such magnets are usually implanted into the index or middle finger, for right handed people often
in the left hand and vice versa. Careful consideration has to be given to the materials and
coatings involved, as neodymium, a component in many high strength magnetic alloys,
is known to have toxic properties [56]. A bio-compatible coating is thus necessary, as
corrosion or rejection by the body is likely to lead to infections or other complications.
The magnetic implant has two main functions: Firstly, it is possible to use the finger
tips to pick up small ferromagnetic objects, like screws, nuts or paper clips dropped
into hard to reach places (as can be seen in figure 3.2). Secondly, it enables the wearer to sense
and feel magnetic fields through what is described as a tingling, vibrating sensation in
the finger tips. This offers the possibility to perceive the orientation of magnetic field
lines around ferromagnetic objects, the rotation of electric motors or whether a cable
carries a strong current. Aside from the pure perceptive part, scientific research on the
feasibility of using such a magnetic implant as a tactile human-machine interface has
been carried out [57]. The researchers concluded that the implants could feasibly be a
“new channel of information to the brain”. A more detailed study comparing the different
electric stimuli, on implanted magnets versus externally worn magnets for the use in
¹⁸ Dsruptive Subdermals website: https://dsruptive.com/ - Accessed: 22.02.2022
Figure 3.1: X-ray image of an RFID implant. Image source: dangerousthings.com
Figure 3.2: Magnetic implant lifting objects. Image source: dangerousthings.com
Figure 3.3: Sensor tattoo measuring pH, glucose and albumin levels. Image source: [6]
Figure 3.4: North Sense implanted onto the chest. Image source: cyborgnest.net
human-machine interfaces, has also been conducted [58]. The researchers were able to
measure an increased sensitivity to stimuli by using implanted magnets compared to
the externally worn ones. Their research suggests that magnetic implants could see an
additional application in the future as an interface between humans and technology,
providing an alternative information channel to transmit text messages, GPS driving
instructions or any other data directly to the brain. One such conceivable application
that has seen some early testing is the coupling of the magnetic implant to an ultrasonic
range finder, to offer a new sense of perceiving distances and aiding navigation [57].
A new and upcoming topic that has seen some promising research in the last few years
is so-called “functional tattoos” or “tattoo sensors”, which are applied under the skin
in a similar fashion to regular tattoo ink. A specially engineered chemical composition
allows them to change color depending on certain environmental factors, thus displaying
the presence or absence of those factors (see figure 3.3). Due to their novelty and chemical complexity
they have not seen widespread use in the DIY cyborg community. Researchers have
demonstrated different tattoos, both in lab settings and in vivo in human skin [59].
Different chemical complexes have been developed to measure and display all kinds of
different factors, among others: UV exposure of the skin, the pH and glucose levels inside
the body or even the concentration of single proteins like albumin [6] [59]. The new inks
currently still suffer from stability, durability and reversibility issues, but it is safe to
assume that sensor tattoos will find users and applications in the DIY community, once
they have matured enough to be safe and available.
The so-called “North Sense” is another extension of perception through the addition
of an entirely new sense. It indicates to the person using it when they are facing towards
the magnetic north pole, usually through vibrations. A number of different configurations
of devices providing this sense have been made by different individuals and institutions.
They were integrated into smartwatches, worn as a belt around the chest, the hip or as
an armlet. An externally implanted version that is fixed to the chest with two metal
rods similar to piercings, pictured in figure 3.4, is also available. Thanks to the device’s
simplicity, its functionality of providing a fixed spatial reference point is intuitively
understood by the wearers. The North Sense has been reported to be a tremendous
assistance for orientation, after an acclimatization and training phase was completed. In
research trials conducted by Osnabrück University some wearers reported that the
device gave them “a new sense of spacial perception” [60]. Sighted wearers are quoted
as having a strong feeling of certainty of “never getting lost again” [60]. The new
sense offers a constant fixed point, which can assist in finding one’s way around places
without losing the feeling of one’s own location, especially in unfamiliar places. For blind
wearers, improved orientation and mobility was also observed. Notably, an improved
understanding of the spatial relationship between places was reported by blind wearers,
as well as an increased feeling of security when navigating through open or outdoor
spaces that offer few other orientation cues [61].
Despite their futuristic appeal, DIY implantations should not be taken lightly, as
they are after all still surgical interventions with all their associated risks and side effects,
especially when executed incorrectly. The two primary threats to a successful surgery are
infection, either of the wound or of the embedding tissue, and rejection of
the implant by the body, in which the immune system attacks it. The first problem can
be mitigated by appropriate sanitation and surgical procedures. However, it is often very
difficult, if not impossible, to find a qualified surgeon and a proper medical facility. As
there is usually no medical indication for these cybernetic surgeries and they are not yet
socially accepted in the way plastic surgery is, requests are usually rejected outright, making
it difficult to find qualified personnel. The second issue, rejection by the body, is usually
connected to implant components or materials not being biocompatible. Since the devices are
sometimes prototypes and there is no certification process comparable to that for medical
implants, this problem can be dangerous and hard to mitigate if not considered beforehand.
A few case reports of infections linked to DIY implants have already been published [62].
A few more technical factors come into play when an electric device is implanted into
the body. Once implanted, the devices are inaccessible without another surgery, making it
impossible to improve or repair them quickly. Since some of the devices are untested or in a
prototype phase, a malfunction or breakage of the device can also be a source of injury.
Due to their experimental nature, missing certification and lack of regulatory oversight
they can also cause issues in combination with other medical procedures involving electric
current or magnetic fields. For example, defibrillators, electrical muscle stimulation or
MRI procedures can be disrupted or prevented from functioning at all. Studies with NFC
implants inside MRI machines have shown that neither the implants nor the machines are
damaged, but the resulting images turn out slightly distorted [63]. Unfortunately there
are no similar studies yet for other combinations of implants and medical machinery.
Sometimes an impression of the experience of, or acclimatization to, the new perception
can be gained with a device worn outside the body. However, it seems that implants offer a
unique and more nuanced experience, as for example research with magnetic implants
showed [58]. Having a device implanted can also be quite different on an emotional level
from wearing it externally. Many wearers of cochlear implants, RFID chips,
pacemakers or any other implant for that matter specifically describe the feeling of the
implant “becoming part of their body” [64] [38], which is of course an experience that
many DIY cyborgs are looking to have. Nevertheless, it is important that all these aspects
are thoroughly considered before deciding on an implantation.
Besides these somewhat more common, smaller-scale implantations, most of which
are based on some kind of existing technology, there are a few individuals who have
taken a more extreme route and designed entirely new senses using self-designed devices.
Probably the most famous example is the artist and cyborg activist Neil Harbisson.
Harbisson was born with achromatopsia, a rare condition in which the cone cells
in the eyes are unable to perceive color, meaning he sees the world in different shades of
gray. In 2004 Harbisson started developing a system called “Eyeborg” with
the help of the arts instructor Adam Montandon, the software developer Peter Kese and
the developer Matias Lizana [65]. The Eyeborg initially consisted of a small camera, a
computer and a pair of headphones. The camera would register a color in front of it and
send it to the computer, which would translate the color into a sine wave assigned to
that color and then output it through the headphones. This way colors could be mapped
to sound, offering Harbisson a way to perceive them. Initially, the system could only
recognize 6 colors, but through subsequent upgrades the number was gradually increased
to 12, then 24 and 48 colors [65]. The current system maps the entire color wheel to
360 different tones with microtones in between them. Color saturation was linked to
the sound’s volume and, together with the lightness perception his eyes already granted
him, he was now able to perceive all aspects of color: hue, saturation and lightness [55].
Through subsequent upgrades the heavy computer was replaced by a small dedicated
chip and the headphones were replaced with occipital bone conduction bypassing the ear.
After nearly ten years of almost continuously wearing the Eyeborg externally, Harbisson
decided in 2013 to have it implanted into the back of his head. Taking inspiration from
nature, Harbisson decided to mount the camera on the tip of a flexible antenna which is
anchored to the back of his skull and reaches over his head to his forehead (as can be
seen in figure 3.5). Post-implantation there are in total three holes through his skin: one
for the antenna, one for relieving pressure and one for charging the device [65].
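The color-to-sound mapping of the Eyeborg described above can be illustrated with a small sketch. The source does not state which frequencies Harbisson's system actually assigns to which hues, so the mapping below is purely an assumption for illustration: the color wheel is spread evenly in pitch over one octave above a hypothetical base frequency of 220 Hz, and saturation is used directly as amplitude. The names `Tone` and `colorToTone` are likewise hypothetical.

```cpp
#include <cmath>

// Hypothetical mapping, NOT Harbisson's actual tuning: one octave above an
// assumed base frequency is spread evenly (in pitch) over the color wheel.
struct Tone {
    double frequencyHz; // pitch of the sine wave
    double amplitude;   // 0.0 (silent) to 1.0 (full volume)
};

// hueDegrees: position on the color wheel, wrapped into [0, 360).
// saturation: 0.0 (gray) to 1.0 (fully saturated), clamped to that range.
Tone colorToTone(double hueDegrees, double saturation, double baseHz = 220.0) {
    double hue = std::fmod(hueDegrees, 360.0);
    if (hue < 0.0) {
        hue += 360.0;
    }
    const double sat = saturation < 0.0 ? 0.0 : (saturation > 1.0 ? 1.0 : saturation);
    // Exponential spacing yields equal musical intervals per degree of hue.
    return Tone{baseHz * std::pow(2.0, hue / 360.0), sat};
}
```

Calling `colorToTone(180.0, 0.5)` would, under these assumptions, yield a tone half an octave above the base frequency at half volume. Lightness is deliberately left out, since according to the text it is already perceived through the eyes.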
Figure 3.5: Neil Harbisson with his antenna. Image source: wikimedia.org
Similar to cochlear implants or any other perception-altering changes, it takes the brain
some time to get used to the new input. Harbisson also suffered from headaches and
other rejection symptoms in the beginning. Improving the weight and wearing comfort
of the device before the implantation, together with his brain steadily getting more used
to it over time thanks to its neuronal plasticity, helped mitigate these effects. According
to Harbisson, he knew the acclimatization phase was nearing completion once he started
perceiving colors in his dreams. To advance his color sensing abilities beyond human
capabilities, Harbisson extended the Eyeborg to perceive both infrared radiation and
ultraviolet light. This enables him to know when a building is equipped with infrared
sensors like motion detectors or infrared cameras or if someone is using a remote control.
He also knows when a day is suitable for sunbathing or if he is in danger of getting a
sunburn. Notably Harbisson states that he is the first government-recognized cyborg,
as the British passport authority granted him permission to have his antenna appear
in his passport picture after a long and difficult struggle. Contrary to its rules of
not allowing any hair or head accessories or other “foreign” objects in the frame, the
decision effectively recognized his antenna as part of his body. He co-founded the Cyborg
Foundation¹⁹ after realizing that there are no laws or legal protections for cyborgs, for
example against law enforcement forcing him to remove his antenna, and that there is no
dedicated space for scientists and developers to research and discuss new technological
extensions of the senses. Harbisson does a lot of public speaking and presentations to
bring attention to and popularize the topic. His cyborg identity has also significantly
influenced his artwork. He uses his access to color through sound to translate music
or speeches into paintings as well as to “paint people’s voices” or compose chords based on
people’s faces.
¹⁹ Cyborg Foundation website: https://www.cyborgfoundation.com/ - Accessed: 22.02.2022
The other co-founder of the Cyborg Foundation is the Catalan dancer and
choreographer Moon Ribas. She has experimented with a few different externally worn sensors
like a 360° movement perception device worn as earrings or a glove that let her perceive
the exact speed of any movement. Ribas is perhaps best known for her seismic sense
implanted into her feet in 2013. The implant allowed her to perceive, through vibrations,
any earthquake above 1.0 on the Richter scale happening worldwide [65]. The intensity
of the vibration corresponds to the intensity of the earthquake. She has extended her
seismic sense to also receive moonquakes. Many of her music or dance performances
incorporated this connection of hers with the movements of the earth and moon. For
example in her performance piece “Waiting for Earthquakes” the structure as well as
all her movements are dictated by the movements of the earth. According to Ribas she
continued to perceive “phantom quakes” for a few months after she had the implants removed
in 2019 [66].
Figure 3.6: Manel De Aguas with his fins. Image source: wikimedia.org
A last example of a more advanced sensory extension is that of the Spanish cyborg artist
Manel De Aguas [66]. His implant called “Weather Fins” is inspired by
the fins of fish and consists of an inner and an outer part. The outer part is made up
of two rounded fins worn externally on either side of the head above the ears as can be
seen in figure 3.6. These fins are able to sense weather data like atmospheric pressure,
temperature and humidity. The implanted inner part sits in the back of his skull and
translates this data into sound, which is again transmitted through bone conduction via
the ears to the brain. De Aguas’ implant enables him to perceive the current weather,
to predict changes in the weather from the perceived variations and to sense his current
altitude.
3.3 Plasticity of the Brain
All of the devices described in the previous chapters, the DIY sensory extensions, cochlear
implants as well as other medical neuroprosthetics, are only able to achieve their proper
functionality because of the brain’s ability to adapt and learn to extract information
from these unfamiliar inputs. This crucial property is known as the “plasticity of the
brain” and has only been discovered relatively recently. The first experiments leading to
the discovery of this ability were conducted towards the end of the 1960s [67]. Brain
researchers from the United States repurposed a dentist’s chair which participants could
lie on. In the chair’s backrest 400 small pressure actuators were installed in a 20 by 20
grid [67]. An image filmed by a black and white camera pointed at objects on a table
was translated onto the matrix of pressure actuators pressing into the participants’ backs.
Thus the system was able to substitute a visual input with a tactile stimulus. With some
hours of training in the chair the blindfolded participants were able to perceive visual
information through the pressure points they felt on their backs [67]. Interestingly, blind
participants were also able to learn to recognize and interpret objects and relationships
between them. Even the recognition of only partially visible or overlapping objects and of
movements, as well as a form of depth perception with objects growing in size as they get
closer, was observed [67]. Another surprising discovery was that after some time in the chair
participants reported an “external localization of the stimuli”, meaning they felt that
the stimuli came from in front of the camera rather than from the pressure
on their backs [67]. This suggests that the vision substitution integrated seamlessly into
their regular perception and became an extension of their senses. The research field that
emerged from these experiments is known today as the study of “neuroplasticity”, with
“sensory substitution” being one of the tools to help and study patients who have suffered
neurological damage. Sensory substitution research today has expanded from tactile
stimuli to electric stimulation of the tongue [68]. The tongue offers several
advantages, like having a very fine spatial resolution and nerve endings positioned very
close to its surface [68]. The first commercial devices like the “BrainPort”²⁰ have become
available in recent years, but are still being actively developed and tested.
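The translation of a camera image onto the 20 by 20 actuator matrix of the tactile chair described above can be sketched as a simple block average followed by a threshold. This is only an illustration of the principle and not a reconstruction of the original apparatus: the image format (row-major 8-bit grayscale), the threshold of 128 and the choice that bright regions press the actuator are all assumptions, and the names `ActuatorGrid` and `imageToActuators` are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kGrid = 20; // 20 x 20 actuators, as in the chair's backrest

using ActuatorGrid = std::array<std::array<bool, kGrid>, kGrid>;

// image: row-major 8-bit grayscale pixels, width * height entries.
// Returns which actuators are pressed (true). Assumption: an actuator is
// pressed when the mean brightness of the image region it covers reaches
// the threshold. Images smaller than the grid yield an all-false result.
ActuatorGrid imageToActuators(const std::vector<std::uint8_t>& image,
                              std::size_t width, std::size_t height,
                              std::uint8_t threshold = 128) {
    ActuatorGrid grid{};
    if (width < kGrid || height < kGrid || image.size() < width * height) {
        return grid;
    }
    for (std::size_t gy = 0; gy < kGrid; ++gy) {
        for (std::size_t gx = 0; gx < kGrid; ++gx) {
            // Pixel block covered by this actuator.
            const std::size_t x0 = gx * width / kGrid;
            const std::size_t x1 = (gx + 1) * width / kGrid;
            const std::size_t y0 = gy * height / kGrid;
            const std::size_t y1 = (gy + 1) * height / kGrid;
            std::uint32_t sum = 0;
            std::uint32_t count = 0;
            for (std::size_t y = y0; y < y1; ++y) {
                for (std::size_t x = x0; x < x1; ++x) {
                    sum += image[y * width + x];
                    ++count;
                }
            }
            grid[gy][gx] = (sum / count) >= threshold;
        }
    }
    return grid;
}
```

With only 400 binary points the resulting "image" is very coarse, which makes the reported recognition of overlapping objects and depth cues all the more notable.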
So far findings from the sensory substitution experiments are only used therapeutically
to restore or reconstruct lost abilities, senses or body parts. Newer generations of
prosthetic limbs for example can be connected to nerve endings to provide control and
even feedback. Cochlear implants can restore some hearing capacity for deaf people
as discussed in section 2.2. There has even been research on so-called “brain-computer
interfaces”, where electrode arrays are connected directly to the brain in order to
perform tasks like driving a wheelchair or moving a robotic arm. The reason for the current
concentration on reconstruction is that it is more achievable and avoids the ethical
questions raised by extending human capabilities beyond their “natural” form.
However, it is almost certain that the same mechanisms and procedures will be used in
the future to extend our senses into new fields, similar to and beyond what the DIY
cyborgs are currently exploring. The brain’s ability to find patterns in streams of data is
remarkable and the limits of the brain’s data processing capabilities are not yet
fully known [69]. What is certain, however, is that the plasticity of the brain will play a key
role in a future where we are able to extend our senses technologically or have additional
electronic body parts.
²⁰ BrainPort manufacturer website: https://www.wicab.com - Accessed: 22.02.2022
Evolutionarily it makes a lot of sense that the brain is agnostic to where its inputs are
coming from. This way nature only had to develop the brain once and make sure it was
capable of handling any kind of input. The brain could then focus on finding patterns in
the input data and extracting information from those patterns, instead of reconfiguring for
every new input that developed [69]. The process of evolution was thus able to experiment
with a myriad of different input systems. In the animal world we can find many of these
developments, like heat perception in snakes, the highly developed sense of smell in
dogs, birds being able to perceive the earth’s magnetic field lines, ultraviolet vision in
bees, the 180° field of vision in lobsters as well as electric field sensors in some fish, just
to name a few [69].
One important thing to keep in mind is that the adjustment process enabled by the brain’s
plasticity is not instantaneous. Depending on its age as well as many other internal and
external factors the brain needs time to create and strengthen new neural pathways,
allowing it to extract patterns and information from the new inputs [69]. The process
is comparable to learning a new language, which at first requires a lot of attention and
concentration, but with more and more training becomes effortless and casual. This is
true for any sensory extension or substitution. A practical example of the training and
learning process of the brain is the familiarization period described in section 2.2 when
a wearer receives a new cochlear implant. While in the beginning only beeping tones are
perceived, with advancing training words and sentences emerge from the noise [38]. A
similar experience was also described by Neil Harbisson with his Eyeborg antenna, where
the sounds representing color became completely natural and embedded into his
regular perception the longer he wore it.
To summarize the subject, a quote from Paul Bach-y-Rita, founder of the field of
neuroplasticity: “We see with our brain, not with our eyes.”
4 Prototype project
This chapter will provide an in-depth look into the prototype built for this thesis. The
main goal of the prototype is to prove that it is possible to build a device which transposes
ultrasound into the audible range and streams it to the cochlear implant. The prototype
is meant to be assembled from readily available off-the-shelf components and to stay below
50 € in total component cost in order to be cheap enough for interested people to replicate
it. To further encourage replication the software written is well-documented and released
under an open source license. Both the hard- and software employed to manufacture the
working prototype are presented in this chapter, as well as an explanation of the digital
signal processing techniques used to pitch shift the ultrasonic signals into the human
audible range.
4.1 Hardware
The prototype is composed of two circuit boards which can be seen in figure 4.1. The
primary circuit board is a microcontroller which acts as the main processor and coordinates
the signal flow. A second, smaller circuit board is stacked underneath the primary one.
It is called the audio shield and is mostly concerned with audio input and output.
Figure 4.1: The two circuit boards of the prototype. The black microcontroller chip in the
middle of the top board is clearly visible as well as the audio shield board below it.
The microcontroller board on top is the so-called “Teensy” development board in version
4.1²¹. The Teensy series, developed and sold by Paul Stoffregen from www.pjrc.com, is a
series of USB microcontroller development boards popular in the hacking and making
community. Part of their popularity is based on the fact that they are fully compatible
with the large number of frequently used Arduino software libraries. At the same time
the Teensy offers much greater processing power, a smaller form factor and many more
analog and digital input/output pins than the Arduino boards²². They can simply
be programmed through a USB port and are able to emulate any kind of USB device
themselves, making them an ideal candidate for building a variety of custom computer
input or output peripherals. They are also regularly used to power LED installations,
DIY synthesizers, self-built MIDI controllers and interactive art installations, to provide
custom interfaces to old and unsupported technology or to develop small prototypes
like the one in this thesis, just to name a few popular use cases²³. The Teensy 4.1 has an ARM
Cortex-M7 as its central processing chip, running at 600 MHz while having a fairly
low power consumption. 7936 kilobytes of flash memory and 1024 kilobytes of RAM are
available for low latency data storage, as well as a built-in microSD card reader to access
up to 32 GB of slower external storage. The processor also has an integrated floating
point math unit providing hardware acceleration for both 32 and 64 bit float arithmetic
operations.
The audio shield²⁴ is an extension board to the Teensy that adds the capabilities to
input, manipulate and output high quality audio signals (24 bit, 96 kHz sampling rate,
on standard settings). For this project “revision D” is used, as it is compatible with the
Teensy 4.x series. The Teensy audio shield is built around the Freescale Semiconductor
SGTL5000 chip, which offers multiple different audio inputs and outputs [70]. A line
level stereo input and a separate line level stereo output are provided. The board also
has a soldered-on 3.5 mm stereo jack socket that is connected to the chip’s integrated
headphone amplifier, which proved to be very useful for connecting headphones for testing
during development. The chip also has a mono microphone input with a software
configurable gain, which was used as the primary input channel for the project. Besides
these analog channels the SGTL5000 also has a built-in analog-to-digital converter (ADC)
and digital-to-analog converter (DAC). These power a digital input and a digital output,
as well as enabling the conversion of audio between the digital and analog domain. The
audio shield is connected to the main Teensy board through I2S, a protocol specifically
designed to move PCM data between integrated circuits over a serial bus interface.
²¹ PJRC.COM LLC, Teensy® 4.1 Development Board: https://www.pjrc.com/store/teensy41.html - Accessed: 22.02.2022
²² Arduino Micro: https://store.arduino.cc/arduino-micro - Accessed: 22.02.2022
²³ PJRC.COM LLC, Projects Using Teensy: https://www.pjrc.com/teensy/projects.html - Accessed: 22.02.2022
²⁴ PJRC.COM LLC, Audio Adaptor Boards for Teensy 3.x and Teensy 4.x: https://www.pjrc.com/store/teensy3_audio.html - Accessed: 22.02.2022
Figure 4.2: The Knowles SPU0410LR5H-QB MEMS microphone (dimensions in mm).
Image source: [7]
The prototype has a third very small circuit board. Its main purpose is to house the
tiny microphone component, with its footprint of only 3 x 3,7 mm as seen in figure 4.2,
making it easier to handle. The microphone used is the SPU0410LR5H-QB manufactured
by the company Knowles Electronics [7]. It is a silicon diaphragm MEMS microphone in the
form of a surface mounted device (SMD). MEMS (microelectromechanical systems) are a
progression in manufacturing for miniaturizing components popularized in recent years. They
are usually created from silicon wafers using the same manufacturing techniques as
semiconductor electronics, thus making them incredibly small. Their form factor and high
versatility make them the ideal candidate for building sensors into smartphones or similar
devices. In comparison to the electret microphones widely used in the past, MEMS microphones
are much smaller in size, require less additional circuitry, are more resilient to
environmental conditions like temperature fluctuations, are manufactured to tighter
production tolerances and are much cheaper, making them superior in many regards. The
SPU0410LR5H-QB has an omnidirectional characteristic, a flat frequency response in the
audible range up to 10 kHz and features shielding against radio frequency and
electromagnetic interference.
Figure 4.3: Preliminary ultrasonic free field response normalized to 1 kHz of the Knowles
SPU0410LR5H-QB (taken from its datasheet [7]).
It is also capable of sensing acoustic waves in the ultrasonic range with a reasonably flat
frequency response between 40 kHz and 80 kHz, as can be seen in figure 4.3. Additionally
it is extremely efficient, drawing only 120 µA at 3,3 V, with a sensitivity of -38 dBV/Pa
and a signal-to-noise ratio of 63 dB. The output is supplied as an analog signal, which
can be connected directly to the SGTL5000’s microphone input. Commercially the
SPU0410LR5H-QB is used in devices like hearing aids, cellphones or laptops. It has also
seen a variety of uses in scientific or making contexts, like in urban soundscape
monitoring for mapping sound pollution [71] or in an open source recording system to
sample terrestrial wildlife [72].
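To put the microphone figures quoted above into more tangible units, the following small sketch converts the sensitivity from dBV/Pa into volts per pascal and computes the electrical power drawn from the supply. The conversion itself is the standard decibel relation for voltages; the function names are hypothetical and chosen only for this illustration.

```cpp
#include <cmath>

// Convert a sensitivity given in dBV/Pa into volts (RMS) per pascal.
// dBV is referenced to 1 V, so the voltage ratio is 10^(dB / 20).
double sensitivityVoltsPerPascal(double dbvPerPa) {
    return std::pow(10.0, dbvPerPa / 20.0);
}

// Electrical power drawn from the supply: current (A) times voltage (V).
double supplyPowerWatts(double currentAmps, double supplyVolts) {
    return currentAmps * supplyVolts;
}
```

For the values given in the text, `sensitivityVoltsPerPascal(-38.0)` comes out at roughly 12.6 mV for a sound pressure of 1 Pa (which corresponds to 94 dB SPL), and `supplyPowerWatts(120e-6, 3.3)` at roughly 0.4 mW.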
[Diagram: MEMS Mic → ADC (SGTL5000) → I²S → Teensy CPU (Pitchshifting, Recording) → I²S → DAC (SGTL5000) → Analog Output → Cochlear Implant]
Figure 4.4: The full signal flow through the hardware. Analog steps marked in yellow and
digital steps colored blue.
For clarification, the full audio signal flow through the prototype is outlined once more
below. An accompanying diagram of the signal flow can be seen in figure 4.4. First
the MEMS microphone detects the acoustic sound waves and translates them into an
analog electrical signal. This signal is then digitized by the ADC inside the SGTL5000
chip. From there the signal is transferred to the main Teensy chip using the I2S protocol.
On the Teensy microcontroller the actual pitch shifting as well as the optional
recording is done. Afterwards, the procedure is reversed and the processed signal is
transferred back from the Teensy to the SGTL5000 through the I2S interface. Then the
DAC inside the SGTL5000 creates an analog audio signal which is output through
the headphone jack. The audio signal is then streamed to the cochlear implant.
Different ways of streaming the resulting audio signal to the cochlear implant were
evaluated. In the end the most straightforward method of utilizing an already existing
accessory was chosen. The manufacturer Cochlear offers an accessory called the
“Wireless Mini Microphone 2+”²⁵, often simply referred to as “MiniMic”. The MiniMic
is essentially an external microphone for the cochlear implant which can be worn by a
conversation partner in noisy environments like restaurants. It streams wirelessly to the
cochlear implant to improve understanding of the conversation. Another frequent use
case of the MiniMic is lectures, where the speaker is usually far away from the listener
in an often large, echoic room. The speaker can clip on the MiniMic and their words can
then be perceived clearly by the implant wearer. Additionally the MiniMic has a stereo
jack socket on the bottom to input audio from other sources. This possibility was used
for the prototype to connect the output of the audio shield to the input of the MiniMic,
thus streaming the processed audio signal directly to the cochlear implant. Another
option that was investigated thoroughly was to stream the audio directly using Bluetooth
without the need for any additional accessories. Recent generations of smartphones have
extended the already existing possibility to stream audio via Bluetooth to hearing aids
to include cochlear implants as well. Since modern sound processors already have this
capability the idea was to use the same technology to stream from the prototype
straight to the implant. Unfortunately the underlying technology is highly proprietary
in nature and technical information about it is scarce. Both big smartphone operating
system manufacturers, Apple and Google, have developed their own closed system based
on extensions to the Bluetooth Low Energy standard. The Android system is called
“Audio Streaming for Hearing Aids”²⁶ and the name for the iOS system is “Made for
iPhone”²⁷. Besides licensing issues that were not even taken into account here, the
proprietary technology makes it exceedingly difficult to connect to the implant with
custom Bluetooth hardware. Existing open source Bluetooth stacks would have to be
heavily extended with functionality that is not publicly documented. As both Google and
Apple want to sell licenses to hearing aid and cochlear implant manufacturers there is no
incentive for them to open up their technology. So for feasibility reasons the streaming
approach using the MiniMic was chosen.
²⁵ Wireless Mini Microphone 2+ product page: https://www.cochlear.com/de/de/home/products-and-accessories/our-accessories/true-wireless-devices - Accessed: 22.02.2022
The sound processor used for developing, testing and evaluating the prototype is the
latest cochlear implant model of the manufacturer “Cochlear”. It is called “Nucleus 7”
and has the model number “CP 1000”. Compatibility between this version of the sound
processor and the MiniMic accessory used for connecting to the prototype is assured
by the manufacturer. The actual implant in use was the “Nucleus CI532” by the same
manufacturer, which is equipped with 22 electrodes. Note that only one cochlear implant is
used in the whole setup, as I am single-sided deaf.
As mentioned before, the price for the prototype’s hardware was intentionally kept below
50 €. This limit was set to lower the barrier to entry and increase the likelihood of
reproduction of the prototype by interested people. The goal was achieved by using only
standard off-the-shelf components that are readily available as well as already popular
and widely used in the making and hacking community. Pjrc.com sells the Teensy 4.1 for
26,85 $ and the audio shield for 13,75 $. Mouser.de (a popular electronics component
reseller) sells one SPU0410LR5H-QB microphone for 0,82 $. Adding those prices up
brings the hardware cost to a total of 41,42 $ or 34,90 €, thus staying well below the 50 €
goal.
²⁶ Audio Streaming for Hearing Aids website: https://source.android.com/devices/bluetooth/asha - Accessed: 22.02.2022
²⁷ Made for iPhone website: https://mfi.apple.com/ - Accessed: 22.02.2022
4.2 Software
The software running the prototype is written in C++, using the open source software
PlatformIO, an integrated development environment tailored for cross-compiling code
on a desktop computer for an embedded system²⁸. It is popular in the making and
hacking community and comes with support for many embedded targets including the
Teensy. The compiled binary is flashed onto the Teensy using the Teensy Loader software
and a standard USB cable.
For audio manipulation the library “Audio” by Paul Stoffregen is used²⁹. It interfaces
with the SGTL5000 chip on the Teensy audio shield and provides a node based audio
graph for processing and routing audio data. The audio graph structure is defined at
compile time by instantiating and connecting nodes. At run time audio buffers are
generated or manipulated in these nodes and passed on to the next node through the
connections according to the graph layout. Since the audio buffers have a size of 128
samples the latency is extremely low, at around 3 ms at a sampling rate of 44100 Hz.
Besides the support for different audio chips the library is quite feature rich, offering
many different oscillators, synthesizers, effects, filters, analysis tools as well as recording
and mixing capabilities. Since each node is at its core a C++ class with an update()
method for processing, it is easy to extend the library by adding your own class with the
desired functionality. This object oriented programming style was followed to implement
the pitch shifting algorithm used in the prototype and integrate it with the library. How
the algorithm works in detail is described in the following section 4.3. The prototype’s
main execution loop follows the usual Arduino design pattern of a setup() and a
loop() function.
As mentioned, the prototype has the possibility to record audio by saving it on a
microSD card through the microSD card reader which is integrated into the audio shield.
To access and write onto the card the Arduino library “SD”³⁰ is used. Because of
its interoperability and simplicity the uncompressed WAVE format was chosen as the
output file format. A dedicated class³¹ was developed as an extension to the
audio library implementing the RIFF WAVE file specification [73] for PCM data and
handling the byte order (endianness) of both the data and the platform correctly. Being
able to save both the ultrasonic audio data as well as the pitch shifting algorithm’s output
for later comparison and spectral analysis on a computer was a tremendous help during
development. Besides that, it also allows saving interesting sounds for later review.
²⁸ PlatformIO Website: https://platformio.org/ - Accessed: 22.02.2022
²⁹ Code repository for the library “Audio” by Paul Stoffregen: https://github.com/PaulStoffregen/Audio - Accessed: 22.02.2022
³⁰ Code repository for the library “SD” by Arduino: https://github.com/arduino-libraries/SD - Accessed: 22.02.2022
³¹ Code directory for the “WavFileWriter” class: https://github.com/Foaly/UltrasonicHearing/tree/master/lib/WavFileWriter - Accessed: 22.02.2022
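To illustrate what writing a RIFF WAVE file for PCM data with explicit byte order handling involves, the following sketch builds the canonical 44 byte header into a byte buffer. It is not the thesis's “WavFileWriter” class, whose interface is not shown in this section; `putLE`, `putTag` and `makeWavHeader` are hypothetical helpers. All multi-byte fields are written byte by byte in little-endian order, so the result does not depend on the endianness of the platform running the code.

```cpp
#include <cstdint>
#include <vector>

// Append the lowest `bytes` bytes of `value` in little-endian order,
// independent of the host's own byte order.
void putLE(std::vector<std::uint8_t>& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFFu));
    }
}

// Append a four character chunk identifier such as "RIFF".
void putTag(std::vector<std::uint8_t>& out, const char* tag) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<std::uint8_t>(tag[i]));
    }
}

// Build the 44 byte header of an uncompressed PCM WAVE file.
// dataBytes is the size of the sample data that follows the header.
std::vector<std::uint8_t> makeWavHeader(std::uint32_t sampleRate,
                                        std::uint16_t channels,
                                        std::uint16_t bitsPerSample,
                                        std::uint32_t dataBytes) {
    std::vector<std::uint8_t> h;
    const std::uint32_t blockAlign = channels * bitsPerSample / 8u;
    putTag(h, "RIFF");
    putLE(h, 36u + dataBytes, 4);         // size of everything after this field
    putTag(h, "WAVE");
    putTag(h, "fmt ");
    putLE(h, 16, 4);                      // size of the fmt chunk for PCM
    putLE(h, 1, 2);                       // format tag 1 = PCM
    putLE(h, channels, 2);
    putLE(h, sampleRate, 4);
    putLE(h, sampleRate * blockAlign, 4); // bytes per second
    putLE(h, blockAlign, 2);              // bytes per sample frame
    putLE(h, bitsPerSample, 2);
    putTag(h, "data");
    putLE(h, dataBytes, 4);
    return h;
}
```

As an example with assumed values, `makeWavHeader(192000, 1, 16, n)` would describe a mono 16 bit recording at 192 kHz with `n` bytes of sample data. On a device that records continuously the final data size is typically not known in advance, so the two size fields would have to be rewritten once recording stops.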
According to the SGTL5000 data sheet [70] the highest supported sample rate is 96
kHz, which is perfectly acceptable for regular audio, but not sufficient for recording
ultrasound. Following the Nyquist–Shannon sampling theorem the highest representable
frequency is half the sample rate. Since we would like to resolve frequencies of around 100
kHz a higher sample rate is required. The data sheet also mentions that the SGTL5000’s
master clock can use an internal phase-locked loop (PLL), which is in turn used to derive
all internal audio clocks [70]. This PLL can be set externally from the Teensy to an
arbitrary value, modifying the derived sampling rate. By overclocking the audio chip in this
way, sample rates of 176,4 kHz and 192 kHz were tested successfully, meaning frequencies
up to 88,2 kHz and 96 kHz could be sampled. Setting sample rates higher than the ones
marked as supported enabled ultrasound hearing and recording. The proper operation
of the audio chip, despite the overclocking, could be verified using a function generator
and is documented in section 5.1.
All source code written for the prototype during the course of this thesis is licensed
under version 3.0 of the GNU General Public License (GPL)³². In a nutshell this open
source license allows everyone interested to run, study and modify the algorithm and
source code, as long as they release their changes under the same license. The GPL
was chosen to ensure that the code stays free and accessible to the public. The
source code as well as its documentation can be found in an online repository³³. Besides
the main algorithm and the supporting utility functions, both unit and integration tests
were written to verify the correct operation of the code. These tests were also used for the
technical evaluation as described in section 5.1. The code developed over the course of
the thesis, especially the “WavFileWriter” and the pitch shifting algorithm, is planned
to be integrated back into the Teensy’s Audio library to give others easier access to it.
4.3 Algorithm
The following section focuses on explaining the pitch shifting algorithm in detail.
Essentially the algorithm transposes the incoming ultrasonic signal down into the desired
audible frequency range and thus forms the heart of the prototype’s software stack. As
described in section 4.2 the audio signal flow is managed by an audio graph structure.
The nodes of the graph are C++ classes following a certain design pattern and are
each dedicated to a specific task. The pitch shifting algorithm is integrated into the
existing audio graph structure by implementing it as a custom C++ class complying
with the required pattern. All calculations and data manipulation necessary for the pitch
shifting are contained in the PitchShift class³⁴. This keeps the algorithm modular
and loosely coupled to the rest of the code.
³² GPL 3.0 license text: https://www.gnu.org/licenses/gpl-3.0.html - Accessed: 22.02.2022
³³ UltrasonicHearing source code repository: https://github.com/Foaly/UltrasonicHearing - Accessed: 22.02.2022
To better visualize the intended functionality of the pitch shifting algorithm, figure 4.5
illustrates the distribution of tones in the frequency spectrum. The blue line connects all
tones in the twelve-tone equal temperament and visualizes how the frequency interval
doubles with each octave, resulting in an exponential curve. The audible range from 20 Hz
to 20 kHz is shaded with a green background. Since the prototype is running at 192 kHz the
highest frequency resolvable by the prototype is 96 kHz. Below that boundary the closest
semitone is F♯12 at 94718.6 Hz. The two ultrasonic octaves below that note, down to F♯11
(47.36 kHz) and further down to F♯10 (23.68 kHz), are meant to be transposed down into the
audible range by the algorithm. Since the prototype is designed to be used together with a
cochlear implant, the target frequency has to lie in the range where the cochlear implant
has the highest frequency resolution and thus provides the highest fidelity. After some
trial and error it turned out that the sweet spot between the algorithm’s frequency
resolution and the cochlear implant’s frequency resolution for the lower bound is the tone
F♯6 (at 1480 Hz). This means that the tone F♯10 is transposed down to F♯6. The resulting
transposition of 4 octaves down corresponds to a scaling factor of 2⁻⁴ or 0.0625.
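The note frequencies and the scaling factor quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper names note_frequency and shift_factor are made up for this example, and the reference pitch A4 = 440 Hz is taken from figure 4.5.

```python
A4_HZ = 440.0  # reference pitch of the twelve-tone equal temperament


def note_frequency(semitones_from_a4):
    """Frequency of the note lying the given number of semitones above A4."""
    return A4_HZ * 2.0 ** (semitones_from_a4 / 12.0)


def shift_factor(octaves):
    """Frequency scaling factor for a transposition by whole octaves."""
    return 2.0 ** octaves


f_sharp_6 = note_frequency(21)             # F#6 lies 21 semitones above A4
f_sharp_10 = note_frequency(21 + 4 * 12)   # four octaves higher
f_sharp_12 = note_frequency(21 + 6 * 12)   # six octaves higher
factor = shift_factor(-4)                  # four octaves down

print(f"F#6  = {f_sharp_6:.1f} Hz")
print(f"F#10 = {f_sharp_10:.1f} Hz")
print(f"F#12 = {f_sharp_12:.1f} Hz")
print(f"factor = {factor}, F#10 shifted = {f_sharp_10 * factor:.1f} Hz")
```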
[Plot omitted: octaves A1 to A12 (A4 = 440 Hz) plotted against frequency in kHz, from 0 to 110 kHz.]
Figure 4.5: Octaves and their corresponding fundamental frequency in a twelve-tone equal
temperament. The audible range is marked with a green background.
³⁴ Code directory for the “PitchShifter” class: https://github.com/Foaly/UltrasonicHearing/tree/master/lib/PitchShifter - Accessed: 22.02.2022
The pitch shifting algorithm developed for this thesis is in its basic structure very
similar to that of a state-of-the-art phase vocoder as described in section 2.3.2. In
essence the incoming audio signal is transformed from the time domain to the frequency
domain using the fast Fourier transform (FFT). The resulting magnitude and phase
components are then analysed, manipulated and scaled along the frequency axis according
to the scaling factor. Afterwards the processed components are resynthesized into a time
domain signal using the inverse fast Fourier transform (iFFT) [44]. In order to support a
variety of different FFT frame sizes the input audio data first has to be decoupled from
the audio engine’s block size. The Teensy audio library has a fixed block size of 128
samples and calls the PitchShift::update() method each time a new audio block becomes
available from the ADC. At the start of the method
these input blocks have to be stored in a class member buffer until the desired FFT
frame size is reached. Since oversampling is applied to the input sequence, not only
the FFT frame size has to be taken into account, but also the hop size between frames.
Oversampling FFT frames is a standard DSP procedure and is done by overlapping
consecutive frames by a certain number of samples. The key benefit of oversampling is a
decrease in overall latency of the system, as less time has to pass before a new frame can
be processed. Of course this happens at the expense of a higher computational cost, as
the processing loop is executed more often. These trade-offs have to be balanced, but
having an acceptably low latency is quite important for using the device as a sense. In
the case of this algorithm oversampling also helps to improve the frequency resolution.
When calculating the true frequency of the FFT bins later on, the oversampling shortens
the time passed between two consecutive frames, thus resulting in a more precise phase
derivative. The FFT size is set to 2048 samples and an oversampling factor of 4 is used for
the prototype, resulting in a hop size of 512 samples and consecutive frames overlapping
by 75%. Only once enough blocks have accumulated are the actual transformation and
processing steps executed on the buffer storing the appropriate number of previous
input blocks. This has the unfortunate but unavoidable side effect that the algorithm’s
runtime may not be constant across subsequent iterations. A similar construct is in
place for the output blocks that are transmitted back to the audio engine, which are
then forwarded to the next nodes in the graph. Since a 128 sample sized block has to
be emitted on each PitchShift::update() call in order to avoid discontinuities
resulting in audible clicks or noisy distortions, the finished FFT output block has to
be spread out over the appropriate number of calls in a similar manner. Besides this
supporting work of preparing blocks of the desired length, the algorithm is roughly split
into three stages: analysis, transformation and resynthesis. A pseudocode description
of the algorithm can be found in listing 1. In the following paragraphs explaining the
algorithm the line numbers of this pseudocode listing will be referenced.
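The decoupling of the 128 sample blocks from the FFT frame size can be sketched as follows. This is a simplified Python illustration rather than the prototype’s C++ code: the class name FrameAccumulator is hypothetical, and the list copies used here for brevity would be replaced by preallocated buffers in a realtime implementation. With the settings named above a full frame becomes available on every fourth block.

```python
BLOCK_SIZE = 128                          # fixed block size of the audio engine
FRAME_SIZE = 2048                         # FFT frame size
OVERSAMPLING = 4                          # oversampling factor
HOP_SIZE = FRAME_SIZE // OVERSAMPLING     # 512 samples between frame starts


class FrameAccumulator:
    """Collects fixed-size blocks and hands out overlapping FFT frames."""

    def __init__(self):
        self.buffer = [0.0] * FRAME_SIZE  # always holds the newest FRAME_SIZE samples
        self.new_samples = 0              # samples received since the last frame

    def push_block(self, block):
        """Store one block; return a frame once per hop, otherwise None."""
        assert len(block) == BLOCK_SIZE
        self.buffer = self.buffer[BLOCK_SIZE:] + list(block)
        self.new_samples += BLOCK_SIZE
        if self.new_samples >= HOP_SIZE:
            self.new_samples -= HOP_SIZE
            return list(self.buffer)      # consecutive frames overlap by 75%
        return None


accumulator = FrameAccumulator()
results = [accumulator.push_block([float(i)] * BLOCK_SIZE) for i in range(16)]
frame_calls = [i for i, frame in enumerate(results) if frame is not None]
print(frame_calls)                        # calls on which the FFT stages would run
```

Only the calls listed in frame_calls carry the cost of the FFT stages, which is why the runtime differs from call to call.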
After enough samples for a new FFT frame have accumulated, the analysis stage begins.
All collected samples are first converted from 16 bit integers to the IEEE float data
type, both to avoid over- and underflow errors and to utilize the floating point math
hardware acceleration of the Teensy’s ARM Cortex-M7 processor. Next a window function
is applied to the input signal in order to reduce the effects of spectral leakage, sometimes
also referred to as spectral smearing (l. 1). By applying a window function to the signal
both frame boundaries are brought down to zero making the frame become quasi periodic.
This suppresses the energy generated by the discontinuities at the boundaries of the
input frame, thus reducing the spectral leakage and improving the frequency resolution.
The standard Hann window, widely used in many DSP algorithms, was chosen as the
precomputed window function because it offers a good compromise between the two
important properties of a window function: the main lobe width and level of the side
lobes. Of course applying a window function also introduces amplitude distortions to the
original signal. Therefore the amplitude correction factor for Hann windows depending
on the oversampling factor was also multiplied with the windowed input signal. Next the
windowed signal is run through the FFT function provided by the ARM DSP library
(l. 2). After the input signal is successfully transformed to the frequency domain, the
phase and magnitude of each FFT bin is calculated from its complex value (l. 4 - 5). To
shift or scale the frequency content of the signal it is not only necessary to derive the
strength of each component, determined by the bin’s magnitude, but also to resolve the
component’s frequency as precisely as possible. Due to the nature of the FFT algorithm
the frequency resolution is quantized to a fixed linear spacing described by the following
formula 4.1.
\[ \frac{\mathit{sampleRate}}{\mathit{frameSize}_{FFT}} = \mathit{binFrequencyWidth} \quad [\mathrm{Hz}] \tag{4.1} \]
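To put formula 4.1 into numbers for the prototype’s settings (192 kHz sample rate, 2048 sample frames), the short sketch below computes the bin spacing and the bin into which an arbitrarily chosen example tone of 30020 Hz falls. The function names are illustrative only.

```python
SAMPLE_RATE = 192_000   # Hz, sample rate of the overclocked audio chip
FRAME_SIZE = 2048       # FFT frame size in samples


def bin_frequency_width(sample_rate, frame_size):
    """Spacing between neighbouring FFT bins in Hz (formula 4.1)."""
    return sample_rate / frame_size


def nearest_bin(frequency, sample_rate=SAMPLE_RATE, frame_size=FRAME_SIZE):
    """Index of the bin whose centre frequency lies closest to `frequency`."""
    return int(frequency / bin_frequency_width(sample_rate, frame_size) + 0.5)


width = bin_frequency_width(SAMPLE_RATE, FRAME_SIZE)
bin_index = nearest_bin(30_020.0)
print(f"bin width: {width} Hz")
print(f"30020 Hz falls into bin {bin_index}, centred at {bin_index * width} Hz")
```

Without further processing the tone could therefore only be located to within about half a bin width, i.e. roughly 47 Hz.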
Fortunately however it is possible to calculate the signal’s true frequency within an
FFT bin by including the bin’s phase information. First the derivative of the bin’s
phase is approximated by subtracting the bin’s phase in the previous FFT frame from its
phase in the current one (l. 6). Then the phase increment expected for a component lying
exactly on the bin’s centre frequency is calculated by multiplying the bin number with
the phase advance per hop (l. 8) and subtracted from the derivative. The remaining
deviation is wrapped into the interval ]−π, π] (l. 9). Since the FFT frames overlap, the
influence of the overlap on the phase increment is also considered (l. 10). Finally the
much more precise true frequency of the bin is calculated (l. 11).
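The true frequency estimation (l. 6 to 11 of the pseudocode) can be tried out on a synthetic tone. The following Python sketch is an illustration under assumptions, not the prototype’s code: it evaluates a single DFT bin directly instead of running a full FFT, uses math.remainder as the phase wrapping function, and the 30020 Hz test tone is chosen arbitrarily so that it does not sit on a bin centre.

```python
import cmath
import math

SAMPLE_RATE = 192_000
FRAME_SIZE = 2048
OVERSAMPLING = 4
HOP_SIZE = FRAME_SIZE // OVERSAMPLING
OMEGA = 2.0 * math.pi * HOP_SIZE / FRAME_SIZE   # phase advance per hop and bin
BIN_WIDTH = SAMPLE_RATE / FRAME_SIZE            # formula 4.1


def bin_phase(signal, start, k):
    """Phase of DFT bin k for the Hann-windowed frame starting at sample `start`."""
    acc = 0j
    for n in range(FRAME_SIZE):
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / FRAME_SIZE)
        acc += window * signal[start + n] * cmath.exp(-2j * math.pi * k * n / FRAME_SIZE)
    return cmath.phase(acc)


def true_frequency(phase, previous_phase, k):
    """Refine the frequency estimate of bin k from two consecutive phases."""
    delta = phase - previous_phase                       # l. 6: phase derivative
    delta -= k * OMEGA                                   # l. 8: remove expected advance
    delta = math.remainder(delta, 2.0 * math.pi)         # l. 9: wrap into [-pi, pi]
    deviation = OVERSAMPLING * delta / (2.0 * math.pi)   # l. 10: deviation in bins
    return (k + deviation) * BIN_WIDTH                   # l. 11: true frequency in Hz


tone_hz = 30_020.0
signal = [math.sin(2.0 * math.pi * tone_hz * n / SAMPLE_RATE)
          for n in range(FRAME_SIZE + HOP_SIZE)]
k = round(tone_hz / BIN_WIDTH)                           # nearest bin, centre 30000 Hz
estimate = true_frequency(bin_phase(signal, HOP_SIZE, k), bin_phase(signal, 0, k), k)
print(f"bin centre: {k * BIN_WIDTH} Hz, estimated true frequency: {estimate:.2f} Hz")
```

Although the bin centre is 20 Hz away from the tone, the phase difference between the two overlapping frames recovers the tone’s frequency closely.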
The analysis stage is followed by the transformation stage which executes the pitch
shifting. To get rid of unnecessary FFT bins, which correspond to the frequencies in the
audible range, their magnitudes and true frequencies are overwritten with zeros. This
step behaves similarly to a spectral high-pass filter, but also doubles as an optimization,
as the affected bins don’t have to be considered in subsequent calculations. The cutoff
frequency is configurable from outside the class through a parameter and set to 20 kHz.
In theory this top-hat filter should produce ripples due to its sharp cutoff, resulting in an
audible ringing at 1250 Hz (the 20 kHz cutoff scaled by the factor 0.0625). In practice
however this ringing is not measurable. Next the
pitch shifting factor is applied to each bin to calculate their corresponding transposed bin
(l. 18). This way the spectrum is expanded or contracted accordingly. The magnitudes
are simply accumulated in their new bins thus preserving the energy of the spectrum
(l. 19). The true frequencies however are also multiplied with the pitch shifting factor
separately in order to calculate their correctly scaled new frequencies (l. 20). Finally the
results are stored in a new data structure to be synthesized afterwards.
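The transformation stage (l. 15 to 21 of the pseudocode) can be sketched in Python as shown below. The function name transform_bins, the cutoff_bin parameter and the bounds check for upward shifts are assumptions made for this example. Rounding is done by adding 0.5 and truncating, because Python’s built-in round() rounds halves to the nearest even number.

```python
def transform_bins(magnitudes, frequencies, pitch_shift_factor, cutoff_bin=0):
    """Move every analysed bin to its transposed position (hypothetical helper)."""
    size = len(magnitudes)
    synthesis_magnitudes = [0.0] * size       # l. 15
    synthesis_frequencies = [0.0] * size      # l. 16
    for i in range(cutoff_bin, size):         # bins below the cutoff are skipped
        index = int(i * pitch_shift_factor + 0.5)                            # l. 18
        if index >= size:
            continue                          # guard for upward shifts (assumption)
        synthesis_magnitudes[index] += magnitudes[i]                         # l. 19
        synthesis_frequencies[index] = frequencies[i] * pitch_shift_factor   # l. 20
    return synthesis_magnitudes, synthesis_frequencies


BIN_WIDTH = 93.75                             # Hz, 192 kHz / 2048 samples
magnitudes = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 2.0, 0.0]
frequencies = [i * BIN_WIDTH for i in range(8)]
shifted_magnitudes, shifted_frequencies = transform_bins(magnitudes, frequencies, 0.5)
print(shifted_magnitudes)
print(shifted_frequencies)
```

In this toy example of an octave shift down, the energy of bins 4 and 6 ends up in bins 2 and 3, and the stored frequencies are halved along with it.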
The following synthesis stage works very similarly to the analysis stage but in reverse.
First each phase of the scaled bins is reconstructed from their true frequency, by undoing
the phase increment calculations described above (l. 25 - 28). Afterwards the reconstructed
phase and magnitude are inserted into trigonometric functions to compute the real and
imaginary part of the complex value representing the contents of each bin (l. 29 - 30).
Next the inverse FFT is called on the finished complex frequency domain data sequence
(l. 32). The resulting real valued time domain sequence is windowed again using the same
Hann window as before (l. 33). To reconstruct a continuous signal the new windowed
time domain sequence is overlapped and added to parts of the sequence from previous
iterations, according to the oversampling factor. The resulting final audio signal buffer is
now ready to be forwarded to the next nodes in the audio graph in the blockwise manner
described above.
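The overlap-add step can be illustrated with the sketch below. It assumes a periodic Hann window applied twice (once before the FFT and once after the inverse FFT) and an oversampling factor of 4, and it uses a constant input so that only the effect of the windows remains visible. The helper names are illustrative and not taken from the prototype.

```python
import math

FRAME_SIZE = 2048
OVERSAMPLING = 4
HOP_SIZE = FRAME_SIZE // OVERSAMPLING


def hann(size):
    """Periodic Hann window."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / size) for n in range(size)]


def overlap_add(frames, hop_size):
    """Sum equally sized frames, each shifted by hop_size against the previous one."""
    frame_size = len(frames[0])
    output = [0.0] * (hop_size * (len(frames) - 1) + frame_size)
    for index, frame in enumerate(frames):
        offset = index * hop_size
        for n, sample in enumerate(frame):
            output[offset + n] += sample
    return output


window = hann(FRAME_SIZE)
# A constant input of 1.0 that is windowed for analysis and again for synthesis.
frame = [w * w for w in window]
output = overlap_add([frame] * 8, HOP_SIZE)
steady_state = output[FRAME_SIZE:-FRAME_SIZE]   # region covered by four frames
print(min(steady_state), max(steady_state))
```

Under these assumptions the overlapping squared windows add up to a constant gain of 1.5 wherever four frames overlap, which shows why an amplitude correction that depends on the oversampling factor is needed.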
Even without any significant input signal the algorithm’s output tends to be rather
noisy. Unfortunately this is an inherent property of the algorithm, originating from the
downscaling that is done along the frequency axis. Due to the high scaling factor a
number of FFT bins containing low magnitude noise are combined into a single bin
(roughly 16 input bins per output bin at a factor of 0.0625), resulting in noise with a
much higher magnitude. To counteract this phenomenon a simple noise gate was added to
the algorithm, so that FFT bins with a magnitude below a certain threshold do not
contribute to the final signal.
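A spectral noise gate of this kind could look like the following sketch. The function name, the threshold of 0.05 and the magnitudes are invented for illustration. The thesis does not state the threshold used in the prototype.

```python
def apply_noise_gate(magnitudes, threshold):
    """Zero all bin magnitudes below the threshold so they are not accumulated later."""
    return [m if m >= threshold else 0.0 for m in magnitudes]


# 16 bins of weak noise next to one bin carrying an actual signal component.
magnitudes = [0.01] * 16 + [1.0]
ungated_noise_sum = sum(magnitudes[:16])      # what one output bin would collect
gated = apply_noise_gate(magnitudes, threshold=0.05)
print(f"accumulated noise without gate: {ungated_noise_sum:.2f}")
print(f"accumulated noise with gate:    {sum(gated[:16]):.2f}")
```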
A strong focus on efficiency and performance was very important while developing
the pitch shifting algorithm in order to comply with the realtime audio constraints
of the prototype. Since the SGTL5000 audio chip is overclocked to run at 192 kHz,
to sample ultrasound up to 96 kHz, and the audio engine’s block size is fixed at 128
samples, every PitchShift::update() call, even in the worst case scenario, has a
hard limit to finish in under 666.67 microseconds (128 samples at 192 kHz) to avoid
dropouts resulting in audible discontinuities. To stay within these narrow time
constraints a variety of performance optimization techniques were applied. For example,
because memory allocation calls have an undefined execution time, all memory necessary
for the algorithm was preallocated outside of the update() call. Additionally all memory
used is aligned to the processor’s word size to improve memory access speeds. To
accelerate memory manipulation operations, copies and moves were executed blockwise.
Most ARM microcontrollers come with the CMSIS library (Common Microcontroller Software
Interface Standard) which includes efficient implementations of commonly used algorithms.
Version 5.0.1 of the CMSIS is included in the Teensy 4.1 software package. The
FFT subroutines of the CMSIS’s DSP sublibrary in version 1.5.1 were used in the
prototype. These FFT functions have a number of performance improvements like
considering the processor’s word size as well as using vectorized math operations. They
have proven to be the fastest FFT implementation available on the Teensy platform.
The CMSIS’s memory aligned, vectorized math operations were also used to speed up
large addition or multiplication operations like applying the window function. Its
trigonometric functions for sine and cosine have also proven to be much faster
than the implementations of the C++ standard library, while still being precise enough.
Interestingly the square root function of the CMSIS DSP library was slower than the
one the standard library provides. Presumably this is the case because the standard
library uses the ARM Cortex-M7’s built-in VSQRT processor instruction, while
the DSP library computes a mathematical approximation, which is slower in this case.
The arctan2 trigonometric function has no implementation in the CMSIS library and the
standard library’s implementation is relatively slow due to its high precision, which is not
required for this use case. For improved processing speed a simple arctan2 approximation
based on polynomials was implemented. As already mentioned in the description of the
transformation stage, another way of avoiding unnecessary calculations is to compute only
the FFT bins that contribute to the audible output, either because they are inside the
ultrasonic range before pitch shifting or because they are inside the audible range after
shifting. To improve performance even further, compiler optimization flags like -O3 and
-ffast-math were enabled.
Overall the entire pitch shifting algorithm was written in such a way that it is compatible
with both the Teensy 4.x series and the Teensy 3.x series. The Teensy 3.x series ships with
the much older version 1.1.0 of the CMSIS DSP library and only supports FFT frame sizes of
128, 512 and 2048 samples, while the Teensy 4.x series supports FFT frame sizes of 32, 64,
128, 256, 512, 1024, 2048 and 4096 samples.
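To give an impression of what a polynomial arctan2 approximation can look like, the sketch below uses a well-known second-order approximation of the arctangent on the interval [0, 1] and extends it to all quadrants. This particular polynomial and the name fast_atan2 are assumptions chosen for illustration. The thesis does not state which polynomial the prototype uses. The error of this variant stays in the range of a few milliradians.

```python
import math


def fast_atan2(y, x):
    """Approximate atan2(y, x) with a second-order polynomial (illustrative only)."""
    if x == 0.0 and y == 0.0:
        return 0.0
    abs_x, abs_y = abs(x), abs(y)
    if abs_x >= abs_y:
        ratio = abs_y / abs_x                 # ratio in [0, 1]
        angle = ratio * (math.pi / 4.0 + 0.273 * (1.0 - ratio))
    else:
        ratio = abs_x / abs_y                 # use the complementary angle
        angle = math.pi / 2.0 - ratio * (math.pi / 4.0 + 0.273 * (1.0 - ratio))
    if x < 0.0:
        angle = math.pi - angle               # mirror into the left half plane
    if y < 0.0:
        angle = -angle                        # mirror into the lower half plane
    return angle


angles = [i * 0.001 - 3.14 for i in range(6280)]
worst_error = max(
    abs(fast_atan2(math.sin(a), math.cos(a)) - math.atan2(math.sin(a), math.cos(a)))
    for a in angles
)
print(f"largest deviation from math.atan2: {worst_error:.4f} rad")
```

The approximation needs one division and a few multiplications per call, which is the kind of trade-off between precision and speed described above.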
The described pitch shifting algorithm was first published in 1999 in a post³⁵ on a blog
named “The DSP Dimension” at the time. In the meantime the blog has been acquired by the
company “zynaptiq”. The algorithm has since been implemented and used many times, among
other things also for pitch shifting ultrasound [50]. It is particularly suitable for this
purpose, because the algorithm supports both very small as well as very large scaling
factors. While other pitch shifting algorithms like PSOLA only work in a small range of
shifting factors, the employed algorithm is able to transpose the signal over many octaves.
Even though the algorithm works in the frequency domain, requiring the transformation of a
signal window using the FFT, which is inherently slower than working in the time domain,
it is possible using oversampling as well as the optimization techniques outlined above to
run the algorithm fast enough for realtime use cases. Another important property that
makes this algorithm especially applicable for an extension of the auditory sense is that
it keeps the harmonic relationship between the sound’s fundamental note and its overtones
intact after shifting, thus distorting the harmonics as little as possible and keeping the
sound natural. As a consequence the timbre of the original sound is only marginally
affected by the pitch shifting, which also adds to its usefulness as a sensory extension.
³⁵ Blog post: http://blogs.zynaptiq.com/bernsee/pitch-shifting-using-the-ft/ - Accessed: 22.02.2022
Algorithm 1 Processing loop of the pitch shifter in pseudocode.
Require: frame                                         ▷ audio time series data
Require: window                                        ▷ precomputed Hann window
Require: pitchShiftFactor ← 2^(semitones ÷ 12)         ▷ controls transposition
Require: oversamplingFactor ← frameSize ÷ hopSize
Require: omega ← 2π · hopSize ÷ frameSize
Require: binFrequencyWidth ← sampleRate ÷ frameSize
 1: frame ← frame · window
 2: X ← FFT(frame)
 3: for each i in length(X) do                         ▷ Analysis stage
 4:     magnitude ← square_root(X[i].real · X[i].real + X[i].imag · X[i].imag)
 5:     phase ← arctan2(X[i].imag, X[i].real)
 6:     freq ← phase − previousPhase[i]
 7:     previousPhase[i] ← phase
 8:     freq ← freq − i · omega
 9:     freq ← wrap_phase(freq)                        ▷ Wrap freq into the range ]−π, π]
10:     freq ← oversamplingFactor · freq ÷ 2π
11:     freq ← (i + freq) · binFrequencyWidth
12:     magnitudes[i] ← magnitude
13:     frequencies[i] ← freq
14: end for
15: synthesisMagnitudes.zeroAllElements()              ▷ Transformation stage
16: synthesisFrequencies.zeroAllElements()
17: for each i in length(X) do
18:     index ← round(i · pitchShiftFactor)
19:     synthesisMagnitudes[index] ← synthesisMagnitudes[index] + magnitudes[i]
20:     synthesisFrequencies[index] ← frequencies[i] · pitchShiftFactor
21: end for
22: for each i in length(X) do                         ▷ Synthesis stage
23:     magnitude ← synthesisMagnitudes[i]
24:     phase ← synthesisFrequencies[i]
25:     phase ← phase ÷ binFrequencyWidth − i
26:     phase ← 2π · phase ÷ oversamplingFactor
27:     phase ← phase + i · omega
28:     phaseSum[i] ← wrap_phase(phaseSum[i] + phase)
29:     X[i].real ← magnitude · cos(phaseSum[i])
30:     X[i].imag ← magnitude · sin(phaseSum[i])
31: end for
32: frame ← IFFT(X)
33: frame ← frame · window
5 Evaluation
A significant portion of this thesis’ time was spent on evaluating and verifying the
functionality and performance of the prototype. The results of the evaluation are
documented in this chapter in a split manner. First the technical properties of the
prototype were evaluated through measurements resulting in reliable data to verify the
correct operation of the algorithm. Secondly the prototype was considered from a more
informal point of view through a perceptive analysis. Here the impressions and personal
experiences of wearing the prototype and using it as an extension to the auditory sense
are documented.
The possibility of distributing prototypes to a larger number of test candidates for
further evaluation through more than one person was thoroughly investigated, but
ultimately turned out to be unattainable. As mentioned in section 2.2 cochlear implant
wearers are spread rather sparsely throughout Germany. Therefore it is quite difficult to
find an appropriate number of individuals that would both be interested in the subject
matter as well as bring the right preconditions in terms of hearing loss and cochlear
implant fitting. Studies that rely on a small number of test candidates often tend to
produce ambiguous results. It is difficult to come to meaningful conclusions with results
that are hardly statistically significant due to their small sample set. Additionally
quantifying listening experiences in a robust way is quite complicated, especially if the
listening experience ventures into new territories of perception. The different hearing loss
journeys people undergo as well as whether they are single-sided deaf with an implant on
one side or wear two cochlear implants influences their relationship with the auditory
experience significantly. For all these reasons the decision was made not to carry out a
statistical analysis of the prototype’s listening experience using multiple test subjects. As
the analysis would be exceedingly difficult to conduct in a proper scientific manner and
likely still be heavily tainted with personal perceptions, the focus was instead put on the
subjective wearing experience of just one individual as well as verifying the prototype’s
engineering and correct functionality.
5.1 Technical Evaluation
The technical evaluation focuses on verifying the prototype’s measurable properties
and testing the algorithm’s correct technical functionality. In order to achieve this the
hardware was set up in the same way it would be during regular use. The software’s proper
functionality was tested by feeding pure sine wave tones into the pitch shifting algorithm
and comparing the output frequencies to the expected frequencies of precalculated results.
As an example, if the note F10 (at 22350.6 Hz) is fed into the algorithm, which was
configured to shift down by one octave, the expected result would be the tone F9 (at
11175.3 Hz). These expected results were then compared to the measurements made at
the algorithm’s output. To verify different components of the prototype’s signal chain
independent of each other the sine waves were introduced at different points along the
signal chain. First to check the correct functionality of the pitch shifting algorithm
itself separately, the sine waves were procedurally generated directly in the audio engine
running on the Teensy CPU, thus bypassing the microphone, the ADC as well as the I2S
bus transfer. The generated sine waves were then fed into the pitch shifting algorithm
and recorded at its output for later analysis. The second verification used a setup
closer to real world use, where the sine waves were generated by an external function
generator connected to a speaker. These ultrasonic acoustic sound waves were then
picked up by the prototype’s microphone which transmitted them, following the signal
flow shown in figure 4.4, to the SGTL5000 audio chip. After being digitized by the ADC
and transferred through the I2S bus to the Teensy CPU, the signal was fed into the
pitch shifting algorithm to then be recorded for further analysis and comparison with
the expected results. In comparison to the first test which solely tested the algorithm,
the second is a full system test which also verifies that no errors are introduced by any
other component along the signal chain. Lastly the system latency of the prototype was
measured to check if it introduces a noticeable delay.
For the first test using internally generated sine waves three separate rounds of
measurements were conducted. All three rounds used the same input of 12 pure sine wave
tones spanning one full ultrasonic octave from F10 (at 22350.6 Hz) to E11 (at 42192.3
Hz) in semitone increments. During the first round of measurements the algorithm was
configured to apply no shift at all, in order to simulate a pass-through and verify that
the output is equal to the input with no artifacts introduced by the algorithm. In the
second measurement round the algorithm was configured to shift the 12 input tones down
by one octave. For the last round the configuration was set to shift the input signal up
by one octave, to verify the full range of the algorithm’s capabilities. To compare the
results of these three rounds of measurements with the expected frequencies the spectral
peaks of each round’s output signal were analysed and compared to the precalculated
frequencies. The results of all three measurement rounds of the first test showed that the
algorithm shifts the pitch of the generated sine waves correctly to their expected values.
A visualization of these results in the form of a spectrogram can be found in figure 5.1.
As can be seen the spectral lines in the three output spectrograms also visibly line up
correctly with the spectral lines of the precalculated results.
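The precalculated frequencies for the three measurement rounds follow directly from the equal temperament formula. The sketch below is an illustration with invented helper names and an arbitrarily chosen tolerance of 1 Hz. It is not the analysis script used for the thesis.

```python
F10_HZ = 440.0 * 2.0 ** (68 / 12)    # F10 lies 68 semitones above A4 (440 Hz)


def input_tones(lowest_hz=F10_HZ, count=12):
    """The semitone steps of one octave, starting at the lowest tone."""
    return [lowest_hz * 2.0 ** (step / 12) for step in range(count)]


def expected_outputs(tones, octaves):
    """Expected frequencies after shifting by a whole number of octaves."""
    return [tone * 2.0 ** octaves for tone in tones]


def peaks_match(measured, expected, tolerance_hz=1.0):
    """True if every measured spectral peak lies within the tolerance of its target."""
    return all(abs(m - e) <= tolerance_hz for m, e in zip(measured, expected))


tones = input_tones()
rounds = {"no shift": 0, "down": -1, "up": 1}
for name, octaves in rounds.items():
    expected = expected_outputs(tones, octaves)
    print(f"{name:>8}: {expected[0]:.1f} Hz to {expected[-1]:.1f} Hz")
```

A measured peak list would then be passed to peaks_match together with the expected list of the corresponding round.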
Figure 5.1: Results of the first test. Calculated, expected results on the left, showing the
original octave in yellow, the downshifted octave in blue and the upshifted octave in
red. Measured output octave of the algorithm with no shift shown second from the
left. Measured downshifted output octave second from the right and measured
upshifted output octave on the right.
The second test was conducted in a similar manner, using acoustic sine waves generated
by a function generator connected to a speaker, which were then received by the
prototype’s microphone. Again three separate rounds of measurements were taken using
the same 12 tones as during the first test spanning an ultrasonic octave from 22350.6 Hz
to 42192.3 Hz. For the three measurement rounds the algorithm was again configured
to first introduce no shift, then a shift down by one octave and for the last round shift
the signal up by one octave. The output of these three sets of measurements was then
analysed similarly to the first test, by calculating the spectral peaks and comparing them
to those of the expected results. The results of the second test show that for all measured
tones their spectral frequency peaks match the precalculated frequencies of the expected
results. Figure 5.2 shows a visual representation of the second test’s analysis results. The
output spectrum of each measurement round is plotted in blue. Marked using dotted red
lines are the expected frequency values. The blue and red lines lining up on top of each
other demonstrate that the prototype shifts the input signal’s pitch as expected.
[Plot omitted: three panels “shifted down” (10 to 20 kHz), “no shift” (20 to 40 kHz) and “shifted up” (40 to 80 kHz). Horizontal axis: Frequency [Hz], vertical axis: Amplitude (0.0 to 1.0).]
Figure 5.2: Results of the second test utilizing the microphone. Power spectrum of the shifted
output octaves in blue and their expected, precalculated frequency spectrum in red.
For the second test of the full signal chain a speaker capable of producing sound waves
from 20 kHz up to 90 kHz was required. Most regular Hi-Fi or reference speakers are
obviously only specified and tuned within the human audible range up to 20 kHz. While
there are some ribbon tweeters capable of delivering higher frequencies well into the
ultrasonic range, they are very rare and quite expensive. As a viable and inexpensive
alternative a simple piezoelectric speaker was chosen. Piezoelectric speakers are not
known for their sound quality due to the variability in their frequency response, resulting
in the differing amplitude levels in figure 5.2. However they are capable of reliably
producing sound waves in the ultrasonic range. Indeed piezoelectric transducers are
used in many of the industrial ultrasonic applications described in section 2.1. For the
purposes of this evaluation I found the piezoelectric speaker “L010” manufactured by
the company “Kemo” which is rated up to 60 kHz³⁶. Tests using a function generator
connected to the speaker showed that it is easily capable of generating sounds up to
100 kHz. While piezoelectric speakers are able to produce the required frequencies they
are not very efficient in doing so. This is due to their working principle of being tuned
to certain resonances. A wide-band ultrasound generator would be desirable for a use
case like this. Such generators are currently a subject of scientific research and are
unfortunately not commercially available yet. Electromechanical film (EMFi) as well as
thermoacoustic transducers seem to be promising candidates [74] [11]. While designing and planning the
second test, considerations were made whether or not the microphone and the ultrasonic
speaker should be calibrated, in order to counter the shortcomings of the piezoelectric
speaker and the unsteady frequency response of the microphone. Linearizing both of
their frequency responses was considered. In the end the decision was made against tuning
a filter bank to make the frequency response of the system close to linear. The rationale
behind this was that for the validation measurements the important parameters are the
frequencies and the relative shift between them. Normalizing their magnitudes therefore
doesn’t add any meaningful information to the test results.
³⁶ Manufacturer website of the L010: https://www.kemo-electronic.de/de/Bauelemente/Bauteile/Lautsprecher/L010-Ultraschall-Piezo-Lautsprecher.php - Accessed: 22.02.2022
Lastly a third measurement was done to quantify the overall system latency of the
prototype’s whole signal chain. For comparison the latency of just the pitch shifting
algorithm was calculated, by taking into account the overlap factor as well as the chosen
window function. It turned out to be 1024 samples at a sampling rate of 192 kHz,
resulting in a theoretical lower limit of 5.33 milliseconds for the system latency. In order
to then measure the actual system latency of the prototype a Dirac impulse signal was
divided into two signal streams. One signal stream was fed through the input pin on the
prototype normally used for the microphone. The prototype’s output was then routed into
one channel of a sound card. The second signal stream bypassed the prototype entirely
and was routed directly into a different channel of the sound card. Then the prototype’s
system latency was derived by determining the time difference of the impulse’s arrival in
the sound card between the two channels. Over multiple runs the difference was measured
to be 2160 samples at 192 kHz, resulting in a system latency of 11.25 milliseconds.
Since the pitch shifting algorithm is only one part of the prototype’s longer signal chain,
the system latency is, as expected, higher than the latency of just the algorithm. With a
total system latency of 11.25 milliseconds the prototype stays well below the temporal
resolution limit of the human auditory system of 20 milliseconds [11], resulting in no
perceivable delays for the wearer.
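The conversion from samples to milliseconds behind these figures is sketched below. The variable names are chosen for this example, while the 20 millisecond limit is the value cited above.

```python
SAMPLE_RATE = 192_000            # Hz
PERCEPTION_LIMIT_MS = 20.0       # temporal resolution limit cited above


def samples_to_ms(samples, sample_rate=SAMPLE_RATE):
    """Convert a delay given in samples into milliseconds."""
    return 1000.0 * samples / sample_rate


algorithm_latency_ms = samples_to_ms(1024)    # calculated latency of the algorithm
system_latency_ms = samples_to_ms(2160)       # measured latency of the signal chain
remaining_ms = system_latency_ms - algorithm_latency_ms

print(f"algorithm: {algorithm_latency_ms:.2f} ms")
print(f"system:    {system_latency_ms:.2f} ms")
print(f"rest of the signal chain: {remaining_ms:.2f} ms")
print(f"below perception limit: {system_latency_ms < PERCEPTION_LIMIT_MS}")
```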
5.2 Perceptive Experience
Equally if not more important than the verification of the prototype’s technical properties
are the wearer’s perceptive experiences and impressions while wearing the device. Since
the device is designed to be a sensory extension, wearing it under real-life conditions and
documenting the impressions is important in assessing its functionality and usefulness.
The evaluation of these perceptive experiences encompasses the observations made through
many extensive listening sessions. These sessions were usually conducted as long walks
through different real world environments while wearing the prototype. Dense urban
street environments, city parks, public transport, forests, villages in the countryside as
well as residential apartments were all settings evaluated during the listening sessions.
The results are of course highly subjective and anecdotal, but offer a glimpse into the
experience of wearing a device that extends your auditory sense through technology.
As ultrasonic sound sources were not encountered on a regular basis during the listening
sessions, the impressions were fortunately not overwhelming and could comfortably be
integrated into all other sensory perceptions. Rather, each time an ultrasound event
occurred it was a pleasant moment followed by an attentive look around for the source of
the sound. Since I am only wearing a cochlear implant on one side, the ultrasonic sounds
could be easily distinguished as such, because they were only audible on the cochlear
implant, while sounds in the regular human audible spectrum were audible on both my
acoustic ear as well as my cochlear implant. This identification quickly became very
intuitive.
During the listening sessions I was able to sense some of the natural ultrasound sources
described in section 2.1. Definitely the most interesting and complex sounds were the
calls of bats. From an elevated vantage point at the side of a subway bridge overlooking
a lake in a city park, as well as on a 3rd floor balcony overlooking a courtyard, it was
possible to observe the bats’ flight maneuvers close up at eye level. This made it a lot
easier to correlate the different phases of the bats’ flight with the many different calls that
could be heard. For example, I noticed that during straight, stable flight the calls were
very steady and spread comparatively far apart in time. If the bats engaged in erratic
flight maneuvers or flew close to trees, I could hear the time between the pulses getting
shorter. I was also able to observe the quickly increasing pulse rate when a bat hunts
and catches insects, known as the “feeding buzz”, as described in section 2.1 and illustrated
in figure 2.2. As I knew very little about bats and their variety of calls at the time, it
was fascinating to later read about the different flight phases and their corresponding
calls and realize that I had been able to notice these myself [2]. Perhaps one of the
first ultrasound sources I accidentally discovered was the sound generated by breathing
out through the nose. Especially when breathing out through the nose intermittently, a
hissing noise can be heard in the ultrasonic range. Interestingly, breathing through the
open mouth does not produce the sound; blowing through tightly pressed lips, however,
does generate the same ultrasonic noise. A possible explanation is that pressurized gases
generate ultrasonic emissions when rushing through small openings. The same effect is
also used for finding pressure leaks in spacecraft or other pressurized systems. Another
ultrasound source that I have
perceived is stepping on really dry grass, leaves or shrubs, which produces a crushing or
rustling sound. Crumpling plastic foil creates a similar sound. I have not yet been
able to fully understand how and why that happens. A literature and web search has
not produced any satisfactory explanation either.
Besides ultrasonic emissions of biological origin, another big group of sound sources are
vibrations generated by metal moving against metal. Probably the loudest ultrasound
source that I have encountered was a squeaking bicycle chain. The high-pitched tone
that we normally perceive in the audible spectrum coming off bicycle chains in need
of some oil continues into the ultrasonic range. I could observe the same for an old
tractor, cog wheels, hinges and joints, but it is probably true for any other kind of
poorly oiled metal-on-metal mechanism. Interestingly, I was not able to hear any similar
sounds coming from cars. I assume it is because the moving parts of modern cars are
very well encased and soundproofed and people tend to keep them well oiled. Curiously, I
noticed that joggers in the park running past me would also produce a rattling sound. It
took me quite a while to figure out that it is most likely the set of keys in their pockets.
Indeed, by trying it out myself, I could confirm that shaking a bunch of keys so
that they hit each other produces the same kind of ultrasonic rattling sound. On a walk
near a dog park I noticed that dogs also produce a similar, yet slightly different rattling
sound. After coming closer and getting a chance to pet one of the dogs I realized that it was
the little metal badge with the dog’s identification number hitting against its ring and
the collar. There are two other everyday mechanisms producing ultrasound that I found
surprising, but that are probably obvious in hindsight. The first is inserting a metal key
into a regular profile cylinder lock. You can actually hear the pins click individually and
count how many pins a particular lock has. The second is simply opening and
closing a sweater with a metal zipper. The zipper produces a chattering sound that can
be heard very well in the ultrasonic range.
A few ultrasonic sources of technical origin were also encountered during the listening
sessions. Probably the most frequently encountered ones were the parking distance
assistants of cars. The clicking of a pulse train with a constant rate could be heard
coming off the rear of many cars maneuvering into parking spots in reverse. Sometimes
these sounds were hard to catch, as the ultrasonic transducers in the cars are very
directional and aimed towards the back. Often standing off to the side of the car rather than
behind it already made the sounds very faint. As described in section 2.1 most modern
cars have a parking distance assistant which allows the driver to judge the distance to
any object behind the car by listening to a sequence of beeps. The rate of the beeps
corresponds to the distance, getting faster as the distance decreases. These distance
measurements are executed by the ultrasonic transducers I was able to hear. Very rarely
the same sound could also be heard coming from the front of a car. In particular a
handful of very modern, mostly electric cars seemed to emit ultrasonic pulses all around
the car. I assume these are part of the car’s situational awareness systems to facilitate
some form of autonomous driving. Research on the car models revealed that indeed all
of them have self-driving capabilities. Unfortunately I did not have access to any of
these cars or similar models for further in-depth tests. Another ultrasonic sound source I
discovered around cars was an interior alarm system. One parked high-end limousine had
a window open just a little bit. Through the crack I was able to hear a series of ultrasonic
pulses. In contrast to the parking assistants the pulse train was not continuous, but had
a sort of rhythm with periods of silence. The sounds clearly came through the window
from inside the car, which suggests that they are part of an alarm system monitoring
any movement inside the car once it is parked.
As mentioned above, wearing the prototype was overall quite a pleasant and fun
experience. It integrated well with the regular auditory perception and did not obstruct
any of the other senses. The sensory extension turned out to work as intended and
provided a number of new and interesting insights into the environment surrounding me.
Indeed I was able to make quite a few discoveries, like the different flight phases of bats
or the metal clicking sounds of locks, that I found surprising. Unfortunately during the
listening sessions I did not encounter any ultrasonic sounds with a complex harmonic
overtone structure, as most sounds I came across were impulses or pulse-like in nature.
Thus I was unable to make use of the full potential of the pitch shifting algorithm and
its harmonics-preserving properties. Listening to the calls of rats or mice might
benefit from this property, as some of their calls are known to have a harmonic overtone
structure, as described in section 2.1 [22]. Nevertheless I would argue that as an auditory
sense extension it is desirable that the device correctly handles harmonic overtones. So I
am glad these properties were worked into the algorithm, as they make the prototype a
consistent continuation of the auditory sense into the ultrasonic range.
5.3 Discussion & Outlook
Overall the prototype built as part of this thesis can be considered a success. It
accomplishes the task it was designed to do, namely the transposition of ultrasound into
the audible range while preserving the harmonic overtone structure of the signal. As
intended it turned out to be possible to run the code on an affordable microcontroller
while still executing it fast enough for real time audio processing. Thanks to the efficient
implementation and the resulting execution speed, the thesis’ primary goal of connecting
the working prototype to a cochlear implant and thus extending its capabilities into the
ultrasonic frequency range could be achieved successfully. As a result of the creative,
critical approach a fully functional cybernetic extension of the acoustic sense was created.
Its utility could be verified under real-life conditions by practically extending the wearer’s
perceptive experience and allowing them to listen to both natural and artificial sounds
previously inaccessible. The relatively straightforward design and open source license
of the prototype should encourage others to replicate and use the prototype themselves
thereby carrying the do-it-yourself notion forward.
Future improvements to the prototype that first come to mind are some further refinements
of the hardware, starting with a better casing for the prototype to make it more
rugged and easier to handle. A 3D printed case released under an open hardware license
would be in line with the spirit of the project. Ideally, shielding against electromagnetic
interference would be incorporated into that new casing to reduce noise,
both from the outside as well as between the prototype’s components and especially
around the analog microphone. A potentiometer knob for adjusting the volume and a
push button for initiating recordings could also be added for increased usability. Even
though the microphone’s frequency response only fluctuates by 6 dB (see figure 4.3),
it could be further linearized by applying filters to the incoming signal. Further software
optimizations and hardware upgrades could be evaluated to increase processing speed,
possibly leading to an even higher frequency resolution. Adding further features to the
sensory extension prototype is also conceivable. For example, an ultrasonic transducer
aimed backwards could be coupled with the prototype, leading to an extension of the
situational awareness to include movements happening behind the back of the wearer.
This would be realized by periodically sending out pulses and either listening to their
echo directly or translating the distance information into a tone.
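The second variant, translating the distance information into a tone, could be sketched as follows. This is only an illustration of the idea under stated assumptions and not part of the prototype: the speed of sound of 343 m/s, the 4 m maximum range, the 200 Hz to 2000 Hz tone range and both function names are hypothetical choices made for this example.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for dry air at roughly 20 degrees Celsius


def echo_distance_m(round_trip_s, speed_m_s=SPEED_OF_SOUND_M_S):
    """Distance to the reflecting object; the pulse travels there and back."""
    return speed_m_s * round_trip_s / 2.0


def distance_to_tone_hz(distance_m, max_distance_m=4.0,
                        low_hz=200.0, high_hz=2000.0):
    """Map distance linearly to pitch: closer objects give a higher tone."""
    clamped = min(max(distance_m, 0.0), max_distance_m)
    nearness = 1.0 - clamped / max_distance_m
    return low_hz + nearness * (high_hz - low_hz)


# An echo arriving 10 ms after the pulse was sent corresponds to an object
# a little under two meters behind the wearer.
distance = echo_distance_m(0.010)
print(round(distance, 3), "m ->", round(distance_to_tone_hz(distance), 1), "Hz")
```

A linear mapping is the simplest choice here; a logarithmic mapping might match pitch perception better, but that would need to be evaluated with actual wearers.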
Now that a compact, affordable ultrasonic hearing device has been developed, it can be
studied as a subject of research. Expanding the analysis of the device’s perceptual
aspects from anecdotal evidence to larger and more diverse user groups would be a
broad area to study. Possible areas of interest include a proper scientific
exploration of the ultrasonic perception, a statistical evaluation thereof, examinations of
the brain’s ability to adapt to and integrate the new input over time, or possible ways to
accelerate the process, among others. As mentioned before, it is rather difficult to gather
a large group of cochlear implant wearers, so it would be useful to provide an alternative
for acoustically hearing people. A bone conduction headset used in conjunction with the
prototype could perhaps serve as a viable alternative for people without a cochlear
implant to experience a similar perception. People with single-sided deafness and a
cochlear implant could even try both setups and compare their sonic experience. Studies
with participants who wear cochlear implants and participants wearing such a bone
conduction headset could be conducted, both separately as well as in comparison with each
other. The experience of blind users, whose perception is heavily based on hearing, with
an additional ultrasonic sense would also be interesting to explore in a comparative study.
Using the device as a practical tool for research could be another option. Especially in
animal research that involves rats, mice or bats, the ultrasonic hearing
sense could be useful to observe the state of the animal from a different perspective.
Further practical use cases could be acoustically monitoring gas flows during experiments
as well as checking for leaks in pressurized gas systems.
Most importantly, however, besides its possible applications and improvements, the
prototype’s successful implementation shows that a do-it-yourself
sensory extension of human perception is possible in practice. As laid out
in the thesis’ introduction, this fact allows the redefinition of the cochlear
implant, from a utility mitigating a “disability” into an extendable sense broadening
human perception, to become reality. For me personally this is an important step,
as currently a lot of prosthetics manufacturers do not venture into a similar
direction. Their focus is centered around rebuilding lost abilities, instead of imagining
ways that human capabilities could be expanded. Currently there is unfortunately very
little scientific research happening in the field of cybernetic extensions, especially in
ways that are accessible to people. Up until now, most of these advances have happened
either as artistic or aesthetic interventions. Therefore the presented prototype, combining
aspects from both the scientific and the DIY community into a new practical sensory
extension, is a valuable progression. However, it is important to take a look at what
direction the field might take in the future and the consequences that might come with
it. Personally I think that smaller cybernetic extensions, like the NFC implants and more
capable successors, will grow in popularity as their usefulness increases. They will likely
end up taking a similar path to tattoos or piercings, integrating with people’s aesthetic
ideals. To significantly advance the field and enable more complex implants, present
obstacles like the supply of energy, ideally through energy harvesting from the body, or
interfacing with nerves have to be overcome. If no further fundamental barriers arise,
I believe it is very likely that in the future some tech company will produce
an implant that lets us integrate ourselves very closely with technology. The form,
capabilities and configuration of such implants are of course hard to predict, but features like direct access
to information, telepathic communication, in-situ translations, video streaming or even
full virtual reality immersion have been predicted many times by science fiction authors.
The listed features might not materialize in the near future, but it is safe to assume that
when they do these implants will have an impact on society similar to that of the introduction
of the smartphone or the world wide web. All of this seems somewhat far-fetched
and hard to imagine as of now, but developments of today’s tech giants like Facebook’s
“Metaverse” ambitions or the research of an implantable computer-brain interface by the
company “Neuralink” are already pointing towards such a future.
Once humanity does venture into a future where cybernetic implants are not about individual
sensory extensions or personal instruments, but become interconnected computer
interfaces supplied by corporations, many important ethical and social issues arise. For
some people these new technologies will seem to pose a threat to the fabric of society,
which seems to be a common fear with most new technological advances. Contrary
to that, others will simply look at these implants, perhaps more carelessly, as a logical
progression of the usage of tools or of modifications of the human body in line with
wearing glasses, jewellery, makeup or tattoos. While major societal change as a result of
cybernetic implants will be inevitable, it is important to try to shape the change in a
way that benefits humanity. Disregarding people opposing such technology in principle,
fundamental questions need answering in order for that to happen. One of these important
questions is the question of ownership. Who owns the implant? Spontaneously most
people would probably argue that the person with the implant owns it. However, if we
draw a comparison to computers and smartphones used today we see that the case is
not so clear. With computers, users own the physical hardware and traditionally they
used to own copies of the software they use, although in the recent past the model has
shifted to licensing or even renting the software. However, in most cases users are still
able to run any kind of software they would like on it, be it an open source operating
system or simply any application they can acquire. For smartphones, though, the trend
has developed in a different direction. When purchasing a smartphone users usually
do own the hardware, but they only license a copy of the software to run on the device.
Contractually users are in almost all cases prohibited from running any other software on
the hardware they own, making the hardware unusable when the manufacturers decide to
no longer support the device. While there are workarounds for the “Android” platform to
run your own software, the company “Apple Inc.” invests significant engineering resources
to make sure the “iOS” operating system cannot be modified or replaced. The choice of
available software applications executable on smartphones is also tightly regulated by
the manufacturers’ digital distribution platforms like Apple’s “App Store” or Google’s
“Play Store”.
Applying this model back to cybernetic implants would make it entirely possible that
future implanted people do own their implant on paper, but are entirely dependent on the
corporations due to the inability to configure, modify or improve the existing software or
run any entirely different software. This created dependency should be regarded critically
because it leaves implant wearers stranded if the company decides to end support for
certain features, turns off its servers, goes bankrupt or simply decides that an implant
has reached its support end-of-life date and should be turned off. While these
“business decisions” might be tolerable with electronic devices today, it is a different story
if these devices require potentially complicated surgery to be taken out again. First
occurrences of these issues are already manifesting themselves. For example, in February
2022 the company “Second Sight Medical Products”, which sold bionic eye implants, was
bought by another company, which discontinued the support, leaving all their customers
abandoned (see footnote 37). It can only be described as a nightmarish experience when components
that you viewed as part of your body stop working because a company goes bankrupt, and
all of a sudden you can’t walk or see anymore. While opening access and running possibly
untested software on an implant might entail new liability and gross negligence issues
that are not easily waived, solutions for these problems will have to be found. Cochlear
implant manufacturers, for example, take great care to ensure that new versions of their
sound processors stay backward compatible with old implant versions, to avoid surgery
when upgrading to a new sound processor. However, this will not always be possible or
desired by the company and only works for products of a certain company, not across
different companies, extending the danger of vendor lock-in to implants. Further parallels
between today’s electronics and future implants can be drawn as well. Almost certainly
the right to repair will come up as an issue similar to the current fight with smartphone
or farming equipment manufacturers, as corporations will want to capitalize on the exclusive
right to repair implants or control the supply of replacement parts. Another perhaps
more grim question is about system security. Even though programming languages
and practices will likely have improved by the time implants become a reality, complex
systems will always have design flaws or bugs. Chances are high that these vulnerabilities
37 Article in “IEEE Spectrum”: https://spectrum.ieee.org/bionic-eye-obsolete - Accessed: 22.02.2022
will be exploited by criminals or adversaries. Similar to ransomware attacks today people
could be blackmailed or even get physically hurt, with the stakes being significantly
higher since the electronics are implanted into the body. The threat is already relevant
today, as can be seen by the fact that the doctors of former US Vice President Dick
Cheney disabled all wireless capabilities of his pacemaker, as a precaution, due to the
fear of it becoming a target for attacks.
Another big question with a lot of dystopian potential is the question of data ownership.
Just by their way of operating, these implants will produce a huge amount of very
personal data and large chunks of it will be particularly sensitive medical or intimate
body data. Likely this data will not remain on the implant, but will be transmitted
to the corporations, who will possibly sell it as they are doing today. This will grant
them an enormous amount of power and set up a dependent relationship between the
users and the corporations, with a high potential for violation of privacy. Implications
ranging from invasive behaviours, like assessing if you truly enjoyed watching a commercial,
through in-depth surveillance methods by both private corporations and governments, all
the way to extortion are conceivable. Unfortunately the way that today’s tech giants
currently handle these issues does not generate much hope, unless big shifts in policy
happen by then. A different policy issue that already affects implant wearers today is
the question of cyborg rights and what status implants take in our laws. For example,
if somebody destroys an implant, either by accident or on purpose, does it count as bodily
injury or is it property damage? Does it qualify as cruel punishment if the police deny
you access to one of your senses, for example by denying access to electricity? Many
implant wearers recognize their implants as part of their body, but currently there are
no international or national laws that codify this and protect the rights of cyborgs. A
perhaps more distant issue is the problem of social equality connected to cybernetic
implants. If they become so ubiquitous that a majority of people have them, at what
point does it become unfeasible for an individual to not have one while still wanting
to partake in society? Comparable technologies today could be email or a smartphone.
Besides the cost or the issues with corporate dependency and power, getting an implant
will after all still be a medical intervention that not everybody might welcome.
These are a few of the ethical and societal questions that definitely will come up
and will need answers in one form or another, once cybernetic implants become widely
adopted. There are most likely many more, some of which are not obvious now. Many of
these questions, like those of ownership or cyborg rights, are already affecting cochlear
implant or other implant wearers today, but will undoubtedly become more pressing in the
future. Ideally how to handle these challenges would be discussed and thought through
before these issues arise, but unfortunately the past has shown that technology usually
moves faster than policy. It should not be forgotten though that, besides their dystopian
potential and integration challenges that any new technology faces, cybernetic implants
also have the potential to bring fascinating progress. They will very likely advance medical
interventions and monitoring as well as alter our perception of the world and possibly
fundamentally change the human experience. It’s hard to say how it will all play out or,
to cite a Danish saying: “It is difficult to make predictions, especially about the future.”
Acknowledgments
I would like to thank my partner Alpha for her patience and support as well as my family,
especially my brother for proofreading. Thanks to Lars Clausen from the Hörtherapiezentrum
Oberlin in Potsdam for reviewing the section on cochlear implants. Also I would
like to thank my employer Native Instruments for granting me the time and opportunity
to write this thesis. A special thanks goes to Julian Parker and Luca Tironi for their help.
Another big thanks to everybody who helped proofread and find mistakes. Thank you all!
German Summary
This master’s thesis investigates the feasibility of a practical cybernetic extension of
the human sense of hearing into the ultrasonic range. In the course of this work a fully
functional prototype was built and successfully tested. The hardware, software and
algorithm used to realize the prototype, as well as their technical verification, are
documented here. The past and current developments in the field of cybernetic implants,
with a special focus on do-it-yourself cyborgs, are analyzed in detail. In addition,
research on the state of the art of signal processing techniques for pitch shifting and of
devices for making ultrasound audible is presented. A number of technical and natural
ultrasound sources are listed as well. A personal account of the perceptive experience of
wearing the prototype is presented together with a discussion of the possible future of
cybernetic implant technology. The prototype built as part of this master’s thesis
shows that it is possible to extend the capabilities of a cochlear implant to enable the
perception of ultrasound. This allows a redefinition of the cochlear implant from an aid
mitigating a “disability” into an extendable sense broadening human perception.
The algorithm implemented for the pitch shifting was developed with particular attention
to preserving the distribution of the harmonic overtones even after transposition.
Therefore the real-time implementation of the prototype is, in comparison to other devices
for making ultrasound audible that lack this property, particularly well suited as a
sensory extension. Further ideas for future uses of the prototype without a cochlear
implant or as a research tool are also listed.
Acronyms
ADC analog-to-digital converter
ARM Advanced RISC Machines (family of architectures for computer processors)
CMSIS Common Microcontroller Software Interface Standard
CPU central processing unit
DAC digital-to-analog converter
DAW digital audio workstation
DIY do-it-yourself
DSP digital signal processing
FFT fast Fourier transform
GPL GNU General Public License
GPS Global Positioning System
Hz Hertz
IEEE Institute of Electrical and Electronics Engineers
iFFT inverse fast Fourier transform
I2S Inter-IC Sound
LED light-emitting diode
MEMS micro-electro-mechanical systems
MIDI Musical Instrument Digital Interface
MRI magnetic resonance imaging
NFC near-field communication
PCM pulse-code modulation
PLL phase-locked loop
PSOLA pitch synchronous overlap and add
RAM random-access memory
RFID radio-frequency identification
USB Universal Serial Bus
Literature
[1] C. D. West, “The relationship of the spiral turns of the cochlea and the length of the basilar membrane to the range of audible frequencies in ground dwelling mammals,” The Journal of the Acoustical Society of America, vol. 77, pp. 1091–1101, 3 1985. https://doi.org/10.1121/1.392227.
[2] U. Marckmann, “Bestimmung von Fledermausrufaufnahmen und Kriterien für die Wertung von akustischen Artnachweisen,” tech. rep., Bayerisches Landesamt für Umwelt, Augsburg, 6 2020. https://archive.org/details/bestimmung-von-fledermausrufaufnahmen-und-kriterien-fur-die-wertung-von-akustischen-artnachweisen - Accessed: 21.01.2022.
[3] R. R. Hoy, G. S. Pollack, and A. Moiseff, “Species-Recognition in the Field Cricket, Teleogryllus oceanicus: Behavioral and Neural Mechanisms,” American Zoologist, vol. 22, pp. 597–607, 08 1982. https://doi.org/10.1093/icb/22.3.597.
[4] L. A. Miller and A. Surlykke, “How Some Insects Detect and Avoid Being Eaten by Bats: Tactics and Countertactics of Prey and Predator: Evolutionarily speaking, insects have responded to selective pressure from bats with new evasive mechanisms, and these very responses in turn put pressure on bats to “improve” their tactics,” BioScience, vol. 51, pp. 570–581, 07 2001. https://doi.org/10.1641/0006-3568(2001)051[0570:HSIDAA]2.0.CO;2.
[5] D. Warfield, “Chapter 2 The Study of Hearing in Animals,” in Methods of Animal Experimentation (W. I. Gay, ed.), pp. 43–141, Academic Press, 1973. https://doi.org/10.1016/b978-0-12-278004-2.50008-6.
[6] A. K. Yetisen, R. Moreddu, S. Seifi, N. Jiang, K. Vega, X. Dong, J. Dong, H. Butt, M. Jakobi, M. Elsner, and A. W. Koch, “Dermal tattoo biosensors for colorimetric metabolite detection,” Angewandte Chemie International Edition, vol. 58, no. 31, pp. 10506–10513, 2019. https://doi.org/10.1002/anie.201904416.
[7] Knowles Electronics, “MEMS microphone SPU0410LR5H-QB data sheet Rev. H.” https://media.digikey.com/pdf/Data%20Sheets/Knowles%20Acoustics%20PDFs/SPU0410LR5H-QB_RevH_3-27-13.pdf - Accessed: 09.07.2021.
[8] Prof. Dr. S. Weinzierl, Handbuch der Audiotechnik, pp. 1–9, 18–20. VDI-Buch, Springer Berlin Heidelberg, 1 ed., 2008. https://doi.org/10.1007/978-3-540-34301-1.
[9]
M. Caetano, C. Saitis, and K. Siedenburg, “Audio content descriptors of timbre,” in
Timbre: Acoustics, Perception, and Cognition (K. Siedenburg, C. Saitis, S. McAdams,
A. N. Popper, and R. R. Fay, eds.), Springer Handbook of Auditory Research,
pp. 297–333, Springer International Publishing, 1 ed., 2019. https://doi.org/10.1007/
978-3-030-14832-4_11.
[10]
M. Möser, Technische Akustik, pp. 1–8. VDI-Buch, Springer Berlin Heidelberg,
8. ed., 2009. https://doi.org/10.1007/978-3-540-89818-4.
[11]
R. Lerch, G. M. Sessler, and D. Wolf, Technische Akustik - Grundlagen und
Anwendungen, pp. 1, 200, 573–586. Springer Berlin Heidelberg, 1 ed., 2009.
https://doi.org/10.1007/978-3-540-49833-9.
[12]
H. Møller and C. S. Pedersen, “Hearing at low and infrasonic frequencies,” Noise &
Health, vol. 6, pp. 37–57, 04 2004.
[13]
J. A. Jensen, “Medical ultrasound imaging,” Progress in Biophysics and Molecular
Biology, vol. 93, no. 1, pp. 153–165, 2007.
https://doi.org/10.1016/j.pbiomolbio.2006.07.025.
[14]
T. G. Leighton and R. O. Cleveland, “Lithotripsy,” Proceedings of the Institution of
Mechanical Engineers, Part H: Journal of Engineering in Medicine, vol. 224, no. 2,
pp. 317–342, 2010. https://doi.org/10.1243/09544119JEIM588.
[15]
G. Wagner, F. Balle, and D. Eifler, “Ultrasonic welding of hybrid joints,” JOM,
vol. 64, 03 2012. https://doi.org/10.1007/s11837-012-0269-5.
[16]
T. J. Mason, “Ultrasonic cleaning: An historical perspective,” Ultrasonics Sonochemistry,
vol. 29, pp. 519–523, 2016. https://doi.org/10.1016/j.ultsonch.2015.05.004.
[17]
H. He and J. Liu, “The design of ultrasonic distance measurement system based on
S3C2410,” in 2008 International Conference on Intelligent Computation Technology
and Automation (ICICTA), vol. 2, pp. 44–47, 2008.
https://doi.org/10.1109/ICICTA.2008.222.
[18]
M. Okugumo, “Object shape classification by ultrasound for the purpose of car
parking assistance system,” in Proceedings of the 8th IIAE International Conference
on Industrial Application Engineering 2020, pp. 139–143, IIAE, 01 2020.
https://doi.org/10.12792/iciae2020.028.
[19]
G. Jones, “Echolocation,” Current Biology, vol. 15, no. 13, pp. R484–R488, 2005.
https://doi.org/10.1016/j.cub.2005.06.051.
[20]
R. Nakano, Y. Ishikawa, S. Tatsuki, N. Skals, A. Surlykke, and T. Takanashi,
“Private ultrasonic whispering in moths,” Communicative & Integrative Biology,
vol. 2, no. 2, pp. 123–126, 2009. https://doi.org/10.4161/cib.7738.
[21]
B. Knutson, J. Burgdorf, and J. Panksepp, “Ultrasonic vocalizations as indices of
affective states in rats.,” Psychological bulletin, vol. 128, no. 6, pp. 961–977, 2002.
https://doi.org/10.1037/0033-2909.128.6.961.
[22]
C. V. Portfors, “Types and functions of ultrasonic vocalizations in laboratory rats
and mice,” Journal of the American Association for Laboratory Animal Science,
vol. 46, pp. 28–34, 01 2007.
https://www.ingentaconnect.com/content/aalas/jaalas/2007/00000046/00000001/art00005.
[23]
R. J. Blanchard, D. C. Blanchard, R. Agullana, and S. M. Weiss, “Twenty-two
kHz alarm cries to presentation of a predator, by laboratory rats living in visible
burrow systems,” Physiology & Behavior, vol. 50, pp. 967–972, 11 1991.
https://doi.org/10.1016/0031-9384(91)90423-l.
[24]
T. E. Holy and Z. Guo, “Ultrasonic songs of male mice,” PLOS Biology, vol. 3, 11
2005. https://doi.org/10.1371/journal.pbio.0030386.
[25]
S. L. Newar and J. Bowman, “Think before they squeak: Vocalizations of the squirrel
family,” Frontiers in Ecology and Evolution, vol. 8, 07 2020.
https://doi.org/10.3389/fevo.2020.00193.
[26]
A. Barber, A. Wilkinson, F. Montealegre-Z, V. Ratcliffe, K. Guo, and D. Mills,
“A comparison of hearing and auditory functioning between dogs and humans,”
Comparative Cognition & Behavior Reviews, vol. 15, pp. 1–50, 07 2020.
https://doi.org/10.3819/CCBR.2020.150005E.
[27]
M. T. Tyree and M. A. Dixon, “Cavitation Events in Thuja occidentalis L.?: Utrasonic
Acoustic Emissions from the Sapwood Can Be Measured,” Plant Physiology,
vol. 72, pp. 1094–1099, 08 1983. https://doi.org/10.1104/pp.72.4.1094.
[28]
A. Ponomarenko, O. Vincent, A. Pietriga, H. Cochard, É. Badel, and P. Marmottant,
“Ultrasonic emissions reveal individual cavitation bubbles in water-stressed wood,”
Journal of The Royal Society Interface, vol. 11, p. 20140480, 10 2014.
https://doi.org/10.1098/rsif.2014.0480.
[29]
A. Panthawong, S. L. Doggett, and T. Chareonviriyaphap, “The efficacy of ultrasonic
pest repellent devices against the Australian paralysis tick, Ixodes holocyclus
(Acari: Ixodidae),” Insects, vol. 12, p. 400, Apr 2021.
https://doi.org/10.3390/insects12050400.
[30]
Q. Liu, Q. Chen, Y. Chen, J. Qin, and Q. Su, “Evaluation efficiency of an ultrasonic
pest repeller on male Rattus losea,” Acta Theriologica Sinica, vol. 38, pp. 230–234,
03 2018. https://doi.org/10.16829/j.slxb.150187.
[31]
R. Adler, P. Desmares, and J. Spracklen, “An ultrasonic remote control for home
receivers,” IRE Transactions on Broadcast and Television Receivers, vol. BTR-3,
no. 1, pp. 8–13, 1957. https://doi.org/10.1109/TBTR2.1957.4502960.
[32]
T. J. Balkany, A. Hodges, R. T. Miyamoto, K. Gibbin, and O. Odabasi, “Cochlear
implants in children,” Otolaryngologic clinics of North America, vol. 34, no. 2,
pp. 455–467, 2001. https://doi.org/10.1016/s0030-6665(05)70342-x.
[33]
A. A. Eshraghi, R. Nazarian, F. F. Telischi, S. M. Rajguru, E. Truy, and C. Gupta,
“The cochlear implant: Historical aspects and future prospects,” Anatomical record
(Hoboken, N.J. : 2007), vol. 295, pp. 1967–1980, 11 2012.
http://doi.org/10.1002/ar.22580.
[34]
A. Mudry and M. Mills, “The early history of the cochlear implant: A retrospective,”
JAMA Otolaryngology–Head & Neck Surgery, vol. 139, pp. 446–453, 05 2013.
https://doi.org/10.1001/jamaoto.2013.293.
[35]
M. Hey, T. Hocke, B. Böhnke, and S. J. Mauger, “ForwardFocus with cochlear implant
recipients in spatially separated and fluctuating competing signals – introduction of
a reference metric,” International Journal of Audiology, vol. 58, no. 12, pp. 869–878,
2019. https://doi.org/10.1080/14992027.2019.1638527.
[36]
A. Dhanasingh and C. Jolly, “An overview of cochlear implant electrode array
designs,” Hearing Research, vol. 356, pp. 93–103, 2017.
https://doi.org/10.1016/j.heares.2017.10.005.
[37]
K. D. Morton, P. A. Torrione, C. S. Throckmorton, and L. M. Collins, “Mandarin
Chinese tone identification in cochlear implants: Predictions from acoustic models,”
Hearing Research, vol. 244, no. 1, pp. 66–76, 2008.
https://doi.org/10.1016/j.heares.2008.07.008.
[38]
L. R.-M. Hallberg and A. Ringdahl, “Living with cochlear implants: experiences
of 17 adult patients in Sweden,” International Journal of Audiology, vol. 43, no. 2,
pp. 115–121, 2004. https://doi.org/10.1080/14992020400050016.
[39]
K. Siedenburg, C. Saitis, and S. McAdams, “The present, past, and future of
timbre research,” in Timbre: Acoustics, Perception, and Cognition (K. Siedenburg,
C. Saitis, S. McAdams, A. N. Popper, and R. R. Fay, eds.), Springer Handbook
of Auditory Research, pp. 1–19, Springer International Publishing, 1 ed., 2019.
https://doi.org/10.1007/978-3-030-14832-4_1.
[40]
U. Zölzer, ed., DAFX: Digital Audio Effects, pp. 219–278. John Wiley & Sons, Ltd.,
2. ed., 2011.
[41]
M. Mills, “Media and prosthesis: The vocoder, the artificial larynx, and the history
of signal processing,” Qui Parle, vol. 21, pp. 107–149, 12 2012.
https://doi.org/10.5250/quiparle.21.1.0107.
[42]
É. Moulines and J. Laroche, “Non-parametric techniques for pitch-scale and time-scale
modification of speech,” Speech Communication, vol. 16, pp. 175–205, 1995.
https://doi.org/10.1016/0167-6393(94)00054-E.
[43]
R. Bristow-Johnson, “A detailed analysis of a time-domain formant-corrected
pitch-shifting algorithm,” Journal of the Audio Engineering Society. Audio
Engineering Society, vol. 43, pp. 340–352, 05 1995.
[44]
A. De Götzen, N. Bernardini, and D. Arfib, “Traditional (?) implementations
of a phase-vocoder: The tricks of the trade,” in Proceedings of the COST G-6
Conference on Digital Audio Effects (DAFX-00), Verona, Italy, pp. 37–44, 12 2000.
https://www.dafx.de/paper-archive/2000/pdf/Bernardini.pdf.
[45]
J. Laroche and M. Dolson, “New phase-vocoder techniques for pitch-shifting,
harmonizing and other exotic effects,” in Proceedings of the 1999 IEEE Workshop
on Applications of Signal Processing to Audio and Acoustics. WASPAA’99 (Cat.
No.99TH8452), pp. 91–94, 1999. https://doi.org/10.1109/ASPAA.1999.810857.
[46]
J. Laroche and M. Dolson, “Improved phase vocoder time-scale modification of audio,”
IEEE Transactions on Speech and Audio Processing, vol. 7, no. 3, pp. 323–332, 1999.
https://doi.org/10.1109/89.759041.
[47]
J. H. Engel, C. Resnick, A. Roberts, S. Dieleman, D. Eck, K. Simonyan, and
M. Norouzi, “Neural audio synthesis of musical notes with wavenet autoencoders,”
CoRR, vol. abs/1704.01279, 2017. https://arxiv.org/abs/1704.01279.
[48]
C. Brazier, “Pitch shifting: A machine learning method for music transformation,”
August 2017. MSc internship report, Sound and Music Computing Group, Aalborg
University of Copenhagen,
https://www.atiam.ircam.fr/Archives/Stages1617/BRAZIER_Charles_Memoire_Stage.pdf - Accessed: 26.07.2021.
[49]
S. Wager, G. Tzanetakis, C. Wang, and M. Kim, “Deep autotuner: A pitch correcting
network for singing performances,” in ICASSP 2020 - 2020 IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 246–250, 2020.
https://doi.org/10.1109/ICASSP40776.2020.9054308.
[50]
V. Pulkki, L. McCormack, and R. Gonzalez, “Superhuman spatial hearing technology
for ultrasonic frequencies,” Scientific Reports, vol. 11, p. 11608, 06 2021.
https://doi.org/10.1038/s41598-021-90829-9.
[51]
M. L. Rubin, “Spectacles: Past, present, and future,” Survey of Ophthalmology,
vol. 30, no. 5, pp. 321–327, 1986. https://doi.org/10.1016/0039-6257(86)90064-0.
[52]
R. L. B. Selinger, “Materials science of vision correction: Glasses and
sunglasses,” MRS Bulletin, vol. 22, no. 7, pp. 71–71, 1997.
https://doi.org/10.1557/S0883769400033455.
[53]
A. A. Galadima, “Arduino as a learning tool,” in 2014 11th International Conference
on Electronics, Computer and Computation (ICECCO), pp. 1–4, IEEE, 2014.
https://doi.org/10.1109/ICECCO.2014.6997577.
[54]
W. Barfield and A. Williams, “Cyborgs and enhancement technology,” Philosophies,
vol. 2, p. 4, 01 2017. https://doi.org/10.3390/philosophies2010004.
[55]
K. Warwick, “The cyborg revolution,” NanoEthics, vol. 8, 12 2014.
https://doi.org/10.1007/s11569-014-0212-z.
[56]
C. Rumbo, C. C. Espina, J. Gassmann, O. Tosoni, R. Barros García, S. M. Martín,
and J. A. Tamayo-Ramos, “In vitro safety evaluation of rare earth-lean alloys for
permanent magnets manufacturing,” Scientific Reports, vol. 11, p. 12633, 06 2021.
https://doi.org/10.1038/s41598-021-91890-0.
[57]
J. Hameed, I. Harrison, M. N. Gasson, and K. Warwick, “A novel human-machine
interface using subdermal magnetic implants,” in 2010 IEEE 9th International
Conference on Cybernetic Intelligent Systems, pp. 1–5, 2010.
https://doi.org/10.1109/UKRICIS.2010.5898141.
[58]
I. Harrison, K. Warwick, and V. Ruiz, “Subdermal magnetic implants: An
experimental study,” Cybernetics and Systems, vol. 49, no. 2, pp. 122–150, 2018.
https://doi.org/10.1080/01969722.2018.1448223.
[59]
J. L. Butterfield, S. P. Keyser, K. V. Dikshit, H. Kwon, M. I. Koster, and C. J.
Bruns, “Solar freckles: Long-term photochromic tattoos for intradermal ultraviolet
radiometry,” ACS Nano, vol. 14, no. 10, pp. 13619–13628, 2020.
https://doi.org/10.1021/acsnano.0c05723.
[60]
K. Kaspar, S. König, J. Schwandt, and P. König, “The experience of new sensorimotor
contingencies by sensory augmentation,” Consciousness and Cognition, vol. 28,
pp. 47–63, 08 2014. https://doi.org/10.1016/j.concog.2014.06.006.
[61]
S. U. König, F. Schumann, J. Keyser, C. Goeke, C. Krause, S. Wache, A. Lytochkin,
M. Ebert, V. Brunsch, B. Wahn, K. Kaspar, S. K. Nagel, T. Meilinger,
H. Bülthoff, T. Wolbers, C. Büchel, and P. König, “Learning new sensorimotor
contingencies: Effects of long-term use of sensory augmentation on the brain
and conscious perception,” PLOS ONE, vol. 11, pp. 1–35, 12 2016.
https://doi.org/10.1371/journal.pone.0166647.
[62]
A. Schiffmann, M. Clauss, and P. Honigmann, “Biohackers and self-made problems:
Infection of an implanted RFID/NFC chip,” JBJS Case Connector, vol. 10, p. e0399,
05 2020. https://doi.org/10.2106/JBJS.CC.19.00399.
[63]
S. Piesnack, M. E. Frame, G. Oechtering, and E. Ludewig, “Functionality of
veterinary identification microchips following low- (0.5 tesla) and high-field (3 tesla)
magnetic resonance imaging,” Veterinary Radiology & Ultrasound, vol. 54, no. 6,
pp. 618–622, 2013. https://doi.org/10.1111/vru.12057.
[64]
S. Wagemakers, L. Van Zoonen, and G. Turner, “Giving meaning to RFID and cochlear
implants,” IEEE Technology and Society Magazine, vol. 33, no. 2, pp. 73–80, 2014.
https://doi.org/10.1109/MTS.2014.2319978.
[65]
E. Pearlman, “I, Cyborg,” PAJ: A Journal of Performance and Art, vol. 37, pp. 84–90,
05 2015. https://doi.org/10.1162/PAJJ_a_00264.
[66]
N. Hindi, “S1E11. Moon Ribas & Manel de Aguas. Cyborg Art: Creating New
Senses and Organs.,” 10 2020. Shaping Business Minds Through Art - The Artian
Podcast,
https://podcast.theartian.com/1032724/5672227-11-moon-ribas-manel-de-aguas-cyborg-art-creating-new-senses-and-organs.
[67]
P. Bach-y-Rita, C. C. Collins, F. A. Saunders, B. White, and L. Scadden, “Vision
substitution by tactile image projection,” Nature, vol. 221, pp. 963–964, 03 1969.
https://doi.org/10.1038/221963a0.
[68]
E. Sampaio, S. Maris, and P. Bach-y-Rita, “Brain plasticity: ‘visual’ acuity of
blind persons via the tongue,” Brain Research, vol. 908, no. 2, pp. 204–207, 2001.
https://doi.org/10.1016/S0006-8993(01)02667-1.
[69]
D. Eagleman, “Who will we be?,” in The brain: The story of you, pp. 186–235,
Canongate Books, 2015.
[70]
Freescale Semiconductor, Inc., “SGTL5000 data sheet.”
https://www.nxp.com/docs/en/data-sheet/SGTL5000.pdf - Accessed: 08.07.2021.
[71]
C. Mydlarz, S. Nacach, E. Rosenthal, M. Temple, T. H. Park, and A. Roginska,
“The implementation of MEMS microphones for urban sound sensing,” 137th Audio
Engineering Society Convention 2014, 10 2014.
[72]
K. Darras, B. Kolbrek, A. Knorr, V. Meyer, M. Zippert, and A. Wenzel,
“Assembling cheap, high-performance microphones for recording terrestrial wildlife: the
Sonitor system,” F1000Research, vol. 7, p. 1984, 02 2021.
https://doi.org/10.12688/f1000research.17511.3.
[73]
Microsoft Corporation, IBM Corporation, Multimedia Programming Interface and
Data Specifications 1.0, pp. 56–65, 08 1991. 1.0 ed.
[74]
A. Streicher, Luftultraschall-Sender-Empfänger-System für einen künstlichen
Fledermauskopf. PhD thesis, Technischen Fakultät der Universität Erlangen-Nürnberg,
2008. https://d-nb.info/991236505/34.