Received March 2, 2019, accepted May 4, 2019, date of publication May 27, 2019, date of current version June 13, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2919162
Applicability of Immersive Analytics
in Mixed Reality: Usability Study
BURKHARD HOPPENSTEDT 1, THOMAS PROBST 2, MANFRED REICHERT 1,
WINFRIED SCHLEE 3, KLAUS KAMMERER 1, MYRA SPILIOPOULOU 4,
JOHANNES SCHOBEL 1, MICHAEL WINTER 1, ANNA FELNHOFER 5,
OSWALD D. KOTHGASSNER 6, AND RÜDIGER PRYSS 1
1Institute of Databases and Information Systems, University of Ulm, 89081 Ulm, Germany
2Department for Psychotherapy and Biopsychosocial Health, Danube University Krems, 3500 Krems an der Donau, Austria
3Department of Psychiatry and Psychotherapy, University of Regensburg, 93053 Regensburg, Germany
4Faculty of Computer Science, Otto von Guericke University Magdeburg, 39106 Magdeburg, Germany
5Department of Pediatrics and Adolescent Medicine, Medical University of Vienna, 1090 Vienna, Austria
6Department of Child and Adolescent Psychiatry, Medical University of Vienna, 1090 Vienna, Austria
Corresponding author: Burkhard Hoppenstedt ([email protected]).
ABSTRACT Nowadays, visual analytics is mainly performed by programming approaches and viewing the
results on a desktop monitor. However, due to the capabilities of smart glasses, new forms of user interaction and
representation become possible. This applies especially to 3D visualizations in the medical field
as well as the industry domain, where valuable depth information can be related to complex real-world
structures and related data, an approach also denoted as immersive analytics. However, the applicability of
immersive analytics and its drawbacks, especially in the context of mixed reality, are quite unexplored.
In order to validate the feasibility of immersive analytics for the aforementioned purposes, we designed and
conducted a usability study with 60 participants. More specifically, we evaluated the effects of spatial sounds,
performance changes from one analytics task to another, expert status, and compared an immersive analytics
approach (i.e., a mixed-reality application) with a desktop-based solution. Participants had to solve several
data analytics tasks (outlier detection and cluster recognition) with the developed mixed-reality application.
Thereby, the performance measures regarding time, errors, and movement patterns were evaluated. The separation
into groups (low performer vs. high performer) was performed using a mental rotation pretest. When
solving analytic tasks in mixed reality, participants changed their movement patterns
significantly, while the use of spatial sounds reduced the handling time significantly, but did not affect
the movement patterns. Furthermore, the usage of mixed reality for cluster recognition is significantly faster
than the desktop-based solution (i.e., a 2D approach). Moreover, the results obtained with self-developed
questionnaires indicate 1) that wearing smart glasses is perceived as a potential stressor and 2) that the
utilization of sounds is perceived very differently by the participants. Altogether, industry and researchers
should consider immersive analytics as a suitable alternative compared to the traditional approaches.
INDEX TERMS Immersive analytics, mixed reality, spatial sounds, visual analytics, hololens.
I. INTRODUCTION
Augmented reality glasses, so-called smart glasses, are used
in various contexts and their interaction patterns and visu-
alizations help to simplify many procedures and processes.
However, their main advantage is that users can continue their
work by using their hands, while interacting with the smart
glasses via voice or gaze commands for other work tasks.
The associate editor coordinating the review of this manuscript and
approving it for publication was Tai-hoon Kim.
This is, for example, beneficial for maintenance tasks that
have to be accomplished in industry [1] or medicine [2].
Moreover, cognitive processes may benefit from bodily
experiences [3] and, hence, required learning processes to
accomplish these tasks can be improved. Furthermore, when
dealing with dangerous working environments [4], the use
of augmented-reality glasses allows for a realistic interac-
tion with a machine without being on-site. Additionally,
augmented-reality glasses allow for a flexible and exchange-
able visualization of expensive and complex objects, such
VOLUME 7, 2019
2169-3536 2019 IEEE. Translations and content mining are permitted for academic research only.
Personal use is also permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
71921
B. Hoppenstedt et al.: Applicability of Immersive Analytics in Mixed Reality: Usability Study
as houses or construction machines [5]. In this context,
augmented reality is categorized through the virtual-reality
continuum [6]. Hereby, mixed reality [7] is known to have
the highest overlap of reality and virtuality, trying to max-
imize the mixing of reality and virtuality compared to aug-
mented reality. Mixed reality, in turn, relies on the concept
of spatial mapping, also denoted as 3D reconstruction [8].
In a previous work, we showed the applicability of mixed
reality in the medical context for the purposes of visual
analytics [9] as well as for augmented data analytics [10],
which is also denoted as immersive analytics [11]. How-
ever, the work at hand addresses the following research
questions when performing immersive analytic tasks by
using mixed-reality applications in challenging contexts like
medicine or Industry 4.0:
RQ1: Does the task performance differ significantly for
participants with different spatial imagination abilities? We
denote high performers as participants with a high spatial
imagination ability. Note that we measure spatial imagina-
tion abilities using a mental rotation test and declare those
participants with a higher score than the median as high
performers, while the remaining participants are denoted as
low performers. For these two groups, we analyze whether
high performers have a greater benefit from an augmented
reality solution than low performers [12].
RQ2: Does the task performance differ significantly over
time? With the help of a smart glass, we measure the time
and movement patterns (e.g., walking paths and gaze) in
several tasks to investigate whether these variables change
significantly from task to task over time.
RQ3: Do spatial sounds significantly improve the task
performance in a mixed-reality application? The used device,
a Microsoft HoloLens [13], includes the possibility to provide
spatial sounds coming from a set of speakers around the
user’s head. This concept offers the possibility to draw the
participant’s attention to a spot in the room [14]. Notably,
we do not visualize these sounds as suggested by other related
works (e.g., [15]). The goal of this approach was to evaluate
the pure perception of these sounds.
RQ4: How stressful is the use of a mixed-reality application
in general? Stress is an important factor
in the context of augmented representations [16]. Next to
motion-induced sickness, there exists a simulator sickness
due to the unknown visual stimulation [17]. Furthermore,
the weight of a smart glass might be a disturbing
factor. For example, the Microsoft HoloLens weighs 579 g
(1.28 lb), which causes a noticeably different feeling when
worn. Therefore, the psychosocial stress level is measured
via self-developed questionnaires. Additionally, a skin con-
ductance measurement was carried out, which is not included
in this paper, but will be evaluated in further studies.
RQ5: Do users achieve a better task performance when using
a 3D approach compared to a 2D approach? The general
question whether immersive analytics are superior to
2D representations is intensively discussed in the scientific
community. Gracia et al. [18], for example, stated that 3D
representations are more suitable than 2D plots. To be more
precise, [18] carried out a study based on loss of quality
quantification, using the tasks point classification, distance
perception, and outlier identification as use cases. However,
the visualizations were shown without any augmented-reality
glasses. The theory that a high degree of physical immer-
sion results in lower interaction times is proposed by [19].
Another study favored the 3D space [20], by comparing
2D and 3D visualizations on interaction times and errors.
Study participants were asked to identify clusters, determine
the dimension of a dataset, and classify the sparseness of
data. Immersive analytics can be combined with advanced
analytics, such as dimensionality reduction for an improved
visualization of scatter plots. However, another study [21]
reveals advantages of 2D scatter plots when reducing the
dimensionality of a data set, as it leads to lower interaction
costs. Therefore, this study compares a 3D approach with a
desktop-based solution.
The remainder of this work is structured as follows.
In Section II, the used materials and methods are presented,
while Section III presents the results of the conducted study.
Section IV discusses these results and includes the limitations
of this work as well as a summary.
II. METHODS
A. PRETEST APPLICATIONS
1) MENTAL ROTATION TEST
A mental rotation test assesses the ability of spatial vision and
imagination. Since the 3D point clouds contain an enormous
amount of information regarding spatial distance, we assume
that a good spatial vision is essential to solve the analytics
tasks (see RQ1). We classify the spatial imagination based
on a digitalized version of the mental rotation test presented
by [22]. In general, a study participant has to solve problems
by rotating 3D objects in his mind. In previous studies using
mental rotation tests, males scored significantly higher than
females. However, more recent studies reported a smaller
gap [23]. For our study, we implemented a graphical user
interface for a desktop computer (see Fig. 1), where the user
gets the task to select two identical objects using buttons. The
time limit of two minutes for this test is displayed by a red
progress bar and an arbitrary number of tasks can be solved
FIGURE 1. Screenshot of the digital mental rotation test.
within these two minutes. Then, when the user clicks a button,
the current timestamp is recorded. Hereby, the median value
of correctly solved tasks was four.
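The median split used to separate the two groups can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `median_split`, the participant identifiers, and the example scores are hypothetical. In line with RQ1, only participants scoring strictly above the median are labeled as high performers.

```python
from statistics import median


def median_split(scores):
    """Label a participant 'high' if the score is strictly above the median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Hypothetical numbers of correctly solved mental rotation tasks.
scores = {"p01": 2, "p02": 4, "p03": 5, "p04": 4, "p05": 7}
groups = median_split(scores)
```

Because ties at the median fall into the low group, the two groups need not be of equal size.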
2) SPATIAL SOUND TEST
The HoloLens offers a concept named spatial sound [24].
Hereby, the sound is emitted from a set of speakers
around the user’s head, which allows the user to perceive the
direction and location of a hologram in the room. We conduct
this test using a headset that emulates a Dolby 5.1 system
with the channels left (l), central (c), right (r), left surround (ls),
and right surround (rs) (see Fig. 2A). Without any time
limit, users have to choose the direction using four buttons,
schematically indicated in Fig. 2B. Six times, an audio sample
of 13 seconds with the sound of footsteps is played and
presented to the participants, without any repetition. After
each playback, the participant has to guess the right direction.
The aim of this test was to identify participants with low
scores (less than two solved tasks) as potential outliers for
the spatial sound part of this study. However, all participants
passed this test and solved more than two tasks.
FIGURE 2. (A) Audio configuration for the task Back, (B) schematic user
interface.
B. MEASURES
The following performance measures were assessed in the use
case of outlier detection (see below): time, path, and angle.
The following performance measures were collected in the
use case of cluster recognition (see below): time and errors.
Note that the processed questionnaires and performed stress
measurements are related to both use cases.
1) TIME
Timestamps were added to a CSV file during the study. Those
timestamps were collected when pressing a button in a 2D
interface, sending a voice command in the HoloLens appli-
cation, or after completing a task. This allowed us to evaluate
the required time to complete respective tasks.
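As a rough sketch of how task durations could be derived from such a log, the snippet below reads timestamps from CSV text and takes the difference between consecutive events. The column names `event` and `timestamp_s`, the function name, and the sample log are assumptions made for illustration; the paper does not describe the actual file layout.

```python
import csv
import io


def task_durations(csv_text):
    """Return the seconds elapsed between consecutive logged events."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    times = [float(row["timestamp_s"]) for row in rows]
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Hypothetical log: one row per button press, voice command, or completed task.
log = "event,timestamp_s\nstart,0.0\ntask1_done,12.5\ntask2_done,20.0\n"
durations = task_durations(log)
```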
2) ERRORS
This performance measure is only applicable to the use case
cluster recognition [25], as the answer for recognized clusters
can be compared to the real value. Furthermore, it is possible
to measure the degree of error, as the answer of recognized
clusters can be either seen binary (true/false) or of numeric
character (distance to real value). Note that we solely focused
on the binary scale.
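The two ways of scoring an answer described above can be written down in a few lines. The function names are hypothetical; only the binary variant corresponds to the scale used in this study.

```python
def binary_error(answer, true_clusters):
    """True if the reported number of clusters is wrong (binary scale)."""
    return answer != true_clusters


def numeric_error(answer, true_clusters):
    """Distance of the reported number of clusters to the real value (numeric scale)."""
    return abs(answer - true_clusters)
```

For example, reporting three clusters when five are present counts as one error on the binary scale and as a distance of two on the numeric scale.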
3) PATH
The HoloLens collects the current position as a 3D vec-
tor relative to the starting position [26] at a frame rate
of 60 frames per second [27]. The distance between two
points is denoted as section. The y-coordinate (height of
person) is ignored and the points are mapped to the xz-plane
(floor). Based on this data, several features can be extracted.
First, the length of the path (path) can be calculated. This
value indicates whether the participant walked a lot during
the accomplishment of the respective task. Second, the mean
of all sections (pathAvg) can be calculated. This expression
represents the average speed of the participant, represented
by the unit meter per frame. Third, the variance of all sections
(pathVar) represents the erraticness of a movement. Finally,
the Bounding Box (BBox) of the movement is calculated as
the area limited by the maximum and minimum positions
in x- and z-directions. Consequently, participants who often
change their perspective to different positions produce high
BBox values.
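The four path features can be sketched as follows, assuming the log provides one (x, y, z) position per frame. The function name and the sample positions are hypothetical, and the use of the population variance is an assumption, since the text does not state which variance estimator was used.

```python
import math
from statistics import mean, pvariance


def path_features(positions):
    """Compute path, pathAvg, pathVar, and BBox from per-frame (x, y, z) positions."""
    # Ignore the height (y) and project every position onto the floor (xz-plane).
    floor = [(x, z) for x, _y, z in positions]
    # A section is the distance between two consecutive points.
    sections = [math.dist(a, b) for a, b in zip(floor, floor[1:])]
    xs = [point[0] for point in floor]
    zs = [point[1] for point in floor]
    return {
        "path": sum(sections),           # total walked distance
        "pathAvg": mean(sections),       # average speed in meters per frame
        "pathVar": pvariance(sections),  # erraticness of the movement
        "BBox": (max(xs) - min(xs)) * (max(zs) - min(zs)),  # covered floor area
    }


# Hypothetical positions in meters, relative to the starting position.
features = path_features([(0, 1.7, 0), (3, 1.7, 4), (3, 1.6, 4), (6, 1.7, 8)])
```

In this example the participant stands still for one frame, so the sections are 5, 0, and 5 meters, which yields a nonzero pathVar even though the walked line is straight.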
4) ANGLE
The angle measures the rotation between the aforementioned
3D vectors. Analogous to the path, we calculate the average
(angleAvg) to represent the rotation speed in degrees per
frame. In addition, we measure the erraticness of the rotation
using the variance (angleVar).
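A possible computation of the angle features is sketched below. It assumes that one 3D direction vector is available per frame and that the rotation is the angle between consecutive vectors; both the interpretation and the function names are assumptions, as is the use of the population variance.

```python
import math
from statistics import mean, pvariance


def angle_deg(a, b):
    """Angle in degrees between two 3D vectors, using atan2 for numerical stability."""
    cross = (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )
    dot = sum(x * y for x, y in zip(a, b))
    return math.degrees(math.atan2(math.hypot(*cross), dot))


def angle_features(directions):
    """Compute angleAvg and angleVar from per-frame direction vectors."""
    angles = [angle_deg(a, b) for a, b in zip(directions, directions[1:])]
    return {"angleAvg": mean(angles), "angleVar": pvariance(angles)}


# Hypothetical directions: quarter turn, no turn, quarter turn.
rotation = angle_features([(1, 0, 0), (0, 0, 1), (0, 0, 1), (-1, 0, 0)])
```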
5) STRESS-LEVEL
The state version of the State-Trait Anxiety Inventory (STAI)
questionnaire [28] was utilized and handed out at the beginning
and at the end of this study. This questionnaire consists
of 20 items and measures state anxiety, a construct similar
to state stress. Hereby, positive attributes (e.g., calm) and
negative attributes (e.g., worried) are answered, using the
scale [1,2,3,4]. For the evaluation of this questionnaire, all
positive attributes are flipped (e.g., an answer ’4’ becomes
a ’1’), and all answers are summed up to reveal a final
STAI score. Moreover, skin conductance was measured with
30 randomly selected users with the tool movisens [29], but
this physiological measure of stress will be analyzed in future
studies.
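The scoring rule described above (flip the positive attributes, then sum all answers) can be sketched as follows. The function name is hypothetical, and the set of positively worded items used in the example is a placeholder: the actual item numbers are defined by the STAI questionnaire and are not listed in this paper.

```python
def stai_state_score(answers, positive_items):
    """Sum 20 ratings on the scale 1..4 after flipping the positively worded items."""
    total = 0
    for item, rating in answers.items():
        if rating not in (1, 2, 3, 4):
            raise ValueError(f"item {item}: rating {rating} is outside the scale 1..4")
        # Flipping maps 4 -> 1, 3 -> 2, 2 -> 3, and 1 -> 4.
        total += (5 - rating) if item in positive_items else rating
    return total


# Placeholder assumption: items 1..10 are positive (e.g., calm), items 11..20 negative.
positive_items = set(range(1, 11))
calm_answers = {item: (4 if item in positive_items else 1) for item in range(1, 21)}
calm_score = stai_state_score(calm_answers, positive_items)
```

With 20 items rated from 1 to 4, the resulting score ranges from 20 (lowest state anxiety) to 80 (highest state anxiety).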
6) SELF-DEVELOPED QUESTIONNAIRE
A self-developed questionnaire (see Appendix A) asks for the
participant’s feedback. The scale ranged from one to ten,
where ten represents a high level of agreement with the
question. This questionnaire was handed out at the end of this
study.
7) DEMOGRAPHIC QUESTIONNAIRE
Using a demographic questionnaire, we assessed gender, age,
and education of all participants.
FIGURE 3. Sketch of use case: Outlier’s detection.
C. MIXED-REALITY APPLICATION AND 2D ALTERNATIVE
New possibilities in spatial sounds, gaze, and voice com-
mands offer interesting perspectives for simplifying data
analytics. Therefore, we implemented a prototype to tackle
outlier detection as well as cluster recognition as use cases
for data analytics. The device used is a Microsoft HoloLens.
This smart glass is able to generate a virtual representation
of the room and projects holograms into the room. These
holograms stay in place, while the user is able to walk around
the room and inspect them from different angles.
1) OUTLIER DETECTION
For the use case outlier detection, we create a normally
distributed 3D point cloud. These points are displayed
all over the room in which the study takes place. All of
them appear white to the user, except one, which is red. The
user’s task is to find the red-marked point. Hereby, eight tasks
for outlier detection have to be solved, without any time
limit, whereby four tasks are supported by spatial sounds.
A constant sound of 44100 Hz is coming from the direction of
the red point. If the user approaches the red point, the volume
increases. In contrast, if the participant walks away, the sound
volume decreases. The current gaze is indicated by a green
point. The participant confirms the finding of an outlier
by focusing his/her gaze on the point. When an outlier
is found, its color changes to white and the next outlier has
to be found. Sound-supported and sound-unsupported tasks
alternate in their order. Furthermore, one test group starts with
sound support, while the other one starts without it. Participants
were assigned randomly to these two conditions, so that high
performers and low performers are in both conditions. The
outliers are placed at different distances and angles
relative to the start position. The order of the tasks is the
same for every participant; it is not sorted by distance to
the outliers, but was chosen randomly. Moreover, in order to be able
to automatically collect the data needed for the evaluation,
a log mode was implemented. The application tracks the time
during the tasks, as well as the current position and the angle
relative to the starting pose. Therefore, every participant starts
at the same spot and with the same gaze focus. Based on
the position and angle, more performance measures, such as
bounding box or length of path, can be calculated (see Fig. 3).
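The setup of a single outlier detection task can be approximated with the following sketch. It is not the HoloLens application itself: the function names, the number of points, and the standard deviation are hypothetical, and the inverse-distance volume curve is an assumption, since the text only states that the volume increases when the user approaches the red point.

```python
import math
import random


def make_outlier_task(n_points=100, sigma=1.0, seed=0):
    """Create a normally distributed 3D point cloud and pick one point to mark red."""
    rng = random.Random(seed)
    points = [tuple(rng.gauss(0.0, sigma) for _ in range(3)) for _ in range(n_points)]
    red_index = rng.randrange(n_points)
    return points, red_index


def sound_volume(listener, source):
    """Volume in (0, 1] that grows as the listener approaches the sound source."""
    return 1.0 / (1.0 + math.dist(listener, source))


points, red_index = make_outlier_task()
red_point = points[red_index]
```

Any monotonically decreasing function of the distance would show the same qualitative behavior; the curve above is merely one simple choice.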
2) CLUSTER RECOGNITION
For the second use case (cluster recognition), we compare a
2D and a 3D approach. All participants had to solve 12 clus-
ter recognition tasks without time limit, whereby six had
to be analyzed using the HoloLens and six with the 2D
approach. Half of the participants, chosen randomly, started
with the six tasks using the HoloLens, followed by the six
tasks using the 2D approach, whereas the other half started
with the six tasks using the 2D approach, followed by the six
tasks using the HoloLens. Therefore, high performers and low
performers were in each of the two conditions. After
the six tasks of one approach were finished, the other six
tasks had to be accomplished. The six tasks are the same for
both approaches and must be solved without any time limit.
For the tasks in general, we used several normally distributed
point clouds, denoted as clusters. The tasks are generated by
combining these clusters into one plot, for which the number
of clusters in one plot differs. Hereby, the number of clusters
and the degree of overlap are changed between each task,
whereby a high number of clusters and overlaps might be
considered as more difficult. Again, the six tasks within each
approach (2D and 3D) are sorted randomly, but in the same
order for each participant. The resulting plots have to be
inspected from different angles to eventually recognize the
clusters (see Fig. 4). For the 2D approach, a Matlab GUI on
a Desktop PC was introduced. When using the HoloLens,
the participants have the possibility to walk around the cluster
to visually separate the clusters. In contrast, the Matlab plot
can be rotated using the mouse wheel. The confirmation of
the number of clusters is done using a voice command when
using the HoloLens, and with the help of a TextBox + Button
when accomplishing tasks in Matlab.
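The generation of a cluster recognition task, as described above, can be sketched by combining several normally distributed point clouds into one plot. The function name, the cluster centers, the number of points per cluster, and the standard deviation are hypothetical values chosen for illustration.

```python
import random


def make_cluster_task(centers, points_per_cluster=50, sigma=0.3, seed=0):
    """Combine one normally distributed point cloud per center into a single plot."""
    rng = random.Random(seed)
    cloud = []
    for cx, cy, cz in centers:
        for _ in range(points_per_cluster):
            cloud.append((rng.gauss(cx, sigma), rng.gauss(cy, sigma), rng.gauss(cz, sigma)))
    rng.shuffle(cloud)  # the plot must not reveal cluster membership by point order
    return cloud, len(centers)


# Hypothetical task with three clusters; closer centers or a larger sigma
# increase the overlap and thereby the presumed difficulty.
cloud, true_clusters = make_cluster_task([(0, 0, 0), (1, 0, 0), (0, 2, 1)])
```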
D. STUDY PROCEDURE
A controlled environment was chosen for this study in
order to be able to quickly react to upcoming problems.
For the study, one laptop and the HoloLens were pro-
vided. Before each session, the study was carefully prepared.
FIGURE 4. A gaze at the same point cloud from different angles (A-C) reveals clusters.
FIGURE 5. Study design.
This includes, for example, setting up the log mode and
setting all applications into a default state as well as preparing
the questionnaires.
Participants solved tasks from the use cases outlier detection
and cluster recognition in one session (43 min on average).
Half of the participants started with the outlier detection,
while the other half started with cluster recognition tasks.
The initial use case was chosen randomly. If the participants
started with the eight tasks of outlier detection, afterwards,
the same participants performed the 12 tasks of cluster recog-
nition and vice versa (see Fig. 5, blue box). The procedure
of the study is outlined in Fig. 5. The study started with
welcoming the participants and introducing the goal of the
study as well as the overall procedure. Participants using
the movisens (skin conductance measurement) had to complete
a short resting phase to obtain a baseline measure. All
participants filled out the state version of the STAI question-
naire before the start of the experiment. Next, the participants
performed the mental rotation test (see Fig. 1) to be able
to divide them into high and low performers, followed by
the spatial sound test (see Fig. 2) to measure their spatial
hearing ability. We used a median split (median = 4) of
the test scores in the mental rotation test, which resulted
in 29 high performers (48.33%) and 31 low performers
(51.67%). Next, the participants were separated randomly into
two groups to start either with outlier detection or cluster
recognition and continue with the remaining use case afterwards.
Concluding the session, participants had to answer the
state version of the STAI questionnaire again, as well as the
self-developed questionnaire, and a demographic question-
naire. Altogether, this session took between 40 and 50 min-
utes. The data, automatically recorded by each application,
were then stored on the laptop’s storage after the session.
All materials and methods were approved by the Ethics
Committee of Ulm University and were carried out in accor-
dance with the approved guidelines. All participants gave
their written informed consent.
E. PARTICIPANTS
In total, 60 participants were recruited (see Table 1), whereby
most of them were recruited at Ulm University and software
companies. The study included students from various
TABLE 1. Sample description and comparison between low and high
performers in baseline variables.
subjects [30], such as computer science, physics, or psy-
chology. Ten female participants and 50 male participants
joined the study, whereby the majority was between 25 and
35 years old. Recruited professionals were mainly software
developers, not necessarily with a focus on smart glasses.
The target group of immersive analytics (e.g., data scientists,
production workers) has most probably no experience with
smart glasses. Study participants willing to participate were
instructed according to the developed study design, which
was explained to them before. The participants were classified
into the groups high and low performers according to
the mental rotation test. Altogether, our classification resulted
in 31 low performers (7 female and 24 male) and 29 high
performers (3 female and 26 male).
F. STATISTICS
Matlab R2017a, RPY2 [31], and SPSS 25.0 were used for
statistical analyses. Frequencies, percentages, means, and
standard deviations were calculated as descriptive statis-
tics. Low and high performers were compared in base-
line demographic variables using Fisher’s exact tests and
t-Tests for independent samples. For RQ1, RQ2, RQ3, and
RQ5, linear multilevel models [32] with the full maximum
likelihood estimation were performed. Hereby, two levels
were included, where level one represents the repeated assess-
ments (either in outlier detection or cluster recognition),
whereas level two represents the participants. The perfor-
mance measures (except errors) were the dependent variables
in these models. In RQ 1, we also used Fisher’s exact tests for
the error probabilities. The STAI scores were evaluated using
t-Tests for dependent samples for RQ4. In RQ5, the effect
of 2D vs. HoloLens was explored, using McNemar’s test for
the error probability. All statistical tests were performed
two-tailed; the significance level was set to P < .05.
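To illustrate the Fisher's exact test used for the error probabilities, the following sketch computes a two-sided p-value for a 2x2 table with the Python standard library only. It is not the SPSS procedure used in the study, and the function name is hypothetical. The example table assumes that each reported error stems from a different participant, so that the 2D counts of RQ1 (4 errors among 31 low performers, 2 errors among 29 high performers) form the table; the paper does not state this layout explicitly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        # Unnormalized hypergeometric probability of k events in the first row.
        return comb(row1, k) * comb(row2, col1 - k)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Sum all tables that are at most as probable as the observed one.
    # Integer weights avoid floating-point ties in this comparison.
    extreme = sum(weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed)
    return extreme / comb(row1 + row2, col1)


# Rows: low and high performers; columns: with error and without error (assumed layout).
p_2d = fisher_exact_two_sided(4, 27, 2, 27)
```

Under the stated assumption, `p_2d` rounds to .672, which agrees with the value reported for the 2D approach in the results for RQ1.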
G. DATA AVAILABILITY
The raw data set containing all collected data that was ana-
lyzed during this study is included in this paper (and its
supplementary material).
III. RESULTS
This section discusses the results of the conducted study.
A. BASELINE COMPARISON BETWEEN LOW
AND HIGH PERFORMERS
Table 1 summarizes the sample description and comparisons
between low and high performers in baseline variables.
No significant differences were found. Descriptively,
the low performers had a higher percentage of female partic-
ipants than the high performers and the high performers were
younger than the low performers.
B. RESULTS FOR RQ1
Concerning the errors for cluster recognition, the performance
did not differ significantly between high and low performers
for the 2D approach (4 errors for low performers, 2 errors for
high performers, p = .672) or the 3D approach (8 errors for low
performers, 2 errors for high performers, p = .082).
In Table 2, high performers were compared to
low performers across the eight tasks of the outlier detection
use case using the HoloLens. Hereby, the high performers were
significantly quicker to solve the tasks than the low performers
(p = .013). Moreover, the high performers tended to require a
shorter walking distance (p = .071) to solve the
tasks. The multilevel models exploring differences between high
performers and low performers with respect to time in 2D
and 3D cluster recognition did not converge.
C. RESULTS FOR RQ2
During the eight outlier detection tasks in HoloLens,
the BoundingBox (p < .001), Pathlength (p = .709),
PathVariance (p < .001), PathMean (p < .001), AngleVariance
(p < .001), and AngleMean (p < .001) increased
significantly from task to task (see Table 3). The recorded
time (p = .709) did not change significantly from task to task
in outlier detection using the HoloLens. Concerning cluster
recognition using the HoloLens, the recorded time (p < .187)
did not increase from task to task (see Table 4). The change
TABLE 2. Results of the multilevel models for RQ 1 (Outlier detection in HoloLens).
TABLE 3. Results of the multilevel models for RQ 2 (Outlier detection in HoloLens).
of time from task to task in cluster recognition in the desktop
approach is not reported since these multilevel models did not
converge.
D. RESULTS FOR RQ3
With the help of spatial sounds, the users were able to solve
the outlier detection tasks quicker than without using
spatial sounds (p = .020, see Table 5).
E. RESULTS FOR RQ4
At the pre-assessment, the average STAI state score
was M = 44.58 (SD = 4.67). At post-assessment, it was
M = 45.72 (SD = 4.43). This change did not attain statistical
significance (p = .175). Descriptive statistics of the answers
in the self-developed questionnaire are presented in Fig. 6.
Hereby, the number of outliers for the question Support Sound is
remarkable.
F. RESULTS FOR RQ5
The HoloLens approach resulted in significantly faster cluster
recognition times than using a desktop computer (p < .001,
see Table 6). However, note that the speed advantage of
the HoloLens was rather small (i.e., in a milliseconds
range).
TABLE 4. Results of the multilevel models for RQ 2 (Cluster recognition in HoloLens).
TABLE 5. Results of the multilevel models for RQ 3 (Outlier detection in HoloLens).
FIGURE 6. Box plots of questionnaire tasks.
When using the HoloLens, ten participants made errors.
In contrast, six participants answered wrongly when analyzing
the tasks with the desktop solution (p = .344).
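McNemar's test, which was used for this comparison, only considers participants whose outcome differs between the two conditions. The exact variant can be sketched with the standard library as shown below; the function name is hypothetical. The paper reports only the marginal counts (ten participants with errors using the HoloLens, six with the desktop solution), so the discordant counts in the example are an assumption: they correspond to one possible split in which three participants made errors in both conditions.

```python
from math import comb


def mcnemar_exact(b, c):
    """Exact two-sided McNemar test based on the two discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0
    # Under the null hypothesis, each discordant pair falls on either side with p = 0.5.
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Assumed split: 7 participants erred only with the HoloLens, 3 only on the desktop.
p_value = mcnemar_exact(7, 3)
```

Other splits of the same marginal counts lead to different p-values, so the example should be read as an illustration of the procedure only.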
IV. DISCUSSION
This study evaluated immersive analytic approaches for
mixed reality. More specifically, several research questions
were raised to address the usability of mixed reality for the
accomplishment of visual analytic tasks. Hereby, the effect of
spatial sounds, the benefit of a good spatial imagination abil-
ity, and the comparison against a traditional desktop approach
were evaluated. In total, 60 participants took part and solved
eight tasks for outlier detection and six tasks in the field of
cluster recognition twice. High performers were classified
TABLE 6. Results of the multilevel models for RQ 5 (Cluster recognition in HoloLens).
by using a mental rotation test. For RQ1, we found that
high performers are quicker in outlier detection as well as
in cluster recognition using the HoloLens compared to low
performers. Note that our tasks were not domain specific.
Also, other studies showed an advantage of high versus low
performers in augmented reality, e.g., in the field of medical
training [33]. In our study, we recognized a significant change
of the measured movement patterns in outlier detection
using the HoloLens (BoundingBox, Pathlength, PathVariance,
PathMean, AngleVariance, AngleMean). The users learned
that they can get a better overview and generate a better
3D perspective (RQ2) when moving around. Movement
patterns, such as the smoothness of a movement, have
also been studied in augmented reality studies [34]. For RQ3,
we measured a significant speed advantage in the outlier detection
task through the use of spatial sounds. It is evident that this
effect is more valuable if the user’s interface is too large to
keep it always in the field of view. Again, the effects of spa-
tial sounds are also addressed by other works in augmented
reality settings, e.g., by [14], [35], and [36]. Its outstanding
role in the handling of 3D user interfaces is always stressed by
these works. For the HoloLens, the spatial sounds are limited
to horizontal directions and unsuitable to direct the gaze to the
bottom or the top. Moreover, even though the spatial sounds
helped to solve the tasks quicker, the participants’ opinion of
their usefulness was ambiguous (see RQ4). Some participants
mentioned that they would like to start and stop the sound
in order not to be disturbed when they are in an orientation
phase. Spatial sounds, in turn, could be applied very
precisely and were even used in another use case to help blind
users [37]. Referring to RQ5, the HoloLens results in faster
responses in cluster recognition compared to a 2D desktop
approach. To embed this result in the existing literature, the
comparison of 3D visualizations to 2D representations was
conducted several times ([18]–[20], or [21]), interestingly
with different outcomes.
A. LIMITATIONS
The following limitations of this study [38] need to be dis-
cussed. First, the selection process of the participants limits
generalizability. About 80% of all recruited participants were
students or research associates. Note that the recruitment of
students as a substitute in empirical studies is a general sub-
ject of discussions, where [30] argues in favor of recruiting
them. Furthermore, the classification of recruited participants
into low and high performers by using a mental rotation test
administered via a desktop computer may be oversimplified,
as it excludes prior knowledge in the field of augmented
reality. We collected prior knowledge concerning data
analytics and augmented reality in a self-assessment
questionnaire that asked for the number of days spent
in each field. However, few participants had prior experience
with smart glasses; therefore, this questionnaire could not
serve as a suitable basis for the division into the
two groups. A combination of different aspects of mixed
reality might lead to a more sophisticated expert score in
future work. A possible threat to internal validity is that
we might not have covered enough variables to identify
potential baseline differences between low and high performers.
As immersive analytics include and affect many of the human
senses, it might be difficult to cover and exclude all external
influences. As another shortcoming, 60 participants constitute
a rather small sample. Moreover, the tasks were chosen to be
domain independent and cannot be transferred
to very specific settings and use cases. Despite these
limitations, a strength of the study is that it provides a broad
overview of the feasibility of immersive analytics.
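To make the grouping step discussed above concrete, the following Python sketch divides participants into low and high performers by their mental rotation test score. It assumes a median split, which is an illustrative choice and not necessarily the rule used in this study; the function name split_by_median, the participant identifiers, and the scores are hypothetical.

```python
from statistics import median


def split_by_median(scores):
    """Split participants into (low, high) performers at the median score.

    `scores` maps a participant id to a mental rotation test score.
    Assumption: scores at or below the median count as "low".
    """
    cutoff = median(scores.values())
    low = sorted(pid for pid, s in scores.items() if s <= cutoff)
    high = sorted(pid for pid, s in scores.items() if s > cutoff)
    return low, high


# Hypothetical scores for six participants (not data from the study).
scores = {"p01": 12, "p02": 20, "p03": 7, "p04": 16, "p05": 9, "p06": 18}
low, high = split_by_median(scores)
print("low performers:", low)
print("high performers:", high)
```

A split on a single test score ignores prior experience with augmented reality, which is the oversimplification named in the limitations; a composite expert score would replace the single value per participant with a combination of several measures.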
B. SUMMARY
In summary, the results of this study underline the feasibility
of immersive analytics in general. The main findings show
that (1) high performers, i.e., participants with a high mental
rotation ability, are able to solve tasks more quickly, (2) spatial
sounds make immersive analytics less time consuming, and
(3) immersive analytics are a suitable alternative to desktop
approaches.
To the best of our knowledge, usability issues in the context
of mixed reality have not been studied at this scale previously.
Furthermore, this study may serve as a valuable benchmark for
the evaluation of immersive analytics in general.
APPENDIX A
DEVELOPED QUESTIONNAIRE
The following questions were used to collect subjective feedback:
How stressful did you find wearing the glasses? (Short: Wearing)
How stressful was the outlier task? (Short: Outliers)
How stressful did you find the spatial sounds? (Short: Sound)
How stressful was the task of finding clusters in Mixed Reality? (Short: Cluster(MR))
How stressful was the task of finding clusters in the desktop approach? (Short: Cluster(DT))
How stressful was the usage of the voice commands? (Short: Voice)
Did you feel supported by the spatial sounds? (Short: Support Sound)
APPENDIX B
CONFLICT OF INTEREST STATEMENT
The authors declare that the research was conducted in the
absence of any commercial or financial relationships that
could be construed as a potential conflict of interest.
APPENDIX C
AUTHOR CONTRIBUTIONS
BH conducted the study and wrote the paper; KK recruited
participants; JS provided support on the ethics vote; MZ
supervised the skin conductance measure; TP, WS, and MS
calculated and supervised the statistical part; RP and MR
provided resources. All authors revised the manuscript and
approved its final version.
REFERENCES
[1] M. Aleksy, M. Troost, F. Scheinhardt, and G. T. Zank, “Utilizing hololens to support industrial service processes,” in Proc. IEEE 32nd Int. Conf. Adv. Inf. Netw. Appl. (AINA), May 2018, pp. 143–148.
[2] A. Hurstel and D. Bechmann, “Approach for intuitive and touchless interaction in the operating room,” Proc. J., vol. 2, no. 1, pp. 50–64, 2019.
[3] S. Bakker, E. Van Den Hoven, and A. N. Antle, “Moso tangibles: Evaluating embodied learning,” in Proc. 5th Int. Conf. Tangible, Embedded, Embodied Interact., Jan. 2011, pp. 85–92.
[4] H. Regenbrecht, G. Baratoff, and W. Wilke, “Augmented reality projects in the automotive and aerospace industries,” IEEE Comput. Graph. Appl., vol. 25, no. 6, pp. 48–56, Nov. 2005.
[5] H.-L. Chi, S.-C. Kang, and X. Wang, “Research trends and opportunities of augmented reality applications in architecture, engineering, and construction,” Autom. Construct., vol. 33, pp. 116–122, Aug. 2013.
[6] P. Milgram, H. Takemura, A. Utsumi, and F. Kishino, “Augmented reality: A class of displays on the reality-virtuality continuum,” Proc. SPIE, vol. 2351, pp. 282–293, Dec. 1995.
[7] G. Evans, J. Miller, M. I. Pena, A. MacAllister, and E. Winer, “Evaluating the microsoft hololens through an augmented reality assembly application,” Proc. SPIE, vol. 10197, May 2017, Art. no. 101970V.
[8] S. Izadi, D. Kim, O. Hilliges, D. Molyneaux, R. Newcombe, P. Kohli, J. Shotton, S. Hodges, D. Freeman, and A. Davison, “KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera,” in Proc. 24th Annu. ACM Symp. Interface Softw. Technol., Oct. 2011, pp. 559–568.
[9] B. Hoppenstedt, C. Schneider, R. Pryss, W. Schlee, T. Probst, P. Neff, J. Simoes, A. Treß, and M. Reichert, “HOLOVIEW: Exploring patient data in mixed reality,” in Proc. TRI/TINNET Conf., Jun. 2018.
[10] B. Hoppenstedt, M. Reichert, C. Schneider, K. Kammerer, W. Schlee, T. Probst, B. Langguth, and R. Pryss, “Exploring dimensionality reduction effects in mixed reality for analyzing tinnitus patient data,” in Proc. 4th Int. Conf. Virtual Augmented Reality Educ. (VARE), G. Bruzzone, G. Mendívil, and L. Gutierrez, Eds., 2018, pp. 163–170.
[11] T. Chandler, M. Cordeil, T. Czauderna, T. Dwyer, J. Glowacki, C. Goncu, M. Klapperstueck, K. Klein, K. Marriott, F. Schreiber, and E. Wilson, “Immersive analytics,” in Proc. Big Data Vis. Anal. (BDVA), Sep. 2015, pp. 1–8.
[12] A. Tang, C. Owen, F. Biocca, and W. Mou, “Comparative effectiveness of augmented reality in object assembly,” in Proc. SIGCHI Conf. Hum. Factors Comput. Syst., Apr. 2003, pp. 73–80.
[13] A. G. Taylor, “HoloLens hardware,” in Develop Microsoft HoloLens Apps Now. Berkeley, CA, USA: Apress, 2016, pp. 153–159.
[14] J. Sodnik, S. Tomazic, R. Grasset, A. Duenser, and M. Billinghurst, “Spatial sound localization in an augmented reality environment,” in Proc. 18th Aust. Conf. Comput.-Hum. Interact., Design, Activities, Artefacts Environ., Nov. 2006, pp. 111–118.
[15] A. Härmä, J. Jakka, M. Tikander, M. Karjalainen, T. Lokki, J. Hiipakka, and G. Lorho, “Augmented reality audio for mobile and wearable appliances,” J. Audio Eng. Soc., vol. 52, no. 6, pp. 618–639, Jun. 2004.
[16] U. Neumann and A. Majoros, “Cognitive, performance, and systems issues for augmented reality applications in manufacturing and maintenance,” in Proc. Virtual Reality Annu. Int. Symp., Mar. 1998, pp. 4–11.
[17] R. S. Kennedy, N. E. Lane, K. S. Berbaum, and M. G. Lilienthal, “Simulator sickness questionnaire: An enhanced method for quantifying simulator sickness,” Int. J. Aviation Psychol., vol. 3, no. 3, pp. 203–220, 1993.
[18] A. Gracia, S. González, V. Robles, E. Menasalvas, and T. Von Landesberger, “New insights into the suitability of the third dimension for visualizing multivariate/multidimensional data: A study based on loss of quality quantification,” Inf. Vis., vol. 15, no. 1, pp. 3–30, Jan. 2016.
[19] D. Raja, D. Bowman, J. Lucas, and C. North, “Exploring the benefits of immersion in abstract information visualization,” in Proc. Immersive Projection Technol. Workshop, May 2004, pp. 61–69.
[20] L. Arms, D. Cook, and C. Cruz-Neira, “The benefits of statistical visualization in an immersive environment,” in Proc. IEEE Virtual Reality, Mar. 1999, pp. 88–95.
[21] M. Sedlmair, T. Munzner, and M. Tory, “Empirical guidance on scatterplot and dimension reduction technique choices,” IEEE Trans. Vis. Comput. Graph., vol. 19, no. 12, pp. 2634–2643, Dec. 2013.
[22] S. G. Vandenberg and A. R. Kuse, “Mental rotations, a group test of three-dimensional spatial visualization,” Perceptual Motor Skills, vol. 47, no. 2, pp. 599–604, Oct. 1978.
[23] M. S. Masters and B. Sanders, “Is the gender difference in mental rotation disappearing?” Behav. Genet., vol. 23, no. 4, pp. 337–341, Jul. 1993.
[24] V. Pulkki, “Spatial sound reproduction with directional audio coding,” J. Audio Eng. Soc., vol. 55, no. 6, pp. 503–516, Jun. 2007.
[25] R. J. Hathaway and J. C. Bezdek, “Visual cluster validity for prototype generator clustering models,” Pattern Recognit. Lett., vol. 24, nos. 9–10, pp. 1563–1569, Jun. 2003.
[26] M. B. Selleck, D. Burke, C. Johnston, and V. Nambiar, “Augmented reality integration of fused lidar and spatial mapping,” Proc. SPIE, vol. 10666, May 2018, Art. no. 106660S.
[27] J. Choi, S. Park, and J. Ko, “Analyzing head-mounted AR device energy consumption on a frame rate perspective,” in Proc. 14th Annu. IEEE Int. Conf. Sens., Commun., Netw. (SECON), Jun. 2017, pp. 1–2.
[28] C. D. Spielberger, R. L. Gorsuch, and R. E. Lushene, STAI Manual for the State-Trait Anxiety Inventory (“self-evaluation questionnaire”), vol. 22. Palo Alto, CA, USA: Consulting Psychologist, 1970, pp. 1–24.
[29] S. Härtel, J.-P. Gnam, S. Löffler, and K. Bös, “Estimation of energy expenditure using accelerometers and activity-based energy models—Validation of a new device,” Eur. Rev. Aging Phys. Activity, vol. 8, no. 2, pp. 109–114, Oct. 2011.
[30] M. Höst, B. Regnell, and C. Wohlin, “Using students as subjects—A comparative study of students and professionals in lead-time impact assessment,” Empirical Softw. Eng., vol. 5, no. 3, pp. 201–214, Nov. 2000.
[31] L. Gautier. RPY2: A Simple and Efficient Access to R from Python. Accessed: Feb. 15, 2019. [Online]. Available: https://sourceforge.net/projects/rpy/
[32] I. G. G. Kreft and J. de Leeuw, Introducing Multilevel Modeling. Newbury Park, CA, USA: Sage, 1998.
[33] E. Z. Barsom, M. Graafland, and M. P. Schijven, “Systematic review on the effectiveness of augmented reality applications in medical training,” Surgical Endoscopy, vol. 30, no. 10, pp. 4174–4183, Oct. 2016.
[34] E. Nugent, N. Shirilla, A. Hafeez, D. S. O’Riordain, O. Traynor, A. M. Harrison, and P. Neary, “Development and evaluation of a simulator-based laparoscopic training program for surgical novices,” Surgical Endoscopy, vol. 27, no. 1, pp. 214–221, Jan. 2013.
[35] Z. Zhou, A. D. Cheok, Y. Qiu, and X. Yang, “The role of 3-D sound in human reaction and performance in augmented reality environments,” IEEE Trans. Syst., Man, Cybern. A, Syst. Humans, vol. 37, no. 2, pp. 262–272, Mar. 2007.
[36] M. Hatala, L. Kalantari, R. Wakkary, and K. Newby, “Ontology and rule based retrieval of sound objects in augmented audio reality system for museum visitors,” in Proc. ACM Symp. Appl. Comput., Mar. 2004, pp. 1045–1050.
[37] J. R. Blum, M. Bouchard, and J. R. Cooperstock, “What’s around me? Spatialized audio augmented reality for blind users with a smartphone,” in Proc. Int. Conf. Mobile Ubiquitous Syst., Comput., Netw., Services. Berlin, Germany: Springer, 2011, pp. 49–62.
[38] R. Hyman, “Quasi-experimentation: Design and analysis issues for field settings (book),” J. Personality Assessment, vol. 46, no. 1, pp. 96–97, 1982.
BURKHARD HOPPENSTEDT studied computer
science at the University of Ulm. He has
been with the Institute of Databases and Information
Systems as an External Research Associate,
in cooperation with the ATR Software GmbH,
since 2016. His research interests include big data
analytics (e.g., outlier detection and concept drift)
in distributed systems and mixed reality, with a
particular focus on predictive maintenance in an
industrial context.
THOMAS PROBST studied psychology at
Regensburg University. He received the Diploma
degree, in 2009, and the Ph.D. degree in psychol-
ogy from the Humboldt University of Berlin, in
2015. During the Ph.D. thesis, he was involved in
psychotherapy monitoring, patient-therapist feed-
back, and decision support tools. He started his
psychotherapy training and received the certifi-
cation as a cognitive-behavior therapist, in 2013.
From 2013 to 2015, he was a Research Assistant
with Regensburg University and the Deputy Head of the Psychotherapy
Outpatient Center. From 2015 to 2016, he was an Interim Professor of
clinical psychology and psychotherapy and clinical psychodiagnostics at the
University Witten/Herdecke. In 2017, he was an Interim Professor of clinical
psychology and psychotherapy with the Georg-August-University Göttingen
and a Research Associate with the University of Ulm. In 2017, he was
appointed as a Professor of psychotherapy sciences with Danube University
Krems, Austria. He is experienced in teaching courses on psychotherapy and
psychodiagnostics, psychosomatics, digital health, and quantitative research
designs.
MANFRED REICHERT received the Ph.D. degree
in computer science and the Diploma degree in
mathematics. Since 2008, he has been a Full Pro-
fessor with the University of Ulm, where he is cur-
rently the Director of the Institute of Databases and
Information Systems. He was an Associate Pro-
fessor with the University of Twente, The Nether-
lands. He was also a member of the Management
Board of the Centre for Telematics and Infor-
mation Technology, which is one of the largest
academic ICT research institutes in Europe. His research interests include
business process management (e.g., adaptive and flexible processes, pro-
cess lifecycle management, data-driven, and object-centric processes) and
service-oriented computing (e.g., service interoperability, mobile services,
and service evolution). He has been a PC Co-Chair of the BPM’08,
CoopIS’11, EMISA’13 and EDOC’13 conferences, and a General Chair of
the BPM’09 and EDOC’14 conferences.
WINFRIED SCHLEE was born in 1978. He
received the Ph.D. degree in clinical neuropsy-
chology from the University of Konstanz, in 2009,
where he introduced the concept of the Global
Model of Tinnitus Perception to explain the neu-
ronal mechanisms underlying the conscious per-
ception of the tinnitus percept, and the Habilitation
degree, in 2018. Since 2009, he has studied var-
ious factors influencing the conscious perception
of tinnitus, among them the influence of age, stress
and emotional arousal, the interference with auditory, electric and magnetic
stimulations, and intrinsic neuronal moment-to-moment fluctuations of the
resting alpha activity in temporal brain regions. In 2013, he joined the Tinni-
tus Research Initiative (TRI), where his current work focuses on discovering
new methods for the treatment and measurement of chronic tinnitus. He is
currently a German Neuropsychologist with the University of Regensburg.
He studied psychology, statistics, and philosophy with the University of
Konstanz and the University of Alabama at Birmingham. He is also a Chair
of the European COST project ‘TINNET - Better understanding the tinnitus
heterogeneity to improve and develop new treatments’ and the European
School for Interdisciplinary Tinnitus Research (ESIT).
KLAUS KAMMERER received the M.Sc. degree
in media computer science from the University
of Ulm, Germany, in 2014, where he is cur-
rently pursuing the Ph.D. degree with the Insti-
tute of Databases and Information Systems in
cooperation with Uhlmann Pac-Systeme GmbH
& Co., KG. His research interests include sensor
data management, semantic web technologies, and
context-aware business processes.
MYRA SPILIOPOULOU is currently a Profes-
sor of business information systems with the Fac-
ulty of Computer Science, Otto von Guericke
University Magdeburg, Magdeburg, Germany. Her
publications are on mining complex streams, min-
ing evolving objects, adapting models to drift and
building models that capture drift. She focusses on
two application areas: 1) business, including opin-
ion stream mining and adaptive recommenders,
and 2) medical research, including epidemiolog-
ical mining and learning from clinical studies. In the application domain
of medical research, she works on modeling and predicting evolution of
study participants with and without the target outcome. Her research on topic
monitoring, social network monitoring, and analysis of complex dynamic
data has been published in renowned international conferences and journals.
Her current research interest includes mining dynamic complex data. She is
regularly presenting tutorials on different aspects of complex data mining,
and recently on medical mining. She is a member of the Presidium of
the European Association of Data Science (EuADS). In Germany, she is a
member of the Jury for the Best Ph.D. Award of the German Informatics
Society. In 2018, she was a PC Co-Chair of the Applied Data Science Track of
the ACM SIGKDD International Conference on Knowledge Discovery from
Data (KDD’2018), London. In 2019, she serves as a PC Co-Chair for
the International Symposium on Computer-Based Medical Systems (CBMS
2019) and as a Guest Editor of the ECML PKDD 2019 Journal
Track; this track is hosted by the Data Mining and Knowledge Discovery
(DAMI) and the Machine Learning Journals of Springer. Since 2016, she
has been serving as an Action Editor for the Data Mining and Knowledge
Discovery Journal of Springer (DAMI). She is involved as a Senior Reviewer
in major conferences on data mining and knowledge discovery.
JOHANNES SCHOBEL studied computer science
at the University of Ulm. He received the
Ph.D. degree from the Institute of Databases and
Information Systems, in 2018. He has been with
the Institute of Databases and Information Sys-
tems, University of Ulm, as a Research Associate,
since 2012. His current research interest includes
mobile data collection. In particular, he focuses
on end-user programming approaches to empower
domain experts to create their own mobile data col-
lection applications. In this context, he applies business process management
techniques and end-user programming approaches to unravel new insights.
MICHAEL WINTER studied computer science
at the University of Ulm. Since 2015, he has
been a Research Associate with the Institute of
Databases and Information Systems. He places a
special emphasis on the comprehension of visual
process models. In this context, he applies mea-
surement methods and theories from cognitive
neuroscience such as eye tracking and electroder-
mal activity, and psychology such as cognitive
load theory to unravel new insights in this field.
Therefore, he has developed a conceptual framework to foster and to assist
novices and experts alike in the comprehension of process models. His
research interests include business process management, statistical sciences,
and human cognition.
ANNA FELNHOFER studied psychology at the
University of Vienna. She received the Ph.D.
degree in psychology in 2015; in the context of
her thesis, she focused on the experience of pres-
ence, social stress, and social interactions in virtual
reality. In 2015, she also received the master’s
training and was licensed as a Clinical Psychol-
ogist and a Health Psychologist. Since 2015, she
has been a Research Associate (postdoctoral) with
the Division Pediatric Psychosomatic Medicine,
Department of Pediatrics and Adolescent Medicine, Medical University of
Vienna. Her research interests include virtual reality treatment and digital
media, and applied ethics in the context of pediatric psychosomatic medicine.
OSWALD D. KOTHGASSNER received the Ph.D.
and Diploma degrees in psychological science.
He has a certification as a Clinical Psycholo-
gist and a Health Psychologist. He was with the
Department of Clinical Psychology, University of
Vienna, and was also a Guest Researcher with TU
Eindhoven. He is currently the Head of the Virtual
Reality Intervention & Stress Research Labora-
tory, Department of Child and Adolescent Psychi-
atry, Medical University of Vienna. He is experi-
enced with teaching courses on clinical psychology treatment, psycholog-
ical assessment, and psychophysiological measures. His research interests
include social stress and stress-related disorders, psychology in digital age,
and virtual reality treatments.
RÜDIGER PRYSS studied at the Universities
of Passau, Karlsruhe, and Ulm. He received the
Diploma and Ph.D. degrees in computer science
from the University of Ulm, in 2015. In his Ph.D.
thesis, he focused on fundamental issues related to
mobile process and task support. He was a Con-
sultant and a Developer in a software company.
Since 2008, he has been a Research Associate
with the University of Ulm. He is experienced
with teaching courses on database management,
programming, service-oriented computing, business process management,
document management, and mobile application engineering. He was a Local
Organization Chair of the BPM’09 and EDOC’14 conferences.