Gaze-Based Menu Navigation in Virtual Reality: A Comparative Study of Layouts and Interaction Techniques

Author: Kopácsi, László; Klimenko, Albert; Mohamed Selim, Abdulrahman; Barz, Michael; Sonntag, Daniel
Publisher: Zenodo
DOI: 10.1007/978-3-032-04999-5_31
Source: https://zenodo.org/records/17258525/files/978-3-032-04999-5_31.pdf
László Kopácsi1(B), Albert Klimenko2, Abdulrahman Mohamed Selim1,
Michael Barz1,3, and Daniel Sonntag1,3
1 Interactive Machine Learning, German Research Center for Artificial Intelligence
(DFKI), Saarbrücken, Germany
{laszlo.kopacsi,abdulrahman.mohamed,michael.barz,daniel.sonntag}@dfki.de
2 Saarland University, Saarbrücken, Germany
[email protected]
3 Applied Artificial Intelligence, University of Oldenburg, Oldenburg, Germany
Abstract. Integrating eye-tracking technologies in Extended Reality (XR) headsets has enabled intuitive, hands-free system interaction, such as gaze-based menu navigation. However, there is a lack of comprehensive comparisons and consensus in the literature on the optimal use of gaze-based menu navigation. This paper presents a comparative analysis of gaze-based menu navigation in virtual environments, focusing on two common menu layouts: pie and list menus, with three interaction methods: gaze-based dwell, controller-based, and a multimodal approach combining gaze and controller inputs. We conducted a 19-participant within-subject study, measuring task completion time, error rate, usability, and user preference for each condition. The results indicate that while the pie layout was statistically faster and less erroneous than the list layout, novice users tend to favour list layouts. Furthermore, we found that users preferred the multimodal interaction method, despite its slower task completion times and higher error rates compared to controller-based navigation. Based on our findings, we offer design guidelines and recommendations for implementing gaze-based menu systems.
Keywords: Extended Reality (XR) · Gaze-based Interaction · Menu Navigation · Eye Tracking
1 Introduction
Eye tracking has long been regarded as an important input modality for natural, hands-free interaction [24, 54]. This notion holds true to this day, especially with the integration of eye tracking into Extended Reality (XR) devices (e.g., HTC Vive XR Elite¹ and Apple Vision Pro²), which use it to facilitate user interaction with digital content [13, 49]. This usage of eye tracking falls under active gaze-based interaction, where a user intentionally uses their gaze to interact with and control a system [14]. Menu manipulation is one application in which users can interact with immersive environments by using their gaze to navigate and select options. However, despite the availability of open standards that facilitate development across different headsets, in the literature, there is no consensus on the effectiveness or optimal use of gaze-based interaction for menu manipulation.
¹ https://www.vive.com/eu/product/vive-xr-elite/ (Accessed 17 Feb 2025).
² https://www.apple.com/apple-vision-pro/ (Accessed 17 Feb 2025).
© The Author(s) 2026
C. Ardito et al. (Eds.): INTERACT 2025, LNCS 16108, pp. 520–543, 2026.
https://doi.org/10.1007/978-3-032-04999-5_31
In general, menus in XR can be categorised based on three main aspects: (1) Interaction Method, (2) Menu Layout, and (3) Menu Placement [20]. For gaze-based interaction methods, the constant activity of the eyes presents a significant challenge in distinguishing between exploratory gaze behaviour and intentional command-activating gaze behaviour, which is known as the Midas Touch problem [24]. Dwell-based gaze input (i.e., intentionally maintaining gaze fixation on a specific target for a predetermined duration known as dwell time to activate or select elements [46]) is often regarded as one of the simplest and most common gaze events [49]. While it is commonly used for selection, e.g., [23], it has also been investigated for other actions, such as scrolling [22]. To overcome the Midas Touch problem, long dwell times (e.g., 500 ms [43] or 300 ms [45]) are often used, but this raises the challenge of finding the best compromise between error rates and interaction times [44]. As a result, researchers have explored more advanced gaze-based events such as smooth pursuits [28] (i.e., the eye movement while following a moving object [19]) and border-crossing [21] (i.e., intentionally moving the gaze across a pre-defined boundary to enter an activation area, thereby triggering an action). Multimodal systems have also been explored, e.g., using gaze for pointing and a controller for selecting [44].
Regarding menu layouts, pie or radial menus (i.e., where items are arranged around the circumference of a circle at an equal radial distance from the centre [7]) are one of the most researched menu layouts for gaze-based interaction [2, 30, 53]. However, comparative studies by Monteiro et al. [42] and Lediaeva and LaViola [34] indicate that users tend to prefer traditional linear or list layouts over pie layouts. Additionally, Lediaeva and LaViola [34] found no significant difference in task completion time between pie and list layouts during single-level menu selection using gaze-based interaction. Lastly, menu placement can be broadly categorised into head-referenced, e.g. [36], body-referenced, e.g. [34], and world-referenced, e.g. [42], configurations. World-referenced placement has been identified as the most suitable option to ensure a proper and fair comparison of different menu layouts and interaction methods [34, 42]. Therefore, we decided to focus on menu layouts and interaction methods, while keeping the menu placement constant, i.e., world-referenced.
To provide a comprehensive overview of how different menu layouts and interaction methods affect the overall user experience in XR, we conducted a within-subject study with 19 participants using six conditions in Augmented Reality (AR). The study evaluated two hierarchical menu layouts: Pie Menu and List Menu, across three interaction methods: a Gaze-only dwell-based method (i.e., a baseline for gaze-based interaction), a Controller-only method (i.e., the default interaction method in XR), and a multimodal Gaze-and-Controller method in which gaze is used for pointing and the controller for selection. Although Border-crossing was initially included as an additional Gaze-only interaction method, it was removed prior to the main user study due to our findings in the pilot study. For each of the six conditions, we measured task completion time, error rate, usability, and user preference. Our contributions are twofold: (i) we conducted this extensive study to provide a comparison between the different menu and interaction combinations, and (ii) we formulated design guidelines based on the quantitative and post-experiment questionnaire results. Our results show that users had a slight preference towards the Gaze-and-Controller multimodal interaction method despite its slower task completion times and higher error rates compared to the Controller-only method. In addition, our findings indicate that Pie layouts are statistically faster and less erroneous than List layouts using Controller-only and Gaze-and-Controller interaction methods, but users without prior XR and eye tracking experience still preferred List menus. To support reproducibility and future research, our source code is available at https://github.com/DFKI-Interactive-Machine-Learning/menu-navigation-in-VR.
2 Background and Related Work
XR is an umbrella term for technologies that alter or generate reality. It encompasses AR, which overlays digital elements onto the real world; Mixed Reality (MR), which enables interaction between real and digital elements; and Virtual Reality (VR), which creates a fully immersive digital environment [51]. In this section, we review the literature on digital menu layouts and interaction methods in XR, with a particular focus on gaze-based related approaches.
2.1 Menu Layout
Various menu layouts have been proposed over the years. However, unique layouts are often associated with specific interaction methods (e.g., [15, 52]), making them difficult to compare against other menu layouts and interaction methods. Therefore, we decided to focus on two classical layouts, i.e., list and pie, because they are not associated with specific interaction methods. List layouts usually display items in a straight line, either horizontally or vertically, and remain the most commonly used menu layout in VR [20]. Pie layouts, on the other hand, arrange items in a circular pattern around a central point, leveraging the natural range of motion of the human arm and eyes, which enables quick access and selection. Huckauf and Urbina [21] introduced the concept of pie layouts for gaze-based interaction and demonstrated their efficiency in a typing task. Since then, pie layouts have shown a dominant presence in gaze-based interaction literature [2, 30, 44, 53]. However, Monteiro et al. [42] and Lediaeva and LaViola [34] suggest that users still prefer list layouts despite the increasing trend towards pie layouts.
We evaluated both layouts in a hierarchical structure, which is widely used in real-world applications. Hierarchical layouts are organised in a nested manner, whereby selecting one option reveals a subsequent menu level containing additional options. This structure mirrors the natural way humans categorise and access information, making it an intuitive method for managing complex interactions [1]. Although both list and pie layouts can be structured hierarchically, such structures present challenges, including the placement of subsequent menus and the potential for overlapping. While various solutions have been presented for hierarchical lists and pie menus, a direct comparison between them, especially for gaze-based interaction, is lacking.
Kim et al. [30] designed and implemented a novel hierarchical pie menu for border-crossing gaze-based interaction with world-referenced placement. Their innovation included visual anchors used as resting points to address the “overshooting” problem observed in border-crossing menus with sub-menus. However, they focused primarily on the design parameters of their pie menu and compared it only with other border-crossing pie menus. Early adaptations of hierarchical list menus, e.g., [40, 56], derived their design directly from 2D desktop environments. Although efficient, these designs can lead to overly complex hierarchical structures, especially in VR [20]. Despite these challenges, hierarchical list menus remain a crucial design element of interactive systems; however, they are mainly designed for controller-based interaction, with a lack of hierarchical list menus tailored to gaze-based interaction. Therefore, in our study, we incorporated design elements from Kim et al. [30] to develop hierarchical list menus specifically suited to gaze-based interaction.
2.2 Menu Manipulation
The increased interest in XR has led to an increase in research focused on natural interaction methods that enable users to interact with computer systems intuitively, mirroring real-world interactions [9], which enhances user immersion in virtual environments. As a result, various methods for controlling virtual menus have been developed, including handheld controllers [42, 48], hand gestures [20], gaze-based interaction [30, 45, 53], speech-based interaction [43], and multimodal systems [44, 48].
Interaction via controllers is widely considered the default in VR [20]. However, it is important to explore viable alternatives to controllers to minimise the hardware required for system operation. Hands-free solutions can prove advantageous in scenarios where users’ hands might be occupied, and they are particularly beneficial in public setups due to their discreet nature. Regardless of the modality, each interaction method needs to incorporate a way of pointing, such as pointing with a virtual ray at menu items, and a way of selection, such as using a button to confirm the choice. Speech-based and gesture-based methods do not require explicit pointing mechanisms, as they provide direct shortcuts to the corresponding menu options; for example, by uttering a keyword or displaying a hand signal, the linked option is selected, thereby enabling fast interaction. However, these approaches require prior system knowledge and training to operate correctly, which is why we decided not to use them. On the other hand, individuals naturally direct their gaze towards objects they want to interact with, making gaze a seamless and intuitive interaction method. However, gaze-based interaction is susceptible to the Midas Touch problem [24], which can significantly undermine the usability and efficiency of gaze-based interaction by triggering unintended user inputs. Therefore, different approaches to gaze-based interaction have been developed to mitigate this issue. These approaches can be categorised into Gaze-only interaction [15, 28, 30] and multimodal interaction [34, 37, 44, 47, 53], the latter combining gaze with an additional interaction method.
Gaze-Only Interaction seeks to remove the need for physical input devices by relying solely on the eye trackers integrated into XR devices. Dwell-based interaction triggers a selection when the user’s gaze remains on a button for a set amount of time, and is a well-researched interaction method [21, 24, 39, 44, 48, 54], which makes it a suitable baseline for comparison. Long dwell times have been shown to reduce the rate of false activations, but they also introduce a delay between the user’s action and the system’s response. A dwell time above one second is often considered long, while short dwell times can be as brief as 280 ms [39]. For hierarchical menu interaction, shorter dwell times are preferable to reduce the overall interaction time. Majaranta et al. [39] introduced adjustable dwell times for typing tasks, allowing users to manually modify their dwell time via a button press; this resulted in better interaction times without an increase in error rate. However, we used a fixed dwell time, similar to Monteiro et al. [43] and Mutasim et al. [45], to simplify the system learning process.
Border-crossing, another Gaze-only method, triggers selection when the pointer enters an item’s selection zone, essentially making it a dwell interaction with a 0 ms dwell time; it has been explored as an interaction method with pie menus [2, 21, 30]. Although fast, it requires precise control, and it suffers from the “overshooting” problem, where rapid eye movements accidentally trigger multiple menu levels simultaneously. Ahn et al. [2] addressed this by dynamically adjusting the position of subsequent menu levels to the next resting eye position. Kim et al. [30] enhanced this approach by using visual anchors as resting points, eliminating position calculations and reducing interaction times and error rates compared to Ahn et al. [2]. Mutasim et al. [44] reported similar findings by showing that border-crossing is a fast and robust gaze-based interaction method. Smooth pursuits have also been used as an advanced Gaze-only menu interaction method; however, they require special menu layouts with moving options to create trackable trajectories in VR scenes [28].
Multimodal Gaze-Based Interaction uses eye tracking for pointing or pre-selection, while an additional modality is used to confirm the pre-selection. In gaze-head interaction, one cursor is linked to the eyes and another to the head; both cursors must focus on the same element for a selection to be made. This method builds on the natural coordination between the head and eyes [53] and presents an intuitive and easy-to-learn hands-free approach. However, Lediaeva and LaViola [34] found that head-based selection required more time and was less popular than other methods. Other approaches have combined gaze with hand-based interaction, e.g., [37, 38, 47]. These methods, however, are generally inferior to controllers due to low hand-tracking accuracy [50]. An alternative multimodal method is the gaze-and-button approach, in which the gaze is used for pre-selection, and a physical button press confirms the selection. This can be implemented using handheld controllers [34, 48], which are commonly included with most VR devices, or a keyboard [44]. According to Mutasim et al. [44], the gaze-and-button method is one of the faster interaction techniques; however, it tends to suffer from higher error rates due to hand-eye coordination challenges, where users may press the button before reaching the target or leave the target prematurely. Providing visual feedback, such as highlighting the button with a coloured border when the gaze cursor enters the target, can help mitigate these issues. Despite these performance issues, Pfeuffer et al. [48] reported that users were positive about it, making it the second most preferred method in their study comparing five different methods. We opted for a gaze-and-button approach using a controller (i.e., gaze-and-controller) because it does not require users to learn unfamiliar gestures and offers a balance between Gaze-only and Controller-only methods while avoiding issues related to low hand-tracking accuracy.
Table 1 summarises the key aspects of the most relevant publications and highlights the research gap our study aims to address. Previous work has mainly focused on either pie menus with Gaze-only methods (e.g. Kim et al. [30]’s border-crossing approach) or list layouts with controller-based interaction [42]. Lediaeva and LaViola [34] combined both layouts but omitted hierarchical structures, while Mutasim et al. [44] concentrated solely on target selection. Although gaze-based interaction for hierarchical menu navigation has been explored [32], it was not extensive, and a comprehensive comparison remains lacking. Our study addresses this gap by systematically comparing hierarchical pie and list menu layouts across Gaze-only, Controller-only, and multimodal interaction methods.
Table 1. Comparison of menu design layouts and interaction methods between the most relevant publications and our study setup.
Publication                 Layout       Hierarchical  Interaction Method                      Visual Anchors
                            Pie   List                 Gaze-only  Controller-only  Multimodal
Lediaeva and LaViola [34]   ✓     ✓      ✗             ✗          ✓                ✓           ✗
Monteiro et al. [42]        ✓     ✓      ✓             ✗          ✓                ✗           ✗
Mutasim et al. [44]         ✓     ✗      ✓             ✓          ✗                ✓           ✗
Kim et al. [30]             ✓     ✗      ✓             ✓          ✗                ✗           ✓
Our Study                   ✓     ✓      ✓             ✓          ✓                ✓           ✓
3 Design and Implementation
For our implementation, we used the HTC Vive XR Elite headset³, along with the supplied motion controllers as input devices. The headset was equipped with the Vive Facial Tracker⁴ for eye tracking capabilities, and it features adjustable diopters, allowing user-specific lens settings for individuals with corrected vision.
³ https://www.vive.com/eu/product/vive-xr-elite/ (Accessed 17 Feb 2025).
3.1 Menu Layout Design
We implemented two distinct menu layouts in a hierarchical structure. The pie menu, shown in Fig. 1a, is based on the lattice menu by Kim et al. [30], featuring a circular arrangement that allows users to select menu options by directing their gaze at visual anchors positioned equidistantly around a central point. The list menu (shown in Fig. 1b) is adapted from Monteiro et al. [42], incorporating visual anchors from Kim et al. [30] within menu options with subsequent levels extending to the right.
(a) Pie menu (b) List menu
Fig. 1. Design of our (a) pie and (b) list menus, with angles represented in visual degrees. This shows that we kept the dimensions consistent across both layouts.
Following the suggestions of Kim et al. [30], both menu layouts were designed with a horizontal visual angle of 8° for menu items and an additional 4° for the item selection zone in the case of Gaze-only interaction. The separation of the menu item and the item selection zone mitigates accidental menu selections during dwell-based Gaze-only interaction, addressing the Midas Touch problem. Additionally, each item selection zone includes a visual anchor with a radius of 1.5°, serving as a resting point to address the potential “overshooting” problem when navigating hierarchical menus using Gaze-only interaction. Figure 1 provides an overview of the menu layouts, highlighting the item selection zones and visual anchors. Both menus had a total of 12 options distributed across three levels. Figure 2 illustrates how the menus unfold. In both layouts, the subsequent levels unfold, facing the participant to enhance usability and facilitate directional movements [8, 20]. List menus, as shown in Fig. 2b, extend to the right, adjacent to the centre of the visual anchor, while the subsequent levels of pie menus, as shown in Fig. 2a, extend in the direction of the selection, positioning themselves at the centre of the visual anchor for easy and intuitive navigation. To facilitate seamless menu navigation and accommodate various interaction types, we incorporated visual feedback into the design. When a user hovers their gaze or controller over a menu option, the option is highlighted, indicating readiness for selection. Additionally, we incorporated a progress bar around the visual anchors, matching the size of the item selection zone for visual feedback during Gaze-only interaction.
⁴ https://www.vive.com/eu/accessory/vive-full-face-tracker/ (Accessed 17 Feb 2025).
(a) Pie menu unfolding (b) List menu unfolding
Fig. 2. The unfolding process of the (a) pie and (b) list menus, showing the gradual expansion of the menu as the items are revealed in sequential steps.
3.2 Interaction Method Design
We investigated three types of interaction methods: (i) Gaze-only dwell-based interaction, (ii) Controller-only interaction, and (iii) a multimodal Gaze-and-Controller interaction, which uses gaze for pointing and a button press from the controller for selection. We initially used border-crossing as an additional Gaze-only interaction method to avoid the Midas Touch problem, but it was only evaluated during the pilot study and not in the main user study.
The Gaze-only dwell-based interaction requires a user to maintain their gaze within the item selection zone for a pre-defined duration before the menu option is selected. We adopted a dwell time of 500 ms, as suggested by Monteiro et al. [43]. The progress bar provides visual feedback to the user during the interaction. This method is well-researched and has a low error rate, making it reliable, though it is inherently slow due to the minimum time required for each selection. Furthermore, for Gaze-only interaction, it is crucial that the menu levels expand, as shown in Fig. 2, and not overlay on top of each other; otherwise, this could lead to multiple unintentional inputs.
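The dwell logic described above can be illustrated with a small state machine. This is only a minimal sketch under stated assumptions, not the study's Unity implementation: the class name `DwellSelector`, its `update` method, and the per-frame millisecond timing are hypothetical; only the 500 ms dwell time and the progress-bar feedback come from the text.

```python
from dataclasses import dataclass
from typing import Optional

DWELL_TIME_MS = 500  # dwell time reported in the text


@dataclass
class DwellSelector:
    """Hypothetical dwell timer: selects an item once gaze has rested on it long enough."""
    dwell_time_ms: int = DWELL_TIME_MS
    current_item: Optional[str] = None
    elapsed_ms: int = 0

    def update(self, gazed_item: Optional[str], dt_ms: int) -> Optional[str]:
        """Advance the timer by dt_ms; return the selected item or None.

        gazed_item is the item whose selection zone the gaze is in (None if none).
        """
        if gazed_item != self.current_item:
            # Gaze moved to another zone (or away): restart the timer.
            self.current_item = gazed_item
            self.elapsed_ms = 0
            return None
        if gazed_item is None:
            return None
        self.elapsed_ms += dt_ms
        if self.elapsed_ms >= self.dwell_time_ms:
            # Dwell completed: report the selection and restart the timer.
            self.elapsed_ms = 0
            return gazed_item
        return None

    @property
    def progress(self) -> float:
        """Fraction of the dwell completed, e.g. to drive a progress bar."""
        if self.current_item is None:
            return 0.0
        return min(self.elapsed_ms / self.dwell_time_ms, 1.0)
```

In this sketch, any gaze shift to a different selection zone resets the timer, which is one simple way to reduce unintended selections; a real system would likely also need tolerance for tracker noise and brief gaze dropouts.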
The Controller-only interaction is often seen as the most commonly used interaction method in XR. Users select menu options by pointing their controller within the region of a menu option and pressing a button. The visual anchors are removed for Controller-only interaction because the menu options must be selected directly. Similar to Gaze-only interaction, hovered menu items are highlighted to provide visual feedback. Additionally, haptic feedback is provided when the pointer moves over a menu option.
The multimodal Gaze-and-Controller interaction uses gaze as a pointing mechanism and a button press to confirm the selection, addressing the Midas Touch problem and eliminating the need for dedicated item selection zones. Similar to Controller-only interaction, visual anchors are not present because Gaze-and-Controller requires the direct selection of menu items. Additionally, to prevent users from prematurely pressing the button before their gaze reaches the target, similar to what Mutasim et al. [44] reported, visual and haptic feedback are provided when the user looks at a menu option.
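The gaze-for-pointing, button-for-confirmation split can be sketched as follows. This is an illustrative assumption rather than the authors' code: the class `GazeAndButtonSelector` and its `update` signature are hypothetical, and the sketch simply confirms whichever item is under the gaze at the moment the button goes from released to pressed.

```python
from typing import Optional


class GazeAndButtonSelector:
    """Hypothetical gaze-and-button selector: gaze points, a button press confirms."""

    def __init__(self) -> None:
        self._was_pressed = False

    def update(self, gazed_item: Optional[str], button_pressed: bool) -> Optional[str]:
        """Return the confirmed item on a new button press, otherwise None."""
        # Only react to the transition from released to pressed, so holding
        # the button while the gaze wanders does not select further items.
        rising_edge = button_pressed and not self._was_pressed
        self._was_pressed = button_pressed
        if rising_edge and gazed_item is not None:
            return gazed_item
        return None
```

A press with no item under the gaze returns nothing here, which corresponds to the premature-press error case mentioned in the text; how such presses are logged or signalled to the user is left open.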
3.3 Virtual Environment Design
Our VR scene was developed in Unity 2022.3.13f1⁵. The interaction methods were implemented using the XR Interaction Toolkit 2.5.2⁶. To run the application, we used the SteamVR OpenXR runtime⁷, and enabled MR passthrough via Vive Business Streaming⁸ to mitigate motion sickness and allow users to see the environment and the controllers. The study setup was powered by a high-performance workstation with an Nvidia RTX 4090, an Intel i9-13900K processor, and 64 GB RAM, which allowed the experimenter to monitor the AR view of the participants and provide guidance when necessary.
The virtual environment setup, as shown in Fig. 3, included a virtual self-service kiosk displaying instructions, which served as the primary interaction point. The kiosk was positioned 2.2 m from the user, while the interactive menus were placed 1.8 m away. The menus, shown in Fig. 4, were scaled to maintain a uniform size, measured in visual degrees using the following equation: $\text{visual degree} = 2 \cdot \arctan\left(\frac{\text{size}}{2 \cdot \text{distance}}\right)$, and oriented to face the user, ensuring a consistent experience across all menu levels, as explained in Sect. 3.1. To create a controlled study environment, we disabled locomotion within the virtual space, set a fixed height for the virtual avatar, and conducted the study with participants in a seated position. Additionally, with passthrough enabled, we turned off the virtual rendering of the controllers to prevent any potential confusion and controlled the room lighting for visual consistency. For both Controller-only and Gaze-and-Controller interaction methods, we mapped the primary interaction button from the trigger buttons to the primary “A” or “X” buttons on the right
⁵ https://unity.com/releases/editor/whats-new/2022.3.13 (Accessed 17 Feb 2025).
⁶ https://docs.unity3d.com/Packages/com.unity.xr.interaction.toolkit@2.5/manual/installation.html (Accessed 17 Feb 2025).
⁷ https://store.steampowered.com/steamvr (Accessed 17 Feb 2025).
⁸ https://business.vive.com/eu/solutions/streaming/ (Accessed 17 Feb 2025).
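The visual-degree relation used in this section to size the menus can be worked through numerically. The sketch below is an illustration only: the function names are hypothetical, and it simply applies the stated equation and its inverse to the 8° item width from Sect. 3.1 and the 1.8 m menu distance.

```python
import math


def visual_angle_deg(size_m: float, distance_m: float) -> float:
    """Visual angle in degrees subtended by an object of size_m at distance_m."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))


def size_for_angle_m(angle_deg: float, distance_m: float) -> float:
    """Inverse relation: physical size needed to subtend angle_deg at distance_m."""
    return 2 * distance_m * math.tan(math.radians(angle_deg) / 2)


MENU_DISTANCE_M = 1.8  # menu distance reported in the text
ITEM_ANGLE_DEG = 8.0   # horizontal visual angle of a menu item (Sect. 3.1)

# Width a menu item needs at 1.8 m to span 8 degrees (roughly a quarter of a metre).
item_width_m = size_for_angle_m(ITEM_ANGLE_DEG, MENU_DISTANCE_M)
```

Sizing elements this way keeps their apparent size constant if the viewing distance changes, which is the property the text relies on for consistent menu dimensions.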
score (μ=80.39, σ=14.56), and Pie & Gaze-only had the lowest average score (μ=63.82, σ=20.60). We can see in Fig. 6c that there are slight differences in the data distributions; however, the differences were not statistically significant (ANOVA: $F(2,36) = .308$, $p = .737$, $\eta_p^2 = .006$).
Fig. 7. Participant preferences.
Despite the lack of statistical significance in the usability of the six conditions, noticeable participant preferences emerged in the post-experiment questionnaire (see Figs. 7 and 6d). The responses indicated a clear dislike of both dwell-based Gaze-only menus, with none of the participants preferring to use either of them frequently. All users found them difficult to use, and most (N = 17) considered them inconvenient. However, despite the higher error rates, most participants preferred the Gaze-and-Controller-based combinations (i.e., Pie & Gaze-and-Controller (N = 7) and List & Gaze-and-Controller (N = 5)) for frequent use; this was followed by Controller-only-based combinations (i.e., List & Controller-only (N = 5) and Pie & Controller-only (N = 2)). Regarding ease of use, the preferences were similar: most participants favoured the Pie & Gaze-and-Controller (N = 6), followed by List & Controller-only (N = 5), List & Gaze-and-Controller (N = 4), and Pie & Controller-only (N = 4). The post-experiment questionnaire findings can be summarised by the participants’ responses to the last point phrased as follows “Please sort the menus from top (most favourite) to bottom (least favourite)”. To analyse user preference among the six conditions, we used a weighted rank-order scoring approach. Participants ranked the conditions from most to least preferred, with ranks assigned scores from 6 (most preferred) to 1 (least preferred); we then summed the scores for each condition and normalised these totals by dividing by the maximum possible score, yielding a Normalised Preference Score (NPS) for each condition. This scoring system provides an interpretable measure of relative preference intensity, with higher scores indicating stronger preference. It is evident from the final order shown in Fig. 6d that Pie & Gaze-and-Controller was the most preferred option, while both Gaze-only combinations ranked the lowest.
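The weighted rank-order scoring described above can be expressed compactly. The following is a minimal sketch, assuming each participant's response is an ordered list of all conditions from most to least preferred; the function name and the example rankings are hypothetical and are not data from the study.

```python
from typing import Dict, List, Sequence


def normalised_preference_scores(
    rankings: Sequence[Sequence[str]], conditions: List[str]
) -> Dict[str, float]:
    """Compute a Normalised Preference Score (NPS) per condition.

    Each ranking lists every condition once, most preferred first. With k
    conditions, the top rank scores k and the bottom rank scores 1; totals
    are divided by the maximum possible score (k times the number of rankings).
    """
    k = len(conditions)
    totals = {condition: 0 for condition in conditions}
    for ranking in rankings:
        if sorted(ranking) != sorted(conditions):
            raise ValueError("each ranking must contain every condition exactly once")
        for position, condition in enumerate(ranking):
            totals[condition] += k - position
    max_score = k * len(rankings)
    return {condition: totals[condition] / max_score for condition in conditions}


# Illustrative input only: two made-up participants ranking the six conditions.
CONDITIONS = [
    "Pie & Gaze-only", "Pie & Controller-only", "Pie & Gaze-and-Controller",
    "List & Gaze-only", "List & Controller-only", "List & Gaze-and-Controller",
]
example_rankings = [
    ["Pie & Gaze-and-Controller", "List & Gaze-and-Controller", "Pie & Controller-only",
     "List & Controller-only", "List & Gaze-only", "Pie & Gaze-only"],
    ["Pie & Gaze-and-Controller", "List & Controller-only", "List & Gaze-and-Controller",
     "Pie & Controller-only", "List & Gaze-only", "Pie & Gaze-only"],
]
example_nps = normalised_preference_scores(example_rankings, CONDITIONS)
```

Under this scheme a condition ranked first by everyone scores 1.0 and one ranked last by everyone scores 1/6, so the NPS is bounded and comparable across groups of different sizes.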
To evaluate the influence of prior XR and eye tracking experience, we computed point-biserial correlations between these binary factors and both error rates and task completion times. These analyses revealed no correlation. Responses to the post-experiment questionnaire indicated a slight, non-significant preference for the Gaze-and-Controller interaction among participants without prior eye tracking experience (NPS of 76% vs 69%), while those with such experience displayed a marginally greater preference for the Gaze-only interaction (NPS of 34% vs 27%). Furthermore, participants with previous XR and eye tracking experience favoured Pie menu layouts (NPS of 64%) over List layouts (NPS of 53%), whereas participants without such experience preferred List layouts (NPS of 61%) to Pie layouts (NPS of 56%).
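For readers unfamiliar with the point-biserial correlation mentioned above, the standard textbook formula can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis script: the function name is hypothetical, the toy numbers are made up, and significance testing is omitted.

```python
import math
from typing import Sequence


def point_biserial(binary: Sequence[int], values: Sequence[float]) -> float:
    """Point-biserial correlation between a binary factor and a continuous measure.

    Uses r = (M1 - M0) / s_n * sqrt(n1 * n0 / n^2), where s_n is the population
    standard deviation of all values; this equals Pearson's r with a 0/1 variable.
    """
    if len(binary) != len(values):
        raise ValueError("inputs must have the same length")
    n = len(values)
    group1 = [v for b, v in zip(binary, values) if b]
    group0 = [v for b, v in zip(binary, values) if not b]
    if not group1 or not group0:
        raise ValueError("both groups must be non-empty")
    mean_all = sum(values) / n
    sd_all = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    if sd_all == 0:
        raise ValueError("values must not all be identical")
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    return (mean1 - mean0) / sd_all * math.sqrt(len(group1) * len(group0) / n ** 2)


# Toy example (made-up numbers): 1 = prior experience, values = completion times.
r_example = point_biserial([0, 0, 1, 1], [1.0, 2.0, 3.0, 4.0])
```

A value near zero, as reported in the text for prior experience, means the two groups' averages on the measure are close relative to the overall spread.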
6 Discussion
In this study, we conducted a 19-participant user study to evaluate six conditions formed by two hierarchical menu layouts and three interaction methods to address our three hypotheses (Sect. 4.1). Contrary to our initial assumption, Pie menus were statistically faster than List menus for both Controller-only and Gaze-and-Controller interaction methods. In addition, Pie menus were overall statistically less erroneous. This could explain the prevalent use of Pie menus in the literature [2, 30, 53] despite the reported user preference for List menus [34, 42]. Therefore, we reject our hypothesis H1 that both layouts are equivalent.
The Gaze-and-Controller interaction method was significantly faster than Gaze-only. This contrasts with Mutasim et al. [44], who reported no significant speed difference between these methods; it is worth noting that Mutasim et al. [44] used a shorter dwell duration (300 ms vs 500 ms) and focused on a slightly different task (i.e., target selection). However, despite incorporating visual and haptic feedback, the Gaze-and-Controller method exhibited a higher error rate, although the difference was not statistically significant. When assessing the learning curve, we observed an average reduction in task completion time of approximately 20% across all interaction methods with repetitive menu entries, which aligns with the findings of Kim et al. [30]. Lastly, the post-experiment questionnaire revealed a user preference for the Gaze-and-Controller modality, with the List & Gaze-and-Controller combination achieving the highest SUS scores, thereby supporting hypothesis H3. Therefore, hypothesis H2 cannot be fully retained; although further research is needed to reduce the error rate, the multimodal approach appears to be the most preferred and convenient option for users.
Additionally, Controller-only interaction was significantly faster than both Gaze-only and Gaze-and-Controller, with the Pie & Controller-only condition being the fastest, even outperforming the List & Controller-only condition. This result differs from Monteiro et al. [42], who observed that List menus performed better than Pie menus with Controller-only interaction; this discrepancy might have been caused by our use of unfolding hierarchical levels, as opposed to their overlapping levels, which were unsuitable for Gaze-only.
6.1 Design Guidelines
Our findings indicate that Pie menus are significantly faster with lower error rates compared to List menus. However, for Gaze-only (dwell-based) interaction, there was no statistically significant speed difference between Pie and List layouts. Moreover, users with prior XR/eye tracking experience showed a preference for Pie menus, whereas non-experts tended (non-significantly) to favour List menus. Regarding interaction methods, both the Gaze-only and Controller-only methods achieve comparable accuracy. Controller-only and Gaze-and-Controller are preferable due to their higher speed and usability. Gaze-and-Controller also emerged as the most preferred method among non-experts, while Gaze-only ranked among the least preferred.
Therefore, Pie menus should be used for tasks requiring performance and precision. For Gaze-only (dwell-based) interaction, the layout can be chosen according to other design criteria such as usability or aesthetic considerations, since performance and error rates do not significantly differ. When users are likely to be XR or eye tracking experts, Pie layouts are preferred, but for non-experts, consider using List layouts to align with their intuitive expectations.
Gaze-only interaction should not be used as the primary selection method
due to its comparatively lower speed and user preference. Controller-only offers
high speed and usability. Alternatively, Gaze-and-Controller offers nearly the
same efficiency and usability, with a broader user preference. To optimise
performance, Pie menus should be paired with Controller-only or Gaze-and-Controller.
For novice-oriented interfaces that utilise Gaze-and-Controller interaction, a List
layout remains acceptable and aligns with user preference. Additionally, when
hands are occupied, or there is motor impairment, Gaze-only, or preferably the
multimodal Gaze-and-Controller, should be used since it can be implemented
with a single button instead of the controller [44].
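The guidelines above can be read as a small decision procedure. The following is a minimal sketch of one possible encoding, not part of the study itself: the `Context` fields, the `recommend` function, and the order in which the rules are applied are illustrative assumptions, and a real design decision would weigh further criteria such as aesthetics and accessibility.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Context:
    """Hypothetical description of the target application (names are assumptions)."""
    expert_users: bool          # users likely have prior XR / eye tracking experience
    hands_free_required: bool   # hands are occupied or there is motor impairment
    performance_critical: bool  # task demands speed and precision


def recommend(ctx: Context) -> Tuple[str, str]:
    """Return (layout, interaction) as one possible reading of the guidelines."""
    # Interaction: Gaze-only is never the primary recommendation here.
    if ctx.hands_free_required:
        # Multimodal selection can be driven by a single button.
        interaction = "Gaze-and-Controller"
    elif ctx.performance_critical:
        interaction = "Controller-only"
    else:
        # Nearly as efficient as Controller-only, with broader user preference.
        interaction = "Gaze-and-Controller"

    # Layout: Pie for performance or expert users, List for novices.
    layout = "Pie" if (ctx.performance_critical or ctx.expert_users) else "List"
    return layout, interaction


if __name__ == "__main__":
    novice_app = Context(expert_users=False, hands_free_required=False,
                         performance_critical=False)
    print(recommend(novice_app))
```

The rule priority (hands-free need first, then performance) is a design choice of this sketch; the guidelines themselves do not prescribe an ordering.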
Despite the evaluated gaze-based solutions showing lower overall performance,
their advantages in terms of privacy, especially in public settings where
observation attacks are a concern, are well reported [4, 29, 31, 55]. Our results
indicate that using gaze in combination with other modalities is preferred over
traditional Controller-only interaction. Although the ethical concerns regarding
eye tracking have been discussed in the literature, e.g., [41], our study had
minimal impact in this regard and received approval from our institution's ethical
review board.
While haptic feedback via controllers and adjustable diopters in the headset
may assist users with corrected vision, XR systems, in general, are not optimised
for visually impaired users [12, 35]. In such cases, a controller should be
used, and Gaze-only interaction methods should be avoided due to their lower
usability and preference. Furthermore, clear visual feedback should be provided
across all interaction methods to reduce errors and enhance task efficiency [33].
Additionally, based on Kim et al. [30], visual anchors should be used to separate
item selection zones from the target for Gaze-only interactions; this provides
Gaze-only with a comparable error rate to Controller-only.
6.2 Future Work
For Gaze-only interaction, shorter dwell durations, e.g. 300 ms [45], or adjustable
dwell times, e.g. [39], may enhance performance and user preference, while
incorporating visual indicators could help maintain accuracy [30]. Regarding
Gaze-and-Controller, further investigation is needed to reduce the error rate,
potentially by adding delayed selection or more prominent visual feedback, such as
a gaze indicator, to prevent users from looking away when pressing the button.
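To make the role of the dwell duration concrete, the following minimal sketch shows a dwell-based selector with a configurable threshold. It is an illustration under stated assumptions, not the implementation used in the study: the class name `DwellSelector`, its `update` method, and the per-sample interface are hypothetical, and the 0.3 s default merely mirrors the 300 ms example mentioned above.

```python
from typing import Optional


class DwellSelector:
    """Selects an item once gaze has rested on it for `dwell_time_s` seconds."""

    def __init__(self, dwell_time_s: float = 0.3):
        self.dwell_time_s = dwell_time_s        # adjustable dwell threshold
        self._target: Optional[str] = None      # item currently looked at
        self._start: Optional[float] = None     # time the current dwell began

    def update(self, target: Optional[str], timestamp_s: float) -> Optional[str]:
        """Feed one gaze sample; return the item id when a dwell completes."""
        if target != self._target:
            # Gaze moved to another item (or off the menu): restart the timer.
            self._target = target
            self._start = timestamp_s if target is not None else None
            return None
        if target is None or self._start is None:
            return None
        if timestamp_s - self._start >= self.dwell_time_s:
            # Fire once; gaze must leave and return before selecting again.
            self._start = None
            return target
        return None


if __name__ == "__main__":
    selector = DwellSelector(dwell_time_s=0.3)
    for t, item in [(0.00, "Save"), (0.10, "Save"), (0.35, "Save"), (0.50, "Save")]:
        print(t, selector.update(item, t))
```

An adjustable dwell time in the sense of [39] would correspond to changing `dwell_time_s` at runtime; how it should be adapted is left open here.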
Our initial removal of the border-crossing-based Gaze-only method was based
on its longer task completion times and lower user preference during the pilot
study. Our observations indicate that border-crossing was hindered by low eye
tracker accuracy, which caused users to spend more time on selection despite
conducting a gaze accuracy test prior to each condition. Although our headset's
advertised eye-tracker accuracy is higher (0.5°-1.1°, see footnote 12) than the
device used by Kim et al. [30], users reported low perceived eye-tracking accuracy
during both dwell and border-crossing interactions in the pilot study. Enlarging
the item selection zone, as suggested by Kim et al. [30], could mitigate this issue,
as larger target areas and shorter distances improve ease and speed of selection
[8, 17]. Future research should evaluate the spatial accuracy and precision of
XR headsets, similar to Kapp et al. [25], or adaptively adjust the size of item
selection zones based on measured eye tracker accuracy, as suggested by Barz et
al. [3].
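As a rough illustration of how measured tracker accuracy could drive zone sizing, the sketch below converts a visual angle into a physical width at a given viewing distance, w = 2 d tan(θ/2), and pads an item's angular size by the measured angular error on each side. The padding rule and the function names are assumptions made for this example; they are not the method of Barz et al. [3]. The last function gives Fitts' original index of difficulty, log2(2A/W) [17], which shows why wider zones and shorter distances ease selection.

```python
import math


def visual_angle_to_size(angle_deg: float, distance_m: float) -> float:
    """Width (in metres) subtended by a visual angle at a viewing distance."""
    return 2.0 * distance_m * math.tan(math.radians(angle_deg) / 2.0)


def padded_zone_size(item_angle_deg: float, accuracy_deg: float,
                     distance_m: float, safety_factor: float = 1.0) -> float:
    """Zone width after padding the item by the angular error on both sides.

    Assumption of this sketch: the zone should cover the item plus
    `safety_factor * accuracy_deg` of gaze offset to the left and right.
    """
    padded_angle = item_angle_deg + 2.0 * safety_factor * accuracy_deg
    return visual_angle_to_size(padded_angle, distance_m)


def fitts_index_of_difficulty(amplitude: float, width: float) -> float:
    """Fitts' original index of difficulty in bits: log2(2A / W)."""
    return math.log2(2.0 * amplitude / width)


if __name__ == "__main__":
    # Advertised accuracy range from the text: 0.5 to 1.1 degrees.
    for accuracy in (0.5, 1.1):
        width = padded_zone_size(item_angle_deg=5.0, accuracy_deg=accuracy,
                                 distance_m=1.0)
        print(f"accuracy {accuracy} deg -> zone width {width:.3f} m")
```

The 5° item size and 1 m viewing distance in the usage example are placeholder values, not parameters reported in the study.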
7 Conclusion
This paper presented a comparative study of gaze-based menu navigation methods
in XR, analysing the efficiency, usability, and user preferences of two menu
layouts (Pie and List) with three interaction methods: dwell-based Gaze-only,
Controller-only, and a multimodal Gaze-and-Controller approach. Through a
within-subject study with 19 participants, we conducted statistical analysis to
identify key insights and formulate design guidelines for gaze-based menu systems.
Our findings indicate that, despite the performance and accuracy advantages of Pie
menus, users without prior XR and eye tracking experience favoured List menus.
This suggests that users' pre-existing mental models and familiarity with
conventional List layouts can outweigh the raw performance benefits of Pie menus,
motivating further research toward guided onboarding and adaptive menu systems.
While dwell-based interaction demonstrated high accuracy, it received low
usability scores and user preference, pointing to the need for further exploration
of alternative, innovative interaction methods. Additionally, although
controller-based interaction was faster and more accurate than the multimodal
approach, users expressed a preference for the gaze-and-controller interaction,
highlighting its potential as a preferred discrete selection method in public
setups. Overall, our findings contribute to providing insights that can inform
future research and development of more effective, user-centred interaction
techniques for gaze-based menu systems in XR.
12 https://www.vive.com/eu/accessory/vive-full-face-tracker/ (Accessed 17 Feb 2025).
Acknowledgments. This work was funded by the European Union under grant
number 101093079 (MASTER) and the German Federal Ministry of Education and
Research (BMBF) under grant numbers 01IW23002 (No-IDLE) and 01IW24006
(NoIDLEChatGPT), as well as by the Endowed Chair of Applied AI at the University of
Oldenburg.
Disclosure of Interests. The authors have no competing interests to declare that
are relevant to the content of this article.
References
1. Ahlström, D., Cockburn, A., Gutwin, C., Irani, P.: Why it's quick to be square: modelling new and existing hierarchical menu designs. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2010, pp. 1371–1380. ACM, New York, NY, USA, April 2010. https://doi.org/10.1145/1753326.1753534
2. Ahn, S., Santosa, S., Parent, M., Wigdor, D., Grossman, T., Giordano, M.: StickyPie: a gaze-based, scale-invariant marking menu optimized for AR/VR. In: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, ACM, May 2021. https://doi.org/10.1145/3411764.3445297
3. Barz, M., Daiber, F., Sonntag, D., Bulling, A.: Error-aware gaze-based interfaces for robust mobile gaze interaction. In: Proceedings of the 2018 ACM Symposium on Eye Tracking Research & Applications, ETRA 2018, pp. 1–10. ACM, New York, NY, USA, June 2018. https://doi.org/10.1145/3204493.3204536
4. Bhatti, O.S., Barz, M., Sonntag, D.: EyeLogin - calibration-free authentication method for public displays using eye gaze. In: ACM Symposium on Eye Tracking Research and Applications, ETRA 2021 Short Papers, pp. 1–7. ACM, New York, NY, USA, May 2021. https://doi.org/10.1145/3448018.3458001
5. Blanca, M., Alarcón, R., Arnau, J., Bono, R., Bendayan, R.: Non-normal data: Is ANOVA still a valid option? Psicothema 4(29), 552–557 (2017). https://doi.org/10.7334/psicothema2016.383
6. Brooke, J.: SUS: a 'quick and dirty' usability scale. In: Usability Evaluation In Industry, p. 6. CRC Press, 1st edn. (1996). https://doi.org/10.1201/9781498710411-35
7. Callahan, J., Hopkins, D., Weiser, M., Shneiderman, B.: An empirical comparison of pie vs. linear menus. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 1988, pp. 95–100. ACM, New York, NY, USA, May 1988. https://doi.org/10.1145/57167.57182
8. Cha, Y., Myung, R.: Extended Fitts' law for 3D pointing tasks using 3D target arrangements. Int. J. Ind. Ergon. 43(4), 350–355 (2013). https://doi.org/10.1016/j.ergon.2013.05.005
9. Chu, M., Begole, B.: Chapter 17 - natural and implicit information-seeking cues in responsive technology. In: Aghajan, H., Delgado, R.L.C., Augusto, J.C. (eds.) Human-Centric Interfaces for Ambient Intelligence, pp. 415–452. Academic Press, Oxford, January 2010. https://doi.org/10.1016/B978-0-12-374708-2.00017-6
10. Clemotte, A., Velasco, M., Torricelli, D., Raya, R., Ceres, R.: Accuracy and precision of the Tobii X2-30 eye-tracking under non ideal conditions. In: Proceedings of the 2nd International Congress on Neurotechnology, Electronics and Informatics - NEUROTECHNIX, pp. 111–116. SciTePress (2014). https://doi.org/10.5220/0005094201110116
11. Cohen, J.: Statistical Power Analysis for the Behavioral Sciences. Routledge, New York, 2nd edn., July 1988. https://doi.org/10.4324/9780203771587
12. Creed, C., Al-Kalbani, M., Theil, A., Sarcar, S., Williams, I.: Inclusive AR/VR: accessibility barriers for immersive technologies. Univ. Access Inf. Soc. 23(1), 59–73 (2024). https://doi.org/10.1007/s10209-023-00969-0
13. Duchowski, A.T.: A breadth-first survey of eye-tracking applications. Behav. Res. Methods Instrum. Comput. 34(4), 455–470 (2002). https://doi.org/10.3758/BF03195475
14. Duchowski, A.T.: Gaze-based interaction: a 30 year retrospective. Comput. Graph. 73, 59–69 (2018). https://doi.org/10.1016/j.cag.2018.04.002
15. Elmadjian, C., Morimoto, C.H.: Gazebar: Exploiting the Midas touch in gaze interaction. In: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI 2021, ACM, May 2021. https://doi.org/10.1145/3411763.3451703
16. Faul, F., Erdfelder, E., Buchner, A., Lang, A.G.: Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses. Behav. Res. Methods 41(4), 1149–1160 (2009). https://doi.org/10.3758/BRM.41.4.1149
17. Fitts, P.M.: The information capacity of the human motor system in controlling the amplitude of movement. J. Exp. Psychol. 47(6), 381–391 (1954). https://doi.org/10.1037/h0055392
18. Holm, S.: A simple sequentially rejective multiple test procedure. Scandinavian J. Stat. 6(2), 65–70 (1979)
19. Holmqvist, K., Nyström, M., Andersson, R., Dewhurst, R., Jarodzka, H., van de Weijer, J.: Eye Tracking: A Comprehensive Guide to Methods and Measures. Oxford University Press, Oxford, New York, November 2011
20. Hou, S., Thomas, B.H., Lu, X.: VRMenuDesigner: a toolkit for automatically generating and modifying VR menus. In: 2021 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), pp. 154–159, November 2021. https://doi.org/10.1109/AIVR52153.2021.00036
21. Huckauf, A., Urbina, M.H.: Gazing with pEYEs. In: Proceedings of the 2008 Symposium on Eye Tracking Research & Applications - ETRA 2008, ACM Press (2008). https://doi.org/10.1145/1344471.1344483
22. Imamura, S., Jieun, L., Rekimoto, J., Makoto, I.: Advantage of gaze-only content browsing in VR using cumulative dwell time compared to hand controller. In: Proceedings of the 2023 ACM Symposium on Spatial User Interaction, SUI 2023, pp. 1–8. ACM, New York, NY, USA, October 2023. https://doi.org/10.1145/3607822.3614513
23. Isomoto, T., Yamanaka, S., Shizuki, B.: Interaction design of dwell selection toward gaze-based AR/VR interaction. In: 2022 Symposium on Eye Tracking Research and Applications, ETRA 2022, pp. 1–2. ACM, New York, NY, USA, June 2022. https://doi.org/10.1145/3517031.3531628
24. Jacob, R.J.K.: What you look at is what you get: eye movement-based interaction techniques. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 1990, pp. 11–18. ACM, New York, NY, USA, March 1990. https://doi.org/10.1145/97243.97246
25. Kapp, S., Barz, M., Mukhametov, S., Sonntag, D., Kuhn, J.: ARETT: augmented reality eye tracking toolkit for head mounted displays. Sensors 21(6), 2234 (2021). https://doi.org/10.3390/s21062234
26. Kay, M., Elkin, L., Higgins, J.J., Wobbrock, J.O.: mjskay/ARTool: ARTool 0.11.2, April 2025. https://doi.org/10.5281/ZENODO.594511, https://zenodo.org/doi/10.5281/zenodo.594511
27. Kerby, D.S.: The simple difference formula: an approach to teaching nonparametric correlation. Compr. Psychol. 3, 11.IT.3.1 (2014). https://doi.org/10.2466/11.IT.3.1
28. Khamis, M., Oechsner, C., Alt, F., Bulling, A.: VRpursuits: interaction in virtual reality using smooth pursuit eye movements. In: Proceedings of the 2018 International Conference on Advanced Visual Interfaces, AVI 2018, ACM, May 2018. https://doi.org/10.1145/3206505.3206522
29. Khamis, M., et al.: CueAuth: comparing touch, mid-air gestures, and gaze for cue-based authentication on situated displays. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2(4), 174:1–174:22 (2018). https://doi.org/10.1145/3287052
30. Kim, T., Ham, A., Ahn, S., Lee, G.: Lattice menu: a low-error gaze-based marking menu utilizing target-assisted gaze gestures on a lattice of visual anchors. In: CHI Conference on Human Factors in Computing Systems, ACM, April 2022. https://doi.org/10.1145/3491102.3501977
31. Kopácsi, L., Schneider, T.S., Karr, C., Barz, M., Sonntag, D.: Gazelock: gaze- and lock pattern-based authentication. In: Proceedings of the 30th ACM Symposium on Virtual Reality Software and Technology. VRST 2024, ACM, New York, NY, USA (2024). https://doi.org/10.1145/3641825.3689520
32. Kopácsi, L., Klimenko, A., Barz, M., Sonntag, D.: Exploring gaze-based menu navigation in virtual environments. In: Proceedings of the 2024 ACM Symposium on Spatial User Interaction, SUI 2024, pp. 1–2. ACM, New York, NY, USA, October 2024. https://doi.org/10.1145/3677386.3688887
33. Lankes, M., Ramirez Gomez, A.: GazeCues: exploring the effects of gaze-based visual cues in virtual reality exploration games. Proc. ACM Hum.-Comput. Interact. 6(CHI PLAY), 237:1–237:25 (2022). https://doi.org/10.1145/3549500
34. Lediaeva, I., LaViola, J.: Evaluation of body-referenced graphical menus in virtual environments. In: Proceedings of Graphics Interface 2020, GI 2020, pp. 308–316. Canadian Human-Computer Communications Society/Société canadienne du dialogue humain-machine (2020). https://doi.org/10.20380/GI2020.31
35. Liu, T., Fazli, P., Jeong, H.: Artificial intelligence in virtual reality for blind and low vision individuals: literature review. Proc. Hum. Factors Ergon. Soc. Ann. Meeting 68(1), 1333–1338 (2024). https://doi.org/10.1177/10711813241266832
36. Lu, F., Davari, S., Bowman, D.: Exploration of techniques for rapid activation of glanceable information in head-worn augmented reality. In: Symposium on Spatial User Interaction. SUI 2021, ACM, November 2021. https://doi.org/10.1145/3485279.3485286
37. Lystbæk, M.N., Rosenberg, P., Pfeuffer, K., Grønbæk, J.E., Gellersen, H.: Gaze-hand alignment: combining eye gaze and mid-air pointing for interacting with menus in augmented reality. Proc. ACM Hum.-Comput. Interact. 6(ETRA), 1–18 (2022). https://doi.org/10.1145/3530886
38. Lystbæk, M.N., et al.: Hands-on, hands-off: gaze-assisted bimanual 3D interaction. In: Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, pp. 1–12. ACM, Pittsburgh PA USA, October 2024. https://doi.org/10.1145/3654777.3676331
39. Majaranta, P., Ahola, U.K., Špakov, O.: Fast gaze typing with an adjustable dwell time. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2009, ACM, April 2009. https://doi.org/10.1145/1518701.1518758
40. Mine, M.R.: ISAAC: a virtual environment tool for the interactive construction of virtual worlds. Technical Report, University of North Carolina at Chapel Hill, USA, April 1995
41. Mohamed Selim, A., Barz, M., Bhatti, O.S., Alam, H.M.T., Sonntag, D.: A review of machine learning in scanpath analysis for passive gaze-based interaction. Front. Artif. Intell. 7 (2024). https://doi.org/10.3389/frai.2024.1391745
42. Monteiro, P., Coelho, H., Gonçalves, G., Melo, M., Bessa, M.: Comparison of radial and panel menus in virtual reality. IEEE Access 7, 116370–116379 (2019). https://doi.org/10.1109/access.2019.2933055
43. Monteiro, P., Gonçalves, G., Peixoto, B., Melo, M., Bessa, M.: Evaluation of hands-free interaction methods during a Fitts' task: efficiency and effectiveness. IEEE Access 11, 70898–70911 (2023). https://doi.org/10.1109/access.2023.3293057
44. Mutasim, A., Batmaz, A.U., Mughrabi, M.H., Stuerzlinger, W.: Performance analysis of saccades for primary and confirmatory target selection. In: 28th ACM Symposium on Virtual Reality Software and Technology, ACM, November 2022. https://doi.org/10.1145/3562939.3565619
45. Mutasim, A.K., Batmaz, A.U., Stuerzlinger, W.: Pinch, click, or dwell: comparing different selection techniques for eye-gaze-based pointing in virtual reality. In: ACM Symposium on Eye Tracking Research and Applications, ACM, May 2021. https://doi.org/10.1145/3448018.3457998
46. Paulus, Y.T., Remijn, G.B.: Usability of various dwell times for eye-gaze-based object selection with eye tracking. Displays 67, 101997 (2021). https://doi.org/10.1016/j.displa.2021.101997
47. Pfeuffer, K., Mayer, B., Mardanbegi, D., Gellersen, H.: Gaze + pinch interaction in virtual reality. In: Proceedings of the 5th Symposium on Spatial User Interaction, SUI 2017, ACM, October 2017. https://doi.org/10.1145/3131277.3132180
48. Pfeuffer, K., Mecke, L., Delgado Rodriguez, S., Hassib, M., Maier, H., Alt, F.: Empirical evaluation of gaze-enhanced menus in virtual reality. In: 26th ACM Symposium on Virtual Reality Software and Technology, pp. 1–11. ACM, Virtual Event Canada, November 2020. https://doi.org/10.1145/3385956.3418962
49. Plopski, A., Hirzle, T., Norouzi, N., Qian, L., Bruder, G., Langlotz, T.: The eye in extended reality: a survey on gaze interaction and eye tracking in head-worn extended reality. ACM Comput. Surv. 55(3), 53:1–53:39 (2022). https://doi.org/10.1145/3491207
50. Rantamaa, H.R., Kangas, J., Kumar, S.K., Mehtonen, H., Järnstedt, J., Raisamo, R.: Comparison of a VR stylus with a controller, hand tracking, and a mouse for object manipulation and medical marking tasks in virtual reality. Appl. Sci. 13(4), 2251 (2023). https://doi.org/10.3390/app13042251
51. Rauschnabel, P.A., Felix, R., Hinsch, C., Shahab, H., Alt, F.: What is XR? Towards a framework for augmented and virtual reality. Comput. Hum. Behav. 133, 107289 (2022). https://doi.org/10.1016/j.chb.2022.107289
52. Reiter, K., Pfeuffer, K., Esteves, A., Mittermeier, T., Alt, F.: Look & turn: one-handed and expressive menu interaction by gaze and arm turns in VR. In: 2022 Symposium on Eye Tracking Research and Applications, ACM, June 2022. https://doi.org/10.1145/3517031.3529233
53. Sidenmark, L., Potts, D., Bapisch, B., Gellersen, H.: Radi-eye: hands-free radial interfaces for 3D interaction using gaze-activated head-crossing. In: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. ACM, May 2021. https://doi.org/10.1145/3411764.3445697
54. Starker, I., Bolt, R.A.: A gaze-responsive self-disclosing display. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 1990, pp. 3–10. ACM, New York, NY, USA, March 1990. https://doi.org/10.1145/97243.97245
55. Stephenson, S., Pal, B., Fan, S., Fernandes, E., Zhao, Y., Chatterjee, R.: SoK: authentication in augmented and virtual reality. In: 2022 IEEE Symposium on Security and Privacy (SP), pp. 267–284, May 2022. https://doi.org/10.1109/SP46214.2022.9833742
56. van Teylingen, R., Ribarsky, W., van der Mast, C.: Virtual data visualizer. IEEE Trans. Visual Comput. Graphics 3(1), 65–74 (1997). https://doi.org/10.1109/2945.582350
57. Wilcoxon, F.: Individual comparisons by ranking methods. Biometrics Bull. 1(6), 80–83 (1945). https://doi.org/10.2307/3001968
58. Wobbrock, J.O., Findlater, L., Gergle, D., Higgins, J.J.: The aligned rank transform for nonparametric factorial analyses using only ANOVA procedures. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2011, pp. 143–146. ACM, New York, NY, USA, May 2011. https://doi.org/10.1145/1978942.1978963
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.
The images or other third party material in this chapter are included in the
chapter's Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the chapter's Creative Commons license and
your intended use is not permitted by statutory regulation or exceeds the permitted
use, you will need to obtain permission directly from the copyright holder.