scieee Science in your language
[en] (orig)

SafeFetch: Practical Double-Fetch Protection with Kernel-Fetch Caching

Author: Victor Duta; Mitchel Josephus Aloserij; Cristiano Giuffrida
Publisher: Zenodo
DOI: 10.5281/zenodo.13771642
Source: https://zenodo.org/records/13771642/files/usenixsecurity24-duta.pdf
This pape is included in he P oceedings o he
33 d USENIX Secu i y Symposium.
Augus 14–16, 2024 • Philadelphia, PA, USA
978-1-939133-44-1
Open access o he P oceedings o he
33 d USENIX Secu i y Symposium
is sponso ed by USENIX.
Sa eFe ch: P ac ical Double-Fe ch P o ec ion
wi h Ke nel-Fe ch Caching
Vic o Du a, Mi chel Josephus Alose ij, and C is iano Giu ida,
V ije Uni e si ei Ams e dam
h ps://www.usenix.o g/con e ence/usenixsecu i y24/p esen a ion/du a
Sa eFe ch: P ac ical Double-Fe ch P o ec ion
wi h Ke nel-Fe ch Caching
Vic o Du a Mi chel Josephus Alose ij C is iano Giu ida
V ije Uni e si ei Ams e dam
Abs ac
Double- e ch bugs (o ulne abili ies) s em om in-ke nel
sys em call execu ion e ching he same use da a wice
wi hou p ope da a ( e)sani iza ion, enabling TOCT-
TOU a acks and posing a majo h ea o ope a ing sys-
ems secu i y. Exis ing double- e ch p o ec ion sys ems
ely on he MMU o ap on w i es o syscall-accessed
use pages and p o ide he ke nel wi h a consis en snap-
sho o use memo y. While his s a egy can hinde a -
acks, i also in oduces non i ial un ime pe o mance
o e head due o he cos o apping/ emapping and he
coa se (page-g anula ) w i e in e posi ion mechanism.
In his pape , we p opose
Sa eFe ch
, a p ac ical solu-
ion o p o ec he ke nel om double- e ch bugs. The key
in ui ion is ha mos sys em calls e ch small amoun s o
use da a (i a all), hence caching his da a in he ke nel
can be done a a small pe o mance cos . To his end,
Sa eFe ch
c ea es pe -syscall caches o pe sis e ched
use da a and eplay hem when hey a e e ched again
wi hin he same syscall. This s a egy neu alizes all
double- e ch bugs, while elimina ing apping/ emapping
o e heads and elying on e icien by e-g anula in e po-
si ion. Ou Linux p o o ype e alua ion shows
Sa eFe ch
can p o ide comp ehensi e p o ec ion wi h low pe o -
mance o e heads (e.g., 4.4% geomean on LMBench),
signi ican ly ou pe o ming s a e-o - he-a solu ions.
1 In oduc ion
The ope a ing sys em (OS) ke nel is he bed ock o mod-
e n sys ems. To p o ide se ice, he ke nel includes a
syscall in e ace, an explici bounda y be ween un us ed
OS p ocesses and he us ed ke nel. Hence, i is c ucial
o he ke nel o p ope ly sani ize da a ha lows h ough
his bounda y (e.g., syscall a gumen s). Failu e o do so
may lead o (ke nel) double- e ch bugs [15]. Such bugs oc-
cu when he ke nel e ches (i.e., eads) he same/o e lap-
ping da a om use space wice—a common ke nel design
pa e n—wi hou p ope ly ( e)sani izing da a on he sec-
ond e ch. In essence, double- e ch bugs in oduce a ace
condi ion, which a acke s can exploi o moun ime-
o -check o ime-o -use (TOCTTOU) a acks—changing
use da a be ween he wo e ches. This is o bypass
sani y checks and ypically escala e p i ileges. Such bugs
a e bo h common (as hey in ol e skipping seemingly
“ edundan ” ke nel checks [15]) and elusi e (as hey no -
mally escape es ing wi h p oduc ion sani ize s [9]).
P io esea ch [18,23,26,31] has mos ly sough o
de ec and epo se e al double- e ch bugs [2
–
7]. How-
e e , p io de ec ion ools a e imp ecise and hus un-
sui able o mi iga e double- e ch bugs in p oduc ion.
Mo e ecen ly, Midas [15] p oposed he i s mi iga ion
o o e p o ec ion ( a he han de ec ion) gua an ees
agains double- e ch bugs. Midas elies on he MMU and
copy-on-w i e
mechanics o expose consis en snap-
sho s o use pages o each syscall. To his end, Midas
aps w i es o each e ched use page, copies he page,
and exposes he new (old) page o he w i e (syscall).
While his app oach s uc u ally p e en s double- e ch
exploi a ion, i also incu s non i ial o e head due o
he cos o apping, emapping, and copying pages as
well as ope a ing a he coa se page g anula i y (causing
o e apping and o e copying due o alse sha ing [15]).
Finally, due o he complex and cos ly ope a ions Midas
whi elis s a numbe o syscalls (including he ke nel- e ch
hea y
exec e
), ul ima ely educing p o ec ion co e age.
In his pape , we p opose
Sa eFe ch
, a p ac ical p o-
ec ion sys em agains ke nel double- e ch bugs. The
key idea is o mo e he co e ins umen a ion om
w i es o ke nel e ches, wi h he ke nel main aining
pe -syscall caches o se e ke nel e ches. Indeed, ou
design has been inspi ed by Linux ke nel de elope s
seeking a p ac ical double- e ch bug mi iga ion by “pe -
o ming some kind o ke nel-side caching o use space
memo y” [8].
Sa eFe ch
’s design p e en s concu en
(a acke -con olled) w i es om co up ing da a exposed
o ke nel double e ches—which ins ead hi he p e i-
USENIX Associa ion 33 d USENIX Secu i y Symposium 1207
ously popula ed ke nel- e ch cache by cons uc ion. As
we will show, he as majo i y o syscalls only copy a ew
by es om use space, hence caching ke nel- e ch da a
pe -syscall can be done in a simple and inexpensi e way.
In con as o Midas’ w i e-side ins umen a ion s a -
egy,
Sa eFe ch
’s e ch-side s a egy elimina es he need
o cos ly MMU-based ins umen a ion (as w i es un
unins umen ed), alse sha ing and o e copying (as he
cache ope a es a he by e a he han page g anula i y),
and syscall-based whi elis ing (as indi idual p oblema ic
ke nel e ches can be whi elis ed as needed). As a esul ,
Sa eFe ch
signi ican ly imp o es bo h he pe o mance
and he secu i y (p o ec ion co e age) o s a e-o - he-a
solu ions a a ac ion o he complexi y.
To suppo ou claims, we implemen ed
Sa eFe ch
on Linux and e alua ed ou p o o ype, along wi h a
numbe o caching-cen ic op imiza ions, agains a num-
be o s anda d benchma ks. Ou e alua ion shows ha
Sa eFe ch
p o ides comp ehensi e p o ec ion a a ac-
ion o he o e head incu ed by Midas, despi e he highe
p o ec ion co e age (i.e., a single ke nel e ch s. h ee
majo syscalls whi elis ed). Fo ins ance,
Sa eFe ch
in-
cu s geomean pe o mance o e heads consis en ly below
5% ac oss s anda d ke nel benchma ks (i.e., LMBench,
OSBench, and Pho onix). On he same benchma ks, Mi-
das epo s much highe geomean o e heads (i.e., as
high as
≈
15% on OSBench and
≈
36% on LMBench).
Mo eo e , Midas incu s a single-benchma k wo s -case
o e head o 279% ( s. 22% o Sa eFe ch).
Con ibu ions. We make he ollowing con ibu ions:
•
We in es iga e common ke nel e ch pa e ns du ing
syscall execu ion and use he esul ing insigh s o
design a pe -syscall ke nel- e ch cache.
•
We p esen
Sa eFe ch
, an implemen a ion o ou
design o s uc u ally mi iga e double- e ch bugs
in he Linux ke nel. We show
Sa eFe ch
can be
seamlessly in eg a ed in o exis ing ke nel code pa hs,
esul ing in a p ac ical implemen a ion.
•
We e alua e
Sa eFe ch
on a numbe o s anda d
benchma ks, con i ming ha i can comp ehensi ely
mi iga e double- e ch bugs wi h low pe o mance
o e heads (e.g., 4.4% geomean on LMBench).
2 Backg ound
2.1 Use /ke nel Memo y Isola ion
Mode n ope a ing sys ems ely on i ual memo y sup-
po o en o ce use /ke nel memo y isola ion, ha is
p e en ing use (ke nel) execu ion om accessing ke nel
(use ) memo y. This is ypically done by using a join i -
ual memo y add ess space—whe e bo h use and ke nel
USERSPACE @da a
Th ead 1 Th ead 2
syscall_en y
FETCH @da a
VALIDATE @da a
FETCH @da a
USE @da a
UPDATE @da a
TIME
Figu e 1: Wo k low o a double- e ch exploi .
memo y mappings coexis du ing use /ke nel execu ion—
and ea u es o e ed by mode n memo y managemen
uni s (MMUs) o en o ce isola ion. Speci ically, on x86
pla o ms, he ke nel can se (unse ) he Use /Supe iso
bi in he Page Table En ies (PTEs) o use (ke nel)
memo y mappings. This p e en s use execu ion om
accessing ke nel memo y. I also p e en s he e e se
(i.e., ke nel execu ion accessing use memo y) assuming
Supe iso Mode Access/Execu ion P e en ion ea u es
(SMAP and SMEP, espec i ely) a e enabled.
While impo an o memo y isola ion, SMAP compli-
ca es he implemen a ion o common ope a ions such as
ke nel e ches, ha is use - o-ke nel da a ans e s o en
issued by he ke nel as pa o syscall handling (e.g.,
copying a message om use memo y o be sen o e he
ne wo k). To ease hei implemen a ion, mode n ope -
a ing sys ems ypically suppo special ke nel ans e
unc ions in o de o sa ely ans e da a be ween use
and ke nel. Fo example, he Linux ke nel o e s wo use -
o-ke nel (i.e.,
copy_ om_use
and
ge _use
) and wo
ke nel- o-use (i.e.,
copy_ o_use
and
pu _use
) ans-
e unc ions, which empo a ily disable SMAP and copy
da a om/ o use memo y ( espec i ely).
2.2 Double- e ch Bugs
A ke nel double e ch occu s in p esence o ke nel e ches
ans e ing he same use da a wice, ha is wi h mul-
iple use - o-ke nel ans e unc ion in oca ions o he
same (o o e lapping) use da a on Linux. This pa -
e n is no mally benign and used o simpli y o op imize
common ypes o (e.g., deep o a iable-leng h [26]) use -
o-ke nel da a ans e s. Howe e , i he ke nel assumes
he da a o be in a ian and only alida es da a on he
i s e ch, he second e ch o igina es a double- e ch bug.
Such bugs a e pa icula ly insidious as hey in oduce a
ace condi ion ha may ne e cause any ha m du ing
no mal execu ion. Howe e , an a acke can exploi such
1208 33 d USENIX Secu i y Symposium USENIX Associa ion
Th ead
syscall_en y
FETCH @da a
FETCH @da a
syscall_exi
TIME
Sa eFe ch
pe - h ead caches Use space
sea ch @da a
(cache miss)
e ch @da a om use
s o e @da a
e u n @da a
sea ch @da a
(cache hi )
e u n @da a
in alida e cache
Figu e 2: Sa eFe ch hinde ing a double- e ch exploi .
bugs by acing agains ke nel execu ion om ano he use
h ead and co up ing he (unsani ized) da a exposed
o he second e ch. Such ime-o -check o ime-o -use
(TOCTTOU) a ack can bypass sani y checks and o en
kicks a a p i ilege escala ion exploi [26].
Figu e 1depic s he wo k low o a ypical double- e ch
exploi . In esponse o Th ead 1 execu ing a syscall,
he ke nel i s e ches and alida es some use
@da a
.
Sho ly a e , ano he a acke -con olled Th ead 2 con-
cu en ly modi ies he
@da a
in use space. In Th ead 1,
he ke nel hen p oceeds o e ch
@da a
again wi hou
( e) alida ion. This allows he a acke o bypass alida-
ion checks and moun a TOCTTOU a ack. P io wo k
has la gely ocused on de ec ing such bugs wi h eason-
able accu acy [18,23,26,31]. In his pape , we ins ead
ocus on p o ec ing he ke nel om ze o-day double- e ch
bugs, while p oposing a much simple and mo e e icien
design han he s a e-o - he-a p o ec ion sys em [15].
3 Th ea Model
We assume a ypical local exploi a ion h ea model, wi h
an unp i ileged use -space a acke seeking o exploi a
ke nel double- e ch bug. O he classes o ulne abili ies
a e ou o scope, e.g., add essed by o hogonal mi iga ions.
The a acke ul ima ely aims o moun a TOCTTOU
a ack o a a ie y o di e en pu poses, e.g., p i ilege
escala ion, in o leak, denial o se ice, e c.
4 O e iew
To hinde exploi a ion o ke nel double- e ch bugs,
Sa eFe ch
gua an ees ha , du ing he li e ime o each
syscall, ke nel e ches o he same use da a will e u n
Cache Backend
Cache F on end
Sa eFe ch
Syscall Cache
syscall ans e unc ion call
sea ch
ange
ange
que y
miss
Cus om Alloca o
sani ized
ange
p o ision
Figu e 3: Sa eFe ch’s high-le el a chi ec u e.
he same alue. To his end,
Sa eFe ch
caches da a ead
by ke nel e ches a he pe - h ead and pe -syscall g an-
ula i y, as illus a ed in Figu e 2. As shown in he igu e,
when a syscall e ches some use
@da a
o he i s ime,
Sa eFe ch
p oxies he e ch o use memo y and s o es
he e ched da a in a pe - h ead in-ke nel cache. When
he syscall e ches he same
@da a
again,
Sa eFe ch
e ie es he da a om he cache. As such, double e ches
a e always consis en ly se ed wi h he ini ial e sion o
he da a ega dless o any concu en upda es o use
memo y. A he end o he syscall li ecycle, he cache is
in alida ed o implemen pe -syscall caching seman ics.
In e nally,
Sa eFe ch
includes wo co e componen s,
as shown in Figu e 3. The
Cache F on end
in e cep s
each ke nel e ch, i.e., (use - o-ke nel) ans e unc ion
in oca ion, and e u ns a sani ized ange (i.e., a con igu-
ous, immu able use memo y block) in ou pu . To his
end, he on end que ies he cu en syscall cache o
he ange. In case o a hi , a (sub) ange is se ed di ec ly
om he cache. In case o a miss, he on end no i ies
he
Cache Backend
. The la e p o isions he cache o
s o e he missing (sub) ange ( e ched om use memo y)
by means o a cus om alloca o .
Challenges.
While
Sa eFe ch
’s design is concep ually
simple, he e a e se e al challenges in ol ed in i s ealiza-
ion. Fi s , he need o an in-ke nel cache may lead o a
non i ial TCB impac , a ec ing secu i y. We will la e
show i is easible o e icien ly implemen ou design
wi h small TCB impac . Second, he need o in e pose
on all e ch ope a ions may lead o p o ec ion co e age
(and hus secu i y) issues. We will la e show i is easible
o p oduce a (nea ly) ull-co e age implemen a ion on
mode n ope a ing sys ems such as Linux, a ing e en
be e han he s a e o he a (Midas). Thi d, ou
e ch-side ins umen a ion s a egy may end up copying
USENIX Associa ion 33 d USENIX Secu i y Symposium 1209
mo e da a han Midas’ w i e-side s a egy in case use
da a is ne e changed du ing syscall execu ion. We will
la e show such cos is ma ginal compa ed o ha o
MMU-based ins umen a ion, esul ing in consis en ly
be e pe o mance. Finally, ou design equi e ins u-
men ing he ke nel’s as pa h (i.e., ans e unc ions).
As such, i s ins umen a ion and da a s uc u es need o
be ca e ully designed o e icien ly suppo ypical ke nel
e ch pa e ns. We will show i is possible o cap u e a
a ie y o di e en e ch pa e ns wi h ela i ely simple
da a s uc u es. In he nex sec ions, we i s analyze
he pa e ns ele an o ou design. Then, we use he
insigh s ga he ed om ou analysis o de ail ou design.
5 P o iling Ke nel Fe ches
While ou syscall caches a e supe icially simila o
o he ke nel caches since hey may suppo simila ange
que ies, ou design equi emen s a e ai ly unique. Fo
ins ance, he VMA cache is a classic example o a ke nel
cache suppo ing ange que ies, howe e , i s lookup pa -
e ns a e wholly di e en om ou s (lookups on locali y-
iendly memo y managemen ope a ions s. lookups on
ke nel e ches) and so a e i s scope (p ocess s. syscall)
and da a s o age equi emen s ( ixed- s. a iable-sized
da a). As such, o make op imal design decisions, we
need o lea n mo e abou ypical pa e ns o ke nel
e ches, including hei equency, da a ans e size, e c.
To his end, we de eloped a simple p o ile o ga he
ke nel- e ch s a is ics. Speci ically, ou p o ile in e poses
on syscall execu ion and eco ds he ollowing s a is ics:
he o al numbe o anges a syscall ans e s om use
space, he a e age size o anges ans e ed by a syscall,
and he o al amoun o da a a syscall e ches om use
space. Mo eo e , o each p ocess execu ed du ing p o il-
ing, we also ga he he numbe o syscalls ha ans e
da a om use space. We use a ious benchma ks (e.g.,
LMBench, OSBench) and popula use applica ions (e.g.,
Nginx, Apache) o gene a e a wo kload o sample syscall
execu ion. In o al, ou wo kload gene a ed a ound 317
million syscall samples, exe cising 165 indi idual syscalls
(
≈
52% o all de ined syscalls o Linux
x86_64
). Ou
main p o iling esul s a e depic ed in Figu es 4,5,6,7,8.
We elabo a e on he esul s in he nex sec ions, using
he ga he ed insigh s o mo i a e ou design.
6 Cache F on end
To p o ide he ke nel wi h a consis en iew o use
memo y, he cache on end in e poses on all use - o-
ke nel ans e unc ion in oca ions eques ing a speci ic
use ange. In esponse, he on end que ies he syscall
cache o he ange by means o he use ( i ual) add ess
and he leng h o he ange. A e he que y comple es,
he on end pe o ms a que y esolu ion s ep, e ching
pa s o he ange om use space in o he cache i needed
(i.e., i no cached) and hen o wa ding a sani ized ange
o he o iginal ans e unc ion.
The on end conside s a use ange
A
sani ized i and
only i : o any sub- ange
B
o con iguous use add esses,
such ha
B⊆A
, and
B
was p e iously e ched du ing
he execu ion o he syscall (i.e., ia a p e ious ans e
unc ion) hen
B
mus consis o he same by es as when i
was i s e ched. When que ying he cache, he on end
uses a p ede e mined sea ch policy o loca e he ange in
he cache. The sea ch policy is subjec o he (me a)da a
s uc u e used o bookkeep he use anges in he cache.
6.1 E icien Range Que ies
Gi en a use s a and end add ess (i.e., a ange)
Sa eFe ch
needs o ind all cached chunks ha o e -
lap wi h his inpu ange. Since a ange may only pa -
ially o e lap wi h an exis ing ange (o mul iple cached
anges), we a e in e es ed in inding he op imal da a
s uc u e ha can se ice his ope a ion e icien ly. Fo
his pu pose, o he ke nel subsys ems use ei he linked
lis s ( o small caches) o ed-black ees ( o la ge caches).
The Vi ual Memo y A ea (VMA) cache is case in poin ,
gene ally se ing add ess ange que ies ia a pe -p ocess
ed-black ee. Howe e , he VMA cache’s as pa h uses
a linked lis o he ew ecen ly used VMAs.
We expe imen ed wi h bo h ypes o da a s uc u es in
he con ex o
Sa eFe ch
. In bo h cases, a node con ains
me ada a eco ding he s a /end add ess o he ange
and a e e ence o he cached da a. Clea ly, in he case o
many cached anges, we expec linked-lis -based que ies
o pe o m poo ly, wi h a wo s -case sea ch complexi y
o O(n). In he same ein, we expec ed-black ees o be
mo e e icien , wi h a wo s -case sea ch sea ch complexi y
o O(log(n)) due o cons an - ime ebalancing. Indeed,
we expe imen ally e i ied ha a e a ound 100 cached
anges, he a e age sea ch ime o a linked lis is slowed
down by a ac o o wo compa ed o a ed-black ee.
Howe e , when he cache con ains only a ew elemen s
we obse ed he linked lis signi ican ly ou pe o ming a
ed-black ee. On op o mo e ligh weigh sea ch logic,
a small linked lis has ano he impo an pe o mance
edge o e a small ed-black ee on he inse ion pa h
(i.e., when adding a new use ange in o he cache).
Indeed, due o ebalancing, inse ions in he ed-black
ee a e always a ound 10 o de s o magni ude slowe
han in a linked lis . Since acco ding o ou p o iling
esul s, double e ches a e a e (occu ing once e e y
1,277 sampled syscalls), inse ions a e equen and a e
hus impo an o conside o pe o mance. Fo mo e
de ailed double e ch s a is ics om ou p o iling esul s,
1210 33 d USENIX Secu i y Symposium USENIX Associa ion

0 e ches
1 e ch
[2-10) e ches
[10-30) e ches
[30-4099) e ches
0%
50%
% o syscall samples
63.93%
28.69%
7.29% 0.03% 0.06%
Figu e 4: Pe cen age o syscall samples s. numbe o
anges hey e ch.
we e e he in e es ed eade o Appendix A.
Selec ing he ideal da a s uc u e.
To selec he
ideal candida e be ween ou wo da a s uc u es, we u n
o ou p o iling esul s. Figu e 4shows he pe cen age
o syscall samples ha e ch
N
anges om use memo y
ac oss all he p o iled benchma ks. As shown in he
igu e, mos samples (
≈
64%) do no e ch use anges
a all, while he as majo i y o hose which do, e ch
a mos one ange (
≈
28.7% o samples). E en so, a
non i ial numbe o samples (
≈
7.32%) e ch a mos 30
anges, while only
≈
0.06% samples e ch o e 30 anges.
As ou esul s sugges , a (small) linked lis seems he
ideal candida e o suppo ange que ies o he as
majo i y o syscall samples. Howe e , we also obse ed
samples e ching as many as
≈
4,000 anges. In hose
cases, he linked lis has e y poo pe o mance and he
ed-black ee is a as ly be e op ion.
To op imize o all possible syscall scena ios, we ul-
ima ely op ed o an
adap i e
sea ch policy. In o he
wo ds, he sea ch policy uses a linked lis by de aul
un il he numbe o anges in he cache eaches a p e-
de e mined h eshold. When he h eshold is exceeded,
Sa eFe ch
swi ches o a ed-black ee implemen a ion.
We de ail how we expe imen ally selec ed a good h esh-
old in Sec ion 9.3. To con e he linked lis in o a ed-
black ee,
Sa eFe ch
uses an e icien algo i hm ha
cons uc s a balanced ed-black ee om an o de ed
linked lis . This is done by i s copying he poin e s o
anges in a ec o and hen i e a ing o e he ec o in a
bina y b ea h- i s sea ch ashion. To keep he algo i hm
e icien , we need o ini ia e he con e sion when he lis
con ains a numbe o anges o he o m 2N−1.
6.2 Que y Resolu ion
When p ocessing he esul o he que y, he
Cache
F on end
makes di e en decissions depending on
whe he i ound a cached use ange colliding wi h he
que ied ange (cache hi ) o no (cache miss).
In case o a miss, no sub ange om he que ied ange
was p e iously e ched om use space. As a esolu ion,
he on end e ches he ange om use space and in-
s uc s he backend o alloca e s o age o inse he ange
in o he cache. I hen also indexes he new ange by
inse ing new me ada a in o he linked lis o ed-black
ee wi h wi h O(1) complexi y—by piggybacking on he
esul o he p e ious que y.
Cache hi s can ei he be pe ec o pa ial. A pe ec
hi means ha he on end ound a cached ange which
con ains he en i e ange i que ied o . In his case, he
on end o wa ds he sane ange om he cache wi hou
pe o ming any e ch om use space. A cache hi is
pa ial i he on end inds a cached ange ha collides
wi h he ange i que ied o , bu does no con ain he
en i e ange. In his case, he que ied ange may collide
wi h mul iple cached anges p e iously e ched om use
space and he on end needs o i s execu e a ange
de agmen a ion s ep o de e mine he sani ized ange.
Le
B
be he que ied ange and le
M={Ai|Ai∈cache ∧
AiTB6=/
0}
con aining all cached anges ha collide wi h
B
. De agmen a ion in ol es compu ing a new ange
C
such ha
C=∪n
i=1Ai∪(B ∪n
i=1Ai)
. In o he wo ds, he
de agmen ed ange
C
con ains all by es om he cached
anges colliding wi h
B
, while he sub- anges o by es
ha a e no cached ye a e e ched om use space. The
on end ins uc s he backend o eplace all colliding
anges in he cache wi h he newly de agmen ed ange
C
, a e which i can se ice he sani ized ange om
C
.
De agmen a ion piggybacks on he esul o he p e ious
sea ch, because all colliding anges can be ound by using
he p e ious sea ch’s i e a o . To suppo his, inse ion
in he linked lis and ed-black ee p ese es he i ual
add ess o de ing o cached anges.
7 Cache Backend
The
Cache Backend
is esponsible wi h managing he
backing memo y o he syscall cache. Speci ically, i s
main goals a e o e icien ly manage he li ecycle o anges
in he cache and exploi CPU cache locali y as much
as possible. The la e can be achie ed by s acking o-
ge he anges in memo y o da a locali y and hus be e
CPU cache u iliza ion, which speeds up ange que ies.
The o me in ol es en o cing policies o ange alloca-
ion/dealloca ion and s o age p o isioning/ elinquishing
echniques o keep cache ope a ions op imal.
To achie e i s goals, he backend main ains one cache
o each in- ansi syscall (a he pe - h ead g anula i y)
and adhe es o a cache o ganiza ion speci ically ailo ed
o maximize da a locali y when pe o ming ange que ies
h ough he cache. Addi ionally, he backend en o ces a
se ies o ange li ecycle managemen policies o e icien ly
o e see cache li espan. To suppo his o e all s a egy,
he backend elies on a cus om memo y alloca o .
USENIX Associa ion 33 d USENIX Secu i y Symposium 1211
2022242628210 212 214
A e age amoun o by es pe e ch
0.0%
5.0%
10.0%
15.0%
20.0%
25.0%
30.0%
35.0%
40.0%
% o syscall samples
Figu e 5: Pe cen age o syscall samples s. a e age size
o da a hey e ch.
7.1 Cus om Alloca o
A nai e alloca ion policy would be o c ea e an objec
in he cache e e y ime a syscall e ches a new ange,
by using an o - he-shel ke nel alloca o (e.g., slab) o
accommoda e ange da a and me ada a. Howe e , s an-
da d ke nel alloca o s do no gi e con ol o e whe e
objec s a e placed in ( i ual) memo y, so we canno as-
su e locali y o imp o e he pe o mance o ange que ies.
Mo eo e , o an o - he-shel alloca o he alloca ion and
dealloca ion logic migh cause non- i ial o e head when
syscalls ans e many anges om use space, which
happens in p ac ice (see Figu e 4).
A be e app oach is o se ice ke nel memo y in la ge
chunks o i mul iple anges, i.e., using a egion-based
alloca ion scheme [13]. In such a scheme, a egion consis s
o one o mo e bu e s (con iguous memo y blocks) wi h
he same da a li e ime. In a egion, memo y objec s a e
alloca ed one a e ano he in memo y (in he cu en
bu e ) and ge dealloca ed all a once (by lushing all he
bu e s), once he li e ime o all he objec s in he egion
ends. As such, egion-based alloca ion allows us o pack
anges oge he in memo y o imp o e locali y. Mo eo e ,
i educes he numbe o calls o he unde lying alloca o
when alloca ing anges. On he as pa h, an alloca ion
in ol es only bumping a poin e o he nex slo in he
cu en bu e . Finally, i can e icien ly dealloca e all
he alloca ed objec s in one blow.
Sa eFe ch
uses a cus om egion-based alloca o ,
which se ices ke nel memo y a he g anula i y o a
egion, e e y ime he backend eques s mo e s o age
o hold anges. To unde s and how well
Sa eFe ch
can
bene i om egion alloca ion’s locali y- iendly design,
we u n again o ou p o iling esul s. Figu e 5shows
he pe cen age o syscall samples ha e ch an a e age
size
S
o use da a ac oss all he p o iled benchma ks. As
shown in he igu e, he majo i y (i.e., 65%) o syscall
samples e ch anges ha a e on a e age less han 64
by es, con i ming a high deg ee o da a locali y in a e-
20232629212 215
To al amoun o by es copied om use .
0.0%
5.0%
10.0%
15.0%
20.0%
25.0%
% o syscall samples
Figu e 6: Pe cen age o syscall samples s. o al amoun
o da a hey e ch.
gion in he common case.
Sa eFe ch
also bene i s om
he as dealloca ion pa h o egion-based alloca ion, e i-
cien ly dealloca ing all he anges in he egion when he
syscall e mina es. As an op imiza ion,
Sa eFe ch
does
no disca d he bu e s once he egion is dealloca ed,
bu adds hem o a pool o ( as ) euse in new egions
c ea ed by u u e syscalls.
The nex ques ion is he bu e size one should use. As
Figu e 5sugges s, small alloca ion eques s a e common
sugges ing small bu e s a e desi able. Mo eo e , Figu e 6
shows he pe cen age o syscall samples ha e ch a o al
amoun o
B
by es o use da a ac oss all he p o iled
benchma ks. As shown in he igu e, al hough a mode a e
po ion o syscalls ans e much da a (e en abo e 8
pages), he as majo i y ans e a less han a page o
da a. As a esul ,
Sa eFe ch
uses a single 1-page bu e
by de aul in each new egion (syscall) and elas ically
adds bu e s as needed in he edge cases—many e ches
pe syscall o e ches ans e ing o e 4 KB o da a.
7.2 Dual Region Design
A nai e app oach would be o s o e he use memo y
anges and he me ada a necessa y o ange que ies as
a s andalone objec in one single pe -syscall egion. How-
e e , his app oach would lead o me ada a agmen a ion
and poo locali y because me ada a would be in e mixed
wi h da a by es. While mos e ched anges a e small,
we saw ha syscalls can ans e la ge anges as well
(Figu e 5). To maximize da a locali y o ange que ying,
he
Cache Backend
pa i ions he syscall cache in wo
sepa a e egions: a da a egion s o es all by e anges
copied om use space while a me ada a egion s o es
he bea bone necessi ies o pe o m que ies o e anges.
Consequen ly, o each use ange, he backend main ains
wo objec s: a) a da a objec s o ing he ange o by es
copied om use space and b) a me ada a objec s o ing
he p ope ies o he ange used when que ying (e.g., use
i ual add ess, leng h, poin e o he da a objec , and
1212 33 d USENIX Secu i y Symposium USENIX Associa ion
1 syscall
[2-10) syscalls
[10-50) syscalls
50 o mo e
0%
50%
% o p ocesses
0.4%
85.2%
5.6% 8.9%
Figu e 7: Pe cen age o p ocesses s. numbe o syscalls
e ching use da a.
a ield used o link in o a ed-black ee o linked-lis ).
The size o me ada a objec s is ixed and small, al-
lowing one o p o ision me ada a egions wi h bu e s
smalle han a page. None heless, we chose o se e 1-
page bu e s o me ada a egions as well because his
leads o be e CPU cache colo ing (and u iliza ion) [19].
Despi e he same de aul bu e size, he wo egions
p o ision hei bu e s om sepa a e memo y pools (i.e.,
slab caches) o u he imp o e locali y.
7.3 Li ecycle Managemen
Range alloca ion policy.
A ange is alloca ed in he
cache e e y ime he on end encoun e s a cache miss
while pe o ming a ange que y. Ou p o iling da a (e.g.,
Figu e 4) sugges s ha mos e ches a e no double
e ches, hence we expec equen ange alloca ions in
he cache. Ou cus om alloca o helps educe he p essu e
on he (less e icien ) unde lying alloca o , because many
anges can be allo ed om he same egion bu e .
Alloca ing anges en ails c ea ing me ada a and da a
objec s in he app op ia e egions. To his end, he back-
end uses a
egion accoun ing
s uc u e, which keeps
ack o all bu e s allo ed o he e e en egion. To
speedup objec c ea ion, he
egion accoun ing
s uc-
u e s o es a poin e o a egion head bu e and a o s
se icing alloca ions om his bu e . As egion heads
ge deple ed a some poin , new bu e s a e added o he
egion when app op ia e and egion heads ge upda ed.
I an objec canno be c ea ed om he egion head,
he
Cache Backend
goes on he slow pa h i e a ing o e
all bu e s allo ed o he egion un il i inds one wi h
enough space o se ice he eques . As shown in Figu e 6,
some syscalls can ans e la ge amoun s o use da a,
hus i is possible ha a egion can hold many deple ed
bu e s. While his is mo e likely o da a egions, i can
also occu o me ada a egions when syscalls execu e
many e ches. To op imize he slow pa h, he
egion
accoun ing
s uc u e keeps a
eelis
con aining only
he bu e s ha a e no ye deple ed and can s ill se ice
alloca ions. Las ly, i an objec canno be se iced on
0 by es
[1,64) by es
[64, 256) by es
[256-1024) by es
[1024-4096) by es
[1-4) pages
[4,16) pages
abo e 16 pages
w i e
pw i e64
w i e
send o
exec e
0.01 0.17 0.40 0.10 0.13 0.01 0.18 0.01
0.03 0.56 0.21 0.10 0.07 0.02 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00
0.00 0.80 0.00 0.00 0.00 0.00 0.20 0.00
0.00 0.00 0.00 0.00 0.57 0.39 0.03 0.00
0.0
0.2
0.4
0.6
0.8
a io
Figu e 8: Hea map showing he o al amoun o use
da a e ched by e ch-hea y syscalls.
he slow pa h, hen he backend le e ages he cus om
alloca o o c ea e a new bu e in he e e en egion.
Cache in alida ion policy.
When a sys em call e -
mina es, all anges cu en ly held in he cache can be
dealloca ed. As a esul , one could elinquish all s o -
age held by cached anges on syscall exi . This s a egy
educes he memo y oo p in and also yields a simple
dealloca ion pa h lushing all he bu e s held by me a-
da a and da a egions. Mo eo e , as discussed, syscalls
do no e ch many anges, hus on a e age we need no
ee many bu e s. Howe e , as shown in Figu e 7, ou
p o iling da a shows ha 99.6% o he p ocesses ha
incu e ches do so ac oss a leas wo syscalls. In o he
wo ds, i a p ocess issues a syscall ha e ches use da a,
i is likely o issue o he syscalls ha do he same.
Gi en ou p o iling esul s, i may be emp ing o p e-
se e all he alloca ed egion bu e s ac oss syscalls. How-
e e , while his s a egy minimizes he numbe o calls
o he unde lying alloca o (imp o ing pe o mance), i
may also signi ican ly inc ease he s eady-s a e memo y
oo p in . Tha is pa icula ly he case o h eads issu-
ing syscalls ha e ch a lo o use da a (e en mo e han
8 pages, as shown in Figu e 6). Gi en hese obse a ions,
ou cache in alida ion policy is o p ese e he head
bu e s, o bo h me ada a and da a egions, ac oss he
syscalls o a gi en h ead bu elease all he o he bu e s
held by each egion, on syscall exi . Addi ionally, on each
syscall exi , we in alida e he bookkeeping s uc u e used
by ou on end when pe o ming ange que ies. Finally,
on h ead exi we elease he esidual head bu e s.
Cache ini ializa ion policy.
Gi en ha ou wo head
bu e s pe sis ac oss syscalls o he same h ead, he
nex ques ion is when o alloca e such bu e s. An op ion
would be o simply alloca e he head bu e s a h ead
(i.e.,
ask_s uc
) c ea ion ime. Howe e , as shown in
Figu e 4, syscalls a ely e ch use da a, e.g.,
≈
64% o
he syscalls do no issue any e ches. Hence, eage head
USENIX Associa ion 33 d USENIX Secu i y Symposium 1213
bu e alloca ion a h ead c ea ion ime may lead o
unnecessa y memo y o e head. As a esul , he backend
alloca es he wo head bu e s (and he wo egions)
lazily, on he i s use e ch ha a h ead incu s.
Ze o-copy op imiza ion.
Gi en ha mos syscalls
e ch li le da a, s o ing da a in o he cache is gene -
ally inexpensi e. Howe e , some syscalls a e e ch-hea y
and copy la ge amoun s o da a. In such scena ios, copy-
ing da a in o he cache incu s a non i ial cos (as shown
la e in ou e alua ion), so i is desi able o elimina e as
many such ex a copies as possible. To in es iga e op i-
miza ion a enues, we isola ed all syscalls in ou p o iling
esul s ha copy mo e han a page o da a om use
space. Figu e 8p esen s ou esul s in a hea map.
As shown in he igu e, many o hese syscalls (wi h
he excep ion o exec) a e I/O bound. To a oid e-
sou ce con en ion, I/O bound syscalls ely on he ke -
nel
io _i e a o
unc ionali y o copy dozens o la ge
chunks o use da a in o ke nel s aging bu e s. Such
bu e s (and hei copied da a) pe sis unchanged un il
he ha dwa e (e.g., ne wo k ca d, ha d disk) becomes
a ailable o comple e he I/O ans e . To op imize ou
expensi e copies o he cache, we use he ke nel s aging
bu e i sel as a da a cache o all he
io _i e a o
copies e ching one o mo e pages. To delay dealloca-
ion o such s aging bu e s un il syscall exi , we inc ease
he e e ence coun on he unde lying pages. When en-
abling his ze o-copy op imiza ion (
Sa eFe ch
de aul ),
ou cache in alida ion policy also eleases he s aging
bu e s on syscall exi .
8 Implemen a ion
We implemen ed a
Sa eFe ch
p o o ype o Linux ke nel
5.11 (ma ching Midas’ e sion o a ai compa ison).
The
Cache F on end
adds he logic o cache lookups
and e ch sani iza ion by ins umen ing all ke nel ans e
unc ion a ian s (i.e.,
aw_copy_ om_use
and all he
ge _use mac o a ian s).
To manage he cache li ecycle o in- ansi syscalls,
he
Cache Backend
s o es he accoun ing s uc u es o
(me ada a and da a) caches as s uc u es in a h ead’s
ask_s uc
. Simila ly, he bookkeeping s uc u e used
( o cache lookups) by he on end is also e e enced by
he h ead’s
ask_s uc
. The cus om egion alloca o
uses wo global slab caches (
kmem_caches
) o se ice
egion bu e s e icien ly o in- ansi syscalls. We imple-
men ed cache in alida ion by ins umen ing he epilogue
o he
do_syscall_64
ke nel unc ion and he
do_exi
unc ion. Addi ionally, we ins umen ed all
clone
a i-
an s o ini ialize he
Sa eFe ch
logic in newly c ea ed
h eads. Finally, we implemen ed he ze o-copy op imiza-
ion by ins umen ing he
copyin
ke nel unc ion used
o io _i e a o -based use - o-ke nel ans e s.
Main ainabili y.
While ou design and implemen a ion
a e op imized based on ou syscall p o iling da a, we
e icien ly suppo a a ie y o syscall pa e ns. As such,
e en assuming applica ions change some o hei syscall
pa e ns in he u u e, we expec only mino (i any)
modi ica ions o ou implemen a ion. To acili a e he
p ocess, Sa eFe ch al eady suppo s un ime changes o
all i s co e pa ame e s, simila o o he wo kload-sensi i e
ke nel ea u es (e.g., KSM, THP, e c.). In o he wo ds,
wi h i s abili y o suppo many syscall pa e ns, i s
small codebase, and compliance o s anda d ke nel design
pa e ns (e.g., use o s anda d ke nel da a s uc u es,
s anda d hooking poin s, e c.), we expec
Sa eFe ch
o
be highly main ainable mo ing o wa d.
9 E alua ion
In his sec ion, we expe imen ally e alua e he secu i y
and pe o mance o Sa eFe ch.
9.1 Secu i y
CVE analysis.
As
Sa eFe ch
hinde s double- e ch
bugs by cons uc ion, simila o Midas [15], we e i y
i can co ec ly mi iga e a known double- e ch ulne a-
bili y.
CVE-2016-6516
is a known double- e ch ulne a-
bili y in oduced in Linux ke nel e sion 4.5. This ul-
ne abili y can be igge ed ia an ioc l syscall in com-
bina ion wi h he
FIDEDUPERANGE
lag. This pa icula
syscall a emp s o deduplica e he memo y pages be-
ween he sou ce and i s espec i e des ina ion iles. The
sou ce and des ina ions a e supplied as pa ame e s in a
ile_dedupe_ ange
s uc u e which he syscall copies
om use space. Howe e , be o e he syscall can e ch
he s uc u e i i s needs o compu e i s size by using a
coun a iable loca ed inside he use space s uc u e. A -
e acqui ing he coun om use space, and pe o ming
sani y checks, he syscall e ches he en i e s uc u e in o
ke nel memo y o e w i ing he coun ield in he p ocess,
as can be seen in Lis ing 1. A double e ch ulne abili y
can eme ge i be ween he i s e ch (line
2
) and second
e ch (line
17
) he coun ield is modi ied in use space.
As a esul ,
s_dedupe_ ile_ ange
would ope a e on
a malicious coun alue. This ulne abili y was pa ched
wi hin Linux ke nel 4.7 by copying back he old alue
in o he da a s uc u e (line 23) a e he second e ch.
To e alua e whe he
Sa eFe ch
can de end agains
his pa icula double e ch we modi ied Linux ke nel
e sion 5.11 by adding an addi ional check ha e i-
ies i a double e ch has occu ed (lines
19-20
) p io
o he p oposed ix. In o de o ep oduce he bug
1214 33 d USENIX Secu i y Symposium USENIX Associa ion
Table 4: Sa eFe ch-induced h oughpu deg ada ion.
Benchma k
Sa eFe ch
/wo ze o-copy /w ze o-copy
(%) (%)
AF_UNIX 39.1% 8.4%
Pipe 16.1% 5.1%
Nginx 5.9% 1.3%
Apache 9.3% 1.8%
wi hou ze o-copy op imiza ions,
Sa eFe ch
copies la ge
chunks o use da a o he da a cache, incu ing signi i-
can o e head. Fo ins ance, a oiding unnecessa y copy-
ing educes he h oughpu deg ada ion o 8.4% and
5.1% on he
AF_UNIX
and pipe benchma ks, espec i ely.
Copying la ge chunks o use da a impac s pe o mance
o eal-wo ld applica ions as well: Nginx and Apache ha e
hei h oughpu educed by 5.9% and 9.3% compa ed
o he baseline ( espec i ely). Enabling he ze o-copy op-
imiza ion educes h oughpu deg ada ion o bo h web
se e s o below 2%. Ou esul s con i ms he ze o-copy
op imiza ion is key o good I/O benchma k pe o mance.
10 Rela ed Wo k
While double- e ch o TOCTTOU (Time-o -Check- o-
Time-o -Use) ulne abili ies a ec di e en in e aces
and componen s (e.g., encla es [17,25], sandboxes [22],
compile s [29], and compa men s [14]), we ocus he e
on closely ela ed esea ch on ope a ing sys em ke nels.
Se na [24] was he i s who coined he name “double-
e ch ulne abili y” o desc ibe an ins ance in he Win-
dows ke nel. Since hen, esea ch has ocused on inding
double- e ch bugs ( h ough s a ic and dynamic p og am
analysis) o e en mi iga ing hese issues.
S a ic analysis.
On he s a ic analysis on , Wang e
al. [26] le e ages pa e n ma ching analysis on sou ce
code o ind double- e ch bugs in he Linux ke nel.
DFTinke [20] ex ends such pa e n-based app oach o
inc ease double- e ch co e age and educe alse posi i es.
While pa e n-ma ching echniques p o ed e ec i e in
unco e ing new double- e ch bugs, hey s ill p oduce a
high a e o alse posi i es and nega i es and a e unda-
men ally limi ed o de ec ing only speci ic bug pa e ns.
DEADLINE [31] and DFT acke [27] imp o e s a ic de-
ec ion o double e ches by means o compile -le el sym-
bolic execu ion. While hese app oaches can be applied
mo e b oadly (e.g., o de ec compile -induced double
e ches [28,30]) and a e gene ally be e sui ed o e ing
alse posi i es, symbolic execu ion in oduces o he limi-
a ions (e.g., pa h explosion) and does no comple ely
emo e alse epo ing (e.g., due o imp ecise memo y
modeling o incomple e code co e age). DEADLINE,
o example, does no de ec double e ches in inline
assembly, which is widely used in he ke nel.
Dynamic analysis.
On he dynamic analysis on ,
Ju czyk and Coldwind [18] p opose Bochspwn, which in-
s umen s memo y access callbacks in an x86 emula o o
ace double- e ch pa e ns. Coupled wi h uzzing, hei
echnique ound a se ies o exploi able double e ches
in he Windows ke nel. Howe e , Bochspwn incu s high
o e heads because i s acing ins umen a ion a ec s
he emula o ’s as pa h. Wilhelm [28] uses a simila ap-
p oach o Bochspwn and ound double- e ch ulne abili-
ies in he Xen hype iso , one o which was in oduced
h ough compile op imiza ions. Schwa z e al. [23] use
a uzze in conjunc ion wi h concu en use memo y
accesses o de ec double e ches h ough a cache co e
channel. Howe e , such a solu ion is elian on ha dwa e
ea u es (e.g., caches), which a y ac oss mic oa chi ec-
u es and a e subjec o noise. While dynamic echniques
a e o en mo e p ecise han s a ic app oaches, hey a e
undamen ally subjec o code co e age and limi ed o
inding double- e ch bugs only on execu ed code pa hs.
Mi iga ions.
Midas [15] is he i s p oposed mi iga ion
o p o ec he ope a ing sys em ke nel agains double-
e ch ulne abili ies. Midas elies on MMU-enabled Copy-
on-W i e seman ics o c ea e snapsho s o use pages ac-
cessed by he ke nel du ing syscall execu ion. As a esul ,
Midas incu s non i ial un ime pe o mance o e head
due o he cos o apping/ emapping and he coa se
(page-g anula ) w i e in e posi ion mechanism. Addi ion-
ally, Midas whi elis s some syscalls, mos no ably he
e ch-hea y exec syscall. In con as ,
Sa eFe ch
elimi-
na es he need o MMU-based ins umen a ion and o -
e s comp ehensi e p o ec ion a a ac ion o Midas’ cos .
Mo eo e , ou solu ion is simple and seamlessly in eg a es
wi h exis ing ke nel code pa hs. Indeed,
Sa eFe ch
’s
high-le el design is inspi ed by he “ke nel-side caching
o use space memo y” on he wishlis o Linux ke nel
de elope s o add ess double- e ch ulne abili ies [8].
11 Conclusion
We p esen ed
Sa eFe ch
, a p ac ical double- e ch bug
p o ec ion sys em which caches ke nel e ches a he
syscall g anula i y and se es subsequen e ches o he
same da a om he cache, hus ensu ing ha use da a
ne e changes ac oss e ches. We showed ha
Sa eFe ch
o e s comp ehensi e p o ec ion a a ac ion o he cos
o s a e-o - he-a solu ions such as Midas wi h ma ginal
memo y o e heads and geome ic pe o mance o e heads
consis en ly below 5% ac oss a ious OS benchma ks
(e.g., 4.4% on LMBench and 1.3% on OSBench) and
eal-wo ld wo kloads (e.g., 1.2% on Pho onix).
USENIX Associa ion 33 d USENIX Secu i y Symposium 1221

A ailabili y
To encou age adop ion, we ha e open sou ced
Sa eFe ch
a
h ps://gi hub.com/ usec/sa e e ch
. We a e
also ac i ely engaging Linux ke nel de elope s o seek
mainline inclusion.
Acknowledgmen s
We would like o hank he anonymous e iewe s o
hei eedback. This wo k was suppo ed by In el Co -
po a ion h ough he “Allocamelus” p ojec , by NWO
h ough p ojec “INTERSECT” and “Theseus”, and by
he Eu opean Union’s Ho izon Eu ope p og amme unde
g an ag eemen No. 101120962 (“Rescale”).
Re e ences
[1]
Bes p ac ices o adap ing pho onix es
sui e o benchma k linux pe o mance.
h ps://blogs.o acle.com/linux/pos /bes -
p ac ices- o -adap ing-pho onix- es -
sui e- o-benchma k-linux-pe o mance.
[2]
C e-2013-1332.
h ps://c e.mi e.o g/cgi-
bin/c ename.cgi?name=CVE-2013-1332.
[3]
C e-2015-8550.
h ps://c e.mi e.o g/cgi-
bin/c ename.cgi?name=CVE-2015-8550.
[4]
C e-2016-10433.
h ps://c e.mi e.o g/cgi-
bin/c ename.cgi?name=CVE-2016-10433.
[5]
C e-2016-10435.
h ps://c e.mi e.o g/cgi-
bin/c ename.cgi?name=CVE-2016-10435.
[6]
C e-2016-10439.
h ps://c e.mi e.o g/cgi-
bin/c ename.cgi?name=CVE-2016-10439.
[7]
C e-2016-8438.
h ps://c e.mi e.o g/cgi-
bin/c ename.cgi?name=CVE-2016-8438.
[8]
De ec and a oid ToCToU double- e ch / double-
ead om use space.
h ps://gi hub.com/KSPP/
linux/issues/95.
[9]
KASAN.
h ps://gi hub.com/google/kasan/
wiki.
[10]
Ni linux de ice d i e s ails o in-
s all daqmx.
h ps://knowledge.
ni.com/KnowledgeA icleDe ails?id=
kA03q000000wzMJCAY.
[11]
OSBench Au ho s. OSBench.
h ps://h ps://
gi hub.com/mbi snbi es/osbench.
[12]
Pho onix Au ho s. Pho onix.
h ps://www.
pho onix- es -sui e.com.
[13]
Eme y D Be ge , Benjamin G Zo n, and Ka h yn S
McKinley. Reconside ing cus om memo y alloca ion.
In OOPSLA, 2002.
[14]
A i Bha acha yya, Flo ian Ho hamme , Yuanlong
Li, Siddha h Gup a, And es Sanchez, Babak Falsa i,
and Ma hias Paye . Secu ecells: A secu e compa -
men alized a chi ec u e. In IEEE S&P, 2023.
[15]
A i Bha acha yya, U os Tesic, and Ma hias Paye .
Midas: Sys ema ic ke nel TOCTTOU p o ec ion.
In USENIX Secu i y, 2022.
[16]
Aa on B B own and Ma go I Sel ze . Ope a ing
sys em benchma king in he wake o lmbench: A
case s udy o he pe o mance o ne bsd on he in el
x86 a chi ec u e. In SIGMETRICS, 1997.
[17]
Felix D eissig, Jonas Röckl, and Tilo Mülle .
Compile -aided de elopmen o us ed encla es
wi h us . In ARES, 2022.
[18]
Ma eusz Ju czyk and Gyn ael Coldwind. Iden i ying
and exploi ing windows ke nel ace condi ions ia
memo y access pa e ns. 2013.
[19]
Hsien-Hsin S Lee and Ga y S Tyson. Region-based
caching: an ene gy-delay e icien memo y a chi ec-
u e o embedded p ocesso s. In CASES, 2000.
[20]
Yingqi Luo, Peng ei Wang, Xu Zhou, and Kai Lu.
D inke : De ec ing and ixing double- e ch bugs in
an au oma ed way. In WASA, 2018.
[21]
La y W McVoy, Ca l S aelin, e al. lmbench:
Po able ools o pe o mance analysis. In USENIX
ATC, 1996.
[22]
Sh a an Na ayan, C aig Disselkoen, Tal Ga inkel,
Na han F oyd, E ic Rahm, So in Le ne , Ho a
Shacham, and Deian S e an. Re o i ing ine g ain
isola ion in he i e ox ende e . In USENIX Secu-
i y, 2020.
[23]
Michael Schwa z, Daniel G uss, Mo i z Lipp, Clé-
men ine Mau ice, Thomas Schus e , Ande s Fogh,
and S e an Manga d. Au oma ed de ec ion, exploi a-
ion, and elimina ion o double- e ch bugs using
mode n cpu ea u es. In AsiaCCS, 2018.
[24]
Fe min J. Se na. Ms08-061 : The case o he ke nel
mode double- e ch.
h ps://ms c.mic oso .
com/blog/2008/10/ms08-061- he-case-o -
he-ke nel-mode-double- e ch/, 2008.
1222 33 d USENIX Secu i y Symposium USENIX Associa ion
[25]
Jo Van Bulck, Da id Oswald, Edua d Ma in, Ab-
dulla Aldose i, Fla io Ga cia, and F ank Piessens.
A ale o wo wo lds: Assessing he ulne abili y o
encla e shielding un imes. In CCS, 2019.
[26]
Peng ei Wang, Jens K inke, Kai Lu, Gen Li, and
S e e Dodie -Laza o. How double- e ch si ua ions
u n in o double- e ch ulne abili ies: A s udy o
double e ches in he linux ke nel. In USENIX
Secu i y, 2017.
[27]
Peng ei Wang, Kai Lu, Gen Li, and Xu Zhou. D -
acke : de ec ing double- e ch bugs by mul i- ain
pa allel acking. F on ie s o Compu e Science,
13, 2019.
[28]
Felix Wilhelm. Xenpwn: B eaking pa a i ualized
de ices. Black Ha USA, 2016.
[29]
Jianhao Xu, Luca Di Ba olomeo, Fla io To alini,
Bing Mao, and Ma hias Paye . Wa pa ack: Bypass-
ing c i h ough compile -in oduced double- e ches.
In IEEE S&P, 2023.
[30]
Jianhao Xu, Kangjie Lu, Zhengjie Du, Zhu Ding,
Linke Li, Qiushi Wu, Ma hias Paye , and Bing Mao.
Silen bugs ma e : A s udy o compile -in oduced
secu i y bugs. In USENIX Secu i y, 2023.
[31]
Meng Xu, Chenxiong Qian, Kangjie Lu, Michael
Backes, and Taesoo Kim. P ecise and scalable de-
ec ion o double- e ch bugs in os ke nels. In IEEE
S&P, 2018.
USENIX Associa ion 33 d USENIX Secu i y Symposium 1223
A Double Fe ch S a is ics
In his sec ion, we p esen de ailed s a is ics ela ed o
e ch and double e ch occu ences ac oss ou bench-
ma ks. Table 5p esen s ou esul s. Fo bo h gene ic
e ches and double e ches (i.e., e ches ha ans e
da a o e lapping wi h a p e ious e ch wi hin he same
syscall) we de ail: he a e o (double) e ches ela i e
o he o al numbe o syscall samples collec ed o he
benchma k ( he
Ra e
column), he o al numbe o dis-
inc syscalls ha execu ed a e ch du ing he benchma k
( he
Syscalls
column). Addi ionally, we epo he mini-
mum, a e age, and maximum numbe o (double) e ches
execu ed by one single syscall sample. The las ow in
he able e e s o syscalls execu ed by backg ound ap-
plica ions. As shown in he able, each benchma k has a
non i ial (e.g., up o 21 o Pho onix) numbe o syscall
ypes wi h double e ches. Mo eo e , he numbe o dou-
ble e ches pe o med by double- e ch syscalls g ea ly
a ies (e.g., anging om 1 o 218 o Pho onix).
Table 5: S a is ics o (double) e ch a es.
Benchma k S a is ic Fe ches Double Fe ches
LMBench
Ra e 1/2 1/273457
Syscalls 38 7
Min/A g/Max 1/1/459 1/50/67
OSBench
Ra e 1/3 1/80
Syscalls 17 5
Min/A g/Max 1/6/134 1/42/43
Pho onix
Ra e 1/4 1/5993
Syscalls 47 21
Min/A g/Max 1/1/661 1/1/218
Backg ound
Ra e 1/2 1/644
Syscalls 93 20
Min/A g/Max 1/2/4099 1/130/467
Ac oss all benchma ks, syscalls a e likely o execu e
e ches (a ound 1 in 3 syscalls e ch use bu e s), bu
mo e o en hey will e ch only a small numbe o use
bu e s (e.g., on a e age syscalls e ch 6 use bu e s on
OSBench). This p omp ed us o use a linked lis as
he de aul da a s uc u e o e icien caching. Howe e ,
ac oss all benchma ks, he e a e occu ences o e ch-
hea y syscall execu ions (e.g., wi h up o 661 e ches on
Pho onix), mo i a ing he need o an adap i e s a egy
ha eso s o a ed-black ee o handle e ch-hea y
scena ios. Mo eo e , he s iking di e ence be ween e ch
a es and double e ch a es ac oss all benchma ks, wi h
double e ches being a less equen , sugges s ha cache
inse ions happen o en and hus i is c ucial o ac o in
inse ion ime when de e mining he op imal h eshold
o con e o a ed-black ee.
Looking mo e closely a he a iabili y o he numbe
o e ches a syscall makes—ac oss he syscalls ha e ch
da a in ou p o ile—we obse ed ha
≈
55% ha e a s able
e ch a e (i.e., hey e ch he same numbe o use bu e s
e e y ime). Mos syscalls wi h s able e ch a es pe o m
ei he 1, 2 o 3 e ches, wi h he excep ion o sendmmsg
which pe o ms 6 e ches all he ime. F om he syscalls
ha ha e a iabili y in he numbe o execu ed e ches,
45 sys em calls execu e be ween 1 and a mos 8 e ches
while only 5 sys em calls execu e mo e han 8 e ches.
Special cases a e 3 sys em calls, i.e., pw i e64, exec e, and
w i e ha ha e high a iabili y and execu e 1-255, 1-1411
and 1-4,099 e ches, espec i ely. Again, his a iabili y
mo i a es he need o an adap i e s a egy o cache
lookups. Looking ins ead mo e closely a he a iabili y
o he numbe o double e ches—ac oss he syscalls ha
e ch da a wice in ou p o ile—we obse ed ha
≈
60%
execu e only one double e ch, while 33% execu e ei he 1
o 2 double e ches and he sendmsg syscall may execu e
be ween 3 and 6 double e ches. The exec e syscall is
again an excep ion and can execu e be ween 1 and as
many as 467 double e ches.
1224 33 d USENIX Secu i y Symposium USENIX Associa ion