Sticky Tags: Efficient and Deterministic Spatial Memory Error Mitigation using Persistent Memory Tags

Author: Floris Gorter; Taddeus Kroes; Herbert Bos; Cristiano Giuffrida
Publisher: Zenodo
DOI: 10.1109/SP54263.2024.00182
Source: https://zenodo.org/records/11335217/files/313000a217.pdf
Floris Gorter∗, Taddeus Kroes‡, Herbert Bos∗ and Cristiano Giuffrida∗
∗‡Vrije Universiteit Amsterdam
‡taddeusk [email protected]
∗{f.c.gorter,h.j.bos,c.giuffrida}@vu.nl
Abstract: Spatial memory errors such as buffer overflows still rank among the top vulnerabilities in C/C++ programs. Despite much research in the area, the performance overhead of (even partial) mitigations is still too high for practical adoption. To reduce the cost, recent solutions are shifting towards hardware-assisted techniques such as Arm’s Memory Tagging Extension (MTE). Unfortunately, state-of-the-art MTE solutions incur high overhead due to frequent memory (re)tagging, especially on the stack. Moreover, they rely on the secrecy of random memory tags and offer probabilistic security guarantees.
In this paper, we first provide evidence that random tagging offers limited protection as attackers can deduce the memory tags by means of speculative probing. We then present StickyTags, a deterministic MTE solution that efficiently mitigates bounded spatial memory errors. By organizing the stack and heap layout into per-size-class regions, we can apply persistent memory tags to each region in a predetermined pattern. Hence, the memory tags need only be initialized once, after which they can be reused by objects of the same size class. This eliminates the need for costly memory retagging and allows for a fixed, round-robin assignment of the tags, surrounding every object with large implicit spatial guards. While the size of such guards is bounded by the 4-bit MTE entropy (16 tags), the protection is efficient and deterministic. Indeed, we show StickyTags significantly outperforms existing solutions with realistic runtime overheads for practical adoption (≤4% on SPEC CPU2006), while fully mitigating 7 out of 8 spatial CVEs evaluated by a recent probabilistic MTE solution.
1. Introduction
Spatial memory errors remain a common and impactful security concern. The 2023 CWE ranking lists out-of-bounds writes as the most severe software weakness [1]. Anecdotally, the GWP-ASan project has already found over thirty buffer overflows in the live build of Google Chrome [2], highlighting the need for exploit mitigations. While many existing tools can detect such bugs during software testing [3], [4], [5], [6], [7], [8], [9], [10], post-deployment solutions have found little applicability in the field due to their high overhead. Recent reports indicate that mitigations only see real-world deployment if their performance overhead stays below 5% [11], rendering existing bounds checking solutions impractical [12], [13], [14], [15], [16]. In response, contemporary memory error detection and mitigation systems are shifting towards hardware-assisted solutions to reduce the overhead [6], [7], [17], [18], [19].
‡Now at Google
In particular, Arm’s Memory Tagging Extension (MTE) is a strong contender to provide spatial memory error mitigation on the cheap. MTE associates every memory location and every pointer with a tag, with the hardware disallowing any dereference if the pointer and memory tags do not match. Unfortunately, even state-of-the-art MTE solutions remain costly due to the need for frequent memory (re)tagging: LLVM’s MemTagSanitizer [20] incurs average and worst-case overheads on SPEC CPU2006 of 15.2% and 3.67x (respectively) for the stack alone (Section 8).
Moreover, existing MTE solutions [20], [21], [22], [23], [24] heavily rely on random tags provided by the hardware [25] (which, in turn, impose expensive retagging costs). While such tags are not trivial to predict, at best they offer probabilistic security guarantees with low entropy. In particular, tag collisions between near or neighboring objects may leave the application vulnerable to contiguous overflows and bounded overflows such as type confusion bugs. Moreover, the low (4-bit) entropy of MTE tags leaves applications trivially vulnerable to brute-force attacks against a variety of crash-resistant targets, such as servers [26], web browsers [27], and even kernels, for instance Linux with the default oops mechanism [28].
Unfortunately, the situation is even worse, since, as we show, attackers can find pointer / memory tag matches through speculative probing [29]. More specifically, we show attackers can use a contention-based side channel to deduce whether or not a tag check results in a violation (i.e., tag mismatch). These results confirm, for the first time, conjectures in the community that MTE is vulnerable to side channels [25], [30], [31] and contradict a recent analysis by Google [32]. The net result is that such speculative oracles broaden the brute-force attack surface to non-crash-resistant targets (e.g., Linux without the oops mechanism [29]).
In this paper, we present StickyTags, an efficient and deterministic spatial memory error mitigation for the stack and the heap. Rather than aiming for the classic random (re)tagging-based design to detect generic spatial and temporal errors with only probabilistic guarantees, StickyTags focuses on mitigating a specific (but widespread) class of vulnerabilities (bounded spatial errors) with strong performance and security guarantees. In particular, StickyTags offers production-ready overheads below 5% and deterministic security guarantees (where all tags are public and known) bounded by the number of MTE-provided tags.
To minimize memory tagging costs, we build on Arm’s recommendation to limit the number of (de)allocations [33]. Rather than rewriting the application, we divide the heap and stack in per-size-class regions, assigning objects to predetermined slots with persistent memory tags already in place. We make sure to initialize the tags only once, and keep them in memory ready to be reused for objects of the same size class. The synergy between organizing memory into size classes and the underlying persistent tags allows us to achieve high performance by eliminating the need for memory retagging. This is especially beneficial on the stack, where the allocation (and hence tagging) frequency is high. As an added advantage and in contrast to state-of-the-art solutions [20], our stack tagging design also offers increased backwards compatibility with legacy (non-MTE) devices. We detail our compatibility guarantees in Section 5.
To provide deterministic security guarantees, we assign tags in a round-robin fashion in each region such that the tag of any object cannot collide with that of a known number of neighboring slots. As a result, our prototype StickyTags effectively creates implicit spatial guards around each object. Even with the current tag size of 4 bits, StickyTags efficiently mitigates spatial memory errors with under- and overflow guards of 15 times the size class. Since the smallest class contains objects of 16 bytes (the MTE tagging granularity), each object in this class is protected by implicit spatial guards of 240 bytes in both directions. The spatial guards for larger size classes are proportionally larger.
Our evaluation shows that StickyTags significantly outperforms state-of-the-art spatial memory error mitigations. On SPEC CPU2006 and 2017, StickyTags incurs geomean overheads of ≤4% measured both with MTE analogs [34] and MTE devices, bringing spatial memory protection within reach of production systems for the first time. StickyTags is 12x faster on average than MemTagSanitizer (stack tagging) and nearly 2x faster than the Scudo allocator (heap tagging), while fully mitigating 7 out of 8 spatial CVEs evaluated by the recent (probabilistic) MTSan [17].
Contributions. We make the following contributions:
• We present the first on-device evidence that speculative probing can leak MTE pointer / memory tag matches, questioning random tagging as a mitigation strategy even for applications not prone to classic brute forcing.
• We present a design for deterministic memory tagging for the stack and the heap that uses persistent tags to enable efficient spatial memory guards with MTE. We further study the applicability of persistent spatial guards to x86 architectures, using lightweight compiler instrumentation to compensate for the lack of MTE.
• We evaluate our StickyTags prototype and show that StickyTags provides production-ready overheads.
Availability. https://github.com/vusec/stickytags
2. Background
Spatial memory errors. Spatial memory errors such as buffer overflows occur when a derived pointer erroneously accesses a different object than its base pointer. To prevent exploitation of such bugs, bounds checkers [12], [13], [14], [15], [16], [35] retrofit programs with checks that disallow such illegal accesses. Unfortunately, bounds checkers have not seen widespread adoption due to high overheads. To lower the overhead, other solutions reduce the scope of bounds checking and instead rely on explicit spatial guards bracketing each memory object [2], [3], [4], [9], [36], [37]. Such guards, implemented by means of guard pages [2], [38] or compiler-enforced redzones [3], [4], [36], [37], can detect invalid out-of-bounds reads/writes up to the guard size. This can mitigate contiguous overflows and bounded non-contiguous overflows, such as off-by-N errors and type confusion. In the latter case, unsafe type casts allow attackers to replace an object pointer with a pointer to a larger object type, yielding out-of-bounds reads/writes at a bounded offset up to the largest difference in confusable object sizes [39]. Unfortunately, existing guard-based solutions still incur high overheads and have only found practical adoption in offline testing [3], [37] or online sampling [40].
Memory Tagging Extension. Memory Tagging Extension (MTE) is an Armv8.5+ feature to detect memory errors. It introduces a ‘lock’ and ‘key’ mechanism, with the hardware only permitting reads/writes if the pointer tag (key) matches the memory tag (lock). Checks are supported both in synchronous and asynchronous mode. Pointer tagging relies on Arm’s TBI (Top-Byte Ignore) feature to store a tag in the upper pointer bits. The 4-bit memory tags (16 values in total) are stored separately from application data. Existing MTE solutions [17], [20], [21], [22], [23], [24] often rely on the Arm IRG instruction to assign a random tag on each allocation/deallocation, which scales poorly due to frequent (re)tagging and provides probabilistic security.
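The lock-and-key check can be illustrated with a small software model. This is only a sketch of the semantics described above, not of the hardware implementation: the function names, the `granule_tags` array standing in for tag storage, and the choice to keep the key in bits 56 to 59 of the pointer are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MTE_GRANULE   16u      /* bytes covered by one memory tag */
#define MTE_TAG_SHIFT 56u      /* key lives in the top pointer byte (TBI) */
#define MTE_TAG_MASK  0xFull   /* 4-bit tags: 16 values */

/* Extract the 4-bit key from the upper pointer bits. */
static unsigned ptr_tag(uint64_t ptr) {
    return (unsigned)((ptr >> MTE_TAG_SHIFT) & MTE_TAG_MASK);
}

/* Strip the top byte to recover the plain address. */
static uint64_t ptr_addr(uint64_t ptr) {
    return ptr & ~(0xFFull << MTE_TAG_SHIFT);
}

/* Model of the check: granule_tags[i] is the lock of the i-th 16-byte
 * granule of a region starting at region_base (hypothetical layout). */
static bool access_allowed(uint64_t ptr, uint64_t region_base,
                           const uint8_t *granule_tags) {
    uint64_t addr = ptr_addr(ptr);
    assert(addr >= region_base);
    uint64_t granule = (addr - region_base) / MTE_GRANULE;
    return ptr_tag(ptr) == (granule_tags[granule] & MTE_TAG_MASK);
}
```

In this model, a pointer carrying key 3 may access every granule whose lock is 3 and is rejected on any granule with a different lock, which is the behavior the hardware enforces on loads and stores.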
3. Speculatively Probing for Random Tags
For probabilistic MTE solutions based on random tagging [20], [21], [22], [23], [24], the assumption is that, even if attackers manage to hijack a tagged victim pointer (e.g., via a buffer overflow) to reference a target object, they cannot predict whether the tag of the target object matches the pointer tag, hindering reliable exploitation. However, even without brute-forcing capabilities [26], if attackers can deduce which tags are assigned at runtime, then the random source of the tags has no added benefit. This is because attackers can massage memory [41] until the victim pointer’s random tag happens to match the one of the target object, and only then trigger the vulnerability to achieve reliable exploitation in spite of random tagging.
Listing 1 Tested probe gadgets to leak tag matches.
1  flush(signal);
2  if (/* mispredict */) {
3  #ifdef DEP_LOAD
4    idx = *oob_ptr;   // target tag check
5    *(signal + idx);  // dependent load
6  #else
7    *oob_ptr;         // target tag check
8    *signal;          // independent load
9  #endif
10 }
11 reload(signal);     // is signal cached?
We now show attackers can indeed deduce tag assignments via side channels and bypass probabilistic MTE solutions that rely on secret random tags, confirming conjectures from the community [25], [30], [31] with the first evidence on real MTE hardware. Specifically, we show that attackers can leak whether the tag of a given victim pointer matches the tag of a target object using speculative probing [29], [42], [43]. This is possible by repeatedly probing different pointer / (massaged) object pairs using a probe gadget until microarchitectural side channels leak a tag match.
For our evaluation, we conducted experiments on rooted Samsung Galaxy S22 and Google Pixel 8 Pro devices, supporting sync/async MTE mode. Our first experiment was on the former device in sync mode, using the standard Spectre probe gadget in Listing 1 (DEP_LOAD case). The gadget speculatively issues an out-of-bounds load via the (tagged) victim pointer (oob_ptr, line 4) followed by a load dependent on the loaded value (at signal+idx, line 5). The dependent load, if completed, fills a cache line and transmits 1 bit of information via a classic Flush+Reload covert channel [44] (i.e., “cache hit” as revealed by timing).
We initially expected two possible scenarios for failed (i.e., mismatching) MTE checks on the speculative path: (i) they are fully synchronous and prevent the victim load from passing data to the dependent load, resulting in a 0% cache hit rate; (ii) they are fully asynchronous and allow data to be passed to the dependent load, resulting in a 100% cache hit rate similar to the successful checks. The former scenario would allow the MTE implementation to guarantee speculative memory safety (since the checks hinder any invalid speculative access) but also incur tag leakage (since the attacker can distinguish correct/incorrect tag pairs based on the cache hit rate). The latter scenario, in turn, would yield opposite guarantees (i.e., no tag leakage, no speculative memory safety). However, our first experiment revealed a high cache hit rate (suggesting checks are asynchronous), but not as high as for successful checks. Hence, we can still leak a tag match based on the cache hit rate.
Our next question was why the failed check causes the subsequent dependent load to sometimes not complete in the speculation window. One hypothesis is that failed checks occasionally cut the speculation window short. Another is that they create contention on the memory subsystem (by having to act upon the tag violation), causing other memory operations to occasionally stall and fail to complete within the window. To answer this question, we designed another experiment with the simpler probe gadget in Listing 1 (no DEP_LOAD case), which, unlike standard Spectre, drops the second dependent load in favor of an arbitrary (independent) one. Switching to the independent load did not affect our original results and neither did switching to async MTE mode (where checks are very asynchronous even architecturally) or replacing the independent load with an independent store. This all seems to confirm our second hypothesis, with contention on the memory subsystem affecting the cache hit rate (and allowing for tag match leaks).
Figure 1: Cache hit rates of a single independent load for correct/incorrect tag match on the Samsung S22 and Google Pixel 8 Pro. [Plot: x-axis, number of checks (1 to 10); y-axis, signal load cache hit rate (0 to 100%); series: Correct Tag (Both), Incorrect Tag S22, Incorrect Tag P8.]
Figure 1 presents our results when repeatedly triggering the simpler probe gadget for matching/mismatching tag pairs as we increase the number of checked out-of-bounds loads in the gadget (by duplicating line 7 in Listing 1). As expected, correct pointer tag / object tag pairs consistently score a cache hit rate of 100%. Incorrect tag pairs, on the other hand, result in an increasingly lower rate as we increase the number of (failed) checks and thus the contention. Moreover, one check is sufficient to leak a tag match (0.2% cache hit rate difference). The figure also includes results for the Pixel 8 Pro, on which we reproduced the behavior discussed for the S22, except that contention seems lower and our probe gadget can identify the correct tag starting from the contention caused by two (rather than one) checks.
In summary, the contention caused by tag mismatches provides attackers with a convenient side channel to determine whether a tag mismatch occurred. Crafting probe gadgets is relatively simple: an attacker needs to trigger the target software vulnerability on a speculative path [29] and, unlike standard (and mitigated) Spectre [44], observe a microarchitectural signal from any independent memory operation within the speculation window. On some devices (Pixel 8), additional failed checks within the window may be required, but, since invalid memory accesses are common on speculative paths, this is a relatively minor hurdle for attackers to overcome. Our results provide concrete evidence that tag leakage attacks are possible with easy-to-craft probe gadgets and question the use of probabilistic MTE-based solutions that rely on random tagging as a mitigation, even with a lack of brute-forcing capabilities [26]. Furthermore, our findings contradict a recent analysis on MTE by Google, which found no side channels on their tested devices [32].
4. Threat Model
We consider an adversary seeking to exploit an existing spatial memory error in a victim program. We assume attackers can exhaust the tag entropy through classic brute forcing for crash-resistant targets [26], [27], [28] or speculative probing for non-crash-resistant ones [29]. For speculative probing, the standard Spectre [44] threat model applies, with a local attacker mounting cross-privilege (e.g., user-to-kernel or guest-to-host) or in-domain (e.g., JavaScript sandbox) attacks [45]. For the latter, attackers also need to bypass any deployed (browser) timer mitigations [46], for instance by crafting their own high-resolution timers [47], [48], [49], [50] or mounting timerless attacks [51], [52].
We consider overflows (and underflows) within the size of our spatial guards. This includes contiguous overflows and bounded non-contiguous overflows, such as constrained out-of-bounds accesses via type confusion, etc. Attackers can launch their spatial memory attacks not just through buffers on the heap, but also on the stack. In addition, they may attack both confidentiality and integrity (reads and writes). We consider temporal memory errors out of scope and subject of extensive literature on orthogonal defenses [53], [54], [55], [56], [57], [58], [59].
5. StickyTags
As later evidenced by our evaluation, frequent memory tagging (normally done for every allocation/deallocation) is a major performance bottleneck for existing MTE-based solutions. To reduce this overhead, StickyTags decreases the number of times it needs to tag memory, by reorganizing memory into regions each containing objects of a particular size class (Figure 2). It tags memory at the first use of an object slot, allowing the tag to persist across the lifetimes of different objects allocated in the same slot.
Our persistent memory tags follow a deterministic pattern that assigns tags to slots in round-robin fashion, so for N-bit tags, each tag repeats every 2^N slots. This way, StickyTags protects against buffer under- and overflows bounded by the number of tags times the slot size. Conceptually, each memory slot has two implicit spatial guards consisting of the 2^N − 1 surrounding objects on both sides with different tags. The effective size of the guards depends on the size class S: each object is protected by (2^N − 1) × S guard bytes. With MTE featuring tags of N = 4 bits, this amounts to 15 × S. To quantify this: our smallest (16 bytes) and largest (262 KB) size classes provide bi-directional guards of 240 bytes and 3.75 MB, respectively.
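The guard-size arithmetic can be checked with a few lines of C. This is a minimal sketch under the round-robin assumption stated above; the helper names `guard_bytes` and `slot_tag` are ours and do not come from the StickyTags prototype.

```c
#include <assert.h>
#include <stdint.h>

/* Guard size on each side of a slot: (2^N - 1) * S bytes. */
static uint64_t guard_bytes(unsigned tag_bits, uint64_t size_class) {
    assert(tag_bits >= 1 && tag_bits <= 8);
    return ((UINT64_C(1) << tag_bits) - 1) * size_class;
}

/* Round-robin tag of the slot at a given index within a region:
 * the region starts at tag 0 and tags repeat every 2^N slots. */
static unsigned slot_tag(uint64_t slot_index, unsigned tag_bits) {
    assert(tag_bits >= 1 && tag_bits <= 8);
    return (unsigned)(slot_index & ((UINT64_C(1) << tag_bits) - 1));
}
```

With N = 4, `guard_bytes(4, 16)` evaluates to 240 and `guard_bytes(4, 262144)` to 3932160 bytes (3.75 MB), and `slot_tag` differs for any two slots that are between 1 and 15 positions apart, which is what makes the surrounding slots act as guards.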
Figure 2: Memory organization in StickyTags. The tags are persistent: they remain in place when new objects reuse the memory. [Diagram: the stack and heap of a userspace program are divided into size-class regions (16, 32, 16, and 64 bytes shown). Example objects char buf1[15], char buf2[16], char *ptr1 = malloc(60), char *ptr2 = malloc(10), and char *ptr3 = malloc(64) occupy slots whose tags cycle 0, 1, 2, ..., 15, 0, 1, ...; the MTE hardware validates that tags match on loads/stores.]
Within a size class the objects always use the same slot size, hence the tag layout in a region is constant. After an object is deallocated, a new object can reuse the slot while the underlying memory tags remain unchanged. The tags are repeated to match the size class, e.g., 64-byte objects require four consecutive identical tags. On object allocation, we tag the returned object pointer such that it matches the persistent tag of the corresponding memory location. The MTE hardware compares address tags with memory tags and generates an interrupt upon accesses in case of mismatch.
While size classes are common in modern heap allocators [60], [61], and previous work has split the stack into separate regions per variable type to combat type confusions [62], [63], to the best of our knowledge the stack has never been divided into size classes. It is precisely this combination of a size class-based (stack and heap) allocator and persistent memory tags that allows StickyTags to deliver much higher performance than existing solutions. Specifically, StickyTags eliminates the need for retagging memory, because of which it performs well on both the stack, where the allocation (and hence retagging) frequency is typically high, and the heap, where large allocations are not uncommon and hence retagging is costly.
5.1. Persistent Memory Tag Initialization
While StickyTags’ one-time tag initialization is key to its performance, it is also challenging. A naive solution is to maintain metadata to track whether the memory tag of a slot has already been initialized, and check it at object allocation time, initializing the tags only if needed. However, this strategy would introduce checks and metadata tracking on the fast path, severely impacting performance. Another option is to immediately tag an entire memory area every time a whole region (i.e., stack/heap chunk) is allocated. However, doing so may result in severe overtagging of memory that is never used, incurring both runtime and memory overhead. This is especially likely because modern applications and allocators tend to keep large amounts of unused memory around for future use.
To avoid such shortcomings, StickyTags initializes memory tags only once the memory receives backing. In particular, StickyTags relies on user-level page fault handling to lazily initialize memory tags: upon accessing a memory page for the first time, the hardware triggers a page fault that StickyTags handles by initializing the predetermined memory tags for the page. For this purpose, it only needs to know the base address and size class of the region containing the page, which it obtains from per-region metadata maintained by the allocator. Using this information, StickyTags determines what tags to apply to the page based on the distance of the current page to the base of the region, because the region always starts with tag zero and tags cycle up deterministically (round-robin) from that point.
5.2. Size Classes
Stack. For the stack, StickyTags allocates one region per size class of stack objects. Allocating the stack regions using the heap allocator (described below) allows StickyTags to deduce the size class at runtime when handling page faults by consulting the heap metadata. StickyTags uses stack size classes that are powers of two, which simplifies pointer tagging, as we will explain in the next section. While there is at most one instance of each size class in a single-threaded application, there can be multiple in the case of multi-threading. StickyTags instruments each unsafe stack allocation in the program (as determined by static analysis [64]) to use the base pointer of the associated stack region instead of the regular stack. The regular stack is still used for return addresses, statically safe allocations, and stack objects in uninstrumented libraries.
Since stack objects are allocated on each function entry, they require a highly efficient replacement scheme. Therefore, StickyTags decides at compile time which stack region pointer (or simply stack pointer) to use for each stack object and stores the stack pointer in thread-local storage (TLS). Note that StickyTags creates stack regions only for the size classes that it statically determines to be required. Dynamically-sized stack objects (e.g., calls to alloca) cannot benefit from this scheme and, like heap objects, require the allocator to find a region for their size class. Since this is already done for heap objects, StickyTags moves these objects to the heap by transforming them into malloc calls. It inserts calls to free at the end of the object’s lifetime, which it determines using dominance frontiers [65]. In our evaluation we rarely observe unsafe dynamically-sized stack objects. Therefore, moving these to the heap incurs negligible runtime overhead.
Heap. For heap allocations, commodity high-performance memory allocators already provide a suitable organization with size classes. For instance, an allocator such as TCMalloc [60] uses slab allocation to efficiently implement per-thread caching of heap objects. StickyTags piggybacks on these efforts and ensures the allocator uses only size classes that are a multiple of 16 bytes, the MTE tagging granularity. See Appendix A for the exact size classes.
Listing 2 Tagging stack pointers upon allocation.
1 // assuming: 'ptr' is target allocation
2 region_base = ptr & (~((1 << 24) - 1));
3 distance = ptr - region_base;
4 jumps = distance >> size_class_power;
5 tag = jumps & 15;
6 ptr = ptr | (tag << 56);
5.3. Tag Calculation
Whenever StickyTags allocates a memory object, it determines the tag to use for the pointer (i.e., the address) based on the location of the underlying memory. Additionally, upon a page fault, StickyTags applies the appropriate tagging pattern to the faulting page, in accordance with the associated size class. The key insight for tagging is that StickyTags can deterministically calculate the correct tags for all allocation pointers and faulting pages such that both correspond to the same tagging layout.
Stack. Since the stack is designed for frequent allocations, it is crucial to optimize pointer tagging even with efficient persistent memory tags. For our purposes, we considered three possible design options. The first maintains an explicit (per-size-class) tag pointer in TLS similar to the stack pointer, which we forward through function calls and increase/decrease accordingly, allowing StickyTags to trivially calculate the next tag to use based on the current tag pointer value. The second option relies instead on the stack pointer to calculate the current (per-size-class) tag value just-in-time. The third option is a hybrid design. Specifically, if we ensure that all stack allocations are performed unconditionally (i.e., we perform allocation hoisting), then we can assume complete linearity of the allocations within each size class in a function. As a result, we do not need to recalculate the tag for each allocation (since the tag layout is fixed), and instead need only calculate the first tag in the function for each size class, cache it, and offset subsequent allocations accordingly. With this approach, we effectively calculate and maintain an explicit function-local tag pointer.
After inspecting preliminary benchmark results, we quickly discarded the two options based on explicit tag pointer management, as any increased pressure on the register allocators caused by propagating variables proved to undermine performance. Instead, we focused on the second option, optimizing the performance of just-in-time tag calculations as much as possible. Listing 2 presents how StickyTags performs just-in-time pointer tag calculations for stack allocations based on the object address. In particular, by computing the distance of the current address to the base of the region, StickyTags can deduce how many tag cycles fit in this distance, and hence determine the next tag to use.
To optimize the calculation, we align the base of each stack region to 16 MB and limit the size of each region to 16 MB. As a result, we can find the base of a stack region by masking (i.e., cutting down) any arbitrary stack object’s pointer to the 16 MB boundary, instead of having to perform a memory load (line 2 of Listing 2). Next, a simple subtraction yields the distance of the object to the base (line 3). We compute how many objects fit in this distance by dividing it by the size class of the region (as a power of two exponent). The computation is efficient: the size class of the allocation (and the corresponding region) is known at compile time, while the stack’s size classes are powers of two, allowing us to use right shifts rather than divisions (line 4). By knowing how many objects fit before the current one, StickyTags computes the next tag to use by performing a modulo 16 (the tag cycle size) operation, which is optimized to a bitwise AND (line 5). The last step applies the tag to the upper bits (56-63 with Arm TBI) of the pointer (line 6).
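For readers who want to experiment with the calculation, the steps of Listing 2 can be wrapped into a compilable C function. This is a sketch that models pointers as 64-bit integers; the function name `tag_stack_pointer` and the constants are ours, and the 16 MB alignment of the region base is the assumption the text states.

```c
#include <assert.h>
#include <stdint.h>

#define REGION_BITS 24u   /* regions are 16 MB and aligned to 16 MB */
#define TAG_SHIFT   56u   /* tag goes into the top pointer byte (TBI) */
#define TAG_MASK    15u   /* 16 tags, so modulo 16 is a bitwise AND */

/* Compute the persistent tag for a stack slot and apply it to the pointer.
 * size_class_power is log2 of the (power-of-two) size class. */
static uint64_t tag_stack_pointer(uint64_t ptr, unsigned size_class_power) {
    assert(size_class_power >= 4 && size_class_power < REGION_BITS);
    uint64_t region_base = ptr & ~((UINT64_C(1) << REGION_BITS) - 1); /* line 2 */
    uint64_t distance    = ptr - region_base;                         /* line 3 */
    uint64_t jumps       = distance >> size_class_power;              /* line 4 */
    uint64_t tag         = jumps & TAG_MASK;                          /* line 5 */
    return ptr | (tag << TAG_SHIFT);                                  /* line 6 */
}
```

For a 32-byte size class (`size_class_power` of 5), the slot at index 5 receives tag 5, while the slot at index 16 wraps around to tag 0 and the returned pointer is unchanged.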
Now that StickyTags has tagged the pointers of the allocations, it must ensure that the underlying memory follows the same tagging layout. To apply these persistent memory tags to the stack upon page faults, it uses an algorithm that closely resembles Listing 2. Starting from a faulting address, it obtains the corresponding region base and size class from the heap metadata (since StickyTags allocates stack regions using the heap allocator) and computes the first tag of the page through the same steps as in Listing 2 (lines 3 to 5). Finally, it applies the memory tags by repeatedly executing the STG (store tag) MTE instruction, looping over the page in chunks of the size class, and increasing the tag by one for every object, wrapping around after tag 15.
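The page-tagging loop can be simulated on any machine by replacing the STG instruction with a store into an array that holds one tag per 16-byte granule. The sketch below makes several simplifying assumptions that are ours rather than the paper's: a 4 KB page, a power-of-two size class no larger than a page, and the hypothetical helper names `store_tag` and `tag_faulting_page`.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES 4096u
#define GRANULE    16u
#define NUM_TAGS   16u

/* Stand-in for STG: record the tag of one 16-byte granule of the page. */
static void store_tag(uint8_t *page_tags, uint32_t offset_in_page, uint8_t tag) {
    page_tags[offset_in_page / GRANULE] = tag;
}

/* Apply the predetermined tag pattern to the page starting at page_addr. */
static void tag_faulting_page(uint8_t page_tags[PAGE_BYTES / GRANULE],
                              uint64_t region_base, uint64_t page_addr,
                              uint32_t size_class) {
    /* Assumptions of this sketch, see the lead-in. */
    assert(size_class >= GRANULE && size_class <= PAGE_BYTES);
    assert((size_class & (size_class - 1)) == 0);
    assert(page_addr >= region_base && page_addr % PAGE_BYTES == 0);

    /* First tag of the page: number of slots before it, modulo 16. */
    uint8_t tag = (uint8_t)(((page_addr - region_base) / size_class) % NUM_TAGS);

    /* Walk the page slot by slot; every granule of a slot gets the same tag. */
    for (uint32_t slot = 0; slot < PAGE_BYTES; slot += size_class) {
        for (uint32_t off = 0; off < size_class; off += GRANULE)
            store_tag(page_tags, slot + off, tag);
        tag = (uint8_t)((tag + 1) % NUM_TAGS);  /* wrap around after tag 15 */
    }
}
```

Because the first tag depends only on the distance to the region base, pages can fault in any order and still end up with the same layout that the pointer-tagging code expects.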
It is important to note that StickyTags tags stack memory upon page faults in a runtime library and hence the inline stack instrumentation (see Listing 2) does not require any MTE instructions. More specifically, StickyTags’ stack tagging instructions do not require STG to store memory tags, which is instead done by the page fault handler, nor LDG to load tags from storage, since the pointer tags are computed based on the stack addresses (and applied using TBI). This comes with the added benefit of providing backwards compatibility with legacy Armv8 devices (supporting TBI, but not MTE). Indeed, if a device does not support MTE, StickyTags can simply not register the page fault handler, thereby making memory tagging fully conditional. In contrast, random tagging solutions such as MemTagSanitizer [20] require unconditional insertion of MTE instructions on the stack, thereby breaking the application binary interface (ABI). As acknowledged by Google, retaining ABI compatibility is crucial to deploy stack tagging in practice, and this is especially the case for systems targeting a wide variety of Arm devices such as Android [66].
Heap. On the heap, StickyTags tags pointers by piggybacking on the existing heap metadata to retrieve the base address and size class of the object’s region. In contrast to the stack, the heap uses size classes that are multiples of 16 bytes to avoid excessive memory overhead as well as potential performance penalties due to internal fragmentation of power-of-two heap allocators [16]. Since the allocation patterns on the heap are generally less intensive, we trade off better memory locality for a slightly more expensive tag calculation. The main complication is that some size classes do not fit perfectly in the memory page granularity, because dividing the page size by a size class that is not a power of two results in a non-integer object distribution (e.g., 4096/48). StickyTags addresses this by offsetting the memory tagging initialization accordingly, such that it accounts for memory objects that are partially tagged by a (prior or future) neighboring page fault.
The algorithm for tagging heap pointers follows the same structure as for the stack (Listing 2), with the following minor differences: (1) the region base and size class are determined through a metadata lookup, and (2) the jumps calculation (line 4) is a division instead of a right shift (as the size classes are not always a power of two). The result of this division is always an integer, since the offset from the start of the allocation to the region base (distance) is guaranteed to be a multiple of the size class.
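The heap variant can be sketched in the same way, with the right shift replaced by a division. The code below is illustrative only: the region base and size class are passed in directly instead of being looked up in allocator metadata, and the page is tagged granule by granule, which is a simplification we chose because it yields the same layout for objects that straddle a page boundary; the paper instead describes offsetting the tagging initialization. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE   16u
#define NUM_TAGS  16u
#define TAG_SHIFT 56u

/* Tag of the slot containing addr, for any size class that is a multiple of 16. */
static unsigned slot_tag_at(uint64_t region_base, uint64_t size_class, uint64_t addr) {
    assert(size_class > 0 && size_class % GRANULE == 0 && addr >= region_base);
    return (unsigned)(((addr - region_base) / size_class) % NUM_TAGS);
}

/* Tag a pointer to the start of an allocation (division instead of shift). */
static uint64_t tag_heap_pointer(uint64_t ptr, uint64_t region_base, uint64_t size_class) {
    assert(ptr >= region_base && (ptr - region_base) % size_class == 0);
    return ptr | ((uint64_t)slot_tag_at(region_base, size_class, ptr) << TAG_SHIFT);
}

/* Fill one tag per granule for the page at page_addr. A slot that straddles
 * the page boundary receives the same tag from both neighboring pages. */
static void tag_heap_page(uint8_t *page_tags, uint64_t region_base,
                          uint64_t size_class, uint64_t page_addr, uint64_t page_size) {
    for (uint64_t off = 0; off < page_size; off += GRANULE)
        page_tags[off / GRANULE] =
            (uint8_t)slot_tag_at(region_base, size_class, page_addr + off);
}
```

With a 48-byte size class and 4 KB pages, slot 85 starts at offset 4080 and ends on the next page; both pages assign it tag 5 (85 modulo 16), so the order in which the two pages fault does not matter.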
While StickyTags’ lazy tagging strategy reduces the tagging costs for large objects within a region, the one-time tagging and subsequent checks still incur some residual overhead for huge heap objects. To reduce the cost of both memory tagging and the checks that MTE performs (only) on tagged memory, StickyTags includes an optimization where huge objects are allocated in separate (guarded) memory regions and remain untagged. Huge object allocations (>262 KB) already constitute a special case in TCMalloc: each resides in its own dedicated region (also called a span). Since such huge objects do not have any neighboring objects inside the region, we can simply spatially fence their regions using inaccessible guard pages [10], [38]. Similar optimizations are also present in modern allocators, for instance in the Scudo allocator (Android), which does not tag objects bigger than a threshold (64 KB on Android, 131 KB by default) and instead relies on guard pages [67]. While Scudo does this to avoid (frequently) tagging large regions of memory [67], StickyTags primarily aims to reduce the residual tag checking overhead, since huge objects can reuse previously tagged memory when available.
6. Persistent Spatial Guards on x86
In this section, we show that the principle of persistent spatial guards along with one-time initialization extrapolates well to the x86 architecture, even though x86 does not have memory tagging capabilities. In the absence of MTE, we rely on providing spatial memory error mitigation through more traditional compiler-inserted checks and (padded) explicit spatial guards, commonly called redzones. As before, we reorganize the address space into regions containing objects of the same size class, but now additionally we place redzones at fixed intervals within each region (see Figure 3). By doing so we can optimize redzone management, even eliminating the need for out-of-bound metadata such as a shadow memory. Compared to the implicit spatial guards of MTE, where tags are associated separately from the memory, in the case of x86 the persistent guards are explicit, as they weave between objects to spatially separate them. As a result, each object size (and hence the size class) is inflated by including the redzone on its right side. Note that the redzone on the left is always the right redzone of the previous object, except for the first object, for which the left redzone is created along with the region initialization.
Figure 3: On x86: Memory is organized in size classes, each containing equally sized slots for a given size class, interleaved by redzones. Objects are padded to fit the slot. [Diagram labels: Size Class Regions, Redzone, Slot, Padding.]
Algorithm 1 High-level bounds check. The actual implementation reuses the loaded value for reads instead of calling LOAD_BYTE, and unrolls the loop for up to 8 fast checks. REDZONE_DISTANCE is different for the heap and the stack.
offset ← 0
while offset < num_accessed_bytes do
  if LOAD_BYTE(address + offset) = guard_value then
    region ← GET_REGION(address)
    sc ← GET_SIZECLASS(region)
    dist ← REDZONE_DISTANCE(region, address, sc)
    if dist < 0 ∨ num_accessed_bytes > dist then
      RAISE_ERROR(out of bounds)
  offset ← offset + redzone_size
Existing redzoning solutions (e.g., AddressSanitizer [3]) use a shadow memory to record which memory is accessible, storing one bit of accessibility information for each byte of application memory. This fine-grained metadata management is expensive in both runtime and memory usage, but is necessary for a design in which a memory location containing an object (or padding bytes) can later be used to store a redzone, and vice-versa. Especially if we want the redzones to be reasonably large (e.g., 256 bytes) to approximate the lower-bound security guarantees of our MTE solution, the frequent construction and destruction of redzones is expensive. In contrast, persistent redzones entirely eliminate the need for a shadow memory. The base address and size class of the containing region are known for each memory location, and can be used to determine whether a pointer points to a redzone. Additionally, since the redzones only need to be initialized with guard values (see below) once, we avoid redzone creation becoming a severe performance bottleneck.
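To make the shadow-memory-free lookup concrete, here is a minimal C sketch that decides whether an address lies in a redzone using only the region base and the slot geometry. The struct `rz_region`, its fields, and the function name are hypothetical; the sketch assumes the region base is the address of the first object (just after the initial left redzone) and that the address is not below that base.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region layout: [slot 0][rz][slot 1][rz]... where each slot
 * holds one padded object and is followed by its right redzone. */
struct rz_region {
    uint64_t base;         /* address of the first object in the region */
    size_t   object_size;  /* padded object size of this size class */
    size_t   redzone_size; /* size of the redzone to the right of each object */
};

/* True if addr (assumed >= r->base) falls inside a redzone. */
static bool points_into_redzone(const struct rz_region *r, uint64_t addr)
{
    size_t stride = r->object_size + r->redzone_size;
    size_t in_slot = (size_t)((addr - r->base) % stride);
    return in_slot >= r->object_size;
}
```

Since the layout is fixed at region initialization, the answer never changes over the lifetime of the region, which is what removes the need for per-byte accessibility metadata.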
Accessibility Checks. We insert accessibility checks before each memory read/write. We leverage redzone-aware static analysis at the compiler level to skip unnecessary checks and merge checks of adjacent memory ranges, similar to state-of-the-art compiler optimizations as seen in related work [36]. The checks consult the metadata of the region containing the accessed memory address, and use its base and size class to determine whether the pointer falls within a redzone. This introduces three memory loads (metadata, region base, and size class) and some arithmetic/branching operations which together cause high runtime overhead. We therefore use a guard value, as done by LBC [4], to quickly filter benign memory accesses: a single byte value that is stored in each redzone byte. To avoid writing guard values to a redzone twice, we lazily initialize guard values upon a page fault.
[Figure 4 example: void foo() { char buf[64]; memset(buf, 0, 96); } with labels buf, memset, checks.]
Figure 4: Fast checks on a stack object with 64-byte redzones. The access spans more than a redzone and is checked by fast-checking every 64 bytes, failing on the second check.
Each accessibility check first performs a "fast" check, comparing the byte value at the accessed location to the guard value, only resorting to a regular "slow" check based on region metadata if the values match. Algorithm 1 shows this in detail. Because a memory access can access more bytes than fit in a redzone, one fast check is emitted for each redzone size bytes of the access (see Figure 4). Otherwise, an attacker might abuse a large memory access that starts before the redzone and thus does not contain the guard value in the first byte, but crosses the entire redzone to access the next object slot. As a result, for very large memory accesses, the performance gain of doing fast checks is overcome by the number of fast checks that are needed. Hence, we only emit fast checks if the required number is still beneficial for performance (up to eight, determined experimentally). To optimally benefit from fast checks, the guard value should be uncommon in regular application memory.
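The fast/slow structure of Algorithm 1 can be sketched in C on a simulated region. Everything below is a simplified model under stated assumptions: a single region of 64-byte objects with 64-byte redzones, offsets instead of real addresses, and hypothetical helper names (`init_region`, `redzone_distance`, `access_ok`). The guard value 223 is the default reported in the appendix. Like Algorithm 1, the sketch probes only the first accessed byte and every redzone-size bytes after it; any handling of the final byte of an access is not shown in the source and is left out here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GUARD_VALUE  223u  /* default guard value per the appendix */
#define OBJECT_SIZE  64u   /* assumed padded object size */
#define REDZONE_SIZE 64u   /* assumed redzone size */
#define STRIDE (OBJECT_SIZE + REDZONE_SIZE)

/* One-time initialization: fill the redzone of every slot with the guard value. */
static void init_region(unsigned char *region, size_t len)
{
    for (size_t slot = 0; slot + STRIDE <= len; slot += STRIDE)
        memset(region + slot + OBJECT_SIZE, (int)GUARD_VALUE, REDZONE_SIZE);
}

/* Slow check helper: bytes from off up to the next redzone (<= 0 if inside one). */
static long redzone_distance(size_t off)
{
    return (long)OBJECT_SIZE - (long)(off % STRIDE);
}

/* Returns false if accessing n bytes at offset off must be rejected. */
static bool access_ok(const unsigned char *region, size_t off, size_t n)
{
    for (size_t probe = 0; probe < n; probe += REDZONE_SIZE) {
        if (region[off + probe] == GUARD_VALUE) {   /* fast check hit */
            long dist = redzone_distance(off);      /* slow check */
            if (dist < 0 || (long)n > dist)
                return false;
        }
    }
    return true;
}
```

This mirrors the Figure 4 scenario: a 96-byte write to a 64-byte object passes the probe at offset 0 and fails at the probe 64 bytes in. A benign access whose data happens to equal the guard value only pays for the slow check and is still accepted.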
In summary, on x86 we use explicit persistent spatial guards (i.e., redzones) that allow us to scale to relatively large guard sizes, since we avoid (frequent) redzone creation/destruction becoming a bottleneck by only having to initialize the guards once. Due to a lack of hardware assistance, we rely on compiler-inserted checks to validate memory accesses. As we will show in our evaluation, (large) explicit persistent spatial guards are indeed significantly more efficient than their non-persistent counterparts. However, the use of compiler checks still results in residual overheads unsuitable for production use.
7. Implementation
We have implemented our prototypes on Linux on top of the LLVM [68] compiler infrastructure and the TCMalloc [60] memory allocator. We apply our compiler passes after link-time optimizations. This ensures that inserted instrumentation does not interfere with any analysis during optimizations. Our implementations of implicit guards (MTE tags) and explicit guards (redzones on x86) share the memory reorganization and page fault handling logic, and mainly differ with respect to the application of guards.
Size Classes. We use TCMalloc [60] as the basis for our allocator. TCMalloc organizes objects into size classes by default. A compiler pass, based on LLVM's internal SafeStack [69], creates size classes for the stack and also applies the pointer tags. We modified the pass to support one "unsafe" stack per size class. We rely on TCMalloc to allocate memory areas for the stack regions and to determine object size classes at runtime. Large stack objects that do not fit in any of the precomputed size classes are assigned a new, unique size class. This rarely occurs in practice.
Page Fault Handling. A dedicated poller thread catches page faults in user mode using Linux' userfaultfd system call, initializing memory tags (or redzones on x86) in the faulting page when it is accessed for the first time. We derive where to apply the tags and redzones from per-region metadata maintained by TCMalloc. Execution of the poller/application threads is interleaved: during page fault handling, the faulting application thread waits for the poller thread to finish. Because only one thread runs at a time, there is no offloaded overhead on a separate core.
8. Evaluation
In this section, we evaluate the performance and security of StickyTags. We measure the runtime and memory overhead using the SPEC CPU2006 and CPU2017 benchmarking suites, and compare this to state-of-the-art solutions. To quantify the security impact of StickyTags, we use the Juliet Test Suite [70], existing CVEs [17], and a type confusion vulnerability analysis. Additionally, we investigate the performance accuracy of existing MTE analogs. For additional information and experiments we refer to the Appendix.
8.1. Experimental Setup
For the core of our experiments we use a rooted Google Pixel 8 Pro with MTE support. The device contains 12 GB RAM and runs a chroot Debian 12 distribution. We further make use of a Samsung Galaxy S22 (8 GB RAM) and a MacBook Pro (Apple M2, 16 GB RAM, Asahi Linux 6.3, Debian 12). All single threaded benchmarks are pinned to a single core. Each measurement reported is the median of five iterations of the same program (using the reference workload for SPEC CPU). For the baseline, we enabled link-time optimizations and used an unmodified TCMalloc as the memory allocator. Note that default TCMalloc has an average speedup of 11.7% compared to the default (non-MTE) Scudo allocator (and 8% for the default GNU heap allocator) and up to 73% for a single benchmark (471.omnetpp), while consuming 3% more memory on average. Unfortunately, we have to exclude SPEC CPU2017 from most of our experiments, because the baseline runs out of memory. The system requirements for SPECspeed 2017 state 16 GB of physical memory, and the Pixel 8 and S22 do not meet this.
MTE Hardware. Recent work concerning MTE uses analogs to approximate the overhead of memory tagging since MTE hardware was not widely available [17], [19], [34], [71]. In our evaluation, we measured the performance of StickyTags with actual MTE hardware. First, the Google Pixel 8 (Tensor G3, android14-5.15) supports MTE and allows the feature to be enabled through its developer options. Second, by rooting a Samsung S22 and deploying a custom Exynos (Linux 5.10) kernel, we manage to activate its MTE hardware by explicitly ignoring the nomte kernel parameter. However, on the S22 the locked-down boot monitor does not reserve backing (physical) memory to store the memory tags of uncached data. Consequently, tagging memory works as expected, reading the target data into the cache and setting the memory tags in the cache hierarchy accordingly. However, as soon as the data leaves the cache, the memory tags cannot be swapped to backing memory and effectively vanish. Subsequent accesses to the memory cause a segmentation fault by the MTE checks (since the pointer still has the tag). While the Pixel 8 serves as the main target for evaluation (with completely functional MTE), the S22 nonetheless provides another data point as MTE implementation, allowing us to gain further insights into the performance of MTE's memory tagging and our design. Additionally, to paint a complete picture with respect to existing work, we also conduct performance experiments with the existing MTE analogs on the S22 (and Apple M2).
[Figure 5: stacked bars per SPEC CPU2006 benchmark plus geomean; x-axis: runtime overhead (%), 0% to 16%; components: Sized Stack, Tag Stack Ptrs, Tag Stack Mem, Tag Heap Ptrs, Tag Heap Mem, ASync Checks.]
Figure 5: Runtime overhead buildup of different components of StickyTags on SPEC CPU2006 using MTE hardware (Pixel 8).
8.2. Performance Buildup
For our performance evaluation we configured MTE on the Pixel 8 to perform asynchronous checks, which Arm recommends for production usage [72]. Figure 5 displays the runtime overhead of StickyTags on each individual benchmark of the SPEC CPU2006 suite. In total, StickyTags incurs a geomean runtime overhead of 4.0%. The figure breaks down this overhead into six distinct components: using size classes on the stack (1.0%), tagging stack pointers (0.1%), tagging stack memory (0.1%), tagging heap pointers (0.8%), tagging heap memory (0.5%), and the asynchronous checks (1.5%). Note that the heap and stack memory tagging components constitute the overhead of using userfaultfd to one-time initialize the persistent tags upon page faults.
[Figure 6: bars per SPEC CPU2006 benchmark plus geomean; x-axis: number of page faults (log scale), 1 to 1M; series: Stack, Heap.]
Figure 6: Number of (4 KB) page faults in SPEC CPU2006.
From the overhead buildup figure we can conclude that tagging memory is cheap (see geomean bar), which is a logical consequence of our persistent tags design. Moreover, we see that using a stack with size classes can be the largest contributor to overhead in some benchmarks (400.perlbench, 445.gobmk), while overall the slowdown is modest. For the CPU2006 benchmarks, we create an average (geomean) of 7 stack regions (each dedicated to a distinct size class). Then, on average, a maximum of 4 are used per function. Note that these numbers are statically computed, meaning that some stack regions may be unused at runtime depending on the execution path. Excluding the checks, the remaining overhead originates from tagging pointers on the stack and the heap, which correlates with the memory intensity of the applications. For instance, 447.dealII and 483.xalancbmk are known to be relatively heap-intensive, and hence these accordingly experience more overhead from tagging heap pointers. As touched upon before, we may choose to only use size classes that are a power of two on the heap, which accelerates the pointer tag calculation, but this may come with other drawbacks such as memory fragmentation.
We observe that the overhead implications of enabling asynchronous MTE checks are low but non-negligible. On average, the checks comprise the most significant overhead component; however, this is not unexpected, because StickyTags focuses on eliminating tagging overhead. Moreover, the checks clearly show up as the dominant overhead factor in multiple programs. The 471.omnetpp benchmark stands out the most, where the checks incur more than 11% overhead. Upon further inspection with perf [73], we found that for this benchmark the CPU experiences a 19% increase in stalled cycles in the backend, which is likely the result of the asynchronous checks creating additional contention.
To better understand the characteristics of our memory tagging design, we measured the number of page faults that occur at runtime for both the heap and the stack. Figure 6 displays the results of this experiment. Looking at the aggregate numbers in the figure, it is clear that the stack experiences much fewer page faults than the heap, with the geomeans being 20 and 83,154, respectively. Moreover, we see a logical correlation between the overhead of tagging heap memory being relatively expensive for 403.gcc and the large number of heap page faults for this benchmark. In contrast, 401.bzip2 also experiences a relatively large number of heap page faults, but the heap objects are effectively all huge (>262 KB), which means our huge objects optimization leaves them untagged. Without this optimization, the runtime overhead of 401.bzip2 grows from 6.6% to 7.7%. Additionally, the low number of page faults for the stack highlights the efficacy of our persistent tagging design on the stack. On average, only 20 memory pages need to be tagged throughout the entire lifetime of the evaluated applications, regardless of the intensity of their allocation patterns.
System               Heap    Stack    Both
StickyTags           3.1%    1.2%     4.0%
MemTagSan + Scudo    5.8%    15.2%    20.2%
TABLE 1: Runtime overhead comparison between StickyTags, MemTagSanitizer, and Scudo using SPEC CPU2006 (Pixel 8).
8.3. Comparison to the State of the Art
In order to put the performance of StickyTags into perspective, we compared its overhead to state-of-the-art systems. We measured runtime and memory overhead using the SPEC CPU2006 benchmarking suite and evaluated against LLVM's MemTagSanitizer (stack) and the Scudo heap allocator. The primarily probabilistic Scudo allocator assigns a random tag to every heap allocation and retags the memory upon deallocation. Additionally, Scudo guarantees neighboring objects to have different tags by employing an odd-even tag masking pattern. MemTagSanitizer tags every stack allocation with a "random" tag at the start of its lifetime and resets the tag at the end of it. To avoid scalability issues with random tags requiring an extra live register for each variable, the tags are not completely random. Instead, MemTagSanitizer generates a random base tag for each function, with the following stack variables receiving a tag derived from the base tag. Note that the primary use case of MemTagSanitizer is deployment in production binaries [20]. Unfortunately, MemTagSanitizer causes false positive tag mismatches due to untagged pointers accessing tagged stack memory. Therefore, for the faulting programs (400.perlbench and 471.omnetpp) we modified MemTagSanitizer to only use tag zero to avoid these non-trivial crashes.
Table 1 shows the geomean runtime overhead of StickyTags, MemTagSanitizer, and Scudo on the SPEC CPU2006 suite. The table contains the isolated heap and stack overhead, as well as the combination of both. Most notably, we observe that MemTagSanitizer's stack instrumentation incurs 15.2% overhead, while StickyTags' stack overhead is
[59] M. Erdős, S. Ainsworth, and T. M. Jones, "MineSweeper: a "clean sweep" for drop-in use-after-free prevention," in ASPLOS, 2022.
[60] S. Ghemawat and P. Menage, "TCMalloc: Thread-caching malloc," 2009.
[61] D. Leijen, B. Zorn, and L. de Moura, "Mimalloc: Free list sharding in action," in APLAS. Springer, 2019.
[62] A. Milburn, E. Van Der Kouwe, and C. Giuffrida, "Mitigating information leakage vulnerabilities with type-based data isolation," in 2022 IEEE Symposium on Security and Privacy (S&P). IEEE, 2022.
[63] E. Van Der Kouwe, T. Kroes, C. Ouwehand, H. Bos, and C. Giuffrida, "Type-after-type: Practical and complete type-safe memory reuse," in ACSAC, 2018, pp. 17–27.
[64] LLVM, "Stack Safety Analysis," Online, https://llvm.org/docs/StackSafetyAnalysis.html.
[65] S. Muchnick, Advanced Compiler Design and Implementation, 1997.
[66] "Private email communication with Google (MTE) engineers."
[67] LLVM, "Scudo source," https://llvm.googlesource.com/scudo/+/966620155350ba9e3d09b6bc70b9babc4d222027/combined.h#388.
[68] C. Lattner and V. Adve, "LLVM: A compilation framework for lifelong program analysis & transformation," in CGO, 2004.
[69] "Safestack," https://clang.llvm.org/docs/SafeStack.html.
[70] F. B. Jr. and P. Black, "Juliet 1.1 C/C++ and Java Test Suite," in IEEE Computer, 2012, pp. 88–90.
[71] H. Liljestrand, C. Chinea, R. Denis-Courmont, J.-E. Ekberg, and N. Asokan, "Color My World: Deterministic Tagging for Memory Safety," arXiv preprint arXiv:2204.03781, 2022.
[72] Arm, "Arm Memory Tagging Extension," Online, https://source.android.com/docs/security/test/memory-safety/arm-mte#async-mode.
[73] Linux, "Profiling with performance counters."
[74] W. Han, B. Joe, B. Lee, C. Song, and I. Shin, "Enhancing memory error detection for large-scale applications and fuzz testing," in Network and Distributed Systems Security (NDSS) Symposium, 2018.
[75] S. Ainsworth and T. M. Jones, "MarkUs: Drop-in use-after-free prevention for low-level languages," in S&P, 2020.
[76] M. Phillips, "Globals Tagging - Discussion," Online, https://groups.google.com/g/llvm-dev/c/FAR7zKNkWh4/m/FIdd BRQAgAJ.
[77] J. Devietti, C. Blundell, M. M. Martin, and S. Zdancewic, "Hardbound: Architectural support for spatial safety of the C programming language," ACM SIGOPS Operating Systems Review, 2008.
[78] Arm, "Arm Memory Tagging Extension: Security Update," Online, https://developer.arm.com/Arm%20Security%20Center/Arm%20Memory%20Tagging%20Extension.
[79] S. Singh and M. Awasthi, "Memory centric characterization and analysis of SPEC CPU2017 suite," in ICPE, 2019.
Appendix
Apple M2: StickyTags with MTE Analogs
System                 SPEC CPU2006    SPEC CPU2017
StickyTags-stack       0.7%            1.0%
MemTagSan-stack        14.2%           8.8%
StickyTags-heap        0.5%            0.7%
Non-persistent-heap    1.6%            1.4%
StickyTags-both        1.2%            1.9%
MemTagSan + TC         15.8%           10.4%
TABLE 4: SPEC CPU runtime overhead summary using MTE analogs of StickyTags, TCMalloc with non-persistent tagging, and MemTagSanitizer.
Comparison to the state-of-the-art. In this section, we compare the overhead of StickyTags, TCMalloc with a non-persistent deterministic tagging scheme (described in Section 8.4), and MemTagSanitizer. We make use of MTE analogs on the Apple M2 and the SPEC CPU2006 and 2017 benchmarking suites. Note that the M2 uses pages of size 16 KB, which reduces the number of page faults StickyTags has to handle compared to a 4 KB page size. Unfortunately, the 657.xz_s benchmark is known to exhibit an extremely large memory footprint [79], and we have to omit it because the M2 runs out of memory (16 GB). Although the system requirements for SPECspeed 2017 state 16 GB of physical memory, this is insufficient on our machine.
Table 4 showcases the geomean runtime overhead we measured. The table contains the isolated heap and stack overhead, as well as the combination of both. Most notably, we observe that MemTagSanitizer's stack instrumentation incurs 14.2% and 8.8% runtime overhead on CPU2006 and 2017, respectively, while StickyTags' stack overhead is significantly lower at 0.7% and 1.0%. Moreover, for the heap we measure a runtime overhead of 1.6% and 1.4% on CPU2006 and 2017 for the non-persistent tagging design, while StickyTags manages to (more than) halve this overhead to 0.5% and 0.7%. With the stack and heap combined, StickyTags incurs a low overhead of 1.2% and 1.9%, compared to existing techniques with 15.8% and 10.4%, on SPEC CPU2006 and 2017, respectively. These data points of both the stack and the heap highlight the benefit of our design of persistent memory tags, with which we manage to relieve the pressure of frequent memory tagging.
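Listing 3 MTE analog of setting and clearing memory tags. The analog is adapted from the original [34] to be inlined.
#define MTE_SET_TAG_INLINE(ptr, size) asm volatile (                        \
    "mov x2, %0\n"              /* save ptr so it can be restored */        \
    "mov x3, %1\n"              /* save size so it can be restored */       \
    "mov x17, %0\n"             /* x17: address the analog writes to */     \
    "cbz %1, 2f\n"              /* nothing to do for size == 0 */           \
    "1:\n"                                                                  \
    "mov x16, %0\n"                                                         \
    "lsr x16, x16, #56\n"       /* extract the tag from the top byte */     \
    "and x16, x16, #0xFUL\n"                                                \
    "strb w16, [x17, #0x0]\n"   /* one byte store per 16-byte granule */    \
    "add %0, %0, #16\n"                                                     \
    "sub %1, %1, #16\n"                                                     \
    "add x17, x17, 1\n"                                                     \
    "cbnz %1, 1b\n"                                                         \
    "2:\n"                                                                  \
    "mov %0, x2\n"              /* restore ptr */                           \
    "mov %1, x3\n"              /* restore size */                          \
    :: "r"(ptr), "r"(size) : "x16", "x17", "x2", "x3", "memory")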
Vulnerability     Type     Project      Program      Version
CVE-2016-10270    heap     libtiff      tiffcp       4.0.1
CVE-2016-10271    heap     libtiff      tiffcrop     4.0.1
CVE-2017-8786     heap     pcre2        pcre2test    10.23
CVE-2017-14408    stack    mp3gain      mp3gain      1.5.2
CVE-2018-20004    stack    mxml         testmxml     2.12
CVE-2020-21675    stack    fig2dev      fig2dev      93795dd
CVE-2020-21050    stack    libsixel     img2sixel    2df6437
CVE-2021-20294    stack    binutils     readelf      2.35
TABLE 5: Program details of the CVE analysis.

Heap and stack size classes
For the stack we use a total of 15 size classes, all of them being a power of two, and the smallest being the MTE tagging granularity. The list of classes consists of: 2^N with N = {4...18}, making the largest class 262144 bytes. For the heap we use a total of 76 size classes, all of them being a multiple of 16, and the smallest being the MTE tagging granularity. These are the default size classes in TCMalloc, with the exception of the first class being 16 instead of 8.
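As a small worked example of the stack size classes, the following C sketch maps an object size to the smallest class 2^N with N between 4 and 18 that fits it. The function name `stack_class_log2` and the -1 return value for oversized objects are choices made for this illustration; in the paper, such objects receive a new, unique size class instead.

```c
#include <stddef.h>

#define MIN_CLASS_LOG2 4   /* 16 bytes, the MTE tagging granularity */
#define MAX_CLASS_LOG2 18  /* 262144 bytes, the largest stack class */

/* Return N such that 2^N is the smallest stack size class holding size bytes,
 * or -1 if the object does not fit any of the 15 precomputed classes. */
static int stack_class_log2(size_t size)
{
    for (int n = MIN_CLASS_LOG2; n <= MAX_CLASS_LOG2; n++) {
        if (size <= ((size_t)1 << n))
            return n;
    }
    return -1;
}
```

Because every class is a power of two, the slot index of an object within its region can be computed with a right shift by N rather than a division.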
Redzone guard value on x86
[Figure 9: histogram; x-axis: value of first byte (0 to 255); y-axis: % of addresses; byte value 0 accounts for 20.7% (outside graph).]
Figure 9: Byte values at dereferenced addresses in SPEC CPU2006. Bin i shows what percentage of dereferenced addresses contains value i in the first byte. Highlighted bin: 223 (default guard value).
Occurrences of the guard value in application memory outside a redzone cause a slow check when dereferenced. To achieve high efficiency, it is important to limit the number of slow checks by choosing a guard value that occurs sparsely and consistently so across different types of programs. To find such values, we instrumented SPEC to log the first byte at the location of each memory access, as seen in Figure 9.
• Values 0 and 1 are (unsurprisingly) the most common, as these are default initializers. This extends to a lesser degree to low values under 20. Similarly, the value 255 (used in bitmasks) is best avoided.
• Text-processing applications such as 483.xalancbmk, 400.perlbench and 401.bzip show spikes in the range of printable ASCII characters (up to 127) and in particular alphabetic characters (65-90 and 97-122).
• Powers of 2 and their multiples are prevalent.
• Some benchmarks show recurring spikes at the multiples of some application-specific number.
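The measurement behind these observations amounts to a 256-bin histogram over the first byte of every accessed location. The C sketch below shows that idea in its simplest form; the helper names `record_access` and `rarest_byte` are hypothetical, and picking the globally rarest value is a simplification, since the text asks for a value that is sparse consistently across programs rather than in one aggregate trace.

```c
#include <stddef.h>
#include <stdint.h>

/* Log the first byte at the location of one memory access. */
static void record_access(uint64_t hist[256], const void *addr)
{
    hist[*(const unsigned char *)addr]++;
}

/* Return the byte value with the fewest occurrences (lowest value on ties). */
static unsigned rarest_byte(const uint64_t hist[256])
{
    unsigned best = 0;
    for (unsigned v = 1; v < 256; v++) {
        if (hist[v] < hist[best])
            best = v;
    }
    return best;
}
```

A more faithful selection would keep one histogram per benchmark and choose a value whose worst-case frequency across all of them is lowest.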
Webservers on x86
We have benchmarked both our LTO-enabled TCMalloc baseline and our x86 design (configured to use an explicit persistent spatial guard of 64 bytes) on three major web servers: Nginx 1.17.4, Apache 2.4.41 and Lighttpd 1.4.54. We instrumented loadable modules and non-system libraries (including APR and APR-Util) for all servers. We used two Intel Xeon Silver 4110 machines (server and client), each with 8 hyper-threaded cores at 2.10 GHz and 32 GB of memory, connected by a dedicated 100 Gbit/s network link. We used a Linux 4.15 kernel with sendfile enabled and used a large number of keepalive connections with a short timeout. We ran our experiments with 16 worker processes, requesting 64-byte pages for 30 seconds using the wrk benchmark with an increasing number of concurrent connections. We repeated each experiment 11 times and report the medians here. All standard deviations are less than 1% except for 99-percentile latency for which it goes up to 2.6%. Figure 10 illustrates how we determined saturation points, i.e., the number of connections with the highest throughput at 100% CPU utilization. Table 6 details the throughput and latency impact on x86 for all servers: at saturation, throughput degrades by only 7-10% and 99-percentile latency increases by 5-15%.
            Saturation     Throughput      Latency increase
            connections    degradation     50p    75p    90p    99p
Nginx       250            7%              8%     7%     6%     5%
Apache      350            8%              9%     10%    11%    15%
Lighttpd    500            10%             14%    9%     11%    13%
TABLE 6: Web server overhead at saturation: throughput degradation and increase in 50/75/90/99 percentile latency.
[Figure 10: three panels (Nginx, Apache, Lighttpd); x-axis: concurrent connections; y-axes: throughput (reqs/s) and CPU utilization (0% to 100%); series: baseline, x86 guards.]
Figure 10: Web server throughput with increasing client connections. E.g., the Nginx baseline achieves its maximum throughput at saturation (100% CPU) at 250 connections, at which point the throughput degradation is 7% from our x86 persistent guards.
Benchmark         ST-heap  ST-stack  ST-both  MTS    Scudo  Scudo+MTS
400.perlbench     1.02     1.09      1.11     1.36   1.29   1.54
401.bzip2         1.03     1.03      1.07     1.07   1.00   1.08
403.gcc           1.08     1.02      1.10     1.12   1.26   1.36
429.mcf           0.99     1.00      1.00     1.00   1.00   1.01
433.milc          1.01     1.01      1.01     1.02   0.99   1.01
444.namd          1.00     0.98      0.99     1.01   1.01   1.01
445.gobmk         1.00     1.04      1.05     1.25   1.00   1.25
447.dealII        1.06     0.97      1.06     1.02   1.09   1.09
450.soplex        1.01     1.01      1.00     1.00   1.01   1.03
453.povray        1.05     1.04      1.11     1.41   1.04   1.46
456.hmmer         1.01     1.01      0.99     1.01   1.00   1.01
458.sjeng         1.01     1.01      1.02     3.67   1.00   3.69
462.libquantum    1.03     1.00      1.00     1.02   1.01   1.00
464.h264ref       1.01     1.00      1.01     1.01   1.00   1.01
470.lbm           1.02     1.00      1.01     1.01   1.00   1.01
471.omnetpp       1.17     1.01      1.14     1.02   1.20   1.23
473.astar         1.03     0.99      1.02     1.01   1.03   1.03
482.sphinx3       1.02     1.00      1.01     1.00   1.00   1.01
483.xalancbmk     1.06     1.03      1.08     1.23   1.26   1.43
geomean           1.031    1.012     1.040    1.152  1.058  1.202
TABLE 7: Google Pixel 8 Pro SPEC CPU2006 MTE results. MTS=MemTagSanitizer, ST=StickyTags, both=heap+stack.
                  MTE Analogs                            MTE Hardware
Benchmark         MTS    MTS+heap  ST-stack  ST-both     MTS    MTS+heap  ST-stack  ST-both
400.perlbench     1.22   1.20      1.07      1.09        1.14   1.17      1.07      1.10
401.bzip2         1.10   1.10      1.03      1.04        1.17   1.17      1.03      1.04
403.gcc           1.07   1.17      1.00      1.07        1.11   1.41      1.00      1.08
429.mcf           1.03   1.03      0.99      1.00        1.00   1.01      0.99      1.01
433.milc          1.01   1.01      1.00      1.03        0.99   0.99      0.99      1.01
444.namd          0.98   0.98      0.97      0.97        1.01   1.01      0.97      0.97
445.gobmk         1.48   1.48      1.03      1.03        1.78   1.73      1.03      1.03
447.dealII        0.85   0.99      0.99      1.00        1.00   0.93      0.99      0.99
450.soplex        1.00   1.00      1.00      1.00        1.00   1.03      1.00      1.01
453.povray        1.14   1.14      1.05      1.05        1.39   1.30      1.05      1.05
456.hmmer         0.99   1.00      1.01      1.02        0.99   1.01      1.01      1.02
458.sjeng         4.19   4.20      1.04      1.04        7.33   6.01      1.04      1.04
462.libquantum    1.06   1.07      1.04      1.03        1.10   1.08      1.02      1.02
464.h264ref       1.01   1.01      1.00      1.00        1.02   1.02      1.01      1.01
470.lbm           1.04   1.04      1.04      1.04        1.04   1.04      1.04      1.04
471.omnetpp       0.97   1.01      1.00      1.03        1.01   1.17      1.01      1.04
473.astar         0.99   1.00      1.00      1.01        1.00   1.01      1.00      1.00
482.sphinx3       1.00   1.01      1.00      1.00        0.99   1.01      1.00      1.00
483.xalancbmk     1.28   1.37      1.02      1.07        1.59   2.23      1.02      1.07
geomean           1.140  1.161     1.014     1.027       1.228  1.257     1.014     1.027
TABLE 8: Samsung Galaxy S22 SPEC CPU2006 MTE results. MTS=MemTagSanitizer, ST=StickyTags, both=heap+stack.
Benchmark          MTS    TC-NP  MTS+TC  ST-stack  ST-heap  ST-both
600.perlbench_s    1.06   1.01   1.07    1.04      1.00     1.05
602.gcc_s          1.05   1.06   1.12    1.01      1.02     1.03
605.mcf_s          1.00   1.00   1.00    1.00      1.00     1.00
619.lbm_s          1.02   1.04   1.06    1.03      1.03     1.06
620.omnetpp_s      1.22   1.05   1.27    1.00      1.02     1.03
623.xalancbmk_s    1.26   1.00   1.26    1.01      1.00     1.01
625.x264_s         1.41   0.99   1.41    1.01      1.00     1.01
631.deepsjeng_s    1.04   1.01   1.05    1.02      1.00     1.02
638.imagick_s      1.00   1.00   1.00    1.00      1.00     1.00
641.leela_s        1.00   1.00   1.00    1.00      1.00     1.00
644.nab_s          1.00   1.00   1.00    1.00      1.00     1.00
657.xz_s           -      -      -       -         -        -
geomean            1.088  1.014  1.104   1.010     1.007    1.019
TABLE 9: MacBook M2 SPEC CPU2017 analogs results. MTS=MemTagSanitizer, ST=StickyTags, TC=TCMalloc, NP=non-persistent.
Appendix A.
Meta-Review
The following meta-review was prepared by the program committee for the 2024 IEEE Symposium on Security and Privacy (S&P) as part of the review process as detailed in the call for papers.
A.1. Summary
This paper details a speculative attack to leak MTE tags on real hardware. To mitigate the attack, the authors propose a reorganization of heap and unsafe stack objects that provides deterministic, bounded spatial memory safety.
A.2. Scientific Contributions
• Addresses a Long-Known Issue
• Provides a Valuable Step Forward in an Established Field
• Identifies an Impactful Vulnerability
A.3. Reasons for Acceptance
1) The authors detail a new MTE tag leak side-channel attack.
2) The authors implement a new heap layout to provide deterministic, bounded spatial safety that mitigates the new side-channel attack.
3) The authors provide evaluation on performance and security benefits.
A.4. Noteworthy Concerns
1) The memory overhead of 15% is non-trivial for many real world scenarios.
2) Exclusive use of StickyTags as a protection mechanism without an additional Use-After-Free (UaF) mitigation makes exploiting UaF easier, and makes detecting UaF exploits difficult. This is due to the reuse of object classes key to the design of StickyTags. However, the authors note that UaF protections can be deployed, and show this in an evaluation on UaF Juliet test cases.