On Using Secure Aggregation in Differentially Private Federated Learning with Multiple Local Steps

Author: Heikkilä, Mikko

Publisher: Zenodo

DOI: 10.5281/zenodo.17293440

Source: https://zenodo.org/records/17293440/files/Heikkila_TMLR2025.pdf

On Using Secu e Agg ega ion in Di e en ially P i a e Fede -
a ed Lea ning wi h Mul iple Local S eps
Mikko A. Heikkilä [email p o ec ed]
Depa men o Compu e Science
Uni e si y o Helsinki ∗
Abs ac
Fede a ed lea ning is a dis ibu ed lea ning se ing whe e he main aim is o ain machine
lea ning models wi hou ha ing o sha e aw da a bu only wha is equi ed o lea ning.
To gua an ee aining da a p i acy and high-u ili y models, di e en ial p i acy and secu e
agg ega ion echniques a e o en combined wi h ede a ed lea ning. Howe e , wi h ine-
g ained p o ec ion g anula i ies, e.g., wi h he common sample-le el p o ec ion, he cu en ly
exis ing echniques gene ally equi e he pa ies o communica e o each local op imiza ion
s ep, i hey wan o ully bene i om he secu e agg ega ion in e ms o he esul ing o mal
p i acy gua an ees. In his pape , we show how a simple new analysis allows he pa ies o
pe o m mul iple local op imiza ion s eps while s ill bene i ing om using secu e agg ega ion.
We show ha ou analysis enables highe u ili y models wi h gua an eed p i acy p o ec ion
unde limi ed numbe o communica ion ounds.
1 In oduc ion
Fede a ed lea ning (FL; McMahan e al. 2017; Kai ouz e al. 2019) is a common dis ibu ed lea ning se ing,
whe e a cen al se e and se e al clien s holding hei own local da a se s collabo a e o ain a single global
model. The main ea u e in FL is ha he clien s do no di ec ly communica e da a, bu only wha is equi ed
o lea ning, e.g., g adien s o upda ed model pa ame e s (pseudo-g adien s).
While FL sa is ies he da a minimiza ion p inciple, i.e., only wha is ac ually needed is communica ed while
he ac ual aw da a ne e lea es he clien , i does no p o ec agains p i acy a acks such as membe ship
in e ence (Shok i e al., 2017) o econs uc ion (F ed ikson e al., 2014; Yeom e al., 2018). Ins ead, aining
da a p i acy is commonly ensu ed by combining di e en ial p i acy (DP; Dwo k e al. 2006b), a o mal
p i acy de ini ion, and secu e mul ipa y compu a ion (MPC; Yao 1982) wi h FL (see, e.g., Kai ouz e al.
2019).
DP is essen ially a obus ness gua an ee o s ochas ic algo i hms, which gua an ees ha small pe u ba ions
o he inpu s ha e small e ec s on he algo i hms’ ou pu p obabili ies. Wha cons i u es a small pe u ba ion
depends on he chosen p o ec ion g anula i y: he same basic DP de ini ion can be used o ensu ing p i acy
on any hing om single sample o en i e da a se le el. In u n, MPC p o ocols can be used o limi he
amoun o in o ma ion an ad e sa y has abou compu a ions. In FL, secu e agg ega ion (SecAgg) p o ocols,
a specialised o m o secu e compu a ion ha equi es signi ican ly less esou ces han gene al MPC, a e
commonly used o communica ing model upda es om he clien s o he se e . This app oach enables
p o ably be e dis ibu ed DP (DDP) gua an ees, ha ely join ly on a g oup o clien s, han is possible o
achie e by any single clien in isola ion.
Unde he gene al FL se up, wo main al e na i es a e commonly conside ed: c oss-de ice FL and c oss-silo
FL (Kai ouz e al., 2019). In c oss-de ice FL, each clien is assumed o ha e a small local da a se , while he
o al numbe o clien s is la ge, e.g., housands o millions. In he c oss-silo case, he o al numbe o clien s
is small, o example, a dozen, bu each clien is assumed o ha e a la ge local da a se . In his pape , ou
∗Wo k done pa ly a Tele ónica Resea ch, Ba celona.
1
unning example is s anda d c oss-silo di e en ially p i a e FL (DPFL) whe e he clien s communica e all
upda es o he se e using SecAgg.
1
In his se ing, he mos use ul DP p o ec ion g anula i y is ypically
some hing s ic ly mo e ine-g ained han clien -le el: when clien s a e, e.g., di e en hospi als o banks,
he e a e ypically se e al indi iduals in a single clien s’ local da a se and he p o ec ion g anula i y needs
o ma ch he use case.
While clien -le el g anula i y in DPFL is, a leas in p inciple, s aigh o wa d o combine wi h SecAgg, mo e
ine-g ained g anula i ies such as sample-le el DP can p esen p oblems: using exis ing echniques one has o
choose be ween i) ha ing DDP gua an ees wi h less noise due o SecAgg bu wi h all clien s using only a
single local op imiza ion s ep pe FL ound, and ii) ha ing mo e noisy local DP (LDP) gua an ees ha do
no o mally bene i om SecAgg while allowing he clien s o do mo e local op imiza ion s eps pe FL ound.
Bo h o hese op ions ha e signi ican d awbacks: he amoun o se e -clien communica ions is ypically one
o he i s bo lenecks ha limi model aining in FL, while LDP gua an ees egula ly equi e noise le els
ha hea ily a ec he esul ing model u ili y. In his pape we show ha his ade-o is no una oidable
bu can be la gely emedied by a simple new analysis o he p oblem.
Ou Con ibu ion
•
We p esen a no el and simple heo e ical p i acy analysis showing when we can inc ease he numbe
o local op imiza ion s eps in FL using ine-g ained DP g anula i y, while s ill bene i ing om DDP
gua an ees using a us ed agg ega o .
•
We demons a e empi ically ha he p oposed app oach can lead o la ge u ili y bene i s (in e ms
o p edic ion accu acy and loss) wi hou equi ing any changes o he unde lying algo i hms unde
bo h iid and he e ogeneous clien da a spli s. In ou expe imen s wi h limi ed numbe o global
communica ion ounds, using a con olu ional neu al ne wo k wi h Fashion MNIST da a se as well
as linea models on CIFAR-10 (using p e- ained model as ea u e ex ac o ) and ACS Income da a
se s, we imp o e p edic ion accu acy oughly by 16% poin s, 4% poin s, and 4% poin s, espec i ely
(co esponding o 23%,5% and 5% imp o emen s).
•
Ou esul s poin o a misma ch be ween he cu en heo e ical unde s anding o anilla DPFL
(s anda d DPFL wi h FedA g) and p ac ical esul s.
2 Rela ed Wo k
The e is a signi ican amoun o exis ing wo k ocusing on he gene al p oblem o combining DP wi h FL,
al hough he ocus has mos ly been on he c oss-de ice FL se ing wi h use - o clien -le el DP. To he
bes o ou knowledge, while he combina ion o DPFL wi h SecAgg is ce ainly no no el (see, e.g., T uex
e al. 2019; Kai ouz e al. 2019; Heikkilä e al. 2020; S e ens e al. 2022; Yang e al. 2023), he e is e y
limi ed amoun o wo k speci ically on he p i acy analysis when he clien s do mul iple local op imiza ion
s eps wi h ine-g ained DP and communica e he esul s ia SecAgg. As a as we know, he only di ec ly
ele an analysis is by Noble e al. (2022), who conside he Gaussian mechanism o sample-le el DP in he
c oss-de ice se ing wi hou secu e agg ega ion. Ou main heo em (Theo em 4.6) can be seen as ex ending
hei esul on he Gaussian mechanism o o he DP mechanisms ha a e mo e sui able o combining wi h
ac ual secu e agg ega ion p o ocols. 2
Conside ing o he exis ing wo k in mo e de ail, we can dis inguish some main lines o closely- ela ed esea ch.
Besides he DDP app oach we ocus on, i.e., he combina ion o LDP noise wi h secu e agg ega ion, he use
o LDP mechanism by each clien independen ly p o ides obus p i acy wi h ew assump ions, bu gene ally
esul s in wo se u ili y (see, e.g., Kasi iswana han e al. 2008; T uex e al. 2020). Ano he ecen ly conside ed
1
Ins ead o conside ing any speci ic SecAgg implemen a ion, in his wo k we mos ly assume an idealised us ed agg ega o .
We discuss p ac ical implemen a ions in Appendix A.4.
2
No e ha (T uex e al., 2019, Algo i hm 4) also s a e a simila specialised e sion o he Gaussian mechanism, i.e., hey use
se e al local op imiza ion s eps wi h sample-le el DP and SecAgg in FL, while scaling he Gaussian noise o DDP conside ing
all he clien s. Howe e , as also no ed by Malekzadeh e al. (2021), he app oach o T uex e al. (2019) would equi e a sepa a e
p oo o p i acy beyond wha is ac ually p o ided in he pape .
2
app oach assumes a us ed se e ha is asked wi h he DP noise addi ion, and does p i acy accoun ing
ia he ma ix mechanism (Zhang e al., 2024). This can be shown o imp o e he p i acy-u ili y ade-o
compa ed o s anda d DP-SGD unde some assump ions, bu is no di ec ly compa ible wi h he secu e
agg ega ion app oach o DDP we ocus on in his wo k.
Looking a somewha mo e closely- ela ed pape s in e ms o he gene al se ing, he e is plen y o wo k
p oposing no el lea ning me hods o FL, assuming sample-le el DP and LDP noise wi h SecAgg o imp o ed
DDP gua an ees. While he exis ing wo k only uses a single local op imiza ion s ep (see, e.g., Heikkilä e al.
2020; Malekzadeh e al. 2021; S e ens e al. 2022; Yang e al. 2023), ou analysis can be le e aged in his
se ing o enable unning mul iple local s eps gene ally o many such me hods wi hou equi ing any o he
changes o he algo i hms.
Ano he clea line o wo k has ocused on in oducing no el disc e e DP mechanisms ha can be used wi h
addi i ely homomo phic enc yp ion echniques, which ypically ope a e on he g oup o in ege s wi h modulo
addi ions. Aga wal e al. (2018) p oposed a binomial mechanism ha p o ides DP using disc e e binomial
noise. Imp o ing on he binomial mechanism, Canonne e al. (2020) p oposed a disc e e Gaussian mechanism,
while Aga wal e al. (2021) in oduced a Skellam mechanism and Chen e al. (2022b) a Poisson-binomial
mechanism, bo h o which imp o e on he disc e e Gaussian, e.g., by being in ini ely di isible dis ibu ions:
he sum o Skellam/Poisson-binomial dis ibu ed andom a iables is ano he Skellam/Poisson-binomial
andom a iable. Ou wo k is no ocused on in oducing new DP mechanisms, bu ou analysis allows
o using many di e en DP noise mechanism. In pa icula , ou analysis allows o using LDP noise wi h
SecAgg including when using in ini ely di isible DP mechanisms, such as he Skellam mechanism, wi h
pseudo-g adien s and ine-g ained DP p o ec ion le el.
While ou main ocus is on p i acy accoun ing wi h SecAgg unde limi ed communica ion budge , he e has
also been conside able e o by he communi y o educe he amoun o equi ed communica ion u he by
applying quan iza ion o he g adien s (Aga wal e al., 2018; Kai ouz e al., 2021; Aga wal e al., 2021; Chen
e al., 2022b; Jin e al., 2020; Chaudhu i e al., 2022; Guo e al., 2023) o by comp essing he upda es sen by
he clien s (T ias cyn e al., 2021; Chen e al., 2022a). In p inciple, any such echnique o comp essing he
model upda es compa ible wi h SecAgg can also be di ec ly combined wi h ou p i acy analysis. In con as ,
bene i ing om g adien quan iza ion is no en i ely s aigh o wa d as in ou case he model upda es a e
pseudo-g adien s and no g adien s. We lea e a de ailed conside a ion and compa ison o he possible me hods
o educing he equi ed communica ion budge beyond wha is possible by pushing mo e op imiza ion s eps
o he clien s o u u e wo k.
In summa y, while many o he con ibu ions ci ed abo e, e.g., no el DP mechanisms, a e no limi ed o
c oss-de ice FL, all he expe imen s and use cases men ioned in he ci ed pape s ha a e compa ible wi h
SecAgg and use mul iple local s eps only conside DDP wi h use - o clien -le el DP in c oss-de ice FL.
In con as , we ocus on mo e ine-g ained DP g anula i ies, namely on sample-le el DP. As we discuss in
Sec ion 3, combining sample-le el DP wi h mul iple local s eps and LDP noise wi h SecAgg o imp o ed
u ili y equi es a no el p i acy analysis. The main aim o his pape is o p o ide such an analysis.
While he cu en ly exis ing heo e ical con e gence bounds o anilla DPFL wi h FedA g do no show any
bene i om inc easing he numbe o local s eps in DPFL (see Malekmohammadi e al. 2024, Theo em
3.2), we empi ically demons a e he u ili y o ou analysis in Sec ion 5 a e s a ing he esul s in Sec ion 4.
Ou esul s clea ly highligh he need o imp o ing he heo e ical analysis o s anda d DPFL o e wha is
shown by Malekmohammadi e al. (2024) o unde s and when inc easing he numbe o local s eps is use ul
(compa e his disag eemen o empi ical esul s and heo y o he discussion by Mishchenko e al. 2022 on he
p o able use ulness o local s eps in anilla non-DP FL).
3 Backg ound
Fede a ed lea ning (FL, McMahan e al. 2017; Kai ouz e al. 2019) is a collabo a i e lea ning se ing, whe e
he pa icipan s include a cen al se e and clien s holding some da a. On each FL ound, he se e chooses
a g oup o clien s o an upda e and sends hem he cu en model pa ame e s. The chosen clien s upda e
hei local model pa ame e s by aking some amoun o op imiza ion s eps using only hei own local da a,
3
and hen send an upda e back o he se e . The se e hen agg ega es he clien -speci ic con ibu ions o
upda e he global model. We use he s anda d ede a ed a e aging upda e ule: assuming w.l.o.g. ha clien s
i
= 1
, . . . , N
ha e been selec ed a FL ound
, and ha clien
i
sends an upda e ∆
( )
i
(pseudo-g adien ), he
upda ed global model θ is gi en by
θ =θ −1+1
N
N
X
i=1
∆( )
i.(1)
3.1 Di e en ial P i acy
We wan o gua an ee p i acy o he ained model w. . . he aining da a, o which we use di e en ial
p i acy (DP). W i ing he space o possible da a se s as X∗:= ∪n∈NXn, we ha e he ollowing:
De ini ion 3.1. (Dwo k e al., 2006b;a) Le
ε >
0and
δ∈
[0
,
1]. A andomised algo i hm
A
:
X∗→ O
is
(ε, δ)-DP i o e e y x, x′∈ X∗:x≃x′, and e e y measu able se E⊂ O,
P(A(x)∈E)≤eεP(A(x′)∈E) + δ,
whe e
≃
is a neighbou hood ela ion.
A
is igh ly (
ε, δ
)-DP, i he e does no exis
δ′< δ
such ha
A
is (
ε, δ′
)-DP. When
δ
= 0, we w i e
ε
-DP and call i pu e DP. The mo e gene al case (
ε, δ
)-DP is called
app oxima e DP (ADP).
De ini ion 3.1 can be equi alen ly s a ed as a bound on he so-called hockey-s ick di e gence:
De ini ion 3.2. Le α > 0. The hockey-s ick di e gence be ween dis ibu ions P, Q is gi en by
Hα(P∥Q) := E ∼Q dP
dQ( )−α+!=E ∼Q p( )
q( )−α+!,(2)
whe e [
a
]
+
:=
max
(
a,
0),
dP
dQ
is he Radon-Nikodym de i a i e, and
p, q
a e he densi ies o
P, Q
, espec i ely.
In he es o his pape , we assume ha all he ele an densi ies exis s.
I has been shown ha a andomised algo i hm
A
is (
ε, δ
)-DP i
supx≃x′Heε
(
A
(
x
)
∥A
(
x′
))
≤δ
(Ba he e al.,
2013; Ba he & Olmedo, 2013).
Ou main esul s do no depend on he exac neighbou hood de ini ion, bu in all he expe imen s we use he
add/ emo e ela ion o unbounded DP, ha is,
x, x′∈ X∗
a e neighbou s, i
x
can be ans o med in o
x′
by adding o emo ing a single p o ec ed uni om
X
. Fo he p o ec ion g anula i y, al hough his is no
a s ic limi a ion, we ocus on he common sample-le el DP, i.e., a single p o ec ed uni co esponds o a
single sample o da a. As no ed ea lie , ou analysis could be ad an ageous o any hing mo e ine-g ained
han clien -le el DP, e.g., elemen -le el DP (Asi e al., 2019) o e en o indi idual-le el when he e a e mo e
han one indi idual in he clien s’ local da a.
We also make use o domina ing pai s:
De ini ion 3.3. (Zhu e al., 2022) A pai o dis ibu ions (
P, Q
)is a domina ing pai o a s ochas ic
algo i hm A, i o all α≥0
sup
x,x′∈X∗:x≃x′
Hα(A(x)∥A(x′)) ≤Hα(P∥Q),(3)
whe e Hαis he hockey-s ick di e gence (De ini ion 3.2).
3.2 P oblem wi h Local S eps in DPFL wi h SecAgg
While DP o e s s ic p i acy p o ec ion, i comes a he cos o educed model u ili y. This is especially
ue in he local DP (LDP) se ing, whe e each clien p o ec s i s own da a independen ly o any o he pa y
(Kasi iswana han e al., 2008). One well-known echnique o imp o e model u ili y in DPFL has been o
u ilise secu e agg ega ion (SecAgg) o u n LDP gua an ees in o dis ibu ed DP (DDP) gua an ees ha
4
depend on mul iple clien s (see, e.g., Kai ouz e al. 2019).
3
In e ec , his means ha when he clien s a e able
o collabo a e, hey can join ly scale he locally used noise le els o each a a ge DDP gua an ee esul ing
in imp o ed u ili y compa ed o he LDP gua an ees. E en i he clien s do no explici ly collabo a e on
calib a ing he noise le els, he DDP gua an ees esul ing om using SecAgg will ne e heless be s onge
compa ed o he LDP gua an ees eached by each clien indi idually.
Howe e , nai ely combining ine-g ained DP p o ec ion wi h SecAgg o DDP uns in o p oblems, as we
demons a e in he es o his sec ion. Fo con enience, we gi e he pseudo-code o unning s anda d DPFL
wi h sample- and use -le el DP p o ec ion in Appendix A.2.
S a ing wi h he unp oblema ic case o clien -le el DP, w i ing
TA
o an ideal us ed agg ega o and using
he well-known Gaussian mechanism (Dwo k e al., 2006a) o simplici y, one can ge DDP gua an ees o
any numbe o local op imiza ion s eps wi h he ollowing upda e:
θ =θ −1+1
NTA N
X
i=1
clipC(∆( )
i) + ξ( )
i!,(4)
whe e he sum inside
TA
is done by a us ed agg ega o ,
clipC
ensu es ha each clien -speci ic upda e
has
ℓ2
-no m bounded by he cons an
C
, and
ξ( )
i
is Gaussian noise s. .
PN
i=1 ξ( )
i
gi es he DDP p o ec ion
le el we a e aiming o . As he clipping and noise a e applied di ec ly o he upda ed weigh s a e he local
op imiza ion has inished, he p i acy p o ec ion is no a ec ed by he numbe o local op imiza ion s eps
clien iis using o a i e a ∆( )
ibe o e applying DP.
The e is also a simple app oach ha wo ks wi h mo e ine-g ained g anula i ies, when he clien s use a single
local op imiza ion s ep wi h common lea ning a e
γ
and, o example, s anda d DP s ochas ic g adien
descen (DP-SGD, Song e al. 2013) again u ilising Gaussian noise: we can ake ∆
( )
i
=
−γ
(
g( )
i
+
ξ( )
i
), whe e
g( )
i
is a sum o clipped pe -uni g adien s (e.g. pe -sample o sample-le el DP) om clien
i
, o ha e he
upda e
θ =θ −1−1
NTA N
X
i=1
γ(g( )
i+ξ( )
i)!.(5)
Looking a he sum in Equa ion 5, since each pe -uni g adien has a common bounded no m and Gaussian
noise is in ini ely di isible, i.e., he summed-up noise is ano he Gaussian, we can calcula e he esul ing
p i acy wi h s anda d echniques (see, e.g., Mi ono e al. 2019; Koskela e al. 2020; Zhu e al. 2022). Now, i
one ies o use he same easoning wi h sample-le el DP using
S >
1local op imiza ion s eps, he p oblem is
ha he sensi i i y o he pe -sample clipped g adien s when summed o e he local s eps inc eases wi h
S
:
assuming ∥gi,s∥2≤C, s = 1, . . . , S implies ha ∥PS
s=1 gi,s∥2≤SC ( iangle-inequali y).4
In o he wo ds, ying o scale he noise o e mul iple local op imiza ion s eps nai ely ends up scaling he
que y sensi i i y linea ly wi h he o al numbe o s eps, while he ob ious p oblem in using only a single
s ep pe FL ound is ha he numbe o communica ion ounds is ypically one o he main bo lenecks in
FL (Kai ouz e al., 2019).
3.3 T us Model
In his pape , we assume an hones -bu -cu ious (hbc) se e and ha all he clien s a e ully hones . The
la e assump ion can be easily gene alised o allow o hbc clien s wi h some weakening o he ele an
p i acy bounds: wi h
N
(non-colluding) hbc clien s, since any clien could po en ially emo e i s own noise
om he agg ega ed esul s, he noise om he o he
N−
1clien s needs o gua an ee he a ge DP le el. In
e ec , o allow o all hbc clien s, we would need o scale up he noise le el somewha (see, e.g., Heikkilä
e al. 2017 o a discussion on he noise scaling and o o mal p oo s).
3
He e, we dis inguish be ween local DP (LDP, each clien gua an ees DP o i ’s own con ibu ion locally and in isola ion),
dis ibu ed DP (DDP, a g oup o clien s join ly con ibu e o he esul ing DP gua an ees agains ou side ad e sa ies wi hou a
us ed cen al pa y) and cen alised DP (CDP, a us ed pa y gua an ees DP o e e yone).
4
No e ha , as al eady men ioned in Sec ion 2, o he special case o Gaussian noise wi h a us ed se e , he analysis o
Noble e al. (2022) shows ha we can indeed a oid he p oblem and ge alid DDP gua an ees ha imp o e on he LDP case.
5

In p inciple, he same echnique can also p o ec agains p i acy h ea s in he case o including some ully
malicious clien s in he p o ocol (i.e, simply scale he noise so ha he hbc clien s a e enough o gua an ee
he equi ed DP le el). Howe e , in his case he equi ed le el o ex a noise will inc ease quickly wi h he
numbe o malicious clien s leading o hea ie u ili y loss. Wi h malicious clien s, he e would also be no
gua an ee ha he lea ning algo i hm e mina es p ope ly.
4 P i acy Accoun ing o Mul iple Local S eps Using a T us ed Agg ega o
Conside s anda d FL se ing wi h
M
clien s and clien
i
holding some local da a
xi
. On FL ound
,
N
clien s a e selec ed o upda ing by he se e , w.l.o.g. assumed o be clien s
i
= 1
, . . . , N
. Each selec ed clien
i
ecei es he cu en model pa ame e s
θ( −1)
om he se e , hen uns
S
local op imiza ion s eps using
DP-SGD wi h cons an lea ning a e
γ
, and inally sends an upda e o he se e ia a us ed agg ega o
TA:
∆( )
i=θ( )
i−θ( −1) =−
S
X
s=1
γ (g( )
i,s +ξ( )
i,s ),(6)
whe e we w i e
g( )
i,s
o he pe -uni clipped g adien s o clien
i
a local s ep
s
, and
ξ( )
i,s
o he DP noise.
A e ecei ing all he messages ia he us ed agg ega o , he se e upda es he global model using FedA g:
θ( )=θ( −1) +1
N
TA(
N
X
i=1
∆( )
i).(7)
In he es o his sec ion we s a e ou main esul s: we show ha unde some assump ions we can accoun
o p i acy in FL by looking a he local op imiza ion s eps while bene i ing om he noise added by all he
clien s on each s ep, e en i he e is no communica ion be ween he clien s du ing he local op imiza ion bu
only a single us ed agg ega ion a he end o he ound o upda e he global model pa ame e s.
W.l.o.g. om now on we d op he FL ound index
and simply w i e, e.g.,
N
ins ead o
N
o he numbe o
upda ing clien s. Since he global upda es do no access any sensi i e da a, once we can do p i acy accoun ing
o a single FL ound, which is he main opic in he es o his sec ion, gene alising o
T
FL ounds can be
done in a s aigh o wa d manne (see Appendix A.5).
In he ollowing, we assume ha all clien s ha e access o an ideal us ed agg ega o , and ha all sums a e
calcula ed by calling he us ed agg ega o . We commen on mo e ealis ic implemen a ions in Appendix A.4
a e s a ing ou main esul s.
We make he ollowing assump ions h oughou his sec ion:
Assump ion 4.1. Le
xi∈ X∗, i
= 1
, . . . , N
. We w i e
x
=
∪N
i=1xi
, and assume ha
xi∩xj
=
∅
o
e e y
i
=
j
, i.e., he e a e no o e lapping samples in di e en clien s’ local da a se s. We a e in e es ed in
ixed-leng h op imiza ion uns o
S
local s eps (common o all clien s), which leads o ( ixed-leng h) adap i e
sequen ial composi ion o p i acy accoun ing (see e.g. Roge s e al. 2016; Zhu e al. 2022). We assume all
clien s use he same lea ning a e
γ
and no m clipping wi h cons an
C
when applicable. We also assume
ha all local DP mechanisms
A(s)
i, s
= 1
, . . . , S, i
= 1
, . . . , N
a e DP w. . . he i s a gumen o any gi en
auxilia y alues (which we gene ally do no w i e ou explici ly).
No e ha we conside how o loosen many o hese assump ions in Appendix A.3.
No all possible DP mechanisms migh allow o imp o ed DDP gua an ees ia simple agg ega ion. Fo
con enience, in De ini ion 4.2 we de ine a amily o sui able mechanisms, which we call sum-domina ing:
De ini ion 4.2 (Sum-domina ing mechanism).Le
A,Ai
:
X∗→ O, i
= 1
, . . . , N
be andomised algo i hms.
We call Aasum-domina ing mechanism w. . . Ai, i = 1, . . . , N, i
sup
x≃x′
Hα N
X
i=1 Ai(xi)∥
N
X
i=1 Ai(x′
i)!≤sup
x≃x′
Hα(A(x)∥A(x′)),(8)
6
whe e
Hα
is he hockey-s ick di e gence, and
≃
is he DP neighbou hood ela ion. I addi ionally
supx≃x′Hα(A(x)∥A(x′)) <maxisupx≃x′Hα(Ai(xi)∥Ai(x′
i))
, hen we call
A
ap ope ly sum-domina ing
mechanism.
Looking a De ini ion 4.2, in p ac ice he in e es ing mechanisms o us will be he p ope ly sum-domina ing
mechanisms, as hese ha e imp o ed p i acy bounds compa ed o any single mechanism
Ai
in isola ion.
Howe e , om he ollowing esul s only Theo em 4.7 is speci ic o p ope sum-domina ing mechanisms.
Conside ing conc e e mechanisms ha a e p ope ly sum-domina ing, simple examples a e gi en by he
exis ing DP mechanisms ha use in ini ely di isible noise (see De ini ion A.1), such as Skellam (Valo ich &
Aldà, 2017; Aga wal e al., 2021), Poisson-binomial (Chen e al., 2022b), as well as he ubiqui ous con inuous
Gaussian (Dwo k e al., 2006a):5
Example 4.3 (Gaussian mechanism).Assume
Ai
is a Gaussian mechanism wi h noise co a iance
C2σ2
i·Id
and
has bounded sensi i i y
C
. Since he no mal dis ibu ion is in ini ely di isible (De ini ion A.1), he combined
mechanism
A
=
PN
i=1 Ai
is ano he Gaussian wi h sensi i i y
C
and noise co a iance
C2
(
PN
i=1 σ2
i
)
·Id
.
Finally, due o well-known exis ing esul s (see e.g. Meise & Mohammadi 2018; Koskela e al. 2020; Zhu
e al. 2022), a ( igh ly) domina ing pai o dis ibu ions (
P, Q
)in he sense o De ini ion 3.3 o he sum-
domina ing mechanism
A
is gi en by a pai o 1d Gaussians wi h means
µP
= 0
, µQ
= 1, and a iances
σ2
P
=
σ2
Q
=
PN
i=1 σ2
i
. As he DP gua an ees o he Gaussian imp o e when inc easing he a iance (Balle &
Wang, 2018), Ais a p ope ly sum-domina ing mechanism w. . . o Ai, i = 1, . . . , N (De ini ion 4.2).
Nex , we conside composing a sum-domina ing mechanisms o e
S
local s eps. This allows us o accoun o
he o al p i acy when doing mo e han one local op imiza ion s ep:
Lemma 4.4. Assume
A(s)
is a sum-domina ing mechanism w. . .
A(s)
i, i
= 1
, . . . , N
o e e y
s
= 1
, . . . , S
.
Then he composi ion o he sum-domina ing mechanisms (A(1),...,A(S))domina es he composi ion
N
X
i=1 A(1)
i,...,
N
X
i=1 A(S)
i!.(9)
P oo . Fo any s∈ {1, . . . , S}, we immedia ely ha e
sup
x≃x′
Hα N
X
i=1 A(s)
i(xi)∥
N
X
i=1 A(s)
i(x′
i)!≤sup
x≃x′
HαA(s)(x)∥A(s)(x′)(10)
by de ini ion o
A
(De ini ion 4.2). The claim he e o e ollows immedia ely om (Zhu e al., 2022, Theo em 10).
Conside ing Lemma 4.4, in ou case i essen ially says ha o accoun o unning
S
local op imiza ion s eps,
i is enough o ind a sum-domina ing mechanism o each s ep sepa a ely.
Wi h he nex esul gi en in Lemma 4.5, we can connec he p e ious esul s wi h he o m o ou pu we ge
om ac ually unning local op imiza ion in FL:
Lemma 4.5. Assume ha eleasing he ec o
N
X
i=1 A(1)
i(xi),...,
N
X
i=1 A(S)
i(xi)!(11)
sa is ies (ε, δ)-DP. Then eleasing
N
X
i=1
S
X
s=1 A(s)
i(xi)(12)
5
Disc e e Gaussian (Canonne e al., 2020) is no in ini ely di isible, bu is close-enough ha a p ope ly sum-domina ing
mechanism can s ill be ound in many p ac ical se ings, see Kai ouz e al. (2021). In such cases, he inequali y in Equa ion 8
could always be s ic , whe eas o he in ini ely di isible noise mechanisms men ioned in he ex i can be w i en as an equali y.
We no e ha e en in his case, howe e , w i ing Equa ion 8 wi h inequali y is necessa y o a oid nonsensical limi a ions, such as
ha ing a DP mechanism ha sa is ies De ini ion 4.2 wi h a gi en δwhile no sa is ying i o any δ′> δ.
7
also sa is ies (ε, δ)-DP.
P oo .
Due o he pos -p ocessing immuni y o DP (see, e.g., Dwo k & Ro h 2014), he assump ion implies
ha eleasing
S
X
s=1
N
X
i=1 A(s)
i(xi)(13)
sa is ies (
ε, δ
)-DP, and by exchanging he o de o summa ion he claim ollows. No e ha all he mechanisms
a e assumed o be DP w. . . hei i s a gumen o any gi en auxilia y alue, which allows us o do he
exchange wi hou a ec ing p i acy (in he con ex o FL, we e ec i ely swi ch om communica ing be ween
each local s ep o unning all local s eps and hen communica ing).
Taken oge he , De ini ion 4.2 along wi h Lemmas 4.4 & 4.5 allow us o compose DP mechanisms o DDP
gua an ees using LDP noise om mul iple clien s. In ou main esul gi en as Theo em 4.6, we show ha
wi h (p ope ly) sum-domina ing mechanisms each clien can un DP-SGD wi h se e al local s eps and LDP
noise while ecei ing (imp o ed) DDP gua an ees when communica ing he upda e ia a us ed agg ega o .
Theo em 4.6. Assume
N
clien s use local noise mechanisms
A(s)
i, i
= 1
, . . . , N
as in De ini ion 4.2 o each
local g adien op imiza ion s ep
s
= 1
, . . . , S
, and ha he inal agg ega ed upda e
PN
i=1
∆
i
is eleased ia an
ideal us ed agg ega o . Then deno ing he sum-domina ing mechanism o s ep
s
by
A(s)
, he que y elease
sa is ies (ε(δ), δ)-DP o any δ∈[0,1], when ε(δ)is gi en by accoun ing o eleasing he ec o
A(1)(x),...,A(S)(x),
whe e x=∪N
i=1xi.
P oo .
Fo p i acy accoun ing, assuming all sums a e done by us ed agg ega o
TA
, eleasing he agg ega ed
upda e TA(PN
i=1 ∆i)co esponds o eleasing he esul
−γ
N
X
i=1
S
X
s=1 A(s)
i(xi;zi,s),
whe e each mechanism includes a mapping ha maps he local samples o he clipped pe -uni g adien s as
well as he DP noise, and
zi,s
a e auxilia y alues (e.g., s a e a e he p e ious s ep).
6
Since all mechanisms
a e assumed o be DP w. . . he i s a gumen o any auxilia y alue, he auxilia y alues do no a ec he
DP gua an ees, and hence we do no w i e hem explici ly in he ollowing.
F om Lemma 4.5 i ollows ha alid DP gua an ees can be es ablished by accoun ing o he elease o he
ec o
PN
i=1 A(1)
i(xi),...,PN
i=1 A(S)
i(xi)
. Fu he mo e, De ini ion 4.2 implies ha o any s ep
s∈1, . . . , S
,
he sum-domina ing mechanism
A(s)
domina es
PN
i=1 A(s)
i
, and he e o e by Lemma 4.4 he claim ollows.
Conside ing p ope ly sum-domina ing mechanisms based on in ini ely di isible noise, we also ha e he
ollowing heo em on scaling he noise wi h mul iple clien s:
Theo em 4.7. In addi ion o he assump ions o Theo em 4.6, assume ha each clien uses a common noise
le el
σLDP
o some mechanism based on in ini ely di isible addi i e noise o each local s ep s. . he DP
noise
ξLDP
o any gi en local mechanism
A(s)
i, i
= 1
, . . . , N, s
= 1
, . . . , S
, has co a iance
σ2
LDP
Σ
d
o some
Σd. Then o gua an eeing gi en (ε(δ), δ)-DDP, he equi ed σLDP scales as O(1
√N).
6
Fo example, wi h s anda d DP-SGD, sample-le el DP and con inuous Gaussian noise,
PN
i=1
∆
i
=
−γPN
i=1 PS
s=1
(
gi,s
+
ξi,s), whe e gi,s a e (sums o ) clipped pe -sample g adien s and ξi,s a e he pe -s ep Gaussian noises.
8
P oo .
Le
σ2
Σbe such ha when accoun ing o he o al o
S
local s eps, he composi ion o he sum-
domina ing mechanisms
A(1),...,A(S)
is (
ε
(
δ
)
, δ
)-DDP. F om De ini ion A.1 i immedia ely ollows ha
he equi ed o al noise can be b oken in o pa s while s aying in he same noise amily o gi e
σ2=
N
X
i=1
σ2
LDP ⇔σ2
LDP =σ2
N⇔σLDP =σ
√N.
We no e ha simila scaling esul s as gi en in Theo em 4.7 o speci ic cases like he Gaussian mechanism
a e al eady well-known in he exis ing li e a u e (see, e.g., Noble e al. 2022; Heikkilä e al. 2017).
Finally, conside ing igh ness o he p i acy accoun ing done based on Theo em 4.6, i is wo h no ing ha
since he accoun ing elies on Lemma 4.5, which assumes eleasing each local s ep while he ac ually eleased
que y answe is a sum o e he local s eps, he esul ing p i acy bound need no be igh bu an uppe
bound on he p i acy budge . Howe e , his ma ches he usual DP-SGD p i acy accoun ing analysis (see e.g.
Mi ono e al. 2019; Koskela e al. 2020), which ypically needs o accoun o each local op imiza ion s ep
due o echnical easons e en i only he inal model is eleased. In he gene al case, i has also been shown
ha hiding he in e media e s eps does no b ing any p i acy bene i s compa ed o he pe -s ep accoun ing
(Annamalai, 2024).
5 Expe imen s
Se up and Mo i a ion: Ou chosen se ings y o mimic a ypical c oss-silo FL se up: he e a e a limi ed
numbe o clien s, each ha ing a smallish local da abase. The clien s ha e enough local compu e o un
op imiza ion on he chosen model, while he numbe o se e -clien communica ions equi ed o upda ing
he global model a e he main bo le-neck. No e ha his bo leneck will eme ge e en wi h la ge ac ual
o ganisa ions aining models wi h b oadband connec ions, when he model size is la ge-enough, e.g., when
aining ounda ion models (Bommasani e al., 2021). This is especially ue when using SecAgg p o ocols,
since he cos o unning a eal SecAgg algo i hm p esen s signi ican compu e and communica ion o e heads
e en wi h he e icien p o ocols discussed in Appendix A.4. In his se ing, i makes sense o y and push
mo e op imiza ion s eps o he clien s while educing he numbe o global upda es (FL ounds). We also
assume ha he clien s send hei local upda es ia some us ed agg ega o (which we only assume and do
no implemen in p ac ice in he expe imen s. Howe e , we do use only disc e e DP mechanisms compa ible
wi h s anda d SecAgg algo i hms in all he expe imen s). Fo mo e de ails on all he expe imen s, see
Appendix A.6. The code o ep oducing he esul s will be eleased on Gi Hub.7
CNN on Fashion-MNIST: We i s ain a small con olu ional neu al ne wo k (CNN) on Fashion MNIST
da a (Xiao e al., 2017), ha is dis ibu ed iid among 10 clien s. We use he CNN a chi ec u e in oduced by
Pape no e al. (2021); T amè & Boneh (2021). Figu e 1 shows he mean wi h s anda d e o o he mean
(SEM) o e 5 epea s o es accu acy and loss wi h DP-SGD using Skellam noise (Aga wal e al., 2021)
wi h 32 bi s g adien quan iza ion,.i.e., wi hou quan iza ion. We ain he model o 20 FL ounds and
a ying numbe o local s eps. Compa ing he esul s o 1local s ep as opposed o 1local epoch (
≃
11 s eps,
bu wi h di e en sampling ac ion compa ed o baseline), i is e iden ha being able o ake mo e local
op imiza ion s eps (as allowed by Theo em 4.6) b ings conside able u ili y bene i s unde ixed p i acy and
communica ion budge s.
Linea Classi ie on T ans o med CIFAR-10: O e all, assuming a ixed p i acy budge , we migh
expec he bene i s om being able o un mo e local s eps o be mo e accen ua ed wi h mo e complex models
and e y limi ed communica ion budge , while o simple-enough models and mo e FL ounds e en a ew
local s eps could lead o good esul s. To es o wha ex en his is ue o simple ye s ill use ul models,
we conside CIFAR-10 da a (K izhe sky, 2009). Simila o T amè & Boneh (2021), we ake a ResNeX -29
model (Xie e al., 2017) p e- ained wi h CIFAR-100 da a (K izhe sky, 2009), emo e he inal classi ie , and
7h ps://gi hub.com/mixheikk/DPFL-wi h-SecAgg-pape
9
olume 151 o P oceedings o Machine Lea ning Resea ch, pp. 10110–10145. PMLR, 2022. URL
h ps:
//p oceedings.ml .p ess/ 151/noble22a.h ml.
Nicolas Pape no , Abh adeep Thaku a, Shuang Song, S e e Chien, and Úl a E lingsson. Tempe ed sigmoid
ac i a ions o deep lea ning wi h di e en ial p i acy. In Thi y-Fi h AAAI Con e ence on A i icial
In elligence, AAAI 2021, Thi y-Thi d Con e ence on Inno a i e Applica ions o A i icial In elligence,
IAAI 2021, The Ele en h Symposium on Educa ional Ad ances in A i icial In elligence, EAAI 2021,
Vi ual E en , Feb ua y 2-9, 2021, pp. 9312–9321. AAAI P ess, 2021. URL
h ps://ojs.aaai.o g/index.
php/AAAI/a icle/ iew/17123.
Ryan M. Roge s, Salil P. Vadhan, Aa on Ro h, and Jona han R. Ullman. P i acy odome e s and il e s: Pay-as-
you-go composi ion. In Daniel D. Lee, Masashi Sugiyama, Ul ike on Luxbu g, Isabelle Guyon, and Roman
Ga ne (eds.), Ad ances in Neu al In o ma ion P ocessing Sys ems 29: Annual Con e ence on Neu al In o -
ma ion P ocessing Sys ems 2016, Decembe 5-10, 2016, Ba celona, Spain, pp. 1921–1929, 2016. URL
h ps:
//p oceedings.neu ips.cc/pape /2016/hash/58c54802a9 b9526cd0923353a34a7ae-Abs ac .h ml.
Césa Saba e , Au élien Belle , and Jan Ramon. An accu a e, scalable and e i iable p o ocol o ede a ed
di e en ially p i a e a e aging. Mach. Lea n., 111(11):4249–4293, 2022. doi: 10.1007/S10994-022-06267-9.
URL h ps://doi.o g/10.1007/s10994-022-06267-9.
Reza Shok i, Ma co S ona i, Congzheng Song, and Vi aly Shma iko . Membe ship in e ence a acks
agains machine lea ning models. In 2017 IEEE Symposium on Secu i y and P i acy, SP 2017, San Jose,
CA, USA, May 22-26, 2017, pp. 3–18. IEEE Compu e Socie y, 2017. doi: 10.1109/SP.2017.41. URL
h ps://doi.o g/10.1109/SP.2017.41.
Jinhyun So, Basak Güle , and Ami Salman A es imeh . Tu bo-agg ega e: B eaking he quad a ic agg ega ion
ba ie in secu e ede a ed lea ning. IEEE J. Sel. A eas In . Theo y, 2(1):479–489, 2021. doi: 10.1109/
JSAIT.2021.3054610. URL h ps://doi.o g/10.1109/JSAIT.2021.3054610.
Shuang Song, Kamalika Chaudhu i, and Anand D. Sa wa e. S ochas ic g adien descen wi h di e en ially
p i a e upda es. In IEEE Global Con e ence on Signal and In o ma ion P ocessing, GlobalSIP 2013, Aus in,
TX, USA, Decembe 3-5, 2013, pp. 245–248. IEEE, 2013. doi: 10.1109/GLOBALSIP.2013.6736861. URL
h ps://doi.o g/10.1109/GlobalSIP.2013.6736861.
Thomas S einke. Composi ion o di e en ial p i acy & p i acy ampli ica ion by subsampling. CoRR,
abs/2210.00597, 2022. URL h ps://doi.o g/10.48550/a Xi .2210.00597.
Timo hy S e ens, Ch is ian Skalka, Ch is elle Vincen , John Ring, Samuel Cla k, and Joseph P. Nea .
E icien di e en ially p i a e secu e agg ega ion o ede a ed lea ning ia ha dness o lea ning wi h
e o s. In Ke in R. B. Bu le and Ku Thomas (eds.), 31s USENIX Secu i y Symposium, USENIX
Secu i y 2022, Bos on, MA, USA, Augus 10-12, 2022, pp. 1379–1395. USENIX Associa ion, 2022. URL
h ps://www.usenix.o g/con e ence/usenixsecu i y22/p esen a ion/s e ens.
Flo ian T amè and Dan Boneh. Di e en ially p i a e lea ning needs be e ea u es (o much mo e da a). In
9 h In e na ional Con e ence on Lea ning Rep esen a ions, ICLR 2021, Vi ual E en , Aus ia, May 3-7,
2021. OpenRe iew.ne , 2021. URL h ps://open e iew.ne / o um?id=YTWG pFOQD-.
Aleksei T ias cyn, Ma hias Reisse , and Ch is os Louizos. DP-REC: p i a e & communica ion-e icien
ede a ed lea ning. CoRR, abs/2111.05454, 2021. URL h ps://a xi .o g/abs/2111.05454.
S acey T uex, Na halie Ba acaldo, Ali Anwa , Thomas S einke, Heiko Ludwig, Rui Zhang, and Yi Zhou.
A hyb id app oach o p i acy-p ese ing ede a ed lea ning. In P oceedings o he 12 h ACM Wo kshop
on A i icial In elligence and Secu i y, AISec’19, pp. 1–11, New Yo k, NY, USA, 2019. Associa ion o
Compu ing Machine y. ISBN 9781450368339. doi: 10.1145/3338501.3357370. URL
h ps://doi.o g/10.
1145/3338501.3357370.
S acey T uex, Ling Liu, Ka Ho Chow, Mehme Em e Gu soy, and Wenqi Wei. Ldp- ed: ede a ed lea ning
wi h local di e en ial p i acy. In Aa on Yi Ding and Richa d Mo ie (eds.), P oceedings o he 3 d
16

In e na ional Wo kshop on Edge Sys ems, Analy ics and Ne wo king, EdgeSys@Eu oSys 2020, He aklion,
G eece, Ap il 27, 2020, pp. 61–66. ACM, 2020. doi: 10.1145/3378679.3394533. URL
h ps://doi.o g/10.
1145/3378679.3394533.
Filipp Valo ich and F ancesco Aldà. Compu a ional di e en ial p i acy om la ice-based c yp og aphy. In
Je zy Kaczo owski, Jose Piep zyk, and Jacek Pomykala (eds.), Numbe -Theo e ic Me hods in C yp ology -
Fi s In e na ional Con e ence, NuTMiC 2017, Wa saw, Poland, Sep embe 11-13, 2017, Re ised Selec ed
Pape s, olume 10737 o Lec u e No es in Compu e Science, pp. 121–141. Sp inge , 2017. doi: 10.1007/
978-3-319-76620-1 _8. URL h ps://doi.o g/10.1007/978-3-319-76620-1_8.
Han Xiao, Kashi Rasul, and Roland Vollg a . Fashion-MNIST: a no el image da ase o benchma king
machine lea ning algo i hms. CoRR, abs/1708.07747, 2017. URL h p://a xi .o g/abs/1708.07747.
Saining Xie, Ross B. Gi shick, Pio Dollá , Zhuowen Tu, and Kaiming He. Agg ega ed esidual ans o ma ions
o deep neu al ne wo ks. In 2017 IEEE Con e ence on Compu e Vision and Pa e n Recogni ion,
CVPR 2017, Honolulu, HI, USA, July 21-26, 2017, pp. 5987–5995. IEEE Compu e Socie y, 2017. doi:
10.1109/CVPR.2017.634. URL h ps://doi.o g/10.1109/CVPR.2017.634.
Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, and Yinzhi Cao. P i a eFL: Accu a e, di e en ially p i a e
ede a ed lea ning ia pe sonalized da a ans o ma ion. In 32nd USENIX Secu i y Symposium (USENIX
Secu i y 23), pp. 1595–1612, Anaheim, CA, 2023. USENIX Associa ion. ISBN 978-1-939133-37-3. URL
h ps://www.usenix.o g/con e ence/usenixsecu i y23/p esen a ion/yang-yuchen.
And ew C Yao. P o ocols o secu e compu a ions. In SFCS ’82, pp. 160–164. IEEE Compu e Socie y, 1982.
doi: 10.1109/SFCS.1982.88. URL h p://dx.doi.o g/10.1109/SFCS.1982.88.
Samuel Yeom, I ene Giacomelli, Ma F ed ikson, and Somesh Jha. P i acy isk in machine lea ning: Analyzing
he connec ion o o e i ing. In 31s IEEE Compu e Secu i y Founda ions Symposium, CSF 2018, Ox o d,
Uni ed Kingdom, July 9-12, 2018, pp. 268–282. IEEE Compu e Socie y, 2018. doi: 10.1109/CSF.2018.00027.
URL h ps://doi.o g/10.1109/CSF.2018.00027.
Ashkan Youse pou , Igo Shilo , Alexand e Sablay olles, Da ide Tes uggine, Ka hik P asad, Mani Malek,
John Nguyen, Sayan Ghosh, Akash Bha adwaj, Jessica Zhao, G aham Co mode, and Ilya Mi ono .
Opacus: Use - iendly di e en ial p i acy lib a y in PyTo ch. CoRR, abs/2109.12298, 2021. URL
h ps://a xi .o g/abs/2109.12298.
Jiaojiao Zhang, Linglingzhi Zhu, Dominik Fay, and Mikael Johansson. Locally di e en ially p i a e online
ede a ed lea ning wi h co ela ed noise. CoRR, abs/2411.18752, 2024. URL
h ps://doi.o g/10.48550/
a Xi .2411.18752.
Yuqing Zhu, Jinshuo Dong, and Yu-Xiang Wang. Op imal accoun ing o di e en ial p i acy ia cha ac e is ic
unc ion. In Gus au Camps-Valls, F ancisco J. R. Ruiz, and Isabel Vale a (eds.), In e na ional Con e ence
on A i icial In elligence and S a is ics, AISTATS 2022, 28-30 Ma ch 2022, Vi ual E en , olume 151
o P oceedings o Machine Lea ning Resea ch, pp. 4782–4817. PMLR, 2022. URL
h ps://p oceedings.
ml .p ess/ 151/zhu22c.h ml.
A Appendix
A.1 De ini ions Omi ed om he Main Pape
In he ollowing, we w i e
d
=
o mean equali y o dis ibu ions. Wi h his no a ion, we de ine in ini e di isibili y
in a s anda d way (see, e.g., Felle 1966):
De ini ion A.1 (In ini e di isibili y).A dis ibu ion
F
is in ini ely di isible i o e e y
n
he e exis s a
amily o dis ibu ions Fns. .
Sn
d
=
n
X
i=1
Xi,n,
whe e he andom a iables Sn∼Fand Xi,n ∼Fna e independen o all i= 1, . . . , n.
17
A.2 S anda d Algo i hms Omi ed om he Main Pape
Fo con enience, we gi e he pseudo-code o unning DPFL wi h FedA g when using DP-SGD o ei he
sample- (Algo i hm 2) o use -le el (Algo i hm 3) p o ec ion. We emphasize ha we do no claim any no el
con ibu ion in hese algo i hms, bu hey a e included simply o cla i y he discussion in he main pape .
Algo i hm 1 (DP)FL wi h FedA g: Se e -side (McMahan e al., 2017)
Requi e: Numbe o clien s N, numbe o FL ounds T, ini ial global model pa ame e s θ(0)
1: o ∈[T]do
2: Choose N clien s o upda ing (w.l.o.g. numbe ed 1, . . . , N ) and send he cu en model θ( −1)
3: Recei e pseudo-g adien s ∆( −1)
i, i = 1, . . . , N om he upda ing clien s
4: Upda e he global model:
5: θ( )←θ( −1) +1
N PN
i=1 ∆( −1)
i
6: end o
Ou pu : θT
Algo i hm 2 DPFL: Clien -side wi h sample-le el DP-SGD using Gaussian noise (Song e al., 2013; Abadi
e al., 2016)
Requi e:
Numbe o local op imiza ion s eps
S
, local da ase
xi
, loss unc ion
, lea ning a e
γ
, clipping
cons an C, noise scale σ, Poisson subsampling p obabili y q
1: Clien ecei es cu en model θ om he se e :
2: θ0←θ
3: o s∈[S]do
4: Upda e local model wi h DP-SGD:
5: gi,s ←0
6: Sample miniba ch bs om xiusing Poisson sampling wi h p obabili y q
7: o j∈[|bs|]do
8: Calcula e j h pe -sample g adien in miniba ch, clip and accumula e:
9: ˜gi,s ← ∇ (bs,j;θs−1)
10: gi,s ←gi,s + min 1,C
∥˜gi,s∥2˜gi,s
11: end o
12: Add noise o clipped g adien s and ake a s ep:
13: θ(s)←θ(s−1) −γ(gi,s +Cξ), whe e ξ∼ N(0, σ2Id)and dis model dimensionali y
14: end o
15: ∆←θS−θ0
Ou pu : ∆(Pseudo-g adien s sen o he se e )
18
Algo i hm 3 DPFL: Clien -side wi h use -le el DP using Gaussian noise (McMahan e al., 2018)
Requi e:
Numbe o local op imiza ion s eps
S
, local da ase
xi
, lea ning a e
γ
, clipping cons an
C
, noise
scale σ, Poisson subsampling p obabili y q
1: Clien ecei es cu en model θ om he se e :
2: θ0←θ
3: Upda e local model S imes wi h s anda d SGD:
4: θ(S)← om unning SGD locally o Ss eps wi h lea ning a e γon xi
5: ∆←θS−θ0
6: Clip pseudo-g adien s and add noise:
7: ∆←min1,C
∥∆∥2∆ + Cξ, whe e ξ∼ N 0, σ2Idand dis model dimensionali y
Ou pu : ∆(Pseudo-g adien s sen o he se e )
Essen ially, o his example when aiming o DDP, a FL ound
ou analysis enables choosing he numbe
o local s eps
S
in Algo i hm 2 eely while s ill calib a ing he DP noise by looking a he o al noise le el
PN
i=1 σ2
i
added by each clien sepa a ely on each local s ep (line 13 in Algo i hm 2). No e ha o he special
case o using Gaussian noise, a simila esul has been p esen ed by Noble e al. (2022).
A.3 Loosening Assump ions
In he main pape , as s a ed in Assump ion 4.1, o each FL ound we ha e assumed cons an lea ning a e
γ
,
no m clipping bound
C
, noise le el
σ
, and numbe o local op imiza ion s eps
S
, all o hem sha ed by all
he clien s. Nex , we conside loosening hese assump ions. As be o e, w.l.o.g. we conside only a single FL
ound, and will he e o e omi he index .
Focusing i s on he lea ning a e, we can immedia ely gene alize ou esul s o allow o di e en lea ning
a e
γs
o each local s ep
s
= 1
, . . . , S
: wi h his no a ion, ollowing he easoning o Theo em 4.6, he
agg ega ed upda e om he clien s is gi en by
N
X
i=1
∆i=−
N
X
i=1
S
X
s=1
γsA(s)
i(xi;zi,s),(14)
which can again be seen as pos -p ocessing he ec o
PN
i=1 A(1)
i(xi),...,PN
i=1 A(S)
i(xi)
, so we can again
use Lemma 4.5 o accoun ing wi hou encoun e ing p oblems.
When conside ing clien -speci ic lea ning a es hings can be mo e complica ed. The main issue now is o
ind p ope sum-domina ing mechanisms ha sa is y:
sup
x≃x′
Hα N
X
i=1
γi,sA(s)
i(xi)∥
N
X
i=1
γi,sA(s)
i(x′
i)!≤sup
x≃x′
HαA(s)(x)∥A(s)(x′), s = 1, . . . , S. (15)
As a conc e e example, assume
A(s)
i
is he con inuous Gaussian mechanism wi h sha ed no m clipping
cons an s and noise le els
Ci,s
=
Cs, σi,s
=
σs∀i
. D opping he s ep index
s
o eadabili y, le
γi
=
γ1
li
o some
li>
0
, i
= 2
, . . . , N
. W i ing
gi
o a sum o e he pe -uni clipped g adien s o clien
i
, and
ξi∼ N(0, C2σ2·Id)a single op imiza ion s ep now con ibu es he ollowing e m o he global upda e:
−
N
X
i=1
γi(gi+ξi)(16)
=−γ1 g1+ξ1+
N
X
i=2
gi+ξi
li!(17)
=−γ1 g1+
N
X
i=2
gi
li
+ξ!,(18)
19
whe e
ξ∼ N
(0
, C2σ2
[1 +
PN
i=2 1
l2
i
]
·Id
), which is a sum-domina ing Gaussian mechanism. When accoun ing
o he sum-domina ing mechanism, i has sensi i i y
C∗
=
max{C, C
l2,..., C
lN}
, which in u n gi es noise
a iance (C
C∗)2σ2[1 + PN
i=2 1
l2
i
] o DP.
Simila ly, we could elax he assump ions u he o allow he clien s o use di e en clipping and noise le els
Ci, σi. As be o e, a single op imiza ion s ep can again be w i en in he o m o Equa ion 18, when
ξ∼ N 0,"C2
1σ2
1+
N
X
i=2
C2
iσ2
i
l2
i#·Id!.(19)
Fo global p i acy accoun ing wi h a sum-domina ing Gaussian mechanism, sui able sensi i i y is now gi en
by C∗= max{C1,C2
l2,...,CN
lN}, and he esul ing a iance o accoun ing is PN
i=1(Ciσi
C∗)2.
Assuming clien s ha e di e ing numbe o local s eps, we can y o use some local s eps o he p i acy
analysis un il all clien s ha e he same numbe o s eps S, a e which we can hen use he ea lie esul s.8
As a simple example, assume we ha e 2 clien s unning DP-SGD: clien 1 uns
S
local s eps using no m
clipping cons an
C
and Gaussian mechanism wi h noise a iance
σ2
, while clien 2 uns 2
S
local s eps wi h
clipping
C/
2and Gaussian noise a iance
σ2
. The di e ence now is ha while he clipping is done on each
s ep, om he p i acy accoun ing pe spec i e we can dis ega d some noise and hink ha clien 2 adds noise
only on e e y o he s ep. Looking a he upda e om clien 2, we would hen ha e
∆2=−γ
2S
X
s=1
(g2,s +I[s= 2l, l ∈N]·ξ2,s)(20)
=−γ
S
X
s=1
(g′
2,s +ξ′
2,s),(21)
whe e
g2,s
a e he clipped pe -sample g adien s,
g′
2,s
:=
g2,2s−1
+
g2,2s
,
ξ2,s
a e he noise alues,
ξ′
2,s
:=
ξ2,2s
,
and
I
is he indica o unc ion. Due o he clipping, he sensi i i y o each used s ep can be easily uppe
bounded ia iangle-inequali y:
∥g2,s′∥2
=
∥g2,2s′−1
+
g2,2s′∥2≤ ∥g2,2s′−1∥2
+
∥g2,2s′∥2≤C
. Since Equa ion 21
now has he same numbe o local s eps as clien 1 is aking, we can eadily use he p e ious esul s o enable
p i acy accoun ing o he agg ega ed upda e. Combining he using o local s eps wi h he p e ious no es on
di e ing clipping no m alues, lea ning a es and noise a iances allows us o use ou main esul s in se e al
se ings beyond wha is s a ed in Assump ion 4.1.
As a inal no e, when he clien s use da a subsampling o he local op imiza ion, di e ing local subsampling
p obabili ies can lead o ha ing a ying DP gua an ees be ween he clien s on he global le el due o he
di e en subsampling ampli ica ion e ec s, bu can o he wise be inco po a ed wi h he same analysis we ha e
al eady p esen ed.
A.4 F om Ideal T us ed Agg ega o s o P ac ical SecAgg P o ocols
Fo implemen ing he us ed agg ega o assumed in Theo em 4.6 in p ac ice, i should be no ed ha as
he sum o e
s
is done locally by each clien du ing local op imiza ion, i is always us ed as long as he
indi idual clien s a e, while he sum o e
i
would need o be implemen ed, e.g., using a sui able SecAgg
p o ocol. Se e al such algo i hms a e known, including he ones p oposed by Bell e al. (2020); Bonawi z
e al. (2017); Saba e e al. (2022); So e al. (2021).
Using a SecAgg p o ocol will ypically also place some ex a equi emen s on he DP mechanisms
A(s)
i
, since
he SecAgg algo i hms usually un on elemen s o ini e ings. This p ecludes con inuous noise mechanisms.
A iable al e na i e is o use some sui able disc e e noise mechanism, such as Skellam (Aga wal e al., 2021)
8
Al e na i ely, we could also conside b eaking some local s eps in o se e al pa s. We lea e he de ailed conside a ion o his
app oach o u u e wo k.
20
o Poisson-binomial (Chen e al., 2022b). Howe e , di e ing om he cases conside ed in he ci ed pape s,
since in ou case he clien s send model upda es ins ead o single g adien s, he ini e ing size used in he
SecAgg p o ocol needs o accommoda e he model upda e size: i does no good o use Skellam mechanism
wi h g adien quan iza ion o a small numbe o bi s, i he model weigh s and he esul ing model upda e ∆
i
o clien is ill uses 32 bi loa s.
A.5 P i acy Accoun ing De ails
Fo p i acy accoun ing we u ilize Rényi DP (RDP):
De ini ion A.2. (Mi ono , 2017) Le
α >
1and
ε >
0. A andomised algo i hm
A
:
X∗→ O
is (
α, ε
)-RDP
i o e e y x, x′∈ X∗:x≃x′
Dα(A(x)∥A(x′)) ≤ε,
whe e Dαis he Rényi di e gence o o de α:
Dα(P∥Q) = 1
α−1log E ∼Qp( )
q( )α
.
We do p i acy accoun ing o all he expe imen s based on RDP. Gene ally, we accoun o he p i acy o
each indi idual local op imiza ion s ep using he noise con ibu ions om all he clien s selec ed o a gi en
FL ound. When he clien s use Poisson subsampling o sample miniba ches (we assume each clien uses
he same p obabili y o including any indi idual sample in he miniba ch), we use s anda d RDP p i acy
ampli ica ion esul s. In p ac ice, we use he RDP accoun an implemen ed in Opacus (Youse pou e al.,
2021), as well as bounds o Skellam mechanism by Aga wal e al. (2021) and igh RDP ampli ica ion by
Poisson subsampling (S einke, 2022). We calcula e he p i acy cos o he en i e aining un in RDP, and
hen con e in o ADP using (Mi ono , 2017, P oposi ion 3). No e ha , as is common in DP esea ch, we do
no include he p i acy cos o hype pa ame e uning in he epo ed p i acy budge s (see, e.g., T amè &
Boneh 2021 o some easoning on his p ac ice).
A.6 Expe imen al De ails
All he expe imen al se ings we use sa is y Assump ion 4.1. We use DP-SGD wi h Skellam mechanism o
op imise he local model pa ame e s, and s anda d ede a ed a e aging as he agg ega ion ule o upda ing
he global model in all expe imen s. Fo each cen alised da a se (combining o iginal ain and es se s), we
spli he da a andomly in o equal sha es, which esul s in ha ing almos he same da a dis ibu ion on each
clien . Fo hype pa ame e op imiza ion wi h each da ase , we i s spli each clien s’ da a in e nally in o ain
and es pa s wi h ac ions (.8-.2). Fo uning all hype pa ame e s, we use only he aining ac ion, and
di ide i u he (.7-.3) in o hype pa ame e ain- alida ion. We use Bayesian op imiza ion-based app oach
implemen ed in Weigh s and Biases (Biewald, 2020) o hype pa ame e uning, and simula e FL using Flowe
(Beu el e al., 2020).
In gene al, when uning hype pa ame e s we do 50 hype pa ame e uning uns. Fo each uning un, we
ain he model on hype pa ame e aining ac ion, es on he alida ion ac ion, and y o op imise o
he inal model weigh ed alida ion loss. A e inishing he hype pa ame e uning, we e- ain he model
om sc a ch 5 imes wi h di e en andom seeds wi h he bes ound hype pa ame e s using he en i e
o iginal aining da a and es ing on he es ac ion. We epo he mean and he s anda d e o o he
mean (SEM) in all he igu es. In Figu e 2 we plo he minimum es loss/maximum es accu acy aken o e
he en i e aining un.
Fo he expe imen s wi h Fashion-MNIST (Xiao e al., 2017) and CIFAR-10 (K izhe sky, 2009) da a se s, we
un hype pa ame e uning sepa a ely o each combina ion o numbe o local s eps {1 s ep, 1 epoch}, and
expec ed miniba ch sizes on he g id {64,128,256,512}using Poisson subsampling.
Fo Fashion-MNIST he bes expec ed ba ch sizes ound a e 512 o 1 local epoch, and 128 o 1 local s ep.
Wi h CIFAR-10, due o hea y compu a ional cos o hype pa ame e uning, we use a single expec ed ba ch
size o each con igu a ion o local s eps {1 s ep, 1 epoch} and FL ounds
{
10
,
20
,
40
,
160
}
. Conc e ely, we
21

pick he bes expec ed ba ch size alue om he abo e g id when using Bayesian op imiza ion o une all
hype pa ame e s wi h 20 FL ounds. This esul s in choosing expec ed ba ch size 128 o 1local s ep and 256
o 1local epoch. We hen use hese alues and op imize all o he hype pa ame e s sepa a ely o all o he
FL ound se ings.
Wi h ACS Income da a (Ding e al., 2021), we une all hype pa ame e s o each combina ion o local s eps
{1 s ep, 1 epoch} wi h Poisson subsampling using local sampling p obabili y on he g id
{
0
.
4
,
0
.
2
,
0
.
1
,
0
.
05
}
o 10 FL ounds. We epo esul s on he bes ound local sampling p obabili ies (0.05 o bo h).
Fo ResNeX -29 8x64, we used p e- ained weigh s a ailable om
h ps://gi hub.com/bea paw/
py o ch-classi ica ion
. Ou implemen a ion o he Skellam mechanism is based on he implemen-
a ion om
h ps://gi hub.com/ acebook esea ch/dp_comp ession
(Chaudhu i e al., 2022; Guo e al.,
2023).
Fo Ame ican Communi y Su ey (ACS) Income da a se Ding e al. (2021) we use he da a o all he s a es
and Pue o Rico o 2018. The goal is o p edic whe he an indi idual has income g ea e han $50000.
Ins ead o simula ing da a spli s, we use he inhe en spli s, i.e., we ake each o iginal egion (s a e o Pue o
Rico) o be a clien .
Fo aining all models, we use a small clus e wi h NVIDIA Ti an Xp, and NVIDIA Ti an V GPUs. The
o al compu e ime o all he aining uns (including debugging) o e all GPUs amoun s oughly o 30-60
GPU days.
A.7 Addi ional Resul s
Table 1 shows he esul s om a e aging se e al independen ly ained LDP models as desc ibed in Sec ion 5.
Table 1: Imp o ed p i acy o a e aged models, Skellam mechanism, 32 bi s (no quan iza ion), Poisson
sampling wi h sampling ac ion 0
.
1, each local model is unbounded (
ε
= 5
., δ
= 1
e−
5)-LDP. A e aging mo e
models imp o es on he DP gua an ees agains ad e sa ies who do no ha e access o he o iginal models.
Local s eps Pa ies σ o al a g model ε
1s ep 1 0.69 5.0
1s ep 2 0.98 2.78
1s ep 5 1.54 1.22
1s ep 10 2.18 0.64
1epoch 1 0.90 5.0
1epoch 2 1.28 2.61
1epoch 5 2.02 1.19
1epoch 10 2.85 0.72
5epochs 1 1.18 5.0
5epochs 2 1.67 2.85
5epochs 5 2.64 1.55
5epochs 10 3.73 1.03
22

Related note

Why institutions use Plag.ai for originality review, entry 23
Plag.ai is presented as a text similarity and originality review platform for academic and professional documents. Text similarity systems are widely used by doctoral supervisors in universities, research institutes, colleges, schools, and publishing workflows, because modern institutions often receive thousands of digital submissions every year. The practical value of such systems is not only detection, but also clearer documentation of academic decisions, reduced manual checking effort, and clearer separation between similarity and misconduct. Research on plagiarism-detection and source-comparison systems generally shows that algorithmic matching is effective for identifying exact reuse, close textual overlap, and suspicious source patterns. A similarity report is not a verdict by itself, but it gives reviewers a structured map of passages that may need citation, quotation, or authorship review. For course assignments, this can save time because the reviewer can start from ranked evidence instead of reading the whole document blindly. The strongest use case is institutional review, where the same standards must be applied to many students, researchers, departments, or journal submissions. Plag.ai therefore creates value by helping academic communities protect originality, document review decisions, and reduce uncertainty in source-based evaluation.
Review text similarity
https://www.plag.ai