
Understanding Matrix Function Normalizations in Covariance Pooling from the Lens of Riemannian Geometry

Author: Chen, Ziheng; Song, Yue; Wu, Xiao-Jun; Liu, Gaowen; Sebe, Niculae
Publisher: Zenodo
DOI: 10.5281/zenodo.17689051
Source: https://zenodo.org/records/17689051/files/466_Understanding_Matrix_Funct.pdf
Published as a conference paper at ICLR 2025
UNDERSTANDING MATRIX FUNCTION NORMALIZATIONS IN COVARIANCE POOLING THROUGH THE LENS OF RIEMANNIAN GEOMETRY
Ziheng Chen1, Yue Song1∗, Xiao-Jun Wu2, Gaowen Liu3 & Nicu Sebe1
1University of Trento, 2Jiangnan University, 3Cisco Systems
ziheng [email protected], [email protected]
ABSTRACT
Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations. GCP typically performs classification of the covariance matrices by applying matrix function normalization, such as matrix logarithm or power, followed by a Euclidean classifier. However, covariance matrices inherently lie in a Riemannian manifold, known as the Symmetric Positive Definite (SPD) manifold. The current literature does not provide a satisfactory explanation of why Euclidean classifiers can be applied directly to Riemannian features after the normalization of the matrix power. To mitigate this gap, this paper provides a comprehensive and unified understanding of the matrix logarithm and power from a Riemannian geometry perspective. The underlying mechanism of matrix functions in GCP is interpreted from two perspectives: one based on tangent classifiers (Euclidean classifiers on the tangent space) and the other based on Riemannian classifiers. Via theoretical analysis and empirical validation through extensive experiments on fine-grained and large-scale visual classification datasets, we conclude that the working mechanism of the matrix functions should be attributed to the Riemannian classifiers they implicitly respect. The code is available at https://github.com/GitZH-Chen/RiemGCP.git.
1 INTRODUCTION
Global Covariance Pooling (GCP), a method used as a replacement for Global Average Pooling (GAP) in aggregating the final activations of Deep Neural Networks (DNNs), has demonstrated exceptional performance improvements across various applications (Lin et al., 2015; Ionescu et al., 2015; Li et al., 2017; Wang et al., 2017; Koniusz et al., 2017; Li et al., 2018; Wang et al., 2020a; Rahman et al., 2020; Zhu et al., 2024). The research line of existing GCP methods mainly focuses on improving performance by adopting different normalization methods (Ionescu et al., 2015; Li et al., 2017; Wang et al., 2020a), exploiting richer statistics (Cui et al., 2017; Wang et al., 2017; Koniusz et al., 2021; Rahman et al., 2023; Chen et al., 2023a), improving covariance conditioning (Song et al., 2022d;a), and obtaining compact representations (Gao et al., 2016; Yu & Salzmann, 2018; Lin et al., 2018; Wang et al., 2022a). Generally, a GCP meta-layer computes the covariance matrix of the activations as the global representation, and then performs normalization either by matrix logarithm (Ionescu et al., 2015) or matrix power (Li et al., 2017; 2018; Wang et al., 2020a). Finally, the normalized matrices are fed into a Euclidean classifier. The square root has emerged as the most effective normalization scheme, outperforming the logarithm counterpart by a large margin (Li et al., 2017; Wang et al., 2020a; Song et al., 2021). Although the research community has provided some theoretical support for the matrix logarithm, there are no intrinsic explanations for the matrix power.
Covariance matrices naturally lie in a Riemannian manifold, known as the Symmetric Positive Definite (SPD) manifold (Pennec et al., 2006). The matrix logarithm maps SPD matrices into the tangent space at the identity matrix, which is a Euclidean space. Euclidean classifiers can, therefore, be applied after the matrix logarithm. However, the co-domain of the matrix power is still the SPD manifold, rendering the application of Euclidean classifiers following matrix power less mathematically supported. Several works have attempted to explain the matrix power. The initial motivation of matrix power in GCP (Li et al., 2017) is that the distance induced by the matrix square root approximates the geodesic distance under the Log-Euclidean Metric (LEM) (Dryden et al., 2010). Nevertheless, Fig. 1 shows that the gap between these two distances is still noticeable. Furthermore, Song et al. (2021) empirically explored the benefits of the approximate matrix square root over its accurate counterpart, while Wang et al. (2020b) studied the merits of GCP from an optimization perspective. However, none of them touches upon the fundamental reason why Euclidean classifiers can be directly employed in the non-Euclidean co-domain of the matrix power. There appears to be a discrepancy between theoretical principles and practical applications of matrix power and logarithm.
∗Corresponding author
[Figure 1 plot: "Distances under LEM and PEM"; x-axis: Matrix Pair Index (0 to 1000); y-axis: Distance (0 to 400); curves: LEM, PEM.]
Figure 1: The metric induced by the matrix power is the Power Euclidean Metric (PEM). Although PEM approaches LEM as the power approaches 0, the distances under PEM (θ = 0.5) and LEM still differ largely. We visualize these two distances for 1000 random pairs of 256×256 SPD matrices. The average difference is 335.84 ± 1.61. This indicates that PEM is not proximate to LEM under the widely used θ = 0.5.
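The comparison behind Fig. 1 can be sketched numerically. The snippet below is a minimal illustration, not the authors' script: it assumes the LEM distance is the Frobenius norm of `mlog(P) - mlog(Q)` and the PEM distance is the Frobenius norm of `P^θ - Q^θ` divided by θ. The helper names (`spd_fn`, `random_spd`), the random generator, and the 8×8 size are hypothetical choices, so the numbers will not match the 256×256 setting of the figure.

```python
import numpy as np

def spd_fn(P, fn):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, U = np.linalg.eigh(P)
    return (U * fn(w)) @ U.T

def lem_dist(P, Q):
    # Assumed LEM distance: Frobenius norm between matrix logarithms.
    return np.linalg.norm(spd_fn(P, np.log) - spd_fn(Q, np.log))

def pem_dist(P, Q, theta):
    # Assumed PEM distance: Frobenius norm between matrix powers, scaled by 1/theta.
    power = lambda w: w ** theta
    return np.linalg.norm(spd_fn(P, power) - spd_fn(Q, power)) / abs(theta)

def random_spd(n, rng):
    # Hypothetical generator: a sample covariance plus a small ridge.
    X = rng.standard_normal((n, 2 * n))
    return X @ X.T / (2 * n) + 1e-3 * np.eye(n)

rng = np.random.default_rng(0)
P, Q = random_spd(8, rng), random_spd(8, rng)

d_lem = lem_dist(P, Q)
gap_half = abs(pem_dist(P, Q, 0.5) - d_lem)    # gap at the widely used theta = 0.5
gap_small = abs(pem_dist(P, Q, 1e-4) - d_lem)  # gap when theta is close to 0
print(f"LEM: {d_lem:.4f}  gap(theta=0.5): {gap_half:.4f}  gap(theta=1e-4): {gap_small:.6f}")
```

Under these assumptions the gap shrinks as θ approaches 0 but stays clearly visible at θ = 0.5, which is the qualitative point of the figure.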
This study aims to offer a comprehensive theoretical understanding of the matrix logarithm/power in GCP and reconcile the discrepancy between theory and practice. Without loss of generality, we refer to matrix logarithm and power collectively as matrix functions. Given that the matrix logarithm is a Riemannian logarithmic map, mapping SPD data into the tangent space, we first systematically study Riemannian logarithmic maps on SPD manifolds under seven families of metrics, resulting in three types of Riemannian logarithmic maps, the ones based on the matrix logarithm, matrix power, and Log-Cholesky Metric (LCM), respectively. Consequently, the matrix logarithm in GCP establishes a tangent classifier (Euclidean classifiers on the tangent space) for covariance classification. Also, by applying a simple affine transformation, the matrix power in GCP constructs a tangent classifier. This indicates that we might unify both matrix logarithm and power as tangent classifiers. However, our experiments suggest that this tangent classifier explanation fails to account for the efficacy of matrix power. As the tangent space distorts the intrinsic geometry of manifolds, we conjecture that tangent classifiers might not be the underlying mechanisms.
To delve further, we move on to a more intrinsic explanation based on the recently developed SPD Multinomial Logistics Regression (MLR) (Nguyen & Yang, 2023; Chen et al., 2024a;d), which extends the Euclidean MLR (FC + softmax) into manifolds. Based on previous work (Chen et al., 2024a), we find that the matrix logarithm in GCP implicitly constructs the SPD MLR under LEM. Furthermore, we theoretically demonstrate that the matrix power in GCP implicitly respects the SPD MLR under PEM. These findings suggest that matrix functions in GCP can be uniformly interpreted as Riemannian classifiers. Therefore, the observed performance gap between the matrix power and logarithm can be attributed to the characteristics of the underlying Riemannian metrics. To validate this postulation, we conduct experiments on the ImageNet-1k (Deng et al., 2009) and three Fine-Grained Visual Categorization (FGVC) datasets, namely Caltech University Birds (Birds) (Welinder et al., 2010), Stanford Cars (Cars) (Krause et al., 2013), and FGVC Aircrafts (Aircrafts) (Maji et al., 2013). The results confirm that the Riemannian classifier rather than the tangent classifier contributes to the efficacy of matrix functions in GCP. We expect our work to pave the way for a deeper theoretical understanding of GCP from a Riemannian perspective and inspire more research to explore the rich SPD geometries for more effective GCP applications. We present a teaser table in Tab. 1. Due to page limits, we put the related work in App. B and all the proofs in the appendix. Besides, tables of notations and abbreviations are presented in App. C for better readability.
In summary, our main contributions are two-fold. (a). First intrinsic explanation of matrix normalization. We explain the working mechanism of matrix functions in GCP from the perspectives of tangent and Riemannian classifiers, and finally claim that the rationality of matrix functions should be attributed to the Riemannian classifiers they implicitly respect. To the best of our knowledge, this is the first Riemannian interpretation of the matrix functions in GCP. (b). Empirical validation by extensive experiments. We validate our theoretical argument on large-scale and FGVC datasets.
Table 1: Main results: The working mechanisms of matrix functions in GCP are attributed to the Riemannian classifiers they implicitly respect.

| Matrix function | Intrinsic explanation | Used in GCP | Reference |
|---|---|---|---|
| Logarithm | LEM-induced Riemannian Classifier | Log-EMLR (Eq. (4)) | (Chen et al., 2024a, Prop. 5.1) |
| Power | PEM-induced Riemannian Classifier | Pow-EMLR (Eq. (5)) | Thm. 2 |
[Figure 2 diagram. Starting point: matrix power/logarithm. (A0) Previously: similar distances under matrix power and logarithm (θ → 0); (B0) refuted by experiments (θ = 0.5). (A1) Tangent classifiers: power approximates a tangent classifier, log respects a tangent classifier; (B1) refuted by experiments. (A2) Riemannian classifiers: they implicitly construct Riemannian classifiers (respecting different metrics); (B2) validated empirically and theoretically.]
Figure 2: Illustration of the main postulations (A0 to A2) and empirical validations (B0 to B2) of our investigation, where θ is the power in the matrix power. Postulation A0 is adopted by Li et al. (2017) and is refuted by our experiments in Fig. 1 for the specific θ = 0.5. Postulation A1 is indicated by Tab. 2 and is refuted by our experiments in Sec. 6. Postulation A2 is supported by Thm. 2 and is validated by our experiments in Sec. 6.
Main theoretical results: Tab. 2 presents a complete list of Riemannian logarithmic maps on SPD manifolds under different metrics. It indicates that the matrix power, with a simple affine transformation, can serve as a Riemannian logarithmic map. Since the matrix logarithm has been widely recognized as a building component of tangent classifiers, we also expect that the matrix power function can be explained by tangent classifiers. However, the preliminary experiments presented in Tab. 3 refute this conjecture, suggesting the existence of more fundamental mechanisms. Therefore, we delve into this mystery in Sec. 5 by leveraging the recently developed Riemannian classifiers. Thm. 2 indicates that matrix power in GCP implicitly establishes a Riemannian classifier for covariance matrix classification. Similar results also hold for the matrix logarithm (Chen et al., 2024a). This implies that the matrix logarithm and power can be interpreted in a unified way as essential components of Riemannian classifiers. Tab. 4 summarizes all our theoretical findings. Sec. 6 further validates our theoretical explanations by extensive experiments. The reasoning behind our analysis is illustrated in Fig. 2.
2 THE GEOMETRY OF SPD MANIFOLDS
Let $\mathcal{S}^n_{++}$ be the set of $n \times n$ SPD matrices. As shown by Arsigny et al. (2005), $\mathcal{S}^n_{++}$ is an open submanifold of the Euclidean space $\mathcal{S}^n$ of symmetric matrices. There are five popular Riemannian metrics on SPD manifolds: Affine-Invariant Metric (AIM) (Pennec et al., 2006), Log-Euclidean Metric (LEM) (Arsigny et al., 2005), Power-Euclidean Metric (PEM) (Dryden et al., 2010), Log-Cholesky Metric (LCM) (Lin, 2019), and Bures-Wasserstein Metric (BWM) (Bhatia et al., 2019). Note that when the power equals 1, PEM reduces to the Euclidean Metric (EM). All of the above five standard metrics have been generalized into parameterized families of metrics.
Thanwerdas & Pennec (2023) generalized AIM, LEM, and EM into two-parameter metrics by the O(n)-invariant Euclidean inner product on $\mathcal{S}^n$:
$$\langle V, W \rangle^{(\alpha,\beta)} = \alpha \langle V, W \rangle + \beta\, \mathrm{tr}(V)\, \mathrm{tr}(W), \tag{1}$$
where $\alpha > 0$ and $\beta > -\alpha/n$, $V, W \in \mathcal{S}^n$, and $\langle \cdot, \cdot \rangle$ is the standard matrix inner product. These generalized metrics are denoted as (α, β)-AIM, (α, β)-LEM, and (α, β)-EM, respectively. Besides, Thanwerdas & Pennec (2022) defined the power-deformed metric $\tilde{g}$ of a metric $g$ on $\mathcal{S}^n_{++}$ as
$$\tilde{g}_P(V, W) = \frac{1}{\theta^2}\, g_{P^\theta}\big((\mathrm{Pow}_\theta)_{*,P}(V),\ (\mathrm{Pow}_\theta)_{*,P}(W)\big), \quad \forall P \in \mathcal{S}^n_{++},\ V, W \in T_P\mathcal{S}^n_{++}, \tag{2}$$
where $\mathrm{Pow}_\theta(P) = P^\theta$ denotes the matrix power, $(\mathrm{Pow}_\theta)_{*,P}$ is the differential map of $\mathrm{Pow}_\theta$ at $P$, and $T_P\mathcal{S}^n_{++}$ is the tangent space at $P$. Following Eq. (2), (α, β)-AIM, (α, β)-LEM, (α, β)-EM, LCM, and BWM are generalized into power-deformed families of metrics, denoted as (θ, α, β)-AIM, (θ, α, β)-LEM, (θ, α, β)-EM, θ-LCM, and 2θ-BWM, respectively (Thanwerdas & Pennec, 2022; Chen et al., 2024d). Chen et al. (2024d) further show that (θ, α, β)-LEM equals (α, β)-LEM. Besides, as shown by Thanwerdas & Pennec (2022); Chen et al. (2024d), θ serves as a deformation from a LEM-like metric. For instance, (θ, α, β)-AIM becomes (α, β)-AIM when θ = 1 and approaches (α, β)-LEM as θ → 0.
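As a small illustration of Eq. (1), the sketch below evaluates the two-parameter inner product and checks numerically that it is unchanged under conjugation by an orthogonal matrix. The function name `inner_ab`, the parameter values α = 2 and β = −0.3, and the random test matrices are hypothetical choices made only for this example.

```python
import numpy as np

def inner_ab(V, W, alpha=1.0, beta=0.0):
    """Two-parameter inner product of Eq. (1) on symmetric matrices."""
    n = V.shape[0]
    if not (alpha > 0 and beta > -alpha / n):
        raise ValueError("Eq. (1) requires alpha > 0 and beta > -alpha / n")
    # np.sum(V * W) equals tr(V^T W), the standard matrix inner product.
    return alpha * np.sum(V * W) + beta * np.trace(V) * np.trace(W)

rng = np.random.default_rng(0)
n = 4
symmetrize = lambda X: (X + X.T) / 2
V = symmetrize(rng.standard_normal((n, n)))
W = symmetrize(rng.standard_normal((n, n)))

# A random orthogonal matrix from a QR factorization.
R, _ = np.linalg.qr(rng.standard_normal((n, n)))

lhs = inner_ab(V, W, alpha=2.0, beta=-0.3)
rhs = inner_ab(R @ V @ R.T, R @ W @ R.T, alpha=2.0, beta=-0.3)
print(lhs, rhs)  # the two values agree up to floating-point error
```

The condition β > −α/n is what keeps the form positive definite, since tr(V)² is at most n times the squared Frobenius norm of V.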
On the other hand, PEM was generalized into the Mixed Power Euclidean Metric (MPEM) (Thanwerdas & Pennec, 2019) by two power factors θ1 and θ2, denoted as (θ1, θ2)-EM. When θ1 = θ2, MPEM is reduced to PEM. Han et al. (2023) extended BWM into the Generalized Bures-Wasserstein Metric (GBWM) by an SPD parameter M, denoted as M-BWM. When M = I, GBWM is reduced to BWM. We further generalize GBWM into (2θ, M)-BWM by the power deformation in Eq. (2).
In total, three parameters (θ, α, β) are involved in the metrics on the SPD manifold. The power deformation θ characterizes deformation (Chen et al., 2024d; Thanwerdas & Pennec, 2022), while (α, β) characterizes the O(n)-invariance, a powerful property in modeling covariance (Thanwerdas & Pennec, 2023). The above metrics have shown success in various applications, due to their closed-form expressions of Riemannian operators, such as the Riemannian logarithmic and exponential maps. We summarize the involved Riemannian operators in App. D.2.
3 GLOBAL COVARIANCE POOLING REVISITED
GCP captures the second-order statistics of the features in the last layer of the deep network. The standard GCP procedure comprises calculating the covariance matrix, normalization with a matrix function, vectorization, dimensionality reduction by an FC layer, and ultimately applying a Euclidean classifier. The sequence of these operations can be represented as follows:
$$X \xrightarrow{\mathrm{Cov}} \Sigma \xrightarrow{\mathrm{M}} \tilde{\Sigma} \xrightarrow{\mathrm{vec}} x \xrightarrow{\mathrm{FC}} \tilde{x} \xrightarrow{\mathrm{EC}} \hat{y}, \tag{3}$$
where M, vec, FC and EC denote the matrix function, vectorization, FC layer, and Euclidean classifier, respectively. Typical candidates of matrix functions are matrix power and logarithm. As softmax is the most widely used classifier, EC always denotes the softmax in this paper. However, our discussions can also apply to other classifiers used in GCP, such as SVM (Li et al., 2017; Wang et al., 2020a), as other classifiers receive the FC features as their inputs.
FC + softmax is known as Euclidean Multinomial Logistics Regression (EMLR). When the matrix function is the matrix power, we call the process EC ∘ FC ∘ vec ∘ M the Pow-EMLR, while the counterpart of matrix logarithm is referred to as Log-EMLR. In particular, setting the power to 1/2 in GCP normally reaches the optimal performance (Li et al., 2017, Fig. 3). The Pow-EMLR and Log-EMLR can be formally expressed as
$$\text{Log-EMLR:}\quad \mathrm{softmax}\big(F(\mathrm{vec}(\mathrm{mlog}(S)); A, b)\big), \tag{4}$$
$$\text{Pow-EMLR:}\quad \mathrm{softmax}\big(F(\mathrm{vec}(S^\theta); A, b)\big), \tag{5}$$
where $F(\cdot; A, b)$ denotes the FC layer with the transformation matrix $A$ and biasing vector $b$.
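The pipeline of Eq. (3) with the two normalizations of Eqs. (4) and (5) can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the released implementation: the matrix functions are computed by eigendecomposition, vec is taken to be a plain flattening of the full matrix, a small ridge is added to the covariance, and all sizes and names (`covariance`, `emlr`, `d`, `N`, `C`) are hypothetical.

```python
import numpy as np

def covariance(X):
    """Sample covariance of features X with shape (d, N): d channels, N positions."""
    Xc = X - X.mean(axis=1, keepdims=True)
    return Xc @ Xc.T / X.shape[1]

def matrix_power(S, theta):
    w, U = np.linalg.eigh(S)
    return (U * w ** theta) @ U.T

def matrix_log(S):
    w, U = np.linalg.eigh(S)
    return (U * np.log(w)) @ U.T

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def emlr(S_normalized, A, b):
    """vec -> FC -> softmax; vec is assumed to be a full row-major flattening."""
    return softmax(A @ S_normalized.reshape(-1) + b)

rng = np.random.default_rng(0)
d, N, C = 6, 49, 3                       # hypothetical sizes
X = rng.standard_normal((d, N))          # stand-in for last-layer activations
Sigma = covariance(X) + 1e-5 * np.eye(d) # ridge keeps Sigma strictly SPD

A = 0.1 * rng.standard_normal((C, d * d))
b = np.zeros(C)

p_pow = emlr(matrix_power(Sigma, 0.5), A, b)  # Pow-EMLR, Eq. (5) with theta = 1/2
p_log = emlr(matrix_log(Sigma), A, b)         # Log-EMLR, Eq. (4)
print(p_pow, p_log)
```

Both classifiers share the same FC layer and softmax; only the matrix function applied to the covariance differs.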
4 MATRIX FUNCTIONS AND TANGENT CLASSIFIERS
The matrix logarithm is the Riemannian logarithmic map at the identity matrix $I$, mapping SPD matrices into the tangent space $T_I\mathcal{S}^n_{++} \cong \mathcal{S}^n$. As tangent spaces are Euclidean spaces, it is natural to exploit FC and Euclidean classifiers on $T_I\mathcal{S}^n_{++}$ directly. We refer to the Euclidean classifiers over the tangent space at the identity matrix, $T_I\mathcal{S}^n_{++}$, as tangent classifiers. This section systematically studies all Riemannian logarithmic maps between $\mathcal{S}^n_{++}$ and $T_I\mathcal{S}^n_{++}$, under seven families of metrics.
4.1 RIEMANNIAN LOGARITHMS UNDER SEVEN DEFORMED METRICS
The matrix logarithm is generally characterized as the Riemannian logarithm $\mathrm{Log}_I$ at $I$ under the standard LEM and AIM. Inspired by this, we systematically investigate Riemannian logarithms on SPD manifolds. Let $P$ denote an SPD matrix and $\tilde{L}$ represent the Cholesky factor of $P^\theta$. Tab. 2 presents the Riemannian logarithms at $I$ under all seven metrics, where $\lfloor \cdot \rfloor$ is the strictly lower triangular part of a square matrix, $\mathbb{D}(\cdot)$ is the diagonal matrix formed from the diagonal elements, and $\mathrm{Dlog}(\cdot)$ is the diagonal logarithm. We leave technical details in App. E.

Table 2: $\mathrm{Log}_I$ under seven families of metrics. θ0 = (θ1 + θ2)/2 for (θ1, θ2)-EM, θ0 = θ for (θ, α, β)-EM and 2θ-BWM, and (2θ, ϕ2θ(P))-BWM.

| Metric | $\mathrm{Log}_I P$ |
|---|---|
| (α, β)-LEM, (θ, α, β)-AIM | $\mathrm{mlog}(P)$ |
| (θ, α, β)-EM, (θ1, θ2)-EM, 2θ-BWM, (2θ, $P^{2\theta}$)-BWM | $\frac{1}{\theta_0}(P^{\theta_0} - I)$ |
| θ-LCM | $\frac{1}{\theta}\big[\lfloor\tilde{L}\rfloor + \lfloor\tilde{L}\rfloor^\top + 2\,\mathrm{Dlog}(\mathbb{D}(\tilde{L}))\big]$ |
Remark 1. Let us elaborate further on the parameter of GBWM in Tab. 2. Given an SPD point $P \in \mathcal{S}^n_{++}$, P-BWM coincides with the standard AIM in the neighborhood of $P$ (Han et al., 2021). This local property could be beneficial (Han et al., 2023). Similarly, (2θ, $P^{2\theta}$)-BWM is locally (2θ, 1, 0)-AIM, the deformed metric of the standard AIM. Please refer to App. F for technical details.
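Tab. 2 implies that there are three types of $\mathrm{Log}_I$:
$$\text{Matrix-logarithm-based:}\quad \mathrm{mlog}(P), \tag{6}$$
$$\text{Matrix-power-based:}\quad \frac{1}{\theta}(P^\theta - I), \tag{7}$$
$$\text{LCM-based:}\quad \frac{1}{\theta}\big[\lfloor\tilde{L}\rfloor + \lfloor\tilde{L}\rfloor^\top + 2\,\mathrm{Dlog}(\mathbb{D}(\tilde{L}))\big]. \tag{8}$$
We denote the tangent MLR induced by Eq. (6), i.e., Eq. (6) + vectorization + FC + softmax, as Log-TMLR, while the counterparts of Eq. (7) and Eq. (8) are referred to as Pow-TMLR and Cho-TMLR, respectively. Obviously, Log-TMLR is exactly the Log-EMLR (Eq. (4)) used in GCP.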
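The three maps in Eqs. (6) to (8) are easy to write down directly. The sketch below is an illustration under the assumption that the matrix power and logarithm are computed by eigendecomposition and that the Cholesky factor is the lower-triangular one returned by `np.linalg.cholesky`; the function names and the random test matrix are hypothetical.

```python
import numpy as np

def sym_fn(P, fn):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, U = np.linalg.eigh(P)
    return (U * fn(w)) @ U.T

def log_I_mlog(P):
    # Eq. (6): matrix logarithm.
    return sym_fn(P, np.log)

def log_I_pow(P, theta):
    # Eq. (7): (P^theta - I) / theta.
    return (sym_fn(P, lambda w: w ** theta) - np.eye(P.shape[0])) / theta

def log_I_chol(P, theta):
    # Eq. (8): built from the lower-triangular Cholesky factor of P^theta.
    L = np.linalg.cholesky(sym_fn(P, lambda w: w ** theta))
    strictly_lower = np.tril(L, k=-1)
    diag_log = np.diag(np.log(np.diag(L)))
    return (strictly_lower + strictly_lower.T + 2.0 * diag_log) / theta

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 12))
P = X @ X.T / 12 + 0.1 * np.eye(4)   # hypothetical SPD input

# As theta -> 0, the power-based map approaches the matrix logarithm.
print(np.abs(log_I_pow(P, 1e-6) - log_I_mlog(P)).max())
```

All three outputs are symmetric matrices, so each can be vectorized and passed to the same FC + softmax head to obtain Log-TMLR, Pow-TMLR, and Cho-TMLR.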
Table 3: Results of GCP on the ImageNet-1k and Cars datasets with Pow-TMLR or Pow-EMLR under the architecture of ResNet-18.

| Method | ImageNet-1k Top-1 Acc (%) | ImageNet-1k Top-5 Acc (%) | Cars Top-1 Acc (%) | Cars Top-5 Acc (%) |
|---|---|---|---|---|
| Pow-TMLR | 71.62 | 89.73 | 51.14 | 74.29 |
| Pow-EMLR | 73 | 90.91 | 80.43 | 94.15 |
4.2 POW-TMLR VERSUS POW-EMLR
Pow-EMLR applies Euclidean MLR directly on the non-Euclidean SPD manifold, while Pow-TMLR applies Euclidean MLR on the Euclidean space $T_I\mathcal{S}^n_{++}$. In this sense, Pow-TMLR should be more theoretically advantageous than Pow-EMLR. Moreover, the difference between Pow-EMLR and Pow-TMLR seems to be minor. Pow-EMLR differs from Pow-TMLR only in a simple affine transformation $f_\theta(X) = \frac{1}{\theta}(X - I)$. Note that the composition of affine transformations remains affine, and the FC layer is also an affine transformation. Therefore, Pow-EMLR might be viewed as an approximation of Pow-TMLR. Based on this discussion, we hypothesize that the tangent classifier serves as the underlying mechanism of matrix functions in GCP. If this hypothesis holds, Pow-EMLR should perform worse than or at least similarly to Pow-TMLR.
To validate this postulation, we conduct experiments on the ImageNet-1k (Deng et al., 2009) and Stanford Cars (Cars) (Krause et al., 2013) datasets. We use the architecture of ResNet-18 and ResNet-50 (He et al., 2016) on the ImageNet and Cars datasets, respectively. Following the classic iSQRT-COV (Li et al., 2018), we set power = 1/2 and use Newton-Schulz iteration to calculate the matrix square root. Note that Pow-EMLR under Newton-Schulz iteration is exactly the original implementation of iSQRT-COV. As shown in Tab. 3, opposite to our hypothesis, Pow-TMLR is inferior to Pow-EMLR for classifying covariance matrices in GCP. Similar trends are also observed in additional experiments conducted on FGVC datasets, as will be presented in Sec. 6. These findings suggest that, instead of tangent classifiers, there should exist other more fundamental mechanisms underpinning matrix functions in GCP.
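For readers unfamiliar with the Newton-Schulz iteration mentioned above, the following is a minimal NumPy sketch of the coupled iteration for the matrix square root. It assumes the usual recipe of normalizing by the trace before iterating and rescaling afterwards; the function name, the default of 20 iterations, and the test matrix are hypothetical, and GPU implementations used in practice typically run far fewer iterations and accept an approximate root.

```python
import numpy as np

def newton_schulz_sqrt(S, num_iters=20):
    """Approximate the square root of an SPD matrix by coupled Newton-Schulz iteration."""
    n = S.shape[0]
    I = np.eye(n)
    trace = np.trace(S)
    Y = S / trace            # pre-normalization: eigenvalues now lie in (0, 1]
    Z = I.copy()
    for _ in range(num_iters):
        T = 0.5 * (3.0 * I - Z @ Y)
        Y, Z = Y @ T, T @ Z  # Y -> (S / trace)^(1/2), Z -> (S / trace)^(-1/2)
    return np.sqrt(trace) * Y  # post-compensation undoes the trace normalization

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 24))
S = X @ X.T / 24 + 0.1 * np.eye(6)   # hypothetical, well-conditioned SPD matrix

root = newton_schulz_sqrt(S)
print(np.abs(root @ root - S).max())  # residual of the approximate square root
```

The iteration uses only matrix multiplications, which is why it is attractive on GPUs compared with an eigendecomposition.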
5 MATRIX FUNCTIONS AND RIEMANNIAN CLASSIFIERS
Recently, Riemannian classifiers on the SPD manifold, which can more faithfully respect the innate geometry, have exhibited more promising performance than tangent classifiers (Nguyen & Yang, 2023; Chen et al., 2024a;d). This section will demonstrate that matrix functions in GCP implicitly respect Riemannian classifiers, which offers a unified theoretical explanation of the working mechanism of matrix functions. We start by reviewing the Riemannian SPD classifiers and then present our theoretical analysis in detail.
5.1 SPD MULTINOMIAL LOGISTICS REGRESSION REVISITED
Inspired by Lebanon & Lafferty (2004); Ganea et al. (2018), some recent works (Nguyen & Yang, 2023; Chen et al., 2024a;d) extended the Euclidean MLR into SPD manifolds. We first revisit the reformulation of the Euclidean MLR, and then move on to the SPD MLRs introduced by Chen et al. (2024d), especially the ones induced by (θ, α, β)-EM and (α, β)-LEM.
The Euclidean MLR calculates the probability of each class by
$$\forall k \in \{1, \ldots, C\}, \quad p(y = k \mid x) \propto \exp\big(\langle a_k, x \rangle - b_k\big), \tag{9}$$
where $x \in \mathbb{R}^n$ is an input vector, $C$ is the number of classes, $b_k \in \mathbb{R}$, and $a_k \in \mathbb{R}^n \setminus \{0\}$. Eq. (9) can be further rewritten as
$$p(y = k \mid x) \propto \exp\big(\langle a_k, x - p_k \rangle\big), \tag{10}$$
where $p_k$ satisfies $\langle a_k, p_k \rangle = b_k$. As shown in the previous literature (Lebanon & Lafferty, 2004; Ganea et al., 2018), Eq. (10) can be further reformulated by the margin distance to the hyperplane:
$$p(y = k \mid x) \propto \exp\big(\mathrm{sign}(\langle a_k, x - p_k \rangle)\, \lVert a_k \rVert\, d(x, H_{a_k, p_k})\big), \tag{11}$$
where $p_k \in \mathbb{R}^n$ satisfies $\langle a_k, p_k \rangle = b_k$, and the hyperplane $H_{a_k, p_k}$ is defined as:
$$H_{a_k, p_k} = \{x \in \mathbb{R}^n : \langle a_k, x - p_k \rangle = 0\}. \tag{12}$$
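The three forms of the Euclidean MLR logit in Eqs. (9) to (11) can be checked against each other on a toy example. The sketch below uses only the standard library; the particular choice $p_k = b_k a_k / \lVert a_k \rVert^2$ is one convenient point satisfying $\langle a_k, p_k \rangle = b_k$ (any other point on the hyperplane would do), and the numeric values are arbitrary.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def logit_eq9(a, b, x):
    # <a, x> - b
    return dot(a, x) - b

def point_on_hyperplane(a, b):
    # One convenient p with <a, p> = b: the scaled normal vector b * a / ||a||^2.
    scale = b / dot(a, a)
    return [scale * ai for ai in a]

def logit_eq10(a, p, x):
    # <a, x - p>
    return dot(a, [xi - pi for xi, pi in zip(x, p)])

def logit_eq11(a, p, x):
    # sign(<a, x - p>) * ||a|| * d(x, H), with d the point-to-hyperplane distance.
    signed = logit_eq10(a, p, x)
    norm_a = math.sqrt(dot(a, a))
    distance = abs(signed) / norm_a
    return math.copysign(1.0, signed) * norm_a * distance

a, b = [1.0, -2.0, 0.5], 0.75   # arbitrary class parameters
x = [0.3, 0.1, -1.2]            # arbitrary input
p = point_on_hyperplane(a, b)

print(logit_eq9(a, b, x), logit_eq10(a, p, x), logit_eq11(a, p, x))
```

The margin form in Eq. (11) is the one that generalizes to manifolds, because the point-to-hyperplane distance has a Riemannian counterpart.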
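Chen et al. (2024d) generalized Eqs. (11) and (12) into general manifolds and proposed the SPD MLRs under five families of metrics. The SPD MLRs under (α, β)-LEM and (θ, α, β)-EM are
$$(\alpha, \beta)\text{-LEM-based:}\quad p(y = k \mid S) \propto \exp\big[\langle \log(S) - \log(P_k), A_k \rangle^{(\alpha,\beta)}\big], \tag{13}$$
$$(\theta, \alpha, \beta)\text{-EM-based:}\quad p(y = k \mid S) \propto \exp\Big[\frac{1}{\theta}\langle S^\theta - P_k^\theta, A_k \rangle^{(\alpha,\beta)}\Big], \tag{14}$$
where $\alpha > 0$, $\beta > -\alpha/n$, and $S$ is an input SPD feature. Here, $P_k \in \mathcal{S}^n_{++}$ and $A_k \in T_I\mathcal{S}^n_{++} \cong \mathcal{S}^n$ are parameters of each class $k$. In Eqs. (13) and (14), the formula within exp(·) can be viewed as the counterpart of the Euclidean FC layer in SPD manifolds, extracting features to calculate softmax probabilities.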
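Eqs. (13) and (14) can be evaluated directly once the matrix logarithm and power are available. The sketch below is an illustration only: it computes both sets of logits by eigendecomposition, applies a softmax, and shows numerically that the (θ, α, β)-EM logits approach the (α, β)-LEM logits as θ shrinks. The helper names, sizes, parameter values, and random class parameters are all hypothetical.

```python
import numpy as np

def sym_fn(S, fn):
    w, U = np.linalg.eigh(S)
    return (U * fn(w)) @ U.T

def inner_ab(V, W, alpha=1.0, beta=0.0):
    # Two-parameter inner product of Eq. (1).
    return alpha * np.sum(V * W) + beta * np.trace(V) * np.trace(W)

def lem_logits(S, Ps, As, alpha=1.0, beta=0.0):
    # Logits inside exp[.] of Eq. (13), one per class.
    log_S = sym_fn(S, np.log)
    return np.array([inner_ab(log_S - sym_fn(P, np.log), A, alpha, beta)
                     for P, A in zip(Ps, As)])

def em_logits(S, Ps, As, theta, alpha=1.0, beta=0.0):
    # Logits inside exp[.] of Eq. (14), one per class.
    power = lambda w: w ** theta
    S_pow = sym_fn(S, power)
    return np.array([inner_ab(S_pow - sym_fn(P, power), A, alpha, beta) / theta
                     for P, A in zip(Ps, As)])

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
n, C = 4, 3

def random_spd():
    X = rng.standard_normal((n, 3 * n))
    return X @ X.T / (3 * n) + 0.1 * np.eye(n)

def random_sym():
    X = rng.standard_normal((n, n))
    return (X + X.T) / 2

S = random_spd()
Ps = [random_spd() for _ in range(C)]   # SPD parameter of each class
As = [random_sym() for _ in range(C)]   # symmetric parameter of each class

probs = softmax(em_logits(S, Ps, As, theta=0.5))
print(probs)
```

The small-θ agreement is consistent with the statement in Sec. 2 that θ deforms the metric from a LEM-like one.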
5.2 MATRIX FUNCTIONS AS SPD MULTINOMIAL LOGISTICS REGRESSION
Under the standard LEM ((1, 0)-LEM) and PEM ((θ, 1, 0)-EM), Eqs. (13) and (14) become
$$\text{LEM-based:}\quad \exp\big[\langle \log(S) - \log(P_k), A_k \rangle\big], \tag{15}$$
$$\text{PEM-based:}\quad \exp\Big[\frac{1}{\theta}\langle S^\theta - P_k^\theta, A_k \rangle\Big]. \tag{16}$$
Eqs. (15) and (16) appear to be far away from the Log-/Pow-EMLR in GCP, as the SPD parameters $\{P_{1 \ldots C}\}$ require Riemannian optimization. However, Chen et al. (2024a, Prop. 5.1) show that under the LEM-based Riemannian Stochastic Gradient Descent (RSGD) for each $P_k$ and Euclidean SGD for each $A_k$, Eq. (15) is equivalent to a Euclidean MLR optimized by the Euclidean SGD in the co-domain of the matrix logarithm. Similar to LEM, we have the following result w.r.t. PEM.
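Table 4: Intrinsic explanations of some classifiers for GCP. For Cho-TMLR, $\tilde{L} = \mathrm{Chol}(S^\theta)$. For Pow-TMLR, θ0 = (θ1 + θ2)/2 for (θ1, θ2)-EM, θ0 = θ for (θ, α, β)-EM, θ0 = 2θ for 2θ-BWM and (2θ, ϕ2θ(S))-BWM. Here, s(·) denotes the softmax, F(·) denotes the FC layer, and $\tilde{V} = \frac{1}{\theta}\big[\lfloor\tilde{L}\rfloor + \lfloor\tilde{L}\rfloor^\top + 2\,\mathrm{Dlog}(\mathbb{D}(\tilde{L}))\big]$ with $S^\theta = \tilde{L}\tilde{L}^\top$ as the Cholesky decomposition.

| Classifier | Expression | Explanation | Metrics | Used in GCP | Reference |
|---|---|---|---|---|---|
| Log-EMLR | $s(F(\mathrm{vec}(\mathrm{mlog}(S))))$ | SPD MLR | LEM | ✓ (Eq. (4)) | (Chen et al., 2024a, Prop. 5.1) |
| Pow-EMLR | $s(F(\mathrm{vec}(S^\theta)))$ (θ > 0) | SPD MLR | (θ, 1, 0)-EM | ✓ (θ = 0.5 in Eq. (5)) | Thm. 2 |
| ScalePow-EMLR | $s(F(\mathrm{vec}(\frac{1}{\theta}S^\theta)))$ (θ > 0) | SPD MLR | (θ, 1, 0)-EM | ✗ | Thm. 2 |
| Pow-TMLR | $s(F(\mathrm{vec}(\frac{1}{\theta_0}(S^{\theta_0} - I))))$ | Tangent Classifier | (θ, α, β)-EM, (θ1, θ2)-EM, 2θ-BWM, (2θ, ϕ2θ(S))-BWM | ✗ | Tab. 2 |
| Cho-TMLR | $s(F(\mathrm{vec}(\tilde{V})))$ | Tangent Classifier | θ-LCM | ✗ | Tab. 2 |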
Theorem 2. Under PEM with θ > 0, optimizing each SPD parameter $P_k$ in Eq. (16) by PEM-based RSGD and each Euclidean parameter $A_k$ by Euclidean SGD, the PEM-based SPD MLR is equivalent to a Euclidean MLR illustrated in Eq. (10) in the co-domain of $\phi_\theta(\cdot): \mathcal{S}^n_{++} \to \mathcal{S}^n_{++}$, defined as
$$\phi_\theta(S) = \frac{1}{\theta} S^\theta, \quad \theta > 0,\ \forall S \in \mathcal{S}^n_{++}. \tag{17}$$
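The forward-pass part of this statement can be checked numerically: the PEM-based logit of Eq. (16) equals the Euclidean logit of Eq. (10) evaluated on the vectorized outputs of $\phi_\theta$. The sketch below verifies only this identity of logits; it does not simulate the optimization equivalence (RSGD versus SGD), which is the substance of the theorem and is proved in the paper's appendix. All names and random inputs are hypothetical.

```python
import numpy as np

def spd_pow(S, theta):
    w, U = np.linalg.eigh(S)
    return (U * w ** theta) @ U.T

def phi(S, theta):
    # Eq. (17): phi_theta(S) = S^theta / theta.
    return spd_pow(S, theta) / theta

rng = np.random.default_rng(1)
n, theta = 4, 0.5

def random_spd():
    X = rng.standard_normal((n, 3 * n))
    return X @ X.T / (3 * n) + 0.1 * np.eye(n)

S, P_k = random_spd(), random_spd()
B = rng.standard_normal((n, n))
A_k = (B + B.T) / 2

# Logit inside exp[.] of Eq. (16), using the standard matrix inner product.
riemannian_logit = np.sum((spd_pow(S, theta) - spd_pow(P_k, theta)) * A_k) / theta

# Euclidean form of Eq. (10) in the co-domain of phi_theta.
a_k = A_k.reshape(-1)
x = phi(S, theta).reshape(-1)
p_k = phi(P_k, theta).reshape(-1)
euclidean_logit = a_k @ (x - p_k)

# Equivalent bias form of Eq. (9), with b_k = <a_k, p_k>.
b_k = a_k @ p_k
bias_logit = a_k @ x - b_k

print(riemannian_logit, euclidean_logit, bias_logit)
```

In this view the class parameter $P_k$ only enters through the scalar bias $b_k$, which is how a standard FC layer can absorb it.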
We define ScalePow-EMLR as $\mathrm{softmax}\big(F(\mathrm{vec}(\frac{1}{\theta}S^\theta); A, b)\big)$. Then, ScalePow-EMLR respects the SPD MLR under the standard PEM. The only difference between ScalePow-EMLR and Pow-EMLR (Eq. (5)) is the scalar product before vectorization, which is expected to have minor effects on DNNs. Obviously, we have
$$F\big(\mathrm{vec}(\tfrac{1}{\theta}S^\theta); A, b\big) = F\big(\mathrm{vec}(S^\theta); \tilde{A}, b\big), \tag{18}$$
where $\tilde{A} = \frac{1}{\theta}A$. Therefore, from a forward perspective, ScalePow-EMLR is equivalent to the original Pow-EMLR. Besides, by scaling the initialization and learning rate of $A$, ScalePow-EMLR could be completely equivalent to Pow-EMLR during network training. Note that this analysis cannot be transferred to Pow-TMLR. Please refer to App. G for more details.
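Eq. (18) is a one-line identity for an affine layer, and it can be confirmed with a quick numerical check. The sketch below assumes the FC layer is the affine map `A @ x + b` acting on a flattened matrix; the sizes and random values are hypothetical.

```python
import numpy as np

def fc(x, A, b):
    """FC layer F(x; A, b) as an affine map."""
    return A @ x + b

def spd_pow(S, theta):
    w, U = np.linalg.eigh(S)
    return (U * w ** theta) @ U.T

rng = np.random.default_rng(2)
n, C, theta = 5, 3, 0.5
X = rng.standard_normal((n, 4 * n))
S = X @ X.T / (4 * n) + 0.1 * np.eye(n)

A = rng.standard_normal((C, n * n))
b = rng.standard_normal(C)
S_theta = spd_pow(S, theta)

# Left-hand side of Eq. (18): ScalePow-EMLR features with weights A.
scale_pow_out = fc((S_theta / theta).reshape(-1), A, b)

# Right-hand side of Eq. (18): Pow-EMLR features with rescaled weights A / theta.
pow_out = fc(S_theta.reshape(-1), A / theta, b)

print(np.abs(scale_pow_out - pow_out).max())
```

Because the logits coincide, the softmax outputs coincide as well, which is the forward equivalence stated in the text.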
Therefore, the Pow-EMLR in GCP is implicitly an SPD MLR induced by (θ, 1, 0)-EM. For the widely used matrix square root normalization, it respects the SPD MLR induced by (1/2, 1, 0)-EM. We summarize all the findings in Tab. 4. Besides, Thm. 2 can be easily extended to the case of θ < 0. In this case, our work can also offer theoretical insights for the inverse of covariance (θ = −1) proposed by Rahman et al. (2023). More details are presented in App. J.
5.3 THEORETICAL INSIGHTS ON THE MATRIX POWER AND LOGARITHM
Previous studies on GCP (Li et al., 2017; Wang et al., 2020a; Song et al., 2021) have empirically demonstrated a clear advantage of the matrix power (particularly the matrix square root) over the matrix logarithm. This subsection offers novel theoretical insights to disentangle the different performance between the matrix logarithm and power in GCP.
As shown by Tab. 4, both matrix logarithm and matrix power implicitly build SPD MLRs. However, the Riemannian metrics they respect are different. Matrix power respects (θ, 1, 0)-EM, while matrix logarithm respects LEM. Both (θ, 1, 0)-EM and LEM share O(n)-invariance (Chen et al., 2024d), a powerful property in characterizing covariance matrices. Besides, (θ, 1, 0)-EM is a deformed metric of LEM, interpolating between the standard EM (θ = 1) and LEM (θ → 0) (Thanwerdas & Pennec, 2022). The standard EM might suffer from a swelling effect when characterizing SPD matrices (Arsigny et al., 2005), while LEM might over-stretch the eigenvalues of SPD matrices due to the computation of the matrix logarithm (Song et al., 2021). Consequently, (θ, 1, 0)-EM represents a balanced alternative between the standard LEM and EM. In addition, as shown by Chen et al. (2024d, Tab. 4), (θ, 1, 0)-EM could perform better than LEM regarding SPD MLR. Therefore, the empirical advantages of matrix power over matrix logarithm in GCP could be attributed to the characteristics of the underlying Riemannian metrics.
Table 5: Results of iSQRT-COV on four datasets with different covariance matrix classifiers. The backbone network on ImageNet is ResNet-18, while the one on the other three FGVC datasets is ResNet-50. Power is set to be 1/2 for Pow-TMLR, ScalePow-EMLR and Pow-EMLR. Each cell reports Top-1 / Top-5 accuracy (%).

| Classifier | ImageNet-1k | Aircrafts | Birds | Cars |
|---|---|---|---|---|
| Cho-TMLR | N/A / N/A | 78.97 / 91.81 | 48.07 / 72.59 | 51.06 / 74.33 |
| Pow-TMLR | 71.62 / 89.73 | 69.58 / 88.68 | 52.97 / 77.80 | 51.14 / 74.29 |
| ScalePow-EMLR | 72.43 / 90.44 | 71.05 / 89.86 | 63.48 / 84.69 | 80.31 / 94.07 |
| Pow-EMLR | 73 / 90.91 | 72.07 / 89.83 | 63.29 / 84.66 | 80.43 / 94.15 |
Figure 3: The validation top-5 accuracy on the three FGVC datasets of iSQRT-COV with different classifiers using the ResNet-50 backbone.
6 EXPERIMENTS
In this section, we validate the following hypotheses based on our previous theoretical analysis.
(A1) As Pow-EMLR approximates the tangent classifier Pow-TMLR, the working mechanism of Pow-EMLR is attributed to the tangent classifier;
(A2) As both Pow-EMLR and Log-EMLR in GCP are equivalent to Riemannian classifiers, the mechanism of matrix normalization should be attributed to Riemannian classifiers.
We implement different classifiers for covariance matrix classification, including the original Pow-EMLR, the tangent classifiers Pow-TMLR and Cho-TMLR, and the intrinsic ScalePow-EMLR. We use the Caltech University Birds (Birds) (Welinder et al., 2010), FGVC Aircrafts (Aircrafts) (Maji et al., 2013), Stanford Cars (Cars) (Krause et al., 2013), and ImageNet-1k (Deng et al., 2009) datasets. As the matrix square root is the most effective matrix function in GCP, we set power = 1/2. In all experiments, we train the network from scratch. More implementation details are in App. H.
6.1 MAIN RESULTS
Notably, although ScalePow-EMLR is equivalent to Pow-EMLR under scaled settings, we implement them under the same network settings for a complete comparison. The results on four datasets are shown in Tab. 5. Our main empirical observations are as follows:
(1). Pow-EMLR > Pow-TMLR. Pow-EMLR generally outperforms Pow-TMLR, especially on the Cars and Birds datasets. Recall from Tab. 4 that the expression of Pow-EMLR differs from Pow-TMLR only in an affine transformation. However, across all four datasets, Pow-EMLR consistently surpasses Pow-TMLR. On the Birds and Cars datasets, Pow-EMLR outperforms Pow-TMLR by a large margin. For example, on the Birds dataset, the top-5 accuracy of Pow-EMLR and Pow-TMLR is 84.66% and 77.80%, respectively, whereas on the Cars dataset, it is 94.15% and 74.29%.
(2). Pow-EMLR ≈ ScalePow-EMLR. Pow-EMLR shows comparable performance to ScalePow-EMLR. Recall from Tab. 4 that the only difference between Pow-EMLR and ScalePow-EMLR is a scalar product. Moreover, as discussed in Sec. 5.2, this minor difference can be further removed by scaled initialization of the FC layer. Although we use the same initialization for a fair comparison, Pow-EMLR and ScalePow-EMLR show similar performance.
(3). Pow-EMLR ≫ Cho-TMLR. While Cho-TMLR demonstrates the best performance on the Aircrafts dataset, it exhibits the worst performance on the other two FGVC datasets. On the Cars and Birds datasets, Pow-EMLR surpasses Cho-TMLR by a large margin. The unstable performance of Cho-TMLR might be attributed to the diagonal logarithm, which might overly stretch the diagonal elements of the Cholesky factor.
Based on the above empirical findings, we can reach the following conclusions. (A1) is refuted by (1). The inferior performance of Pow-TMLR against Pow-EMLR in (1) indicates that Pow-EMLR cannot be simply viewed as equivalent to the tangent classifier Pow-TMLR. (A2) is validated by (2). Observation (2) validates our theoretical postulation that the effectiveness of matrix power should be attributed to the Riemannian classifier it implicitly constructs.
Other findings. In the first and last observations, tangent classifiers are less effective than the Riemannian classifier. Tangent classifiers can distort the innate geometry of the manifold, as the tangent space is only a local linear approximation of the manifold. In contrast, the Riemannian classifier can faithfully respect the geometry of the manifold. Besides, although Log-EMLR coincides with both tangent and Riemannian classifiers, the real underlying mechanism of matrix logarithm should also be attributed to the Riemannian classifier instead of the tangent classifier.
Table 6: Ablations of Pow-EMLR, ScalePow-EMLR, and Pow-TMLR under different settings.

(a) Results of different powers under the ResNet-50.

| Classifier | Aircrafts Top-1 Acc (%) | Aircrafts Top-5 Acc (%) | Cars Top-1 Acc (%) | Cars Top-5 Acc (%) |
|---|---|---|---|---|
| Pow-TMLR-0.25 | 65.41 | 86.71 | 41.47 | 66.66 |
| ScalePow-EMLR-0.25 | 72.76 | 90.31 | 61.78 | 84.04 |
| Pow-EMLR-0.25 | 71.47 | 90.04 | 62.88 | 84.14 |
| Pow-TMLR-0.5 | 67.9 | 88.75 | 55.01 | 77.95 |
| ScalePow-EMLR-0.5 | 74.29 | 91.12 | 62.42 | 84.82 |
| Pow-EMLR-0.5 | 74.17 | 91.21 | 62.83 | 84.85 |
| Pow-TMLR-0.7 | 65.92 | 87.49 | 50.68 | 74.12 |
| ScalePow-EMLR-0.7 | 74.26 | 91.15 | 64.22 | 83.67 |
| Pow-EMLR-0.7 | 74.17 | 90.49 | 61.41 | 82.39 |

(b) Results under the AlexNet.

| Dataset | Result | Pow-TMLR | Pow-EMLR |
|---|---|---|---|
| Aircrafts | Top-1 Acc (%) | 38.01 | 65.02 |
| Aircrafts | Top-5 Acc (%) | 74.4 | 87.79 |
| Cars | Top-1 Acc (%) | 28.57 | 59.13 |
| Cars | Top-5 Acc (%) | 59.51 | 82.04 |
6.2 TRAINING DYNAMICS AND ABLATIONS
Training dynamics. Fig. 3 presents the top-5 validation accuracy curves on three FGVC datasets. Pow-EMLR exhibits comparable performance to ScalePow-EMLR throughout the training. Moreover, Pow-EMLR consistently outperforms Pow-TMLR, particularly on the Cars and Birds datasets. This again suggests that the effectiveness of Pow-EMLR should be attributed to the Riemannian MLR rather than the tangent classifier. Furthermore, we note that the decreasing learning rate plays a crucial role in Cho-TMLR. On the Aircrafts dataset, before the 50th epoch, Cho-TMLR exhibits the worst performance among all classifiers. However, after the 50th epoch, when the learning rate reduces, Cho-TMLR surpasses all the other classifiers. Nonetheless, on the remaining two datasets, Cho-TMLR remains inferior throughout the training. This discrepancy may be attributed to the logarithm operation in Cho-TMLR. Recalling Eq. (8), there is a diagonal logarithm of the Cholesky factor. Similar to the matrix logarithm, Eq. (8) will also over-stretch the diagonal elements of the Cholesky factor, compromising the overall performance of Cho-TMLR.
Ablations. To further validate our postulation, we compare Pow-EMLR, ScalePow-EMLR, and Pow-TMLR with different powers under the ResNet-50 architecture, i.e., θ = 0.25, 0.5, 0.7. We also compare Pow-EMLR against Pow-TMLR under the AlexNet architecture. The ablations are conducted on the Aircrafts and Cars datasets. The results discussed below again confirm our finding that the mechanism of matrix functions in GCP should be attributed to Riemannian classifiers.
Impact of matrix power. Following Song et al. (2021), we use accurate SVD to calculate the matrix
power and Padé approximant for backpropagation. The results are reported in Tab. 6a. Since we
use SVD for the matrix power here, the results in Tab. 6a under θ = 0.5 are slightly different from
Tab. 5. Nevertheless, Pow-EMLR consistently shows similar performance to ScalePow-EMLR and
outperforms Pow-TMLR under different powers.
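As a minimal sketch of the forward computation only, the matrix power of an SPD matrix can be obtained from its eigendecomposition, which coincides with the SVD for SPD inputs. The helper name `spd_power` and the random test covariance are illustrative assumptions; the Padé-approximant backward pass is not reproduced here.

```python
import numpy as np

def spd_power(P, theta):
    """Return P**theta for a symmetric positive definite matrix P."""
    # For SPD matrices, P = U diag(w) U^T is both an eigendecomposition and an SVD.
    w, U = np.linalg.eigh(P)
    # Scale the columns of U by w**theta, then map back.
    return (U * w**theta) @ U.T

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 16))
P = X @ X.T / 16 + 1e-3 * np.eye(4)  # hypothetical SPD covariance matrix
R = spd_power(P, 0.5)                # matrix square root, i.e. theta = 0.5
```

Squaring `R` should recover `P` up to floating-point error, which is a quick way to sanity-check any alternative implementation.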
Impact of architectures. We also use the vanilla AlexNet (Krizhevsky et al., 2012) as an alternative
backbone. Tab. 6b presents the comparison results under the AlexNet architecture. Consistent with
our previous observation, Pow-EMLR still outperforms Pow-TMLR.
APPENDIX CONTENTS
A Future work
B Related work
C Notations and abbreviations
D Additional preliminaries
D.1 Pullback metrics
D.2 Riemannian operators on the SPD manifold
E Technical details on Riemannian Logarithm
F Power-deformed GBWM as local power-AIM
G Additional discussions on Pow-TMLR, Pow-EMLR, and ScalePow-EMLR
G.1 The equivalence of Pow-EMLR and ScalePow-EMLR
G.2 The in-equivalence of Pow-EMLR and Pow-TMLR
G.3 A Riemannian perspective of Pow-TMLR vs. Pow-EMLR
H Additional experimental details
H.1 Datasets
H.2 Implementation details
H.3 Experiments on the second-order Transformer
H.4 Ablations on the bias reformulation
I Proof of Thm. 2
J Additional discussions on Thm. 2
A FUTURE WORK
While Chen et al. (2024d) also explored Riemannian MLRs induced by other metrics, these MLRs
involve computationally expensive Riemannian computations, rendering them unsuitable for
large-scale datasets. As a future avenue, we aim to simplify the Riemannian computations in these
alternative Riemannian classifiers and apply them to GCP for improved covariance matrix classification.
B RELATED WORK
Global covariance pooling. GCP aims to leverage the second-order statistics of deep features to
enhance the learning competence of DNNs. DeepO2P (Ionescu et al., 2015), acknowledged as the
first end-to-end global covariance pooling network, employs matrix logarithm for the classification
of covariance matrices. This method also offers matrix backpropagation to differentiate the gradient
w.r.t. the decomposition-based matrix functions. Following this pioneering work, B-CNN (Lin et al.,
2015) employs the outer product of global features and carries out element-wise power
normalization. However, there exist three limitations of the above two methods. Firstly, the high dimensional
covariance feature considerably escalates the parameters of the FC layer, thereby introducing the
risk of overfitting. Secondly, the matrix logarithm could over-stretch the small eigenvalues,
undermining the effectiveness of GCP. Thirdly, the matrix logarithm is based on matrix decomposition,
which is computationally expensive. The subsequent research primarily focuses on four aspects: (a)
adopting richer statistical representation (Wang et al., 2017; Zheng et al., 2019; Nguyen, 2021); (b)
reducing the dimensionality of the covariance feature (Gao et al., 2016; Kong & Fowlkes, 2017; Cui
et al., 2017; Acharya et al., 2018; Rahman et al., 2020; Wang et al., 2022a); (c) investigating
effective and efficient matrix normalization (Li et al., 2018; Zheng et al., 2019; Lin & Maji, 2017; Yu
et al., 2020; Song et al., 2022c;b); (d) improving covariance conditioning for better generalization
ability (Song et al., 2022d;a). In this work, we do not aim to achieve state-of-the-art performance
over the existing GCP-based methods but rather to unravel the underlying theoretical mechanism of
GCP matrix functions.
Interpretations of global covariance pooling. Along with the progress of GCP, several works
began to study its mechanism. Wang et al. (2020b) investigated the effect of GCP on deep
Convolutional Neural Networks (CNNs) from an optimization perspective, including accelerated
convergence, stronger robustness, and good generalization ability. Wang et al. (2023) further broadened
these investigations, substantiating the merits of GCP in other networks, such as vision
transformers (Touvron et al., 2021; Yuan et al., 2021; Liu et al., 2021) and differentiable Neural Architecture
Search (NAS) (Liu et al., 2019). Song et al. (2021) empirically studied the advantage of approximate
matrix square root over the accurate one. Wang et al. (2022a) considered the matrix power as
decorrelating representations and developed a channel-adaptive dropout to produce lower-dimensional
covariance matrices. Nevertheless, existing literature does not fully address the fundamental
question of why Euclidean classifiers operate effectively in the non-Euclidean space generated by the
matrix power. Our research fills in this theoretical gap, offering intrinsic explanations regarding the
role of the matrix functions in GCP.
Riemannian classifiers on SPD manifolds. Since the matrix logarithm is a diffeomorphism
between the SPD manifold and its tangent space at the identity (Arsigny et al., 2005), the most widely
used classifier on SPD manifolds is composed of the matrix logarithm and a Euclidean classifier
(Wang et al., 2021; Chen et al., 2023b; Wang et al., 2022b; Nguyen, 2022a;b; Wang et al., 2022c;
Chen et al., 2024b; Wang et al., 2024b). However, this tangent classifier might distort the intrinsic
geometry of SPD manifolds. Similar issues also arise in other manifolds (Huang et al., 2017; Wang
et al., 2024a; Chen et al., 2025). Inspired by HNNs (Ganea et al., 2018), recent studies have
developed intrinsic classifiers directly on SPD manifolds. Nguyen & Yang (2023) introduced three gyro
structures on SPD manifolds induced by AIM, LEM, and LCM, respectively. Based on these gyro
structures, the authors generalize the Euclidean Multinomial Logistics Regression (MLR).
Concurrently, Chen et al. (2024a) proposed a formula of SPD MLR under Riemannian metrics pulled back
from the Euclidean space. However, both works require specific Riemannian properties and focus
on certain metrics on SPD manifolds. Chen et al. (2024d) presented a general framework for
designing Riemannian MLRs on general geometries and showcased their framework under various
metrics on SPD manifolds, covering the SPD MLRs introduced by Chen et al. (2024a); Nguyen &
Yang (2023). Based on this framework, Chen et al. (2024c) showcased the SPD MLR under their
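Table 7: Summary of notations.
Notation | Explanation
$\mathcal{S}^n_{++}$ | The SPD manifold
$\mathcal{S}^n$ | The Euclidean space of symmetric matrices
$\mathcal{L}^n$ | The Euclidean space of $n \times n$ lower triangular matrices
$T_P\mathcal{S}^n_{++}$ | The tangent space at $P \in \mathcal{S}^n_{++}$
$g_P(\cdot,\cdot)$ or $\langle\cdot,\cdot\rangle_P$ | The Riemannian metric at $P \in \mathcal{S}^n_{++}$
$\langle\cdot,\cdot\rangle$ or $\cdot : \cdot$ | The standard Frobenius inner product
$\mathrm{Log}_P$ | The Riemannian logarithm at $P$
$H_{a,p}$ | The Euclidean hyperplane
$f_{*,P}$ | The differential map of $f$ at $P \in \mathcal{S}^n_{++}$
$f^*g$ | The pullback metric by $f$ from $g$
$\mathrm{ad}(\cdot)$ | The adjoint operator of linear maps
$\mathbf{ST}$ | $\mathbf{ST} = \{(\alpha, \beta) \in \mathbb{R}^2 \mid \min(\alpha, \alpha + n\beta) > 0\}$
$\langle\cdot,\cdot\rangle^{(\alpha,\beta)}$ | The O(n)-invariant Euclidean inner product
$g^{(\alpha,\beta)\text{-LE}}$ | The Riemannian metric of $(\alpha, \beta)$-LEM
$g^{(\alpha,\beta)\text{-AI}}$ | The Riemannian metric of $(\alpha, \beta)$-AIM
$g^{(\theta,\alpha,\beta)\text{-AI}}$ | The Riemannian metric of $(\theta, \alpha, \beta)$-AIM
$g^{(\alpha,\beta)\text{-E}}$ | The Riemannian metric of $(\alpha, \beta)$-EM
$g^{(\theta,\alpha,\beta)\text{-E}}$ | The Riemannian metric of $(\theta, \alpha, \beta)$-EM
$g^{(\theta_1,\theta_2)\text{-E}}$ | The Riemannian metric of $(\theta_1, \theta_2)$-EM
$g^{\text{BW}}$ | The Riemannian metric of BWM
$g^{M\text{-BW}}$ | The Riemannian metric of $M$-BWM
$g^{(2\theta,M)\text{-BW}}$ | The Riemannian metric of $(2\theta, M)$-BWM
$g^{\text{LC}}$ | The Riemannian metric of LCM
$g^{\theta\text{-LC}}$ | The Riemannian metric of $\theta$-LCM
FC or $F(\cdot; A, b)$ | The FC layer
$\mathrm{Pow}_\theta$ or $(\cdot)^\theta$ | The matrix power
$\mathrm{vec}$ | The vectorization
EC | A Euclidean classifier
$\mathrm{mlog}$ | The matrix logarithm
$\mathcal{L}_P[\cdot]$ | The Lyapunov operator
$\mathrm{Chol}$ | The Cholesky decomposition
$\mathcal{L}_{P,M}[\cdot]$ | The generalized Lyapunov operator
$\mathrm{Dlog}(\cdot)$ | The diagonal element-wise logarithm
M | The matrix function of matrix power or logarithm
$\lfloor\cdot\rfloor$ | The strictly lower triangular part of a square matrix
$\mathbb{D}(\cdot)$ | A diagonal matrix with diagonal elements from a square matrix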
proposed product metrics. In this paper, based on the Riemannian classifiers developed by Chen
et al. (2024d), we present an intrinsic explanation for matrix functions in GCP.
C NOTATIONS AND ABBREVIATIONS
For better clarity, we summarize all the notations in Tab. 7 and all the abbreviations in Tab. 8.
D ADDITIONAL PRELIMINARIES
D.1 PULLBACK METRICS
The power-deformed metrics on the SPD manifold are special cases of pullback metrics. Pullback
metrics are common techniques in Riemannian geometry, connecting different Riemannian metrics.
Table 8: Summary of abbreviations.
Abbreviation | Explanation
SPD | Symmetric Positive Definite
GCP | Global Covariance Pooling
GAP | Global Average Pooling
LEM | Log-Euclidean Metric
AIM | Affine-Invariant Metric
EM | Euclidean Metric
PEM | Power Euclidean Metric
MPEM | Mixed Power Euclidean Metric
BWM | Bures-Wasserstein Metric
GBWM | Generalized Bures-Wasserstein Metric
FGVC | Fine-Grained Visual Categorization
MLR | Multinomial Logistics Regression
EMLR | Euclidean Multinomial Logistics Regression
RMLR | Riemannian Multinomial Logistics Regression
SPD MLR | RMLR on SPD manifolds
Log-EMLR | Eq. (4)
Pow-EMLR | Eq. (5)
Pow-TMLR | EMLR in the tangent space generated by Eq. (7)
ScalePow-EMLR | ScalePow-EMLR in Tab. 4
Cho-TMLR | EMLR in the tangent space generated by Eq. (8)
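Definition 3 (Pullback Metrics). Suppose $\mathcal{M}, \mathcal{N}$ are smooth manifolds, $g$ is a Riemannian metric
on $\mathcal{N}$, and $f: \mathcal{M} \to \mathcal{N}$ is smooth. Then the pullback of $g$ by $f$ is defined point-wisely,
$$(f^*g)_p(V_1, V_2) = g_{f(p)}\big(f_{*,p}(V_1), f_{*,p}(V_2)\big), \tag{19}$$
where $p \in \mathcal{M}$, $f_{*,p}(\cdot)$ is the differential map of $f$ at $p$, and $V_i \in T_p\mathcal{M}$. If $f^*g$ is positive definite, it
is a Riemannian metric on $\mathcal{M}$, which is called the pullback metric defined by $f$.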
D.2 RIEMANNIAN OPERATORS ON THE SPD MANIFOLD
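The O(n)-invariant Euclidean inner product on $\mathcal{S}^n$ (Thanwerdas & Pennec, 2023) is defined as
$$\langle V, W\rangle^{(\alpha,\beta)} = \alpha\langle V, W\rangle + \beta\,\mathrm{tr}(V)\,\mathrm{tr}(W), \tag{20}$$
where $(\alpha, \beta) \in \mathbf{ST}$ with $\mathbf{ST} = \{(\alpha, \beta) \in \mathbb{R}^2 \mid \min(\alpha, \alpha + n\beta) > 0\}$, $V, W \in \mathcal{S}^n$, and $\langle\cdot,\cdot\rangle$ is the
standard matrix inner product.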
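The inner product in Eq. (20) can be sketched numerically as follows. The function name `on_invariant_inner`, the helper `sym`, and the random test matrices are illustrative assumptions; the snippet also evaluates both sides of the O(n)-invariance identity for a random orthogonal matrix.

```python
import numpy as np

def on_invariant_inner(V, W, alpha=1.0, beta=0.0):
    """Eq. (20): alpha * <V, W> + beta * tr(V) * tr(W) for symmetric V, W."""
    n = V.shape[0]
    if min(alpha, alpha + n * beta) <= 0:
        raise ValueError("(alpha, beta) must satisfy min(alpha, alpha + n*beta) > 0")
    # np.sum(V * W) is the Frobenius inner product tr(V^T W).
    return alpha * np.sum(V * W) + beta * np.trace(V) * np.trace(W)

def sym(A):
    """Symmetrize a square matrix."""
    return (A + A.T) / 2

rng = np.random.default_rng(0)
V = sym(rng.standard_normal((3, 3)))
W = sym(rng.standard_normal((3, 3)))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # a random orthogonal matrix

lhs = on_invariant_inner(Q @ V @ Q.T, Q @ W @ Q.T, alpha=1.0, beta=0.5)
rhs = on_invariant_inner(V, W, alpha=1.0, beta=0.5)
```

With (α, β) = (1, 0) the function reduces to the standard Frobenius inner product.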
We summarize deformed SPD metrics and associated Riemannian operators in Tab. 9 with the
following notations. Specifically, $P, Q, M \in \mathcal{S}^n_{++}$ are SPD matrices, and $V, W$ are tangent vectors in
the tangent space at $P$, i.e., $T_P\mathcal{S}^n_{++}$. We denote $g_P(\cdot,\cdot)$ as the Riemannian metric at $P$, and $\mathrm{Log}_P(\cdot)$
as the Riemannian logarithm at $P$, respectively. Also, $\mathrm{Chol}$ and $\mathrm{mlog}$ represent the Cholesky
decomposition and matrix logarithm, with their differential maps at $P$ denoted as $\mathrm{Chol}_{*,P}$ and $\mathrm{mlog}_{*,P}$,
respectively. We denote $\tilde{V} = \mathrm{Chol}_{*,P}(V)$, $\tilde{W} = \mathrm{Chol}_{*,P}(W)$, $L = \mathrm{Chol}(P)$, and $K = \mathrm{Chol}(Q)$. $\lfloor\cdot\rfloor$
is the strictly lower part of a square matrix, $\mathbb{D}(\cdot)$ is a diagonal matrix with diagonal elements of a
square matrix, and $\mathrm{Dlog}(\cdot)$ is a diagonal matrix consisting of the logarithm of the diagonal entries
of a square matrix. We denote $\mathcal{L}_{P,M}[V]$ as the generalized Lyapunov operator, i.e., the solution to
the matrix linear system $M\mathcal{L}_{P,M}[V]P + P\mathcal{L}_{P,M}[V]M = V$. When $M = I$, $\mathcal{L}_{P,I}[V]$ is reduced to the
Lyapunov operator, denoted as $\mathcal{L}_P[V]$.
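The generalized Lyapunov operator defined above can be sketched with a dense Kronecker-product solve. This is an illustrative approach suited to small n only, not necessarily how the operator is computed in practice; the function names and test matrices are assumptions.

```python
import numpy as np

def generalized_lyapunov(P, M, V):
    """Solve M L P + P L M = V for L, with P, M SPD and V symmetric."""
    n = P.shape[0]
    # With row-major vectorization, vec(A X B) = kron(A, B^T) vec(X).
    K = np.kron(M, P.T) + np.kron(P, M.T)
    return np.linalg.solve(K, V.reshape(-1)).reshape(n, n)

def lyapunov(P, V):
    """Lyapunov operator: the solution of L P + P L = V (the case M = I)."""
    return generalized_lyapunov(P, np.eye(P.shape[0]), V)

rng = np.random.default_rng(0)
X, Y, Z = (rng.standard_normal((3, 3)) for _ in range(3))
P = X @ X.T + np.eye(3)  # hypothetical SPD matrices
M = Y @ Y.T + np.eye(3)
V = (Z + Z.T) / 2        # hypothetical symmetric tangent vector

L_gen = generalized_lyapunov(P, M, V)
L_std = lyapunov(P, V)
```

The coefficient matrix is a sum of two Kronecker products of SPD matrices, so it is itself SPD and the linear system has a unique solution.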
E TECHNICAL DETAILS ON RIEMANNIAN LOGARITHM
We first review a well-known result of the pullback metric (Thanwerdas & Pennec, 2022, Tab. 2).
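Table 9: Riemannian operators and deformed metrics of seven basic metrics on SPD manifolds. Note
that for MPEM, $P$ and $Q$ must be commuting matrices when computing the Riemannian logarithm.
Name | Riemannian Metric $g_P(V, W)$ | Riemannian Logarithm $\mathrm{Log}_P Q$ | Deformation ($\theta \neq 0$)
$(\alpha, \beta)$-LEM (Thanwerdas & Pennec, 2023) | $\langle \mathrm{mlog}_{*,P}(V), \mathrm{mlog}_{*,P}(W)\rangle^{(\alpha,\beta)}$ | $(\mathrm{mlog}_{*,P})^{-1}[\mathrm{mlog}(Q) - \mathrm{mlog}(P)]$ | $\frac{1}{\theta^2}\mathrm{Pow}_\theta^* g^{(\alpha,\beta)\text{-LE}}$
$(\alpha, \beta)$-AIM (Thanwerdas & Pennec, 2023) | $\langle P^{-1}V, WP^{-1}\rangle^{(\alpha,\beta)}$ | $P^{1/2}\,\mathrm{mlog}(P^{-1/2}QP^{-1/2})\,P^{1/2}$ | $\frac{1}{\theta^2}\mathrm{Pow}_\theta^* g^{(\alpha,\beta)\text{-AI}}$
$(\alpha, \beta)$-EM (Thanwerdas & Pennec, 2023) | $\langle V, W\rangle^{(\alpha,\beta)}$ | $Q - P$ | $\frac{1}{\theta^2}\mathrm{Pow}_\theta^* g^{(\alpha,\beta)\text{-E}}$
$(\theta_1, \theta_2)$-EM (Thanwerdas & Pennec, 2022) | $\frac{1}{\theta_1\theta_2}\langle \mathrm{Pow}_{\theta_1 *,P}(V), \mathrm{Pow}_{\theta_2 *,P}(W)\rangle$ | $(\mathrm{Pow}_{\theta *,P})^{-1}(Q^\theta - P^\theta)$, with $\theta = (\theta_1 + \theta_2)/2$ | N/A
LCM (Lin, 2019) | $\sum_{i>j}\tilde{V}_{ij}\tilde{W}_{ij} + \sum_{j=1}^{n}\tilde{V}_{jj}\tilde{W}_{jj}L_{jj}^{-2}$ | $(\mathrm{Chol}^{-1})_{*,L}\big[\lfloor K\rfloor - \lfloor L\rfloor + \mathbb{D}(L)\,\mathrm{Dlog}(\mathbb{D}(L)^{-1}\mathbb{D}(K))\big]$ | $\frac{1}{\theta^2}\mathrm{Pow}_\theta^* g^{\text{LC}}$
BWM (Bhatia et al., 2019) | $\frac{1}{2}\langle \mathcal{L}_P[V], W\rangle$ | $(PQ)^{1/2} + (QP)^{1/2} - 2P$ | $\frac{1}{4\theta^2}\mathrm{Pow}_{2\theta}^* g^{\text{BW}}$
GBWM (Han et al., 2023) | $\frac{1}{2}\langle \mathcal{L}_{P,M}[V], W\rangle$ | $M(M^{-1}PM^{-1}Q)^{1/2} + (QM^{-1}PM^{-1})^{1/2}M - 2P$ | $\frac{1}{4\theta^2}\mathrm{Pow}_{2\theta}^* g^{M\text{-BW}}$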
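Lemma 4. Given a Riemannian metric $g$ on the SPD manifold $\mathcal{S}^n_{++}$ and a diffeomorphism
$f: \mathcal{S}^n_{++} \to \mathcal{S}^n_{++}$, the Riemannian logarithm $\widetilde{\mathrm{Log}}_P$ under the pullback metric $\tilde{g} = f^*g$ is
$$\widetilde{\mathrm{Log}}_P Q = (f_{*,P})^{-1}\,\mathrm{Log}_{f(P)} f(Q), \tag{21}$$
where $f_{*,P}$ is the differential map at $P$, and $\mathrm{Log}$ is the Riemannian logarithm under $g$.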
Next, we show a lemma about the scaling of a Riemannian metric.
Lemma 5. Supposing $\mathcal{S}^n_{++}$ is endowed with a Riemannian metric $g$ and $a > 0$ is a positive real
scalar, the scaling metric $ag$ shares the same Riemannian logarithm map with $g$.
Proof. Since the Christoffel symbols of $ag$ are identical to those of $g$, the geodesic functions under
both $ag$ and $g$ remain unchanged. This implies that the Riemannian exponential maps are the same
for $ag$ and $g$. As the inverse of the Riemannian exponential maps, the Riemannian logarithm maps
under $ag$ and $g$ are also identical.
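The first step of this proof can be made explicit in local coordinates. The derivation below is a standard fact, written with coordinate indices $i, j, k, l$ and the symbol $\tilde{\Gamma}$ for the Christoffel symbols of $ag$, neither of which appears in the original text:

```latex
\begin{aligned}
\tilde{\Gamma}^k_{ij}
  &= \tfrac{1}{2}\,(ag)^{kl}\left(\partial_i (ag)_{jl} + \partial_j (ag)_{il} - \partial_l (ag)_{ij}\right) \\
  &= \tfrac{1}{2}\,\tfrac{1}{a}\,g^{kl}\cdot a\left(\partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij}\right) \\
  &= \Gamma^k_{ij}.
\end{aligned}
```

The constant $a$ passes through the derivatives and cancels against the factor $1/a$ from the inverse metric, so the geodesic equation is the same under $ag$ and $g$.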
By the above lemmas, we can readily prove Tab. 2.
Proof. By Lem. 5, for the power-deformed metric of a metric $g$ in $\mathcal{S}^n_{++}$, the Riemannian logarithm
at $I$ is the same as the counterpart under $\mathrm{Pow}_\theta^* g$. Therefore, in the following, without loss of
generality, we compute $\mathrm{Log}_I$ under $\mathrm{Pow}_\theta^* g$. We further denote the Riemannian logarithm under $g$
as $\overline{\mathrm{Log}}$.
In the following, we denote $P$ as an SPD matrix, $0$ as the $n \times n$ zero matrix, and $V$ as a tangent
vector in $T_I\mathcal{S}^n_{++}$. Besides, we note that
$$\mathrm{Pow}_{\theta *,I}(V) = \theta V. \tag{22}$$
We first deal with $(\alpha, \beta)$-LEM and $\theta$-LCM, as both of them are pullback metrics from the Euclidean
space. Then, we proceed to deal with other metrics.
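$(\alpha, \beta)$-LEM: As shown by Thanwerdas & Pennec (2023), the Riemannian logarithm at $I$ is
$$\mathrm{Log}_I(P) = (\mathrm{mlog}_{*,I})^{-1}(\mathrm{mlog}(P) - \mathrm{mlog}(I)) = \mathrm{mlog}(P). \tag{23}$$
$\theta$-LCM: We define a map $f$ as
$$f = \psi \circ \mathrm{Chol} \circ \mathrm{Pow}_\theta, \tag{24}$$
where $\psi(L) = \lfloor L\rfloor + \mathrm{Dlog}(\mathbb{D}(L))$ for the lower triangular matrix $L$. Chen et al. (2024e) shows that
LCM is the pullback metric by $\psi \circ \mathrm{Chol}$ from the Euclidean space $\mathcal{L}^n$ of lower triangular matrices.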
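Therefore, $\mathrm{Pow}_\theta^* g^{\text{LC}}$ is the pullback metric from $\mathcal{L}^n$ by $f$. Besides, we have the following:
$$f(P) = \lfloor\tilde{L}\rfloor + \mathrm{Dlog}(\mathbb{D}(\tilde{L})), \tag{25}$$
$$f(I) = 0, \tag{26}$$
$$f_{*,I}(V) = \theta\left(\lfloor V\rfloor + \tfrac{1}{2}\mathbb{D}(V)\right), \tag{27}$$
where $\tilde{L} = \mathrm{Chol}(P^\theta)$. We have
$$\mathrm{Log}_I(P) = (f_{*,I})^{-1}(f(P) - f(I)) = \frac{1}{\theta}\left[\lfloor\tilde{L}\rfloor + \lfloor\tilde{L}\rfloor^\top + 2\,\mathrm{Dlog}(\mathbb{D}(\tilde{L}))\right]. \tag{28}$$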
For $(\theta, \alpha, \beta)$-EM, $(\theta_1, \theta_2)$-EM, $(\theta, \alpha, \beta)$-AIM, $2\theta$-BWM, and $(2\theta, P^{2\theta})$-BWM, we denote $\mathrm{Log}_I$ as
their logarithm at $I$, while $\overline{\mathrm{Log}}_I$ as the logarithm under the metric before deformation. The results
can be directly obtained by Eq. (22), Lem. 4, Lem. 5, and Tab. 9.
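$(\theta, \alpha, \beta)$-EM:
$$\mathrm{Log}_I(P) = \frac{1}{\theta}\overline{\mathrm{Log}}_I(P^\theta) = \frac{1}{\theta}\left(P^\theta - I\right). \tag{29}$$
$(\theta_1, \theta_2)$-EM: The $\mathrm{Log}_I$ can be directly obtained by Tab. 9 and Eq. (22).
$(\theta, \alpha, \beta)$-AIM:
$$\mathrm{Log}_I(P) = \frac{1}{\theta}\overline{\mathrm{Log}}_I(P^\theta) = \frac{1}{\theta}\mathrm{mlog}(P^\theta) = \mathrm{mlog}(P). \tag{30}$$
$2\theta$-BWM:
$$\mathrm{Log}_I(P) = \frac{1}{2\theta}\overline{\mathrm{Log}}_I(P^{2\theta}) = \frac{1}{\theta}\left(P^\theta - I\right). \tag{31}$$
$(2\theta, P^{2\theta})$-BWM: Under $M$-BWM, we have
$$\mathrm{Log}_I(M) = 2\left(M^{\frac{1}{2}} - I\right). \tag{32}$$
Therefore, for $(2\theta, P^{2\theta})$-BWM, we have
$$\mathrm{Log}_I(P) = \frac{1}{2\theta}\overline{\mathrm{Log}}_I(P^{2\theta}) = \frac{1}{\theta}\left(P^\theta - I\right). \tag{33}$$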
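The closed forms in Eqs. (29) to (31) can be checked numerically. The sketch below is only an illustration: the helper `spd_fn` (applying a scalar function to the eigenvalues), the function names, and the test matrix are assumptions, and the BWM logarithm at the identity is taken from the corresponding row of Tab. 9.

```python
import numpy as np

def spd_fn(P, fn):
    """Apply a scalar function to the eigenvalues of an SPD matrix."""
    w, U = np.linalg.eigh(P)
    return (U * fn(w)) @ U.T

def log_id_power_em(P, theta):
    """Eq. (29): Log_I(P) = (P^theta - I) / theta."""
    return (spd_fn(P, lambda w: w**theta) - np.eye(P.shape[0])) / theta

def log_id_power_aim(P, theta):
    """Eq. (30): Log_I(P) = mlog(P^theta) / theta."""
    return spd_fn(spd_fn(P, lambda w: w**theta), np.log) / theta

def log_id_power_bwm(P, theta):
    """Eq. (31): (1 / (2 theta)) times the BWM logarithm at I of P^(2 theta)."""
    I = np.eye(P.shape[0])
    Q = spd_fn(P, lambda w: w**(2 * theta))
    # BWM logarithm at I: (IQ)^{1/2} + (QI)^{1/2} - 2I = 2 (Q^{1/2} - I).
    bw_log = 2 * (spd_fn(Q, np.sqrt) - I)
    return bw_log / (2 * theta)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 16))
P = X @ X.T / 16 + 0.1 * np.eye(4)  # hypothetical SPD matrix
theta = 0.5

em, aim, bwm = (fn(P, theta) for fn in (log_id_power_em, log_id_power_aim, log_id_power_bwm))
```

Here `aim` coincides with the plain matrix logarithm for any θ, while `em` and `bwm` coincide with each other, matching the final expressions in Eqs. (29) to (31).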
F POWER-DEFORMED GBWM AS LOCAL POWER-AIM
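Let us first formalize this property.
Proposition 6. For any $P \in \mathcal{S}^n_{++}$ and $V, W \in T_P\mathcal{S}^n_{++}$, we have the following:
$$g_P^{(2\theta, P^{2\theta})\text{-BW}}(V, W) = \frac{1}{4}\,g_P^{(2\theta,1,0)\text{-AI}}(V, W). \tag{34}$$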
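Proof. As shown by Bhatia (2009), the Riemannian metric of the standard AIM ((1,1,0)-AIM) is
$$g_P^{\text{AI}}(V, W) = \mathrm{vec}(V)^\top (P \otimes P)^{-1}\,\mathrm{vec}(W), \tag{35}$$
where $\mathrm{vec}(V)$ is the column vectorization of $V$, and $\otimes$ is the Kronecker product.
For the $(2\theta, P^{2\theta})$-BWM, we have the following:
$$\begin{aligned}
g_P^{(2\theta, P^{2\theta})\text{-BW}}(V, W)
&= \frac{1}{4\theta^2}\,g_{\tilde{P}}^{\phi_{2\theta}(P)\text{-BW}}(\tilde{V}, \tilde{W}) \\
&= \frac{1}{4}\cdot\frac{1}{4\theta^2}\,\mathrm{vec}(\tilde{V})^\top(\tilde{P} \otimes \tilde{P})^{-1}\,\mathrm{vec}(\tilde{W}) \\
&= \frac{1}{4}\cdot\frac{1}{4\theta^2}\,g_{\tilde{P}}^{\text{AI}}(\tilde{V}, \tilde{W}) \\
&= \frac{1}{4}\,g_P^{(2\theta,1,0)\text{-AI}}(V, W),
\end{aligned} \tag{36}$$
where $\tilde{V} = \mathrm{Pow}_{2\theta *,P}(V)$, $\tilde{W} = \mathrm{Pow}_{2\theta *,P}(W)$, $\tilde{P} = P^{2\theta}$, and Eq. (36) can be obtained by Han et al.
(2023, Eq. 3).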
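As a sanity check on Prop. 6, the special case $\theta = 1/2$ can be worked out by hand from the definitions in App. D.2 and Tab. 9. In this case $\mathrm{Pow}_{2\theta}$ is the identity map, so $\tilde{V} = V$, $\tilde{W} = W$, $\tilde{P} = P$, and $M = P$. The following is an illustrative computation under this reading, not part of the original proof:

```latex
\begin{aligned}
P\,\mathcal{L}_{P,P}[V]\,P + P\,\mathcal{L}_{P,P}[V]\,P = V
  \;&\Longrightarrow\; \mathcal{L}_{P,P}[V] = \tfrac{1}{2}\,P^{-1} V P^{-1}, \\
g_P^{(1, P)\text{-BW}}(V, W)
  &= \tfrac{1}{2}\,\langle \mathcal{L}_{P,P}[V], W \rangle
   = \tfrac{1}{4}\,\mathrm{tr}\!\left(P^{-1} V P^{-1} W\right) \\
  &= \tfrac{1}{4}\,\mathrm{vec}(V)^\top (P \otimes P)^{-1}\,\mathrm{vec}(W)
   = \tfrac{1}{4}\,g_P^{(1,1,0)\text{-AI}}(V, W).
\end{aligned}
```

The identity $\mathrm{tr}(P^{-1}VP^{-1}W) = \mathrm{vec}(V)^\top(P \otimes P)^{-1}\mathrm{vec}(W)$ used in the last line is the standard relation between the trace form and the Kronecker form of the AIM in Eq. (35).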
G ADDITIONAL DISCUSSIONS ON POW-TMLR, POW-EMLR, AND
SCALEPOW-EMLR
G.1 THE EQUIVALENCE OF POW-EMLR AND SCALEPOW-EMLR
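It can be proven that Pow-EMLR is equivalent to ScalePow-EMLR under scaled initial weight and
learning rate in the FC layer. We denote the network as
$$x_0 \in \mathbb{R}^{d_0} \xrightarrow{\;g(\cdot;\Theta)\;} x \in \mathbb{R}^{d} \xrightarrow{\;\text{FC}\;} y \in \mathbb{R}^{c} \to L \in \mathbb{R}, \tag{37}$$
where $x_0$, $g(\cdot;\Theta)$, FC, and $L$ are the input feature, feature extraction with parameter $\Theta$, FC layer,
and loss, respectively. The FC layers in Pow-EMLR and ScalePow-EMLR are denoted as $y = Ax + b$
and $\bar{y} = \frac{1}{\theta}\bar{A}\bar{x} + \bar{b}$. We set the initial values and learning rates of $A$ and $\bar{A}$ satisfying $A_0 = \frac{1}{\theta}\bar{A}_0$ and
$\bar{\gamma} = \theta^2\gamma$, and maintain all the other settings the same. Then, we have the following for the gradient
at $A = A_0$ (or $\bar{A} = \bar{A}_0$):
$$\begin{aligned}
\frac{\partial L}{\partial \bar{A}} &= \frac{1}{\theta}\frac{\partial L}{\partial \bar{y}}\bar{x}^\top = \frac{1}{\theta}\frac{\partial L}{\partial y}x^\top = \frac{1}{\theta}\frac{\partial L}{\partial A}, \\
\frac{\partial L}{\partial \bar{x}} &= \frac{1}{\theta}\bar{A}^\top\frac{\partial L}{\partial \bar{y}} = A^\top\frac{\partial L}{\partial y} = \frac{\partial L}{\partial x}.
\end{aligned} \tag{38}$$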
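Under SGD, the updated value of $\bar{A}$ satisfies the following:
$$\begin{aligned}
\frac{1}{\theta}\bar{A}_1 &= \frac{1}{\theta}\left(\bar{A}_0 + \bar{\gamma}\frac{\partial L}{\partial \bar{A}}\right) \\
&= \frac{1}{\theta}\left(\bar{A}_0 + \bar{\gamma}\frac{1}{\theta}\frac{\partial L}{\partial A}\right) \\
&= \frac{1}{\theta}\bar{A}_0 + \bar{\gamma}\frac{1}{\theta^2}\frac{\partial L}{\partial A} \\
&= A_0 + \gamma\frac{\partial L}{\partial A} \\
&= A_1.
\end{aligned} \tag{39}$$
Therefore, the updated values of $A$ and $\bar{A}$ still satisfy $A_1 = \frac{1}{\theta}\bar{A}_1$. In addition, the gradients of
Pow-EMLR w.r.t. $x$ and $b$ are identical to ScalePow-EMLR w.r.t. $\bar{x}$ and $\bar{b}$. Therefore, Pow-EMLR
is equivalent to ScalePow-EMLR under scaled settings.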
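The argument above can be illustrated with a small numerical simulation. The sketch below makes several simplifying assumptions that are not in the original: a single fixed input feature, a squared loss instead of cross-entropy, and the usual descent sign in the SGD update (the scaling argument does not depend on the sign). The function name `train` and all constants are hypothetical.

```python
import numpy as np

def train(A, b, x, t, lr_A, lr_b, scale, steps=20):
    """Plain SGD on y = scale * A @ x + b with loss 0.5 * ||y - t||^2."""
    A, b = A.copy(), b.copy()
    for _ in range(steps):
        y = scale * A @ x + b
        dy = y - t                        # dL/dy
        grad_A = scale * np.outer(dy, x)  # dL/dA picks up the factor `scale`
        grad_b = dy                       # dL/db
        A -= lr_A * grad_A
        b -= lr_b * grad_b
    return A, b

theta, gamma = 0.5, 0.05
rng = np.random.default_rng(0)
x, t = rng.standard_normal(6), rng.standard_normal(3)
A0, b0 = rng.standard_normal((3, 6)), rng.standard_normal(3)

# Pow-EMLR: y = A x + b, learning rate gamma.
A, b = train(A0, b0, x, t, lr_A=gamma, lr_b=gamma, scale=1.0)

# ScalePow-EMLR: y = (1/theta) A_bar x + b_bar, with A_bar_0 = theta * A_0
# and learning rate theta^2 * gamma for A_bar; the bias settings are unchanged.
A_bar, b_bar = train(theta * A0, b0, x, t,
                     lr_A=theta**2 * gamma, lr_b=gamma, scale=1.0 / theta)
```

After any number of steps, `A` matches `A_bar / theta` and `b` matches `b_bar`, so the two layers produce the same outputs throughout training.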
G.2 THE IN-EQUIVALENCE OF POW-EMLR AND POW-TMLR
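We denote $X = S^\theta$. Then for Pow-TMLR, we have the following
$$\begin{aligned}
y &= F\left(\mathrm{vec}\left(\tfrac{1}{\theta}(X - I)\right); A, b\right) \\
&= F\left(\mathrm{vec}(X - I); \tilde{A}, b\right) \\
&= F\left(\mathrm{vec}(X); \tilde{A}, \tilde{b}\right),
\end{aligned} \tag{40}$$
where $\tilde{A} = \frac{1}{\theta}A$ and $\tilde{b} = b - \frac{1}{\theta}A\,\mathrm{vec}(I)$.
As $A$ appears in $\tilde{b}$, the gradient of $A$ is composed of two parts, one w.r.t. $y$ and the other one
w.r.t. $\tilde{b}$. In contrast, in the standard FC layer $y = F(\mathrm{vec}(X); A, b)$, the gradient of $A$ is independent of
$b$. Therefore, Pow-TMLR cannot be simply viewed as equivalent to Pow-EMLR with transformed
initialization.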
G.3 A RIEMANNIAN PERSPECTIVE OF POW-TMLR VS. POW-EMLR
Although the numerical expressions of Pow-TMLR and Pow-EMLR differ by a constant
transformation, they differ fundamentally in theory: Pow-TMLR is a tangent classifier, whereas Pow-EMLR
is a Riemannian classifier.
1. Tangent Classifier: The tangent classifier treats the entire manifold as a single tangent
space at the identity matrix. When mapping data into this tangent space, critical structural
information, such as distances and angles, cannot be preserved. This distortion undermines
classification performance. In contrast, Riemannian MLR is constructed based on
Riemannian geometry, fully respecting the manifold's geometric structure.
2. Tangent as a Special Case of Riemannian Classifier: The tangent classifier can be seen
as a reduced case of the Riemannian classifier. For example, consider Eq. (16). When all
SPD parameters $P_k$ are set to the fixed identity matrix, Eq. (16) exactly
corresponds to Pow-TMLR.
In summary, the Riemannian classifier enjoys significant theoretical advantages over the tangent
classifier while incorporating the tangent classifier as a special case.
H ADDITIONAL EXPERIMENTAL DETAILS
H.1 DATASETS
The Caltech University Birds (Birds) (Welinder et al., 2010) dataset is composed of 11,788 images
distributed over 200 different bird species. The FGVC Aircrafts (Aircrafts) (Maji et al., 2013) dataset
comprises 10,000 images of 100 classes of airplanes, while the Stanford Cars (Cars) (Krause et al.,
2013) dataset consists of 16,185 images representing 196 classes of cars. In addition to these widely
used FGVC datasets, we also evaluate our proposed theory on the large-scale ImageNet-1k (Deng
et al., 2009) dataset, which contains 1.28M training images, 50K validation images and 100K testing
images distributed across 1K classes.
H.2 IMPLEMENTATION DETAILS
We follow the official PyTorch code of iSQRT-COV [1] (Li et al., 2018) to reimplement GCP.
Following Wang et al. (2020a); Song et al. (2022a), we use ResNet-18 as our backbone network on the
ImageNet dataset, and ResNet-50 on the other three FGVC datasets. On both the ImageNet-1k and
FGVC datasets, the ResNet-18 and ResNet-50 are trained from scratch with the GCP layer.
[1] https://github.com/jiangtaoxie/fast-MPN-COV
As the matrix square root is the most effective matrix function in GCP, we set power = 1/2 for matrix
power normalization. Following Song et al. (2022a), we reduce the channels of the final
convolutional features from 2048 to 256 for compact representation of covariance matrices, producing
256 × 256 spatial covariance matrices. We train the network from scratch with an SGD optimizer on
all datasets. For a fair comparison, the learning settings are identical for Pow-EMLR and
ScalePow-EMLR, while we fine-tune Pow-TMLR w.r.t. learning rate, classifier factor, and batch size on three
FGVC datasets. The learning rate of the FC layer is set to be k times larger than the convolutional
layers, where k is called the classifier factor. We use a step-wise learning scheduler, dividing the
learning rate by 5 at n-th epoch. Tab. 10 summarizes the hyper-parameters in our main experiments
in Tab. 5.
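The step-wise schedule and the classifier factor described above can be sketched as follows. The function names are hypothetical, and the convention that the drop takes effect once the epoch index reaches a milestone is an assumption rather than something stated in the text.

```python
def step_lr(base_lr, epoch, milestones, factor=5.0):
    """Learning rate at `epoch` when it is divided by `factor` at each milestone."""
    drops = sum(epoch >= m for m in milestones)  # milestones already reached
    return base_lr / factor**drops

def fc_lr(conv_lr, classifier_factor):
    """The FC layer uses a learning rate k times that of the convolutional layers."""
    return classifier_factor * conv_lr

# Hypothetical usage with a base rate of 5e-3, milestones [21, 50], and k = 5.
conv_rates = [step_lr(5e-3, e, [21, 50]) for e in (0, 21, 50)]
fc_rates = [fc_lr(r, 5) for r in conv_rates]
```

Keeping the classifier factor separate from the schedule makes it easy to apply the same milestones to both parameter groups.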
For Cho-TMLR, the learning rate is set as 1e−2, 5e−3, and 3e−3 for the convolutional layers on the
Aircrafts, Birds, and Cars datasets. The batch size on the Cars dataset is 4. On the Aircrafts dataset,
the training lasts 60 epochs with the learning rate divided by 5 at epoch 50. On the Cars and Birds
datasets, the training lasts 120 epochs with a learning rate reduction by a divisor of 10 at epoch 100.
The experiments on ImageNet use a workstation with 32-core AMD EPYC 7302 CPU and an
NVIDIA RTX A6000, while other experiments use a workstation with 16-core AMD EPYC 7302
CPU and an NVIDIA GeForce RTX 2080 Ti GPU. Due to the heavy computational burden of
Cholesky decomposition, we do not implement Cho-TMLR on the ImageNet.
Table 10: Summary of hyper-parameters.
Backbone | Dataset | Classifiers | Learning Rate | Weight Decay | Classifier Factor | LR Scheduler | Batch Size | Epoch
ResNet-18 | ImageNet | Pow-EMLR, ScalePow-EMLR, Pow-TMLR | 1e−1.1 | 1e−4 | 1 | [30,45,60] | 256 | 60
ResNet-50 | Aircrafts | Pow-EMLR, ScalePow-EMLR, Pow-TMLR | 5e−3 | 1e−4 | 5 | [21,50] | 8 | 50
ResNet-50 | Birds | Pow-EMLR, ScalePow-EMLR | 5e−2 | 1e−4 | 10 | [50,100] | 10 | 100
ResNet-50 | Birds | Pow-TMLR | 5e−3 | 1e−4 | 1 | [50,100] | 6 | 100
ResNet-50 | Cars | Pow-EMLR, ScalePow-EMLR | 5e−2 | 1e−4 | 10 | [50,100] | 10 | 100
ResNet-50 | Cars | Pow-TMLR | 5e−3 | 1e−4 | 1 | [50,100] | 10 | 100
H.3 EXPERIMENTS ON THE SECOND-ORDER TRANSFORMER
Table 11: Comparison of Pow-EMLR, ScalePow-EMLR and Pow-TMLR under the SoT-7 backbone
on the ImageNet-1k dataset.
Classifier | Top-1 Acc (%) | Top-5 Acc (%)
Pow-TMLR | 75.79 | 92.91
ScalePow-EMLR | 76.14 | 93.18
Pow-EMLR | 76.11 | 93.05
To further validate our findings, we follow Song et al. (2022b) to conduct experiments using the
Second-order Transformer (SoT) (Xie et al., 2021) on the ImageNet-1k dataset. Specifically, we use
the 7-layer SoT (SoT-7) architecture as the backbone network and train the model up to 250 epochs
with a batch size of 512, keeping the other settings the same as Song et al. (2022b).
As shown in Tab. 11, Pow-EMLR still achieves similar performance to ScalePow-EMLR, but
outperforms Pow-TMLR. These results further support our claim that tangent classifiers cannot adequately
explain the matrix functions used in GCP, while the underlying mechanism can be better explained
by our Riemannian perspective.
H.4 ABLATIONS ON THE BIAS REFORMULATION
Recalling the vanilla Pow-EMLR $\langle A_k, S^\theta\rangle - b_k$, it can be rewritten as $\langle A_k, S^\theta - P_k\rangle$ according to
Eq. (10), where $\langle P_k, A_k\rangle = b_k$. This reformulation is the key step for extending Euclidean MLR.
Although this reformulation has shown success in different Riemannian MLRs (Ganea et al., 2018;
Shimizu et al., 2020; Chen et al., 2024a;d; Nguyen & Yang, 2023), we conduct ablations on this
reformulation. For simplicity, we set $P_k$ identical for all $k$; otherwise, it will bring a $[k, n, n]$
intermediate tensor for each $[n, n]$ covariance. We compare the following for MLR:
$$\text{Pow-EMLR:}\quad \langle A_k, S^\theta\rangle - b_k, \tag{41}$$
$$\text{Pow-EMLR':}\quad \langle A_k, S^\theta - P\rangle, \tag{42}$$
where $A_k, P \in \mathcal{S}^n$. Tab. 12 shows the results on all three fine-grained datasets. We observe that
Pow-EMLR performs similarly to Pow-EMLR'.
Table 12: Comparison of Pow-TMLR, Pow-EMLR, and Pow-EMLR' on all three fine-grained
datasets.
Classifier | Air Top-1 Acc (%) | Air Top-5 Acc (%) | Birds Top-1 Acc (%) | Birds Top-5 Acc (%) | Car Top-1 Acc (%) | Car Top-5 Acc (%)
Pow-TMLR | 69.58 | 88.68 | 52.97 | 77.8 | 51.14 | 74.29
Pow-EMLR' | 73.03 | 90.4 | 63.96 | 85.02 | 80.06 | 94.02
Pow-EMLR | 72.07 | 89.83 | 63.29 | 84.66 | 80.43 | 94.15
I PROOF OF THM. 2
This proposition is mainly inspired by Chen et al. (2024a, Thm. 5). However, all the results by Chen
et al. (2024a) require the metric to be a pullback metric from a standard Euclidean space, while the
metric in our Thm. 2 is a pullback metric from the SPD manifold. Nevertheless, we still can reach
similar theoretical results. We first recap RSGD and then begin to present our proof.
RSGD (Bonnabel, 2013) is formulated as
$$\bar{W} = \mathrm{Exp}_W\left(-\gamma\,\Pi_W(\nabla_W f)\right), \tag{43}$$
where $\mathrm{Exp}_W$ is the Riemannian exponential map at $W$, $\Pi_W$ maps the Euclidean gradient $\nabla_W f$
to the Riemannian gradient, and $\gamma$ denotes the learning rate.
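To make the update in Eq. (43) concrete, the following sketch performs one RSGD step on the SPD manifold under the standard AIM. The choice of metric is an assumption made purely for illustration (the proof below concerns a different, pullback metric), and it relies on the standard AIM formulas: the Riemannian gradient is $P\,\mathrm{sym}(\nabla)\,P$ and the exponential map is $P^{1/2}\exp(P^{-1/2}VP^{-1/2})P^{1/2}$. All function names and the toy objective are hypothetical.

```python
import numpy as np

def sym_fn(S, fn):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, U = np.linalg.eigh(S)
    return (U * fn(w)) @ U.T

def rsgd_step_aim(P, egrad, lr):
    """One RSGD step (Eq. (43)) for an SPD parameter P under the standard AIM."""
    egrad = (egrad + egrad.T) / 2
    rgrad = P @ egrad @ P  # Pi_P: Euclidean gradient -> Riemannian gradient
    P_half = sym_fn(P, np.sqrt)
    P_inv_half = sym_fn(P, lambda w: 1.0 / np.sqrt(w))
    inner = P_inv_half @ (-lr * rgrad) @ P_inv_half
    inner = (inner + inner.T) / 2  # remove round-off asymmetry
    return P_half @ sym_fn(inner, np.exp) @ P_half  # Exp_P(-lr * rgrad)

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 8))
P = X @ X.T / 8 + 0.5 * np.eye(3)  # hypothetical SPD parameter
T = np.eye(3)                      # hypothetical target

def loss(Q):
    """Toy objective 0.5 * ||Q - T||_F^2, whose Euclidean gradient is Q - T."""
    return 0.5 * np.sum((Q - T) ** 2)

P_new = rsgd_step_aim(P, P - T, lr=1e-3)
```

Unlike a plain Euclidean step, the update keeps `P_new` symmetric positive definite for any step size, because the exponential map never leaves the manifold.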
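We denote (1,0)-EM as EM, and the metric tensor of it as $g^{\text{E}}$. Instead of providing an ad hoc proof
exclusively for PEM, we present the following two more general lemmas.
Lemma 7. Given a diffeomorphism $\phi: \mathcal{S}^n_{++} \to \mathcal{S}^n_{++}$, $\phi$ induces a pullback metric on $\mathcal{S}^n_{++}$ from
$\{\mathcal{S}^n_{++}, g^{\text{E}}\}$, denoted as $g^{\phi\text{-E}}$. The $g^{\phi\text{-E}}$-induced SPD MLR is
$$p(y = k \mid S) \propto \exp\left[\langle \phi(S) - \phi(P_k), \phi_{*,I}(A_k)\rangle\right], \tag{44}$$
where $S \in \mathcal{S}^n_{++}$ is an input feature, and $P_k \in \mathcal{S}^n_{++}$ and $A_k \in \mathcal{S}^n$ are parameters for each class $k$.
Proof. According to Chen et al. (2024d, Thm. 3.3), the Riemannian MLR based on $g^{\phi\text{-E}}$ is given as
$$\begin{aligned}
p(y = k \mid S) &\propto \exp\left[g_{P_k}^{\phi\text{-E}}\left(\mathrm{Log}_{P_k} S, \mathrm{PT}_{I \to P_k} A_k\right)\right] \\
&= \exp\left[\langle \phi(S) - \phi(P_k), \phi_{*,I}(A_k)\rangle\right],
\end{aligned} \tag{45}$$
where Eq. (45) can be obtained by the properties of deformed metrics (Thanwerdas & Pennec, 2022,
Tab. 2) and EM (Thanwerdas & Pennec, 2023, Tab. 3).
Following the notations in Lem. 7, we have the following lemma.
Lemma 8. Supposing $\phi_{*,I}$ is the identity map and each SPD parameter $P_k$ (Euclidean parameter
$A_k$) in Eq. (44) is optimized by $g^{\phi\text{-E}}$-based RSGD (Euclidean SGD), the $g^{\phi\text{-E}}$-based SPD MLR is
equivalent to a Euclidean MLR illustrated in Eq. (10) in the co-domain of $\phi$.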