
Task S6_RT1.3 – Deliverable DS6_RT1.3.4 - Report on novel methodologies for optimal recommendation by empowering Machine Learning algorithms and Learning-to-Rank models

Author: Marcuzzi, Federico; Lucchese, Claudio; Orlando, Salvatore
Publisher: Zenodo
DOI: 10.5281/zenodo.17673612
Source: https://zenodo.org/records/17673612/files/Deliverable_iNEST.pdf
1 INTRODUCTION
This document presents the research and results developed during my research grant (Assegno di Ricerca) for the iNEST project, specifically within the scope of DS6_RT1.3.4, titled "Report on novel methodologies for optimal recommendation by empowering Machine Learning algorithms and Learning-to-Rank models".

The core of the research focused on the design and development of vertical search engines dedicated to tourist destinations, accommodation facilities, and territorial information. These systems were required to incorporate aspects of fairness and/or explainability.

In the context of search engines, and more broadly Information Retrieval (IR), fairness refers to the balanced treatment of items in rankings, avoiding discrimination against entities belonging to sensitive or minority groups. Fairness is closely related to the datasets used to train machine learning (ML) models, which may contain sampling biases or reflect social prejudices. This can lead models to generate rankings that unfairly disadvantage certain user groups (e.g., based on gender) or categories of tourist services (e.g., hotels versus apartments).

Explainability in IR refers to the ability to explain how decisions are made by the ML-based ranking models. The aim is to provide a simple and intuitive link between the input features and the model's relevance predictions, highlighting which aspects of the query, user profile, or returned item influenced the ranking.

In summary, fairness helps build trust among stakeholders, such as hospitality providers who want assurance that their offerings are ranked fairly. Explainability, on the other hand, can improve the experience of tourists and help businesses understand which features they may need to improve in order to gain more visibility.

During the research grant, I extensively analyzed fairness in Learning-to-Rank (LTR), which led to the publication of two project milestones: "LambdaRank Gradients are Incoherent" [16, 19] and "LambdaFair: a Fair and Effective LambdaMART" [20, 21].
2 STATE OF THE ART
Current search engine technologies rely on machine learning algorithms, particularly LTR models [3, 4, 28], to re-rank results presented to users. These systems leverage complex features related to the query and user context, as well as characteristics of the items being ranked (e.g., documents, points of interest), to identify the most relevant results that fulfill users' information needs.

In terms of fairness in LTR models, the literature primarily categorizes existing approaches into three broad groups: pre-processing, in-processing, and post-processing methods [34, 35].

Pre-processing methods aim to mitigate bias in the training data prior to model learning. These techniques are generally model-agnostic and focus on enhancing individual fairness without modifying the underlying ML algorithms. For instance, [15] proposes transforming feature vectors into more equitable representations to improve individual fairness. Similarly, [36] introduces a strategy that balances the dual objectives of maintaining an accurate representation of the data and hiding sensitive group membership information to enhance fairness.

In-processing methods incorporate fairness constraints directly into the learning process, typically by modifying the loss function or adding regularization terms that encourage fairness. A notable example is DELTR [32], which reduces group-level disparities in item exposure within rankings while maintaining relevance. Similarly, Fair-PG-Rank [26] addresses fairness by modeling exposure as expected attention, operating under a merit-based constraint that ensures items receive exposure proportional to their relevance, thus balancing fairness and effectiveness. Recent research has identified stochastic Plackett-Luce (PL) ranking models [18, 24] as a powerful in-processing approach for jointly optimizing effectiveness and fairness metrics. Unlike deterministic algorithms that depend on heuristic optimization, PL models are fully differentiable and can be optimized via stochastic gradient descent directly on ranking metrics. However, estimating gradients in practice is challenging because it involves summing over all possible permutations of items. To tackle this, Oosterhuis [22] proposed PL-Rank, an efficient method to estimate gradients of PL models with respect to both fairness and effectiveness metrics by leveraging specific properties of ranking metrics and PL distributions. Further advancing this line of work, the PL-Rank-3 algorithm [23] introduced by Oosterhuis achieves unbiased gradient estimates with computational efficiency comparable to state-of-the-art sorting algorithms. Building upon these advances, Gorantla et al. [11] proposed Group-Fair-PL, which optimizes group fairness through a novel objective that computes expected ranking utility over only those rankings satisfying explicit group representation constraints. Group-Fair-PL enforces either equal or proportional representation of protected items within the top-$k$ ranks. Equal representation requires the same number of items from each group in the top-$k$, whereas proportional representation reflects the relative group proportions in the dataset.

Post-processing methods modify the output rankings after model inference to satisfy fairness criteria. FA*IR [31] adjusts the ranking to ensure that the proportion of protected candidates meets minimum thresholds at different ranking depths. The CFAΘ algorithm [33] provides a mechanism to continuously trade off between individual and group fairness, enabling practitioners to fine-tune the fairness criteria applied to the final ranking output.
3 BACKGROUND
Learning-to-rank methods are a cornerstone of information retrieval systems, used to re-order a set of documents in response to a query. Given a query $q$ and a candidate document set $D = \{d_1, \dots, d_n\}$, an LTR model learns a scoring function that assigns each document $d_i \in D$ a score $s_i$. The final ranking $\pi$ is obtained by sorting documents in descending order of $s_i$. Formally, $\pi$ is a permutation of $\{1, \dots, n\}$ such that $\pi[r] = i$ means $d_i$ occupies the $r$-th position, where $r \in \mathbb{N}^+$ and $1 \le r \le n$. For any $r_\ell \le r_u$, we write $\pi[r_\ell, r_u]$ to denote the subsequence of documents ranked from position $r_\ell$ to $r_u$. In particular, the prefix of length $r_u$ is $\pi[1, r_u]$. Finally, we use $d_i \prec_\pi d_j$ to indicate that $d_i$ appears before $d_j$ in the ranking $\pi$.
3.1 EFFECTIVENESS METRIC
We evaluate both effectiveness and fairness. For effectiveness, we use the Normalized Discounted Cumulative Gain (NDCG) metric [13]. Let $\pi$ be a ranking of $D$ of size $n$, and let each document $d_i \in D$ have a relevance label $y_i$. The Discounted Cumulative Gain part of NDCG is defined as:

$$\mathrm{DCG}(\pi) = \sum_{r=1}^{n} \frac{2^{y_{\pi[r]}} - 1}{\log_2(r + 1)} \quad \text{and} \quad \mathrm{IDCG}(\pi) = \max_{\pi'} \mathrm{DCG}(\pi').$$

Then, the NDCG metric is defined as:

$$\mathrm{NDCG}(\pi) = \frac{\mathrm{DCG}(\pi)}{\mathrm{IDCG}(\pi)},$$

which yields a value in $[0, 1]$, with higher values indicating better alignment with the true relevance ordering. To reflect typical user behavior, where only the top results are examined, NDCG is computed at cutoff $k$, denoted NDCG@$k$.
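
As an illustration of the definitions above, the following is a minimal Python sketch of NDCG@$k$. The function names `dcg_at_k` and `ndcg_at_k` are hypothetical, and returning 0 for a query without relevant documents is an assumed convention that this report does not state.

```python
import math


def dcg_at_k(labels_in_rank_order, k=None):
    """DCG with gain 2^y - 1 and discount log2(r + 1), positions r starting at 1."""
    top = labels_in_rank_order if k is None else labels_in_rank_order[:k]
    return sum((2 ** y - 1) / math.log2(r + 1) for r, y in enumerate(top, start=1))


def ndcg_at_k(labels_in_rank_order, k=None):
    """NDCG@k: DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = sorted(labels_in_rank_order, reverse=True)
    idcg = dcg_at_k(ideal, k)
    # Assumed convention: a query with no relevant document gets NDCG = 0.
    return dcg_at_k(labels_in_rank_order, k) / idcg if idcg > 0 else 0.0


# Toy query: relevance labels listed in the order produced by the ranker.
print(ndcg_at_k([4, 0, 1]))       # full ranking
print(ndcg_at_k([4, 0, 1], k=2))  # truncated at cutoff k = 2
```

The input is the list of labels already sorted by predicted score, so the ranker itself stays outside the sketch.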
3.2 FAIRNESS METRIC
In [20, 21], the fairness metric we employed is Normalized Discounted Difference (rND) [30], which captures group fairness as statistical parity [34]. Given a ranking $\pi$ of $D$ with $|\mathcal{G}^+|$ protected items, rND evaluates whether, for various cutoffs $r$, the top-$r$ positions contain a fraction of protected items equal to $|\mathcal{G}^+|/n$. Here $\mathcal{G}^+ \subseteq D$ denotes the protected subset.

rND divides the ranking into overlapping prefixes of lengths $r = b, 2b, 3b, \dots$, where $b > 1$ is the bin size. For each prefix length $r$, it computes the absolute deviation from the ideal proportion and discounts it by $\log_2(r)$. The rND metric is defined as follows:

$$\mathrm{rND}(\pi) = \frac{1}{D_{\max}} \sum_{r = b, 2b, \dots}^{n} \frac{1}{\log_2(r)} \left| \frac{|\mathcal{G}^+_{\pi[1,r]}|}{r} - \frac{|\mathcal{G}^+_{\pi[1,n]}|}{n} \right|,$$

where $\mathcal{G}^+_{\pi[1,r]}$ is the set of protected items in the top-$r$, and $D_{\max}$ is the worst-case sum obtained by ranking all elements of the group with smaller cardinality first. The metric lies in $[0, 1]$, with lower values indicating greater fairness. As with NDCG, the rND metric can be computed up to a certain cutoff $k$, i.e., rND@$k$.
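
The definition above can be sketched in Python as follows. This is an illustrative reading of the formula, not the implementation used in [20, 21]: the names `rnd_raw` and `rnd` are hypothetical, the ranking is given as a list of booleans (protected or not) in ranked order, and I assume that $D_{\max}$ is computed under the same bin size and cutoff as the numerator.

```python
import math


def rnd_raw(is_protected, bin_size, cutoff=None):
    """Unnormalized sum of discounted deviations over prefixes b, 2b, 3b, ..."""
    if bin_size <= 1:
        raise ValueError("bin_size must be > 1, otherwise log2(r) can be 0")
    n = len(is_protected)
    if n == 0:
        return 0.0
    overall = sum(is_protected) / n  # |G+| / n
    limit = n if cutoff is None else min(cutoff, n)
    total, seen = 0.0, 0
    for r in range(1, limit + 1):
        seen += is_protected[r - 1]  # protected items in the top-r
        if r % bin_size == 0:
            total += abs(seen / r - overall) / math.log2(r)
    return total


def rnd(is_protected, bin_size, cutoff=None):
    """rND: raw sum divided by the worst case (smaller group ranked first)."""
    n, n_prot = len(is_protected), sum(is_protected)
    minority_is_protected = n_prot <= n - n_prot
    size_min = min(n_prot, n - n_prot)
    worst = [minority_is_protected] * size_min + [not minority_is_protected] * (n - size_min)
    d_max = rnd_raw(worst, bin_size, cutoff)
    # Assumed convention: if no deviation is possible, report perfect fairness.
    return rnd_raw(is_protected, bin_size, cutoff) / d_max if d_max > 0 else 0.0


print(rnd([True, False] * 10, bin_size=10))           # alternating groups
print(rnd([True] * 10 + [False] * 10, bin_size=10))   # one group entirely first
```

The fairness score plotted later as (1 − rND)% is then simply `100 * (1 - rnd(...))`.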
3.3 DATASETS
We performed an extensive evaluation on publicly available datasets. In "LambdaRank Gradients are Incoherent" [16, 19], we used the following three datasets. The Istella-X [17] dataset has the highest average number of documents per query and the largest number of non-relevant documents. The Yahoo! Learning to Rank Challenge Set 1 [6] dataset is the smallest dataset, with about 700,000 documents and an average of 23.73 documents per query. The MSLR Web30K Fold 1 [25] dataset consists of feature vectors extracted from query-document pairs. It is the most balanced dataset, with about half of the documents labeled as relevant. All datasets have graded 5-level relevance labels ranging from 0 (non-relevant) to 4 (highly relevant).

For "LambdaFair: a Fair and Effective LambdaMART" [20, 21] instead, we considered three additional datasets widely used in fairness-aware learning-to-rank research: the two variants of Statlog (German Credit Data) [12] described below, and Home Mortgage Disclosure Act (Connecticut) [8]. The German Credit Data dataset contains binary relevance labels related to creditworthiness, with 1,000 individuals per query sampled to create 100,000 queries. We derived two variants of this dataset by splitting protected and unprotected groups based on:
• Sex: females as protected, males as unprotected [14, 26, 29],
• Age: individuals under 35 as protected, those 35 and older as unprotected [1].

The Home Mortgage Disclosure Act (Connecticut) dataset includes home mortgage loan records across US states since 2007. We focused on Connecticut, using data from 2013–2015 for training, 2016 for validation, and 2017 for testing [11]. Similar to German Credit Data, we sampled 50 individuals per query with a 4:1 ratio of non-approved to approved loans, for a total of 100,000 queries. The protected group consists of females, and the unprotected group of males.

Since MSLR-30K does not provide protected group labels, we followed previous works [1, 14, 27, 29] and used the QualityScore2 feature (QS2, feature ID 133) as a discriminatory attribute. Following Vardasbi et al. [27], documents with QS2 values below 10 are assigned to the protected group, and those with QS2 values equal to or above 10 to the unprotected group.

All datasets are partitioned into training, validation, and test sets following a 60%-20%-20% split.
3.4 LAMBDAMART
LambdaMART [3] is a learning algorithm widely used in IR to train effective ranking models by directly optimizing a target metric. It addresses the non-differentiability and flat regions of ranking metrics [2, 4, 9, 19] by leveraging a smooth gradient approximation.

Formally, let $q$ be a query, $D = \{d_1, \dots, d_n\}$ the candidate documents, and $Y = \{y_1, \dots, y_n\}$ their relevance labels. LambdaMART constructs a ground-truth pairwise preference set $P$ where $(i, j) \in P$ if and only if $y_i > y_j$. For each document $d_i$, LambdaMART computes an approximated gradient $\lambda^Z_i$ of the ranking metric $Z$ (e.g., NDCG@$k$) as:

$$\lambda^Z_i = \sum_{j \,:\, (i,j) \in P} \lambda^Z_{ij} \; - \sum_{k \,:\, (k,i) \in P} \lambda^Z_{ki}, \tag{1}$$

where each partial approximated gradient $\lambda^Z_{ij}$ is defined as:

$$\lambda^Z_{ij} = \frac{\partial C(s_i - s_j)}{\partial s_i} = \frac{-\sigma}{1 + e^{\sigma (s_i - s_j)}} \, \left| \Delta Z_{ij} \right|. \tag{2}$$

Here, $s_i$ and $s_j$ are the scores predicted for $d_i$ and $d_j$, $C$ is the RankNet cost function [5], and $\Delta Z_{ij} = Z(\pi_{ji}) - Z(\pi_{ij})$ denotes the change in metric $Z$ when swapping $d_i$ and $d_j$ in the current ranking. The scalar $\sigma$ controls the sigmoid's steepness.

These $\lambda^Z_i$ values serve as pseudo-gradients in iterative optimization algorithms such as neural networks (e.g., LambdaRank [4]) or gradient-boosted trees (e.g., LambdaMART [3]).
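
For NDCG in particular, the swap term of Equation 2 has a simple closed form, because exchanging two documents changes only their two terms in the DCG sum. The identity below is a standard one rather than a result of this report; it writes $\pi[i]$ for the position of $d_i$, the convention adopted in Section 4.1.1.

```latex
\left| \Delta \mathrm{NDCG}_{ij} \right|
  = \frac{1}{\mathrm{IDCG}(\pi)}
    \left| 2^{y_i} - 2^{y_j} \right|
    \left| \frac{1}{\log_2(\pi[i] + 1)} - \frac{1}{\log_2(\pi[j] + 1)} \right|
```

Under NDCG@$k$, the discount of any position beyond $k$ is taken as 0, so a pair whose documents both lie below the cutoff yields a swap term of 0.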
4 CONTRIBUTION
The primary contribution of the research conducted during the grant period was centered on the development of LTR algorithms with an emphasis on fairness, particularly through in-processing methods. This line of work addresses a critical issue in the deployment of modern search and recommendation systems: the risk of amplifying biases and producing systematically unfair rankings.

The research aimed to integrate fairness considerations directly into the optimization process of LTR models, rather than relying on pre-processing (data transformation) or post-processing (re-ranking) techniques. This in-processing approach is both theoretically and practically significant, as it allows for fairness constraints or objectives to be embedded within the model's learning dynamics, potentially leading to more balanced and equitable outputs without sacrificing effectiveness. Two main contributions were achieved during the grant period, each resulting in peer-reviewed publications:
• LambdaRank Gradients are Incoherent [16, 19]: This work revisits the foundational LambdaRank framework, uncovering a mathematical incoherence in the gradient formulation used for training that can unfairly misrank equally relevant items. By analyzing the gradient dynamics, the research demonstrates that the implicit assumptions made by LambdaRank can lead to unintended optimization behaviors. The paper proposes a refined understanding of ranking gradients, paving the way for more principled and fair LTR methods.
• LambdaFair: a Fair and Effective LambdaMART [20, 21]: Building on the insights above, this paper introduces LambdaFair, an extension of LambdaMART that incorporates fairness constraints directly into the learning process. LambdaFair is designed to mitigate exposure disparity among groups while maintaining competitive ranking performance. Extensive experimental evaluation on publicly available datasets confirms that LambdaFair achieves a better trade-off between fairness and effectiveness compared to existing baselines.

Overall, the research contributes both theoretical insights and practical algorithms that advance the state of the art in fair information retrieval. It demonstrates that it is possible to reconcile the goals of effectiveness and fairness in ranking, and it provides tools that can be adopted or extended in real-world systems.
4.1 LAMBDARANK GRADIENTS ARE INCOHERENT
Many LTR algorithms still rely on gradient-based optimization, either by approximating the ranking metric or by constructing heuristic gradients, as these methods have proven to be highly effective in practice. As mentioned above, LambdaMART optimizes a non-differentiable objective by generating ad hoc gradients for each document. These gradients are based on heuristic assumptions about each document's contribution to the overall ranking and its interactions with other documents. Consequently, the gradients produced by LambdaMART are inherently approximate.

In [16, 19], we demonstrate that LambdaMART and its related methods, such as LambdaRank [4] and the loss functions introduced by [28], exhibit an intrinsic issue stemming from their heuristic foundations: the generation of incoherent gradients. Our analysis revealed three critical and previously undocumented behaviors in LambdaMART and its derivatives:
• i) We found that LambdaMART suffers from gradient incoherencies that hinder the learning process. Specifically, it can assign stronger downward gradient forces to documents with higher relevance than to those with lower relevance. This leads the model to mislearn the correct ranking order.
• ii) We observed that optimizing truncated IR metrics exacerbates these incoherencies, further degrading model performance. Although truncated metrics (e.g., top-$k$) are useful for focusing learning on the top ranks and reducing training time, they also increase the risk of incorrect gradient estimation.
• iii) Finally, we discovered that optimizing truncated metrics introduces unfair treatment of equally relevant documents. In particular, when two or more documents have the same relevance, those ranked lower in the list receive weaker upward gradient signals than those in higher positions. This puts lower-ranked yet equally relevant documents at a disadvantage, since they require a stronger push to improve their rank.

These findings reveal that the widely used LambdaMART algorithm and its derivatives can exhibit unfair behavior, especially in scenarios where fairness across equally relevant items is essential.
4.1.1 GRADIENT INCOHERENCY AND UNFAIR DOCUMENT COMPARISON
Gradient-based learning algorithms, such as artificial neural networks or gradient-boosted decision trees, run iterative updates to build a ranker that minimizes a given cost function $C$. For instance, gradient-boosted decision trees iteratively learn a new tree that approximates $\partial C / \partial s_i$ for each document $d_i$ in the training set $D$ and its score $s_i$. Unfortunately, most IR metrics are rank-based: they depend on the ranking $\pi$ rather than on $s_i$. This makes the cost function either flat or non-differentiable. Note that $\pi$ is the ranking over the documents $d_i \in D$ sorted in decreasing order of the scores $s_i$ predicted by the ranker, and $\pi[i]$ denotes the position of document $d_i$ in the ranking.

LambdaRank's cost function defined by [4] is one of the most relevant approaches used to tackle this problem, and it stems from the RankNet cost proposed by [5], which is enhanced by considering the impact on the IR metric. The gradient is computed on the basis of pair-wise lambdas $\lambda_{ij}$ as defined in Equation 1, where $P$ is the set of ordered document pairs $(i, j)$ such that $y_i > y_j$, i.e., $P = \{(i, j) \mid d_i, d_j \in D \wedge y_i > y_j\}$. The value of $\lambda_{ij}$ estimates the change on the cost function $C$ when the difference between the two scores $s_i$ and $s_j$ increases or decreases.

Table 1: Detailed computation of LambdaMART gradients.

| $d_i$ | $\pi[i]$ | $y_i$ | $s_i$ | $\lambda_i$ |
|-------|----------|-------|-------|-------------|
| $d_1$ | 1 | 4 | 0.02 | $\lambda_1 = \lambda_{12} + \lambda_{13} \approx 0.176 + 0.221 \approx 0.397$ |
| $d_2$ | 2 | 0 | 0.01 | $\lambda_2 = -\lambda_{12} - \lambda_{32} \approx -0.176 - 0.004 \approx -0.180$ |
| $d_3$ | 3 | 1 | 0.00 | $\lambda_3 = -\lambda_{13} + \lambda_{32} \approx -0.221 + 0.004 \approx -0.217$ |

To manage user behavior and increase training efficiency, real-world applications of information retrieval systems mostly try to optimize the effectiveness only of the first $k$ results. IR metrics naturally provide a truncated version, i.e., NDCG@$k$ is computed by considering only the contribution of the top-$k$ ranked documents.

By training the model to optimize a truncated metric $Z$ for a certain truncation level $\tau$, pairs of documents ranked beyond $\tau$ are not considered since the corresponding contribution to the metric is equal to 0. Thus, in order to reduce the training time, the number of document pairs in $P$ is limited while computing the gradients $\lambda_i$ in Equation 1 by replacing the set $P$ with $I_\tau = \{(i, j) \mid d_i, d_j \in D \wedge y_i > y_j \wedge \min(\pi[i], \pi[j]) \le \tau\}$.

It is important to note that, although closely related, the truncation level $\tau$ is different from the metric cutoff $k$. The former affects the number of document pairs to process, and the latter affects the evaluation of the metric. Moreover, they may not be equal, i.e., $\tau$ may be slightly larger than $k$ to process more pairs during the training phase.
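
The difference between the full pair set $P$ and the truncated set $I_\tau$ can be made concrete with a small Python sketch. The function name `preference_pairs` is hypothetical; documents are identified by 0-based list indices, while positions are 1-based as in the text.

```python
def preference_pairs(labels, positions, tau=None):
    """Return pairs (i, j) with y_i > y_j.

    With tau=None this is the full set P; otherwise only pairs with
    min(position_i, position_j) <= tau are kept, i.e. the set I_tau.
    """
    pairs = []
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] > labels[j]:
                if tau is None or min(positions[i], positions[j]) <= tau:
                    pairs.append((i, j))
    return pairs


labels = [4, 0, 1, 2]      # relevance labels y_i
positions = [1, 2, 3, 4]   # current rank of each document
print(preference_pairs(labels, positions))          # full set P
print(preference_pairs(labels, positions, tau=1))   # truncated set I_1
```

In this toy query the pair formed by the third and fourth documents is dropped as soon as $\tau < 3$, even though their labels differ, which is the kind of omission that the truncation introduces.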
Table 1 shows an example of LambdaMART gradients when maximizing NDCG. The query has only three documents, listed with their ranks $\pi[i]$ and scores $s_i$ predicted by the model, and their relevance labels $y_i$. The top-ranked document has relevance equal to 4 and is correctly pushed up by the gradient $\lambda_1$. Interestingly enough, the second and third documents are misranked, with labels 0 and 1 respectively. The LambdaMART gradient is negative for both documents, but the document with the larger label is pushed down with greater strength. We may conclude that such gradients are not going to improve the ranking but rather increase the gap between the two misranked documents. We call this phenomenon gradient incoherency.

To explain in detail the reason for such behavior, in Table 1 we report the computation of the document gradients $\lambda_i$ as a function of the pair-wise $\lambda_{ij}$ according to Equation 1 in the case of the NDCG metric. Document $d_1$ has a positive gradient $\lambda_1$ as it is ranked higher than documents with smaller relevance labels. Document $d_2$ is the least relevant and receives a negative gradient contribution from both the other documents. Unexpectedly, document $d_3$ receives the strongest downward push even though it has a higher label than $d_2$. The reason is that swapping document $d_1$ with $d_3$ has a larger impact on the NDCG than swapping $d_1$ with $d_2$, resulting in $\lambda_{13} > \lambda_{12}$. LambdaMART prefers avoiding the risk of moving $d_1$ to the third position rather than pushing $d_3$ up to the second place. Indeed, this comes from the discount factor of the NDCG metric, which demotes documents' contributions in the lower ranks.
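
The numbers in Table 1 can be checked with a short Python sketch. The function name `lambdas_ndcg` is hypothetical. Two assumptions are made so that the output matches the table: $\sigma = 1$, and the sign convention of the table, where a positive value pushes a document up (the opposite sign of Equation 2 as written).

```python
import math


def lambdas_ndcg(labels, scores, sigma=1.0):
    """Per-document lambdas for untruncated NDCG; positive means pushed up."""
    n = len(labels)
    # 1-based position of each document after sorting by descending score.
    order = sorted(range(n), key=lambda i: -scores[i])
    pos = [0] * n
    for r, i in enumerate(order, start=1):
        pos[i] = r
    gain = [2 ** y - 1 for y in labels]
    disc = [1.0 / math.log2(p + 1) for p in pos]
    idcg = sum(g / math.log2(r + 1)
               for r, g in enumerate(sorted(gain, reverse=True), start=1))
    lam = [0.0] * n
    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:  # (i, j) belongs to P
                delta = abs((gain[i] - gain[j]) * (disc[i] - disc[j])) / idcg
                lam_ij = sigma / (1.0 + math.exp(sigma * (scores[i] - scores[j]))) * delta
                lam[i] += lam_ij   # the more relevant document is pushed up
                lam[j] -= lam_ij   # the less relevant one is pushed down
    return lam


lam = lambdas_ndcg(labels=[4, 0, 1], scores=[0.02, 0.01, 0.00])
print([round(x, 4) for x in lam])
```

The three values agree with Table 1 up to rounding of the intermediate terms, and the third one is more negative than the second: $d_3$, despite its higher label, receives the stronger downward push.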
[...] minimal number of inter-bin swaps is applied between equally relevant documents belonging to different groups to balance protected and unprotected items across ranking bins and approximate $|\mathcal{G}^+|/n$. Since these documents have equal relevance, such swaps represent ties that do not affect NDCG but serve to reduce rND by improving group balance. The second stage proceeds in the same manner as used for generating $\pi_{\mathrm{rND+}}$.
4.2.2 RESULTS
Figure 3 illustrates the trade-off between effectiveness and fairness, where fairness is measured as (1 − rND)% (higher values indicate better fairness). Specifically, we present NDCG and rND values evaluated at cutoff $k = 15$ on the test sets of each dataset. With the exception of the MSLR-30K dataset, where ΔrND performs best, rND+ emerges as the overall top-performing variant, achieving higher fairness with only a slight reduction in effectiveness compared to LambdaMART and the other LambdaFair variants. Compared to the fair baselines, LambdaFair consistently achieved higher effectiveness with a slight decrease in fairness.
[Figure 3 (image not reproduced): four panels, one per dataset (MSLR-30K, Statlog (Age), Statlog (Sex), HMDA-CT), each plotting Fairness against Effectiveness for LambdaMART, NDCG+, ΔrND, rND+, PL-Rank-3, and Group-Fair-PL.]
Figure 3: Effectiveness and fairness trade-off. Fairness = (1 − rND)%. Results of models trained and evaluated with cutoff $k = 15$.
5 CONCLUSIONS
During this research grant, I developed two novel algorithms aimed at creating fair and effective search engines: Lambda-eX [16, 19] and LambdaFair [20, 21]. Both algorithms employ an in-processing approach, integrating fairness constraints directly into the learning-to-rank optimization process. These strategies allow the model to simultaneously maximize ranking effectiveness while actively mitigating discriminatory biases that may arise among equally relevant items or groups requiring protection.

More specifically, these algorithms address two critical aspects of fairness in ranking systems. First, they ensure individual fairness by treating items with comparable relevance scores equitably during the training phase, thus reducing unwarranted discrimination between similarly qualified entities. Second, they enforce group fairness by promoting statistically fair exposure of protected groups across different ranking positions. This dual focus supports the development of ranking models that balance user relevance needs with fairness objectives, avoiding unfair underrepresentation of protected or disadvantaged groups.

In practical terms, these advancements have significant implications for domains such as tourism, where search results influence user decisions and economic opportunities. For instance, these algorithms can ensure that tourism-related entities, such as accommodations, attractions, or activities, that are equally relevant to a user's query receive equal consideration, preventing biases that might unfairly favor popular or centrally located options. Additionally, protected entities, such as activities located in less frequented or marginalized areas, are guaranteed proportional exposure in the ranking results. This leads to more equitable visibility and can help support diverse and sustainable tourism development.

Overall, the development and evaluation of Lambda-eX and LambdaFair contribute to advancing fairness-aware learning-to-rank research by demonstrating that fairness constraints can be effectively incorporated into ranking optimization without significantly compromising relevance. Future work will focus on extending these methods to handle multiple protected attributes simultaneously, exploring adaptive fairness constraints based on user preferences, and applying the algorithms to other domains where fairness in ranking is paramount.
REFERENCES
[1] A. Bower, H. Eftekhari, M. Yurochkin, and Y. Sun, "Individually fair rankings," in 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net, 2021. [Online]. Available: https://openreview.net/forum?id=71zCSP_HuBN
[2] S. Bruch, "An alternative cross entropy loss for learning-to-rank," in WWW '21: The Web Conference 2021, Virtual Event / Ljubljana, Slovenia, April 19-23, 2021, J. Leskovec, M. Grobelnik, M. Najork, J. Tang, and L. Zia, Eds. ACM / IW3C2, 2021, pp. 118–126. [Online]. Available: https://doi.org/10.1145/3442381.3449794
[3] C. J. C. Burges, "From ranknet to lambdarank to lambdamart: An overview," 2010.
[4] C. J. C. Burges, R. Ragno, and Q. V. Le, "Learning to rank with nonsmooth cost functions," in Advances in Neural Information Processing Systems 19, Proceedings of the Twentieth Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 4-7, 2006, B. Schölkopf, J. C. Platt, and T. Hofmann, Eds. MIT Press, 2006, pp. 193–200.
[5] C. J. C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. N. Hullender, "Learning to rank using gradient descent," in Machine Learning, Proceedings of the Twenty-Second International Conference (ICML 2005), Bonn, Germany, August 7-11, 2005, ser. ACM International Conference Proceeding Series, L. D. Raedt and S. Wrobel, Eds., vol. 119. ACM, 2005, pp. 89–96.
[6] O. Chapelle and Y. Chang, "Yahoo! learning to rank challenge overview," in Proceedings of the Yahoo! Learning to Rank Challenge, held at ICML 2010, Haifa, Israel, June 25, 2010, ser. JMLR Proceedings, O. Chapelle, Y. Chang, and T. Liu, Eds., vol. 14. JMLR.org, 2011, pp. 1–24. [Online]. Available: http://proceedings.mlr.press/v14/chapelle11a.html
[7] M. Corporation, LightGBM Release 3.3.3.99, 2023.
[8] F. F. I. E. Council, "HMDA Data Publication," 2017, released due to the Home Mortgage Disclosure Act. [Online]. Available: https://www.consumerfinance.gov/data-research/hmda/historic-data/
[9] P. Donmez, K. M. Svore, and C. J. C. Burges, "On the local optimality of lambdarank," in Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2009, Boston, MA, USA, July 19-23, 2009, J. Allan, J. A. Aslam, M. Sanderson, C. Zhai, and J. Zobel, Eds. ACM, 2009, pp. 460–467. [Online]. Available: https://doi.org/10.1145/1571941.1572021
[10] R. Fisher, The design of experiments. Edinburgh: Oliver and Boyd, 1935.
[11] S. Gorantla, E. Bhansali, A. Deshpande, and A. Louis, "Optimizing learning-to-rank models for ex-post fair relevance," in Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024, Washington DC, USA, July 14-18, 2024, G. H. Yang, H. Wang, S. Han, C. Hauff, G. Zuccon, and Y. Zhang, Eds. ACM, 2024, pp. 1525–1534. [Online]. Available: https://doi.org/10.1145/3626772.3657751
[12] H. Hofmann, "Statlog (German Credit Data)," UCI Machine Learning Repository, 1994, DOI: https://doi.org/10.24432/C5NC77.
[13] K. Järvelin and J. Kekäläinen, "Cumulated gain-based evaluation of IR techniques," ACM Trans. Inf. Syst., vol. 20, no. 4, pp. 422–446, 2002. [Online]. Available: http://doi.acm.org/10.1145/582415.582418
[14] J. Kotary, F. Fioretto, P. V. Hentenryck, and Z. Zhu, "End-to-end learning for fair ranking systems," in WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25 - 29, 2022, F. Laforest, R. Troncy, E. Simperl, D. Agarwal, A. Gionis, I. Herman, and L. Médini, Eds. ACM, 2022, pp. 3520–3530. [Online]. Available: https://doi.org/10.1145/3485447.3512247
[15] P. Lahoti, K. P. Gummadi, and G. Weikum, "ifair: Learning individually fair data representations for algorithmic decision making," in 35th IEEE International Conference on Data Engineering, ICDE 2019, Macao, China, April 8-11, 2019. IEEE, 2019, pp. 1334–1345. [Online]. Available: https://doi.org/10.1109/ICDE.2019.00121
[16] C. Lucchese, F. Marcuzzi, and S. Orlando, "Does lambdamart do what you expect?" in Proceedings of the 13th Italian Information Retrieval Workshop (IIR 2023), Pisa, Italy, June 8-9, 2023, ser. CEUR Workshop Proceedings, F. M. Nardini, N. Tonellotto, G. Faggioli, and A. Ferrara, Eds., vol. 3448. CEUR-WS.org, 2023, p. 72. [Online]. Available: https://ceur-ws.org/Vol-3448/paper-16.pdf
[17] C. Lucchese, F. M. Nardini, R. Perego, S. Orlando, and S. Trani, "Selective gradient boosting for effective learning to rank," in The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR 2018, Ann Arbor, MI, USA, July 08-12, 2018, K. Collins-Thompson, Q. Mei, B. D. Davison, Y. Liu, and E. Yilmaz, Eds. ACM, 2018, pp. 155–164. [Online]. Available: https://doi.org/10.1145/3209978.3210048
[18] R. D. Luce, Individual Choice Behavior: A Theoretical analysis. New York, NY, USA: Wiley, 1959.
[19] F. Marcuzzi, C. Lucchese, and S. Orlando, "Lambdarank gradients are incoherent," in Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023, Birmingham, United Kingdom, October 21-25, 2023, I. Frommholz, F. Hopfgartner, M. Lee, M. Oakes, M. Lalmas, M. Zhang, and R. L. T. Santos, Eds. ACM, 2023, pp. 1777–1786. [Online]. Available: https://doi.org/10.1145/3583780.3614948
[20] F. Marcuzzi, C. Lucchese, and S. Orlando, "Lambdafair: A fair and effective lambdamart," in Proceedings of the 14th Italian Information Retrieval Workshop, IIR 2024, Udine, Italy, September 5-6, 2024, E. Maddalena, S. Mizzaro, K. Roitero, and M. Viviani, Eds., 2024.
[21] F. Marcuzzi, C. Lucchese, and S. Orlando, "Lambdafair for fair and effective ranking," in Advances in Information Retrieval - 47th European Conference on Information Retrieval, ECIR 2025, Lucca, Italy, April 6-10, 2025, Proceedings, Part IV, ser. Lecture Notes in Computer Science, C. Hauff, C. Macdonald, D. Jannach, G. Kazai, F. M. Nardini, F. Pinelli, F. Silvestri, and N. Tonellotto, Eds., vol. 15575. Springer, 2025, pp. 197–213. [Online]. Available: https://doi.org/10.1007/978-3-031-88717-8_15
[22] H. Oosterhuis, "Computationally efficient optimization of plackett-luce ranking models for relevance and fairness," in SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11-15, 2021, F. Diaz, C. Shah, T. Suel, P. Castells, R. Jones, and T. Sakai, Eds. ACM, 2021, pp. 1023–1032. [Online]. Available: https://doi.org/10.1145/3404835.3462830
[23] H. Oosterhuis, "Learning-to-rank at the speed of sampling: Plackett-luce gradient estimation with minimal computational complexity," in SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11 - 15, 2022, E. Amigó, P. Castells, J. Gonzalo, B. Carterette, J. S. Culpepper, and G. Kazai, Eds. ACM, 2022, pp. 2266–2271. [Online]. Available: https://doi.org/10.1145/3477495.3531842
[24] R. L. Plackett, "The analysis of permutations," Journal of the Royal Statistical Society. Series C (Applied Statistics), vol. 24, no. 2, pp. 193–202, 1975. [Online]. Available: http://www.jstor.org/stable/2346567
[25] T. Qin and T. Liu, "Introducing LETOR 4.0 datasets," CoRR, vol. abs/1306.2597, 2013. [Online]. Available: http://arxiv.org/abs/1306.2597
[26] A. Singh and T. Joachims, "Policy learning for fairness in ranking," in Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, H. M. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. B. Fox, and R. Garnett, Eds., 2019, pp. 5427–5437. [Online]. Available: https://dl.acm.org/doi/10.5555/3454287.3454774
[27] A. Vardasbi, F. Sarvi, and M. de Rijke, "Probabilistic permutation graph search: Black-box optimization for fairness in ranking," in SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11 - 15, 2022, E. Amigó, P. Castells, J. Gonzalo, B. Carterette, J. S. Culpepper, and G. Kazai, Eds. ACM, 2022, pp. 715–725. [Online]. Available: https://doi.org/10.1145/3477495.3532045
[28] X. Wang, C. Li, N. Golbandi, M. Bendersky, and M. Najork, "The lambdaloss framework for ranking metric optimization," in Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Torino, Italy, October 22-26, 2018, A. Cuzzocrea, J. Allan, N. W. Paton, D. Srivastava, R. Agrawal, A. Z. Broder, M. J. Zaki, K. S. Candan, A. Labrinidis, A. Schuster, and H. Wang, Eds. ACM, 2018, pp. 1313–1322.
[29] H. Yadav, Z. Du, and T. Joachims, "Policy-gradient training of fair and unbiased ranking functions," in SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11-15, 2021, F. Diaz, C. Shah, T. Suel, P. Castells, R. Jones, and T. Sakai, Eds. ACM, 2021, pp. 1044–1053. [Online]. Available: https://doi.org/10.1145/3404835.3462953
[30] K. Yang and J. Stoyanovich, "Measuring fairness in ranked outputs," in Proceedings of the 29th International Conference on Scientific and Statistical Database Management, Chicago, IL, USA, June 27-29, 2017. ACM, 2017, pp. 22:1–22:6. [Online]. Available: https://doi.org/10.1145/3085504.3085526
[31] M. Zehlike, F. Bonchi, C. Castillo, S. Hajian, M. Megahed, and R. Baeza-Yates, "Fa*ir: A fair top-k ranking algorithm," in Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM 2017, Singapore, November 06 - 10, 2017, E. Lim, M. Winslett, M. Sanderson, A. W. Fu, J. Sun, J. S. Culpepper, E. Lo, J. C. Ho, D. Donato, R. Agrawal, Y. Zheng, C. Castillo, A. Sun, V. S. Tseng, and C. Li, Eds. ACM, 2017, pp. 1569–1578. [Online]. Available: https://doi.org/10.1145/3132847.3132938
[32] M. Zehlike and C. Castillo, "Reducing disparate exposure in ranking: A learning to rank approach," in WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, Y. Huang, I. King, T. Liu, and M. van Steen, Eds. ACM / IW3C2, 2020, pp. 2849–2855. [Online]. Available: https://doi.org/10.1145/3366424.3380048
[33] M. Zehlike, P. Hacker, and E. Wiedemann, "Matching code and law: achieving algorithmic fairness with optimal transport," Data Min. Knowl. Discov., vol. 34, no. 1, pp. 163–200, 2020. [Online]. Available: https://doi.org/10.1007/s10618-019-00658-8
[34] M. Zehlike, K. Yang, and J. Stoyanovich, "Fairness in ranking, part I: score-based ranking," ACM Comput. Surv., vol. 55, no. 6, pp. 118:1–118:36, 2023. [Online]. Available: https://doi.org/10.1145/3533379
[35] M. Zehlike, K. Yang, and J. Stoyanovich, "Fairness in ranking, part II: learning-to-rank and recommender systems," ACM Comput. Surv., vol. 55, no. 6, pp. 117:1–117:41, 2023. [Online]. Available: https://doi.org/10.1145/3533380
[36] R. S. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork, "Learning fair representations," in Proceedings of the 30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, 16-21 June 2013, ser. JMLR Workshop and Conference Proceedings, vol. 28. JMLR.org, 2013, pp. 325–333. [Online]. Available: http://proceedings.mlr.press/v28/zemel13.html