Fairbeat: Assessing and Mitigating Bias with the Composite Balance Score

Author: Lequeu, Pierre-Antoine; LAGRAA, Sofiane; Robin, Geoffroy; Ouédraogo, Moussa
Publisher: Zenodo
DOI: 10.1007/978-3-032-06129-4_34
Source: https://zenodo.org/records/17660528/files/fairbeat_pkdd_2025_demos_1698.pdf
Pierre-Antoine Lequeu1, Sofiane Lagraa2 (✉), Geoffroy Robin2, and Moussa Ouedraogo2
1 Sorbonne University, Paris, France
[email protected]
2 Fujitsu Technology Solutions S.A., Capellen, Luxembourg
[email protected]
Abstract. Fairbeat, a novel Fairness Assessment tool for Resampling-based Bias Elimination and Algorithm Training, addresses the critical challenge of fairness in machine learning. Machine learning models often exhibit biases stemming from imbalances in training data concerning protected attributes, leading to discriminatory outcomes. Fairbeat leverages the Composite Balance Score (CBS), a comprehensive metric that evaluates the balance of the dataset by integrating the imbalance of attributes, the imbalance of labels, and the association of attributes and labels into a single normalized score. This tool facilitates proactive bias assessment prior to model training, supports multi-class attributes, and provides a user-friendly environment for exploring and visualizing the impact of various bias mitigation techniques, including resampling methods, thereby promoting the development of more equitable and ethically sound AI systems. The demonstration video can be found at https://youtu.be/9aHK Zg XKg.
Keywords: Composite Balance Score · Fairness · Bias Mitigation · Machine Learning.
1 Introduction
Ensuring fairness in machine learning models is crucial for ethical compliance and societal impact. Models often exhibit biases due to imbalances in training data, particularly concerning protected attributes like gender, age, and ethnicity. Addressing these biases is essential to prevent discrimination and achieve equitable outcomes.
Problem Statement. Machine learning models are increasingly used in human-centric decision-making areas such as judiciary systems, human resources, credit assessment, and healthcare. However, these models often show unfair behavior towards social groups based on protected attributes, leading to discrimination and ethical concerns. The challenge lies in predicting the fairness of these models by analyzing their training data and implementing bias mitigation strategies without compromising performance.
Existing Works. Previous studies have explored various bias mitigation techniques, including pre-processing [3], in-processing [4], and post-processing [5] methods. These approaches often focus on single, binary protected attributes, neglecting the complexities of multi-class attributes. While some methods have shown promise in improving fairness, they frequently lead to a reduction in utility, known as the fairness-utility trade-off [1]. Moreover, handling missing data and proxy attributes remains a challenge, influencing both fairness and model performance [2]. In addition, FairnessEval [9] is a Python framework for evaluating and comparing fairness in ML models, streamlining data preparation, evaluation, and result presentation to aid in model selection and validation.
Novelty and Contribution. Our paper introduces Fairbeat, a Fairness Assessment Interface for Resampling-based Bias Elimination and Algorithm Training. It is based on the Composite Balance Score (CBS), a novel metric designed to evaluate the balance of datasets with respect to protected attributes and predict model fairness by analyzing the training data. Fairbeat informs the decision of whether or not to apply bias mitigation. It offers several key advantages, including a comprehensive balance measure that combines attribute imbalance, label imbalance, and attribute-label association into a single, normalized score ranging from 0 to 1. This easy-to-interpret metric assesses overall dataset balance and predicts model fairness by focusing on dataset balance as an indicator of potential bias. Furthermore, CBS supports multi-class attributes, extending beyond binary categories, and enables proactive bias assessment and mitigation, allowing for bias evaluation before model training begins. Finally, Fairbeat has a friendly user interface for bias assessment and mitigation. The video of the demonstration is available at https://youtu.be/9aHK Zg XKg.
2 Fairbeat: Assessing and Mitigating Bias with the Composite Balance Score
[Figure 1: workflow diagram. Initial Data Balance Evaluation (Balance Index, RMSPMI, RMSDIR, Composite Balance Score) helps the mitigation decision. Balancing Strategy: Balance Output, Balance Attribute, Balance Output for Attribute, Complete Balance; Single Attribute or Multiple Attributes. Resampling Method: RandomOverSample, RandomUnderSample, SMOTE-NC. Rebalanced Data Balance Evaluation (Balance Index, RMSPMI, RMSDIR, Regularized Composite Balance Score, Regularization) is compared with the initial evaluation. ML Model Training uses the train data. Model Evaluation uses the test data: Fairness (Equalized Odds Ratio, Disparate Impact Ratio, Fall Out Ratio, Miss Rate Ratio, ...) and Utility (Accuracy, AUC-ROC, AUPRC, F1 Score, ...).]
Fig. 1: Fairbeat: balancing strategies and resampling methods workflow.
Fairbeat is a tool designed to simplify the evaluation and mitigation of bias in machine learning datasets. It provides an intuitive tool for users to assess dataset fairness, explore bias mitigation techniques, and visualize their impact.
The tool democratizes fairness-aware machine learning, enabling practitioners to build equitable AI systems. Figure 1 outlines the workflow for evaluating and mitigating bias using the Composite Balance Score (CBS) and related metrics. If bias mitigation is needed, a balancing strategy and a resampling method are selected. The balance of the rebalanced data is then evaluated, and a machine learning model is trained and tested for fairness and utility. The CBS value helps determine the success of the balancing, comparing results to the initial data balance.
2.1 Assessing dataset balance using the Composite Balance Score (CBS)
We introduce the Composite Balance Score, a new metric for evaluating the balance of protected attributes. It uses three measures: the Balance Index, RMSDIR, and RMSPMI.
Balance Index (Bal) is introduced as a novel metric to quantify the balance of classes within a protected attribute, addressing a critical aspect of dataset fairness. $\mathrm{Bal}(A) = 1 - \frac{\mathrm{imb}(A)}{\sqrt{n-1}/n}$, where $\mathrm{imb}(A)$ is the imbalance index, $n$ is the number of classes, and $A$ is the protected attribute. Unlike prior works [6] that often rely on arithmetic means to assess imbalance, the Balance Index employs a quadratic mean of the distribution deviation, providing a more sensitive measure of variations in class representation. Furthermore, it is normalized to a [0, 1] scale, offering intuitive interpretability where 1 signifies perfect balance and 0 indicates extreme imbalance. The Balance Index offers a unique combination of sensitivity and interpretability, making it a valuable tool for evaluating and addressing attribute imbalances in fairness-aware machine learning.
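As an illustration of the Balance Index, the following minimal Python sketch computes it for a list of attribute values. The paper does not spell out the imbalance index here, so the sketch assumes that imb(A) is the root-mean-square deviation of the class proportions from the uniform share 1/n; under that assumption its maximum is √(n−1)/n, which matches the normalizer in the formula. The function name `balance_index` and the optional `classes` argument are hypothetical.

```python
from collections import Counter
from math import sqrt


def balance_index(values, classes=None):
    """Bal(A) = 1 - imb(A) / (sqrt(n - 1) / n) for a list of attribute values.

    `classes` optionally lists every possible class, so that classes with
    zero observations still count toward n.
    """
    counts = Counter(values)
    labels = list(classes) if classes is not None else list(counts)
    n = len(labels)
    total = len(values)
    if n < 2 or total == 0:
        raise ValueError("need at least two classes and one observation")
    # ASSUMPTION: imb(A) is the quadratic mean of (proportion - 1/n).
    imb = sqrt(sum((counts[c] / total - 1 / n) ** 2 for c in labels) / n)
    return 1 - imb / (sqrt(n - 1) / n)


if __name__ == "__main__":
    print(balance_index(["a", "b"] * 5))         # evenly split attribute
    print(balance_index(["a"] * 9 + ["b"]))      # 90/10 split
```

Under this assumption an evenly split attribute scores 1, a 90/10 binary split scores about 0.2, and an attribute where a single class holds every row scores 0.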
Root Mean Squared Disparate Impact Ratio (RMSDIR): To quantify label imbalance across protected attribute classes, this paper introduces the Root Mean Squared Disparate Impact Ratio (RMSDIR): $\mathrm{RMSDIR}(A) = \sqrt{\frac{\sum_{c \neq c_{\mathrm{priv}}} \mathrm{DIR}_{\mathrm{nor}}(c)^2}{|\{c \neq c_{\mathrm{priv}}\}|}}$, where $\mathrm{DIR}_{\mathrm{nor}}(c)$ is the normalized disparate impact ratio of class $c$ proposed in [7], [8]. Building upon the concept of Disparate Impact Ratio (DIR): $\mathrm{DIR}(c_i) = \frac{P(Y=1 \mid A=c_i)}{P(Y=1 \mid A=c_{\mathrm{priv}})}$ with $c_i \neq c_{\mathrm{priv}}$, commonly used to compare favorable outcome rates between groups, RMSDIR offers a crucial normalization step. Unlike traditional DIR, which lacks an upper bound and can be challenging to interpret [7], RMSDIR leverages the normalized disparate impact introduced by Badran et al. to ensure a [0, 1] scale. This normalization allows for a more intuitive understanding of label imbalance, where 1 signifies perfect balance and 0 indicates significant disparity, regardless of whether it favors the privileged or unprivileged class. By aggregating these normalized values using a root mean square, RMSDIR provides a single, robust measure of label imbalance for the entire protected attribute, offering a more comprehensive assessment than individual pairwise comparisons.
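The aggregation step can be sketched in Python as follows. The exact normalization of the disparate impact ratio is defined in [7] and is not reproduced in this paper, so the sketch assumes a symmetric form, min(DIR, 1/DIR), which lies in [0, 1] and treats disparities in either direction alike. The function name `rmsdir` and the `(attribute_class, label)` record layout are hypothetical.

```python
from math import sqrt


def rmsdir(records, privileged):
    """Root mean square of normalized disparate impact ratios.

    `records` is an iterable of (attribute_class, label) pairs with
    label in {0, 1}; `privileged` names the reference class.
    """
    totals, positives = {}, {}
    for cls, y in records:
        totals[cls] = totals.get(cls, 0) + 1
        positives[cls] = positives.get(cls, 0) + (1 if y == 1 else 0)
    rate = {c: positives[c] / totals[c] for c in totals}
    others = [c for c in rate if c != privileged]
    if privileged not in rate or not others:
        raise ValueError("need the privileged class and one other class")

    def dir_nor(c):
        a, b = rate[c], rate[privileged]
        if a == b:  # identical favorable rates, including both zero
            return 1.0
        # ASSUMPTION: min(DIR, 1/DIR), written as min/max of the two rates.
        return min(a, b) / max(a, b)

    return sqrt(sum(dir_nor(c) ** 2 for c in others) / len(others))


if __name__ == "__main__":
    data = [("m", 1), ("m", 1), ("m", 0), ("m", 0),
            ("f", 1), ("f", 0), ("f", 0), ("f", 0)]
    print(rmsdir(data, privileged="m"))
```

In the toy data the favorable rate is 0.5 for the privileged class and 0.25 for the other class, so the single normalized ratio, and hence the score, is 0.5 under the assumed normalization.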
Root Mean Squared Pointwise Mutual Information (RMSPMI) is introduced as a novel measure to capture the information shared between classes of
a protected attribute and the target variable's labels, offering a unique perspective on dataset bias. $\mathrm{RMSPMI}(A) = \sqrt{\frac{\sum_{i=1}^{n} \sum_{y=0}^{1} \mathrm{PMI}_{\mathrm{nor}}(c_i, y)^2}{2n}}$, where $\mathrm{PMI}_{\mathrm{nor}}(c_i, y)$ is the normalized pointwise mutual information of class $c_i$ and label $y$. Unlike traditional fairness metrics that focus solely on outcome disparities [7], [8], RMSPMI leverages the normalized Pointwise Mutual Information (PMI): $\mathrm{PMI}(c_i, y) = \log \frac{P(A=c_i, Y=y)}{P(A=c_i)\,P(Y=y)}$ to quantify the degree of association between each class and each label. While in [8], the authors used PMI to measure unwarranted associations, RMSPMI aggregates these individual PMI values using a root mean square, providing a single, comprehensive measure of the overall dependency between the protected attribute and the target variable. This approach allows for a more nuanced understanding of how a protected attribute might be influencing predictions beyond simple outcome disparities, capturing subtle biases that could be missed by other metrics. By focusing on information sharing, RMSPMI complements existing fairness measures and provides valuable insights for bias mitigation strategies.
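A minimal Python sketch of this aggregation is given below. The paper does not state which normalization of PMI it uses, so the sketch assumes the common form PMI / (−log P(A=c, Y=y)), which lies in [−1, 1], with the limiting value −1 for a class-label pair that never occurs. The function name `rmspmi` and the `(attribute_class, label)` record layout are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def rmspmi(records):
    """Root mean square of normalized PMI over all (class, label) pairs.

    `records` is an iterable of (attribute_class, label) pairs with
    label in {0, 1}.
    """
    records = list(records)
    total = len(records)
    joint = Counter(records)
    n_a = Counter(c for c, _ in records)
    n_y = Counter(y for _, y in records)
    if len(n_a) < 2 or n_y[0] == 0 or n_y[1] == 0:
        raise ValueError("need at least two classes and both labels")
    sq_sum = 0.0
    for c in n_a:
        for y in (0, 1):
            p_cy = joint[(c, y)] / total
            if p_cy == 0:
                npmi = -1.0  # limiting value when the pair never occurs
            else:
                pmi = log(p_cy / ((n_a[c] / total) * (n_y[y] / total)))
                # ASSUMPTION: normalize PMI by -log of the joint probability.
                npmi = pmi / (-log(p_cy))
            sq_sum += npmi ** 2
    return sqrt(sq_sum / (2 * len(n_a)))


if __name__ == "__main__":
    independent = [("a", 0), ("a", 1), ("b", 0), ("b", 1)]
    dependent = [("a", 1), ("a", 1), ("b", 0), ("b", 0)]
    print(rmspmi(independent), rmspmi(dependent))
```

With this normalization, an attribute that is statistically independent of the label scores 0, while an attribute that fully determines the label scores 1, which is why the score enters the composite as one minus its value.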
Composite Balance Score (CBS) is a new metric designed to evaluate the balance of a dataset concerning a protected attribute, as shown in Figure 2a. CBS is calculated as: $\mathrm{CBS}(A) = \frac{\mathrm{Bal}(A) + \mathrm{RMSDIR}(A) + (1 - \mathrm{RMSPMI}(A))}{3}$. CBS captures attribute and label imbalances and the statistical dependence between the attribute and the target variable. Normalized to a [0, 1] scale, CBS helps assess dataset fairness, guiding bias mitigation strategies and tracking their effectiveness. By calculating CBS for each protected attribute, users can identify attributes with scores below a threshold (e.g., 0.80) that may need bias mitigation. CBS guides the application of bias mitigation techniques, such as resampling methods, to improve dataset balance and model fairness. Integrating CBS into workflows enables organizations to proactively address biases, resulting in fairer and more equitable machine learning models.
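The combination and thresholding step described above can be sketched in a few lines of Python. The function names, the attribute names, and the component values in the usage example are hypothetical; only the averaging formula and the example threshold of 0.80 come from the text.

```python
def composite_balance_score(bal, rmsdir, rmspmi):
    """CBS(A) = (Bal(A) + RMSDIR(A) + (1 - RMSPMI(A))) / 3."""
    for name, value in (("bal", bal), ("rmsdir", rmsdir), ("rmspmi", rmspmi)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return (bal + rmsdir + (1.0 - rmspmi)) / 3.0


def attributes_needing_mitigation(scores, threshold=0.80):
    """Return the attributes whose CBS falls below the threshold."""
    return sorted(attr for attr, cbs in scores.items() if cbs < threshold)


if __name__ == "__main__":
    # Hypothetical component values, for illustration only.
    scores = {
        "gender": composite_balance_score(0.95, 0.90, 0.05),
        "age_group": composite_balance_score(0.20, 0.50, 0.10),
    }
    print(scores)
    print(attributes_needing_mitigation(scores))
```

Because RMSPMI measures dependence rather than balance, it is inverted before averaging, so a perfectly balanced and label-independent attribute reaches a score of 1.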
2.2 Resampling techniques
Resampling techniques are integral tools in data preprocessing after fairness assessment using the CBS score, enabling modification of datasets through the addition or removal of rows for bias mitigation, as shown in Figure 2b. These techniques are used predominantly to rectify imbalanced labels in classification tasks. In the realm of fairness, prior research has investigated resampling methods to equilibrate protected attributes. The strategies for balancing include: no balance, balancing labels, balancing classes, balancing labels across all classes/attributes, and achieving complete balance. Resampling methods to implement these strategies are classified into over-sampling (Random Over-Sampling (ROS), SMOTE-NC) and under-sampling (Random Under-Sampling (RUS)).
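To make one of these strategies concrete, the sketch below applies random over-sampling so that labels are balanced within each class of a protected attribute. It is a simplified standard-library illustration, not Fairbeat's implementation: the function name, the row layout (attribute class first, label second), and the fixed seed are assumptions.

```python
import random
from collections import defaultdict


def oversample_labels_within_classes(rows, seed=0):
    """Duplicate random rows until every label is equally frequent
    inside each protected-attribute class.

    `rows` is a list of tuples whose first item is the attribute class
    and whose second item is the label (hypothetical layout).
    """
    rng = random.Random(seed)
    groups = defaultdict(list)  # (class, label) -> rows in that cell
    for row in rows:
        groups[(row[0], row[1])].append(row)
    labels_by_class = defaultdict(list)
    for cls, label in groups:
        labels_by_class[cls].append(label)

    balanced = []
    for cls, labels in labels_by_class.items():
        target = max(len(groups[(cls, y)]) for y in labels)
        for y in labels:
            cell = groups[(cls, y)]
            balanced.extend(cell)
            # Sample with replacement to fill the gap to the largest cell.
            balanced.extend(rng.choices(cell, k=target - len(cell)))
    # A class with a single observed label cannot be balanced by
    # duplication and is returned unchanged.
    return balanced


if __name__ == "__main__":
    data = [("f", 1)] + [("f", 0)] * 3 + [("m", 1)] * 2 + [("m", 0)] * 2
    print(len(oversample_labels_within_classes(data)))
```

Random under-sampling would instead shrink each cell to the smallest one, and SMOTE-NC would synthesize new rows rather than duplicate existing ones. Either choice changes the dataset size and therefore the scores computed on the rebalanced data.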
3 Conclusion
Fairbeat underscores the pivotal importance of dataset balance in reducing bias within machine learning models, particularly in binary classification scenarios involving multi-class protected attributes. The introduced Composite Balance Score (CBS) serves as a robust predictor of model fairness. Implementing balancing strategies, notably the equalization of labels within classes, markedly enhances fairness while incurring minimal utility loss. Although the efficacy of CBS wanes with intersectional attributes, maintaining balanced datasets is essential for fostering fairer and more equitable machine learning outcomes.
(a) Fairness evaluation. (b) Bias mitigation.
Fig. 2: Fairbeat dashboard.
4 Acknowledgments
This work was supported by the European Union's HE RAIDO project under the grant agreement number 101135800.
References
1. Bertsimas, D., Farias, V. F., and Trichakis, N. (2012). On the efficiency-fairness trade-off. Management Science, 58:2234–2250.
2. Caton, S., Malisetty, S., and Haas, C. (2022). Impact of imputation strategies on fairness in machine learning. J. Artif. Intell. Res., 74:1011–1035.
3. Lahoti, P., Beutel, A., Chen, J., Lee, K., Prost, F., Thain, N., Wang, X., and Chi, E. (2020). Fairness without demographics through adversarially reweighted learning. Advances in neural information processing systems, 33:728–740.
4. Wadsworth, C., Vera, F., and Piech, C. (2018). Achieving fairness through adversarial learning: an application to recidivism prediction. CoRR, abs/1807.00199.
5. Mishler, A., Kennedy, E. H., and Chouldechova, A. (2021). Fairness in risk assessment instruments: Post-processing to achieve counterfactual equalized odds. In ACM FAccT, pages 386–400.
6. Gong, Y., Liu, G., Xue, Y., Li, R., and Meng, L. (2023). A survey on dataset quality in machine learning. Information and Software Technology, 162:107268.
7. Badran, et al. (2023). Can ensembling preprocessing algorithms lead to better machine learning fairness? Computer, 56:71–79.
8. Tramèr, F., Atlidakis, V., Geambasu, R., Hsu, D. J., Hubaux, J.-P., Humbert, M., Juels, A., and Lin, H. (2015). Discovering unwarranted associations in data-driven applications with the FairTest testing toolkit. CoRR, abs/1510.02377.
9. Baraldi, A., Brucato, M., Dudík, M., Guerra, F., and Interlandi, M. (2025). FairnessEval: a framework for evaluating fairness of machine learning models. In EDBT, pages 123–134. ACM.