
Deep Learning with Discriminative Margin Loss for Cross-Domain Consumer-to-Shop Clothes Retrieval

Author: Alirezazadeh, Pendar; Dornaika, Fadi; Moujahid, Abdelmalik
Publisher: MDPI
Year: 2022
DOI: 10.3390/s22072660
Source: https://addi.ehu.eus/bitstream/10810/56381/1/sensors-22-02660-v2.pdf


Citation: Alirezazadeh, P.; Dornaika, F.; Moujahid, A. Deep Learning with Discriminative Margin Loss for Cross-Domain Consumer-to-Shop Clothes Retrieval. Sensors 2022, 22, 2660. https://doi.org/10.3390/s22072660
Academic Editors: Abdeldjalil Ouahabi, Sébastien Jacques and Amir Benzaoui
Received: 15 February 2022
Accepted: 28 March 2022
Published: 30 March 2022
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Pendar Alirezazadeh 1, Fadi Dornaika 1,2,3,* and Abdelmalik Moujahid 4
1 Department of Informatics, University of the Basque Country, 20008 Donostia-San Sebastian, Spain; penda [email protected]
2 School of Computer and Information Engineering, Henan University, Kaifeng 475000, China
3 Ikerbasque, Basque Foundation for Science, Plaza Euskadi, 5, 48009 Bilbao, Spain
4 Department of Mathematics, University of the Basque Country, 48080 Bilbao, Spain; [email protected]
* Correspondence: [email protected]
Abstract: Consumer-to-shop clothes retrieval refers to the problem of matching photos taken by customers with their counterparts in the shop. Due to some problems, such as a large number of clothing categories, different appearances of clothing items due to different camera angles and shooting conditions, different background environments, and different body postures, the retrieval accuracy of traditional consumer-to-shop models is always low. With advances in convolutional neural networks (CNNs), the accuracy of garment retrieval has been significantly improved. Most approaches addressing this problem use single CNNs in conjunction with a softmax loss function to extract discriminative features. In the fashion domain, negative pairs can have small or large visual differences that make it difficult to minimize intraclass variance and maximize interclass variance with softmax. Margin-based softmax losses such as Additive Margin-Softmax (aka CosFace) improve the discriminative power of the original softmax loss, but since they consider the same margin for the positive and negative pairs, they are not suitable for cross-domain fashion search. In this work, we introduce the cross-domain discriminative margin loss (DML) to deal with the large variability of negative pairs in fashion. DML learns two different margins for positive and negative pairs such that the negative margin is larger than the positive margin, which provides stronger intraclass reduction for negative pairs. The experiments conducted on the publicly available fashion dataset DARN and on two benchmarks of the DeepFashion dataset, (1) Consumer-to-Shop Clothes Retrieval and (2) InShop Clothes Retrieval, confirm that the proposed loss function not only outperforms the existing loss functions but also achieves the best performance.
Keywords: cross-domain fashion retrieval; margin-based loss function; adaptive margin; deep learning; discriminative analysis
1. Introduction
Finding fashion images is one of the most sought-after applications in E-commerce. This application allows customers to discover their favorite clothes in online stores, and it can be considered as an important step for future applications in the fashion industry, such as outfit recommendations, i.e., customers search for an outfit after retrieving their desired clothes.
Finding items in online stores using customer photos based solely on their visual appearance has proven to be a major challenge for the computer vision community. Since customer and store images come from different heterogeneous domains, this problem is referred to as a cross-domain problem in apparel search. The quality of shooting equipment, lighting conditions, human body posture, and viewing angle are the main factors that explain the large visual differences between photos of customers and fashion images taken by professional photographers. The same clothes can look different under different circumstances such as light, different situations, or poses. In contrast, different clothes can appear visually similar.
Over the past decade, there has been considerable progress in garment search between consumers and stores using convolutional neural networks (CNNs) [1–17]. Using complex neural networks with a high number of layers, previous methods attempted to extract powerful features and improve retrieval performance. However, as the number of layers in the neural network increases, the intuitive low-level texture information of the clothing images is lost, while the abstract high-level semantic information is preserved, which is only suitable for image classification tasks, but not for cross-domain clothing searches [15]. Moreover, most of the existing methods use Triplet Loss to converge neural networks. Triplet Loss is specifically defined for the face-recognition problem. Human face images are always well-structured, have fixed image sizes, and differ only slightly from each other. Compared to face images, the cross-domain clothing images always have a large variety of different categories and clothing styles (significant intraclass differences), so the Triplet Loss is not suitable for cross-domain clothing retrieval [15].
Another challenge that has not been explicitly addressed is the small visual differences between certain garments (e.g., jeans and pants) that lead to unexpected garments being found and customers being dissatisfied. Small visual differences lead to hard examples being found that have small visual differences from the query image, but do not match (see Figure 1).
[Figure 1 image: a query image followed by the retrieved items, labeled as right match, hard examples, and different from the query.]
Figure 1. Example of consumer-to-shop clothes retrieval, which includes a query image (with a blue frame) and the 10 closest gallery images. The green frame represents the correct match, while the yellow examples represent hard examples, and the red frames represent items that differ from the query. As can be seen, the hard examples have many similarities with the query image. The slight superficial difference causes the images to be retrieved in the wrong way, which leads to system performance degradation.
In this work, we approach this problem by introducing a novel loss function that enforces a small intraclass distance and increases the distance between input pairs that are classified as dissimilar. Margin-based loss functions are typically motivated as approximations to upper bounds on misclassification loss. Contrastive loss and Triplet Loss are used by Siamese networks to extract discriminative features. These losses are based on metric distances and require a large number of utility pairs or triplet samples to obtain an optimal solution. Therefore, they are time-consuming and have poor performance on data from different domains with unbalanced features. Recently, much attention has been paid to softmax-based loss functions. Some researchers have optimized softmax and introduced margin-based softmax loss functions for discriminative analysis. Margin-based softmax losses such as Additive Margin-Softmax (aka CosFace) [18] normalized the feature and weight vectors by $l_2$-normalization to transform the angular margin of Softmax to the cosine margin, to improve the discriminative power of the original softmax loss. They varied the decision margin in the cosine space to modify intraclass and interclass variances, but since they consider the same margin for the positive and negative pairs, they are not suitable for cross-domain fashion search. We prefer a larger margin for negative pairs to strongly squeeze the intraclass variations of negative classes.
To achieve this goal, we propose a novel loss function for cross-domain search of clothes between consumers and stores, which we call Cross-Domain Discriminative Margin Loss (DML). DML learns two different cosine margins for positive and negative pairs to maximize the decision boundary and compact the negative decision margin in cosine space. Specifically, we make the margin $m$ specific and learnable for each class and train the CNN directly. Formally, we define the positive margin $m_p$ and the negative margin $m_n$, such that the decision boundary is given by $\cos(\theta_1) - m_p = \cos(\theta_2)$ and $\cos(\theta_1) - m_n = \cos(\theta_2)$ for positive and negative classes, respectively, where $\theta_i$ is the angle between the feature and the weight of class $i$. In the experiments, we show that DML is superior to the Margin-based Softmax baseline methods. The Siamese networks are trained with DML to learn discriminative deep features for finding similar images. After training, the fashion-retrieval problem between consumers and stores is formulated as an asymmetric (single-to-multiple) matching problem. These features are input to the similarity distance metric to perform pairwise matching between customer and store images. Then, the top-ranked results are displayed to the customer.
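The single-to-multiple matching step described above can be sketched as a simple ranking routine. This is only an illustration under stated assumptions: the text does not specify the similarity metric, so cosine similarity is assumed here, and the function name `rank_gallery`, the toy feature vectors, and `top_k` are hypothetical.

```python
import numpy as np

def rank_gallery(query_feat, gallery_feats, top_k=5):
    """Return indices and scores of the top_k gallery items most similar to the query.

    Assumption: cosine similarity is used as the similarity metric.
    """
    # L2-normalize so that the dot product equals the cosine similarity.
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q                        # one similarity score per gallery item
    order = np.argsort(-sims)[:top_k]   # highest similarity first
    return order, sims[order]

# Toy data: three shop images with 4-dimensional features (made up for illustration).
gallery = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
])
query = np.array([1.0, 0.05, 0.0, 0.0])
indices, scores = rank_gallery(query, gallery, top_k=2)
```

One customer photo is compared against every shop image, and only the highest-scoring shop images are returned, which is what makes the matching asymmetric.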
The main contributions of the proposed work can be summarized as follows:
• A cross-domain discriminative loss function, called DML, is proposed to learn deep discriminative features for customer-to-shop fashion search.
• DML learns a larger margin for the negative class compared to the positive class to increase the variation between classes and reduce the intraclass variation of the negative class.
• The proposed approach achieves the best performance on consumer-to-shop fashion retrieval datasets, including DeepFashion [16] and DARN [17].
The rest of the paper is organized as follows. Section 2 introduces the related work. Section 3 describes our proposed method. Section 4 presents the experimental results obtained on two real fashion datasets. Finally, the discussion and conclusions are presented in Sections 5 and 6, respectively.
2. Related Work
2.1. Fashion Retrieval
Over the past decade, consumer-to-shop image searches in stores have been widely studied [1–17]. Ref. [3] proposed the concept of cross-domain clothing search. Using human posture estimation, they estimated the human body area, extracted 30 regions of the human body, and obtained the local features of clothing images, which can reduce the image differences due to cross-domain clothing images. They used a local feature-matching method and implemented cross-domain garment search through a two-stage sparse coding method. Although using the human posture estimation technique to extract local features is an intelligent solution to the cross-domain problem, this technique sometimes fails to detect regions of the human body based on different clothing postures. Therefore, the extracted irrelevant features may reduce the retrieval performance. Another study, ref. [5], proposed a novel region representation method to reduce the influence of complex and cluttered background environments. A binary spatial appearance mask was used to constrain the human body regions obtained by the pose-estimation algorithm. The methods based on the pose-estimation algorithm have the limitation that the same points must be visible in the whole image. Otherwise, the local features of different parts of the human body would be compared in cross-domain clothing images, which would lead to poor results.
With the rapid development of convolutional neural networks (CNNs) in recent years, traditional methods for clothing analysis have been replaced by neural network models. In [1], the concept of precise cross-scene search to cope with this shift was proposed, with the goal of finding the exact same item on the shopping website when shopping online. They reduced the domain difference by removing the background of consumer images, which is one of the most critical sources of appearance variation, and using object proposals to select foreground items. Using pairwise mixed images from both domains, they trained deep similarity learning methods for the task of accurate street-to-store search. However, the object detectors do not work for complex gestures and the performance of deep similarity learning is sensitive to the introduction of pairwise images, which is a very time-consuming process according to the limited data. Dual attribute perceptual ranking network based on two fully independent branches (DARN) [17] has used feature learning for different scene domains integrating attribute and visual similarity constraints simultaneously. DARN uses two CNN-based branches for each of two domains and projects them into a common embedding space. Then, the output features of each subnetwork are concatenated and fed into the triplet ranking loss for the two subnetworks. Since the cross-domain clothing images have a large variety of different categories and clothing styles, the differences between the image pairs are very large and the Triplet Loss does not work well. FashionNet, proposed by [16], learns clothes retrieval by jointly predicting clothing attributes and landmark features, and applies the network to cross-scenario services of the DeepFashion dataset. FashionNet focuses on image keypoint localization by using the registered keypoints and image attribute information, which requires a lot of labor and also a lot of time to mark the keypoints of clothing images. Another study, [4], proposed a deep Siamese network with a modified contrastive loss and multitask fine-tuning method that trains a common model for all categories simultaneously. The Siamese network is directly trained for object detection/classification and then used for similarity estimation.
On the other hand, contrastive loss attempts to make binary decisions about whether two images are similar, but cannot capture fine-grained similarity. Moreover, the common branch at the bottom of the network has learned features without considering higher-level semantic information. The authors of [6] used attribute labels to pay more attention to local discriminative regions. They employed attention mechanisms in global feature aggregation to focus network training on the clothes themselves, effectively neglecting the influence of background noise. However, their method relies heavily on defining label and clothing parsing categories that may not be available in real-world scenarios. Alternatively, the authors of [14] proposed a Grid Search Network (GSN) to generate visual embeddings for fashion retrieval. They also used a reinforcement learning based strategy to improve performance and learn a special transformation function over the GSN feature embedding. They generated a target grid by randomly selecting positive and negative patterns with respect to the query image, and then optimized a distance-based grid search loss to enable simultaneous comparison of multiple feature embeddings. The performance of GSN depends heavily on the effective selection of positive and negative samples. In [11], the Siamese-based networks called Graph Reasoning Network (GRNet) were recommended for similarity learning between a query and a gallery clothing by using both global and local representations in different local clothing regions and scales based on a graph convolutional neural network. Another study, [10], employed two neural networks with different parameters to detect the differences between consumer and shop clothing images. However, using two different sets of parameters leads to an increase in the number of parameters, which is not conducive to neural network optimization [15]. In contrast, we perform the cross-domain consumer-to-shop clothes retrieval via the Siamese networks, which have the same weights for both subnetworks. To overcome the limitations of the data problem and avoid the complexity of the network structure to extract stronger features, a novel Discriminative Margin Loss (DML) suitable for apparel search is proposed. The network is optimized with DML to learn discriminative features and achieve more accurate matching.
2.2. Loss Functions
Deep Embedding Learning is undoubtedly considered as one of the interesting and significant aspects of the research fields in deep convolutional neural networks, and recently researchers have shown an increasing interest in this area. Loss functions play an important role in deep embedding learning. Deep embedding learning methods increase discriminative power by improving loss functions. Contrastive loss [19,20] and discriminative loss [21] optimize the Euclidean distance of input pairwise samples within a margin for interclass in a feature space. Triplet Loss [22] constructs input triplet samples to separate the positive pair from the negative pair by a Euclidean distance margin for better interclass feature embedding. Therefore, both contrastive loss and Triplet Loss enforce a Euclidean margin for learned features. These methods depend on the number of positive and negative input pairs or triplet images. Therefore, the performance of these loss functions is sensitive to the introduction of pair or triplet mining procedures, which are time consuming [23].
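To make the Euclidean-margin idea concrete, the sketch below shows the commonly used textbook forms of contrastive loss and Triplet Loss for a single pair or triplet. It is an illustration only: the exact variants used in the cited works may differ, and the function names and margin values are assumptions chosen for the example.

```python
import numpy as np

def contrastive_loss(x1, x2, is_similar, margin=1.0):
    """Common form: pull similar pairs together, push dissimilar pairs beyond `margin`."""
    d = np.linalg.norm(x1 - x2)  # Euclidean distance between the two embeddings
    if is_similar:
        return 0.5 * d ** 2
    # Dissimilar pairs contribute only while they are closer than the margin.
    return 0.5 * max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Common form: the negative must be farther from the anchor than the positive by `margin`."""
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance anchor-positive
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance anchor-negative
    return max(0.0, d_pos - d_neg + margin)

anchor = np.array([0.0, 0.0])
positive = np.array([0.1, 0.0])
easy_negative = np.array([1.0, 0.0])   # far from the anchor: no loss
hard_negative = np.array([0.2, 0.0])   # almost as close as the positive: nonzero loss

easy = triplet_loss(anchor, positive, easy_negative)
hard = triplet_loss(anchor, positive, hard_negative)
```

Only the hard negative produces a nonzero loss, which illustrates why these losses depend so strongly on how pairs or triplets are mined.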
To exploit the supervision property and improve the discriminative power of the deep-learned features, most recent approaches combine Euclidean margin-based losses with softmax losses. For example, Ref. [24] proposed a center loss to learn centers for deep features such that each class minimizes the within-class variations and the given centers are combined with softmax loss. The deep features learned with softmax loss have an intrinsic angular distribution, and Euclidean margin-based losses are not compatible with softmax losses. To address this issue, the researchers decided to optimize the softmax loss for within-class variation. One study, ref. [25], proposed a large margin softmax (i.e., L-Softmax) by adding angle constraints to each identity to improve feature discrimination. Moreover, ref. [23] improved L-Softmax by normalizing the weights and proposed Angular Softmax (A-Softmax). Due to the difficulty of optimizing angle constraints, Refs. [18,26,27] moved the angle range to a cosine space and proposed CosFace and ArcFace, respectively. CosFace and ArcFace each assign the same decision margin to the negative class and the positive class. In consumer-to-shop fashion retrieval, negative pairs with small visual differences could be considered as positive pairs and affect the retrieval performance. Thus, assigning an equal decision margin to positive and negative classes causes the system to perform poorly on negative pairs with small visual differences. These pairs require a larger decision margin to distinguish them as well as possible from the positive pairs. In contrast to existing loss functions, we propose a novel cross-domain loss that introduces two different margins into the negative and positive interclasses to extract discriminative deep features.
3. The Proposed Approach
In this section, we describe the proposed method in detail. First, we discuss the drawbacks of the existing loss functions for the cross-domain problem and explain our motivation for introducing a novel loss function (Section 3.1). The proposed Cross-Domain Discriminative Margin Loss (DML) is presented in Section 3.2. Finally, to better understand the difference between DML and the other loss functions, a visual comparison is made in Section 3.3.
3.1. Motivation
Margin-based softmax losses have achieved significant improvements by setting $m$ for all the classes to squeeze the intraclass variations. They assumed that the feature distributions of all the classes are identical, so that setting the same margin is enough to constrain all the classes. Since they consider the same margin for the positive and negative pairs, they are not suitable for cross-domain fashion search. For the negative class with large visual differences, the extracted features are placed in the feature distribution of negative samples, but for those negative classes with small visual differences, extracted features may be placed in the feature distribution of the positive class.
If a uniform margin $m$ is set for the positive and negative classes, the feature distributions of the negative class may not be as compact as those of the positive class. The goal is to achieve a small intraclass variation for the negative pairs in addition to increasing the variation between classes. If the same margin is considered for the positive and negative classes, the negative pairs that are very similar can be considered as positive, which reduces the functionality of the system in the discrimination process. We further visualize the phenomenon through the process of distinguishing the positive pairs from the negative pairs as shown in Figure 2. Suppose that the normalized feature vectors $x$ and $y$ are given for the positive and negative pairs, respectively. In our work, feature fusion for a pair of images is achieved by adding the deep feature vectors of the two images. The blue region represents the region of positive pairs, while the red region represents the region of negative pairs. In addition, the white region represents the variation between classes. Let $\theta_1$ ($\theta_2$) denote the angle between the learned feature vector (representing a given pair of images) and the normalized weight vector $w_1$ ($w_2$). $w_1$ and $w_2$ are the centers of the positive and negative classes, denoted by $C_1$ and $C_2$, respectively. The CosFace forces $\cos(\theta_1) - m = \cos(\theta_2)$ for $C_1$, and similarly for $C_2$, so that features from the positive and negative classes are equally compacted. In a desirable discrimination process, we not only want to maximize the variation between the classes, but also want to minimize the intraclass variation of the negative class. To address this problem, we introduce a novel discriminative margin loss for cross-domain fashion retrieval. By learning a larger margin $m_n$ for the negative class compared to the positive class margin $m_p$, we simultaneously increase the interclass variation and decrease the intraclass variation of the negative class, ensuring that no very similar negative pairs (hard examples) occur in the positive decision margin.
[Figure 2 image: two panels, CosFace (margin) and DML (discriminative margin), showing the feature vectors x and y (hard example) relative to the class weight vectors W1 and W2.]
Figure 2. Geometrical interpretation of DML is illustrated from feature perspective. Blue and red areas represent the feature space of the positive and negative classes, respectively. The extracted feature vectors of the positive or negative image pairs are merged into a single vector at the feature level. CosFace [18] sets the same margin $m$ for positive and negative classes, so the discrimination process cannot be strong enough. Compared to the positive class margin $m_p$, DML learns a larger margin $m_n$ for the negative class, and consequently expands the variations between classes and condenses the variations within classes, implicitly optimizing the discrimination space. Negative pairs with small visual differences move closer to negative pairs with large visual differences, pushing hard examples into the feature space of the negative class.
3.2. Cross-Domain Discriminative Margin Loss (DML)
In Siamese networks, two input images are simultaneously fed into two subnetworks (with the same architecture and weights) and the similarity of the two images is evaluated by the contrastive loss. The contrastive loss is used to train the network to distinguish between similar and dissimilar pairs of examples.
The Siamese network problem is sensitive to calibration because it requires a context for the notion of similarity or dissimilarity [28]. To obtain a robust discriminative model, a large number of positive and negative pairs must be introduced, which is a time-consuming process. Moreover, negative pairs contribute to the loss function only when their distance is at the decision boundary. On the other hand, the choice of an appropriate value for the decision margin depends on the number and influence of the positive and negative pairs.
To overcome these problems, we merge the embedded features of the two subnetworks and use the softmax function instead of the Euclidean distance, and propose a novel Cross-Domain Discriminative Margin Loss (DML) for cross-domain fashion retrieval. Softmax separates features from different classes by maximizing the posterior probability of the corresponding class. Given the feature vector $x_i$ and the corresponding label $y_i$, the softmax loss is defined as follows:
$$L_s = \frac{1}{N}\sum_{i=1}^{N} -\log p_i = \frac{1}{N}\sum_{i=1}^{N} -\log \frac{e^{w_{y_i}^T x_i + b_{y_i}}}{\sum_{j=1}^{C} e^{w_j^T x_i + b_j}}, \quad (1)$$
where $p_i$ denotes the posterior probability that the feature vector $x_i$ (a single vector formed by fusing the extracted feature vectors of the positive or negative image pairs at the feature level) is correctly classified into the corresponding class $y_i$, $w_j$ denotes the $j$-th column of the weight matrix $W$, $b$ is the bias term, $N$ is the number of training samples, and $C$ is the number of classes. Normalizing $x_i$ and $w_j$ using $L_2$ normalization, rescaling $x_i$ to $s$, and fixing the bias term $b = 0$ for simplicity [18], the feature distance is projected onto the feature angle measure as follows:
$$w_j^T x_i = \|w_j\| \, \|x_i\| \cos\theta_{ji} = s \cos\theta_{ji}, \quad (3)$$
where $\theta_{ji}$ is the angle between $w_j$ and $x_i$. Thus, both the norm and the angle of the vectors contribute to the posterior probability. Based on this formulation, some methods have been proposed to optimize and extend the interclass margin [18,26]. Since optimization in cosine space is much easier compared to angle space, we further focus on the analysis of cosine margin. By importing the margin $m$ into the cosine space of Softmax, the Large Margin Cosine Loss (LMCL) [18] attempts to further distinguish it as follows:
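$$L_{lmc} = \frac{1}{N}\sum_{i=1}^{N} -\log \frac{e^{s(\cos(\theta_{y_i,i}) - m)}}{e^{s(\cos(\theta_{y_i,i}) - m)} + \sum_{j \neq y_i} e^{s\cos(\theta_{j,i})}}, \quad (4)$$
subject to
$$\cos\theta_{j,i} = w_j^T x_i, \quad (5)$$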
where $N$ is the number of training samples, $x_i$ is the $i$-th feature vector corresponding to the ground truth class of $y_i$, $w_j$ is the weight vector of the $j$-th class, and $\theta_{j,i}$ is the angle between $w_j$ and $x_i$.
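The following NumPy sketch evaluates the large margin cosine loss of Equation (4) on a small batch. It is a forward-pass illustration only, not the authors' implementation: the function names, the layout of `weights` as a (D, C) matrix whose columns are the $w_j$, and the default values of `s` and `m` are assumptions made for the example. Setting `m=0` gives the normalized softmax loss without a margin.

```python
import numpy as np

def cosine_logits(features, weights):
    """cos(theta_{j,i}) for every sample i and class j, as in Equation (5)."""
    x = features / np.linalg.norm(features, axis=1, keepdims=True)  # unit-norm features
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)    # unit-norm class columns
    return x @ w  # shape (N, C)

def lmcl_loss(features, weights, labels, s=30.0, m=0.35):
    """Large margin cosine loss, Equation (4); s and m are illustrative values."""
    cos = cosine_logits(features, weights)
    n = cos.shape[0]
    logits = s * cos
    logits[np.arange(n), labels] -= s * m  # subtract the margin from the true class only
    # Numerically stable log-softmax over the classes.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(n), labels].mean()

# Toy batch: two samples, two classes (values made up for illustration).
feats = np.array([[1.0, 0.2], [0.1, 1.0]])
W = np.eye(2)
y = np.array([0, 1])
loss_with_margin = lmcl_loss(feats, W, y, m=0.35)
loss_without_margin = lmcl_loss(feats, W, y, m=0.0)
```

Because the margin lowers only the true-class logit, the loss with a margin is always larger than the loss without one for the same inputs, so the network has to separate the classes further to reach the same loss value.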
Since cross-domain fashion retrieval is a discriminative binary problem, we have only two classes (similar and dissimilar classes). Therefore, $\theta_1$ and $\theta_2$ denote the angles between the embedding feature vectors and the weight vectors of class $C_1$ and $C_2$, respectively. In the LMCL method, the value of the margin $m$ is considered as a constant value for positive and negative classes, resulting in pairs with small visual differences (hard examples) being identified as positive pairs. This problem is particularly prevalent in cross-domain fashion retrieval, where there is a high degree of similarity in design and appearance between different types of clothing. Our goal is to expand the variation between classes to distinguish negative pairs from positive pairs and condense the negative feature space to gather negative pairs with small and large visual differences. This prevents hard examples from entering the feature space of positive pairs and increases the discriminative power. To this end, we do not assign the same margin $m$ to the negative class and the positive class, but assign a larger $m$ to the negative class to reduce the intraclass variation of the negative class. For clarity, we represent the angles below with only one subscript corresponding to the class. In other words, $\theta_{j,i}$ is denoted by $\theta_j$. For the positive class, and similarly for the negative class, the cross-domain loss is formulated as follows:
$$L_{Cross-Domain} = \frac{1}{N}\sum_{i=1}^{N} -\log \frac{e^{s(\cos(\theta_{y_i}) - m_{y_i})}}{e^{s(\cos(\theta_{y_i}) - m_{y_i})} + e^{s\cos(\theta_j)}}, \quad (6)$$
where $N$ is the number of training samples, $m_{y_i}$ is the margin assigned to the ground truth class $y_i \in \{p, n\}$ of the $i$-th pair (for the positive class it is $m_p$ and for the negative class it is $m_n$), and $j \neq y_i$. $m_n$ should be larger than $m_p$. Setting $m_n > m_p$ aims to compact the negative decision boundary, expand the interclass variation, and reduce the negative intraclass variation, which also ensures the absence of the hard examples in the positive feature space.
To ensure the discriminative power of cross-domain loss and provide a decisive solution, we introduce the discriminative part as follows:
$$L_{discriminative} = -(\lambda_1 \times m_p + \lambda_2 \times m_n)/2, \quad (7)$$
where $\lambda_1$ and $\lambda_2$ ($\lambda_1 < \lambda_2$) are balancing factors to control the size of the positive and negative margins. By combining (6) and (7), the cross-domain discriminative margin loss (DML) is proposed as follows:
$$L_{DML} = L_{Cross-Domain} + L_{discriminative} = \frac{1}{N}\sum_{i=1}^{N} -\log \frac{e^{s(\cos(\theta_{y_i}) - m_{y_i})}}{e^{s(\cos(\theta_{y_i}) - m_{y_i})} + e^{s\cos(\theta_j)}} - (\lambda_1 \times m_p + \lambda_2 \times m_n)/2, \quad (8)$$
where $m_p$, $m_n$ are the margins for positive and negative classes, $\theta_{y_i}$ is the angle between $x_i$ (the fused feature vector of the positive or negative pair) and the vector $w_{y_i}$. The hyperparameters $\lambda_1$ and $\lambda_2$ control the discriminative power of DML.
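As a rough illustration of Equation (8), the sketch below evaluates the DML value for a batch of fused pair features in the binary (positive/negative) setting. It is a forward-pass sketch under assumptions, not the authors' code: the function name `dml_loss`, the argument layout, and the default values of `s`, `lam1`, and `lam2` are hypothetical, and the margins are passed as plain numbers, whereas in the paper they are learned during training (which would require an automatic differentiation framework).

```python
import numpy as np

def dml_loss(cos_pos, cos_neg, labels, m_p, m_n, s=30.0, lam1=0.5, lam2=1.0):
    """Forward value of Equation (8) for a batch of fused pair features.

    cos_pos, cos_neg: arrays of shape (N,), cosines to the positive and
                      negative class weight vectors (w1 and w2).
    labels:           shape (N,), 1 for a positive pair, 0 for a negative pair.
    m_p, m_n:         class margins (fixed numbers here; learnable in the paper).
    """
    labels = np.asarray(labels, dtype=bool)
    cos_true = np.where(labels, cos_pos, cos_neg)    # cos(theta_{y_i})
    cos_other = np.where(labels, cos_neg, cos_pos)   # cos(theta_j), j != y_i
    margin = np.where(labels, m_p, m_n)              # m_{y_i}
    z_true = s * (cos_true - margin)
    z_other = s * cos_other
    # -log(e^a / (e^a + e^b)) = log(1 + e^(b - a)), computed stably with logaddexp.
    cross_domain = np.logaddexp(0.0, z_other - z_true).mean()   # Equation (6)
    discriminative = -(lam1 * m_p + lam2 * m_n) / 2.0           # Equation (7)
    return cross_domain + discriminative

# Toy batch: one positive pair and one hard negative pair (values made up).
cos_to_pos = np.array([0.8, 0.40])
cos_to_neg = np.array([0.1, 0.55])
pair_labels = [1, 0]
value = dml_loss(cos_to_pos, cos_to_neg, pair_labels, m_p=0.1, m_n=0.3)
```

The discriminative term becomes more negative as the margins grow, so minimizing the total loss rewards larger margins, and choosing `lam2` greater than `lam1` rewards the negative margin more strongly.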
3.3. Comparison to Other Loss Functions
To better understand the advantages of DML over existing losses, the decision boundary for the discrimination problem is shown in Figure 3. Softmax considers a margin of 0 between the positive class $C_1$ and the negative class $C_2$. CosFace and ArcFace specify a constant value for the margin between positive and negative classes. We argue that these strategies are not suitable for clothing analysis because the distribution of the negative class is not uniform, i.e., negative pairs can have both small and large visual differences.
To overcome this challenge in cross-domain fashion search, the proposed loss assigns a learnable margin to each class, while a larger margin is enforced for the negative class. The larger margin $m_n$ compacts scattered negative pairs with small and large visual differences and shifts the decision boundary of the negative class $C_2$ away from the positive class $C_1$.
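A small numeric example may help to see the effect of the larger negative margin. The sketch below checks whether a hard negative pair already satisfies the margin constraint implied by the loss, namely $\cos(\theta_{y}) - m_{y} \geq \cos(\theta_{j})$ for its own class $y$. The helper name and all numeric values are hypothetical and chosen only for illustration.

```python
def satisfies_margin(cos_true, cos_other, margin):
    """True when the sample clears its own class boundary by at least `margin`."""
    return cos_true - margin >= cos_other

# A hard negative pair: only slightly closer to the negative center than to the positive one.
cos_neg, cos_pos = 0.55, 0.40

uniform_ok = satisfies_margin(cos_neg, cos_pos, margin=0.10)  # shared margin m (CosFace-style)
dml_ok = satisfies_margin(cos_neg, cos_pos, margin=0.30)      # larger negative margin m_n
```

With the shared margin the hard negative already satisfies the constraint and is left close to the positive region. With the larger negative margin the constraint is still violated, so training keeps pushing the pair toward the negative class center.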
Since the number of negative pairs is higher than the number of positive pairs (due to the limited amount of data), consumer-to-shop fashion retrieval could be considered as a class imbalanced problem, where the training can be dominated by the most frequent class (negative pairs). FCdDN [29] proposed a loss function to reassign the probability value of the dominant class to a smaller value to overcome this problem. Specifically, FCdDN maps the probability values of the dominant class (negative pairs) to a smaller value and the probability values of the poor class (positive class) to a larger value. By focusing attention on the dominant class and giving it more weight, FCdDN attempts to solve the imbalance problem. Compared to FCdDN, DML not only tries to solve the imbalance problem by assigning a larger margin to the negative class, but also uses the discriminative part to prevent the positive margin from becoming equal to the negative margin, which helps to distinguish hard examples from positive pairs.
[Figure 3 image: four panels titled Softmax, CosFace, ArcFace, and DML, each showing the decision regions of classes C1 and C2, with margin m for CosFace and ArcFace and margins mp and mn for DML.]
Figure 3. The decision margin of different loss functions for discriminative analysis is visualized. $C_1$ is a positive class and $C_2$ is a negative class. Blue, red, and white areas represent positive decision margin, negative decision margin, and decision limit, respectively. As can be seen, unlike other losses that consider constant margins $m$ for the positive and negative decision margins, DML learns margins $m_p$, $m_n$ for the positive and negative decision margins, where $m_n > m_p$.
4. Expe imen s
4.1. Da ase s
We e alua ed ou p oposed me hod wi h he da ase DARN and wi h wo bench-
ma ks o he DeepFashion da ase : (1) InShop Clo hes Re ie al and (2) Consume - o-Shop
Clo hes Re ie al.
The DARN da ase was collec ed speci ically o s ee - o-shop e ie al and con ained
app oxima ely 327,000 in-shop images and 91,000 use images. Since he collec o s o
he DARN da ase did no p o ide a s anda d p o ocol and he iles p o ided by he
au ho s con ain b oken links, we use he cleaned e sion p o ided by [
6
,
10
] and ollow
hei e alua ion p o ocol o a ai compa ison. Fi s , hey emo ed co up ed images o
ob ain a subse o 62,812 s ee images and 238,499 shop images o 13,598 dis inc p oduc s
dis ibu ed o e 20 ashion ca ego ies whe e each s ee image has a ma ched shop image.
Then, hey pa i ioned he da ase in o h ee subse s o aining, alida ion, and es , wi h
no o e lap o p oduc s (see Table 1).
The DeepFashion dataset [16] is one of the largest datasets for clothing image analysis
and contains more than 800k images. Each image in this dataset is annotated with labels
for categories, attributes, bounding boxes, and landmarks. Occlusions, deformations,
lighting variations, and large variations in pose and scale make this dataset challenging.
The Consumer-to-Shop Clothes Retrieval benchmark contains 239,557 consumer-to-shop
images with 33,881 clothing items. The InShop Clothes Retrieval benchmark contains
52,712 images with 7982 garments. Their partitions are shown in Table 1.
Note that in the InShop benchmark, the gallery set images are used as training shop photos
and the query set images are used as the test shop photos. To ensure a fair comparison, the
split between training and testing is given. Consistent with the state of the art, we used this
split in all of our experiments. In addition, each image was cropped using the bounding
boxes provided.
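The cropping step can be sketched as follows. This is a minimal illustration on a row-major image stored as nested lists; the `(x1, y1, x2, y2)` corner convention with exclusive end coordinates and the clamping behavior are assumptions, and a real pipeline would typically apply the same operation with an image library.

```python
def crop_to_bbox(image, bbox):
    """Crop a row-major image (list of pixel rows) to bbox = (x1, y1, x2, y2).

    Coordinates are assumed to be pixel indices with x2 and y2 exclusive.
    The box is clamped to the image so slightly oversized annotations still work.
    """
    height, width = len(image), len(image[0])
    x1, y1, x2, y2 = bbox
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    if x1 >= x2 or y1 >= y2:
        raise ValueError("bounding box does not overlap the image")
    return [row[x1:x2] for row in image[y1:y2]]

# Toy 4x5 "image" whose pixel value encodes its position as 10 * row + column.
image = [[10 * r + c for c in range(5)] for r in range(4)]
patch = crop_to_bbox(image, (1, 1, 4, 3))
```

Here `patch` holds rows 1 to 2 and columns 1 to 3 of the toy image.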
the problem. This means that the best parameters for the cross-domain consumer-to-shop
retrieval problem would not be suitable for another problem such as face verification.
6. Conclusions
In this work, a loss function called DML is proposed to improve the performance
of CNNs in consumer-to-shop clothes retrieval. Unlike existing margin-based softmax
losses, DML learns two different margins for negative and positive classes to increase
compactness within classes and separability between classes. The margin for negative
classes is larger than the margin for positive classes. Accordingly, DML attempts to increase
cross-class separability and focuses on negative intraclass compactness. For this reason,
negative pairs with small visual differences are not considered as positive pairs, resulting
in improved retrieval performance. Extensive experimental results on three public fashion
datasets show significant advantages over state-of-the-art methods and all compared
margin-based softmax functions. According to the results, DML was the most successful at
retrieving clothes and achieved Top-50 retrieval performances of 0.759, 0.921, and 0.87 on the
Consumer-to-Shop Clothes Retrieval benchmark, the InShop Clothes Retrieval benchmark,
and the DARN dataset, respectively. Future research directions include: (1) improving the
performance of the CNN used or replacing it with other Deep Learning architectures
such as GRNet to leverage both global and local representations at multiple scales; (2)
generalizing DML to the multiple-class scenario to strengthen the discrimination of learned
features by promoting a specific additional margin for each class in cosine space.
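For readers unfamiliar with the Top-k retrieval metric quoted above, the following sketch shows one common way to compute it: a query counts as a hit if at least one gallery image of the same product appears among its k most similar gallery images. The cosine ranking, the function names, and the toy embeddings are assumptions for illustration and are not taken from the authors' evaluation code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def top_k_accuracy(queries, gallery, k):
    """Fraction of queries whose product appears among the k nearest gallery items.

    `queries` and `gallery` are lists of (product_id, embedding) pairs.
    """
    hits = 0
    for query_id, query_vec in queries:
        ranked = sorted(gallery, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        if any(gallery_id == query_id for gallery_id, _ in ranked[:k]):
            hits += 1
    return hits / len(queries)

# Toy 2-D embeddings (hypothetical): three shop items and two consumer queries.
gallery = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
queries = [("a", [0.9, 0.1]), ("b", [0.6, 0.5])]

top1 = top_k_accuracy(queries, gallery, k=1)
top3 = top_k_accuracy(queries, gallery, k=3)
```

In this toy case the first query finds its product at rank 1, while the second only finds it at rank 3, so the score rises as k grows.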
Author Contributions: Conceptualization, P.A. and F.D.; methodology, P.A., F.D., A.M.; software,
P.A.; validation, P.A., F.D., A.M.; writing (original draft preparation), P.A.; writing (review and
editing), P.A., F.D., A.M.; supervision, F.D., A.M.; funding acquisition, P.A., F.D., A.M. All authors
have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare that the research was conducted in the absence of any
commercial or financial relationships that could be construed as a potential conflict of interest.
References
1. Hadi Kiapour, M.; Han, X.; Lazebnik, S.; Berg, A.; Berg, T. Where to buy it: Matching street clothing photos in online shops. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3343–3351.
2. Li, Z.; Li, Y.; Gao, Y.; Liu, Y. Fast cross-scenario clothing retrieval based on indexing deep features. In Pacific Rim Conference on Multimedia; Springer: Cham, Switzerland, 2016; pp. 107–118.
3. Liu, S.; Song, Z.; Liu, G.; Xu, C.; Lu, H.; Yan, S. Street-to-shop: Cross-scenario clothing retrieval via parts alignment and auxiliary set. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3330–3337.
4. Wang, X.; Sun, Z.; Zhang, W.; Zhou, Y.; Jiang, Y. Matching user photos to online products with robust deep features. In Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, New York, NY, USA, 6–9 June 2016; pp. 7–14.
5. Kalantidis, Y.; Kennedy, L.; Li, L. Getting the look: Clothing recognition and segmentation for automatic product suggestions in everyday photos. In Proceedings of the 3rd ACM Conference on International Conference on Multimedia Retrieval, Dallas, TX, USA, 16–20 April 2013; pp. 105–112.
6. Ji, X.; Wang, W.; Zhang, M.; Yang, Y. Cross-domain image retrieval with attention modeling. In Proceedings of the 25th ACM International Conference on Multimedia, Mountain View, CA, USA, 23–27 October 2017; pp. 1654–1662.
7. Cheng, Z.; Wu, X.; Liu, Y.; Hua, X. Video2shop: Exact matching clothes in videos to online shopping images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4048–4056.
8. Wang, Z.; Gu, Y.; Zhang, Y.; Zhou, J.; Gu, X. Clothing retrieval with visual attention model. In Proceedings of the 2017 IEEE Visual Communications and Image Processing (VCIP), St. Petersburg, FL, USA, 10–13 December 2017; pp. 1–4.

9. Lasserre, J.; Bracher, C.; Vollgraf, R. Street2Fashion2Shop: Enabling Visual Search in Fashion e-Commerce Using Studio Images. In Proceedings of the International Conference on Pattern Recognition Applications and Methods; Springer: Cham, Switzerland, 2018; pp. 3–26.
10. Gajic, B.; Baldrich, R. Cross-domain fashion image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1869–1871.
11. Kuang, Z.; Gao, Y.; Li, G.; Luo, P.; Chen, Y.; Lin, L.; Zhang, W. Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid. arXiv 2019, arXiv:1908.11754.
12. Park, S.; Shin, M.; Ham, S.; Choe, S.; Kang, Y. Study on Fashion Image Retrieval Methods for Efficient Fashion Visual Search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019.
13. Kucer, M.; Murray, N. A Detect-Then-Retrieve Model for Multi-Domain Fashion Item Retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019.
14. Chopra, A.; Sinha, A.; Gupta, H.; Sarkar, M.; Ayush, K.; Krishnamurthy, B. Powering Robust Fashion Retrieval With Information Rich Feature Embeddings. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019.
15. Miao, Y.; Li, G.; Bao, C.; Zhang, J.; Wang, J. ClothingNet: Cross-Domain Clothing Retrieval With Feature Fusion and Quadruplet Loss. IEEE Access 2020, 8, 142669–142679.
16. Liu, Z.; Luo, P.; Qiu, S.; Wang, X.; Tang, X. Deepfashion: Powering robust clothes recognition and retrieval with rich annotations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1096–1104.
17. Huang, J.; Feris, R.; Chen, Q.; Yan, S. Cross-domain image retrieval with a dual attribute-aware ranking network. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1062–1070.
18. Wang, H.; Wang, Y.; Zhou, Z.; Ji, X.; Gong, D.; Zhou, J.; Li, Z.; Liu, W. Cosface: Large margin cosine loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 5265–5274.
19. Chopra, S.; Hadsell, R.; LeCun, Y. Learning a similarity metric discriminatively, with application to face verification. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, 20–25 June 2005; pp. 539–546.
20. Hadsell, R.; Chopra, S.; LeCun, Y. Dimensionality reduction by learning an invariant mapping. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), New York, NY, USA, 17–22 June 2006; Volume 2, pp. 1735–1742.
21. Rao, Y.; Lu, J.; Zhou, J. Learning Discriminative Aggregation Network for Video-Based Face Recognition and Person Re-identification. Int. J. Comput. Vis. 2019, 127, 701–718.
22. Wang, J.; Song, Y.; Leung, T.; Rosenberg, C.; Wang, J.; Philbin, J.; Chen, B.; Wu, Y. Learning fine-grained image similarity with deep ranking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1386–1393.
23. Liu, W.; Wen, Y.; Yu, Z.; Li, M.; Raj, B.; Song, L. Sphereface: Deep hypersphere embedding for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 212–220.
24. Wen, Y.; Zhang, K.; Li, Z.; Qiao, Y. A discriminative feature learning approach for deep face recognition. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 499–515.
25. Liu, W.; Wen, Y.; Yu, Z.; Yang, M. Large-margin softmax loss for convolutional neural networks. ICML 2016, 2, 7.
26. Wang, F.; Cheng, J.; Liu, W.; Liu, H. Additive margin softmax for face verification. IEEE Signal Process. Lett. 2018, 25, 926–930.
27. Deng, J.; Guo, J.; Xue, N.; Zafeiriou, S. Arcface: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 4690–4699.
28. Hoffer, E.; Ailon, N. Deep metric learning using triplet network. In International Workshop on Similarity-Based Pattern Recognition; Springer: Cham, Switzerland, 2015; pp. 84–92.
29. Ouahabi, A.; Taleb-Ahmed, A. Deep learning for real-time semantic segmentation: Application in ultrasound imaging. Pattern Recognit. Lett. 2021, 144, 27–34.
30. Xuan, H.; Souvenir, R.; Pless, R. Deep randomized ensembles for metric learning. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 723–734.
31. Shen, Y.; Xiao, T.; Li, H.; Yi, S.; Wang, X. End-to-end deep kronecker-product matching for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 6886–6895.
32. Su, H.; Wang, P.; Liu, L.; Li, H.; Li, Z.; Zhang, Y. Where to Look and How to Describe: Fashion Image Retrieval with an Attentional Heterogeneous Bilinear Network. IEEE Trans. Circuits Syst. Video Technol. 2020, 31, 3254–3265.
33. Verma, S.; Anand, S.; Arora, C.; Rai, A. Diversity in fashion recommendation using semantic parsing. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 500–504.
34. Lasserre, J.; Rasch, K.; Vollgraf, R. Studio2shop: From studio photo shoots to fashion articles. arXiv 2018, arXiv:1807.00556.