Deep Learning with Discriminative Margin Loss for Cross-Domain Consumer-to-Shop Clothes Retrieval

Author: Alirezazadeh, Pendar,Dornaika, Fadi,Moujahid, Abdelmalik

Publisher: MDPI

Year: 2022

DOI: 10.3390/s22072660

Source: https://addi.ehu.eus/bitstream/10810/56381/1/sensors-22-02660-v2.pdf



Citation: Alirezazadeh, P.; Dornaika, F.; Moujahid, A. Deep Learning with Discriminative Margin Loss for Cross-Domain Consumer-to-Shop Clothes Retrieval. Sensors 2022, 22, 2660. https://doi.org/10.3390/s22072660
Academic Editors: Abdeldjalil Ouahabi, Sébastien Jacques and Amir Benzaoui
Received: 15 February 2022
Accepted: 28 March 2022
Published: 30 March 2022
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Pendar Alirezazadeh 1, Fadi Dornaika 1,2,3,* and Abdelmalik Moujahid 4
1 Department of Informatics, University of the Basque Country, 20008 Donostia-San Sebastian, Spain; pendar[email protected]
2 School of Computer and Information Engineering, Henan University, Kaifeng 475000, China
3 Ikerbasque, Basque Foundation for Science, Plaza Euskadi, 5, 48009 Bilbao, Spain
4 Department of Mathematics, University of the Basque Country, 48080 Bilbao, Spain; [email protected]
* Correspondence: [email protected]
Abstract: Consumer-to-shop clothes retrieval refers to the problem of matching photos taken by customers with their counterparts in the shop. Due to some problems, such as a large number of clothing categories, different appearances of clothing items due to different camera angles and shooting conditions, different background environments, and different body postures, the retrieval accuracy of traditional consumer-to-shop models is always low. With advances in convolutional neural networks (CNNs), the accuracy of garment retrieval has been significantly improved. Most approaches addressing this problem use single CNNs in conjunction with a softmax loss function to extract discriminative features. In the fashion domain, negative pairs can have small or large visual differences that make it difficult to minimize intraclass variance and maximize interclass variance with softmax. Margin-based softmax losses such as Additive Margin-Softmax (aka CosFace) improve the discriminative power of the original softmax loss, but since they consider the same margin for the positive and negative pairs, they are not suitable for cross-domain fashion search. In this work, we introduce the cross-domain discriminative margin loss (DML) to deal with the large variability of negative pairs in fashion. DML learns two different margins for positive and negative pairs such that the negative margin is larger than the positive margin, which provides stronger intraclass reduction for negative pairs. The experiments conducted on the publicly available fashion dataset DARN and on two benchmarks of the DeepFashion dataset, (1) Consumer-to-Shop Clothes Retrieval and (2) InShop Clothes Retrieval, confirm that the proposed loss function not only outperforms the existing loss functions but also achieves the best performance.
Keywords: cross-domain fashion retrieval; margin-based loss function; adaptive margin; deep learning; discriminative analysis
1. Introduction
Finding fashion images is one of the most sought-after applications in E-commerce. This application allows customers to discover their favorite clothes in online stores, and it can be considered as an important step for future applications in the fashion industry, such as outfit recommendations, i.e., customers search for an outfit after retrieving their desired clothes.
Finding items in online stores using customer photos based solely on their visual appearance has proven to be a major challenge for the computer vision community. Since customer and store images come from different heterogeneous domains, this problem is referred to as a cross-domain problem in apparel search. The quality of shooting equipment, lighting conditions, human body posture, and viewing angle are the main factors that explain the large visual differences between photos of customers and fashion images taken by professional photographers. The same clothes can look different under different circumstances such as light, different situations, or poses. In contrast, different clothes can appear visually similar.
Over the past decade, there has been considerable progress in garment search between consumers and stores using convolutional neural networks (CNNs) [1–17]. Using complex neural networks with a high number of layers, previous methods attempted to extract powerful features and improve retrieval performance. However, as the number of layers in the neural network increases, the intuitive low-level texture information of the clothing images is lost, while the abstract high-level semantic information is preserved, which is only suitable for image classification tasks, but not for cross-domain clothing searches [15]. Moreover, most of the existing methods use Triplet Loss to converge neural networks. Triplet Loss is specifically defined for the face-recognition problem. Human face images are always well-structured, have fixed image sizes, and differ only slightly from each other. Compared to face images, the cross-domain clothing images always have a large variety of different categories and clothing styles (significant intraclass differences), so the Triplet Loss is not suitable for cross-domain clothing retrieval [15].
Another challenge that has not been explicitly addressed is the small visual differences between certain garments (e.g., jeans and pants) that lead to unexpected garments being found and customers being dissatisfied. Small visual differences lead to the retrieval of hard examples, i.e., items that look very similar to the query image but do not match it (see Figure 1).
Figure 1. Example of consumer-to-shop clothes retrieval, which includes a query image (with a blue frame) and the 10 closest gallery images (the retrieved items). The green frame represents the correct match, while the yellow examples represent hard examples, and the red frames represent items that differ from the query. As can be seen, the hard examples have many similarities with the query image. The slight superficial difference causes the images to be retrieved in the wrong way, which leads to system performance degradation.
In this work, we approach this problem by introducing a novel loss function that enforces a small intraclass distance and increases the distance between input pairs that are classified as dissimilar. Margin-based loss functions are typically motivated as approximations to upper bounds on misclassification loss. Contrastive loss and Triplet Loss are used by Siamese networks to extract discriminative features. These losses are based on metric distances and require a large number of utility pairs or triplet samples to obtain an optimal solution. Therefore, they are time-consuming and have poor performance on data from different domains with unbalanced features. Recently, much attention has been paid to softmax-based loss functions. Some researchers have optimized softmax and introduced margin-based softmax loss functions for discriminative analysis. Margin-based softmax losses such as Additive Margin-Softmax (aka CosFace) [18] normalized the feature and weight vectors by $l_2$-normalization to transform the angular margin of Softmax to the cosine margin, to improve the discriminative power of the original softmax loss. They varied the decision margin in the cosine space to modify intraclass and interclass variances, but since they consider the same margin for the positive and negative pairs, they are not suitable for cross-domain fashion search. We prefer a larger margin for negative pairs to strongly squeeze the intraclass variations of negative classes.
To achieve this goal, we propose a novel loss function for cross-domain search of clothes between consumers and stores, which we call Cross-Domain Discriminative Margin Loss (DML). DML learns two different cosine margins for positive and negative pairs to maximize the decision boundary and compact the negative decision margin in cosine space. Specifically, we make the margin $m$ specific and learnable for each class and train the CNN directly. Formally, we define the positive margin $m_p$ and the negative margin $m_n$, such that the decision boundary is given by $\cos(\theta_1) - m_p = \cos(\theta_2)$ and $\cos(\theta_2) - m_n = \cos(\theta_1)$ for positive and negative classes, respectively, where $\theta_i$ is the angle between the feature and the weight of class $i$. In the experiments, we show that DML is superior to the margin-based softmax baseline methods. The Siamese networks are trained with DML to learn discriminative deep features for finding similar images. After training, the fashion-retrieval problem between consumers and stores is formulated as an asymmetric (single-to-multiple) matching problem. These features are input to the similarity distance metric to perform pairwise matching between customer and store images. Then, the top-ranked results are displayed to the customer.
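The single-to-multiple matching step described above can be sketched as a ranking of gallery (shop) features by their similarity to one query (consumer) feature. The sketch below is illustrative only: it assumes cosine similarity as the similarity metric, and the function names, the 3-dimensional feature vectors, and the item identifiers are hypothetical stand-ins for the deep features produced by the trained network.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def rank_gallery(query, gallery, top_k=3):
    """Return the ids of the top_k gallery items most similar to the query."""
    scored = [(cosine_similarity(query, feat), item_id)
              for item_id, feat in gallery.items()]
    scored.sort(key=lambda t: t[0], reverse=True)  # most similar first
    return [item_id for _, item_id in scored[:top_k]]

# Hypothetical embeddings; real features would come from the trained CNN.
gallery = {
    "shop_a": [0.9, 0.1, 0.0],
    "shop_b": [0.0, 1.0, 0.0],
    "shop_c": [0.7, 0.7, 0.1],
}
query = [1.0, 0.2, 0.0]
print(rank_gallery(query, gallery, top_k=2))
```

A Top-k retrieval score can then be computed by checking, for each query, whether its matching shop item appears in the returned list.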
The main contributions of the proposed work can be summarized as follows:
• A cross-domain discriminative loss function, called DML, is proposed to learn deep discriminative features for customer-to-shop fashion search.
• DML learns a larger margin for the negative class compared to the positive class to increase the variation between classes and reduce the intraclass variation of the negative class.
• The proposed approach achieves the best performance on consumer-to-shop fashion retrieval datasets, including DeepFashion [16] and DARN [17].
The rest of the paper is organized as follows. Section 2 introduces the related work. Section 3 describes our proposed method. Section 4 presents the experimental results obtained on two real fashion datasets. Finally, the discussion and conclusions are presented in Sections 5 and 6, respectively.
2. Related Work
2.1. Fashion Retrieval
Over the past decade, consumer-to-shop image searches in stores have been widely studied [1–17]. Ref. [3] proposed the concept of cross-domain clothing search. Using human posture estimation, they estimated the human body area, extracted 30 regions of the human body, and obtained the local features of clothing images, which can reduce the image differences due to cross-domain clothing images. They used a local feature-matching method and implemented cross-domain garment search through a two-stage sparse coding method. Although using the human posture estimation technique to extract local features is an intelligent solution to the cross-domain problem, this technique sometimes fails to detect regions of the human body based on different clothing postures. Therefore, the extracted irrelevant features may reduce the retrieval performance. Another study, ref. [5], proposed a novel region representation method to reduce the influence of complex and cluttered background environments. A binary spatial appearance mask was used to constrain the human body regions obtained by the pose-estimation algorithm. The methods based on the pose-estimation algorithm have the limitation that the same points must be visible in the whole image. Otherwise, the local features of different parts of the human body would be compared in cross-domain clothing images, which would lead to poor results.
With the rapid development of convolutional neural networks (CNNs) in recent years, traditional methods for clothing analysis have been replaced by neural network models. In [1], the concept of precise cross-scene search to cope with this shift was proposed, with the goal of finding the exact same item on the shopping website when shopping online. They reduced the domain difference by removing the background of consumer images, which is one of the most critical sources of appearance variation, and using object proposals to select foreground items. Using pairwise mixed images from both domains, they trained deep similarity learning methods for the task of accurate street-to-store search. However, the object detectors do not work for complex gestures and the performance of deep similarity learning is sensitive to the introduction of pairwise images, which is a very time-consuming process according to the limited data. Dual attribute perceptual ranking network based on two fully independent branches (DARN) [17] has used feature learning for different scene domains integrating attribute and visual similarity constraints simultaneously. DARN uses two CNN-based branches for each of two domains and projects them into a common embedding space. Then, the output features of each subnetwork are concatenated and fed into the triplet ranking loss for the two subnetworks. Since the cross-domain clothing images have a large variety of different categories and clothing styles, the differences between the image pairs are very large and the Triplet Loss does not work well. FashionNet, proposed by [16], learns clothes retrieval by jointly predicting clothing attributes and landmark features, and applies the network to cross-scenario services of the DeepFashion dataset. FashionNet focuses on image keypoint localization by using the registered keypoints and image attribute information, which requires a lot of labor and also a lot of time to mark the keypoints of clothing images.

Another study, [4], proposed a deep Siamese network with a modified contrastive loss and multitask fine-tuning method that trains a common model for all categories simultaneously. The Siamese network is directly trained for object detection/classification and then used for similarity estimation. On the other hand, contrastive loss attempts to make binary decisions about whether two images are similar, but cannot capture fine-grained similarity. Moreover, the common branch at the bottom of the network has learned features without considering higher-level semantic information. The authors of [6] used attribute labels to pay more attention to local discriminative regions. They employed attention mechanisms in global feature aggregation to focus network training on the clothes themselves, effectively neglecting the influence of background noise. However, their method relies heavily on defining label and clothing parsing categories that may not be available in real-world scenarios. Alternatively, the authors of [14] proposed a Grid Search Network (GSN) to generate visual embeddings for fashion retrieval. They also used a reinforcement learning based strategy to improve performance and learn a special transformation function over the GSN feature embedding. They generated a target grid by randomly selecting positive and negative patterns with respect to the query image, and then optimized a distance-based grid search loss to enable simultaneous comparison of multiple feature embeddings. The performance of GSN depends heavily on the effective selection of positive and negative samples. In [11], the Siamese-based networks called Graph Reasoning Network (GRNet) were recommended for similarity learning between a query and a gallery clothing by using both global and local representations in different local clothing regions and scales based on a graph convolutional neural network. Another study, [10], employed two neural networks with different parameters to detect the differences between consumer and shop clothing images. However, using two different sets of parameters leads to an increase in the number of parameters, which is not conducive to neural network optimization [15].

In contrast, we perform the cross-domain consumer-to-shop clothes retrieval via the Siamese networks, which have the same weights for both subnetworks. To overcome the limitations of the data problem and avoid the complexity of the network structure to extract stronger features, a novel Discriminative Margin Loss (DML) suitable for apparel search is proposed. The network is optimized with DML to learn discriminative features and achieve more accurate matching.
2.2. Loss Functions
Deep Embedding Learning is undoubtedly considered as one of the interesting and significant aspects of the research fields in deep convolutional neural networks, and recently researchers have shown an increasing interest in this area. Loss functions play an important role in deep embedding learning. Deep embedding learning methods increase discriminative power by improving loss functions. Contrastive loss [19,20] and discriminative loss [21] optimize the Euclidean distance of input pairwise samples within a margin for interclass in a feature space. Triplet Loss [22] constructs input triplet samples to separate the positive pair from the negative pair by a Euclidean distance margin for better interclass feature embedding. Therefore, both contrastive loss and Triplet Loss enforce a Euclidean margin for learned features. These methods depend on the number of positive and negative input pairs or triplet images. Therefore, the performance of these loss functions is sensitive to the introduction of pair or triplet mining procedures, which are time consuming [23].
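To make the notion of a Euclidean margin concrete, the following minimal sketch implements one common formulation of the contrastive loss (for a pair) and of the Triplet Loss (for an anchor, a positive, and a negative sample). The margin values and the 2-dimensional vectors are illustrative assumptions, and some references scale these losses by a constant factor.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(a, b, is_similar, margin=1.0):
    """Pull similar pairs together; push dissimilar pairs at least `margin` apart."""
    d = euclidean(a, b)
    if is_similar:
        return d ** 2
    # Dissimilar pairs farther apart than the margin contribute nothing.
    return max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Require the anchor-negative distance to exceed the anchor-positive one by `margin`."""
    d_ap = euclidean(anchor, positive) ** 2
    d_an = euclidean(anchor, negative) ** 2
    return max(0.0, d_ap - d_an + margin)

# A dissimilar pair already beyond the margin yields zero loss.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], is_similar=False))
# A triplet whose negative coincides with the positive is penalized by the margin.
print(triplet_loss([0.0, 0.0], [1.0, 0.0], [1.0, 0.0]))
```

Both losses are zero for many easy pairs or triplets, which is one reason their training signal depends on how informative pairs or triplets are mined.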
To exploit the supervision property and improve the discriminative power of the deep-learned features, most recent approaches combine Euclidean margin-based losses with softmax losses. For example, Ref. [24] proposed a center loss to learn centers of deep features such that each class minimizes the within-class variations and the given centers are combined with softmax loss. The deep features learned with softmax loss have an intrinsic angular distribution, and Euclidean margin-based losses are not compatible with softmax losses. To address this issue, the researchers decided to optimize the softmax loss for within-class variation. One study, ref. [25], proposed a large margin softmax (i.e., L-Softmax) by adding angle constraints to each identity to improve feature discrimination. Moreover, ref. [23] improved L-Softmax by normalizing the weights and proposed Angular Softmax (A-Softmax). Due to the difficulty of optimizing angle constraints, Refs. [18,26,27] moved the angle range to a cosine space and proposed CosFace and ArcFace. Both CosFace and ArcFace assign the same decision margin to the negative class and the positive class. In consumer-to-shop fashion retrieval, negative pairs with small visual differences could be considered as positive pairs and affect the retrieval performance. Thus, assigning an equal decision margin to positive and negative classes causes the system to perform poorly on negative pairs with small visual differences. These pairs require a larger decision margin to distinguish them as well as possible from the positive pairs. In contrast to existing loss functions, we propose a novel cross-domain loss that introduces two different margins for the negative and positive classes to extract discriminative deep features.
3. The Proposed Approach
In this section, we describe the proposed method in detail. First, we discuss the drawbacks of the existing loss functions for the cross-domain problem and explain our motivation for introducing a novel loss function (Section 3.1). The proposed Cross-Domain Discriminative Margin Loss (DML) is presented in Section 3.2. Finally, to better understand the difference between DML and the other loss functions, a visual comparison is made in Section 3.3.
3.1. Motivation
Margin-based softmax losses have achieved significant improvements by setting a single margin $m$ for all the classes to squeeze the intraclass variations. They assumed that the feature distributions of all the classes are identical, so that setting the same margin is enough to constrain all the classes. Since they consider the same margin for the positive and negative pairs, they are not suitable for cross-domain fashion search. For negative pairs with large visual differences, the extracted features are placed in the feature distribution of negative samples, but for negative pairs with small visual differences, the extracted features may be placed in the feature distribution of the positive class.
If a uniform margin $m$ is set for the positive and negative classes, the feature distributions of the negative class may not be as compact as those of the positive class. The goal is to achieve a small intraclass variation for the negative pairs in addition to increasing the variation between classes. If the same margin is considered for the positive and negative classes, the negative pairs that are very similar can be considered as positive, which reduces the functionality of the system in the discrimination process. We further visualize the phenomenon through the process of distinguishing the positive pairs from the negative pairs as shown in Figure 2. Suppose that the normalized feature vectors $x$ and $y$ are given for the positive and negative pairs, respectively. In our work, feature fusion of a pair of images is achieved by adding the deep feature vectors of the two images. The blue region represents the region of positive pairs, while the red region represents the region of negative pairs. In addition, the white region represents the variation between classes. Let $\theta_1$ ($\theta_2$) denote the angle between the learned feature vector (representing a given pair of images) and the normalized weight vector $w_1$ ($w_2$). $w_1$ and $w_2$ are the centers of the positive and negative classes, denoted by $C_1$ and $C_2$, respectively. CosFace forces $\cos(\theta_1) - m = \cos(\theta_2)$ for $C_1$, and similarly for $C_2$, so that features from the positive and negative classes are equally compacted. In a desirable discrimination process, we not only want to maximize the variation between the classes, but also want to minimize the intraclass variation of the negative class. To address this problem, we introduce a novel discriminative margin loss for cross-domain fashion retrieval. By learning a larger margin $m_n$ for the negative class compared to the positive class margin $m_p$, we simultaneously increase the interclass variation and decrease the intraclass variation of the negative class, ensuring that no very similar negative pairs (hard examples) occur in the positive decision margin.
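The fusion step and the two angles $\theta_1$ and $\theta_2$ described above can be illustrated with a short sketch. It assumes that the two image embeddings are added element-wise and that the sum is then L2-normalized before being compared with the normalized class centers; the ordering of addition and normalization, the function names, and the 2-dimensional vectors are assumptions made for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def fuse_pair(feat_a, feat_b):
    """Fuse two image embeddings by element-wise addition, then L2-normalize."""
    return l2_normalize([a + b for a, b in zip(feat_a, feat_b)])

def cosines_to_centers(fused, w_pos, w_neg):
    """Return (cos(theta_1), cos(theta_2)) between the fused vector and the class centers."""
    w1, w2 = l2_normalize(w_pos), l2_normalize(w_neg)
    cos1 = sum(f * w for f, w in zip(fused, w1))
    cos2 = sum(f * w for f, w in zip(fused, w2))
    return cos1, cos2

# Hypothetical embeddings of a consumer photo and a shop photo.
fused = fuse_pair([1.0, 0.0], [1.0, 0.0])
print(cosines_to_centers(fused, w_pos=[1.0, 0.0], w_neg=[0.0, 1.0]))
```

In this toy case the fused vector coincides with the positive center, so $\cos(\theta_1)$ is 1 and $\cos(\theta_2)$ is 0.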
Figure 2. Geometrical interpretation of DML illustrated from the feature perspective (panels: CosFace, with a single margin; DML, with a discriminative margin). Blue and red areas represent the feature space of the positive and negative classes, respectively. The extracted feature vectors of the positive or negative image pairs are merged into a single vector at the feature level. CosFace [18] sets the same margin $m$ for positive and negative classes, so the discrimination process cannot be strong enough. Compared to the positive class margin $m_p$, DML learns a larger margin $m_n$ for the negative class, which expands the variations between classes and condenses the variations within classes, implicitly optimizing the discrimination space. Negative pairs with small visual differences move closer to negative pairs with large visual differences, pushing hard examples into the feature space of the negative class.
3.2. Cross-Domain Discriminative Margin Loss (DML)
In Siamese networks, two input images are simultaneously fed into two subnetworks (with the same architecture and weights) and the similarity of the two images is evaluated by the contrastive loss. The contrastive loss is used to train the network to distinguish between similar and dissimilar pairs of examples.
The Siamese network problem is sensitive to calibration because it requires a context for the notion of similarity or dissimilarity [28]. To obtain a robust discriminative model, a large number of positive and negative pairs must be introduced, which is a time-consuming process. Moreover, negative pairs contribute to the loss function only when their distance is at the decision boundary. On the other hand, the choice of an appropriate value for the decision margin depends on the number and influence of the positive and negative pairs.
To overcome these problems, we merge the embedded features of the two subnetworks and use the softmax function instead of the Euclidean distance, and propose a novel Cross-Domain Discriminative Margin Loss (DML) for cross-domain fashion retrieval. Softmax separates features from different classes by maximizing the posterior probability of the corresponding class. Given the feature vector $x_i$ and the corresponding label $y_i$, the softmax loss is defined as follows:
$$L_s = \frac{1}{N}\sum_{i=1}^{N} -\log p_i = \frac{1}{N}\sum_{i=1}^{N} -\log \frac{e^{w_{y_i}^T x_i + b_{y_i}}}{\sum_{j=1}^{C} e^{w_j^T x_i + b_j}}, \tag{1}$$
where $p_i$ denotes the posterior probability that the feature vector $x_i$ (a single vector formed by fusing the extracted feature vectors of the positive or negative image pairs at the feature level) is correctly classified into the corresponding class $y_i$, $w_j$ denotes the $j$-th column of the weight matrix $W$, $b$ is the bias term, $N$ is the number of training samples, and $C$ is the number of classes. By normalizing $x_i$ and $w_j$ using $L_2$ normalization, rescaling the norm of $x_i$ to $s$, and fixing the bias term $b = 0$ for simplicity [18], the feature distance is projected onto the feature angle as follows:
$$w_j^T x_i = \|w_j\|\,\|x_i\| \cos\theta_{j,i} = s\cos\theta_{j,i}, \tag{3}$$
where $\theta_{j,i}$ is the angle between $w_j$ and $x_i$. Thus, both the norm and the angle of the vectors contribute to the posterior probability. Based on this formulation, some methods have been proposed to optimize and extend the interclass margin [18,26]. Since optimization in cosine space is much easier compared to angle space, we further focus on the analysis of the cosine margin. By importing the margin $m$ into the cosine space of Softmax, the Large Margin Cosine Loss (LMCL) [18] attempts to further distinguish it as follows:
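$$L_{lmc} = \frac{1}{N}\sum_{i=1}^{N} -\log \frac{e^{s(\cos\theta_{y_i,i} - m)}}{e^{s(\cos\theta_{y_i,i} - m)} + \sum_{j \neq y_i}^{C} e^{s\cos\theta_{j,i}}}, \tag{4}$$
subject to
$$\cos\theta_{j,i} = w_j^T x_i, \tag{5}$$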
where $N$ is the number of training samples, $x_i$ is the $i$-th feature vector corresponding to the ground truth class $y_i$, $w_j$ is the weight vector of the $j$-th class, and $\theta_{j,i}$ is the angle between $w_j$ and $x_i$.
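As a rough illustration of Eq. (4), the sketch below evaluates the Large Margin Cosine Loss from precomputed cosines. It assumes the features and class weights have already been L2-normalized, so that each entry of `cosines` equals $\cos\theta_{j,i}$; the default values of `s` and `m` and the toy inputs are illustrative assumptions rather than the settings used in the experiments.

```python
import math

def lmcl_loss(cosines, labels, s=30.0, m=0.35):
    """Large Margin Cosine Loss averaged over a batch.

    cosines[i][j] is cos(theta_{j,i}) for sample i and class j;
    labels[i] is the index of the ground-truth class of sample i.
    """
    total = 0.0
    for cos_row, y in zip(cosines, labels):
        # Subtract the margin only from the ground-truth class, then scale by s.
        logits = [s * (c - m) if j == y else s * c
                  for j, c in enumerate(cos_row)]
        # Numerically stable log-sum-exp over all classes.
        max_logit = max(logits)
        log_sum = max_logit + math.log(
            sum(math.exp(z - max_logit) for z in logits))
        total += log_sum - logits[y]  # equals -log(softmax probability of class y)
    return total / len(labels)

# The same sample incurs a larger loss once the margin is applied.
print(lmcl_loss([[0.8, 0.1]], [0], m=0.0))
print(lmcl_loss([[0.8, 0.1]], [0], m=0.35))
```

With `m=0` the function reduces to the normalized softmax loss, which makes the effect of the margin easy to compare.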
Since cross-domain fashion retrieval is a discriminative binary problem, we have only two classes (similar and dissimilar classes). Therefore, $\theta_1$ and $\theta_2$ denote the angles between the embedding feature vectors and the weight vectors of class $C_1$ and $C_2$, respectively. In the LMCL method, the value of the margin $m$ is considered as a constant value for positive and negative classes, resulting in pairs with small visual differences (hard examples) being identified as positive pairs. This problem is particularly prevalent in cross-domain fashion retrieval, where there is a high degree of similarity in design and appearance between different types of clothing. Our goal is to expand the variation between classes to distinguish negative pairs from positive pairs and condense the negative feature space to gather negative pairs with small and large visual differences. This prevents hard examples from entering the feature space of positive pairs and increases the discriminative power. To this end, we do not assign the same margin $m$ to the negative class and the positive class, but assign a larger $m$ to the negative class to reduce the intraclass variation of the negative class. For clarity, we represent the angles below with only one subscript corresponding to the class. In other words, $\theta_{j,i}$ is denoted by $\theta_j$. For the positive class, and similarly for the negative class, the cross-domain loss is formulated as follows:
$$L_{Cross\text{-}Domain} = \frac{1}{N}\sum_{i=1}^{N} -\log \frac{e^{s(\cos(\theta_{y_i}) - m_{y_i})}}{e^{s(\cos(\theta_{y_i}) - m_{y_i})} + e^{s\cos(\theta_j)}}, \tag{6}$$
where $N$ is the number of training samples, $m_{y_i}$ is the margin assigned to the ground truth class $y_i \in \{p, n\}$ of the $i$-th pair (which is $m_p$ for the positive class and $m_n$ for the negative class), and $j \neq y_i$. $m_n$ should be larger than $m_p$. Setting $m_n > m_p$ aims to compact the negative decision boundary, expand the interclass variation, and reduce the negative intraclass variation, which also ensures the absence of hard examples in the positive feature space.
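The two-class loss of Eq. (6) can be sketched as follows. The function takes, for each fused pair, the cosine to the positive class weight and the cosine to the negative class weight, and applies $m_p$ or $m_n$ according to the ground-truth label. The fixed margin values, the scale `s`, and the example cosines are illustrative assumptions; in the proposed method the margins are learned rather than passed in as constants.

```python
import math

def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def cross_domain_loss(cos_pos, cos_neg, labels, m_p, m_n, s=30.0):
    """Cross-domain loss with class-specific margins, averaged over a batch.

    cos_pos[i], cos_neg[i]: cosine between fused feature i and the positive /
    negative class weight; labels[i] is "p" (matching pair) or "n" (non-matching).
    """
    total = 0.0
    for c1, c2, y in zip(cos_pos, cos_neg, labels):
        if y == "p":
            target, other = s * (c1 - m_p), s * c2
        else:
            target, other = s * (c2 - m_n), s * c1
        # -log(e^target / (e^target + e^other)) = log(1 + e^(other - target))
        total += softplus(other - target)
    return total / len(labels)

# A hard negative: almost as close to the positive center as to the negative one.
print(cross_domain_loss([0.5], [0.6], ["n"], m_p=0.1, m_n=0.1))
print(cross_domain_loss([0.5], [0.6], ["n"], m_p=0.1, m_n=0.4))
```

In this toy case the larger negative margin raises the loss on the hard negative, so training has to move such pairs deeper into the negative region before the loss becomes small.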
To ensure the discriminative power of the cross-domain loss and provide a decisive solution, we introduce the discriminative part as follows:
$$L_{discriminative} = -(\lambda_1 \times m_p + \lambda_2 \times m_n)/2, \tag{7}$$
where $\lambda_1$ and $\lambda_2$ ($\lambda_1 < \lambda_2$) are balancing factors to control the size of the positive and negative margins. By combining (6) and (7), the cross-domain discriminative margin loss (DML) is proposed as follows:
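$$L_{DML} = L_{Cross\text{-}Domain} + L_{discriminative} = \frac{1}{N}\sum_{i=1}^{N} -\log \frac{e^{s(\cos(\theta_{y_i}) - m_{y_i})}}{e^{s(\cos(\theta_{y_i}) - m_{y_i})} + e^{s\cos(\theta_j)}} - (\lambda_1 \times m_p + \lambda_2 \times m_n)/2, \tag{8}$$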
where $m_p$ and $m_n$ are the margins of the positive and negative classes, and $\theta_{y_i}$ is the angle between $x_i$ (the fused feature vector of the positive or negative pair) and the vector $w_{y_i}$. The hyperparameters $\lambda_1$ and $\lambda_2$ control the discriminative power of DML.
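As a worked check on how the two terms of the DML act on the margins, the following sketch differentiates the loss with respect to the margins for a single training pair. The shorthand $p_i$ for the softmax probability of the ground-truth class and the assumption that the margins are updated by plain gradient descent with a step size $\eta$ are introduced here for illustration only.

```latex
\begin{align*}
p_i &= \frac{e^{s(\cos(\theta_{y_i}) - m_{y_i})}}
            {e^{s(\cos(\theta_{y_i}) - m_{y_i})} + e^{s\cos(\theta_j)}}, \\
\frac{\partial\,(-\log p_i)}{\partial m_{y_i}} &= s\,(1 - p_i) \;\ge\; 0, \\
\frac{\partial L_{discriminative}}{\partial m_p} &= -\frac{\lambda_1}{2}, \qquad
\frac{\partial L_{discriminative}}{\partial m_n} = -\frac{\lambda_2}{2}.
\end{align*}
```

Under these assumptions, the cross-domain term pulls each margin down (its gradient is non-negative), while the discriminative term pushes $m_p$ and $m_n$ up at the constant rates $\eta\lambda_1/2$ and $\eta\lambda_2/2$. Because $\lambda_1 < \lambda_2$, the push on $m_n$ is the stronger of the two, which is consistent with the stated aim of learning $m_n > m_p$.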
3.3. Comparison to Other Loss Functions
To better understand the advantages of DML over existing losses, the decision boundary of the discrimination problem is shown in Figure 3. Softmax considers margin = 0 between the positive class $C_1$ and the negative class $C_2$. CosFace and ArcFace specify a constant value for the margin between positive and negative classes. We argue that these strategies are not suitable for clothing analysis because the distribution of the negative class is not uniform, i.e., negative pairs can have both small and large visual differences.
To overcome this challenge in cross-domain fashion search, the proposed loss assigns a learnable margin to each class, while a larger margin is enforced for the negative class. The larger margin $m_n$ compacts scattered negative pairs with small and large visual differences and shifts the decision boundary of the negative class $C_2$ away from the positive class $C_1$.
Since the number of negative pairs is higher than the number of positive pairs (due to the limited amount of data), consumer-to-shop fashion retrieval could be considered as a class imbalanced problem, where the training can be dominated by the most frequent class (negative pairs). FCdDN [29] proposed a loss function to reassign the probability value of the dominant class to a smaller value to overcome this problem. Specifically, FCdDN maps the probability values of the dominant class (negative pairs) to a smaller value and the probability values of the minority class (positive pairs) to a larger value. By focusing attention on the dominant class and giving it more weight, FCdDN attempts to solve the imbalance problem. Compared to FCdDN, DML not only tries to solve the imbalance problem by assigning a larger margin to the negative class, but also uses the discriminative part to prevent the positive margin from becoming equal to the negative margin, which helps to distinguish between hard examples and positive pairs.
Figure 3. The decision margin of different loss functions for discriminative analysis (panels: Softmax, CosFace, ArcFace, DML). $C_1$ is the positive class and $C_2$ is the negative class. Blue, red, and white areas represent the positive decision margin, negative decision margin, and decision limit, respectively. As can be seen, unlike other losses that consider a constant margin $m$ for the positive and negative decision margins, DML learns margins $m_p$, $m_n$ for the positive and negative decision margins, where $m_n > m_p$.
4. Experiments
4.1. Datasets
We evaluated our proposed method with the DARN dataset and with two benchmarks of the DeepFashion dataset: (1) InShop Clothes Retrieval and (2) Consumer-to-Shop Clothes Retrieval.
The DARN dataset was collected specifically for street-to-shop retrieval and contained approximately 327,000 in-shop images and 91,000 user images. Since the collectors of the DARN dataset did not provide a standard protocol and the files provided by the authors contain broken links, we use the cleaned version provided by [6,10] and follow their evaluation protocol for a fair comparison. First, they removed corrupted images to obtain a subset of 62,812 street images and 238,499 shop images of 13,598 distinct products distributed over 20 fashion categories where each street image has a matched shop image. Then, they partitioned the dataset into three subsets for training, validation, and test, with no overlap of products (see Table 1).
The DeepFashion dataset [16] is one of the largest datasets for clothing image analysis and contains more than 800k images. Each image in this dataset is annotated with labels for categories, attributes, bounding boxes, and landmarks. The presence of occlusions, deformations, lighting variations, and large variations in pose and scale makes this dataset challenging. The Consumer-to-Shop Clothes Retrieval benchmark contains 239,557 consumer-to-shop images with 33,881 clothing items. The InShop Clothes Retrieval benchmark contains 52,712 images with 7982 garments. Their partitions are shown in Table 1. Note that in the InShop benchmark, the gallery set images are used as training shop photos and the query set images are used as the test shop photos. To ensure a fair comparison, the split between training and testing is given. Consistent with the state of the art, we used this split in all of our experiments. In addition, each image was cropped using the bounding boxes provided.
the problem. This means that the best parameters for the cross-domain consumer-to-shop retrieval problem would not be suitable for another problem such as face verification.
6. Conclusions
In this work, a loss function called DML is proposed to improve the performance of CNNs in consumer-to-shop clothes retrieval. Unlike existing margin-based softmax losses, DML learns two different margins for negative and positive classes to increase compactness within classes and separability between classes. The margin for negative classes is larger than the margin for positive classes. Accordingly, DML attempts to increase cross-class separability and focuses on negative intraclass compactness. For this reason, negative pairs with small visual differences are not considered as positive pairs, resulting in improved retrieval performance. Extensive experimental results on three public fashion datasets show significant advantages over state-of-the-art methods and all compared margin-based softmax functions. According to the results, DML was the most successful at retrieving clothes and achieved Top-50 retrieval performances of 0.759, 0.921, and 0.87 on the Consumer-to-Shop Clothes Retrieval benchmark, the InShop Clothes Retrieval benchmark, and the DARN dataset, respectively. Future research directions include: (1) improving the performance of the CNN used or replacing it with other Deep Learning architectures such as GRNet to leverage both global and local representations at multiple scales; (2) generalizing DML to the multiple-class scenario to strengthen the discrimination of learned features by promoting a specific additional margin for each class in cosine space.
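For readers unfamiliar with the Top-k retrieval figures quoted above, the sketch below computes such a score under a common definition, assumed here: a query counts as a hit if at least one of its k most similar gallery items (by cosine similarity) shares its product ID. The evaluation protocol actually used follows [6,10] and may differ in detail; the function names, embeddings, and product IDs are hypothetical toy values.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def top_k_accuracy(queries, gallery, k):
    """queries and gallery are lists of (embedding, product_id) pairs.
    Returns the fraction of queries whose product appears in the top k."""
    hits = 0
    for q_vec, q_id in queries:
        ranked = sorted(
            gallery, key=lambda item: cosine(q_vec, item[0]), reverse=True
        )
        if any(g_id == q_id for _, g_id in ranked[:k]):
            hits += 1
    return hits / len(queries)

# Toy shop gallery and consumer queries in a 2-D embedding space.
gallery = [([1.0, 0.0], "a"), ([0.0, 1.0], "b"), ([0.7, 0.7], "c")]
queries = [([0.9, 0.1], "a"), ([0.6, 0.8], "b")]

top1 = top_k_accuracy(queries, gallery, k=1)
top2 = top_k_accuracy(queries, gallery, k=2)
```

In this toy case the second query is ranked closest to the visually similar but wrong product "c", so it is only counted as a hit once k is increased to 2.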
Author Contributions: Conceptualization, P.A. and F.D.; methodology, P.A., F.D., A.M.; software, P.A.; validation, P.A., F.D., A.M.; writing (original draft preparation), P.A.; writing (review and editing), P.A., F.D., A.M.; supervision, F.D., A.M.; funding acquisition, P.A., F.D., A.M. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
1. Hadi Kiapour, M.; Han, X.; Lazebnik, S.; Berg, A.; Berg, T. Where to buy it: Matching street clothing photos in online shops. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3343–3351.
2. Li, Z.; Li, Y.; Gao, Y.; Liu, Y. Fast cross-scenario clothing retrieval based on indexing deep features. In Pacific Rim Conference on Multimedia; Springer: Cham, Switzerland, 2016; pp. 107–118.
3. Liu, S.; Song, Z.; Liu, G.; Xu, C.; Lu, H.; Yan, S. Street-to-shop: Cross-scenario clothing retrieval via parts alignment and auxiliary set. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3330–3337.
4. Wang, X.; Sun, Z.; Zhang, W.; Zhou, Y.; Jiang, Y. Matching user photos to online products with robust deep features. In Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, New York, NY, USA, 6–9 June 2016; pp. 7–14.
5. Kalantidis, Y.; Kennedy, L.; Li, L. Getting the look: Clothing recognition and segmentation for automatic product suggestions in everyday photos. In Proceedings of the 3rd ACM Conference on International Conference on Multimedia Retrieval, Dallas, TX, USA, 16–20 April 2013; pp. 105–112.
6. Ji, X.; Wang, W.; Zhang, M.; Yang, Y. Cross-domain image retrieval with attention modeling. In Proceedings of the 25th ACM International Conference on Multimedia, Mountain View, CA, USA, 23–27 October 2017; pp. 1654–1662.
7. Cheng, Z.; Wu, X.; Liu, Y.; Hua, X. Video2shop: Exact matching clothes in videos to online shopping images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4048–4056.
8. Wang, Z.; Gu, Y.; Zhang, Y.; Zhou, J.; Gu, X. Clothing retrieval with visual attention model. In Proceedings of the 2017 IEEE Visual Communications and Image Processing (VCIP), St. Petersburg, FL, USA, 10–13 December 2017; pp. 1–4.
9. Lasserre, J.; Bracher, C.; Vollgraf, R. Street2Fashion2Shop: Enabling Visual Search in Fashion e-Commerce Using Studio Images. In Proceedings of the International Conference on Pattern Recognition Applications and Methods; Springer: Cham, Switzerland, 2018; pp. 3–26.
10. Gajic, B.; Baldrich, R. Cross-domain fashion image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1869–1871.
11. Kuang, Z.; Gao, Y.; Li, G.; Luo, P.; Chen, Y.; Lin, L.; Zhang, W. Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid. arXiv 2019, arXiv:1908.11754.
12. Park, S.; Shin, M.; Ham, S.; Choe, S.; Kang, Y. Study on Fashion Image Retrieval Methods for Efficient Fashion Visual Search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019.
13. Kucer, M.; Murray, N. A Detect-Then-Retrieve Model for Multi-Domain Fashion Item Retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019.
14. Chopra, A.; Sinha, A.; Gupta, H.; Sarkar, M.; Ayush, K.; Krishnamurthy, B. Powering Robust Fashion Retrieval With Information Rich Feature Embeddings. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019.
15. Miao, Y.; Li, G.; Bao, C.; Zhang, J.; Wang, J. ClothingNet: Cross-Domain Clothing Retrieval With Feature Fusion and Quadruplet Loss. IEEE Access 2020, 8, 142669–142679.
16. Liu, Z.; Luo, P.; Qiu, S.; Wang, X.; Tang, X. Deepfashion: Powering robust clothes recognition and retrieval with rich annotations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1096–1104.
17. Huang, J.; Feris, R.; Chen, Q.; Yan, S. Cross-domain image retrieval with a dual attribute-aware ranking network. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1062–1070.
18. Wang, H.; Wang, Y.; Zhou, Z.; Ji, X.; Gong, D.; Zhou, J.; Li, Z.; Liu, W. Cosface: Large margin cosine loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 5265–5274.
19. Chopra, S.; Hadsell, R.; LeCun, Y. Learning a similarity metric discriminatively, with application to face verification. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, 20–25 June 2005; pp. 539–546.
20. Hadsell, R.; Chopra, S.; LeCun, Y. Dimensionality reduction by learning an invariant mapping. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), New York, NY, USA, 17–22 June 2006; Volume 2, pp. 1735–1742.
21. Rao, Y.; Lu, J.; Zhou, J. Learning Discriminative Aggregation Network for Video-Based Face Recognition and Person Re-identification. Int. J. Comput. Vis. 2019, 127, 701–718.
22. Wang, J.; Song, Y.; Leung, T.; Rosenberg, C.; Wang, J.; Philbin, J.; Chen, B.; Wu, Y. Learning fine-grained image similarity with deep ranking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1386–1393.
23. Liu, W.; Wen, Y.; Yu, Z.; Li, M.; Raj, B.; Song, L. Sphereface: Deep hypersphere embedding for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 212–220.
24. Wen, Y.; Zhang, K.; Li, Z.; Qiao, Y. A discriminative feature learning approach for deep face recognition. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 499–515.
25. Liu, W.; Wen, Y.; Yu, Z.; Yang, M. Large-margin softmax loss for convolutional neural networks. ICML 2016, 2, 7.
26. Wang, F.; Cheng, J.; Liu, W.; Liu, H. Additive margin softmax for face verification. IEEE Signal Process. Lett. 2018, 25, 926–930.
27. Deng, J.; Guo, J.; Xue, N.; Zafeiriou, S. Arcface: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 4690–4699.
28. Hoffer, E.; Ailon, N. Deep metric learning using triplet network. In International Workshop on Similarity-Based Pattern Recognition; Springer: Cham, Switzerland, 2015; pp. 84–92.
29. Ouahabi, A.; Taleb-Ahmed, A. Deep learning for real-time semantic segmentation: Application in ultrasound imaging. Pattern Recognit. Lett. 2021, 144, 27–34.
30. Xuan, H.; Souvenir, R.; Pless, R. Deep randomized ensembles for metric learning. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 723–734.
31. Shen, Y.; Xiao, T.; Li, H.; Yi, S.; Wang, X. End-to-end deep kronecker-product matching for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 6886–6895.
32. Su, H.; Wang, P.; Liu, L.; Li, H.; Li, Z.; Zhang, Y. Where to Look and How to Describe: Fashion Image Retrieval with an Attentional Heterogeneous Bilinear Network. IEEE Trans. Circuits Syst. Video Technol. 2020, 31, 3254–3265.
33. Verma, S.; Anand, S.; Arora, C.; Rai, A. Diversity in fashion recommendation using semantic parsing. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 500–504.
34. Lasserre, J.; Rasch, K.; Vollgraf, R. Studio2shop: From studio photo shoots to fashion articles. arXiv 2018, arXiv:1807.00556.
