µSplit: image decomposition for fluorescence microscopy

Author: Jug, Florian

Publisher: Zenodo

DOI: 10.5281/zenodo.17662056

Source: https://zenodo.org/records/17662056/files/muSplit-ZENODO.pdf

µSplit: image decomposition for fluorescence microscopy
Ashesh1, Alexander Krull2, Moises Di Sante3, Francesco Silvio Pasqualini3, Florian Jug1,*
1Human Technopole, Italy, 2University of Birmingham, UK, 3University of Pavia, Italy
Abstract
We present µSplit, a dedicated approach for trained image decomposition in the context of fluorescence microscopy images. We find that best results using regular deep architectures are achieved when large image patches are used during training, making memory consumption the limiting factor to further improving performance. We therefore introduce lateral contextualization (LC), a novel meta-architecture that enables the memory efficient incorporation of large image-context, which we observe is a key ingredient to solving the image decomposition task at hand. We integrate LC with U-Nets, Hierarchical AEs, and Hierarchical VAEs, for which we formulate a modified ELBO loss. Additionally, LC enables training deeper hierarchical models than otherwise possible and, interestingly, helps to reduce tiling artefacts that are inherently impossible to avoid when using tiled VAE predictions. We apply µSplit to five decomposition tasks, one on a synthetic dataset, four others derived from real microscopy data. Our method consistently achieves best results (average improvements to the best baseline of 2.25 dB PSNR), while simultaneously requiring considerably less GPU memory. Our code and datasets can be found at https://github.com/juglab/uSplit.
1. Introduction
Fluorescence microscopy [10] is routinely used to look at living cells and biological tissues at cellular and sub-cellular resolution [18]. Components of the imaged cells can be highlighted using fluorescent labels, allowing biologists to investigate individual structures of interest. Given the complexity of biological processes, it is typically necessary to look at multiple structures simultaneously, typically via a temporal multiplexing scheme [10] that separates them into different image channels.
Imaging more than 3 or 4 structures in this way is difficult for technical reasons, limiting the rate of scientific progress in the life sciences. One way to circumvent this limitation would be to label two cellular components with the same fluorophore, i.e. image them in the same image channel. Hence, a computational method to split apart (decompose) superimposed biological structures acquired in a single image channel, i.e. without temporal multiplexing, would have tremendous impact (see Figure 1).

*Corresponding Author.

Figure 1. Splitting of superimposed image channels. The input image is the sum of two image channels, each channel containing structures from one given object class. The task of µSplit is to identify and split the structures superimposed in the given input image (dashed rectangles).
Historically, image decomposition has found applications on natural images [9,8,1,3]. Our approach to image decomposition, called µSplit, rests on the idea of learning structural priors for the two unmixed target image channels, and then using these to guide the decomposition of the superimposed (added) pixel intensities. Such content-aware priors have previously been used for tasks such as image restoration [29,4,28], denoising [14,2,15,11,20,19], and segmentation [5,25,30].
In many of these cases, the achievable performance heavily depends on the portion of the image a network can see before having to make a prediction. As we show in this work, the need for large spatial context, i.e. receptive field and patch size, is particularly pronounced for image decomposition. Biological structures in microscopy images can easily extend over distances of several hundred pixels. Accordingly, we observe that results improve with larger training patch sizes and deeper architectures (see Figure 6(a)). Naturally, this leads to models having a huge GPU memory footprint, which limits their applicability to selected compute-savvy life-science labs.
The importance of context has previously been utilized in the field of image segmentation [16,13]. Leng et al. [16] devised a method to efficiently use the available context of the input image for a segmentation task. However, they did not use additional inputs for having access to a larger context than what is already present in the given input patch. Hilbert et al. [13] worked with 3D images and used an additional lower resolution image to improve overall segmentation performance.
Also for µSplit we observe that additional image context is important. In contrast to the previously mentioned architectures, we introduce Lateral Contextualization (LC), a novel meta-architecture that feeds additional image context at multiple processing steps. We introduce three variants, Lean-LC, Regular-LC, and Deep-LC, differing from each other in terms of GPU memory requirements and achievable prediction quality. As we elaborate below, Deep-LC additionally offers the possibility to instantiate a more powerful Hierarchical VAE with more hierarchy layers than otherwise possible, and we show that this leads to improved performance on the image splitting task at hand.

Since µSplit needs to be applicable to large microscopy images, tiled predictions are required. In tiled predictions, the input image is divided into overlapping patches on which predictions are performed individually. Those predictions are then appropriately center-cropped into non-overlapping tiles which can then be appended to form the final prediction. Overlapping patches have to be used to ensure that sufficient image context is available to prevent border artifacts from occurring in the non-overlapping central region.
In Section 3, we argue that for deep networks operating on relatively small patches, overlapping regions should not be created by making tiles larger (Outer Padding), which is arguably the most common way, but that it is better to instead center-crop regions smaller than the original patch size (Inner Padding).
Since Hierarchical VAEs (HVAEs) [26] have recently gained popularity, e.g. for microscopy image denoising and restoration [20,19], we made these powerful architectures also available for the image decomposition task by modifying the default VAE ELBO loss, incorporating the fact that the fed input is different from the decoded output.
2. Problem Statement
A dataset $D_{mix} = (x^1, x^2, \dots, x^N)$ of $N$ images is created by superimposing sampled pairs of image channels $(D_1, D_2)$, such that

$$x^i = (d^i_1 + d^i_2)/2, \quad \forall i \in [1, N], \tag{1}$$

with $D_1 = (d^1_1, d^2_1, \dots, d^N_1)$ and $D_2 = (d^1_2, d^2_2, \dots, d^N_2)$. Given a newly sampled $x = (d_1 + d_2)/2$, the task is to decompose $x$ into estimates of $d_1$ and $d_2$.
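As a minimal sketch of Eq. 1, the snippet below builds a mixed dataset from two stacks of paired channel images. The function name `make_mixed_dataset` and the random toy arrays are illustrative assumptions and not part of the published code.

```python
import numpy as np

def make_mixed_dataset(d1, d2):
    """Superimpose paired channel images following Eq. 1: x^i = (d1^i + d2^i) / 2."""
    d1 = np.asarray(d1, dtype=np.float32)
    d2 = np.asarray(d2, dtype=np.float32)
    if d1.shape != d2.shape:
        raise ValueError("both channel stacks must have the same shape")
    return (d1 + d2) / 2.0

# Toy stand-in data (random values, not microscopy images): N = 4 pairs of 8x8 images.
rng = np.random.default_rng(0)
D1 = rng.random((4, 8, 8))
D2 = rng.random((4, 8, 8))
D_mix = make_mixed_dataset(D1, D2)
```

Given `D_mix` alone, the two channels cannot be recovered without prior knowledge about their structure, which is what the learned model is meant to supply.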
3. Our Approach
A Sound ELBO for µSplit. We train our VAE to describe the joint distribution of both channel images $d_1$ and $d_2$. We modify the VAE's ELBO objective to incorporate the fact that input and output are not the same (as they are for autoencoders). When training the VAE, our objective is to find

$$\arg\max_{\theta} \sum_{i=1}^{N} \log P(d^i_1, d^i_2; \theta),$$

based on our training examples $(d^i_1, d^i_2)$. Here, $\theta$ are the decoder parameters of our VAE, which define the distribution.
Next, we expand $\log P(d_1, d_2; \theta)$ as

$$\log \int P(d_1, d_2, z; \theta)\, dz = \log \int q(z|x;\phi)\, \frac{P(d_1, d_2, z; \theta)}{q(z|x;\phi)}\, dz \geq \int q(z|x;\phi)\, \log \frac{P(d_1, d_2, z; \theta)}{q(z|x;\phi)}\, dz, \tag{2}$$

where $q(z|x;\phi)$ is our encoder network with parameters $\phi$. It can be shown that the evidence lower bound in Eq. 2 is equal to

$$E_{q(z|x;\phi)}[\log P(d_1, d_2|z; \theta)] - KL(q(z|x;\phi), P(z)).$$

By making the assumption of conditional independence of $d_1$ and $d_2$ given $z$, we can simplify the expression to

$$E_{q(z|x;\phi)}[\log P(d_1|z; \theta) + \log P(d_2|z; \theta)] - KL(q(z|x;\phi), P(z)). \tag{3}$$

Expression 3 is what we end up maximizing during training. Note that this analysis can be seamlessly extended to the case where one has a hierarchy of latent vectors [26] instead of just one.
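To make Expression 3 concrete, the following sketch evaluates its negative (a loss to minimize) for a single latent sample. It assumes a pixelwise Gaussian likelihood parameterized by mean and log-variance, a diagonal Gaussian posterior, and a standard normal prior P(z); these choices, the channel ordering of `pred`, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def gaussian_log_likelihood(target, mean, logvar):
    """Sum over pixels of log N(target; mean, exp(logvar))."""
    return np.sum(-0.5 * (np.log(2 * np.pi) + logvar
                          + (target - mean) ** 2 / np.exp(logvar)))

def kl_to_standard_normal(mu_q, logvar_q):
    """KL(N(mu_q, diag(exp(logvar_q))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(logvar_q) + mu_q ** 2 - 1.0 - logvar_q)

def negative_elbo(d1, d2, pred, mu_q, logvar_q):
    """Negative of Expression 3, using one latent sample as the expectation estimate.

    pred is a 4-channel array ordered (mean_1, mean_2, logvar_1, logvar_2);
    this ordering is an assumption of the sketch.
    """
    mean1, mean2, logvar1, logvar2 = pred
    log_lik = (gaussian_log_likelihood(d1, mean1, logvar1)
               + gaussian_log_likelihood(d2, mean2, logvar2))
    return -(log_lik - kl_to_standard_normal(mu_q, logvar_q))
```

Note that the encoder statistics `mu_q` and `logvar_q` are computed from the superimposed input x, while the likelihood terms compare against the two unmixed targets; this asymmetry is the only difference to a standard autoencoding ELBO.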
For modelling $q(z|x;\phi)$, we use the identical setup of the bottom-up branch used in [20] with the input being $x$, the superimposed input. For modeling $P(d_1|z;\theta)$ and $P(d_2|z;\theta)$, we again use the top-down branch design used in [20] but make the top-down branch output two channels for mean and two more for the pixelwise log(var), one each for $d_1$ and $d_2$. So, the output of our model is a 4 channel tensor with identical spatial dimensions as the input. Note that to incorporate LC, we modify both $q(z|x;\phi)$ and $P(d_2|z;\theta)$ which we describe next.
Lateral Contextualisation (LC). We introduce LC, allowing µSplit to see larger portions of the input image at increasingly downscaled pixel resolutions. LC only requires small full resolution patches, rendering the network considerably more memory efficient.

Figure 2. Network architecture of µSplit. In (a), we show the network architecture employed by Regular-LC. The input (left side) consists of a core image patch $x_p$, together with downscaled versions of the patch surroundings, the lateral context (LC). We show the area corresponding to the original patch as a red dotted box throughout the figure. (b) The network architecture of Deep-LC. The architecture used in [20] is stacked on top of the Regular-LC architecture shown in (a). Note that this is only possible because the latent space in Regular-LC retained the spatial dimensions of all layers by means of using the proposed LC. Note: a sketch of the Lean-LC architecture, our third LC variant, can be found in the Supp. Figure S.1.
Many popular architectures, such as U-Nets [23] or HVAEs [20,6,27], are composed of a hierarchy of levels that operate on increasingly downsampled and therefore also increasingly smaller layers. The basic idea of LC is to pad each downsampled layer by additional image context, i.e. additional input from an available larger input image, such that each layer at each hierarchy level maintains the same spatial dimensions. (In Figure 2(a), the red dashed squares in the stack of inputs (leftmost column) indicate the location of the original patch ($x_p$) within the downscaled and laterally contextualized inputs at higher hierarchy levels ($x_{(p,i)}$).)
Creating downsampled LC inputs. Let $x_p = x[c, h]$ denote a patch of size $h \times h$ from $x \in D_{mix}$ centered around pixel location $c$. To decompose the patch $x_p$, we additionally use a sequence of successively downscaled and cropped versions of $x$, $X^{lowres}_p = (x_{(p,1)}, x_{(p,2)}, \dots, x_{(p,n_{LC})})$, where $x_{(p,k)}$ is $x[c, 2^k \cdot h]$, downsampled to the same pixel resolution of $h \times h$, and $n_{LC}$ denotes the total number of used LC inputs.
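The construction of the LC inputs can be sketched as follows. The text does not state here how crops that extend beyond the image border are filled or which downsampling filter is used; zero-padding and block averaging are assumptions of this sketch, and the helper names are hypothetical.

```python
import numpy as np

def crop_centered(x, c, size):
    """Crop a size x size window around c = (row, col); pixels outside x are zero."""
    half = size // 2
    padded = np.pad(x, half, mode="constant")
    r0, c0 = c[0], c[1]  # after padding by `half`, the window starts at index c
    return padded[r0:r0 + size, c0:c0 + size]

def downsample_mean(x, factor):
    """Downsample a 2D array by an integer factor via block averaging."""
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def lc_inputs(x, c, h, n_lc):
    """Return x_p and the list [x_(p,1), ..., x_(p,n_lc)], each of shape (h, h)."""
    x_p = crop_centered(x, c, h)
    low_res = [downsample_mean(crop_centered(x, c, (2 ** k) * h), 2 ** k)
               for k in range(1, n_lc + 1)]
    return x_p, low_res

# Example: a 64x64 patch with two lateral-context inputs covering 128 and 256 pixels.
image = np.arange(256 * 256, dtype=float).reshape(256, 256)
x_p, low_res = lc_inputs(image, c=(128, 128), h=64, n_lc=2)
```

All returned arrays share the shape h×h, so memory per input stays constant while the covered field of view doubles with every additional LC input.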
Implementation of Regular-LC. The overall architecture is shown in Figure 2(a). The primary input patch $x_p$ is fed to the first input branch (IB). The output of this IB is fed to the first bottom up (BU) block, which downsamples the input via strided convolutions, whose output is then passed to some residual blocks (see Supp. Figure S.1), and finally zero padded to regain the same spatial dimension as the input it received. The output of the first BU block is concatenated with the output of the second IB, which has received the first lower resolution input containing additional lateral context, $x_{(p,1)}$. Zero-padding followed by concatenation ensures pixelwise alignment between IB's output and BU's output. We use 1×1-convolutions to merge these concatenated channels and feed the resulting layer into the next BU block. This procedure gets repeated for every hierarchy level in the given HVAE.
Once the topmost hierarchy level is reached, the last layer is fed into the topmost top down (TD) block. A TD block consists of some residual layers, followed by a stochastic block as they are used in HVAEs. The output of the stochastic block is center-cropped to half size and upsampled via transpose convolutions before again being fed through some residual layers (see Supp. Figure S.1). Cropping and upsampling ensures that the output of the TD block matches the next lower hierarchy level. The output of the TD block is, similar to before, first concatenated with the output of the bottom up computations and then fed through 1×1-convolutions. Once we reach the bottom hierarchy level, the output of the last TD block is fed through an output block (OB) composed of some additional convolutional layers, giving us the final predictions for $d_1$ and $d_2$.

Figure 3. Cartoon of a generic hierarchical network with an encoder-decoder architecture illustrating the relationship between the input patch size, the effective receptive field, and the theoretical receptive field. The input patch, shown at the very left in the center of the light blue area, is processed and downsampled multiple times (encoder) before being upsampled multiple times (decoder) to allow the output, shown on the very right, to have the same pixel dimensions as the input patch. Cuboids shown by solid black lines represent the tensors the network computes during its execution. Solid blue cuboids show the effective receptive field, i.e. the areas within each tensor that can influence the center-most pixels in the two output layers (depicted by red rectangles). All but the last two tensors are fully 'visible' to those pixels, since the theoretical receptive field, i.e. the maximum area that would influence those pixels if the respective tensor would be sufficiently large, grows beyond their bounds (shown as light-blue solid cuboids). Note that working with larger input patches will fill a larger portion of the theoretical receptive field. If theoretical and effective receptive fields diverge, as shown in this cartoon, padded predictions on input patches larger than the training patch size will cause the network to operate out-of-distribution (OOD) and therefore lead to degraded prediction quality (see main text and Supp. Section S.2.1).
We've integrated LC into HVAE, HAE and the classic U-Net architecture. Note that the difference between HVAEs and Hierarchical Autoencoders (HAEs) is that the stochastic block is replaced by the identity. We use the term Vanilla to denote the underlying architecture on which we then enable LC.
Deep-LC: deeper performs better. We observe empirically that having deeper hierarchies is beneficial (see Figure 6(a)). Since in U-Nets, HAEs, and HVAEs, each consecutive hierarchical level halves the input tensor in all spatial dimensions, a natural limit to the maximum hierarchy level is given by the fed patch size¹. By making use of additional lower resolution image context at each hierarchy level, we've designed µSplit such that spatial dimensions of latent tensors stay constant across all hierarchy levels. This enables Deep-LC (see Figure 2(b)), our most potent architecture variant, to have additional hierarchy levels over what a vanilla HVAE can have, typically showing best results in our experiments (see Figure 6(b) and Figure 7).

¹Using a patch size of 64, for example, can at most give rise to 5 hierarchy levels ($2^{5+1} = 64$).

More concretely, in our Deep-LC network, we stack a default HVAE (like the one used in [20]) on top of our Regular-LC variant (Figure 2(a)). This means that starting from the highest hierarchy level using LC, any further hierarchy level is built like a regular HVAE hierarchy stack.
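The depth limit of a vanilla hierarchy mentioned in the footnote can be checked with a few lines. The sketch assumes that every level halves the spatial size and that the smallest admissible tensor is 2×2, which is how we read the footnote's $2^{5+1} = 64$; the function name is hypothetical.

```python
def max_hierarchy_levels(patch_size, min_size=2):
    """Count how many times patch_size can be halved without dropping below min_size."""
    levels = 0
    size = patch_size
    while size % 2 == 0 and size // 2 >= min_size:
        size //= 2
        levels += 1
    return levels

levels_64 = max_hierarchy_levels(64)    # 64 -> 32 -> 16 -> 8 -> 4 -> 2, i.e. 5 levels
levels_512 = max_hierarchy_levels(512)  # larger patches admit deeper vanilla hierarchies
```

With LC the latent tensors keep their spatial size at every level, so this bound no longer applies to the levels that use lateral context.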
Lean-LC: minimal memory footprint. Lean-LC, our most memory efficient LC variation, does not use the lateral context introduced in the bottom-up branch within the top-down branch (see Supp. Figure S.1 for its architecture). More specifically, the bottom-up branch is identical to Regular-LC, but the top-down branch reduces to the default HVAE implementation, very similar to how it was also used in [20]. This is enabled by center cropping the output of each BU block going into the TD block.
Tiled Predictions. For virtually all tasks using fully convolutional architectures, trained networks are often used to predict results on inputs much larger than the patches they were trained on. Whenever an input image is so large that the network in question cannot scale without running out-of-memory, predictions are typically performed on overlapping patches and later suitably cropped and appended. When applied to relatively shallow [24] and non-variational networks, results can be pixel-perfect, i.e. not containing any tiling artifacts. But we observe that there are two cases wherein tiling artefacts are not easily avoidable.

Figure 4. Strategies for tiled predictions. (a) The difference between Inner and Outer Padding. The blue dashed rectangle represents one patch used for tiled predictions. For each cell in the faint gray grid superimposed on the input image one such patch exists. The red dashed rectangle represents the center-crop region used to tile the final prediction of the entire input image. The blue shaded area is therefore the part of the patch that overlaps with neighboring patches, i.e. it is the padding area of the red rectangle. Outer Padding uses a tile size equivalent to the training patch size and introduces overlap by enlarging the patch being fed to the network. Inner Padding, in contrast, maintains the original patch size, and uses only an inner crop to tile the given input image. (b) Percentage variation (of PSNR measurements) when using different amounts of Outer or Inner padding (for HAE and HVAE vanilla setups using a patch size of 64). For varying amounts of padding (x-axis), we plot how 6 data points for the PaviaATN data (3 tasks ∗ 2 = 6) and 2 data points for Hagen et al. data (1 task) are distributed. Note how distributions for Inner Padding are consistently better. (c) Using Outer Padding, predictions are performed on patches larger than the ones used during training, leading to out-of-distribution (OOD) inputs and therefore to inferior predictions (red arrows). First and second row are the ground truth and prediction made without any padding respectively. See Supp. Figure S.3 for more examples.
The first is caused by networks that have huge receptive fields (see Figure 3). When trained with a patch size much smaller than the theoretical receptive field size, large parts of the theoretical receptive field will be empty (i.e. zero). See also Supp. Section S.2.1 for a more detailed description. When such trained networks are later used for tiled predictions, a problem arises whenever the input patches, on which predictions are made, are larger than the patch size used during training (which typically is the case because the patch size is chosen such that GPU memory is best utilized, and input patches need to overlap sufficiently to avoid border artifacts). These patches will fill a larger portion of the theoretical receptive field than training patches did, resulting in out-of-distribution (OOD) predictions and worsened performance (see Figure 4(b) for quantitative assessment).
The second case of tiling artifacts arises when variational networks like HVAEs are used. These architectures sample from the variational latent space of encoded tiles, with samples of neighboring tiles not necessarily decoding into consistent image contents along the borders of predicted tiles.
The solution we propose is twofold: (i) Instead of tiled prediction on larger patches (Outer Padding), which is arguably the most often used tiling scheme, we propose to use Inner Padding instead, an approach that uses patches of the same size as the ones used during training, thereby solving the OOD issue introduced above. More specifically, in both tiling schemes, the input image is divided into overlapping patches. The predictions on these patches are then center cropped and these crops are put right next to each other in order to create a prediction of the entire input image. To enlarge the overlap between neighboring patches, Outer Padding enlarges the patch size. Inner Padding does not alter the size of patches, but instead only uses a smaller central area of their respective predictions. See Figure 4(a) for a visual depiction of Inner and Outer Padding. In our experiments (see Section 5), we have used Inner Padding of 24 pixels, determined via grid-search. (ii) Overlap amounts with Inner Padding are constrained to be small. Small overlap would usually cause artifacts due to insufficient image context at tile boundaries. However, due to our LC approach, µSplit is fed a very large and consistent image context at both sides of all patch boundaries, allowing us to operate with minimal artifacts even with small overlaps². In the supplement, we empirically show the lower need for overlap for our LC variants.
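A minimal sketch of tiled prediction with Inner Padding is given below. It handles a single 2D channel, treats the network as an opaque callable `predict_fn` that maps a patch to a prediction of the same shape, and reflect-pads the image border; all three are simplifying assumptions, and the function name is hypothetical.

```python
import numpy as np

def predict_tiled_inner(image, predict_fn, patch_size=64, inner_pad=24):
    """Predict on patches of the training size and keep only each centre crop."""
    tile = patch_size - 2 * inner_pad  # e.g. 64 - 2 * 24 = 16
    if tile <= 0:
        raise ValueError("inner_pad must be smaller than patch_size / 2")
    H, W = image.shape
    # Extend the image to a multiple of the tile size and add the padding margin.
    extra_h, extra_w = (-H) % tile, (-W) % tile
    padded = np.pad(image,
                    ((inner_pad, inner_pad + extra_h), (inner_pad, inner_pad + extra_w)),
                    mode="reflect")  # border handling is an assumption
    out = np.zeros((H + extra_h, W + extra_w), dtype=np.float32)
    for r in range(0, H + extra_h, tile):
        for c in range(0, W + extra_w, tile):
            patch = padded[r:r + patch_size, c:c + patch_size]  # always patch_size^2
            pred = predict_fn(patch)
            out[r:r + tile, c:c + tile] = pred[inner_pad:inner_pad + tile,
                                               inner_pad:inner_pad + tile]
    return out[:H, :W]

# Sanity check with an identity "network": the tiled output reproduces the input.
rng = np.random.default_rng(0)
img = rng.random((40, 50))
tiled = predict_tiled_inner(img, lambda p: p)
```

Because every patch handed to `predict_fn` has exactly the training patch size, the network never sees more of its theoretical receptive field filled than during training, which is the point of Inner Padding.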
Training Details. For every dataset, we use 80%, 10% and 10% of the data as train-validation-test split. All models are trained using 16-bit precision on a Tesla V100 GPU. Unless otherwise mentioned, all models are trained with batch size of 32 and input patch size of 64. For all HVAEs, we lower-bound σs of P(d1, d2, θ) to exp(−5). This avoids numerical problems arising from these σs going to zero, as reported in [22]. Next, we re-parameterize the normal distributions of the BU branch using the σExpLin reformulation introduced in [7]. We additionally upper-bound the input to σExpLin to 20. For training µSplit with Deep-LC, we follow the suggestions in [6,21], and divide the output of each BU block by $\sqrt{2 \cdot i}$, with $i$ being the index of the hierarchy level the BU block is part of.
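Two of the stabilization steps above are simple enough to sketch. The snippet assumes that the decoder predicts log(var), so that σ ≥ exp(−5) corresponds to log(var) ≥ −10, and that the hierarchy index i starts at 1 and enters the scaling as sqrt(2·i); both are our reading of the text rather than confirmed implementation details, and the function names are hypothetical.

```python
import numpy as np

LOG_SIGMA_MIN = -5.0  # lower bound on log(sigma), i.e. sigma >= exp(-5)

def clamp_logvar(logvar):
    """Lower-bound the predicted log-variance so that sigma never drops below exp(-5)."""
    return np.maximum(logvar, 2.0 * LOG_SIGMA_MIN)  # log(var) = 2 * log(sigma)

def scale_bu_output(bu_out, level_index):
    """Divide a bottom-up block output by sqrt(2 * i); i is assumed to be 1-based."""
    if level_index < 1:
        raise ValueError("level_index is assumed to start at 1")
    return bu_out / np.sqrt(2.0 * level_index)

sigma = np.exp(0.5 * clamp_logvar(np.array([-50.0, 0.0])))
scaled = scale_bu_output(np.ones(3), level_index=2)
```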
4. Datasets
SinosoidalCritters. We created this synthetic dataset explicitly to demonstrate the importance of context for the splitting task and the usefulness of using LC within µSplit. Images in this dataset can only correctly be decomposed when sufficient lateral image context is available during prediction time.

²Note that artifacts arising from independently sampling the latent space in HVAEs remain an unsolved problem.

[Figure 5 panel labels: Frequencies; Freq. pairs, i.e., Critters; Connecting these Freq. pairs; Ch1. Image; Ch2. Image; Input Image]
Figure 5. The synthetic SinosoidalCritters dataset is designed in such a way that large lateral image context is needed in order to perform correct channel splitting. (a) A schema illustrating how we created the SinosoidalCritters dataset. A detailed description is provided in Section 4. (b) We show two sample SinosoidalCritters input images (row 1) of size 128 × 128 and 256 × 256 pixels and the two channels that created them (row 2), respectively. Below, we show the decomposition results obtained with a trained vanilla HVAE with input patch size 64 (row 3), and results obtained with the same architecture but using Regular-LC (row 4). To recognise which critter is depicted and assign it to a channel, the network has to see both waveforms, hence requiring long range lateral image context.

Figure 6. Benefits of µSplit in one glance: Quantitative results of baselines vs. µSplit variants. (a) We plot the performance of the vanilla U-Net and the vanilla HVAE baseline trained on increasingly larger patch sizes on our PaviaATN Act vs. Tub data. The U-Net performance plateaus roughly at a patch size of about 256. The performance of the vanilla HVAE (not using LC) depends on how many hierarchy layers we use (1 to 4, different colored plots), but then plateaus as well, or requires a tremendous amount of GPU memory (black plot, also see Table 1). (b) The left plot displays the data as shown in the HVAE plot in (a), but now as a function of hierarchy levels in the used architecture. Each curve now represents a given patch size. X-axis ticks express how many hierarchy levels the HVAE has, and how many of those make use of LC (number in brackets). The rightmost two plots show results obtained with µSplit using an HVAE with a patch size of only 64. Each plot shows results obtained with one of our LC variations being used. Not only do networks using LC outperform all baselines, they do so already when using the smallest patch size (64), thereby requiring only a moderate amount of GPU memory (see Table 1).
We first choose 4 different frequencies and combine them into 4 unique pairs. Two pairs are dedicated to image channel 1 (blue box), the other two to image channel 2 (green box). We call these pairs critters. The assignment of these critters to channels is done such that each frequency is assigned exactly once to each channel. We connect the two sinusoids of each critter with a low frequency curve of controllable length (later denoted by Njoin in Table 2). Note that it is the specific combination of sinusoid frequencies present in the curve which decides whether it belongs to Channel 1 or 2 since the individual sinusoids themselves occur in both channels in equal amount. Next, we assemble channel images by placing a predefined number of randomly chosen curves at random positions in the respective image channel. The final input image is created as the sum of the two channels. See Figure 5(a) for dataset construction.
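The pairing constraint can be illustrated with symbolic frequency labels. The labels `f1` to `f4` and the particular assignment below are hypothetical; the text only requires four unique pairs in which every frequency appears exactly once per channel.

```python
from collections import Counter

frequencies = ["f1", "f2", "f3", "f4"]  # placeholder labels, not the actual values

# One valid assignment of frequency pairs ("critters") to the two channels.
channel_1 = [("f1", "f2"), ("f3", "f4")]
channel_2 = [("f1", "f3"), ("f2", "f4")]

def uses_each_frequency_once(pairs, freqs):
    """True if every frequency occurs in exactly one pair of the given channel."""
    counts = Counter(f for pair in pairs for f in pair)
    return all(counts[f] == 1 for f in freqs)

valid = (uses_each_frequency_once(channel_1, frequencies)
         and uses_each_frequency_once(channel_2, frequencies)
         and len(set(channel_1 + channel_2)) == 4)
```

Since each frequency is present in both channels, seeing only one sinusoid of a critter carries no information about its channel; the network needs to see both ends, which may lie far apart when Njoin is large.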
PaviaATN Microscopy Dataset. We've created the PaviaATN dataset. It has been imaged in the Synthetic Physiology Laboratory at University of Pavia, and is composed of 62 4-channel fluorescence microscopy images of size 2720 × 2720. Notably, this dataset has higher pixel resolution than most publicly available fluorescence microscopy datasets [12,17,31]. The three channels we use label Actin, Tubulin and Nuclei, respectively, yielding three decomposition tasks we refer to as Actin vs. Tubulin, Actin vs. Nucleus, and Tubulin vs. Nucleus. Note that the dataset has two channels labelling Nuclei from which we picked one. See supplement for more details.
Hagen et al. Actin-Mitochondria Dataset. From many sub-datasets provided by Hagen and colleagues [12], we picked the one with Mitochondria and Actin channels, the one with the highest pixel resolution (2048 × 2048).

[Figure 7 panel labels: PSNR: 28.3 (27.2), PSNR: 30.4 (29.8), PSNR: 30.1 (30.7); PSNR: 22.3 (23.1), PSNR: 24.4 (25.5), PSNR: 24.0 (26.5)]
Figure 7. Qualitative results on the Act vs. Tub task from our PaviaATN dataset. We compare ground truth to results obtained with the vanilla HVAE baseline trained with a patch size of 64 and to results obtained with two variations of µSplit (HVAEs using lean and deep LC, both also using a patch size of 64). The overlaid histograms show either the intensity distribution of the two channels (column 1) or the intensity distribution of the ground truth and the prediction (red). The given PSNR values are for the individual prediction (full input image) and for the entire dataset (in brackets).
5. Experiments and Results
Incrementally Introducing LC. In the left panel of Figure 6(b), we show that for Vanilla HVAE, as hierarchy levels increase (BU blocks), so does the performance, provided the patch size is large enough. For a patch size of 64, increasing hierarchy levels does not bring any benefit after a point.
In the central panel of Figure 6(b), keeping the patch size and hierarchy levels fixed to 64 and 4 respectively, we introduce LC to an increasing number of hierarchy levels (denoted by the number in the brackets along the x-axis). This gives us a cumulative gain of around 2 dB PSNR. Furthermore, with Deep-LC (right panel), we increase the hierarchy level even further which gives us further improvements. Two things are worth noting here for the patch size of 64: (i) There is not much benefit in increasing hierarchy levels for Vanilla HVAE. Using LC, on the other hand, leads to additional improvements, and (ii) Vanilla HVAE cannot employ as many hierarchy levels as we can do using Deep-LC, and the results gain substantially from those extra levels. The Vanilla-XL model denotes the Vanilla model trained with a patch size of 512. The Deep-LC results outperform the Vanilla-XL HVAE, see Figure 6(a), while also having a much smaller GPU memory footprint (see Table 1).
Experiments on Microscopy Data. We present results on 3 decomposition tasks on the PaviaATN dataset and 1 decomposition task on the Hagen et al. dataset. Table 1 summarizes our findings. As baselines, we've adapted the works of [16,13] and find that µSplit outperforms them. It is worth noting that the architecture used in [13], unlike ours, did not generalize to using a hierarchy of lower resolution inputs and worked with just one additional low resolution input. It also, unlike us, did not respect pixel alignments while concatenating the latent space tensors of the two resolution levels. We have also applied the unsupervised Double-DIP [9] baseline to 6 randomly sampled crops of size 256 × 256 for each test-set image of the PaviaATN and Hagen et al. datasets (see Table 1 and supplementary figure).
Over all four tasks, the best performing LC variant with HVAE architecture outperforms the best LC variant with HAE architecture by 0.5 dB PSNR on average. Using the HVAE architecture, Deep-LC outperforms Lean-LC on average by 0.8 dB PSNR. For the HAE architecture this difference is 0.1 dB PSNR. Qualitative results are shown in Figure 7 and in the supplement.
Model                       Patch   GPU     PaviaATN                                  Hagen et al.
                            Size    (GiB)   Act vs Nuc    Tub vs Nuc    Act vs Tub    Act vs Mit
                                            PSNR   SSIM   PSNR   SSIM   PSNR   SSIM   PSNR   SSIM
Double-DIP [9]              -       -       22.8   0.30   21.2   0.20   20.9   0.30   25.3   0.56
BraveNet [13]               64      2.8     31.7   0.73   30.3   0.61   25.9   0.62   33.0   0.92
Context-Aware U-Net [16]    64      4.7     31.5   0.74   29.0   0.61   25.1   0.61   31.1   0.91
U-Net                       256     9.4     33.2   0.79   31.4   0.71   28.1   0.69   34.2   0.95
U-Net                       512     28.7    33.3   0.79   31.1   0.72   27.9   0.69   34.1   0.94
U-Net Regular-LC            64      12.5    33.5   0.79   32.0   0.71   27.6   0.68   32.7   0.93
HAE Vanilla                 64      2.3     31.7   0.74   29.5   0.64   25.4   0.63   31.9   0.92
HAE Lean-LC                 64      3.9     33.6   0.78   31.9   0.70   27.7   0.67   32.9   0.94
HAE Regular-LC              64      6.0     33.5   0.79   31.6   0.71   27.9   0.68   33.4   0.94
HAE Deep-LC                 64      6.9     33.7   0.80   31.8   0.72   28.3   0.69   32.8   0.94
HAE Vanilla-XL              512     31.2    33.2   0.79   30.2   0.68   27.6   0.67   34.2   0.95
HVAE Vanilla                64      2.8     31.8   0.75   29.6   0.64   25.2   0.61   31.9   0.93
HVAE Lean-LC                64      4.4     33.8   0.79   31.9   0.71   27.7   0.68   32.7   0.94
HVAE Regular-LC             64      11.1    33.9   0.80   32.1   0.72   27.8   0.68   34.1   0.95
HVAE Deep-LC                64      12.8    33.9   0.81   32.5   0.73   28.6   0.70   34.3   0.95
HVAE Vanilla-XL             512     (∗)     33.4   0.78   32.9   0.69   27.6   0.67   34.3   0.95

Table 1. Quantitative results on fluorescent image decomposition tasks derived from the PaviaATN and Hagen et al. datasets. All results are reported in terms of peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM). For each model we also report the used training patch size and GPU memory usage during training. The baselines we use are Double-DIP [9], BraveNet [13], Context-Aware U-Net [16], as well as vanilla HAEs and HVAEs using four hierarchy levels. Additionally, we show results of U-Nets [23], HAEs, and HVAEs trained on much larger patch sizes (256 and 512). The results of µSplit are also obtained with the same HAE and HVAE architectures trained on patches of size 64 × 64, but with all hierarchy levels also employing either Lean-LC, Regular-LC, or Deep-LC (see main text for details). Bold numbers denote the best result for any given task (column). In all but one case (PaviaATN, Tubulin vs. Nuclei), our results outperform all baselines despite having a comparatively lean memory footprint. (∗) Note that the Vanilla-XL HVAE with patch size of 512 and batch size of 32 did not fit in 32 GiB of GPU memory and so we lowered the batch size such that the model did fit in memory.

Image   Model         Njoin = 0       Njoin = 25
Size                  PSNR   SSIM     PSNR   SSIM
128     Vanilla       28.3   0.90     25.5   0.85
128     Lean-LC       37.3   0.97     35.1   0.96
128     Regular-LC    37.0   0.98     39.2   0.98
256     Vanilla       19.4   0.75     15.8   0.43
256     Lean-LC       34.1   0.97     32.2   0.97
256     Regular-LC    41.5   0.99     41.6   0.98

Table 2. Quantitative results on the SinosoidalCritters dataset. We compare results obtained with vanilla HVAEs that do not use LC, and HVAEs employing either Lean-LC or Regular-LC (i.e. µSplit results, see main text for details). All experiments are performed using a patch size of 64. Bold numbers denote the best result for any given task (columns), showing that our results consistently outperform the vanilla baselines.

Outer vs. Inner Padding and Runtime Performance. In Figure 4(b), we show the percentage change in PSNR with different amounts of padding and see that the vanilla HAE and HVAE setup performances degrade (left plot) when Outer Padding is used with larger padding amounts. But with Inner Padding (right plot), we see improvement saturation with increase in padding amount. In Figure 4(c), one can observe an artefact appearing solely due to Outer Padding (the artifact does not exist in 'No padding'). These results support our claim about the OOD issue as described in Section 3. Note that Inner Padding requires a larger number of individual predictions, indicated by the smaller grid size seen in Figure 4(a) (denoted by red dashed rectangle). Specifically, using an Inner Padding of 24 pixels with a patch size of 64 will use a 16×16 center-crop per patch. Hence, we will need to predict 16 ($(64/16)^2 = 4^2$) times more patches to cover the entire input image.
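The runtime cost of Inner Padding follows directly from the tile size. The short sketch below counts the predictions needed for a square image; the helper name is hypothetical, and the 2048-pixel side length is simply the Hagen et al. image size used as an example.

```python
import math

def num_patches(image_size, patch_size=64, inner_pad=0):
    """Number of patch predictions needed to tile a square image of side image_size."""
    tile = patch_size - 2 * inner_pad  # side of the centre crop that is kept
    per_axis = math.ceil(image_size / tile)
    return per_axis ** 2

without_padding = num_patches(2048, 64, inner_pad=0)   # 32 * 32 tiles of 64 pixels
with_padding = num_patches(2048, 64, inner_pad=24)     # 128 * 128 tiles of 16 pixels
ratio = with_padding // without_padding                # (64 / 16)^2 = 16
```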
BU Blocks   Vanilla 64   LC 64   LC 128
1           24.3         24.7    24.8
2           25.1         25.9    25.9
3           25.2         27.0    27.0
4           25.4         27.8    27.9

Table 3. Performance of HVAE + Regular-LC trained with patch size of 64 (col 3) and 128 (col 4) on Act vs Tub data. The larger patch size shows diminishing returns, indicating that LC is providing enough image context, showcasing the value of our approach.
Interestingly, we found padding giving minor benefits to Deep-LC quantitatively and so Deep-LC results in Table 1 were computed without padding, thereby leading to a better runtime for Deep-LC. However, we still find a few tiling artefacts with Deep-LC and in those cases Inner Padding helps. The other two LC variants benefit both quantitatively and qualitatively from Inner Padding.
Effects of Larger Training Patch Sizes. In Figure 6(a) we show that increasing the training patch size improves the performance of a U-Net and vanilla HVAEs across different hierarchy levels. While the U-Net baseline performance saturates, HVAEs' improvement with increasing hierarchy levels does not, but quickly reaches a hard limit in terms of GPU memory requirement (see Table 1).
Performance of LC with larger patch sizes. Using µSplit, microscopy labs having limited GPU compute will still get performance similar to labs with ample resources, i.e. labs capable of using networks employing larger patch sizes. So far, all our LC variants have been trained with a patch size of 64. A natural question to ask is whether there is still some benefit in using larger patch sizes when also using LC. While the answer to this question depends upon multiple factors, such as how many long-range interactions are present in the data, the receptive field size of the network, etc., we did an ablation to empirically investigate this in Table 3. One can observe that for HVAE + Lean-LC, across different hierarchy levels (BU Block count), using a patch size of 128 only provides a minor performance improvement over a patch size of 64. This implies that for a pixel's prediction, only a small amount of neighbourhood context needs to be given at native pixel resolution, and most of the context can be given via lower-resolution lateral image context.
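To make the size of the effect concrete, the following sketch compares, for each row of Table 3, the gain from adding LC at patch size 64 with the additional gain from doubling the patch size. The numbers are copied from Table 3; the dictionary layout and the helper name `gains` are hypothetical, and the values are assumed to be PSNR.

```python
# Values copied from Table 3; keys are the number of BU blocks.
table3 = {
    1: {"vanilla_64": 24.3, "lc_64": 24.7, "lc_128": 24.8},
    2: {"vanilla_64": 25.1, "lc_64": 25.9, "lc_128": 25.9},
    3: {"vanilla_64": 25.2, "lc_64": 27.0, "lc_128": 27.0},
    4: {"vanilla_64": 25.4, "lc_64": 27.8, "lc_128": 27.9},
}


def gains(row):
    """Return (gain from adding LC, extra gain from patch size 128)."""
    lc_gain = round(row["lc_64"] - row["vanilla_64"], 1)
    patch_gain = round(row["lc_128"] - row["lc_64"], 1)
    return lc_gain, patch_gain


for blocks, row in table3.items():
    lc_gain, patch_gain = gains(row)
    print(f"{blocks} BU blocks: LC adds {lc_gain}, patch size 128 adds {patch_gain}")
```

In every row the gain from LC (0.4 to 2.4) exceeds the gain from the larger patch (at most 0.1).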
Experiments on Synthetic Data. In Table 2 we show the results obtained on the SinosoidalCritters dataset. We used two input image sizes, 128×128 and 256×256, and two values for Njoin, namely 0 and 25 pixels. On average, µSplit outperforms the vanilla HVAE by 18 dB PSNR. Also note that the larger input size, constituting a harder problem to solve, results in a drop in performance for the vanilla HVAE. Using µSplit, instead, the performance increases. To recognise which critter is depicted and assign it to a channel, the network has to see both waveforms. The vanilla HVAE is able to do splitting on 128×128 images, but it has artefacts (red circle in Figure 5(b)). For the 256×256 pixel images, it completely fails because it is unable to distinguish between the critters, since it cannot simultaneously process a sufficiently large part of the image. In contrast, by using LC we are able to successfully split both images.
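The average improvement can be recomputed from Table 2 with a few lines. The values are copied from the table; the helper name `mean_gain` is hypothetical, and which LC variant enters the reported average is our assumption (the Regular-LC average rounds to the stated 18).

```python
# PSNR values copied from Table 2, keyed by (image size, Njoin).
vanilla = {(128, 0): 28.3, (128, 25): 25.5, (256, 0): 19.4, (256, 25): 15.8}
lean_lc = {(128, 0): 37.3, (128, 25): 35.1, (256, 0): 34.1, (256, 25): 32.2}
regular_lc = {(128, 0): 37.0, (128, 25): 39.2, (256, 0): 41.5, (256, 25): 41.6}


def mean_gain(method, baseline):
    """Average PSNR difference between a method and the baseline."""
    diffs = [method[key] - baseline[key] for key in baseline]
    return sum(diffs) / len(diffs)


print(round(mean_gain(lean_lc, vanilla), 1))     # 12.4
print(round(mean_gain(regular_lc, vanilla), 1))  # 17.6
```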
U-NET Hyperparameter Tuning. We tuned depth and patch size of a classic U-NET to achieve optimal performance for the tasks at hand (see supplement for details).
6. Discussion
In this work, on our dataset we show that µSplit performs better when deeper architectures, i.e. HAEs and HVAEs, are employed and enabled to process additional image context via the memory efficient lateral contextualization (LC) schemes we propose.
The deeper such networks become, the larger their receptive field (RF) sizes grow, in our case routinely exceeding sizes of 512×512 pixels. An immediate consequence of this is that we cannot easily employ common tiling schemes (i.e. Outer Padding) without running into out-of-distribution (OOD) issues (see Section 3). Hence, we propose to use Inner Padding to circumvent this problem. Additionally, we observe that Deep-LC performs quite well even without padded tiled predictions (no additional overlap between patches). The reason for this is that the patch context typically given by overlapping regions is now substituted by context being fed via Deep-LC. Still, best performance is typically obtained using Deep-LC and Inner Padding during tiled predictions.
It is important to point out that for any variational model, such as HVAEs, tiled predictions suffer from the additional problem that neighboring tiles will likely not be consistent, because the sampling step is performed independently per tile. While Inner Padding still is the better strategy to employ (for the same argument as for any other model with huge receptive fields), sampling inconsistencies cannot be fully avoided. The strength of these artifacts will depend on the data uncertainty (i.e. the ambiguity in the fed inputs w.r.t. the trained model).
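A toy illustration of this effect, not a model of an actual HVAE: every tile predicts the same flat signal plus one offset that depends on a latent sample drawn independently for that tile. The function names and the scalar `uncertainty` parameter are hypothetical stand-ins for the data uncertainty mentioned above.

```python
import random


def seam_jumps(num_tiles: int, uncertainty: float, seed: int = 0):
    """Absolute intensity jump at each border between neighbouring tiles."""
    rng = random.Random(seed)
    # One independent latent sample per tile, scaled by the uncertainty.
    offsets = [uncertainty * rng.gauss(0.0, 1.0) for _ in range(num_tiles)]
    return [abs(a - b) for a, b in zip(offsets, offsets[1:])]


def mean(values):
    return sum(values) / len(values)


for sigma in (0.0, 0.1, 1.0):
    print(sigma, mean(seam_jumps(num_tiles=8, uncertainty=sigma)))
```

With zero uncertainty the tiles agree exactly at their borders; the average jump at the seams grows in proportion to the uncertainty.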
In summary, we have proposed a powerful new method to efficiently use image context. We have then applied this method to an impactful new image decomposition task on fluorescence microscopy data. We believe that the presented ideas will prove to also be useful in the context of other computer vision problems. We will explore the applicability of LC to other problem domains in future work. Additionally, we will make µSplit more amenable to noisy fluorescence data and to disentanglement tasks where more than two image channels are superimposed.
Acknowledgements
This work was supported by the European Union through the Horizon Europe program (IMAGINE project, grant agreement 101094250-IMAGINE and AI4LIFE project, grant agreement 101057970-AI4LIFE) as well as the compute infrastructure of the BMBF-funded de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI) (031A532B, 031A533A, 031A533B, 031A534A, 031A535A, 031A537A, 031A537B, 031A537C, 031A537D, 031A538A). Additionally, the authors also want to thank Damian Dalle Nogare of the Image Analysis Facility at Human Technopole for useful guidance and discussions and the IT and HPC teams at HT for the compute infrastructure they make available to us.
References
[1] Yuval Bahat and Michal Irani. Blind dehazing using internal patch recurrence. In 2016 IEEE International Conference on Computational Photography (ICCP), pages 1–9, May 2016.
[2] Joshua Batson and Loïc Royer. Noise2Self: Blind denoising by Self-Supervision. pages 1–16, Jan. 2019.
[3] Dana Berman, Tali Treibitz, and Shai Avidan. Non-local image dehazing. In 2016 IEEE Conference on Computer Vision
BU Blocks   PSNR
1           29.8
2           31.3
4           33.2
5           33.2
6           33.0
Table S.2. The achievable performance using a U-Net using various numbers of bottom-up (BU) blocks. For the results reported in the main text, 5 BU blocks have been used.
note that Double-DIP, being a completely unsupervised approach, naturally finds it difficult to know the ‘correct’ split, the split which exists in nature. It simply returns one of the many plausible splitting options. Its inferior performance argues for some form of supervision for our problem.
S.7. Different Neural Network Submodules
Residual Block We've taken the residual block formulation from [20]. The schema of the residual block is shown in Supplementary Figure S.1 (b). The last layer in the residual block is the GatedLayer2D, which doubles the number of channels through a convolutional layer and then uses half the channels as a gate for the other half.
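A minimal sketch of the gating idea, applied to the channel vector of a single pixel. This is not the authors' GatedLayer2D: the convolution is replaced by a plain matrix product (as in a 1×1 convolution), and both the sigmoid on the gate half and the absence of a nonlinearity on the value half are our assumptions. The names `sigmoid` and `gated_layer` are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def gated_layer(x, weights, biases):
    """Map C input channels to 2C channels, then gate one half with the other.

    x       : list of C channel values for one pixel
    weights : 2C rows of C weights each (stands in for a 1x1 convolution)
    biases  : 2C bias values
    """
    c = len(x)
    if len(weights) != 2 * c or len(biases) != 2 * c:
        raise ValueError("the layer must double the number of channels")
    doubled = [sum(w * v for w, v in zip(row, x)) + b
               for row, b in zip(weights, biases)]
    values, gates = doubled[:c], doubled[c:]
    # Assumption: the gate half is squashed into (0, 1) by a sigmoid.
    return [v * sigmoid(g) for v, g in zip(values, gates)]


x = [1.0, -2.0]
weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
biases = [0.0, 0.0, 0.0, 0.0]
print(gated_layer(x, weights, biases))  # [0.5, -1.0]
```

In the example the gate half is all zeros, so every value is multiplied by sigmoid(0) = 0.5, and the output again has C channels.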
Stochastic Block The channels of the input to this block are divided into two equal groups. The first half is used as the mean of the Gaussian distribution for the latent space. The second half is used to get the variance of this distribution, implemented via the σExpLin reformulation introduced in [7].
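The channel split can be sketched as follows for the channel vector of a single pixel. The function `sigma_exp_lin` is an assumption: it uses exp(x) for x ≤ 0 and x + 1 otherwise, which may differ from the definition in [7]. The reparameterised sampling step and the name `stochastic_block` are likewise illustrative and not taken from the source.

```python
import math
import random


def sigma_exp_lin(x: float) -> float:
    # ASSUMPTION: exponential for non-positive inputs, linear otherwise.
    # Check [7] for the actual definition of the reformulation.
    return math.exp(x) if x <= 0 else x + 1.0


def stochastic_block(channels, rng):
    """Split channels into (mean, raw variance) halves and draw one sample."""
    if len(channels) % 2:
        raise ValueError("expected an even number of channels")
    half = len(channels) // 2
    mean, raw = channels[:half], channels[half:]
    variance = [sigma_exp_lin(r) for r in raw]
    # Reparameterised sample: z = mean + sqrt(variance) * eps, eps ~ N(0, 1).
    sample = [m + math.sqrt(v) * rng.gauss(0.0, 1.0)
              for m, v in zip(mean, variance)]
    return mean, variance, sample


mean, variance, sample = stochastic_block([0.5, -1.0, 0.0, 2.0], random.Random(0))
print(mean)      # [0.5, -1.0]
print(variance)  # [1.0, 3.0]
```

The assumed mapping keeps the variance strictly positive for any input.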
S.8. U-Net Tuning
We varied the depth of the U-Net. For consistency with the other architectures used, we decided to still call its levels BottomUp (BU) blocks (HAEs and HVAEs grow upwards, not downwards). Table S.2 shows the achievable performance with U-Nets of different depth (number of BU blocks).
Other relevant hyperparameter values used for U-Nets are patience = 200 for early stopping, patience = 30 for the learning rate scheduler (ReduceLROnPlateau).

[Figure S.4 image: columns Input Image, Vanilla, Lean-LC, Deep-LC, GT; rows Ch1 and Ch2 for each of three crops]
Figure S.4. Qualitative evaluation of Vanilla HVAE and our LC variants (also integrated to HVAE architecture) on Actin vs Mitochondria task. Here, we show results on three random crops of size 300×300. Input to all models is the region inside red square, as seen in column one. Last column has the ground truth for both channels. Red arrows highlight few interesting areas where we observe our Deep-LC performs better than others.
[Figure S.5 image: columns Input Image, Vanilla, Lean-LC, Deep-LC, GT; rows Ch1 and Ch2 for each of three crops]
Figure S.5. Qualitative evaluation of Vanilla HVAE and our LC variants (also integrated to HVAE architecture) on Actin vs Tubulin task. Here, we show results on three random crops of size 300×300. We disentangle the region inside red square, which is shown in column one. Last column has the ground truth for both channels.
[Figure S.6 image: columns Input Image, Vanilla, Lean-LC, Deep-LC, GT; rows Ch1 and Ch2 for each of three crops]
Figure S.6. Qualitative evaluation of Vanilla HVAE and our LC variants (also integrated to HVAE architecture) on Tubulin vs Nucleus task. Here, we show results on three random crops of size 300×300. We disentangle the region inside red square, which is shown in column one. Last column has the ground truth for both channels.
[Figure S.7 image: columns Input Image, Vanilla, Lean-LC, Deep-LC, GT; rows Ch1 and Ch2 for each of three crops]
Figure S.7. Qualitative evaluation of Vanilla HVAE and our LC variants (also integrated to HVAE architecture) on Actin vs Nucleus task. Here, we show results on three random crops of size 300×300. We disentangle the region inside red square, which is shown in column one. Last column has the ground truth for both channels.
[Figure S.8 image: columns Input Image, Vanilla, Lean-LC, Deep-LC, GT; rows Ch1 and Ch2 for each of three crops]
Figure S.8. Qualitative evaluation of Vanilla HVAE and our LC variants (also integrated to HVAE architecture) on SinosoidalCritters dataset. Here, we show results on three random crops of size 200×200. We disentangle the region inside red square, which is shown in column one. Last column has the ground truth for both channels. Red arrows highlight few interesting areas where we observe our Deep-LC performs better than others.

[Figure S.9 image: panel annotations, first row: PSNR 32.8 (28.5), 23.7 (25.7), 23.5 (24.9); second row: PSNR 49.7 (36.1), 32.8 (30.0), 30.5 (31.8)]
Figure S.9. Qualitative image decomposition results using the Double-DIP baseline (row 2) on a 256×256 image crop from the Hagen et al. dataset. The overlaid histograms show either the intensity distribution of the two channels (column 1) or the intensity distribution of the ground truth and the prediction (red). Regular-LC, on the other hand, performs well. Note that Double-DIP is solving a much harder task since it is an unsupervised method trained on a single input image.
