DroneWaste dataset for waste recognition in drone imagery

Luca Morandini^a, Andrea Diecidue^a,∗, Thanos Petsanis^b,c, Enrico Targhini^a, Georgios Karatzinis^c, Giacomo Boracchi^a, Elias B. Kosmatopoulos^b,c, Piero Fraternali^a, Athanasios Ch. Kapoutsis^b,c

October 7, 2025

^a Politecnico di Milano, Department of Electronics, Information, and Bioengineering, via Ponzio 34/5, Milan 20133, Italy
^b Democritus University of Thrace, Department of Electrical Engineering, University in Xanthi, Greece
^c Information Technologies Institute, The Centre for Research & Technology Hellas, Thessaloniki, Greece
Abstract

Illegal waste disposal has a negative impact on the environment and people's quality of life. Drone imagery enables law enforcement authorities to efficiently assess the environmental impact during on-site inspections of suspicious landfill sites. Automated tools based on Deep Learning techniques can then quickly analyze aerial images to recognize several waste materials and classify their hazard level. However, large high-quality datasets are required for training and testing waste recognition models. Currently, no such datasets are publicly available, which limits the development of accurate waste identification algorithms. This paper presents DroneWaste, a dataset of aerial images extracted from orthomosaics that are reconstructed from drone-collected imagery. The dataset is a collection of 4993 images of 17 solid waste dumps that contain 20 different types of materials. The DroneWaste dataset is publicly accessible from the Zenodo repository. Technical validation proves that the dataset can be used for building object detection models able to recognize several types of waste in aerial imagery.
This is a preprint version of a paper that is under submission to a peer-reviewed journal.
1 Background & Summary

Illegal waste dumping is a form of environmental crime with severe impacts on ecosystems, such as increased pollution, loss of biodiversity, and risks to human health. This phenomenon is increasingly recognized to have significant adverse effects on the economy and on the ecology on a global scale [1,2]. Many illicit waste management organizations resort to illegal dumping or waste trafficking to minimize costs at the expense of environmental integrity and public health [3]. To counteract this escalating problem, Law Enforcement Agencies (LEAs) and Environmental Protection Agencies (EPAs) are required to identify potential illegal waste sites on a large-scale territory. When illegal activities are discovered, accurate forensic evidence must be collected to prosecute the involved criminals. A factor highly relevant to the definition of the criminal charge is the type of waste improperly treated. The penalties become more severe when toxic or dangerous substances are present, due to the high risks they pose to the environment and public health. The list of materials classified as extremely hazardous is specified by European Standards [4], including items such as asbestos slabs, hydrofluorocarbons, and toxic chemicals.
On-site inspections of suspicious sites are expensive, require significant human effort, and can threaten the safety of the involved personnel when hazardous materials are present. Furthermore, illegal waste dumps are often located in inaccessible and remote areas far from densely populated places. Due to these constraints, combined with the vastness of the territory to be monitored, LEAs often conduct investigations using drones for imagery acquisition.

Recent advances in Computer Vision (CV) and Deep Learning (DL) offer new opportunities for automating waste identification in images. Modern image analysis methods can be applied to satellite images to locate suspicious sites in vast regions [5]. Then, drone surveys support EPA inspectors in investigating the most critical waste dumping sites, thus enhancing the overall efficiency of the territory monitoring activity. However, modern DL models require large annotated datasets to learn the peculiar features of materials and recognize waste accumulations. In recent years, various datasets have been introduced in the waste detection literature, which utilize different data sources, including satellites, drones, and smartphones. Table 1 provides a comparison of the most relevant waste detection datasets.

∗Corresponding author: and [email protected]
Satellite imagery for waste identification has attracted increasing interest, mainly due to the availability of high-resolution images covering vast areas. Several publicly available datasets for solid waste detection in satellite images are described in the literature. Solid Waste Aerial Detection (SWAD) [6] is a collection of satellite images encompassing urban, rural, and mountainous scenes, which are annotated with single-class bounding boxes delineating solid waste objects. The Global Dumpsite Test Data (GDTD) dataset [7] includes numerous illegal dumps and a few regulated landfills. The employed satellite imagery has a Ground Sampling Distance (GSD) ranging from 0.3 m to 1 m. The relevant sites are annotated with bounding boxes labeled with one of four categories (domestic, construction, agricultural, or covered waste). The RS4SW dataset [8] employs Google Earth images (0.5 m GSD) of three urban areas: Langfang (China), Faridabad (India), and Tezoyuca (Mexico). Each image is categorized as portraying solid waste (SW) or other non-solid waste (non-SW) scenes. The waste annotated in this dataset comprises industrial materials, domestic litter in residential zones, and construction and demolition debris. The Construction Waste Landfill Dataset (CWLD) [9] includes Gaofen-2 satellite and Google Earth images of Beijing's Changping and Daxing districts. It focuses on construction waste and supports semantic segmentation tasks, for which pixel-level annotation masks are provided. Finally, AerialWaste [5] comprises more than 10,000 aerial images with different sources: AGEA orthophotos, WorldView-3 satellite, and Google Earth images. The annotations provide binary image-level labels that specify the presence/absence of waste, and a subset of images is further tagged with 22 distinct waste materials. Remote sensing and satellite imagery are useful for scanning large areas and identifying suspicious locations. However, the provided GSD, ranging from 1.8 m [6] to 20 cm [5], is insufficient for the fine-grained identification of most waste materials.
The adoption of Unmanned Aerial Vehicles (UAVs) [10] and the recent advances in DL models and CV techniques have provided LEAs and EPAs with effective tools for close-view investigation of the suspicious sites, once identified with satellite image analysis. The employed drones are equipped with a variety of sensors (e.g., high-resolution cameras, multispectral sensors, LiDAR, and GNSS-RTK), which support the detailed characterization of materials and the collection of forensic evidence from the inspected sites. A few datasets of UAV images depicting urban litter in the wild have been proposed in the literature. UAVVaste [11] comprises 772 low-altitude drone images of urban and natural scenes (e.g., streets, parks, lawns), annotated with 3700 instance masks and bounding boxes of a single rubbish class. However, this dataset includes samples of sparse small litter that are easily distinguishable from the surrounding environment. Therefore, they are not representative of the large accumulations of waste typical of illegal landfills. The SUIRD dataset [12] consists of 628 images collected from low-altitude UAV flights, which are augmented to create a more extensive training dataset. Accumulated or scattered garbage is annotated with single-class bounding boxes, but the waste categories are not provided. Finally, the RSD dataset [13] provides 2600 drone images annotated at the pixel level with three generic waste categories (construction, household, mixed). Its UAV images are geometrically corrected and stitched together to generate images with a GSD of 3.8 cm/px, which prevents the identification of small items. Moreover, three generic categories are too coarse for training DL models to recognize a wide spectrum of waste materials.
The perspective offered by low-altitude UAV flights enables the precise identification and classification of a wide range of materials. However, publicly available drone-captured datasets provide a single waste class [11,12] or a limited range of generic waste categories [13]. Therefore, these datasets are unsuitable for waste detection in landfills that include a wide range of materials captured in real-world scenarios.
Besides the typical satellite and drone datasets, various studies introduced specialized data collections to recognize specific waste materials in images captured with hand-held cameras or smartphones. The TACO dataset [14] comprises 1500 high-resolution images of urban and natural scenes annotated with 4784 instance-level segmentation masks using a taxonomy of 60 fine-grained classes grouped into 28 super-categories. Early efforts such as TrashNet [15] and WaDaBa [16] provide a few thousand handheld photographs of common disposable objects. TrashNet includes 2527 images collected with smartphones that depict six different types of garbage (glass, paper, cardboard, plastic, metal, trash). WaDaBa includes 4000 images of plastic waste, generated by capturing 100 objects from 40 different views under controlled conditions. OpenLitterMap [17] is a continuously growing, crowd-sourced collection of over 100k geotagged images captured by smartphone cameras, containing a wide range of individual litter samples in urban environments. Finally, indoor waste recognition is addressed by the MJU-Waste dataset [18], which provides 2475 waste images captured with an RGB-Depth camera and annotated with instance-level segmentation masks.
However, images from hand-held and smartphone cameras do not reproduce the characteristics of the drone flights performed by LEAs. These datasets are therefore inadequate for waste identification from UAV imagery, due to significant differences in perspective and resolution.
Table 1: Description of datasets for waste recognition across different spatial scales, from camera-level to UAV and satellite-based imagery. For each dataset, the supported tasks are reported (OD: Object Detection, IS: Instance Segmentation, SS: Semantic Segmentation, BC: Binary Classification, SLC: Single-Label Classification, MLC: Multi-Label Classification).

| Input | Dataset | Classes | Images | Task | Sources |
|---|---|---|---|---|---|
| Satellites | SWAD [6] | 1 | 1996 | OD | WorldView-2, SPOT |
| Satellites | GDTD [7] | 4 | 2219 | OD | Different satellites |
| Satellites | RS4SW [8] | 2 | 3680 | BC | Google Earth |
| Satellites | CWLD [9] | 4 | 3653 | SS | Gaofen-2, Google Earth |
| Satellites | AerialWaste [5] | 22 | 10,434 | IS | WorldView-3, Google Earth, aerial orthophotos |
| UAV | UAVVaste [11] | 1 | 772 | IS, OD | UAV |
| UAV | SUIRD [12] | 1 | 628 | OD | UAV |
| UAV | RSD [13] | 3 | 2600 | SS | UAV |
| UAV | DroneWaste (Ours) | 20 | 4993 | IS, OD | UAV |
| Ground sensors | TACO [14] | 60 | 1500 | IS | Smartphone camera |
| Ground sensors | TrashNet [15] | 6 | 2527 | SLC | Smartphone camera |
| Ground sensors | WaDaBa [16] | 6 | 4000 | SLC | Digital camera |
| Ground sensors | OpenLitterMap [17] | 100+ | 100,000+ | MLC | Smartphone camera |
| Ground sensors | MJU-Waste [18] | 1 | 2475 | IS | RGB-D camera |
The detection and classification of waste instances is a challenging task due to the broad range of heterogeneous materials, their different shapes and textures, and the potential confusion with ordinary non-waste objects. DL algorithms require a substantial amount of high-quality annotated images to effectively recognize the distinctive features of waste materials. Currently, the lack of UAV-captured waste datasets in real-world scenarios hinders the research towards scalable and accurate waste recognition methods that can effectively assist LEAs and EPAs in their inspection activities.
The contribution of this work is the DroneWaste dataset, a professionally curated set of UAV images for waste recognition, which is constructed on the following principles:
1. 4993 images are extracted from orthomosaics generated on 17 sites representing different scenarios and environments.
2. The average orthomosaic GSD in most sites is 2 cm/px, with some exceptions up to 2.8 cm/px.
3. Images are professionally annotated with 5135 waste instances that cover 20 waste types, which include the majority of materials typically present in landfills.
4. Annotations are curated by professional photo-interpreters specialized in using UAV images for dump site inspection.
5. Each waste instance in an image is annotated with a polygon delimiting the boundaries and a bounding box that contains the waste object.
6. The dataset adheres to the standard COCO format [19].
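
To make items 5 and 6 concrete, the following minimal sketch builds a COCO-style annotation record from a polygon. The keys (`segmentation`, `bbox`, `area`, `category_id`, `iscrowd`) follow the public COCO convention. The helper name `polygon_to_coco`, the vertex coordinates, and the ids are hypothetical and are not taken from the dataset.

```python
def polygon_to_coco(points, image_id, category_id, ann_id):
    """Build a COCO-style annotation from a list of (x, y) pixel vertices."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    # COCO bounding boxes are stored as [x, y, width, height].
    bbox = [x_min, y_min, max(xs) - x_min, max(ys) - y_min]

    # Shoelace formula for the polygon area in square pixels.
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    area = abs(twice_area) / 2.0

    # COCO polygons are flat lists [x1, y1, x2, y2, ...].
    flat = [coord for point in points for coord in point]
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": [flat],
        "bbox": bbox,
        "area": area,
        "iscrowd": 0,
    }


# Hypothetical rectangular instance inside a 640x640 px image.
example = polygon_to_coco(
    [(100, 200), (180, 200), (180, 260), (100, 260)],
    image_id=1, category_id=11, ann_id=1,
)
```

Storing both the polygon and its enclosing box lets the same record serve instance segmentation and object detection models.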
An initial version of the DroneWaste dataset has been employed in our preliminary work [20], where an Object Detection (OD) network was trained to recognize 7 types of waste. Results showed that OD is effective, especially in recognizing regularly shaped waste materials (such as Textile, Pallets and Tyres), with the best model achieving AP scores above 60%. The work also proposed a practical DL-based pipeline that supports the investigation processes of LEAs and EPAs in analyzing illegal waste sites.
2 Methods

The DroneWaste dataset comprises images of real-world landfills and industrial waste disposal sites. The on-site UAV surveys were performed in Italy by ARPA Lombardia, a regional EPA, and in Greece by CERTH, a non-profit organization supervised by the General Secretariat for Research and Technology (GSRT) of the Hellenic Republic and one of the largest research centers in Greece. Commercial drones, described in Section 2.1, were employed to capture the images of 17 waste dumping sites.

Figure 1 illustrates the procedure for generating the DroneWaste dataset. For each inspected location, images acquired during a survey are processed to generate an orthomosaic, which is then manually annotated to create the site ground truth. From each orthomosaic, the dataset images are extracted and labeled by mapping the ground truth annotations previously defined on the entire orthomosaic. A final filtering step removes the annotations of partially visible waste instances.
[Figure 1 diagram. Data acquisition: site survey, mission design, orthomosaic generation. Data annotation: tile extraction, tile annotation, ground truth creation. Dataset preparation: image extraction, annotation mapping, annotation filtering.]

Figure 1: The DroneWaste dataset creation workflow. Drone-acquired images are combined to generate an orthomosaic that covers the waste dumping site. Each orthomosaic is manually annotated to create the site ground truth. Images that compose the DroneWaste dataset are extracted using a sliding window approach.
2.1 Data acquisition

The 17 UAV surveys that are included in the DroneWaste dataset capture real-world waste dumping activities in diverse scenarios and landscapes. The images from 6 locations show industrial sites where both authorized materials and illegal waste are present. Another 8 scenarios represent open landfills in rural areas characterized by the dumping of mixed waste or construction material. Finally, 3 locations are characterized by accumulations of waste scattered over a large area. Table 2 specifies the characteristics of the sites included in the dataset.
Table 2: Summary of the most relevant site properties. For each site, the number of images and annotations in the DroneWaste dataset is reported. The Materials column specifies the number of waste materials present at each site. Most orthomosaics are generated with a GSD of 2 cm/px. The reported GSD varies because, in some scenarios, it was not possible to collect enough images to produce an orthomosaic with higher resolution. The Area column reports the area in m² of each generated orthomosaic.

| Site ID | Scenario | Location | Area (m²) | GSD (cm/px) | Images | Annotations | Materials |
|---|---|---|---|---|---|---|---|
| Site 1 | open landfill | Greece | 22,200 | 2.3 | 119 | 75 | 10 |
| Site 2 | open landfill | Italy | 18,300 | 2 | 130 | 100 | 12 |
| Site 3 | scattered waste | Italy | 95,500 | 2 | 698 | 243 | 14 |
| Site 4 | industrial site | Italy | 25,100 | 2 | 182 | 461 | 12 |
| Site 5 | industrial site | Italy | 30,000 | 2 | 217 | 649 | 18 |
| Site 6 | scattered waste | Greece | 227,400 | 2.8 | 813 | 252 | 13 |
| Site 7 | open landfill | Greece | 1300 | 2.4 | 6 | 10 | 2 |
| Site 8 | industrial site | Italy | 28,800 | 2 | 210 | 173 | 11 |
| Site 9 | open landfill | Greece | 11,800 | 2 | 83 | 526 | 11 |
| Site 10 | open landfill | Greece | 3800 | 2.3 | 17 | 25 | 5 |
| Site 11 | open landfill | Italy | 113,100 | 2 | 825 | 173 | 11 |
| Site 12 | industrial site | Italy | 26,100 | 2 | 186 | 540 | 10 |
| Site 13 | industrial site | Italy | 24,200 | 2 | 169 | 658 | 14 |
| Site 14 | industrial site | Italy | 40,400 | 2 | 279 | 792 | 14 |
| Site 15 | open landfill | Greece | 22,600 | 2.2 | 136 | 115 | 8 |
| Site 16 | scattered waste | Italy | 115,600 | 1.9 | 848 | 186 | 15 |
| Site 17 | open landfill | Greece | 15,900 | 2.4 | 75 | 157 | 5 |

Data collection flights were conducted using two commercial drones: DJI Phantom 4 Pro and DJI Mavic 2 Enterprise Advanced. The Phantom 4 Pro drone mounts a DJI FC6310 camera that has a 1-inch CMOS sensor (20 MP, 24 mm equivalent focal length), an adjustable aperture (f/2.8–f/11), and wide ISO and shutter speed ranges. The Mavic 2 Enterprise Advanced drone is equipped with a DJI FC2453 camera, using a 1/2-inch CMOS sensor (48 MP, 24 mm equivalent focal length, f/2.8 fixed aperture). A commercial path planning software, DJI Pilot, is used for mission planning. The adopted flight paths, tailored for photogrammetry applications, are standard grid formations with a −90° pitch angle (i.e., the camera pointing down towards the ground). A standardized data acquisition protocol was used across all sites, and several photogrammetry best practices were implemented to improve the quality of the reconstructed orthomosaic. In particular, the frontlap between images was kept constant at 80%, the sidelap at 70%, and dynamic elements (e.g., moving vehicles or people) were avoided. The flight altitude ranges between 15 and 100 meters, depending on the area size and the obstacles. In smaller landfill sites, a higher number of low-altitude UAV images can be collected to cover the surveyed area. In contrast, at large dumping sites with obstructions (e.g., buildings), low-altitude flights are forbidden due to the risk of collision.
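
The relation between flight altitude, overlap, and image resolution can be illustrated with the standard pinhole camera model. The sketch below is only indicative. The sensor width, physical focal length, and image size are assumed nominal values for a 1-inch, 20 MP camera and are not stated in the text, the 73 m altitude is a hypothetical value inside the reported 15–100 m range, and the long image side is assumed to lie across the flight direction. The resulting GSD refers to a single nadir image, not to the reconstructed orthomosaic.

```python
def ground_sampling_distance_cm(altitude_m, sensor_width_mm, focal_length_mm, image_width_px):
    """Nadir GSD in cm/px from the pinhole camera model."""
    return (sensor_width_mm * altitude_m * 100.0) / (focal_length_mm * image_width_px)


def shot_spacing_m(footprint_m, overlap):
    """Distance between consecutive exposures (or flight lines) for a given overlap."""
    return footprint_m * (1.0 - overlap)


# Assumed nominal camera values (not given in the text).
SENSOR_WIDTH_MM = 13.2
FOCAL_LENGTH_MM = 8.8
IMAGE_WIDTH_PX = 5472
IMAGE_HEIGHT_PX = 3648

altitude_m = 73.0  # hypothetical altitude within the 15-100 m range
gsd_cm = ground_sampling_distance_cm(
    altitude_m, SENSOR_WIDTH_MM, FOCAL_LENGTH_MM, IMAGE_WIDTH_PX
)

# Ground footprint of one image, in meters.
footprint_across_m = IMAGE_WIDTH_PX * gsd_cm / 100.0
footprint_along_m = IMAGE_HEIGHT_PX * gsd_cm / 100.0

trigger_distance_m = shot_spacing_m(footprint_along_m, 0.80)  # 80% frontlap
line_spacing_m = shot_spacing_m(footprint_across_m, 0.70)     # 70% sidelap

print(f"GSD {gsd_cm:.2f} cm/px, shot every {trigger_distance_m:.1f} m, "
      f"lines every {line_spacing_m:.2f} m")
```

Under these assumptions, lowering the altitude improves the GSD linearly but shrinks the footprint, so more images and flight lines are needed to cover the same area.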
Aerial images were captured with the RGB camera at full sensor resolution. Each image embeds several metadata fields in EXIF format, such as GPS coordinates, camera gimbal orientation, and lens properties. This information is exploited during the orthomosaic generation process by photogrammetry software to provide a rough initialization of all camera poses, improving the reconstruction quality.
2.2 Orthomosaic generation

The UAV images collected during a flight are combined using photogrammetry software to generate a georeferenced orthomosaic of the surveyed site. An orthomosaic is a large image with high detail and resolution, assembled from many smaller UAV images. A georeferenced orthomosaic is a special case where each pixel of the output map is associated with a location with known coordinates. Any photogrammetry software would be suitable for the task. Specifically, this work uses OpenDroneMap (ODM) [21], an open-source tool for drone image processing, which applies several processing phases to correct distortions in UAV images, create a 3D reconstruction, and generate a georeferenced orthomosaic covering the entire scene. The image is orthorectified, meaning that distortions and perspective effects are removed from the resulting image.
The DroneWaste orthomosaic maps are generated with a target GSD of 2 cm/px. This value represents a trade-off between image resolution and computational requirements. High spatial resolution enables the discrimination of fine-grained textures in waste materials. Yet, an excessively high resolution increases the computational and storage requirements, making it impractical for large survey areas.
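
The trade-off can be quantified with a back-of-the-envelope calculation: the number of pixels needed to cover a given ground area grows with the inverse square of the GSD. The sketch below uses the Site 6 area reported in Table 2. The function name is illustrative, and the count covers only the surveyed area, so the actual raster (a bounding rectangle with empty borders) is larger.

```python
def orthomosaic_megapixels(area_m2, gsd_cm):
    """Approximate number of megapixels needed to cover area_m2 at gsd_cm per pixel."""
    pixel_area_m2 = (gsd_cm / 100.0) ** 2
    return area_m2 / pixel_area_m2 / 1e6


site6_area_m2 = 227_400  # Site 6 area from Table 2

for gsd_cm in (1.0, 2.0, 2.8):
    megapixels = orthomosaic_megapixels(site6_area_m2, gsd_cm)
    print(f"GSD {gsd_cm} cm/px -> about {megapixels:.0f} MP")
```

Halving the GSD from 2 cm/px to 1 cm/px quadruples the pixel count, which is why a finer target resolution quickly becomes impractical for the largest sites.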
Table 2 reports the GSD of each site orthomosaic. For most sites, the target GSD of 2 cm/px is reached during the generation process. However, in some scenarios, it was not possible to collect enough images to produce an orthomosaic at the target GSD, so some orthomosaics are reconstructed with a slightly lower spatial resolution. The maximum (worst) GSD is 2.8 cm/px for Site 6. Figure 2 illustrates examples of tiles extracted from the generated orthomosaics. The captured scenarios are very heterogeneous and reflect the range of waste disposal sites and of materials composing the DroneWaste dataset.

Figure 2: Examples of tiles extracted from the DroneWaste orthomosaics, along with the color-coded legend of the annotated waste categories.
2.3 Data annotation

The produced orthomosaics cover extensive areas with high spatial resolution, resulting in a significant image size. For example, Site 6, the largest orthomosaic included in the DroneWaste dataset, has a size of 19,450×16,491 px and occupies 686 MB of disk space. Such dimensions are too large for commercial annotation tools, typically employed to label smaller images. To enable the use of standard annotation tools, a site orthomosaic is decomposed into smaller overlapping tiles that cover a fixed area of the ground. Tiles with a size of 40×40 m are extracted using a sliding window approach with an overlap of 5 m between adjacent tiles. The tiles were manually annotated using Roboflow [22], a commercial annotation tool that enables collaborative workflows. The annotation process utilizes a Smart Polygon functionality, which is a semi-automatic tool that leverages the Segment Anything Model (SAM) [23] to quickly segment individual objects in the scene. SAM enables the rapid definition of polygons around waste instances by selecting a few points on the image. The automatically created polygons can be manually corrected to better fit the shape of the waste instance, and the waste class is added to each polygon by the expert annotator.
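
The decomposition into fixed-area tiles can be sketched as follows. At 2 cm/px, a 40 m tile corresponds to 2000 px and a 5 m overlap to a stride of 1750 px. The function names and the handling of the last tile (shifted back so that it ends at the raster border) are assumptions made for illustration, since the text does not specify how the edges are treated.

```python
def meters_to_px(length_m, gsd_cm):
    """Convert a ground length in meters to pixels at the given GSD."""
    return int(round(length_m * 100.0 / gsd_cm))


def tile_origins(extent_px, tile_px, stride_px):
    """Start offsets of overlapping tiles along one axis of the orthomosaic."""
    if extent_px <= tile_px:
        return [0]
    origins = list(range(0, extent_px - tile_px + 1, stride_px))
    if origins[-1] + tile_px < extent_px:
        # Assumed edge rule: add a final tile aligned with the border.
        origins.append(extent_px - tile_px)
    return origins


gsd_cm = 2.0
tile_px = meters_to_px(40, gsd_cm)         # 40 m tile side
stride_px = meters_to_px(40 - 5, gsd_cm)   # 5 m overlap between neighbours

# Hypothetical 5000 px wide orthomosaic.
columns = tile_origins(5000, tile_px, stride_px)
```

Defining tiles in meters rather than pixels keeps the ground coverage of each annotation tile constant across sites with different GSD values.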
Table 3 lists the 20 categories considered in the DroneWaste dataset. Each waste class is associated with a European Waste Code (EWC) [4] to align the annotations with the European List of Waste (LoW) defined by the European Commission under Directive 2008/98/EC on waste. The waste categories included in the dataset have been chosen based on the presence of materials in the site surveys. The Materials column in Table 2 shows that all locations contain only a subset of the 20 waste categories, resulting in a heterogeneous distribution across sites. The waste classes can be divided into pile categories, including materials typically occurring in piles or heaps with irregular boundaries (e.g., Rubble, Mixed items), and instance categories, for which individual waste elements can be recognized (e.g., Metal barrels, Pallets, Tyres). Note that the dataset includes instances of the class Asbestos, a material commonly used in the roofs of old buildings, which needs to be identified due to its high toxicity even when present in the form of roof cover and not as proper waste.
Table 3: The list of 20 categories of waste annotated in the DroneWaste dataset. Each class is assigned an EWC code to uniquely identify the waste material.

| Category | Type | EWC code | EWC description |
|---|---|---|---|
| Rubble | pile | 12.61 | Soils |
| Construction and demolition materials | pile | 12.11 | Concrete, bricks and gypsum waste |
| Asphalt milling | pile | 12.12 | Waste hydrocarbonised road-surfacing material |
| Excavation materials | pile | 12.31 | Waste of naturally occurring minerals |
| Appliances | instance | 08.21 | Discarded major household equipment |
| Electronic equipment | instance | 08.23 | Other discarded electrical and electronic equipment |
| Furniture | instance | 10.11 | Household wastes |
| Metal barrels | instance | 06.31 | Mixed metallic packaging |
| Plastic packaging | instance | 07.41 | Plastic packaging wastes |
| Wood | pile | 07.53 | Other wood wastes |
| Pallets | instance | 07.51 | Wood packaging |
| Scrap | pile | 06.11 | Ferrous metal waste and scrap |
| Plastic | pile | 07.42 | Other plastic wastes |
| Vehicles | instance | 08.12 | Other discarded vehicles |
| Tyres | instance | 07.31 | Used tyres |
| Paper | instance | 07.2 | Paper and cardboard wastes |
| Foundry | pile | 12.42 | Slags and ashes from thermal treatment and combustion |
| Asbestos | instance | 12.21 | Asbestos wastes |
| Textile | instance | 07.6 | Textile wastes |
| Mixed items | pile | 10.2 | Mixed and undifferentiated materials |

The manual annotation phase was conducted by a team of 7 researchers and 2 investigators from an EPA, all experienced in recognizing waste from UAV imagery. Since multiple annotators from different organizations were involved, the procedure was standardized to ensure consistency. Guidelines were established to provide a brief description and examples of each waste category, and the annotations created by a team member were reviewed by a different annotator to minimise user bias. After the manual data labelling phase, all polygons annotated on the tiles were projected onto the orthomosaic. When a large waste element was not fully contained within a single tile, polygons with the same category from nearby tiles were combined into a single polygon that covered the entire waste instance. Further annotation validation identified and resolved inconsistent cases, such as:
• Duplicate annotations on the same tile (e.g., nested annotations of the same class).
• Overlapping annotations with different categories on the same tile.
• Overlapping annotations with different categories on adjacent tiles (e.g., incorrectly classified partially visible waste instances).
Such conflicts were identified by an automatic script and resolved manually. The final ground truth of each site includes a series of polygons associated with a waste category and georeferenced to the orthomosaic coordinates. Figure 3 shows all the orthomosaics included in the dataset. Some of them cover an extensive area while others (e.g., Site 7, Site 11) are much smaller. In many cases, the waste is scattered throughout the scene with large areas of waste-free background.
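
The automatic conflict check is not described in detail, so the following is only a simplified sketch of how such a script might flag candidate conflicts for manual review. It compares axis-aligned bounding boxes instead of polygons, uses an arbitrary IoU threshold of 0.3, and handles only annotations on the same tile. The record layout, the function names, and the threshold are all assumptions. Conflicts across adjacent tiles could be handled the same way once boxes are expressed in orthomosaic coordinates.

```python
from itertools import combinations


def bbox_iou(a, b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def find_conflicts(annotations, iou_threshold=0.3):
    """Return (id_a, id_b, kind) for overlapping annotations on the same tile."""
    conflicts = []
    for a, b in combinations(annotations, 2):
        if a["tile"] != b["tile"]:
            continue
        if bbox_iou(a["bbox"], b["bbox"]) < iou_threshold:
            continue
        kind = "duplicate" if a["category"] == b["category"] else "category_conflict"
        conflicts.append((a["id"], b["id"], kind))
    return conflicts


# Hypothetical annotations used only to exercise the check.
annotations = [
    {"id": 1, "tile": "t0", "category": "Pallets", "bbox": [0, 0, 10, 10]},
    {"id": 2, "tile": "t0", "category": "Wood", "bbox": [5, 0, 10, 10]},
    {"id": 3, "tile": "t0", "category": "Pallets", "bbox": [1, 1, 9, 9]},
    {"id": 4, "tile": "t1", "category": "Wood", "bbox": [0, 0, 10, 10]},
]
conflicts = find_conflicts(annotations)
```

The script only reports candidate pairs; deciding which annotation to keep remains a manual step, as in the procedure described above.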
2.4 Dataset preparation

The tiles that compose the DroneWaste dataset have a standard size of 640×640 px, which reflects the typical dimensions used by image analysis models. The sliding window processing step extracts tiles with a 10% overlap between rows and columns to avoid missing objects on the edge of a tile. Images near the orthomosaic border that contained more than 70% black pixels were discarded from the dataset. When using a sliding window, it may happen that only a small portion of a waste instance is visible in a tile. Such partial visibility negatively impacts the waste recognition task, as a detection model may fail to learn relevant features from a reduced portion of a waste instance. Therefore, small partial annotations were removed. The filtering logic differs between instance and pile waste categories. For instance categories, where objects have regular size and shape, samples are kept if a minimum portion of the object remains visible in an image. Category-specific thresholds were defined empirically by considering the smallest image region still permitting the recognition of an object, and the threshold values range from 10% to 20% of the whole object area. For pile categories, the area size of annotations varies significantly (piles can range from very small to very large) and therefore a partial heap annotation is filtered out when it represents less than 1% of the whole annotation.

Figure 3: The full orthomosaics of all the sites included in the DroneWaste dataset.
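
The filtering rule for partially visible annotations can be expressed compactly. In the sketch below, the 1% limit for piles and the 10%–20% range for instances come from the text, while the per-category values in `INSTANCE_MIN_VISIBLE` are placeholders, because the actual category-specific thresholds are not listed. Areas are assumed to be measured in pixels on the full orthomosaic annotation and on its clipped portion inside the image.

```python
PILE_MIN_VISIBLE = 0.01  # piles: drop partial annotations below 1% of the whole annotation

# Placeholder thresholds; the text only states that they range from 10% to 20%.
INSTANCE_MIN_VISIBLE = {"Tyres": 0.20, "Pallets": 0.10}
DEFAULT_INSTANCE_MIN_VISIBLE = 0.20


def keep_partial_annotation(category, kind, visible_area, full_area):
    """Decide whether a clipped annotation is kept in an extracted image."""
    if full_area <= 0:
        return False
    fraction = visible_area / full_area
    if kind == "pile":
        return fraction >= PILE_MIN_VISIBLE
    threshold = INSTANCE_MIN_VISIBLE.get(category, DEFAULT_INSTANCE_MIN_VISIBLE)
    return fraction >= threshold


# A tyre with 15% of its area inside the image is dropped under the placeholder threshold,
# while a rubble pile with 2% of its area inside the image is kept.
tyre_kept = keep_partial_annotation("Tyres", "instance", visible_area=15, full_area=100)
pile_kept = keep_partial_annotation("Rubble", "pile", visible_area=2, full_area=100)
```

The much lower limit for piles reflects that even a small fraction of a large heap can still cover a recognizable region of a 640×640 px image.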
Ultimately, the dataset creation process outputs the ground truth samples for all the 17 sites. Table 2 reports the number of images and of annotated waste elements. Each ground truth sample has a segmentation mask and a bounding box. The full collection of ground truth samples of DroneWaste includes 5135 annotations on 4993 images.
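
The image extraction step of this section (640×640 px windows, 10% overlap, removal of images with more than 70% black pixels) can be sketched in plain Python. The function names are illustrative, windows are assumed to lie fully inside the raster, and a pixel is assumed to count as black only when all three channels are zero. None of these details is specified in the text.

```python
def window_stride(size_px=640, overlap=0.10):
    """Step between consecutive windows for a given fractional overlap."""
    return int(round(size_px * (1.0 - overlap)))


def window_origins(height_px, width_px, size_px=640, overlap=0.10):
    """Top-left (row, col) corners of the windows that fit inside the raster."""
    stride = window_stride(size_px, overlap)
    rows = range(0, max(height_px - size_px, 0) + 1, stride)
    cols = range(0, max(width_px - size_px, 0) + 1, stride)
    return [(r, c) for r in rows for c in cols]


def black_fraction(window):
    """Share of pure black pixels in a window given as rows of (r, g, b) tuples."""
    total = sum(len(row) for row in window)
    black = sum(1 for row in window for pixel in row if pixel == (0, 0, 0))
    return black / total


def should_discard(window, max_black=0.70):
    """Apply the border rule: discard windows with more than 70% black pixels."""
    return black_fraction(window) > max_black


origins = window_origins(1300, 1300)

# Tiny synthetic 2x2 windows standing in for real 640x640 crops.
mostly_black = [[(0, 0, 0), (0, 0, 0)], [(0, 0, 0), (90, 80, 70)]]
half_black = [[(0, 0, 0), (0, 0, 0)], [(120, 110, 100), (90, 80, 70)]]
```

With these parameters the stride is 576 px, so neighbouring images share a 64 px band in each direction.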
3 Data Record

The DroneWaste dataset is published in a Zenodo repository (https://doi.org/10.5281/zenodo.17045558) and comprises the following artifacts, which constitute the public part of the dataset:
1. Images folder: contains the images extracted from all site orthomosaics. The provided images are not georeferenced because the coordinates are considered sensitive information and have therefore been removed.
2. Dataset ground truth: a JSON file that describes the images and annotations of the dataset using the COCO format. Both segmentation masks and bounding boxes are defined for all annotated waste instances.
3. Information file: general information about the current dataset version. It is not relevant for training or evaluating detection models.
Figure 4 illustrates the organization of the DroneWaste dataset directory structure.

dronewaste/
    dronewaste 1.0.json
    info.txt
    images/
        site1 4.png
        site1 5.png
        ...
        site2 4.png
        ...

Figure 4: Directory structure of the DroneWaste dataset, showing the organization of metadata files and imagery.
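
A first inspection of the ground truth file could look like the sketch below, which counts annotations per category, images per site, and background-only images from a COCO-style dictionary. The toy record, its ids, and its file names are hypothetical. The assumption that image file names begin with `site` followed by the site number is inferred from the directory listing above. For the published data, the dictionary would be obtained by reading the JSON file with `json.load`.

```python
import re
from collections import Counter


def summarize_coco(coco):
    """Basic statistics of a COCO-style ground truth dictionary."""
    category_names = {c["id"]: c["name"] for c in coco["categories"]}
    per_category = Counter(category_names[a["category_id"]] for a in coco["annotations"])

    annotated_ids = {a["image_id"] for a in coco["annotations"]}
    background_only = [im for im in coco["images"] if im["id"] not in annotated_ids]

    per_site = Counter()
    for image in coco["images"]:
        # Assumed naming: the file name starts with "site<number>".
        match = re.match(r"site\d+", image["file_name"])
        per_site[match.group(0) if match else "unknown"] += 1

    return {
        "images": len(coco["images"]),
        "annotations": len(coco["annotations"]),
        "per_category": dict(per_category),
        "per_site": dict(per_site),
        "background_only": len(background_only),
    }


# Toy record with hypothetical names and ids.
toy = {
    "images": [
        {"id": 1, "file_name": "site1_4.png"},
        {"id": 2, "file_name": "site1_5.png"},
        {"id": 3, "file_name": "site2_4.png"},
    ],
    "categories": [{"id": 1, "name": "Tyres"}, {"id": 2, "name": "Rubble"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 2},
        {"id": 12, "image_id": 3, "category_id": 1},
    ],
}
summary = summarize_coco(toy)
```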
Figure 5 illustrates the distribution of the waste categories in the dataset. Although the class distribution may appear imbalanced, several considerations are worth mentioning:
1. Some categories are inherently rare (e.g., Asphalt milling, Foundry or Paper), thus underrepresented in the DroneWaste dataset. Nevertheless, these rare instances have been annotated to enable the development and performance assessment of detection methods with few samples.
2. Categories representing individual items (e.g., Plastic packaging, Tyres, or Metal barrels) often result in many single-object small annotations.
3. Conversely, categories aggregated in piles (e.g., Rubble or Mixed items) are typically annotated as few large patches covering significant portions of an image.
The difference in distribution between instance and pile classes is also evident in Figure 6, which reports the relative pixel coverage of each class in the images. Items such as Pallets and Textile, which are typically small and isolated, have a large number of annotations but cover a smaller area compared to the other classes. On the other hand, Rubble or Mixed items are often grouped in piles and occupy a large region of the images. Another material that covers wide areas is Asbestos, because its instances most often correspond to roofs that span large portions of the image.
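
One possible way to compute a per-class pixel coverage from COCO-style annotations is sketched below. It divides the summed `area` of each class by the total number of pixels over all images. This is only one plausible definition, since the exact normalization used for Figure 6 is not stated, and the toy values are hypothetical.

```python
from collections import defaultdict


def relative_pixel_coverage(coco):
    """Fraction of all image pixels covered by the annotations of each class."""
    total_pixels = sum(im["width"] * im["height"] for im in coco["images"])
    names = {c["id"]: c["name"] for c in coco["categories"]}
    area_per_class = defaultdict(float)
    for ann in coco["annotations"]:
        area_per_class[names[ann["category_id"]]] += ann["area"]
    return {name: area / total_pixels for name, area in area_per_class.items()}


# Two hypothetical 640x640 px images: one large pile and two small instances.
toy = {
    "images": [
        {"id": 1, "width": 640, "height": 640},
        {"id": 2, "width": 640, "height": 640},
    ],
    "categories": [{"id": 1, "name": "Rubble"}, {"id": 2, "name": "Pallets"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "area": 204800.0},
        {"id": 2, "image_id": 1, "category_id": 2, "area": 4096.0},
        {"id": 3, "image_id": 2, "category_id": 2, "area": 4096.0},
    ],
}
coverage = relative_pixel_coverage(toy)
```

In this toy case the single pile covers far more pixels than the two pallet instances together, which mirrors the contrast between annotation counts and pixel coverage discussed above.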
Figure 7 presents a site-wise breakdown of the dataset, showing for each location the number of UAV-captured photos compared to the annotated images extracted from the orthomosaic. Each bar is divided into two colors, representing images annotated with at least one waste element and the images that contain only background. The chart also reports the percentage of background-only images over the total number of site images. Images without any visible waste material are
[5] Rocio Nahime Torres and Piero Fraternali. AerialWaste dataset for landfill discovery in aerial and satellite images. Scientific Data, 10(1):63, 2023.
[6] Liming Zhou, Xiaohan Rao, Yahui Li, Xianyu Zuo, Yang Liu, Yinghao Lin, and Yong Yang. SWDet: Anchor-based object detector for solid waste detection in aerial images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 16:306–320, 2022.
[7] Xian Sun, Dongshuo Yin, Fei Qin, Hongfeng Yu, Wanxuan Lu, Fanglong Yao, Qibin He, Xingliang Huang, Zhiyuan Yan, Peijin Wang, et al. Revealing influencing factors on global waste distribution via deep-learning based dumpsite detection from satellite imagery. Nature Communications, 14(1):1444, 2023.
[8] Bowen Niu, Quanlong Feng, Jianyu Yang, Boan Chen, Bingbo Gao, Jiantao Liu, Yi Li, and Jianhua Gong. Solid waste mapping based on very high resolution remote sensing imagery and a novel deep learning approach. Geocarto International, 38(1):2164361, 2023.
[9] Shaofu Lin, Lei Huang, Xiliang Liu, Guihong Chen, and Zhe Fu. A construction waste landfill dataset of two districts in Beijing, China from high resolution satellite images. Scientific Data, 11(1):388, 2024.
[10] Tyrone Bright, Sarp Adali, and Cristina Trois. Systemic review and meta-analysis: The application of AI-powered drone technology with computer vision and deep learning networks in waste management. Drones, 9(8), 2025.
[11] Marek Kraft, Mateusz Piechocki, Bartosz Ptak, and Krzysztof Walas. Autonomous, onboard vision-based trash and litter detection in low altitude aerial images collected by an unmanned aerial vehicle. Remote Sensing, 13(5):965, 2021.
[12] Weiyang Chen, Yiyang Zhao, Tengfei You, Haifeng Wang, Yang Yang, and Kun Yang. Automatic detection of scattered garbage regions using small unmanned aerial vehicle low-altitude remote sensing images for high-altitude natural reserve environmental protection. Environmental Science & Technology, 55(6):3604–3611, 2021.
[13] Yang Liu, Bo Zhao, Xuepeng Zhang, Wei Nie, Peng Gou, Jiachun Liao, and Kunxin Wang. A practical deep learning architecture for large-area solid wastes monitoring based on UAV imagery. Applied Sciences, 14(5), 2024.
[14] Pedro F Proença and Pedro Simões. TACO: Trash annotations in context for litter detection. arXiv preprint arXiv:2003.06975, 2020.
[15] Mindy Yang and Gary Thung. Classification of trash for recyclability status. CS229 project report, 2016(1):3, 2016. https://huggingface.co/datasets/garythung/trashnet, https://github.com/garythung/trashnet.
[16] Janusz Bobulski and Jacek Piatkowski. PET waste classification method and plastic waste database-WaDaBa. In Image Processing and Communications Challenges 9: 9th International Conference, IP&C'2017 Bydgoszcz, Poland, September 2017, Proceedings, pages 57–64. Springer, 2018.
[17] Seán Lynch. OpenLitterMap.com – open data on plastic pollution with blockchain rewards (Littercoin). Open Geospatial Data, Software and Standards, 3(1):1–10, 2018.
[18] Tao Wang, Yuanzheng Cai, Lingyu Liang, and Dongyi Ye. A multi-level approach to waste object segmentation. Sensors, 20(14):3816, 2020.
[19] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: Common objects in context. In David Fleet, Tomas Pajdla, Bernt Schiele, and Tinne Tuytelaars, editors, Computer Vision – ECCV 2014, pages 740–755, Cham, 2014. Springer International Publishing.
[20] Luca Morandini, Alessio Buda, and Piero Fraternali. Geospatial artificial intelligence for solid waste recognition from UAV imagery. E3S Web Conf., 643:01004, 2025.
[21] OpenDroneMap. ODM: A command line toolkit to generate maps, point clouds, 3D models and DEMs from drone, balloon or kite images. OpenDroneMap/ODM GitHub Page, 2020.
[22] Hansen T. Dwyer B., Nelson J. Roboflow (version 1.0) [software]. https://roboflow.com, 2024. [accessed 25 January 2025].
[23] Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, and Ross Girshick. Segment anything. arXiv:2304.02643, 2023.
[24] Muhammad Yaseen. What is YOLOv8: An in-depth exploration of the internal features of the next-generation object detector, 2024.
[25] Yunjie Tian, Qixiang Ye, and David Doermann. YOLOv12: Attention-centric real-time object detectors, 2025.
[26] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6):1137–1149, 2017.
[27] Eduardo Villamor Medina, Theodora Tsikrika, Piero Fraternali, Luigi Caldararu, Vasiliki Efstathiou, Sandra Balbierz, Efstathios Skarlatos, Eva Korenjak, Anastasios Karakostas, Luca Di Nuovo, Federico Benolli, Dario Bellingeri, Dries Borloo, Renato Sciunnach, Jimmy Berggren, Ovidiu Manolache, Ioannis Petropoulos, Radu Bors, Nir Haimov, Andrew Staniforth, and Jordan Thompson. PERIVALLON: Improved Intelligence Picture and Operational Capacities to Combat Organised Environmental Crime, pages 197–211. Springer Nature Switzerland, Cham, 2025.
Author Contributions

All authors contributed to the study design, dataset development, and manuscript writing. Data preparation, visualization, and technical validation were carried out by L.M., A.D., T.P., E.T., G.K. and supervised by G.B., E.B.K., P.F., A.C.K.
Competing Interests

The authors declare no competing interests.
Acknowledgements

This work was partially funded by the European Union's Horizon Europe project PERIVALLON - Protecting the EuRopean terrItory from organised enVironmentAl crime through inteLLigent threat detectiON tools, under grant agreement no. 101073952. We acknowledge the Environmental Protection Agency ARPA Lombardia, particularly the OU CREO (Regional Centre for Earth Observation), for their invaluable expertise during the data acquisition and dataset annotation phases of this research. This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic through the e-INFRA CZ (ID:90254).