International Journal of Innovative Technology and Exploring Engineering (IJITEE)
ISSN: 2278-3075 (Online), Volume-14 Issue-12, November 2025
Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP)
© Copyright: All rights reserved.
Retrieval Number: 100.1/ijitee.A117515011225
DOI: 10.35940/ijitee.A1175.14121125
Journal Website: www.ijitee.org

Comparison of Convolutional Neural Network Architectures for Underwater Image Classification
Krystian Kozakiewicz
Abstract: Convolutional neural networks (CNNs) play an essential role in
classifying images collected in real-world environments. This article
presents a performance comparison of selected CNNs for image
classification tasks related to marine flora and fauna, using recordings
from an unmanned underwater vehicle (UUV). An attempt was made to find
suitable CNN architectures for processing images of a poor-visibility
marine environment among five commonly used architectures. The research
was based on a uniform model training system: the same dataset and
identical optimisation parameters were used to demonstrate the learning
capabilities of each architecture. Thanks to this uniform training
system, the learning capabilities of each architecture on these specific
images can be estimated more accurately. The experiments showed that, in
the early stage of training, the analysed networks achieved similar
learning results, whereas the differences concerned the final training
accuracy. The best results were achieved with ResNet50, which has the
most advanced architecture among the models tested. Advanced models
achieve improved classification of complex and distorted images by
leveraging more parameters. The results provide insights into the
performance of different architectures in underwater image classification
and serve as a reference for further research on deep learning
applications in marine environment monitoring.
Keywords: Artificial Intelligence, Deep Learning, Convolutional Neural
Networks, Unmanned Underwater Vehicles.
Nomenclature:
UUV: Unmanned Underwater Vehicle
CNNs: Convolutional Neural Networks
I. INTRODUCTION
In recent years, deep neural networks have become an essential part of
visual data analysis, enabling significant advances in tasks such as
image classification and object detection. Among them, CNNs have
revolutionised the field of computer vision [1]. The development of
architectures for transfer learning has enabled the use of efficient
models based on ready-made neural network structures. This paper focused
on underwater image classification from video frames recorded by a UUV.
The aim was to detect elements of marine flora and fauna that are
important to both ecological research and environmental monitoring [2].
Manuscript received on 02 November 2025 | Revised Manuscript received on
09 November 2025 | Manuscript Accepted on 15 November 2025 | Manuscript
published on 30 November 2025.
*Correspondence Author(s)
Krystian Kozakiewicz*, Department of Autonomous Systems, Gdynia Maritime
University, Gdynia (Pomorskie), Poland. Email ID:
[email protected], ORCID ID: 0009-0009-3020-5764
© The Authors. Published by Blue Eyes Intelligence Engineering and
Sciences Publication (BEIESP). This is an open-access article under the
CC-BY-NC-ND license http://creativecommons.org/licenses/by-nc-nd/4.0/
Such images pose challenges: limited water transparency, variable
lighting conditions, and a wide variety of biological objects [3]. To
develop the most effective approach in this context, five commonly used
CNN models were compared: MobileNetV2, ResNet50, EfficientNetB0,
InceptionV3, and DenseNet121. This paper used only one training system:
all models were trained on the same dataset and with identical
optimisation parameters for the same number of epochs, allowing
comparison of training accuracy across the chosen architectures. The main
factor studied in this work is the network's ability to learn from
low-visibility underwater images.
II. THEORETICAL BACKGROUND
The functioning of biological nervous systems inspires the neural
networks used in machine learning. With individual computational
"neurons" arranged in layers, a neural network establishes relationships
between input data and expected results by assigning weights to the
connections between neurons. Over time, deep neural networks started to
be designed. Their multilayer structure allows the representation of very
complex patterns. One of the essential types of architecture is the CNN,
which has become the standard for analysing images. A typical CNN
includes the following basic components: convolutional layers, activation
layers, pooling layers, and fully connected layers. Convolutional layers
extract local image features using filters. Non-linear activation layers
allow the network to approximate complex functions; a common choice is
ReLU. Pooling layers reduce the dimensionality of feature maps. Fully
connected layers perform the final classification based on the high-level
features learned in previous stages.
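The interplay of convolution, activation, and pooling can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this study: the 5 x 5 image and the 2 x 2 edge filter are hypothetical, hand-picked values, and no learning takes place.

```python
# Toy forward pass through the three basic CNN building blocks.
# Pure Python, no learning: the filter values are hand-picked for illustration.

def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    """Element-wise ReLU: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool2x2(feature_map):
    """Non-overlapping 2x2 max pooling (odd trailing rows/columns are dropped)."""
    h = len(feature_map) // 2 * 2
    w = len(feature_map[0]) // 2 * 2
    return [
        [max(feature_map[i][j], feature_map[i][j + 1],
             feature_map[i + 1][j], feature_map[i + 1][j + 1])
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]

# Hypothetical 5x5 image: dark left part (0), bright right part (1).
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
# Hypothetical 2x2 filter that responds to dark-to-bright transitions.
edge_filter = [[-1.0, 1.0],
               [-1.0, 1.0]]

features = conv2d(image, edge_filter)   # 4x4 map, strongest at the edge column
activated = relu(features)              # keeps only positive responses
pooled = max_pool2x2(activated)         # 2x2 map, half the resolution
```

The filter responds only where the image changes from dark to bright, and pooling keeps that response while halving the spatial resolution. In a trained CNN the filter values are learned rather than chosen by hand.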
An architecture built this way allows CNNs to detect simple patterns such
as edges and textures. At the same time, this structure enables the
classification of more abstract structures, including animals and plants.
As a result, CNNs have found wide application in medicine, satellite
image analysis, environmental monitoring, and autonomous systems. Many
improved variants of CNNs have been proposed in the literature, differing
in their information flow, depth, and computational efficiency. Among
them, some are widely used. MobileNetV2 is optimised for mobile and other
resource-constrained devices [4]. ResNet50 uses residual (skip)
connections, which enable the training of deep networks by mitigating
vanishing gradients [5]. EfficientNetB0 balances width and depth,
offering a high accuracy-to-complexity ratio [6]. InceptionV3 enables
efficient feature extraction across different scales via multiscale
filters [7]. DenseNet121, with dense connectivity in which each layer
receives input from all previous layers, enables improved feature reuse
[8]. Comparing these five architectures provides a broad perspective for
evaluating learning performance. Most comparative studies have focused on
datasets such as ImageNet, but this work extends that line of research by
assessing their performance in classifying underwater images under
challenging visual conditions.
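The difference between the residual connections of ResNet50 and the dense connectivity of DenseNet121 can be sketched on a toy one-dimensional example. The `layer` function below is a hypothetical stand-in for a convolution with activation, and the input values are arbitrary; the sketch only shows how information is combined, not how either network is actually built.

```python
# Toy 1-D illustration of two connectivity patterns. A "layer" here is a
# hypothetical function on a list of numbers, not a real convolution.

def layer(features):
    """Stand-in for conv + activation: emits one new feature (the mean)."""
    return [sum(features) / len(features)]

def residual_block(x, transform):
    """ResNet-style skip connection: output = input + transform(input)."""
    fx = transform(x)
    return [a + b for a, b in zip(x, fx)]

def dense_block(x, num_layers):
    """DenseNet-style block: each layer sees all earlier outputs, concatenated."""
    features = list(x)
    widths = [len(features)]
    for _ in range(num_layers):
        features = features + layer(features)  # concatenate, do not add
        widths.append(len(features))
    return features, widths

x = [1.0, 2.0, 3.0]
res_out = residual_block(x, lambda v: [0.5 * a for a in v])
dense_out, widths = dense_block(x, num_layers=3)
```

In the residual block the output has the same size as the input, because the transformation is added to it. In the dense block the feature vector grows with every layer, because earlier features are kept and reused rather than overwritten.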
III. METHODOLOGY
All stages of the experiments were performed using the Python programming
language, with libraries including TensorFlow, Keras, NumPy, Pandas, and
Matplotlib. TensorFlow and Keras were used to build models and perform
transfer learning. NumPy and Pandas served as utilities for working with
the dataset. Matplotlib was used for visualisation using graphs. To make
the results repeatable, random seeds for models were fixed.
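The effect of fixing a seed can be sketched with the Python standard library alone. The function name, sample count, batch size, and seed values below are hypothetical. In a TensorFlow pipeline the NumPy and TensorFlow generators would also have to be seeded, for example with `tf.keras.utils.set_random_seed`; whether the study did this in exactly that way is an assumption.

```python
import random

def batch_order(num_samples, batch_size, seed):
    """Return the shuffled sample indices, grouped into batches, for one trial."""
    rng = random.Random(seed)          # private generator, fixed by the seed
    indices = list(range(num_samples))
    rng.shuffle(indices)
    return [indices[i:i + batch_size]
            for i in range(0, num_samples, batch_size)]

# Hypothetical values: 10 samples, batches of 4, two different trial seeds.
run_a = batch_order(10, 4, seed=0)
run_b = batch_order(10, 4, seed=0)     # same seed: identical batching
run_c = batch_order(10, 4, seed=1)     # different seed: a separate trial
```

Re-running a trial with the same seed reproduces its data batching exactly, while a different seed per trial still yields independent repetitions.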
The ICM Benchmark used in this study is publicly available on the Kaggle
platform [9]. It includes video frames recorded by a UUV showing various
species of marine flora and fauna. Frames from the same sequences were
assigned to selected classes, allowing the division of the dataset into
species of aquatic organisms: Spatangus purpureus (purple heart urchin),
Echinaster sepositus (Mediterranean red sea star), Cerianthus
membranaceus (cylinder anemone), Bonellia viridis (green spoonworm),
Scyliorhinus canicula (small-spotted catshark), and Ophiura ophiura
(serpent star). To provide a control sample, a class containing only
seabed fragments, with no evidence of any species present, was also
created. This class is crucial for distinguishing objects of biological
interest from the environmental background. The complete set of species
in this dataset is shown in Figure 1. In addition, Figures 2 and 3
compare an image containing only the seabed background with a photo
showing a selected species. These demonstrate the difficulty of
classification, even for the human eye.
[Fig.1: ICM Benchmark Species]
[Fig.2: Example of an Image Showing Only the Seabed]
[Fig.3: Example of an Image Showing Echinaster Sepositus]
Preprocessing included resizing input images to the resolution required
by each architecture (224 x 224 pixels for MobileNetV2, ResNet50,
EfficientNetB0, and DenseNet121, and 299 x 299 pixels for InceptionV3).
Each model employed the appropriate normalisation function provided in
TensorFlow. To enhance generalisation capability, data augmentation was
applied during training [10]. This augmentation involved random
rotations, horizontal flips, cropping, and adjustments to brightness and
contrast. The transformations were designed to reflect the variability of
underwater environments, particularly changes in illumination and water
turbidity. All five CNN architectures were trained ten times each under
identical conditions, with the same optimisation settings and on the same
dataset. The classification head was adapted to the number of classes,
consisting of a global average pooling layer, a dropout layer to mitigate
overfitting, and a dense layer with softmax activation. Each model was
trained for 20 epochs, without early stopping. The optimisation process
used the Adam optimiser, and the loss function was categorical
cross-entropy, which is suitable for multi-class classification.
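The softmax output of the classification head and the categorical cross-entropy loss can be written out as a short worked example. The sketch assumes seven classes (the six listed species plus the seabed-only class); the logit values are hypothetical and stand for an untrained head that scores all classes equally.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Loss for one sample: -sum(y_k * log(p_k))."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(one_hot, probs))

# Assumed class count: six species plus the seabed-only background class.
num_classes = 7
logits = [0.0] * num_classes       # hypothetical untrained head: equal scores
probs = softmax(logits)            # uniform distribution, 1/7 per class
target = [0, 0, 1, 0, 0, 0, 0]     # true class index 2, one-hot encoded
loss = categorical_cross_entropy(target, probs)   # equals ln(7) here
```

A head that guesses uniformly over K classes has a loss of ln(K) and an expected accuracy of 1/K on balanced data, which gives a baseline against which the first-epoch values of a real training run can be read.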
To minimise the influence of stochastic factors, each experiment was
repeated ten times with different random initialisations of weights and
data batching. Performance analysis was based on the mean and standard
deviation of model accuracies. In addition, learning curves were examined
across epochs to assess training dynamics. The primary focus was on
training accuracy, in order to identify the best CNN architecture for
training a model on this dataset. Results are presented in both tabular
and graphical form in the next section.
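Aggregating repeated trials into a per-epoch mean and standard deviation can be sketched as follows. The accuracy values are hypothetical placeholders, not the study's data, and the function name is illustrative.

```python
from statistics import mean, stdev

# Hypothetical per-trial accuracy curves (trials x epochs), not the paper's data.
trials = [
    [0.55, 0.61, 0.63],
    [0.57, 0.60, 0.64],
    [0.56, 0.62, 0.65],
]

def summarise(curves):
    """Per-epoch mean and sample standard deviation across repeated trials."""
    per_epoch = list(zip(*curves))             # regroup the values by epoch
    return [(mean(epoch), stdev(epoch)) for epoch in per_epoch]

summary = summarise(trials)
final_mean, final_std = summary[-1]            # statistics of the last epoch
```

The mean curve corresponds to what a table of per-epoch averages reports, while the standard deviation indicates how much individual trials scatter around it.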
IV. RESULTS
This section presents results from experiments on the learning accuracy
of five CNN architectures. The average training accuracy per epoch for
each model is presented in Table I. Figure 4 shows the average results
across models. This graph enables a direct comparison of the five
architectures in terms of their learning effects. Figures 5-9, in turn,
show the average result of each individual architecture along with the
exact curves for each of the 10 trials.
Table I: Model Architectures' Train Accuracy Per Epoch
Epoch  DenseNet121  EfficientNetB0  InceptionV3  MobileNetV2  ResNet50
1      0.5536       0.5511          0.5241       0.5650       0.5736
2      0.6090       0.5929          0.5806       0.6188       0.6343
3      0.6227       0.6067          0.5979       0.6349       0.6546
4      0.6297       0.6115          0.6089       0.6434       0.6669
5      0.6336       0.6176          0.6177       0.6492       0.6754
6      0.6374       0.6204          0.6227       0.6549       0.6813
7      0.6394       0.6209          0.6269       0.6563       0.6881
8      0.6414       0.6236          0.6291       0.6603       0.6913
9      0.6434       0.6257          0.6330       0.6618       0.6956
10     0.6443       0.6256          0.6351       0.6657       0.6995
11     0.6457       0.6273          0.6360       0.6664       0.7023
12     0.6473       0.6293          0.6390       0.6686       0.7042
13     0.6465       0.6294          0.6415       0.6689       0.7073
14     0.6462       0.6309          0.6427       0.6702       0.7090
15     0.6481       0.6300          0.6415       0.6716       0.7098
16     0.6495       0.6307          0.6429       0.6718       0.7125
17     0.6503       0.6310          0.6439       0.6731       0.7133
18     0.6494       0.6320          0.6457       0.6731       0.7156
19     0.6498       0.6331          0.6462       0.6744       0.7182
20     0.6511       0.6317          0.6458       0.6751       0.7181
[Fig.4: Average Train Accuracy of all Architectures]
[Fig.5: Train Accuracy of DenseNet121]
[Fig.6: Train Accuracy of EfficientNetB0]
[Fig.7: Train Accuracy of InceptionV3]
[Fig.8: Train Accuracy of MobileNetV2]
[Fig.9: Train Accuracy of ResNet50]
The initial training accuracy values range from 0.524 to 0.574,
indicating that no architecture has a clear advantage at the very
beginning of learning. Among all models, InceptionV3 achieved the lowest
initial value. DenseNet121, EfficientNetB0, and MobileNetV2 achieved
average initial values. ResNet50 achieved the highest initial training
accuracy. As training progressed, all networks gradually improved their
accuracy. After 20 epochs, training accuracies ranged from 0.632 to
0.718. At this stage, differences between architectures became more
pronounced. EfficientNetB0 achieved the lowest final training accuracy,
while InceptionV3 was somewhat better. DenseNet121 and MobileNetV2
obtained intermediate results. ResNet50 considerably outperformed the
other models, achieving the best outcome. It exceeded the second-best
result, that of MobileNetV2, by 0.043 (0.7181 versus 0.6751 in Table I),
i.e. about 4 percentage points of training accuracy.
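The final ranking can be recomputed directly from the epoch-20 row of Table I. The sketch below only re-uses those published values; the variable names are illustrative.

```python
# Final-epoch (epoch 20) mean training accuracies, copied from Table I.
final_accuracy = {
    "DenseNet121": 0.6511,
    "EfficientNetB0": 0.6317,
    "InceptionV3": 0.6458,
    "MobileNetV2": 0.6751,
    "ResNet50": 0.7181,
}

# Sort architectures from best to worst final training accuracy.
ranking = sorted(final_accuracy.items(), key=lambda kv: kv[1], reverse=True)
(best, best_acc), (second, second_acc) = ranking[0], ranking[1]
gap = round(best_acc - second_acc, 4)   # absolute gap, in accuracy units

for name, acc in ranking:
    print(f"{name:<15} {acc:.4f}")
print(f"{best} leads {second} by {gap:.3f}")
```

The gap is an absolute difference in accuracy, so it is best read in percentage points rather than as a relative improvement.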
In summary, though all five architectures showed very similar initial
results, the final results show clear differences in model training
effectiveness. ResNet50 proved to be the most effective model under the
experimental conditions, achieving the highest average training accuracy,
which suggests that its architecture is best suited for training models
on difficult images of the underwater marine environment.
V. CONCLUSION
This paper compares the performance of selected CNN architectures. The
objective was to assess how effectively the models can be trained to
classify underwater images from frames extracted from UUV recordings.
Five models representative of different design approaches were
considered; all were trained on the same dataset and with the same
training parameters. This methodology allowed direct comparison of model
performance, excluding variability introduced by different training times
or hyperparameter settings. The results showed that all architectures
exhibited very similar learning curves and training accuracies during the
first epochs. The differences became more pronounced only in the later
stages of training, where the networks reached different final values of
training accuracy [11]. These results also suggest that the model with
the highest structural complexity, ResNet50, outperformed the simpler
ones, which may indicate that it is better at recognising complex
biological structures. This analysis provides a basis for subsequent
investigations into the potential of deep learning for marine environment
monitoring. Future work could expand this study to include newer network
architectures or tuned hyperparameter selections for the chosen
architecture. In addition, observing validation and test accuracy would
be advantageous for comparing the architectures under real conditions.
DECLARATION STATEMENT
I must verify the accuracy of the following information as the article's
author.
▪ Conflicts of Interest/ Competing Interests: Based on my understanding,
this article has no conflicts of interest.
▪ Funding Support: Yes, I have received financial assistance for this
article. Uniwersytet Morski w Gdyni NIP: 586-001-28-73 ul. Morska 81-87
81-225 Gdynia. https://umg.edu.pl/kontakt
▪ Ethical Approval and Consent to Participate: The content of this
article does not necessitate ethical approval or consent to participate
with supporting documentation.
▪ Data Access Statement and Material Availability: The adequate
resources of this article are publicly accessible.
▪ Author’s Contributions: The authorship of this article is contributed
solely.
REFERENCES
1. Y. LeCun, Y. Bengio, G. Hinton (2015). “Deep learning”. Nature 521,
436–444 (2015). DOI: https://doi.org/10.1038/nature14539
2. M. J. Islam, S. Sakib Enan, P. Luo, and J. Sattar, "Underwater Image
Super-Resolution using Deep Residual Multipliers," 2020 IEEE
International Conference on Robotics and Automation (ICRA), Paris,
France, 2020, pp. 900-906,
DOI: https://doi.org/10.1109/ICRA40945.2020.9197213.
3. C. Li et al., "An Underwater Image Enhancement Benchmark Dataset
and Beyond," in IEEE Transactions on Image Processing, vol. 29, pp.
4376-4389, 2020, DOI: https://doi.org/10.1109/TIP.2019.2955241.
4. M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, L. Chen.
“MobileNetV2: Inverted Residuals and Linear Bottlenecks”
Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2018, pp. 4510-4520.
DOI: https://doi.org/10.48550/arXiv.1801.04381
5. M. Elpeltagy, H. Sallam, “Automatic prediction of COVID-19 from
chest images using modified ResNet50”. Multimed Tools Appl 80,
26451–26463 (2021).
DOI: https://doi.org/10.1007/s11042-021-10783-6
6. S. Upadhyay, J. Jain, R. Prasad. (2024). “Early Blight and Late Blight
Disease Detection in Potato Using Efficientnetb0”. International Journal
of Experimental Research and Review, 38, 15–25.
DOI: https://doi.org/10.52756/ijerr.2024.v38.002
7. C. Wang et al., "Pulmonary Image Classification Based on Inception-v3
Transfer Learning Model," in IEEE Access, vol. 7, pp. 146533-146541,
2019, DOI: https://doi.org/10.1109/ACCESS.2019.2946000.
8. G. Huang, Z. Liu, L. van der Maaten, K. Weinberger. “Densely
Connected Convolutional Networks”. Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 4700-4708.
DOI: https://doi.org/10.48550/arXiv.1608.06993
9. D. Serrano, “ICM-Benchmark-20,” Kaggle Datasets, Oct. 2024.
[Online]. Available:
https://www.kaggle.com/datasets/tsunamiserra/icm-benchmark-20
[Accessed: 31-Oct-2025].
10. Shorten, C., Khoshgoftaar, T.M. A survey on Image Data
Augmentation for Deep Learning. J Big Data 6, 60 (2019).
DOI: https://doi.org/10.1186/s40537-019-0197-0
11. J. Madewell, R.A. Feagin, T.P. Huff, B. Balboa; Estimating Freshwater
Inflows for an Ungauged Watershed at the Big Boggy National Wildlife
Refuge, USA. J. Mar. Sci. Eng. 2024, 12, 15.
DOI: https://doi.org/10.3390/jmse12010015
AUTHOR’S PROFILE
Krystian Kozakiewicz completed his Master's degree at the Faculty of
Electrical Engineering at the Maritime University of Gdynia, specialising
in Automation, Electronics, Electrical Engineering, and Space
Technologies. His research interests focus on artificial intelligence,
automation, and autonomous systems. He is actively involved in the
development and experimental evaluation of AI models designed for
processing both numerical data and image data. His work includes
implementing and optimising neural networks for tasks such as pattern
recognition, anomaly detection, and sensor data fusion in real-time
environments. He works with autonomous and semi-autonomous mobile robots
for navigation, control, and task execution.
Disclaimer/Publisher’s Note: The statements, opinions and data contained
in all publications are solely those of the individual author(s) and
contributor(s) and not of the Blue Eyes Intelligence Engineering and
Sciences Publication (BEIESP)/ journal and/or the editor(s). The Blue
Eyes Intelligence Engineering and Sciences Publication (BEIESP) and/or
the editor(s) disclaim responsibility for any injury to people or
property resulting from any ideas, methods, instructions, or products
referred to in the content.