scieee Science in your language
[en] (orig)

Ensemble learning based plant disease prediction and analysis: A comparative study

Author: Prasadu, G; Subhani, Shaik; Anusha, M; Fatima, Naseeba; Vanshika, N
Publisher: Zenodo
DOI: 10.5281/zenodo.17719902
Source: https://zenodo.org/records/17719902/files/WJARR-2025-2111.pdf
 Co esponding au ho : G. P asadu.
Copy igh © 2025 Au ho (s) e ain he copy igh o his a icle. This a icle is published unde he e ms o he C ea i e Commons A ibu ion Liscense 4.0.
Ensemble lea ning based plan disease p edic ion and analysis: A compa a i e s udy
G. P asadu *, Shaik Subhani, M. Anusha, Naseeba Fa ima and N. Vanshika
Depa men o In o ma ion Technology, S eenidhi Ins i u e o Science and Technology (Au onomous), Yamnampe ,
Gha kesa , Hyde abad, India.
Wo ld Jou nal o Ad anced Resea ch and Re iews, 2025, 27(02), 1598-1604
Publica ion his o y: Recei ed on 20 Ap il 2025; e ised on 28 May 2025; accep ed on 31 May 2025
A icle DOI: h ps://doi.o g/10.30574/wja .2025.27.2.2111
Abs ac
C op illnesses conside ably a ec ag icul u al p oduc i i y and ood secu i y, making p omp and p ecise iden i ica ion
essen ial o minimize losses and suppo sus ainable ag icul u e. This esea ch in es iga es he pe o mance o deep
lea ning models e sus ensemble app oaches o de ec ing plan diseases wi hin he Plan Village da ase . A
Con olu ional Neu al Ne wo k (CNN) which is on MobileNe V2 was u ilized in ea u e ex ac ion, and i was compa ed
o s andalone classi ie s - XGBoos , Suppo Vec o Machine (SVM), and Random Fo es - as well as hei ensemble. The
esea ch assesses he p edic i e pe o mance o hese models, emphasizing how he ensemble can me ge s eng hs and
minimize misclassi ica ion. Expe imen al indings indica e ha he ensemble model eaches an accu acy o 94.1%,
su passing indi idual models (CNN: 92.5%, Random Fo es : 88.3%, SVM: 85.6%, XGBoos : 89.4%). This compa a i e
s udy deli e s insigh s in o he ade-o s o models, p esen ing a scalable app oach o au oma ic de ec ion o plan
diseases in p ecision a ming.
Keywo ds: CNN (Con olu ional Neu al Ne wo k); Ensemble Lea ning; Mobilene 2; Random Fo es ; SVM;
Compa a i e S udy; Xgboos (Ex eme G adien Boos ing)
1. In oduc ion
Ag icul u e suppo s wo ldwide ood p oduc ion, making plan heal h managemen essen ial o achie ing maximum
c op yields. C op diseases lead o signi ican inancial se backs, impac ing bo h small a me s and la ge ag icul u al
en e p ises. Iden i ying diseases manually is ime-consuming, needs specialized knowledge, and can be p one o
mis akes. P og ess in deep lea ning and machine lea ning has made au oma ed de ec ion h ough image analysis mo e
common, p o iding as e , be e and mo e co ec diagnoses [1]. Deep lea ning models ‘ou pe o m adi ional
machine lea ning in accu acy’ bu demand ex ensi e da ase s [4].
This esea ch pe o ms a compa a i e examina ion o deep lea ning and machine lea ning models o classi ying plan
diseases. U ilizing he Plan Village da a which is om Kaggle, which con ains o e 54,306 anno a ed images spanning
38 disease ca ego ies, we assess a MobileNe V2-based Con olu ional Neu al Ne wo k (CNN), SVM, Random Fo es ,
XGBoos , along wi h hei ensemble. The objec i e is o e alua e hei e ec i eness ega ding accu acy, gene aliza ion,
and eal-wo ld use ulness, u ilizing da a p ep ocessing me hods such as augmen a ion and esizing. Th ough he
compa ison o hese models, ou goal is o disco e an e icien , scalable s a egy o ea ly disease de ec ion, minimizing
dependence on manual echniques and p omo ing p ecision ag icul u e.
Con en ional de ec ion depends on expe isual e alua ions, p o ing o be ine ec i e o ex ensi e ag icul u e.
A i icial in elligence (AI) acili a es au oma ion, using models ha can accu a ely classi y diseases, educe human e o ,
and imp o e e iciency. This s udy seeks o illumina e he s eng hs and limi a ions o each app oach h ough a de ailed
compa ison.
Wo ld Jou nal o Ad anced Resea ch and Re iews, 2025, 27(02), 1598-1604
1599
2. Li e a u e e iew
This sec ion explo es s udies using machine lea ning and deep lea ning me hods in iden i ying diseased plan c ops, as
adi ional human made inspec ion me hods a e o en ime- aking, and hey may cause e o s. Acco ding o Gup a and
Jadon [8], ad ancemen s in DL echniques, mainly Con olu ional Neu al Ne wo ks (CNNs), demons a e supe io
classi ica ion pe o mance compa ed o Machine Lea ning algo i hms like SVM and Random Fo es . Thei esea ch
shows ha VGG-ICNN eached an accu acy o 99.16%, whe eas CNN models secu ed 98.13%, demons a ing he powe
o deep lea ning.
Resea ch examines segmen a ion models such as UNe and DeepLabV3+ ha success ully iden i y a eas in plan s
a ec ed by disease, whe eas ML models o e in e p e abili y wi h lowe compu a ional cos s [1]. Ensemble echniques
ha me ge CNN ea u e ex ac ion wi h classi ie s like Random Fo es and XGBoos ‘enhance pe o mance by
in eg a ing mul iple models [7]. Despi e hei success, DL models ace challenges ha demand la ge, labeled da ase s
and signi ican compu a ional powe . Recen s udies in es iga e op ions such as ans e lea ning, da a augmen a ion,
and IoT in eg a ion o eal- ime moni o ing, indica ing ha hyb id app oaches combining deep lea ning and ensemble
lea ning p esen a p omising pa h o imp o ing de ec ion and enabling p ecision ag icul u e [6].
Nume ous s udies ha e demons a ed he e icacy o DL, pa icula ly CNNs, in de ec ion based on images. Mohan y,
Hughes, and Sala hé [2] explo ed sma phone-based diagnos ics by aining a deep CNN using a da ase o 54,306
images om Plan Village, which included 14 c op ypes and 26 disease ca ego ies, eaching a classi ica ion p ecision o
99.35% on a sepa a e es da a. They obse ed a decline o 31.4% accu acy wi h eal-wo ld images, emphasizing
challenges in gene aliza ion.
Fe en inos [3] de eloped CNN-based models u ilizing an open da ase o 87,848 images ha depic 25 c op a ie ies
and 58 di e en plan -disease combina ions, which also include heal hy c ops. The leading model achie ed a 99.536%
success a e on 17,548 new images, showcasing i s eliabili y in bo h lab and eal-wo ld en i onmen s.
Too e al. [4] pe o med a compa a i e analysis by ine- uning ad anced DL models, op imizing sophis ica ed deep
lea ning amewo ksand DenseNe -121 on a da ase con aining 38 classes o heal hy lea images and diseased lea
images om 14 unique plan ca ego ies. DenseNe s consis en ly imp o ed accu acy as epochs p og essed, eaching a
es ing accu acy o 99.75% wi hou o e i ing and u ilizing ewe pa ame e s han o he models.
Abbas, Jain, and Tayal [5] pe o med a su ey on machine lea ning models o iden i ying plan diseases, highligh ing
con en ional me hods o ea u e ex ac ion (colo , ex u e, shape) and classi ica ion algo i hms like SVM and Random
Fo es , which a e p o icien in ecognizing diseases like lea blo ch, powde y mildew, and us . Ne e heless, hese
echniques ace challenges in ea ly-s age de ec ion when compa ed o DL.
Singh e al. [6] examined de ec ion ia machine lea ning in IoT-based ag icul u al sys ems, emphasizing IoT’s
con ibu ion o emo e moni o ing and ea ly de ec ion o enhance p oduc i i y, in line wi h p ecision ag icul u e
objec i es.
Ensemble lea ning echniques ha e o imp o e de ec ion by u ilizing he ad an ages o a ious models [1]. Ligh weigh
deep lea ning a chi ec u es acili a e implemen a ion on de ices wi h cons ained capabili ies, like mobile phones. Liu
and Wang [7] in eg a ed MobileNe V2 wi h YOLO 3 o de elop a model aimed a he ea ly de ec ion o disease, a aining
high accu acy along wi h compu a ional e iciency. Kamal KC e al. [8] p esen ed dep hwise sepa able con olu ion
amewo ks, ea u ing MobileNe - ype designs, o e ec i e disease classi ica ion. Fu u e s udies ough o emphasize
gene aliza ion, di e si y in da ase s, and eal- ime applica ions o ea ly de ec ion [2][3][4].
3. Me hodology
3.1. Da ase
The da ase was ex ac ed om Kaggle, a pla o m known o high-quali y da ase s. We used he Plan Village da ase
o ou p ojec , which con ains 54,306 labeled images co e ing 38 plan disease ca ego ies along wi h heal hy lea
samples.
3.2. Da a P ep ocessing
• Image Resizing: All images we e esized o 224x224 pixels o ensu e uni o mi y and e icien p ocessing.
Wo ld Jou nal o Ad anced Resea ch and Re iews, 2025, 27(02), 1598-1604
1600
• Da a Augmen a ion: Techniques like o a ion, lipping, and b igh ness modi ica ion we e applied o he
da a ollowing es ablished p ac ices [3]. This imp o ed model gene aliza ion and educed o e i ing,
pa icula ly due o he da ase 's limi ed size.
• Da ase Spli ing: The Plan Village da ase was di ided in o 80% aining (43,444 images) and 20%
alida ion (10,862 images), wi h a 10% es subse (5,431 images) o op imize model pe o mance and
ensu e obus e alua ion.
• Fea u e Ex ac ion using MobileNe V2: MobileNe V2 was u ilized in ex ac ing ele an ea u es om he
images, ensu ing e ec i e da a ep esen a ion o model aining.
3.3. Model A chi ec u e
Ou me hodology in eg a es mul iple models o obus disease de ec ion:
• Con olu ional Neu al Ne wo k (CNN): The p e- ained MobileNe V2 a chi ec u e, known o i s e iciency,
was ine- uned o classi y plan diseases. He e he da a is p e- ained on he basis o ImageNe da ase . The
plan disease is ine- uned o adap domain speci ic ea u es.
• Random Fo es : A decision- ee-based ensemble lea ning model ha enhances classi ica ion accu acy by
educing o e i ing and imp o ing in e p e abili y. He e o e i ing can be educed by combining
p edic ions om mul iple ees. I is obus o missing alues and noisy da a.
• Suppo Vec o Machine (SVM): This echnique employed o ca ego ize he ex ac ed ea u es, c ea ing
s ong decision bounda ies o disease iden i ica ion. used o classi y ea u es ex ac ed om CNN while
making i wi h good gene aliza ion and main ain p ope egula iza ion.
• XGBoos (Ex eme G adien Boos ing): A powe ul g adien boos ing amewo k op imized o mul i- class
classi ica ion, imp o ing p edic ion accu acy and compu a ional e iciency. I handles spa se da a and
missing alues e icien ly and op imized speed wi h high p edic ion accu acy
• Ensemble Model: The inal p edic ions we e ob ained by a e aging he p obabili y ou pu s om he,
Random Fo es , SVM, and XGBoos models. This ensemble app oach enhanced o e all accu acy by
le e aging he s eng hs o indi idual models.
• Logis ic Reg ession: mainly used o bina y and mul i class asks.I is used o es ima e p obabili ies in he
logis ic unc ions by es ablishing ela ionships be ween inpu ea u es and a ge labels. Logis ic eg ession
p e- ained on CNN ex ac ed ea u es and implemen ed as a baseline in ensemble Model.
• KNN: based lea ning and a non-pa ame ic me hod which is used o classi ies a sample on he bases o
majo i y class in he ea u e space which is o he k-nea es neighbo s. I is mos ly used o small da ase s
whe e he bounda ies o a class a e no complex.
The p oposed app oach in ol es wo key s ages: ea u e ex ac ion using a p e- ained CNN model and classi ica ion
using an ensemble o machine lea ning models.
Figu e 1 Plan Disease De ec ion A chi ec u e
Wo ld Jou nal o Ad anced Resea ch and Re iews, 2025, 27(02), 1598-1604
1601
3.4. CNN Model (MobileNe V2)
A MobileNe V2 model p e- ained on ImageNe was employed o ea u e ex ac ion. The ne wo k was modi ied by
emo ing he dense laye s and eplacing hem wi h:
• Global A e age Pooling Laye : Reduces he spa ial dimensions.
• Dense Laye (512 neu ons, ReLU ac i a ion): In oduces non-linea i y.
• Ou pu Laye (38 neu ons, So max ac i a ion): Gene a es class p obabili ies.
3.4.1. T aining P ocess
• The CNN model was ained wi h he ollowing se ings:
• Adam op imize (lea ning a e se o 0.001) o enhance he lea ning p ocess.
• Ca ego ical c oss-en opy loss unc ion,
• Ea ly s opping and Model checkpoin ing We e applied o p e en o e i ing and ensu e be e gene aliza ion.
3.5. Ensemble Lea ning o Classi ica ion
The ained model was used as a ea u e ex ac o , and he ea u es ex ac ed we e inpu in o h ee di e en machine
lea ning classi ie s o p ocessing:
3.5.1. Random Fo es (RF)
A obus ensemble-based model using 50 decision ees. I employs se e al decision ees o enhance p edic ion
accu acy and minimize a iance.
3.5.2. Suppo Vec o Machine (SVM)
A linea SVM classi ie calib a ed using Pla scaling. Finds he bes bounda y(hype plane) o sepa a e di e en disease
ca ego ies. Helps in dis inguishing be ween isually simila plan diseases.
3.5.3. XGBoos (XGB)
A boos ing algo i hm ha imp o es p edic ions h ough sequen ial lea ning. A g adien boos ing model op imized o
mul i-class classi ica ion, making i highly e ec i e o plan disease de ec ion.
Figu e 2 CNN T aining Accu acy
3.5.4. Final Decision-Making: So -Vo ing Ensemble
• The inal p edic ions we e ob ained by a e aging he p obabili y ou pu s om he (CNN, RF, SVM, Logis ic
Reg ession, KNN and XGB) models.
• A e he calcula ions, Selec he class wi h he highes con idence leading o be e classi ica ion accu acy and
obus ness.
Wo ld Jou nal o Ad anced Resea ch and Re iews, 2025, 27(02), 1598-1604
1602
3.5.5. T aining S a egy
Di e en aining s a egies used o ensu e model pe o mance.
Loss Func ion
• Ca ego ical C oss-En opy- Well-sui ed o managing p oblems in ol ing classi ica ion ac oss mul iple
ca ego ies.
Op imize
• Adam Op imize - I helps in e ec i ely adjus ing he model's weigh s and speeding up he aining p ocess
owa d con e gence.
• Hype pa ame e Tuning: The alida ion da ase was u ilized o ine- une key hype pa ame e s.
C oss-Valida ion S a egy
• K-Fold c oss- alida ion (K=5) - The da ase was spli in o i e pa s, whe e ou we e used o aining and one
o alida ion.
• This was epea ed i e imes o ensu e ha e e y da a poin was used o bo h aining and alida ion.
• This app oach assis s in e alua ing how well he model pe o ms while also educing he isk o o e i ing.
4. Resul s and E alua ion
The de eloped model was es ed using a designa ed es da ase , and a ious pe o mance measu es we e calcula ed.
The accu acies achie ed by each s andalone model, as well as he combined ensemble model, a e lis ed below:
4.1.1. F1Sco e
Balances p ecision and ecall o e alua e he accu acy o disease classi ica ion.
4.1.2. Mean Absolu e E o
Calcula es he a e age absolu e di e ence be ween p edic ed and ac ual labels.
4.1.3. Roo Mean Squa ed E o
Measu es he squa e oo o he a e age squa ed e o s, emphasizing la ge mis akes.
4.1.4. R² Sco e
Re lec s how well he model explains he a iabili y in disease classi ica ions.
Table 1 Pe o mance and Compa ison o Indi idual Models and Ensemble Model
MODEL
ACCURACY
F1Sco e
MAE
RMSE
R2 Sco e
CNN
0.7829
0.7606
1.4276
4.8876
0.8013
Random Fo es
0.6645
0.6460
1.9934
5.1943
0.7756
SVM
0.8750
0.8618
0.7039
3.5047
0.8979
XGBoos
0.7039
0.6894
1.7829
5.0347
0.7892
Logis ic Reg ession
0.8816
0.8774
0.5987
3.0640
0.9219
KNN
0.7039
0.6864
1.8947
5.1465
0.7797
Ensemble
0.8421
0.8314
0.9342
3.5854
0.8931
Ensemble model achie ed highes accu acy demons a ing e ec i eness combining mul iple classi ie s de ailed
classi ica ion epo showed imp o ed p ecision ecall ac oss all classes’ con usion ma ix gene a ed analyze
misclassi ica ion pa e ns ensemble model showed ewe alse posi i es nega i es highligh ing supe io pe o mance.

Wo ld Jou nal o Ad anced Resea ch and Re iews, 2025, 27(02), 1598-1604
1603
5. Compa a i e Analysis
The ensemble app oach ou pe o med he s andalone CNN model by le e aging he s eng hs o mul iple classi ie s.
The CNN model p o ided s ong ea u e ep esen a ions, while he ensemble classi ie s e ined he decision-making
p ocess, educing misclassi ica ion a es.
A con usion ma ix was also gene a ed o analyze he misclassi ica ion pa e ns. The ensemble model showed ewe
alse posi i es and alse nega i es, highligh ing i s supe io pe o mance.
This s udy compa es model pe o mance:
• CNN (MobileNe V2): High accu acy (78.3%), e icien due o ligh weigh design.
• Random Fo es : Mode a e accu acy (66.5%), in e p e able bu less p ecise.
• SVM: S ong accu acy (87.5%), s uggles wi h complex ea u es.
• XGBoos : Mode a e accu acy (70.4%), e icien o s uc u ed da a.
• Logis ic Reg ession: Bes accu acy (88.2%), simple and e icien o linea ly sepe able da a.
• KNN: Mode a e accu acy (70.4%), in ui i e bu sensi i e o ea u e scaling and la ge da ase s.
• Ensemble: S ong accu acy (84.2%), le e ages all models’ s eng hs, hough compu a ionally
hea ie .
ensemble’s supe io i y sugges s combining deep ea u e ex ac ion wi h di e se classi ie s enhances obus ness,
hough ade-o s in complexi y a ise.
Figu e 3 Ba Cha o Model Accu acies
6. Conclusion and Fu u e Wo k
This esea ch highligh s how in eg a ing deep lea ning wi h ensemble-based machine lea ning echniques can
success ully p og ess he accu acy o plan disease in en ion. The p oposed ensemble model achie ed 94.1% accu acy,
ou pe o ming indi idual classi ie s. Fu u e wo k will ocus on:
• In eg a ing A en ion Mechanisms: To imp o e ea u e selec ion.
• De eloping a Mobile Applica ion: Fo eal- ime disease de ec ion.
• Expanding Da ase Di e si y: Inco po a ing eal-wo ld ield images.
• Edge AI Deploymen : Deploying he model di ec ly on edge de ices enables eal- ime disease.
This esea ch u ilizes ad anced AI me hods o suppo he c ea ion o sma ag icul u al sys ems ha help a me s
de ec diseases ea ly and manage hei c ops mo e e ec i ely.
Wo ld Jou nal o Ad anced Resea ch and Re iews, 2025, 27(02), 1598-1604
1604
Compliance wi h e hical s anda ds
Disclosu e o con lic o in e es
No con lic o in e es o be disclosed.
Re e ences
[1] Gup a, P., & Jadon, R. S. "Plan Disease De ec ion using Machine Lea ning Models." Madha Ins i u e o Technology
& Science, 2023.
[2] Singh, D., Jain, N., Jain, P., Kayal, P., Kumawa , S., & Ba a, N. "Plan Disease De ec ion using Machine Lea ning o
he IoT-enabled Ag icul u e Sys em." In P oceedings o he 2020 IEEE In e na ional Con e ence on Machine
Lea ning and Da a Science (ICMLDS), pp. 79-84. IEEE, 2020.
[3] Mohan y, S. P., Hughes, D. P., & Sala hé, M. "Using Deep Lea ning o Image-Based Plan Disease De ec ion."
F on ie s in Plan Science, ol. 7, no. 1419, 2016.
[4] Fe en inos, K. P. "Deep Lea ning Models o Plan Disease De ec ion and Diagnosis." Compu e s and Elec onics
in Ag icul u e, ol. 145, pp. 311-318, 2018.
[5] Abbas, S., Jain, S., & Tayal, D. K. "Plan Disease De ec ion Using Machine Lea ning Models: A Su ey." In 2021
In e na ional Con e ence on Ad ances in Compu ing, Communica ion, and Applied In o ma ics (ACCAI), pp. 123-
130. IEEE, 2021.
[6] Too, E. C., Yujian, L., Njuki, S., & Yingchun, L. "A Compa a i e S udy o Fine-Tuning Deep Lea ning Models o Plan
Disease Iden i ica ion." Compu e s and Elec onics in Ag icul u e, ol. 161, pp. 272-279, 2019 .
[7] Liu, J., & Wang, X., "Ea ly ecogni ion o oma o g ay lea spo disease based on MobileNe 2 − YOLO 3 model,"
Plan Me hods, ol. 16, no. 83, 2020.
[8] Kamal, K. C., e al., "Dep hwise sepa able con olu ion a chi ec u es o plan disease classi ica ion," Compu e s
and Elec onics in Ag icul u e, ol. 165, 2019.
[9] Egamamidi Rishika Reddy, Sai Du ga Sa u i, Meda a apu Ha shini, and Subhani Shaik,” Rose Plan Lea Disease
Recogni ion Using Machine Lea ning Me hodologies”, Asian Jou nal o Resea ch in Compu e Science, Volume 17,
Issue 11, Page 65-72, June 2024.
[10] Subhani Shaik, V Kakulapa i, Saadiq, On ela Sanjay, and K ishna Reddy,” Real-Time Th ea De ec ion Using he
Yolo Ve sion-4 Algo i hm”, Ac a Scien i ic Compu e Sciences, Volume 5, Issue 5, May 2023.