Engineering and Technology Journal e-ISSN: 2456-3358
Volume 10 Issue 11 November-2025, Page No.-7939-7946
DOI: 10.47191/etj/v10i11.25, I.F. – 8.482
© 2025, ETJ
When Students Asked ChatGPT Instead of Me: Investigating Generative AI in Programming Education Through NLP and Pedagogical Analytics
Milena Nikolić1, Marina Marjanović2
1,2The Academy of Applied Technical and Preschool Studies, Singidunum University
ABSTRACT: This study investigates the growing dependence of students on popular generative artificial intelligence tools such as ChatGPT, GitHub Copilot, Jupyter AI, Google Bard, and more for coding assignments, project development, exam preparation, and conceptual learning in computer science higher education. Based on empirical data from various undergraduate and graduate-level programming courses at the Academy of Applied Technical and Preschool Studies in Serbia, the research applies natural language processing (NLP) and builds a machine learning model to examine student engagement and predict potential educational outcomes. A comprehensive Python-based framework integrates CodeBERT for semantic similarity and plagiarism detection, TF-IDF with cosine similarity for benchmark comparisons, XGBoost for rubric-based classification, and DBSCAN clustering for code anomaly detection. Sentiment analysis further captures student attitudes toward frequent AI use. Rather than limiting the use of such approaches, this paper introduces a scalable solution for AI-aware assessment and curriculum design, encouraging responsible and ethical usage of modern generative technologies. The results support an innovative and future-ready model of education in the era of artificial intelligence.
KEYWORDS: Intelligent Systems, Programming Education, Pedagogical Analytics, NLP, Data Science, CodeBERT, XGBoost, Sentiment Analysis, DBSCAN Clustering
I. INTRODUCTION
Generative artificial intelligence (GAI) has quickly become a transformative force in programming education. Platforms including ChatGPT, GitHub Copilot, Jupyter AI, Google Bard, and OpenAI Codex are widely embedded in student practice, supporting various coursework, project development, exam preparation, and overall conceptual understanding [1]. Their availability has reshaped traditional help-seeking pathways, with students increasingly turning to AI systems rather than instructors or peers. Learners adopt distinct prompting and interaction strategies such as repeat-edit, scaffolding, copy-paste, and exploratory prompting, which illustrate the varied ways AI is used in programming contexts [2].
When carefully integrated into the coursework, GAI can improve assignment completion rates, enhance correctness, and strengthen general computational thinking, while also encouraging student motivation and confidence. However, unguided use risks fostering superficial learning, weakening debugging ability, and raising academic integrity concerns, while strongly influencing how learners approach reflection and cognitive skills [3]. This duality underscores the urgent need for pedagogical strategies that balance the efficiency of AI with the adoption of deeper problem-solving skills.
Alongside these considerations, research in computer and engineering education suggests that GAI can easily become a constructive learning companion when its use is directed toward objective reasoning. Embedding models like ChatGPT into programming assignments promotes higher-order thinking when students are guided to treat AI outputs as objects of analysis and evaluation instead of ready-made solutions [4]. This perspective encourages the idea that the incorporation of AI into programming education should be grounded in pedagogical practices that cover reasoning, design decisions, and ethical engagement, ensuring that efficiency is balanced with cognitive development.
Educational researchers increasingly argue for rethinking pedagogy and student assessment approaches. Rather than focusing only on the code output, assignments are expected to emphasize key reasoning processes, design choices, and reflective engagement with AI-assisted solutions. Analytical techniques that combine natural language processing with learning analytics are applied to cluster interaction patterns, analyze student prompts, and simulate behaviors to provide monitoring, personalization, and ethical oversight [5]. These insights point toward a broader reconfiguration of computer science education where AI is neither uncritically embraced nor prohibited but instead integrated in structured ways that maximize advantages while mitigating risks.
Building on the foundation of this work, the present study examines the integration of GAI in programming education through empirical data collected from the last two years of teaching courses at the Academy of Applied Technical and Preschool Studies in Serbia. A Python-based framework is proposed that leverages natural language processing and machine learning techniques, specifically CodeBERT, TF-IDF with cosine similarity, followed by XGBoost and DBSCAN, to detect reliance patterns, predict student outcomes through the semester, and guide the implementation of AI-aware curricula. By combining technical modeling with pedagogical reflection, this work contributes to the development of an ethical and scalable system for computer science education in the era of artificial intelligence.
II. LITERATURE REVIEW
Recent studies confirm that students all over the world have rapidly adopted GAI tools in programming courses, often preferring them over traditional sources of help like instructors or office hours [6]. Distinct prompt-use clusters have been identified, illustrating that learners employ varied interaction styles when engaging with systems like ChatGPT and GitHub Copilot [7]. For example, some students rely on rapid-fire prompting to obtain instant solutions, while others use iterative refinement to gradually improve code quality or depend heavily on AI for debugging assistance. These behaviors indicate that GAI is influencing not only the speed of task completion but also the depth of conceptual learning.
The research base between 2022 and 2025 documents opportunities and risks together. Positive outcomes include measurable improvements in task completion, assignment correctness, and computational thinking when GAI is used under guided conditions [8]-[9]. Moreover, several studies report benefits in motivation, self-efficacy, and retention, with retention increases of up to 25 percent, repeated mistakes decreasing by 30 percent, and student satisfaction gains of 20 percent [10]-[11]. In benchmarking studies, GPT-4 achieved program repair rates of 88 percent and strong performance in generating relevant explanations, closely approaching the proficiency of a human tutor [12]. Similarly, CodeBERT and OpenAI Codex were employed to generate programming exercises and explanations that students rated highly for novelty, readiness, and usefulness [13].
Despite these gains, risks remain significant. Unguided or heavy dependence on AI has been primarily associated with shallow learning gains [14]. Similar concerns are raised in broader discussions of AI in education, which caution that students may rely on surface-level outputs at the expense of deeper engagement with problem-solving processes and disciplinary knowledge. Essential ethical challenges like plagiarism, fairness, and bias are widely documented as well, with at least five studies identifying academic integrity as a central issue [15]-[16]. In addition, privacy, transparency, and inclusivity remain unresolved, raising equity concerns around who secures advantages from these tools.
Pedagogical adaptations are also increasingly recognized as crucial. Research indicates that traditional summative assessment models are inadequate in contexts where AI solutions are readily available. Instead, scholars recommend designing assignments that emphasize reasoning, problem decomposition, and reflective evaluation of AI outputs [17]. Many classroom studies confirm that supervised integration produces more positive learning outcomes than unguided inclusion. Evidence further suggests that students benefit most when AI use is explicitly framed as a learning aid rather than a substitute for problem-solving. As a result, the role of instructors is shifting from being the main source of answers to becoming facilitators who teach prompt literacy, critical evaluation of model outputs, and higher-order design skills.
To encourage these pedagogical transformations, NLP and educational data analytics have been increasingly applied. Clustering analyses of student prompts reveal consistent interaction patterns that can be linked to learning behaviors. Simulation frameworks like CoderAgent demonstrate how synthetic learners can be utilized to explore personalization and adaptive scaffolding. Large-scale analytics pipelines at the institutional level have been deployed as well to detect bottlenecks in student progress and measure pre- and post-utilization effects of AI use [18]. A detailed overview of these studies, along with models used, tasks assigned, datasets, and performance highlights, is presented in Table I.
Overall, the literature shows that GAI can act as a catalyst for improved programming education by enhancing feedback and supporting personalization. However, critical limitations remain. Most studies are strongly restricted to single programming courses, short timeframes, or narrow institutional contexts. There is also not enough evidence of conceptual learning and skill transfer, and consistency across evaluation metrics is low. Addressing these problems is necessary to ensure that AI integration improves short-term performance and sustains long-term learning outcomes. The present study responds to this gap by combining natural language processing, machine learning, and real classroom data to build a replicable and pedagogically informed model for AI-aware assessment and curriculum design.
Table I. A brief overview of previous findings related to GAI in programming education.
III. METHODOLOGY
A. Data Sources
As already mentioned, the empirical data for this study was collected from three undergraduate and graduate-level courses taught at the Academy of Applied Technical and Preschool Studies in the city of Niš, Serbia. The courses were Fundamentals of Programming, Software Engineering, and Big Data Analytics. Each course spanned twelve weeks and required student submissions at regular intervals [19]-[21].
In the Fundamentals of Programming course, approximately 120 assignments were submitted every two weeks. These tasks emphasized foundational C programming concepts and direct applications from class material and real-world exercises. Examples included writing functions to simulate banking transactions, generating statistics from student records, and managing file operations.
In the Software Engineering course, 60 Java assignments were submitted once per week. These projects introduced object-oriented design concepts, with exercises such as developing class hierarchies for e-commerce, implementing scheduling systems, or designing modules for a simple management app.
In Big Data Analytics, final-year students submitted 30 Python assignments every week. The assigned tasks focused on data preprocessing, analysis, and visualization methods. Representative examples covered parsing large CSV files of hotel and Twitter data, implementing sentiment analysis on text corpora, and generating dashboards of observations.
All assignments were designed to cover two categories of exercises. The first category consisted of direct coding tasks aligned with lecture topics. For example, students were asked to write recursive functions in C to compute factorials or Fibonacci numbers, to implement Python scripts for basic statistical calculations such as mean, median, and variance, and to design object-oriented class hierarchies in Java that applied design patterns to model entities such as students, courses, or bank accounts. The second category emphasized tasks that simulated common real-life scenarios, requiring students to transfer their knowledge and skills into practical, contextualized solutions. In this group, students developed C programs to manage a library system with borrowing and returning functionalities, built console Java applications for ride-sharing platforms that incorporated design patterns such as Singleton for global configuration, Factory Method for generating vehicle or driver objects, and Observer for updating ride status, and wrote Python programs to parse social media data and perform sentiment analysis. Other assignments involved designing systems that automatically generate timetables for exam registration and implementing data visualization dashboards in Python to display trends in booking specific hotels using public datasets.
In total, almost 4,500 student submissions were collected across courses, providing a rich and diverse collection that captures meaningful variations in coding styles, evolving problem-solving strategies, and dynamic interactions with generative AI tools observed over time.
| Study | Models/Tools Used | Tasks Addressed | Dataset / Context | Key Findings |
|---|---|---|---|---|
| Prather et al. (2023) | GPT-3.5, GPT-4, GitHub Copilot | Code generation, interpretation, teaching material creation | Undergraduate / Global | Highlighted opportunities and risks; GPT-4 achieved 51.5% avg. score; concerns about overreliance and misconduct |
| Xie (2024) | ChatGPT | Assignment completion, correctness, learning outcomes | Introductory Java / University | Guided use improved assignment completion rates and correctness |
| Cambaz & Zhang (2024) | Codex, GPT-3/3.5/4, Copilot | Code generation, tutoring, feedback | Introductory Python / Undergraduate | Identified performance variability; need for scaffolding and monitoring |
| Mboya et al. (2025) | GPT-3, GPT-4, CodeGeex | Personalized learning, feedback, tutoring | Universities / Kenya | Reported 25% retention gain, 30% fewer mistakes, 20% higher satisfaction |
| Boguslawski et al. (2024) | ChatGPT, LLMs | Motivation, debugging, complex projects | Undergraduate / Graduate / Germany | 77% frequent use; improved autonomy and competence; risks of uncritical adoption |
| Phung et al. (2023) | ChatGPT 3.5/4 | Program repair, hints, explanations | Introductory Python | GPT-4 achieved 88% program repair, 84% explanation, close to human tutor |
| Sarsa et al. (2022) | Codex, GPT-3, CodeBERT | Exercise generation, explanations | Introductory programming / University | Students rated exercises as 75% sensibleness, 81.8% novelty, 76.7% readiness |
B. Data Preprocessing
All code submissions were standardized at the outset to ensure consistency across the dataset. This procedure involved unifying indentation styles, removing extraneous whitespace, and correcting encoding inconsistencies where applicable. Submissions that failed to compile or included incomplete fragments were flagged and retained separately to avoid skewing semantic or structural analysis. Likewise, broken, corrupted, or otherwise invalid submissions were cleared at this stage, so they did not participate in clustering.
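The normalization step described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the study's actual code: the function name `normalize_source`, the UTF-8 decoding policy, and the four-space tab width are all illustrative choices that the paper does not specify.

```python
import unicodedata


def normalize_source(raw: bytes, tab_width: int = 4) -> str:
    """Return a whitespace- and encoding-normalized copy of one submission."""
    # Assumption: decode as UTF-8 and replace undecodable bytes instead of failing.
    text = raw.decode("utf-8", errors="replace")
    # Drop a leading byte-order mark and unify the Unicode representation.
    text = unicodedata.normalize("NFC", text.lstrip("\ufeff"))
    # Unify line endings, expand tabs to spaces, and strip trailing whitespace.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    lines = [line.expandtabs(tab_width).rstrip() for line in lines]
    # Remove blank lines at the start and end of the file.
    while lines and not lines[0]:
        lines.pop(0)
    while lines and not lines[-1]:
        lines.pop()
    return "\n".join(lines) + "\n"
```

A routine of this kind only touches layout and encoding, so two submissions that differ solely in tabs, trailing spaces, or line endings end up identical before any similarity analysis.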
Following the described normalization, tokenization and parsing were applied using efficient language-specific tools. For example, Java code was processed with ANTLR while Python files were treated using the built-in tokenize module. This allowed main identifiers, keywords, and operators to be extracted into structured token sequences that could later be mapped to numerical feature spaces.
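For the Python submissions, extracting identifiers, keywords, and operators with the standard-library `tokenize` module can be sketched as follows. The function name `token_sequence` and the three category labels are hypothetical; the paper does not state how its token sequences were encoded.

```python
import io
import keyword
import tokenize


def token_sequence(source: str) -> list:
    """Map Python source to (category, text) pairs, skipping layout tokens."""
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
            tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER}
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        if tok.type == tokenize.NAME:
            # The tokenizer reports keywords and identifiers both as NAME.
            category = "KEYWORD" if keyword.iskeyword(tok.string) else "IDENTIFIER"
        elif tok.type == tokenize.OP:
            category = "OPERATOR"
        else:
            # Literals keep the tokenizer's own name, e.g. NUMBER or STRING.
            category = tokenize.tok_name[tok.type]
        result.append((category, tok.string))
    return result
```

Comments are skipped here because the text treats them separately as natural language; a submission that does not tokenize raises an exception, which is one way the flagging of broken files could be detected.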
Natural language components, such as student comments embedded in the code and reflective notes submitted with assignments, were carefully preprocessed as well. Standard techniques were applied, such as lowercasing, punctuation removal, stopword filtering, and lemmatization, to create a clean textual representation [22]. This step guaranteed that natural language processing elements could be meaningfully aligned with programming constructs during analysis.
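The four text-cleaning steps can be illustrated with a small standard-library sketch. Two parts are deliberate simplifications and should be read as assumptions: the stopword list is a tiny hand-picked sample, and `crude_lemma` is naive suffix stripping that merely stands in for a real lemmatizer, which the paper does not name.

```python
import string

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "to", "of",
             "and", "in", "i", "it", "this"}


def crude_lemma(word: str) -> str:
    """Very rough suffix stripping, standing in for a real lemmatizer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> list:
    """Lowercase, remove punctuation, filter stopwords, then reduce word forms."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    tokens = [t for t in no_punct.split() if t not in STOPWORDS]
    return [crude_lemma(t) for t in tokens]
```

For instance, `preprocess("I used ChatGPT to fix the loops, and it worked!")` returns `['used', 'chatgpt', 'fix', 'loop', 'work']`.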
The feature extraction process combined two complementary strategies. TF-IDF vectorization was used to capture lexical distributions within comments, and CodeBERT embeddings provided representations of source code and accompanying text. This approach supported analysis of submissions at the syntactic and semantic levels, improving the model's ability to detect plagiarism, similarity, or conceptual overlap [23].
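To make the lexical half of this comparison concrete, the sketch below computes TF-IDF weights and cosine similarity in plain Python. It is an illustration, not the study's implementation: the smoothed IDF formula is one common variant chosen here as an assumption, and the helper names `tfidf_vectors` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # Smoothed IDF, ln((1 + N) / (1 + df)) + 1, keeps weights positive.
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

Two submissions with identical comment vocabulary score 1.0 and two with no shared terms score 0.0, which is why a purely lexical measure is useful as a baseline but cannot see semantically equivalent code written with different words.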
Finally, metadata related to submissions was encoded as numerical features, including variables such as submission frequency, time intervals between assignments, and code length. By incorporating temporal and behavioral features, the dataset was enhanced with information that revealed engagement patterns and possible overreliance on AI tools.
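One possible encoding of these behavioral variables is sketched below. The feature names, the ISO 8601 timestamp format, and the choice of lines of code as the length measure are assumptions made for illustration; the function expects at least one submission per student.

```python
from datetime import datetime
from statistics import mean


def metadata_features(timestamps, sources):
    """Encode one student's submissions as numerical behavioral features."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    # Gaps between consecutive submissions, in hours.
    gaps_h = [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]
    # Observation window in weeks, floored at one week to avoid tiny divisors.
    span_weeks = max((times[-1] - times[0]).days / 7, 1.0)
    return {
        "submissions_per_week": len(times) / span_weeks,
        "mean_gap_hours": mean(gaps_h) if gaps_h else 0.0,
        "mean_code_lines": mean(len(s.splitlines()) for s in sources),
    }
```

Features of this kind carry no information about the code itself, which is what lets them act as an independent behavioral signal next to the lexical and semantic representations.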
After preprocessing was performed, the raw submissions were transformed into vectors, embeddings, and metadata, ensuring that both code syntax and semantic meaning were preserved for upcoming machine learning and NLP tasks. Table II presents the final number of records retained for courses after all preprocessing techniques were applied.
Table II. The number of records before and after preprocessing.
| Course | Initial Records | Final Records |
|---|---|---|
| Fundamentals of Programming (C) | 2,323 | 2,103 |
| Software Engineering (Java) | 1,413 | 1,297 |
| Big Data Analytics (Python) | 697 | 652 |
Figure I. Hybrid model architecture combining XGBoost classification, DBSCAN clustering, and sentiment analysis.
C. Model Selection
The proposed model architecture incorporates several complementary components that collectively address key structural, semantic, and behavioral aspects of programming activities among students. The feature space was built from three dimensions: CodeBERT embeddings provided deeper contextual representations of source code and comments, TF-IDF vectors captured lexical patterns in natural language segments, and metadata features like submission frequency, posting times, and code length added a behavioral layer. These diverse inputs were concatenated into a unified high-dimensional representation, shown in Figure I, enabling the system to capture lexical patterns, semantic context, and behavioral signals within a single analytical framework.
For supervised learning, XGBoost was selected as the primary classification model. Its gradient-boosted decision trees are highly effective for structured educational data, and the model is known for strong predictive accuracy, robustness against overfitting, and the ability to produce interpretable feature importance scores. Within this study, XGBoost was utilized to classify submissions against rubric-aligned criteria, identify potential overreliance on generative AI, and predict student outcomes across assignments [24].
For unsupervised learning, DBSCAN was adopted owing to its effectiveness in uncovering irregular cluster structures and detecting anomalies in noisy student submission datasets [25]. Unlike k-means or other centroid-based methods, DBSCAN does not require predefining the number of clusters, and it is well suited to detecting behavioral outliers, such as sudden spikes in activity that may indicate excessive AI tool usage.
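The density-based idea behind DBSCAN can be shown with a compact pure-Python version. This is a teaching sketch, not the library implementation used in the study: it uses Euclidean distance and a quadratic-time neighbor search, and the function name `dbscan` and its argument names are illustrative.

```python
import math


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels
```

No cluster count is passed in: the number of groups follows from `eps` and `min_samples`, and any point that is not density-reachable from a core point keeps the label -1, which is the property that makes the method usable for anomaly flagging.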
D. Model Training and Evaluation
Model training and evaluation were carried out in Python using a fusion of Scikit-learn (GridSearchCV, StratifiedKFold), HuggingFace Transformers (AutoModel, AutoTokenizer with pretrained CodeBERT embeddings), and the XGBoost library, enabling the powerful combination of traditional machine learning models and transformer-based embeddings. The processed student submissions were presented as matrices integrating lexical, semantic, and behavioral insights. This consolidated representation formed the input layer for both supervised and unsupervised learning modules.
To ensure reliable and generalizable model performance, a stratified cross-validation strategy was applied, balancing submissions across all assignment categories and difficulty levels. Data partitioning was designed to preserve temporal consistency, preventing information leakage between weeks and confirming that the evaluation mirrored real classroom dynamics. Hyperparameter tuning of the XGBoost model was performed through a grid search, optimizing key parameters such as learning rate, maximum tree depth, and the number of estimators. Early stopping mechanisms were integrated into the training cycle to mitigate overfitting and preserve model generalizability. For DBSCAN clustering, the epsilon radius and minimum sample threshold were adjusted based on empirical testing, combining silhouette scores with domain expertise on student coding behaviors to distinguish meaningful patterns from noise. DBSCAN was suitable in this context as it identifies clusters of arbitrary shape and labels low-density points as anomalies, making it effective for capturing irregular and atypical coding patterns.
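What stratification means for the folds can be illustrated without any library. The sketch below deals the indices of each class round-robin into k folds; it covers only the stratification aspect, not the temporal constraints or the grid search, and the function name `stratified_folds` and the fixed seed are assumptions for the example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds that each keep the overall label mix."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_label.values():
        rng.shuffle(indices)
        # Deal each class round-robin, so per-class counts differ by at most
        # one between any two folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds
```

Each fold then serves once as the held-out set while the remaining folds are used for training, so a rare class is never absent from a validation split by accident.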
The training phase thus merged the expressive strengths of transformer-derived embeddings with the interpretability offered by gradient boosting. This hybrid approach allowed the classifier to detect plagiarism-like similarities, identify evidence of AI-assisted code generation, and predict grading outcomes, while the clustering exposed latent behavioral structures and anomalies in submission patterns.
Model evaluation involved quantitative and qualitative analyses. For the supervised classification, performance was assessed using accuracy, precision, recall, and F1-scores to provide a balanced view of predictive capability. Clustering validity was examined through silhouette coefficients as well as manual inspection of cluster cohesion and separation.
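The four classification metrics follow directly from the counts of true positives, false positives, and false negatives. The sketch below shows the binary case as an illustration; the function name `binary_metrics` and the convention of returning 0.0 for an undefined ratio are assumptions, and the paper does not state how multi-class scores were averaged.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Precision: share of predicted positives that are correct.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of actual positives that were found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Reporting precision and recall alongside accuracy matters when one class, such as suspected AI-assisted submissions, is much rarer than the other, since accuracy alone can look high while most positives are missed.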
IV. EXPERIMENTAL RESULTS
The experimental evaluation was designed to assess the effectiveness of the proposed architecture in capturing both semantic and behavioral properties of student submissions. Results are reported through classification, clustering, and sentiment analysis, providing a comprehensive overview of student engagement with different coding assignments.
The initial runs of the XGBoost classifier did not produce particularly strong results, with accuracy and recall values fluctuating below 80 percent. However, after fine-tuning of hyperparameters and optimization of feature integration, the model achieved stronger predictive performance across all courses. Using stratified cross-validation, the average assignment accuracy reached 91.4 percent, with precision at 0.900, recall at 0.895, and F1-scores averaging 0.897 over multiple runs. Performance was slightly higher in the Fundamentals of Programming and Software Engineering courses, where assignments were notably more structured and guided by predefined grading criteria. For example, F1-scores reached 0.921 in Java projects that integrated design patterns like Singleton and Observer, reflecting the model's ability to capture learning outcomes. In contrast, Big Data Analytics showed somewhat lower results, with accuracy averaging 88.3 percent and F1-scores around 0.865, due to the open-ended nature of Python assignments and diverse coding and analytical strategies among students. These findings suggest opportunities for continued refinement of the model and its adaptation to more instructional contexts. Table III presents a summary of evaluation metrics, showing that the classifier achieved high accuracy and balanced performance across courses while still noting challenges in less constrained tasks.
Figure II. DBSCAN clustering of student submissions.
Table III. Model performance metrics across courses.
| Course | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|
| Fundamentals of Programming (C) | 92.5% | 0.910 | 0.900 | 0.906 |
| Software Engineering (Java) | 93.1% | 0.920 | 0.921 | 0.921 |
| Big Data Analytics (Python) | 88.3% | 0.870 | 0.861 | 0.865 |
| Average (All courses) | 91.4% | 0.900 | 0.895 | 0.897 |
Feature importance analysis revealed that the CodeBERT embeddings were the strongest predictors, highlighting that semantic patterns in code and comments directly influenced classification outcomes. The way students wrote, structured, and explained submission codes carried significant weight in distinguishing authentic work from AI-assisted submissions. Behavioral metadata, including submission timestamps and assignment complexity, proved to be the next most relevant contributors. Unusually fast completions on complex tasks often signaled possible reliance on AI tools, while irregular submission intervals pointed to inconsistent engagement.
DBSCAN clustering revealed distinct behavioral groups among students, as illustrated in Figure II. Approximately 12 percent of submissions were flagged as anomalous (shown in gray), usually defined by high similarity to AI-generated templates, excessive resubmissions, or short completion times inconsistent with assignment length and complexity. The silhouette score reached 0.67, indicating meaningful cluster separation, as confirmed by manual inspection.
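For readers unfamiliar with the silhouette coefficient, the sketch below computes it from scratch: for each point, a is the mean distance to its own cluster and b is the smallest mean distance to another cluster, and the score is (b - a) / max(a, b). Excluding noise points (label -1) from the average is an assumption made here, since the paper does not say how anomalies were treated; the function also assumes at least two clusters and Euclidean distance.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette over clustered points; noise (label -1) is skipped."""
    clusters = {}
    for point, label in zip(points, labels):
        if label != -1:
            clusters.setdefault(label, []).append(point)
    scores = []
    for label, members in clusters.items():
        for p in members:
            if len(members) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster.
            a = sum(math.dist(p, q) for q in members) / (len(members) - 1)
            # b: smallest mean distance to the members of any other cluster.
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for key, other in clusters.items() if key != label)
            denom = max(a, b)
            scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Values close to 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points assigned to the wrong group.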
The clustering process produced three main groups based on semantic embeddings, lexical features, and behavioral metadata. The first cluster (colored green on the graph) involved students with consistent and authentic implementation styles, where submission frequency and code length aligned with expected patterns. The second cluster (colored orange) contained students who showed irregular engagement, such as rapid completions of complex tasks or uneven submission intervals, suggesting intermittent adoption of AI utilities. The third cluster (colored red), which was proportionally small, consisted of students producing unusually long and well-structured code early in the semester, pointing to possible external help or heavy reliance on AI-generated solutions.
Course-specific differences were evident across clusters. In Fundamentals of Programming assignments, anomalies reflected copy-paste behaviors in repetitive C exercises. In the Software Engineering course, flagged anomalies centered on Java design pattern implementations closely resembling AI-generated snippets. In Big Data Analytics projects, clusters exposed dependencies on external tutorials and pretrained libraries, particularly in climate and hotel data visualization, followed by sentiment analysis tasks on Twitter posts.
V. CONCLUSION
This study investigated the increasing reliance of students on generative artificial intelligence tools such as ChatGPT, GitHub Copilot, Jupyter AI, and Google Bard in programming education. Using data collected from three undergraduate and graduate-level programming courses at the Academy of Applied Technical and Preschool Studies in Serbia, we built an analytical framework that incorporated natural language processing, supervised and unsupervised machine learning, and pedagogical analytics. The system combined CodeBERT for semantic similarity and plagiarism detection, TF-IDF for deep lexical analysis, XGBoost for rubric-based classification, DBSCAN for anomaly detection, and sentiment analysis for interpreting reflective notes.
The results demonstrated that the model achieved strong predictive accuracy, distinguishing authentic work from AI-assisted submissions and uncovering irregular behaviors related to notable overreliance on generative tools. DBSCAN clustering revealed three distinct behavioral groups, where anomalies reflected copy-paste practices in C programming, AI-driven design pattern replication in Java, and reliance on external libraries in Python analytics. Moreover, sentiment analysis and student reflections further highlighted positive attitudes toward AI guidance and critical concerns related to fairness, critical thinking, and sustainable learning.
Overall, the hybrid approach effectively merged semantic embeddings, lexical features, and behavioral metadata to capture authentic engagement, anomalous patterns, and key risks of dependency. This combination of supervised and unsupervised techniques showcases the benefits of aligning advanced analytics with pedagogical reflection, offering a more dependable model for evaluating student learning in AI-powered contexts.
While these results strongly indicate that generative AI is transforming programming education, further research is required to strengthen and generalize the framework. Multi-institutional studies might be helpful to validate applications beyond a single academic setting, and longitudinal research is needed to determine whether this type of learning fosters long-lasting knowledge acquisition. Developing a uniform evaluation criterion for AI-assisted learning could further enhance consistency and comparability across institutions. Furthermore, incorporating explainable AI methods such as SHAP (used for global feature importance) and LIME (used to provide local explanations of individual predictions) would enhance transparency, giving instructors better insights into model decisions and enabling more actionable feedback.
Taken together, this study demonstrates that prohibiting modern generative AI tools is neither feasible nor beneficial for education. Instead, a more balanced approach can be achieved through integration supported by guided feedback, clear ethical policies, and continuous monitoring. By uniting technical analysis with reflective pedagogy, the proposed solution highlights opportunities and risks of generative AI, pointing toward a future-ready model of computer science education that fosters meaningful learning while equipping students for an AI-driven professional environment.
REFERENCES
1. H. Güner and E. Er, "AI in the classroom: Exploring students' interaction with ChatGPT in programming learning," Educ. Inf. Tech., vol. 30, pp. 12681–12707, 2025, doi: 10.1007/s10639-025-13337-7.
2. K. Fuchs, "Exploring the opportunities and challenges of NLP models in higher education: Is ChatGPT a blessing or a curse?," Front. Educ., vol. 8, p. 1166682, May 2023, Frontiers Media SA, doi: 10.3389/feduc.2023.1166682.
3. J. Beltrán and E. Veiga-Zarza, "Evaluating the use of large language models in programming courses: a comparative study," EDULEARN Proc., vol. 1, pp. 1761–1768, 2025, doi: 10.21125/edulearn.2025.0530.
4. A. Konak and C. J. S. F. Clarke, "Augmenting critical thinking skills in programming education through leveraging ChatGPT: Analysis of its opportunities and consequences," in Proc. 2023 Fall Mid Atlantic Conf.: Meeting Our Students Where They Are and Getting Them Where They Need to Be, Ewing, NJ, USA, Oct. 2023, doi: 10.18260/1-2--45117.
5. S. Folvarochna, Using learning analytics to identify student challenges in programming education. B.Sc. thesis, Dept. Comput. Sci. Inf. Technol., Fac. Appl. Sci., Ukrainian Catholic Univ., Lviv, Ukraine, 2025.
6. G. Fenu, R. Galici, M. Marras, and D. Reforgiato, "Exploring student interactions with AI in programming training," in Adjunct Proc. 32nd ACM Conf. User Modeling, Adaptation and Personalization, New York, NY, USA: Assoc. Comput. Mach., 2024, pp. 555–560, doi: 10.1145/3631700.3665227.
7. B. Ma, L. Chen, and S. Konomi, "Exploring student perception and interaction using ChatGPT in programming education," in Proc. 21st Int. Conf. Cogn. Explor. Learn. Digital Age (CELDA), 2024, doi: 10.33965/celda2024_202408l005.
8. J. Xie, "Improving introductory Java programming education through ChatGPT," J. Comput. Sci. Coll., vol. 40, no. 3, pp. 140–150, Oct. 2024.
9. R. Yilmaz and F. G. K. Yilmaz, "The effect of generative artificial intelligence (AI)-based tool use on students' computational thinking skills, programming self-efficacy and motivation," Comput. Educ.: Artif. Intell., vol. 4, p. 100147, 2023, doi: 10.1016/j.caeai.2023.100147.
10. F. M. Mboya, G. M. Wambugu, A. M. Oirere, E. O. Omuya, F. M. Musyoka, and J. W. Gikandi, "Enhancing personalized learning in programming education through generative artificial intelligence frameworks: A systematic literature review," Int. J. Adv. Trends Comput. Sci. Eng., vol. 14, no. 2, pp. 514–522, 2025, doi: 10.30534/ijatcse/2025/051422025.
11. S. Boguslawski, R. Deer, and M. G. Dawson, "Programming education and learner motivation in the age of generative AI: Student and educator perspectives," Inf. Learn. Sci., vol. 126, no. 1/2, pp. 91–109, 2025, doi: 10.1108/ILS-10-2023-0163.
12. T. Phung, V. A. Pădurean, J. Cambronero, S. Gulwani, T. Kohn, R. Majumdar, and G. Soares, "Generative AI for programming education: Benchmarking ChatGPT, GPT-4, and human tutors," in Proc. 2023 ACM Conf. Int. Comput. Educ. Res. – Vol. 2, Aug. 2023, pp. 41–42, doi: 10.1145/3568812.3603476.
13. S. Sarsa, P. Denny, A. Hellas, and J. Leinonen, "Automatic generation of programming exercises and code explanations using large language models," in Proc. 2022 ACM Conf. Int. Comput. Educ. Res. – Vol. 1, Aug. 2022, pp. 27–43, doi: 10.1145/3501385.3543957.
14. S. Yazdani, M. Najimi, and M. Ahmadzadeh, "The paradox of generative AI in programming education," in EDULEARN Proc., 2025, pp. 7775–7784, doi: 10.21125/edulearn.2025.1927.
15. D. Franklin, P. Denny, D. A. Gonzalez-Maldonado, and M. Tran, Generative AI in computer science education: Challenges and opportunities. Cambridge, U.K.: Cambridge Univ. Press, 2025.
16. J. Prather, P. Denny, J. Leinonen, B. A. Becker, I. Albluwi, M. Craig, and J. Savelka, "The robots are here: Navigating the generative AI revolution in computing education," in Proc. 2023 Working Group Reports Innov. Technol. Comput. Sci. Educ., 2023, pp. 108–159, doi: 10.1145/3623762.3633499.
17. D. Cambaz and X. Zhang, "Use of AI-driven code generation models in teaching and learning programming: A systematic literature review," in Proceed. 55th ACM Techn. Sympos. Comput. Sci. Educ. vol. 1, Mar. 2024, pp. 172–178, doi: 10.1145/3626252.3630958.
18. Y. Zhan, Q. Liu, W. Gao, Z. Zhang, T. Wang, S. Shen, et al., "CoderAgent: Simulating student behavior for personalized programming learning with large language models," arXiv preprint, 2025, doi: 10.48550/arXiv.2505.20642.
19. The Academy of Applied Technical and Preschool Studies, Lecture notes on Fund. of Programming, Serbia, 2023-2024.
20. The Academy of Applied Technical and Preschool Studies, Lecture notes on Software Engineering, Serbia, 2023-2024.
21. The Academy of Applied Technical and Preschool Studies, Lecture notes on Big Data Analytics, Niš, Serbia, 2023-2024.
22. K. M. G. S. Karunarathna and R. A. H. M. Rupasingha, "Learning to use normalization techniques for preprocessing and classification of text documents," Int. J. Multidiscip. Stud., vol. 9, no. 2, pp. 69–81, 2022.
23. P. T. Nguyen, J. Di Rocco, C. Di Sipio, R. Rubei, D. Di Ruscio, and M. Di Penta, "Is this snippet written by ChatGPT? An empirical study with a CodeBERT-based classifier," arXiv preprint, 2023, doi: 10.48550/arXiv.2307.09381.
24. A. Asselman, M. Khaldi, and S. Aammou, "Enhancing the prediction of student performance based on the machine learning XGBoost algorithm," Interact. Learn. Environ., vol. 29, no. 3, pp. 3360–3379, 2021, doi: 10.1080/10494820.2021.1928235.
25. H. Du, S. Chen, H. Niu, and Y. Li, "Application of DBSCAN clustering algorithm in evaluating students' learning status," in Proc. 17th Int. Conf. Comput. Intell. Security, Chengdu, China, 2021, pp. 372–376, doi: 10.1109/CIS54983.2021.00084.