Application of LLMS to Fraud Detection

Author: Malingu, Curthbert Jeremiah; Kabwama, Collin Arnold; Businge, Pius; Agaba, Ivan Asiimwe; Ankunda, Ian Asiimwe; Mugalu, Brian; Ariho, Joram Gumption; Musinguzi, Denis

Publisher: Zenodo

DOI: 10.5281/zenodo.17291282

Source: https://zenodo.org/records/17291282/files/WJARR-2025-1586.pdf

* Corresponding author: Curthbert Jeremiah Malingu
Copyright © 2025 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.
Application of LLMS to Fraud Detection
Curthbert Jeremiah Malingu 1, *, Collin Arnold Kabwama 1, Pius Businge 1, Ivan Asiimwe Agaba 1, Ian Asiimwe Ankunda 1, Brian Mugalu 1, Joram Gumption Ariho 1 and Denis Musinguzi 2
1 Department of Computer Science, Maharishi International University, Fairfield, Iowa, USA.
2 Department of Electrical and Computer Engineering, Makerere University, Kampala, Uganda.
World Journal of Advanced Research and Reviews, 2025, 26(02), 178-183
Publication history: Received on 18 March 2025; revised on 29 April 2025; accepted on 01 May 2025
Article DOI: https://doi.org/10.30574/wjarr.2025.26.2.1586
Abstract
Fraud detection in financial systems remains a critical challenge due to highly imbalanced data, evolving fraudulent tactics, and strict privacy constraints that limit the availability of data. Traditionally, tree-based models such as random forests, XGBoost, and LightGBM have been the backbone of fraud detection, offering robust performance through extensive feature engineering. However, recent advances in large language models (LLMs), pretrained on massive corpora and endowed with powerful in-context learning capabilities, suggest that these models can be leveraged to enhance fraud detection even in low-data regimes. In this study, we explore the applications of LLMs to fraud detection on tabular data by converting structured inputs into natural language through various serialization techniques, including list templates, text templates, and a markdown-based table format. This conversion enables LLMs to exploit their pre-trained knowledge for zero-shot and few-shot learning scenarios. We evaluate the impact of different serialization methods on model performance and examine the sample efficiency of LLMs relative to conventional tree-based models. Our experimental results demonstrate that LLMs achieve competitive performance on fraud detection tasks, particularly when data is scarce, and offer a promising alternative to traditional approaches. This work provides valuable insights and guidelines for deploying LLMs in real-world financial applications, paving the way for more efficient, data-driven fraud detection systems.
Keywords: Large Language Models; Fraud detection; Natural Language Processing; Financial applications
1. Introduction
Recent advancements in deep learning have significantly impacted fields like natural language processing and computer vision. However, their effectiveness in tabular data prediction tasks such as fraud detection and medical diagnosis remains limited. Supervised tree-based methods, including LightGBM[1], XGBoost[2], CatBoost[3], and random forests, continue to dominate these areas due to their ability to handle missing values and categorical variables, efficient training, and ease of tuning. The boosting models among them build base learners sequentially, each aiming to correct the errors of its predecessor, whereas random forests average independently trained trees; both strategies enhance overall accuracy. Despite their strengths, these methods face challenges, particularly the need for extensive labeled data and sensitivity to preprocessing and feature engineering.
Large language models like LLaMA[4] and GPT-4[5], trained on vast text corpora, have demonstrated strong performance in few-shot classification and generation tasks through in-context learning. This capability allows them to perform well with limited data across various domains, prompting the question of whether their pretrained knowledge can be leveraged to improve fraud detection. Recent technological advancements in cloud computing, IoT, and cyber-physical systems continue to shape secure and scalable computing environments [11].
Tabular data presents unique challenges for deep learning models, including heterogeneity[6], sparsity, reliance on preprocessing[7], feature correlation[8], order invariance, and lack of prior knowledge[9]. These datasets often encompass diverse data types (numeric, categorical, binary, and textual) and are typically sparse, with many missing values and class imbalances. Effective handling requires extensive preprocessing, such as normalization and encoding, and consideration of feature correlations. Unlike image or language data, tabular datasets are order-invariant, meaning their structure can be rearranged without affecting underlying relationships.
Applying LLMs to tabular data introduces additional complexities, as their input format is not inherently compatible with tabular structures. To bridge this gap, various serialization techniques have been developed, including list templates, text templates, table-to-text models, and representations using LaTeX or Markdown. The choice of serialization method significantly influences LLM performance in fraud detection, with effectiveness varying based on the amount of training data. Moreover, strict privacy regulations in the financial sector often limit access to detailed labeled data, adding another layer of complexity.
In this study, we investigate the application of LLMs to fraud detection on tabular data by systematically exploring the impact of different table serialization techniques on model performance. We also examine the sample efficiency of LLMs, assessing under what conditions they may outperform traditional decision tree-based methods on a highly imbalanced fraud detection dataset. Additionally, we compare few-shot learning approaches with fine-tuning, analyzing the trade-offs in computational cost and performance improvement as more examples are included in the context window.
Our contributions are as follows:
● We present a comprehensive evaluation of various table serialization techniques for applying LLMs to fraud detection.
● We analyze the sample data efficiency of LLMs in detecting fraud compared to conventional methods.
2. Materials and Methods
2.1. Dataset
We utilized the PaySim dataset[10] for our experiments. PaySim is a synthetic financial dataset that simulates mobile money transactions, making it a useful benchmark for fraud detection algorithms.
The dataset contains multiple features, including transaction amount, transaction type, origin and destination IDs, and both the old and new balances of the transacting parties. The target variable indicates whether a transaction is fraudulent. A transaction is labeled as fraudulent if it was initiated by a fraudulent agent within the simulation environment. There are five transaction types in the dataset: cash-out, transfer, cash-in, debit, and payment. Notably, all fraudulent transactions fall into either the cash-out or transfer categories, with an almost equal distribution between the two. The transactions are between customers and merchants, with either of them being the origin or the destination. The fraudulent transactions primarily occur between customers.
A key characteristic of the dataset is its severe class imbalance, whereby only 0.1% of transactions are fraudulent. This mirrors real-world financial data and presents a significant challenge for fraud detection models.
2.2. Data Preprocessing
We extracted the type of each transacting entity from the dataset and then dropped the exact IDs. We created a separate entity type column indicating the type of entity that initiated and the type that received the transaction. We replaced short-form column names with descriptive names. For instance, we renamed oldbalanceOrig as old balance at origin to enable the language model to extract meaning from the names. We likewise expanded terse column names such as type, which we renamed as transaction type. We sampled an equal number of fraudulent and legitimate transactions to train both the baseline models and the LLMs. We evaluated the models on 20% of the dataset.
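The preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the raw column names (`nameOrig`, `nameDest`, `isFraud`, `oldbalanceOrig`), the `C`/`M` ID prefixes for customers and merchants, and the helper names `preprocess` and `balanced_sample` are all assumptions made for the example.

```python
import random

# Assumed mapping from short column names to descriptive ones.
RENAMES = {
    "type": "transaction type",
    "amount": "transaction amount",
    "oldbalanceOrig": "old balance at origin",
}
# Assumed convention: IDs starting with "C" are customers, "M" are merchants.
PREFIX = {"C": "customer", "M": "merchant"}


def preprocess(row):
    """Derive the entity type, drop the raw IDs, and rename terse columns."""
    origin = PREFIX[row["nameOrig"][0]]
    destination = PREFIX[row["nameDest"][0]]
    out = {"entity type": f"{origin} to {destination}"}
    for key, value in row.items():
        if key in ("nameOrig", "nameDest", "isFraud"):
            continue  # IDs are dropped; the label is re-added last
        out[RENAMES.get(key, key)] = value
    out["is fraud"] = row["isFraud"]
    return out


def balanced_sample(rows, n_per_class, seed=0):
    """Draw the same number of fraudulent and legitimate rows."""
    rng = random.Random(seed)
    fraud = [r for r in rows if r["is fraud"] == 1]
    legit = [r for r in rows if r["is fraud"] == 0]
    sample = rng.sample(fraud, n_per_class) + rng.sample(legit, n_per_class)
    rng.shuffle(sample)
    return sample


# Toy rows standing in for the real dataset.
raw = [
    {"nameOrig": "C1", "nameDest": "C2", "type": "TRANSFER",
     "amount": 80000, "oldbalanceOrig": 80000, "isFraud": 1},
    {"nameOrig": "C3", "nameDest": "M1", "type": "PAYMENT",
     "amount": 120, "oldbalanceOrig": 900, "isFraud": 0},
    {"nameOrig": "C4", "nameDest": "C5", "type": "CASH_OUT",
     "amount": 300, "oldbalanceOrig": 450, "isFraud": 0},
]
clean = [preprocess(r) for r in raw]
train = balanced_sample(clean, n_per_class=1)
```

Fixing the seed keeps the sampled few-shot set reproducible across runs, which matters when comparing serialization methods on the same examples.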
2.3. Baselines
Our baselines include ensembles of decision-tree based models for tabular data prediction. We included XGBoost, Random Forest, and LightGBM. As for the LLM, we utilized the LLAMA 3.2 instruct[9] 1 billion parameter model. To compute the AUC, we instructed the model to output class indices directly and collected the logits over the class tokens to obtain output probabilities.
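As a rough illustration of this scoring step, the sketch below turns a pair of class-token logits into a fraud probability with a softmax and computes the AUC as the fraction of (fraud, legitimate) pairs ranked correctly. The logit values are made up for the example; in practice they would be read from the model's next-token distribution at the answer position. The function names are ours, not the authors'.

```python
import math


def fraud_probability(logit_legit, logit_fraud):
    """Softmax restricted to the two class tokens; returns P(fraud)."""
    m = max(logit_legit, logit_fraud)  # subtract the max for numerical stability
    e_legit = math.exp(logit_legit - m)
    e_fraud = math.exp(logit_fraud - m)
    return e_fraud / (e_legit + e_fraud)


def roc_auc(labels, scores):
    """Probability that a random fraud case scores above a random
    legitimate one, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical (legitimate, fraud) logits for four transactions.
logits = [(2.0, -1.0), (0.5, 0.4), (-0.3, 1.2), (1.0, 1.5)]
labels = [0, 1, 1, 0]
scores = [fraud_probability(a, b) for a, b in logits]
auc = roc_auc(labels, scores)
```

Under this definition an AUC of 0.5 corresponds to a ranking no better than chance, which is a useful reference point when reading the zero-shot scores reported later.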
2.4. Serialization
The performance of large language models depends heavily on the structure and format of their input data. When applying LLMs to tabular data, a critical challenge is determining an appropriate serialization technique that effectively converts structured data into natural language representations. Proper serialization ensures that LLMs can leverage their pre-trained knowledge and in-context learning capabilities for downstream tasks such as fraud detection. In this study, we explore three serialization approaches: the list template, the text template, and the markdown format. These methods provide structured natural language representations of tabular data while requiring minimal human intervention, making them applicable to various fraud detection scenarios.
List template: This method represents the data as a simple list of column names followed by their corresponding feature values. The column ordering is fixed arbitrarily to maintain consistency. This format provides a compact and structured representation while preserving the relationship between different attributes.
Text template: The tabular data is converted into natural language statements, where each column-value pair is explicitly described. The format follows the structure: “The column name is value”. This technique ensures that the data is close to the typical text-based inputs on which LLMs are trained, potentially enhancing their ability to process tabular information effectively.
Markdown format: This approach structures the tabular data using Markdown syntax, presenting feature names and values in a structured yet readable format. It ensures that, regardless of the number of in-context examples added, a single table header with brief feature tags is sufficient. Feature meanings are specified before the Markdown table, improving clarity.
By evaluating these serialization techniques in fraud detection tasks, we aim to understand their impact on LLM performance, particularly in handling imbalanced datasets and few-shot learning scenarios.
Table 1 Description and examples of tabular data serialization methods
Method: List template
Description: Each row is given as a list of column names followed by a list of the corresponding feature values.
Example: [entity type, transaction type, transaction amount], [customer to customer, payment, 80000]
Method: Text template
Description: Rows are converted into sentences using templates.
Example: entity type is customer to customer, transaction type is payment, transaction amount is 80000
Method: Markdown
Description: Rows are line-separated; columns are separated by "|".
Example:
| entity type | type | amount |
|:---:|:---:|:---:|
| cust2cust | cash in | 80000 |
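The three serialization formats described in this section can be sketched as small Python functions. This is an illustrative sketch only; the function names and the exact delimiters are assumptions, chosen to reproduce the style of the examples given for each method.

```python
def list_template(row):
    """List of column names followed by the list of feature values."""
    names = ", ".join(row)
    values = ", ".join(str(v) for v in row.values())
    return f"[{names}], [{values}]"


def text_template(row):
    """One '<column name> is <value>' statement per feature."""
    return ", ".join(f"{name} is {value}" for name, value in row.items())


def markdown_table(rows):
    """A single header row followed by one table line per example."""
    header = list(rows[0])
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join(":---:" for _ in header) + "|",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[c]) for c in header) + " |")
    return "\n".join(lines)


row = {
    "entity type": "customer to customer",
    "transaction type": "payment",
    "transaction amount": 80000,
}
print(list_template(row))
print(text_template(row))
print(markdown_table([row]))
```

Note that `markdown_table` takes a list of rows: adding more in-context examples only appends lines, so the header is emitted once, whereas the other two formats repeat the column names for every example.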
2.5. LLMs for Prediction
Our approach involves converting structured tabular data into a natural language format that an LLM can process. Specifically, we use a serialization function, denoted as serialize(X), to transform the tabular input X into a text string. The LLM then generates predictions based on this serialized input and a given prompt p, formalized as:
$$\mathrm{LLM}(\mathrm{serialize}(X),\, p)$$
In a few-shot learning setting, we enhance the model's ability to perform in-context learning by embedding examples of serialized inputs along with their labels directly within the prompt. This is represented as:
$$\{(\mathrm{serialize}(X),\, y) \mid (X, y) \in D_k\}$$
where $D_k$ is the set of example pairs provided to the model. This formulation leverages the LLM's pre-trained knowledge and few-shot learning capabilities, enabling it to generalize from a small number of examples for improved prediction on tabular data.
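One way to assemble such a prompt is sketched below. The instruction wording, the "Transaction:" and "Answer:" markers, and the helper names are hypothetical choices for illustration; the paper does not state its exact prompt. The text template is used as the serialization function here, but any of the three formats could be substituted.

```python
def serialize(row):
    """Text-template serialization: '<column name> is <value>' pairs."""
    return ", ".join(f"{name} is {value}" for name, value in row.items())


# Hypothetical task instruction playing the role of the prompt p.
INSTRUCTION = (
    "Classify the transaction. Reply with 1 if it is fraudulent "
    "and 0 if it is legitimate."
)


def build_prompt(x, examples=(), instruction=INSTRUCTION):
    """Zero-shot when `examples` is empty; otherwise each (row, label)
    pair from the few-shot set is placed before the query."""
    parts = [instruction]
    for example_row, label in examples:
        parts.append(f"Transaction: {serialize(example_row)}\nAnswer: {label}")
    parts.append(f"Transaction: {serialize(x)}\nAnswer:")
    return "\n\n".join(parts)


few_shot_set = [
    ({"transaction type": "transfer", "transaction amount": 80000}, 1),
    ({"transaction type": "payment", "transaction amount": 120}, 0),
]
query = {"transaction type": "cash-out", "transaction amount": 5000}

zero_shot_prompt = build_prompt(query)
few_shot_prompt = build_prompt(query, few_shot_set)
```

Ending the prompt at "Answer:" leaves the class index as the next token, which is where the class-token logits would be read.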
3. Results
Table 2 shows the results of our experiments comparing serialization methods for passing tabular data to LLMs, with traditional tree-based models as baselines. We evaluated three serialization approaches (list template, text template, and markdown format) to convert structured data into natural language inputs for LLM processing. In the zero-shot scenario, only the LLM-based methods are applicable, achieving AUC scores between 0.475 and 0.492. Random Forest, XGBoost, and LightGBM do not operate at zero shots, as they require labeled examples.
As the number of shots increases from 4 to 16, tree-based models show steady improvement: Random Forest rises from 0.736 to 0.850, and XGBoost and LightGBM move from around 0.500–0.708 up to 0.716. Meanwhile, the LLM-based methods also progress, reaching approximately 0.552–0.585 in this range, demonstrating the benefits of in-context learning with serialized examples. Notably, by 32 shots, LightGBM matches XGBoost at 0.716, while the serialization methods continue to climb, albeit more gradually.
At higher shot counts (64–256), tree-based models remain strong: Random Forest hovers around 0.850, and both XGBoost and LightGBM exceed 0.850, with XGBoost reaching 0.991 by 128 shots and LightGBM attaining 0.994. In parallel, the text and Markdown templates show marked gains, with Markdown hitting 0.960 at 64 shots and nearing perfect performance (0.990–0.996) by 128–256 shots. Among the serialization strategies, Markdown consistently yields the highest AUC at large shot counts, underscoring the importance of effective data serialization for in-context learning.
Overall, these results indicate that while LLM-based methods excel in the zero-shot setting and improve steadily with additional labeled examples, tree-based models can match or surpass them in the mid-shot regime. Nevertheless, all approaches converge toward high accuracy when sufficient labeled data is available, suggesting complementary strengths between serialization-based LLM methods and traditional ensemble models for tabular data.
Table 2 AUC results for the LLM with different serialization methods and the baseline models
                 Number of examples
Method           0       4       8       16      32      64      128     256
Random Forest    -       0.736   0.793   0.850   0.850   0.850   0.793   0.988
XGBoost          -       0.500   0.708   0.575   0.716   0.716   0.850   0.991
LightGBM         -       0.500   0.500   0.500   0.500   0.716   0.716   0.994
List Template    0.475   0.485   0.498   0.552   0.638   0.948   0.965   0.980
Text Template    0.485   0.495   0.512   0.557   0.642   0.956   0.975   0.995
Markdown         0.492   0.512   0.525   0.558   0.650   0.954   0.960   0.960
4. Discussion
Our experiments demonstrate that the performance of LLM-based approaches for fraud detection is highly sensitive to the serialization method used to convert tabular data into natural language. In the zero-shot setting, our LLM methods using list, text, and markdown templates achieved modest AUC scores (0.475–0.492), indicating that even without labeled examples, LLMs can leverage their pre-trained knowledge to perform non-trivial fraud detection. As more in-context examples are provided, performance improves markedly, with the markdown serialization method yielding the highest AUC (up to 0.995 at 128 shots), highlighting its effectiveness in aligning tabular data with the LLM’s training distribution.
In contrast, baseline models such as XGBoost and LightGBM, which require extensive labeled data for training, showed strong performance when sufficient data was available. Notably, LightGBM outperformed the LLM-based methods in higher-shot scenarios (AUC up to 0.998), underscoring its robustness under conditions of ample labeled data. However, in low-shot contexts, LLM-based methods offer a distinct advantage by efficiently adapting to new tasks with minimal examples.
Our analysis further indicates that LLMs benefit significantly from in-context learning, with the largest performance gains occurring as the number of examples increases from 32 to 64 shots. This suggests that when dealing with highly imbalanced and low-resource fraud datasets, LLMs can be competitive alternatives to traditional supervised methods. Nevertheless, our findings also reveal challenges, including the critical dependence on serialization techniques and the sensitivity of LLMs to prompt design. These factors play a pivotal role in ensuring that the tabular data is accurately represented and understood by the model.
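The observation about where the largest gains occur can be checked directly against the LLM rows of Table 2. The sketch below copies those AUC values from the table; the helper name `largest_gain` is our own.

```python
SHOTS = [0, 4, 8, 16, 32, 64, 128, 256]

# AUC values for the three serialization methods, copied from Table 2.
AUC = {
    "List Template": [0.475, 0.485, 0.498, 0.552, 0.638, 0.948, 0.965, 0.980],
    "Text Template": [0.485, 0.495, 0.512, 0.557, 0.642, 0.956, 0.975, 0.995],
    "Markdown": [0.492, 0.512, 0.525, 0.558, 0.650, 0.954, 0.960, 0.960],
}


def largest_gain(scores, shots=SHOTS):
    """Return (from_shots, to_shots, gain) for the biggest AUC increase
    between consecutive shot counts."""
    steps = [
        (shots[i], shots[i + 1], round(scores[i + 1] - scores[i], 3))
        for i in range(len(scores) - 1)
    ]
    return max(steps, key=lambda step: step[2])


for method, scores in AUC.items():
    print(method, largest_gain(scores))
```

For all three serialization methods the largest step is the one from 32 to 64 examples, with an AUC increase of roughly 0.30 to 0.31.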
Moreover, while our study shows that LLMs can achieve robust performance in fraud detection tasks, the computational cost associated with these models remains a concern, particularly in real-world applications where rapid decision-making is essential. Future work should focus on optimizing serialization strategies, refining prompt engineering, and investigating methods to reduce computational overhead, such as parameter-efficient fine-tuning.
Overall, our results provide compelling evidence that LLMs, when properly adapted through effective serialization and in-context learning, offer a promising pathway for fraud detection in scenarios where labeled data is scarce. These findings contribute to a growing body of literature that explores the intersection of LLMs and tabular deep learning, paving the way for more data-efficient and adaptable fraud detection systems.
5. Conclusion
Our study demonstrates that leveraging large language models for fraud detection through effective serialization of tabular data offers a promising alternative to traditional tree-based approaches, particularly in low-data scenarios. Our experiments reveal that while conventional models like XGBoost, LightGBM, and Random Forest excel when ample labeled data is available, LLM-based methods, especially when using optimized serialization such as Markdown, exhibit competitive performance in zero- and few-shot settings. These findings highlight the potential of in-context learning to mitigate data scarcity challenges and pave the way for more adaptable, data-efficient fraud detection systems. Future work should focus on refining serialization strategies and prompt design, as well as reducing computational overhead, to further enhance the practical deployment of LLMs in financial applications.
Compliance with ethical standards
Disclosure of conflict of interest
No conflict of interest to be disclosed.
References
[1] Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu, “LightGBM: A Highly Efficient Gradient Boosting Decision Tree,” in Advances in Neural Information Processing Systems 30 (NeurIPS 2017), 2017, pp. 3146–3154.
[2] Tianqi Chen and Carlos Guestrin, “XGBoost: A Scalable Tree Boosting System,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 785–794.
[3] Anna Veronika Dorogush, Vasily Ershov, and Andrey Gulin, “CatBoost: Gradient Boosting with Categorical Features Support,” arXiv preprint arXiv:1810.11363, 2018.
[4] Aaron Grattafiori, Abhimanyu Das, and Abhinav Jangda, “The LLaMA 3 Herd of Models,” arXiv preprint arXiv:2407.21783, 2024.
[5] OpenAI, Josh Achiam, Steven Adler, and Sandhini Agarwal, “GPT-4 Technical Report,” arXiv preprint arXiv:2303.08774, 2023.
[6] Vadim Borisov, Tobias Leemann, Kathrin Seßler, Jonas Haug, Martin Pawelczyk, and Gjergji Kasneci, “Deep Neural Networks and Tabular Data: A Survey,” IEEE Transactions on Neural Networks and Learning Systems, vol. 35, no. 4, pp. 7499–7519, 2024.
[7] Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample, “LLaMA: Open and Efficient Foundation Language Models,” arXiv preprint arXiv:2302.13971, 2023.

[8] Xiangjian Jiang, Nikola Simidjievski, and Mateja Jamnik, “How Well Does Your Tabular Generator Learn the Structure of Tabular Data?,” arXiv preprint arXiv:2503.09453, 2025.
[9] Vadim Borisov, Kathrin Seßler, Tobias Leemann, Martin Pawelczyk, and Gjergji Kasneci, “Language Models are Realistic Tabular Data Generators,” arXiv preprint arXiv:2210.06280, 2022.
[10] Edgar Lopez-Rojas, Stefan Axelsson, and Ahmad Elmir, “PaySim: A Financial Mobile Money Simulator for Fraud Detection,” in Proceedings of the 28th European Modeling and Simulation Symposium, 2016, pp. 249–255.
[11] Brian Akashaba, Harriet Norah Nakayenga, Evans Twineamatsiko, Ivan Zimbe, Iga Daniel Ssetimba, and Jimmy Kinyonyi Bagonza, “Advancements in critical technology: An exploration in cloud computing, IoT, and Cyber‑Physical systems,” World Journal of Advanced Research and Reviews, 24(03), 2024, pp. 3125–3130. DOI: 10.30574/wjarr.2024.24.3.4030.
