International Journal of Research (IJR)
e-ISSN: 2348-6848
p-ISSN: 2348-795X
Vol. 12 Issue 11
November 2025
Received: 2 November 2025
Revised: 17 November 2025
Accepted: 27 November 2025
Copyright 2025 authors. DOI: https://doi.org/10.5281/zenodo.17737626
Artificial Intelligence Translation Approaches to
Endangered Language Preservation and Revitalization
Shahad Alaa Hamza
Abstract
As globalization accelerates, endangered languages face increasing vulnerability from
dominant world languages. This paper investigates how artificial intelligence (AI)
technologies, particularly neural machine translation (NMT), can support the preservation and
revitalization of endangered languages. The study examines AI translation technologies
including neural machine translation, transfer learning techniques, and multilingual models
that facilitate bidirectional translation between endangered and major world languages. It
highlights successful applications of AI translation in creating parallel corpora, bilingual
dictionaries, and cross-linguistic educational resources. The paper addresses critical
challenges inherent to endangered language translation: severe data scarcity, lack of
standardized orthography, complex morphological systems, and the imperative of preserving
cultural nuance. Through analysis of quality assessment metrics and community-based
evaluation approaches, this study emphasizes the essential role of human-AI collaboration in
translation workflows. Findings indicate that while AI translation methods offer promising
pathways to language preservation, success requires culturally sensitive model development,
appropriate quality standards, and most critically, community ownership of both processes and
resources to ensure meaningful and sustainable outcomes.
1. Introduction
The domain of computational linguistics lies at the crossroads where technology can meet the
cultural imperative to preserve speech varieties. Approximately 7,000 languages are spoken
around the globe, but linguistic diversity is seriously endangered. Half of all the world's
languages are projected to disappear by 2100, and most endangered dialects are spoken by fewer
than a thousand people (Haokip, 2022). This linguistic crisis has a strong influence on the
evolution of translation technology, because most traditional machine translation technologies
are built around a few successful language pairs and powerful languages, leaving the vast
majority of the world's languages unsupported. English's worldwide distribution has resulted in
translation imbalances. Existing translation pipelines only pull knowledge from endangered
languages into dominant languages and do not support a two-way transfer of expressions, which is
crucial to the daily use and revitalization of a language (Chen & Abdul-Mageed, 2022). This
one-directional pattern mirrors existing power relations in which minority languages are the
target of study rather than simply being used to communicate in multilingual settings.
Neural machine translation represents both a hope and a challenge for endangered languages.
Although NMT can achieve near-human parity on high-resource language pairs, such systems are
predicated on large parallel corpora, typically millions of aligned sentences
(Ranathunga et al., 2021). Many endangered languages do not have these resources, resulting in
what translation researchers call the "low-resource translation problem". The
situation calls for a rethinking of translation modalities, moving beyond conventional
approaches to novel methods that work well with scarce bilingual resources.
Recent advancements in technology can provide solutions via transfer learning, zero-shot
translation and multilingual models. Transfer learning exploits the knowledge from high-
resource language pairs to handle low-resource translation (Artetxe et al., 2018; Edunov et
al., 2018), while zero-shot systems aim at translating between a source-target pair that does
not share any parallel data. The latter has generally been considered in a completely
unsupervised setting where no parallel data is available for training (Zoph et al., 2016;
Kocmi and Bojar, 2018). These methods are especially useful for endangered languages, for which
corpus-based techniques cannot be applied because of the limited data available.
Yet the challenges of endangered language translation are not limited to data availability.
Many endangered languages have complex morphology, difficult grammatical
constructions and no direct translation into the dominant language. The absence of standardised
orthographies adds further complexity to the development of such systems, leading to
translation processing frameworks that operate from oral data and include speech recognition
and synthesis facilities.
Translation for cultural preservation poses different questions from those of commercial
translation. Endangered language translation is unlike commercial or technical translation,
where functional equivalence often "does the trick"; it requires cultural nuance, shared
traditional knowledge systems and community-specific vocabulary that do not have a ready
equivalent in target languages. This imposes a challenge that goes beyond the simple path
of lexical and syntactic mapping and extends to the preservation of cultural concepts.
Evaluation methodology poses some specific challenges for critically reviewing research on
endangered languages. Automatic metrics are comparison-based (BLEU scores, for example, are
computed against reference translations), and sufficiently large evaluation sets for
endangered languages do not exist. Automatic quality evaluation must account for language
variation and dialect diversity without being too reliant on reference corpora
(Papineni et al., 2002). Good translation of endangered languages, in addition, cannot be
assessed in terms of linguistic accuracy alone but also in terms of cultural appropriateness
and reception within the community. Translation in both directions is the other side of the coin
in endangered language translation, and it is also poorly attended to. Just as documentation
involves translating endangered languages into major ones, the opposite is also needed:
present-day information and educational materials available in major languages, which add up
to a great deal, must be translated into these vulnerable languages for community development.
This two-way process means that a translation system has to handle many different kinds of
text effectively, from orally transmitted traditions to modern technical vocabulary.
Community ownership and control (Bergman 2012; Brann 1999 [1995]; Thieberger, Persohn
& Schnoebelen in press) have become fundamental principles in ethical endangered language
work concerning translation technologies. Systems developed in isolation from communities and
outside their control are liable to reproduce colonial exploitation, where academic or
commercial enterprises profit from indigenous linguistic knowledge with little benefit reaching
the communities themselves. Successful initiatives involve community-sanctioned design
processes so that language users retain ownership of their translation
technologies and inform development based on what is important to them in terms of both
priorities and cultural values.
In this paper, we detail a wide range of complex challenges in the unique space that lies at the
confluence of AI and endangered language translation, and we consider both the technical needs
and the cultural imperatives of developing translation systems that are ethical and effective.
The discussion can help to show how existing technologies might be adapted to endangered
language contexts, what the current barriers are, and why a trade-off is needed between AI
development on the one hand and community autonomy and cultural integrity on the other.
2. Literature Review
2.1 Theoretical Foundations
The approaches to endangered language translation in this paper are informed by the
literature from translation studies, computational linguistics and Indigenous studies. Skopos
theory, which focuses on what a translation is for rather than on strict equivalence, becomes
especially relevant since endangered language translation has diverse purposes (preservation,
documentation, and revitalization), each with its own quality issues. In contrast to the
commercial translation sector, where the communicative needs of present-day speakers determine
translation objectives, endangered language work has an obligation to weigh preservation
imperatives against use at the time of production.
Dynamic equivalence, first used in Bible translation, recognises significant cultural differences
between source and target languages. Nevertheless, some of the scholarship is critical of
cross-applying Western translation theories to Indigenous contexts and calls for interpretations
that are culturally sensitive and align with traditional Native epistemologies and
methodologies. In endangered language preservation, the trade-off between adequacy
(maintaining source language structures and concepts) and fluency (generating understandable
target text) must be negotiated with reference to communities' goals rather than externally
imposed standards.
2.2 Neural Machine Translation for Low-Resource Languages
Neural machine translation has transformed translation capabilities for well-resourced
language pairs but presents significant challenges for endangered languages. State-of-the-art
NMT models typically require millions of parallel sentence pairs for training, an
insurmountable barrier for most endangered languages, which may have fewer than 1,000
documented sentences in total (Ranathunga et al., 2021).
Recent research has explored multiple approaches to address data scarcity. Transfer learning
methods leverage knowledge from high-resource language pairs to improve low-resource
translation. Zoph et al. (2016) demonstrated that training a "parent" model on high-resource
language pairs, then transferring parameters to a "child" model for low-resource pairs, yielded
average improvements of 5.6 BLEU points across four low-resource language pairs. Nguyen
and Chiang (2017) extended this work by exploiting source vocabulary overlap through
Byte Pair Encoding (BPE), achieving improvements of up to 4.3 BLEU when combining transfer
learning with stronger BPE baselines.
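
To make the role of Byte Pair Encoding more concrete, the following is a minimal sketch of how BPE merge operations can be learned from word frequencies. The toy word counts and the function names (`get_pair_counts`, `merge_pair`, `learn_bpe`) are illustrative assumptions and are not taken from the cited studies; ties between equally frequent pairs are broken lexicographically here only to keep the result deterministic.

```python
from collections import Counter


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a {word: count} dict."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        # Most frequent pair; ties broken lexicographically (an assumption).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges, vocab


if __name__ == "__main__":
    toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}  # hypothetical data
    merges, vocab = learn_bpe(toy_counts, num_merges=3)
    print(merges)
    print(vocab)
```

When a parent and a child language are segmented with one jointly learned merge table, related languages tend to end up sharing subword units, and that overlap is what vocabulary-based transfer can exploit.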
More recent work has refined these approaches. Gao et al. (2024) proposed a two-step
fine-tuning framework that first adjusts parent model parameters to fit the child language using
source data, then transfers the adjusted parameters with a distillation loss for efficient
optimization, demonstrating significant improvements across five low-resource language pairs.
Kocmi and Bojar (2018) showed that even "trivial" transfer learning, which simply continues
training on a low-resource pair after training on a high-resource pair, produces significant
improvements.
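
One ingredient of parent-child transfer with a shared subword vocabulary can be sketched as follows: tokens that the child vocabulary shares with the parent keep the parent's embedding, while unseen tokens are initialised randomly before training continues on the low-resource pair. This is a simplified illustration, not the procedure of any cited paper; the function name `init_child_embeddings`, the dictionary-based embedding table, and the initialisation range are all assumptions made for the example.

```python
import random


def init_child_embeddings(parent_emb, child_vocab, dim, seed=0):
    """Initialise child embeddings, reusing parent vectors for shared tokens.

    parent_emb:  dict mapping token -> list of floats (the trained parent table)
    child_vocab: iterable of tokens needed by the child model
    dim:         embedding size used for newly initialised tokens
    Returns (child_emb, reused), where `reused` counts copied tokens.
    """
    rng = random.Random(seed)
    child_emb, reused = {}, 0
    for token in child_vocab:
        if token in parent_emb:
            # Shared subword: copy the parent vector so training starts from it.
            child_emb[token] = list(parent_emb[token])
            reused += 1
        else:
            # Token unseen by the parent: small random initialisation (assumed range).
            child_emb[token] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return child_emb, reused


if __name__ == "__main__":
    parent = {"est</w>": [0.5, -0.2], "low": [0.1, 0.3]}  # hypothetical values
    child, reused = init_child_embeddings(parent, ["est</w>", "kal"], dim=2)
    print(reused, child)
```

The proportion of reused tokens gives a rough indication of how much of the parent's lexical knowledge is directly available to the child model.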
Another strong technique for exploiting monolingual data is back-translation. Sennrich, Haddow and
Birch (2016) showed that training on synthetic parallel data, built by automatically back-translating
monolingual target-language text, leads to large improvements: 2.8-3.7 BLEU for English-German
and 2.1-3.4 BLEU for Turkish-English. Pang et al. (2024) systematically analyzed the
effectiveness of back-translation, pre-training and multi-task learning in low-resource NMT,
showing that the combination of these techniques yields consistent improvement across
seven translation directions (Figure 1).
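
The data flow of back-translation can be sketched in a few lines: monolingual target-language sentences are passed through a target-to-source system, and each machine-produced source sentence is paired with the original human-written target sentence. The sketch below assumes a hypothetical `target_to_source` callable; the word-by-word `toy_reverse_model` stands in for a real reverse translation model purely so the example runs, and the function names are not from the cited work.

```python
def back_translate(monolingual_target, target_to_source):
    """Build synthetic (source, target) pairs from target-side monolingual text."""
    pairs = []
    for tgt in monolingual_target:
        synthetic_src = target_to_source(tgt)
        if synthetic_src.strip():  # drop sentences the reverse model could not handle
            pairs.append((synthetic_src, tgt))
    return pairs


def toy_reverse_model(sentence, lexicon):
    """Hypothetical stand-in for a reverse NMT system: word-by-word lookup."""
    return " ".join(lexicon.get(word, word) for word in sentence.split())


if __name__ == "__main__":
    lexicon = {"hallo": "hello", "welt": "world"}  # illustrative toy lexicon
    real_pairs = [("good morning", "guten morgen")]
    synthetic = back_translate(
        ["hallo welt"], lambda s: toy_reverse_model(s, lexicon)
    )
    # Training data for the source-to-target model: real pairs plus synthetic pairs.
    training_data = real_pairs + synthetic
    print(training_data)
```

The target side of every synthetic pair remains genuine human-written text, so the forward model still learns to produce fluent output even though its inputs are noisy.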
In the case of endangered languages, multilingual NMT is also advantageous because it
facilitates knowledge transfer across typologically similar languages. Yet such models have to
be designed with care for linguistic diversity, so that the representations of minority
languages are not drowned out by those of dominant ones.
Figure 1: The Data Scarcity Challenge: Parallel Corpus Availability by Language Type.
2.3 Quality Assessment and Evaluation
Quality assessment of translation for endangered languages needs specific evaluation
frameworks that take into account both particular linguistic traits and contextual aspects.
Classical automatic metrics (BLEU, METEOR, ROUGE) rely on reference translations and
large-scale corpora that often cannot be found for endangered languages. Despite its
widespread adoption, BLEU (Papineni et al., 2002) suffers from severe limitations: it is
insensitive to grammaticality and well-formedness, it scores surface-level string matches
regardless of meaning, and it correlates poorly with human judgment for morphologically rich
languages. Human evaluation for endangered languages has further
complexities. There may not be sufficient numbers of fluent bilinguals to complete
evaluations, and community members may not be technically trained in translation or
systematic quality control. This requires locally adapted evaluation protocols, which should
include training of local speakers and consider cultural norms on language use and assessment.
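
The surface-matching limitation of BLEU can be illustrated with a simplified, sentence-level variant of the metric: clipped n-gram precision up to bigrams, combined by a geometric mean and a brevity penalty, with a single reference and no smoothing. This is a sketch for illustration rather than a replacement for a standard BLEU implementation, and the function names and the Finnish example sentences are assumptions chosen for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(hyp, ref, n):
    """Clipped n-gram precision of a hypothesis against one reference."""
    hyp_counts = ngrams(hyp, n)
    ref_counts = ngrams(ref, n)
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return clipped / total


def simple_bleu(hyp, ref, max_n=2):
    """Simplified sentence-level BLEU: one reference, no smoothing."""
    precisions = [modified_precision(hyp, ref, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    brevity = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return brevity * math.exp(log_avg)


if __name__ == "__main__":
    reference = ["kissa", "on", "talossa"]
    near_miss = ["kissa", "on", "talossani"]  # same stem, different inflection
    unrelated = ["kissa", "on", "koira"]      # a different word altogether
    print(simple_bleu(near_miss, reference))
    print(simple_bleu(unrelated, reference))
```

Because matching is done on whole tokens, the inflectional near miss receives exactly the same score as the unrelated word, which is one reason token-level metrics can be misleading for morphologically rich languages.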
Culturally congruent indicators of quality are a promising research trend. Traditional
standards (fluency, adequacy, naturalness) will not necessarily reflect the cultural and spiritual
aspects Indigenous communities view as important to good translation. Some communities
value authenticity above fluent language, while others attach cultural protocols to sacred or
ceremonial translation that demand specialist processes of assessment.
2.4 Parallel Corpus Development and Resource Creation
Building parallel corpora for endangered languages calls for alternative approaches that
can address the scarcity of data at minimum cost and that allow sustainable resources to be
collected in the future. For languages with few professionally translated texts, the construction
of corpora often involves elicitation work with bilingual speakers or the collection of naturally
occurring bilingual data.
Crowdsourcing methods have been found to be successful for some endangered languages, but
cultural protocols and consent requirements need to be considered. The difficulty is to
balance the efficient collection of data as a product with ethical community involvement and
fair payment for linguistic expertise.
The Te Hiku model is an example of best practice in community-led development of such a
corpus for Māori. The organisation created the Kaitiakitanga license to assert that data
is not owned but looked after under principles of guardianship, and that any benefits return to
their sources (Mahelona & Jones, 2022). The Kōrero Māori campaign drew over 2,500 participants
providing more than 300 hours of transcribed speech data in just 10 days, all under the guiding
constraint of the Kaitiakitanga license to ensure that the data serves the benefit of Māori
communities.
2.5 Community-Controlled Development and Data Sovereignty
There is an increasing focus on community-based translation development that is grounded in
principles of Indigenous self-determination and cultural protocols. The development of
Māori-language technology by Te Hiku Media is an example of community-led efforts, where
communities own their language data and cultural values shape technological innovation. To
that end, the Kaitiakitanga license does not allow uses of data that contravene
cultural protocols, human rights or community values, such as surveillance, discrimination and
exploitation for commercial purposes without permission (Jones & Mahelona 2022).
Indigenous data sovereignty has become a vital ethical framework in the development of AI.
The First Nations Information Governance Centre's OCAP principles (ownership, control,
access and possession) offer a set of basic rules that are now increasingly accepted as
fundamental to addressing digital colonialism. New guidelines from UNESCO emphasize the
need for indigenous peoples to own data which is collected, analysed and used in AI
development, in ways that respect specific forms of indigenous evidence as well as indigenous
rights (UNESCO, 2023).
The idea of "translation sovereignty" holds that decisions about translation priorities, quality
needs and use-cases should be made by Indigenous communities themselves rather than by
external researchers or technologists. This departs from a framework where externally developed
models and concepts are in control and the communities who use them are passengers: communities
become deciders rather than subjects, as illustrated in Figure 2.
Figure 2: Community-Controlled Development Framework.
2.6 Cultural and Contextual Translation Challenges
Culture-bound concepts are difficult to translate for endangered languages, as source languages
are likely to contain many concepts and ways of thinking not available in other world cultures.
Culturally specific concepts like traditional ecological knowledge, kinship terminology, and
the language of past practices highlight places where meaning-based translation, rather than
literal translation or a one-size-fits-all rule, is called for in order to make output culturally
meaningful. Many endangered languages also have particular temporal and aspectual systems that
encode time, evidentiality and point of view differently from large languages. For translation,
preserving these grammatical distinctions is crucial and may require special training data or
evaluation methods.
2.7 Technological Infrastructure and Deployment
Endangered language translation technology needs to be designed with the material and
technological constraints of Indigenous communities in mind. In many places, the available
infrastructure is simply not sufficient: communities may be without access to a stable internet
connection, high-performance computer systems or staff trained in creating and sustaining complex
technologies. Translation technologies therefore need to be adapted for offline use, low
resource consumption and maintainable infrastructure in community settings. Mobile-first
design is increasingly important because cell phone technology is often more readily
available than desktop computers in Indigenous communities. Mobile translation application
development issues include user interface design, data efficiency and support for offline
operation, aspects that are often left out of commercial translation applications.
Cloud-based solutions have benefits but also raise data sovereignty concerns. Although cloud
solutions provide access to advanced translation tools without the need for local infrastructure,
they can threaten community control of language resources. Some communities may instead
prefer a locally hosted solution that keeps data ownership and processing control in community
hands.
2.8 Integration with Language Revitalization
Translation technology is being integrated with broader programs of linguistic
revitalization that extend well beyond preservation alone. Language learning apps, community
education courses, and efforts to support intergenerational transmission would benefit from
pedagogically effective translation tools, rather than just linguistically accurate ones.
Two-way translation systems can encourage both preservation (translating endangered
languages into the wider languages) and revitalisation (translating contemporary information
into local ones). Effective bidirectional systems need to consider the different technological
and cultural issues that arise for each translation direction.
2.9 Case Studies in Endangered Language NMT
Recent research indicates progress in the translation of endangered languages
under various conditions. Chen and Abdul-Mageed (2022) doubled the performance on some
South American Indigenous languages with respect to the state of the art via multilingual
transfer learning, that is, while still staying within low-resource constraints rather than
through data augmentation.
Alnajjar et al. (2023) built NMT systems for both Moksha and Erzya, resolving
data sparsity issues through synthetic data creation using statistical machine translation with
the aid of transfer learning from related Uralic languages. Hämäläinen & Rueter (2019) used a
related methodology for Skolt Sami and North Sami, publishing the discovered cognates
via the Online Dictionary of Uralic Languages.
Activity on African languages has increased NMT coverage: Ezeani et al. (2020) released the
first attention-based, high-performance English-Igbo translation system, for which BLEU
estimates 70% translation accuracy, obtained from transfer learning on pretrained MarianNMT models.
These case studies show that while NMT for endangered languages remains a substantial
technical challenge, novel techniques that draw upon transfer learning, synthetic data
generation, and multilingual modeling can result in translation quality that is at least
meaningful even in resource-constrained scenarios.
2.10 Synthesis and Research Gaps
So far, the literature suggests that endangered languages have good prospects of benefiting
from AI-enabled translation, though not through machine translation in the traditional sense.
Techniques like transfer learning and back-translation allow for high-quality translation with
very small amounts of parallel data, and community-controlled frameworks such as Te Hiku
Media's Kaitiakitanga license are beacons of hope for ethically responsible technology
development. However, there remain many issues to be resolved: the lack of data for some
languages; the cultural imperatives of preservation; quality control for less-resourced
languages without standard references; the long-term economic sustainability of non-commercial
translation systems; and how local community members' "ownership" of tools and processes can
be ensured. Future research will also have to develop approaches to assessing quality in
endangered language contexts, devise funding models that ensure ongoing technology support, and
cultivate the technical capacity within Indigenous nations to drive translation technology
development.
What is even more critical is for the field to realize that endangered language translation is
not just technology, but a tool of Indigenous self-determination and epistemologies. For such
collaborations to flourish, the field will have to move away
from extractive research models to development based on equitable principles, where
Indigenous people define the priorities, control access and receive direct benefit from
translation technologies built with their own linguistic knowledge.
3. Methodology
This research utilised a scoping review approach along with case study analysis of
existing endangered language translation projects (Temple and
Cavanagh 2012). The review analysed peer-reviewed articles, technical reports and case
studies from 2016-2025 with an emphasis on neural machine translation for low-resource and
endangered languages, as presented in Figure 3.
Searches targeted numerous databases, such as ACL Anthology, arXiv and Google
Scholar, and leading conferences (ACL, EMNLP, WMT). Searches used agreed-upon
combinations of the following search terms: neural machine translation, low-resource
language, endangered language, indigenous languages, transfer learning, back-translation,
data sovereignty, and community-controlled AI. Recent articles (2020-2025) were
prioritized in the review, as well as foundational papers reporting key methodologies. Particular
weight was given to Indigenous-led projects and publications written by or co-written
with members of the language communities concerned. Case study analysis concentrated on
documented endangered language NMT projects for which public information was available
on methods, community engagement practices, quality and outcomes. Special attention was
given to projects showing elements of community control and data sovereignty.
Figure 3: NMT Translation Pipeline for Endangered Languages.
Quality assessment of included literature considered: peer review status, methodological rigor,
reproducibility, community involvement in research design and execution, and attention to
cultural protocols and ethical considerations beyond standard research ethics requirements.
4. Results
4.1 Technical Approaches and Effectiveness
Analysis of recent endangered language NMT literature reveals several effective technical
approaches to addressing data scarcity:
Transfer Learning: Studies consistently demonstrate that transfer learning from high-resource
to low-resource language pairs yields substantial improvements. Zoph et al. (2016) reported
average gains of 5.6 BLEU points, while more recent work by Gao et al. (2024) achieved
further improvements through refined two-step fine-tuning processes. Transfer learning proves
5.3 Balancing Preservation and Revitalization
Endangered language translation must serve two purposes: it must be part of documentation
for preservation, and at the same time support the revitalization of living languages. These
goals sometimes create tension. Projects dedicated to preservation may focus on archival
integrity and exhaustive documentation, even if it means a slower development cycle. The focus
of revitalization is on practical utility, the development of modern
terminologies and the quick creation of usable systems.
The best solutions combine both attitudes through two-way translation techniques that
facilitate documentary as well as contemporary use. Yet building such systems requires
careful consultation with the community to establish priorities, including what level of
quality is acceptable and how resources should be traded off between preservation and
revitalization.
5.4 Cultural Appropriateness and Technical Quality
This study underscores the tensions between generic technical quality measures and cultural
relevance. BLEU scores and similar metrics measure some aspects of translation quality, but
do not take into account the cultural preservation requirements that are central to endangered
language work. A translation that is technically 'better' according to metrics may be culturally
problematic or damaging if it corrupts concepts, disrespects protocols (with respect to content),
or neglects the continuing value of dialectal diversity among communities.
This tension requires new frameworks for assessing quality that can balance
linguistic accuracy with cultural integrity. They should include community-based indicators of
quality, adherence to cultural protocol, evaluation by qualified members of the community and
recognition that standards of quality may vary from one speech community to another, even for
languages in similar states of endangerment.
5.5 Economic Sustainability
Economic viability is a vital problem that has not been solved well in the literature. The vast
majority of endangered language translation projects rely on academic research funds,
government support, or volunteer work, all of which are inherently precarious. Unlike commercial
translation for profit-driven markets, endangered language translation serves user communities
that generally do not have the economic capacity to support fundamental research on
technology development.
This situation demands creative investment strategies and infrastructural backing. The
opportunities may include government funding that emphasises language preservation as a
cultural heritage responsibility, philanthropy that supports community-led projects, and
institutional practice that allows technology maintenance in the long term without ongoing
research dollars. Without such support, even the most technically successful innovation is
likely to fall into disrepair when initial financing runs out.
5.6 Capacity Building and Indigenous Technology Leadership
The literature increasingly focuses on building the capacity and capability of Indigenous
technical expertise rather than on research driven by outsiders. Initiatives such as the First
Languages AI Reality project and Lakota AI Code Camp take this training to local communities,
aiming to train people within Indigenous communities in AI and software development specifically
for language technology.
This transition to Indigenous tech leadership is crucial for long-term sustainability and cultural
relevance. Indigenous technologists fill the cultural competence void that exists among external
researchers, while providing a counter to conventional technology development paradigms that
have under-represented Indigenous influence in the past. But developing Indigenous technical
capacity would mean continued investment in education, training and career pathways, a
long-term commitment beyond the commonly short timescales of research projects.
5.7 Ethical Considerations and Potential Harms
There are promising prospects for AI translation and endangered languages, but the possible
harms call for great caution:
Disinformation Hazard: Low-accuracy systems can produce incorrect translations which
spread through communities as errors compound, damaging rather than aiding language
transmission. The risk is underscored by documented cases where AI produced language
learning materials with false content.
False Promises: Exaggerating the power of AI could raise
unrealistic expectations, causing communities to direct scarce resources into technologies that
end up not being effective instead of focusing on what is known to work for revitalization (such
as immersion education).
Data Utilization: Without deep consideration of data sovereignty, Indigenous language data
could be exploited by corporate entities or researchers, much as it has been historically,
and community control could continue to erode in digital space.
Cultural Erosion: Technologies that were developed with little awareness of culture can
perpetuate stereotypes, smooth over dialectal differences or portray cultural concepts in ways
that are harmful to language communities.
These risks require thoughtful ethical frameworks, honest communication about what is and
isn't possible, and a commitment to prioritizing community control in development processes.
5.8 Future Directions
There are several important issues to be developed further in the field:
Technical Innovation: Further reduction in the required amount of data with methods such
as few-shot and zero-shot learning, algorithms that are specifically tailored to the
properties of rare languages (polysynthetic morphology, complex agreement systems, etc.), and
more efficient model architectures for resource-limited deployment.
Quality Assessment: Culturally sensitive evaluation frameworks, tools for community-based
assessment with training materials for local evaluators, and field-wide standards recognizing that
quality assessment must be adapted to endangered language contexts.
Infrastructure Development: Developing technological infrastructure that is sustainable and
controlled by the community: locally hosted systems, mobile-first applications and tools that
can be maintained by communities without requiring technical expertise.
Capacity Building: Indigenizing technology education, creating career paths for Indigenous
artificial intelligence professionals, and supporting Indigenous innovation.
Policy and Funding: Developing policy frameworks that acknowledge Indigenous data
sovereignty, sustaining funding mechanisms for community-controlled language technology,
and putting in place institutional supports that ensure long-term
sustainability beyond initial research projects.
Ethical Frameworks: Continued iteration of ethical guidelines for endangered language
technology development, standards for community engagement and consent, and
accountability mechanisms to ensure that AI development serves the interests of
communities.
6. Conclusion
This survey has shown that artificial intelligence in general, and neural machine translation in
particular, holds much promise for helping to save endangered languages. Recent technological
advancements (e.g., transfer learning, back-translation, and multilingual modeling) have
dramatically decreased the data requirements of effective translation systems, such that NMT is
now feasible for languages with extremely limited parallel data.
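To illustrate why back-translation lowers the parallel-data requirement, the following is a minimal sketch rather than any cited system's implementation. It assumes monolingual text in the target language and some reverse (target-to-source) translation function. The names back_translate and toy_reverse_translate, and the placeholder tokens, are hypothetical; the toy function is a word lookup that only stands in for a trained reverse model.

```python
from typing import Callable, List, Tuple


def back_translate(
    target_monolingual: List[str],
    reverse_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each genuine target sentence with a synthetic source sentence."""
    synthetic_pairs = []
    for target_sentence in target_monolingual:
        synthetic_source = reverse_translate(target_sentence)
        # Skip sentences for which the reverse model produced nothing.
        if synthetic_source:
            synthetic_pairs.append((synthetic_source, target_sentence))
    return synthetic_pairs


def toy_reverse_translate(sentence: str) -> str:
    """Hypothetical stand-in for a trained target-to-source model."""
    # Placeholder tokens, not drawn from any real language.
    lexicon = {"tok_a": "water", "tok_b": "river"}
    return " ".join(lexicon[w] for w in sentence.split() if w in lexicon)


# Small genuine parallel set plus monolingual target-language text.
real_pairs = [("water", "tok_a")]
monolingual = ["tok_a tok_b", "tok_c"]

synthetic = back_translate(monolingual, toy_reverse_translate)
training_data = real_pairs + synthetic
print(training_data)
```

The forward model is then trained on the combined data. The target side of every synthetic pair is genuine text, so errors made by the reverse model affect only the source side. In a community setting, the monolingual text and any resulting model remain subject to the data-governance terms the community sets.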
But motivation and ability aren't enough. Success will not come without fundamental
changes in how language technologies are developed. Community control has to
be at the center, not the margin. Data sovereignty models such as Te Hiku Media's
Kaitiakitanga licence offer new ways for Indigenous communities to retain ownership
and control over technologies developed using their linguistic resources. Quality metrics need
to move beyond traditional measures and include culturally appropriate,
community-determined criteria. Development models need to build capacity
within Indigenous communities rather than maintain a reliance on external researcher
leadership.
Balancing technical innovation with cultural preservation is a tightrope walk.
Although there are times when AI translation may be valuable for endangered languages, no
machine learning or language technology, however good, will serve endangered
languages the way human transmission and everyday use do; it is at best a complementary medium.
Translation technology is a single tool, most effective as part of larger community-led
language revitalization efforts.
Economic and technical (infrastructural) limitations and the need for capacity building are
continuing problems, which require renewed attention and resources. We have to go beyond
short-term research projects and focus on long-term institutional support that keeps
technology evolving.
In the end, endangered language translation technology development must be situated within
larger struggles for Indigenous self-determination and cultural survival. Technical success is
moot if communities do not own the technologies or if benefits mainly accrue to outside researchers
and institutions. The way forward must be through a model of collaborative partnership in
which Indigenous priorities are centered, Indigenous methodologies are respected, and AI
development for endangered languages serves the broader goals of the community rather than
reinforcing patterns of colonial extraction and exploitation.
References
Alnajjar, K., Hämäläinen, M., Chen, W., & Alnajjar, K. (2023). Neural machine translation for
low-resource languages: A survey and case studies. ACL Anthology.
Chen, W., & Abdul-Mageed, M. (2022). Improving neural machine translation of indigenous
languages with multilingual transfer learning. arXiv preprint arXiv:2205.06993.
Ezeani, I., Hepple, M., Onyenwe, I., & Uchechukwu, C. (2020). Igbo-English machine
translation: An evaluation benchmark. arXiv preprint arXiv:2004.00648.
Gao, Y., Hou, F., & Wang, R. (2024). A novel two-step fine-tuning framework for transfer
learning in low-resource neural machine translation. In Findings of the Association for
Computational Linguistics: NAACL 2024 (pp. 3214-3224).
Hämäläinen, M., & Rueter, J. (2019). Finding Sami cognates with a character-based NMT
approach. In Proceedings of the Workshop on Computational Methods for Endangered
Languages.
Haokip, T. (2022). Artificial intelligence and endangered languages. SSRN Electronic Journal.
https://doi.org/10.2139/ssrn.4212504
Jones, P., & Mahelona, K. (2022). Kaitiakitanga license and data sovereignty. Te Hiku Media.
Retrieved from https://tehiku.nz
Kocmi, T., & Bojar, O. (2018). Trivial transfer learning for low-resource neural machine
translation. In Proceedings of the Third Conference on Machine Translation (pp. 244-252).
Nguyen, T. Q., & Chiang, D. (2017). Transfer learning across low-resource, related languages
for neural machine translation. In Proceedings of the Eighth International Joint Conference on
Natural Language Processing (pp. 296-301).
Pang, J., Yang, B., Wong, D. F., Wan, Y., Liu, D., Chao, L. S., & Xie, J. (2024). Rethinking the
exploitation of monolingual data for low-resource neural machine translation. Computational
Linguistics, 50(1), 25-67. https://doi.org/10.1162/coli_a_00496
Papineni, K., Roukos, S., Ward, T., & Zhu, W. (2002). BLEU: A method for automatic
evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the
Association for Computational Linguistics (pp. 311-318).
Ranathunga, S., Lee, E. S. A., Prifti Skenduli, M., Shekhar, R., Alam, M., & Kaur, R. (2021).
Neural machine translation for low-resource languages: A survey. arXiv preprint
arXiv:2106.15115.
Sennrich, R., Haddow, B., & Birch, A. (2016). Improving neural machine translation models
with monolingual data. In Proceedings of the 54th Annual Meeting of the Association for
Computational Linguistics (pp. 86-96).
UNESCO. (2023). Indigenous people-centered artificial intelligence: Perspectives from Latin
America and the Caribbean. UNESCO. Retrieved from
https://www.unesco.org/en/articles/new-report-and-guidelines-indigenous-data-sovereignty-
artificial-intelligence-developments
Zoph, B., Yuret, D., May, J., & Knight, K. (2016). Transfer learning for low-resource neural
machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural
Language Processing (pp. 1568-1575).