A Deep Learning Model to Predict Congressional Roll Call Votes from Legislative Texts

Author: Payne, Jonathan

Publisher: Zenodo

DOI: 10.5281/zenodo.17531961

Source: https://zenodo.org/records/17531961/files/7420mlaij02.pdf

Machine Learning and Applications: An International Journal (MLAIJ) Vol.7, No.3/4, December 2020
DOI:10.5121/mlaij.2020.7402
A DEEP LEARNING MODEL TO PREDICT
CONGRESSIONAL ROLL CALL VOTES FROM
LEGISLATIVE TEXTS
Jonathan Wayne Korn and Mark A. Newman
Department of Data Science, Harrisburg University, Harrisburg, Pennsylvania
ABSTRACT
Developments in natural language processing (NLP) techniques, convolutional neural networks (CNNs),
and long-short-term memory networks (LSTMs) allow for a state-of-the-art automated system capable of
predicting the status (pass/fail) of congressional roll call votes. The paper introduces a custom hybrid
model labeled "Predict Text Classification Network" (PTCN), which inputs legislation and outputs a
prediction of the document's classification (pass/fail). The convolutional layers and the LSTM layers
automatically recognize features from the input data's latent space. The PTCN's custom architecture
provides elements enabling adaptation to the input's variance from adjustment to the kernel weights over
time. On the document level, the model reported an average evaluation of 67.32% using 10-fold
cross-validation. The results suggest that the model can recognize congressional voting behaviors from the
associated legislation's language. Overall, the PTCN provides a solution with competitive performance to
related systems targeting congressional roll call votes.
KEYWORDS
Deep Learning (DL), Convolutional Neural Networks (CNNs), Long-Short-Term Memory Networks
(LSTMs), Natural Language Processing (NLP), Congressional Roll Call Votes
1. INTRODUCTION
Predicting the status (pass/fail) of congressional roll call votes has been political scientists' goal
for decades. There are patterns of congressional voting behavior captured in the legislative text,
which has shown significance when predicting congressional votes' status. Understanding the
future status of legislation provides vital insights into government and industry matters.
Analyzing roll-call data allows insight into information detailing the legislation's vote status and
can predict future votes [1].
Other approaches using quantitative roll call data and legislative text have shown success in the
past, however only under certain conditions. Their success expresses limitations due to the
dimensionality of the data and unforeseen conditions in Congress's complex social environment.
For instance, using two separate datasets diminishes a model's flexibility to predict and adapt to
the event space's changing conditions.
Political environments are complex social networks that often create noisy data. The scale of
topics that the government considers in legislation is a diverse subject matter, and the language is
sparse. Most past approaches rely on an attempt to use extra dimensionality from both text and
quantitative data. However, using extra dimensionality produces limitations, or predictive
bottleneck, in approaches using text and quantitative congressional data. Many situations occur
when congressional roll call data is not available or represents uncontrolled conditions creating
issues predicting the event. For instance, a pure expression of mixed but non-partisan support in
the voting data could limit existing models' overall accuracy.
A more robust model is required to address the current limitations expressed in prior models
addressing the problem discussed above. Statistical modeling seeks to learn the joint probability
function from words contained in the texts [2]. However, it is difficult to obtain this goal because
of the "... curse of dimensionality" [2]. However, the rise of deep learning makes it possible for
computer systems to recognize patterns in complex text representations. Advancements in natural
language processing techniques allow text to convert into tokenized word vector spaces. The
conversion provides the proper dimensions to embed the texts into different deep neural networks.
A custom hybrid architecture provides the ability to input texts and provide accurate outputs from
recognized patterns. Convolutional neural networks (CNNs) and long-short term memory neural
networks (LSTMs) can recognize these intricate patterns within the data's dimensions. Each layer
in the network provides benefits in recognizing temporal and spatial features from abstract data
representations. Mainly, CNNs reflect successful results in identifying abstract data patterns
mainly because of development in max-pooling layers.
However, due to long lag periods from the data's complexity, the CNN architecture cannot alone
capture the text's patterns. A primary reason for implementing LSTM layers in the model's
architecture is to overcome the long lag time problem that occurs when processing high
dimensional data. Overcoming long lag periods allows for the neural network's depth to continue,
enabling minute features to be recognized.
Combining CNN and LSTM algorithms provides an architecture capable of recognizing features
in sparse word vector spaces over long periods, known as a Convolutional Long-Short Term
Memory Neural Network (C-LSTM). Adaptable layers during kernel initialization help filter out
the non-significant features from the input's dimensional space. The adaptability in each layer of
the network provides stability in the predictions and robustness against the dynamic social
environment creating the data. The custom architecture ensures the network depth is suitable for
accurate prediction from highly sparse text samples. The paper presents a custom deep learning
solution to accomplish the binary classification of legislative texts.
1.1. Organizational Structure
The remainder of the paper is organized into the following sections, including Section 2.
Literature Review, Section 3. Methods and Materials, Section 4. Results, Section 5. Conclusion,
and Section 6. Future Work. Section 3. Methods and Materials includes two sections including
3.1. The Data, and 3.2. The Deep Learning Approach. Section 3.1. includes Section 3.1.1.
Pre-processing text and 3.1.2. Equal Distribution of Samples. Section 3.2. includes 3.2.1. The Custom
PTCN Architecture and 3.2.2. The PTCN Modelling Process.
2. LITERATURE REVIEW
A focus of quantitative political scientists is using roll-call data to understand legislative voting
behaviors better. Few models have attempted to use supplementary data such as the text of
legislation to understand congressional voting trends better.
In 1991, research expressed that spatial positions captured in roll call data are stable and contain
reliable features to recognize voting patterns [3]. The party discipline is present in the spatial
dimensions of roll-call votes or the legislator's ideal points captured in the data [4]. Spatial
dimensions in the context of the domain refer to the probabilistic word occurrence contained
across the text samples. Researchers utilized a method using Bayesian simulation models to
capture the ideal points of the legislators. The approach allows researchers to ensure the beliefs
incorporated into the inputs' dimensional space of roll call analysis [5]. The method enables
researchers to handle the increasing complexity in higher-dimensional contexts [5]. Policy ideas
are standard features in legislation that provide insight into legislators' behavior similar to
quantitative roll call data [6].
In 2011, Gerrish and Blei introduced a probabilistic model capable of inferring a person's
political position for specific topics [7]. The model focuses on capturing the individual
representative's view on specific political interests from the text alone. The authors used 12
years of congressional legislative data in their experiment to capture significant patterns. The
patterns express the lawmakers' vote behavior and on which type of document [7]. Gerrish and
Blei integrated the analysis of text into quantitative models such as the ideal point model with
success inferring "... ideal points and bills' locations..." from roll call data, leading to the
prediction of legislative vote status (pass/fail) [8]. The integration of the bill texts into the ideal
point model helped mitigate a limitation of only predicting on vote data alone, which may, at
times, be inconsistent in its availability. The authors developed a supervised ideal point topic
model capable of predicting pending bills using votes. It also is a method for exploring the
connection between language and political support [8]. Other models developed by Gerrish and
Blei predict the "...inferred ideal points using different forms of regression on phrase counts."
[8]. However, leading existing models cannot predict when mixed but non-partisan support is
present in the data. Many existing models cannot expand beyond one-dimensional limitations,
such as the ideal point model [8]. The authors also based their model's performance on the
baseline that 85% of all votes are 'yea', limiting the model's performance results. The authors
reported their ideal topic model predicted 89% of the votes with 64 topics, and their L2 model
predicted 90% [8]. In sequential predictions, their models were 87% and 88.1% accurate at
predicting future votes, respectively [8]. Their study only attempts to understand a few topics
with a one-dimensional political space, which creates a predictive bottleneck [8].
Interestingly, the ideal point model reflects representatives' preferences, constituency preferences,
or any other feature indicating a preference for particular legislation [1]. In 2013, Spatio-temporal
modeling expressed success in using text to predict congressional roll call votes with relative
success to the ideal points model [9]. In 2015, ideal points were redefined as two
characterizations, including word and vote choice [10]. These characterize the ideal points and
the dimensions of the policy. The researchers utilize Sparse Factor Analysis to combine both
votes and textual data to estimate the ideal points [10].
In 2016, Yang and others introduced hierarchical attention networks for document classification
[11]. The results outperformed previous models by a large margin, as indicated in the authors'
results using the Yelp 2013, Yelp 2014, Yelp 2015, IMDB review, Yahoo Answers, and
Amazon review datasets to test their algorithm and compare it to other methods [11]. On average,
the HN-ATT performed with about a 70% accuracy rate across the tasks. They were generalizing
the network to address multiple types of tasks, which limits its ability to identify the text's critical
features, such as the political ideology in the congressional legislative texts across spatial and
temporal dimensions. A similar approach that uses multi-dimensional bill texts to predict roll
calls' status is Kraft, Jain, and Rush's approach established in 2016. Using an embedding model
with prepared bill texts, they competed with Gerrish and Blei's approach. Mainly the approach
utilized ideal vectors rather than ideal points [12]. The approach relies on quantitative data that
leads to the same predictive bottlenecks as prior models due to the complex event space.
In 2016, the Gov2Vec method expressed success in capturing opinions from legal documents.
The text's transformation allows words to be embedded in a model to learn the representations of
individuals' opinions [13]. In 2017 it became possible to use quantitative data and text to predict
if a bill will become law. The researcher states that the combination always performs better than
using text or content individually [14]. Interestingly, the author conducted three experiments,
including "... text-only, text, and context, or context-only..." to test the predictive power of each
type of model [14]. Nay's approach uses a language model that provides a prediction on the text's
sentence level, providing a probability of the sentence contributing to the bill text's (pass/fail)
status [14]. The data included a few performance measures and two data conditions spanning 14
years [14]. Nay concluded that a text alone approach is fundamental to better results. For newer
data, the use of bill text outperforms bill context-only. Context-only only outperforms bill text
alone for older data [14]. The most important finding is that text adds predictive power to the
model. However, the approach relies on two data sources, which creates a predictive bottleneck.
As the most successful approach to date using text alone, the model can predict at 65% accuracy
[14].
A hybrid C-LSTM Neural Networks architecture can help mitigate the predictive bottleneck in
existing models by capturing more features from the bill texts alone. In statistical language
modeling, the primary goal is learning. The main objective to learn is the "... joint probability
function ..." of each sequence of words contained in the language [2]. However, there is difficulty
in this type of task because of the curse of dimensionality. By learning distributed representations
of language, the curse mitigates. The main issue is addressing the variance from the training data
to the testing data. Past research discovered "... the similarities between words to obtain
generalization from training sequences to new sequences..." [15] [16] [17] [18]. Much of this type
of work is thanks to contributions by Schutze, 1993, where vector-space representations of words
can be learned based on the probability of the word co-occurring in documents [19]. Most of the
experimental workings can be summed up from learning distributed feature vectors to represent
their similarities between words, which is discussed in [15] [17] [20]. Past research has shown
that the hybridization of CNNs and LSTMs tasked to solve text classification problems is
successful [21]. The combination of the two different types of neural networks builds a C-LSTM
[21]. In 2015, Zhou et al. introduced a successful C-LSTM in sentence and document modeling
[21]. The CNN extracts a sequence of high-level phrase representations, which are in-return fed
through LSTM layers set to obtain the sentence-representations [21]. Overall, the hybrid model
performed better than existing systems in classification tasks. The results indicated that the local
features of phrases and the sentence's global/temporal semantics are recognizable by their model
[21]. In 2019, a team of researchers took advantage of convolutional recurrent neural networks to
tackle text classification tasks [22]. Their experiments showed that the method of using a
C-LSTM achieves better success than other networks.
3. METHODS AND MATERIALS
The following sections discuss key components of developing the PTCN and the associated
results referred to in Section 4. The methodology is summarized below in Figure 1.
Fig. 1: Visualization of the methodology.
3.1. The Data
The original data is extracted and organized from (https://www.Govtrack.us) consisting of
legislative texts and associated quantitative roll call data. The original data contained samples
from the year 2000 to 2019, including 3668 samples from the house and senate. Refer to Figure 2
for an example of original legislative texts. Not all the samples from the original population will
be included in the experiment due to limiting factors as discussed below.
Fig. 2: Legislative text samples.

3.1.1. Pre-Processing the Text
A series of NLP techniques is required to supervise the custom C-LSTM training to classify the
legislative text based on voting status. The text samples contain noise, which is in the form of
stop words, uppercasing, special characters, NA values, punctuation, numbers, and whitespace.
The removal of noise from the text samples helps keep the C-LSTM from recognizing
patterns within the texts that create biased predictions.
The C-LSTM is a deep learning algorithm that automatically extracts features from an input
vector. Each text undergoes augmentation using the following conditions, including converting
data types from character to a string, lowercase conversion, stop-words removal using the
“SMART” function, punctuation removal, number removal, white-space removal, and document
stemming. The above augmentation resulted in Figure 3.
Fig. 3: Sample of pre-processed Legislative texts.
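The cleaning steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `preprocess` and the tiny `STOP_WORDS` set are assumptions made for the example (the paper uses the much larger "SMART" stop-word list), and stemming is left out to keep the sketch dependency-free.

```python
import re
import string

# Tiny illustrative stop-word set (an assumption); the paper uses the "SMART" list.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "be", "by", "or"}

# Translation table that turns every ASCII punctuation character into a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text):
    """Lowercase, strip punctuation and digits, drop stop words, collapse whitespace."""
    if text is None:                      # treat NA values as empty documents
        return ""
    text = str(text).lower()              # character -> string, then lowercase
    text = text.translate(_PUNCT_TO_SPACE)
    text = re.sub(r"\d+", " ", text)      # number removal
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)               # split/join also collapses whitespace


sample = "SEC. 2. The Secretary shall report to Congress by 2019!"
print(preprocess(sample))  # sec secretary shall report congress
```

A stemmer (for example a Porter stemmer from an NLP library) would be applied to each token after the stop-word filter if the full sequence of steps were reproduced.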
After pre-processing, the text goes through a tokenization and vectorization step resulting in each
word, symbol, or any other character represented as a unique number. For instance, the word
‘bill’ is represented by the value of 3109 across all the documents. Note the text is limited to
10000 max features during the tokenization process. Refer to Figure 4 for an example of the
vectorized vocabulary from the legislative documents.
Fig. 4: Sample of tokenized and vectorized Vocabulary.
The pre-processing of the text resulted in vectorized legislative documents, as seen in Figure 5. A
method of padding is implemented to transform all the texts to the same length. The dimensions
of the text are converted into a tokenized, vectorized, and padded format with a max length of
10,000.
Fig. 5: Sample of tokenized, vectorized, and padded text.
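The tokenization, vectorization, and padding steps can be illustrated with a small standard-library sketch. The helper names `build_vocab`, `vectorize`, and `pad` are hypothetical, and two details are assumptions rather than facts from the paper: id 0 is reserved for padding, and sequences are padded and truncated at the front.

```python
from collections import Counter


def build_vocab(docs, max_features=10000):
    """Map the most frequent words to integer ids; id 0 is reserved for padding."""
    counts = Counter(word for doc in docs for word in doc.split())
    most_common = counts.most_common(max_features - 1)
    return {word: idx for idx, (word, _) in enumerate(most_common, start=1)}


def vectorize(doc, vocab):
    """Replace each known word with its id; words outside the vocabulary are dropped."""
    return [vocab[w] for w in doc.split() if w in vocab]


def pad(seq, max_len, pad_value=0):
    """Left-pad (or keep only the last max_len ids) so every sequence has max_len ids."""
    seq = seq[-max_len:]
    return [pad_value] * (max_len - len(seq)) + seq


docs = ["bill amend act", "bill secretary report bill"]
vocab = build_vocab(docs)
padded = pad(vectorize(docs[1], vocab), max_len=6)
print(vocab["bill"], padded)
```

In the paper's setting both `max_features` and `max_len` would be 10,000; the toy values here only keep the output readable.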
3.1.2. Equal Distribution of Samples
The labeled legislative text samples are balanced into an equal distribution of each voting status
to reduce bias. It is essential to mitigate bias in the data. Deep neural networks transform inputs
over extensive credit assignment pathways (CAPs). The more complicated the dimensions of an
input, the more complex the CAPs.
Each vote’s status was captured by determining the number of “yea” or “Aye” responses reached
for each legislation. Each translates into a value of 1, representing a vote to pass the document.
All other responses are considered a vote against the document, labeled as a value of 0.
Interestingly, the number of documents labeled one is greater than 0 labeled documents. Note that
some votes require a special condition to pass, which is more than 2/3 of the votes. The study
ignores the special voting condition because it only focuses on an equal distribution of (pass/fail)
text representations.
An equal distribution of the voting statuses in the sample is necessary to mitigate bias in the
network for either classification. The text samples are randomly selected from each class to
represent an equal distribution, which reduces the original number of samples. Each class of vote
is represented equally by 98 randomly selected legislative texts. 98 is the maximum number of 0
labeled samples available in the data due to the behavior of Congress. Each text is a max length of
10,000 features. It should be noted that most of the legislative texts are close to or higher than the
maximum number of features. Custom features in the PTCN’s architecture are implemented to
deal with the small number of samples during training as discussed below.
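The balancing step described above amounts to random undersampling of the majority class. The sketch below shows one way to do it with the standard library; the function name `balance_classes`, the fixed seed, and the toy data are assumptions for illustration only.

```python
import random


def balance_classes(samples, seed=0):
    """Randomly undersample every class to the size of the smallest class.

    `samples` is a list of (text, label) pairs with label 1 (pass) or 0 (fail).
    """
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append((text, label))

    n = min(len(group) for group in by_label.values())  # 98 in the paper's data
    rng = random.Random(seed)

    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n))  # sampling without replacement
    rng.shuffle(balanced)                      # avoid class-ordered batches
    return balanced


# Toy data: 10 passing documents and 3 failing ones.
toy = [(f"passed bill {i}", 1) for i in range(10)] + [(f"failed bill {i}", 0) for i in range(3)]
balanced = balance_classes(toy)
print(len(balanced), sum(label for _, label in balanced))  # 6 3
```

With the paper's data the minority class has 98 documents, so the balanced set would hold 196 samples.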
3.2. The Deep Learning Approach
The following section discusses the custom deep learning architecture and modelling process
implemented in the study.
3.2.1. The Custom PTCN Architecture
The sequential deep learning model is a stack of different layers set with several parameters,
including dropout rate, hidden convolutional nodes, LSTM hidden nodes, L1 regularization rate,
L2 regularization rate, batch size, input max length, max features, embedding dimensions, leaky
Relu rate, kernel size, epochs, max-pooling size, learning rate, and validation split. Figure 6
shows an example of the PTCN model's architecture:
Fig. 6: PTCN architecture
The first layer in the model is the embedding layer, which embeds the tokenized words. At the
next layer, the inputs pipe into a 1-dimensional convolutional layer with only a 4th of the set
convolutional hidden nodes. The inputs are representations of lengthy and highly sparse word
vector spaces, making the model's weights critically important. The model requires a method to
adjust to the high variance between samples due to the low number of samples. The control of the
kernels will help the model steadily learn significant features contained in the text.
Throughout the convolutions, the model can explore deeper to recognize more significant
features. A variance scaling kernel initializer provides an “initializer capable of adapting its scale
to the shape of weights.” [23]. Variance scaling is a kernel initialization strategy that encodes an
object without knowing the shape of a variable [23]. The convolutional layers’ initializer makes
the deep network adapt to the input weights. It is important to regularize the kernels when
utilizing an adaptive initializer. Setting a dual regularization technique that deploys L1 and L2
regularization helps mitigate high fluctuations while batching samples through the layers. L1 is
Lasso Regression (LR). L2 is Ridge Regression. The main difference between the two methods is
the penalty term. The layer includes strides set at 1L through the convolutions, where the stride is
“... an integer or list of a single integer, specifying the stride length of the convolution” [24].
The convolutional layer is activated using a Leaky Relu function. The leaky Relu function allows
for a “... small gradient when the unit is not active ...”, providing “... slightly higher flexibility to
the model than traditional Relu.” [25].
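The difference in the penalty term and the behavior of the leaky Relu can be made concrete with a short sketch. The function names and the default rates (`alpha`, `l1`, `l2`) are illustrative assumptions; the paper tunes these rates and does not report the values in this section.

```python
def leaky_relu(x, alpha=0.01):
    """Identity for positive inputs; a small slope `alpha` for non-positive inputs."""
    return x if x > 0 else alpha * x


def l1_l2_penalty(weights, l1=0.01, l2=0.01):
    """Combined kernel penalty added to the loss.

    L1 (lasso) penalizes the sum of absolute weights and pushes weights toward zero;
    L2 (ridge) penalizes the sum of squared weights and shrinks large weights.
    """
    l1_term = l1 * sum(abs(w) for w in weights)
    l2_term = l2 * sum(w * w for w in weights)
    return l1_term + l2_term


print(leaky_relu(2.0), leaky_relu(-2.0, alpha=0.1))
print(l1_l2_penalty([1.0, -2.0], l1=0.1, l2=0.01))
```

Because the leaky Relu keeps a non-zero slope for negative inputs, a unit that is inactive for one batch still receives a gradient, which is the "small gradient when the unit is not active" property quoted above.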
The first convolutional layer extracts lower level features from the inputs due to the decrease in
the hidden nodes. The reduction of the number of transformations provides control of the
adaptable features initializing the weights. The second layer pipes the inputs through another
1-dimensional convolutional layer with the same parameters set, except the number of hidden
nodes set to 32. Increasing the number of hidden nodes provides more transformations extracting
higher-level features from the inputs. The second convolutional layer is activated using the leaky
Relu function. The next layer in the stack is another convolutional layer set at half the set number
of hidden nodes. Reducing the number of nodes and following with a max-pooling layer helps
mitigate overfitting during training. In the study, the max-pooling layer is set to 4. The following
are two more layers of 1-dimensional convolutional layers set to half the set hidden nodes and a
4th of the set hidden nodes.
All parameters are set the same as the prior convolutional layers. A second max-pooling layer,
batch normalization, and dropout layer help mitigate overfitting further. The next layer is an
LSTM layer set at 32 hidden nodes. Variance Scaling kernel initializers and L1 and L2 kernel
regularizers control the model's exploration of the feature space. The LSTM layer is activated
using a Leaky Relu function. A dropout layer of 0.5 is in the stack before the output layer. The
output layer is activated using a sigmoid function. The model compiles using a loss function of
binary cross-entropy. The PTCN uses a stochastic gradient descent (SGD) optimizer. The
hyper-tuning sessions determine the learning rate.
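The layer order described above can be written down as plain data, which makes the node counts easy to check before building the network in a deep learning framework. This is only a reading of the prose, not the authors' code. Several values are assumptions: the "set" number of convolutional nodes is taken to be 32, the second max-pooling layer reuses the pool size of 4, and the rate of the first dropout layer (`conv_dropout`) is a placeholder because the text does not state it.

```python
def ptcn_layer_plan(conv_nodes=32, lstm_nodes=32, pool_size=4,
                    conv_dropout=0.5, final_dropout=0.5):
    """Return the PTCN layer order as (layer_type, settings) pairs."""
    quarter, half = conv_nodes // 4, conv_nodes // 2

    def conv(filters):
        # Every convolution shares the same initializer, regularizer, and activation.
        return ("conv1d", {"filters": filters, "strides": 1,
                           "kernel_initializer": "variance_scaling",
                           "kernel_regularizer": "l1_l2",
                           "activation": "leaky_relu"})

    return [
        ("embedding", {}),
        conv(quarter),                                   # lower-level features
        conv(conv_nodes),                                # higher-level features
        conv(half),
        ("max_pooling1d", {"pool_size": pool_size}),
        conv(half),
        conv(quarter),
        ("max_pooling1d", {"pool_size": pool_size}),     # assumed same pool size
        ("batch_normalization", {}),
        ("dropout", {"rate": conv_dropout}),             # placeholder rate
        ("lstm", {"units": lstm_nodes,
                  "kernel_initializer": "variance_scaling",
                  "kernel_regularizer": "l1_l2",
                  "activation": "leaky_relu"}),
        ("dropout", {"rate": final_dropout}),
        ("dense", {"units": 1, "activation": "sigmoid"}),  # pass/fail probability
    ]


plan = ptcn_layer_plan()
filters = [cfg["filters"] for name, cfg in plan if name == "conv1d"]
print(filters)  # [8, 32, 16, 16, 8]
```

The narrow-wide-narrow pattern of the filter counts mirrors the text: a quarter of the nodes first, the full set next, then half, and after pooling half and a quarter again.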
3.2.2. The PTCN Modelling Process
To ensure the model is producing the best results, the parameters of the PTCN are hyper-tuned.
The data is split 80/20% for training and validation of the model during the hyper-tuning
sessions. The hyper-tuning model’s performance evaluates a random selection of 100 samples.
The best parameters are captured and set for the final model training. Once the final parameters
are identified, the PTCN is trained for a final training session. The final training session of the
PTCN uses the best parameters identified during the hyper-tuning sessions, and 10-fold
cross-validation is implemented to evaluate the model’s performance.
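The 10-fold evaluation can be sketched as an index split plus a summary of the per-fold accuracies. The helper names `k_fold_indices` and `summarize` are hypothetical, and the use of the sample standard deviation is an assumption, since the paper does not say which form it reports.

```python
import random
from statistics import mean, stdev


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled, disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]        # fold sizes differ by at most one
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def summarize(fold_accuracies):
    """Mean and sample standard deviation of the per-fold accuracies."""
    return mean(fold_accuracies), stdev(fold_accuracies)


# 196 samples corresponds to 98 documents per class.
splits = list(k_fold_indices(196, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each fold would train a fresh model on `train`, score it on `test`, and the ten scores would then be passed to `summarize`.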
4. RESULTS
After implementing the model using 10-fold cross-validation, the PTCN Model averaged 67.32%
evaluation accuracy with a standard deviation of 9.11. The best model performed at 76.32%
evaluation on fold 6, as depicted below in Figure 7.
Fig. 7: Sample of PTCN training and validation performance including early stopping.
