
OpenITI MAKHZAN

Author: Parkes Allen, Jonathan; Mullan, John; Nigst, Lorenz; Barber, Mathew; Shahid Khan, Taimoor; Seydi, Masoumeh; Chen, Danlu; Weng, Yufei; Vogler, Nikolai; Murel, Jacob; Eshera, Osama; Berg-Kirkpatrick, Taylor; Smith, David; Bowen Savant, Sarah; Thomas Miller
Publisher: Zenodo
DOI: 10.5281/zenodo.16410106
Source: https://zenodo.org/records/16410106/files/OpenITI-Makhzan_ReleaseNotes_2025-1-1.pdf
OpenITI Makhzan, ver. 2025.1.1
Release Notes
The current publication is the first release of OCR/HTR output of selected print and
manuscript pages by the Open Islamicate Texts Initiative (OpenITI) using
eScriptorium. The release includes:
● data: main data folder (OpenITI-Makhzan_Data_2025-1-1.zip) containing
ZIP files for each document (i.e., book or manuscript).
○ Each ZIP file is named after the document id assigned by eScriptorium,
which is preserved in the “Doc ID” column in the metadata file.
○ Each ZIP file contains image and XML files for the document’s pages,
whose filenames include an eScriptorium-assigned unique number
preserved in the “Doc Part ID” column in the metadata file (i.e., file
names follow the “Doc ID_Doc Part ID” pattern).
○ Each ZIP file (document folder) also includes a mets.xml file that holds
the information for the files of the current document.
○ ZIP files can be directly imported into eScriptorium as documents. [1]
● metadata: metadata file (OpenITI-Makhzan_Metadata_2025-1-1.tsv)
○ Each row in the metadata file corresponds to a single page (document
part) in the associated ZIP file of the document. The columns of the
metadata file are described in this document.
○ All pages belonging to the same document share the same document id
(“Doc ID”).
○ Each page has its own unique “Doc Part ID,” contained in the file name
of that page’s image/XML pair.
● release_notes: OpenITI-Makhzan_ReleaseNotes_2025-1-1.pdf.
[1] In case the import fails, try to unzip and re-zip the file and then re-import the new ZIP file.
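
The file layout described above can be explored with a few lines of Python. The following is a minimal sketch, not part of the release: it assumes that page files are named exactly `<Doc ID>_<Doc Part ID>.<extension>` with numeric ids (the release notes state only the general pattern), and the helper names `parse_part_name`, `group_pages`, and `rezip` are hypothetical. The `rezip` helper is one possible way to carry out the re-zip workaround from the footnote.

```python
import re
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed filename shape: "<Doc ID>_<Doc Part ID>.<extension>", e.g. "12_345.xml".
# Numeric ids and this exact separator are illustrative assumptions.
PART_NAME = re.compile(r"^(?P<doc_id>\d+)_(?P<part_id>\d+)$")


def parse_part_name(filename):
    """Return (doc_id, part_id) for a page file, or None for other files such as mets.xml."""
    match = PART_NAME.match(PurePosixPath(filename).stem)
    if match is None:
        return None
    return match.group("doc_id"), match.group("part_id")


def group_pages(zip_path):
    """Map each Doc Part ID to the archive members (image and XML) that carry it."""
    pages = defaultdict(list)
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            parsed = parse_part_name(name)
            if parsed is not None:
                pages[parsed[1]].append(name)
    return dict(pages)


def rezip(src_zip, dst_zip):
    """Rewrite an archive member by member into a fresh ZIP file."""
    with zipfile.ZipFile(src_zip) as src, \
         zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            dst.writestr(info.filename, src.read(info))
```

Calling `group_pages` on one document ZIP should yield one entry per page, each listing that page's image/XML pair; a page with only one member would indicate a missing file.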
Metadata Column Descriptions
● Doc ID: Primary key of the eScriptorium document from which the page or
document part has been harvested.
● Doc Part ID: Primary key of the individual page or part within the
corresponding document.
● Link to Doc Part: Direct link to the document part/page in eScriptorium.
● Transcription Layer Name: Name of the transcription layer within the
document part.
● Language: Language of the text on the page.
● Lines: Number of text lines on the page.
● Script: Script style of the text (e.g., naskh, nastaliq).
● URI: OpenITI URI of the text (given if the text is part of the OpenITI corpus).
● Title of Volume: Title of the volume (manuscript or print) containing the text;
one volume may contain multiple texts.
● Title of Text: Title of the individual text.
● Place of Origin: Place of origin or production of the volume (manuscript or
print).
● Call No: Shelfmark or call number of the manuscript.
● Repository: Holding institution of the manuscript.
● Place of Repository: Geographic location of the holding institution.
● Public Link: Static URL to the source manuscript (if available online).
● Author/Compiler of Volume: Name of the author, editor, or compiler of the
volume.
● Author/Compiler Dates: Birth and death dates of the author/compiler of the
volume.
● Author of Text: Name of the author of the text.
● Author Date: Birth and death dates of the text’s author.
● Volume Date: Date of the volume’s compilation or production.
● Type: Page type (e.g., publication [kitābat], manuscript).
● Form: Literary or visual form of the text (e.g., prose, poetry, title page).
● Genre: Literary or intellectual genre of the text (e.g., history, biography,
philosophy).
● Category: Vetting status of the transcription.
● Segmentation Complete: TRUE/FALSE indicating whether all lines on the
page have been segmented.
● Transcription Complete: TRUE/FALSE indicating whether the transcription of
the page is complete.
● Transcription: Name of the person who performed the transcription.
● Vetting: Name of the person who vetted the transcription.
● Notes: Miscellaneous notes (e.g., tight page spacing, columnar layout,
marginalia).
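
As an illustration of how these columns might be used together, the sketch below selects pages whose segmentation and transcription are both marked complete and groups them by document. It rests on assumptions that should be checked against the actual file: that the metadata file is tab-separated UTF-8 with a header row using the column names listed above, and that the completeness flags are stored as the literal strings TRUE and FALSE. The function names are hypothetical.

```python
import csv
from collections import defaultdict


def load_rows(path):
    """Read the metadata file as a list of dicts keyed by column name."""
    # Assumes a tab-separated file with a header row.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def is_true(value):
    """Interpret a TRUE/FALSE cell; anything other than TRUE counts as false."""
    return (value or "").strip().upper() == "TRUE"


def finished_parts_by_document(rows):
    """Group the Doc Part IDs of fully segmented and transcribed pages by Doc ID."""
    grouped = defaultdict(list)
    for row in rows:
        if is_true(row.get("Segmentation Complete")) and is_true(
            row.get("Transcription Complete")
        ):
            grouped[row["Doc ID"]].append(row["Doc Part ID"])
    return dict(grouped)


def expected_stem(row):
    """Build the assumed file-name stem ("Doc ID_Doc Part ID") for a metadata row."""
    return f"{row['Doc ID']}_{row['Doc Part ID']}"
```

The stem returned by `expected_stem` can then be matched against the member names of the document's ZIP file to locate the image/XML pair for a given row.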