
NutriBase – management system for the integration and interoperability of food- and nutrition-related data and knowledge

Author: Valenčič, Eva; Beckett, Emma; Bucher, Tamara; Collins, Clare E.; Seljak, Barbara Koroušić
Publisher: Zenodo
DOI: 10.3389/fnut.2024.1503389
Source: https://zenodo.org/records/17536610/files/FNUT-2025-2.pdf
Frontiers in Nutrition (frontiersin.org)
E aValenčič
1,2,3,4*, EmmaBecke
4,5,6, Tama aBuche
4,5,
Cla eE.Collins
4,5 and Ba ba aKo oušićSeljak
1,2
1 Computer Systems Department, Jožef Stefan Institute, Ljubljana, Slovenia, 2 Jožef Stefan International Postgraduate School, Ljubljana, Slovenia, 3 School of Health Sciences, College of Health, Medicine and Wellbeing, University of Newcastle, Newcastle, NSW, Australia, 4 Food and Nutrition Research Program, Hunter Medical Research Institute, Newcastle, NSW, Australia, 5 School of Environmental and Life Sciences, College of Engineering, Science and Environment, University of Newcastle, Newcastle, NSW, Australia, 6 Department of Science, Nutrition Research Australia, Sydney, NSW, Australia
Introduction: Contemporary data and knowledge management and exploration are challenging due to regular releases, updates, and different types and formats. In the food and nutrition domain, solutions for integrating such data and knowledge with respect to the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles are still lacking.
Methods: To address this issue, we have developed a data and knowledge management system called NutriBase, which supports the compilation of a food composition database and its integration with evidence-based knowledge. This research is a novel contribution because it allows for the interconnection and complementation of food composition data with knowledge and takes what has been done in the past a step further by enabling the integration of knowledge. NutriBase focuses on two important challenges: data (semantic) harmonization by using the existing ontologies, and reducing missing data by semi-automatic data imputation made from conflating with existing databases.
Results and discussion: The developed web-based tool is highly modifiable and can be further customized to meet national or international requirements. It can help create and maintain the quality management system needed to assure data quality. Newly generated data and knowledge can continuously be added, as interoperability with other systems is enabled. The tool is intended for use by domain experts, food compilers, and researchers who can add and edit food-relevant data and knowledge. However, the tool is also accessible to food manufacturers, who can regularly update information about their products and thus give consumers access to current data. Moreover, the traceability of the data and knowledge provenance allows the compilation of a trustworthy management system. The system is designed to allow easy integration of data from different sources, which enables data borrowing and reduction of missing data. In this paper, the feasibility of NutriBase is demonstrated on Slovenian food-related data and knowledge, which is further linked with international resources. Outputs such as matched food components and food classifications have been integrated into semantic resources that are currently under development in various international projects.
OPEN ACCESS
EDITED BY
Massimo Lucarini, Council for Agricultural Research and Economics, Italy
REVIEWED BY
Sercan Karav, Çanakkale Onsekiz Mart University, Türkiye
Didier G. Leibovici, The University of Sheffield, United Kingdom
Lindung Parningotan Manik, National Research and Innovation Agency (BRIN), Indonesia
*CORRESPONDENCE
Eva Valenčič
[email protected]
RECEIVED 28 September 2024
ACCEPTED 13 December 2024
PUBLISHED 06 January 2025
CITATION
Valenčič E, Beckett E, Bucher T, Collins CE and Koroušić Seljak B (2025) NutriBase – management system for the integration and interoperability of food- and nutrition-related data and knowledge. Front. Nutr. 11:1503389. doi: 10.3389/fnut.2024.1503389
COPYRIGHT
© 2025 Valenčič, Beckett, Bucher, Collins and Koroušić Seljak. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
TYPE Original Research
PUBLISHED 06 January 2025
DOI 10.3389/fnut.2024.1503389
Valenčič et al. 10.3389/fnut.2024.1503389
KEYWORDS
database management system, food data compilation, food composition data, food composition database, knowledge base
1 Introduction
Food and nutrition-related data and knowledge (D&K) are essential for many research domains, including public health surveillance and promotion, dietary and health assessments, disease prevention, nutrition education, consumer protection, agriculture, food policy, and food labeling (1, 2). D&K, such as food composition data or dietary guidelines, are also necessary for stakeholders in the food industry, retail sector, non-government organisations, policymakers, and ultimately consumers. Consumers rely on D&K when making food and nutrition decisions, while policymakers use food and nutrition-related D&K to obtain accurate scientific evidence needed to design and promote strategies required to improve public health and overall well-being (3, 4).
However, D&K are complex, covering diverse areas such as food composition, food safety, food authenticity, and consumption. This paper focuses on food composition data (FCD) and knowledge for dietary assessment and advising. This is highly important for domain experts and policymakers, as well as consumers, including patients. While FCD contains detailed compositional, biochemical, and physiological data of foods (e.g., how much vitamin C apples contain), knowledge provides additional food-related information (e.g., what is the recommended intake of vitamin C). FCD and knowledge are compiled in various databases; however, their integration and interoperability are lacking (5). Improved integration would enable easier access to the latest evidence-based D&K from different research areas within a single system.
Nowadays, FCD is compiled online in the form of a food composition database (FCDB). FCDBs are usually compiled at the national level but are often used internationally to conduct public health studies (2). Examples include multiple European FCDBs [available through the FoodEXplorer tool (6)], USDA's FoodData Central (7), FAO/INFOODS databases (8), Canadian FooDB (9), and others. In general, FCDBs contain data on traditional, ethnic, and local foods and dishes, with some combining generic and branded foods [e.g., Serbian (10)] and others maintaining separate databases for different food types [e.g., Dutch branded food database (11)]. In addition to institutional databases, numerous company-owned FCDBs also exist, such as the Edamam's food, grocery, and (restaurant) database composed using Natural Language Processing (NLP) techniques (12) and GS1 branded foods, and barcode databases maintained through the Global Data Synchronization Network (GDSN) (13).
There are two main challenges with existing FCDBs: data harmonization and missing data. First, FCDBs may contain data of different quality due to differences in data production methods (food sampling, analyses or estimation, (re)calculation, borrowing), data compilation (collection, aggregation, compilation, and dissemination), and data management. The challenge of data harmonization has been addressed by several networks of excellence. One example is the Food CEN standard (14), which defines requirements on the structure and semantics of food datasets and for interchange of food data. Another initiative, the ESFRI research infrastructure Metrofood (15), contributes to the development of aligned metrology services in the food domain. Moreover, when compiling a FCDB, guidelines and frameworks to assess the quality of data, datasets, and databases (16, 17) need to be acknowledged. Several frameworks also enable unified data classification and description, which need to be considered when harmonizing various FCDBs (2, 18, 19). While these standards and frameworks facilitate the harmonization of food- and nutrition-related data, the problem of linking it with other data types (e.g., medical, environmental, and consumption-related) remains unresolved. The second challenge is related to missing data in FCDBs, which distorts data integrity. Analyzing all components of specific foods poses a significant financial burden to institutions; thus, no FCDB is complete, and updates are not done continuously. The challenge of missing FCD is being addressed in various ways, including borrowing data from other databases, performing tedious manual work, or using computer-supported methods for (semi-)automated data imputation (20, 21).
On the other hand, together with databases, knowledge bases (KBs) are also very important resources. By definition, a KB is an easily accessible online library of collected and organized information and documentation about certain topics (22). The knowledge included in a food and nutrition KB should cover, but not be limited to: standardized classification and description coding systems [e.g., LanguaL (23), FoodEx2 (24), INFOODS (8)]; standardized value documentation (e.g., acquisition type, method type) (18); a chemical database of molecular entities, ChEBI (25); retention and yield factors used to calculate the nutrient content of composite dishes or recipes (26); standardized household measurement units; national dietary reference values and dietary guidelines; physical activity standards; food components' bioavailability; food-drug interactions, and others.
As knowledge accumulates quickly, the creation and maintenance of a KB is tedious work, usually done manually by domain experts. However, semantic resources have complemented KBs and allowed interoperability of D&K from various research domains. Semantic resources like the ontologies [e.g., FoodOn (27), ISO-FOOD (28), FNS-Harmony (29), COMFOCUS (30)] or knowledge graphs [e.g., describing complex relationships between food and biomedical factors (31)] are being developed to formally describe knowledge as a set of concepts and the relationships between those concepts within a domain. To link FCD with semantic resources, FCD needs to be annotated with standardized metadata in machine-readable formats to enable connectivity of terms across different data sources.
Regardless of all research efforts, applicable KBs providing integrated knowledge on food and nutrition are still lacking. There are few KBs that focus on specific subdomains, such as FoodKG (32) for food recommendation based on diet-related knowledge or TasteAtlas (33), a world atlas of traditional dishes, local ingredients, and authentic restaurants.
Abbreviations: API, Application Programming Interface; D&K, Data and knowledge; DKBMS, Data- and knowledge base management system; FCD, Food composition data; FCDB, Food composition database; KB, Knowledge base; NLP, Natural Language Processing; KPI, Key performance indicators; MTBF, Mean time between failures.
The food and nutrition community has created many FCDBs as well as a few KBs, but their integration and interoperability are currently missing. Even when limited just to the integration within FCDBs, information is not harmonized because different coding systems, documentation or standards are used. Some examples of best practice using harmonized FCD are FoodEXplorer (6), FoodCASE (34), FoodData Central, Glycemic Index Research and GI News (35). Some of these tools even enable comparison of FCD from multiple countries. This is important as, with increasing globalization, the availability of international foods and dishes is increasing, and obtaining datasets of non-local foods is necessary. Having databases composed on a national level is important; however, for applied science, it would be useful if compilers could link and integrate not only FCD with each other but also FCDBs with KBs. This is something that we believe does not yet exist or is not publicly available in the food and nutrition domain. The importance of integration and interoperability was also highlighted in the recent paper by Durazzo et al. (36), which further emphasized the necessity of cooperation and D&K sharing between compilers. However, the connectivity among computer systems and/or online platforms is equally necessary.
In the current paper, we introduce a new database management system, called NutriBase, for integrating FCD from different databases with food- and nutrition-related knowledge. The integration is performed in a transparent way and enables, together with harmonization, a reduction in missing data. In Section 2, we explain how publicly available D&K resources, which (currently) represent the baseline of the NutriBase, were identified and collected. Next, we introduce NutriBase and describe its functionality. In Section 3, we describe the compilation process of the Slovenian FCDB and KB, identify issues, discuss possible solutions the system offers, and provide plans for future work. We conclude the paper in Section 4.
2 Materials and methods
2.1 Data and knowledge collection
To demonstrate the feasibility of NutriBase, Slovenian FCD and both national and international semantic resources were collected. Firstly, the analytical compositional data on generic foods from the Slovenian FCDB composed in 2006 and updated in 2012 (37) were imported. The recipes included in the Slovenian FCDB were imported separately, as they require different data handling, such as consideration of yield and retention factors, as well as standards for calculating recipes (38, 39). In addition, branded foods that can currently be purchased in Slovenia are being uploaded through an application programming interface (API) from the Composition and Labeling Information System (CLAS) (40).
To complement the Slovenian FCDB of generic foods, six publicly available FCDBs (Table 1), together with associated metadata and documentation, were either downloaded or linked through an API in late 2020 or 2021. The imported FCDBs consisted of datasets in different formats, and not all of them adhered to the Food CEN standard (14). The imported metadata and documentation include various background information, such as explanations of data sources, procedures of data quality assurance, descriptions of foods and food group classification levels, and explanations of specific component descriptions, calculations and units used. Multiple foreign FCDBs needed to be imported because they contain different data. For example, FoodData Central (US in Table 2), in addition to FCD, also provides the data of household measurement units (e.g., tablespoon, cup, dash), which can be linked to generic foods. Moreover, different components are collected or analyzed across different FCDBs. For instance, some datasets contain data of total carbohydrates (digestible and indigestible, including dietary fiber), whereas others contain only data of available carbohydrates. Of the currently imported FCDBs, only three provide data of total carbohydrate; however, all of them contain data of available carbohydrates and total dietary fiber, thus the total carbohydrates could be calculated. Lastly, relevant evidence-based food and nutrition knowledge was systematically reviewed and collected from publicly available national and international resources, and was further compiled into the NutriBase KB (Table 2).
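The carbohydrate derivation mentioned above can be illustrated with a minimal sketch: when an FCDB does not report total carbohydrates, the value is taken as the sum of available carbohydrates and total dietary fiber. The function name, field names, and numeric values below are hypothetical and are not taken from NutriBase.

```python
from typing import Optional


def total_carbohydrates(total: Optional[float],
                        available: Optional[float],
                        fiber: Optional[float]) -> Optional[float]:
    """Return total carbohydrates (g per 100 g), deriving the value if needed."""
    if total is not None:
        return total  # a reported value takes precedence over a derived one
    if available is None or fiber is None:
        return None  # both parts are needed for the derivation
    return available + fiber


# Hypothetical record from an FCDB that reports only the two parts.
record = {"cho_total": None, "cho_available": 11.4, "fiber_total": 2.4}
derived = total_carbohydrates(
    record["cho_total"], record["cho_available"], record["fiber_total"]
)
```

In a real compilation, a derived value would also carry a method type such as "calculated" so that it is not mistaken for an analytical result.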
The approaches and tools applied and described in the current paper can be used for D&K from any country. The Slovenian D&K are used as an example only. Unlimited publicly available FCDBs and/or KBs can be uploaded or linked via an API to create a new database, as long as they comply with the NutriBase requirements.
2.2 Nu iBase- da a- and knowledge base
managemen sys em
Nu iBase is designed o enable easy in eg a ion wi h o he KBs
and seman ic esou ces concep ualizing he heal h, en i onmen al,
consume beha io s, and ood and nu i ion domains in pa icula .
This da a- and knowledge base managemen sys em (DKBMS) has
TABLE1 FCDBs cu en ly included in he Nu iBase.
Cu en ly Impo ed FCDBs
Coun y
code
No. o
componen s
No. o
ood
g oup
le els‡
No. o
oods /
dishes
Sou ce
ile
o ma
SI 773*15 993 .CSV/.XSL
48
149
FR 60 10 2,807 .CSV/.XSL
58
83
NL 133 27 2,152 .CSV/.XSL
DK 197 18 1,186 .CSV/.XSL
127
UK 178 14 2,910 .CSV/.XSL
71
54
AU 249 22 1,534 .CSV/.XSL
97
US 235 28 7,793†,
210¨
API
*651 om Eu oFIR Thesau i documen and 122 subsequen ly added (own); ‡ = he op
numbe is he highes le el, he bo om numbe is he lowes ( he mos de ailed) le el (sub-
le el); † = SR Legacy Foods; ¨ = Founda ion Foods.
Valenčič e al. 10.3389/ nu .2024.1503389
F on ie s in Nu i ion 04 on ie sin.o g
TABLE2 Resou ces included in he Nu iBase KB.
Sceman ic esou ces
Resou ce name/ ype and
e e ence
Knowledge
ype
Desc ip ion Numbe o en i ies
S anda dized classi ica ions and desc ip ion
coding sys em
FoodEx2
classi ica ion
A ood classi ica ion and desc ip ion sys em de eloped by EFSA-
includes di e en hie a chies and ace s o di e en ood sa e y
domains. (e.g., A00KR#F27.A00KV$F27.A00LN $F27.
A00LB$F27.A00LG; mixed lea y ege ables)
4,445
S anda dized alue documen a ion (11) Componen ype Componen iden i ie s and desc ip o s (e.g., CHO; ca bohyd a e;
use o o al o hose ca bohyd a es diges ed and abso bed in he
in es ine; o al accessible ca bohyd a es include ee suga s, polyols
and dex ins, s a ch, and glycogen).
660 (9 o hese a e o
backwa d compa ibili y only)
56 classi ica ion iden i ie s
(no used o new indexing)
Uni E.g., g ams, millimoles, alpha- ocophe ol equi alen , pe cen . 19 Addi ional 20 added (IU,
g/kg body mass, e c.)
Ma ix uni E.g., pe 100 g o o al ood, pe 100 mL ood olume, pe uni , pe
100 g edible po ion.
20 ma ixes
Value ype E.g., a i hme ic mean, bes es ima e, a e age, below limi o
de ec ion, ace.
20 ypes
Me hod ype Repo ing i he alue was analyzed, calcula ed o impu ed (e.g.,
calcula ed as ecipes, calcula ed om ela ed ood, analy ical
esul ).
20 ypes
Me hod indica o P o iding de ails o he analy ical me hod o o mulas used o
calcula ion (e.g., ch oma og aphy, di e ence, ash calcula ed as
sum o mine als).
214 indica o s
Acquisi ion ype Desc ibes he o igin o he alue (e.g., labo a o y, ood
composi ion able, au ho i a i e documen ).
12 ypes
Re e ence ype E.g., a icle in jou nal, ile o da abase, p oduc label, so wa e. 14 ypes
LanguaL hesau us (10) Cooking me hods E.g., g iddled, cooked by mic owa e, deep ied. 47 me hods
FoodDa a Cen al a US Depa men o
Ag icul u e (USDA)
Measu emen and
household uni s
E.g., ea spoon, slice, ile , cup, could beused o olume o weigh
con e sions.
115 (cu en ly in use) ou o
1923
ChEBI- a chemical da abase and on ology o
molecula en i ies, which is pa o he Open
biomedical on ologies a he EBI, and
Eu opean ELIXIR in as uc u e
Dic iona y o
molecula en i ies
P o iding de ailed da a o chemical en i ies o biological in e es
(e.g., de ini ions, o mulas, on ologies, chemical eac ions, IUPAC
names and iden i ie s)
210 linked o added
componen s
SciName Finde (26) Sea ch ool o
scien i ic and
common names o
plan s and animals
P o iding p ecise iden i y plan s and animals
Allows p ecise iden i ica ion o plan s and animals, and sea ching
he in o ma ion on scien i ic and common names p o ided by
au ho i a i e esou ces (and no om seconda y sou ces)
Mo e han 1,000,000 scien i ic
and common names
Culina y g oups [adap ed om (18, 23)] Culina y g oups /
subg oups ela ed o
e en ion and yield
ac o s.
P o iding he basics o ob aining nu ien con en o oods by
calcula ion me hods (as ecipe calcula ion), based on he amoun
o ing edien s gi en in a ecipe, nu ien composi ion o
ing edien s and ac o s ha conside changes in nu ien con en
( e en ion ac o s), and weigh (yield ac o s) du ing p epa a ion.
31 g oups and subg oups
ela ed o yield ac o s, and 38
ela ed o e en ion ac o s
Slo enian die a y e e ence alues (DRVs)
(27) based on he D-A-CH e e ence alues
adop ed by he Minis y o Heal h o he
Republic o Slo enia
DRVs Re e ence alues o ene gy and nu ien in ake o child en (a
leas 1-yea old), adolescen s, adul s, elde ly, p egnan women and
nu sing mo he s.
34 e e ences o ene gy,
mac o- and mic onu ien s,
o men and women (10
di e en age g oups)
La es die a y guidelines and
ecommenda ions
Na ional and
in e na ional die a y
guidelines and
ecommenda ions
Rele an e idence-based guidelines and ecommenda ions o
di e en consume s (a hle es, p egnan women and nu sing
mo he s, heal hy indi iduals om di e en age g oups).
Cu en ly de ined o
bioma ke s (blood choles e ol
and glucose) and endu ance
spo s.
(Con inued)
Valenčič e al. 10.3389/ nu .2024.1503389
F on ie s in Nu i ion 05 on ie sin.o g
been implemen ed as a web-based ool (Figu e1) o ood compile s
o easily explo e, compile and mos impo an ly, link da a om
di e en FCDBs and KBs. The main goal o his p ocess is achie ing
an op imal linking o D&K, which enables bo owing da a espec ing
he FAIR (Findabili y, Accessibili y, In e ope abili y, and Reusabili y)
p inciples o da a managemen (41), and educing missing D&K.
2.2.1 FCDB compilation
To ensure a semi-automatic connectivity among different sources (FCDBs), standardized components and food groups matching had to be manually performed (Figure 2; Step 1 and 2). Since the composition of food depends on its geographical origin, it is important to also consider the data source and the data most closely related to local foods. Therefore, a pre-set priority list of data sources is integrated within the system and can be adapted if needed. For the Slovenian example this means that European datasets are prioritized before non-EU datasets. This allows experts to semi-automatically compile datasets that are as complete as possible, while also transparently providing the source of specific data (e.g., component value). The pre-set priority list can easily be amended or set for different countries. Moreover, a comparison of a national dataset (in our case, Slovenian) with other, foreign datasets is also enabled. This feature allows borrowing specific data from other FCDBs. Together with food composition data, compilers can also check additional value information, such as value type and method type (if provided). Being able to check additional value information and standards allows compilers to assess the quality of the data and select the most appropriate or accurate one. Additionally, during the FCDB compilation process, basic food information and metadata, such as generic and/or commercial names, allergens, ingredients, food origin, and food images, are also addressed and can be borrowed.
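The pre-set priority list described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the NutriBase implementation: the function name, the exact order of sources within the European and non-European groups, and the example values are all hypothetical.

```python
from typing import Optional

# Hypothetical priority list for a Slovenian compilation: national data first,
# then European datasets, then non-European ones. The order within each group
# is an assumption made for this sketch.
PRIORITY = ["SI", "FR", "NL", "DK", "UK", "AU", "US"]


def borrow_value(candidates: dict[str, Optional[float]],
                 priority: list[str] = PRIORITY) -> Optional[tuple[str, float]]:
    """Return (source, value) from the highest-priority source that has a value."""
    for source in priority:
        value = candidates.get(source)
        if value is not None:  # 0.0 is a valid value, so compare against None
            return source, value
    return None  # no linked source provides this component


# Hypothetical values for a component that is missing in the national dataset.
cystine = {"SI": None, "DK": 0.02, "US": 0.03}
chosen = borrow_value(cystine)
```

Returning the source together with the value keeps the provenance of each borrowed value visible, and passing a different `priority` list mimics amending the list for another country or a manual override by the compiler.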
NutriBase presents an infrastructure that can be adapted for FCD from any country. However, to achieve an optimal linking of D&K and to ease and expedite FCDB compilation, various knowledge resources had to be considered.
2.2.2 Knowledge base compilation
In the NutriBase underlying thesaurus, knowledge about relevant food and nutrition topics is collected and maintained. The KB, implemented within the DKBMS, is connected with all three steps of the workflow seen in Figure 2. Thus, all updates of the KB content will have an immediate impact on linked data in the FCDB. That means whenever new data or knowledge is published, it can easily be imported and linked to existing D&K or substituted for the latest findings. An important part of the implemented KB is food naming by using tags. It provides functionality for unique food naming and metadata annotation. While much work has already been conducted on unifying food description and classification, food naming is still an open issue. Therefore, we have implemented a new food-tagging approach to unify and standardize food naming within the FCDB. This is especially useful when different users are working on a FCDB, as it enables unambiguous communication between all users involved in the working process. In addition, together with using tags, setting rules for food naming has been proposed as another solution.
2.2.3 Usability of NutriBase
Lastly, the usability of the newly developed system was evaluated. We distributed the System Usability Scale (SUS) questionnaire among regular NutriBase users with different profile roles. The SUS tool is a reliable and validated tool for measuring usability, which is frequently used by evaluators of mHealth services (42). It consists of a 10-item questionnaire with five response options for respondents (strongly agree to strongly disagree). The survey was completely anonymous and, after collecting the responses, the participants' scores were carefully interpreted to produce a percentile ranking.
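For readers unfamiliar with SUS, the sketch below shows the standard scoring rule for a single respondent, assuming responses are coded 1 (strongly disagree) to 5 (strongly agree): odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name is hypothetical, and the conversion of scores into a percentile ranking, which requires normative data, is not shown.

```python
def sus_score(responses: list[int]) -> float:
    """Compute the System Usability Scale score (0-100) for one respondent."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs exactly 10 responses coded 1-5")
    total = 0
    for item, response in enumerate(responses, start=1):
        if item % 2 == 1:
            total += response - 1  # odd items are positively worded
        else:
            total += 5 - response  # even items are negatively worded
    return total * 2.5


best_case = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])  # 100.0
neutral = sus_score([3] * 10)  # 50.0
```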
3 Results and discussion
3.1 The compilation of the Slovenian FCDB and KB
Throughout the entire compilation process (Figure 2), D&K were maintained in accordance with the FAIR principles. Managing D&K to ensure that the format of foreign FCDBs and KBs remains unchanged from the original sources has been a key requirement in NutriBase's development (Figure 3).
3.1.1 Components matching
To create and link the Slovenian database, the compilation process was initiated by components matching (Figure 2, Step 1). The Slovenian FCDB complies with the CEN Food standard (14); therefore, the components specified with respect to the EuroFIR thesaurus of components (18) were manually matched with components from the foreign FCDBs (Figure 4 presents the user interface of this process). Although most of the selected foreign FCDBs also comply with the CEN Food standard, mismatched components (i.e., different names for the same components among different countries) were still present (examples are shown in Table 3).
Components were matched manually by domain experts to ensure a correct and unambiguous matching. Moreover, the result can be provided as an input to the FNS-Harmony ontology (43), which has been developed within the FNS-Cloud project to support interoperability of food- and nutrition-related data in the European Open Science Cloud (EOSC) and is available through the NCBO Bioportal. NutriBase could be integrated with FNS-Harmony, which reuses or incorporates several ontologies, including FoodOn (27). In this case, food compilers would not only be able to provide but also use new knowledge about semantic integration with other systems, such as GS1 GDSN (44).

TABLE 2 (Continued)

| Semantic resources: resource name/type and reference | Knowledge type | Description | Number of entities |
|---|---|---|---|
| Physical activity related standards | Metabolic equivalent of task (METS) | E.g., basketball, swimming, mopping, walking, sitting. | 541 tasks |
| | Physical activity level (PAL) | E.g., sedentary or light activity lifestyle. | 5 levels per sex |

FIGURE 1
User interface of NutriBase.
3.1.2 Food groups matching
Firstly, food groups were designed based on the classification of foods used by relevant information systems in Slovenia, as well as the EuroFIR standard (18), which is intended for generic foods. Since the Slovenian FCDB also includes branded foods, classification systems for these had to be considered as well. However, we found that different Slovenian institutions use different classification systems. This suggests that even within a single country, it might be necessary to follow and comply with several standards. Examples are the Slovenian classification system, which is based on public procurement and is determined by law, and the Dunford classification system (45), specifically developed for branded foods. Currently, the Slovenian FCDB includes three hierarchical classification levels: 15 groups on the first, 48 groups on the second, and 160 on the third (and most detailed) level.
In addition to manually matching national food classification systems with one another, the food groups used in the Slovenian FCDB were also matched with those used in the foreign FCDBs (Figure 2, Step 2). An example of a matched food group, Fresh vegetables, among FCDBs is presented in Table 4. The task of food groups matching was especially challenging, as different countries use different numbers of classification levels. For example, foods in France and the UK are classified into up to three levels, in Australia and Denmark into two levels, and in the Netherlands and USA into just one level. Moreover, the level of detail within food groups varies. As shown in Table 4, some countries group all vegetables together, while the others sub-classify them further (e.g., root vegetables, fruiting vegetables).
To ensure accurate food classification and assist users in using NutriBase, a feature was implemented allowing compilers to add examples of foods allocated to a specific food group. This feature was found to be very useful, as it enables users to unambiguously select the correct food group. Additionally, manually matched food groups can also be provided as inputs into FNS-Harmony.
3.1.3 FCDB compilation
The FCDB compilation process (Step 3 in Figure 2) began with manually checking and correcting a dataset of 14,064 entries for 443 generic foods analyzed by the Biotechnical Faculty of the University of Ljubljana in 2006 and 2012 (37). Together with the composition data, annotated metadata (e.g., value information) were also reviewed. Certain components were specifically checked to ensure compliance with the standards, for example, the differences between total available carbohydrates and total carbohydrates. This entire process aligns with the first 12 steps of the generic compilation process described by Westenbrink et al. (2), currently excluding Step 5 (attribution of quality index) and Step 11 (physical storage). The evaluation of Slovenian data quality (17) and the database quality evaluation, as suggested by the recently published FAO/INFOODS framework (16), are currently underway.
Next, the Slovenian name of each generic food was reviewed, and a scientific name (when appropriate), an English name, and synonyms were assigned based on the new food-tagging approach. To achieve this, tags were defined, and rules for their application were established within each food group. During this process, we found that similar foods might have different names. This can make searching for a specific food within the FCDB harder for compilers as well as for consumers accessing publicly available FCDBs. For example, the only difference between 'Baked eggplant with added cheese and tomato sauce' and 'Aubergine prepared in tomato sauce and cheese, frozen' is that one is baked and the other is frozen, but the names are very different. Therefore, using tags for food naming helps unify the FCDB and simplifies searching for specific foods. Moreover, we ensured the naming is clear to all users, specifically to consumers accessing FCD (e.g., via a mobile app), who may find it challenging to understand the processing conditions of foods. For example, meat can be analyzed as raw (e.g., beef filet) or heat-treated (e.g., beef filet, grilled). However, experience shows that consumers do not consider 'beef filet' as raw, but rather as ready-to-eat steak. Therefore, adding the 'raw' tag to raw meat seemed reasonable. On the other hand, it is clear to consumers that 'banana' is raw, and they do not expect this tag added to fresh fruits. Thus, the 'raw' tag is used in some food groups but not in others. In addition, the tag 'peeled' is used only when appropriate (e.g., 'apple, peeled', but not 'banana, peeled'). Currently, each food group at the third hierarchical level within the tool has an average of 15.4 tags.
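The group-specific tag rules described above can be sketched as a small lookup. This is only an illustration: the group names, the tag sets, the exception list, and the function name are assumptions made for the example and do not reproduce the rules actually defined in NutriBase.

```python
# Hypothetical rules: which processing tags are shown in a food name, per group.
GROUP_TAG_RULES = {
    "meat": {"raw", "grilled", "frozen"},
    "fresh fruit": {"peeled", "frozen"},  # 'raw' is implied, so it is not shown
}
# Hypothetical exception list: foods whose names never carry the 'peeled' tag.
PEEL_NOT_NAMED = {"banana"}


def display_name(base: str, group: str, tags: list[str]) -> str:
    """Build a unified food name from a base name and the tags allowed in its group."""
    allowed = GROUP_TAG_RULES.get(group, set())
    shown = [
        tag for tag in tags
        if tag in allowed and not (tag == "peeled" and base in PEEL_NOT_NAMED)
    ]
    return ", ".join([base, *shown])


print(display_name("beef filet", "meat", ["raw"]))               # beef filet, raw
print(display_name("banana", "fresh fruit", ["raw", "peeled"]))  # banana
print(display_name("apple", "fresh fruit", ["raw", "peeled"]))   # apple, peeled
```

Keeping the rules in data rather than in code means a compiler could adjust which tags apply to a food group without changing the naming logic.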
Additionally, the initial Slovenian dataset of generic foods was
manually linked with the same or similar food items from the selected
foreign FCDBs. The linking was carried out by domain experts. First,
the English names were compared, followed by a comparison of the
main food components. If the food composition was similar, the
food items were linked together and the missing data were imputed
from the foreign FCDBs. Table 5 presents an example of the number
of imputed values for the Fresh vegetables food group from a specific
FCDB. As can be seen, only one value for total carbohydrates could
be borrowed from the US database, while the rest were taken from
the Slovenian FCDB. However, cystine values are missing in the Slovenian
FCDB, so they were borrowed from the Danish and US databases (the
other FCDBs do not contain data for cystine). NutriBase allows
linking one food with multiple foods within one database or across
multiple databases. For example, the Slovenian 'average white bread'
can be linked with 'white baguette' and 'white loaf' from one FCDB,
and with 'white bread' from the other FCDBs. The borrowed data will,
however, be displayed based on the pre-set priority list of FCDBs. In
our case, when a food item is linked with food item(s) from across
different FCDBs, data from European datasets were prioritized before
non-EU datasets. However, compilers can manually change the data
source and select (borrow) non-EU data to be displayed if it is more
appropriate. We found this approach to be very convenient, as it
provides compilers with data most closely related to the local foods,
but it still gives them freedom to select other data. Moreover, the
manually matched foods present a valuable asset that can be used to
construct a gold standard corpus, i.e., a corpus of text annotated with
food entities required for NLP techniques, such as CafeteriaFCD (46).
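The priority-based display of borrowed data described above can be sketched as follows. This is a minimal illustration and not the NutriBase implementation: the function name `select_value`, the exact ordering within `PRIORITY`, and the numeric values are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

# Assumed priority list: European FCDBs first, non-EU FCDBs last.
# The actual pre-set order used in NutriBase is not given in the text.
PRIORITY = ["FR", "NL", "DK", "UK", "AU", "US"]


def select_value(
    linked_values: Dict[str, float],
    override: Optional[str] = None,
) -> Optional[Tuple[str, float]]:
    """Pick the value to display for one component of a linked food.

    linked_values maps an FCDB code to the value found in that FCDB.
    override lets a compiler manually choose a source instead of the default.
    """
    if override is not None and override in linked_values:
        return override, linked_values[override]
    for source in PRIORITY:
        if source in linked_values:
            return source, linked_values[source]
    return None  # no linked FCDB provides this component


# Made-up cystine values, available only in the Danish and US datasets.
cystine = {"US": 0.03, "DK": 0.02}
default_choice = select_value(cystine)          # European source wins
manual_choice = select_value(cystine, "US")     # compiler override
```

Keeping the override as an explicit argument mirrors the described workflow: the default follows the priority list, while the compiler's manual choice always takes precedence.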
FIGURE 2
Flowchart of the compilation process to link foods from different FCDBs.
FIGURE 3
Overview of NutriBase structure.
As with generic foods, branded foods can also be linked with
similar generic foods from either the national FCDB or foreign FCDBs. In
this case, the original FCD of a branded food is taken from the nutrition
declaration table, while the FCD not provided on the nutrition
declaration table (e.g., micronutrients) can be imputed from FCDBs and
transparently marked as such. This is especially beneficial when
collecting food consumption data for the national food consumption
survey. As seen in the EU Menu project, consumers usually provide only
the brand or production line of the food item when reporting food
intake. For example, instead of reporting consumption of 'full fat milk',
they reported a producer's name for such milk. Since the nutrition
declaration table usually provides only the energy value
and six other nutrients, the values of micronutrients are unknown.
Thus, branded foods could be linked with generic foods to compose the
complete dataset, which would provide the opportunity to more
accurately assess the food intake of individuals and the overall population.
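Completing a branded food with values from a linked generic food, while keeping imputed values transparently marked, could look like the sketch below. The function name `complete_branded`, the component keys, and all numbers are hypothetical and serve only to illustrate the idea.

```python
def complete_branded(declared: dict, generic: dict) -> dict:
    """Merge label values with values borrowed from a linked generic food.

    Values from the nutrition declaration always take precedence; anything
    borrowed from the generic food is flagged as imputed.
    """
    merged = {}
    for component, value in declared.items():
        merged[component] = {"value": value, "imputed": False}
    for component, value in generic.items():
        if component not in merged:
            merged[component] = {"value": value, "imputed": True}
    return merged


# Made-up values for a branded full fat milk and a linked generic milk.
label = {"energy_kcal": 64, "fat_g": 3.5}
generic_milk = {"fat_g": 3.6, "calcium_mg": 120}
profile = complete_branded(label, generic_milk)
```

Storing the `imputed` flag next to each value keeps the origin of every number visible to whoever uses the completed dataset.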
Finally, yet importantly, internationally accepted algorithms to
avoid errors were selected and applied to produce aggregated data
(e.g., recipe calculations) [Steps 14 and 15 according to Westenbrink
et al. (2)]. In addition, the compiled and aggregated data within
NutriBase were verified (and corrected if needed) [Steps 16 and 17,
according to Westenbrink et al. (2)] to prevent hazards related to data
validation. The majority of the FCD validation has been done
manually; however, the tool automatically performs consistency checks
for some metadata and components (e.g., the content of a specific
component is not larger than 100 g (converted regardless of the unit),
the sum of proximates is ≤105 g, the value of saturated fatty acids is not
larger than the value of total fats, etc.). The validated data are then stored
and disseminated [Steps 18 to 22, according to Westenbrink et al. (2)].
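The three consistency checks named above can be expressed as a short sketch. It assumes that all values have already been converted to grams per 100 g of food; the function name `consistency_issues`, the component codes, and the set of components counted as proximates are assumptions for illustration, not the rules as implemented in NutriBase.

```python
from typing import Dict, List

# Assumed set of proximate components (codes are illustrative).
PROXIMATES = ("WATER", "PROT", "FAT", "CHO", "FIBT", "ALC", "ASH")


def consistency_issues(food: Dict[str, float]) -> List[str]:
    """Return a list of consistency problems for one food item.

    food maps a component code to its content in g per 100 g.
    """
    issues = []
    # Check 1: no single component can exceed 100 g per 100 g of food.
    for code, grams in food.items():
        if grams > 100:
            issues.append(f"{code} exceeds 100 g per 100 g")
    # Check 2: the proximates should sum to at most 105 g.
    total = sum(food.get(code, 0.0) for code in PROXIMATES)
    if total > 105:
        issues.append("sum of proximates exceeds 105 g")
    # Check 3: saturated fatty acids cannot exceed total fat.
    if "FASAT" in food and "FAT" in food and food["FASAT"] > food["FAT"]:
        issues.append("saturated fatty acids exceed total fat")
    return issues


# Made-up records: one plausible, one with two inconsistencies.
good = {"WATER": 60, "PROT": 20, "FAT": 10, "FASAT": 4, "CHO": 8, "ASH": 1}
bad = {"WATER": 70, "PROT": 20, "FAT": 10, "FASAT": 12, "CHO": 10}
```

Returning a list of messages rather than a single pass/fail flag lets a compiler see every problem in a record at once.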
3.1.4 Knowledge base creation
Using semantic resources, a KB was created to support the optimal
food compilation process, as well as for data quality assessment,
traceability, calculations and validation. The KB implemented within
NutriBase is meant to be used by domain experts, as it collects the
latest scientific evidence and documentation required for data
management and data source management. The KB also consists of
the reference list and it allows publication metadata to be imported in
standardized formats (e.g., bib). These references can be further linked
to specific data/information, which allows traceability of data and
metadata. Moreover, the information can be edited or added to the
existing KB and updated accordingly. For instance, units listed in the
EuroFIR value documentation (18) can be supplemented or extended
with other units (e.g., IU, ABV) to meet the compilers' needs, or they
can be updated if changes are made to the existing EuroFIR standards.
3.1.5 Linking FCDB with knowledge
Linking FCD from different sources is important, and linking
knowledge from various sources is equally crucial. Both types of
linking can be performed in NutriBase; however, the system also
enables the linking of FCD with knowledge. For instance, a specific
component (e.g., vitamin C) can be linked with relevant dietary
recommendations, such as Slovenian DRVs (47). Therefore, within the
tool, data (component; vitamin C) was interconnected and
complemented with knowledge (dietary requirements for vitamin C),
enabling access to combined information in one place. This approach
takes what has been done in the past a step further by incorporating
knowledge into the system, which can be especially useful for
informing and educating consumers (e.g., via mobile apps). Instead of
providing consumers or app users with just FCD, the incorporated
knowledge can also be provided, which can deliver a more
personalized approach. Our work is consistent with previous works
(5, 27, 48), with the difference that NutriBase is a practical and
applicable tool, whereas the previous works are theory-based.
3.1.6 Tool validation
NutriBase and its functionalities were validated throughout
the entire compilation process of the FCDB and KB. Seven experts
who regularly use NutriBase evaluated it using the SUS tool, which is
used for judging the perceived usability of systems. The SUS score
was 78.9, which falls in the 85th percentile and corresponds to grade A-.
Moreover, six food compilers with different skill levels performed
various tasks (e.g., component matching, food linking) depending on
their user profile role. For example, less skilled compilers only
edited D&K, whereas more experienced compilers performed more
demanding tasks. Regardless of their skill level, all users agreed that
the system is a helpful, easy-to-use tool when compiling an FCDB,
especially because it collects all relevant and needed D&K in
one place.
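For readers unfamiliar with how a SUS score is obtained, the sketch below assumes the standard SUS scoring scheme: ten items rated 1 to 5, odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5. The text does not state the exact procedure used in this study, and the responses shown are made up rather than taken from the seven experts.

```python
def sus_score(responses):
    """Compute one respondent's SUS score (0 to 100) from ten 1-5 ratings."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0
    for item, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5


def mean_sus(all_responses):
    """Average the individual scores to obtain the study-level SUS score."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Made-up answers from two hypothetical respondents.
neutral = [3] * 10
best = [5, 1, 5, 1, 5, 1, 5, 1, 5, 1]
study_score = mean_sus([neutral, best])
```

A study-level value such as the reported 78.9 would then be the mean of the individual respondents' scores.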
3.2 Strengths and limitations of DKBMS
While reviewing analytical data of generic foods from the past
Slovenian FCDB and importing it into the DKBMS, some errors and
gaps were identified and further discussed with compilers. The data
was reviewed using spreadsheets, and it was found that errors were
difficult to identify. However, when using NutriBase to review and edit
the FCD, users agreed that it is a useful and reliable tool. Although
spreadsheets are very popular when handling data, a similar finding
was reported by Presser et al. (34).
To assess the quality of D&K, it is crucial to develop and maintain
a quality management system (2). Currently available FCDBs contain
data of varying quality, mainly due to the use of different resources
and different methods of data acquisition. The metadata used to
describe them, as well as the quantity of data, differ among FCDBs.
Therefore, compilers need to follow standardized guidelines, provide
quality indexes for their original data, and further evaluate their
FCDB. This will help domain experts select the best high-quality
dataset and/or FCDB for their purposes, which can further be used
to obtain accurate results in research, education, and in decision
making for policy and programming (16). Not only is NutriBase a
useful tool to help domain experts compare different datasets and
therefore select the most appropriate one, it can also help national
compilers to evaluate their own original data and metadata, and
ensure the quality of datasets. Moreover, an advantage of the system is
that food manufacturers can gain direct access, and add or edit
food-related data of their products. In this way, important
information about branded foods currently available in stores can
be regularly updated and shared with consumers.
The usage of FCDBs may be significantly restricted due to the
missing data (3). It has been proposed that it is better to include imputed
data, transparently identified as such, than no data at all (3). However,
data should only be borrowed or imputed among the same or similar
foods. Several computational methods for missing data imputation
within FCDBs have been previously researched (20, 21). All of them
concluded that, in order to 'borrow' data, as many details as possible
FIGURE4
Use in e ace o componen ma ching p ocess.
TABLE3 An example o componen ma ching o Slo enian componen s wi h componen s om o eign da ase s.
Componen names among di e en FCDBs
SI FR NL DK UK AU US
Ca bohyd a e, o al
(CHOT) / / Ca bohyd a e by di e ence;
g/ / Ca bohyd a e, by
di e ence; Uni : G
Ca bohyd a e (CHO) Ca bohyd a e
(g/100 g) CHO g
Ca bohyd a es, a ailable; g,
Ca bohyd a e, decla a ion;
g
Ca bohyd a e (g);
CHO
A ailable ca bohyd a e,
wi h suga alcohols; (g)
Ca bohyd a e, by
summa ion; Uni : G
Fibe , o al die a y
(FIBT)
Fibe s
(g/100 g) FIBT_g Die a y ibe ; g AOAC ibe (g);
AOACFIB To al die a y ibe ; (g)
To al die a y ibe (AOAC
2011.25); Uni : G, Fibe ,
o al die a y; Uni : G
Fa , o al (FAT) Fa (g/100 g) FAT_g Fa , o al; g Fa (g); FAT To al Fa ; (g) To al lipid ( a ); Uni : G
Fa y acids, o al
sa u a ed (FASAT)
FA sa u a ed
(g/100 g) FASAT_g Sum sa u a ed; g
Sa d FA /100 g FA (g);
SATFAC, Sa d FA
/100 g (g); SATFOD
To al sa u a ed a y
acids;(%), To al sa u a ed
a y acids; (g)
Fa y acids, o al sa u a ed;
Uni : G