

Citation: Capitain, C.C.; Nejati, F.; Zischka, M.; Berzak, M.; Junne, S.; Neubauer, P.; Weller, P.
Volatilomics-Based Microbiome Evaluation of Fermented Dairy by Prototypic Headspace-Gas
Chromatography–High-Temperature Ion Mobility Spectrometry (HS-GC-HTIMS) and Non-Negative
Matrix Factorization (NNMF). Metabolites 2022, 12, 299. https://doi.org/10.3390/metabo12040299
Academic Editor: Hunter N. B. Moseley
Received: 15 February 2022
Accepted: 24 March 2022
Published: 28 March 2022
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
Article
Volatilomics-Based Microbiome Evaluation of Fermented Dairy by
Prototypic Headspace-Gas Chromatography–High-Temperature
Ion Mobility Spectrometry (HS-GC-HTIMS) and Non-Negative
Matrix Factorization (NNMF)
Charlotte C. Capitain 1, Fatemeh Nejati 2, Martin Zischka 1, Markus Berzak 1, Stefan Junne 2,
Peter Neubauer 2 and Philipp Weller 1,*
1 Institute for Instrumental Analytics and Bioanalytics, Mannheim University of Applied Sciences,
68163 Mannheim, Germany; [email protected] (C.C.C.);
2 Bioprocess Engineering, Institute of Biotechnology, Technische Universität Berlin, Ackerstrasse 76 ACK 24,
13355 Berlin, Germany; [email protected] (F.N.); [email protected] (S.J.);
peter[email protected] (P.N.)
*Correspondence: [email protected]; Tel.: +49-(0)621-292-6484
Abstract:
Fermented foods, such as yogurt and kefir, contain a versatile spectrum of volatile organic
compounds (VOCs), including ethanol, acetic acid, ethyl acetate, and diacetyl. To overcome the
challenge of overlapping peaks regarding these key compounds, the drift tube temperature was
raised in a prototypic high-temperature ion mobility spectrometer (HTIMS). This HS-GC-HTIMS
was used for the volatilomic profiling of 33 traditional kefir, 13 commercial kefir, and 15 commercial
yogurt samples. Pattern recognition techniques, including principal component analysis (PCA) and
NNMF, in combination with non-targeted screening, revealed distinct differences between traditional
and commercial kefir while showing strong similarities between commercial kefir and yogurt.
Classification of fermented dairy samples into commercial yogurt, commercial kefir, traditional mild kefir,
and traditional tangy kefir was also possible for both PCA- and NNMF-based models, obtaining
cross-validation (CV) error rates of 0% for PCA-LDA, PCA-kNN (k= 5), and NNMF-kNN (k= 5) and
3.3% for PCA-SVM and NNMF-LDA. Through back projection of NNMF loadings, characteristic
substances were identified, indicating a mild flavor composition of commercial samples, with high
concentrations of buttery-flavored diacetyl. In contrast, traditional kefir showed a diverse VOC
profile with high amounts of flavorful alcohols (including ethanol and methyl-1-butanol), esters
(including ethyl acetate and 3-methylbutyl acetate), and aldehydes. For validation of the results and
deeper understanding, qPCR sequencing was used to evaluate the microbial consortia, confirming
the microbial associations between commercial kefir and commercial yogurt and reinforcing the
differences between traditional and commercial kefir. The diverse flavor profile of traditional kefir
primarily results from the yeast consortium, while commercial kefir and yogurt are primarily, but not
exclusively, produced through bacterial fermentation. The flavor profile of fermented dairy products
may be used to directly evaluate the microbial consortium using HS-GC-HTIMS analysis.
Keywords:
gas chromatography–ion mobility spectroscopy (GC-IMS); volatile organic compounds
(VOCs); high-temperature ion mobility spectrometry (HTIMS); non-targeted screening (NTS) using
machine learning; non-negative matrix factorization (NNMF); dairy fermentation; traditional and
commercial kefir; quantitative polymerase chain reaction (qPCR)
1. Introduction
Fermentation of food (such as milk) is an ancient practice for the preservation of food
through the production of acids (such as lactic acid), alcohols, and possibly antimicrobial
compounds [1]. Through the release of free amino acids and the synthesis of vitamins,
fermentation also enhances the nutritional value of foods [2]. Furthermore, flavor
compounds (such as acetaldehyde in yogurt and cheese) and other metabolites (such as
extracellular polysaccharides), which influence the organoleptic properties of the
product, are formed. Traditionally, many fermented foods were produced by natural
fermentation processes that involve symbiotic fermentation by, for example, lactic acid
bacteria and yeast [3]. Since each microorganism produces a versatile spectrum of flavor
compounds, the composition of microorganisms influences the sensory properties, such as
taste and flavor. For the development of new aroma profiles as well as the monitoring of
cofermentations or the early detection of contaminants, holistic analysis tools are necessary,
such as volatilomic profiling by HS-GC-IMS. Being able to correlate certain substances with
the presence of certain microorganisms, as shown in this paper by applying non-negative
matrix factorization (NNMF) analysis, is a big step towards identifying microorganisms in
food products purely based on their volatile organic compound (VOC) profile.
Due to the inherent diversity of biogenic samples, as observed in food analysis, and
the chemical complexity of the sample matrices, analytical approaches are required that
cover a multitude of parameters while providing strong discrimination power.
Analysis of the VOCs of samples, also known as VOC profiling, volatilomic profiling, or
fingerprint analysis, allows for the detection of compounds in complex sample matrices
without the need for detailed a priori knowledge of the molecular composition. Furthermore,
VOC profiling can be completed without advanced sample preparation. GC-MS is commonly used
for VOC analysis. However, due to its high sensitivity and resolving power on the one
hand and its simplicity and robustness on the other, ion mobility spectrometry (IMS) has
gained popularity for the analysis of VOCs [4]. Previous studies demonstrated the good
stability and reproducibility of VOC profiling by HS-GC-IMS, with relative
standard deviations of retention time and drift time values of less than 1% [5]. Moreover,
gas chromatography coupled with ion mobility spectrometry (GC-IMS) has been shown to
be an easy-to-handle and yet highly effective tool for VOC profiling [6].
The complexity of biological samples results from the presence of a variety of compounds,
which, in their entirety, provide a characteristic GC-IMS spectrum, often referred
to as the VOC profile or fingerprint [7,8]. Due to the large amount of data obtained by
VOC profiling based on GC-IMS, machine learning tools are required for data analysis.
In the literature, these are often divided into targeted screening and non-targeted
screening (NTS) approaches. In contrast to classical targeted approaches, where one
or more chemical compounds are selected as markers, NTS analyzes the complete
spectral data using chemometric techniques. For this purpose, signals and
signal intensities are the selected variables, without the prior identification of substances or
the establishment of calibration curves. Since the identification of individual substances
is relevant for model validation, the volatile compounds that are responsible for class
separation are identified following the NTS analysis. As a result, non-targeted VOC profiling
based on GC-IMS in combination with machine learning has emerged as a promising
method for sample monitoring.
1.1. HS-GC-IMS for VOC Profiling
Since the 1970s, when IMS was first known as plasma chromatography, IMS has developed
into a highly sensitive technique for the analysis of VOCs that characterizes analytes by
their ion mobility [9–11]. Due to its robust and easy-to-handle instrumentation, a wide range of
application fields have been established for IMS today, such as food flavor analysis [4],
process monitoring [12,13], and quality control [14], as well as the detection and quantification
of warfare agents [15] and explosives [16,17].
With IMS, analytes are first ionized in the ionization region of the instrument [18].
Beta particles, which are emitted, for example, by a tritium (³H) source, initiate a gas phase
reaction cascade of the drift gas (nitrogen or air), resulting in predominant proton–water
clusters H⁺[H₂O]ₙ, which are commonly referred to as reactant ions [19]. The number of
water molecules (n) depends on the gas temperature and the moisture content of the gas
atmosphere [10]. Depending on their proton affinity, molecules entering the ionization
region react with the reactant ions to produce protonated monomers MH⁺[H₂O]ₙ₋ₓ while
decreasing the intensity of the reactant ion peak (RIP). At higher analyte concentrations,
proton-bound dimers M₂H⁺[H₂O]ₘ₋ₓ are generated by the attachment of additional analyte
molecules. When the concentration further increases, the formation of higher molecular
cluster ions, such as trimers or tetramers, is possible; however, due to their low stability
and short lifetime, higher molecular cluster ions are rarely observed [20].
Following ionization, the analyte ions are transferred into the drift region through a
gating mechanism based on a charged electrode. For precise control of the ion pulse width
admitted into the drift tube, complex gating systems, such as Bradbury–Nielsen or field
switching shutters, are employed [16]. In the drift tube, ions are accelerated towards the
detector, a Faraday plate, and separated by their drift time (or mobility) in
an electrical field at ambient pressure. The ions are slowed down by collisions with
counter-flowing drift gas molecules, to an extent that depends on their collision cross
section (CCS). The equilibrium between the acceleration generated by the electric field and
the deceleration resulting from the collisions with the drift gas molecules results in ions
moving with a constant velocity towards the detector. Depending on their mass, charge, and
spatial structure, the ions are separated in the drift tube and reach the detector at
different drift times [21]. The drift time may be used to calculate the reduced ion
mobility (K0) via the Mason–Schamp equation (see Equation (1)) [22].
$$K_0 = \frac{L}{E \times t_D} \times \frac{p}{p_0} \times \frac{T_0}{T} \qquad (1)$$

where
K0 = reduced ion mobility in cm² V⁻¹ s⁻¹
L = drift length in cm
E = electric field strength in V cm⁻¹
tD = drift time in s
p = pressure of the drift gas in hPa
p0 = standard (reference) pressure: p0 = 1013.2 hPa
T = temperature of the drift gas in K
T0 = standard (reference) temperature: T0 = 273.2 K
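
As a minimal sketch of how Equation (1) can be evaluated, the following Python function computes K0 from the quantities defined above. The function name, argument names, and the numerical values in the usage lines are illustrative assumptions and are not instrument settings reported in this work.

```python
def reduced_mobility(drift_length_cm, field_v_per_cm, drift_time_s,
                     pressure_hpa, temperature_k,
                     p0_hpa=1013.2, t0_k=273.2):
    """Return the reduced ion mobility K0 in cm^2 V^-1 s^-1 (Equation (1)).

    p0_hpa and t0_k default to the reference values given in the text.
    """
    if min(drift_length_cm, field_v_per_cm, drift_time_s,
           pressure_hpa, temperature_k) <= 0:
        raise ValueError("all physical quantities must be positive")
    # Measured mobility K = L / (E * tD)
    mobility = drift_length_cm / (field_v_per_cm * drift_time_s)
    # Normalize to the reference pressure and temperature
    return mobility * (pressure_hpa / p0_hpa) * (t0_k / temperature_k)


# Hypothetical values chosen only to illustrate the call (not from the paper):
# 10 cm drift tube, 500 V/cm, 10 ms drift time, reference pressure.
k0_cold = reduced_mobility(10.0, 500.0, 0.010, 1013.2, 273.2)
k0_hot = reduced_mobility(10.0, 500.0, 0.010, 1013.2, 546.4)
print(k0_cold, k0_hot)
```

For the same measured drift time, doubling the drift gas temperature halves the computed K0, because the temperature enters Equation (1) only through the factor T0/T.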
To avoid clustering in the ionization or drift region, IMS devices are commonly coupled
to column separation techniques, such as liquid chromatography (LC) or gas chromatography
(GC). Column separation coupled with drift time IMS separates analytes by two
orthogonal ‘features’, the retention time through chromatography and the
drift time (or mobility) through IMS, resulting in a highly resolved two-dimensional (2D)
GC-IMS spectrum [6,23]. In LC analysis, soluble compounds can be separated; however,
sample preparation is a critical step for the data quality [24]. In GC analysis, the volatility
of the analytes is a prerequisite. Headspace (HS)-based techniques allow for the analysis
of untreated or almost untreated samples, avoiding time-consuming sample pretreatment
steps [5], and HS autosamplers are commonly used for GC sample injection.
HS-GC-IMS has been demonstrated to be an effective technique for the evaluation
of the VOC profiles of biological samples due to its simple system setup, robustness, and
affordability [25–29]. Chemical profiling of food and beverages in combination with chemometric
analysis is widely used for food authentication and ultimately to identify food adulteration
and fraud [30]. Furthermore, the VOC profile is influenced by production processes as well
as storage conditions. Consequently, process control and quality assurance, such as the
control of food freshness and food safety, are topics of interest for NTS using HS-GC-IMS
techniques [31,32]. While LC in combination with MS has been successfully applied for
specific biochemical and metabolic profiling [33,34], in this work, a GC
setup was preferred due to its technical simplicity and simple sample preparation. In LC
analysis, any soluble compound can be separated, but sample preparation is a critical step
for the data quality [24]. Especially in combination with the ³H source of the IMS, the lack
of solvents in GC analysis is advantageous to sensitivity and resolution.
1.2. Pattern Recognition and Dimension Reduction Techniques
Each fermented product displays a unique flavor profile, with a few compounds being
characteristic of certain species of microorganisms or microbial consortia and other
compounds being found in variable concentrations among many different dairy products. Since
microorganisms secrete a variety of side metabolites, which often cannot be assigned to one
specific microorganism, a comprehensive peak extraction method is needed that enables the
extraction of a wide range of compounds while minimizing potentially interfering
co-extractives. Since unspecific compounds are present [35], in this work, an NTS approach not
requiring the specific identification of chemical marker sets prior to sample analysis was
used. Due to the complexity of the flavor profile of microorganisms, a multivariate data
analysis (MVA) approach was needed for sample analysis [36]. MVA approaches can be divided
into exploratory, classification, and calibration methods. Exploratory methods, such
as principal component analysis (PCA) or NNMF, are unsupervised and solely used for
pattern recognition, whereas methods such as PCA-LDA, kNN, or PLS-DA are supervised
methods used for classification.
PCA is a powerful technique for the unsupervised discovery of patterns in data and
is also applied for dimension reduction [37]. The information extracted from a data
matrix is explained through principal components (PCs), which are orthogonal (mathematically
independent) to each other. Another dimension reduction technique, which is less
frequently used, is NNMF. In NNMF analysis, a matrix X is factorized into two matrices
W and H, with the requirement that all three matrices must contain only zero or positive
elements. Therefore, the sample features must be non-negative values, such as those provided
by HS-GC-IMS. Prior to factorization, a k-value, also known as the ‘rank of factors’, needs
to be specified. The n-by-m matrix X is then approximated by the product of an n-by-k
matrix W and a k-by-m matrix H. The factorization is not exact, as W × H is a lower-rank
approximation of X. The factors W and H minimize the root mean square residual D between
X and W × H. As NNMF decomposes samples into sums of their parts, NNMF models are, as
opposed to PCA models, easily interpretable. Since PCA and NNMF models are developed
without labels or prediction steps, they are generally considered unsupervised. Unsupervised
statistical methods are exploratory methods that can be used to study data structures
and search for clusters of samples [38]. Hierarchical cluster analysis (HCA) of PCA or
NNMF models, displayed in a tree-like diagram (dendrogram), is, for instance, used to
visualize multivariate associations and sample similarities [39].
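
The factorization of X into W and H can be illustrated with a short, dependency-free Python sketch. It uses the multiplicative update rules of Lee and Seung, which is an assumption made for illustration only: the text does not state which NNMF solver was used in this work, and the function names and the small example matrix are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def rms_residual(x, w, h):
    """Root mean square residual D between X and W x H."""
    wh = matmul(w, h)
    n, m = len(x), len(x[0])
    total = sum((x[i][j] - wh[i][j]) ** 2 for i in range(n) for j in range(m))
    return (total / (n * m)) ** 0.5


def nnmf(x, k, iterations=500, seed=0, eps=1e-12):
    """Approximate a non-negative n-by-m X by W (n-by-k) times H (k-by-m)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive random start so that the updates never divide by zero
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


# Hypothetical 4-sample x 3-signal intensity matrix built from two "parts"
x_example = [[2.0, 0.0, 1.0], [4.0, 0.0, 2.0], [0.0, 3.0, 3.0], [2.0, 3.0, 4.0]]
w_example, h_example = nnmf(x_example, k=2)
print(rms_residual(x_example, w_example, h_example))
```

Because the updates only multiply non-negative numbers, W and H stay non-negative throughout, which is what allows each row of H to be read as an additive "part" of the spectra and each row of W as the contribution of those parts to one sample.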
In contrast to unsupervised techniques, which describe data without labels
or target variables, supervised techniques aim to build models able to predict target
variables. In supervised learning, data points or samples are described using predictor
variables, or features, and target variables. For classification tasks, the scores obtained
by unsupervised exploratory analysis are combined with subsequent supervised pattern
recognition techniques to distinguish samples according to defined categories. Among
the PCA-based qualitative methods are linear discriminant analysis (LDA) and k-nearest
neighbors (kNN). Whereas PCA-LDA maximizes the separation of known categories, kNN
assigns the category most common among the k-nearest neighbors. The disadvantage of
using PCA-based methods is that the correlation between dependent and independent
variables is not considered during PCA, which can result in a loss of the information
contained in higher PCs [37]. An alternative classification method is provided by partial
least squares (PLS), where the scores are calculated by considering the relationship between
the independent and dependent variables.
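
The kNN decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, assuming Euclidean distance in the score space; the function name, the two-dimensional scores, and the class labels are hypothetical and do not reproduce the models built in this work.

```python
import math
from collections import Counter


def knn_predict(train_scores, train_labels, query, k=5):
    """Assign the label most common among the k nearest training samples."""
    if not 1 <= k <= len(train_scores):
        raise ValueError("k must be between 1 and the number of training samples")
    # Euclidean distance from the query to every training sample
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_scores, train_labels)
    )
    # Majority vote among the k closest samples
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2D scores (e.g., from PCA or NNMF) for two sample classes
scores = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.0), (1.0, 1.1), (0.9, 1.0), (1.1, 0.9)]
labels = ["commercial", "commercial", "commercial",
          "traditional", "traditional", "traditional"]
print(knn_predict(scores, labels, (0.15, 0.1), k=3))
print(knn_predict(scores, labels, (1.0, 1.0), k=3))
```

An odd k is a common choice for two-class problems because it avoids tied votes; with more than two classes, ties remain possible and need an explicit tie-breaking rule.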
1.3. Microbial Composition and Flavor Profiles of Fermented Dairy
Yogurt is produced by bacterial fermentation with mixed cultures of Lactobacillus delbrueckii
subsp. bulgaricus and Streptococcus thermophilus. Excreted products of the metabolism
of these organisms are lactate, aroma compounds (including acetaldehyde and diacetyl),
and, in some cases, exopolysaccharides [40]. In addition, other lactobacilli and
bifidobacteria are sometimes added during or after yogurt production. Although yogurt has
been commercially available for many years, the aromatic profile produced by Lactobacillus
delbrueckii subsp. bulgaricus and Streptococcus thermophilus is still the focus of current
research. Here, volatiles are commonly identified and quantified by solid-phase microextraction
and gas chromatography–mass spectrometry (SPME-GC-MS), often identifying more than
80 VOCs [41,42]. However, some disadvantages of SPME include a high coefficient of variation
for certain setups, varying matrix characteristics, and interanalyte displacement effects
caused by adsorption onto the fibers [43,44], which are unfavorable for NTS approaches.
Kefir is a fermented dairy product created through the symbiotic fermentation of milk
by lactic acid bacteria and yeasts, with a slightly acidic taste and creamy consistency [3].
The composition of microorganisms differs strongly among kefir grains of different origin,
resulting in a unique and versatile spectrum of flavor compounds which, in total, constitute
the taste and flavor of kefir. Thus, different kefir varieties provide the opportunity to
evaluate the milk fermentation by microbial consortia based on VOC profiling using
HS-GC-IMS analysis.
Kefir originated in Caucasian countries and is considered one of the oldest fermented
milk beverages [45]. Many beneficial health effects have been reported for kefir [46],
including anti-inflammatory [47] and wound-healing properties [48]. Moreover, due to its
association with probiotic bacteria [49] and its capacity to lower cholesterol levels [50], milk
kefir has attracted increased attention of dairy producers and health-conscious consumers.
Traditionally, kefir is produced through natural fermentation by kefir grain, which
is a conglomerate of microbial cells and their metabolites, coagulated milk proteins, and
carbohydrates [51], such as the polysaccharide kefiran [52]. The exact microflora are not
yet well defined and depend on the origin of the starter culture, conditions of growth
(such as temperature), processing of the milk, and type of milk used [53].
Furthermore, the microflora proportions change during the fermentation. While the kefir
grains contain 83–90% lactic acid bacteria (including 53–65% of Streptococci and 24–33% of
Lactobacilli) and 10–17% of yeast and acetic acid bacteria, the composition of the kefir was
reported to be 92–96% of lactic acid bacteria (including 74–78% Streptococci and 15–20%
Lactobacilli) and 4–8% of yeast [54].
The composition of microorganisms influences the sensory properties of kefir, which
are dominated by lactic acid, volatile acids, diacetyl, carbon dioxide, and ethanol [55,56]. The
metabolic pathways of lactic acid bacteria are divided into homolactic fermentation
and heterolactic fermentation. Homolactic fermenters, such as Streptococcus thermophilus,
Lactobacillus lactis, and Lactobacillus bulgaricus, mainly produce lactic acid from pyruvic acid,
while heterolactic fermenters, such as Leuconostoc spp., produce ethanol in addition to lactic
acid [57].
Furthermore, bacterial strains can produce alternative end products from pyruvic acid,
such as formic acid and acetic acid or butane-2,3-diol. The latter is produced through the
conversion of pyruvic acid to acetolactic acid and further to acetoin. While acetoin and
butane-2,3-diol are more or less tasteless, their derivative diacetyl, which is produced by
non-enzymatic chemical conversion from acetolactic acid, is an important flavor compound
in dairy products [58]. The main catalytic by-product for energy production from
carbohydrates in yeast is ethanol [53,59]. For Candida spp., Saccharomyces spp., Kluyveromyces
spp., and Debaryomyces spp., flavor compounds such as ethanol, acetone, amyl alcohol, and
propanal have been reported [53]. Further sources for flavor and aroma compounds are
lipid hydrolysis and protein hydrolysis. Lipolysis in milk results in the formation of free
fatty acids, which are precursors to flavor compounds such as methyl ketones, alcohols,
lactones, and esters. Methyl ketones, such as 2-nonanone and 2-heptanone, which are attributed