Information Systems 114 (2023) 102184
Information Systems
journal homepage: www.elsevier.com/locate/is
An approach for analyzing business process execution complexity
based on textual data and event log
Aleksandra Revina a,b, Ünal Aksu c,∗
a Chair of Information and Communication Management, Faculty of Economics and Management, Technical University of
Berlin, 10623 Berlin, Germany
b Faculty of Economics, Brandenburg University of Applied Sciences, 14770, Brandenburg an der Havel, Germany
c Department of Information and Computing Sciences, Utrecht University, 3584 CC Utrecht, The Netherlands
article info
Article history:
Received 1 October 2021
Received in revised form 5 July 2022
Accepted 30 January 2023
Available online 3 February 2023
Recommended by Dennis Shasha
Keywords:
Business process execution complexity
Event log
IT service management
Linguistic features
Machine learning
Process mining
Textual data
abstract
With the advent of digital transformation, organizations increasingly rely on various information
systems to support their business processes (BPs). Recorded data, including textual data and event
log, expand exponentially, complicating decision-making and posing new challenges for BP complexity
analysis in Business Process Management (BPM). Herein, Process Mining (PM) serves to derive insights
based on historic BP execution data, called event log. However, in PM, textual data is often neglected
or limited to BP descriptions. Therefore, in this study, we propose a novel approach for analyzing BP
execution complexity by combining textual data serving as an input at the BP start and event log.
The approach is aimed at studying the connection between complexities obtained from these two
data types. For textual data-based complexity, the approach employs a set of linguistic features. In our
previous work, we have explored the design of linguistic features favorable for BP execution complexity
prediction. Accordingly, we adapt and incorporate them into the proposed approach. Using these
features, various machine learning techniques are applied to predict textual data-based complexity.
Moreover, in this prediction, we show the adequacy of our linguistic features, which outperformed the
linguistic features of a widely-used text analysis technique. To calculate event log-based complexity,
the event log and relevant complexity metrics are used. Afterward, a correlation analysis of two
complexities and an analysis of the significant differences in correlations are performed. The results
serve to derive recommendations and insights for BP improvement. We apply the approach in the
IT ticket handling process of the IT department of an academic institution. Our findings show that
the suggested approach enables a comprehensive identification of BP redesign and improvement
opportunities.
©2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/by/4.0/).
1. Introduction
Today’s organizations use various information systems like
Enterprise Resource Planning (ERP) systems or Information Tech-
nology (IT) ticketing systems to support their business processes
(BPs) and operations [1]. As such, they rely heavily on IT. Since
digital transformation engages organizations in rapidly changing
environments [2], it continually requires them to have a
thorough understanding of their BPs and operations to remain
resilient [3].
∗ Corresponding author: Ü. Aksu.
Accordingly, Business Process Management (BPM) has become
a well-known way to enable efficient business operations and
improvements in quality and productivity in organizations.
To accomplish these goals, BPM research and practice have
established various approaches. Process Mining (PM) is one of the
commonly used techniques to derive insights for process analy-
sis based on BP execution data extracted as event logs. In this
context, a large number of studies in BPM and PM are devoted
to BP execution complexity analysis and complexity metrics [4].
However, in these studies focusing on BP executions, the analysis
of textual data remains limited [5], despite the fact that textual
data make up more than 80% of data in companies [6]. The
relevant studies in the literature related to BP executions mainly
consider BP descriptions, documentation, and texts in BP models,
such as labels [5]. In addition, many unsolved challenges in ap-
plying Natural Language Processing (NLP) in BPM are highlighted,
such as semantic enhancements and domain- or organization-specific
adaptations of NLP solutions [5]. Thus, a closer integration of these
two areas holds untapped potential to substantially improve the BPM
toolset.
https://doi.org/10.1016/j.is.2023.102184
Fig. 1. Our approach for analyzing business process execution complexity based on textual data and event log.
In fact, textual data serving as an input to a BP at its very start,
i.e., triggering a BP, highly influence its execution. For example, in
the IT Service Management (ITSM) area, activities performed in a
Change Management (CHM) process strongly depend on the tex-
tual descriptions of Requests for Changes (RfCs) communicated by
a customer. Specifically, the urgency of a given RfC and whether
it needs to be analyzed and approved for implementation are
determined primarily based on its textual description. Similarly,
the involvement of roles, such as a Change Advisory Board (CAB),
also depends on the RfC texts. For example, urgent RfCs may
not require CAB involvement. Hence, the same input, i.e., textual
data, determines important decision points of any RfC processing,
such as: (i) which activities in the CHM process will be skipped,
(ii) from which activity RfC processing will start, and (iii) which
roles will be involved. In many other areas, one can also observe
similar influences of textual data on the execution of BPs. For
example, in healthcare, the complaints expressed by a patient
typically determine the required diagnostics and related BP ac-
tivities. Generally speaking, in BPs, textual data can influence the
decision points, activities, and their order. Thus, BP complexity is
affected by textual data. In the related literature, it is shown that
there is a connection between BP complexity and BP performance
and management [7,8]. For this reason, process redesign and
improvement initiatives are often motivated by BP complexity
analysis [9,10].
The potential of textual data serving as an input to BP execu-
tion in the context of complexity has been extensively studied in
our previous work [11–15]. Within that work, we have explored
and developed a set of linguistic features, including semantic,
syntactic, and stylistic ones, i.e., taxonomy-based [11], sentiment-
based [15], and stylistic features [12], which potentially influence
BP execution complexity [13]. In an industrial case study of a CHM
IT ticket handling process [14], we have investigated the linguistic
features favorable for BP execution complexity prediction.
Overall, in this study, we propose a novel BP execution com-
plexity analysis approach in which we combine textual data and
event log. For the development of the approach, we set the
following specific objectives:
•Enriching an understanding of event log-based (EL) com-
plexity common in BPM with textual data-based (TD) com-
plexity,
•Identifying a set of metrics for TD and EL complexities taking
existing works as a basis,
•Studying the relation between TD and EL complexities and
investigating how textual data can contribute to EL com-
plexity prediction,
•Exploring, adapting, and illustrating the benefits of our ap-
proach by applying it in a real-world setting.
To achieve these objectives and ensure the comprehensiveness
of our approach, we build our study on the following steps.
In Section 2, we analyze the related work on the application
of NLP in BPM and BP execution complexity highlighting the
unsolved issues regarding the use of textual data. Section 3 summarizes
the aspects from our previous work that we use for TD
complexity calculation and explains the state-of-the-art event
log complexity metrics serving as the basis for EL complexity
calculation. Afterward, in Section 4, using a running example, we
adapt and incorporate our previous work on TD complexity and
well-established studies on EL complexity and introduce the BP
execution complexity analysis approach. As can be seen in Fig. 1,
in the first block, TD and EL complexities are calculated. For TD
complexity calculation, linguistic features extracted from textual
data are used. In this regard, we take the designed linguistic fea-
tures (taxonomy-based and stylistic features) from our previous
work [11,12] and identify relevant features for TD complexity.
Using these linguistic features, the TD complexity of BPs is pre-
dicted with respect to an agreed-upon complexity scale. Further,
we assess the adequacy of our linguistic features in predicting TD
complexity. This is done by comparing their prediction perfor-
mance with the prediction performance of the linguistic features
from a well-accepted text analysis technique. To calculate EL
complexity, the state-of-the-art event log complexity metrics are
analyzed, and suitable ones are applied to the given event log. In
the second block, two analyses are performed, namely correlation
analysis and significant difference analysis. More specifically, how
calculated complexities correlate is determined in the correlation
analysis. Following that, significant differences within the correla-
tions are analyzed. Using the obtained results, recommendations
and insights for process improvement are derived. To illustrate
the value of our approach, a Service Request Management process
case study from an IT Service Management (ITSM) of an academic
institution is conducted and explained in Section 5. We discuss
the implications of our findings in Section 6. Finally, in Section 7,
we present our conclusions and directions for future work.
Thus, our work contributes to BPM by proposing a new ap-
proach to analyze BP execution complexity, considering textual
data serving as an input to BP execution and event log. Although
one of the dominant research directions in BPM regarding BP
analysis is BP complexity [4], to the best of our knowledge, no
other works combine textual data and event log to analyze BP ex-
ecution complexity. Using qualitative (interviews) and quantita-
tive (computational analysis) research methods, we demonstrate
the value of our approach by means of a case study. As a practical
contribution, our study findings show a comprehensive way of
identifying process redesign and improvement opportunities.
2. Related work
This section lists the studies associated with the approach we
propose in this paper. In particular, we highlight the increas-
ing relevance of textual data in organizations and review the
state-of-the-art NLP applications in various BPM lifecycle phases.
Afterward, we present prominent complexity research in the
BPM-related literature.
Organizations increasingly seek insights for understanding
and improving their BPs. In this regard, BPM investigates the
potentials of NLP to benefit from its maturity and availability in
multiple BP applications providing support to different BPM life-
cycle phases [5]. In the following, relevant research is reviewed
according to BPM lifecycle phases.
In the BP discovery phase, considerable research effort has
been made to develop BP model discovery approaches from data.
Whereas BP discovery from event log is a well-established and
matured subject area that already has tangible practical applica-
tions, BP model discovery from textual data is still a promising
research topic lacking the ability to scale [16]. Below, we review
the most prominent and recent developments:
BP model and event log generation from textual data: Creation of
BP models takes up to 60% of the time spent in BPM projects [5].
Further, due to the current dynamics of work environments,
BP modeling has become a time-consuming and costly activ-
ity requiring constant updates of BP models, which might lead
to BPM project failures [17,18]. Thus, an automatic generation
of BP models from available textual data becomes an attrac-
tive application paving the way for multiple research projects.
For example, recent research by [19] proposes an automatic BP
model discovery from textual BP descriptions based on neural
networks. Further, [20] extend existing NLP techniques to extract
activities and their relations defining BP constraints from textual
descriptions. [21] present a method to generate an event log
from textual data using action and topic analysis. Thereafter, BP
models are mined based on common techniques. [22] use natural
language inference to construct event log from customer service
conversations. [23] deal with the problem of multi-grained text
classification by introducing a hierarchical neural network to ex-
tract multi-grained information from BP descriptions. In [24], the
early developments of a tool to extract BP models from text and
then maintain their alignment using Dynamic Condition Response
(DCR) Graphs are presented.
Enrichment of event log with textual data: Process Mining (PM)
represents the most typical approach to automatically create
BP models from event log [25]. Hereby, textual data massively
generated by BP participants in the BP execution, such as com-
ments or email communication, are not considered. To address
this shortcoming, several research projects started to appear.
For example, [26] extract key phrases denoting activities from
comments related to IT ticket processing enriching the event log
with this information. Subsequently, a more comprehensive BP
model can be derived. Further, [27] enhance the event log analysis
with the analysis of textual attributes contained in it using a novel
attribute classification technique.
Automatic discovery of decisions in BP models: Decisions make
up an important effort-intensive part of BP modeling. Accord-
ingly, [28] propose a deep learning approach to obtain decision
constraints and conditional clauses from text. [29] provide an
NLP pipeline to automatically extract the decisions and their
dependencies to build the decision requirements diagram that forms
part of a decision model. The study by [30] describes a method for
generating entire decision models from textual inputs. The sug-
gested technique based on NLP and customized syntactic patterns
enables the extraction of both decision requirements and decision
logic from a document.
Text annotation: The efforts related to BP model creation can
be significantly reduced if the text is well annotated, thereby
decreasing the noise caused by automatic NLP techniques.
Such annotated BP descriptions can be used for both inferring
new relations to create more comprehensive BP descriptions and
as training data for various NLP analyzers [5]. Hence, in [31], a
novel approach using NLP and a query language for tree-based
patterns is introduced. It derives annotations representing essen-
tial BP elements, i.e., activities, events, actors, roles, and con-
trol flow. [32] describe a method based on Semantic Parsing
and Graph Convolutional Networks. This method avoids the use
of manual rules and outputs much better results than existing
neural network-based solutions to derive annotations from BP
descriptions.
Automatic BP modeling recommendations and semantic auto-
completion: Considerable research has been devoted to automatic
activity recommendations to support BP modeling task [33,34].
Hence, building on a technique similar to [35], [36] exploit label
semantics for rule-based activity recommendation. Additionally,
[37] propose to use semantic similarities between BPs to
enable design-time autocompletion by relying on pre-trained NLP
models. The method converts BP sequences into text paragraphs
and encodes them as sentence embeddings, i.e., learned text rep-
resentations that include semantics as real-number vectors [37].
The next phase of the BPM lifecycle, i.e., BP analysis, aims
to identify flaws and bottlenecks in the discovered BP mod-
els. Hereby, NLP techniques can also be of support in specific
applications. We present some up-to-date developments below:
BP model semantic correctness and completeness verification:
The semantic quality of BP models is critical for understanding
BPs correctly. A number of NLP research projects are naturally
aimed at automating the verification of this characteristic. Ac-
cordingly, many BP model analysis strategies rely on a thorough
examination of the natural language information included in the
activity labels of the models. Standard NLP is not adequate for an-
alyzing these labels since they are often short and heterogeneous
in terms of grammatical style. Dealing with this challenge, [38]
propose a Hidden Markov Models-based approach for a linguistic
analysis of BP model activity labels. Additionally, research by [39]
addresses the problem of ambiguity of BP textual descriptions
and suggests a compliance checking technique using behavioral
spaces.
BP model and text consistency check: As mentioned above,
maintaining various BP-related data allows for improving the
knowledge of BPs in organizations. However, as BPs change over
time, it is important to constantly identify inconsistencies among
various BP descriptions so that expectations for BP outputs re-
main the same for every stakeholder [40]. The latter research [40]
proposes an approach to detect conflicts between textual and
model-based descriptions using NLP. Further, [41] design a tech-
nique to align BP models and textual descriptions, mapping the
knowledge derived from these two representations into a unified,
comparable format.
BP-related data querying: Having a variety of BP-related data
allows organizations to better analyze their BPs. In such an anal-
ysis, a common task is querying these data to get insights into
specific BP parts. In the case of event log data, to be able to
use common BP querying techniques, end-users must be familiar
with the query language and database schema. Addressing this
challenge, [42] introduce a natural language interface. Hereby,
questions can be asked in natural language, and the interface
will automatically translate them into a structured query
to be run in a database. [43] also address this problem by sug-
gesting a technique to search both textual and model-based BP
descriptions.
Sentiment analysis: Sentiment analysis has already shown its
high value for e-business and e-commerce providing insights
based on the textual data collected in social media and social
networks [44]. [45] explore its potential in BPM and develop a
BP modeling tool considering stakeholders’ comments and feed-
back. Applying sentiment analysis, the tool identifies positive and
negative feedback to support BP analysts in designing the to-be
process.
In the following BP redesign phase, the BP model needs to
be modified to address the concerns discovered in the previous
phase. Contemporary NLP techniques can also be used to achieve
this goal:
BP redesign using textual data: As a rule, experts dealing with
BP redesign focus on proposing to-be BP models and redesign
patterns with little or no consideration of end-user feedback.
To address this shortcoming, recent research suggests an NLP
approach based on a novel set of annotation guidelines to identify
redesign suggestions directly from end-user feedback [46].
Comparing BP models: To produce a sound to-be BP model in
the BP redesign phase, a BP analyst might need to examine large
sets of BP models that are often organized hierarchically based on
the level of abstraction. Hereby, one of the most difficult tasks is
ensuring that BP models at the same level in the hierarchy have
the same level of abstraction [5]. To solve this challenge, various
BP model matching algorithms can be used. For example, [47]
present a semantic multi-phase matching algorithm based on a
vector space model and NLP to match the models. [48] provide
a technique for discovering sets of related activities based on
constrained k-means clustering considering both BP semantics
and control flow order.
BP model refactoring: The quality of BP models may signif-
icantly vary since BP modeling is time-consuming and error-
prone. Moreover, the competence of various modelers differs.
Hereby, refactoring can be used to improve the quality of BP mod-
els. Refactoring is a popular approach in software engineering to
restructure the code without changing its external behavior. As
BP modeling and coding are similar to a certain extent, existing
refactoring technologies from software engineering have been
adapted for BP workflows [49]. In the context of NLP, an approach
called linguistic refactoring has appeared. Accordingly, [50]
elaborates on NLP techniques for syntactic, semantic, and pragmatic
refactoring in a dissertation.
In addition to typical monitoring techniques for assessing per-
formance and conformity requirements, in the BP controlling
phase, NLP can be used to make available different forms of BP
descriptions [5]:
Transformation of BP model to text: In the controlling phase,
it is important that all stakeholders are able to understand BP-
related data. However, event log, workflows, and BP models are
not straightforward and require certain expertise for comprehen-
sion. On the contrary, a written BP textual description can be un-
derstood by any stakeholder. Hence, it is highly recommended to
support event log and BP models with the latter [5]. To solve this
problem, BP model-based natural language generation has been
researched [51]. Further, [52] introduce a semi-automated ap-
proach to transfer knowledge from BP models to natural language
requirements documents. [53,54] develop a tool to generate BP
textual descriptions from declarative BP models. In this context,
another group of researchers [55] deals with the comparison
of manually and automatically generated textual descriptions of
BP models focusing on the choice of an appropriate matching
technique. Additionally, [56] suggest a technique to fix poorly
written BP textual descriptions based on BP models.
Multi-lingual support: In international companies as well as
in the context of cross-country and cross-organizational learn-
ing, it is essential to translate BP models and BP descriptions
into multiple languages to make BP information accessible to
various stakeholders. Thus, [57] develop a framework for the
automatic generation of multi-language description text using an
emergency disposal process example. In line with the latter, [58]
enhance the framework with multiple (cross-department) views
and operationalize it in a cross-department medical diagnosis
process.
As can be concluded from above, several research streams,
such as sentiment analysis, BP-related data querying, BP modeling
recommendation and autocompletion, might be applied in multi-
ple BPM lifecycle phases, for example, BP discovery, analysis, and
redesign. However, the most prominent research stream is aimed
at supporting the discovery phase, i.e., the automatic creation of
BP models from BP textual descriptions. Accordingly, the most
frequently used textual data are related to the BP descriptions,
feedback from BP participants, and textual data inherent in BP
models, i.e., labels. Hereby, only a limited number of works deal
with the comments and emails related to the activities in the
event log [26].
In addition, a large number of studies in BPM analyze BP exe-
cutions from a complexity perspective. For example, [59] describe
metrics for measuring BP model complexity based on observa-
tions from software complexity. Similarly, in [60], metrics for
analyzing BP model complexity are proposed by extending met-
rics on software complexity. BP model complexity metrics and
their theoretical thresholds are studied in [61] to assess BP model
complexity and categorize BP models based on their complexity.
An overview of the BP model complexity reduction mechanisms
is provided in the form of patterns in [62]. Aside from that,
in BPM, there is a great interest in analyzing BPs from a com-
plexity perspective using PM. Hence, [63] study the design and
applicability of metrics for measuring event log complexity. [64]
provide a comprehensive evaluation of state-of-the-art BP discov-
ery techniques considering the complexity of their automatically
generated BP models. An approach aimed at the reduction of
the complexity of the discovered declarative BP models is pro-
posed in [65]. Moreover, [66] provide an overview of the BP
model complexity metrics by conducting a systematic literature
review. Lastly, in a recent study [4], the state-of-the-art event log-
based complexity metrics are analyzed to determine the relation
between the event log and the resulting BP model.
To sum up, despite the ubiquity of textual data in organiza-
tions, in the relevant BPM literature, the analysis of textual data
related to BP executions is prevailingly limited to BP descriptions
and texts in BP models [5]. Moreover, the complexity of textual
data has a considerable influence on BP execution, which has
been recently studied and proven in our work [13]. To address the
shortcoming, in this paper, we propose an approach combining
textual data used as an input to BP execution and event log for
BP execution complexity analysis.
3. Background
In this section, we provide a background on the linguistic
features adapted to calculate TD complexity and event log com-
plexity metrics employed to calculate EL complexity.
3.1. Linguistic features
In our previous work, we studied what characteristics, i.e., fea-
tures, of a given text have a great potential to influence BP
execution complexity [11–13,15]. In particular, we investigated
several features and identified that taxonomy-based, so-called
Decision-Making Logic (DML) taxonomy, and stylistic features are
prevailingly important for the complexity of textual data [14]. In
this regard, we summarize how these features were developed.
Table 1
Linguistic features.
Group | Linguistic feature
DML taxonomy-based | Relative occurrence of words based on the DML taxonomy and the DML cognition level routine
DML taxonomy-based | Relative occurrence of words based on the DML taxonomy and the DML cognition level semi-cognitive
DML taxonomy-based | Relative occurrence of words based on the DML taxonomy and the DML cognition level cognitive
Stylistic | Relative occurrence of nouns in all words
Stylistic | Relative occurrence of unique nouns in all nouns
Stylistic | Relative occurrence of verbs in all words
Stylistic | Relative occurrence of unique verbs in all verbs
Stylistic | Relative occurrence of adjectives in all words
Stylistic | Relative occurrence of unique adjectives in all adjectives
Stylistic | Relative occurrence of adverbs in all words
Stylistic | Relative occurrence of unique adverbs in all adverbs
Stylistic | Word count
Stylistic | Wording style
To design linguistic features that capture TD complexity via
cognition and style of textual data, we focus on the distribution of
parts-of-speech (PoS) in textual data. In particular, nouns, verbs,
adjectives, and adverbs are analyzed since they reveal the most
information about decision-making and style in textual data. As a
rule, process workers interpret textual data inputs to determine
how a BP should be carried out. They map the phrases to the
BP elements. For example, a process worker may decide on a BP
execution based on the information the customer mentioned in a
textual message about previously performed BP activities (nouns)
using specific verbs, adjectives, or adverbs indicating the timeline
and status of such activities. Hence, extracting this information
could assist process workers in handling textual data related to
BP execution. Aligned with this, there are naming conventions in
BPM [62,67] on the effects of labeling in BPs. More specifically,
these conventions provide guidance on those PoS that can be
used in labeling and how. As such, we suggest considering nouns
as Resources expressing the specifics of BP elements, verbs as
Techniques of knowledge and information transformation activity
impacting Resources, adjectives as Capacities revealing contex-
tual specifics of Techniques, and adverbs as Choices defining the
selection of the necessary set of Techniques, elements of RTCC
framework developed in our previous research [11].
DML taxonomy: DML taxonomy is a 2-tuple: (1) most important
words (nouns, verbs, adjectives, and adverbs) extracted from a
given text and (2) decision-making logic levels, i.e., cognition
levels, each of which denotes how easily something can be
understood in order to make a decision. In the DML taxonomy,
each word is associated with a DML cognition level. For detailed
information regarding the DML taxonomy development process,
we refer to our previous work [11]. In this paper, we present a
summary of the most important steps:
Step-1 The first step is collecting BP-relevant textual data from
different sources. Whereas in our approach, the focus lies
on the textual data provided as input to BP execution, for
DML taxonomy, also other textual data, such as official BP
descriptions, interview transcriptions, or legal documents,
should be considered.
Step-2 The collected data are converted into a machine-readable
format, in which the computational analysis will be per-
formed, for example, a CSV file format.
Step-3 Afterward, the data are pre-processed and parsed, build-
ing the document-term matrices for the most important
parts of speech (nouns, verbs, adjectives, and adverbs).
Step-4 The created document-term matrices are processed us-
ing the topic modeling methods, such as a combination
of Latent Dirichlet Allocation (LDA) and Latent Semantic
Indexing (LSI) [68].
Step-5 In the last step, the extracted topics with descriptive
keywords are classified into the decision-making logic
levels, i.e., cognition levels, of routine, semi-cognitive, and
cognitive. Here, the involvement of experts familiar
with the context is essential for correct keyword
classification.
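Steps 3 and 5 above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation from [11]: the documents are assumed to be already PoS-tagged, the tag labels (`NOUN`, `VERB`, ...) and all example words are hypothetical, and the topic modeling of Step-4 is left out. The names `document_term_matrix` and `dml_taxonomy` are introduced here for illustration only.

```python
from collections import Counter

# Hypothetical pre-tagged documents: each is a list of (word, PoS) pairs.
# In practice a PoS tagger would produce these; the tag labels are assumed.
docs = [
    [("reset", "VERB"), ("password", "NOUN"), ("urgently", "ADV")],
    [("install", "VERB"), ("new", "ADJ"), ("server", "NOUN"), ("password", "NOUN")],
]

def document_term_matrix(tagged_docs, pos):
    """Return (vocabulary, rows), counting only words that carry the given PoS tag."""
    counts = [Counter(w.lower() for w, tag in doc if tag == pos) for doc in tagged_docs]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows

# Step-3: one matrix per part of speech; nouns shown here.
noun_vocab, noun_matrix = document_term_matrix(docs, "NOUN")

# Step-5 outcome, sketched as a plain mapping from keyword to cognition level.
# These entries are invented; real ones result from expert classification.
dml_taxonomy = {
    "password": "routine",
    "install": "semi-cognitive",
    "server": "cognitive",
}
```

Keeping the taxonomy as a simple word-to-level mapping mirrors the 2-tuple description above and makes the later feature computation a dictionary lookup.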
Stylistic patterns: In our previous work [12], we showcased that
the style, i.e., stylistic patterns, of a given IT ticket text can reveal
information on identifying its BP complexity. More specifically,
ticket length, PoS distributions, and wording style are suitable for
indicating and understanding how the complexity of handling an
IT ticket is affected. To capture such components of a text, we
proposed Syntactic Structure (SynS) and Wording Style (WS) as
new features. The SynS feature focuses on syntax. The way the
words are put together to form phrases influences text compre-
hension and corresponding BP execution, which uses that text as
a primary input. The WS feature takes Zipf’s Laws [69] as a basis
and focuses on the appearance of new words in a text and the
speed of appearance.
In accordance with the considerations explained above, for
DML taxonomy-based features, PoS distributions are computed
based on a given DML taxonomy. For stylistic features, in contrast,
all the words are considered as the search space. In Table 1,
we list the identified features.
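To make the features in Table 1 more concrete, the following sketch computes the relative-occurrence features for a single pre-tagged text. It rests on assumptions that go beyond the text above: the PoS tag labels, the example ticket, and the taxonomy entries are hypothetical, and the DML features are assumed to be normalized by the number of taxonomy words found in the text. The Wording Style feature is omitted because its formula is not given in this section.

```python
from collections import Counter

COGNITION_LEVELS = ("routine", "semi-cognitive", "cognitive")
POS_GROUPS = ("NOUN", "VERB", "ADJ", "ADV")  # assumed tag labels

def dml_features(tagged_tokens, taxonomy):
    """Share of each cognition level among the taxonomy words found (assumed denominator)."""
    levels = Counter(
        taxonomy[w.lower()] for w, _ in tagged_tokens if w.lower() in taxonomy
    )
    total = sum(levels.values())
    return {lvl: (levels[lvl] / total if total else 0.0) for lvl in COGNITION_LEVELS}

def stylistic_features(tagged_tokens):
    """Word count plus, per PoS, its share in all words and its unique share."""
    words = [w.lower() for w, _ in tagged_tokens]
    features = {"word_count": len(words)}
    for pos in POS_GROUPS:
        pos_words = [w.lower() for w, tag in tagged_tokens if tag == pos]
        features[f"{pos}_in_all_words"] = len(pos_words) / len(words) if words else 0.0
        features[f"unique_{pos}_in_all_{pos}"] = (
            len(set(pos_words)) / len(pos_words) if pos_words else 0.0
        )
    return features

# Hypothetical ticket text and taxonomy, for illustration only.
ticket = [
    ("reset", "VERB"), ("password", "NOUN"), ("and", "OTHER"), ("check", "VERB"),
    ("password", "NOUN"), ("policy", "NOUN"), ("quickly", "ADV"), ("new", "ADJ"),
]
taxonomy = {
    "reset": "routine", "password": "routine",
    "check": "semi-cognitive", "policy": "cognitive",
}

style = stylistic_features(ticket)
dml = dml_features(ticket, taxonomy)
```

In this example, three of the eight words are nouns and two of those three nouns are distinct, so the two noun features are 3/8 and 2/3.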
3.2. Event log complexity metrics
Organizations use various information systems (for example,
ERP systems or IT ticketing systems) to enact their BPs with the
support of such systems. These systems enable organizations to
record a large amount of data about BP executions. Such process
execution data are then extracted in the form of an event log
to analyze and provide insights into improving BPs [70]. An
event log consists of events, each of which refers to an activity
performed in executing a BP. In Table 2, an exemplary event log
of a Service Request Management process is depicted.
Table 2
An example of an event log of a Service Request Management process.
Request | Activity | Time stamp | Resource | Priority | Attr.
t001 | Register | 01-08-2021 10:11:12 | Worker1 | Low | ...
t001 | Analyze | 01-08-2021 14:10:00 | Worker2 | Low | ...
t002 | Register | 01-08-2021 16:01:03 | Worker3 | High | ...
t003 | Register | 01-08-2021 16:05:42 | Worker1 | Low | ...
t002 | Analyze | 01-08-2021 16:10:42 | Worker4 | High | ...
t002 | Resolve | 01-08-2021 16:25:51 | Worker5 | High | ...
t003 | Analyze | 01-08-2021 17:15:02 | Worker6 | Low | ...
t001 | Escalate | 01-08-2021 17:35:40 | Worker7 | Low | ...
t003 | Resolve | 02-08-2021 09:01:02 | Worker8 | Low | ...
t004 | Register | 02-08-2021 09:10:20 | Worker3 | Medium | ...
t004 | Analyze | 02-08-2021 09:25:33 | Worker2 | Medium | ...
t003 | Re-open | 02-08-2021 10:01:01 | Worker8 | Low | ...
t004 | Reject | 02-08-2021 10:06:08 | Worker6 | Medium | ...
t001 | De-escalate | 02-08-2021 10:24:32 | Worker8 | Low | ...
t003 | Resolve | 02-08-2021 11:16:10 | Worker4 | Low | ...
t001 | Resolve | 02-08-2021 11:59:59 | Worker4 | Low | ...
As can be seen, each row shows (i) which activity is performed,
(ii) when, (iii) for which request, and (iv) other information (for
example, resource and priority). The events carried out in the
scope of a single process instance execution are called a case. In
the example, each request refers to a case that goes through the
same Service Request Management process. The sequence of the
events in the scope of a particular case is called a trace.
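The notions of case and trace can be illustrated on the rows of Table 2. The following is a minimal sketch, not the tooling used in this work: the function name build_traces and the assumed time stamp format (day-month-year) are illustrative choices, and only the request, activity, and time stamp columns are used.

```python
from collections import defaultdict
from datetime import datetime

# Events from Table 2 as (request, activity, time stamp); other attributes omitted.
EVENTS = [
    ("t001", "Register", "01-08-2021 10:11:12"),
    ("t001", "Analyze", "01-08-2021 14:10:00"),
    ("t002", "Register", "01-08-2021 16:01:03"),
    ("t003", "Register", "01-08-2021 16:05:42"),
    ("t002", "Analyze", "01-08-2021 16:10:42"),
    ("t002", "Resolve", "01-08-2021 16:25:51"),
    ("t003", "Analyze", "01-08-2021 17:15:02"),
    ("t001", "Escalate", "01-08-2021 17:35:40"),
    ("t003", "Resolve", "02-08-2021 09:01:02"),
    ("t004", "Register", "02-08-2021 09:10:20"),
    ("t004", "Analyze", "02-08-2021 09:25:33"),
    ("t003", "Re-open", "02-08-2021 10:01:01"),
    ("t004", "Reject", "02-08-2021 10:06:08"),
    ("t001", "De-escalate", "02-08-2021 10:24:32"),
    ("t003", "Resolve", "02-08-2021 11:16:10"),
    ("t001", "Resolve", "02-08-2021 11:59:59"),
]


def build_traces(events):
    """Group events by case (request) and order each case by time stamp."""
    cases = defaultdict(list)
    for request, activity, stamp in events:
        # Assumed format: day-month-year hour:minute:second.
        when = datetime.strptime(stamp, "%d-%m-%Y %H:%M:%S")
        cases[request].append((when, activity))
    # A trace is the time-ordered sequence of activities of one case.
    return {case: [a for _, a in sorted(evts)] for case, evts in cases.items()}


traces = build_traces(EVENTS)
```

Here, each key of traces is a case, and the associated list is its trace.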
In the literature, there are several studies focusing on quanti-
fying the complexity of such an event log. Within these studies,
metrics are proposed in order to assess the complexity of event
logs, so that further analysis can be determined considering the
characteristics of event logs. In recent research on EL complex-
ity measurement [4], EL complexity metrics are reviewed and
studied in detail. Then, a set of entropy-based complexity metrics
are proposed to address the issues in the studied EL complexity
metrics. We take this research as the basis and analyze both
discussed and proposed metrics in it. These metrics are listed in
Table 3.
Based on the specifications of the metrics given in the table,
one can note that either one or more aspects (size, variation,
and distance) of complexity are selected as the focus in each
metric. In that sense, some metrics have limitations. For example,
metrics measuring the size of event logs would not capture any
difference in terms of variation or distance. Despite the fact that
some metrics focus on the same aspects of complexity, there
is not much dependency among them as they differ from each
other in measuring an event log using its various components
(for example, traces, event classes, or event relations) [4]. To
have a comprehensive view of the aspects of complexity, in our
approach, we opt for employing all EL complexity metrics listed
in the table. Further, to mitigate the influence of one metric on
another, we use majority voting in our approach to obtain a single
EL complexity value for a given event log.
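To make the size and variation aspects more tangible, a few of the simpler metrics of Table 3 can be computed directly from traces. The sketch below is only an illustration under the definitions given in Table 3; the function name and the dictionary keys are hypothetical, and the traces are those of the Table 2 example.

```python
# Traces of the Table 2 example: case id -> ordered list of activities.
TRACES = {
    "t001": ["Register", "Analyze", "Escalate", "De-escalate", "Resolve"],
    "t002": ["Register", "Analyze", "Resolve"],
    "t003": ["Register", "Analyze", "Resolve", "Re-open", "Resolve"],
    "t004": ["Register", "Analyze", "Reject"],
}


def size_and_variation_metrics(traces):
    """Compute a handful of size- and variation-oriented metrics."""
    sequences = list(traces.values())
    n_events = sum(len(s) for s in sequences)
    distinct = {tuple(s) for s in sequences}
    return {
        "magnitude": n_events,                              # number of events
        "variety": len({a for s in sequences for a in s}),  # number of event classes
        "support": len(sequences),                          # number of traces
        "TL-avg": n_events / len(sequences),                # average trace length
        "DT(#)": len(distinct),                             # distinct traces
        "DT(%)": 100.0 * len(distinct) / len(sequences),    # share of distinct traces
    }


metrics = size_and_variation_metrics(TRACES)
```

Note that the two size metrics (magnitude and support) would stay the same if all four traces were identical, whereas DT(#) and DT(%) would change, which illustrates why metrics of a single aspect are not sufficient.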
In general, processes indicate all the work performed in an
organization [1]. Accordingly, the complexity of a process and
corresponding ways to measure it can imply a wide range of
elements and factors, like those emerging from the process con-
text [73,74], which are often difficult to obtain. Moreover, how a
process is reflected in a model affects its perceived complexity.
In other words, quality aspects of process models and process
modeling notations have a notable impact on perceived process
complexity. Hence, we focus on event log-based complexity met-
rics in this paper and list extending our approach with process
complexity metrics as part of our future work.
4. Approach development
As introduced in Section 1, we propose an approach, hereafter
Approach, aimed at analyzing BP execution complexity based on
textual data and event log. More specifically, we investigate the
relation between TD complexity and EL complexity. To achieve
this, first, for a given BP, attributes of an entity that goes through
the BP are identified. An example is the communication channel
attribute of a service request, which is an entity handled in
a Service Request Management (SRM) process. Then, based on
these attributes and time dimension, textual data and event log
of the BP are split into subsets. Further, TD and EL complexities
are calculated for each subset. Using the computed complexities,
correlation analysis is performed to investigate whether textual
data may be used for EL complexity prediction. Thereafter, sta-
tistically significant differences in the created event log subsets
are analyzed to find out the factors affecting EL complexity. For
instance, a certain category of service requests may account for
the repeated or skipped information collection activities resulting
in a considerable increase or decrease of EL complexity. Thus, in
terms of such factors, recommendations for process redesign and
improvement can be formulated and provided to organizations.
In the separate subsections of this section, we present the
inputs required in our Approach and introduce a running exam-
ple. Afterward, we elaborate on how TD and EL complexities
are calculated. Finally, correlation analysis and identification of
statistically significant differences are explained.
4.1. Inputs
To carry out the tasks mentioned above, three types of input
are required in the Approach: textual data, event log, and
complexity scale. The first two inputs will be described in
Sections 4.3 and 4.4. The complexity scale necessary for
calculating TD and EL complexities is defined below.
Complexity scale: A complexity scale is a set of ordinal complex-
ity values. They can be numbers or categories that are put in a
certain order denoting either increasing or decreasing complexity.
A five-point Likert-type scale containing numbers from one to
five or a set of category names like low, medium, and high are
two examples of a complexity scale [75]. Although a considerable
number of metrics for measuring complexity exist (see Table 3),
textual complexity metrics are rather generic, i.e., mostly con-
sidering language usage in texts [13,76], and have less emphasis
on how textual data are perceived by process workers in terms
of work instructions. This perception is important because, using
a given text, process workers determine which activities will be
performed and in what order. Moreover, such generic metrics are
not applicable in a given area without considering the jargon,
characteristics, and regulations of the area. Therefore, when ap-
plying our Approach, expert involvement is essential to determine
a suitable complexity scale for a particular domain.
Table 3
Metrics for event log-based complexity calculation.
Metric | Definition
Number of events (magnitude) [10] | The total number of events an event log contains
Number of event types (variety) [10] | The total number of event classes in an event log
Number of sequences (support) [10] | The total number of traces in an event log
Average sequence length (TL-avg) [25] | The average length of a trace in an event log
Average time difference between consecutive events (time granularity) [10] | The mean duration between two events where the first one is followed by the second one without an interruption
Number of acyclic paths in transition matrix (LOD) [8] | The total number of simple paths (paths without cycles) in the graph network that represents the event connections in an event log
Number of ties in transition matrix (t-comp) [71] | The total number of possible paths in the graph network that represents the event connections in an event log
Lempel–Ziv complexity (LZ) [72] | The minimum number of steps that are required to generate a given trace by either reusing its previous parts or inserting a new symbol
Number and percentage of unique sequences DT(#), DT(%) [25] | The number and percentage of distinct traces in a given event log
Average distinct events per sequence (structure) [10] | The amount of present directly-related event pairs compared to all possible ones in a given event log
Average affinity (affinity) [10] | The homogeneity of a given event log based on the average overlap of traces in terms of direct following relations (i.e., one event right after another)
Deviation from random (dev-random) [72] | The Euclidean distance of the transition matrix that is created using the pairwise associations of events of a given event log
Average edit distance (avg-dist) [72] | The average edit distance of traces to transform one to another using string matching with the lowest cost
Entropy-based metrics (variance and sequence entropy) [4] | The entropy-based metrics that use a prefix automaton to describe sequences within a given event log to map it to a graph
Table 4
Running example IT tickets.
ID | Channel | Category | Textual data
SR001 | IT ticketing system | Application | I would like to get access to XYZ. Could you please send me the available document how to install it?
SR002 | IT ticketing system | Security | As of this week, I am working in a different building. When I try to login, it says unable to find the trust certificate CRT-ABC in the recovery database for this workstation. Would you please activate the broken security configuration?
Creating textual data and event log subsets: To conduct a well-
established analysis of the relation between TD and EL
complexities, in the Approach, the textual data and event log are
split into subsets. Hereby, we take the following into account: a
time dimension and a set of attributes of an entity going through a
BP, for example, a service request. A time dimension is important
to analyze changes in BP execution complexity. Accordingly,
relevant time periods can be defined for observing the BP execution
complexity over time. Having attributes allows us to perform a
drill-down analysis and move from a general to a more detailed
view. Further, to enrich the analysis, we combine entity attributes
pairwise.
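The subset creation described in this subsection can be sketched as follows. This is only an illustration: the third ticket, all creation dates, the period boundary, and the names period_of and create_subsets are invented for the example. Because the subsets are keyed by ticket identifiers, the same keys can be used to select both the textual data and the events of a subset.

```python
from collections import defaultdict
from datetime import date
from itertools import combinations

# Hypothetical tickets; attribute names follow the running example,
# creation dates and the third ticket are invented for illustration.
TICKETS = [
    {"id": "SR001", "created": date(2021, 8, 1),
     "channel": "IT ticketing system", "category": "Application"},
    {"id": "SR002", "created": date(2021, 8, 1),
     "channel": "IT ticketing system", "category": "Security"},
    {"id": "SR003", "created": date(2021, 9, 5),
     "channel": "email", "category": "Security"},
]


def period_of(day, boundary=date(2021, 9, 1)):
    """Assign a ticket to one of two illustrative time periods."""
    return "period-1" if day < boundary else "period-2"


def create_subsets(tickets, attributes):
    """Split ticket ids by time period and by single or pairwise attribute values."""
    subsets = defaultdict(list)
    # Single attributes first, then every pairwise combination of attributes.
    keys = [(a,) for a in attributes] + list(combinations(attributes, 2))
    for ticket in tickets:
        period = period_of(ticket["created"])
        for key in keys:
            values = tuple(ticket[a] for a in key)
            subsets[(period, key, values)].append(ticket["id"])
    return dict(subsets)


subsets = create_subsets(TICKETS, ["channel", "category"])
```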
4.2. Running example
To illustrate our Approach development, we introduce a running
example of two IT tickets (service requests) from an SRM
process case study used in this research. As can be seen in Table 4,
the two tickets, SR001 and SR002, are entered directly in the IT
ticketing system. When tickets are received, their textual contents
are analyzed, and they are assigned to a category (i.e., a grouping
of tickets based on the topics they concern) by service desk
employees. The category of a ticket determines how it will be
handled, i.e., the next activities and the involvement of resources.
As shown in the table, the ticket SR001 consists of fewer and
more common words compared to SR002. It is important to note
that the event log for the tickets in the running example is also
provided.
In the subsections below, we show how the Approach is de-
veloped using the illustrative running example, in particular, its
textual data and event log.
Fig. 2. Textual data-based complexity calculation.
4.3. Calculating textual data-based complexity
Textual data, an unstructured data type, are one of the important
inputs influencing BP execution. However, due to the
dynamic nature and high interdependence of BPs, textual data are
often either unlabeled or contain few labeled points. Moreover,
labeling is a time-consuming and costly process. Hence, we use
machine learning to address this problem. In particular, we start
with the textual data that have very few labeled data points,
extract linguistic features, and develop prediction models. After-
ward, we select the outperforming prediction model. We run that
model on the unlabeled data. This flow is depicted in Fig. 2.
To perform a computational analysis of textual data and build
prediction models for TD complexity, we extract two sets of
linguistic features from textual data: DML taxonomy-based and
stylistic features. Specifically, we focus on the two fundamental
aspects of textual data that indicate complexity, namely cognition
level [77] and style [78,79]. The decisions made based on textual
data depend on the complexity perceived after reading the text,
i.e., the cognition of textual data affects decision-making. Further,
words and their order in sentences are referred to as style that
can contain certain stylistic patterns.
As stated in Section 3, in our previous work, we have exten-
sively analyzed those linguistic features potentially influencing
textual complexity [13]. In [14], we have performed a feature
selection using a complete set of features for the IT ticket clas-
sification task. Whereas the importance of these features is likely
dependent on the domain specificity, our analysis demonstrated
that the DML taxonomy-based and stylistic features were favorable
for complexity prediction in the IT ticket processing case
study, which belongs to the ITSM domain. Accordingly,
in our Approach, we use the DML taxonomy-based and stylistic
features. Using the features in Table 1, prediction modeling tech-
niques are trained on the labeled textual data, i.e., the training set.
Since the labeled data comprise very few data points, the prob-
lem that the Approach deals with is a semi-supervised learning
problem. Hence, we use the following commonly applied semi-
supervised learning techniques [80] to enrich unlabeled data
using labeled data: Label Propagation, Label Spreading, and
Self-Training. The first two are very similar: both put all data
points in a graph and use the distances between data points to
assign labels to the unlabeled ones. In Label Spreading, an affinity
matrix and a normalized graph are used. Self-Training assigns
labels to unlabeled data points by repeatedly using a trained
model as a pseudo-labeler.
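The pseudo-labeling idea behind Self-Training can be sketched in a few lines. The example below is a simplified illustration and not the implementation used in this work: a nearest-centroid classifier stands in for the actual prediction model, the confidence measure and the 0.8 threshold are assumptions, and the two-dimensional feature vectors are invented.

```python
import math


def centroids(points, labels):
    """Mean feature vector per class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(cents, p):
    """Return (label, confidence) for point p.

    Confidence is 1 minus the nearest centroid's share of the summed
    distances (an assumed, simple measure; 1.0 means p lies on a centroid).
    """
    dists = {y: math.dist(c, p) for y, c in cents.items()}
    best = min(dists, key=dists.get)
    total = sum(dists.values())
    conf = 1.0 if total == 0 else 1.0 - dists[best] / total
    return best, conf


def self_training(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Repeatedly pseudo-label confident points and refit the model."""
    points = [p for p, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(points, labels)
        scored = [(p, *predict(cents, p)) for p in pool]
        confident = [(p, y) for p, y, c in scored if c >= threshold]
        if not confident:
            break  # nothing left that the model is sure about
        for p, y in confident:
            points.append(p)
            labels.append(y)
        pool = [p for p, _, c in scored if c < threshold]
    return centroids(points, labels), pool


# Invented feature vectors: (share of routine words, share of cognitive words).
labeled = [((0.8, 0.0), "low"), ((0.2, 0.5), "high")]
unlabeled = [(0.7, 0.1), (0.3, 0.4), (0.5, 0.25)]
model, still_unlabeled = self_training(labeled, unlabeled)
```

In this toy run, two unlabeled points are close enough to one class to be pseudo-labeled, while the point lying between both centroids remains unlabeled.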
While training, in each prediction modeling technique, a set
of adequate hyper-parameters is chosen for creating the best
prediction model. As soon as all prediction modeling techniques
are trained, the test set is used to determine the best-performing
technique based on the prediction quality. For prediction quality
assessment, we use the F-score metric.
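The selection of the best-performing technique by F-score can be illustrated as follows. The text does not state which averaging is used, so the macro-averaged F1 below is an assumption; the test labels and the two candidate predictions are invented.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented test-set labels and predictions of two hypothetical models.
y_test = ["low", "low", "medium", "high"]
candidates = {
    "model_a": ["low", "low", "medium", "high"],
    "model_b": ["low", "medium", "medium", "high"],
}
best = max(candidates, key=lambda name: macro_f1(y_test, candidates[name]))
```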
In the prediction model development, to accomplish a better
prediction quality, we use three common meta-algorithms [81],
namely bagging, boosting, and stacking. In stacking, a single mod-
eling technique aims to learn the best combination of the predic-
tion models of the primary prediction modeling techniques put
in a stack. In boosting, to fix the errors in prior prediction mod-
els, the prediction modeling techniques are trained in a chain.
Bagging involves selecting different sub-samples of a training
data set. Predictions for the sub-samples are then aggregated
to identify the final and the best prediction model. When the
prediction model development is completed, the best-performing
model is applied to the unlabeled data. Using the DML taxonomy-
based and stylistic features, each data point is classified based on
the complexity scale, which is the one used while preparing the
labeled data.
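Of the three meta-algorithms, bagging is the simplest to sketch. The example below is a hedged illustration rather than the configuration used in this work: the base learner (a one-dimensional nearest-centroid rule), the number of models, the seed, and the data are all assumptions, and bootstrap samples that miss a class are redrawn to keep the toy example well defined.

```python
import random
from collections import Counter


def fit_centroids(sample):
    """Base learner: one mean feature value per class (1-D nearest centroid)."""
    groups = {}
    for x, y in sample:
        groups.setdefault(y, []).append(x)
    return {y: sum(xs) / len(xs) for y, xs in groups.items()}


def predict_one(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


def bagging_fit(data, n_models=15, seed=0):
    """Train base learners on bootstrap samples of the training data."""
    rng = random.Random(seed)
    classes = {y for _, y in data}
    models = []
    while len(models) < n_models:
        sample = [rng.choice(data) for _ in data]  # draw with replacement
        if {y for _, y in sample} == classes:      # keep samples covering every class
            models.append(fit_centroids(sample))
    return models


def bagging_predict(models, x):
    """Aggregate the base learners' predictions by majority vote."""
    votes = Counter(predict_one(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Invented one-feature training data (e.g., a scaled word count).
data = [(1.0, "low"), (1.2, "low"), (1.4, "low"),
        (9.6, "high"), (9.8, "high"), (10.0, "high")]
models = bagging_fit(data)
```

Boosting and stacking differ in how the base models are combined (sequential error correction and a learned combiner, respectively) and are not shown here.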
In Table 5, linguistic feature value calculations for the running
example IT tickets are listed. As can be seen, for each DML
taxonomy-based and stylistic feature, a value per ticket is com-
puted. For calculating the DML taxonomy-based feature values,
the DML taxonomy of our case study shown in Table A.12 in the
appendix is used. These values are fed into the best-performing
prediction model to obtain a single TD complexity value for each
ticket.
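The DML taxonomy-based feature values of ticket SR001 can be reproduced with a small sketch. Several assumptions apply: the keyword lists contain only the words shown in Table 5 (the full taxonomy in Table A.12 is larger), tokenization is a plain lowercase split without lemmatization, and the relative occurrence is taken over the taxonomy words found in the text, which is consistent with the three values of a ticket summing to one.

```python
import re

# Keywords copied from the words shown in Table 5; illustrative subset only.
DML_TAXONOMY = {
    "routine": {"install", "send", "available", "certificate", "login", "find", "active"},
    "semi-cognitive": {"document", "security", "configuration", "activate", "unable"},
    "cognitive": {"database", "recovery", "workstation", "different", "broken"},
}


def dml_features(text):
    """Relative occurrence of taxonomy words per cognition level."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = {level: sum(t in words for t in tokens)
              for level, words in DML_TAXONOMY.items()}
    total = sum(counts.values())
    # Assumption: the denominator is the number of taxonomy words in the text.
    return {level: (n / total if total else 0.0) for level, n in counts.items()}


SR001 = ("I would like to get access to XYZ. Could you please send me the "
         "available document how to install it?")
features = dml_features(SR001)
```

For SR001, this yields 0.75 for routine, 0.25 for semi-cognitive, and 0 for cognitive, matching Table 5. Values for other tickets may deviate from the table because the preprocessing used in the case study (for example, lemmatization) is not replicated here.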
To derive insights into the BP complexity of handling these
tickets and to analyze how their TD and EL complexities are related,
a single value of TD complexity per ticket is necessary.
Nevertheless, one can trace back the presence of individual
features in the text
if needed, for example, to understand which features influence
the complexity. This becomes notably relevant in the case of
inaccurate classifications and the XAI (explainable artificial in-
telligence) paradigm [82]. For instance, inaccurate classifications
can be analyzed by identifying the contribution of each linguistic
feature to the complexity prediction. Thus, one can use these
contributions to build end-user recommendations to improve the
text.
In addition to the overall analysis of TD and EL complexities,
i.e., at the ticket level, it should be emphasized that the relation
between TD and EL complexities can be further investigated at a
higher granularity using ticket attributes. As shown in Table 5, each of
the IT tickets in the running example has a different category.
Such an attribute of tickets may be beneficial to identify how
BP execution complexity varies among subsets of tickets. In this
regard, it is essential to calculate an aggregated TD complexity
value per subset. To do so, in the Approach, we use weight mul-
tipliers that are determined based on the same complexity scale.
These multipliers are applied to the calculated TD complexities of
IT tickets in a given subset, and a weighted average is computed
per subset.
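One possible reading of this aggregation is sketched below. The concrete weight multipliers (1, 2, and 3) are an assumption made for illustration; the text only states that they are derived from the complexity scale.

```python
from collections import Counter

# Assumed weight multipliers for a three-point scale (not given in the text).
WEIGHTS = {"low": 1, "medium": 2, "high": 3}


def aggregated_td_complexity(ticket_complexities):
    """Weighted average of the per-ticket TD complexities of one subset."""
    counts = Counter(ticket_complexities)
    weighted_sum = sum(WEIGHTS[level] * n for level, n in counts.items())
    return weighted_sum / sum(counts.values())


# Invented subset with four tickets.
subset = ["low", "low", "medium", "high"]
value = aggregated_td_complexity(subset)  # (1 + 1 + 2 + 3) / 4
```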
Table 5
Textual data-based complexity calculation of the running example IT tickets.
Group | Linguistic feature | Feature value SR001 | Feature value SR002
DML taxonomy-based | Relative occurrence of words based on the DML taxonomy and the DML cognition level routine | 0.75 (install, send, available) | 0.3 (certificate, login, find, active)
DML taxonomy-based | Relative occurrence of words based on the DML taxonomy and the DML cognition level semi-cognitive | 0.25 (document) | 0.3 (security, configuration, activate, unable)
DML taxonomy-based | Relative occurrence of words based on the DML taxonomy and the DML cognition level cognitive | 0 | 0.4 (database, recovery, workstation, different, broken)
Stylistic | Relative occurrence of nouns in all words | 0.35 | 0.4
Stylistic | Relative occurrence of unique nouns in all nouns | 1 | 0.88
Stylistic | Relative occurrence of verbs in all words | 0.3 | 0.2
Stylistic | Relative occurrence of unique verbs in all verbs | 1 | 1
Stylistic | Relative occurrence of adjectives in all words | 0.05 | 0.05
Stylistic | Relative occurrence of unique adjectives in all adjectives | 1 | 1
Stylistic | Relative occurrence of adverbs in all words | 0.05 | 0.05
Stylistic | Relative occurrence of unique adverbs in all adverbs | 1 | 1
Stylistic | Word count | 20 | 40
Stylistic | Wording style | 0 (min. word repeats) | 0 (min. word repeats)
TD complexity | | Low | Medium
4.4. Calculating event log-based complexity
To calculate EL complexity, in addition to an event log, a set
of EL complexity metrics and a complexity scale are taken as
inputs. As the complexity scale, we use the same scale as in the
TD complexity calculation. Thus, one can perform a correlation
analysis between the TD and EL complexities computed with
respect to the same complexity scale. As for the EL complexity
metrics, in our Approach, we use the ones¹ described in Section 3
(see Table 3). Similarly to TD complexity calculation, in this task,
we create subsets of a given event log considering the attributes
present in the textual data.
¹ The Python script provided on this GitHub page is adopted to calculate
those metrics.
Fig. 3. Event log-based complexity calculation.
Fig. 3 shows how the calculation of the EL complexity is
performed in our Approach. For each event log subset, a single
complexity value is computed using the employed EL complexity
metrics. Since these metrics focus on different event log properties
(for example, size, variance in executions, distances between
sequences and events) and use different measurement units,
computed measurements will vary for a single event log subset. For
example, in an event log subset, the number of events is a counted
value, whereas time granularity is an average time difference
between consecutive events.
To have a single EL complexity value for each event log subset,
the computed complexity values are mapped to the points of
the complexity scale using clustering. More specifically, for each
metric, calculated values are clustered. Then, for each cluster, a
value is determined from the complexity scale, which is the same
scale used for the textual data labeling. This flow is depicted in
Fig. 4.
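The mapping of metric values to scale points can be sketched for a single metric. The text does not name the clustering algorithm, so the plain one-dimensional k-means below, its initialization, and the example values are assumptions; the sketch requires at least two scale points.

```python
def kmeans_1d(values, k=3, rounds=50):
    """Plain 1-D k-means (k >= 2) with centroids initialised evenly between min and max."""
    lo, hi = min(values), max(values)
    cents = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(rounds):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - cents[i]))
            groups[nearest].append(v)
        # Empty clusters keep their previous centroid.
        new = [sum(g) / len(g) if g else cents[i] for i, g in enumerate(groups)]
        if new == cents:
            break
        cents = new
    return cents


def map_to_scale(values, scale=("low", "medium", "high")):
    """Assign each metric value the scale point of its nearest cluster centroid."""
    cents = sorted(kmeans_1d(values, k=len(scale)))  # ascending: low ... high
    return [scale[min(range(len(cents)), key=lambda i: abs(v - cents[i]))]
            for v in values]


# Invented values of one metric (e.g., number of events) for seven subsets.
magnitudes = [10, 12, 11, 50, 55, 120, 118]
levels = map_to_scale(magnitudes)
```

Repeating this for every metric gives one scale point per metric and subset, which is the input to the subsequent voting step.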
Fig. 4. Determining event log-based complexity of an event log subset.
Table 6
Event log-based complexity calculation of the running example IT tickets.
EL-based complexity metric | Metric value SR001 | Metric value SR002
Number of events (magnitude) | Low | High
Number of event types (variety) | Low | High
Number of sequences (support) | Low | High
Average sequence length (TL-avg) | Low | High
Average time difference between consecutive events (time granularity) | Low | Medium
Number of acyclic paths in transition matrix (LOD) | Low | High
Number of ties in transition matrix (t-comp) | Low | High
Lempel–Ziv complexity (LZ) | Low | High
Number of unique sequences DT(#) | Low | High
Percentage of unique sequences DT(%) | Low | Medium
Average distinct events per sequence (structure) | Low | High
Average affinity (affinity) | Low | High
Deviation from random (dev-random) | Low | High
Average edit distance (avg-dist) | Low | High
Variance entropy | Low | High
Normalized variance entropy | Low | High
Sequence entropy | Low | High
Normalized sequence entropy | Low | High
EL complexity | Low | High
In Table 6, each EL-based complexity metric and the resulting
EL complexity of our running example tickets are shown. As can
be seen, the ticket SR002 was put into the medium cluster for
the metric Percentage of Unique Sequences (DT%), whereas the
ticket SR001 was put into the low cluster. As the result of majority
voting, the EL complexity of SR002 is determined as high and, for
SR001, it is low. As the TD and EL complexities are now known,
one can identify how the TD complexities of these tickets are
associated with their EL complexities. Moreover, depending on
the direction and strength of the association between the TD and
EL complexities, further necessary analyses and inputs can be
determined.
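The majority voting step for the running example can be sketched as follows, using the per-metric scale points of Table 6 (18 metrics per ticket). The tie-breaking rule is an assumption, as the text does not specify one.

```python
from collections import Counter

SCALE = ["low", "medium", "high"]  # ordered complexity scale


def majority_vote(metric_levels):
    """Return the scale point assigned by most metrics."""
    counts = Counter(metric_levels)
    top = max(counts.values())
    tied = [level for level in SCALE if counts[level] == top]
    return tied[-1]  # assumption: ties resolve to the higher complexity


# Per-metric scale points from Table 6.
sr001 = ["low"] * 18
sr002 = ["high"] * 16 + ["medium"] * 2  # medium: time granularity and DT(%)

el_complexity = {"SR001": majority_vote(sr001), "SR002": majority_vote(sr002)}
```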
4.5. Correlation analysis
As introduced in Section 1, one of the main goals of our
research is to determine how textual data can contribute to EL
complexity prediction. Thus, in our Approach, we conduct corre-
lation analysis [83] to find out how the TD complexity is related
to the EL complexity. In the correlation analysis, the strength
of association between these two complexities and the direction
of their relation are measured. More specifically, we investigate
strong-positive, strong-negative, and no correlations. In the case
when TD and EL complexities are close to a normal distribution,
the Pearson correlation is used. To assess the strength in the
Pearson correlation, we follow the general guidelines [83,84] and
use 0.1, 0.3, and 0.5 as coefficient thresholds. In the case of
non-normality in TD and EL complexities, we choose Spearman's
correlation [83,84]. Using the same general guidelines, 0.2 and 0.8
are taken as the thresholds to detect the strength of correlations.
In addition, p ≤ 0.05 (p denotes probability) is used as the
significance threshold in both the Pearson and Spearman
correlations.
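Both coefficients can be computed with a few lines of code once the ordinal complexity values are encoded as numbers. The sketch below is illustrative only: the encoding (low = 1, medium = 2, high = 3), the subset values, and the verbal labels returned by strength are assumptions, and the p-value computation is omitted because it requires a statistics library.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (inputs must not be constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def strength(r, thresholds=(0.1, 0.3, 0.5)):
    """Label |r| using the Pearson thresholds quoted in the text."""
    low, mid, high = thresholds
    size = abs(r)
    if size >= high:
        return "strong"
    if size >= mid:
        return "moderate"
    if size >= low:
        return "weak"
    return "negligible"


# Invented complexities of six subsets, encoded as low=1, medium=2, high=3.
td = [1, 1, 2, 2, 3, 3]
el = [1, 2, 2, 3, 3, 3]
rho = spearman(td, el)
```

For Spearman's correlation, the thresholds 0.2 and 0.8 mentioned above would be passed to a correspondingly adapted labeling function.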
When TD and EL complexities are in a strong-positive correla-
tion, we can use textual data to predict the execution complexity
of BPs by means of TD complexity. Moreover, organizations can
make prior decisions and take actions to mitigate complexity in
performing BPs. A negative correlation between TD and EL com-
plexities is a good indicator to identify which data type should
be further analyzed. An absence of correlation between these two
complexities cannot be directly interpreted. Therefore, more textual data
and event log attributes and other BP execution data, such as
performance indicator values, should be considered and further
analyzed to detect reasons for complexity.
In Table 7, the TD and EL complexities for the running example
IT tickets are shown. Solely considering these two tickets, one can
notice that the TD complexity has a positive correlation with the
EL complexity. Based on such observation, the following inter-
pretation can be deduced: TD complexity affects EL complexity.
Hence, textual data of these (and similar) tickets can be used to
predict their EL complexity.
Table 7
Textual data- and event log-based complexities of the running example IT tickets.
ID | Channel | Category | TD complexity | EL complexity
SR001 | IT ticketing system | Application | Low | Low
SR002 | IT ticketing system | Security | Medium | High
As a correlation does not necessarily imply a cause and effect
between two complexities, we conduct a significant difference
analysis on event logs to identify the factors that may account
for variation in the EL complexity. In the following subsection,
we elaborate on that.
4.6. Significant difference analysis of event logs
To understand what process parts or activities affect the EL
complexity, statistically significant differences in the event logs
need to be analyzed. Aligned with that goal, there are approaches
in the related literature. We reviewed the applicability of these
approaches for analyzing the event logs within our Approach.
The approach by Bolt et al. [85] stood out since it provides
an extensible basis for the EL complexity metrics. To identify
what differences are significant, well-known statistical tests are
used within that approach. In particular, the two-tailed Welch’s
T-test [86] is used in the case of a normal distribution. This basis
is beneficial for our Approach due to the following reasons: (1)
for Welch’s T-test, it is not necessary to have equal variance
between two groups, and (2) Welch’s T-test is less restrictive
than the original Student’s T-test, which makes it more reliable.
For non-normality cases, the non-parametric rank-based Mann–
Whitney U-test [87] is applied. To handle outliers and unexpected
observations, this test focuses on the median, which is a better
measure of the central tendency for skewed data. In this respect,
such a non-parametric method is useful for our Approach.
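The test statistics behind the two tests can be sketched as follows. This is not the Process Comparator implementation: the sketch computes only the Welch t statistic with its Welch-Satterthwaite degrees of freedom and the Mann-Whitney U statistic, and leaves out the p-values, which require the corresponding distributions from a statistics library. The sample values are invented.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def mann_whitney_u(a, b):
    """U statistic of sample a: pairs where a is larger, ties counted as one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


# Invented activity durations (in hours) in two complexity clusters.
cluster_low = [1.0, 2.0, 3.0, 4.0, 5.0]
cluster_high = [2.0, 4.0, 6.0, 8.0, 10.0]

t_stat, dof = welch_t(cluster_low, cluster_high)
u_stat = mann_whitney_u(cluster_low, cluster_high)
```

Note that the two samples have unequal variances (2.5 and 10), which is the situation in which Welch's T-test is preferable to Student's T-test.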
Furthermore, this approach is available as a plugin (called
Process Comparator) in the Process Mining framework, ProM [88],
which offers built-in features for handling event logs. As such, we
execute this plugin for each complexity cluster pair and analyze
statistically significant differences in the event logs for a partic-
ular cluster pair. Complexity cluster pairs are determined using
the correlation analysis results. For example, if a strong-positive
correlation is observed with respect to an attribute present in
textual data, subsets created for that attribute will be used to
form cluster pairs.
Taking our running example IT tickets, a cluster pair is created
using the category attribute of the tickets. Fig. 5 illustrates how
the significant differences between the event logs of this cluster
pair are highlighted. In the figure, nodes represent activities,
whereas arcs reflect the sequence of these activities in the pro-
cess. The thickness of the arcs and nodes is determined based on
the increasing or decreasing value of the selected process met-
ric, for example, frequency or duration. Distinct colors, i.e., red
and blue, are used to indicate significant differences in terms
of the selected process metric. In addition, the letters B and R
are placed where appropriate to indicate blue and red colors,
respectively. As can be seen, the activity “Hand-over” (T10) is
significant in the cluster that contains the ticket SR002. Likewise,
“Work assignment” (T14) is significant in the cluster into which
SR001 is put. Considering these insights, one can further analyze the
presence or absence of those textual data features which resulted
in handovers or work reassignments.
5. Case study
In this section, first, we give information about the setting of
the case study, in which we apply the proposed Approach. Then,
we elaborate on the application of our Approach and obtained
results.
Fig. 5. Statistically significant differences between SR001 (with B) and SR002
(with R). (For interpretation of the references to color in this figure legend, the
reader is referred to the web version of this article.)
Case study organization: The IT department of an educational
institution in the Netherlands initiated a project to learn from
the data recorded about its ITSM processes. One of the important
processes among them is the SRM process. This process defines
the way of handling user requests related to the products and
services offered by the educational institution. Upgrading a soft-
ware product installed on a user device or providing access to
a file are two examples of typical service requests. Since the
institution offers a wide range of products and services to more
than 25K users, multiple resolution teams are involved in the SRM
process. Each resolution team handles requests for a particular
set of products and services. For example, printing and secure file
sharing are two services managed by separate resolution teams.
To accomplish a data-driven service delivery, the IT depart-
ment, hereafter Org-IT, expressed its interest in finding ways to
reduce the complexity of incoming requests. As this setting is
highly related to the Approach introduced in this paper, we apply
it in Org-IT and explain the findings showing its usefulness.
Table 8
Identified ticket attributes.
Attribute | Description | Values
Channel | The entry source of the ticket | IT ticketing system, email, phone, online chat, desk-physical location
Category | The assigned category on the ticket | A number of categorical values, e.g., printing, IT infrastructure
Organizational unit | The organizational entity that the ticket reporting end-user belongs to | A number of categorical values, e.g., human resources, finance department, or science faculty
Duration | Time spent on the ticket | Less than 1 day, 1–3 days, 3–5 days, 5–10 days, 10–15 days, and more than 15 days
Hold-on | Whether the ticket is held-on, i.e., the ticket processing is paused, and the end-user is asked for more information | Never, once, more than once
Re-open | Whether the ticket is reopened | Never, once, more than once
5.1. Data collection
As explained in the previous section, three types of input
(textual data, event log, and complexity scale) are required in the
Approach. To obtain these, we worked together with five experts
who coordinate the resolution teams in Org-IT. Importantly, these
experts have substantial knowledge of incoming service requests.
With the support of the experts, first, we defined a complexity
scale. Second, we asked experts to label a set of randomly selected
service requests considering their textual data. Lastly, together
with the experts, we extracted the SRM process execution data
in the form of an event log.
Complexity scale: In a semi-structured discussion meeting, we
asked the aforementioned five experts to define a complexity
scale based on the service requests handled by the resolution
teams they are coordinating. The experts agreed on a three-point
scale containing low, medium, and high as complexity values.
Textual data: Org-IT provided us with textual data about the
service requests handled between Jan 2019 and May 2021. The
provided textual data contain 4982 service requests (also called
tickets). Of these, 134 randomly selected requests (∼2.7% of the
total) were labeled based on the defined three-point complexity
scale with the support of the same five experts. In the process
of labeling, the experts made their decisions based on the textual
data as well as other data recorded about tickets, like priority,
category, or attached documents.
Event log: For the aforementioned 4982 service requests, we
received an event log consisting of ∼37K events.
Furthermore, to conduct a detailed analysis of the relation
between TD and EL complexities, subsets from textual data and
event log need to be created. For this purpose, a set of service
request attributes is identified together with the aforementioned
experts. These attributes are listed in Table 8. In addition, to
analyze complexities over time and detect changes, the following
time periods are defined: before Covid-19 (Jan 2019–Feb 2020)
and during Covid-19 (Mar 2020–May 2021).
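The subset creation described above can be illustrated with a minimal sketch. It assumes that each ticket carries a registration date and a handling time in days; the names `registered_on`, `covid_period`, and `duration_bin` are hypothetical, and the exact day boundaries are assumptions because the periods and the Duration values of Table 8 are given only at month and day-range precision.

```python
from datetime import date

# Assumed boundaries: the text gives the periods only to month precision.
STUDY_START = date(2019, 1, 1)
COVID_START = date(2020, 3, 1)
STUDY_END = date(2021, 5, 31)


def covid_period(registered_on: date) -> str:
    """Assign a ticket to one of the two analysis periods."""
    if not STUDY_START <= registered_on <= STUDY_END:
        raise ValueError("ticket lies outside the Jan 2019-May 2021 window")
    return "before Covid-19" if registered_on < COVID_START else "during Covid-19"


def duration_bin(days: float) -> str:
    """Map a handling time in days to a Duration value of Table 8.

    Putting a boundary value (for example, exactly 3 days) into the lower
    bin is an assumption; the table does not say which bin owns it.
    """
    if days < 1:
        return "Less than 1 day"
    for upper, label in ((3, "1-3 days"), (5, "3-5 days"),
                         (10, "5-10 days"), (15, "10-15 days")):
        if days <= upper:
            return label
    return "more than 15 days"
```

Grouping tickets by the returned labels (alone or paired with the channel) would yield subsets of the kind analyzed in the following subsections.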
5.2. Calculating textual data-based complexity
DML taxonomy is a required input to calculate TD complexity,
as explained in Section 4. Since it is not provided in the case
study, we develop it. For this purpose, we follow the proce-
dure explained in Section 3, which is taken from our previous
work [11].
DML taxonomy for SRM process: As we focus on the SRM pro-
cess in the case study, we develop a DML taxonomy for that
process. To express the cognition level of textual data in terms
of linguistic features, PoS (nouns, verbs, adjectives, and adverbs)
in textual data are analyzed. First, we clean the textual data in
the tickets and create document-term matrices. To clean the tex-
tual data, we performed the following activities: (i) we removed
signatures in the tickets received via email and (ii) URLs, email
addresses, and phone numbers mentioned in the text are replaced
with pseudonyms (for example, url1, email1). Second, LDA and
LSI topic modeling methods [68] are combined to extract the de-
scriptive keywords of the tickets. Finally, the extracted keywords
are grouped into the three DML cognition levels (routine,
semi-cognitive, and cognitive). The aforementioned five experts
are involved in this grouping to correctly identify the DML cognition
level for each keyword. In particular, these experts are asked to
critically evaluate and provide their feedback on the extracted
keywords and their corresponding DML cognition levels. Addi-
tionally, the Information Technology Infrastructure Library (ITIL)
framework [89] is used to enrich the taxonomy. The experts are
asked to identify which keywords from the SRM process docu-
mentation in ITIL should be included in the taxonomy. The DML
taxonomy created for the SRM process is shown in Table A.12 in
the appendix. For implementation purposes, separate text files
for each part of speech and DML level are created, as presented on
our GitHub page.2
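The pseudonymization part of the cleaning step can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the case study: the regular expressions and the function name `pseudonymize` are hypothetical and deliberately loose (the phone pattern would, for example, also match long digit groups such as dates), and signature removal is not covered because its rules are not described in the text.

```python
import re

# Illustrative patterns only; real ticket text may need stricter rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s()/-]{6,}\d")


def pseudonymize(text: str) -> str:
    """Replace URLs, email addresses, and phone numbers with url1, email1, ..."""
    for prefix, pattern in (("url", URL_RE), ("email", EMAIL_RE), ("phone", PHONE_RE)):
        seen = {}  # original value -> pseudonym, so repeats get the same name

        def repl(match, prefix=prefix, seen=seen):
            return seen.setdefault(match.group(0), f"{prefix}{len(seen) + 1}")

        text = pattern.sub(repl, text)
    return text


print(pseudonymize("See https://example.org/help or mail jane.doe@example.org."))
```

Reusing the same pseudonym for a repeated value keeps the text consistent for the later document-term matrix without exposing the original value.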
Using the developed DML taxonomy, we extract the DML
taxonomy-based linguistic features for the given 4982 tickets.
Afterward, for the same tickets, we extract the stylistic features
as listed in Table 1 in Section 3.
Next, prediction models are developed to classify unlabeled
tickets. Specifically, we take the four typical semi-supervised
learning techniques as the basis and enrich the unlabeled data
using the labeled data. Afterward, labeled and unlabeled data are
combined. From the combined data, training and test data sets are
created. A number of commonly used prediction modeling tech-
niques are trained on the training data set. The best-performing
prediction model is selected using the F-score metric. Then, the
best-performing prediction model is run on the unlabeled data to
assign TD complexities.
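The idea of enriching unlabeled tickets with labels predicted from the labeled ones can be sketched as below. The details of the custom Pseudo-Labeling used in the case study are not given here, so this is only a generic illustration: a nearest-centroid classifier stands in for the actual prediction models, and all function names are hypothetical.

```python
from collections import defaultdict


def centroids(X, y):
    """Compute the mean feature vector per complexity label."""
    sums, counts = {}, defaultdict(int)
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(model, key=lambda label: sq_dist(model[label]))


def pseudo_label(X_labeled, y_labeled, X_unlabeled):
    """Label the unlabeled tickets with a model fitted on the labeled ones
    and return the combined data set."""
    model = centroids(X_labeled, y_labeled)
    y_pseudo = [predict(model, x) for x in X_unlabeled]
    return X_labeled + X_unlabeled, y_labeled + y_pseudo
```

The combined data returned by `pseudo_label` would then be split into training and test sets, on which the candidate prediction models are trained and compared.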
Fig. 6 depicts the flow from cleaning textual data to calculating
TD complexities. How textual data is split and processed can be
seen in the figure.
As indicated before, developing DML taxonomy-based linguis-
tic features requires additional resources, i.e., domain expert in-
volvement. In fact, one can argue whether such investment would
provide more accurate results compared to a feature develop-
ment technique that does not include user involvement. To ad-
dress this concern and show the value of our linguistic features
2 Check out the DML taxonomy on our GitHub page.
Fig. 6. Textual data-based complexity calculation by means of prediction.
in predicting TD complexity, we take Linguistic Inquiry and Word
Count (LIWC), a state-of-the-art text analysis technique, predict
TD complexity using LIWC features, and compare the results with
our linguistic features. To this end, we first elaborate on the LIWC
features and then evaluate our Approach against
them.
LIWC is a dictionary-based text analysis technique that focuses
on the connection between important psycho-social constructs
and theories with words, phrases, and other linguistic construc-
tions [90]. Words given in a text are analyzed and placed into one
or more linguistic, psychological, and topical categories. Each of
these categories indicates several aspects of a text, for example,
social, cognitive, or affective. From these categories, we select
linguistic features to comply with the focus of our Approach.
The selected features and their definitions with examples (taken
from [90]) are given in Table 9. As can be seen, there is consid-
erable overlap between our linguistic features listed in Table 1
and the selected LIWC features. Notably, features derived using
parts-of-speech (PoS) in text, for example, nouns, verbs, adverbs,
and adjectives, show similarity.
Using the LIWC implementation,3 we obtain the LIWC feature
values (mostly counts and distribution frequencies) for the
tickets. Using these values and following the same steps ex-
plained above, we develop prediction models and measure the
performance of these models in terms of weighted F-score. The
prediction models and their performance in the evaluation us-
ing both our linguistic features and LIWC features are given in
Table 10.
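For reference, a support-weighted F-score can be computed as sketched below. The sketch assumes that the weighted F-score is the per-class F1 averaged with class supports as weights, which is the usual definition; the function name is hypothetical and the example labels are invented for illustration.

```python
from collections import Counter


def weighted_f_score(y_true, y_pred):
    """Per-class F1, averaged with the number of true instances as weights."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / total
    return score


# Invented toy labels on the three-point complexity scale.
y_true = ["low", "low", "medium", "high"]
y_pred = ["low", "medium", "medium", "high"]
print(weighted_f_score(y_true, y_pred))
```

Weighting by support keeps frequent complexity classes from being under-represented in the score when the class distribution is imbalanced.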
As highlighted in italics in the first row in Table 10, Bagged
Decision Trees is the outperforming algorithm with the best
weighted F-score value. This performance is achieved with the
DML taxonomy-based and stylistic features. In the case of LIWC
features, the Bagged Decision Trees algorithm has notably lower
performance. Overall, for almost all algorithms, we obtained
better performance with our linguistic features than with LIWC
features. Only in two cases did LIWC features perform slightly
better. These are highlighted in the table
in dark gray, namely Naïve Bayes with Pseudo-Labeling and
Stacked SVM-Naïve Bayes with Pseudo-Labeling. It is important to
note that tree-based algorithms perform particularly well when
3 Check out the LIWC implementation.
the base semi-supervised learning technique is Pseudo-Labeling
(see the top four rows in the table). However, Pseudo-Labeling is
also seen in the least successful performances (see the last row
in the table). Apart from that, Label Spreading is another semi-
supervised learning technique observed in the majority of the less
successful performances.
To perform a drill-down TD complexity calculation, we take
Table 8 as the basis and create subsets for the following: per ticket
attribute, pairwise combined ticket attributes, and before and
during Covid-19 periods. Hereby, the case study experts indicated
not only important ticket attributes but also the most relevant
pairwise combinations, i.e., by ‘‘channel’’ attribute, as it directly
affects the incoming text of a service request. The TD complexity
per subset is computed using the weighted TD complexity of the
tickets contained in it. For the low, medium, and high points in
the complexity scale, 1, 2, and 3 are used as the weight multipli-
ers, respectively. Simply put, the TD complexity per subset is the
aggregation of individual TD complexity of tickets in the subset
using these weight multipliers.
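A minimal sketch of this aggregation is given below. It assumes that the aggregation is the mean of the weight multipliers mapped back to the nearest scale point; the text only states that the weights 1, 2, and 3 are used, so both the mean and the rounding rule are assumptions, and the function name is hypothetical.

```python
WEIGHTS = {"low": 1, "medium": 2, "high": 3}
LEVELS = {1: "low", 2: "medium", 3: "high"}


def subset_td_complexity(ticket_levels):
    """Return the mean weight of a subset and the nearest complexity level."""
    if not ticket_levels:
        raise ValueError("subset contains no tickets")
    mean = sum(WEIGHTS[level] for level in ticket_levels) / len(ticket_levels)
    # Ties (for example, a mean of 1.5) fall to the lower level: an assumption.
    nearest = min(LEVELS, key=lambda weight: abs(weight - mean))
    return mean, LEVELS[nearest]


print(subset_td_complexity(["low", "low", "medium", "high"]))
```

Keeping the numeric mean next to the mapped level makes it possible to see how close a subset is to the neighboring scale point.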
5.3. Calculating event log-based complexity
The event log provided by the case study organization is
filtered, and event log subsets are created for the ticket attributes
listed in Table 8, their pairwise combinations, and the defined
two time periods. For example, five event log subsets are created
for the five values seen in the channel attribute. To obtain a
single EL complexity for an event log subset, cluster analysis
is conducted. The number of clusters is set to three since the
given complexity scale contains three value points. Then, for
each EL complexity metric (see Table 3), a complexity value
is determined, resulting in 13 values per subset. By majority
voting, a single complexity value is selected as the EL complexity
for each subset.
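The majority voting over the 13 metric-level complexity values can be sketched as follows. How ties are resolved is not stated in the text; breaking them toward the higher complexity level is an assumption made only for this illustration, and the function name is hypothetical.

```python
from collections import Counter

SCALE = ("low", "medium", "high")


def majority_vote(metric_levels):
    """Select the most frequent complexity level among the metric votes."""
    counts = Counter(metric_levels)
    top = max(counts.values())
    winners = [level for level in SCALE if counts.get(level, 0) == top]
    return winners[-1]  # assumed tie-break: prefer the higher level


# Invented votes of 13 EL complexity metrics for one event log subset.
votes = ["high"] * 7 + ["medium"] * 4 + ["low"] * 2
print(majority_vote(votes))
```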
5.4. Analyzing correlations
In the correlation analysis, we measure to what extent the
calculated TD and EL complexities correlate. Aligned with the
specification in our Approach (see Section 4.5), Spearman’s
correlation is chosen for the analysis, as the data distribution is
non-normal and the complexity scale contains ordinal
Table 9
LIWC features [90].
Category | Feature | Definition/Example
Summary | Word count | Total word count
Summary | Words per sentence | Average words per sentence
Summary | Big words | Percent words 7 letters or longer
Summary | Dictionary words | Percent words captured by LIWC
Summary | Analytical thinking | Metric of logical, formal thinking
Linguistic | Total function words | the, to, and, I
Linguistic | Total pronouns | Total amount of pronouns
Linguistic | Personal pronouns | I, you, my, me
Linguistic | 1st person singular | I, me, my, myself
Linguistic | 1st person plural | we, our, us, lets
Linguistic | 2nd person | you, your, u, yourself
Linguistic | 3rd person singular | he, she, her, his
Linguistic | 3rd person plural | they, their, them,
Linguistic | Impersonal pronouns | that, it, this, what
Linguistic | Determiners | the, at, that, my
Linguistic | Articles | a, an, the, alot
Linguistic | Numbers | one, two, first, once
Linguistic | Prepositions | to, of, in, for
Linguistic | Auxiliary verbs | is, was, be, have
Linguistic | Adverbs | so, just, about, there
Linguistic | Conjunctions | and, but, so, as
Linguistic | Negations | not, no, never, nothing
Linguistic | Common verbs | is, was, be, have
Linguistic | Common adjectives | more, very, other, new
Linguistic | Quantities | all, one, more, some
Psychological-cognition | All-or-none | all, no, never, always
Psychological-cognition | Cognitive processes | but, not, if, or, know
Psychological-cognition | Insight | know, how, think, feel
Psychological-cognition | Causation | how, because, make, why
Psychological-cognition | Discrepancy | would, can, want, could
Psychological-cognition | Tentative | if, or, any, something
Psychological-cognition | Certitude | really, actually, of course, real
Psychological-cognition | Differentiation | but, not, if, or
Psychological-cognition | Memory | remember, forget, remind, forgot
Expanded-states | Need | have to, need, had to, must
Expanded-states | Want | want, hope, wanted, wish
Expanded-states | Acquire | get, got, take, getting
Expanded-states | Lack | don’t have, didn’t have, hungry
Expanded-states | Fulfilled | enough, full, complete, extra
Expanded-states | Fatigue | tired, bored, don’t care, boring
Expanded-time | Time | when, now, then, day
Expanded-time | Past focus | was, had, were, been
Expanded-time | Present focus | is, are, I’m, can
Expanded-time | Future focus | will, going to, have to, may
values. We compute the correlations for the subsets created with
the help of the case study experts. The computed correlations
are displayed in Table 11. The first column Combination contains
the information regarding the grouping attribute, i.e., its presence
(‘‘Channel’’) or absence (‘‘–’’). In the column Overall, the overall
values of coef, i.e., coefficient indicating the strength of the mea-
sured relationship between TD and EL complexities, and p, i.e., the
quantified significance of such a relationship, are shown. They
are followed by the columns Before Covid-19 and During Covid-19,
which give the coef and p values for the indicated time periods. For
instance, for the subset created using the pairwise combination
of the ticket attributes ‘‘Channel’’ and ‘‘Hold-on’’ for the time
period before Covid-19, a coefficient of 0.439 and a p value of 0.101 are
computed.
Following the general guidelines explained in the Approach
(see Section 4.5), 0.8 and 0.2 are chosen as the thresholds for
strong and weak correlations. The significance of correlations is
identified using the criterion p ≤ 0.05, which is also explained
in Section 4.5. Accordingly, in Table 11, strong correlations
are highlighted in dark gray cells. Light gray is used for
indicating the combinations whose p values meet the criterion.
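The coefficient and the threshold rules can be sketched in plain Python as shown below. The sketch computes Spearman's coefficient as the Pearson correlation of average ranks and then applies the 0.8, 0.2, and p ≤ 0.05 criteria. Whether the thresholds are inclusive is an assumption, the function names are hypothetical, and the p value is taken as given because in practice it would come from a statistics package.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def classify(coef, p, strong=0.8, weak=0.2, alpha=0.05):
    """Label a correlation as strong, weak, or none, and flag significance."""
    size = abs(coef)
    strength = "strong" if size >= strong else "weak" if size >= weak else "none"
    return strength, p <= alpha
```

For example, `classify(0.816, 0.0)` returns `("strong", True)` and `classify(0.037, 0.895)` returns `("none", False)`, matching how the overall ‘‘Channel–Category’’ and ‘‘Channel–Hold-on’’ values of Table 11 are read in the text.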
As can be seen in the first six rows of the table (no grouping
attribute ‘‘–’’), the correlations are strong for the following
four ticket attributes: ‘‘Channel’’, ‘‘Organizational unit’’,
‘‘Hold-on’’, and ‘‘Re-open’’. Moreover, for all but one of these attributes
(‘‘Organizational unit’’), strong correlations are also
identified in both time periods. The correlation strength for ‘‘Organizational unit’’ in
one of the time periods (before Covid-19) is only 0.012 less than
the defined threshold. In the combinations, i.e., pairs (grouping
attribute ‘‘Channel’’), a strong correlation is identified for the
‘‘Channel–Category’’ combination. The computed correlation dur-
ing Covid-19 is significant for this pair. However, it is weak, albeit
very much above the defined threshold (0.2).
Fig. 7 illustrates some of the correlations. In the figure, we
used jittering to reduce overlapping points that hinder getting
a sense of density. As can be seen in Table 11, we observe
weak correlations for the following two subsets, including over
time (no grouping attribute ‘‘–’’): ‘‘Category’’ and ‘‘Duration’’.
For the ‘‘Duration’’ attribute, there is a subtle difference between
the obtained correlation and the defined strong threshold
(i.e., 0.8 − 0.707 = 0.093). Apart from that, weak correlations
are observed in three out of five combinations (grouping
attribute ‘‘Channel’’ in Table 11): ‘‘Channel–Organizational unit’’,
‘‘Channel–Re-open’’, and ‘‘Channel–Duration’’. The only combina-
tion for which no correlation exists is ‘‘Channel–Hold-on’’. Aside
from that, no negative correlations are detected.
Table 10
Evaluation of prediction models for textual data-based complexity.
Meta algorithm | Algorithm^a | Base SSLT^b | Weighted F-score (DML taxonomy-based & stylistic features) | Weighted F-score (LIWC features)
Bagging | Decision Trees | PL | 0.943 | 0.865
– | Random forest | PL | 0.94 | 0.852
– | Extra trees | PL | 0.938 | 0.881
Boosting | Random forest | PL | 0.931 | 0.872
Boosting | Gradient boosting | PL | 0.922 | 0.832
– | Logistic regression | PL | 0.919 | 0.881
– | K-Nearest neighbors | PL | 0.914 | 0.877
– | Decision trees | PL | 0.909 | 0.85
Boosting | AdaBoost | PL | 0.901 | 0.839
– | Stochastic gradient descent | PL | 0.9 | 0.697
– | Naïve Bayes | PL | 0.899 | 0.904
– | Perceptron | ST | 0.897 | 0.826
– | Support vector machines | ST | 0.897 | 0.818
– | Support vector machines | PL | 0.897 | 0.852
Stacking | Stacked: Support vector machines and Naïve Bayes, finalizer: Logistic regression | PL | 0.897 | 0.896
– | Logistic regression | ST | 0.893 | 0.818
– | Perceptron | PL | 0.893 | 0.837
– | Extra trees | LS | 0.89 | 0.867
– | Support vector machines | LS | 0.89 | 0.826
Stacking | Stacked: Support vector machines and Naïve Bayes, finalizer: Logistic regression | LS | 0.89 | 0.826
– | Random forest | LS | 0.89 | 0.867
– | Stochastic gradient descent | ST | 0.889 | 0.697
– | Decision trees | LS | 0.889 | 0.867
Stacking | Stacked: Support vector machines and Naïve Bayes, finalizer: Decision trees | LS | 0.885 | 0.826
– | K-Nearest neighbors | LS | 0.885 | 0.826
Boosting | Gradient boosting | LS | 0.885 | 0.826
Boosting | Random forest | LS | 0.884 | 0.826
Bagging | Decision trees | LS | 0.883 | 0.826
– | Naïve Bayes | LP | 0.871 | 0.794
Boosting | AdaBoost | LS | 0.839 | 0.818
Stacking | Stacked: Support vector machines and Naïve Bayes, finalizer: Decision trees | PL | 0.815 | 0.828
^a For algorithms, we refer to the Python scikit-learn library implementation on https://scikit-learn.org/stable/.
^b Semi-Supervised Learning Technique. PL: Custom Pseudo-Labeling, ST: Self-Training, LS: Label Spreading, and LP: Label Propagation.
Table 11
Correlations between textual data- and event log-based complexities.
Grouping attribute | Attribute | Overall coef. | Overall p | Before Covid-19 coef. | Before Covid-19 p | During Covid-19 coef. | During Covid-19 p
– | Channel | 0.968 | 0.007 | 1 | 0 | 1 | 0
– | Organizational unit | 0.983 | 0 | 0.788 | 0 | 0.833 | 0
– | Hold-on | 0.866 | 0.333 | 0.866 | 0.333 | 0.866 | 0.333
– | Re-open | 0.866 | 0.333 | 0.866 | 0.333 | 0.866 | 0.333
– | Duration | 0.707 | 0.116 | 0.707 | 0.116 | 0.707 | 0.116
– | Category | 0.577 | 0.081 | 0.655 | 0.056 | 0.632 | 0.092
Channel | Category | 0.816 | 0 | 0.607 | 0 | 0.977 | 0
Channel | Organizational unit | 0.577 | 0 | 0.364 | 0.005 | 0.533 | 0
Channel | Re-open | 0.556 | 0.048 | 0.342 | 0.232 | −0.082 | 0.782
Channel | Duration | 0.476 | 0.019 | 0.594 | 0.001 | 0.536 | 0.003
Channel | Hold-on | 0.037 | 0.895 | 0.439 | 0.101 | 0.426 | 0.113
Furthermore, the significance of the correlations can be seen
in the p column in Table 11. ‘‘Channel’’ and ‘‘Organizational unit’’
are the attributes with significant correlations, including the two
time periods. For the pair ‘‘Channel–Category’’, a similar obser-
vation can be obtained from the table. In addition, the signif-
icance criterion is met when the ‘‘Channel’’ is combined with
Fig. 7. Correlations between textual data- and event log-based complexities.
‘‘Organizational unit’’, ‘‘Re-open’’, and ‘‘Duration’’ attributes, al-
beit their computed correlations are not strong.
The correlations explained above serve as the basis for
analyzing statistically significant differences in the execution of
BPs based on the event log subsets. In the following subsection,
we present the results of that analysis.
5.5. Analyzing significant differences in business process executions
Considering Table 11, we focus on strong correlations, weak
correlations, and changes in correlations over time. Accordingly,
we analyze the statistically significant differences between the
event log subsets in the related correlations. The most interesting
findings from that analysis are presented in this subsection.
Fig. 8. Significant differences between more interactive channels (with B) and
less interactive channels (with R). (For interpretation of the references to color
in this figure legend, the reader is referred to the web version of this article.)
Due to their real-time conversation nature, phone, desk-
physical location, and online chat are considered more interactive
channels. In contrast, email and the IT ticketing system are the two less
interactive channels. As shown in Table 11 and in Fig. 7(a), we obtained
strong, significant correlations for the ‘‘Channel’’ attribute.
More specifically, EL complexity increases when the channel is
less interactive. Hence, we investigated the differences between
the event log subsets of more and less interactive channels.
In Fig. 8, statistically significant differences in the SRM process
execution between more and less interactive channels are dis-
played. Activities and paths that more often reoccur in handling
tickets received via email and IT ticketing system are presented
in red (with R). Blue (with B) is used for phone, desk-physical
location, and online chat. The activity T9 is related to the catego-
rization of tickets and their assignments to the ticket resolution
teams. In interactive channels, it is usually performed after the
‘‘Registration’’ activity (T1). However, in less interactive chan-
nels, activities ‘‘Pause’’ and ‘‘Ask end-user for extra information’’
(T12 and T13) occur more often. In addition, ‘‘Hand-over’’ (T10)
and ‘‘Work assignment’’ (T14) are also frequent activities for
these channels. Aside from that, ‘‘Reopening’’ (T31) of the tickets
coming via these channels takes place more often compared to
the interactive channels. Aligned with that, activities that are
related to the resolution and closure of the ticket (T16, T29, and
T30) and changing the ticket status (T8) occur more often for
the same channels. In a further analysis, the tickets coming via
Fig. 9. Significant differences between categories 3, 4, 5, and 7 (with B) and
others (with R). (For interpretation of the references to color in this figure
legend, the reader is referred to the web version of this article.)
less interactive channels, i.e., email and IT ticketing system, are
checked. A notable observation is that the priority of the tickets
with a medium TD complexity is frequently changed.
‘‘Channel–Category’’ is the only pair (grouping attribute ‘‘Chan-
nel’’) in which strong significant correlations are observed (see
Table 11). Moreover, as depicted in Fig. 7(b), in the four ticket
categories (Category 3, 4, 5, and 7), the EL complexity of tickets
registered via IT ticketing system increases to high while their
TD complexities remain unchanged. Fig. 9 shows the differences
in handling the tickets in these four categories (blue color) in
comparison to the other categories. Notably, ‘‘Changing assigned
resolution teams’’ activity (T10) in the tickets is more common in
these categories. In other categories, ‘‘Work assignment’’ (T14) is
frequently handled in the same resolution teams.
In addition, as can be seen in Fig. 7(b), the TD complexity
of the tickets coming via email channel is low before Covid-
19 and increases to medium during Covid-19. However, there
is no recognizable change in their EL complexities. Therefore,
these tickets are further analyzed. We found that the median
duration for handling these tickets has doubled.
Fig. 10. Significant differences between duration longer than 10 days (with B) and up to 10 days (with R). (For interpretation of the references to color in this
figure legend, the reader is referred to the web version of this article.)
Fig. 10 shows the difference analysis based on Fig. 7(c), i.e.,
‘‘Channel–Duration’’. In particular, we investigate the reasons that
may account for the longer execution duration (i.e., more than
10 days) of the tickets with low TD complexity and received
via interactive channels (phone and desk-physical location). As
highlighted in blue (with B), frequent ‘‘Reopening’’ activity (T31)
is observed. Because of frequent reopening, other main activities
(for example, T14 and T8) are repeated and highlighted in blue
(with B) to indicate their recurrence. Notably, there might
be several reasons for a longer execution duration of tickets
with low TD complexity, and vice versa, which is a subject for further
investigation. For example, in some cases, end-users trigger the
reopening of a prior ticket by appending a follow-up request or
requesting minor adjustments.
Next, in Fig. 7(d), we focus on organizational units issuing the
requests. Organizational unit2 is the outlier unit in terms of the
TD and EL complexities. We analyze the executions of the tickets
of that unit and compare them with the tickets coming from the
rest of the units. In Fig. 11, the frequently performed activities
of handling the tickets coming from that unit are displayed in
blue (with B), for example, ‘‘Work assignment’’ (T14) and ‘‘Ask
end-user for extra information’’ (T13) activities. As can be seen,
‘‘Changing ticket status’’ (T8) happens more frequently. In ad-
dition, in the tickets coming from Organizational unit2, ‘‘Work
assignment’’ (T14) is directly followed by ‘‘Provide resolution’’
activity (T29) more often.
Considering the findings explained in the subsections above,
we discuss their implications in the following section.
6. Discussion
We discuss our findings, their implications, and the limitations
of our Approach in two subsections. Section 6.1 is devoted to
the interpretations of the case study results and derived relevant
observations. Section 6.2 highlights the benefits of the Approach
while mentioning its limitations.
6.1. Analyzing results and deriving observations
To obtain recommendations for BP redesign and improvement
in the proposed Approach, we address three important challenges
in IT ticket processing: ticket categorization, work assignment,
and prioritization. Textual data describing tickets are the basis for
identifying ticket categories that are then used to assign resolu-
tion teams to the tickets. Moreover, tickets are prioritized based
Fig. 11. Significant differences between organizational unit2 (with B) and others
(with R). (For interpretation of the references to color in this figure legend, the
reader is referred to the web version of this article.)
on textual data. In IT ticket processing, the priority of tickets has
a major influence on the way they are handled. Hence, we discuss
our case study findings by focusing on such challenges.
The case study results have shown that the TD and EL com-
plexities increase when the channel is less interactive, i.e., email
or IT ticketing system. Specifically, in these channels, tickets
are held on (‘‘Pause’’ and ‘‘Ask end-user for extra information’’
activities) frequently. As a result, constant work reassignments
between resolution teams are detected. Moreover, repetitions of
such activities show a negative influence on the ticket handling
duration, which directly affects Service Level Agreements (SLAs).
Considering these, one can infer that less verbal communication
when submitting the service request raises the complexity. On
the contrary, in interactive channels, textual descriptions of end-
user service requests are clear and comprehensive. The real-time
verbal communication nature of interactive channels enables the
requests to be interpreted and registered correctly. Additionally,
in these channels, operators can guide end-users to provide all
required information in a single conversation. Accurate ticket
categorization, prioritization, and work assignment can be bet-
ter accomplished using the mentioned advantages of interactive
channels.
The analysis using the ‘‘Channel–Category’’ combination shows
that resolution teams are changed frequently in some ticket
categories. Such changes often happen when textual data are not
comprehensive enough to identify the resolution teams correctly.
The involved experts in the case study interpret such frequent
change of resolution teams as a ‘‘ping-pong’’ behavior. In other
words, due to the lack of clarity in textual data, tickets are passing
from one resolution team to another until more information is
available to detect the correct team.
Next, we note that tickets coming from specific organizational
units were more complex. In particular, in one of 17 organiza-
tional units, namely Organizational unit2, medium TD and high
EL complexities are obtained. To unveil the reasons for high EL
complexity, we conducted the significant difference analysis. In
particular, the event log of the tickets requested by this orga-
nizational unit is compared with the event log of the tickets
sent by the remaining 16 organizational units. We found out
that ‘‘Work assignment’’ and ‘‘Ask end-user for extra information’’
activities frequently happen and cause high EL complexity in
handling tickets coming from Organizational unit2. Moreover,
to understand the reasons for TD complexity, we analyzed the
attributes of the tickets coming from this organizational unit.
Based on the ‘‘Category’’ distribution of the tickets, it is observed
that Organizational unit2 often requests services requiring the
involvement of multiple resolution teams. Moreover, technical
terms are commonly used in the textual data of the tickets.
The case study experts mention that end-users belonging to that
organizational unit have technical backgrounds and knowledge.
In this regard, we have discovered that the stylistic features of
these tickets differ from the tickets of other organizational units
in the sense of relative occurrences of unique PoS and wording
style. This observation indicates that, in this case, stylistic features,
which form part of the TD complexity, have an important influence on
the EL complexity, i.e., actual ticket processing.
In the context of the implications of the observations about
organizational units, the following points should be considered
by organizations when applying our Approach. The compositions
of particular departments or teams and regulations within them
are likely to have a notable influence on textual data. For example,
textual data in the requests of users with rights to install software
and change configurations on their computers will differ from the
textual data in the requests of users who have no such flexibility
in using a computer. Combining the textual data of such organizational
units and analyzing it separately from the textual data
of other grouped units may provide more specific indications of
what leads to complexity. Hence, tailored solutions for addressing
TD complexity in such organizational units may be more effective
than general solutions for the entire organization.
Another observation is that in some tickets coming via inter-
active channels, the TD complexity and ticket execution duration
are inversely related. Notably, the EL complexity of these
tickets remains unchanged. In such exceptional cases, further
analysis considering other ticket attributes and interactions be-
tween teams and end-users is required. For example, due to some
changes in the IT infrastructure or outsourced services, a ticket
with a low TD complexity may take more time to handle.
Aside from the aforementioned, the case study findings also
lead to the following further observations:
• DML taxonomy-based and stylistic features are adequate
for predicting TD complexity, as seen in the comparison
with LIWC features.
• Simple machine learning algorithms, such as tree-based
ones, are powerful and perform well when combined with
pseudo-labeling to deal with semi-supervised learning prob-
lems.
• Entity attributes (for example, channel, category) play an
important role in better understanding the relation between
TD and EL complexities by means of a drill-down analysis.
In addition, the DML taxonomy that is developed for the SRM
process is of practical value and reusable. Since in most organi-
zations operations are dependent on IT, a similar SRM process
exists as part of their ITSM strategy. Hence, it is reasonable that
IT ticket texts contain similar concepts in organizations. Accord-
ingly, the developed DML taxonomy can be adjusted and used for
calculating TD complexity.
Moreover, the results obtained in the case study serve as a
proof of concept for using linguistic features in TD complexity
prediction. In other words, the potential of linguistic features as
textual data representation discovered in our previous work [13]
has been confirmed in a new real-life setting. Notably, DML
taxonomy-based and stylistic features are favorable to reveal such
potential in analyzing TD complexity.
Despite the fact that domain expert involvement is neces-
sary for developing DML taxonomy-based features, in comparison
with LIWC features, we showed that such an effort resulted in
a better TD complexity classification performance. More specif-
ically, the average overall increase in the performance is 7.5%,
whereas it goes up to 9.5% when only the top 10 best-performing
prediction models are considered (see Table 10). Furthermore, the
highest increase in performance, 29%, is observed for Stochastic
Gradient Descent with Pseudo-Labeling. Conversely, 1.5% is the
highest performance loss, observed in the stacking of Naïve Bayes and SVM
with Pseudo-Labeling. Though it is difficult to measure precisely,
based on the difference in the classification performance, we
can infer that such a performance gain outweighs the cost of
DML development. Hence, organizations that want to adopt our
Approach can take the comparison as an initial baseline for a
trade-off between costs and performance gain.
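The reported percentages can be read as relative gains over the LIWC baseline. As a worked illustration under that assumption, the following sketch recomputes the gain for the ten best-performing models from the values listed in Table 10; the formula (ours − LIWC) / LIWC is an assumption, although it yields an average close to the 9.5% stated above.

```python
# (weighted F-score with DML taxonomy-based & stylistic features, with LIWC
# features) for the first ten rows of Table 10.
TOP_10 = [
    (0.943, 0.865),  # Bagging, Decision Trees, PL
    (0.940, 0.852),  # Random forest, PL
    (0.938, 0.881),  # Extra trees, PL
    (0.931, 0.872),  # Boosting, Random forest, PL
    (0.922, 0.832),  # Boosting, Gradient boosting, PL
    (0.919, 0.881),  # Logistic regression, PL
    (0.914, 0.877),  # K-Nearest neighbors, PL
    (0.909, 0.850),  # Decision trees, PL
    (0.901, 0.839),  # Boosting, AdaBoost, PL
    (0.900, 0.697),  # Stochastic gradient descent, PL
]


def relative_gain(ours: float, liwc: float) -> float:
    """Relative improvement over the LIWC score, in percent (assumed formula)."""
    return (ours - liwc) / liwc * 100


gains = [relative_gain(ours, liwc) for ours, liwc in TOP_10]
mean_gain = sum(gains) / len(gains)
print(f"mean gain: {mean_gain:.1f}%, largest gain: {max(gains):.1f}%")
```

An organization weighing the cost of taxonomy development could apply the same calculation to its own model comparison.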
6.2. Outlining benefits and limitations
Considering the results discussed above and correlations pre-
sented in Section 5, we conclude that TD complexity has con-
nections to EL complexity. We show the benefits of our Approach
in detecting and analyzing the relation between these two com-
plexities in a real-world setting. Specifically, our observations
clearly demonstrate that increasing TD complexity could account
for longer BP execution time, less accurate ticket categorization
and prioritization, and frequent work reassignments triggered by
hold-ons and re-opens.
Overall, the connection between TD complexity and EL com-
plexity indicates that textual data are appropriate and can be
used for predicting EL complexity. Organizations operating in
various domains often rely on textual data while performing their
BPs since textual data are generally the primary input to their
BPs. Such organizations can highly benefit from our Approach.
For example, banks, governmental bodies, universities, infras-
tructure or supply providers have established BPs for handling
various requests coming from their customers in a textual form.
Complexity in these texts can considerably influence the execu-
tion of their BPs. By studying this complexity, organizations can
predict and, therefore, mitigate complexity in performing BPs.
Furthermore, using our Approach, organizations can have a more
comprehensive way of identifying process redesign and improve-
ment opportunities. Such opportunities can be formulated based
on the activities in BPs that affect EL complexity. These activities
can be detected in the significant difference analysis phase of the
Approach.
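The significant difference analysis mentioned above can be
illustrated with a minimal sketch. It assumes that a per-case EL
measure (here, a hypothetical reassignment count) is compared
between two TD complexity groups using Welch's unequal-variances
t-test [86]; the function name and the sample values are
illustrative and not taken from the case study.

```python
import math
import statistics
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    if se2_a + se2_b == 0:
        raise ValueError("both groups have zero variance")
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical reassignment counts per case (not data from the paper).
high_td_complexity = [2.0, 4.0, 6.0, 8.0]
low_td_complexity = [1.0, 2.0, 3.0]

t_stat, dof = welch_t(high_td_complexity, low_td_complexity)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

To decide on significance, the statistic would be compared against
a t distribution with the computed degrees of freedom; a statistics
library can provide the corresponding p-value.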
It is important to note that our Approach reveals great potential
for understanding the implications of each specific linguistic
feature for textual complexity. As illustrated in the running example
(see Table 5), the contribution of each feature to complexity
can be identified separately by tracing back from the aggregated
single TD complexity value. Having such information can help
organizations in several ways. In real-time, text provided by users
to BPs can be annotated in terms of complexity contributions.
Hence, based on the indications of those text parts leading to
complexity, users can improve text quality. Another way could
be assisting process workers in rephrasing text for a more accu-
rate interpretation. For example, depending on the text quality,
process workers may apply triage on requests or add an extra
explanation in the text to better identify what activities are
required in handling requests. Providing support for reducing the
use of words or phrases resulting in higher TD complexity and
further EL complexity could also be beneficial for organizations.
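A minimal sketch of such a trace-back is given here. It assumes,
purely for illustration, that the aggregated TD complexity value
is a weighted sum of normalized feature values; the feature names,
weights, and values are hypothetical and do not reproduce the
aggregation used in the Approach.

```python
from typing import Dict, List, Tuple


def feature_contributions(
    features: Dict[str, float], weights: Dict[str, float]
) -> Tuple[float, List[Tuple[str, float]]]:
    """Return the aggregated score and each feature's share of it, largest first.

    Assumption: the aggregate is a plain weighted sum of the feature values.
    """
    parts = {name: weights[name] * value for name, value in features.items()}
    total = sum(parts.values())
    if total == 0:
        shares = [(name, 0.0) for name in parts]
    else:
        shares = [(name, part / total) for name, part in parts.items()]
    shares.sort(key=lambda item: item[1], reverse=True)
    return total, shares


# Hypothetical normalized feature values for one request text.
example_features = {"cognitive_words": 0.8, "text_length": 0.5, "wording_style": 0.2}
# Hypothetical feature weights that sum to 1.
example_weights = {"cognitive_words": 0.5, "text_length": 0.25, "wording_style": 0.25}

score, ranked = feature_contributions(example_features, example_weights)
for name, share in ranked:
    print(f"{name}: {share:.0%} of aggregated TD complexity {score:.3f}")
```

Sorting the shares makes the dominant feature explicit, which is
the kind of information that could drive annotations for users or
process workers.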
Further, the Approach sets itself apart from state-of-the-art
approaches to BP complexity analysis by (i) combining textual
data and event log, (ii) blending readily available techniques
in calculating TD and EL complexities, and (iii) analyzing the
relation between them. Our Approach is the first effort to ana-
lyze the connection between TD and EL complexities. For doing
this, publicly accessible and frequently used techniques, such as
LDA, LSI, common machine learning algorithms, and existing EL
complexity metrics, are employed.
Although the case study of our previous research serving as the
basis for TD complexity as well as the case study of this paper
belong to the ITSM domain, i.e., IT ticket processing, the
research value goes far beyond the latter. The Approach can be
used by organizations relying on textual data as an input to
their processes and already executing or interested in initiat-
ing BPM projects, for example, healthcare or public administra-
tion institutions receiving customer requests in a textual form.
The domain-specific adaptations of the Approach may require
additional efforts of different degrees. Accordingly, among the
four inputs of the Approach, which are textual data, event log,
complexity scale, and DML taxonomy, the latter requires the
most manual effort. Hereby, the availability of experts and their
willingness to dedicate time and resources as well as top man-
agement support can significantly influence the process of DML
taxonomy creation. At the same time, DML taxonomy captures
the essential semantics enabling context awareness and domain
adaptation in the Approach.
As also follows from the title, the focus of the paper lies on the
BP execution data. However, if we consider the BPM lifecycle [1],
the role of textual data goes far beyond BP execution. It can be
employed throughout all phases of the lifecycle. For example,
in the discovery phase, process analysts might use available BP
descriptions, textual data from interviews, legal documents, or
ethnography [91–93]. In other BPM lifecycle phases, such as
process analysis, process redesign, and process monitoring and
controlling, any BP changes must comply with legal requirements,
corporate standards, business rules, and service operating proce-
dures which usually exist in textual form. All these documents
have a more official character than the textual data produced by
BP participants during BP execution (such as conversations) and
therefore exhibit a different style and, hence, a different
textual complexity. Such textual information is not considered in
the present paper. However, this limitation represents a
promising direction for future research.
Accordingly, our Approach can be extended with the analyses
of further textual data sources to develop support for process
analysts at various phases of the BPM lifecycle.
Our Approach has several further limitations. First, the relevant
entity attributes for creating subsets for the drill-down
analysis are identified manually. However, entity attributes
referred to in the BPs can be identified and incorporated into
the Approach, so that the attributes relevant for particular BPs
can be selected automatically. Second, the Approach only supports
texts written in English. With the advancement of NLP libraries,
more languages can be considered in the Approach.
7. Conclusion and future work
In this paper, we presented a novel Approach aimed at BP
execution complexity analysis by combining textual data and
event log. In particular, in the Approach, the relation between
TD and EL complexities is analyzed. To calculate TD complexity,
we use two sets of linguistic features aimed at capturing TD
complexity in terms of cognition and style: DML taxonomy-based
and stylistic features. For calculating EL complexity, the state-of-
the-art event log-based complexity metrics are employed. Then,
the correlations between these two complexities are measured to
check how TD and EL complexities are associated. Based on the
computed correlations and analysis of the significant differences,
the BP activities affecting EL complexity are determined. Hence,
the factors that may account for an increase or decrease in EL
complexity are identified.
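The correlation step can be sketched as follows. This section does
not restate which correlation coefficient the Approach uses, so
Pearson's r is an assumption here, as are the ordinal encoding of
TD complexity (1 = low, 2 = medium, 3 = high) and the sample
values.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-request values (not data from the paper).
td_complexity = [1, 1, 2, 2, 3, 3]              # ordinal TD complexity
el_complexity = [2.0, 3.0, 4.0, 5.0, 7.0, 9.0]  # e.g., events per case

r = pearson_r(td_complexity, el_complexity)
print(f"r = {r:.3f}")
```

For ordinal complexity levels, a rank-based coefficient may be the
more appropriate choice; the structure of the computation stays
the same.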
The proposed Approach was applied in the IT department of an
academic institution. Specifically, using the textual data
(service request descriptions) and an event log from an SRM
process, TD and EL complexities were calculated, and their
relation was analyzed. Further, the advantages of DML
taxonomy-based and stylistic features in determining TD
complexity were assessed by comparing them against well-accepted
linguistic features, LIWC, in predicting TD complexity. Our findings showed that
TD and EL complexities highly correlate in most cases. Thus,
following our Approach, textual data can be used to predict the
complexity of BP execution.
The results presented in this paper are, to the best of our
knowledge, the first investigation of the connection between TD
and EL complexities related to BP execution. The existence of such
a connection implies that organizations can benefit from studying
the complexity expressed by means of textual data. For instance,
organizations that rely on textual data in performing their BPs can
approximate the BP execution complexity. Thus, organizations
can develop strategies and make prior decisions to deal with the
complexity of BP execution. Furthermore, the Approach enables
organizations to identify the factors that affect the complexity in
BP execution. By interpreting these factors, process redesign and
improvement directions can be determined.
In future work, we will expand our scope with other ITSM pro-
cesses and incorporate conversations captured in BP executions
into textual data analysis. Moreover, we would like to include
decision mining technologies [94] in our Approach since the BP
decision points are very likely to be associated with BP vari-
ants. Apart from that, we aim to develop new linguistic features
using sentences and their dependencies to further improve TD
complexity prediction. Combining our linguistic features with
LIWC features for achieving a better classification performance is
another direction we want to pursue. Regarding EL complexity,
we want to move one step further, focus on discovered process
models, and hence, incorporate process complexity metrics into
our Approach. In addition, investigating the potential of text sum-
marization for obtaining complexity from textual data is another
future work avenue. We will also consider the complexity analy-
sis of other textual data, such as BP descriptions, legal documents,
corporate standards, and interview transcriptions, to assist the
process analysts at all the stages of the BPM lifecycle. Lastly, we
will experiment with other approaches to process complexity,
for example, considering process context [73,74], to enhance our
Approach.
Table A.12
Decision-making logic taxonomy for service request management process.
Entries are grouped by taxonomy element and, within each element,
by decision-making cognition level (Routine, Semi-cognitive,
Cognitive).

Resources (Nouns)
  Routine: account, activation, address, admin, administrator,
    admission, agenda, appointment, assessment, authentication,
    authenticator, authorization, booking, capacity, certificate,
    code, contract, credit, demo, download, guest, host, intranet,
    licence, license, link, mail, mailaddress, mobile, password,
    permission, phone, portal, printer, questionnaire, reference,
    registration, reset, staff, storage, twofactorauthentication,
    update, url, version
  Semi-cognitive: acceptance, app, application, archive, backup,
    balance, bank, battery, cloud, configuration, document, domain,
    keychain, mailbox, migration, network, security, software,
    toner, upgrade, video, vpn
  Cognitive: analysis, database, disk, distribution, drive, driver,
    file, firewall, folder, group, owner, ownergroup, recovery,
    workstation, server

Techniques (Verbs)
  Routine: complete, connect, create, download, exist, expand,
    expire, extend, find, grant, increase, inform, install, login,
    logon, open, print, reset, restart, run, save, send, share,
    start, stop, store, write, update
  Semi-cognitive: access, activate, add, approve, assign,
    associate, deactivate, decide, delete, disable, edit, link,
    make, reactivate, recover, remove, replace, set, unlink
  Cognitive: analyse, analyze, change, define, design, lose,
    migrate, modify

Capacities (Adjectives)
  Routine: active, additional, available, correct, easier, exact,
    extra, free, full, higher, important, larger, last, latest,
    least, new, newest, next, older, online, optional,
    organizational, original, outdated, private, public,
    qualitative, ready, relevant, responsible, several, urgent,
    valid, visible, wide
  Semi-cognitive: empty, external, incorrect, international, local,
    long, multiple, remote, safe, unable, wrong, unknown
  Cognitive: broken, different, compatible, offline, temporary,
    stuck, spatial

Choices (Adverbs)
  Routine: almost, apparently, beforehand, completely, correctly,
    directly, easily, efficiently, either, else, ever, exactly,
    far, fully, last, mainly, much, next, otherwise, properly,
    quite, rather, since, soon, successfully, together, totally,
    urgently
  Semi-cognitive: accidentally, already, also, always, anymore,
    anywhere, automatically, constantly, everytime, frequently,
    indeed, initially, instead, manually, maybe, moreover, mostly,
    never, often, perhaps, previously, probably, regularly,
    sometimes, somewhere, suddenly, temporarily, though, twice,
    usually, wrongfully
  Cognitive: hence, however, locally, remotely, somehow, still,
    therefore, thus, yet
Declaration of competing interest
The authors declare that they have no known competing finan-
cial interests or personal relationships that could have appeared
to influence the work reported in this paper.
Data availability
The authors do not have permission to share data.
Appendix A. DML taxonomy
See Table A.12.
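To illustrate how a taxonomy such as Table A.12 could be applied
to a request text, the following minimal sketch counts taxonomy
words per cognition level. It uses only a small excerpt of the
table, ignores parts of speech, and performs no lemmatization; the
dictionary layout, function name, and example sentence are
assumptions made for illustration and do not reproduce the
authors' implementation.

```python
import re
from typing import Dict

# Small excerpt of Table A.12; the full taxonomy is considerably larger.
DML_EXCERPT = {
    "routine": {"password", "reset", "install", "printer", "new"},
    "semi-cognitive": {"vpn", "backup", "delete", "wrong", "sometimes"},
    "cognitive": {"server", "firewall", "migrate", "broken", "somehow"},
}


def count_levels(text: str) -> Dict[str, int]:
    """Count how many tokens of the text fall into each cognition level."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = {level: 0 for level in DML_EXCERPT}
    for token in tokens:
        for level, words in DML_EXCERPT.items():
            if token in words:
                counts[level] += 1
    return counts


# Hypothetical service request text.
request = (
    "Please reset my password, the VPN is somehow broken "
    "after we migrate the server."
)
level_counts = count_levels(request)
print(level_counts)
```

Some words occur under more than one taxonomy element in the table
(for example, "link" as a routine noun and a semi-cognitive verb),
so a complete implementation would need part-of-speech information
to resolve the level.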
References
[1] M. Dumas, M. La Rosa, J. Mendling, H.A. Reijers, Fundamentals of Business
Process Management, second ed., Springer, 2018.
[2] C. Matt, T. Hess, A. Benlian, Digital transformation strategies, Bus. Inf. Syst.
Eng. 57 (5) (2015) 339–343.
[3] M. Fischer, F. Imgrund, C. Janiesch, A. Winkelmann, Strategy archetypes
for digital transformation: Defining meta objectives using business process
management, Inf. Manage. 57 (5) (2020) 103262.
[4] A. Augusto, J. Mendling, M. Vidgof, B. Wurm, The connection between
process complexity of event sequences and models discovered by process
mining, Inform. Sci. 598 (2022) 196–215.
[5] H. van der Aa, J. Carmona, H. Leopold, J. Mendling, L. Padró, Challenges
and opportunities of applying natural language processing in business
process management, in: Proceedings of the 27th International Conference
on Computational Linguistics, COLING 2018, Santa Fe, New Mexico, USA,
August 20-26, 2018, Association for Computational Linguistics, 2018, pp.
2791–2801.
[6] V. Kobayashi, S.T. Mol, H. Berkers, G. Kismihók, D.N. den Hartog, Text
mining in organizational research, Organ. Res. Methods 21 (2018) 733–765.
[7] M. Schäfermeyer, C. Rosenkranz, R. Holten, The impact of business process
complexity on business process standardization, Bus. Inf. Syst. Eng. 4 (5)
(2012) 261–270.
[8] B.T. Pentland, P. Liu, W. Kremser, T. Hærem, The dynamics of drift in
digitized processes, MIS Q. 44 (2020) 19–47.
[9] B. Münstermann, A. Eckhardt, T. Weitzel, The performance impact of busi-
ness process standardization: An empirical evaluation of the recruitment
process, Bus. Process Manage. J. (2010).
[10] A. Gunasekaran, B. Nath, The role of information technology in business
process reengineering, Int. J. Prod. Econ. 50 (2–3) (1997) 91–104.
[11] N. Rizun, A. Revina, V.G. Meister, Method of decision-making logic discov-
ery in the business process textual data, in: International Conference on
Business Information Systems, Springer, 2019, pp. 70–84.
[12] N. Rizun, V.G. Meister, A. Revina, Discovery of stylistic patterns in business
process textual descriptions: IT ticket case, Innov. Manage. Educ. Excell.
Through Vis. (2020).
[13] N. Rizun, A. Revina, V.G. Meister, Assessing business process complexity
based on textual data: Evidence from ITIL IT ticket processing, Bus. Process
Manage. J. (2021).
[14] A. Revina, K. Buza, V.G. Meister, IT ticket classification: The simpler, the
better, IEEE Access 8 (2020) 193380–193395.
[15] N. Rizun, A. Revina, Business sentiment analysis. Concept and method for
perceived anticipated effort identification, in: Proceedings of the 28th In-
ternational Conference on Information Systems Development: Information
Systems Beyond 2020 (ISD 2019), Toulon, France, August 28-30, 2019,
2019.
[16] P. Bellan, M. Dragoni, C. Ghidini, Process extraction from text: state of the
art and challenges for the future, 2021, arXiv preprint 2110.03754.
[17] A. Revina, Business process management: Integrated data perspective. A
framework and research agenda, in: Proceedings of the 29th International
Conference on Information Systems Development: Crossing Boundaries
Between Development and Operations (DevOps) in Information Systems
(ISD2021), Valencia, Spain, September 8-10, 2021, 2021.
[18] K. Honkisz, K. Kluza, P. Wiśniewski, A concept for generating business
process models from natural language description, in: International Con-
ference on Knowledge Science, Engineering and Management, Springer,
2018, pp. 91–103.
[19] X. Han, L. Hu, L. Mei, Y. Dang, S. Agarwal, X. Zhou, P. Hu, A-BPS: Automatic
business process discovery service using ordered neurons LSTM, in: 2020
IEEE International Conference on Web Services, ICWS, IEEE, 2020, pp.
428–432.
[20] H. van der Aa, C. Di Ciccio, H. Leopold, H.A. Reijers, Extracting declarative
process models from natural language, in: International Conference on
Advanced Information Systems Engineering (CAiSE 2019), Springer, 2019,
pp. 365–382.
[21] A.J. Chambers, A.M. Stringfellow, B.B. Luo, S.J. Underwood, T.G. Allard,
I.A. Johnston, S. Brockman, L. Shing, A. Wollaber, C. VanDam, Automated
business process discovery from unstructured natural-language documents,
in: International Conference on Business Process Management (BPM 2020),
Springer, 2020, pp. 232–243.
[22] C. Kecht, A. Eggert, W. Kratsch, M. Röglinger, Event log construction from
customer service conversations using natural language inference, in: 3rd
International Conference on Process Mining (ICPM 2021), IEEE, 2021, pp.
144–151.
[23] C. Qian, L. Wen, A. Kumar, L. Lin, L. Lin, Z. Zong, S. Li, J. Wang, An
approach for process model extraction by multi-grained text classification,
in: International Conference on Advanced Information Systems Engineering
(CAiSE 2020), Springer, 2020, pp. 268–282.
[24] H.A. López, S. Debois, T.T. Hildebrandt, M. Marquard, The process
highlighter: From texts to declarative processes and back, in: In-
ternational Conference on Business Process Management (BPM 2018,
Dissertation/Demos/Industry), Vol. 2196, 2018, pp. 66–70.
[25] W.M.P. van der Aalst, Process Mining: Data Science in Action, Springer,
2016.
[26] M. Gupta, P. Agarwal, T. Tater, S. Dechu, A. Serebrenik, Analyzing com-
ments in ticket resolution to capture underlying process interactions, in:
International Conference on Business Process Management (BPM 2020),
Springer, 2020, pp. 219–231.
[27] A. Rebmann, H. van der Aa, Extracting semantic process information
from the natural language in event logs, in: International Conference on
Advanced Information Systems Engineering (CAiSE 2021), Springer, 2021,
pp. 57–74.
[28] A. Goossens, M. Claessens, C. Parthoens, J. Vanthienen, Extracting decision
dependencies and decision logic from text using deep learning techniques,
in: International Conference on Business Process Management (BPM 2021),
Springer, 2021, pp. 349–361.
[29] V. Etikala, Z.V. Veldhoven, J. Vanthienen, Text2Dec: extracting decision
dependencies from natural language text for automated DMN decision
modelling, in: International Conference on Business Process Management
(BPM 2020), Springer, 2020, pp. 367–379.
[30] L. Quishpi, J. Carmona, L. Padró, Extracting decision models from textual
descriptions of processes, in: International Conference on Business Process
Management (BPM 2021), Springer, 2021, pp. 85–102.
[31] L. Quishpi, J. Carmona, L. Padró, Extracting annotations from textual
descriptions of processes, in: International Conference on Business Process
Management, Springer, 2020, pp. 184–201.
[32] L. Ackermann, J. Neuberger, S. Jablonski, Data-driven annotation of textual
process descriptions based on formal meaning representations, in: Inter-
national Conference on Advanced Information Systems Engineering (CAiSE
2021), Springer, 2021, pp. 75–90.
[33] H. Wang, L. Wen, L. Lin, J. Wang, Rlrecommender: a representation-
learning-based recommendation method for business process modeling, in:
International Conference on Service-Oriented Computing, Springer, 2018,
pp. 478–486.
[34] S. Deng, D. Wang, Y. Li, B. Cao, J. Yin, Z. Wu, M. Zhou, A recommendation
system to facilitate business process modeling, IEEE Trans. Cybern. 47 (6)
(2016) 1380–1394.
[35] H. van der Aa, A. Rebmann, H. Leopold, Natural language-based detection of
semantic execution anomalies in event logs, Inf. Syst. 102 (2021) 101824.
[36] D. Sola, H. van der Aa, C. Meilicke, H. Stuckenschmidt, Exploiting label
semantics for rule-based activity recommendation in business process
modeling, Inf. Syst. (2022) 102049.
[37] M. Goldstein, C. González-Álvarez, Augmenting modelers with semantic
autocompletion of processes, in: International Conference on Business
Process Management (BPM 2021), Springer, 2021, pp. 20–36.
[38] H. Leopold, H. van der Aa, J. Offenberg, H.A. Reijers, Using hidden Markov
models for the accurate linguistic analysis of process model activity labels,
Inf. Syst. 83 (2019) 30–39.
[39] H. van der Aa, H. Leopold, H.A. Reijers, Checking process compliance
against natural language specifications using behavioral spaces, Inf. Syst.
78 (2018) 83–95.
[40] H. van der Aa, H. Leopold, H.A. Reijers, Comparing textual descriptions to
process models–the automatic detection of inconsistencies, Inf. Syst. 64
(2017) 447–460.
[41] J. Sànchez-Ferreres, H. van der Aa, J. Carmona, L. Padró, Aligning textual
and model-based process descriptions, Data Knowl. Eng. 118 (2018) 25–40.
[42] M. Kobeissi, N. Assy, W. Gaaloul, B. Defude, B. Haidar, An intent-based
natural language interface for querying process execution data, in: 3rd
International Conference on Process Mining (ICPM 2021), IEEE, 2021, pp.
152–159.
[43] H. Leopold, H. van der Aa, F. Pittke, M. Raffel, J. Mendling, H.A. Reijers,
Searching textual and model-based process descriptions based on a unified
data format, Softw. Syst. Model. 18 (2) (2019) 1179–1194.
[44] A. Yadav, D.K. Vishwakarma, Sentiment analysis using deep learning
architectures: a review, Artif. Intell. Rev. 53 (6) (2020) 4335–4385.
[45] E. Lüftenegger, S. Softic, Sentipromo: a sentiment analysis-enabled social
business process modeling tool, in: International Conference on Business
Process Management (BPM 2020), Springer, 2020, pp. 83–89.
[46] A. Mustansir, K. Shahzad, M.K. Malik, Towards automatic business process
redesign: an NLP based approach to extract redesign suggestions, Autom.
Softw. Eng. 29 (1) (2022) 1–24.
[47] T. Niesen, S. Dadashnia, P. Fettke, P. Loos, A vector space approach to
process model matching using insights from natural language processing,
in: Multikonferenz Wirtschaftsinformatik (MKWI 2016), Universitätsverlag
Ilmenau, 2016, pp. 93–104.
[48] N. Wang, S. Sun, D. OuYang, Business process modeling abstraction based
on semi-supervised clustering analysis, Bus. Inf. Syst. Eng. 60 (6) (2018)
525–542.
[49] F. Dai, M. Liu, Q. Mo, B. Huang, T. Li, Refactor business process models for
efficiency improvement, in: Cloud Computing, Smart Grid and Innovative
Frontiers in Telecommunications, Springer, 2019, pp. 454–467.
[50] F. Pittke, Linguistic Refactoring of Business Process Models. Dissertation,
Tech. Rep., Vienna University of Economics and Business, 2015.
[51] J. Mendling, H. Leopold, A. Polyvyanyy, Supporting process model val-
idation through natural language generation, Softw. Eng. P-252 (2016)
71–72.
[52] B. Aysolmaz, H. Leopold, H.A. Reijers, O. Demirörs, A semi-automated
approach for generating natural language requirements documents based
on business process models, Inf. Softw. Technol. 93 (2018) 14–29.
[53] L. Ackermann, Language-centric approaches for improving business pro-
cess model acceptance., in: International Conference on Business Process
Management (BPM 2018, Dissertation/Demos/Industry), 2018, pp. 51–55.
[54] L. Ackermann, Sprachzentrierte Ansätze zur Steigerung der Akzeptanz von
Geschäftsprozessmodellen. Dissertation, Tech. Rep., Universitaet Bayreuth
(Germany), 2018.
[55] K. Shahzad, S. Zaheer, R.M. Adeel Nawab, F. Aslam, On comparing manual
and automatic generated textual descriptions of business process models,
J. Softw.: Evol. Process 31 (11) (2019) e2204.
[56] Q. Zeng, X. Tang, W. Ni, H. Duan, C. Li, N. Xie, Missing procedural texts
repairing based on process model and activity description templates, IEEE
Access 8 (2020) 12999–13010.
[57] G. Yuan, Q. Zeng, H. Duan, W. Guo, W. Ni, N. Xie, Multi-language descrip-
tion text automatic generation of emergency disposal process, in: 2018
IEEE International Conference of Safety Produce Informatization, IICSPI,
IEEE, 2018, pp. 31–35.
[58] G. Yuan, Q. Zeng, W. Ni, C. Liu, C. Li, H. Duan, Multi-view and multi-
language description generation for cross-department medical diagnosis
processes, IEEE Access 6 (2018) 76741–76753.
[59] J. Cardoso, J. Mendling, G. Neumann, H.A. Reijers, A discourse on complex-
ity of process models, in: International Conference on Business Process
Management, Springer, 2006, pp. 117–128.
[60] V. Gruhn, R. Laue, Approaches for business process model complexity
metrics, in: Technologies for Business Information Systems, Springer, 2007,
pp. 13–24.
[61] R.D. Boomsma, I. Vanderfeesten, D. Fahland, H.A. Reijers, S. Cramer, An
evaluation of thresholds for business process model metrics, 2017.
[62] M. La Rosa, P. Wohed, J. Mendling, A.H.M. ter Hofstede, H.A. Reijers, W.M.P.
van der Aalst, Managing process model complexity via abstract syntax
modifications, IEEE Trans. Ind. Inform. 7 (4) (2011) 614–629.
[63] M. Benner-Wickner, M. Book, T. Brückmann, V. Gruhn, Examining case
management demand using event log complexity metrics, in: 18th
IEEE International Enterprise Distributed Object Computing Conference
Workshops and Demonstrations, EDOC Workshops 2014, Ulm, Germany,
September 1-2, 2014, IEEE Computer Society, 2014, pp. 108–115.
[64] J. De Weerdt, M. De Backer, J. Vanthienen, B. Baesens, A multi-dimensional
quality assessment of state-of-the-art process discovery algorithms using
real-life event logs, Inf. Syst. 37 (7) (2012) 654–676.
[65] P.H.P. Richetti, F.A. Baião, F.M. Santoro, Declarative process mining: Re-
ducing discovered models complexity by pre-processing event logs, in:
Business Process Management - 12th International Conference, BPM 2014,
Haifa, Israel, September 7-11, 2014. Proceedings, Vol. 8659, Springer, 2014,
pp. 400–407.
[66] G. Polančič, B. Cegnar, Complexity metrics for process models–A systematic
literature review, Comput. Stand. Interfaces 51 (2017) 104–117.
[67] J. Mendling, H.A. Reijers, W.M.P. van der Aalst, Seven process modeling
guidelines (7PMG), Inf. Softw. Technol. 52 (2) (2010) 127–136.
[68] D.M. Blei, Probabilistic topic models, Commun. ACM 55 (4) (2012) 77–84.
[69] G.K. Zipf, Human Behaviour and the Principle of Least Effort: An
Introduction to Human Ecology, Addison-Wesley, 1949.
[70] W.M.P. van der Aalst, Process Mining - Data Science in Action, second ed.,
Springer, 2016.
[71] T. Hærem, B.T. Pentland, K.D. Miller, Task complexity: Extending a core
concept, Acad. Manag. Rev. 40 (3) (2015) 446–460.
[72] B.T. Pentland, Conceptualizing and measuring variety in the execution of
organizational work processes, Manage. Sci. 49 (7) (2003) 857–870.
[73] J. vom Brocke, M.-S. Baier, T. Schmiedel, K. Stelzl, M. Röglinger, C. Wehking,
Context-aware business process management, Bus. Inf. Syst. Eng. 63 (5)
(2021) 533–550.
[74] M. Rosemann, J. Recker, C. Flender, Contextualisation of business processes,
Int. J. Bus. Process Integr. Manage. 3 (1) (2008) 47–60.
[75] T.L. Saaty, The analytic hierarchy and analytic network processes for the
measurement of intangible criteria and for decision-making, in: Multiple
Criteria Decision Analysis, Springer, 2016, pp. 363–419.
[76] A. Revina, Ü. Aksu, V.G. Meister, Method to address complexity in organi-
zations based on a comprehensive overview, Information 12 (10) (2021)
423.
[77] N. Rizun, Y. Taranenko, Simulation models of human decision-making
processes, Manage. Dyn. Knowl. Econ. 2 (2) (2014) 241–264.
[78] W. Daelemans, Explanation in computational stylometry, in: International
Conference on Intelligent Text Processing and Computational Linguistics,
Springer, 2013, pp. 451–462.
[79] G.K. Zipf, Selected studies of the principle of relative frequency in language,
1932.
[80] X. Zhu, A.B. Goldberg, Introduction to semi-supervised learning, Synth. Lect.
Artif. Intell. Mach. Learn. 3 (1) (2009) 1–130.
[81] Z.-H. Zhou, Ensemble Methods: Foundations and Algorithms, Chapman and
Hall/CRC, 2012.
[82] A. Revina, K. Buza, V.G. Meister, Designing explainable text classification
pipelines: Insights from IT ticket complexity prediction case study, in:
Interpretable Artificial Intelligence: A Perspective of Granular Computing,
Vol. 937, Springer, Cham, 2021, pp. 293–332.
[83] F.J. Gravetter, L.B. Wallnau, Essentials of Statistics for the Behavioral
Sciences, tenth ed., Cengage Learning, 2017.
[84] J. Cohen, Statistical Power Analysis for the Behavioral Sciences, second ed.,
Routledge, 1988.
[85] A. Bolt, M. de Leoni, W.M.P. van der Aalst, Process variant comparison:
Using event logs to detect differences in behavior and business rules, Inf.
Syst. 74 (2018) 53–66.
[86] B.L. Welch, The generalization of ‘Student’s’ problem when several different
population variances are involved, Biometrika 34 (1–2) (1947) 28–35.
[87] H.B. Mann, D.R. Whitney, On a test of whether one of two random variables
is stochastically larger than the other, Ann. Math. Stat. 18 (1) (1947) 50–60.
[88] W.M.P. van der Aalst, B.F. van Dongen, C.W. Günther, A. Rozinat, H.M.W.
Verbeek, T. Weijters, Prom: The process mining toolkit, in: Proceedings of
the Business Process Management Demonstration Track (BPMDemos 2009),
2009.
[89] Axelos, ITIL Foundation, Stationery Office Books, 2019, URL https://www.
axelos.com/best-practice-solutions/itil.
[90] R.L. Boyd, A. Ashokkumar, S. Seraj, J.W. Pennebaker, The Development and
Psychometric Properties of LIWC-22, University of Texas, Austin, TX, 2022,
URL https://www.liwc.app/help/psychometrics-manuals.
[91] K. Winter, S. Rinderle-Ma, Detecting constraints and their relations from
regulatory documents using nlp techniques, in: OTM Confederated In-
ternational Conferences ‘‘on the Move to Meaningful Internet Systems’’,
Springer, 2018, pp. 261–278.
[92] H.A. López, Challenges in legal process discovery, in: ITBPM@BPM 2021,
CEUR, 2021, pp. 68–73.
[93] J.C. de AR Goncalves, F.M. Santoro, F.A. Baião, Business process mining
from group stories, in: 2009 13th International Conference on Computer
Supported Cooperative Work in Design, IEEE, 2009, pp. 161–166.
[94] E. Bazhenova, F. Zerbato, B. Oliboni, M. Weske, From BPMN process models
to DMN decision models, Inf. Syst. 83 (2019) 69–88.