Assessing business process complexity based on textual data: Evidence from ITIL IT ticket processing [original]

Nina Rizun, Aleksandra Revina, Vera G. Meister

Assessing business process complexity based on

textual data: Evidence from ITIL IT ticket

processing

Open Access via institutional repository of Technische Universität Berlin

Document type

Journal article | Accepted version

(i. e. final author-created version that incorporates referee comments and is the version accepted for

publication; also known as: Author’s Accepted Manuscript (AAM), Final Draft, Postprint)

This version is available at

https://doi.org/10.14279/depositonce-17154

Citation details

Rizun, N., Revina, A., & Meister, V. G. (2021). Assessing business process complexity based on textual data:

Evidence from ITIL IT ticket processing. In Business Process Management Journal (Vol. 27, Issue 7, pp.

1966–1998). Emerald. https://doi.org/10.1108/bpmj-04-2021-0217.

cbe

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license:

https://creativecommons.org/licenses/by-nc/4.0/

Assessing Business Process Complexity Based on Textual Data: Evidence

from ITIL IT Ticket Processing

Abstract

Purpose

This study aims to draw the attention of business process management (BPM) research and practice to the textual data

generated in the processes and the potential of meaningful insights extraction. We apply standard Natural Language

Processing (NLP) approaches to gain valuable knowledge in the form of business process (BP) complexity concept

suggested in the study. It is built on the objective, subjective, and meta-knowledge extracted from the BP textual data

and encompassing semantics, syntax, and stylistics. As a result, we aim to create awareness about cognitive, attention,

and reading efforts forming the textual data-based BP complexity. Our concept serves as a basis for the development

of various decision-support solutions for BP workers.

Design/methodology/approach

The starting point is an investigation of the complexity concept in the BPM literature to develop an understanding of

the related complexity research and to put the textual data-based BP complexity in its context. Afterward, utilizing the

linguistic foundations and the Theory of Situation Awareness, the concept is empirically developed and evaluated in a

real-world application case using qualitative interview-based and quantitative data-based methods.

Findings

In the practical, real-world application, we confirmed that BP textual data could be used to predict BP complexity from

the semantic, syntactic, and stylistic viewpoints. We were able to prove the value of this knowledge about the BP

complexity formed based on the (i) professional contextual experience of the BP worker enriched by the awareness of

cognitive efforts required for BP execution (objective knowledge), (ii) business emotions enriched by attention efforts

(subjective knowledge), and (iii) quality of the text, i.e., professionalism, expertise, and stress level of the text author,

enriched by reading efforts (meta-knowledge).

In particular, the BP complexity concept has been applied to an industrial example of ITIL Change Management IT

ticket processing. We used IT ticket texts from two samples of 28,157 and 4,625 tickets as the basis for our analysis.

We evaluated the concept with the help of manually labeled tickets and a rule-based approach using historical ticket

execution data. Having a recommendation character, the results showed to be useful in creating awareness regarding

cognitive, attention, and reading efforts for ITIL Change Management BP workers coordinating the IT ticket

processing.

Originality

While aiming to draw attention to those valuable insights inherent in BP textual data, we propose an unconventional

approach to BP complexity definition through the lens of textual data. Hereby, we address the challenges specified by

BPM researchers, i.e., focus on semantics in developing vocabularies and organization- and sector-specific adaptation

of common NLP techniques.

Keywords: Business Process Management, Business Process Complexity, Natural Language Processing, Situation

Awareness, Decision Support, ITIL IT tickets.

1. Introduction

The significance of natural language in human work and private life cannot be overestimated. It is a means of

sharing thoughts and feelings and storing knowledge. Over the last decade, the maturity of Natural Language

Processing (NLP) techniques, along with the proliferation of big data, has shifted the focus to new opportunities

in a range of applications. In these applications, documents and textual data are extensively used to manage

customer service, legal issues, logistics, or accounting (van der Aa, Carmona, et al., 2018). Unstructured text is

commonly believed to account for more than 80% of data in companies (Kobayashi et al., 2018). Yet, as also

stated by (Kobayashi et al., 2018), few researchers have applied NLP to tackle organizational challenges despite

this abundance of textual data. In Business Process Management (BPM), recent research demonstrates the

capabilities of NLP-based analysis techniques to support various tasks in a scalable manner (Mendling et al.,

2017). However, there are still many challenges of NLP-supported BPM, especially related to its enhancement in

the sense of semantics and developing domain or even organization-specific adaptations (van der Aa, Carmona, et

al., 2018).

At the same time, due to the fast development and penetration of digital technologies into BPM, the overworked

term of process complexity and solutions addressing this complexity gain new attention. In this respect, the

mainstream BPM research has been inspired by the software complexity metrics and is directed towards estimating

the complexity of technical artifacts (Cardoso et al., 2006). Hence, it does not consider the textual data. This

observation explains that while being a popular subject area in business (Müller et al., 2016), Text Analytics and

NLP have not been used to study the BP complexity so far.

The demand for complexity research is especially evident in the most impacted IT and IT Service Management

(ITSM) domain (Lei et al., 2021). Practitioners state a dramatic increase in software errors and a lack of experts

to deal with them. Software maintenance and its costs constituting up to 90% of total software development (Goyal

and Sardana, 2021) remain in the research and practice focus (Jang and Kim, 2021; Peimbert-García et al., 2021).

This complexity and the dynamic nature of processes make the problem of providing process workers with

structured knowledge to enable informed decision-making especially significant (Lee et al., 2020).

Based on the above motivation, we aim to create a textual data-based instrument for increasing the awareness

of the BP workers regarding the process complexity. Accordingly, we set to develop a BP complexity concept as

a basis for various decision-support solutions for BP workers. The concept development involves solving some

important issues, which make up the specific objectives of this study:

(i) Extending an understanding and conceptualizing the BP complexity based on the textual data generated in

BPs using a theoretical background.

(ii) Developing a set of BP complexity measures based on the textual data using a linguistic justification.

(iii) Exploring, adapting, and illustrating the benefits of the BP complexity concept application using an

industrial example.

To achieve these objectives and ensure the comprehensiveness of the examined phenomena, we employ a

triangulation approach based on the following five steps. First, we analyze the related work to develop and extend

an understanding of BP complexity and closely associated research in Section 2. Second, expert knowledge is used

to build the theoretical background for the concept model development and adapt the NLP and linguistic

considerations to the BPM context, forming a solid foundation for designing BP complexity measures based on

the textual data in Section 3. Third, a real-world application case is used to develop a set of BP complexity

measures while adapting standard NLP techniques to the BPM context and BP complexity resulting in the BP

complexity concept in Section 4. Fourth, the BP complexity concept is applied to a real-world scenario to

demonstrate its practical value and relevance in Section 5. Fifth, expert knowledge collected in onsite workshops,

interviews, remote feedback (qualitative evaluation), and statistical methods (quantitative evaluation) is iteratively

used to evaluate the results in Section 6.

Hence, the BP complexity concept has been developed in relation to a specific BP from the ITSM domain,

which is ITIL Change Management (CHM) IT ticket processing of an international telecommunication provider.

The concept aims to create awareness about certain efforts needed to process a ticket1: (i) awareness of cognitive

efforts obtained with the help of domain-specific taxonomy and necessary for the process/task execution, i.e., a

comprehensive understanding of the current situation, including the professional contextual experience of the BP

worker, (ii) awareness of attention efforts to be paid to individual process elements or the entire process, i.e.,

business emotions contained in the BP text, extracted with the help of domain-specific business sentiment lexicon,

and (iii) awareness of reading efforts obtained with the help of stylistic features contextually related to the text

quality, i.e., indicating professionalism, expertise, and stress level of the text author. Such awareness can serve as

prioritization support, necessary expert identification, selection of process automation candidates, and aggregated

analyses of BP textual data over specific periods.

1 By ticket processing, we understand opening a ticket in an IT ticketing system, filling in necessary fields, identifying and performing

preparatory and follow-up work required for successful ticket resolution

Thus, our work contributes to BPM by proposing a BP complexity concept based on the three knowledge types,

addressing semantics, syntax, and stylistics, and creating awareness about certain efforts necessary for BP

execution. To the best of our knowledge, this is the first time in the literature that the BP textual data is analyzed

from these three perspectives to predict the BP complexity. Using qualitative and quantitative research methods,

we illustratively apply and evaluate our concept based on the ITIL CHM IT ticket processing. Hence, we adapt

common NLP techniques to the domain specificity of ITIL CHM to increase their performance on the semantic

level.

2. Related Work

According to the research artifact, our study naturally lies at the intersection of (i) BPM, (ii) BP complexity, and

(iii) NLP techniques for extracting the knowledge about the latter. This section provides an understanding of BP

complexity and complexity-related research and gives an overview of the NLP application in BPM while outlining

the research gaps. Thus, Section 2.1. reviews the BP complexity approaches in BPM. Section 2.2. presents the

research closely related to but not directly addressing BP complexity. Finally, Section 2.3. introduces the status

quo of the NLP research in BPM.

2.1. Business Process Complexity

As organizations develop and expand their businesses, interdependencies between their processes and information

systems increase rapidly. To address this problem, organizations modify the technology supporting their

businesses. As a result of such developments, organizations face substantial problems. One of the first and most

significant problems is complexity, which impedes decision-making and leads to excessively high and often hidden

costs. There has been much interest in complexity research from both academia and industry. The term complexity

has received much attention in different fields. For example, Organizational Sciences adapt concepts from

Complexity Theory and define an organization as a complex dynamic system consisting of elements interacting

with each other and their environment (Grobman, 2005). In Computer Sciences, as a rule, the term complexity

determines the complexity of an algorithm, i.e., the number of resources required to execute the algorithm (Arora

and Barak, 2009).

In this study, we limit our scope to the BPM discipline. BPs are sequences of well-defined actions that must

be modeled and redesigned as needed (van der Aalst, 2013). Hence, BPM focuses on modeling whereby processes

are recorded, evaluated, planned, and redesigned. This is also a dominant research direction in BPM (Leno et al.,

2020), demonstrating its closeness to Computer Sciences. Fundamental concepts and approaches of complexity

measures applied to BPs have attracted researchers' attention since the 1970s. The necessity to measure complexity

became apparent in software development projects with the purpose of management and control. One of the first

essential measures, graph theory-based McCabe complexity (McCabe, 1976), or cyclomatic complexity, was

designed to identify software modules that are difficult to test or maintain. Later on, it was applied to different

subject areas, including BPs, whereby it is known as control-flow complexity (Cardoso et al., 2006). Another

popular measurement applied to BPs is Halstead software complexity (Halstead, 1977), calculated based on

program operands (variables and constants) and operators (arithmetic operators and keywords influencing the

program control-flow) (Cardoso et al., 2006). Accordingly, various software complexity approaches have been

adapted to BPs. The cited (Cardoso et al., 2006) can be reasonably considered one of the pioneers of software

complexity adaption in BPM. Other adaptions such as (Henry and Kafura, 1981; Jingqiu Shao and Yingxu Wang,

2003; Jukka Paakki et al., 2000; Woodward et al., 1979) and (Conte et al., 1986; Troy and Zweben, 1981) were

studied in detail by (Laue and Gruhn, 2006) and (Vanderfeesten, Reijers and van der Aalst, 2008; Vanderfeesten

et al., 2007).

At the same time, some research work breaks away from the software complexity adaption and explores other

subject fields. (Vanderfeesten, Reijers, Mendling, et al., 2008) draw inspiration from Cognitive Sciences. (Kluza

et al., 2014; Sánchez-González et al., 2010) link their research to mathematics. Other researchers experiment with

visual cognition of BP models (Petrusel et al., 2017) in a broader context of Decision Sciences and test various

perspectives to BP model complexity, such as errors and rules (Kluza, 2015; Mendling and Neumann, 2007). A

number of studies on BP complexity use the widely deployed BPMN (OMG, 2013) modeling framework (Pozzi

et al., 2011; Rolón et al., 2009). With the BPMN counterparts’ adoption in the BPM field, i.e., CMMN for the

case and DMN for decision modeling, the corresponding work on their complexity has started to appear. The

complexity approaches are similar to the BP model complexity (Hasić and Vanthienen, 2019; Marin et al., 2015).

It is important to note that whereas complexity considerations for BPMN and DMN are comparable, the complexity

in CMMN can get incomparably high. Two other fields worth mentioning are expert systems (Chen and Suen,

1994; Kaisler, 1986; Suen et al., 1990) and IT architectures (Kinnunen, 2006; Solic et al., 2011; Wehling et al.,

2016, 2017). To sum up, BPs consist of many different elements (splits, joins, resources, diverse data types,

activities, etc.). Therefore, there can be no universal measure of process complexity addressing all BP elements.

As we can conclude from the summary in Table I, most of the existing BP complexity approaches come from

the software subject area and consider a BP from the angle of programming language, i.e., as a technical artifact.

Similar to the software complexity, in the sense of the practical contributions, BP complexity research mainly aims

to achieve more transparency, understandability, reducing errors, defects, and exceptions of BPs. The observation

also proves the intense focus on technical artifacts dominant in the BPM community.

Table I. Related work review of BP complexity

Complexity Studies Approach

Pursued goals / practical

contributions

Software

(McCabe, 1976)

graph-theoretic complexity measures

management and control of

software program complexity

(Halstead, 1977)

program operands and operators-based

measures

(Gao and Li, 2009)

complex network theory-based measures

(Henry and Kafura, 1981)

information-flow based measures (fan-in and

fan-out)

evaluating the structure of large-

scale systems

(Woodward et al., 1979)

knots as a measure of control-flow

complexity in program texts

structuring programs

(Jingqiu Shao and Yingxu Wang,

2003)

a measure of the cognitive and psychological

complexity of software

as a human

intelligence artifact

analysis and prediction of

software complexity

(Jukka Paakki et al., 2000) discovery of architectural and design patterns

analysis of the quality of

architecture

(Conte et al., 1986; Troy and

Zweben, 1981)

five design quality measures - coupling,

cohesion, complexity, modularity, size

evaluation of software designs

(Banker et al., 1989, 1993; Basili

and Hutchens, 1983; Gibson and

Senn, 1989)

the average size of module’s procedures,

application’s modules, the density of goto

statements

understanding and managing

computer software complexity in

terms of the maintenance costs

BP model

(Cardoso et al., 2006)

number of activities, control-flows, joins and

splits in general and unique (not repeating),

interface complexity, graph theory-oriented

metrics measuring the complexity of a

graphic

understandability, fewer errors,

defects, and exceptions, more

robust pro

cesses requiring less

time to be developed, tested, and

maintained

(Laue and Gruhn, 2006)

cognitive weights for BP models, information

flow, max/ mean nesting depth, number of

handles, (anti) patterns

(Vanderfeesten, Reijers and van

der Aalst, 2008; Vanderfeesten,

Reijers, Mendling, et al.

, 2008;

Vanderfeesten et al., 2007)

adapted cohesion and coupling metrics, cross-

connectivity (strength of the links between

BP model elements)

(Mendling and Neumann, 2007)

graph theory-based metrics incl. size,

separability, sequentially, structuredness,

cyclicity, parallelism

(Sánchez-González et al., 2010)

structural metrics incl. diameter, nodes,

density, gateway degrees and mismatch, the

coefficient of connectivity

(Kluza, 2015)

BP model metrics integrated with rules

(Petrusel et al., 2017)

visual comprehension of a BP model with an

eye-tracking experiment

(Kluza et al., 2014)

Durfee and Perfect square

Work- and

control-flow

(Cardoso, 2006, 2008; Lassen and

Aalst, 2009)

compound control-flow complexity of all

split constructs

Event log

(Cardoso, 2007)

number of process logs that are generated

when workflows are executed

(Benner-Wickner et al., 2014) average trace length, size, event density, trace

diversity

metrics which can measure the

degree of event log quality that is

needed so that discovery

algorithms can be applied

DMN (Hasić and Vanthienen, 2019)

number of decisions, elements, information

requirements, density, data objects, Durfee

and Perfect square metric, sequentially,

diameter, longest path, vertex degree, knot

count, network complexity, decision nesting

depth, cyclomatic complexity, interface

complexity

complexity metrics for DMN

models

CMMN (Marin et al., 2015) size, length, complexity

complexity metrics for CMMN

models

Expert

systems and

rule bases

(Chen and Suen, 1994; Kaisler,

1986; Suen et al., 1990)

number of rules, decision components,

breadth of the search path, depth of search

space, number of antecedents and

consequents of a rule, content, connectivity

and size complexity, entropy-based rule base

complexity

systematic and reliable techniques

for evaluating expert systems

Enterprise IT

architectures (Wehling et al., 2016, 2017) variability mining

decision support to determine and

remove redundant architectural

artifacts

Loading more pages...