Ulm University | 89069 Ulm | Germany Faculty of Engineering,
Computer Science and
Psychology
Institute of Databases and
Information Systems
The Empirical Analysis of the Comprehensibility of Process Models created by Process Mining
Master’s thesis at Ulm University
Submitted by:
Jana Bühler
871153
Reviewer:
Prof. Dr. Manfred Reichert
Prof. Dr. Rüdiger Pryss
Supervisor:
Michael Winter
2021
Version of September 9, 2021
© 2021 Jana Bühler
Typesetting: PDF-LaTeX 2ε
Abstract
Companies use process models to specify their operational processes. With the
help of process models, the business processes in a company are analysed by pro-
cess mining techniques to optimise them. The subdiscipline of process discovery
identifies the actual state of business processes and enables them to be exam-
ined. Various tools and algorithms can be used, which lead to different process
visualisations. The type of process visualisation has a major influence on the com-
prehensibility of process models.
The objective of this thesis is to investigate the comprehensibility of process models
generated by process mining. For this purpose, an exploratory eye-tracking study is
conducted with fifteen participants. The study examines process models from two
scenarios - a vaccination process and an insurance process. The corresponding
process models are created manually, and event logs are generated from them us-
ing self-created applications. These event logs are loaded into the process mining
tools Celonis Snap, Disco, ProM, Apromore and PM4Py and process models are
generated from them. A selection of the resulting process models is then tested
for comprehensibility in the user study. The analysis of variance (ANOVA) shows
no significant differences between the different generated process models. A Pearson
correlation analysis shows that the participants’ subjective ranking is highly
significantly related to the level of acceptability and cognitive load. Also of
interest is the correlation between the time spent looking at the process models and
the number of correctly answered comprehension questions. From this correlation, it
can be concluded that understanding process models requires a certain amount of time.
A surprising result of the study is that the quality of manually created models
and of models generated by process mining is similarly high. Despite interesting
results, further studies are needed, as the study is confronted with some limitations
(particularly the number of participants). The results can be used as a basis for
future studies to further explore this field of research.
Acknowledgement
At this point I would like to thank everyone who supported and motivated me in the
preparation of this Master’s thesis.
First and foremost, I would like to thank all the participants in my study, without
whom this thesis could not have been written. I thank them for their willingness
to provide all information needed and for their answers and contributions to my
questionnaire.
Finally, I would like to thank my parents whose support made my study possible.
Contents
Abstract
1 Introduction
1.1 Motivation
1.2 Objective
1.3 Structure of Thesis
2 Fundamentals
2.1 Process Visualisations
2.1.1 Petri Nets
2.1.2 Causal Nets
2.1.3 Business Process Model and Notation
2.2 Process Scenarios
2.2.1 Process of Vaccination
2.2.2 Process of Insurance
2.3 Process Mining
2.4 Process Discovery Algorithms
2.4.1 α-Algorithm
2.4.2 HeuristicsMiner
2.4.3 Inductive Miner
2.4.4 Fuzzy Miner
2.4.5 Split Miner
2.5 Process Mining Tools
2.5.1 Celonis Snap
2.5.2 Disco
2.5.3 ProM
2.5.4 Apromore
2.5.5 PM4Py
2.6 Eye-Tracking with Pupil Core
2.6.1 Functionality
2.6.2 Pupil Core with Pupil Capture
3 Related Work
4 User Study
4.1 Context of Experiment
4.2 Experimental Setting
4.3 Hypothesis Formulation
4.4 Experimental Set-up
4.4.1 Participants
4.4.2 Objects
4.4.3 Independent and Dependent Variables
    Performance
    Level of Acceptability
    Cognitive Load
    Ranking
4.4.4 Experimental Design
4.4.5 Instruments
4.5 Operation and Data Validation
5 Study Evaluation
5.1 Descriptive Analysis
5.2 Data Analysis and Interpretation
5.2.1 Analysis of Process Scenarios
5.2.2 Analysis of Vaccination Process
5.2.3 Analysis of Insurance Process
5.2.4 Correlation Analysis
5.3 Limitations
5.4 Results of User Study
6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
Bibliography
A Vaccination Process in Python
B Process Visualisations
B.1 Celonis Snap
B.2 Disco
B.3 Apromore
B.4 PM4Py
C Questionnaires
C.1 Knowledge Questions
C.2 Comprehension Questions
C.3 Level of Acceptability
C.4 Cognitive Load
D Results of User Study
1 Introduction
1.1 Motivation
Organisations can use process models to specify their operational processes [2].
By having a graphical representation of the process models, the processes can be
better understood by different stakeholders [28].
Information systems are used to support business processes and enable users to
control them [4]. Business data is traditionally stored in many different systems,
such as ERP systems, to name an important one [16]. These systems directly
or indirectly record business activities and can be used to create event logs [4].
Based on these logs, process visualisations may be created with the help of process
mining algorithms and process analyses may be carried out. Process mining is
seen as an interface between the fields of Business Process Management (BPM)
and data mining [4]. Process mining combines traditional process model analyses
and data-oriented analyses [4]. Traditional process analyses include procedures for
the simulation and verification of process models. Data-oriented analyses use real
data and employ procedures from data mining, but do not focus on processes [4].
Process mining makes the identification of the actual state of business processes
easier and is used to discover, monitor and improve processes [4]. Based on event
logs, discovery is about automatically obtaining a picture of the as-is processes. Process
mining tools offer different views of the generated process model. These views
facilitate the analysis of the business process concerned. Monitoring helps to find
out how well the real execution of the process matches the documented or specified
process model. Improvement is about identifying and solving bottlenecks and other
weaknesses to optimise the business process and thus achieving greater added
value. Improving process models reduces the risk of problems [16]. "The need to
improve business processes is a competitive advantage for companies and should
therefore be supported as much as possible." [48]. Therefore, the quality of the
created process visualisation is of great importance.
In this thesis, the focus is on process discovery and the generated process models.
A process discovery algorithm is used that extracts knowledge from a given event
log to represent it as a process visualisation [4]. Various tools and algorithms can
be used, which lead to different process visualisations. One type of visualisation
that has been around for a long time is Petri nets. They were developed in
the 1960s by Carl Adam Petri to represent a model for the flow of information [37].
In [47] an adapted representation of process visualisations, so-called Causal nets,
is presented. A popular way to visualise process models is the Business Process
Model and Notation (BPMN) [35]. With the help of BPMN, process models can
conceptually represent how the process should run. BPMN can also be used to create
so-called workflows that can be executed by a business process engine [13].
The type of process visualisation has a major influence on the comprehensibility
of process models. The research on this has been going on for many years [42,
54, 55]. The focus is on process models created by different process
modellers and on how well people understand the resulting process visualisations.
Studies already show that the experience of the viewer influences the comprehensibility
of the process model, so experts understand process models more effectively than
novices [33, 56]. Currently, there is no research on how understandable the process
models generated by process discovery are. Therefore, a user study is conducted
using eye-tracking to examine the comprehensibility of these models.
1.2 Objective
In this work, the comprehensibility of process models created by process discovery
is investigated. For this purpose, two scenarios are created. Event logs have to be
generated first to apply process mining techniques. In the next step, different pro-
cess mining tools with different algorithms are evaluated concerning the generation
of process visualisations. An important goal is to determine which kind of process
visualisation is best suited to understand the process scenario. Therefore, the com-
prehensibility and cognitive load are examined. A study is to be conducted with
the help of an eye-tracker. To the best knowledge of the author, there has been no
earlier study that addresses this topic. Therefore the findings will be a good starting
point for further investigations.
1.3 Structure of Thesis
After presenting the motivation and objectives of the thesis, Chapter 2 presents
essential topics that help understand the thesis. First, Section 2.1 explains the
different types of process visualisations that are used in process mining. Various
process scenarios are considered in the context of the study carried out. These are
described in Section 2.2. Section 2.3 explains the basics of process mining. Impor-
tant terms and central techniques of process mining are explained. Subsequently,
important process mining algorithms are presented in Section 2.4. Section 2.5 de-
scribes five essential process mining tools that are used during the thesis. Section
2.6 deals with the topic of eye-tracking. Important principles and techniques are
described. Finally, the eye-tracking set "Pupil Core", used to conduct the study, is
presented. In Chapter 3 related works are discussed. On the one hand, the papers
deal with comprehensibility in general. On the other hand, papers are referenced
that have also conducted user studies with the help of eye-tracking in the area of
process visualisations. Chapter 4 deals with the implementation of the user study.
For this purpose, among other things, the structure with the materials of the study
and its procedure is explained. In Chapter 5 the results of the analyses are de-
scribed and tested for possible significance. The thesis concludes with Chapter 6,
which gives a summary of the thesis and an outlook on further possible work.
2 Fundamentals
This Chapter deals with the basic terms and topics from the field of process mod-
elling and process mining. Furthermore, an introduction to eye-tracking is given
to understand how the comprehensibility of process diagrams can be practically
examined and evaluated.
2.1 Process Visualisations
For more than 25 years now, the consideration of business processes has increas-
ingly become the focus of companies. In [26] the foundations for business process
reengineering are laid. In the meantime, many consulting firms have specialised
in analysing, executing, monitoring, and optimising processes. An organisation ori-
ented towards business processes reduces costs and helps to achieve its strategic
goals more quickly [34]. Many different modelling languages exist to represent busi-
ness processes. This Section describes essential process visualisations that also
play an important role in process mining.
2.1.1 Petri Nets
Petri nets go back to the work of Carl Adam Petri in 1962, who in his dissertation
considered, among other things, simultaneous models, which are known today as
Petri nets [37]. A net usually consists of several places which can be marked by
one or more tokens. There are also Petri nets that may only hold one token per
place. These are not considered in this thesis. A token indicates the holding of a
possible state. Between places there are transitions. Transitions enable the change
of a place. Places and transitions are connected by arcs representing the flow. The
consideration of sequence, simultaneity, and conflicts is sufficient to describe all
basic situations in Petri nets [3].
As in [3] a Petri net is formally defined as follows.
Definition 1 (Petri net)
A Petri net is a triple $(P, T, F)$:
- $P$ is a finite set of places,
- $T$ is a finite set of transitions ($P \cap T = \emptyset$),
- $F \subseteq (P \times T) \cup (T \times P)$ is a set of arcs (flow relation).
The following Figure 2.1 shows a Petri net with a simplified vaccination process. The
process contains three transitions (check-in, receive vaccination, and check-out).
Some places in the example contain tokens. These move through the net during
the processing. The state of the process can be read from the distribution of the
tokens [3]. Three process instances are in the first state, on arrival at the vaccination
centre. Two instances have already passed the check-in and are now waiting for the
vaccination. Three instances have already gone through the complete process and
have reached the end, leaving the vaccination centre.
Figure 2.1: Simplified vaccination process as a Petri net
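The token game described for Figure 2.1 can be illustrated with a minimal Python sketch. It is not taken from the thesis: the place names and the empty place between vaccination and check-out are assumptions made for illustration, while the three transitions and the token counts follow the description above.

```python
from collections import Counter

# Places, transitions and arcs of the simplified vaccination net.
# The place names are assumptions made for this sketch.
PLACES = {"arrived", "waiting", "vaccinated", "left"}
TRANSITIONS = {"check-in", "receive vaccination", "check-out"}
ARCS = {
    ("arrived", "check-in"), ("check-in", "waiting"),
    ("waiting", "receive vaccination"), ("receive vaccination", "vaccinated"),
    ("vaccinated", "check-out"), ("check-out", "left"),
}


def preset(transition):
    """Input places of a transition."""
    return {src for (src, dst) in ARCS if dst == transition}


def postset(transition):
    """Output places of a transition."""
    return {dst for (src, dst) in ARCS if src == transition}


def is_enabled(marking, transition):
    """A transition is enabled if every input place holds at least one token."""
    return all(marking[place] >= 1 for place in preset(transition))


def fire(marking, transition):
    """Return the marking reached by firing an enabled transition."""
    if not is_enabled(marking, transition):
        raise ValueError(f"transition {transition!r} is not enabled")
    new_marking = Counter(marking)
    for place in preset(transition):
        new_marking[place] -= 1
    for place in postset(transition):
        new_marking[place] += 1
    return new_marking


# Marking as described in the text: 3 arrivals, 2 waiting, 3 finished.
marking = Counter({"arrived": 3, "waiting": 2, "vaccinated": 0, "left": 3})
after_check_in = fire(marking, "check-in")
```

Firing check-in moves one token from the arrival place to the waiting place, so the state of the process can again be read from the distribution of the tokens.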
The use of Petri nets for workflow management has been considered for more than
20 years [3]. In addition, as Van der Aalst states in [47], many algorithms in
the field of process mining can produce Petri nets. He mentions, among others,
variants of the α-algorithm [5, 17].
In terms of process mining, the places do not have much significance. The addi-
tion or removal of places may, however, introduce anomalies such as deadlocks or
livelocks [47]. Central are the transitions that represent the different process steps.
There are different opinions on how well suited Petri nets are. In [3] some reasons are
given why Petri nets are useful and should be used for process analysis. Petri nets
follow formal semantics and are graphical, enabling the modelling of workflows. An-
other advantage is the explicit representation of the status of the process. Petri
nets can also be used for analyses to determine the correctness of workflow pro-
cess definitions.
In recent years there has been a trend towards the use of Causal nets or Heuristic
nets as they are considered more beneficial than Petri nets [47].
2.1.2 Causal Nets
Causal nets, for short C-nets, were first introduced by Van der Aalst in 2011 [47].
This kind of graph aims to customise and improve the representation of process
models generated by process mining algorithms. According to Van der Aalst, C-nets
are better suited to this purpose than other process notation languages such as Petri nets
or BPMN.
A C-net consists of nodes and directed edges, which represent causal dependen-
cies. The nodes represent process activities. There is precisely one start and one
end node. The difference from simple dependency graphs is that so-called input
and output bindings are introduced to express the routing logic. Black dots
represent bindings. A set of connected dots forms a binding. If a binding is added
after a node, it is called an output binding. If a binding is placed before a node, it is
called an input binding.
As in [47] a C-net is formally defined as follows.
Definition 2 (C-net)
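A Causal net (C-net) is a tuple $C = (A, a_i, a_o, D, I, O)$ where:
- $A$ is a finite set of activities;
- $a_i \in A$ is the start activity;
- $a_o \in A$ is the end activity;
- $D \subseteq A \times A$ is the dependency relation,
- $AS = \{X \subseteq \mathcal{P}(A) \mid X = \{\emptyset\} \lor \emptyset \notin X\}$;
- $I \in A \to AS$ defines the set of possible input bindings per activity; and
- $O \in A \to AS$ defines the set of possible output bindings per activity,
such that
- $D = \{(a_1, a_2) \in A \times A \mid a_1 \in \bigcup_{as \in I(a_2)} as\}$;
- $D = \{(a_1, a_2) \in A \times A \mid a_2 \in \bigcup_{as \in O(a_1)} as\}$;
- $\{a_i\} = \{a \in A \mid I(a) = \{\emptyset\}\}$;
- $\{a_o\} = \{a \in A \mid O(a) = \{\emptyset\}\}$; and
- all activities in the graph $(A, D)$ are on a path from $a_i$ to $a_o$.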
In the C-net, all activities and the start and end activities are defined first. Then the
dependencies between the activities (see set D) can be defined. The possible input
and output bindings are defined for each activity. In the refinement of the definition,
it can be seen that for each dependency of activities a1 and a2, a2 must have an
input binding with a1, and a1 must have an output binding with a2. In addition, the
start activity has no input bindings, and the end activity has no output bindings.
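The consistency conditions of Definition 2 can be checked mechanically. The following Python sketch is an illustration only: the four-activity net and the function names are hypothetical and do not reproduce a model from the thesis. Bindings are represented as sets of frozensets, so the empty binding set of the start and end activity is `{frozenset()}`.

```python
def dependencies_from_inputs(activities, inputs):
    """All pairs (a1, a2) where a1 occurs in some input binding of a2."""
    return {(a1, a2) for a2 in activities for binding in inputs[a2] for a1 in binding}


def dependencies_from_outputs(activities, outputs):
    """All pairs (a1, a2) where a2 occurs in some output binding of a1."""
    return {(a1, a2) for a1 in activities for binding in outputs[a1] for a2 in binding}


def reachable(origin, edges):
    """Nodes reachable from origin by following the directed edges."""
    seen, stack = {origin}, [origin]
    while stack:
        node = stack.pop()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def check_cnet(activities, start, end, deps, inputs, outputs):
    """Return a list of violated conditions (empty if the C-net is well-formed)."""
    problems = []
    if dependencies_from_inputs(activities, inputs) != deps:
        problems.append("input bindings do not match D")
    if dependencies_from_outputs(activities, outputs) != deps:
        problems.append("output bindings do not match D")
    if {a for a in activities if inputs[a] == {frozenset()}} != {start}:
        problems.append("only the start activity may have no input bindings")
    if {a for a in activities if outputs[a] == {frozenset()}} != {end}:
        problems.append("only the end activity may have no output bindings")
    backwards = {(dst, src) for src, dst in deps}
    if reachable(start, deps) & reachable(end, backwards) != set(activities):
        problems.append("some activity is not on a path from start to end")
    return problems


# Hypothetical net: a is followed by b, by c, or by both; d completes the process.
A = {"a", "b", "c", "d"}
D = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
choice = {frozenset({"b"}), frozenset({"c"}), frozenset({"b", "c"})}
I = {"a": {frozenset()}, "b": {frozenset({"a"})}, "c": {frozenset({"a"})}, "d": choice}
O = {"a": choice, "b": {frozenset({"d"})}, "c": {frozenset({"d"})}, "d": {frozenset()}}

problems = check_cnet(A, "a", "d", D, I, O)
```

For this net the list of problems is empty. Removing the output binding of b, for example, would violate both the second and the fourth condition.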
The following Figure 2.2 shows a booking process. After the start of the process,
there are three possible follow-up activities. However, the output bindings limit the
number of possible versions after the start. For example, it is not possible to book
only a hotel. A hotel can only be booked with a flight or booked with a flight and
a car. The process must be started for a car to be booked, and a flight must have
been booked before. At first, the notation looks pretty abstract. However, after un-
derstanding the principle of input and output bindings, one understands the possible
process flows very well.
Figure 2.2: C-net for booking process [47]
As Van der Aalst explains in [47] "output bindings create obligations whereas input
bindings remove obligations". If the node "complete booking" is taken, the booking
can only be completed if a flight is booked. Another obligation is that a car is booked.
In total, four obligations may be valid to successfully complete a booking (see input
bindings of node "complete booking").
A C-net is sound if it does not contain any anomalies (e.g. deadlocks or livelocks)
and forces the completion of the process. For this, it must be checked whether there
is a valid sequence. All parts of the C-net must potentially be able to be activated
by such a sequence.
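How obligations are created and removed along a binding sequence can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the small net and all names are hypothetical, and the check covers a single sequence only, not the soundness of the whole C-net.

```python
from collections import Counter


def is_valid_sequence(sequence, start, end, inputs, outputs):
    """Replay a binding sequence and track pending obligations.

    sequence is a list of (activity, input_binding, output_binding) triples,
    where each binding is a frozenset of activities.
    """
    if not sequence or sequence[0][0] != start or sequence[-1][0] != end:
        return False
    # start and end activity may only occur at the first and last position
    if any(step[0] in (start, end) for step in sequence[1:-1]):
        return False
    pending = Counter()  # (source, target) -> number of open obligations
    for activity, in_binding, out_binding in sequence:
        if in_binding not in inputs[activity] or out_binding not in outputs[activity]:
            return False
        for source in in_binding:  # input bindings remove obligations
            if pending[(source, activity)] == 0:
                return False
            pending[(source, activity)] -= 1
        for target in out_binding:  # output bindings create obligations
            pending[(activity, target)] += 1
    return all(count == 0 for count in pending.values())


# Hypothetical net: a is followed by b, by c, or by both; d completes the process.
none = frozenset()
choice = {frozenset({"b"}), frozenset({"c"}), frozenset({"b", "c"})}
I = {"a": {none}, "b": {frozenset({"a"})}, "c": {frozenset({"a"})}, "d": choice}
O = {"a": choice, "b": {frozenset({"d"})}, "c": {frozenset({"d"})}, "d": {none}}

complete = [
    ("a", none, frozenset({"b", "c"})),
    ("b", frozenset({"a"}), frozenset({"d"})),
    ("c", frozenset({"a"}), frozenset({"d"})),
    ("d", frozenset({"b", "c"}), none),
]
# c is promised by a but never executed, so one obligation stays open
incomplete = [
    ("a", none, frozenset({"b", "c"})),
    ("b", frozenset({"a"}), frozenset({"d"})),
    ("d", frozenset({"b"}), none),
]
```

The first sequence is valid because every obligation created by an output binding is later removed by a matching input binding. The second one ends with an open obligation and is therefore rejected.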
In summary, the behaviour of process models is accurately described by valid binding
sequences, and C-nets do not follow the token-game principle as is the case in
Petri net-based approaches. The advantage of C-nets is that XOR/OR/AND splits
and joins traditionally used in other process model notations can be replaced by a
more compact and richer semantics. In [47] Van der Aalst et al. show how C-nets
can be mapped to Petri nets and vice versa.
2.1.3 Business Process Model and Notation
The Business Process Model and Notation (BPMN) was developed by Stephen A.
White of IBM and first published by the Business Process Management Initiative
in 2004 [22]. BPMN provides a standardised graphical process notation that can
visualise not only conceptual models but also represent executable processes [13].
In the meantime, the OMG is responsible for the further development of BPMN,
which attempts to combine readability, flexibility, and extensibility [35]. The currently
valid version 2.0 was adopted by the OMG in 2011 [22].
The BPMN 2.0 offers a variety of modelling elements for the representation of pro-
cesses. These can be roughly assigned to five categories: Flow Objects, Con-
nection Objects, Artefacts, Participants, and Data [22]. Specific process steps, so-
called activities, must always be carried out in a process, and certain events can
occur. Sequence flows connect these flow objects. There can be conditions that
divide the sequence flow in the process, and only activities that fulfil this condition
are executed.
The following Figure 2.3 shows a BPMN process model for a simplified vaccination
process. The process starts with the event at the confirmed date of vaccination.
Next, the three activities check in, wait and receive vaccination are carried out
sequentially. Then the XOR gateway decides whether side effects have occurred. If
this is the case, an additional activity (receive first aid) is carried out. Afterwards,
the sequence flow rejoins and the last two activities can be executed.
The process ends when the vaccination is completed.
Figure 2.3: Simplified vaccination process as a BPMN model
BPMN offers many more possibilities to visualise processes in great detail [22].
However, BPMN diagrams in the context of process mining are mainly limited to the
basic elements shown in the simplified example.
2.2 Process Scenarios
After explaining different forms of representation for processes, the process sce-
narios used for the user study are presented. The first process is the vaccination
process. This scenario is chosen because it is essentially a sequential process and
is easy to implement. The second process is an insurance process. This process
contains advanced BPMN elements. Therefore, it is expected that investigating
the comprehensibility of the process diagram created by process mining will reveal
several problem areas.
2.2.1 Process of Vaccination
This process has been modelled after the recommendation from the German Fed-
eral Ministry of Health and the Robert Koch Institute for vaccination against SARS-
CoV-2 in vaccination centres (as of December 2020) [12], seen in Figure 2.4.
Figure 2.4: Manually created process model of vaccination, based on [12]
The procedure starts with the arrival at the vaccination centre at the confirmed
appointment. Access control takes place here. The patient then comes to the
check-in, where the registration data and vaccination eligibility are checked. After a
short wait, the patient receives a medical explanation and can ask questions.
Then he has to wait again. Next, the patient is vaccinated. After the vaccination, the
patient is observed for about 15-30 minutes. If unexpected side effects or symptoms
occur during this time, first aid is given. Otherwise, the patient comes directly to the
reception, where he receives an entry in his vaccination card and can then leave
the vaccination centre. Leaving concludes the procedure.
Other conceivable exceptional cases are not considered in the process, because
otherwise the model would become unnecessarily inflated. These include entering
the centre without an appointment, dropping out at any possible time and missing
necessary documents.
To obtain the event log as quickly and simply as possible, no detailed modelling or
development of an executable process model is carried out. Instead, the event log
is generated directly by a Python script, which is included in Appendix A. A list of
activities, each with a time interval stating how long the activity can last, is given
as a text file. The script creates a CSV file from it. For each activity, a random
timestamp is calculated within the time interval. Each line in the CSV file contains
the corresponding case ID, the activity name, and a start and end timestamp.
The event log contains 1000 process instances. Of these, the special case of first
aid is included for 111 instances. Care is taken when creating the event log to en-
sure that the special case occurs often enough to be recognised as a different case
but rarely enough that it is clear that it is a special case. Attention to the number of
instances ensures that the problem of noise and incompleteness is addressed.
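The generation of such an event log can be sketched as follows. This is not the script from Appendix A but a minimal illustration: the activity names, the duration intervals, the start date and the column names are assumptions, and the activity list is defined in the code instead of being read from a text file. Only the number of instances (1000) and of first-aid cases (111) is taken from the text.

```python
import csv
import io
import random
from datetime import datetime, timedelta

# (activity name, minimum minutes, maximum minutes); all values are assumptions
ACTIVITIES = [
    ("arrive at vaccination centre", 1, 5),
    ("check in", 2, 10),
    ("wait", 1, 15),
    ("receive medical explanation", 5, 15),
    ("receive vaccination", 2, 5),
    ("observe patient", 15, 30),
    ("receive first aid", 5, 20),  # special case, only for some instances
    ("check out", 2, 5),
]
SPECIAL_CASE = "receive first aid"


def generate_event_log(n_cases=1000, n_first_aid=111, seed=42):
    """Return rows (case ID, activity, start timestamp, end timestamp)."""
    rng = random.Random(seed)
    first_aid_cases = set(rng.sample(range(1, n_cases + 1), n_first_aid))
    opening = datetime(2021, 1, 4, 8, 0)  # hypothetical opening time
    rows = []
    for case_id in range(1, n_cases + 1):
        clock = opening + timedelta(minutes=rng.randint(0, 480))
        for name, low, high in ACTIVITIES:
            if name == SPECIAL_CASE and case_id not in first_aid_cases:
                continue
            # draw a random duration within the activity's time interval
            end = clock + timedelta(minutes=rng.randint(low, high))
            rows.append((case_id, name, clock.isoformat(), end.isoformat()))
            clock = end
    return rows


def to_csv(rows):
    """Serialise the event log rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["case_id", "activity", "start", "end"])
    writer.writerows(rows)
    return buffer.getvalue()


event_log = generate_event_log()
csv_text = to_csv(event_log)
```

Fixing the number of special cases in advance, rather than drawing them with a probability, is one way to control how often the first-aid path appears in the log.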
2.2.2 Process of Insurance
The process begins when a client’s insurance application is received by the clerk
(see Figure 2.5). The clerk checks the application and decides whether the application
is valid and can be accepted or not. If not, the rejection is initiated, and the sub-
process for rejecting the application is started. After the rejection of the application,
the client is informed, and the process is finished. If the application is accepted,
the system checks whether the customer is an existing customer or not. If not,
the customer data are entered into the system by an additional step. After that, the
contract terms are determined, and the insurance policy is issued. If the issue takes
more than two days, the department head prioritises the application check, and the
clerk issues the insurance policy. Then the insurance policy is sent to the client,
and the application is completed.
Figure 2.5: Manually created process model of insurance in BPMN
A corresponding business process is implemented in Camunda to create an event
log for this process. For this purpose, an executable model is created, which can
be seen in Figure 2.6. The executable BPMN process model is integrated into a
Spring Boot application.
Figure 2.6: Implemented process model of insurance in Camunda
To be able to start the process, there is the class WebAppMainProcessApplication
(see Listing 2.1). The annotations @EnableProcessApplication and
@SpringBootApplication declare the class as a process application within the Camunda
Spring Boot application, so that process instances can be started from it. The main
method starts the application.
    @EnableProcessApplication
    @SpringBootApplication
    public class WebAppMainProcessApplication {
        ...
        public static void main(String... args) {
            SpringApplication.run(WebAppMainProcessApplication.class, args);
        }
        ...
    }
Listing 2.1: Start of a Spring Boot application
Once the process is deployed, the method processPostDeploy of the
WebAppMainProcessApplication class is called to launch 10,000 process instances (see
Listing 2.2).
    // set number of instances for event log
    private int instances = 10000;
    ...
    @EventListener
    private void processPostDeploy(PostDeployEvent event) {
        // start process instances
        for (int i = 0; i < instances; i++) {
            runtimeService.startProcessInstanceByKey("insurance_process");
        }
    }
Listing 2.2: Run 10000 process instances
For each task in the process, there is a class implementing JavaDelegate that would
normally contain the specific business logic. However, there is no real need to
implement the business logic for this insurance process. The focus of the
implementation is on generating the event log. Therefore, each delegate implementation
only waits for a random time in a specific range. The delegate classes all look
analogous to the class EnterCustomerDataDelegater in Listing 2.3.
    import java.util.concurrent.TimeUnit;
    import org.camunda.bpm.engine.delegate.DelegateExecution;
    import org.camunda.bpm.engine.delegate.JavaDelegate;

    public class EnterCustomerDataDelegater implements JavaDelegate {
        // minimum and maximum for random waiting (in seconds)
        private int min = 1;
        private int max = 5;

        @Override
        public void execute(DelegateExecution delegateExecution) throws Exception {
            System.out.println("*** enter customer data ***");

            // wait a random whole number of seconds between min and max
            int randomWait = (int) ((max - min + 1) * Math.random()) + min;
            TimeUnit.SECONDS.sleep(randomWait);

            System.out.println("*** customer data is entered ***");
        }
    }
Listing 2.3: Delegate class for task "enter customer data"
The decision logic only has to be implemented for deciding which process path to take
at the XOR gateways. For this purpose, a random variable is initialised. The
random variable is then checked at the appropriate point with the modulo operation,
and the process instance is continued on the corresponding path. In the following
Listing 2.4, the variable randomAccept is initialised. In 90% of the cases, the variable
acceptApplication should be set to true, and only 10% of the requests should be rejected.
    // randomAccept is a uniformly distributed integer from 1 to 9
    int randomAccept = (int) (Math.random() * (10 - 1)) + 1;
    // set the variable acceptApplication to false when randomAccept is 9,
    // i.e. in 1 of 9 cases (about 11%, close to the intended one tenth),
    // otherwise true
    if (randomAccept % 9 == 0) {
        delegateExecution.setVariable("acceptApplication", false);
    } else {
        delegateExecution.setVariable("acceptApplication", true);
    }
Listing 2.4: Decision handling for processing an insurance application
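The share of rejected applications that results from this decision logic can be verified with a short Python sketch. The function names and the evaluation grid are assumptions made for illustration. The first function mirrors the arithmetic of Listing 2.4, and the second is a hypothetical variant that would reject exactly one tenth of the draws. The grid of 90 evenly spaced draws is chosen so that it lines up with both the nine and the ten value ranges, which makes the computed shares exact.

```python
from fractions import Fraction


def accept_original(u):
    """Mirror of the arithmetic in Listing 2.4 for a uniform draw u in [0, 1)."""
    random_accept = int(u * (10 - 1)) + 1  # integer from 1 to 9
    return random_accept % 9 != 0          # False only for the value 9


def accept_one_in_ten(u):
    """Hypothetical variant that rejects exactly 1 of 10 equally likely values."""
    random_accept = int(u * 10) + 1        # integer from 1 to 10
    return random_accept % 10 != 0         # False only for the value 10


def rejection_rate(accept, steps=90):
    """Exact share of rejected draws over an evenly spaced grid on [0, 1)."""
    draws = [Fraction(k, steps) for k in range(steps)]
    rejected = sum(1 for u in draws if not accept(u))
    return Fraction(rejected, steps)


ORIGINAL_RATE = rejection_rate(accept_original)
ADJUSTED_RATE = rejection_rate(accept_one_in_ten)
```

With the original arithmetic, one in nine applications (about 11%) is rejected, which is slightly more than the stated 10%. For the purpose of generating an event log, this difference is unlikely to matter.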
The same approach is used to decide whether the client already exists and whether
the application check should be prioritised. The client already exists in 50% of the
cases. The application check must be prioritised in three-tenths of the cases.
To generate the event log, the Spring Boot application is started,
and 10,000 instances are executed. The current status and history
of the running instances can be observed during the execution in the Camunda
Cockpit. The history of the executed instances can be seen in Figure 2.7.
During the creation and realisation of the technical model, a problem occurred that
is explained in more detail below.
When looking at the process model in Figure 2.5, it is noticeable that a bound-
ary timer event is used in the process model to catch the case that the insurance
request is not processed further. However, this event cannot be adopted for the ex-
ecutable model. A process instance that is being executed in a thread can only be
interrupted from the outside. However, since the process is to be interrupted from
the inside (depending on the variable’s value), this is not feasible.
Figure 2.7: History view of Camunda Cockpit with all cases
Therefore, an intermediate conditional event is used as a workaround*.
It is also noticeable that the pools and lanes are not taken from the process model
and that all activities are created as service tasks in the technical model. The
service tasks are used because the goal is to get a big event log as quickly as
possible. It should not be necessary to do something manually for every user task
in every instance. This way, the process can be started once with many instances
and run through directly.
A Python script is written that accesses the historical process data stored on the
Camunda BPM platform to generate an event log. The platform’s REST API allows
direct access to the required information. In the header of the REST call, the content
type is set to application/json, and the REST command is then sent to the engine via
the endpoint http://localhost:8080/engine-rest/. The implementation of the Python
method make_rest_call, with which the various REST calls are executed, is shown
in Listing 2.5.
import requests


def make_rest_call(rest_method, params=None):
    # base URL of the Camunda REST API
    url_endpoint = 'http://localhost:8080/engine-rest/'
    r_headers = {'Content-Type': 'application/json'}
    res = requests.get(url_endpoint + rest_method, params=params,
                       headers=r_headers, timeout=3.0)
    if not res.ok:
        print('make rest call failed')
    return res
Listing 2.5: Function used for REST calls
In the following Listing 2.6, the creation of the event log is explained in more de-
tail. First, all process instances must be determined from the historical data via
a process definition key. Using the method get_process_instances, all associated
process IDs are determined via the REST command history/process-instance and
saved in a list.
def get_process_instances(process_definition_key):
    payload = {'definitionKey': process_definition_key, 'finished': 'true'}
    res = make_rest_call('history/process-instance', params=payload)
    # get result as json
    response_json = res.json()
    # extract instances from json and save as list
    process_instances = []
    # get first element
    if response_json:
        elem = response_json.pop()
    else:
        print('instances response json is empty')
        return None
    # get next elements
    while elem['id']:
        process_instances.append(elem['id'])
        # if there are more elements, else break loop
        if response_json:
            elem = response_json.pop()
        else:
            break
    return process_instances
Listing 2.6: Function to get all process instance IDs
In the next step, a REST request for each process instance is executed in the
get_process_activities method to obtain the executed process activities, including
timestamps. The results are formatted accordingly and stored in the activity list.
The code is shown in Listing 2.7.
def get_process_activities(instance_id):
    payload = {'processInstanceId': instance_id, 'sortBy': 'startTime',
               'sortOrder': 'desc'}
    res = make_rest_call('history/activity-instance', params=payload)
    # get result as json
    response_json = res.json()
    # extract activities from json and save as list
    activity_list = []
    # get first element
    if response_json:
        elem = response_json.pop()
    else:
        print('activities response json is empty')
        return None

    while elem['id']:
        # ensure that fields aren't empty
        # activity id
        activity_id = elem['id']
        if activity_id is None:
            activity_id = 'no_activity_id'

        # activity name
        activity_name = elem['activityName']
        if activity_name is None:
            activity_name = 'no_activity_name'
        else:
            # replace line breaks with a space
            activity_name = activity_name.replace('\n', ' ')

        # activity type
        activity_type = elem['activityType']
        if activity_type is None:
            # same placeholder as checked in the filter below
            activity_type = 'no_activity_type'

        # start and end time
        activity_start_time = elem['startTime']
        if activity_start_time is None:
            activity_start_time = 'no_start_time'
        activity_end_time = elem['endTime']
        if activity_end_time is None:
            activity_end_time = 'no_end_time'

        # format the timestamps: drop fractional seconds and the "T" separator
        start_time_split = activity_start_time.split(".")
        activity_start_time = start_time_split[0]
        activity_start_time = activity_start_time.replace("T", " ")
        end_time_split = activity_end_time.split(".")
        activity_end_time = end_time_split[0]
        activity_end_time = activity_end_time.replace("T", " ")

        # exclude untyped activities, exclusive gateways and
        # boundary conditional events
        if ('no_activity_type' not in activity_type
                and 'exclusiveGateway' not in activity_type
                and 'boundaryConditional' not in activity_type):
            activity_entry = (instance_id + ';' + activity_id + ';'
                              + activity_name + ';' + activity_type + ';'
                              + activity_start_time + ';' + activity_end_time)
            activity_list.append(activity_entry)

        # if there are more elements, else break loop
        if response_json:
            elem = response_json.pop()
        else:
            break
    return activity_list
Listing 2.7: Function to get all activities of a process instance
Finally, the activity lists are written to a log file. This is shown in Listing 2.8.
# assumes that instances (list of process instance IDs) and the helper
# list_to_string are defined elsewhere in the script
# open file and write header
with open('event_log_insurance.csv', 'a') as f:
    f.write('case_id;activity_id;activity;activity_type;'
            'start_timestamp;end_timestamp\n')
    for process_instance in instances:
        # get all activities of process instance
        activities = get_process_activities(process_instance)
        # write activities to file
        f.write(list_to_string(activities))
Listing 2.8: Code to write the log file
With the event logs prepared, the following sections focus on process mining.
2.3 Process Mining
Process mining is an approach to discover, monitor and improve processes [48]. In
general, software systems execute the business processes. To be able to improve
the processes, the actual process is identified with the help of discovery. For this
purpose, the software system creates event logs that record all events. A process
model can be generated from this event log with the help of process discovery
algorithms. If a process model already exists, it can also be checked for conformity
with the event log. In this way, it can be checked whether the process runs as it
has been planned. An existing process model can also be extended with the help of the
information in the event log (enhancement). The process can thus be improved. The three
described techniques can be seen in Figure 2.8.
Figure 2.8: Areas of use of process mining, based on [48]
When applying process mining techniques, several aspects should be considered.
The Process Mining Manifesto provides important guiding principles [48].
GP1: Event Data Should Be Treated as First-Class Citizens
GP2: Log Extraction Should Be Driven by Questions
GP3: Concurrency, Choice and Other Basic Control-Flow Constructs Should
be Supported
GP4: Events Should Be Related to Model Elements
GP5: Models Should Be Treated as Purposeful Abstractions of Reality
GP6: Process Mining Should Be a Continuous Process
In order to generate meaningful process models, the quality of the event log is of
great importance. Therefore, GP1 emphasises that the event log should be com-
plete, and the events should satisfy well-defined semantics. GP2 emphasises that
concrete questions are relevant for a meaningful analysis. Before a process mining
technique can be applied, the questions to be answered by the technique should be
selected. GP3 states that different process modelling languages provide different
modelling elements, and process mining techniques should support these.
GP4 points out that the starting point of the analysis is the relationship between
events in the log and the elements in the model. Therefore, care must be taken
to remove ambiguities in order to interpret the results correctly. GP5 illustrates that
there is no single perfect representation of a process; instead, each model emphasises
certain aspects for a specific audience. In this way, different perspectives and levels
of interaction can be represented. Finally, GP6 suggests that process mining is not a
one-off activity but that processes should be considered continuously.
The event log is crucial for generating the process models, as it is the source from
which process mining techniques extract knowledge. It provides detailed information
about the process history [48]. The log contains the execution sequence of each process
instance [5], in other words, the behaviour of the process. Each event in the event
log refers to an activity (task) [5]. The event log contains all process instances
(cases) with their events. The tasks take time [5]. Therefore, additional timestamps
may be included in the event log. Both the timestamp of an activity’s start and the
timestamp of its end are possible.
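How cases, events and timestamps fit together can be illustrated with a minimal sketch that groups the rows of an event log into traces. It assumes the semicolon-separated column layout of Listing 2.8; the two sample cases, the activity names and the function name read_traces are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt in the column layout of Listing 2.8 (rows deliberately unsorted).
SAMPLE_LOG = """case_id;activity_id;activity;activity_type;start_timestamp;end_timestamp
c1;a2;Accept application;serviceTask;2021-05-01 10:00:05;2021-05-01 10:00:09
c1;a1;Check application;serviceTask;2021-05-01 10:00:00;2021-05-01 10:00:05
c2;a1;Check application;serviceTask;2021-05-01 10:01:00;2021-05-01 10:01:02
c2;a3;Reject application;serviceTask;2021-05-01 10:01:02;2021-05-01 10:01:07
"""


def read_traces(csv_text):
    """Group events by case and order each case by its end timestamps."""
    events_by_case = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    for row in reader:
        # timestamps in "YYYY-MM-DD HH:MM:SS" format sort correctly as strings
        events_by_case[row["case_id"]].append((row["end_timestamp"], row["activity"]))
    return {
        case: [activity for _, activity in sorted(events)]
        for case, events in events_by_case.items()
    }


if __name__ == "__main__":
    for case, trace in read_traces(SAMPLE_LOG).items():
        print(case, trace)
```

Each resulting trace is the ordered sequence of activities of one case, which is the input that discovery algorithms work on.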
In the context of process discovery, the terms noise and incompleteness are often
used. Noise describes the problem of rare behaviour in an event log. Rare be-
haviour is not representative of the typical behaviour of the process.
Incompleteness describes the problem of incomplete event logs. An event log only
contains the sequences that have already been executed. Thus, it may not con-
tain all possible sequences of activities. Both problems affect the quality of process
models.
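One simple way to make rare behaviour visible is to count how often each trace variant occurs. The following sketch is only an illustration: the 10% threshold and the function names are assumptions, and a rare variant is not necessarily noise. Incompleteness, by contrast, cannot be detected from the log alone, since sequences that were never executed do not appear in it.

```python
from collections import Counter


def variant_shares(traces):
    """Relative frequency of every distinct trace variant in the log."""
    counts = Counter(tuple(trace) for trace in traces)
    total = sum(counts.values())
    return {variant: count / total for variant, count in counts.items()}


def rare_variants(traces, min_share=0.10):
    """Variants whose share is below the (assumed) threshold min_share."""
    return {
        variant
        for variant, share in variant_shares(traces).items()
        if share < min_share
    }


if __name__ == "__main__":
    log = [["A", "B", "C"]] * 19 + [["A", "C", "B"]]
    print(variant_shares(log))
    print(rare_variants(log))
```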
2.4 Process Discovery Algorithms
Various algorithms can be used for process mining. Some of them are explained
in this section.
2.4.1 α Algorithm
One of the best-known process discovery algorithms is the α algorithm [20]. The
α algorithm identifies causal dependencies between activities. From these, a set of
relations can be derived and translated into a workflow net. "The algorithm uses the
fact that for many WF-nets, two tasks are connected iff their causality can be detected by
inspecting the log." [5]
In [5], Van der Aalst et al. describe four log-based relations to analyse the causal
relations. The $>$ relation can be used to represent succession relationships. If activity
B is directly after activity A in the log, this can be represented in this way: $A > B$.
With the $\rightarrow$ relation, activities can be represented where one activity is a
successor of the other, but not vice versa. This means that there is a relation $A > B$,
but $B \not> A$. Activities that do not have a successor relation to each other are
represented with the $\#$ relation, which expressed in terms of the $>$ relation means that
$A \not> B \wedge B \not> A$. Parallel activities are defined with the $\parallel$ relation.
Here, both $>$ relations exist, i.e. $A > B \wedge B > A$.
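The four log-based relations can be derived mechanically from the traces. The following sketch is an illustration, not the implementation of any particular tool; the function names, the string labels for the relations and the example traces are assumptions.

```python
from itertools import product


def directly_follows(traces):
    """All pairs (a, b) such that b directly follows a in at least one trace."""
    return {(a, b) for trace in traces for a, b in zip(trace, trace[1:])}


def footprint(traces):
    """Classify every ordered pair of activities by its log-based relation."""
    follows = directly_follows(traces)
    activities = sorted({a for trace in traces for a in trace})
    relations = {}
    for a, b in product(activities, repeat=2):
        ab, ba = (a, b) in follows, (b, a) in follows
        if ab and ba:
            relations[(a, b)] = "||"   # parallel: a > b and b > a
        elif ab:
            relations[(a, b)] = "->"   # causal: a > b but not b > a
        elif ba:
            relations[(a, b)] = "<-"   # reverse causal: b > a but not a > b
        else:
            relations[(a, b)] = "#"    # no direct succession in either direction
    return relations


if __name__ == "__main__":
    log = [["A", "B", "C", "D"], ["A", "C", "B", "D"]]
    for pair, relation in footprint(log).items():
        print(pair, relation)
```

In the example log, B and C occur in both orders and are therefore classified as parallel, while A is a cause of both.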
The different traces are first identified from the event log to identify the workflow
net. These can be derived from the $>$ relation. Then all initial transitions and all
final transitions are found. Now a set is defined that contains all tuples of activities
that fulfil the dependency conditions. If there is a causal dependency between
two activities (transitions), then there must also be a place that connects them [5].
Therefore, from the set of causal dependencies, the set of places can be defined.
Finally, the workflow net is formally described as a flow of transitions and places.
The net can be generated from this description.
With the help of the α algorithm, a sound workflow net can be discovered based on
a complete event log [5].
However, the algorithm also has some problems and limitations [5]. Among other
things, each transition needs a unique name, and hidden tasks cannot be detected.
It cannot deal with short loops (loops of length one and two). The α algorithm can
only be applied if the event log is based on an acyclic, sound and structured WF-net.
Due to this limitation, the selected scenarios cannot be examined with the α algorithm,
as they contain such a loop.
Because of these problems, the α algorithm has been further developed. Among other
things, the Alpha+ and Alpha++ algorithms have been developed [17]. Alpha+
can also handle short loops [20]. Alpha++ is the most advanced development [17].
2.4.2 HeuristicsMiner
The HeuristicsMiner also uses the causal dependencies, like the α algorithm [50].
Based on this, the HeuristicsMiner considers the frequencies of the traces. Rare
paths should not be included in the model [4].
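First, the directly-follows frequencies for all possible activity combinations are
counted and can be displayed in a matrix. From these counts, the dependency measure
between an activity A and an activity B is calculated by subtracting the number of
times A directly follows B from the number of times B directly follows A and dividing
by the sum of these two numbers plus one. The result lies between -1 and 1; values
close to 1 indicate a strong dependency of B on A.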
Various threshold parameters can be used to modify the relevance of relationships
[50]. In this way, the HeuristicsMiner can deal with noise. In order to be able to deal
with short loops of lengths one and two, a separate frequency table is set up. Loops
of length one are loops in which an activity is repeated directly. This means that
it must be counted how often activity A occurs directly after activity A.
In loops of length two, two activities are repeated alternately. This needs special
treatment because the pair repeats.
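The calculation can be sketched as follows. The sketch uses the dependency measure as defined for the HeuristicsMiner, $(|A > B| - |B > A|) / (|A > B| + |B > A| + 1)$, where $|A > B|$ is the number of times B directly follows A; the additional 1 in the denominator damps the value for rarely observed pairs. For loops of length one, $|A > A| / (|A > A| + 1)$ is used. Function names and example traces are illustrative.

```python
from collections import Counter


def directly_follows_counts(traces):
    """Count how often each activity pair occurs in direct succession."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


def dependency(counts, a, b):
    """Dependency measure between a and b, in the open interval (-1, 1)."""
    if a == b:
        # loop of length one: a directly followed by itself
        loops = counts[(a, a)]
        return loops / (loops + 1)
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


if __name__ == "__main__":
    # nine traces support A before B, one trace (possibly noise) the opposite
    log = [["A", "B"]] * 9 + [["B", "A"]]
    counts = directly_follows_counts(log)
    print(dependency(counts, "A", "B"))
    print(dependency(counts, "B", "A"))
```

A threshold on this value then decides which dependencies are kept in the model, which is how infrequent orderings can be filtered out.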
The HeuristicsMiner can also recognise AND/XOR-split/join constructs, and no ex-
plicit modelling of non-observable tasks is done. For this purpose, a causal matrix
is generated that represents the input and output expressions. In the last step,
long-distance dependencies are mined, i.e. choices that depend on a decision in
another part of the model.
The HeuristicsMiner uses a representation similar to Causal nets [4]. In conclusion,
the HeuristicsMiner approach is robust due to the representational bias provided by
Causal nets and the usage of frequencies [4].
2.4.3 Inductive Miner
The idea behind the Inductive Miner is based on the recursive splitting of the event
log [4]. The approach is based on the use of process trees because they are sound
by construction.
Four different types of cuts can be made for the split: exclusive-choice cuts,
sequence cuts, parallel cuts, and redo-loop cuts. For an exclusive-choice cut, there
must be no direct succession relationship between the activities of the different
groups. For a sequence cut, the succession relationships between the activities of
different subsets must run in one direction only. For a parallel cut, there must be a
succession relationship in both directions between the activities of different subsets.
In addition, each subset must have a start and an end. In order to understand the
redo-loop cut, it is helpful to imagine the process tree. The left subtree is
called the do-part, and the right subtree is called the redo-part. If the redo-part is
executed, another do-part must follow. After the do-part, it is checked whether the
loop is executed again or the process continues to the end. The following preconditions
for the cut can be read from the formal description. If an element of the do-part has a
link to the redo-part, this element must be an end element. Analogously, if an element
of the redo-part has a link to the do-part, this element of the do-part must be a start
element. At the same time, all end elements must lead to the same element of
the redo-part. Furthermore, the elements of the redo-part must not have any
successor relationships to other subsets.
A directly-follows graph can be created for the event log. This graph is examined to
see which cut can be performed. A cut divides the event log into smaller sub-logs,
for each of which a directly-follows graph is created and a cut is performed again.
The procedure is recursive, and cuts are made until a sub-log consists of only one activity.
The process tree can then be represented with the four operators ($\times$, $\rightarrow$, $\wedge$, and $\circlearrowleft$).
The advantage of this structure is that a sound process model is always produced
and can easily be converted into other process visualisations such as Petri nets and
BPMN models. There is a formal guarantee, which ensures fitness. The Inductive
Miner can handle rare behaviour and large models. However, even the Inductive
Miner has problems. If the process tree contains duplicates and silent activities, the
algorithm may produce an underfitting model. Furthermore, it cannot handle loops
of a fixed length.
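The first step of this procedure, building the directly-follows graph and testing for an exclusive-choice cut, can be sketched as follows. An exclusive-choice cut exists when the activities fall into groups without any directly-follows edge between them, i.e. into several connected components of the graph. The sketch covers only this one cut type, omits the recursion, and uses illustrative function names.

```python
def directly_follows_graph(traces):
    """Edges (a, b) where b directly follows a in at least one trace."""
    return {(a, b) for trace in traces for a, b in zip(trace, trace[1:])}


def exclusive_choice_cut(traces):
    """Return the activity groups of an exclusive-choice cut, or None."""
    activities = {a for trace in traces for a in trace}
    # treat the directly-follows graph as undirected
    neighbours = {a: set() for a in activities}
    for a, b in directly_follows_graph(traces):
        neighbours[a].add(b)
        neighbours[b].add(a)

    groups, seen = [], set()
    for start in sorted(activities):
        if start in seen:
            continue
        # depth-first search collects one connected component
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        groups.append(component)

    # a cut needs at least two groups
    return groups if len(groups) > 1 else None


if __name__ == "__main__":
    print(exclusive_choice_cut([["A", "B"], ["C", "D"]]))
    print(exclusive_choice_cut([["A", "B"], ["B", "C"]]))
```

If no exclusive-choice cut is found, the other cut types would be tried on the same graph before the log is split.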
2.4.4 Fuzzy Miner
In contrast to the process discovery algorithms considered so far, the Fuzzy Miner
takes a different approach. It also handles unstructured ("spaghetti-like") data, with
which many algorithms have problems, and uses the metaphor of a roadmap [25]. It
does not try to map the behaviour found in the event log to typical process
design patterns but concentrates on a high-level mapping of the behaviour found
[24]. First, all events discovered in the event log are represented as activity nodes.
Their importance is expressed through unary significance. For each precedence
relation, a directed edge is added. Through various transformation methods,
the model can then be successively simplified with respect to certain aspects. First,
conflicts (loops) are resolved, and the edges are treated. Edges can either be
removed or clustered. Edges that do not correspond to the correct behaviour must be
discarded. Clusters are also formed for activities. Rare behaviour that is below a
certain threshold is clustered or abstracted.
2.4.5 Split Miner
The Split Miner is used to create a BPMN model, which requires five steps [8]. First,
a directly-follows graph (DFG) is constructed. The DFG is not filtered directly but
analysed to detect self-loops and short-loops. A self-loop, for example, is recog-
nised by the fact that it has an arc to itself. The loops are removed from the DFG
and only restored at the end to create the BPMN model. In the second step, con-
currency relations are detected, and the corresponding edges are removed. The
result after this step is a pruned DFG. In the next step, a special filter algorithm is
applied. The filter algorithm guarantees a sound process model with a minimum
number of edges and maximises fitness. In the last two steps, the split gateways and
join gateways are added to obtain a BPMN process model from the DFG. A split
gateway is recognised by the fact that it has one incoming edge and many outgo-
ing edges. A join gateway has many incoming edges and only one outgoing edge.
Finally, it can be emphasised that the BPMN model discovered by the Split Miner is
sound and does not contain deadlocks. The algorithm has a complexity of $O(n + m^4)$.
Compared to other discovery algorithms, it stands out as very powerful [7].
2.5 Process Mining Tools
The algorithms described in Section 2.4 are used in various process discovery tools.
This Section presents various process mining tools that have been used in this
thesis. Table 2.1 gives an overview of the process mining tools considered. The
selection takes different criteria into account, such as supported formats for the
event logs, algorithms, and different types of process visualisations.
| | Celonis Snap | Disco | ProM | Apromore | PM4Py |
|---|---|---|---|---|---|
| Supported formats for event logs | CSV/XLSX, Google Sheets, XES | various, e.g. CSV, XES, XLS/XLSX | CSV, XES, MXML | various, e.g. CSV, XES, BPMN, MXML | CSV, XES |
| Process visualisations | process map | process map | various, e.g. Petri net | process map, BPMN | process tree, Petri net, BPMN |
| Algorithm | confidential | Fuzzy Miner | various, e.g. α algorithm | Split Miner | α, α+, HeuristicsMiner, Inductive Miner |
| Further analysis | yes | yes | yes | yes | yes |
| Access | commercial | commercial | open-source | open-source | open-source |
Table 2.1: Overview of process mining tools used
2.5.1 Celonis Snap
Celonis Snap is a free cloud process mining platform from Celonis [14]. Not only is
it intended to be the accessible version of the Intelligent Business Cloud (IBC), but
Celonis also sees Snap as an entry-level tool for process mining. The Snap tool supports
process discovery by uploading event logs with standard file formats such as CSV
and XES and other file formats like Google Sheets. In addition, data can also be im-
ported from ERP systems such as SAP. The generated process model is visualised
as a process map. The algorithm used for discovery is confidential.* In addition to
discovery, further analysis can be carried out with a focus on business results [14].
For example, customer satisfaction and operating costs can be monitored, and risks
can be reduced. Conformance checking and process enhancement are also
supported. With conformance checking, Snap can automatically evaluate how well
the process conforms to the ideal path. New friction points can be identified and
corrected to improve the process.
*In response to an enquiry, the software vendor replied that it does not disclose the underlying
algorithm.
In event logs uploaded in the standard CSV file format, only the standard columns
case id, activity for the activity name and a timestamp eventtime can be used
in the Snap tool. Other columns in the event log can only be used for sorting, which
means that the selected column is used instead of the column for the activity name.
Since the event logs of the scenarios contain both a start timestamp and an end
timestamp, a suitable column has to be selected here. Since the start event and the
first activity have the same start timestamp, choosing the start timestamp distorts the
resulting process model. Their end timestamps differ. Therefore, the end
timestamp is used.
Figure 2.9 shows an excerpt from the discovered vaccination process model to give
an impression of the representation in Celonis. The complete model is included
in Appendix B. The discovered insurance process model is also available in
Appendix B.
In general, it is noticeable that the process model runs from top to bottom, and
the process start and end are marked with an additional symbol with a label. The
number of process instances is indicated both on the arrows and in the activities.
The number can also be roughly recognised by the colour highlighting. Paths and
activities that occur more frequently are shown in a darker shade of blue and with
thicker arrows. Paths and activities that occur less frequently are shown in a lighter
shade of blue. In the first scenario, the vaccination process shows that the two
waiting activities are not distinguishable. Therefore, they are analysed as a loop.
In the second scenario, the insurance process, it can be seen immediately after
the start that two sequence flows begin here. The left sequence flow shows the
standard process flow. The right one shows the sub-process that is started when
an application is rejected.
Figure 2.9: Excerpt of process model in Celonis
2.5.2 Disco
Disco is a process mining tool from Fluxicon for process discovery [23]. A licence
is required to use Disco, but employees and students of partner universities can
obtain it free of charge [18]. The tool specialises in process discovery and covers
everything from reading in an event log to performing various analyses and
process automation [19]. Event logs can be read in various formats, including
CSV, XES, MS Excel, and MXML files. The resulting process visualisation is shown
as a process map. The algorithm used is based on the fuzzy miner but has been
further developed. Further statistics can be created based on the process map.
Information about the number of instance variants or the frequency of activities
can be retrieved. In addition, statements about performance can be
derived. With the help of filters, the focus can be set, for example, on a specific
variant or the performance. By animating the process, the process flow can be
visualised over time.
When uploading a CSV file, one has the option of assigning the columns to the
types Case for the instance id, Activity for the activity name, Timestamp for the
process time, Resource for indicating who processed the task and Other for all
other columns. Analogous to Celonis Snap, only one column can be selected for
the timestamp. Due to the same problem, the end timestamp is selected, and the
start timestamp is set as Other.
Figure 2.10 shows an excerpt of the discovered vaccination process model to give
an impression of the representation in Disco. The complete model can be seen in
Appendix B. The discovered insurance process model is also included in Appendix
B.
Figure 2.10: Excerpt of process model in Disco
In the process models in the scenarios, a certain similarity to the previous process
visualisations of Celonis Snap can be seen. The process also runs from top to
bottom, and the process start and end are marked separately. However, there is no
textual representation here. Instead, there is a colour marking (start = green and
end = red). The blue colour gradations and thickness of the paths for the frequency
of the processed process instances can also be seen. The frequencies
are also shown on the arrows and in the activities. The loop in the vaccination process
described for Celonis Snap also occurs here. Furthermore, the direct splitting of the
sequence flows in the insurance process can also be seen here. Thus, the results
of Celonis Snap and Disco are very similar.
2.5.3 ProM
ProM is an open-source framework with many different algorithms that can be
used by installing the corresponding plugins [39]. The ProM framework is based on
packages that have been developed as plugins by various companies and
universities [49]. Event logs can be uploaded in various file formats such as CSV, XES
and XML.
By using different plugins, different process visualisations can be generated. These
include Petri nets, Heuristic nets and BPMN models. Several algorithms are avail-
able for this, including the αalgorithm and HeuristicsMiner. The plugins can be
used in various ways to make further analyses, such as conformance checking,
possible. It also offers conversion options, for example, to transform a Process Tree
into a Petri net.
If a CSV file is uploaded, it can be converted into the XES format. The first step
is to set the column separation and the character encoding used. In the next step,
the mapping to the standard is carried out. The columns corresponding to the case,
the event, and the start and end times are specified. The converted
file can now be used to execute various algorithms from the available plugins.
In summary, the tool is well suited to an academic environment for trying out different
algorithms and visualisations. For practical use, however, other tools are more
convenient to handle.
Earlier versions of ProM supported the generation of C-nets. However, the current
version of ProM uses the flexible HeuristicsMiner, which only supports Heuristic nets
and dependency graphs instead of C-nets [51, 11]. ProM is not used to discover
the process visualisations for the study because its visualisations are too unwieldy
for this purpose. Besides, enough visualisations have been created by the other tools.
2.5.4 Apromore
Apromore is an open-source online process mining platform [6]. For process dis-
covery, event logs in standard file formats and many other formats can be used.
Apromore supports exploratory data analysis by allowing the user to inspect, anal-
yse and visualise event logs interactively. Process maps and BPMN models are
supported as process visualisations. They allow users to analyse the directly-follows
relationships of activities [6]. Apromore uses the Split Miner to discover BPMN
models from event logs [7].
Not only can the "as-is" process be discovered, but other views can also be dis-
played to explore resources and roles. Further analyses can also be carried out, for
example, to determine the frequency and duration of activities or to create filters for
specific execution paths. In addition to discovery, animations of the various process
flows can also be viewed. In the area of redesigning, models can be mixed, and
similar models can be detected.
In Apromore, the event log can be uploaded in CSV or XES format. Besides the
case ID, many other types can be set for the columns. A distinction is made
between case attributes and event attributes. A case attribute has the same value for
every event of a case, whereas an event attribute can change during the execution of a case.
Because of these attributes, an ID and a name can be specified for the activity. In
addition, it is possible to specify both the start time and the end time of the activity.
Figure 2.11 shows an excerpt of the discovered insurance process model to give
an impression of the representation in Apromore. The complete model can be
found in Appendix B. The discovered vaccination process model is also included
in Appendix B.
Figure 2.11: Excerpt of process map in Apromore
When displaying the discovered models for the defined scenarios, it is noticeable
that the models’ activities are displayed horizontally from left to right. The start
event (green) and the end event (red) are highlighted in colour. The blue gradations
for the frequency of activities and paths from dark to light are also used here. The
frequency is additionally specified on the arrows and in the activities.
Interestingly, the BPMN process models are not represented in the block structure.
Finally, it is also noticeable that the wait is shown as a loop for the vaccination
process, and in the insurance process, the sub-process is shown directly parallel to
the standard path.
2.5.5 PM4Py
PM4Py is a process mining framework initiated by the Process Mining Group of the
Fraunhofer Institute for Applied Information Technology [21]. PM4Py chooses a new
approach. Instead of a graphical user interface, PM4Py provides process mining
libraries written in Python and, in addition, integrates state-of-the-art data science
libraries [10]. In [10], it is shown that graphical user interfaces impose limitations
when dealing with large experimental settings. One severe limitation with
commercial tools is that one cannot freely choose the desired discovery algorithm.
For process discovery, files can be imported as XES and CSV files. Different pro-
cess mining algorithms and process visualisations can be selected. For instance,
the HeuristicsMiner may be used. It creates a Heuristics net that contains the
activities and the relationships between them. The Heuristics net can then be converted
into a Petri net [20]. PM4Py provides factory methods to use the following dis-
covery algorithms: α,α+, HeuristicsMiner, and Inductive Miner [20]. Petri nets,
process trees and BPMN diagrams are possible as process visualisations. Since
the models can only be generated directly as image files (e.g. PNG), direct process
analysis is not possible, as it is in the previous tools. Therefore, analyses must
be implemented in Python itself. Analyses can be carried out, for example, on the
duration of cases, waiting times, and the elapsed time between different activities.
PM4Py also supports conformance checking. In addition, a simulation of Petri nets
is possible. Finally, both decision mining and social network mining are supported.
PM4Py is also used in the study. The different process models for both scenarios
that are discovered with the help of PM4Py can be seen in Appendix B.
In the process models discovered, it is noticeable that the process steps shown
run horizontally in Petri nets and BPMN models and from top to bottom in Heuristic
nets. Start events are shown with a green circle and end events with an orange
circle. The number of instances is only shown on the arrows and in the activities
in the Heuristic net. Here, a slight colour gradation of the activities can be seen. The
arrows remain the same. However, the number of instances is not displayed in Petri
nets and BPMN models.
Process discovery has been tried with the α algorithm, the HeuristicsMiner and the
Inductive Miner. Due to the loop in the processes, the α algorithm could not be
used. The models of the vaccination process obtained by the HeuristicsMiner and
the Inductive Miner look identical. In contrast, more differences are observed in
the insurance process. Here, the Petri net created by the Inductive Miner is more
general, allowing for many more different traces than initially intended. The algo-
rithm’s approach can explain this. The Petri net discovered by the HeuristicsMiner
corresponds to the desired scenario.
2.6 Eye-Tracking with Pupil Core
Eye-tracking is a helpful tool to detect and analyse eye movements [9]. Both the
points of gaze and the path of movement can be recorded [9]. The duration of the
gaze on a point is also recognised. The recorded data can then be visualised with
a suitable tool. The following sections explain the basic terms and how eye-tracking
works in general. The tool used in the study is then presented.
2.6.1 Functionality
When recording eye movements, not only the movements but also the points of
gaze can be recorded. The gaze points are called fixations and contain the points
at which the eye looks "stably" at an object [43]. A person’s eye is constantly mov-
ing, even if it appears stable [9]. A fixation typically lasts at least 100ms and can
thus be identified based on the viewpoints in a given area [41]. The duration of
fixation shows the time of paying attention to a specific object [9]. It may depend,
among other things, on whether the object is interesting or challenging to
understand. Saccades are the rapid jumps from one fixation to the next [59]. They
help the eye to piece together the points it sees [9]. A directed sequence of
gaze points is called a gaze path.
The visualisation of this path enables a representation of the order a person looked
at objects on a screen.
To be able to capture this data, an eye-tracker contains cameras that record the
eyes [9]. The camera uses a reflection on the cornea to identify the pupil [9]. There
are different technologies for how eye-tracking systems work [1]. A distinction is
made between bright and dark pupil technology, which can be seen in Figure 2.12.
Figure 2.12: Eye-tracking techniques, based on [45]
Bright pupil technology uses an infrared source to produce an effect similar to the
"red-eye" effect. The pupil appears as a white circle, which makes it detectable. The
technique used in this thesis is based on dark pupils. An infrared source is used to
illuminate everything except the pupil. This way, the image processing system can
identify the darkest round area as the pupil. What the dark pupil looks like can be
seen in Figure 2.13. The advantage of this technique is its more robust behaviour
under different light conditions.
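The idea behind the dark pupil technique can be illustrated with a toy example: all pixels below a brightness threshold are collected, and their centroid is taken as the pupil centre. This is only a sketch of the principle. The image, the threshold, and the function name are hypothetical, the roundness of the dark area is not checked, and the detector actually used by the eye-tracker is considerably more elaborate.

```python
# Hypothetical 5x5 greyscale image (0 = black, 255 = white) with a dark blob.
IMAGE = [
    [200, 200, 200, 200, 200],
    [200, 20, 20, 20, 200],
    [200, 20, 10, 20, 200],
    [200, 20, 20, 20, 200],
    [200, 200, 200, 200, 200],
]


def dark_pupil_centre(image, threshold):
    """Return the (row, col) centroid of all pixels darker than `threshold`."""
    dark = [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value < threshold
    ]
    if not dark:
        return None  # no sufficiently dark area found
    centre_row = sum(r for r, _ in dark) / len(dark)
    centre_col = sum(c for _, c in dark) / len(dark)
    return centre_row, centre_col
```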
Eye-tracking can be used to derive insights into comprehensibility [27]. A higher
number of fixations indicates a poorer arrangement of elements, requiring more
effort to explore. If the fixations are further apart, this indicates that a person is
conducting a targeted search. Longer fixations mean more effort to understand,
as more time and effort must be spent to understand the visual stimulus. Several
metrics can be used to examine fixations, saccades, and scan paths [59]. Metrics
based on fixations refer to either the number or the duration of fixations.
An overview of the metrics that have been used in studies is given in [59].
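The fixation-based metrics mentioned above can be computed with a few lines of code. The sketch below assumes that fixations are available as (x, y, duration) tuples in viewing order; the sample values and the metric names are hypothetical.

```python
from math import hypot

# Hypothetical fixations in viewing order: (x in px, y in px, duration in ms).
FIXATIONS = [(100, 100, 200), (400, 100, 300), (400, 500, 100)]


def fixation_metrics(fixations):
    """Compute count- and duration-based metrics plus the scan path length."""
    durations = [d for _, _, d in fixations]
    # Distance between consecutive fixations approximates the saccade amplitude.
    jumps = [
        hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(fixations, fixations[1:])
    ]
    return {
        "fixation_count": len(fixations),
        "total_fixation_duration_ms": sum(durations),
        "mean_fixation_duration_ms": sum(durations) / len(durations) if durations else 0.0,
        "scan_path_length_px": sum(jumps),
    }
```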
Figure 2.13: Eye-tracking: dark pupil
2.6.2 Pupil Core with Pupil Capture
The information about fixations, saccades, and gaze paths can be used to explore
the user experience [9]. For this work, the eye-tracking glasses Pupil Labs Core-
Headset have been used. Figure 2.14 shows the glasses. The glasses have two
eye cameras that capture the pupils and a scene camera to capture the wearer’s
field of view [31]. The glasses can be easily worn and connected to the computer
via a USB port.
An open-source software suite is required to use the glasses. It works on Windows
10 as well as on Mac OS and Linux. The software used is divided into two functional
areas: capturing and visualising. With the help of Pupil Capture, the gaze data can
be captured and processed in real-time [31]. The recordings are saved in folders
as mp4 files. Further data (e.g. the individual positions of the fixations) are saved
in CSV files. With the help of Pupil Player, this data can then be replayed and
visualised [31].
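The exported CSV data can be processed further with standard tools. The following sketch reads a fixation table and selects the fixations that fall into a given time window, for example the period in which one process model was shown. The column names and the sample rows are assumptions and may differ from the actual export of the software version used.

```python
import csv
import io

# Hypothetical export; the column names are assumed, not taken from the thesis.
SAMPLE_CSV = """id,start_timestamp,duration,norm_pos_x,norm_pos_y
1,10.00,150.0,0.20,0.50
2,10.40,320.0,0.45,0.52
3,25.10,210.0,0.80,0.30
"""


def load_fixations(text):
    """Parse the CSV text into a list of dictionaries with numeric values."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "id": int(row["id"]),
            "start": float(row["start_timestamp"]),
            "duration_ms": float(row["duration"]),
            "x": float(row["norm_pos_x"]),
            "y": float(row["norm_pos_y"]),
        }
        for row in reader
    ]


def fixations_between(fixations, t_start, t_end):
    """Keep the fixations that start within the half-open window [t_start, t_end)."""
    return [f for f in fixations if t_start <= f["start"] < t_end]
```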
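Figure 2.14: Pupil Labs Core-Headset
According to [40], "Pupil Core’s fixation detectors implement a dispersion-based
method." Dispersion describes how far apart the gaze points within a time window
lie. It is derived from the angles between the recorded pupil positions, so that gaze
points lying close together in one area are grouped into a fixation.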
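A dispersion-based method can be sketched as follows. The code implements the classic dispersion-threshold idea on screen coordinates, where dispersion is the sum of the horizontal and vertical extent of a window of gaze samples. This is a simplification chosen for illustration and not Pupil Core's implementation, which works with angles. The sample data, the thresholds, and the function names are hypothetical.

```python
# Hypothetical gaze samples: (time in ms, x in px, y in px), sampled every 25 ms.
SAMPLES = [
    (0, 100, 100), (25, 102, 101), (50, 101, 99), (75, 103, 100),
    (100, 102, 102), (125, 101, 100),
    (150, 300, 250),  # sample recorded during a saccade
    (175, 500, 400), (200, 501, 402), (225, 499, 401), (250, 500, 399),
    (275, 502, 400),
]


def dispersion(window):
    """Horizontal plus vertical extent of the samples in the window."""
    xs = [x for _, x, _ in window]
    ys = [y for _, _, y in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, max_dispersion, min_duration_ms):
    """Return fixations as (start time, duration, centre x, centre y)."""
    fixations = []
    start, n = 0, len(samples)
    while start < n:
        # Grow the initial window until it spans the minimum duration.
        end = start
        while end < n and samples[end][0] - samples[start][0] < min_duration_ms:
            end += 1
        if end >= n:
            break  # not enough samples left for another fixation
        if dispersion(samples[start:end + 1]) <= max_dispersion:
            # Extend the window while the samples stay close together.
            while end + 1 < n and dispersion(samples[start:end + 2]) <= max_dispersion:
                end += 1
            window = samples[start:end + 1]
            cx = sum(x for _, x, _ in window) / len(window)
            cy = sum(y for _, _, y in window) / len(window)
            fixations.append((window[0][0], window[-1][0] - window[0][0], cx, cy))
            start = end + 1
        else:
            start += 1  # drop the first sample and try again
    return fixations
```

With a minimum duration of 100 ms, in line with the typical lower bound mentioned in Section 2.6.1, the sample data yields two fixations separated by one saccade sample.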
3 Related Work
In this thesis, process models discovered by process mining will be analysed for
comprehensibility. This chapter looks at various works that have also dealt with the
analysis of comprehensibility.
The basics of process mining have already been presented in Section 2.3. Process
mining aims to identify and improve processes. The findings are based on the
Process Mining Manifesto, which presents the principles of process mining [48].
Various algorithms and tools have already been presented in Sections 2.4 and 2.5.
Comprehensibility plays a significant role in computer science. Many types of text
need to be understood. The first thing that comes to mind here is the source code.
However, documents like specifications also need to be understood. These form
the basis of the contract between a client and a contractor.
What is meant by comprehensibility is not easy to define [32]. Psychological
research into comprehensibility goes back to the 1970s and considers four
dimensions of text comprehensibility: linguistic simplicity, semantic redundancy,
structure and order, and motivational stimulation [15]. For linguistic simplicity, a
distinction is made between word difficulty and sentence difficulty. The results of
empirical investigations of word difficulty can be summarised as follows: familiar
words can be processed more quickly.
In addition, a high technical word density leads to a slowing down of reading, and
the text then becomes more difficult for the reader to understand. Empirical stud-
ies on the difficulty of sentences have identified several factors that affect compre-
hensibility. These include embedded overlong sentences with several sub-clauses,
sentences with much information and syntactically ambiguous sentences. Less re-
search has been done on the second dimension, semantic redundancy. According
to [32], it can be assumed that a repetition of essential text aspects can increase
comprehensibility. In the dimension of structure and order, the structuring of texts
in terms of content is considered. It has been shown that texts should follow a
hierarchical-sequential structure and that the level of abstraction should gradually
become more
detailed. For text comprehension, a relationship must be established between sen-
tences and text sections. There are signal words, such as "therefore" and "leads to",
which establish content-related references between sentences and thus increase
comprehension. Motivational stimulation looks at how curiosity and interest can be
used to create a motivating text. Motivating elements should be used sparingly,
since enriching a text with interesting but unimportant details does not improve
comprehension and even harms overall retention. Thus, motivational stimulation
has no direct effect on the comprehension of a text but can keep attention high.
Although a large number of empirical studies have already been carried out on the
four dimensions, the research has reached its limits [15, 32]. There is no single
optimum for all readers. Subjective factors such as prior knowledge, expectations,
and interests influence text comprehension. However, a text can be adapted to a
specific target group to increase comprehensibility for them.
In computer science, analysis and design models based on (semi-)formal languages
like UML are often used because they are more precise than natural language. For
UML modelling, some research has already been done on the comprehensibility of
these models [38, 60].
In [60], features of UML class diagrams are identified by which a task is most effec-
tively supported. These include features such as layout and colour. The participants
of the study have to perform various tasks to understand the class diagram. An eye-
tracker is used for analysis purposes, which helps make navigation observations in
UML class diagrams. For example, novices explore the diagram from top to bottom
and from left to right, while experts explore from the middle of the diagram
outwards. The authors also observe a positive effect of additional information
(such as the use of colour): even if a participant cannot answer a question
correctly, the answer is still close to the correct one thanks to this information.
Finally, it is noticeable that similar visually presented notations (such as aggregation
relationships and generalisation) can reduce comprehension.
Porras examines the comprehensibility of UML diagrams and compares them with
other representations [38]. An empirical study is conducted on the understanding
of design patterns. Participants have to complete various tasks required for under-
standing design patterns while wearing an eye-tracker that records their eye move-
ments. The experiment can determine which representations are most efficient and
most likely to confuse the participants. The authors find that specific notations
are more suitable for certain tasks than others. They therefore argue that tools
should support different notations to accommodate all tasks.
Methodologically, eye-tracking is often used to investigate comprehensibility [27].
Eye movements can be analysed with the help of an eye-tracker, allowing interest-
ing points or difficult-to-understand diagram elements to be identified. The basics
of eye-tracking and metrics on how to use it have already been presented in Sec-
tion 2.6.
In the field of process visualisations, some studies on comprehensibility have
also been conducted with the help of eye-tracking. One of the first eye-tracking
studies to measure user satisfaction in business process modelling [29] investigates,
among other things, the comprehensibility of two different ways of presenting
EPCs (eEPC and oEPC) in an eye-tracking experiment. The participants are asked
to evaluate the modelling languages subjectively and perform tasks such as mod-
elling a process or detecting errors. Comprehension, completeness, and ease of
use are identified as the essential requirements for process modelling languages.
The study confirms that eye-tracking is a suitable method for measuring and assess-
ing user-related requirements concerning user satisfaction in process modelling.
In [58], the understanding of business process models is looked at from a different
perspective, as a task perspective is introduced. In experiments on the
comprehensibility of process models, comprehension is often examined using
comprehension questions. A study is conducted in which participants are shown a
BPMN model and asked to answer true-or-false comprehension questions. The eye
movements are recorded with an eye-tracker. Petrusel finds a correlation between
how a respondent inspects the process model and the quality of the answers. His
analysis highlights the predictive power of two independent variables: the number
of fixations and the time it takes a participant to fixate on a model element.
In [53] and [55] experiments are conducted in which the influence of expertise on
process model understanding is investigated. In the first experiment, this is done by
using process models modelled with BPMN [53]. In the second experiment, EPCs
are used, and a comparison of the two papers is made [55]. The design of the
studies is analogous. The participants are each shown three process models of
different difficulty levels. In the model on the first level, only basic elements of the
modelling language are used, and as the difficulty increases, more elements and
more tasks are added. The participants are divided into two groups depending on
their experience with process models. The participants are asked to understand the
models and then (when the model could no longer be seen) to answer true-or-false
comprehension questions. During the experiment, eye movements are recorded
using an eye-tracker.
In both experiments, it is noticeable that performance drops off as the difficulty of
the process models increases. In [55] it is also noticeable that the performance of
beginners and advanced learners converges with increasing difficulty. Overall, the
performances of this second study are better than those of the first, which leads
the authors to assume that EPCs are easier to understand than BPMN models.
However, this still needs to be investigated through further research. The procedure
in which participants look at process models while their eye movements are recorded
with the eye-tracker and then answer true-or-false comprehension questions is
adopted for this thesis.
In [54] a study is presented in which the comprehensibility of process models in four
different modelling languages (BPMN, eGantt, EPC, and Petri net) is investigated
with the help of an eye-tracker. The participants are presented with 12 different
process models (three models for each modelling language). The models from one
modelling language are divided into three levels of difficulty, which means that more
elements are added to the basic elements of the modelling language with increas-
ing difficulty. For each process model seen, the participants have to answer true-
or-false comprehension questions without looking at the model again. Performance
drops with increasing difficulty, and participants take longer to answer. In addition,
Zimoch et al. gain further insights, which are divided into nine lessons. First,
they describe that the choice of the process scenario influences the cognitive load.
Thus, the load is lower if the person is familiar with the scenario. Otherwise, the per-
son first has to get an overview and understand the scenario. This thesis uses two
scenarios (vaccination process and insurance process), paying attention to possible
familiarity. In times of the Corona pandemic, the process in vaccination centres is
familiar to many, and this scenario is not expected to be a significant burden. Taking
out insurance is also commonplace but might not yet be so familiar depending on
the participants’ experience. The second lesson shows that process models with
simple and clear information are easier to understand. In the experiment, models
are looked at element by element during the first viewing, and different strategies
are used for further understanding. Here it is interesting to see if this is also
observed in the study conducted in this thesis. It is noted that further work needs to
investigate the extent to which the structuring of a process model (e.g. block
structure) influences its comprehensibility. In this thesis, process models are shown
both horizontally and vertically. It will be interesting to see if this leads to a
difference in performance. In the next lesson, the results of the comprehension
questions are compared using the different difficulty levels and modelling language.
Here, for the modelling languages BPMN, EPC, and Petri nets, a decrease in the
answer score with increasing difficulty is observed in each case. The difficulty of the
insurance process is greater than that of the vaccination process. Therefore, in this
thesis, it is expected that the response score for process models of the insurance
process will be lower than for the vaccination process. It is described that partici-
pants initially search for the start of a model. Therefore, those process models with
an explicit start and end symbol are easier to understand. Since the start events of
the generated process models in this thesis are either on top (for vertical process
models) or on the left (for horizontal process models), it is expected that the start
is found quickly. In the next lesson, it can be seen that identifying, for example,
XOR and AND gateways is very challenging. Only XOR gateways are used in
this thesis, so participants are expected to recognise which kind is meant from the
semantics.
In the study conducted, it is noticed that, contrary to expectations, more experienced
participants are not significantly more efficient than less experienced participants.
In summary, performance in both groups drops equally as the difficulty of the model
increases. It remains to be seen whether an observation regarding performance
differences between different modelling languages can be made in this thesis.
This chapter has presented work that investigates the comprehensibility of process
models. Different visualisation types (including EPC and BPMN) are compared,
and both simple and more complex models are considered. The experiments
are supported by the use of an eye-tracker. The structure of the investigation in this
thesis is similar, as different process modelling types are also compared here with
the help of an eye-tracker. However, in previous studies, the models are created by
hand. In contrast, this thesis will examine models that have been generated. It will
be interesting to see if this reveals any differences from the results obtained so far.
4 User Study
The previous chapters show that process models generated by process mining are
essential for analysing and improving business processes. Therefore, these process
models must be well understood. A user study, which is described in this chapter,
is conducted to investigate the comprehensibility of the generated process models.
First, the context of the study is described. Then follows the experimental setting
with the research question. Next, the hypotheses to be tested in this experiment are
presented. These are followed by the experimental set-up, which describes the par-
ticipants, the dependent and independent variables and the materials used. Finally,
the mode of operation and data validation are presented.
4.1 Context of Experiment
In recent years, many research results concerning the comprehensibility of process
models have been published [53, 29, 58]. Research to date has focused on differ-
ent process modelling languages, models of varying complexity and the influence
of user experience (see Chapter 3). Research methods based on eye-tracking are
used to investigate the cognitive processes involved in understanding process mod-
els. Concepts from cognitive psychology are used to determine the cognitive load
and the level of acceptability for different process visualisations. Research in the
area of cognitive psychology suggests that our working memory is limited and that
instructional methods should avoid overloading the memory to maximise learning
[44]. This thesis uses these concepts from previous research and focuses on the
new aspect of generated process models. The goal is to critically compare manually
created and generated process models as well as different generated process
visualisations.
The procedure of the experiment is based on the typical steps as presented in
[52]. According to Wohlin et al., the first task is scoping to clarify what is being
investigated. Then the experiment is planned and defined. The planning includes
determining the environment in which the experiment is conducted, defining the
hypotheses, and the dependent and independent variables. In the operation step,
the experiment is prepared, the participants are committed, and the experiment is
carried out. The raw data is then analysed and interpreted. Finally, the results can
be presented in a report.
4.2 Experimental Setting
The user study investigates the comprehensibility and cognitive load of understand-
ing process models generated by process mining. So far, it is unclear whether
the type of process creation (whether the process model is modelled by hand or
generated from an event log) influences the comprehension of the process model
and which visualisation type is more comprehensible concerning generated pro-
cess models. The following research question formulates this approach, which has
not yet been investigated. More specific sub-questions complement the research
question.
Research Question
How comprehensible are process models generated by process mining?
Sub-questions:
Are the manually created process models preferred over the generated
models?
Which representation of the generated process models is considered
as the most comprehensible?
What factors influence the participants in naming the most and least
comprehensible process model?
An eye-tracking experiment is conducted to test the research question. Accord-
ing to [52] an experiment is defined as "an empirical enquiry that manipulates one
factor or variable of the studied setting". More specifically, it is a human-oriented
experiment in which people randomly perform various treatments to objects in a
laboratory setting [52]. Eye-tracking has proven to be a useful method for studying
comprehensibility and is also used for this experiment. Eye-tracking can detect eye
movements and thus show how participants view the process model to understand
or learn it. From the collected eye-tracking data, it is possible to determine, among
other things, the number of fixations, which forms the basis for the subsequent
analysis of the experiment. These measured values are then used to formulate
hypotheses and test them.
4.3 Hypothesis Formulation
The following hypotheses can be derived based on the research questions. The
hypotheses are defined according to the scenarios.
Hypothesis 1 (Vaccination Process)
H1: The manually created process model and the generated models are similarly
understandable.
H0,1: Viewing the manually created process model takes the same amount of
time as viewing the generated process models.
H1,1: Viewing the manually created process model takes significantly less or
more time than viewing the generated process models.
H0,2: Viewing the manually created process model requires the same number
of fixations as viewing the generated process models.
H1,2: When looking at the manually created process model, significantly fewer
or more fixations are measured than when looking at the generated process
models.
H0,3: The same number of questions are answered correctly in the manually
created process model as in the generated process models.
H1,3: Significantly fewer or more questions are answered correctly in the man-
ually created process model than in the generated process models.
H0,4: Answering the questions takes the same amount of time with the manu-
ally created process model as with the generated process models.
H1,4: Answering the questions takes significantly longer or shorter with the
manually created process model than with the generated process models.
Hypothesis 2 (Vaccination Process)
H2: The generated process models are similarly comprehensible.
H0,1: Viewing each generated process model takes the same amount of time.
H1,1: Viewing one generated process model takes significantly less time than
viewing the other generated process models.
H0,2: Viewing the generated process models requires the same number of
fixations.
H1,2: The number of fixations is significantly lower for one generated process
model than for the other generated process models.
H0,3: The number of correctly answered questions is the same for each pro-
cess model.
H1,3: The number of correctly answered questions is significantly higher for
one process model than for the other generated process models.
H0,4: The processing time is the same for all generated process models.
H1,4: The processing time for one generated process model is significantly
less than for the other generated process models.
The two hypotheses above are formulated for the vaccination process because it is a
simple and sequential process. Therefore, no significant differences are expected
between the manually created process model and the generated process models
or between the generated process models.
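Null hypotheses of the form "the value is the same for all process models" are typically tested with an analysis of variance (ANOVA). The following sketch computes the F statistic of a one-way ANOVA for hypothetical viewing times. It treats the groups as independent, which is a simplification: since each participant views several models, a repeated-measures design may be more appropriate. The data and the function name are illustrative assumptions.

```python
# Hypothetical viewing times in seconds for three generated process models.
VIEWING_TIMES = [
    [41.0, 42.0, 43.0],
    [42.0, 43.0, 44.0],
    [43.0, 44.0, 45.0],
]


def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within
```

The null hypothesis is rejected if the F value exceeds the critical value of the F distribution for the given degrees of freedom and significance level.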
Hypothesis 3 (Insurance Process)
H3: The manually created process model is easier to understand than the gener-
ated process models.
H0,1: Viewing the manually created process model takes less time than
viewing the generated process models.
H1,1: Viewing the manually created process model does not take signif-
icantly less time than viewing the generated process models.
H0,2: Fewer fixations are measured when viewing the manually created
process model than when viewing the generated process models.
H1,2: When looking at the manually created process model, not significantly
fewer fixations are measured than when looking at the generated
process models.
H0,3: The number of correctly answered questions is higher for the man-
ually created process model than for the generated process models.
H1,3: The number of correctly answered questions is not significantly
higher for the manually created process model than for the generated
process models.
H0,4: The time required to answer the questions is less for the manually
created process model than for the generated process models.
H1,4: The time needed to answer the questions is not significantly less
for the manually created process model than for the generated process
models.
Hypothesis 4 (Insurance Process)
H4: The generated process models are similarly comprehensible.
H0,1: Viewing each generated process model takes the same amount of time.
H1,1: Viewing one generated process model takes significantly less time than
viewing the other generated process models.
H0,2: Viewing the generated process models requires the same number of
fixations.
H1,2: The number of fixations is significantly lower for one generated process
model than for the other generated process models.
H0,3: The number of correctly answered questions is the same for each pro-
cess model.
H1,3: The number of correctly answered questions is significantly higher for
one process model than for the other generated process models.
H0,4: The processing time is the same for all generated process models.
H1,4: The processing time for one generated process model is significantly
less than for the other generated process models.
The third hypothesis is based on the assumption that most of the participants are
familiar with the notation of BPMN. A known notation could lead to a better un-
derstanding of the process. It is also expected that there will be problems in un-
derstanding the generated process models, as the branching for the sub-process
might be unclear. The fourth hypothesis examines the different generated process
models of the insurance process for possible differences.
4.4 Experimental Set-up
An eye-tracking experiment is conducted to test the various hypotheses. In this
section, the further planning of the experiment is discussed, and the participants
and objects, as well as the dependent and independent variables, are presented.
Finally, the procedure is described, and the materials required are mentioned.
4.4.1 Participants
Due to the current Corona pandemic, the acquisition of participants is complex, as
the study is conducted locally. Therefore, students, former students and doctoral
students of the University of Ulm are invited as participants. Participation is vol-
untary, and participants are assured that anonymity is guaranteed based on data
protection. When invited, participants are informed that this experiment uses an
eye-tracker, and the comprehensibility of process models is examined. Therefore, it
is pointed out that some experience with process models should be available. The
participants should know what a process model is and what it looks like. However, a
great deal of experience with process modelling is not required.
4.4.2 Objects
Each participant is shown a total of eight process models, four models from each
scenario (see generated process models in Appendix B). Throughout the experi-
ment, the participant wears an eye-tracker that records eye movements. The par-
ticipant is asked to understand the process models semantically and take as much
time as needed. Once the participant feels that the process model is understood,
the screen is to be clicked with the mouse. The process model disappears, and the
participant is asked to answer three true-or-false comprehension questions (see Ap-
pendix C). Based on the answers to these questions, it is to be determined whether
the process model is interpreted correctly.
4.4.3 Independent and Dependent Variables
The experiment considers different performance indicators. The corresponding re-
search model for the user study conducted is described in Figure 4.1.
Figure 4.1: Research model for user study
The independent variable process visualisation is shown in the box on the left side.
The study compares the process visualisations mentioned here and examines their
comprehensibility. On the right side the boxes show the dependent variables that
are examined for each process model. These are further described below.
Performance
While the participants are performing the study, a recording is made using the eye-
tracker. Thus, the timestamps can be used to determine the duration that is needed
by the participants for understanding the process model. The response time is
recorded the same way. Fixations are gaze points the participant focuses on. They
are also collected with the help of the eye-tracker. For each process model, three
true-or-false comprehension questions are asked (see Appendix C). The score is
derived from the correctly answered questions.
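How the performance indicators follow from the recorded timestamps and answers can be sketched as follows. The parameter names and the sample values are hypothetical, and timestamps are assumed to be given in seconds.

```python
def performance(shown_at, clicked_at, answered_at, answers, solution):
    """Derive viewing time, response time, and score for one process model."""
    return {
        # Time from showing the model until the participant clicks it away.
        "viewing_time_s": clicked_at - shown_at,
        # Time from the click until the last question is answered.
        "response_time_s": answered_at - clicked_at,
        # Number of true-or-false questions answered correctly.
        "score": sum(a == s for a, s in zip(answers, solution)),
    }


# Hypothetical recording for one process model with three questions.
EXAMPLE = performance(12.0, 57.5, 80.0, [True, False, True], [True, True, True])
```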
Level of Acceptability
With the help of the level of acceptability, statements can be made about how well
the participants accept a process model. For this purpose, a total of eight questions
are asked on a 5-point Likert scale from strongly disagree (i.e., 0) to strongly agree
(i.e., 5) (see Appendix C). The first four questions examine perceived usefulness
for understandability (PUU), and the other four questions examine perceived ease
of understandability (PEU).
Cognitive Load
Cognitive load describes the required capacities of working memory during a task.
Here, a distinction is made between the three categories intrinsic (ICL), extraneous
(ECL), and germane cognitive load (GCL), which are additive [36]. The intrinsic
cognitive load is determined by the interactivity of the elements of the learning
material. The required load can be influenced by the number of elements. By omitting
some interactive elements, for example, the intrinsic cognitive load can be reduced.
Extraneous cognitive load is the load imposed by the way the material is presented.
Depending on how the information is shown to the learner, the required load can
be reduced. Since the two categories are additive, it is essential to reduce the ex-
traneous cognitive load when the intrinsic cognitive load is very high [36]. The last
category is the germane cognitive load, which, like the extraneous load, depends
on the mode of presentation. The difference, however, is that a germane cognitive
load enhances learning. Among other things, it leads to schema acquisition and it
is therefore crucial for learning.
There are seven questions which the participants have to answer with respect to
cognitive load. The first two questions are used to analyse intrinsic cognitive load,
the following three questions are used to determine extraneous cognitive load, and
the last two questions are used to assess germane cognitive load. The answers
may be given according to a 5-point Likert scale from strongly disagree (i.e., 0) to
strongly agree (i.e., 5) (see Appendix C).
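The assignment of questionnaire items to subscales described for the level of acceptability (PUU, PEU) and the cognitive load (ICL, ECL, GCL) can be expressed compactly. In the sketch below, the aggregation by arithmetic mean is an assumption, as are the function name and the sample answers.

```python
from statistics import mean

# Item positions (0-based) per subscale, following the order of the questions.
ACCEPTABILITY_ITEMS = {"PUU": range(0, 4), "PEU": range(4, 8)}
COGNITIVE_LOAD_ITEMS = {"ICL": range(0, 2), "ECL": range(2, 5), "GCL": range(5, 7)}


def subscale_means(answers, layout):
    """Average the Likert answers of each subscale (aggregation is assumed)."""
    expected = max(items.stop for items in layout.values())
    if len(answers) != expected:
        raise ValueError(f"expected {expected} answers, got {len(answers)}")
    return {name: mean(answers[i] for i in items) for name, items in layout.items()}


# Hypothetical answers of one participant for one process model.
ACCEPTABILITY = subscale_means([4, 5, 4, 3, 2, 3, 2, 1], ACCEPTABILITY_ITEMS)
COGNITIVE_LOAD = subscale_means([3, 5, 1, 2, 3, 4, 4], COGNITIVE_LOAD_ITEMS)
```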
Ranking
After answering the questions about a specific scenario, the participants are asked
to sort the process models they have seen according to their subjective compre-
hensibility and then give a reason for the ranking (why the chosen process model is
in first or last place).
4.4.4 Experimental Design
Despite the current corona situation, the experiment is carried out on-site at the
University of Ulm. Special attention is paid to appropriate hygiene measures in
order not to endanger the participants.
The participants are seated on a chair in front of a 24” monitor, on which both the
questionnaire and later the process models are displayed. Furthermore, a laptop
with a keyboard and a mouse is used to answer the questions and navigate the
questionnaires. The set-up can be seen in Figure 4.2.
In order to obtain comparable results, a fixed procedure is followed. Figure 4.3
shows the procedure of the study for one participant.
First, demographic data on the participant (age and gender) are collected. Then
the participant is asked about their previous knowledge: they are asked to list the
process modelling languages they know and to indicate, in days, the time they have
spent on process modelling or looking at process models. Then five
true-or-false knowledge questions (see Appendix C) are asked to assess the level
of experience. The eye-tracker is then calibrated with a screen-based five-point
calibration, in which the participant is shown five points in succession.
Then the participant is shown various process models in full-screen mode, which
the participant is supposed to try to understand. The participant is asked to proceed
as quickly as possible but also as accurately as possible. For each process model,
three true-or-false questions then appear to determine whether the process model
is interpreted correctly (see the questions in Appendix C).
Figure 4.2: Set-up for experiment
Figure 4.3: Procedure for each participant
In addition, for each process model the participant answers questions on the level
of acceptability and on the cognitive load. The questionnaires used can be found
in Appendix C. The participant is shown a total of eight process models. Process
models are presented first for the vaccination process and then for the insurance
process. In each scenario, the manually created process model is presented first,
followed by a selection of three generated process models in random order. A
permutation table is created to select which process models are shown to a
participant (see Table 4.1). Finally, after each scenario, the participant is asked
to rank the process models seen according to their comprehensibility and to justify
the choice.
Permutation 1 Permutation 2 Permutation 3 Permutation 4
Apromore (BPMN) x x
Celonis Snap x x
PM4Py (Petri net) x x
PM4Py (BPMN) x x
PM4Py (HeuNet) x x
Disco x x
Table 4.1: Permutations for process model selection
For the two scenarios, different permutations for selecting process models are used
so that each participant views each type of process visualisation. Permutations
1 and 2 are used together, as are permutations 3 and 4.
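The pairing of permutations can be illustrated with a short Python sketch. It is hypothetical in several respects: the concrete split of the six generated models across the permutations is invented for illustration and does not reproduce Table 4.1, and it assumes that paired permutations are complementary (each contains exactly the three models the other leaves out) and that the first permutation of a pair is used for the vaccination scenario.

```python
TOOLS = [
    "Apromore (BPMN)", "Celonis Snap", "PM4Py (Petri net)",
    "PM4Py (BPMN)", "PM4Py (HeuNet)", "Disco",
]

# HYPOTHETICAL split: which three models belong to which permutation is
# invented here and does not reproduce Table 4.1.
PERMUTATIONS = {
    1: {"Apromore (BPMN)", "PM4Py (Petri net)", "PM4Py (HeuNet)"},
    3: {"Apromore (BPMN)", "Celonis Snap", "PM4Py (BPMN)"},
}
# Assumption: the partner permutation holds exactly the remaining models.
PERMUTATIONS[2] = set(TOOLS) - PERMUTATIONS[1]
PERMUTATIONS[4] = set(TOOLS) - PERMUTATIONS[3]
PARTNER = {1: 2, 2: 1, 3: 4, 4: 3}


def models_for_participant(vaccination_permutation):
    """Generated models shown per scenario for one questionnaire variant."""
    vaccination = PERMUTATIONS[vaccination_permutation]
    insurance = PERMUTATIONS[PARTNER[vaccination_permutation]]
    return {"vaccination": sorted(vaccination), "insurance": sorted(insurance)}


for variant in (1, 2, 3, 4):
    shown = models_for_participant(variant)
    print(variant, shown["vaccination"], shown["insurance"])
```

Under these assumptions, every questionnaire variant shows three generated models per scenario, and the two scenarios together cover all six visualisation types.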
4.4.5 Instruments
Various materials are needed for the experiment described. Firstly, the eye-tracker
is needed to record the eye movements. For this purpose, the Pupil Labs Core
headset is used in the experiment, which was already introduced in Section 2.6.
With the help of the Pupil Player, the recordings can be visualised, exported and
then used for analysis. Figure 4.4 shows a paused frame from
such an exported recording. The figure shows the screen with one process model
to be viewed. At the top left, the detected pupils are displayed. In this case, they
are very well matched. In addition, the current gaze point is highlighted
in colour.
In addition, questionnaires are needed to collect the demographic data and retrieve
Figure 4.4: Eye-tracking recording
the answers to the comprehension questions and to the questions on the level of
acceptability and cognitive load.
All materials are combined in one digital questionnaire created with EFS
(Enterprise Feedback Suite) Survey by Questback [57] and are thus presented to the
participant in a coherent way. The process models, which represent the central
materials, are integrated into the questionnaire. In this way, the participant is
guided through the entire study by simply clicking through the questionnaire.
The process models have already been outlined in Section 2.5, and Appendix B
contains the generated process models. For the study, a selection of process
models is made to reduce the effort for participants.
The selection of the process models can be justified as follows. Based on the
research questions and hypotheses, the manually created process models must be
included. The process maps of Celonis Snap and Disco are included in the
selection. Since the process models created with ProM are not easy to display on a
full screen, they are omitted: to obtain comparable results, participants should
not have to scroll within a window. In addition, the visualisation types created
with ProM are already covered by other tools, so they would not have added any
value to the study.
Regarding Apromore, only the BPMN diagram is considered and not the process map,
which is almost identical to the BPMN diagram. Due to the relevance of BPMN, BPMN
diagrams should be included in the study.
Concerning PM4Py, all process models generated with the HeuristicsMiner are
considered. The process models generated with the Inductive Miner are partly
identical to those generated with the HeuristicsMiner; the remaining ones are
identified as unsuitable because they are strongly generalised, so that more
execution paths are possible than the scenario specifies. As
this is not intended, these process models are not considered.
Finally, the IBM software SPSS Statistics (version 27) [30] is used for the statistical
analyses. SPSS allows the results to be checked for significance.
4.5 Operation and Data Validation
A total of fifteen participants have been recruited for the study. Three of them are
female, and twelve are male. The average age of the participants is 28, with the
youngest born in 1996 and the oldest born in 1977.
The study is conducted on several days at the University of Ulm, always in the same
room.
After providing their demographic data, the participants are asked to name all
process modelling languages they already know. Fourteen participants name BPMN,
which makes it the most frequently named language.
Six participants name EPC, and five participants each name UML and Petri nets.
Other process modelling languages, such as eGantt, flowcharts, ADEPT and some
languages from artefact-centred modelling, are mentioned rarely (by at most three
participants each). Only one participant cannot
think of any known process modelling language.
The participants' expertise varies greatly. When asked how much time they have
spent on process modelling or looking at process models so far, the average is
105.5 days. However, the standard deviation is around 250 days, because some
participants report only 4 to 5 days of experience, while others report
extensive experience of 100 or even 1000 days.
The demographic data, the details of the known process modelling languages and
the experience in days can be found in Appendix D, Table D.1.
In the answers to the knowledge questions, the differences in expertise are not so
clearly visible. On average, 3.4 out of 5 questions are answered correctly, and the
standard deviation is 0.95. How the participants answer the knowledge questions
can be found in Appendix D, Table D.2.
All participants who started the study completed it; no one dropped out.
Questionnaire variants 1, 2 and 3 are each completed four
times and variant 4 three times. A study session takes between 20 and 45 minutes.
All data sets are used to evaluate the comprehension questions (i.e., duration of
answering and number of correct answers). For the eye-tracking data, one data set
is partially excluded for the duration of the observation of the process models.
In addition, one data set is excluded for the number of fixations. The exclusions
are due to problems with the recording or a faulty calibration.
5 Study Evaluation
After presenting the study conducted in the previous Chapter, this Chapter focuses
on the study results. First, a descriptive evaluation is carried out, followed by
further analyses, in particular to identify statistically significant effects.
Subsequently, the results are interpreted, and the limitations of the study are
pointed out in a critical reflection.
Finally, the central results of the user study are summarised.
5.1 Descriptive Analysis
To evaluate the experiments conducted, the measured values are first aggregated
into averages, which makes comparisons easier. For each process model (see column
ID), Table 5.2 shows the average duration for viewing the process model (see column
Dur., in ms), the average number of fixations (see column Fix.), the average
duration for answering the comprehension questions (see column Resp. Time, in ms),
the average score for correctly answered comprehension questions (see column
Score), the perceived usefulness for understandability (see column PUU), the
perceived ease of understandability (see column PEU), and the intrinsic (see column
ICL), extraneous (see column ECL), and germane (see column GCL) cognitive load.
Each process model is assigned a unique process ID, which is listed in Table 5.1.
The duration for viewing the process models and the response times, as well as the
number of fixations, are obtained from the eye-tracker recordings. All other data in
the table are exported from the questionnaires.
ID  Process Model
1   manually created process model (vaccination process)
2   BPMN diagram generated by Apromore (vaccination process)
3   process model generated by Celonis (vaccination process)
4   process model generated by Disco (vaccination process)
5   BPMN diagram generated by PM4Py (vaccination process)
6   Heuristic net generated by PM4Py (vaccination process)
7   Petri net generated by PM4Py (vaccination process)
8   manually created process model (insurance process)
9   process model generated by Apromore (insurance process)
10  process model generated by Celonis (insurance process)
11  process model generated by Disco (insurance process)
12  BPMN diagram generated by PM4Py (insurance process)
13  Heuristic net generated by PM4Py (insurance process)
14  Petri net generated by PM4Py (insurance process)
Table 5.1: Assignment of process model IDs
The process models are viewed for an average of 52.89 seconds with a standard
deviation of 29.70 seconds, and 226 fixations are required with a standard deviation
of 122 fixations. The participants take an average of 22.40 seconds with a standard
deviation of 10.80 seconds to answer the comprehension questions for the process
models. On average, they answer 2.3 out of 3 questions correctly with a standard
deviation of 0.74. This suggests that the process models can be understood
intuitively. The perceived usefulness for understandability is 13.95 (max. is 20)
with a standard deviation of 3.887, and the perceived ease of understandability is
14.31 (max. is 20) with a standard deviation of 3.891. This indicates that the
participants do not accept all visualisations. The best performing process models
have IDs 1 and 8, which are the manually created process models. The worst
performing process model is the process model with ID 14, which is the Petri net
generated by PM4Py. As far as cognitive load is concerned, intrinsic cognitive load
scores 2.88 (min. is 1, max. is 5) with a standard deviation of 1.117 and
extraneous cognitive load scores 3.47 (min. is 1, max. is 5) with a standard
deviation of 0.978. This indicates that the working memory load is at a moderate
level. The germane cognitive load value of 2.65 (min. is 1, max. is 5) with a
standard deviation of 1.018 indicates that there is not much learning in the tasks,
as few new mental schemata have to be constructed.
The data collected and summarised in Table 5.2 are detailed in Appendix D,
Tables D.3 to D.9.
ID  Dur. (ms)  Fix.  Resp. Time (ms)  Score  PUU  PEU  ICL  ECL  GCL
1   32584.71   138   21848.67         2.33   17   18   2    3    2
2   42289.57   174   33114            3      15   15   3    3    2
3   62896.88   295   27066.5          2.38   14   14   3    4    3
4   30594.43   122   14451.14         1.86   14   16   3    3    2
5   37504.88   169   16525.5          2      15   15   2    3    2
6   28971.86   135   16349.29         2      12   14   3    3    3
7   34586.25   156   18227.38         2.75   14   14   2    3    2
8   82968.47   331   21270.93         2.6    16   16   3    4    2
9   82350.5    348   18363.25         2      12   13   3    4    3
10  56923.86   191   22899            1.71   13   13   4    4    3
11  64203.75   294   27914.63         2.25   13   13   3    4    3
12  61836.57   244   22311.14         2.57   12   13   3    4    3
13  57028.38   277   28385.5          2.38   12   12   3    4    3
14  48604.14   215   26556.71         2      10   10   3    4    3
Table 5.2: Measured results (average values) for each process model
In addition, after answering the questions about a specific scenario, the participants
are asked to sort the four process models they have seen according to their com-
prehensibility and then give reasons for the ranking (why the chosen process model
comes first or last).
The process model with ID 1 (the manually created process model) is ranked as the
most comprehensible process model for the vaccination process. The reasons given
are the already familiar process modelling language BPMN, the horizontal
orientation of the process model and the absence of frequency numbers. The
participants disagree on which process model is the most difficult to understand.
The process models with ID 1, ID 3 (process map generated by Celonis), ID 6
(heuristic net generated by PM4Py) and ID 7 (Petri net generated by PM4Py) are
chosen most often (i.e. three times each). The reason given several times for the
process model with ID 1 is that the boundary event has to be known; the process
model is also perceived as complex, and one participant finds it rather confusing.
The vertical arrangement and the numbers are given several times as reasons for the
process model with ID 3. Participants also find this process model difficult to
understand because much information is shown at once. In the case of the process
model with ID 6, the frequency numbers, the choice of colours and the lack of
gateways are mentioned. In the process model with ID 7, the black block, i.e. the
element without obvious meaning, confuses the participants. In addition, one
participant states that
the numbers are missing to correctly represent the cycle between the briefing and
the waiting (as this does not mean that only one briefing may take place, as it is
presented in the other process models).
The process model with ID 8 (the manually created process model) is ranked as
the most comprehensible process model for the insurance process. The reasons given
are the already familiar process modelling language BPMN, the absence of frequency
numbers, the labelled paths and the fact that the participants are represented. In
addition, the choice of colours is mentioned, as is the fact that the rejection
is shown more clearly as being executed later. The process model with ID 14
(Petri net generated by PM4Py) is ranked as the most difficult to understand. One
factor is that not all participants identify it as a Petri net, because the
semantics of some elements are unclear (especially the kind of split elements and
the meaningless black box). In addition, the process model contains few elements
and little information, and it is not easy to follow the main path.
5.2 Data Analysis and Interpretation
In this Section, further analysis is presented to identify significant differences in
the results. First, the vaccination process is compared with the insurance process.
Then a variance analysis is carried out for both scenarios. Finally, the results of the
correlation analysis are presented.
5.2.1 Analysis of Process Scenarios
The thesis has assumed that the vaccination process is a more understandable
process than the insurance process. To verify this statistically, a Mann-Whitney
U test for two independent samples is performed for each parameter. The
significance level is set to p < .05.
The results of the Mann-Whitney U test can be seen in Table 5.3. The U and z
values are given for each parameter, followed by the significance value p and the
effect size r.
Dur.:        U = 669.000   z = -5.852  p < .001  r = 0.536
Fix.:        U = 640.000   z = -5.400  p < .001  r = 0.510
Resp. Time:  U = 1363.000  z = -2.294  p = .022  r = 0.209
Score:       U = 1656.000  z = -.827   p = .408  r = 0.075
PUU:         U = 9.500     z = -1.951  p = .051  r = 0.521
PEU:         U = 5.500     z = -2.472  p = .013  r = 0.661
ICL:         U = 12.000    z = -2.015  p = .044  r = 0.538
ECL:         U = 3.500     z = -3.122  p = .002  r = 0.834
GCL:         U = 10.500    z = -2.082  p = .073  r = 0.556
Table 5.3: Results of Mann-Whitney U test for each parameter
Some significant differences are found between the two groups for the different
variables. Therefore, the r-value is also calculated. r is the correlation
coefficient used as the measure of effect size; it is obtained from the z value and
the number of observations N as r = |z| / √N. For the number of correctly
answered comprehension questions, only a small effect size and no significance is
found. The participants can answer the questions on both scenarios equally well.
For the response time, significance can be determined, and a medium effect size can
be found. For the PUU and the GCL, no significance but a large effect size can
be identified. There is a significance or even high significance and a large effect
size for all other variables seen in the table. There are highly significant
differences between the groups, especially concerning process observation time, the
number of fixations, and ECL. The detailed reports of the analysis can be found in
Appendix D. Based on the results, it can be confirmed that the insurance process is
more complicated to understand than the vaccination process.
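As an illustration of how such values arise, the following Python sketch recomputes the U statistic and the effect size r = |z| / √N for the questionnaire variables. It rests on an assumption that is not stated in the thesis: that these tests are run on the 14 rounded model-level averages from Table 5.2 (IDs 1 to 7 against IDs 8 to 14, so N = 14). Under this assumption, the sketch yields the U and r values listed in Table 5.3 for PUU, PEU, ICL, ECL and GCL. The function names are hypothetical, and the z values are copied from Table 5.3 rather than recomputed.

```python
from math import sqrt


def mann_whitney_u(x, y):
    """Smaller of the two U statistics; a tie counts as half a win."""
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    return min(u_x, len(x) * len(y) - u_x)


def effect_size_r(z, n):
    """Effect size r = |z| / sqrt(N)."""
    return abs(z) / sqrt(n)


# Rounded model-level averages from Table 5.2:
# first list = vaccination (IDs 1-7), second list = insurance (IDs 8-14).
TABLE_5_2 = {
    "PUU": ([17, 15, 14, 14, 15, 12, 14], [16, 12, 13, 13, 12, 12, 10]),
    "PEU": ([18, 15, 14, 16, 15, 14, 14], [16, 13, 13, 13, 13, 12, 10]),
    "ICL": ([2, 3, 3, 3, 2, 3, 2], [3, 3, 4, 3, 3, 3, 3]),
    "ECL": ([3, 3, 4, 3, 3, 3, 3], [4, 4, 4, 4, 4, 4, 4]),
    "GCL": ([2, 2, 3, 2, 2, 3, 2], [2, 3, 3, 3, 3, 3, 3]),
}
# z values as reported in Table 5.3 (not recomputed here).
Z_VALUES = {"PUU": -1.951, "PEU": -2.472, "ICL": -2.015, "ECL": -3.122, "GCL": -2.082}

results = {}
for name, (vaccination, insurance) in TABLE_5_2.items():
    n = len(vaccination) + len(insurance)  # assumed N = 14
    results[name] = (mann_whitney_u(vaccination, insurance),
                     effect_size_r(Z_VALUES[name], n))
    print(f"{name}: U = {results[name][0]:.3f}, r = {results[name][1]:.3f}")
```

Because z is only reported to three decimals, a recomputed r can differ from the table in the last digit (for example for ICL).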
5.2.2 Analysis of Vaccination Process
First, a detailed analysis of the vaccination process is performed. A one-way
analysis of variance (ANOVA) is carried out to determine the extent to which the
process models can be compared with each other. The significance level is set at
p < .05. A function C is introduced, which acts like a compare function:
C(i, j) denotes the comparison of the process models with IDs i and j.
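The set of pairwise comparisons C(i, j) can be enumerated with a short Python sketch. The sketch also shows a Bonferroni-style adjustment; note that the thesis does not state which post hoc procedure is used, so the adjustment is an assumption. It is included because capping adjusted p values at 1 would be consistent with many reported values of exactly 1.000. The function names and the raw p values are hypothetical.

```python
from itertools import combinations


def comparisons(model_ids):
    """All unordered pairs C(i, j) with i < j."""
    return list(combinations(sorted(model_ids), 2))


def bonferroni(p_values, n_comparisons):
    """Multiply each raw p value by the number of comparisons, capped at 1."""
    return [min(1.0, p * n_comparisons) for p in p_values]


# Seven process models per scenario (IDs 1-7 for the vaccination process).
pairs = comparisons(range(1, 8))
print(len(pairs))   # number of pairwise comparisons
print(pairs[:6])    # comparisons involving the manually created model (ID 1)

# Hypothetical raw p values, for illustration only.
raw = [0.004, 0.03, 0.2]
adjusted = bonferroni(raw, len(pairs))
print(adjusted)
```

With seven models there are 21 pairs: six involve the manually created model, and the remaining fifteen compare generated models with each other.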
The manually created process model is compared with the different generated pro-
cess models (see Table 5.4). No significance can be found here.
However, when comparing process models 1 and 3, it is noticeable that the value
for fixations is only just above the defined significance value. A reduced value can
also be found for the duration of how long the process models are viewed.
Compare   Sig. Duration  Sig. Fixations  Sig. Resp. Time  Sig. Score
C(1, 2)   1.000          1.000           1.000            1.000
C(1, 3)   .537           .060            1.000            1.000
C(1, 4)   1.000          1.000           1.000            1.000
C(1, 5)   1.000          1.000           1.000            1.000
C(1, 6)   1.000          1.000           1.000            1.000
C(1, 7)   1.000          1.000           1.000            1.000
Table 5.4: Comparison between the manually created process model and
the different generated process models (p values)
It can be seen from the values that there are no significant differences between the
manually created process model and the generated process models. This suggests
that the process models are similarly easy to understand, and the hypothesis
that the models are similarly understandable is thus supported.
Looking at the comparison between the different generated process models, no
significant differences can be found either (see Table 5.5).
However, there are reduced values for the response times and the number of correct
answers when comparing process model 2 with process models 4, 5, 6 and 7. When
looking at how long participants look at the process models and how many fixations
are needed, decreased values can be found when comparing process model 3 with
process models 4, 6 and 7.
The analysis shows that the comprehensibility of the process models is comparable,
as no significant differences are found. Therefore, the second hypothesis can also
be confirmed.
Compare   Sig. Duration  Sig. Fixations  Sig. Resp. Time  Sig. Score
C(2, 3)   1.000          1.000           1.000            1.000
C(2, 4)   1.000          1.000           .078             .254
C(2, 5)   1.000          1.000           .194             .607
C(2, 6)   1.000          1.000           .239             .779
C(2, 7)   1.000          1.000           .514             1.000
C(3, 4)   1.000          .180            1.000            1.000
C(3, 5)   1.000          1.000           1.000            1.000
C(3, 6)   .748           .376            1.000            1.000
C(3, 7)   1.000          .607            1.000            1.000
C(4, 5)   1.000          1.000           1.000            1.000
C(4, 6)   1.000          1.000           1.000            1.000
C(4, 7)   1.000          1.000           1.000            1.000
C(5, 6)   1.000          1.000           1.000            1.000
C(5, 7)   1.000          1.000           1.000            1.000
C(6, 7)   1.000          1.000           1.000            1.000
Table 5.5: Comparison between the different generated process models (p values)
5.2.3 Analysis of Insurance Process
Next, a detailed analysis of the insurance process is performed. Here, too,
a one-way analysis of variance (ANOVA) is carried out to determine the extent to
which the process models can be compared with each other. The significance level
is set at p < .05. A function C is introduced, which acts like a compare function to
specify which two process models are compared.
In Table 5.6 the manually created process model is compared with the generated
process models.
Compare    Sig. Duration  Sig. Fixations  Sig. Resp. Time  Sig. Score
C(8, 9)    1.000          1.000           1.000            1.000
C(8, 10)   1.000          .505            1.000            .601
C(8, 11)   1.000          1.000           1.000            1.000
C(8, 12)   1.000          1.000           1.000            1.000
C(8, 13)   1.000          1.000           1.000            1.000
C(8, 14)   .237           1.000           1.000            1.000
Table 5.6: Comparison between the manually created process model and
the different generated process models (p values)
As with the vaccination process, no significance can be found here. Reduced values
can only be found in the comparison of process model 8 with models 10 and 14.
The analysis shows that, contrary to expectations, the generated process models
are comparable to the manually created process model in terms of comprehensibil-
ity. The analysis, therefore, confirms the counter-hypothesis.
Finally, the generated process models are compared with each other (see Ta-
ble 5.7).
Compare     Sig. Duration  Sig. Fixations  Sig. Resp. Time  Sig. Score
C(9, 10)    1.000          .576            1.000            1.000
C(9, 11)    1.000          1.000           1.000            1.000
C(9, 12)    1.000          1.000           1.000            1.000
C(9, 13)    1.000          1.000           1.000            1.000
C(9, 14)    .779           1.000           1.000            1.000
C(10, 11)   1.000          1.000           1.000            1.000
C(10, 12)   1.000          1.000           1.000            1.000
C(10, 13)   1.000          1.000           1.000            1.000
C(10, 14)   1.000          1.000           1.000            1.000
C(11, 12)   1.000          1.000           1.000            1.000
C(11, 13)   1.000          1.000           1.000            1.000
C(11, 14)   1.000          1.000           1.000            1.000
C(12, 13)   1.000          1.000           1.000            1.000
C(12, 14)   1.000          1.000           1.000            1.000
C(13, 14)   1.000          1.000           1.000            1.000
Table 5.7: Comparison of the different generated process models (p values)
Here, too, no significance and hardly any reduced values can be found. These only
occur when comparing process models 9 with 10 and 9 with 14. Therefore, the
fourth hypothesis can be confirmed.
5.2.4 Correlation Analysis
After conducting the variance analysis, a correlation test according to Bravais-
Pearson [46] is carried out to test possible correlations between the different vari-
ables. Since all data are metric variables, this is possible.
The following Table 5.8 shows the correlations between the individual variables. The
significance value is set at 0.05 (indicated with *), and the high significance value is
set at 0.01 (indicated with **).
            Exp.     Dur.     Fix.     Resp. Time  Score   PUU      PEU      ICL      ECL      GCL
Exp.        1        -.332**  -.315**  -.304**     -.216*  -.168    -.160    .031     -.457**  -.003
Dur.        -.332**  1        .906**   .162        .201*   -.015    -.099    .196*    .451**   .213*
Fix.        -.315**  .906**   1        .162        .171    -.068    -.129    .203*    .390**   .208*
Resp. Time  -.304**  .162     .162     1           .042    .080     .057     .050     .325**   -.010
Score       -.216*   .201*    .171     .042        1       .055     .071     -.132    .035     -.017
PUU         -.168    -.015    -.068    .080        .055    1        .920**   -.358**  -.258**  -.670**
PEU         -.160    -.099    -.129    .057        .071    .920**   1        -.433**  -.312**  -.675**
ICL         .031     .196*    .203*    .050        -.132   -.358**  -.433**  1        .498**   .417**
ECL         -.457**  .451**   .390**   .325**      .035    -.258**  -.312**  .498**   1        .460**
GCL         -.003    .213*    .208*    -.010       -.017   -.670**  -.675**  .417**   .460**   1
Table 5.8: Results of the Pearson correlation for each parameter
Several significant correlations can be identified. Firstly, the longer the
participants look at the process model, the more fixations are needed. Furthermore,
the longer the participants look at the process model, the significantly higher the
cognitive load (ICL, ECL and GCL alike). The cognitive load is also significantly
higher when more fixations are needed, which also follows logically from the first
two statements.
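The Bravais-Pearson coefficient itself can be sketched in plain Python. The example below applies it to the 14 model-level averages for viewing duration and fixations from Table 5.2. This is only an illustration: the coefficients in Table 5.8 are presumably computed on the individual observations, so the value obtained here is not expected to equal the reported .906. The function names are hypothetical; `t_statistic` shows the usual test statistic with n - 2 degrees of freedom on which the significance marks are based.

```python
from math import sqrt


def pearson_r(x, y):
    """Bravais-Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def t_statistic(r, n):
    """Test statistic for H0: no correlation (n - 2 degrees of freedom)."""
    return r * sqrt((n - 2) / (1 - r * r))


# Model-level averages from Table 5.2 (IDs 1-14).
DURATION_MS = [32584.71, 42289.57, 62896.88, 30594.43, 37504.88, 28971.86,
               34586.25, 82968.47, 82350.5, 56923.86, 64203.75, 61836.57,
               57028.38, 48604.14]
FIXATIONS = [138, 174, 295, 122, 169, 135, 156, 331, 348, 191, 294, 244, 277, 215]

r = pearson_r(DURATION_MS, FIXATIONS)
print(f"r = {r:.3f}, t = {t_statistic(r, len(FIXATIONS)):.2f}")
```

Even at the level of model averages, the relationship between viewing duration and number of fixations is strongly positive.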
The highly significant correlations among the intrinsic, extraneous and germane
cognitive load are attributed to the additivity of these three.
A more interesting result is that the longer the process models are viewed, the
higher the number of correctly answered comprehension questions. This means that it
is worthwhile for the participants to look closely at the process model to be able
to answer the questions. It also suggests that the process models or the questions
are, in general, too challenging for all the information to be grasped in a short
viewing or for all the questions to be answered correctly.
The higher the intrinsic cognitive load, the significantly longer the participants
look at the process models and the more fixations they require.
The longer the response time, the higher the extraneous cognitive load reported
for understanding the process models. This may indicate that the process
model is more difficult to understand or that the comprehension questions are too
complicated. This could be further investigated in future experiments.
A highly significant correlation between the level of acceptability and the cognitive
load can also be measured. The higher the level of acceptability for the process
model, the lower the cognitive load (ICL, ECL and GCL alike). Conversely, it can be
concluded that process models that are not considered useful or comprehensible
due to the way of representation also require a higher cognitive load.
Some correlations can be found by looking at the experience of the participants.
Other studies have already shown that experience has an influence on process
understanding [53, 55]. The influence can also be confirmed here. Firstly, the
higher the experience, the lower the time needed to look at the process models, the
number of fixations and the response time required. This suggests that participants
with more experience need less time than those with less experience. Conversely,
participants with little experience require more time to understand the process
models. Of particular interest is the highly significant correlation between
experience and extraneous cognitive load. The higher the experience of the
participants, the lower the extraneous load. This suggests that the way the
materials are presented can be improved for participants with low experience.
Finally, a significant relationship can be found between experience and the
number of comprehension questions answered correctly. It is found that the higher
the experience, the lower the number of correct answers. The reason could be that
the more experienced participants overestimate themselves and therefore do not
look closely enough at the individual parts of the process models. A longer time is
necessary to read all the information from the process model that is relevant for the
comprehension questions.
Lastly, the analysis turns to the ranking. Here, too, a correlation test
according to Bravais-Pearson is carried out. No significant correlation is found
with the variables for expertise, duration of observation, number of fixations,
response time and number of correctly answered questions. In contrast, highly
significant correlations are found with the level of acceptability and the
cognitive load, which are shown in Table 5.9.
         PUU      PEU      ICL     ECL     GCL
Ranking  -.578**  -.568**  .271**  .282**  .515**
Table 5.9: Results of the Pearson correlation between ranking and level of
acceptability and cognitive load
Apart from the correlations, the following can be observed when the ranking is
compared with the number of correctly answered comprehension questions. The
manually created
process model for the vaccination process is rated as the most comprehensible
process model. However, it does not yield more correctly answered questions. All
participants who have seen the process model generated by Apromore answer all
questions correctly. Nevertheless, they do not select it as the most
comprehensible process model. The fewest correct answers are given for the process
model generated by Disco, with an average of 1.86 correct answers. The ranking does
not identify a single process model as the least comprehensible, but Disco's
process model is not among the candidates.
The manually created process model for the insurance process is chosen as the
most understandable process model in the ranking; with 2.6 out of 3 correct
answers, it is only slightly ahead. Almost all process models achieve a value of
at least 2 here. The only process model with fewer correct answers is the process
model from Celonis, with an average of 1.7 correct answers. Nevertheless, it is
the Petri net from PM4Py, not the Celonis model, that is rated as the least
comprehensible process model.
The analysis of the experiments conducted confirms the results of other studies but
also reveals new approaches. In some cases, the number of participants is too
small to obtain significant observations. Therefore, an important conclusion is that
an experiment with a larger number of participants is required to examine these
aspects again in more detail.
5.3 Limitations
In a critical review of the study, limiting factors that might have influenced the
study results are identified. First of all, the study conducted is an
exploratory experiment. The objective has not been to verify previous results but to
gain new insights. A limiting factor is the number of participants, which prevents
drawing statistically significant conclusions for all hypotheses.
The study is based on the specification of two scenarios. The selection of the sce-
narios can be seen critically because it has not been checked beforehand whether
the participants are familiar with them. However, care has been taken to choose
scenarios that are as generally known as possible so that this aspect should not
have greatly affected the study results.
Another limitation is that the first process model of the insurance scenario is created
manually. The process model might not be representative, as insurance processes
in the real world are more complex than the process model used here. If a new
study is conducted in this area, attention should be paid to this. The process mod-
els should be adapted according to reality.
The next limitation relates to the comprehension questions. It should be noted
that the process model cannot be accessed again during the answer phase. This means
that the participant has to remember everything. Thus, the questions do not test
comprehensibility alone, because the participant could also have forgotten the
relevant aspect. In addition, the questions are slightly different for each process
model, so individual questions could be easier or more difficult than others.
The fact that the process models have to be memorised additionally influences the
cognitive load. Therefore, the measured cognitive load cannot be understood as a
pure load of the process model.
The generated process models are exported from their corresponding tools and
displayed only as an image. This is a substantial limitation, as no tool support
could be used this way (for example, sliders for abstracting the case frequency and
other filters), which might have influenced the understanding. The use of
eye-tracking glasses can also be seen as a limitation. Not all of the participants
have experience wearing such glasses, which can negatively influence them.
Another aspect may be emerging fatigue, as the participants have to sit still with as
little movement as possible during the entire experiment while concentrating on the
different models and questions.
In addition, the participants' experience can be identified as a limitation among the
person-related characteristics. Experience is unevenly distributed across the participants,
which could have affected the statistics. During the study, there is no assistance to
support the participants with less experience (for example, an explanation of the items
or the numbers). Finally, the hygiene measures in place due to the COVID-19 pandemic
(wearing a face mask) might have influenced the participants.
5.4 Results of User Study
Despite various limitations, some insights are gained from the study. It should be
noted that this is an exploratory study and that further studies are needed to extend
the findings. Various tests are carried out to identify statistically significant effects
in the results of the study. The first step shows that there are significant differences
between the vaccination process and the insurance process. As expected, the
vaccination process is easier to understand, whereas the insurance process is more
challenging with regard to comprehensibility.
The next step is to look more closely at the vaccination process, where no significant
differences can be found between the different process visualisations. Thus,
null hypotheses 1 and 2 cannot be rejected. Next, the insurance process is examined
in more detail. Contrary to hypothesis 3, no significant difference between
the manually created process model and the generated process models can be
discovered. More pronounced differences between the process models were expected,
as the information presented differs greatly between the generated process models.
For example, the Petri net, which has low information content, is rated poorly several
times, whereas some participants find the Celonis process model overloaded
with information. It should be noted, however, that Celonis offers filters and sliders
in its tool to better represent different aspects of the process model. This thesis
therefore does not cover all the possibilities of Celonis and the other tools. In the
future, it could be investigated how these filters affect the comprehensibility of
generated process models.
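As a hedged illustration of the kind of variance analysis referred to above, the following sketch computes the F statistic of a one-way ANOVA by hand. It uses the viewing durations of the variant-0 participants (P 0, 4, 8 and 12) for process models 1, 2 and 3 from Table D.3. The simple between-groups design and the helper name `one_way_anova_f` are assumptions made for illustration only; the analysis in this thesis covers all participants and may rely on a different ANOVA variant.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F statistic of a one-way (between-groups) ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (process models)
    n = len(all_values)    # total number of observations
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Viewing durations from Table D.3 (participants 0, 4, 8 and 12)
durations = {
    "PM 1": [27512, 20600, 34904, 42859],
    "PM 2": [35552, 28518, 41184, 48954],
    "PM 3": [58571, 31967, 49111, 51208],
}

f_value = one_way_anova_f(list(durations.values()))
print(f"F = {f_value:.2f}")
```

The F value on its own does not state significance; it would still have to be compared with the critical value of the F distribution for (k - 1, n - k) degrees of freedom.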
The results of the correlation test are interesting because some significant dependencies
are identified. The experience of participants correlates with the viewing time of
process models, the number of fixations, the response time and the extraneous load.
The correlation of experience with the understanding of process models has already been
investigated in other studies and also plays a significant role for generated process
models. Another correlation exists between the level of acceptability and the cognitive
load: the higher the level of acceptability of the process model, the lower the cognitive
load (ICL, ECL and GCL alike). It can be concluded that, when generating process models,
attention must also be paid to the level of acceptability in order to reduce the cognitive
load when looking at the resulting process models. This finding can be of interest to
tool developers when choosing which process visualisation to use for process mining.
6 Conclusion and Future Work
In this concluding chapter, the thesis is summarised and an outlook on future work
is given.
6.1 Conclusion
The objective of this thesis is to investigate the comprehensibility of process models
generated by process mining. The quality of process models has a decisive influ-
ence on the analyses carried out by companies.
In order to investigate the comprehensibility of different generated process models,
an exploratory eye-tracking study is conducted. With the help of an eye-tracker,
gaze points can be measured, which can be an indicator of the comprehensibility of
the models. The user study is carried out to answer the research question of how
comprehensible the generated process models are.
First, a vaccination process and an insurance process are defined and created as
manual process models. Event logs are then generated for these. A Python script
is written to create an event log for the vaccination process. The insurance process
is modelled in Camunda as an executable process. A Python script is written to
extract the history in Camunda and create the event log. Both event logs can then
be used for process mining, more specifically, process discovery. The two event
logs are fed into various process mining tools, and the resulting process models are
saved. A selection of these models is then tested for comprehensibility in the user
study. The user study is conducted with fifteen participants at Ulm University.
Each participant is shown a total of eight process models in succession and answers
three true-or-false comprehension questions for each of them. In addition, questions
regarding the level of acceptability and the cognitive load are answered for each
process model. After each scenario, the participant is also asked to rank the process
models seen.
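The extraction step for the insurance process can be sketched as follows. This is a minimal illustration and not the script used in this thesis: the record keys (`processInstanceId`, `activityName`, `startTime`, `endTime`) are assumed to follow the JSON that Camunda's history API returns for activity instances, the sample records and activity names are invented, and `history_to_event_log` is a hypothetical helper.

```python
import csv
import io


def history_to_event_log(records):
    """Convert history records into a semicolon-separated event log string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    writer.writerow(["case_id", "activity", "start_timestamp", "end_timestamp"])
    # Sort by case and start time so that each trace is written in order
    # (ISO 8601 timestamps sort correctly as plain strings).
    ordered = sorted(records, key=lambda r: (r["processInstanceId"], r["startTime"]))
    for record in ordered:
        writer.writerow([
            record["processInstanceId"],
            record["activityName"],
            record["startTime"],
            record["endTime"],
        ])
    return buffer.getvalue()


# Invented sample records (assumption: shape of the history JSON)
sample_history = [
    {"processInstanceId": "case-1", "activityName": "Check claim",
     "startTime": "2021-05-03T09:05:00", "endTime": "2021-05-03T09:20:00"},
    {"processInstanceId": "case-1", "activityName": "Register claim",
     "startTime": "2021-05-03T09:00:00", "endTime": "2021-05-03T09:05:00"},
]

event_log = history_to_event_log(sample_history)
print(event_log)
```

Writing the same four columns as the vaccination event log keeps both logs importable into the process mining tools in the same way.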
With the help of a Mann-Whitney U test, it can be shown that there are highly significant
differences between the scenarios. This justifies including both scenarios in the
study. The analysis of variance identifies no significant differences between the
process models of the vaccination process. However, a slight difference between the
manually created process model and the generated process model from Celonis can be
identified. Among the generated process models, minor differences are obtained when
comparing the process model of Celonis with the process model of Disco, the heuristic
net of PM4Py and the Petri net of PM4Py. No significant differences can be identified in
the analysis of variance for the insurance process either. When comparing the manually
created process model with the generated process models, a slight difference to the
process model of Celonis and the Petri net of PM4Py can be detected. When comparing the
generated process models with each other, slight differences are found between the
process model of Apromore and both the process model of Celonis and the Petri net of
PM4Py. These differences should be investigated in a future study.
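To make the scenario comparison more concrete, the following sketch computes the Mann-Whitney U statistic by counting pairwise wins. It is an illustration under stated assumptions: it uses only the viewing durations of participant 0 from Table D.3 (four vaccination models, four insurance models), whereas the test in this thesis is based on the complete data set, and `mann_whitney_u` is a hypothetical helper that reports no p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b); a tie counts as half a win for each sample."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # Both statistics always add up to the number of pairs
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Viewing durations of participant 0 from Table D.3
vaccination = [27512, 35552, 58571, 51354]
insurance = [91963, 92130, 102435, 71332]

u_vacc, u_ins = mann_whitney_u(vaccination, insurance)
print(u_vacc, u_ins)  # prints 0.0 16.0
```

A U value of 0 for the vaccination sample means that every vaccination model was viewed for a shorter time than every insurance model by this participant.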
Based on the correlation analysis, some correlations can be found between the
variables studied. The correlation between the time spent looking at the process
models and the number of correctly answered comprehension questions is of particular
interest. From this correlation, it can be concluded that understanding process models
requires a certain amount of time and that not enough information can be gathered
at a glance to answer all subsequent comprehension questions correctly. Some
highly significant correlations are identified concerning the reported experience of
the participants. The higher the experience, the shorter the process models are
viewed and the fewer fixations are needed. The response time also becomes shorter
with increasing experience, and the extraneous load decreases. An unexpected
significant correlation is found between experience and the number of correctly
answered comprehension questions: the higher the experience, the fewer correct
answers are given. A possible explanation is that participants with much experience
gain a basic understanding of the process models very quickly but, not knowing the
questions beforehand, do not memorise all the information accurately enough.
Using the Pearson correlation, it is finally shown that the subjective ranking of the
participants correlates highly significantly with the level of acceptability and the
cognitive load. Therefore, in process modelling, care must be taken to ensure that the
people who are meant to understand the model accept the way the process is visualised.
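The Pearson correlation coefficient used in this analysis can be computed as sketched below. The example is purely illustrative: it correlates the viewing durations (Table D.3) with the number of fixations (Table D.4) of participant 0 across the eight process models, which is not one of the correlations reported in this thesis, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(variance_x * variance_y)


# Participant 0, process models 1, 2, 3, 7, 8, 11, 12 and 13
viewing_time = [27512, 35552, 58571, 51354, 91963, 92130, 102435, 71332]
fixations = [123, 159, 196, 201, 171, 267, 265, 333]

r = pearson_r(viewing_time, fixations)
print(f"r = {r:.2f}")
```

Values of r close to 1 or -1 indicate a strong linear relationship; whether a coefficient is significant additionally depends on the sample size.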
Despite interesting results, further studies are needed, as the study is subject to
some limitations (particularly the small number of participants). The results can be
used as a basis for future studies that explore this field of research further.
6.2 Future Work
As some differences, though not statistically significant, could be identified between
the different process visualisations, this should be investigated again with more
participants. As the insurance process is simplified, an investigation with a larger
real-world scenario would be worthwhile. It would also be useful to conduct another
study with business analysts and non-computer scientists who work with larger process
models in their companies.
Concerning the path frequencies, it could be investigated how the use of colour affects
the understanding of the process models. Where the generated process models from
this study use colour, they rely on different shades of blue or turquoise, although
the numbers are still noted on each path. It would be interesting to investigate
whether colour alone is sufficient for comprehension and whether the numbers need to
be displayed only when more detailed information is required. Another aspect that
could be explored in future work is the influence of process orientation: depending
on the process visualisation language, the process models are displayed
either vertically or horizontally.
During the research for this thesis, it became apparent that many different process
modelling languages are used to represent generated models. BPMN is a standardised
notation. Therefore, the question arises why BPMN is not always used to represent
process models generated by process discovery. One could imagine that this
is due to the lack of support for responsibilities (pools and lanes) and the
lack of differentiation between activities and events. However, this question should
be investigated in future work. In addition, one could investigate to what extent the
missing aspects can be read from the event logs, i.e. how they could be represented
in the event logs and how this new log information could then be used to represent
process models.
72
Bibliography
[1] A. Mark Mento - Director of Business Development Bitbrain North Amer-
ica. This Is How Eye Tracking Technology Works. 2018. URL:
https://
www.bitbrain.com/blog/eye- tracking- technology
, last retrieved
08.09.2021.
[2] Wil MP Van der Aalst. “Business Process Management: A Comprehensive
Survey”. In: International Scholarly Research Notices 2013 (2013). Article ID
507984. DOI:
10.1155/2013/507984
.
[3] Wil MP Van der Aalst. “The Application of Petri Nets to Workflow Manage-
ment”. In: Journal of Circuits, Systems, and Computers 8.01 (1998), pp. 21–
66.
[4] Wil Van der Aalst. Process Mining: Data Science in Action. 2nd ed. 2016.
Springer, 2016. DOI:
10.1007/978-3-662-49851-4
.
[5] Wil Van der Aalst, Ton Weijters, and Laura Maruster. “Workflow Mining: Dis-
covering Process Models from Event Logs”. In: IEEE Transactions on Knowl-
edge and Data Engineering 16.9 (2004), pp. 1128–1142.
[6] Apromore Pty Ltd. The Finest Process Mining Experience. 2021. URL:
https:
//apromore.org/
, last retrieved 08.09.2021.
[7] Adriano Augusto et al. “Automated Discovery of Process Models from Event
Logs: Review and Benchmark”. In: IEEE Transactions on Knowledge and
Data Engineering 31.4 (2018), pp. 686–705.
[8] Adriano Augusto et al. “Split Miner: Discovering Accurate and Simple Busi-
ness Process Models from Event Logs”. In: 2017 IEEE International Confer-
ence on Data Mining (ICDM). IEEE. 2017, pp. 1–10.
[9] Jennifer Romano Bergstrom and Andrew Schall. Eye Tracking in User Expe-
rience Design. Elsevier, 2014.
73
Bibliography
[10] Alessandro Berti, Sebastiaan J van Zelst, and Wil van der Aalst. “Process
Mining for Python (PM4Py): Bridging the Gap Between Process-And Data
Science”. In: arXiv preprint arXiv:1905.06169 (2019).
[11] Marius Breitmayer. “Applying Process Mining Algorithms in the Context of
Data Collection Scenarios”. PhD thesis. Ulm University, 2018.
[12] Bundesministerium für Gesundheit and Robert Koch Institut. Empfehlungen
für die Organisation und Durchführung von Impfungen gegen SARS-CoV-
2 in Impfzentren und mit mobilen Teams. 2020. URL:
https://m.halle.
de/push.aspx?s=downloads/de/Verwaltung/Gesundheit/Corona-
Virus//Hinweise- zum- Impfen- 10401/2020- 12- 08_empfehlungen_
impfzentren.pdf
, last retrieved 08.09.2021.
[13] Camunda. Camunda BPM Platform. 2021. URL:
https://camunda.com/
,
last retrieved 08.09.2021.
[14] Celonis. Introducing the Celonis Snap Guide. 2021. URL:
https://www.
celonis.com/snap-guide
, last retrieved 08.09.2021.
[15] Ursula Christmann and Norbert Groeben. “Verständlichkeit: die psychologis-
che Perspektive”. In: Handbuch Barrierefreie Kommunikation. Berlin: Frank &
Timme (2019), pp. 123–145.
[16] Marlon Dumas et al. Fundamentals of Business Process Management. Vol. 1.
Springer, 2013.
[17] Yutika Amelia Effendi and Riyanarto Sarno. “Conformance Checking Evalua-
tion of Process Discovery Using Modified Alpha++ Miner Algorithm”. In: 2018
International Seminar on Application for Technology of Information and Com-
munication. IEEE. 2018, pp. 435–440.
[18] Fluxicon BV. Academic Initiative For Process Mining Research and Edu-
cation. 2021. URL:
https://fluxicon.com/academic/
, last retrieved
08.09.2021.
[19] Fluxicon BV. Discover Your Processes. 2021. URL:
https://fluxicon.com/
disco/
, last retrieved 08.09.2021.
[20] Fraunhofer Institute for Applied Information Technology. Process Discovery.
2021. URL:
https://pm4py.fit.fraunhofer.de/documentation\#discovery
, last retrieved 08.09.2021.
74
Bibliography
[21] Fraunhofer Institute for Applied Information Technology. State-Of-The-Art-Process
Mining in Python. 2021. URL:
https://pm4py.fit.fraunhofer.de/
, last
retrieved 08.09.2021.
[22] Jakob Freund and Bernd Rücker. Praxishandbuch BPMN 2.0. Carl Hanser
Verlag GmbH Co KG, 2014.
[23] Christian W Günther and Anne Rozinat. “Disco: Discover Your Processes”.
In: BPM (Demos) 940 (2012), pp. 40–44.
[24] Christian W Günther and Wil MP Van Der Aalst. “Fuzzy Mining–Adaptive
Process Simplification Based on Multi-Perspective Metrics”. In: International
Conference on Business Process Management. Springer. 2007, pp. 328–
343.
[25] Esmita P Gupta. “Process Mining a Comparative Study”. In: International
Journal of Advanced Research in Computer and Communications Engineer-
ing 3.11 (2014), p. 5.
[26] Michael Hammer. “Reengineering Work: Don’t Automate, Obliterate”. In: Har-
vard business review 68.4 (1990), pp. 104–112.
[27] Silvia Hansen-Schirra and Silke Gutermuth. “Empirische Überprüfung von
Verständlichkeit”. In: Eds. CHRISTIANE MAAß, and ISABEL RINK. Hand-
buch Barrierefreie Kommunikation. Berlin: Frank & Timme (2019), pp. 163–
182.
[28] Arthur H. M. Ter Hofstede and Mathias Weske. “Business Process Manage-
ment: A Survey”. In: Proceedings of the 1st International Conference on Busi-
ness Process Management, Volume 2678 of Lncs. Citeseer. 2003.
[29] Frank Hogrebe, Nick Gehrke, and Markus Nüttgens. “Eye Tracking Experi-
ments in Business Process Modeling: Agenda Setting and Proof of Concept”.
In: Enterprise Modelling and Information Systems Architectures (EMISA 2011)
(2011).
[30] IBM. IBM SPSS-Software. 2021. URL:
https://www.ibm.com/de-de/
analytics/spss-statistics-software
, last retrieved 08.09.2021.
75
Bibliography
[31] Moritz Kassner, William Patera, and Andreas Bulling. “Pupil: An Open Source
Platform for Pervasive Eye Tracking and Mobile Gaze-Based Interaction”. In:
Proceedings of the 2014 ACM International Joint Conference on Pervasive
and Ubiquitous Computing: Adjunct Publication. 2014, pp. 1151–1160.
[32] Benedikt Lutz. “Verständlichkeit aus fachkommunikativer Sicht”. In: Handbuch
Barrierefreie Kommunikation (2018), p. 147.
[33] Jan Mendling, Mark Strembeck, and Jan Recker. “Factors of Process Model
Comprehension—Findings from a Series of Experiments”. In: Decision Sup-
port Systems 53.1 (2012), pp. 195–206.
[34] Stefan Obermeier et al. Geschäftsprozesse realisieren: Ein praxisorientierter
Leitfaden von der Strategie bis zur Implementierung. Springer-Verlag, 2013.
[35] Object Management Group (OMG). About the Business Process Model and
Notation Specification Version 2.0. 2010. URL:
https://www.omg.org/spec/
BPMN/2.0/About-BPMN/
, last retrieved 08.09.2021.
[36] Fred Paas, Alexander Renkl, and John Sweller. “Cognitive load theory and in-
structional design: Recent developments”. In: Educational psychologist 38.1
(2003), pp. 1–4.
[37] C.A. Petri. Kommunikation mit Automaten. Schriften des Rheinisch-Westfälischen
Institutes für Instrumentelle Mathematik an der Universität Bonn. Rheinisch-
Westfälisches Institut f. instrumentelle Mathematik an d. Univ., 1962.
[38] Gerardo Cepeda Porras and Yann-Gaël Guéhéneuc. “An Empirical Study on
the Efficiency of Different Design Pattern Representations in UML Class Dia-
grams”. In: Empirical Software Engineering 15.5 (2010), pp. 493–522.
[39] Process Mining Group, Math&CS department, Eindhoven University of Tech-
nology. ProM. 2016. URL:
https://www.promtools.org/doku.php?id=
docs:start
, last retrieved 08.09.2021.
[40] Pupil Labs. Fixations. 2021. URL:
https://docs.pupil-labs.com/core/
terminology/\#camera-intrinsics
, last retrieved 08.09.2021.
[41] Dario D Salvucci and Joseph H Goldberg. “Identifying Fixations and Sac-
cades in Eye-Tracking Protocols”. In: Proceedings of the 2000 Symposium
on Eye Tracking Research & Applications. 2000, pp. 71–78.
76
Bibliography
[42] Kamyar Sarshar and Peter Loos. “Comparing the Control-Flow of EPC and
Petri Net from the End-User Perspective”. In: International Conference on
Business Process Management. Springer. 2005, pp. 434–439.
[43] Bonita Sharif and Jonathan I Maletic. “An Eye Tracking Study on the Effects
of Layout in Understanding the Role of Design Patterns”. In: 2010 IEEE Inter-
national Conference on Software Maintenance. IEEE. 2010, pp. 1–10.
[44] John Sweller. “Cognitive load during problem solving: Effects on learning”. In:
Cognitive science 12.2 (1988), pp. 257–285.
[45] Tobii AB. Dark and Bright Pupil Tracking. 2021. URL:
https://www.tobiipro.
com/learn-and-support/learn/eye-tracking-essentials/what-is-
dark-and-bright-pupil-tracking/
, last retrieved 08.09.2021.
[46] Universität Zürich. Korrelation nach Bravais-Pearson. 2020. URL:
https://
www.methodenberatung.uzh.ch/de/datenanalyse_spss/zusammenhaenge/
korrelation.html
, last retrieved 08.09.2021.
[47] Wil Van Der Aalst, Arya Adriansyah, and Boudewijn Van Dongen. “Causal
Nets: A Modeling Language Tailored Towards Process Discovery”. In: Inter-
national Conference on Concurrency Theory. Springer. 2011, pp. 28–42.
[48] Wil Van Der Aalst et al. “Process Mining Manifesto”. In: International Confer-
ence on Business Process Management. Springer. 2011, pp. 169–194.
[49] Boudewijn F Van Dongen et al. “The ProM Framework: A New Era in Process
Mining Tool Support”. In: International Conference on Application and Theory
of Petri Nets. Springer. 2005, pp. 444–454.
[50] AJMM Weijters, Wil MP van Der Aalst, and AK Alves De Medeiros. “Process
Mining with the Heuristics Miner-Algorithm”. In: Technische Universiteit Eind-
hoven, Tech. Rep. WP 166 (2006), pp. 1–34.
[51] AJMM Weijters and Joel Tiago S Ribeiro. “Flexible Heuristics Miner (FHM)”.
In: IEEE Symposium on Computational Intelligence and Data Mining (CIDM).
IEEE. 2011, pp. 310–317.
[52] Claes Wohlin et al. Experimentation in software engineering. Springer Sci-
ence & Business Media, 2012.
77
Bibliography
[53] Michael Zimoch et al. “Cognitive Insights into Business Process Model Com-
prehension: Preliminary Results for Experienced and Inexperienced Individ-
uals”. In: Enterprise, Business-Process and Information Systems Modeling.
Springer, 2017, pp. 137–152.
[54] Michael Zimoch et al. “Eye Tracking Experiments on Process Model Compre-
hension: Lessons Learned”. In: Enterprise, Business-Process and Informa-
tion Systems Modeling. Springer, 2017, pp. 153–168.
[55] Michael Zimoch et al. “Using Insights from Cognitive Neuroscience to Investi-
gate the Effects of Event-Driven Process Chains on Process Model Compre-
hension”. In: International Conference on Business Process Management.
Springer. 2017, pp. 446–459.
[56] Michael Zimoch et al. “Utilizing the Capabilities Offered by Eye-Tracking to
Foster Novices’ Comprehension of Business Process Models”. In: Interna-
tional Conference on Cognitive Computing. Springer. 2018, pp. 155–163.
[57] questback. EFS (Enterprise Feedback Suite) Survey. 2021. URL:
https://
community.questback.com/s/questback-efs-survey
, last retrieved
08.09.2021.
[58] Razvan PetIn Proceedings of the rusel and Jan Mendling. “Eye-Tracking the
Factors of Process Model Comprehension Tasks”. In: International Confer-
ence on Advanced Information Systems Engineering. Springer. 2013, pp. 224–
239.
[59] Zohreh SharafiProceedings of the et al. “Eye-Tracking Metrics in Software En-
gineering”. In: 2015 Asia-Pacific Software Engineering Conference (APSEC).
IEEE. 2015, pp. 96–103.
[60] Shehnaaz YusIn Proceedings of the uf, Huzefa Kagdi, and Jonathan I Maletic.
“Assessing the Comprehension of UML Class Diagrams via Eye Tracking”. In:
15th IEEE International Conference on Program Comprehension (ICPC’07).
IEEE. 2007, pp. 113–122.
78
A Vaccination Process in Python
The Python script for generating the event log for the vaccination process is shown
here.
from datetime import datetime
import random


# advance the global time base by random_time seconds and return it as a string
def get_log_time(random_time):
    global current_time
    current_log_time = current_time
    # datetime to float
    sec = current_log_time.timestamp()
    # add random time
    new_sec = sec + random_time
    # convert float to datetime
    new_log_time = datetime.fromtimestamp(new_sec)
    # save new time base
    current_time = new_log_time
    # datetime to string with pattern
    dt_string = new_log_time.strftime("%d/%m/%Y %H:%M:%S")
    return dt_string


# split the time interval ("min-max") of an input line and return a random duration
def get_activity_time(line):
    activity_time = line.split(";")[1].split("-")
    # (random between 0 and 1 multiplied by max) + min,
    # i.e. the result lies in [min, min + max)
    return random.random() * float(activity_time[1]) + float(activity_time[0])


# set time for new case to allow overlapping
def set_time_for_new_case():
    global current_time
    current_log_time = current_time
    # datetime to float
    sec = current_log_time.timestamp()
    # minus 15 minutes (900 seconds)
    new_sec = sec - 900
    # convert float to datetime
    new_log_time = datetime.fromtimestamp(new_sec)
    # save new time base
    current_time = new_log_time


# write event log
def write_event_log():
    with open("event_log_vaccination.csv", "w") as output_object:
        output_object.write("case_id;activity;start_timestamp;end_timestamp\n")
        for case_id in range(1000):
            with open("vaccination_in.txt") as input_object:
                for line in input_object:
                    line = line.rstrip("\n")
                    random_time = round(get_activity_time(line), 2)
                    start_log_time = get_log_time(0)
                    end_log_time = get_log_time(random_time)
                    # line without the time (only activity name)
                    activity = line.split(";")[0]
                    # write line in event log
                    output_object.write(
                        f"{case_id + 1};{activity};{start_log_time};{end_log_time}\n")
                    # special case: a random subset of the observed patients
                    # additionally receives first aid (same timestamps)
                    random_special_case = random.randint(0, 15)
                    if (activity == "be observed and wait"
                            and random_special_case % 11 == 0):
                        output_object.write(
                            f"{case_id + 1};receive first aid;"
                            f"{start_log_time};{end_log_time}\n")
            set_time_for_new_case()


# main method and start
if __name__ == '__main__':
    current_time = datetime.now()
    write_event_log()
Listing A.1: Generation of the event log for the vaccination process
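As a rough sketch of how such an event log is consumed during process discovery, the following example counts directly-follows relations per case. It assumes the column layout written by the script above and that the rows of each case appear in chronological order. The sample rows are invented for illustration ("receive vaccination" is a hypothetical activity name), and `directly_follows` is not part of any process mining tool.

```python
from collections import Counter, defaultdict
import csv
import io


def directly_follows(event_log_text):
    """Count how often one activity directly follows another within a case."""
    reader = csv.DictReader(io.StringIO(event_log_text), delimiter=";")
    traces = defaultdict(list)
    for row in reader:
        traces[row["case_id"]].append(row["activity"])
    counts = Counter()
    for activities in traces.values():
        # Pair each activity with its successor in the same case
        counts.update(zip(activities, activities[1:]))
    return counts


# Invented sample rows in the format of the generated event log
sample_log = (
    "case_id;activity;start_timestamp;end_timestamp\n"
    "1;receive vaccination;09/09/2021 10:00:00;09/09/2021 10:02:00\n"
    "1;be observed and wait;09/09/2021 10:02:00;09/09/2021 10:17:00\n"
    "2;receive vaccination;09/09/2021 10:05:00;09/09/2021 10:07:00\n"
    "2;be observed and wait;09/09/2021 10:07:00;09/09/2021 10:22:00\n"
    "2;receive first aid;09/09/2021 10:07:00;09/09/2021 10:22:00\n"
)

relations = directly_follows(sample_log)
for (source, target), count in relations.items():
    print(f"{source} -> {target}: {count}")
```

Such frequency-annotated relations correspond to the numbers shown on the paths of many generated process models.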
B Process Visualisations
B.1 Celonis Snap
Figure B.1: Vaccination process (Celonis Snap)
Figure B.2: Insurance process (Celonis Snap)
B.2 Disco
Figure B.3: Vaccination process (Disco)
Figure B.4: Insurance process (Disco)
B.3 Apromore
Figure B.5: Vaccination process (Apromore)
Figure B.6: Insurance process (Apromore)
B.4 PM4Py
Figure B.7: Vaccination process as BPMN model (PM4Py)
Figure B.8: Vaccination process as Heuristic net (PM4Py)
Figure B.9: Vaccination process as Petri net (PM4Py)
Figure B.10: Insurance process as BPMN model (PM4Py)
Figure B.11: Insurance process as Heuristic net (PM4Py)
Figure B.12: Insurance process as Petri net (PM4Py)
C Questionnaires
C.1 Knowledge Questions
Figure C.1: True-or-false knowledge questions
C.2 Comprehension Questions
Figure C.2: Comprehension questions for vaccination process
Figure C.3: Comprehension questions for insurance process
C.3 Level of Acceptability
Figure C.4: Questions for level of acceptability
C.4 Cognitive Load
Figure C.5: Questions for cognitive load
D Results of User Study
In this appendix, all results are presented as tables. Table D.1 shows the
demographic data given by each participant. Table D.2 shows the answers to the
knowledge questions and the number of correctly answered knowledge questions
for each participant. Table D.3 shows how long each participant looked
at each process model. The questionnaire variant (Variant), the ID of the participant
(P), the scenario (S; 0 = vaccination process and 1 = insurance process), the
ID of the process model (PM) and the duration are given. The meanings of these columns
are identical in the following tables.
Table D.4 shows the number of fixations. Table D.5 shows how long each
participant took to answer the comprehension questions for each process model,
and Table D.6 shows the number of comprehension questions answered correctly.
Table D.7 then shows the level of acceptability (V = Variant). Finally, Table D.8 and
Table D.9 show the results of the cognitive load questionnaire.
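For readers who want to reuse the data, the following sketch shows one possible way to read a printed table row. It assumes that each row of Tables D.3 to D.6 holds up to three side-by-side groups of the five columns described above; `parse_result_row` and the dictionary keys are hypothetical names chosen for illustration.

```python
GROUP_SIZE = 5  # Variant, P, S, PM and the measured value


def parse_result_row(line):
    """Split one printed table row into a list of records."""
    values = line.split()
    if len(values) % GROUP_SIZE != 0:
        raise ValueError("row does not consist of complete column groups")
    records = []
    for start in range(0, len(values), GROUP_SIZE):
        variant, participant, scenario, model, value = map(
            int, values[start:start + GROUP_SIZE])
        records.append({
            "variant": variant,
            "participant": participant,
            "scenario": scenario,   # 0 = vaccination, 1 = insurance
            "process_model": model,
            "value": value,
        })
    return records


# First row of Table D.3
rows = parse_result_row("0 0 0 1 27512 1 5 0 4 21284 2 10 0 3 66658")
print(rows[0])
```

Rows with fewer than three groups, such as the last row of Table D.3, are handled in the same way because only complete groups of five values are required.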
ID  Variant  Gender  Age (year of birth)  Known process modelling languages                                     Estimated experience in days
1   0        0       1995                 UML, BPMN                                                             20
2   1        1       1991                 bpmn 2.0                                                              21
3   2        1       1994                 BPMN                                                                  5
4   3        1       1996                 BPMN 2.0, EPK, UML, Petri Netz, Gant                                  100
5   0        1       1991                 BPMN, AristaFlow                                                      25
6   1        1       1992                 bpmn, petri netze, epks, artifacts, case handling, philharmonicflows  1000
7   2        1       1996                 BPM, AristaFlow                                                       60
8   3        1       1977                 Den ganzen Zoo                                                        >100
9   0        1       1996                 BPMN, UML, EPK                                                        60
10  1        1       1990                 BPMN, EPKs, Flow Charts, UML, ADEPT                                   7
11  2        1       1997                 BPMN                                                                  35
12  3        1       1994                 petri netze, flussdiagramme, bpmn 2.0, gantt charts                   30
13  0        1       1995                 Keine                                                                 4
14  1        0       1993                 BPMN 2.0, EPK, Petri-Netze, object-aware etc.                         100
15  2        0       1995                 BPMN, Flussmodelle                                                    10
Table D.1: Results of demographic data
ID  Question 1  Question 2  Question 3  Question 4  Question 5  Sum of correctly answered questions
1   1           1           1           1           1           2
2   1           1           0           0           1           4
3   1           0           0           1           1           2
4   1           1           1           0           0           4
5   1           0           1           0           1           2
6   1           1           0           0           0           5
7   1           0           1           0           0           3
8   1           1           1           0           0           4
9   0           1           0           0           1           3
10  1           1           0           0           0           5
11  1           1           1           0           1           3
12  1           1           0           0           1           4
13  1           0           0           1           0           3
14  1           1           0           0           1           4
15  0           1           1           0           0           3
Table D.2: Results of knowledge questions for each participant
Variant P S PM Duration Variant P S PM Duration Variant P S PM Duration
0 0 0 1 27512 1 5 0 4 21284 2 10 0 3 66658
0 0 0 2 35552 1 5 0 5 28790 2 10 0 5 32848
0 0 0 3 58571 1 5 0 6 24134 2 10 0 7 21281
0 0 0 7 51354 1 5 1 8 41861 2 10 1 8 91411
0 0 1 8 91963 1 5 1 9 51784 2 10 1 9 42925
0 0 1 11 92130 1 5 1 10 43344 2 10 1 11 50456
0 0 1 12 102435 1 5 1 14 42098 2 10 1 13 76190
0 0 1 13 71332 2 6 0 1 41464 3 11 0 1 30580
1 1 0 4 38857 2 6 0 3 72355 3 11 0 2 40117
1 1 0 5 45460 2 6 0 5 25497 3 11 0 4 44919
1 1 0 6 30293 2 6 0 7 30436 3 11 0 6 31305
1 1 1 8 151900 2 6 1 8 104591 3 11 1 8 49503
1 1 1 9 149312 2 6 1 9 149221 3 11 1 10 49808
1 1 1 10 132369 2 6 1 11 77067 3 11 1 12 64263
1 1 1 14 72326 2 6 1 13 33052 3 11 1 14 62243
2 2 0 1 45536 3 7 0 1 10385 0 12 0 1 42859
2 2 0 3 98553 3 7 0 2 61345 0 12 0 2 48954
2 2 0 5 61612 3 7 0 4 19157 0 12 0 3 51208
2 2 0 7 49896 3 7 0 6 23296 0 12 0 7 35654
2 2 1 8 87980 3 7 1 8 19361 0 12 1 8 122461
2 2 1 9 104795 3 7 1 10 22723 0 12 1 11 74897
2 2 1 11 64238 3 7 1 12 19525 0 12 1 12 86865
2 2 1 13 72102 3 7 1 14 19887 0 12 1 13 42798
3 3 0 1 20089 0 8 0 1 34904 1 13 0 1 37828
3 3 0 2 40357 0 8 0 2 41184 1 13 0 4 27797
3 3 0 4 30011 0 8 0 3 49111 1 13 0 5 40887
3 3 0 6 28313 0 8 0 7 36225 1 13 0 6 44168
3 3 1 8 68721 0 8 1 8 65185 1 13 1 8 86123
3 3 1 10 46139 0 8 1 11 58754 1 13 1 9 49461
3 3 1 12 55858 0 8 1 12 66419 1 13 1 10 63436
3 3 1 14 59115 0 8 1 13 54942 1 13 1 14 52904
0 4 0 1 20600 1 9 0 1 21050 2 14 0 1 54089
0 4 0 2 28518 1 9 0 4 32135 2 14 0 3 74752
0 4 0 3 31967 1 9 0 5 21560 2 14 0 5 43385
0 4 0 7 26975 1 9 0 6 21292 2 14 0 7 24867
0 4 1 8 52430 1 9 1 8 59301 2 14 1 8 151735
0 4 1 11 58662 1 9 1 9 33778 2 14 1 9 77525
0 4 1 12 37486 1 9 1 10 40646 2 14 1 11 37424
0 4 1 13 48958 1 9 1 14 31653 2 14 1 13 56853
1 5 0 1 29046 2 10 0 1 40244
Table D.3: Results of time spent looking at the process models
Variant P S PM Fixations Variant P S PM Fixations Variant P S PM Fixations
0 0 0 1 123 2 6 0 1 133 3 11 0 1 34
0 0 0 2 159 2 6 0 3 363 3 11 0 2 176
0 0 0 3 196 2 6 0 5 115 3 11 0 4 76
0 0 0 7 201 2 6 0 7 123 3 11 0 6 163
0 0 1 8 171 2 6 1 8 403 3 11 1 8 42
0 0 1 11 267 2 6 1 9 627 3 11 1 10 192
0 0 1 12 265 2 6 1 11 325 3 11 1 12 199
0 0 1 13 333 2 6 1 13 142 3 11 1 14 298
2 2 0 1 210 3 7 0 1 53 0 12 0 1 207
2 2 0 3 464 3 7 0 2 138 0 12 0 2 215
2 2 0 5 271 3 7 0 4 109 0 12 0 3 213
2 2 0 7 238 3 7 0 6 120 0 12 0 7 153
2 2 1 8 411 3 7 1 8 86 0 12 1 8 611
2 2 1 9 489 3 7 1 10 99 0 12 1 11 377
2 2 1 11 334 3 7 1 12 84 0 12 1 12 428
2 2 1 13 309 3 7 1 14 87 0 12 1 13 209
3 3 0 1 93 0 8 0 1 140 1 13 0 1 103
3 3 0 2 210 0 8 0 2 188 1 13 0 4 109
3 3 0 4 158 0 8 0 3 247 1 13 0 5 148
3 3 0 6 146 0 8 0 7 180 1 13 0 6 137
3 3 1 8 306 0 8 1 8 314 1 13 1 8 303
3 3 1 10 209 0 8 1 11 279 1 13 1 9 241
3 3 1 12 259 0 8 1 12 326 1 13 1 10 235
3 3 1 14 277 0 8 1 13 280 1 13 1 14 238
0 4 0 1 106 1 9 0 1 121 2 14 0 1 266
0 4 0 2 133 1 9 0 4 175 2 14 0 3 362
0 4 0 3 153 1 9 0 5 122 2 14 0 5 207
0 4 0 7 122 1 9 0 6 122 2 14 0 7 116
0 4 1 8 237 1 9 1 8 317 2 14 1 8 737
0 4 1 11 285 1 9 1 9 184 2 14 1 9 389
0 4 1 12 146 1 9 1 10 227 2 14 1 11 194
0 4 1 13 250 1 9 1 14 171 2 14 1 13 284
1 5 0 1 137 2 10 0 1 207
1 5 0 4 106 2 10 0 3 364
1 5 0 5 145 2 10 0 5 175
1 5 0 6 124 2 10 0 7 111
1 5 1 8 214 2 10 1 8 480
1 5 1 9 260 2 10 1 9 245
1 5 1 10 186 2 10 1 11 287
1 5 1 14 220 2 10 1 13 411
Table D.4: Results of number of fixations
Variant P S PM Duration Variant P S PM Duration Variant P S PM Duration
0 0 0 1 35883 1 5 0 1 22950 2 10 0 1 23863
0 0 0 2 73645 1 5 0 4 8038 2 10 0 3 76635
0 0 0 3 8535 1 5 0 5 10105 2 10 0 5 15610
0 0 0 7 15560 1 5 0 6 7785 2 10 0 7 10990
0 0 1 8 18900 1 5 1 8 8505 2 10 1 8 35661
0 0 1 11 31685 1 5 1 9 10096 2 10 1 9 16005
0 0 1 12 37065 1 5 1 10 8738 2 10 1 11 18745
0 0 1 13 25354 1 5 1 14 22427 2 10 1 13 17324
1 1 0 1 32015 2 6 0 1 14522 3 11 0 1 21661
1 1 0 4 20521 2 6 0 3 23993 3 11 0 2 15491
1 1 0 5 20339 2 6 0 5 23135 3 11 0 4 12671
1 1 0 6 32708 2 6 0 7 27375 3 11 0 6 11529
1 1 1 8 20631 2 6 1 8 13854 3 11 1 8 30023
1 1 1 9 24795 2 6 1 9 21652 3 11 1 10 30617
1 1 1 10 32525 2 6 1 11 21711 3 11 1 12 15978
1 1 1 14 38521 2 6 1 13 29806 3 11 1 14 33343
2 2 0 1 19016 3 7 0 1 16294 0 12 0 1 22036
2 2 0 3 14480 3 7 0 2 20942 0 12 0 2 42541
2 2 0 5 15588 3 7 0 4 7649 0 12 0 3 19389
2 2 0 7 11081 3 7 0 6 17960 0 12 0 7 27913
2 2 1 8 45674 3 7 1 8 14406 0 12 1 8 15520
2 2 1 9 14427 3 7 1 10 18636 0 12 1 11 26001
2 2 1 11 20143 3 7 1 12 13106 0 12 1 12 19806
2 2 1 13 19014 3 7 1 14 16114 0 12 1 13 42010
3 3 0 1 18401 0 8 0 1 15898 1 13 0 1 21445
3 3 0 2 23124 0 8 0 2 29583 1 13 0 4 16588
3 3 0 4 21035 0 8 0 3 27001 1 13 0 5 11958
3 3 0 6 11364 0 8 0 7 19125 1 13 0 6 20973
3 3 1 8 18496 0 8 1 8 11403 1 13 1 8 26355
3 3 1 10 36275 0 8 1 11 36265 1 13 1 9 24305
3 3 1 12 25020 0 8 1 12 24664 1 13 1 10 20376
3 3 1 14 26262 0 8 1 13 29011 1 13 1 14 24907
0 4 0 1 20288 1 9 0 1 30518 2 14 0 1 12940
0 4 0 2 26472 1 9 0 4 14656 2 14 0 3 24792
0 4 0 3 21707 1 9 0 5 15888 2 14 0 5 19581
0 4 0 7 16175 1 9 0 6 12126 2 14 0 7 17600
0 4 1 8 21126 1 9 1 8 11539 2 14 1 8 26971
0 4 1 11 25698 1 9 1 9 14952 2 14 1 9 20674
0 4 1 12 20539 1 9 1 10 13126 2 14 1 11 43069
0 4 1 13 28621 1 9 1 14 24323 2 14 1 13 35944
Table D.5: Results of the time taken to answer the comprehension questions
Variant P S PM Score Variant P S PM Score Variant P S PM Score
0 0 0 1 2 1 5 0 1 3 2 10 0 1 2
0 0 0 2 3 1 5 0 4 3 2 10 0 3 2
0 0 0 3 2 1 5 0 5 1 2 10 0 5 2
0 0 0 7 3 1 5 0 6 2 2 10 0 7 3
0 0 1 8 3 1 5 1 8 3 2 10 1 8 2
0 0 1 11 2 1 5 1 9 2 2 10 1 9 2
0 0 1 12 2 1 5 1 10 2 2 10 1 11 3
0 0 1 13 2 1 5 1 14 2 2 10 1 13 2
1 1 0 1 2 2 6 0 1 3 3 11 0 1 3
1 1 0 4 2 2 6 0 3 3 3 11 0 2 3
1 1 0 5 1 2 6 0 5 3 3 11 0 4 1
1 1 0 6 3 2 6 0 7 2 3 11 0 6 1
1 1 1 8 2 2 6 1 8 3 3 11 1 8 3
1 1 1 9 2 2 6 1 9 3 3 11 1 10 1
1 1 1 10 3 2 6 1 11 1 3 11 1 12 3
1 1 1 14 3 2 6 1 13 2 3 11 1 14 1
2 2 0 1 2 3 7 0 1 0 0 12 0 1 2
2 2 0 3 3 3 7 0 2 3 0 12 0 2 3
2 2 0 5 3 3 7 0 4 2 0 12 0 3 2
2 2 0 7 2 3 7 0 6 1 0 12 0 7 3
2 2 1 8 2 3 7 1 8 2 0 12 1 8 3
2 2 1 9 2 3 7 1 10 0 0 12 1 11 2
2 2 1 11 3 3 7 1 12 3 0 12 1 12 3
2 2 1 13 3 3 7 1 14 2 0 12 1 13 2
3 3 0 1 2 0 8 0 1 3 1 13 0 1 3
3 3 0 2 3 0 8 0 2 3 1 13 0 4 1
3 3 0 4 2 0 8 0 3 3 1 13 0 5 2
3 3 0 6 1 0 8 0 7 3 1 13 0 6 3
3 3 1 8 3 0 8 1 8 3 1 13 1 8 2
3 3 1 10 2 0 8 1 11 2 1 13 1 9 1
3 3 1 12 2 0 8 1 12 3 1 13 1 10 2
3 3 1 14 2 0 8 1 13 2 1 13 1 14 1
0 4 0 1 2 1 9 0 1 3 2 14 0 1 3
0 4 0 2 3 1 9 0 4 2 2 14 0 3 3
0 4 0 3 1 1 9 0 5 1 2 14 0 5 3
0 4 0 7 3 1 9 0 6 3 2 14 0 7 3
0 4 1 8 2 1 9 1 8 3 2 14 1 8 3
0 4 1 11 2 1 9 1 9 2 2 14 1 9 2
0 4 1 12 2 1 9 1 10 2 2 14 1 11 3
0 4 1 13 3 1 9 1 14 3 2 14 1 13 3
Table D.6: Results of number of correctly answered comprehension questions
Variant P S PM PUU PEOU Variant P S PM PUU PEOU Variant P S PM PUU PEOU
0 0 0 1 20 20 1 5 0 1 16 16 2 10 0 1 15 18
0 0 0 2 17 17 1 5 0 4 16 18 2 10 0 3 15 15
0 0 0 3 20 18 1 5 0 5 16 16 2 10 0 5 14 16
0 0 0 7 14 16 1 5 0 6 14 16 2 10 0 7 15 13
0 0 1 8 20 20 1 5 1 8 14 13 2 10 1 8 18 19
0 0 1 11 11 11 1 5 1 9 13 12 2 10 1 9 12 15
0 0 1 12 17 18 1 5 1 10 15 16 2 10 1 11 10 14
0 0 1 13 19 18 1 5 1 14 8 8 2 10 1 13 9 6
1 1 0 1 20 19 2 6 0 1 16 16 3 11 0 1 14 15
1 1 0 4 15 15 2 6 0 3 15 13 3 11 0 2 14 15
1 1 0 5 20 20 2 6 0 5 16 13 3 11 0 4 15 14
1 1 0 6 15 16 2 6 0 7 9 9 3 11 0 6 14 14
1 1 1 8 19 20 2 6 1 8 14 11 3 11 1 8 13 13
1 1 1 9 13 11 2 6 1 9 9 10 3 11 1 10 15 15
1 1 1 10 12 14 2 6 1 11 16 14 3 11 1 12 11 12
1 1 1 14 17 17 2 6 1 13 8 9 3 11 1 14 13 15
2 2 0 1 18 18 3 7 0 1 20 20 0 12 0 1 18 19
2 2 0 3 9 10 3 7 0 2 16 16 0 12 0 2 15 15
2 2 0 5 15 15 3 7 0 4 4 8 0 12 0 3 10 11
2 2 0 7 12 14 3 7 0 6 7 8 0 12 0 7 17 17
2 2 1 8 20 18 3 7 1 8 20 20 0 12 1 8 15 14
2 2 1 9 7 7 3 7 1 10 4 4 0 12 1 11 10 9
2 2 1 11 16 16 3 7 1 12 8 8 0 12 1 12 11 12
2 2 1 13 6 8 3 7 1 14 4 4 0 12 1 13 14 13
3 3 0 1 17 18 0 8 0 1 20 20 1 13 0 1 19 20
3 3 0 2 14 14 0 8 0 2 6 6 1 13 0 4 19 20
3 3 0 4 15 15 0 8 0 3 10 11 1 13 0 5 12 12
3 3 0 6 15 15 0 8 0 7 15 16 1 13 0 6 11 13
3 3 1 8 13 12 0 8 1 8 18 20 1 13 1 8 13 10
3 3 1 10 15 15 0 8 1 11 10 12 1 13 1 9 15 16
3 3 1 12 14 15 0 8 1 12 10 9 1 13 1 10 15 16
3 3 1 14 14 12 0 8 1 13 11 12 1 13 1 14 10 8
0 4 0 1 16 16 1 9 0 1 15 20 2 14 0 1 17 19
0 4 0 2 20 19 1 9 0 4 16 20 2 14 0 3 15 17
0 4 0 3 16 18 1 9 0 5 8 11 2 14 0 5 17 17
0 4 0 7 19 17 1 9 0 6 10 13 2 14 0 7 10 11
0 4 1 8 12 13 1 9 1 8 20 20 2 14 1 8 15 15
0 4 1 11 12 11 1 9 1 9 14 14 2 14 1 9 16 16
0 4 1 12 16 16 1 9 1 10 15 13 2 14 1 11 15 16
0 4 1 13 15 14 1 9 1 14 5 5 2 14 1 13 12 16
Table D.7: Results of level of acceptability
Variant P S PM ICL ECL GCL Variant P S PM ICL ECL GCL
0 0 0 1 1 4 1 1 5 0 1 2 2 3
0 0 0 2 2 4 2 1 5 0 4 2 2 2
0 0 0 3 2 4 2 1 5 0 5 2 2 3
0 0 0 7 1 4 3 1 5 0 6 2 2 2
0 0 1 8 2 4 3 1 5 1 8 4 3 3
0 0 1 11 2 5 4 1 5 1 9 4 2 3
0 0 1 12 1 4 3 1 5 1 10 3 2 2
0 0 1 13 2 4 3 1 5 1 14 4 3 4
1 1 0 1 1 2 1 2 6 0 1 2 3 2
1 1 0 4 3 4 2 2 6 0 3 3 3 3
1 1 0 5 1 2 1 2 6 0 5 2 3 3
1 1 0 6 2 4 3 2 6 0 7 3 3 3
1 1 1 8 4 4 1 2 6 1 8 4 4 3
1 1 1 9 4 5 4 2 6 1 9 3 4 3
1 1 1 10 4 5 3 2 6 1 11 3 3 2
1 1 1 14 1 4 2 2 6 1 13 3 3 3
2 2 0 1 2 2 1 3 7 0 1 3 1 1
2 2 0 3 5 4 4 3 7 0 2 3 2 3
2 2 0 5 2 2 1 3 7 0 4 2 2 2
2 2 0 7 3 3 3 3 7 0 6 3 3 4
2 2 1 8 4 4 2 3 7 1 8 3 2 1
2 2 1 9 5 5 5 3 7 1 10 4 4 4
2 2 1 11 2 3 1 3 7 1 12 3 3 3
2 2 1 13 5 5 4 3 7 1 14 3 3 3
3 3 0 1 2 2 1 0 8 0 1 1 2 1
3 3 0 2 3 2 2 0 8 0 2 4 5 5
3 3 0 4 3 2 1 0 8 0 3 3 4 5
3 3 0 6 3 2 2 0 8 0 7 1 3 1
3 3 1 8 3 4 3 0 8 1 8 3 3 3
3 3 1 10 4 4 2 0 8 1 11 4 4 4
3 3 1 12 3 3 3 0 8 1 12 3 4 3
3 3 1 14 3 3 3 0 8 1 13 3 4 4
0 4 0 1 4 4 3 1 9 0 1 2 3 2
0 4 0 2 3 3 2 1 9 0 4 3 3 2
0 4 0 3 2 4 3 1 9 0 5 3 3 4
0 4 0 7 2 3 1 1 9 0 6 3 3 2
0 4 1 8 5 5 4 1 9 1 8 4 4 2
0 4 1 11 4 4 4 1 9 1 9 4 4 3
0 4 1 12 3 4 3 1 9 1 10 4 4 3
0 4 1 13 3 4 3 1 9 1 14 5 5 5
Table D.8: Results of cognitive load (Part 1)
Variant P S PM ICL ECL GCL
2 10 0 1 2 2 2
2 10 0 3 4 4 2
2 10 0 5 1 2 4
2 10 0 7 1 2 4
2 10 1 8 3 4 2
2 10 1 9 1 2 3
2 10 1 11 1 3 3
2 10 1 13 4 5 4
3 11 0 1 2 3 2
3 11 0 2 4 3 2
3 11 0 4 2 3 2
3 11 0 6 2 3 2
3 11 1 8 4 4 2
3 11 1 10 4 4 3
3 11 1 12 4 4 4
3 11 1 14 3 4 3
0 12 0 1 1 4 2
0 12 0 2 3 4 2
0 12 0 3 3 4 3
0 12 0 7 1 3 1
0 12 1 8 2 4 2
0 12 1 11 2 4 3
0 12 1 12 2 5 3
0 12 1 13 2 4 2
1 13 0 1 3 3 1
1 13 0 4 4 5 1
1 13 0 5 5 5 3
1 13 0 6 4 5 4
1 13 1 8 5 4 3
1 13 1 9 4 5 2
1 13 1 10 5 5 3
1 13 1 14 5 5 4
2 14 0 1 2 3 2
2 14 0 3 3 3 2
2 14 0 5 2 3 2
2 14 0 7 3 4 4
2 14 1 8 4 5 3
2 14 1 9 2 3 3
2 14 1 11 3 4 3
2 14 1 13 3 4 3
Table D.9: Results of cognitive load (Part 2)
Name: Jana Bühler Matriculation number: 871153
Declaration
I declare that I have written this thesis independently and have used no sources
or aids other than those indicated.
Ulm, .........................................................................
Jana Bühler