Towards a Framework for the Agile Mining
of Business Processes
Barbara Weber1, Manfred Reichert2, Stefanie Rinderle3, and Werner Wild4
1Quality Engineering Research Group, University of Innsbruck, Austria
Barbara.Web[email protected]
2Information Systems Group, University of Twente, The Netherlands
m.u.reic[email protected]wente.nl
3Dept. Databases and Information Systems, University of Ulm, Germany
[email protected]–ulm.de
4Evolution Consulting, Innsbruck, Austria
werner.wild@evolution.at
Abstract. In order to support business processes effectively, their implementation
by a process management system (PMS) must be as close
to the real world’s processes as possible. Generally, it is not sufficient
to analyze and model a business process only once, and then to han-
dle respective business cases according to the defined model for a long
period of time. Instead, process implementations must be quickly adapt-
able to changing needs. A PMS should enable process instance changes
and provide facilities for analyzing these instance-specific changes in or-
der to derive optimized process models. In this paper we introduce a
framework for the agile mining of business processes which supports the
whole process life cycle in an integrated way. Our framework is based on
process mining techniques, adaptive process management, and conversa-
tional case-based reasoning. On the one hand, it allows annotating execu-
tion and change logs with semantical information to gather information
about the reasons for ad-hoc deviations, which can then be analyzed by
the process engineer (with support from the PMS). On the other hand,
it enables the process engineer to adapt process models based on the
outcome of these analyses and to migrate related process instances to
the new model.
1 Introduction
Companies are developing a growing interest in aligning their information sys-
tems in a process-oriented way. Recently, process mining, in particular Delta
analysis [1, 2], has been proposed to improve business alignment [3]. If no process
models are available yet, process mining techniques can be applied to identify
repetitive process fragments. However, if the business processes are already cap-
tured in process models, Delta analysis can help to detect discrepancies between
the modeled and the observed execution behavior of a process. Though process
execution logs can be used to reveal malfunctions or bottlenecks, they do not
provide any semantical information about the reasons for the observed discrepancies.
With respect to the optimization of process models, these logs therefore
provide only limited information to process engineers. Furthermore, current
PMS do not sufficiently support process engineers in incorporating the results of a
Delta analysis into an improved process model and in smoothly migrating running
process instances to the new model version.
Proc. 1st Int'l Workshop on Business Process Intelligence (BPI 2005), held in conjunction
with BPM'05, Nancy, France, September 2005 (to appear)
Obviously, the practical benefit of process mining depends on the quality of
the available log data. In PMS, for instance, respective execution logs can only
reflect situations the PMS is able to handle. Particularly, if the PMS does not
support process instance changes it has to be bypassed in exceptional situations
(e.g., by executing unplanned process activities outside the scope of the PMS).
Consequently, the PMS is unaware of the applied deviations and thus unable to
log information about them. This missing traceability of process instance changes
significantly limits the benefits of process mining and Delta analysis approaches.
Continuous process improvement requires adaptive PMS which enable autho-
rized users to flexibly deviate from premodeled processes as needed. Since this
results in more meaningful execution logs, which implicitly reflect the applied
process instance changes, process mining and Delta analysis approaches become
more useful. If such analyses result in an optimized version of a process model,
adaptive PMS support the process engineer in quickly implementing the model
change and smoothly migrating running process instances to the new model.
In addition to process execution logs, adaptive PMS maintain information
about applied instance modifications in change logs. Minimally, a change log
should keep syntactical information about the kind and the context of the ap-
plied changes (e.g., the type and position of a dynamically inserted process
activity). While this information is useful for process mining, it is not sufficient
to effectively support process optimization efforts; process engineers also need
semantical information about the reasons for the change. This is particularly
important if the same or similar instance changes happen over and over again.
Assume that in a patient treatment process an unplanned lab test is dynami-
cally added for a considerable number of process instances. Then the respective
change logs should also reflect information about the semantical context of the
applied instance changes (e.g., that insertions have been mainly performed for
patients older than 40 years and suffering from diabetes).
In this paper we introduce a framework for the agile mining of business
processes which supports the whole process life cycle in an integrated way and
fosters continuous and quick adaptation to change. Our framework is based on
process mining [3], adaptive process management (PM) [4, 5], and, to close the
semantic gap, conversational case-based reasoning (CCBR) [6]. CCBR is an in-
teractive extension of the case-based reasoning (CBR) paradigm [7]. In par-
ticular, we combine the advantages of these three approaches in an integrated
prototype. We enhance execution and change logs with semantical information,
which is then analyzed by the process engineer with support from the PMS
to provide knowledge about the context of and the reasons for discrepancies
between process models and related instances. This semantical information is
also reused to support users when similar deviations become necessary. Finally,
our framework enables the process engineer to adapt process models based on
the outcome of process mining efforts and to smoothly migrate related process
instances to the new model version.
Section 2 describes building blocks for the agile mining of business processes.
Section 3 gives an overview of our framework and Section 4 discusses a sample
application. The paper closes with a summary and an outlook in Section 5.
2 Building Blocks for Agile Mining of Business Processes
This section summarizes the main characteristics of our framework’s building blocks:
process mining, case-based reasoning, and adaptive PM.
2.1 Process Mining
Process mining denotes the extraction of process knowledge from log data related
to past process and application executions. Respective logs are usually provided
by workflow systems, but also by other process-oriented applications, like en-
terprise resource planning or supply chain management systems. Typically, all
these systems log event-based data (e.g., related to the start or completion of task
executions) together with additional context information (e.g., about actors).
The main objective of process mining is to effectively use automatically col-
lected data in order to gain process knowledge from the logs. In particular,
process mining extracts formal process models from execution logs [3, 8–11]. So
far, the focus has been put on issues related to control-flow mining. However,
first approaches have been developed which use event-based data for mining or-
ganizational and performance aspects as well [12, 13]. Process mining is generally
considered as an alternative to interviews or questionnaires to acquire process
knowledge. Its results can be used for further analysis. For instance, Delta analy-
sis [1, 2] compares existing process models with the results of process mining in
order to detect discrepancies between the modeled and the observed execution
behavior. This information is then used to improve the process model.
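The comparison step can be illustrated with a minimal sketch. It assumes a purely sequential process model and an execution log given as (event kind, instance, activity) tuples; the function names `traces_from_log`, `directly_follows`, and `delta` as well as the log format are hypothetical and not taken from any of the cited mining tools.

```python
from collections import defaultdict

def traces_from_log(events):
    """Group completed activities by process instance, in log order."""
    traces = defaultdict(list)
    for kind, instance, activity in events:
        if kind == "end":  # an activity counts once it has completed
            traces[instance].append(activity)
    return dict(traces)

def directly_follows(trace):
    """Return the set of (a, b) pairs where b directly follows a."""
    return set(zip(trace, trace[1:]))

def delta(model_sequence, traces):
    """Map each instance to the observed pairs that the model does not contain."""
    allowed = directly_follows(model_sequence)
    result = {}
    for instance, trace in traces.items():
        unexpected = directly_follows(trace) - allowed
        if unexpected:
            result[instance] = unexpected
    return result

# Hypothetical log excerpt: instance I1 executes an unmodeled activity X
# between C and D, whereas instance I2 follows the model.
log = [
    ("start", "I1", "C"), ("end", "I1", "C"),
    ("start", "I1", "X"), ("end", "I1", "X"),
    ("start", "I1", "D"), ("end", "I1", "D"),
    ("start", "I2", "C"), ("end", "I2", "C"),
    ("start", "I2", "D"), ("end", "I2", "D"),
]
model = ["A", "B", "C", "D", "E"]
discrepancies = delta(model, traces_from_log(log))
```

In this sketch only I1 is reported, with the pairs (C, X) and (X, D) as deviations from the modeled sequence. Realistic Delta analysis has to deal with branching and parallelism, which this list-based simplification ignores.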
2.2 Case-Based Reasoning
Case-based reasoning (CBR) is a contemporary approach to problem solving
and learning [7]. New problems are dealt with by drawing on past experiences –
described in cases – and by adapting their solutions to the new problem situation.
Reasoning based on past experiences is a powerful way of solving problems that
humans apply frequently [14]. A physician, for example, remembers
previous cases to determine the disease of a newly admitted patient. A case is a
contextualized piece of knowledge representing an experience [7], which typically
consists of a problem description and the corresponding solution.
Our framework applies conversational CBR, an extension of the CBR para-
digm, which actively involves users in the inference process [15]. CCBR systems
are interactive systems that, via a mixed-initiative dialogue, guide users through
a question-answering sequence in a case retrieval context. Unlike traditional
CBR, CCBR neither requires users to provide a complete a priori problem spec-
ification for case retrieval nor to provide knowledge about the relevance of each
feature for problem solving. Instead, the system assists users in finding relevant
cases by presenting a set of questions to assess a given situation. Furthermore,
it guides users who may supply already known information on their own initiative.
Therefore, CCBR is especially suitable for handling exceptional or unanticipated
situations which cannot be dealt with in a fully automated way.
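The retrieval dialogue described above can be sketched as follows. This is an illustration under stated assumptions, not the retrieval algorithm of any particular CCBR system: the `Case` class, the simple similarity measure (matching answers minus conflicting ones, divided by the number of question-answer pairs), and the question ordering by frequency are all hypothetical choices.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Case:
    problem: str
    qa_pairs: tuple  # tuple of (question, expected answer) pairs
    solution: str

def score(case, answers):
    """Matching answers minus conflicting ones, normalized by the number of pairs."""
    if not case.qa_pairs:
        return 0.0
    same = sum(1 for q, a in case.qa_pairs if answers.get(q) == a)
    diff = sum(1 for q, a in case.qa_pairs if q in answers and answers[q] != a)
    return (same - diff) / len(case.qa_pairs)

def rank_cases(case_base, answers):
    """Order cases from best to worst match for the answers given so far."""
    return sorted(case_base, key=lambda c: score(c, answers), reverse=True)

def next_questions(case_base, answers):
    """Unanswered questions, most frequently used across cases first."""
    counts = {}
    for case in case_base:
        for question, _ in case.qa_pairs:
            if question not in answers:
                counts[question] = counts.get(question, 0) + 1
    return sorted(counts, key=lambda q: (-counts[q], q))

# Hypothetical case-base; the second case is invented for illustration only.
case_base = [
    Case("Lab test required",
         (("Age > 40?", "yes"), ("Diabetes?", "yes")),
         "insert lab test"),
    Case("Contrast agent not tolerated",
         (("Allergy to contrast agent?", "yes"),),
         "skip examination with contrast agent"),
]
answers = {"Diabetes?": "yes"}  # the user volunteers one known fact
ranked = rank_cases(case_base, answers)
open_questions = next_questions(case_base, answers)
```

The user does not need to specify the complete problem up front: after each answer the ranking and the list of open questions are recomputed, and the user may pick a case as soon as one fits.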
CBR has been applied to PM for several purposes [16]. Our research proto-
type CBRFlow, for example, uses CCBR to perform ad-hoc changes of single
process instances, to memorize these changes, and to support their reuse in sim-
ilar future situations [17].
2.3 Adaptive Process Management
Adaptive PM technology increases the flexibility of process-oriented information
systems by enabling (dynamic) changes of different process aspects (e.g., control
and data flow). Such process changes can be performed at two levels – the process
type and the process instance level [5, 18].
If a process template or, more precisely, the process type schema representing
this template is changed, the respective changes are propagated to already
running process instances as desired [19]. This has to be done in a correct and
consistent manner, and respective migration procedures must efficiently handle a
high number of concurrently running process instances. In this context it must be
possible to propagate process type changes to both unbiased and biased process
instances. We denote process instances as unbiased if they are running according
to the original process type schema they were derived from, whereas process
instances are denoted as biased if they have been individually modified (e.g., due
to an ad-hoc change) [18].
Instance-specific ad-hoc changes (e.g., switching the order of two activities
or adding new ones) must be performed to deal with exceptional situations [4].
Usually, they are stored in change logs, which provide information about the
type of the applied change and related context information (e.g., the position
of a newly inserted activity). This log information contributes to the analysis of
potential conflicts between process type and process instance changes as well as
to the documentation of the instance history. The described functionality has
been implemented in our ADEPT PMS [4, 5, 18, 19].
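The distinction between unbiased and biased instances can be made concrete with a small sketch. The classes `ChangeLogEntry` and `ProcessInstance` and their fields are hypothetical and do not reflect the ADEPT data model; the only assumption carried over from the text is that an instance is biased as soon as its change log contains at least one ad-hoc change.

```python
from dataclasses import dataclass, field

@dataclass
class ChangeLogEntry:
    operation: str    # kind of change, e.g. "serialInsert"
    activity: str     # the activity affected by the change
    predecessor: str  # context: activity directly before the insert position
    successor: str    # context: activity directly after the insert position

@dataclass
class ProcessInstance:
    instance_id: str
    schema: str
    change_log: list = field(default_factory=list)

    def is_biased(self):
        """An instance is biased once it has been individually modified."""
        return bool(self.change_log)

def partition(instances):
    """Split instances into (unbiased, biased) lists."""
    unbiased = [i for i in instances if not i.is_biased()]
    biased = [i for i in instances if i.is_biased()]
    return unbiased, biased

i1 = ProcessInstance("I1", "S", [ChangeLogEntry("serialInsert", "X", "C", "D")])
i2 = ProcessInstance("I2", "S")
unbiased, biased = partition([i1, i2])
```

Keeping the kind of change and its position in each entry is what later allows instance changes to be compared with a process type change.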
3 Framework for Agile Mining of Business Processes
The implementation of a framework for the agile mining of business processes
raises a number of challenges. This section outlines basic requirements (Section
3.1), gives an overview of the designed framework (Section 3.2), and describes
the current state of its implementation (Section 3.3).
3.1 Requirements for an Agile Process Mining Framework
An agile process mining framework must provide comprehensive facilities for
business process monitoring and must allow quick responses to observed discrep-
ancies between the modeled and the executed processes. In particular, process en-
gineers should be supported in learning from previous (ad-hoc) instance changes
and in deriving optimized process models from log data.
When performing a process instance change, information about the reasons
for the change should be kept in a case-base. This can be done by either adding
a new case (when no similar change has been applied before) or by reusing an
existing one (representing a previously applied, similar ad-hoc change). A purely
syntactical approach is not sufficient to stimulate the reuse of change information;
semantical information about the changes must be maintained as well. Consider,
for example, a patient treatment process and assume that an additional
activity (i.e., a lab test) has been frequently inserted for patients older than
40 years and suffering from diabetes. If such context information is explicitly
maintained, respective cases can be reused in similar situations and optimized
process models can be derived from instance change logs.
To continuously optimize business processes, extended mining techniques
must be provided. They should use information from execution and change logs
as well as the semantical information (cases) associated with the changes. When
similar ad-hoc changes occur frequently enough (i.e., their occurrence exceeds a
certain threshold), process engineers should be notified and assisted in optimiz-
ing the process model. Semantical change information must be represented in
such a way that useful suggestions can be made. For example, assume that our
lab test activity has been inserted for a significant number of process instances
in the context just mentioned. When moving this change to the process type
level, the simple insert operation (used at the instance level) cannot be applied
directly as the additional lab test activity shall only be performed if the specific
conditions (older than 40 years, diabetes) are met. Therefore, the PMS should be
able to translate instance changes and related semantical information to process
type changes (i.e., to respective transformations of the process type schema) and
to suggest them to the process engineer.
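A minimal sketch of this requirement is given below. It assumes that the change log is available as (instance, change) pairs and that the threshold is expressed as a share of all instances of the schema; the operation name `conditionalInsert`, the function names, and the textual condition format are hypothetical and only illustrate how the question-answer pairs of a case could become a guard at the type level.

```python
from collections import Counter

def frequent_changes(change_log, instance_count, threshold):
    """Return the changes whose share of all instances reaches the threshold."""
    counts = Counter(change for _, change in change_log)
    return {
        change: n / instance_count
        for change, n in counts.items()
        if n / instance_count >= threshold
    }

def suggest_type_change(change, qa_pairs):
    """Turn an instance-level insert and its context into a guarded insert."""
    _, activity, predecessor, successor = change
    condition = " AND ".join(f"{q} = {a}" for q, a in qa_pairs)
    return {
        "operation": "conditionalInsert",  # hypothetical type-level operation
        "activity": activity,
        "between": (predecessor, successor),
        "condition": condition,
    }

insert_x = ("serialInsert", "X", "C", "D")
change_log = [("I1", insert_x), ("I3", insert_x)]
candidates = frequent_changes(change_log, instance_count=3, threshold=0.5)
suggestion = suggest_type_change(
    insert_x, [("age > 40", "yes"), ("diabetes", "yes")]
)
```

Here two of three instances carry the same insertion, so the change exceeds the assumed threshold of 50% and a guarded insertion of X between C and D is proposed to the process engineer, who remains responsible for accepting it.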
Process engineers should not only be supported in deriving new schema ver-
sions from log data, but also in migrating already running instances to a modified
schema. For this, the PMS has to check whether these instances are compliant
with this new schema version or not. Depending on whether an instance is biased
or unbiased (cf. Section 2.3) and depending on the degree of overlap between
process type and process instance change, different migration strategies must
be supported by the PMS [20]. Furthermore, the system has to migrate the semantical
information associated with the process schema as well (i.e., the cases
representing ad-hoc changes on instances of this schema). However, only information
not yet covered by the new process schema must be migrated. Information on
the ad-hoc change which triggered the process type evolution should be omitted,
i.e., those cases should be dropped from the new case-base version.
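The case-base migration rule can be sketched in a few lines. The representation of cases as dictionaries and the function name `migrate_case_base` are assumptions made for illustration; the second case in the example is invented.

```python
def migrate_case_base(case_base, absorbed_changes):
    """Keep only cases whose change is not yet covered by the new schema."""
    return [case for case in case_base if case["solution"] not in absorbed_changes]

case_base_s = [
    {"problem": "Lab test required",
     "solution": ("serialInsert", "X", "C", "D")},
    {"problem": "Second opinion needed",          # hypothetical case
     "solution": ("serialInsert", "Y", "B", "C")},
]
# The insertion of X triggered the type change and is part of the new schema.
absorbed = {("serialInsert", "X", "C", "D")}
case_base_s_new = migrate_case_base(case_base_s, absorbed)
```

Only the case that is still relevant for future ad-hoc changes survives in the new case-base version; the original case-base is left untouched for instances that remain on the old schema.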
3.2 Overview of the Framework
In order to meet these requirements and to reflect a company’s business processes
adequately, process models must be continuously monitored and adapted to
changing needs. For this, both execution and change logs must be enriched with
semantical information. As illustrated in Fig. 1, different information sources
(i.e., execution logs, change logs, and case-bases) are relevant in this context.
These sources must be continuously evaluated and changes to the process model
should be triggered when discrepancies between it and its instances occur fre-
quently. Based on the information maintained in the execution logs, in the change
logs, and in the case-base, the process engineer can then adapt the process model
and migrate related process instances by using adaptive PM technology (cf. Sec-
tion 2.3).
[Figure 1 shows three information sources feeding adaptive process management. Execution log: start and end events per instance and activity, e.g., Start Activity (I1, C), End Activity (I1, C), Start Activity (I1, X), End Activity (I1, X), Start Activity (I1, D), End Activity (I1, D). Change log: I1: serialInsert (S, X, C, D); I3: serialInsert (S, X, C, D). Case-base: Problem: Lab test required; Question-Answer Pairs: Age? > 40, Diabetes? Yes; Solution: serialInsert (S, X, C, D). Based on these sources, schema S is evolved into schema S', which additionally contains activity X.]
Fig. 1. Information Sources Triggering Process Evolution
Execution and change logs provide the syntactical information on what hap-
pened, i.e., which activities were executed and which deviations occurred at what
time. CCBR (cf. Section 2.2) is used to provide semantical information on why
changes happened. More precisely, experiences from previous changes are stored
as cases in a case-base. A case represents a concrete ad-hoc modification of one
or more process instances. It consists of a textual problem description briefly explaining
the problem that made the deviation necessary, a set of question-answer
pairs, and a solution part. The question-answer pairs describe the reasons and
the context of the ad-hoc deviation and the solution part reflects the concrete
change operations that have been applied to the respective instance(s) to per-
form the desired ad-hoc change by using services of the adaptive PMS (for a
more detailed description see [21, 22]).
In Fig. 1 the mining of the process execution log reveals that activities A, B,
C, D and E are always executed in sequence. The change log further indicates
that for some process instances running on schema S the additional activity X
was dynamically inserted between activities C and D. However, the change log
does not provide any semantical information on why activity X was inserted. By
analyzing the cases in the case-base, the process engineer learns that activity X
was primarily added and executed for patients older than 40 years who suffer
from diabetes. Based on this information he can derive a new process schema
containing the additional activity X for patients matching these two conditions.
Finally, after the new version of the process type schema is released, process
instances can be migrated to the new schema version.
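The effect of the logged change operation in this example can be sketched as follows. The sketch assumes that a schema is a plain list of activities executed in sequence, which holds for the example but not for process graphs in general; `serial_insert` is a hypothetical stand-in that mirrors the argument order of the logged operation and is not the ADEPT implementation.

```python
def serial_insert(schema, activity, predecessor, successor):
    """Insert an activity between two directly succeeding activities of a sequence."""
    position = schema.index(predecessor)  # raises ValueError if predecessor is missing
    if position + 1 >= len(schema) or schema[position + 1] != successor:
        raise ValueError(f"{successor} does not directly follow {predecessor}")
    # Return a new list so that the original schema version stays unchanged.
    return schema[: position + 1] + [activity] + schema[position + 1 :]

schema_s = ["A", "B", "C", "D", "E"]
changed = serial_insert(schema_s, "X", "C", "D")
```

For the sequence A, B, C, D, E the call yields A, B, C, X, D, E. At the type level the insertion would additionally be guarded by the two conditions learned from the case-base, which this unconditional sketch leaves out.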
[Figure 2 depicts the process life cycle with numbered steps (1) to (7): Create Process Type Schema (Process Engineer, supported by Process Mining), Instantiation, execution of Process Instance I by the Process User (Execution Log), Change Process Instance with CCBR support (Change Log), yielding the ad-hoc changed Process Instance I, and Change Process Type Schema from schema S to schema S'. Example activities shown include Enter order, Inform patient, Make appointment, and Examine patient.]
Fig. 2. Integrated Process Life Cycle Support
In the following we describe how the building blocks (cf. Section 2) and
their respective information sources (i.e., execution log, change log and case-
bases) work together to complete the process life cycle (cf. Fig. 2). An initial
process schema can either be created by applying process mining techniques (i.e.,
by observing process and/or task executions) or by business process analysis
(1). During run-time new process instances are created from such a predefined
process schema (2). In general, process instances are then executed according
to their process type schema and execution logs are written (3). However, when
deviations from the predefined schema become necessary (e.g., due to exceptions
or unanticipated situations) process actors must be able to deviate from it.
They can either specify a new ad-hoc deviation and document the reasons for
their changes in the CCBR subsystem, or reuse a previously specified ad-hoc
modification from the case-base (4). In both situations an appropriate change
log entry is written (5) and the execution continues. When a particular ad-hoc
modification is frequently reused, the process engineer is notified to perform a
process type change (6). Both execution and change logs are needed to know how
often a particular schema has been instantiated and how frequently deviations
occurred. The process engineer can then perform a schema evolution (7), as
supported by the adaptive PMS, and, if possible, migrate running instances to
the new schema version.
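Whether a running instance can be migrated in step (7) depends on its execution state. The following sketch uses a strongly simplified compliance check for sequential schemas: an instance is treated as compliant if the activities it has completed so far form a prefix of the new schema. This criterion, the function names, and the example instances I4 to I6 are assumptions for illustration; the correctness criteria actually used by adaptive PMS are surveyed in [19].

```python
def is_compliant(completed, new_schema):
    """A sequential instance is compliant if its history is a prefix of the new schema."""
    return new_schema[: len(completed)] == completed

def migrate(instances, new_schema):
    """Split instance ids into those that can move to new_schema and those that stay."""
    migrated, remaining = [], []
    for instance_id, completed in instances.items():
        if is_compliant(completed, new_schema):
            migrated.append(instance_id)
        else:
            remaining.append(instance_id)
    return migrated, remaining

# Hypothetical new schema in which X is always executed between C and D.
s_new = ["A", "B", "C", "X", "D", "E"]
running = {
    "I4": ["A", "B"],            # has not yet reached the changed region
    "I5": ["A", "B", "C", "D"],  # already passed the insert position without X
    "I6": ["A", "B", "C", "X"],  # biased instance that already executed X
}
migrated, remaining = migrate(running, s_new)
```

In this sketch I4 and I6 can be migrated, while I5 stays on the old schema version because it has progressed too far.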
Thus, combining process mining and CCBR techniques with adaptive PM
allows full process life cycle support and the seamless integration of the detected
discrepancies into a new, optimized process schema.
3.3 Current State of the Framework
Our ongoing research focuses on the integration of adaptive PM technology and
CCBR. We currently work on the integration of the methods, concepts, and
prototypes developed by our groups in the ADEPT and the CBRFlow projects.
ADEPT [4] is a next generation PMS that offers full functionality with respect
to the modeling, analysis, execution, and monitoring of business processes [4, 5,
19]. To the best of our knowledge it is the only PMS providing full support for adaptive
processes at both the process instance and the process type level. When perform-
ing process instance changes, ADEPT ensures consistency and correctness. In
the context of process type changes it additionally allows running process
instances to be migrated efficiently to the new schema version [5, 18, 19]. A powerful
prototype demonstrating this functionality exists and is used by several research
groups and industry partners.
CBRFlow [17] supports users in performing simple ad-hoc instance changes
and allows them to express the semantics of these changes by applying CCBR
techniques. It documents the reasons for a process instance change and enables
the user to reuse information about previously performed ad-hoc deviations
when creating new ones for other instances of the same process type. As with
ADEPT, a powerful prototype exists.
By integrating ADEPT and CBRFlow, significant synergies can be exploited,
fostering integrated process life cycle support [21, 22]. On the one hand the com-
bined system provides a powerful process engine, supporting all kinds of changes
in one system. On the other hand it enables the intelligent reuse of process instance
changes and fosters deriving process type changes from collected information.
We are currently implementing a prototype which integrates both systems; future
work will include process mining tools as well [23]. We will apply and evaluate
our prototype in different application settings, including healthcare processes
and emergent workflows (e.g., in the automotive sector).
4 Sample Application of the Framework
Contemporary PMS rely on predefined process models requiring large upfront
investments to analyze and model business processes before process-aware information
systems can be deployed. It is especially difficult to create adequate
process models for knowledge-intensive processes (e.g., engineering processes in
the automotive sector) comprising both routinized and non-routinized work. In
some cases process schemas are already obsolete when their specification is “completed”.
In today’s dynamic business environment not all eventualities and deviations
can be considered in advance as requirements change or evolve rapidly over
time. Covering too many details in a process schema early on raises the risk of
including rarely needed, not yet needed or never needed parts. For these reasons
process modeling on demand and focusing on core functionality is a promising
approach to foster shortened modeling cycles and earlier productive systems.
To cope with these risks and to ensure that the modeled process schemas
closely reflect a company’s business processes, short iteration cycles must be
supported. Instead of starting with a completely defined process schema, the
process engineer can either develop a preliminary one with a clear focus on core
functionality (e.g., covering the standard process, but not all alternative paths)
or apply process mining techniques to extract process fragments from event logs.
Initially, new process fragments may only reflect a partial view on a process,
i.e., the derived models are not mature. However, they already represent mean-
ingful process knowledge which can be evolved over time and be reused in a
different context (e.g., to automate routine work or to derive more comprehen-
sive process templates). For example, engineers may want to customize fragments
to individual needs or combine them to come up with more complete process
views reflecting complex processes. When adapting or extending process frag-
ments in such a way, new process knowledge is created, which again could be
reused, harvested, and linked with existing fragments.
In this scenario fragment creation, storage, reuse, composition, and execution
are important tasks; our framework provides fundamental support to deal
with them. Initially, process fragments can be derived by applying process mining
techniques; the resulting models can be semantically annotated by CCBR
before they are stored in the process repository. When a user wants to retrieve
a fragment, he is guided through a question-answering sequence by the CCBR
sub-system. When a fragment is reused, its reuse counter is increased. Addition-
ally, quality ratings from users are aggregated in a reputation score. A special
challenge is the execution of incomplete fragments which are to be completed
or adapted; for this, adaptive PM technology is key. Altogether, our framework
provides the needed adaptation and mining techniques to support such evolving
and emergent processes.
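The bookkeeping for fragment reuse can be sketched as follows. The `Fragment` class, the rating scale from 1 to 5, and the use of the arithmetic mean as reputation score are assumptions; the text only states that a reuse counter is increased and that user ratings are aggregated.

```python
from dataclasses import dataclass, field

@dataclass
class Fragment:
    name: str
    activities: list
    reuse_count: int = 0
    ratings: list = field(default_factory=list)

    def reuse(self):
        """Hand out a copy of the fragment and increase its reuse counter."""
        self.reuse_count += 1
        return list(self.activities)

    def rate(self, stars):
        """Record a user rating on an assumed scale from 1 to 5."""
        if not 1 <= stars <= 5:
            raise ValueError("rating must be between 1 and 5")
        self.ratings.append(stars)

    @property
    def reputation(self):
        """Mean of all ratings, or None while the fragment is unrated."""
        if not self.ratings:
            return None
        return sum(self.ratings) / len(self.ratings)

# Hypothetical fragment built from activity names used in the examples.
fragment = Fragment("order entry", ["Enter order", "Make appointment"])
copy_for_user = fragment.reuse()
copy_for_user.append("Inform patient")  # customizing the copy leaves the original intact
fragment.rate(4)
fragment.rate(5)
```

Handing out a copy keeps the stored fragment stable, so that a customized variant can later be stored as a new fragment with its own counter and ratings.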
5 Conclusion and Outlook
We have introduced a framework for the agile mining of business processes based
on CCBR, adaptive PM and process mining. The main focus has not been put
on the development of new concepts in these domains, but on the integration of
existing methods, concepts and prototypes (ADEPT and CBRFlow) and on the
support of business process intelligence. We have sketched the basic relationships
between the different components, discussed technical requirements for providing
an integrated solution, and drafted the approach we take. We are currently working on
the implementation of the framework presented, starting with the integration
of ADEPT and CBRFlow. In this context we have also developed concepts for
process evolution and the migration of related case-bases [21, 22]. In the next
step we want to incorporate process mining tools into our framework.
References
1. Guth, V., Oberweis, A.: Delta analysis of petri net based models for business
processes. In: Proc. Applied Informatics. (1997) 23–32
2. v.d. Aalst, W.: Inheritance of business processes: A journey visiting four notorious
problems. In: Proc. Petri Net Technology for Communication Based Systems.
LNCS 2472 (2003) 383–408
3. v.d. Aalst, W., van Dongen, B., Herbst, J., Maruster, L., Schimm, G., Weijters,
A.: Workflow mining: A survey of issues and approaches. Data and Knowledge
Engineering 47 (2003) 237–267
4. Reichert, M., Dadam, P.: ADEPTflex - supporting dynamic changes of workflows
without losing control. JIIS 10 (1998) 93–129
5. Rinderle, S., Reichert, M., Dadam, P.: Flexible support of team processes by
adaptive workflow systems. Distributed and Parallel Databases 16 (2004) 91–116
6. Aha, D.W., Breslow, L., Muñoz-Avila, H.: Conversational case-based reasoning.
Applied Intelligence 14 (2001) 9–32
7. Kolodner, J.L.: Case-Based Reasoning. Morgan Kaufmann (1993)
8. Golani, M., Pinter, S.S.: Generating a process model from a process audit log. In:
Proc. BPM’03, Eindhoven (2003) 136–151
9. Herbst, J.: An inductive approach to adaptive workflow systems. In: Proc. Work-
shop Towards Adaptive Workflow Systems (CSCW’98), Seattle (1998)
10. de Medeiros, A., van der Aalst, W., Weijters, A.: Workflow mining: Current status
and future directions. In: Proc. CoopIS’03. LNCS 2888, Berlin (2003) 389–406
11. van Dongen, B., van der Aalst, W.: Multi-phase process mining: Building instance
graphs. In: Conceptual Modeling - ER 2004. LNCS 3288, Berlin (2004) 362–376
12. van der Aalst, W., Song, M.: Mining social networks: Uncovering interaction patterns
in business processes. In: Proc. BPM’04, Potsdam, Germany (2004) 244–260
13. van der Aalst, W., van Dongen, B.: Discovering workflow performance models
from timed logs. In: International Conference on Engineering and Deployment of
Cooperative Information Systems (EDCIS 2002). LNCS 2480, Berlin (2003) 45–63
14. Aamodt, A., Plaza, E.: Case-based reasoning: Foundational issues, methodological
variations and system approaches. AI Communications 7 (1994) 39–59
15. Aha, D.W., Muñoz-Avila, H.: Introduction: Interactive case-based reasoning. Applied
Intelligence 14 (2001) 7–8
16. Weber, B., Werner, W., Breu, R.: CCBR-enabled adaptive workflow management.
In: Proc. European Conf. on Case-Based Reasoning (ECCBR’04). LNCS 3155,
Madrid (2004) 434–448
17. Weber, B., Wild, W., Breu, R.: CBRFlow: Enabling adaptive workflow manage-
ment through conversational case-based reasoning. In: Proc. European Conf. on
Case-Based Reasoning (ECCBR’04), Madrid (2004) 434–448
18. Rinderle, S., Reichert, M., Dadam, P.: On dealing with structural conflicts between
process type and instance changes. In: Proc. BPM’04, Potsdam (2004) 274–289
19. Rinderle, S., Reichert, M., Dadam, P.: Correctness criteria for dynamic changes in
workflow systems – a survey. DKE 50 (2004) 9–34
20. Rinderle, S., Reichert, M., Dadam, P.: Disjoint and overlapping process changes:
Challenges, solutions, applications. In: CoopIS’04, Agia Napa (2004) 101–120
21. Weber, B., Rinderle, S., Wild, W., Reichert, M.: CCBR-driven business process
evolution. In: ICCBR’05, Chicago (2005)
22. Rinderle, S., Weber, B., Reichert, M., Wild, W.: Integrating process learning and
process evolution - a semantics based approach. In: BPM 2005. (2005)
23. Process Mining Research: www.processmining.org (2005)