A METHOD FOR REWRITING LEGACY SYSTEMS USING
BUSINESS PROCESS MANAGEMENT TECHNOLOGY
Gleison Samuel do Nascimento, Cirano Iochpe
Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil
Lucinéia Heloisa Thom, Manfred Reichert
Institute of Databases and Information Systems, University of Ulm Oberer Eselsberg, 89069, Ulm, Germany
Keywords: legacy system, reengineering, business process, business process management
Abstract: Legacy systems are systems which execute useful tasks for the organization. Unfortunately, keeping a
legacy system running is a complex and costly task. Thus, in recent years several approaches have been suggested
to rewrite legacy systems using contemporary technologies. In this paper we present a method for rewriting
legacy systems based on Business Process Management (BPM). The use of BPM for migrating legacy systems
facilitates the monitoring and continuous improvement of the information systems existing in the organization.
1 INTRODUCTION
Legacy systems are information systems which
execute useful tasks for an organization, but were de-
veloped with technologies no longer in use (Ward and
Bennett, 1995). Legacy systems include information
and procedures which are fundamental for the operation
of the organization. However, keeping a legacy system
running is a complex and costly task. The reason is that
the corresponding program code can be obsolete, hard to
understand, and poorly documented.
To reduce these problems, many organizations are
opting for rewriting and implementing their systems
using contemporary technologies. In recent years
several approaches have been suggested to perform this
rewriting. The majority of them require rewriting either
the entire system (Biggerstaff, 1989) or specific system
modules (Bianchi et al., 2003). Both approaches block
maintenance of the legacy system during the rewriting
process. However, business in organizations is generally
dynamic, and interrupting maintenance activities might
result in operational problems. Apart from this, neither
of the two approaches considers the need to understand
how the legacy system works (i.e., what its business
logic is) and what its impact on the efficiency of the
organization's business is.
To address these problems, in this paper we present
a method for rewriting legacy systems based on Business
Process Management (BPM). In recent years we have seen
an increasing adoption of BPM tools by enterprises as
well as emerging standards for business process
specification, execution and monitoring.
Our method uses BPM and Service Oriented Architecture
(SOA) to rewrite legacy systems. Basically, we aim at:
1) identifying business processes embedded in a legacy
system, and 2) implementing the business processes
identified from the legacy code using BPM and SOA tools,
without the need to rewrite the source code of the legacy
system. The major contributions of our approach can be
summarized as follows:
1. the business process being executed in the organi-
zation can be better documented, i.e., the business
model obtained from analyzing the legacy code
can be represented graphically in BPM tools.
2. the business process is explicitly represented in
an executable model, which eases monitoring and
improvement.
3. fragments of the source code existing in the organization
can be reused; i.e., the functions it comprises
can be transformed into web services which
are composed using a BPM tool.
4. finally, once the business process has been
identified, our approach allows the legacy code to be
rewritten activity by activity, i.e., only the code
(e.g., in Cobol) related to a particular business
process activity is rewritten (e.g., in Java).
The remainder of this paper is organized as fol-
lows: Section 2 gives background information needed
for understanding this paper. Section 3 presents the
proposed method for legacy system rewriting based
on BPM. Section 4 discusses related work. We con-
clude with a summary and outlook in Section 5.
2 BACKGROUND INFORMATION
An organization is composed of business processes.
Each process consists of a series of (structured)
activities which jointly realize a particular business
goal (Weske, 2007). For instance, the creation of
a sales order in the organization can be seen as a business
process where the sale of a product is the goal to
be achieved. In order to achieve this goal a number of
activities need to be performed, such as stock checking,
payment terms definition, and credit checking.
This holistic view of enterprises, where business
processes are the main instrument for organizing
their operations, is called BPM (Business Process
Management) (Smith and Fingar, 2002). BPM defines
a life cycle to develop, implement, enact and monitor
business processes (Weber et al., 2009). In short,
the cycle has four phases: design, configuration,
enactment, and evaluation.
In the design phase, the business process to be
executed in the organization is identified, its goals
are defined and the respective process model is de-
signed. In the configuration and enactment phases
respectively, the process model is implemented and
executed. Finally, in the evaluation phase the process
is monitored and diagnostics on its efficiency are ob-
tained.
One of the major goals of BPM is to gain a better
understanding of the operations a company performs
and of their relationships. The explicit representa-
tion of business processes constitutes the core con-
cept to achieve this better understanding. BPM also
facilitates business process improvement. Currently,
there are several tools to support each phase of the
BPM life cycle. These tools are called Business Process
Management Systems (BPMS) (Reis, 2007).
In organizations, however, there exist information
systems which were developed before BPM technology
emerged. These legacy systems do not give
a clear perspective of the business process. Generally,
they implement the complete business process
or fragments of it. There is a need to integrate legacy
systems with current BPM approaches.
Legacy systems are typically complex systems
which were developed using different programming
languages. In this context, Service Oriented Architectures
have proven to be an efficient technology
(Papazoglou and Heuvel, 2007). Web services are interfaces
that ensure the communication of systems developed
with different technologies (Papazoglou and
Heuvel, 2007).
The joint use of BPM and SOA can be an efficient
way to rewrite legacy systems. While BPM provides tools
for building and implementing business processes, SOA
provides a standard interface for connecting business
processes and legacy systems through web services
generated from the source code of the legacy system.
3 PROPOSED METHOD FOR LEGACY SYSTEM REWRITING
In this section we introduce a BPM driven method
we are developing to rewrite legacy systems. Our
method comprises three phases: Business process
identification and definition; Business process imple-
mentation; Business process enactment and improve-
ment.
Note that a legacy system can implement more
than one business process. Therefore, we are not nec-
essarily proposing the mapping of the legacy system
to one single business process. Similarly, a legacy
system may implement only fragments of a business
process.
Thus, it is very important that the developer knows
the business processes that the legacy system implements
(the developer is the user who applies the method
proposed in this paper). Accordingly, the developer
must apply our rewriting method to each of the processes
embedded in the legacy system. Note that in each
iteration of our method only one business process
is identified and rewritten. The legacy system is thus
rewritten by modules, where each module corresponds
to a business process. Therefore, the impact on
legacy system maintenance is minimized.
In the following sections we describe each of these
phases in detail.
3.1 Business Process Identification and
Definition
Identifying business processes from legacy code and
defining the respective models constitute the main phase of
the proposed method. It is in this phase that the
developer designs the business process. For this purpose,
the developer identifies from the legacy system the process
activities to be implemented.
Many steps of this phase require human interven-
tion. However, in particular steps of our mapping ap-
proach we can use algorithms which shall help the
developer in extracting the desired information from
the legacy code.
In order to better understand this phase we subdivide
it into eight steps executed in sequence.
Step 1: Scope Definition of the Process
In Step 1 the developer must define the purpose
of the business process within the organization and its
application domain. In addition, the developer must
list keywords related to the defined goals and application
domain. As an example, let us assume that the organization
has a legacy system to monitor sales. Assume
further that we want to know how a sales quotation
is performed in the organization. Then, the developer
first has to define the process purpose: generation of
a quotation. Second, the application domain of the
process must be identified. As the process is executed in
the sales department, the application domain is logistics.
Finally, keywords related to the process must be defined
(e.g., product, stock, customer, credit).
The definitions made in this step can be used in
the development of algorithms to extract information
from the source code in the subsequent steps.
Through keywords such as stock, an algorithm executed
in Step 7 (Identifying Automated Activities) can
find a procedure named "stockControl" and
automatically map it to an automated activity of the
process. Note that the activity is automated, since it
is executed by a procedure (or function) of the legacy
system.
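The keyword-based lookup described above could be sketched as follows. This is only an illustration under stated assumptions: the helper names (split_identifier, candidate_activities) and the list of procedure names are hypothetical, and a real extractor would first have to obtain the procedure names by parsing the legacy code.

```python
import re

def split_identifier(name):
    """Split an identifier such as 'stockControl' or 'CHECK_CREDIT' into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]

def candidate_activities(procedure_names, keywords):
    """Map each procedure whose name contains a Step 1 keyword to the keywords it matched."""
    keyword_set = {k.lower() for k in keywords}
    matches = {}
    for name in procedure_names:
        hits = sorted(keyword_set.intersection(split_identifier(name)))
        if hits:
            matches[name] = hits
    return matches

# Keywords from the quotation example; the procedure names are invented for illustration.
keywords = ["product", "stock", "customer", "credit"]
procedures = ["stockControl", "printReport", "CHECK_CREDIT", "openFile"]
result = candidate_activities(procedures, keywords)
```

Matching on whole words of the identifier, rather than on substrings, keeps unrelated names from being selected by accident; the developer would still review every candidate manually.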
Step 2: Defining Start and End Events
At this step, the developer must identify the start
and end events of the process. The start event may
be the display of a screen of the legacy system, or
even the receipt of a file or message in the organiza-
tion. Considering the quotation example, for instance,
the start event of a quotation may be a system opera-
tor accessing the menu Quotation and selecting the
option Create a New Quotation. Similarly, the devel-
oper must identify the event or condition that termi-
nates the business process. In our example, the end
event may be the recording of the order in a database,
or even the system operator receiving a confirmation
message on the screen. Obviously, a business process
can have more than one start or end event. Through
start and end events, we can select the source code
files to be analyzed in order to identify the business
process.
Step 3: Identifying Human Activities in the Legacy
System
In Step 3 the developer must indicate the human
interactions with the legacy system. The human interactions
considered here correspond to the electronic
forms of the legacy system that are used between
the start and end events (cf. Step 2). These
forms correspond to human activities in the business
process.
In the quotation example, the forms are filled out
by an operator of the legacy system during the cre-
ation of a quotation (e.g. the form of client informa-
tion). Each human interaction is mapped to a human
activity in the business process. Observe that in this
step the fields comprised by each form can be iden-
tified. This information can be used in the creation
of metadata which can be further used in the gener-
ation of forms related to the human activities of the
business process.
The metadata can be also useful when identifying
in the source code the automated activities related to
the business process. In the source code we can find
business rules which validate the values entered in the
fields of the forms, as we show in Step 7.
Step 4: Identifying Activities Outside the Legacy
System
As mentioned before, a legacy system does not always
cover all activities of a business process. The
process may also have activities which are
not executed within the scope of the legacy system.
An example of this kind of activity in the quotation
process is the credit verification of a customer by
a credit protection institution. Other examples include
the production of items that are not in stock and
the shipment of items to the customer. In addition
to identifying these activities, the developer
should indicate whether each activity is executed manually
or automatically.
Step 5: Defining the Partial Order of the Activities
Identified in Steps 3 and 4
After identifying the human activities imple-
mented in the legacy system as well as the activities
which are not executed within the scope of the legacy
system, the developer must define the partial order of
these activities in the process model. The definition of
the partial order is done manually by the developer.
Obviously, these activities are connected by spe-
cific control flows (e.g., sequence, XOR-Split, AND-
Split). The developer must indicate control dependen-
cies between the activities.
The partial order of human activities can also de-
termine which part of the source code is analyzed in
Step 7. When determining the relationship between
human activities A and B the developer is delimiting
the portion of source code to be reviewed when iden-
tifying automated activities between A and B.
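As a rough sketch, the partial order defined in Step 5 can be stored as a predecessor map. The activity names below are hypothetical and loosely follow the quotation example; the standard-library graphlib module (Python 3.9 or later) is used only to check that the order is acyclic and to produce one admissible sequence.

```python
from graphlib import TopologicalSorter

# Hypothetical activities: each key maps to the set of activities that must precede it.
precedes = {
    "fill_customer_form": set(),
    "check_credit_externally": {"fill_customer_form"},
    "fill_items_form": {"fill_customer_form"},
    "confirm_quotation": {"check_credit_externally", "fill_items_form"},
}

def one_valid_order(predecessors):
    """Return one sequence compatible with the partial order (raises CycleError on cycles)."""
    return list(TopologicalSorter(predecessors).static_order())

def direct_pairs(predecessors):
    """Pairs (A, B) where A directly precedes B; each pair delimits code to inspect in Step 7."""
    return sorted((a, b) for b, preds in predecessors.items() for a in preds)

order = one_valid_order(precedes)
pairs = direct_pairs(precedes)
```

Each pair returned by direct_pairs corresponds to a relationship between two activities A and B, and therefore to one portion of source code to be reviewed for automated activities.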
Step 6: Identifying Roles for Executing Human
Activities
At this point, the developer has already identified
the human interactions within the legacy system
and has defined the partial order between these activities.
Now, the developer must determine which roles or
responsibilities in the legacy system a user must have
in order to fill in each of the forms identified as human
activities (Ly et al., 2005). This makes it possible to
identify the user roles within the business process that
are required to work on the respective activities.
Step 7: Identifying Automated Activities
In Step 7, the developer must analyze the source
code in order to identify automated activities being
executed in the legacy system. For that, the partial
order of human activities is taken into consideration.
As in Step 5, the identification of automated activities
should consider the source code being executed
between the invocation of two human activities of the
business process.
Afterwards, the developer must identify the business
rules captured by this source code. A business
rule is a statement that controls or influences the
behavior of a system (Group, 2006).
Regarding the source code of a system, business
rules are structures of the type condition and action. A
condition consists of one or more boolean expressions
connected by logic operators ("and", "or", "not").
An action, in turn, consists of operations which produce
some processing result (e.g., recording of data in
a database).
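The condition and action structure can be made concrete with a small data structure. The sketch below is an assumption made for illustration only: the class BusinessRule, the function apply_rules and the stock reservation rule are hypothetical and are not taken from any particular legacy system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class BusinessRule:
    name: str
    condition: Callable[[Dict], bool]  # boolean expression over the current data
    action: Callable[[Dict], None]     # operation producing a processing result

def apply_rules(rules: List[BusinessRule], context: Dict) -> List[str]:
    """Run the action of every rule whose condition holds; return the names of fired rules."""
    fired = []
    for rule in rules:
        if rule.condition(context):
            rule.action(context)
            fired.append(rule.name)
    return fired

# Hypothetical rule: reserve stock if enough items exist AND the customer is NOT blocked.
reserve_stock = BusinessRule(
    name="reserve_stock",
    condition=lambda c: c["in_stock"] >= c["quantity"] and not c["customer_blocked"],
    action=lambda c: c.setdefault("records", []).append(("RESERVED", c["quantity"])),
)

context = {"in_stock": 10, "quantity": 3, "customer_blocked": False}
fired = apply_rules([reserve_stock], context)
```

Here the list stored under "records" merely stands in for the recording of data in a database.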
Below we list characteristic business rules we are
trying to discover from the source code of the legacy
systems.
Persistence: Code fragments which deal with
data persistence. Usually, these fragments refer
to database transactions.
Information Flow: Rules defining the informa-
tion being exchanged between two activities.
Control Flow: Rules defining the routing of au-
tomated activities (e.g. IF statements).
Pre-conditions: Rules indicating required conditions
which must be satisfied in order to execute a
particular activity. For example, after performing
a human activity, there may be validations of the
field values.
Post-conditions: Rules verifying a system condition
after executing a particular data processing step.
Frequency: These rules identify the number of
iterations of a particular rule or a series of rules
within the source code (i.e., loops in source code).
Execution Time (Duration): Business rules
which define the execution time of an activity,
i.e., fragments of the source code that identify
processing timeouts.
External Calls: These rules determine the in-
vocation of external applications from the legacy
system.
Computations: Rules which identify mathemati-
cal computations.
Here we can develop algorithms to support the
developer in the identification of automated activities.
Currently, there exist several algorithms for
identifying business rules in source code (Tip, 1995;
Paradauskas and Laurikaitis, 2006). We intend to
adopt these algorithms in order to identify the most
relevant business rules of a particular business process.
Note that these algorithms must consider all
information collected during Steps 1 to 6.
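To give an impression of what tool support for this step might look like, the following sketch assigns single source lines to some of the rule categories listed above. It is deliberately naive: it uses plain pattern matching on Cobol-like statements rather than the program slicing techniques cited, and the patterns, the function classify_line and the sample lines are assumptions made for illustration.

```python
import re

# Ordered list: the first matching pattern decides the category.
RULE_PATTERNS = [
    ("Persistence", re.compile(r"\b(EXEC\s+SQL|INSERT|UPDATE|DELETE|WRITE|REWRITE)\b", re.I)),
    ("External Calls", re.compile(r"\bCALL\b", re.I)),
    ("Frequency", re.compile(r"\bPERFORM\s+.*\b(UNTIL|TIMES)\b", re.I)),
    ("Control Flow", re.compile(r"\b(IF|EVALUATE)\b", re.I)),
    ("Computations", re.compile(r"\b(COMPUTE|ADD|SUBTRACT|MULTIPLY|DIVIDE)\b", re.I)),
]

def classify_line(line):
    """Return the first rule category whose pattern occurs in the line, or None."""
    for category, pattern in RULE_PATTERNS:
        if pattern.search(line):
            return category
    return None

# Invented sample statements in Cobol-like syntax.
sample = [
    "IF STOCK-QTY < ORDER-QTY",
    "COMPUTE TOTAL = PRICE * QTY",
    "CALL 'CREDITCHK' USING CUST-ID",
    "EXEC SQL INSERT INTO QUOTATION VALUES (:Q-ID) END-EXEC",
    "PERFORM ADD-ITEM UNTIL EOF",
    "MOVE CUST-NAME TO OUT-NAME",
]
categories = [classify_line(line) for line in sample]
```

A line-based classifier like this can only propose candidates. Categories such as information flow or pre-conditions depend on the context of a statement and would require data-flow analysis in addition to the information collected in Steps 1 to 6.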
Step 8: Validation of the Retrieved Process Model
After having identified the automated activities as
well, we obtain the final process model. The obtained
process model has to be validated. Here, behavioral
properties of the process model are verified
(e.g., absence of deadlocks and of paths that are
never executed). In order to perform this validation,
the business process is mapped to a representation in
π-calculus (Milner et al., 1992). This representation is
called π-process. The π-process obtained can then be
checked with a model checking tool. In (Thom et al.,
2008) this mapping is defined and a model checking
tool is detailed.
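The kind of property checked in this step can be illustrated with a much weaker, purely structural test on the control-flow graph of the process model. The sketch below is not the π-calculus mapping and not a model checker; it only reports activities that cannot be reached from the start event and activities from which the end event cannot be reached. The graph and all names in it are hypothetical.

```python
from collections import deque

def reachable(graph, source):
    """Breadth-first search: all nodes reachable from source (including source)."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def structural_problems(graph, start, end):
    """Report nodes never reached from start and nodes that can never lead to end."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    from_start = reachable(graph, start)
    reverse = {n: [] for n in nodes}
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].append(src)
    to_end = reachable(reverse, end)
    return {
        "never_executed": sorted(nodes - from_start),
        "cannot_finish": sorted(nodes - to_end),
    }

# Hypothetical process model with one orphan activity and one dead end.
model = {
    "start": ["fill_form"],
    "fill_form": ["check_stock"],
    "check_stock": ["end", "wait_supplier"],
    "orphan_report": ["end"],
}
problems = structural_problems(model, "start", "end")
```

Such a check ignores the semantics of splits and joins, which is why a formal representation and a model checking tool are still needed to detect deadlocks in general.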
3.2 Business Process Implementation
The implementation phase of the process model is divided
into three steps. Note that the implementation
of the process model must avoid operational breaks in
the legacy system, i.e., the legacy system must continue
operating and its maintenance must not be interrupted.
Step 1: Implementation of the Process Model
In the first step, the developer must choose a
BPM tool for implementing the business process (e.g.,
Intalio (Intalio, 1999) or ADEPT2 (Reichert et al.,
2005)). After that, the process must be designed in a
notation supported by the tool. Moreover, the developer
must define the input and output attributes of each
activity in the process as well as the related roles and
users which execute them.
In addition, the developer may want to implement
procedures to obtain measurement metrics on process
performance.
Step 2: Implementation of Automated Activities
We can now implement the automated activities of
the process model. As discussed in Section 3.1 (Step
7), these activities are identified based on the analysis
of the business rules existing in the source code.
Through business rule identification we can also
identify the source code implementing these rules.
This source code can be mapped to automated activities
of the process model.
We can implement a web service to execute the
piece of source code which implements an automated
activity. Thus, it is not necessary to rewrite the source
code in a language that can be interpreted by the BPM
tool. For example, suppose that we are using a legacy
system written in the C language. In this case, we can
write a Java web service which calls a procedure
written in C through JNI (Java Native Interface)
(Liang, 2002). The Java code generated is a method
that calls the C procedure.
Altogether, our approach allows for the migration
of legacy systems to a process oriented technology
without the need to rewrite the legacy code, and without
interrupting the operation or maintenance of the
legacy system.
Once the web services have been created, they must be
connected to the respective activities in the business
process implemented in the BPM tool.
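The wrapping idea can be sketched independently of Java and JNI. In the following illustration, written in Python for brevity, the function legacy_stock_control is a stand-in for a binding to compiled legacy code, and stock_control_service is a hypothetical service operation that a BPM tool could invoke for the automated activity. The JSON field names and the return-code convention are assumptions, and the transport layer (HTTP or SOAP) is omitted.

```python
import json

def legacy_stock_control(product_code, quantity):
    """Stand-in for the legacy procedure; returns 1 if the quantity is in stock, else 0."""
    stock = {"P-100": 12, "P-200": 0}  # invented data replacing the legacy data store
    return 1 if stock.get(product_code, 0) >= quantity else 0

def stock_control_service(request_body):
    """Service operation: translate the request, call the legacy code, translate the result."""
    request = json.loads(request_body)
    code = legacy_stock_control(request["productCode"], int(request["quantity"]))
    return json.dumps({"available": code == 1})

response = stock_control_service('{"productCode": "P-100", "quantity": 5}')
```

The wrapper only converts between the message format of the process engine and the calling convention of the legacy procedure; the business logic itself remains in the legacy code and can be replaced later without changing the service interface.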
Step 3: Implementation of the Human Activities
In the final step of the implementation phase, the
forms related to the human activities are generated.
The attributes included in each form are the same as
those appearing in the screens of the legacy system.
In most BPM tools, forms are automatically
generated. Given the attributes related
to each process activity and their respective data types,
the tools generate the forms. The forms can then be
customized according to a particular domain.
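How a form might be derived from attribute metadata can be sketched as follows. This is not the mechanism of any specific BPM tool: the widget mapping, the function generate_form and the attribute list are hypothetical and only show how the field metadata collected in Step 3 of Section 3.1 could be turned into a form description.

```python
# Assumed mapping from attribute data types to generic form widgets.
WIDGET_BY_TYPE = {
    "string": "text",
    "integer": "number",
    "decimal": "number",
    "date": "date",
    "boolean": "checkbox",
}

def generate_form(activity, attributes):
    """Build a form description from (attribute name, data type) pairs."""
    fields = []
    for name, data_type in attributes:
        fields.append({
            "name": name,
            "label": name.replace("_", " ").capitalize(),
            "widget": WIDGET_BY_TYPE.get(data_type, "text"),  # fall back to plain text
        })
    return {"activity": activity, "fields": fields}

# Invented attributes for the client information form of the quotation example.
form = generate_form(
    "fill_customer_form",
    [("customer_name", "string"), ("credit_limit", "decimal"), ("active", "boolean")],
)
```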
3.3 Business Process Enactment and
Improvement
In the last phase of our method the process model
is executed and the results of its execution are monitored.
Based on the analysis of the results it is possible to
improve the process model as well as the performance
of each activity.
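One simple example of such an analysis is the average duration of each activity, computed from an execution log. The log format below, tuples of activity name, start time and end time in seconds, is an assumption; real BPM tools provide their own audit data and metrics.

```python
from collections import defaultdict

def average_durations(events):
    """Average duration per activity from (activity, start, end) tuples."""
    totals = defaultdict(lambda: [0.0, 0])  # activity -> [summed duration, count]
    for activity, start, end in events:
        totals[activity][0] += end - start
        totals[activity][1] += 1
    return {activity: total / count for activity, (total, count) in totals.items()}

# Invented log entries for two activities of the quotation process.
log = [
    ("check_stock", 0.0, 2.0),
    ("check_credit", 2.0, 3.0),
    ("check_stock", 10.0, 14.0),
]
averages = average_durations(log)
```

Activities with a high average duration are natural candidates for improvement, for instance by rewriting the legacy code behind them first.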
In this phase, the organization may also consider
rewriting the source code in a more contemporary language.
In the example given in Section 3.2, Step 2, we
use JNI to invoke a procedure written in C, thus allowing
the reuse of legacy code. After implementation
and validation of the business process, the developer
may begin to migrate the C code to a language such as
Java, which eliminates the JNI communication in
the future. This change can be carried out without any
impact on the operation of the organization.
4 RELATED WORK
In recent years, several methods for legacy system
rewriting have been proposed and discussed in litera-
ture. We classify these methods into three groups: 1)
reengineering of the complete system; 2) reengineer-
ing through wrapping techniques; 3) reengineering of
the system by modules.
An example of the first category is the approach
proposed in (Biggerstaff, 1989). These methods
consider the migration of the complete system.
This implies locking maintenance of the legacy
system in order to prevent any change in the
organization's business rules during the rewriting process.
In our proposal this does not occur for two reasons.
First, the system is not completely rewritten, but
only the parts related to the business process. Second,
the source code of the legacy system is not rewritten
initially, as shown in Step 2 of the implementation
phase of the business process. The code is rewritten
once the process is running, as proposed in Section 3.3.
The second category, reengineering using wrapping
techniques, proposes the introduction of a communication
layer between the new system and the libraries
of the legacy system, similar to what we propose in Step 2
of the implementation phase of the business process.
The approach proposed in (Bisbal et al., 1999) is an
example of these techniques. However, these
studies do not consider analysis of the legacy code.
They propose that only new features of the system
use a new technology, while the legacy code is treated
as a black box, i.e., they do not provide an
explicit view of the business process. In our proposal,
the business process is documented in a BPM tool, so
it can be executed, monitored and improved.
Finally, the third category considers reengineering
by modules. In this case, the legacy system is divided
into modules. Thus, during maintenance only the in-
volved modules are locked rather than the complete
system. The iterative reengineering methodology (Bianchi et al.,
2003) is an example of this category. Our proposal differs
from this category because its main focus is on
the business process being executed in the organization.
Moreover, the identification of the modules
of the legacy system is complex. In our approach,
we propose identifying the business process first, which
then enables the identification of the related source
code.
5 SUMMARY AND OUTLOOK
In this paper we proposed a BPM-driven rewriting
approach which consists of mapping legacy
systems to business processes. The overall goal of
our rewriting method is to ease the upgrade of the
legacy system. The use of BPM for migrating legacy
systems facilitates the monitoring and continuous improvement
of the information systems existing in the
organization. In addition, the business process being
executed in the organization is documented. This,
in turn, allows for dissemination of knowledge
which was previously held only by the developers of the
legacy system.
Another significant advantage of our rewriting
method concerns the reuse of the source
code of the legacy system. Thus, the business process
can be implemented without the legacy system being
interrupted or its maintenance being blocked. Once
the process is running, the organization can start
rebuilding the legacy code without interrupting the
business process. Furthermore, we have
already started to apply our method to the rewriting of
a real legacy system from the logistics domain.
As future work we intend to develop data structures
which shall store and integrate all the concepts
discussed during the application of our method, for
instance, an ontology which relates business rules,
process activities and application domain. These
structures are fundamental for building algorithms.
ACKNOWLEDGEMENTS
The authors would like to acknowledge the Coordi-
nation for the Improvement of Graduated Students
(CAPES), the Institute of Databases and Information
Systems of the University of Ulm (Germany), and
the Informatics Institute of Federal University of Rio
Grande do Sul (Brazil).
REFERENCES
Bianchi, A., Caivano, D., Marengo, V., and Visaggio,
G. (2003). Iterative reengineering of legacy sys-
tems. IEEE Transactions on Software Engineering,
29(3):225–241.
Biggerstaff, T. J. (1989). Design recovery for maintenance
and reuse. Computer, 22(7):36–49.
Bisbal, J., Lawless, D., Wu, B., and Grimson, J. (1999).
Legacy information systems: Issues and directions.
IEEE Software, 16(5):103–111.
Group, B. R. (2006). Guide: Business rules project. Technical
report. Available at: www.guide.org/pubs.htm.
Intalio (1999). Creating process flows. Technical report,
Intalio Inc.
Liang, S. (2002). Java Native Interface: Programmer’s
Guide and Specification. Sun Microsystems, Inc.
Ly, L. T., Rinderle, S., and Reichert, M. (2005). Mining
staff assignment rules from event-based data. In
Proc. Workshop on Business Process Intelligence
(BPI) in conjunction with BPM'05, pages 177–190,
Nancy, France. Springer.
Milner, R., Parrow, J., and Walker, D. (1992). A calculus of
mobile processes. Technical report, University of
Edinburgh.
Papazoglou, M. P. and Heuvel, W.-J. (2007). Service ori-
ented architectures: approaches, technologies and re-
search issues. The VLDB Journal-The International
Journal on Very Large Data Bases, 16(3):389–415.
Paradauskas, B. and Laurikaitis, A. (2006). Busi-
ness knowledge extraction from legacy informa-
tion systems. Information Technology and Control,
35(3):214–221.
Reichert, M., Rinderle, S., Kreher, U., and Dadam, P.
(2005). Adaptive process management with adept2.
In ICDE ’05: Proc. Int. Conf. on Data Engineering,
pages 1113–1114, Tokyo, Japan. IEEE Comp. Press.
Reis, G. (2007). Introduction to bpm, bpms and soa. Portal
BPM, 01:22–29.
Smith, H. and Fingar, P. (2002). Business Process Manage-
ment: The Third Wave. Meghan-Kiffer Press.
Thom, L. H., Iochpe, C., Reichert, M., Weber, B., Matthias,
D., Nascimento, G. S., and Chiao, C. M. (2008). On
the support of activity patterns in prowap: Case stud-
ies, formal semantics, tool support. Revista Brasileira
de Sistemas de Informacao (iSys), 01.
Tip, F. (1995). A survey of program slicing techniques.
Journal of Programming Languages, 3:121–189.
Ward, M. P. and Bennett, K. H. (1995). Formal methods for
legacy systems. Journal of Software Maintenance and
Evolution, 7(3):203–219.
Weber, B., Reichert, M., Wild, W., and Rinderle-Ma, S.
(2009). Providing integrated life cycle support in
process-aware information systems. Journal of Co-
operative Information Systems, 18(1). (Accepted for
Publication).
Weske, M. (2007). Business Process Management: Con-
cepts, Languages, Architectures. Springer, Berlin.