Towards Collecting Sustainability Data in
Supply Chains with Flexible Data Collection
Processes
Gregor Grambow, Nicolas Mundbrod, Jens Kolb, and Manfred Reichert
Institute of Databases and Information Systems
Ulm University, Germany
{gregor.grambow,nicolas.mundbrod,jens.kolb,manfred.reichert}@uni-ulm.de
http://www.uni-ulm.de/dbis
Abstract. Nowadays, OEMs from many domains (e.g., electronics and
automotive) face rising pressure from customers and legal regulations
to produce more sustainable products. This involves the reporting and
publishing of various sustainability indicators. However, the demands of
legal entities and customers constitute a tremendous challenge as prod-
ucts in these domains comprise various components and sub-components
provided by suppliers. Hence, sustainability data collection must be ex-
ecuted along the entire supply chain. In turn, this involves a myriad of
different automated and manual tasks as well as quickly changing sit-
uations. In combination with potentially long-running processes, these
issues result in great process variability that cannot be predicted at de-
sign time. In the SustainHub project, a dedicated information system for
supporting data collection processes is developed. This paper provides
three contributions: (1) it identifies core challenges for sustainable supply
chain communication, (2) it reviews state-of-the-art technical solutions
for such challenges, and (3) it gives a first overview of the approach we
are developing in the SustainHub project to address the challenges. By
achieving that, this comprehensive approach has the potential to unify
and simplify supply chain communication in the future.
Key words: Process Variability, Data Collection, Sustainability, Sup-
ply Chain
1 Introduction
Companies of the electronics and automotive industry face steadily growing de-
mands for sustainability compliance triggered by authorities, customers and
the public opinion. As products often consist of numerous individual compo-
nents, which, in turn, comprise sub-components, heterogeneous sustainability
data need to be collected along intertwined and intransparent supply chains.
As a consequence, highly complex, cross-organizational data collection processes
are required that feature a high variability. Further issues include incomplete-
ness and varying quality of provided data, heterogeneity of data formats, or
changing situations and requirements. So far, there has been no dedicated infor-
mation system (IS) supporting companies in creating, managing and optimizing
such data collection processes. In the SustainHub¹ project, such a dedicated
information system is being developed. In the context of this project, we have
intensively studied use cases, which were delivered by industry partners from
the automotive and electronics domain in order to elaborate core challenges and
requirements regarding the IT support of adaptive data collection processes. To
assess whether existing approaches and solutions satisfy the requirements, the
state of the art has been thoroughly studied as well. This paper presents core
challenges with respect to complex sustainability data collection processes along
today’s supply chains and presents the state-of-the-art in this context. Supply
chains are well suited for eliciting these challenges because of the complexity on
one hand and the requirements imposed by emerging laws and regulations on
the other. However, the core challenges identified apply to many other domains
as well.
Altogether, this paper reveals seven core challenges for data exchange and
collection in complex distributed environments and evaluates how far existing
approaches contribute to tackling these challenges. Besides the clear focus on challenges and
requirements, this paper also gives a first abstract outlook on a system we are
currently developing to tackle the challenges. Thereupon, future research on
adaptive business process management technology can be aligned to support
more variability and dynamics in today’s data collection processes.
Fundamentals of sustainable supply chains as well as an illustrating example
are introduced in Section 2. Then, seven data collection challenges are unveiled
in Section 3, exposing concrete findings, identified problems and derived require-
ments. In Section 4, the current state-of-the-art is presented. Following this, we
briefly discuss the approach we are developing to solve the reported issues in
Section 5. Finally, Section 6 rounds out this paper giving a conclusion and an
outlook.
2 Sustainable Supply Chains
This section gives insights into sustainable supply chains and provides an illus-
trating example.
2.1 Fundamentals
The development and production of products is often based on complex sup-
ply chains involving dozens of interconnected companies distributed around the
globe. In order to ensure competitiveness, complex communication tasks must
be effectively and efficiently managed in the context of cross-organizational
processes. Generally, such a cross-organizational collaboration consists of a variety
of both manual and automated tasks. Moreover, the involved companies differ
significantly in size and industry background and use heterogeneous ISs. Due
to this heterogeneity, neither federated data schemes nor unifying tools or other
concepts can be adopted in this context [1].
¹ SustainHub (Project No. 283130) is a research project within the 7th Framework
Programme of the European Commission (Topic ENV.2011.3.1.9-1, Eco-innovation).
As sustainability constitutes an emerging trend, manufacturers face new chal-
lenges in their supply chains: sustainable development and production. The in-
centives are given by two parties: On one hand, legal regulations, increasingly
issued by authorities, force companies to publish more and more sustainability
indicators on an obligatory basis. Examples include greenhouse gas emissions
in production and gender issues. On the other hand, public opinion and customers
compel manufacturers to provide sustainability information (e.g., organic food)
as an important base for their purchase decisions.
Prevalent examples of standards and regulations are the ISO 14000 standard
for environmental factors in production, GRI² covering sustainability factors, or
regulations like REACH³ and RoHS⁴. Overall, sustainability information involves
a myriad of indicators, relating to social issues (e.g., employment conditions or
gender issues), to environmental issues (e.g., hazardous substances or greenhouse
gas (GHG) emissions), or managerial issues (e.g., compliance).
There already exist tools providing support for the management and transfer
of sustainability data: IMDS⁵ (International Material Data System), for instance,
is used in the automotive industry. IMDS allows for material declaration
by creating and sharing bills of materials (BOM) among different companies. A
similar system exists for the electronics industry (i.e., Environ BOMcheck⁶).
Despite some useful support regarding basic data declarations and exchange tasks,
these tools fail to provide dedicated support for sustainability data collection
and exchange along the supply chain.
2.2 Illustrating Example
To illustrate the complexity of sustainability data collection processes in a dis-
tributed supply chain, we provide an example. The latter exposes requirements
gathered from companies from the automotive and electronics industry based on
surveys and interviews. Note that data collection in such a complex environment
does not have the characteristics of a simple query. It is rather a varying, long-
running process incorporating various activities and techniques for gathering
distributed data, and involving different participants.
The example illustrated in Fig. 1 depicts the following scenario: Imposed by
regulations, an automotive manufacturer (requester) must provide sustainability
data relating to its production. This data is captured by two sustainability
indicators, one dealing with the greenhouse gas emissions regarding the production
of a certain product, the other addressing the REACH regulation. The latter
concerns the whole company as companies usually declare compliance with that
regulation as a whole.
² Global Reporting Initiative: https://www.globalreporting.org
³ Regulation (EC) No 1907/2006: Registration, Evaluation, Authorisation and
Restriction of Chemicals
⁴ Directive 2002/95/EC: Restriction of (the use of certain) Hazardous Substances
⁵ http://www.mdsystem.com
⁶ https://www.bomcheck.net
[Figure 1 (process diagram, not reproduced): two data collection processes,
Request 1 and Request 2, between a requester and Responders 1-3.
Activities shown: Check for available Data, Find / Select Right Contact,
Submit Data Request, Approve Data Request, Collect Requested Data, Convert Data,
Integrate Data, Sign Data, Provide Requested Data, External Assessment.
Process parameters shown: requester preferences (completeness, quality, validity);
responder approval processes, systems, platforms, and formats; available data
(completeness, quality, validity period); request data (validity date or due date,
reference, standard or verification, indicator).
Legend: Start Event, End Event, AND Gate, XOR Gate, Activity, Subprocess,
Data impacts Activity.]
Fig. 1: Examples of two Data Collection Processes
To provide data regarding these two indicators, the manufacturer must gather
related information from its suppliers (responders). Hence, it requests a REACH
compliance statement from one of its suppliers. To obtain the respective
information, the activities shown in the process Request 1 must be executed.
Furthermore, the product for which the greenhouse gas emissions shall be in-
dicated has a BOM with two items coming from external suppliers. Thus, the
request, depicted by the second process, has to be split up into two requests, one
for each supplier.
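The splitting of a request along the BOM can be sketched as follows. This is a minimal illustration under stated assumptions: the names BomItem and split_request, the dictionary layout of a sub-request, and the example part and supplier names are hypothetical and not taken from SustainHub.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BomItem:
    """One BOM position (hypothetical structure)."""
    part: str
    supplier: str


def split_request(indicator: str, bom: list[BomItem]) -> list[dict]:
    """Create one sub-request per external supplier appearing in the BOM."""
    parts_by_supplier: dict[str, list[str]] = {}
    for item in bom:
        parts_by_supplier.setdefault(item.supplier, []).append(item.part)
    # One request per supplier, in order of first appearance in the BOM.
    return [
        {"indicator": indicator, "responder": supplier, "parts": parts}
        for supplier, parts in parts_by_supplier.items()
    ]


bom = [BomItem("Part A", "Supplier 1"), BomItem("Part B", "Supplier 2")]
sub_requests = split_request("GHG Emissions", bom)
print(len(sub_requests))  # 2, i.e., one request for each supplier
```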
The basic scenario involves a set of activities as part of the data collection
processes. Some of these are common for both requests; e.g., on the requester
side, checking available data that might satisfy the request, selecting the com-
pany and contact person, and submitting the request. On the responder side,
data must be collected and provided. In turn, other process activities are specif-
ically selected for each case. Thereby, the selection of the activities is strongly
driven by data (process parameters) provided by the requester, the responder,
the requests and indicators, and data that may already be available.
For example, Request 1 implies a legally binding statement considering
REACH compliance. Therefore, a designated representative (e.g., the CEO) must
sign the data. In many cases, companies have special authorization procedures
for releasing such data, e.g., one or more responsible persons may have to ap-
prove the request (see the parallel approval activities (Approve Data Request)
in the context of Request 2, expressing a four-eyes principle). In some cases,
data might already be available in a company, i.e., it need not be manually
gathered (cf. Request 2, Check for available Data). However, whenever the
company-internal format of the responder does not match the requester’s format,
a conversion becomes necessary. Further, some indicators and requests directly
relate to a given standard (e.g., ISO 14064 for greenhouse gases). In turn, this
may directly trigger an assessment of the responder if it cannot demonstrate
fulfillment of the standard (cf. Request 2, External Assessment).
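The parameter-driven selection of activities described above can be sketched as follows. This is a minimal illustration, not the SustainHub implementation: the class RequestParameters, its fields, and the ordering of the activities are assumptions; only the activity names are taken from Fig. 1.

```python
from dataclasses import dataclass


@dataclass
class RequestParameters:
    """Hypothetical process parameters; field names are illustrative only."""
    legally_binding: bool     # e.g., a REACH compliance statement
    approvers_required: int   # responder's internal approval policy
    data_available: bool      # matching data already exists at the responder
    formats_match: bool       # requester and responder use the same format
    standard_fulfilled: bool  # responder can demonstrate the requested standard


def select_responder_activities(p: RequestParameters) -> list[str]:
    """Derive the responder-side activities from the process parameters."""
    activities = ["Approve Data Request"] * p.approvers_required
    if not p.standard_fulfilled:
        activities.append("External Assessment")
    if not p.data_available:
        activities.append("Collect Requested Data")
    if not p.formats_match:
        activities.append("Convert Data")
    if p.legally_binding:
        activities.append("Sign Data")
    activities.append("Provide Requested Data")
    return activities


# A legally binding statement without internal approval steps.
binding = RequestParameters(True, 0, False, True, True)
print(select_responder_activities(binding))
# ['Collect Requested Data', 'Sign Data', 'Provide Requested Data']
```

Changing a single parameter (e.g., setting approvers_required to 2 for a four-eyes principle) yields a different activity list, which is the kind of variability the example is meant to convey.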
Another important aspect of (long-running) data collection processes is that
process parameters might change over time and hence exceptional situations
might occur. Even in this very simple example, many variations and deviations
might occur: for example, if the CEO is not available, the activity Sign Data
could be delayed. In turn, this may become a problem if deadlines are defined
for answering the request.
3 Data Collection Challenges
This section presents seven challenges for an information system supporting
sustainability data collection processes along an entire supply chain (IS-DCP).
The results are based on findings from case studies conducted with industrial
partners in the SustainHub project. Three figures serve for illustration purposes:
Fig. 2 illustrates data collection challenges DCC 1 and DCC 2, Fig. 3 illustrates
DCC 3 and DCC 4, and Fig. 4 illustrates DCC 5-7.
[Figure 2 (diagram, not reproduced): requester, responders, and service providers
with their data storage, applications, and human actors.
Challenge 1: Selection. Challenge 2: Access.
Example 1: supplier with manual data collection and external assessment.
Example 2: supplier with automated data access.]
Fig. 2: Data Collection Challenges DCC 1 and DCC 2
3.1 DCC 1: Dynamic Selection of Involved Parties
Findings In a supply chain, sustainability data collection involves various par-
ties (cf. Fig. 2). A single request may depend on the timely delivery of data from
different companies. For manual tasks, this may have to be accomplished by a
specific person with sustainability knowledge or authority. In big companies, in
turn, it can be a challenging task to find the right contact person to answer a
specific request. In addition, contact persons may change over time. Furthermore,
as the requested data is often complex, has to be computed, or relates to legal
requirements, external service providers may be involved in the data collection
request as well. Relating to our scenario from Section 2.2, Fig. 2 includes two
concrete examples: a supplier that applies manual data collection and needs an
assessment by an external service provider, and a supplier providing automated
access to its data. Finally, regarding the timely answering of a request, many
requests may be adjusted and forwarded to further suppliers (cf. Fig. 2); thus,
answering times can multiply.
Problems The contemporary approach to such requests relies on individuals
conducting manual tasks and interacting individually. There are tools (e.g.,
email) which can provide support for some of these tasks and partly automate
them. However, much work is still coordinated manually. As a request can be
forwarded down the supply chain, it is difficult to predict who exactly will be
involved in its processing. Consequently, the answering times of
requests can hardly be estimated reliably either.
Requirements An IS-DCP needs to enable companies to centrally create and
manage data collection requests. Thereby, it must be possible to simplify the
dynamic selection process of involved parties and contact persons regarding the
request responders as well as potentially needed service providers. This is a
basic requirement for enabling efficient request answering, data management,
and request monitoring.
3.2 DCC 2: Access to Requested Data
Findings In a supply chain different parties follow different approaches to data
management. While large companies usually have implemented a higher level of
automation, SMEs typically rely on the work of individual persons. Furthermore,
sustainability reporting is still an emerging area and there exists no unified
reporting method along supply chains. In particular, this implies a high degree
of variability when it comes to accessing internal data of companies. Some of
them have advanced software solutions with respect to data management, some
manage their data in databases, some store it in specific files (e.g., Excel), and
others have not even started to manage sustainability data yet.
Problems The contemporary approach to sustainability reporting is managed
manually to a large extent. This involves manual requests from one party to
another and different data collection tasks on the responder side. This can impose
large delays in data collection processes as sustainability data must be manually
gathered from systems, databases or specific files before it can be compiled,
prepared, and authorized for delivery to the requester.
Requirements An IS-DCP must accelerate and facilitate the access to requested
sustainability data. On one hand, this requires guiding users in manually
collecting data as well as automating data-related activities (e.g., data
approval, data transformation) where possible. On the other hand, automatic
data collection should be enabled whenever possible. This requires accessing the
systems containing the data automatically (e.g., via the provision of appropriate
interfaces) and including manual approval activities when needed. Finally, data
conversion between different formats ought to be supported as a basis for data
aggregation.
3.3 DCC 3: Meta Data Management
Findings The management and configuration of sustainability data requests in
a supply chain relies on a myriad of different data sets. As mentioned above, this
data stems from heterogeneous sources. Examples of such parameters include
the preferences of the requester as well as the responders (including approval
processes and data formats), or the properties of the sustainability indicators
(e.g., relations to standards) (cf. Fig. 3). Relating to the scenario from Section 2.2,
concrete examples include the following: a mismatch between the data format
configurations of the requester and the responder, the need to comply with a
specific standard such as ISO 14064, or available data that matches the quality
requirements of the requester (also illustrated by Fig. 3). As a result, potentially
matching data might already be available in some cases but exhibit properties
different from those requested.
[Figure 3 (diagram, not reproduced): Request 1 / Query 1 with Request Variants 1
and 2 (Query Variants 1 and 2), driven by meta data: requester data
("Quality>75%", "Data Format A"), responder data ("Data Format B"), available
data ("Quality=80%"), and request data ("Standard=ISO14064").
Challenge 3: Meta Data. Challenge 4: Request Variants.]
Fig. 3: Data Collection Challenges DCC 3 and DCC 4
Problems As requests rely on heterogeneous data, they are difficult to manage.
Requirements are partially presumed by the requester and often are implicit.
Hence, responders might be unaware of the requirements and deliver data not
matching them. Moreover, it is difficult to determine whether data that has
been collected before matches a new request. Finally, as a supply chain
might involve a large number of requesters and responders, this problem multi-
plies as crucial request data is scattered along the entire supply chain.
Requirements To be able to consistently and effectively manage data collection
processes, an IS-DCP must centrally implement, manage and provide an under-
standable meta data schema addressing relevant request parameters. Thereby,
instantiated data based on the uniform meta data schema can be effectively used
to directly derive and adjust variants of data collection processes.
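To make the idea of a uniform meta data schema more tangible, the following sketch checks whether already available data can be reused for a request. It is only an illustration: the classes RequestMetaData and AvailableData, their fields, and the function check_match are assumptions; the example values loosely follow Fig. 3.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RequestMetaData:
    """Hypothetical request-side meta data."""
    indicator: str
    min_quality: float               # requester preference, e.g., 0.75
    data_format: str                 # format expected by the requester
    standard: Optional[str] = None   # e.g., "ISO 14064"


@dataclass(frozen=True)
class AvailableData:
    """Hypothetical meta data of data a responder already holds."""
    indicator: str
    quality: float
    data_format: str
    standard: Optional[str] = None


def check_match(req: RequestMetaData, data: AvailableData) -> dict:
    """Decide whether available data is reusable and needs conversion."""
    reusable = (
        data.indicator == req.indicator
        and data.quality >= req.min_quality
        and (req.standard is None or data.standard == req.standard)
    )
    return {
        "reusable": reusable,
        "needs_conversion": data.data_format != req.data_format,
    }


request = RequestMetaData("GHG Emissions", 0.75, "Data Format A", "ISO 14064")
available = AvailableData("GHG Emissions", 0.80, "Data Format B", "ISO 14064")
print(check_match(request, available))
# {'reusable': True, 'needs_conversion': True}
```

With explicit meta data on both sides, the mismatch of formats becomes a derivable fact (a Convert Data activity is needed) rather than an implicit expectation of the requester.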
3.4 DCC 4: Request Variants
Findings As mentioned, sustainability data exchange in a supply chain involves
a considerable number of manual and automated tasks aligned to the current
data request. Hence, execution differs greatly among different data requests,
being highly influenced by parameters and data that are distributed across many sources (cf.
DCC 3 and Fig. 3). Moreover, the reuse of provided data is problematic as well
as the reuse of knowledge about conducted data requests: persons in charge,
managing a data collection, might not be aware of which approach matches the
current parameter set best.
Problems This makes the whole data collection procedure tedious and error-prone.
According to the insights gained, a data collection process is initially
defined manually for each data request and evolves stepwise afterwards. Owing to
the various influencing parameters, every request must be treated individually:
there is no uniform approach applicable to a data request; instead, a high number
of variants of data collection processes exists. So far, there is no system or
approach in place that allows structuring or even governing such varying processes
along a supply chain.
Requirements An IS-DCP needs to be capable of explicitly defining
the process of data collection. Moreover, due to the great variability in this
domain, it must also be capable of managing numerous variants of each data request relating
to a given parameter set. This includes the effective and efficient modeling,
management, storage, and execution of data collection request processes.
3.5 DCC 5: Incompleteness and Quality
Findings Sustainability data requests are demanding and their complex data
collection processes evolve based on delivered data and forwarded requests to
other parties (i.e., suppliers of the suppliers) (cf. Fig. 4). Furthermore, they
are often tied to regulative requirements and laws, and also involve mandatory
deadlines. Therefore, situations might occur, in which not all needed data is
present, but the request answer must still be delivered due to a deadline. As
another case, needed data might be available, but on different quality levels
and/or in different formats.
Problems Contemporary sustainability data collection in supply chains is
plagued by quality problems relating to the delivered data. Not only are requests
answered incompletely, but the requester is also unaware of the
completeness and quality of the data stemming from multiple responders. Moreover,
responders have no approach to data delivery in place for cases in which they are
unable to provide the requested data entirely or their data does not match the
request’s quality requirements. Lacking a unified approach, definitive
statements about the quality of the data of a request often cannot be made, and
requests might even fail as a result.
Requirements An IS-DCP must be able to deal with incomplete data and
quality problems. It must be possible that a request can be answered despite
missing or low quality data. Furthermore, such a system must be able to make
assumptions about the quality of the data that answers a request.
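One conceivable way to answer a request despite missing data is to deliver a partial aggregate together with explicit completeness and quality figures. The sketch below is an assumption-laden illustration: the Response structure, the 0..1 quality scale, and the pessimistic choice of reporting the lowest quality among the delivered parts are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """Hypothetical answer of one responder."""
    responder: str
    value: Optional[float]  # None if nothing was delivered
    quality: float          # assumed scale from 0.0 to 1.0


def summarize(responses: list[Response]) -> dict:
    """Aggregate delivered values and state completeness and quality."""
    delivered = [r for r in responses if r.value is not None]
    completeness = len(delivered) / len(responses) if responses else 0.0
    return {
        "total": sum(r.value for r in delivered),
        "completeness": completeness,
        # Pessimistic assumption: the weakest part determines the quality.
        "quality": min((r.quality for r in delivered), default=0.0),
        "missing": [r.responder for r in responses if r.value is None],
    }


answers = [
    Response("Responder 1", 12.5, 0.8),
    Response("Responder 2", None, 0.0),
]
print(summarize(answers))
```

The requester then receives an answer by the deadline, but can see that only half of the responders delivered and which ones are missing.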
[Figure 4 (diagram, not reproduced): a requester with Requests 1-4, Sub-Requests
1-1 and 1-2, feedback flows, and Deviations 1-3.
Challenge 5: Quality. Challenge 6: Monitoring. Challenge 7: Variability.]
Fig. 4: Data Collection Challenges DCC 5-7
3.6 DCC 6: Monitoring
Findings Sustainability data collection along the supply chain involves many
parties and may consequently take a long time. The requests exist in many variants
and the quality and completeness of the provided data differ greatly (cf. DCC 5).
The contemporary approach to such requests does not provide any information
about the state of the request to requesters before the latter is answered (cf. Fig.
4). This includes missing statements about delivered data as well as the possibly
existing recursive requests along the supply chain. Thus, it can be a serious issue
for the OEM that issued the initial request to become aware of possible
delays and to determine where in the supply chain they occur.
Problems As a requester has no information about the state of its request
and potential data delivery problems, the latter only become apparent when
deadlines are approaching. At that time, however, it might be too late to apply
countermeasures to avoid low quality, incomplete data, or responders delivering
no data at all.
Requirements An IS-DCP must be capable of monitoring complex requests
spanning multiple responders as well as various manual and automatic activities.
Furthermore, a requester should be able to be actively or passively informed
about the state of the activities along the data collection process as well as the
state of the data delivered.
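Monitoring recursive requests essentially means traversing a tree of requests and sub-requests. The following sketch locates delays within such a tree; the Request structure, the State values, and the function find_delays are hypothetical and serve illustration only.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    OPEN = "open"
    ANSWERED = "answered"
    DELAYED = "delayed"


@dataclass
class Request:
    """Hypothetical request node; sub-requests model forwarded requests."""
    name: str
    state: State = State.OPEN
    sub_requests: list["Request"] = field(default_factory=list)


def find_delays(request: Request, path: tuple = ()) -> list[str]:
    """Return the location of every delayed request in the request tree."""
    here = path + (request.name,)
    delays = [" > ".join(here)] if request.state is State.DELAYED else []
    for sub in request.sub_requests:
        delays.extend(find_delays(sub, here))
    return delays


root = Request("Request 1", State.OPEN, [
    Request("Sub-Request 1-1", State.DELAYED),
    Request("Sub-Request 1-2", State.ANSWERED),
])
print(find_delays(root))  # ['Request 1 > Sub-Request 1-1']
```

In a distributed setting, the state of each node would have to be reported by the respective responder; here it is simply stored in memory.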
3.7 DCC 7: Run Time Variability
Findings The processing of a data collection request might take a long time
to answer if the request involves a great number of parties. Further, it exposes
manual and automatic activities, different kinds of data and data formats, and
unforeseen impacts on the data collection process. This implies that parame-
ters, on which the data collection relies, may change during execution of a data
collection process. Exceptional situations requiring dedicated handling occur as a
result of expiring deadlines or responders not delivering data.
Problems The variability relating to sustainability data collection processes
constitutes a great challenge for companies. Running requests might become
invalidated due to the aforementioned issues. However, there is no common
understanding or standard approach to this. Instead, requesters and responders
must manually find solutions to still get requests answered in time. This entails
much additional effort and delays. Another issue is external assessments: they
could not only be delayed but also fail completely, leaving the responder without
a required certification. The final problem mostly concerns
long-running data collection processes: data that was available at the beginning
of the query could become invalid during the long-term process (e.g., if it has a
defined validity period).
Requirements An IS-DCP must cope with run-time variability occurring in
today’s sophisticated sustainability data collection processes. As soon as issues
are detected, data collection processes must be adapted to the changing
situation in a timely manner in order to keep the impact of these issues as small
as possible. This requires a system that is able to dynamically adapt already running data
collection processes without invalidating or breaking the existing process flow.
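As a simple illustration of such a run-time adaptation, the sketch below re-inserts a collection activity when previously available data has passed its validity period. The DataItem structure and the functions expired_items and adapt are assumptions; a real process engine would additionally have to preserve the state of the running process instance.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataItem:
    """Hypothetical data item with a validity period."""
    name: str
    valid_until: date


def expired_items(items: list[DataItem], today: date) -> list[str]:
    """Names of all items whose validity period has ended."""
    return [item.name for item in items if item.valid_until < today]


def adapt(pending: list[str], items: list[DataItem], today: date) -> list[str]:
    """Return a new activity list that re-collects expired data first.

    The list of pending activities is left untouched, so already planned
    steps remain part of the process.
    """
    recollect = [
        f"Collect Requested Data ({name})"
        for name in expired_items(items, today)
    ]
    return recollect + list(pending)


items = [DataItem("GHG Emissions", date(2014, 1, 1))]
print(adapt(["Integrate Data"], items, date(2014, 6, 1)))
# ['Collect Requested Data (GHG Emissions)', 'Integrate Data']
```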
4 State of the Art
This section gives insights on the state of the art in scientific approaches relating
to the issues shown in this paper. It starts with a broader overview and proceeds
with more closely related work, organized in three subsections.
Section 3 underlines that exchanging data between different companies along
a supply chain in an efficient and effective way has always been a challenge.
Nonetheless, this exchange is not only necessary; it has become a crucial success
factor and a competitive advantage. However, many influencing factors
hamper the realization of an automated and homogeneous data exchange.
In particular for those companies aiming to address holistic sustainability
management, the inability to implement automated and consistent data exchange is
a big obstacle. Recall that these companies need to take into account
existing and even emerging laws and regulations requiring them to gather and
distribute information about the goods they produce. Furthermore, the requested
information needs to be gathered from their suppliers as well. Hence, complex data
collection processes, involving a multitude of different companies and systems,
have to be designed, conducted, and monitored to ensure compliance. So far, we
could not locate any related work that completely addresses the aforementioned
challenges (cf. Section 3).
For complex data collection processes, IS support in the supply chain is de-
sirable supporting communication and enabling automated data collection. The
importance and impact of an IS for supply chain communication have already
been highlighted in the literature several times. In [2], for instance, a literature
review is conducted showing a tremendous influence of ISs on achieving effective
supply chain management (SCM). The authors also propose a theoretical framework for implementing ISs
in the supply chain. Therefore, they identify the following core areas: strate-
gic planning, virtual enterprise, e-commerce, infrastructure, knowledge manage-
ment, and implementation. However, their findings also include that great flex-
ibility in the IS and the companies is necessary and that IS-enabled SCM often
requires major changes in the way companies deal with SCM. As another exam-
ple, [3] presents an empirical study to evaluate alternative technical approaches
to support collaboration in SCM. These alternatives are a centralized web plat-
form, classical electronic data interchange (EDI) approaches, and a decentralized,
web service based solution. The author assesses the suitability of the different
approaches with regard to the complexity of the processes and the exchanged
information. In conclusion, related work in this area reveals various approaches to
SCM. However, these are mostly theoretical, rather general, and not
applicable to the specific use cases of sustainability data collection processes.
As automation can be a way to deal with various issues of sustainability data
collection, respective approaches addressing that topic can be found in literature
as well. However, none of them applies to the domain of sustainable supply
chain communication and its specific requirements. For example, [4] presents
an approach to semi-automatic data collection, analysis, and model generation
for performance analysis of computer networks. This approach incorporates a
graphical user interface and a data pipeline for transforming network data into
organized hash tables and spreadsheets for use in simulation tools. As a
specific type of data transformation is considered, it is not suitable in our context.
Such approaches deal with automated data collection; yet they are not related
to sustainability or SCM and the problems arising in this setting.
There exist several approaches dealing with sustainability reporting (e.g.,
[5], [6], [7], and [8]). However, they do not propose technical solutions for
automated data collection. Rather, they approach the topic theoretically by analysing
several related aspects. These include the importance of corporate sustainability
reporting, sustainability indicators or the process of sustainability reporting as a
whole. Another goal is building a sustainability model by analysing case studies.
Besides approaches targeting generic sustainability, SCM and data collec-
tion issues, there exist three areas that are more closely related to our problem
context. As discussed, sustainability data collection processes involve numerous
tasks to be orchestrated. Data requests may exist in many different variants
based on a myriad of different data sources and may be subjected to dynamic
changes during run-time (cf. DCC 7). Therefore, the following subsections discuss
approaches for process configuration (Section 4.1), data- and user-driven processes
(Section 4.2), and dynamic processes (Section 4.3).
4.1 Process Configuration
Behaviour-based configuration approaches enable the process modeller to pro-
vide pre-specified adaptations to process behaviour. One option for realizing this
is hiding and blocking as described by [9]: blocking allows disabling the occurrence
of a single activity/event, whereas hiding allows a single activity to
be hidden, which is then executed silently; succeeding activities in that path are
still accessible.
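The difference between the two operations can be illustrated on a strongly simplified process representation. In the sketch below, a process is merely an enumeration of its alternative execution paths; this representation and the function configure are assumptions made for illustration and do not reproduce the formalism of [9].

```python
def configure(paths, hidden=frozenset(), blocked=frozenset()):
    """Apply hiding and blocking to a list of alternative execution paths.

    Blocking an activity disables every path that contains it.
    Hiding an activity skips it, while the rest of the path stays reachable.
    """
    configured = []
    for path in paths:
        if any(activity in blocked for activity in path):
            continue  # a blocked activity makes the whole path unavailable
        configured.append([a for a in path if a not in hidden])
    return configured


paths = [
    ["Check for available Data", "Collect Requested Data",
     "Provide Requested Data"],
    ["Check for available Data", "External Assessment",
     "Collect Requested Data", "Provide Requested Data"],
]
print(configure(paths,
                hidden={"Check for available Data"},
                blocked={"External Assessment"}))
# [['Collect Requested Data', 'Provide Requested Data']]
```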
Another way to enable process model configuration for different situations
is to incorporate configurable elements into the process models as described in
[10], [11]. An example of this approach is a configurable activity, which may be
integrated, omitted, or optionally integrated surrounded by XOR gateways. An-
other approach enabling process model configuration is ADOM [12] that builds
on software engineering principles and allows for the specification of guidelines
and constraints with the process model. A different approach to process config-
uration is taken by structural configuration, which is based on the observation
that process variants are often created by simply copying a process model and
then applying situational adaptations to it. A sophisticated approach dealing
with such cases is Provop [13], which realizes a configurable process model by
maintaining a base process model and pre-specified adaptations to it. The latter
can be related to context variables to enable the application of changes matching
different situations. Finally, [14], [15] provide a comprehensive overview
of existing approaches targeting process variability.
Process configuration techniques provide a promising approach in our con-
text. Nevertheless, they do not fully match the requirements for flexible data
collection processes in a dynamic and heterogeneous environment, as many dif-
ferent data sources must be considered and requests may be subjected to change
even during their processing.
4.2 Data- and User-driven Processes
As opposed to traditional process management approaches focusing on the
sequencing of activities, the case handling paradigm [16] is centered on the
‘case’. Similarly, product-based processes focus on the interconnection between
product specification and processes [17]. The Business Artifacts approach [18] is
a data-driven methodology that is centered on business artifacts rather
than activities. These artifacts hold the information about the current situation
and thus determine how the process shall be executed. In particular, all executed
activities are tied to the life-cycle of the business artifacts. Another data-driven
process approach is provided by CorePro [19], which enables process coordina-
tion based on objects and their relations. In particular, it provides a means for
generating large process structures out of the object life cycles of connected ob-
jects and their interactions. The creation of concepts, methods, and tools for
object- and process-aware applications is the goal of the PHILharmonicFlows
framework [20]. The framework allows for the flexible integration of business
data and business processes overcoming many of the limitations known from
activity-centered approaches.
The approaches shown in this sub-section facilitate processes that are more
user- or data-centric and aware. The creation of processes from certain objects
could be interesting for SustainHub as well. However, in dynamic supply chains,
processes rather rely on their context than on objects and are continuously
influenced by its changes while executing.
4.3 Dynamic Processes
In the literature, there exist two main options for enabling flexibility in
automatically supported processes: imperative processes that are adaptive, and
constraint-based declarative processes that are less rigid by design.
Adaptive PAIS have been developed that incorporate the ability to change a
running process instance to conform to a changing situation. Examples of such
systems are ADEPT2 [21], Breeze [22], and WASA2 [23]. These mainly allow
for manual adaptation carried out by a user. In case an exceptional situation
leading to an adaptation occurs more than once, knowledge about the previous
changes should be exploited to improve the effectiveness and efficiency of the
current change [24], [25].
In case humans shall apply the adaptations, approaches like ProCycle [26]
and CAKE2 [27] aim at supporting them with respective knowledge. In our
context, these approaches are not suitable since the creation as well as the
adaptation of process instances must incorporate various information from other
sources. Furthermore, adaptations must be applied before humans are involved, or
they must incorporate knowledge the issuer of a process does not possess.
Automated creation and adaptation of the data collection processes is thus
preferable. In this area, only a small number of approaches exist, e.g.,
AgentWork [28] and SmartPM [29]. However, these are limited to rule-based
detection of exceptions and application of countermeasures.
As aforementioned, another way to introduce flexibility to processes is by
specifying them in a declarative way, which does not prescribe a rigid activity
sequencing [30]. Instead, a number of declarative constraints may be used to
specify certain facts the process execution must conform to, e.g., mutual
exclusion of activities. Based on this, all activities specified can be executed
at any time as long as no constraint is violated. Examples are DECLARE [31]
and ALASKA [32]. However, declarative approaches have specific shortcomings
concerning understandability [30]. Furthermore, and even more important in our
context, if no clear activity sequencing is specified, requirements relating to
monitoring are difficult to satisfy, and monitoring is a crucial requirement for
the industry in our case.
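To make the declarative idea concrete, the following minimal sketch checks whether an activity may be executed next without violating a mutual exclusion constraint. It does not reproduce the constraint language of DECLARE [31]; the function names and the activity names are hypothetical.

```python
def mutually_exclusive(a, b):
    """Constraint: activities a and b must not both occur in one trace."""
    return lambda trace: not (a in trace and b in trace)


def allowed(trace, next_activity, constraints):
    """An activity may be executed if no constraint is violated afterwards."""
    candidate = trace + [next_activity]
    return all(check(candidate) for check in constraints)


# Hypothetical constraint: data is either entered manually or imported.
constraints = [mutually_exclusive("manual entry", "automatic import")]
trace = ["receive request", "manual entry"]

can_validate = allowed(trace, "validate", constraints)
can_import = allowed(trace, "automatic import", constraints)
```

No order of activities is prescribed here, which also illustrates the monitoring problem: without a defined sequence, it is unclear which activity should come next or how far the process has progressed.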
5 Data Collection with Adaptive Processes
As shown in Section 4, none of the approaches present in related work suc-
ceeds in satisfying the complex requirements of a domain like sustainable sup-
ply chain communication. Even if they provide facilities for complex processes
and dynamic behaviour, they mostly fall short regarding human integration and
automation. On account of this, in the SustainHub project, we have started
developing a process-aware data collection approach that shall satisfy the re-
quirements elicited (see Section 3). In this section, we want to give a rough
overview of this approach and what it shall be capable of without going too
much into detail.
Based on the comprehensive set of challenges, our approach is introduced
in four steps: first, we present the basis for handling data exchange in com-
plex environments. Second, we introduce facilities for automatic configuration
and variant management for data requests (cf. Section 5.1). Third, we present
concepts for automated runtime variability (cf. Section 5.2) and, fourth, support
for data quality and monitoring (cf. Section 5.3).
To build an information system capable of automatically supporting data
collection along complex supply chains, the basic requirements we elicited in
DCC1 and DCC2 must be covered first. In particular, SustainHub must provide
central data request management, assistance in terms of selection and integration
of the involved parties, and management of access to the latter. To enable this,
our approach is based on two things: a comprehensive data model and explicit
specifications of data exchange processes.
Fig. 5: Process-based Data Collection (a Process Template for a request type,
with activities assigned to roles and interfaces, is instantiated as a Process
Instance for a specific request, with activities assigned to persons and systems)
In our approach, data collection processes are modeled in a Process-Aware
Information System (PAIS) integrated into the SustainHub platform providing
the domain-related data model. This integration yields a number of advantages:
it allows for explicitly specifying the data collection process for one request
type through a process template (cf. Figure 5). Such a request type can be, for
example, a sustainability indicator, for which data shall be collected. The process
template then governs the activities to be executed at a particular point of time;
the activities themselves allow for specifying what exactly is to be done at a
particular step of data collection. Further, activities in a process template may be
manually executed by a certain role or may implement an interface to a specific
system involved in the data exchange. For a concrete data request relating to
a pre-defined request type, a process instance is created to coordinate the data
collection process. Via the implemented automatic activities, the PAIS is able
to connect to external systems and perform the corresponding steps of the
data request. Taking the specified roles in the involved company into account,
the PAIS can also automatically distribute manual activities to the right persons
in charge.
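The instantiation step can be sketched as follows. This is a simplified illustration rather than the SustainHub implementation: the `Activity` class, the `instantiate` function, and the assignment of roles and interfaces to the activities A, B, and C are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    role: str = ""       # set for manual activities
    interface: str = ""  # set for automatic activities


def instantiate(template, persons_by_role, systems_by_interface):
    """Bind every template activity to a concrete person or system."""
    instance = []
    for activity in template:
        if activity.interface:
            performer = systems_by_interface[activity.interface]
        else:
            performer = persons_by_role[activity.role]
        instance.append((activity.name, performer))
    return instance


# Hypothetical template for one request type.
template = [
    Activity("A", role="Role X"),
    Activity("B", interface="Interface B"),
    Activity("C", interface="Interface C"),
]
instance = instantiate(
    template,
    persons_by_role={"Role X": "Person X"},
    systems_by_interface={"Interface B": "System B", "Interface C": "System C"},
)
```

The template stays free of company-specific details; only the two lookup tables differ between the companies answering a request.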
Fig. 6: SustainHub Data Model (sections: Master Data, Runtime Data, Customer
Data, Exchange Data, Context Data, and Process Data)
In order to enable an information system to systematically support dynamic
data collection processes, it must have access to various kinds of data relating
to context, customers, or the collected data. As aforementioned, we integrate a
data model uniting different kinds of information that is necessary for managing
the data collection. As depicted by Figure 6, the data model is separated into
six sections: first, it comprises customer data like the organizational model of in-
volved companies, descriptions of their products, BOMs, or systems they employ
for sustainability data management (if present). Second, the data model man-
ages a set of master data accessible by all companies connected to the system.
This includes, for example, standardized definitions for sustainability indicators
or substances widely used by companies in these domains. Third, the data ex-
change is explicitly managed and stored in the data model by comprising data
sets for the data requests, data responses, and, in a separate section, the data
collected. Finally, as basis for the advanced features discussed in the following,
the data model integrates various data sets covering the data collection processes
executed as well as mapping of various contextual influences that may impact
the data collection processes during run time.
5.1 Configuration of Data Collection Processes
This section discusses how our approach addresses the challenges DCC3 and
DCC4. In particular, it deals with the automated management of data request
variants and the meta data leading to the execution of the different variants.
Basically, the approach facilitates the automated configuration of pre-defined
process templates to match the properties of the given situation. This is enabled
by integrating meta data regarding the processes as well as the context of the
situation in our data model (cf. Section 5). The concrete procedure applied to
automated process configuration is shown in Fig. 7.
Fig. 7: Configuration of Data Collection Processes (elements: Contextual
Influences from products, customers, customer relationships, SustainHub users,
and external systems; Context Mapping; Data Model; Process Configuration;
Process Templates; Process Fragments; Configured Process Instance)
To incorporate contextual factors influencing the course of the data collection
(e.g., if a company executes manual or automated data collection or if, due to a
specific regulation, external data validation is necessary), we explicitly model the
contextual factors. The latter are processed in a Context Mapping component
and stored in the data model. In turn, they are utilized in a Process Configura-
tion component to determine which process instance may be configured for the
current context. In detail, the configuration of data collection processes works
as follows: users can specify Process Templates that contain the activities
indispensable for a particular data request type. The modeled activities are
extended on account of the context factors by Process Fragments that may be
specified by users as well. In particular, SustainHub selects a set of fragments
matching the context of the current situation and automatically integrates them
into the process template as illustrated by Fig. 7. After that, a configured
process instance
is started for the particular data request. In the following, we will exemplarily
discuss the context mapping.
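The configuration procedure described above can be sketched in a few lines. How fragments are integrated into a template is not detailed here, so the sketch assumes that a fragment is a sequence of activities inserted after a named anchor activity and selected by one Process Parameter; all names are hypothetical.

```python
def configure_instance(template, fragments, parameters):
    """Insert each fragment whose parameter is active after its anchor activity."""
    instance = []
    for activity in template:
        instance.append(activity)
        for fragment in fragments:
            if fragment["after"] == activity and fragment["parameter"] in parameters:
                instance.extend(fragment["activities"])
    return instance


# Hypothetical template and fragments for one data request type.
template = ["receive request", "collect data", "send response"]
fragments = [
    {"parameter": "external_validation", "after": "collect data",
     "activities": ["validate data externally"]},
    {"parameter": "manual_collection", "after": "receive request",
     "activities": ["assign responsible person"]},
]
instance = configure_instance(template, fragments, {"external_validation"})
```

Only the fragment for external validation is integrated, since it is the only one whose parameter is active in this context.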
As shown in Fig. 8, we distinguish between Context Factors and Process
Parameters. The former capture facts that exist in the environment of SustainHub.
As an example, consider the fact that a company may lack a certain certification
necessary to respond to a data request concerning a certain legal regulation.
This fact, in turn, may require including additional activities for acquiring the
certification. Process Parameters, in turn, capture internal information directly
relating to the selection of certain Process Fragments. As the latter do not
necessarily correlate with defined Context Factors, we apply a set of configurable
Context Rules to map Context Factors to Process Parameters. Fig. 8 shows a
rather simple case. However, complicated cases, where multiple Context Factors
relate to one Process Parameter, are common in practice. For example, a company
may request a specific four-eyes approval procedure in response to different
Context Factors: if a certain monetary amount is reached, if the company
does not trust the customer, or if the data relates to a certain legal regulation.
For a more in-depth discussion of this topic, see [33].
Fig. 8: Context Mapping for Configuration of Data Collection Processes (Context
Factors CF1 to CF3 mapped to Process Parameters P1 to P3)
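The four-eyes example can be written as a Context Rule that maps several Context Factors to one Process Parameter. The sketch below is illustrative only: the rule representation, the factor names, and the monetary threshold are assumptions, not part of the approach described in [33].

```python
def four_eyes_rule(factors):
    """Hypothetical Context Rule: any one of three factors triggers approval."""
    return (
        factors.get("amount", 0) >= 10_000          # threshold is illustrative
        or not factors.get("customer_trusted", True)
        or factors.get("legal_regulation", False)
    )


def map_context(factors, rules):
    """Return the set of Process Parameters whose rule holds for the factors."""
    return {parameter for parameter, rule in rules.items() if rule(factors)}


rules = {"four_eyes_approval": four_eyes_rule}
parameters = map_context({"amount": 500, "customer_trusted": False}, rules)
```

Here the parameter is set because the customer is not trusted, even though the amount is below the threshold.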
5.2 Adaptation of Data Collection Processes
This section discusses our approach for coping with challenge DCC7. In particu-
lar, it addresses issues regarding runtime variability. In various situations, it may
be required that a data collection process instance has to be changed although
the instance is already running. As discussed in DCC7, this could be necessary
because of changes to the context or exceptions arising during execution. The
first reason constitutes a runtime change to the set of expected situations de-
picted by the Context Factors. For example, a certification gets invalidated for
one company due to a change in a regulation. The second constitutes an error
in the execution of the data collection. An example could be that an activity is
delayed and exceeds a specific deadline.
Our adaptation approach distinguishes these two cases as depicted by Fig. 9.
We handle them differently: for erroneous situations, a Compensation
Action is applied to solve the problem that occurred or to give users an
opportunity to solve the problem on their own. For context changes, a Context
Change Action is proposed that can influence the set of applied Process Fragments.
In Fig. 9, the different actions SustainHub can perform on account of various
dynamic events are illustrated. These are the following:
1) Various influencing factors dynamically affect SustainHub. Relevant factors
are mapped to an internal event.
2) The type of event determines the way SustainHub addresses the changed
situation: a context change induces the change of a Context Parameter whereas
an exceptional situation leads to a Compensation Action.
3) If a Compensation Action is issued, various actions may be governed by it,
e.g., resetting a failed activity.
4) If a Context Parameter changes, the set of integrated Process Fragments will
most likely not match the current situation anymore. Therefore, SustainHub
estimates whether Process Fragments have to be added, deleted, or replaced.
5) An issued Context Change Action will verify whether an action (e.g., canceling
a Process Fragment) is still possible. If not, a corresponding Compensation
Action will be created.
6) A Compensation Action can be used, e.g., to inform the issuer of the data
collection process about a failure when adapting to a changing situation.
Fig. 9: Adaptation Concept of Data Collection Processes
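The six steps can be sketched as a single dispatch function. This is a simplified illustration under stated assumptions: events are plain dictionaries, a context change directly carries the set of fragments required afterwards, and a fragment that has already finished can no longer be canceled. None of these representations are prescribed by the approach.

```python
def handle_event(event, active, finished):
    """React to one internal event; return (new fragment set, actions)."""
    if event["type"] == "exception":
        # Step 3: exceptional situations lead directly to a Compensation Action.
        return set(active), [("compensate", event["detail"])]

    # Step 4: a context change yields a new set of required fragments.
    required = event["required_fragments"]
    fragments, actions = set(active), []
    for fragment in sorted(required - active):
        fragments.add(fragment)
        actions.append(("add", fragment))
    for fragment in sorted(active - required):
        if fragment in finished:
            # Steps 5 and 6: the change is no longer possible, so compensate.
            actions.append(("compensate", f"cannot cancel {fragment}"))
        else:
            fragments.remove(fragment)
            actions.append(("cancel", fragment))
    return fragments, actions


# Hypothetical fragment names.
active = {"external validation", "certification check"}
fragments, actions = handle_event(
    {"type": "context_change", "required_fragments": {"four-eyes approval"}},
    active,
    finished={"certification check"},
)
```

In this run, one fragment is added, one is canceled, and the fragment that has already finished produces a Compensation Action instead of a cancellation.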
In order to react to various events and to apply the relating Compensation or
Context Change Actions, SustainHub defines a simple event model as illustrated
by Fig. 10. An event is composed of three parts: (1) a trigger rule
that determines when the event will be fired; (2) the data of the event; (3)
an outcome rule governing what action is to be performed due to the event.
These three parts are needed for the following reasons: customizable trigger
rules enable users to configure which events are important for the data collection
process. Further, Fig. 10 shows two examples distinguishing active and passive
trigger rules: an event that contains an active trigger rule is fired due to the
change of a certain data set. In contrast, an event that comprises a passive
trigger rule is fired by periodic checks, which determine, e.g., whether a
deadline is exceeded.
Events can be related to any data or activity in SustainHub. However, not
every event necessitates a following action in every situation. Therefore, outcome
rules are applied to let users specify, under which circumstances such an action
becomes necessary. For example, the introduction of a new regulation may be of
utmost importance for data collection processes concerning one specific indicator,
but have no impact on another one. Finally, the data component stores the
information of the event. If an action is carried out based on an event that
necessitates human intervention, this information can be delivered to the human.
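A minimal sketch of the three-part event model follows. The `Event` class, its field names, and the dictionary-based state are assumptions made for illustration; the two example events mirror the passive deadline check and the active regulation change discussed above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass
class Event:
    trigger: Callable[[dict], bool]            # when the event fires
    outcome: Callable[[dict], Optional[str]]   # which action follows, if any
    data: dict                                 # information carried by the event

    def evaluate(self, state):
        """Return the resulting action name, or None if nothing is to be done."""
        if not self.trigger(state):
            return None
        return self.outcome(state)


# Passive trigger: evaluated by a periodic check against the current time.
deadline_event = Event(
    trigger=lambda s: s["deadline"] < s["now"],
    outcome=lambda s: "compensation",
    data={"activity": "collect data"},
)

# Active trigger: fired because a data set changed; the outcome rule only
# requests an action for the indicator the regulation is relevant to.
regulation_event = Event(
    trigger=lambda s: s.get("changed") == "regulation X",
    outcome=lambda s: "context change" if s["indicator"] == "Y" else None,
    data={"regulation": "X"},
)

# Arbitrary example dates.
state = {"deadline": datetime(2014, 1, 1), "now": datetime(2014, 2, 1)}
deadline_action = deadline_event.evaluate(state)
```

Separating the outcome rule from the trigger is what allows the same regulation change to cause an action for one indicator and none for another.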
Fig. 10: Event Model for Adaptation of Data Collection Processes (an Event
contains Trigger Rules, Data, and Outcome Rules; trigger rules are active, e.g.,
"Data Y changed", or passive, e.g., a periodic check of "Activity deadline < now";
outcomes are a Context Change or a Compensation Action)
5.3 Monitoring and Data Quality of Data Collection Processes
This section discusses how our approach addresses the challenges DCC5 and
DCC6. In particular, issues relating to incompleteness and quality of data as
well as the monitoring of the data collection processes are taken into account. In
a complex supply chain, one data request may have dozens of responders. Thus,
the answering time of the request is hardly predictable and some responders
might reply with incomplete or low-quality data. Our approach, therefore, aims
at providing the requester with fine-grained status information about the request
and at enabling SustainHub to handle incomplete data.
As the data collection process is executed in an integrated PAIS, a requester
is able to view the request status for basic monitoring. However, this does
not suffice for two reasons: first, a request may have an arbitrary number of sub-
processes making it cumbersome to check them all. Second, the status of the
request might not only depend on activities, but on the transferred data as well.
Furthermore, not every activity and data set might have the same importance
with respect to the status of a request. Therefore, as first part of our monitoring
approach, we introduce a fine-grained, but still comprehensive status object as
illustrated in Fig. 11.
Accordingly, a request status is calculated from different activities and data
sets involved in the data collection process. These two types of entities can also be
annotated with a weight factor to indicate their importance. An example for such
a calculation is shown in Fig. 11. In this example, four activities and three data
items with varying importance are involved. A particular activity, which gathers
data from an IHS, might be very important for the data collection (having a
weight = 2) while another one has no importance (a simple administrative task
with weight = 0). The values of the weight factor are summed up and combined
to indicate the percentage of completeness of the relating request.
Fig. 11: Status Monitoring for Adaptive Data Collection (a Request Status of 31%
derived from weighted process activities, 2/4 finished, and weighted data items,
1/8 written)
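The weighted status calculation can be sketched as follows. The text does not state how the summed weights are combined, so the sketch assumes the unweighted mean of the process ratio and the data ratio, which is consistent with the values 2/4, 1/8, and 31% appearing in Fig. 11. The weights of the data items are hypothetical and chosen only so that they sum to 8.

```python
def weighted_ratio(entries):
    """entries: (weight, done) pairs; returns the completed share of the weight."""
    total = sum(weight for weight, _ in entries)
    done = sum(weight for weight, finished in entries if finished)
    return done / total if total else 1.0


def request_status(activities, data_items):
    """Mean of the process and data completion ratios, as a percentage."""
    return 100 * (weighted_ratio(activities) + weighted_ratio(data_items)) / 2


activities = [(2, True), (0, False), (1, False), (1, False)]   # 2/4 finished
data_items = [(1, True), (3, False), (4, False)]               # 1/8 written
status = request_status(activities, data_items)                # 31.25
```

An activity with weight 0 never changes the result, which matches the intent that purely administrative tasks should not influence the reported status.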
This extended status is a first improvement for monitoring data collection
processes. However, it does not address issues related to incomplete and
low-quality data. In order to measure such problems and incorporate such meta
information into the monitoring process, we apply the following concepts:
Process and Data Metric. To explicitly specify what is supposed to be
measured, we propose a Process and Data Metric. The latter may be used for
evaluating various facts related to a data collection request. It can be used
for various entities and properties, e.g., the status of a process or a Sustain-
Hub customer. Furthermore, it may incorporate a mathematical function like
a sum or an average. Two examples of metrics are as follows:
Metric X: Average rating of responders who have not yet executed an activity
X.
Metric Y: Average precision deviation of responses of a request.
Dynamic Recalculation. The data collection process and the data it relates
to are subject to changes. Therefore, metrics applied to one of these may
have to be recalculated frequently. To automate this, we propose a dynamic
recalculation defining what has to be done with a particular metric if a change
to the data collection process is conducted. It allows for specifying the targeted
metric, the trigger for action, and a description. Examples of such actions
include full recalculation or discarding the metric.
Monitoring Annotation. As aforementioned, responders might reply
incompletely or not at all. In practice, companies often finish a data collection
process without receiving responses from all suppliers as some of them are not
even capable of answering properly. Thus, the requester waits until a number of
important suppliers have replied and finishes the request based on the
available data. To support such advanced data collection behavior, we propose
a Monitoring Annotation. The latter can be added to a request in order to
automatically trigger various actions related to reporting and monitoring. It
allows specifying a target entity, a trigger event, and a set of facts (Context
Factors or metrics) that will be evaluated when the trigger event is fired to
determine whether the rule will be executed. For the latter, various actions
can be defined, ranging from recalculating the metric to canceling the entire
data collection request. In the following, we will give two concrete examples
of such rules:
Annotation A1: Target: Data Collection Process, Trigger Event: Status > 60%,
Facts: none, Action: Calculate preliminary Results
Annotation A2: Target: Data Collection Process, Trigger Event: Status > 80%,
Facts: Metric X > 80%, Action: Cancel Request Processing
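The two annotations can be evaluated with a small sketch. The dictionary layout and the function `fired_actions` are assumptions made for illustration; in particular, the sketch re-evaluates the status threshold on every call instead of reacting to a single trigger event.

```python
def fired_actions(annotations, status, metrics):
    """Return the actions of all annotations whose trigger and facts hold."""
    actions = []
    for annotation in annotations:
        if status <= annotation["status_above"]:
            continue  # trigger event has not occurred
        facts = annotation["facts"]
        if all(metrics.get(name, 0) > limit for name, limit in facts.items()):
            actions.append(annotation["action"])
    return actions


annotations = [
    {"name": "A1", "status_above": 60, "facts": {},
     "action": "calculate preliminary results"},
    {"name": "A2", "status_above": 80, "facts": {"metric_x": 80},
     "action": "cancel request processing"},
]

# Status in percent; metric_x stands for Metric X.
at_70 = fired_actions(annotations, 70, {})
at_85 = fired_actions(annotations, 85, {"metric_x": 90})
```

At a status of 70%, only A1 applies; at 85% with Metric X above 80%, both annotations apply and the request processing is canceled after the preliminary results have been calculated.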
The combination of the concepts introduced in this section enables Sustain-
Hub to deal with incompletely answered requests. Furthermore, based on the
status and the active Monitoring Annotations, the requester can be actively
informed about the status of their requests.
6 Conclusion
This paper motivated the topic of sustainability data exchange along supply
chains to subsequently present core challenges as well as state of the art in
this area. We have identified seven core challenges for today’s data collection
processes based on intensive interaction with our SustainHub partners, most of
them relating to variability issues. In particular, both design-time and run-time
flexibility are major requirements for any approach supporting sustainable
development and production. The presented challenges can serve as a starting
point for further developments to support today’s complicated supply chain
communication. The challenges are expressed in terms of sustainability data
collection; however, they describe generic problems that may occur in many other
domains involving cross-organizational communication. Thus, the results can be
transferred and used in
other domains. There exists a substantial amount of related work in different
areas touching these topics. Yet, none of these approaches or tools has succeeded
in providing holistic support for the process of sustainability data exchange in
a supply chain. The support of data collection requests and processes along
today’s complex supply chains is a challenge in the literal sense. Nonetheless,
the SustainHub project is actively working on a process-based solution to deal
with, and successfully manage, the high variability occurring during design and
run time. In this paper, we have provided a first outlook on the approach we are
developing to tackle the identified challenges. Future work will
describe the exact approach, combination of technologies, and the architecture
of the system to systematically address the presented data collection challenges.
Acknowledgement
The project SustainHub (Project No. 283130) is sponsored by the EU in the 7th
Framework Programme of the European Commission (Topic ENV.2011.3.1.9-1,
Eco-innovation).
References
1. Fawcett, S.E., Osterhaus, P., Magnan, G.M., Brau, J.C., McCarter, M.W.: Infor-
mation sharing and supply chain performance: the role of connectivity and will-
ingness. Supply Chain Management: An Int’l Journal 12(5) (2007) 358–368
2. Gunasekaran, A., Ngai, E.W.T.: Information systems in supply chain integration
and management. Europ J of Operational Research 159(2) (2004) 269–295
3. Pramatari, K.: Collaborative supply chain practices and evolving technological
approaches. Supply Chain Management: An Int’l Journal 12(3) (2007) 210–220
4. Barnett, P.T., Braddock, D.M., Clarke, A.D., DuPré, D.L., Gimarc, R., Lehr, T.F.,
Palmer, A., Ramachandran, R., Renyolds, J., Spellman, A.C.: Method of semi-
automatic data collection, data analysis, and model generation for the performance
analysis of enterprise applications (2007)
5. Singh, R.K., Murty, H.R., Gupta, S.K., Dikshit, A.K.: An overview of sustainability
assessment methodologies. Ecological indicators 9(2) (2009) 189–212
6. Ballou, B., Heitger, D.L., Landes, C.E.: The Future of Corporate Sustainability
Reporting: A Rapidly Growing Assurance Opportunity. J of Accountancy 202(6)
(2006) 65–74
7. Adams, C.A., McNicholas, P.: Making a difference: Sustainability reporting, ac-
countability and organisational change. Accounting, Auditing & Accountability
Journal 20(3) (2007) 382–402
8. Pagell, M., Wu, Z.: Building a more complete theory of sustainable supply chain
management using case studies of 10 exemplars. J of Supply Chain Management
45(2) (2009) 37–56
9. Gottschalk, F., van der Aalst, W.M.P., Jansen-Vullers, M.H., La Rosa, M.: Con-
figurable workflow models. Int’l J Cooperative Information Systems 17(2) (2008)
177–221
10. Rosemann, M., van der Aalst, W.M.P.: A configurable reference modelling lan-
guage. Information Systems 32(1) (2005) 1–23
11. La Rosa, M., van der Aalst, W.M.P., Dumas, M., ter Hofstede, A.H.M.:
Questionnaire-based variability modeling for system configuration. Software and
System Modeling 8(2) (2009) 251–274
12. Reinhartz-Berger, I., Soffer, P., Sturm, A.: Extending the adaptability of reference
models. IEEE Transactions on Systems, Man, and Cybernetics, Part A 40(5)
(2010) 1045–1056
13. Hallerbach, A., Bauer, T., Reichert, M.: Configuration and management of process
variants. In: Int’l Handbook on Business Process Management I. Springer (2010)
237–255
14. Torres, V., Zugal, S., Weber, B., Reichert, M., Ayora, C., Pelechano, V.: A qual-
itative comparison of approaches supporting business process variability. In: 3rd
Int’l Workshop on Reuse in Business Process Management (rBPM 2012). BPM’12
Workshops. LNBIP, Springer (September 2012)
15. Ayora, C., Torres, V., Weber, B., Reichert, M., Pelechano, V.: Vivace: A framework
for the systematic evaluation of variability support in process-aware information
systems. Information and Software Technology (to appear) (2014)
16. van der Aalst, W.M.P., Weske, M., Grünbauer, D.: Case handling: A new paradigm
for business process support. Data & Knowledge Engineering 53(2) (2004) 129–162
17. Reijers, H.A., Liman, S., van der Aalst, W.M.P.: Product-based workflow design.
Management Information Systems 20(1) (2003) 229–262
18. Bhattacharya, K., Hull, R., Su, J.: A data-centric design methodology for business
processes. In: Handbook of Research on Business Process Management. IGI (2009)
503–531
19. Müller, D., Reichert, M., Herbst, J.: A new paradigm for the enactment and
dynamic adaptation of data-driven process structures. In: CAiSE’08. Volume 5074
of LNCS. Springer (2008) 48–63
20. Künzle, V., Reichert, M.: PHILharmonicFlows: towards a framework for object-
aware process management. J of Software Maintenance and Evolution: Research
and Practice 23(4) (June 2011) 205–244
21. Dadam, P., Reichert, M.: The ADEPT project: A decade of research and de-
velopment for robust and flexible process support - challenges and achievements.
Computer Science - Research and Development 23(2) (2009) 81–97
22. Sadiq, S., Marjanovic, O., Orlowska, M.: Managing change and time in dynamic
workflow processes. Int. J Cooperative Information Systems 9(1&2) (2000) 93–116
23. Weske, M.: Formal foundation and conceptual design of dynamic adaptations in
a workflow management system. In: Proc. Hawaii Int’l Conf on System Sciences
(HICSS-34). (2001)
24. Lenz, R., Reichert, M.: IT support for healthcare processes - premises, challenges,
perspectives. Data and Knowledge Engineering 61(1) (2007) 39–58
25. Minor, M., Tartakovski, A., Bergmann, R.: Representation and structure-based
similarity assessment for agile workflows. In: Proc. ICCBR’07. (2007) 224–238
26. Weber, B., Reichert, M., Wild, W., Rinderle-Ma, S.: Providing integrated life cycle
support in process-aware information systems. Int’l J of Cooperative Information
Systems 18(1) (2009) 115–165
27. Minor, M., Tartakovski, A., Schmalen, D., Bergmann, R.: Agile workflow tech-
nology and case-based change reuse for long-term processes. Int’l J of Intelligent
Information Technologies 4(1) (2008) 80–98
28. Müller, R., Greiner, U., Rahm, E.: AgentWork: A workflow system supporting
rule-based workflow adaptation. Data & Knowledge Engineering 51(2) (2004)
223–256
29. Lerner, B.S., Christov, S., Osterweil, L.J., Bendraou, R., Kannengiesser, U., Wise,
A.E.: Exception handling patterns for process modeling. IEEE Trans. Software
Eng. 36(2) (2010) 162–183
30. Zugal, S., Soffer, P., Haisjackl, C., Pinggera, J., Reichert, M., Weber, B.: Inves-
tigating expressiveness and understandability of hierarchy in declarative business
process models. Software & Systems Modeling (June 2013)
31. Pesic, M., Schonenberg, H., van der Aalst, W.M.: Declare: Full support for loosely-
structured processes. In: Enterprise Distributed Object Computing Conference,
2007. EDOC 2007. 11th IEEE Int’l, IEEE (2007) 287–287
32. Weber, B., Pinggera, J., Zugal, S., Wild, W.: Alaska simulator toolset for conduct-
ing controlled experiments on process flexibility. In: Information Systems Evolu-
tion. Springer (2011) 205–221
33. Grambow, G., Mundbrod, N., Steller, V., Reichert, M.: Towards process-based
composition of activities for collecting data in supply chains. In: 6th Central
European Workshop on Services and their Composition (ZEUS 2014). (February
2014)