=Paper=
{{Paper
|id=Vol-1164/PaperVision03
|storemode=property
|title=Towards Configurable Data Collection for Sustainable Supply Chain Communication
|pdfUrl=https://ceur-ws.org/Vol-1164/PaperVision03.pdf
|volume=Vol-1164
|dblpUrl=https://dblp.org/rec/conf/caise/GrambowMSR14
}}
==Towards Configurable Data Collection for Sustainable Supply Chain Communication==
<pdf width="1500px">https://ceur-ws.org/Vol-1164/PaperVision03.pdf</pdf>
<pre>
     Towards Configurable Data Collection for
     Sustainable Supply Chain Communication

  Gregor Grambow, Nicolas Mundbrod, Vivian Steller and Manfred Reichert

                 Institute of Databases and Information Systems
                             Ulm University, Germany
    {gregor.grambow,nicolas.mundbrod,vivian.steller,manfred.reichert}@
                                   uni-ulm.de
                          http://www.uni-ulm.de/dbis


      Abstract. These days, companies in the automotive and electronics sec-
      tor are forced by legal regulations and customer needs to collect a myriad
      of different indicators regarding sustainability of their products. However,
      in today’s supply chains, these products are often the result of the collab-
      oration of a large number of companies. Thus, these companies have to
      apply complex, cross-organizational, and potentially long-running data
      collection processes to gather their sustainability data. Comprising a
      great number of manual and automated tasks for different partners, these
      processes imply great variability. To support such complex data collec-
      tion, we have designed a lightweight, automated approach for contextual
      process configuration.

      Key words: Process Configuration, Business Process Variability, Data Col-
      lection, Sustainability, Supply Chain


1 Introduction

In today’s industry, many products are the result of the collaboration of various
companies working together in complex supply chains. Cross-organizational
communication in such areas can be quite challenging due to the fact that differ-
ent companies have different information systems, data formats, and approaches
to such communication. These days, state authorities, customers and the pub-
lic opinion demand sustainability compliance from companies, especially in the
electronics and automotive sector. Therefore, companies have to report certain
sustainability indicators such as their greenhouse gas (GHG) emissions or the
amount of lead contained in their products. Such reports usually also involve
data from suppliers of the reporting company. Therefore, companies launch a
sustainability data collection process along their supply chain. This often also
involves the suppliers of the suppliers, and so on.
    As sustainability data collection is a relatively new and complicated issue,
service providers (e.g., for data validation or lab tests) are also involved in such
data collection. A property that makes these data collection processes even more
complex and problematic is the heterogeneity in the supply chain: companies use
18        Pre-proceedings of CAISE’14 Forum

different information systems, data formats, and overall approaches to
sustainability data collection. Many of them do not even have an information system
or approach in place for this and answer with low-quality data or not at all.
Therefore, no federated system or database could be applied to cope with such
problems and each request involves an often long-running, manual, and error-
prone data collection process. The following simplified scenario illustrates issues
with the data collection process on a small scale.
     Scenario: Sustainability Data Collection
     An automotive company wants to collect sustainability data relating to the
     quantity of lead contained in a specific part. This concerns two of the
     company’s suppliers. One of them has an IHS in place; the other has no system
     and no dedicated person responsible for sustainability. For the smaller company, a
     service provider is needed to validate the manually collected data to ensure
     that it complies with legal regulations. The IHS of the other company has
     its own data format that has to be explicitly converted to be usable. This
     simple scenario already shows how much complexity can be involved even
     in simple requests and gives an idea of what this can look like in bigger
     scenarios involving hundreds or thousands of companies with different
     systems and properties.

    In the SustainHub¹ project, we develop a centralized information exchange
platform that supports sustainability data collection along the whole supply
chain. We have already thoroughly investigated the properties of such data col-
lection in the automotive and electronics sectors and published a paper about
challenges and state-of-the-art regarding this topic [1]. With this paper, we pro-
pose an approach that enables an inter-organizational data collection process.
The main point thereby is the capability of this process to automatically config-
ure itself in alignment with the context of its concrete execution.
    To guarantee the utility of our approach as well as its general applicability,
we have started with collecting problems and requirements directly from the
industry. This involved telephone interviews with representatives of 15 European
companies from the automotive and electronics sectors, a survey with 124 valid
responses from companies of these sectors, and continuous communication with a
smaller focus group to gather more precise information. Among the most valuable
information gathered there was a set of core challenges for such a system: as
most coordination for sustainability data exchange between companies is done
manually, it can be problematic to find the right companies, departments, and
persons to get data from and also to determine in which cases service providers
must be involved (DCC1). Moreover, this is aggravated by the different systems
and approaches different companies apply. Even if the right entity or person
has been selected, it might still be difficult to access the data and to get it in
a usable format (DCC2). Furthermore, the data requests rely on a myriad of
contextual factors that are only managed implicitly (DCC3). Thus, a request is
1
     SustainHub (Project No.283130) is a collaborative project within the 7th Framework
     Programme of the European Commission (Topic ENV.2011.3.1.9-1, Eco-innovation).
  Configurable Data Collection for Sustainable Supply Chain Communication                                                           19

not reusable because an arbitrary number of variants can exist for it (DCC4).
A system aiming at supporting such data collection must explicitly manage and
store the requests, their variants, all related context data, and also data about
the different companies and support manual and automated data collection.
    The remainder of this paper is organized as follows: Section 2 shows our
general approach for data collection with processes. Section 3 extends this with
additional features regarding context and variability. This is followed by a brief
discussion of related work in Section 4 and the conclusion.


2 Data Collection Governed by Processes

The basic idea behind our approach for supporting data collection in complex
environments is governing the whole procedure by explicitly specified processes.
Furthermore, these processes are also automatically enacted by a PAIS (Process-
Aware Information System) that is integrated into the SustainHub platform.
That way, the process of data collection for a specific issue such as a sustainability
indicator can be explicitly specified by a process type while process instances
derived from that type govern concrete data collections regarding that issue.
Activities in such a process represent the manual and automatic tasks to be
executed as part of the data collection by different companies. This approach
already covers a number of the elicited requirements. It enables a centralized
and consistent request handling (cf. DCC1) and also supports manual as well as
automated data collection (cf. DCC2). One big advantage lies in the modularity
of the realization as a process. If a new external system is to be integrated, a new
activity component can be developed while the overall data collection process
does not need to be adapted. Finally, it also enables the explicit specification
of the data collection process (cf. DCC4). Visual modeling facilitates the creation
and maintenance of such processes. However, the realization via processes
can only be the basis for comprehensive and consistent data collection
support. To be able to satisfy the requirements regarding contextual influences,
various types of important data, and data request variants, we propose an ex-
tended process-based approach for data collection illustrated in Figure 1.
  [Figure 1 (diagram). Elements shown: Contextual Influences, Context Mapping,
  Data Model (Customer, Customer Relationship, Product, Context), Process Types,
  Configurations, Process Configuration, Configured Process Instance, SustainHub,
  Users / Systems.]

              Fig. 1: SustainHub Configurable Data Collection Approach


   To generate an awareness of contextual influences (e.g. the concrete approach
to data collection in a company, cf. DCC3) and make them usable for the data
collection process, we have defined an explicit context mapping approach (dis-
cussed in Section 3.1). This data is necessary for the central step of our approach,
the automatic and context-aware process configuration (discussed in Section 3.2),
where pre-defined process types and configuration options are used to automati-
cally generate a process instance containing all necessary activities to match the
properties of the current request’s situation (cf. DCC4). As a basis for this step, we
have elaborated a data model where contextual influences are stored (cf. DCC3)
alongside different kinds of content-related data. This data model integrates
process-related data with customer-related data as well as contextual information.
We will now briefly introduce the different kinds of data it incorporates, following
the sections of our data model. First, such a system must manage data
about its customers. Therefore, a customer data section comprises data about
the companies, such as organizational units or products. Further basic components
of industrial production that are important for many topics, such as sustainability,
are substances and (sustainability) indicators. As these are not specific to one
company, they are integrated as part of a master data section. In addition, the data
concretely exchanged between the companies is represented within a separate
section (exchange data). To support this data exchange, the system must man-
age certain data relating to the exchange itself (cf. DCC1): For whom is the data
accessible? What are the properties of the requests and responses? Such data
is captured in a runtime data section in the data model. Finally, to be able to
consistently manage the data request process, concepts for the process and its
variants as well as for the contextual meta data influencing the process have been
integrated with the other data. More detailed descriptions of these concepts and
their utilization will follow in the succeeding sections.


3 Variability Aspects of Data Collection
This section deals with the necessary areas for automated process configuration:
The mapping of contextual influences into the system to be used for configuration
and the modeling of the latter.

3.1 Context Mapping

As stated in the introduction, a request regarding the same topic (in this case,
a sustainability indicator) can have multiple variants that are influenced by a
myriad of possible contextual factors (e.g. the number of involved parties or
the data formats they use). Hence, if one seeks to implement any kind of automated
variant management, a consistent and manageable way of dealing with these
factors becomes crucial. However, the decisions on how to apply process con-
figuration and variant management often cannot be mapped directly to certain
facts existing in the environment of a system. Moreover, situations can occur
in which different contextual factors lead to the same decision(s) regarding
variant management. For example, a company could integrate a special
four-eyes-principle approval process for the release of data for different reasons,
such as when the data is for a specific customer group or when it relates to a specific
law or regulation. Nevertheless, it would be cumbersome to enable automatic
variant management by creating a huge number of rules for each and every pos-
sible contextual factor. Therefore, in the following, we propose a more generic
way of mapping for making contextual factors useable for decisions regarding
the data collection process.
    In our approach, contextual factors are abstracted by introducing two sepa-
rate concepts in a lightweight and easily configurable way: The Context Factor
captures all the different contextual facts existing in the system’s environment.
In contrast, the Process Parameter is used to model a stable set of
parameters directly relevant to the process of data collection. Both concepts are
connected by simple logical rules as illustrated on the left side of Figure 2. In
this example, a simple mapping is shown. If a contact person is configured for
a company (CF1), the parameter ‘Manual Data Collection’ (P1) is derived. If
the company is connected via a tool connector (CF2), automatic data collection
is applied (P3). If the company lacks a certain certification (CF3), an
additional validation is needed (P2).
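    The rule-based mapping described above can be illustrated with a minimal
Python sketch. The identifiers CF1 to CF3 and P1 to P3 follow the example; the
dictionary layout and the function name derive_parameters are assumptions made
for illustration and do not describe the actual SustainHub implementation.

```python
# Context rules from the example: each Context Factor implies one Process Parameter.
# The dict representation is an illustrative assumption, not the SustainHub format.
CONTEXT_RULES = {
    "CF1": "P1",  # contact person configured: manual data collection
    "CF2": "P3",  # tool connector available: automatic data collection
    "CF3": "P2",  # certification missing: validation needed
}


def derive_parameters(active_factors):
    """Return the set of Process Parameters derived from the active Context Factors."""
    return {CONTEXT_RULES[cf] for cf in active_factors if cf in CONTEXT_RULES}


if __name__ == "__main__":
    # A supplier with a contact person (CF1) and a missing certification (CF3).
    print(sorted(derive_parameters({"CF1", "CF3"})))  # prints ['P1', 'P2']
```

Because several Context Factors may map to the same Process Parameter, the
result is a set: the process configuration only needs to know which parameters
hold, not which facts caused them.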


  [Figure 2 (diagram). Context Factors are connected to Process Parameters by
  Context Rules (CF1 -> P1, CF2 -> P3, CF3 -> P2). Further panels illustrate the
  constraints Implication and Mutual Exclusion as well as Contradiction 1 and
  Contradiction 2.
  Legend: CF1: Contact Person X; CF2: Tool Connector Y; CF3: Certification missing;
  P1: Manual Data Collection; P2: Validation needed; P3: Automatic Data Collection.]

                                   Fig. 2: Context Mapping


    When exchanging data between companies, various situations might occur, in
which different decisions regarding the process might have implications on each
other. For example, it would make no sense to collect data both automatically
and manually for the same indicator at the same time. To express this, we have
also included the two simple constraints ‘implication’ and ‘mutual exclusion’
for the parameters. Figure 2 shows an example in which manual and automatic
data collection are mutually exclusive.
    Although we have put emphasis on keeping the applied rules and constraints
simple and maintainable, there can still exist situations, in which these lead to
contradictions. One case (Contradiction 1 in Figure 2) involves a contradiction
created solely by the constraints, where one activity both requires and prohibits the
occurrence of another activity at the same time. A second case (Contradiction
2 in Figure 2) occurs when combining certain rules with certain constraints, in
which a contradicting set of parameters is produced. To avoid such situations,
we have integrated a set of simple correctness checks for constraints and rules.
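    The two kinds of contradiction can be made concrete with a small Python
sketch. The encoding of implications as (a, b) pairs and of mutual exclusions as
two-element frozensets, as well as all function names, are assumptions chosen
for illustration; the paper does not specify how the checks are implemented.

```python
def close_under_implications(params, implications):
    """Add every parameter that is (transitively) required by the given ones."""
    closed = set(params)
    changed = True
    while changed:
        changed = False
        for required_by, required in implications:
            if required_by in closed and required not in closed:
                closed.add(required)
                changed = True
    return closed


def violated_exclusions(params, implications, exclusions):
    """Return the mutually exclusive pairs that are both present after closure."""
    closed = close_under_implications(params, implications)
    return [pair for pair in exclusions if pair.issubset(closed)]


def constraints_contradict(implications, exclusions):
    """Constraint-only check: some parameter requires a set of parameters
    (including itself) that contains a mutually exclusive pair."""
    universe = {p for pair in implications for p in pair}
    return any(violated_exclusions({p}, implications, exclusions) for p in universe)


if __name__ == "__main__":
    exclusions = [frozenset({"P1", "P3"})]  # manual vs. automatic data collection
    # Rules plus constraints: CF1 and CF2 both hold, so P1 and P3 are both derived.
    print(violated_exclusions({"P1", "P3"}, [], exclusions))
    # Constraints alone: P1 requires P2 but is also mutually exclusive with it.
    print(constraints_contradict([("P1", "P2")], [frozenset({"P1", "P2"})]))
```

The first call corresponds to a contradiction that only arises when rules and
constraints are combined, the second to one caused by the constraints alone.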

3.2 Process Configuration

In this section, we introduce our approach for process configuration. In designing
it, we not only considered the aforementioned challenges but also wanted to
keep the approach as simple and lightweight as possible to enable users of
SustainHub to configure and manage it. Furthermore, our findings included
data about the actual activities of data collection and their relation to contextual
data. Data collection often contains a set of basic activities that are part of each
data collection process. Other activities appear mutually exclusive, e.g. manual
or automatic data collection, and no standard activity can be determined here.
In most cases, one or more context factors impose the application of a set of
additional coherent activities rather than one single activity.
    In the light of these facts, we have opted for the following approach for au-
tomatic process configuration: For one case (e.g. a sustainability indicator) a
process family is created. The latter contains a Base Process with all basic ac-
tivities for that case. Additional activities that are added to this Base Process
are encapsulated in Process Fragments. These are automatically added to the
process on account of the parameters of the current situation that is represented
in the system by the already introduced Process Parameters and Context Factors.
Thus, we rely on a single change pattern for the processes: an insert
operation. This operation has already been described in the literature; for its
formal semantics, see [2]. Our approach thereby avoids the problems associated
with other operations used in approaches like Provop [3]. Figure 3 shows a simple example
of a Base Process that has been configured with Process Fragments (configured
areas are marked red). For simplicity, this example uses a subset of the activities
of the scenario from the introduction.
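    As a rough illustration of a process family, the following Python sketch
models a Base Process and its Process Fragments. It assumes that each fragment
is triggered by exactly one Process Parameter; this trigger field, the class
names, and the activity order of the base process are hypothetical
simplifications and are not taken from SustainHub.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProcessFragment:
    fragment_id: str
    activities: tuple        # activities encapsulated by the fragment
    required_parameter: str  # assumed trigger: the parameter that selects the fragment


@dataclass
class ProcessFamily:
    case: str                # e.g. one sustainability indicator
    base_process: list       # basic activities present in every instance
    fragments: list = field(default_factory=list)

    def select_fragments(self, parameters):
        """Return the fragments whose triggering parameter is among the derived ones."""
        return [f for f in self.fragments if f.required_parameter in parameters]


if __name__ == "__main__":
    family = ProcessFamily(
        case="lead content indicator",
        base_process=["Configure Data Collection", "Aggregate Data", "Deliver Data"],
        fragments=[
            ProcessFragment("PF1", ("Collect Data Automatically",), "P3"),
            ProcessFragment(
                "PF2", ("Inform Person (Responsible)", "Collect Data Manually"), "P1"
            ),
        ],
    )
    # Parameters P1 and P2 hold, so only the manual collection fragment is chosen.
    print([f.fragment_id for f in family.select_fragments({"P1", "P2"})])  # ['PF2']
```

Since fragments are only ever inserted, the base process itself is never
modified or reduced by the selection step.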
    To keep the approach lightweight and simple, we decided to model both the
Base Process and the fragments in a PAIS (Process-Aware Information System)
that will be integrated into our approach. Thus, we can rely on the abilities of
the PAIS for modeling and enacting the processes and also for checking their
correctness.

  [Figure 3 (diagram). Process Fragments:
    PF1: Collect Data Automatically (Type: a, Insert: Inline, Exec: single)
    PF2: Inform Person (Responsible), Collect Data Manually (Type: a,
         Insert: Inline, Exec: single)
    PF3: Approve Receipt (Type: x, Insert: Inline, Exec: single)
  Extension Points:
    EP1: Start: EP1.start, End: EP1.end, Type: a, Order: 1
    EP2: Start: EP2.start, End: EP2.end, Type: x, y, Order: 2
  Base Process activities: Configure Data Collection, Aggregate Data, Deliver Data,
  Inform Requester. The configured process additionally contains the activities of
  PF1 and PF2 at EP1 and of PF3 at EP2.]

                                Fig. 3: Process Fragments


   To enable the system to automatically extend the base process at the right
points with the chosen fragments, we have added the concept of the Extension
Point (EP). Both the latter and the fragments have parameters that the system can
match to find the right EP for a fragment (see Figure 3 for two example EPs
and three fragments with matching parameters). Regarding the connection of
the EPs to the Base Processes, we have also evaluated multiple options, such as
connecting them directly to activities. Most of such options introduce limitations
to the approach or impose a fair amount of additional complexity (cf. [3] for
a more detailed discussion). For these reasons we have selected an approach
involving two so-called connection points of an EP with a Base Process. These
points are connected with nodes in the process as shown in Figure 3. Taking the
nodes as connection points allows us to reference the node’s ID for the connection
point because this ID is stable and would only change in case of more complicated
configuration actions (cf. [3]). If the Base Process contains nodes between the
connection points of one EP, an insertion is applied in parallel to these
(cf. EP2 in Figure 3); otherwise, it is applied sequentially (cf. EP1). Furthermore, if more
than one fragment should be inserted at one EP, they will be inserted in parallel
to each other (cf. EP1 and fragments 1 and 2 in Figure 3).
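    The matching and insertion rules just described can be sketched in Python.
The sketch assumes a purely sequential base process given as an ordered list of
node IDs; the node IDs n1 to n4, the class layout, and the function names are
hypothetical, while the types of the EPs and fragments follow Figure 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtensionPoint:
    ep_id: str
    start: str        # ID of the node at the first connection point
    end: str          # ID of the node at the second connection point
    types: frozenset  # fragment types accepted at this EP


@dataclass(frozen=True)
class Fragment:
    fragment_id: str
    fragment_type: str


def matching_fragments(ep, fragments):
    """Fragments whose type is accepted by the extension point."""
    return [f for f in fragments if f.fragment_type in ep.types]


def insertion_plan(ep, fragments, base_nodes):
    """Describe how the matching fragments would be inserted at one EP."""
    matched = matching_fragments(ep, fragments)
    # Base nodes strictly between the two connection points of the EP.
    between = base_nodes[base_nodes.index(ep.start) + 1 : base_nodes.index(ep.end)]
    return {
        "fragments": [f.fragment_id for f in matched],
        "parallel_to_base_nodes": between,  # empty list: sequential insertion
        "fragments_in_parallel": len(matched) > 1,
    }


if __name__ == "__main__":
    base_nodes = ["n1", "n2", "n3", "n4"]  # hypothetical node IDs
    fragments = [Fragment("PF1", "a"), Fragment("PF2", "a"), Fragment("PF3", "x")]
    ep1 = ExtensionPoint("EP1", "n1", "n2", frozenset({"a"}))
    ep2 = ExtensionPoint("EP2", "n2", "n4", frozenset({"x", "y"}))
    print(insertion_plan(ep1, fragments, base_nodes))
    print(insertion_plan(ep2, fragments, base_nodes))
```

For EP1, no base node lies between the connection points, so PF1 and PF2 are
inserted sequentially into the base process and in parallel to each other. For
EP2, node n3 lies in between, so PF3 is inserted in parallel to it.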
    By relying on the capabilities of the PAIS, we have kept the number of
additional correctness checks small. However, the connection points are not checked
by the PAIS and could lead to erroneous configurations. To keep correctness
checks on them simple, we rely on two things: the pairing of the two connection
points of one EP and block-structured processes [4]. The first spares us
from having to check all mutual connections of all connection points, as two
always belong together. The second implies certain guarantees regarding the
structure of the processes. So we only have to check a small set of cases, such as
an erroneous definition of EP2 in Figure 3 that would cause a violation of the
block structure, as shown in the figure.


4 Related Work
Regarding the topic of process configuration, various approaches exist. Most of
them focus on the modeling of process configuration. One example is C-EPC [5]
that enables behavior-based configurations by integrating configurable elements
into a process model. Another approach with the same focus is ADOM [6]. It
allows for the specification of constraints and guidelines on a process model to
support variability modeling. For all of these approaches two main shortcomings
apply: First, they strongly focus on the modeling and neglect execution. Second,
configuration must be manually applied by a human, which can be complicated
and time-consuming. The approach most closely related to ours is probably
Provop [3]. It allows storing a base process together with pre-defined
configurations of it. Compared to our approach, Provop is more fine-grained, complicated, and
heavyweight whereas our approach utilizes a set of simplifications that enable a
far more lightweight approach. For further reading on the configuration topic,
see [7] for an overview of configuration approaches and our predecessor paper
for SustainHub [1].

5 Conclusion

In this paper, we have shown a lightweight approach to automatic and contex-
tual process configuration required in complex domains. We have investigated
concrete issues in an example domain relating to sustainability data collection
in supply chains. With our approach, we have centralized the data and process
management uniting many different factors in one data model and supporting
the whole data collection procedure by processes executed in a PAIS. Moreover,
we have enabled this approach to apply automated process configurations con-
forming to different situations by applying a simple model allowing for mapping
contextual factors to parameters for the configuration. In future work, we plan
to evaluate our work with our industrial partners and to extend our approach to
cover further aspects regarding runtime variability, automated monitoring, and
automated data quality management.


Acknowledgement

The project SustainHub (Project No.283130) is sponsored by the EU in the 7th
Framework Programme of the European Commission (Topic ENV.2011.3.1.9-1,
Eco-innovation).


References
1. Grambow, G., Mundbrod, N., Steller, V., Reichert, M.: Challenges of applying
   adaptive processes to enable variability in sustainability data collection. In: 3rd
   Int’l Symposium on Data-Driven Process Discovery and Analysis. (2013) 74–88
2. Rinderle-Ma, S., Reichert, M., Weber, B.: On the formal semantics of change pat-
   terns in process-aware information systems. In: Proc. 27th Int’l Conference on
   Conceptual Modeling (ER’08). Number 5231 in LNCS, Springer (October 2008)
   279–293
3. Hallerbach, A., Bauer, T., Reichert, M.: Configuration and management of process
   variants. In: Int’l Handbook on Business Process Management I. Springer (2010)
   237–255
4. Mendling, J., Reijers, H.A., van der Aalst, W.M.: Seven process modeling guidelines
   (7pmg). Information and Software Technology 52(2) (2010) 127–136
5. Rosemann, M., van der Aalst, W.M.P.: A configurable reference modelling language.
   Information Systems 32(1) (2005) 1–23
6. Reinhartz-Berger, I., Soffer, P., Sturm, A.: Extending the adaptability of reference
   models. IEEE Trans on Syst, Man, and Cyber, Part A 40(5) (2010) 1045–1056
7. Torres, V., Zugal, S., Weber, B., Reichert, M., Ayora, C., Pelechano, V.: A qual-
   itative comparison of approaches supporting business process variability. In: 3rd
   Int’l Workshop on Reuse in Business Process Management (rBPM 2012). BPM’12
   Workshops. LNBIP, Springer (September 2012)

</pre>