<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Applying Normalized Systems Theory in a Data Integration Solution, a Case Report.</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hans Nouwens</string-name>
          <email>Hans.Nouwens@sogeti.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edzo A. Botjes</string-name>
          <email>ebotjes@xebia.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christiaan Balke</string-name>
          <email>Christiaan.Balke@sogeti.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sogeti Nederland</institution>
          ,
          <addr-line>Vianen</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Xebia - Security</institution>
          ,
          <addr-line>Hilversum</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Enterprise Application Integration (EAI) becomes more important and non-evident when confronted with the growth of complexity. The evolvability of the EAI solution itself decreases over time due to Lehman's law. This case report describes and evaluates the architecture (model and principles), governance, solution design, and realisation, which are based on the Normalized Systems Theory. There are indications that the realisation mitigated the effects of Lehman's law, hence improving the evolvability of the created EAI solution.</p>
      </abstract>
      <kwd-group>
        <kwd>normalized systems</kwd>
        <kwd>software architecture</kwd>
        <kwd>case report</kwd>
        <kwd>enterprise application integration</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        When implementing information systems, the main focus is on fast realisation
and deployment. Preventing complexity of integration between the information
systems is not the main priority [
        <xref ref-type="bibr" rid="ref15 ref5">5,15</xref>
        ].
      </p>
      <p>
        The integration of applications, the exchanging of data between enterprise
applications, also known as Enterprise Application Integration (EAI) [
        <xref ref-type="bibr" rid="ref10 ref3 ref5">10,3,5</xref>
        ],
becomes more important and non-evident when confronted with the growth of
complexity of the application integration [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This often results in EAI becoming
distinct from application development with its own software used for performing
the task of integration [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. There are various methods to integrate applications
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Each method has its own complexity and an impact on
the complexity of the application that is integrated with the whole [
        <xref ref-type="bibr" rid="ref5 ref7">5,7</xref>
        ].
      </p>
      <p>
        Lehman’s law of increasing complexity [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] states: “As an evolving program
is constantly changed, its complexity, reflecting deteriorating structure, increases
unless work is done to maintain or reduce it” [12, par. 6.2.2]. Normalized
Systems Theory (NST) proposes a set of theorems/principles to counter Lehman’s
law, including the previously described complexity of EAI. NST argues that
reduction of Combinatorial Effects (CE) will reduce the complexity of software.
Disconnecting the relation between the size of the application and the impact
of a change [
        <xref ref-type="bibr" rid="ref1 ref15">1,15</xref>
        ], will at least maintain a stable cost of change in a growing
system. The theorems of NST are described in two books [
        <xref ref-type="bibr" rid="ref11 ref12">11,12</xref>
        ] and various
articles [
        <xref ref-type="bibr" rid="ref13 ref14 ref4">13,14,4</xref>
        ]. Descriptions of NST implemented in software are scarce
[
        <xref ref-type="bibr" rid="ref1 ref15 ref16">1,16,15</xref>
        ], as are descriptions of integrating applications using NST [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>We want to add our experience with applying NST in a real-life environment
to the available body of knowledge in the form of a case report following the
STARR format (Situation, Task, Approach, Result, Reflection).</p>
    </sec>
    <sec id="sec-2">
      <title>Situation</title>
      <p>The subject of this case report is a Dutch university of applied sciences. The business
application landscape of this educational organisation evolved organically, with little
guidance. This resulted in many custom applications providing overlapping
functions and supporting undocumented information flows. Migrating this
complicated and interwoven application landscape to industry-standard SaaS
applications proved to be very time-consuming, holding the organisation hostage in
its current situation.</p>
      <p>Previous attempts to describe an architecture failed, mainly because of their
generic approach and lack of a clear vision on how to implement an evolvable
design.</p>
    </sec>
    <sec id="sec-3">
      <title>Task</title>
      <p>To help in this migration from custom applications to standard SaaS, in 2017 the
first author was requested to write an architecture that would guide
the design of an evolvable data integration solution based on a hub-and-spoke
pattern with a central data storage hub.</p>
      <p>One of the client's requirements was that the design and solution be able
to adapt to a changing environment, that is, to be evolvable. Attention to potential
change drivers and possible combinatorial effects turned out to be one of the
main drivers behind the architecture and the solution designs.</p>
      <list list-type="order">
        <list-item>
          <label>1.</label>
          <p>The solution needs to support multiple data sources and data targets,
multiple vendors, multiple communication patterns (pub/sub, etc.),
multiple security patterns (JSON Web Tokens, IP-Allow-List, etc.), multiple
connection techniques (ODBC, REST/SOAP API, (s)FTP, http(s), e-mail,
SAMBA/NFS, etc.), multiple file types (JSON, CSV, XLS(X), etc.) and
multiple (cloud) networks (on-premise, private cloud, Azure, AWS, GCP, etc.).</p>
        </list-item>
        <list-item>
          <label>2.</label>
          <p>The solution needs to support multiple frequencies of delivering data, such
as a 24-hour bulk upload or small messages based on events in a source
application.</p>
        </list-item>
        <list-item>
          <label>3.</label>
          <p>The solution needs to support quality rules to prevent unqualified source
data from being distributed.</p>
        </list-item>
        <list-item>
          <label>4.</label>
          <p>The solution needs to support filtering to ensure a data minimisation policy.</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-4">
      <title>Approach</title>
      <p>This case report contains our translation of the NST theorems [12, par. 12.2] into
an architecture for an EAI solution. In our experience, in the Dutch educational
sector the integration bus pattern is commonly used to realise EAI, especially
with off-the-shelf and Software-as-a-Service (SaaS) applications. This decision is
mainly based on the general conviction that the application of an Enterprise Service Bus (ESB) reduces
the number of connections between applications from N(N-1)/2 to N, as described
in [12, p. 275, fig. 12.3].</p>
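      <p>As a small worked check of this conviction, the sketch below compares the number of connections needed for point-to-point integration with the number needed when every application connects only to a central bus. The function names are ours and serve only as an illustration.</p>
      <preformat><![CDATA[
```python
def point_to_point_connections(n: int) -> int:
    """Connections needed when each of n applications links directly to every other one."""
    return n * (n - 1) // 2


def hub_connections(n: int) -> int:
    """Connections needed when each of n applications links only to a central bus."""
    return n
```
]]></preformat>
      <p>For 20 applications this gives 190 point-to-point connections against 20 connections to the bus.</p>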
      <p>However, we have not seen an application of the NST in this problem domain
in this sector.</p>
      <p>For our evaluation and reflection, we try to answer this question:</p>
      <disp-quote>
        <p>Does our architecture, applying our interpretation of the Normalized Systems Theory, improve the evolvability of the created EAI solution?</p>
      </disp-quote>
      <sec id="sec-4-2">
        <p>Author one fulfilled the role of architect. Author two fulfilled the role of
architect at other educational organisations with similar requirements and was not
directly involved with the subject of this case report. The third author was analyst and
engineer, realising and operating the solution for two years. The numbers
in this case report include the period when the engineer was involved as the
consulting architect and the main architect had limited involvement.</p>
        <p>In addition to a number of personal observations and evaluations, we retrieved the
perceived effort to create new versions of the connections between the ESB and
the applications. For this we consulted the previously hired developer, and the
developers, analyst and product owner currently working at the organisation.</p>
        <p>If we were successful in preventing complexity, in the context of Lehman's
law, we should be able to see that the effort of changing a connection does
not increase together with the overall size of the system, that is, the total number of
connections.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Result</title>
      <p>The Integration Framework (IFW) is (1) a technical solution that integrates
multiple systems and data sources and (2) a framework of processes, controls and
agreements. The IFW system architecture describes the components, the way of
organising their relations and the principles guiding its design and evolution.</p>
      <sec id="sec-5-1">
        <title>IFW system architecture</title>
        <p>The IFW system consists of the following three types of components: ‘source
pipelines’, a ‘data-hub’ component and ‘target pipelines’ (see figure 1).</p>
        <p>The ‘source pipelines’ are responsible for the optimal ingestion of information
from the designated source systems and maintain the integrity and currency of the
information in the ‘data storage’. For each source system an instance of a source
pipeline is created. The ‘data-hub’ is responsible for the availability of the
information. There is only one instance of the data-hub. The ‘target pipelines’ are
responsible for providing data to systems that are not designated as the source
for this data. For each target system an instance of a target pipeline is
created. Together this creates the pattern of an ESB, plus a central data persistence
component.</p>
        <sec id="sec-5-1-1">
          <title>The IFW system consists of the following sub-systems:</title>
          <list list-type="order">
            <list-item>
              <label>1.</label>
              <p>Source systems are connected to the IFW system to deliver part of their data elements. The connection to the source systems is part of the scope of the IFW; the source system itself is not. Each source system is connected to a dedicated source pipeline.</p>
            </list-item>
            <list-item>
              <label>2.</label>
              <p>Source pipelines consist of the following sequence of modules:</p>
              <list list-type="alpha-lower">
                <list-item>
                  <label>(a)</label>
                  <p>The Adapters separate the connection specifics of the source system from the standards used within a pipeline. The adapter is the only sub-system of the IFW system that directly interacts with the source system. It handles functions like an IP-Allow-List, SSL-certificates, API-tokens, accounts and passwords, file-transfers et cetera. A typical implementation is based on an API gateway service. If the adapter receives individual messages, it will buffer them in a table.</p>
                </list-item>
                <list-item>
                  <label>(b)</label>
                  <p>The Extractor retrieves the data from the result table of the Adapter module. It converts technical formats (for example: JSON, XML, CSV) to strongly typed database records and values. The technical formats are dictated by the source system. Type conversion may trigger a technical error status and notifications. The errors in the extractor are limited to technical data integrity, job integrity and connection errors.</p>
                  <p>The extractor is responsible for storing a backup of the retrieved data (files). The backup is used to restart the pipeline. Since every step can be run independently, it is possible that the extractor collects multiple sets of retrieved data that accumulated while waiting to be processed by the transformer step.</p>
                </list-item>
                <list-item>
                  <label>(c)</label>
                  <p>The Transformer replaces values in the new data by updating it with reference tables from the meta system. Field quality rules check for allowed values. Column quality rules count rows. Row counts can be compared to a range of expected number of rows.</p>
                </list-item>
                <list-item>
                  <label>(d)</label>
                  <p>The Combiner has a read-only database connection to the data-hub. It converts the data from the data model of the source system to the shared data model of the data-hub (also known as ODS<fn id="fn3"><label>3</label><p>The term ODS is commonly used as Operational Data Store in relation to the data-provisioning of a Data Warehouse (DWH) [<xref ref-type="bibr" rid="ref6">6</xref>]. Currently it is more common to use the term data-hub in the scope of data-provisioning to operational applications within an organisation.</p></fn>). It looks in the data-hub at existing values and relational keys and creates new unique values to be inserted. A unique key from the source system is maintained in the data-hub to resolve future updates.</p>
                </list-item>
                <list-item>
                  <label>(e)</label>
                  <p>The Loader creates or updates data in the data-hub. This is the only IFW sub-system that updates or creates data in the data-hub.</p>
                </list-item>
              </list>
            </list-item>
            <list-item>
              <label>3.</label>
              <p>The Data-hub contains the data that is to be distributed to target systems. Unlike a data warehouse, it keeps no historic record of previous versions. The data warehouse(s) of the organisation are considered both a target system and a source system of the data-hub. The data-hub contains the most recent state of the data-elements from the source systems.</p>
              <list list-type="alpha-lower">
                <list-item>
                  <label>(a)</label>
                  <p>The canonical data model (CDM) of the data-hub must be able to answer many questions by target systems, hence its independent, canonical model.</p>
                </list-item>
                <list-item>
                  <label>(b)</label>
                  <p>The technical data model (TDM) of the data-hub is the materialisation of the Canonical Data Model [<xref ref-type="bibr" rid="ref18">18</xref>].</p>
                  <p>The data-hub can consist of multiple types of data-storage solutions, each with its own Technical Data Model, to ensure that all the data requirements of the target systems are fulfilled. A single data-storage solution would be in 3NF<fn id="fn4"><label>4</label><p>See https://en.wikipedia.org/wiki/Third_normal_form</p></fn>.</p>
                  <p>Examples of multiple data-storage solutions are (1) an on-premise SQL master database with an SQL slave database in the cloud and (2) an SQL database in combination with a Data-lake solution in the cloud.</p>
                </list-item>
                <list-item>
                  <label>(c)</label>
                  <p>The Generic Views (GV) function as a facade between the data-storage in the data-hub and the target pipelines [<xref ref-type="bibr" rid="ref8">8</xref>]. The complexity of the database model in the data-hub is hidden by the usage of views [<xref ref-type="bibr" rid="ref2">2</xref>]. The views are to be re-used by multiple outgoing pipelines. Generic Views cannot re-use other Generic Views, as this would create an unwanted dependency. Views are created to deliver grouped data on the level of a complete business object<fn id="fn5"><label>5</label><p>An architectural name to describe a collection of attributes that together have a meaning to business people, also known as information.</p></fn>.</p>
                </list-item>
              </list>
            </list-item>
            <list-item>
              <label>4.</label>
              <p>A Target pipeline is created for each target system. Target pipelines consist of the following sequence of modules:</p>
              <list list-type="alpha-lower">
                <list-item>
                  <label>(a)</label>
                  <p>The Selector combines several generic views from the data-hub. A subset of the columns is selected to deliver only the required and allowed data for this target system, enabling data minimisation policies.</p>
                </list-item>
                <list-item>
                  <label>(b)</label>
                  <p>The Filter selects a subset of the rows, again enabling data minimisation policies. Changes to both the column and row filters are explicitly under the governance of the privacy officer<sup>6</sup> (CPO) and the security officer<sup>7</sup> (CISO) [<xref ref-type="bibr" rid="ref17">17</xref>]. Their policies determine the classification of the target system connected to this instance of the pipeline [<xref ref-type="bibr" rid="ref17">17</xref>], and hence the data that is allowed to be delivered to the target system. The list of attributes to be delivered is recorded in the data-delivery register, enabling GDPR<sup>8</sup> compliance and processes.</p>
                </list-item>
                <list-item>
                  <label>(c)</label>
                  <p>The Transformer again transforms from one data model (in this case the CDM via the TDM) to the data model of the target system. It can also apply changes to the data using the reference tables.</p>
                </list-item>
                <list-item>
                  <label>(d)</label>
                  <p>The Adapter is the equivalent of the Adapter in the incoming pipeline. It has the same dependency, but now on the target system. There are three types of adapters present in the IFW solution: (1) a generic adapter using API technology and services that resemble the Logical Data Model, (2) a generic adapter using API technology and services that are specified in a domain standard (for example: open banking API, open education API) and (3) a system-specific adapter.</p>
                </list-item>
              </list>
            </list-item>
            <list-item>
              <label>5.</label>
              <p>The Target systems receive data from the IFW solution. For every data-element in the data-hub CDM there is a source system defined. Every other system is a potential target system of this data-element. When a system needs data that is not part of its assigned ownership, the required data-elements are retrieved via a target pipeline.</p>
            </list-item>
            <list-item>
              <label>6.</label>
              <p>Meta-data</p>
            </list-item>
          </list>
          <p>Each of the modules maintains its own status in a globally available
table. The “is_enabled” status of the Loader module can be switched off to test the
complete source pipeline and execute the quality rules, feeding back
results based on production data without actually changing the data-hub or
feeding target systems. This mitigates privacy concerns during
testing.</p>
          <p>Severity levels of a failing quality rule will be logged. The next module will
use this to determine if it is allowed to start.</p>
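          <p>The interplay between the shared status table, the logged severity levels and the “is_enabled” status can be sketched as follows. This is a minimal illustration in Python, not the SSIS implementation of the case: the severity names, the threshold and the function and class names are assumptions made for this example.</p>
          <preformat><![CDATA[
```python
from dataclasses import dataclass, field

# Hypothetical severity scale; the case report does not name its levels.
SEVERITY = {"ok": 0, "warning": 1, "error": 2}


@dataclass
class StatusTable:
    """Stand-in for the globally available status table."""
    rows: dict = field(default_factory=dict)

    def write(self, module: str, severity: str) -> None:
        self.rows[module] = severity

    def may_start(self, previous_module: str, max_severity: str = "warning") -> bool:
        """A module may start only if its predecessor logged an acceptable severity."""
        logged = self.rows.get(previous_module)
        if logged is None:
            return False  # the predecessor has not run yet
        return SEVERITY[logged] <= SEVERITY[max_severity]


def run_loader(status: StatusTable, staged_rows: list, data_hub: list, is_enabled: bool) -> int:
    """Load staged rows into the data-hub; return the number of rows written."""
    if not status.may_start("combiner"):
        return 0  # blocked by the severity logged by the previous module
    if is_enabled:
        data_hub.extend(staged_rows)
    # With is_enabled switched off the pipeline is exercised, but the data-hub stays untouched.
    status.write("loader", "ok")
    return len(staged_rows) if is_enabled else 0
```
]]></preformat>
          <p>Because each module reads only the status of its predecessor from the shared table, a failing quality rule stops the chain without the modules knowing about each other's internals.</p>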
          <p>Log messages will be used to automatically report about any data elements
that failed, to the responsible owner of the source system. It enables quality
improvement of the reused information (not data) on an organisational level.</p>
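          <p>The field and column quality rules of the Transformer, and the reporting of failed data elements to the owner of the source system, can be sketched as follows. The function names, the example column and the owner mapping are hypothetical; in the case itself the rules were generated as part of the SSIS packages.</p>
          <preformat><![CDATA[
```python
from collections import defaultdict


def check_allowed_values(rows, column, allowed):
    """Field quality rule: return the rows whose value in `column` is not allowed."""
    return [row for row in rows if row.get(column) not in allowed]


def check_row_count(rows, minimum, maximum):
    """Column quality rule: True when the number of rows is within the expected range."""
    return minimum <= len(rows) <= maximum


def report_by_owner(failures, owner_of_source):
    """Group log messages about failed data elements per responsible source-system owner."""
    report = defaultdict(list)
    for source, message in failures:
        report[owner_of_source[source]].append(message)
    return dict(report)
```
]]></preformat>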
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>IFW governance architecture</title>
        <p>The second part of the IFW consists of the governance processes and agreements.</p>
        <p>The creation of a pipeline starts with a new target system that needs data. If
the required data is already available in the data-hub, only a target pipeline will
be created. If the required data is not available in the data-hub, the team looks
for an existing source system, not yet connected, to be connected with a new source pipeline.
          <fn id="fn6"><label>6</label><p>See https://en.wikipedia.org/wiki/Chief_privacy_officer</p></fn>
          <fn id="fn7"><label>7</label><p>See https://en.wikipedia.org/wiki/Chief_information_security_officer</p></fn>
          <fn id="fn8"><label>8</label><p>See https://eur-lex.europa.eu/eli/reg/2016/679/oj</p></fn>
        </p>
        <p>
          For each pipeline the Information Manager (IM) signs an internal agreement
with the functional owner of the connected system. This agreement describes the
responsibilities of the stakeholders, the data quality requirements, privacy
limitations and includes a list of the data elements. The data ownership is delegated
from the functional owner of the source system to the IM. The IM signs a
agreement with the functional owners of the target systems (see Fig. 2). Compared
to the classic data-delivery agreements (spanning from source to target), this
decoupling aims to simplify the negotiations. The explicit use of the agreements
aims to increase the security, privacy and data quality awareness.
The following customised interpretation of the Normalized Systems Theory (NST)
[
          <xref ref-type="bibr" rid="ref12 ref14 ref4">12,4,14</xref>
          ] was given to the designers:
        </p>
        <p>The agility and changeability, and hence the maintainability, of a composite
system is determined by the difficulty of changing the entire system. The focus
here is on preventing ripple (combinatorial) effects caused by changing a
sub-system.</p>
        <list list-type="order">
          <list-item>
            <label>1.</label>
            <p>Separation of concerns: a change must not have an impact on more than
one sub-system; this is also called functional loose coupling. Each sub-system
has only one responsibility.</p>
          </list-item>
          <list-item>
            <label>2.</label>
            <p>Data Version Transparency: the version of a data element type is
included in the data element meta-data. Applications that process information
only use the version that they know.</p>
          </list-item>
          <list-item>
            <label>3.</label>
            <p>Action Version Transparency: the version of an action element is
included in the action element. Applications invoke an action element only
with a version that they know.</p>
          </list-item>
          <list-item>
            <label>4.</label>
            <p>Separation of States: after each action in a series, the intermediate result
is saved. Every sub-system is a real black box. Other sub-systems do not
need to know what the inside of another system looks like. They can ask the
sub-system what it does and what information can be delivered or requested.</p>
          </list-item>
        </list>
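        <p>Data Version Transparency as interpreted above can be illustrated with a small sketch. The record layouts (a single name field in version 1, split names in version 2) and all identifiers are hypothetical and are not taken from the case.</p>
        <preformat><![CDATA[
```python
def read_student_v1(payload: dict) -> str:
    """Reader for the hypothetical version 1 layout with a single name field."""
    return payload["name"]


def read_student_v2(payload: dict) -> str:
    """Reader for the hypothetical version 2 layout with split name fields."""
    return f'{payload["first_name"]} {payload["last_name"]}'


class Consumer:
    """An application that processes only the data element versions it knows."""

    def __init__(self, known_readers: dict):
        self.known_readers = known_readers

    def process(self, element: dict):
        # The version travels with the data element in its meta-data.
        reader = self.known_readers.get(element["meta"]["version"])
        if reader is None:
            return None  # unknown version: ignore it instead of failing or guessing
        return reader(element["payload"])
```
]]></preformat>
        <p>A consumer that knows only version 1 keeps working unchanged when version 2 elements start to appear, which is the intended decoupling.</p>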
        <sec id="sec-5-2-1">
          <title>Designers are given this set of architecture principles:</title>
        </sec>
      </sec>
      <sec id="sec-5-3">
        <title>1. Principle: Minimise dependencies</title>
        <p>Rationale: Systems change, so an integration system like the IFW will
keep changing. These changes will be problematic if there are many
dependencies.</p>
        <p>Implications: It can be cumbersome to prevent dependencies. This
investment will pay for itself in the long run.</p>
      </sec>
      <sec id="sec-5-4">
        <title>2. Principle: Business objects are recorded in the Canonical Data Model.</title>
        <p>Rationale: Business objects consist of a considerable number of attributes,
rarely supplied by a single source system. In the Canonical Data Model
(CDM), business objects have an unambiguous meaning and are not related
to the model of a source system.</p>
        <p>Implications:</p>
        <list list-type="alpha-lower">
          <list-item>
            <label>(a)</label>
            <p>Business objects and their attributes are described in the CDM.</p>
          </list-item>
          <list-item>
            <label>(b)</label>
            <p>Attributes of a business object can be provided by multiple source systems.</p>
          </list-item>
          <list-item>
            <label>(c)</label>
            <p>The CDM is based on industry, general and market standards.</p>
          </list-item>
          <list-item>
            <label>(d)</label>
            <p>The CDM contains only structured and machine-readable fields.</p>
          </list-item>
        </list>
      </sec>
      <sec id="sec-5-6">
        <title>3. Principle: Everything has a version.</title>
        <p>Rationale: Explicitly assigning versions and allowing their simultaneous use gives
more flexibility, because systems and links do not have to be changed
simultaneously when another system or link changes.</p>
        <p>Implications:</p>
        <list list-type="alpha-lower">
          <list-item>
            <label>(a)</label>
            <p>All elements, building blocks, steps, tables, views et cetera have a version.</p>
          </list-item>
          <list-item>
            <label>(b)</label>
            <p>Every update of a building block or step with a functional difference
gives a new version. Other building blocks never automatically use the
newer version.</p>
          </list-item>
          <list-item>
            <label>(c)</label>
            <p>Versions exist side by side and are used simultaneously.</p>
          </list-item>
          <list-item>
            <label>(d)</label>
            <p>All links (system calls, web services, FTP folders, references to tables,
etc.) explicitly refer to a version of an implementation.</p>
          </list-item>
          <list-item>
            <label>(e)</label>
            <p>The (version) naming standard is applied and maintained.</p>
          </list-item>
          <list-item>
            <label>(f)</label>
            <p>Implementations of new versions of elements are subject to a formal
change process.</p>
          </list-item>
        </list>
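        <p>The implications of this principle can be sketched as a registry in which versions exist side by side and every link names an explicit version. The class, the view name “gv_student” and the use of integer versions are assumptions for illustration; the naming standard of the case is not reproduced here.</p>
        <preformat><![CDATA[
```python
class VersionedRegistry:
    """Keeps every version of a building block available side by side."""

    def __init__(self):
        self._items = {}

    def register(self, name: str, version: int, implementation) -> None:
        key = (name, version)
        if key in self._items:
            # A functional difference gives a new version; existing versions are never replaced.
            raise ValueError(f"{name} v{version} already exists; publish a new version instead")
        self._items[key] = implementation

    def resolve(self, name: str, version: int):
        """Links always name an explicit version; there is no 'latest' shortcut."""
        return self._items[(name, version)]
```
]]></preformat>
        <p>Leaving out a “latest” lookup is deliberate in this sketch: it makes it impossible for a building block to start using a newer version automatically.</p>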
      </sec>
      <sec id="sec-5-7">
        <title>4. Principle: All prescribed building blocks always exist and are exclusively automated.</title>
        <p>Rationale: Only if all building blocks in the IFW are designed and
realised is there independence from all building blocks in the chain. Manual
processing steps are not allowed because this creates a dependency on an
(interpretation of a) person. The sole purpose of the IFW is integration.</p>
        <p>Implications:</p>
        <list list-type="alpha-lower">
          <list-item>
            <label>(a)</label>
            <p>There is no building block that has presentation functions. This is
reserved for portals and user interfaces.</p>
          </list-item>
          <list-item>
            <label>(b)</label>
            <p>There is no building block that realises business logic and business
processes. This is a task of the target or source systems or a special system
for automated business processes.</p>
          </list-item>
          <list-item>
            <label>(c)</label>
            <p>There is no building block that realises reporting and analysis functions.
This is a task for target systems such as the data warehouse (DWH) or
Management Information System (MIS).</p>
          </list-item>
          <list-item>
            <label>(d)</label>
            <p>Messages that lead to an error situation in the chain or in the target
systems (both technical and functional) must be corrected by an adjustment
in the source.</p>
          </list-item>
          <list-item>
            <label>(e)</label>
            <p>There is no functionality available for manually creating, changing or
deleting messages other than in the source.</p>
          </list-item>
        </list>
      </sec>
      <sec id="sec-5-9">
        <title>Solution realisation</title>
        <p>Based on an explicit requirement of the client, the design is created using
Microsoft components. Hence the adapters are based on the Azure API
Management service and the databases are implemented in Microsoft SQL Server.
These databases are on-premise because they mainly connect to on-premise
systems. The IFW modules are implemented as SSIS packages<sup>9</sup>, which are usually created by
a BI developer. However, the architect and designer chose to use
BIML<sup>10</sup>, an XML-based language that is used to generate the SSIS packages. The
BIML scripts make use of meta-data that is stored in tables in the SQL Server database and maintained
with a Microsoft Access GUI. The quality rules are also generated at design
time. This generation is comparable to a pattern expansion process. At run time,
a copied subset of the meta-data is used as reference tables (validation of allowed
values and replacement of values) and as thresholds for the quality rules.</p>
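        <p>The idea of pattern expansion from meta-data can be sketched as follows. This is not BIML and does not generate SSIS packages; it is a minimal Python illustration in which one meta-data row is expanded into one module definition per step of a source pipeline. The pattern text, the table naming and the meta-data keys are hypothetical.</p>
        <preformat><![CDATA[
```python
from string import Template

# Hypothetical module pattern; the real solution generated SSIS packages from BIML scripts.
MODULE_PATTERN = Template(
    "module $pipeline.$step v$version reads $input_table writes $output_table"
)

# The sequence of modules in a source pipeline as described in the system architecture.
STEPS = ["adapter", "extractor", "transformer", "combiner", "loader"]


def expand_source_pipeline(meta: dict) -> list:
    """Expand one meta-data row into one module definition per pipeline step."""
    modules = []
    previous_table = meta["source"]
    for step in STEPS:
        # Every step stores its result in its own table; only the loader writes to the data-hub.
        output_table = "data_hub" if step == "loader" else f'{meta["pipeline"]}_{step}_result'
        modules.append(MODULE_PATTERN.substitute(
            pipeline=meta["pipeline"],
            step=step,
            version=meta["version"],
            input_table=previous_table,
            output_table=output_table,
        ))
        previous_table = output_table
    return modules
```
]]></preformat>
        <p>Adding a pipeline then amounts to adding a meta-data row and re-running the expansion, rather than hand-editing each module.</p>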
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Reflection</title>
      <p>We collected and grouped the following observations on the realised IFW at this
organisation:
        <fn id="fn9"><label>9</label><p>See https://en.wikipedia.org/wiki/SQL_Server_Integration_Services</p></fn>
        <fn id="fn10"><label>10</label><p>See https://en.wikipedia.org/wiki/Business_Intelligence_Markup_Language</p></fn>
      </p>
      <list list-type="order">
        <list-item>
          <label>1.</label>
          <p>Applied architecture principles in respect to NST</p>
          <list list-type="alpha-lower">
            <list-item>
              <label>(a)</label>
              <p>Principles 1 and 2 translate the requirement to avoid combinatorial effects and the separation of concerns theorem.</p>
            </list-item>
            <list-item>
              <label>(b)</label>
              <p>Principle 3 translates the data and action version transparency theorems.</p>
            </list-item>
            <list-item>
              <label>(c)</label>
              <p>Principle 4 is based on anticipated changes that add functionality. The chosen tactic is to include all possible building blocks from the start, even if they initially do nothing. Adding functionality later is then limited to enabling functions in the empty building blocks, without changing the structure and relations.</p>
            </list-item>
            <list-item>
              <label>(d)</label>
              <p>There is no principle that translates the separation of states theorem. This theorem is covered by the IFW architecture model, which describes the building blocks, the way they handle data locally and the way they handle state using shared meta-data.</p>
            </list-item>
          </list>
        </list-item>
        <list-item>
          <label>2.</label>
          <p>Applied architecture principles in respect to building the solution</p>
          <list list-type="alpha-lower">
            <list-item>
              <label>(a)</label>
              <p>Although the solution designers and developers were given freedom to interpret the principles themselves, they frequently requested guidance from an architect.</p>
            </list-item>
            <list-item>
              <label>(b)</label>
              <p>Instead of the term “preventing combinatorial effects” the team used the term “blocking domino effects”.</p>
            </list-item>
          </list>
        </list-item>
        <list-item>
          <label>3.</label>
          <p>Solution design of module size</p>
          <list list-type="alpha-lower">
            <list-item>
              <label>(a)</label>
              <p>We created expander scripts to generate modules as described in the system structure (see Fig. 1). The size of these atomic elements of the system is too large. This results in some repetition of the sub-functions within a pipeline module and in combinatorial effects on the lowest (SSIS package) level. Improved examples for the smallest elements could be:</p>
              <list list-type="roman-lower">
                <list-item>
                  <label>i.</label>
                  <p>reading the status from the meta-data and deciding if the module is allowed to start</p>
                </list-item>
                <list-item>
                  <label>ii.</label>
                  <p>applying quality rules</p>
                </list-item>
                <list-item>
                  <label>iii.</label>
                  <p>replacing a value</p>
                </list-item>
                <list-item>
                  <label>iv.</label>
                  <p>copying a value (database table to database table)</p>
                </list-item>
                <list-item>
                  <label>v.</label>
                  <p>(re)creating result tables</p>
                </list-item>
                <list-item>
                  <label>vi.</label>
                  <p>writing status or logging</p>
                </list-item>
              </list>
            </list-item>
          </list>
        </list-item>
        <list-item>
          <label>4.</label>
          <p>Applied solution in respect to the data-hub</p>
          <list list-type="alpha-lower">
            <list-item>
              <label>(a)</label>
              <p>A 3NF for the data-hub is cumbersome to design. There have been discussions about applying a different data model form, such as a snowflake schema. A supporting argument would be that this form is easier to generate based on meta-data. This was neither tested nor implemented, as it would impact the design of the then-existing loader modules. This is already a combinatorial effect in the current architecture and design. However, the generic views module as described in the design will act according to the Separation of States theorem, effectively stopping the combinatorial effects from reaching the outgoing pipelines. This separation allows for an incremental update of the data-hub.</p>
            </list-item>
            <list-item>
              <label>(b)</label>
              <p>The architecture describes the data-hub and the 3NF as a Canonical Data Model (CDM). During design we recognised that the generic views of the data-hub are the CDM. The 3NF database design should be seen as part of the black box of the data-hub.</p>
            </list-item>
            <list-item>
              <label>(c)</label>
              <p>The use of meta-data, both at design time and at run time, is not optimal. On the one hand, using the same database tables in development and production is out of the question. On the other hand, making copies introduces duplicates and unwanted complexity.</p>
            </list-item>
          </list>
        </list-item>
        <list-item>
          <label>5.</label>
          <p>Applied solution in respect to adapters</p>
          <list list-type="alpha-lower">
            <list-item>
              <label>(a)</label>
              <p>Recognised combinatorial effects: Adapters highly depend on the technical implementation of the source system. Views are dependent on the data-hub design. During the design phase these effects were recognised. Detailed measures were taken to ensure that the effect is only related to the next module. This prevents further domino effects. The combiner and loader both depend on the (canonical and technical) data model of the data-hub. This cannot be prevented, as the combiner converts the incoming data to the model of the data-hub, and the loader inserts and updates data within the constraints and design of the data-hub data model.</p>
            </list-item>
          </list>
        </list-item>
        <list-item>
          <label>6.</label>
          <p>Governance</p>
          <list list-type="alpha-lower">
            <list-item>
              <label>(a)</label>
              <p>Getting the responsible business owners to commit to the proposed agreements (mandate and usage) turned out to be a hassle. It took a lot of explaining to convince them of the necessity of the agreements. It also took some time to find the right level of detail for the description of the data elements.</p>
            </list-item>
            <list-item>
              <label>(b)</label>
              <p>The ability to separate the creation and deployment of source pipelines from target pipelines is an improvement. In the applied agile way of working, the individual backlog items turned out to be close to a module in a pipeline. Several tasks within such an item could be worked on by different developers. This increased the understanding of the progress of the development.</p>
            </list-item>
          </list>
        </list-item>
      </list>
      <sec id="sec-6-1">
        <title>Validating the observations</title>
        <p>To validate our observations, we interviewed four people who were also directly
involved in the design and development of the system: the product owner, the
development manager, the senior designer/developer, and a developer/sysadmin.</p>
        <p>At the start of each interview we briefly explained Lehman’s law and the
goal of our paper and interview. We presented our observations 2a, 2b, 3a, 4a,
4b, 4c, 5a, 6a, and 6b for validation in the form of “Can you agree that …?”
Observations 1a to 1d were skipped because these explain the mapping between
the NS theorems and the architecture principles and are therefore not suitable for
validation by the interviewees. We added a question to gain insight into the
perception of the evolvability of the system: “7: Now that the system has about
20 pipelines, compared to the beginning when the system had two pipelines.
What is your global impression: Has the effort of making a comparable change
stayed the same?”.</p>
        <p>For each question the answers were limited to the following options: “not
applicable” (for instance when the interviewee was not involved in that topic
and did not know an answer), and “strongly disagree”, “disagree”, “agree”, or
“strongly agree”. By leaving out the option for a neutral answer, we forced the
interviewees to make an explicit choice. All interviews were done by author one
as interviewer, carefully using comparable wording.</p>
        <p>Based on the summed scores, the interviewees confirmed all observations
except 3a and 4a (see table 1).</p>
        <p>Regarding observation 3a, that the modules in the solution design should
have been designed smaller: the product owner and the development manager
disagreed. The senior designer/developer strongly disagreed and even suggested
that they should be larger. The developer agreed.</p>
        <p>With respect to observation 4a, that 3NF is cumbersome to design: the
two managers both strongly agreed, arguing that it took a lot of time,
effort, and thus money. The two developers both disagreed, arguing
that 3NF was not easy but doable and, most of all, necessary to be able to
deliver all possible combinations towards outgoing pipelines.</p>
        <p>All four interviewees agreed on the added question (7), about the effort
of a change in relation to the system size.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>This paper reports on an application of Normalized Systems Theory (NST)
in the context of a data integration solution. A summary is given of our
interpretation of NST, as given by the architect to the designers of the integration
software. This case report is part of the practitioners’ double learning loop.</p>
      <p>In this case, the application of the Normalized Systems Theory helped the
practitioners to improve the adaptability of the data integration solution. In our
view it is definitely better but not perfect.</p>
      <p>As the design team was unfamiliar with NST, the use of the NST theorems
within the context of an architecture model and architecture principles
indicated the need for continuous decision-making support. Using the term “blocking
domino effects” improved the team’s comprehension of its design challenge.</p>
      <p>SSIS packages, part of the Microsoft SQL Server suite, can be generated using
BIML. This case report indicates that this combination can be used as a
Normalized Systems Theory expander. However, the optimal module size remains disputed.</p>
      <p>Based on our validated observations, we can state that there are indications that
this IFW design and its architecture principles, which embody our interpretation of NST,
decouple the impact of a change from the size of the application,
thus mitigating the effects of Lehman’s Law and improving the evolvability of
the created EAI solution.</p>
    </sec>
    <sec id="sec-8">
      <title>Future Research</title>
      <p>Based on the experience at this specific organisation, the IFW architecture was
applied and realised at other Dutch educational organisations, each with
different technologies and various implementation strategies. Future research could
collect these individual use cases and combine them with a post-implementation
questionnaire to compare the perceived (added) value of the IFW architecture
and the NST application.</p>
      <p>Research could be done to find the smallest element to generate using the
BIML scripts.</p>
      <p>Research could be done to automate the generation of the Technical Data
Model of the data-hub. This would save a significant amount of time and effort.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgements</title>
      <p>We would like to thank the involved Dutch educational organisations for
supporting the collaborative open discussion on the topic of enterprise application
integration, and for their willingness to exchange knowledge between their organisations.
We would also like to thank our (former) colleagues at Sogeti for reflecting on
our concept architecture descriptions and adding their views and experiences.</p>
      <p>We see the creation of IFW as an example of combining academic research,
the operational application of this knowledge in production and sharing the
experience back to the academic field.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Chongsombut</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verelst</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Bruyn</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannaert</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huysmans</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Towards applying normalized systems theory to create evolvable enterprise resource planning software: a case study</article-title>
          .
          <source>In: The Eleventh International Conference on Software Engineering Advances (ICSEA)</source>
          . vol.
          <volume>11</volume>
          , pp.
          <fpage>172</fpage>
          -
          <lpage>177</lpage>
          . IARIA, Rome, Italy (
          <year>2016</year>
          ), https://hdl.handle.net/10067/1352400151162165141
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Dayal</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hwang</surname>
            ,
            <given-names>H.Y.</given-names>
          </string-name>
          :
          <article-title>View definition and generalization for database integration in a multidatabase system</article-title>
          .
          <source>IEEE Transactions on Software Engineering</source>
          SE-
          <volume>10</volume>
          (
          <issue>6</issue>
          ),
          <fpage>628</fpage>
          -
          <lpage>645</lpage>
          (
          <year>1984</year>
          ), https://doi.org/10.1109/TSE.1984.5010292
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Erasala</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yen</surname>
            ,
            <given-names>D.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rajkumar</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Enterprise application integration in the electronic commerce world</article-title>
          .
          <source>Computer Standards &amp; Interfaces</source>
          <volume>25</volume>
          (
          <issue>2</issue>
          ),
          <fpage>69</fpage>
          -
          <lpage>82</lpage>
          (
          <year>2003</year>
          ), https://doi.org/10.1016/S0920-5489(02)00106-X
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Huysmans</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oorts</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Bruyn</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannaert</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verelst</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Positioning the normalized systems theory in a design theory framework</article-title>
          .
          <source>In: International Symposium on Business Modeling and Software Design</source>
          . pp.
          <fpage>43</fpage>
          -
          <lpage>63</lpage>
          . Springer (
          <year>2012</year>
          ), https://doi.org/10.1007/978-3-642-37478-4_3
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Huysmans</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verelst</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannaert</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oost</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Integrating information systems using normalized systems theory: Four case studies</article-title>
          .
          <source>In: 2015 IEEE 17th Conference on Business Informatics</source>
          . vol.
          <volume>1</volume>
          , pp.
          <fpage>173</fpage>
          -
          <lpage>180</lpage>
          . IEEE (
          <year>2015</year>
          ), https://doi.org/10.1109/CBI.2015.43
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Inmon</surname>
            ,
            <given-names>W.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Linstedt</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>3.5 - the operational data store</article-title>
          . In:
          <string-name>
            <surname>Inmon</surname>
            ,
            <given-names>W.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Linstedt</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (eds.)
          <source>Data Architecture: a Primer for the Data Scientist</source>
          , pp.
          <fpage>121</fpage>
          -
          <lpage>126</lpage>
          . Morgan Kaufmann, Boston, USA (
          <year>2015</year>
          ), https://doi.org/10.1016/B978-0-12-802044-9.00019-2
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Irani</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Themistocleous</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Love</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          :
          <article-title>The impact of enterprise application integration on information system lifecycles</article-title>
          .
          <source>Information &amp; Management</source>
          <volume>41</volume>
          (
          <issue>2</issue>
          ),
          <fpage>177</fpage>
          -
          <lpage>187</lpage>
          (
          <year>2003</year>
          ), https://doi.org/10.1016/S0378-7206(03)00046-6
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Ku</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marlowe</surname>
            ,
            <given-names>T.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Budanskaya</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Software engineering design patterns for relational databases</article-title>
          .
          <source>In: Proc. International Conference on Software Engineering Research and Practice</source>
          . vol. II, pp.
          <fpage>340</fpage>
          -
          <lpage>346</lpage>
          . SERP '07, CSREA Press, Las Vegas, Nevada, USA (
          <year>2007</year>
          ), https://www.researchgate.net/publication/221611033
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Lehman</surname>
            ,
            <given-names>M.M.</given-names>
          </string-name>
          :
          <article-title>Programs, life cycles, and laws of software evolution</article-title>
          .
          <source>Proceedings of the IEEE</source>
          <volume>68</volume>
          (
          <issue>9</issue>
          ),
          <fpage>1060</fpage>
          -
          <lpage>1076</lpage>
          (
          <year>1980</year>
          ), https://doi.org/10.1109/PROC.1981.12005
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Linthicum</surname>
            ,
            <given-names>D.S.</given-names>
          </string-name>
          :
          <article-title>Enterprise application integration</article-title>
          .
          <source>Addison-Wesley information technology series</source>
          ,
          <publisher-name>Addison-Wesley Professional</publisher-name>
          , Boston, USA (
          <year>2000</year>
          ), http://www.worldcat.org/oclc/890643434
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mannaert</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verelst</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Normalized systems: re-creating information technology based on laws for software evolvability</article-title>
          .
          <source>Koppa</source>
          , Kermt, Hasselt, Belgium (
          <year>2009</year>
          ), http://www.worldcat.org/oclc/1073467550
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Mannaert</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verelst</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Bruyn</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Normalized Systems Theory: From Foundations for Evolvable Software Toward a General Theory for Evolvable Design</article-title>
          .
          <publisher-name>NSI-Press powered by Koppa</publisher-name>
          , Kermt, Hasselt, Belgium (
          <year>2016</year>
          ), http://www.worldcat.org/oclc/1050060943
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Mannaert</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verelst</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ven</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Towards evolvable software architectures based on systems theoretic stability</article-title>
          .
          <source>Software: Practice and Experience</source>
          <volume>42</volume>
          (
          <issue>1</issue>
          ),
          <fpage>89</fpage>
          -
          <lpage>116</lpage>
          (
          <year>2011</year>
          ), https://doi.org/10.1002/spe.1051
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Mannaert</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verelst</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ven</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>The transformation of requirements into software primitives: Studying evolvability based on systems theoretic stability</article-title>
          .
          <source>Science of Computer Programming</source>
          <volume>76</volume>
          (
          <issue>12</issue>
          ),
          <fpage>1210</fpage>
          -
          <lpage>1222</lpage>
          (
          <year>2011</year>
          ), https://doi.org/10.1016/j.scico.2010.11.009, special issue on Software Evolution, Adaptability and Variability
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Oorts</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ahmadpour</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannaert</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verelst</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oost</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Easily evolving software using normalized system theory-a case study</article-title>
          .
          <source>In: ICSEA 2014 : The Ninth International Conference on Software Engineering Advances</source>
          . pp.
          <fpage>322</fpage>
          -
          <lpage>327</lpage>
          (
          <year>2014</year>
          ), https://www.researchgate.net/publication/283089164
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Oorts</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huysmans</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Bruyn</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannaert</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verelst</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oost</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Building evolvable software using normalized systems theory: A case study</article-title>
          .
          <source>In: 2014 47th Hawaii International Conference on System Sciences</source>
          . pp.
          <fpage>4760</fpage>
          -
          <lpage>4769</lpage>
          . IEEE (
          <year>2014</year>
          ), https://doi.org/10.1109/HICSS.2014.585
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Papelard</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bobbert</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berlijn</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Critical Success Factors for Effective Business Information Security</article-title>
          . Uitgeverij Dialoog, Zaltbommel, The Netherlands
          (
          <year>2018</year>
          ), http://www.worldcat.org/oclc/1088899915
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Saltor</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castellanos</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García-Solaco</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Suitability of datamodels as canonical models for federated databases</article-title>
          .
          <source>ACM Sigmod Record</source>
          <volume>20</volume>
          (
          <issue>4</issue>
          ),
          <fpage>44</fpage>
          -
          <lpage>48</lpage>
          (
          <year>1991</year>
          ), https://doi.org/10.1145/141356.141377
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>