<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Addressing Requirements to Identification Posed by the Digital Transformation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rustam Mehmandarov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dag Hovland</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Torleif Saltvedt</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arild Waaler</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bouvet ASA</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Computas AS</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Equinor ASA</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>SIRIUS Centre, Department of Informatics, University of Oslo</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Creating and maintaining a machine-readable mapping of the relationships between the various ways of identifying industrial assets across IT applications, domains, and actors is challenging in large-scale industrial systems. This challenge is usually addressed through the manual labor of subject matter experts, who create the mappings by hand. Automated solutions to this challenge remain under-investigated. To this end, this paper proposes a classification of the identifiers needed for identifying assets on an industrial scale and an approach to digital transformation that addresses the problem by building upon the model-based approach that has been gaining popularity in recent years. We illustrate our approach with a real industrial example at Equinor, the largest, state-owned Norwegian energy company.</p>
      </abstract>
      <kwd-group>
        <kwd>Asset Management</kwd>
        <kwd>Identifier</kwd>
        <kwd>Asset Identification</kwd>
        <kwd>Data Integration</kwd>
        <kwd>Software Interoperability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Today’s working practice is often based on the sequential processing of data by subject matter experts (SMEs) and then
by data engineers, leading to bottlenecks and unnecessary delays in the process.</p>
      <p>The Idea. To address this challenge, we aim to automatically generate the mappings between the
identifiers and the objects by inferring the relationships through model-based integration
instead of the current document-based data exchange. We introduce a model-based approach
that will serve SMEs and data engineers working with data pipelines and integrations, including
the creation of automated machine-to-machine data integrations.</p>
      <p>The Vision. The industry’s move towards model-based documentation yields many
improvements to the current document-based approach. The model-based approach combined with the
suggested models will simplify the data exchange process and improve support for management
of change (MoC).</p>
      <p>Both of these tasks are challenging and resource-intensive in the current work practice, as
the data exposed about an asset varies significantly with the context, the task at hand,
and the role of the system actor. This differentiation is challenging because various user groups and
applications generally need to map different identification methods for the same asset
and to represent the asset in different ways depending on the context and
the task at hand.</p>
      <p>This approach will make it possible to process data on different levels in parallel instead of
the more sequential processing we see today. The proposed solution will help optimize these
processes by considering the various needs of user groups and automated machine-to-machine
interfaces, consolidating the information about an asset across multiple domains and applications,
and improving software interoperability and human collaboration.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The Current State of Affairs</title>
      <p>
        Today’s document-centered practice requires a lot of resources and manual work to consolidate
and exchange information across the value chain and life cycles of engineering projects and
assets. We are seeing a shift towards model-based approaches that can support subject matter
experts, digital twins, and automated data exchange through APIs, data mesh [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ], data
pipelines, or similar.
      </p>
      <p>However, creating models across multiple domains, applications, and value chains raises the
need for a universal way to refer to an object represented by various distinct identificators. It
also raises the question of how these object references should be managed and mapped.</p>
      <p>
        Despite being a common problem in industrial settings, data integration and mapping has
yet to be widely researched. It has commonly been solved using manual labor and
proprietary solutions [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. This work is typically done by SMEs working closely with IT and
data integration experts, such as data engineers and data integrators.
      </p>
      <p>Furthermore, this complex problem can be divided into several challenges. To understand
those challenges and opportunities better, we must look closely at the actors and how each actor
identifies the assets and the relationships between them. Change management for the identificators
in use is yet another challenge for the industry that we shed light on in this paper.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Approach</title>
      <p>We begin the description of our approach by introducing the classifications that are essential to
understanding and solving the challenge at hand.</p>
      <sec id="sec-3-1">
        <title>3.1. Classification of Identificator Systems</title>
        <p>In engineering systems, we often see various ways of identifying assets. We first
distinguish two groups of identification systems based on their uniqueness outside
their context and the amount of information they carry. We then separate
identificators into categories based on their lifecycle management. Both categorisations are
based on the identificators typically used in engineering systems.</p>
        <p>We start by dividing identificators into two distinct categories based on their
usage:</p>
        <list list-type="bullet">
          <list-item><p>Descriptors: often context-dependent identificators (i.e., unique only within a specific
context), bearing encoded information about the asset and the breakdown structures that
SMEs use. They are often presented in a human-readable format.</p></list-item>
          <list-item><p>Identifiers: identificators used to identify data entries uniquely, often not intended to
be human-readable or to convey meaningful information.</p></list-item>
        </list>
        <p>While descriptors are easy for experts and engineers to use within their domains, they
carry a lot of information from that domain and often depend on the context to be
unique. In other words, a descriptor is of little use outside its context, where it cannot
uniquely identify an asset. We therefore need identifiers to supply that identification. On the
other hand, even though identifiers are more likely to be unique outside their context, they
will typically be less human-readable. Descriptors and identifiers thus have specific
usages in engineering systems and in the underlying IT applications, respectively, and cannot be
interchanged.</p>
        <p>Furthermore, we need to look at the classification of identificators from the lifecycle
management perspective. This classification is fundamental when looking at change management for the
data and assets. We usually see this in engineering systems where data has to be exchanged and
updated across the value chain, domains, applications, or similar. The four main categories are:</p>
        <list list-type="bullet">
          <list-item><p>Self-managed identificators can be managed by one entity without synchronizing
the identificator generation or naming with other entities. These are typically internal
identificators used within a specific context.</p></list-item>
          <list-item><p>Co-managed identificators have to be synchronized across multiple entities and thus
cannot be changed easily without coordination and clarification with the other
parties involved.</p></list-item>
          <list-item><p>Unmanaged identificators can be generated in a distributed manner and do not
need any management by any party beyond agreement on the generation algorithm, e.g.,
UUIDs and GUIDs.</p></list-item>
          <list-item><p>Centrally managed identificators: their use and assignment have to be managed and
coordinated by one specific body, e.g., TAG numbers in the current engineering practice.</p></list-item>
        </list>
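        <p>The two classifications above can be captured in a small data model. The following is a minimal sketch rather than part of the proposed system: the names Usage, Management, and Identificator, as well as the example TAG value, are assumptions made for illustration only.</p>
        <preformat>
```python
import uuid
from dataclasses import dataclass
from enum import Enum


class Usage(Enum):
    DESCRIPTOR = "descriptor"   # context-dependent, human-readable
    IDENTIFIER = "identifier"   # unique, not meant to carry meaning


class Management(Enum):
    SELF_MANAGED = "self-managed"
    CO_MANAGED = "co-managed"
    UNMANAGED = "unmanaged"
    CENTRALLY_MANAGED = "centrally managed"


@dataclass(frozen=True)
class Identificator:
    """Hypothetical record combining both classifications."""
    value: str
    usage: Usage
    management: Management


def new_unmanaged_identifier() -> Identificator:
    # A random UUID needs no coordination beyond agreeing on the algorithm.
    return Identificator(str(uuid.uuid4()), Usage.IDENTIFIER, Management.UNMANAGED)


# Illustrative values only; the TAG number below is made up.
tag = Identificator("20-PA-001A", Usage.DESCRIPTOR, Management.CENTRALLY_MANAGED)
generated = new_unmanaged_identifier()
```
        </preformat>
        <p>Keeping usage and lifecycle management as two independent fields reflects that the classifications are orthogonal: a descriptor may be centrally managed or self-managed, and the same holds for an identifier.</p>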
        <p>
          The introduced classification is also in line with the Industrie 4.0 view on identifier
management [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] and integrates well with the Reference Designation System (RDS) codes defined in
ISO/IEC 81346-1 [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>So far, we have introduced two ways of classifying the identificators based on their usage
and lifecycle. However, we still need to address the challenge of being able to map various
kinds of identificators across multiple applications and domains. We also need to make the data
mapping and exchange process less sequential so that the various system actors can work on
their parts without creating bottlenecks for each other.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Classification of System Actors</title>
        <p>To help address the issues mentioned above, we introduce three types of actors in
expert systems involved in building and maintaining assets in engineering:</p>
        <list list-type="bullet">
          <list-item><p>Subject matter experts (SMEs)</p></list-item>
          <list-item><p>Data engineers / IT experts</p></list-item>
          <list-item><p>Digital multi-discipline experts.</p></list-item>
        </list>
        <p>If we combine this classification with the identificator classification described above, we get
a much clearer picture with the separate areas of responsibilities shown in Fig. 1.</p>
        <p>Seen from the perspective of a subject matter expert, information about assets resides in a set
of systems varying in the domain (e.g., electrical engineering, mechanical engineering), project
execution stages (i.e., project lifecycle stages), and value chain (e.g., supplier, contractor). SMEs
often relate to project-specific descriptors (e.g., TAG numbers, other engineering numbering
systems) and use that information to identify objects. The same identifier can be traced throughout
various systems, domains, and disciplines that may also have their own descriptors. Descriptors
are also made to be human-readable and often contain encoded breakdown structure information.
Such descriptors are used across various systems and diagrams, such as Piping and
Instrumentation Diagrams (P&amp;ID), Master Equipment Lists (MEL), or process flow diagrams (PFD).</p>
        <p>The data of the engineering applications used by SMEs is stored in databases, which are often
tailored to these applications, and in other ancillary IT systems. Data import and export usually
happen through application programming interfaces (APIs) or specialized data pipelines.
That data typically uses other types of identification,
which we refer to as identifiers. The data engineers and IT experts who create and
maintain the IT applications, databases, data integrations, and data pipelines usually use
identifiers as their primary method of object identification. These identifiers are often technical,
designed to be unique within their context, and less focused on human readability. Typical examples
include GUIDs, UUIDs, auto-incremented numbers, or serial numbers for the equipment.</p>
        <p>The challenge is that the mapping between descriptors and identifiers across applications
needs to be explicit. In addition, SMEs and data engineers often stick to only one of the
two categories of identificators, descriptors or identifiers. Both types are specialized for the
workflows in which they are used and are therefore not optimal for use in other workflows.
These challenges underline the importance of supporting both types of identification.</p>
        <p>In addition to the two user groups already mentioned, we introduce the
digital multi-discipline experts, who work at the intersection of one or several domains and IT,
map the information, and help the two other user groups to move forward. This group
often needs a good understanding of both the domain knowledge and the IT applications, as
well as of the types and mapping of the identificators.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Requirements</title>
        <p>To summarize the needs described above, we need a solution that
solves the following challenge:</p>
        <p>Create tailored data "views" that show relevant data to a specific actor. Each view is
determined by the task and by the stage of the project lifecycle at which the task is performed,
and it consolidates all the available data from the various applications.</p>
        <p>Furthermore, we have identified the following requirements to address the challenge:</p>
        <list list-type="order">
          <list-item><p>Data about an asset should be presented based on the actor’s context, need, and the specific
task the actor needs to perform. The tasks are defined by factors like domain, project and
product lifecycles, and value chain.</p></list-item>
          <list-item><p>There should be non-manual ways of mapping descriptors and identifiers across domains,
applications, and value chains to facilitate seamless data integration.</p></list-item>
        </list>
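        <p>The first requirement can be illustrated with a minimal sketch of a tailored view. It is not the proposed implementation: the DataItem structure, the build_view function, the role and stage names, and all example values are assumptions chosen for illustration.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    asset: str          # asset the item describes
    name: str           # e.g. "design pressure"
    value: str
    source: str         # application the item comes from
    roles: frozenset    # actor roles the item is relevant to
    stages: frozenset   # lifecycle stages in which it is relevant


def build_view(items, role, stage):
    """Consolidate the items relevant to one role at one lifecycle stage."""
    view = {}
    for item in items:
        if role in item.roles and stage in item.stages:
            view.setdefault(item.asset, {})[item.name] = (item.value, item.source)
    return view


# Made-up data from two hypothetical applications.
items = [
    DataItem("pump-1", "tag", "20-PA-001A", "engineering-db",
             frozenset({"sme"}), frozenset({"design", "operation"})),
    DataItem("pump-1", "row_id", "4711", "maintenance-db",
             frozenset({"data-engineer"}), frozenset({"operation"})),
    DataItem("pump-1", "design pressure", "16 bar", "engineering-db",
             frozenset({"sme"}), frozenset({"design"})),
]

sme_design_view = build_view(items, role="sme", stage="design")
```
        </preformat>
        <p>In this sketch, the view for an SME in the design stage contains the TAG descriptor and the design pressure but not the database row identifier, which is only relevant to the data engineer.</p>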
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Digital Transformation</title>
        <p>
          The proposed approach will build further on the model-oriented way of working through
better digitalization initiatives, like Industrie 4.0, Asset Administration Shell [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and digital
twins [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], to name a few. The idea is to facilitate the creation of various "views" for the
system’s actors based on all the available information and aspects mentioned above. The
variation and complexity of the identification methods that need to be mapped are illustrated
as a multidimensional plane in Fig. 2. Note that this figure simplifies the
real-world situation, where more than three dimensions are needed to map between different
identification models. Identifiers for the same object will also not necessarily map one-to-one
between identification models; they will often have one-to-many mappings,
which the new approach to identificator mapping needs to handle. Such identification
mapping and support for multiple dimensions are an important step towards supporting the Industrie
4.0 approach, specifically the Reference Architecture Model for Industrie 4.0 (RAMI 4.0) [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
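        <p>A mapping that is not restricted to one-to-one relations can be sketched as a pair of indexes, one per direction. This is a minimal illustration under our own assumptions: the class IdentificatorMap, its methods, and the example values are hypothetical and do not describe an existing system.</p>
        <preformat>
```python
from collections import defaultdict


class IdentificatorMap:
    """Bidirectional map where one descriptor may map to many identifiers
    and one identifier may map to many descriptors."""

    def __init__(self):
        self._by_descriptor = defaultdict(set)
        self._by_identifier = defaultdict(set)

    def link(self, descriptor, identifier):
        # Record the relation in both directions.
        self._by_descriptor[descriptor].add(identifier)
        self._by_identifier[identifier].add(descriptor)

    def identifiers_for(self, descriptor):
        # Return a copy so callers cannot modify the index.
        return set(self._by_descriptor.get(descriptor, ()))

    def descriptors_for(self, identifier):
        return set(self._by_identifier.get(identifier, ()))


# Made-up values: one TAG descriptor known under two technical identifiers.
mapping = IdentificatorMap()
mapping.link("20-PA-001A", "engineering-db:4711")
mapping.link("20-PA-001A", "maintenance-db:0f3c")
```
        </preformat>
        <p>Looking up the descriptor returns both identifiers, while looking up either identifier returns the single descriptor; an unknown key yields an empty set rather than an error.</p>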
      </sec>
      <sec id="sec-3-5">
        <title>3.5. The Proposed Systematic Model</title>
        <p>To satisfy the needs described above, we propose classifying the user groups into the three
categories introduced in Section 3.2 and introducing models that capture the following systematic knowledge:</p>
        <list list-type="order">
          <list-item><p>Project execution steps (facility lifecycle): information about the steps in the
lifecycle of a facility;</p></list-item>
          <list-item><p>Asset representation according to the aspect<fn id="fn1"><label>1</label><p>The aspect is defined by ISO/IEC 81346-1:2022 [<xref ref-type="bibr" rid="ref10 ref6">6, 10</xref>] as views to sort out and monitor the technical information of
objects. Part 81346-1 defines four fundamental aspects that can be used in the ISO/IEC 81346 standard series
and can be extended with other aspects.</p></fn>, including the domain and asset lifecycle;</p></list-item>
          <list-item><p>Software applications including the data they represent and store: information
about what data is stored where and how it can be retrieved and updated;</p></list-item>
          <list-item><p>Relationships and mappings of identifier and descriptor types: information about
how the descriptors and identifiers are composed for each type, which makes it possible to create
mappings.</p></list-item>
        </list>
        <p>
          As we can see from Fig. 3, the context plays a central role in all those models. Context can
vary from model to model but is typically something that helps uniquely identify the
information elements queried from the models. For instance, for a software applications model,
the context can be information about which application is requesting the information about a specific
identifier or the current facility lifecycle stage.
        </p>
        <p>Context is meant to add the information needed for a query or identificator to uniquely identify
an object within a model’s scope. We need this information since identificators are often only
unique within a specific scope. If we refer to those identificators outside their scope, we need to
add more information to ensure their uniqueness within the larger scope.</p>
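        <p>The role of context can be illustrated with a minimal sketch in which an identificator is qualified by the scope it belongs to. The class ScopedIdentificator, the choice of a facility and a project as context, and the example values are assumptions made for illustration only.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedIdentificator:
    context: tuple      # scope, outermost first, e.g. (facility, project)
    local_value: str    # unique only within that scope

    def qualified(self):
        """Join context and local value into a key for the larger scope."""
        return "/".join((*self.context, self.local_value))


# The same made-up TAG number used in two different scopes.
a = ScopedIdentificator(("facility-A", "project-1"), "20-PA-001A")
b = ScopedIdentificator(("facility-B", "project-7"), "20-PA-001A")
```
        </preformat>
        <p>The two values share the same local descriptor but compare as different objects, and their qualified forms stay distinct in the larger scope.</p>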
        <p>From Fig. 3, we can also see that the different models will have different kinds of
predominant identificators used to refer to the object: some models will use various types
of descriptors, and some will use multiple types of identifiers. To extract the
necessary data for a particular actor at a specific stage in the project lifecycle, we have
to be able to query all those models. This further underlines the need for a model that maps the
identificators in use.</p>
        <p>
          Parts of the technical implementation of such a system have been outlined in
earlier work by Mehmandarov et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>This paper introduced a classification of identifiers, their types, and actors in the industrial
context. Furthermore, we have defined a need and the requirements for identifying industrial
assets posed by digital transformation.</p>
      <p>We propose to address this problem by introducing several models in a machine-readable
way. This approach will help us to automate the data integration between various endpoints
and domains, as opposed to the time- and resource-consuming manual processes that exist in
today’s practice. This approach aligns well with the industry’s current move to a model-driven
approach and data exchange and distribution trends, such as data mesh architecture.</p>
      <p>We are implementing our approach using an example of Equinor and other prominent actors
in the Norwegian energy sector. In the future, we plan to validate our approach with more data
and industrial users and implement it in an industrial evaluation environment. We also aim to
develop a semantic theory of the approach in formal semantics and reasoning.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>The work was partially supported by the SIRIUS Centre, Norwegian Research Council project
number 237898.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dehghani</surname>
          </string-name>
          ,
          <source>Data mesh principles and logical architecture</source>
          ,
          <year>2020</year>
          . URL: https://martinfowler.com/articles/data-mesh-principles.html.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dehghani</surname>
          </string-name>
          ,
          <source>Data Mesh</source>
          , O'Reilly,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Soylu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kharlamov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zheleznyakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Jimenez-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Giese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Skjaeveland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hovland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Schlatte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brandt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lie</surname>
          </string-name>
          , et al.,
          <article-title>OptiqueVQS: A visual query system over ontologies for industry</article-title>
          ,
          <source>Semantic Web</source>
          <volume>9</volume>
          (
          <year>2018</year>
          )
          <fpage>627</fpage>
          -
          <lpage>660</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Fillinger</surname>
          </string-name>
          , E. Esche, G. Tolksdorf,
          <string-name>
            <given-names>W.</given-names>
            <surname>Welscher</surname>
          </string-name>
          , G. Wozny, J.-U. Repke,
          <article-title>Data exchange for process engineering - challenges and opportunities</article-title>
          ,
          <source>Chemie Ingenieur Technik</source>
          <volume>91</volume>
          (
          <year>2019</year>
          )
          <fpage>256</fpage>
          -
          <lpage>267</lpage>
          . URL: https://onlinelibrary.wiley.com/doi/abs/10.1002/cite.201800122. doi:https://doi.org/10.1002/cite.201800122.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          Plattform Industrie 4.0,
          <source>AAS Reference Modelling</source>
          ,
          <year>2021</year>
          . URL: https://www.plattform-i40.de/IP/Redaktion/EN/Downloads/Publikation/AAS_Reference_Modelling.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] ISO,
          <source>ISO/IEC 81346-1 Industrial systems, installations and equipment and industrial products - Structuring principles and reference designations - Part 1: Basic rules</source>
          ,
          <year>2022</year>
          . URL: https://www.iso.org/standard/82229.html.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E.</given-names>
            <surname>Tantik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Anderl</surname>
          </string-name>
          ,
          <article-title>Integrated data model and structure for the asset administration shell in industrie 4.0</article-title>
          ,
          <source>Procedia CIRP</source>
          <volume>60</volume>
          (
          <year>2017</year>
          )
          <fpage>86</fpage>
          -
          <lpage>91</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Kharlamov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Martin-Recuerda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Perry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cameron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fjellheim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Waaler</surname>
          </string-name>
          ,
          <article-title>Towards semantically enhanced digital twins</article-title>
          ,
          <source>in: 2018 IEEE International Conference on Big Data (Big Data)</source>
          , IEEE,
          <year>2018</year>
          , pp.
          <fpage>4189</fpage>
          -
          <lpage>4193</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Schweichhart</surname>
          </string-name>
          ,
          <source>Reference Architectural Model Industrie 4.0 (RAMI 4.0)</source>
          ,
          <year>2017</year>
          . URL: https://ec.europa.eu/futurium/en/system/files/ged/a2-schweichhart-reference_architectural_model_industrie_4.0_rami_4.0.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          ISO,
          <source>The RDS 81346 Standard Series</source>
          ,
          <year>2022</year>
          . URL: https://www.81346.com/81346-1.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.</given-names>
            <surname>Mehmandarov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Waaler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cameron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fjellheim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Pettersen</surname>
          </string-name>
          ,
          <article-title>A semantic approach to identifier management in engineering systems</article-title>
          ,
          <source>in: 2021 IEEE International Conference on Big Data (Big Data)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>4613</fpage>
          -
          <lpage>4616</lpage>
          . doi:https://doi.org/10.1109/BigData52589.2021.9671515.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>