<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Hangzhou, China. dag.hovland@bouvet.no (D. Hovland); fredrik.chrislock@bouvet.no (F. Chrislock)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dag Hovland</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fredrik Chrislock</string-name>
        </contrib>
        <aff>Bouvet Norway, Bergen, Norway</aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>In our work at Equinor, an oil and gas operator company, we are involved in the exchange of information between contractors and operator during design and planning of new facilities, or capital projects. We are part of a larger effort to move away from transferring documents and instead transfer smaller pieces of information. The goal is that information about the facility becomes less disconnected and that more of it can be processed by machines. A further goal is that the interaction between operator and contractor becomes faster and more fine-grained: smaller pieces of information can be transferred, not only complete documents.</p>
        <p>RDF is a strong candidate for the transfer and storage of this information. However, it does not fulfill our needs for maintaining the same dataset in different organizations. Specifically, parts of the model, in different stages of development, will be exchanged between organizations (e.g. operator and contractor) and between units. The parts will be extended, changed and commented on simultaneously in these different organizations. It is not an option to maintain a single RDF dataset that is directly edited by all partners. There are two reasons for this: i) the contractor and operator do not wish to share their complete graph with each other, and ii) it must be possible to maintain multiple, differing versions of the same model/graph in order to explore different design choices. A consequence of this requirement is that decorating properties with identifiers of the dataset version, as in [3], is not sufficient.</p>
        <p>In relational data warehousing this type of data is called slowly changing dimensions [2], and there are 7 named patterns for maintaining them. These patterns can also be implemented in RDF, but since we are no longer restricted to fixed, tabular formats, some of them are probably not useful in RDF.</p>
        <p>We have considered the following options for transferring the model as RDF between contractor and operator:</p>
        <list list-type="order">
          <list-item><p>Send the whole RDF graph every time</p></list-item>
          <list-item><p>Send instructions about what triples to delete and what to insert</p></list-item>
          <list-item><p>Send the RDF graph for each object that has been changed</p></list-item>
        </list>
        <p>Before explaining what we mean by objects, let us explain why the first two options do not cover our needs. Transferring the whole RDF graph complicates concurrent changes to the graph and would require a strong locking mechanism on the RDF graph at the operator, preventing changes to the parts of the graph that the contractor is currently working on. This requirement is too strict.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        The second option is not acceptable to the users: the engineers want to make sure that
whoever makes a change to an object in the model is aware of the state of the object they are
changing. This is not possible in a distributed setting if single triples can be inserted or removed
independently. It also means that Auer and Herre’s atomic changes [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] do not solve our
problem, although they could supplement the solution by being used to describe the changes
to objects.
      </p>
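      <p>The problem with independent triple-level changes can be illustrated with a minimal sketch. It assumes triples are plain Python tuples; the identifiers (ex:pump1, ex:ratedPower, ex:motor) and the helper apply_patch are hypothetical and not taken from our models or implementation. Two engineers each prepare a patch against the same base state, and applying both patches produces a state that neither of them has seen.</p>
      <preformat preformat-type="code">
```python
# Illustrative only: triples are (subject, predicate, object) tuples and
# every identifier below is a hypothetical example name.
base = frozenset({
    ("ex:pump1", "ex:ratedPower", "50 kW"),
    ("ex:pump1", "ex:motor", "ex:motorA"),
})


def apply_patch(graph, delete, insert):
    """Apply a triple-level patch: remove `delete`, then add `insert`."""
    return (set(graph) - set(delete)) | set(insert)


# Engineer A, looking at `base`, raises the rated power.
patch_a = dict(
    delete={("ex:pump1", "ex:ratedPower", "50 kW")},
    insert={("ex:pump1", "ex:ratedPower", "75 kW")},
)
# Engineer B, also looking at `base`, swaps the motor.
patch_b = dict(
    delete={("ex:pump1", "ex:motor", "ex:motorA")},
    insert={("ex:pump1", "ex:motor", "ex:motorB")},
)

seen_by_a = apply_patch(base, **patch_a)
seen_by_b = apply_patch(base, **patch_b)

# Both patches apply cleanly one after the other ...
merged = apply_patch(seen_by_a, **patch_b)

# ... but the result is a state that no engineer has looked at.
reviewed_states = [set(base), seen_by_a, seen_by_b]
print(merged in reviewed_states)
```
      </preformat>
      <p>In this sketch the merged pump has the new power rating and the new motor, a combination that was never reviewed as a whole.</p>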
      <p>This leaves us with our suggested solution, which is to agree upon a fragmentation of the data
into objects. Every triple in the RDF graph is a member of exactly one such object, and the
collaborators must agree on rules about the sizes of these objects. It must further be
possible to easily distinguish between different versions of an object, and to trace the provenance history
of each specific version. Note that a version of an object is an RDF graph. These requirements
appear intuitive to the engineers and have led us to the following design decisions:
The representation of versions of objects as separate entities with their own IRIs is necessary to
be able to support the transfer of changes of data between organizations. When a system receives
an object, it can compare it with the previously stored version of the object and, importantly, see
both which triples should be removed and which should be added. Our approach has no impact
on how data is read from the RDF graph, only on how data is written into it.</p>
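      <p>The comparison of a received version with the stored one can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the class ObjectVersion, its fields, the function diff_versions and all IRIs are hypothetical, triples are plain tuples, and the rule that a received version must name the stored version as its predecessor is our assumption for the example.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class ObjectVersion:
    """One immutable version of an object; all field names are hypothetical."""
    version_iri: str              # IRI identifying this specific version
    object_iri: str               # IRI of the object the version belongs to
    triples: FrozenSet[Triple]    # the RDF graph of this version
    previous: Optional[str] = None  # IRI of the version this one replaces


def diff_versions(stored, received):
    """Return (removed, added) triples when `received` replaces `stored`.

    Assumption for this sketch: a received version is only accepted if it
    states that it was derived from the currently stored version.
    """
    if received.object_iri != stored.object_iri:
        raise ValueError("versions belong to different objects")
    if received.previous != stored.version_iri:
        raise ValueError("received version is not based on the stored version")
    removed = stored.triples - received.triples
    added = received.triples - stored.triples
    return removed, added


v1 = ObjectVersion(
    version_iri="ex:pump1/version/1",
    object_iri="ex:pump1",
    triples=frozenset({
        ("ex:pump1", "ex:ratedPower", "50 kW"),
        ("ex:pump1", "ex:motor", "ex:motorA"),
    }),
)
v2 = ObjectVersion(
    version_iri="ex:pump1/version/2",
    object_iri="ex:pump1",
    triples=frozenset({
        ("ex:pump1", "ex:ratedPower", "75 kW"),
        ("ex:pump1", "ex:motor", "ex:motorA"),
    }),
    previous="ex:pump1/version/1",
)

removed, added = diff_versions(v1, v2)
print(sorted(removed))
print(sorted(added))
```
      </preformat>
      <p>Because each version carries its own IRI and names its predecessor, the receiver in this sketch can reject a change that was made against an outdated state, and the predecessor links form a simple provenance chain.</p>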
      <p>An implementation of this versioning scheme is available at https://github.com/equinor/versioned-object.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Herre</surname>
          </string-name>
          .
          <article-title>A versioning and evolution framework for RDF knowledge bases</article-title>
          .
          In I. B. Virbitskaite and A. Voronkov, editors,
          <source>Perspectives of Systems Informatics, 6th International Andrei Ershov Memorial Conference, PSI 2006</source>
          , Novosibirsk, Russia, June 27-30, 2006. Revised Papers, volume
          <volume>4378</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>55</fpage>
          -
          <lpage>69</lpage>
          . Springer,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Kimball</surname>
          </string-name>
          .
          <article-title>The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses</article-title>
          . John Wiley,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ma</surname>
          </string-name>
          , C. Ma, and
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <article-title>A new structure for representing and tracking version information in a deep time knowledge graph</article-title>
          .
          <source>Computers &amp; Geosciences</source>
          ,
          <volume>145</volume>
          :
          <fpage>104620</fpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>