<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>YARRRML + LDES: Simultaneously Lowering Complexity from Knowledge Graph Generation and Publication</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gerald Haesendonck</string-name>
          <email>gerald.haesendonck@ugent.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ben De Meester</string-name>
          <email>ben.demeester@ugent.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Julián Andrés Rojas Meléndez</string-name>
          <email>julianandres.rojasmelendez@ugent.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dylan Van Assche</string-name>
          <email>Dylan.VanAssche@ugent.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pieter Colpaert</string-name>
          <email>pieter.colpaert@ugent.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IDLab, Dep. of Electronics and Information Systems, Ghent University - imec</institution>
          ,
          <addr-line>Zwijnaarde</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <fpage>6</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>Linked Data Event Streams (LDES) is an advanced Knowledge Graph (KG) publication specification aimed at continuous data source replication and synchronization with benefits such as data entities versioning and history retention while providing a self-descriptive API. However, building an LDES requires a high level of expertise in the Semantic Web ecosystem. In this demo paper, we show how we lower the complexity and need for expertise when using a more advanced KG publication method such as LDES by providing an extension point to YARRRML, a human-friendly way to configure KG generation via RML. Integrated in Matey, an online YARRRML editor, we show how little effort (adding five characters to the YARRRML syntax in the simplest case) allows (re)generating multiple versions of a KG as an Event Stream. As such, this extension provides an easy-to-use starting point for anyone wanting to create an LDES from non-semantic data.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge Graph</kwd>
        <kwd>YARRRML</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>3 https://www.vlaanderen.be/datavindplaats/catalogus?text.LIKE=ldes&amp;types.CONTAINS_ANY=service&amp;order_relevance=asc</p>
      <p>However, building an LDES requires a high level of expertise: (i) it requires advanced
knowledge of Semantic Web concepts and technologies, and (ii) existing software for publishing
LDESes<sup>4,5</sup> starts from version objects already formatted as LDES members, so an
extra step is needed before data can be ingested. This inhibits uptake of the LDES specification
and is a problem for advanced KG publication methods in general.</p>
      <p>
        In this demo paper, we show how we lower the complexity and need for expertise when
using a more advanced KG publication method such as LDES by providing an intuitive way to
configure LDES generation. For this, we rely on the RDF Mapping Language (RML) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] as a
commonly used declarative way to describe a KG construction process. Namely, we extended
YARRRML [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], a human-friendly representation for RML. We chose YARRRML as it has multiple
extensions already [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ] and is used in knowledge graph cloud services such as Google’s
Enterprise Knowledge Graph<sup>6</sup>.
      </p>
      <p>Where others extend YARRRML to stay in sync with the underlying mapping language
features, our extension adds no additional features but instead provides an easy-to-use starting
point for anyone wanting to create an LDES from existing data.</p>
      <p>
        Integrated in Matey, an online YARRRML editor, we show how little effort (adding five
characters to the YARRRML syntax in the simplest case) allows (re)generating multiple versions
of KG data entities as an Event Stream.
      </p>
    </sec>
    <sec id="sec-2a">
      <title>2. Adding an LDES Extension Point to YARRRML</title>
      <p>
        Our YARRRML extension generates all RML rules needed to comply with an existing method
for generating LDES using RML [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. A full specification and examples are available online at
https://rml.io/yarrrml/spec/ldes/. We introduce a single extension point: adding the ldes key
to the subjects collection in YARRRML is enough to define RML mappings that generate a valid
minimalist LDES with reasonable defaults. Sub-keys allow further configuration to customize
the resulting LDES.
      </p>
      <p>We include the following properties of an ldes:EventStream instance according to the LDES
specification<sup>7</sup>: tree:shape, ldes:timestampPath, and ldes:versionOfPath. In YARRRML
they can be configured using the sub-keys shape, timestampPath, and versionOfPath,
respectively. The IRI that identifies the ldes:EventStream instance can be configured with the id
sub-key. We do not support LDES fragmentation, pagination, and retention policies because
they are typically use-case specific and can easily be obtained by using existing LDES servers or
clients once a basic LDES is available.</p>
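      <p>The correspondence between the sub-keys and the ldes:EventStream properties can be sketched in Python as follows. This is only an illustration and not the YARRRML Parser implementation: the function name describe_event_stream, the example IRIs, and the simplification that every sub-key value is a plain IRI string are assumptions made here, and the defaults applied by the extension are not modelled.</p>
      <preformat preformat-type="python">
```python
from typing import Dict, List, Tuple

LDES = "https://w3id.org/ldes#"
TREE = "https://w3id.org/tree#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Sub-key of the YARRRML ldes block -> property of the ldes:EventStream instance.
SUBKEY_TO_PROPERTY = {
    "shape": TREE + "shape",
    "timestampPath": LDES + "timestampPath",
    "versionOfPath": LDES + "versionOfPath",
}

Triple = Tuple[str, str, str]


def describe_event_stream(config: Dict[str, str]) -> List[Triple]:
    """Return triples describing the event stream (hypothetical helper).

    Assumption: every value in config is a plain IRI string, and the id
    sub-key is always present; the real extension falls back to defaults.
    """
    stream = config["id"]
    triples = [(stream, RDF_TYPE, LDES + "EventStream")]
    for subkey, prop in SUBKEY_TO_PROPERTY.items():
        if subkey in config:
            triples.append((stream, prop, config[subkey]))
    return triples


# Hypothetical configuration; the IRIs are placeholders chosen for this sketch.
example = describe_event_stream({
    "id": "http://example.org/sensors-ldes",
    "timestampPath": "http://purl.org/dc/terms/created",
    "versionOfPath": "http://purl.org/dc/terms/isVersionOf",
})
for triple in example:
    print(triple)
```
      </preformat>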
      <p>Finally, we add the watchedProperties sub-key. This key describes which fields in the
original data to watch, as only new values in all given fields trigger the generation of a new
tree:member in the LDES.</p>
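      <p>The effect of watchedProperties can be illustrated with a minimal Python sketch. It is not the function used by the generated RML rules: the class name MemberGenerator, the choice to compare per entity against all previously seen values, and the temperature values in the sample records are assumptions made purely for illustration.</p>
      <preformat preformat-type="python">
```python
from typing import Dict, Iterable, List, Set, Tuple

Record = Dict[str, str]


class MemberGenerator:
    """Emits a record as a new member only if its watched values are unseen.

    Assumption: "unseen" is tracked per entity identifier over all earlier
    records; the function used by the real RML rules may behave differently.
    """

    def __init__(self, id_field: str, watched: List[str]) -> None:
        self.id_field = id_field
        self.watched = watched
        # State kept between runs: (entity id, watched values) already emitted.
        self.seen: Set[Tuple[str, Tuple[str, ...]]] = set()

    def generate(self, records: Iterable[Record]) -> List[Record]:
        members = []
        for record in records:
            values = tuple(record[field] for field in self.watched)
            key = (record[self.id_field], values)
            if key not in self.seen:
                self.seen.add(key)
                members.append(record)
        return members

    def reset(self) -> None:
        """Forget all emitted members so the stream can be regenerated."""
        self.seen.clear()


# Hypothetical readings: the temperature values are made up for this sketch.
readings = [
    {"SensorID": "1", "Timestamp": "2023-01-01T08:00:00", "Temperature": "20"},
    {"SensorID": "2", "Timestamp": "2023-01-01T08:00:00", "Temperature": "18"},
    {"SensorID": "1", "Timestamp": "2023-01-01T09:00:00", "Temperature": "21"},
    {"SensorID": "2", "Timestamp": "2023-01-01T09:00:00", "Temperature": "18"},
]

generator = MemberGenerator(id_field="SensorID", watched=["Temperature"])
first_run = generator.generate(readings)   # sensor 2's unchanged reading is skipped
second_run = generator.generate(readings)  # nothing new, so no members
generator.reset()
after_reset = generator.generate(readings)
print(len(first_run), len(second_run), len(after_reset))
```
      </preformat>
      <p>Keeping the set of emitted values as state is what lets a second run over the same data produce no members, while clearing that state allows the stream to be regenerated from scratch.</p>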
      <p>
        The generated RML rules use a function to take the watchedProperties values into account
when generating a new tree:member, as introduced by Van Assche et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. For example,
watchedProperties: [$(Temperature)] results in a new LDES member only when the value
of field Temperature in the original data was not recorded before. Listing 1, lines 5–10,
illustrates how we can generate an LDES from the data shown in Table 1.
      </p>
      <p>4 https://informatievlaanderen.github.io/VSDS-Tech-Docs/docs/LDES_server.html</p>
      <p>5 https://github.com/TREEcg/LDES-Solid-Server</p>
      <p>6 https://cloud.google.com/enterprise-knowledge-graph/docs/overview</p>
      <p>7 https://w3id.org/ldes/specification</p>
      <sec id="sec-2-1">
        <table-wrap id="tab-1">
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>SensorID</th><th>Timestamp</th></tr>
            </thead>
            <tbody>
              <tr><td>1</td><td>2023-01-01T08:00:00</td></tr>
              <tr><td>2</td><td>2023-01-01T08:00:00</td></tr>
              <tr><td>1</td><td>2023-01-01T09:00:00</td></tr>
              <tr><td>2</td><td>2023-01-01T09:00:00</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-2-2">
        <p>Compared to using RML directly to generate an LDES, our approach reduces technical
overhead: adding an LDES configuration to YARRRML requires at most 7 additional lines
of code, but results in more than 40 additional triples in the RML mapping document<sup>8</sup>. By using
ldes as key and naming sub-keys close to the corresponding LDES properties, we provide an
intuitive and simple way to configure LDES generation. We implemented the LDES extension
using the YARRRML Parser<sup>9</sup>, which translates YARRRML rules into RML.</p>
      </sec>
    </sec>
    <sec id="sec-2b">
      <title>3. Matey: Human-friendly KG Generation &amp; Publication</title>
      <p>To demonstrate our extension, we extended Matey<sup>10</sup>, an open-source browser-based application
that helps one write YARRRML rules, publicly available at https://rml.io/yarrrml/matey/.</p>
      <p>We added a basic LDES example (loadable by clicking the button [Basic LDES]) which loads
a CSV file with temperature readings and YARRRML mappings, and generates an LDES based
on temperature changes when clicking the [Generate LD] button (Figure 1). Because we only
want to generate members when the temperature changes, there is no member for sensor 2’s
last reading. Clicking [Generate RML] shows the corresponding RML mapping rules that
are applied to the data. When clicking [Generate LD] again, no members are generated at all,
since all members have already been generated. This demonstrates the feature of adding members to a
previously generated LDES without violating the watchedProperties constraints. Clicking
[Reset state] clears the state, after which the LDES can be re-generated.</p>
      <p>8 Examples of YARRRML rules and their corresponding RML rules generating LDES can be found at https://github.com/RMLio/yarrrml-ldes-extension/tree/main/ldes; all folders contain links to YARRRML files (.yaml) and their
corresponding RML files (.rml.ttl).</p>
      <p>9 https://github.com/RMLio/yarrrml-parser/releases/tag/v1.5.2</p>
      <p>10 https://rml.io/yarrrml/matey/</p>
    </sec>
    <sec id="sec-3">
      <title>4. Conclusion</title>
      <p>By extending YARRRML to configure how to generate an LDES from existing data, we lower
the technical complexity of a task that requires domain-specific knowledge. Future research
is needed to go beyond basic LDES generation and explore more advanced features such as
fragmentation. We expect this type of work to (i) increase LDES uptake and (ii) inspire other
(YARRRML-like) extensions to lower complexity for any KG generation or publication method.</p>
      <p>Acknowledgments. The described research activities were supported by SolidLab
Vlaanderen (Flemish Government, EWI and RRF project VV023/10), the Flemish Smart Data Space
(Flemish Government, Digital Flanders and RRF project VV073), and the imec ICON project
BoB (Agentschap Innoveren en Ondernemen project nr. HBC.2021.0658).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Van Lancker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Colpaert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Delva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Van de Vyvere</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rojas Meléndez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dedecker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Michiels</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Buyle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>De Craene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Verborgh</surname>
          </string-name>
          ,
          <article-title>Publishing Base Registries as Linked Data Event Streams</article-title>
          , in: Web Engineering (ICWE),
          <year>2021</year>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-030-74296-6_3</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Van de Vyvere</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. V.</given-names>
            <surname>D'Huynslager</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Atauil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Segers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Van Campe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Vandekeybus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Teugels</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.-J.</given-names>
            <surname>Pauwels</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Colpaert</surname>
          </string-name>
          ,
          <article-title>Publishing cultural heritage collections of Ghent with linked data event streams</article-title>
          , in:
          <source>Metadata and Semantic Research</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B.</given-names>
            <surname>Lonneville</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Delva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Portier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. V.</given-names>
            <surname>Maldeghem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schepers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bakeev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Vanhoorne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tyberghein</surname>
          </string-name>
          , P. Colpaert,
          <article-title>Publishing the Marine Regions gazetteer as a linked data event stream</article-title>
          , in:
          <source>Proceedings of the Joint Ontology Workshops 2021 Episode VII: The Bolzano Summer of Knowledge co-located with the 12th International Conference on Formal Ontology in Information Systems (FOIS 2021), and the 12th International Conference on Biomedical Ontologies (ICBO 2021), Bolzano, Italy, September 11-18, 2021</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dimou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vander Sande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Colpaert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Verborgh</surname>
          </string-name>
          , E. Mannens, R. Van de Walle,
          <article-title>RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data</article-title>
          ,
          <source>in: Proc. of the 7th Workshop on Linked Data on the Web</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Heyvaert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>De Meester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dimou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Verborgh</surname>
          </string-name>
          ,
          <article-title>Declarative Rules for Linked Data Generation at your Fingertips!</article-title>
          ,
          <source>in: The Semantic Web: ESWC 2018 Satellite Events</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Van Assche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Delva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Heyvaert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>De Meester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dimou</surname>
          </string-name>
          ,
          <article-title>Towards a more human-friendly knowledge graph generation &amp; publication</article-title>
          , in:
          <source>International Semantic Web Conference (ISWC) 2021: Posters, Demos, and Industry Tracks</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Steenwinckel</surname>
          </string-name>
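          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Vandewiele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Rausch</surname>
          </string-name>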
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Heyvaert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Taelman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Colpaert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Simoens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dimou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>De Turck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ongenae</surname>
          </string-name>
          ,
          <article-title>Facilitating the analysis of COVID-19 literature through a knowledge graph</article-title>
          , in:
          <string-name>
            <given-names>J. Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tamma</surname>
          </string-name>
          , C. d'Amato,
          <string-name>
            <given-names>K.</given-names>
            <surname>Janowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Seneviratne</surname>
          </string-name>
          , L. Kagal (Eds.),
          <source>The Semantic Web - ISWC 2020</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Van Assche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Oo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Rojas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Colpaert</surname>
          </string-name>
          ,
          <article-title>Continuous generation of versioned collections' members with RML and LDES</article-title>
          , in:
          <source>Proc. of the 3rd International Workshop on Knowledge Graph Construction (KGCW 2022)</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>