<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Semantic-IoT Framework: Knowledge Graph Generation from IoT Platforms</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Junsong Du</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yunheng Tian</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Max Berktold</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rita Streblow</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dirk Müller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>RWTH Aachen University, E.ON Energy Research Center, Institute for Energy Efficient Buildings and Indoor Climate</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Semantic technologies, and knowledge graphs (KGs) in particular, show promising potential for enhancing interoperability among diverse IoT systems, especially in the building sector. However, the extensive manual effort required for semantic modeling limits the widespread adoption of these technologies in engineering practice. This paper introduces Semantic-IoT, a framework that generates KGs from IoT platforms using the RDF Mapping Language (RML). By employing a semi-automated approach for declaring RML rules alongside fully automated KG generation, the framework reduces the manual effort associated with semantic modeling. The generated KGs comprehensively represent both system information and data interactions, thereby facilitating the development and deployment of cross-platform applications. A building automation use case is presented to demonstrate the feasibility and effectiveness of the proposed approach.</p>
      </abstract>
      <kwd-group>
        <kwd>IoT Platform</kwd>
        <kwd>Interoperability</kwd>
        <kwd>Building Automation</kwd>
        <kwd>RML</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Proposed Framework</title>
      <p>The RML Generation module is designed to semi-automatically create RML mapping rules for various
IoT platforms. Initially, this module loads a dataset that represents the data models of an IoT platform.
Typically, such a dataset comprises a list of virtual entities that encapsulate sensor data, actuation
functions, and semantic information about the underlying system. From this dataset, information required
by RML is extracted by dedicated submodules, as shown in Table 1.</p>
      <p>The RML preprocessor first identifies unique resource types from the dataset based on the type field
of each entity. For each resource type, the corresponding JSONPath to locate entities of that type (e.g.,
$[?(@.type=='TemperatureSensor')]) and the IRI template for the RDF subject are generated. The
extraction of interrelational information requires more complex processing as illustrated in Algorithm 1.
Simply put, the RML preprocessor traverses the JSON object of each entity and identifies substructures
that contain references to other entities. In this way, a structural skeleton of the specific data model is
created.</p>
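      <p>The first preprocessing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's actual implementation: the function name build_type_skeleton, the base IRI http://example.com/, and the sample entities are hypothetical, and each entity is assumed to carry an id and a type field.</p>
      <preformat><![CDATA[
```python
def build_type_skeleton(entities, base_iri="http://example.com/"):
    """Return one record per unique entity type, in first-seen order."""
    skeleton = {}
    for entity in entities:
        entity_type = entity["type"]
        if entity_type not in skeleton:
            skeleton[entity_type] = {
                # JSONPath filter that selects all entities of this type
                "iterator": "$[?(@.type=='" + entity_type + "')]",
                # Template for the RDF subject; {id} refers to the entity's id field
                "subject_template": base_iri + entity_type + "/{id}",
            }
    return skeleton


# Hypothetical sample dataset with two resource types
entities = [
    {"id": "TemperatureSensor:001", "type": "TemperatureSensor"},
    {"id": "TemperatureSensor:002", "type": "TemperatureSensor"},
    {"id": "HotelRoom:101", "type": "HotelRoom"},
]
skeleton = build_type_skeleton(entities)
```
]]></preformat>
      <p>Keeping one record per type, rather than per entity, mirrors the idea that a mapping rule is declared once for each resource type and then applied to every entity the JSONPath selects.</p>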
      <p>Algorithm 1 Find interrelationships for a given entity
Input: a JSON object entity, a list of JSON objects allEntities
Output: a list foundRelationships for the given entity
foundRelationships ← an empty list
for otherEntity in allEntities do
  if entity.id ≠ otherEntity.id then
    for (path, value) in TRAVERSEJSON(entity) do
      if value = otherEntity.id then
        record ← new record with fields {path, relatedType}
        record.path ← path
        record.relatedType ← otherEntity.type
        add record to foundRelationships
      end if
    end for
  end if
end for
return foundRelationships</p>
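      <p>Algorithm 1 could be realized in Python roughly as shown here. This is a sketch under assumptions: the helper traverse_json, the path notation, and the sample entities (modeled loosely on NGSIv2-style attributes) are illustrative choices that do not come from the paper. Instead of the nested loop over allEntities, the sketch indexes the other entities by id, which finds the same references with a single traversal of the entity.</p>
      <preformat><![CDATA[
```python
def traverse_json(node, path="$"):
    """Yield (path, value) pairs for every leaf value in a JSON-like structure."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from traverse_json(child, path + "." + key)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from traverse_json(child, path + "[" + str(index) + "]")
    else:
        yield path, node


def find_relationships(entity, all_entities):
    """Return records {path, relatedType} for references to other entities."""
    # Index all other entities by id; the entity itself is excluded
    type_by_id = {
        other["id"]: other["type"]
        for other in all_entities
        if other["id"] != entity["id"]
    }
    found = []
    for path, value in traverse_json(entity):
        if value in type_by_id:
            found.append({"path": path, "relatedType": type_by_id[value]})
    return found


# Hypothetical entities: a sensor that references the room it is located in
room = {"id": "HotelRoom:101", "type": "HotelRoom"}
sensor = {
    "id": "TemperatureSensor:001",
    "type": "TemperatureSensor",
    "hasLocation": {"type": "Relationship", "value": "HotelRoom:101"},
}
relationships = find_relationships(sensor, [room, sensor])
```
]]></preformat>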
      <p>
        Subsequently, the subject and predicate terminologies are matched against the domain ontologies. This
submodule loads the serialized domain ontologies and then uses the Levenshtein algorithm to compute
string similarities between the terms of the data model and those in ontologies. Based on the highest
similarity scores, matching suggestions are generated. These suggestions, along with the previously
extracted information, are populated into a report for engineers to validate. Currently, manual intervention
is required to validate the suggested terminology matches and to specify URL patterns for accessing API
endpoints of the IoT platform, specifically when a resource type provides sensor data or supports actuation
functions. Once the report is finalized, the RML mapping rule can be generated automatically, thereby
eliminating the need for manual handling of RML syntax. In comparison to YARRRML [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which
enables a human-friendly declaration of RML mapping rules, the proposed framework is specifically
tailored for IoT platforms and offers a higher degree of automation for this specific use case.
      </p>
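      <p>The terminology matching step can be illustrated with a small, self-contained sketch. The normalization of the edit distance to a score between 0 and 1 and the lower-casing of terms are assumptions made for this example; the paper does not specify how the similarity score is normalized, so scores from this sketch need not agree with the values reported in the use case. The function names and the candidate list are hypothetical.</p>
      <preformat><![CDATA[
```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Map the distance to [0, 1]; 1.0 means identical (case-insensitive)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def suggest(term, candidates):
    """Return the candidate with the highest similarity and its score."""
    best = max(candidates, key=lambda candidate: similarity(term, candidate))
    return best, similarity(term, best)


# Hypothetical excerpt of ontology class names
brick_classes = ["Temperature_Sensor", "Occupancy_Count_Sensor", "Air_System"]
best, score = suggest("TemperatureSensor", brick_classes)
```
]]></preformat>
      <p>Because the score depends only on character edits, terms that are semantically close but spelled differently receive low scores, which is why the suggestions still need validation by an engineer.</p>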
      <p>
        Once the RML mapping rule is generated, a platform-specific KGCP is established. This KGCP can
be reused to automatically generate KGs from various datasets and across different platform instances.
While the JSON preprocessor ensures the general applicability of the KGCP by normalizing the data
from IoT platforms, the RML engine, Morph-KGC [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], is employed to generate KGs. It is important not
to confuse the proposed KGCP with a runtime virtualization layer. While the virtualization approach
offers direct access to databases [8], it often demands custom development, especially when integrating
NoSQL databases like MongoDB. The KGCP, in contrast, is created through a lightweight configuration
process based on the established RML mechanism. By relying solely on the exposed structural data of
APIs, this approach is inherently more flexible for modern IoT platforms.
      </p>
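      <p>To make concrete what a generated mapping rule for one resource type might contain, the following sketch renders a single RML TriplesMap in Turtle from a validated resource type and ontology class. It is an illustration only: the function render_triples_map, the ex: namespace, the base IRI, and the file name entities.json are hypothetical, and the sketch omits the predicate-object maps and API URLs that the framework's full rules include. The RML and R2RML terms used (rml:logicalSource, rml:iterator, rr:subjectMap, rr:template, rr:class) are standard vocabulary.</p>
      <preformat><![CDATA[
```python
# Prefix declarations needed to parse the rendered TriplesMap as Turtle
PREFIXES = "\n".join([
    "@prefix rr: <http://www.w3.org/ns/r2rml#> .",
    "@prefix rml: <http://semweb.mmlab.be/ns/rml#> .",
    "@prefix ql: <http://semweb.mmlab.be/ns/ql#> .",
    "@prefix brick: <https://brickschema.org/schema/Brick#> .",
    "@prefix ex: <http://example.com/mapping/> .",
])


def render_triples_map(resource_type, ontology_class, source_file,
                       base_iri="http://example.com/"):
    """Render one RML TriplesMap in Turtle for a validated resource type."""
    iterator = "$[?(@.type=='" + resource_type + "')]"
    lines = [
        "ex:" + resource_type + "Map a rr:TriplesMap ;",
        "    rml:logicalSource [",
        '        rml:source "' + source_file + '" ;',
        "        rml:referenceFormulation ql:JSONPath ;",
        '        rml:iterator "' + iterator + '"',
        "    ] ;",
        "    rr:subjectMap [",
        '        rr:template "' + base_iri + resource_type + '/{id}" ;',
        "        rr:class " + ontology_class,
        "    ] .",
    ]
    return "\n".join(lines)


mapping = render_triples_map(
    "TemperatureSensor", "brick:Temperature_Sensor", "entities.json"
)
document = PREFIXES + "\n\n" + mapping
```
]]></preformat>
      <p>Generating the rule text from structured inputs is what allows the engineer to work only with the validation report rather than with RML syntax.</p>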
    </sec>
    <sec id="sec-3">
      <title>3. Use Case</title>
      <p>The proposed framework is planned to be applied in the building sector as illustrated in Figure 2. Currently,
the deployment of promising smart building applications—such as for automation, fault detection, and
energy monitoring—is often a laborious task due to the heterogeneity of underlying building systems
and their IoT platforms. Semantic-IoT addresses this challenge by facilitating the generation of KGs,
which provide a unified, semantic representation of the available information. By leveraging Semantic Web technologies (SWT), such
as reasoning and SPARQL, the deployment process can ultimately be automated.</p>
      <sec id="sec-3-1">
        <title>3.1. Framework Demonstration</title>
        <p>The proposed framework is demonstrated through a building automation use case<sup>5</sup>. In this use case, an
IoT system has been developed for hotel buildings based on the FIWARE platform. Platform-specific data
models<sup>6</sup> have been designed in accordance with the NGSIv2 specification of FIWARE. To fully unlock
the flexibility to connect different building automation services, the Semantic-IoT framework is used.</p>
        <p>[Figure 2 (image not reproduced). Labels recovered from the figure: IoT platforms (sensor data, actuations); interaction via APIs; Semantic-IoT framework; generated KGs; reasoning; extended KGs; information retrieval; deployment; various applications.]</p>
        <p>With the RML Generation module, twelve resource types are identified from the data models,
including locations, sensors, and actuators. Brick<sup>7</sup>, a widely used ontology for building energy systems, is
used to provide the semantic foundation for the terminology matching. Although Brick can
theoretically be applied to all identified resource types, the current terminology matching achieves less than
60% accuracy. For example, while the resource type PresenceSensor should ideally match the class
brick:Occupancy_Count_Sensor, the similarity score between them is only 0.5. Consequently, identifying
suitable terminologies remains largely manual work. In total, the complete RML mapping rules consist
of 344 lines, with manual intervention limited to 18 lines for validation and 7 lines for manual input.
Thus, the proposed framework significantly simplifies the declaration of RML mapping rules for IoT
platforms.</p>
        <p>With the generated RML mapping rules, a KGCP for the FIWARE-based platform is established,
enabling the construction of KGs for any hotel that utilizes the same platform. To test its applicability, we
then provisioned a range of virtual hotels, from a small 2-room layout to a large 1000-room one. For
each hotel, we fetched the available data from the FIWARE NGSIv2 API to create the test datasets<sup>8</sup>. As a
result, the KGCP generates KGs that represent the hotel buildings, including their rooms, sensors, and
actuators, and integrate URLs for data interaction via the platform API.</p>
        <p><sup>5</sup> https://github.com/N5GEH/semantic-iot/tree/main/examples/fiware</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Automatic Service Deployment</title>
        <p>Subsequently, we demonstrate that the deployment of automation services for
ventilation control<sup>9</sup> can be automated. The task is twofold: first, to decide on control strategies based on the
available sensors and actuators; and second, to establish reliable data interactions via the platform API.
To optimize information retrieval, we employ OWL-RL [9] to perform inference on the generated KGs. Numerous class
subsumptions are added; for example, the class brick:Ventilation_Air_System is inferred to be a subclass of
brick:Air_System and brick:HVAC_System. Thus, generalized SPARQL queries can be applied to retrieve
any possible actuators in the hotel air systems. The availability of sensors can be queried in a similar
way. As a result, configurations for the ventilation controller can be automatically generated.</p>
        <p><sup>6</sup> https://github.com/N5GEH/n5geh.data_models/tree/main/example_building_automation</p>
        <p><sup>7</sup> https://brickschema.org/</p>
        <p><sup>8</sup> https://github.com/N5GEH/semantic-iot/tree/main/examples/fiware/hotel_dataset</p>
        <p><sup>9</sup> https://github.com/N5GEH/semantic-iot/tree/main/examples/fiware/application_deployment</p>
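        <p>The effect of class subsumption on retrieval can be illustrated without an RDF library. The sketch below applies only the transitivity of subclass relations, which is one of the rules an OWL 2 RL reasoner covers; it is not the OWL-RL package itself, and it stands in for a SPARQL query with a plain Python filter. The subclass excerpt, the instance names, and both function names are hypothetical.</p>
        <preformat><![CDATA[
```python
def superclasses(cls, subclass_of):
    """Return all direct and inferred superclasses of cls (transitive closure)."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def instances_of(target, instance_types, subclass_of):
    """Select instances whose asserted class is target or one of its subclasses."""
    return sorted(
        name
        for name, cls in instance_types.items()
        if cls == target or target in superclasses(cls, subclass_of)
    )


# Hypothetical excerpt of asserted subclass axioms, mirroring the classes in the text
subclass_of = {
    "brick:Ventilation_Air_System": ["brick:Air_System"],
    "brick:Air_System": ["brick:HVAC_System"],
}
# Hypothetical instances with their asserted classes
instance_types = {
    "ex:VentilationSystem101": "brick:Ventilation_Air_System",
    "ex:Room101": "brick:Room",
}
inferred = superclasses("brick:Ventilation_Air_System", subclass_of)
hvac_systems = instances_of("brick:HVAC_System", instance_types, subclass_of)
```
]]></preformat>
        <p>A query phrased against the general class thus also returns instances that were only asserted with a more specific class, which is what makes one generalized query reusable across hotels.</p>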
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>In this paper, we introduce a framework that automates the generation of knowledge graphs (KGs) from
IoT platforms. By integrating system information and data interactions into the generated KGs, our
approach enhances interoperability across diverse IoT systems, thus simplifying the development and
deployment of cross-platform applications.</p>
      <p>In future work, we aim to leverage the HTTP Vocabulary<sup>10</sup> to enrich the semantic representation of
platform APIs. Moreover, the current terminology matching, which is based on string similarity, exhibits limited
accuracy. Consequently, we plan to investigate alternative approaches, such as word embedding models.
Ultimately, we intend to conduct comparative case studies that deploy advanced building automation
programs across different IoT systems with various representative IoT platforms.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We gratefully acknowledge the financial support provided by the Federal Ministry for Economic Affairs
and Climate Action (BMWK), promotional reference 03EN1030B.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used Gemini for grammar and spelling checks and for
paraphrasing and rewording. After using this tool, the authors reviewed and edited the content as
needed and take full responsibility for the publication’s content.</p>
      <p>[8] D. Calvanese, G. D. Giacomo, D. Lembo, M. Lenzerini, R. Rosati, Ontology-Based Data Access and
Integration, in: Encyclopedia of Database Systems, Springer, New York, NY, 2018, pp. 2590–2596.
doi:10.1007/978-1-4614-8265-9_80667.</p>
      <p>[9] I. Herman, OWL-RL: A simple OWL2 RL reasoner on top of rdflib, Zenodo, 2014. URL:
https://doi.org/10.5281/zenodo.14543. doi:10.5281/zenodo.14543.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Noura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Atiquzzaman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gaedke</surname>
          </string-name>
          ,
          <article-title>Interoperability in Internet of Things: Taxonomies and Open Challenges</article-title>
          ,
          <source>Mobile Netw Appl</source>
          <volume>24</volume>
          (
          <year>2019</year>
          )
          <fpage>796</fpage>
          -
          <lpage>809</lpage>
          . URL: https://doi.org/10.1007/s11036-018-1089-9. doi:10.1007/s11036-018-1089-9.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hazra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Adhikari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Amgoth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. N.</given-names>
            <surname>Srirama</surname>
          </string-name>
          ,
          <article-title>A Comprehensive Survey on Interoperability for IIoT: Taxonomy, Standards, and Future Directions</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          <volume>55</volume>
          (
          <year>2021</year>
          )
          <fpage>9:1</fpage>
          -
          <lpage>9:35</lpage>
          . URL: https://dl.acm.org/doi/10.1145/3485130. doi:10.1145/3485130.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ganzha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Paprzycki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Pawłowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Szmeja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wasielewska</surname>
          </string-name>
          ,
          <article-title>Semantic interoperability in the Internet of Things: An overview from the INTER-IoT perspective</article-title>
          ,
          <source>Journal of Network and Computer Applications</source>
          <volume>81</volume>
          (
          <year>2017</year>
          )
          <fpage>111</fpage>
          -
          <lpage>124</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S1084804516301618. doi:10.1016/j.jnca.2016.08.007.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Dibowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ploennigs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kabitzsch</surname>
          </string-name>
          ,
          <article-title>Automated Design of Building Automation Systems</article-title>
          ,
          <source>IEEE Transactions on Industrial Electronics</source>
          <volume>57</volume>
          (
          <year>2010</year>
          )
          <fpage>3606</fpage>
          -
          <lpage>3613</lpage>
          . URL: https://doi.org/10.1109/TIE.2009.2032209. doi:10.1109/TIE.2009.2032209.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dimou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vander Sande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Colpaert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Verborgh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Mannens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Van de Walle</surname>
          </string-name>
          ,
          <article-title>RML: a generic language for integrated RDF mappings of heterogeneous data</article-title>
          , in:
          <source>Proceedings of the 7th Workshop on Linked Data on the Web</source>
          , volume
          <volume>1184</volume>
          of CEUR Workshop Proceedings,
          <year>2014</year>
          . URL: http://ceur-ws.org/Vol-1184/ldow2014_paper_01.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Van Assche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Delva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Heyvaert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>De Meester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dimou</surname>
          </string-name>
          ,
          <article-title>Towards a more human-friendly knowledge graph generation &amp; publication</article-title>
          , in:
          <source>International Semantic Web Conference (ISWC) 2021: Posters, Demos, and Industry Tracks</source>
          ,
          <year>2021</year>
          . URL: https://rml.io/yarrrml/assets/pdf/iswc2021.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Arenas-Guerrero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chaves-Fraga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Toledo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Corcho</surname>
          </string-name>
          ,
          <article-title>Morph-KGC: Scalable knowledge graph materialization with mapping partitions</article-title>
          ,
          <source>Semantic Web</source>
          <volume>15</volume>
          (
          <year>2024</year>
          )
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          . URL: https://doi.org/10.3233/SW-223135. doi:10.3233/SW-223135.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>