Linked Sensor Data Generation using Queryable RML Mappings*

Pieter Heyvaert, Ruben Taelman, Ruben Verborgh, and Erik Mannens

Ghent University - iMinds
pheyvaer.heyvaert@ugent.be

Abstract. As the amount of generated sensor data increases, semantic interoperability becomes important for supporting efficient data distribution and communication. The integration of (sensor) data is therefore important, as this data comes from different data sources and might be in different formats. Furthermore, reusable and extensible methods for this integration are required in order to scale with the growing number of applications that generate semantic sensor data. Current research efforts allow mapping sensor data to Linked Data in order to provide semantic interoperability. However, they lack support for multiple data sources, which hampers integration. Furthermore, the methods used are not available for reuse or are not extensible, which hampers the development of applications. In this paper, we describe how the RDF Mapping Language (RML) and a Triple Pattern Fragments (TPF) server are used to address these shortcomings. The demonstration consists of a microcontroller that generates sensor data. The data is captured and mapped to RDF triples using module-specific RML mappings, which are queried from a TPF server.

1 Introduction

In the Internet of Things paradigm, many real-world objects are connected through the Internet [1]. In most cases this is facilitated by sensors that collect data. Over the years these sensors have dropped in cost and have hence become easier for consumers to obtain. This has resulted in hardware projects, such as Tessel¹ and Espruino², that offer modules that generate sensor data. Corcho and García-Castro [2] state that the addition of semantics to this data improves its understandability, management and usability.
Furthermore, they identify a number of accompanying challenges, such as the integration of (sensor) data and the rapid development of applications that produce semantic sensor data. They conclude that the use of Semantic Web technologies and Linked Data can address these challenges.

* The described research activities were funded by Ghent University, iMinds, the Institute for the Promotion of Innovation by Science and Technology in Flanders (IWT), the Fund for Scientific Research Flanders (FWO Flanders), and the European Union.
¹ https://tessel.io/
² http://www.espruino.com/

Research is being conducted towards making the produced data available as Linked Data, specifically as Resource Description Framework (RDF) triples. Publishing sensor data as Linked Data enables finding other related data and relevant information, and facilitates interconnection and integration of data from different communities and sources [3]. Barnaghi and Presser [3] present Sense2Web, a platform to publish sensor data as Linked Data. They map the original data in XML to RDF triples using XSLT. However, when integration is required with data in other data formats, XSLT is not usable, as it is limited to XML [4]. This hampers the integration with (sensor) data in other formats, such as the JSON format used by Schor et al. [6]. Patni et al. [5] developed an API that maps the original XML data to RDF. Similar to XSLT, the API is limited to data in the XML format. Furthermore, the mappings are not accessible, as users can only use the API. Therefore, when the mappings need to be updated or extended for a specific use case, a new mapping needs to be created, which hampers the rapid development of applications.

In the demo, we will show how these challenges can be addressed. We will generate RDF based on the sensor data stream of modules connected to a Tessel microcontroller.
The sensor data, available as JSON objects, is mapped to RDF using the RDF Mapping Language (RML) [4]. Furthermore, we publish the mappings via a Triple Pattern Fragments (TPF) server [7]. This allows applications to query the correct module-specific mappings on demand, which improves the mappings’ reusability and allows users to build upon them.

2 Challenges

In this section, we discuss how the two challenges are addressed by using RML mappings and making them available through a TPF server.

Integration of (Sensor) Data  We use RML to define how sensor data is mapped to RDF triples, because RML supports multiple heterogeneous data sources. Data might come from different sensors that each use a different data format. Using a different solution for each data format requires users to install, learn, use and maintain all of these solutions, which hampers the generation of triples. Furthermore, the data can come from multiple sensors and other sources. Therefore, a solution is required that can integrate multiple data sources at the same time. RML solves these two problems by offering a declarative way to define how data in multiple heterogeneous data sources is mapped to RDF triples.

Rapid Development of Applications  We use a TPF server to publish the RML mappings, because this allows applications and users to reuse the existing mappings. The rapid development of applications that need to map (a stream of) sensor data to RDF triples is only possible when existing mapping methods are reusable and extensible. RML mappings can be published using any Linked Data Fragments (LDF) server, because the mappings are expressed as RDF triples. This allows applications to dynamically query the server for sensor-specific mappings on demand. Additionally, when the mappings are updated, these applications can query for the new version, so that the new mappings are used when new RDF triples are generated. Furthermore, when users need to update or extend the mappings, they can do so without defining a completely new mapping that includes their changes, as opposed to the API provided by Patni et al. [5], which hides the mapping from the users. To update the mappings, users first query the server for the required mapping. Next, they update or extend the mapping. Optionally, they can publish the mapping again, so that it becomes reusable for others. Although RML mappings can be published using any LDF server, we choose a TPF server instead of, e.g., a SPARQL endpoint, to ensure availability if we provide public access to these mappings [7].

Fig. 1: (1) Sensor data is captured from the Tessel RFID module as JSON objects; (2) the correct RML mappings are queried from the TPF server; (3) the RMLMapper is used to execute the mappings to generate RDF triples; (4) the application outputs the triples.

3 Demo

Our demo consists of four steps, as shown in Fig. 1. First, the data of a Tessel RFID module is captured by the application as JSON objects:

    { "device":"TM-00-04-f000da30-006d4744-20bc2586",
      "module":"rfid-pm532",
      "uid":"3c527c00" }

Each object contains the unique ID of the RFID card that is tapped on the module. Second, the ldf-server³, an implementation of a TPF server, is dynamically queried for the module-specific mapping. This showcases the rapid development of applications, as developers can reuse the existing mappings. Additionally, the mappings can be cached to improve performance.
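
The caching mentioned above can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the demo's actual implementation: `MappingCache`, `fake_fetch`, and the placeholder mapping text are hypothetical names introduced here, and `fake_fetch` merely stands in for whatever code queries the TPF server.

```python
from typing import Callable, Dict, List


class MappingCache:
    """Keeps one mapping per module so the server is queried only on a cache miss."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        # `fetch` is a hypothetical callable that retrieves the mapping
        # for a module name, e.g. by querying a TPF server.
        self._fetch = fetch
        self._cache: Dict[str, str] = {}

    def get(self, module: str) -> str:
        # Only call `fetch` when the mapping has not been retrieved yet.
        if module not in self._cache:
            self._cache[module] = self._fetch(module)
        return self._cache[module]

    def invalidate(self, module: str) -> None:
        # Drop a cached mapping so that an updated version is fetched next time.
        self._cache.pop(module, None)


# Stand-in for the server query; it records every call so reuse is visible.
calls: List[str] = []


def fake_fetch(module: str) -> str:
    calls.append(module)
    return f"# placeholder mapping for {module}"


cache = MappingCache(fake_fetch)
reading = {
    "device": "TM-00-04-f000da30-006d4744-20bc2586",
    "module": "rfid-pm532",
    "uid": "3c527c00",
}

first = cache.get(reading["module"])   # cache miss: fetches the mapping
second = cache.get(reading["module"])  # cache hit: no second fetch
```

Calling `invalidate` for a module is one possible way to pick up an updated mapping: the next `get` for that module fetches it again.
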
Besides mapping the sensor data to RDF triples, the module-specific mapping also maps the location of the sensor, which is available in an XML file:

    [XML listing: the element markup was lost in extraction; only the coordinate values 50 and 41 survive]

This showcases the integration of multiple heterogeneous (sensor) data sources to triples. Third, the mapping is executed by the RMLMapper⁴, which produces the triples. Finally, the triples are output to the users:

    @prefix ex: .
    @prefix geo: .

    ex:TM-00-04-f000da30-006d4744-20bc2586 a ;
        ex:loc/50_41 ;
        "3c527c00".

    ex:loc/50_41 geo:lat "50"; geo:long "41".

During the demo, users will be able to see the original data stream in JSON format, the mapped RDF triples, and the RML mappings provided by the TPF server. Furthermore, users can update the mappings and see the changes when new RDF triples are generated. A screencast of the demo can be found at http://users.ugent.be/~pheyvaer/sensor2rdf/. The open source code is available at https://github.com/mmlab/demo-sensor2rdf.

³ https://github.com/LinkedDataFragments/Server.js

References

[1] Melanie Swan. Sensor Mania! The Internet of Things, Wearable Computing, Objective Metrics, and the Quantified Self 2.0. Journal of Sensor and Actuator Networks, 1(3):217–253, 2012.
[2] Oscar Corcho and Raúl García-Castro. Five challenges for the Semantic Sensor Web. Semantic Web, 1(1, 2):121–125, 2010.
[3] Payam Barnaghi and Mirko Presser. Publishing Linked Sensor Data. In Proceedings of the 3rd International Conference on Semantic Sensor Networks-Volume 668, pages 1–16. CEUR-WS.org, 2010.
[4] Anastasia Dimou, Miel Vander Sande, Pieter Colpaert, Ruben Verborgh, Erik Mannens, and Rik Van de Walle. RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data. In Workshop on Linked Data on the Web, 2014.
[5] Harshal Patni, Cory Henson, and Amit Sheth. Linked Sensor Data. In Collaborative Technologies and Systems (CTS), 2010 International Symposium on, pages 362–370. IEEE, 2010.
[6] Lars Schor, Philipp Sommer, and Roger Wattenhofer. Towards a zero-configuration wireless sensor network architecture for smart buildings. In Proceedings of the First ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Buildings, pages 31–36. ACM, 2009.
[7] Ruben Verborgh, Miel Vander Sande, Olaf Hartig, Joachim Van Herwegen, Laurens De Vocht, Ben De Meester, Gerald Haesendonck, and Pieter Colpaert. Triple Pattern Fragments: a Low-cost Knowledge Graph Interface for the Web. Journal of Web Semantics, 37–38:184–206, 2016. ISSN 1570-8268. doi: 10.1016/j.websem.2016.03.003. URL http://linkeddatafragments.org/publications/jws2016.pdf.

⁴ https://github.com/RMLio/RML-Mapper