<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ONETT: Systematic Knowledge Graph Generation for National Access Points</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>David Chaves-Fraga</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adolfo Anton</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jhon Toledo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oscar Corcho</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ontology Engineering Group, Universidad Politécnica de Madrid</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <abstract>
        <p>In this paper, we describe our implemented approach for using declarative mappings to publish open transport data from transport authorities and operators according to an ontology based on Transmodel. This allows a homogeneous representation of transport data across EU transport-related organisations and minimises the need to understand the ad-hoc, heterogeneous formats in which they currently publish such data. We show how we create and use RML mappings for the specific case of transforming GTFS data into a Transmodel-based ontology. In the future, such data may be further transformed into other formats such as NeTEx.</p>
      </abstract>
      <kwd-group>
        <kwd>Transmodel</kwd>
        <kwd>GTFS</kwd>
        <kwd>NAP</kwd>
        <kwd>RML</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>to provide data about transport infrastructure, but also for route planning, but also by other route planners, such as Navitia.io and OpenTripPlanner.</p>
      <p>To achieve this homogeneity, there are several options that may be followed:</p>
      <list list-type="bullet">
        <list-item>
          <p>Transport authorities and operators may agree on using the same data format and publish according to it. They know well the type of data that they handle, its quality properties, etc., so they should be able to provide this data easily. To some extent, this is what is currently happening with GTFS, and what should happen in the near future in the European Union with NeTEx, according to Directive 2010/40/EU and Regulation 2017/1926 (MMTIS).</p>
        </list-item>
        <list-item>
          <p>Third parties (as well as operators and authorities themselves) may create transformation rules that transform the original data sources into other generally agreed formats, republishing the transformed data either in the original data portals, if allowed to do so, or on other servers. Transformations may be done programmatically (that is, with ad-hoc code) or declaratively (using mappings in existing languages like R2RML [<xref ref-type="bibr" rid="ref2">2</xref>] or RML [<xref ref-type="bibr" rid="ref3">3</xref>]).</p>
        </list-item>
      </list>
      <p>In this paper, we present our work on ensuring that declarative mappings can be used to transform transport data published by transport authorities and operators into a homogeneous representation based on Transmodel (the reference data model for public transport at European level, which is further described in Section 2). This data can then be further transformed into NeTEx so as to comply with the EU regulations for the publication of transport-related data in National Access Points.</p>
    </sec>
    <sec id="sec-3">
      <title>Transmodel Ontology and GTFS</title>
      <p>In its drive to foster interoperability across Europe, the EU is requiring each Member State to allow access to transportation data via a National Access Point (NAP). According to EU Regulation 2017/1926, all transportation authorities, transport operators and infrastructure managers must provide static and dynamic data in specific data formats (e.g., NeTEx, SIRI). The Regulation applies to different transportation modes, including air, train, road vehicle, bus, ferry, metro, tram, shuttle bus, car-sharing, car-pooling and bike-sharing.</p>
      <p>Transmodel is the European Reference Data Model for Public Transport. It provides a conceptual model of common public transport concepts and data structures that can be used to build many different kinds of public transport information systems, such as timetabling, fares, operational management, real-time data and journey planning. It is divided into eight different sections or Parts: Common Concepts (CC), Public Transport Network Topology (NT), Network Description (ND), Operations Monitoring &amp; Control (OM), Fare Management (FM), Passenger Information (PI), Driver Management (DM), and Management Information &amp; Statistics (MI).</p>
      <p>These parts or sections are usually implemented by different standards or specific data formats. One of the most relevant implementations is NeTEx, which partially covers some features of the parts CC, NT, ND, FM and PI. NeTEx is referenced in EU Regulation 2017/1926 (May 2017), in which the European Commission recognised NeTEx as a strategic standard for the cross-border exchange of data. The first step must be taken before December 2019, when every European country must make data available in NeTEx format at its National Access Point to allow EU-wide multi-modal travel information services.</p>
      <p>The General Transit Feed Specification (GTFS) is a de-facto standard for representing public transport data. A GTFS feed is a collection of at least five required, two conditionally required, and up to fifteen CSV files (with extension .txt and preferably encoded as UTF-8), contained within a compressed file, that describes the scheduled operations of a transit system. The aim of GTFS is to provide at least trip-planning functionality. It defines the headers and a set of rules that must be taken into account when the dataset is created. Each file, as well as each of its headers, can be mandatory or optional, and the files are related to one another. The specification supports the representation of several public transport features such as trips, routes, stops, times, fares and calendars.</p>
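      <p>As a minimal sketch of what this file layout means in practice, the following Python snippet lists the CSV files of a zipped feed together with their headers and reports which required files are missing. The function names (read_feed_headers, missing_required_files, build_demo_feed) and the demo data are illustrative and do not come from any GTFS tool; the set of five required files reflects our reading of the public GTFS reference and should be treated as an assumption.</p>
      <preformat>
```python
import csv
import io
import zipfile

# Assumption: the five files that the GTFS reference marks as required.
REQUIRED_FILES = {
    "agency.txt", "stops.txt", "routes.txt", "trips.txt", "stop_times.txt",
}


def read_feed_headers(feed_bytes):
    """Return {file name: [column names]} for every .txt file in a zipped feed."""
    headers = {}
    with zipfile.ZipFile(io.BytesIO(feed_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".txt"):
                continue
            with archive.open(name) as raw:
                # utf-8-sig drops the byte-order mark that some feeds start with.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                # An empty file yields an empty header list.
                headers[name] = next(csv.reader(text), [])
    return headers


def missing_required_files(headers):
    """List the required GTFS files that are absent from the feed."""
    return sorted(REQUIRED_FILES - set(headers))


def build_demo_feed():
    """Build a tiny in-memory feed (made-up content) for the usage example."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "agency.txt",
            "agency_name,agency_url,agency_timezone\n"
            "Demo,http://example.org,Europe/Madrid\n",
        )
        archive.writestr(
            "stops.txt",
            "stop_id,stop_name,stop_lat,stop_lon\n"
            "S1,Main,40.4,-3.7\n",
        )
    return buffer.getvalue()


demo_headers = read_feed_headers(build_demo_feed())
print(demo_headers["agency.txt"])
print(missing_required_files(demo_headers))
```
      </preformat>
      <p>The sketch keys the result by the name stored in the archive, so a feed whose files sit in a subdirectory would need its paths normalised first.</p>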
      <p>In order to provide a better GTFS to NeTEx conversion and, further, full data interoperability, we have started to build a Transmodel Ontology. The development is released in a GitHub repository<sup>3</sup>, where all the material generated during the different development activities is uploaded (e.g., use cases, user stories, glossary of terms). Based on the Transmodel base URI<sup>4</sup> proposed by the CEN Transmodel working group and on the model documentation<sup>5</sup>, we develop the corresponding ontology following the NeOn methodology [<xref ref-type="bibr" rid="ref6">6</xref>]. Before performing the transformation from GTFS to the ontology-based format of Transmodel, we analyse the relationship between the two standards. For example, in Table 1 we show the relation between the properties of Agency in the GTFS model and the corresponding properties in Transmodel (Authority).</p>
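      <table-wrap>
        <label>Table 1</label>
        <caption>
          <p>Example of relation among GTFS properties and Transmodel Ontology</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>GTFS</th>
              <th>Transmodel (Ontology)</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Agency name</td>
              <td>https://w3id.org/transmodel/terms#authorityName</td>
            </tr>
            <tr>
              <td>Agency url</td>
              <td>https://w3id.org/transmodel/terms#authorityUrl</td>
            </tr>
            <tr>
              <td>Agency timeZone</td>
              <td>https://w3id.org/transmodel/terms#authorityTimezone</td>
            </tr>
            <tr>
              <td>Agency lang</td>
              <td>https://w3id.org/transmodel/terms#authorityLang</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>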
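      <p>To illustrate how the correspondences of Table 1 can be applied, the following Python sketch turns the rows of an agency.txt file into (subject, property, value) triples. It is not the ONETT implementation, which relies on declarative mappings: the subject IRI pattern under example.org, the "default" identifier fallback, the function name agency_triples and the sample row are assumptions made for illustration. The snake_case column names are those used in GTFS agency.txt.</p>
      <preformat>
```python
import csv
import io
from urllib.parse import quote

TM = "https://w3id.org/transmodel/terms#"

# Table 1 as data: GTFS agency column mapped to a Transmodel ontology property.
AGENCY_PROPERTIES = {
    "agency_name": TM + "authorityName",
    "agency_url": TM + "authorityUrl",
    "agency_timezone": TM + "authorityTimezone",
    "agency_lang": TM + "authorityLang",
}

# Hypothetical IRI pattern for generated resources; not taken from ONETT.
SUBJECT_TEMPLATE = "http://example.org/authority/{}"


def agency_triples(csv_text):
    """Return (subject, property, value) triples for each row of agency.txt."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # agency_id may be omitted in single-agency feeds, hence the fallback.
        agency_id = row.get("agency_id") or "default"
        subject = SUBJECT_TEMPLATE.format(quote(agency_id, safe=""))
        for column, prop in AGENCY_PROPERTIES.items():
            value = row.get(column)
            if value:  # skip columns that are absent or empty in this feed
                triples.append((subject, prop, value))
    return triples


# Made-up sample row; agency_lang is deliberately missing.
SAMPLE = (
    "agency_id,agency_name,agency_url,agency_timezone\n"
    "A1,Demo Transit,http://example.org,Europe/Madrid\n"
)

for triple in agency_triples(SAMPLE):
    print(triple)
```
      </preformat>
      <p>Because the sample has no agency_lang column, no authorityLang triple is produced, which mirrors the way optional GTFS columns simply yield no statements.</p>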
    </sec>
    <sec id="sec-4">
      <title>The ONETT Demo</title>
      <p>The Open NEtwork of public Transport application (ONETT)<sup>6</sup> uses Semantic Web technologies to perform systematic knowledge graph generation in the transport domain. More specifically, ONETT applies the concept of Ontology-Based Data Access (OBDA) [<xref ref-type="bibr" rid="ref5">5</xref>], which aims to provide a unified view of, and common access to, a set of data sources by means of ontologies and mappings.</p>
      <sec id="sec-4-1">
        <title>3 https://github.com/oeg-upm/transmodel-ontology 4 https://w3id.org/transmodel/terms# 5 http://www.transmodel-cen.eu/ 6 https://osoc-es.github.io/onett/</title>
        <p>In this specific case, we create a general mapping between the full specification of GTFS<sup>7</sup> and the ontology-based Transmodel using the RML specification in its YARRRML [<xref ref-type="bibr" rid="ref4">4</xref>] serialization. Before running the transformation, we have to perform a mapping translation [<xref ref-type="bibr" rid="ref1">1</xref>] process to adapt the general mapping to the input data, because the structure and number of files vary from feed to feed owing to the nature of GTFS. Thanks to the simplicity of the YARRRML serialization, the translation process is done in an efficient and simple manner. The workflow of the application is shown in Figure 1, where the SDM-RDFizer<sup>8</sup> engine for RML mappings is integrated in the application to transform the input CSV data into RDF. In more detail, the steps followed by ONETT to generate the desired RDF knowledge graph, based on the Transmodel ontology, from a GTFS feed are:</p>
        <list list-type="order">
          <list-item>
            <p><bold>Analyse the input data:</bold> It decompresses and analyses the input GTFS feed to determine which files are present and the structure of each file (headers).</p>
          </list-item>
          <list-item>
            <p><bold>Mapping translation:</bold> It takes the general GTFS YARRRML mapping that represents the full specification and generates a new mapping corresponding to the input data.</p>
          </list-item>
          <list-item>
            <p><bold>Knowledge Graph Generation:</bold> It runs the SDM-RDFizer engine to transform the raw data to RDF.</p>
          </list-item>
        </list>
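        <p>The mapping translation step can be sketched as follows. This is a simplified illustration under stated assumptions, not the ONETT code: the general mapping is modelled as a plain Python dictionary (file, then column, then property) rather than as a YARRRML document, and the function name translate_mapping, the feed_info.txt property and the sample headers are hypothetical.</p>
        <preformat>
```python
def translate_mapping(general_mapping, feed_headers):
    """Keep only the rules whose source file and column exist in the feed."""
    translated = {}
    for file_name, column_rules in general_mapping.items():
        present = set(feed_headers.get(file_name, []))
        kept = {
            column: prop
            for column, prop in column_rules.items()
            if column in present
        }
        if kept:  # drop files that contribute no rule at all
            translated[file_name] = kept
    return translated


TM = "https://w3id.org/transmodel/terms#"

# Simplified stand-in for the general mapping: file, then column, then property.
GENERAL_MAPPING = {
    "agency.txt": {
        "agency_name": TM + "authorityName",
        "agency_url": TM + "authorityUrl",
        "agency_timezone": TM + "authorityTimezone",
        "agency_lang": TM + "authorityLang",
    },
    "feed_info.txt": {
        # Made-up property, used only to show a file that the feed lacks.
        "feed_publisher_name": "http://example.org/terms#publisherName",
    },
}

# Headers as they might be found when analysing a hypothetical feed (step 1).
FEED_HEADERS = {
    "agency.txt": ["agency_id", "agency_name", "agency_url", "agency_timezone"],
    "stops.txt": ["stop_id", "stop_name"],
}

feed_mapping = translate_mapping(GENERAL_MAPPING, FEED_HEADERS)
print(sorted(feed_mapping))
print(sorted(feed_mapping["agency.txt"]))
```
        </preformat>
        <p>In this sketch the rules for feed_info.txt and for the agency_lang column are dropped because the feed does not contain them, so the engine is only given rules that can actually be evaluated against the input.</p>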
        <p>These steps are a black box for the transport authorities that want to obtain the knowledge graph from their GTFS feeds. Using the web application, the user only has to upload the compressed feed or provide a URL, and ONETT automatically generates the corresponding knowledge graph. With this approach, we provide a useful tool to generate National Access Point compliant data from a very popular de-facto standard data format in a systematic manner.</p>
      </sec>
      <sec id="sec-4-2">
        <title>7 https://github.com/osoc-es/onett-back/ 8 https://github.com/SDM-TIB/SDM-RDFizer</title>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>The availability of homogeneous transport data from transport authorities and operators worldwide makes it possible to create new types of transport-related applications (trip planners, fare calculators, ticket recommenders, etc.) that can be deployed easily in different regions or cities. In this paper, we have shown our approach to creating such homogeneous transport data, based on declarative mappings that can be used to generate transport knowledge graphs for any region or city in the world that currently publishes data in GTFS. The mappings allow GTFS data to be transformed into RDF according to a Transmodel-based ontology. Such data can be queried in a homogeneous manner, so that the aforementioned applications can be created more easily.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work is partially supported by EIT Digital under Grant Agreement “No. EIT/EIT DIGITAL/SGA2019/1 through action: SNAP”, by the Spanish Ministerio de Economía, Industria y Competitividad and EU FEDER funds under the DATOS 4.0: RETOS Y SOLUCIONES - UPM Spanish national project (TIN2016-78011-C4-4-R), and by an FPI grant (BES-2017-082511). We thank our open Summer of Code 2019<sup>9</sup> students: Luis Pozo, Pablo Castellanos and Marta Retana.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Corcho</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Priyatna</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chaves-Fraga</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Towards a New Generation of Ontology Based Data Access</article-title>
          .
          <source>In: Semantic Web Journal</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sundara</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>R2RML: RDB to RDF Mapping Language, W3C Recommendation 27 September 2012</article-title>
          . www.w3.org/TR/r2rml (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Dimou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vander Sande</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Colpaert</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verborgh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannens</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van de Walle</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data</article-title>
          . In: LDOW (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Heyvaert</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Meester</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dimou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verborgh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Declarative Rules for Linked Data Generation at your Fingertips!</article-title>
          <source>In: Proceedings of the 15th ESWC: Posters and Demos</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Poggi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Giacomo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Linking data to ontologies</article-title>
          .
          <source>In: Journal on Data Semantics X</source>
          , pp.
          <fpage>133</fpage>
          –
          <lpage>173</lpage>
          . Springer (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Suarez-Figueroa</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gomez-Perez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fernandez-Lopez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>The NeOn Methodology for Ontology Engineering</article-title>
          .
          <source>In: Ontology Engineering in a Networked World</source>
          , pp.
          <fpage>9</fpage>
          –
          <lpage>34</lpage>
          . Springer (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>