=Paper= {{Paper |id=Vol-2939/paper6 |storemode=property |title=Enabling Cross-Border Travel Offers Through National Access Point Federation via Metadata Harmonisation |pdfUrl=https://ceur-ws.org/Vol-2939/paper6.pdf |volume=Vol-2939 |authors=Alessio Carenini,Andrea Fiano,Mario Scrocca,Marco Comerio,Irene Celino |dblpUrl=https://dblp.org/rec/conf/i-semantics/CareniniFSCC21 }} ==Enabling Cross-Border Travel Offers Through National Access Point Federation via Metadata Harmonisation== https://ceur-ws.org/Vol-2939/paper6.pdf
 Enabling Cross-Border Travel Offers Through
 National Access Point Federation via Metadata
                 Harmonisation

Alessio Carenini, Andrea Fiano, Mario Scrocca, Marco Comerio, and Irene Celino

                   Cefriel, Milan, Italy name.surname@cefriel.com



        Abstract. Planning cross-border transportation offers requires gather-
        ing data from multiple transport operators within and outside a country.
        European legislation requires each member state to set up a National
        Access Point (NAP) for multimodal transport information; nevertheless,
        interoperability in accessing data from different NAPs is far from being
        achieved. In this paper, we describe and validate our approach to
        consolidate metadata coming from different sources using Semantic Web
        technologies. The presented solution implements an automated ingestion
        pipeline harmonising metadata from three different European NAPs in
        a single metadata catalog.

        Keywords: Metadata Harmonisation · Cross-border Travel Offers · Na-
        tional Access Points


1     Introduction

    In the transportation domain, several data and metadata catalogs coexist,
each one being maintained by a different initiative or mandated by a specific EU
directive or country law. According to the EU Delegated Regulations 2017/1926,
885/2013, 886/2013 and 2015/962, each EU member state has to implement a
National Access Point (NAP) to make national transport data discoverable. A
NAP is an intermediary digital platform allowing access to traffic and mobil-
ity data, and playing a crucial role in data exchange in the field of mobility in
Europe. From the point of view of the transport operator looking for mobility-
related information, NAPs represent trusted sources of data and metadata, and
their content can be reliably used inside their own information systems. A NAP
is a web-based portal handling data concerning Safe and Secure Truck Parking
(SSTP), Real-Time Traffic Information (road) (RTTI), Safety Related Traffic
Information (road) (SRTI) and Multimodal Travel Information Services (MMTIS)
(all modes, such as train, bus, metro and cycling). EU regulations mandate the
use of Transmodel-based specifications for the data exchange between transport
operators and their own reference NAP, thereby aiming for data interoperability.
Nevertheless, the regulations do not specify which metadata should be used
to describe datasets, or how a NAP should be implemented. As a result, each
member state is implementing its own National Access Point using different
metadata schemas and exposing its functionalities via custom APIs [2,5].

    Copyright © 2021 for this paper by its authors. Use permitted under Creative
    Commons License Attribution 4.0 International (CC BY 4.0).

    This paper describes how we extended a metadata catalog, named Asset Man-
ager, to seamlessly support accessing both local digital assets, directly added by
users of the Asset Manager, and remote digital assets from multiple National
Access Points. Our scenario is based upon the real requirements coming from
Trenitalia¹, which wants to create mobility packages to be sold to tour operators
bringing tourists to the Milano-Cortina Winter Olympics in 2026². Creating such
mobility packages means locating and accessing timetables of multiple transport
operators. Performing this task, even when National Access Points are
available, is time-consuming and requires checking multiple sources. Consolidating
in a single catalog the metadata coming from multiple NAPs, together with
metadata provided by Trenitalia, makes it possible to perform the task more
efficiently and to better fulfil the mobility needs of tourists heading to the Winter
Olympics. Such a scenario requires mapping different NAP metadata schemas
onto a single schema and creating multiple metadata ingestion pipelines.
    The steps which we implemented were the following:

  i) Metadata schema mapping: the different NAP metadata schemas were con-
     ceptually mapped onto a single schema, which is also used to describe the
     local assets.
 ii) RML transformation rules: the conceptual mappings from the specific NAP
     schemas to the Asset Manager metadata schema were implemented in RML [3].
iii) (Meta)Data ingestion pipelines: the RML mappings were integrated in data
     ingestion and transformation pipelines using the Chimera tool³ [6] for their
     execution. The resulting RDF triples, defining metadata for remote assets,
     are added to the RDF repository used by the Asset Manager.
iv) Exploration API creation: to ease integration in the user interface, Explo-
     ration APIs were created to wrap the execution of SPARQL queries as APIs.
     Such Exploration APIs allow obtaining the lists of assets belonging to a spe-
     cific type and their metadata. By doing this, we harmonised the access to
     both local and remote assets.
 v) User interface: the Asset Manager web interfaces, showing the integrated
     list of assets and their metadata, were updated.

To show the implemented approach for NAP metadata harmonisation, we selected
three different NAPs from France, Belgium and the Netherlands. Since
the approach is completely generic, this implementation opens the possibility to
use the Asset Manager as an aggregator of multiple trusted metadata sources,
like open data portals, multimodal National Access Points, or other instances of
the Asset Manager. In the following sections, we provide details on each step of
the approach and meaningful insights about the implemented solution.

 ¹ https://www.trenitalia.com/
 ² A video describing the scenario and the implemented solution is available at
   https://www.youtube.com/watch?v=SoOLheMv1wQ
 ³ https://github.com/cefriel/chimera


2   Metadata schemas for National Access Points
The main focus of the NAP regulations is to promote the usage of a specific
set of standards, based on Transmodel, across all Europe to improve transport
data interoperability. Even though the role of the NAP as a dataset catalog
is well-defined by the regulation, each member state is then free to define its
own implementation. This freedom led to the appearance of different metadata
vocabularies, and the need for interoperability between the metadata schemas
adopted by different NAPs. In our scenario, we analysed in detail the National
Access Points provided by France, Belgium and the Netherlands.
    The Belgian NAP is built upon CKAN; therefore, its API⁴ allows searching for
datasets according to specific types or features. The metadata schema is quite
rich and contains multi-lingual documentation, geographical coverage, and both
contact person and responsible transport operator.
    The French NAP features a rich API⁵ and metadata schema, containing many
details about datasets. This National Access Point supports NeTEx representations
of static transport data (leveraging Chouette [4] features), which are made
available as Community resources, i.e. alternative representations of the
same main information described in the asset. Spatial information about
the covered area is also provided, allowing for geographical queries. Finally,
an asset has only a responsible organization, and the metadata schema does not
mandate a contact person.
    The Netherlands NAP has no clear API to obtain metadata, and the actual
endpoint⁶ was found by analyzing the JavaScript sources of the NAP website.
The metadata schema mandates both a responsible person for the dataset publication
and an owner transport operator company. The referenced dataset is listed
with the attribute publicationURL, and no geographical coverage is present (as
opposed to the French metadata schema).
    Summarising the analysis of the selected NAPs, they all feature different
metadata schemas, and even basic information describing who is responsible for
the asset is not represented in the same way.
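
To make this mismatch concrete, the following minimal sketch normalises the "who is responsible" information of the three NAPs into one shape. It is an illustration only: all JSON field names used here (operator, contact_name, organization, owner, responsible_person) are hypothetical placeholders and are not taken from the actual NAP schemas.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Responsibility:
    """Who answers for an asset, in one shape regardless of the source NAP."""
    organisation: Optional[str]
    contact_person: Optional[str]


def from_belgium(record: dict) -> Responsibility:
    # Hypothetical field names: the Belgian schema carries both a contact
    # person and a responsible transport operator.
    return Responsibility(record.get("operator"), record.get("contact_name"))


def from_france(record: dict) -> Responsibility:
    # Hypothetical field name: the French schema only has an organisation.
    return Responsibility(record.get("organization"), None)


def from_netherlands(record: dict) -> Responsibility:
    # Hypothetical field names: the Dutch schema mandates both a responsible
    # person and an owner transport operator company.
    return Responsibility(record.get("owner"), record.get("responsible_person"))
```

Keeping one small adapter per NAP leaves the schema differences in a single place, which is the same role the declarative mappings play in the approach described in the next section.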
    A working group composed of representatives from the Netherlands, Germany,
Austria and Sweden started to work on common metadata definitions to
be applied to the various NAPs in Europe to increase interoperability and ease
the creation of multi-country solutions. The outcome of this group is called the
Coordinated Metadata Catalogue⁷ and defines a minimum set of metadata which,
according to its authors, should be supported in all NAP implementations.
Using the Coordinated Metadata Catalogue schema allows harmonising those
NAP schemas onto a unified schema, as all the basic information contained in
the assets coming from the three different NAPs can be represented.

 ⁴ https://www.transportdata.be/api/3
 ⁵ https://transport.data.gouv.fr/swaggerui
 ⁶ https://nt.ndw.nu/services-spoa/rest/v1/ui/multimodaal
 ⁷ https://www.its-platform.eu/highlights/harmonised-metadata-national-access-points


3     Automating Metadata Aggregation from National
      Access Points
The Asset Manager is an RDF-based metadata catalog developed in the context
of the Shift2Rail Innovation Programme 4⁸. We show the possibility to use it as
an aggregator of metadata coming from multiple trusted sources. The objective
is to let companies access domain-specific knowledge in a coherent way using
a single tool. We defined and validated an approach based on Semantic Web
technologies to perform metadata ingestion and to define and execute mappings
to a single metadata schema. Following the European guidelines to represent
metadata in data catalogs, the DCAT Application Profile v2.0.1 [7] was selected
as metadata schema for the Asset Manager.
    Our solution leverages the Asset Manager and the Chimera tool to: (i) connect
to each NAP, (ii) fetch the metadata of its assets, (iii) convert such metadata
into a coherent RDF representation to be easily queried via SPARQL, (iv) store
the resulting triples inside the RDF repositories, (v) show that the Asset Man-
ager can visualise both local and remote assets.
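
Steps (i) to (iv) above can be pictured with the following minimal sketch. It is not the Chimera pipeline itself: the fetch and lift functions are hypothetical stand-ins for the HTTP call to a NAP and for the mapping rules, and the repository is a plain dictionary rather than an RDF store. Step (v), the visualisation, is left out.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
Fetcher = Callable[[], List[dict]]
Lifter = Callable[[dict], List[Triple]]


def ingest(sources: Dict[str, Tuple[Fetcher, Lifter]]) -> Dict[str, List[Triple]]:
    """Run steps (i)-(iv) for every NAP and return one triple list per NAP."""
    repository: Dict[str, List[Triple]] = {}
    for nap, (fetch, lift) in sources.items():
        records = fetch()                                # (i)-(ii) connect and fetch
        triples = [t for r in records for t in lift(r)]  # (iii) convert to RDF
        repository[nap] = triples                        # (iv) store
    return repository


# Hypothetical stand-ins so that the sketch runs offline.
def fetch_nl() -> List[dict]:
    return [{"id": "a1", "publicationURL": "https://example.org/a1.zip"}]


def lift_nl(record: dict) -> List[Triple]:
    subject = "https://example.org/asset/" + record["id"]
    return [(subject, "dcat:landingPage", record["publicationURL"])]


repository = ingest({"netherlands": (fetch_nl, lift_nl)})
```

Adding a further NAP only requires registering one more (fetch, lift) pair, which mirrors the claim that the approach is generic with respect to the metadata source.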

3.1    Configuring metadata ingestion
The first and most important step in configuring the NAP metadata ingestion
process is understanding the metadata schemas and identifying which attributes
and data structures can be found in all the different NAPs. We decided to use the
Coordinated Metadata Catalogue as an intermediate model to ease the definition
of mappings between the metadata schemas. Indeed, the Coordinated Metadata
Catalogue specification acknowledges the existence of other vocabularies and
already provides an alignment to DCAT-AP, adopted by the Asset Manager. We
exploited this alignment, defining a two-step conceptual mapping, as depicted
in Figure 1: first from the specific NAP metadata schema onto the Coordinated
Metadata Catalogue, and then from that schema onto DCAT-AP.

 ⁸ cf. https://shift2rail.org/research-development/ip4/

Fig. 1: Conceptual mappings defined for the harmonisation of the different
National Access Point metadata schemas.

Fig. 2: Overall architecture of the implemented solution to integrate metadata
from the NAPs (France, Belgium and Netherlands).
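
The two-step conceptual mapping can be read as the composition of two lookup tables, as in the minimal sketch below. Only publicationURL is an attribute named in the text; the other source attributes and the intermediate element names are hypothetical placeholders, not the actual labels of the Coordinated Metadata Catalogue, and the choice of DCAT-AP target properties is illustrative.

```python
# Step 1: NAP-specific attribute -> Coordinated Metadata Catalogue element.
# "name", "owner" and all right-hand side names are hypothetical placeholders.
NL_TO_CMC = {
    "publicationURL": "access_url",
    "name": "dataset_name",
    "owner": "publisher",
}

# Step 2: Coordinated Metadata Catalogue element -> DCAT-AP property.
# The target properties exist in DCAT / DCMI Terms; the pairing is illustrative.
CMC_TO_DCAT_AP = {
    "access_url": "dcat:accessURL",
    "dataset_name": "dct:title",
    "publisher": "dct:publisher",
}


def compose(first: dict, second: dict) -> dict:
    """Chain two attribute mappings, dropping attributes with no final target."""
    return {src: second[mid] for src, mid in first.items() if mid in second}


NL_TO_DCAT_AP = compose(NL_TO_CMC, CMC_TO_DCAT_AP)
```

With this layout, supporting an additional NAP only requires writing its first-step table, since the second step is shared by all sources.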
    The defined conceptual mappings guided the coding of the actual mapping
rules using RML⁹. We then assembled a metadata ingestion service exposed
through a Chimera pipeline¹⁰. As shown in Figure 2, calling this service triggers
the execution of the following actions for each of the countries: (i) the NAP
API endpoint is called to obtain JSON metadata; (ii) lifting is performed on
the resulting JSON metadata using the appropriate RML mapping rules to
obtain an RDF representation compliant with the DCAT-AP profile; (iii) the
resulting RDF triples are written in the RDF repository used by the Asset
Manager as a separate RDF graph.
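
As an illustration of actions (ii) and (iii), the sketch below lifts one JSON record into N-Quads lines placed in a per-country named graph. It is a hand-written stand-in, not the RML rules executed by Chimera: the record keys id and title, the base IRI and the graph IRI are assumptions made for the example, and only publicationURL comes from the text.

```python
import json
from typing import List

DCT = "http://purl.org/dc/terms/"
DCAT = "http://www.w3.org/ns/dcat#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value: str) -> str:
    """Quote a plain string literal, reusing JSON escaping for quotes and newlines."""
    return json.dumps(value, ensure_ascii=False)


def lift(record: dict, base: str, graph: str) -> List[str]:
    """Turn one JSON metadata record into N-Quads lines in the given graph."""
    subject = f"<{base}{record['id']}>"  # "id" is a hypothetical key
    g = f"<{graph}>"
    quads = [f"{subject} <{RDF_TYPE}> <{DCAT}Dataset> {g} ."]
    if "title" in record:  # "title" is a hypothetical key
        quads.append(f"{subject} <{DCT}title> {literal(record['title'])} {g} .")
    if "publicationURL" in record:
        quads.append(
            f"{subject} <{DCAT}landingPage> <{record['publicationURL']}> {g} ."
        )
    return quads


quads = lift(
    {"id": "42", "title": "Timetable", "publicationURL": "https://example.org/d.zip"},
    base="https://example.org/asset/",
    graph="https://example.org/graph/nl",
)
```

Writing each country into its own graph makes it possible to replace the metadata of one NAP on the next run without touching local assets or the other NAPs.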


3.2   Accessing metadata from the Asset Manager

The Asset Manager arranges assets in categories according to so-called asset
types. In the considered scenario, we mapped the items coming from the NAPs
metadata ingestion pipelines to the journey planning asset type, which can
be used to describe either datasets containing timetables or services providing
timetables. Whenever a user asks to view the list of journey planning
assets, the Asset Manager performs a single SPARQL query¹¹ to retrieve the
basic information about each published asset.
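
The idea of wrapping such a query behind an API can be sketched as follows. The query below is not the one used by the Asset Manager: the use of dct:type to carry the asset type, the assumption that every asset (local or remote) lives in a named graph, and the endpoint and asset type IRIs are all hypothetical. The request is built according to the SPARQL 1.1 Protocol, which accepts the query in the query parameter of a GET request.

```python
from urllib.parse import urlencode

# Hypothetical query: lists the assets of one type together with their graph.
QUERY_TEMPLATE = """\
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?asset ?title ?graph WHERE {
  GRAPH ?graph {
    ?asset a dcat:Dataset ;
           dct:type <%(asset_type)s> ;
           dct:title ?title .
  }
}
ORDER BY ?title
"""


def list_assets_request(endpoint: str, asset_type_iri: str) -> str:
    """Build the GET URL that asks a SPARQL endpoint for assets of one type."""
    query = QUERY_TEMPLATE % {"asset_type": asset_type_iri}
    return endpoint + "?" + urlencode({"query": query})


url = list_assets_request(
    "https://example.org/sparql",                 # hypothetical endpoint
    "https://example.org/type/journey-planning",  # hypothetical asset type IRI
)
```

Because the graph is returned as a variable rather than fixed in the query, the same request covers local and remote assets, and a client can still tell them apart from the value of ?graph.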
    As can be seen in Figure 3, when the NAPs metadata ingestion pipeline is
activated, the Asset Manager starts showing both local assets and assets coming
from National Access Points. This enables users to browse through the consolidated
list of assets and to search for the most interesting ones. Moreover,
information retrieval can be automated by exploiting the exposed Exploration
API. As a result, this solution can encourage and facilitate the creation of multimodal
mobility packages by providing standardised access to information coming
from several metadata sources.

Fig. 3: Visualisation of both local and remote NAP assets in the Asset Manager.

 ⁹ The developed RML mappings are available at
   https://github.com/cefriel/nap-harmonisation/tree/main/rml
 ¹⁰ The configuration of the ingestion service is available at
   https://github.com/cefriel/nap-harmonisation/blob/main/chimera-route/camel-context.xml
 ¹¹ The query is available at
   https://github.com/cefriel/nap-harmonisation/blob/main/asset-manager/query-visualise-assets.sparql

4   Conclusions and Future Work
The general availability of the National Access Points throughout Europe will
improve interoperability in the transportation domain, as it will force all actors
to provide data according to the Transmodel-based specifications dictated by
the regulators. We demonstrated that converging to a common set of metadata
(such as the one proposed in the Coordinated Metadata Catalogue initiative)
enables the possibility to treat the entire network of NAPs as a source of trusted
data and metadata which can facilitate the planning of cross-border travel offers.
    The integration of remote metadata providers (such as the NAPs) in the IT
systems of a transport operator is an operation which must carefully follow the
data quality assurance and the information lifecycle processes defined inside the
company. As future work, we will investigate how to integrate the detection of
changes in the metadata acquired from NAPs inside the lifecycle processes of
other assets managed by the Asset Manager. Since NAPs will become the au-
thoritative source of information in the transportation domain, it is important
to promptly detect the availability of new versions of a remote asset used internally
by the company (through the Asset Manager), and to notify the owners of the
dependent applications so that they can check their functionalities and prevent errors.
    Although based on a declarative approach, our solution exploits an external
integration engine to perform the actual calling of the API provided by the NAPs.
We will investigate the recent support introduced in RML for Web APIs [1] as an
alternative solution to define mappings for different NAPs endpoints in a fully
declarative way.

Acknowledgments

The presented research was partially supported by the SPRINT project (Grant
Agreement 826172) and the RIDE2RAIL project (Grant Agreement 881825),
co-funded by the European Commission under the Horizon 2020 Framework
Programme.


References
1. Assche, D.V., Haesendonck, G., Mulder, G.D., Delva, T., Heyvaert, P., Meester,
   B.D., Dimou, A.: Leveraging web of things W3C recommendations for knowledge
   graphs generation. In: Web Engineering - 21st International Conference, ICWE 2021
   Proceedings. vol. 12706, pp. 337–352. Springer (2021).
   https://doi.org/10.1007/978-3-030-74296-6_26
2. Carenini, A., et al.: SPRINT project Deliverable D2.3 – Requirements for an IF
   architectural design (F-REL) (2020), http://sprint-transport.eu/
3. Dimou, A., Sande, M.V., Colpaert, P., Verborgh, R., Mannens, E., de Walle, R.V.:
   RML: A generic language for integrated RDF mappings of heterogeneous data.
   In: Proceedings of the Workshop on Linked Data on the Web co-located with the
   23rd International World Wide Web Conference (WWW 2014), Seoul, Korea, April
   8, 2014. CEUR Workshop Proceedings, vol. 1184. CEUR-WS.org (2014),
   http://ceur-ws.org/Vol-1184/ldow2014_paper_01.pdf
4. Gendre, P., Denis, Y., Duquesne, C., Bouziane, Z., Bouree, K., Dezou, L., Lemettais,
   O.: CHOUETTE an open source software for PT reference data exchange. In: 8th
   European ITS Congress Lyon (2011)
5. Mylonas, C., Mitsakis, E., Dolianitis, A., Aifadopoulou, G.: A review of
   european national access points for intelligent transport systems data. In:
   23rd IEEE International Conference on Intelligent Transportation Systems,
   ITSC 2020, Rhodes, Greece, September 20-23, 2020. pp. 1–8. IEEE (2020).
   https://doi.org/10.1109/ITSC45102.2020.9294463
6. Scrocca, M., Comerio, M., Carenini, A., Celino, I.: Turning transport data to comply
   with EU standards while enabling a multimodal transport knowledge graph. In:
   Proceedings of the 19th International Semantic Web Conference. vol. 12507, pp.
   411–429. Springer (2020). https://doi.org/10.1007/978-3-030-62466-8_26
7. Van Nuffelen, B.: DCAT Application Profile for data portals in europe
   (DCAT-AP) v2.0.1. Tech. rep., SEMIC (2020),
   https://joinup.ec.europa.eu/collection/semantic-interoperability-community-semic/solution/dcat-application-profile-data-portals-europe/release/201-0