<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>UNiCS: The open data platform for Research and Innovation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xavi Gimenez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Mosca</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fernando Roda</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bernardo Rondelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guillem Rull</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>SIRIS Lab, Research Division of SIRIS Academic</institution>
          ,
          <addr-line>Barcelona</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper introduces UNiCS, the open platform for Research and Innovation data management. The UNiCS platform follows the Ontology-Based Data Access (OBDA) approach, which eases access to a vast amount of heterogeneous data and lets end users formulate queries using terms from the knowledge domain in which they are experts. In UNiCS, each query is transformed into a set of optimised queries over the different data sources. Moreover, the OBDA approach makes the semantics of the data explicit, thus offering an intuitive way to access, explore, visualise, analyse, and post-process them.</p>
      </abstract>
      <kwd-group>
        <kwd>Research and Innovation data</kwd>
        <kwd>Ontology-mediated data management</kwd>
        <kwd>OBDA</kwd>
        <kwd>Data Access</kwd>
        <kwd>Data integration</kwd>
        <kwd>Query answering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Research and Innovation (R&amp;I) ecosystems involve data and knowledge flows across
enterprises, academia, funding institutions, public authorities and citizens. The main
problem here is that key R&amp;I data elements are currently dispersed across a
multitude of distinct, heterogeneous datasets, which are often neither in structured format
nor systematically shared. This poses serious challenges to R&amp;I decision and policy
makers who are engaged in devising proper financial and political instruments to tune
R&amp;I dynamics, and in monitoring their impact over time. For their decisions to be driven
by factual evidence, it becomes mandatory to overcome the limitations imposed by
the usage of separate data silos, and to provide meaningful, integrated access to data
with the appropriate granularity. In such a context, ontology-mediated data management
(based on Semantic Web technologies) can help bring together input and outcome
data from a variety of sources, in a (linked) open and interoperable fashion. The
paper introduces the main characteristics of the UNiCS platform, an ontology-mediated
data access and data integration platform for R&amp;I policy and decision makers. UNiCS
provides end-users with: (i) a running technology for accessing data in a way that is
conceptually consistent with their own domain knowledge; (ii) a semantically-transparent
platform, ready to acquire and be complemented with new data from different sources;
and (iii) a theoretically grounded mechanism to homogenise information stored in
different formats and according to different conceptualisations.</p>
      <p>Supported by SIRIS Academic SL.</p>
    </sec>
    <sec id="sec-2">
      <title>2 The UNiCS platform</title>
      <p>
        Since the mid 2000s, ontology-mediated data management has become a popular
approach for providing integrated, uniform access to heterogeneous data sources. The
UNiCS platform builds on top of this approach, its principles and methods: it hosts a
core ‘ontology-based data access (OBDA)’ engine [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which is fed with an R&amp;I
ontology and mappings that support the data integration and data access functionalities
the platform offers. UNiCS goes further, however: a number of dedicated data
visualisations are implemented in its front-end and directly connected to the OBDA engine,
from which the data are retrieved via suitable SPARQL queries. As one
would expect, the ontology, the mappings and the visualisations are fully customisable
on a project-by-project basis, according to the specific information needs, as well as
reporting, analytical and communication goals of the end-users.
      </p>
      <p>The overall architecture of the UNiCS system is shown in Fig. 1, where the data
sources considered are the prototypical ones taken into account by R&amp;I decision
and policy makers: multidimensional data produced by governmental authorities, data
on the Higher Education and Research (HE&amp;R) sector, unstructured data coming from internal repositories, and
proprietary data (most of the time the result of ad-hoc analyses performed in house).
In UNiCS, a conceptual layer is given in the form of an ontology, which captures
knowledge about the R&amp;I domain, and provides a high-level conceptual view of the
underlying data sources in terms of a shared terminology. The UNiCS ontology is connected to
the data sources through a declarative specification, given in terms of mappings that
relate entities in the ontology (classes and properties) to (SQL) views over data: users can
then query the data sources using the shared ontology terminology, without needing to
understand the precise structure of the sources, the relations between them, or the
encoding of the data. Making use of the mappings, UNiCS translates the user queries
into SQL queries formulated over the sources, while at the same time exploiting the
domain knowledge encoded in the ontology to overcome incompleteness in the data and
enrich query answers.</p>
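      <p>As a minimal sketch of the mapping-based translation described above, the following Python fragment unfolds an ontology-level query into a single SQL query and runs it on an in-memory SQLite database. The ontology terms (University, coordinates), the tables and the data are hypothetical and chosen only for illustration; the sketch does not reproduce the actual UNiCS mappings or the translation performed by the platform.</p>
      <preformat><![CDATA[
```python
import sqlite3

# Hypothetical mappings: each ontology term is tied to a SQL view over a source.
# Term, table and column names are illustrative, not taken from UNiCS.
MAPPINGS = {
    "University": "SELECT id AS s FROM institutions WHERE kind = 'university'",
    "coordinates": "SELECT coordinator_id AS s, project_id AS o FROM projects",
}


def unfold(class_term: str, property_term: str) -> str:
    """Translate the ontology-level query
    'x is a class_term and x property_term y' into one SQL query
    by joining the views that the mappings attach to both terms."""
    return (
        "SELECT c.s AS x, p.o AS y "
        f"FROM ({MAPPINGS[class_term]}) AS c "
        f"JOIN ({MAPPINGS[property_term]}) AS p ON p.s = c.s "
        "ORDER BY x, y"
    )


def demo() -> list:
    """Run the unfolded query over a toy source database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE institutions (id TEXT, kind TEXT);
        CREATE TABLE projects (project_id TEXT, coordinator_id TEXT);
        INSERT INTO institutions VALUES ('u1', 'university'), ('c1', 'company');
        INSERT INTO projects VALUES ('p1', 'u1'), ('p2', 'c1'), ('p3', 'u1');
        """
    )
    rows = conn.execute(unfold("University", "coordinates")).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(unfold("University", "coordinates"))
    print(demo())
```
]]></preformat>
      <p>In this sketch the caller only names ontology terms; the table and column names stay confined to the mappings, which is the separation the declarative specification is meant to provide.</p>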
      <p>
        The ontology and the mappings are the domain-dependent components of the
platform, whereas the OBDA principles themselves are implemented through Ontop [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6">3,4,5,6</xref>
        ],
a mature open-source system for querying relational databases as Virtual RDF Graphs
via query rewriting [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Ontop is responsible for translating the SPARQL user
queries into SQL queries over the data sources, making optimised use of the axioms
in the ontology, the mappings and the statistics about the sources themselves. Its virtual
approach avoids the cost of materialisation, and it allows one to profit from the maturity
of relational database systems<sup>1</sup>.
      </p>
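      <p>The role of the ontology axioms in query rewriting can be illustrated, under strong simplifications, as follows. The sketch handles only subclass axioms and expands a class query into a UNION of the mapped views, so that answers are computed directly over the sources and nothing is materialised. It is not the algorithm implemented by the OBDA engine, and all class, table and instance names are hypothetical.</p>
      <preformat><![CDATA[
```python
import sqlite3

# Hypothetical ontology axioms, read as "key is a subclass of value".
SUBCLASS_OF = {
    "University": "HigherEducationInstitution",
    "Polytechnic": "HigherEducationInstitution",
    "HigherEducationInstitution": "Organisation",
}

# Hypothetical mappings from classes to SQL views. Note that no source
# lists the instances of HigherEducationInstitution directly.
MAPPINGS = {
    "University": "SELECT name FROM universities",
    "Polytechnic": "SELECT name FROM polytechnics",
    "Organisation": "SELECT name FROM registry",
}


def subclasses(cls: str) -> list:
    """Return cls followed by every class the axioms place below it."""
    found = [cls]
    for sub, sup in SUBCLASS_OF.items():
        if sup == cls:
            found.extend(subclasses(sub))
    return found


def rewrite(cls: str) -> str:
    """Rewrite 'all instances of cls' into a UNION over the mapped views."""
    views = [MAPPINGS[c] for c in subclasses(cls) if c in MAPPINGS]
    if not views:
        raise ValueError(f"no mapping reaches class {cls!r}")
    return " UNION ".join(views) + " ORDER BY name"


def make_sources() -> sqlite3.Connection:
    """Create a toy relational source with three unrelated tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE universities (name TEXT);
        CREATE TABLE polytechnics (name TEXT);
        CREATE TABLE registry (name TEXT);
        INSERT INTO universities VALUES ('UniA');
        INSERT INTO polytechnics VALUES ('PolyB');
        INSERT INTO registry VALUES ('FirmC');
        """
    )
    return conn


def answers(conn: sqlite3.Connection, cls: str) -> list:
    """Evaluate the rewritten query directly over the sources."""
    return [row[0] for row in conn.execute(rewrite(cls))]


if __name__ == "__main__":
    sources = make_sources()
    print(rewrite("HigherEducationInstitution"))
    print(answers(sources, "HigherEducationInstitution"))
```
]]></preformat>
      <p>Although no table stores higher education institutions as such, the subclass axioms let the query return the instances of both subclasses, which is a small-scale instance of using domain knowledge to compensate for incompleteness in the data.</p>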
      <p><sup>1</sup> Ontop supports the virtual approach with all major RDBMSs (e.g., Oracle, IBM DB2,
Microsoft SQL Server, PostgreSQL, and MySQL).</p>
      <p>In its front-end, UNiCS comes equipped with interactive data visualisations and a
full-fledged data access point based on SPARQL query answering. The UNiCS
interactive visualisations are always designed together with the end-users of the platform by
means of participatory and co-design methods. They offer an overview of the data plus
tools for drilling down into the details or filtering them according to a selected set of
specific dimensions<sup>2</sup>. Pop-up windows also display additional information
that is not originally provided by the visual representation of the data. The users of
UNiCS can always download the data behind each visualisation, or copy the
SPARQL queries that generate it and execute them, possibly
modified according to new specific needs.</p>
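      <p>A query copied from a visualisation could be submitted to a SPARQL endpoint along the following lines. The sketch relies on the standard SPARQL 1.1 Protocol (a GET request with a query parameter) and on the SPARQL JSON results format; the endpoint address, the ex: vocabulary and the sample response are placeholders, not the actual UNiCS endpoint, ontology or data.</p>
      <preformat><![CDATA[
```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address: substitute the endpoint of the platform being queried.
ENDPOINT = "http://example.org/sparql"

# Illustrative query; the ex: vocabulary is hypothetical.
QUERY = """
PREFIX ex: <http://example.org/ontology#>
SELECT ?org ?name WHERE {
  ?org a ex:University ;
       ex:name ?name .
}
LIMIT 10
"""

# Hand-written response in the SPARQL JSON results format, for illustration.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["org", "name"]},
    "results": {"bindings": [
        {"org": {"type": "uri", "value": "http://example.org/u1"},
         "name": {"type": "literal", "value": "Example University"}},
    ]},
})


def build_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(payload: str) -> list:
    """Flatten a SPARQL JSON results document into plain dictionaries."""
    doc = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


if __name__ == "__main__":
    request = build_request(ENDPOINT, QUERY)
    print(request.full_url)
    # Sending the request would be done with urllib.request.urlopen(request);
    # here the sample response stands in for the reply of the endpoint.
    print(bindings_to_rows(SAMPLE_RESPONSE))
```
]]></preformat>
      <p>Editing the query text before building the request corresponds to the case in which a copied query is modified according to new information needs.</p>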
      <p>To conclude, the following selected applications have been
developed on top of the UNiCS platform during the last two years<sup>3</sup>.</p>
      <p>UNiCS<sup>4</sup> is the Open Data
platform that integrates open data about Higher Education, Research and Innovation
(HERI) in Europe. The platform repository is constantly updated by members of the
SIRIS Lab, and it is fully accessible via its online SPARQL endpoint.</p>
      <p>ToscanaOpenResearch<sup>5</sup> is the Toscana regional observatory for research and innovation: a web portal
based on open and proprietary data, interactive visualisations and storytelling sections,
whose main aim is to promote more transparent, evidence-based and inclusive
governance of the HERI ecosystems of the region.</p>
      <p>Smart Manufacturing Web<sup>6</sup> is a tool for
discovering academic actors carrying out research on specific themes linked to
Industry 4.0 in the region of Toscana (Italy). This exploratory web tool combines
the results of topic modelling algorithms applied to unstructured
data (mostly abstracts of scientific papers, projects and patents) with the data already
integrated in the UNiCS platform.</p>
      <p>RIS3-MCAT<sup>7</sup> is an Open Government, open source
platform, with interactive visualisations summarising S3 (‘Smart Specialisation’)
activities in the region of Catalonia (Spain). It covers: (i) policy instruments and funding;
(ii) specialisation priorities and topics; and (iii) the geographic and temporal distribution of
activities. Its supported functionality includes a fully searchable and filterable tool for exploring the collaboration
networks between Catalan actors, which provides automatically computed indicators.</p>
      <p>The Research Information System (RIS) of the University Paris
Sciences et Lettres<sup>8</sup>, a key institution in the Parisian HERI landscape, has
been developed to integrate distributed and heterogeneous data sources in order to (i)
inform top-level strategic decision-making during a period of radical change
in the French HERI system, (ii) open up the university to other quadruple
helix actors by providing a detailed research portfolio, and (iii) generally increase the
availability of pertinent data to mid-level management and individual researchers,
fostering a culture change towards data use<sup>9</sup>.</p>
      <p><sup>2</sup> As data are often multidimensional, alternative visualisations are provided using well-known
techniques such as linking and brushing, and progressive (sequential) disclosure.</p>
      <p><sup>3</sup> Temporary credentials, whenever requested: user semantics2018, password semantics2018.</p>
      <p><sup>4</sup> http://unics.cloud</p>
      <p><sup>5</sup> http://toscanaopenresearch.it (in Italian)</p>
      <p><sup>6</sup> http://unics.cloud/toscana-smart-manufacturing</p>
      <p><sup>7</sup> http://unics.cloud/ris3/mcat (in Catalan)</p>
      <p><sup>8</sup> http://unics.cloud/psl</p>
      <p><sup>9</sup> All the applications above provide open access to the underlying open datasets according to
the well-known ‘5* level’ of the Open Data vision.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Giacomo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poggi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodríguez-Muro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Ontologies and databases: The DL-Lite approach</article-title>
          . In:
          <string-name>
            <surname>Tessaris</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franconi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (eds.)
          <source>Reasoning Web. Semantic Technologies for Information Systems - 5th Int. Summer School Tutorial Lectures (RW), Lecture Notes in Computer Science</source>
          , vol.
          <volume>5689</volume>
          , pp.
          <fpage>255</fpage>
          -
          <lpage>356</lpage>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Giacomo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Tractable reasoning and efficient query answering in description logics: The DL-Lite family</article-title>
          .
          <source>J. of Automated Reasoning</source>
          <volume>39</volume>
          (
          <issue>3</issue>
          ),
          <fpage>385</fpage>
          -
          <lpage>429</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giese</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hovland</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rezk</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ontology-based integration of cross-linked datasets</article-title>
          . In:
          <source>Proc. of the 14th Int. Semantic Web Conf. (ISWC). Lecture Notes in Computer Science</source>
          , vol.
          <volume>9366</volume>
          , pp.
          <fpage>199</fpage>
          -
          <lpage>216</lpage>
          . Springer (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Giese</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soylu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vega-Gorgojo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Waaler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haase</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiménez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lanti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rezk</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiao</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Özçep</surname>
            ,
            <given-names>Ö.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Optique: Zooming in on Big Data</article-title>
          .
          <source>IEEE Computer</source>
          <volume>48</volume>
          (
          <issue>3</issue>
          )
          ,
          <fpage>60</fpage>
          -
          <lpage>67</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Mosca</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rondelli</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rull</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The OBDA-based “observatory of research and innovation” of the Tuscany region</article-title>
          . In:
          <source>Proc. of the Joint Ontology Workshops 2017 Episode 3: The Tyrolean Autumn of Ontology, Bozen-Bolzano, Italy, September 21-23, 2017. CEUR Workshop Proceedings</source>
          , vol.
          <volume>2050</volume>
          . CEUR-WS.org (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Rodriguez-Muro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ontology-based data access: Ontop of databases</article-title>
          . In:
          <source>Proc. of the 12th Int. Semantic Web Conf. (ISWC)</source>
          , vol.
          <volume>8218</volume>
          , pp.
          <fpage>558</fpage>
          -
          <lpage>573</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>