<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Monolith 2.0: the Semantic OBDM Knowledge Graph Platform</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lorenzo Lepore</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giacomo Ronconi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Ruzzi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Santarelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sapienza Università di Roma</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>OBDA Systems</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this demo we showcase the innovative features of MONOLITH 2.0, the Semantic Knowledge Graph platform. MONOLITH 2.0 gives access to the Ontology-based Data Management (OBDM) capabilities of the MASTRO 2.0 ontology reasoner, in particular to its enhanced SPARQL query answering and data quality checking features. It also provides a user-friendly environment for building SQL-based mappings from structured enterprise data to the ontology, and allows users to build Virtual Knowledge Graphs from this data or from external RDF or tabular datasets.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Ontology-based Data Management (OBDM) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] is an innovative approach to data
access, integration, and governance. The fundamental idea behind OBDM is to adopt
an ontology as a unified, semantically rich, and comprehensible model of enterprise
data, and to enable access through the ontology to such data by means of a mapping
layer, which defines the semantic correspondences between the ontology entities and
the source data. OBDM has in recent years consolidated its position in the academic
and enterprise world as an effective means to manage enterprise data [
        <xref ref-type="bibr" rid="ref1 ref6 ref9">1, 6, 9</xref>
        ].
      </p>
      <p>
        Knowledge Graphs (KGs) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] are data models that use a graph structure, built on
nodes and edges, to represent enterprise data and to highlight the relationships between
the data entities. When it adopts the terms of a domain ontology, the KG acquires
semantics that enrich the representation of these relationships. The graph
model is extremely flexible and, like the OBDM approach, is extensible, can be
applied to any real-world use case, and abstracts from the organization of the data
in the underlying data stores.
      </p>
      <p>
        In this demo<sup>1</sup> we present MONOLITH 2.0, the latest major release of the
MONOLITH OBDM platform [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], focusing on its newest features and
innovations. In particular, we will take a look under the hood at the fully revamped
query answering engine of the MASTRO 2.0 system, and we will introduce the enhanced
SPARQL query interface and the new data quality checking environment. Together with
MONOLITH 2.0’s interactive visual inspection environment for
GRAPHOL [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] ontologies and its fully SQL-based editing environment for the
ontology mappings, these features provide a full-fledged environment for all OBDM-related data
management services. Furthermore, MONOLITH 2.0’s dedicated KG section has been
upgraded to provide all the functionalities needed to build RDF datasets from
structured enterprise data through MASTRO 2.0, and to combine these datasets with external
ones, which can be natively in RDF form or can be tabular datasets that MONOLITH
2.0 transforms into RDF statements through a SPARQL interface.
        <fn><label>1</label><p>Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons
License Attribution 4.0 International (CC BY 4.0).</p></fn>
      </p>
      <p>MONOLITH 2.0 and MASTRO 2.0 are developed by OBDA Systems, a start-up of
Sapienza University of Rome.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Monolith 2.0’s main new features and improvements</title>
      <p>
        In this section we present the main novel features and improvements of MONOLITH 2.0,
starting with the new version of its underlying query answering engine, MASTRO 2.0. In
a nutshell, MASTRO supports data access through OWL 2 ontologies, specifically those in the OWL 2 QL
profile, by leveraging a mapping layer which consists of a set of views
over the database and mapping assertions [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] which associate ontology elements with
such views. Crucially, while also supporting the W3C standard R2RML [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] mapping
syntax, MASTRO 2.0 provides a fully SQL-based proprietary mapping syntax, which
greatly reduces the learning curve of mapping design in real-world scenarios, where IT
experts are typically fluent in SQL but not in R2RML. OBDM services such as query
answering and data quality checking are carried out in MASTRO through a very efficient
technique that reduces them, via query rewriting [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], to standard SQL query evaluation.
In essence, the SPARQL user query is reformulated first with respect to the ontology
and then with respect to the mappings, producing a new query that encodes this reasoning
and can be directly executed on the relational data sources.
      </p>
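      <p>The two-step reformulation described above can be illustrated with a deliberately small sketch. It is not MASTRO's algorithm: it handles a single class atom and only subclass axioms, and the ontology, the mapping views, the table names, and the function names are all hypothetical.</p>
      <preformat><![CDATA[
```python
# Toy illustration of two-step query rewriting (ontology, then mappings).
# Every name below (SUBCLASS_OF, MAPPINGS, the tables) is hypothetical.

# Ontology: subclass axioms of the form  sub -> super.
SUBCLASS_OF = {
    "Student": "Person",
    "Professor": "Person",
}

# Mappings: each class is populated by SQL views returning one column "id".
MAPPINGS = {
    "Person": ["SELECT p_id AS id FROM registry"],
    "Student": ["SELECT s_id AS id FROM enrolment"],
    "Professor": ["SELECT e_id AS id FROM staff WHERE role = 'prof'"],
}


def ontology_rewriting(cls: str) -> list[str]:
    """Return cls plus every class the ontology entails to be a subclass of it."""
    result = [cls]
    changed = True
    while changed:
        changed = False
        for sub, sup in SUBCLASS_OF.items():
            if sup in result and sub not in result:
                result.append(sub)
                changed = True
    return result


def mapping_rewriting(classes: list[str]) -> str:
    """Unfold each class into its SQL views and combine them with UNION."""
    views = [view for c in classes for view in MAPPINGS.get(c, [])]
    return "\nUNION\n".join(views)


def rewrite(cls: str) -> str:
    """Rewrite the query 'SELECT ?x WHERE { ?x a :cls }' into SQL."""
    return mapping_rewriting(ontology_rewriting(cls))


if __name__ == "__main__":
    print(rewrite("Person"))
```
]]></preformat>
      <p>In this sketch, asking for instances of Person yields a union over the views of Person, Student, and Professor, so the reasoning is encoded in the SQL query itself and no inference is needed at evaluation time.</p>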
      <p>
        The development of MASTRO 2.0 has primarily focused on extending the
fragment of SPARQL 1.1 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] supported by earlier versions of MASTRO (which
was limited to the conjunctive query fragment of SPARQL), specifically aiming for
full support of SPARQL 1.1’s graph patterns, aggregates, functions, and solution
modifiers<sup>2</sup>. This was achieved by restructuring the core SPARQL-to-SQL translation
process and by enhancing MASTRO 2.0’s query-time optimization features to maintain very
good performance in terms of reasoning time. Intuitively, MASTRO’s
query answering algorithm now features a first step in which the conjunctive query
fragments of the SPARQL query (called cores) are identified and queued for rewriting
using MASTRO’s two-step query reformulation algorithm (the ontology rewriting followed by
the mapping rewriting). The final SQL code is then obtained by compiling all parts of
the query, with the aid of an intermediate relational algebra-based query language. In
general, MASTRO’s query answering performance depends almost entirely on the
underlying database and on the complexity of the provided SPARQL query: in other
words, MASTRO’s SPARQL-to-SQL query processing time is almost negligible
compared to the DBMS query evaluation
time. Moreover, parallelizing the query rewriting of the individual SPARQL cores allows
MASTRO 2.0 to process more complex SPARQL queries in almost the same time as
simpler SPARQL conjunctive queries.
      </p>
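      <p>The parallel rewriting of cores can be pictured as follows. This is a minimal sketch, assuming that cores are independent strings and that a hypothetical <monospace>rewrite_core</monospace> function stands in for the two-step reformulation. How MASTRO 2.0 actually schedules cores and compiles the final SQL is not specified in this paper.</p>
      <preformat><![CDATA[
```python
from concurrent.futures import ThreadPoolExecutor


def rewrite_core(core: str) -> str:
    """Hypothetical stand-in for the per-core reformulation.

    A real implementation would apply the ontology rewriting and then the
    mapping rewriting; here the core is only recorded in a SQL comment.
    """
    return f"SELECT * FROM rewritten_view /* core: {core} */"


def rewrite_cores_in_parallel(cores: list[str]) -> list[str]:
    """Rewrite every conjunctive core concurrently, preserving input order."""
    if not cores:
        return []
    with ThreadPoolExecutor(max_workers=min(8, len(cores))) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(rewrite_core, cores))


def compile_query(sql_cores: list[str], combinator: str = "UNION") -> str:
    """Assemble the rewritten cores into one SQL statement.

    A real compiler would follow the SPARQL algebra (OPTIONAL, FILTER,
    aggregates, and so on); a single combinator is a simplification.
    """
    return f"\n{combinator}\n".join(f"({sql})" for sql in sql_cores)


if __name__ == "__main__":
    rewritten = rewrite_cores_in_parallel(["?x a :Person", "?x :worksFor ?y"])
    print(compile_query(rewritten))
```
]]></preformat>
      <p>Because each core is rewritten independently, the elapsed rewriting time in such a design is bounded by the slowest core rather than by the sum over all cores.</p>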
      <p>Furthermore, we have widened the range of MASTRO 2.0’s database management
system connectors to include not only the market leaders among traditional RDBMSs,
but also connections to data stored in Apache Hadoop through Apache Impala,
and to data virtualization and federation systems, such as Denodo. For each supported
DBMS, MASTRO features a specifically tailored SQL dialect, so that the final
SQL query complies with the chosen system.</p>
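      <p>To see why a per-system dialect is needed, consider row limiting, which is spelled differently across systems. The sketch below is only an illustration: the dialect identifiers and the function name are assumptions, and it does not reflect how MASTRO represents its dialects.</p>
      <preformat><![CDATA[
```python
def limit_query(select_body: str, n: int, dialect: str) -> str:
    """Render 'SELECT select_body' limited to n rows for the given dialect.

    The dialect identifiers are hypothetical labels chosen for this sketch.
    """
    if dialect in ("postgresql", "mysql", "impala"):
        return f"SELECT {select_body} LIMIT {n}"
    if dialect == "sqlserver":
        return f"SELECT TOP {n} {select_body}"
    if dialect == "oracle":
        # Row limiting clause available from Oracle Database 12c onwards.
        return f"SELECT {select_body} FETCH FIRST {n} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect}")


if __name__ == "__main__":
    for name in ("postgresql", "sqlserver", "oracle"):
        print(limit_query("id FROM person", 10, name))
```
]]></preformat>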
      <p>MONOLITH 2.0 features a new version of the SPARQL endpoint, where users can
query the ontology through MASTRO 2.0. The SPARQL endpoint now lets users choose
between three query execution modes: a standard mode, which outputs the query results
to the interface and is coupled with an answer buffer to limit the number of
produced results; a file streaming mode, which streams the results directly to a
chosen output file; and a result count mode, which runs the query in the background and produces
the result count. These execution modes are designed to handle scenarios in which the
user wants to inspect a portion of the query results directly in MONOLITH 2.0’s
interface, or in which large volumes of data are extracted and streamed directly
to a physical file. MONOLITH 2.0 provides different export options for both the standard
and the file streaming execution modes, including CSV, JSON, XML, RDF, and PowerBI
(.pbids) formats. The SPARQL Query Catalog has also been enriched with a query
tagging system, in which queries in the catalog can be easily classified and then searched
for through user-defined tags.</p>
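      <p>The difference between the three execution modes can be sketched over a generic stream of result rows. The function names and the row representation below are assumptions made for illustration, not MONOLITH's API. A real result count mode might well delegate the counting to the DBMS instead of consuming the rows.</p>
      <preformat><![CDATA[
```python
import csv
from itertools import islice
from typing import Iterable, Sequence

Row = Sequence[str]


def run_standard(rows: Iterable[Row], buffer_size: int) -> list[Row]:
    """Standard mode: keep at most buffer_size rows for display."""
    return list(islice(rows, buffer_size))


def run_file_streaming(rows: Iterable[Row], header: Row, path: str) -> int:
    """File streaming mode: write rows to a CSV file one at a time.

    Rows are never accumulated in memory; the number written is returned.
    """
    written = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in rows:
            writer.writerow(row)
            written += 1
    return written


def run_count(rows: Iterable[Row]) -> int:
    """Result count mode: consume the results and report only their number."""
    return sum(1 for _ in rows)
```
]]></preformat>
      <p>With a lazily produced result stream, the standard mode stops reading after the buffer is full, whereas the other two modes traverse the whole stream without holding it in memory.</p>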
      <p>Finally, MONOLITH 2.0 introduces a new Data Quality section, where users
leverage MASTRO 2.0’s ability to automatically identify and extract data quality rules (or
data integrity constraints) from the OWL 2 ontology and translate them into SPARQL
queries. MASTRO 2.0 currently supports the following constraints: class disjointness
constraints, property functionality and universal participation constraints, cardinality
constraints, and participation constraints. Each such constraint type is processed by
MASTRO 2.0 to produce a specific kind of SPARQL query, designed so that
any data extracted by the query can be interpreted as a violation of the ontology data
integrity rule. In MONOLITH 2.0’s interface, users are given a preview of these results
for each constraint, along with the query plan details to understand the provenance of
the violation. The Data Quality verification process essentially consists of building a
set of constraints to check: the user selects one or more integrity constraints for each
constraint type to schedule for verification, and then assigns a priority level to each
constraint. Once the execution of each constraint is complete, MONOLITH 2.0 allows
users to save the execution to a history log, which provides the results for each query and a
representation of the aggregate results, based on priority and/or constraint type, through
charts and graphs.
        <fn><label>2</label><p>See http://www.monolith.obdasystems.com/monolith-user-manual/#Mastro-SPARQL for a complete list of supported SPARQL 1.1 operators.</p></fn>
      </p>
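      <p>The idea of turning a constraint into a query whose answers are violations can be sketched for three of the constraint types listed above. These templates are plausible SPARQL 1.1 formulations written for illustration, not the queries MASTRO 2.0 generates. Prefix declarations are omitted, and the function names and the prefixed names used in the example are hypothetical.</p>
      <preformat><![CDATA[
```python
def disjointness_violations(class_a: str, class_b: str) -> str:
    """Individuals asserted to belong to two disjoint classes."""
    return f"SELECT ?x WHERE {{ ?x a {class_a} . ?x a {class_b} }}"


def functionality_violations(prop: str) -> str:
    """Subjects with two distinct values for a functional property."""
    return (
        f"SELECT DISTINCT ?x WHERE {{ ?x {prop} ?y1 . ?x {prop} ?y2 . "
        "FILTER(?y1 != ?y2) }"
    )


def participation_violations(cls: str, prop: str) -> str:
    """Instances of cls that lack any value for a mandatory property."""
    return (
        f"SELECT ?x WHERE {{ ?x a {cls} . "
        f"FILTER NOT EXISTS {{ ?x {prop} ?y }} }}"
    )


if __name__ == "__main__":
    print(disjointness_violations(":Student", ":Course"))
    print(functionality_violations(":hasTaxCode"))
    print(participation_violations(":Person", ":hasName"))
```
]]></preformat>
      <p>An empty answer to such a query means that no violation of the corresponding constraint was found in the data, while every returned binding identifies an offending individual.</p>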
    </sec>
    <sec id="sec-3">
      <title>3 Application scenarios and Demo Session Overview</title>
      <p>MONOLITH 2.0 is currently commercially distributed by OBDA Systems, and is
being used in various OBDM-related projects, in particular with clients from the Italian
public administration sector. The most common application scenarios are projects in
which data from different business units is modelled in an ontology, thereby
allowing for integrated data access and data quality verification processes, and projects in
which MONOLITH 2.0 and MASTRO 2.0 are used to produce Linked Open Data (LOD)
datasets. These LOD datasets can be obtained either by the conversion (or triplification)
of structured data (typically CSV, XML, or JSON files) into RDF datasets by MASTRO
2.0, or by extracting new datasets through MONOLITH 2.0 from legacy data stores.</p>
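      <p>As a rough picture of what triplification of a tabular file involves, the following sketch converts CSV rows into N-Triples lines using only the Python standard library. It is a generic illustration under stated assumptions (a hypothetical namespace, one key column, all values treated as plain string literals) and does not describe how MONOLITH 2.0 or MASTRO 2.0 perform the conversion.</p>
      <preformat><![CDATA[
```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace


def escape_literal(value: str) -> str:
    """Escape characters that N-Triples string literals cannot contain raw."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def triplify(csv_text: str, key_column: str, base: str = BASE) -> list[str]:
    """Turn each CSV row into N-Triples lines, one triple per non-key cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base}resource/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            # Skip the key itself, surplus cells (column is None), and blanks.
            if column is None or column == key_column or not value:
                continue
            predicate = f"<{base}property/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    sample = "id,name,city\n1,Alice,Rome\n2,Bob,\n"
    print("\n".join(triplify(sample, "id")))
```
]]></preformat>
      <p>Each row becomes a resource identified by its key, and each non-empty cell becomes one statement about that resource, which is the basic shape that any ontology-aware conversion then refines with proper classes, properties, and datatypes.</p>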
      <p>
        During the demo, participants will be able to see MONOLITH 2.0 in action on one
of the specifications used in these projects, specifically the SIR (System of Integrated
Registers) ontology, which is being built in a joint project between OBDA Systems
and the Italian National Institute of Statistics (ISTAT). This ontology integrates
information from statistical censuses regarding, among others, demographic, territorial, and
public administration data. Another specification that will be featured in the demo is
the Movie Ontology [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which provides a vocabulary to semantically describe movie-related
concepts. Attendees will interact with MONOLITH 2.0 in the
above scenarios and will be introduced to its main new features.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>N.</given-names>
            <surname>Antonioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Castanò</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Civili</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Coletta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Grossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Savo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Virardi</surname>
          </string-name>
          .
          <article-title>Ontology-based data access: The experience at the Italian Department of Treasury</article-title>
          .
          <source>In Proc. of CAISE</source>
          <year>2013</year>
          , volume
          <volume>1017</volume>
          <source>of CEUR Workshop Proceedings</source>
          , pages
          <fpage>9</fpage>
          -
          <lpage>16</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Bouza</surname>
          </string-name>
          .
          <source>MO - the Movie Ontology</source>
          ,
          <year>2010</year>
          . [Online; 26. Jan. 2010].
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          , G. De Giacomo,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          .
          <article-title>Tractable reasoning and efficient query answering in description logics: The DL-Lite family</article-title>
          .
          <source>J. Autom. Reasoning</source>
          ,
          <volume>39</volume>
          (
          <issue>3</issue>
          ):
          <fpage>385</fpage>
          -
          <lpage>429</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sundara</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          .
          <article-title>R2RML: RDB to RDF mapping language</article-title>
          .
          <source>W3C Recommendation</source>
          , World Wide Web Consortium, Sept.
          <year>2012</year>
          . Available at http://www.w3.org/TR/r2rml/.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>F.</given-names>
            <surname>Di Pinto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mancini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ruzzi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Savo</surname>
          </string-name>
          .
          <article-title>Optimizing query rewriting in ontology-based data access</article-title>
          .
          <source>In Proc. of EDBT</source>
          <year>2013</year>
          , pages
          <fpage>561</fpage>
          -
          <lpage>572</lpage>
          . ACM Press,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>M.</given-names>
            <surname>Giese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Soylu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Vega-Gorgojo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Waaler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Haase</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rezk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ö. L.</given-names>
            <surname>Özçep</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          .
          <article-title>Optique: Zooming in on big data</article-title>
          .
          <source>IEEE Computer</source>
          ,
          <volume>48</volume>
          (
          <issue>3</issue>
          ):
          <fpage>60</fpage>
          -
          <lpage>67</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Gómez-Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          , G. Vetere, and
          <string-name>
            <given-names>H.</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Enterprise knowledge graph: An introduction</article-title>
          .
          <source>In Exploiting Linked Data and Knowledge Graphs in Large Organisations</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>S.</given-names>
            <surname>Harris</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Seaborne</surname>
          </string-name>
          .
          <article-title>SPARQL 1.1 query language</article-title>
          . Mar.
          <year>2013</year>
          . Available at http://www.w3.org/TR/sparql11-query.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>E.</given-names>
            <surname>Kharlamov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hovland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Skjaeveland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bilidas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Jiménez-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Soylu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rezk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zheleznyakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Giese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. E.</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kotidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Waaler</surname>
          </string-name>
          .
          <article-title>Ontology based data access in Statoil</article-title>
          .
          <source>J. Web Semant</source>
          .,
          <volume>44</volume>
          :
          <fpage>3</fpage>
          -
          <lpage>36</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pantaleone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Santarelli</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Savo</surname>
          </string-name>
          .
          <article-title>Easy OWL drawing with the Graphol visual ontology language</article-title>
          .
          <source>In Proc. of KR</source>
          , pages
          <fpage>573</fpage>
          -
          <lpage>576</lpage>
          . AAAI Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          .
          <article-title>Managing data through the lens of an ontology</article-title>
          .
          <source>AI Magazine</source>
          ,
          <volume>39</volume>
          (
          <issue>2</issue>
          ):
          <fpage>65</fpage>
          -
          <lpage>74</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>V.</given-names>
            <surname>Santarelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lepore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Namici</surname>
          </string-name>
          , G. Ronconi,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ruzzi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Savo</surname>
          </string-name>
          .
          <article-title>Monolith: an OBDM and knowledge graph management platform</article-title>
          . In M. C. Suárez-Figueroa, G. Cheng, A. L. Gentile, C. Guéret, C. M. Keet, and A. Bernstein, editors,
          <source>Proc. of ISWC 2019 Satellite Tracks</source>
          , Auckland, New Zealand, October 26-30,
          <year>2019</year>
          , volume
          <volume>2456</volume>
          <source>of CEUR Workshop Proceedings</source>
          , pages
          <fpage>173</fpage>
          -
          <lpage>176</lpage>
          . CEUR-WS.org,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>