<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>G2GML: Graph to Graph Mapping Language for Bridging RDF and Property Graphs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hirokazu Chiba</string-name>
          <email>chiba@dbcls.rois.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ryota Yamanaka</string-name>
          <email>ryota.yamanaka@oracle.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shota Matsumoto</string-name>
          <email>shota.matsumoto@lifematics.co.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Database Center for Life Science</institution>
          ,
          <addr-line>Chiba 277-0871</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Lifematics Inc.</institution>
          ,
          <addr-line>Tokyo 101-0041</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Oracle Corporation</institution>
          ,
          <addr-line>Bangkok 10500</addr-line>
          ,
          <country country="TH">Thailand</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>How can we maximize the value of accumulated RDF data? Whereas the RDF data can be queried using the SPARQL language, even the SPARQL-based operation has a limitation in implementing traversal or analytical algorithms. Recently, a variety of database implementations dedicated to analyses on the property graph (PG) model have emerged. Importing RDF datasets into these graph analysis engines provides access to the accumulated datasets through various application interfaces. However, the RDF model and the PG model are not interoperable. Here, we developed a framework based on the Graph to Graph Mapping Language (G2GML) for mapping RDF graphs to PGs to make the most of accumulated RDF data. Using this framework, accumulated graph data described in the RDF model can be converted to the PG model, which can then be loaded to graph database engines for further analysis. This study bridges RDF and PGs and contributes to interoperable management of knowledge graphs, thereby expanding the use cases of accumulated RDF data. Demonstration of the G2G mapping framework is available at https://purl.org/g2gml.</p>
      </abstract>
      <kwd-group>
        <kwd>RDF</kwd>
        <kwd>Property Graph</kwd>
        <kwd>Graph Database</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Increasing amounts of scientific and social data are being published in the form
of the Resource Description Framework (RDF), which presently constitutes a
large open data cloud. DBpedia [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Wikidata [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] are well-known examples
of such RDF datasets. SPARQL is a protocol and query language that serves
as a standardized interface for RDF data. This standardized data model and
interface enable the construction of integrated graph data. However, the lack of
an interface for graph-based analysis and performant traversal limits use cases
of the graph data.
      </p>
      <p>Copyright © 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        Recently, the property graph (PG) model [
        <xref ref-type="bibr" rid="ref3 ref4">3,4</xref>
        ] has attracted increasing
attention in the context of graph analysis. Various graph database engines,
including Neo4j [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], Oracle Database [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and Amazon Neptune [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], adopt this
model. These graph database engines support algorithms for traversing or
analyzing graphs. However, few datasets are published in the PG model and the
lack of an ecosystem for exchanging data in the PG model limits the application
of these powerful engines.
      </p>
      <p>
        In light of this situation, developing a method to transform RDF into PG
would be highly valuable. One practical issue in this challenge is
the lack of a standardized PG model. Another issue is that the transformation
between RDF and PG is not straightforward due to the differences in their
models. In RDF graphs, all information is expressed by triples (node-edge-node),
whereas in PGs, arbitrary information can be contained in each node and edge
in key-value form. Although this issue was previously addressed on the basis
of predefined transformations [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], users still cannot control the mapping for their
specific use cases.
      </p>
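      <p>The contrast between the two models can be made concrete with a small sketch. The Python snippet below is illustrative only: it writes one fact (an email sent from one person to another) first as RDF-style triples and then as a single PG edge with key-value properties. The IRIs and the property names sender and receiver are assumptions made for this illustration.</p>
      <preformat><![CDATA[
```python
# One fact, two models: "person1 emailed person2 in 2017, attaching 01.pdf".
# All IRIs and the sender/receiver property names are illustrative assumptions.
EX = "http://example.org/"

# RDF: every statement is a (subject, predicate, object) triple, so the email
# itself has to become a resource that the other triples describe.
rdf_triples = [
    (EX + "email1", "rdf:type", EX + "Email"),
    (EX + "email1", EX + "sender", EX + "person1"),
    (EX + "email1", EX + "receiver", EX + "person2"),
    (EX + "email1", EX + "year", 2017),
    (EX + "email1", EX + "attachment", "01.pdf"),
]

# PG: the same fact is a single edge that carries its own key-value properties.
pg_edge = {
    "from": EX + "person1",
    "to": EX + "person2",
    "label": "emailed",
    "properties": {"year": 2017, "attachment": "01.pdf"},
}


def count_statements(triples, subject):
    """Count how many triples describe the given subject."""
    return sum(1 for s, _, _ in triples if s == subject)


print(count_statements(rdf_triples, EX + "email1"), "triples vs. 1 edge with",
      len(pg_edge["properties"]), "properties")
```
]]></preformat>
      <p>In this sketch, the email needs five triples in the RDF-style list, whereas the PG form keeps the year and the attachment directly on the edge.</p>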
      <p>
        In this study [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]<sup>4</sup>, we redefine the PG model incorporating the differences
in existing models and propose serialization formats based on the data model.
We further propose a graph to graph mapping framework based on the Graph
to Graph Mapping Language (G2GML). Employing this mapping framework,
accumulated graph data described in RDF can be converted into PGs, which
can then be loaded into several graph database engines for further analysis.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Overview</title>
      <p>We provide an overview of the graph to graph mapping (G2G mapping)
framework (Figure 1).</p>
      <p>In this framework, users describe mappings from RDF to PG in G2GML.
According to this G2GML description, the input RDF dataset is converted into
a PG dataset. The new dataset can optionally be saved in specific formats
for loading into major graph database implementations.</p>
      <p>G2GML is a declarative language comprising pairs of RDF patterns and PG
patterns. The core concept of a G2GML description can be represented by a
map from RDF subgraphs, which match specified SPARQL patterns, to PG
components.</p>
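      <p>As a rough illustration of this core concept, the following Python sketch pairs triple patterns with functions that build PG components. It is not the G2GML implementation: real G2GML descriptions use SPARQL patterns and PG patterns, whereas this toy matcher only joins simple triple patterns over an in-memory list. The sample triples and all function and rule names are hypothetical.</p>
      <preformat><![CDATA[
```python
# Toy "RDF pattern -> PG component" mapper. Hypothetical names throughout;
# this only mimics the idea of G2GML, not its syntax or its engine.
EX = "http://example.org/"
TRIPLES = [
    (EX + "person1", "a", EX + "Person"),
    (EX + "person1", EX + "name", "Alice"),
    (EX + "person2", "a", EX + "Person"),
    (EX + "person2", EX + "name", "Bob"),
    (EX + "person1", EX + "supervised_by", EX + "person2"),
]


def is_var(term):
    """Variables are strings starting with '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for p, t in zip(first, triple):
            if is_var(p):
                # Bind the variable, or reject if it is bound to another value.
                if new.setdefault(p, t) != t:
                    ok = False
                    break
            elif p != t:
                ok = False
                break
        if ok:
            yield from match(rest, triples, new)


# Each rule pairs an RDF pattern with a function that builds a PG component.
NODE_RULE = (
    [("?p", "a", EX + "Person"), ("?p", EX + "name", "?n")],
    lambda b: {"id": b["?p"], "label": "person", "properties": {"name": b["?n"]}},
)
EDGE_RULE = (
    [("?p1", EX + "supervised_by", "?p2")],
    lambda b: {"from": b["?p1"], "to": b["?p2"], "label": "supervised_by"},
)


def apply_rule(rule, triples):
    """Build one PG component per match of the rule's RDF pattern."""
    patterns, build = rule
    return [build(b) for b in match(patterns, triples)]


nodes = apply_rule(NODE_RULE, TRIPLES)
edges = apply_rule(EDGE_RULE, TRIPLES)
print(nodes)
print(edges)
```
]]></preformat>
      <p>Keeping each rule as a pattern paired with a builder mirrors the declarative style: the rule states what to map, and the generic matcher decides how to find the matching subgraphs.</p>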
    </sec>
    <sec id="sec-3">
      <title>Examples</title>
      <p><sup>4</sup> Main paper to be presented at ISWC 2020.</p>
      <list list-type="bullet">
        <list-item>
          <p>Resource to node: In lines 2–4, the RDF resources with type :Person are
mapped onto the PG nodes using their IRIs as node IDs.</p>
        </list-item>
        <list-item>
          <p>Datatype property to node property: In lines 2–4, the RDF datatype
property :name is mapped onto the PG node property key name. The literal
objects 'Alice' and 'Bob' are mapped onto the node property values.</p>
        </list-item>
        <list-item>
          <p>Object property to edge: In lines 5–6, the RDF object property :supervised_by
is mapped onto the PG edge supervised_by.</p>
        </list-item>
        <list-item>
          <p>Resource to edge: In lines 7–12, the RDF resource with type :Email is
mapped onto the PG edge emailed.</p>
        </list-item>
        <list-item>
          <p>Datatype property to edge property: In lines 7–12, the RDF datatype
properties :year and :attachment are mapped onto the PG edge properties year
and attachment. The literal objects 2017 and '01.pdf' are mapped onto
the edge property values.</p>
        </list-item>
      </list>
      <preformat>"http://example.org/person1" :person name:Alice
"http://example.org/person2" :person name:Bob
"http://example.org/person1" -&gt; "http://example.org/person2" :supervised_by
"http://example.org/person1" -&gt; "http://example.org/person2" :emailed year:2017 attachment:"01.pdf"</preformat>
      <p>
        A preceding study on converting existing data into graph data included an
effort to convert relational databases into graph databases [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. However, given
that RDF has prevailed as a standardized data model in scientific communities,
considering mapping based on the RDF model is crucial. The interoperability of
RDF and PG [
        <xref ref-type="bibr" rid="ref12 ref13 ref14 ref8">8,12,13,14</xref>
        ] has been discussed, and efforts were made to develop
methods to convert RDF into PG [
        <xref ref-type="bibr" rid="ref15 ref16">15,16</xref>
        ]. However, considering the flexibility
regarding the type of information that can be expressed by edges in property
graphs, a novel method for controlling the mapping is necessary.
      </p>
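      <p>The four PG lines shown earlier in this section follow a line-oriented layout: a node line gives an ID, a label, and properties, and an edge line adds an arrow between two node IDs. The Python sketch below reproduces those four lines from in-memory values. The quoting rule it uses (bare values for plain words and numbers, double quotes otherwise) is an assumption inferred from the example lines only, and the function names are hypothetical; the sketch is not a reference serializer for the format.</p>
      <preformat><![CDATA[
```python
import re


def fmt(value):
    """Format one ID or property value.

    Assumption: values made only of word characters (Alice, 2017) are written
    bare; anything else (IRIs, 01.pdf) is wrapped in double quotes.
    """
    text = str(value)
    if re.fullmatch(r"\w+", text):
        return text
    return '"' + text.replace('"', '\\"') + '"'


def fmt_props(props):
    """Format properties as space-separated key:value pairs."""
    return "".join(f" {key}:{fmt(val)}" for key, val in props.items())


def node_line(node_id, label, props=None):
    """Serialize a node as: ID :label key:value ..."""
    return f"{fmt(node_id)} :{label}{fmt_props(props or {})}"


def edge_line(src, dst, label, props=None):
    """Serialize a directed edge as: SRC -> DST :label key:value ..."""
    return f"{fmt(src)} -> {fmt(dst)} :{label}{fmt_props(props or {})}"


P1 = "http://example.org/person1"
P2 = "http://example.org/person2"
lines = [
    node_line(P1, "person", {"name": "Alice"}),
    node_line(P2, "person", {"name": "Bob"}),
    edge_line(P1, P2, "supervised_by"),
    edge_line(P1, P2, "emailed", {"year": 2017, "attachment": "01.pdf"}),
]
print("\n".join(lines))
```
]]></preformat>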
      <p>To the best of our knowledge, this study presents the first attempt to
develop a framework for controlled mapping between RDF and PG. Notably, the
designed G2GML is a declarative mapping language. A merit of the
declarative description is that users can concentrate on the core logic of
mappings. In the sense that the mapping process generates new graph data on
the basis of existing graph data, it is closely related to semantic inference.</p>
      <p>
        Other mapping frameworks, such as Neosemantics (a Neo4j plugin), propose
a method to convert RDF datasets without mapping definitions. We observe a
similar discussion in the conversion from the relational model to RDF, where
there are two W3C standards, i.e., Direct Mapping [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and R2RML [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>Availability</title>
      <p>The prototype implementation of G2G mapping is available on GitHub
(https://github.com/g2glab/g2g) under the MIT license. It is written in JavaScript
and can be executed using Node.js on the command line. It has an endpoint
mode and a local file mode. The local file mode uses Apache Jena ARQ to
execute SPARQL queries internally, while the endpoint mode accesses SPARQL
endpoints via the Internet. An example of the usage in the endpoint mode is as
follows:</p>
      <preformat>$ g2g musician.g2g http://dbpedia.org/sparql</preformat>
      <p>Here, the first argument is a G2GML description file, and the second argument
is the target SPARQL endpoint, which provides the source RDF dataset.</p>
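      <p>How the tool issues its requests in endpoint mode is not described here. As a general illustration, the Python sketch below shows how any client can address a SPARQL endpoint using the query-via-GET operation of the SPARQL 1.1 Protocol, in which the query text is passed in the query parameter. The helper name and the query are hypothetical (the query is not taken from musician.g2g), and the sketch only builds the request URL without contacting the network.</p>
      <preformat><![CDATA[
```python
from urllib.parse import urlencode, urlsplit


def build_sparql_get_url(endpoint, query):
    """Return the URL for a SPARQL 1.1 Protocol 'query via GET' request."""
    # Append to an existing query string if the endpoint URL already has one.
    separator = "&" if urlsplit(endpoint).query else "?"
    return endpoint + separator + urlencode({"query": query})


# Hypothetical query used only for illustration.
QUERY = ("SELECT ?s WHERE { ?s a <http://dbpedia.org/ontology/MusicalArtist> } "
         "LIMIT 5")
url = build_sparql_get_url("http://dbpedia.org/sparql", QUERY)
print(url)

# To actually run the query, the URL could be opened with
# urllib.request.urlopen, typically with an Accept header such as
# "application/sparql-results+json".
```
]]></preformat>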
      <p>Furthermore, a demonstration site (https://purl.org/g2gml) is available,
and the documentation (https://g2gml.readthedocs.io) also includes quick
tutorials on how to try G2GML using the Docker image
(https://hub.docker.com/r/g2glab/g2g).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isele</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jakob</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jentzsch</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontokostas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendes</surname>
            ,
            <given-names>P. N.</given-names>
          </string-name>
          , ...
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>DBpedia – a large-scale, multilingual knowledge base extracted from Wikipedia</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>6</volume>
          (
          <issue>2</issue>
          ),
          <fpage>167</fpage>
          -
          <lpage>195</lpage>
          . (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Vrandečić</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krötzsch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Wikidata: a free collaborative knowledgebase</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>57</volume>
          (
          <issue>10</issue>
          ),
          <fpage>78</fpage>
          -
          <lpage>85</lpage>
          . (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Angles</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gutierrez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>An introduction to Graph Data Management</article-title>
          .
          <source>arXiv preprint arXiv:1801.00036</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Angles</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arenas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barcelo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reutter</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vrgoc</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Foundations of Modern Query Languages for Graph Databases</article-title>
          .
          <source>ACM Computing Surveys (CSUR)</source>
          ,
          <volume>50</volume>
          (
          <issue>5</issue>
          ),
          <fpage>68</fpage>
          . (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <source>The Neo4j Graph Platform</source>
          . https://neo4j.com/
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <source>Oracle Database Property Graph</source>
          . https://www.oracle.com/goto/propertygraph
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <source>Amazon Neptune</source>
          . https://aws.amazon.com/neptune/
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hartig</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Reconciliation of RDF* and property graphs</article-title>
          .
          <source>arXiv preprint arXiv:1409.3288</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Chiba</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yamanaka</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matsumoto</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>G2GML: Graph to Graph Mapping Language for Bridging RDF and Property Graphs</article-title>
          .
          <source>Proceedings of the 19th International Semantic Web Conference</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Chiba</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yamanaka</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matsumoto</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Property Graph Exchange Format</article-title>
          .
          <source>arXiv preprint arXiv:1907.03936</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>De Virgilio</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maccioni</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Torlone</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Converting relational to graph databases</article-title>
          .
          In:
          <source>First International Workshop on Graph Data Management Experiences and Systems</source>
          , p.
          <fpage>1</fpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Angles</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thakkar</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tomaszuk</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>RDF and Property Graphs Interoperability: Status and Issues</article-title>
          .
          <source>Proceedings of the 13th Alberto Mendelzon International Workshop on Foundations of Data Management</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srinivasan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perry</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chong</surname>
            ,
            <given-names>E. I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Banerjee</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>A Tale of Two Graphs: Property Graphs as RDF in Oracle</article-title>
          . In:
          <source>EDBT</source>
          , pp.
          <fpage>762</fpage>
          –
          <lpage>773</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Thakkar</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Punjani</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keswani</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>A Stitch in Time Saves Nine – SPARQL querying of Property Graphs using Gremlin Traversals</article-title>
          .
          <source>arXiv preprint arXiv:1801.02911</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Tomaszuk</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>RDF data in property graph model</article-title>
          .
          In:
          <source>Research Conference on Metadata and Semantics Research</source>
          , pp.
          <fpage>104</fpage>
          –
          <lpage>115</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>De Virgilio</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Smart RDF data storage in graph databases</article-title>
          .
          In:
          <source>2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing</source>
          , pp.
          <fpage>872</fpage>
          –
          <lpage>881</lpage>
          .
          <publisher-name>IEEE</publisher-name>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <article-title>A Direct Mapping of Relational Data to RDF</article-title>
          ,
          <source>W3C Recommendation 27 September</source>
          <year>2012</year>
          https://www.w3.org/TR/rdb-direct-mapping/
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <article-title>R2RML: RDB to RDF Mapping Language</article-title>
          ,
          <source>W3C Recommendation 27 September</source>
          <year>2012</year>
          https://www.w3.org/TR/r2rml/
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>