=Paper= {{Paper |id=Vol-3254/paper400 |storemode=property |title=RDF2TG: Towards Supporting RDF in TigerGraph Property Graph Database System |pdfUrl=https://ceur-ws.org/Vol-3254/paper400.pdf |volume=Vol-3254 |authors=Lu Zhou,Jay Yu |dblpUrl=https://dblp.org/rec/conf/semweb/ZhouY22 }} ==RDF2TG: Towards Supporting RDF in TigerGraph Property Graph Database System== https://ceur-ws.org/Vol-3254/paper400.pdf
RDF2TG: Towards Supporting RDF in TigerGraph
Property Graph Database System
Lu Zhou, Jay Yu
Innovation and Development Center, TigerGraph, Inc., 3636 Nobel Dr. Suite 100 San Diego, CA 92122, USA


Abstract
Graph data technology adopters often face the challenge of choosing between flexible knowledge
representation and reasoning on the Resource Description Framework (RDF) and large-scale data
processing performance on Property Graph (PG) models. In this paper, we propose a generic method
that brings the best of both worlds together by supporting RDF data in TigerGraph, a massively
parallel, distributed, native property graph database system. The method relies on a generic schema
with mapping rules for loading RDF data while preserving the flexibility of the original RDF graphs.
We use the LDBC Semantic Publishing Benchmark (SPB) to demonstrate how this mechanism maps RDF data
and SPARQL queries into TigerGraph and GSQL.

Keywords
RDF Knowledge Graph, TigerGraph Property Graph, SPARQL, GSQL




1. Introduction
RDF is a W3C standard model that represents data as triple statements, in which the subject and
object are nodes connected by a predicate as an edge. In contrast, PG allows labeled properties on
both nodes and edges for additional information. Because of the inherent differences between RDF
and PG databases and their associated query languages, users are forced to trade off the semantic
expressivity of RDF against the scalable performance of PG. One approach to bridging the gap is
the Neo4j plugin “neosemantics”,1 which maps datatype properties and values from triples to
concrete node properties. This model transformation may be limiting for use cases such as entity
resolution, where attributes are represented as nodes so that similarity can be found via graph
connectivity. In this paper, we propose another approach that maps RDF data to TigerGraph with a
generic schema that preserves the flexibility of RDF graphs: instead of mapping RDF knowledge
graphs and ontologies explicitly, the generic schema tolerates dynamic updates in RDF graphs
without affecting the mapping process. We successfully load an RDF graph with about 32 million
triples generated from the LDBC SPB benchmark2 into TigerGraph and execute 36 GSQL queries
translated from the benchmark's SPARQL queries. A preliminary side-by-side comparison with an RDF
database shows that the generic PG model performs at a similar level without any optimization at
the database engine level.
ISWC’22: International Semantic Web Conference, October 23–27, 2022, Hangzhou, China
lu.zhou@tigergraph.com (L. Zhou); jay.yu@tigergraph.com (J. Yu)
ORCID: 0000-0002-0453-9965 (L. Zhou)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1 https://neo4j.com/labs/neosemantics/4.1/
2 https://ldbcouncil.org/benchmarks/spb/

Figure 1: Example Triples in RDF Graphs
Figure 2: Example Triples in TigerGraph


2. RDF to PG Graph Mapping
We design a generic schema in TigerGraph to import RDF graphs3 based on mapping rules. Figure 2
shows an example graph obtained by mapping the two RDF triples depicted in Figure 1. There are
four vertex types: ClassInstance, ObjectPropertyInstance, DatatypePropertyInstance, and
ValueInstance, and four directed edge types: hasObjectPropertyInstance, hasObjectInstance,
hasDatatypeProperyInstance, and hasValueInstance. ValueInstance has three properties (value,
datatype, langTag), while the other vertex types have a single property (uri). To evaluate the
effectiveness of the schema and mapping rules, we use the LDBC SPB benchmark to generate an RDF
graph with about 32 million triples. Loading this RDF data into TigerGraph yields a PG with
about 39.8 million vertices and 127.1 million edges.
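
To make the schema concrete, the following Python sketch maps a single triple onto the four vertex types and four edge types named above. It is an illustration under stated assumptions, not the authors' loader: the edge directions are inferred from the edge names, the choice of one property-instance vertex per triple and the composite vertex ids are hypothetical, and the `Literal` class and `map_triple` function are names introduced here. Edge type names are spelled exactly as printed in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """An RDF literal; fields mirror the three ValueInstance properties."""
    value: str
    datatype: str = "http://www.w3.org/2001/XMLSchema#string"
    lang_tag: str = ""


def map_triple(subject, predicate, obj):
    """Map one RDF triple to (vertices, edges) in the generic schema.

    Vertices are (type, id, attributes); edges are (type, source id, target id).
    """
    vertices = [("ClassInstance", subject, {"uri": subject})]
    if isinstance(obj, Literal):
        # Assumption: one property-instance vertex per triple, keyed by a
        # made-up composite id. The source does not specify the id scheme.
        prop_id = f"{subject}|{predicate}|{obj.value}"
        value_id = f"{obj.value}^^{obj.datatype}@{obj.lang_tag}"
        vertices += [
            ("DatatypePropertyInstance", prop_id, {"uri": predicate}),
            ("ValueInstance", value_id, {
                "value": obj.value,
                "datatype": obj.datatype,
                "langTag": obj.lang_tag,
            }),
        ]
        edges = [
            ("hasDatatypeProperyInstance", subject, prop_id),
            ("hasValueInstance", prop_id, value_id),
        ]
    else:
        # A non-literal object is treated as a URI and becomes a ClassInstance.
        prop_id = f"{subject}|{predicate}|{obj}"
        vertices += [
            ("ObjectPropertyInstance", prop_id, {"uri": predicate}),
            ("ClassInstance", obj, {"uri": obj}),
        ]
        edges = [
            ("hasObjectPropertyInstance", subject, prop_id),
            ("hasObjectInstance", prop_id, obj),
        ]
    return vertices, edges
```

Because the predicate is stored as data (the uri of a property-instance vertex) rather than as a schema-level edge type, a new predicate in the RDF graph would not require a schema change in this sketch, which is one way to read the flexibility claim above.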


3. SPARQL to GSQL Translation
The LDBC SPB benchmark provides two types of queries: basic and advanced. Basic queries
contain search, aggregate, geo-spatial, full-text search, and time-range queries, while advanced
ones add analytical, drill-down, and faceted search. For this phase of the project, we manually
translate 36 SPARQL queries into GSQL and verify that the results are equivalent. We conduct a
preliminary evaluation of query performance in TigerGraph. The results are promising and
comparable to those obtained by running the same benchmark on an RDF graph database with the same
environment configuration, without any optimization at the database engine level.
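
As a rough illustration of what such a translation has to express, the Python sketch below evaluates the SPARQL triple pattern `?s <predicate> ?o` as a two-hop traversal over a toy graph in the generic schema: first from a ClassInstance to a property-instance vertex, then on to the object, filtering on the predicate URI stored on the intermediate vertex. This is not GSQL and not one of the 36 translated queries; the toy data, the vertex ids, and the function names are hypothetical, and the edge directions are assumed from the edge names.

```python
from collections import defaultdict

# Hypothetical toy data: property-instance vertex id -> value of its `uri` property.
PROPERTY_URI = {
    "p1": "http://example.org/about",
    "p2": "http://example.org/mentions",
}

# Edges as (edge type, source vertex id, target vertex id).
EDGES = [
    ("hasObjectPropertyInstance", "http://example.org/work1", "p1"),
    ("hasObjectInstance", "p1", "http://example.org/topicA"),
    ("hasObjectPropertyInstance", "http://example.org/work2", "p2"),
    ("hasObjectInstance", "p2", "http://example.org/topicA"),
]


def build_index(edges):
    """Index target vertices by (edge type, source vertex id)."""
    index = defaultdict(list)
    for edge_type, src, dst in edges:
        index[(edge_type, src)].append(dst)
    return index


def match_object_pattern(edges, property_uri, predicate):
    """Return sorted (subject, object) pairs matching `?s <predicate> ?o`."""
    index = build_index(edges)
    results = []
    for (edge_type, subject), prop_ids in list(index.items()):
        if edge_type != "hasObjectPropertyInstance":
            continue
        for prop_id in prop_ids:
            # The predicate is a property of the intermediate vertex,
            # so the filter applies at the middle of the path.
            if property_uri[prop_id] != predicate:
                continue
            for obj in index.get(("hasObjectInstance", prop_id), []):
                results.append((subject, obj))
    return sorted(results)
```

For example, `match_object_pattern(EDGES, PROPERTY_URI, "http://example.org/about")` returns the single pair linking `work1` to `topicA`. A pattern with a literal object would follow the same shape using the datatype-property and value edge types.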


4. Conclusion and Future Work
In conclusion, we proposed a method to map RDF data and SPARQL queries to TigerGraph.
Preliminary results from applying it to the LDBC SPB benchmark are promising. The code for
migrating RDF graphs to TigerGraph, the mapping rules, the queries, and the performance results
are available in the GitHub repository.4 Several areas remain in which our approach can be
extended to cover more RDF features, such as blank nodes, named graphs, and RDFS and OWL
reasoning, as well as advanced SPARQL capabilities for constructing new graphs and performing
updates. We will also generalize the manual SPARQL-to-GSQL translation rules so that RDF data
and queries can be supported automatically in TigerGraph in a no-code/low-code fashion.
3 Supports RDF 1.1 with RDF Schema (RDFS) and Web Ontology Language (OWL).
4 https://github.com/kbzhoulu/ldbc_spb_tigergraph