<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Formalizing Gremlin Pattern Matching Traversals in an Integrated Graph Algebra</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Harsh Thakkar</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Soren Auer</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria-Esther Vidal</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Graph data management (also called NoSQL) has revealed bene cial characteristics in terms of exibility and scalability by di erently balancing between query expressivity and schema exibility. This peculiar advantage has resulted into an unforeseen race of developing new task-speci c graph systems, query languages and data models, such as property graphs, key-value, wide column, resource description framework (RDF), etc. Present-day graph query languages are focused towards exible graph pattern matching (aka sub-graph matching), whereas graph computing frameworks aim towards providing fast parallel (distributed) execution of instructions. The consequence of this rapid growth in the variety of graph-based data management systems has resulted in a lack of standardization. Gremlin, a graph traversal language, and machine provide a common platform for supporting any graph computing system (such as an OLTP graph database or OLAP graph processors). In this extended report, we present a formalization of graph pattern matching for Gremlin queries. We also study, discuss and consolidate various existing graph algebra operators into an integrated graph algebra.</p>
      </abstract>
      <kwd-group>
        <kwd>Graph Pattern Matching</kwd>
        <kwd>Graph Traversal</kwd>
        <kwd>Gremlin</kwd>
        <kwd>Graph Algebra</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Upon observing the evolution of information technology, we can observe
a trend from data models and knowledge representation techniques
being tightly bound to the capabilities of the underlying hardware towards
more intuitive and natural methods resembling human-style information
processing. This evolution started with machine assembly languages, went
over procedural programming, object-oriented methods and resulted in an
ever more loosely coupling of data and code with relational data bases,
declarative query languages and object-relational mapping (ORM). In
recent years, we can observe an even further step in this evolution { graph
Copyright c 2019 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        based data models, which organize information in conceptual networks.
Graphs are valued distinctly, when it comes to choosing formalisms for
modelling real-world scenarios such as biological, transport,
communication and social networks due to their intuitive data model. In particular,
the property graph data model is capable of representing complex domain
networks [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        Graph analysis tools have turned out to be one of pioneering
applications in understanding these natural and man-made networks [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Graph
analysis is carried out using graph processing techniques which ultimately
boil down to e cient graph query processing. Graph Pattern Matching
(GPM), also referred to as the sub-graph matching is the foundational
problem of graph query processing. Many vendors have proposed a
variety of (proprietary) graph query languages to demonstrate the solvability
of graph pattern matching problem.
      </p>
      <p>
        These modern graph query languages focus either on traversal, where
traversers move over vertices and edges of a graph in a user de ned
fashion or on pattern-matching, where graph patterns are matched against the
graph database. Gremlin [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] is one such modern graph query language,
with a distinctive advantage over others that it o ers both of these
perspectives. This implies that a user can reap bene ts of both declarative
and imperative matching style within the same framework. Furthermore,
conducting GPM in Gremlin can be of crucial importance in cases such
as:
{ Querying very large graphs, where a user is not completely aware of
certain dataset-speci c statistics of the graph (e.g., the number of
created vs knows edges existing in the graph (ref. Figure 1));
{ Creating optimal query plans, without the user having to dive deep
into traversal optimization strategies.
{ In application-speci c settings such as a question answering [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], users
express information needs (e.g., natural language questions) which can
be better represented as graph pattern matching problems than path
traversals.
      </p>
      <p>
        In this work, we present an extended version of our earlier work [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] that
contributes to establishing a formal base for a graph query algebra, by
surveying and integrating existing graph query operators. The
contributions of this work are in particular:
{ We consolidate existing graph algebra operators from the literature
and propose two new traversal operators into an integrated graph
algebra.
{ We formalize the graph pattern matching construct of the Gremlin
query language.
{ We provide a formal speci cation of pattern matching traversals for
the Gremlin language, which can serve as a foundation for
implementing a Gremlin based query compilation engine.
      </p>
      <p>
        As a result, the formalization of graph query algebra supports the
integration and interoperability of di erent graph data models [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] (e.g.,
executing SPARQL queries on top of Gremlin [
        <xref ref-type="bibr" rid="ref23 ref24">23,24</xref>
        ]), helps to prevent
vendor lock in scenarios and boosts data management benchmarking
efforts such as LITMUS [
        <xref ref-type="bibr" rid="ref10 ref21 ref22">10,21,22</xref>
        ] and the Linked Data Benchmark Council
(LDBC) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>The remainder of this article is organised as follows: section 2 describes
the preliminaries including property graphs, graph pattern matching, and
foundations of relational algebra. section 3 elaborates on Gremlin as a
traversal language and machine, and discusses synergy of GPM and its
evaluation in Gremlin. section 4 proposes the Gremlin to graph algebra
mapping algorithm. section 5 surveys related work. In section 6 we
conclude this article and discuss future work.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>
        In this section, we present and summarise the mathematical concepts
used in this article. Our notation closely follows [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and extends [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] by
adding the notion of vertex labels.
2.1
      </p>
      <sec id="sec-2-1">
        <title>Property Graphs</title>
        <p>
          Property graphs, also referred to as directed, edge-labeled, attributed
multi-graphs, have been formally de ned in a wide variety of texts, such
as [
          <xref ref-type="bibr" rid="ref1 ref11 ref15 ref17 ref8">1,8,11,17,15</xref>
          ]. We adapt the de nition of property graphs presented
by [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]:
De nition 1 (Property Graph). A property graph is de ned as G =
fV; E; ; g; where:
{ V is the set of vertices,
{ E is the set of directed edges such that E (V Lab
        </p>
        <p>is the set of Labels,
{ is a function that assigns labels to the edges and vertices (i.e.</p>
        <p>
          V [ E ! )3, and
V ) where Lab
:
3 set of strings ( )
is a partial function that maps elements and keys to values (i.e.
: (V [ E) R ! S) i.e. properties (key r 2 R, value s 2 S). tu
For simplicity, we de ne disparate sets (of and ) for the labels and
properties of vertices and edges respectively, adapting the terminology
used in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. We de ne:
{ Lv: Set of vertex labels (Lv ), l : V ! Lv assigns label to each
vertex
{ Le: Set of edge labels (Le ), e : E ! Le assigns label to each
edge. (Lv \ Le = , Lv [ Le = Lab; wrt de nition 1, = v [ e)
{ Pv:Set of vertex properties (Pv R), v : V Pv ! S assigns a value
to each vertex property
{ Pe: Set of edge properties (Pe R), e : E Pe ! S assigns a
value to each edge property (Pv 6= Pe; in the context of de nition 1,
= v [ e)
databases perform GPM for querying a variety of data models such as
RDF, Property Graphs, edge-labelled graphs, etc., and many works
address and analyze its solvability, such as [
          <xref ref-type="bibr" rid="ref1 ref11 ref13 ref27 ref8">1,8,11,13,27</xref>
          ]. Various graph
query languages have been implemented for querying these data models,
that are of imperative and declarative nature, such as:
{ The SPARQL5 query language for RDF triple stores (declarative),
{ Neo4J's native query language CYPHER6 (declarative),
{ Apache TinkerPop's graph traversal language and machine
Gremlin7 (a functional language traditionally imperative, but also o ers
a declarative construct).
        </p>
        <p>
          GPM queries can represented by two types of graph patterns [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], basic
graph patterns (BGP) and complex graph patterns (CGP), which add
operations such as projections, unions, etc. to BGPs (cf. Figure 2).
Example 1. For the graph pattern (say P ) as shown in Figure 2(b), as
per the De nition 1 of property graphs, we have that P = fV, E, , g,
where Figure 3 demonstrates the values of its components.
        </p>
        <p>
          A GPM is evaluated by matching, a sub-graph pattern over a graph G.
Matching has been formally de ned in various texts and we summarise
a formal de nition in our context which closely follows the de nition
provided by [
          <xref ref-type="bibr" rid="ref1 ref8">8,1</xref>
          ].
5 https://www.w3.org/TR/rdf-sparql-query/
6 https://neo4j.com/developer/cypher-query-language/
7 https://tinkerpop.apache.org/
De nition 2 (Match of a graph pattern). A graph pattern P =
(Vp; Ep; p; p); is matching the graph G = (V; E; ; ), i the following
conditions are satis ed:
1. there exist mappings p and p such that, all variables are mapped
to constants, and all constants are mapped to themselves (i.e. p 2
; p 2 ),
2. each edge e 2 Ep in P is mapped to an edge e 2 E in G, each vertex
v 2 Vp in P is mapped to a vertex v 2 V in G, and
3. the structure of P is preserved in G (i.e. P is a sub-graph of G)
tu
We discuss matching graph patterns, over a graph G, in the context of
Gremlin traversal language in Section 3.2.
        </p>
        <p>Example 2. We illustrate the evaluation of graph patterns (a) and (b)
from Figure 2, over the graph G from Figure 1.</p>
        <p>
          (a) v[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]
(b) v[
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]8
v[
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]
        </p>
        <p>
          In Gremlin, GPM is performed by traversing9 over a graph G. A
traversal t over G derives paths of arbitrary length. Therefore, a GPM
query in Gremlin can be perceived as a path traversal. Rodriguez et al. [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]
de ne a path as:
De nition 3 (Path). A path p is a sequence or a string, where p 2 E
and E (V Le V )10. A path allows for repeated edges and the length of
a path is denoted by jjpjj, which is equal to the number of edges in p. tu
8 Please note that since we did not explicitly specify any values, the traversal returns
the ids of the "matched" elements as a result.
9 The act of visiting of vertices (v 2 V ) and edges (e 2 E) in a graph in an alternating
manner (in some algorithmic fashion) [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
10 The kleene start operation * constructs the free monoid E = Sn1=0 Ei. where E0
= f g; is the identity/empty element.
        </p>
        <p>
          Moreover, from [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] we also know that these path queries are
comprised of several atomic operations called the single-step traversals. A list
of graph-speci c operators have been de ned in [
          <xref ref-type="bibr" rid="ref17 ref18">17,18</xref>
          ]. We outline these
and other operators, in the following.
2.3
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Graph Algebra Foundations</title>
        <p>
          We present a consolidated summary of various graph query operators
de ned in the literature [
          <xref ref-type="bibr" rid="ref13 ref14 ref17 ref18 ref3 ref9">3,9,13,14,18,17</xref>
          ]. For brevity, we abstain from
dwelling into rigorous formal de nitions and underlying proofs and refer
the interested reader to the respective articles.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Unary operators</title>
        <p>Projection ( a;b;::) : R [ S ! : operator projects values of a speci c
set of variables a; b; ::; n (i.e. keys and elements), from the solution of
a matched input graph pattern P , against the graph G. Moreover, the
results returned by ( a;b) are not deduplicated be default, i.e. the result
will contain as many possible matched values or items as the input pattern
P . This operator is present in all standard graph query languages (e.g.
SELECT in SPARQL, and MATCH in CYPHER).</p>
        <p>
          Selection (9(p)), is analogous to the lter operator ( ), as de ned in [
          <xref ref-type="bibr" rid="ref1 ref13">13,1</xref>
          ],
restricts the match of a certain graph pattern P against a graph G,
by imposing conditional expressions (p) e.g., inequalities and/or other
traversal-speci c predicates (where predicate is a proposition formula).
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>Binary operators</title>
        <p>
          Concatenation [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] ( ): E E ! E : concatenates two paths (cf.
De nition 3). For instance, if (i; ; j) and (j; ; k) are two edges in a graph
G, then their concatenation is the new path (i; ; j; ; k); where i; j; k 2 V
and ; 2 Le.
        </p>
        <p>
          Union operator (]): P (E ) P (E ) ! P (E ) : is the multiset union11
(bag union) of two path traversals or graph patterns. For instance, f(1,2),
(3,4), (3,4), (4,5)g ] f(1,2), (3,4)g = f(1,2), (1,2), (3,4), (3,4), (3,4),
(4,5)g. The results of this operator, like projection, are not deduplicated
by default.
11 Note that the domains and ranges of each of these sets are the power sets.
Join [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] (./ ): P (E ) P (E ) ! P (E ) : produces the concatenative
join of two sets of paths (path traversals) such that if P; R 2 P (E ), then
P ./ R = fp r j p 2 P ^ r 2 R ^ (p =
_ r =
_ +(p) =
(r))g12
For instance, if P = f(v1; e1; v2); (v2; e2; v3)g and R = f(v2; e2; v3); (v2; e2; v1)g,
then,
        </p>
        <p>P ./ R = f(v1; e1; v2; v2; e2; v3); (v1; e1; v2; v2; e2; v1)g,
where v1; v2; v3 2 V ; e1; e2 2 Le.</p>
        <p>
          Left -join (d|&gt;&lt;|), Right -join (|&gt;&lt;|d) and the Anti -join (.) operators: these
operators, are not explicitly implemented in Gremlin, unlike in other graph
query languages [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. Their results can, however, be simulated by the
user, at run-time via selecting desired values of elements, vertices or edges
declaring using the projection operator. For instance, an anti-join can be
"computationally" achieved by not'ing an argument in the match step
by using .not() Gremlin step.
        </p>
        <p>
          General Extensions. We borrow the extended relational operators
Grouping (ya(p)), Sorting (&lt;*a;+b(p)) and Deduplication ( a;b;::(p)) which
have been de ned in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. A detailed illustration with formal de nitions
of extended operators can be found in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>
          Graph-speci c Extensions. Various graph/traversal-speci c operators
have been de ned in works such as [
          <xref ref-type="bibr" rid="ref18 ref9">9,18</xref>
          ]. Furthermore, there also exist
certain application-speci c extensions of algebra operators, such as the
and operators, for graph data aggregation (used in complex graph
network analysis) de ned by [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. We present graph-speci c operators, some
of which have been adapted from [
          <xref ref-type="bibr" rid="ref13 ref9">9,13</xref>
          ] and propose additional operators
based on the algebra de ned by [
          <xref ref-type="bibr" rid="ref17 ref18">18,17</xref>
          ].
        </p>
        <p>{ The Get-vertices/Get-edges nullary operators (Vg/ Eg): return the list
of vertices/edges, respectively. These operators, w.r.t. Gremlin query
construct, denote the start of a traversal. It is also possible to traverse
from a speci c vertex/edge in a graph, given their id's. Furthermore,
they can be used to construct custom indexes over elements depending
on user's choice.
12 Here, ( ; +) denote the rst and last elements of a path respectively.
Gremlin is the query language of Apache TinkerPop13 graph
computing framework. Gremlin is system agnostic, and enables both { pattern
matching (declarative) and graph traversal (imperative) style of querying
over graphs.</p>
        <p>The Machine. Theoretically, a set of traversers in T move (traverse)
over a graph G (property graph, cf. Section 2.1) according to the
instruc13 Gremlin: Apache TinkerPop's graph traversal language and machine (https://
tinkerpop.apache.org/)
tion in ( ), and this computation is said to have completed when there
are either:
1. no more existing traversers (t), or
2. no more existing instructions ( ) that are referenced by the traversers
(i.e. program has halted).</p>
        <p>
          Result of the computation being either a null/empty set (i.e. former case)
or the multiset union of the graph locations (vertices, edges, labels,
properties, etc.) of the halted traversers which they reference. Rodriguez et
al. [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] formally de ne the operation of a traverser t as follows:
G
t 2 T
f ; &amp;g
!
(1)
where, : T ! U is a mapping from the traverser to its location in G;
: T ! maps a traverser to a step in ; : T ! N maps a traverser
to its bulk14; &amp;: T ! U maps a traverser to its sack (local variable of a
traverser) value.
        </p>
        <p>
          The Traversal. A Gremlin graph traversal can be represented in any
host language that supports function composition and function nesting.
These steps are are either of:
1. Linear motif - f g h, where the traversal is a linear chain of steps;
or
2. Nested motif - f (g h) where, the nested traversal g h is passed
as an argument to step f [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
        <p>A step (f 2 ) can be, de ned as f : A? ! B?15. Where, f maps a set
of traversers of type A (located at objects of A) to a set of traversers
of type B (located at objects of B). Given that Gremlin is a language
and a virtual machine, it is possible to design another traversal language
that compiles to the Gremlin traversal machine (analogous to how Scala
compiles to the JVM). As a result, there exists various Gremlin dialects
such as Gremlin-Groovy, Gremlin-Python, etc.</p>
        <p>
          A Gremlin traversal ( ) can be compiled down to a collection of steps
or instructions which form the Gremlin instruction set. The Gremlin
instruction set comprises approximately 30 steps which are su cient to
provide general purpose computing and for expressing the common motifs of
14 The bulk of a traverser is number of equivalent traversers a particular traverser
represents.
15 The Kleene star notation (A?; B?) denotes that multiple traversers can be in the
same element (A,B).
any graph traversal query, as highlighted by [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. However, in a
majority of cases only around 10 of these instructions are su cient to address
the most common information needs (i.e. for graph pattern matching and
traversal).
3.1
        </p>
      </sec>
      <sec id="sec-2-5">
        <title>Graph Pattern Matching Queries in Gremlin</title>
        <p>
          Gremlin provides the GPM construct, analogous to SPARQL [
          <xref ref-type="bibr" rid="ref14 ref19 ref3">3,14,19</xref>
          ]16,
using the Match()-step. This enables the user to represent the query using
multiple individual connected or disconnected graph patterns. Each of
these graph patterns can be perceived as a simple path traversal,
to-andfrom a speci c source and destination, over the graph.
        </p>
        <p>
          Each traversal is a path query starting at a particular source (A)
and terminating at a destination (B) by visiting vertices (v 2 V) and
edges (e 2 E (V V) ) in an alternating fashion (i.e. referred to as
traversing [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]). Each path query is composed of one or more single-step
traversals. Through function composition and currying , it is possible
to de ne a query of arbitrary length [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. These path queries can be a
combination of either a source, destination, labelled traversal or all of
them in a varying fashion, depending on the query de ned by the user.
Example 3. For instance, consider a simple path traversal to the oldest
person that marko knows over the graph G as show in Figure 1. Listing 1.1
represents the gremlin query for the described traversal.
1 g .V( ) . has ( "name" , "marko" ) . out ( "knows" ) . v a l u e s ( " age " ) . max ( )
        </p>
        <p>
          Listing 1.1. Return the age of the oldest person marko knows
Here, g.V() i.e. Vg is the traverser de nition bijective to V where, ]i ((Vg)i)
= V. Functionally, this query be written using function currying as:
(2)
The terms outknows, valuesage and hasname are the single-step
Gremlin operations/traversals. In [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], Rodriguez presents the itemisation of
such single-step traversals which can be used to represent a complex path
traversal. Thus, as described earlier, through functional composition and
currying one can represent a graph traversal of random length. If i be
the starting vertex in G, then the traversal shown in listing 1.1 can be
represented as following function:
f (i) = max( age vin elkanbo+ws
eout
nmaamrkeo+) (i)
(3)
16 However, Property Graphs do not encode all semantics of RDF Graphs (e.g. blank
nodes.
where, f : P(V) ! P(S). A detailed illustration of the single step traversals
can be referred from [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
3.2
        </p>
      </sec>
      <sec id="sec-2-6">
        <title>Evaluation of match()-step in Gremlin</title>
        <p>
          The match()-step17 evaluates the input graph patterns over a graph G
in a structure preserving manner binding the variables and constants to
their respective values (cf. De nition 2). We denote the evaluation of
a traversal t over a graph G, using the notation [[t]]g which is borrowed
from [
          <xref ref-type="bibr" rid="ref14 ref4">14,4</xref>
          ]. When a match()-step is encountered by the Gremlin machine,
it treats each graph pattern as an individual path traversal. These graph
patterns are represented using as()-step18 (step-modulators19 i.e. naming
variables) such as a, b, c, etc.), which typically mark the start (and end)
of particular graph patterns or path traversals.
        </p>
        <p>However, the order of execution of each graph pattern is up to the
match()-step implementation, where the variables and path labels are
local only to the current match()-step. Due to this uniqueness of the
Gremlin match()-step it is possible to:
1. treat each graph pattern individually as a single step traversal and
thus, construct composite graph patterns by joining (path-joins) each
of these single step traversals;
2. combine multiple match()-steps for constructing complex navigational
traversals (i.e. multi-hop queries), where each composite graph pattern
(from a particular match()-step) can be joined using the
concatenative join (ref. Section 2.3).</p>
        <p>For instance, consider the GPM gremlin query as shown in Listing 1.2.
1 g .V( ) . match (
2 . as ( ' a ' ) . out ( ' c r e a t e d ' ) . as ( ' b ' ) ,
3 . as ( ' b ' ) . has ( ' name ' , ' l o p ' ) ,
4 . as ( ' b ' ) . i n ( ' c r e a t e d ' ) . as ( ' c ' ) ,
5 . as ( ' c ' ) . has ( ' age ' , 3 0 ) ) . s e l e c t ( ' a ' , ' c ' ) . by ( ' name ' )
Listing 1.2. This traversal returns the names of people who created a project named
'lop' that was also created by someone who is 30 years old.</p>
        <p>
          Each of the comprising four graph patterns (traversals) of the query
(listing 1.2), can be individually represented using the curried functional
notation as described in Equation 2. Thus,
17 http://tinkerpop.apache.org/docs/3.2.3/reference/#match-step
18 Meaningful names can be used as variable names for enhancing query readability.
19 Rodriguez et al. [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] refer to step modulators as `syntactic sugar' that reduce the
complexity of a step's arguments by modifying the previous step.
(5)
        </p>
        <p>The input arguments of the match()-step are the set of graph patterns
de ned above in equations 4,5, which form a composite graph pattern
(the nal traversal ( )). At run-time, when a traverser enters
match()step, it propagates through each of these patterns guaranteeing that, for
each graph pattern, all the pre x and post x variables (i.e. "a", "b",
etc) are binded with their labelled path values. It is only then allowed
to exit, having satis ed this condition. In simple words, though each of
these graph patterns is evaluated individually, it is made sure that at
runtime, the overall structure of the composite graph pattern is preserved by
mapping the path labels to declared variables.</p>
        <p>For instance, in the query (ref. listing 1.2), the starting vertex of g(i)
labelled as "b" which is the terminal vertex of f (i), similarly for h(i) and
j(i) with vertex labelled as "c". It is therefore necessary, to keep a track
of the current location of a traverser in the graph, to preserve traversal
structure. This is achieved in Gremlin by match() and bind() functions
respectively, which we outline next.</p>
        <p>The evaluation of an input graph pattern/traversal in Gremlin is taken
care by two functions:
1. the recursively de ned match() function- which evaluates each
constituting graph pattern and keeps a track of the traversers location in
the graph (i.e. path history), and,
2. the bind() function- which maps the declared variables (elements and
keys) to their respective values.</p>
        <p>
          Using equations (4, 5) (curried functional form of path traversals) in the
recursive de nition of match() by [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], we have:
where, t a (t) is the labelled path of traverser t. A path (ref. de nition 3) is
labelled "a" via the step-modulator .as(), of the traverser in the current
traversal ( ); m1, m2, m3 are hidden path labels which are appended
to the traversers labelled path for ensuring that each pattern is executed
only once; and bindx(t) is de ned as:
bindx(t) =
&lt;8 tt x (t) = (t) :: xx((tt)) == 0(t)
: : otherwise.
(7)
where 0: T ! U, is a function that maps a traverser to its current location
in the graph G (e.g., v 2 V, V 2 U) [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. It ( 0) can be perceived analogues
to de ned in de nition 1, which maps elements and keys to values in
G, however in later case the value is the location of a traverser in G.
4
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Mapping Gremlin GPM traversals to Graph Algebra</title>
      <p>In this section, we present a mapping algorithm for encoding a given
Gremlin pattern-matching traversal, relational graph algebra. Figure 4,
describes the conceptual architecture for formalizing Gremlin traversals.
We follow a bottom-up approach in order to construct the relational graph
algebra based on the traversal.
4.1</p>
      <p>Mapping Gremlin traversals to Graph Algebra
1. The input query is parsed and its constituent individual graph
patterns are extracted from the match()-step and the optional where()
step. /* Parse step
2. For each single graph patterns (single path traversals) in the query,
we rst construct the curried functional form (ref. equation 2).
3. We then map the get-vertices/get-edges operator for the encountered
g.V()/g.E() step (i.e. to the rst graph pattern) respectively. /* Steps
2-12 are the Build steps
4. Append a traverse-operator to all the respective in-coming and
outgoing edge traversals for each, that appear inside the match()-step.
5. Append a property- lter operator to all the respective has() and
values() steps based on the match()-step.
6. Multiple match() steps can be connected processed using the
concatenative join operator.
7. Append a selection operator, if the match() step is succeeded by a
where() step (this is an optional in gremlin queries).
8. Append a projection operator, if select()-step is declared with a
match() or where() step.
9. Append a deduplication operator, based on whether the dedup() step
is declared after the select() step.
10. Append a sorting operator, if the order() step with an optional by()
modi er is declared after the select() step.
11. Append a grouping operator, if the group() step with an optional
by() modi er is declared after the select() step.
12. Map the union operator if the query contains a union()-step20. Union
is technically a binary operator, however, a union of multiple patterns
can be constructed using a left deep join tree representation.</p>
      <p>Next we present three examples of gremlin traversals which have been
formalised using the proposed graph relational algebra.</p>
      <p>Example Query 1. The sample query as shown in listing 1.2, can be
formalized as:
yname
a;c J age=30 #bc [created] nbame=lop "ba [created](Vg)Kg</p>
      <p>c
| {tz }
(8)
Example Query 2. Consider the following gremlin traversal shown in
listing 1.3 below:
1 g .V( ) . match (
2 . as ( ' a ' ) . h a s L a b e l ( ' p e r s o n ' ) . v a l u e s ( ' age ' ) . as ( ' b ' ) ) .</p>
      <p>
        s e l e c t ( ' b ' ) . o r d e r ( ) . by ( a s c )
Listing 1.3. This traversal returns the list all the persons in the ascending order of
the age.
20 It is not a common practice to use a union()-step in Gremlin GPM traversals, as
multiple match()-steps in conjunction with where()-steps can be used as per required
(the ad in nitium style of traversing [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]).
      </p>
      <p>The gremlin traversal shown above (Listing 1.3) can be formalized as
follows:</p>
      <p>b a
bJ age label=person(Vg)Kg
| {tz }
(9)
(10)
1 g .V( ) . union (
2 . match (
3 . match (
. as ( ' a ' ) . out ( ' c r e a t e d ' ) . as ( ' c ' ) ) ,
. as ( ' b ' ) . out ( ' c r e a t e d ' ) . as ( ' c ' ) ) ) . s e l e c t ( ' a ' , ' c ' )
Listing 1.4. This traversal returns the list of all the people who have collaboratively
created a software.</p>
      <p>The gremlin traversal shown above (Listing 1.4) can be formalized as
follows:
b;c J"ca [created](Vg)Kg ] J"bc [created](Vg)Kg
| }</p>
      <p>
        t = {t1z] t2
Optimizations. The Gremlin graph traversal machine inherently o ers a
wide variety of machine and query level optimizations as traversal
strategies including (i) query rewriting (decoration), (ii) traversal optimization,
(iii) vendor optimization (using byte-code for gremlin language variants),
(iv) nalization, and (v) veri cation. We will not go into the speci c
details, rather we point the interested reader to (Section 4, page 6-7 of [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]).
Example Query 3. Consider the following gremlin traversal shown in
listing 1.4 below,
Limitations. We do not cover the complete Gremlin language, as clari ed
earlier, we strictly focus on formalizing the GPM (declarative) construct.
In the declarative construct of Gremlin, in this work, we only focus on
covering the core functionality of pattern-based graph traversing. We do
not formalize mutating and speci c graph-based task (such as,
centrality, eccentricity, etc) traversals and data manipulation operators, e.g.,
addVertex(), addEdge(), addProperty(), aggregate() operators.
5
      </p>
    </sec>
    <sec id="sec-4">
      <title>Related Work</title>
      <p>
        This section summarizes work towards formalizing query algebras for
different data models and corresponding query languages.
Property graphs. The Property Graph data model is one of the
popular graph data models that provides a rich set of features for the user
to model domain-speci c real world data. Various query languages have
been proposed over the years for querying over PGs. The Cypher query
language21, native to the Neo4J PG database, is a high-level declarative
pattern matching-based graph query language. The developers of Neo4J
&amp; Cypher strive at standardizing Cypher by providing open formal
speci cation via the OpenCypher22 project [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. One of the limitations of
Cypher is that it misses certain graph querying functionality such as the
support for regular path queries and graph construction. PGQL [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], is
an SQL-like syntax based graph query language for the PG data model.
Albeit being able to overcome the limitations of Cypher, and lure the SQL
community with its SQL-like syntax support, PGQL lacks
standardization and support by database technology vendors.
      </p>
      <p>
        RDF: The Resource Description Framework (RDF), is another graph
data model, popular in the semantic web domain. In RDF the data (i.e.,
entity descriptions) are stored as triples, similar to the node-edge
formalism in PGs. SPARQL [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], the query language for RDF triple stores, is a
Cypher-like declarative GPM query language for querying RDF graphs.
SPARQL is a W3C standard and its query algebra has been formally
described in works such as [
        <xref ref-type="bibr" rid="ref14 ref4">14,4</xref>
        ]. Moreover, multiset semantics have also
been formalized by [
        <xref ref-type="bibr" rid="ref19 ref3">3,19</xref>
        ].
      </p>
      <p>
        SQL: Relational databases cater rather limited support for executing
graph queries. However, certain databases such as PostgreSQL allow the
execution of recursive queries. The SAP HANA Graph Scale-Out
Extension prototype (GSE), using a SQL-type language proposed by [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ],
supports modelling high level graph queries.
6
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and Outlook</title>
      <p>
        In this extended paper, we presented the initial e orts on formalizing
GPM traversals, which is a subset of the Gremlin traversal language
and machine. Since, Gremlin is both a query language and a machine,
it enables the graph (pattern-match) traversals to be represented in any
formal query language, given that it supports function composition and
nesting. Our current work provides a theoretical foundation to leverage
21 https://neo4j.com/developer/cypher-query-language/
22 http://www.opencypher.org/
this advantage of Gremlin for querying graphs on various graph engines.
Furthermore, our current work lays the foundation for supporting query
interoperability by allowing translation of SPARQL queries to Gremlin
GPM traversals [
        <xref ref-type="bibr" rid="ref23 ref24">23,24</xref>
        ], using the proposed mapping, in order to: (i)
leverage the advantage of using graph traversal for query property graphs, and
(ii) enable the use of SPARQL query language (the defacto standard of
RDF stores) on top of other OLAP-based graph processors and
OLTPbased graph engines [
        <xref ref-type="bibr" rid="ref23 ref24">23,24</xref>
        ]. This enables the semantic web community
to reap best of both worlds, i.e., accessing a plethora of graph data
management systems without the obligation of embracing a new graph query
language. As the near future work, we aim to address the limitations
pointed out in Section 4.1.
      </p>
      <sec id="sec-5-1">
        <title>Acknowledgement</title>
        <p>This work is supported by the funding received from EU-H2020
Framework Programme under grant agreement No. 780732 (BOOST4.0)</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>R.</given-names>
            <surname>Angles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Arenas</surname>
          </string-name>
          , et al.
          <article-title>Foundations of modern graph query languages</article-title>
          .
          <source>arXiv preprint arXiv:1610.06264</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>R.</given-names>
            <surname>Angles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Boncz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Larriba-Pey</surname>
          </string-name>
          , I. Fundulaki,
          <string-name>
            <given-names>T.</given-names>
            <surname>Neumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Erling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Neubauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Martinez-Bazan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kotsev</surname>
          </string-name>
          ,
          <string-name>
            <surname>and I. Toma.</surname>
          </string-name>
          <article-title>The linked data benchmark council: a graph and rdf industry benchmarking e ort</article-title>
          .
          <source>ACM SIGMOD Record</source>
          ,
          <volume>43</volume>
          (
          <issue>1</issue>
          ):
          <volume>27</volume>
          {
          <fpage>31</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>R.</given-names>
            <surname>Angles</surname>
          </string-name>
          et al.
          <article-title>The multiset semantics of SPARQL patterns</article-title>
          .
          <source>In The Semantic Web - ISWC -15th International Semantic Web Conference</source>
          , Kobe, Japan, pages
          <volume>20</volume>
          {
          <fpage>36</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>R.</given-names>
            <surname>Angles</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          .
          <article-title>The expressive power of sparql</article-title>
          .
          <source>In International Semantic Web Conference</source>
          , pages
          <volume>114</volume>
          {
          <fpage>129</fpage>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>R.</given-names>
            <surname>Angles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Thakkar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Tomaszuk</surname>
          </string-name>
          .
          <article-title>RDF and property graphs interoperability: Status and issues</article-title>
          .
          <source>In Proceedings of the 13th Alberto Mendelzon International Workshop on Foundations of Data Management</source>
          , Asuncion, Paraguay, June 3-7,
          <year>2019</year>
          .,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>H.</given-names>
            <surname>Garcia-Molina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Ullman</surname>
          </string-name>
          , et al.
          <article-title>Database systems: the complete book</article-title>
          .
          <source>Pearson Education India</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7. L.
          <string-name>
            <surname>Gomes-Jr</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Amann</surname>
          </string-name>
          , et al.
          <article-title>Beta-algebra: Towards a relational algebra for graph analysis</article-title>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>A.</given-names>
            <surname>Gubichev</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Then</surname>
          </string-name>
          .
          <article-title>Graph pattern matching: Do we have to reinvent the wheel</article-title>
          ?
          <source>In Proceedings of Workshop on GRAph Data management Experiences and Systems</source>
          , pages
          <fpage>1</fpage>
          <article-title>{7</article-title>
          . ACM,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. J. Holsch and
          <string-name>
            <given-names>M.</given-names>
            <surname>Grossniklaus</surname>
          </string-name>
          .
          <article-title>An algebra and equivalences to transform graph patterns in neo4j</article-title>
          .
          <source>In EDBT/ICDT 2016 Workshops: EDBT Workshop on Querying Graph Structured Data (GraphQ)</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Keswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Thakkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dubey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          .
          <article-title>The litmus test: Benchmarking rdf and graph data management systems</article-title>
          .
          <source>In Proceedings of 13th International Conference on Semantic Systems-Poster &amp; Demos</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>C. Krause</surname>
          </string-name>
          et al.
          <article-title>An sql-based query language and engine for graph pattern matching</article-title>
          .
          <source>In International Conference on Graph Transformation</source>
          , pages
          <volume>153</volume>
          {
          <fpage>169</fpage>
          . Springer,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>C. Li</surname>
          </string-name>
          ,
          <string-name>
            <surname>K. C.-C. Chang</surname>
          </string-name>
          , et al.
          <article-title>Ranksql: query algebra and optimization for relational top-k queries</article-title>
          .
          <source>In Proceedings of the 2005 ACM SIGMOD international conference on Management of data</source>
          , pages
          <volume>131</volume>
          {
          <fpage>142</fpage>
          . ACM,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>G. S. J.</given-names>
            <surname>Marton</surname>
          </string-name>
          .
          <article-title>Formalizing opencypher graph queries in relational algebra</article-title>
          .
          <source>Published online on FTSRG archive</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>J. Perez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Arenas</surname>
          </string-name>
          , et al.
          <article-title>Semantics and complexity of sparql</article-title>
          .
          <source>In International semantic web conference</source>
          , pages
          <volume>30</volume>
          {
          <fpage>43</fpage>
          . Springer,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15. E. Prud et al.
          <article-title>Sparql query language for rdf</article-title>
          .
          <source>Citeulike Online Archive</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Rodriguez</surname>
          </string-name>
          .
          <article-title>The gremlin graph traversal machine and language (invited talk)</article-title>
          .
          <source>In Proceedings of the 15th Symposium on Database Programming Languages, Pittsburgh</source>
          , PA, USA, pages
          <volume>1</volume>
          {
          <fpage>10</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Rodriguez</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Neubauer</surname>
          </string-name>
          .
          <article-title>The graph traversal pattern</article-title>
          .
          <source>In Graph Data Management: Techniques and Applications</source>
          ., pages
          <volume>29</volume>
          {
          <fpage>46</fpage>
          . IGI Global,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Rodriguez</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Neubauer</surname>
          </string-name>
          .
          <article-title>A path algebra for multi-relational graphs</article-title>
          .
          <source>In Workshops Proceedings of the 27th International Conference on Data Engineering, ICDE</source>
          <year>2011</year>
          , pages
          <fpage>128</fpage>
          {
          <fpage>131</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>M. Schmidt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Meier</surname>
          </string-name>
          , et al.
          <article-title>Foundations of sparql query optimization</article-title>
          .
          <source>In Proceedings of the 13th International Conference on Database Theory</source>
          , pages
          <volume>4</volume>
          {
          <fpage>33</fpage>
          . ACM,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20. G. Szarnyas et al.
          <article-title>opencypher speci cation</article-title>
          .
          <source>Online FTSRG archive</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>H.</given-names>
            <surname>Thakkar</surname>
          </string-name>
          .
          <article-title>Towards an open extensible framework for empirical benchmarking of data management solutions: LITMUS</article-title>
          . In European Semantic Web Conference, pages
          <volume>256</volume>
          {
          <fpage>266</fpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22. H.
          <string-name>
            <surname>Thakkar</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Keswani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Dubey</surname>
          </string-name>
          , et al.
          <article-title>Trying not to die benchmarking: Orchestrating RDF and graph data management solution benchmarks using LITMUS</article-title>
          .
          <source>In Proceedings of the 13th International Conference on Semantic Systems, SEMANTICS</source>
          <year>2017</year>
          , Amsterdam, The Netherlands,
          <source>September 11-14</source>
          ,
          <year>2017</year>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23. H.
          <string-name>
            <surname>Thakkar</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Punjani</surname>
          </string-name>
          , et al.
          <article-title>A stitch in time saves nine{SPARQL querying of property graphs using gremlin traversals</article-title>
          . arXiv:
          <year>1801</year>
          .02911,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24. H.
          <string-name>
            <surname>Thakkar</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Punjani</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Lehmann</surname>
            , and
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Auer</surname>
          </string-name>
          .
          <article-title>Two for one: Querying property graph databases using SPARQL via gremlinator</article-title>
          .
          <source>In Proc. of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences &amp; Systems (GRADES)</source>
          and
          <article-title>Network Data Analytics (NDA), pages 1{5</article-title>
          . ACM,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25. H.
          <string-name>
            <surname>Thakkar</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Punjani</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.-E. Vidal</surname>
            , and
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Auer</surname>
          </string-name>
          .
          <article-title>Towards an integrated graph algebra for graph pattern matching with gremlin</article-title>
          .
          <source>In Proceedings of the 28th International Conference, DEXA</source>
          <year>2017</year>
          , Lyon, France,
          <source>August 28-31</source>
          ,
          <year>2017</year>
          , Proceedings,
          <string-name>
            <surname>Part</surname>
            <given-names>I</given-names>
          </string-name>
          , pages
          <volume>81</volume>
          {
          <fpage>91</fpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>C. Unger</surname>
            ,
            <given-names>A. N.</given-names>
          </string-name>
          <string-name>
            <surname>Ngomo</surname>
          </string-name>
          , et al.
          <article-title>6th open challenge on question answering over linked data (QALD-6)</article-title>
          . In Semantic Web Challenges - Third
          <source>SemWebEval Challenge at ESWC</source>
          <year>2016</year>
          , Heraklion, Crete, Greece, pages
          <volume>171</volume>
          {
          <fpage>177</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27. O. van
          <string-name>
            <surname>Rest</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Hong</surname>
          </string-name>
          , et al.
          <article-title>Pgql: a property graph query language</article-title>
          .
          <source>In Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems, page 7. ACM</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>