<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>DL</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Query Rewriting for Nested Navigational Queries over Property Graphs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bianca Löhnert</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nikolaus Augsten</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cem Okulmus</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Magdalena Ortiz</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Paderborn University</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>TU Wien</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Salzburg</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>38</volume>
      <fpage>3</fpage>
      <lpage>6</lpage>
      <abstract>
        <p>The framework of ontology-mediated data access (OBDA) has enjoyed great success in the setting of relational databases. But while querying graph data has always been seen as a key motivation of OBDA, current technologies lack support for standard graph database systems. In recent years, graph databases using the labeled property graph data model have seen increasing popularity, and two ISO standards for querying property graphs were made public in 2023: GQL and SQL-PGQ. Leveraging this momentum, we take a big step toward ontology-mediated querying of property graphs. We propose DL-LitePG , a DL-Lite variant tailored for property graphs that uses property values to define concepts and roles in the ontology, and present a practical rewriting algorithm for rewriting navigational queries with DL-LitePG ontologies. We consider nested two-way regular path queries (N2RPQs) and a large class of conjunctive N2RPQs. Our queries can access property values within path expressions and capture a substantial portion of the GQL and SQL-PGQ standards. This is the first algorithm to fully support full 2RPQs in the presence of ontologies by leveraging nested regular paths. We present our algorithm and a proofof-concept implementation and conclude with a set of preliminary experiments that showcase the practicality of our approach.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;ontology-based data access</kwd>
        <kwd>graph query languages</kwd>
        <kwd>ontology-meditated query answering</kwd>
        <kwd>description logics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>A core aim of the ontology-based data access (OBDA) framework (additionally to the virtual integration of
data sources) is to extend existing data with domain-specific knowledge without the need to materialize
all ensuing facts, but still enable access to all the consequences when answering user queries. One
of the key techniques to realize OBDA is via query rewriting: one takes as input the domain-specific
knowledge in the form of an ontology, and a query specified over the extended signature (with terms
from the data and the ontology) and rewrites into a new query, which uses the reduced signature that is
present in the data alone, but captures all the semantic consequences that would follow in the presence
of the ontology. Hence, one can evaluate the query using an existing system while still making efective
use of the ontology. This is known as ontology-mediated query answering (OMQA) and it is one of the
key techniques underlying many OBDA systems.</p>
      <p>
        The OBDA framework is already seeing a number of commercial applications, albeit only in the
setting of relational databases. While it is unlikely that relational databases are being replaced anytime
soon, we none-the-less see new forms of storing and querying data. Graph databases are one such
example, where we already have a large number of commercial systems. Instead of the relational
model that underpins relational database systems, these instead focus on the labeled property graph
(LPG) model. It allows for nodes and binary edges over nodes, each of which can be mapped to a
number of labels and associated with data values via so-called properties. The increasing popularity of
graph databases has recently culminated in the standardization body ISO issuing two new standard for
graph query languages: GQL and SQL-PGQ. The key distinguishing feature of these query languages
are their navigational capabilities, that allow to match arbitrarily long paths in the data. Both GQL
and SQL-PGQ share the same navigational core [1], which includes the standard navigational query
languages: (two-way) regular path queries ((
        <xref ref-type="bibr" rid="ref2">2</xref>
        )RPQs), nested 2RPQs (N2RPQs), and their conjunctive
versions (C2RPQs and CN2RPQs).
      </p>
      <p>The study of navigational queries in the context of OMQA is far from novel. For more than a decade
we have had tight complexity results for most description logics, ranging from lightweight to highly
expressive, as well as for all standard navigational languages (2RPQs, N2RPQs, C2RPQs, and CN2RPQs),
e.g., [2, 3, 4]. However, these studies focused on the boundaries of decidability and computational
complexity. The proposed algorithms are so unnameable to implementation that, more than a decade
later, not a single practical implementation has been proposed.</p>
      <p>Unfortunately, there were two significant roadblocks that discouraged the development of practical
techniques for navigational OMQs over graph databases. First, practical languages imposed ad-hoc
restrictions on navigational languages that made them inadequate. Cypher, the most widely used
language for graph databases, did not support RPQs in full, and did not enable the basic homomorphism
semantics that is natural in the OMQA context. Second, it has been proven that even plain RPQs in
the presence of the simplest DL-Lite ontologies cannot be rewritten into C2RPQs [5]. Instead, they
require nested RPQs, which existing technologies did not support. The arrival of GQL and SQL-PGQ has
removed both obstacles at once. Since the standard now contains CN2RPQs and basic homomorphism
semantics, practical languages are being updated quickly to support it. This has finally made it feasible
to have practical OMQA for graph databases.</p>
      <p>Seizing the opportunity, we propose the first practical query rewriting technique that covers full
2RPQs and N2RPQs in the presence of DL-Lite ontologies. We also consider conjunctive N2RPQs;
however, to ensure the algorithm remains practical and useful, we impose a restriction called
’join-onfree’, whereby non-answer variables cannot be shared by multiple atoms. We also consider the property
data values, a central feature of the LPG model that is often disregarded in the OMQA literature. We
allow property tests in queries, also along the navigational paths. On the ontology side, we define a
variant of DL-Lite that can use property value tests on the left-hand-side of axioms, thus allowing to
create concepts and roles based on these values.</p>
      <p>In summary, we make the following contributions:
1. We present the first practical algorithm to rewrite join-on-free CN2RPQs into CN2RPQs, capturing
the full power of navigational languages such as GQL.
2. This is the first work on OMQA on the LPG model that makes full use of properties, by including
data tests over property values in both the ontology and query language.
3. We show-case the practical utility of the algorithm by providing a proof-of-concept
implementation that takes as input an ontology in OWL format and the real-world query language Cypher
and rewrites into Cypher.</p>
      <p>This paper is structured as follows. In Section 2 we present the key terminology of our setting and
also introduce our novel description logic to express tests over propety values. In Section 4 we present
our rewriting algorithm on join-on-free CN2RPQs. In Section 5 we present our proof-of-concept
implementation and an experimental evaluation that aims to show its performance. In Section 6 we
summarise our results and highlight future work.</p>
      <p>Related Work. We note that a few recent works have begun the efort of closing this gap towards
the practical OMQA algorithms in the graph database setting. Already Di Martino et al. [6] leveraged
the recursion in regular path queries to be able to rewrite a fragment of ℰ ℒ. They use navigational
queries as target language for query rewriting, but their source languages are only instance queries and
conjunctive queries. It is also a theoretical work not aimed at implementation, and thus more inline
with the mentioned theoretical works [2, 3] than with this paper. Aiming for practical implementation,
Dragovic et al. [7] considered a restricted fragment of C2RPQs that can be rewritten into UC2RPQs,
and DL-Lite ontologies. It has limited support for data tests. A next step came in the work from [5],
where the ontology language is also a fragment of ℰ ℒ very similar to the one of Di Martino et al. [6].
There are also restrictions on the regular path expressions to ensure rewritability into C2RPQs.</p>
      <p>Semantics
 ℐ
ℐ ⊆  ℐ
 ℐ ∖ ℐ
ℐ ⊆  ℐ</p>
      <p>×  ℐ
{(, ) | (, ) ∈ ℐ }
{ | (, ) ∈ ℐ }
{ | (, ′) ∈ ℐ for some ′ with ′ ⊙ } ∪
{(, ′) | ((, ′), ′) ∈ ℐ for some ′ with ′ ⊙ }</p>
    </sec>
    <sec id="sec-2">
      <title>2. Querying Property Graphs with Navigational Queries and DL-Lite</title>
      <p>Ontology language. We present now the ontology language that we will use in our approach, which
we call DL-LitePG . It is a careful extension of the well-known DL-Lite family [8] of ontology languages,
where we permit tests over properties via a concrete domain, afecting both concept and role inclusions.</p>
      <sec id="sec-2-1">
        <title>We assume disjoint, countably infinite sets</title>
        <sec id="sec-2-1-1">
          <title>C, R, N and K of concept names, role names, individuals</title>
          <p>and property keys, respectively. The set of roles is defined as
R = R ∪ {− |  ∈ R}.</p>
          <p>We also assume a concrete domain (D, PD), with D a set of values and PD a set of binary predicates
the set of data tests as TD = { ⊙  ? |  ∈ K, ⊙ ∈
PD,  ∈ D}.
over D. For example, (D, PD) could contain the integers with the usual =, ≤ , ≥
predicates. We define
Definition 1 (DL-LitePG ). To define the types of inclusions, we use the following syntax:
 :=  | ∃.⊤ |  (basic concepts and data test)
 :=  | ¬ (general concept)
 :=  | − (basic role)
 :=  |  (roles and data test)
 :=  | ¬ (general role)
(RI) the form  ⊑ . A TBox is a finite set of CIs and RIs. We use
closure of {(, ) |  ⊑  ∈  } and call  a subrole of  (in  ) if  ⊑* .
⊑
where  ∈ C,  ∈ R and  ⊆</p>
          <p>TD. A concept inclusion (CI) has the form  ⊑  and a role inclusion
* to denote the reflexive transitive</p>
          <p>The semantics are given via interpretations of the form ℐ = ( ℐ , · ℐ ), with  ℐ a non-empty set called
the abstract domain. · ℐ is the interpretation function, which assigns to every  ∈ C a set ℐ ⊆  ℐ , to
every  ∈ R a relation ℐ ⊆  ℐ
×  ℐ and to every  ∈ K a relation ℐ ⊆
( ℐ ∪ ( ℐ
×  ℐ )) ×
It is extended to concepts and CIs in the usual way, as seen in Table 1. Modelhood is also standard.
to a social network graph, such as profile information from LinkedIn.</p>
          <p>= {︀ {born ≥ 1997 ?, born ≤ 2012 ?} ⊑ GenZ,
{time ≥ (2025-01-01)?} ⊑ Recent,</p>
          <p>Opole ⊑ Poland,
∃employs− .⊤ ⊑ Employed,</p>
          <p>Employed ⊑ ¬Unemployed,</p>
          <p>TechCompany ⊑ ∃employs.Engineer,
∃announce.⊤ ⊑ Hiring,
friendsWith ⊑ friendsWith−
︀}
Data model. In this paper the data (or ABox) is given as finite property graphs [9, 10, 11].
Definition 2.</p>
          <p>A property graph (PG)  has the form (, , label, prop), where:
•  is a non-empty set of nodes;</p>
          <p>such an edge in the form (, ′) and call it an -edge;
•  is the set of edges, where each edge is a triple (, , ′) with  ∈ R and , ′ ∈  ; we may write
• label is a total function  → 2C;
• prop is a partial function ( ∪ ) × K → D mapping pairs (, ) with  ∈ ( ∪ ) and  ∈ K
to a value in D.</p>
          <p>If  ⊆ N and it is finite, we call it an ABox. We say that ′ = ( ′, ′, label′, prop′) is a subgraph
of  if  ′ ⊆  , ′ ⊆ , label′() = label() for all  ∈  ′, and prop′(, ) = prop(, ) for all
 ∈  ′ ∪ ′,  ∈ K.</p>
          <p>Note that our definition of property graph allows only a single edge for each role between each pair
of nodes. Also, the same name of property keys is used for both nodes and edges, as usually done in
property graphs [10, 11].</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>Example 2. Below is an example property graph, over the social network setting from Example 1.</title>
          <p>{ viewed }</p>
          <p>e5
time = 24.04.2025
{ Opole }</p>
          <p>City17
population = 2,
area = 3
{ User }</p>
          <p>Alice
born = 2000
{ announce }</p>
          <p>e2
on = 14.03.2025
{ locatedIn }</p>
          <p>e1
since = 2012
{ friendsWith }</p>
          <p>e4
since = 2012
{ TechCompany }</p>
          <p>SmartBees
revenue = 500k
founded = 2012
{ Company }
nuCompany
revenue = 50k
founded = 2015
{ User }</p>
          <p>Bob
born = 1980
{ owns }</p>
          <p>e6
since = 2025
{ employs }</p>
          <p>e3
since = 2010</p>
          <p>Each interpretation can be seen as a property graph, and vice-versa.</p>
          <p>Definition 3. For a property graph  = (, , label, prop), define ℐ = (, · ℐ ) as follows: ℐ = { ∈
 |  ∈ label()}, ℐ = {(, ′) | (, ′) ∈ } and ℐ = {(, ) |  = prop(, )}. Conversely,
ℐ = ( ℐ , · ℐ ) induces a (possibly infinite) property graph PG (ℐ) = ( ℐ , , label, prop), where  =
{(, ′) | (, ′) ∈ ℐ }, label() = { ∈ C |  ∈ ℐ } and prop(, ) = { ∈ D | (, ) ∈ ℐ }. ℐ is
a model of an ABox , if  is a subgraph of  (ℐ). An ABox  is consistent with a TBox  if there is a
model of  and  .</p>
          <p>For the problem of checking that a given ABox is consistent with an DL-LitePG TBox, we refer to the
work of Artale et al. [12]; their algorithms can be easily tuned to account for key values on edges.</p>
        </sec>
        <sec id="sec-2-1-3">
          <title>Query Language. We study conjunctive nested two-way regular path queries (CN2RPQs), the extension</title>
          <p>of C2RPQs, the navigational query language for graphs that has received most attention in OMQA.
We enhance CN2RPQs by data tests similar as data tests for navigational conjunctive queries in [7] to
query for property values, assuming that the predicates in PD can be realized in GQL and Cypher. To
represent CN2RPQs we rely on nested nondeterministic finite automaton (n-NFA), following generally
the notions from [3], with some simplifications.</p>
          <p>We will first define nested two-way regular path expressions by way of NFAs, and then give the
definition of conjunctive nested two-way regular path queries based on them. We assume a given
alphabet Σ, and to define nested automata, we extend Σ by allowing n-NFAs as symbols.
Definition 4. Let A0 be the set of all NFAs over a given alphabet Σ. For  &gt; 0, the set A of nested
NFA (n-NFA) over Σ of nesting depth  contains all automata over Σ = Σ ∪ {⟨ ⟩ |  ∈ A, 1 ≤  &lt; }.</p>
          <p>We omit the nesting depth of an n-NFA when irrelevant. We are interested in n-NFAs that use the
alphabet of symbols and tests that may occur in our property graphs. Note that a set of unary data tests
can be simulated by multiple transitions, thus the alphabet does not include a set of data tests for nodes.
Definition 5.</p>
        </sec>
        <sec id="sec-2-1-4">
          <title>A nested two-way regular path expression (N2RPE) is an n-NFA over the specific alphabet</title>
          <p>Σ  = 2R∪TD ∪ TD ∪ {C? | C ∈ C}.
 − =</p>
          <p>This alphabet Σ  consists of three kinds of symbols: 1) sets of (possibly inverted) role labels and data
tests, which allow the expression to perform a series of data tests while traversing an edge that has all the
role labels in the test; 2) individual data tests, which are checked against the properties of a node and lastly,</p>
        </sec>
        <sec id="sec-2-1-5">
          <title>3) concept tests, which check whether the current node is part of the required concept.</title>
          <p>The semantics of N2RPEs that we give below is based on finding in an interpretation a path with
suitable outgoing paths. But sometimes it is convenient to talk about the word language of N2RPEs,
that is, the set ( ) of words over Σ  ∪
⋃︀</p>
          <p>∈N A that it accepts in the standard sense, treating the
elements of ⋃︀
Definition 6.</p>
          <p>∈N A as ordinary alphabet symbols. We introduce some useful definitions.</p>
          <p>We define an inverse function over the alphabet Σ  ∪ ⋃︀
∈N A as follows:
{︃{ |  ∈  ∩ TD} ∪ {− |  ∈  ∩ R} ∪ { | − ∈ ,  ∈ R},  ∈ 2R∪TD
otherwise.</p>
          <p>The inverse − of a word  =  1 · · ·   is  − · · ·  1− , and for a language , we let − = {− |  ∈ }.</p>
          <p>Let 
Then  can be inverted to obtain ¯ = ⟨, Σ , ,  − , ⟩, where  − = {( ,  − , ) | (, ,   ) ∈  }.
= ⟨, Σ , , ,  ⟩ be an N2RPE with  ⊆
 the initial states and  ⊆  are the final states.</p>
          <p>Lemma 1. For every N2RPE  , (¯) = ( )− .
we can find an outgoing path that complies with  ′.</p>
          <p>We are now ready to use N2RPEs as a query language for property graphs. To this aim, we define the
semantics of a 2NRPE  as the pairs of nodes that are connected via  , and where at every nested ⟨ ′⟩
in Δℐ ,  1, . . . ,   is a word in ( ), and for each  ∈ {1, . . . , }:
(0, ) ∈ Δℐ ×
Definition 7. For a N2RPE  of nesting depth  and an interpretation ℐ, we define  ℐ as the set of pairs
Δℐ such that there is a sequence of the form 0 11 2 . . .  , where each  is a node
∙ If   ∈ 2R∪TD , then (i) for every data test  ⊙ ? ∈  ℐ and the ′ with ((− 1, ), ′) ∈ ℐ , we
∙ If   = C? for a concept C, then − 1 =  and  ∈ Cℐ .</p>
          <p>have ′ ⊙ , and (ii) for every  ∈  ℐ ∩ R, we have (− 1, ) ∈ ℐ .
∙ If 
∙ If   =  ⊙ ? for a property  ∈ K, then − 1 =  and ′ ⊙  where (− 1, ′) ∈ ℐ .</p>
          <p>=  ′ ∈ A− 1, then − 1 =  and there exists some  ∈ Δℐ such that (, ) ∈  ′ℐ .</p>
          <p>N2RPEs are a very rich query language in their own right, but we get even more flexible querying
capabilities if we use variables to combine N2RPEs.</p>
        </sec>
        <sec id="sec-2-1-6">
          <title>We call a CN2RPQ Boolean if ⃗ is empty.</title>
          <p>Definition 8.
· · ·  (, ) · · · ∧</p>
          <p>A conjunctive N2RPQ (CN2RPQ) (⃗) is a conjunction of atoms  1(1, 1) ∧
 (, ) with each   an N2RPE and ⃗ ⊆
 {, } is a tuple of answer variables.
⋃︀
join-on-free CN2RPQ.</p>
        </sec>
        <sec id="sec-2-1-7">
          <title>The set of join variables Join((⃗)) of a CN2RPQ (⃗) is the set of those variables that occur in</title>
          <p>two diferent atoms of (⃗). If Join((⃗)) ⊆ ⃗ for a CN2RPQ that is not Boolean, then we call (⃗) a
We sometimes abbreviate atoms of the form ⟨ ⟩(, ) as ⟨ ⟩(), and similary, ?() as ().</p>
          <p>It will be convenient to assume that CN2RPQs are connected, that is, the graph whose vertices are
the variables and that has an edge between two variables if they occur in the same atom is a connected
graph. Disconnected queries can be answered as separate queries and then their answers intersected.
Note that, under this assumption, every atom in a join-on-free CN2RPQ has at least one answer variable.
an answer for (⃗) in .</p>
          <p>Definition 9</p>
          <p>(Semantics of CN2RPQs). Consider a CN2RPQs (⃗) and a property graph  =
(, , label, prop). A match  for (⃗) in  is a mapping from the variables in  to nodes in  such that
( (),  ()) ∈  ℐ for every atom  (, ) occurring as a conjunct in (⃗). The tuple  (⃗) is then called
Example 3. To illustrate our query language, we use the schema introduced in Example 1. The query
1(, ) retrieves all technical companies located in Poland that are currently hiring and that have an
employee who is a friend of friends of the user. In query 2(, ) a recruiter might be interested in friends
(and friends of friends) born after 2000 of an employee, who has been working for the company for over
three years. In query 3(, ), we look for companies that were founded before 2003, with a revenue of 105
or more, and want to find all Polish companies that they own since 2010, and also look for all companies
that these companies own, and so on. In query 4(, ), we look for all companies that employ workers
that are friends with some user of GenZ, and return the pairs of company and users, when the user also
viewed a job ofer by the company that the friend is currently employed.</p>
          <p>1(, ) := friendsWith* · employs− (, ), ⟨locatedIn · Poland?⟩ · Hiring? · TechCompany?(, )
2(, ) := {employs− , since ≤ 2023?} · friendsWith* · born ≥ 2000?(, ),
3(, ) := founded ≤ 2015? · revenue ≥ 105? · ({owns, since ≥ 2020?} · ⟨ locatedIn · Poland?⟩)* (, )
4(, ) := Company? · employs · friendsWith · GenZ?(, ), viewed · Job? · announce− (, )
Following the OMQA literature, we use the homomorphism or walk semantics for path queries [13],
and in the presence of a TBox, adopt the certain answer semantics.</p>
          <p>Definition 10. Consider an ABox  and a TBox  . A tuple ⃗ of individuals in  is called a certain
answer to (⃗) over ( , ) if it is an answer to (⃗) in PG (ℐ) for every model ℐ of  and  .</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Rewriting N2RPQs</title>
      <p>We present a rewriting algorithm for N2RPQs with data tests. We assume in this section and the
next a fixed DL-LitePG TBox  . As usual, we can rely on the fact for every property graph  that is
consistent with  there is a canonical model ℐ , that gives exactly the certain answers to all N2RPQs.
The construction of this model is standard, and given in the extended version of this paper [14]. For
convenience, we may think of ℐ , as a set of facts () (, ′),  () and  (, ′), where , ′ are
nodes,  ∈ C ,  ∈ R and  ∈ TD. We call a fact explicit if it is present in , and implicit otherwise.
Note that all datatest facts are explicit, and that all implicit facts are introduced by the canonical model
construction on the basis of other (explicit or implicit) facts.</p>
      <p>
        We define a skipping function that takes an N2RPE as input and outputs a modified N2RPE that
contains additional transitions which allow it to ‘skip over’ implicit facts, and instead traverse only the
explicitly given graph. It draws inspiration from loop computation [15] and clipping [16, 17].
Definition 11 (skipping over N2RPEs). Let  be a DL-LitePG TBox. Given a N2RPE  with alphabet
Σ , we denote by skip ( ) the n-NFA obtained by exhaustively adding transitions as follows. In each
rule,  , ,  and  denote the transition function and states of one (arbitrary but fixed) automata that
occurs (possibly nested) in  , and as usual, ,  ∈ R, ,  ∈ C and  = {1, . . . , } ⊆ TD.
(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) if  ⊑  ∈  and (, ,  ) ∈  with  ∈  (or − ∈ ), then add (, ,  ) to  with  =
( ∖ {}) ∪ {} (or  = ( ∖ {− }) ∪ {− })
(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) if  ⊑  ∈  and (, ?,  ) ∈  , then add (, ?,  ) to 
(
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) if  ⊑  ∈  and (, ?,  ) ∈  , then add, for each ℓ ∈  , a transition (′ℓ, ℓ?, ′ℓ+1) to  , where
 = ′1, ′+1 =  , and ′2, . . . , ′ are fresh states,
(
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) if  ⊑  ∈  and (, ,  ) ∈  with  ∈ , then add (, ,  ) to 
(
        <xref ref-type="bibr" rid="ref5">5</xref>
        ) if ∃.⊤ ⊑  and (, ?,  ) ∈  , then add (, ⟨ ⟩,  ) to  for a fresh two-state NFA   with a
single transition  (, {},  ) between its only initial state  and its only final state  .
(
        <xref ref-type="bibr" rid="ref6">6</xref>
        ) for each ∃.⊤ ⊑ ∃.⊤ ∈  do:
a) if {(, {},  ), ( , {− }, )} ⊆  , then add (, ⟨ ⟩, ) to 
b) if (, {}, ) ∈  and  ∈  , then add (, ⟨ ⟩, ) to 
for a fresh two-state NFA   with a single transition  (, {}, ) between its only initial state 
and its only final state .
      </p>
      <p>
        1

5
?
−
−
(
        <xref ref-type="bibr" rid="ref7">7</xref>
        ) for each CI  ⊑ ∃.⊤ ∈  do:
a) if {(, {},  ), ( , {− }, )} ⊆  , then add (, , ) to 
b) if {(, {},  )} ⊆  and  ∈  , then add (, ,  ) to 
(
        <xref ref-type="bibr" rid="ref8">8</xref>
        ) if (, ,   ), ( , ⟨ ⟩, ) ∈  where  ∈ Σ  and  − ∈ ( ), then add (, ,  ) to  .
      </p>
      <p>Intuitively, each rule application adds a transition that allows  to ‘skip’ an implicit fact  in ℐ ,
and use instead a fact that participated in its creation. We illustrate skip ( ) on a simple example.
Example 4. Let us consider the TBox  = {∃.⊤ ⊑ ,  ⊑ ∃.,  ⊑ ∃.,  ⊑ ∃., { ≥ 10?} ⊑
,  ⊑ } and the query (⃗) =  1(, ) with  1 = (1, 1,  1, 1), and the nested  2 = (2, 2,  2, 2)
illustrated below; only the solid transitions are in the input N2RPE. The dotted transitions are those added
to skip ( 1).</p>
      <p>start
⟨ 2⟩
6
start
?
−</p>
      <p>4

2
 ≥ 10?
?
3
?
⟨⟩</p>
      <p>We can now show that skip ( )(, ) is indeed a rewriting of  (, ).</p>
      <p>Lemma 2. Let  be a DL-LitePG TBox and  be an N2RPE. Then, for every property graph  and every
pair of nodes ,  from , it holds that (,  ) ∈  ℐ , if (,  ) ∈ skip ( ).
Proof sketch. For the (if) direction, we show that if an N2RPE  +1 is obtained by applying a rule in
Definition 11 to  , then (,  ) ∈  ℐ+1, implies (,  ) ∈  ℐ , . This is a simple rule-by-rule
analysis.</p>
      <p>For the (only if) direction, we rely on the fact that the canonical model construction naturally induces
an ordering on the facts in ℐ ,, where an implicit fact always has a strictly higher degree that the facts
that participate in its creation. We then show that if (,  ) ∈  ℐ , and this can only be witnessed
using some implicit fact  , then it is possible to apply some rule in such a way that, with the additional
transition, (,  ) ∈  ℐ+1, can be witnessed using instead a fact  ′ of strictly lower degree than  . By
applying the rules exhaustively, we eventually have that (,  ) ∈ skip ( )ℐ , is witnessed using
only facts of degree 0, that is, facts in , and hence (,  ) ∈ skip ( ).</p>
    </sec>
    <sec id="sec-4">
      <title>4. Rewriting join-on-free CN2RPQs</title>
      <p>It is now very easy to obtain an algorithm for join-on-free CN2RPQs. First we get rid af all non-answer
variables using nesting.</p>
      <sec id="sec-4-1">
        <title>Definition 12. For a join-on-free CN2RPQ (⃗), we denote by ⟨∃⟩(⃗) the result of replacing each atom</title>
        <p>(, ) with  ̸∈ ⃗ by ⟨ ⟩(), and each  (, ) with  ̸∈ ⃗ by ⟨ − ⟩().</p>
        <p>A simple inspection of the CN2RPQ semantics shows that the transformation preserves query answers.
Lemma 3. Let (⃗) a join-on-free CN2RPQ. For every interpretation ℐ and tuple of nodes ⃗, we have that
⃗ is an answer to (⃗) in ℐ if ⃗ is an answer to ⟨∃⟩(⃗) in ℐ.</p>
        <p>Moreover, the transformation to ⟨∃⟩ removes all the non-answer variables. The rewriting of queries
now straightforward, since we only need to apply skip ( ) to each atom independently. We denote
by skip (⟨∃⟩(⃗)) the result of executing Algorithm 1 on (⃗) and  , which applies skip ( ) to each
atom in ⟨∃⟩(⃗) and leaves all variables untouched.</p>
        <p>Algorithm 1: rewrite_jof_CNRPQ – Rewriting join-on-free CN2RPQs with data tests
Input : CN2RPQ (⃗),</p>
        <p>Output : CN2RPQ ′
1 function rewrite_jof_CNRPQ():
2 =remove∃Var()
3 while  ̸= ′ do
4 ′ = 
5 foreach  (, ) ∈  do
6 ′ = ′[ ∖skipping( )]
7 return 
8 function remove∃Var():
9 foreach  (, ) ∈  do
10 if  ̸∈ ⃗ then
11  = [ (, ) ∖ ⟨ ⟩()]
12 else if  ̸∈ ⃗ then
13  = [ (, ) ∖ ⟨ − ⟩()]
14 return 
Theorem 1. Let  be a DL-LitePG TBox and (⃗) be a join-on-free CN2RPQ. Then, for every ABox 
and every tuple ⃗ of individuals from , it holds that ⃗ is a certain answer to (⃗) over ( , ) if ⃗ is an
answer to skip (⟨∃⟩(⃗)) over .</p>
        <p>Proof. Given a DL-LitePG TBox  and an ABox  and a join-on-free CN2RPQ (⃗). From For (⇒)
assume that ⃗ of individuals from  is a certain answer to (⃗) over ( , ). From Lemma 3 it holds that
⃗ is a certain answer to ⟨∃⟩(⃗) over ( , ), i.e., ⃗ =  (⃗) and ( (),  ()) ∈  ℐ , (or ( () ∈  ℐ , )
for each atom  (, ) (or  ()) in ⟨∃⟩(⃗). By Lemma 2 we can imply that ( (),  ()) ∈ skip ( )
and ( () ∈ skip ( ) respectively. Thus, the claim holds that ⃗ is an answer to skip (⟨∃⟩(⃗)) over .
Since the claims in Lemma 2 and Lemma 3 hold in both directions, the proof for (⇐) is analogous.
Example 5. To demonstrate the full algorithm, let us consider the TBox  from Example 1 and the queries
from Example 3. The first step of the algorithm is to remove the join-free non-answer variables of the input
query, for example:
2(, ) := {employs− , since ≤ 2023?} · friendsWith* · born ≥ 2000?(, ),
Then, we apply the skipping function and obtain the following rewritten queries:
1(, ) := (friendsWith* |(friendsWith− ))* · employs− (, ), ⟨⟨locatedIn · (Opole|Poland?⟩)·
(Hiring?|⟨announce⟩) · TechCompany?⟩()
2(, ) := {employs− , since ≤ 2023?} · (friendsWith* |(friendsWith− )* · born ≥ 2000?(, ),
3(, ) := founded ≤ 2003? · revenue ≥ 105? · ({owns, since ≥ 2010?}·</p>
        <p>⟨locatedIn · (Opole?|Poland?⟩))* (, )
4(, ) := Company? · employs · (friendsWith|friendsWith− )·</p>
        <p>(GenZ?|{born ≥ 1997?, born ≤ 2012?})(, ), viewed · Job? · announce− (, )</p>
      </sec>
      <sec id="sec-4-2">
        <title>Now, the evaluation of these queries over the property graph from Example 2 gives us the following</title>
        <p>matches. For 1() we get two matches  1() = Bob  1() = SmartBees and  ′1() = Alice  ′1() =
SmartBees. For query 2 there is one match  2() = Bob and  2() = Alice. The third query returns two
companies by the match  3() = SmartBees and  3() = nuCompany. Finally,  4() = SmartBees and
 4() = Alice is a valid match for query 4.</p>
        <p>Query</p>
        <p>Number of Concepts</p>
        <p>Rewriting Time [ms]</p>
        <p>Evaluation Time [ms]
1
2
3
4
5
68
67
82
77
124
7237
3277
3805
4557
5769</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Implementation &amp; Experiments</title>
      <p>As a proof-of-concept we implemented the algorithm from Section 4 in a publicly available prototype
[18] and provide preliminary results in this section. The implementation makes use of the OWL API
[19] to parse the input OWL ontology, ANTLR [20] for parsing the input CN2RPQ, and the Java library
JgraphT [21] for the internal representation of n-NFAs. To the best of our knowledge the current version
of the OWL2 standard [22] does not support to express data tests in role inclusions, i.e., axioms of the
form { ⊙  ?} ⊑ . Therefore, the experiments do not consider these kind of axioms in the evaluation.
Ontology.</p>
      <p>As TBox we use the Cognitive Task Ontology (CogiTO) [23] and extended it by the
axioms below (see [18] for this version of the ontology). There are multiple options to indicate that
a participant is female; we have added axioms to cover all such cases, although we present only one
representative example here. This ontology includes about 4686 concepts and 10 002 axioms, whereas
720 axioms are of the form ReadingTask ⊑ ∃has.Read ⊓ ∃has.Language − item. Note that CogiTO
contains conjunction and qualified existentials on the left-hand side, which the prototype ignores.
 = {︀ {License = “CC0” ?} ⊑ Reusable,</p>
      <p>{BIDSVersion = 1.2.0 ?} ⊑ LatestBIDSVersion,
{gender = “f” ?} ⊑ Female,
{Manufacturer = “Siemens”} ⊑ HighQuality
︀}
Data.</p>
      <p>The prototype produces a Cypher query, assuming that concepts in the ontology correspond
to node labels in the database, and roles correspond to relationships (i.e., edge labels). The dataset for
the experiments (see [18]) are from the domain of cognitive neuroscience [24] and is stored in a Neo4j
database, consisting of 396 741 nodes and 2 870 405 relationships.</p>
      <p>Queries.</p>
      <p>We handcrafted the following queries to provide preliminary results on the time required
for rewriting and evaluating the queries.</p>
      <p>Quantitative− value?⟩(, )
⟨has* · Read?⟩(, )
1() :=Dataset? · LatestBIDSVersion? · Reusable? · has* · Language− item(, )
2() :=Dataset⟨has* · Female?⟩ · ⟨ has* · Language− item?⟩(, )
3(, ) :=Dataset · ⟨ has* · HighQuality?⟩⟨has* · Language− item?⟩⟨has* · Read?⟩(, )
4(, ) :=Dataset · ⟨ has* · Female?⟩ · ⟨ has* · HighQuality?⟩ · ⟨ has* · Memorize?⟩ · ⟨ has* ·
5(, ) :=Dataset · ⟨ has* · HighQuality⟩ · ⟨ has* · Memorize?⟩ · ⟨ has* · Language− item?⟩·
Setup. The experiments were executed on a virtual cluster node running Rocky Linux 8.10 with an
AMD EPYC 7513 32-Core CPU @ 2.60 GHz and 400 GB RAM; Neo4j 5.18.1 runs on the same machine.
Results. In Table 2, we provide the results of our experimental evaluation. Since most of the transitions
introduced by the rewriting are concept tests added due to a large concept hierarchy and existentials on
the right-hand side, we give the number of concept tests as a proxy for the query size after the rewriting
(Number of Concepts). Additionally, we measure the time it takes to rewrite the queries (Rewriting Time
[ms]) and to evaluate them with the Neo4j database (Evaluation Time [ms]), both averaged over 10 runs.
The number of concepts, roughly indicating size and complexity of the rewriting, increases significantly
from 3–5 concepts in the input to 67–124 in the rewritten queries. The rewriting times for all queries
are within a reasonable range of a few seconds (3.3s-7.2s) suggesting that the rewriting is practically
feasible. The evaluation times for the rewritten queries, with the exception of 2 which timed out at
the 600s threshold, remain under 20s. Given the size of the ontology, these numbers demonstrate an
acceptable performance for the majority of our queries, although it also reveals that certain queries can
result in significantly longer evaluation times. A possible reason for the outlier 2 is a high number of
Female instances in the dataset. We emphasize that the prototype is a very simple proof of concept and
does not yet implement any optimizations. It has often been documented that queries obtained from
rewriting algorithms tend to perform poorly: they introduce redundancy, nested disjunctions, and other
complex subqueries that the engines are not optimized for. Thus, they require dedicated optimizations,
which are often feasible given their somewhat predictable structure. We are confident that there is
ample opportunity to optimize and improve the runtimes, and we plan to do so in the future.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this paper, we present the first practical algorithm for rewriting N2RPQs and a significant subset
of CN2RPQs, thereby capturing a substantial portion of Cypher and GQL. Our queries can access key
values within path expressions. In our DL-LitePG ontology language, property value tests can be used
to define concepts and roles in the ontology. The result is a highly flexible approach to
ontologymediated querying of property graphs that still allows for query rewriting into native graph database
technologies. We have demonstrated the formal correctness of our approach and provided a
proofof-concept implementation that can rewrite Cypher queries to incorporate ontological knowledge
and evaluate the rewritten queries using Cypher. This work brings us significantly closer to flexible,
ontology-mediated querying of graph databases, and we look forward to exploring its advantages in
real-world use cases.</p>
      <p>To achieve a simple and practicable solution we focused on join-on-free CN2RPQs. This does not
seem too limiting: even plain N2RPEs can already express many realistic examples that involve complex
bidirectional navigation on the anonymous implicit facts; arbitrary conjunctions over the free variables
make the query language even more powerful. We expect that real-world queries that require diferent
regular paths to travel separately and join somewhere in the unnamed part of the canonical model will
very rarely emerge in practice, if at all. Nevertheless, we plan to cover full CN2RPQs. We note that
algorithms for full extended CN2RPQs in expressive DLs have been available for over a decade [3]. It is
not dificult to adapt these algorithms to create one that is both correct and worst-case optimal for all
CN2RPQs and DL-LitePG ontologies. However, its value seems limited since these techniques are highly
impractical. In the extended version of this paper [14], we provide a sketch of one such technique,
but it relies on automata-theoretic operations that can cause an exponential increase in size (e.g. the
intersection of the word projection of N2RPEs). In contrast coming up with a practical, goal-oriented
algorithm that can be used in practice seems to be a much bigger challenge, which we will address.</p>
      <p>We plan to explore the potential of our technique for real-life OBDA systems with navigational
capabilities, and to conduct experiments on systems supporting GQL as soon as they become available.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work was partially supported by the Austrian Science Fund (FWF) project PIN8884924 and by
the State of Salzburg under grant number 20102-F2101143-FPR (DNI). The authors acknowledge the
computational resources and services provided by Salzburg Collaborative Computing (SCC), funded by
the Federal Ministry of Education, Science and Research (BMBWF) and the State of Salzburg.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <sec id="sec-8-1">
        <title>The author(s) have not employed any Generative AI tools.</title>
        <p>C. Bessiere, D. Dubois, P. Doherty, P. Frasconi, F. Heintz, P. J. F. Lucas (Eds.), ECAI 2012 - 20th
European Conference on Artificial Intelligence. Including Prestigious Applications of Artificial
Intelligence (PAIS-2012) System Demonstrations Track, Montpellier, France, August 27-31 , 2012,
volume 242 of Frontiers in Artificial Intelligence and Applications , IOS Press, 2012, pp. 61–66. URL:
https://doi.org/10.3233/978-1-61499-098-7-61. doi:10.3233/978-1-61499-098-7-61.
[13] R. Angles, M. Arenas, P. Barceló, A. Hogan, J. L. Reutter, D. Vrgoc, Foundations of modern query
languages for graph databases, ACM Comput. Surv. 50 (2017) 68:1–68:40. doi:10.1145/3104031.
[14] B. Löhnert, N. Augsten, C. Okulmus, M. Ortiz, Query rewriting for nested navigational queries over
property graphs (with appendix), 2025. URL: https://gitlab.com/austrian-neurocloud/software/
owl2cypher/-/releases/DL2025, accessed: 2025-07-23.
[15] M. Bienvenu, M. Ortiz, M. Simkus, Conjunctive regular path queries in lightweight description
logics, in: F. Rossi (Ed.), IJCAI 2013, Proceedings of the 23rd International Joint Conference on
Artificial Intelligence, Beijing, China, August 3-9, 2013, IJCAI/AAAI, 2013, pp. 761–767. URL:
http://www.aaai.org/ocs/index.php/IJCAI/IJCAI13/paper/view/6886.
[16] T. Eiter, M. Ortiz, M. Simkus, T. Tran, G. Xiao, Query rewriting for Horn-SHIQ plus rules, in:
J. Hofmann, B. Selman (Eds.), Proceedings of the Twenty-Sixth AAAI Conference on Artificial
Intelligence, July 22-26, 2012, Toronto, Ontario, Canada, AAAI Press, 2012, pp. 726–733. URL:
http://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/view/4931.
[17] B. Löhnert, N. Augsten, C. Okulmus, M. Ortiz, Towards practicable algorithms for rewriting graph
queries beyond dl-lite, in: European Semantic Web Conference, Springer, 2025, pp. 342–361.
[18] B. Löhnert, N. Dragovic, Ontology-mediated querying for property graphs, 2025. URL: https:
//gitlab.com/austrian-neurocloud/software/owl2cypher/-/releases/DL2025, accessed: 2025-06-08.
[19] The OWL API, July 25, 2023. URL: https://owlcs.github.io/owlapi/.
[20] The OWL API, July 25, 2023. URL: https://www.antlr.org/.
[21] JGraphT, July 25, 2023. URL: https://jgrapht.org/.
[22] OWL2 Standard, July 25, 2023. URL: https://www.w3.org/TR/owl2-overview/.
[23] B. Löhnert, B. Engler, F. Hutzler, N. Augsten, Cognitive task ontology (COGITO), 2025. URL:
https://gitlab.com/austrian-neurocloud/ontologies/cogito/-/releases/ESWC2025, accessed:
202503-27.
[24] A. Ravenschlag, M. Denissen, B. Löhnert, M. Pawlik, N. A. Himmelstoß, F. Hutzler, Efective queries
for mega-analysis in cognitive neuroscience, in: G. Fletcher, V. Kantere (Eds.), Proceedings of the
Workshops of the EDBT/ICDT 2023 Joint Conference, Ioannina, Greece, March, 28, 2023, volume
3379 of CEUR Workshop Proceedings, CEUR-WS.org, 2023. URL: https://ceur-ws.org/Vol-3379/
CoMoNoS_2023_id252_Mateusz_Pawlik.pdf.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>A. Proofs for Section 3 (Rewriting N2RPQs)</title>
      <p>Definition 13 (Canonical Model). Consider a DL-LitePG TBox  and an ABox . Let ℐ0 be the initial
interpretation with  the corresponding property graph. Then, we construct the canonical model ℐ , by
exhaustively applying the following rules to ℐ0.
(Ch1) If  ⊑  ∈  ,  ∈ ℐ and  ∈/ ℐ , then ℐ+1 is ℐ with ℐ+1 = ℐ ∪ {}.
(Ch2) If ∃.⊤ ⊑  ∈  ,  ∈ (∃.⊤)ℐ and  ∈/ ℐ , then ℐ+1 is ℐ with ℐ+1 = ℐ ∪ {}.
(Ch3) If ∃.⊤ ⊑ ∃.⊤ ∈  ,  ∈ (∃.⊤)ℐ and  ∈/ (∃.⊤)ℐ , then ℐ+1 is ℐ with a fresh  in  ℐ+1 . In
case  ∈ R then ℐ+1 = ℐ ∪ {(, )}, otherwise ℐ+1 = ℐ ∪ {(, )}.
(Ch4) If  ⊑ ∃.⊤ ∈  ,  ∈ ℐ and  ∈/ (∃.⊤)ℐ , then ℐ+1 is ℐ with a fresh  in  ℐ+1 . In case
 ∈ R then ℐ+1 = ℐ ∪ {(, )}, otherwise ℐ+1 = ℐ ∪ {(, )}.
(Ch5) If  ⊑  ∈  , (, ) ∈ ℐ and (, ) ∈/ ℐ , then ℐ+1 is ℐ with ℐ+1 = ℐ ∪ {(, )} in case
 ∈ R, otherwise ℐ+1 = ℐ ∪ {(, )}.
(Ch6) If  ⊑  ∈  ,  ∈  ℐ and  ∈/ ℐ , then ℐ+1 is ℐ with ℐ+1 = ℐ ∪ {}.
(Ch7) If  ⊑  ∈  , (, ) ∈  ℐ and (, ) ∈/ ℐ , then ℐ+1 is ℐ with ℐ+1 = ℐ ∪ {(, )}.
Lemma 2. Let  be a DL-LitePG TBox and  be an N2RPE. Then, for every property graph  and every
pair of nodes ,  from , it holds that (,  ) ∈  ℐ , if (,  ) ∈ skip ( ).
Proof. For the (if) direction, assume that an n-NFA  +1 was obtained by applying one of the rules in
Definition 11 to  . In the following we show for each of the rules that it holds if (0, ) ∈  ℐ+1, (, ),
then also (0, ) ∈  ℐ , (, ).</p>
      <p>
        • Assume that  ℐ+1, (, ) was obtained by applying rule (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) with  ⊑  ∈  and (, , +1) ∈  
with  ∈ . Then,  ℐ+1, (, ) contains the transition (, , +1) with  = ( ∖ {}) ∪ {}.
By assumption it holds that (, +1) ∈ ℐ , and from Definition 13 it follows that (, +1) ∈
ℐ , .
• The proof for the rules (
        <xref ref-type="bibr" rid="ref2">2</xref>
        )-(
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) works equivalent to (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ).
• Assume that  ℐ+1, (, ) was obtained by rule (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ) for ∃.⊤ ⊑  and (, ?, +1) ∈  . Then,
we add a transition with a nested NFA  that contains a single transition with . By Definition 7
it holds that  = +1 and there is a pair (, ′′), such that (, ′′) ∈ ℐ , . From Definition 13
it follows that  ∈ ℐ , .
• The argumentation for (
        <xref ref-type="bibr" rid="ref6">6</xref>
        ) is equivalent to the argumentation of (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ) and (
        <xref ref-type="bibr" rid="ref7">7</xref>
        ) combined.
• Assume that  ℐ+1, (, ) was obtained by applying rule (
        <xref ref-type="bibr" rid="ref7">7</xref>
        ) with  ⊑ ∃.⊤ ∈  . By Definition 7
it holds that there is a pair (, ′) where (, ′), such that  ∈ ℐ , . From the canonical model
construction in Item 4 it holds that  ∈ (∃.)ℐ , . For each of the cases below it holds that
(, ) ∈  ℐ , (, ).
a) {(, , +1), (+1, − , +2)} ⊆ 
b) {(, , +1)} ⊆  and +1 ∈ 
• In case that  +1 was obtained by rule (
        <xref ref-type="bibr" rid="ref8">8</xref>
        ), then for each (, ,   ), ( , ⟨ ⟩, ) ∈  where
 ∈ Σ  and  − ∈ ( ) we add the transition (, ,  ) to  . Since  − ∈ ( ) and there has
to be a precedent  before  , there is a trivial matching for the NRPE  , i.e., if (1, ) ∈  ℐ+1, ,
then also (0, ) ∈  ℐ , .
      </p>
      <p>(⇒) In order to show completeness we first introduce the notion of degree, that comes from the
canonical model construction: We denote individuals or pair of individuals in the extension of concepts
and roles as facts, i.e.,  ∈ ℐ or (, ) ∈ ℐ . A fact that occurs for the first time in ℐ has degree ,
and a set of facts gets the sum of the degrees of its elements. The degree of sequence 0 11 . . .  
witnessing (0, ) ∈  ℐ , (, ) is the minimal degree of a set of facts that is suficient to witness all
conditions in Definition 7, including the nested automata. Note that there may be more than one way
to witness not only the automata, but even for a fixed (top level) sequence, we may have choices, but
we don’t care. We set the degree to be the smallest, and then every added transition will allow us to
witness the same sequence but with a strictly smaller degree.</p>
      <p>
        Having the notion of degree for a pair in place we show by induction that in each rewriting step we
are "moving up" the part of the canonical model that is induced by the axioms in  . In the following we
show that for each of the rules in Definition 13 there is a corresponding rule in Definition 11 producing
a n-NFA +1 from   such that the degree of the accepted sequence is strictly smaller.
(Ch1) Consider  ⊑  ∈  and  ∈ ℐ ,  ∈/ ℐ and  ∈ ℐ+1 by construction of the canonical
model. Consider the sequence (1 . . .  . . . ) that (1, ) ∈  ℐ , . In the rewriting we
apply rule applies rule (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) to obtain  +1 from  , where we add for each transition (, ?, +1)
a new transition (, ?, +1), thus (1 . . .  . . . ) will be accepted by  ℐ+1, , for which
the degree is strictly smaller then for (1 . . .  . . . ).
(Ch2) Given ∃.⊤ ⊑  ∈  and by construction of the canonical model  ∈ (∃.⊤)ℐ ,  ∈/ ℐ
 ∈ ℐ+1 . Consider a sequence of the form (1 . . .  . . . ) such that (1, ) ∈  ℐ , .
In skipping we apply rule (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ) to obtain  +1 and add for each (, ?,  ) ∈  the transition
(, ⟨ ⟩,  ) to  , where   is a fresh two-state NFA with a single transition (, {},  ). Since
 ∈ (∃.⊤)ℐ there has to be some sequence (′), such that (, ′) ∈  ℐ , and (1, ) ∈
 ℐ+1, .
(Ch3) For this case consider ∃.⊤ ⊑ ∃.⊤ ∈  , then by construction of the canonical model we have
 ∈ (∃.⊤)ℐ ,  ∈/ (∃.⊤)ℐ and (, ′) ∈ ℐ+1 for a fresh . Further, there are two possible
ways for a sequence to pass the  edge, either the sequence is of the form (1 . . . ′−  . . .  )
or it ends with the  edge, i.e., ( . . . − 1). For both cases suppose that (1, ) ∈  ℐ , .
First, we get rid of trivially satisfied nested expressions by rule (
        <xref ref-type="bibr" rid="ref8">8</xref>
        ). Then, we obtain  +1 by
applying rule (
        <xref ref-type="bibr" rid="ref6">6</xref>
        ) and add for each consecutive transitions (, , +1), (+1, − , +2) in   a
transition (, ⟨ ⟩, +2) to  +1 and for each {(, {}, +1)} such that +1 is a final state, we
add (, {}, +1) to  +1, where   is a fresh two-state NFA with a single transition (, {},  ).
Since  ∈ (∃.⊤)ℐ , it holds that there is some ′′ such that the sequence (′′) ∈  ℐ , .

Thus, it holds that (1, ) ∈  ℐ+1, and the degree of the sequence is lower. The same holds for
the inverse case  ∈/ R.
(Ch4) The argumentation for the case with  ⊑ ∃.⊤ ∈  and rule (
        <xref ref-type="bibr" rid="ref7">7</xref>
        ) is similar to the case (Ch3), but
instead of adding a nested NFA we simply add a transition with .
(Ch5) For  ⊑  ∈  rule 1 applies and the reasoning is similar to the case with (Ch1).
(Ch6) For  ⊑  ∈  we apply rule (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) and the argumentation is again equivalent to (Ch3).
(Ch7) For  ⊑  ∈  the argumentation is similar as for role inclusions by applying rule 4.
A.1. Algorithm Sketch for full CN2RPQs
In this section we provide a sketch for an algorithm that reduces full CN2RPQs to a union of CN2RPQs.
For this we first introduce the notion of a s-join of n-NFAs.
      </p>
      <p>Definition 14. Let A1 . . . A be n-NFAs with the same alphabet and let  = {1 . . . } be states with
 ∈ Ai and 1 ≤  ≤ , i.e., one state from each automaton. We define the s-join of A1 . . . An as a tuple
of  + 1 automata A′1 . . . A′A as follows:
• Each A′ is obtained from A by making  the only final state
• A is the intersection of all A by making  the only initial state</p>
      <p>There are diferent ways of taking the intersection of automata, but in either case, this step is
exponential. In the following we provide the instructions to get rid of existential variables that might
be mapped in the anonymous part of the canonical model.</p>
      <p>
        Given a full CN2RPQ (⃗), then we exhaustively apply the following rules:
(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) pick an existential join variable  ∈/ ⃗ and take all atoms  (, ) or  (, ) in 
(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) for atoms  (, ), where  occurs in the first place, take the inverse  − (, ). The goal of this step
is to have all atoms with  in the form of  (, )
(
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) Choose one state  of each automaton  . Let  be the selected states, and replace all the atoms
by the s-join using one intermediate variable , that is, replace all atoms where  occurs by
A′1(1, ) ∧ · · · ∧ A′(, ) ∧ A′(, ).
      </p>
      <p>If we apply this rule to all non-answer variables in all possible ways, we can get rid of the original
existential variables. The fresh  are existential, but we can assume w.l.o.g. that they are mapped to
named objects, so the algorithm is complete for them.</p>
      <p>The complete query rewriting algorithm for full CN2RPQs under a DL-LitePG TBox can then be
derived by rewriting each of the queries in the output union with the Algorithm 1 for join-on-free
CN2RPQs.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Deutsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Francis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Green</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hare</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Libkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lindaaker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Marsault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Martens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Michels</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Murlak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Plantikow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Selmer</surname>
          </string-name>
          ,
          <string-name>
            <surname>O. van Rest</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Voigt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vrgoc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zemke</surname>
          </string-name>
          ,
          <article-title>Graph pattern matching in GQL and SQL/PGQ</article-title>
          , in: Z. G. Ives,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bonifati</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. E.</surname>
          </string-name>
          Abbadi (Eds.),
          <source>SIGMOD '22: International Conference on Management of Data</source>
          , Philadelphia, PA, USA, June 12 - 17,
          <year>2022</year>
          , ACM,
          <year>2022</year>
          , pp.
          <fpage>2246</fpage>
          -
          <lpage>2258</lpage>
          . doi:
          <volume>10</volume>
          .1145/3514221.3526057.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Simkus</surname>
          </string-name>
          ,
          <article-title>Regular path queries in lightweight description logics: Complexity and algorithms</article-title>
          ,
          <source>J. Artif. Intell. Res</source>
          .
          <volume>53</volume>
          (
          <year>2015</year>
          )
          <fpage>315</fpage>
          -
          <lpage>374</lpage>
          . doi:
          <volume>10</volume>
          .1613/jair.4577.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Simkus</surname>
          </string-name>
          ,
          <article-title>Nested regular path queries in description logics</article-title>
          , in: C. Baral,
          <string-name>
            <given-names>G. D.</given-names>
            <surname>Giacomo</surname>
          </string-name>
          , T. Eiter (Eds.),
          <source>Principles of Knowledge Representation and Reasoning: Proceedings of the Fourteenth International Conference, KR 2014</source>
          , Vienna, Austria,
          <source>July 20-24</source>
          ,
          <year>2014</year>
          , AAAI Press,
          <year>2014</year>
          . URL: http://www.aaai.org/ocs/index.php/KR/KR14/paper/view/8000.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Eiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <article-title>Regular path queries in expressive description logics with nominals</article-title>
          , in: C.
          <string-name>
            <surname>Boutilier</surname>
          </string-name>
          (Ed.),
          <source>IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence</source>
          , Pasadena, California, USA, July
          <volume>11</volume>
          -
          <issue>17</issue>
          ,
          <year>2009</year>
          ,
          <year>2009</year>
          , pp.
          <fpage>714</fpage>
          -
          <lpage>720</lpage>
          . URL: http://ijcai.org/Proceedings/09/Papers/124.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Löhnert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Augsten</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Okulmus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <article-title>Towards practicable algorithms for rewriting graph queries beyond DL-Lite, in: The Semantic Web -</article-title>
          22nt International Conference, ESWC 2025, Portorož, Slovenia, June 1-5,
          <year>2025</year>
          <article-title>(accepted for publication)</article-title>
          ,
          <source>Lecture Notes in Computer Science</source>
          , Springer,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>M. M. Dimartino</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Calì</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Poulovassilis</surname>
          </string-name>
          , P. T. Wood,
          <article-title>Query rewriting under linear EL knowledge bases</article-title>
          , in: M.
          <string-name>
            <surname>Ortiz</surname>
          </string-name>
          , S. Schlobach (Eds.),
          <source>Web Reasoning and Rule Systems - 10th International Conference, RR</source>
          <year>2016</year>
          ,
          <article-title>Aberdeen</article-title>
          , UK, September 9-
          <issue>11</issue>
          ,
          <year>2016</year>
          , Proceedings, volume
          <volume>9898</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2016</year>
          , pp.
          <fpage>61</fpage>
          -
          <lpage>76</lpage>
          . URL: https://doi.org/10.1007/978-3-
          <fpage>319</fpage>
          -45276-
          <issue>0</issue>
          _
          <fpage>6</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N.</given-names>
            <surname>Dragovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Okulmus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <article-title>Rewriting ontology-mediated navigational queries into cypher</article-title>
          , in: O.
          <string-name>
            <surname>Kutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Ozaki (Eds.),
          <source>Proceedings of the 36th International Workshop on Description Logics (DL</source>
          <year>2023</year>
          )
          <article-title>co-located with the 20th</article-title>
          <source>International Conference on Principles of Knowledge Representation and Reasoning and the 21st International Workshop on Non-Monotonic Reasoning (KR 2023 and NMR</source>
          <year>2023</year>
          ).,
          <string-name>
            <surname>Rhodes</surname>
          </string-name>
          , Greece, September 2-
          <issue>4</issue>
          ,
          <year>2023</year>
          , volume
          <volume>3515</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2023</year>
          . URL: https://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>3515</volume>
          /paper-9.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. D.</given-names>
            <surname>Giacomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          ,
          <article-title>Tractable reasoning and eficient query answering in description logics: The DL-lite family</article-title>
          ,
          <source>Journal of Automated Reasoning</source>
          <volume>39</volume>
          (
          <year>2007</year>
          )
          <fpage>385</fpage>
          -
          <lpage>429</lpage>
          . doi:
          <volume>10</volume>
          .1007/s10817-007-9078-x.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Angles</surname>
          </string-name>
          ,
          <article-title>The property graph database model</article-title>
          , in: D.
          <string-name>
            <surname>Olteanu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          Poblete (Eds.),
          <source>Proceedings of the 12th Alberto Mendelzon International Workshop on Foundations of Data Management</source>
          , Cali, Colombia, May
          <volume>21</volume>
          -25,
          <year>2018</year>
          , volume
          <volume>2100</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2018</year>
          . URL: https://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>2100</volume>
          /paper26.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bonifati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. H. L.</given-names>
            <surname>Fletcher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Voigt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Yakovets</surname>
          </string-name>
          , Querying Graphs,
          <source>Synthesis Lectures on Data Management</source>
          , Morgan &amp; Claypool Publishers,
          <year>2018</year>
          . URL: https://doi.org/10.2200/ S00873ED1V01Y201808DTM051. doi:
          <volume>10</volume>
          .2200/S00873ED1V01Y201808DTM051.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>N.</given-names>
            <surname>Francis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Green</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Guagliardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Libkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lindaaker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Marsault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Plantikow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rydberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Selmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Taylor</surname>
          </string-name>
          , Cypher:
          <article-title>An evolving query language for property graphs</article-title>
          , in: G.
          <string-name>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <surname>C. M. Jermaine</surname>
            ,
            <given-names>P. A.</given-names>
          </string-name>
          <string-name>
            <surname>Bernstein</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference</source>
          <year>2018</year>
          , Houston, TX, USA, June 10-15,
          <year>2018</year>
          , ACM,
          <year>2018</year>
          , pp.
          <fpage>1433</fpage>
          -
          <lpage>1445</lpage>
          . URL: https://doi.org/10.1145/3183713.3190657. doi:
          <volume>10</volume>
          .1145/3183713.3190657.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Artale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ryzhikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <article-title>Dl-lite with attributes and datatypes</article-title>
          , in: L. D. Raedt,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>