<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>International Workshop on the Resurgence of Datalog in Academia and Industry, October</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Incremental Evaluation of Dynamic Datalog Programs as a Higher-order DBSP Program</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bruno Rucy Carneiro Alves de Lima</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Merlin Kramer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kalmer Apinis</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kristopher Micinski</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Independent Researcher</institution>
          ,
          <addr-line>Wuppertal</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Syracuse University</institution>
          ,
          <addr-line>Syracuse</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Tartu</institution>
          ,
          <addr-line>Tartu</addr-line>
          ,
          <country country="EE">Estonia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>11</volume>
      <issue>2024</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
<p>We show that bottom-up evaluation of positive Datalog programs can be modelled as a "circuit" of facts and rules, both of which can be freely added and retracted at runtime. This is realised by writing a definitional interpreter as a higher-order program that defines bottom-up evaluation in terms of Database Stream Processing (DBSP) operators, which have incremental semantics. Our approach is compared against Soufflé and DDLog, handling incremental updates up to almost 10 times faster than DDLog and 2 times faster than Soufflé.</p>
      </abstract>
      <kwd-group>
        <kwd>Datalog</kwd>
        <kwd>DBSP</kwd>
        <kwd>Bottom-up Evaluation</kwd>
        <kwd>Incremental View Maintenance</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        We posit that a simple and potentially efficient way to tackle the materialization of dynamic programs is to define Datalog interpretation in an expressive language that has transparent incremental semantics, such as the recent DBSP[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] or its ancestor Differential Dataflow[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        The value that these two bring is in allowing one to define non-incremental recursive programs that inherit incremental semantics by exclusively using expressive declarative operators. We further argue that there is significant practical value in doing so, as both languages have very efficient interpreters[
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ] written in the Rust programming language.
      </p>
      <p>
        Contributions. Our contribution is in attempting to provide a practical and unified solution to the incremental evaluation of programs whose rules and facts change during runtime. This is realized over three tangible outputs:
• A novel formulation of incremental interpretation as a higher-order streaming computation. We formulate incremental evaluation as a DBSP circuit and then empirically evaluate whether the interpreter indeed inherits the incremental semantics of the host language by comparing it both with the DDLog[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] incremental reasoner and with Soufflé as the recompute-from-scratch baseline.
• A high-performance dynamic Datalog interpreter in Rust. The implementation used for the benchmarks is fully open source and lives on GitHub[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. We provide thorough tests and examples that demonstrate how to use the library.
• An accessible DBSP and dynamic Datalog interpreter in Python. As DBSP's canonical implementation is a Rust program, it severely limits the ease of conducting research: not only is DBSP a novel theory, but much of its implementation intertwines high-performance computing code with its logic, often demanding advanced Rust knowledge. As we believe that there is much value in DBSP, we wrote a Python implementation from scratch[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] that also contains the dynamic Datalog interpretation circuit presented in this paper.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>We consider Datalog engines relevant to this work if they are bottom-up and their core purpose is either incremental materialisation, or if they are notable ancestors of such systems. Soufflé, being a well-known engine, is relevant because it is useful for gauging the overhead of supporting incremental updates in the evaluation section. An incremental reasoner provides no value if it cannot compute updates faster than some non-incremental reasoner can recompute from scratch.</p>
      <table-wrap id="tbl1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Reasoner</th><th>Kind</th><th>Fact Add</th><th>Fact Del</th><th>Rule Add</th><th>Rule Remove</th></tr>
          </thead>
          <tbody>
            <tr><td>Soufflé[<xref ref-type="bibr" rid="ref15">15</xref>]</td><td>Both</td><td>No</td><td>No</td><td>No</td><td>No</td></tr>
            <tr><td>BigDatalog[<xref ref-type="bibr" rid="ref2">2</xref>]</td><td>Compiler</td><td>No</td><td>No</td><td>No</td><td>No</td></tr>
            <tr><td>Cog[<xref ref-type="bibr" rid="ref3">3</xref>]</td><td>Compiler</td><td>No</td><td>No</td><td>No</td><td>No</td></tr>
            <tr><td>Nexus[<xref ref-type="bibr" rid="ref4">4</xref>]</td><td>Compiler</td><td>Yes</td><td>No</td><td>No</td><td>No</td></tr>
            <tr><td>DDLog[<xref ref-type="bibr" rid="ref5">5</xref>]</td><td>Compiler</td><td>Yes</td><td>Yes</td><td>No</td><td>No</td></tr>
            <tr><td>DBSP*</td><td>Compiler</td><td>Yes</td><td>Yes</td><td>No</td><td>No</td></tr>
            <tr><td>DYRE</td><td>Interpreter</td><td>Yes</td><td>Yes</td><td>Yes</td><td>Yes</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>With Table 1 we situate the reasoning system described in this paper, the Dynamic Datalog Reasoner
(DYRE), among other relevant reasoners.</p>
      <p>
        BigDatalog is an extension of the Apache Spark[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] distributed batch computation framework. It provides a Datalog front-end that supports both recursive and non-recursive aggregation. While BigDatalog did satisfactorily handle large amounts of data, it required forking its underlying platform, as Spark does not support any form of iteration, let alone cycles in its logical computational graph.
      </p>
      <p>
        Cog is BigDatalog’s successor project. It swaps Apache Spark for Apache Flink[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], a distributed streaming computation framework with first-class support for iteration and cyclic dataflows. The highlight of this reasoner is that it showed how Flink's efficient delta iteration feature paved the way for naturally implementing semi-naïve evaluation.
      </p>
      <p>Both of these engines, while notable, do not support maintaining the materialization. Nexus is Cog's incremental follow-up, and it fully supports efficiently adjusting the computation to newly streamed data, including when the program contains recursive aggregates. It uses a provenance technique inspired by the counting method to keep track of all derivation paths, which is similar to how DBSP maintains computations.</p>
      <p>
        The first Datalog engine to support both incremental addition and deletion at scale is DDLog. The DD stands for Differential Dataflow[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], the language that inspired DBSP. Its ethos is the same: to provide a language for writing non-incremental programs that have incremental semantics.
      </p>
      <p>The key component of both program provenance and incremental materialization is time. While in DD the time component can be an arbitrary lattice, DBSP restricts it to either a total order or, for use cases that require nested timestamps such as iteration, a product order. In our experiments we found that we did not need any time lattice other than the product order.</p>
      <p>
        DDLog compiles an expressive dialect of Datalog to non-incremental recursive relational algebra
formulated with DD. In the seminal DBSP article[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] there is a section named "Implementing Recursive
Datalog". In there, a blueprint is laid out for a system, referred to as DBSP* in the table, that is similar
to DDLog in that it compiles a relational representation of the evaluation of a static Datalog program
into a DBSP circuit.
      </p>
      <p>Every reasoner referred to in the table compiles a static Datalog program to either some dialect of SQL or to an intermediate representation that models relational algebra. While this approach has proven successful, all of the underlying frameworks, except DBSP and Differential Dataflow, were not developed with efficient support for the iterative workloads typical of Datalog in mind.</p>
      <p>Moreover, the compilation step is a significant bottleneck that all of these reasoners suffer from. For instance, DDLog takes minutes to compile a trivial transitive closure program. This makes it unusable for scenarios that require incrementally adding or removing rules, which is altogether impossible in these systems.</p>
      <p>
        This is not the case for the system proposed in this paper, where Datalog reasoning itself is implemented as a circuit with both rules and facts being dynamic.
      </p>
      <p>
        <bold>3. Revisiting Datalog as a rewriting system</bold>
      </p>
      <p>
        We model Datalog evaluation as a rewriting system in a similar vein to the expositions found in [
        <xref ref-type="bibr" rid="ref18 ref19 ref20">18, 19, 20</xref>
        ], and restrict it to the semantics of its positive fragment[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>Syntax. A program is a finite set of Horn clauses syntactically represented as implications, with definite clauses referred to as rules, and those with an empty body as ground atoms. Equation 1 discerns predicates p, terms t, rules r, atoms a, ground atoms g, programs Π, and finite fact stores I. Bold symbols represent ordered lists of symbols, which if keyed represent sets. Constants are quoted names, and variables are unquoted. The left-hand side of a rule is called the head, and the right-hand side the body.</p>
      <p>Evaluation. A substitution is a function σ that maps a single variable to a constant. A rewrite Σ is a set of substitutions such that no two σ ∈ Σ share a variable. We write dom(Σ) for the set of variables among all σ ∈ Σ, and Σ[v] for the constant that variable v maps to.
t ::= Const | Var
a ::= p(t0, ..., tn)
g ::= p(Const0, ..., Constn)
r ::= a ← a0, ..., an
Π ::= {r}
I ::= {g}
(1)
Example 3.1 (Rewriting) Applying the rewrite Σ : {x ↦ "a", y ↦ "b", z ↦ "c"} to p(x, z) grounds it to p("a", "c").
If there is at least one rewrite that has one substitution for each variable term in some atom a, applying it to a will ground it.</p>
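      <p>As a minimal Python sketch of the notions above (the representation and names are ours, for illustration only, and not the paper's implementation), an atom can be a (predicate, terms) pair and a rewrite a dict from variables to constants:</p>

```python
def apply_rewrite(rewrite, atom):
    """Apply a rewrite (a dict of variable -> constant substitutions) to an
    atom, replacing every variable the rewrite maps and leaving the rest."""
    predicate, terms = atom
    return (predicate, tuple(rewrite.get(t, t) for t in terms))

# Example 3.1: constants are quoted names, variables are unquoted.
sigma = {"x": '"a"', "y": '"b"', "z": '"c"'}
print(apply_rewrite(sigma, ("p", ("x", "z"))))  # ('p', ('"a"', '"c"'))
```

      <p>Since the rewrite has a substitution for every variable term of p(x, z), applying it grounds the atom.</p>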
      <p>Definition 3.1 The rewrite set monoid is a triple ⟨𝚺, ⊕, Σ0⟩ where:
• 𝚺 is a set of rewrites Σ
• The binary operation Σa ⊕ Σb = Σc for any Σa, Σb ∈ 𝚺, with dom(Σc) = dom(Σa) ∪ dom(Σb), and each v ∈ dom(Σc) maps to Σa[v] if v ∈ dom(Σa), or to Σb[v] otherwise
• The identity element is Σ0, the empty rewrite</p>
      <p>Grounding the head of a rule is done by iteratively building at least one Σ that satisfies the constraints of the body. Let x be the list of terms t of some atom a with predicate p. We use the syntax I.p to refer to the set of ground atoms in I with a's predicate.</p>
      <p>Definition 3.2 The rewrite product 𝚺 ⊗ a = 𝚺′ of a rewrite set 𝚺 and some atom a, with the number of variable terms in a being n, is the set of rewrites such that for each Σ ∈ 𝚺, up to one new rewrite Σ′ = Σ ⊕ Σg is created for each g ∈ I.p, with potentially up to n new σ.</p>
      <sec id="sec-2-1">
        <title>Example 3.2 (Rewrite product)</title>
        <p>Given:
a = e(x, y)
I.e = {("a", "b"), ("b", "c"), ("c", "d")}
𝚺 = {{x ↦ "a"}, {x ↦ "b"}}
Then:
𝚺 ⊗ a = {{x ↦ "a", y ↦ "b"}, {x ↦ "b", y ↦ "c"}}</p>
        <p>There are two steps to the product. The first is unification, where each g ∈ I.p is unified against all Σ ∈ 𝚺 applied to a. a is said to unify with g if there is no constant term c in a that occupies a position in g where there is some other constant c′ such that c ≠ c′.</p>
        <p>The second is extension. For all ⟨Σ, a, g⟩ triples where g ∈ I.p unifies with some Σ applied to some a, there is one new rewrite Σg where each variable v maps to some constant c in g such that v ∉ Σ, v ∈ a, and c occupies the same position as v. The output of the product is Σ ⊕ Σg for each triple.</p>
        <p>Let 𝚺0 be the empty rewrite set. It contains the empty rewrite Σ0. The immediate consequence 𝒞(r, I) of some rule r with a ground atom set I is computed by applying each rewrite from the final rewrite product 𝚺0 ⊗ a0 ⊗ ... ⊗ an to ah, with ah being the head, and all other atoms members of the body.</p>
        <p>
          The immediate consequence 𝒞(Π, I) of the program Π with rules ri ∈ Π is then ⋃i 𝒞(ri, I). Program evaluation is then defined as the least fixed point (LFP) of the immediate consequence of the program.
        </p>
        <p>
          <bold>4. Datalog interpretation as a DBSP circuit</bold>
        </p>
        <p>
          While we do not assume the reader to be familiar with DBSP, as we introduce its core tenets alongside our contributions, we expect some familiarity with incremental view maintenance[
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], as that is what DBSP formalizes at a high level of expressivity. For further clarification, we recommend the seminal DBSP paper[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>
          The central element of DBSP is the stream. Streams SA are maps ℕ ↦ A, with the domain representing time and the codomain a value from an Abelian group A. The restriction imposed on A makes it unsuitable for Datalog evaluation as presented, as set union, which ensures that at some point the fixed point is reached, does not form an Abelian group. We address this by modelling I and Π as Z-sets[
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] instead.
        </p>
        <p>Definition 4.1 Z-sets are maps with each domain element having an associated integer count c. For any two Z-sets Z and Z′ over the same underlying set A, the following hold:
• Z″ = Z + Z′, where x ∈ dom(Z″) if Z[x] + Z′[x] ≠ 0, and Z″[x] = Z[x] + Z′[x]
• Z′ = −Z, where ∀x ∈ dom(Z), Z′[x] = −Z[x]
• If ∀x ∈ dom(Z), Z[x] = 1, then Z is bijective to some set.</p>
        <p>We name Z-sets where all values in their codomain are either 1 or −1 as ΔZ, and those that are bijective to sets as S, and make use of the two auxiliary functions toset : Z ↦ S and todiff : Z ↦ ΔZ that ensure this conversion.</p>
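        <p>A hedged sketch of these Z-set operations, modelling a Z-set as a plain dict from elements to non-zero integer weights; toset and todiff are our best-effort readings of the conversions named above:</p>

```python
from collections import Counter

def zadd(z1, z2):
    # Z-set sum: pointwise integer addition, dropping elements whose count is 0.
    out = Counter(z1)
    out.update(z2)  # Counter.update adds counts, including negative ones
    return {k: v for k, v in out.items() if v != 0}

def zneg(z):
    # Z-set negation: flip every weight.
    return {k: -v for k, v in z.items()}

def toset(z):
    # Keep each element with positive weight exactly once (weight 1).
    return {k: 1 for k, v in z.items() if v > 0}

def todiff(z):
    # Clamp every weight to +1 / -1, keeping only the sign of the change.
    return {k: (1 if v > 0 else -1) for k, v in z.items() if v != 0}

z = {("e", "a", "b"): 1, ("e", "b", "c"): 2}
dz = {("e", "a", "b"): -1, ("e", "c", "d"): 1}
print(zadd(z, dz))   # {('e', 'b', 'c'): 2, ('e', 'c', 'd'): 1}
print(toset(z))      # {('e', 'a', 'b'): 1, ('e', 'b', 'c'): 1}
print(todiff(dz))    # {('e', 'a', 'b'): -1, ('e', 'c', 'd'): 1}
```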
        <p>Functions are maps A0 × ... × An ↦ B where each Ai and B are Abelian groups. An operator is also a map S0 × ... × Sn ↦ S′ where each Si and S′ are streams. Lifting some function f : A ↦ B yields an operator ↑f : SA ↦ SB where ∀t, SB[t] = f(SA[t]). We use ↑ to denote lifted operators.</p>
        <p>
          An operator is said to be causal if its output at time t only ever relies on inputs with time t′ such that t′ ≤ t, and strict if t′ &lt; t. All lifted operators are causal, but not necessarily strict. All strict operators are guaranteed[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to have a unique solution to their fixpoint. We do not utilise any non-causal operators. Lifting f yields a causal operator.
        </p>
        <p>The two primitive building blocks of circuits are lifting and delaying. The delay operator z⁻¹ : S ↦ S′ produces a stream S′ such that S′[0] is the Abelian identity element, and S′[t] = S[t − 1] otherwise. As per [9, Lemma 2.11], feeding the output of some causal operator F : S × S ↦ S back to itself with z⁻¹ yields a strict operator.</p>
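        <p>Both building blocks are straightforward over a list-based model of streams (time is the list index, values are Z-sets as dicts); the names below are ours, chosen for illustration:</p>

```python
def lift(f):
    # Lifting: apply f pointwise at every timestamp of the stream.
    return lambda stream: [f(v) for v in stream]

def delay(stream, identity=None):
    # z^-1: shift the stream one step forward in time, emitting the group
    # identity (here the empty Z-set) at time 0.
    identity = {} if identity is None else identity
    return [identity] + stream[:-1]

s = [{"a": 1}, {"b": 2}, {"c": 1}]
double = lift(lambda z: {k: 2 * v for k, v in z.items()})
print(double(s))  # [{'a': 2}, {'b': 4}, {'c': 2}]
print(delay(s))   # [{}, {'a': 1}, {'b': 2}]
```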
      </sec>
      <sec id="sec-2-2">
        <title>Example 4.1 (Delay, toset and todiff)</title>
        <p>Given:
S = [0 ↦ {e("a", "b") ↦ 1}, 1 ↦ {e("b", "c") ↦ 2}, 2 ↦ {e("c", "d") ↦ 1}]
S′ = [0 ↦ {e("a", "b") ↦ −1}, 1 ↦ {e("b", "c") ↦ 2}, 2 ↦ {e("c", "d") ↦ −1}]
S″ = [0 ↦ {e("a", "b") ↦ 1}, 1 ↦ {e("b", "c") ↦ 2}]
Then:
↑toset(S) = [0 ↦ {e("a", "b") ↦ 1}, 1 ↦ {e("b", "c") ↦ 1}, 2 ↦ {e("c", "d") ↦ 1}]
↑todiff(S′) = [0 ↦ {e("a", "b") ↦ −1}, 1 ↦ {e("b", "c") ↦ 1}, 2 ↦ {e("c", "d") ↦ −1}]
z⁻¹(S″) = [0 ↦ {}, 1 ↦ {e("a", "b") ↦ 1}, 2 ↦ {e("b", "c") ↦ 2}]</p>
        <p>
          <bold>4.1. Naïve dynamic interpretation</bold>
        </p>
        <p>Let ℐ : S ↦ S′ be the stream integration operator: ∀t, S′[t] = ∑i=0..t S[i], and (↑distinct)Δ : SZ ↦ S′ΔZ the stateful Z-set distinct operator: ∀t, S′[t] = todiff(S[t] − z⁻¹(ℐ(S))[t]).</p>
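        <p>Both operators can be sketched over list-based streams of dict Z-sets. The distinct below is computed as the change between consecutive set-normalised integrals, which is one standard way to realise the stateful operator described above (names and realisation are ours, not the paper's code):</p>

```python
def integrate(stream):
    # I(s)[t] = sum of s[0..t] as Z-sets, dropping zero weights.
    out, acc = [], {}
    for z in stream:
        for k, v in z.items():
            acc[k] = acc.get(k, 0) + v
        out.append({k: v for k, v in acc.items() if v != 0})
    return out

def distinct_incremental(stream):
    # Emit only the sign of what changed in the *set* of present elements
    # between consecutive timestamps of the integrated input.
    out, acc, prev_set = [], {}, set()
    for z in stream:
        for k, v in z.items():
            acc[k] = acc.get(k, 0) + v
        cur_set = {k for k, v in acc.items() if v > 0}
        delta = {k: 1 for k in cur_set - prev_set}
        delta.update({k: -1 for k in prev_set - cur_set})
        out.append(delta)
        prev_set = cur_set
    return out

s = [{"a": 1}, {"a": 2, "b": 1}, {"a": -3}]
print(integrate(s))             # [{'a': 1}, {'a': 3, 'b': 1}, {'b': 1}]
print(distinct_incremental(s))  # [{'a': 1}, {'b': 1}, {'a': -1}]
```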
        <p>With SΠ as a stream of bijective program Z-sets and SI of ground atoms, we define the lifted immediate consequence stream operator ↑𝒞 : SΠ × SI ↦ S′Z as ∀t, S′Z[t] = 𝒞(SΠ[t], SI[t]).</p>
        <p>Figure 1 showcases the circuit of naïve dynamic Datalog interpretation. The value of S′[t] for any t will be the addition of ∑i=0..t SI[i] to the sum of all inferred distinct ground facts by then.</p>
        <p>The fixed point of the circuit, after either a new fact or program Z-set is received, will happen at some timestamp t where the z⁻¹ operator returns the identity of S, since then no new distinct atoms can be emitted.</p>
        <p>The value of S′[t] will be the full materialization. As the input SI and the final integration are bijective to sets, the materialization will also be bijective, therefore emitting sets and matching regular Datalog semantics.</p>
        <p>
          Example 4.2 (Integration and non-incremental dynamic interpretation)
Given:
ΔI = [0 ↦ {e("a", "b") ↦ 1}, 1 ↦ {e("b", "c") ↦ 1}, 2 ↦ {e("c", "d") ↦ 1}]
ΔΠ[0] = {T(x, y) ← e(x, y) ↦ 1}
ΔΠ[1] = {T(x, z) ← T(x, y), e(y, z) ↦ 1}
ΔΠ[2] = {T(x, y) ← e(x, y) ↦ −1}
Then:
ℐ(ΔI)[0] = {e("a", "b") ↦ 1}
ℐ(ΔI)[1] = {e("a", "b") ↦ 1, e("b", "c") ↦ 1}
ℐ(ΔI)[2] = {e("a", "b") ↦ 1, e("b", "c") ↦ 1, e("c", "d") ↦ 1}
S′[0] = {e("a", "b") ↦ 1, T("a", "b") ↦ 1}
S′[1] = S′[0] + {e("b", "c") ↦ 1, T("b", "c") ↦ 1, T("a", "c") ↦ 1}
S′[2] = S′[1] + {T("a", "b") ↦ −1, T("b", "c") ↦ −1, T("a", "c") ↦ −1, e("c", "d") ↦ 1}
        </p>
        <sec id="sec-2-2-1">
          <title>4.2. Incremental dynamic interpretation</title>
          <p>To derive incremental evaluation it is necessary to reason about the linearity of functions. A function f : A ↦ B is linear if for any a, b ∈ A, f(a + b) = f(a) + f(b). This extends to functions with more than one argument, e.g. f : A × B ↦ C is bilinear if it is linear in each of its arguments.</p>
          <p>While operators obtained by lifting linear functions are linear, 𝒞 is not bilinear, hence the general formula for the incrementalization of bilinear operators[9, Theorem 1.4], when applied to it:
∀t, (↑𝒞)Δ[t] = 𝒞(ΔΠ[t], ΔI[t]) + 𝒞(z⁻¹(ℐ(ΔΠ))[t], ΔI[t]) + 𝒞(ΔΠ[t], z⁻¹(ℐ(ΔI))[t])
will not satisfy the invariant ∀t, ↑𝒞(ℐ(ΔΠ), ℐ(ΔI))[t] = ℐ((↑𝒞)Δ(ΔΠ, ΔI))[t], therefore being unsound.</p>
          <p>Example 4.3 (Non-linearity of ↑𝒞)
Given:
ΔI = [0 ↦ {e("a", "b") ↦ 1}, 1 ↦ {e("b", "c") ↦ 1}]
ΔΠ = [0 ↦ {T(x, y) ← e(x, y) ↦ 1}, 1 ↦ {T(x, z) ← T(x, y), e(y, z) ↦ 1}]
Then:
↑𝒞(ΔΠ, ΔI)[0] = {T("a", "b") ↦ 1}</p>
          <p>We aim to achieve incremental evaluation by creating a circuit that implements the previously
introduced rewriting system exclusively with linear and bilinear operators.</p>
          <p>In order to iteratively compute rewrite products, there have to be notions of direction and end: rewrites need to be both propagated forward to meet other atoms and ultimately signalled as ready for grounding, to generate new facts.</p>
          <p>Definition 4.2 The expected provenance ρ : r × a ↦ ℕ of the j-th atom aj with respect to some rule r is ∑i=0..j ℋ(ai), where ℋ is some function that returns an integer such that if two atoms a, a′ have the same predicate and terms, then ℋ(a) = ℋ(a′).</p>
          <p>Definition 4.3 The direction Z-set Zdir of some rule r with n atoms and weight w is {⟨ρ(r, aj−1), ρ(r, aj)⟩ ↦ w | ∀j &lt; n ∧ j ≥ 1} + {⟨0, ρ(r, a0)⟩ ↦ w}. The function dir : r ↦ Zdir computes the direction set for some rule r.</p>
          <p>Definition 4.4 The grounding signal s of some rule r with n atoms is ⟨ρ(r, an), r.h⟩, with an being the last atom, and r.h the rule head. The function sig : r ↦ s computes the grounding signal for some rule r.</p>
          <p>Definition 4.5 The grounding kit K of some program Π and its n rules ri with respective weights wi is a tuple ⟨Zdir, Zsig⟩ where:
• Zdir is ∑i=0..n dir(ri)
• Zsig is {sig(ri) ↦ wi | ∀i &lt; n}
The function kit : Π ↦ K computes the grounding kit for some program Π, with dir : Π ↦ Zdir and sig : Π ↦ Zsig returning, respectively, the Zdir and Zsig of some program's K.</p>
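          <p>Definitions 4.2 to 4.5 can be sketched directly, using Python's built-in hash for ℋ and (predicate, terms) tuples for atoms; all function names here are ours, chosen to mirror the definitions:</p>

```python
def H(atom):
    # Any function that gives equal atoms equal integers works as the atom
    # hash; Python's hash satisfies that within one process.
    return hash(atom)

def provenance(body, j):
    # Expected provenance of the j-th body atom: sum of H over atoms 0..j.
    return sum(H(a) for a in body[: j + 1])

def direction(rule, weight=1):
    head, body = rule
    dirs = {(0, provenance(body, 0)): weight}
    for j in range(1, len(body)):
        dirs[(provenance(body, j - 1), provenance(body, j))] = weight
    return dirs

def signal(rule, weight=1):
    # Grounding signal: the last atom's provenance paired with the rule head.
    head, body = rule
    return {(provenance(body, len(body) - 1), head): weight}

def kit(program):
    # Grounding kit: summed directions plus the weighted grounding signals.
    dirs, sigs = {}, {}
    for rule, weight in program.items():
        for k, v in direction(rule, weight).items():
            dirs[k] = dirs.get(k, 0) + v
        sigs.update(signal(rule, weight))
    return dirs, sigs

r1 = (("T", ("x", "y")), (("e", ("x", "y")),))
r2 = (("T", ("x", "z")), (("e", ("x", "y")), ("T", ("y", "z"))))
dirs, sigs = kit({r1: 1, r2: 1})
# As in Example 4.4, both rules start at the same first atom e(x, y),
# so its direction entry carries weight 2.
print(dirs[(0, H(("e", ("x", "y"))))])  # 2
```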
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>Example 4.4 (The grounding kit of some program)</title>
        <p>Given:
r = T(x, y) ← e(x, y)
r′ = T(x, z) ← e(x, y), T(y, z)
Π = {r ↦ 1, r′ ↦ 1}
Then:
ρ(r, e(x, y)) = ℋ(e(x, y))
ρ(r′, e(x, y)) = ℋ(e(x, y))
ρ(r′, T(y, z)) = ℋ(e(x, y)) + ℋ(T(y, z))
dir(Π) = {⟨0, ℋ(e(x, y))⟩ ↦ 2, ⟨ℋ(e(x, y)), ρ(r′, T(y, z))⟩ ↦ 1}
sig(Π) = {⟨ℋ(e(x, y)), T(x, y)⟩ ↦ 1, ⟨ρ(r′, T(y, z)), T(x, z)⟩ ↦ 1}
kit(Π) = ⟨dir(Π), sig(Π)⟩</p>
        <p>We leverage the grounding kit to guide interpretation and implement the rewrite product as three
incremental joins.</p>
        <p>Example 4.5 (Nested streams and stream introduction and elimination)
Given:
ΔI = [0 ↦ {e("a", "b") ↦ 1}, 1 ↦ {e("b", "c") ↦ 1}]
Then:
δ0(ΔI[0]) = [0 ↦ {e("a", "b") ↦ 1}]
↑δ0(ΔI) = [0 ↦ [0 ↦ {e("a", "b") ↦ 1}], 1 ↦ [0 ↦ {e("b", "c") ↦ 1}]]
∫(ΔI) = {e("a", "b") ↦ 1, e("b", "c") ↦ 1}
↑∫(↑δ0(ΔI)) = [0 ↦ {e("a", "b") ↦ 1}, 1 ↦ {e("b", "c") ↦ 1}]
∫(↑δ0(ΔI)) = [0 ↦ {e("a", "b") ↦ 1, e("b", "c") ↦ 1}]</p>
        <p>We refer to t as the outer timestamp of some stream of streams S and t′ as the inner one. t can be interpreted as the time of arrival of some update, and t′ as the fixed point iteration. As the materialization nonetheless expects the immediate consequence to be a single Z-set, the ↑∫ operator is used to "flatten" it into one Z-set.</p>
        <p>
          gatekeep, product and ground are the three nested stream operators that implement the rewrite product. All of them have the following form:
op : SZ × S′Z ↦ S″Z, ∀t, S″Z[t] = ↑↑πf((↑(↑⋈c)Δ)Δ(SZ, S′Z))[t]
where πf is the Z-set projection with scalar function f, and ⋈c the Z-set join with boolean join comparison scalar function c. As πf is linear, by definition it is already incremental. For some stream S, the time complexity of ↑↑πf(S) at some t, t′ is O(|S[t][t′]|)[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>The relational ⋈ operator is bilinear, with the time complexity of its incremental form (↑(↑⋈)Δ)Δ with respect to its inputs at some t, t′ being O(‖↑ℐ(SZ)[t][t′]‖ × ‖ℐ(S′Z)[t][t′]‖).</p>
        <p>Example 4.6 (A parallel with semi-naïve evaluation of static programs) With t′ denoting the fixed point iteration number, at each new data arrival t a ↑ℐ operator applied to the left side of the incremental join would emit a stream where the value at t′ is ∑i=0..t′ (SZ[t])[i]. For the right side, on the other hand, it would be ∑i=0..t (SZ[i])[t′] instead, as ℐ is not lifted. A parallel can be made with the regular non-incremental semi-naïve evaluation of a static program, where the left side takes as input the most recently inferred facts, and the right side all of them.</p>
        <p>gatekeep : SZ⟨ℕ,Σ⟩ × S′Zdir ↦ S″Z⟨a,Σ,ℕ⟩, ∀t, S″Z⟨a,Σ,ℕ⟩[t] = ↑↑πf((↑(↑⋈c)Δ)Δ(SZ⟨ℕ,Σ⟩, S′Zdir))[t]
Each iteration is initiated through operator gatekeep. Its inputs are a delayed stream of streams of Z-sets that contain tuples where the first element is the expected provenance of some atom alongside a rewrite set Σ, and a stream of directions.</p>
        <p>The output is the join of the delayed provenance-indexed rewrites with directions. The join predicate c is the equality between the provenance of the left side and the "previous" provenance of matching directions.</p>
        <p>The projection function f outputs tuples that push Σ as candidate rewrites for the next iteration. This is achieved by emitting tuples where candidate rewrites have their "next" provenance ρ alongside some atom a such that ρ(r, a) = ρ and r is any rule. This operator is proportional to the product of the previous iteration's rewrites and all directions.</p>
        <p>product : SZ⟨a,Σ,ℕ⟩ × S′Z ↦ S″Z⟨ℕ,Σ⟩, ∀t, S″Z⟨ℕ,Σ⟩[t] = ↑↑πf((↑(↑⋈c)Δ)Δ(SZ⟨a,Σ,ℕ⟩, S′Z))[t] (2)
Operator product implements the core of the rewrite product, with its inputs being the output of gatekeep and the delayed stream of ground atoms. Let l : ⟨a, Σ, ℕ⟩ be the left side and g the right; c is true if Σ applied to a unifies with g, returning Σg. The projection f then simply yields Σ ⊕ Σg and the provenance ℕ. This operator is then proportional to the product of the rewrites propagated during this iteration with all ground atoms.</p>
        <p>ground : SZ⟨ℕ,Σ⟩ × S′Zsig ↦ S″Z, ∀t, S″Z[t] = ↑↑πf((↑(↑⋈c)Δ)Δ(SZ⟨ℕ,Σ⟩, S′Zsig))[t] (3)
The last operation is grounding itself, where the possibly-final rewrites stemming from product are joined with grounding signals. The output of this operator will be all rewrites that have been propagated beyond the last body atom, hence being applicable to the head of their respective rules to generate new ground atoms. c is true if the provenance of the left side matches one found in a grounding signal. f is the application of the rewrite to the head atom contained in the signal tuple.</p>
        <p>
          <bold>4.3. A simple indexing scheme to improve performance</bold>
        </p>
        <p>While the proposed circuit has been demonstrated to be incremental in both programs and ground atoms, operator product is inefficient because in each iteration it needlessly attempts to unify every single rewrite with all facts that share the same predicate. This implies that both the best and worst case complexities of its join with respect to ground atoms are the same.</p>
        <p>
          To make our circuit practical, we devise a way to make its join key more fine-grained by also taking into account constant term combinations. We implemented a simplified version of the ground atom indexing method that Soufflé pioneered[
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], but with incremental DBSP operators. This means that this scheme will respond to change in the same way as the rest of the circuit.
Definition 4.6 The naïve join order Z-set Zord of some rule r with weight w and n atoms ai with predicates ai.p in its body is a Z-set of tuples {⟨a0.p, v0⟩ ↦ w, ..., ⟨an−1.p, vn−1⟩ ↦ w} where vi for i &lt; n are terms in ai that are in the same place as some other terms in ai+1. For simplicity of exposition, we assume that the respective body atoms are ordered in a way that ensures that every ai comes before some other ai+1 with variable terms that overlap. The function jorder : r ↦ Zord computes the naïve join order for some rule r.
        </p>
        <p>Example 4.7 (The naïve join order of some rule) Given r = T(x, z) ← e(x, y), T(y, z) with w = 1, then jorder(r) = {⟨e, ()⟩ ↦ 1, ⟨T, (0)⟩ ↦ 1}. The first tuple indicates that for the predicate e there should be no index. The second, on the other hand, makes it explicit that ground atoms in T ought to be indexed on their zeroth term.</p>
        <p>Definition 4.7 The extended grounding kit Q of some program Π and its n rules ri is a triple ⟨Zdir, Zsig, Zord⟩ where Zdir and Zsig are the same as in K, and Zord is ∑i=0..n jorder(ri). The function ekit : Π ↦ Q computes the extended grounding kit for some program Π, and jorder : Π ↦ Zord returns some Π's join order Z-set.</p>
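        <p>A sketch of jorder under our reading of Definition 4.6: the first body atom gets an empty index, and each subsequent atom is indexed on the positions of its terms shared with the preceding atom, matching Example 4.7 (the representation is ours, not the paper's code):</p>

```python
def jorder(rule, weight=1):
    """Naive join order: map each body atom's predicate to the tuple of
    positions on which its facts should be indexed."""
    head, body = rule
    order = {}
    first_pred, _first_terms = body[0]
    order[(first_pred, ())] = weight  # the first atom needs no index
    for prev, cur in zip(body, body[1:]):
        prev_vars = set(prev[1])
        # Index `cur` on the positions of terms it shares with `prev`.
        positions = tuple(i for i, t in enumerate(cur[1]) if t in prev_vars)
        order[(cur[0], positions)] = weight
    return order

# Example 4.7: r = T(x, z) <- e(x, y), T(y, z)
r = (("T", ("x", "z")), (("e", ("x", "y")), ("T", ("y", "z"))))
print(jorder(r))  # {('e', ()): 1, ('T', (0,)): 1}
```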
        <p>As Figure 3 shows, it is only necessary to make a few changes to the circuit depicted in Figure 2 to extend it with this incremental indexing mechanism.</p>
        <p>First, one new operator, ↑jorder, must be applied to ΔΠ, and product's predicate needs to be shifted to join over predicates and constant terms, i.e. indexed facts. This is enabled by adding a new operator after the distinct operator that follows the sum of recently arrived ground atoms and ground.
index : SZ × S′Zord ↦ S″Z⟨a,a⟩, ∀t, S″Z⟨a,a⟩[t] = ↑↑πf((↑(↑⋈c)Δ)Δ(SZ, S′Zord))[t] (4)
The index operator receives as input ground atoms and join orders. The join key function c is true if some atom's predicate matches some order's predicate. Given some matching order o and atom a, f will output a tuple ⟨a′, a⟩ containing a partial atom a′ according to o and a.</p>
        <p>This will ensure that while the right side is still proportional to all ground terms of some predicate p, if there is only one match according to the order o, then the best-case complexity decreases to a scan over only that single match. In the next section we demonstrate how this leads to enormous performance gains, at a seemingly minimal increase in memory usage.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Evaluation</title>
      <p>There are three parts to this section. The first pertains to comparing the materialisation performance of the circuit with indexing versus without. To give a point of reference, Soufflé was put through the same benchmark in interpreted mode.</p>
      <p>The second part compares DYRE's incremental behavior with the reasoner closest to it, DDLog. This is the most important empirical result of the paper, as it indicates whether the incremental semantics were indeed inherited, with the additional benefit of the program being dynamic.</p>
      <p>As incremental systems must keep track of more data than non-incremental ones, the last part is
dedicated to providing an overview of the peak memory usage witnessed across all benchmarks.</p>
      <p>
        Datasets. We choose to strictly use three well-known synthetic datasets for performance evaluation.
The first dataset is LUBM[
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], a standard benchmark for the entailment of description logic related
formalisms such as RDFS[
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] and OWL2RL[
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. For all of these, the data has two components: the ontology, which determines with which predicates individuals can be described, and the descriptions of the individuals themselves.
 T(?x, "rdf:type", ?c) ← T(?p, "rdfs:domain", ?c), T(?x, ?p, ?y)
 T(?x, "rdf:type", ?c) ← T(?p, "rdfs:range", ?c), T(?y, ?p, ?x)
 T(?x, ?q, ?y) ← T(?p, "rdfs:subPropertyOf", ?q), T(?x, ?p, ?y)
 T(?x, "rdf:type", ?c) ← T(?s, "rdfs:subClassOf", ?c), T(?x, "rdf:type", ?s)
 T(?x, "rdfs:subClassOf", ?z) ← T(?x, "rdfs:subClassOf", ?y), T(?y, "rdfs:subClassOf", ?z)
 T(?x, "rdfs:subPropertyOf", ?z) ← T(?x, "rdfs:subPropertyOf", ?y), T(?y, "rdfs:subPropertyOf", ?z)
      </p>
      <p>The core of RDFS entailment can be modelled as the program of approximately 6 mutually recursive
rules contained in equation 5.</p>
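      <p>Assuming a ternary triple relation, the rule set of equation 5 can be sketched as a naive fixpoint in Python. This is an illustrative sketch of the standard RDFS entailment rules, not DYRE's DBSP circuit, and the example facts are hypothetical.</p>

```python
# Naive fixpoint over the six RDFS rules, on a toy triple store.
def rdfs_closure(triples):
    t = set(triples)
    while True:
        new = set()
        for (s, p, o) in t:
            for (s2, p2, o2) in t:
                if p == "rdfs:domain" and p2 == s:          # domain
                    new.add((s2, "rdf:type", o))
                if p == "rdfs:range" and p2 == s:           # range
                    new.add((o2, "rdf:type", o))
                if p == "rdfs:subPropertyOf" and p2 == s:   # subproperty use
                    new.add((s2, o, o2))
                if p == "rdfs:subClassOf" and p2 == "rdf:type" and o2 == s:
                    new.add((s2, "rdf:type", o))            # subclass instance
                if p == "rdfs:subClassOf" and p2 == "rdfs:subClassOf" and s == o2:
                    new.add((s2, "rdfs:subClassOf", o))     # subclass transitivity
                if p == "rdfs:subPropertyOf" and p2 == "rdfs:subPropertyOf" and s == o2:
                    new.add((s2, "rdfs:subPropertyOf", o))  # subproperty transitivity
        if new.issubset(t):
            return t
        t |= new

facts = {
    ("Professor", "rdfs:subClassOf", "Faculty"),
    ("Faculty", "rdfs:subClassOf", "Employee"),
    ("alice", "rdf:type", "Professor"),
    ("teaches", "rdfs:domain", "Faculty"),
    ("alice", "teaches", "cs101"),
}
closure = rdfs_closure(facts)
# The closure now also contains ("alice", "rdf:type", "Faculty"),
# ("alice", "rdf:type", "Employee") and
# ("Professor", "rdfs:subClassOf", "Employee").
```

      <p>The rules are mutually recursive: the subclass-transitivity rule feeds the subclass-instance rule, which is why the evaluation must iterate to a fixpoint rather than apply each rule once.</p>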
      <p>Faculty(?x) ←
 PostDoc(?x)</p>
      <p>Employee(?x) ←
 Person(?x), worksFor(?x, ?y), Organization(?y)
...(95 more rules)
degreeFrom(?x, ?y) ←</p>
      <p>hasAlumnus(?y, ?x)</p>
      <sec id="sec-3-2">
        <p>
          The OWL2RL entailment program, in equation 6, is obtained by converting the description logic entailments
of LUBM’s ontology to Datalog according to [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]. The final program has almost 100 rules.
        </p>
        <p>
          The second and third datasets are graphs generated with the GT[
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] tool. The second, RAND1K,
is a dense graph of one thousand nodes, where each pair of nodes has around a 1% chance of being
connected. The third graph, RMAT1K, has ten thousand edges and one thousand nodes. It is a sparse graph
that follows an inverse power-law distribution, with most nodes connected to only a handful of
other nodes.
 T(?x, ?y) ← E(?x, ?y)
 T(?x, ?z) ← T(?x, ?y), E(?y, ?z)
We only run one program over these two datasets, the ubiquitous transitive closure in equation 7.
        </p>
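        <p>The Z-set mechanics behind the incremental runs of the transitive closure program can be sketched in Python. This is an illustrative sketch under DBSP's weighted-collection model, not DYRE's actual circuit: a deletion is a tuple with weight -1, and here the closure is simply recomputed after each batch rather than maintained from deltas.</p>

```python
from collections import defaultdict

# Z-sets in the spirit of DBSP: maps from tuple to integer weight.
def zset_add(a, b):
    out = defaultdict(int)
    for zset in (a, b):
        for fact, weight in zset.items():
            out[fact] += weight
    # Facts whose weights cancel to zero are dropped from the collection.
    return {fact: w for fact, w in out.items() if w != 0}

def closure(edges):
    # tc(x, y) :- e(x, y).   tc(x, z) :- tc(x, y), e(y, z).
    tc = set(edges)
    while True:
        new = {(x, z) for (x, y) in tc for (y2, z) in edges if y == y2} - tc
        if not new:
            return tc
        tc |= new

e = {("a", "b"): 1, ("b", "c"): 1}
assert ("a", "c") in closure(e)

# An update batch: retract ("b", "c") with weight -1, add ("b", "d").
delta = {("b", "c"): -1, ("b", "d"): 1}
e = zset_add(e, delta)
assert ("a", "c") not in closure(e)
assert ("a", "d") in closure(e)
```

        <p>A genuinely incremental engine propagates only the delta through the circuit instead of recomputing, but the additive weight algebra shown here is what makes retractions expressible at all.</p>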
        <p>All benchmarks used the same AWS-provisioned M1 Mac Pro machine. No other processes were
running during the evaluation. Each measurement was taken fifty times, or until there
was no noticeable variance. Runtime performance was recorded with each reasoner’s own profiler,
while peak memory usage was tracked with GNU Time.</p>
        <sec id="sec-3-2-1">
          <title>5.1. The impact of indexing</title>
          <p>The first benchmark set is shown in Figure 4. The X axis shows the percentage of the total data that is
being materialised according to the respective program and dataset, indicated by the title of the facet of
each subgraph. The Y axis shows the time in milliseconds taken to materialise the program. DYRE
with a superscript refers to DYRE with indexing, and its counterpart is referred to without the superscript.</p>
          <p>In two out of the four graphs, Soufflé is faster than both versions of DYRE. In TC:Dense, Soufflé is
three times faster than DYRE with indexing, and twenty times faster otherwise. The same ratio holds in
the sparse graph. As both of these scenarios imply many fixpoint iterations, they offer a good proxy
to evaluate the impact of DBSP’s provenance, which is considerable in both variants. As the indexing
scheme restricts the rewrite product to only occur between fresh atoms that share constants with facts,
it decreases the amount of data that is retained.</p>
          <p>LUBM:OWL2RL and LUBM:RDFS do not require many iterations to converge, with the latter needing
only a handful. In both of these situations, DYRE with indexing is at least twice as fast as Soufflé
in every single measurement. Given these results, we posit that our simple indexing scheme brings
overwhelmingly better performance, to the point that it rivals a state-of-the-art reasoner that is not
incremental.</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>5.2. Incremental performance</title>
          <p>With DYRE showing adequate single-batch performance compared with the well-known reasoner
Soufflé, we turn to comparing it with DDLog, a Soufflé-inspired incremental reasoner that compiles
non-incremental Datalog programs to differential dataflow programs that are incremental. On the
benchmark shown in Figure 5, the X axis contains three sequential steps: the first step, 1 - mat, indicates
the initial materialisation, with the following ones representing incremental updates, 2 - update
being an incremental addition, followed by an incremental deletion, 3 - remove.</p>
          <p>The amount of data that is added is encoded in the reasoner name. DDLog:99 means that the initial
materialisation consisted of 99% of all available data, with the incremental addition containing the remaining 1%,
and the deletion removing the same amount. DDLog:50, on the other hand, had an initial materialisation
size of 50%, with subsequent updates each adding and retracting the remaining 50%.</p>
          <p>We note that an update batch size equivalent to the initial materialisation is not realistic, as are
all depicted measurements where the batch size is 50%. A batch size of 1%, however, is akin to highly dynamic
scenarios. The expected behavior of an incremental system is to take less time to adjust to an update
than to restart the computation from scratch.</p>
          <p>For the programs that are run on LUBM, both DYRE and DDLog exhibit incremental behavior. For both,
updating the materialisation with additions and deletions takes roughly as long as initialising it. When
the batch size is 50% the line is flat, as each update takes roughly the same time, as expected.</p>
          <p>When the update is much smaller than the initial materialisation, as is the case when the batch size is
1%, there should be a dramatic elbow from step 1 - mat to 2 - update. This is witnessed in both graphs,
for both reasoners. DYRE, however, is over ten times faster than DDLog at initially materialising the data
in RDFS, and twice as fast at processing updates. In OWL2RL it is four times and two times faster, respectively.</p>
          <p>While DYRE is significantly faster, DDLog’s ratio of initial materialisation time to update processing time is
better. For RDFS, DDLog took 780 ms on average to materialise 99% of the data, and then only 10 ms for
each update. DYRE, on the other hand, took 83 ms and 7 ms. This implies that DYRE has an almost ten
times larger relative overhead to process updates.</p>
          <p>The performance of DDLog in the benchmarks with the transitive closure program is close to DYRE’s.
For the dense graph, DYRE did not manage to attain incremental behavior: in the case where the
batch size is 1%, updates took the same time as the materialisation. DDLog, however, did. In the sparse
graph, DYRE performed better than DDLog, but not by much, with both the initial materialisation and
the updates being only 30% faster. DDLog struggled with the large batch size, where updating took almost four
times as long as the initial materialisation.</p>
        </sec>
        <sec id="sec-3-2-3">
          <title>5.3. Maximum memory resident set size</title>
          <p>To provide further insight into the cost of supporting incremental reasoning, Figure 6 shows the peak
memory usage recorded for each reasoner across all experiments. Soufflé used at least
one order of magnitude less memory than every other reasoner. In the most extreme case, transitive
closure over the Sparse graph, DYRE used 100 times more memory than Soufflé. In all scenarios, DDLog
consistently consumed four times less memory than DYRE.</p>
          <p>The last notable observation is that the indexing scheme implied only a small increase in memory
usage, in spite of leading to a consistent tenfold performance improvement.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>6. Conclusion</title>
      <p>In this paper we put forward a novel take on the incremental interpretation of dynamic Datalog
programs by entirely delegating that problem to DBSP. DBSP is a formal language for incremental
iterative streaming computation in which highly expressive query languages can be expressed.</p>
      <p>DBSP’s primary benefit, however, is not only in providing a suitable theoretical playground, but also in
having a very efficient interpreter written in Rust. While Datalog has recently been seen almost
exclusively as a problem of efficiently compiling Horn clauses into relational algebra, we revisit its
framing as a term rewriting system and directly model it as an incremental DBSP circuit that is then
extended with a simple indexing scheme.</p>
      <p>We evaluated our reasoner DYRE, the Dynamic Datalog Reasoner, against two state-of-the-art reasoners:
Soufflé, which is not incremental, and DDLog. We attained competitive benchmark results, surpassing
the former, running in interpreted mode, in materialisation runtime on two occasions, and the latter in
most benchmarks, at the cost of four times higher memory consumption. Furthermore, our proposed
incremental indexing scheme was shown to be the key element that made this performance competitive.</p>
      <p>While DYRE is an interpreter, DDLog is a compiler. The total number of lines of code of DYRE’s
DBSP circuit sits at 120. Each DDLog differential dataflow program had hundreds, up to thousands for
OWL2RL, of lines of code, and took minutes to compile.</p>
      <p>We conclude that DBSP is an interesting avenue for describing and writing incremental and
non-incremental term rewriting systems whose primary goal is attaining high performance. We see
multiple ways of extending this work, such as extending the interpreter to support stratified negation
or other Datalog flavors, and improving the indexing scheme by replicating Soufflé’s.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ceri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gottlob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tanca</surname>
          </string-name>
          , et al.,
          <article-title>What you always wanted to know about datalog (and never dared to ask)</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>1</volume>
          (
          <year>1989</year>
          )
          <fpage>146</fpage>
          -
          <lpage>166</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shkapsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Interlandi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Condie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zaniolo</surname>
          </string-name>
          ,
          <article-title>Big data analytics with datalog queries on spark</article-title>
          ,
          <source>in: Proceedings of the 2016 International Conference on Management of Data</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1135</fpage>
          -
          <lpage>1149</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Imran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Gévay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Markl</surname>
          </string-name>
          ,
          <article-title>Distributed graph analytics with datalog queries in flink,</article-title>
          <year>2020</year>
          , pp.
          <fpage>70</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Imran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Gévay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-A.</given-names>
            <surname>Quiané-Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Markl</surname>
          </string-name>
          ,
          <article-title>Fast datalog evaluation for batch and stream graph processing</article-title>
          ,
          <source>World Wide Web</source>
          <volume>25</volume>
          (
          <year>2022</year>
          )
          <fpage>971</fpage>
          -
          <lpage>1003</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ryzhyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Budiu</surname>
          </string-name>
          , Differential datalog, volume
          <volume>2</volume>
          ,
          <year>2019</year>
          , pp.
          <fpage>4</fpage>
          -
          <lpage>5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. S.</given-names>
            <surname>Mumick</surname>
          </string-name>
          ,
          <article-title>Incremental maintenance of recursive views: A survey</article-title>
          , MIT Press,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Piro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>Incremental update of datalog materialisation: the backward/forward algorithm</article-title>
          ,
          <source>in: AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Piro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>Maintenance of datalog materialisations revisited</article-title>
          ,
          <source>Artif. Intell</source>
          .
          <volume>269</volume>
          (
          <year>2019</year>
          )
          <fpage>76</fpage>
          -
          <lpage>136</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Budiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>McSherry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ryzhyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tannen</surname>
          </string-name>
          ,
          <article-title>Dbsp: Automatic incremental view maintenance for rich query languages</article-title>
          ,
          <source>Proc. VLDB Endow</source>
          .
          <volume>16</volume>
          (
          <year>2022</year>
          )
          <fpage>1601</fpage>
          -
          <lpage>1614</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Abadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>McSherry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. D.</given-names>
            <surname>Plotkin</surname>
          </string-name>
          ,
          <article-title>Foundations of differential dataflow</article-title>
          ,
          <source>in: Foundations of Software Science and Computation Structures: 18th International Conference, FOSSACS</source>
          <year>2015</year>
          ,
          <article-title>Held as Part of the European Joint Conferences on Theory and Practice of Software</article-title>
          ,
          <source>ETAPS</source>
          <year>2015</year>
          , London, UK, April
          <volume>11</volume>
          -
          <issue>18</issue>
          ,
          <year>2015</year>
          , Proceedings 18, Springer,
          <year>2015</year>
          , pp.
          <fpage>71</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <article-title>Dbsp implementation github repository</article-title>
          ,
          <year>2024</year>
          . URL: https://github.com/feldera/feldera/tree/main/crates/dbsp.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <article-title>Differential dataflow implementation github repository</article-title>
          ,
          <year>2024</year>
          . URL: https://github.com/TimelyDataflow/differential-dataflow.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>M. K. Bruno Rucy Carneiro Alves de Lima</surname>
          </string-name>
          , '
          <article-title>materialized-view' System Source Code</article-title>
          , https://github.com/brurucy/materialized-view,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>B. R. C. A. de Lima</surname>
          </string-name>
          ,
          <source>Pydbsp implementation github repository</source>
          ,
          <year>2024</year>
          . URL: https://github.com/brurucy/pydbsp.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>B.</given-names>
            <surname>Scholz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jordan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Subotić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Westmann</surname>
          </string-name>
          ,
          <article-title>On fast large-scale program analysis in datalog</article-title>
          (
          <year>2016</year>
          )
          <fpage>196</fpage>
          -
          <lpage>206</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Armbrust</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Xin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Huai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Bradley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kaftan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Franklin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ghodsi</surname>
          </string-name>
          , et al.,
          <article-title>Spark sql: Relational data processing in spark</article-title>
          ,
          <source>in: Proceedings of the 2015 ACM SIGMOD international conference on management of data</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>1383</fpage>
          -
          <lpage>1394</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>P.</given-names>
            <surname>Carbone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Katsifodimos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ewen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Markl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Haridi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Tzoumas</surname>
          </string-name>
          ,
          <article-title>Apache flink: Stream and batch processing in a single engine</article-title>
          ,
          <source>The Bulletin of the Technical Committee on Data Engineering</source>
          <volume>38</volume>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Urbani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>Datalog reasoning over compressed rdf knowledge bases</article-title>
          ,
          <source>28th ACM International Conference on Information and Knowledge Management</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>Pay-as-you-go ontology query answering using a datalog reasoner</article-title>
          ,
          <source>in: Description Logics</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Piro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          <article-title>Incremental update of datalog materialisation: the backward/forward algorithm</article-title>
          ,
          <source>in: AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kotowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brodt</surname>
          </string-name>
          ,
          <article-title>Reasoning as axioms change - incremental view maintenance reconsidered</article-title>
          ,
          <source>in: International Conference on Web Reasoning and Rule Systems</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>T. J.</given-names>
            <surname>Green</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Karvounarakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tannen</surname>
          </string-name>
          ,
          <article-title>Provenance semirings</article-title>
          ,
          <source>in: 26th ACM SIGMOD-SIGACTSIGART symposium on Principles of database systems</source>
          ,
          <year>2007</year>
          , pp.
          <fpage>31</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.</given-names>
            <surname>Arch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Subotić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Scholz</surname>
          </string-name>
          ,
          <article-title>Building a join optimizer for soufflé</article-title>
          ,
          <source>in: International Symposium on Logic-Based Program Synthesis and Transformation</source>
          , Springer,
          <year>2022</year>
          , pp.
          <fpage>83</fpage>
          -
          <lpage>102</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Heflin</surname>
          </string-name>
          ,
          <article-title>Lubm: A benchmark for owl knowledge base systems</article-title>
          ,
          <source>J. Web Semant</source>
          .
          <volume>3</volume>
          (
          <year>2005</year>
          )
          <fpage>158</fpage>
          -
          <lpage>182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>D.</given-names>
            <surname>Allemang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hendler</surname>
          </string-name>
          ,
          <article-title>Semantic web for the working ontologist: effective modeling in rdfs and owl</article-title>
          , Elsevier,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>S. T.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Szałas</surname>
          </string-name>
          ,
          <article-title>The web ontology rule language owl 2 rl + and its extensions</article-title>
          ,
          <source>Trans. Comput. Collect. Intell</source>
          .
          <volume>13</volume>
          (
          <year>2013</year>
          )
          <fpage>152</fpage>
          -
          <lpage>175</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>B. N.</given-names>
            <surname>Grosof</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Volz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Decker</surname>
          </string-name>
          ,
          <article-title>Description logic programs: combining logic programs with description logic</article-title>
          ,
          <source>in: The Web Conference</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Bader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Madduri</surname>
          </string-name>
          ,
          <article-title>Gtgraph: A synthetic graph generator suite</article-title>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>