<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Fully Dynamic Materialization Maintenance</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Moritz Illich</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Birte Glimm</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ulm University</institution>
          ,
          <addr-line>James-Franck-Ring, 89081 Ulm</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Classically, consecutive updates on materialized datasets, obtained by computing every consequence for given inference rules, are considered separately without any direct interference. In contrast, we deal with fully dynamic updates (FDUs), where additions and deletions from different updates are processed concurrently and can directly influence each other. On the one hand, this enables immediate integration of any new update into the current processing, thus improving responsiveness. On the other hand, direct interference between additions and deletions may reverse or cancel operations. While this can prevent redundant computations, e.g., by stopping a deletion process if a fact is re-added, it might also lead to incorrect materializations if applied without further measures. In this work, we show that (i) FDUs require recursion handling and lead to incomplete processing of operations, (ii) classical incremental materialization approaches cannot fully deal with these challenges, and (iii) incremental materialization approaches can be adapted to allow for FDUs.</p>
      </abstract>
      <kwd-group>
        <kwd>materialization maintenance</kwd>
        <kwd>fully dynamic</kwd>
        <kwd>Datalog</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Answering queries on description logic (DL) ontologies with large ABoxes can be facilitated
by transformations into Datalog [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], enabling the usage of optimized reasoning engines like
RDFox [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] that are able to outperform state-of-the-art DL reasoners [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In Datalog, query
answering often refers to materialization where we compute all consequences for a set of facts
with respect to some inference rules such that every implicitly entailed fact is made explicitly
available. Since updates to the original set of facts, in the form of additions and deletions,
have to be reflected in the materialization too, and since re-computing everything from scratch is
usually expensive, we make use of incremental materialization maintenance algorithms that
efficiently adapt a materialization. In detail, these algorithms focus on the actual changes in
a materialization by computing new consequences based on added facts, as well as removing
every fact that cannot be derived anymore due to introduced deletions.
      </p>
      <p>
        Typically, incremental materialization algorithms, like Delete/Rederive (DRed) and counting
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], process one update at a time, without any direct interference between consecutive updates.
In particular, a new update is not processed until the previous one is completed. In this work,
we investigate fully dynamic updates that may overlap and influence each other. Accordingly, a
new update can directly be integrated into the current processing without any delay, enabling
high responsiveness. This is useful for applications with high update frequencies, like in the
context of stream reasoning [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] where we answer continuous queries on data streams with
respect to given background knowledge.
      </p>
      <p>
        Related work concerns the parallelization of materialization maintenance. One approach that
deals with RDF data is provided by Motik et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which should also be applicable for DRed
and counting [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In addition, DynamiTE [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] enables parallel materialization in the context
of RDF stream reasoning. Besides, fully dynamic materialization maintenance has similarities
to incremental stream reasoning, as done in IMaRS [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ] for RDF triples annotated with
window-based expiration times, or the work of Ren and Pan [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] dealing with updates on ℰℒ++
ontologies based on a truth maintenance system. However, in all of these works, concurrent
execution only refers to operations within an update, while the simultaneous processing of
different updates is not taken into account.
      </p>
      <p>In general, additions and deletions are already processed concurrently in existing approaches,
but they always refer to the same update. In contrast to this, fully dynamic updates also allow
interference between operations from distinct updates introduced at different times. As a
consequence, we can directly react to a new update and potentially utilize it to avoid redundant
computations, for example, by canceling the propagation of a deletion if the related fact is
re-added in the new update. Nevertheless, the additional interaction between updates also imposes
some challenges regarding the correctness of the materialization maintenance. For instance,
canceling an operation might lead to missing facts that actually should be present in the updated
materialization. In this regard, the main goal of this work is to examine (i) the difficulties of
realizing fully dynamic updates for correct incremental materialization maintenance in Datalog,
(ii) to which degree DRed and the counting algorithm are already capable of handling those
problems, and (iii) how we can adapt the discussed approaches to enable fully dynamic updates
without sacrificing correctness.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Basics and Preliminaries</title>
      <p>To formally define Datalog, we fix countable, disjoint sets of predicates, constants, and variables.
A term is a constant or a variable. An atom has the form P(t1, . . . , tn), where P is an n-ary predicate
and each ti, 1 ≤ i ≤ n, is a term. We focus on unary and binary atoms only (i.e., 1 ≤ n ≤ 2),
which correspond to concepts/classes and roles/properties in an ontology, respectively. An atom
is ground if it does not contain variables. A fact is a ground atom, and a dataset is a finite set of
facts. A Datalog rule is a logical implication of the form B1, . . . , Bn → H, where B1, . . . , Bn
are called body atoms, and H is the head atom. A rule is safe if every variable that appears in the head
also appears in a body atom.1</p>
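      <p>To make the safety condition concrete, it can be checked mechanically. The following Python sketch is our own illustration (not part of the formalism): it encodes an atom as a pair of a predicate name and a term tuple, with uppercase strings as variables and lowercase strings as constants.</p>

```python
# Illustrative encoding (our own, not from the paper): an atom is (predicate, terms),
# variables are uppercase strings, constants are lowercase strings.
def is_variable(term):
    return isinstance(term, str) and term[:1].isupper()

def rule_is_safe(body, head):
    """Safe: every variable in the head also appears in some body atom."""
    body_vars = {t for _, terms in body for t in terms if is_variable(t)}
    head_vars = {t for t in head[1] if is_variable(t)}
    return head_vars.issubset(body_vars)

# R(X, Y) -> C(X) is safe, while R(X, X) -> C(Y) is not.
print(rule_is_safe([("R", ("X", "Y"))], ("C", ("X",))))   # True
print(rule_is_safe([("R", ("X", "X"))], ("C", ("Y",))))   # False
```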
      <p>
        A Datalog program is a finite set of safe rules. Predicates that occur in the head of a rule are
called intensional (IDB) predicates; all other predicates are extensional (EDB). Typically, only
EDB predicates are allowed in datasets and their updates. This is without loss of generality,
as we can replace every IDB predicate P that is to be used in a dataset or an update by a new
predicate P′ and add a rule of the form P(x) → P′(x) or P(x, y) → P′(x, y), depending on the
predicate’s arity, such that P becomes an EDB predicate [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. In materialization maintenance
approaches like DRed, such a transformation is needed to identify the explicit introduction of
an implicitly derivable fact as an alternative derivation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>1 For the sake of readability, examples considered throughout this text do not contain any
variables or constants. Instead, we assume that the stated atoms or facts all implicitly share the
same variable or constant, and may only differ in their predicate.</p>
      <p>Given a program Π with a rule B1, . . . , Bn → H ∈ Π, we say that the predicate of the head H
depends on P for P a predicate symbol occurring in B1, . . . , Bn; P is called recursive in Π if P
transitively depends on itself in Π. A rule r is recursive if at least one of its body atoms has a
predicate that depends on the head’s predicate, and non-recursive otherwise. A program Π is
non-recursive if no predicate symbol occurring in Π is recursive in Π, else Π is recursive.</p>
      <p>A substitution σ is a partial mapping from variables to constants. For α a term, an atom,
a rule, or a set of rules, ασ is the result of replacing each occurrence of a variable x in α on
which σ is defined with σ(x). If r is a rule and σ is a substitution mapping all variables of r to
constants, then rσ is an instance of r. For a rule r = B1, . . . , Bn → H and a dataset D, we define
r(D) = {Hσ | Biσ ∈ D for all 1 ≤ i ≤ n}. For a program Π, we define Π(D) = ⋃r∈Π r(D).
Given a rule instance rσ = B1σ, . . . , Bnσ → Hσ, we say that the fact Biσ, with 1 ≤ i ≤ n,
(directly) derives the fact Hσ, and call Hσ a (direct) derivation of Biσ. In general, derivation
relations are transitive and may consist of a whole sequence of rule instances. A derivation is
recursive if it allows a fact to derive itself, and non-recursive otherwise.</p>
      <sec id="sec-2-1">
        <title>2.1. Materialization Maintenance</title>
        <p>Let E be a dataset (called explicit facts). Then, let E0 = E; for each i ≥ 1, let Ei = Ei−1 ∪ Π(Ei−1),
and let E∞ = ⋃i≥0 Ei. The set E∞ is the materialization of E w.r.t. Π, denoted as mat(Π, E).</p>
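        <p>The fixpoint construction above can be sketched directly. Following footnote 1, the toy program below is propositional: a fact is just a predicate name and a rule is a pair of a body list and a head. This is a simplification of our own, not the paper's implementation.</p>

```python
def materialize(rules, explicit):
    """Naive fixpoint: E_0 = E, E_i = E_{i-1} ∪ Π(E_{i-1}), until E_∞ is reached."""
    dataset = set(explicit)
    while True:
        derived = {head for body, head in rules if set(body).issubset(dataset)}
        if derived.issubset(dataset):      # fixpoint E_∞ reached
            return dataset
        dataset |= derived

# The recursive toy program used in the later examples: A → B, B → C, C → A.
rules = [(["A"], "B"), (["B"], "C"), (["C"], "A")]
print(sorted(materialize(rules, {"A"})))   # ['A', 'B', 'C']
```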
        <p>Let E, E+, and E− be datasets with E ∩ E+ = ∅, E− ⊆ E, and E+ ∪ E− ≠ ∅. The tuple
u = (E+, E−) is called an update for E, where E+ denotes the facts explicitly added to E, and
E− the facts explicitly deleted from E, respectively. Applying the update u on E leads to the
updated dataset E′ = (E ∖ E−) ∪ E+. Let û = ⟨u1, . . . , un⟩ with n ≥ 1 denote a sequence
of updates for a dataset E0, where for every update ui = (Ei+, Ei−), with 1 ≤ i ≤ n, we have
Ei−1 ∩ Ei+ = ∅ and Ei− ⊆ Ei−1, resulting in Ei = (Ei−1 ∖ Ei−) ∪ Ei+. Based on this, we define
materialization maintenance as the task of computing the updated materialization mat(Π, En)
for a given materialization mat(Π, E0) and a sequence of updates û = ⟨u1, . . . , un⟩. Note that
the program Π stays the same, i.e., we do not consider changes to the set of rules.</p>
        <p>Typically, the application of updates in a sequence û is strictly consecutive, such that an
update ui is not processed until the preceding update ui−1 is completed. Concretely, we
first apply u1 on mat(Π, E0) to obtain mat(Π, E1), then apply u2 on mat(Π, E1) to obtain
mat(Π, E2), and continue this procedure until we finally reach mat(Π, En). In contrast, updates
are fully dynamic if their sequential application allows for overlaps between updates. Specifically,
the processing of an update ui may directly start even if the preceding update ui−1 is not yet
completed. An important aspect here is that the result of fully dynamic updates has to be the
same as for strictly consecutive ones. This means that the order of the updates in the sequence
is still taken into account despite the overlapping, which is relevant if Ei+ ∩ Ej− ≠ ∅ for two
updates ui and uj. Besides, the whole sequence of updates does not need to be available directly
from the beginning; instead, the updates may be introduced one by one with a potential delay
between them. Consequently, we also allow infinite sequences of updates.</p>
        <p>Algorithm 1 Fully dynamic materialization maintenance
Input: Datalog program Π, dataset E, (infinite) sequence of updates û
1: initialize empty set O for operations
2: while updates available do
3:   if new update u given in û then
4:     determine explicit operations
5:   apply one of the following computation steps:
6:   (1) determine implicitly | (2) cancel | (3) execute one operation</p>
        <p>To formalize incremental materialization with fully dynamic updates, we define two operations
add(f) and del(f) representing the addition and deletion of some fact f. For the processing of
these operations, we introduce three groups of transition rules that determine, cancel, or execute
an operation. Let Π be a Datalog program, E a dataset, O a set of operations, u an update, and
rσ a rule instance with r ∈ Π and σ a substitution. For each of the following transition rules,
the statements above the line describe needed preconditions, while the ones below state the
conclusion, respectively.</p>
        <p>1. Determine an operation
i) explicitly based on an update
u = (E+, E−)
───────────────────────────
O = O ∪ {add(f+), del(f−) | f+ ∈ E+, f− ∈ E−}
ii) implicitly based on a rule
a) addition
rσ = B1σ, . . . , Bnσ → Hσ
{add(Biσ) | 1 ≤ i ≤ n} ∩ O ≠ ∅
Biσ ∉ E → add(Biσ) ∈ O for 1 ≤ i ≤ n
───────────────────────────
O = O ∪ {add(Hσ)}
b) deletion
rσ = B1σ, . . . , Bnσ → Hσ
{del(Biσ) | 1 ≤ i ≤ n} ∩ O ≠ ∅
Biσ ∉ E → del(Biσ) ∈ O for 1 ≤ i ≤ n
───────────────────────────
O = O ∪ {del(Hσ)}</p>
        <p>2. Cancel an operation
add(f) ∈ O    del(f) ∈ O
───────────────────────────
O = O ∖ {add(f)} or O = O ∖ {del(f)}</p>
        <p>3. Execute an operation
a) addition
add(f) ∈ O
───────────────────────────
E = E ∪ {f}
b) deletion
del(f) ∈ O
───────────────────────────
E = E ∖ {f}</p>
        <p>For example, the transition rule for determining an implicit addition extends O with
an add operation for the head of a rule provided a body atom is marked for addition
(add(Biσ) ∈ O) and where, additionally, the other body atoms are facts in the dataset (Biσ ∈ E)
or are marked for addition themselves. Moreover, canceling an operation is non-deterministic,
such that different materialization maintenance approaches may define their own conditions
under which one operation removes its counterpart.</p>
        <p>Based on these transition rules, Algorithm 1 describes the general procedure of fully dynamic
materialization maintenance. The main idea here is that we constantly alternate between
checking for a new update and performing one computation step to ensure the direct integration
of the former. The choice of the computation step, along with needed operations and rule instances,
depends on the concrete materialization maintenance approach. We assume a general fairness
condition such that every applicable rule will be considered eventually and computations do
not just focus on one update. In addition, the order of updates has to be tracked to ensure
that a newly introduced explicit operation is not canceled by an old, outdated one.</p>
        <sec id="sec-2-1-1">
          <title>2.1.1. Delete/Rederive</title>
          <p>
            One way to incrementally compute updates on materializations is described by the
Delete/Rederive (DRed) algorithm [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ], which works for both non-recursive and recursive Datalog programs.
The algorithm takes as input a materialization M = mat(Π, E) together with one update u =
(E+, E−) and produces as output the updated materialization M′ = mat(Π, (E ∖ E−) ∪ E+).
The processing is separated into an overdeletion, a rederivation, and an insertion phase, where
the first two handle the deletions E−, while the last one takes care of the additions E+.
          </p>
          <p>During overdeletion, all facts from M that are contained in E− or derived from some fact in
E− are deleted. This is achieved by propagating the deletion over direct derivations. For this,
we first introduce deletion operations for all facts in E−. We then determine all implicitly
following deletions and repeat this process, introducing new deletions each time, until reaching
a fixpoint. At this point, the deletions are executed. This procedure guarantees that we remove
every fact from M that can no longer be derived in M′, but it potentially also deletes facts
that have alternative derivations and that should not be affected by the deletion. To fix this,
overdeletion is followed by the rederivation phase, where we re-add each deleted fact that can
still be derived from the remaining facts. In detail, we try to determine an addition add(f) for
each deleted fact f based on the remaining facts. If we find such an addition, we execute it,
cancel the related deletion, and continue the search until no further rederivations are possible.
Finally, the insertion phase adds E+ and iteratively computes all newly derivable facts.</p>
          <p>
            For an efficient execution, a semi-naive evaluation [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ] ensures that every time we consider a
rule instance to derive a fact, at least one fact that was previously not present must be involved.
Thus, an added fact only leads to further derivations if it is actually new, which also guarantees
termination in the context of recursive rules.
          </p>
        </sec>
        <sec id="sec-2-1-2">
          <title>2.1.2. Counting</title>
          <p>
            Aside from DRed, the counting algorithm [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ] provides another approach for incremental
materialization maintenance. The basic idea here is that we count the number of current derivations
for each fact. In this context, the explicit addition of a fact based on an update also denotes a
derivation. Initially, if a fact is not present, it has a count of zero. Adding a fact, due to a new
derivation, increases its count. Conversely, if a derivation of a fact is not given anymore because
of some new operation, the fact’s count is decreased. Hence, the count directly states whether a
fact can be derived (count &gt; 0) or not (count = 0), where only the latter case requires an actual
deletion of the fact.
          </p>
          <p>The algorithm’s input consists of one update and a materialized dataset where the derivation
count for each fact is already computed. For some materialization mat(Π, E), the latter can
generally be achieved by updating an empty dataset with u = (E, ∅). The actual processing
starts with the explicit introduction of addition and deletion operations defined by the provided
update. Based on this, we first determine directly following operations, and count how often
they appear. After that, we check if operations can be canceled by updating the count for
each fact f. In detail, the count is increased by the number of new add(f) operations and,
simultaneously, decreased by the number of new del(f) operations. If the adapted count is
greater than zero, we remove the deletions and keep one add(f), else we remove the additions
and keep one del(f), respectively. Finally, we execute every new operation and repeat the
procedure as long as fresh operations are introduced.</p>
          <p>
            Generally, the counting approach only works for non-recursive programs, because recursive
derivations enable a fact to increase its own count, thus potentially impeding its deletion. A
recursive variation of counting is provided by Motik et al. [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ]. Here, the materialization is
represented by a sequence of multisets (called trace) that reflects the semi-naive computation of
the derivations. In this regard, a fact may occur in various multisets, each time with a different
count, if it is derived several times during the semi-naive processing. Specifically, this means
that a fact cannot falsely influence its own count due to recursive derivations.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Realizing Fully Dynamic Updates</title>
      <p>In the presence of fully dynamic updates, the general procedure of incremental materialization
approaches like DRed and counting changes. As stated in Algorithm 1, we now constantly
check if a new update is available, and potentially introduce related operations, each time
before performing any computation. On the one hand, this means that the set of facts on which
an update is currently processed can be modified due to new incoming updates. Since these
changes might affect already processed facts, this can impede the semi-naive evaluation, where
we typically assume that a fact should only be further processed if it is not yet present. On
the other hand, we concurrently deal with several updates introduced at diferent times, such
that a strict order or separation of specific computation steps cannot always be guaranteed.
This especially affects DRed, where the processing of every update goes through the sequential
overdeletion, rederivation, and insertion phases. Accordingly, the rederivation or insertion
phase of one update might overlap with the overdeletion phase of another one introduced later,
in particular since we assume a fair alternation between operations of different updates. In
contrast to this, the counting approach is not directly affected by fully dynamic updates, since
it does not rely on a specific rule separation or ordering. However, in the recursive variation, a
new update might modify a multiset in the trace that has already been adapted by a previous
update still being processed.</p>
      <p>Despite the changes involved in the processing of fully dynamic updates, performing DRed
and counting should still result in a correctly updated materialization. Therefore, we next take
a look at the different errors that might occur during incremental materialization maintenance
and how DRed and counting can deal with those in the context of fully dynamic updates.</p>
      <sec id="sec-3-1">
        <title>3.1. Handling Erroneous Processing</title>
        <p>Let û = ⟨u1, . . . , un⟩, with n ≥ 1, be a sequence of updates that transforms a materialization
M = mat(Π, E0) into a correctly updated materialization M′ = mat(Π, En). Note that û does
not need to be final, i.e., we may still extend the sequence by further updates un+1, . . . , un+m
with m ≥ 1. In this case, the expected, correctly updated materialization is defined as M′ =
mat(Π, En+m), always referring to the latest update un+m in the current sequence available so
far. Working with fully dynamic updates should not affect the computation of M′. But due to
the possible overlapping of updates, their operations can influence each other such that we
might end up with a different, wrong materialization M*. Since the correct materialization M′
is unique, this requires that M* contains additional facts not in M′, or that M* misses facts
present in M′. Having additional facts means we either performed some invalid additions or
omitted some deletions, whereas missing some facts is the result of either invalid deletions or
omitted additions, respectively.</p>
        <p>In order to avoid such incorrect behavior, we therefore analyze in the following each of those
four cases and check (i) their relation to fully dynamic updates, (ii) whether DRed and the
counting approach can handle those cases in the presence of fully dynamic updates, and (iii)
how we can adapt DRed and counting to enable fully dynamic materialization maintenance.</p>
        <sec id="sec-3-1-1">
          <title>3.1.1. Invalid Addition</title>
          <p>We say that an addition add(f) is invalid if the fact f does not appear in the correctly updated
materialization M′. Since we can assume that the explicit introduction of an addition directly
related to an update is always valid, an invalid addition can only result from the implicit
propagation of some addition. Suppose add(f) was determined by using the rule instance
rσ = B1σ, . . . , Bnσ → Hσ (with r ∈ Π and f = Hσ) together with add(Biσ) for some
1 ≤ i ≤ n. Then, f not being present in M′ requires that at least one fact Bjσ, with 1 ≤ j ≤ n,
is also not available in M′ anymore, otherwise we could validly derive the fact f. This means
there has to be some deletion del(Bjσ) that removes Bjσ. However, del(Bjσ) could be applied
together with rσ to determine del(f) and thus delete the fact f. Accordingly, f can only be
present if this deletion is canceled as well. For that, we need an addition add(f), but as stated
before, f is not present in M′. Hence, this addition is invalid too. In this regard, an invalid
addition appears if another invalid addition prevents a deletion affecting the former.</p>
          <p>In general, the derivation of some fact begins with an explicitly determined operation based
on an update. But since such an operation is always valid, it cannot lead to an invalid addition.
However, an exception is possible for recursive derivations, because they can start with an
implicitly introduced fact that derives itself. Hence, an invalid addition, for which there is no
deletion that could cancel it, occurs when a fact recursively (re-)adds itself and thus prevents its
own deletion.</p>
          <p>Example 1 (Invalid addition). Consider the following rules A → B, B → C, and C → A.
Assume we apply an update u1 = ({A}, ∅) to an empty dataset. With the new A, we can derive
a new fact B, which furthermore allows us to derive a fact C. Now, suppose we simultaneously
introduce another update u2 = (∅, {A}) which deletes the fact A. Accordingly, A must not be
present in the updated materialization. However, so far the derived fact C has not been processed,
which means that we can still apply the rule C → A, leading to the invalid addition of A.</p>
          <p>Although the example ends with an invalid addition, other outcomes are possible as well,
depending on how the operations are further processed after the second update:
1) Deletion overtakes addition: here, the explicit deletion of A directly causes the deletion of B,
further leading to the deletion of C, thus preventing the invalid addition of A. In this case, we
obtain the correct materialization where none of the facts are present anymore.
2) Addition overtakes deletion: using C to (re)derive a new A before the deletion of A is
propagated to B cancels the deletion completely. Hence, we end up with a wrong materialization
where A, B, and C are still present despite the deletion.
3) No overtaking: due to the deletion of A, we have to delete the derived B, but processing C
also allows us to derive another A. Since the original A has already been deleted, this derived
A is considered to be a new, not already present, fact. Thus, we can now use the deleted B to
further delete C, but likewise also use the new A to derive a new B. For the deleted C and the
new B, we can then repeat the same procedure, hence ending up in an endless cycle where we
constantly delete and re-add facts without termination.</p>
          <p>In DRed, the separation of overdeletion and rederivation usually prevents the simultaneous
processing of deletions and additions needed to produce invalid additions. Because of the
overlapping processing of fully dynamic updates, this strict separation does not hold anymore.
Moreover, both DRed and the recursive variation of counting handle recursive derivations based
on the semi-naive evaluation, where a fact is only processed if it is not already present. However,
fully dynamic updates allow for cases where we already delete facts in the recursion before the
processing of this recursive derivation is finished. As a consequence, we might not be able to
detect the end/beginning of the recursion anymore, such that the semi-naive evaluation is no
longer able to guarantee termination.</p>
          <p>Considering a potential solution for DRed and counting to deal with invalid additions, we
have to make sure that the propagation of some addition add(f) does not enable the fact f to
recursively re-add itself, even if f is not present anymore. Therefore, the addition process has
to stop at the latest when the start of the recursion, i.e., the fact f that initiated the recursive
derivation, is reached. Stopping the addition then enables the deletion to catch up and ensures
termination, as described in the first situation discussed for Example 1. Concretely, this could
be realized by remembering the fact f (or the related add(f)) that started the recursion. The
remembered fact is then passed on to its derived facts that also take part in the recursion. As
soon as we reach the remembered fact again, i.e., determine another add(f), we stop the related
processing.</p>
          <p>Still, only detecting the end of a recursion is generally not enough. If a fact g involved in a
recursion has an alternative derivation, then re-adding the fact f that started the recursion is
actually valid. Accordingly, a fact g that was deleted in the recursion because of deleting f has
to be re-added as well. In DRed, the explicit rederivation phase already deals with this problem.
The counting approach, on the other hand, requires additional measures.</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Invalid Deletion</title>
          <p>As a supplement to invalid additions, we say that a deletion del(f) is invalid if the fact f has
to be present in the correctly updated materialization M′. This means that f has an alternative
derivation (including an explicit introduction) that is not affected by any deletion and still
enables the addition of f in M′. Identifying the impact of fully dynamic updates in this context
requires distinguishing between the two cases where the alternative addition add(f) was
determined either a) before or b) after introducing the deletion del(f). In the first case a), we
delete the fact f although it has an alternative derivation. This issue appears even if updates
are not fully dynamic. Regarding case b), introducing the alternative add(f) after the deletion
enables us to re-add the fact f and cancel del(f). Therefore, f can only not be present if
another invalid deletion del(f) deletes it. Hence, we also end up in case a), where we remove
a fact despite the existence of an alternative derivation. Still, a special situation occurs if the
introduction of the alternative add(f) results from a recursive derivation, as illustrated in the
following example.</p>
          <p>Example 2 (Invalid deletion). Suppose we extend the rules from Example 1 with a new rule
D → A. Moreover, assume that we start with a materialized dataset E = {A, B, C}. Given
the following update u = ({D}, {A}), we add a new fact D and simultaneously delete the fact
A. While the deletion of A causes the deletion of B, adding D allows us to rederive A. With the
deletion of B, we can furthermore delete C. However, due to the rule C → A, this might cause the
invalid deletion of the re-added A.</p>
          <p>In this respect, we have the same problem as for invalid additions, which may be solved
likewise by stopping the propagation of a deletion as soon as the end of the recursion is reached.
Nevertheless, we also have to check if the earlier described general case a), where a fact is
deleted despite the existence of an alternative derivation, can still be handled by the discussed
approaches if fully dynamic updates are given. The counting approach does not have any
issue, since the utilization of a fact’s count directly prevents (invalid) deletions if alternative
derivations are given. For DRed, invalid deletions may be seen as the eponymous part of the
overdeletion phase, being resolved in the subsequent rederivation phase. However, an important
aspect here is that the rederivation does not start before the overdeletion is completed to ensure
that a rederivation is indeed not affected by any deletion. With fully dynamic updates, the
problem is that we cannot clearly state anymore at which point the overdeletion phase is done,
due to a possibly endless sequence of incoming updates. A potential solution for this can be
to restrict the propagation of operations to older facts, which are given in a finite number. In
detail, some deletion del(f) is only used together with some fact g if g has been introduced
before the deletion was executed, otherwise g could not have been applied together with the
deleted fact f for some derivation.</p>
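          <p>A hypothetical sketch of this restriction in our propositional rendering: facts carry introduction timestamps, and a deletion is only joined with body atoms introduced before the deletion was executed.</p>

```python
def propagate_deletion(rules, timestamps, removed, deleted_at):
    """Propagate deletions, joining only with facts older than the deletion."""
    while True:
        more = set()
        for body, head in rules:
            hit = not removed.isdisjoint(body)          # some body atom is deleted
            others_old = all(b in removed or deleted_at > timestamps.get(b, deleted_at)
                             for b in body)             # join partners must be older
            if hit and others_old and head not in removed:
                more.add(head)
        if not more:
            return removed
        removed |= more

rules = [(["A", "B"], "C")]
# B was (re-)introduced at time 5, after the deletion of A was executed at time 3,
# so the deletion is not propagated to C via this rule instance.
print(sorted(propagate_deletion(rules, {"A": 0, "B": 5}, {"A"}, deleted_at=3)))  # ['A']
print(sorted(propagate_deletion(rules, {"A": 0, "B": 1}, {"A"}, deleted_at=3)))  # ['A', 'C']
```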
        </sec>
        <sec id="sec-3-1-3">
          <title>3.1.3. Omitted Addition and Deletion</title>
          <p>So far, we discussed cases where we obtain an incorrect materialization due to the execution
of operations that in fact should have been avoided. Now we take a look at the opposite,
where problems happen as a result of not performing certain operations. Here, we assume
that this omission of operations is not the consequence of passively forgetting to perform an
operation, but rather actively caused by either canceling an operation before it is executed, or
removing preconditions needed to determine the operation. Formally, omitting an addition
() requires either some deletion () that cancels (), or some deletion ( )
with 1 ≤  ≤ , which cancels ( ) or deletes  , such that a related rule instance
1, . . . ,  →  cannot be used to determine (). Likewise, omitting a deletion () is
caused either by some (), or by some ( ) that prevents determining ().</p>
          <p>One aspect to be aware of in this context is that compared to invalid operations, the omission
of operations by itself is not necessarily a problem. Indeed, it may even be beneficial by avoiding
redundant computations. For instance, it does not make sense to further compute derivations
based on some fact  if  has already been deleted. Likewise, re-adding a fact  prevents the
need to further propagate a previous deletion of  to its derived facts. Nevertheless, ending up
with a wrong materialization is possible as well. Concretely, a needed fact might be missing in
the updated materialization if we omit an addition, while omitting a deletion can lead to wrong,
additional facts that are actually not derivable in the correctly updated materialization  ′.</p>
          <p>Assume again that we have a rule instance 1, . . . ,  →  enabling us to determine an
addition (). Furthermore, let () and ( ), with 1 ≤  ≤ , be deletions that cause
the omission of (). If () and ( ) are valid, then  is neither present in the correctly
updated materialization  ′ nor can it be derived due to the missing  . Hence, omitting the
addition of  does not produce an error. Conversely, if () and ( ) are invalid, then
 should be in  ′ and can even be derived due to  also being in  ′. In this case, omitting
the addition of  leads to a wrong materialization. The same argumentation can be made for
omitting a deletion based on an addition. In this regard, taking care of invalid operations as
discussed earlier also solves the problem of omitting operations leading to a wrongly updated
materialization. However, this is not the only issue caused by omitting operations.</p>
          <p>Usually, both additions and deletions are processed until no further changes in the current
set of facts are possible, i.e., if we add a fact , we do not stop until every (direct) derivation of
 has been computed, and if we delete , we make sure to propagate the deletion until only
facts with alternative derivations are left. In contrast, fully dynamic updates
enable the cancellation of operations before they are entirely processed, because new incoming
updates are directly integrated after each computation. Accordingly, the problem of
omitting operations actually amounts to the incomplete processing of operations. In other words,
we can no longer be sure whether a certain operation was applied to introduce an operation or a
fact, or whether it was canceled beforehand due to a new update. Therefore, only using the available
inference rules to detect derivation relations does not suffice.</p>
          <p>Example 3 (Deletion without derivation). Consider the following rules  → ,  → , and
 → . Assume we apply an update 1 = ({, }, ∅) to an empty dataset. Then, we can use 
to derive , and  to derive , respectively. Now, suppose we simultaneously introduce another
update 2 = (∅, {}). Because of the rule  → , we could technically propagate this deletion
of  to . However,  was actually not used to derive . Hence, the deletion of  should not
influence . In contrast, the fact  should be affected by the deletion of , but only using the
provided rules does not allow us to determine whether a fact was really involved in the derivation
of some fact or not.</p>
          <p>Example 4 (Duplicate derivation). Consider the rules  →  and  → . Assume we apply
an update 1 = (∅, {}) to a materialized dataset  = {, , }, where we already used 
to derive  and . The update deletes , which also triggers the deletion of . Now, suppose we
simultaneously introduce another update 2 = ({}, ∅). This allows us to rederive the previously
deleted . However, this new  may also be used to derive another , even though  was not
removed by the deletion. Accordingly, the derivation  →  would be applied twice here.</p>
          <p>In this regard, omitting additions can lead to false deletions not related to truly conducted
derivations, whereas omitting deletions might produce redundant duplicate facts. Since the
false deletion of a fact  can simply be repaired by re-adding  based on the derivation that
was indeed applied to add  in the first place (  →  in Example 3), and duplicates are not
relevant if we work with sets of facts, DRed can directly handle omitted operations without the
need for adaptations. For the counting approach, the impact is more severe, because omitting
operations means that the count associated with a fact only represents the number of actually
computed derivations, not the entire number of available rule instances that could be used to
directly derive the fact. Therefore, solely relying on the inference rules to detect derivation
relations does not work in general. For instance, in Example 3 the fact  was only derived by
, thus having a count of one. But letting the deletion of  affect  would decrease
the counter to zero and thus delete , despite its alternative derivation based on . A solution
is to remember which rule instances were really applied, so that an
operation is only propagated over derivations that were in fact performed.</p>
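          <p>This remedy can be sketched as follows; the class, fact, and rule-instance names are illustrative assumptions standing in for the unnamed facts of Examples 3 and 4:</p>

```python
from collections import defaultdict

class TrackedCounting:
    """Sketch of the adaptation described above: alongside each fact's
    count we remember the rule instances that were actually applied,
    so a deletion only propagates over derivations that were really
    computed, and repeated derivations of one instance are ignored."""

    def __init__(self):
        self.counts = defaultdict(int)   # fact -> number of derivations
        self.applied = set()             # rule instances (body, head) actually used

    def derive(self, body, head):
        instance = (tuple(sorted(body)), head)
        if instance in self.applied:
            return                       # duplicate derivation (cf. Example 4)
        self.applied.add(instance)
        self.counts[head] += 1

    def delete(self, fact):
        """Remove one derivation of `fact`; propagate only along rule
        instances that were actually applied (cf. Example 3)."""
        if self.counts[fact] <= 0:
            return
        self.counts[fact] -= 1
        if self.counts[fact] > 0:
            return                       # alternative derivation remains
        del self.counts[fact]
        for body, head in [i for i in self.applied if fact in i[0]]:
            self.applied.remove((body, head))
            self.delete(head)

# Hypothetical facts: B and C could both derive D by rule, but only
# the derivation from B was actually computed.
tc = TrackedCounting()
tc.counts["B"] += 1                      # explicitly added facts
tc.counts["C"] += 1
tc.derive(("B",), "D")                   # D derived from B only
tc.derive(("B",), "D")                   # repeated instance is ignored
tc.delete("C")                           # must not affect D
assert tc.counts["D"] == 1
```

          <p>Because the deletion of C finds no applied rule instance containing C, the fact D survives, exactly as required in Example 3.</p>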
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>We examined fully dynamic updates for incremental materialization maintenance, where new
updates are directly integrated into the current processing. For that, we analyzed the problems
that may occur due to the overlapping processing of updates and how the well-known DRed
and counting algorithms can handle those. In detail, fully dynamic updates impede the correct
processing of recursive derivations, resulting in invalid additions and deletions, or even
nontermination. This is solved by detecting and memorizing recursions, as well as identifying when
a fact needs to be rederived. Moreover, fully dynamic updates cause the omission of not yet
completely processed operations, potentially leading to incorrect deletions or duplicate facts.
While this does not affect the correctness of DRed, the counting approach has to know whether a
derivation was really computed or not.</p>
      <p>
        Future work may include the examination of other materialization maintenance algorithms,
like Backward/Forward [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] or the combined approaches presented by Hu et al. [14]. Moreover,
the proposed adaptations for DRed and counting still require further investigation to allow for
concrete implementations that guarantee both correctness and efficiency. In this context, we
may also explicitly consider the parallelization of operations. Besides, it would be useful to also
consider the setting of Datalog with negation, and we would like to investigate how one can
still answer queries on materializations under the influence of fully dynamic updates, where
temporarily blocking the integration of updates might serve as a solution.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Abiteboul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vianu</surname>
          </string-name>
          , Foundations of Databases, Addison-Wesley,
          <year>1995</year>
          . URL: http://webdam.inria.fr/Alice/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Piro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Olteanu</surname>
          </string-name>
          ,
          <article-title>Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF Systems</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>28</volume>
          , AAAI Press,
          <year>2014</year>
          , pp.
          <fpage>129</fpage>
          -
          <lpage>137</lpage>
          . URL: https://doi.org/10.1609/aaai.v28i1.8730.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Carral</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Koopmann</surname>
          </string-name>
          ,
          <article-title>From Horn-SRIQ to Datalog: A Data-Independent Transformation That Preserves Assertion Entailment</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>33</volume>
          , AAAI Press,
          <year>2019</year>
          , pp.
          <fpage>2736</fpage>
          -
          <lpage>2743</lpage>
          . URL: https://doi.org/10.1609/aaai.v33i01.33012736.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. S.</given-names>
            <surname>Mumick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. S.</given-names>
            <surname>Subrahmanian</surname>
          </string-name>
          , Maintaining Views Incrementally,
          <source>in: Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data</source>
          , ACM Press,
          <year>1993</year>
          , pp.
          <fpage>157</fpage>
          -
          <lpage>166</lpage>
          . URL: https://doi.org/10.1145/170035.170066.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dell'Aglio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Valle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>van Harmelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          ,
          <article-title>Stream reasoning: A survey and outlook</article-title>
          ,
          <source>Data Science</source>
          <volume>1</volume>
          (
          <year>2017</year>
          )
          <fpage>59</fpage>
          -
          <lpage>83</lpage>
          . URL: https://doi.org/10.3233/DS-170006.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Piro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>Maintenance of Datalog Materialisations Revisited</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>269</volume>
          (
          <year>2019</year>
          )
          <fpage>76</fpage>
          -
          <lpage>136</lpage>
          . URL: https://doi.org/10.1016/j.artint.2018.12.004.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Urbani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Margara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J. H.</given-names>
            <surname>Jacobs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>van Harmelen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. E.</given-names>
            <surname>Bal</surname>
          </string-name>
          ,
          <article-title>DynamiTE: Parallel Materialization of Dynamic RDF Data</article-title>
          ,
          <source>in: The Semantic Web - ISWC</source>
          <year>2013</year>
          , volume
          <volume>8218</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2013</year>
          , pp.
          <fpage>657</fpage>
          -
          <lpage>672</lpage>
          . URL: https://doi.org/10.1007/978-3-642-41335-3_41.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Braga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ceri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Valle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Grossniklaus</surname>
          </string-name>
          ,
          <article-title>Incremental Reasoning on Streams and Rich Background Knowledge</article-title>
          ,
          <source>in: The Semantic Web: Research and Applications</source>
          , volume
          <volume>6088</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2010</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          . URL: https://doi.org/10.1007/978-3-642-13486-9_1.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dell'Aglio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Valle</surname>
          </string-name>
          ,
          <article-title>Incremental Reasoning on RDF Streams</article-title>
          ,
          <source>in: Linked Data Management</source>
          , Chapman and Hall/CRC,
          <year>2014</year>
          , pp.
          <fpage>413</fpage>
          -
          <lpage>435</lpage>
          . URL: http://dellaglio.org/preprints/14b1.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <article-title>Optimising ontology stream reasoning with truth maintenance system</article-title>
          ,
          <source>in: Proceedings of the 20th ACM International Conference on Information and Knowledge Management</source>
          , ACM Press,
          <year>2011</year>
          , pp.
          <fpage>831</fpage>
          -
          <lpage>836</lpage>
          . URL: https://doi.org/10.1145/2063576.2063696.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kaminski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cuenca Grau</surname>
          </string-name>
          ,
          <article-title>Datalog rewritability of Disjunctive Datalog programs and non-Horn ontologies</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>236</volume>
          (
          <year>2016</year>
          )
          <fpage>90</fpage>
          -
          <lpage>118</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0004370216300297.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bancilhon</surname>
          </string-name>
          ,
          <article-title>Naive Evaluation of Recursively Defined Relations</article-title>
          ,
          <source>in: On Knowledge Base Management Systems: Integrating Artificial Intelligence and Database Technologies, Topics in Information Systems</source>
          , Springer,
          <year>1986</year>
          , pp.
          <fpage>165</fpage>
          -
          <lpage>178</lpage>
          . URL: https://doi.org/10.1007/978-1-4612-4980-1_17.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E. F.</given-names>
            <surname>Piro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>Incremental Update of Datalog Materialisation: the Backward/Forward Algorithm</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>29</volume>
          , AAAI Press,
          <year>2015</year>
          , pp.
          <fpage>1560</fpage>
          -
          <lpage>1568</lpage>
          . URL: https://doi.org/10.1609/aaai.v29i1.9409.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>Optimised Maintenance of Datalog Materialisations</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>32</volume>
          , AAAI Press,
          <year>2018</year>
          , pp.
          <fpage>1871</fpage>
          -
          <lpage>1879</lpage>
          . URL: https://doi.org/10.1609/aaai.v32i1.11554.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>