                           Smart Data Exchange
                          (DISCUSSION PAPER)
                      Sergio Greco1 , Michele Ianni1 , Elio Masciari2 ,
                         Domenico Saccà1 , and Irina Trubitsyna1
            1 DIMES - Università della Calabria, 87036 Rende (CS), Italy
                   {greco, ianni, sacca, trubitsyna}@dimes.unical.it
      2 DIETI - Università degli Studi di Napoli Federico II, 80125 Napoli (NA), Italy
                              elio.masciari@unina.it


         Abstract. The problem of exchanging data, even among incomplete and
         heterogeneous sources, has been deeply investigated in recent years. The
         approaches proposed so far are quite rigid, as they refer to fixed schemas
         and/or follow a deductive approach based on a fixed set of (mapping) rules.
         In this paper we describe a smart data exchange framework integrating
         deductive and inductive techniques to obtain new knowledge. The graph-based
         representation of source and target data, together with the midway relational
         database and the extraction of new knowledge, allows us to manage dynamic
         databases where even the features of the data may change over time.


1     Introduction

Many proposals have been made for data exchange among heterogeneous sources. How-
ever, to the best of our knowledge, no generalization of the consolidated data exchange
framework that supports both the extraction of new knowledge and a flexible represen-
tation of heterogeneous data has been defined so far. In this paper we describe an ex-
tension of the data exchange framework, called Smart Data Exchange (SDE), recently
proposed in [13], that addresses these issues. The global scenario is shown in Fig. 1,
where (i) Source, Target and Midway are three databases, (ii) ΣSM and ΣMT are ex-
tended TGDs (Tuple Generating Dependencies) mapping data from Source to Midway
and standard TGDs mapping data from Midway to Target, respectively, and (iii) ΣM and
ΣT are extended TGDs defined over the Midway database and standard TGDs and EGDs
(Equality Generating Dependencies) defined over the Target database, respectively.
    The idea is that, by allowing an intermediate database and a richer language to de-
rive new information, as well as information obtained by analyzing data, we can define
more powerful and flexible tools for data exchange. To obtain a flexible and general rep-
resentation of the source and target databases, following the RDF approach, we model
them using a graph-based formalism where data are stored in ternary relations. Differ-
ently from other formalisms where data are stored in a single relation [3, 18], we consider
multiple ternary relations; analogously to RDF, our graph data consist
    Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons Li-
    cense Attribution 4.0 International (CC BY 4.0). This volume is published and copyrighted by
    its editors. SEBD 2020, June 21-24, 2020, Villasimius, Italy.
                                    Fig. 1. Architecture


of tuples of the form (i, a, v) with i being a (resource) identifier, a an attribute name
and v a value (either a constant or an identifier). The midway database is a relational
database containing tuples of any arity that are either imported from the source database
using the ΣSM source-midway mapping rules, or are generated by the Smart Analyzer
Tool (SAT). Data dependencies denoted by ΣM are used to generate new information
useful to enrich the target database. The target database is graph-based and is built by
importing data from the midway database using the mapping rules ΣMT . Finally, as in
the standard data exchange scenario, ΣT consists of a set of TGDs or EGDs.
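    To make the three storage layers concrete, the following Python sketch (ours, not part of the framework or its prototype) shows one possible in-memory encoding: source and target data as sets of ternary (i, a, v) facts, and midway data as relations of arbitrary arity. All relation names, attribute names and values are invented for illustration.

```python
# A minimal sketch (ours) of the three storage layers described above.
# Graph-based source/target data are sets of (identifier, attribute, value)
# triples; midway data are ordinary relations of any arity.

source = {
    # each fact is a triple (i, a, v)
    "person": {("p1", "name", "Ada"), ("p1", "knows", "p2"),
               ("p2", "name", "Bob")},
}

midway = {
    # relational data of arbitrary arity, e.g. imported via the Sigma_SM rules
    "acquaintance": {("Ada", "Bob", 2020)},
}

target = {
    # graph-based target data; None plays the role of a labeled null
    "person": {("p1", "friend", None)},
}

# iterate over the (i, a, v) facts of a graph-based relation
for i, a, v in sorted(source["person"]):
    print(i, a, v)
```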
     While we assume that SAT is a black box aiming to produce new data stored in the
midway database (in our prototype it implements several data mining algorithms specific
to the application domain), the language used to define ΣM is a rich logical language that
makes use of comparison predicates, aggregate predicates and nondeterministic choice
constructs (well studied in the past by the logic database community). This language
allows us to obtain more realistic information that can be used in a flexible way by both
data experts (for analysis purposes) and inexperienced users (for better navigation
through the data). Another peculiarity of our framework is that the data exchange also
takes advantage of information (in the form of facts) generated by a data analyzer
module and stored in the midway database, together with the data extracted from the
source database.
     The idea of modeling data analysis during the data mapping is present in the Data
Posting framework [8] and its simplified version [19]. Differently from SDE, the frame-
work proposed in [8, 19] does not use graph-based formalisms for source and target data
and does not have an intermediate level for data analysis. Moreover, it does not admit
the use of existentially quantified variables in the mapping rules, and uncertain values
in the body of mapping rules can be represented only by means of non-deterministic
variables, whose values can be selected from specific relations, possibly restricted by
target count constraints.

2   Background
A Tuple Generating Dependency (TGD) is a formula of the form ∀x φ(x) → ∃y ψ(x, y),
where φ(x) and ψ(x, y) are conjunctions of atoms, and x, y are lists of variables. Full
TGDs are TGDs without existentially quantified variables. An Equality Generating De-
pendency (EGD) is a formula of the form ∀x φ(x) → x1 = x2 , where φ(x) is a conjunc-
tion of atoms, while x1 and x2 are variables in x. In our formulae we omit the universal
quantifiers when their presence is clear from the context.
    Let S = S1 , ..., Sn and T = T1 , ..., Tm be two disjoint schemas. We refer to S (resp.
T ) as the source (resp. target) schema and to the Si ’s (resp. Tj ’s) as the source (resp.
target) relation symbols. Instances over S (resp. T ) will be called source (resp. target)
instances. If I is a source instance and J is a target instance, then we write hI, Ji for
the instance K over the schema S ∪ T such that K(Si ) = I(Si ) and K(Tj ) = J(Tj ),
for i ≤ n and j ≤ m.
     The data exchange setting [11, 2] is a tuple (S, T, Σst , Σt ), where S is the source
schema, T is the target schema, Σt are dependencies over T and Σst are source-to-
target TGDs. The dependencies in Σst map data from the source to the target schema
and are TGDs of the form ∀x( φs (x) → ∃y ψt (x, y) ), where φs (x) and ψt (x, y)
are conjunctions of atomic formulas on S and T , respectively. Dependencies in Σst are
also called mapping dependencies. Dependencies in Σt specify constraints on the target
schema and can be either TGDs or EGDs.
     The data exchange problem associated with this setting is the following: given a
finite source instance I, find a finite target instance J such that hI, Ji satisfies Σst and
J satisfies Σt . Such a J is called a solution for I. The computation of a universal
solution (the compact representation of all possible solutions) can be done by means of
the fixpoint chase algorithm, when it terminates [9]. Decidable (sufficient) conditions
ensuring chase termination can be found in [16, 6, 15].
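     As a concrete illustration of the fixpoint chase mentioned above, the following Python sketch (ours, under simplifying assumptions) applies full TGDs only; a complete chase would also introduce fresh labeled nulls for existentially quantified head variables. The encoding of a TGD as a (body predicate, head predicate, projection) triple is an assumption made for the sketch, and the example rules are reminiscent of the graph/edge/node dependencies used later in Example 1.

```python
# Sketch of the fixpoint chase restricted to full TGDs (ours).  Each TGD is
# encoded as (body_predicate, head_predicate, f), meaning
#   body_predicate(x) -> head_predicate(f(x));
# existential TGDs would additionally invent fresh labeled nulls.

def chase_full(instance, tgds):
    """Apply full TGDs to `instance` (dict: predicate -> set of tuples)
    until a fixpoint is reached."""
    changed = True
    while changed:
        changed = False
        for body, head, f in tgds:
            for t in list(instance.get(body, ())):
                h = f(t)
                if h not in instance.setdefault(head, set()):
                    instance[head].add(h)
                    changed = True
    return instance

# graph(x, y) -> edge(x, y); edge(x, y) -> node(x); edge(x, y) -> node(y)
tgds = [("graph", "edge", lambda t: t),
        ("edge", "node", lambda t: (t[0],)),
        ("edge", "node", lambda t: (t[1],))]
print(chase_full({"graph": {(0, 1), (1, 2)}}, tgds))
```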
     Queries over relational databases can be expressed using Relational Algebra, or alter-
native equivalent languages such as safe Relational Calculus and safe, nonrecursive Data-
log (with negation). To make query languages more expressive, several additional fea-
tures have been added to these languages, including the possibility to manage bags (SQL),
aggregates (SQL), recursion (Datalog), existential variables (Datalog± ) and others.
     The choice constructs [20, 17] have been introduced in Datalog to increase its ex-
pressive power and to obtain simple declarative formulations of classical combinato-
rial problems, such as those which can be solved by means of greedy algorithms [20].
     A choice atom is of the form choice((x), (y)), where x and y are lists of variables
such that x ∩ y = ∅, and can occur in the body of a rule. Its intuitive meaning is to
force each instantiation of x to be associated with a unique instantiation of y, thus
making the result of executing the corresponding rule nondeterministic. A choice rule
is of the form: A(w) ← B(z), choice((x), (y)), where A(w) is an atom, B(z) is a
conjunction of atoms, and x, y, z and w are lists of variables such that x, y, w ⊆ z. The
atom choice((x), (y)) is used to enforce the functional dependency x → y on the set
of atoms derived by means of the rule.
     The choice-least and choice-most constructs [17] specialize the choice construct
so as to force greedy selections among alternative choices—these turn out to be par-
ticularly useful to express classical greedy algorithms. A choice-least (resp. choice-
most) atom is of the form choice least((x), (c)) (resp. choice most((x), (c))), where
x is a list of variables and c is a single variable ranging over an ordered domain.
A choice least((x), (c)) (resp. choice most((x), (c))) atom in a rule indicates that
the functional dependency defined by the atom choice((x), (c)) is to be satisfied, and
the c value assigned to a certain value of x has to be the minimum (resp. maximum)
one among the candidate values. The body of a rule may contain more than one choice
construct, but only one choice-least or choice-most atom. For instance, the rule
p(x, y, c) ← q(x, y, c), choice((x), (y)), choice least((x), (c)) imposes the func-
tional dependency x → y, c on the possible instances of p. In addition, for each value of
x, the minimum among the candidate values of c must be chosen. For instance, assum-
ing that q is defined by the facts q(a, b, 1) and q(a, d, 2), from the rule above we might
derive either p(a, b, 1) or p(a, d, 2). However, the choice-least atom introduces the ad-
ditional requirement that the minimum value on the third attribute has to be chosen, so
that only p(a, b, 1) is derived. The formal semantics is defined by rewriting rules with
choice atoms into rules with negated literals and selecting (nondeterministically) one of
the stable models of the rewritten program [20, 17].
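     The following Python sketch (ours) simulates the intended semantics of the rule p(x, y, c) ← q(x, y, c), choice((x), (y)), choice least((x), (c)) discussed above; the tuple encoding of facts and the boolean least flag are assumptions of the sketch, not part of the Datalog syntax.

```python
import random

# Sketch (ours) of the semantics of choice and choice-least: for each value
# of x exactly one candidate (y, c) is kept; plain choice picks it
# nondeterministically, choice-least forces the minimum c.

q = {("a", "b", 1), ("a", "d", 2)}

def choose(facts, least=False):
    groups = {}
    for x, y, c in facts:
        groups.setdefault(x, []).append((y, c))
    p = set()
    for x, candidates in groups.items():
        if least:
            y, c = min(candidates, key=lambda yc: yc[1])  # choice_least((x),(c))
        else:
            y, c = random.choice(candidates)              # choice((x),(y))
        p.add((x, y, c))
    return p

print(choose(q))              # either {('a', 'b', 1)} or {('a', 'd', 2)}
print(choose(q, least=True))  # always {('a', 'b', 1)}
```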

3   Smart Data Exchange
In this section we present the data exchange framework informally discussed in the
Introduction. We assume the existence of the following countably infinite sets: relation
names R, identifiers I, attribute names A, constants C, nulls N and variables V. The set
of relation symbols (also called predicate symbols) is partitioned into three countable
sets denoted by RS (source relations), RT (target relations) and RM (midway relations),
whereas DS = I ∪ A ∪ C, DT = I ∪ A ∪ C ∪ N and DM = I ∪ A ∪ C denote the
domains of relations RS , RT and RM , respectively. Relations in RS and RT have arity 3
and take values from I × A × (I ∪ C) and (I ∪ N ) × (A ∪ N ) × (I ∪ C ∪ N ), respectively,
whereas relations in RM may have any arity n and take values from DM^n . The main
difference between the source and the target databases is that the target database may
also have nulls (corresponding to blank nodes in RDF) which are introduced to satisfy
constraints. The set of source (resp. target, midway) relations define the source (resp.
target, midway) database whose schema is denoted by S (resp. T , M ).
    The model defined above states that the source and target databases are graph-based
databases stored in a relational database (using triples), whereas the midway database
is a standard relational database. This choice is due to the fact that we want to exchange
data among heterogeneous databases and to model data whose schema may change
over time.
Extended TGDs. An atom is of the form p(t1 , ..., tn ), where t1 , ..., tn are terms (stan-
dard atom), or of the form t1 θ t2 , where θ is a comparison predicate and t1 , t2 are terms
(built-in atom). A literal is an atom A (positive literal) or its negation ¬A (negative lit-
eral). A conjunction of literals is of the form B1 ∧ · · · ∧ Bn where B1 , ..., Bn are literals.
A conjunction of literals is said to be safe if all variables occurring in built-in atoms and
negated literals also appear in positive literals. From now on we assume that whenever
we consider conjunctions of literals they are safe.
Definition 1. An extended TGD (ETGD) is a universally quantified implication for-
mula of one of the following forms:
  – ϕ(x) ∧ choice(x0 ) → ψ(w),
     where ϕ(x) is a safe conjunction of standard and built-in literals, ψ(w) is a conjunc-
     tion of atoms, w ⊆ x and choice(x0 ), with x0 ⊆ x, is a possibly empty conjunction
     of choice-atoms;
  – ϕ(x) → q(w0 , c1 hw1 i, ..., cn hwn i) (called aggregate dependency)
     where ϕ(x) is a safe conjunction of standard and built-in literals, c1 , ..., cn denote
     aggregate functions (e.g., min, max, sum, count), w0 (called “group-by” vari-
     ables) and w1 , ..., wn are lists of variables such that w0 , ..., wn ⊆ x and w0 ∩wi = ∅
     ∀i ∈ [1, n].                                                                           □
      For any set Σ of data dependencies, the dependency graph GΣ = (V, E) is built
as follows: V consists of the predicate symbols occurring in Σ, whereas there is an
edge from p to q if there is a dependency having p in the body and q in the head.
Moreover, the edge is labeled with ¬ (resp. ag) if p occurs negated (resp. occurs in
the head atom which contains aggregate functions). A set of dependencies is said to be
stratified if the dependency graph does not contain cycles with labeled edges. Observe
that since standard dependencies are positive (that is, all body literals are positive), to
check stratification it is sufficient to check stratification of ETGDs. From now on we
assume that our dependencies are stratified.
      The semantics of ETGDs with choice atoms can be defined as in the case of Datalog
rules. To this end, in order to eliminate head conjunctions, any ETGD r : ϕ(x) →
p1 (y1 ) ∧ · · · ∧ pk (yk ) having k > 1 atoms in the head is rewritten into: (i) k ETGDs
ri : ϕ(x) → pi (yi ) (i ∈ [1, k]), if r does not contain choice atoms, and (ii) k+1 ETGDs
r0 , ..., rk if r contains choice atoms, where r0 = ϕ(x) → hr (x) and ri = hr (x) →
pi (yi ) (with i ∈ [1, k]), where hr is a fresh predicate. Regarding the semantics of
ETGDs with aggregate functions, as the set of ETGDs is stratified, we can partition it
into strata so that, if a stratum contains an ETGD with aggregate functions, then it does
not contain any other ETGD, and compute one stratum at a time, following the topological
order defined over the strata by the dependency graph. The computation of an ETGD
with aggregate functions can be carried out in the same way as SQL queries with
aggregates, as the body predicates have already been computed and, therefore, they
correspond to database relations in SQL (see also [12]).
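To illustrate the last point, the sketch below (ours) evaluates a hypothetical aggregate dependency deriving, for each node x, the number of outgoing edges and the total weight, in the style of the aggregate ETGDs of Definition 1, as a SQL-style group-by over an already materialized body predicate; the edge and degree predicates are invented for the example.

```python
# Sketch (ours) of evaluating an aggregate ETGD as a SQL-style group-by.
# The hypothetical dependency derives degree(x, count, sum_of_weights)
# from an already computed body predicate edge(x, y, w).

edge = {("a", "b", 3), ("a", "c", 5), ("b", "c", 1)}

def aggregate(facts, group_by, aggs):
    """group_by: positions of the group-by variables;
    aggs: list of (aggregate_function, position) pairs."""
    groups = {}
    for t in facts:
        groups.setdefault(tuple(t[i] for i in group_by), []).append(t)
    return {key + tuple(fn([t[i] for t in ts]) for fn, i in aggs)
            for key, ts in groups.items()}

degree = aggregate(edge, group_by=(0,), aggs=[(len, 1), (sum, 2)])
print(degree)   # {('a', 2, 8), ('b', 1, 1)}
```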

Smart data exchange. Now we formally define the Smart Data Exchange Framework.
Definition 2. A Smart Data Exchange (SDE) is a tuple (S, M, T, ΣSM , ΣMT , ΣM , ΣT ), where:
 – S is the source schema containing predicates taken from RS ,
 – M is the midway schema containing predicates taken from RM ,
 – T is the target schema containing predicates taken from RT ,
 – ΣSM is a source-to-midway set of safe ETGDs,
 – ΣM is a midway-to-midway set of safe, stratified ETGDs,
 – ΣMT is a midway-to-target set of standard TGDs, and
 – ΣT is a target-to-target set of standard TGDs and standard EGDs.                      □

    As already pointed out, we assume that the source and target relations are graph-
based data and every class of objects is modeled as a named set of triples. RDF graph,
microdata and JSON representations are very close to this description [1]. The midway
database is a relational database containing relations of any arity including data in the
format of the source and target database.
    For any SDE E = (S, M, T, ΣSM , ΣM , ΣMT , ΣT ), we shall use the following notation:
Σ1 = ΣSM ∪ ΣM , Σ2 = ΣMT ∪ ΣT , and Σ = Σ1 ∪ Σ2 .
Example 1. Consider a graph relation in the source database containing facts of the
form graph(id1 , edge, id2 ) where id1 and id2 are node identifiers and edge is an at-
tribute value denoting that there is an edge from id1 to id2 . The data exchange problem
consists in extracting a spanning tree to be stored in the target database. From the source
database we can import edges and nodes in the midway database, as defined in the set
ΣSM consisting of the ETGD: graph(x, edge, y) → edge(x, y) ∧ node(x) ∧ node(y).
Assume now that the midway database also contains a fact root(0), for instance gen-
erated by the SAT module or imported from the source database using another source-
to-midway dependency. The next set of ETGDs ΣM shows how it is possible to generate
a spanning tree rooted in the node x denoted by fact root(x).

        root(x) → st(nil, x)
        st(z, x) ∧ edge(x, y) ∧ choice((y), (x)) → st(x, y) ∧ connected(y)
        node(x) ∧ ¬connected(x) ∧ choice((), (x)) → nextRoot(x)

Here the first ETGD is used to start the computation by deriving an edge ending in
the root node (the starting node is the dummy node nil), whereas the second ETGD,
imposing the functional dependency y → x, guarantees that the set of selected edges
forms a spanning tree. The last ETGD in ΣM yields a node not belonging to the spanning tree
if the graph is not connected; this node can be used in the future as a root to compute
another spanning tree.
     Spanning tree edges and nextRoot facts are then imported into the target database (as
triples) using the following set of TGDs ΣMT :

       st(x, y) → graph(x, st, y)
       nextRoot(x) → graph(nil, root, x)

Information stored in the target database is next analyzed to generate new information
which will be stored in the midway database. For instance, from a fact graph(nil, root, x)
the SAT module could generate a fact root(x) which, after being stored in the midway
database, could trigger the computation of another spanning tree rooted in x.        □
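     An operational reading of the spanning-tree ETGDs of Example 1 is sketched below in Python (ours); it grows the tree one edge at a time, mimicking the effect of choice((y), (x)), i.e. every connected node y gets exactly one parent x, and returns the leftover nodes as nextRoot candidates. The concrete input graph is an assumption of the sketch.

```python
import random

# Sketch (ours) of the spanning-tree computation of Example 1.
graph = {(0, "edge", 1), (0, "edge", 2), (1, "edge", 2), (2, "edge", 3)}
edges = {(x, y) for x, a, y in graph if a == "edge"}
nodes = {n for e in edges for n in e}

def spanning_tree(root):
    st, connected = {("nil", root)}, {root}          # root(x) -> st(nil, x)
    while True:
        # edges leaving the already connected part of the graph
        candidates = [(x, y) for x, y in edges
                      if x in connected and y not in connected]
        if not candidates:
            break
        x, y = random.choice(candidates)             # nondeterministic choice
        st.add((x, y))                               # derive st(x, y)
        connected.add(y)                             # FD y -> x: one parent per node
    return st, nodes - connected                     # st facts, nextRoot candidates

print(spanning_tree(0))
```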

     Note that the process described in the previous example and shown in Fig. 1 is
supervised, that is, the activation of the SAT module is performed by the user.
     The smart data exchange problem associated with this setting is the following:
given a finite source instance I over a schema S and a smart data exchange E =
(S, M, T, ΣSM , ΣM , ΣMT , ΣT ), find finite instances J and K over the schemas M and
T , respectively, such that hI, Ji satisfies ΣSM , J satisfies ΣM , hJ, Ki satisfies ΣMT and
K satisfies ΣT . The pair hJ, Ki is called a solution (or model) for hI, Ei, or equiva-
lently for hI, Σi, where, as above, Σ1 = ΣSM ∪ ΣM , Σ2 = ΣMT ∪ ΣT and Σ = Σ1 ∪ Σ2 .
The set of solutions for hI, Ei (or equivalently for hI, Σi) is denoted by Sol(I, E) (resp.
Sol(I, Σ)). Moreover, J is also called a solution (or model) for hI, Σ1 i and, in cascade,
K is a solution (or model) for hJ, Σ2 i. Thus, the problem of finding solutions can be
split into two problems: (i) finding solutions J for I, and (ii) finding solutions K for
J. The set of solutions for hI, Σ1 i is denoted by Sol(I, Σ1 ), whereas the set of solutions
for hJ, Σ2 i is Sol(J, Σ2 ). Therefore, given a smart data exchange framework, starting
from I with dependencies Σ1 we find solutions J1 , ..., Jm and then, starting from every
Ji (1 ≤ i ≤ m) with dependencies Σ2 , we find solutions Ki,1 , ..., Ki,ni . Observe that m is
always finite, whereas ni , in the general case, is not guaranteed to be finite [10].

Query answering. Since we have multiple models, we distinguish between the certain
answer, derived from all models, and the nondeterministic answer.
Definition 3. Given a query Q, a database I and an SDE E, the certain answer of Q
over hI, Ei is Certain(Q, I, E) = ∩_{hJ,Ki∈Sol(I,E)} Q(K).                            □

     Since the certain answer could be equivalently defined as Certain(Q, I, E) =
∩_{J∈Sol(I,Σ1 ) ∧ K∈Sol(J,Σ2 )} Q(K), its computation can be optimized by considering the
four sets of dependencies separately, that is, by first computing solutions J′ for hI, ΣSM i,
then solutions J for hJ′, ΣM i, next solutions K′ for hJ, ΣMT i, and, finally, solutions K for
hK′, ΣT i. Let us now introduce the concepts of homomorphism and universal model.
     A homomorphism from a set of atoms A1 to a set of atoms A2 is a mapping h
from the domain of A1 (set of terms occurring in A1 ) to the domain of A2 such
that: (i) h(c) = c, for every c ∈ Const(A1 ); and (ii) for every atom R(t1 , . . . , tn )
in A1 , we have that R(h(t1 ), . . . , h(tn )) is in A2 . With a slight abuse of notation,
we apply h also to sets of atoms and thus, for a given set of atoms A, we define
h(A) = {R(h(t1 ), . . . , h(tn )) | R(t1 , . . . , tn ) ∈ A}.
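     The sketch below (ours) checks conditions (i) and (ii) of this definition for a given candidate mapping h; atoms are encoded as tuples whose first component is the relation name, and the string 'N1' plays the role of a labeled null. Both encodings are assumptions of the sketch.

```python
# Sketch (ours) of the homomorphism check: constants must be mapped to
# themselves and every mapped atom of A1 must occur in A2.

def is_homomorphism(h, A1, A2, constants):
    """h: dict from terms of A1 to terms of A2;
    A1, A2: sets of atoms encoded as (relation, t1, ..., tn)."""
    if any(h.get(c, c) != c for c in constants):
        return False                                  # (i) h(c) = c
    mapped = {(atom[0],) + tuple(h.get(t, t) for t in atom[1:]) for atom in A1}
    return mapped <= A2                               # (ii) h(A1) is contained in A2

A1 = {("graph", "N1", "st", "b")}                     # 'N1' acts as a null
A2 = {("graph", "a", "st", "b")}
print(is_homomorphism({"N1": "a"}, A1, A2, constants={"st", "a", "b"}))  # True
```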
Definition 4. A universal model of (I, Σ) is a model hJ, Ki of (I, Σ) such that for
every model hJ′, K′i of (I, Σ) there exists a homomorphism from K to K′. The set of
all universal solutions of (I, Σ) will be denoted by USol(I, Σ).                              □
Proposition 1. For any positive query Q, database I and SDE E, the certain answer
of Q over hI, Ei is Certain(Q, I, E) = ∩_{J∈Sol(I,Σ1 )} Q(KJ )↓ , where KJ is any
universal solution of hJ, Σ2 i and Q(KJ )↓ is the result of computing naively (i.e. con-
sidering nulls as constants) Q(KJ ) and deleting tuples with nulls.                   □
    Universal solutions for hJ, Σ2 i can be easily computed by applying the classical
fixpoint algorithm called Chase which computes a subset of universal solutions called
canonical [4, 10].
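     The naive evaluation Q(KJ )↓ used in Proposition 1 can be sketched as follows in Python (ours): labeled nulls are represented by None, the query is evaluated as if nulls were ordinary constants, and answer tuples containing nulls are discarded afterwards. The example instance and query are invented.

```python
# Sketch (ours) of Q(K)↓: naive evaluation followed by removal of tuples
# that contain labeled nulls (represented here by None).

K = {"graph": {("a", "st", "b"), ("nil", "root", None)}}

def q(db):
    """Example positive query: pairs (i, v) connected by an 'st' edge."""
    return {(i, v) for i, a, v in db["graph"] if a == "st"}

def drop_nulls(answers):
    return {t for t in answers if None not in t}

print(drop_nulls(q(K)))   # {('a', 'b')}
```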
    As previously discussed, in several cases we are not interested in all models of
hI, Σ1 i, but only in one selected nondeterministically (e.g. the set of edges of any mini-
mum spanning tree). Thus, we now introduce the definition of nondeterministic answer.
Definition 5. The nondeterministic answer to a query Q over a database instance I and
SDE E = (S, M, T, ΣSM , ΣM , ΣMT , ΣT ) is NonDet(Q, I, E) = ∩_{K∈Sol(J,Σ2 )} Q(K),
where J is a model for hI, Σ1 i selected nondeterministically.                              □
     Observe that the nondeterministic choice of the model is applied only to dependen-
cies in Σ1 , where users express explicitly that they want to select nondeterministically a
subset of tuples, whereas for hJ, Σ2 i we consider all models. Note that two evaluations
of the nondeterministic answer could give different answers (as the choices made could
be different) and that the responsibility of computing nondeterministic answers, instead
of certain answers, is left to the user. Indeed, in several cases the user is not interested
in specific models, but only in one model satisfying some properties (e.g. any spanning
tree). Therefore, we introduce the concept of universal nondeterministic model.
Definition 6. A universal nondeterministic model for hI, Σi is any universal model in
USol(I, Σ) selected nondeterministically.                                                 □
Proposition 2. For any positive query Q, database I and SDE E = (S, M, T, ΣSM , ΣM ,
ΣMT , ΣT ), the nondeterministic answer of Q over hI, Ei is NonDet(Q, I, E) = Q(KJ )↓ ,
where, letting J be any model for hI, Σ1 i selected nondeterministically, KJ is a universal
solution of hJ, Σ2 i.                                                                 □
4    Conclusion
In this paper we have discussed a framework supporting both the analysis and the flexi-
ble representation of heterogeneous data. The use of a graph-based representation of
source and target data, together with the extraction of new knowledge performed by the
SAT tool, allows us to manage dynamic databases where even the features of the data
may change over time. The theoretical complexity of the approach is discussed in [13].
Future work includes the use of more powerful declarative languages for the midway
database, where a limited use of function symbols is allowed [5, 7, 14].

References
 1. R. Angles and C. Gutierrez. An introduction to graph data management. In Graph Data
    Management, Fundamental Issues and Recent Developments., pages 1–32. 2018.
 2. M. Arenas, P. Barceló, R. Fagin, and L. Libkin. Locally consistent transformations and query
    answering in data exchange. In PODS, 2004.
 3. M. Arenas, G. Gottlob, and A. Pieris. Expressive languages for querying the semantic web.
    ACM Trans. Database Syst., 43(3):13:1–13:45, 2018.
 4. C. Beeri and M. Y. Vardi. A proof procedure for data dependencies. J. ACM, 31(4):718–741,
    1984.
 5. M. Calautti, S. Greco, C. Molinaro, and I. Trubitsyna. Logic program termination analysis
    using atom sizes. In IJCAI, pages 2833–2839. AAAI Press, 2015.
 6. M. Calautti, S. Greco, C. Molinaro, and I. Trubitsyna. Exploiting equality generating depen-
    dencies in checking chase termination. PVLDB, 9(5):396–407, 2016.
 7. M. Calautti, S. Greco, F. Spezzano, and I. Trubitsyna. Checking termination of bottom-up
    evaluation of logic programs with function symbols. TPLP, 15(6):854–889, 2015.
 8. N. Cassavia, E. Masciari, C. Pulice, and D. Saccà. Discovering user behavioral features to
    enhance information search on big data. TiiS, 7(2):7:1–7:33, 2017.
 9. A. Deutsch, A. Nash, and J. B. Remmel. The chase revisited. In PODS, 2008.
10. R. Fagin, P. G. Kolaitis, R. J. Miller, and L. Popa. Data exchange: semantics and query
    answering. Theoretical Computer Science, 336(1):89–124, 2005.
11. R. Fagin, P. G. Kolaitis, and L. Popa. Data Exchange: getting to the core. ACM Trans.
    Database Syst., 30(1):174–210, 2005.
12. S. Greco. Dynamic programming in datalog with aggregates. IEEE Trans. Knowl. Data
    Eng., 11(2):265–283, 1999.
13. S. Greco, E. Masciari, D. Saccà, and I. Trubitsyna. HIKE: A step beyond data exchange. In
    ER Conference, 2019.
14. S. Greco, C. Molinaro, and I. Trubitsyna. Logic programming with function symbols: Check-
    ing termination of bottom-up evaluation through program adornments. Theory Pract. Log.
    Program., 13(4-5):737–752, 2013.
15. S. Greco, F. Spezzano, and I. Trubitsyna. Stratification criteria and rewriting techniques for
    checking chase termination. PVLDB, 4(11):1158–1168, 2011.
16. S. Greco, F. Spezzano, and I. Trubitsyna. Checking chase termination: Cyclicity analysis
    and rewriting techniques. IEEE Trans. Knowl. Data Eng., 27(3):621–635, 2015.
17. S. Greco and C. Zaniolo. Greedy algorithms in datalog. TPLP, 1(4):381–407, 2001.
18. L. Libkin, J. L. Reutter, A. Soto, and D. Vrgoc. Trial: A navigational algebra for RDF
    triplestores. ACM Trans. Database Syst., 43(1):5:1–5:46, 2018.
19. E. Masciari, I. Trubitsyna, and D. Saccà. Simplified data posting in practice. In IDEAS,
    pages 29:1–29:7, 2019.
20. D. Saccà and C. Zaniolo. Stable models and non-determinism in logic programs with nega-
    tion. In Proc. PODS Conf., pages 205–217, 1990.