Recent Progress on the Algebra of Modular Systems

                                   Eugenia Ternovska

                                  Simon Fraser University


       Abstract. In this short paper, we present some recent and ongoing work on an
       algebra of modular system aiming at combining large pieces of information.


1    Introduction
Reusing components and services and linking data is important in Computer Science.
The essence of our proposal is that we can use first-order logic as a versatile language
for applying and combining modules – which are classes of structures – web services,
declarative specifications with associated solvers, Integer Liner Programs, Constraint
Satisfaction Problems etc.1 While the syntax of our formalism is first-order, the seman-
tics is second-order because variables range over relations. We use a version of Codd
relational algebra instead of first-order logic, but the idea is the same. We also add least
fixed points. Since most practical systems use inputs and outputs, or directionality in
linking data, we introduce an algebra with information flow. Operations such as while
loops, if-then-else, reachability, regular expressions over data links, etc. are definable.2
    The goal of this short paper is to present the two algebras and some of the recent
results and ongoing work, namely (1) on the complexity of the evaluation task, (2)
on modular system equivalence and containment and (3) on solving modular systems
using an algebra of propagators. While item (3) summarizes [1], items (1) and (2) is
an ongoing and yet unpublished work. The presentation of the syntax and semantics is
also new and improves over previous versions, e.g. [3].

2    Algebras
For exactly the same syntax, we produce two algebras, static and dynamic, by giving
different interpretations to the algebraic operations and to the elements of the algebras.
In the second algebra, atomic modules have a direction of information propagation. The
algebras correspond to classical and modal logics, respectively.
Syntax3 Assume we have a countable sequence Vars “ pX1 , X2 , . . . q of relational
variables each with an associated finite arity. For convenience, we use X, Y , Z, etc.
Let ModAt “ tM1 , M2 , . . . u be a fixed vocabulary of atomic module symbols. Each
Mi P ModAt has an associated variable vocabulary vvocpMi q whose length can de-
pend on Mi . We may write Mi pXi1 , . . . , Xik q, (or Mi pX̄q), to indicate that vvocpM q “
 1
   In database terminology, a module is a boolean query.
 2
   We call the logic counterpart of the second algebra a Logic of Information Flow.
 3
   Thanks to Brett McLean, Bart Bogaerts and anonymous referees of the previous versions
   whose comments helped to improve the presentation.
pXi1 , . . . , Xik q. Similarly, ModVars “ tZ1 , Z2 , . . . u is a countable sequence of mod-
ule variables, where each Zj P ModVars has its own vvocpZj q. Algebraic expressions
are built by the grammar:
                  α ::“ id | Mi | Zj | α Y α | ´ α | πδ pαq | σΘ pαq | µZj .α.            (1)
Here, Mi is any symbol in ModAt, δ is any finite set of relational variables in Vars, Θ
is any expression of the form X ” Y , for relational variables of equal arity that occur in
α, Zj is a module variable in ModVars which must occur positively in the expression
α, i.e., under an even number of the complementation (´) operator.
Static (Unary) Semantics Fix a finite relational vocabulary τ . A variable assignment
s is a function that assigns, to each relational variable, a symbol in τ of the same arity.
Now fix a domain Dom.4 Let U be the set of all τ -structures over the domain Dom.
Given a sub-vocabulary γ of τ , a subset V Ď U is determined by γ if it satisfies:
 for all A, B P U such that A|γ “ B|γ we have A P V iff B P V.
     Given a well-formed algebraic expression α defined by (1), we say that structure A
satisfies α (or that is a model of α) under variable assignment s, notation A |ùs α,
if A P rαs, where unary interpretation r ¨ s is defined as follows. Given a variable as-
signment s, function r ¨ s assigns a subset rMi s Ď U and a subset rZj s Ď U to each
Mi P ModAt and each Zj P ModVars, with the property that rMi s is determined by
spvvocpMi qq (respectively, rZj s is determined by spvvocpZj qq). We extend the defini-
tion of r ¨ s to all algebraic expressions.

                rids :“ U,
                rα1 Y α2 s :“ rα1 s Y rα2 s,
                r´αs :“ Uzrαs,
                rπδ pαqs :“ tA P U | DB pB P rαs and A|τ zspδq “ B|τ zspδq qu,
                rσX”Y pαqs :“ tA | A P rαs and A|spXq ( “ B|spY q u,
                            Ş
                rµZj .αs :“     R Ď U | rαsrZ:“αs Ď R .
Here, rαsrZ:“αs means an interpretation that is exactly like r ¨ s, except Z is interpreted
as α. Intuitively, to check whether a τ -structure A satisfies module Mi , we need to first
select predicate symbols Si1 , . . . , Sik from τ , whose arities match those of X1 , . . . , Xk ,
which is done by function s, and then “apply” the module to SiA1 , . . . , SiAk , as we would
apply a decision procedure or an “oracle”.
Example 1. Let MHC pV, X, Y q and M2Col pV, X, Z, T q be atomic modules “comput-
ing” a Hamiltonian Circuit and a 2-Colouring. For example, MHC can be represented
as an Answer Set Programming program, and M2Col be an imperative program or a hu-
man child with two pencils. The first module decides if Y forms a Hamiltonian Circuit
(represented as a set of edges) in the graph given by vertex set V and edge set X. The
second module decides if unary relations Z, T specify a proper 2-colouring of the graph.
The following expression determines whether or not there is a 2-colourable Hamiltonian
Circuit. M2Col´HC pV, X, Z, T q :“ πV,X,Z,T ppMHC pV, X, Y q X M2Col pV, Y, Z, T qq.
The need for recursion is illustrated by a much longer specification of a dynamic pro-
gramming algorithm on tree decompositions [3].
 4
     Usually, in applications, domain Dom is the (active) domain of an input structure for a task of
     interest.
Dynamic (Binary) Semantics Let ModAtI{O denote the set of all atomic module sym-
bols M with all possible partitions of vvocpM q into inputs, IpM q, and outputs, OpM q.5
This set is larger than the set ModAt (unless both are empty) because the same M
can have several different input-output assignments. Similarly, we define ModVarsI{O .
Given a well-formed α, we say that pair of structures pA, Bq, satisfies α under variable
assignment s, notation pA, Bq |ùs α , if pA, Bq P JαK, where binary interpretation J ¨ K
is defined as follows. JidK :“ U ˆ U. For atomic modules in ModAtI{O , we have:
          JM K :“ tpA, Bq P U ˆ U | A|τ zspOpM qq “ B|τ zspOpM qq and B P rM su.           (2)
Similarly, for Z P ModVarsI{O . Intuitively, atomic modules produce a replica of the
current database except the interpretation of the output vocabulary changes as specified
by the action.6 An illustration of the binary semantics for atomic modules is given in an
example below. We extend the binary interpretation J¨K to all expressions α:
     Jα1 Y α2 K :“ Jα1 K Y Jα2 K,
     J´α2 K :“ U ˆ U z JαK,                          (
                     R Ď U ˆ U : JαKrZ:“Rs Ď R ,
                 Ş
     JµZj .αK :“
     Jπδ pαqK :“ tpA, Bq P U ˆ U | D pA1 , B1 q P JαK : A1 |τ zspδq “ A|τ zspδq
                                                    and B1 |τ zspδq “ B|τ zspδq u,
                                              A            A
                     $              ˇ                                                ,
                                    ˇ pspXqqB “ pspY qqB if tX, Y u Ď Ipαq,
                     &              ˇ                                                .
     JσX”Y pαqK :“ pA, Bq P JαK ˇˇ pspXqq “ pspY qq if tX, Y u Ď Opαq,                 .
                     %              ˇ pspXqqA “ pspY qqB q if X P Ipαq and Y P Opαq. -

Example 2. In HC-2Col, in each atomic module, we underline designated input sym-
bols: πV,X,Z,T pMHC pV , X, Y q X M2Col pV , Y , Z, T qq. First, MHC pV , X, Y q makes
a transition by producing possibly several Hamiltonian Circuits. The interpretation of
the output Y changes, everything else is transferred by inertia. Each resulting structure
is taken as an input to M2Col pV , Y , Z, T q, where edges in the cycle, Y , are “fed” to
M2Col , although this is hidden from the outside observer. The second module produces
non-deterministic transitions, one for each generated colouring, if they exist.
The algebra can be equivalently represented in a “two-sorted” syntax:
                 α ::“ id | Ma | Zj | α Y α | ´ α | πδ pαq | σΘ pαq | φ? | µZj .α
                                                                                            (3)
                      φ ::“ J | Mp | Xi | φ _ φ | φ | |αy φ| xα| φ | µXi .φ.

The modal logic, which we call Lµµ (since we have two fixed points), is interpreted
over a transition system where states are τ -structures, Ma represent modules-actions
that change states, Mp – modules-propositions that are true in states. Process formulae
in the first line are interpreted exactly like in the binary semantic, and tests (i.e., expres-
sions ending with symbol “?”) are interpreted as in Dynamic Logic. State formulae in
the second line are interpreted exactly like in the µ-calculus.
Example 3. Definable constructs that can appear inside the forward |αy and backwards
xα| possibility modalities, are, e.g., if φ then α else β :“ pφ?; αq Y p φ?; βq,
while φ do α :“ pφ?; αq˚ ; φ?, regular expressions. Here, ‘;’ is sequential compo-
sition (a particular case of X), and α˚ :“ µZ.pid Y Z; αq.
 5
     Either one of these sets, Ipαq, Opαq, can be empty.
 6
     This is similar to the inertia law for actions in the Situation Calculus.
3    Some Recent and Ongoing Work
(1) Complexity and Expressiveness We consider the task where we give an interpre-
tation to Ipαq, and ask whether there exists an interpretation Opαq, such that α is sat-
isfied. We identify ”safe” fragments in which the data complexity of this task is essen-
tially the same as the complexity of the modules we start from. Safe (power-preserving)
operations are, e.g. X, ;, σX”Y , some cases of πδ and Y (those that do not add non-
determinism), two definable unary negations, and a restricted version of ˚. If we start
with polynomial time atomic modules, some power-increasing operations allow us to
capture NP, and with the use of a unary negation, all levels of the Polynomial Time Hi-
erarchy. We also characterize other complexity classes. This is yet unpublished work. It
extends an earlier initial work on a positive recursion-free fragment of the algebra [2].
(2) Equivalence and Inclusion For compound modules α1 , α2 with atoms M1 , . . . , Mk ,
we say that α1 is contained in α2 , denoted α1 Ď α2 , if A |ùs α1 implies A |ùs α2 , for
any s and any choice of atoms M1 , . . . , Ms . This is essentially the problem of logical
implication. We have α1 “ α2 iff α1 Ď α2 and α2 Ď α1 . The task here is, given α1 and
α2 , decide if α1 “ α2 (respectively, α1 Ď α2 ). We show that equivalence (and thus,
also containment) is undecidable in general. We prove that, for a large class of modular
systems, namely those specified by conjunctive expressions, containment is in NP. We
discuss the conditions under which the containment problem becomes polynomial time
solvable. Joint work with Andrei Bulatov. As of now, this work is unpublished.
(3) Solving using an algebra of propagators, CDCL-like algorithm We consider the
task of, given an interpretation of Ipαq, construct all interpretations of Opαq. We use
precision order u ăp t ăp i, u ăp f ăp i, which is pointwise extended to four-valued
(4V) structures: A ăp A1 . The set of all 4V structures forms a complete lattice. A prop-
agator is a mapping of 4V structures to 4V structures such that it is ěp -monotone, i.e.,
A ěp A1 implies P pAq ěp P pA1 q, and is information-preserving, i.e., P pAq ěp A. A
propagator refines a partial (four-valued) structure by deriving consequences of a given
module. An α-solver is a procedure that takes as input a 4V structure A, and whose
output is the set of all 2V structures A with A |ù α and A ěp A. We give examples
of how our notion of a propagator generalizes notions from various domains. We define
an algebra with the same operators as above, but on propagators, and then on explain-
ing propagators. An explaining propagator not only returns the partial structure that is
the result of its propagations (P pAq), but also an explanation (CpAq). This explanation
takes the form of a propagator itself. Our main algorithm turns such a propagator into a
solver that learns from these explanations, similar to the Conflict Driven Clause Learn-
ing (CDCL) algorithm, the main algorithm in SAT solving. Our work extends SMT
technology to arbitrary modules and to module combinations beyond conjunctive.

References
1. B. Bogaerts, E. Ternovska, and D. Mitchell. Propagators and solvers for the algebra of modular
   systems. In Logic for Programming, Artificial Intelligence and Reasoning (LPAR), 2017.
2. S. Tasharrofi and E. Ternovska. Modular systems. In Proceedings of Workshop on Hybrid
   Reasoning at IJCAI’15, 2015.
3. E. Ternovska. Static and dynamic views on the algebra of modular systems. In Proc. of the
   16th Int. Workshop on Non-Monotonic Reasoning (NMR 2016), pages 153–162, 2016.