-

Recent Progress on the Algebra of Modular Systems

0 Simon Fraser University

In this short paper, we present some recent and ongoing work on an algebra of modular system aiming at combining large pieces of information. Reusing components and services and linking data is important in Computer Science. The essence of our proposal is that we can use first-order logic as a versatile language for applying and combining modules - which are classes of structures - web services, declarative specifications with associated solvers, Integer Liner Programs, Constraint Satisfaction Problems etc.1 While the syntax of our formalism is first-order, the semantics is second-order because variables range over relations. We use a version of Codd relational algebra instead of first-order logic, but the idea is the same. We also add least fixed points. Since most practical systems use inputs and outputs, or directionality in linking data, we introduce an algebra with information flow. Operations such as while loops, if-then-else, reachability, regular expressions over data links, etc. are definable.2 The goal of this short paper is to present the two algebras and some of the recent results and ongoing work, namely (1) on the complexity of the evaluation task, (2) on modular system equivalence and containment and (3) on solving modular systems using an algebra of propagators. While item (3) summarizes [1], items (1) and (2) is an ongoing and yet unpublished work. The presentation of the syntax and semantics is also new and improves over previous versions, e.g. [3].

1 Introduction For exactly the same syntax, we produce two algebras, static and dynamic, by giving different interpretations to the algebraic operations and to the elements of the algebras. In the second algebra, atomic modules have a direction of information propagation. The algebras correspond to classical and modal logics, respectively.

Syntax3 Assume we have a countable sequence Vars pX1; X2; : : : q of relational variables each with an associated finite arity. For convenience, we use X, Y , Z, etc. Let ModAt tM1; M2; : : : u be a fixed vocabulary of atomic module symbols. Each Mi P ModAt has an associated variable vocabulary vvocpMiq whose length can depend on Mi. We may write MipXi1 ; : : : ; Xik q, (or MipXq), to indicate that vvocpM q 1 In database terminology, a module is a boolean query. 2 We call the logic counterpart of the second algebra a Logic of Information Flow. 3 Thanks to Brett McLean, Bart Bogaerts and anonymous referees of the previous versions whose comments helped to improve the presentation. pXi1 ; : : : ; Xik q. Similarly, ModVars tZ1; Z2; : : : u is a countable sequence of module variables, where each Zj P ModVars has its own vvocpZj q. Algebraic expressions are built by the grammar: :: id | Mi | Zj |

Y | | p q | p q | Zj : : ( 1 ) Here, Mi is any symbol in ModAt, is any finite set of relational variables in Vars, is any expression of the form X Y , for relational variables of equal arity that occur in , Zj is a module variable in ModVars which must occur positively in the expression , i.e., under an even number of the complementation ( ) operator.

Static (Unary) Semantics Fix a finite relational vocabulary . A variable assignment s is a function that assigns, to each relational variable, a symbol in of the same arity. Now fix a domain Dom.4 Let U be the set of all -structures over the domain Dom. Given a sub-vocabulary of , a subset V U is determined by if it satisfies: for all A; B P U such that A| B| we have A P V iff B P V:

Given a well-formed algebraic expression defined by ( 1 ), we say that structure A satisfies (or that is a model of ) under variable assignment s, notation A |ùs ; if A P r s, where unary interpretation r s is defined as follows. Given a variable assignment s, function r s assigns a subset rMis U and a subset rZj s U to each Mi P ModAt and each Zj P ModVars, with the property that rMis is determined by spvvocpMiqq (respectively, rZj s is determined by spvvocpZj qq). We extend the definition of r s to all algebraic expressions.

rids :

U; r 1 Y 2s : r 1s Y r 2s; r s : Uzr s; r p qs : tA P U | DB pB P r s and A| zsp q B| zsp qqu; r X Y p qs : tA | A P r s and A|spXq B|spY qu; r Zj : s : R U | r srZ: s R(: Here, r srZ: s means an interpretation that is exactly like r s, except Z is interpreted as . Intuitively, to check whether a -structure A satisfies module Mi, we need to first select predicate symbols Si1 ; : : : ; Sik from , whose arities match those of X1; : : : ; Xk, which is done by function s, and then “apply” the module to SiA1 ; : : : ; SiAk , as we would apply a decision procedure or an “oracle”.

Example 1. Let MHCpV; X; Y q and M2ColpV; X; Z; T q be atomic modules “computing” a Hamiltonian Circuit and a 2-Colouring. For example, MHC can be represented as an Answer Set Programming program, and M2Col be an imperative program or a human child with two pencils. The first module decides if Y forms a Hamiltonian Circuit (represented as a set of edges) in the graph given by vertex set V and edge set X. The second module decides if unary relations Z; T specify a proper 2-colouring of the graph. The following expression determines whether or not there is a 2-colourable Hamiltonian Circuit. M2Col HCpV; X; Z; T q : V;X;Z;T ppMHCpV; X; Y q X M2ColpV; Y; Z; T qq: The need for recursion is illustrated by a much longer specification of a dynamic programming algorithm on tree decompositions [3]. 4 Usually, in applications, domain Dom is the (active) domain of an input structure for a task of interest.

Dynamic (Binary) Semantics Let ModAtI{O denote the set of all atomic module symbols M with all possible partitions of vvocpM q into inputs, I pM q, and outputs, OpM q.5 This set is larger than the set ModAt (unless both are empty) because the same M can have several different input-output assignments. Similarly, we define ModVarsI{O. Given a well-formed , we say that pair of structures pA; Bq, satisfies under variable assignment s, notation pA; Bq |ùs , if pA; Bq P J K, where binary interpretation J K is defined as follows. JidK : U U. For atomic modules in ModAtI{O, we have:

JM K : tpA; Bq P U U | A| zspOpM qq B| zspOpM qq and B P rM su: ( 2 ) Similarly, for Z P ModVarsI{O. Intuitively, atomic modules produce a replica of the current database except the interpretation of the output vocabulary changes as specified by the action.6 An illustration of the binary semantics for atomic modules is given in an example below. We extend the binary interpretation J K to all expressions : J 1 Y J 2K : J Zj : K : J p qK :

J 1K Y J 2K;

U z J K;

R U tpA; Bq P U J X Y p qK : $ & %

rZ: Rs R(; pspXqqA pA; Bq P J K pspXqqB

pspXqqA U : J K U | D pA1; B1q P J K : A1| zsp q A| zsp q and B1| zsp q B| zsp qu; pspY qqA if tX; Y u Ip q; pspY qqB if tX; Y u Op q; pspY qqBq if X P Ip q and Y P Op q: Example 2. In HC-2Col, in each atomic module, we underline designated input symbols: V;X;Z;T pMHCpV ; X ; Y q X M2ColpV ; Y ; Z; T qq: First, MHCpV ; X ; Y q makes a transition by producing possibly several Hamiltonian Circuits. The interpretation of the output Y changes, everything else is transferred by inertia. Each resulting structure is taken as an input to M2ColpV ; Y ; Z; T q, where edges in the cycle, Y , are “fed” to M2Col, although this is hidden from the outside observer. The second module produces non-deterministic transitions, one for each generated colouring, if they exist. The algebra can be equivalently represented in a “two-sorted” syntax: :: id | Ma | Zj | Y :: J | Mp | Xi | | _ | | p q | p q | ? | Zj : | | y | x | | Xi: : ( 3 ) The modal logic, which we call L (since we have two fixed points), is interpreted over a transition system where states are -structures, Ma represent modules-actions that change states, Mp – modules-propositions that are true in states. Process formulae in the first line are interpreted exactly like in the binary semantic, and tests (i.e., expressions ending with symbol “?”) are interpreted as in Dynamic Logic. State formulae in the second line are interpreted exactly like in the -calculus.

Example 3. Definable constructs that can appear inside the forward | y and backwards x | possibility modalities, are, e.g., if then else : p ?; q Y p ?; q, while do : p ?; q ; ?, regular expressions. Here, ‘;’ is sequential composition (a particular case of X), and : Z:pid Y Z; q. 5 Either one of these sets, Ip q, Op q, can be empty. 6 This is similar to the inertia law for actions in the Situation Calculus. ( 1 ) Complexity and Expressiveness We consider the task where we give an interpretation to Ip q, and ask whether there exists an interpretation Op q, such that is satisfied. We identify ”safe” fragments in which the data complexity of this task is essentially the same as the complexity of the modules we start from. Safe (power-preserving) operations are, e.g. X, ;, X Y , some cases of and Y (those that do not add nondeterminism), two definable unary negations, and a restricted version of . If we start with polynomial time atomic modules, some power-increasing operations allow us to capture NP, and with the use of a unary negation, all levels of the Polynomial Time Hierarchy. We also characterize other complexity classes. This is yet unpublished work. It extends an earlier initial work on a positive recursion-free fragment of the algebra [2]. ( 2 ) Equivalence and Inclusion For compound modules 1; 2 with atoms M1; : : : ; Mk, we say that 1 is contained in 2, denoted 1 2, if A |ùs 1 implies A |ùs 2, for any s and any choice of atoms M1; : : : ; Ms. This is essentially the problem of logical implication. We have 1 2 iff 1 2 and 2 1. The task here is, given 1 and 2, decide if 1 2 (respectively, 1 2). We show that equivalence (and thus, also containment) is undecidable in general. We prove that, for a large class of modular systems, namely those specified by conjunctive expressions, containment is in NP. We discuss the conditions under which the containment problem becomes polynomial time solvable. Joint work with Andrei Bulatov. As of now, this work is unpublished. ( 3 ) Solving using an algebra of propagators, CDCL-like algorithm We consider the task of, given an interpretation of Ip q, construct all interpretations of Op q. We use precision order u p t p i, u p f p i, which is pointwise extended to four-valued (4V) structures: A p A1. The set of all 4V structures forms a complete lattice. A propagator is a mapping of 4V structures to 4V structures such that it is ¥p-monotone, i.e., A ¥p A1 implies P pAq ¥p P pA1q, and is information-preserving, i.e., P pAq ¥p A. A propagator refines a partial (four-valued) structure by deriving consequences of a given module. An -solver is a procedure that takes as input a 4V structure A, and whose output is the set of all 2V structures A with A |ù and A ¥p A. We give examples of how our notion of a propagator generalizes notions from various domains. We define an algebra with the same operators as above, but on propagators, and then on explaining propagators. An explaining propagator not only returns the partial structure that is the result of its propagations (P pAq), but also an explanation (CpAq). This explanation takes the form of a propagator itself. Our main algorithm turns such a propagator into a solver that learns from these explanations, similar to the Conflict Driven Clause Learning (CDCL) algorithm, the main algorithm in SAT solving. Our work extends SMT technology to arbitrary modules and to module combinations beyond conjunctive.

Bogaerts , E. Ternovska, and

Mitchell . Propagators and solvers for the algebra of modular systems . In Logic for Programming , Artificial Intelligence and Reasoning (LPAR) , 2017 .

Tasharrofi and

Ternovska . Modular systems . In Proceedings of Workshop on Hybrid Reasoning at IJCAI'15 , 2015 .

Ternovska . Static and dynamic views on the algebra of modular systems . In Proc. of the 16th Int. Workshop on Non-Monotonic Reasoning (NMR 2016 ), pages 153 - 162 , 2016 .