Recent Progress on the Algebra of Modular Systems Eugenia Ternovska Simon Fraser University Abstract. In this short paper, we present some recent and ongoing work on an algebra of modular system aiming at combining large pieces of information. 1 Introduction Reusing components and services and linking data is important in Computer Science. The essence of our proposal is that we can use first-order logic as a versatile language for applying and combining modules – which are classes of structures – web services, declarative specifications with associated solvers, Integer Liner Programs, Constraint Satisfaction Problems etc.1 While the syntax of our formalism is first-order, the seman- tics is second-order because variables range over relations. We use a version of Codd relational algebra instead of first-order logic, but the idea is the same. We also add least fixed points. Since most practical systems use inputs and outputs, or directionality in linking data, we introduce an algebra with information flow. Operations such as while loops, if-then-else, reachability, regular expressions over data links, etc. are definable.2 The goal of this short paper is to present the two algebras and some of the recent results and ongoing work, namely (1) on the complexity of the evaluation task, (2) on modular system equivalence and containment and (3) on solving modular systems using an algebra of propagators. While item (3) summarizes [1], items (1) and (2) is an ongoing and yet unpublished work. The presentation of the syntax and semantics is also new and improves over previous versions, e.g. [3]. 2 Algebras For exactly the same syntax, we produce two algebras, static and dynamic, by giving different interpretations to the algebraic operations and to the elements of the algebras. In the second algebra, atomic modules have a direction of information propagation. The algebras correspond to classical and modal logics, respectively. Syntax3 Assume we have a countable sequence Vars “ pX1 , X2 , . . . q of relational variables each with an associated finite arity. For convenience, we use X, Y , Z, etc. Let ModAt “ tM1 , M2 , . . . u be a fixed vocabulary of atomic module symbols. Each Mi P ModAt has an associated variable vocabulary vvocpMi q whose length can de- pend on Mi . We may write Mi pXi1 , . . . , Xik q, (or Mi pX̄q), to indicate that vvocpM q “ 1 In database terminology, a module is a boolean query. 2 We call the logic counterpart of the second algebra a Logic of Information Flow. 3 Thanks to Brett McLean, Bart Bogaerts and anonymous referees of the previous versions whose comments helped to improve the presentation. pXi1 , . . . , Xik q. Similarly, ModVars “ tZ1 , Z2 , . . . u is a countable sequence of mod- ule variables, where each Zj P ModVars has its own vvocpZj q. Algebraic expressions are built by the grammar: α ::“ id | Mi | Zj | α Y α | ´ α | πδ pαq | σΘ pαq | µZj .α. (1) Here, Mi is any symbol in ModAt, δ is any finite set of relational variables in Vars, Θ is any expression of the form X ” Y , for relational variables of equal arity that occur in α, Zj is a module variable in ModVars which must occur positively in the expression α, i.e., under an even number of the complementation (´) operator. Static (Unary) Semantics Fix a finite relational vocabulary τ . A variable assignment s is a function that assigns, to each relational variable, a symbol in τ of the same arity. Now fix a domain Dom.4 Let U be the set of all τ -structures over the domain Dom. Given a sub-vocabulary γ of τ , a subset V Ď U is determined by γ if it satisfies: for all A, B P U such that A|γ “ B|γ we have A P V iff B P V. Given a well-formed algebraic expression α defined by (1), we say that structure A satisfies α (or that is a model of α) under variable assignment s, notation A |ùs α, if A P rαs, where unary interpretation r ¨ s is defined as follows. Given a variable as- signment s, function r ¨ s assigns a subset rMi s Ď U and a subset rZj s Ď U to each Mi P ModAt and each Zj P ModVars, with the property that rMi s is determined by spvvocpMi qq (respectively, rZj s is determined by spvvocpZj qq). We extend the defini- tion of r ¨ s to all algebraic expressions. rids :“ U, rα1 Y α2 s :“ rα1 s Y rα2 s, r´αs :“ Uzrαs, rπδ pαqs :“ tA P U | DB pB P rαs and A|τ zspδq “ B|τ zspδq qu, rσX”Y pαqs :“ tA | A P rαs and A|spXq ( “ B|spY q u, Ş rµZj .αs :“ R Ď U | rαsrZ:“αs Ď R . Here, rαsrZ:“αs means an interpretation that is exactly like r ¨ s, except Z is interpreted as α. Intuitively, to check whether a τ -structure A satisfies module Mi , we need to first select predicate symbols Si1 , . . . , Sik from τ , whose arities match those of X1 , . . . , Xk , which is done by function s, and then “apply” the module to SiA1 , . . . , SiAk , as we would apply a decision procedure or an “oracle”. Example 1. Let MHC pV, X, Y q and M2Col pV, X, Z, T q be atomic modules “comput- ing” a Hamiltonian Circuit and a 2-Colouring. For example, MHC can be represented as an Answer Set Programming program, and M2Col be an imperative program or a hu- man child with two pencils. The first module decides if Y forms a Hamiltonian Circuit (represented as a set of edges) in the graph given by vertex set V and edge set X. The second module decides if unary relations Z, T specify a proper 2-colouring of the graph. The following expression determines whether or not there is a 2-colourable Hamiltonian Circuit. M2Col´HC pV, X, Z, T q :“ πV,X,Z,T ppMHC pV, X, Y q X M2Col pV, Y, Z, T qq. The need for recursion is illustrated by a much longer specification of a dynamic pro- gramming algorithm on tree decompositions [3]. 4 Usually, in applications, domain Dom is the (active) domain of an input structure for a task of interest. Dynamic (Binary) Semantics Let ModAtI{O denote the set of all atomic module sym- bols M with all possible partitions of vvocpM q into inputs, IpM q, and outputs, OpM q.5 This set is larger than the set ModAt (unless both are empty) because the same M can have several different input-output assignments. Similarly, we define ModVarsI{O . Given a well-formed α, we say that pair of structures pA, Bq, satisfies α under variable assignment s, notation pA, Bq |ùs α , if pA, Bq P JαK, where binary interpretation J ¨ K is defined as follows. JidK :“ U ˆ U. For atomic modules in ModAtI{O , we have: JM K :“ tpA, Bq P U ˆ U | A|τ zspOpM qq “ B|τ zspOpM qq and B P rM su. (2) Similarly, for Z P ModVarsI{O . Intuitively, atomic modules produce a replica of the current database except the interpretation of the output vocabulary changes as specified by the action.6 An illustration of the binary semantics for atomic modules is given in an example below. We extend the binary interpretation J¨K to all expressions α: Jα1 Y α2 K :“ Jα1 K Y Jα2 K, J´α2 K :“ U ˆ U z JαK, ( R Ď U ˆ U : JαKrZ:“Rs Ď R , Ş JµZj .αK :“ Jπδ pαqK :“ tpA, Bq P U ˆ U | D pA1 , B1 q P JαK : A1 |τ zspδq “ A|τ zspδq and B1 |τ zspδq “ B|τ zspδq u, A A $ ˇ , ˇ pspXqqB “ pspY qqB if tX, Y u Ď Ipαq, & ˇ . JσX”Y pαqK :“ pA, Bq P JαK ˇˇ pspXqq “ pspY qq if tX, Y u Ď Opαq, . % ˇ pspXqqA “ pspY qqB q if X P Ipαq and Y P Opαq. - Example 2. In HC-2Col, in each atomic module, we underline designated input sym- bols: πV,X,Z,T pMHC pV , X, Y q X M2Col pV , Y , Z, T qq. First, MHC pV , X, Y q makes a transition by producing possibly several Hamiltonian Circuits. The interpretation of the output Y changes, everything else is transferred by inertia. Each resulting structure is taken as an input to M2Col pV , Y , Z, T q, where edges in the cycle, Y , are “fed” to M2Col , although this is hidden from the outside observer. The second module produces non-deterministic transitions, one for each generated colouring, if they exist. The algebra can be equivalently represented in a “two-sorted” syntax: α ::“ id | Ma | Zj | α Y α | ´ α | πδ pαq | σΘ pαq | φ? | µZj .α (3) φ ::“ J | Mp | Xi | φ _ φ | φ | |αy φ| xα| φ | µXi .φ. The modal logic, which we call Lµµ (since we have two fixed points), is interpreted over a transition system where states are τ -structures, Ma represent modules-actions that change states, Mp – modules-propositions that are true in states. Process formulae in the first line are interpreted exactly like in the binary semantic, and tests (i.e., expres- sions ending with symbol “?”) are interpreted as in Dynamic Logic. State formulae in the second line are interpreted exactly like in the µ-calculus. Example 3. Definable constructs that can appear inside the forward |αy and backwards xα| possibility modalities, are, e.g., if φ then α else β :“ pφ?; αq Y p φ?; βq, while φ do α :“ pφ?; αq˚ ; φ?, regular expressions. Here, ‘;’ is sequential compo- sition (a particular case of X), and α˚ :“ µZ.pid Y Z; αq. 5 Either one of these sets, Ipαq, Opαq, can be empty. 6 This is similar to the inertia law for actions in the Situation Calculus. 3 Some Recent and Ongoing Work (1) Complexity and Expressiveness We consider the task where we give an interpre- tation to Ipαq, and ask whether there exists an interpretation Opαq, such that α is sat- isfied. We identify ”safe” fragments in which the data complexity of this task is essen- tially the same as the complexity of the modules we start from. Safe (power-preserving) operations are, e.g. X, ;, σX”Y , some cases of πδ and Y (those that do not add non- determinism), two definable unary negations, and a restricted version of ˚. If we start with polynomial time atomic modules, some power-increasing operations allow us to capture NP, and with the use of a unary negation, all levels of the Polynomial Time Hi- erarchy. We also characterize other complexity classes. This is yet unpublished work. It extends an earlier initial work on a positive recursion-free fragment of the algebra [2]. (2) Equivalence and Inclusion For compound modules α1 , α2 with atoms M1 , . . . , Mk , we say that α1 is contained in α2 , denoted α1 Ď α2 , if A |ùs α1 implies A |ùs α2 , for any s and any choice of atoms M1 , . . . , Ms . This is essentially the problem of logical implication. We have α1 “ α2 iff α1 Ď α2 and α2 Ď α1 . The task here is, given α1 and α2 , decide if α1 “ α2 (respectively, α1 Ď α2 ). We show that equivalence (and thus, also containment) is undecidable in general. We prove that, for a large class of modular systems, namely those specified by conjunctive expressions, containment is in NP. We discuss the conditions under which the containment problem becomes polynomial time solvable. Joint work with Andrei Bulatov. As of now, this work is unpublished. (3) Solving using an algebra of propagators, CDCL-like algorithm We consider the task of, given an interpretation of Ipαq, construct all interpretations of Opαq. We use precision order u ăp t ăp i, u ăp f ăp i, which is pointwise extended to four-valued (4V) structures: A ăp A1 . The set of all 4V structures forms a complete lattice. A prop- agator is a mapping of 4V structures to 4V structures such that it is ěp -monotone, i.e., A ěp A1 implies P pAq ěp P pA1 q, and is information-preserving, i.e., P pAq ěp A. A propagator refines a partial (four-valued) structure by deriving consequences of a given module. An α-solver is a procedure that takes as input a 4V structure A, and whose output is the set of all 2V structures A with A |ù α and A ěp A. We give examples of how our notion of a propagator generalizes notions from various domains. We define an algebra with the same operators as above, but on propagators, and then on explain- ing propagators. An explaining propagator not only returns the partial structure that is the result of its propagations (P pAq), but also an explanation (CpAq). This explanation takes the form of a propagator itself. Our main algorithm turns such a propagator into a solver that learns from these explanations, similar to the Conflict Driven Clause Learn- ing (CDCL) algorithm, the main algorithm in SAT solving. Our work extends SMT technology to arbitrary modules and to module combinations beyond conjunctive. References 1. B. Bogaerts, E. Ternovska, and D. Mitchell. Propagators and solvers for the algebra of modular systems. In Logic for Programming, Artificial Intelligence and Reasoning (LPAR), 2017. 2. S. Tasharrofi and E. Ternovska. Modular systems. In Proceedings of Workshop on Hybrid Reasoning at IJCAI’15, 2015. 3. E. Ternovska. Static and dynamic views on the algebra of modular systems. In Proc. of the 16th Int. Workshop on Non-Monotonic Reasoning (NMR 2016), pages 153–162, 2016.