Technical Communications of ICLP 2015. Copyright with the Authors.

Debugging ASP using ILP

TINGTING LI
Institute for Security Science and Technology, Imperial College London, UK
E-mail: tingting.li@imperial.ac.uk

MARINA DE VOS, JULIAN PADGET
Department of Computer Science, University of Bath, Bath, UK
E-mail: {mdv,jap}@cs.bath.ac.uk

KEN SATOH
National Institute of Informatics and Sokendai, Japan
E-mail: ksatoh@nii.ac.jp

TINA BALKE
Centre for Research in Social Simulation, University of Surrey, UK
E-mail: t.balke@surrey.ac.uk

submitted 29 April 2015; accepted 5 June 2015

Abstract

Declarative programming allows the expression of properties of the desired solution(s), while the computational task is delegated to a general-purpose algorithm. The freedom from explicit control is counter-balanced by the difficulty of working out which properties are missing or incorrectly expressed when the solutions do not meet expectations. This can be particularly problematic under answer set semantics, because the absence of a key constraint or rule can make the difference between none or thousands of answer sets, rather than the intended one (or handful). The debugging task then comprises adding or deleting conditions in the bodies of existing rules or, more far-reaching, adding or deleting whole rules. The contribution of this paper is to show how inductive logic programming (ILP), along with examples of (un)desirable properties of answer sets, can be used to revise the original program semi-automatically so that it satisfies the stated properties, in effect providing debugging-by-example for programs under answer set semantics.

KEYWORDS: answer set programming, debugging, inductive logic programming

1 Introduction

Answer Set Programming (ASP) is a declarative programming paradigm for logic programs under answer set semantics. Like all declarative paradigms, it has the advantage of describing the constraints and the solutions rather than prescribing an algorithm to find solutions to a problem. In recent years we have seen the rise of ASP applications (in 2009, LPNMR (Erdem et al. 2009) had a special track on successful applications of ASP). As with all programming languages, initial programs contain bugs that need to be eradicated. While syntactic errors are easily picked up by the compiler, interpreter or solver, logical errors are harder to spot and to debug. For imperative languages, debugging tools assist the programmer in finding the source of a bug. In declarative programming, the control flow of the program is not necessarily known by the programmer (unless he or she is intimately familiar with the implementation of the solver and its many flags), and arguably should not be, in keeping with the declarative philosophy.

With the development of applications in ASP, the need for support tools arises. In imperative languages, the developer is often supported by debugging tools and integrated development environments. Following the same approach, the first ASP debugging tool was introduced by Brain et al. (2005). Since then, new debugging techniques have been put forward, ranging from visualisation tools (Calimeri et al. 2009; Cliffe et al. 2008; Kloimüllner et al. 2013), through meta-language extensions (Gebser et al. 2009), to idealised theoretical models (Denecker and Ternovska 2008). In our opinion, Pührer's stepping technique (Oetsch et al. 2011) is the most practical approach.
It allows the programmer to select, at each choice point, which atom(s) should be added to the current candidate answer set. The tool then updates the candidate with the atoms that become true/false as a consequence of this choice. It offers the closest resemblance to the debugging techniques used in imperative languages, so programmers may find the approach familiar. Nevertheless, it requires the user to step through the computation of the answer set, which incurs a cognitive overhead that ideally should be avoided. It also exposes the procedural aspect of the answer set computation. Another promising approach is that of Mikitiuk et al. (2007), which uses a translation from logic-program rules to natural language in order to detect program errors more easily.

In this paper we propose a debugging technique for normal logic programs under answer set semantics, based on inductive logic programming. By allowing the programmer to specify those features that should appear in an answer set (positive examples) and those that should not appear in any answer set (negative examples), the inductive logic program can suggest revisions to the existing answer set program such that the positive examples feature in at least one answer set of the revised program, while no answer set exhibits the negative examples. The implementation of the theory revision is done in ASP using abductive logic programming techniques.

The rest of the paper is organised as follows: in Section 2 we provide a brief introduction to answer set programming before defining the debugging task for answer set programs. In Section 3 we start with an intuitive description of our debugging approach (Section 3.1). The most important component of the debugging process, revision tuples, is elaborated in Section 3.2. After describing the process formally (Section 3.3), we present its implementation based on Inductive Logic Programming (Section 3.4). We conclude the paper with a discussion of related and future work.

2 Bugs in Answer Set Programming

2.1 Answer Set Programming

The basic component of a normal ASP program is an atom or predicate, e.g. r(X, Y) denotes XrY, where X and Y are variables that can be grounded with constants. For this paper we assume there are no function symbols with positive arity. Each ground atom can be assigned the truth value true or false. ASP adopts negation as failure to compute the negation of an atom, i.e. not a is true if there is no evidence to prove a in the current program. Literals are atoms a or negated atoms not a. A normal logic program P is a conjunction of rules r of the general form:

  a :- b1, ..., bm, not c1, ..., not cn.

where r ∈ P, and a, the bi and the cj are atoms from a set of atoms A. Intuitively, this means that if all atoms bi are known/true and no atom cj is known/true, then a must be known/true. a is the head of the rule, head(r) = a, while the bi and not cj form the body of the rule, body(r) = {b1, ..., bm, not c1, ..., not cn}. A rule is a fact if body(r) = ∅, and a constraint if head(r) = ∅, indicating that no solution may satisfy the body.

The finite set of all constants that appear in the program P is referred to as the Herbrand universe, denoted U_P. Using the Herbrand universe, we can ground the entire program: each rule is replaced by its ground instances, which are obtained by replacing each variable symbol by an element of U_P.
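For instance (a toy program of our own, added for illustration), grounding instantiates each rule with every applicable constant:

  % toy program P with Herbrand universe U_P = {1, 2}
  p(1). p(2).
  q(X) :- p(X), not r(X).

  % ground(P) replaces the rule by its two ground instances:
  % q(1) :- p(1), not r(1).
  % q(2) :- p(2), not r(2).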
The ground program, denoted ground(P), is the union of all ground instances of the rules in P. The set of all atoms grounded over the Herbrand universe of a program is called the Herbrand base, denoted B_P; these are exactly the atoms that can appear in the ground program. An interpretation is a subset of the Herbrand base: the atoms in the interpretation are considered true, while all others are false through negation as failure. An interpretation is a model iff, for each rule, the head is true whenever the body is true. A model M is minimal if no other model N exists with N ⊂ M.

The semantics of an ASP program P is defined in terms of answer sets, i.e. assignments of true and false to all atoms in the program that satisfy the rules in a minimal and consistent fashion. Let P be a program. The Gelfond-Lifschitz transformation of ground(P) w.r.t. a set of ground atoms S is the program ground(P)^S containing a rule a :- B for each rule a :- B, not C ∈ ground(P) with C ∩ S = ∅, where B and C are sets of atoms. A set of ground atoms S ⊆ B_P is an answer set of P iff S is the minimal model of ground(P)^S. A program P may have zero or more answer sets, and Π_P denotes the set of all answer sets of P.
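As a small worked example (our addition), consider the program below and the candidate set S = {p, r}:

  p :- not q.
  q :- not p.
  r :- p.

The reduct ground(P)^S deletes the second rule (since p ∈ S) and strips not q from the first rule (since q ∉ S), leaving the rules p. and r :- p., whose minimal model is exactly {p, r}; hence S is an answer set. By the symmetric argument, {q} is the only other answer set of this program.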
2.2 Debugging ASP programs

In some cases the answer sets in Π_P are not consistent with our expectations, even though they satisfy all the current rules of the program; in other words, there is a bug. We then need to revise the program so that it produces only the expected results. First of all, we need a way to represent the goal of debugging, i.e. what we expect the program to be able to produce, or not. The representation needs to be accessible for human inspection, but also compatible with the semantics of ASP. We therefore use sets of ground literals to indicate desirable and undesirable properties of a program. For instance, we use the atom state(X, Y, N) to denote that the square in row X and column Y of a Sudoku puzzle has value N. If we want to express that it is desirable for a particular square to have a certain value, or to avoid certain values, such expectations can be captured directly by ground literals of state(X, Y, N).

By expressing desirable and undesirable properties as sets of ground literals, we can define the goals of debugging an ASP program. These goals guide the debugging process to produce the necessary revisions of the original program. We also expect the revised program to be as close as possible to the original, so we use a cost function to measure the difference between the original and revised programs. An ASP debugging task is guided by a set of literals that can be used to revise the program: a revision alternative for a program may only use these literals to add to or delete from existing rules, to create new rules, or to delete entire existing rules.

Definition 1 (ASP Debugging Task)
A logic program debugging task is a tuple ⟨P, Lit, Φ, Ψ, cost⟩, where P is a normal logic program, Lit ⊆ B_P is the set of literals that can be used for revision, and Φ, Ψ ⊆ 2^(B_P ∪ not B_P) are sets of desirable and undesirable properties of P respectively, such that Φ ∩ Ψ = ∅. The cost function computes a difference metric between two logic programs. A debugging task is valid iff:

  ∄I ∈ Π_P : ∀φ ∈ Φ, I |= φ,  or  ∃ψ ∈ Ψ : ∃I ∈ Π_P, I |= ψ.

A program P' is a revised alternative for P if:

• ∀r' : A :- B, not C ∈ P' : ∃r : A :- B', not C' ∈ P such that B \ B' ⊆ Lit, B' \ B ⊆ Lit, C \ C' ⊆ Lit, C' \ C ⊆ Lit, or A ∪ B ∪ C ⊆ Lit
• ∀r : A :- B, not C ∈ P : ∃r' : A :- B', not C' ∈ P' such that B \ B' ⊆ Lit, B' \ B ⊆ Lit, C \ C' ⊆ Lit, C' \ C ⊆ Lit, or A ∪ B ∪ C ⊆ Lit

Let R_P be the set of all alternatives to P. A revised program P' ∈ R_P is a solution to the task iff:
(i) P' ∈ argmin{cost(P, P'') : P'' ∈ R_P},
(ii) ∃I ∈ Π_P' : ∀φ ∈ Φ, I |= φ,
(iii) ∀ψ ∈ Ψ : ∀I ∈ Π_P', I ⊭ ψ.

Given a revised rule r' of the revised program P', comprising head literal A, positive body literals B and negative body literals not C, there is a corresponding rule r in the original program P, comprising head literal A, positive body literals B' and negative body literals not C', where any literals added to or removed from B' (or C') must belong to the set Lit. Conversely, given a rule r in the original program P, there is a corresponding rule r' in the revised program P' subject to the same condition. Thus, the revised program P' is a debugged program in terms of the specified properties.

As the definition describes, the purpose of debugging an ASP program is to produce a revised program P' that can produce at least one answer set in which all the specified desirable literals in Φ are present, and none of whose answer sets contain any undesirable literals in Ψ. Intuitively, the program P' eliminates all the undesirable properties while preserving the desirable ones. Additionally, the revised program P' has minimal difference from the original one, as measured by the cost function. A detailed example of debugging a program is given in Section 3.5.
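Before moving on, a small illustration of the components of Definition 1 (drawn from that Sudoku case study): a desirable property φ = {state(4,1,2)} ∈ Φ and an undesirable property ψ = {state(4,2,2)} ∈ Ψ translate directly into ASP constraints:

  % desirable: reject answer sets that do not contain state(4,1,2)
  :- not state(4, 1, 2).
  % undesirable: reject answer sets that contain state(4,2,2)
  :- state(4, 2, 2).

These are exactly the constraint rules C constructed in step 2 of the procedure in Section 3.3.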
3 Debugging ASP programs with ILP

3.1 Overview

Given a normal logic program P and desirable and undesirable properties Φ and Ψ, the debugging task is to revise P to satisfy the specified properties; that is, the revised program P' can produce at least one answer set satisfying all properties in Φ, while no answer set contains any property in Ψ. In order to obtain such a solution, we first need to compute all the possible changes we could make to P, which include adding/removing a literal to/from a rule, and adding or deleting a rule altogether. Given a set of literals Lit, we provide structural and content specifications (called mode declarations; see Section 3.2) of the rules that can be constructed with Lit, which determine the learning space for the alternatives to the rules currently in the program P. By encoding the properties Φ and Ψ as constraints, the debugging task can be converted into a Boolean satisfiability problem and solved by existing techniques. In this paper, we employ Inductive Logic Programming.

Inductive Logic Programming (ILP) (Muggleton 1995) is a machine learning technique used to obtain logic theories by generalising (positive and negative) examples with respect to a prior base theory. In this paper we are interested in the revision of logic programs in the light of desirable and undesirable properties: we want to support the synthesis of new rules and the deletion or revision of existing ones by means of examples. A positive (negative) example in this context is a collection of ground literals that represents a (un)desirable property for the debugging task. The task is then not learning a new theory, but rather revising an existing one. The space of possible solutions, i.e. the revised theories, needs to be well defined to keep the search space as small as possible. This is achieved by a so-called "language bias": only theories satisfying this bias can be learned. Mode declarations (Muggleton 1995) are one way of specifying this language bias, determining which literals are allowed in the head and body of the rules of the theory. In this paper, we also employ mode declarations to constrain the structure of rules, and we encode mode declarations as revision tuples (details are given in Section 3.2), which represent all revision operations we could apply to a program to derive a debugged program satisfying the language bias.

In our system, users are first prompted to provide examples characterising properties, and a set of (head and body) literals, from which mode declarations are generated to, in turn, produce all revision tuples. The ILP debugger then selects the revision tuples that derive a debugged program satisfying the stated properties. It is considered preferable (Corapi et al. 2011) that a revised program should be as similar to the original one as possible: this suits the purpose of debugging while maintaining, as much as possible, the aims and design of the program being revised. One measure of minimality, similar to (Wogulis and Pazzani 1993), can be defined in terms of the number of revision operations. The revision operations are: (i) deleting a rule (i.e. removing an existing rule), (ii) adding a rule (i.e. forming a new rule), and (iii) adding or deleting body elements (i.e. revising an existing rule). We define a cost function cost(P, P') to determine the cost of revising program P to P'.

The revision process can be applied repeatedly, but the result at each step depends on the set of (un)desired literals that are specified, and it is important to note that the revision process is non-monotonic. Specifically, applying revision sets R1 and then R2, P →_R1 P' →_R2 P'', may not result in the same program as applying them in the reverse order, P →_R2 P' →_R1 P''. Thus, for incremental revision, the (un)desired literals must be accumulated to ensure that revisions made at one step are not undone at another. This iterative process is driven by the programmer, as she finds more issues with the program she is debugging. At each revision step, all the desired and undesired literals are taken into account.

3.2 Mode Declaration and Revision Tuples

In the definition of the ASP debugging task (Definition 1) we defined revision alternatives for a given program and set of literals. In ILP, this is determined by means of mode declarations.

Definition 2 (Mode Declaration)
A mode declaration is a tuple of the form ⟨id, pre(l), var(l)⟩, with l ∈ Lit, where Lit is a set of literals and id is the unique label of the mode declaration; pre(l) is the predicate associated with the literal l, and var(l) its associated variables. Lit can be divided into head literals Lit_h and body literals Lit_b, such that Lit = Lit_h ∪ Lit_b. For a set of mode declarations M, we have head mode declarations M_h = {m | m = ⟨id, pre(l), var(l)⟩, l ∈ Lit_h} and body mode declarations M_b = {m | m = ⟨id, pre(l), var(l)⟩, l ∈ Lit_b}.
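For instance (our illustration, using the Sudoku vocabulary of Section 3.5), body mode declarations permitting equal/2, or its negation, to be added to rule bodies could be written as ⟨bEqPos, equal, ⟨A, B⟩⟩ and ⟨bEqNeg, equal, ⟨A, B⟩⟩. A possible encoding as ASP facts for the debugger is sketched below; the predicate names modeh/3 and modeb/3 are our own hypothetical naming, not the system's exact interface:

  % sketch: head mode, state(X,Y,N) may appear in rule heads
  modeh(hState, state, vars(x, y, n)).
  % body modes: equal(A,B) and not equal(A,B) may be added to bodies
  modeb(bEqPos, pos(equal), vars(a, b)).
  modeb(bEqNeg, neg(equal), vars(a, b)).

The identifiers bEqPos and bEqNeg reappear in the revision tuples of the case study (Fig. 2, lines 27-29).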
We next define literal compatibility, by which a literal may become a constituent of a revised rule, and then rule compatibility in terms of literal compatibility. Thus, given a set of literals compatible with M, a set of compatible rules can be formed, deriving a constrained search space for a debugging task.

Definition 3 (Literal Compatibility)
Given a mode declaration m, a literal l' is compatible with m iff (i) l' has the predicate defined in pre(l) of m, and (ii) l' has the variables in var(l) of m.

Definition 4 (Rule Compatibility)
Given a rule r formed by a head literal h and body literals b_i, h ← b_0, ..., b_n, the rule r is compatible with the mode declarations M iff: (i) there is a head mode declaration m ∈ M_h compatible with h, and (ii) for each b_i, i ∈ [0, n], there is a body mode declaration m ∈ M_b compatible with b_i.

We have described the purpose of mode declarations in the revision process, but to realise a computational solution we need a representation. That is the purpose of this section; as such, it is a technical explanation of the means to synthesise revisions of rules from a given set of literals. The revision of a rule takes one of two forms: (i) specialisation, in which a literal is added to the body, thus adding a condition, or (ii) generalisation, in which a literal is removed from the body, thus removing a condition. To guide this process, we use mode declarations that constrain the search space and lead to the construction of revision tuples rev that describe specific rule revisions, i.e. by following a revision tuple, one rule can be revised into another. One or more revision tuples comprise the final solution to the debugging task in the later sections.

The key to forming revision tuples is to generate them for all possible revisions that could be applied to the rules of a program. A deletion operation is quite simple, but addition is more complicated, because we need to consider not only which literal to add, but also the relation between the existing variables and those carried by the new literal; each possible relation may result in a different rule structure. To operationalise this process, bound variable tuples, denoted Ξ_hb, are defined over a pair of head literal h and body literal b, where each element of the tuple corresponds to a body variable and collects the indices of the head variables whose type is the same as that body variable's. For instance, suppose we add a new body literal b to a rule with head literal h, where var(h) = ⟨P1, P2, Q1⟩ and var(b) = ⟨P3, Q2⟩, such that P1, P2 and P3 are of one type and Q1 and Q2 of another. The corresponding Ξ_hb is then ⟨⟨0, 1⟩, ⟨2⟩⟩, in which the first inner tuple collects the head variable indices matching P3 and the second those matching Q2. If there are no related variables between the head literal and the new body literal, then Ξ_hb is null.

Definition 5 (Revision Tuple)
A revision tuple rev is the representation of a revision operation in respect of a particular rule: rev = ⟨rId, Θ, cost⟩, where rId is the unique identifier of the rule in a program, Θ denotes the structure of the revised rule, and cost is the metric associated with the revision operation; by default, cost is 1 unit. There are two forms of Θ, defining addition and deletion operations respectively:

1. Θ = ⟨h_id, b_id, Ξ_hb⟩, where h ∈ M_h, b ∈ M_b and Ξ_hb is the bound variable tuple. A revision tuple with such a Θ denotes an addition operation that extends the rule with a new body literal b in accordance with Ξ_hb.
2. Θ = ⟨h_id, i⟩, where h ∈ M_h and i is the index of an existing body literal.
A revision tuple with such a Θ denotes a deletion operation that removes the i-th body literal.

A set of revision tuples is denoted by R. The relation P →_R P' indicates that program P is revised to P' by applying the revision tuples in R. For now, the cost of a set of revision tuples R is the sum of the costs of the individual tuples in the set.

The revision tuples express all revisions applicable to the current program. As we employ ASP to implement the whole procedure, we translate revision tuples into corresponding ASP facts. Concrete examples of revision tuples can be found in the case study in Section 3.5. For instance, in line 32 of Fig. 2, the revision tuple is encoded as rev(1, (hNull, bEqNeg, null), 1), denoting that a new body literal with id bEqNeg should be added to rule 1 at a cost of 1 unit.

3.3 Formal Procedures for Debugging

We can now formalise the procedure for a program debugging task, building on the preceding definitions. We rephrase a debugging task as a theory revision task whose solution is a debugged program satisfying the properties. Given a debugging task ⟨P, Lit, Φ, Ψ, cost⟩, the following steps compute the solution:

1. Construct M based on Lit: mode declarations for the given set of literals.
2. Encode C from Φ and Ψ: encode the set of properties as constraint rules.
3. Derive R from M: all revision tuples compatible with the rules in P.
3.4.1 Obtaining the Revision Tuples When given a program and a set of (un)desirable literals, our aim is to find out which rules, or more precisely which parts of some rules, need to be revised. The revisions typ- ically involve deletion of existing body atoms, or addition of new body atoms. Therefore, we need to adapt the existing rules to be ready for searching for possible changes. The following adaptations are designed for such purpose: we first use the try/3 predicate to label each body atom of the existing rules. Next we collect all possible revision tuples for deletion. Each try/3 literal is extended by a pair of use-delete rules, from which we can construct two variants: one with the atom referenced in the try/3 literal while the oth- er with the to-be-deleted atom. The two variants will be then examined later against the specified properties in the resulting answer sets. If the variant with the to-be-deleted atom succeeds, it implies a deletion operation is needed. The relevant revision tuples are also generated as part of this process and are used to capture the required operations. Next we need to obtain all the revision tuples for addition. We use the predicate extension/2 to label each head atom of the rules, and extend the extension/2 rules with other valid literals to derive different variants. If any of those variants satisfy all the properties, then an addition operation is required. Debugging ASP using ILP 9 By means of these syntactic transformations, a normal program P is converted into its revisable forms Ped and Pea . More importantly, all possible revision operations for a program are explored, with the corresponding revision tuples collected in the set R. Now we move to the abductive stage to learn solutions using R. 3.4.2 Abduction and Optimisation Both inductive and abductive learning take examples as inputs, from which inductive learn- ing aims to learn a rule to generalise the example, while abductive learning seeks explana- tions/causes for the example. In the our ASP debugger, we achieve inductive learning by abductive learning, through learning revision tuples for rules that invalidate the examples. In our case, examples are the literals characterising desirable and undesirable properties. Given a set of desirable and undesirable literals, a set of revision tuples collected in R and the revisable forms Ped and Pea of the program P , we are ready for abductive learn- ing. All solutions are computed by the ILP debugger, which is the program comprising: B ∪ Ped ∪ Pea ∪ R ∪ C. B is the fixed parts of P , including grounding and other domain facts. The set of constraint rules C captures desirable and undesirable properties. Each gen- erated answer set represents a solution to the problem, i.e. a set of revision tuples denoting alternative suggestions to revise the original program to satisfy the stated properties. As stated earlier, we are only interested in the revision with the minimum cost between P and P 0 . The total difference in cost between P and P 0 is the number of revision op- erations stated in a solution. In order to find the solution resulting in the minimum cost, we take advantage of the aggregate statement provided by C LINGO (Gebser et al. 2007), specifying a lower and an upper bound by which the weighted literals can be constrained, : −not [rev( , , Cost) = Cost]Max. We apply an incremental strategy for the variable Max, for example, if no solution can be found when Max = 1, then we continue incremen- tally until a solution is found. 
A direct encoding in ASP using minimise statements or weak constraints would require the grounding of more elaborate revisions that may not be needed; our incremental approach grounds only what is necessary. As the cost bound Max increases, the computation time increases accordingly, but the cost is bounded: the whole search space of rules is finite, so the number of possible operations is bounded, and thus there is a maximum cost Max. While this guarantees that the program terminates, it does not guarantee that the program always returns a solution, as the user might have provided unsatisfiable properties.

3.5 Example: Debugging a Sudoku Solver

In this section we present a simple but illustrative example to demonstrate the proposed debugging approach. In the example, a Sudoku solver P is implemented, of which Fig. 2 lists part in lines 1-11. Rule 1 (lines 6-7) is a constraint specifying that a digit cannot appear twice in a column; however, we intentionally omitted the body condition not equal(XA, XB) to give us a problematic solver P. Rule 2 (lines 9-10) states that a digit cannot appear twice in a row. We omit the constraint rules for sub-squares and the other fact rules for grounding to save space. The most important atom is state(X, Y, N), which is true when the number N is placed at the square in row X and column Y.

  3 2 4 1        3 2 4 1
  4 1 2 3        4 1 2 3
  1 4 3 2        1 4 3 2
  2 ? ? ?        2 3 1 4
    (a)            (b)

Fig. 1. Example: (a) Sudoku puzzle 4×4; (b) solution

Case study: a Sudoku solver debugging task ⟨P, Lit, Φ, Ψ, cost⟩, where the original (partial) program for the solver P is given below:

1  %--- 4 possible positions and values
2  position(1 .. 4). value(1 .. 4).
3  %--- Each square takes exactly one value
4  1{state(X,Y,N) : value(N) }1 :- position(X), position(Y).
5  %--- Rule 1: a number cannot appear twice in a column
6  :- state(XA,Y,N), state(XB,Y,N), position(XA), position(XB),
7     position(Y), value(N).
8  %--- Rule 2: a number cannot appear twice in a row
9  :- state(X,YA,N), state(X,YB,N), position(X), position(YA),
10    position(YB), value(N), not equal(YA, YB).
11 ......

Desirable properties Φ and undesirable properties Ψ are converted into constraint rules:

12 :- not state(4, 1, 2).
13 :- state(4,2,2). :- state(4,2,1). :- state(4,2,4).
14 :- state(4,3,4). :- state(4,3,2). :- state(4,4,3).
15 :- state(4,4,1). :- state(4,3,3). :- state(4,4,2).

Revisable form of Rule 1:

16 :- try(1, 1, state(XA,Y,N)), try(1, 1, state(XB,Y,N)),
17    extension(1, (XA, XB)).
18
19 % keep or not the body literal state(XA,Y,N)
20 try(1, 1, state(XA,Y,N)) :- state(XA,Y,N).
21 try(1, 1, state(XA,Y,N)) :- not state(XA,Y,N), rev(1, (1, b1), 1).
22 % keep or not the body literal state(XB,Y,N)
23 try(1, 1, state(XB,Y,N)) :- state(XB,Y,N).
24 try(1, 1, state(XB,Y,N)) :- not state(XB,Y,N), rev(1, (2, b2), 1).
25
26 % add new body literal equal(A,B)
27 extension(1,(XA, XB)) :- equal(XA,XB), rev(1,(hNull,bEqPos,null),1).
28 % add new body literal not equal(A,B)
29 extension(1,(XA, XB)) :- not equal(XA,XB), rev(1,(hNull,bEqNeg,null),1).
30 % nothing needs to be added
31 extension(1,(XA, XB)) :- rev(1,(hNull,null,(0,0)),1).

One of the optimised solutions (represented by revision tuples) is:

32 rev(1,(hNull,bEqNeg,null),1)

which suggests adding the new body literal not equal(XA, XB) to Rule 1:

33 :- state(XA,Y,N), state(XB,Y,N), position(XA), position(XB),
34    position(Y), value(N), not equal(XA, XB).

Fig. 2. Debugging the Sudoku Solver
A 4×4 puzzle is given, with known digits for the first three rows, in Fig. 1(a). The current solver produces 24 possible guesses for the last row, but only one of them (as in Fig. 1(b)) is correct according to the Sudoku rules. We can declaratively specify desirable and undesirable properties to steer the debugging of the solver so that it produces only the expected answer sets. Desirably, the first square of row 4 has the number 2 (state(4,1,2)). Undesirably, the second square of row 4 cannot be 2, 1 or 4, which already appear in other squares of the same column; similar undesirable properties are specified for the other squares of row 4. Both the desirable and undesirable properties are encoded as constraint rules in lines 12-15 of Fig. 2. Using Rule 1 as an example, we then demonstrate how to convert a rule into its revisable form and collect the relevant revision tuples as rev/3 atoms, each assumed to have unit cost. The optimal solution produced comprises a single revision tuple, rev(1, (hNull, bEqNeg, null), 1), indicating that Rule 1 needs a new body literal with id bEqNeg, which corresponds to not equal(XA, XB). Consequently, a debugged Sudoku solver is obtained with the revised Rule 1, as shown in lines 33-34 of Fig. 2.

4 Related and Future Work

In this paper we present an open-loop debugger and automatic reviser for normal logic programs under answer set semantics. We use inductive logic programming, with an abductive implementation in ASP, to revise the original program based on the desirable and undesirable properties specified by the user/programmer.

The approach we have outlined is both less and more than the classic reference on the algorithmic debugging of logic programs (Shapiro 1983). Less, in that it is coarse-grained, because ASP is a batch process from input to output, rather than a steppable, proof-tree creation process through which the contribution of different program fragments can be observed. But more, in that it offers whole-program debugging through the specification of input and (un)desired output. While algorithmic debugging depends upon an oracle (aka the programmer) to say whether the input/output relation is correct for a single case, in our approach the developer provides several positive/negative examples against which to test the program at the same time. Algorithmic debugging isolates the fragment of code that demonstrates an incorrect input/output relation for the current input, but working out how to fix the code remains the programmer's responsibility. In contrast, the mechanism described here proposes changes to fix the code with respect to all the examples supplied.

In earlier work (Corapi et al. 2011; Athakravi et al. 2012), we showed that ILP, implemented as an abductive logic program under answer set semantics, can be used to revise a subclass of answer set programs. Those programs were the result of an automatic translation of an action language in the domain of normative multi-agent systems and exhibited a very small range of variation in grammatical structure. Furthermore, that revision process handled only negative examples, and each revision involved just one negative example. In this paper, we extend the approach to deal with general answer set programs and to allow each revision to take as many positive and negative examples as possible into account.
As discussed in the introduction, a number of other debugging tools for answer set programming already exist (Brain et al. 2005; Calimeri et al. 2009; Cliffe et al. 2008; Kloimüllner et al. 2013; Gebser et al. 2009; Oetsch et al. 2011; Mikitiuk et al. 2007). They all offer, in some form, a mechanism to locate where the solving process goes wrong, but provide little support for fixing it. (Oetsch et al. 2011) is probably the most advanced in terms of language support, notably being capable of dealing with aggregates.

Arguably, the mode of development we have set out has more in common with (Lieberman 1981), in that it encourages a form of "programming-by-example" through the user's provision of inputs and examples of (un)satisfactory outputs, than with (Shapiro 1983), which directs the user's knowledge at identifying which part of a program is behaving (in)correctly. However, the latter approach may be more comprehensible because it focusses attention on localised fragments of the program, whereas revision is a whole-program process. This raises the question of how to make whole-program debugging accessible, given the volume of data involved. (Calimeri et al. 2009) takes some steps in this direction, but how well suited it might be to understanding the revision process is less clear, while a more general-purpose approach to visualising inference (Lieberman and Henke 2015) seems potentially more appropriate.

Our current system can be improved in a number of ways. User support is at present minimal; we plan to provide the tool as a plug-in for SEALION (Busoniu et al. 2013), an IDE for answer set programming. While users can specify (un)desirable properties of the program, these are currently very basic; even with the current implementation, it would be relatively easy to extend them from sets of literals to sets of rules. The language bias determines the search space for the revised program. At the moment, the bias is fixed through our mode declarations, which include all literals of the program. In future versions, we want the programmer to be able to steer the revisions if he/she knows where the problem might lie. This will reduce the run-time of the system, while the resulting revisions remain the same, assuming the programmer has specified an appropriate bias. Equally, based on the properties provided, the language bias could be adjusted, for example, to those literals that could influence the given properties.

A major and more complex future task is to extend the capability of the system to deal with aggregates. The abstract semantics provided by (Pührer 2014) makes a good starting point for our system.

As programs evolve over time, we might still want them to adhere to the same properties as those with which the program started. Unit testing for answer set programming (Janhunen et al. 2010; Janhunen et al. 2011; Febbraro et al. 2011; De Vos et al. 2012) is not new. The approach presented in this paper allows for the specification of properties the system has to adhere to, and it will revise the program when the system no longer satisfies them. In that respect, one might foresee using this approach as an automated test-driven development method for answer set programming. Our current system uses the minimal number of revision steps to select the most appropriate revision of the original program; we plan to investigate whether, for an evolving system, this results in the best possible programs.
References

Athakravi, D., Corapi, D., Russo, A., De Vos, M., Padget, J. A., and Satoh, K. 2012. Handling change in normative specifications. In Declarative Agent Languages and Technologies X - 10th International Workshop, DALT 2012, Valencia, Spain, June 4, 2012, Revised Selected Papers, M. Baldoni, L. A. Dennis, V. Mascardi, and W. W. Vasconcelos, Eds. Lecture Notes in Computer Science, vol. 7784. Springer, 1-19.

Brain, M., Watson, R., and De Vos, M. 2005. An interactive approach to answer set programming. In ASP05: Answer Set Programming: Advances in Theory and Implementation, M. De Vos and A. Provetti, Eds. Research Press International, Bath, UK, 190-202. Also available from http://CEUR-WS.org/Vol-142/files/page190.pdf.

Busoniu, P., Oetsch, J., Pührer, J., Skocovsky, P., and Tompits, H. 2013. Sealion: An Eclipse-based IDE for answer-set programming with advanced debugging support. TPLP 13, 4-5, 657-673.

Calimeri, F., Leone, N., Ricca, F., and Veltri, P. 2009. A visual tracer for DLV. Answer Set Programming, 79.

Cliffe, O., De Vos, M., Brain, M., and Padget, J. 2008. AspViz: Declarative visualisation and animation using answer set programming. In International Conference on Logic Programming. Lecture Notes in Computer Science. Springer Berlin/Heidelberg, 724-728.

Corapi, D., Russo, A., De Vos, M., Padget, J. A., and Satoh, K. 2011. Normative design using inductive learning. TPLP 11, 4-5, 783-799.

De Vos, M., Kiza, D., Oetsch, J., Pührer, J., and Tompits, H. 2012. Annotating answer-set programs in LANA. Theory and Practice of Logic Programming 12, 4-5, 619-637.

Denecker, M. and Ternovska, E. 2008. A logic of nonmonotone inductive definitions. ACM Transactions on Computational Logic (TOCL) 9, 2, 14.

Erdem, E., Lin, F., and Schaub, T., Eds. 2009. Logic Programming and Nonmonotonic Reasoning, 10th International Conference, LPNMR 2009, Potsdam, Germany, September 14-18, 2009. Proceedings. Lecture Notes in Computer Science, vol. 5753. Springer.

Febbraro, O., Leone, N., Reale, K., and Ricca, F. 2011. Unit testing in ASPIDE. In Applications of Declarative Programming and Knowledge Management - 19th International Conference, INAP 2011, and 25th Workshop on Logic Programming, WLP 2011, Vienna, Austria, September 28-30, 2011, Revised Selected Papers, H. Tompits, S. Abreu, J. Oetsch, J. Pührer, D. Seipel, M. Umeda, and A. Wolf, Eds. Lecture Notes in Computer Science, vol. 7773. Springer, 345-364.

Gebser, M., Kaufmann, B., Neumann, A., and Schaub, T. 2007. Conflict-driven answer set solving. In Proceedings of IJCAI07. 386-392.

Gebser, M., Pührer, J., Schaub, T., Tompits, H., and Woltran, S. 2009. spock: A debugging support tool for logic programs under the answer-set semantics. In Applications of Declarative Programming and Knowledge Management, D. Seipel, M. Hanus, and A. Wolf, Eds. Lecture Notes in Computer Science, vol. 5437. Springer Berlin Heidelberg, 247-252.

Janhunen, T., Niemelä, I., Oetsch, J., Pührer, J., and Tompits, H. 2010. On testing answer-set programs. In ECAI 2010 - 19th European Conference on Artificial Intelligence, Lisbon, Portugal, August 16-20, 2010, Proceedings, H. Coelho, R. Studer, and M. Wooldridge, Eds. Frontiers in Artificial Intelligence and Applications, vol. 215. IOS Press, 951-956.

Janhunen, T., Niemelä, I., Oetsch, J., Pührer, J., and Tompits, H. 2011. Random vs. structure-based testing of answer-set programs: An experimental comparison. In Logic Programming and Nonmonotonic Reasoning - 11th International Conference, LPNMR 2011, Vancouver, Canada, May 16-19, 2011. Proceedings, J. P. Delgrande and W. Faber, Eds. Lecture Notes in Computer Science, vol. 6645. Springer, 242-247.

Kloimüllner, C., Oetsch, J., Pührer, J., and Tompits, H. 2013. Kara: A system for visualising and visual editing of interpretations for answer-set programs. In Applications of Declarative Programming and Knowledge Management - 19th International Conference, INAP 2011, and 25th Workshop on Logic Programming, WLP 2011, Vienna, Austria, September 28-30, 2011, Revised Selected Papers, H. Tompits, S. Abreu, J. Oetsch, J. Pührer, D. Seipel, M. Umeda, and A. Wolf, Eds. LNCS, vol. 7773. Springer, 325-344.

Lieberman, H. 1981. Tinker: Example-based programming for artificial intelligence. In Proceedings of the 7th International Joint Conference on Artificial Intelligence (IJCAI '81), Vancouver, BC, Canada, August 1981, P. J. Hayes, Ed. William Kaufmann, 1060.

Lieberman, H. and Henke, J. 2015. Visualizing inference. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA, B. Bonet and S. Koenig, Eds. AAAI Press, 4280-4281.

Mikitiuk, A., Moseley, E., and Truszczynski, M. 2007. Towards debugging of answer-set programs in the language PSpb. In Proceedings of the 2007 International Conference on Artificial Intelligence, ICAI 2007, Volume II, June 25-28, 2007, Las Vegas, Nevada, USA, H. R. Arabnia, M. Q. Yang, and J. Y. Yang, Eds. CSREA Press, 635-640.

Muggleton, S. 1995. Inverse entailment and Progol. New Generation Computing 13, 245-286.

Oetsch, J., Pührer, J., and Tompits, H. 2011. Stepping through an answer-set program. In Logic Programming and Nonmonotonic Reasoning - 11th International Conference, LPNMR 2011, J. P. Delgrande and W. Faber, Eds. LNCS, vol. 6645. Springer, 134-147.

Pührer, J. 2014. Stepwise debugging in answer-set programming: Theoretical foundations and practical realisation. Ph.D. thesis, Vienna University of Technology, Vienna, Austria.

Shapiro, E. Y. 1983. Algorithmic Program Debugging. MIT Press, Cambridge, MA, USA.

Wogulis, J. and Pazzani, M. J. 1993. A methodology for evaluating theory revision systems: Results with Audrey II. In IJCAI, 1128-1134.