Technical Communications of ICLP 2015. Copyright with the Authors.                        1


                 On Structural Analysis of
             Non-Ground Answer-Set Programs∗
       BENJAMIN KIESL,1 PETER SCHÜLLER,2 and HANS TOMPITS1

      1 Institut für Informationssysteme, Arbeitsbereich Wissensbasierte Systeme 184/3,

                                Technische Universität Wien
                        (e-mail: {kiesl,tompits}@kr.tuwien.ac.at)

     2 Department of Computer Engineering, Faculty of Engineering, Marmara University

                         (e-mail: peter.schuller@marmara.edu.tr)

                       submitted 29 April 2015; accepted 5 June 2015


                                        Abstract
The development of answer-set programs often involves domain experts without a back-
ground in logic programming. In such situations, it would be beneficial to translate pro-
grams into a form which is easier to understand and closer to natural language. Since the
structure of a program determines to a great extent how a program should be explained in
a clear and comprehensible way, as a first step towards a natural-language representation
of answer-set programs, in this paper, we introduce methods for analysing the structure of
disjunctive non-ground answer-set programs. In particular, as most programs follow the
generate-define-test paradigm, we introduce formal definitions to characterise the respec-
tive generate, define, and test parts of a program. Thereby, we define the non-deterministic
core of a program, effectively determining the program’s active solution-space generators,
following ideas of the weakly perfect model semantics as introduced by Przymusinska
and Przymusinski, and we prove that our definitions fulfil desirable properties. Moreover,
we also provide an implementation of a tool, using a metaprogramming approach, which
classifies the rules of a given program according to our definitions. Finally, we propose
an algorithm that, based on our generate-define-test classification, computes the order in
which the rules of a program should be explained when translated into natural language.

KEYWORDS: answer-set programming, generate-define-test paradigm, program analysis


                                    1 Introduction
Answer-set programming (ASP) has proven to be a viable tool for knowledge rep-
resentation, reasoning, and solving complex search problems. In many cases, the
process of developing answer-set programs involves domain experts, i.e., persons
with a detailed knowledge of some certain field of application but often without


∗ This work has been supported by the Austrian Science Fund (FWF) under project W1255-N23
  and by the Scientific and Technological Research Council of Turkey (TUBITAK) under grant
  114E777.
2               Benjamin Kiesl, Peter Schüller and Hans Tompits

a background in ASP. Such experts provide their know-how of a specific topic
to software developers who are then concerned with encoding this expertise into
answer-set programs. In such a context, arguably the software-engineering process
for answer-set programs involving non-ASP experts could be eased if representa-
tions of the programs and their output in a form which is closer to natural language
would be available.
   Since the structure of a program determines to a great extent how a program
should be explained in a clear and understandable way, as a first step towards
realising natural-language representations of answer-set programs, we introduce in
this paper methods for analysing the structure of disjunctive non-ground answer-
set programs. We note that work addressing representations in the other direction,
i.e., from natural language to ASP, has been discussed in the literature, e.g., by
Erdem and Yeniterzi (2009), who formulated queries over biomedical ontologies
using a controlled natural language (CNL), which are then translated into ASP for
evaluation, and by Schwitter (2012; 2013) who specified various logical puzzles in
CNL and provided their solutions via translations into ASP.
   A key structural feature of most answer-set programs is that they adhere to
the so-called generate-define-test paradigm which was informally introduced by
Lifschitz (2002) as a refinement of the well-known guess-and-check paradigm (Eiter
et al. 2000; Leone et al. 2006; Eiter et al. 2009). According to this paradigm, a
program can be considered as a combination of three parts, respectively referred
to as the generate, define, and test part. The generate part non-deterministically
generates a set of candidate solutions from which the test part eliminates those
candidates which are not the desired solutions of the specified problem. In order
to do so, both the generate and the test part may use concepts which are specified
in the define part. Although generate-define-test is, as indicated by Denecker et
al. (2012), the dominating methodology of ASP, formal criteria to characterise the
generate, define, and test parts of an answer-set program have not been introduced
so far to the best of our knowledge. We introduce such formal conditions and argue
for their adequacy by showing that they possess intuitively desirable properties.
   For a natural definition of the non-deterministic generate part of a program, it
is necessary to identify cases in which default negation leads to non-deterministic
behaviour. From a result of You and Yuan (1994), it immediately follows that a
non-disjunctive program has multiple answer sets only if its (ground) dependency
graph contains cycles with an even number of negative edges. Furthermore, odd
negative cycles can eliminate certain answer-set candidates. But, as demonstrated
by Przymusinska and Przymusinski (1990), negative cycles do not always become
active which is one reason why the existence of even negative cycles is not suffi-
cient for the creation of multiple answer sets and why odd negative cycles do not
necessarily eliminate answer-set candidates.
   Based on ideas underlying the procedure for the computation of the weakly-
perfect model of a logic program (Przymusinska and Przymusinski 1990), we define
the non-deterministic core of a program which is obtained by removing certain
rules and literals, thereby eliminating negative cycles that cannot become active.
Given this, we then define the generate part of a program as those rules which have
           On Structural Analysis of Non-Ground Answer-Set Programs                       3

instances that are involved in even negative cycles within the dependency graph
of the non-deterministic core, together with disjunctive rules. From the remaining
rules, those which are constraints or which have instances that are involved within
odd negative cycles of the non-deterministic core form the test part of a program,
while all other rules comprise the define part.
   We also present an implementation of a tool, based on a metaprogramming ap-
proach, which classifies the rules of a given program according to our definitions.
Finally, we propose an algorithm that makes use of our generate-define-test classi-
fication to compute a specific order for explaining the rules of a program in natural
language.


                                      2 Preliminaries
We deal with logic programs which are finite sets of rules of form
      L1 ∨ . . . ∨ Lk ← Lk +1 , . . . , Lm , not Lm+1 , . . . , not Ln ,                (1)
where L1 , . . . , Ln are literals, i.e., atoms from some (implicitly given) function-
free first-order language L, possibly preceded by the strong-negation symbol “¬”.
Literals which are possibly preceded by the default-negation symbol “not” are
called default literals. Given a rule r as above, we define head(r ) = {L1 , . . . , Lk },
body + (r ) = {Lk +1 , . . . , Lm }, and body − (r ) = {Lm+1 , . . . , Ln }. The elements of
head(r ) are referred to as the head literals of r , while the elements of body + (r ) and
body − (r ) are respectively referred to as the positive and negative body literals of r .
Moreover, we also define body(r ) = body + (r ) ∪ body − (r ). We call a rule of form (1)
disjunctive if k > 1, a fact if body(r ) = ∅, and a constraint if head(r ) = ∅. A pro-
gram Π is extended if it contains no disjunctive rules, normal if it contains neither
disjunctions nor strong negations, and positive if for every r ∈ Π, body − (r ) = ∅.
Given a rule or program e, we denote by at(e) (resp., lit(e)) the set of atoms (resp.,
literals) in e.
   The grounding of a logic program Π relative to its Herbrand universe HU (Π) is
defined as usual and denoted by grd(Π). Let M be a set of ground (variable-free)
literals. M satisfies a rule r if body + (r ) ⊆ M and body − (r )∩M = ∅ only if head(r )∩
M 6= ∅. Furthermore, M is closed under a program Π if M satisfies all rules in
Π, and M is logically closed if it is consistent (i.e., for no literal L ∈ M we have
{L, ¬L} ⊆ M) or contains all literals. The reduct, ΠM , of Π relative to M (Gelfond
and Lifschitz 1988) is given by {head(r ) ← body + (r ) | r ∈ grd(Π) and body − (r ) ∩
M = ∅}. M is an answer set of Π if it is a minimal set that is both closed under
ΠM and logically closed.
   The dependency graph, DG(Π), of a program Π (Ullman 1988) is the graph
which contains as vertices lit(grd(Π)) and a positive (resp., negative) edge (B , H )
iff there is a rule r ∈ grd(Π) such that H ∈ head(r ) and B ∈ body + (r ) (resp.,
B ∈ body − (r )). A directed path with an even (resp., odd) number of negative edges
is called an even (resp., odd) negative path. Accordingly, a path without negative
edges is called a positive path. A cycle is a directed path whose start and end
vertex coincide and in which no other vertex is visited more than once. Let ξ be
4               Benjamin Kiesl, Peter Schüller and Hans Tompits

the function which maps every edge e in DG(Π) to the set of all rules in Π that
cause e to be contained in DG(Π). Then, we say that a rule r is involved in a path
P of DG(Π) if P contains an edge e with r ∈ ξ(e).
   Given a program Π and literals A and B , A depends positively (resp., depends
negatively) on B if DG(Π) contains a positive (resp., negative) path from B to A.
Moreover, A depends on B if A depends positively or negatively on B .
   The rule graph, RG(Π), of a program Π (Dimopoulos and Torres 1996) is the
graph which contains as vertices the rules of Π and an edge (r1 , r2 ) iff there is a
predicate p which occurs in the body of r1 and in the head of r2 .
   A normal logic program Π is stratified if there exists a mapping f assigning
each predicate symbol a natural number such that, for every rule r ∈ Π and every
predicate P1 and P2 , (i) if P1 occurs in head(r ) and P2 occurs in body + (r ) then
f (P1 ) ≥ f (P2 ), and (ii) if P1 occurs in head(r ) and P2 occurs in body − (r ) then
f (P1 ) > f (P2 ). Let Π0 be the program obtained from grd(Π) by replacing every atom
with a corresponding propositional predicate symbol. Then, Π is locally stratified
if Π0 is stratified (Apt et al. 1988).


              3 A Formal Generate-Define-Test Classification
In this section, we introduce formal definitions of the generate, define, and test
parts of an answer-set program. Furthermore, we also sketch a metaprogram which
classifies the rules of a given answer-set program according to our definitions.
   As already noted in Section 1, the generate part should contain those rules which
cause non-determinism in the sense that they generate multiple answer-set can-
didates. Disjunctive rules and negative cycles are obvious sources of such non-
determinism, but, although even negative cycles are in the absence of disjunction
necessary for the creation of multiple answer sets, they are not sufficient. Consider,
for illustration, the program Π1 , consisting of the facts n(1), n(2), and succ(1, 2),
together with the following rule which, intuitively, chooses every second element of
the set n:
     choose(Y ) ← n(X ), n(Y ), succ(X , Y ), not choose(X ).                     (2)
The grounding of Π1 contains, additionally to the facts, the following ground in-
stances of (2):
     choose(1) ← n(1), n(1), succ(1, 1), not choose(1),                           (3)
     choose(1) ← n(2), n(1), succ(2, 1), not choose(2),                           (4)
     choose(2) ← n(1), n(2), succ(1, 2), not choose(1),                           (5)
     choose(2) ← n(2), n(2), succ(2, 2), not choose(2).                           (6)
Rules (4) and (5) are involved in an even negative cycle with literals choose(1) and
choose(2). Still, Π1 has the unique answer set {n(1), n(2), succ(1, 2), choose(2)}.
Intuitively, the negative cycle does not become active in the sense that the body
atom succ(2, 1) in Rule (4) cannot be contained in an answer set of Π1 . Modern
grounders automatically eliminate a rule like (4) from the grounding of Π1 but there
are more intricate cases where grounders are not able to remove such negative cycles.
           On Structural Analysis of Non-Ground Answer-Set Programs                    5

   The observation that in many cases negative cycles do not become active has
already been made by Przymusinska and Przymusinski (1990) when they introduced
the weakly-perfect-model semantics and the class of weakly stratified programs. The
distinguishing feature of weakly stratified programs is that they have unique answer
sets even though they are a strict superclass of locally stratified programs.
   We next use ideas behind the weakly-perfect-model semantics to define the non-
deterministic core (short, nd-core) of a program. The nd-core is obtained by remov-
ing certain rules and literals such that negative cycles which cannot become active
are eliminated. We begin with the definition of weakly minimal literals.

Definition 1
A literal L ∈ lit(grd(Π)) is weakly minimal if it does not depend negatively on
other literals and if it does not depend on a literal in the head of a disjunctive rule.


Definition 2
The deterministic bottom-stratum, dbs(Π), of Π is the set of all weakly-minimal
literals which do not occur in the heads of disjunctive rules. Furthermore, the de-
terministic bottom-layer, dbl(Π), of Π is the set {r ∈ grd(Π) | head(r ) ⊆ dbs(Π)}.

The deterministic bottom-layer of a program can be seen as the positive determi-
nistic portion of a program since it is a non-disjunctive positive program which
clearly has a unique answer set.

Example 1
Let Π2 consist of the set of facts {n(1), n(2), succ(1, 2)} and the following rules:

       choose(Y ) ← chooseEven, n(X ), n(Y ), succ(X , Y ), not choose(X ),         (7)
       choose(X ) ← chooseAll, n(X ),                                               (8)
      chooseEven ← not chooseAll,                                                   (9)
        chooseAll ← not chooseEven.                                                (10)

Π2 has two answer sets: one in which choose(n) is true for every number n and one
in which choose(n) is only true for even n. The grounding of Π2 contains the facts
together with Rules (9) and (10), and the following instances of (7) and (8):

      choose(1) ← chooseEven, n(1), n(1), succ(1, 1), not choose(1),               (11)
      choose(1) ← chooseEven, n(2), n(1), succ(2, 1), not choose(2),               (12)
      choose(2) ← chooseEven, n(1), n(2), succ(1, 2), not choose(1),               (13)
      choose(2) ← chooseEven, n(2), n(2), succ(2, 2), not choose(2),               (14)
      choose(1) ← chooseAll, n(1),                                                 (15)
      choose(2) ← chooseAll, n(2).                                                 (16)

Then, the deterministic bottom-stratum of grd(Π2 ) is dbs(grd(Π2 )) = {n(1), n(2),
succ(1, 1), succ(1, 2), succ(2, 1), succ(2, 2)}, and thus the deterministic bottom-layer
is dbl(grd(Π2 )) = {succ(1, 2) ←, n(2) ←, n(1) ←}. 
6               Benjamin Kiesl, Peter Schüller and Hans Tompits

For defining the nd-core of a program Π we repeatedly apply the transformation
given in Definition 3 until a fixed-point is reached. The intuitive idea behind this
transformation is to first evaluate the deterministic positive portion of the program,
identified by its deterministic bottom-layer. The deterministic bottom-layer, as well
as further rules whose bodies can never become satisfied or which are redundant, are
then removed. Additionally, default literals that are satisfied by the unique answer
set of the deterministic bottom-layer are removed from the remaining rules. This
eliminates certain negative cycles that have a deterministic effect.
Definition 3
Let Π be a ground logic program and M the unique answer set of dbl(Π). Then,
ν(Π) is the program obtained from Π by
  (i) removing all rules which contain a negative body literal N such that N ∈ M,
      a positive body literal P such that P ∈ dbs(Π) \ M, or a head literal H with
      H ∈ M,
 (ii) removing those body default literals such that P ∈ M for positive body literals
      P or N ∈ dbs(Π) \ M for negative body literals N , and
(iii) removing all non-facts whose heads appear as facts in the program.
The removal of non-facts whose heads appear as facts is important for programs
like Π3 = {a ← not b, b ← not a, a ←}. Here, it allows to remove the first rule
although dbs(Π3 ) = ∅ and so ν(Π3 ) has, in contrast to Π3 , no negative cycles.
   Based on the operator ν, we can now define the non-deterministic core of a
(possibly non-ground) program, intuitively obtained by partially evaluating its de-
terministic portion (which may involve negative default literals) and transforming
the program accordingly.
   As a preparatory step, let ν 0 (Π) = Π and ν i+1 (Π) = ν(ν i (Π)), for i ∈ N+ .
Clearly, the application of ν to a ground logic program Π removes from Π all the
literals of dbs(Π) and does not add any literals. Hence, there is an i such that either
(a) ν i (Π) = ∅ and thus ν i+1 (Π) = ∅, or (b) ν i+1 (Π) is non-empty and does not
differ from ν i (Π), as dbs(ν i (Π)) = ∅, and ν i (Π) does not contain any non-facts
whose heads occur as facts. Thus, the following holds:
Theorem 1
For every ground logic program Π, there is a smallest number i0 such that ν i0 (Π) =
ν k (Π), for each k ≥ i0 .
Let us write ν ∞ (Π) for ν i0 (Π) with i0 as in Theorem 1. We then define:
Definition 4
The non-deterministic core (or nd-core) of a program Π is given by ν ∞ (grd(Π)).

Example 2
Let Π = grd(Π3 ), where Π3 is the program from the above. In Example 1, we
already identified that dbl(Π) = {n(1) ←, n(2) ←, succ(1, 2) ←}. Its unique answer
set is M = {n(1), n(2), succ(1, 2)}. Now, when computing ν(Π), Rules (11), (12),
and (14) are removed since their body literals succ(1, 1), succ(2, 1), and succ(2, 2)
           On Structural Analysis of Non-Ground Answer-Set Programs                   7

are contained in dbs(Π) \ M. As well, the facts succ(1, 2) ←, n(1) ←, and n(2) ←
are removed since their heads are contained in M. Finally, the body literals n(1),
n(2), and succ(1, 2) are removed from the remaining rules since they are contained
in M. We thus obtain ν 1 (Π) = ν(Π) as follows:
        choose(2) ← chooseEven, not choose(1),                                     (17)
        choose(1) ← chooseAll,                                                     (18)
        choose(2) ← chooseAll,                                                     (19)
      chooseEven ← not chooseAll,                                                  (20)
        chooseAll ← not chooseEven.                                                (21)
Since ν 1 (Π) does not contain any weakly-minimal literals, we have that dbs(ν 1 (Π))=
∅. Therefore, ν 2 (Π) = ν 1 (Π), and so ν 1 (Π) is already the nd-core of Π3 . 
In this example, only one application of ν was necessary for computing the nd-core
of Π3 . But this is not always the case. In fact, the number of computation steps
cannot be bounded by a constant, as stated by the following theorem:
Theorem 2
For every n ∈ N, there is a ground program Πn such that ν ∞ (Πn ) 6= ν n−1 (Πn ).
Having the nd-core of a program at our disposal, we can now define the generate,
define, and test parts. In the following, an instance of a rule r refers to a rule which
was obtained from r by grounding it and which has possibly been modified by ν.
Definition 5
Let Π be a logic program. Then,
  (i) the generate part of Π consists of all rules which are disjunctive or which have
      instances that are involved in even negative cycles within the dependency
      graph of the nd-core of Π,
 (ii) the test part of Π consists of all rules which are not in the generate part and
      which are either constraints or which have instances that are involved in (odd)
      negative cycles within the dependency graph of the nd-core of Π, and
(iii) the define part of Π consists of all rules which are neither in the generate part
      nor in the test part.
Rules of the generate, define, and test part are referred to as generating, defining,
and testing rules, respectively.
Note that we are explicitly speaking of instances of (non-ground) rules which are
involved in negative cycles and not of rules whose head literals or predicates are
contained in a negative cycle. Note also that facts are contained in the define part
but it would be straightforward to define a separate part for them.
  Also, often in non-ground ASP, one deals with uniform problem encodings, in
which a program Π is detached from the set of facts providing the input of the
program. From a database point of view, the program is the intensional database
(IDB) whilst the facts constitute the extensional database (EDB). When trying to
analyse the structure of such a uniform encoding Π, the analysis of Π can be done
8                 Benjamin Kiesl, Peter Schüller and Hans Tompits

by computing the nd-core of Π ∪ Πf , where Πf consists of input facts comprising
instances of atoms of all extensional predicates of Π.
Example 3
For the program Π2 , the generate part consists of the rules
        chooseEven ← not chooseAll       and chooseAll ← not chooseEven,
while the define part contains the facts together with the following rules:
        choose(Y ) ← chooseEven, n(X ), n(Y ), succ(X , Y ), not choose(X ),     (22)
        choose(X ) ← chooseAll, n(X ).                                           (23)
The test part of Π2 is empty.       
In the dependency graph of Π2 , instances of (22) are involved in negative cycles
with an even number of negative edges as well as in negative cycles with an odd
number of negative edges. But, contrary to common intuition, these cycles neither
cause the creation of multiple (candidate) answer sets nor do they eliminate them.
Hence, (22) is a defining rule and not a generating or a testing rule. In fact, (22)
would even be a defining rule if predicate succ would not be defined by facts but
by an equivalent set of rules including default negation.
    Next, we discuss some properties of the above definitions.
Lemma 1
The nd-core of every weakly stratified program, and thus of every (locally) stratified
program, is empty.
Consequently, the following holds:
Theorem 3
The generate part of every weakly stratified program, and thus of every (locally)
stratified program, is empty.
The following theorem expresses that a non-deterministic behaviour of a program
always has its origin in the generate part.
Theorem 4
A program with an empty generate part has at most one answer set.
Note that the converse of this statement does not hold since the generate part can
create multiple answer-set candidates which are all eliminated by the test part.
   In concluding this section, we remark that we implemented a tool which classifies
logic programs into its generate, define, and test parts according to our definitions.
The tool is implemented in Java and performs the computation of the nd-core and
the identification of rules which are involved within its negative cycles by means
of an ASP metaprogramming approach, using an answer-set program consisting of
the following modules:
    (i) Πreify , representing the input program as a set of facts,
           On Structural Analysis of Non-Ground Answer-Set Programs                  9

 (ii) Πdependency , defining the dependency notions,
(iii) Πbottomlayer , identifying the deterministic bottom-stratum and the corres-
      ponding deterministic bottom-layer,
(iv) Πnd-core , encoding the operator ν and computing the nd-core, and
 (v) Πcycle , identifying rules which are involved in even and odd negative cycles
      within the dependency graph.
Since the whole metaprogram consists of more than 100 rules we do not list it here.
It is available at www.kr.tuwien.ac.at/staff/kiesl/gdt_classification.lp.


                      4 An Explanation Order on Rules
For obtaining an understandable explanation of an answer-set program, the order
in which the rules are explained is crucial. One way to obtain arguably useful
explanations is to start with generating rules and then continue by explaining how
the predicates of the generate part are constrained by rules from the test part,
thereby also taking the involved defining rules into account. In what follows, we
discuss a method for assigning an explanation order on rules which also copes in
an intuitive way with testing rules that constrain several generating rules.
Example 4
Assume that we want to assign colours to the vertices of a graph such that every ver-
tex is either red, blue, or has no colour at all. Furthermore, some of the vertices are
not allowed to be either red or blue, represented by predicates must_not_be_red
and must_not_be_blue. A possible encoding is the following program Π4 :
       red(X ) ∨ ¬red(X ) ← vertex(X ),                                           (24)
      blue(X ) ∨ ¬blue(X ) ← vertex(X ),                                          (25)
                            ← vertex(X ), red(X ), must_not_be_red(X ),           (26)
                            ← vertex(X ), blue(X ), must_not_be_blue(X ),         (27)
                            ← vertex(X ), red(X ), blue(X ).                      (28)
There are two generating rules, (24) and (25). Among the three constraints, (26)
is only related to literals generated by (24), (27) is only related to those generated
by (25), and (28) is related to literals that are generated by both generating rules.
   We believe that a useful explanation of the whole program is as follows: we
start by explaining the generating rule (24) together with its corresponding con-
straint (26); after that we explain (25) together with (27), and at the end we men-
tion (28) which constrains both generating rules. A natural-language explanation
realising this order could be as follows:
1. For any vertex, choose whether it is red or not [Rule (24)] such that no vertex
which must not be red is red [Rule (26)].
2. For any vertex, choose whether it is blue or not [Rule (25)] such that no vertex
which must not be blue is blue [Rule (27)].
3. Additionally, it must not be the case that there is a vertex which is both red and
blue [Rule (28)]. 
10              Benjamin Kiesl, Peter Schüller and Hans Tompits
                                                                     ←
← vertex(X ), red(X ), must_not_be_red(X )     vertex(X ), blue(X ), must_not_be_blue(X )

                             ← vertex(X ), red(X ), blue(X )

     red(X ) ∨ ¬red(X ) ← vertex(X )               blue(X ) ∨ ¬blue(X ) ← vertex(X )

                             Fig. 1. The rule graph of Π4 .


To formalise the intuition behind this order, we make use of the strongly-connected
components (SCCs) of the rule graph RG(Π) of a program Π. Given a rule r ∈ Π
we denote by SCC(r ) the SCC of RG(Π) containing r . We call the graph obtained
by collapsing SCCs of RG(Π) the reduction graph of RG(Π). An SCC of RG(Π) is
maximal if it does not have an outgoing edge within the reduction graph of RG(Π).
Definition 6
A rule of a program Π is maximal if it is contained in a maximal SCC of RG(Π).

Definition 7
A generating rule rG of a program Π is maximal if in RG(Π) (i) no generating rule
in SCC(rG ) has more outgoing edges than rG and (ii) there is no directed path from
rG to a generating rule in a different SCC of RG(Π).
The intuition behind this definition is to give priority to explaining generating rules
that are most influential in the program (according to outgoing paths) and that are
nearest to testing rules and definitions.
  We next formalise that testing rules can be associated with certain rules.
Definition 8
Given a logic program Π, let rT ∈ Π be a testing rule and r ∈ Π a rule. Then, r is
constrained by rT if there is a directed path from r to rT in RG(Π), otherwise r is
unconstrained by rT .
Note that testing rules are trivially constrained by themselves.
Example 5 (continued )
Figure 1 depicts the rule graph of Π4 . According to our definitions, (26) constrains
only (24), (27) constrains only (25), and (28) constrains both (24) and (25). All
three constraints induce maximal components and thus they are maximal rules.
Both generating rules (24) and (25) are maximal ones.      
Algorithm 1 computes an order for explaining rules of a given program. Intuitively,
rules are explained in the order they are visited during a depth-first traversal of the
rule graph, starting from maximal rules and maximal generating rules. During the
traversal, rules which are in the same SCC are preferred. Additionally, as soon as all
rules from which a generating rule rG is reachable have been visited, testing rules
which constrain rG are explained. Step (a) first explains unconstrained maximal
rules, then maximal generating rules, and finally arbitrary maximal rules.
Theorem 5
Algorithm 1 always terminates and assigns a unique number to each rule.
             On Structural Analysis of Non-Ground Answer-Set Programs              11

  input : rule graph G
  output: order in which the rules are explained, defined by order
  let sameSCC and otherSCC be stacks
  let i := 1
  while G contains unlabelled rules do
       if otherSCC is empty then
(a)         let r be an unlabelled rule according to the following priority:
            - first choose unconstrained maximal rules,
            - then choose maximal generating rules,
            - then choose maximal rules.
           push(sameSCC, r )
        while sameSCC or otherSCC is not empty do
           if sameSCC is not empty then
               let r := pop(sameSCC)
           else
               let r := pop(otherSCC)
           if r is not labelled then
               let order(r ) := i    // now we consider r to be labelled
               let i := i + 1
               // iterate in the same order as the r 0 occur in the body of r
               for all edges (r 0 , r ) in G do
                   if r 0 is in the same SCC as r then
                        push(sameSCC, r 0 )
                   else
                        push(otherSCC, r 0 )

        let l be the list of unlabelled testing rules
        order l by the number of generating rules they constrain, descending
        for c in l do
             if all generating rules constrained by c are labelled then
(b)               push(otherSCC, c)


         Algorithm 1: Computes the order in which rules are explained.


Example 6 (continued )
Π4 does not contain unconstrained rules, so in Step (a) of Algorithm 1 we choose
one of the generating rules, say (24). It has no incoming edges but after labelling it
with 1, all rules constrained by (26) are labelled, hence (26) is pushed in Step (b)
and labelled with 2. Subsequently, the next unlabelled maximal generating rule is
chosen in (a), viz. Rule (25), and labelled with 3. Then, (27) is pushed in (b) and
labelled with 4 since all rules which are constrained by (27) were labelled. Finally,
(28) is pushed in (b) since now both rules which it constrains have been labelled.
Note that (28) is visited after (27) because (28) constrains more generating rules
than (27). We end up with the following order:
   1:      red(X ) ∨ ¬red(X ) ← vertex(X ),
   2:                             ← vertex(X ), red(X ), must_not_be_red(X ),
   3:    blue(X ) ∨ ¬blue(X ) ← vertex(X ),
12              Benjamin Kiesl, Peter Schüller and Hans Tompits

     4:                       ← vertex(X ), blue(X ), must_not_be_blue(X ),
     5:                       ← vertex(X ), red(X ), blue(X ),
which is exactly the order we wanted at the beginning of this section.       


                                   5 Conclusion
We introduced methods for structural analysis of non-ground disjunctive answer-set
programs. In particular, we introduced formal definitions for the generate, define,
and test parts of a program. In order to arrive at natural definitions, we first defined
the non-deterministic core of a program, which is obtained by eliminating deter-
ministically determined rules and literals. This has the potential to eliminate those
negative cycles that cannot become active and thus should not be considered in the
generate or test part. We then defined generating rules to be disjunctive rules as
well as rules having instances that are involved in the non-deterministic core’s even
negative cycles. From the remaining rules, we defined the test part to contain con-
straints and rules which have instances that are involved in the core’s odd negative
cycles, and all remaining rules are then considered to represent definitions.
   We implemented a tool based on metaprogramming which classifies rules in an
answer-set program accordingly. The aim of this tool is to apply it for translating
programs into (controlled) natural language, however by itself this tool can be useful
for developers to quicker gain an understanding of a given program. Based on our
definitions of generate-define-test, we introduced an algorithm which computes an
order for translating rules of a program into CNL. We believe we found criteria that
yield a clear and understandable explanation of the overall program.
   Our approach differs from justification-based methods (Pontelli et al. 2009; Damá-
sio et al. 2013) which aim at explaining the truth of certain atoms rather than
explaining the whole program itself and which in turn are closely related to the de-
bugging of logic programs (Pereira et al. 1993; Brain and De Vos 2005; Brain et al.
2007; Gebser et al. 2008; Oetsch et al. 2010; Oetsch et al. 2011; Shchekotykhin
2015).
   As future work, we plan to automatically realise a translation of answer-set pro-
grams into controlled natural language. Moreover, we want to extend our methods
to programs of broader syntactic classes, e.g., classes with language features like
choice constructs or aggregate relations.
   We also plan to investigate how our definition of the non-deterministic core re-
lates to program normal forms as introduced by Brass et al. (1996; 2001) for the
computation of the well-founded semantics. Note that the normal forms introduced
by Costantini and Provetti (2005) might differ quite drastically from the original
program, so they do not seem promising for our purposes of rule classification.
   We think that optimisation of non-ground logic programs can profit from our
work, e.g., Baral (2003) demonstrated that for many programs the efficiency can be
improved by modifying the generate part such that conditions from the test part are
already taken into account when generating candidate solutions, thereby shrinking
the solver search space. Arguably, such optimisations can be performed easier and
possibly automated if corresponding program parts can be identified automatically.
           On Structural Analysis of Non-Ground Answer-Set Programs                    13

                                      References
Apt, K. R., Blair, H. A., and Walker, A. 1988. Towards a theory of declarative
  knowledge. In Foundations of Deductive Databases and Logic Programming. Morgan
  Kaufmann, Chapter 2, 89–148.
Baral, C. 2003. Knowledge Representation, Reasoning, and Declarative Problem Solving.
  Cambridge University Press, New York, NY, USA.
Brain, M. and De Vos, M. 2005. Debugging logic programs under the answer set
  semantics. In Proceedings of the 3rd International ASP Workshop (ASP 2005). CEUR
  Workshop Proceedings. 142–152.
Brain, M., Gebser, M., Pührer, J., Schaub, T., Tompits, H., and Woltran, S.
  2007. Debugging ASP programs by means of ASP. In Proceedings of the 9th Inter-
  national Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR
  2007), C. Baral, G. Brewka, and J. Schlipf, Eds. Lecture Notes in Computer Science,
  vol. 4483. Springer, 31–43.
Brass, S. and Dix, J. 1996. Characterizing D-WFS: Confluence and iterated GCWA. In
  Proceedings of the 4th European Workshop on Logics in Artificial Intelligence (JELIA
  1996), J. J. Alferes, L. M. Pereira, and E. Orlowska, Eds. Lecture Notes in Computer
  Science, vol. 1126. Springer, 268–283.
Brass, S., Dix, J., Freitag, B., and Zukowski, U. 2001. Transformation-based
  bottom-up computation of the well-founded model. Theory and Practice of Logic Pro-
  gramming 1, 497–538.
Costantini, S. and Provetti, A. 2005. Normal forms for answer sets programming.
  Theory and Practice of Logic Programming 5, 747–760.
Damásio, C. V., Analyti, A., and Antoniou, G. 2013. Justifications for logic pro-
  gramming. In Proceedings of the 12th International Conference on Logic Programming
  and Nonmonotonic Reasoning (LPNMR 2013), P. Cabalar and T. C. Son, Eds. Lecture
  Notes in Computer Science, vol. 8148. Springer, 530–542.
Denecker, M., Lierler, Y., Truszczynski, M., and Vennekens, J. 2012. A Tarskian
  informal semantics for answer set programming. In Technical Communications of the
  28th International Conference on Logic Programming (ICLP 2012), A. Dovier and V. S.
  Costa, Eds. LIPIcs, vol. 17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 277–289.
Dimopoulos, Y. and Torres, A. 1996. Graph theoretical structures in logic programs
  and default theories. Theoretical Computer Science 170, 1, 209–244.
Eiter, T., Faber, W., Leone, N., and Pfeifer, G. 2000. Declarative problem-solving
  using the DLV system. In Logic-Based Artificial Intelligence, J. Minker, Ed. Kluwer
  Academic Publishers, 79–103.
Eiter, T., Ianni, G., and Krennwallner, T. 2009. Answer set programming: A primer.
  In Reasoning Web. Semantic Technologies for Information Systems, 5th International
  Summer School 2009, S. Tessaris, E. Franconi, T. Eiter, C. Gutierrez, S. Handschuh,
  M.-C. Rousset, and R. A. Schmidt, Eds. Lecture Notes in Computer Science, vol. 5689.
  Springer, 40–110.
Erdem, E. and Yeniterzi, R. 2009. Transforming controlled natural language biomed-
  ical queries into answer set programs. In Proceedings of the Workshop on Current
  Trends in Biomedical Natural Language Processing (BioNLP 2009). Association for
  Computational Linguistics, 117–124.
Gebser, M., Pührer, J., Schaub, T., and Tompits, H. 2008. A meta-programming
  technique for debugging answer-set programs. In Proceedings of the 23rd AAAI Con-
  ference on Artificial Intelligence (AAAI 2008). Vol. 8. 448–453.
Gelfond, M. and Lifschitz, V. 1988. The stable model semantics for logic program-
  ming. In Proceedings of the 5th International Conference and Symposium on Logic
14              Benjamin Kiesl, Peter Schüller and Hans Tompits

  Programming (ICLP/SLP 1988), R. A. Kowalski and K. A. Bowen, Eds. MIT Press,
  1070–1080.
Leone, N., Pfeifer, G., Faber, W., Eiter, T., Gottlob, G., Perri, S., and Scar-
  cello, F. 2006. The DLV system for knowledge representation and reasoning. ACM
  Transactions on Computional Logic 7, 3, 499–562.
Lifschitz, V. 2002. Answer set programming and plan generation. Artificial Intelli-
  gence 138, 1–2, 39 – 54.
Oetsch, J., Pührer, J., and Tompits, H. 2010. Catching the Ouroboros: On debugging
  non-ground answer-set programs. Theory and Practice of Logic Programming 10, 513–
  529.
Oetsch, J., Pührer, J., and Tompits, H. 2011. Stepping through an answer-set pro-
  gram. In Proceedings of the 11th International Conference on Logic Programming and
  Nonmonotonic Reasoning (LPNMR 2011), J. P. Delgrande and W. Faber, Eds. Lecture
  Notes in Computer Science, vol. 6645. Springer, 134–147.
Pereira, L. M., Damásio, C. V., and Alferes, J. J. 1993. Diagnosis and debug-
  ging as contradiction removal. In Proceedings of the 2nd International Workshop on
  Logic Programming and Non-monotonic Reasoning (LPNMR 1993), L. M. Pereira and
  A. Nerode, Eds. The MIT Press, 316–330.
Pontelli, E., Son, T. C., and Elkhatib, O. 2009. Justifications for logic programs
  under answer set semantics. Theory and Practice of Logic Programming 9, 1–56.
Przymusinska, H. and Przymusinski, T. C. 1990. Weakly stratified logic programs.
  Fundamenta Informaticae 13, 1, 51–65.
Schwitter, R. 2012. Answer set programming via controlled natural language processing.
  In Proceedings of the 3rd International Workshop on Controlled Natural Language
  (CNL 2012). Lecture Notes in Computer Science, vol. 7427. Springer, 26–43.
Schwitter, R. 2013. The jobs puzzle: Taking on the challenge via controlled natural
  language processing. Theory and Practice of Logic Programming 13, 4-5, 487–501.
Shchekotykhin, K. M. 2015. Interactive query-based debugging of ASP programs.
  In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI 2015),
  B. Bonet and S. Koenig, Eds. AAAI Press, 1597–1603.
Ullman, J. D. 1988. Principles of Database and Knowledge-Base Systems. Vol. 1. Com-
  puter Science Press.
You, J.-H. and Yuan, L. Y. 1994. A three-valued semantics for deductive databases
  and logic programs. Journal of Computer and System Sciences 49, 2, 334–361.