=Paper= {{Paper |id=None |storemode=property |title=Towards a Fully-Parallel DLV System |pdfUrl=https://ceur-ws.org/Vol-616/paper09.pdf |volume=Vol-616 |dblpUrl=https://dblp.org/rec/conf/cpaior/PerriRS10 }} ==Towards a Fully-Parallel DLV System== https://ceur-ws.org/Vol-616/paper09.pdf
                                           Towards a Fully-Parallel DLV System

                                              Simona Perri, Francesco Ricca, and Marco Sirianni

                                                            Department of Mathematics
                                                               University of Calabria
                                                                87030 Rende, Italy
                                                       perri,ricca,sirianni@mat.unical.it

                                                                        Abstract
                                          In this paper we report on the first attempts to customize and
                                      exploit in DLV, a state-of-the-art Answer Set Programming system,
                                      both novel and existing parallelization methods for the propositional
                                      search phase. These techniques, combined with the recently proposed
                                      strategies for parallel instantiation, will pave the way for obtaining a
                                      fully parallel DLV system.
                                          The results of an experimental analysis are also reported, show-
                                      ing the impact of parallel techniques on the performance of the DLV
                                      system.


                                1     Introduction
                                Answer Set Programming (ASP) [16, 26] is a purely declarative program-
                                ming paradigm based on nonmonotonic reasoning and logic programming.
                                    The idea of answer set programming is to represent a given computa-
                                tional problem by a logic program the answer sets of which correspond to
                                solutions, and then, use an answer set solver to find such solutions [26].
                                The main advantage of ASP is its high declarative nature combined with
                                a relatively high expressive power [22, 9]; but this comes at the price of a
                                high computational cost, which makes the implementation of efficient ASP
                                systems a difficult task.
                                    However, the development of efficient ASP systems [22, 20, 19, 24, 35,
                                28, 15, 27, 25, 2, 1], and among them the state-of-the-art system DLV [22],
                                made ASP an ideal tool for developing complex real-world applications, a
                                fact confirmed by the increasing number of ASP applications in both the
                                fields of AI and Knowledge Management [17, 32, 8, 21].
                                    At the time of this writing, the majority of the available ASP implemen-
                                tations is not able to take advantage during all the steps of the computation
                                from the latest hardware, featuring multi-core/multi-processor SMP (Sym-
                                metric MultiProcessing) technologies also for entry-level systems and PCs.
                                    In particular, concerning the DLV system, a first step in this direction
                                has been recently done by applying parallelism to the preliminary program
                                instantiation phase [6, 29]; however model generation and checking, the two
                                remaining phases of the computation, still rely on serial algorithms. On


                                Proceedings of the 17th International RCRA workshop (RCRA 2010):
                                Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion
                                Bologna, Italy, June 10–11, 2010

CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
the contrary, for some other ASP systems, specialized versions exploiting
parallelism in the propositional phase have been developed, but still rely on
a serial instantiation[14, 10, 18, 30].1
    In this paper we report on the first attempts to customize and exploit in
DLV both novel and existing parallelization methods for the propositional
search phase in order to obtain a fully parallel version of the DLV system.
We start by focusing on the computation of a single answer set, and try to
combine well-known parallel lookahead techniques [3] with a multi-heuristics
parallel search strategy.
    The results of an experimental analysis are also presented, showing the
impact of parallel techniques on the performance of the DLV system.


2     The DLV System
In this Section, first we provide a short description of the syntax and the
semantics of the kernel language of DLV which is disjunctive datalog under
the answer sets semantics [16]; then we outline the general architecture of
the system; finally, we focus on the computation performed by the Model
Generator, the module of the system subject of the improvements presented
in this paper.

2.1     The Core Language

Syntax.            A variable or a constant is a term. An atom is a(t1 , . . . , tn ),
where a is a predicate of arity n and t1 , . . . , tn are terms. A literal is either
a positive literal p or a negative literal not p, where p is an atom. Given a
literal L, we define its complementary literal not .L as follows: not .L = p
if L is of the form not p, otherwise not .L = not p.
      A disjunctive rule (rule, for short) r is a formula a1 ∨ · · · ∨ an :–
b1 , · · · , bk , not bk+1 , · · · , not bm . where a1 , · · · , an , b1 , · · · , bm are atoms
and n ≥ 0, m ≥ k ≥ 0. The disjunction a1 ∨ · · · ∨ an is the head of r,
while the conjunction b1 , ..., bk , not bk+1 , ..., not bm is the body of r. A
rule without head literals (i.e. n = 0) is usually referred to as an integrity
constraint. If the body is empty (i.e. k = m = 0), it is called a fact.
      An ASP program P is a finite set of safe rules. An atom, a literal, a
rule, or a program is ground if no variables appear in it.
Semantics. Let P be a program. The Herbrand Universe and the Her-
brand Base of P are defined in the standard way and denoted by UP and
BP , respectively.
   1
     Actually, some studies on parallel instantiation were carried out in [3] which remained
at a preliminary stage.




                                              2
            Figure 1: General architecture of the DLV system.

    Given a rule r occurring in P, a ground instance of r is a rule obtained
from r by replacing every variable X in r by σ(X), where σ is a substitu-
tion mapping the variables occurring in r to constants in UP ; ground(P)
denotes the set of all the ground instances of the rules occurring in P. An
interpretation for P is a set of ground atoms, that is, an interpretation is a
subset I of BP . A ground positive literal p is true (resp., false) w.r.t. I if
p ∈ I (resp., p 6∈ I). A ground negative literal not p is true w.r.t. I if p is
false w.r.t. I; otherwise not p is false w.r.t. I. Let r be a ground rule in
ground(P). The head of r is true w.r.t. I if H(r) ∩ I 6= ∅. The body of r is
true w.r.t. I if all body literals of r are true w.r.t. I (i.e., B + (r) ⊆ I and
B − (r) ∩ I = ∅) and is false w.r.t. I otherwise. The rule r is satisfied (or
true) w.r.t. I if its head is true w.r.t. I or its body is false w.r.t. I.
    A model for P is an interpretation M for P such that every rule r ∈
ground(P) is true w.r.t. M . A model M for P is minimal if no model N
for P exists such that N is a proper subset of M . The set of all minimal
models for P is denoted by MM(P).
    Given a ground program P and an interpretation I, the reduct of P w.r.t.
I is the subset P I of P, which is obtained from P by deleting rules in which
a body literal is false w.r.t. I. Note that the above definition of reduct,
proposed in [12], simplifies the original definition of Gelfond-Lifschitz (GL)
transform [16], but is fully equivalent to the GL transform for the definition
of answer sets [12].
    Let I be an interpretation for a program P. I is an answer set (or stable
model) for P if I ∈ MM(P I ) (i.e., I is a minimal model for the program P I )
[31, 16]. The set of all answer sets for P is denoted by AN S(P).

2.2   Architecture
An outline of the general architecture of the system is depicted in Fig.1.
    Upon startup, the input specified by the user is parsed and transformed
into the internal data structures of the system. In general, an input program
P contains variables, and the first step of the DLV computation, performed
by the Instantiator module, is to eliminate these variables, generating a

                                       3
bool ModelGenerator ( Interpretation& I )
{
     I = Propagate ( I );
     if ( I == L ) return false; (* inconsistency *)
     if ( “no atom is undefined in I” )
        return IsAnswerSet(I);
     Literal L;
     if ( ! Select(I, L) ) return False;
     if ( MG ( I ∪ {L} ) return True;
     else return MG ( I ∪ {not .L} );
};


                    Figure 2: Computation of Answer Sets


ground instantiation ground(P) of P. This process, called instantiation
(or grounding), is much more than a simple variables-elimination; it allows
to evaluate relevant programs fragments, and produces a ground program
which has precisely the same answer sets as the theoretical instantiation, but
it is sensibly smaller in size. Moreover, if the input program is disjunction-
free and stratified, then its evaluation is completely done by the instantiator
which computes the single answer set.
     The subsequent computations, which constitute the non-deterministic
part of the DLV system, are then performed on ground(P) by both the
Model Generator and the Model Checker. Roughly, the former produces
some “candidate” answer set, whose stability is subsequently verified by the
latter. The computation performed by the Model Generator exploits a back-
tracking search algorithm, which works directly on the ground instantiation
of the input program. When the Model Generator produces an answer set
candidate, the Model Checker verifies whether it is an answer set for the
input program. Note that, the task performed by the Model Checker is
as hard as the problem solved by the Model Generator for disjunctive pro-
grams, while it is trivial for non-disjunctive programs. However, there is
also a class of disjunctive programs, called Head-Cycle-Free programs [4],
for which the task solved by the Model Checker is provably simpler, which
is exploited in the system algorithms. Finally, once an answer set has been
found, it is printed and possibly the Model Generator resumes in order to
look for further answer sets.

2.3    The Model Generator
The computation performed by the Model Generator, is outlined in Figure 2.
Note that the description here is quite simplified, since the details of the
actual implementation are not important for this paper. For instance, the
algorithm presented here decides whether an answer set exists, but it can
be straightforwardly extended to compute all or a fixed number of answer


                                          4
sets as the DLV system does. A more detailed description can be found in
[11].
    The ModelGenerator function is first called with parameter I set to the
empty interpretation. If the program P has an answer set, then the function
returns true setting I to the computed answer set; otherwise it returns false.
The Model Generator is similar to the Davis-Putnam procedure employed by
SAT solvers. It first calls a function Propagate, which returns the extension
of I with the literals that can be deterministically inferred (or the set of all
literals L upon inconsistency). This function is similar to a unit propagation
procedure employed by SAT solvers, but exploits the peculiarities of ASP for
making further inferences (e.g., it exploits the knowledge that every answer
set is a minimal model). If Propagate does not detect any inconsistency,
an undefined2 literal L is selected according to a heuristic criterion (by
a call to the Select procedure) and ModelGenerator is called on I ∪ {L}
and on I ∪ {not .L}. The literal L corresponds to a branching variable in
SAT solvers. And indeed, like for SAT solvers, the selection of a “good”
literal L is crucial for the performance of an ASP system. To this end
DLV employes look-ahead heuristics [13] where the system looks ahead by
tentatively assuming each undefined literal L and its complement. Then,
the heuristic value of L (which is a measure of the “quality” of the resulting
interpretations) is exploited to select the next branching literal.
    Looking ahead is a comparatively costly operation, the computation of
this kind of heuristics can be very expensive, since the number of literals
to be “looked-ahead” may be very large; thus, the lookahead step is a good
candidate for exploiting parallelism [3]. Moreover, no heuristics measure
fits perfectly all cases, and it might even be the case that the costs of the
lookahead largely overcome its benefits; parallelism might help also in this
case by allowing for concurrently exploring different search paths.


3       Pushing Parallelism in the Model Generator
In this Section we present the first attempts to exploit parallelism in the
Model Generator. In this preliminary work, we focus on the computation
of a single answer set and we investigate two directions: first, we act on the
evaluation of the heuristically best literal, by making use of techniques of
parallel look-ahead; then we exploit a multi-heuristics parallel search strat-
egy taking advantage of different heuristic criteria.
    2
    The interpretations built during the computation are 3-valued, that is a literal can be
true, false or undefined (if its value has not been set yet) w.r.t. to an interpretation I.




                                            5
3.1    Parallel Lookahead
We now sketch a framework which exploits parallelism for the selection of
branching literals, according to a heuristic criterion.
    The general procedure evaluating the heuristically best literal is shown in
Figure 3. Roughly, some threads are spawned which run function Lookahead
(calls to pthread create). Each thread takes a different undefined literal A
to perform a look-ahead step by considering first I ∪ {A} and then I ∪
{not .A}. Lookahead, either calculates the two heuristic values (results are
stored in I+        −
            A and IA ) by calling function Propagate, or if an inconsistency is
                 +
encountered (IA = L or I−   A = L), stores the complement of the propagated
literal in the detChoices set. Once all threads have terminated (they reach
the barrier), literals in detChoices, which can be deterministically assumed,
are propagated. If an inconsistency arises at this point, then no literal can
be chosen at this level (the assumption of both A and its complement not .A
leads to inconsistency) and the function Select returns false, in order to cause
a backtracking.3 Otherwise, the best literal according to a given heuristic
criterion hC is selected among the candidate ones.

3.2    Multi-Heuristics Parallel Search Strategy
The heuristics for the selection of the branching literal dramatically affects
the performance of an ASP system; and, obviously, there is no heuristics
performing well in all cases, rather a heuristics can be more suitable than
another one for a given problem typology, or even for a specific problem
instance. In order to obtain an ASP solver which is able to always adopt
the best known heuristics, we exploit a multi-heuristics strategy. More in
detail, we run a number of Model Generator instances, each one adopting a
different branching criterion. Since we are interested in looking for just one
answer set, such a strategy corresponds to exploring different paths in the
search space, concurrently. The computation stops when the first instance
of Model Generator terminates, thus ensuring the best possible performance
for the given problem instance.
    In the following we briefly recall the exploited heuristic criteria. For fast
prototyping, we considered the ones already supported by the DLV system.
Further details can be found in [13].
Heuristic hsatz .      This is an extension of the branching rule adopted
in the system SATZ [23] to the framework of DLP. The basic idea is to
prefer literals introducing a higher number of “short” (that is, where few
undefined literals occur) unsatisfied rules. Intuitively, the introduction of a
high number of short unsatisfied rules is preferred because it creates more
   3
    The process is described here in a simplified way for reason of presentation. In the
actual implementation, the number of Lookahead threads can be fixed and in case of
inconsistency, threads are stopped as soon as the propagation of both L and not .L fails.


                                           6
bool Select(Interpretation& I, Literal& L)
{
   Interpretation I+A , IA ; ListOfThreadsID thList = ∅;
                         −

   ThreadSafeSetOfLiterals DetChoices = ∅;
   foreach Literal A undefined w.r.t. I do        (* spawn lookahead threads *)
        ThreadID id;
        pthread create(id, Lookahead, I, A, detChoices);
        th list.add(id);
   foreach ThreadID thread ∈ thList do            (* barrier *)
        pthread join(thread);
   I = Propagate(I ∪ detChoices);                 (* assume literals that can be deterministically propagated *)
   if (I == L) return false;
   L = N U LL;                                    (* compute the heuristically best literal *)
   foreach Literal A undefined w.r.t. I do
        if (L == N U LL)
             L = A;                               (* first literal, no comparison *)
        else if (L