A Hybrid Approach to Inference in Probabilistic Non-Monotonic Logic Programming

Matthias Nickles and Alessandra Mileo
Insight Centre for Data Analytics
National University of Ireland, Galway
{matthias.nickles,alessandra.mileo}@deri.org

Abstract. We present a probabilistic inductive logic programming framework which integrates non-monotonic reasoning, probabilistic inference and parameter learning. In contrast to traditional approaches to probabilistic Answer Set Programming (ASP), our framework imposes comparatively few restrictions on probabilistic logic programs - in particular, it allows for ASP as well as FOL syntax, and for precise as well as imprecise (interval-valued) probabilities. User-configurable sampling and inference algorithms, which can be combined in a pipeline-like fashion, provide for general as well as specialized, more scalable approaches to uncertainty reasoning, allowing for adaptability with regard to different reasoning and learning tasks.

1 Introduction

With this paper, we present the probabilistic logic framework PrASP. PrASP is both a probabilistic logic programming language and a software system for probabilistic inference and inductive weight learning based on Answer Set Programming (ASP). Compared to previous works on this framework [11, 10], we introduce another inference algorithm and additional evaluation results.
Reasoning in the presence of uncertainty and relational structures such as social networks or Linked Data is an important aspect of knowledge discovery and representation for the Web, the Internet of Things, and other heterogeneous and complex domains. Probabilistic logic programming, and the ability to learn probabilistic logic programs from data, can provide an attractive approach to uncertainty reasoning and statistical relational learning, since it combines the deductive power and declarative nature of logic programming with probabilistic inference abilities traditionally known from graphical models, such as Bayesian and Markov networks. We build upon existing approaches in the area of probabilistic (inductive) logic programming in order to provide a new ASP-based probabilistic logic programming language and inference tool which combines the benefits of non-monotonic reasoning using state-of-the-art ASP solvers with probabilistic inference and machine learning. The main enhancement provided by PrASP over (non-probabilistic) ASP as well as existing probabilistic approaches to ASP is the possibility to annotate any formula with point or interval (i.e., imprecise) probabilities (including formulas in full FOL syntax, albeit over finite domains of discourse only), while providing a hybrid set of inference approaches: in addition to general inference algorithms, this includes specialized, more scalable inference algorithms for cases where certain optional assumptions hold (in particular, mutual independence of probabilistic events). As we will show later, it can even make sense to combine different such algorithms (each of which can on its own obtain valid results if its specific prerequisites are fulfilled).
The remainder of this paper is organized as follows: the next section presents related work. Section 3 describes the syntax and semantics of the formal framework. Section 4 describes approximate inference algorithms, and Section 5 provides initial evaluation results. Section 6 concludes.
2 Related Work

Approaches related to PrASP include [18, 8, 3-6, 14, 13, 15], which support probabilistic inference based on monotonic reasoning, and [9, 1, 17, 2], which are based on non-monotonic logic programming. Like P-log [1], our approach computes probability distributions over answer sets (that is, possible worlds are identified with answer sets). However, P-log as well as [17] do not allow for annotating arbitrary formulas (including FOL formulas) with probabilities. [2] allows probabilities to be associated with abducibles (only) and both rules and probabilistic weights to be learned from given data (in the form of literals). Again, PrASP does not impose such restrictions on probabilistic annotations or example data. On the other hand, PrASP cannot make use of abduction for learning. Various less closely related approaches to probabilistic reasoning exist (either not based on logic programming at all, or not in the realm of non-monotonic logic programming): Stochastic Logic Programs (SLP) [8] are an influential approach where sets of rules in the form of range-restricted clauses can be labeled with probabilities. Parameter learning for SLPs is approached in [3] using the EM algorithm. Approaches which combine concepts from Bayesian network theory with relational modeling and learning are, e.g., [4-6]. Probabilistic Relational Models (PRM) [4] can be seen as relational counterparts to Bayesian networks. In contrast to these, our approach does not directly relate to graphical models such as Bayesian or Markov networks but works on arbitrary possible worlds which are generated by ASP solvers in the form of stable models (answer sets). ProbLog [14] allows for probabilistic facts, annotated disjunctions and definite clauses, and approaches to probabilistic rule and parameter learning (from interpretations) also exist for ProbLog. ProbLog builds upon the Distribution Semantics approach introduced for PRISM [18], which is also used by other influential approaches, such as Independent Choice Logic (ICL) [13]. Another important approach outside the area of ASP are Markov Logic Networks (MLN) [15]. A Markov Logic Network consists of first-order formulas annotated with weights (which are, in contrast to PrASP, not in general probabilities). MLNs are used as templates for the construction of Markov networks. The (ground) Markov network generated from the MLN then determines a probability distribution over possible worlds, with inference performed using weighted SAT solving (which is related to but different from ASP). MLNs are syntactically roughly similar to the logic programs in our framework (where weighted formulas can also be seen as soft or hard constraints for possible worlds).

3 Syntax and Semantics

In this section, we briefly describe the formal language and its semantics. Compared to [10], the syntax of PrASP programs has been extended (in particular by allowing interval and non-ground weights) and a variety of approximate inference algorithms have been added (see next section) to the default inference approach which is described below and which still underlies the formal semantics of PrASP programs.
PrASP is a Nilsson-style [12] probabilistic logic language. Let Φ be a set of function, predicate and object symbols and L(Φ) a first-order language over Φ with the usual connectives (including both strong negation "-" and default negation "not") and first-order quantifiers.
It can be assumed that this language covers both ASP and FOL syntax (ASP "specialties" such as choice constructs can be seen as syntactic sugar which we omit here in order to keep things simple). A PrASP program (background knowledge) is a non-empty finite set $\Lambda = \{[l_i;u_i]\,f_i\} \cup \{[l_i;u_i|c_i]\,f_i\} \cup \{\mathit{indep}(\{f_1^i,\dots,f_n^i\})\}$ of annotated formulas (each concluded by a dot) and optional independence constraints (PrASP does not require an independence assumption but optionally makes use of declared or automatically discovered independence). $[l;u]\,f$ asserts that the imprecise probability of f lies within the interval [l, u] (i.e., $l \le Pr(f) \le u$), whereas $[l;u|c]\,f$ states that the probability of f conditioned on formula c lies within the interval [l, u] ($l \le Pr(f|c) \le u$).
Formulas can be non-ground (including existentially or universally quantified variables in FOL formulas). For the purpose of this paper, weights need to be ground (real numbers); however, the prototype implementation also allows for certain non-ground weights. An independence constraint $\mathit{indep}(\{f_1^i,\dots,f_n^i\})$ specifies that the set of formulas $\{f_1^i,\dots,f_n^i\}$ is mutually independent in the probabilistic sense (independence can also be discovered by PrASP by analyzing the background knowledge, but this is computationally more costly).
If the weight of a formula is omitted, [1; 1] is assumed. Point probability weights [p] are translated into weights of the form [p; p] (analogously for conditional probabilities). Weighted formulas can intuitively be seen as constraints which specify which possible worlds (in the form of answer sets) are indeed possible, and with which probability. w(f) denotes the weight of formula f. The f_i and c_i are formulas either in FOL syntax (supported by means of a transformation into ASP syntax described in [7]) or in plain AnsProlog syntax, e.g., [0.5] win :- coin(heads). Informally, every FOL formula or program with FOL formulas results in a set of ASP formulas. The precise AnsProlog syntax depends on the external ASP grounder employed by PrASP - in principle, any grounder could be used. The current prototype implementation has been tested with Gringo/Clingo 3 and 4 (http://potassco.sourceforge.net).
The semantics of PrASP is defined in terms of probability distributions over possible worlds which are identified with answer sets (models) - an assumption inspired by P-log [1]. Let $M = (D, \Theta, \pi, \mu)$ be a probability structure, where D is a finite discrete domain of objects, Θ is a non-empty set of possible worlds, π is a function which assigns to the symbols in Φ predicates, functions and objects over/from D, and $\mu = (\mu^l, \mu^u)$ is a pair of discrete probability functions over Θ, a PrASP program and a query formula, as defined further below. Each possible world is a Herbrand interpretation over Φ. Since we will use answer sets (i.e., stable models of a (disjunctive) answer set program) as possible worlds, it will come in handy to define Γ(a) as the set of all answer sets of answer set program a.
We define a (non-probabilistic) satisfaction relation between possible worlds and unannotated programs as follows: let Λ⁻ be an unannotated program and lp a transformation which transforms such a program (which might contain formulas in first-order logic syntax in addition to formulas in ASP syntax) into a disjunctive program. The details of this transformation are outside the scope of this paper and can be found in [7].
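For illustration, a hypothetical program fragment combining these constructs could read as follows (the atoms and weights here are invented for this example and are not part of the benchmark programs discussed later):

[0.7] rain.
[0.2;0.4] sprinkler.
[0.9|rain] wet.
indep({rain, sprinkler}).

Here, rain has the precise probability 0.7, the probability of sprinkler is only known to lie within [0.2, 0.4], wet has probability 0.9 conditioned on rain, and the indep declaration states that rain and sprinkler are mutually independent.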
Given this transformation, $(M, \theta) \models_\Theta \Lambda^-$ iff $\theta \in \Gamma(lp(\Lambda^-))$ and $\theta \in \Theta$. For a disjunctive program ψ, we define $(M, \theta) \models_\Theta \psi$ iff $\theta \in \Gamma(\psi)$ and $\theta \in \Theta$.
To do groundwork for the computation of a probability distribution over possible worlds Θ from a given PrASP program, we define a (non-probabilistic) satisfaction relation between possible worlds and unannotated formulas: let φ be a PrASP formula (without weight) and θ be a possible world. Furthermore, let $(M, \theta) \models_\Lambda \phi$ iff $(M, \theta) \models_\Theta \rho(\Lambda) \cup lp(\phi)$ and $\Theta = \Gamma(\rho(\Lambda))$ (we say formula φ is true in possible world θ). We abbreviate $(M, \theta) \models_\Lambda \phi$ as $\theta \models_\Lambda \phi$ when M is given by the context. At this, the spanning program ρ(Λ) of PrASP program Λ is a non-probabilistic disjunctive program (without independence constraints) generated by removing all weights and transforming each formerly weighted formula f into a disjunction f | not f, where "not" stands for default negation. Informally, the spanning program represents the uncertain but unweighted beliefs of the knowledge engineer or agent. With Γ(a) as defined above, the set of possible worlds deemed possible according to existing belief ρ(Λ) is denoted as Γ(ρ(Λ)).
We define the minimizing parameterized probability distribution $\mu^l(\Lambda, \Theta, q)$ over a set $\Theta = \{\theta_1, \dots, \theta_m\} = \Gamma(\rho(\Lambda))$ of answer sets (possible worlds), a PrASP program $\Lambda = \{[p_i]f_i, i = 1..n\} \cup \{[p_i|c_i]f_i^c\} \cup \{\mathit{indep}(\{f_1^i, \dots, f_k^i\})\}$ and a query formula q as $\{\theta_i \mapsto Pr(\theta_i) : \theta_i \in \Theta\}$, where $(Pr(\theta_1), \dots, Pr(\theta_m))$ is any solution of the following system of inequalities (constraints) such that 1) $Pr^l(q) = \sum_{\theta_i \in \Theta:\, \theta_i \models_\Lambda q} Pr(\theta_i)$ is minimized and 2) the distribution has maximum entropy [19] among all solutions which minimize the said sum. Analogously, $\mu^u$ denotes a maximum entropy probability distribution such that $Pr(\theta_1), \dots, Pr(\theta_m)$ maximize $Pr^u(q) = \sum_{\theta_i \in \Theta:\, \theta_i \models_\Lambda q} Pr(\theta_i)$.

$$l(f_1) \le \sum_{\theta_i \in \Theta:\, \theta_i \models_\Lambda f_1} Pr(\theta_i) \le u(f_1) \quad \cdots \quad l(f_n) \le \sum_{\theta_i \in \Theta:\, \theta_i \models_\Lambda f_n} Pr(\theta_i) \le u(f_n) \quad (1)$$

$$\sum_{\theta_i \in \Theta} Pr(\theta_i) = 1 \quad (2)$$

$$\forall \theta_i \in \Theta : 0 \le Pr(\theta_i) \le 1 \quad (3)$$

At this, $l(f_i)$ and $u(f_i)$ denote the lower and upper endpoints of the probability interval (imprecise probability) of unconditional formula $f_i$ (analogously for the interval endpoints $l(f_i^c|c_i)$ and $u(f_i^c|c_i)$ of conditional probabilities). In addition, any indep-declaration $\mathit{indep}(F^i)$ in the program induces, for every subset $\{f_1^i, \dots, f_r^i\} \subseteq F^i$, $r > 1$, constraints of the following form:

$$\prod_{k=1..r} l(f_k^i) \;\le \sum_{\theta_j \in \Theta:\, \theta_j \models_\Lambda \bigwedge_{k=1..r} f_k^i} Pr(\theta_j) \;\le\; \prod_{k=1..r} u(f_k^i)$$

In the case of point (i.e., precise) probabilities, these encode $Pr(\bigwedge_{k=1..r} f_k^i) = \prod_{k=1..r} Pr(f_k^i)$.
Furthermore, any conditional probability formula $[p_i|c_i]f_i^c$ in the program induces constraints ensuring $l(f_i^c|c_i) \le Pr(f_i^c|c_i) \le u(f_i^c|c_i)$ (with $p_i = [l(f_i^c|c_i); u(f_i^c|c_i)]$), namely

$$\sum_{\theta_j \in \Theta} Pr(\theta_j)\,\nu(\theta_j, f_i^c \wedge c_i) - \sum_{\theta_j \in \Theta} l(f_i^c|c_i)\,Pr(\theta_j)\,\nu(\theta_j, c_i) \ge 0$$
$$\sum_{\theta_j \in \Theta} Pr(\theta_j)\,\nu(\theta_j, f_i^c \wedge c_i) - \sum_{\theta_j \in \Theta} u(f_i^c|c_i)\,Pr(\theta_j)\,\nu(\theta_j, c_i) \le 0$$

At this, we define $\nu(\theta, f) = 1$ if $\theta \models_\Lambda f$ and $\nu(\theta, f) = 0$ otherwise.
For small systems, PrASP can compute minimizing and maximizing probability distributions directly from the inequalities above using linear programming, and a maximum entropy solution amongst a number of candidate distributions (solutions of an underdetermined system) can be discovered using gradient descent. However, to make distribution finding tractable, we need to use different algorithms, as described in the next section.
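To make the linear system concrete, the following minimal Python sketch (ours, not PrASP's actual implementation) sets up constraints (1)-(3) for the tiny program [0.4;0.6] heads., whose spanning program heads | not heads has the two possible worlds θ1 = {heads} and θ2 = ∅, and computes the query interval for heads with an off-the-shelf LP solver:

# Minimal sketch of the LP encoding (1)-(3); not PrASP code.
import numpy as np
from scipy.optimize import linprog

worlds = [{"heads"}, set()]      # Gamma(rho(Lambda)): the two answer sets

def nu(world, atom):             # nu(theta, f) for an atomic formula f
    return 1.0 if atom in world else 0.0

# Constraint (1) for the single weighted formula: 0.4 <= Pr(heads) <= 0.6,
# written as two "<=" rows for linprog (A_ub @ x <= b_ub).
a = np.array([nu(w, "heads") for w in worlds])
A_ub, b_ub = np.vstack([a, -a]), np.array([0.6, -0.4])
A_eq, b_eq = np.ones((1, 2)), np.array([1.0])    # constraint (2): sum = 1

# Constraint (3) via bounds; minimize/maximize Pr(q) = a @ x for q = heads.
lo = linprog(a,  A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 2)
hi = linprog(-a, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 2)
print(lo.fun, -hi.fun)           # -> 0.4 0.6, i.e., [Pr_l(heads), Pr_u(heads)]

Minimizing and maximizing the same objective over the feasible distributions recovers exactly the query interval [0.4, 0.6]; selecting a maximum entropy solution among the optima would require an additional gradient-based step, as described above.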
In practice, the inequalities system above thus serves mainly as a means to define the semantics of PrASP formulas. Finally, marginal inference results are obtained as follows: the result of a query of the form [?] q is defined as the interval $[Pr^l(q), Pr^u(q)]$, and the result of conditional queries of the form [?|c] f (which stands for Pr(f|c), where c is some evidence) is computed using $Pr(f \wedge c)/Pr(c)$.
An example PrASP program:

coin(1..10).
[0.4;0.6] coin_out(1,heads).
[[0.5]] coin_out(N,heads) :- coin(N), N != 1.
1{coin_out(N,heads), coin_out(N,tails)}1 :- coin(N).
n_win :- coin_out(N,tails), coin(N).
win :- not n_win.
[0.8|win] happy.
:- happy, not win.

The line starting with [[0.5]] is syntactic sugar for a set of weighted rules in which variable N is instantiated with all its possible values (i.e., [0.5] coin_out(2,heads) :- coin(2), 2 != 1 and [0.5] coin_out(3,heads) :- coin(3), 3 != 1, etc.). It would also be possible to use [0.5] as the annotation of this rule, in which case the weight 0.5 would specify the probability of the entire non-ground formula instead. 1{coin_out(N,heads), coin_out(N,tails)}1 (Gringo AnsProlog syntax) denotes that a coin comes up with either heads or tails, but not both.
Our system accepts query formulas in the format [?] a, which asks PrASP for the marginal probability of a, and [?|b] a, which computes the conditional probability Pr(a|b). E.g., the query [?|coin_out(2,tails)] happy results in [0;0].

4 Sampling and Inference Algorithms

PrASP (as a software system) contains a variety of exact and approximate inference algorithms which can be partially combined in a hybrid fashion. Using command line options, the user selects a pipeline of alternative pruning (simplification), sampling and inference steps (depending on the nature and complexity of the respective problem). E.g., the user might choose to sample possible worlds from a near-uniform distribution and to pass on the resulting models to a simulated annealing algorithm which computes a probability distribution over the sampled possible worlds. Finally, this distribution is used to compute the conditional or marginal probabilities of the query formulas. The inference algorithms available in the current prototype (version 0.7) of PrASP are:

Linear programming Direct solution of the linear inequalities system described before. Precise and very fast for very small systems, intractable otherwise.
Various answer set sampling algorithms for so-called initial sampling These can in some cases be used directly for inference, by computing a distribution which complies with the constraints (linear system) described before. An exemplary such algorithm is Algorithm 1. Alternatively, they can be followed by another inference algorithm (simulated annealing or iterative refinement, see below) which corrects the initial distribution computed by initial sampling.
Parallel simulated annealing This approach (Algorithm 2) performs simulated annealing for inference problems where no assumptions can be made about independence or other properties of the program (except consistency). It can be used either stand-alone or in a hybrid combination with an initial sampling stage (e.g., Algorithm 1).
Iterative refinement An adaptation of the inference algorithm described in [16], with guaranteed minimal Kullback-Leibler divergence to the uniform distribution (i.e., maximum entropy).
Direct counting Weights are transformed into unweighted formulas, and queries are then solved by mere counting of models (see [10] for details).

Most of our algorithms rely heavily on near-uniform sampling, either using the randomization provided by the respective external ASP solver (fast but typically of rather low quality, i.e., weakly uniform) or using so-called XOR constraints as described in [10] (which provides higher sampling quality at the expense of speed). Of PrASP's inference algorithms, we describe one of the initial sampling algorithms (Algorithm 1) and parallel simulated annealing (Algorithm 2). An interesting property of the first algorithm is its ability to provide a suitable distribution over possible worlds directly if all weighted formulas in the PrASP program are mutually independent (analogously to the independence assumption typically made by distribution semantics-based approaches). Algo. 2 can be used stand-alone or subsequently to Algo. 1: in the latter case, the probability distribution computed by initial sampling (with replacement) is used as the initial distribution, which is then refined by simulated annealing until all constraints (given probabilities) are fulfilled. The benefit of this pipelined approach to inference is that the user (knowledge engineer) doesn't need to know about event independence - if the uncertain formulas in the program are independent, initial sampling already provides a valid distribution and the subsequent simulated annealing stage completes almost immediately. Otherwise, simulated annealing "repairs" the insufficient distribution computed by the initial sampling stage.
Concretely, Algo. 1 samples answer sets and computes a probability distribution over these models which reflects the weights provided in the PrASP program, provided that all uncertain formulas in the program describe a mutually independent set of events. Other user-provided constraints (such as conditional probabilities in the PrASP program) are ignored here. Also, Algo. 1 does not guarantee that the solution has maximum entropy.

Algorithm 1 Sampling from models of the spanning program (point probabilities only)
Require: max number of samples n; set of uncertain formulas uf = {[w(uf_i)] uf_i with 0 < w(uf_i) < 1}; set of certain formulas cf = {cf_i : w(cf_i) = 1} (i.e., with probability 1)
1: i ← 1
2: for i ≤ |uf| do
3:   r_i ← random element of Sym({1, ..., n}) (the permutations of {1, ..., n})
4:   i ← i + 1
5: end for
6: m ← ∅, j ← 1
7: parfor j ∈ {1, ..., n} do
8:   p ← ∅, k ← 1
9:   for k ≤ |uf| do
10:     if r_k(j) ≤ n · w(uf_k) then p ← p ∪ {uf_k} else p ← p ∪ {¬uf_k} end if
11:     k ← k + 1
12:   end for
13:   s ← model sampled uniformly from the models of program cf ∪ p (∅ if UNSAT)
14:   m ← m ⊎ {s}
15: end parfor
Ensure: Multiset m contains samples from all answer sets of the spanning program such that
16: ∀uf_i : w(uf_i) ≈ |{s ∈ m : s ⊨ uf_i}| / |m| iff the set uf is mutually independent.
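To convey the core idea of Algorithm 1 outside of pseudocode, here is a minimal Python sketch (ours; the helper answer_sets stands in for the call to the external ASP solver and is assumed to return the list of answer sets of the given program, with formulas represented as plain strings):

import random

def initial_samples(n, uf, cf, answer_sets):
    # uf: dict mapping each uncertain formula to its weight w (0 < w < 1);
    # cf: list of certain formulas; returns the multiset m of Algorithm 1.
    # One random permutation of {0,...,n-1} per uncertain formula, so that
    # each formula f gets asserted in roughly n*w(f) of the n samples.
    perm = {f: random.sample(range(n), n) for f in uf}
    models = []
    for j in range(n):
        p = [f if perm[f][j] < n * w else "not " + f for f, w in uf.items()]
        candidates = answer_sets(list(cf) + p)   # models of cf united with p
        if candidates:                           # skip UNSAT combinations
            models.append(random.choice(candidates))  # uniform pick
    return models

The weight of each uncertain formula is then approximated by its frequency in the returned multiset, which, as stated above, only yields a valid distribution if the uncertain formulas are mutually independent.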
Algorithm 2 presents the approach PrASP uses for approximate inference using a parallel form of simulated annealing (which does not require event independence). The initial list of samples initSamples is computed by an initial sampling approach such as Algo. 1.

Algorithm 2 Inference by parallel simulated annealing, Part 1. We show only the basic variant for non-conditional formulas with point weights. The extension to conditional probabilities and interval probabilities (imprecise probabilities) is straightforward.
Require: maxTime, maxEnergy, initTemp, initSamples (from, e.g., Algo. 1; initSamples is a multiset which encodes a probability distribution via frequencies of models), frozen, degreeOfParallelization, α, F (set of weighted formulas), Λ (PrASP program)
1: s ← initSamples, k ← 0, temp ← initTemp
2: e ← energy(s)
3: while k ≤ maxTime ∧ temp ≥ frozen do
4:   parfor i ← 1, degreeOfParallelization do
5:     s″_i ← s ⊎ sampleStep(samplingMethod)
6:   end parfor
7:   s′ ← argmin_{s″_i} (energy(s″_1), ..., energy(s″_n)), with n = degreeOfParallelization
8:   e′ ← energy(s′)
9:   if e′ < e ∨ rand[0,1) < e^(−(e′−e)/temp) then
10:     s ← s′
11:     e ← e′
12:   end if
13:   temp ← temp · α
14:   k ← k + 1
15: end while
Ensure: Multiset s = (pw, frq) = µ_approx(Λ) approximates the probability distribution µ(Λ) = Pr(Γ(ρ(Λ))) over the set pw = {pw_i} = Γ(ρ(Λ)) of possible worlds by {Pr(pw_i) ≈ frq(pw_i)/|s|}.
16: function energy(s)
17:   parfor f_i ∈ F do
18:     freq_{f_i} ← |{{s′ ∈ s : s′ ⊨_Λ f_i}}| / |s|
19:   end parfor
    return √(Σ_{f_i ∈ F} (freq_{f_i} − weight_{f_i})²)
20: end function
21: ▷ (Continued in Part 2 below.)

In addition to the actual inference algorithms, it is often beneficial to let PrASP remove (prune) all parts of the spanning program which cannot influence the probabilities of the query formulas. The approach to this is straightforward (program dependency analysis) and is therefore omitted here. We will show in the evaluation section how such simplification affects inference performance.

Algorithm 2 Inference by parallel simulated annealing, Part 2.
22: function sampleStep(samplingMethod) ▷ The framework provides various configurable methods for the simulated annealing sampling step, of which we show here only one.
23:   F′ ← ∅
24:   for f_i ∈ F do
25:     if rand[0,1) < weight_{f_i} then
26:       F′ ← F′ ∪ {f_i}
27:     else
28:       F′ ← F′ ∪ {¬f_i}
29:     end if
30:   end for
    return answerSets(F′) (might be ∅)
31: end function

While for space-related reasons this paper covers deductive inference only, PrASP also supports induction (learning of weights of hypotheses from example data); please refer to [10] for details.

5 Experiments

The main goal of PrASP is not to outperform existing approaches in terms of speed but to provide a flexible, scalable and highly configurable framework which puts as few restrictions as possible on what users can express in terms of (non-monotonic) certain and uncertain beliefs, while being competitive with more specialized inference approaches if the respective conditions (like event independence) are met.
For our first experiment, we model a coin game (a slightly simplified variant of the example code shown before): a number of coins are tossed, and the game is won if a certain subset of all coins comes up with "heads". The inference task is the approximation of the winning probability. In addition, another random subset of coins are magically dependent on each other, and one of the coins is biased (the probability of "heads" is 0.6). Despite its simplicity, this scenario shows how inference copes with independent as well as dependent uncertain facts, and how effective the pruning approach of the respective framework is (since winning depends only on a subset of coins). Also, inference complexity clearly scales with the number of coins. In PrASP syntax, such a partially randomly generated program looks, e.g., as follows (adaptation to MLN or ProbLog syntax is straightforward):

coin(1..8).
[0.6] coin_out(1,heads).
[[0.5]] coin_out(N,heads) :- coin(N), N != 1.
1{coin_out(N,heads), coin_out(N,tails)}1 :- coin(N).
win :- 2{coin_out(3,heads),coin_out(4,heads)}2.
coin_out(4,heads) :- coin_out(6,heads).

The inference algorithm used is initial sampling (Algo. 1) followed by simulated annealing (Algo. 2). Using Algo. 1, we computed 100 random models (number of samples n), which is sufficient to obtain a precision of +/-0.01 for the query probabilities. The winning subset of coins and the subset of mutually dependent coins (from which a rule of the form coin_out(a,heads) :- coin_out(b,heads), coin_out(c,heads), ... is generated) are each a random set with 25% of the size of the respective full set of coins.
"PrASP 0.7.2 simp" in Fig. 1 stands for results obtained with pruning (i.e., parts of the program on which the query result cannot depend have been automatically removed). We also report the results obtained solely using Algorithm 2 ("noinit" in Fig. 1), in order to see whether the initial sampling stage provides any benefit here. The simulated annealing parameters were maxEnergy = 0.15, initTemp = 5, frozen = 10^-150, α = 0.85. We compared the performance (duration as a function of the number of coins (x-axis), with a minimum of 18 coins) of the current prototype of PrASP with that of Tuffy 0.3 (http://i.stanford.edu/hazy/hazy/tuffy/), a recent implementation of Markov Logic Networks which uses a database system in order to increase scalability, and ProbLog2 2.1 (https://dtai.cs.kuleuven.be/problog/) (despite the random dependencies). Times are in milliseconds, obtained using an i7 4-core processor at 3.4 GHz over five trials.

[Fig. 1. Biased coins game: duration (ms) vs. number of coins for ProbLog2 2.1, MLN (Tuffy 0.3), PrASP 0.7.2, PrASP 0.7.2 simp and PrASP 0.7.2 noinit.]

ProbLog2 and Tuffy scale very well here, with some additional time required by Tuffy, probably due to the overhead introduced by external database operations. With PrASP, we observe that priming simulated annealing with an initial sampling step (which would give inaccurate results if used standalone) improves performance massively, whereas pruning appears to suffer from the additional overhead introduced by the required dependency analysis of the logic program. Our hypothesis regarding this behavior is that the initial distribution over possible worlds is, albeit not perfect, quite close to the accurate distribution, so that the subsequent simulated annealing task takes off a lot faster compared to starting from scratch (i.e., from the uniform distribution).
The next experiment shows how PrASP copes with a more realistic benchmark task - a form of the well-known friends-and-smokers problem [15] - which can be tractably approached using Algorithm 1 alone, since the independence assumption is met (which also makes it suitable for ProbLog). On the other hand, the rules are more complex. In this benchmark scenario, a randomly chosen number of persons are friends, a randomly chosen subset of all people smoke, there is a certain probability of being stressed ([[0.3]] stress(X)), it is assumed that stress leads to smoking (smokes(X) :- stress(X)) and that some friends influence each other with a certain probability ([[0.2]] influences(X,Y)), in particular with regard to their smoking behavior (smokes(X) :- friend(X,Y), influences(Y,X), smokes(Y)). With a certain probability, smoking leads to asthma ([[0.4]] h(X) and asthma(X) :- smokes(X), h(X)).
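Assembled into a single program, a hypothetical small instance of this scenario could look as follows in PrASP syntax (the person/1 guards and the concrete friend/2 facts are our own additions for illustration; the actual benchmark instances are generated randomly):

person(1..3).
friend(1,2). friend(2,3).
[[0.3]] stress(X) :- person(X).
smokes(X) :- stress(X).
[[0.2]] influences(X,Y) :- friend(X,Y).
smokes(X) :- friend(X,Y), influences(Y,X), smokes(Y).
[[0.4]] h(X) :- person(X).
asthma(X) :- smokes(X), h(X).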
The query consists of [?] asthma(X) for each person X.

[Fig. 2. Smokers social network: duration (ms) vs. number of persons for ProbLog2, MLN (Tuffy 0.3), PrASP 0.7.2 (10 models) and PrASP 0.7.2 (100 models).]

The results (Fig. 2) have been averaged over five trials. Again, ProbLog2 scores best in this scenario. PrASP, using Algorithm 1 (since all uncertain facts in this scenario are mutually independent), does quite well for most episodes but loses to ProbLog2. Tuffy does very well below 212 persons, then performance breaks down massively for unknown reasons (possibly due to some internal cache overflow). For technical reasons, we couldn't get the smokers scenario working with the publicly available current implementation of P-log (we received segmentation faults which couldn't be resolved), but experiments with the examples coming with this software seem to indicate that this approach also scales fine.

6 Conclusion

We have presented a new software framework for uncertainty reasoning and parameter estimation based on Answer Set Programming. In contrast to most other approaches to probabilistic logic programming, the philosophy of PrASP is to provide a very expressive formal language (ASP or full FOL syntax over finite domains for formulas annotated with precise as well as imprecise probabilities) on the one hand, and a variety of inference algorithms which are able to take advantage of certain problem domains that facilitate "fast track" reasoning and learning (in particular, inference in the presence of formula independence) on the other. We see the main benefit of our framework, besides its support for non-monotonic reasoning, thus in its semantically rich and configurable uncertainty reasoning approach, which allows various sampling and inference approaches to be combined in a pipeline-like fashion. Ongoing work focuses on additional experiments, the integration of further inference algorithms, and the direct integration of an ASP solver into PrASP, in order to avoid expensive calls of external reasoning tools. Another area of ongoing work is the support for so-called annotated disjunctions [20].

Sponsored by SFI grant no. SFI/12/RC/2289.

References

1. Baral, C., Gelfond, M., Rushton, N.: Probabilistic reasoning with answer sets. Theory Pract. Log. Program. 9(1), 57-144 (2009)
2. Corapi, D., Sykes, D., Inoue, K., Russo, A.: Probabilistic rule learning in nonmonotonic domains. In: Procs. 12th International Conference on Computational Logic in Multi-Agent Systems (CLIMA'11), pp. 243-258. Springer-Verlag, Berlin, Heidelberg (2011), http://dl.acm.org/citation.cfm?id=2044543.2044565
3. Cussens, J.: Parameter estimation in stochastic logic programs. Machine Learning 44(3) (2001)
4. Friedman, N., Getoor, L., Koller, D., Pfeffer, A.: Learning probabilistic relational models. In: IJCAI, pp. 1300-1309. Springer-Verlag (1999)
5. Kersting, K., Raedt, L.D.: Bayesian logic programs. In: Proceedings of the 10th International Conference on Inductive Logic Programming (2000)
6. Laskey, K.B., Costa, P.C.: Of klingons and starships: Bayesian logic for the 23rd century. In: Procs. of the 21st Conf. on Uncertainty in Artificial Intelligence (2005)
7. Lee, J., Palla, R.: System f2lp - computing answer sets of first-order formulas. In: Erdem, E., Lin, F., Schaub, T. (eds.) LPNMR. Lecture Notes in Computer Science, vol. 5753, pp. 515-521. Springer (2009)
8.
Muggleton, S.: Learning stochastic logic programs. Electron. Trans. Artif. Intell. 4(B), 141-153 (2000)
9. Ng, R.T., Subrahmanian, V.S.: Stable semantics for probabilistic deductive databases. Inf. Comput. 110(1), 42-83 (1994)
10. Nickles, M., Mileo, A.: Probabilistic inductive logic programming based on answer set programming. In: 15th Int'l Workshop on Non-Monotonic Reasoning (NMR'14) (2014)
11. Nickles, M., Mileo, A.: A system for probabilistic inductive answer set programming. In: 9th International Conference on Scalable Uncertainty Management (SUM'15) (2015, to appear)
12. Nilsson, N.J.: Probabilistic logic. Artificial Intelligence 28(1), 71-87 (1986)
13. Poole, D.: The independent choice logic for modelling multiple agents under uncertainty. Artificial Intelligence 94, 7-56 (1997)
14. Raedt, L.D., Kimmig, A., Toivonen, H.: ProbLog: A probabilistic Prolog and its application in link discovery. In: IJCAI, pp. 2462-2467 (2007)
15. Richardson, M., Domingos, P.: Markov logic networks. Mach. Learning 62(1-2), 107-136 (February 2006), http://dx.doi.org/10.1007/s10994-006-5833-1
16. Rödder, W., Meyer, C.: Coherent knowledge processing at maximum entropy by SPIRIT. In: Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI'96) (1996)
17. Saad, E., Pontelli, E.: Hybrid probabilistic logic programming with non-monotonic negation. In: Twenty-First International Conference on Logic Programming. Springer-Verlag (2005)
18. Sato, T., Kameya, Y.: PRISM: a language for symbolic-statistical modeling. In: Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI'97), pp. 1330-1335 (1997)
19. Thimm, M., Kern-Isberner, G.: On probabilistic inference in relational conditional logics. Logic Journal of the IGPL 20(5), 872-908 (2012)
20. Vennekens, J., Verbaeten, S., Bruynooghe, M.: Logic programs with annotated disjunctions. In: Demoen, B., Lifschitz, V. (eds.) Logic Programming. Lecture Notes in Computer Science, vol. 3132, pp. 431-445. Springer Berlin Heidelberg (2004)