=Paper= {{Paper |id=Vol-1661/paper-05 |storemode=property |title=Notes on the Implementation of FAM |pdfUrl=https://ceur-ws.org/Vol-1661/paper-05.pdf |volume=Vol-1661 |authors=Nicos Angelopoulos |dblpUrl=https://dblp.org/rec/conf/ilp/Angelopoulos16 }} ==Notes on the Implementation of FAM== https://ceur-ws.org/Vol-1661/paper-05.pdf
         Notes on the implementation of FAM

                               Nicos Angelopoulos

        The Welcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK
                     nicos.angelopoulos@sanger.ac.uk



      Abstract. We revisit the FAM algorithm with the aim of clarifying
      the inner working of the algorithm with respect to implementing it. We
      provide a step through the original presentation, clarifying issues that
      are of interest to implementors of the specific algorithm as well as to
      implementors of other EM algorithms in PLP frameworks. In addition,
      this paper presents P epl, an implementation of the algorithm in Prolog.
      We describe three different alternatives to dealing with the expressions
      needed at the core of the algorithm. A number of example programs
      provided with P epl are explained and the application of P epl to user
      specified programs is discussed.


1   Introduction
Failure adjusted maximization (F AM) [4] is an expectation-maxi-mization (EM )
algorithm originally proposed in the context of stochastic logic programs (S LPs),
[7,8]. As the name suggests, F AM extends the EM framework to account for
failed derivation paths in S LPs [4]. The algorithm provides a closed-form formu-
lation for computing the parameter weights within EM ’s iterative maximization
approach. It has been shown to be applicable to normalised S LPs, [4], which
is a wide class of stochastic programs. The principle of adjusting for failure as
introduced in F AM, has also been applied to the PRISM system [9].
    P epl, which stands for parameter estimation in Prolog, is an implementation
of the failure adjusted maximisation algorithm for S LPs. S LPs are an extension
of logic programs where arithmetic labels are attached to clausal definitions.
S LPs have well defined log linear semantics and an extensive analysis of the
effect of backtracking strategies on these semantics, [2,3].
    The original formulation of F AM although presented in a clear and concise
mathematical language can be challenging for implementors, particularly for
programmers who are new to probabilistic logic programming. Here, we expose
some of the intricacies of the algorithm with emphasis on practical challenges in
implementing it.
    Furthermore, we present a specific implementation of F AM, P epl, which
has been implemented in Prolog. Key aspects of the implementation such as
the transformation of labelled clauses to Prolog and alternative methods for
computing the counts used in the closed-form calculation, are also described.
The implementation is tested against the examples from the F AM paper as well
as on other probabilistic programs.
                                                                                      47

   The remainder of the paper is structured as follows: Section 2 presents the
algorithm with emphasis on implementation details, Section 3 describes P epl
and Section 4 discusses a number of examples. Section 5 concludes the paper.


2    Programming F AM

Here we present some reading notes for Failure Adjusted Maximisation (F AM)
algorithm on Stochastic Logic Programs as presented in [4]. The emphasis is in
making some of the intricacies more apparent with respect to implementing the
algorithm. The reader is expected to be familiar with Logic Programming jargon
and have some appreciation of the S LPs syntax. Here we briefly introduce some
basic S LP terminology.
    S LPs are logic programs which include clauses that are labelled with arith-
metic values. Predicates are defined as either logical, or as labelled in which case
all the clauses in the predicate’s definition should be labelled. The following two
examples show how an unbiased coin (left) and a recursive predicate (right) can
be coded.

    1/2 : coin(head).                1/3 : member( H, [H|T] ).
    1/2 : coin(tail).                2/3 : member( El, [H|T] ) :-
                                                     member( El, T ).

     A pure S LP is stochastic logic program that does not contain non-labelled
predicates. In contrast, an impure program contains, and potentially has call
dependencies between, labelled and non-labelled predicates. A normalised pred-
icate is a predicate defined by labelled clauses for which the sum of the labels
sum to one. F AM is a parameter estimation algorithm that learns the best values
of the clausal labels from a data.
     We prepend objects in the original F AM paper by ML-. Also, we use a top-
down approach in presenting the algorithm. Thus we start with the definition in
ML-Definition 10 repeated here in Figure 1. (Slightly modified from the origi-
nal.)
     The objective of this algorithm is, when given the S LP program (S) in Fig. 2
(ML-Fig.9 ) and data (y) Fig. 3 (part of ML-Table II ) to produce the best set
of λ1 , . . . λ6 (where λi = (logli )).
     λi is the label, or parameter of clause Ci and a positive number (here we
mainly consider 0 < λi ≤ 1).
     Intuitively speaking, we want to find λ̂ = hλˆ1 , . . . , λˆ6 i which will reproduce
the data according to the input frequencies. For example, if we run goal |? −
s(A, B) 120 times, and against the program in Fig. 2 (its parameters changed to
λ̂) then we would like the results to have frequencies close to the ones in Table 1.
     Assuming of course that we replace the top-to-bottom exhaustive backtrack-
able clause matching of Prolog by a proportional to label, non-backtrackable,
choice amongst the matching clauses.
48

 . Let h = 0, and λ(0) such that Zλ(0) > 0.
 . For each parameterised clause Ci , compute ψλ(h) [νi | y] ψλ(h) [νi | y] using 1 (ML-
    Eq.8 ).
                                               (h)
 . For each parameterised clause Ci let Si be the sum of the expected counts
      (h)
    ψλ [νi0 | y] for all the clauses Ci0 such that Ci+0 shares the predicate symbol
                                        +

    as Ci .
                                           (h)                  (h)
 . For each parameterised clause Ci , if Si = 0 then lih+1 = li otherwise
                                                          (h)
                                            (h+1)       ψλ [νi | y]
                                           li       =           (h)
                                                            Si

 . Set h ← h + 1 and go to  unless λ(h + 1) has converged.


                                    Fig. 1. The FAM Algorithm.

l1 : s(X,p) :- p(X), p(X).                              l3 : p(a).          l5 : q(a).
l2 : s(X,q) :- q(X).                                    l4 : p(b).          l6 : q(b).

                                    Fig. 2. The Example S LP S.


2.1     Expectations
FAM is an EM algorithm. An EM algorithm tries to maximise the likelihood of
the data by selecting the appropriate parameters. In SLPs the probability we try
to maximise is P (y|λ(h) ). This is, the probability of the observed data (in the
given frequencies) given the current parameters.
    Since an SLP’s parameters are its clausal probabilities, FAM works on the
expected contribution a particular clause has in derivations, with relation to the
data at hand. This is ψλ(h) [νi | y] and was decomposed in ML-Eq.8 (here 1 in
Fig. 1)
                              t−1
                              X
           ψλ(h) [νi | y] =         Nk ψλ(h) [νi | yk ] + N (Zλ−1
                                                                (h) − 1)ψλ(h) [νi | f ail]   (1)
                              k=1


   The first part corresponds to refutations while the second term to failed
derivations. Broadly speaking Eq. 1 gathers together the contributions of a par-
1
     the first part of the LHS of (1) has been removed here, since it was never used in
     the implementation, where we view unambiguous yields as a standard sub-case of
     the ambiguous case.


      k 1      2      3      4
Atom yk s(a,p) s(b,p) s(a,q) s(b,q)
Count Nk 4     2      3      3

                                     Fig. 3. The Example Data.
                                                                                                 49

Atom          s(a,p) s(b,p) s(a,q) s(b,q)
Should appear 40     20     30     30

                     Table 1. Example Distribution for |? − s(A, B).



ticular clause (Ci ) to derivations against (a) the program, (b) the current param-
eters and (c) the data. All the constituents of Eq. 1 are listed in the Glossary.
The most important components are (a) Zλ(h) the probability of success, (b)
ψλ(h) [νi | yk ] the expected number of times Ci was used in refutations yield-
ing yk , and (c) ψλ(h) [ν | f ail] the expected contribution of the clause to failed
derivations. Let R the set of refutation, with Rk the refutations producing yk
and F the set of failed derivations. Then (b) and (c) are :
                                     X                      X
                  ψλ(h) [νi | yk ] =    ψλ(h) (r) · νi (r)/   ψλ(h) (r)
                                      r∈Rk                          r∈Rk
                                      X                                   X
             ψλ(h) [νi | f ail] =               ψλ(h) (f ) · νi (f )/               ψλ(h) (f )
                                    f ∈Rf ail                           f ∈Rf ail

      We note that              X
                                        ψλ(h) (r) = ψλ(h) (yk )
                               r∈Rk
                                    X
                                            ψλ(h) (f ) = 1 − Z
                                f ∈Rf ail



2.2     Sampling versus exact counting
There are two main methods for computing (a) and R, Rk and F . Firstly, with
the exact method, and secondly with sampling.
   Exact computations involve finding all derivations and calculating the various
counts to exact figures. Probability of success in this case,
                             X          X           X
                     Zλ(h) =     ψ(r)/      ψ(r) +       ψ(f )
                                    r∈R            r∈R            f ∈F

Using this formulation gives us some extra millage, when it comes to impure
S LPs, which mix labelled and unlabelled (non-probabilistic) clauses. Strictly
speaking, ML-Sect.3., Defn.6 states that derivations
                                                   P in same P   equivalence class
should have the
              P   same yield.
                           P  This  ensures
                                         P   that    r∈RP ψ(r) +   f ∈F ψ(f ) = 1.
In such cases r∈R ψ(r)/ r∈R ψ(r) + f ∈F ψ(f ) = r∈R ψ(r) which is what
would FAM normally use.
    A problem arises when considering infinite computations. When there is at
least one infinite branch then it is impossible to manipulate all derivations.
One possible solution is to place a significance limit () deeming any derivation
with probability <  as failure. In the absence of infinite branches, one might
50

be able to generate graphs representing all possible derivations, thus making
substantial time savings as there will not be a need to revisit derivations at each
iteration. Note that we expect it will be very hard to combine significance limit
and derivation as graphs.
    Sampling, on the other hand, estimates the various expectations, by means
of running a number of independent trials and noting the results. The process
is repeated at each iteration with the updated λ. 2 In this case,
                                        P
                                          r∈R P (r)
                        Zλ(h) = P              P                                (2)
                                  r∈R P (r) +    f ∈F P (f )

   Similarly to the exact case, a significance limit can be set to deal with infinite
computation. On the other hand, it is not likely that we can efficiently employ
graphs.
   With sampling it is more straight forward to use the bag of samples to cal-
culate ψλ(h) [νi | yk ] . Let Sk be the set of all the samples yielding yk , then
                                                P
                                                       νi (s)
                              ψλ(h) [νi | yk ] = s∈Sk
                                                   | Sk |
      Since for s ∈ Sk ,
                                    1
                                         = ψλ(h) [s | yk ]
                                  | Sk |
    Note that sampling seems to only work for pure, normalised SLPs. In contrast
by using Z of 2 exact counting seems to also cope with impure SLPs.
    Finally, one can store expressions rather than re-proving each yk at every
iteration. Storing the expressions is an alternative backend, that provides the
same values. In terms of the theory of F AM, there is no difference to when the
values are provided by exact counting.
P In particular, we can store the expressions for ψλ(h) [νi | y], ψλ(h) (f ail), and
   r∈R ψ(r). The objectives of such an approach are:

 – immediate gains of not having to prove each yk every time,
 – to use either graph reduction techniques over the SLD-tree, or polynomial
   simplifications, to derived expressions, that are faster to compute.


2.3      Log Likelihood

The FAM algorithm of Fig. 1 terminates when there is no substantial progress
to be had. This is the case when the log-likelihood of the data given the current
parameters is improved less than some  within two iterations. The log-likelihood
is 3                              X
                      logLλ (y) =     Nk log(ψλ (yk | true))
                                       k

2
     strictly speaking, in our sampling R, Rk and F are not sets but bug collections
3
     it is also worth considering ψλ (yk | y)
                                                                               51

     In the case of sampling we have :

                           ψλ (yk | true) =| Rk | / | R |

since sampling may not yield all yk , the above will lead to logLλ (y) = −inf
when | Rk |= 0. Thus in that case we let ψλ (yk | true) = 1/ | R |2
    For the exact case
                                         P
                                                 ψλ (r)
                        ψλ (yk | true) = r∈Rk
                                              Zλ



3     Pepl

P epl is an implementation of the F AM algorithm. It consists of a set of Prolog
predicates that can be used on S LPs to learn the clausal parameters from data
and do sampling from S LPs. It is available for two Prolog systems: Yap [1] and
SWI-Prolog [10]. For the latter system, P epl is provided as a prepackaged library
4
  that can be easily installed from within SWI-Prolog.

?- pack install(pepl).
?- library(pepl).
 A simple example is provided which can be ran with:

?- [main].
?- main.

This example runs five iterations on the example presented in [4] (sources in
slp/jc ml pe.slp). The default is to use exact counting (equiv. ?- main exact.).
To run the same example with sampling or stored expressions counting, use ?-
main sample. and ?- main store. respectively.
    Stochastic clauses are term expanded to standard Prolog ones. Unique identi-
fiers and a path argument are added to the transformation of stochastic clauses.
These are used to identify the path of each derivation. In addition failure paths
are also recorded by term expansion techniques. The system provides three ways
for computing the counts needed for the closed-form calculation: exact, sample
and store. The first method is the straight forward approach where all solutions
to the target goal are quarried at each iterative step. Sampling approximates
the counts by only sampling from the target. The expressions associated with
the exact computation can be stored as term structures of arithmetic expression
that can be evaluated at each iteration with fresh instantiations of the labels.
This trades space for speed, making the computation much faster by requiring
larger amounts of memory.
4
    http://www.swi-prolog.org/pack/list?p=pepl
52

3.1   Available predicates
 sload pe(SlpSource),
    Load an SLP to memory. If the source file has .slp as its file extension then
    this may be omitted. P epl looks for SlpSource in directories ., and ./slp/
    In SWI-Prolog it also looks in pack(pepl/slp/).

 sls/0
    Listing of the stochastic program currently in memory.

 ssave(FileName)
    Save the stochastic program currently in memory to a file.

 fam(Options)
   Run F AM on the program loaded in memory, with a number of options
   passed as a list that may include the following terms:
    – count(CountMeth), CountMeth in {*exact*, store, sample};
    – times(Tms), default is Tms = 1000 (only relevant with CountMeth=sample);
    – termin(TermList), currently TermList knows about the following terms
         • *interactive*- ask user if another iteration should be run,
         • iter(I)- I is the number of iterations,
         • prm e(p )- parameter difference between iterations. If the change in
            each and all pararameters between two iterations is less than p then
            the algorithm terminates due to convergence of the parameters
         • ll e(λ )- likelihood convergence limit;
    – goal (Goal), the top goal, defaults to an version of the data predicate
       with all its arguments replaced by free variables;
    – pregoal (PreGoal), a goal that is called only once, before experiments are
       run. The intuition is that PreGoal will partially instantiate Goal.
    – data(Data), the data to use, overrides datafile/1. Data should be a
       list of Yield-Times pairs. (All Yields of Goal should be included in Data,
       even if that means some get Times = 0.)
    – prior (Prior), the distribution to replace the probability labels with. De-
       fault is that no prior is used, Prior=none and input parameters are used
       as given in Slp source file. System also knows about uniform and random.
       Any other distribution should come in Prolog source file named Prior.pl
       and define Prior/3 predicate. First argument is a list of ranges (Beg-End)
       for each stochastic predicate in source file. Second argument, is the list
       of actual probability labels in source file. Finally, third argument should
       be instantiated to the list of labels according to Prior.
    – datafile(DataFile), the data file to use, default is SLP data.pl. DataFile
       should have either a number of atomic formulae or a single formula of
       the form: frequencies(Data).+
    – complement(Complement), one of : none (with P rbSc = P rbT rue, the
       default), success (with P rbSc = 1 − P rbF ail), or quotient (with
       P rbSc = P rbT rue/(P rbT rue + P rbF ail)).
                                                                               53

    – setrand (SetRand), sets random seeds. SetRand = true sets the seeds
      to some random triplet while the default SetRand = false, does not
      set them. Any other value for SetRand is taken to be of the form rand
      (S1,S2,S3) as expected by system predicate random of the supported
      Prolog systems.
    – eps(Eps), the depth Epsilon. Sets the probability limit under which P epl
      considers a path as a failed one.
    – write iterations(Wrt) indicates which set of parameters to output. Values
      for Wrt are: all, which is the default, last, and none.
    – write ll (Bool) takes a boolean argument, indicating where log-likelihoods
      should be printed or not. Default is true.
    – debug(Dbg) should be set to on or off (later is the default). If on, various
      information about intermediate calculations will be printed.
    – return(RetOpts), a list of return options, default is the empty list. The
      term RetOpts contain variables which will be instantiated to the appro-
      priate values signified by the name of each corresponding term. Recog-
      nised are, initial pps/1 for the initial parameters, final pps/ for the
      final/learned parameters, termin/1 for the terminating reason, ll/1 for
      the last log-likelihood calculated, iter/1 for the number of iterations
      performed, and seeds/1 for the seeds used.
    – keep pl (KeepBool, if true, the temporary Prolog file that contains the
      translated SLP, is not deleted. Default is false.
    – exception(Handle), identifies the action to be taken if an exception is
      raised while running fam/1. The default value for Handle is rerun. This
      means the same Fam call is executed repeatedly. Any other value for
      Handle will cause execution to abort after printing the exception raised.

switch dbg(+Switch)
  Switch debugging of fam/1 to either on or off.

scall(Goal,Eps,Meth,Path,Succ,BrPrb)
  This is a predicate for people interested in the internals, and should only be
  used by experienced users. The following are the arguments to this call:
    – The vanilla Prolog Goal to call.
    – The value of Eps(ilon) at which branches are to be considered as failures.
    – The search Meth(od) to be used, i.e. all for all solutions or sample for
      a single solution.
    – The Path(s) of the derivation(s).
    – A flag indicating a Succ(essful) derivation or otherwise-Succ is bound
      to the atom fail if this was a failed derivation and remains unbound
      otherwise.
    – BrPrb the branch probability of the derivation.
  See predicate main gen/1, in examples/main scfg.pl for example usage.
54

 all paths(+SlpFile,+Call)
    Display to standard output all derivation paths and plenty of information
    associated with calling stochastic goal Call on the Slp defined in +SlpFile.



3.2   Running PE on your own SLPs


Since FAM is an instance of the EM algorithm, initial values for the parameters
must be supplied. Also, since FAM is only a parameter estimation algorithm, the
structure of the SLP must be given. In our implementation the user composes
an SLP with the appropriate structure and labels the clauses in this SLP with
the initial values of the parameters. The first step in running FAM is to load this
SLP. Suppose the SLP with the initial parameters were saved in the file foo.slp;
this SLP is loaded using sload pe/1 (detailed description in Section 3.1):
?- [pepl].
...
?- sload pe(foo).
yes
?-

    To run the EM algorithm we need a target sample space, comprising from ob-
servables, and the expected number each observable should appear. Data should
be represented by a Prolog file of atomic formulae or a single formula of the form
: frequencies( Freqs ). where Freqs is a list of Datum-Times pairs. In the
former case atomic formulae can be of arbitrary format but they should share
common predicate name and arity. This is the same predicate name and arity
for the top goal in the user defined SLP. The intuition is that each formula is a
point sample in the target sample space to which we wish to fit the parameters
of a given SLP. When frequencies( Freqs ) is used, Datum should be as the for-
mulae just described, while Times should be the times each Datum appears in
the target sample space. The target sample space can be passed to fam/1 either
with the option data file/1, with its argument pointing to a data file as de-
scribed above, or with the option data/1, with its argument being a frequencies
list (Freqs, above). The two different ways to pass the target sample space and
the two formats of the data file option are shown in Table 2
    Assume foo_data.pl is the Prolog file containing the training data. To run
FAM with default settings, do:
?- fam([]).
(see Section 3.1 for how to change these settings). To save the S LP with the
estimated parameters to a file fitted foo.slp, just do:
?- ssave(fitted foo).
File fitted foo.slp is created in current directory.
    To run fam without first loading the S LP to memory, call
?- fam([slp(foo)]).
                                                                                55

data([s(a,p)-4,s(a,q)-3,s(b,p)-2,s(b,q)-3])
                    (a)
s(a,p). s(a,q).
s(b,p). s(b,q).
s(a,p). s(a,q). frequencies([s(a,p)-4,s(a,q)-3,s(b,p)-2,s(b,q)-3]).
s(b,p). s(b,q).
s(a,p). s(a,q).
s(b,q). s(a,p).

    (b)                            (c)
Table 2. Three alternative ways to pass the target sample space to FAM. Top, (a),
using the data/1 option. Bottom left, (b), one format for datafiles in data file/1
option; order of terms is not important. Bottom right, (c), the alternative format,
using a single term.



4     Examples
A number of pre-canned examples are included in the P epl sources within the
directory examples. Standard runs of F AM are wrapped in simple calls that
can be invoked from the Prolog prompt.

4.1   Palindrome context free grammar


0.3 :: s −→ [a], s, [a].
0.2 :: s −→ [b], s, [b].
0.1 :: s −→ [a],[a].
0.4 :: s −→ [b],[b].

                      Fig. 4. The palindromic grammar example.


    A usual trick to check that the parameter estimation software works is to
generate (sample) N data points from an slp according to some initial set of
parameters, then change this set to some generic values (often setting these to a
uniform distribution) and then proceed to guessing the original parameters. An
example of how this can be done in this implementation is in run/main_scfg.pl
    For Yap:
% cd examples
% yap
?- [main csfg].
?- main.
    For Swi:
% swipl
?- [pack( ’pepl/examples/main csfg’ )].
    This example also illustrates :
56

 (a) a situation where failure  is important.
 (b) how to produce N samples (main gen/1 ).
   Again, default is main_exact and main_store with main_sample are also
provided.

4.2     Bloodtype PRISM example



 bloodtype(a) :- genotype(a,a).
 bloodtype(a) :- genotype(a,o).
 bloodtype(a) :- genotype(o,a).
 bloodtype(b) :- genotype(b,b).
 bloodtype(b) :- genotype(b,o).
 bloodtype(b) :- genotype(o,b).
 bloodtype(o) :- genotype(o,o).
bloodtype(ab) :- genotype(a,b).
bloodtype(ab) :- genotype(b,a).
genotype(X,Y) :- gene(X),gene(Y).

1/3 :: gene( a ).
1/3 :: gene( b ).
1/3 :: gene( o ).

                           Fig. 5. Blood type example.


   The example from the Prism web-site5 can also be seen in action. The corre-
sponding S LP can be found in the slp directory of the distribution: slp/prism bt.

% cd run
% prolog (where prolog is in {yap,swipl}).
?- [main prism bt].
?- main exact.
or
?- main store.
or
?- main sample.
after each of the above main calls, you can test accuracy by
?- test(10000).

   Frequency of ground successful goals should match the frequency of Data.
Note that FAM, in theoretic terms, is not strictly speaking applicable to this
example since it does not observe the equivalence class criterion. However, in
practice, the results of the algorithm are correct.
5
     http://sato-www.cs.titech.ac.jp/prism/overview-e.html
                                                                              57

5    Conclusions
We presented an extensive new reading of the FAM algorithm which may be
of value to implementors of EM algorithms in PLP languages. In tandem we
detailed the inner workings of an easy-to-install Prolog based implementation
of the algorithm. P epl comes with a number of canned examples that are col-
lected from the original FAM paper and other literature sources. Furthermore
we detailed a novel approach to re-calculating the arithmetic expressions neces-
sary in the estimation calculations. This is via storing variable versions of the
expressions, copies of which are instantiated at each iteration. Future work on
the system can include tabled inference which has been shown to be successful in
other PLP systems, [5,6]. The system described here is available as open source
from: http://stoics.org.uk/~nicos/sware/ and as an SWI-Prolog package
at http://swi-prolog.org/pack/list?p=pepl.


Acknowledgements
Dr James Cussens helped with many clarifications and endless iterations of ex-
planations while working on Pepl.


Glossary
Logic Programming
derivation   Sequence of goals produced by applying resolution steps on an ini-
             tial goal G0 and program P, ending on either true or false, p. 48.
refutation   A derivation ending in true., p. 48.

Indices
h            Scripts the algorithm’s current iteration.
i            Index for clauses.
k            Index for data.

Symbols
Ci           The ith clause. For example C4 in 2 is l4 : p(b), p. 48.
F            Set (or list) of all failed derivations, p. 49.
Nk           The number of times datum k occured in the observed data. For
             example in Fig. 3 N2 = 2.            P
N            The number of observed data (= k Nk ). For example in Fig. 3 N
             = 12.
Rk           Set (or list) of all refutations, yielding yk , p. 49.
R            Set (or list) of all refutations, p. 49.
Z            Probability of success.
58

νi (d)         Times Ci appeared in derivation (d).
li             The parameter of the ith clause. Also referred to, as clausal prob-
               ability or label.
λi             The log of the ith clause parameter.
λ              The set of the clause log parameters (= hλ1 , . . . , . . . , λn i)., p. 50.
y              The observed data against which FAM tries to learn the true pa-
               rameters., p. 47.
x0ι            The ιth unabiguous refutation.
ψλ(h) [νi | y] Expected times Ci appears in R, p. 48.
              Significance level probability., p. 49.


References
 1. Vı́tor Santos Costa, Ricardo Rocha, and Luı́s Damas. The YAP Prolog system.
    Theory and Practice of Logic Programming, 12:5–34, 1 2012.
 2. James Cussens. Loglinear models for first-order probabilistic reasoning. In UAI-99,
    1999.
 3. James Cussens. Stochastic logic programs: Sampling, inference and applications.
    In UAI’2000, pages 115–122, 2000.
 4. James Cussens. Parameter estimation in stochastic logic programs. Machine Learn-
    ing, 44(3):245–271, 2001.
 5. Yoshitaka Kameya, Taisuke Sato, and Neng-Fa Zhou. Yet more efficient EM learn-
    ing for parameterized logic programs by inter-goal sharing. In Proceedings of the
    16th Eureopean Conference on Artificial Intelligence, ECAI’2004, including Presti-
    gious Applicants of Intelligent Systems, PAIS 2004, Valencia, Spain, August 22-27,
    2004, pages 490–494, 2004.
 6. Angelika Kimmig, Vı́tor Santos Costa, Ricardo Rocha, Bart Demoen, and Luc De
    Raedt. On the efficient execution of problog programs. In Logic Program-
    ming, 24th International Conference, ICLP 2008, Udine, Italy, December 9-13
    2008, Proceedings, pages 175–189, 2008. doi: 10.1007/978-3-540-89982-2 22. URL
    http://dx.doi.org/10.1007/978-3-540-89982-2_22.
 7. Stephen Muggleton. Stochastic logic programs. In L. de Raedt, editor, Advances
    in Inductinve Logic Programming, pages 254–264. IOS Press, 1996.
 8. Stephen Muggleton. Semantics and derivations for SLPs. In Workshop on Fussion
    of Domain Knowledge with Data for Decision Support, 2000.
 9. Taisuke Sato, Yoshitaka Kameya, and Neng-Fa. Zhou. Generative modeling with
    failure in PRISM. In IJCAI’2005, page 847852, 2005.
10. Jan Wielemaker, Tom Schrijvers, Markus Triska, and Torbjörn Lager. SWI-Prolog.
    Theory and Practice of Logic Programming, 12(1-2):67–96, 2012.