<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Probabilistic Abduction Engine for Media Interpretation based on Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oliver Gries</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ralf Möller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anahita Nafissi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maurice Rosenfeld</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kamil Sokolski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Wessel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Hamburg University of Technology</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>For multimedia interpretation, and in particular for the combined interpretation of information coming from different modalities, a semantically well-founded formalization is required in the context of an agent-based scenario. Low-level percepts, which are represented symbolically, define the observations of an agent, and interpretations of content are defined as explanations for the observations. We propose an abduction-based formalism that uses description logics for the ontology and Horn rules for defining the space of hypotheses for explanations (i.e., the space of possible interpretations of media content), and we use Markov logic to define the motivation for the agent to generate explanations on the one hand, and for ranking different explanations on the other. This work has been funded by the European Community with the project CASAM (Contract FP7-217061 CASAM) and by the German Science Foundation with the project PRESINT (DFG MO 801/1-1).</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>For multimedia interpretation in the context of an agent-based scenario, and for the combined interpretation of information coming from different modalities in particular, a semantically well-founded formalization is required. Low-level percepts, which are represented symbolically, define the observations of an agent w.r.t. some content, and interpretations of the content are defined as explanations for the observations.</p>
      <p>We propose an abduction-based formalism that uses description logics for the ontology and Horn rules for defining the space of hypotheses for explanations (i.e., the space of possible interpretations of media content). Additionally, we propose the use of Markov logic to define the motivation for the agent to generate explanations on the one hand, and for ranking different explanations on the other.</p>
      <p>Based on a presentation of the most important preliminaries in Section 2, the abduction and interpretation procedures are discussed in detail in Section 3, and optimization techniques for the probabilistic abduction engine are pointed out. In Section 4, a complete example is given, showing the main approach using intermediate steps. Section 7 summarizes this paper.</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>In this chapter, the most important preliminaries are specified in order to make this document self-contained. For specifying the ontology used to describe low-level analysis results as well as high-level interpretation results, a less expressive description logic is applied to facilitate fast computations. We decided to represent the domain knowledge with the DL ALHf(D) (restricted attributive concept language with role hierarchies, functional roles, and concrete domains). We shortly describe our nomenclature below; for details see [Baader et al., 2003].</p>
      <p>In logic-based approaches, atomic representation units have to be specified. The atomic representation units are fixed using a so-called signature. A DL signature is a tuple S = (CN, RN, IN), where CN = {A1, ..., An} is the set of concept names (denoting sets of domain objects) and RN = {R1, ..., Rm} is the set of role names (denoting relations between domain objects). The signature also contains a component IN indicating a set of individuals (names for domain objects).</p>
      <p>In order to relate concept names and role names to each other (terminological knowledge) and to talk about specific individuals (assertional knowledge), a knowledge base has to be specified. An ALHf knowledge base Σ = (T, A), defined with respect to a signature S, is comprised of a terminological component T (called Tbox) and an assertional component A (called Abox). In the following we just write Σ if the signature is clear from the context. A Tbox is a set of so-called axioms, which are restricted to the following forms in ALHf:</p>
      <p>(I) Subsumption: A1 ⊑ A2, R1 ⊑ R2
(II) Disjointness: A1 ⊑ ¬A2
(III) Domain and range restrictions for roles: ∃R.⊤ ⊑ A, ⊤ ⊑ ∀R.A
(IV) Functional restrictions on roles: ⊤ ⊑ (≤ 1 R)
(V) Local range restrictions for roles: A1 ⊑ ∀R.A2
(VI) Definitions with value restrictions: A ≡ A′ ⊓ ∀R1.A1 ⊓ ... ⊓ ∀Rn.An</p>
      <p>With axioms of form (I), concept (role) names can be declared to be subconcepts (subroles) of each other. Axioms of form (II) denote disjointness between concepts. Axioms of type (III) introduce domain and range restrictions for roles. Axioms of form (IV) introduce so-called functional restrictions on roles, and axioms of type (V) specify local range restrictions (using value restrictions, see below). With axioms of kind (VI), so-called definitions (with necessary and sufficient conditions) can be specified for concept names found on the left-hand side of the ≡ sign. In the axioms, so-called concepts are used. Concepts are concept names or expressions of the form ⊤ (anything), ⊥ (nothing), ¬A (atomic negation), (≤ 1 R) (role functionality), ∃R.⊤ (limited existential restriction), ∀R.A (value restriction), and (C1 ⊓ ... ⊓ Cn) (concept conjunction).</p>
      <sec id="sec-2-1">
        <title>Aboxes</title>
        <p>Knowledge about individuals is represented in the Abox part of Σ. An Abox A is a set of expressions of the form A(a) or R(a, b) (concept assertions and role assertions, respectively), where A stands for a concept name, R stands for a role name, and a, b stand for individuals. Aboxes can also contain equality assertions (a = b) and inequality assertions (a ≠ b). We say that the unique name assumption (UNA) is applied if a ≠ b is added for all pairs of distinct individuals a and b.</p>
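        <p>As a concrete illustration (our own sketch, not part of the paper's formalism; the helper name is hypothetical), the UNA can be materialized by generating one inequality assertion per pair of distinct individual names:</p>

```python
from itertools import combinations

def apply_una(individuals):
    """Materialize the unique name assumption: add an inequality
    assertion a != b (encoded as a tuple) for every pair of
    distinct individual names."""
    return {(a, "!=", b) for a, b in combinations(sorted(individuals), 2)}

assertions = apply_una({"ind1", "ind2", "ind3"})
# 3 individuals yield C(3, 2) = 3 inequality assertions
```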
      </sec>
      <sec id="sec-2-2">
        <title>Semantics of ALHf</title>
        <p>In order to understand the notion of logical entailment, we introduce the semantics of ALHf. In DLs such as ALHf, the semantics is defined with interpretations I = (Δᴵ, ·ᴵ), where Δᴵ is a nonempty set of domain objects (called the domain of I) and ·ᴵ is an interpretation function which maps individuals to objects of the domain (aᴵ ∈ Δᴵ), atomic concepts to subsets of the domain (Aᴵ ⊆ Δᴵ), and roles to subsets of the cartesian product of the domain (Rᴵ ⊆ Δᴵ × Δᴵ). The interpretation of arbitrary ALHf concepts is then defined by extending ·ᴵ to all ALHf concept constructors as follows:</p>
        <p>⊤ᴵ = Δᴵ
⊥ᴵ = ∅
(¬A)ᴵ = Δᴵ \ Aᴵ
(≤ 1 R)ᴵ = {u ∈ Δᴵ | ∀v1, v2 [((u, v1) ∈ Rᴵ ∧ (u, v2) ∈ Rᴵ) → v1 = v2]}
(∃R.⊤)ᴵ = {u ∈ Δᴵ | ∃v [(u, v) ∈ Rᴵ]}
(∀R.C)ᴵ = {u ∈ Δᴵ | ∀v [(u, v) ∈ Rᴵ → v ∈ Cᴵ]}
(C1 ⊓ ... ⊓ Cn)ᴵ = C1ᴵ ∩ ... ∩ Cnᴵ</p>
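        <p>The set-theoretic definitions above can be evaluated directly over a finite interpretation. The following minimal Python sketch (illustrative only; function names and the example interpretation are our own) computes the extensions of the value restriction and the limited existential restriction:</p>

```python
def value_restriction(delta, R, C):
    """(forall R.C)^I: domain objects all of whose R-successors lie
    in C^I; objects without successors qualify vacuously."""
    return {u for u in delta
            if all(v in C for (u2, v) in R if u2 == u)}

def exists_restriction(delta, R):
    """(exists R.Top)^I: domain objects with at least one R-successor."""
    return {u for u in delta if any(u2 == u for (u2, v) in R)}

# Toy interpretation: domain {1, 2, 3}, R^I = {(1,2), (1,3), (2,3)}, C^I = {2, 3}
delta, R, C = {1, 2, 3}, {(1, 2), (1, 3), (2, 3)}, {2, 3}
# 3 has no R-successors, so it is (vacuously) in the value restriction
```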
      </sec>
      <sec id="sec-2-3">
        <title>Satisfiability and Entailment</title>
        <p>In the following, the satisfiability conditions for axioms and assertions of an ALHf knowledge base Σ = (T, A) in an interpretation I are defined. A concept inclusion C ⊑ D (concept definition C ≡ D) is satisfied in I if Cᴵ ⊆ Dᴵ (resp. Cᴵ = Dᴵ), and a role inclusion R ⊑ S (role definition R ≡ S) if Rᴵ ⊆ Sᴵ (resp. Rᴵ = Sᴵ). Similarly, assertions C(a) and R(a, b) are satisfied in I if aᴵ ∈ Cᴵ resp. (aᴵ, bᴵ) ∈ Rᴵ. If an interpretation I satisfies all axioms of T resp. A, it is called a model of T resp. A. If it satisfies both T and A, it is called a model of Σ. Finally, if there is a model of Σ (i.e., a model for T and A), then Σ is called satisfiable. We are now able to define the entailment relation ⊨. A DL knowledge base Σ logically entails an assertion α (symbolically Σ ⊨ α) if α is satisfied in all models of Σ. For an Abox A, we say Σ ⊨ A if Σ ⊨ α for all α ∈ A.</p>
      </sec>
      <sec id="sec-2-5">
        <title>Substitutions, Queries, and Rules</title>
        <p>A variable is a name of the form ?String, where String is a string of characters from {A, ..., Z}. In the following definitions, we denote places where variables can appear with uppercase letters.</p>
      </sec>
      <sec id="sec-2-7">
        <title>Sequences and Variable Substitutions</title>
        <p>Let V be a set of variables, and let X, Y1, ..., Yn be sequences ⟨...⟩ of variables from V. z denotes a sequence of individuals. We consider sequences of length 1 or 2 only, if not indicated otherwise, and assume that an atom Q(⟨X⟩) is to be read as Q(X), Q(⟨X, Y⟩) as Q(X, Y), etc. Furthermore, we assume that sequences are automatically flattened. A function as_set turns a sequence into a set in the obvious way.</p>
        <p>A variable substitution σ = [X → i, Y → j, ...] is a mapping from variables to individuals. The application of a variable substitution σ to a sequence of variables ⟨X⟩ or ⟨X, Y⟩ is defined as ⟨σ(X)⟩ or ⟨σ(X), σ(Y)⟩, respectively, with σ(X) = i and σ(Y) = j. In this case, a sequence of individuals is defined. If a substitution σ is applied to a variable X for which there exists no mapping X → k in σ, then the result is undefined. A variable substitution for which all required mappings are defined is called admissible (w.r.t. the context).</p>
        <p>Grounded Conjunctive Queries Let X, Y1, ..., Yn be sequences of variables, and let Q1, ..., Qn denote concept or role names. A query is defined by the following syntax:</p>
        <p>{(X) | Q1(Y1), ..., Qn(Yn)}</p>
        <p>The sequence X may be of arbitrary length, but all variables mentioned in X must also appear in at least one of the Y1, ..., Yn: as_set(X) ⊆ as_set(Y1) ∪ ... ∪ as_set(Yn).</p>
        <p>Informally speaking, Q1(Y1), ..., Qn(Yn) defines a conjunction of so-called query atoms Qi(Yi). The list of variables to the left of the sign | is called the head, and the atoms to the right of | are called the query body. The variables in the head are called distinguished variables. They define the query result. The variables that appear only in the body are called non-distinguished variables and are existentially quantified.</p>
        <p>Answering a query with respect to a knowledge base Σ means finding admissible variable substitutions σ such that Σ ⊨ {σ(Q1(Y1)), ..., σ(Qn(Yn))}. We say that a variable substitution σ = [X → i, Y → j, ...] introduces bindings i, j, ... for the variables X, Y, .... Given all possible variable substitutions σ, the result of a query is defined as {(σ(X))}. Note that the variable substitution is applied before checking whether Σ ⊨ {Q1(σ(Y1)), ..., Qn(σ(Yn))}, i.e., the query is grounded first.</p>
        <p>For a query {(?y) | Person(?x), hasParticipant(?y, ?x)} and the Abox A1 = {HighJump(ind1), Person(ind2), hasParticipant(ind1, ind2)}, the substitution [?x → ind2, ?y → ind1] allows for answering the query, and defines bindings for ?y and ?x.</p>
        <p>A boolean query is a query with X being of length zero. If for a boolean query there exists a variable substitution σ such that Σ ⊨ {σ(Q1(Y1)), ..., σ(Qn(Yn))} holds, we say that the query is answered with true; otherwise the answer is false.</p>
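        <p>The grounding-based query answering just described can be sketched in a few lines of Python (our own illustration; entailment is simplified to membership of ground atoms in the Abox, and the atom encoding is hypothetical). Running it on the Abox A1 from the example above recovers the binding for ?y:</p>

```python
from itertools import product

def answer_query(head, body, abox, individuals):
    """Answer a grounded conjunctive query by enumerating all
    variable substitutions sigma and checking the grounded body
    atoms against an Abox given as a set of ground atoms."""
    variables = sorted({v for atom in body for v in atom[1:] if v.startswith("?")})
    results = set()
    for values in product(sorted(individuals), repeat=len(variables)):
        sigma = dict(zip(variables, values))
        grounded = [(p, *[sigma.get(a, a) for a in args]) for (p, *args) in body]
        if all(atom in abox for atom in grounded):
            results.add(tuple(sigma[v] for v in head))
    return results

# Abox A1 and the query from the running example
abox = {("HighJump", "ind1"), ("Person", "ind2"),
        ("hasParticipant", "ind1", "ind2")}
bindings = answer_query(("?y",),
                        [("Person", "?x"), ("hasParticipant", "?y", "?x")],
                        abox, {"ind1", "ind2"})
# -> {("ind1",)}
```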
        <p>Later on, we will have to convert query atoms into Abox assertions. This is done with the function transform. The function transform applied to a set of query atoms {γ1, ..., γn} is defined as {transform(γ1, σ), ..., transform(γn, σ)}, where transform(P(X), σ) := P(σ(X)).</p>
        <p>Rules A rule r has the following form: P(X) ← Q1(Y1), ..., Qn(Yn), where P, Q1, ..., Qn denote concept or role names, with the additional restriction (safety condition) that as_set(X) ⊆ as_set(Y1) ∪ ... ∪ as_set(Yn).</p>
        <p>Rules are used to derive new Abox assertions, and we say that a rule r is applied to an Abox A. The function call apply(Σ, P(X) ← Q1(Y1), ..., Qn(Yn), A) returns a set of Abox assertions {σ(P(X))} if there exists an admissible variable substitution σ such that the answer to the query</p>
        <p>{() | Q1(σ(Y1)), ..., Qn(σ(Yn))}</p>
        <p>is true with respect to Σ ∪ A.¹ If no such σ can be found, the result of the call to apply(Σ, r, A) is the empty set. The application of a set of rules R = {r1, ..., rn} to an Abox is defined as follows:</p>
        <p>apply(Σ, R, A) = ⋃_{r ∈ R} apply(Σ, r, A)</p>
        <p>The result of forward_chain(Σ, R, A) is defined to be ∅ if apply(Σ, R, A) ∪ A = A holds. Otherwise the result of forward_chain is determined by the recursive call apply(Σ, R, A) ∪ forward_chain(Σ, R, A ∪ apply(Σ, R, A)).</p>
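        <p>The fixpoint behavior of rule application can be made concrete with a small sketch (our own illustration, not the paper's implementation: entailment is reduced to membership in the Abox, every argument position in a body atom is assumed to hold a variable, and the example rule is hypothetical):</p>

```python
from itertools import product

def apply_rule(rule, abox, individuals):
    """Simplified apply(Sigma, r, A): return the set of head
    instances {sigma(P(X))} for every substitution sigma that
    grounds all body atoms inside the Abox."""
    (head_pred, *head_vars), body = rule
    variables = sorted({v for atom in body for v in atom[1:]})
    derived = set()
    for values in product(sorted(individuals), repeat=len(variables)):
        sigma = dict(zip(variables, values))
        if all((p, *[sigma[a] for a in args]) in abox for (p, *args) in body):
            derived.add((head_pred, *[sigma[v] for v in head_vars]))
    return derived

def forward_chain(rules, abox, individuals):
    """Compute A together with forward_chain(Sigma, R, A): iterate
    rule application up to the least fixpoint."""
    while True:
        new = set().union(*(apply_rule(r, abox, individuals) for r in rules)) - abox
        if not new:
            return abox
        abox = abox | new

# Hypothetical rule: SportsEvent(?x) <- HighJump(?x)
rules = [(("SportsEvent", "?x"), [("HighJump", "?x")])]
closed = forward_chain(rules, {("HighJump", "ind1")}, {"ind1"})
# -> {("HighJump", "ind1"), ("SportsEvent", "ind1")}
```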
        <p>For some set of rules R we extend the entailment relation by specifying that (T, A) ⊨_R A′ iff (T, A ∪ forward_chain((T, ∅), R, A)) ⊨ A′.</p>
        <sec id="sec-2-7-1">
          <title>Probabilistic Knowledge Representation</title>
          <p>The basic notion of probabilistic knowledge representation formalisms is the so-called random experiment. A random variable X is a function assigning a value to the result of a random experiment. The random experiment itself is not represented, so random variables are functions without arguments, which return different values at different points in time. The possible values of a random variable comprise the so-called domain of the random variable. In the sequel, we will use boolean random variables, whose values can be either 1 or 0 (true or false, respectively).</p>
          <p>Let X⃗ = {X1, ..., Xn} be the ordered set of all random variables of a random experiment. An event (denoted X⃗ = x⃗) is an assignment X1 = x1, ..., Xn = xn to all random variables. In case n = 1 we call the event simple, otherwise the event is called complex. A certain vector of values x⃗ is referred to as a possible world. A possible world can be associated with a probability value, or probability for short. Hence, the notion of a possible world can be used as a synonym for an event, and depending on the context we use the former or the latter name. In the case of an event with a boolean random variable X, we write x as an abbreviation for X = true and ¬x as an abbreviation for X = false.</p>
          <p>Mappings of events to probabilities (or assignments of probabilities to events) are specified with so-called probability assertions of the following syntax: P(X⃗ = x⃗) = p, where X⃗ is a vector of random variables and p is a real value between 0 and 1 (it is assumed that the reader is familiar with Kolmogorov's axioms of probability). In the special case of a simple event (a single random variable, n = 1) we write P(X = x) = p. The probability value p of an event is denoted as P(X⃗ = x⃗) (or P(X = x) in the simple case). In its raw form, a set of probability assertions is called a probabilistic knowledge base (with signature X⃗).</p>
          <p>A mapping from the domain of a random variable X to probability values [0, 1] is called a distribution. For distributions we use the notation P(X). Distributions can be defined for (ordered) sets of random variables as well. In this case we use P(X1, ..., Xn) as a denotation for a mapping to the n-dimensional cross product of [0, 1]. For specifying a distribution, probability assertions for all domain values must be specified, and the values p must sum up to 1. In case all random variables of a random experiment are involved, we speak of a (full) joint probability distribution (JPD); otherwise the expression is said to denote a marginal distribution (a projection of the n-dimensional space of probability values to a lower-dimensional space with m dimensions). The expression P(X1, ..., Xm, Xm+1 = xm+1, ..., Xl = xl) denotes an m-dimensional distribution with known values xm+1, ..., xl. In slight misuse of notation, we sometimes write e⃗ for these known values (e stands for evidence). The fragment e⃗ need not necessarily be written at the end of the parameter list of P.</p>
          <p>¹ We slightly misuse notation in assuming (T, A) ∪ A′ = (T, A ∪ A′). If A′ ∪ A is inconsistent, the result is well-defined but useless. It will not be used afterwards.</p>
          <p>A conditional probability for a set of random variables X1, ..., Xm is denoted with P(X1 = x1, ..., Xm = xm | e⃗) or, in distribution form, we write P(X1, ..., Xm | e⃗) (conditional probability distribution). This distribution can also be written as P(X⃗, e⃗) / P(e⃗).</p>
          <p>For a probabilistic knowledge base, formal inference problems are defined. We restrict our attention to the two most relevant probabilistic inference problems: A conditional probability query is the computation of the joint distribution of a set of m random variables conditioned on e⃗ and is denoted with</p>
          <p>P_X(x1 ∧ ... ∧ xm | e⃗) = ?</p>
          <p>where vars(x1, ..., xm) ∩ vars(e⃗) = ∅ and vars(x1, ..., xm) ∪ vars(e⃗) ⊆ X, with vars specified in the obvious way. Note that xi indicates Xi = xi. The distribution form of the above query is</p>
          <p>P_X(X1, ..., Xm | e⃗) = ?</p>
          <p>If the set of random variables X is known from the context, the subscript X is often omitted.</p>
          <p>The Maximum A Posteriori (MAP) inference returns the most-likely state of the query atoms given the evidence. Based on MAP inference, the "most probable world" given the evidence is determined as a set of events. The MAP inference problem given a distribution for a set of random variables X is formalized as follows:</p>
          <p>MAP_X(e⃗) := e⃗ ∪ argmax_x⃗ P(x⃗ | e⃗)   (1)</p>
          <p>where vars(x⃗) ∩ vars(e⃗) = ∅ and vars(x⃗) ∪ vars(e⃗) = X.</p>
          <p>For both inference problems, conditional probability queries as well as the MAP problem, different kinds of algorithms exist, which possibly exploit additional assertions (such as, e.g., conditional independence assumptions in so-called Bayesian networks, or factored probability distribution specifications as in so-called Markov networks). In the next subsection, we focus on the latter formalism.</p>
        </sec>
        <sec id="sec-2-7-2">
          <title>Markov Logic</title>
          <p>The formalism of Markov logic [Domingos and Richardson, 2007] provides a means to combine the expressivity of first-order logic with the formalism of Markov networks [Pearl, 1988]. The Markov logic formalism uses first-order logic to define "templates" for constructing Markov networks. The basic notion for this is called a Markov logic network.</p>
          <p>A Markov logic network MLN = (F_MLN, W_MLN) consists of an ordered multiset of first-order formulas F_MLN = {F1, ..., Fm} and an ordered multiset of real-number weights W_MLN = {w1, ..., wm}.</p>
        </sec>
      </sec>
      <sec id="sec-2-8">
        <title>Weighted Formulas</title>
        <p>The association of a formula with its weight is by position in the ordered sets. For a formula F ∈ F_MLN with associated weight w we also write wF (weighted formula). Thus, a Markov logic network can also be defined as a set of weighted formulas. Both views can be used interchangeably. As a notational convenience, for ordered sets we nevertheless sometimes write X⃗, Y⃗ instead of X⃗ ∪ Y⃗.</p>
        <p>In contrast to standard first-order logics such as predicate logic, relational structures not satisfying a formula Fi are not ruled out as models. If a relational structure does not satisfy a formula associated with a large weight, it is just considered quite unlikely to be the "right" one.</p>
      </sec>
      <sec id="sec-2-9">
        <title>Groundings</title>
        <p>Let C = {c1, ..., cm} be the set of all constants mentioned in F_MLN. A grounding of a formula Fi ∈ F_MLN is a substitution of all variables in the matrix of Fi with constants from C. From all groundings, the (finite) set of grounded atomic formulas (also referred to as ground atoms) can be obtained. Grounding corresponds to a domain closure assumption. The motivation is to get rid of the quantifiers and reduce inference problems to the propositional case.</p>
        <p>Since a ground atom can either be true or false in an interpretation (or world), it can be considered
as a boolean random variable X. Consequently, for each MLN with associated random variables X~ ,
there is a set of possible worlds ~x. In this view, sets of ground atoms are sometimes used to denote
worlds. In this context, negated ground atoms correspond to false and non-negated ones to true. We
denote worlds using a sequence of (possibly negated) atoms.</p>
        <p>
          When a world ~x violates a weighted formula (does not satisfy the formula) the idea is to ensure
that this world is less probable rather than impossible as in predicate logic. Note that weights do not
directly correspond to probabilities
          <xref ref-type="bibr" rid="ref3">(see [Domingos and Richardson, 2007] for details)</xref>
          .
        </p>
        <p>For each possible world of a Markov logic network MLN = (F_MLN, W_MLN) there is a probability for its occurrence. Probabilistic knowledge is required to obtain this value. As usual, probabilistic knowledge is specified using a probability distribution. In the formalism of Markov networks, the full joint probability distribution of a Markov logic network MLN is specified in symbolic form as P_MLN(X⃗) = (P(X⃗ = x⃗1), ..., P(X⃗ = x⃗n)) for every possible x⃗i ∈ {true, false}^|X⃗|, and P(X⃗ = x⃗) := log_lin_MLN(x⃗) <xref ref-type="bibr" rid="ref3">(for a motivation of the log-linear form, see, e.g., [Domingos and Richardson, 2007])</xref>, with log_lin being defined as</p>
        <p>log_lin_MLN(x⃗) = (1/Z) exp(Σ_{i=1}^{|F_MLN|} w_i n_i(x⃗))</p>
        <p>According to this definition, the probability of a possible world x⃗ is determined by the exponential of the sum of the numbers of true groundings n_i(x⃗) of the formulas Fi ∈ F_MLN in x⃗, multiplied with their corresponding weights w_i ∈ W_MLN, and finally normalized with</p>
        <p>Z = Σ_{x⃗} exp(Σ_{i=1}^{|F_MLN|} w_i n_i(x⃗)),</p>
        <p>the sum of the unnormalized scores of all possible worlds. Thus, rather than specifying the full joint distribution directly in symbolic form as discussed before, in the Markov logic formalism the probabilistic knowledge is specified implicitly by the weights associated with formulas. Determining these formulas and their weights in a practical context is all but obvious, so machine learning techniques are usually employed for knowledge acquisition.</p>
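        <p>The log-linear definition can be made concrete for a tiny, fully grounded network (our own sketch, not part of the formalism; the example clause and all names are hypothetical). Each weighted formula is represented by its weight together with a function counting its true groundings in a world:</p>

```python
from itertools import product
from math import exp

def world_probabilities(ground_atoms, weighted_formulas):
    """P(X = x) = (1/Z) exp(sum_i w_i n_i(x)) for a tiny, fully
    grounded network. Each weighted formula is a pair (w_i, n_i),
    where n_i(world) counts its true groundings in that world."""
    atoms = sorted(ground_atoms)
    worlds = [dict(zip(atoms, vals))
              for vals in product([False, True], repeat=len(atoms))]
    scores = [exp(sum(w * n(world) for w, n in weighted_formulas))
              for world in worlds]
    Z = sum(scores)  # normalization over all possible worlds
    return [(world, score / Z) for world, score in zip(worlds, scores)]

# One weighted ground clause (toy example): Smokes(a) -> Cancer(a), w = 1.5
wf = [(1.5, lambda world: int((not world["Smokes(a)"]) or world["Cancer(a)"]))]
dist = world_probabilities({"Smokes(a)", "Cancer(a)"}, wf)
# The world violating the clause (Smokes(a), not Cancer(a)) is least probable.
```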
        <p>A conditional probability query for a Markov logic network MLN is the computation of the joint distribution of a set of m events involving random variables conditioned on e⃗ and is denoted with</p>
        <p>P_MLN(x1 ∧ ... ∧ xm | e⃗)</p>
        <p>The semantics of this query is given as:</p>
        <p>P_{rand_vars(MLN)}(x1 ∧ ... ∧ xm | e⃗) w.r.t. P_MLN(rand_vars(MLN))</p>
        <p>where vars(x1, ..., xm) ∩ vars(e⃗) = ∅ and vars(x1, ..., xm) ⊆ rand_vars(MLN), and the function rand_vars is defined as follows: rand_vars((F, W)) := {P(C) | P(C) is mentioned in some grounded formula F ∈ F}. Grounding is accomplished w.r.t. all constants that appear in F. An algorithm for answering queries of the above form is investigated in [Gries and Möller, 2010].</p>
        <p>In the case of Markov logic, the definition of the MAP problem given in (1) can be rewritten as follows. The conditional probability term P(x⃗ | e⃗) is replaced with the Markovian formula:</p>
        <p>MAP_MLN(e⃗) := e⃗ ∪ argmax_x⃗ (1/Z_e) exp(Σ_i w_i n_i(x⃗, e⃗))   (2)</p>
        <p>Thus, for describing the most-probable world, MAP returns a set of events, one for each random variable used in the Markov network derived from MLN. In the above equation, x⃗ denotes the hidden variables, and Z_e denotes the normalization constant, which indicates that the normalization process is performed over the possible worlds consistent with the evidence e⃗. In the next equation (3), Z_e is removed since it is constant and does not affect the argmax operation. Similarly, in order to optimize the MAP computation, the exp function is left out since it is monotonic and only its argument has to be maximized:</p>
        <p>MAP_MLN(e⃗) := e⃗ ∪ argmax_x⃗ Σ_i w_i n_i(x⃗, e⃗)   (4)</p>
        <p>The above equation shows that the MAP problem in the Markov logic formalism is reduced to a new problem which maximizes the sum of the weights of the satisfied clauses.</p>
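        <p>For a network small enough to enumerate, the reduced MAP problem of equation (4) can be solved by brute force (our own illustrative sketch with hypothetical names; real systems use local search such as MaxWalkSAT instead of enumeration):</p>

```python
from itertools import product

def map_inference(hidden, evidence, weighted_clauses):
    """MAP by exhaustive search: argmax over assignments to the
    hidden atoms of sum_i w_i n_i(x, e), as in equation (4).
    Feasible only for tiny networks."""
    best_world, best_score = None, float("-inf")
    for vals in product([False, True], repeat=len(hidden)):
        world = dict(evidence, **dict(zip(hidden, vals)))
        score = sum(w * n(world) for w, n in weighted_clauses)
        if score > best_score:
            best_world, best_score = world, score
    return best_world

# Toy weighted clauses: Smokes -> Cancer (w = 2.0), not Cancer (w = 0.5)
clauses = [(2.0, lambda w: int((not w["Smokes"]) or w["Cancer"])),
           (0.5, lambda w: int(not w["Cancer"]))]
world = map_inference(["Cancer"], {"Smokes": True}, clauses)
# Given Smokes = True, Cancer = True scores 2.0 versus 0.5 for Cancer = False
```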
        <p>Since MAP determination in Markov networks is an NP-hard problem [Domingos and Richardson, 2007], it is performed by exact and approximate solvers. The most commonly used approximate solver is the MaxWalkSAT algorithm, a weighted variant of the WalkSAT local-search satisfiability solver. The MaxWalkSAT algorithm attempts to satisfy clauses with positive weights and to keep clauses with negative weights unsatisfied.</p>
        <p>It has to be mentioned that there might be several worlds with the same maximal probability. At this step, only one of them is chosen non-deterministically.</p>
        <sec id="sec-2-9-1">
          <title>Combining Markov Logic and Description Logics</title>
        </sec>
      </sec>
      <sec id="sec-2-10">
        <title>Markov Description Logics</title>
        <p>Since ALHf is a fragment of first-order logic, its extension to the Markovian style of formalisms is specified in a similar way as for predicate logic in the previous section. The formulas in Markov logic correspond to Tbox axioms and Abox assertions. Weights in Markov description logics are associated with axioms and assertions.</p>
        <p>Groundings of Tbox axioms are defined analogously to the previous case.² Abox assertions do not contain variables and are already grounded. Note that since an ALHf Abox represents a relational structure of domain objects, it can be directly seen as a possible world itself if assertions not contained in the Abox are assumed to be false.</p>
        <p>For appropriately representing domain knowledge in CASAM, weights are possibly used only for a subset of the axioms of the domain ontology. The remaining axioms can be assumed to be strict, i.e., assumed to be true in any case. A consequence of specifying strict axioms is that many possible worlds x⃗ can be ruled out (i.e., will have probability 0 by definition).</p>
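        <p>The pruning effect of strict axioms can be illustrated with a minimal sketch (our own example; the atom names and the disjointness constraint are hypothetical). Worlds violating a strict axiom receive probability 0 and can be skipped during probabilistic reasoning:</p>

```python
from itertools import product

def surviving_worlds(ground_atoms, strict_axioms):
    """Enumerate the possible worlds and keep only those satisfying
    all strict axioms; the remaining worlds have probability 0 by
    definition and need not be considered."""
    atoms = sorted(ground_atoms)
    worlds = [dict(zip(atoms, vals))
              for vals in product([False, True], repeat=len(atoms))]
    return [w for w in worlds if all(axiom(w) for axiom in strict_axioms)]

# Strict disjointness Person disjoint-with Building, grounded for ind1
disjoint = lambda w: not (w["Person(ind1)"] and w["Building(ind1)"])
worlds = surviving_worlds({"Person(ind1)", "Building(ind1)"}, [disjoint])
# 4 candidate worlds, 1 violates the strict axiom -> 3 survive
```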
      </sec>
      <sec id="sec-2-11">
        <title>Markov DL Knowledge Bases</title>
        <p>A Markov DL knowledge base is a tuple (T, A), where T is comprised of a set Ts of strict axioms and a set Tw of weighted axioms, and A is comprised of a set As of strict assertions and a set Aw of weighted assertions.</p>
      </sec>
      <sec id="sec-2-12">
        <title>Strict Axioms in CASAM</title>
        <p>Referring to axioms, a proposal for CASAM is to consider strictness for the domain ontology patterns (I)-(IV):</p>
        <p>(I) subsumption: A1 ⊑ A2, R1 ⊑ R2
(II) disjointness: A1 ⊑ ¬A2
(III) domain and range restrictions: ∃R.⊤ ⊑ A, ⊤ ⊑ ∀R.A
(IV) functional roles: ⊤ ⊑ (≤ 1 R)</p>
        <p>The main justification for treating these axioms as strict is that subsumption axioms, disjointness axioms, domain and range restrictions, as well as functional role axioms (in combination with the UNA) are intended to be true in any case, so there is no need to assign large weights to them.</p>
        <p>In [Gries and Möller, 2010] we show that Gibbs sampling with deterministic dependencies specified in an appropriate fragment remains correct, i.e., probability estimates approximate the correct probabilities. We have investigated a Gibbs sampling method incorporating deterministic dependencies and conclude that this incorporation can speed up Gibbs sampling significantly. For details see [Gries and Möller, 2010]. The advantage of this probabilistic approach is that initial ontology engineering is done as usual with standard reasoning support, with the possibility to add weighted axioms and weighted assertions on top of the strict fundament. Since many possible worlds do not have to be considered because their probability is known to be 0, probabilistic reasoning will be significantly faster.</p>
        <p>² For this purpose, the variable-free syntax of axioms can first be translated to predicate logic.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Probabilistic Interpretation Engine</title>
      <p>In this chapter, the abduction procedure is defined by the abduction algorithm CAE. Additionally, a media interpretation agent is described by defining a probabilistic interpretation algorithm Interpret. In general, abduction is formalized as Σ ∪ Δ ⊨_R Γ, where background knowledge (Σ), rules (R), and observations (Γ) are given, and explanations (Δ) are to be computed. In terms of DLs, Δ and Γ are Aboxes and Σ is a pair of Tbox and Abox.</p>
      <p>Abox abduction is implemented as a non-standard retrieval inference service in DLs. In contrast to standard retrieval inference services, where answers are found by exploiting the ontology, Abox abduction has the task of acquiring what should be added to the knowledge base in order to answer a query. Therefore, the result of Abox abduction is a set of hypothesized Abox assertions. To achieve this, the space of abducibles has to be defined beforehand, and we do this in terms of rules.</p>
      <sec id="sec-3-1">
        <title>Computing Explanations</title>
        <p>We assume that a set of rules R as defined above (see Section 2.2) is specified, and define a non-deterministic function compute_explanation as follows.</p>
        <p>compute_explanation(Σ, R, A, P(z)) = transform(Φ, σ) if there exists a rule r = P(X) ← Q1(Y1), ..., Qn(Yn) ∈ R that is applied to an Abox A such that a minimal set of query atoms Φ and an admissible variable substitution σ with σ(X) = z can be found, and the query Q := {() | expand(P(z), r, R, σ) \ Φ} is answered with true.</p>
        <p>If no such rule r exists in R, it holds that compute_explanation(Σ, R, A, P(z)) = ∅. The goal of the function compute_explanation is to determine what must be added (Δ) such that the entailment Σ ∪ A ∪ Δ ⊨_R P(z) holds. Hence, for compute_explanation, abductive reasoning is used. The set of query atoms Φ defines what must be hypothesized in order to answer the query Q with true such that expand(P(X), r, R, σ) holds. The definition of compute_explanation is non-deterministic due to several possible choices for Φ.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Rule Expansion</title>
        <p>The function application expand(P(z), P(X) ← Q1(Y1), …, Qn(Yn), R, σ) is also defined in a
non-deterministic way as
expand′(Q1(σ′(Y1)), R, σ′) ∪ … ∪ expand′(Qn(σ′(Yn)), R, σ′)
with expand′(P(z), R, σ) being expand(P(σ′(z)), r, R, σ′) if there exists a rule r = P(X) ← … ∈ R,
and ⟨P(z)⟩ otherwise. The variable substitution σ′ is an extension of σ such that:
σ′ = σ ∪ [X1 ← z1, X2 ← z2, …]   (5)
The above equation shows the mapping of each free variable if it is not already defined. This means
the free variables in the body of each rule are mapped to individuals with unique IDs.</p>
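        <p>The extension of σ to σ′ in Equation (5) can be sketched as follows; the representation of substitutions as Python dictionaries and the ID scheme are illustrative assumptions.</p>

```python
import itertools

_ids = itertools.count(42)                 # source of unique individual IDs

def extend_substitution(sigma, body_vars):
    """sigma' extends sigma: each free body variable not already bound by
    sigma is mapped to a fresh individual with a unique ID."""
    sigma_prime = dict(sigma)
    for v in body_vars:
        if v not in sigma_prime:
            sigma_prime[v] = f"Ind{next(_ids)}"
    return sigma_prime

sigma = {"x": "C1", "y": "DS1"}
sigma_prime = extend_substitution(sigma, ["z", "x", "y"])
# sigma_prime binds z to Ind42; sigma itself is left unchanged
```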
      </sec>
      <sec id="sec-3-3">
        <title>We say the set of rules is backward-chained, and since there might be multiple rules in R, backward</title>
        <p>chaining is non-deterministic as well. Thus, multiple explanations are generated.3</p>
        <p>3 In the expansion process, variables have to be renamed. We neglect these issues here.</p>
        <p>In the following, we devise an abstract computational engine for "explaining" Abox assertions in terms
of a given set of rules. Explanation of Abox assertions w.r.t. a set of rules is meant in the sense that,
using the rules, high-level explanations are constructed such that the Abox assertions are entailed.
The explanation of an Abox is again an Abox. For instance, the output Abox represents results of the
content interpretation process. The presentation is slightly extended compared to the one in [Castano
et al., 2008]. Let the agenda 𝒜 be a set of Aboxes and let Γ be an Abox of observations whose
assertions are to be explained. The goal of the explanation process is to use a set of rules R to derive
"explanations" for elements in Γ. The explanation algorithm implemented in the CASAM abduction
engine works on a set of Aboxes I.</p>
        <p>The complete explanation process is implemented by the CAE function:</p>
        <p>Function CAE(Ω, Ξ, Σ, R, S, 𝒜):
Input: a strategy function Ω, a termination function Ξ, a background knowledge Σ, a set of
rules R, a scoring function S, and an agenda 𝒜
Output: a set of interpretation Aboxes I′
I′ := assign_level(l, 𝒜);
repeat
  I := I′;
  (A, α) := Ω(I); // A ∈ I, α ∈ A such that requires_fiat(α^l) holds
  l := l + 1;
  I′ := (I \ {A}) ∪ assign_level(l, explanation_step(Σ, R, S, A, α));
until Ξ(I) or no A and α can be selected such that I′ ≠ I;
return I′</p>
        <p>where assign_level(l, 𝒜) is defined by a lambda calculus term as follows:
assign_level(l, 𝒜) = map(λ(A) assign_level(l, A), 𝒜)   (6)</p>
        <p>assign_level(l, 𝒜) takes as input a superscript l and an agenda 𝒜.</p>
        <p>In the following, the overloaded assign_level(l, A) is defined, which superscripts each assertion α
of the Abox A with l if the assertion does not already have a superscript:
assign_level(l, A) = {α^l | α ∈ A, α ≠ α^i, i ∈ ℕ}   (7)
Note that l is a global variable; its starting value is zero and it is incremented in the CAE function.
The map4 function is defined as follows:
map(f, X) = ⋃_{x ∈ X} {f(x)}   (8)
It takes as parameters a function f and a set X and returns a set consisting of the values of f applied
to every element x of X.</p>
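        <p>The two overloaded assign_level functions and map can be sketched in Python as follows; modelling assertions as (atom, level) pairs with level None meaning "no superscript yet" is our own illustrative choice.</p>

```python
# assign_level on an Abox: superscript every assertion with level l unless
# it already carries a superscript (Equation (7)).
def assign_level_abox(l, abox):
    return frozenset((atom, l if level is None else level)
                     for (atom, level) in abox)

# assign_level on an agenda: map the Abox version over all Aboxes (Equation (6)).
def assign_level(l, agenda):
    return {assign_level_abox(l, abox) for abox in agenda}

agenda = {frozenset({("Car(C1)", None), ("DoorSlam(DS1)", 0)})}
leveled = assign_level(1, agenda)
# the unlabelled assertion gets level 1, the labelled one keeps level 0
```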
        <p>The CAE function applies the strategy function Ω in order to decide which assertion to explain, uses a
termination function Ξ in order to check whether to terminate due to resource constraints, and uses a
scoring function S to evaluate an explanation.</p>
        <p>The function Ω for the explanation strategy and the function Ξ for the termination condition are used as an
oracle and must be defined in an application-specific way. In our multimedia interpretation scenario
we assume that the function requires_fiat(α^l) is defined as follows:</p>
        <p>4 Please note that in this report, the expression map is used in two different contexts. The first one, MAP,
denotes the Maximum A Posteriori approach, which is a sampling method, whereas the second one, map, is a
function used in assign_level(l, 𝒜).</p>
        <p>requires_fiat(α^l) = { true if l = 0; false if l ≠ 0 }</p>
        <p>The function explanation_step is defined as follows:
explanation_step(Σ, R, S, A, α) := ⋃_{Δ ∈ compute_all_explanations(Σ, R, S, A, α)} {A ∪ Δ}</p>
        <p>We need two additional auxiliary functions:
compute_all_explanations(Σ, R, S, A, α) := maximize(Σ, R, A, consistent_completed_explanations(Σ, R, A, α), S)</p>
        <p>consistent_completed_explanations(Σ, R, A, α) := {Δ | Δ = compute_explanation(Σ, R, A, α), consistent_(T,A)(Δ)}</p>
      </sec>
      <sec id="sec-3-4">
        <title>The function maximize(Σ, R, A, s, S) selects those explanations Δ ∈ s for which the score S(Σ, R, A, Δ)</title>
        <p>is maximal, i.e., there exists no other Δ′ ∈ s such that S(Σ, R, A, Δ′) &gt; S(Σ, R, A, Δ). The function
consistent_(T,A)(A′) determines if the Abox A ∪ A′ has a model which is also a model of the Tbox T.</p>
        <p>Note the call to the non-deterministic function compute_explanation. It may return different values,
all of which are collected.</p>
        <p>In the next section we explain how probabilistic knowledge is used to (i) formalize the effect of the
"explanation", and (ii) formalize the scoring function S used in the CAE algorithm explained above.
In addition, it is shown how the termination condition (represented with the parameter Ξ in the above
procedure) can be defined based on probabilistic conditions.
The interpretation procedure is discussed completely in this section by explaining the interpretation
problem and presenting a solution to it. The solution is presented as a probabilistic
interpretation algorithm which calls the CAE function described in the previous section. In the given
algorithm, a termination function and a scoring function are defined. The termination function
determines if the interpretation process can be stopped, since at some point during the interpretation
process it makes no sense to continue: no significant changes can be seen in the results. The scoring
function defined in this section assigns scores to interpretation Aboxes.</p>
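        <p>The termination behaviour described above can be sketched as a small stateful predicate; the closure-based packaging is an illustrative assumption, not the report's notation.</p>

```python
# Termination test: stop once two successive interpretation scores differ
# by less than epsilon / i, so the tolerated change tightens with the level i.
def make_termination(epsilon):
    state = {"i": 0, "prev": None}
    def xi(p):
        state["i"] += 1
        prev, state["prev"] = state["prev"], p
        return prev is not None and abs(p - prev) < epsilon / state["i"]
    return xi

xi = make_termination(0.05)
xi(0.755)   # False: no previous score yet
xi(0.993)   # False: |0.993 - 0.755| = 0.238 is not below 0.05 / 2
xi(0.988)   # True:  |0.988 - 0.993| = 0.005 is below 0.05 / 3
```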
        <p>Problem The objective of the interpretation component is the generation of interpretations for the
observations. An interpretation is an Abox which contains high-level concept assertions. Since in
artificial intelligence agents are used for problem solving, in the following the same problem
is formalized from the perspective of an agent:
Consider an intelligent agent and some percepts in an environment, where the percepts are the analysis
results of KDMA and HCI. The objective of this agent is finding explanations for the existence of the
percepts. The question is how the interpretation Aboxes are determined and how long the interpretation
process must be performed by the agent. The functionality of this media interpretation agent is
presented in the MI_Agent algorithm in Section 3.4.</p>
        <p>Solution In the following, an application of a probabilistic interpretation algorithm is presented which
gives a solution to the problem above. This solution illustrates a new perspective on the
interpretation process and the reason why it is performed. Assume that the media interpretation component
receives a weighted Abox A from KDMA and HCI which contains observations. In the following, the
operation P(A, A′, R, WR, T) applied in the algorithm is explained.</p>
      </sec>
      <sec id="sec-3-5">
        <title>The P(A, A′, R, WR, T) function determines the probability of the Abox A with respect to the Abox</title>
      </sec>
      <sec id="sec-3-6">
        <title>A′, a set of rules R, a set of weighted rules WR, and the Tbox T, where A ⊆ A′. Note that R is a</title>
        <p>set of forward and backward chaining rules. The probability determination is performed based on the
Markov logic formalism as follows:</p>
        <p>P(A, A′, R, WR, T) = P_MLN(A, A′, R, WR, T)(Q̃(A) | ẽ(A′))   (9)</p>
        <p>Q̃(A) denotes an event composed of all assertions which appear in the Abox A. Assume that the Abox A
contains n assertions α1, …, αn. Consequently, the event for the Abox A is defined as follows:
Q̃(A) = ⟨a1 = true ∧ … ∧ an = true⟩   (10)
where a1, …, an denote the corresponding random variables of the MLN. Assume that an Abox A contains m
assertions α1, …, αm. Then, the evidence vector ẽ(A) is defined as follows:</p>
        <p>ẽ(A) = ⟨a1 = true, …, am = true⟩   (11)</p>
        <p>In order to answer the query P_MLN(A, A′, R, WR, T)(Q̃(A) | ẽ(A′)), the function MLN(A, A′, R, WR, T)
is called. This function builds the Markov logic network MLN based on the Aboxes A and A′, the
rules R, the weighted rules WR and the Tbox T, which is a time-consuming process. Note that in
theory the above function is called not only once but several times. In a practical system, this might
be handled more efficiently. The function returns a Markov logic network (F_MLN, W_MLN), which we
define here as follows using a tuple notation:
MLN(A, A′, R, WR, T) = ⋃ of:
{(α, w)}  if ⟨w, α⟩ ∈ A
{(α, ∞)}  if α ∈ A
{(α, w)}  if ⟨w, α⟩ ∈ A′
{(α, ∞)}  if α ∈ A′
{(r, ∞)}  if r ∈ R
{(r, w)}  if ⟨w, r⟩ ∈ WR
{(φ, ∞)}  if φ ∈ T</p>
        <p>In the following, the interpretation algorithm Interpret is presented:</p>
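        <p>Before turning to Interpret, the tuple construction above can be sketched in Python; representing weighted items as (w, α) pairs, strict items as bare values, and ∞ as float("inf") are illustrative assumptions.</p>

```python
INF = float("inf")   # hard (strict) constraints get infinite weight

def build_mln(a, a_prime, rules, weighted_rules, tbox):
    """Collect (formula, weight) pairs: weighted assertions and rules keep
    their weight; strict assertions, rules, and Tbox axioms become hard."""
    formulas = []
    for abox in (a, a_prime):
        for item in abox:
            if isinstance(item, tuple):          # weighted assertion <w, alpha>
                w, alpha = item
                formulas.append((alpha, w))
            else:                                # strict assertion
                formulas.append((item, INF))
    formulas += [(r, INF) for r in rules]
    formulas += [(f, w) for (w, f) in weighted_rules]
    formulas += [(ax, INF) for ax in tbox]
    return formulas

mln = build_mln(
    a={"Causes(C1,DS1)"},
    a_prime={"Causes(C1,DS1)", (1.3, "Car(C1)")},
    rules=["bc-rule-Causes"],
    weighted_rules=[(5, "wr-CarEntry")],
    tbox=["CarEntry disjoint CarExit"])
# ("Car(C1)", 1.3) keeps its finite weight; everything else is hard
```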
      </sec>
      <sec id="sec-3-7">
        <title>Function Interpret(𝒜, CurrentI, Γ, T, FR, BR, WR, ε)</title>
        <p>Input: an agenda 𝒜, a current interpretation Abox CurrentI, an Abox of observations Γ, a</p>
      </sec>
      <sec id="sec-3-8">
        <title>Tbox T, a set of forward chaining rules FR, a set of backward chaining rules BR, a set of weighted rules WR, and the desired precision ε of the results</title>
        <p>Output: an agenda 𝒜′, a new interpretation Abox NewI, and Abox differences for additions
Δ1 and omissions Δ2
i := 0;
p0 := P(Γ, Γ, R, WR, T);
Ξ := λ(𝒜). i := i + 1; pi := max_{A ∈ 𝒜} P(Γ, A ∪ A0, R, WR, T); return |pi − pi−1| &lt; ε/i;
Σ := (T, ∅);
R := FR ∪ BR;
S := λ((T, A0), R, A, Δ). P(Γ, A ∪ A0 ∪ Δ, R, WR, T);
𝒜′ := CAE(Ω, Ξ, Σ, R, S, 𝒜);
NewI := argmax_{A ∈ 𝒜′} P(Γ, A, R, WR, T);
Δ1 := AboxDiff(NewI, CurrentI); // additions
Δ2 := AboxDiff(CurrentI, NewI); // omissions
return (𝒜′, NewI, Δ1, Δ2);</p>
        <p>In the above algorithm, the termination function Ξ and the scoring function S are defined by lambda
calculus terms. The termination condition of the algorithm is that no significant change can be
seen in the successive probabilities pi−1 and pi (scores) of the two interpretation
Aboxes generated at the successive levels i − 1 and i. In this case, the current interpretation Abox CurrentI is
preferred to the new interpretation Abox NewI. In the next step, the CAE function is called, which
returns the agenda 𝒜′. Afterwards, the interpretation Abox NewI with the maximum score among the</p>
      </sec>
      <sec id="sec-3-9">
        <title>Aboxes A of 𝒜′ is selected. Additionally, the Abox differences Δ1 and Δ2, respectively for additions</title>
        <p>and omissions between the interpretation Aboxes CurrentI and NewI, are calculated. In the following,
the strategy function Ω is defined, which is one of the parameters of the CAE function:</p>
        <sec id="sec-3-9-1">
          <title>Function Ω(I)</title>
          <p>Input: a set of interpretation Aboxes I</p>
        </sec>
      </sec>
      <sec id="sec-3-10">
        <title>Output: an Abox A and a fiat assertion α</title>
        <p>𝒜 := {A ∈ I | ¬∃A′ ∈ I, A′ ≠ A : ∃α′^l′ ∈ A′ : ∀α^l ∈ A : l′ &lt; l};
A := random_select(𝒜);
min_s := {α^l ∈ A | ¬∃α′^l′ ∈ A, α′^l′ ≠ α^l : l′ &lt; l};
return (A, random_select(min_s));</p>
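        <p>A Python sketch of the strategy function Ω; representing superscripted assertions as (atom, level) pairs and the use of random.choice are illustrative assumptions.</p>

```python
import random

def omega(interpretations, rng=random):
    """Pick an Abox whose minimal assertion level is smallest over all
    interpretation Aboxes, and return it with one minimal-level assertion."""
    def min_level(abox):
        return min(l for (_, l) in abox)
    best = min(min_level(a) for a in interpretations)
    candidates = [a for a in interpretations if min_level(a) == best]
    abox = rng.choice(candidates)
    min_s = [(atom, l) for (atom, l) in abox if l == min_level(abox)]
    return abox, rng.choice(min_s)

agenda = [{("Car(C1)", 0), ("CarEntry(Ind42)", 1)},
          {("Building(Ind43)", 2)}]
abox, alpha = omega(agenda)
# the first Abox is chosen (minimal level 0) and alpha is ("Car(C1)", 0)
```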
      </sec>
      <sec id="sec-3-11">
        <title>In the above strategy function Ω, the agenda 𝒜 is the set of those Aboxes from I whose assertions carry the</title>
        <p>minimum assigned superscripts. In the next step, an Abox A from 𝒜 is randomly selected. Afterwards,
the set min_s is determined, which contains the assertions from A whose superscripts are minimal.
These are the assertions which require explanations. The strategy function returns as output an Abox</p>
      </sec>
      <sec id="sec-3-12">
        <title>A and an assertion which requires explanation.</title>
        <sec id="sec-3-12-1">
          <title>3.4 The Media Interpretation Agent</title>
          <p>In the following, the MI_Agent function is presented, which calls the Interpret function:
Function MI_Agent(Q, Partners, Die, (T, A0), FR, BR, WR, ε)
Input: a queue of percept results Q, a set of partners Partners, a function Die for the
termination process, a background knowledge set (T, A0), a set of forward chaining rules FR, a
set of backward chaining rules BR, a set of weighted rules WR, and the desired precision ε of the
results</p>
        </sec>
        <sec id="sec-3-12-2">
          <title>Output: (none)</title>
          <p>CurrentI := ∅;
𝒜′′ := {∅};
repeat
  Γ := extractObservations(Q);
  W := MAP(Γ, WR, T);
  Γ′ := Select(W, Γ);
  𝒜′ := filter(λ(A). consistent(A), map(λ(A). Γ′ ∪ A ∪ A0 ∪ forward_chain(Σ, FR, Γ′ ∪ A ∪ A0),
         {select(MAP(Γ′ ∪ A ∪ A0, WR, T), Γ′ ∪ A ∪ A0) | A ∈ 𝒜′′}));
  (𝒜′′, NewI, Δ1, Δ2) := Interpret(𝒜′, CurrentI, Γ′, T, FR, BR, WR, ε);
  CurrentI := NewI;
  Communicate(Δ1, Δ2, Partners);
  𝒜′′ := manage_agenda(𝒜′′);
until Die();</p>
          <p>where the filter function is defined as follows:
filter(f, X) = ⋃_{x ∈ X} ( {x} if f(x) = true; ∅ else )   (12)
The filter function takes as parameters a function f and a set X and returns the set of those
elements x of X for which f(x) is true.</p>
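          <p>The behaviour of filter can be stated compactly; the set-comprehension packaging below is an illustrative choice.</p>

```python
def filter_fn(f, xs):
    """filter(f, X): keep exactly those x in X for which f(x) is true."""
    return {x for x in xs if f(x)}

kept = filter_fn(lambda a: a.startswith("Car"),
                 {"Car(C1)", "DoorSlam(DS1)", "CarEntry(Ind42)"})
# kept contains only the assertions whose name starts with "Car"
```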
          <p>In the MI_Agent function, the current interpretation CurrentI and the agenda 𝒜′′ are initialized
to the empty set. Since the agent's performance is an incremental process, it is defined by a repeat-until
loop. The percept results are sent by KDMA and HCI to the queue Q. In order to take the
observations from the queue Q, the MI_Agent calls the extractObservations function.</p>
        </sec>
      </sec>
      <sec id="sec-3-13">
        <title>The MAP(Γ ∪ A, WR, T) function determines the most probable world of observations with</title>
        <p>respect to a set of weighted rules WR and the Tbox T. This function actually performs the MAP
process mentioned in Chapter 2. It returns a vector W which consists of a set of zeros and ones assigned
to the ground atoms of the considered world. The assertions with assigned zeros and ones are called
negative and positive assertions, respectively.</p>
        <p>The Select(W, Γ) function selects the positive assertions from the bit vector W in the input Abox
Γ. The selected positive assertions which require explanations are also known as fiat assertions. This
operation returns as output an Abox Γ′ with the following characteristic: Γ′ ⊆ Γ.</p>
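        <p>The interplay of the MAP bit vector and Select can be sketched as follows, using the weighted Abox of the example in Section 4; the list-of-pairs encoding (weight None for strict assertions) is an illustrative assumption.</p>

```python
def select(w_vector, gamma):
    """Select(W, Gamma): keep the assertions of Gamma whose bit in the
    MAP vector W is 1 (the positive assertions), with their weights."""
    return [entry for bit, entry in zip(w_vector, gamma) if bit == 1]

gamma = [(1.3, "Car(C1)"), (1.2, "DoorSlam(DS1)"),
         (0.3, "EngineSound(ES1)"), (None, "Causes(C1,DS1)")]
gamma_prime = select([1, 1, 0, 1], gamma)
# EngineSound(ES1) is dropped as a negative assertion
```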
      </sec>
      <sec id="sec-3-14">
        <title>In the next step, a set of forward chaining rules FR is applied to all the Aboxes of 𝒜′′. The generated</title>
        <p>assertions in this process are added to the Abox A. In the next step, only the consistent Aboxes
are selected; the inconsistent Aboxes are not considered any further.
Then, the Interpret function is called to determine the new agenda 𝒜′′, the new
interpretation NewI and the Abox differences Δ1 and Δ2 for additions and omissions between CurrentI
and NewI. Afterwards, CurrentI is set to NewI and the MI_Agent function communicates
the Abox differences Δ1 and Δ2 to the partners. Additionally, the Tbox T, the set of forward chaining
rules FR, the set of backward chaining rules BR, and the set of weighted rules WR can be learnt by the
Learn function. The termination condition of the MI_Agent function is that the Die() function returns true.
Note that the MI_Agent waits at the function call extractObservations(Q) if Q = ∅.
After presenting the above algorithms, the previously unanswered questions can be discussed. A reason
for performing the interpretation process and explaining the fiat assertions is that the probability</p>
      </sec>
      <sec id="sec-3-15">
        <title>P(A, A′, R, WR, T) will increase through the interpretation process. In other words, by explaining</title>
        <p>the observations, the agent's belief in the percepts will increase. This shows a new perspective on
performing the interpretation process.</p>
        <p>The answer to the question whether there is any measure for stopping the interpretation process
is indeed positive. It is expressed by |pi − pi−1| &lt; ε/i, which is the termination condition of the
algorithm. The reason for selecting ε/i and not ε as the upper limit in the termination condition is to
dampen the oscillating behaviour of the results. In other words, the precision interval is tightened
step by step during the interpretation process. The function manage_agenda is explained in Section 6.
Before, we discuss an example for interpreting a single video shot in Section 4, and a scene in Section 5.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Video Shot Interpretation</title>
      <p>One of the main innovations introduced in the previous section, namely a
probabilistic preference measure to control the space of possible interpretations, is exemplified here using
examples inspired by the environmental domain used in the project CASAM.</p>
      <p>We have to mention that this example is not constructed to show the possible branchings of
the interpretation process. The purpose of this example is to show how the probabilities of the most
probable world of observations P(A0, A, R, WR, T) behave during the interpretation process.</p>
      <p>At the beginning of this example, the signature of the knowledge base is presented. The set of all
concept names CN is divided into two disjoint sets Events and PhysicalThings such that
CN = Events ∪ PhysicalThings, where these two sets are defined as follows:
Events = {CarEntry, EnvConference, EnvProt, HealthProt}
PhysicalThings = {Car, DoorSlam, Building, Environment, Agency}
EnvConference, EnvProt and HealthProt denote environmental conference,
environmental protection and health protection, respectively.</p>
      <p>The set of role names RN is defined as follows:
RN = {Causes, OccursAt, HasAgency, HasTopic, HasSubject, HasObject, HasEffect,
HasSubEvent, HasLocation}</p>
      <p>In the following, the set of individual names IN is given:
IN = {C1, DS1, ES1, Ind42, Ind43, Ind44, Ind45, Ind46, Ind47, Ind48}</p>
      <p>Note that the notations in this example are based on Alchemy notation, i.e. the instance-,
concept- and role names begin with capital letters. In the following, the set of forward chaining rules FR is
defined:</p>
      <p>FR = {∀x CarEntry(x) → ∃y Building(y), OccursAt(x, y);
∀x EnvConference(x) → ∃y Environment(y), HasTopic(x, y);
∀x EnvProt(x) → ∃y Agency(y), HasAgency(x, y)}</p>
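      <p>Applying a forward chaining rule with an existential head can be sketched as follows for the first rule of FR; the data layout and the fresh-ID scheme are illustrative assumptions.</p>

```python
import itertools

def apply_car_entry_rule(abox, ids=None):
    """forall x CarEntry(x) -> exists y Building(y), OccursAt(x, y):
    for each CarEntry individual, introduce a fresh individual for y."""
    ids = ids or itertools.count(43)
    added = set()
    for pred, args in sorted(abox):
        if pred == "CarEntry":
            y = f"Ind{next(ids)}"
            added |= {("Building", (y,)), ("OccursAt", (args[0], y))}
    return added

abox = {("CarEntry", ("Ind42",)), ("Car", ("C1",))}
delta1 = apply_car_entry_rule(abox)
# delta1 generates Building(Ind43) and OccursAt(Ind42, Ind43)
```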
      <sec id="sec-4-1">
        <title>Similarly, the set of backward chaining rules BR is given as follows:</title>
        <p>BR = {Causes(x, y) ← CarEntry(z), HasObject(z, x), HasEffect(z, y), Car(x), DoorSlam(y);
OccursAt(x, y) ← EnvConference(z), HasSubEvent(z, x), HasLocation(z, y), CarEntry(x), Building(y);
HasTopic(x, y) ← EnvProt(z), HasSubEvent(z, x), HasObject(z, y), EnvConference(x), Environment(y);
HasAgency(x, y) ← HealthProt(z), HasObject(z, x), HasSubject(z, y), EnvProt(x), Agency(y)}</p>
      </sec>
      <sec id="sec-4-2">
        <title>In the following, a set of weighted rules WR is defined where all rules have the same high weight</title>
        <p>equal to 5:</p>
        <p>WR = {5 ∀x, y, z CarEntry(z) ∧ HasObject(z, x) ∧ HasEffect(z, y) → Car(x) ∧ DoorSlam(y) ∧ Causes(x, y);
5 ∀x, y, z EnvConference(z) ∧ HasSubEvent(z, x) ∧ HasLocation(z, y) → CarEntry(x) ∧ Building(y) ∧ OccursAt(x, y);
5 ∀x, y, z EnvProt(z) ∧ HasSubEvent(z, x) ∧ HasObject(z, y) → EnvConference(x) ∧ Environment(y) ∧ HasTopic(x, y);
5 ∀x, y, z HealthProt(z) ∧ HasObject(z, x) ∧ HasSubject(z, y) → EnvProt(x) ∧ Agency(y) ∧ HasAgency(x, y)}</p>
      </sec>
      <sec id="sec-4-3">
        <title>Note that the weighted rules WR and their weights can be learnt by the machine learning</title>
        <p>component. The selected initial value for ε in this example is 0.05. In the following, Δ1 and Δ2 denote
the set of assertions generated by a forward chaining rule and the set of assertions
hypothesized by a backward chaining rule at each interpretation level, respectively.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Let us assume that the media interpretation agent receives the following weighted Abox A:</title>
        <p>A = {1.3 Car(C1), 1.2 DoorSlam(DS1), 0.3 EngineSound(ES1), Causes(C1, DS1)}
The first operation applied to A is the MAP function, which returns the bit vector W = ⟨1, 1, 0, 1⟩.
This vector is composed of positive and negative events (bits). By applying the Select function to</p>
      </sec>
      <sec id="sec-4-5">
        <title>W and the input Abox A, the assertions from the input Abox A are selected that correspond to the</title>
        <p>positive events in W. Additionally, the weights assigned to the positive assertions are also taken from
the input Abox A. In the following, the Abox A0 which contains the positive assertions is depicted:</p>
        <p>A0 = {1.3 Car(C1), 1.2 DoorSlam(DS1), Causes(C1, DS1)}
At this step, p0 = P(A0, A0, R, WR, T) = 0.755. Since no appropriate forward chaining rule from FR
is applicable to Abox A0, Δ1 = ∅ and as a result A0 = A0 ∪ ∅. The next step is the application of the
backward_chain function, where the following backward chaining rule from BR can be applied to Abox A0:
Causes(x, y) ← CarEntry(z), HasObject(z, x), HasEffect(z, y), Car(x), DoorSlam(y)
Consequently, by applying the above rule the following set of assertions is hypothesized:
Δ2 = {CarEntry(Ind42), HasObject(Ind42, C1), HasEffect(Ind42, DS1)}
which are considered as strict assertions. Consequently, A1 is defined as follows: A1 = A0 ∪ Δ2.
In the above Abox, p1 = P(A0, A1, R, WR, T) = 0.993. As can be seen, p1 &gt; p0, i.e.</p>
      </sec>
      <sec id="sec-4-6">
        <title>P(A0, Ai, R, WR, T) increases by adding the new hypothesized assertions. This shows that the new</title>
        <p>assertions are considered as additional support. The termination condition of the algorithm is not
fulfilled; therefore the algorithm continues processing. At this level, it is still not known whether Abox</p>
      </sec>
      <sec id="sec-4-7">
        <title>A1 can be considered as the final interpretation Abox. Thus, this process is continued with another</title>
        <p>level. Consider the next forward chaining rule:
∀x CarEntry(x) → ∃y Building(y), OccursAt(x, y)
By applying the above rule, the next set of assertions is generated, namely:</p>
        <p>Δ1 = {Building(Ind43), OccursAt(Ind42, Ind43)}</p>
        <p>The newly generated assertions are also considered as strict assertions. In the following, the expanded
Abox A1 is defined as follows: A1 = A1 ∪ Δ1.</p>
      </sec>
      <sec id="sec-4-8">
        <title>Let us assume the next backward chaining rule from BR:</title>
        <p>OccursAt(x, y) ← EnvConference(z), HasSubEvent(z, x), HasLocation(z, y), CarEntry(x), Building(y)
Consequently, by applying the above abduction rule the following set of assertions is hypothesized:
Δ2 = {EnvConference(Ind44), HasSubEvent(Ind44, Ind42), HasLocation(Ind44, Ind43)}
which are considered as strict assertions. Consequently, A2 = A1 ∪ Δ2.</p>
        <p>In the above Abox, p2 = P(A0, A2, R, WR, T) = 0.988. As can be seen, p2 &lt; p1, i.e.</p>
      </sec>
      <sec id="sec-4-9">
        <title>P(A0, Ai, R, WR, T) decreases slightly by adding the new hypothesized assertions. Since the</title>
        <p>termination condition of the algorithm is fulfilled, Abox A1 can be considered as the final interpretation
Abox. To see how the probabilities behave further, the process is nevertheless continued. Consider
the next forward chaining rule from FR:
∀x EnvConference(x) → ∃y Environment(y), HasTopic(x, y)
By applying the above rule, new assertions are generated:</p>
        <p>Δ1 = {Environment(Ind45), HasTopic(Ind44, Ind45)}</p>
        <p>In the following, the expanded Abox A2 is defined: A2 = A2 ∪ Δ1.</p>
      </sec>
      <sec id="sec-4-10">
        <title>Consider the next backward chaining rule from BR:</title>
        <p>HasTopic(x, y) ← EnvProt(z), HasSubEvent(z, x), HasObject(z, y), EnvConference(x), Environment(y)
By applying the above abduction rule, the following set of assertions is hypothesized:
Δ2 = {EnvProt(Ind46), HasSubEvent(Ind46, Ind44), HasObject(Ind46, Ind45)}
which are considered as strict assertions. In the following, A3 is defined as follows: A3 = A2 ∪ Δ2.
In the above Abox A3, p3 = P(A0, A3, R, WR, T) = 0.99. As can be seen, p3 &gt; p2, i.e.</p>
      </sec>
      <sec id="sec-4-11">
        <title>P(A0, Ai, R, WR, T) increases slightly by adding the new hypothesized assertions.</title>
        <p>Consider the next forward chaining rule:
∀x EnvProt(x) → ∃y Agency(y), HasAgency(x, y)
By applying the above rule, the next assertions are generated:
Δ1 = {Agency(Ind47), HasAgency(Ind46, Ind47)}
As a result, the expanded Abox A3 is presented as follows: A3 = A3 ∪ Δ1.</p>
      </sec>
      <sec id="sec-4-12">
        <title>Let us consider the next backward chaining rule from BR:</title>
        <p>HasAgency(x, y) ← HealthProt(z), HasObject(z, x), HasSubject(z, y), EnvProt(x), Agency(y)
Consequently, new assertions are hypothesized by applying the above abduction rule, namely:
Δ2 = {HealthProt(Ind48), HasObject(Ind48, Ind46), HasSubject(Ind48, Ind47)}
which are considered as strict assertions. Consequently, A4 is defined as follows: A4 = A3 ∪ Δ2.
In the above Abox, p4 = P(A0, A4, R, WR, T) = 0.985. As can be seen, p4 &lt; p3, i.e.</p>
      </sec>
      <sec id="sec-4-13">
        <title>P(A0, Ai, R, WR, T) decreases slightly by adding the new hypothesized assertions.</title>
        <sec id="sec-4-13-1">
          <title>Evaluation of the Results:</title>
          <p>The determined probability values P(A0, Ai, R, WR, T) of this example are summarized in the following
table, which clearly shows the stepwise behaviour of the probabilities during the interpretation process:
i  | 0     | 1     | 2     | 3     | 4
pi | 0.755 | 0.993 | 0.988 | 0.990 | 0.985</p>
          <p>In the above table, the variable i denotes the successive levels of the interpretation process. In this
example, the interpretation process is performed four consecutive times. As can be seen, at
the first interpretation level the probability p1 increases strongly in comparison to p0. At the
second, third and fourth interpretation levels, the probability values decrease slightly in comparison
to p1. This means no significant changes can be seen in the results. In other words, the determination
of A3 and A4 was not required at all, but the determination of A2 was required to detect the slight
difference |p2 − p1| &lt; ε/2. Consequently, Abox A1 is considered as the final interpretation Abox.</p>
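          <p>The termination criterion can be checked directly on the probability sequence of this example; the Python encoding below is illustrative.</p>

```python
# Check |p_i - p_{i-1}| < epsilon / i on the probabilities of this example.
eps = 0.05
ps = [0.755, 0.993, 0.988, 0.990, 0.985]          # p_0 .. p_4
stop = next(i for i in range(1, len(ps))
            if abs(ps[i] - ps[i - 1]) < eps / i)
# the criterion first holds at level 2, confirming A1 as the final Abox
```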
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5 Preference-based Scene Interpretation</title>
      <p>In this example, we discuss how the video shot interpretation process can be performed by
considering the results of two consecutive video shots. For the interpretation of each video shot we require
information about the previous video shots; otherwise the interpretation process does not work
correctly. The question is which assertions have to be considered from the previous video shots. As
discussed in this paper, we would like to consider those assertions from the previous video shots
which increase P(A0, Ai, R, WR, T). At the beginning of this example, the signature of the knowledge
base is presented. The set of concept names CN is divided into two disjoint sets Events and
PhysicalThings, which are described as follows:</p>
      <p>Events = {CarEntry, CarExit, CarRide}
PhysicalThings = {Car, DoorSlam}</p>
      <p>Additionally, the set of role names RN and the set of individual names IN are represented
as follows:</p>
      <p>RN = {Causes, HasObject, HasEffect, Before, HasStartEvent, HasEndEvent}
IN = {C1, C2, DS1, DS2, Ind41, Ind42, Ind44}</p>
      <sec id="sec-5-1">
        <title>The Tbox T contains the axiom CarEntry ⊑ ¬CarExit. In the following, the set of forward chaining rules FR is given:</title>
        <p>FR = {
∀x, xl, y, yl, w, z AudioSeg(x), HasSegLoc(x, xl), VideoSeg(y), HasSegLoc(y, yl), IsSmaller(xl, yl),
Depicts(x, w), Depicts(y, z), CarEntry(w), CarEntry(z) → Before(z, w);
∀x, xl, y, yl, w, z AudioSeg(x), HasSegLoc(x, xl), VideoSeg(y), HasSegLoc(y, yl), IsSmaller(xl, yl),
Depicts(x, w), Depicts(y, z), CarEntry(w), CarExit(z) → Before(z, w);
∀x, xl, y, yl, w, z AudioSeg(x), HasSegLoc(x, xl), VideoSeg(y), HasSegLoc(y, yl), IsSmaller(xl, yl),
Depicts(x, w), Depicts(y, z), CarExit(w), CarEntry(z) → Before(z, w);
∀x, xl, y, yl, w, z AudioSeg(x), HasSegLoc(x, xl), VideoSeg(y), HasSegLoc(y, yl), IsSmaller(xl, yl),
Depicts(x, w), Depicts(y, z), CarExit(w), CarExit(z) → Before(z, w)}
where AudioSeg, HasSegLoc and VideoSeg denote AudioSegment, HasSegmentLocator and VideoSegment,
respectively. Note that the concepts and roles in FR which are not given in CN and RN appear only
in the multimedia content ontology. The multimedia content ontology determines the structure of the
multimedia document. Additionally, it determines whether the concepts originate from video,
audio or text. The above rules state that a CarEntry or CarExit concept assertion from the first
shot appears chronologically before a CarEntry or CarExit concept assertion from the second video
shot. The set of backward chaining rules BR is presented as follows:</p>
        <p>BR = {Causes(x, y) ← CarEntry(z), HasObject(z, x), HasEffect(z, y), Car(x), DoorSlam(y);
Causes(x, y) ← CarExit(z), HasObject(z, x), HasEffect(z, y), Car(x), DoorSlam(y);
Before(x, y) ← CarRide(z), HasStartEvent(z, x), HasEndEvent(z, y), CarEntry(x), CarExit(y)}
Additionally, the set of weighted rules is defined as follows:</p>
        <p>WR = {5 ∀x, y, z CarEntry(z) ∧ HasObject(z, x) ∧ HasEffect(z, y) → Car(x) ∧ DoorSlam(y) ∧ Causes(x, y);
5 ∀x, y, z CarExit(z) ∧ HasObject(z, x) ∧ HasEffect(z, y) → Car(x) ∧ DoorSlam(y) ∧ Causes(x, y);
5 ∀x, y, z, k, m CarRide(z) ∧ HasStartEvent(z, x) ∧ HasEndEvent(z, y) ∧ HasObject(x, k) ∧
HasObject(y, m) → CarEntry(x) ∧ CarExit(y) ∧ Car(k) ∧ Car(m) ∧ k = m}</p>
        <p>Consider the next figure as the first video shot of a video. Let us assume that the analysis results
of the first video shot, represented in the Abox</p>
      </sec>
      <sec id="sec-5-2">
        <title>A1 are sent to the queue Q:</title>
        <p>A1 = {1.3 Car(C1), 1.2 DoorSlam(DS1), Causes(C1, DS1)}</p>
        <p>For the interpretation of the first video shot, we call the function
MI_Agent(Q, Partners, Die, (T, A0), FR, BR, WR, ε). At the beginning of this
function, some variables are initialized, namely CurrentI = ∅ and
𝒜′′ = {∅}. Afterwards, extracting observations from the queue Q is performed,
which leads to Γ = A1. The determination of the most probable world W = ⟨1, 1, 1⟩
is performed in the next step, and selecting the positive assertions and their
related weights determines Γ′ = Γ. At this step, A = ∅ since 𝒜′′ = {∅}.
Additionally, A0 = ∅. Consequently, MAP(Γ′, WR, T) = W and Select(W, Γ′) = Γ′.
forward_chain(Σ, FR, Γ′) = ∅ since there is no forward chaining rule applicable to Γ′. Thus 𝒜′ = {Γ′}.</p>
      </sec>
      <sec id="sec-5-3">
        <title>The Interpret(𝒜′, CurrentI, Γ′, T, FR, BR, WR, ε) function is called in the next step, which determines</title>
        <p>p0 = P(Γ′, Γ′, R, WR, T) = 0.733. The Interpret function calls the CAE function, which returns
𝒜′ = {Γ′ ∪ Δ1, Γ′ ∪ Δ2}, where the two possible explanations Δ1 and Δ2 are defined as follows:
Δ1 = {CarEntry(Ind41), HasObject(Ind41, C1), HasEffect(Ind41, DS1)}
Δ2 = {CarExit(Ind41), HasObject(Ind41, C1), HasEffect(Ind41, DS1)}
Each of the above interpretation Aboxes has a scoring value:
p1 = P(Γ′, Γ′ ∪ Δ1, R, WR, T) = 0.941 and p1 = P(Γ′, Γ′ ∪ Δ2, R, WR, T) = 0.935. NewI =
Γ′ ∪ Δ1 since this is the interpretation Abox with the maximum scoring value. The termination
condition is not fulfilled since p1 − p0 = 0.208 &gt; 0.05. The Abox difference for additions is defined as
follows: Δadd = NewI \ CurrentI = NewI \ ∅ = NewI. Similarly, Δomi = ∅ is the Abox difference
for omissions. The Interpret function returns NewI, 𝒜′ and the Abox differences Δadd and Δomi to the
MI_Agent function. The next figure depicts the second video shot.</p>
        <p>Assume that the analysis results of the second video shot, given in the next Abox,
are sent to the queue Q:</p>
        <p>A2 = {1.3 Car(C2), 1.2 DoorSlam(DS2), Causes(C2, DS2)}
Similarly, for the interpretation of the second video shot, we call the function</p>
      </sec>
      <sec id="sec-5-4">
        <title>MI_Agent(Q, Partners, Die, (T, A0), FR, BR, WR, ε). The observation extraction</title>
        <p>process from Q leads to = A2. Afterwards, the most probable world W = h1; 1; 1i
is determined and applying Select function on W gives 0 = A2.</p>
        <p>Consider A ∈ A′′ where A′′ = {A1 ∪ Δ1, A1 ∪ Δ2}. Γ′ ∪ A =
{A2 ∪ A1 ∪ Δ1, A2 ∪ A1 ∪ Δ2}. Applying MAP(Γ′ ∪ A, WR, T) gives W = ⟨1, …, 1⟩,
and applying the Select(W, Γ′ ∪ A) function gives {A2 ∪ A1 ∪ Δ1, A2 ∪ A1 ∪ Δ2}.
Since no forward chaining rule is applicable to the above set, and this set contains
consistent Aboxes, A′ = {A2 ∪ A1 ∪ Δ1, A2 ∪ A1 ∪ Δ2}. In the next step, the function
Interpret(A′, CurrentI, Γ′, T, FR, BR, WR ∪ Γ, ε) is called, which determines P(Γ′, Γ′, R, WR, T) =
0.733. Afterwards, the CAE function is called, which determines the next explanations:
Δ3 = {CarEntry(Ind42), HasObject(Ind42, C2), HasEffect(Ind42, DS2)}
Δ4 = {CarExit(Ind42), HasObject(Ind42, C2), HasEffect(Ind42, DS2)}
The CAE function generates the following agenda, which contains all possible interpretation Aboxes:
{I1, I2, I3, I4}
where:</p>
where:</p>
        <p>I1 = A2 ∪ A1 ∪ Δ1 ∪ Δ3   I2 = A2 ∪ A1 ∪ Δ1 ∪ Δ4</p>
        <p>I3 = A2 ∪ A1 ∪ Δ2 ∪ Δ3   I4 = A2 ∪ A1 ∪ Δ2 ∪ Δ4</p>
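The agenda above is simply the cross product of the previous interpretation Aboxes with the new explanations, each candidate united with the new observations. A minimal sketch, with Aboxes modeled as frozensets of assertion strings (our representation, not the paper's):

```python
from itertools import product

# Sketch: the agenda is built by uniting the new observations with each
# previous interpretation Abox and each new explanation, mirroring
# I1..I4 = A2 ∪ A1 ∪ Delta_i ∪ Delta_j in the text.

def build_agenda(observations, previous_aboxes, explanations):
    return [observations | prev | delta
            for prev, delta in product(previous_aboxes, explanations)]

A1 = frozenset({"Car(C1)"})
A2 = frozenset({"Car(C2)"})
d1 = frozenset({"CarEntry(Ind41)"})
d2 = frozenset({"CarExit(Ind41)"})
d3 = frozenset({"CarEntry(Ind42)"})
d4 = frozenset({"CarExit(Ind42)"})

agenda = build_agenda(A2, [A1 | d1, A1 | d2], [d3, d4])  # four Aboxes
```

Two previous Aboxes combined with two explanations yield the four candidates I1 to I4.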
        <p>Afterwards, the forward chaining rules are applied to the above agenda. A new assertion Before(Ind41, Ind42)
is generated and added to the four interpretation Aboxes. In the following, the resulting four
interpretation Aboxes are given:</p>
        <p>I1 = A2 ∪ A1 ∪ Δ1 ∪ Δ3 ∪ {Before(Ind41, Ind42)}
I2 = A2 ∪ A1 ∪ Δ1 ∪ Δ4 ∪ {Before(Ind41, Ind42)}
I3 = A2 ∪ A1 ∪ Δ2 ∪ Δ3 ∪ {Before(Ind41, Ind42)}</p>
        <p>I4 = A2 ∪ A1 ∪ Δ2 ∪ Δ4 ∪ {Before(Ind41, Ind42)}
Afterwards, the backward chaining rule is applied, which generates the following set Δ5 only for the
interpretation Abox I2:</p>
        <p>Δ5 = {CarRide(Ind44), HasStartEvent(Ind44, Ind41), HasEndEvent(Ind44, Ind42)}
Consequently, I2 = I2 ∪ Δ5. The interpretation Aboxes have the following scoring values:
P(A1 ∪ A2, I1, R, WR, T) = 0.964
P(A1 ∪ A2, I2, R, WR, T) = 0.978
P(A1 ∪ A2, I3, R, WR, T) = 0.952</p>
        <p>P(A1 ∪ A2, I4, R, WR, T) = 0.959</p>
        <p>The above values show that the interpretation Abox I2 has a higher scoring value than the other
interpretation Aboxes. Therefore, the final interpretation Abox is NewI = I2. The Abox differences
for additions and omissions are defined as follows:</p>
        <p>add = A2 [ 4 [ [ fBefore(Ind41; Ind42)g omi = ;</p>
        <p>For the next interpretation steps, the agenda can continue with I2 and eliminate the other
interpretation Aboxes, since this Abox has the highest scoring value.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Manage Agenda</title>
      <p>In this section, we introduce some techniques which improve the performance of the agenda.
– Elimination of interpretation Aboxes: This technique is applied if there are multiple
interpretation Aboxes with different scoring values, one of which has the highest scoring value.
In this case, we can select this Abox, eliminate the remaining interpretation Aboxes, and continue
the interpretation process with the selected Abox.
– Combining interpretation Aboxes: Consider the interpretation Aboxes I1, …, In. In order to
determine the final interpretation Abox, the MAP process can be applied to the union of all
interpretation Aboxes I1 ∪ … ∪ In.
– Shrinking interpretation Aboxes: With this technique, we decide which assertions
from the previous video shots have to be considered for the interpretation of the following
video shots, since considering all assertions of the previous video shots slows down the
interpretation process. We believe that only the high-level concept assertions from the previous video
shots play an important role, and not the low-level concept assertions.</p>
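The three techniques above can be sketched as follows. Which predicates count as "high-level" is an assumption of the sketch (we mark event-level concepts like CarRide as high-level); the function names are illustrative, not taken from the paper.

```python
# Sketch of the three agenda-management techniques: elimination (keep
# only the best-scoring Abox), combination (union of all Aboxes as input
# to the MAP step), and shrinking (keep only high-level assertions of
# previous shots). The predicate classification is illustrative only.

HIGH_LEVEL = {"CarEntry", "CarExit", "CarRide"}   # assumed classification

def eliminate(scored):
    """Keep only the interpretation Abox with the highest score."""
    best = max(scored, key=scored.get)
    return {best: scored[best]}

def combine(aboxes):
    """Union of all interpretation Aboxes, as input to the MAP process."""
    return frozenset().union(*aboxes)

def shrink(abox):
    """Keep only high-level concept assertions for the next shots."""
    return frozenset(a for a in abox if a.split("(")[0] in HIGH_LEVEL)

scored = {frozenset({"CarEntry(Ind41)", "Car(C1)"}): 0.941,
          frozenset({"CarExit(Ind41)", "Car(C1)"}): 0.935}
kept = eliminate(scored)
small = shrink(next(iter(kept)))   # drops the low-level Car assertion
```

Shrinking here drops the low-level Car(C1) assertion and carries only the event-level CarEntry assertion into the next interpretation step.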
    </sec>
    <sec id="sec-7">
      <title>Summary</title>
      <p>For multimedia interpretation, a semantically well-founded formalization is required. In accordance
with previous work, a well-founded abduction-based approach is pursued in CASAM. Extending
previous work, abduction is controlled by probabilistic knowledge and is carried out in terms of first-order
logic. Rather than merely using abduction to compute explanations with which the observations are
entailed, the approach presented in this paper uses a probabilistic logic to motivate the explanation
endeavor by increasing the belief in the observations. Hence, there is a certain utility for an agent
in the computational resources it spends on generating explanations. Thus, we have presented a first
attempt to model a media interpretation agent more appropriately.</p>
    </sec>
  </body>
  <back>
  </back>
</article>