<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Subjective Logic Extensions for the Semantic Web</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Davide Ceolin</string-name>
          <email>d.ceolin@vu.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Archana Nottamkandath</string-name>
          <email>a.nottamkandath@vu.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wan Fokkink</string-name>
          <email>w.j.fokkink@vu.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>VU University</institution>
          ,
          <addr-line>Amsterdam</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Subjective logic is a powerful probabilistic logic that is useful for handling data under uncertainty. Subjective logic and the Semantic Web can benefit from each other: subjective logic helps to handle the inherent noisiness of Semantic Web data, while the Semantic Web offers a means to obtain evidence for performing evidential reasoning based on subjective logic. In this paper we propose three extensions and applications of subjective logic in the Semantic Web, namely: the use of semantic similarity measures for weighing subjective opinions, a way of accounting for partial observations, and the new concept of “open world opinion”, i.e. subjective opinions based on Dirichlet Processes, which extend multinomial opinions. For each of these extensions, we provide examples and applications to show their validity.</p>
      </abstract>
      <kwd-group>
        <kwd>Subjective Logic</kwd>
        <kwd>Semantic Similarity</kwd>
        <kwd>Dirichlet Process</kwd>
        <kwd>Partial Observations</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Subjective logic [7] is a probabilistic logic widely adopted in the trust
management domain, based on evidential reasoning and statistical principles. This logic
focuses on the representation of, and the reasoning about, assertions whose truth
value is not fully determined, but estimated on the basis of the observed
evidence. The logic comes with a variety of operators that allow one to combine such
assertions and to derive the truth values of their consequences.</p>
      <p>Subjective logic is well-suited for the management of uncertainty within the
Semantic Web. For instance, the incremental access to Semantic Web data (as a
consequence of crawling) can give rise to uncertainty issues which can be dealt with
using this logic. Furthermore, the fulcrum of this logic is the concept
of “subjective opinion” (which represents an assertion, its corresponding evidence
and the source of this evidence). This makes it possible to correctly represent how the estimated
truth value of an assertion is bound to the source of the corresponding evidence
and to easily keep lightweight provenance information. Finally, evidential
reasoning helps to limit the typical noisiness of Semantic Web data. On the
other hand, we also believe that the Semantic Web can be beneficial to this
logic, as a very important source of information: since the truth value
of assertions is based on the availability of observations, the more data is available
(hopefully of high enough quality), the closer we can get to the correct truth
value for our assertions. We believe that this mutual relationship can be
improved. This paper proposes extensions and applications of subjective logic that
are aimed at the Semantic Web.</p>
      <p>The rest of the paper is organized as follows: Section 2 describes related work,
Section 3 proposes a combination of subjective logic and semantic similarity
measures, Section 4 introduces a method for dealing with partial observations
of evidence, and Section 5 introduces the concept of open world opinion. Section 6
provides a final discussion of the work presented.</p>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>The development of operators for subjective logic has been investigated in previous work.
Notably, the averaging and cumulative fusion [8,9] and the discounting [11] operators
are among the most generic and useful operators for this logic. These operators
provide the foundations for the work proposed in this paper. The connections
between subjective logic and the (Semantic) Web are increasing. Ceolin et al. [4]
adopt this logic for computing trust values of annotations provided by experts,
using DBpedia and other Web sources as evidence. Unlike this work, they do not
use semantic similarity measures. Ceolin et al. [3] and Bellenger et al. [1] provide
applications of the combination of evidential reasoning with semantic
similarity measures and Semantic Web technologies. In the current paper we provide
the theoretical foundations for these kinds of approaches, and we generalize them.
Sensoy et al. [15] use semantic similarity in combination with subjective logic
to import knowledge from one context to another. They use the semantic
similarity measure to compute a prior value for the imported data, while we use it
to weigh all the available evidence. Kaplan et al. [12] focus on the exploration
of uncertain partial observations used for building subjective opinions. Unlike
their work, we restrict our focus to partial observations of Web-like data and
evaluations, which comprise the number of “likes”, links and other similar
indicators related to a given Web item. The weighing and discounting based on
semantic similarity measures can resemble the work of Jøsang et al. [8], although
the additional information that we include in our reasoning (which is semantic
similarity) is related only to the frame of discernment in subjective logic, and
not to the belief assignment function.</p>
      <sec id="sec-2-1">
        <title>Preliminaries</title>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Combining Subjective Logic with Semantic Similarity</title>
      <p>Subjective Logic. In subjective logic, so-called “subjective opinions” express
the belief that source x holds with respect to the value of assertion y chosen
among the elements of the set Θ (“frame of discernment”). The belief is assigned
to the elements of the set <inline-formula><tex-math>X = 2^{\Theta} \setminus \Theta</tex-math></inline-formula> (“frame”), according to the evidence. In
symbols, this is represented as <inline-formula><tex-math>\omega(b, d, u, a)</tex-math></inline-formula> when |Θ| = 2 (binomial opinion) or
as <inline-formula><tex-math>\omega(\vec{b}, u, \vec{a})</tex-math></inline-formula> when |Θ| &gt; 2 (multinomial opinion). The positive and negative
evidence is represented as p and n respectively. The belief (b), disbelief (d) and
uncertainty (u) of a binomial opinion are computed as follows, where a denotes the a priori value:</p>
      <disp-formula>
        <label>(1)</label>
        <tex-math>b = \frac{p}{p + n + 2} \qquad d = \frac{n}{p + n + 2} \qquad u = \frac{2}{p + n + 2}</tex-math>
      </disp-formula>
      <p>
        A subjective opinion is equivalent to a Beta probability distribution (binomial
opinion) or to a Dirichlet distribution (multinomial opinion). The expected value
(E) for the Beta distribution is computed as in equation (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ).
      </p>
      <p>Opinions are computed based on contexts. For example, source x provides
an observation about assertion y in context c (e.g. about an agent’s expertise).
The trustworthiness of assertion y in context c, represented as t(x, y : c), is
the expected value of the Beta distribution corresponding to the opinion and
is computed as:</p>
      <disp-formula>
        <label>(2)</label>
        <tex-math>E = t(x, y : c) = b + a \cdot u</tex-math>
      </disp-formula>
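      <p>The mapping from evidence counts to a binomial opinion and its expected value can be illustrated with a minimal Python sketch. The class and function names are hypothetical, and the default a priori value of 0.5 is an assumption made only for this illustration.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinomialOpinion:
    """Binomial opinion (b, d, u, a); names are illustrative only."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float

    def expected_value(self) -> float:
        # E = b + a * u
        return self.belief + self.base_rate * self.uncertainty


def opinion_from_evidence(positive: float, negative: float,
                          base_rate: float = 0.5) -> BinomialOpinion:
    """Build an opinion from positive (p) and negative (n) evidence counts.

    The default base rate of 0.5 is an assumption of this sketch.
    """
    total = positive + negative + 2.0
    return BinomialOpinion(
        belief=positive / total,
        disbelief=negative / total,
        uncertainty=2.0 / total,
        base_rate=base_rate,
    )


opinion = opinion_from_evidence(positive=8, negative=2)
print(opinion, opinion.expected_value())
```
      </preformat>
      <p>With 8 positive and 2 negative observations, belief, disbelief and uncertainty sum to one, and the expected value lies between the belief and the belief plus the uncertainty.</p>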
      <sec id="sec-3-1">
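        <title>Base Rate Discounting Operator in Subjective Logic</title>
        <p>In subjective logic, the base rate sensitive discounting of the opinion of
source B on y, <inline-formula><tex-math>\omega^{B}_{y} = (b^{B}_{y}, d^{B}_{y}, u^{B}_{y}, a^{B}_{y})</tex-math></inline-formula>, by the opinion of
source A on B, <inline-formula><tex-math>\omega^{A}_{B} = (b^{A}_{B}, d^{A}_{B}, u^{A}_{B}, a^{A}_{B})</tex-math></inline-formula>,
produces the transitive belief <inline-formula><tex-math>\omega^{A:B}_{y} = (b^{A:B}_{y}, d^{A:B}_{y}, u^{A:B}_{y}, a^{A:B}_{y})</tex-math></inline-formula>, where</p>
        <disp-formula>
          <label>(3)</label>
          <tex-math>\begin{gathered}
b^{A:B}_{y} = E(\omega^{A}_{B})\, b^{B}_{y} \qquad d^{A:B}_{y} = E(\omega^{A}_{B})\, d^{B}_{y} \\
u^{A:B}_{y} = 1 - E(\omega^{A}_{B})\,(b^{B}_{y} + d^{B}_{y}) \qquad a^{A:B}_{y} = a^{B}_{y}
\end{gathered}</tex-math>
        </disp-formula>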
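        <p>The base rate discounting operator described above can be sketched in a few lines of Python. The tuple type and the function names are hypothetical, chosen only for this illustration.</p>
        <preformat preformat-type="code">
```python
from typing import NamedTuple


class Opinion(NamedTuple):
    """Binomial opinion (b, d, u, a); illustrative container only."""
    b: float
    d: float
    u: float
    a: float


def expected_value(opinion: Opinion) -> float:
    # E = b + a * u
    return opinion.b + opinion.a * opinion.u


def base_rate_discount(trust_in_source: Opinion, source_opinion: Opinion) -> Opinion:
    """Discount B's opinion on y by A's opinion on B (base rate sensitive)."""
    e = expected_value(trust_in_source)
    belief = e * source_opinion.b
    disbelief = e * source_opinion.d
    # The mass removed from belief and disbelief becomes uncertainty.
    uncertainty = 1.0 - e * (source_opinion.b + source_opinion.d)
    return Opinion(belief, disbelief, uncertainty, source_opinion.a)


a_on_b = Opinion(0.6, 0.2, 0.2, 0.5)   # A's opinion on source B
b_on_y = Opinion(0.5, 0.3, 0.2, 0.5)   # B's opinion on assertion y
print(base_rate_discount(a_on_b, b_on_y))
```
        </preformat>
        <p>In this example A's expected trust in B is 0.7, so B's belief and disbelief are both scaled by 0.7 and the remainder is assigned to uncertainty.</p>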
      </sec>
      <sec id="sec-3-2">
        <title>Wu &amp; Palmer Semantic Similarity Measure</title>
        <p>Many semantic similarity measures have been developed (see the work of
Budanitsky and Hirst [2]). We focus on those computed from WordNet. WordNet groups
words into sets of synonyms called synsets and describes the semantic relationships
between them. It is a directed acyclic graph in which each vertex v is an integer that
represents a synset, and each directed edge from v to w indicates that w is a hypernym of v.
We focus on the Wu &amp; Palmer metric [18], which calculates semantic relatedness
in a deterministic way by considering the depths of the two synsets in the
WordNet taxonomies, along with the depth of the Least Common Subsumer
(lcs), as follows:</p>
        <disp-formula>
          <label>(4)</label>
          <tex-math>\mathit{score}(s_1, s_2) = \frac{2 \cdot \mathit{depth}(\mathit{lcs})}{\mathit{depth}(s_1) + \mathit{depth}(s_2)}</tex-math>
        </disp-formula>
        <p>
          This means that score ∈ ]0, 1]. For deriving the opinions about a concept
where no evidence is available, we incorporate score, which represents the
semantic similarity sim(c, c′), in our trust assessment, where c and c′ are concepts
belonging to synsets s1 and s2 respectively, which represent two contexts.
        </p>
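        <p>As an illustration of the Wu &amp; Palmer score, the following sketch computes it on a tiny hand-made taxonomy. The taxonomy, the node names and the convention that the root has depth 1 are assumptions of this example; real WordNet is a graph in which a synset may have several hypernyms, which this tree-shaped toy does not model.</p>
        <preformat preformat-type="code">
```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy: each node maps to its single hypernym.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "plant": "entity",
    "flower": "plant",
    "tulip": "flower",
    "sunflower": "flower",
    "animal": "entity",
    "dog": "animal",
}


def path_to_root(node: str) -> List[str]:
    """Return the chain node, hypernym, ..., root."""
    path = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def depth(node: str) -> int:
    # Assumption of this sketch: the root has depth 1.
    return len(path_to_root(node))


def wu_palmer(s1: str, s2: str) -> float:
    """score = 2 * depth(lcs) / (depth(s1) + depth(s2))."""
    ancestors_of_s2 = set(path_to_root(s2))
    # The first shared node when walking up from s1 is the deepest common subsumer.
    lcs = next(node for node in path_to_root(s1) if node in ancestors_of_s2)
    return 2.0 * depth(lcs) / (depth(s1) + depth(s2))


print(wu_palmer("tulip", "sunflower"))
print(wu_palmer("tulip", "dog"))
```
        </preformat>
        <p>In this toy taxonomy the two flowers share a deep common subsumer and obtain a high score, while the tulip and the dog only meet at the root and obtain a low one.</p>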
      </sec>
      <sec id="sec-3-3">
        <title>Using Semantic Similarity Measures within Subjective Logic</title>
        <p>Deriving Opinion about a New or Unknown Context. Since we compute
opinions based on contexts, it is possible that the evidence required to compute the
opinion for a particular context is unavailable. For example, suppose that source
x holds observations about an assertion in a certain context (e.g. the expertise
of an agent about tulips), but needs to evaluate them in a new context (e.g.
the agent’s expertise about sunflowers), for which it holds no observations. The
semantic similarity measure between two contexts, sim(c, c′), can be used for
obtaining the opinion about an agent y in an unknown or new context through
two different methods: weighing (applied to the evidence) or discounting
(applied to the opinion). Both approaches are described below. We will
show that discounting and weighing are theoretically but not statistically
different.</p>
        <p>Weighing the Evidence. We weigh the positive and negative evidence
belonging to a certain context (e.g. Tulips) by the corresponding semantic similarity
to the new context (e.g. Sunflowers), sim(Tulips, Sunflowers). We then
perform this for all the contexts for which source x has already provided an
opinion, ∀c′ ∈ C, by weighing all the positive (p) and negative (n) evidence
of c′ with the similarity measure sim(c, c′) to obtain an opinion about y in
c (see the work of Ceolin et al. [3]).</p>
        <p>Discounting the Opinion. In the second approach, every opinion that source x has
about other related contexts c′, where c′ ∈ C, is discounted with the
corresponding semantic similarity measure sim(c, c′) using the discounting
operator in subjective logic. The discounted opinions are then aggregated to
form the final opinion of x about y in the new context c.</p>
        <p>Discounting Operator and Semantic Similarity. Subjective logic offers a
variety of operators for “discounting”, i.e. for smoothing opinions given by third
parties, provided that we have at our disposal an opinion about the source itself.
“Smoothing” means reducing the belief provided by the third party,
depending on the opinion on the source (the worse the opinion, the higher the
reduction). Moreover, since the components of the opinion always sum to one,
reducing the belief implies an increase of one of the other components: hence
there exists a discounting operator favoring uncertainty and one favoring
disbelief. Finally, there exists a discounting operator that makes use of the expected
value E of the opinion. Following this line of thought, when opinions are lacking
in the context of interest, we can use the semantic similarity as a discount factor
for opinions imported from related contexts, to handle possible variations
in the validity of the statements due to the change of context.</p>
        <p>Choosing the Appropriate Discounting Operator. We need to choose the
appropriate discounting operator that allows us to use the semantic similarity
value as a discounting factor for opinions. The disbelief favoring discounting is
an operator that is employed whenever one believes that the source considered
might be malicious. This is not our case, since the discounting is used to import
opinions that we own ourselves but that were computed in contexts different from the one of
interest. Hence we do not make use of the disbelief favoring operator.</p>
        <p>In principle, we would have no specific reason to prefer either the
uncertainty favoring discounting or the base rate discounting. Since
the belief (and hence the expected value) is only rarely equal to 1, the
two discounting operators both decrease the belief of the provided opinion: one by
multiplying it by the belief in the source, the other by multiplying it by the expected value of
the opinion about the source. In practice, we will see that, thanks to Theorem
1, these two operators are equivalent in this context.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Theorem 1 (Semantic Relatedness Measure is a Dogmatic Opinion).</title>
        <p>Let sim(c, c′) be the semantic similarity between two contexts c and c′ obtained
by computing the semantic distance between the contexts in a graph through
deterministic measurements (e.g. [18]). Then, ∀ sim(c, c′) ∈ [0, 1],
<inline-formula><tex-math>\omega^{c=c'}_{measure} = (b^{c=c'}_{measure}, d^{c=c'}_{measure}, u^{c=c'}_{measure}, a^{c=c'}_{measure})</tex-math></inline-formula>
is equivalent to a dogmatic opinion in subjective logic,
where measure indicates the procedure used to obtain the semantic distance,
e.g. the Wu and Palmer measure. The values of belief and disbelief are obtained as:</p>
        <disp-formula>
          <label>(5)</label>
          <tex-math>b^{c=c'}_{measure} = \mathrm{sim}(c, c') \qquad d^{c=c'}_{measure} = 1 - b^{c=c'}_{measure}</tex-math>
        </disp-formula>
        <p>Proof. A binomial opinion is a dogmatic opinion if the value of uncertainty
is 0. The semantic similarity measure can be represented as an opinion about
the similarity of two contexts c and c′. However, since we restrict our focus
to WordNet-based measures, the similarity is inferred by graph measurements,
and not by probabilistic means. This means that, according to the source, this
is a “dogmatic” opinion, since it does not provide any indication of uncertainty:
<inline-formula><tex-math>u^{c=c'}_{measure} = 0</tex-math></inline-formula>. The opinion is not based on the observation of evidence, but rather on actual
deterministic measurements. Its expected value is:</p>
        <disp-formula>
          <label>(6)</label>
          <tex-math>E(\omega^{c=c'}_{measure}) = b^{c=c'}_{measure} + u^{c=c'}_{measure} \cdot a = \mathrm{sim}(c, c')</tex-math>
        </disp-formula>
        <p>□</p>
      </sec>
      <sec id="sec-3-5">
        <title>Corollary 1 (Discounting an Opinion with a Dogmatic Opinion).</title>
        <p>Let A be a source who has an opinion about y in context c′ expressed as
<inline-formula><tex-math>\omega^{A:c'}_{y} = (b^{A:c'}_{y}, d^{A:c'}_{y}, u^{A:c'}_{y}, a^{A:c'}_{y})</tex-math></inline-formula>
and let the semantic similarity between the contexts c and c′ be represented as a dogmatic opinion
<inline-formula><tex-math>\omega^{c=c'}_{measure} = (b^{c=c'}_{measure}, d^{c=c'}_{measure}, 0, a^{c=c'}_{measure})</tex-math></inline-formula>.
Since the source A does not have any prior opinion about the context c, we derive
the opinion of A about c, represented as
<inline-formula><tex-math>\omega^{A:c'}_{c} = (b^{A:c'}_{c}, d^{A:c'}_{c}, u^{A:c'}_{c}, a^{A:c'}_{c})</tex-math></inline-formula>,
using the base rate discounting operator on the dogmatic opinion:</p>
        <disp-formula>
          <label>(7)</label>
          <tex-math>\begin{gathered}
b^{A:B}_{y} = \mathrm{sim}(c, c')\, b^{B}_{y} \qquad d^{A:B}_{y} = \mathrm{sim}(c, c')\, d^{B}_{y} \\
u^{A:B}_{y} = 1 - \mathrm{sim}(c, c')\,(b^{B}_{y} + d^{B}_{y}) \qquad a^{A:B}_{y} = a^{B}_{y}
\end{gathered}</tex-math>
        </disp-formula>
        <p>Definition 1 (Weighing Operator). Let C be the set of contexts c′ of which
a source A has an opinion derived from the positive and negative evidence in the
past. Let c be a new context for which A has no opinion yet. We can derive the
opinion of A about facts in c by weighing the relevant evidence in set C with the
semantic similarity measure sim(c, c′), ∀c′ ∈ C. The belief, disbelief, uncertainty
and a priori value obtained through the weighing operation are expressed below.</p>
        <disp-formula>
          <label>(8)</label>
          <tex-math>\begin{gathered}
b^{A}_{c} = \frac{\mathrm{sim}(c, c')\, p^{A}_{c'}}{\mathrm{sim}(c, c')\,(p^{A}_{c'} + n^{A}_{c'}) + 2} \qquad
d^{A}_{c} = \frac{\mathrm{sim}(c, c')\, n^{A}_{c'}}{\mathrm{sim}(c, c')\,(p^{A}_{c'} + n^{A}_{c'}) + 2} \\
u^{A}_{c} = 1 - \frac{\mathrm{sim}(c, c')\,(p^{A}_{c'} + n^{A}_{c'})}{\mathrm{sim}(c, c')\,(p^{A}_{c'} + n^{A}_{c'}) + 2} \qquad
a^{A}_{c} = a^{A}_{c'}
\end{gathered}</tex-math>
        </disp-formula>
      </sec>
      <sec id="sec-3-6">
        <title>Theorem 2 (Approximation of the Weighing and Discounting Operators).</title>
        <p>Let <inline-formula><tex-math>\omega^{A:c'}_{y:c} = (b^{A:c'}_{y:c}, d^{A:c'}_{y:c}, u^{A:c'}_{y:c}, a^{A:c'}_{y:c})</tex-math></inline-formula>
be a discounted opinion which source
A has about y in a new or unknown context c, derived by discounting A’s
opinion on known contexts c′ ∈ C, represented as
<inline-formula><tex-math>\omega^{A}_{c'} = (b^{A}_{c'}, d^{A}_{c'}, u^{A}_{c'}, a^{A}_{c'})</tex-math></inline-formula>, with the
corresponding dogmatic opinions (e.g. sim(c, c′)). Let source A also obtain an
opinion about the unknown context c based on the evidence available from the
earlier contexts c′, by weighing the evidence (positive and negative) with the
semantic similarity between c and c′, sim(c, c′), ∀c′ ∈ C. Then the difference between
the results from the weighing and from the discounting operator in subjective logic
is statistically insignificant.</p>
        <p>
          Proof. We substitute the values of belief, disbelief and uncertainty in equation
(
          <xref ref-type="bibr" rid="ref9">9</xref>
          ) for Base Rate Discounting with the values from equation (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) and the expectation
value from equation (
          <xref ref-type="bibr" rid="ref5">5</xref>
          ). We obtain the new value of the discounted base rate
opinion as follows:
        </p>
        <disp-formula>
          <label>(9)</label>
          <tex-math>\begin{gathered}
b^{A:c'}_{c} = \frac{\mathrm{sim}(c, c')\, p^{A}_{c'}}{p^{A}_{c'} + n^{A}_{c'} + 2} \qquad
d^{A:c'}_{c} = \frac{\mathrm{sim}(c, c')\, n^{A}_{c'}}{p^{A}_{c'} + n^{A}_{c'} + 2} \\
u^{A:c'}_{c} = 1 - \frac{\mathrm{sim}(c, c')\,(p^{A}_{c'} + n^{A}_{c'})}{p^{A}_{c'} + n^{A}_{c'} + 2} \qquad
a^{A:c'}_{c} = a^{A}_{c'}
\end{gathered}</tex-math>
        </disp-formula>
        <p>
          Equation (
          <xref ref-type="bibr" rid="ref9">9</xref>
          ) and (
          <xref ref-type="bibr" rid="ref8">8</xref>
          ) are very similar, except for the
<inline-formula><tex-math>\mathrm{sim}(c, c') \cdot (p^{A}_{c'} + n^{A}_{c'})</tex-math></inline-formula>
factor in the weighing operator. In the following section we use Student’s t-test
and the Wilcoxon signed-rank test at the 95% confidence level to show that the difference due to that
factor is not statistically significant for large values of sim(c, c′) (at least 0.5).
        </p>
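        <p>To make the comparison between the two operators concrete, the sketch below evaluates both on the same evidence from a single known context. The function names and the evidence counts are hypothetical; the sketch only illustrates the formulas and makes no claim about the statistical tests reported in the evaluation.</p>
        <preformat preformat-type="code">
```python
from typing import Tuple

OpinionTriple = Tuple[float, float, float]  # (belief, disbelief, uncertainty)


def weighed(sim: float, p: float, n: float) -> OpinionTriple:
    """Weighing operator: the evidence is scaled by sim before building the opinion."""
    denom = sim * (p + n) + 2.0
    return sim * p / denom, sim * n / denom, 1.0 - sim * (p + n) / denom


def discounted(sim: float, p: float, n: float) -> OpinionTriple:
    """Base rate discounting of the opinion by a dogmatic opinion with E = sim."""
    denom = p + n + 2.0
    return sim * p / denom, sim * n / denom, 1.0 - sim * (p + n) / denom


# Hypothetical evidence: 8 positive and 2 negative observations in context c'.
for sim in (1.0, 0.9, 0.5):
    b_w, _, _ = weighed(sim, 8, 2)
    b_d, _, _ = discounted(sim, 8, 2)
    print(sim, round(b_w, 4), round(b_d, 4), round(b_w - b_d, 4))
```
        </preformat>
        <p>For sim(c, c′) = 1 the two operators coincide, and the gap between the two belief values grows as the similarity decreases, which is in line with restricting the comparison to highly similar contexts.</p>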
      </sec>
      <sec id="sec-3-7">
        <title>Evaluations</title>
        <p>We statistically validate the similarity between weighing and discounting (complete results are available at http://tinyurl.com/bp43k5d).</p>
      </sec>
      <sec id="sec-3-8">
        <title>First Validation: Discounting and Weighing in a Real-Life Case</title>
        <p>Steve Social Tagging Project Dataset. For the purpose of our evaluations,
we use the “Steve Social Tagging Project” [16] data (in particular, the
“Researching social tagging and folksonomy in the ArtMuseum”), which is a
collaboration of museum professionals and others aimed at enhancing social
tagging. In our experiments, we used a sample of tags which the users of
the system provided for the 1784 images of the museum available online.
Most of the tags were evaluated by the museum professionals to assess their
trustworthiness. We used only the evaluated tags for our experiments. The
tags can be single words or strings of words provided by the user regarding
any objective aspect of the image displayed to them for tagging.</p>
        <p>Gathering Evidence for Evaluation. We select a set of highly
semantically related tags by using a Web-based WordNet interface [14]. We then gather
the list of users who provided the tags regarding the chosen words and count
the number of positive and negative pieces of evidence.</p>
        <p>
          The opinions are calculated using two different methods: first, by weighing
the evidence with the semantic distance using equation (
          <xref ref-type="bibr" rid="ref8">8</xref>
          ), and second,
by discounting the evidence with the semantic distance using
equation (
          <xref ref-type="bibr" rid="ref9">9</xref>
          ). We consider the Chinese-Asian pair (semantic similarity 0.933)
and the Chinese-Buddhist pair (semantic similarity 0.6667).
        </p>
        <p>Results. We employ Student’s t-test and the Wilcoxon signed-rank test to
assess the statistical significance of the difference between the two sample means.
At the 95% confidence level, both tests show a statistically significant difference
between the two means. This difference is 0.025 for the Chinese-Asian pair
and 0.11 for the Chinese-Buddhist pair; it stays small thanks also to the high similarity
(higher than 0.5) between the considered topics. Having removed the average
difference from the results obtained from discounting (which, on average, are
higher than those from weighing), both tests confirm that the results of
the two methods are equally distributed.</p>
      </sec>
      <sec id="sec-3-9">
        <title>Second Validation: Discounting and Weighing on a Large Simulated Dataset</title>
        <p>In order to validate our hypothesis that weighing with semantic
distance produces results that are highly similar to those obtained with the
discounting operator of subjective logic, we perform Student’s t-test and the
Wilcoxon signed-rank test on a larger dataset consisting of 1000 samples. For
semantic distance values sim(c, c′) &gt; 0.7, the mean difference between the belief
values obtained by weighing and discounting is 0.092. Thus, at the 95%
confidence level, both tests confirm that the weighing operator and the
discounting operator produce similar results. The semantic similarity threshold
sim(c, c′) &gt; 0.7 is relevant and reasonable: it is more
meaningful to compute opinions for a new context based on the opinions provided
earlier for the most semantically related contexts, whereas, even when
evidence for a given context is lacking, evidence about a very different context cannot be
very significant.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Partial Evidence Observation</title>
      <p>The Web and the Semantic Web are pervaded by data that can be used as
evidence for a given purpose, but that constitute only partially positive or negative
evidence for others. Consider the Waisda? tagging game [13], in which users
challenge each other at video tagging. The more users insert the same tag
for the same video within the same time frame, the more the tag is believed
to be correct. Matching tags can be seen as positive observations that a specific
tag is correct. However, consider the orthogonal issue of user reputation.
User reputation is based on past behavior, hence on the trustworthiness of the
tags previously inserted by the user. Now, the trustworthiness of each tag is
not deterministically computed, since it is roughly estimated from the number
of matching tags for each tag inserted by the user. The expected value of each
tag, which is less than one, can be considered as a partial observation of the
trustworthiness of the tag itself. Vice versa, the remainder can be seen as a
negative partial observation. After having considered tag trustworthiness, one
can use each evaluation as partial evidence with respect to the user reliability:
no tag (or other kind of observation) is used as fully positive or fully negative
evidence, unless its correctness has been proven by an authority or by another
source of validation. However, since the belief (and therefore the
expected value) is only rarely equal to one, these observations almost never count as
fully positive or fully negative evidence. We propose an operator for building opinions
based on indirect observations, i.e., on observations used to build these opinions,
each of which counts as a piece of evidence.</p>
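      <p>Theorem 3 (Partial Evidence-Based Opinions). Let <inline-formula><tex-math>\vec{p}</tex-math></inline-formula> be a vector of
positive observations (e.g. a list of “like” counts) about distinct facts related to a
given subject s. Let l be the length of <inline-formula><tex-math>\vec{p}</tex-math></inline-formula>. Let each opinion based on each entry of
<inline-formula><tex-math>\vec{p}</tex-math></inline-formula> have an a priori value of 1/2. Then we can derive an opinion about the reliability
of the subject in one of these two manners.</p>
      <p>– By cumulating the expected values (counted as partial positive evidence) of
each opinion based on each element of <inline-formula><tex-math>\vec{p}</tex-math></inline-formula>:</p>
      <disp-formula>
        <label>(10)</label>
        <tex-math>b = \frac{1}{l + 2} \sum_{i=1}^{l} \frac{p_i + 1}{p_i + 2} \qquad
d = \frac{1}{l + 2} \sum_{i=1}^{l} \frac{1}{p_i + 2} \qquad
u = \frac{2}{l + 2}</tex-math>
      </disp-formula>
      <p>– By averaging the expected values of the opinions computed on each of the
elements of <inline-formula><tex-math>\vec{p}</tex-math></inline-formula>:</p>
      <disp-formula>
        <label>(11)</label>
        <tex-math>b = \frac{1}{l(l + 2)} \sum_{i=1}^{l} \frac{p_i + 1}{p_i + 2} \qquad
d = \frac{1}{l(l + 2)} \sum_{i=1}^{l} \frac{1}{p_i + 2} \qquad
u = \frac{2}{l(l + 2)}</tex-math>
      </disp-formula>
      <p>Proof. The expected value of each opinion is computed as:</p>
      <disp-formula>
        <label>(12)</label>
        <tex-math>E = b + a \cdot u = \frac{p}{p + 2} + \frac{1}{2} \cdot \frac{2}{p + 2} = \frac{p + 1}{p + 2}</tex-math>
      </disp-formula>
      <p>
        E is considered as partial positive evidence. Hence 1 − E is considered as
partial negative evidence. Given that we have l pieces of partial evidence (because
we have l distinct elements in <inline-formula><tex-math>\vec{p}</tex-math></inline-formula>), we compute the opinion about s following
equations (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ). Having that p (the positive evidence of <inline-formula><tex-math>\omega_s</tex-math></inline-formula>) is equal to
<inline-formula><tex-math>\sum_{i=1}^{l} \frac{p_i + 1}{p_i + 2}</tex-math></inline-formula>, we obtain
equation (
        <xref ref-type="bibr" rid="ref10">10</xref>
        ). If we choose to average the evidence (and hence the expected
values) instead of cumulating them, we obtain
<inline-formula><tex-math>p = \frac{1}{l} \sum_{i=1}^{l} \frac{p_i + 1}{p_i + 2}</tex-math></inline-formula>, hence
<inline-formula><tex-math>b = \frac{1}{l + 2} \cdot \frac{1}{l} \sum_{i=1}^{l} \frac{p_i + 1}{p_i + 2}</tex-math></inline-formula>,
and therefore we obtain equation (
        <xref ref-type="bibr" rid="ref11">11</xref>
        ). □
      </p>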
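      <p>The cumulating variant of the theorem can be illustrated with a short Python sketch. The function name and the “like” counts are hypothetical; each count is first turned into an expected value with an a priori value of 1/2, and these expected values are then used as partial positive evidence.</p>
      <preformat preformat-type="code">
```python
from typing import Sequence, Tuple


def cumulated_partial_opinion(like_counts: Sequence[float]) -> Tuple[float, float, float]:
    """Return (belief, disbelief, uncertainty) by cumulating partial evidence.

    Each count p_i yields the expected value (p_i + 1) / (p_i + 2), used as
    partial positive evidence; its complement is partial negative evidence.
    """
    expected_values = [(p + 1.0) / (p + 2.0) for p in like_counts]
    partial_positive = sum(expected_values)
    partial_negative = sum(1.0 - e for e in expected_values)
    # Positive and negative partial evidence add up to l, the number of items.
    total = len(expected_values) + 2.0
    return partial_positive / total, partial_negative / total, 2.0 / total


# Hypothetical "like" counts for three items related to the same subject.
print(cumulated_partial_opinion([0, 2, 8]))
```
      </preformat>
      <p>An item with no “likes” still contributes half a unit of positive evidence because of the a priori value, while an item with many “likes” contributes almost a full unit.</p>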
    </sec>
    <sec id="sec-7">
      <title>Dirichlet Process-Based Opinions: Open World Opinions</title>
      <p>The Dirichlet Process [6] is a stochastic process representing a probability
distribution whose domain is a random probability distribution. As we previously
saw, the binomial and multinomial opinions are equivalent to Beta and Dirichlet
probability distributions. The Dirichlet distribution represents an extension of
the Beta distribution from a two-category situation to a situation where one
among n possible categories has to be chosen. A Dirichlet process over a set
S is a stochastic process whose sample path (i.e. an infinite-dimensional set of
random variables drawn from the process) is a probability distribution on S.
The finite dimensional distributions are from the Dirichlet distribution: if H is
a finite measure on S, α is a positive real number and X is a sample path drawn
from a Dirichlet process, written as</p>
      <disp-formula>
        <label>(13)</label>
        <tex-math>X \sim DP(\alpha, H)</tex-math>
      </disp-formula>
      <p>then for any partition of S of cardinality m, say <inline-formula><tex-math>\{B_i\}_{i=1}^{m}</tex-math></inline-formula>,</p>
      <disp-formula>
        <label>(14)</label>
        <tex-math>(X(B_1), \ldots, X(B_m)) \sim \mathrm{Dirichlet}(\alpha H(B_1), \ldots, \alpha H(B_m))</tex-math>
      </disp-formula>
      <p>Moreover, given n draws from X, we can predict the next observation as:</p>
      <disp-formula>
        <label>(15)</label>
        <tex-math>\mathit{obs}_{n+1} = \begin{cases}
x_i \; (i \in [1 \ldots k]) \quad \text{with probability } \frac{n(x_i)}{n + \alpha} \\
H \quad \text{with probability } \frac{\alpha}{n + \alpha}
\end{cases}</tex-math>
      </disp-formula>
      <p>
        where <inline-formula><tex-math>x_i</tex-math></inline-formula> is one of the k unique values among the observations gathered.
      </p>
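      <p>The prediction rule for the next observation can be sketched as a small sampling routine. The function name, the base sampler standing in for H and the example values are assumptions of this illustration, not part of the original formulation.</p>
      <preformat preformat-type="code">
```python
import random
from typing import Callable, List, TypeVar

T = TypeVar("T")


def draw_next(observations: List[T], alpha: float,
              base_sampler: Callable[[random.Random], T],
              rng: random.Random) -> T:
    """Sample the next observation given n previous draws.

    With probability alpha / (n + alpha) a fresh value is drawn from the
    base measure H (here: base_sampler). Otherwise a past draw is repeated,
    so a value x_i seen n(x_i) times is returned with probability
    n(x_i) / (n + alpha).
    """
    n = len(observations)
    if rng.random() * (n + alpha) >= n:
        return base_sampler(rng)
    return rng.choice(observations)


rng = random.Random(42)
seen = ["hijacked", "hijacked", "boarded"]
# Hypothetical base measure: an unseen category label.
print([draw_next(seen, 1.0, lambda r: "unseen", rng) for _ in range(5)])
```
      </preformat>
      <p>Choosing uniformly among past draws reproduces the count-proportional probabilities, since a value observed more often appears more often in the list.</p>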
      <sec id="sec-7-1">
        <title>Open World Opinions</title>
        <p>When dealing with real data coming from the Web, which are accessed
incrementally, the possibility to update the relative probabilities of possible outcomes
might not be enough. We may need to handle unknown
categories of data, which should nevertheless be accounted for and remain manageable. Ceolin et al.
[5] show how important it is to account for unseen categories when dealing with
Web data. Here, we propose a particular subjective opinion called “open world
opinion”, which accounts for partial knowledge about the possible outcomes. A
subjective opinion resembles a personal opinion provided by a source with respect to
a fact. Open world opinions represent the case when something about a given fact
has been observed, but the evidence also allows for some other (not yet observed)
outcome to be considered plausible. With this extension we allow the frame of
discernment to have infinite cardinality. In practice, open world opinions make it possible
to represent situations in which the unknown outcome of an event can be equal to
one among a list of already observed values (proportionally to the amount of
observations for each of them), but it is also possible that the outcome is different
from what has been observed so far and is drawn from an infinitely large domain
(and so some probability mass is reserved for this case).</p>
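        <p>Definition 2 (Open World Opinion). Let X be a frame of infinite
cardinality, α ∈ ℝ⁺, k be the number of categories observed, <inline-formula><tex-math>\vec{p}</tex-math></inline-formula> be the array of evidence
per category, and <inline-formula><tex-math>\vec{B}</tex-math></inline-formula> be a belief function over X. We define the open world opinion
<inline-formula><tex-math>\omega_x</tex-math></inline-formula> as:</p>
        <disp-formula>
          <label>(16)</label>
          <tex-math>\omega_x(\vec{B}, U, H) \qquad
B_{x_i} = \frac{p_{x_i}}{\alpha + \sum_{t=1}^{k} p_{x_t}} \qquad
U = \frac{\alpha}{\alpha + \sum_{t=1}^{k} p_{x_t}} \qquad
1 = U + \sum_{x_i} B_{x_i}</tex-math>
        </disp-formula>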
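        <p>The belief and uncertainty components of an open world opinion can be computed directly from the evidence counts, as in the following sketch. The function name, the category labels, the counts and the value of α are hypothetical and serve only to illustrate the definition.</p>
        <preformat preformat-type="code">
```python
from typing import Dict, Tuple


def open_world_opinion(evidence: Dict[str, float],
                       alpha: float) -> Tuple[Dict[str, float], float]:
    """Return (beliefs per observed category, uncertainty U).

    B_xi = p_xi / (alpha + sum of evidence), U = alpha / (alpha + sum of evidence).
    U is the mass reserved for outcomes that have not been observed yet.
    """
    if not alpha > 0:
        raise ValueError("alpha must be a positive real number")
    total = alpha + sum(evidence.values())
    beliefs = {category: count / total for category, count in evidence.items()}
    return beliefs, alpha / total


# Hypothetical evidence: two observed categories and alpha = 1.
beliefs, uncertainty = open_world_opinion({"type_a": 6, "type_b": 3}, alpha=1.0)
print(beliefs, uncertainty)
```
        </preformat>
        <p>The beliefs and the uncertainty sum to one, and a larger α reserves more mass for categories that have not been observed so far.</p>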
      </sec>
      <sec id="sec-7-2">
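        <title>Definition 3 (Expected Value of Open World Opinion).</title>
        <p>The expected value of an open world opinion is computed as follows:</p>
        <disp-formula>
          <label>(17)</label>
          <tex-math>E(p(x_i) \mid r, H) = \frac{p_{x_i} + \alpha H(x_i)}{\alpha + \sum_{t=1}^{k} p_{x_t}} = \frac{p_{x_i}}{\alpha + \sum_{t=1}^{k} p_{x_t}}</tex-math>
        </disp-formula>
        <p>Theorem 4 (Equivalence between the Subjective and Dirichlet
Process Notation). Let <inline-formula><tex-math>\omega^{bn}_{X} = (\vec{B}, U, H)</tex-math></inline-formula> be an opinion expressed in belief
notation, and <inline-formula><tex-math>\omega^{pn}_{X} = (\vec{E}, \alpha, H)</tex-math></inline-formula> be an opinion expressed in probabilistic notation, both
over the same frame X. <inline-formula><tex-math>\omega^{bn}_{X}</tex-math></inline-formula> and <inline-formula><tex-math>\omega^{pn}_{X}</tex-math></inline-formula> are equivalent when the following mappings
hold:</p>
        <disp-formula>
          <label>(18)</label>
          <tex-math>\begin{cases}
B_{x_i} = \frac{p_{x_i}}{\alpha + \sum_{t=1}^{k} p_{x_t}} \\
U = \frac{\alpha}{\alpha + \sum_{t=1}^{k} p_{x_t}}
\end{cases}
\Leftrightarrow
\begin{cases}
p_{x_i} = \frac{\alpha B_{x_i}}{U} \\
1 = U + \sum_{x_i} B_{x_i}
\end{cases}</tex-math>
        </disp-formula>
        <p>Proof. Each step of the Dirichlet Process can be seen as a Dirichlet Distribution.
Hence the mapping between Dirichlet Distributions and multinomial opinions [9]
holds also here. □</p>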
      </sec>
      <sec id="sec-7-3">
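        <title>Theorem 5 (Mapping between Open World Opinion and Multinomial Opinion).</title>
        <p>Let <inline-formula><tex-math>{\omega_1}^{x}_{y}(\vec{B}, U, H)</tex-math></inline-formula> be an open world opinion and let
<inline-formula><tex-math>{\omega_2}^{x}_{y}(\vec{B}, U, \vec{a})</tex-math></inline-formula> be
a multinomial opinion. Let <inline-formula><tex-math>X_2</tex-math></inline-formula> and <inline-formula><tex-math>\Theta_2</tex-math></inline-formula> be the frame and the frame of discernment
of <inline-formula><tex-math>{\omega_2}^{x}_{y}</tex-math></inline-formula>. Let <inline-formula><tex-math>\{B_i\}_{i=1}^{k}</tex-math></inline-formula> be the result of the partition of dom(H) such that:</p>
        <p>1. <inline-formula><tex-math>|\Theta_2| = |\{B_i\}|</tex-math></inline-formula></p>
        <p>2. <inline-formula><tex-math>\bigcup \{B_i\}_{i=1}^{k} = \mathrm{dom}(H)</tex-math></inline-formula></p>
        <p>3. <inline-formula><tex-math>\forall \{x_i\}\, [(\{x_i\} \in X_2 \wedge |\{x_i\}| = 1 \wedge x_i \in B_j) \Rightarrow \nexists\, x_{k \neq j} \in B_i]</tex-math></inline-formula></p>
        <p>4. W = k, where W is the non-informative constant of multinomial opinions</p>
        <p>Then there exists a function <inline-formula><tex-math>D : \mathrm{dom}(H) \to \{B_i\}</tex-math></inline-formula> such that
<inline-formula><tex-math>D({\omega_1}^{x}_{y}) = {\omega_2}^{x}_{y}</tex-math></inline-formula>.</p>
        <p>
          Proof. The equivalence between the discretized open world opinion and the
multinomial opinion is proven by showing that:
– given equation (
          <xref ref-type="bibr" rid="ref14">14</xref>
          ), since the partition <inline-formula><tex-math>\{B_i\}_{i=1}^{k}</tex-math></inline-formula> covers the entire dom(H),
the partition distributes like the corresponding Dirichlet distribution;
– to each category of <inline-formula><tex-math>{\omega_2}^{x}_{y}</tex-math></inline-formula> corresponds one and only one element of the partition <inline-formula><tex-math>\{B_i\}</tex-math></inline-formula>, as
per item 2 of Theorem 5. □
        </p>
        <p>In other words, open world opinions extend multinomial opinions by allowing
the frame of discernment to be infinite. However, by properly discretizing an
open world opinion, we obtain an equivalent multinomial opinion.</p>
        <p>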
Piracy at sea is a well-known problem. Every year, several ships are attacked,
hijacked, etc. by pirates. The International Chamber of Commerce has created
a repository of reports about ship attacks.<sup>2</sup> Van Hage et al. [17] have created
an enriched Semantic Web version of this repository, the Linked Open Piracy
(LOP) dataset.<sup>3</sup> On the basis of LOP, one might expect to be able to predict the
frequency of attacks in a given year from the previously available data. However,
new attack types appear every year, which makes the frequencies vary.
Ceolin et al. [5] have shown how the
Dirichlet process can be employed to model such situations. Representing
this information by means of an open world opinion adds the power of
subjective logic to the Dirichlet process based representation: we can merge
contributions from different sources, taking into account their reliability. Moreover,
we can combine these facts with others in a logical way and then estimate the
opinion (and the corresponding probability of being true) of the consequent facts.
By using open world opinions, we can apply the usual subjective logic operators
to these data and represent them in a way that takes into account basic
provenance information (e.g. the data source) when applying fusion or discounting
operators. For instance, if according to LOP there were 10 hijacking
events and 10 attempted boardings in Asia in 2010, then we would represent this as:
          <disp-formula><tex-math>\omega^{LOP}_{\text{Attacks in Asia in 2010}}([0.48, 0.48], 0.04, U(0,1))</tex-math></disp-formula>
If our opinion about LOP is that it is a reliable but not fully accountable source
(e.g. <inline-formula><tex-math>\omega^{us}_{LOP}(0.8, 0.1, 0.1)</tex-math></inline-formula>), then we can take this information into account by
weighing the opinion given by LOP as follows:
          <disp-formula><tex-math>\omega^{us}_{LOP}(0.8, 0.1, 0.1) \otimes \omega^{LOP}_{\text{Attacks in Asia in 2010}}([0.48, 0.48], 0.04, U(0,1)) = \omega^{us:LOP}_{\text{Attacks in Asia in 2010}}([0.384, 0.384], 0.232, U(0,1))</tex-math></disp-formula>
The resulting weighted opinion is more uncertain than the initial one because,
even though the two observed types remain the most likely to happen, the small
uncertainty about the reliability of the source raises the probability assigned to the other attack types.</p>
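        <p>The figures of the weighted opinion can be reproduced with a short Python sketch. It assumes a discounting rule in which each belief is multiplied by the belief in the source, while the disbelief and the uncertainty about the source are added to the uncertainty of the result. This rule matches the numbers above, but treating it as the exact operator used is an assumption, and the function name discount is hypothetical. The prior distribution H is left unchanged.</p>
        <preformat><![CDATA[
```python
from typing import List, Sequence, Tuple


def discount(
    trust: Tuple[float, float, float],
    beliefs: Sequence[float],
    uncertainty: float,
) -> Tuple[List[float], float]:
    """Discount an opinion by a (belief, disbelief, uncertainty) trust opinion."""
    t_belief, t_disbelief, t_uncertainty = trust
    # Each belief is scaled by how much we believe the source.
    discounted_beliefs = [t_belief * b for b in beliefs]
    # Distrust and doubt about the source both become uncertainty.
    discounted_uncertainty = t_disbelief + t_uncertainty + t_belief * uncertainty
    return discounted_beliefs, discounted_uncertainty


# Trust in LOP (0.8, 0.1, 0.1) applied to the opinion ([0.48, 0.48], 0.04).
new_beliefs, new_uncertainty = discount((0.8, 0.1, 0.1), [0.48, 0.48], 0.04)
```
]]></preformat>
        <p>The discounted beliefs and uncertainty still sum to one, so the result remains a valid opinion.</p>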
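        <p>A difference with respect to multinomial opinions arises in the case of fusion,
because the fusion operator requires the a priori values to be merged
(averaged). In the case of open world opinions, the a priori values are
represented by the distribution <inline-formula><tex-math>H</tex-math></inline-formula> (say, <inline-formula><tex-math>H_1</tex-math></inline-formula> and <inline-formula><tex-math>H_2</tex-math></inline-formula> for the two opinions
to be merged). The averaging is still performed, and in this case the averaged
distribution corresponds to the distribution <inline-formula><tex-math>Z</tex-math></inline-formula> having <inline-formula><tex-math>E(Z) = b\,E(X_1) + a\,E(X_2)</tex-math></inline-formula>
and <inline-formula><tex-math>\mathrm{VAR}(Z) = b^2\,\mathrm{VAR}(X_1) + a^2\,\mathrm{VAR}(X_2)</tex-math></inline-formula>, where <inline-formula><tex-math>X_1</tex-math></inline-formula> and <inline-formula><tex-math>X_2</tex-math></inline-formula> are distributed according to <inline-formula><tex-math>H_1</tex-math></inline-formula> and <inline-formula><tex-math>H_2</tex-math></inline-formula>, and <inline-formula><tex-math>a</tex-math></inline-formula>, <inline-formula><tex-math>b</tex-math></inline-formula> are the two weights
(e.g. <inline-formula><tex-math>u_1</tex-math></inline-formula> and <inline-formula><tex-math>u_2</tex-math></inline-formula> in the case of cumulative fusion).</p>
        <p><sup>2</sup> http://www.icc-ccs.org</p>
        <p><sup>3</sup> http://semanticweb.cs.vu.nl/lop</p>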
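        <p>The moments of the averaged prior can be computed directly from the moments of the two priors. The sketch below is only an illustration: the function name and the two uniform priors are hypothetical, and the weights are taken as given rather than derived from a specific fusion operator. The variance expression is the one for a linear combination of two independent variables.</p>
        <preformat><![CDATA[
```python
from typing import Tuple


def fused_prior_moments(
    mean1: float, var1: float, mean2: float, var2: float, a: float, b: float
) -> Tuple[float, float]:
    """Mean and variance of Z with E(Z) = b*E(X1) + a*E(X2)
    and VAR(Z) = b^2*VAR(X1) + a^2*VAR(X2)."""
    mean_z = b * mean1 + a * mean2
    var_z = b ** 2 * var1 + a ** 2 * var2
    return mean_z, var_z


# Hypothetical priors: H1 = U(0, 1) with mean 1/2 and variance 1/12,
# H2 = U(0, 2) with mean 1 and variance 1/3, and equal weights.
mean_z, var_z = fused_prior_moments(0.5, 1 / 12, 1.0, 1 / 3, a=0.5, b=0.5)
```
]]></preformat>
        <p>With equal weights, the mean of the averaged prior lies halfway between the two prior means, as expected from an averaging step.</p>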
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Discussion</title>
      <p>We have shown the potential of subjective logic as a basis for
reasoning on Web and Semantic Web data. We have shown how it can handle
uncertainty, and how small extensions can improve
the mutual benefit that the Semantic Web and subjective logic obtain from
each other. Part of this work builds on the previously mentioned practical
applications, which show its usefulness; here we provide the theoretical
foundations for them. We foresee that other extensions will be possible as well, for
instance the use of hyperopinions [10] to handle subsumption reasoning about
uncertain data.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellenger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gatepaille</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Abdulrab</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Kotowicz</surname>
          </string-name>
          .
          <article-title>An Evidential Approach for Modeling and Reasoning on Uncertainty in Semantic Applications</article-title>
          . In URSW, volume
          <volume>778</volume>
          , pages
          <fpage>27</fpage>
          -
          <lpage>38</lpage>
          . CEUR-WS.org,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Budanitsky</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Hirst</surname>
          </string-name>
          .
          <article-title>Evaluating WordNet-based Measures of Lexical Semantic Relatedness</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>32</volume>
          (
          <issue>1</issue>
          ):
          <fpage>13</fpage>
          -
          <lpage>47</lpage>
          , Mar.
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>D.</given-names>
            <surname>Ceolin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nottamkandath</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Fokkink</surname>
          </string-name>
          .
          <article-title>Automated Evaluation of Annotators for Museum Collections using Subjective Logic</article-title>
          .
          <source>In IFIPTM</source>
          , pages
          <fpage>232</fpage>
          -
          <lpage>239</lpage>
          . Springer, May
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>D.</given-names>
            <surname>Ceolin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>van Hage</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Fokkink</surname>
          </string-name>
          .
          <article-title>A Trust Model to Estimate the Quality of Annotations Using the Web</article-title>
          .
          <source>In WebSci10. Online</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>D.</given-names>
            <surname>Ceolin</surname>
          </string-name>
          , W. van Hage, and
          <string-name>
            <given-names>W.</given-names>
            <surname>Fokkink</surname>
          </string-name>
          .
          <article-title>Estimating the Uncertainty of Categorical Web Data</article-title>
          .
          <source>In URSW</source>
          , volume
          <volume>778</volume>
          , pages
          <fpage>15</fpage>
          -
          <lpage>26</lpage>
          . CEUR-WS.org,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>T. S.</given-names>
            <surname>Ferguson</surname>
          </string-name>
          .
          <article-title>A Bayesian analysis of some nonparametric problems</article-title>
          .
          <source>Annals of Statistics</source>
          ,
          <volume>2</volume>
          :
          <fpage>209</fpage>
          -
          <lpage>230</lpage>
          ,
          <year>1973</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>A.</given-names>
            <surname>Jøsang</surname>
          </string-name>
          .
          <article-title>A Logic for Uncertain Probabilities</article-title>
          .
          <source>Int. Journal of Uncertainty, Fuzziness and Knowledge-Based Systems</source>
          ,
          <volume>9</volume>
          (
          <issue>3</issue>
          ):
          <fpage>279</fpage>
          -
          <lpage>311</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>A.</given-names>
            <surname>Jøsang</surname>
          </string-name>
          , M. Daniel, and
          <string-name>
            <given-names>P.</given-names>
            <surname>Vannoorenberghe</surname>
          </string-name>
          .
          <article-title>Strategies for combining conflicting dogmatic beliefs</article-title>
          .
          <source>In FUSION. IEEE</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>A.</given-names>
            <surname>Jøsang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Diaz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Rifqi</surname>
          </string-name>
          .
          <article-title>Cumulative and averaging fusion of beliefs</article-title>
          .
          <source>Information Fusion</source>
          ,
          <volume>11</volume>
          (
          <issue>2</issue>
          ):
          <fpage>192</fpage>
          -
          <lpage>200</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>A.</given-names>
            <surname>Jøsang</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Hankin</surname>
          </string-name>
          .
          <article-title>Interpretation and fusion of hyper opinions in subjective logic</article-title>
          .
          <source>In FUSION. IEEE</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>A.</given-names>
            <surname>Jøsang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Marsh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Pope</surname>
          </string-name>
          .
          <article-title>Exploring Different Types of Trust Propagation</article-title>
          . In iTrust, pages
          <fpage>179</fpage>
          -
          <lpage>192</lpage>
          . Springer,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Bisdikian</surname>
          </string-name>
          .
          <article-title>Subjective Logic with Uncertain Partial Observations</article-title>
          .
          <source>In FUSION. IEEE</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <source>Netherlands Inst. for Sound and Vision</source>
          . Waisda? http://wasida.nl, Aug.
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14. Princeton University. WordNet::Similarity. http://marimba.d.umn.edu/cgi-bin/similarity/similarity.cgi, Feb.
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>M.</given-names>
            <surname>Sensoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fokoue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Srivatsa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Meneguzzi</surname>
          </string-name>
          .
          <article-title>Using Subjective Logic to Handle Uncertainty and Conflicts</article-title>
          . In TrustCom. IEEE,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16. U.S. Inst. of Museum and Library Service.
          <article-title>Steve social tagging project</article-title>
          , Jan.
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>W.</given-names>
            <surname>van Hage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Malaisé</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>van Erp</surname>
          </string-name>
          .
          <article-title>Linked Open Piracy: A story about e-Science, Linked Data, and statistics</article-title>
          .
          <source>Journal of Data Semantics</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Palmer</surname>
          </string-name>
          .
          <article-title>Verbs semantics and lexical selection</article-title>
          .
          <source>In ACL '94</source>
          , pages
          <fpage>133</fpage>
          -
          <lpage>138</lpage>
          . ACL,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>