=Paper=
{{Paper
|id=Vol-173/paper-15
|storemode=property
|title=Statistical Reasoning - A Foundation for Semantic Web Reasoning
|pdfUrl=https://ceur-ws.org/Vol-173/pos_paper6.pdf
|volume=Vol-173
}}
==Statistical Reasoning - A Foundation for Semantic Web Reasoning==
<pdf width="1500px">https://ceur-ws.org/Vol-173/pos_paper6.pdf</pdf>
<pre>
             Statistical Reasoning – A Foundation for Semantic Web Reasoning

                                        Shashi Kant • Evangelos Mamas
                                       Massachusetts Institute of Technology
                                             Cambridge, MA 02139
                                   skant@sloan.mit.edu • emamas@sloan.mit.edu


Abstract

There has been considerable debate as to the merits and the applicability of
probabilistic or statistical reasoning to the Semantic Web. Much of this
debate seems to have centered on the applicability of statistical methods in
a supposedly deterministic setting. In this paper, we argue that statistical
reasoning (“reasoning with uncertainty”) need not be a substitute for
traditional Description Logic (DL) / First-Order Logic (FOL) reasoning;
instead, statistical methods can serve as a complement to logic-based
reasoning systems in two ways: (i) offer a meta-reasoning (or audit)
mechanism to validate logical reasoning, and (ii) act as a “filler” where
Ontological information either does not exist or is insufficient to reason
conclusively.

1   Introduction

Much of the Semantic Web effort has focused on the design and development of
Ontologies and related technologies. This approach presupposes that a
critical mass of Ontologies will exist that can sufficiently and accurately
respond to reasoning queries. As Sir Tim Berners-Lee puts it
[Berners-Lee, 1998]: "The choice of classical logic for the Semantic web is
not an arbitrary choice among equals. Classical logic is the only way that
inference can scale across the web."

However, a pure logic-based approach looks increasingly implausible given the
paucity of Ontologies and the difficulty in constructing and maintaining
them. Just as the World Wide Web (WWW) had a ready and mature platform to run
on, namely the Internet, which had been in existence for a long time prior to
the emergence of the WWW, we feel that the Semantic Web needs an underlying
platform upon which Ontologies can function and interoperate.

We argue that this platform should be a web of statistical “metadata”, which
expresses semantic relations in probabilistic terms. Such systems (e.g.
Bayesian Networks, Probabilistic Relational Models) have also been in
existence for a while and are used in various Machine Learning and AI
applications such as Machine Vision, Speech Recognition, and Robotics. The
Semantic Web would do well to re-use some of these efforts in building this
underlying framework.

2   Ontologies and Probabilistic Models

We introduce the notion that Probabilistic Graphical Models (PGMs) or
Bayesian Networks can be viewed as fuzzy Ontologies; conversely, an Ontology
can be viewed as a crisper Bayesian Network. In our proposed architecture,
there may not be a clear dividing line between them. A good way of
visualizing this relation would be to view Ontologies and Bayesian Networks
as ships floating in a sea of statistical “metadata”. We use this metaphor to
describe the notion that the sea of statistical metadata fills in the gaps
between the islands of Ontologies. Lately there have been some efforts to
develop Probabilistic Ontologies by annotating OWL or RDF Ontologies with
probabilistic information, e.g. BayesOWL [Ding, Peng, 2004]. We argue against
this approach, and suggest that probabilistic and logic-based reasoning
approaches should be viewed as orthogonal to each other. It makes most sense
to keep the Ontological information separate from the statistical data, along
the lines of how the WWW operates: an HTML page links to an “FTP” site, or to
an email address via a “mailto” hyperlink, and the necessary protocols are
invoked only when the link is clicked.

Figure 1 illustrates a hierarchical mechanism of aligned Ontologies and
Bayesian Networks. At the very top are the top-level Ontologies on which
there is general agreement and acceptance; at the bottom are the fuzzier,
grayer-scale Bayesian Networks, which represent relations between resources
using probabilistic mechanisms.

[Figure 1: Ontologies vis-à-vis Bayesian Networks. Labels recoverable from
the figure: "Ontological / Abstract / Crisp" at the top, "Bayesian / Details
/ Fuzzy" at the bottom, and "Query".]
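
The layering in Figure 1, with crisp Ontological information kept separate
from the statistical data, can be sketched roughly as follows. This is only an
illustrative sketch, not the authors' implementation: the class names
OntologyLayer and BayesianLayer, the answer function, the triple layout, and
the resources course_a, topic_x and topic_y (with the probability 0.7) are all
hypothetical and chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

# Hypothetical (subject, predicate, object) statement about resources.
Triple = Tuple[str, str, str]


@dataclass
class OntologyLayer:
    """Crisp layer: a statement is either asserted or not (assumed store)."""
    triples: Set[Triple] = field(default_factory=set)

    def holds(self, triple):
        return triple in self.triples


@dataclass
class BayesianLayer:
    """Fuzzy layer, stored separately: statements carry a probability."""
    beliefs: Dict[Triple, float] = field(default_factory=dict)

    def probability(self, triple) -> Optional[float]:
        return self.beliefs.get(triple)


def answer(triple, ontology, bayes):
    """Consult the crisp layer first; use the statistical layer only if needed."""
    if ontology.holds(triple):
        return ("ontology", 1.0)   # crisp answer, treated here as certain
    p = bayes.probability(triple)
    if p is not None:
        return ("bayesian", p)     # statistical "filler" answer
    return ("unknown", None)       # neither layer has information


# Made-up example data; the probability is illustrative only.
ontology = OntologyLayer({("course_a", "teaches", "topic_x")})
bayes = BayesianLayer({("course_a", "teaches", "topic_y"): 0.7})

print(answer(("course_a", "teaches", "topic_x"), ontology, bayes))
print(answer(("course_a", "teaches", "topic_y"), ontology, bayes))
print(answer(("course_b", "teaches", "topic_x"), ontology, bayes))
```

Keeping the two stores in separate objects mirrors the argument that the
statistical data should be linked to, rather than embedded in, the Ontology.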
We suggest that probabilistic (or statistical) information be encoded using
any of the widely accepted Bayesian Interchange Formats such as XML-BIF
[Cozman, 1998], Microsoft Research’s XBN [Microsoft, 1998] or the Hugin.net
[Jensen, 2004] format. We propose that the Ontological model encapsulate what
it is designed for, namely expressing logical relations between resources,
and that the probabilistic model express the statistical relations between
them. We do not see a need to mix-and-match, as they offer very different
views on the same information-set and are perceptually orthogonal.

3   A Hybrid Reasoning Model

Reasoning using Ontologies is based on predicate logic and belongs in the
classical tradition of monotonic deductive reasoning, i.e. propositions are
either true or false. The proposed framework, by contrast, provides a
mechanism for handling fuzzier, incomplete and inaccurate inputs. In this
model, reasoning can be performed using a “bottom-up” approach where a query
unanswered by a pure Ontological match is extended further up the hierarchy
(Fig. 1) until all required information is found. An adjunct application
might be to validate traditional reasoning with a mathematical confidence
level (meta-reasoning).

Some examples of the reasoning activities possible using this system are:

1.   Deductive Reasoning: Deductive reasoning allows a system to deduce
     information given a set of (possibly incomplete and erroneous)
     information. For example, it can deduce that the best course to learn
     “Machine Vision”, “Genomics” and “Political Science” at MIT is most
     probably “6.804J Computational Cognitive Science” even though the course
     does not directly teach Political Science. It is making a best-guess fit
     for the requirements [OCW, 2005].

2.   Abductive Reasoning: Abductive reasoning allows a system to infer the
     possible causes for a certain effect. For example, the possible courses
     for learning Artificial Intelligence at MIT are 6.803, 6.825 etc. This is
     the equivalent of diagnostic reasoning in Bayesian Networks [OCW, 2005].

3.   Monotonic reasoning, non-monotonic reasoning and default values:
     Traditional DL-based Ontologies can represent information for monotonic
     reasoning. For example, one might declare that Universities in the US
     have a GPA scale of 4.0, but MIT uses a 5.0 GPA scale, so a monotonic
     system cannot reason with that information unless it has been explicitly
     encoded.

Handling such exceptions to default values calls for non-monotonic reasoning,
which is possible with the proposed approach.

4   Conclusion

“Reasoning with Uncertainty” is probably a misnomer to describe the efforts
required in this area; a more appropriate phraseology would be “reasoning
without certainty”. While the difference may seem pedantic, the underlying
notion is that “uncertainty” is not a state unto itself, but merely the
absence of certainty. In a Semantic Web sense, it is a state where
Ontological information is non-existent, incomplete or inconclusive.
Statistical reasoning could therefore be the bedrock upon which DL/FOL based
querying and reasoning can be performed.

This means that the Semantic Web can operate in areas currently out-of-bounds
because of a lack of Ontological information. We therefore hypothesize that
statistical “metadata” could be the building-block of the Semantic Web,
leading to better and more accurate reasoning mechanisms.

5   Acknowledgments

We would like to acknowledge the gracious help and support of the members of
the World Wide Web Consortium (W3C) and the faculty of MIT-CSAIL. We would
especially like to thank Ralph Swick and Sir Tim Berners-Lee for their
critique and feedback.

6   References

[Berners-Lee, 1998] Tim Berners-Lee. Axioms of Web Architecture: n, 1998.
   Available at: http://www.w3.org/DesignIssues/Rules.html, Accessed on
   June 20, 2005.

[Cozman, 1998] Fabio Gagliardi Cozman. The Interchange Format for Bayesian
   Networks, 1998. Available at:
   http://www-2.cs.cmu.edu/afs/cs/user/fgcozman/www/Research/InterchangeFormat/,
   Accessed on June 23, 2005.

[Microsoft, 1998] Microsoft Research. XML Belief Network File Format: Main
   Page, 1998. Available at: http://research.microsoft.com/dtas/bnformat/,
   Accessed on June 23, 2005.

[Jensen, 2004] Finn V. Jensen. A Brief Overview of the Three Main Paradigms
   of Expert Systems. Available at:
   http://developer.hugin.com/Getting_Started/Paradigms/, Accessed on
   June 23, 2005.

[OCW, 2005] MIT Open Courseware, Massachusetts Institute of Technology.
   Available at: http://ocw.mit.edu, Accessed on June 23, 2005.

[Ding, Peng, 2004] Zhongli Ding and Yun Peng. A Probabilistic Extension to
   Ontology Language OWL. In Proceedings of the 37th Hawaii International
   Conference on System Sciences, 2004.
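
As an illustration of the abductive (diagnostic) reasoning described in
Section 3, the sketch below ranks candidate causes for an observed effect with
Bayes' rule, P(cause | effect) proportional to P(effect | cause) * P(cause).
It is a minimal sketch under stated assumptions, not the authors' system: the
function name posterior, the candidates course_a, course_b and course_c, and
all prior and likelihood values are hypothetical and made up for the example.

```python
def posterior(prior, likelihood):
    """Return P(cause | effect) for each cause via Bayes' rule.

    prior:      dict mapping cause to P(cause)
    likelihood: dict mapping cause to P(effect | cause)
    """
    joint = {cause: prior[cause] * likelihood[cause] for cause in prior}
    evidence = sum(joint.values())  # P(effect), the normalizing constant
    if evidence == 0:
        raise ValueError("the observed effect has zero probability")
    return {cause: value / evidence for cause, value in joint.items()}


# Hypothetical numbers: how popular each course is (prior) and how likely
# each course is to cover the topic being asked about (likelihood).
prior = {"course_a": 0.5, "course_b": 0.3, "course_c": 0.2}
likelihood = {"course_a": 0.1, "course_b": 0.8, "course_c": 0.5}

beliefs = posterior(prior, likelihood)

# Rank candidate causes from most to least probable.
ranking = sorted(beliefs, key=beliefs.get, reverse=True)
print(ranking)
```

The same posterior values could also serve as the confidence level attached
to a logical answer in the meta-reasoning role mentioned in Section 3.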

</pre>