<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Ontology-Mediated Queries for Probabilistic Databases (Extended Abstract)?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stefan Borgwardt</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>İsmail İlkan Ceylan</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas Lukasiewicz</string-name>
          <email>thomas.lukasiewicz@cs.ox.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Oxford</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Computer Science, Technische Universität Dresden</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The semantics of large-scale knowledge bases like NELL and Google's Knowledge Vault is founded on (tuple-independent) probabilistic databases (PDBs) [3]. As for ordinary databases, they employ the closed-world assumption, i.e., missing facts are treated as being false (having the probability 0), which leads to unintuitive results when querying PDBs. Recently, open-world probabilistic databases (OpenPDBs) were proposed to address this issue by allowing probabilities of facts that are not in the database, called open tuples, to take any value from a fixed probability interval [1]. In the resulting framework of OpenPDBs, query probabilities are given in terms of upper and lower probability values, which is more in line with an incomplete view of the world. However, OpenPDBs are still limited in the sense that they do not allow to distinguish queries that are intuitively different with respect to our commonsense knowledge. In order to model such knowledge, we extend OpenPDBs by ontology-mediated queries using Datalog ontologies (which cover some Horn Description Logics). We show that the data complexity dichotomy between P and PP in (Open)PDBs [1, 2] can be lifted to the case of first-order rewritable positive ontologies (without negative constraints); and that the problem can become NPPP-complete once negative constraints are allowed. This result demonstrates the difference between OpenPDBs and PDBs, as in the latter reasoning with ontologies remains in PP. We also propose an approximating semantics that circumvents the increase in complexity caused by negative constraints. Finally, we provide complexity results beyond the data complexity for ontology-mediated query evaluation relative to (tuple-independent) PDBs and OpenPDBs.</p>
      </abstract>
    </article-meta>
  </front>
  <body />
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ceylan</surname>
          </string-name>
          , İ.İ.,
          <string-name>
            <surname>Darwiche</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Van den Broeck, G.:
          <article-title>Open-world probabilistic databases</article-title>
          .
          <source>In: Proc. of KR</source>
          . pp.
          <fpage>339</fpage>
          -
          <lpage>348</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Dalvi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suciu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>The dichotomy of probabilistic inference for unions of conjunctive queries</article-title>
          .
          <source>J. ACM</source>
          <volume>59</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>87</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Suciu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Olteanu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ré</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koch</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Probabilistic Databases</article-title>
          . Morgan &amp;
          <string-name>
            <surname>Claypool</surname>
          </string-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>