<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the Way to Temporal OBDA Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Diego Calvanese</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cem Okulmus</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Magdalena Ortiz</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mantas Šimkus</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Free University of Bozen-Bolzano</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Umeå University</institution>
          ,
          <country country="SE">Sweden</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Extending the OBDA approach, in which multiple data sources are exposed to users via a unified conceptual schema based on description logics, to also cover temporal reasoning has been a long-standing goal, with many proposals over the last decades. To the best of our knowledge, these have yet to yield results in the form of systems or prototypes. As part of our ongoing work towards practical applicability, we identify here a number of key problems, which we believe have not been suitably addressed by previous works. Among these are the ability to deal with heterogeneous representations of time and the ability to deal with temporal inconsistencies, caused either by missing value samples or by conflicting values for a given time point. Finally, we seek a suitable query language, for which we want in particular compositionality, that is, the ability to use the output of queries to form new temporal views on the data. We present here our initial ideas on how to meet these challenges.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology-based data access</kwd>
        <kwd>temporal database</kwd>
        <kwd>description logic</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Ontology-based data access (OBDA) describes the method of enriching relational databases
with semantic reasoning tools developed in the area of Description Logics (DLs). Specifically,
OBDA allows one to create mappings from various data sources to an ontology, to extend
the data via concept and role inclusions, and thus to create a “virtual knowledge graph” (VKG)
over which queries can be answered. This VKG need not be materialised, as the query can
be rewritten to incorporate the richer semantics from the ontology, and this rewritten query
can then be run on existing commercial RDBMSs. At this point, OBDA is increasingly used in
practice, with both open-source and proprietary systems available.</p>
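      <p>To make the rewriting idea concrete, the following minimal sketch assumes a toy setting that is not taken from any particular OBDA system: a few hypothetical concept inclusions (INCLUSIONS), hypothetical mappings from concept names to SQL queries (MAPPINGS), and an atomic query asking for all instances of one concept. It only illustrates how inclusions can be compiled into the query itself; roles, joins and optimisation are left out.</p>
      <preformat><![CDATA[
```python
# Hypothetical concept inclusions (sub, super), for illustration only.
INCLUSIONS = [("Manager", "Employee"), ("Engineer", "Employee"), ("Employee", "Person")]

# Hypothetical mappings from concept names to SQL over the data sources.
MAPPINGS = {
    "Manager": "SELECT id FROM managers",
    "Engineer": "SELECT id FROM staff WHERE role = 'engineer'",
    "Person": "SELECT id FROM people",
}


def subconcepts(concept):
    """Return the concept together with everything the inclusions place below it."""
    found, todo = [concept], [concept]
    while todo:
        current = todo.pop()
        for sub, sup in INCLUSIONS:
            if sup == current and sub not in found:
                found.append(sub)
                todo.append(sub)
    return found


def rewrite(concept):
    """Rewrite the atomic query concept(x) into a union of mapped SQL queries."""
    parts = [MAPPINGS[c] for c in subconcepts(concept) if c in MAPPINGS]
    return "\nUNION\n".join(parts)
```
]]></preformat>
      <p>Under these assumptions, rewrite("Employee") produces a union over the managers and staff sources, so the inferred Employee facts are obtained from the sources without materialising the VKG.</p>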
      <p>
        While temporal databases have been the focus of research for a long time [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], there has been
a wider adoption in the industry in recent years. We note the introduction of many commercial
systems with a specific focus on temporal data, such as InfluxDB, Prometheus, TimescaleDB,
and many others. In addition to the use of temporal data, the literature shows clearly that there
is a strong interest in the industry to have query languages that capture complex temporal
events in a succinct and intuitive way [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        There have been decades of work on temporal data and ontologies, both for the more
fundamental question of understanding the complexity of temporal DLs [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ] and even leading to
very promising proposals to extend existing OBDA systems with temporal reasoning [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Unfortunately, none of these works or proposals have yet led to prototypes of OBDA systems with
rich temporal reasoning, let alone complete systems. Unsatisfied with this state of affairs, we
want to identify key challenges that we believe have yet to be addressed or met by the research
on temporal OBDA, and we are convinced that our work in overcoming such challenges will
lead to working prototypes and, hopefully, ultimately pave the way toward practical systems.
      </p>
      <p>We will highlight in this paper the following challenges that we have identified, and present
our initial ideas on how to tackle them.</p>
      <p>• Finding ways of dealing with heterogeneous temporal data representations uniformly.
• Exploring the possibility of temporal inconsistencies arising both in the input data and
as the result of complex queries, and finding solutions that still enable reliable query
answering.
• Finding a composable temporal query language, with suitable complexity in the form of,
ideally, FO rewritability. We note that expressivity of this query language can be extended
via a complex ontology language, to be used by experts, in order to allow users to refer to
complex temporal events without the need to understand complicated temporal logics.</p>
      <p>In the remainder of the paper we will give more insights into these challenges, and present
on-going work on how we are planning to meet them.</p>
      <p>
        Related Work. Bridging temporal reasoning and DLs has been an area of focus for many
years [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6 ref7">3, 4, 5, 6, 7</xref>
        ]. These works focused on initial exploration of the complexity landscape,
and in particular the boundaries of decidability. Understandably, mappings to real data sources
are not explored, and most papers assume time to be simply represented by the integers and
an ordering on them, for example ⟨ℤ, &lt;⟩, either purely point-based or including intervals too.
The work on finding suitable ontology languages, which can encode complex temporal events,
and allowing their use in much simpler user queries has led to promising results, such as the
work of Kontchakov et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which features a fragment of the Halpern-Shoham interval logic ℋ𝒮 in the form
of an extension of Datalog. The proposal by Kalaycı et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is very close to what we aim
to ultimately realise with our work. Their proposal supports complex temporal events at the
ontology level and an extension of SPARQL to connect validity periods to facts.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Challenges for Practical Temporal OBDA</title>
      <p>Here we highlight the key challenges that we still see as unaddressed in the existing literature
and which any practical ontology-based system that allows for rich temporal reasoning on
real-world temporal databases must meet.</p>
      <p>Heterogeneous temporal data representations. One can identify in the literature different
views at the data level, both point-based (often also called time-series) and interval-based relations.
When looking at schemas involving temporal data, it becomes apparent that even the same
database will combine time-series and interval-based views. As such, one needs to identify
ways to jointly represent both and have ways to safely translate one representation into another.
By “safely” we refer to the ability to identify cases of temporal inconsistency, discussed next.</p>
      <p>Inconsistency and temporal data. In temporal databases, where facts are enriched via
validity periods, we call the data temporally consistent if every time instant that the
temporal data is defined over admits a single, unambiguous assignment of domain values to the
non-temporal attributes. We believe that temporal inconsistency will occur quite
often in practice, and systems will need robust and transparent ways to manage it. Furthermore,
we believe that it differs from inconsistency in the non-temporal case. After
all, temporal inconsistency only refers to cases where for a given time-point we have too
many choices on values for non-temporal attributes (ambiguity) or none at all (gaps in the
data). Temporal inconsistency due to ambiguity can be introduced naturally just by allowing
general interval-based temporal relations. Managing temporal inconsistency becomes even
more crucial when one also considers the third challenge, namely an expressive temporal query
language, with the ability to use queries within other queries. With the ability to create new
temporal relations (or views) comes the possibility that these new relations themselves could
be temporally inconsistent. Thus, dealing with inconsistency becomes a necessity. In addition
to inconsistency due to ambiguity, another issue is “gaps” in the temporal data, due to a low
temporal resolution, for example. This too will need to be addressed in settings where queries
are expected to return useful answers for any time point, regardless of the temporal resolution
of any specific data source.</p>
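      <p>The two kinds of temporal inconsistency described above can be detected with a simple check. The sketch below is only an illustration under assumptions of our own choosing: rows have the hypothetical layout (start, end, key, value), timestamps are integers, and intervals are closed. Ambiguity is reported as pairs of overlapping rows for the same key with different values, and gaps as uncovered stretches between the first and last timestamp of a key.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict


def find_ambiguities(rows):
    """Return pairs of rows for the same key whose closed intervals overlap
    while their values differ."""
    by_key = defaultdict(list)
    for row in rows:
        by_key[row[2]].append(row)
    conflicts = []
    for group in by_key.values():
        group.sort()  # orders rows by start time first
        for i, (s1, e1, _, v1) in enumerate(group):
            for s2, e2, _, v2 in group[i + 1:]:
                if s2 > e1:
                    break  # sorted by start: no later row can overlap
                if v1 != v2:
                    conflicts.append(((s1, e1, v1), (s2, e2, v2)))
    return conflicts


def find_gaps(rows, key):
    """Return maximal closed intervals, between the first and last timestamp
    of `key`, that no row covers (integer timestamps assumed)."""
    spans = sorted((s, e) for s, e, k, _ in rows if k == key)
    if not spans:
        return []
    gaps = []
    covered_until = spans[0][1]
    for s, e in spans[1:]:
        if s > covered_until + 1:
            gaps.append((covered_until + 1, s - 1))
        covered_until = max(covered_until, e)
    return gaps


# Hypothetical data: one overlap with conflicting values, one uncovered stretch.
rows = [(0, 10, "p1", 100), (8, 15, "p1", 200), (20, 25, "p1", 200)]
```
]]></preformat>
      <p>On the hypothetical rows above, the first two tuples are ambiguous on the time points 8 to 10, and the time points 16 to 19 form a gap. What a system should do once such cases are found (reject, repair, or answer under some tolerant semantics) is exactly the open design question.</p>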
      <p>An expressive, composable temporal query language. Just as with standard relational
databases and SQL, we would expect temporal query languages to be able to produce new
temporal relations, either point-based or interval-based, out of existing temporal data. In
addition, we believe it is necessary to be able to express predicates between time-intervals,
as detecting complex patterns on time is necessary for the kind of event detection that is of
practical interest in industry. An option that we also need to consider in the OBDA setting is
the distinction between ontology and query level. As complex and expressive languages on
time might be hard for non-expert users to master, one can delegate this task to the ontology
engineer via a complex ontology language, which would then introduce new facts to signify
temporal events. Users could then make use of the complex temporal machinery without the
need to define it themselves.</p>
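      <p>One well-known family of predicates between time intervals is Allen's interval relations. The sketch below implements four of them and a small helper that pairs up intervals from two relations; it is an illustration only, not a proposal for the final query language. The names Interval and pairs_satisfying are hypothetical, endpoints are assumed to be integers, and each interval is assumed to start strictly before it ends.</p>
      <preformat><![CDATA[
```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int
    end: int


def before(a, b):
    """a ends strictly before b starts."""
    return a.end < b.start


def meets(a, b):
    """a ends exactly where b starts."""
    return a.end == b.start


def overlaps(a, b):
    """a starts first and ends strictly inside b."""
    return a.start < b.start < a.end < b.end


def during(a, b):
    """a lies strictly inside b."""
    return b.start < a.start and a.end < b.end


def pairs_satisfying(predicate, left, right):
    """Return all pairs (a, b) from the two relations that satisfy the predicate."""
    return [(a, b) for a in left for b in right if predicate(a, b)]
```
]]></preformat>
      <p>Because pairs_satisfying returns ordinary Python data, its result can be fed into further selections, which mirrors in a very small way the compositionality we ask of a temporal query language.</p>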
    </sec>
    <sec id="sec-3">
      <title>3. Discussion on Ongoing Work &amp; Outlook</title>
      <p>We present for each of the challenges a proposal or give some further details.</p>
      <p>For the data model, our current idea is to extend the relational setting and explicitly support
both time-series relations and interval-based relations. We use attr(R) and pk(R) to
denote, respectively, the set of all attributes and the primary key attributes of a relation R. We use
capital letters to identify attributes in the schema and lowercase letters for values inside a tuple.
We assume that the type time is realised as time-stamps.</p>
      <p>Definition 1 (Time-series Relation). A time-series relation TS is a relation that has the following
property: there must exist exactly one attribute T ∈ attr(TS) of type time such that T ∈
pk(TS).</p>
      <p>Definition 2 (Time-interval Relation). A time-interval relation I is a relation that satisfies the
following properties:
• There are T<sub>1</sub>, T<sub>2</sub> ∈ attr(I) of type time such that T<sub>1</sub>, T<sub>2</sub> ∈ pk(I). We assume w.l.o.g.
that T<sub>1</sub>, T<sub>2</sub> are the first two attributes of I.
• For every tuple (t<sub>1</sub>, t<sub>2</sub>, a<sub>3</sub>, . . . , a<sub>n</sub>) ∈ I, we have that t<sub>1</sub> ≤ t<sub>2</sub>.</p>
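      <p>The two definitions can be read as checks on a relation's schema and contents. The following sketch encodes them under assumptions of our own: a hypothetical Schema class that records attribute types in column order together with the primary key, the string "time" as the marker for the time type, and rows given as plain tuples. Definition 1 is read literally, as requiring exactly one attribute that is both of type time and part of the primary key.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Schema:
    attrs: dict  # attribute name -> type name, in column order
    pk: tuple    # names of the primary key attributes


def is_time_series(schema):
    """Definition 1: exactly one attribute of type time that is in the key."""
    keyed_time = [a for a, t in schema.attrs.items()
                  if t == "time" and a in schema.pk]
    return len(keyed_time) == 1


def is_time_interval(schema, rows):
    """Definition 2: the first two attributes are of type time, both are in
    the key, and every tuple has its first value no later than its second."""
    names = list(schema.attrs)
    if len(names) < 2:
        return False
    t1, t2 = names[0], names[1]
    typed = schema.attrs[t1] == "time" and schema.attrs[t2] == "time"
    keyed = t1 in schema.pk and t2 in schema.pk
    ordered = all(row[0] <= row[1] for row in rows)
    return typed and keyed and ordered


# Hypothetical schemas used only to exercise the two checks.
sensor = Schema({"T": "time", "SENSOR": "str", "VALUE": "float"}, ("T", "SENSOR"))
project = Schema({"T1": "time", "T2": "time", "BUDGET": "int"}, ("T1", "T2"))
```
]]></preformat>
      <p>Here sensor passes the time-series check and project passes the time-interval check for rows such as (0, 10, 100), while a row such as (10, 0, 100) violates the ordering condition.</p>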
      <p>With these two representations of time, we already get a number of issues that one needs
to address in any working implementation of temporal OBDA systems. The first issue is that
of clearly defining when one can safely transform one representation of time into the other
one. A second issue (or rather feature we want to have) is the ability to define for a data source
methods of making the data denser, for example by means of interpolation on numerical data
points to extend the temporal relation.</p>
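      <p>Both issues can be illustrated with a few lines of code. The sketch below is a simplification under stated assumptions, not a definition of safety: timestamps are integers, intervals are closed, interval rows have the hypothetical layout (t1, t2, value), and samples used for interpolation have distinct timestamps. A conversion from intervals to points is treated as safe here if no time point ends up with two different values, and densification is shown as plain linear interpolation between neighbouring samples.</p>
      <preformat><![CDATA[
```python
def interval_to_points(rows, step=1):
    """Expand (t1, t2, value) rows with closed intervals into (t, value) points."""
    return [(t, v) for t1, t2, v in rows for t in range(t1, t2 + 1, step)]


def is_safe(points):
    """True if no time point receives two different values."""
    seen = {}
    for t, v in points:
        if seen.setdefault(t, v) != v:
            return False
    return True


def interpolate(points, t):
    """Linearly interpolate a numeric value at time t between the two
    neighbouring samples (distinct timestamps assumed)."""
    pts = sorted(points)
    for (t0, v0), (t1, v1) in zip(pts, pts[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t lies outside the sampled range")
```
]]></preformat>
      <p>For instance, the rows (0, 2, 5) and (2, 3, 7) share the time point 2 with different values, so their expansion is flagged as unsafe, while interpolating between the samples (0, 10.0) and (10, 20.0) at time 5 gives 15.0.</p>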
      <p>
        For the problem of dealing with temporal inconsistency due to ambiguity, we only give here a
simple example showing how inconsistency might be introduced even by the ontology language
or at the query level, in addition to the possibility of being already present in the temporal
database itself. For the purpose of this example, we pick the ontology language proposed by
Brandt et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], Datalog for sensor log data, or DslD for short. We omit a detailed introduction of it
here, and refer interested readers to the original paper. They propose a number of temporal
operators in DslD, including ones that can manipulate intervals, such as lshift<sub>n</sub> and rshift<sub>n</sub>,
which extend the left (resp., right) boundary of the interval by n units. We note that DslD is
based on a metric view of time with a fixed time unit, such as seconds. Brandt et al. show the
need for such operators to formulate rules to detect complex temporal events, informed by the
needs of industrial partners. However, the expressive power of DslD introduces the possibility
of temporal inconsistency arising from the execution of rules, as demonstrated in the following
example.
      </p>
      <p>Example 1. Let us assume that our schema contains a relation “Project”, which indicates the
available budget for our project at a given time interval. We further assume we are given
a concrete database with the relation “Project” containing the tuples as shown in Table 1.
Furthermore, consider the following rule in DslD:</p>
      <p>extend(time, budget) ← time is rshift<sub>10</sub>(time), Project(time, budget).</p>
      <p>This rule would simply extend every existing interval period, while retaining the budget value
for that interval. In Table 1, we also show the tuples obtained after applying this rule. We
can see that the resulting relation introduces ambiguity in the form of overlaps between time
intervals of tuples that have different values for the non-temporal attributes.</p>
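      <p>The effect of the rule can be reproduced in a few lines. The sketch below is not DslD and does not use the contents of Table 1: the tuples are hypothetical, intervals are closed pairs of integer timestamps, and rshift is modelled as adding n units to the right boundary, following the description of the operator given above.</p>
      <preformat><![CDATA[
```python
def rshift(interval, n):
    """Extend the right boundary of a closed interval (t1, t2) by n time units."""
    t1, t2 = interval
    return (t1, t2 + n)


def extend(project, n=10):
    """Apply the rule to every tuple ((t1, t2), budget) of the relation."""
    return [(rshift(interval, n), budget) for interval, budget in project]


def overlapping_conflicts(relation):
    """Return pairs of tuples whose intervals overlap but whose budgets differ."""
    rows = sorted(relation)  # ordered by interval start
    return [(a, b) for i, a in enumerate(rows) for b in rows[i + 1:]
            if b[0][0] <= a[0][1] and a[1] != b[1]]


# Hypothetical Project tuples with back-to-back, non-overlapping intervals.
project = [((0, 9), 1000), ((10, 19), 2000), ((20, 29), 1500)]
extended = extend(project)
```
]]></preformat>
      <p>The hypothetical input has no conflicts, but after the extension each interval reaches into its successor, so two pairs of tuples now disagree on the budget for some time points. This is the kind of ambiguity that a rule can introduce into data that was temporally consistent to begin with.</p>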
      <p>Outlook. We plan to continue tackling the challenges we sketched out in this short paper,
and ideally realise a first prototype implementation once we have a clearer picture of temporal
ontology-based data access. The design of a suitable combination of query language and
ontology language, while retaining FO rewritability, is in particular a crucial goal. It requires
a balance in order to ensure accessibility for non-expert users at the query level, while also
maintaining high expressivity overall.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>This work was partially supported by the Wallenberg AI, Autonomous Systems and Software
Program (WASP) funded by the Knut and Alice Wallenberg Foundation. It was also partially
supported by the Austrian Science Fund (FWF) projects P30360 and P30873, by the Vienna
Business Agency’s project CoRec, by the Italian Basic Research (PRIN) project HOPE, and by
the Province of Bolzano through the project D2G2.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Snodgrass</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ahn</surname>
          </string-name>
          ,
          <article-title>Temporal databases</article-title>
          ,
          <source>Computer</source>
          <volume>19</volume>
          (
          <year>1986</year>
          )
          <fpage>35</fpage>
          -
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>E. G.</given-names>
            <surname>Kalaycı</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brandt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ryzhikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zakharyaschev</surname>
          </string-name>
          ,
          <article-title>Ontology-based access to temporal data with Ontop: A framework proposal</article-title>
          ,
          <source>Int. J. Appl. Math. Comput. Sci</source>
          .
          <volume>29</volume>
          (
          <year>2019</year>
          )
          <fpage>17</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Artale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ryzhikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zakharyaschev</surname>
          </string-name>
          ,
          <article-title>Tractable interval temporal propositional and description logics</article-title>
          ,
          <source>in: Proc. AAAI</source>
          <year>2015</year>
          ,
          <year>2015</year>
          , pp.
          <fpage>1417</fpage>
          -
          <lpage>1423</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Gutiérrez-Basulto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <article-title>Temporalized EL ontologies for accessing temporal data: Complexity of atomic queries</article-title>
          ,
          <source>in: Proc. IJCAI</source>
          <year>2016</year>
          ,
          <year>2016</year>
          , pp.
          <fpage>1102</fpage>
          -
          <lpage>1108</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Brandt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. G.</given-names>
            <surname>Kalaycı</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ryzhikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zakharyaschev</surname>
          </string-name>
          ,
          <article-title>Querying log data with Metric Temporal Logic</article-title>
          ,
          <source>J. Artif. Intell. Res</source>
          .
          <volume>62</volume>
          (
          <year>2018</year>
          )
          <fpage>829</fpage>
          -
          <lpage>877</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Brandt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. G.</given-names>
            <surname>Kalaycı</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mörzinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ryzhikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zakharyaschev</surname>
          </string-name>
          ,
          <article-title>Two-dimensional rule language for querying sensor log data: A framework and use cases</article-title>
          ,
          <source>in: Proc. TIME</source>
          <year>2019</year>
          , volume
          <volume>147</volume>
          of LIPIcs,
          <year>2019</year>
          , pp.
          <fpage>7:1</fpage>
          -
          <lpage>7:15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Klarman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Meyer</surname>
          </string-name>
          ,
          <article-title>Querying temporal databases via OWL 2 QL</article-title>
          ,
          <source>in: Proc. RR</source>
          <year>2014</year>
          , volume
          <volume>8741</volume>
          of LNCS
          ,
          <year>2014</year>
          , pp.
          <fpage>92</fpage>
          -
          <lpage>107</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pandolfo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pulina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ryzhikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zakharyaschev</surname>
          </string-name>
          ,
          <article-title>Temporal and spatial OBDA with many-dimensional Halpern-Shoham logic</article-title>
          , in: S. Kambhampati (Ed.),
          <source>Proc. IJCAI</source>
          <year>2016</year>
          ,
          <year>2016</year>
          , pp.
          <fpage>1160</fpage>
          -
          <lpage>1166</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>