On the Way to Temporal OBDA Systems
Diego Calvanese1,2 , Cem Okulmus2 , Magdalena Ortiz2 and Mantas Šimkus2
1
    Free University of Bozen-Bolzano, Italy
2
    Umeå University, Sweden


                                         Abstract
                                         Extending the OBDA approach – where multiple data sources are exposed to users via a unified conceptual
                                         schema based on description logics – to also cover temporal reasoning has been a long standing goal,
                                         with many proposals over the last decades. To the best of our knowledge, these have yet to yield results
                                         in the form of systems or prototypes. As part of our ongoing work towards practical applicability, we
                                         identify here a number of key problems, which we believe have not been addressed suitably by previous
                                         works. Among these is the ability to deal with heterogeneous representations of time, the ability to deal
                                         with temporal inconsistencies, either due to missing value samples or conflicting values for a given time
                                         point and finally we also seek a suitable query language, where we in particular want compositionality –
                                         the ability to use the output of queries to form new temporal views on the data. We present here our
                                         initial ideas on how to meet these challenges.

                                         Keywords
                                         Ontology-based data access, temporal database, description logic


1. Introduction
Ontology-based data access (OBDA) describes the method of enriching relational databases
with semantical reasoning tools developed in the area of Description Logics (DLs). Specifically
OBDA allows one to create mappings from various data sources to an ontology, and extend
the data via concept and role inclusions, and thus create a “virtual knowledge graph” (VKG)
over which queries can be answered. This VKG need not be materialised, as the query can
be rewritten to incorporate the richer semantics from the ontology, and this rewritten query
can then be run on existing commercial RDBMs. At this point, OBDA is increasingly used in
practice, with both open-source and proprietary systems available.
  While temporal databases have been the focus of research for a long time [1], there has been
a wider adoption in the industry in recent years. We note the introduction of many commercial
systems with a specific focus on temporal data, such as InfluxDB1 , Prometheus2 , TimescaleDB3 ,

AMW’23: 15th Alberto Mendelzon International Workshop on Foundations of Data Management, May 22–26, 2023,
Santiago de Chile, CL
$ calvanese@inf.unibz.it (D. Calvanese); okulmus@cs.umu.se (C. Okulmus); magortiz@cs.umu.se (M. Ortiz);
simkus@cs.umu.se (M. Šimkus)
 0000-0001-5174-9693 (D. Calvanese); 0000-0002-7742-0439 (C. Okulmus); 0000-0002-2344-9658 (M. Ortiz);
0000-0003-0632-0294 (M. Šimkus)
                                       © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
    CEUR

          CEUR Workshop Proceedings (CEUR-WS.org)
    Workshop
    Proceedings
                  http://ceur-ws.org
                  ISSN 1613-0073


1
  www.influxdata.com
2
  prometheus.io
3
  www.timescale.com
and many others. In addition to the use of temporal data, the literature shows clearly that there
is a strong interest in the industry to have query languages that capture complex temporal
events in a succinct and intuitive way [2].
   There have been decades of work on temporal data and ontologies, both for the more funda-
mental question of understanding the complexity of temporal DLs [3, 4, 5] and even leading to
very promising proposals to extend existing OBDA systems with temporal reasoning [2]. Un-
fortunately, none of these works or proposals have yet led to prototypes of OBDA systems with
rich temporal reasoning, let alone complete systems. Unsatisfied with this state of affairs, we
want to identify key challenges that we believe have yet to be addressed or met by the research
on temporal OBDA, and we are convinced that our work in overcoming such challenges will
lead to working prototypes and, hopefully, ultimately pave the way toward practical systems.
   We will highlight in this paper the following challenges that we have identified, and present
our initial ideas on how to tackle them.

    • Finding ways of dealing with heterogeneous temporal data representations uniformly.
    • Exploring the possibility of temporal inconsistencies arising both in the input data and
      as the result of complex queries, and finding solutions that still enable reliable query
      answering.
    • Finding a composable temporal query language, with suitable complexity in the form of,
      ideally, FO rewritability. We note that expressivity of this query language can be extended
      via a complex ontology language, to be used by experts, in order to allow users to refer to
      complex temporal events without the need to understand complicated temporal logics.

  In the remainder of the paper we will give more insights into these challenges, and present
on-going work on how we are planning to meet them.

Related Work. Bridging temporal reasoning and DLs has been an area of focus for many
years [3, 4, 5, 6, 7]. These works focused on initial exploration of the complexity landscape,
and in particular the boundaries of decidability. Understandably, mappings to real data sources
are not explored, and most papers assume time to be simply represented by the integers and
an ordering on them, for example ⟨Z, <⟩, either purely point-based or including intervals too.
The work on finding suitable ontology languages, which can encode complex temporal events,
and allowing their use in much simpler user queries has led to promising results, such as the
work of Kontchakov et al. [8], which features a fragment of the interval logic ℋ𝒮 in the form
of an extension of Datalog. The proposal by Kalaycı et al. [2] is very close to what we aim
to ultimately realise with our work. Their proposal supports complex temporal events at the
ontology level and an extension of SPARQL to connect validity periods to facts.

2. Challenges for Practical Temporal OBDA
Here we highlight the key challenges that we still see as unaddressed in the existing literature
and which any practical ontology-based system that allows for rich temporal reasoning on
real-world temporal databases must meet.
Heterogeneous temporal data representations. One can identify in the literature different
views at the data level, both point-based (often also called time-series) and interval-based relations.
When looking at schemas involving temporal data, it becomes apparent that even the same
database will combine time-series and interval-based views. As such, one needs to identify
ways to jointly represent both and have ways to safely translate one representation into another.
By “safely” we refer to the ability to identify cases of temporal inconsistency, discussed next.

Inconsistency and temporal data. In temporal databases, where facts are enriched via
validity periods, we define temporal consistency to refer to the fact that we have a consistent
assignment of values in the domain to non-temporal attributes for any time-instant that the
temporal data is defined over. We believe that the case of temporal inconsistency will occur quite
often in practice, and systems will need robust and transparent ways to manage it. Furthermore,
we believe that this differs from the case of inconsistency in the non-temporal case. After
all, temporal inconsistency only refers to cases where for a given time-point we have too
many choices on values for non-temporal attributes (ambiguity) or none at all (gaps in the
data). Temporal inconsistency due to ambiguity can be introduced naturally just by allowing
general interval-based temporal relations. Managing temporal inconsistency becomes even
more crucial when one also considers the third challenge, namely an expressive temporal query
language, with the ability to use queries within other queries. With the ability to create new
temporal relations (or views) comes the possibility that these new relations themselves could
be temporally inconsistent. Thus, dealing with inconsistency becomes a necessity. In addition
to inconsistency due to ambiguity, another issue is “gaps” in the temporal data, due to a low
temporal resolution, for example. This too will need to be addressed in settings where queries
are expected to return useful answers for any time point, regardless of the temporal resolution
of any specific data source.

An expressive, composable temporal query language. Just as with standard relational
databases and SQL, we would expect temporal query languages to be able to produce new
temporal relations, either point-based or interval-based, out of existing temporal data. In
addition, we believe it is necessary to be able to express predicates between time-intervals,
as detecting complex patterns on time is necessary for the kind of event detection that is of
practical interest in industry. An option that we also need to consider in the OBDA setting is
the distinction between ontology and query level. As complex and expressive languages on
time might be hard for non-expert users to master, one can delegate this task to the ontology
engineer via a complex ontology language, which would then introduce new facts to signify
temporal events. Users could then make use of the complex temporal machinery without the
need to define it themselves.

3. Discussion on Ongoing Work & Outlook
We present for each of the challenges a proposal or give some further details.
   For the data model, our current idea is to extend the relational setting and explicitly support
both time-series relations and interval-based relations. We use attr (𝑅) and pk (𝑅) to respec-
tively denote the set of all attributes and the primary key attributes of a relation 𝑅. We use
Table 1
Temporal relation “Project”, and the result of a temporal rule applied to it.
                             Project                   extend(time𝑒𝑥𝑡 , 𝑏𝑢𝑑𝑔𝑒𝑡)
                         time      budget               time          budget
                        ( 0, 15)    150                ( 0, 25)         150
                        (16, 30)    300                (16, 40)         300

capital letters to identify attributes in the schema and lowercase letters for values inside a tuple.
We assume that the type time is realised as time-stamps.
Definition 1 (Time-series Relation). A time-series relation 𝑅TS is a relation that has the following
property: there must exist exactly one attribute 𝑇 ∈ attr (𝑅TS ) of type time such that 𝑇 ∈
pk (𝑅TS ).
Definition 2 (Time-interval Relation). A time-interval relation 𝑅I is a relation that satisfies the
following properties:
    • There are 𝑇1 , 𝑇2 ∈ attr (𝑅I ) of type time such that 𝑇1 , 𝑇2 ∈ pk (𝑅I ). We assume w.l.o.g.
      that 𝑇1 , 𝑇2 are the first two attributes of 𝑅I .
    • For every tuple (𝑡1 , 𝑡2 , 𝑥3 , . . . , 𝑥𝑛 ) ∈ 𝑅I , we have that 𝑡1 ≤ 𝑡2 .
   With these two representations of time, we already get a number of issues that one needs
to address in any working implementation of temporal OBDA systems. The first issue is that
of clearly defining when one can safely transform one representation of time into the other
one. A second issue (or rather feature we want to have) is the ability to define for a data source
methods of making the data denser, for example by means of interpolation on numerical data
points to extend the temporal relation.
   For the problem of dealing with temporal inconsistency due to ambiguity, we only give here a
simple example showing how inconsistency might be introduced even by the ontology language
or at the query level, in addition to the possibility of being already present in the temporal
database itself. For the purpose of this example, we pick the ontology language proposed by
Brandt et al [6], Datalog for sensor log data, or short DslD. We omit a detailed introduction of it
here, and refer interested readers to the original paper. They propose a number of temporal
operators in DslD, including ones that can manipulate intervals, such as lshift 𝑥 and rshift 𝑥 ,
which extend the left (resp., right) boundary of the interval by 𝑥 units. We note that DslD is
based on a metric view of time with a fixed time unit, such as seconds. Brandt et al. show the
need for such operators to formulate rules to detect complex temporal events, informed by the
needs of industrial partners. However, the expressive power of DslD introduces the possibility
of temporal inconsistency arising from the execution of rules, as demonstrated in the following
example.
Example 1. Let us assume that our schema contains a relation “Project”, which indicates the
available budget for our project at a given time interval. We further assume we are given
a concrete database with the relation “Project” containing the tuples as shown in Table 1.
Furthermore, consider the following rule in DslD:
       extend (time𝑒𝑥𝑡 , budget) ← time𝑒𝑥𝑡 is rshift 10 (time), Project(time, budget).
This rule would simply extend every existing interval period, while retaining the budget value
for that interval. In Table 1, we also show the tuples obtained after applying this rule. We
can see that the resulting relation introduces ambiguity in the form of overlaps between time
intervals of tuples that have different values for the non-temporal attributes.

Outlook. We plan to continue tackling the challenges we sketched out in this short paper,
and ideally realise a first prototype implementation once we have a clearer picture of temporal
ontology-based data access. The design of a suitable combination of query language and
ontology language, while retaining FO reducibility, is in particular a crucial goal. It requires
a balance in order to ensure accessibility for non-expert users at the query level, while also
maintaining high expressivity overall.


Acknowledgments
This work was partially supported by the Wallenberg AI, Autonomous Systems and Software
Program (WASP) funded by the Knut and Alice Wallenberg Foundation. It was also partially
supported by the Austrian Science Fund (FWF) projects P30360 and P30873, by the Vienna
Business Agency’s project CoRec, by the Italian Basic Research (PRIN) project HOPE, and by
the Province of Bolzano through the project D2G2.


References
[1] R. T. Snodgrass, I. Ahn, Temporal databases, Computer 19 (1986) 35–42.
[2] E. G. Kalaycı, S. Brandt, D. Calvanese, V. Ryzhikov, G. Xiao, M. Zakharyaschev, Ontology-
    based access to temporal data with Ontop: A framework proposal, Int. J. Appl. Math.
    Comput. Sci. 29 (2019) 17–30.
[3] A. Artale, R. Kontchakov, V. Ryzhikov, M. Zakharyaschev, Tractable interval temporal
    propositional and description logics, in: Proc. AAAI 2015, 2015, pp. 1417–1423.
[4] V. Gutiérrez-Basulto, J. C. Jung, R. Kontchakov, Temporalized EL ontologies for accessing
    temporal data: Complexity of atomic queries, in: Proc. IJCAI 2016, 2016, pp. 1102–1108.
[5] S. Brandt, E. G. Kalaycı, V. Ryzhikov, G. Xiao, M. Zakharyaschev, Querying log data with
    Metric Temporal Logic, J. Artif. Intell. Res. 62 (2018) 829–877.
[6] S. Brandt, D. Calvanese, E. G. Kalaycı, R. Kontchakov, B. Mörzinger, V. Ryzhikov, G. Xiao,
    M. Zakharyaschev, Two-dimensional rule language for querying sensor log data: A frame-
    work and use cases, in: Proc. TIME 2019, volume 147 of LIPIcs, 2019, pp. 7:1–7:15.
[7] S. Klarman, T. Meyer, Querying temporal databases via OWL 2 QL, in: Proc. RR 2014,
    volume 8741 of LNCS, 2014, pp. 92–107.
[8] R. Kontchakov, L. Pandolfo, L. Pulina, V. Ryzhikov, M. Zakharyaschev, Temporal and spatial
    OBDA with many-dimensional Halpern-Shoham logic, in: S. Kambhampati (Ed.), Proc.
    IJCAI 2016, 2016, pp. 1160–1166.