Versioned Queries over RDF Archives: All You Need is SPARQL?

Ignacio Cuevas

Aidan Hogan

Department of Computer Science, University of Chile & IMFD Chile

We explore solutions for representing archives of versioned RDF data using the SPARQL standard and off-the-shelf engines. We consider six representations of RDF archives based on named graphs, and describe how input queries can be automatically rewritten to return solutions for a particular version, or solutions that change between versions. We evaluate these alternatives over an archive of 8 weekly versions of Wikidata and 146 queries using Virtuoso as the SPARQL engine.

Introduction

A key aspect of the Web is its dynamic nature, where documents are frequently updated, deleted and added. Likewise, when we speak of the Semantic Web, it is important to consider that sources may be dynamic and RDF datasets are subject to change [11]. In this context, various works have looked at versioning for RDF/SPARQL [25,23,7,8,12], with recent works proposing RDF archives [5,3,2,6,20] that manage RDF graphs and their historical changes, allowing for querying across different versions of the graph. Within these works, a variety of specialised indexing techniques [3,2,20], query languages [23] and benchmarks [13,6] have been proposed, developed and evaluated. While these represent important advances, many such works propose custom SPARQL extensions, indexes, engines, etc., creating a barrier for adoption.

In fact, versioned queries as proposed in the literature [5] can be supported using off-the-shelf SPARQL engines with years of development, optimisation, and deployment. SPARQL named graphs can, for example, be used to track different versions of individual graphs. However, as Fernández et al. [6] note, the approach of using pure SPARQL would "typically render rather inefficient SPARQL queries". This raises a question: how inefficient will such queries be? If a pure SPARQL solution could be found with reasonable performance, existing SPARQL engines could be used to host and query RDF archives.

In this paper, we present preliminary empirical results addressing this research question. Specifically, we look at six representations of RDF archives using named graphs and propose query rewriting mechanisms for them. We then evaluate and compare these representations for an RDF archive of 8 versions of Wikidata [26]. Our experiments compare the sizes of the indexes generated, the time taken for indexing, and the relative costs of query evaluation.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Related Work

Various temporal extensions for RDF have been proposed in the literature based on annotations [9,19,29]. Proposed temporal extensions for SPARQL include $\tau$-SPARQL [23], T-SPARQL [7], SPARQL-ST [17], SPARQL$^T$ [27], etc. Related to temporality, a number of systems support versioning for RDF, including SemVersion [25], POI [24], x-RDF-3X [14], R43ples [8], Dydra [1] and Ostrich [21].

More recently, RDF archives (of historical RDF data) have been gaining attention. Fernández et al. [5] survey the theme, discussing the types of queries that can be run on such archives. Cerdeira-Pena et al. [3], Zaniolo et al. [27] and Taelman et al. [20] propose compressed indexes for RDF archives, while Khurana and Deshpande [12] propose indexes for historical graph data. Bahri et al. [2] use Apache Spark to manage RDF archives in a distributed setting. Benchmarks have also been proposed for RDF archives, including the BEnchmark of RDF ARchives (BEAR) [6], and the Semantic Publishing Benchmark (SPB) [15].

The past years have seen many developments for managing and querying temporal, versioned or historical RDF data. But most of these approaches propose specialised languages, implementations, etc., creating an obstacle for adoption. A number of authors note that one can manage and query RDF archives using vanilla SPARQL, though it may lead to complex or inefficient queries [23,6]. Recently SPARQL has been used to host and query the edit history of Wikidata, but only one representation is explored [22]. This paper describes preliminary experiments to gain insights into the efficiency of off-the-shelf SPARQL engines for hosting RDF archives using different types of representations.

Preliminaries

RDF triples are composed of terms from three sets: IRIs ($\mathbf{I}$), literals ($\mathbf{L}$) and blank nodes ($\mathbf{B}$). We do not consider blank nodes in this work as they complicate the detection of changes [28]. An RDF triple $(s, p, o) \in \mathbf{I} \times \mathbf{I} \times (\mathbf{I} \cup \mathbf{L})$ consists of a subject $s$, predicate $p$ and object $o$. An RDF graph $G$ is a set of triples. An RDF archive is a tuple of RDF graphs $\mathcal{G} = (G_1, \ldots, G_n)$.

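A triple pattern $t := (s, p, o) \in (\mathbf{I} \cup \mathbf{V}) \times (\mathbf{I} \cup \mathbf{V}) \times (\mathbf{I} \cup \mathbf{L} \cup \mathbf{V})$ is an RDF triple that permits variables from $\mathbf{V}$ to appear in any position. We denote by $\mathrm{vars}(t)$ the variables of $t$. A solution is a partial mapping $\mu : \mathbf{V} \rightarrow \mathbf{I} \cup \mathbf{L}$. We denote by $\mathrm{dom}(\mu)$ the domain of $\mu$, i.e., the set of variables for which $\mu$ is defined. We say that two solutions $\mu_1, \mu_2$ are compatible, denoted $\mu_1 \sim \mu_2$, if and only if $\mu_1(v) = \mu_2(v)$ for all $v \in \mathrm{dom}(\mu_1) \cap \mathrm{dom}(\mu_2)$. We denote by $\mu(t)$ the image of $t$ under $\mu$, replacing each variable $v \in \mathrm{dom}(\mu) \cap \mathrm{vars}(t)$ by $\mu(v)$ in $t$. We denote by $t(G) := \{\mu \mid \mu(t) \in G \text{ and } \mathrm{dom}(\mu) = \mathrm{vars}(t)\}$ the evaluation of $t$ over $G$.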
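To make these definitions concrete, the following is a minimal Python sketch of evaluating a triple pattern over a graph. The representation is an assumption of the sketch, not something taken from the paper: terms are plain strings, variables are written with a leading `?`, and a solution is stored as a frozen set of (variable, term) pairs.

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Solution = FrozenSet[Tuple[str, str]]  # a partial mapping as (variable, term) pairs


def is_var(term: str) -> bool:
    """Variables start with '?' (a convention of this sketch only)."""
    return term.startswith("?")


def match(pattern: Triple, triple: Triple) -> Optional[Solution]:
    """Return mu with mu(pattern) == triple and dom(mu) == vars(pattern), if any."""
    mu: Dict[str, str] = {}
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A repeated variable must be bound to the same term everywhere.
            if mu.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return frozenset(mu.items())


def evaluate(pattern: Triple, graph: Set[Triple]) -> Set[Solution]:
    """t(G): every solution that maps the pattern onto a triple of the graph."""
    return {mu for mu in (match(pattern, t) for t in graph) if mu is not None}


def compatible(mu1: Solution, mu2: Solution) -> bool:
    """mu1 ~ mu2: the two mappings agree on every shared variable."""
    d1, d2 = dict(mu1), dict(mu2)
    return all(d1[v] == d2[v] for v in d1.keys() & d2.keys())
```

For instance, evaluating `("?x", ":p", "?x")` only returns solutions for triples whose subject and object coincide, since the repeated variable must receive a single value.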

SPARQL queries are based on triple patterns and a number of relational operators. Similar to Pérez et al. [16], we define an abstract syntax for a subset of SPARQL of pertinence to this paper as follows. A triple pattern $t$ is a graph pattern. Furthermore, if $P$ and $Q$ are graph patterns, and $V \subseteq \mathbf{V}$ is a set of variables, then $[P\ \mathrm{and}\ Q]$, $[P\ \mathrm{union}\ Q]$ and $[P\ \mathrm{minus}\ Q]$ are graph patterns. We define the semantics of these graph patterns in Figure 1.$^1$

Figure 1 (semantics of graph patterns):

$$[P\ \mathrm{and}\ Q](G) := P(G) \bowtie Q(G)$$
$$[P\ \mathrm{union}\ Q](G) := P(G) \cup Q(G)$$
$$[P\ \mathrm{minus}\ Q](G) := P(G) \setminus Q(G)$$
$$M_1 \bowtie M_2 := \{\mu_1 \cup \mu_2 \mid \mu_1 \in M_1, \mu_2 \in M_2, \mu_1 \sim \mu_2\}$$
$$M_1 \cup M_2 := \{\mu \mid \mu \in M_1 \text{ or } \mu \in M_2\}$$
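The operators of Figure 1 can be sketched over sets of solutions as follows. This is an illustrative Python sketch, assuming solutions are stored as frozen sets of (variable, term) pairs; `minus` is implemented as plain set difference, following the figure. The `MINUS` of full SPARQL 1.1 is defined differently in general (it removes solutions that are compatible with, and share a variable with, some solution on the right), but the two coincide when both sides bind the same non-empty set of variables, which is the situation in the rewritings discussed later.

```python
from typing import FrozenSet, Set, Tuple

Solution = FrozenSet[Tuple[str, str]]  # (variable, term) pairs


def compatible(mu1: Solution, mu2: Solution) -> bool:
    """True if the two mappings agree on every shared variable."""
    d1, d2 = dict(mu1), dict(mu2)
    return all(d1[v] == d2[v] for v in d1.keys() & d2.keys())


def join(m1: Set[Solution], m2: Set[Solution]) -> Set[Solution]:
    """[P and Q]: merge every compatible pair of solutions."""
    return {a | b for a in m1 for b in m2 if compatible(a, b)}


def union(m1: Set[Solution], m2: Set[Solution]) -> Set[Solution]:
    """[P union Q]: solutions of either side."""
    return m1 | m2


def minus(m1: Set[Solution], m2: Set[Solution]) -> Set[Solution]:
    """[P minus Q]: plain set difference, as in Figure 1."""
    return m1 - m2
```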

In this paper, we rely on SPARQL datasets to represent RDF archives. A SPARQL dataset $D := \{G, (x_1, G_1), \ldots, (x_n, G_n)\}$ consists of a default (RDF) graph $G$ and a set of named graphs of the form $(x_i, G_i)$ where $x_i \in \mathbf{I}$, $G_i$ is an RDF graph, and $x_i \neq x_j$ for $1 \le i \le n$, $1 \le j \le n$, $i \neq j$. We may represent $D$ as a set of quads of the form $(G \times \{\star\}) \cup (G_1 \times \{x_1\}) \cup \ldots \cup (G_n \times \{x_n\})$, where $\star \notin \mathbf{I} \cup \mathbf{L} \cup \mathbf{V}$ is a special symbol denoting the default graph.$^2$ Different named graphs can be queried using a GRAPH operator, creating quad patterns. A quad pattern $q = (s, p, o, g) \in (\mathbf{I} \cup \mathbf{V}) \times (\mathbf{I} \cup \mathbf{V}) \times (\mathbf{I} \cup \mathbf{L} \cup \mathbf{V}) \times (\mathbf{I} \cup \{\star\})$ extends a triple pattern with a fourth element that may be an IRI or $\star$. Its evaluation is analogous to that of a triple pattern, viewing $D$ as its set of quads: $q(D) := \{\mu \mid \mu(q) \in D \text{ and } \mathrm{dom}(\mu) = \mathrm{vars}(q)\}$. Following the SPARQL standard [10], we translate a triple pattern $(s, p, o)$ to a quad pattern $(s, p, o, \star)$ evaluated only on the default graph. The semantics of the operators defined in Figure 1 then remain unchanged, simply allowing $P$ and $Q$ to now also represent quad patterns, considered to be (named) graph patterns.

SPARQL provides two operators to initialise a SPARQL dataset: FROM and FROM NAMED. We combine both for brevity into one operator. A graph pattern $P$ is considered to be a query. Likewise if $P$ is a graph pattern, and $M$ and $N$ are sets of IRIs, then $\mathrm{from}_{M,N}\, P$ is also a query. If $(x_i, G_i) \in D$ let $D(x_i) = G_i$; otherwise if no graph is named $x_i$ in $D$, let $D(x_i) = \{\}$. The evaluation of $\mathrm{from}_{M,N}\, P$ on a dataset $D$ is defined as $\mathrm{from}_{M,N}\, P(D) := P(D_{M,N})$, where the query dataset $D_{M,N} := (\bigcup_{m \in M} D(m) \times \{\star\}) \cup (\bigcup_{n \in N} D(n) \times \{n\})$ is formed from a default graph containing the union$^3$ of graphs with a name $m \in M$, and all named graphs in $D$ with a name $n \in N$. We will be able to use these operators to define query datasets that capture specific versions in an RDF archive.
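The construction of the query dataset $D_{M,N}$ can be sketched in a few lines of Python. The sketch assumes that the named graphs of $D$ are held in a dictionary from graph name to a set of triples, and uses the string `"*"` as a stand-in for the default-graph symbol; both choices are illustrative only.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, str]
DEFAULT = "*"  # stand-in for the special default-graph symbol


def query_dataset(named: Dict[str, Set[Triple]],
                  m: Iterable[str],
                  n: Iterable[str]) -> Set[Quad]:
    """Build D_{M,N} as a set of quads from the named graphs of D."""
    quads: Set[Quad] = set()
    for name in m:
        # Graphs named in M are merged into the default graph.
        quads |= {(s, p, o, DEFAULT) for (s, p, o) in named.get(name, set())}
    for name in n:
        # Graphs named in N remain addressable as named graphs.
        quads |= {(s, p, o, name) for (s, p, o) in named.get(name, set())}
    return quads
```

A name that matches no graph contributes nothing, mirroring the convention $D(x_i) = \{\}$ for unknown names.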

Versioned Data

Our general approach is to represent an RDF archive as a SPARQL dataset $D$, but there are multiple representations by which this can be achieved, each with its own strengths and weaknesses. We propose six different representations falling into three different categories as discussed by various authors (e.g., [24,5]): Independent Copies (IC), Change-Based (CB) and Timestamp-Based (TB).

$^1$ A basic graph pattern $\{t_1, \ldots, t_n\}$ can be represented as $[[t_1\ \mathrm{and}\ \ldots]\ \mathrm{and}\ t_n]$.
$^2$ We thus assume a quad store, and disallow empty named graphs.
$^3$ Since we do not allow blank nodes, a union or RDF merge is equivalent.

Independent Copies (IC): A natural representation is to store an RDF archive $\mathcal{G} = (G_1, \ldots, G_n)$ as a SPARQL dataset $D = \{(x_1, G_1), \ldots, (x_n, G_n)\}$, where $x_1, \ldots, x_n$ are IRIs that identify the version. This will result in relatively simple (and thus likely efficient) rewritten queries, but can be expected to occupy a lot of space, particularly where few triples change from version to version.

Change-Based (CB): The core idea of CB representations is to store only triples that change from a given reference version. Along these lines, in the following we denote by $\Delta_{i,j} := G_i \setminus G_j$ the triples in version $i$ not in version $j$. We consider four CB representations based on four transformations of $\mathcal{G} = (G_1, \ldots, G_n)$, which we write $\mathcal{G}_1^n$, $\mathcal{G}_{n-1}^n$, $\mathcal{G}_n^1$ and $\mathcal{G}_n^{n-1}$. These transformations are lossless: we can retrieve any version of the graph from any such transformation. The first two transformations start with the earliest version as a base. The first, $\mathcal{G}_1^n$, encodes deltas always with respect to the first version. The second, $\mathcal{G}_{n-1}^n$, encodes deltas with respect to the previous version. The latter two transformations start with the latest version. The third, $\mathcal{G}_n^1$, encodes deltas with respect to the latest version. The fourth, $\mathcal{G}_n^{n-1}$, encodes deltas with respect to the subsequent version.

Letting $\mathcal{H} = (H_1, \ldots, H_n)$ such that $H_i = G_{n-i+1}$ ($1 \le i \le n$), i.e., such that $\mathcal{H}$ "reverses" $\mathcal{G}$, we remark that $\mathcal{G}_1^n = \mathcal{H}_n^1$ and $\mathcal{G}_{n-1}^n = \mathcal{H}_n^{n-1}$; that is, the forward transformations of $\mathcal{G}$ are the reverse transformations of $\mathcal{H}$. Each such transformation can then be represented as a SPARQL dataset with $2n - 1$ named graphs.
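The four transformations can be written out explicitly. The following is a reconstruction from the prose description rather than the authors' own display: it assumes the delta notation $\Delta_{i,j} := G_i \setminus G_j$, and the names $\mathcal{G}_1^n$, $\mathcal{G}_{n-1}^n$, $\mathcal{G}_n^1$, $\mathcal{G}_n^{n-1}$ for the forward differential, forward incremental, reverse differential and reverse incremental transformations are labels chosen for this sketch. The right-hand column shows how a version $G_i$ is recovered in each case.

```latex
\begin{align*}
\mathcal{G}_1^{n} &:= (G_1, \Delta_{2,1}, \Delta_{1,2}, \ldots, \Delta_{n,1}, \Delta_{1,n})
  & G_i &= (G_1 \cup \Delta_{i,1}) \setminus \Delta_{1,i} \\
\mathcal{G}_{n-1}^{n} &:= (G_1, \Delta_{2,1}, \Delta_{1,2}, \ldots, \Delta_{n,n-1}, \Delta_{n-1,n})
  & G_i &= (G_{i-1} \cup \Delta_{i,i-1}) \setminus \Delta_{i-1,i} \\
\mathcal{G}_n^{1} &:= (G_n, \Delta_{n-1,n}, \Delta_{n,n-1}, \ldots, \Delta_{1,n}, \Delta_{n,1})
  & G_i &= (G_n \cup \Delta_{i,n}) \setminus \Delta_{n,i} \\
\mathcal{G}_n^{n-1} &:= (G_n, \Delta_{n-1,n}, \Delta_{n,n-1}, \ldots, \Delta_{1,2}, \Delta_{2,1})
  & G_i &= (G_{i+1} \cup \Delta_{i,i+1}) \setminus \Delta_{i+1,i}
\end{align*}
```

Each tuple holds one base graph and $2(n-1)$ deltas, which matches the count of $2n - 1$ named graphs. The incremental cases are recursive: recovering $G_i$ requires first recovering its neighbour.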

In terms of space we expect the incremental transformations $\mathcal{G}_{n-1}^n$ and $\mathcal{G}_n^{n-1}$ to be the most efficient as they always encode deltas from a neighbouring version. However, in terms of query rewriting, we expect the differential transformations $\mathcal{G}_1^n$ and $\mathcal{G}_n^1$ to be more efficient as they do not require a recursive construction of all intermediate versions towards the base version. In terms of indexing a new version, we expect $\mathcal{G}_1^n$ followed by $\mathcal{G}_{n-1}^n$ to be most efficient as they require only computing the most recent deltas; we expect $\mathcal{G}_n^1$, followed by $\mathcal{G}_n^{n-1}$, to be much more expensive, requiring a recompute of all deltas. On the other hand, the reverse transformations $\mathcal{G}_n^1$ and $\mathcal{G}_n^{n-1}$ should be advantageous for queries over more recent versions, and in particular over the most recent version (a common case).
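The trade-off between differential and incremental deltas can be illustrated with a small Python sketch over sets of triples. The function names and the in-memory representation are assumptions of the sketch; the reverse transformations are obtained by passing the list of versions in reverse order.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
Delta = Tuple[Set[Triple], Set[Triple]]  # (added, removed) relative to a reference


def differential(versions: List[Set[Triple]]) -> Tuple[Set[Triple], List[Delta]]:
    """Base G_1 plus, for each later G_i, the deltas with respect to G_1."""
    base = versions[0]
    return base, [(g - base, base - g) for g in versions[1:]]


def incremental(versions: List[Set[Triple]]) -> Tuple[Set[Triple], List[Delta]]:
    """Base G_1 plus, for each later G_i, the deltas with respect to G_{i-1}."""
    pairs = zip(versions, versions[1:])
    return versions[0], [(cur - prev, prev - cur) for prev, cur in pairs]


def restore_differential(base: Set[Triple], deltas: List[Delta], i: int) -> Set[Triple]:
    """Materialise G_i (1-indexed) from a single pair of deltas."""
    if i == 1:
        return set(base)
    added, removed = deltas[i - 2]
    return (base | added) - removed


def restore_incremental(base: Set[Triple], deltas: List[Delta], i: int) -> Set[Triple]:
    """Materialise G_i (1-indexed) by replaying i - 1 pairs of deltas in order."""
    graph = set(base)
    for added, removed in deltas[: i - 1]:
        graph = (graph | added) - removed
    return graph
```

Restoring from differential deltas touches one pair of deltas regardless of `i`, while restoring from incremental deltas replays every pair up to `i`, which is the recursion the rewriting has to emulate.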

These four representations are analogous to differential backups, incremental backups, reverse-differential backups, and reverse-incremental backups.

Timestamp-Based (TB): The intuition of the TB representation is to associate each triple with the versions in which it is contained. Along these lines, we denote by $\mathcal{G}(s, p, o) := \{i \mid (s, p, o) \in G_i \text{ for } 1 \le i \le n\}$ the versions containing $(s, p, o)$.$^4$ Let $\mathcal{N} := \{N \mid \exists (s, p, o) \in G_i : \mathcal{G}(s, p, o) = N, 1 \le i \le n\}$ denote the family of sets of versions associated with some triple in $\mathcal{G}$. We can then represent the RDF archive with a named graph for each $N \in \mathcal{N}$. However, the number of named graphs can reach $2^n$ (or the number of unique triples in $\mathcal{G}$). Another option is to create a named graph for intervals [5]. More specifically, a triple $(s, p, o)$ is added to an interval $[i, j]$ (for $1 \le i \le j \le n$) if and only if $(s, p, o) \in G_k$ for $i \le k \le j$, and either $i = 1$ or $(s, p, o) \notin G_{i-1}$, and either $j = n$ or $(s, p, o) \notin G_{j+1}$; in simpler terms, $[i, j]$ is a maximal contiguous interval of versions in which $(s, p, o)$ appears (omitting empty intervals). The upper bound is now $n(n + 1)/2$ named graphs, but triples may appear in multiple named graphs for different intervals.

$^4$ The definition $\mathcal{G} : \mathbf{I} \times \mathbf{I} \times (\mathbf{I} \cup \mathbf{L}) \rightarrow 2^{\{1, \ldots, n\}}$ [6] is analogous to $\mathcal{G} = (G_1, \ldots, G_n)$.
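The interval-based variant can be sketched as follows in Python. The sketch assumes the archive is given as a list of sets of triples and keys each interval graph by a pair `(i, j)` of 1-indexed versions; these representation choices are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Interval = Tuple[int, int]  # inclusive, 1-indexed versions [i, j]


def interval_graphs(versions: List[Set[Triple]]) -> Dict[Interval, Set[Triple]]:
    """Group triples by the maximal contiguous runs of versions containing them."""
    graphs: Dict[Interval, Set[Triple]] = defaultdict(set)
    n = len(versions)
    for triple in set().union(*versions):
        start = None
        for k in range(1, n + 2):  # k = n + 1 closes any run that is still open
            present = k <= n and triple in versions[k - 1]
            if present and start is None:
                start = k
            elif not present and start is not None:
                graphs[(start, k - 1)].add(triple)
                start = None
    return dict(graphs)


def materialise(graphs: Dict[Interval, Set[Triple]], v: int) -> Set[Triple]:
    """Union every interval graph [i, j] with i <= v <= j to obtain G_v."""
    return set().union(*(g for (i, j), g in graphs.items() if i <= v <= j))
```

A triple that is removed and later re-added ends up in two interval graphs, which is why the number of graphs to union for one version can grow with the number of distinct intervals rather than with the number of versions alone.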

In general, we expect the space to be similar to that of the incremental CB transformations $\mathcal{G}_{n-1}^n$ and $\mathcal{G}_n^{n-1}$; in other words, quite good. However, as the number of versions $n$ grows, $O(n^2)$ interval graphs need to be unioned in the worst case to materialise a particular version; CB representations require processing $O(1)$ (in the case of $\mathcal{G}_1^n$ and $\mathcal{G}_n^1$) or $O(n)$ (in the case of $\mathcal{G}_{n-1}^n$ and $\mathcal{G}_n^{n-1}$) named graphs to materialise a particular version.

Notation: We denote IC by $i$; differential, incremental, reverse-differential and reverse-incremental CB by $c_d^+$, $c_i^+$, $c_d^-$ and $c_i^-$, resp.; and interval TB by $t$.

Versioned Queries

Given a SPARQL query $Q$ over an RDF graph, we now describe automatic rewritings of $Q$ to generate solutions for different versions. We first focus on rewritings of triple patterns and then generalise. We assume that version parameters are given via HTTP rather than extending the SPARQL syntax. For reasons of space, we present examples in the online material [4].

Single-Version Queries

A single-version query returns $Q(G_v)$ for a specified version $v$. Our overall strategy is to use FROM to construct the graph of the version where possible, as is the case for all versions in $i$ and $t$; for $G_1$ in $c_d^+$, $c_i^+$; and for $G_n$ in $c_d^-$, $c_i^-$. Otherwise we rewrite each individual triple pattern appearing in $Q$ in order to ensure that it generates the same solutions as it would if evaluated over the graph of the selected version. We now provide more details for each representation.

Independent Copies (IC) For $i$ we rewrite $Q$ to $\mathrm{from}_{\{x_v\},\{\}}\, Q$, where $x_v$ is the IRI that names the graph for version $v$.

Change-Based (CB) Recalling the observation that forwards and reverse CB representations are analogous, for brevity we define the rewriting for the forwards direction ($c_d^+$, $c_i^+$) only; the reverse direction ($c_d^-$, $c_i^-$) follows naturally.

For the differential representation, we load the base version and the positive delta into the default graph and, for each triple pattern, we subtract the negative delta, which is queried using a quad pattern. Thus for $c_d^+$ we rewrite $Q$ to $\mathrm{from}_{\{x_1, \delta_{v,1}\},\{\delta_{1,v}\}}\, Q'$, where $x_1$, $\delta_{v,1}$ and $\delta_{1,v}$ indicate the names of $G_1$, $\Delta_{v,1}$ and $\Delta_{1,v}$, respectively (recalling that $\Delta_{i,j} := G_i \setminus G_j$); and $Q'$ replaces each triple pattern $(s, p, o) \in Q$ with the named graph pattern $[(s, p, o, \star)\ \mathrm{minus}\ (s, p, o, \delta_{1,v})]$.
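In concrete SPARQL syntax, the IC and forward-differential rewritings might look as produced by the following Python sketch. This is not the authors' implementation (see the online material [4] for that); the function names and the graph IRIs used in the usage note are hypothetical, and the sketch only handles queries given as a list of triple patterns whose terms are already written in SPARQL syntax.

```python
from typing import List, Tuple

Pattern = Tuple[str, str, str]  # terms in SPARQL syntax, e.g. "?s" or "<http://ex.org/p>"


def rewrite_ic(patterns: List[Pattern], version_iri: str) -> str:
    """IC: evaluate the unchanged patterns over the graph of one version."""
    body = " ".join(f"{s} {p} {o} ." for s, p, o in patterns)
    return f"SELECT * FROM <{version_iri}> WHERE {{ {body} }}"


def rewrite_cd_plus(patterns: List[Pattern], base_iri: str,
                    pos_iri: str, neg_iri: str) -> str:
    """Forward differential: base and positive delta form the default graph;
    the negative delta is a named graph subtracted from each triple pattern."""
    parts = []
    for s, p, o in patterns:
        t = f"{s} {p} {o} ."
        # One group per triple pattern: [(s,p,o,*) minus (s,p,o,negative delta)].
        parts.append(f"{{ {t} MINUS {{ GRAPH <{neg_iri}> {{ {t} }} }} }}")
    body = " ".join(parts)
    return (f"SELECT * FROM <{base_iri}> FROM <{pos_iri}> "
            f"FROM NAMED <{neg_iri}> WHERE {{ {body} }}")
```

For example, `rewrite_cd_plus([("?s", "<http://ex.org/p>", "?o")], "http://ex.org/g1", "http://ex.org/pos-3", "http://ex.org/neg-3")` yields a query with two `FROM` clauses, one `FROM NAMED` clause and a single `MINUS`. One caveat of this sketch: SPARQL's `MINUS` never removes a solution when the two sides share no variable, so a triple pattern without variables would need a `FILTER NOT EXISTS` instead.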

Unfortunately the incremental rewriting is more complex. A first idea would be to take the union of $G_1$ and all positive deltas $\Delta_{2,1}, \ldots, \Delta_{v,v-1}$ and then subtract the (union of the) negative deltas $\Delta_{1,2}, \ldots, \Delta_{v-1,v}$; unfortunately this would overlook triples that were removed from a version $1 < u < v$ but were added back in a later version $u < u' \le v$ (and were not removed again in a version $u' < u'' \le v$). Hence a recursive rewriting appears to be necessary. Let $Q_1 := \mathrm{from}_{\{x_1\},\{\}}\, Q$; then $Q_2 := \mathrm{from}_{\{x_1\},\{\delta_{1,2}, \delta_{2,1}\}}\, Q'_1$, where $Q'_1$ is the result of replacing each triple pattern $(s, p, o)$ in $Q_1$ by the named graph pattern $P_2 := [[(s, p, o, \star)\ \mathrm{union}\ (s, p, o, \delta_{2,1})]\ \mathrm{minus}\ (s, p, o, \delta_{1,2})]$. We can then apply this rewriting recursively: $Q_i := \mathrm{from}_{\{x_1\},\{\delta_{1,2}, \ldots, \delta_{i-1,i}, \delta_{2,1}, \ldots, \delta_{i,i-1}\}}\, Q'_{i-1}$, where $Q'_{i-1}$ replaces each named graph pattern $P_{i-1}$ appearing in $Q_{i-1}$ with the recursive pattern $P_i := [[P_{i-1}\ \mathrm{union}\ (s, p, o, \delta_{i,i-1})]\ \mathrm{minus}\ (s, p, o, \delta_{i-1,i})]$. Here $x_1$ names $G_1$ and $\delta_{i,j}$ names the delta $\Delta_{i,j} := G_i \setminus G_j$.
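The failure of the "first idea" is easy to reproduce on sets of triples. The following Python sketch compares the naive strategy with the recursive one; it operates directly on the triples of each version rather than on query solutions, and the function names are illustrative.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def deltas(versions: List[Set[Triple]]) -> List[Tuple[Set[Triple], Set[Triple]]]:
    """Incremental (added, removed) deltas for each consecutive pair of versions."""
    return [(cur - prev, prev - cur) for prev, cur in zip(versions, versions[1:])]


def naive(versions: List[Set[Triple]], v: int) -> Set[Triple]:
    """Base plus all positive deltas, minus the union of all negative deltas."""
    ds = deltas(versions)[: v - 1]
    added = set().union(*(a for a, _ in ds))
    removed = set().union(*(r for _, r in ds))
    return (versions[0] | added) - removed


def recursive(versions: List[Set[Triple]], v: int) -> Set[Triple]:
    """P_i := [[P_{i-1} union positive delta] minus negative delta], unfolded up to v."""
    result = set(versions[0])
    for added, removed in deltas(versions)[: v - 1]:
        result = (result | added) - removed
    return result
```

With three versions in which a triple is present, removed, then re-added, `naive` drops the triple from the third version because it appears in some negative delta, whereas `recursive` applies the deltas in order and keeps it.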

This rewriting leads to complex queries. We thus optimise using additional features of SPARQL. To sketch the idea, our goal is then to make sure that each triple pattern only matches triples that appear in a base version and were not removed, or that appear in a positive delta $\Delta_{j,i}$ such that $1 \le i < j \le v$ and do not appear in a (later) negative delta $\Delta_{k,l}$ for $j \le k < l \le v$. In practice, for each delta $\Delta_{j,i}$ stored as a named graph $\delta_{j,i}$, in offline processing, we index meta-data of the form $(\delta_{j,i}, \mathsf{ver}, j, \$)$, and $(\delta_{j,i}, \mathsf{type}, \mathsf{pos}, \$)$ in the case that $i < j$ or $(\delta_{j,i}, \mathsf{type}, \mathsf{neg}, \$)$ in the case that $j < i$ (where $\$ \in \mathbf{I}$ is a reserved name for the meta-data graph). We can then check the aforementioned condition by using aggregation to find the maximum version of a negative delta less than or equal to $v$ in which the triple pattern matches, then filtering the base version or any positive delta earlier than this maximum version. While the resulting query is still quite complex, we found it to be more practical than the recursive rewriting.

Timestamp-Based (TB) Let $x_{i:j}$ denote the name of the graph for the interval of versions $[i, j]$. We rewrite $Q$ to the query $\mathrm{from}_{I,\{\}}\, Q$, where $I$ is the set of IRIs naming intervals in which $v$ is contained; formally: $I := \{x_{i:j} \mid i \le v \le j\}$.
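The idea behind the aggregation-based check can be sketched in Python on sets of triples. The sketch makes a simplifying assumption that may differ from the meta-data actually indexed: every delta, positive or negative, is tagged with the later of its two versions, so that an event `(k, "pos")` means "present from version k" and `(k, "neg")` means "absent from version k". The base version is treated as a positive event at version 1.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Events = Dict[Triple, List[Tuple[int, str]]]  # triple -> [(version, "pos" | "neg")]


def index_events(versions: List[Set[Triple]]) -> Events:
    """Record, per triple, the versions at which it is added or removed."""
    events: Events = {}
    for t in versions[0]:
        events.setdefault(t, []).append((1, "pos"))  # the base version
    for k in range(1, len(versions)):
        prev, cur = versions[k - 1], versions[k]
        for t in cur - prev:
            events.setdefault(t, []).append((k + 1, "pos"))
        for t in prev - cur:
            events.setdefault(t, []).append((k + 1, "neg"))
    return events


def in_version(events: Events, t: Triple, v: int) -> bool:
    """t is in G_v iff its latest addition up to v is later than its latest removal up to v."""
    history = events.get(t, [])
    last_pos = max((ver for ver, kind in history if kind == "pos" and ver <= v), default=0)
    last_neg = max((ver for ver, kind in history if kind == "neg" and ver <= v), default=0)
    return last_pos > last_neg
```

The two `max` computations play the role of the aggregation in the rewritten query: instead of unfolding one nested pattern per version, a single comparison of maxima decides membership.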

Delta-Version Queries

Given a query $Q$, a control version $u$ and a target version $v$, delta-version queries give solutions in $Q(G_v) \setminus Q(G_u)$. The general strategy for rewriting is to construct a query $[Q_v\ \mathrm{minus}\ Q_u]$ where $Q_v$ and $Q_u$ are the respective single-version queries.
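For the IC representation, one plausible concrete form of $[Q_v\ \mathrm{minus}\ Q_u]$ is sketched below in Python. The paper does not spell out this query, so the shape shown here (both versions loaded as named graphs, each side wrapped in a `GRAPH` clause) is an assumption of the sketch, as are the function name and the IRIs in the usage note.

```python
from typing import List, Tuple

Pattern = Tuple[str, str, str]  # terms in SPARQL syntax, e.g. "?s" or "<http://ex.org/p>"


def rewrite_ic_delta(patterns: List[Pattern], target_iri: str, control_iri: str) -> str:
    """Solutions over the target version that are not solutions over the control version."""
    body = " ".join(f"{s} {p} {o} ." for s, p, o in patterns)
    return (
        f"SELECT * FROM NAMED <{target_iri}> FROM NAMED <{control_iri}> WHERE {{ "
        f"GRAPH <{target_iri}> {{ {body} }} "
        f"MINUS {{ GRAPH <{control_iri}> {{ {body} }} }} }}"
    )
```

Calling `rewrite_ic_delta([("?s", "<http://ex.org/p>", "?o")], "http://ex.org/v8", "http://ex.org/v7")` gives a query whose `MINUS` compares whole solutions, which matches $Q(G_v) \setminus Q(G_u)$ for basic graph patterns with at least one variable, since both sides then bind the same variables.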

Other SPARQL Features

In the SPARQL algebra, only the evaluation of triple patterns and property paths (regular expressions that match arbitrary-length paths in the graph) directly accept the graph as input. Hence, given a SPARQL query $Q$ (over a default graph), if we can individually rewrite each triple pattern and (property) path pattern of $Q$ to generate solutions for $G_v$, then the rewritten query $Q'$ will generate precisely the solutions for $G_v$. We previously described this process for triple patterns. However, in SPARQL we cannot always express a property path over multiple named graphs in the query dataset. For example, consider a property path :y+ indicating a path of one or more predicates :y, a positive integer $K \ge 1$, and two named graphs $(n_1, \{(c_{2k-2}, \text{:y}, c_{2k-1}) \mid 1 \le k \le K\})$ and $(n_2, \{(c_{2k-1}, \text{:y}, c_{2k}) \mid 1 \le k \le K\})$ such that there is a path for :y+ of length $2K$ (edges) that "alternates" between both named graphs. A GRAPH clause with a variable would evaluate this path on each graph separately (we cannot bind the graph variable to two graph names in a single solution). Though we can use FROM over $n_1$ and $n_2$ to form a default graph for evaluating :y+, we can only do this once per query. We can rather use a join of $2K$ GRAPH clauses, but $K$ is bounded by the data, not the query (nor the number of versions). Thus while we support property paths for single-version queries on $i$ and $t$, and delta-version queries on $i$, we do not know how to support them in the other cases.
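The alternating-path example can be checked with a short Python sketch, in which each named graph is reduced to its set of (subject, object) pairs for the predicate :y. The helper names are illustrative.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (subject, object) of a triple with predicate :y


def reachable(edges: Set[Edge], start: str) -> Set[str]:
    """Nodes reachable from start via one or more edges (the path :y+)."""
    seen: Set[str] = set()
    frontier = {start}
    while frontier:
        frontier = {o for (s, o) in edges if s in frontier} - seen
        seen |= frontier
    return seen


def alternating_graphs(k: int) -> Tuple[Set[Edge], Set[Edge]]:
    """n1 holds the edges c0->c1, c2->c3, ...; n2 holds c1->c2, c3->c4, ..."""
    n1 = {(f"c{2 * i - 2}", f"c{2 * i - 1}") for i in range(1, k + 1)}
    n2 = {(f"c{2 * i - 1}", f"c{2 * i}") for i in range(1, k + 1)}
    return n1, n2
```

Evaluating the path on each graph separately, as a `GRAPH` clause with a variable would, reaches at most one node from `c0`; only the union of both graphs yields the full path of length $2K$.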

Experiments

We now perform experiments to address the following three research questions: (Q1) Which of the six representations allow for better compression, more efficient indexing, and more efficient updates of a new version? (Q2) How do the query runtimes of compressed representations (CB, TB) compare with indexing complete versions (IC)? (Q3) Which representation works best overall?

Setting We address these questions for a Wikidata archive of 8 weekly truthy versions from 2017-08-09 with 1.506 billion triples, to 2017-09-27 with 1.924 billion triples. The RDF archive consists of 13.477 billion triple–version pairs. Each week 25–93 million triples are added, while 4–6 million triples are removed. We take Wikidata's example queries, defined by users,$^5$ and translate Wikidata-specific features (e.g., the label service) to standard SPARQL. We further filter out queries that feature federation to other endpoints, property paths (not supported by all representations), and qualifiers (not in truthy versions). The result is a test set of 146 SPARQL queries. We take Virtuoso as our SPARQL implementation. Runtimes are averaged over three runs. Query timeouts were set to 5 minutes. The machine used has 120GB of RAM and a standard SATA hard-disk.

Indexing We first look at the results of total index sizes for each representation. In Figure 2 we show the index sizes (GB) for each representation, the time taken (min.) to bulk load all versions in the representation, and the time taken (min.) to update a seven-version archive with the eighth version. We see that $i$ has the largest index sizes, followed by $c_d^+$ and $c_d^-$, then $t$, and finally $c_i^+$ and $c_i^-$. The bulk load times correlate with index size, with $i$ (thus) being by far the slowest. We see a similar trend in version updates, except that $c_d^-$ is far slower than the other alternatives (even $i$) as the entire archive must be built from scratch.
Single version queries We apply the rewriting of our 146 SPARQL queries for each of our six representations in order to retrieve results for versions 1, 5 and 8. We show the results as box-plots with log y-axis in Figure 3 (with the timeout as the maximum). Median times were generally in the range of 100–1000 ms, though 1–20 queries timed out in each experiment, affecting the mean (shown as a diamond). In terms of mean and median runtimes, $i$ performs the best, with $c_d^+$, $c_d^-$ and $t$ also performing competitively across the different versions. Conversely, $c_i^+$ and $c_i^-$ perform poorly relative to the other options (except $c_i^-$ in the case of version 8). Contrasting query times with index sizes, we note a clear time–space trade-off, where the largest index performs best, the smallest perform worst, and those with intermediate space perform middlingly.

$^5$ https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples

[Figure 2: index size (GB), bulk load time (min.) and update time (min.) for each representation]

[Figure 3: box-plots of single-version query runtimes (ms, log scale) for each representation]

Delta version queries Next we rewrite our 146 SPARQL queries in order to retrieve delta results between versions 1–2, 4–5 and 7–8 from each of our six representations. We again show the results as box-plots with log y-axis in Figure 4 (with the timeout as the maximum). When compared with single version queries, we see an increase in time, where the median runtimes of even the best performing representations approach or exceed 1000 ms more often. This time the best performance is offered by $t$, followed by $i$, $c_i^+$ and $c_i^-$. Conversely, $c_d^+$ and $c_d^-$ perform very poorly. Of note is that incremental builds perform better than differential builds; we believe that this is due to the ability to cache smaller graphs, most of which are used to generate results for both versions.

Conclusion

We now reflect back on our research questions: (Q1) In terms of indexing space and time, incremental builds with an initial base version ($c_i^+$) are best. (Q2) In general the uncompressed IC representation ($i$) offers the best query runtimes, but interval TB ($t$) is quite competitive, and even outperforms IC for delta-version queries. (Q3) Rather than there being an overall winner, we note a time–space trade-off, where less compact representations have faster queries and more compact representations have slower queries. Interval TB ($t$) arguably strikes the best balance for space and time, though this may not hold with more versions, particularly in RDF archives where triples are often added or removed multiple times, as a quadratic number of intervals may need to be accessed.

For future work, it would be of interest to run experiments for other SPARQL engines and other RDF archive benchmarks. Also it would be interesting to run more diverse types of versioned queries, such as delta versions with larger gaps, queries returning versions as solutions, etc.; a related direction would be to consider operators from temporal logics [ 18 ]. There are also open questions relating to more optimal/concise query rewritings, and support for paths.

In conclusion: for those who wish to host RDF archives, with different versions of an RDF graph, is SPARQL all you need? Specialised languages and systems can offer more features and consume less time and space. But with some caveats, our results suggest that query rewriting over an off-the-shelf SPARQL engine can be a solid (easy-to-deploy) option for such scenarios.

Online material Please see [4] for code and queries.

Acknowledgements This work was funded by Fondecyt Grant No. 1181896 and ANID Millennium Science Initiative Program ICN17_002.

References

1. J. Anderson and A. Bendiken. Transaction-time queries in Dydra. In Managing the Evolution and Preservation of the Data Web (MEPDaW), pages 11–19, 2016.
2. Bahri, Laajimi, and N. Y. Ayadi. Distributed RDF Archives Querying with Spark. In European Semantic Web Conference (ESWC), pages 451–465, 2018.
3. Cerdeira-Pena, A. Fariña, J. D. Fernández, and M. A. Martínez-Prieto. Self-indexing RDF archives. In Data Compression Conference (DCC), pages 526–535, 2016.
4. I. Cuevas. Online Material, 2020. https://github.com/HunterNacho/sparql-versioning/.
5. J. D. Fernández, Polleres, and Umbrich. Towards Efficient Archiving of Dynamic Linked Open Data. In DIACHRON Managing the Evolution and Preservation of the Data Web, pages 34–49, 2015.
6. J. D. Fernández, Umbrich, Polleres, and Knuth. Evaluating query and storage strategies for RDF archives. Semantic Web, 10(2):247–291, 2019.
7. Grandi. T-SPARQL: A TSQL2-like temporal query language for RDF. In Local Proceedings of the Fourteenth East-European Conference on Advances in Databases and Information Systems, pages 21–30, 2010.
8. Graube, Hensel, and L. Urbas. R43ples: Revisions for Triples - An Approach for Version Control in the Semantic Web. In Linked Data Quality (LDQ), 2014.
9. Gutierrez, C. A. Hurtado, and A. A. Vaisman. Introducing time into RDF. IEEE Trans. Knowl. Data Eng., 19(2):207–218, 2007.
10. Harris, Seaborne, and E. Prud'hommeaux, editors. SPARQL 1.1 Query Language. 21 March 2013. Available at http://www.w3.org/TR/sparql11-query/.
11. T. Käfer, Abdelrahman, Umbrich, P. O'Byrne, and Hogan. Observing linked data dynamics. In The Semantic Web: Semantics and Big Data, 10th International Conference, ESWC 2013, Montpellier, France, May 26-30, 2013. Proceedings, pages 213–227, 2013.
12. Khurana and Deshpande. Storing and Analyzing Historical Graph Data at Scale. In International Conference on Extending Database Technology (EDBT), pages 65–76, 2016.
13. Kotsev, Minadakis, Papakonstantinou, Erling, I. Fundulaki, and Kiryakov. Benchmarking RDF Query Engines: The LDBC Semantic Publishing Benchmark. In Benchmarking Linked Data (BLINK), 2016.
14. Neumann and Weikum. x-RDF-3X: Fast Querying, High Update Rates, and Consistency for RDF Databases. Proc. VLDB Endow., 3(1):256–263, 2010.
15. Papakonstantinou, I. Fundulaki, and Flouris. Assessing Linked Data Versioning Systems: The Semantic Publishing Versioning Benchmark. In Scalable Semantic Web Knowledge Base Systems (SSWS), pages 45–60, 2018.
16. J. Pérez, M. Arenas, and C. Gutierrez. Semantics and complexity of SPARQL. ACM Trans. Database Syst., 34(3):16:1–16:45, 2009.
17. M. Perry, P. Jain, and A. P. Sheth. SPARQL-ST: extending SPARQL to support spatiotemporal queries. In Geospatial Semantics and the Semantic Web - Foundations, Algorithms, and Applications, pages 61–86, 2011.
18. Pnueli. The temporal logic of programs. In Foundations of Computer Science (FOCS), pages 46–57. IEEE Computer Society, 1977.
19. Pugliese, Udrea, and V. S. Subrahmanian. Scaling RDF with time. In World Wide Web Conference (WWW), pages 605–614, 2008.
20. Taelman, M. V. Sande, J. V. Herwegen, E. Mannens, and Verborgh. Triple storage for random-access versioned querying of RDF archives. J. Web Semant., 54:4–28, 2019.
21. Taelman, M. V. Sande, and Verborgh. OSTRICH: Versioned Random-Access Triple Store. In Comp. of The Web Conference, pages 127–130, 2018.
22. T. P. Tanon and F. M. Suchanek. Querying the Edit History of Wikidata. In ESWC Satellite Events, pages 161–166. Springer, 2019.
23. Tappolet and Bernstein. Applied temporal RDF: efficient temporal querying of RDF data with SPARQL. In Extended Semantic Web Conference (ESWC), pages 308–322, 2009.
24. Tzitzikas, Theoharis, and Andreou. On Storage Policies for Semantic Web Repositories That Support Versioning. In European Semantic Web Conference (ESWC), volume 5021, pages 705–719, 2008.
25. M. Völkel and T. Groza. SemVersion: An RDF-based ontology versioning system. In IADIS Conference: WWW/Internet, volume 2006, 2006.
26. Vrandečić and M. Krötzsch. Wikidata: a free collaborative knowledgebase. Commun. ACM, 57(10):78–85, 2014.
27. C. Zaniolo, S. Gao, M. Atzori, M. Chen, and J. Gu. User-friendly temporal queries on historical knowledge bases. Inf. Comput., 259(3):444–459, 2018.
28. Zeginis, Tzitzikas, and Christophides. On computing deltas of RDF/S knowledge bases. TWEB, 5(3):14:1–14:36, 2011.
29. Zimmermann, Lopes, Polleres, and Straccia. A general framework for representing, reasoning and querying with annotated Semantic Web data. J. Web Sem., 11:72–95, 2012.