Reflections on: Reasoning and Querying Web-scale Open Data based on DL-LiteA in a Divide-and-Conquer Way Zhenzhen Gu1 , Songmao Zhang2 , and Cungen Cao1 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 2 Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing, China guzhenzhen0720@163.com smzhang@math.ac.cn cgcao@ict.ac.cn 1 Introduction Problem Description. Although the sheer size of the open datasets [2] in Linked Open Data (LOD) [3] can be viewed as a positive sign for Semantic Web initiatives, it causes performance bottlenecks for systems and tools that provide data managing and query answering services over the LOD datasets. Moreover, as the growth of the schema knowledge described by OWL, realizing efficient reasoning and expressive query answering has become a challenging task for both data publishers and end users. Most of the studies in this respect (for example, [4–6]) focused on dealing with the scalability issue whereas inferring knowledge through reasoning is often ignored. Moreover, when reasoning is considered, most of the time only lightweight RDFS reasoning can be supported. Proposal of DL-LiteA . Motivated by a statistic analysis of the BTC 2012 dataset3 , we propose to use DL-LiteA [10] techniques to reason and query the web-scale open datasets (knowledge bases) described by Semantic Web standards like RDF and OWL due to the low reasoning complexity and suitable expressiv- ity of the language. The distinguished feature of DL-LiteA is FOL-rewritability. This means that in DL-LiteA , by schema knowledge reasoning, both satisfiabil- ity checking and conjunctive query (CQ) answering can be eventually reduced to answering queries over the data layers of the corresponding knowledge bases (KBs). Moreover the separation between schema knowledge and data layer rea- soning makes it possible to use highly optimized database management systems and engines in practice for query answering. Bottleneck of DL-LiteA . Unfortunately, even with low reasoning complexity, when facing the real-life scalability challenge, the actual reasoning and CQ an- swering realized by DL-LiteA may become infeasible by the following two factors. Firstly, for both satisfiability checking and CQ answering, a polynomial size of queries may need to be answered over the data layers of the corresponding KBs w.r.t. the size of the schema knowledge of these KBs. Secondly, for the KBs with massive individual assertions, evaluating a single query over the data lay- ers may be highly time-consuming. The combination of these two factors makes 3 https://km.aifb.kit.edu/projects/btc-2012/ Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). 2 Gu et al. the situation even worse, calling for new techniques for DL-LiteA to tackle the scalability and efficiency challenge. Our Solution–Divide-and-Conquer. In computer science, divide-and-conquer is the basis of efficient algorithms for all kinds of problems. This impels us to seek for a divide-and-conquer reasoning and query answering approach for DL- LiteA , with the basic idea of partitioning both KBs and CQs into smaller chunks and decomposing the original reasoning and query answering tasks into a group of independent sub-tasks such that the overall performance can be improved by taking advantage of the parallelization and distribution techniques. The challenge for designing such an approach lies in how to carry out KB and CQ partitioning and reasoning reduction in a sound and complete way. Motivated by hash partitioning of RDF graphs (the partitions with the local feature of star-query answering) [15], we expect the smaller KB chunks to have the local feature for both satisfiability checking and simple-query answering. Here simple-queries (SQs) are the CQs whose query atoms share a common variable or individual. For CQ answering, we expect to partition a CQ into smaller SQs and evaluate them over smaller KB chunks. Under these expectations, our divide-and-conquer approach is constructed from both theoretical and practical perspectives. Theoretically, definitions of KB partitions and CQ partitions are presented and the sufficient and necessary conditions are identified to determine whether a KB partition holds the desired features and whether a CQ can be answered in the desired way. Practically, based on the theoretical results, the concrete ways of partitioning KBs and CQs and evaluating CQ partitions over KB partitions are described. Moreover, a strategy of optimizing the procedure of evaluating CQ partitions over KB partitions is provided to improve the overall query answering performance. To verify our pro- posal and approach, experiments are conducted on two web-scale open datasets, i.e., DBpedia and BTC 2012 dataset, and the testing results indicate that the proposed approach opens new possibilities for realizing performance-critical ap- plications on the Web with both expressivity and high scalability. To the best of our knowledge, this is the first study on partitioning DL-LiteA KBs and CQs for the purpose of efficiently reasoning with and querying large- scale datasets. We assume that the readers are familiar with DL-LiteA and CQs. Next, we illustrate our main scientific contributions, containing KB partitioning, query partitioning, optimization and experiments. The technique details can be seen in [1]. In this paper, for a CQ Q and DL-LiteA KB K, we use Ans(Q, K) to denote the set of all the certain answers of Q over K. Moreover, we use Hd(Q) and Bd(Q) to denote the head and body of Q, respectively. For two tuples x and u with the same dimension, we use [x/u] to denote a substitution. 2 DL-LiteA KB partitioning Here, we discuss partitioning DL-LiteA KBs into small chunks with the desired feature, containing the formalization of KB partitions, the conditions of detecting the desired KB partitions and the concrete way of computing such KB partitions. Title Suppressed Due to Excessive Length 3 Formalizing KB partitions. Before defining DL-LiteA KB partitions, we first provide a way of computing a smaller sub-TBox for a given sub-ABox. Procedure 1. For a DL-LiteA KB (T , A) and sub-ABox A0 ⊆ A, we use T |A0 to denote a sub-TBox of T constructed by the steps 1–2. Let T |A0 = ∅. Then: Step 1. For each α ∈ T , add α to T |A0 if α has the form γ v ¬η or Fun(S) and each name in α occurs in (T |A0 , A0 ), or α has the form γ v η and the name in γ occurs in (T |A0 , A0 ); Step 2. Iterate Step 1, until T |A0 do not change anymore. Since for a sub-ABox, not all the axioms are relevant in terms of satisfiabil- ity checking and CQ answering, and smaller TBoxes can make the considered reasoning tasks be realized more quickly. The correctness is shown below: Lemma 1. For a DL-LiteA KB (T , A) and A0 ⊆ A, (1) (T |A0, A0 ) is satisfiable iff (T , A0 ) is satisfiable; and (2) Ans(Q, (T |A0, A0 )) = Ans(Q, (T , A0 )) for CQ Q. By Lemma 1, partitions of DL-LiteA KBs are defined in the definition below. Definition 1. For DL-LiteA KB K = (T , A), the set S = ∪ni=1 {(Ti , Ai )} of DL- LiteA KBs is called a partition of K if A = ∪ni=1 Ai and Ti = T |Ai . S is called a SCSQA-local partition of K iff (1) K is satisfiable iff each KB in S is satisfiable; and (2) if K is satisfiable then Ans(Q, K) = ∪K0 ∈S Ans(Q, K0 ) for each SQ Q. Detecting SCSQA-local partitions. By analyzing the algorithms of satisfi- ability checking and CQ rewriting [10], we found that in DL-LiteA , both satis- fiability checking and SQ answering can be reduced to answering SQs over the ABoxes of the corresponding KBs, shown in the lemma below. Lemma 2. For a DL-LiteA KB K = (T , A), (1) there exists a set Q of SQs so that K is satisfiable iff ∪Q∈Q Ans(Q, (∅, A) = ∅; and (2) if K is satisfiable then for each SQ Q, there exists a set Q of SQs so that Ans(Q, K) = ∪Q0 ∈Q Ans(Q0, (∅, A)). All the query atoms in a SQ share a common individual or variable. Therefore, by Lemma 2, we can get that in the scenario of KB partitioning, for the local feature of satisfiability checking and SQ answering, the assertions in the original KB and sharing a common individual should be put in an identical sub-KB. Proposition 1. For DL-LiteA KB K = (T , A) and partition S of K, if for each U ⊆ A satisfying that all the assertions in U share a common individual, there exists (T 0 , A0 ) ∈ S such that U ⊆ A0 , then S is a SCSQA-local partition of K. Without redundant individual assertions [1] (Theorem 1), the condition in Proposition 1 can be used as a necessary and sufficient condition. Computing SCSQA-local partitions. By Proposition 1, SCSQA-local par- titions of DL-LiteA KBs can be obtained by the following procedure. Procedure 2. For a DL-LiteA KB K = (T , A), a SCSQA-local partition S of K can be obtained by the following steps 1–2: Step 1. Compute a partition A1 , · · · , An of A by the following two sub-steps a–b: a. Compute the set U = {Ua |a ∈ I} where I is the set of all the individuals of A, and Ua is the set consisting of all the assertions in A that contain a. b. Compute a partition U1 , · · · , Un of U. For each Ui , let Ai = ∪e∈Ui e. Step 2. For each 1 ≤ i ≤ n, compute the TBox T |Ai for Ai . Let S = ∪n i=1 {(T |Ai , Ai )}. 4 Gu et al. Different ways of partitioning U in step 1.b lead to different SCSQA-local partitions of K. In terms of distribution and parallelization, the partitions gen- erating more balanced, smaller sized sub-ABoxes are more likely to achieve the efficiency improvement. For a given size asize of ABoxes in KB partition, a par- tition of U generating sub-ABoxes sized no more than asize can be obtained by: b1. Compute an ordered tuple O = U1 , · · · , Um of the elements in U such that |Ui | ≤ |Ui+1 | for 1 ≤ i ≤ m − 1; b2. Continue the following procedure P, until O does not contain any element. P. Generate a sub-ABox A0 . Let A0 = ∅. For i from |O| to 1, if |A0 ∪ O[i]| ≤ asize then let A0 = A0 ∪ O[i]; otherwise for j from 1 to i − 1, if |A0 ∪ O[j]| ≤ asize then let A0 = A0 ∪O[j], otherwise break the first loop. Remove all the elements used to generate A0 from O. The complexity of computing SCSQA-local partitions of DL-LiteA KBs is shown below. Low complexity makes the proposed approach appealing for partitioning a large-scale KB into smaller, independent sub-KBs with the desired local feature. Theorem 1. For a KB K = (T , A), a SCSQA-local partition of K can be com- puted in O(|T |2 ) w.r.t. |T | and in O(|A|) w.r.t. |A|. 3 Conjunctive query partitioning and evaluation Here, we discuss CQ partitioning and evaluation, containing the formalization of CQ partitions, the conditions of detecting CQ partitions and the concrete way of computing and evaluating CQ partitions. Formalizing CQ partitions. CQ partitions are defined in the definition below. Definition 2. For a CQ Q, we say the set Q = ∪ni=1 {qi } of CQs is a partition of Q if (1) Bd(Q) = ]ni=1 Bd(qi ) and (2) for each 1 ≤ i ≤ n, Hd(qi ) consists of all the variables in Bd(qi ) that also occur either in Hd(Q) or ∪nk=1,k6=i Bd(qk ). We say that Q is a SQ-partition of Q iff all the queries in Q are SQs. The concrete way of merging the certain answers of the sub-queries and the correctness are shown in the following definition and lemma, respectively. Definition 3. For a partition Q = ∪ni=1 {qi } of a CQ Q and a partition S of a DL-LiteA KB K, we define: Ans(Q, Q, S) = {x[x1 /u1 , · · · , xn /un ]|(u1 , · · · , un ) ∈ Λ1 × · · · × Λn } where x = Hd(Q), for each 1 ≤ i ≤ n, xi = Hd(qi ) and Λi = ∪K0 ∈S Ans(qi , K0 ), and for each 1 ≤ i, j ≤ n, 1 ≤ k ≤ |ui | and 1 ≤ l ≤ |uj |, if xi [k] = xj [l] then ui [k] = uj [l]. Lemma 3. Ans(Q, Q, S) ⊆ Ans(Q, K) holds. By the semantics of CQs, we know that distinguished (dist.) variables and non-dist. variables of CQs [1] are interpreted in different ways. Based on this, we define the concept of SQ-reducibility and show that SQ-reducible CQs can be answered in the descried way, i.e., by first partitioning CQs into SQs and then answering the smaller SQs over SCSQA-local partitions of DL-LiteA KBs. Definition 4. For a CQ Q, we say Q is SQ-reducible iff it has a SQ-partition Q such that Hd(q) ⊆ Q for each q ∈ Q. Title Suppressed Due to Excessive Length 5 Lemma 4. For a CQ Q, if Q is SQ-reducible then there exists a SQ-partition Q of Q such that for each satisfiable DL-LiteA KB K and each SQSQA-local partition S of K, Ans(Q, K) = Ans(Q, Q, S) holds. Detecting SQ-reducible CQs. Not all CQs are SQ-reducible. This means that there exist CQs Q and DL-LiteA KBs K such that evaluating any SQ-partition of Q over any SCSQA-local partition of K cannot obtain all the certain answers of Q over K. Thus in order to evaluate as many CQs in the desired way as possible, we need to provide a iff condition to decide whether a CQ is SQ-reducible. Theorem 2. A CQ Q is SQ-reducible iff there do not exist three query atoms P (?x, ?y), α and α0 in Q such that ?x and ?y are non-dist. variables of Q, α contains ?x but not ?y, and α0 contains ?y but not ?x. Theorem 3. Detecting whether a CQ Q is SQ-reducible can be done in O(|Q|). Next, we present the concrete ways of partitioning and evaluating SQ-reducible CQs and non-SQ-reducible CQs. Partitioning and evaluating SQ-reducible CQs. A SQ-reducible CQ may have many SQ-partitions, so we need to first detect the SQ-partitions that can capture all the certain answers of CQs and then provide the procedure of comput- ing the desired SQ-partitions and the complexity of computing such partitions. Theorem 4. For a SQ-reducible CQ Q, let Q be any SQ-partition of Q such that Hd(q) ⊆ Hd(Q) for each q ∈ Q. Then Ans(Q, K) = Ans(Q, Q, S) holds, for any satisfiable DL-LiteA KB K and any SCSQA-local partition S of K. Procedure 3. For a SQ-reducible CQ Q, a SQ-partition Q of Q satisfying that Hd(q) ⊆ Hd(Q) for each q ∈ Q can be obtained by the following steps 1–2. Step 1. Let A = Bd(Q) and U = ∅. Compute a partition U of Bd(Q) by the steps a-b. a. For each non-dist. variable x of Q, compute the set Ux consisting of all the atoms in A that contain x, and let U = U ∪ {Ux } and A = A − Ux . Then for every two different non-dist. variables x and y of Q that occur in a same atom in Q, replace Ux and Uy in U with Ux ∪ Uy ; b. For each dist. variable or individual x of Q, compute the set Ux consisting of all the atoms in A that contain x, and let U = U ∪ {Ux } and A = A − Ux . Step 2. For U ∈ U, generate a CQ q such that Bd(q) = U and Hd(q) consists of all the variables occurring in U that also occur in Hd(Q) or Bd(Q)−U, add q to Q. Theorem 5. For a SQ-reducible CQ Q, a partition Q of Q satisfying hd(q) ⊆ hd(Q) for each q ∈ Q can be obtained in O(|Q|). Partitioning and evaluating non-SQ-reducible CQs. If a CQ is not SQ- reducible, then non-SQ-partitions need to be evaluated to obtain all the certain answers of this CQ. By the semantics of CQs, we find that if a certain answer of a CQ Q over a KB is obtained by binding some non-dist. variables NV to anonymous elements, then in the scenario of KB and query partitioning, this certain answer can solely be obtained by evaluating a partition of Q in which the sub-queries do not contain the variables in NV as dist. variables. The definition below illustrates the partitions that can capture such certain answers of Q. 6 Gu et al. Definition 5. For a CQ Q and set NV of non-dist. variables of Q, we use Q|NV SQ to denote a partition of Q satisfying that (1) each variable in NV solely occurs in one query in Q, and (2) for each non-simple query q ∈ Q and for every two different atoms α and α0 in q, there exists a sequence β1 , · · · , βn of atoms in q such that β1 = α, βn = α0 and βi and βi+1 share a variable in NV for 1 ≤ i ≤ n−1. The concrete way of answering non-SQ-reducible CQs and the procedure of computing the desired partitions as well as the complexity are shown below. Theorem 6. Given a non-SQ-reducible CQ Q, for any satisfiable DL-LiteA KB K and any SCSQA-local partition S of K, Ans(Q, K) = ∪NV’∈2NV Ans(Q, Q|NV’ SQ , S) holds where NV is the set consisting of all the non-dist. variables of Q. Procedure 5. For a non-SQ-reducible CQ Q and set NV of some non-dist. variables of Q, the partition Q|NV SQ , can be obtained by the following two steps: Step 1. Let A = Bd(Q) and U = ∅. Compute a partition U of Bd(Q) by the steps a–c. a. Compute an undirected graph G with node set A. For each (α, α0 ) ∈ (A)2 , there exists an edge between α and α0 iff they share a variable in NV; b. For each connected component l of G, generate an atom set U consisting of all the nodes in l, and set U = U ∪ {U} and A = A − U; c. For each variable or individual x occurring in A, compute a set U consisting of all the atoms in A that contain x, and set U = U ∪ {U} and A = A − U. Step 2. For each U ∈ U, generate a CQ q so that Bd(q) = U and Hd(q) consists of all the variables in U that also occur in Hd(Q) or Bd(Q) − U, and add q to Q. Theorem 7. The complexity of computing the partition Q|NV 2 SQ of Q is O(|Q| ). 4 Optimizing the procedure of query partition evaluation Although query partitioning can reduce the final number of queries needed to be answered over the ABox of a KB significantly, a large amount of intermediate results generated by answering sub-queries may also slow down the overall query answering procedure. Thus we further provide an optimization strategy to reduce the intermediate results as many and as early as possible by transferring values between sub-queries, as shown in the following example and proposition. Example 1. Let S be a partition of a KB. Consider the partition Q of a CQ Q: Q : R(a, ?x) ∧ P (?x, ?y) ∧ S(?y, ?z) → q(?x, ?y, ?z) Q = {q1 : R(a, ?x) ∧ P (?x, ?y) → q(?x, ?y), q2 : S(?y, ?z) → q(?y, ?z)} After answering q1 over the KBs in S, suppose we obtain the answer set Λ = ∪3i=1{(ai , bi )}. If we transfer the bindings of ?y to q2 before it is evaluated, i.e., replacing answering q2 with answering the CQs ∪3i=1{S(bi , ?z) → q(bi , ?z)}, then we can avoid computing the answers of q2 which do not bind ?y to b1 , b2 or b3 . Proposition 2. For a partition Q = ∪ni=1 {qi } of a CQ Q and a partition S of a satisfiable DL-LiteA KB, the following equation holds: Ans(Q, Q, S) = {x[x1 /u1 , · · · , xn /un ]|(u1 , · · · , un ) ∈ Λ1 |B1 × · · · × Λn |Bn } where x = Hd(Q), for 1 ≤ i ≤ n, xi = Hd(qi ), B1 = ∅ and for 1 < k ≤ n: Bk = {[x1 /u1 , · · · , xk−1 /uk−1 ]|xk |(u1 , · · · , uk−1 ) ∈ Λ1 |B1 × · · · × Λk−1 |Bk−1 } where Λ1 |B1 = ∪K∈S Ans(q1 , K) and for 1 < j ≤ n, Λj |Bj = ∪K∈S ∪ξ∈Bj Ans(qj ξ, K). Title Suppressed Due to Excessive Length 7 In Proposition 2, Bk denotes the set of the bindings of some variables of qk obtained from the certain answers of the sub-queries evaluated before qk . By observing the CQ rewriting procedure in DL-LiteA , we find that the transferred variable bindings do not affect the query rewriting procedure. This means that when answering a sub-query q over a sub-KB K = (T , A) with a set B of variable bindings, the CQs obtained by first transferring the bindings in B to q and then rewriting the resultant queries over T are equivalent to the CQs obtained by first rewriting q over T and then transferring the bindings in B to the rewritten queries. Thus we actually need one time of query rewriting rather than |B|. Hence the procedure of evaluating the partitions of queries can be further optimized. Proposition 3. For a CQ q, satisfiable DL-LiteA KB K = (T , A) and set B of functions ξ that maps some variables of q to names, then we can get that: ∪ξ∈B PerfectRef(qξ, T ) = ∪ξ∈B {q 0 ξ|q 0 ∈ PerfectRef(q, T )} ∪ξ∈B Ans(qξ, K) = ∪q0 ∈PerfectRef(q,T ) ∪ξ∈B Ans(q 0 ξ, (∅, A)) where PerfectRef denotes one classical query rewriting algorithm in DL-LiteA . Experiments presented in the subsequent section demonstrate that the value transferring strategy can significantly improve the query answering performance. 5 The evaluation In order to demonstrate the efficiency and rationality of our approach and pro- posal, we have implemented a prototype system, where multi-thread techniques are adopted to check the satisfiability of and answer a single query over the sub- KBs in a KB partition in a parallel way. We run the system on two web-scale, real-world datasets, i.e., DBpedia 200 dataset extended with DBpedia ontology and BTC 2012 dataset. Next, we present the main results of the experiments. SCSQA-local partition computation. After materializing the name equiva- lence relations, two DL-LiteA KBs Kdbp and Kbtc are constructed, respectively with 233,227 axioms and 0.2 billion individual assertions and 339,519 axioms and 1.04 billion individual assertions. We randomly choose two numbers, 28 million and 52 million, to respectively limit the size of the sub-ABoxes in KB partitions. Then, a SCSQA-local partition Sdbp of Kdbp with 14 sub-KBs and a SCSQA-local partition Sbtc of Kbtc with 30 sub-KBs are computed by Procedure 2. Satisfiability checking. Without KB partition, checking the satisfiability of Kdbp totally cost 6 minutes, and for Kbtc , after 6 hours an out-of-memory error was reported caused by the failure in completing generating the queries needed to be evaluated over its ABox. However, facilitated by KB partitioning equipped with parallelization, satisfiability checking of Kdbp and Kbtc has been success- fully done in 0.07 and 0.15 minutes, respectively. The testing results indicate that KB partitioning equipped with parallelization can significantly improve the performance of checking the satisfiability of web-scale DL-LiteA KBs. Query answering checking. In order to detect the effect of KB and query par- titioning as well as sub-query value transfer on the performance of CQ answering, we design 14 CQs (8 SQs and 6 non-SQs) for Kdbp and Kbtc , respectively. 8 Gu et al. The SQ answering results indicate that no matter a SQ has a large or small number of rewritten queries over the original KB or it can be evaluated efficiently or inefficiently, in the situation of KB partitioning equipped with parallelization, this SQ can be answered in an extremely efficient way. On the other hand, the results of evaluating the tested non-SQ queries indicate that: (1) when a CQ cannot be evaluated over the original KB efficiently, query partitioning combined with KB partitioning and sub-query value transfer can make this query to be evaluated in a very efficient way; (2) no matter with or without KB partitioning, query partitioning can reduce the number of rewritten CQs significantly; (3) even when query partitioning cannot reduce the number of the rewritten queries, evaluating query partitions over KB partitions with sub-query value transfer can speed up the procedure; and (4) for the queries already efficiently answerable over the original KBs, our approach may not further improve the performance. Comparing with related systems. To demonstrate the rationality and effi- ciency of our proposal and approach, we have compared our system with the two state-of-the-art, centralized triple stores, Jena-TDB and Virtuoso which support RDFS reasoning. The comparison results can be summarized as: (1) even with KB and query partitioning, our approach is not as efficient as the two RDF stores which evaluate queries directly without taking reasoning into considera- tion; (2) when reasoning is considered, the performance of the two triple stores drops significantly, and they do not perform as well as our approach; and (3) query answering without considering reasoning can miss many certain answers. 6 Comparison with related work Compared with the works on querying LOD [4–9], our approach pursues both expressivity and scalability, capturing more schema knowledge and handling more expressive queries with the support of DL-LiteA techniques. For scala- bility, we provide a divide-and-conquer reasoning and query answering approach for DL-LiteA so that the massive knowledge in LOD can be managed by parallel and distribution techniques. Compared with the works on DL KB modularity [12–14], we study a tractable language DL-LiteA , and concentrate on satisfiability checking and CQ answering rather than entailment checking. Moreover, we provide a way of ABox partition- ing with linear data complexity where reasoning is not necessary. Compared with the works on RDF graph partitioning [15–17], we consider RDF graphs as DL-LiteA KBs so that constraints in these graphs can be captured. Compared with the works on query rewriting optimization [18–21], our ap- proach features query rewriting optimization in DL-LiteA from the distinctive perspective of query and KB partitioning, and the experiments on real-world datasets show that the number of rewritten queries can be significantly reduced. Compared with the works [22,23] on fast satisfiability checking, we propose to improve the performance in DL-LiteA from the perspective of KB partitioning. Experiments on web-scale datasets demonstrate the efficiency of our approach. Acknowledgments. This work has been supported by the National Key Research and Development Program of China under grants 2017YFC1700300 and 2017YFB1002300. Title Suppressed Due to Excessive Length 9 References 1. Gu, Z., Zhang, S., Cao, C,: Reasoning and Querying Web-scale Open Data based on DL-LiteA in a Divide-and-Conquer Way. J. Web Semant. 55: 122-144 (2019). 2. O’Riain, S., Curry, E., Harth, A.: XBRL and open data for global financial ecosys- tems: A linked data approach. Int. J. Accounting Inf. Systems. 13 (2):141-162 (2012). 3. Bauer, F., Kaltenböck, M.: Linked open data: The essentials, Edition mono/monochrom, Vienna, Austria, 2011. 4. Bishop, B., Kiryakov, A., Ognyanov, D., Peikov, I., Tashev, Z., Velkov, R.: Fact- Forge: A fast track to the web of Data. Semantic Web, 2 (2) (2011). 5. Schwarte, A., Haase, P., Hose, K., Schenkel, R., Schmidt, M.: FedX: Optimization Techniques for Federated Query Processing on Linked Data. In: ISWC, 2011. 6. Umbrich, J., Hogan, A., Polleres, A., Decker, S.: Link Traversal Querying for a diverse Web of Data. Semantic Web, 6 (6) (2012) 585-624. 7. Montoya, G., Skaf-Molli, H., Molli, P., Vidal, M. E.: Federated SPARQL Queries Processing with Replicated Fragments. In: ISWC, 2015. 8. Hartig, O., Freytag, J. C.: Foundations of traversal based query execution over Linked Data. In: ACM Hypertext, 2012. 9. Sabri, M. M.: A Hybrid Framework for Online Execution of Linked Data Queries. In: WWW, 2015. 10. Poggi, A., Lembo, L., Calvanese, D., De Giacomo, G., Lenzerini, M., Rosati, R.: Linking Data to Ontologies. J. Data Semantics. 10 (2008) 133-173. 11. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. Autom. Reasoning. 39 (3) (2007). 12. Grau, B. C., Horrocks, I., Kazakov, Y., Sattler, U.: Modular reuse of ontologies: theory and practice. J. Artif. Intell. Res. 31 (2008). 13. Botoeva, E., Konev, B., Lutz, C., Ryzhikov, V., Wolter, F., Zakharyaschev, M.: Inseparability and conservative extensions of description logic ontologies: A survey. In: Reasoning Wb, 2016. 14. Xu, J., Shironoshita, P., Visser, U., John, N., Kabuka, M.: Module Extraction for Efficient Object Queries over Ontologies with Large ABoxes. Artif Intell Appl. 2 (1) (2015). 15. Harbi, R., Abdelaziz, I., Kalnis, P., Mamoulis, N.: Evaluating SPARQL queries on massive RDF datasets. Proceedings of the VLDB Endowment, 8 (12) (2015). 16. Potter, A., Motik, B., Horrocks, I.: Querying Distributed RDF Graphs: The Effects of Partitioning. In: SSWS@ISWC, 2014. 17. Yang, S., Yan, X., Zong, B., Khan, A.: Towards effective partition management for large graphs. In: SIGMOD Conference, 2012. 18. Rosati, R., Almatelli, A.: Improving Query Answering over DL-Lite Ontologies. In: KR, 2010. 19. Venetis, T., Stoilos, G., Stamou, G. B.: Query extensions and incremental query rewriting for OWL 2 QL ontologies. J. Data Semantics, 3 (1) (2014). 20. Bursztyn, D., Goasdoué, F., Manolescu, I.: Teaching an RDBMS about ontological constraints. Proceedings of the VLDB Endowment, 9 (12) (2016). 21. Gottlob, G., Orsi, G., Pieris, A.: Query rewriting and optimization for ontological databases. ACM Trans. Database Syst. 39 (3) (2014). 22. Paulheim, H., Stuckenschmidt, H.: Fast Approximate A-Box Consistency Checking Using Machine Learning. In: ESWC, 2016. 23. Fokoue, A., Kershenbaum, A., Ma, L., Schonberg, E., Srinivas, K.: The Summary Abox: Cutting Ontologies Down to Size. In: ISWC, 2006.