Classifying the Complexity of Ontology-Mediated Queries in EL: From Atomic Queries to Conjunctive Queries (Extended Abstract)? Carsten Lutz and Leif Sabellek Department of Computer Science, University of Bremen, Germany Substantial research efforts have been invested into understanding the com- putational properties and expressive power of ontology-mediated queries (OMQs). Two important topics are the data complexity of OMQ evaluation [10, 12, 17, 5, 1], where only the data is considered to be the input while the OMQ is fixed, and the rewritability of OMQs into more standard database query languages such as SQL (which in this context is often equated with first-order logic) and Datalog [7, 3, 1, 11, 9]. Both subjects are thoroughly intertwined. In particular, rewritability into first-order logic (FO) is closely related to AC0 data complexity while rewritability into Datalog is closely related to PTime data complexity. Traditionally, data complexity and rewritability have mostly been studied on the level of OMQ languages (L, Q) defined by an ontology language L such as EL and a query language Q such as conjunctive queries (CQs). A more fine-grained analysis has been initiated in [16, 1], the aim being to understand the exact complexity and rewritability status of every OMQ from relevant OMQ languages (L, Q), see also [18, 15]. For expressive DLs, this turns out to be closely related to the complexity classification of constraint satisfaction problems (CSPs) with a fixed template [8]. Very important progress has recently been made in this area with the proof that CSPs enjoy a dichotomy between PTime and NP [4, 19]. Via the results in [1], this implies that OMQ evaluation in languages such as (ALCI, UCQ) enjoys a dichotomy between PTime and coNP. However, the complexity of CSPs and OMQs is still far from being fully understood. For example, neither in CSP nor in expressive OMQ languages it is known whether there is a dichotomy between NL and PTime, and whether containment in NL coincides with rewritability into linear Datalog, a well-known fragment of Datalog that allows only linear recursion. It has been conjectured that both of this is the case. We report on [14], which carries out a fine-grained analysis of the data com- plexity and rewritability of OMQs from (EL, CQ), achieving a complete classi- fication. This extends the results of [13] where the same was achieved for the OMQ language (EL, AQ) that only admits atomic queries, that is, conjunctive queries of the very simple form A(x). For every Q ∈ (EL, CQ), let eval(Q) be ? Copyright c 2020 for this paper by its authors. Use permitted under Creative Com- mons License Attribution 4.0 International (CC BY 4.0). 2 C. Lutz and L. Sabellek the problem to decide, given an ABox A and a tuple of individuals a, whether a is a certain answer to Q on A. Our main result is the following trichotomy. Theorem 1. For every OMQ Q ∈ (EL, CQ), one of the following is true: 1. eval(Q) is in AC0 and Q is rewritable into FO (actually into a UCQ with equality atoms); 2. eval(Q) is NL-complete and Q is rewritable into linear Datalog; 3. eval(Q) is PTime-complete and Q is rewritable into monadic Datalog. Consequently, eval(Q) is in NL if and only if Q is rewritable into linear Datalog unless NL = PTime. An important aspect of the proof of Theorem 1 are semantic characterizations of the three cases, which are also interesting in their own right. The character- izations are easiest to state for AQs. For Q ∈ (EL, AQ), let MQ denote the set of all tree shaped ABoxes whose root is a certain answer to Q and that are minimal with that property regarding set inclusion. We say that Q has bounded depth if there is a constant bound on the depth of the trees in MQ and Q has bounded pathwidth if there is a constant bound on the pathwidth of the trees in MQ ; recall that pathwidth measures the similarity of a structure to a path with lower number meaning more similar. For CQs, the definitions are similar, but for defining MQ we consider ABoxes of a certain pseudo-tree shape instead of tree shaped ones. While it is known that bounded depth of an OMQ from (EL, CQ) implies FO-rewritability [2], we prove that (1) if Q does not have bounded depth, then eval(Q) is NL-hard; (2) if Q has bounded pathwidth, then Q is rewritabe into linear Datalog (and thus, eval(Q) is in NL); (3) if Q does not have bounded pathwidth, then eval(Q) is PTime-hard and Q is not rewritable into linear Datalog. The proofs are technically subtle, especially the hardness proofs (1) and (3) for Boolean CQs. Furthermore, the proof of (2) yields a way to construct linear Datalog rewritings if they exist, and the rewritings have smaller width (number of variables in rule heads) than the linear Datalog rewritings given in [13]. We also show that all relevant decision problems, such as whether a given OMQ is rewritable into linear Datalog and whether it is PTime-hard, are ExpTime- complete. The upper bounds are proved by reduction to the emptiness problem of tree automata. There are several natural directions into which our results can be general- ized. One is to consider the OMQ language (EL, UCQ). We conjecture that this generalization is not difficult. In fact, we only refrained from covering it since it makes all proofs more technical and distracts from the main ideas. An im- portant direction for future work is to extend our analysis to ELI, that is, to add inverse roles. Even the case of (ELI, AQ) appears to be challenging. In fact, we observe that a complexity classification of (ELI, AQ) is equivalent to a com- plexity classification of all CSPs that have tree obstructions, which is an open Classifying the Complexity of Ontology-Mediated Queries in EL 3 problem. It is easy to see that (ELI, AQ) contains OMQs that are L-complete and we conjecture that AC0 , L, NL, and PTime are the only complexities that occur. We also conjecture that L-completeness coincides with rewritability into symmetric Datalog [6]. Acknowledgments. This research was supported by ERC consolidator grant 647289 CODA. References 1. Bienvenu, M., ten Cate, B., Lutz, C., Wolter, F.: Ontology-based data access: A study through disjunctive datalog, CSP, and MMSNP. ACM Trans. Database Syst. 39(4), 33:1–33:44 (2014) 2. Bienvenu, M., Hansen, P., Lutz, C., Wolter, F.: First order-rewritability and con- tainment of conjunctive queries in Horn description logics. In: Proc. of IJCAI. pp. 965–971. IJCAI/AAAI Press (2016) 3. Bienvenu, M., Lutz, C., Wolter, F.: First order-rewritability of atomic queries in horn description logics. In: Proc. of IJCAI. pp. 754–760. IJCAI/AAAI (2013) 4. Bulatov, A.A.: A dichotomy theorem for nonuniform csps. In: Proc. of FOCS. pp. 319–330 (2017) 5. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Data com- plexity of query answering in description logics. Artif. Intell. 195, 335–360 (2013) 6. Egri, L., Larose, B., Tesson, P.: Symmetric datalog and constraint satisfaction prob- lems in LogSpace. Electronic Colloquium on Computational Complexity (ECCC) 14(024), 1 (2007) 7. Eiter, T., Ortiz, M., Simkus, M., Tran, T., Xiao, G.: Query rewriting for Horn- SHIQ plus rules. In: Proc. of AAAI. AAAI Press (2012) 8. Feder, T., Vardi, M.Y.: The computational structure of monotone monadic SNP and constraint satisfaction: A study through datalog and group theory. SIAM J. Comput. 28(1), 57–104 (1998) 9. Feier, C., Kuusisto, A., Lutz, C.: Rewritability in monadic disjunctive datalog, MMSNP, and expressive description logics. Logical Methods in Computer Science (2019) 10. Hustadt, U., Motik, B., Sattler, U.: Data complexity of reasoning in very expressive description logics. In: Proc. of IJCAI. pp. 466–471. Professional Book Center (2005) 11. Kaminski, M., Nenov, Y., Grau, B.C.: Datalog rewritability of disjunctive datalog programs and its applications to ontology reasoning. In: Proc. of AAAI. pp. 1077– 1083. AAAI Press (2014) 12. Krisnadhi, A., Lutz, C.: Data complexity in the EL family of description logics. In: Proc. of LPAR. LNAI, vol. 4790, pp. 333–347. Springer (2007) 13. Lutz, C., Sabellek, L.: Ontology-mediated querying with the description logic el: Trichotomy and linear datalog rewritability. In: Proc. of IJCAI. pp. 1181–1187. ijcai.org (2017) 14. Lutz, C., Sabellek, L.: A complete classification of the complexity and rewritability of ontology-mediated queries based on the description logic EL. submitted to Journal of Artificial Intelligence Research (JAIR) (2019), http://arxiv.org/abs/1904.12533 15. Lutz, C., Seylan, I., Wolter, F.: Ontology-mediated queries with closed predicates. In: Proc. of IJCAI. pp. 3120–3126. AAAI Press (2015) 4 C. Lutz and L. Sabellek 16. Lutz, C., Wolter, F.: Non-uniform data complexity of query answering in descrip- tion logics. In: Proc. of KR. AAAI Press (2012) 17. Rosati, R.: The limits of querying ontologies. In: Proc. of ICDT. LNCS, vol. 4353, pp. 164–178. Springer (2007) 18. Zakharyaschev, M., Kikot, S., Gerasimova, O.: Towards a data complexity classifi- cation of ontology-mediated queries with covering. In: Proc. of DL. CEUR Work- shop Proceedings, vol. 2211. CEUR-WS.org (2018) 19. Zhuk, D.: A proof of CSP dichotomy conjecture. In: Proc. of FOCS. pp. 331–342 (2017)