Characterising Fixed Parameter Tractability of Query Evaluation Over Guarded TGDs (Extended Abstract)? Cristina Feier University of Bremen We consider the usual scenario of ontology mediated querying where queries are posed to a database whose knowledge is enriched by means of an ontology. A tuple (L, Q), where L is an ontology language and Q is a query language, is referred to as an OMQ language. One thoroughly explored fragment of FOL as a basis for ontology specification languages is that of tuple generating depen- dencies (TGDs). A tgd is a rule (logical implication) having as body and head conjunctions of atoms, where some variables occurring in head atoms might be existentially quantified (all other variables are universally quantified). As such, it potentially allows the derivation of atoms over fresh individuals (individuals not mentioned in the database). Answering (even atomic) queries with respect to sets of tgds is undecidable [7]. However, there has been lots of work on identi- fying decidable fragments [7, 1, 9]. A prominent such fragment is that of Guarded TGDs (GTGD) [7]: a tgd is guarded if all universally quantified variables occur as terms of some body atom, called guard. Many Description Logic languages like ELH or ELHI can be expressed using GTGDs. While query answering with respect to GTGDs is decidable, the combined complexity of the problem is high: exptime-complete for bounded arity schemas, and 2ExpTime-complete in general. At the same time, while the fragment is tractable w.r.t. data complexity [6], the coefficient of the polynomial depends on the size of the ontology, which might be relatively high. By fixing the set of tgds, the combined complexity drops to NP for evaluating conjunctive queries (CQs), and to PTime for evaluating atomic queries and CQs of bounded treewdith [8] which is similar to the complexity of query evaluation over databases [15]. A nat- ural question is when can OMQs from (GTGD, UCQ) be evaluated efficiently? The above question has been partially answered in [2]. There, the complexity of evaluating OMQs over bounded arity schemas is studied in a parameterized setting, with the parameter being the size of the OMQ. Under the assumption that FPT 6= W[1], it is shown that a recursively enumerable (r.e.) class of OMQs over bounded arity schemas can is fixed parameter tracatable (fpt) iff it has bounded semantic treewidth, i.e. there exists a k such that every OMQ from the class is equivalent to another OMQ from (GTGD, UCQ) whose UCQ has syntactic treewidth k [4]. The work generalizes previous results concerning ? “Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).” efficiency of evaluation of OMQs over bounded arity schemas based on the DL languages ELH and ELHI from [3]. The characterization from [2] is similar to the well-known result of Grohe [13], concerning the parameterized complexity of CQ evaluation over databases: the evaluation of a r.e. class of CQs is fpt iff the class has bounded semantic treewidth, i.e. there exists a class of equivalent CQs of bounded treewidth. The similarity is not coincidental, as the results build from [2] on the results of Grohe in a non-trivial way. In fact, the lower bound proof uses a central construction from [13], but has to employ sophisticated techniques to adapt this to OMQs. The main research question we address in this paper is the following: when is it possible to efficiently evaluate OMQs from (GTGD, UCQ) in the case where there is no restriction concerning schema arity? We consider again a parame- terized setting, where the parameter is the size of the OMQ. Given a r.e. class of OMQ Q from (GTGD, UCQ), we denote with p-OMQ(Q) the parameterized problem of evaluating OMQs from Q. As our first main result shows, in this case another structural measure for OMQs/UCQs plays a central role: submodular width. Similarly to the definition of semantic treewidth for OMQs, it is possible to define a notion of semantic submodular width of an OMQ. We obtain the following: Main Result 1. Let Q be a r. e. class of OMQs from (GTGD, UCQ). Assum- ing the Exponential Time Hypothesis, p-OMQ(Q) is fixed-parameter tractable iff Q has bounded semantic submodular width. To prove the result, we use the fact that every OMQ from (GTGD, UCQ) can be rewritten into an OMQ from (GDLog, UCQ) [2], where GDLog stands for Guarded Datalog, the restriction of GTGD to rules with only universally quantified variables. For OMQs from (GDLog, UCQ), we construct equivalent OMQs called covers which witness bounded semantic submodular width. Covers are based on so-called characteristic databases, which are databases that entail an OMQ and which are sufficiently minimal with respect to the homomorphism order, in a very specific sense. Typically, in a database setting, the database induced by a CQ (or its core) can be seen as a canonical database which entails the query and on which to base further constructions. However, in the case of OMQs which pose restric- tions on the database schema this is no longer possible: a CQ might contain symbols which are not allowed to occur in a database. In fact, this is a typical usage of ontologies: to enrich the database schema with new terminology. As such, we identify other representative databases for OMQs. Based on these new concepts, for OMQs from (GDLog, UCQ) it is possible to obtain an alternative characterization for fpt evaluation based on syntactic measures: Main Result 2. For Q a r. e. class of OMQs from (GDLog, UCQ), let Qc and DQ be the classes of covers and characteristic databases for OMQs from Q, respectively. Under the Exponential Time Hypothesis, the following statements are equivalent: 1. p-OMQ(Q) is fixed-parameter tractable; 2. Qc has bounded submodular width; 3. DQ has bounded submodular width. The hardness result for the above characterization exploits the seminal re- sult of Marx [14] concerning fpt evaluation of uniform CSPs closed under under- lying hypergraphs. For a r.e. class of structures A, the parameterized uniform CSP problem p-CSP(A, ) asks whether there exists a homomorphism from some structure A in A (with |A| being the parameter) to an arbitrary relational struc- ture. The result of Marx has been lifted to arbitrary uniform CSPs in [11]: Theorem 1 (Theorem 1, [11]). Let C be a r.e. class of structures. Assuming the Exponential Time Hypothesis, p-CSP(C, ) is fixed-parameter tractable iff C has bounded semantic submodular width. The problem of evaluating uniform CSPs can be seen as an alternative pre- sentation of the problem of evaluating a r.e. class of Boolean CQs over databases. In fact, Grohe’s characterization for fpt evaluation of r. e. classes of CQs in the bounded arity case has been achieved via a uniform CSP detour [13]. To exploit the result from [11] concerning evaluation of uniform CSPs, we introduce an fpt- reduction from parameterized uniform CSP evaluation to parameterized OMQ evaluation which preserves semantic submodular width. Main Result 3. For Q an OMQ from (GDLog, UCQ), there exists an fpt- reduction from p-CSP(DQ , ) to p-OMQ({Q}). The reduction is important also as a stand-alone result as it can be used as a black-box tool to port results from the uniform CSP/database realm to the OMQ one. To further showcase this, we revisit the problem of evaluating r.e. classes of OMQs over bounded arity schemas, previously addresed in [2] and es- tablish analogues of Main Result 1 and Main Result 2. In this case, submodular width is replaced with treewidth and the Exponential Time Hypothesis with the assumption that FPT 6= W[1]. This is due to the fact that the set of char- acteristic databases of an OMQ has the same treewidth as its cover. As such, we retrieve the semantic characterization from [2] without going into details of the construction employed by Grohe in [13], and also establish an alternative syntactic characterization for fpt evaluation of OMQs from (GDLog, UCQ) over bounded arity schemas using the newly introduced notions of cover and charac- teristic databases. So far we have only considered Boolean OMQs, as the corresponding results for CQ evaluation in the unbounded arity case have also only been established in the Boolean case. As future work, we plan to extend our work to the non- Boolean case. Many results from the database world concerning efficiency of performing tasks like counting [12], enumeration [5, 10], and so on, use structural measures on the queries similar to the ones involved in the characterizations for evaluation. Thus, by generalizing our constructs to the non-Boolean case, depending on which structural measures are preserved when transitioning from sets of characteristic databases to covers of OMQs, we might be able to lift such results to the OMQ world. References 1. Jean-François Baget, Michel Leclère, Marie-Laure Mugnier, and Eric Salvat. On rules with existential variables: Walking the decidability line. Artif. Intell., 175(9- 10):1620–1654, 2011. 2. Pablo Barceló, Victor Dalmau, Cristina Feier, Carsten Lutz, and Andreas Pieris. The limits of efficiency for open- and closed-world query evaluation under guarded tgds. In PODS, pages 259–270, 2020. 3. Pablo Barceló, Cristina Feier, Carsten Lutz, and Andreas Pieris. When is ontology- mediated querying efficient? In LICS, pages 1–13, 2019. 4. Pablo Barceló, Diego Figueira, Georg Gottlob, and Andreas Pieris. Seman- tic optimization of conjunctive queries. 2019. Under submission, available at http://homepages.inf.ed.ac.uk/apieris/BFGP.pdf. 5. Christoph Berkholz and Nicole Schweikardt. Constant delay enumeration with fpt-preprocessing for conjunctive queries of bounded submodular width. In MFCS 2019, volume 138, pages 58:1–58:15, 2019. 6. Andrea Calı̀, Georg Gottlob, and Thomas Lukasiewicz. A general datalog-based framework for tractable query answering over ontologies. Journal of Web Seman- tics, 14:57 – 83, 2012. 7. Andrea Calı̀, Georg Gottlob, and Michael Kifer. Taming the infinite chase: Query answering under expressive relational constraints. J. Artif. Intell. Res., 48:115–174, 2013. 8. Andrea Calı̀, Georg Gottlob, and Thomas Lukasiewicz. Datalog±: a unified ap- proach to ontologies and integrity constraints. In ICDT, pages 14–30, 2009. 9. Andrea Calı̀, Georg Gottlob, and Andreas Pieris. Towards more expressive ontology languages: The query answering problem. Artif. Intell., 193:87–128, 2012. 10. Nofar Carmeli and Markus Kröll. Enumeration Complexity of Conjunctive Queries with Functional Dependencies. In (ICDT 2018), volume 98, pages 11:1–11:17, 2018. 11. Hubie Chen, Georg Gottlob, Matthias Lanzinger, and Reinhard Pichler. Semantic width and the fixed-parameter tractability of constraint satisfaction problems. In Christian Bessiere, editor, IJCAI, pages 1726–1733, 2020. 12. Hubie Chen and Stefan Mengel. A Trichotomy in the Complexity of Counting Answers to Conjunctive Queries. In (ICDT 2015), volume 31, pages 110–126, 2015. 13. Martin Grohe. The complexity of homomorphism and constraint satisfaction prob- lems seen from the other side. J. ACM, 54(1):1:1–1:24, 2007. 14. Dániel Marx. Tractable hypergraph properties for constraint satisfaction and con- junctive queries. In STOC, pages 735–744, 2010. 15. Mihalis Yannakakis. Algorithms for acyclic database schemes. In VLDB, pages 82–94, 1981.