Separating Counting from Non-Counting in Fragments of Two-Variable First-Order Logic (Extended Abstract) Louwe Kuijer1 , Tony Tan1 , Frank Wolter1 and Michael Zakharyaschev2 1 University of Liverpool, Ashton Street, Liverpool L69 3BX, UK 2 Birkbeck, University of London, Malet Street, London WC1E 7HX, UK Abstract We consider the problem of deciding whether two disjoint classes of models defined in a fragment of first-order logic (FO) with counting can be separated in the same fragment but without counting. This problem turns out to be hard. We show that separation for the two-variable fragment FO2 extended with counting quantifiers by means of plain FO2 is undecidable, and the same is true of the pair π’œβ„’π’žπ’ͺℐ𝒬/π’œβ„’π’žπ’ͺℐ of description logics. On the other hand, we establish 2ExpTime-completeness of the separation problem for the pairs π’œβ„’π’žπ’¬π‘’ /π’œβ„’π’ž 𝑒 and π’œβ„’π’žβ„π’¬π‘’ /π’œβ„’π’žβ„ 𝑒 . Keywords Separation, two-variable first-order logic, counting quantifiers, bisimulation. 1. Introduction Our concern in this paper is the following separation problems for a pair of languages 𝐿 and 𝐿𝑠 : 𝐿/𝐿𝑠 -separation: given two mutually exclusive 𝐿-formulas πœ™ and πœ“, decide whether there exists an 𝐿𝑠 -formula πœ’β€”a separator for πœ™ and πœ“β€”such that πœ™ |= πœ’ and πœ’ |= Β¬πœ“; Craig 𝐿/𝐿𝑠 -separation: decide whether the given 𝐿-formulas πœ™ and πœ“ have an 𝐿𝑠 -separator πœ’ that only contains common non-logical symbols (predicates and functions) of πœ™ and πœ“. To illustrate, πœ™ could be an ontology π’ͺ and πœ“ a concept 𝐢 that is not satisfiable with respect to π’ͺ, both given in an expressive language 𝐿. Then a separator ontology π’ͺβ€² in a weaker, easier to comprehend language 𝐿𝑠 potentially explains unsatisfiability as it inherits that π’ͺ |= π’ͺβ€² and 𝐢 is not satisfiable under π’ͺβ€² . Also, in the context of concept learning, πœ™ and πœ“ could represent positive and negative examples for a target concept 𝐢. Then any separator in an appropriately chosen language 𝐿𝑠 could represent the concept one aims to learn [1]. Separation generalises definability (aka membership), which asks whether a given 𝐿-formula (say, a datalog query) is equivalent to some 𝐿𝑠 -formula (say, a first-order query), and is regarded as one of the main approaches to understanding the expressive power of 𝐿 relative to 𝐿𝑠 . For instance, studying separability of regular languages by smaller language classes (e.g., a star-free language) has brought major insights into the respective formal languages, with some fundamental open problems in the area cast as separation questions [2]. Craig 𝐿/𝐿𝑠 -separation generalises classical Craig interpolation in 𝐿 [3] because a Craig 𝐿/𝐿- separator for πœ™ and πœ“ is a Craig interpolant for πœ™ β†’ Β¬πœ“ in 𝐿. Our aim in this paper is to investigate the decidability and complexity of the separation problem for certain fragments 𝐿 of C2 β€”that is, the two-variable first-order logic FO2 extended with counting quantifiersβ€”and the same fragments 𝐿𝑠 but without counting. DL 2024: 37th International Workshop on Description Logics, June 18–21, 2024, Bergen, Norway $ Louwe.Kuijer@liverpool.ac.uk (L. Kuijer); Tony.Tan@liverpool.ac.uk (T. Tan); wolter@liverpool.ac.uk (F. Wolter); m.zakharyaschev@bbk.ac.uk (M. Zakharyaschev)  0000-0001-6696-9023 (L. Kuijer); 0009-0005-8341-2004 (T. Tan); 0000-0002-4470-606X (F. Wolter); 0000-0002-2210-5183 (M. Zakharyaschev) Β© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR ceur-ws.org Workshop ISSN 1613-0073 Proceedings Example 1. Consider the following C2 -formulas: πœ™(π‘₯) = βˆƒ=1 𝑦 𝑅(π‘₯, 𝑦), πœ“(π‘₯) = βˆƒ=1 𝑦 𝑅(π‘₯, 𝑦) ∧ 𝐴(𝑦) ∧ βˆƒ=1 𝑦 𝑅(π‘₯, 𝑦) ∧ ¬𝐴(𝑦) . (οΈ€ )οΈ€ (οΈ€ )οΈ€ Then πœ™ |= Β¬πœ“ and the FO2 -formula πœ’(π‘₯) = βˆ€π‘¦ 𝑅(π‘₯, 𝑦) β†’ 𝐴(𝑦) ∨ βˆ€π‘¦ 𝑅(π‘₯, 𝑦) β†’ ¬𝐴(𝑦) is a (οΈ€ )οΈ€ (οΈ€ )οΈ€ separator for πœ™(π‘₯) and πœ“(π‘₯). For πœ“ β€² (π‘₯) = βˆƒ=2 𝑦 𝑅(π‘₯, 𝑦), we also have πœ™ |= Β¬πœ“ β€² , but πœ™(π‘₯) and πœ“ β€² (π‘₯) are not separable in FO2 . On the other hand, there is no Craig FO2 -separator for πœ™(π‘₯) and πœ“(π‘₯) as it would have to be defined using 𝑅 only, and so separate πœ™(π‘₯) and πœ“ β€² (π‘₯) as well. 2. Logics The logics we consider here can all be regarded as fragments of first-order logic, FO, and are defined as follows. Let 𝜎 be a signature containing unary and binary relation symbols and possibly constants. Fix a set var comprising two individual variables. Then FO2 (𝜎), the two-variable fragment of FO(𝜎), is defined as the set of formulas that are built from atoms 𝐴(π‘₯), 𝑅(π‘₯, 𝑦), and π‘₯ = 𝑦 with unary 𝐴 ∈ 𝜎, binary 𝑅 ∈ 𝜎, and π‘₯, 𝑦 ∈ var, using the Boolean connectives ∧ and Β¬ and quantifier βˆƒπ‘₯ with π‘₯ ∈ var (other Booleans and βˆ€π‘₯ are regarded as standard abbreviations); C2 (𝜎), the two-variable fragment of FO2 (𝜎) with counting, extends FO2 (𝜎) with the counting quantifiers βˆƒ<π‘˜ π‘₯, for π‘˜ ∈ N and π‘₯ ∈ var (other counting quantifiers βˆƒ=π‘˜ π‘₯, βˆƒβ‰€π‘˜ π‘₯, βˆƒβ‰₯π‘˜ π‘₯ can be introduced as abbreviations). In this paper, we are only interested in formulas πœ™(π‘₯) with one free variable π‘₯ ∈ var. The signature of πœ™ is the set sig(πœ™) of relation and constant symbols occurring in πœ™. FO(𝜎) and its fragments are interpreted in 𝜎-structures A = (dom(A), (𝑅A )π‘…βˆˆπœŽ , (𝑐A )π‘βˆˆπœŽ ) with a domain dom(A) ΜΈ= βˆ…, relations 𝑅A on dom(A) of the same arity as 𝑅 ∈ 𝜎, and elements 𝑐A ∈ dom(A). A pointed structure is a pair A, π‘Ž with π‘Ž ∈ dom(A). We also consider a few fragments of C2 that correspond to some standard description logics (DLs). In the context of DLs, unary relation symbols are called concept names, binary ones role names, and constants individual names [4]. A role is a role name π‘Ÿ or its inverse π‘Ÿβˆ’ . The universal role is denoted by 𝑒. A nominal takes the form {𝑐} with an individual name 𝑐. An π’œβ„’π’žπ’ͺℐ𝒬𝑒 (𝜎)-concept is defined by the grammar 𝐢 ::= ⊀ | 𝐴 | {𝑐} | ¬𝐢 | 𝐢 βŠ“ 𝐢 β€² | β‰₯ π‘˜ π‘Ÿ.𝐢 | βˆƒπ‘’.𝐢, where 𝐴 ∈ 𝜎 is a concept, 𝑐 ∈ 𝜎 an individual, π‘Ÿ a role name in 𝜎 or its inverse, and π‘˜ > 0. We consider several fragments of π’œβ„’π’žπ’ͺℐ𝒬𝑒 . The weakest, π’œβ„’π’ž, is obtained by dropping the universal role (indicated by omitting ·𝑒 from the name), inverse roles (indicated by omitting ℐ), nominals (indicated by omitting π’ͺ), and only admitting qualified number restrictions of the form βˆƒπ‘Ÿ.𝐢 = (β‰₯ 1 π‘Ÿ.𝐢) (indicated by dropping 𝒬). The languages between π’œβ„’π’ž and π’œβ„’π’žπ’ͺℐ𝒬𝑒 are now defined in the obvious way. The semantics of DLs can be defined via the standard translation Β·β™― into C2 with constants. For any π’œβ„’π’žπ’ͺℐ𝒬𝑒 -concept 𝐢, we denote by 𝐢π‘₯β™― the C2 -formula with constants and free variable π‘₯ ∈ var defined inductively by taking βŠ€β™―π‘₯ = (π‘₯ = π‘₯), 𝐴♯π‘₯ = 𝐴(π‘₯), {𝑐}β™―π‘₯ = (π‘₯ = 𝑐), (¬𝐢)β™―π‘₯ = ¬𝐢π‘₯β™― , (𝐢 βŠ“ 𝐷)β™―π‘₯ = 𝐢π‘₯β™― ∧ 𝐷π‘₯β™― , Β― 𝐢π‘₯Β―β™― , (βˆƒπ‘’.𝐢)β™―π‘₯ = βŠ€β™―π‘₯ ∧ βˆƒπ‘₯ (β‰₯ π‘˜ π‘Ÿ.𝐢)β™―π‘₯ = βˆƒβ‰₯π‘˜ π‘₯ Β― ) ∧ 𝐢π‘₯Β―β™― , (οΈ€ )οΈ€ Β― π‘Ÿ(π‘₯, π‘₯ where π‘₯ Β― = 𝑦, 𝑦¯ = π‘₯ and {π‘₯, 𝑦} = var. The complexities of the satisfiability problems for the logics in question are as follows [4, 5]: β€’ FO2 , C2 , and π’œβ„’π’žπ’ͺℐ𝒬𝑒 are all NExpTime-complete; β€’ π’œβ„’π’ž 𝑒 , π’œβ„’π’žπ’¬π‘’ , π’œβ„’π’žβ„π’¬π‘’ , and π’œβ„’π’žπ’ͺℐ 𝑒 are all ExpTime-complete. 3. Deciding Separation Our main results are summarised in the next theorem: Theorem 1. The separation and Craig separation problems are β€’ undecidable for the pairs C2 /FO2 and π’œβ„’π’žπ’ͺℐ𝒬/π’œβ„’π’žπ’ͺℐ, β€’ 2ExpTime-complete for the pairs π’œβ„’π’žβ„π’¬π‘’ /π’œβ„’π’žβ„ 𝑒 and π’œβ„’π’žπ’¬π‘’ /π’œβ„’π’ž 𝑒 . The proof of Theorem 1 is based on the following straightforward model-theoretic characterisation of separation in terms of bisimulations; see [6, 7, 8] and references therein: Lemma 2. Let πœ™(π‘₯) and πœ“(π‘₯) be any C2 (𝜎)-formulas, 𝜚 βŠ† 𝜎, and let 𝐿𝑠 be any of the languages introduced in Section 2. Then the following conditions are equivalent: β€’ πœ™(π‘₯) and πœ“(π‘₯) do not have an 𝐿𝑠 (𝜚)-separator; β€’ there are pointed 𝜎-structures A, π‘Ž and B, 𝑏 such that A |= πœ™(π‘Ž), B |= πœ“(𝑏), A, π‘Ž βˆΌπΏπ‘  (𝜚) B, 𝑏. For Craig separation, we additionally require that 𝜚 βŠ† sig(πœ™) ∩ sig(πœ“). Here, A, π‘Ž βˆΌπΏπ‘  (𝜚) B, 𝑏 means that there is an 𝐿𝑠 (𝜚)-bisimulation 𝛽 between A and B such that (π‘Ž, 𝑏) ∈ 𝛽, which implies that A |= πœ‘(π‘Ž) iff B |= πœ‘(𝑏), for all 𝐿𝑠 (𝜚)-formulas πœ‘(π‘₯) [6, 9, 5]. The proof of the characterisation in Lemma 2 is standard and similar to the characterisations of Craig interpolant nonexistence in [7, 8]. The undecidability proofs are by reduction of the halting problem for 2-register machines where the numbers in the registers are represented by the number of 𝐿𝑠 (𝜚)- bisimilar nodes. The decidability proofs are based on novel adaptations of the mosaic technique for constructing 𝐿𝑠 (𝜚)-bisimilar models [7, 8]. Preliminary detailed proofs are available at https: //www.csc.liv.ac.uk/~frank/publ/publ.html. 4. Related Work With the exception of separating modal πœ‡-calculus formulas by plain modal formulas [10], separability has so far been mainly investigated for formal languages [11, 12, 13]. In contrast, definability has been investigated for many logics. For example, the problem of deciding whether a TBox given in a DL 𝐿 can be equivalently expressed in another DL 𝐿′ is considered in [14], the problem of deciding whether a guarded fragment (GF) formula or guarded negation fragment (GNF) formula is equivalent to an existential (or positive existential) GF formula or, respectively, GNF formula is considered in [15, 16], and there are many results on deciding whether fixpoints can be dropped from a second-order extension of a fragment of FO. Interestingly, the complexity of separation and definability of modal πœ‡-calculus formulas by plain modal formulas are both ExpTime-complete [17, 10]. Variants of definability explored in description logic are approximation [18] and conservative rewritability [19]. Craig separators are a generalisation of Craig interpolants where 𝐿𝑠 = 𝐿. If the logic 𝐿 has the Craig interpolation property (CIP), then the Craig separator existence problem for πœ™ and πœ“ reduces to checking whether πœ™ |= πœ“ and is thus not harder than entailment. Only recently the problem of deciding the existence of Craig interpolants has been considered for logics without the CIP [20, 21]. In fact, the bisimulation-based method employed here makes heavy use of techniques introduced for checking Craig interpolant existence [7, 8]. 5. Discussion We have started investigating the separation problem for fragments of FO with counting by formulas in the same fragments but without counting. Many problems remain to be addressed; we mention a few of them below: 1. Our decidability proofs are non-constructive, and it would be of interest to develop algorithms that construct separators whenever they exist and determine bounds on their size. 2. With the exception of π’œβ„’π’žπ’¬, the logics with counting we considered do not have the finite model property. It would be of interest to investigate whether our results also hold on finite structures. In that case, the bisimulation criterion does not hold as formulated (because its proof uses compactness) and one has to employ a different criterion that holds on finite structures (say, bounded bisimulations). 3. Our logics have the universal role. We conjecture that without the universal role π’œβ„’π’žπ’¬/π’œβ„’π’ž- and π’œβ„’π’žβ„π’¬/π’œβ„’π’žβ„-separation becomes coNExpTime-complete. 4. Is definability less complex than separation for the pairs of languages considered here. For example, is C2 /FO2 -definability decidable? References [1] J. C. Jung, C. Lutz, H. Pulcini, F. Wolter, Logical separability of labeled data examples under ontologies, Artif. Intell. 313 (2022) 103785. URL: https://doi.org/10.1016/j.artint.2022.103785. doi:10. 1016/J.ARTINT.2022.103785. [2] T. Place, M. Zeitoun, Separating regular languages with first-order logic, Log. Methods Comput. Sci. 12 (2016). [3] W. Craig, Three uses of the Herbrand-Gentzen theorem in relating model theory and proof theory, J. Symb. Log. 22 (1957) 269–285. doi:10.2307/2963594. [4] F. Baader, I. Horrocks, C. Lutz, U. Sattler, An Introduction to Description Logic, Cambridge University Press, 2017. [5] I. Pratt-Hartmann, Fragments of First-Order Logic, Oxford Logic Guides, Oxford University Press, United Kingdom, 2023. [6] V. Goranko, M. Otto, Model theory of modal logic, in: Handbook of Modal Logic, Elsevier, 2007, pp. 249–329. [7] J. C. Jung, F. Wolter, Living without Beth and Craig: Definitions and interpolants in the guarded and two-variable fragments, in: Proceedings of the 36th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2021, IEEE, 2021, pp. 1–14. URL: https://doi.org/10.1109/LICS52264. 2021.9470585. doi:10.1109/LICS52264.2021.9470585. [8] A. Artale, J. C. Jung, A. Mazzullo, A. Ozaki, F. Wolter, Living without Beth and Craig: Definitions and interpolants in description and modal logics with nominals and role inclusions, ACM Trans. Comput. Log. 24 (2023) 34:1–34:51. URL: https://doi.org/10.1145/3597301. doi:10.1145/3597301. [9] E. GrΓ€del, M. Otto, The freedoms of (guarded) bisimulation, in: Johan van Benthem on Logic and Information Dynamics, Springer International Publishing, 2014, pp. 3–31. [10] J. C. Jung, J. Kolodziejski, Modal separability of fixpoint formulae, in: Proceedings of the 37th International Workshop on Description Logics, CEUR Workshop Proceedings, CEUR-WS.org, 2024. [11] T. Place, M. Zeitoun, The tale of the quantifier alternation hierarchy of first-order logic over words, ACM SIGLOG News 2 (2015) 4–17. URL: https://doi.org/10.1145/2815493.2815495. doi:10.1145/ 2815493.2815495. [12] M. Bojanczyk, It is undecidable if two regular tree languages can be separated by a deterministic tree-walking automaton, Fundam. Informaticae 154 (2017) 37–46. URL: https://doi.org/10.3233/ FI-2017-1551. doi:10.3233/FI-2017-1551. [13] T. Place, Separating regular languages with two quantifier alternations, Log. Methods Comput. Sci. 14 (2018). URL: https://doi.org/10.23638/LMCS-14(4:16)2018. doi:10.23638/LMCS-14(4:16) 2018. [14] C. Lutz, R. Piro, F. Wolter, Description Logic TBoxes: Model-Theoretic Characterizations and Rewritability, in: IJCAI, 2011, pp. 983–988. [15] M. Benedikt, B. ten Cate, M. Vanden Boom, Effective interpolation and preservation in guarded logics, ACM Trans. Comput. Log. 17 (2016) 8. URL: https://doi.org/10.1145/2814570. doi:10.1145/ 2814570. [16] V. BΓ‘rΓ‘ny, M. Benedikt, B. ten Cate, Some model theory of guarded negation, J. Symb. Log. 83 (2018) 1307–1344. [17] M. Otto, Eliminating recursion in the πœ‡-calculus, in: C. Meinel, S. Tison (Eds.), STACS 99, 16th Annual Symposium on Theoretical Aspects of Computer Science, Trier, Germany, March 4-6, 1999, Proceedings, volume 1563 of Lecture Notes in Computer Science, Springer, 1999, pp. 531–540. URL: https://doi.org/10.1007/3-540-49116-3_50. doi:10.1007/3-540-49116-3\_50. [18] A. BΓΆtcher, C. Lutz, F. Wolter, Ontology approximation in Horn description logics, in: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019, 2019, pp. 1574–1580. URL: https://doi.org/10.24963/ijcai.2019/218. doi:10.24963/IJCAI.2019/218. [19] B. Konev, C. Lutz, F. Wolter, M. Zakharyaschev, Conservative rewritability of description logic TBoxes, in: Proceedings of the Twenty-Fifth International Joint Conference on Artifi- cial Intelligence, IJCAI 2016, New York, NY, USA, 9-15 July 2016, 2016, pp. 1153–1159. URL: http://www.ijcai.org/Abstract/16/167. [20] A. Kurucz, F. Wolter, M. Zakharyaschev, Definitions and (uniform) interpolants in first-order modal logic, in: Proceedings of the 20th International Conference on Principles of Knowledge Representation and Reasoning, KR 2023, Rhodes, Greece, September 2-8, 2023, 2023, pp. 417–428. URL: https://doi.org/10.24963/kr.2023/41. doi:10.24963/KR.2023/41. [21] A. Kurucz, F. Wolter, M. Zakharyaschev, A non-uniform view of Craig interpolation in modal logics with linear frames, CoRR abs/2312.05929 (2023). URL: https://doi.org/10.48550/arXiv.2312.05929. doi:10.48550/ARXIV.2312.05929. arXiv:2312.05929.