Separating Counting from Non-Counting in Fragments of
                         Two-Variable First-Order Logic (Extended Abstract)
                         Louwe Kuijer1 , Tony Tan1 , Frank Wolter1 and Michael Zakharyaschev2
                         1
                             University of Liverpool, Ashton Street, Liverpool L69 3BX, UK
                         2
                             Birkbeck, University of London, Malet Street, London WC1E 7HX, UK


                                        Abstract
                                        We consider the problem of deciding whether two disjoint classes of models defined in a fragment of first-order
                                        logic (FO) with counting can be separated in the same fragment but without counting. This problem turns out
                                        to be hard. We show that separation for the two-variable fragment FO2 extended with counting quantifiers by
                                        means of plain FO2 is undecidable, and the same is true of the pair 𝒜ℒ𝒞𝒪ℐ𝒬/𝒜ℒ𝒞𝒪ℐ of description logics.
                                        On the other hand, we establish 2ExpTime-completeness of the separation problem for the pairs 𝒜ℒ𝒞𝒬𝑢 /𝒜ℒ𝒞 𝑢
                                        and 𝒜ℒ𝒞ℐ𝒬𝑢 /𝒜ℒ𝒞ℐ 𝑢 .

                                        Keywords
                                        Separation, two-variable first-order logic, counting quantifiers, bisimulation.


                         1. Introduction
                         Our concern in this paper is the following separation problems for a pair of languages 𝐿 and 𝐿𝑠 :

                         𝐿/𝐿𝑠 -separation: given two mutually exclusive 𝐿-formulas 𝜙 and 𝜓, decide whether there exists an
                               𝐿𝑠 -formula 𝜒—a separator for 𝜙 and 𝜓—such that 𝜙 |= 𝜒 and 𝜒 |= ¬𝜓;

                         Craig 𝐿/𝐿𝑠 -separation: decide whether the given 𝐿-formulas 𝜙 and 𝜓 have an 𝐿𝑠 -separator 𝜒 that
                              only contains common non-logical symbols (predicates and functions) of 𝜙 and 𝜓.

                         To illustrate, 𝜙 could be an ontology 𝒪 and 𝜓 a concept 𝐶 that is not satisfiable with respect to 𝒪, both
                         given in an expressive language 𝐿. Then a separator ontology 𝒪′ in a weaker, easier to comprehend
                         language 𝐿𝑠 potentially explains unsatisfiability as it inherits that 𝒪 |= 𝒪′ and 𝐶 is not satisfiable under
                         𝒪′ . Also, in the context of concept learning, 𝜙 and 𝜓 could represent positive and negative examples
                         for a target concept 𝐶. Then any separator in an appropriately chosen language 𝐿𝑠 could represent the
                         concept one aims to learn [1].
                            Separation generalises definability (aka membership), which asks whether a given 𝐿-formula (say, a
                         datalog query) is equivalent to some 𝐿𝑠 -formula (say, a first-order query), and is regarded as one of
                         the main approaches to understanding the expressive power of 𝐿 relative to 𝐿𝑠 . For instance, studying
                         separability of regular languages by smaller language classes (e.g., a star-free language) has brought
                         major insights into the respective formal languages, with some fundamental open problems in the area
                         cast as separation questions [2].
                            Craig 𝐿/𝐿𝑠 -separation generalises classical Craig interpolation in 𝐿 [3] because a Craig 𝐿/𝐿-
                         separator for 𝜙 and 𝜓 is a Craig interpolant for 𝜙 → ¬𝜓 in 𝐿.
                            Our aim in this paper is to investigate the decidability and complexity of the separation problem
                         for certain fragments 𝐿 of C2 —that is, the two-variable first-order logic FO2 extended with counting
                         quantifiers—and the same fragments 𝐿𝑠 but without counting.


                             DL 2024: 37th International Workshop on Description Logics, June 18–21, 2024, Bergen, Norway
                          $ Louwe.Kuijer@liverpool.ac.uk (L. Kuijer); Tony.Tan@liverpool.ac.uk (T. Tan); wolter@liverpool.ac.uk (F. Wolter);
                          m.zakharyaschev@bbk.ac.uk (M. Zakharyaschev)
                           0000-0001-6696-9023 (L. Kuijer); 0009-0005-8341-2004 (T. Tan); 0000-0002-4470-606X (F. Wolter); 0000-0002-2210-5183
                          (M. Zakharyaschev)
                                       © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
Example 1. Consider the following C2 -formulas:

       𝜙(𝑥) = ∃=1 𝑦 𝑅(𝑥, 𝑦),       𝜓(𝑥) = ∃=1 𝑦 𝑅(𝑥, 𝑦) ∧ 𝐴(𝑦) ∧ ∃=1 𝑦 𝑅(𝑥, 𝑦) ∧ ¬𝐴(𝑦) .
                                                 (︀              )︀        (︀                )︀

Then 𝜙 |= ¬𝜓 and the FO2 -formula 𝜒(𝑥) = ∀𝑦 𝑅(𝑥, 𝑦) → 𝐴(𝑦) ∨ ∀𝑦 𝑅(𝑥, 𝑦) → ¬𝐴(𝑦) is a
                                                    (︀                )︀      (︀                )︀

separator for 𝜙(𝑥) and 𝜓(𝑥). For 𝜓 ′ (𝑥) = ∃=2 𝑦 𝑅(𝑥, 𝑦), we also have 𝜙 |= ¬𝜓 ′ , but 𝜙(𝑥) and 𝜓 ′ (𝑥)
are not separable in FO2 . On the other hand, there is no Craig FO2 -separator for 𝜙(𝑥) and 𝜓(𝑥) as it
would have to be defined using 𝑅 only, and so separate 𝜙(𝑥) and 𝜓 ′ (𝑥) as well.


2. Logics
The logics we consider here can all be regarded as fragments of first-order logic, FO, and are defined as
follows. Let 𝜎 be a signature containing unary and binary relation symbols and possibly constants. Fix
a set var comprising two individual variables. Then
FO2 (𝜎), the two-variable fragment of FO(𝜎), is defined as the set of formulas that are built from atoms
      𝐴(𝑥), 𝑅(𝑥, 𝑦), and 𝑥 = 𝑦 with unary 𝐴 ∈ 𝜎, binary 𝑅 ∈ 𝜎, and 𝑥, 𝑦 ∈ var, using the Boolean
      connectives ∧ and ¬ and quantifier ∃𝑥 with 𝑥 ∈ var (other Booleans and ∀𝑥 are regarded as
      standard abbreviations);

C2 (𝜎), the two-variable fragment of FO2 (𝜎) with counting, extends FO2 (𝜎) with the counting quantifiers
      ∃<𝑘 𝑥, for 𝑘 ∈ N and 𝑥 ∈ var (other counting quantifiers ∃=𝑘 𝑥, ∃≤𝑘 𝑥, ∃≥𝑘 𝑥 can be introduced as
      abbreviations).
In this paper, we are only interested in formulas 𝜙(𝑥) with one free variable 𝑥 ∈ var. The signature of
𝜙 is the set sig(𝜙) of relation and constant symbols occurring in 𝜙.
   FO(𝜎) and its fragments are interpreted in 𝜎-structures A = (dom(A), (𝑅A )𝑅∈𝜎 , (𝑐A )𝑐∈𝜎 ) with a
domain dom(A) ̸= ∅, relations 𝑅A on dom(A) of the same arity as 𝑅 ∈ 𝜎, and elements 𝑐A ∈ dom(A).
A pointed structure is a pair A, 𝑎 with 𝑎 ∈ dom(A).
   We also consider a few fragments of C2 that correspond to some standard description logics (DLs).
In the context of DLs, unary relation symbols are called concept names, binary ones role names, and
constants individual names [4]. A role is a role name 𝑟 or its inverse 𝑟− . The universal role is denoted by
𝑢. A nominal takes the form {𝑐} with an individual name 𝑐.
   An 𝒜ℒ𝒞𝒪ℐ𝒬𝑢 (𝜎)-concept is defined by the grammar

                      𝐶 ::= ⊤ | 𝐴 | {𝑐} | ¬𝐶 | 𝐶 ⊓ 𝐶 ′ | ≥ 𝑘 𝑟.𝐶 | ∃𝑢.𝐶,

where 𝐴 ∈ 𝜎 is a concept, 𝑐 ∈ 𝜎 an individual, 𝑟 a role name in 𝜎 or its inverse, and 𝑘 > 0. We consider
several fragments of 𝒜ℒ𝒞𝒪ℐ𝒬𝑢 . The weakest, 𝒜ℒ𝒞, is obtained by dropping the universal role
(indicated by omitting ·𝑢 from the name), inverse roles (indicated by omitting ℐ), nominals (indicated by
omitting 𝒪), and only admitting qualified number restrictions of the form ∃𝑟.𝐶 = (≥ 1 𝑟.𝐶) (indicated
by dropping 𝒬). The languages between 𝒜ℒ𝒞 and 𝒜ℒ𝒞𝒪ℐ𝒬𝑢 are now defined in the obvious way.
   The semantics of DLs can be defined via the standard translation ·♯ into C2 with constants. For any
𝒜ℒ𝒞𝒪ℐ𝒬𝑢 -concept 𝐶, we denote by 𝐶𝑥♯ the C2 -formula with constants and free variable 𝑥 ∈ var
defined inductively by taking

    ⊤♯𝑥 = (𝑥 = 𝑥),     𝐴♯𝑥 = 𝐴(𝑥),     {𝑐}♯𝑥 = (𝑥 = 𝑐),    (¬𝐶)♯𝑥 = ¬𝐶𝑥♯ ,    (𝐶 ⊓ 𝐷)♯𝑥 = 𝐶𝑥♯ ∧ 𝐷𝑥♯ ,
                                                      ¯ 𝐶𝑥¯♯ ,
                                    (∃𝑢.𝐶)♯𝑥 = ⊤♯𝑥 ∧ ∃𝑥          (≥ 𝑘 𝑟.𝐶)♯𝑥 = ∃≥𝑘 𝑥      ¯ ) ∧ 𝐶𝑥¯♯ ,
                                                                                     (︀             )︀
                                                                                   ¯ 𝑟(𝑥, 𝑥

where 𝑥
      ¯ = 𝑦, 𝑦¯ = 𝑥 and {𝑥, 𝑦} = var.
  The complexities of the satisfiability problems for the logics in question are as follows [4, 5]:
    • FO2 , C2 , and 𝒜ℒ𝒞𝒪ℐ𝒬𝑢 are all NExpTime-complete;

    • 𝒜ℒ𝒞 𝑢 , 𝒜ℒ𝒞𝒬𝑢 , 𝒜ℒ𝒞ℐ𝒬𝑢 , and 𝒜ℒ𝒞𝒪ℐ 𝑢 are all ExpTime-complete.
3. Deciding Separation
Our main results are summarised in the next theorem:

Theorem 1. The separation and Craig separation problems are

    • undecidable for the pairs C2 /FO2 and 𝒜ℒ𝒞𝒪ℐ𝒬/𝒜ℒ𝒞𝒪ℐ,

    • 2ExpTime-complete for the pairs 𝒜ℒ𝒞ℐ𝒬𝑢 /𝒜ℒ𝒞ℐ 𝑢 and 𝒜ℒ𝒞𝒬𝑢 /𝒜ℒ𝒞 𝑢 .

   The proof of Theorem 1 is based on the following straightforward model-theoretic characterisation
of separation in terms of bisimulations; see [6, 7, 8] and references therein:

Lemma 2. Let 𝜙(𝑥) and 𝜓(𝑥) be any C2 (𝜎)-formulas, 𝜚 ⊆ 𝜎, and let 𝐿𝑠 be any of the languages
introduced in Section 2. Then the following conditions are equivalent:

    • 𝜙(𝑥) and 𝜓(𝑥) do not have an 𝐿𝑠 (𝜚)-separator;

    • there are pointed 𝜎-structures A, 𝑎 and B, 𝑏 such that

                              A |= 𝜙(𝑎),      B |= 𝜓(𝑏),       A, 𝑎 ∼𝐿𝑠 (𝜚) B, 𝑏.

For Craig separation, we additionally require that 𝜚 ⊆ sig(𝜙) ∩ sig(𝜓).

   Here, A, 𝑎 ∼𝐿𝑠 (𝜚) B, 𝑏 means that there is an 𝐿𝑠 (𝜚)-bisimulation 𝛽 between A and B such that
(𝑎, 𝑏) ∈ 𝛽, which implies that A |= 𝜑(𝑎) iff B |= 𝜑(𝑏), for all 𝐿𝑠 (𝜚)-formulas 𝜑(𝑥) [6, 9, 5]. The
proof of the characterisation in Lemma 2 is standard and similar to the characterisations of Craig
interpolant nonexistence in [7, 8]. The undecidability proofs are by reduction of the halting problem
for 2-register machines where the numbers in the registers are represented by the number of 𝐿𝑠 (𝜚)-
bisimilar nodes. The decidability proofs are based on novel adaptations of the mosaic technique
for constructing 𝐿𝑠 (𝜚)-bisimilar models [7, 8]. Preliminary detailed proofs are available at https:
//www.csc.liv.ac.uk/~frank/publ/publ.html.


4. Related Work
With the exception of separating modal 𝜇-calculus formulas by plain modal formulas [10], separability
has so far been mainly investigated for formal languages [11, 12, 13]. In contrast, definability has been
investigated for many logics. For example, the problem of deciding whether a TBox given in a DL 𝐿
can be equivalently expressed in another DL 𝐿′ is considered in [14], the problem of deciding whether
a guarded fragment (GF) formula or guarded negation fragment (GNF) formula is equivalent to an
existential (or positive existential) GF formula or, respectively, GNF formula is considered in [15, 16],
and there are many results on deciding whether fixpoints can be dropped from a second-order extension
of a fragment of FO. Interestingly, the complexity of separation and definability of modal 𝜇-calculus
formulas by plain modal formulas are both ExpTime-complete [17, 10]. Variants of definability explored
in description logic are approximation [18] and conservative rewritability [19].
   Craig separators are a generalisation of Craig interpolants where 𝐿𝑠 = 𝐿. If the logic 𝐿 has the
Craig interpolation property (CIP), then the Craig separator existence problem for 𝜙 and 𝜓 reduces to
checking whether 𝜙 |= 𝜓 and is thus not harder than entailment. Only recently the problem of deciding
the existence of Craig interpolants has been considered for logics without the CIP [20, 21]. In fact, the
bisimulation-based method employed here makes heavy use of techniques introduced for checking
Craig interpolant existence [7, 8].
5. Discussion
We have started investigating the separation problem for fragments of FO with counting by formulas
in the same fragments but without counting. Many problems remain to be addressed; we mention a few
of them below:

   1. Our decidability proofs are non-constructive, and it would be of interest to develop algorithms
      that construct separators whenever they exist and determine bounds on their size.

   2. With the exception of 𝒜ℒ𝒞𝒬, the logics with counting we considered do not have the finite
      model property. It would be of interest to investigate whether our results also hold on finite
      structures. In that case, the bisimulation criterion does not hold as formulated (because its proof
      uses compactness) and one has to employ a different criterion that holds on finite structures (say,
      bounded bisimulations).

   3. Our logics have the universal role. We conjecture that without the universal role 𝒜ℒ𝒞𝒬/𝒜ℒ𝒞-
      and 𝒜ℒ𝒞ℐ𝒬/𝒜ℒ𝒞ℐ-separation becomes coNExpTime-complete.

   4. Is definability less complex than separation for the pairs of languages considered here. For
      example, is C2 /FO2 -definability decidable?


References
 [1] J. C. Jung, C. Lutz, H. Pulcini, F. Wolter, Logical separability of labeled data examples under
     ontologies, Artif. Intell. 313 (2022) 103785. URL: https://doi.org/10.1016/j.artint.2022.103785. doi:10.
     1016/J.ARTINT.2022.103785.
 [2] T. Place, M. Zeitoun, Separating regular languages with first-order logic, Log. Methods Comput.
     Sci. 12 (2016).
 [3] W. Craig, Three uses of the Herbrand-Gentzen theorem in relating model theory and proof theory,
     J. Symb. Log. 22 (1957) 269–285. doi:10.2307/2963594.
 [4] F. Baader, I. Horrocks, C. Lutz, U. Sattler, An Introduction to Description Logic, Cambridge
     University Press, 2017.
 [5] I. Pratt-Hartmann, Fragments of First-Order Logic, Oxford Logic Guides, Oxford University Press,
     United Kingdom, 2023.
 [6] V. Goranko, M. Otto, Model theory of modal logic, in: Handbook of Modal Logic, Elsevier, 2007,
     pp. 249–329.
 [7] J. C. Jung, F. Wolter, Living without Beth and Craig: Definitions and interpolants in the guarded
     and two-variable fragments, in: Proceedings of the 36th Annual ACM/IEEE Symposium on Logic
     in Computer Science, LICS 2021, IEEE, 2021, pp. 1–14. URL: https://doi.org/10.1109/LICS52264.
     2021.9470585. doi:10.1109/LICS52264.2021.9470585.
 [8] A. Artale, J. C. Jung, A. Mazzullo, A. Ozaki, F. Wolter, Living without Beth and Craig: Definitions
     and interpolants in description and modal logics with nominals and role inclusions, ACM Trans.
     Comput. Log. 24 (2023) 34:1–34:51. URL: https://doi.org/10.1145/3597301. doi:10.1145/3597301.
 [9] E. Grädel, M. Otto, The freedoms of (guarded) bisimulation, in: Johan van Benthem on Logic and
     Information Dynamics, Springer International Publishing, 2014, pp. 3–31.
[10] J. C. Jung, J. Kolodziejski, Modal separability of fixpoint formulae, in: Proceedings of the 37th
     International Workshop on Description Logics, CEUR Workshop Proceedings, CEUR-WS.org, 2024.
[11] T. Place, M. Zeitoun, The tale of the quantifier alternation hierarchy of first-order logic over words,
     ACM SIGLOG News 2 (2015) 4–17. URL: https://doi.org/10.1145/2815493.2815495. doi:10.1145/
     2815493.2815495.
[12] M. Bojanczyk, It is undecidable if two regular tree languages can be separated by a deterministic
     tree-walking automaton, Fundam. Informaticae 154 (2017) 37–46. URL: https://doi.org/10.3233/
     FI-2017-1551. doi:10.3233/FI-2017-1551.
[13] T. Place, Separating regular languages with two quantifier alternations, Log. Methods Comput.
     Sci. 14 (2018). URL: https://doi.org/10.23638/LMCS-14(4:16)2018. doi:10.23638/LMCS-14(4:16)
     2018.
[14] C. Lutz, R. Piro, F. Wolter, Description Logic TBoxes: Model-Theoretic Characterizations and
     Rewritability, in: IJCAI, 2011, pp. 983–988.
[15] M. Benedikt, B. ten Cate, M. Vanden Boom, Effective interpolation and preservation in guarded
     logics, ACM Trans. Comput. Log. 17 (2016) 8. URL: https://doi.org/10.1145/2814570. doi:10.1145/
     2814570.
[16] V. Bárány, M. Benedikt, B. ten Cate, Some model theory of guarded negation, J. Symb. Log. 83
     (2018) 1307–1344.
[17] M. Otto, Eliminating recursion in the 𝜇-calculus, in: C. Meinel, S. Tison (Eds.), STACS 99, 16th
     Annual Symposium on Theoretical Aspects of Computer Science, Trier, Germany, March 4-6, 1999,
     Proceedings, volume 1563 of Lecture Notes in Computer Science, Springer, 1999, pp. 531–540. URL:
     https://doi.org/10.1007/3-540-49116-3_50. doi:10.1007/3-540-49116-3\_50.
[18] A. Bötcher, C. Lutz, F. Wolter, Ontology approximation in Horn description logics, in: Proceedings
     of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao,
     China, August 10-16, 2019, 2019, pp. 1574–1580. URL: https://doi.org/10.24963/ijcai.2019/218.
     doi:10.24963/IJCAI.2019/218.
[19] B. Konev, C. Lutz, F. Wolter, M. Zakharyaschev, Conservative rewritability of description
     logic TBoxes, in: Proceedings of the Twenty-Fifth International Joint Conference on Artifi-
     cial Intelligence, IJCAI 2016, New York, NY, USA, 9-15 July 2016, 2016, pp. 1153–1159. URL:
     http://www.ijcai.org/Abstract/16/167.
[20] A. Kurucz, F. Wolter, M. Zakharyaschev, Definitions and (uniform) interpolants in first-order
     modal logic, in: Proceedings of the 20th International Conference on Principles of Knowledge
     Representation and Reasoning, KR 2023, Rhodes, Greece, September 2-8, 2023, 2023, pp. 417–428.
     URL: https://doi.org/10.24963/kr.2023/41. doi:10.24963/KR.2023/41.
[21] A. Kurucz, F. Wolter, M. Zakharyaschev, A non-uniform view of Craig interpolation in modal logics
     with linear frames, CoRR abs/2312.05929 (2023). URL: https://doi.org/10.48550/arXiv.2312.05929.
     doi:10.48550/ARXIV.2312.05929. arXiv:2312.05929.