<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Well-founded Graph-based Summarization Framework for Description Logics</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Univ Rennes</institution>
          ,
          <addr-line>Lannion</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The quotient operation from graph theory ofers an elegant graph summarization framework that has been widely investigated in the literature, notably for the exploration and eficient management of large graphs; it consists in fusing equivalent vertices according to an equivalence relation. In this paper, we study whether a similar operation may be used to summarize ABoxes. Towards this goal, we define and examine the quotient operation on an ABox: we establish that a quotient ABox is more specific than the ABox it summarizes, and characterize to which extent it is more specific. This preliminary investigation validates the interest of a quotient-based ABox summarization framework, and paves the way for further studies on it in the description logic setting, e.g., to devise equivalence relations suited to the optimization of typical data management and reasoning tasks on large ABoxes or to the visualization of large ABoxes, and on its utilization in related settings, e.g., Semantic Web.</p>
      </abstract>
      <kwd-group>
        <kwd>Data summaries</kwd>
        <kwd>quotient graph</kwd>
        <kwd>most specific concept</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Graph summarization has received considerable attention in the literature [
        <xref ref-type="bibr" rid="ref14 ref5">14,5</xref>
        ],
in particular for the exploration and visualization of large graphs, as well as for
the optimization of graph management systems (cardinality estimation,
indexing, etc). A well-established tool used to summarize graphs into compact ones is
the quotient operation from graph theory [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]; other kinds of summaries have also
been considered like collections of graph statistics or of frequent graph patterns,
or a mix of both as in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>
        The quotient operation ofers an elegant graph summarization framework by
decoupling the summarization method, which basically fuses equivalent vertices,
from the high-level specification of equivalent vertices, defined by an equivalence
relation, e.g., bisimilarity [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Applying the quotient operation to a graph results
Copyright © 2021 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).
in a (typically small) homomorphic approximation of it, called quotient graph;
the equivalence relation to choose depends on the target summary usage.
      </p>
      <p>
        In this paper, we study whether a similar operation may be used to summarize
ABoxes. Towards this goal, we first transfer the quotient operation from graphs
to description logics (DLs) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] in order to summarize ABoxes as if they were
merely graphs. Then, because ABoxes are not just graphs but are essentially
ifrst-order theories (or slight extensions of them), we examine the semantics
of the quotient operation on an ABox. We establish that a quotient ABox is
more specific than the ABox it summarizes , by showing that its structure entails
the structure of the ABox, and we characterize to which extent this quotient
ABox is more specific than the ABox it summarizes , by determining the precise
nature of the approximation between the quotient ABox contents and the ABox
contents. This latter characterization is the main result of this paper. It provides
an in-depth understanding of the quotient operation itself as a tool for ABox
summarization; the essence of quotient ABoxes is independent of the DL under
consideration and of the chosen equivalence relation (that depends on the target
summary usage).
      </p>
      <p>The paper is organized as follows. We recall the basics of DLs in Sec. 2.
Then, we transfer the quotient operation from graphs to ABoxes in Sec. 3 and
we examine its semantics w.r.t. DLs in Sec. 4. Finally, we discuss related works
in Sec. 5 and we conclude with salient perspectives in Sec. 6.</p>
      <p>The proofs of our technical results can be found in the appendix of this paper.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>
        We recall the important aspects of DLs that are used in this paper [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], with
a focus on the lightweight DL-liteR DL [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] that we (just) use to illustrate our
discussions throughout this paper; our results hold for (standard) DLs.
DL syntax and semantics Given a set NC of concept names (unary predicates), a
set NR of role names (binary predicates), and a set NI of individuals (constants),
a DL allows expressing unary and binary formulae respectively called concepts
and roles. They are used to build TBoxes (ontologies) made of concept inclusions
of the form C1 ⊑ C2 and role inclusions of the form R1 ⊑ R2, as well as to
build ABoxes (databases) storing concept assertions of the form C(a) and role
assertions of the form R(a1, a2), where a, a1, a2 ∈ NI . In DL-liteR, concepts and
roles are built according to the following rules, where A ∈ NC and R ∈ NR:
      </p>
      <p>B := A | ∃Q, C := B | ¬B, Q := R | R− , S := Q | ¬Q.</p>
      <p>Also, TBox inclusions are of the forms B ⊑ C and Q ⊑ S, while ABox assertions
are of the forms A(a) and R(a1, a2).</p>
      <p>First-order interpretations are used to define the semantics of DLs. They are
of the form I = (∆ I , · I ), where ∆ I is a non-empty domain and · I is a function
that maps every a ∈ NI to aI ∈ ∆ I , every A ∈ NC to AI ⊆ ∆ I , and every
R ∈ NR to RI ⊆ ∆ I × ∆ I . Further, the interpretation function · I is extended
to the concept and role formulae allowed in the DL under consideration. In
DLliteR, · I is extended as follows: (R− )I = {(o1, o2) | (o2, o1) ∈ RI }, (∃Q)I = {o1 |
∃o2 (o1, o2) ∈ QI }, (¬B)I = ∆ I \ BI and (¬Q)I = (∆ I × ∆ I ) \ QI . Let I be
an interpretation. I satisfies a DL concept C (resp. role R) if it has a non-empty
interpretation, i.e., CI ̸= ∅ (resp. RI ̸= ∅). I satisfies an inclusion C1 ⊑ C2 (resp.
R1 ⊑ R2) if C1I ⊆ C2I (resp. R1I ⊆ R2I ) holds, and it satisfies a TBox if it satisfies
all the TBox inclusions. I satisfies an assertion C(a) (resp. R(a1, a2)) if aI ∈ CI
(resp. (a1I , a2I ) ∈ RI ) holds, and it satisfies an ABox if it satisfies all the ABox
assertions. Finally, I is a model of a concept, role, inclusion, TBox, assertion or
ABox if I satisfies it.</p>
      <p>Generalization/specialization relations Concepts and roles are compared through
subsumption, which may rely on the knowledge of a TBox T . C1 is subsumed by
C2 or C2 subsumes C1 w.r.t. T , denoted C1 ⪯ T C2, if every model I of T is a
model of the inclusion C1 ⊑ C2; subsumption for roles is defined similarly.</p>
      <p>ABoxes are compared through entailment, which may also rely on a TBox
T . A1 entails A2 or A2 is entailed by A1 w.r.t. T , denoted A1 |=T A2, if every
model I of A1 and T is also a model of A2. As special case, an ABox A entails
a concept assertion C(a) w.r.t. T , denoted A |=T C(a), if A entails w.r.t. T the
ABox {C(a)}; entailment of a role assertion is defined similarly.</p>
      <p>
        Data management The main data management tasks are consistency checking
and query answering. They involve an ABox A and may rely on the knowledge
of a TBox T . A is consistent w.r.t. T if A has a model that is also a model of T .
To query ABoxes, we consider the first-order language of unions of conjunctive
queries that is widely-considered for ontology-mediated query answering [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and
ontology-based data access [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. A conjunctive query (CQ) is of the form ∃Y⃗ ϕ ,
where ϕ is a conjunction of the forms A(t) or R(t, t′) where t, t′ are variables
or constants, and Y⃗ is a tuple of variable from ϕ . A CQ is Boolean if all its
variables are existentially quantified. The answer to a Boolean CQ q is true if it
is entailed by A and T , denoted A |=T q, i.e., q is true in all the models of A and
T ; otherwise q is false. The answer to a non-Boolean CQ q with free variables
X1, . . . , Xn, i.e., of arity n, is the set of all the tuples of constants of the form
⃗a = ⟨a1, . . . , an⟩ such that A |=T q[⃗a], where q[⃗a] is the Boolean CQ obtained
by replacing each Xk by ak for 1 ≤ k ≤ n. We note qA,T the answer set of q on
A w.r.t. T , with the convention that for Boolean queries qA,T = ∅ if q is f alse,
and qA,T = {⟨⟩} if q is true where ⟨⟩ is the empty tuple. A union of CQs (UCQ)
is a disjunction of CQs with same arity. The answer to a UCQ is the union of
the answers to its individual CQs.
3
      </p>
    </sec>
    <sec id="sec-3">
      <title>Quotient operation for ABoxes</title>
      <p>An ABox A can be represented, and is frequently drawn, as a multigraph with
typed vertices and typed edges: the vertices are the constants in A, there is a
vertex type C on the vertex a if the concept assertion C(a) is in A, and there
is an R-typed edge a →R a′ if the role assertion R(a, a′) is in A. Based on this
analogy, we transfer the quotient operation from graphs to ABoxes to define
quotient ABoxes.</p>
      <p>Definition 1 (Quotient ABox). Let A be an ABox, ≡ be some equivalence
relation between constants, and let a≡1 , . . . , a≡n denote by a slight abuse of notation
both the equivalence classes of the constants in A w.r.t. ≡ and the names of these
equivalence classes. The quotient ABox of A w.r.t. ≡ is the ABox A≡ such that:
–– CR((aa≡ii ),a∈≡j A)∈≡ Aift≡heirfetheexriestesxisat ∈aa≡i∈ sau≡ch that C(a) ∈ A, for 1 ≤ i ≤ n,
≡ i and a′ ∈ a≡j such that R(a, a′) ∈ A,
for 1 ≤ i, j ≤ n.</p>
      <p>In this paper, we do not focus on particular equivalence relations between
constants (i.e., binary relations that are reflexive, symmetric and transitive), though
in the description logic setting we consider they should be denfied w.r.t. an ABox
and a TBox. Discussing DL-specific equivalence relations between constants is
delegated to future work.</p>
      <p>Tex.</p>
      <p>Example 1 (Running example). Let us consider the ABox
Aex = {PhDS(s), sB(s, r1), sB(s, r2), R(r1), R(r2), wW(r2, r1), wF(r1, l1), wF(r2, l2),</p>
      <p>ULab(l1), CLab(l2)}
which states that s is a PhD student (PhDS), who is jointly supervized by (sB)
the researchers (R) r1, who works for (wF) the university lab (ULab) l1, and r2,
who works for the company lab (CLab) l2 and who also works with (wW) r1.</p>
      <p>The graph representations of Aex and of three of quotient ABoxes obtained
from it are depicted in Fig. 1. The equivalence relations used in this figure
somehow reflect that equivalent constants are instances of the same concept
for Fig. 1(b), or instances of non-disjoint concepts with a common generalizing
concept if we assume that company labs and university labs are not necessarily
disjoint kinds of labs for Fig. 1(c), i.e., CLab ⊑ Lab ∈ Tex and ULab ⊑ Lab ∈ Tex,
and if we further assume that PhD student and researcher are not necessarily
disjoint kinds of employees for Fig. 1(d), i.e., PhDS ⊑ Emp ∈ Tex and R ⊑ Emp ∈</p>
      <p>In the sequel, we term summary of an ABox A a quotient ABox of A.
4</p>
    </sec>
    <sec id="sec-4">
      <title>Characterization of ABox summaries</title>
      <p>Although Definition 1 does enable the quotient-based summarization of an ABox,
an ABox is not just a graph because it also has a first-order semantics. An
important question that therefore arises is: does there exist a semantic relationship
between an ABox and a summary of it? To answer this question we first point out
that, in Definition 1, the implicit function that maps the constants in A to the
constants in A≡ defines a homomorphism from A to A≡ ; from now, we name this
function h and we say that an ABox constant a is represented by a summary
PhDS s
wW</p>
      <p>PhDS s</p>
      <p>sB
sB
sB</p>
      <p>R
r1
r2</p>
      <p>R
(a) Aex
wF
wF</p>
      <p>ULab
l1
l2</p>
      <p>CLab</p>
      <p>R
r1|r2 wF</p>
      <p>ULab</p>
      <p>l1
wF
wF</p>
      <p>l2</p>
      <p>CLab
(b) A≡b with r1 ≡ r2
sB
PhDS s
sB
l1|l2 CLab, ULab</p>
      <p>PhDS, R s|r1|r2 wF
l1|l2 CLab, ULab
wW
(c) A≡c with r1 ≡ r2 and l1 ≡ l2
wW
(d) A≡d with s ≡ r1 ≡ r2 and l1 ≡ l2
constant a≡ if h(a) = a≡ holds. Based on this ABox-to-summary
homomorphism, we answer the above question by relating an ABox and a summary of it
through entailment between ABoxes w.r.t. a TBox. The property below states
that if we compare an ABox and a summary of it through entailment regardless
of the constants they use (they are incomparable otherwise because they use
diferent sets of constants), by just considering how the ABox (resp. summary)
atoms join according to these constants, then the ABox is more general than its
summary.</p>
      <p>Property 1. Let T be a TBox, a1, . . . , am be the constants in an ABox A and
cao≡1n,s.t.a.n,tas≡nasbeextishteenctoianlstvaanrtiasbilness,otmheens∃uam≡1 m·· · a∃ryaA≡n≡ A≡of |=AT. I∃faw1· e·· ∃conasmidAer htohledsse.</p>
      <p>Besides, it is also worth noting that, due to the existence of the
ABox-tosummary homomorphism, an ABox and a summary of it are related through the
essential data management tasks of consistency checking and query answering.
Property 2. Let T be a TBox, A be an ABox and A≡ be some summary of A.
If A≡ is consistent w.r.t. T , then A is consistent w.r.t. T .
lPertoqpebrteya3.UCLeQt.TIf bqeAa,T T̸=Bo∅x,hAoldbsetahnenAqBhAo≡x,,T A̸ =≡ ∅ holds, with qh the query q
be some summary of A, and
in which every constant a is replaced by its image h(a) through the A-to-A≡
homomorphism h.</p>
      <p>We remark that Property 2 and the contraposition of Property 3 (i.e., if
qA≡ ,T = ∅ holds then qA,T = ∅ holds) are of practical interest to check rapidly
h
if, for sure, an ABox is consistent and if a query has no answer, by using the
typically (much) smaller ABox summary.</p>
      <p>Example 2 (Cont.). From now, let us assume that Tex just consists of the
inclusions mentioned in Example 1.</p>
      <p>To illustrate Property 1, it is easy to see on Fig. 1 that A≡i |=∅ Aex holds for
i ∈ {b, c, d}, hence holds with |=Tex .</p>
      <p>As for Property 2, it is also easy to see that A≡i is consistent w.r.t. Tex, for i ∈
{b, c, d}, as well as Aex. We remark that if PhD students and researchers would
be disjoint, i.e., PhDS ⊑ ¬R ∈ Tex, then A≡d would be inconsistent w.r.t. Tex while
Aex would be consistent w.r.t. Tex. This observation indicates that DL-specific
equivalence relations should avoid to the extent possible building inconsistent
summaries out of consistent ABoxes.</p>
      <p>Finally, let us illustrate Property 3 on Fig. 1. Consider the UCQ q = R(X) ∧
wF(X, l1)∪R(X)∧wF(X, l2) asking for the researchers who work for the lab l1 or l2.
q has some answer on Aex w.r.t. Tex (qAex,Tex = {⟨r1⟩, ⟨r2⟩}), thus its translation
qh has some answer w.r.t. Tex on the summary A≡i , i ∈ {b, c, d}. For the CQ
q = PhDS(X1) ∧ wW(X1, X2) asking for the PhD students and people with whom
they work, we note that because the qh translation does not have answers on
A≡i w.r.t. Tex, i ∈ {b, c}, i.e., qhA≡i ,T = ∅, hence by contraposition of Property 3,
q does not have answer on Aex w.r.t. Tex either, i.e., qAex,Tex = ∅. However, we
observe that qh has some answer on A≡d w.r.t. Tex, while q has no answer on Aex
w.r.t. Tex, i.e., this particular summary fails in detecting that q has no answer on
Aex w.r.t. Tex. This observation suggests that DL-specific equivalence relations
should be carefully chosen depending on the target summary usage.</p>
      <p>Resuming our initial discussion, though Property 1 states that a summary is
more specific than the ABox it summarizes up to the set of constants they use, it
immediately raises a second crucial question: to which extent a summary is more
specific than the original ABox it summarizes? We answer this second question
by characterizing to which extent the knowledge a summary has about a constant
is more specific than the knowledge the ABox has about the equivalence class this
summary constant represents, and similarly for a pair of summary constants and
the pair of equivalence classes these summary constants represent in the ABox.</p>
      <p>
        For an ABox A and a TBox T , the most precise knowledge about a set of
constants a1, . . . , an corresponds to one or more concepts known as the most
specific concepts (MSCs) a1, . . . , an are all instances of [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. A concept C is a
MSC of a1, . . . , an w.r.t. A and T if ( i) A |=T C(ai) holds for all i ∈ [1, n],
and (ii) for any concept D such that A |=T D(ai) holds for all i ∈ [1, n], either
C ⪯ T D holds or C ̸⪯ T D and D ̸⪯ T C hold. We remark that though a MSC
may be of infinite size in some DLs, this is not an issue here as we just want a
semantic characterization of it. In the sequel, we designate by mscA,T (a1, . . . , an)
the conjunction (⊓) of the MSCs of a1, . . . , an w.r.t. A and T 1 and we refer
to it as the MSC of a1, . . . , an w.r.t. A and T , because it always exists (the
1 a1, . . . , an obviously have a single MSC in DLs equipped with conjunction (⊓).
empty conjunction is the universal concept ⊤) and it is obviously unique up to
equivalence.
      </p>
      <p>A first step towards answering our second question is Lemma 1, which
provides an upper approximation through subsumption of the MSC of some
summary constant w.r.t. the individual MSCs of the equivalent ABox constants it
represents; Lemma 1 follows from the existence of the ABox-to-summary
homomorphism.</p>
      <p>Lemma 1. Let T be a TBox, A be an ABox and A≡ be the summary of A
w.r.t. the ≡ equivalence relation. If a1, . . . , an are all the constants in A that
belong to the equivalence class a≡ according to ≡ , then the following holds:
Example 3 (Cont.). Let us consider the summary A≡d of Aex in Fig. 1 and
illustrate Lemma 1 with the summary constant l1|l2 and the ABox constants l1, l2
it represents:
mscA≡d ,Tex (l1|l2) = Clab ⊓ ULab ⊓ ∃wF− ,
mscAex,Tex (l1) = ULab ⊓ ∃wF− ,
mscAex,Tex (l2) = CLab ⊓ ∃wF− ,
and mscA≡d ,Tex (l1|l2) ⪯ Tex ⊓x∈{l1,l2}mscAex,Tex (x), i.e., Clab ⊓ ULab ⊓ ∃wF− ⪯ Tex
(ULab ⊓ ∃wF− ) ⊓ (CLab ⊓ ∃wF− ), clearly holds.</p>
      <p>At the same time, in the above Lemma 1, to the set of (equivalent)
constants a1, . . . , an corresponds its most specific concept mscA,T (a1, . . . , an), as
recalled above. As the next step towards answering our second question, Lemma 2
provides a lower approximation through subsumption of the MSC of a1, . . . , an
w.r.t. the individual MSCs of these constants; it follows from the definition of
MSC.</p>
      <p>Lemma 2. Let T be a TBox and A be an ABox. If a1, . . . , an are constants in
A, then the following holds:
Example 4 (Cont.). Let us consider again the summary A≡d of Aex in Fig. 1 and
illustrate Lemma 2 with the summary constant l1|l2 and the ABox constants
l1, l2 it represents like in the preceding example:
mscAex,Tex (l1, l2) = Lab ⊓ ∃wF− ,
and ⊔x∈{l1,l2}mscAex,Tex (x) ⪯ T mscAex,Tex (l1, l2), i.e., (ULab ⊓ ∃wF− ) ⊔ (CLab ⊓
∃wF− ) ⪯ Tex Lab ⊓ ∃wF− , clearly holds.</p>
      <p>We are now ready to answer our second question with Theorem 1, which
is a direct corollary of Lemma 1 and Lemma 2. It relates mscA,T (a≡ ) and
mscA,T (a1, . . . , an) through their respective upper and lower approximations.
Theorem 1. Let T be a TBox, A be an ABox and A≡ be the summary of A
w.r.t. the ≡ equivalence relation. If a1, . . . , an are all the constants in A that
belong to the equivalence class a≡ according to ≡ , then the following holds:
From Theorem 1, it is now clear that the semantic distance through subsumption
between the knowledge about each constant a≡ in the summary, i.e., msc(a≡ ),
and the knowledge about the set of equivalent constants that a≡ represents
in the ABox, i.e., mscA,T (a1, . . . , an), corresponds at least to approximate the
union of individual knowledge about a1, . . . , an in the ABox by a conjunction,
i.e., approximating Fin=1 mscA,T (ai) by dn
i=1 mscA,T (ai).</p>
      <p>We establish similar results (omitted due to space limitations) by adapting
the notion of MSC to roles.</p>
      <p>Finally, we stress that all the above results just follow from the definition of
quotient ABox (Definition 1). In particular, they are independent of the
equivalence relation (≡ ) and of the DL under consideration.
5</p>
    </sec>
    <sec id="sec-5">
      <title>Related work</title>
      <p>We present below the works from the DL literature that are closely related to
ABox summarization.</p>
      <p>
        An ABox summarization approach for the SHIN DL has been studied to
optimize consistency checking [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and instance retrieval in the subsequent work [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
i.e., finding the instances of a given SHIN concept; these two summary-based
optimizations are implemented in the SHER system [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. This approach can be
seen as a particular instantiation for the SHIN DL of our general ABox
summarization framework, in which equivalent constants are non-distinct constants2
stored in the ABox as instances of the same SHIN concepts; our framework
allows using this definition of equivalent constants for the equivalence relation ≡
or any other relevant to the target summary usage. We therefore provide a better
understanding of the SHIN summaries of [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">8,6,7</xref>
        ] by establishing that they are
more specific than the SHIN ABoxes they summarize, and notably to which
extent they are more specific. Further, we remark that our ABox summarization
approach, because it is based on the quotient operation, is more declarative than
that of [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">8,6,7</xref>
        ], by elegantly decoupling the summarization means (i.e., fusing
equivalent constants) from the high-level specicfiation of equivalent constants,
i.e., the equivalence relation ≡ to use. By contrast, SHIN summaries are
deifned by an explicit canonical function that maps ABox constants to summary
one; our function h has similar goal but is implicit because derived from the
ABox and equivalence relation at hand.
      </p>
      <p>
        Another form of ABox summarization is ABox abstraction, which has been
studied for materialization in the Horn-ALCHOI and Horn-SHOIF DLs [
        <xref ref-type="bibr" rid="ref10 ref9">9,10</xref>
        ],
2 In SHIN , constants are stated distinct by using assertions of the built-in role ̸=.
i.e., for pre-computing and storing entailed assertions. The idea is to represent
several assertions of an ABox by a single one in the ABox abstraction. To this
aim, an ABox B is defined as an abstraction of an ABox A if there exists a
mapping from the constants in B to the constants in A that defines a
homomorphism from B to A. Hence, by contrast with our summaries that are more specific
than original ABoxes, abstractions are more general than original ABoxes, thus
capture a subset of their contents.
      </p>
      <p>
        Also related to ABox summarization is ABox modularization, which has been
studied for instance retrieval in the SHIF [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and SHI [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] DLs. It consists in
extracting a subset of an ABox that entails all the assertions in which a given
constant is involved.
      </p>
      <p>
        Finally, less related works on ABox summarization do not summarize an
ABox by another, e.g., [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] which computes statistics about the concept and roles
instances, and Abstract Knowledge Patterns of the form (C1, r, C2) indicating that
the role r relates instances of the concept C1 to instances of the concept C2.
6
      </p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion and perspectives</title>
      <p>We devised a well-founded ABox summarization framework for DLs, by applying
the quotient operation from graph theory to ABoxes and examining the quotient
ABoxes, i.e., summaries, it produces.</p>
      <p>
        With this now well-understood ABox summarization framework in place,
the next necessary step is to study DL-specific equivalence relations between
constants, to explore, visualize or optimize reasoning on and the management
of ABoxes. These equivalence relations might be sought either from scratch or
derived from the ones used for graph summarization if meaningful w.r.t. the DL
semantics, e.g., the well-established and widely adopted ones based on
bisimilation [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] or the recent ones based on the cooccurrence of types and edges [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
Also, ideally, such relations should be devised to avoid to the extent possible
building inconsistent summaries out of consistent ABoxes (recall Example 2).
      </p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
    </sec>
    <sec id="sec-8">
      <title>Appendix</title>
      <p>This work was partially supported by the ANR project CQFD
(ANR-18-CE230003).</p>
      <p>Proof (Proof of Property 1). In case ∃a≡1 , . . . , a≡n A≡ is inconsistent w.r.t. T ,
then Property 1 trivially holds. Otherwise, let I = (∆ I , · I ) be a model of T and
∃a≡1 , . . . , an</p>
      <p>≡ A≡ and let us show that I is also a model of ∃a1, . . . , am A.</p>
      <p>Because I is a model of ∃a≡1 , . . . , a≡n A≡ , there exists a function ψ from
a≡1 , . . . , a≡n to the interpretation domain ∆ I of I such that ψ (a≡i ) ∈ CI for every
CR((aa≡i≡i ),a∈≡j )A∈≡ Aa≡nd 1 ≤ i ≤ n, and such that (ψ (a≡i ), ψ (a≡j )) ∈ RI for every
and 1 ≤ i, j ≤ n.</p>
      <p>Let us consider the composite function ψ ◦ h, where ψ is the above function
and h is the implicit function that defines our ABox-to-summary homomorphism.
This function thus maps a1, . . . , am to the interpretation domain of I and clearly
ψ (h(ai)) ∈ CI for every C(ai) ∈ A and 1 ≤ i ≤ m, because by construction of A≡
C(h(ai)) ∈ A≡ , and such that (ψ (h(ai)), ψ (h(aj))) ∈ RI for every R(ai, aj) ∈
A and 1 ≤ i, j ≤ m, because by construction of A≡ R(h(ai), h(aj)) ∈ A≡ .
Therefore, I is also a model of ∃a1, . . . , am A. ⊔⊓
Proof (Proof of Property 2). If A≡ is consistent w.r.t. T , there must exist some
model I = (∆ I, · I) of A≡ . Let us show that we can build a model of A out of
I, hence that A is consistent w.r.t. T .</p>
      <p>Consider the model J = (∆ J , · J ) denfied as follows: ∆ J = ∆ I, AJ = AI
for every concept name A ∈ NC , RJ = RI for every role name R ∈ NR, and for
every constant a in A, aJ = h(a)I.</p>
      <p>For every concept assertion C(a) ∈ A, C(h(a)) ∈ A≡ by definition of a
quotient ABox, where h is the implicit function defining the ABox-to-summary
homomorphism. Since I is a model of A≡ , I is a model of C(h(a)), i.e., h(a)I ∈
CI. By definition of J , CJ = CI and aJ = h(a)I, thus aJ ∈ CJ , i.e., J is a
model of C(a). J is therefore a model of all the concept assertions in A.</p>
      <p>Similarly, for every role assertion R(ak, al) ∈ A, R(h(ak), h(al)) ∈ A≡ by
definition of a quotient ABox, where h is the implicit function defining the
ABox-to-summary homomorphism. Since I is a model of A≡ , I is a model
of R(h(ak), h(al)), i.e., (h(ak)I, h(al)I) ∈ RI. By definition of J , RJ = RI,
akJ = h(ak)I and alJ = h(al)I, thus (akJ , alJ ) ∈ RJ , i.e., J is a model of
R(ak, al). J is therefore a model of all the role assertions in A.</p>
      <p>It therefore follows that J is a model of all the assertions in A, hence a model
of A. ⊔⊓
Proof (Proof of Property 3). We remark that if a query is asked on an ABox that
is inconsistent w.r.t. its associated TBox, then both true and f alse if the query
is Boolean, and all the tuples of the arity of the query otherwise, are answers.</p>
      <p>If A is not consistent w.r.t. T , then qA,T ̸= ∅, and by the contraposition of
Property 2, A≡ is not consistent w.r.t. T too, therefore qhA≡ ,T ̸= ∅ and Property 3
holds in this first case.</p>
      <p>If A is consistent w.r.t. T , while A≡ is not consistent w.r.t. T (see Example 2),
then Property 3 holds in this second case: qhA≡ ,T ̸= ∅ holds whenever qA,T ̸= ∅
or qA,T = ∅ holds.</p>
      <p>Otherwise, A is consistent w.r.t. T and A≡ is consistent w.r.t. T . Let us
show that Property 3 holds in this third and last case. If the UCQ q has a
non-empty answer set, then at least one of its CQs has a non-empty answer set.
Let q′ be such a CQ in q. Let us assume, without loss of generality, that q′ is
Boolean as the definition of non-Boolean query answers reduces to the Boolean
case. Let us consider some model J of A≡ and T . Let I be the interpretation
such that ∆ I = ∆ J , AI = AJ for every concept name A ∈ NC , RI = RJ for
every role name R ∈ NR, and aI = h(a)J for every constant a in A. Clearly,
by construction, I is homomorphic to J and is a model of A and T . Thus, I
is also a model of q′. Further, J is obviously a model of q′ just in case q′ does
not contain contants, because the constants in A do not have interpretations in
J : ∆ I = ∆ J , AI = AJ for every concept name A ∈ NC , RI = RJ for every
role name R ∈ NR. Otherwise, J is a model of qh′, in which constants from A
are replaced by the constants in A≡ they are represented by: aI = h(a)J for
every constant a in A. Therefore, every model of A≡ and T is a model of qh′,
i.e., qh′ is entailed by A≡ and T , thus has a non-empty answer set, hence qh has
a non-empty answer set too. ⊔⊓
Proof (Proof of Lemma 1). If A is inconsistent w.r.t. T , then A≡ is also
inconsistent w.r.t. T (by contraposition of Property 2), and the Lemma clearly
holds.</p>
      <p>If A is consistent w.r.t. T while A≡ is inconsistent w.r.t. T , then the Lemma
also clearly holds.</p>
      <p>Otherwise, both A and A≡ are consistent w.r.t. T , and let us consider some
model J of A≡ and T . Let I be the interpretation such that ∆ I = ∆ J , AI = AJ
for every concept name A ∈ NC , RI = RJ for every role name R ∈ NR, and
aI = h(a)J for every constant a in A. Clearly, by construction, I is homomorphic
to J and is a model of A and T . Because I is a model of A and T , I is a model of
the concept assertion (mscA,T (ai))(ai) for 1 ≤ i ≤ n. By construction of I, J is
a model of (mscA,T (ai))(h(ai)) for 1 ≤ i ≤ n, where h is the ABox-to-summary
homomorphism, i.e., J is a model of (mscA,T (ai))(a≡ ) for 1 ≤ i ≤ n.</p>
      <p>
        It follows that all the models of A≡ and T are models of (mscA,T (ai))(a≡ )
for 1 ≤ i ≤ n, and of (mscA≡ d,Tn(a≡ ))(a≡ ) (∗ ). By definition of the MSC of
ad≡ n, either (i) mscA≡ d,Tn(ai=≡1) m⪯sTcA,Ti=(a1im)̸⪯scTA,mT(saciA)≡ h,Tol(das≡ o)rh(oiild).mInsctAh≡ e,lTa(tate≡r)c⪯̸ asTe
i=1 mscA,T (ai) and dn
(ii), mscA≡ ,T (a≡ )⊓ i=1 mscA,T (ai) would be strictly subsumed by mscA≡ ,T (a≡ )
w.r.t. T , and because of (∗ ) above, mscA≡ ,T (a≡ ) ⊓ dn
i=1 mscA,T (ai) would
be the MSC of ad≡,n a contradiction. Therefore, we must be in case (i) and
mscA≡ ,T (a≡ ) ⪯ T i=1 mscA,T (ai) holds. ⊔⊓
Proof (Proof of Lemma 2). Lemma 2 follows the well-known relationship
between the MSC of constants and the notion of least common subsumer (LCS)
of concepts [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], i.e., a most specific concept generalizing all of them. A concept
C is a LCS of the concepts C1, . . . , Cn w.r.t. a TBox T if ( i) Ci ⪯ T C holds
for all i ∈ [1, n], and (ii) for any concept D such that Ci ⪯ T D holds for all
i ∈ [1, n], either C ⪯ T D holds or C ̸⪯ T D and D ̸⪯ T C hold. We denote by
lcsT (C1, . . . , Cn) the conjunction of the LCSs of C1, . . . , Cn w.r.t. T and we refer
to it as the LCS of C1, . . . , Cn w.r.t. T . In particular, it is well-known that
mscA,T (a1, . . . , an) and lcsT (mscA,T (a1), . . . , mscA,T (an)) are equivalent [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Further, it is also known that lcsT (mscA,T (a1), . . . , mscA,T (an)) cannot be more
specific than Fin=1 mscA,T (ai), notably they are equivalent in DLs allowing
disjunction (⊔). Thus, Fin=1 mscA,T (ai) ⪯ T mscA,T (a1, . . . , an) holds. ⊔⊓
Proof (Proof of Theorem 1). The proof trivially follows from Lemma 1, Lemma 2,
and the standard first-order semantics of disjunction ⊔ and conjunction ⊓ in
DLs. ⊔⊓
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abriola</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Barceol,´ P.,
          <string-name>
            <surname>Figueira</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Figueira</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Bisimulations on data graphs</article-title>
          .
          <source>In: KR</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.F</given-names>
          </string-name>
          . (eds.):
          <article-title>The Description Logic Handbook</article-title>
          . Cambridge University Press (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bienvenu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ontology-mediated query answering: Harnessing knowledge to get more from data</article-title>
          .
          <source>In: IJCAI</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giacomo</surname>
            ,
            <given-names>G.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Tractable reasoning and eficient query answering in description logics: The DL-Lite family</article-title>
          .
          <source>J. Autom. Reason</source>
          .
          <volume>39</volume>
          (
          <issue>3</issue>
          ) (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cebiric</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Goasdoue,´ F.,
          <string-name>
            <surname>Kondylakis</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kotzinos</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manolescu</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Troullinou</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zneika</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Summarizing semantic graphs: a survey</article-title>
          .
          <source>VLDB J</source>
          .
          <volume>28</volume>
          (
          <issue>3</issue>
          ) (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Dolby</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fokoue</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalyanpur</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kershenbaum</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schonberg</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srinivas</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
          </string-name>
          , L.:
          <article-title>Scalable semantic retrieval through summarization and refinement</article-title>
          . In: AAAI (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Dolby</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fokoue</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalyanpur</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schonberg</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srinivas</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Scalable highly expressive reasoner (SHER)</article-title>
          .
          <source>J. Web Semant</source>
          .
          <volume>7</volume>
          (
          <issue>4</issue>
          ) (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Fokoue</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kershenbaum</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schonberg</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srinivas</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>The summary abox: Cutting ontologies down to size</article-title>
          . In: ISWC (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kazakov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liebig</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tran</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vialard</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Abstraction refinement for ontology materialization</article-title>
          .
          <source>In: ISWC</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kazakov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tran</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Ontology materialization by abstraction refinement in horn SHOIF</article-title>
          . In: Singh,
          <string-name>
            <given-names>S.P.</given-names>
            ,
            <surname>Markovitch</surname>
          </string-name>
          , S. (eds.)
          <source>AAAI</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Goasdoue</surname>
          </string-name>
          ,´ F.,
          <string-name>
            <surname>Guzewicz</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manolescu</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>RDF graph summarization for firstsight structure discovery</article-title>
          .
          <source>VLDB J</source>
          .
          <volume>29</volume>
          (
          <issue>5</issue>
          ) (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Gross</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yellen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Zhang, P. (eds.):
          <source>Handbook of Graph Theory. Discrete Mathematics and Its Applications</source>
          , Chapman &amp; Hall / CRC Press, Taylor &amp; Francis (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heflin</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A scalable approach for partitioning owl knowledge bases</article-title>
          .
          <source>In: International Workshop on Scalable Semantic Web Knowledge Base Systems</source>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Safavi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dighe</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koutra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Graph summarization methods and applications: A survey</article-title>
          .
          <source>ACM Comput. Surv</source>
          .
          <volume>51</volume>
          (
          <issue>3</issue>
          ) (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15. Mo¨ller, R.,
          <string-name>
            <surname>Neuenstadt</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , O¨cze¸p,
          <string-name>
            <given-names>O¨.L.</given-names>
            ,
            <surname>Wandelt</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.:</surname>
          </string-name>
          <article-title>Advances in accessing big data with expressive ontologies</article-title>
          . In: International Workshop on Description Logics (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Palmonari</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rula</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Porrini</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maurino</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spahiu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferme</surname>
          </string-name>
          , V.:
          <article-title>ABSTAT: linked data summaries with abstraction and statistics</article-title>
          . In: ESWC (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Xiao</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poggi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ontology-based data access: A survey</article-title>
          .
          <source>In: IJCAI</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>