<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Nexus of Similarities⋆</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Giovanni Amendola</string-name>
          <email>giovanni.amendola@unical.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pietro Cofone</string-name>
          <email>pietro.cofone@unical.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Manna</string-name>
          <email>marco.manna@unical.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aldo Ricioppo</string-name>
          <email>aldo.ricioppo@unical.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Cyprus</institution>
          ,
          <country country="CY">Cyprus</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Mathematics and Computer Science, University of Calabria</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <fpage>9</fpage>
      <lpage>13</lpage>
      <abstract>
        <p>Recognizing and explaining nexus of similarities between entities is a crucial task that arises in numerous real-life scenarios. Understanding why a group of individuals shares a certain disease or why some products may be more appealing than others are natural examples. Over the last thirty years, a variety of logic-based approaches have been proposed to address this task. However, despite these efforts, there is a gap in available systems capable of exploiting real-world knowledge bases. In response to this gap, the paper presents neXSim, a prototypical web platform based on a recent logic-based framework explicitly designed to be suitable for a practical and effective implementation. This platform, instantiated over BabelNet 4.0, is capable of exploiting very large knowledge bases on-the-fly and in real-time, while providing formal, concise, and human-readable explanations.</p>
      </abstract>
      <kwd-group>
        <kwd>implementation</kwd>
        <kwd>logic-based reasoning</kwd>
        <kwd>knowledge base integration</kwd>
        <kwd>semantic similarity</kwd>
        <kwd>real-time reasoning systems</kwd>
        <kwd>algorithms</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The task of recognizing and expressing commonalities between entities within a knowledge base
has captured the interest of researchers across various disciplines over the past three decades. This
exploration spans diverse AI communities, ranging from Description Logics to Semantic Web and
Database Theory [
        <xref ref-type="bibr" rid="ref1 ref10 ref11 ref2 ref3 ref4 ref5 ref6 ref7 ref8 ref9">1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11</xref>
        ]. Existing approaches vary across four key dimensions:
the form of the input (e.g., pairs of entities, sets of entities, sets of entity tuples), the type of knowledge
bases they can handle (e.g., DL knowledge bases, RDF, relational databases), the scope of knowledge
considered for each input item (e.g., the entire knowledge base or selected excerpts), and the specific
formalism to express commonalities (e.g., DL Concepts, r-graphs, (U)CQs, SPARQL queries, rooted-CQs).
      </p>
      <p>
        Despite the significant research efforts, there remains a gap in available online systems capable of
addressing the considered reasoning task in real-world scenarios by exploiting very large knowledge graphs.
In response to this gap, the paper presents neXSim, a prototypical web platform based on a recent
logic-based framework [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] explicitly designed to be suitable for a practical and effective implementation.
In particular, this framework: (i) can accommodate different types of underlying knowledge bases;
(ii) adopts a notion of summary selector to focus on relevant knowledge about the input items; (iii) is
conceived to handle sets of entity tuples; and (iv) expresses nexus of similarities (i.e., commonalities) via
nearly connected conjunctive formulas (essentially, rooted-CQs plus constants). Overall, it guarantees
that, for any input items, there always exists a finite formula expressing all commonalities, called a
characterization. Moreover, to guarantee concise and human-readable explanations, in the same spirit
as done also for rooted-CQs [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], the framework promotes the adoption of size-wise minimal formulas
among all equivalent characterizations.
      </p>
      <p>To illustrate our approach, consider a Knowledge Base (KB) describing Leonardo da Vinci as a
painter, sculptor, artist, and person of Italian nationality who designed war machines, while it describes
Michelangelo as a painter, sculptor, artist, and person of Italian nationality who wrote sonnets.</p>
      <p>Published in the Joint Proceedings of the Workshops and Doctoral Consortium of the 41st International Conference on Logic Programming, September (CEUR Workshop Proceedings, ISSN 1613-0073).</p>
      <p>
        Our
first goal is to capture the nexus of similarity shared by them via a so-called core characterization [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
The latter, in essence, is the most compact, non-redundant formula that is still exhaustive in listing all
shared properties. From what we know about the two entities, the core characterization is:
x ← isA(x, Painter), isA(x, Sculptor), isA(x, Artist), isA(x, Person), nationality(x, Italian).
It is crucial to note that, at this stage, no information is omitted based on logical entailment. The objective
here is to establish the complete ground truth, expressed in its most concise syntactical form. With this
notion of explanation as our target, this paper introduces neXSim, a research prototype instantiated
over BabelNet 4.0 [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and available at https://tinyurl.com/bdhpvdc7. In the context of Explainable AI,
neXSim is designed to compute such core characterizations for a broad audience. This capability serves
not only human users—from domain experts formalizing knowledge to curious users exploring a topic—
but also automated agents like recommender systems that must justify their suggestions. Accordingly,
the tool allows users to assemble a group of entities with ease, either by searching for them via keywords
or by providing their specific identifiers. The main contributions of this paper are therefore twofold.
First, in Section 3, we provide empirical evidence that core characterizations are a superior alternative to
canonical characterizations [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. These canonical characterizations are built using classical techniques
drawn from various fields such as database theory, Query-by-Example, and logic definability. While
such methods are computationally appealing, they often lead to verbose and redundant results that are
ill-suited for end-users. We show that, in practical scenarios, our approach produces characterizations
that are significantly more compact while remaining just as complete. Second, in Section 4 we introduce
kernel explanations (ker) as a subsequent refinement step, designed for maximum readability by pruning
logically deducible facts from the core characterization. Revisiting our example, the complete core
characterization is refined into a kernel explanation:
      </p>
      <p>x ← isA(x, Painter), isA(x, Sculptor), nationality(x, Italian).</p>
      <p>This is achieved by observing that, in accordance with the information in the KB that we have defined,
being a painter implies being an artist, which in turn implies being a person, making the latter two
concepts redundant for a human reader. We then detail how neXSim efficiently computes these
explanations, and envision future work to further enhance their user-friendliness, for instance by
generating them in natural language.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Existing Framework</title>
      <p>2.0.1. Basics.</p>
      <p>
        We outline the recent framework for characterizing nexus of similarities [
        <xref ref-type="bibr" rid="ref11 ref13">11, 13</xref>
        ].
      </p>
      <p>An atom is a labelled tuple p(t1, ..., tn), where p is a predicate and each ti is a term, either a constant
or a variable. A structure is a set of atoms. A dataset is a variable-free structure. Constants are
called entities in datasets encoding knowledge graphs. A homomorphism from a structure S to a
structure S′, denoted S ⟶ S′, is a map h from terms(S) to terms(S′), where p(t1, ..., tn) ∈ S implies
p(h(t1), ..., h(tn)) ∈ S′ and h(c) = c for each constant c. A (conjunctive) formula φ(x1, ..., xm) is an
expression x1, ..., xm ← p1(t1), ..., ps(ts), where m is its arity, s is its size, each ti is a sequence of terms,
each pi(ti) is an atom, and each xi is a (free) variable occurring in some of the atoms of φ. The output of
φ over a dataset D is the set φ(D) of every tuple ⟨c1, ..., cm⟩ such that atoms(φ) ⟶ D via a homomorphism h
ensuring that each h(xi) = ci.</p>
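        <p>The definitions above can be made concrete with a small sketch. The following Python snippet (our own toy encoding, not part of neXSim: atoms are tuples whose first component is the predicate, and variables are strings prefixed with "?") evaluates a conjunctive formula over a dataset by brute-force homomorphism search.</p>
```python
from itertools import product as cartesian

def is_var(t):
    # Variables are "?"-prefixed strings; everything else is a constant.
    return isinstance(t, str) and t.startswith("?")

def output(free_vars, atoms, dataset):
    """Set of tuples of constants that the formula yields over the dataset."""
    variables = sorted({t for a in atoms for t in a[1:] if is_var(t)})
    constants = sorted({t for a in dataset for t in a[1:]})
    results = set()
    for combo in cartesian(constants, repeat=len(variables)):
        h = dict(zip(variables, combo))
        # A homomorphism maps each constant to itself and each variable via h.
        mapped = {(a[0],) + tuple(h.get(t, t) for t in a[1:]) for a in atoms}
        if mapped.issubset(dataset):
            results.add(tuple(h[x] for x in free_vars))
    return results

D = {("isA", "leo", "Painter"), ("isA", "mic", "Painter"),
     ("nationality", "leo", "Italian"), ("nationality", "mic", "Italian")}
phi = [("isA", "?x", "Painter"), ("nationality", "?x", "Italian")]
assert output(["?x"], phi, D) == {("leo",), ("mic",)}
```
        <p>Real implementations avoid this exhaustive enumeration, but the sketch matches the semantics stated above.</p>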
      <sec id="sec-2-1">
        <title>2.0.2. Selective knowledge bases.</title>
        <p>
          A knowledge base (KB) is a pair K = (D, Σ), where D is a dataset and Σ is an (onto)logical first-order
theory [
          <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
          ]. As is customary, an atom α is entailed by K if it occurs in every model of K. The set of
atoms entailed by K is denoted by ent(K). A selective knowledge base (SKB) is a pair S = (K, σ), where
K is a KB and σ is a summary selector, namely an algorithm that takes as input K together with a tuple
t of constants and returns a dataset σ(K, t) ⊆ ent(K) representing a summary of K (w.r.t. t). When K is
clear from the context, we write σ(t) instead of σ(K, t). Angle brackets may be omitted for unary tuples.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.0.3. Explanation language.</title>
        <p>A nearly connected formula (NℂF) is a formula φ(x1, ..., xm) such that the structure atoms(φ) ∪
{dummy(x1, ..., xm)} is connected.1 Intuitively, each atom of φ can “reach” some free variable. For
example, x ← isA(x, ap), located(x, y), partOf(y, US) and x, y ← isA(x, ap), partOf(y, US) are both
nearly connected, but x ← isA(x, ap), partOf(y, US) is not. An instance of φ according to an SKB
S = (K, σ) is any tuple t of constants that belongs to φ(σ(t)). The set of all instances of φ is denoted by
inst(φ, S). Clearly, inst(φ, S) ⊆ φ(ent(K)). Given two NℂFs φ1(x1, ..., xm) and φ2(x1, ..., xm), by φ1 ⟶ φ2
we mean that atoms(φ1) ⟶ atoms(φ2) via a homomorphism h ensuring that each xi = h(xi).</p>
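        <p>The connectivity condition can be checked mechanically. Below is a small sketch in Python (our own encoding, not from the paper's implementation) that tests near connectedness on the three examples above by flood-filling over shared terms.</p>
```python
def is_nearly_connected(free_vars, atoms):
    # Add the dummy atom over the free variables, then check whether the
    # whole set of atoms is connected via shared terms.
    all_atoms = list(atoms) + [("dummy",) + tuple(free_vars)]
    groups = [set(a[1:]) for a in all_atoms]  # one term set per atom
    merged = True
    while merged:  # repeatedly merge groups that share at least one term
        merged = False
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                if groups[i] and groups[j] and groups[i].intersection(groups[j]):
                    groups[i] |= groups[j]
                    groups[j] = set()
                    merged = True
    return sum(1 for g in groups if g) == 1

# The three examples from the text:
assert is_nearly_connected(["x"], [("isA", "x", "ap"), ("located", "x", "y"), ("partOf", "y", "US")])
assert is_nearly_connected(["x", "y"], [("isA", "x", "ap"), ("partOf", "y", "US")])
assert not is_nearly_connected(["x"], [("isA", "x", "ap"), ("partOf", "y", "US")])
```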
      </sec>
      <sec id="sec-2-3">
        <title>2.0.4. Characterizations.</title>
        <p>
          Fix an SKB S = (K, σ) and a set U = {t1, ..., tn} of k-ary tuples of constants, called a (k-ary) unit. An NℂF
φ explains (some nexus of similarity between the tuples of) U if inst(φ, S) ⊇ U, whereas φ characterizes
(the nexus of similarity between the tuples of) U if φ explains U and φ′ ⟶ φ whenever φ′ explains U.
Intuitively, φ characterizes U if it encompasses the nexus of similarities expressed by any other NℂF
explaining U. To ensure the existence of characterizations, we assume that both ent(K) and any σ(t)
contain a special atom ⊤(c) for each of their constants c. By adapting the well-known direct product
operator between structures and taking into account the properties of NℂFs (presence of constants and
nearly connectedness), it is possible to build a canonical characterization of U, denoted by can(U, S),
as described in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. As will become clearer in Section 3, can(U, S) may contain redundant atoms. They
can, however, be eliminated by keeping only a core2 of the atoms of can(U, S) that preserves all free
variables. This (generally) smaller characterization is denoted by core(U, S).
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Canonical vs Core Characterizations</title>
      <p>The objective of this section is to demonstrate that, even within small knowledge bases, core
characterizations are preferable to canonical ones in terms of explainability, readability, and
conciseness.</p>
      <p>1. Inductively, a set of atoms is connected if it is a singleton or can be partitioned into two connected sets sharing at least one
term.</p>
      <p>2. A core of a structure S is any S′ ⊆ S such that S ⟶ S′ and there is no S″ ⊂ S′ such that S′ ⟶ S″.</p>
      <p>
        From a theoretical viewpoint, we know there are families {(Un, Sn)}n&gt;1 of SKBs and units of size
proportional to n, for which the sizes of can(Un, Sn) and core(Un, Sn)
coincide [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. But there are
families where the size of can(Un, Sn) is exponential in n, while the size of core(Un, Sn) is constant. As
an example, consider Un = {a1, ..., an}, Sn = (Kn, σn), Kn = (Dn, ∅), D0 = ∅,
Dn = {p(an, bn), p(an, cn)} ∪ ⋃0&lt;i&lt;n Di, and
σn(Kn, e) = {p(e, e′), ⊤(e′) | p(e, e′) ∈ Dn} ∪ {⊤(e)}.
      </p>
      <p>In particular, |can(Un, Sn)| = 1 + 2^(n+1), while
|core(Un, Sn)| = 3. It is therefore important to understand, in practical cases, the actual ratio between the
two forms of characterizations. To this aim, in what follows, we present an excerpt from an experimental
analysis conducted on a small real-world knowledge base in the movies domain.</p>
      <p>The considered knowledge base comprises data on 130 notable films from the past 30 years. It also
includes profiles of influential industry figures. The dataset’s structure is based on binary relations,
connecting two entities through predicates such as directedBy or starring, to represent facts within the
domain. In particular, it comprises 399 entities, 1,273 binary relations as extensional knowledge, and
an ontology consisting of 13 rules: one for the transitive closure of the isA relation, and the others to
derive roles of the entities (e.g., director or writer). On average, each entity has 15 items in its summary.
Then, we randomly generated and tested 10,000 units for each cardinality between 2 and 4.</p>
      <p>For each size of the considered units, we analyzed the trends about the ratio between the sum of the
sizes of all canonical characterizations and the sum of the sizes of all core characterizations under four
different assumptions: (A1) all units; (A2) only units where the instances of their characterizations do
not coincide with all possible entities; (A3) as in (A1) but removing ⊤-atoms from characterizations; (A4)
as in (A2) but removing ⊤-atoms from characterizations. As will become clear in Section 4.4, the last two
trends have been considered since ⊤-atoms can generally and safely be omitted for readability. Figure 1
illustrates the essence of our findings: overall, the size of canonical characterizations grows increasingly
larger than that of core characterizations as the size of the considered units increases; indeed, all
four considered trends are superlinear. We close by underlining that, while the maximum size of
canonical characterizations was 9,705 (resp., 4,852) with (resp., without) ⊤-atoms, the maximum size
of core characterizations was 29 (resp., 15). Full details are available at https://tinyurl.com/3sks2tax.</p>
    </sec>
    <sec id="sec-4">
      <title>4. From theory to practice with neXSim</title>
      <p>
        This section introduces neXSim 0.1-alpha, the initial prototype system implementing the framework
outlined in the previous section. Currently, the system is capable of dealing with unary units of arbitrary
size, characterized according to SKBs that are defined from RDF knowledge graphs (KGs) paired with
simple ontological Datalog rules. As a proof of concept, neXSim has been instantiated with BabelNet
4.0.1, a well-known generalist RDF KG [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <sec id="sec-4-1">
        <title>4.1. From RDF Knowledge Graphs to SKBs</title>
        <p>From a given RDF KG G, we illustrate the shape of the SKB S = (K, σ) with K = (D, Σ) that neXSim
conceptually associates to G at present. For each triple (s, p, o) in G, the dataset D contains the atom
p(s, o). Additionally, if p is of a hypernymy kind (e.g., isA) or of a holonymy kind (e.g., partOf), then Σ
contains the following rule: p(x, z) ← p(x, y), p(y, z). Clearly, for example, if both isA(a, b) and isA(b, c)
belong to D, then both of them, together with isA(a, c), belong to ent(K). In what follows, we assume
that relations of a hypernymy or a holonymy kind are acyclic. When this is not the case, G has to be
cleaned or repaired in advance. Finally, for each entity e occurring in D, we have that σ(e) consists of
the set of atoms {p(e, e′), ⊤(e), ⊤(e′) | p(e, e′) ∈ ent(K)}.</p>
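        <p>Under our own toy encoding (atoms as tuples; the set TRANSITIVE is an assumption standing in for the hypernymy/holonymy predicates), the triple-to-atom translation and the effect of the transitivity rule can be sketched as follows.</p>
```python
# Hypothetical stand-in for the hypernymy/holonymy predicates of Section 4.1.
TRANSITIVE = {"isA", "partOf"}

def to_atoms(triples):
    # Each RDF triple (s, p, o) becomes the binary atom p(s, o).
    return {(p, s, o) for (s, p, o) in triples}

def entail(atoms):
    # Naive fixpoint: close every transitive predicate, mimicking ent(K)
    # for the rule p(x, z) <- p(x, y), p(y, z).
    closed = set(atoms)
    changed = True
    while changed:
        changed = False
        for (p, x, y) in list(closed):
            if p in TRANSITIVE:
                for (q, y2, z) in list(closed):
                    if q == p and y2 == y and (p, x, z) not in closed:
                        closed.add((p, x, z))
                        changed = True
    return closed

D = to_atoms({("a", "isA", "b"), ("b", "isA", "c")})
assert ("isA", "a", "c") in entail(D)  # isA(a, c) is derived, as in the text
```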
        <p>The framework illustrated in Section 2 may accommodate alternative representations. Indeed, one
could encode a triple (s, p, o) in G as the atom triple(s, p, o). Consider, however, the triples (a, isA, person),
(a, founded, Google), (b, isA, person), (b, uses, Google), the unit U = {a, b}, and σ as above. If we encode
triples as binary atoms, then a characterization of U looks like x ← isA(x, person); whereas, if we
encode triples as ternary atoms, then it looks like x ← triple(x, isA, person), triple(x, y, Google). Since
we believe that the latter is not substantially more informative, we currently prefer simpler explanations.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Dealing with Hypernyms</title>
        <p>Fix an SKB S = (K, σ) as illustrated in Section 4.1, including the binary predicate isA and the ontological
rule isA(x, z) ← isA(x, y), isA(y, z). Consider a scenario where we have a unit U = {e1, e2}, with
both entities being students. Since isA(student, person) belongs to ent(K), it is reasonable to avoid
reporting that both entities of U are also persons. Hence, from core(U, S), we could choose to “hide”
the atom isA(x, person), as it may be obvious with respect to Σ, while displaying isA(x, student). By
following this intuition, we can both speed up the computation of nexus explanations and improve the
user experience. To achieve this, we define a suitable variant of σ, called σU.</p>
        <p>An ancestor of an entity e is any e′ such that isA(e, e′) belongs to ent(K). Consider a unary unit
U = {e1, ..., en}. A common ancestor for U is any entity a that is an ancestor of each entity in U. A
least common ancestor for U is any common ancestor a such that, for each isA(a′, a) ∈ ent(K), a′ is not
a common ancestor for U. We denote by lca(U) the set of all least common ancestors for U. For
each e ∈ U, we define σU(e) as the set {p(e, e′) ∈ σ(e) | p ≠ isA} ∪ {isA(e, a) ∈ σ(e) | a ∈ lca(U)} ∪
{isA(e, e′) ∈ σ(e) | e′ occurs in some non-isA atom of σ(e)}. Intuitively, we retain the following: all atoms for
non-hypernymy relations, the atoms containing the least common ancestors for U, and the atoms containing
common ancestors that also occur in non-hypernymy relations.</p>
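        <p>As a sketch, the selector variant can be rendered as the following Python filter (helper names are ours and merely illustrative; atoms are (predicate, subject, object) tuples, and the lca set is assumed to be given).</p>
```python
def sigma_u(summary, lca_set):
    # Keep: (1) all non-isA atoms, (2) isA atoms pointing to a least common
    # ancestor, (3) isA atoms whose target also occurs in a non-isA atom.
    non_isa = {a for a in summary if a[0] != "isA"}
    non_isa_terms = {t for a in non_isa for t in a[1:]}
    kept_isa = {a for a in summary if a[0] == "isA"
                and (a[2] in lca_set or a[2] in non_isa_terms)}
    return non_isa | kept_isa

summary = {("isA", "s1", "student"), ("isA", "s1", "person"),
           ("enrolledIn", "s1", "unical")}
# With lca(U) = {student}, the obvious fact isA(s1, person) is hidden:
assert sigma_u(summary, {"student"}) == {("enrolledIn", "s1", "unical"),
                                         ("isA", "s1", "student")}
```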
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Virtualizing SKBs via Graph Databases</title>
        <p>
To efficiently deal with very large RDF-based SKBs such as BabelNet, neXSim adopts the NoSQL graph
database management system Neo4j [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. In the rest of the section, fix an RDF graph G. Let S = (K, σ)
with K = (D, Σ) be the SKB conceptually associated to G, as illustrated in Section 4.1. To create, and
navigate on-the-fly, an instance N of Neo4j that allows one to virtually interact with S, we exploit the
built-in query language of Neo4j called Cypher [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. For example, to insert into N an atom p(s, o) of D,
we use the query:
        </p>
        <p>MERGE (n1 {id: s})
MERGE (n2 {id: o})
MERGE (n1)-[:p]-&gt;(n2);</p>
        <p>Moreover, to retrieve all entities directly connected to some entity s of D via some relation p, namely
those in the set {o | p(s, o) ∈ D}, we can use the following path query:</p>
        <p>MATCH ({id: s})-[:p]-&gt;(x)
WITH DISTINCT x.id as y RETURN y;</p>
        <p>In particular, if p is a transitive relation (like isA or partOf), we can adapt the above path query by
replacing [:p] with [:p*], resulting in a variable-length path query. If so, the result would now be
the set {o | p(s, o) ∈ ent(K)}.</p>
        <p>To compute lca(U), we first temporarily add to both D and N a dummy atom isA(♡, e) for each
e ∈ U. Then, we exploit a Cypher procedure to retrieve the set hyper(U) = {(♡, e) | isA(♡, e) ∈
D} ∪ {(a, b) | isA(a, b) ∈ D ∧ isA(♡, a) ∈ ent(K)} as follows:</p>
        <p>MATCH (node {id:♡})
CALL apoc.path.subgraphAll(node,
  {relationshipFilter:'isA&gt;'})
YIELD nodes, relationships
UNWIND relationships as relation
WITH startNode(relation).id as x,
  endNode(relation).id as y,
  type(relation) as rel
WHERE rel = 'isA'
RETURN x, y;</p>
        <p>Finally, from hyper(U) it is possible to compute lca(U) either directly in Python or via a simple set of
Datalog rules with stratified negation [18], available at https://tinyurl.com/3sks2tax.</p>
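        <p>For illustration, one possible direct Python computation of lca(U) from the hyper(U) edge set looks as follows (names and the "HEART" stand-in for ♡ are ours; the platform's actual code may differ).</p>
```python
def lca_from_hyper(edges):
    # edges: pairs (a, b) meaning isA(a, b); "HEART" marks the dummy node.
    units = {b for (a, b) in edges if a == "HEART"}
    isa = {(a, b) for (a, b) in edges if a != "HEART"}

    def up(e):  # all ancestors of e via transitive isA steps
        seen, frontier = set(), {e}
        while frontier:
            nxt = {b for (a, b) in isa if a in frontier} - seen
            seen |= nxt
            frontier = nxt
        return seen

    common = set.intersection(*(up(e) for e in units))
    # Least = no isA-child of it is itself a common ancestor.
    return {a for a in common
            if not any(b == a and a2 in common for (a2, b) in isa)}

edges = {("HEART", "s1"), ("HEART", "s2"),
         ("s1", "student"), ("s2", "student"), ("student", "person")}
assert lca_from_hyper(edges) == {"student"}
```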
        <p>Algorithm 1 Kernel of U according to S = (K, σ)
Input: σU(e1), ..., σU(en)
Output: ker(U, S)
1: A := σU(e1) and t := e1
2: for ℓ ∈ {2, ..., n} do
3:   (B, u) := (σU(eℓ), eℓ)
4:   (A, t) := Prod&amp;Core(A, B, t, u)
5: end for
6: return t ← A</p>
        <p>Algorithm 2 Prod&amp;Core
Input: A, B, t, u
Output: Core(A ♦ B), t ♦ u
1: h1 := {t ↦ x(t,u)} and h2 := {u ↦ x(t,u)}
2: P := predicates(A) ∩ predicates(B)
3: A := h1(A|P) and B := h2(B|P)
4: Keep := {p(v, c) ∈ A ∩ B ∣ #c(A ∪ B) = 1}
5: A := A ∖ {p(v, c) ∈ A ∣ #c(A) = 1}
6: B := B ∖ {p(v, c) ∈ B ∣ #c(B) = 1}
7: P′ := P ∖ predicates(Keep)
8: C := (A ♦ B) ∪ Keep ∪ {p(x(t,u), yp) ∣ p ∈ P′}
9: C := Core(C)
10: return (C, x(t,u))</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Kernel Explanations</title>
        <p>As discussed in Section 4.2, when dealing with RDF-based SKBs, we can also enhance the user experience
by further refining the shape of core(U, S) to hide nexus that may be considered obvious. To this aim,
we define the kernel (explanation) of U according to S, denoted by ker(U, S). What we
obtain is not precisely a characterization for U according to S. Nonetheless, such a new formula can be
understood as a special explanation that provides the user with all the “fundamental” and “expected”
nexus of similarity expressed by core(U, S). To formally define ker(U, S), we exploit an ad hoc variant
of S, consisting of the ⊤-free SKB SU = (K, σU), where σU is defined as in Section 4.2. Note that SU
is not interesting in its own right, but has to be considered only as a technical tool for dealing with the
kernel explanation of the unit U at hand. Interestingly, if core(U, S) = x ← ⊤(x), then ker(U, S) does
not exist; in all other cases, it is not difficult to show that the sets of all instances of core(U, S) and
ker(U, S) according to S coincide. Currently, the neXSim platform has been optimized to furnish
users with kernel explanations in place of core characterizations.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Computing Kernel Explanations</title>
        <p>We now show how to efficiently construct ker(U, S). Before analyzing the algorithm, we introduce a
convenient variant of the well-known direct product operator ⊗, here denoted by ♦ to avoid confusion.
Essentially, given two terms t and t′, the product t ♦ t′ is the variable x(t,t′) if t ≠ t′, or the term
t itself otherwise. With this, the product between structures behaves as usual [19]. For example,
given A1 = {p(y, 1)} and A2 = {p(2, 3), p(y, 4), p(3, 1)}, the product A1 ♦ A2 is the structure {p(x(y,2), x(1,3)),
p(y, x(1,4)), p(x(y,3), 1)}. We are now ready to explain the behavior of Algorithm 1 and Algorithm 2. Let us
start by describing the latter. The input of Algorithm 2 is given by two structures, namely
A and B, and two terms, namely t and u. Assuming that p(v, w) ∈ A implies v = t and p(v, w) ∈ B
implies v = u, our algorithm returns the pair (Core(A ♦ B), t ♦ u), where p(v, w) ∈ Core(A ♦ B)
implies v = t ♦ u. This assumption is always satisfied whenever we call Algorithm 2 as a
subroutine of Algorithm 1. In line 1, we introduce two functions, h1 and h2, which are identities outside
their domains. They map t to x(t,u) and u to x(t,u), respectively, without altering terms common to both A
and B. In line 2, we store the common predicates of A and B in the set P. This is done because atoms
whose predicates are not in the common signature are directly eliminable. In line 3, we project A and B onto their common
signature, renaming t ∈ A and u ∈ B as x(t,u). This enables the intersection of the two sets. Note
that when Algorithm 2 is used within Algorithm 1, the distinct elements e1, …, en necessitate renaming
to avoid an invariably empty intersection. In line 4, we define the set Keep. This set comprises all atoms
p(v, c) common to both A and B, with c occurring exactly once in A ∪ B. The distinctive role of Keep is
as follows: for any atom p(v, c′) ∈ B, when p(v, c) ♦ p(v, c′) is considered, with p(v, c) ∈ Keep, the term
c ♦ c′ appears only once in Keep ♦ B ⊆ A ♦ B. It can be mapped back to c, present in Keep ♦ Keep ⊆ A ♦ B,
making the materialization of this atom superfluous. Lines 5 and 6 eliminate atoms with uniquely
occurring terms from the initial structures. Line 7 stores in P′ the predicates common to A and B but
absent in Keep. Line 8 adds to the final structure atoms with fresh variables for each predicate in P′,
necessary due to their absence in Keep and their removal from A and B. Keep and the residual
elements of A ♦ B are also included. In line 9, the core of the structure is computed using an external logic
program that verifies each atom’s membership in the core. The element t ♦ u is not fixed because, by
hypothesis, it uniquely appears in the first position of every atom, ensuring its presence in the core
output of Algorithm 1. We conclude by returning the pair (Core(A ♦ B), t ♦ u). The process described in
Algorithm 1 is straightforward: it iterates over all input summaries, as outlined in Section 4.2,
and applies Algorithm 2 repeatedly.</p>
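        <p>The ♦ operator on the worked example above can be sketched as follows (our own encoding, not the platform's code: a product variable x(t,u) is represented as the tuple ("x", t, u), and atoms are (predicate, term, term) triples).</p>
```python
def diamond(t, u):
    # t ♦ u is t itself if the terms coincide, and the variable x(t,u) otherwise.
    return t if t == u else ("x", t, u)

def product(A, B):
    # Direct product of two structures of binary atoms over a shared predicate.
    return {(p, diamond(t1, u1), diamond(t2, u2))
            for (p, t1, t2) in A for (q, u1, u2) in B if p == q}

A1 = {("p", "y", 1)}
A2 = {("p", 2, 3), ("p", "y", 4), ("p", 3, 1)}
# Matches the example in the text: {p(x(y,2), x(1,3)), p(y, x(1,4)), p(x(y,3), 1)}
assert product(A1, A2) == {("p", ("x", "y", 2), ("x", 1, 3)),
                           ("p", "y", ("x", 1, 4)),
                           ("p", ("x", "y", 3), 1)}
```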
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Future Directions</title>
      <p>Future work will proceed along two primary directions. The first is to increase the explanatory power of
our method by extending it to capture non-local similarities, i.e., those spanning beyond the immediate
neighborhood of the input entities. This is a non-trivial task that will require us to revisit the mechanisms
for computing direct products and cores of structures under more relaxed assumptions. The second
direction is to enhance the system’s accessibility. To this end, we will focus on automatically translating
the logical formulas into natural language, making the discovered insights truly accessible to the
widest possible audience.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work significantly contributed to the basic research activities of WP9.1: “KRR Frameworks for
Green-aware AI”, supported by the FAIR project (PE_00000013, CUP H23C22000860006) – Spoke 9,
within the NRRP MUR program funded by NextGenerationEU. Additional support was provided by
the NRRP MUR project “Tech4You” (ECS00000009, CUP H23C22000370006) – Spoke 6, funded by
NextGenerationEU; by the PRIN project PRODE “Probabilistic Declarative Process Mining” (Project
ID 20224C9HXA, CUP H53D23003420006), funded by the Italian Ministry of University and Research
(MUR); by the PN RIC project ASVIN “Assistente Virtuale Intelligente di Negozio” (F/360050/02/X75, CUP
B29J24000200005), under the Italian National Program for Research, Innovation and Competitiveness;
by the project STROKE 5.0 “Progetto, sviluppo e sperimentazione di una piattaforma tecnologica di
servizi di intelligenza artificiale a supporto della gestione clinica integrata di eventi acuti di ictus”
(F/310031/02/X56, CUP B29J23000430005), funded by the Italian Ministry of Enterprises and Made
in Italy (MIMIT); by the NRRP project RAISE “Robotics and AI for Socio-economic Empowerment”
(ECS00000035, CUP H53C24000400006), through the GOLD sub-project “Management and Optimization
of Hospital Resources through Data Analysis, Logic Programming, and Digital Twin”; and by the
EI-TWIN project (F/310168/05/X56, CUP B29J24000680005), funded by the former Ministry of Industrial
Development (MISE, now MIMIT).</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used GPT-4 for grammar and spelling checking.
After using this tool, the authors reviewed and edited the content as needed and take full responsibility
for the publication’s content.</p>
      <p>[17] P. Selmer, A. Taylor, Cypher: An evolving query language for property graphs, in: SIGMOD
Conference, ACM, 2018, pp. 1433–1445.
[18] S. Abiteboul, R. Hull, V. Vianu, Foundations of Databases, Addison-Wesley, 1995.
[19] B. ten Cate, V. Dalmau, The product homomorphism problem and applications, in: M. Arenas,
M. Ugarte (Eds.), 18th International Conference on Database Theory, ICDT 2015, volume 31 of LIPIcs,
Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015, pp. 161–176. doi:10.4230/LIPIcs.ICDT.2015.161.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>W. W.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Borgida</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hirsh</surname>
          </string-name>
          ,
          <article-title>Computing least common subsumers in description logics</article-title>
          , in: AAAI, AAAI Press / The MIT Press,
          <year>1992</year>
          , pp.
          <fpage>754</fpage>
          -
          <lpage>760</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Küsters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Molitor</surname>
          </string-name>
          ,
          <article-title>Computing least common subsumers in description logics with existential restrictions</article-title>
          , in: IJCAI, Morgan Kaufmann,
          <year>1999</year>
          , pp.
          <fpage>96</fpage>
          -
          <lpage>103</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sertkaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Turhan</surname>
          </string-name>
          ,
          <article-title>Computing the least common subsumer w.r.t. a background terminology</article-title>
          ,
          <source>J. Appl. Log.</source>
          <volume>5</volume>
          (
          <year>2007</year>
          )
          <fpage>392</fpage>
          -
          <lpage>420</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Colucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Donini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Giannini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Sciascio</surname>
          </string-name>
          ,
          <article-title>Defining and computing least common subsumers in RDF</article-title>
          ,
          <source>J. Web Semant.</source>
          <volume>39</volume>
          (
          <year>2016</year>
          )
          <fpage>62</fpage>
          -
          <lpage>80</lpage>
          . URL: https://doi.org/10.1016/j.websem.2016.02.001. doi:10.1016/j.websem.2016.02.001.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Hassad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Goasdoué</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jaudoin</surname>
          </string-name>
          ,
          <article-title>Learning commonalities in RDF</article-title>
          , in: ESWC (1), volume
          <volume>10249</volume>
          of Lecture Notes in Computer Science,
          <year>2017</year>
          , pp.
          <fpage>502</fpage>
          -
          <lpage>517</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Petrova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sherkhonov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>Entity comparison in RDF graphs</article-title>
          , in:
          <source>ISWC (1)</source>
          , volume
          <volume>10587</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2017</year>
          , pp.
          <fpage>526</fpage>
          -
          <lpage>541</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Petrova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. V.</given-names>
            <surname>Kostylev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <article-title>Query-based entity comparison in knowledge graphs revisited</article-title>
          , in:
          <source>ISWC (1)</source>
          , volume
          <volume>11778</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2019</year>
          , pp.
          <fpage>558</fpage>
          -
          <lpage>575</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wolter</surname>
          </string-name>
          ,
          <article-title>Least general generalizations in description logic: Verification and existence</article-title>
          , in: AAAI, AAAI Press,
          <year>2020</year>
          , pp.
          <fpage>2854</fpage>
          -
          <lpage>2861</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Pulcini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wolter</surname>
          </string-name>
          ,
          <article-title>Logical separability of labeled data examples under ontologies</article-title>
          ,
          <source>Artif. Intell.</source>
          <volume>313</volume>
          (
          <year>2022</year>
          )
          <elocation-id>103785</elocation-id>
          . URL: https://doi.org/10.1016/j.artint.2022.103785. doi:10.1016/j.artint.2022.103785.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>B.</given-names>
            <surname>ten Cate</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dalmau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Funk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <article-title>Extremal fitting problems for conjunctive queries</article-title>
          , in: PODS, ACM,
          <year>2023</year>
          , pp.
          <fpage>89</fpage>
          -
          <lpage>98</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Amendola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Manna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ricioppo</surname>
          </string-name>
          ,
          <article-title>A logic-based framework for characterizing nexus of similarity within knowledge bases</article-title>
          ,
          <source>Information Sciences</source>
          <volume>664</volume>
          (
          <year>2024</year>
          )
          <elocation-id>120331</elocation-id>
          . URL: https://www.sciencedirect.com/science/article/pii/S0020025524002445. doi:10.1016/j.ins.2024.120331.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bevilacqua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Conia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Montagnini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cecconi</surname>
          </string-name>
          ,
          <article-title>Ten years of BabelNet: A survey</article-title>
          , in: IJCAI, ijcai.org,
          <year>2021</year>
          , pp.
          <fpage>4559</fpage>
          -
          <lpage>4567</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Agresta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Amendola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cofone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Manna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ricioppo</surname>
          </string-name>
          ,
          <article-title>Characterizing nexus of similarity between entities</article-title>
          , in: IPS-RCRA-SPIRIT@AI*IA, volume
          <volume>3585</volume>
          of
          <source>CEUR Workshop Proceedings</source>
          , CEUR-WS.org,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wolter</surname>
          </string-name>
          ,
          <article-title>Query and predicate emptiness in ontology-based data access</article-title>
          ,
          <source>J. Artif. Intell. Res.</source>
          <volume>56</volume>
          (
          <year>2016</year>
          )
          <fpage>1</fpage>
          -
          <lpage>59</lpage>
          . URL: https://doi.org/10.1613/jair.4866. doi:10.1613/jair.4866.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Calautti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Molinaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Trubitsyna</surname>
          </string-name>
          ,
          <article-title>Preference-based inconsistency-tolerant query answering under existential rules</article-title>
          ,
          <source>Artif. Intell.</source>
          <volume>312</volume>
          (
          <year>2022</year>
          )
          <elocation-id>103772</elocation-id>
          . URL: https://doi.org/10.1016/j.artint.2022.103772. doi:10.1016/j.artint.2022.103772.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Sandell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Asplund</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. Y.</given-names>
            <surname>Ayele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Duneld</surname>
          </string-name>
          ,
          <article-title>Performance comparison analysis of ArangoDB, MySQL, and Neo4j: An experimental study of querying connected data</article-title>
          , in: HICSS, ScholarSpace,
          <year>2024</year>
          , pp.
          <fpage>7760</fpage>
          -
          <lpage>7769</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>N.</given-names>
            <surname>Francis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Green</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Guagliardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Libkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lindaaker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Marsault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Plantikow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rydberg</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>