<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CONNOR: Exploring Similarities in Graphs with Concepts of Neighbors</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hugo Ayats</string-name>
          <email>hugo.ayats@irisa.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peggy Cellier</string-name>
          <email>peggy.cellier@irisa.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sébastien Ferré</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Univ Rennes, INSA, CNRS, IRISA. Campus de Beaulieu</institution>
          ,
          <addr-line>35042 Rennes</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <abstract>
        <p>Since its first formalization, the Formal Concept Analysis (FCA) field has shown diverse extensions of the FCA paradigm. A recent example is Graph-FCA, an extension of FCA to graphs. In the context of Graph-FCA, a notion of concept of neighbors has been introduced to support a form of nearest neighbor search over the nodes of a graph. Concepts of neighbors have been used for diverse tasks, such as knowledge graph completion and relation classification in texts. In this paper, we present CONNOR, a Java library for the computation of concepts of neighbors on RDF graphs.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Theoretical Background</title>
      <p>This section introduces definitions related to the concepts of neighbors. Details can be found in
the reference journal paper [7].</p>
      <p>Definition 1. A graph context is a triple  = (, ,  ) where  is a set of objects,  a set of
attributes and  ⊆  ∗ ×  is an incidence relation between tuples of objects and attributes.</p>
      <p>Definition 2. A graph pattern  ⊆  ∗ ×  is a set of labeled directed edges and labeled nodes
with variables as nodes and attributes as labels.</p>
      <p>A projected graph pattern (PGP) is a couple  = ( ,  ) where  is a graph pattern and  ∈  ∗
a projection tuple. The arity of a PGP is the length of  . A PGP of arity  is also called a  -PGP.</p>
      <p>Figure 1 (b) represents a 2-PGP defining the ”sibling” binary relation (two persons sharing
a male parent and a female parent). Projection variables are in bold. In practice, PGPs can be
seen as queries on the graph context. In this paper we call set of answers of a  -PGP the set of
the  -tuples of objects that are the answers of the PGP considered as a query.</p>
      <p>A PGP inclusion relation (noted ⊆ ) can be defined to express the fact that a  -PGP  1
is more specific than another  -PGP  2 (i.e. by renaming the variables of  2 we can obtain a
PGP having the same projection tuple than  1 and having its graph pattern included in the
graph pattern of  1). In this case, we denote  2 ⊆  1. If  1 ⊆  2 and  2 ⊆  1, they are said
equivalent ( 1 ≡  2). The most specific query for a given answer set is the query (under
equivalency relation) that is more specific to every other queries having this set as answer set.
Definition 3. A graph concept of arity  (also called  -concept) is a pair (, ) ,  being a set of
 -tuples of objects (called extension) and  being a  -PGP (called intension), such that:
•  is the set of answers of  ;
•  is the most specific  -PGP having  as set of answers (under the equivalence relation).</p>
      <p>For example, let us consider (, ) where  is the PGP presented in Figure 1 (b) and  is the set
of couples: {(    ,   ), … , (    ,     ), ( , ℎ  ), … , (ℎ  , ℎ  )} .
This concept represents the notion of sibling: the intension is a PGP describing the couple of
persons having a same mother and a same father, and the extension all the couples of entities
respecting this pattern.</p>
      <p>Definition 4. The conceptual distance between two  -tuples of objects  1 and  2 is the  -concept
(  1,  2) = (, ) where the intension  is the most specific  -PGP such that the extension  contains
 1 and  2.</p>
      <p>It can be proven that this conceptual distance has properties similar to a numerical distance,
such as positivity, symmetry and triangle inequality, by using the concept extension inclusion
as a partial order and a notion of conceptual supremum as addition. The extensional
distance (  1,  2) = |(  1,  2).| can be used as a (degraded) numerical distance to evaluate the
dissimilarity between  1 and  2 as the number of  -tuples between them.</p>
      <p>The concepts of neighbours of a  -tuple  is the set of  -concepts C-N () =
Definition 5.
{(  ′, )|  ′ ∈   }.</p>
      <p>The proper extension of a concept .  for  ∈ C-N () is the set of  -tuples of its
extension that are not in the extension of a more specific concept. The set of proper extensions
forms a partition of   , and therefore the set of extensions forms a hierarchical clustering of
  .</p>
      <p>
        Figure 2 presents the five concepts of neighbors of Charlotte in the graph context presented
in Figure 1 (a). On the right, the extensions are presented as a Venn diagram, and on the left the
intensions are expressed in plain English for simplicity. The first concept has for intension the
whole graph centered on Charlotte and has only Chalotte in its extension. Then there are two
larger concepts, one describing the children of Kate and William (Charlotte and George), and
another one describing the women. Then there is an even larger concept describing the people
having a father, a mother and a sibling (Harry, William, Charlotte, and George). Finally, there is
the top concept, having an empty intension and  as extension.
/ / (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) C r e a t i o n o f t h e model
ConnorModel model = new ConnorModel ( rdfModel ) ;
/ / (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) C r e a t i o n o f t h e t a r g e t
L i s t &lt; S t r i n g &gt; t a r g e t = new A r r a y L i s t &lt; &gt; ( { ” h t t p : / / example . org /
r o y a l / C h a r l o t t e ” } ) ;
/ / (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) C r e a t i o n o f t h e i n i t t a b l e
T a b l e t a b l e = T a b l e . o f A r i t y (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) ;
rdfModel . l i s t S u b j e c t s ( ) . f o r E a c h R e m a i n i n g ( s −&gt;
t a b l e . addInitRow ( I n t S t r e a m . o f (
        <xref ref-type="bibr" rid="ref1">1</xref>
        )
. mapToObj ( i −&gt; s . asNode ( ) )
. c o l l e c t ( C o l l e c t o r s . t o L i s t ( ) ) ) ;
) ;
/ / (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) C r e a t i o n o f t h e p a r t i t i o n
C o n n o r P a r t i t i o n p a r t i t i o n = new C o n n o r P a r t i t i o n ( model , t a r g e t ,
t a b l e , 0 ) ;
/ / (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ) C r e a t i o n o f t h e t h r e a d and c o m p u t a t i o n o f t h e C−N
AtomicBoolean c u t = new AtomicBoolean ( f a l s e ) ;
ConnorThread t h r e a d = new ConnorThread ( p a r t i t i o n , c u t ) ;
t h r e a d . s t a r t ( ) ;
t h r e a d . j o i n ( 2 ∗ 1 0 0 0 ) ; / / 2 0 0 0 ms = 2 s
c u t . s e t ( true ) ;
/ / (
        <xref ref-type="bibr" rid="ref6">6</xref>
        ) P r i n t o f t h e r e s u l t s
System . out . p r i n t ( p a r t i t i o n . t o J s o n ( ) ) ;
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. CONNOR: Computation of CONcepts of NeighbORs with a</title>
    </sec>
    <sec id="sec-4">
      <title>Java Library</title>
      <p>In this section we present CONNOR, a Java library for the computation of the Concepts of
Neighbors. This library is a free and open-source software 1, based on Apache Jena2, a
wellknown Java library for the Semantic Web. As the Semantic Web is based on the RDF standard,
and as RDF graphs only represent labelled directed multigraphs, the library only handles
graph contexts with unary relations (node labels) and binary relations (labelled directed edges).
This library implements data structures for manipulating concepts of neighbors, as well as an
algorithm for their computation. Table 1 presents an example code for the computation of the
concepts of neighbors of Charlotte in the graph context presented in Figure 1 (a).</p>
      <p>
        The CONNOR library is structured into four main classes, which are detailed below.
1Accessible here: https://gitlab.inria.fr/hayats/CONNOR
2https://jena.apache.org/
ConnorModel This class encapsulates an RDF model that represents a graph context. There
are two ways to create a model using this class. The first way consists in creating an empty
model and adding triples one by one. The second way consists in creating a model from a
pre-existing RDF model. Section (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) of Table 1 shows an example of creation of ConnorModel
from a RDF graph.
      </p>
      <p>ConceptOfNeighbors As its name tells, this class represents a concept of neighbors, and
hence a graph concept too. As expected, an object of this class is characterized by its intension
(decomposed into two attributes: the list of the projection variables and the graph pattern),
its extension and its proper extension. Those elements can be accessed through the methods
getProjectionVars(), getIntensionBody(), getExtension() and getProperExtension().
The creation of objects of this class is handled by the ConnorPartition class, and should not
be done by library users.</p>
      <p>
        ConnorPartition This class is central to the computation of concepts of neighbors. It takes its
name from the fact that, as presented above, the proper extensions of the concepts of neighbors
form a partition of the set of objects. This class contains all the information needed for the
computations of concepts of neighbors, such as of course the graph context (represented by
a ConnorModel), the concepts of neighbors once the computation is done (represented by a
collection of ConceptOfNeighbors objects), but also the tuple of objects (called target) for which
we want to compute the concepts of neighbors. A ConnorPartition object can be translated
into JSON for serialization and further processing. An exemple of serialization in JSON is given
by Section (
        <xref ref-type="bibr" rid="ref6">6</xref>
        ) of Table 1.
      </p>
      <p>A key aspect of this class is that it implements an anytime algorithm for the computation
of concepts of neighbors. Presented in [5], this algorithm starts with the top concept and, by
successively trying to add elements to the intension, refines it in more specific concepts that
still include the target. This way, the algorithm can be interrupted at each refinement step.</p>
      <p>
        To use this class, the base constructor of this object takes as argument a ConnorModel object,
the target (represented by a list of URIs), the partition domain and a maxDepth parameter.
An example of usage is given by Section (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) of Table 1. The partitioning domain is called
initializationTable, which is a Table object which represents a set of tuples of entities.
The role of this argument is to stipulate which set of tuples should appear in the extensions
of the concepts. An example of construction of such a table is given in Section (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ) of Table 1.
Concerning the maxDepth parameter, its role is to set, if desired, a limit to the depth of the
intension from the elements of the target. If set to zero, no limit is applied.
      </p>
      <p>The class method to call in order to launch the computation of concepts of neighbors is called
fullPartitioning(cut). Taking a mutable Boolean named cut as a parameter, it refines the
partition until convergence or until cut is switched to true.</p>
      <p>
        ConnorThread This class encapsulates the computation of concepts of neighbors using a
ConnorPartition in a Java thread, so that the main thread just needs to launch this thread and
switch cut to true when desired. An example of usage is given by Section (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ) of Table 1.
      </p>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusion</title>
      <p>This paper introduces CONNOR, a Java library for the computation of concepts of neighbors
on RDF graphs. After having introduced the theoretical notions related to Graph-FCA (an
extension of FCA to graphs) and to the concepts of neighbors, this paper presents the main
classes of CONNOR, and details their working through an example code for the computation of
concepts of neighbors.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research is supported by ANR project SmartFCA (ANR-21-CE23-0023).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rouane-Hacene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Huchard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Valtchev</surname>
          </string-name>
          ,
          <article-title>Relational concept analysis: mining concept lattices from multi-relational data</article-title>
          ,
          <source>Annals of Mathematics and Artificial Intelligence</source>
          <volume>67</volume>
          (
          <year>2013</year>
          )
          <fpage>81</fpage>
          -
          <lpage>108</lpage>
          . doi:
          <volume>10</volume>
          .1007/s10472-012-9329-3.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ferré</surname>
          </string-name>
          ,
          <article-title>A Proposal for Extending Formal Concept Analysis to Knowledge Graphs</article-title>
          ,
          <source>in: International Conference on Formal Concept Analysis</source>
          , volume
          <volume>9113</volume>
          , Springer International Publishing, Cham,
          <year>2015</year>
          , pp.
          <fpage>271</fpage>
          -
          <lpage>286</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>319</fpage>
          -19545-2_
          <fpage>17</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ferré</surname>
          </string-name>
          ,
          <article-title>Application of Concepts of Neighbours to Knowledge Graph Completion</article-title>
          ,
          <string-name>
            <surname>Data Science</surname>
          </string-name>
          Pre-press (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S. O.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          ,
          <article-title>Fitting Pattern Structures to Knowledge Discovery in Big Data</article-title>
          ,
          <source>in: Int. Conf. on Formal Concept Analysis</source>
          , volume
          <volume>7880</volume>
          ,
          <year>2013</year>
          , pp.
          <fpage>254</fpage>
          -
          <lpage>266</lpage>
          . doi:
          <volume>10</volume>
          .1007/ 978-3-
          <fpage>642</fpage>
          -38317-5_
          <fpage>17</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ferré</surname>
          </string-name>
          ,
          <article-title>Answers Partitioning and Lazy Joins for Eficient Query Relaxation and Application to Similarity Search, in: The Semantic Web</article-title>
          , Cham,
          <year>2018</year>
          , pp.
          <fpage>209</fpage>
          -
          <lpage>224</lpage>
          . doi:
          <volume>10</volume>
          .1007/ 978-3-
          <fpage>319</fpage>
          -93417-4_
          <fpage>14</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H.</given-names>
            <surname>Ayats</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cellier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ferré</surname>
          </string-name>
          ,
          <article-title>A Two-Step Approach for Explainable Relation Extraction</article-title>
          ,
          <source>in: Int. Symposium on Intelligence Data Analysis</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>14</fpage>
          -
          <lpage>25</lpage>
          . doi:
          <volume>10</volume>
          .1007/ 978-3-
          <fpage>031</fpage>
          -01333-
          <issue>1</issue>
          _
          <fpage>2</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ferré</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Huchard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kaytoue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. O.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <source>Formal Concept Analysis: From Knowledge Discovery to Knowledge Processing, A Guided Tour of Artificial Intelligence Research: Volume II: AI Algorithms</source>
          (
          <year>2020</year>
          )
          <fpage>411</fpage>
          -
          <lpage>445</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>030</fpage>
          -06167-8_
          <fpage>13</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>