<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploring the Application of Graph-FCA to the Problem of Knowledge Graph Alignment</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sébastien Ferré</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Knowledge Graph, Semantic Web, Entity Alignement, Ontology Matching, Graph-FCA</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Univ Rennes, CNRS, IRISA, Campus de Beaulieu</institution>
          ,
          <addr-line>35042 Rennes</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <abstract>
        <p>Knowledge Graphs (KG) have become a widespread knowledge representation. When diferent KGs exist for some domain, it is valuable to merge them into a richer KG. This is known as the problem of KG alignement, which encompasses related problems such as entity alignement or ontology matching. Although most recent approaches rely on supervised representation learning, Formal Concept Analysis (FCA) has also been proposed as a basis for symbolic and unsupervised approaches. We here explore the application of Graph-FCA, an extension of FCA for KGs, to diferent scenarios of KG alignments: (A) when the two KGs have common values, and (B) when pre-aligned pairs are known. We show that, compared to previous FCA-based approaches, Graph-FCA allows for a more natural and scalable representation of the KGs to be aligned, and makes it simpler to extract alignments from the concepts. It also features flexibility w.r.t. diferent alignment scenarios.</p>
      </abstract>
      <kwd-group>
        <kwd>Alignment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>A knowledge graph (KG) is a modelling of some domain of interest that supports many tasks
such as question answering, reasoning over data or machine learning. They are commonly
represented in languages of the Semantic Web (RDF(S), OWL), and published on the web as
Linked Open Data (LOD) [9]. A KG is a set of entities described by their classes, properties,
and relations with other entities. Due to the open nature of the web, it is common that a same
domain of interest is modelled by multiple KGs. Those have in general overlapping, hence
redundant, information but also complementary information. It is therefore valuable to merge
the diferent KGs that model a same domain into a single richer KG. The dificulty is that
diferent KGs generally use diferent entity identifiers (URIs) for the same real entities, and even
diferent schemas or ontologies, i.e. diferent classes, properties and relations.</p>
      <p>
        The problem of merging two KGs consists in discovering equivalence links between entities
and/or between schemas, which is known as knowledge graph alignment. A number of other
terms are used in the literature, partially depending on the focus on entities or on the schema:
entity alignment, entity matching, ontology alignment/matching, data interlinking [
        <xref ref-type="bibr" rid="ref2">2, 14</xref>
        ]. In
order to align two KGs, they must have something in common, some alignment seeds. Those
seeds can be: a common language for the string values (e.g., names, titles), a set pre-aligned
pairs, or a combination of those. Most existing work consider a single scenario, and may not
apply to other scenarios. A common scenario (called Scenario A in the following) is to only
assume a common language, and no pre-aligned pairs. This is an unsupervised setting. Another
common scenario in recent literature (called Scenario B in the following), assumes that a set of
pre-aligned pairs is available [14]. This is a supervised setting where the pre-aligned pairs serve
as a training set. Most approaches use similarity measures that combine terminological features
(i.e., strings for names, titles, ...), and structural features (i.e., relationships between entities). A
recent trend is to use KG embeddings as such features. Formal Concept Analysis (FCA) [7] and
FCA extensions, such as Pattern Structures [6] and Relational Concept Analysis [11], have been
recently proposed for tasks related to KG alignment [
        <xref ref-type="bibr" rid="ref1">15, 3, 1</xref>
        ]. Formal concepts are used there
as a symbolic form of similarity.
      </p>
      <p>In this paper, we explore the application of Graph-FCA [5], an extension of FCA specifically
designed for analyzing KGs, to the two main scenarios of KG alignment. We propose an approach
that makes three contributions to previous FCA-based approaches.</p>
      <p>1. A more natural and scalable representation. The formal context is the union (rather than
the product) of the two KGs to be aligned, with minor adjustments depending on the
alignment scenario.
2. A uniform alignment extraction methods for both entities and schema elements (classes,
properties and relations). All alignement pairs are extracted from concept extents (rather
than from both extents and intents).
3. A flexible approach w.r.t. diferent scenarios. They are handled through minor adjustments
to the representation of KGs.</p>
      <p>We show the efectiveness of our approach on a few use cases taken or adapted from previous
FCA-based work.</p>
      <p>Section 2 states the KG alignment problem. Section 3 presents the related work. Section 4
shortly describes Graph-FCA. Section 5 explains our approach from principles to concrete
application, and discusses the results on the two main scenarios. Section 6 concludes and draws
perspectives.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem Statement</title>
      <p>We start by defining knowledge graphs (KG), taking care to distinguish between classes,
properties and relations in order to allow for a more accurate modelling. Many work on KGs only
consider entities and relations.</p>
      <p>Definition 1 (knowledge graph). A knowledge graph is a structure  = ⟨, ,  , ,  ⟩ , where 
is the set of entities,  is the set of classes,  is the set of properties,  is the set of relations, and 
is a set of triples. The schema of the KG is made of the classes, properties and relations. There are
three kinds of triples, for any ,  ′ ∈ ,  ∈ ,  ∈  ,  ∈  :</p>
      <p>• (, type, ) states that class  is the type of entity  ;
• (, ,  )</p>
      <p>etc.;
• (,  ,  ′) states that relation  links entity  to entity  ′.</p>
      <p>states that property  has value  on entity  , where a value can be a string, a number,
In RDF and OWL, entities are called individuals, values are called literals, and the distinction
between properties and relations corresponds to the distinction in OWL between datatype
properties and object properties. URIs are used to identify entities, classes, properties, and
relations.</p>
      <p>Figure 1 shows two knowledge graphs, one on the left and another on the right. It falls into
Scenario A as it uses completely diferent schemas and it only shares proper names of entities
(square boxes). The objective is to find the best alignement between the two. This is the most
complex example considered by Atencia et al [3] (Fig. 8, p.23). It involves cyclic dependencies
that have to be taken into account to find the correct alignment.</p>
      <p>It is easy to find that o1:z1 aligns with o2:i1 because they are the unique entities in each KG
to have "Dupont" as a value. From this first alignement, one can find two additional alignments:
property o1:lastname with o2:name, and class o1:Person with o2:Inhabitant. It is more
dificult to find that o1:z3 aligns with o2:i3. o1:z3 has lastname "Dubois" but both o2:i2
and o2:i3 have name "Dubois". o1:z3 has a home/address in Paris but so do o2:i2 and o2:i3.
Traversing deeper in the graph, we find that o1:z3’s home is owned by a Dupont who lives in
Grenoble. Something equivalent can be found for o2:i3 but not for o2:i2. We observe that the
alignment of entities, classes, properties and relations depend on each other, and that handling
long-range dependencies may be necessary to resolve alignments.</p>
      <p>The example in Figure 1 can be adapted to cover Scenario B where string values are in
diferent incomparable languages but some pre-aligned pairs may be available. Imagine for
instance that the strings of the left-side KG are in Chinese, and that we already know that o1:z1
aligns with o2:i1. Without that pre-aligned pair, one only has structural information, which is
not enough to discriminate between the diferent people and places. Knowing that o1:z1 aligns
with o2:i1 (Dupont), one can align classes o1:Person and o2:Inhabitant, and also relations
o1:home and o2:address. By propagating alignments through neighborhood, starting from
the pre-aligned pair, it is here possible to find the correct alignment for all elements: first o1:h1
with o2:a1, then o1:city with o2:city and o1:z2 with o2:i2, and so on.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Related Work</title>
      <p>
        The alignment of KGs (or entities, or ontologies) is a matter of defining and combining similarity
measures [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. There are four kinds of similarities, depending on the kind of information that is
used. Terminological similarity is based on the strings that are used to name entities and schema
elements. Structural similarity is based on the description of entities by classes, properties, and
relations. Extensional similarity is based on the extensions of classes. Semantic similarity is
based on external sources of knowledge, and on automated reasoning (e.g., subsumption checks
in description logics). Terminological and structural similarities are the most common methods,
and are often combined to get better results. In general, strutural similarity does not consider
the KG as a whole but use either tree-shaped descriptions extracted from the knowledge graph,
or neighbor-based propagation of similarity values (e.g., RiMOM-IM [13]).
      </p>
      <p>The trend in KG alignement is to use representation learning methods, in a supervised setting
where a significant number of pre-aligned pairs is supposed to be known [ 14]. The principle
is first to learn embeddings for the two KGs that coincide for the pre-aligned pairs. Other
alignments can then be found by nearest-neighbor search in the embedding space. Many
representation learning methods exist, such as TransE [4], ComplEx-N3 [10] or R-GCN [12].</p>
      <p>Diferent variants of FCA have been applied to tasks related to KG alignment. Classical
FCA was used for aligning (aka. matching) several biomedical ontologies, in a system called
FCA-Map [15]. The scenario is here to match classes and relations based on their names and
their relationships (e.g., class hierarchies, class disjointness, relation domains and ranges), there
are no entities per se. The intuition behind using FCA is that if a concept has only two objects
in its extent (here classes and relations), one from each ontology, then this is a strong indication
that the two entities should be aligned. This is called an anchor in FCA-Map. The global
alignment is obtained by defining a succession of five formal contexts, and computing their
concept lattices. This can be seen as an ad-hoc form of Relational Concept Analysis (RCA) [11],
with the notions of iterative concept formation, and relational attributes.</p>
      <p>RCA was used for the task of link key discovery [3], which consists in finding pairs of properties
and/or relations that are identifying keys for the entities of a pair of classes. In Figure 1, the
pair o1:lastname/o2:name is not a link key for the pair of classes o1:Person/o2:Inhabitant,
but combined with the pair o1:home/o2:address, it is. Link key discovery is somewhat
stronger than KG alignment because in addition to provide alignments, it provides a rule for
predicting aligned pairs in unseen data. RCA can be compared to methods that propagate
similarity values through relations, except that formal concepts are used in place of numerical
values. Pattern Structures [6] were used for the same task with an approach that is similar
but does not cover long-range and cyclic dependencies like in RCA. In those approaches, the
discovery of an alignment between two KGs is somewhat contrived. First, although a KG can be
naturally represented as an RCA context, the RCA context must instead represent the product
of the two KGs: i.e., the objects are all pairs of entities (one from each KG), the properties
are all pairs of properties, and so on. This entails a quadratic increase in the size of the data,
which poses scalibility issues. Second, the discovered alignment is scattered in the extents and
intents of formal concepts, across several concept lattices. We show in this paper that those
two disadvantages are overcome when using Graph-FCA [5] on the task of KG alignment.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Graph-FCA</title>
      <p>Graph-FCA [5] is an extension of FCA for multi-relational data, and in particular for knowledge
graphs. Graph-FCA defines a graph context as an incidence relation between tuples of objects
and attributes. A graph context is a triple  = (, ,  ) , where  is a set of objects,  is a set of
attributes, and  ⊆  ∗ ×  is an incidence relation between object tuples  = ( 1, … ,   ) ∈  ∗, for
any arity  , and attributes  ∈  . A graph context represents a KG with objects as nodes, and
attributes as hyper-edge labels. An hyper-edge (( 1, … ,   ), ) can be seen as the logical atom
( 1, … ,   ); we use the latter notation for readability. Unary edges correspond to node labels
while binary edges correspond to labeled edges. Classical FCA is a special case of Graph-FCA
where  = 1 (only unary edges) and  = 1 (only unary concepts).</p>
      <p>Figure 2 shows the graphical representation of a graph context about dishes, cereals, and
countries. Each box represents an object  along with the list of attributes  such that there is an
unary edge () ∈  . Each arrow from box  1 to box  2 labelled by attribute  denotes a binary
edge ( 1,  2) ∈  . An hyper-edge ( 1, … ,   ) with  &gt; 2 is represented as an ellipsis labeled
by  , and having an arrow labeled  to each box   (not illustrated here).</p>
      <p>Concept intents express what several  -tuples of objects have in common. They are graph
patterns with  distinguished nodes. They are called Projected Graph Patterns (PGP), and they
are similar to conjunctive queries. For example, the PGP [(,  ) ←  (, ),  ( , )]
has three nodes ,  ,  , two edges  (, ) and  ( , ) , and two distinguished nodes ,  .
It can be used as a definition of the ”sibling” relationship, i.e. the fact that  and  are siblings if
they have a common parent  . FCA is extended to graph contexts by extending the set operations
to PGPs. PGP inclusion ⊆ is based on graph homomorphisms, and PGP intersection ∩ is based
on the categorical product of graphs (see [5] for details).</p>
      <p>The Galois connection is defined in Graph-FCA between PGPs (, ⊆  ) and sets of object
tuples (2 ∗, ⊆). In the definitions of  ′ and  ′ below, the PGP [ ←  ] represents the description
of an object tuple  by the whole incidence relation  seen from the relative position of  .
 ′ ∶= { ∈   ∣  ⊆  [ ←  ]},
 ′ ∶= ∩ {[ ←  ] ∣  ∈ },
for  = [( 1, … ,   ) ←  ] ∈ 
for  ⊆   ,  ∈ ℕ</p>
      <p>Informally,  ′ can be understood as the result set of  , seen as a query; and  ′ as the most
specific query whose result set contains  . From there, concepts can be defined in the usual
way, and organized into lattices. A concept is a pair (, ) such that  ′ =  and  ′ =  . The
arity of  and  must be the same, it determines the arity of the concept. Unary concepts are
about sets of objects, while binary concepts are about relationships between objects, and so on.
Unlike RCA, there is a concept lattice for each concept arity rather than for each object type.</p>
      <p>Figure 3 displays a compact representation of the graph concepts about dishes, cereals, and
countries. It only shows unary concepts (, ) , with  = [ ←  ] , s.t. || ≥ 2 ,  ≠ ∅ , and
 is a “core pattern” (i.e.,  has no homomorphic subset). Each box  identifies a unary
concept, with its name at the top (e.g., Q1b), and its extent at the bottom (here, Pakistan,
Thailand). The concept intent is the PGP [ ←  ] , where  is the subgraph made of solid
arrows containing node  . Those arrows denote binary edges while the middle part of the
boxes denote unary edges. By reading the graph, we learn that Concept Q1b is the concept
of “Asian countries, which eat a lot of some dish whose main cereal is a rice produced in the
country itself”. Formally, its intent is denoted by [ ←   (), (),  (, ), ℎ(),
ℎ  (, ),  (),  (),    (, )] (using the small letters of concept names
as pattern variables). Concepts Q1a and Q1c have the same graph pattern as Q1b but with a
diferent focus, on dishes for Q1a and on cereals for Q2c. N-ary concepts are obtained by picking
several distinguished nodes. For example, (Q1b,Q1a,Q1c) is a ternary concept whose instances
are the object triples ( ,   , ) and ( ℎ , ℎ  , ℎ) . It
represents the cyclic relationship existing between a country, a dish, and a cereal in Asian
countries. The Q2-concepts are generalizations of the Q1-concepts. For example, concept Q2b
covers all countries. The dashed arrow from Q2b to Q1b says that Q1b is a subconcept of Q2b.
The Q2-intents say that, in the context, all dishes use some rice as a main cereal, and that all
cereals are produced in some country but that not all countries eat a lot of some dish, and not
all dishes are eaten a lot by some country.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Concept-based Knowledge Graph Alignment</title>
      <p>In this section, we explain how KG alignments can be extracted from Graph-FCA concepts, and
we discuss the results on the example and scenarios of Section 2.</p>
      <sec id="sec-5-1">
        <title>5.1. Principle</title>
        <p>When two entities from two diferent KGs are equivalent, e.g. o1:z1 and o2:i1 in the example,
they generally have equivalent types (e.g., o1:Person and o2:Inhabitant), equivalent
properties with equal values (e.g. o1:lastname and o2:name “Dupont”), and equivalent relations
with equivalent entities (e.g., o1:home o1:h1 and o2:address o2:a1). Similarly, when two
classes are equivalent, they are the type of equivalent entities. In FCA terms, the two equivalent
entities (or classes) belong to the extent of a formal concept whose intent expresses everything
they have in common. Here, the intent is not only about how the entities are described but also
about how the entities are related to other entities. Moreover, the equivalence between two
entities depend on the equivalence between the related entities, and so on recursively. This
makes Graph-FCA, like RCA, a good match for the problem of KG alignment.</p>
        <p>More precisely, let  1 and  2 be two KGs. If an unary concept  = (, ) has only two
elements in its extent,  = { 1,  2}, where  1 belong to  1 and  2 belongs to  2, then this is a
strong indication that  1 and  2 are equivalent because they are unique in their own KG to
match the projected graph pattern  . Such a concept is called an anchor in [15]. Concepts that
are not anchors can also contribute to the alignement, although in a less direct and less certain
way. For example, a concept with extent { 1,  1,  2,  2} says that  1-entities  1 and  1 align with
 2-entities  2 and  2, without knowing which aligns to which. However, if we also have the
anchor { 1,  2}, we can deduce an alignment between  1 and  2. Another example is a concept
with extent { 1,  2,  2} where some ambiguity remains on the alignment of  1 but it nonetheless
reduces the set of possibilities from all  2-entities to only two entities,  2 and  2.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Graph-FCA Representation</title>
        <p>The immediate representation of a KG as a Graph-FCA context is to use classes, properties and
relations as attributes (hyper-edge labels), and entities as objects (nodes). Each occurrence of a
value is represented by a distinct node labelled by the value as this avoids spurious connections
between objects. This is formalized by the following mappings from KG triples to Graph-FCA
hyper-edges:
(, type, ) ⇝ ()
(, ,  )
⇝ (, " ")
(,  , 
′) ⇝  (,  ′)
where (, " ") is a shorthand for (, ),  () for some fresh object  .</p>
        <p>This immediate representation, however, does not produce any alignment because the two
KG representations do not share any attribute. Indeed, the two KGs have diferent schemas,
hence diferent classes, properties and relations. Recall that Graph-FCA concept intents are
made of attributes and variables. As a consequence, for all concepts except the top one, the
concept extent contains only objects from a single KG, and no anchor concept can be found.</p>
        <p>An adequate representation should use the same set of attributes for the two KGs to be aligned,
and those attributes should reflect what they have in common. In Scenario A, the two KGs only
share the values. Actually, they also share the graph structure, and the distinction between
classes, properties and relations. Anything else must hence be represented as objects: i.e., not
only entities but also classes, properties and relations. We achieve this by introducing three
meta-level common attributes: type for expressing the relation between entities and classes,
prop for relating entities to values through properties, and rel for relating entities to entities
through relations. The new mapping from KG triples to Graph-FCA hyper-edges is hence as
follows:
(, type, ) ⇝ type(, )
(, ,  )
⇝
prop(, , " ")
(,  , 
′) ⇝ rel(,  ,  ′)
This is a form of reification, in which classes, properties and relations are treated as objects. It
is made possible by the fact that Graph-FCA supports hyper-edges beyond binary edges. As
defined above, the notation " " for a value implicitly introduces an attribute  , hence accounts
for the values shared between KGs.</p>
        <p>In Scenario B, in contrast to Scenario A, even the values are diferent in the two KGs. This
means that values do not bring any information so that the mapping for properties can be
simplified as (, ,  ) ⇝ prop(, ) . Obviously, the two mappings can be mixed if some values
are shared and others not. To compensate for the lack of shared elements, a set of pre-aligned
pairs ( 1,  2) may be given as input, and modelled as equivalence triples ( 1, sameAs,  2). Such
an equivalence can be represented in Graph-FCA by using a fresh attribute   that represents
the real entity or schema element  that  1 and  2 refer to. The mapping rule is
( 1, sameAs,  2) ⇝   ( 1),   ( 2), for some fresh attribute   .</p>
        <p>Attribute   works like a nominal attribute for each KG, labelling only  1 in  1, and only  2 in
 2.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Results and Discussion</title>
        <p>We ran our Graph-FCA implementation1 on the two scenarios in order to find anchors, and
hence KG alignments. We used the following options to make their computation more eficient,
and the output smaller: -minsupp 2 because anchors have by definition a support equal to 2,
and -only-cores because it excludes concepts that represent under-optimal alignments (an
example is given below).</p>
        <sec id="sec-5-3-1">
          <title>Scenario A (common values).</title>
          <p>Graph-FCA finds 24 unary concepts, grouped in only two
patterns: Q1 and Q2. All anchor concepts are in Q1, and all concepts in Q1 are anchor concepts.
Moreover, all elements of the two KGs (entities and schema) appear in those concepts. Pattern
Q1 therefore represents a complete alignement between the two KGs. Table 1 lists those
concepts with their identifier (a lowercase letter), their extent (an object from each KG), and
their neighboring intent. For instance, concept  shows the alignment of o1:z1 with o2:i1, and
its intent reads as follows: The two entities have type  , property  with value “Dupont”, relation 
to entity ℎ, and relation  from entity  . Each variable in the intent refers to a concept in the same
pattern, and hence to an aligned pair: e.g., the common type  is o1:Person/o2:Inhabitant,
and the property with value “Dupont” is o1:lastname/o2:name. One can observe that the
alignement is also correct, not only for entities but also for classes, properties and relations.
1Open source at https://bitbucket.org/sebferre/graph-fca/
and places), another all classes, and similarly for properties and relations. It simply reflects
the meta-model used in our representation. Without option -only-cores, Graph-FCA outputs
additional concepts that represent approximate alignments, such as {o1:z1, o1:z3, o2:i1,
o2:i3} or {o1:lastname, o2:name, o2:given}.</p>
        </sec>
        <sec id="sec-5-3-2">
          <title>Scenario B (pre-aligned pairs).</title>
          <p>Without common values nor any pre-aligned pairs,
GraphFCA only finds the meta-concepts, like in pattern Q2 in the above scenario. Indeed, there
is no rationale for making any alignment. When adding a pre-aligned pair, like o1:z1 with
o2:i1, Graph-FCA finds an almost complete alignment (see Table 2). The only uncertainty is
about o1:lastname that aligns with either o2:name or o2:given. Indeed, without values, the
diferent properties that apply to an entity cannot be distinguished. Choosing any other pair
of entities as pre-aligned pair gives the same result. However, if the pre-aligned pair is about
the schema (e.g., two equivalent classes), then only the schema elements are correctly aligned.
Entities are grouped by type (people and places) but they are not aligned individually.</p>
        </sec>
        <sec id="sec-5-3-3">
          <title>Other results.</title>
          <p>
            We also applied our approach to the example of Abbas et al [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ], which falls
in Scenario A (common values). It is at the same time simpler than the example of Atencia et
al because it has no relations (only classes and properties), and more dificult because there
is not a one-to-one mapping between the two KGs. First, there are entities in  1 that are not
present in  2. Second, the classes are not equivalent: the  1-class o1:Book is split in two
 2-classes, o2:Dictionary and o2:Novel; and the  2-class o2:FemaleScientist is equivalent
to the intersection of two  1-classes, o1:Woman and o1:Scientist. The alignments produced
by Graph-FCA are grouped in two patterns, one for people and another for books, because there
are no relations between them in the data. All common entities and all properties are completely
and correctly aligned (one anchor concept for each pair). For classes, three interesting concepts
are found: {o1:Woman, o1:Scientist, o2:FemaleScientist}, {o1:Book, o2:Novel}, and
{o1:Book, o2:Dictionary}. They can be read as “Woman and Scientist together align with
FemaleScientist”, and “Book aligns with either Novel or Dictionary”. This is exactly as
expected. Note that each entity with type o1:Book is correctly assigned to one of o2:Novel and
o2:Dictionary (two distinct concepts), based on its properties (“author” for novels, “year” for
dictionaries).
          </p>
          <p>Discussion. Compared to the use of classical FCA, RCA or Pattern Structures, our approach
does not build a context as the “product” of the two KGs, i.e. with all pairs of entities as objects,
and all pairs of classes/properties/relations as attributes. Instead, the Graph-FCA context is
built as the “union” of the two KGs. This entails that the size of our context is the sum of the
size of the two KGs, rather than the product. This clearly favors the scalability of our approach.</p>
          <p>A key enabling feature of Graph-FCA is the support for  -ary relations as it enables to
reify the binary properties and relations as objects, along with entities, and to use them as
arguments of meta-level ternary predicates (e.g., prop). Note that this is akin to RDF triples
where properties belong to the same space as individuals. As a consequence, the alignment
of classes, properties and relations (the schema) can be interpreted in the Graph-FCA output
exactly like for entities, through concept extents. This is in contrast with other FCA-based
approaches, where a diferent interpretation method must be used for the schema because it
appears in concept intents. Although it can be dificult to read Graph-FCA output in general,
a KG alignment can be read from the unary concept extents, which are sets of objects, like in
classical FCA.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion and Perspectives</title>
      <p>Graph-FCA is FCA for knowledge graphs, and we have shown in this paper that by adequately
representing two KGs in terms of what they have in common, we can extract aligned pairs
of entities, classes, properties and relations from Graph-FCA concepts in a uniform way. The
results in this paper are limited to a few encouraging illustrative examples in diferent scenarios,
and much remains to be done to apply our approach to real KGs. A first issue is scalability. A
research track is to only generate anchors, i.e. concepts that result from the intersection of
two objects, one from each KG. A second issue is to allow for approximate matching of values,
especially texts, as they often slightly difer from one KG to another.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This research is supported by ANR project SmartFCA (ANR-21-CE23-0023).
[3] Atencia, M., David, J., Euzenat, J., Napoli, A., Vizzini, J.: Link key candidate extraction
with relational concept analysis. Discrete applied mathematics 273, 2–20 (2020)
[4] Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating
embeddings for modeling multi-relational data. In: Advances in neural information processing
systems. pp. 2787–2795 (2013)
[5] Ferré, S., Cellier, P.: Graph-FCA: An extension of formal concept analysis to knowledge
graphs. Discrete Applied Mathematics 273, 81–102 (2019)
[6] Ganter, B., Kuznetsov, S.: Pattern structures and their projections. In: Delugach, H.S.,
Stumme, G. (eds.) Int. Conf. Conceptual Structures. pp. 129–142. LNCS 2120, Springer
(2001)
[7] Ganter, B., Wille, R.: Formal Concept Analysis — Mathematical Foundations. Springer
(1999)
[8] Hahn, G., Tardif, C.: Graph homomorphisms: structure and symmetry. In: Graph symmetry,
pp. 107–166. Springer (1997)
[9] Hitzler, P., Krotzsch„ M., Rudolph, S.: Foundations of Semantic Web Technologies.
Chapman &amp; Hall/CRC (2009)
[10] Lacroix, T., Usunier, N., Obozinski, G.: Canonical tensor decomposition for knowledge
base completion. In: Dy, J.G., Krause, A. (eds.) Int. Conf. Machine Learning. Proceedings
of Machine Learning Research, vol. 80, pp. 2869–2878. PMLR (2018)
[11] Rouane-Hacene, M., Huchard, M., Napoli, A., Valtchev, P.: Relational concept analysis:
mining concept lattices from multi-relational data. Annals of Mathematics and Artificial
Intelligence 67(1), 81–108 (2013)
[12] Schlichtkrull, M., Kipf, T.N., Bloem, P., van den Berg, R., Titov, I., Welling, M.: Modeling
relational data with graph convolutional networks. In: The Semantic Web Conf. (ESWC).
pp. 593–607. Springer (2018)
[13] Shao, C., Hu, L.M., Li, J.Z., Wang, Z.C., Chung, T., Xia, J.B.: Rimom-im: A novel iterative
framework for instance matching. J. computer science and technology 31(1), 185–197
(2016)
[14] Zeng, K., Li, C., Hou, L., Li, J., Feng, L.: A comprehensive survey of entity alignment for
knowledge graphs. AI Open 2, 1–13 (2021)
[15] Zhao, M., Zhang, S., Li, W., Chen, G.: Matching biomedical ontologies based on formal
concept analysis. J. biomedical semantics 9(1), 1–27 (2018)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Abbas</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>David</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Napoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Discovery of link keys in RDF data based on pattern structures: Preliminary steps</article-title>
          .
          <source>In: Int. Conf. Concept Lattices and Their Applications</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Ardjani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bouchiha</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malki</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ontology-alignment techniques: Survey and analysis</article-title>
          .
          <source>Int. J. Modern Education &amp; Computer Science</source>
          <volume>7</volume>
          (
          <issue>11</issue>
          ) (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>