<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Collaborative Approach for FCA-Based Knowledge Extraction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Loria-Campus Scientifique</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vandoeuvre-les-Nancy Cedex</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>France</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>My-Thao.Tang</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannick.Toussaint}@loria.fr</string-name>
        </contrib>
      </contrib-group>
      <fpage>281</fpage>
      <lpage>286</lpage>
      <abstract>
        <p>In this paper, we propose an approach for an FCA-based knowledge extraction process relying on the collaboration between humans and machines. Evaluation of the results is performed on the lattice by experts through an interactive process where they may specify their wishes for changes using a set of predefined operations. Thus, the system then may suggest several strategies to reach their goal. In such an interactive and iterative process, the system converges towards a knowledge model close to the experts' needs. We illustrate the process on a small preliminary experiment.</p>
      </abstract>
      <kwd-group>
        <kwd>Formal Concept Analysis</kwd>
        <kwd>Knowledge Extraction</kwd>
        <kwd>Expert Interaction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Several approaches ([
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1–4</xref>
        ]) showed how powerful is Formal Concept Analysis
(FCA) ([
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]) for knowledge modelling and ontology design, and the lattice is
usually considered as the knowledge model. This paper focuses on providing
domain experts with capabilities to control and customize the lattice using
retroengineering operations. Our approach is motivated by a desire of keeping a trace
between text sources and concepts of the resulting ontology. Objects and
properties in texts are annotated and used to build the formal context. Annotations
and the lattice evolved simultaneously thanks to retro-engineering. In this way,
we can keep a trace between the linguistic level and the conceptual level making
one two separated processes in literature, namely semantic annotation which
identifies concepts from an ontology in texts and ontology building which
builds concepts from texts.
      </p>
      <p>FCA is a bottom-up process which builds concepts from a set of instances
of objects and properties (the formal context). To improve the lattice and to
make it closer to expert needs, we want to define an iterative and interactive
process where experts are asked, at each loop, to evaluate the “quality” of the
concepts. Unfortunately, in Knowledge Engineering, building ontology is not a
straightforward process and usually results from trial and error process. There
are several reasons for experts to ask for changes in the lattice: (1) there may
be noise in resources or in the information extraction processes and thus, some
c paper author(s), 2013. Published in Manuel Ojeda-Aciego, Jan Outrata (Eds.): CLA
2013, pp. 281{286, ISBN 978{2{7466{6566{8, Laboratory L3i, University of La
Rochelle, 2013. Copying permitted only for private and academic purposes.
concepts result from this noise, (2) experts are not satisfied with the resulting
knowledge model and wish it to be more in accordance with their needs, i.e. the
application that will use the knowledge model.</p>
      <p>The rest of the paper is structured as follows. Firstly, section 2 explains
how the system and experts collaborate in building and changing the lattice,
illustrates the approach by removing a concept from a lattice. Next, section 3
presents experimental results. Finally, section 4 concludes with a summary and
draws perspectives.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Collaboration for Changing A Lattice</title>
      <p>In our approach, experts do not have access to the formal context; they can
browse concepts, run through subsumption paths, look at extents and intents of
concepts. Then, they express their wishes to change the lattice using operations.
An operation on the lattice is a kind of retro-engineering: experts select an
operation on the lattice (ex: remove this concept), the system assesses the impact
of the change, suggests different strategies and then, for the chosen strategy,
calculates what to change in data for the new lattice to meet the requirements.
The lattice is then built accordingly. Experts repeat the process until they reach
an agreement between the model and their knowledge.
2.1</p>
      <p>
        Defining Operations on a Lattice
We list basic changes on a lattice to add or remove one element from the lattice
(Table 1). There could be more complex operations such as merging two concepts
into one, splitting one concept into two sub-concepts, creating a common concept
for a set of concepts... For each operation, we need to define an algorithm to
compute the set of possible strategies to perform the change, to identify related
changes in the lattice, to rank them according to a cost function and to keep a
trace in a history. Adding or removing objects or properties in a formal context
motivated the development of incremental algorithms to update a lattice [
        <xref ref-type="bibr" rid="ref6 ref7 ref8 ref9">6–
9</xref>
        ]. Concept removing is somehow complex: several strategies are possible and
impacts on other concepts in the lattice (side effects). We detail this operation
to explain our approach in the next part.
2.2
      </p>
      <p>Removing a Concept from a Lattice
Removing a concept while keeping the lattice structure means moving it down
to one of its children or up to one of its parents. These solutions correspond to
what we call different strategies.</p>
      <p>Strategy 1 moves concept C down to one of its children, Cchild. The relation
I of the formal context should be modified to I∗ as follows:</p>
      <p>I∗ = I ∪ {(g, m) : m ∈ {Intent(Cchild) \ Intent(C)}, g ∈ {Extent(C) \
Extent(Cchild), gIm}.</p>
      <p>Add Remove
Object Add an object to a lattice Remove an object from a lattice</p>
      <p>Add an object to a concept Remove an object from a concept
Attribute Add an attribute to a lattice Remove an attribute from a lattice
Add an attribute to a concept Remove an attribute from a concept
Add an attribute to an object Remove an attribute from an object
Concept Add a concept to a lattice Remove a concept from a lattice
Add a supper concept to a concept Remove a supper concept from a concept
Add a sub concept to a concept Remove a sub concept from a concept</p>
      <p>Strategy 2 moves concept C up to one of its parents, Cparent. The relation I
of the formal context should be modified to I∗ as follows:</p>
      <p>I∗ = I \ {(g, m) : m ∈ {Intent(C) \ Intent(Cparent)}, g ∈ {Extent(C)}, gIm}.</p>
      <p>Table 2 shows an example of strategies for removing concept c4 from the
initial lattice given in Fig. 1. Here, we have three strategies, two for moving
c4 down and one for moving c4 up. In strategy 1.1, c4 = ({F, HD}, {CoWi D})
and c7 = ({F}, {Ca RAS, CoWi D}). To move concept c4 down to concept c7, the
attribute in set {Intent(c7) \ Intent(c4)} = {Ca RAS} is added to the object in
set {Extent(c4)\Extent(c7)} = {HD}. Similarly, in strategy 2.1, to move concept
c4 up to concept c0 with Intent(c0) = ∅, the attribute CoWi D is removed from
the objects in set {F, HD}.</p>
      <p>
        We benefit from incremental algorithms that have been implemented to build
lattices ([
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]) not only to avoid complete recalculation of the lattice but also
for identifying the different possible strategies to perform a given change in the
current lattice. Incremental algorithms, at the ith step, produce the concept set
or diagram graph for i first objects of the context. The new object i + 1th is
dynamically added by modifying the existing lattice without recomputing the
whole lattice from scratch. ([
        <xref ref-type="bibr" rid="ref8">8</xref>
        ])
      </p>
      <p>Let us explain our approach through the algorithm for moving concept down
- Strategy 1. In Strategy 1, the set of objects and the set of properties
remain unchanged. Only the relation I is enriched. Moreover, a new closed set
(Extent(C), Intent(Cchild)) is a formal concept of the new lattice L∗.</p>
      <p>From the new closed set (Extent(C), Intent(Cchild)) and the existing lattice
L, we generate the new lattice L∗ by identifying the categories of concepts. The
new lattice is obtained from the existing lattice by taking, deleting or modifying
some concepts and creating new concepts. Thus, concepts are classified into four
The initial lattice</p>
      <p>Deleted</p>
      <p>Deleted</p>
      <p>The resulting lattice</p>
      <p>Modified</p>
      <p>Modified/Destructor New</p>
      <p>New/Destructor
categories, old concepts, deleted concepts, modified concepts and new concepts
as follows. Let C be a concept in L∗:
– C is an old concept if there exists a concept with the same Extent(C) and</p>
      <p>Intent(C) in L.
– C is a modified concept if there exists a concept with the same Intent(C) in</p>
      <p>L but the extent is different from Extent(C).
– C is a new concept if Intent(C) doesn’t exist in L.
– C in L is a deleted concept if Intent(C) doesn’t exist in L∗.</p>
      <p>Moreover, we identify destructors. A destructor is a concept that a deleted
concept will jump to. Thus links to the deleted concepts should be reported to
destructors.</p>
      <p>Fig. 1 shows an example of removing concept c4 according to the strategy
moving it down to concept c7. Here, two concepts, c4 and c9, in the initial lattice,
are deleted; two concepts, c2 and c7, are modified, adding the object HD to their
extent; two new concepts, c11 and c12, are created. In addition, concept c7 is
destructor of the deleted concept c4 and concept c12 is destructor of the deleted
concept c9.</p>
      <p>
        The algorithm calculating the new lattice is based on the fundamental
property of lattices ([
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]). For any closed set (X, X0) in the lattice:
X0 = f (X) = T f ({x}) and X = g(X0) = T g({x0})
x∈X x0∈X0
Moreover, for any set of f (x) (resp. g({x0})), their intersection should be in the
lattice.
      </p>
      <p>The main process of the algorithm is to update the set of intersections of
extents. We perform a top-down traversal of the lattice L to identify whether
a concept is old, modified or deleted, to create new concepts and to keep the
information of destructors.</p>
      <p>Thanks to the categories of concepts, the resulting lattice can be calculated
and we can keep a trace from the previous lattice. By this way, we can know the
impact of a change on the lattice (lists of modified, deleted and new concepts).
Finally, all the strategies are suggested to the expert with their impacts (Table
2) so that the expert can consider choosing one strategy. Once the expert makes
a choice, the system will update the lattice accordingly.
3</p>
    </sec>
    <sec id="sec-3">
      <title>Experiment of Removing A Concept</title>
      <p>We experimented our approach for extracting knowledge from a real text corpus
of Fibromuscular Dysplasia extracted from PubMed1. The corpus consists of
400 texts.</p>
      <p>
        In the preprocessing step, we used SemRep ([
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]) to annotate the texts. 2402
annotation objects were identified from the corpus, of which only 668 objects
are shown in the right or left side of a triplet relationship and 481 objects on the
right side. 36 different relationships are identified in these texts. Given an
annotation (object1, relationship, object2), we consider object1 as an object
and the name of the relation is concatenated with object2 to become a binary
attribute of object1. For example, Fibromuscular Dysplasia is an object and
ISA Disease is one of its attributes. SemRep annotation process can be noisy
but, of course, no automatic tool is perfect. Moreover, any annotation process
could annotate the texts, including manual annotation. This is one of our goals
to take the advantage of expert interaction in the construction of the knowledge
model.
      </p>
      <p>Next, we built a Java script to transform the set annotations into a formal
context. The lattice was produced by Galicia2 and exported as the XML
document. The context built from our corpus contains 481 objects, 545 attributes
formed by the association of the relationship of the object with which it relates.
The lattice contains 523 concepts and the longest path from the Top to the
Bottom is 7.</p>
      <p>Then, we implemented a system with operation removing a concept from
a lattice. When an expert, during the evaluation phase, asks for removing a
concept, the system suggests a list of strategies and makes explicit their impacts
on the lattice (lists of modified, deleted and new concepts) to the expert. Once
the expert chooses one strategy, the system will update the lattice accordingly.</p>
      <p>The required time for the expert to remove concepts has not yet been
evaluated. Our experience shows that a judicious choice of a strategy in order to
remove concepts has a strong impact on the speed of convergence towards a
satisfactory knowledge model.
4</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and Perspective</title>
      <p>We presented a collaborative approach for FCA-based knowledge extraction. Our
study shows how domain experts and machines can collaborate in building and
changing a lattice. The system handles changes in a lattice and keeps a trace
1 http://www.ncbi.nlm.nih.gov/pubmed
2 http://sourceforge.net/projects/galicia/
from the previous lattice. In that way, experts know the impact of a change.
Our iterative and interactive approach takes advantage from FCA and reduces
limitations due to the use of a formal method to model a complex cognitive
process.</p>
      <p>This approach opens several perspectives. We can do more to help the expert.
Refining the cost function associated to changes can make easier the choice
of one strategy. Tagging concepts that the expert agrees with and wants to
keep unchanged all along the process could reduce the number of the suggested
strategies. Performing several changes at once needs also more investigations.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bendaoud</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Napoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toussaint</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Formal concept analysis: A unified framework for building and refining ontologies</article-title>
          . In Gangemi,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Euzenat</surname>
          </string-name>
          , J., eds.
          <source>: EKAW</source>
          . Volume
          <volume>5268</volume>
          of Lecture Notes in Computer Science., Springer (
          <year>2008</year>
          )
          <fpage>156</fpage>
          -
          <lpage>171</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Obitko</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , Sna´sel, V.,
          <string-name>
            <surname>Smid</surname>
          </string-name>
          , J.:
          <article-title>Ontology design with formal concept analysis</article-title>
          .
          <source>In: Proceedings of the CLA 2004 International Workshop on Concept Lattices and their Applications</source>
          , Ostrava, Czech Republic,
          <source>September 23-24</source>
          ,
          <year>2004</year>
          . (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Vo¨lker, J.:
          <article-title>Text2onto - a framework for ontology learning and datadriven change discovery</article-title>
          . In Springer, ed.
          <source>: Proceedings of Natural Language Processing and Information Systems (NLDB</source>
          <year>2005</year>
          ).
          <source>Number 3513 in Lecture Notes in Computer Science (LNCS)</source>
          (
          <year>June 2005</year>
          )
          <fpage>227</fpage>
          -
          <lpage>238</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Huchard</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Napoli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rouane</surname>
            ,
            <given-names>M.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valtchev</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>A proposal for combining formal concept analysis and description logics for mining relational data</article-title>
          .
          <source>In: proceeding of the 5th International Conference Formal Concept Analysis (ICFCA'07)</source>
          , Clermond-Ferrand, France (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
          </string-name>
          , R.:
          <source>Formal Concept Analysis, Mathematical Foundations</source>
          . Springer (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Godin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Missaoui</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alaoui</surname>
          </string-name>
          , H.:
          <article-title>Incremental concept formation algorithms based on Galois (concept) lattices</article-title>
          .
          <source>Computational Intelligence</source>
          <volume>11</volume>
          (
          <issue>2</issue>
          ) (May
          <year>1995</year>
          )
          <fpage>246</fpage>
          -
          <lpage>267</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Valtchev</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Missaoui</surname>
          </string-name>
          , R.:
          <article-title>Building galois (concept) lattices from parts: Generalizing the incremental approach</article-title>
          . In Delugach, H.,
          <string-name>
            <surname>Stumme</surname>
          </string-name>
          , G., eds.
          <source>: Proceedings of the ICCS 2001</source>
          .
          <article-title>Volume 2120 of LNCS</article-title>
          ., Springer Verlag (
          <year>2001</year>
          )
          <fpage>290</fpage>
          -
          <lpage>303</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Obiedkov</surname>
            ,
            <given-names>S.A.:</given-names>
          </string-name>
          <article-title>Comparing performance of algorithms for generating concept lattices</article-title>
          .
          <source>J. Exp. Theor. Artif. Intell</source>
          .
          <volume>14</volume>
          (
          <issue>2-3</issue>
          ) (
          <year>2002</year>
          )
          <fpage>189</fpage>
          -
          <lpage>216</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Merwe</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Obiedkov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kourie</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Addintent: A new incremental algorithm for constructing concept lattices</article-title>
          . In Eklund, P., ed.:
          <source>Concept Lattices. Volume 2961 of Lecture Notes in Computer Science</source>
          . Springer, Berlin/Heidelberg (
          <year>2004</year>
          )
          <fpage>205</fpage>
          -
          <lpage>206</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Barbut</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monjardet</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Ordre et classification, Alg`ebre et combinatoire</article-title>
          . Hachette, Paris (
          <year>1970</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Rindflesch</surname>
            ,
            <given-names>T.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fiszman</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The interaction of domain knowledge and linguistic structure in natural language processing: Interpreting hypernymic propositions in biomedical text</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>36</volume>
          (
          <issue>6</issue>
          ) (
          <year>2003</year>
          )
          <fpage>462</fpage>
          -
          <lpage>477</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>