<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>FCA-based Search for Duplicate ob jects in Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dmitry A. Ilvovsky</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mikhail A. Klimushkin</string-name>
          <email>klim.mikhail@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Research University Higher School of Economics, School of Applied Mathematics and Informational Science 11</institution>
          <addr-line>Pokrovskiy boulevard, Moscow</addr-line>
          ,
          <country>Russia dilv</country>
        </aff>
      </contrib-group>
      <fpage>36</fpage>
      <lpage>46</lpage>
      <abstract>
        <p>A new approach for detecting duplicates in ontology built on real redundant data is considered. This approach is based on transformation of initial ontology into a formal context and processing this context using methods of Formal Concept Analysis (FCA). As a part of a new method we also introduce a new index for measuring similarity between objects in formal concept. We study the new approach on randomly generated contexts and real ontology built for a collection of political news and documents.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        One of the most generic ways to represent structured data is an ontology [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
A common way to build an ontology is its automatic or semi-automatic
generation from unstructured data (usually text). The problem of this approach
is data redundancy, because di erent sources of information often contain
duplicated information. Detecting and elimination of redundancy directly at the
ontology building (or extending) stage (for instance, via pairwise comparison of
new objects with existing ones) is not very e ective because such an approach
signi cantly increases burden on the expert who makes nal decisions.
Moreover, in real world redundant data comes to ontology unevenly and it makes
sense to eliminate redundancy not every time when an ontology renews but do
it at longer intervals. The duration of intervals can be determined by features of
a particular domain.
      </p>
      <p>
        In this work we consider a new method for e ective identi cation of
redundancy in data represented by an ontology. This method can be either used as an
automatic detector of a list of potential duplicate objects or as a
recommendation system. Algorithm is realized via union of closed sets of objects and based
on Formal Concept Analysis methods [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
to methods of object-attribute clustering. FCA considers not clusters of
objects without their attribute descriptions, but groups of objects and attributes
strongly related with each other.
      </p>
      <p>De nition 1. A formal context K is de ned as a triple (G; M; I), where G is
called a set of objects, M is called a set of attributes, I G M is a binary
relation speci es which objects have which attributes.</p>
      <p>If g 2 G, m 2 M and gIm, the object g has the attribute m.</p>
      <p>De nition 2. The derivation operators (:)0 are de ned for A
as follows:
G and B</p>
      <p>M</p>
      <p>A0 , fm 2 M j gIm8g 2 Ag; B0 , fg 2 G j gIm8m 2 Bg
De nition 3. Formal concept of the context K = (G; M; I) is a pair (A; B),
where A G, B M , A0 = B and B0 = A. The set A is called the extent, and
B is called the intent of the concept (A; B).</p>
      <p>De nition 4. A concept (A; B) is a subconcept of (C; D) if A
D B). In this case (C; D) is called a superconcept of (A; B).</p>
      <p>C (equivalently
The set of all concepts of K ordered by subconcept-superconcept relation forms
a lattice, which is called the concept lattice (K).
3</p>
    </sec>
    <sec id="sec-2">
      <title>Problem statement</title>
      <p>The problem solved in this paper is to search for duplicate objects in an ontology,
i.e objects describing the same object in the real world. The original problem
was posed by analysts of Avicomp company. Their main interest is to search
for duplicate objects describing people and companies in an ontologies built by
the automatic semantic processing ow of news texts. Currently, this problem
in Avicomp is solved by methods based on the Hamming distance and a variety
of additional heuristics.</p>
      <p>The input to the algorithm takes an ontology constructed from text sources.
An ontology contains objects of di erent classes. Objects can involve
relationships, appropriate to their classes. The number of detected features and links
between object can vary greatly. Some objects describe the same object in the
real world.</p>
      <p>At the output the algorithm should return lists of objects that have been
detected as duplicates. The algorithm must have high precision because the
returning of two di erent objects as duplicates is more critical error than not
detecting some of the duplicates of the object.</p>
    </sec>
    <sec id="sec-3">
      <title>A method of duplicates detection</title>
      <p>The algorithm of duplicates detection consists of several stages:
1. Transformation of a source ontology to a multi-valued context.
2. Scaling multi-valued context.
3. Building the set of formal concepts of the context.
4. Building the set of potential duplicate objects.
4.1</p>
      <sec id="sec-3-1">
        <title>Transformation of a source ontology to the multi-valued context</title>
        <p>The source (instance of) ontology is transformed to the multi-valued context as
follows:
1. A set of context objects is a set of objects O of the ontology.
2. A set of context attributes is a set M = L [ C [ R, where:
{ L is a set of all features de ned by all ontology classes,
{ C is a set of binary attributes, which characterize object classes,
{ R is a set of binary attributes, which describe relations between ontology
objects. Any relation p(x; y) between ontology objects x and y generates
two binary attributes in the context: p(x; ) and p( ; y). They correspond
to the relations p from x and p to y. An object z has an attribute p( ; y)
in context i there exists a relation p from z to y in the source ontology.
3. Each object is incident to the values of its source attributes. Also, each object
gets a special value null for those attributes not incident to it or those
whose incidence to the object is unknown. Also it gets binary attributes,
corresponding to the object's class (and all of its superclasses) and relations
between this object and other objects.
4.2</p>
      </sec>
      <sec id="sec-3-2">
        <title>Scaling multi-valued context</title>
        <p>
          The multi-valued context is converted to a binary context by means of scaling [
          <xref ref-type="bibr" rid="ref1 ref3">1,
3</xref>
          ]. Attribute sets C and R are binary. Attribute set L is scaled depending on the
type of attribute. As a rule, many attributes describe nonquantitative attributes
of objects (e.g., a persons name, a company name, etc.). Moreover, many
quantitative or numerical data are such that an approximated similarity by these
attributes does not mean the similarity of objects. By way of example, if two
company objects have attributes 2005 and 2006 as their year of establishment,
the proximity (failure to match) of these attributes does not make us sure that
the objects describe the same company; they more likely produce the opposite
e ect. For such attributes, only matching of their values makes sense and if the
values are di erent then the distance between them does not matter. Such
attributes are scaled by a nominal scale, i.e., its own binary attribute corresponds
to each attribute value. In other attributes, some other scaling types are used
such as:
{ Interval type: the transformation of an attribute A into a set of binary
attributes of type a A &lt; b. In this case, the intervals [a; b) can be both
disjoint and overlapping.
{ Ordering : an attribute A is transformed into a set of binary attributes of
        </p>
        <p>A &gt; b type.
{ Other scaling types which, in an experts opinion, can characterize the
similarity of duplicate objects in the best way.</p>
        <p>The experiments on the generated data and a real ontology use only nominal
scaling; however, this does not restrain the generality of the proposed approach.
4.3</p>
      </sec>
      <sec id="sec-3-3">
        <title>Building set of formal concepts of the context</title>
        <p>
          There are several e ective methods for building sets of formal concepts of the
formal context. In this work the AddIntent [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] algorithm was used.
4.4
        </p>
      </sec>
      <sec id="sec-3-4">
        <title>Formal concepts ltering</title>
        <p>After building the set of formal concepts it is necessary to distinguish formal
concepts having an extent which includes only duplicates of the particular object.
Building special indices to lter concepts we have to consider all main features
of these concepts. First, index must increase if the number of attributes which
are di erent for objects in a extent decreases all other things being equal. To take
into account this property we used the following index:</p>
        <p>The second feature that an index must ful ll is | it must increase when the
number of common attributes is increasing, all other things being equal. As a
result we used an index that has got this feature:
The nal index DII is a combination of two indices described earlier. In our
work we used the following combination variants:
1. Linear combination of indices:
2. Product of indices with power coe cients:</p>
        <p>DII = I1k1 I2k2 (4)
Absolute values of the coe cients have an in uence only to the value of a
threshold and ltering quality depends on the coe cients correlation in the index
formula. So we can specialize indices family without loss of the optimum and take
1 as a value for the one of the coe cients:</p>
        <p>DII+ = I1 + kI2; k &gt; 0;</p>
        <p>DII = I1 I2k; k &gt; 0
I1(A; B) = Pg2A jfgg0j</p>
        <p>jAjjBj
I2(A; B) = X</p>
        <p>A
j j
m2B jfmg0j
DII+ = k1I1 + k2I2
(1)
(2)
(3)
(5)</p>
      </sec>
      <sec id="sec-3-5">
        <title>Forming the set of potential duplicate objects</title>
        <p>The lists of objects that the algorithm returns as potential duplicates are
obtained from those formal concept extents that have high value of index. The
algorithm have two work modes: automatic operation mode and semi-automatic
operation mode with an expert assistance.</p>
        <p>In automatic mode the algorithm lters formal concepts by the threshold
of the developed index. At this stage, various heuristics can be added that are
hard to account for by means of the index. Then the algorithm prepares the
lists of duplicate objects. We assume that the binary relation \be a duplicate"
is transitive and it holds for objects in a selected formal concept. In this case
to nd lists of duplicates we should nd connected components in the graph of
objects with the "duplicate" relation.</p>
        <p>In semi-automatic operation mode with an expert assistance the algorithm
consequently o ers expert estimate concepts in descending order of DII values.
The lists of duplicates are built as soon as an expert gives answers. Before asking
an expert to estimate the concept the algorithm searches for all lists of duplicate
objects with non-empty extent intersections with this object. If this extent is
included in one of the lists then this concept is not o ered to the expert. So the
algorithm gets the list of objects corresponding to the current mark up made
by the expert. Furthermore, the expert can stop the estimating process at each
step and receive the formed lists of duplicate objects.
5</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments on arti cially generated formal contexts</title>
      <p>Basic experiments were conducted on arti cially generated data with the
duplicates known in advance in order to obtain statistical evaluation of the quality
of the developed algorithm. Thus, this enables us to evaluate the quality of
the method based on a large scope of input data and qualitatively compare it
with the most widespread alternative approaches. Along with this, in the data
generation, the features of ontology were taken into account to extrapolate the
obtained results onto real data.
5.1</p>
      <sec id="sec-4-1">
        <title>Input data generation</title>
        <p>Various quality metrics on arti cially generated contexts were used in order to
evaluate the method quality. The generated formal contexts in this case have
the properties of the contexts that were obtained from ontologies.</p>
        <p>First, the generated contexts must contain a large number of objects and
attributes. It is assumed that the objects will be measured in tens of thousands.
The number of binary attributes is comparable to the number of objects, since
many objects contain unique or rare attributes.</p>
        <p>In this case, each object has a relatively small number of attributes. Their
quantity does not exceed several tens. Therefore, the context is strongly
scattered and despite its large dimension it has a relatively small number of formal
concepts: from 5000 to 30000.</p>
        <p>Second, the number of object attributes varies considerably and, as a rule,
satis es the Mandelbrot law, i.e., the number of attributes is in reverse
proportion to the object range among the objects that are ordered by the number of
their attributes.</p>
        <p>The third property that is regarded in the context generation is an irregular
distribution of attribute frequencies. Commonly, the attribute frequency is
inversely proportional to its range in the sequence that is ordered by the frequency
of the appearance of the attribute in the context objects. Upon generating unique
objects, an input context was generated. An object in the context was generated
for each object in the following way: each object attribute was added with a
certain probability to the set of object attributes in the context. For some initial
objects, several objects were thus generated. The obtained objects were taken
as the duplicates of the same object.
5.2</p>
      </sec>
      <sec id="sec-4-2">
        <title>Comparative analysis of the methods for detecting duplicates</title>
        <p>
          As alternative methods we considered methods of pairwise comparison based on
Hamming distance and absolute similarity. Also we considered the method which
is similar to ours: the di erence was in the use of extensional stability index [
          <xref ref-type="bibr" rid="ref4 ref8">4,
8</xref>
          ] instead of our index.
        </p>
        <p>
          The method based on the concept extensional stability S. Kuznetsov
was the rst to introduce the formal concept stability in 1990 [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Later, in
work [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], it was proposed to distinguish extensional and intentional stability. In
our work, we deal with extensional stability since we assume that the objects
that are considered as duplicates must be strongly related to a large number of
attributes and have a small number of individual attributes. A formal concept
they generate must be stable to the elimination of individual attributes.
        </p>
        <p>The algorithm for searching for duplicates is similar to the basic method,
viz. the most (extensionally) stable concepts are deleted from the set of formal
concepts. Then it is assumed that the objects from the extent of the stable
formal concept are the duplicates of the same object. The relationship R \to be
duplicates" is built by the set of the chosen formal concepts. Then connectivity
components for this relation are sought. The obtained components are given to
the input as the lists of duplicate objects.</p>
      </sec>
      <sec id="sec-4-3">
        <title>The method based on absolute similarity This method is based on the</title>
        <p>pairwise comparison of objects. Duplicate objects are assumed to have a large
number of general attributes. Therefore, the number of general attributes serves
the criterion of object similarity. The indicator that is based on this measure is
the threshold of the quantity of general attributes.</p>
        <p>The algorithm receives an incoming square similarity matrix A : A[i][j] =
k , the ith and jth objects have k general binary attributes and the threshold
t(N ).</p>
        <p>The matrix A and the threshold are used to build an adjacency matrix B :
A[i][j] &gt; t ) B[i][j] = 1.</p>
        <p>The adjacency matrix (similarly to the ingress matrix) is symmetrical and
describes the similarity relationship R. Proceeding from the fact that the
relationship "to be a duplicate\ is an equivalence relationship and possesses
transitivity, its transitive closure R is built using the obtained relationship R. The
equivalence classes in R correspond to the object groups that are the
duplicates of the same object. The same result can be obtained by detecting all the
connectivity components of the relationship R.</p>
        <p>The asymptotic complexity of the algorithm by time is O(n2 m), where n is
the number of objects in the formal context and m is the number of attributes.</p>
      </sec>
      <sec id="sec-4-4">
        <title>The method based on Hamming distance The algorithm for detecting</title>
        <p>duplicates is based on the pairwise comparison of objects. The Hamming
distance serves as the metric of similarity. First, a square matrix of the distances
between objects is built. Then, using the obtained matrix A and a speci ed
threshold t(N ) the matrix B of the relationship R "to be a duplicate\ is built:
R : A[i][j] &gt; t ) B[i][j] = 1; (xi; xj ) 2 R. The obtained relationship will be
re ective and symmetrical. The connectivity components are sought by this
relationship. The objects that enter the same connectivity component are considered
as the duplicates of the same object.</p>
        <p>The asymptotic complexity of the algorithm is similar to that of the algorithm
based on absolute similarity, viz. O(n2 m), where n is the number of objects in
a formal context and m is the number of attributes.
5.3</p>
      </sec>
      <sec id="sec-4-5">
        <title>The results</title>
        <p>We used a few quality metrics for comparison of methods: recall, precision,
average value of recall when precision is 100%, Mean Average Precision (MAP):
M AP (K) =</p>
        <p>AveP (k) =</p>
        <p>PjiK=1j AveP (Ki)</p>
        <p>jKj
Pc2Ck (P (c))
jCkj
;
where K is set of contexts, Ck is set of relevant formal concepts of the context
k, P (c) is number of the relevant concepts between all of the concepts having
range (value of index) not lower than the concept c.</p>
        <p>For the evaluation of the new method optimal coe cients for the index were
primarily chosen. The coe cient was chosen using one of the generated contexts.
The network on the positive real line was taken and the MAP index was
maximized on it. Therefore, the coe cients for the used variants of the index DII
were obtained:</p>
        <p>DII+ = I1 + 0:25I2;
(7)
(8)
(9)</p>
        <p>DII = I1 I0:18
2</p>
        <p>The algorithm with this index was compared with the alternative methods
for searching for duplicates. In order to build the function of algorithm precision
versus its recall, several tens of di erent thresholds were speci ed and then for
each threshold, the recall and the precision were calculated.</p>
        <p>The method based on extensional stability demonstrates good results at a
high index threshold. At a threshold above 0.5 only formal concepts that have
duplicates are chosen. At a threshold below 0.5, the algorithm precision drops
on average to 10%, since a large number of formal concepts with stability 0.5
are one attribute concepts that do not characterize duplicate objects.</p>
        <p>The algorithm for searching for duplicates using Hamming distance has shown
relatively low results. The Hamming distance takes into consideration only
distinctions in attributes rather than the quantity of general attributes</p>
        <p>The algorithm based on absolute similarity proved to be the most e cient
among the considered alternative algorithms. In most cases, a large number of
common attributes in a pair of objects means that these objects are duplicates.
The disadvantage of the index is that it disregards the distinctions between
objects.</p>
        <p>The algorithm based on the new index demonstrated better results than the
alternatives considered. The main distinct feature of the new method is small
decrease of precision (down to 90%) while recall increases up to 70%. The results
for DII+ and DII are very similar. The di erence is in that the behavior of
DII is less stable, viz., while sometimes making errors at a large threshold
the algorithm did not make errors at a low threshold and detected 42% of the
duplicates
(10)</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Expirements on a real ontology</title>
      <p>
        The ontology for tests was built by Avicomp. This ontology was built and
extended automatically by semantic analysis of several political news sites. The
OntosMiner [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] programming tool set was used. The ontology contains 12006
objects of di erent classes. We used our algorithm for detecting duplicates with
objects belonging to classes \Person" and \Company". The ontology contains
9821 such objects. Though we searched for duplicates only in two classes we used
all classes and relations between objects and classes in ontology as attributes of
objects in these classes.
      </p>
      <p>A rather simple heuristical constraint was added in the algorithm based on
the new index DII (the DII+ variant was used): we ltered out concepts which
contained objects having di erent values of attribute Name or Last name. The
algorithm detected 905 group of objects. Group size ranged from 2 to 41 objects.
The largest groups found by the algorithm described such people as Benjamin
Netanyahu (41 objects), Julia Tymoshenko (35 objects), Vladimir Putin (34
objects), Dmitry Medvedev (33 objects), Steve Jobs (31 objects) etc. However,
the main part of the detected groups contains 2 to 4 objects.</p>
      <p>With experts assistance we estimated the precision of our algorithm. We
could be sure that 98% of the detected groups consist of duplicates. Very often
we can see groups, where attributes Name and Last name are not common, but
other attributes and relations let the algorithm place these objects in one group.
For instance, the algorithm detected 7 objects, describing Ksenia Sobchak and
having only 1 common ontology attribute but brought together because of same
relations with other objects.</p>
      <p>It is necessary to point out that attribute weights in index I2 let algorithm
detect large groups of objects describing Putin, Tymoshenko, Medvedev etc.
The key feature of these objects is that all of them have a lot of attributes and
relations that di er from other objects.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions</title>
      <p>In this work a new algorithm for the detection of duplicate objects was
introduced. The algorithm is based on methods of Formal Concepts Analysis. In
particular a index for ranking formal concepts was proposed. The index allows
one to select the set of concepts containing only duplicate objects with high
accuracy. The proposed method was compared with other approaches to the solution
of the problem of data duplication on randomly generated data and real
ontology data. Experiments demonstrated the e ectiveness of the new index. Further
work will consist of estimating recall of the new method on a real ontology.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The results of the project \Mathematical Models, Algorithms, and Software
Tools for Intelligent Analysis of Structural and Textual Data", carried out within
the framework of the Basic Research Program at the National Research
University Higher School of Economics in 2012, are presented in this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
          </string-name>
          , R.:
          <source>Formal Concept Analysis: Mathematical Foundations</source>
          . In: Springer,
          <year>1999</year>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Maedche</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zacharias</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Clustering Ontology-based Metadata in the Semantic Web</article-title>
          .
          <source>In: Proc. of 6th European Conference on Principles of Data Mining and Knowledge Discovery</source>
          .|
          <year>2002</year>
          .|P.
          <fpage>348</fpage>
          -
          <lpage>360</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Prediger</surname>
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Logical scaling in formal concept analysis</article-title>
          .
          <source>In: ICCS, Lecture Notes in Computer Science</source>
          .|
          <year>1997</year>
          .|Vol.
          <volume>1257</volume>
          . Springer.|P.
          <fpage>332</fpage>
          -
          <lpage>341</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kuznetsov</surname>
            <given-names>S.O.</given-names>
          </string-name>
          :
          <article-title>Stability as an Estimate of the Degree of Substantiation of Hypotheses on the Basis of Operational Similarity</article-title>
          . In:
          <string-name>
            <surname>Nauchno-Tekhnicheskaya</surname>
            <given-names>Informatsiya</given-names>
          </string-name>
          ,
          <source>Seriya 2</source>
          , Vol.
          <volume>24</volume>
          , No.
          <volume>12</volume>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>29</lpage>
          ,
          <year>1990</year>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          :
          <article-title>On stability of a formal concept</article-title>
          .
          <source>In: Annals of Mathematics and Arti cial Intelligence</source>
          .|
          <year>2007</year>
          .|Vol.
          <volume>49</volume>
          .|P.
          <volume>101</volume>
          {
          <fpage>115</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.:</given-names>
          </string-name>
          <article-title>A Fast Algorithm for Computing All Intersections of Objects in a Finite Semi-Lattice</article-title>
          .
          <source>In: Automatic Documentation and Mathematical Linguistics</source>
          <volume>27</volume>
          (
          <issue>5</issue>
          ),
          <fpage>11</fpage>
          -
          <lpage>21</lpage>
          ,
          <year>1993</year>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Merwe</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Obiedkov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kourie</surname>
            ,
            <given-names>D.:</given-names>
          </string-name>
          <article-title>AddIntent: a new incremental algorithm for constructing concept lattices</article-title>
          .
          <source>In: LNCS</source>
          , Springer.|
          <year>2004</year>
          .|P.
          <fpage>205</fpage>
          -
          <lpage>206</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Obiedkov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kourie</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>On Succinct Representation of Knowledge Community Taxonomies with Formal Concept Analysis</article-title>
          .
          <source>In: IJFCS (Intl Journal of Foundations of Computer Science)</source>
          .|
          <year>2008</year>
          .|P.
          <fpage>383</fpage>
          -
          <lpage>404</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Rudolf</given-names>
            <surname>Wille</surname>
          </string-name>
          .:
          <article-title>Restructuring lattice theory: an approach based on hierarchies of concepts</article-title>
          . In: Ordered Sets: Dordrecht/Boston, Reidel. - 1982. P.
          <volume>445</volume>
          -
          <fpage>470</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          :
          <article-title>Mathematical Aspects of Concept analysis</article-title>
          .
          <source>In: Journal of Mathematical Sciences</source>
          , Vol.
          <volume>80</volume>
          ,
          <string-name>
            <surname>Issue</surname>
            <given-names>2</given-names>
          </string-name>
          ,
          <string-name>
            <surname>P.</surname>
          </string-name>
          1654 -
          <fpage>1698</fpage>
          , Springer.|
          <year>1996</year>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Obiedkov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Reducing the representation complexity of lattice-based taxonomies</article-title>
          .
          <source>In: 15th Intl Conf on Conceptual Structures</source>
          ,
          <string-name>
            <surname>ICCS</surname>
          </string-name>
          <year>2007</year>
          .
          <article-title>|She eld</article-title>
          ,
          <source>UK.|LNCS/LNAI</source>
          . Vol.
          <volume>4604</volume>
          . Springer.|
          <year>2007</year>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Klimushkin</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Obiedkov</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Approaches to the selection of relevant concepts in the case of noisy data</article-title>
          .
          <source>In: 8th International Conference, ICFCA2010</source>
          , Morocco.|Springer.|
          <year>2010</year>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. http://www.ontos.com/?page id=
          <volume>630</volume>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>