<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Matrix Decompositions in Formal Concept Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vaclav Snasel</string-name>
          <email>vaclav.snasel@vsb.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Petr Gajdos</string-name>
          <email>petr.gajdos@vsb.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hussam M. Dahwa Abdulla</string-name>
          <email>hussamdahwa@vsb.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Polovincak</string-name>
          <email>martin.polovincak.fei@vsb.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Computer Science, Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava</institution>
          ,
          <addr-line>17. listopadu 15, Ostrava, 708 33</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <fpage>121</fpage>
      <lpage>128</lpage>
      <abstract>
        <p>One of the main problems connected with the formal concept analysis and lattice construction is the high complexity of algorithms which plays a significant role when computing all concepts from a huge incidence matrix. In some cases, we only need to compute some of them to test for common attributes. In our research we try to modify an incidence matrix using matrix decompositions, creating a new matrix with fewer dimensions as an input for some known algorithms for lattice construction. In this paper, we want to describe methods of matrix decompositions and their influence on the concept lattice.</p>
      </abstract>
      <kwd-group>
        <kwd>Formal Concept Analysis</kwd>
        <kwd>Singular Value Decomposition</kwd>
        <kwd>Semi Discrete Decomposition</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        We are dealing with possible uses of matrix decompositions for the reduction of
concept lattices. These methods are well known in the area of information retrieval
under the names Latent Semantic Indexing (LSI) and Latent Semantic Analysis (LSA)
[2], [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. LSI and LSA have been used for the discovery of latent dependencies between
terms (or documents). We would like to apply this approach in the area of formal
concept analysis (FCA) [5]. First, we introduce the basic terms of Formal Concept
Analysis (FCA); then we describe the rank-k SVD decomposition and the LSI
method. Because the SVD decomposition presents some difficulty connected with the
setting of its parameters, we also introduce a different approach based on
semi-discrete matrix decomposition.
      </p>
      <p>Small examples will illustrate our approaches. We hope that this is a way to use
FCA on large data collections and to visualize a data structure via a concept lattice. Note
that the usefulness of this method depends on the concrete application, because it
introduces some noise into the lattice structure in order to simplify it.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Background</title>
      <p>In the following paragraphs we introduce the basic terms of
formal concept analysis, singular value decomposition and semi-discrete
decomposition. The pros and cons of the matrix decompositions will be discussed later.</p>
      <sec id="sec-2-1">
        <title>2.1 Formal Concept Analysis</title>
        <p>Definition 1. [9] A formal context C = (G, M, I) consists of two sets G and M and a
relation I ⊆ G × M. The elements of G are called objects and the elements of M are
called attributes of the context. To express that an object g ∈ G is
related by I to the attribute m ∈ M, we write gIm or (g, m) ∈ I and read it as
“the object g has the attribute m”. I is also called the incidence relation of the context.
Definition 2. [9] For a set A ⊆ G of objects we define
A′ = {m ∈ M | gIm for all g ∈ A} (the set of attributes common to the objects in A).
Correspondingly, for a set B ⊆ M of attributes we define
B′ = {g ∈ G | gIm for all m ∈ B} (the set of objects which have all attributes in B).
Definition 3. [9] A formal concept of the context (G, M, I) is a pair (A, B)
with A ⊆ G, B ⊆ M, A′ = B and B′ = A. We call A the extent and B the intent of the concept
(A, B). β(G, M, I) denotes the set of all concepts of the context (G, M, I).</p>
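        The derivation operators of Definitions 2 and 3 can be sketched in a few lines of Python. The context below (sets G, M and relation I) is an invented toy example, not data from the paper; `up` computes A′ and `down` computes B′.

```python
# Toy formal context: objects G, attributes M, incidence relation I.
# All names here are illustrative placeholders.
G = {"g1", "g2", "g3"}
M = {"a", "b", "c"}
I = {("g1", "a"), ("g1", "b"), ("g2", "b"), ("g3", "b"), ("g3", "c")}

def up(A):
    """A' : the set of attributes common to all objects in A (Definition 2)."""
    return {m for m in M if all((g, m) in I for g in A)}

def down(B):
    """B' : the set of objects that have all attributes in B (Definition 2)."""
    return {g for g in G if all((g, m) in I for m in B)}

# (A, B) is a formal concept iff A' = B and B' = A (Definition 3).
A = down({"b"})          # objects having attribute b
B = up(A)                # attributes common to those objects
print(A == down(B))      # True: (A, B) is a formal concept
```

Applying `down` and then `up` (or vice versa) is a closure operator, which is exactly what lattice-construction algorithms iterate over.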
        <p>Definition 4. [9] The concept lattice β (G,M, I) is a complete lattice in which infimum
and supremum are given by</p>
        <p>⋀_{t∈T} (A_t, B_t) = (⋂_{t∈T} A_t, (⋃_{t∈T} B_t)″)
⋁_{t∈T} (A_t, B_t) = ((⋃_{t∈T} A_t)″, ⋂_{t∈T} B_t)</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2 Singular Value Decomposition</title>
        <p>
          Singular value decomposition (SVD) is well known because of its application in
information retrieval as LSI. SVD is especially suitable in its variant for sparse
matrices. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]
Theorem 1. (Singular Value Decomposition) Let A be an m × n matrix of rank r. Let
σ_1 ≥ … ≥ σ_r be the square roots of the nonzero eigenvalues of the matrix AA^T. There exist
matrices U = (u_1, . . . , u_r) and V = (v_1, . . . , v_r), whose column vectors are orthonormal,
and a diagonal matrix Σ = diag(σ_1, ..., σ_r). The decomposition A = UΣV^T is referred to
as a singular value decomposition of matrix A, and the numbers σ_1, …, σ_r are the singular
values of the matrix A. The columns of U (or V) are referred to as the left (or right) singular
vectors of matrix A.
        </p>
        <p>Now we have a decomposition of the original matrix A. Needless to say, the left
and right singular vectors are not sparse. We have at most r nonzero singular
values, where r ≤ min(m, n). Because the singular values usually decrease
quickly, we only need to take the k greatest singular values and the corresponding singular
vector coordinates to create a k-reduced singular decomposition of matrix A.
Definition 5. Let k, 0 &lt; k &lt; r, and let A have the singular value decomposition
A = UΣV^T = (U_k U_0) diag(Σ_k, Σ_0) (V_k V_0)^T,
where Σ_k = diag(σ_1, ..., σ_k). Then
A = U_k Σ_k V_k^T is referred to as a k-reduced singular value decomposition (rank-k SVD).</p>
        <p>In information retrieval, if every document is relevant to only one topic, we obtain
latent semantics – semantically related words and documents will have similar
vectors in the reduced space. For an illustration of rank-k SVD see Fig. 1; the grey areas
denote the first k coordinates of the singular vectors, which are the ones being used.</p>
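        The rank-k SVD of Definition 5 can be computed directly with NumPy's `np.linalg.svd`; the small 0/1 matrix A below is an illustrative placeholder, not the paper's incidence matrix.

```python
import numpy as np

# Placeholder binary "incidence" matrix (objects x attributes).
A = np.array([[1., 0., 1.],
              [1., 1., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])

# Thin SVD: U has orthonormal columns, s holds singular values
# in non-increasing order, Vt is V transposed.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                    # keep the k greatest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.all(s[:-1] >= s[1:]))           # True: singular values are sorted
print(A_k.shape)                         # same shape as A
```

By the Eckart-Young theorem [12], A_k is the best rank-k approximation of A in the Frobenius norm.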
      </sec>
      <sec id="sec-2-3">
        <title>2.3 Semi-Discrete Matrix Decomposition (SDD)</title>
        <p>We have used another alternative decomposition in our research, the semi-discrete
decomposition (SDD). For the same data set and incidence matrix, the SDD performs as well
as the SVD while using less than one-tenth of the storage space. In this section, we briefly
overview the SDD and then describe its purpose.</p>
        <p>The semi-discrete decomposition approximates a matrix as a weighted sum of outer
products formed by vectors with entries constrained to the set S = {−1, 0, 1}.
O’Leary and Peleg introduced the SDD in the context of image compression [8], and
Kolda and O’Leary used the SDD for latent semantic indexing (LSI) in information
retrieval [6], [7].</p>
        <p>The primary advantage of the SDD over other types of matrix decompositions such
as the truncated SVD is that it typically provides a more accurate approximation for
far less storage. An SDD of an m × n matrix A is a decomposition of the form
A_k = [x_1 x_2 ... x_k] diag(d_1, d_2, ..., d_k) [y_1 y_2 ... y_k]^T,
as shown in figure 2. Here each x_i is an m-vector with entries from the set S = {−1, 0, 1}, each y_i
is an n-vector with entries from the set S, and each d_i is a positive scalar. We call this
a k-term SDD.</p>
        <p>Note: From now on, we will assume rank-k semi discrete decomposition when
speaking about SDD.</p>
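        The storage scheme behind the k-term SDD can be illustrated as follows. This is only a sketch of the reconstruction A_k = Σ_i d_i x_i y_i^T with hand-picked vectors, not O'Leary and Peleg's fitting algorithm [8]; all values are invented for illustration.

```python
import numpy as np

# k = 2 terms: each x_i is an m-vector and each y_i an n-vector with
# entries from S = {-1, 0, 1}; each d_i is a positive scalar weight.
xs = [np.array([1, 1, 0, -1]), np.array([0, 1, 1, 0])]
ys = [np.array([1, 0, 1]),     np.array([0, 1, -1])]
ds = [2.0, 0.5]

# Reconstruct the k-term approximation as a weighted sum of outer products.
A_k = sum(d * np.outer(x, y) for d, x, y in zip(ds, xs, ys))
print(A_k.shape)   # (4, 3): an m x n matrix from only k*(m + n) ternary
                   # entries plus k floats
```

Since each entry of x_i and y_i needs only two bits, a k-term SDD stores far less than the k dense singular vectors of a truncated SVD, which is the storage advantage described above.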
      </sec>
      <sec id="sec-2-4">
        <title>2.4 Nonnegative Matrix Factorization (NMF)</title>
        <p>Nonnegative matrix factorization differs from other rank-reduction methods for
vector space models in text mining, e.g. principal component analysis (PCA) or vector
quantization (VQ), in its use of constraints that produce nonnegative basis vectors,
which make possible the concept of a parts-based representation.</p>
        <p>Lee and Seung [11] first introduced the notion of parts-based representations for
problems in image analysis and text mining that occupy nonnegative subspaces in a
vector-space model. Techniques like PCA and VQ also generate basis vectors, various
additive and subtractive combinations of which can be used to reconstruct the original
space. But the basis vectors for PCA and VQ contain negative entries and cannot be
directly related to the original vector space to derive meaningful interpretations. In the
case of NMF, the basis vectors contain no negative entries, allowing only additive
combinations of the vectors to reproduce the original. So the perception of the whole,
be it an image or a document in a collection, becomes a combination of its parts
represented by these basis vectors. In text mining, the vectors represent or identify
semantic features, i.e. a set of words denoting a particular concept or topic. If a
document is viewed as a combination of basis vectors, then it can be categorized as
belonging to the topic represented by its principal vector. Thus, NMF can be used to
organize text collections into partitioned structures or clusters directly derived from
the nonnegative factors [10].
</p>
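        A compact sketch of NMF using the multiplicative updates of Lee and Seung [11] is given below. The matrix dimensions, random data, iteration count and the small stabilizing constant are our illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((6, 4))          # nonnegative data matrix (e.g. term-document)
k = 2                           # number of basis vectors
W = rng.random((6, k)) + 0.1    # nonnegative initial factors
H = rng.random((k, 4)) + 0.1

for _ in range(200):
    # Multiplicative updates: entries stay nonnegative and the
    # Frobenius error ||A - WH|| is non-increasing.
    H *= (W.T @ A) / (W.T @ W @ H + 1e-9)
    W *= (A @ H.T) / (W @ H @ H.T + 1e-9)

err = np.linalg.norm(A - W @ H)
print(W.min() >= 0 and H.min() >= 0)   # True: only additive combinations
```

Because W and H contain no negative entries, each data vector is reconstructed purely additively from the basis vectors, which is what makes the parts-based interpretation possible.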
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 Our Experiment</title>
      <sec id="sec-3-1">
        <title>3.1 Input Data</title>
        <p>Data from the Ostrava Mass Transit Authority was used in this experiment. The
data can be described as a table: rows represent tram stops, columns represent tram lines, and
the entry is 1 if the tram stops at the tram stop in the appropriate row and column crossing,
and zero otherwise. This data was chosen because we can easily see whether the SVD is
working. Many tram lines share the same tram stops. These shared parts of tram lines should
be minimized by the SVD, and this should also be visible in the lattice after using SVD.</p>
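        Building such a binary incidence matrix can be sketched as follows; the tram line and stop names here are invented placeholders, not the actual Ostrava data.

```python
# Hypothetical timetable: each line maps to the set of stops it serves.
lines = {
    "L1": {"Poruba", "Svinov", "Centrum"},
    "L2": {"Svinov", "Centrum", "Dubina"},
    "L3": {"Poruba", "Svinov"},
}
stops = sorted(set().union(*lines.values()))   # row labels (tram stops)

# incidence[i][j] = 1 if line j stops at stop i, else 0
incidence = [[1 if stop in lines[l] else 0 for l in sorted(lines)]
             for stop in stops]
for stop, row in zip(stops, incidence):
    print(stop, row)
```

The real matrix in the experiment is built the same way from the first ten tram lines, giving 92 rows and 10 columns.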
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Steps of Our Experiment</title>
        <p>Our experiment included several steps. The first step was to read and transform the
data into an incidence matrix. After transforming and computing (using SVD), three new
matrices were obtained as the result of the SVD A = UΣV^T. After choosing the rank k, a
new matrix was computed.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3 Reading and Transforming Data</title>
        <p>Data was obtained from tram timetables. The first ten tram lines and their tram
stops were chosen. The transformed matrix had 92 rows and 10 columns. The
transformation was simple: if a tram stops at a particular tram stop, the number 1
appears at the appropriate row and column crossing, otherwise zero. This matrix is
an incidence matrix and was used in the SVD, NMF and FCA computations.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4 Using SVD</title>
        <p>The SVD computations were made with the SVDLIBC-fix software. This software is free
and is written in the C language. An incidence matrix acts as the input. The software can
compute all three matrices U, Σ and V^T; for our 92×10 incidence matrix these are
U(92×10), Σ(10×10) and V^T(10×10). The rank k = 4 was chosen for the reduced decomposition.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5 Computing Concept Lattices</title>
        <p>Both the incidence matrix and the product of the SVD computation were used to compute
concept lattices. Both matrices were passed as input to the Concept Lattice software developed
at VSB–TU Ostrava, which produced the concept lattices. The
visualizations of these concept lattices were completed with the GraphPlace software.
GraphPlace’s output is a PostScript file that can be viewed in any PostScript viewer.</p>
        <p>A concept lattice was computed from the output matrix. One can see the difference
between the concept lattice computed from the original matrix and the concept lattice
computed from the output matrix using SVD.</p>
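        For lattice construction the rank-k reconstruction must be a binary matrix again, so a thresholding step is implied. The sketch below shows this pipeline on a placeholder matrix; the 0.5 threshold is our assumption, since the paper does not state which threshold was used.

```python
import numpy as np

# Placeholder binary incidence matrix (the real one is 92 x 10).
A = np.array([[1., 0., 1., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]    # rank-k reconstruction

# Re-binarize so the result can feed lattice-construction software;
# the 0.5 cutoff is an assumed choice.
B = (A_k > 0.5).astype(int)
print(int((B != A.astype(int)).sum()))  # how many incidences were flipped
```

The flipped incidences are exactly the "noise" mentioned in the introduction: weak attribute occurrences vanish, which removes some nodes and edges from the resulting concept lattice.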
        <p>Fig. 5. Example 1
Example 1. Node 11 (figure 5): Node 11 has the attributes (e, g, h) in line 69 of the
original matrix. Attribute (e) disappears from line 69 after the SVD. Attribute (e) is a
shared attribute and exists only in this node at this level. When the attribute is lost, the
connection between node 11 and node 10 is lost too. The other attributes of node 11,
(g, h), appear in other nodes at the same level. Therefore node 11 is lost, together with node
10, which has lost all connections to other nodes.</p>
        <p>Example 3. Node 34 (figure 7): Node 34 has the attributes (a, i) in lines (8, 9, 11, 16, 17,
18, 19, 20, 21). After the SVD, attribute (i) is lost from lines (16, 17, 18, 19, 20, 21)
and only attribute (a) remains. Attribute (a) appears in other nodes at the same level
(node 38), and lines (8, 9, 11) appear in other nodes at the same level (node 38).</p>
        <p>Also, node 8 has the attributes (f, i) in lines (8, 9, 11, 79). After the SVD we lose attribute
(i) from line (79) and only attribute (f) remains, which appears in another node at the same
level (node 38); lines (8, 9, 11) also appear in another node at the same level (node 38).
Therefore nodes 34 and 8 are lost after the SVD. Losing nodes 34 and 8 also means
losing the connection with node 1, which has attribute (i), so node 1 is lost as well.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4 Conclusion</title>
      <p>Every node in the concept lattice has a distinct set of attributes; no two nodes
can be found with the same set of attributes. The experiment consists in finding the
changes of objects and their sets of attributes after using the SVD. The SVD removes some
attributes from the original matrix because these attributes do not appear in many
objects. Such attributes are not primary and can be lost without any problem.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>G.</given-names>
            <surname>Di Battista</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Eades</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tamassia</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Tollis</surname>
          </string-name>
          , “
          <article-title>Algorithms for drawing graphs: an annotated bibliography”</article-title>
          ,
          <source>Computational Geometry: Theory and Applications</source>
          , vol.
          <volume>4</volume>
          ,
          <year>1994</year>
          , pp.
          <fpage>235</fpage>
          -
          <lpage>282</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Berry</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Browne</surname>
          </string-name>
          ,
          <source>Understanding Search Engines: Mathematical Modeling and Text Retrieval</source>
          , SIAM,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Berry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dumais</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Letsche</surname>
          </string-name>
          . “
          <article-title>Computation Methods for Intelligent Information Access”</article-title>
          ,
          <source>Proceedings of the 1995 ACM/IEEE Supercomputing Conference</source>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>P.</given-names>
            <surname>Gajdoš</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Moravec</surname>
          </string-name>
          , “
          <article-title>Concept lattice generation by singular value decomposition”</article-title>
          ,
          <source>CLA</source>
          <year>2004</year>
          ,
          <year>2004</year>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>B.</given-names>
            <surname>Ganter</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Wille</surname>
          </string-name>
          ,
          <source>Formal Concept Analysis</source>
          . Springer-Verlag Berlin Heidelberg,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Kolda</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. P.</given-names>
            <surname>O'Leary</surname>
          </string-name>
          ,
          <article-title>Computation and uses of the semidiscrete matrix decomposition</article-title>
          ,
          <source>Technical Report CS-TR-4012</source>
          , Oak Ridge National Laboratory,
          <year>1999</year>
          , pp.
          <fpage>8</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Kolda</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. P.</given-names>
            <surname>O'Leary</surname>
          </string-name>
          , “
          <article-title>A semidiscrete matrix decomposition for latent semantic indexing information retrieval”</article-title>
          ,
          <source>ACM Transactions on Information Systems</source>
          <volume>16</volume>
          (
          <issue>4</issue>
          ),
          <year>1998</year>
          , pp.
          <fpage>322</fpage>
          -
          <lpage>346</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>D. P.</given-names>
            <surname>O'Leary</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Peleg</surname>
          </string-name>
          , “
          <article-title>Digital image compression by outer product expansion”</article-title>
          ,
          <source>IEEE Transactions on Communications 31</source>
          ,
          <year>1983</year>
          , pp.
          <fpage>441</fpage>
          -
          <lpage>444</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>R.</given-names>
            <surname>Wille</surname>
          </string-name>
          ,
          <article-title>Lattices in data analysis: How to draw them with a computer</article-title>
          ,
          <source>Algorithms and Order</source>
          ,
          <year>1989</year>
          , pp.
          <fpage>33</fpage>
          -
          <lpage>58</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <source>Information Processing and Management</source>
          ,
          <volume>42</volume>
          (
          <issue>2</issue>
          ):
          <fpage>373</fpage>
          -
          <lpage>386</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>D.</given-names>
            <surname>Lee</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Seung</surname>
          </string-name>
          , “
          <article-title>Algorithms for non-negative matrix factorization”</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          ,
          <year>2001</year>
          , pp.
          <fpage>556</fpage>
          -
          <lpage>562</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>G.</given-names>
            <surname>Young</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Eckart</surname>
          </string-name>
          ,
          <article-title>The approximation of one matrix by another of lower rank</article-title>
          ,
          <source>Psychometrika</source>
          <volume>1</volume>
          ,
          <year>1936</year>
          , pp.
          <fpage>211</fpage>
          -
          <lpage>218</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>