<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Can We Measure the Interpretability of Factors?</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science Palacky University Olomouc</institution>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <fpage>179</fpage>
      <lpage>190</lpage>
      <abstract>
        <p>Decomposition of matrices over some finite scale received a considerable attention in data mining research. The methods that perform such decomposition can be viewed as an implementation of factor analysis. Surprisingly, the main motivation that is behind the factor analysis, the interpretation of the factors, is given only a very small amount of attention, or is completely neglected, in current research. In this paper, we are arguing that the interpretation of factors is an important part of matrix decomposition and we propose a novel measure, based on simple structure from factor analysis, enabling the intererability measurement. Furthermore, we present an experimental evaluation of selected decomposition algorithms via our metric.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Motivation</title>
      <p>
        Decomposition of matrices over some finite scale—especially a case where the
scale contains only two elements, namely zero and one, called the Boolean
matrix decomposition—has become one of the standard methods in data mining
with applications to many fields. In a broad sense, these methods may be
considered as implementing the general idea of classical factor analysis introduced
by psychologist Charles Spearman [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>The motivation for the factor analysis comes from the psychology and the
social sciences. The general aim is to simplify complex data. More precisely to
describe original data via new more fundamental variables called factors.</p>
      <p>
        Boolean matrix decomposition (BMF) and in general the decomposition of
matrix over some finite scale, always reflects the ideas of factor analysis. This
is not surprising, because BMF also comes from the psychology, where Boolean
data often occur (see e.g. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]). On the other hand some aspects of factor analysis
are neglected in contemporary literature including matrix decomposition
methods, namely the quality of factors. In factor analysis, the quality of factors, more
precisely the interpretability of factors, is on the first place.
      </p>
      <p>The current direction of research focuses on creating new algorithms and
evaluating their quality in relation to the number of factors and the size of the
input data covered. If an analysis of factor interpretability is contained, it is
done by hand on a small number of datasets. The main reason is that the
interpretation of factors is subjective and very tedious. In addition, contemporary
literature lacks a uniform methodology or a metric to measure the
interpretability of factors.
c paper author(s), 2018. Proceedings volume published and copyrighted by its editors.</p>
      <p>Paper published in Dmitry I. Ignatov, Lhouari Nourine (Eds.): CLA 2018, pp.
179{190, Department of Computer Science, Palacky University Olomouc, 2018.
Copying permitted only for private and academic purposes.</p>
      <p>The purpose of this paper is twofold. We argue that the interpretation of
factors, which is often neglected, is important part of matrix decomposition
method; and we propose a new measure, based on the simple structure from
factor analysis, enabling an objective measurement of the interpretability of
factors.</p>
      <p>The main reason why we chose a simple structure as a criterion for the
interpretability of factors is the historical interdependence of factor analysis and
matrix decompositions. Simple structure is therefore the first choice, and for
this reason this work should be seen as the first small step in the new field of
research.</p>
      <p>The rest of the paper is organized as follows. In the following Section 2 we
provide a brief overview of current research involving matrix decomposition from
the factor interpretation standpoint. Moreover, we describe in Section 2 basic
approach to the factors interpretation in factor analysis. Then, in Section 3 basic
notions, notation and formalization of the metric are presented. The metric is
experimentally evaluated in Section 4. Section 5 draws a conclusion and future
research direction.
2
2.1</p>
    </sec>
    <sec id="sec-2">
      <title>Interpretation of Factors</title>
      <sec id="sec-2-1">
        <title>Lost in the Flood of Algorithms</title>
        <p>
          The current direction of matrix decomposition research focuses primarily on
the production of new algorithms and improving existing ones. An overview of
existing approaches and methods is beyond the scope of this paper (see e.g. [
          <xref ref-type="bibr" rid="ref18 ref5">5,18</xref>
          ]
which provides comparison of the most commonly used algorithms). The factors
interpretation has only a small amount of attention or it does not perform at
all. Note that this is indeed a feature of contemporary research. Early works
involving matrix decomposition usually contain a larger assessment—usually the
analysis of the factors of one particular dataset—of factor interpretability.
        </p>
        <p>
          One of the few exceptions is the work [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] that deals with the extensive detailed
analysis of factors. However, this analysis is done manually. Works involving
matrix decomposition do not contain any methodology or metrics to measure
the interpretability of factors. This is very surprising, especially because these
methods are inspired by classical factor analysis, where the interpretability of
factors and its measurement is an elementary concept.
2.2
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Good Factors Definition and Metric</title>
        <p>In classical factor analysis, the question of whether the factor is good or bad is
based on the law of parsimony, well known as Occam’s razor, i.e. we should pick
the simplest explanation of facts. A solution, which is selected via the parsimony
law, is called simple structure.</p>
        <p>
          In 1947 Thurstone proposed five simple criteria of simple structure in his
work [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. These can be seen as informal, vague and verbally described definition
of good factors. Thurstone’s criteria were as follows:
1. Each row of the rotated matrix should contain at least one zero.
2. In each factor the minimum number of zero loadings (see Section 3.2) should
be the number of factors in the rotation.
3. For every pair of factors there should be variables with zero loadings on one
and significant loadings on the other.
4. For every pair of factors a large portion of the loadings should be zero, at
least in a matrix with a large number of factors.
5. For every pair of factors there should be only a few variables with significant
loadings on both factors.
        </p>
        <p>Two of these criteria, namely 3 and 5 are of overriding importance.
Essentially, the criterion of simple structure is a factor matrix in which the factors
each have a few high loadings.</p>
        <p>
          Later in 1978, Cattell in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], who continues in the Thurstone’s work, argued
that the simple structure factors are usually simple to interpret. There have
been many attempts to formalize the simple structure (see e.g. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]). The result
of these attempts is an ad hoc formalization and a conclusion that there will
never be a simple formula describing Thurstone’s five criteria. Unfortunately,
these approaches cannot be adopted in the case of decomposition of matrices
over some finite scale, because they use a different calculus.
        </p>
        <p>On the other hand, this kind of data can be handled using fuzzy logic.
Moreover, in case of data over some finite scale, all Thurstone’s criteria can be
formalized via logical formulas. In the following section we will use the fuzzy logic to
formalize Thurstone’s criteria and create a metric allowing an objective analysis
of factor interpretability.
3
3.1</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Formalization of Metric</title>
      <sec id="sec-3-1">
        <title>Basics from Fuzzy Logic</title>
        <p>
          Fuzzy logic has been employed to handle the concept of partial truth, where
the truth value may range between completely true and completely false. This
approach has been proven to be useful in several areas and we refer to [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>Let us consider a set L of truth values. We assume that this set is
partially ordered (partial ordering is denoted by ≤), contains a least element 0 and
a greatest element 1.</p>
        <p>
          Let a and b are the truth degrees from L, then in L exists a truth value
which is greater than both a and b. The least element that is greater or equal to
both a and b is called supremum of a and b. Analogously, we can define infimum
of a and b—the greatest element from L which is smaller or equal to both a
and b. We define the lower cone of A by L(A) = {a ∈ L|a ≤ b for all b ∈ A}
and the upper cone of A by U (A) = {a ∈ L|b ≤ a for all b ∈ A}. If L(A) has
a greatest element a, then a is called the supremum of A (denoted W A) and
dually if U (A) has a least element a, then a is called the infimum of A (denoted
V A). In particular, we assume that the partial order ≤ makes L a complete
lattice [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] (i.e., arbitrary infima V and suprema W exist in L). This assumption
is automatically satisfied if L is a finite chain (i.e. a ≤ b or b ≤ a for every
a, b ∈ L), in which case a ∧ b = min(a, b) and a ∨ b = max(a, b). We also need
to define a logical conjunction operation (denoted by ⊗). We assume that ⊗ is
commutative, associative, has 1 as its neutral element (a ⊗ 1 = a = 1 ⊗ a), and
distributes over arbitrary suprema, i.e. a ⊗ (Wj∈J bj ) = Wj∈J (a ⊗ bj ). This leads
to if a and b are truth degrees of propositions p1 and p2, then a ⊗ b is the truth
degree of proposition “p1 and p2”.
        </p>
        <p>Importantly, ⊗ induces another operation, →, called the residuum of ⊗,
which plays the role of the truth function of implication and is defined by
a → b = max{c ∈ L | a ⊗ c ≤ b}.</p>
        <p>Residuum, which may be looked at as a kind of division, satisfies an important
technical condition called adjointness:</p>
        <p>a ⊗ b ≤ c iff a ≤ b → c,
which is also utilized below. This leads to algebraic structures called residuated
lattices.
3.2</p>
      </sec>
      <sec id="sec-3-2">
        <title>Basic Notions of Matrix Decomposition</title>
        <p>In general, matrix decomposition aims at whether data involving objects and
their directly observable attributes may be explained by a smaller number of
different, more fundamental attributes called factors. For example, whether
performances of students (directly observable attributes) may be described by some
treats of their intelligence (factors). Formally, the input data is represented by an
n × m object–attribute matrix I and the “explanation” means a decomposition
I = A ◦ B.
(1)
(exact or approximate) of I into a product A ◦ B of an n × k object–factor matrix
A—called a score matrix in the factor analysis terminology—and a k × m factor–
attribute matrix B—called a loading matrix in the factor analysis terminology.
What kind of matrices (real, Boolean, or other) and what kind of product ◦ are
involved determines the semantics of the factor model.</p>
        <p>In this paper, we are mainly focused on the decomposition of matrices
containing grades of certain scales L with the sup-⊗ product. In particular, the
matrix entry Iij is a degree to which attribute j applies to object i. Similarly,
Ail and Blj are the degrees to which factor l applies to object i and the degree to
which attribute j is a (one particular) manifestation of factor l. The case where
the scale L contains only two degrees (0 and 1), is called the Boolean matrix
decomposition.</p>
        <p>Equation (1) has the following meaning. Object i has attribute j if and only
if there exists factor l such that i has l (or, l applies to i) and j is one of the
particular manifestations of l. The meaning can be described by the formula</p>
        <p>Let us note, in the Boolean case (L = {0, 1}), the meaning of equation (1)
may be described via formula
(A ◦ B)ij = Wk</p>
        <p>l=1 Ail ⊗ Blj ,
k
(A ◦ B)ij = max min(Ail, Blj ).</p>
        <p>l=1</p>
        <p>
          There exist two concrete variants of the decomposition problem. These two
problems reflect two important views on matrix decomposition. The first one—
the discrete basis problem (DBP) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]—emphasizes the importance of the first
k (presumably the most important) factors. The second one—the approximate
factorization problem (AFP) [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]—emphasizes the need to account for (and thus
to explain) a prescribed portion of data, which is specified by error ε.
        </p>
        <p>The DBP and AFP problems are generally known in BMF, but both problems
can be generalized to problems over some scale L. For this purpose we need to
define closeness of matrices over L.</p>
        <p>
          Let sL : L × L → [
          <xref ref-type="bibr" rid="ref1">0, 1</xref>
          ] be an appropriate function measuring closeness of
degrees in L. For matrices I, J ∈ Ln×m, put
s(I, J ) =
        </p>
        <p>
          Pn,m
i,j=1 sL(Iij , Jij )
n · m
i.e. s(I, J ) ∈ [
          <xref ref-type="bibr" rid="ref1">0, 1</xref>
          ] is the normalized sum over all matrix entries of the closeness
of the corresponding entries in I and J . In general, we require sL(a, b) = 1 if
and only if a = b, and sL(0, 1) = sL(1, 0) = 0, in which case s(I, J ) = 1 if and
only if I = J . We furthermore require that a ≤ b ≤ c implies sL(a, c) ≤ sL(b, c).
For the important case of L being a subchain of [
          <xref ref-type="bibr" rid="ref1">0, 1</xref>
          ], sL may be defined by
sL(a, b) = a ↔ b,
where a ↔ b = min(a → b, b → a) is the so-called biresiduum (many-valued
equivalence from a logical point of view) of a and b. Let us note that the closeness
coincides with the notion coverage in several papers.
        </p>
        <p>The generalization of the AFP and DBP to the general decomposition over
scale L follows:
– DBP(L): Given I ∈ Ln×m and a positive integer k, find A ∈ Ln×k and</p>
        <p>
          B ∈ Lk×m that maximize s(I, A ◦ B).
– AFP(L): Given I and prescribed error ε ∈ [
          <xref ref-type="bibr" rid="ref1">0, 1</xref>
          ], find A ∈ Ln×k and B ∈
        </p>
        <p>Lk×m with k as small as possible such that s(I, A ◦ B) ≥ ε.
3.3</p>
      </sec>
      <sec id="sec-3-3">
        <title>Interpretability Metric</title>
        <p>
          We approach the formalization of Thurstone’s five criteria according to the
principles of mathematical fuzzy logic [
          <xref ref-type="bibr" rid="ref1 ref12 ref13">1,12,13</xref>
          ] as follows. We consider the factor
model 1 and the Royce [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] definition of factor, namely “factor is a construct
operationally defined by its factor loadings”. In other words, factors are
represented via attributes which are manifestation of them, i.e. factors are represented
via rows of matrix B. This is very important aspect of our metric, because we
can evaluate factors regardless of whether they do or or not contain noise—noise
is a big issue in matrix decompositions, see e.g. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>The following formalization of the five criteria described in Section 2.2 utilizes
operations over scale L. As far as the choice of the operations on L is concerned,
we use the Lukasiewicz t-norm in the formalization, due to some of its intuitive
properties. We describe each criterion via single logical formula with the truth
degrees from L. The degree of fulfillment of each criterion is determined by the
degree of fulfillment of a particular formula.
3.4</p>
      </sec>
      <sec id="sec-3-4">
        <title>Formalization of Thurstone’s Criteria</title>
        <p>The first criterion. Each row contains at least one zero, i.e. for each factor there
exist at least one attribute which is not particular manifestation of the factor.
Formally, the first criterion can be described via formula ∀i∃j¬Bij , i.e.</p>
        <p>(∃j¬B1j ) ∧ (∃j¬B2j ) ∧ · · · ∧ (∃j¬Bkj ),
where (∃j¬Bij ) = (¬Bi1) ∨ (¬Bi2) ∨ · · · ∨ (¬Bim).</p>
        <p>The second criterion. In each factor, the minimum number of zero loadings
should be the number of factors, i.e. in each factor, there is at least k attributes
that are not manifestation of this factor. Formally,</p>
        <p>∀i∃j1∃j2 . . . ∃jk(¬Bij1 ∧ ¬Bij2 ∧ . . . ¬Bijk ∧ j1 6= j2 6= · · · 6= jk).</p>
        <p>The third criterion. For every pair of factors there should be variables with zero
loadings on one and significant loadings on the other. Formally,</p>
        <p>∀i1∀i2∃j(Bi1j ∧ ¬Bi2j ) ∧ (¬Bi1j ∧ Bi2j ).</p>
        <p>The fourth criterion. For every pair of factors a large portion of loadings should
be zero (at least in a matrix with large number of factors. We need to define
what the “large portion” means, i.e. how many attributes do not manifest one or
second (or both) factors. Let us denote “large portion” by lp and Bij = Bi ∪ Bj
(Bi denotes row i of matrix B), than formally</p>
        <p>∀i1∀i2∃j1∃j2 . . . ∃jlp(¬Bji11i2 ) ∧ (¬Bji21i2 ) ∧ · · · ∧ (¬Bjil1pi2 ) ∧ j1 6= j2 6= · · · 6= jlp.
The fifth criterion. For every pair of factors there should be only a few attributes
that manifest both factors. Similarly as in the previous case, we need to define
a “few”.</p>
        <p>∀i1∀i2∃j1∃j2 . . . ∃jfew(Bi1j1 ∧ Bi2j1 ) ∧ (Bi1j2 ∧ Bi2j2 ) ∧ . . .</p>
        <p>The formalization via above presented logical formulas strictly says how a set
of factors satisfies each criteria, but it does not take into account how many
factors (pairs of factors) do not meet the criterion.</p>
        <p>We can analogously define less strict measure which takes into account for
how many factors (pairs of factors) each criterion holds, i.e instead of minimum
value for each factor (pair of factors), we take mean of this values.</p>
        <p>We denote the first variant of metric as the strict metric and the second
variant as the partially strict metric in Section 4—which provides experimental
evaluation of our metric.
4</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Evaluation</title>
      <p>
        The following section is devoted to the experimental evaluation of metrics
described in Section 3. We compare three algorithms for the matrix decomposition
problem, namely GreConDL [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], GreEssL [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and AssoL [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The first two are
based on formal concept analysis [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Let us note, these algorithms provide the
decomposition of matrices over a finite scale L. All of them are inspired by the
existing BMF algorithms.
4.1
      </p>
      <sec id="sec-4-1">
        <title>Real-World Data</title>
        <p>Since we are interested in the interpretability of factors, we perform experiments
only on the real-world datasets—which, unlike synthetic data, are influenced by
real factors. We used the following datasets.</p>
        <p>
          Dog breeds dataset represents 151 dog breeds and their 13 attributes such
as for example Playfulness, Protection ability, Affection or Ease of training. For
detailed analysis see [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
        <p>
          Decathlon extends the dataset from [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] to 28 athletes and their performance
in 10 disciplines of decathlon. A detail analysis of this data can be found in [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
        <p>
          IPAQ consists of international questionnaire data involving 4510 respondents
answering 16 questions using a three-element scale regarding physical activity.
The questions include those regarding their age, sex, body-mass-index (BMI),
health, to what extent the person bicycles, walks, etc. For more detail see [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
        <p>
          Music [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] consists of results of a study inquiring how people perceive speed of
song depending on various song characteristics. The data consists of a 900 × 26
matrix over a six-element scale L, representing a questionnaire involving 30
participants who were presented 30 music samples.
        </p>
        <p>
          Rio dataset [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] represents 87 × 31 matrix I obtained from https://www.
rio2016.com/en/medal-count and consists of 87 countries that obtained any
medal in one of 31 sport (such as Archery, Athletics, Badminton, Basketball,
Boxing, . . . ) on Olympics games in Rio de Janeiro 2016. L contains four grades—
1 means that country won at least one gold medal, 23 at least silver medal, 13 at
least one bronze medal and 0 no medal in this sport. This dataset is very sparse
in comparison with other presented datasets.
4.2
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Assessment of the Interpretability Metric</title>
        <p>Obtained results for each of five Thurstone’s criteria and a total value of the
simple structure are presented in Table 1 (strict measure) and Table 2 (partially
strict measure). We provide the results for sets of factors with the values of
closeness (column s) 0.75, 0.85, 0.9, 0.95 and 1 (which corresponds with the
values of coverage 75%, 85%, 90%, 95% and 100% of the input data). Value NA
means, that the particular algorithm can not obtain a prescribed coverage.</p>
        <p>One may observe that the best results provides (in case of strict measure)
GreEssL algorithm which outperforms GreConDL and AssoLon Breeds,
Decathlon and IPAQ. On the Music and Rio data, GreConDL produces slightly
better results than GreEssL. AssoL is not able to reach higher coverage and
usually provides worse results, but on Music and Rio data it outperforms both
GreConDL and GreEssL.</p>
        <p>We obtain similar results for partially strict measure. For this metric
GreConD and GreEss produce higher values than in the case of strict metric.
Additionally GreEssL outperforms GreConDL on IPAQ data and produces
almost identical results on Rio data. AssoL produces very similar results as in
the case of strict metric.</p>
        <p>From Tables 1 and 2 it is obvious that the simple structure firstly fail on
second criterion especially for high closeness (in both GreConDL and GreEssL)
since usually they need more factors than the number of attributes to achieve a
prescribed coverage. AssoL is algorithm for solving DBP, usually it is not able
to obtain full coverage of input data. On the other hand, first factors obtained
by AssoL cover larger portion of data, so for example in Rio dataset we need
only one factor to obtain coverage slightly higher than 90%. This is the reason
why the total value of simple structure is equal to 1.</p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] authors discuss problem of factor with values “around the middle”.
These factors are the reason why AssoL produces results which returns lower
values on the criteria, that depend on the number of zeros, namely criterion
1, criterion 2 and criterion 5. In these criteria GreEssL returns better
factorization than GreConDLon almost all of the datasets. The reason is probably
the logic behind the factor selection which particular algorithm utilizes. Unlike
GreConDL algorithm GreEssL algorithm takes into account different role of
entries, namely it utilizes the so-called essential entries [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>Some observations that depend on data itself are for example, that Decathlon
dataset does not contain any 0 as input value, so neither GreConDL nor
GreEssLfully satisfy criterion 1. IPAQ dataset has much more objects than
attributes so we need more factors to obtain higher coverage. This is the reason
why from closeness 0.9 it fails in criterion 2.
4.3</p>
      </sec>
      <sec id="sec-4-3">
        <title>Application to Boolean Matrix Decompostion</title>
        <p>BMF is probably the most popular class of matrix decomposition over finite
scale—in this case the scale L contains only two elements, namely zero and one.
Factors produced by BMF algorithms can be analyzed without any problems via
our metric. We perform several experiments and we observe how good is the set
of factors from the simple structure perspective.</p>
        <p>
          There exist several algorithms for BMF based on different ideas. We used the
following algorithms: GreConD, GreEss, Asso, Hyper, PaNDa, Tiling (for
more details see e.g. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]) and 8M [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Like in graded case, some of them are
usually not able to achieve 100% coverage, namely Asso, PaNDa and 8M. We
evaluate all of them on well known real data such as for example Americas-small,
DBLP, Emea, Chess and Mushroom. All of them are well known and widely used.
Description and characteristics of these datasets can be found e.g. in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>We present only basic observation. A broader analysis of results delivered by
BMF algorithms is left to an extended version of this paper.</p>
        <p>In the Boolean case, the third criterion is always true. It can be understood
as: for every pair of factors, there should be an attribute which is manifestation
of one of them and is not manifestation of the other one.</p>
        <p>The best factors in terms of above defined measures are obtained by
algorithm Hyper—for almost all datasets it returns value 1 (for closeness ≤ 0.95).
The main reason is that Hyper usually selects factors including only one
attribute. Such factors are really easy to interpret. On the other hand, this shows
a drawback of the simple structure, because such factors are not useful.</p>
        <p>
          GreEss, GreConD and Tiling algorithm returns comparable results
(GreEss is slightly better than GreConD and both are slightly better than
Tiling). All of them work in the similar way (they use formal concepts [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] as
factors) and all of them do not meet the fifth criterion for high coverage.
        </p>
        <p>8M, PaNDa and Asso return a set of factors which cover small amount data
(in many cases they explain less that 80% of input data). Surprisingly, factors
delivered via these algorithms produce the best results from the simple structure
standpoint.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and Future Research</title>
      <p>We proposed a novel metric, based on a simple structure from factor analysis,
for the measurement of the interpretability of factors delivered by matrix
decomposition algorithms—more precisely algorithms that provide decomposition
of matrices over some finite scale. Simple structure is defined via five criterion
which we formalized via mathematics of Fuzzy logic. We proposed two variants
of the metric, strict and partially strict and we experimentally evaluated the
results produced by GreConDL, GreEssL and AssoL algorithm.
Additionally we provide a brief overview of experimental evaluation of selected BMF
algorithms.</p>
      <p>The observed results encourage us to the following future research directions.
First, to explore different ways of the interpretability measuring. Second, to
provide extensive evaluation of results produced by BMF algorithms.
noD c4 .05 .05 .50 .05 .05 .05 .05 .05 .05 .05 1 1 1 1 1 .30 .03 .03 .03 .03 1 1 1 .607 .067
3 3 3 3 3
erC c3 .705 .075 .05 .05 .05 .057 .075 .05 .05 .05 1 1 1 1 .50 1 .80 .08 .06 .01 1 1 1 1 .3
3 3 7 7 3
0
G
Q
A
P
I
o
i
R
lttoa .088 .084 .081 .06 .020 .05 .05 .05 .05 0 1 1 1 .9 0 .570 .80 .058 .067 0 .909 .099 .509 .209 0
3
0
L 9 5 6 6 3 8 2 4 6 9
ss c4 1 .90 .09 .09 .09 .05 .055 .058 .06 .058 1 1 1 1 1 .70 .08 .09 .09 .09 1 1 1 1 .9
E 0
erG c3 .908 .094 .4 5 6 7 5 8 8 3 9 9
09 .09 .09 .06 .07 .07 .07 .07 1 1 1 1 1 .9 1 1 1 .9 1 1 1 1 1
0 0
c2 1 .903 .081 .06 .002 .076 .055 .405 .05 0 1 1 1 .9 0 .860 .90 .508 .706 0 1 1 .905 .092 0
3
0
itrce c5 .205 .024 1 0 .070 .330
m oL c4 .205 .025 A A A 1 A A A A .05 .50 A A A .30 A A A A
3
itc ss
trs A c3 .505 .052 N N N 1 N N N N .706 .806 N N N .760 N N N N
c2 .705 .075 .4 2 0 0 .08 .03 0 1 .908 .096 .093 0
6
06 .00 0 .50 .603 .05 .038 0 1 1 .8 0 0 .89 .9 7 5</p>
      <p>0
c1 .807 .083 .608 .09 .094 .057 .088 .907 .807 .308 1 1 1 1 1 .90 .09 .
2 6 097 .098 .099 1 1 1 1 1</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Belohlavek</surname>
          </string-name>
          , R.:
          <source>Fuzzy Relational Systems: Foundations and Principles</source>
          . Kluwer Academic/Plenum Press New York (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Belohlavek</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krmelova</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Factor analysis of sports data via decomposition of matrices with grades</article-title>
          . In: Szathmary,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Priss</surname>
          </string-name>
          ,
          <string-name>
            <surname>U</surname>
          </string-name>
          . (eds.)
          <source>Proceedings of The Ninth International Conference on Concept Lattices and Their Applications</source>
          . pp.
          <fpage>293</fpage>
          -
          <lpage>304</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Belohlavek</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krmelova</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Beyond boolean matrix decompositions: Toward factor analysis and dimensionality reduction of ordinal data</article-title>
          . In: Xiong,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Karypis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Thuraisingham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.M.</given-names>
            ,
            <surname>Cook</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.J.</given-names>
            ,
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <surname>X</surname>
          </string-name>
          . (eds.)
          <source>2013 IEEE 13th International Conference on Data Mining</source>
          . pp.
          <fpage>961</fpage>
          -
          <lpage>966</lpage>
          . IEEE Computer Society (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Belohlavek</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krmelova</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Factor analysis of ordinal data via decomposition of matrices with grades</article-title>
          .
          <source>Ann. Math. Artif. Intell</source>
          .
          <volume>72</volume>
          (
          <issue>1-2</issue>
          ),
          <fpage>23</fpage>
          -
          <lpage>44</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Belohlavek</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trnecka</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>From-below approximations in boolean matrix factorization: Geometry and new algorithm</article-title>
          .
          <source>J. Comput. Syst. Sci</source>
          .
          <volume>81</volume>
          (
          <issue>8</issue>
          ),
          <fpage>1678</fpage>
          -
          <lpage>1697</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Belohlavek</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vychodil</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Factor analysis of incidence data via novel decomposition of matrices</article-title>
          . In: Ferr´e,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Rudolph</surname>
          </string-name>
          , S. (eds.)
          <source>Proceedings of 7th International Conference, ICFCA 2009. Lecture Notes in Computer Science</source>
          , vol.
          <volume>5548</volume>
          , pp.
          <fpage>83</fpage>
          -
          <lpage>97</lpage>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Boeck</surname>
            ,
            <given-names>P.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosenberg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Hierarchical classes: Model and data analysis</article-title>
          .
          <source>Psychometrika</source>
          <volume>53</volume>
          ,
          <fpage>361</fpage>
          -
          <lpage>381</lpage>
          (
          <year>1988</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Carroll</surname>
            ,
            <given-names>J.B.</given-names>
          </string-name>
          :
          <article-title>An analytical solution for approximating simple structure in factor analysis</article-title>
          .
          <source>Psychometrika</source>
          <volume>18</volume>
          (
          <issue>1</issue>
          ),
          <fpage>23</fpage>
          -
          <lpage>38</lpage>
          (
          <year>1953</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Cattell</surname>
          </string-name>
          , R.B.:
          <article-title>The scientific use of factor analysis in behavioral and life sciences</article-title>
          .
          <source>Springer US</source>
          (
          <year>1978</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Dixon</surname>
          </string-name>
          , W.:
          <article-title>Bmdp statistical software manual to accompany the 7.0 software release, vols 1-3</article-title>
          . (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
          </string-name>
          , R.:
          <source>Formal concept analysis - mathematical foundations</source>
          . Springer (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Gottwald</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>A Treatise on Many-Valued Logics</article-title>
          , vol.
          <volume>3</volume>
          . research studies press Baldock (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Hajek</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Metamathematics of fuzzy logic</article-title>
          ,
          <source>vol. 4</source>
          . Springer Science &amp; Business
          <string-name>
            <surname>Media</surname>
          </string-name>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Miettinen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Mielika¨inen, T.,
          <string-name>
            <surname>Gionis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannila</surname>
          </string-name>
          , H.:
          <article-title>The discrete basis problem</article-title>
          .
          <source>IEEE Trans. Knowl. Data Eng</source>
          .
          <volume>20</volume>
          (
          <issue>10</issue>
          ),
          <fpage>1348</fpage>
          -
          <lpage>1362</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Royce</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          :
          <article-title>Factors as theoretical constructs</article-title>
          .
          <source>American Psychologist</source>
          <volume>18</volume>
          (
          <issue>8</issue>
          ),
          <volume>522</volume>
          (
          <year>1963</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Spearman</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          : ”
          <article-title>General intelligence,” objectively determined and measured</article-title>
          .
          <source>The American Journal of Psychology</source>
          <volume>15</volume>
          (
          <issue>2</issue>
          ),
          <fpage>201</fpage>
          -
          <lpage>292</lpage>
          (
          <year>1904</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Thurstone</surname>
            ,
            <given-names>L.L.</given-names>
          </string-name>
          :
          <article-title>Multiple factor analysis</article-title>
          . University of Chicago Press: Chicago (
          <year>1947</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Trneckova</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Formal concept analysis with ordinal attributes</article-title>
          .
          <source>Ph.D. thesis</source>
          , Palacky University Olomouc, Czech
          <string-name>
            <surname>Republic</surname>
          </string-name>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>