<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Comparison of Matrix Reordering Algorithms Based on Monotone Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Grete Lind</string-name>
          <email>grete.lind@taltech.ee</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rein Kuusik</string-name>
          <email>rein.kuusik1@taltech.ee</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Tallinn University of Technology</institution>
          ,
          <addr-line>Tallinn</addr-line>
          ,
          <country country="EE">Estonia</country>
        </aff>
      </contrib-group>
      <fpage>281</fpage>
      <lpage>286</lpage>
      <abstract>
        <p>The aim of this paper is to shed light on matrix reordering methods: namely scale of conformity, minus technique, plus technique and mixed technique. All of them are based on the theory of monotone systems. Presented methods are applicable both to NxN (entityto-entity) and NxM (entity-to-attribute) data tables. All the methods can use larger set of discrete values (not only binary ones). Rows and columns are reordered separately. The result does not depend on the initial order of rows and columns. We compare the results of these methods through stress measure (both in the von Neumann neighborhood and in the Moore neighborhood) using binarized representations of well-known data sets.</p>
      </abstract>
      <kwd-group>
        <kwd>Matrix Reordering</kwd>
        <kwd>Monotone Systems</kwd>
        <kwd>Conformity</kwd>
        <kwd>Stress</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Seriation is an exploratory data analysis technique to reorder objects into a
sequence along a one-dimensional continuum so that it best reveals regularity and
patterning among the whole series [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Seriation is often called matrix reordering,
when applied to two-way datasets [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Two-way one-mode data table means
entity-to-entity (NxN) data table and two-way two-mode is entity-to-attribute
(MxN) data table [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Matrix reordering has been used in many different fields, from archaeology
to operations research. A thorough historical overview is given by Liiv [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        In this paper, we introduce little known matrix reordering methods that are
based on the theory of monotone systems (MS) created by Mullat [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3–5</xref>
        ]. These
methods – scale of conformity, minus technique, plus technique – have been
created by Leo V˜ohandu at Tallinn University of Technology, department of
Informatics [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6–8</xref>
        ]. We propose a novel MS-based method by Rein Kuusik called
mixed technique [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        According to Liiv [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] the main future goal for seriation is to make it
ubiquitously usable, reordering the matrices should be a common practice for
everybody inspecting any data table. Recently Liiv and V˜ohandu [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] experiment with
asymmetric one-mode two-way (NxN) data tables. They propose that “seriation
methods can be applied to analyze asymmetric one-mode two-way datasets as if
they were two-mode two-way datasets while continuing to keep the information
about entities actually belonging to one class”.
      </p>
      <p>The paper is organized as follows. In the following subsections, we
introduce stress measures and describe MS-based reordering methods. Section 2 is
dedicated to the mixed technique. Experiments are presented in section 3 and
conclusion in section 4.
1.1</p>
      <sec id="sec-1-1">
        <title>Stress</title>
        <p>
          Stress is a dissimilarity measure, it compares the values in a matrix with their
neighbors. For an n × m matrix X, the local stress measure for element xij is
defined for two types of neighborhood [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]:
1. in the Moore neighborhood–a square-shaped neighborhood comprising (at
most) eight adjacent entries:
sij = Σkm=inm(anx,i(+1,1i)−1)Σlm=imn(amx(,1j+,j−1)1)(xij − xkl)2 ,
(1)
(3)
sij = Σkm=inm(anx,i(+1,1i)−1)(xij − xkj )2 + Σlm=imn(amx(,1j+,j−1)1)(xij − xil)2 .
(2)
        </p>
        <p>A global stress measure for the whole matrix is the sum of the local stresses
(in either neighborhood) for all entries of the table:</p>
        <p>ST RESS = Σin=1Σjm=1sij .</p>
        <p>In case of binary data, the local stress sij is just the count of the neighbors
that have different values.
2. in the von Neumann neighborhood–a diamond-shape neighborhood
comprising (at most) four adjacent entries:
1.2</p>
      </sec>
      <sec id="sec-1-2">
        <title>Reordering Methods Based on Monotone Systems</title>
        <p>
          Here we briefly introduce three methods for reordering data tables: scale of
conformity, minus technique and plus technique [
          <xref ref-type="bibr" rid="ref6 ref7 ref8">6–8</xref>
          ]. Because of limited space
we cannot present their algorithms and examples.
        </p>
        <p>All the methods reorder rows and columns separately. Thus, the same
algorithm can be applied for both rows and columns, using transposed table for one
of them. Their result does not depend on the initial order of rows/columns.</p>
        <p>
          The scale of conformity [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] reorders the data by object’s typicality using the
conformity measure that is the sum of all attribute-value frequencies (of the
row).
        </p>
        <p>
          In case of iterative techniques [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], to wit minus technique, plus technique,
rows/columns are removed from table one-by-one, after each removal the weights
of the remaining rows/columns are recomputed. Iterative techniques use
conformity as a weight function. Actually different weight functions (e.g. influence [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ])
can be used.
        </p>
        <p>Algorithm: Mixed technique for matrix rows</p>
        <p>These reordering methods allow to process non-binary data and zeros are
treated the same way as other values. Using a different weight function, we can
treat zeros differently.</p>
        <p>The data table will be reordered in order to better visualize the data. In case
of conformity scale and plus technique, the most homogeneous group forms in
the upper-left corner and the most atypical in the lower-right corner. In case of
minus technique–on the contrary.</p>
        <p>While these techniques help finding homogeneous groups, we miss a method
offering possibly smooth changes. In the next section we will propose such a
method.
2</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Mixed Technique</title>
      <p>
        Here we present a novel iterative method for matrix reordering, called mixed
technique [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. It is aimed to create a gradual way of changes, starting from the
first object/attribute. The user chooses from which object/attribute to start. If
the user cannot decide, then it can be chosen by minimal weight (as in minus
technique). Further the closest object/attribute to the just eliminated one (by
the number of coincidences) is chosen for removal. The weight is not based on the
frequencies of (diminishing) initial table (as in case of plus technique and minus
technique), but on the (growing) table of already removed objects/attributes.
      </p>
      <p>Assume that we have an N × M data matrix X. Every element Xij , i =
1, . . . , N , j = 1, . . . , M , has a discrete value from an interval [1, K].</p>
      <p>Algorithm (see above) starts like the minus technique (S1..S5) and after the
first iteration, it continues similarly to the plus technique (S6..S13).</p>
      <p>In order to demonstrate the mixed technique we use data given in Table 1
(a). The conformity i.e. weight of O1 is 2+2+4+2+2=12 (count(A1 = 1) +
count(A2 = 2) + count(A3 = 2) + count(A4 = 2) + count(A5 = 2)). O1 has the
smallest weight among objects, therefore it is selected first. After that starts the
iterative part of the algorithm.</p>
      <p>Table 1 (b) shows the weights of objects (rows) during 6 iterations and the
order of elimination of rows, while Table 1 (c) shows the same for columns (during
5 iterations). Reordered data table is presented in Table 1 (d). Weights of rows
W (Oi) and weights of columns W (Aj) present the weights at the moment of
elimination of a row/column.</p>
      <p>Compared to the previous techniques, mixed technique gives different
information. By maximizing the similarity between consecutive rows/columns, it
reveals an “evolutionary” way of changing/developing the initial object/attribute.
3</p>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <p>
        We compare the results of four introduced algorithms by two stress measures [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
(see section 1.1) using 20 different data sets.
      </p>
      <p>We use binarized representations of 20 data sets from UCI Machine Learning
Repository1, the list of data sets, ordered by size, is given in Table 2.</p>
      <p>Table 3 shows global stress values in both neighborhoods for all the considered
data sets for all 4 reordering algorithms based on monotone systems: conformity
scale (conf), plus technique (plus), minus technique (min) and mixed technique
(mixed).
1 http://archive.ics.uci.edu/ml/</p>
      <p>
        The smaller the stress value, the better is the reordering. In each row, the
smallest values for both neighborhoods are shown in bold. In all cases the mixed
technique achieves the best result and conformity scale has the worst result. In
most cases, minus technique gives better result than plus technique (16 data
sets out of 20 by both measures). This complies with observation by Liiv ([
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
p.61): “the “minus” technique outperformed the “plus” technique on the average
with all three measures”, although used data sets and evaluation measures are
different.
4
      </p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>In this paper, we have introduced three matrix reordering methods based on
monotone systems and proposed a new one, called mixed technique. We have
evaluated all four methods using stress measures. We have found that the novel
mixed technique performs better than the other three algorithms. Future
challenges include handling big data and dealing with incremental update.
Acknowledgements. The authors would like to thank prof. Sadok Ben Yahia
for helpful comments and suggestions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Liiv</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Seriation and matrix reordering methods: An historical overview</article-title>
          .
          <source>Statistical analysis and data mining 3(2)</source>
          ,
          <fpage>70</fpage>
          -
          <lpage>91</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Liiv</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vohandu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Seriation and Matrix Reordering Methods for Asymmetric One-Mode Two-Way Datasets</article-title>
          . In: Imaizumi,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Nakayama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Yokoyama</surname>
          </string-name>
          , S. (eds.),
          <article-title>Advanced Studies in Behaviormetrics and Data Science: Essays in Honor of Akinori Okada, Behaviormetrics: Quantitative Approaches to Human Behavior 5</article-title>
          , https://doi.org/10.1007/
          <fpage>978</fpage>
          -981-15-2700-5_
          <fpage>10</fpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Mullat</surname>
            ,
            <given-names>J.E.</given-names>
          </string-name>
          :
          <article-title>On the Maximum Principle for some Set Functions</article-title>
          .
          <source>Proceedings of the Tallinn Technical University</source>
          ,
          <volume>313</volume>
          ,
          <fpage>37</fpage>
          -
          <lpage>44</lpage>
          (
          <year>1971</year>
          ). http://www.datalaundering. com/download/modular.pdf,
          <source>last accessed 2016/02</source>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Mullat</surname>
            ,
            <given-names>J.E.</given-names>
          </string-name>
          :
          <source>Extremal Subsystems of Monotonic Systems. I. Automation and Remote Control</source>
          ,
          <volume>37</volume>
          (
          <issue>5</issue>
          ),
          <fpage>758</fpage>
          -
          <lpage>766</lpage>
          (
          <year>1976</year>
          ). www.datalaundering.com/download/ extrem01.pdf,
          <source>last accessed 2016/02</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Mullat</surname>
            ,
            <given-names>J.E.</given-names>
          </string-name>
          :
          <source>Extremal Subsystems of Monotonic Systems. II. Automation and Remote Control</source>
          ,
          <volume>37</volume>
          (
          <issue>8</issue>
          ),
          <fpage>1286</fpage>
          -
          <lpage>1294</lpage>
          , (
          <year>1977</year>
          ). www.datalaundering.com/downlaods/ extrem02.pdf,
          <source>last accessed 2016/02</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Vyhandu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Rapid data analysis methods</article-title>
          . Transactions of Tallinn Technical University,
          <volume>464</volume>
          ,
          <fpage>21</fpage>
          -
          <lpage>39</lpage>
          (
          <year>1979</year>
          ).
          <article-title>(in Russian)</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Vyhandu</surname>
            ,
            <given-names>L</given-names>
          </string-name>
          :
          <article-title>Fast Methods in Exploratory Data Analysis</article-title>
          . Transactions of Tallinn Technical University,
          <volume>705</volume>
          ,
          <fpage>3</fpage>
          -
          <lpage>13</lpage>
          (
          <year>1989</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. V˜ohandu, L.,
          <string-name>
            <surname>Kuusik</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lind</surname>
          </string-name>
          , G.:
          <article-title>Monotone systems and their applications</article-title>
          . Tallinn: Tallinn University of Technology, Department of Software Science (
          <year>2018</year>
          ).
          <article-title>(in Estonian) issuu</article-title>
          .com/monotsys/docs/monotoonsed_s_steemid,
          <source>last accessed 2019/07</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hahsler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hornik</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buchta</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Getting things in order: An introduction to the R package seriation</article-title>
          , https://rdrr.io/cran/seriation/,
          <source>last accessed</source>
          <year>2019</year>
          /06/27
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Liiv</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Pattern Discovery Using Seriation and Matrix Reordering: A Unified View, Extensions and an Application to Inventory Management</article-title>
          .
          <source>Ph.D. Thesis</source>
          , Tallinn University of Technology (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>