<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Analysis of the structure of the relationship between the descriptions of objects of classes and evaluation of their compactness</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>E N Zguralskaya</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ulyanovsk Technical University. Institute of Aviation Technologies and Management</institution>
          ,
          <addr-line>Sozidateley avenue, 13A, Ulyanovsk, Russia, 432072</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>283</fpage>
      <lpage>289</lpage>
      <abstract>
        <p>The study is conducted to assess the compactness of descriptions of objects of classes on the numerical axis and in the multidimensional attribute space. The computation of compactness is possible only in the defined boundaries of areas of the attribute space. In the one-dimensional case, the boundaries are calculated by the frequency of occurrence of the values of features of objects of classes in the interval. In the multidimensional case, a subset of the boundary objects of the classes is used for a given metric. A comparative analysis is given of the values of the compactness measure by latent attributes on the numerical axis and by the sets of initial features from which they are synthesized.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the pattern recognition theory objects are structured into classes based on the compactness
hypothesis. Under this hypothesis, “close” objects shall belong to the same class. It is necessary to
clarify (interpret) the terms “closeness” and “compactness” of objects.</p>
      <p>No common definition of the term “compactness” has been adopted. In [1], a
compactness measure of disjoint groups is postulated; its set of admissible values lies in (0, 1] and
depends on the structure of relations between objects. The following factors affecting the values of
compactness are pointed out:
• the choice of the metric to compute distances between objects;
• the dimension of the attribute space;
• the choice of the way to scale and normalize data;
• the usage of methods to select informative collections of attributes;
• conditions to select and remove noise objects from the sample;
• the number of standard objects of the minimal coverage of the learning sample;
• linear and nonlinear transformations of the attribute space for the description of the objects.</p>
      <p>The aim of searching for extremal values of compactness measures over the variety of parameters
listed above is to improve the generalizing ability of recognition algorithms. The method to obtain a
quantitative estimate of pattern compactness described in [2] is based on the function of rival
(competitive) similarity between objects (the FRiS-function). Using the FRiS-function, one can
describe all class distributions by collections of standard objects. Such a collection allows
one to find the compactness measure of the whole sample or of each separate object of a class and to
clear the learning sample of objects that contribute negatively to the compactness value.</p>
      <p>Implementation of machine learning algorithms becomes significantly more complicated when the
dimension of the data is large. A geometric interpretation of the origin of the curse of
dimensionality is given in [3]. The effect arises because the number of possible sets of
attributes in the description of objects significantly exceeds the number of training examples. A
learning algorithm can support correct generalization only if the learning sample contains enough
examples.</p>
      <p>Compactness implies the existence of a boundary between areas of attribute space with a
description of objects from different classes.</p>
      <p>Numerical methods to obtain a quantitative estimate of compactness differ as well. In the
one-dimensional case interval methods are used, while in the multidimensional case the measure of
compactness of class objects, and of the sample as a whole, is computed for a given metric. What the
one-dimensional and multidimensional cases have in common is the existence of areas of the attribute
space on whose boundaries the measure of compactness is computed.</p>
      <p>In the one-dimensional case objects can be compared on the numerical axis by the values of their
initial and latent attributes using the relations “greater than”, “less than” and “equal to”.</p>
      <p>When the measure of compactness is computed in the multidimensional case, [1] uses the property of
connectedness of objects along a subset (the spans) of boundary objects of disjoint groups. Based
on this property, the objects are decomposed into disjoint groups. Connectedness of objects Si and Sj is
treated as a property of logical regularities in the form of hyperballs with these objects as their
centres; Si and Sj are considered linked if the intersection of their hyperballs contains span objects.
Any pair of objects (Si, Sj) of one group can always be joined by a chain of connected objects.
Ideally, all objects of a class represent one group of connected objects.</p>
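      <p>The decomposition into groups of connected objects described above can be sketched as a union-find over a connectedness predicate. The sketch below is illustrative, not the algorithm of [1]: the callable linked(a, b) stands in for the span-based hyperball intersection test, and the function name is assumed.</p>

```python
def connected_groups(objects, linked):
    """Decompose objects into groups in which any pair can be joined by a
    chain of connected objects.  `linked(a, b)` is an assumed stand-in for
    the span-based connectedness test of boundary objects."""
    parent = list(range(len(objects)))

    def find(i):
        # follow parent links with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # union every directly linked pair
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            if linked(objects[i], objects[j]):
                parent[find(i)] = find(j)

    # collect the resulting groups
    groups = {}
    for i in range(len(objects)):
        groups.setdefault(find(i), []).append(objects[i])
    return list(groups.values())
```

      <p>For example, with one-dimensional objects and linked = lambda a, b: abs(a - b) &lt;= 1, the chain 0, 1, 2 forms one group even though 0 and 2 are not directly linked.</p>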
      <p>This paper examines the structure of relations between class objects on the numerical axis. As a
research tool, it is suggested to use measures of compactness computed by decomposing either attribute
values (initial and latent) or the distances between objects into intervals. The values of the
compactness measure are used to detect latent patterns in the data. Such patterns can be regarded as
new knowledge obtained within information models of ill-structured subject areas.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Criteria for decomposition of attributes into intervals</title>
      <p>Let us consider two computing algorithms put forward in [4, 5] to optimize criteria for the
decomposition of attribute values into intervals. For convenience, these criteria are referred to as
CR1 and CR2.</p>
      <p>When computing with respect to CR1, the number of intervals on the ordered sequence of attribute
values equals the number of disjoint classes. The values of the interval boundaries are determined via
the maximum of the product of intraclass similarity and interclass difference. Ideally, every interval
is represented by all attribute values of objects of one class.</p>
      <p>For the CR2 criterion the number of classes is 2 and the number of intervals is greater than or
equal to 2. When computing the boundaries of disjoint intervals, whose number is initially unknown,
the absolute difference in the frequency of occurrence of attribute values (both initial and latent) in
the descriptions of objects of the two classes is used. The attribute values on the numerical axis form
a sequence of clusters (intervals). No two neighbouring clusters should be dominated (in terms of
frequency of occurrence) by representatives of the same class. Decompositions are considered ideal in
the sense of consistency when the boundaries of each interval contain values of (not necessarily all)
objects of only one class.</p>
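      <p>A minimal sketch of a CR2-style decomposition for two classes may help fix the idea. It deliberately ignores ties and the frequency-difference optimization of [4, 5]: maximal runs of one class along the sorted axis become intervals, so dominant classes alternate between neighbours; boundaries are placed at midpoints between adjacent runs. The helper name cr2_decompose is illustrative, not from the paper.</p>

```python
def cr2_decompose(values, labels):
    """Simplified sketch of a CR2-style decomposition of one attribute:
    sort values along the numerical axis and let maximal runs of one class
    become intervals.  Returns boundaries and the class label per interval."""
    pairs = sorted(zip(values, labels))
    runs = []  # each run: [class_label, list_of_values]
    for v, y in pairs:
        if runs and runs[-1][0] == y:
            runs[-1][1].append(v)
        else:
            runs.append([y, [v]])
    # boundaries at midpoints between neighbouring runs
    boundaries = [(a[1][-1] + b[1][0]) / 2 for a, b in zip(runs, runs[1:])]
    return boundaries, [r[0] for r in runs]
```

      <p>For two well-separated classes this produces a single boundary and a consistent decomposition in the sense described above.</p>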
      <p>The sets of admissible values of the CR1 criterion and of the consistency of the decomposition of
attributes into intervals by CR2 are contained in the segment [0, 1] and are further considered as
measures of their compactness. The value 1 corresponds to an ideal decomposition with respect to CR1
and CR2; the degree of deviation from the ideal can be inferred from values less than 1.</p>
      <p>Combined use of CR1 and CR2 criteria is necessary to detect latent patterns in data. The search for
patterns is based on results of a computational experiment. To interpret the results of the experiment,
known forms of logical regularities are used (hyperball, half-plane, parallelepiped).</p>
      <p>Let a set of objects E0={S1,…,Sm} be given, containing representatives of d disjoint classes
K1,…,Kd. The objects are described by a set X(n) of n attributes of different types, δ (δ &lt; n) of
which are measured in nominal scales and n − δ in interval scales. Gaps and duplicate values in the
data are allowed.</p>
      <p>The search for latent, that is, hidden, attributes is of great interest, since they can be highly
informative for classification; this is one of the objectives of the present study. The CR1 and CR2
criteria are used to decompose the values of a quantitative attribute (both initial and latent) into
disjoint intervals. Latent attributes can represent combinations of nominal and quantitative
attributes [6, 7]. It is required to determine:
• the method to compute latent attributes;
• the boundaries of the intervals and the CR1 criterion values on initial and latent attributes;
• the number of intervals, the values of their boundaries and the consistency of the decomposition of
initial attributes by the CR2 criterion.</p>
      <p>A variety of methods to form latent attributes and of criteria for decomposing their values into
disjoint intervals is essential to detect hidden patterns in databases of subject areas. We obtain
latent attributes from the set X(n) in the form of the combinations xi*xj and xi/xj. If the number of
gradations of a nominal attribute equals the number of disjoint classes of objects, then the
gradations can always be associated with a set of integers a1,…,ad, where ai≠0, i=1,…,d and
aj+1−aj=const, j=1,…,d−1. Each disjoint interval with respect to CR1 will then be represented by one
value. For example, if the number of gradations equals 2, the choice of the values −1 and 1
constitutes a form convenient for computation.</p>
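      <p>Generating the candidate latent attributes xi*xj and xi/xj can be sketched as follows. The helper name latent_attributes is illustrative, and skipping a quotient when its denominator column contains a zero is a convention assumed here, not taken from the paper.</p>

```python
import itertools

def latent_attributes(X):
    """X is a list of object rows of quantitative attribute values.  Returns
    a dict mapping labels such as 'x1*x2' and 'x1/x2' to columns of latent
    attribute values, for all attribute pairs i < j."""
    n = len(X[0])
    latent = {}
    for i, j in itertools.combinations(range(n), 2):
        latent[f"x{i + 1}*x{j + 1}"] = [row[i] * row[j] for row in X]
        if all(row[j] != 0 for row in X):  # skip division by zero
            latent[f"x{i + 1}/x{j + 1}"] = [row[i] / row[j] for row in X]
    return latent
```

      <p>Each resulting column can then be decomposed into intervals by CR1 or CR2 exactly like an initial quantitative attribute.</p>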
    </sec>
    <sec id="sec-3">
      <title>3. Selection of information attribute sets by compactness of objects</title>
      <p>
        Let the metric ρ(x, y) be defined on the set of attributes X(h)⊂X(n), 1 ≤ h ≤ n. The object
S∊E0∩Kp, p = 1, 2 is considered as the centre of a hyperball from which, according to the ordered set
of objects {S, S1,…, Sm−1} = E0, a sequence of hyperballs nested in each other is formed with radii

ρ(S, S) ≤ ρ(S, S1) ≤ ρ(S, S2) ≤ … ≤ ρ(S, Sm−1). (1)
      </p>
      <p>
        The values of the interval boundaries of each object S∊E0∩Kp, p = 1, 2, relative to E0,
computed by the CR1 criterion on sequence (1), are used as a means to select an informative set of
diverse attributes from X(n). The geometric interpretation of the formation of the ordered set
{S, S1,…, Sm−1} is shown in figure 1.
      </p>
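      <p>Forming the ordered set behind sequence (1) amounts to sorting the sample by distance to the centre object; a brief sketch follows (the function name is assumed for illustration):</p>

```python
def ordered_hyperballs(S, others, rho):
    """Order the remaining sample objects by their distance to the centre S,
    producing the non-decreasing radii of the nested hyperballs in (1)."""
    ordered = sorted(others, key=lambda T: rho(S, T))
    radii = [0.0] + [rho(S, T) for T in ordered]  # rho(S, S) = 0 leads the sequence
    return ordered, radii
```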
      <p>
        Let the boundaries of the intervals [c1, c2], (c2, c3], c1 = ρ(S, S) = 0 be defined on
sequence (1) with respect to CR1. The estimate of the compactness of the object S∊Kp over the set of
attributes X(h)⊂X(n) is calculated as

φ(S, X(h)) = θ1(1 − θ2), (2)

where θ1 = |{Si ∈ Kp : ρ(S, Si) ∈ [c1, c2]}| / |Kp| and θ2 = |{Si ∈ K3−p : ρ(S, Si) ∈ [c1, c2]}| / |K3−p|.
      </p>
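      <p>Estimate (2) can be sketched directly from the definitions above; the helper name and argument layout are assumptions made for illustration:</p>

```python
def compactness(S, sample, labels, p, rho, c1, c2):
    """Sketch of estimate (2), phi(S, X(h)) = theta1 * (1 - theta2):
    theta1 is the share of objects of the own class K_p whose distance to S
    falls into [c1, c2]; theta2 is the same share for the rival class."""
    own   = [T for T, y in zip(sample, labels) if y == p]
    rival = [T for T, y in zip(sample, labels) if y != p]
    theta1 = sum(c1 <= rho(S, T) <= c2 for T in own) / len(own)
    theta2 = sum(c1 <= rho(S, T) <= c2 for T in rival) / len(rival)
    return theta1 * (1 - theta2)
```

      <p>A value of 1 is reached when the interval [c1, c2] captures all objects of the own class and none of the rival class.</p>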
      <p>As a means to reduce the combinatorial complexity of searching for logical regularities, [8]
proposes to use hierarchical grouping methods. It is noted there how important the choice of the first
step of such a search is.</p>
      <p>
        How hierarchical agglomerative grouping methods are implemented depends on the rules used in
them. Let φ(X(h), Si) be the compactness estimate (2) of the object Si∊E0 on X(h). When forming the
set X(h+1) from X(h) it is necessary to calculate

R(X(h+1)) = (1/m) Σi=1..m ri, where ri = 1 if φ(X(h+1), Si) ≥ φ(X(h), Si), and ri = 0 if
φ(X(h+1), Si) &lt; φ(X(h), Si). (3)
      </p>
      <p>
        The condition (rule) for adding an attribute xj∊X(n)\X(h) to X(h+1) is:

R(X(h) ∪ {xj}) &gt; 1/2 and R(X(h) ∪ {xj}) = max over xi∊X(n)\X(h) of R(X(h) ∪ {xi}). (4)

The group (set) X(h) is considered formed if no attribute xj∊X(n)\X(h) for which (4) holds exists.
      </p>
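      <p>Rule (4) together with criterion (3) yields a greedy forward selection. The sketch below assumes that R is available as a callable on attribute sets; the function name and this packaging are illustrative rather than the exact procedure of the paper.</p>

```python
def form_group(all_attrs, R_of):
    """Greedy sketch of rule (4): repeatedly add the attribute x maximising
    R(X(h) + {x}) (criterion (3)), provided the maximum exceeds 1/2; the
    group is formed when no such attribute remains.  `R_of` is an assumed
    callable mapping a frozenset of attributes to its R value."""
    group = frozenset()
    while True:
        rest = [x for x in all_attrs if x not in group]
        if not rest:
            return group
        best = max(rest, key=lambda x: R_of(group | {x}))
        if R_of(group | {best}) <= 0.5:
            return group
        group = group | {best}
```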
      <p>
        One peculiarity of Big Data methods is the analysis of data samples in which the number of
attributes is greater than or equal to the number of objects. The number of groups into which the set
of attributes X(n) is decomposed by rule (4) is initially unknown. It has been experimentally shown
in [9] that, when majority rules are used in hierarchical agglomerative grouping, the informativeness
of each subsequent group of attributes is less than that of the preceding one. The group formation
sequence is determined by the principle of dynamic programming. For this reason, the composition of
the attributes of the first group is considered an informative set.
      </p>
      <p>
        As the first step in selecting an informative set of attributes, it is proposed to choose a
subset Y⊂X(n) consisting of one or two attributes. The subset Y shall satisfy the requirement

B(Y) = max over {i, j}∊X(n) of Σd=1..m |{Sj ∈ Kp : ρ(Sj, Sd) &lt; R}|, where
R = min over Sc∊K3−p of ρ(Sc, Sd). (5)
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Computation experiment</title>
      <p>Let us review the results of the decomposition of quantitative attributes into disjoint
intervals with respect to the CR1 and CR2 criteria on a data sample from [10].</p>
      <p>The sample consists of two classes, K1 and K2, and contains data on cardiovascular diseases.
The description of objects is given by the set of attributes X(13)=(x1,…,x13). The number of objects
of class K1 is 150 and that of class K2 is 120. x1, x4, x5, x8, x10, x11, x12 are quantitative
attributes, while x2, x3, x6, x7, x9, x13 are nominal. The nominal attributes x2, x6, x9 have two
gradations (i.e., the number of gradations of an attribute equals the number of classes).</p>
      <p>The compactness of the quantitative attributes from X(13) and the limits of the CR1 intervals are
given in Table 1.</p>
      <p>The product of intraclass similarity and interclass difference is used to compute nominal
attribute weights (as well as the compactness of quantitative attributes) by CR1. If the sets of
gradation values in the descriptions of objects of the two classes do not intersect, then the weight
of the nominal attribute equals 1. Table 2 shows the values for all six nominal attributes.</p>
      <p>Table 1 lists the quantitative attributes with the following descriptions: x1: age; x4: resting
blood pressure; x5: serum cholesterol in mg/dl; x8: maximum heart rate achieved; x10: oldpeak (ST
depression induced by exercise relative to rest); x11: the slope of the peak exercise ST segment;
x12: number of major vessels (0-3) colored by fluoroscopy.</p>
      <p>As is seen from Table 1 and Table 2, the compactness of the quantitative attributes by CR1 and
the values of the nominal attribute weights differ considerably from the ideal ones. The value of a
quantitative attribute within a disjoint interval by CR1 can be considered as a gradation (interval
number) in a nominal measurement scale. In such a description of objects the attribute weight in the
nominal scale coincides with the value of compactness with respect to the CR1 criterion.</p>
      <p>The number of disjoint intervals and the consistency (stability) of the decomposition by the
CR2 criterion are given in Table 3 below.</p>
      <p>As is seen from Table 1 and Table 3, relatively high values of compactness are obtained for the
attribute x12.</p>
      <p>Table 4 gives information about the decomposition into two intervals of the latent attributes
obtained by multiplication and division of the values of initial attributes.</p>
      <p>Analysis of the results from Table 4 and Table 1 shows that it is feasible to search for
hidden patterns by latent attributes whose compactness is higher than that of each of the initial
attributes from which they are composed.</p>
      <p>Let the sets of indices of the quantitative and nominal attributes in X(n) be designated by I
and J respectively. In order to unify the measurement scales, the values of the quantitative
attributes are mapped into [0, 1] by a linear fractional transformation. The Zhuravlev metric is used
as the measure of distance between objects Su, Sv∊E0 (Sc=(ac1,…,acn), c=1,…,m) for the selection of
informative attributes:

ρ(Su, Sv) = Σi∈I |aui − avi| + Σi∈J ti, where ti = 1 if aui ≠ avi and ti = 0 if aui = avi.</p>
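      <p>A direct sketch of the Zhuravlev metric for such mixed descriptions (quantitative values assumed pre-scaled to [0, 1]; the function name is illustrative):</p>

```python
def zhuravlev(a, b, I, J):
    """Zhuravlev distance between object descriptions a and b: absolute
    differences over the quantitative indices I plus a 0/1 mismatch penalty
    over the nominal indices J."""
    return (sum(abs(a[i] - b[i]) for i in I)
            + sum(1 for i in J if a[i] != b[i]))
```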
      <p>
        The first pair of attributes added to the informative set by (5) is (x8, x13). The process of
stepwise selection of attributes according to (4) is shown in Table 5.
      </p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>Two interval methods have been proposed to analyze the structure of relations between objects
of disjoint classes by quantitative initial and latent attributes. The numerical estimates of the
structure of relations obtained by these methods differ in whether the number of disjoint intervals
is known initially or determined by the algorithm. A rule of hierarchical agglomerative grouping for
the formation of an informative attribute set has been described.</p>
      <p>The considered methods are recommended for searching for hidden patterns in data when
developing knowledge-based information models.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Ignatyev</surname>
            <given-names>N A</given-names>
          </string-name>
          <year>2018</year>
          <article-title>Structure Choice for Relations between Objects in Metric Classification Algorithms Pattern Recognition</article-title>
          and
          <source>Image Analysis</source>
          <volume>28</volume>
          <fpage>590</fpage>
          -
          <lpage>597</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Zagoruiko</surname>
            <given-names>N G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kutnenko</surname>
            <given-names>O A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zyryanov</surname>
            <given-names>A O</given-names>
          </string-name>
          and
          <string-name>
            <surname>Levanov</surname>
            <given-names>D A</given-names>
          </string-name>
          <year>2014</year>
          <article-title>Learning to recognize patterns without retraining Machine Learning</article-title>
          and
          <source>Data Analysis</source>
          <volume>1</volume>
          <fpage>891</fpage>
          -
          <lpage>901</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Goodfellow</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            <given-names>Y</given-names>
          </string-name>
          and
          <string-name>
            <surname>Courville</surname>
            <given-names>A</given-names>
          </string-name>
          <year>2016</year>
          <source>Deep Learning</source>
          (Cambridge: MIT Press) p
          <fpage>652</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Zguralskaya</surname>
            <given-names>Е N</given-names>
          </string-name>
          <year>2018</year>
          <article-title>Stability of data partitioning into intervals in the tasks of recognition and the search for hidden patterns</article-title>
          <source>Proceedings of the Samara Scientific Center of the Russian Academy of Sciences</source>
          <volume>4</volume>
          <fpage>826</fpage>
          -
          <lpage>829</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Zguralskaya</surname>
            <given-names>Е N</given-names>
          </string-name>
          <year>2012</year>
          <article-title>Selection of informative features for solving classification problems using artificial neural networks</article-title>
          <source>Neurocomputers: Development, Application</source>
          <fpage>20</fpage>
          -
          <lpage>27</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Ignatyev</surname>
            <given-names>N A</given-names>
          </string-name>
          <year>2011</year>
          <article-title>Calculation of generalized indicators and data mining</article-title>
          <source>Automation and Remote Control</source>
          <fpage>183</fpage>
          -
          <lpage>190</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Saidov</surname>
            <given-names>D Y</given-names>
          </string-name>
          <year>2017</year>
          <article-title>Data visualization and its proof by compactness criterion of objects of classes</article-title>
          <source>International Journal of Intelligent Systems and Applications (IJISA)</source>
          <volume>9</volume>
          <fpage>51</fpage>
          -
          <lpage>58</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Duke</surname>
            <given-names>V A</given-names>
          </string-name>
          <year>2005</year>
          <article-title>Methodology of the search for logical laws in the subject area with fuzzy systemology: an example of clinical and experimental studies</article-title>
          URL: https://dlib.rsl.ru/viewer/01002930373#?page=1
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Madrakhimov</surname>
            <given-names>Sh</given-names>
          </string-name>
          <year>2018</year>
          <article-title>Calculation of the Generalized Estimations in Sets of Features and their Interpretation</article-title>
          <source>International Journal of Software Engineering and Its Applications</source>
          <volume>12</volume>
          <fpage>29</fpage>
          -
          <lpage>38</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10] UCI repository of machine learning databases URL: http://archive.ics.uci.edu
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>