<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Formation of Life Quality Indicators System through Search Algorithm of Association Rules</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lyudmila P. Bilgaeva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dashidondok Sh. Shirapov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Grigoriy V. Badmaev</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>East Siberia State University of Technology and Management</institution>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper is devoted to the search for association rules in order to form a system of indicators that affect the quality of life. The search is carried out in a transactional database using the AprioriTid algorithm, with the metrics support, confidence and lift calculated for each rule. It results in the extraction of useful association rules that show the relationships among life quality indicators and can later be used to solve problems of analysis and forecasting.</p>
      </abstract>
      <kwd-group>
        <kwd>extraction algorithm of frequent sets of database</kwd>
        <kwd>the property of monotony</kwd>
        <kwd>the associative search of life quality indicators</kwd>
        <kwd>truncation of candidates</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>At present, issues of life quality are relevant, as the current economic crisis has
primarily affected the population. In general, the standard of living depends on
a competent social policy pursued by the state. Solving social problems requires
management decisions based on real information, which in turn requires
research aimed at identifying the main factors affecting life quality.</p>
      <p>In this paper we propose to use association rule search methods to
identify the most important indicators of life quality, which will enable the
authorities to plan and implement measures to improve the living standards
of the population.</p>
      <p>The search for association rules is one of the tasks of Data Mining, the modern
technology of intelligent data analysis. It involves finding regularities
between related events, identifying related objects and locating them
in the state space. Associations are typically sought in a database in
which all objects are connected to each other, provided that the database is
consistent and integrated.</p>
    </sec>
    <sec id="sec-2">
      <title>Basic theoretical principles of association rules search</title>
      <p>There are many techniques for solving the problem of finding
association rules. They share the same mathematical approach but differ in how
the method is implemented. Let us consider the basic theoretical
principles of these methods.</p>
      <p>An association rule of a context K is an expression of the form A → B,
where A, B ⊆ M.</p>
      <p>The context K is a tuple (G, M, I), where G is a set of objects, M is a set
of features, and I ⊆ G × M.</p>
      <p>When association rules are searched, special metrics are used: Support,
Confidence and Lift.</p>
      <p>The Support of an association rule A → B is the quantity defined by the formula
        <disp-formula><label>(1)</label><tex-math>\mathrm{Support}(A \rightarrow B) = \frac{|(A \cup B)'|}{|G|},</tex-math></disp-formula>
where X' denotes the set of objects in G that have all the features in X. The
Support value indicates what fraction of the objects in G contain A ∪ B.</p>
      <p>The Confidence of the association rule is defined by the formula
        <disp-formula><label>(2)</label><tex-math>\mathrm{Confidence}(A \rightarrow B) = \frac{|(A \cup B)'|}{|A'|}.</tex-math></disp-formula>
The Confidence value shows what fraction of the objects that contain A also
contain A ∪ B.</p>
      <p>The following quantity is called the utility (Lift) of the association rule:
        <disp-formula><label>(3)</label><tex-math>\mathrm{Lift}(A \rightarrow B) = \frac{|(A \cup B)'| \cdot |G|}{|A'| \cdot |B'|}.</tex-math></disp-formula>
In other words, the utility is the ratio of Confidence(A → B) to Support(B),
where Support(B) = |B'| / |G|. The Lift value indicates the usefulness of the rule.
If the utility value is greater than 1, the rule is considered to be useful.</p>
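      <p>As an illustration, the three metrics can be computed directly from their definitions. The following Python sketch is not the authors' code: the toy context, the attribute codes and the helper name extent (standing for the operator X') are assumptions made for the example, and it assumes that A' and B' are non-empty.</p>
      <preformat><![CDATA[
```python
# Hypothetical toy context: object (transaction) id -> set of feature codes.
CONTEXT = {
    1: {1, 5, 7},
    2: {1, 5},
    3: {5, 7},
    4: {1, 5, 7},
}


def extent(features, context):
    """Return X': ids of all objects that contain every feature in X."""
    wanted = set(features)
    return {tid for tid, items in context.items() if wanted.issubset(items)}


def support(a, b, context):
    """Fraction of objects in G that contain A union B."""
    return len(extent(set(a) | set(b), context)) / len(context)


def confidence(a, b, context):
    """Fraction of objects containing A that also contain A union B."""
    return len(extent(set(a) | set(b), context)) / len(extent(a, context))


def lift(a, b, context):
    """Ratio of Confidence(A -> B) to Support(B)."""
    support_b = len(extent(b, context)) / len(context)
    return confidence(a, b, context) / support_b


if __name__ == "__main__":
    print(support({1}, {7}, CONTEXT))
    print(confidence({1}, {7}, CONTEXT))
    print(lift({1}, {7}, CONTEXT))
```
]]></preformat>
      <p>For the rule {1} → {7} in this toy context the sketch gives a support of 0.5 and a lift below 1, so such a rule would not be considered useful.</p>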
      <p>The task of mining association rules is to find all association rules of the
context for which the support and confidence values exceed the specified thresholds
min_support and min_confidence, respectively.</p>
      <p>The search for frequent sets of data is limited by the minimum support value
(min_support), which is set by the user [1–3]. The search for association rules is
carried out within the frequent sets of data and is limited by the minimum confidence
(min_confidence) and the utility value. The minimum confidence is generally set
by the user.</p>
      <p>The AprioriTid method, like the Apriori method, is based on the
antimonotony property, the key property for finding multi-element frequent sets of
data [4, 11]. It is formulated as follows:
        <disp-formula><label>(4)</label><tex-math>\forall A, B \subseteq M,\; A \subseteq B \Rightarrow \mathrm{Support}(B) \leq \mathrm{Support}(A).</tex-math></disp-formula>
      </p>
      <p>It means that:</p>
      <list list-type="bullet">
        <list-item><p>as the size of a set increases, its support either decreases or does not change;</p></list-item>
        <list-item><p>the support of any set of features does not exceed the smallest support among its subsets;</p></list-item>
        <list-item><p>a set of n features can be frequent only if all of its (n − 1)-element subsets are frequent.</p></list-item>
      </list>
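      <p>The last consequence is what allows candidate sets to be truncated before their support is counted. A minimal Python sketch of such a check is given below; the function names and the toy attribute codes are hypothetical and serve only to illustrate the property.</p>
      <preformat><![CDATA[
```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_prev):
    """True if some (n-1)-element subset of the candidate is not frequent."""
    n = len(candidate)
    return any(
        frozenset(subset) not in frequent_prev
        for subset in combinations(candidate, n - 1)
    )


def prune(candidates, frequent_prev):
    """Keep only candidates whose (n-1)-element subsets are all frequent."""
    return [c for c in candidates if not has_infrequent_subset(c, frequent_prev)]


# Toy data (assumed): frequent two-element sets and two three-element candidates.
FREQUENT_PAIRS = {frozenset(p) for p in [(1, 5), (1, 7), (5, 7), (5, 9)]}
CANDIDATES = [frozenset({1, 5, 7}), frozenset({1, 5, 9})]

if __name__ == "__main__":
    # {1, 5, 9} is dropped because its subset {1, 9} is not frequent.
    print(prune(CANDIDATES, FREQUENT_PAIRS))
```
]]></preformat>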
    </sec>
    <sec id="sec-3">
      <title>Valid method choice</title>
      <p>To select a method for searching association rules, the authors developed definite
criteria and carried out a comparative analysis of a number of methods. The results
are given in Table 1.</p>
    </sec>
    <sec id="sec-4">
      <title>Software module of associative search of population life quality indicators</title>
      <p>It is convenient to extract data from a database using database record identifiers (TIDs).
A TID also makes it possible to determine whether a generated rule belongs to a particular database record.
The ability to truncate candidates allows useless and unreliable rules to be cut off at the
generation stage, which optimizes the memory used.</p>
      <p>To solve the problem of forming a system of indicators that affect the life quality of the
population, we developed a system whose architecture is presented in Figure 1.</p>
      <p>The system starts by setting up the parameters: the minimum support (minsup), the minimum
confidence (minconf) and the serial number of the experiment. The transaction content, i.e. each
record in the database table, is a set of possible attributes, which are coded indicators of life
quality. For example, in the database entry {1, 5, 7}, 1 denotes the indicator "Actually available
income of the population, %", 5 denotes "Life expectancy at birth in years", and 7 denotes "The Gini
coefficient (income concentration factor)".</p>
      <p>Minimum support and minimum confidence are specified by the user.</p>
      <p>While conducting experiments one can consider various sets of transactions and attributes;
therefore a parameter called the "serial number of the experiment" is used.</p>
      <p>The rule generation function is based on the AprioriTid method, the block diagram of
which is shown in Figure 2.</p>
      <p>It starts by generating single-element data sets, which are candidates for rules. The support
of each of them, i.e. the number of its occurrences in all database transactions involved in the
experiment, is counted.</p>
      <p>Then two-element sets, three-element sets, ..., i-element sets, where 2 ≤ i ≤ k, are generated
iteratively.</p>
      <p>Redundant (duplicate) sets are removed from the resulting sets.</p>
      <p>After that, the support is calculated for each of the remaining sets, and the support value of
the current set, jsup, is compared with the minimum support minsup set by the user.</p>
      <p>If the condition jsup ≥ minsup is met, the formation of an association rule begins; otherwise
the current set is removed.</p>
      <p>Confidence and utility (lift) are calculated for the generated rule.</p>
      <p>If the confidence value is greater than or equal to the minimum confidence value and the lift
value is greater than or equal to 1, the rule is considered to be credible and useful; otherwise it
is deleted.</p>
      <p>[Figure 2. Block diagram of rule generation: generating single-element data sets and calculating
their support; for i = 2, ..., k, generating i-element data sets and removing redundant sets; for
j = 1, ..., count, calculating the support of the j-th set and, if jsup ≥ minsup, forming a rule and
computing its utility, otherwise deleting the set.]</p>
      <p>Visualization of the results displays the initial transactions, the frequent sets of data and
their support, the generated association rules, and the values of the confidence and utility
parameters for each of them.</p>
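      <p>The steps described in this section can be sketched in Python as follows. This is a simplified level-wise search written for illustration, not the authors' software module: it keeps, for every set, the TIDs of the transactions that contain it, and it does not reproduce the encoded candidate structures of the original AprioriTid algorithm. The function names, the toy transactions and the treatment of minsup as an absolute count are assumptions.</p>
      <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical toy database: TID -> set of coded attributes.
TRANSACTIONS = {
    1: {1, 5, 7},
    2: {1, 5, 7},
    3: {1, 5},
    4: {9},
    5: {7, 9},
}


def frequent_sets(transactions, minsup):
    """Return {itemset: TIDs containing it} for all sets with support >= minsup."""
    tids = {}
    for tid, items in transactions.items():
        for item in items:
            tids.setdefault(frozenset([item]), set()).add(tid)
    # Single-element sets that reach the minimum support.
    level = {s: t for s, t in tids.items() if len(t) >= minsup}
    frequent = dict(level)
    while level:
        candidates = {}
        for (a, ta), (b, tb) in combinations(level.items(), 2):
            union = a | b
            # Skip unions of the wrong size and redundant (duplicate) sets.
            if len(union) != len(a) + 1 or union in candidates:
                continue
            # Antimonotony: every (i-1)-element subset must be frequent.
            if any(union - {x} not in level for x in union):
                continue
            common = ta & tb  # TIDs of transactions containing the union
            if len(common) >= minsup:
                candidates[union] = common
        frequent.update(candidates)
        level = candidates
    return frequent


def rules(frequent, n_transactions, minconf):
    """Form rules A -> B from frequent sets; keep credible and useful ones."""
    result = []
    for itemset, t in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                a = frozenset(antecedent)
                b = itemset - a
                conf = len(t) / len(frequent[a])
                lift = conf / (len(frequent[b]) / n_transactions)
                if conf >= minconf and lift >= 1:
                    result.append((a, b, conf, lift))
    return result


if __name__ == "__main__":
    found = frequent_sets(TRANSACTIONS, minsup=2)
    for a, b, conf, lift in rules(found, len(TRANSACTIONS), minconf=0.9):
        print(sorted(a), "->", sorted(b), round(conf, 3), round(lift, 3))
```
]]></preformat>
      <p>With minsup = 2 and minconf = 0.9 the toy transactions yield four rules, for example {1, 7} → {5}, each with a confidence of 1 and a lift of 5/3.</p>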
    </sec>
    <sec id="sec-5">
      <title>The results of the experiments</title>
      <p>We carried out many experiments using the AprioriTid association rule method
to search for a system of indicators that affect life quality. The subsystem of
indicators proposed by the authors of [10] was taken as input data. This
subsystem provides eight main indicators of the population's life quality and the
factors that influence each of them.</p>
      <p>Database transactions were formed from the original data. They contained
a varying number of attributes representing the coded life quality indicators
and the factors identified by expert opinion. Overall, 25 transactions were
formed, with between five and seventeen attributes each. For transactions with
five to fourteen attributes, the experiments produced no results; association
rules begin to be generated only when a transaction contains 15 attributes.
Figure 3 shows a fragment of the original database transactions with five and
seven attributes.</p>
      <p>In Figure 4 you can see a fragment of frequent item sets containing six or
seven attributes, the support value of which is equal to three. Four valid useful
rules presented in Table 2 were generated based on the frequent item sets above.</p>
      <p>The experiment resulted in the generation of fourteen valid and useful
association rules. Since every association rule is an implication, rules can
be combined through conjunction, provided that the
conjunction is true. After transforming the resulting logical expression, five
association rules were obtained. They are presented in Table 3.</p>
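      <p>One way such a combination could work, assumed here because the transformation itself is not spelled out, relies on the identity (A → B) ∧ (A → C) ≡ A → (B ∧ C): rules that share an antecedent are merged into a single rule whose consequent is the conjunction of the individual consequents. In the Python sketch below the function name and the single-consequent input rules are hypothetical; only the attribute codes are taken from this section.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict


def merge_by_antecedent(rule_list):
    """Merge rules with the same antecedent: (A -> B), (A -> C) => A -> B and C."""
    merged = defaultdict(set)
    for antecedent, consequent in rule_list:
        merged[frozenset(antecedent)].update(consequent)
    return {a: frozenset(c) for a, c in merged.items()}


# Hypothetical input rules, written as (antecedent, consequent) pairs.
EXAMPLE_RULES = [
    ({230, 235}, {238}),
    ({230, 235}, {239}),
    ({230, 235}, {241}),
    ({251}, {252}),
]

if __name__ == "__main__":
    for a, c in merge_by_antecedent(EXAMPLE_RULES).items():
        print(sorted(a), "->", sorted(c))
```
]]></preformat>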
      <p>Here it can be seen that 15 database transactions were used to generate the
association rule 251 → 252. This rule means that the "Mortality" indicator (252) is
affected by the "Birth rate" indicator (251).</p>
      <p>As another example, the rule 230 ∧ 235 → 238 ∧ 239 ∧ 241 means that the "Life quality
index" (230) and "Purchasing power" (235) indicators are influenced by
such indicators as "Paid services volume per capita" (238), "Growth rate of the
minimum subsistence level" (239) and "Employment rate of the population"
(241).</p>
      <p>Graphs were plotted during the experiments. Figure 5 shows the
relation between the number of rules and the number of transactions, together
with a trend line.</p>
      <p>Computational experiments with the developed software were carried out. They
enabled us to obtain valid and useful association rules for the population life
quality indicators; the number of rules depends on the input data.</p>
      <p>The outcome of the experiments shows that the indicators and factors in each
association rule are interrelated. In addition, the results obtained demonstrate that it
is possible to generate valid and useful association rules based on a transactional
database. Having performed logical transformations over them, one can create
a system of life quality indicators, which can then be used to solve problems of
analyzing and forecasting the population's life quality.</p>
      <p>This approach will enable state authorities to adjust and soundly
develop strategic social and economic programs to improve the population's life
quality.</p>
      <p>10. Saktoev, V.E., Sadykova, E.T.: Sustainable Development of Regional Economic
Systems with Environmental Regulations. ZAO "Economy", Moscow, Russia
(2011)</p>
      <p>11. Zayko, T.A., Oleinik, A.A., Subbotin, S.A.: Association rules in data mining.
Bulletin of NTU "KhPI" 39(1012), 82–95 (2013)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Imielinski</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swami</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Mining association rules between sets of items in large databases</article-title>
          .
          <source>In: Proceedings of the ACM SIGMOD Conference on Management of Data</source>
          . pp.
          <volume>207</volume>
          –
          <fpage>216</fpage>
          . Washington, D.C.
          (
          <year>1993</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mannila</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srikant</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toivonen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verkamo</surname>
            ,
            <given-names>A.I.</given-names>
          </string-name>
          :
          <article-title>Advances in knowledge discovery and data mining, chap</article-title>
          .
          <source>Fast Discovery of Association Rules</source>
          , pp.
          <volume>307</volume>
          –
          <fpage>328</fpage>
          . American Association for Artificial Intelligence, Menlo Park, CA, USA (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srikant</surname>
          </string-name>
          , R.:
          <article-title>Fast algorithms for mining association rules in large databases</article-title>
          .
          <source>In: Proceedings of the 20th International Conference on Very Large Databases</source>
          . pp.
          <volume>487</volume>
          –
          <fpage>499</fpage>
          . Santiago, Chile
          (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Billig</surname>
            ,
            <given-names>V.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsaregorodcev</surname>
            ,
            <given-names>N.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ivanova</surname>
            ,
            <given-names>O.V.</given-names>
          </string-name>
          :
          <article-title>Building association rules in medical diagnosis</article-title>
          .
          <source>International Journal of Software &amp; Systems</source>
          <volume>2</volume>
          ,
          <fpage>146</fpage>
          –
          <fpage>157</fpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>An association rule mining algorithm based on a boolean matrix</article-title>
          .
          <source>Data Science Journal</source>
          <volume>6</volume>
          , Supplement,
          <volume>559</volume>
          –
          <fpage>565</fpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Olson</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Delen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Advanced Data Mining Techniques</article-title>
          . Springer Publishing Company, Incorporated (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Oreshkov</surname>
          </string-name>
          , V.:
          <article-title>FPG – an alternative search algorithm for association rules</article-title>
          (
          <year>2014</year>
          ), URL: https://basegroup.ru/community/articles/fpg
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Babu</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shankar</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>V.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rajanikanth</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sekhar</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          :
          <article-title>Mining association rules based on boolean algorithm – a study in large databases</article-title>
          .
          <source>International Journal of Machine Learning and Computing</source>
          <volume>3</volume>
          (
          <issue>4</issue>
          ),
          <volume>347</volume>
          –
          <fpage>350</fpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Sahaaya Arul Mary</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malarvizhi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>A new improved weighted association rule mining with dynamic programming approach for predicting a user's next access</article-title>
          .
          <source>In: Proceedings of the ICAITA conference</source>
          . vol.
          <volume>2</volume>
          , pp.
          <volume>105</volume>
          –
          <fpage>122</fpage>
          . Dubai, UAE
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>