<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Constructing on the Example of Patient's Physical Characteristics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nataliya Shakhovska</string-name>
          <email>nataliya.b.shakhovska@lpnu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Iryna Zhelizniak</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Artificial Intelligence, Lviv Polytechnic National University, UKRAINE</institution>
          ,
          <addr-line>Lviv, 12 S.Bandera str.</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>In this example, the set of transactions containing the Object</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>1</fpage>
      <lpage>3</lpage>
      <abstract>
        <p>The methods of constructing associative rules are described. Associative rules for analyzing a patient's test data have been constructed. The set of transactions that are available for the medical analysis of a patient is considered. It has been found that the correct assessment of the usefulness of an associative rule affects the volume of information and the speed of access to it. A unique identifier for the set of a patient's analyses has been introduced. Additional numerical attributes of the investigated objects are indicated.</p>
      </abstract>
      <kwd-group>
        <kwd>Characteristics</kwd>
        <kwd>associative rules</kwd>
        <kwd>data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>In medical and biological research, as well as in practical
medicine, the range of tasks to be solved is so wide that it is
possible to use any of the methodologies of Data Mining. An
example can be the construction of a diagnostic system or the
study of the effectiveness of surgical intervention.</p>
      <p>
        One of the most advanced areas of medicine is bioinformatics. The object of bioinformatics research is the huge amount of information about DNA sequences and the primary structure of proteins that has accumulated from studying the genomes of microorganisms, mammals and humans. Regardless of the specific content of this information, it can be regarded as a set of genetic texts consisting of extended character sequences. Detecting structural patterns in such sequences is one of the tasks that Data Mining solves effectively, for example by means of sequential and associative analysis [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
      </p>
      <p>The purpose of the study is to identify the most important principles for constructing associative rules: to determine the patterns of their construction and the division of physical indicators into different levels of a hierarchy.</p>
      <p>II. OBJECTS AND METHODS OF RESEARCH</p>
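      <p>
        One of the most common data analysis tasks is to identify sets of objects that frequently occur in a large collection of objects. We describe this problem in a generalized form. To do this, we denote the objects that make up the sets under study (itemsets) as follows [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]:
        <disp-formula>
          <label>(1)</label>
          <tex-math>I = \{i_1, i_2, \ldots, i_j, \ldots, i_n\},</tex-math>
        </disp-formula>
        where <italic>n</italic> is the number of objects. In the field of medicine, such objects are, for example, the indicators and analyses of the patient (Table 1).
      </p>
      <table-wrap id="table-1">
        <label>Table 1</label>
        <caption>
          <p>Indicators and analyses of the patient</p>
        </caption>
        <table>
          <thead>
            <tr><th>Indicator</th><th>Value</th></tr>
          </thead>
          <tbody>
            <tr><td>Arterial pressure</td><td>120/80 mm Hg</td></tr>
            <tr><td>Venous pressure</td><td>70 mm H<sub>2</sub>O</td></tr>
            <tr><td>Capillary pressure</td><td>70 mm Hg</td></tr>
            <tr><td>Pulse</td><td>85 beats/min</td></tr>
            <tr><td>Temperature</td><td>36.6 °C</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>In this way, they correspond to the following set of objects:
I = {arterial pressure, venous pressure, capillary pressure,
pulse, temperature, hemoglobin level in blood, pH}.</p>
      <p>Sets of objects from the set I that are stored in a database and
subject to analysis are called transactions. We describe a
transaction as a subset of the set I:
        <disp-formula>
          <label>(2)</label>
          <tex-math>T = \{i_j \mid i_j \in I\}.</tex-math>
        </disp-formula>
      </p>
      <p>In a hospital, such transactions correspond to the medical
examinations taken by a patient and are stored in the database in
the form of a medical card. They list the tests that the patient
took for the medical history and diagnosis.</p>
      <p>The set of transactions about which information is available
for analysis is described by the following set:
        <disp-formula>
          <label>(3)</label>
          <tex-math>D = \{T_1, T_2, \ldots, T_r, \ldots, T_m\},</tex-math>
        </disp-formula>
        where <italic>m</italic> is the number of transactions available for analysis.</p>
      <p>III. RESEARCH RESULTS</p>
      <p>
        To use Data Mining methods, the set D can be represented as a table (Table 2). The set of transactions that include the object <italic>i</italic><sub><italic>j</italic></sub> is denoted as follows [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]:
        <disp-formula>
          <label>(4)</label>
          <tex-math>D_{i_j} = \{T_r \mid i_j \in T_r;\ j = 1 \ldots n;\ r = 1 \ldots m\} \subseteq D.</tex-math>
        </disp-formula>
      </p>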
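      <p>The set of transactions that contain a given object can be illustrated with a minimal Python sketch. The three transactions below are hypothetical sample data, and the function name transactions_with is introduced here only for illustration; it is not part of the original study.</p>
      <preformat><![CDATA[
```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]

# Hypothetical sample data: each transaction is the set of tests
# recorded on one medical card.
D: List[Transaction] = [
    frozenset({"arterial pressure", "temperature", "pulse"}),
    frozenset({"temperature", "pH"}),
    frozenset({"hemoglobin level", "arterial pressure", "temperature"}),
]


def transactions_with(item: str, database: List[Transaction]) -> List[int]:
    """Return the indices r of the transactions T_r that contain the object."""
    return [r for r, transaction in enumerate(database) if item in transaction]


if __name__ == "__main__":
    for name in ("arterial pressure", "pH", "venous pressure"):
        print(name, transactions_with(name, D))
```
]]></preformat>
      <p>Storing every transaction as a set makes the membership test direct; an object that occurs in no transaction simply yields an empty list.</p>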
      <sec id="sec-1-4">
        <p>A sequence of objects can then be described as follows:
          <disp-formula>
            <label>(10)</label>
            <tex-math>S = \{\ldots, i_p, \ldots, i_q\}, \quad \text{where } p &lt; q.</tex-math>
          </disp-formula>
        </p>
        <p>For example, in the case of analyses, the objects of such a
sequence can be ordered by the date on which each analysis was taken.
The sequence
S = {(venous pressure, 25.09.2017), (pH, 28.09.2017),
(hemoglobin level, 10.10.2017)}
can be interpreted as the tests taken by one person at different
times: first the venous pressure was measured, then the pH level,
and finally the level of hemoglobin.</p>
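        <p>There are two types of sequences: with cycles and without
cycles. In the first case, the same object is allowed to occur in
the sequence at different positions:
          <disp-formula>
            <label>(11)</label>
            <tex-math>S = \{\ldots, i_p, \ldots, i_q, \ldots\}, \quad \text{where } p &lt; q,\ i_q = i_p.</tex-math>
          </disp-formula>
        </p>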
        <p>A transaction T is said to contain the sequence S if S ⊆ T
and the objects of S occur in T in the same order. Other objects
may occur in T between the objects of the sequence S.</p>
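        <p>The containment test described above can be sketched in Python as follows. The sketch assumes that a transaction is stored as an ordered list of objects; the function name contains_sequence and the sample transaction are hypothetical and serve only as an illustration.</p>
        <preformat><![CDATA[
```python
from typing import Hashable, Iterable, Sequence


def contains_sequence(transaction: Sequence[Hashable],
                      sequence: Iterable[Hashable]) -> bool:
    """True if the objects of `sequence` occur in `transaction` in the same order.

    Other objects may lie between them, as the definition allows.
    """
    remaining = iter(transaction)
    # `obj in remaining` consumes the iterator up to the first match, so each
    # following object is searched for only after the previous one was found.
    return all(obj in remaining for obj in sequence)


# Hypothetical ordered transaction of one patient.
T = ["venous pressure", "temperature", "pH", "pulse", "hemoglobin level"]

if __name__ == "__main__":
    print(contains_sequence(T, ["venous pressure", "pH", "hemoglobin level"]))
    print(contains_sequence(T, ["pH", "venous pressure"]))
```
]]></preformat>
        <p>Because the same iterator is shared by all membership tests, a sequence with cycles (the same object at two positions) is matched only if the object really occurs twice in the transaction.</p>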
        <p>The support of the sequence S is the ratio of the number
of transactions that contain S to the total number of
transactions. A sequence is frequent if its support exceeds the
minimum support specified by the user:
          <disp-formula>
            <tex-math>\mathrm{Supp}(S) &gt; \mathrm{Supp}_{\min}.</tex-math>
          </disp-formula>
        </p>
        <p>The task of sequential analysis is to find all frequent
sequences:
          <disp-formula>
            <tex-math>L = \{S \mid \mathrm{Supp}(S) &gt; \mathrm{Supp}_{\min}\}.</tex-math>
          </disp-formula>
        </p>
        <p>The main difference between sequential analysis and the
search for associative rules is that an order relation is
established between the objects of the set I. This relation can
be defined in different ways. When a sequence of events occurring
in time is analyzed, the objects of the set I are events, and the
order relation corresponds to the chronology of their occurrence.
For example, when sequences of tests in a hospital are analyzed,
the objects are the sets of analyses that a patient takes at
different times, and the order relation is the time at which
these analyses were performed.</p>
        <p>D = {{(temperature, blood pressure, capillary pressure),
(pH, temperature, pulse)},
{(hemoglobin level in blood, temperature), (blood pressure, temperature),
(temperature, venous pressure)},
{(hemoglobin level in blood)}}.</p>
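        <p>The support of a sequence in this database can be checked with the following Python sketch. It assumes that each patient is stored as an ordered list of visits and that each visit is a set of tests; the names contains, support and frequent are hypothetical helpers, not the authors' implementation. For the sequence {(blood pressure, temperature)} the sketch reproduces the support of 2/3 given in the text.</p>
        <preformat><![CDATA[
```python
from fractions import Fraction
from typing import FrozenSet, List, Sequence

Visit = FrozenSet[str]

# The database D from the text: one list of visits per patient
# (identifiers 0, 1 and 2).
D: List[List[Visit]] = [
    [frozenset({"temperature", "blood pressure", "capillary pressure"}),
     frozenset({"pH", "temperature", "pulse"})],
    [frozenset({"hemoglobin level", "temperature"}),
     frozenset({"blood pressure", "temperature"}),
     frozenset({"temperature", "venous pressure"})],
    [frozenset({"hemoglobin level"})],
]


def contains(visits: Sequence[Visit], pattern: Sequence[Visit]) -> bool:
    """True if the elements of `pattern` are subsets of successive visits.

    The order of the visits is preserved; other visits may lie between.
    """
    remaining = iter(visits)
    return all(any(element <= visit for visit in remaining)
               for element in pattern)


def support(pattern: Sequence[Visit],
            database: Sequence[Sequence[Visit]]) -> Fraction:
    """Share of patients whose visit history contains `pattern`."""
    hits = sum(contains(visits, pattern) for visits in database)
    return Fraction(hits, len(database))


def frequent(patterns, database, min_support: Fraction):
    """Keep the patterns whose support exceeds `min_support`."""
    return [p for p in patterns if support(p, database) > min_support]


if __name__ == "__main__":
    pattern = [frozenset({"blood pressure", "temperature"})]
    print(support(pattern, D))
```
]]></preformat>
        <p>Exact fractions are used instead of floating-point numbers so that the comparison with the minimum support is not affected by rounding.</p>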
        <p>Of course, the problem of identifying patients arises. In
practice, it is solved by introducing medical cards that have a
unique identifier (Table 3). The patient with identifier 0 first
had the temperature and the arterial and capillary pressure
measured, and on a later visit had the pH, temperature and pulse
rate measured. For example, the support of the sequence
{(blood pressure, temperature)} is 2/3, since it is found for the
patients with identifiers 0 and 1.</p>
        <p>In many applications, the objects of the set I naturally combine
into groups, which in turn can be combined into more general
groups, and so on. Thus, a hierarchical structure of objects is
obtained.</p>
        <p>The presence of a hierarchy changes the notion of when an
object <italic>i</italic> is present in a transaction T. Obviously, the support of
a group is not less than the support of a separate object that
belongs to it:
          <disp-formula>
            <label>(14)</label>
            <tex-math>\mathrm{Supp}(I_q) \geq \mathrm{Supp}(i_j),</tex-math>
          </disp-formula>
          where <italic>i</italic><sub><italic>j</italic></sub> ∈ <italic>I</italic><sub><italic>q</italic></sub>. This is because not only the transactions that include the
separate object are counted, but also the transactions that
contain any of the objects of the analyzed group. For
example, if Supp{blood pressure, temperature} = 2/3, then
Supp{pressure, physical parameters} = 2/3, since
objects of the groups pressure and physical parameters are
included in the transactions with the identifiers 0 and 1.</p>
        <p>Using the hierarchy makes it possible to find relationships
at higher levels of the hierarchy, since the support of a set can
increase if the occurrence of a group, rather than of its object,
is counted. In addition to searching for sets that frequently
occur in transactions and consist of objects F = {i | i ∈ I} or of
groups of the same level of the hierarchy, mixed sets of objects
and groups can also be considered.</p>
        <p>An example of such a hierarchy is the following
categorization of analyses:</p>
        <sec id="sec-1-4-1">
          <title>Pressure:</title>
          <p>· Arterial;
· Venous;
· Capillary</p>
        </sec>
        <sec id="sec-1-4-2">
          <title>Physical indicators:</title>
          <p>· Temperature</p>
        </sec>
        <sec id="sec-1-4-3">
          <title>Blood test:</title>
          <p>· Hemoglobin level;
· pH</p>
          <p>A common approach is to analyze the groups first and then,
depending on the results, to investigate the objects of the group
that interests the analyst. In any case, it can be argued that
the presence of a hierarchy in the objects and its use in the
search for associative rules allow a more flexible analysis and
yield additional knowledge.</p>
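          <p>How counting groups instead of separate objects affects the support can be illustrated with a short Python sketch. It assumes the categorization of analyses given above, treats blood pressure as arterial pressure, and represents each patient by the set of all tests on the medical card; the names GROUP_OF, generalize and support are hypothetical. Pulse is left without a group because the source does not assign it to one.</p>
          <preformat><![CDATA[
```python
from fractions import Fraction
from typing import AbstractSet, Dict, FrozenSet, List

# Categorization of analyses from the text (object -> group).
GROUP_OF: Dict[str, str] = {
    "arterial pressure": "pressure",
    "venous pressure": "pressure",
    "capillary pressure": "pressure",
    "temperature": "physical indicators",
    "hemoglobin level": "blood test",
    "pH": "blood test",
}

# One transaction per patient (identifiers 0, 1, 2): all tests on the card.
D: List[FrozenSet[str]] = [
    frozenset({"temperature", "arterial pressure", "capillary pressure",
               "pH", "pulse"}),
    frozenset({"hemoglobin level", "temperature", "arterial pressure",
               "venous pressure"}),
    frozenset({"hemoglobin level"}),
]


def generalize(transaction: FrozenSet[str]) -> FrozenSet[str]:
    """Extend a transaction with the group of every object it contains."""
    groups = {GROUP_OF[obj] for obj in transaction if obj in GROUP_OF}
    return transaction | groups


def support(itemset: AbstractSet[str],
            database: List[FrozenSet[str]]) -> Fraction:
    """Share of transactions containing every object or group in `itemset`."""
    hits = sum(itemset <= generalize(t) for t in database)
    return Fraction(hits, len(database))


if __name__ == "__main__":
    print(support({"arterial pressure", "temperature"}, D))
    print(support({"pressure", "physical indicators"}, D))
    print(support({"venous pressure"}, D), support({"pressure"}, D))
```
]]></preformat>
          <p>Since every transaction is extended with the groups of its objects, the same function also handles mixed sets of objects and groups, for example {pressure, pH}.</p>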
          <p>In the problem of searching for associative rules considered
so far, the presence of an object in a transaction was determined
only by its presence (<italic>i</italic><sub><italic>j</italic></sub> ∈ <italic>T</italic>) or absence (<italic>i</italic><sub><italic>j</italic></sub> ∉ <italic>T</italic>). Often,
objects have additional attributes, usually numeric. For
example, analyses in a transaction have the attributes value and
duration. In this case, the presence of an object in a set can
be determined not only by the fact of its presence, but also by
the fulfilment of a condition on a certain attribute. For
example, when the transactions of patients are analyzed, one is
interested not only in the value of an analysis, but also in
how stable (long-term) this indicator is.</p>
          <p>Additional objects can be added to the sets under study in
order to extend the capabilities of the analysis based on the
search for associative rules. In general, they may differ in
nature from the main objects. For example, in the case of medical
tests, one can introduce the frequency with which tests are taken
or the symptoms that preceded these particular analyses.</p>
          <p>Solving the problem of finding associative rules, like any
other task, comes down to processing the initial data and obtaining
results. The initial data are processed by a certain Data Mining
algorithm.</p>
        </sec>
      </sec>
      <sec id="sec-1-5">
        <p>The results obtained in solving this problem are presented in
the form of associative rules. Accordingly, the search for them
has two main stages:
1. Finding all large sets of objects;
2. Generating associative rules from the found large sets of objects.</p>
      </sec>
      <sec id="sec-1-6">
        <p>Associative rules have the following form: if (condition) then
(result). Rules of this form are easy to express in programming
languages. However, they are not always useful.
There are three types of rules:
1. Useful rules contain valid information that was previously
unknown but has a logical explanation. Such rules can be
used to make beneficial decisions;
2. Trivial rules contain valid and easily understandable
information that is already known. Although such rules can
be explained, they bring no benefit, since they reflect
either known laws of the studied area or the results of
past activity. Sometimes such rules can be used to verify
the implementation of decisions taken on the basis of a
preliminary analysis;
3. Unclear rules contain information that cannot be
explained. Such rules can result either from abnormal
values or from deeply hidden knowledge. They cannot be
used directly for decision making, since their lack of
clarity can lead to unpredictable results. Further
analysis is required to understand them better.</p>
        <p>Associative rules are built on the basis of large sets. Thus, the
rules built on the basis of a set F are all possible
combinations of the objects included in it.</p>
        <p>For example, for the set {arterial pressure, temperature,
pulse} the following associative rules can be constructed:
If (arterial pressure) then (temperature);
If (arterial pressure) then (pulse);
If (arterial pressure) then (temperature, pulse);
If (temperature, pulse) then (arterial pressure);
And so on.</p>
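        <p>The enumeration of rules from one large set can be sketched in Python as follows. The sketch reads "all possible combinations" as all pairs of non-empty, disjoint condition and result drawn from the set; this reading and the names nonempty_subsets and generate_rules are assumptions made for illustration.</p>
        <preformat><![CDATA[
```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str]]  # (condition, result)


def nonempty_subsets(items: Iterable[str]) -> List[FrozenSet[str]]:
    """All non-empty subsets of `items`, smallest first."""
    pool = sorted(items)
    return [frozenset(combo)
            for size in range(1, len(pool) + 1)
            for combo in combinations(pool, size)]


def generate_rules(large_set: Iterable[str]) -> List[Rule]:
    """All rules "if X then Y" with non-empty, disjoint X and Y from the set."""
    subsets = nonempty_subsets(large_set)
    return [(x, y) for x in subsets for y in subsets if x.isdisjoint(y)]


rules = generate_rules({"arterial pressure", "temperature", "pulse"})

if __name__ == "__main__":
    for condition, result in rules:
        print("If", sorted(condition), "then", sorted(result))
```
]]></preformat>
        <p>Under this reading, a set of n objects yields 3<sup>n</sup> − 2<sup>n+1</sup> + 1 rules, so the number of rules grows exponentially with the size of the set.</p>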
        <p>Thus, the number of associative rules can be very large and
hard for a person to take in. In addition, not all of the rules
built carry useful information. To assess their usefulness, the
following values are introduced:
• Support shows what percentage of transactions
supports the rule (we found rules whose support is
higher than 75%).
• Confidence shows the probability that the presence
of the set X in a transaction implies the presence of the
set Y (we found rules whose confidence is higher than 0.5).
• Improvement indicates whether the rule is useful
for research.</p>
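        <p>The three estimates can be computed as in the following Python sketch. The source does not give formulas, so the sketch assumes the definitions commonly used for rules X → Y: Supp(X ∪ Y) for support, Supp(X ∪ Y) / Supp(X) for confidence, and Supp(X ∪ Y) / (Supp(X) · Supp(Y)) for improvement. The transactions are hypothetical sample data, and the thresholds 75% and 0.5 are the ones mentioned above.</p>
        <preformat><![CDATA[
```python
from fractions import Fraction
from typing import AbstractSet, FrozenSet, List

# Hypothetical transactions, used only to exercise the formulas.
D: List[FrozenSet[str]] = [
    frozenset({"arterial pressure", "temperature", "pulse"}),
    frozenset({"arterial pressure", "temperature"}),
    frozenset({"arterial pressure", "temperature", "pH"}),
    frozenset({"arterial pressure", "temperature", "pulse"}),
    frozenset({"temperature", "pulse"}),
]


def supp(items: AbstractSet[str], database: List[FrozenSet[str]]) -> Fraction:
    """Share of transactions that contain every object in `items`."""
    return Fraction(sum(items <= t for t in database), len(database))


def confidence(x: AbstractSet[str], y: AbstractSet[str], database) -> Fraction:
    """Supp(X u Y) / Supp(X); X must occur in at least one transaction."""
    return supp(x | y, database) / supp(x, database)


def improvement(x: AbstractSet[str], y: AbstractSet[str], database) -> Fraction:
    """Supp(X u Y) / (Supp(X) * Supp(Y)); both sets must occur in the data."""
    return supp(x | y, database) / (supp(x, database) * supp(y, database))


def is_accepted(x, y, database,
                min_supp=Fraction(3, 4), min_conf=Fraction(1, 2)) -> bool:
    """Keep a rule only if it exceeds both minimum values."""
    return (supp(x | y, database) > min_supp
            and confidence(x, y, database) > min_conf)


if __name__ == "__main__":
    x, y = {"arterial pressure"}, {"temperature"}
    print(supp(x | y, D), confidence(x, y, D), improvement(x, y, D))
```
]]></preformat>
        <p>In this sample the rule "if (arterial pressure) then (temperature)" passes both thresholds, but its improvement equals 1: temperature occurs in every transaction, so the condition adds no information about the result.</p>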
        <p>These estimates are used when generating rules. When
searching for associative rules, the analyst specifies minimum
values for them. Rules that do not satisfy these conditions are
discarded and are not included in the solution of the problem.</p>
        <p>If objects have additional attributes that affect the
composition of objects in transactions, and therefore in sets,
these attributes should be taken into account in the generated
rules. In this case, the conditional part of a rule includes not
only a check that an object is present in a transaction, but also
more complex comparison operations: greater than, less than,
includes, and so on. The resulting part of a rule may also contain
statements about attribute values. For example, if the relevance
of an indicator is considered, a rule may look like this:
If pH.relevance &gt; 10 days then the level of hemoglobin
in the blood.relevance &lt; 3 days.</p>
        <p>This rule states that if the patient had the pH analysis done
more than 10 days ago, then the analysis of hemoglobin in the
blood is probably valid for no more than 3 days.</p>
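        <p>A rule with attribute conditions can be evaluated as in the following Python sketch. The record layout (analysis name, then attribute name, then value in days), the sample records and the helper names are hypothetical; the sketch only shows how comparison operations replace the simple presence test.</p>
        <preformat><![CDATA[
```python
from typing import Callable, Dict, List, Optional

# analysis name -> attribute name -> value; "relevance" is given in days.
Record = Dict[str, Dict[str, float]]

# Hypothetical sample records, one per patient.
records: List[Record] = [
    {"pH": {"relevance": 12}, "hemoglobin level": {"relevance": 2}},
    {"pH": {"relevance": 15}, "hemoglobin level": {"relevance": 5}},
    {"pH": {"relevance": 4}, "hemoglobin level": {"relevance": 1}},
    {"hemoglobin level": {"relevance": 2}},
]


def ph_is_old(record: Record) -> bool:
    """Conditional part: pH is present and pH.relevance > 10 days."""
    return "pH" in record and record["pH"]["relevance"] > 10


def hemoglobin_is_short_lived(record: Record) -> bool:
    """Resulting part: hemoglobin level is present and its relevance < 3 days."""
    return ("hemoglobin level" in record
            and record["hemoglobin level"]["relevance"] < 3)


def rule_confidence(data: List[Record],
                    condition: Callable[[Record], bool],
                    result: Callable[[Record], bool]) -> Optional[float]:
    """Share of records meeting the condition that also meet the result."""
    matching = [r for r in data if condition(r)]
    if not matching:
        return None  # the rule never applies to this data
    return sum(result(r) for r in matching) / len(matching)


if __name__ == "__main__":
    print(rule_confidence(records, ph_is_old, hemoglobin_is_short_lived))
```
]]></preformat>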
        <p>The main differences between static and dynamic XML
documents are:</p>
        <p>• Availability of a validity period. A static XML document
does not contain elements that indicate the expiration date of
the document. In contrast, a dynamic XML document initially
contains at least one element that indicates the validity period
of a particular version of the document.</p>
        <p>• Persistence of the displayed information. Once created, the
information in a static XML document remains valid at all times.
Conversely, a version of a dynamic XML document is valid only for
the period specified in the corresponding elements. As soon as a
new version appears, the information contained in the previous
version is replaced.</p>
        <p>Most of the work on finding associative rules in static XML
documents relies on XML-specific algorithms derived from the
Apriori algorithm. However, there are a number of other
approaches.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>IV. CONCLUSION</title>
      <p>The task of finding associative rules is to identify sets of
objects that frequently occur in a large collection of objects.
The task of sequential analysis is to find frequent sequences.
The main difference between sequential analysis and the search
for associative rules is that an order relation is established
between objects. The presence of a hierarchy in the objects and
its use in the search for associative rules allow a more flexible
analysis and yield additional knowledge. The results of solving
the problem are presented in the form of associative rules, whose
conditional and resulting parts contain sets of objects.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Brin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Page</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (
          <year>1998</year>
          ).
          <article-title>The anatomy of a large-scale hypertextual web search engine</article-title>
          .
          <source>Computer networks and ISDN systems</source>
          ,
          <volume>30</volume>
          (
          <issue>1-7</issue>
          ),
          <fpage>107</fpage>
          -
          <lpage>117</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Negnevitsky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Artificial intelligence: a guide to intelligent systems</article-title>
          .
          <source>Pearson Education.</source>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Jain</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benyoucef</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Deshmukh</surname>
            ,
            <given-names>S. G.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>A new approach for evaluating agility in supply chains using fuzzy association rules mining</article-title>
          .
          <source>Engineering Applications of Artificial Intelligence</source>
          ,
          <volume>21</volume>
          (
          <issue>3</issue>
          ),
          <fpage>367</fpage>
          -
          <lpage>385</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Shakhovska</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaminskyy</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zasoba</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Tsiutsiura</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Association rules mining in big data</article-title>
          .
          <source>International Journal of Computing</source>
          ,
          <volume>17</volume>
          (
          <issue>1</issue>
          ),
          <fpage>25</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>