<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Personalized Approach to the Processing and Analysis of Patients' Medical Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Edgars Vasilevskis</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Applying machine learning and Big Data technologies to the problem of a personalized approach in medical decision making and state prediction makes it possible to study the random mechanisms involved in modeling and forecasting treatment stages, taking into account the patient's individual characteristics as well as the medicaments and their key characteristics. This information is used to develop innovative approaches to risk forecasting, therapy modeling, and improving the quality of medical care by personalizing patients' treatment schemes. It also makes it possible to keep data processing efficient when new information arrives from different sources. The authors propose a medical decision support system that consolidates personalized patient data received from heterogeneous healthcare-related sources. A conceptual scheme of the system is proposed, together with new approaches to consolidating and analyzing patient data and forecasting the patient's states. These results should help improve risk stratification methodology and the quality of medical care.</p>
      </abstract>
      <kwd-group>
        <kwd>medical decisions</kwd>
        <kwd>prediction of states</kwd>
        <kwd>consolidation of personalized data</kwd>
        <kwd>analysis of medical data</kwd>
        <kwd>personalization of treatment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>The problems of personalized data processing</title>
      <p>Obtaining unified, valid and high-quality information about the
information object of the studied subject area is a pressing problem in the modern world. This is due to
the rapid growth of information flows, which carry different types of
data from different sources. The urgency of data consolidation
in medicine is growing because large amounts of patient information must be processed
quickly. This information is heterogeneous because new features keep
appearing in each patient's disease history, namely: the
individual features of the patient, pre-treatment, biochemical indicators, presence of
complications, previous medication therapy, etc.</p>
      <p>
        The rapid growth in the volume of collected data, the lack of alternative methods for
their effective analysis, the need for significant human resources to support the data
analysis process, and the high computational complexity of existing analysis algorithms
lead to a steady increase in analysis time, even with timely hardware upgrades. This
calls for new methods and tools for processing and consolidating
personalized data when large volumes of heterogeneous data are collected, as
well as for supporting the doctor's decisions [
        <xref ref-type="bibr" rid="ref1">1,6</xref>
        ]. To solve this problem, it is expedient
to use machine learning and Big Data technologies.
      </p>
      <p>Effective work requires comprehensive solutions for monitoring, filtering,
structuring and searching for semantic relationships between the concepts of the studied
area. With Big Data technology, it becomes possible to observe a huge variety of variables,
identify global trends, draw conclusions about the states of the object under study, and predict the
transitions between these states on the basis of the available information.
The processing and analysis of such heterogeneous data are used to simulate the
development of events and situations in decision support systems.</p>
      <p>
        The study of this problem was started by von Neumann, the
developers at IBM, and the academic schools of S. O. Lebedev and V. M. Glushkov (system
analysis, the theory of conflict games, problem-oriented systems of modeling and data
processing) [
        <xref ref-type="bibr" rid="ref1">1, 5</xref>
        ], which led to the development of block programming languages and
decision support systems. The shift in the class of research from operational to
analytical, the emergence of new types of data, and the need for quick access to them led to an
increased interest in the problem of data integration and processing in order to
improve the quality of solutions. Research in the field of data integration
was most active in the 1990s and is active again today [4,10], owing to the rapid
development of Business Intelligence and Machine Learning and to increased data warehousing
capabilities (larger volumes of stored data, analytical processing of data).
      </p>
      <p>A distinctive feature of current research is that it analyzes not only the types of data
(descriptions) but also their semantics. Tools for the
operative collection of various data, their loading into a data warehouse, and their analysis and
forecasting are being developed especially actively in the energy sector and in administrative management
[5,11]. At present, the problem is equally relevant in medicine. Processing
large volumes of medical data requires new methods of analysis,
consolidation and forecasting to support medical decisions during diagnosis, treatment
and rehabilitation.</p>
      <p>The analysis of medical data involves a number of specific
problems, namely [3,8,13]:
─ fuzziness of the presented data;
─ classification of data;
─ consolidation of data;
─ determination of the general patient's condition;
─ identification of personalized treatment decisions;
─ assessment of the reliability of the resulting conclusions;
─ assessment of risks;
─ prediction of the patient's conditions under the influence of the applied therapy.</p>
      <p>As a consequence, problems arise in data processing, namely: the absence of
analysis methods suited to the variety of the data (for medicine: time-dependent data
on the general condition of the patient, poorly structured data from laboratory studies, etc.),
the need for significant human resources to support the data analysis process, the high
computational complexity of existing analysis algorithms, and the rapid growth of collected
data. This in turn leads to a steady increase in the time spent on data analysis, even with
regular updating of computer hardware.</p>
      <p>Thus, the task is to develop an effective unified method for analyzing and
consolidating personalized data that can be used not only in medicine
but also in other subject areas.</p>
      <p>The main tasks of the analysis of medical data, which are relevant to both
diagnosis and treatment, are shown in Fig. 1.</p>
      <p>[Fig. 1. The main tasks of the analysis of medical data: definition of the most
important individual characteristics; classification of the general patient's condition;
processing of personalized patient's data; the presence of a relationship between the
values of the most important signs and the classification of patients; analysis of the
results of treatment; prediction of patient's states.]</p>
    </sec>
    <sec id="sec-2">
      <title>Formalized representation of medical data</title>
      <p>Effective storage, access and modification of information about the
state of the object (patient), combined with current data flows about the
situational problem under investigation, make it possible to work out the structure of medical data. To do
this, the structure of personalized data must be defined.</p>
      <p>The personalized data PD is a set whose elements are subsets of
time-independent data (A) and time-dependent data (S) of the object under study, which
characterize its general state.</p>
      <p>[...]ogy, competing drugs, related drugs, previous drug therapy, patient age,
cost-effectiveness factor, etc.</p>
      <p>[Equations (1)-(5)]</p>
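      <p>The split of personalized data into a time-independent part (A) and a time-dependent
part (S) can be illustrated with a minimal Python sketch. The class name, field names and
sample values below are assumptions made for illustration only; they are not the formal
definition of PD used in this paper.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class PersonalizedData:
    """Hypothetical container for PD: time-independent (A) and time-dependent (S) data."""
    patient_id: str
    # A: attributes that do not change over time (names are illustrative).
    time_independent: dict = field(default_factory=dict)
    # S: dated observations, appended as diagnosis and treatment proceed.
    time_dependent: list = field(default_factory=list)

    def observe(self, when: date, name: str, value: float) -> None:
        """Append one time-dependent observation to S."""
        self.time_dependent.append((when, name, value))

    def latest(self, name: str):
        """Return the most recent value recorded for `name`, or None."""
        matches = [(when, value) for when, n, value in self.time_dependent if n == name]
        return max(matches)[1] if matches else None


pd_record = PersonalizedData("patient-001", {"blood_group": "A+", "birth_year": 1970})
pd_record.observe(date(2018, 3, 1), "glucose_mmol_l", 6.1)
pd_record.observe(date(2018, 3, 8), "glucose_mmol_l", 5.4)
```
]]></preformat>
      <p>Keeping S as an append-only list of dated observations reflects the fact that patient
information grows during diagnosis and treatment, while A stays fixed.</p>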
      <p>The more models can be identified, the more accurate the information in the
consolidated data repository (CDR) will be, and the easier it becomes to integrate, search and process data in the data space.
When data sets contain diverse information represented
in different models, the data consolidation approach can be used. It involves creating
associations between data objects from different participants; improving access to sources
that have limited access facilities of their own; making it possible to execute queries without access
to the real data source; consolidating data in response to a user's request; and maintaining a
high level of availability and recovery.</p>
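      <p>As an illustration of creating associations between data objects from different
participants, the following Python sketch maps records from two sources onto a shared set of
field names and merges them per patient. The source names, field mappings and the rule that
existing values are not overwritten are hypothetical choices, not part of the proposed system.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict

# Hypothetical field mappings: each source names the same concepts differently.
FIELD_MAPS = {
    "laboratory": {"pid": "patient_id", "glu": "glucose_mmol_l"},
    "clinic": {"patient": "patient_id", "age_years": "age"},
}


def consolidate(records_by_source):
    """Merge records from several sources into one view per patient."""
    consolidated = defaultdict(dict)
    for source, records in records_by_source.items():
        mapping = FIELD_MAPS[source]
        for record in records:
            # Translate source-specific field names into the shared ones.
            unified = {mapping[key]: value for key, value in record.items() if key in mapping}
            patient_id = unified.pop("patient_id")
            # Later sources add fields; existing values are kept, not overwritten.
            for key, value in unified.items():
                consolidated[patient_id].setdefault(key, value)
    return dict(consolidated)


sources = {
    "laboratory": [{"pid": "p1", "glu": 5.4}],
    "clinic": [{"patient": "p1", "age_years": 48}, {"patient": "p2", "age_years": 63}],
}
view = consolidate(sources)
```
]]></preformat>
      <p>Here <italic>view</italic> holds one consolidated entry per patient, so a query can be
answered from it without contacting the original sources.</p>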
      <p>Because the amount of patient information grows during the diagnostic and treatment processes,
the implementation relies on a data directory (Dd), which contains a
model management environment (Mm) that makes it possible to create new connections
and manage the links between them.</p>
      <p>The relationship between a directory, a model management environment and a
consolidated data repository (CDR) can be represented as a mapping [3, 19]:</p>
    </sec>
    <sec id="sec-3">
      <title>Formalization of the process of analysis of personalized medical data</title>
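      <p>In accordance with the defined data sets, and to ensure the personalization of solutions, a
conceptual model for personalizing decisions on the definition of treatment is proposed,
in which</p>
      <disp-formula><tex-math>D = \{d_1, d_2, \dots, d_{r(d)}\} \quad \text{and} \quad d : O \to D.</tex-math></disp-formula>
      <p>PD is called the set of condition attributes, and D the set of solutions. O_i is the i-th
class of solutions, that is, the set of objects whose solution, taken from the set of solutions D,
has the value d_i:</p>
      <disp-formula><tex-math>O_i = \{\, o \in O : d(o) = d_i \,\}.</tex-math></disp-formula>
      <p>Thus, the decision rule is a formula of the form</p>
      <disp-formula><tex-math>R = (pd_1 \to d_1) \wedge (pd_2 \to d_2) \wedge \dots \wedge (pd_{r(d)} \to d_{r(d)}).</tex-math></disp-formula>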
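      <p>The decision classes and the rule format described above can be sketched in Python as
follows. This is a minimal illustration under the assumption that each object carries its
decision value and that a rule is a list of condition and decision pairs; the patient
attributes, thresholds and scheme names are invented for the example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def decision_classes(objects, decision):
    """Group objects into classes: one class per decision value."""
    classes = defaultdict(list)
    for obj in objects:
        classes[decision(obj)].append(obj)
    return dict(classes)


def apply_rule(rule, obj):
    """Return the decisions whose condition holds for obj.

    `rule` is a list of (condition, decision) pairs, read as a conjunction
    of implications: condition_1 -> d_1 and condition_2 -> d_2 and so on.
    """
    return [d for condition, d in rule if condition(obj)]


# Illustrative patients and decisions; the attribute names are made up.
patients = [
    {"id": 1, "glucose": 7.9, "decision": "scheme_A"},
    {"id": 2, "glucose": 5.1, "decision": "scheme_B"},
    {"id": 3, "glucose": 8.4, "decision": "scheme_A"},
]
classes = decision_classes(patients, lambda p: p["decision"])
rule = [
    (lambda p: p["glucose"] >= 7.0, "scheme_A"),
    (lambda p: p["glucose"] < 7.0, "scheme_B"),
]
suggested = apply_rule(rule, patients[1])
```
]]></preformat>
      <p>Each key of <italic>classes</italic> corresponds to one decision value, and
<italic>suggested</italic> lists the decisions that the rule yields for the second patient.</p>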
    </sec>
    <sec id="sec-4">
      <title>The application of personalized data analysis for decision-making tasks</title>
      <p>There are many special algorithms for constructing classification rules [6].
The underlying problems are computationally hard (NP-hard in the general case), so
applying these algorithms requires special heuristics to reduce the total amount of
computation. Even so, the total number of rules obtained can be significant, which
requires additional effort to select the best ones. Well-known approaches to
forming such heuristics include the method of associative rules, backward inference, Bayesian methods and
Boolean reasoning [2, 10, 11]. When the resulting rules are applied to
determine a treatment, the decision should be based on
previous experience.</p>
      <p>In recent years, the ID3 algorithm and its modifications C4.5 and See5 have been actively used
to construct decision trees [14, 17]. All of these algorithms build trees and generate rules
from examples.</p>
      <p>The practical application of the classic ID3 algorithm is complicated by a number of problems
that are typical of learning-based models and of decision trees in particular. One
drawback of ID3 is that it handles attributes that have
unique values for all objects in the training sample incorrectly. Splitting on such an attribute
yields subsets that each contain a single object, so the information entropy of every subset is zero
and the attribute looks maximally informative, although the constructed tree provides no new
information about the dependent variable.
To overcome these shortcomings, ID3 was refined, resulting in its
extension, called C4.5.</p>
      <p>The C4.5 algorithm solves this problem by introducing normalization: it evaluates
not the number of objects of each class after the partition, but the number of subsets and
their cardinality (number of elements). However, the limitation that only
independent parameters can be processed remains.</p>
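      <p>The effect of this normalization can be checked on a toy sample. The sketch below
computes the information gain used by ID3 and the gain ratio used by C4.5 for an attribute
with unique values (<italic>record_id</italic>) and for an ordinary attribute
(<italic>fever</italic>). The data set and attribute names are invented for illustration,
and the functions follow the standard textbook definitions rather than any particular
implementation.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def partition(rows, attribute, target):
    """Split the target labels by the value of `attribute`."""
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attribute], []).append(row[target])
    return list(subsets.values())


def information_gain(rows, attribute, target):
    """ID3 criterion: entropy before the split minus weighted entropy after."""
    labels = [row[target] for row in rows]
    subsets = partition(rows, attribute, target)
    remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets)
    return entropy(labels) - remainder


def gain_ratio(rows, attribute, target):
    """C4.5 criterion: information gain divided by the split information."""
    split_info = entropy([row[attribute] for row in rows])
    if split_info == 0:
        return 0.0
    return information_gain(rows, attribute, target) / split_info


rows = [
    {"record_id": 1, "fever": "yes", "therapy": "A"},
    {"record_id": 2, "fever": "yes", "therapy": "A"},
    {"record_id": 3, "fever": "no", "therapy": "B"},
    {"record_id": 4, "fever": "no", "therapy": "B"},
]
gains = {a: information_gain(rows, a, "therapy") for a in ("record_id", "fever")}
ratios = {a: gain_ratio(rows, a, "therapy") for a in ("record_id", "fever")}
```
]]></preformat>
      <p>On this sample the information gain is 1.0 for both attributes, so ID3 cannot tell the
useless identifier from the meaningful attribute. The gain ratio is 0.5 for
<italic>record_id</italic> and 1.0 for <italic>fever</italic>, because the identifier splits
the sample into four single-object subsets and is penalized for it.</p>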
      <p>The evaluation function V (PD) is crucial for the process of personalizing the
treatment scheme. This function is obtained by applying the method of
unification of personalized schemes and is based on Bayes' theorem.
The weight of the next event corresponds to the largest value of the
a posteriori probability of the next state G, taking into account the
time-dependent input parameters S.</p>
      <disp-formula><label>(11)</label><tex-math>V(S) = \max\, p(G \mid S).</tex-math></disp-formula>
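      <p>A minimal numerical sketch of such an evaluation function is given below. It assumes a
discrete set of candidate next states G and a single discrete observation S, and applies
Bayes' theorem directly. The state names, prior probabilities and likelihoods are illustrative
values only and are not taken from this study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def posterior(prior, likelihood, observation):
    """Bayes' theorem: p(G | S) for every candidate state G.

    prior[g] is p(G = g); likelihood[g][s] is p(S = s | G = g).
    """
    joint = {g: prior[g] * likelihood[g].get(observation, 0.0) for g in prior}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("observation has zero probability under every state")
    return {g: value / evidence for g, value in joint.items()}


def evaluate(prior, likelihood, observation):
    """Return the most probable next state and V(S) = max p(G | S)."""
    probabilities = posterior(prior, likelihood, observation)
    best_state = max(probabilities, key=probabilities.get)
    return best_state, probabilities[best_state]


# Illustrative numbers only; they are not taken from the paper.
prior = {"improved": 0.5, "stable": 0.3, "worse": 0.2}
likelihood = {
    "improved": {"glucose_down": 0.8, "glucose_up": 0.2},
    "stable": {"glucose_down": 0.5, "glucose_up": 0.5},
    "worse": {"glucose_down": 0.1, "glucose_up": 0.9},
}
state, value = evaluate(prior, likelihood, "glucose_down")
```
]]></preformat>
      <p>With these numbers the joint probabilities are 0.40, 0.15 and 0.02, so the state
"improved" has the largest a posteriori probability, 0.40 / 0.57, which is approximately 0.70.</p>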
      <p>A prototype of the decision support system for the analysis of personalized
patient data (Fig. 4) was built by applying a search tree with the method of
unification of personal treatment schemes for targeted solutions.</p>
      <p>Several methods for processing personalized data were analyzed, together with the
processing speed of a query that looks for an individual treatment scheme.
A comparison of the time complexity of making medical decisions with the methods used is
given in Table 1.</p>
      <p>The high speed of the unified selection method is explained by the fact that, owing to the
balance of the search tree of treatment schemes, the method processes only the personalized data
given in the input data set. As a result, an increase in the number of selection
criteria (patient's parameters) has an inversely proportional effect on the list of proposed
therapeutic schemes.</p>
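      <p>The way additional selection criteria shorten the list of proposed schemes can be
illustrated with a small search tree that has one level per criterion. This is a simplified
sketch and not the unified selection method itself: the tree is not balanced, and the
criteria, scheme names and the rule for unknown parameters are assumptions made for the
example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def build_tree(schemes, criteria):
    """Index treatment schemes in a tree with one level per criterion."""
    if not criteria:
        return [s["name"] for s in schemes]
    first, rest = criteria[0], criteria[1:]
    branches = {}
    for scheme in schemes:
        branches.setdefault(scheme[first], []).append(scheme)
    return {value: build_tree(group, rest) for value, group in branches.items()}


def collect(node):
    """Gather every scheme name stored below `node`."""
    if isinstance(node, list):
        return list(node)
    names = []
    for child in node.values():
        names.extend(collect(child))
    return names


def search(tree, criteria, patient):
    """Follow only the branches that match the patient's known parameters."""
    node = tree
    for criterion in criteria:
        if isinstance(node, list):
            break
        if criterion not in patient:
            # Unknown parameter: stop narrowing and return all schemes below this node.
            break
        node = node.get(patient[criterion], [])
    return collect(node)


CRITERIA = ["diagnosis", "age_group", "allergy"]
SCHEMES = [
    {"name": "scheme_1", "diagnosis": "diabetes", "age_group": "adult", "allergy": "none"},
    {"name": "scheme_2", "diagnosis": "diabetes", "age_group": "adult", "allergy": "penicillin"},
    {"name": "scheme_3", "diagnosis": "diabetes", "age_group": "senior", "allergy": "none"},
    {"name": "scheme_4", "diagnosis": "asthma", "age_group": "adult", "allergy": "none"},
]
tree = build_tree(SCHEMES, CRITERIA)
one_criterion = search(tree, CRITERIA, {"diagnosis": "diabetes"})
two_criteria = search(tree, CRITERIA, {"diagnosis": "diabetes", "age_group": "adult"})
```
]]></preformat>
      <p>With one known parameter the search returns three schemes, and with two it returns two,
because each additional criterion restricts the search to a single branch and the remaining
branches are never visited.</p>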
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>The personalized approach to the processing of medical information is characterized
by a number of problems, namely: the uncertainty of the data presented, the
classification of data, data consolidation, the determination of the general patient's condition, the
identification of personalized treatment decisions, the assessment of the reliability of the
resulting conclusions, the assessment of risks, and the prediction of the
patient's conditions under the effect of the applied therapy. As a result,
data processing is hampered by the lack of analysis methods suited to the variety
of the data.</p>
      <p>An approach is proposed for the unification and formalization of personalized data.
It simplifies processing and addresses a number of the identified problems,
which makes it possible to optimize the analysis of medical data and the making of medical
decisions.</p>
      <p>The methods of personalized processing of medical data were analyzed, namely: the
method of unification of personalized treatment schemes, the Bayesian network, the
method of associative rules and the method of logical deduction. This gave a picture
of how effective each of them is for this kind of task. In the method of unification of personalized
treatment schemes, an increase in the number of selection criteria (patient's parameters) has an
inversely proportional effect on the list of proposed therapies. This
increases the search speed, because the search tree is balanced and only the
personalized data that arrive in the input data set are processed.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Mulesa</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perova</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Fuzzy Spacial Extrapolation Method Using Manhattan Metrics for Tasks of Medical Data Mining</article-title>
          .
          <source>Computer Science and Information Technologies CSIT</source>
          <year>2015</year>
          , Lviv, Ukraine, pp.
          <fpage>104</fpage>
          -
          <lpage>106</lpage>
          , (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Melnykova, N., Markiv, O.: Semantic approach to personalization of medical data. Computer Sciences and Information Technologies. Proceedings of the 11th International Scientific and Technical Conference, CSIT 2016, pp. 59-61, (2016)</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. Melnykova, N., Marikutsa, U.: Specifics personalized approach in the analysis of medical information. ECONTECHMOD. An International Quarterly Journal on Economics of Technology and Modelling Processes, Vol. 5, No 2, pp. 113-120, (2016)</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. Perova, I., Pliss, I., Churyumov, G., Franklin, M., Eze, Samer, Mohamed, Kanaan, Mahmoud: Neo-Fuzzy Approach for Medical Diagnostics Tasks in Online-Mode. 2016 IEEE First International Conference on Data Stream Mining &amp; Processing (DSMP), Lviv, Ukraine, pp. 34-38, (2016)</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>5. Melnykova, N., Shakhovska, N., Sviridova, T.: The personalized approach in a medical decentralized diagnostic and treatment. 14th International Conference The Experience of Designing and Application of CAD Systems in Microelectronics, CADSM 2017, pp. 295-297, (2017)</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. Melnykova, N.: Semantic search personalized data as special method of processing medical information. Intelligent Systems and Computing, pp. 315-325, (2017)</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>7. Bodyanskiy, Ye., Perova, I., Vynokurova, O., Izonin, I.: Adaptive Wavelet Diagnostic Neuro-Fuzzy System for Biomedical Tasks. Proc. of 14th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET), Lviv-Slavske, Ukraine, February 20-24, pp. 711-715, (2018)</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>8. Izonin, I., Trostianchyn, A., Duriagina, Z., Tepla, T., Lotoshynska, N.: The combined use of the wiener polynomial and SVM for material classification task in medical implants production. International Journal of Intelligent Systems and Applications, 9, pp. 40-47, (2018)</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>9. Boyko, N., Sviridova, T., Shakhovska, N.: Use of machine learning in the forecast of clinical consequences of cancer diseases. 7th Mediterranean Conference on Embedded Computing, MECO 2018 - Including ECYPS 2018, Budva, pp. 1-6, (2018)</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>10. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Poggi, A., Rosati, R.: Ontology-based database access. Proc. of SEBD 2007, pp. 324-331, (2007)</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>11. Mintser, O., Stryzhak, O., Denysenko, S.: Use of principles of medical ontology for construction of scenario models of post-graduate education of doctors and pharmacists. National Medical Academy of Postgraduate Education Shupyk, Medical Informatics and Engineering, Kyiv, No 2, pp. 18-23, (2013)</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>12. Lytvyn, V.: Design of intelligent decision support systems using ontological approach. The international quarterly journal on economics in technology, new technologies and modelling processes, Vol. II, No 1, pp. 31-38, (2013)</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>13. Lytvyn, V., Dosyn, D., Shkutiak, N.: Mathematical software development base domain ontology. Proceedings of II international scientific-practical conference. Modern information technology and other innovation transport, Kherson, 25-27 May 2010, HDMI, pp. 345-348, (2010)</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>14. Slyunyayev, A.: Methods of finding solutions in intelligent multi-agent information management system airport. Proceedings of the Military Institute of Kyiv National University Taras Shevchenko, No 28, Kyiv, pp. 110-114, (2010)</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>15. Dunham, M.: Data Mining Introductory and Advanced Topics. Pearson Education, Inc., (2003)</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>16. Knuth, D.: Generating all trees. The history of combinatorial generation. The Art of Computer Programming, Vol. 4, Williams, pp. 160, (2007)</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>17. Perova, I., Litovchenko, O., Bodyanskiy, Ye., Brazhnykova, Ye., Zavgorodnii, I., Mulesa, P.: Medical Data-Stream Mining in the Area of Electromagnetic Radiation and Low Temperature Influence on Biological Objects. Proc. 2018 IEEE Second International Conference on Data Stream Mining &amp; Processing (DSMP), August 21-25, Lviv, Ukraine, pp. 3-6, (2018)</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>18. Perova, I., Bodyanskiy, Ye., Brazhnykova, Ye., Mulesa, P.: Neural Network for Online Principal Component Analysis in Medical Data Mining Tasks. IEEE First International Conference on System Analysis &amp; Intelligent Computing (SAIC), 8-12 October 2018, Kyiv, Ukraine, pp. 150-154, (2018)</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>19. Perova, I., Bodyanskiy, Ye.: Fast medical diagnostics using autoassociative neuro-fuzzy memory. International Journal of Computing, 16 (1), pp. 34-40, (2017)</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>20. Melnykova, N., Marikutsa, U., Slych, A.: The Intelligent System Architecture of Personalized Management. XIV Ukrainian-Polish Conference on CAD in Machinery Design. Implementation and Educational Issues, CADMD-2016, pp. 21-22, (2016)</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>