<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Algorithms on Electronic Health Records' Logs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sabina Rakhmetulayeva</string-name>
          <email>ssrakhmetulayeva@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aliya Kulbayeva</string-name>
          <email>aakulbayeva@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>International Information Technology University</institution>
          ,
          <addr-line>Manas St. 34/1, Almaty</addr-line>
          ,
          <country country="KZ">Kazakhstan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The number of tasks devoted to predicting the incidence of infectious diseases is growing rapidly due to the availability of statistical data supporting the analysis. This article describes the main solutions currently available for creating short and long-term disease forecasts. Their limitations and practical applications are shown. Much attention is given to the Naïve Bayes classification, logistic regression, artificial neural network algorithm and k-means artificial neural networks as methods of model analysis based on machine learning. This article provides an overview of two popular machine learning algorithms used to predict diseases. The standard datasetis used for a wide range of diseases including fungal infection, allergy, GERD, chronic cholestasis,peptic ulcer disease, diabetes, bronchial asthma, migraine, paralysis (brain hemorrhage) and more.</p>
      </abstract>
      <kwd-group>
        <kwd>Keywords1</kwd>
        <kwd>network</kwd>
        <kwd>Point-to-point estimates</kwd>
        <kwd>regression models</kwd>
        <kwd>method of analogues</kwd>
        <kwd>Naïve Bayes</kwd>
        <kwd>logistic</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Patients access medical services over the Internet by connecting to medical information systems.
When people are sick, they frequently search the Internet for various information explaining their
symptoms and develop incorrect diagnosis for themselves. As a result, the medical services system,
which includes medical consultations, visits to medical facilities, drug purchases, recuperation, and
treatment, is evolving [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        When it comes to data collecting and processing, one of the most inconvenient topics is health.
With the digital age, a vast amount of patient data is being generated, including hospital resource
factors, diagnostic patient information records, and medical equipment. Making excellent judgments
necessitates extremely complicated data processing and review. The extraction of medical data gives
up a lot of possibilities for detecting duplicates of medical data that have been saved [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Patients and
doctors seeking information about their symptoms use automated tools that support medical
diagnostic systems as they focus on several possible causes to avoid complex or premature diagnoses
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. A lot of efforts have been made to create predictive diagnostic systems and encode relevant
information for the development of forecasting methods [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Data collecting in many industries is continuously increasing as a result of recent technological
advancements such as computers and satellites. Traditional data analysis approaches are obviously
incapable of processing enormous volumes of data efficiently. Data mining techniques are the only
option to extract knowledge from enormous volumes of data inthis scenario. In the field of data mining,
the obtained machine learning algorithms are a strong instrument in prediction of diseases. The</p>
      <p>2022 Copyright for this paper by its authors.
potential utility of these technologies has lately been discovered through diagnostics based on health
data. Statistics on various disease data for up to ten years provide a solid opportunity to forecast data
for the following 2-3 years.</p>
      <p>
        The implementation of proper diagnostics for autonomous extraction of relevant information from
electronic medical records is one of the ultimate aims of intelligent healthcare [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This is a highly
important and promising duty that cannot only improve work efficiency, but also minimize doctors'
diagnostic mistakes while making a diagnosis [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Previously, the models employed in diagnostic approaches had to be specified by hand, which took
a long time and effort [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Although the technical details of the manual make this model very weak, it
is difficult to adapt it to new diseases or clinical conditions. Automatic symptom-based disease
planning can significantly accelerate the development of such diagnostic tools. The pricing is also
determined by the visuals.
      </p>
      <p>
        There are four main reasons why EMR data is difficult to interpret. Previously, the texts inmedical
and medical records were shorter than in traditional textbooks, which made it difficultto determine the
context of diseases and symptoms. Both textbooks and magazines often offersimplified examples that
reflect only the most general features to help learning. EMR data represent real patients with all
common diseases and factors that make them individual. Unlike the third textbook, which reveals the
link between disease and symptoms, EMR's link betweendisease and symptoms is statistical, making it
simpler to mix up the connection with the cause.Finally, the attending physician modifies the
decisionmaking process as part of the electronicmedical record entry procedure [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
      </p>
      <p>This article explores the use of various methods of initial center selection along with naive
Bayesian methods to include the grouping of k instruments in the diagnosis of patients with diseases.
2. Naïve Bayes Algorithm</p>
      <p>The Bayesian dynamic network is another technique to time series modeling that takes into
consideration the associated data structure. Straight graphs with vertices corresponding to model
variables and edges corresponding to probabilistic connections between them established by
particular distribution rules are used to represent Bayesian networks. After training the Bayesian
network, the likelihood of an event occurring in the observed sequence of events may be estimated.
Bayesian networks are fast gaining popularity in a variety of sectors of knowledge and are being
utilized to tackle the challenge of forecasting morbidity, particularly in their most basic version - the
Hidden Markov mode.</p>
      <p>
        The basic idea of HMM is to compare any random variable Yt with an unobservable randomvariable
St t which determines the conditional distribution of Yt [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>This parameter can be estimated from the observations of yt by explaining the distribution laws of
Yt and St. Thus, Yt can be the number of citizens seeking medical care and St is an important
characteristic of an epidemic situation, for example, the total number of infected citizens. It is
assumed that the Yt values are based only on the values of the latent variables Stat time t and that the
sequences St are Markov features. That is, the value of St is based only on St-1 (Figure 1).</p>
      <p>
        Bayes' theorem is the foundation of the naïve Bayesian classifier. He has a strong sense of
selfsufficiency. An autonomous functional model is another name for it [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The presenceor absence of an
element of a certain class is considered independent of the presence or absenceof any other element of
this class. Under controlled learning settings, naive Bayesian classifiersmay be taught. It employs the
method of maximal equality. This is done in challenging real- world scenarios. A limited quantity of
training data is required for this [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Parameters for rating just the variance of the variable, not the
complete array, must be determined for each class [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. When the input data is large, naive Bayesian
analysis is utilized. This makes the outcome more difficult. The likelihood of each input characteristic
leaving the expected state. naïve Bayesian classification-based machine learning and data mining
methods.
      </p>
      <p>Bayes theorem:
 ( | ) =  ( | ) ( ) ( ),
(1)
where:
P(C|X) - posterior probability;P(X|C) - likelihood;
P(X) - predictor prior probability;P(C) - class prior probability.</p>
      <p>
        The Naive Bayesian classification algorithm is based on Bayesian theory with the concept of
attribute independence. In other words, the NBA recognizes that the existence of one attribute does
not depend on the existence of another attribute [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>Naive Bayes primarily predicts whether a patient is at risk of a certain type of disease. After
applying the K-means algorithm, we get a model dataset that compares the values of the datasetwith the
trained dataset. There will apply the Bayesian principle and determine if the patient has a disease.</p>
      <p>How does the naive Bayes algorithm work in our research?</p>
      <p>
        We have the dataset from kaggle.com with symptoms which are mapped to 42 diseases [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].Using
this dataset, we created a likelihood table with 10 diseases (Table 1) which contains theprobability of
certain disease under symptoms. Overall, dataset contains 149 symptoms, and likelihood table
describes in how many symptoms the disease will be indicated as probable.
      </p>
      <p>Based on the data set of disease markers, we need to determine which 129 markers identify the
types of diseases. We need to determine the accuracy and probability of the disease using the Bayes
algorithm of the dataset. We can solve this problem using the approach described above.</p>
      <p>P(Yes|Disease (n))=P(Disease(n) | Yes)*P(Yes) /P(Disease(n))P(Disease(1) | Yes))=4/55=0.07
P(Disease (1))=4/129=0.03P(Yes)=40/129=0.43
P(Yes|Disease (1))=0.07*0.43/0.03=1.003</p>
      <p>As we can see from the calculation of disease prediction using naive Bayes algorithm (Figure 2).</p>
      <p>As we can see the accuracy of Naïve Bayes is 97.6% which represent the good result for ourstudy. It
means that in future when we do comparative work on other machine learning algorithms we will
choose the best algorithm (Figure 3).</p>
    </sec>
    <sec id="sec-2">
      <title>3. K-means classification</title>
      <p>
        K-Means is a method of unsupervised learning that is often used in the process of collecting
information about the nearest neighbors. The data can be grouped into k groups based on similarity. K
is the number that you need to know for the algorithm to work [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ][
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. K-mean is the most
commonly used cluster algorithm capable of detecting new data collected with accuracy at most
distances [
        <xref ref-type="bibr" rid="ref18 ref19">18,19,20</xref>
        ]. The first selection of k cluster centers is done at random;thereafter, all points are
assigned to their nearest cluster centers and recomputed cluster centersfor the newly constructed group,
because some cluster centers impact K means, they are particularly susceptible to noise and outliers
[21]. One of the advantages of the K method is that it is easy to implement and interpret deductively.
The disadvantage of this approach is the complexity of estimating K. It works with clusters of
spherical meshes [22]. The K-means method is depicted graphically in Figure 4. There are two sets of
themes in the first level. Then,on both sides, define the center. Groups that form additional clusters in
the dataset are regenerated based on their center of gravity. Repeat this method until you get the ideal
pair [23].
      </p>
      <p>Based on the data, you need to build an algorithm for solving the problem. An example of
numbered list is as following.</p>
      <p>1. Select a value for K2.
2. Randomly select a data point K representing the center of gravity of cluster.
3. Assign all other data points to the nearest center of gravity of the cluster.
4. Repeat steps 3 and 4 until there are no changes left in each cluster.</p>
      <p>The cluster technique does not determine the number of clusters in the k-means approach since it
must be specified before the start. As a result, I'll employ the elbow strategy. The elbowapproach is a
popular strategy to figure out how many clusters are needed.</p>
      <p>Based on this, we got such a graph for calculating the dataset in Python and built a graph (Figure
5).</p>
      <p>When calculating cluster analysis, the big question often arises how many clusters to take and the
elbow method helps in this matter! With each new cluster, the total difference in eachcluster becomes
smaller. In extreme cases, when there are many clusters compared to the result, the score is zero.
However, in most cases, the decrease in general fluctuations after a certain moment is very small.
This point is used as the best cluster number [23].</p>
    </sec>
    <sec id="sec-3">
      <title>4. Logistic regression</title>
      <p>Logistic regression (LR) is a strong and well—established technique of classifying data with a
teacher.</p>
      <p>This is an extension of classic regression, and only binary variables representing the presence or
absence of events are often modeled.</p>
      <p>LR aids in determining the likelihood that a new instance will belong to a specific class. Given that
this is a probability, the outcome will be between 0 and 1. As a result, in order to employ LR as a
binary classifier, a threshold must be set to discriminate between the two classes. For example, if the
input instance's probability value is larger than 0.50, it is classed as "Class A." Otherwise, it's "Class
B."</p>
      <p>The LR model may be extended to represent categorical variables with three or more values.
Polynomial logistic regression is the name given to this expanded variant of LR.</p>
      <p>So, we made some research with dataset and determined that RF algorithm has that accuracywhich
is presented in Figure 6 and as we can see the accuracy of RF is 100% which representbetter result for
our study.</p>
    </sec>
    <sec id="sec-4">
      <title>5. Support vector machine</title>
      <p>Support Vector Machines (SVMs) can classify both linear and nonlinear data. First, map each data
element to an n-dimensional feature space. n is the number of objects. Detects hyperplanes that divide
data into two classes, maximizing the boundary distance between both classes and minimizing
classification errors.</p>
      <p>The distance constraint for a class is the distance between the solution's hyperplane and theclass's
nearest instance. Each data point is initially mapped as a point in n-dimensional space (where n is the
number of items), with the value of each object being the value of the suppliedcoordinate. Figure 7 is
a simple illustration of the SVM classifier.</p>
      <p>So, we made some research with dataset and determined that SVM algorithm has that accuracy
which is presented in Figure 8 and as we can see the accuracy of SVM is 100% whichrepresent better
result for our study.</p>
    </sec>
    <sec id="sec-5">
      <title>6. Artificial neural networks</title>
      <p>Artificial neural networks (ANNs) are a class of machine learning algorithms that are modeled on
how neural networks function in the human brain. McCulloch and Pitts [68] presented them initially,
followed by Rumelhartetal. According to the research. As in architecture these associations can be
reprogrammed (for example, through neuroplasticity) to aid in information adaption, processing, and
storage.</p>
      <p>ANN algorithms, similarly, may be depicted as a network of interconnected nodes.Depending on
the link, the node output is used as input to another annotation for further processing. Depending on
the alterations they execute, nodes are typically organized into a matrix known as a layer. The ANN
structure may have one or more hidden layers in addition to the input and output layers.</p>
      <p>Nodes and edges are hefty weights that enable you to control the strength of the communication
signal. Repeated training can enhance or decrease communication signals. Predictions of test data can
be made using training and subsequent selection matrices, node weights, and edges. Figure 9 depicts
an ANN (with two hidden layers) and its corresponding set of nodes.</p>
      <p>So, we made some research with dataset and determined that ANN algorithm has that
accuracy which is presented in Figure 10 and as we can see the accuracy of ANN is 64.1% which
represent worse result for our study.</p>
    </sec>
    <sec id="sec-6">
      <title>7. Results</title>
      <p>and huge datasets.
• It is applicable to both binary and
multiclass classification issues.
• This necessitates less training data.
• It is capable of making probabilistic
predictions and processing both
continuous and discrete input.
• More dependable tan LR
• Capable of handling numerous
spatial objects.
• Reduced danger of retraining.
• It is effective for categorizing
semistructured or unstructured data such
as words, photos, and so on.</p>
      <p>exclusive of one another.
• Dependence between
characteristics has a detrimental
impact on classification
performance.
• Assuming a normal distribution
of numerical characteristics.
• The computing cost for huge
and complicated datasets.
• It will not function if the data is
noisy.
• The consequences of the
ensuing patterns, weights, and
variables are frequently difficult to
comprehend.
• If there is no extension,
common SVMs cannot categorize
more than two classes.</p>
      <p>In the end of the research in Figure 11 we can see the comparison graph of the methods.</p>
      <p>Therefore, it should be noted that the accuracy of the algorithm depends on the size of the dataset,
the number of objects and the results of the model as a whole, it is better to use only one model.</p>
    </sec>
    <sec id="sec-7">
      <title>8. Conclusion</title>
      <p>Machine learning works with health departments to provide disease-related tools and dataanalysis.
Therefore, machine learning algorithms play an important role in the early detection of diseases. This
article provides an overview of two popular machine learning algorithms usedto predict diseases. The
standard dataset is used for a wide range of diseases including fungal infection, allergy, GERD,
chronic cholestasis, peptic ulcer disease, diabetes, bronchial asthma, migraine, paralysis (brain
hemorrhage) and more. Furthermore, the accuracy of the same method might differ from data set to
data set since many critical aspects influence the model'saccuracy and performance. Data set function
selection and function computation Another significant finding in this analysis is that the model's
accuracy and performance may be enhanced by applying specific algorithms that create single
pairings.</p>
      <p>The list of results found by the researchers is divided into tables for the diagnosis of diseases using
machine learning algorithms Naive Bayes and K-Means After comparing data sets of 129 columns of
two models predicting Nave-Bayes disease, it was shown that they haveexcellent prediction accuracy.</p>
      <p>Moreover, as guidance for future study, some of the limitations of this work are outlined.
9. References
equation, Journal of Theoretical and Applied Information 99.8 (2021) 1730–1739.
[20] A. F. Jahwar, A. M. Abdulazeez, Meta-Heuristic Algorithms for K-Means Clustering: A</p>
      <p>Review, Palarch's Journal of Archaeology of Egypt/Egyptology, 2021.
[21] N. Valarmathy and S. Krishnaveni, Performance Evaluation and Comparison of Clustering
Algorithms used in Educational Data Mining, International Journal of Recent Technology and
Engineering 76S5 (2019).
[22] S. Ray, A Quick Review of Machine Learning Algorithms, in: Proceedings of the International
Conference on Machine Learning, Big Data, Cloud and Parallel Computing, Com-IT-Con, India,
14th -16th Feb, 2019.
[23] Sh. Shukla and S. Naganna, A Review on K-means Data Clustering Approach, International
Journal of Information &amp; Computation Technology 4.17 (2014) 1847-1860.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mertens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gailly</surname>
          </string-name>
          , and G. Poels,
          <article-title>Supporting and assisting the execution of flexible healthcare processes</article-title>
          ,
          <source>in: Proc. Int. Conf. Pervas. Comput. Technol. Healthcare</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>375</fpage>
          -
          <lpage>388</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Manne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.C.</given-names>
            <surname>Kantheti</surname>
          </string-name>
          ,
          <article-title>Application of artificial intelligence in healthcare: chances and challenges</article-title>
          ,
          <source>Curr. J. Appl. Sci. Technol</source>
          .
          <volume>40</volume>
          .6 (
          <year>2021</year>
          )
          <fpage>78</fpage>
          -
          <lpage>89</lpage>
          . URL: https://doi.org/10.9734/cjast/2021/v40i631320.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Paparrizos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. W.</given-names>
            <surname>White</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and E.</given-names>
            <surname>Horvitz</surname>
          </string-name>
          ,
          <article-title>Screening for Pancreatic Adenocarcinoma Using Signals From Web Search Logs: Feasibility Study and Results</article-title>
          ,
          <source>Journal of Oncology Practice</source>
          (
          <year>2016</year>
          )
          <article-title>JOPR010504</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Bisson</surname>
          </string-name>
          , et al.,
          <article-title>Accuracy of a computer-based diagnostic program for ambulatory patients with knee pain, The American journal of sports medicine (</article-title>
          <year>2014</year>
          )
          <fpage>0363546514541654</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lally</surname>
          </string-name>
          , et al.,
          <article-title>WatsonPaths: scenario-based question answering and inference overunstructured information</article-title>
          ,
          <source>Yorktown Heights: IBM Research</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P. B.</given-names>
            <surname>Jensen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Jensen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Brunak</surname>
          </string-name>
          ,
          <article-title>Mining electronic health records: towards better research applications and clinical care</article-title>
          ,
          <source>Nat. Rev. Genet</source>
          .
          <volume>13</volume>
          (
          <year>2012</year>
          )
          <fpage>395</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>W. F.</given-names>
            <surname>Stewart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. R.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Selna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Paulus</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Walker</surname>
          </string-name>
          ,
          <article-title>Bridging the inferential gap: The electronic health record and clinical evidence</article-title>
          ,
          <source>Heal. Aff</source>
          .
          <volume>26</volume>
          (
          <year>2007</year>
          )
          <fpage>181</fpage>
          -
          <lpage>191</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rotmensch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Halpern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tlimat</surname>
          </string-name>
          , et al.,
          <article-title>Learning a Health Knowledge Graph from Electronic Medical Records</article-title>
          ,
          <source>Sci Rep</source>
          <volume>7</volume>
          (
          <year>2017</year>
          )
          <fpage>5994</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>N. G.</given-names>
            <surname>Weiskopf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rusanov</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <article-title>Sick patients have more data: the non-random completeness of electronic health records</article-title>
          ,
          <source>in: AMIA Annu Symp Proc</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Le</surname>
            <given-names>Strat</given-names>
          </string-name>
          , Carrat,
          <year>1999</year>
          ; Siettos, Russo,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.</given-names>
            <surname>Shinde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Arjun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Patil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Waghmare</surname>
          </string-name>
          ,
          <string-name>
            <surname>An IntelligentHeart Disease Prediction System Using K-Means Clustering</surname>
          </string-name>
          and Naïve Bayes Algorithm,
          <source>International Journal of Computer Science and Information Technologies</source>
          <volume>6</volume>
          .1 (
          <year>2015</year>
          )
          <fpage>637</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Subbalakshmi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          <article-title>Chinna Rao, Decision Support in Heart Disease Prediction System using Naive Bayes</article-title>
          ,
          <source>Indian Journal of Computer Science and Engineering (IJCSE) 2</source>
          .2 (
          <year>2011</year>
          )
          <fpage>170</fpage>
          -
          <lpage>176</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Sh</surname>
            .
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Pattekari</surname>
            and
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Parveen</surname>
          </string-name>
          ,
          <article-title>Prediction System For Heart Disease Using Naïve Bayes</article-title>
          ,
          <source>International Journal of Advanced Computer and MathematicalSciences 3</source>
          .2 (
          <year>2012</year>
          )
          <fpage>290</fpage>
          -
          <lpage>294</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14] URL: http://datareview.info/article/6
          <article-title>-prostyih-shagov-dlya-osvoeniya-naivnogo-bayesovskogoalgoritma-s-primerom-koda-na-python.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15] URL https://www.kaggle.com/kaushil268/disease
          <article-title>-prediction-using-machine-learning.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Rakhmetulayeva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Duisebekova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Mamyrbekov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. N.</given-names>
            <surname>Astaubayeva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Stamkulova</surname>
          </string-name>
          ,
          <article-title>Application of Classification Algorithm Based on SVM for Determining the Effectiveness of Treatment of Tuberculosis</article-title>
          ,
          <source>Procedia Computer Science</source>
          <volume>130</volume>
          (
          <year>2018</year>
          )
          <fpage>231</fpage>
          -
          <lpage>238</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>NagaMallik Raj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. Thirupathi</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Mandhala</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Bhattacharyya</surname>
          </string-name>
          , Machine Learning Algorithms To Enhance Security In Wireless Network,
          <source>Journal of Critical Reviews</source>
          ,
          <volume>7</volume>
          .14 (
          <year>2020</year>
          )
          <fpage>425</fpage>
          -
          <lpage>432</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D. Q.</given-names>
            <surname>Zeebaree</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Haron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Abdulazeez</surname>
          </string-name>
          and
          <string-name>
            <given-names>S. R. M.</given-names>
            <surname>Zeebaree</surname>
          </string-name>
          ,
          <article-title>Combination of K-means clustering with Genetic Algorithm: A review</article-title>
          ,
          <source>International Journal of Applied Engineering Research</source>
          <volume>12</volume>
          .24 (
          <year>2017</year>
          )
          <fpage>14238</fpage>
          -
          <lpage>14245</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Rakhmetulayeva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Duisebekova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Kozhamzharova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zh</surname>
          </string-name>
          . Aitimov,
          <article-title>Pollutant transport modeling using Gaussian approximation for the solution of the semi- empirical</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>