<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Prediction of Heart Disease Mortality Rate Using Data Mining</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Prasenjit Das</string-name>
          <email>prasenjit.das@chitkarauniversity.edu.in</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shaily Jain</string-name>
          <email>shaily.jain@chitkarauniversity.edu.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chetan Sharma</string-name>
          <email>chetan.sharma@chitkarauniversity.edu.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shankar Shambhu</string-name>
          <email>shankar.shambhu@chitkarauniversity.edu.in</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sakshi</string-name>
          <email>sakshi@chitkara.edu.in</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chitkara University Himachal Pradesh</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Chitkara University Institute of Engineering and Technology, Chitkara University</institution>
          ,
          <addr-line>Himachal Pradesh</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Chitkara University Institute of Engineering and Technology, Chitkara University</institution>
          ,
          <addr-line>Punjab</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <fpage>352</fpage>
      <lpage>367</lpage>
      <abstract>
<p>Heart disease is the most acute disease with the highest mortality rate in the world. Only prediction and timely treatment of this deadly disease can reduce its impact. Our paper aims to predict death from heart disease using different data mining algorithms with the utmost accuracy. In this context, we have applied five data mining algorithms, Naive Bayes, LibLinear, Naive Bayes tree, Regression, and Bayesian network, in WEKA on a dataset from the UCI repository. According to the results obtained after execution, all the data mining algorithms predict with good accuracy. We have evaluated accuracy, F-measure, recall, and precision to compare the data mining algorithms under consideration. However, the Bayes network outperforms all with a maximum accuracy of 79.26%. The values of the other parameters are also highest for the Bayes network compared to the other four algorithms.</p>
      </abstract>
      <kwd-group>
        <kwd>Classification</kwd>
        <kwd>Prediction</kwd>
        <kwd>Algorithms</kwd>
        <kwd>Heart Disease</kwd>
        <kwd>Data Mining</kwd>
        <kwd>WEKA</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
<p>Data Mining is a branch of computer science
that is widely used in many fields. Data mining
means mining or digging out knowledge or
useful information from a vast amount of data.
Through data mining, we can explore small to
large datasets to dig out useful data that was
previously hidden or unknown and to detect
relationships between different parameters that
were not possible to find with statistical methods.
In the health care industry, by applying data
mining techniques, we can diagnose and predict
the occurrence of disease and the probability of
death. Early prediction and diagnosis of the
disease can further decrease the death rate.
Cardiovascular disease is the most commonly
occurring disease, leading to the maximum
number of deaths around the globe [1]. According
to WHO, more than 19 million people died from
cardiovascular diseases in 2018, and around 4
million of these deaths were of non-senior citizens.</p>
<p>Large amounts of data are available in the
health care industry, which can be mined to
determine hidden information about diseases
and used for effective decision making
beforehand [2]. Many researchers have already
been motivated by the increasing mortality rate
of cardiovascular diseases and have started working
on extracting useful information
using various data mining techniques [3].
Hence, if we can design a prediction system for
diseases such as heart disease using machine
learning or deep learning methods, medical
professionals can foresee symptoms or problems
related to the heart based on the available data
about patients and the various attributes that
contribute to the occurrence of heart disease.
One major challenge in assisting doctors in
diagnosing the world's most deadly disease
is achieving the utmost accuracy [4]. Hence, most of the
research aims to improve diagnosis
accuracy.</p>
<p>This paper used different classification
algorithms and compared them on
parameters such as accuracy in predicting the death
rate, F-measure, precision, and recall. Section 2
of this article covers the related work in this
domain, and the proposed methodology is
discussed in Section 3. Section 4 tabulates our
experimental setup along with our results and a
discussion of them. Finally, the paper is
concluded in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Background and Motivation</title>
<p>Intensive research has been going on for the
past few decades to predict heart disease using
data mining techniques. Various data mining
algorithms like Naive Bayes, Decision Tree,
Neural Network, Support Vector Machine (SVM),
Logistic Regression, k-Nearest
Neighbour, Artificial Neural Network, Random
Forest, and J48 have already been used by
researchers to determine different levels of
accuracy on multiple datasets around the globe
[5].</p>
      <p>
Guidi et al. [6] designed a clinical
decision support system (CDSS) for heart
failure analysis. In their paper, the performance
of various machine learning classifiers, such as
the artificial neural network (ANN), support
vector machine (SVM), CART system with fuzzy
rules, and Random Forest, is compared; the CART
model and Random Forest outperformed the others,
achieving an accuracy of 87.6%. In [7], the authors
proposed a logistic regression classifier, after
feature selection, for a decision support system
for the classification of heart disease and
achieved an accuracy of 77%. The authors in [8]
used two approaches, the multilayer perceptron
(MLP) and the support vector machine, to classify
heart disease and reached an accuracy of
80.41%. In [9], the authors proposed and
evaluated a hybrid classification system for heart
disease and achieved an accuracy of 87.4%.
They combined fuzzy and artificial neural
network techniques for classification to find the
results. Palaniappan et al. [10] applied
Naive Bayes, ANN, and Decision Tree
algorithms to diagnose the existence of heart
disease. According to their results, the ANN comes
out as the best predictive model with an
accuracy of 88.12%, compared to Naive Bayes
with an accuracy of 86.12% and the Decision Tree
with only 80.4%. The authors of [11] proposed a
three-phase model for heart disease diagnosis.
They achieved an accuracy of 88.89%.
Accuracy is the most important factor in
prediction, but it is not the only one. Some
researchers have taken other parameters,
like precision, recall, F-measure, and R2 values,
into account in heart disease prediction. In [12], the
authors used a dimensionality reduction
technique to first process the raw data of 74 features
and then divide them into three groups.
They could achieve the highest accuracy of
99.4% for CH, 100% precision, and 97.1%
recall while using CHI-PCA with the RF classifier.
Shamsollahi [13] used combined
predictive and descriptive approaches for
predicting coronary artery disease. They
selected the k-means method for clustering
(descriptive) and various classification methods
(predictive), including CHAID, QUEST, C5.0, the
C&amp;RT decision tree, and the ANN method. They
compared the results on the parameters precision,
accuracy, specificity, sensitivity, and error rate.
As per the results, C&amp;RT comes out as the best
method for the entire dataset, with an error of
only 0.074. In [
        <xref ref-type="bibr" rid="ref2">14</xref>
        ], the authors applied decision tree
classification using the J48, random forest, and
logistic model trees algorithms on the UCI
repository. It is concluded from their results
that the J48 tree classification algorithm is the
best classifier for heart disease
prediction because it achieves the highest
accuracy and the smallest total time to
build. Moreover, the effect of pruning is clearly
visible. They achieved an accuracy of
56.76% and a time to build of 0.04 seconds for
J48, while logistic model trees reached an
accuracy of only 55.77% with a total time to build
of 0.39 seconds.
      </p>
      <p>
        Authors have implemented five different
classifying algorithms, Naïve Bayes, Decision
Tree, discriminant, Random Forest, and
Support Vector Machine, on big datasets and
compared their performance in terms of
accuracy, precision, specificity, recall, and
F-measure [
        <xref ref-type="bibr" rid="ref4">15</xref>
        ]. Among all five classifiers, the
decision tree ranks first, achieving an accuracy
of 99.0%, with Random Forest in second
position with an accuracy of 93.4%.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Proposed Methodology</title>
<p>The experiment's process flow is shown in
Figure 1, and the following sections explain the
proposed methodology used.</p>
    </sec>
    <sec id="sec-4">
      <title>Dataset</title>
      <p>
        We have taken the UCI repository dataset from
Kaggle [
        <xref ref-type="bibr" rid="ref6">16</xref>
        ] named Heart Failure Prediction.
The dataset has 13 attributes and 299 instances
in total.
      </p>
      <sec id="sec-4-1">
        <title>Attribute Description</title>
        <p>The attributes include, among others: serum creatinine, the creatinine level in the blood, measured in mg/dl; serum sodium, the sodium level found in the patient's blood, measured in milliequivalents per liter; smoking, whether the patient smokes (1: yes, 0: no); time, the follow-up period with the patient; and DEATH_EVENT, the occurrence of death due to heart disease (1: yes, 0: no).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Data Pre Processing</title>
<p>Real-life data consists of redundant values
and lots of noise. The data needs to be cleaned,
and the missing values need to be filled, before
the data is fed in to generate a model. In the
pre-processing step, these issues are taken care
of so that the prediction can be made accurately.
Once the cleaning of the data is done, i.e., the noise
is removed and the missing values are filled,
we need to transform it. Many supervised
learning algorithms work on nominal or
cardinal data, so data transformation is applied
to the dataset obtained from UCI in the present
work. Reduction of the dataset is applied to
convert the complex dataset into a more
straightforward form, improving the model's
accuracy.</p>
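<p>As an illustration of these pre-processing steps, the following sketch uses the pandas library rather than WEKA's filters; the attribute names and values are hypothetical. It removes duplicate records, imputes missing numeric values with the column median, and transforms a continuous attribute into a nominal one:</p>

```python
# Illustrative pre-processing sketch (not the authors' exact pipeline):
# clean redundant rows, fill missing values, discretize a numeric attribute.
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop_duplicates()                      # remove redundant rows
    df = df.fillna(df.median(numeric_only=True))   # impute missing numeric values
    # transform a continuous attribute into nominal bins, as many
    # supervised learners expect nominal data
    df["age_group"] = pd.cut(df["age"], bins=[0, 40, 60, 120],
                             labels=["young", "middle", "senior"])
    return df

raw = pd.DataFrame({"age": [45, 45, None, 70],
                    "serum_sodium": [137, 137, 140, None]})
clean = preprocess(raw)
print(clean.shape)  # one duplicate row dropped, one nominal column added
```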
    </sec>
    <sec id="sec-6">
      <title>Tool Used</title>
      <p>WEKA 3.8.4 machine learning tool is used to
conduct this study written in Java and
developed at the University of Waikato. WEKA
tool provides us with different classifiers to
examine the performance. WEKA is used to
evaluate different data mining tasks like
preprocessing, classification, regression, and many
more. WEKA accepts .csv and .arff file format
and the chosen dataset has already created the
required data in the mentioned format.
3.4</p>
    </sec>
    <sec id="sec-7">
      <title>Classification Algorithms</title>
<p>After going through an intensive literature
review, we have selected five classification
algorithms: classification via regression, naive
Bayes tree, naive Bayes, Bayes network, and
LibLinear.</p>
      <p>
        Regression [
        <xref ref-type="bibr" rid="ref8">17</xref>
        ][
        <xref ref-type="bibr" rid="ref9">18</xref>
        ]: Regression is a
supervised learning technique used to predict
the class of a dataset when the target values
are known [
        <xref ref-type="bibr" rid="ref10">19</xref>
        ]. The current study uses
regression to generate a model with
parameters such as age, gender, etc., and we
have predicted the unknown class. The
regression technique works as follows:
the parameters used to make the prediction are
continuous variables (θ1, θ2, ..., θn). Based on
these parameters, the model tries to find the best
fit to predict the target variable Y and improve
upon the accuracy. Using a function F of
one or more predictors (x1, x2, ..., xn) and a factor e as
an error term, the value of the target variable Y
is calculated as
      </p>
      <p>Y=F(x, θ) + e
(1)
The target variable Y is dependent on the
predictor variables, which are independent of
each other. The model is generated based on the
relation between the predictors and the target
class. This is done in the training process. The
model thus built is now fed with different
unknown datasets for which the target value is
predicted. The number of correctly predicted
classes constitutes the accuracy and establishes
the effectiveness of the model.</p>
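<p>The training-then-prediction procedure described above can be sketched as follows; the snippet uses scikit-learn's logistic regression on synthetic data as a stand-in for WEKA's classification-via-regression scheme, so the predictors, labels, and split are hypothetical:</p>

```python
# Sketch of the regression-based prediction idea: fit a model
# Y = F(x, theta) + e on labeled training data, then predict the
# unknown class for new patients (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# hypothetical predictors, e.g. standardized age and serum creatinine
X = rng.normal(size=(200, 2))
# synthetic target: death event more likely when both predictors are high
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression().fit(X[:150], y[:150])   # training process
pred = model.predict(X[150:])                        # predict unknown classes
accuracy = (pred == y[150:]).mean()  # fraction of correctly predicted classes
print(round(accuracy, 2))
```

<p>The number of correctly predicted classes on the held-out portion gives the accuracy, exactly as described above.</p>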
<p>Naive Bayes Tree: This is a hybrid approach in
which the model is generated using the naïve
Bayes and decision tree approaches. Naïve
Bayes classification assumes that the features
are independent of each other, while the decision tree
assumes that the features are dependent on each
other, so the hybrid approach takes advantage
of both. The decision tree is built by
considering only one feature at a time, and the output is fed
to the node; based on the outcome of each node,
further features are selected. In this hybrid
approach, the split is done in the same manner,
by considering only one feature at every node,
but with naive Bayes classifiers at the leaves.
In large datasets, data splitting is considered a
vital task for classification, so using
the selected features we have implemented the naive
Bayes tree classification.</p>
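<p>A minimal sketch of this hybrid idea, assuming a shallow decision tree whose leaves each hold a naive Bayes model, is given below; it is illustrative only and not WEKA's exact NBTree implementation, and the synthetic data is hypothetical:</p>

```python
# Hybrid sketch: single-feature splits build the tree, naive Bayes at leaves.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

class NaiveBayesTree:
    def fit(self, X, y):
        # shallow tree: each internal node splits on one feature
        self.tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
        leaves = self.tree.apply(X)          # leaf index of each training sample
        self.leaf_models = {}
        for leaf in np.unique(leaves):
            mask = leaves == leaf
            if len(np.unique(y[mask])) > 1:  # NB needs at least two classes
                self.leaf_models[leaf] = GaussianNB().fit(X[mask], y[mask])
            else:
                self.leaf_models[leaf] = y[mask][0]  # pure leaf: constant class
        return self

    def predict(self, X):
        out = np.empty(len(X), dtype=int)
        for i, leaf in enumerate(self.tree.apply(X)):
            m = self.leaf_models[leaf]
            out[i] = m if isinstance(m, (int, np.integer)) else m.predict(X[i:i+1])[0]
        return out

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = (X[:, 0] * X[:, 1] > 0).astype(int)   # non-linear synthetic labels
clf = NaiveBayesTree().fit(X[:250], y[:250])
preds = clf.predict(X[250:])
```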
      <p>
        Naive Bayes Classification [
        <xref ref-type="bibr" rid="ref11">20</xref>
        ]–[
        <xref ref-type="bibr" rid="ref14">22</xref>
        ]: This
classification technique is based on the Bayes
theorem, which works on the assumption that
the existence of one feature is independent of
the other feature. The advantage of the Naive
Bayes classification is that it requires a small
amount of data to create/train the model.
Bayes theorem provides a way of calculating
posterior probability (conditional probability
where we are finding probability under a given
condition assumed to be true ) P(c|x) from P(c),
P(x), and P(x|c). The following is the formula
to calculate posterior probability:
      </p>
<p>P(c|x) = P(x|c)*P(c)/P(x) (2)
Where:
P(c|x) is the conditional probability of class c
given that x has already occurred,
P(c) is the known (prior) probability of the class,
P(x|c) is the conditional probability of x on the
condition that c has occurred.</p>
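<p>A small worked example of equation (2), with hypothetical probability values, illustrates the posterior computation:</p>

```python
# Worked example of equation (2): P(c|x) = P(x|c) * P(c) / P(x).
# Hypothetical numbers: c = "death event", x = "patient smokes".
p_c = 0.32          # prior probability of a death event
p_x_given_c = 0.40  # probability a deceased patient smoked
p_x = 0.35          # overall probability that a patient smokes

p_c_given_x = p_x_given_c * p_c / p_x
print(round(p_c_given_x, 4))  # -> 0.3657, posterior of death given smoking
```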
      <p>P(x) is the prior probability of the predictor x.</p>
      <p>Bayes Network: The naïve Bayes algorithm
assumes the independence of features. This
hypothesis hampers the performance of the NB
classifier. To improve the performance of the
classifier, the Bayes networking algorithm is
applied. The network is an acyclic graph that
shows the joint probability distribution of the
random variables/features. Each node/vertex of
the graph represents a feature, and the edge
represents the correlation between the features.
This, in a way, reduces the effect of the
hypothesis that the features are independent of
each other. The independence of the features is
then evaluated to reduce the number of
parameters needed to calculate the probability
distribution and compute the posterior
probabilities. The acyclic graph represents a joint
probability distribution over a set of random
variables, say U. Mathematically, we can say that
the network is an ordered pair B = (G, Y). The first
component of the ordered pair, G, is the acyclic graph;
in this graph, the vertices represent the random
variables X1, X2, ..., Xn, and the edges
represent the relationships between these
variables. The second component, Y, is the set
of parameters that constitute the network. It
contains a parameter Y(xi|Πxi) = PB(xi|Πxi) for each
possible value xi of Xi and each configuration Πxi of ΠXi, where
ΠXi denotes the set of parents of Xi in G. A
Bayesian network B defines a joint probability
distribution (PDF) over U, and this distribution is
unique.</p>
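<p>Equation (3)'s factorization can be illustrated with a hypothetical two-node network, smoking as the parent of the death event; the conditional probability values below are assumptions for illustration only:</p>

```python
# Sketch of equation (3): the joint probability of a Bayesian network
# factorizes into each node's probability given its parents.
p_smoking = {1: 0.32, 0: 0.68}                        # P(smoking)
p_death_given_smoking = {(1, 1): 0.45, (0, 1): 0.55,  # P(death | smoking)
                         (1, 0): 0.28, (0, 0): 0.72}

def joint(death: int, smoking: int) -> float:
    # P(death, smoking) = P(smoking) * P(death | smoking)
    return p_smoking[smoking] * p_death_given_smoking[(death, smoking)]

# a proper joint distribution sums to 1 over all assignments
total = sum(joint(d, s) for d in (0, 1) for s in (0, 1))
print(total)
```

<p>Since each factor is a proper conditional distribution, the factorized joint distribution sums to one over all assignments.</p>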
      <p>PB(X1, X2, ..., Xn) = Π PB(Xi|ΠXi) (3)
LibLinear: LIBLINEAR is an open-source
library for linear classification. It supports two
linear classifiers, logistic regression
and the linear support vector
machine. Given a set of instance-label pairs (xi,
yi), i = 1, ..., l, both methods solve the
following unconstrained optimization problem,
with different loss functions ξ(w; xi, yi),
where C &gt; 0 is a penalty parameter:
min_w (1/2)wᵀw + C Σi ξ(w; xi, yi) (4)</p>
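<p>For readers who wish to experiment outside WEKA, scikit-learn exposes both LIBLINEAR formulations: its LogisticRegression with solver="liblinear" and its LinearSVC wrap the LIBLINEAR library, with C the penalty parameter of equation (4). The data below is synthetic and purely illustrative:</p>

```python
# Both LIBLINEAR classifiers on a linearly separable synthetic problem.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X @ np.array([1.0, -1.0, 0.5, 0.0]) > 0).astype(int)  # synthetic labels

logreg = LogisticRegression(solver="liblinear", C=1.0).fit(X, y)
svm = LinearSVC(C=1.0).fit(X, y)
print(logreg.score(X, y), svm.score(X, y))  # training accuracy of each model
```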
    </sec>
    <sec id="sec-8">
      <title>Evaluation Matrices</title>
<p>We have considered four parameters for our
paper. In the present work, the prediction class
is whether the person having certain attributes has died
because of heart disease or not, so the class C in
the confusion matrix is the set of instances belonging to
the positive class. Figure 2 is the confusion matrix.
TP is the number of people who died because of
heart disease and for whom the model also predicted the
same. Similarly, TN counts the people who didn't
die of a heart ailment, where our model also
predicted the same. A False Positive (FP) is a
Type I error, because the model predicted that
the person died of the ailment, but actually the
patient didn't. A False Negative (FN) is a Type II error:
the model predicted that the person didn't die
of the ailment, but he/she did.</p>
<p>The accuracy of the model is calculated through
the formula given below:
Accuracy = (TP+TN)/Total no. of instances (5)
Recall is the measure of correctly predicted
positive classes out of the total actual positive classes. The
formula is as follows:
Recall = TP/(TP+FN) (6)
Precision is the measure of correctly predicted positive
classes out of all the predicted positive
classes. The formula for precision is as follows:
Precision = TP/(TP+FP) (7)</p>
<p>Comparing two models becomes difficult
when the precision is low and the recall value
is high, or vice versa; in such cases the two
parameters alone are not of much use for comparing
the models. The F-score is used to compare the
models in such cases. The F-score is the harmonic
mean of the two values, which helps to measure
recall and precision at the same time.
The harmonic mean is used instead of the
arithmetic mean because, unlike the arithmetic
mean, it penalizes extreme values.</p>
      <p>F-score= (2*Recall*Precision) / (Recall +
Precision)</p>
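<p>The four metrics can be computed directly from the confusion-matrix counts; the counts below are hypothetical, chosen only to illustrate equations (5) to (7) and the F-score on a 299-instance run:</p>

```python
# Equations (5)-(7) and the F-score computed from confusion-matrix counts.
def metrics(tp: int, tn: int, fp: int, fn: int):
    accuracy = (tp + tn) / (tp + tn + fp + fn)                # eq. (5)
    recall = tp / (tp + fn)                                   # eq. (6)
    precision = tp / (tp + fp)                                # eq. (7)
    f_score = 2 * recall * precision / (recall + precision)   # harmonic mean
    return accuracy, precision, recall, f_score

# hypothetical counts summing to 299 instances
acc, prec, rec, f1 = metrics(tp=70, tn=167, fp=36, fn=26)
print(round(acc, 4), round(prec, 4), round(rec, 4), round(f1, 4))
```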
      <sec id="sec-8-1">
        <title>Confusion Matrix</title>
        <p>For actual class C: predicted C gives True Positives (TP), and predicted Not in C gives False Negatives (FN). For actual class Not in C: predicted C gives False Positives (FP), and predicted Not in C gives True Negatives (TN).</p>
        <p>Figure 2: Confusion Matrix</p>
        <p>
          k-Fold Cross-Validation:
Dividing the dataset into k parts of equal size, in which k-1 parts are used for training purposes and
the remaining part is used for evaluation, is termed k-fold cross-validation [
          <xref ref-type="bibr" rid="ref15">23</xref>
          ]. For instance, if we use 10-fold
cross-validation, 90 percent of the total data is used for training the classifier, and the remaining 10
percent is used for evaluation.
        </p>
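<p>The same protocol can be reproduced outside WEKA; the sketch below runs 10-fold cross-validation with scikit-learn on synthetic data, so the dataset and choice of classifier are illustrative only:</p>

```python
# 10-fold cross-validation sketch: each of the 10 equal parts serves once
# as the 10% evaluation fold while the other 90% trains the classifier.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic labels

scores = cross_val_score(GaussianNB(), X, y, cv=10)  # one accuracy per fold
print(len(scores), round(scores.mean(), 2))
```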
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Results and Discussion</title>
      <p>
        The chosen five classification
algorithms were implemented on the heart
disease dataset from the UCI repository. The
experimental results were obtained in the
WEKA 3.8.4 framework. We used different values of
k (5, 10, and 20) for cross-validation and
evaluated the four parameters mentioned above
using the five classification algorithms in WEKA.
Table 2 tabulates the results obtained for
5-fold CV classification with the five
algorithms, evaluating accuracy, F-measure,
precision, and recall. Similarly, Table 3 and
Table 4 show our experiment's simulation
results in WEKA with 10-fold and 20-fold CV
classification. Table 5 tabulates the results when
we used 66% of the data for training the system
and the remaining 34% for evaluating the results.
From the results, we can see that the
Bayesian network outperforms all the others, with the
highest accuracy, precision, F-measure, and
recall in each method. The Bayes network
uses an acyclic graph where each node
represents a feature and the edges represent its
relations with other features. In the present work,
features such as age, gender, blood pressure,
diabetes, etc., contribute towards heart disease
[
        <xref ref-type="bibr" rid="ref16">24</xref>
        ]. Hence, the accuracy of this classifier
outperforms the others. This supports our
hypothesis that when features such as age, gender,
etc. are modeled in the form of a graph
(where they are dependent on each other),
the heart-related ailment depends on
these factors. So we can use this technique for
the prediction of heart disease [25].
      </p>
    </sec>
    <sec id="sec-10">
      <title>Conclusion and Future Scope</title>
<p>In this paper, five data mining classifiers
(LibLinear, Naive Bayes, Naive Bayes tree,
Bayes network, and classification via
regression) have been implemented on heart
disease data taken from the UCI repository. The
goal of this experimentation was to determine the
achievable accuracy in the prediction of heart
disease in patients. We achieved the highest
accuracy, 79.28%, with the Bayesian network
classifier, followed by naive Bayes. The reason
behind the excellent performance of the Bayesian
network is its use of a graph, as a graph can
better reflect the relationships between
dependent variables such as those in our dataset,
like smoking habit, diabetes, high BP, etc.
Hence, we get better accuracy and show that
these factors contribute to the occurrence of
heart disease.</p>
      <p>In the future, we could use these results to
design an effective prediction system that could
help our medical practitioners diagnose and
treat heart disease. Also, we could implement
these data mining techniques for other diseases
like diabetes, etc.</p>
      <p>References
[1] S. Gupta, D. Kumar, and A. Sharma,
“Performance analysis of various data mining
classification techniques on healthcare data,”
Int. J. Comput. Sci. Inf. Technol., vol. 3, no. 4,
pp. 155–169, 2011.
[2] J. Soni, U. Ansari, D. Sharma, and S.
Soni, “Predictive data mining for medical
diagnosis: An overview of heart disease
prediction,” Int. J. Comput. Appl., vol. 17, no.
8, pp. 43–48, 2011.
[3] C. S. Dangare and S. S. Apte,
“Improved study of heart disease prediction
system using data mining classification
techniques,” Int. J. Comput. Appl., vol. 47, no.
10, pp. 44–48, 2012.
[4] S. Sa, “Intelligent heart disease
prediction system using data mining
techniques,” Int. J. Healthc. Biomed. Res., vol.
1, pp. 94–101, 2013.
[5] S. Nazir, S. Shahzad, S. Mahfooz, and
M. Nazir, “Fuzzy logic based decision support
system for component security evaluation.,”
Int. Arab J. Inf. Technol., vol. 15, no. 2, pp.
224–231, 2018.
[6] G. Guidi, M. C. Pettenati, P. Melillo,
and E. Iadanza, “A machine learning system to
improve heart failure patient assistance,” IEEE
J. Biomed. Heal. informatics, vol. 18, no. 6, pp.
1750–1756, 2014.
[7] R. Detrano et al., “International
application of a new probability algorithm for
the diagnosis of coronary artery disease,” Am.
J. Cardiol., vol. 64, no. 5, pp. 304–310, 1989.
[8] M. Gudadhe, K. Wankhade, and S.
Dongre, “Decision support system for heart
disease based on support vector machine and
artificial neural network,” in 2010 International
Conference on Computer and Communication
Technology (ICCCT), 2010, pp. 741–745.
[9] H. Kahramanli and N. Allahverdi,
“Design of a hybrid system for the diabetes and
heart diseases,” Expert Syst. Appl., vol. 35, no.
1–2, pp. 82–89, 2008.
[10] S. Palaniappan and R. Awang,
“Intelligent heart disease prediction system
using data mining techniques,” in 2008
IEEE/ACS international conference on
computer systems and applications, 2008, pp.
108–115.
[11] E. O. Olaniyi, O. K. Oyedotun, and K.
Adnan, “Heart diseases diagnosis using neural
networks arbitration,” Int. J. Intell. Syst. Appl.,
vol. 7, no. 12, p. 72, 2015.
[12] A. K. Garate-Escamilla, A. H. E. L.
Hassani, and E. Andres, “Classification models
for heart disease prediction using feature
selection and PCA,” Informatics Med.
Unlocked, p. 100330, 2020.
[13] M. Shamsollahi, A. Badiee, and M.
Ghazanfari, &#8220;Using combined descriptive and
predictive methods of data mining for coronary
artery disease prediction: a case study
approach,&#8221; J. AI Data Min., vol. 7, no. 1.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <article-title>predictive methods of data mining for coronary artery disease prediction: a case study approach,” J. AI Data Min</article-title>
          ., vol.
          <volume>7</volume>
          , no.
          <issue>1</issue>
          , pp.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Patel</surname>
          </string-name>
          , D. TejalUpadhyay, and S.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Dis.</surname>
          </string-name>
          , vol.
          <volume>7</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>129</fpage>
          -
          <lpage>137</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>I. A.</given-names>
            <surname>Zriqat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Altamimi</surname>
          </string-name>
          , and
          <string-name>
            <surname>M.</surname>
          </string-name>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Azzeh</surname>
          </string-name>
          , “
          <article-title>A comparative study for predicting heart diseases using data mining classification methods</article-title>
          ,
          <source>” arXiv Prepr. arXiv1704.02799</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>G. J.</given-names>
            <surname>Davide</surname>
          </string-name>
          <string-name>
            <surname>Chicco</surname>
          </string-name>
          , “Heart Failure Prediction,”
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          https://www.kaggle.com/andrewmvd/heartfailure
          <article-title>-clinical-data (accessed Nov</article-title>
          .
          <volume>10</volume>
          ,
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>F. E.</given-names>
            <surname>Harrell</surname>
          </string-name>
          , “Ordinal logistic regression,” in Regression modeling strategies, Springer,
          <year>2015</year>
          , pp.
          <fpage>311</fpage>
          -
          <lpage>325</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>V.</given-names>
            <surname>Vapnik</surname>
          </string-name>
          ,
          <article-title>The nature of statistical learning theory</article-title>
          . Springer science &amp; business media,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>K.</given-names>
            <surname>Larsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Petersen</surname>
          </string-name>
          , E. BudtzJørgensen, and L. Endahl, “
          <article-title>Interpreting parameters in the logistic regression model with random effects,” Biometrics</article-title>
          , vol.
          <volume>56</volume>
          , no.
          <issue>3</issue>
          , pp.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Ye</surname>
          </string-name>
          , “
          <article-title>Experimental comparisons of multi-class classifiers</article-title>
          ,
          <source>” Informatica</source>
          , vol.
          <volume>39</volume>
          , no.
          <issue>1</issue>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ahmad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Qamar</surname>
          </string-name>
          , and
          <string-name>
            <surname>S. Q. A.</surname>
          </string-name>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Rizvi</surname>
          </string-name>
          , “
          <article-title>Techniques of data mining in healthcare: a review,”</article-title>
          <source>Int. J. Comput. Appl.</source>
          , vol.
          <volume>120</volume>
          , no.
          <issue>15</issue>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Nikam</surname>
          </string-name>
          , “
          <article-title>A comparative study of classification techniques in data mining algorithms</article-title>
          ,” Orient.
          <source>J. Comput. Sci. Technol</source>
          ., vol.
          <volume>8</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>19</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>V.</given-names>
            <surname>Madaan</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <article-title>"Predicting Ayurveda-Based Constituent Balancing in Human Body Using Machine Learning Methods,"</article-title>
          <source>in IEEE Access</source>
          , vol.
          <volume>8</volume>
          , pp.
          <fpage>65060</fpage>
          -
          <lpage>65070</lpage>
          ,
          <year>2020</year>
          , doi: 10.1109/ACCESS.
          <year>2020</year>
          .
          <volume>2985717</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Vishu</given-names>
            <surname>Madaan</surname>
          </string-name>
          and Anjali Goyal, “
          <article-title>Analysis and Synthesis of a Human Prakriti Identification System Based on Soft Computing Techniques”</article-title>
          ,
          <source>Recent Patents on Computer Science</source>
          ,
          <volume>12</volume>
          (
          <issue>1</issue>
          ), pp
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          ,
          <year>2019</year>
          . DOI:
          <volume>10</volume>
          .2174/2213275912666190207144831 [25]
          <string-name>
            <surname>Prateek</surname>
            <given-names>Agrawal</given-names>
          </string-name>
          , Vishu Madaan, Vikas Kumar, “
          <article-title>Fuzzy Rule Based Medical Expert System to Identify the Disorders of Eyes, ENT</article-title>
          and Liver”,
          <source>International Journal of Advanced</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>