Risk Estimator using a Multi-Layer Perceptron Network for
                Coronary Artery Disease Prevention
                   Didi Liliana Popa a, Mihai Lucian Mocanu a and Radu Teodoru Popa a
                   a
                       Universitatea din Craiova, Facultatea de Automatică, Calculatoare și Electronică,
                       Bulevardul Decebal, nr. 107, Craiova, România


                 Abstract

                 Coronary artery disease (CAD) is one of the most prevalent heart diseases.
                 We propose using a Deep Learning (DL) network, the Multi-Layer Perceptron (MLP),
                 to obtain an early estimate of the 10-year cardiovascular risk for CAD prevention,
                 with the aim of reducing the rate of mistreatment. For this purpose, we designed a
                 protocol for selecting relevant data and a method based on a Deep Neural Network
                 sequential model with multiple inputs and three outputs. The data set comes from a
                 private clinic in the South-West region of Romania and comprises 784 patients with 11
                 medical characteristics. The prediction of the MLP network gives the probability
                 that the patient will develop a severe heart disease in the following 10 years.
                 By deploying a DL network, we were able to provide physicians with a unitary risk
                 assessment method for CAD that allows the “localization” of the European Society of
                 Cardiology guidelines to the Romanian region.

               Keywords
               Coronary artery disease, deep neural network, multilayer perceptron network, cardiovascular
               risk estimator
Proceedings of RTA-CSIT 2021, May 2021, Tirana, Albania
EMAIL: liliana.popa@edu.ucv.ro (A. 1);
© 2021 Copyright for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

    The importance of early diagnosis and risk stratification of ischemic heart diseases is
given by the fact that cardiovascular diseases are the leading cause of death in Europe
[eurostat - causes of death statistics 2019] and, at the same time, in the world [World Health
Organization]. Among them, the most prevalent manifestation is ischemic heart disease caused by
coronary atherosclerosis, which is associated with increased mortality and morbidity.
    Coronary artery disease is caused by cholesterol deposits that adhere to the walls of the
coronary arteries that supply blood to the heart and narrow them.
    The clinical presentation of ischemic heart disease includes silent ischemia, stable
coronary artery disease, unstable angina, myocardial infarction, heart failure, and sudden
cardiac death [1]. In Europe, the recommendations for treating cardiac diseases are described
in the Guidelines of the European Society of Cardiology [www.escardio.org].
    These guidelines cover a minimum set of investigations that should be performed on patients
with coronary heart disease, such as laboratory examinations (bio-markers, lipid profile,
NT-proBNP, D-dimers), 12-lead electrocardiogram, ECG and imaging effort (stress) tests,
echocardiography and coronarography, and describe the cardiac risk scores that should be
calculated, but they leave to the physician's discretion how these protocols are implemented.
    The diagnosis and cardiovascular risk assessment of stable coronary artery disease
(SCAD) involves clinical evaluation, including identifying significant dyslipidemia,
hyperglycaemia or other biochemical risk
factors and specific cardiac investigations such as stress testing or coronary imaging. These
investigations may be used to confirm the diagnosis of ischemia in patients with suspected
SCAD, to identify or exclude associated conditions or precipitating factors, to assist in
stratifying the risk associated with the disease and to evaluate the efficacy of treatment.
    Conventional risk factors for the development of SCAD are hypertension,
hypercholesterolemia, diabetes, sedentary lifestyle, obesity, smoking and a family history.
    Given that cardiac diseases have remained the leading cause of death globally over the
last 15 years [2], there is a need for a better strategy to improve diagnosis and treatment.
    Artificial Intelligence can help to achieve an earlier and more accurate diagnosis and can
also reduce the rate of misdiagnosis, which leads to a decrease in the mortality rate. To
achieve this, it is necessary to customize healthcare for each individual patient.
    The cardiac risk scores used in traditional medicine are calculated on a very large,
generalized population and do not allow localized medicine that accounts for the
particularities of each region. Neural networks can support customized healthcare because they
learn from data, and so the cardiac risk scores are improved.
    AI refers to programs that let computers behave in ways similar to human intelligence,
such as learning and solving problems. Neural networks simulate the way the human brain
interacts in the learning process.
    Deep learning (DNN) is formulated as a mathematical neural network architecture
consisting of multiple hidden layers with non-linear activation [3]. One DNN architecture is
the Multilayer perceptron (MLP), in which every element of a layer is connected to every
element of the next layer and each hidden layer has an activation function [4].
    In the literature, there are different methods in medical research for SCAD classification
using different learning and data mining techniques, such as neural networks (NN), support
vector machines, random forests, decision trees, clustering, Gaussian mixture models and
others.
    The purpose of this model was to obtain an early diagnosis of CAD with good accuracy
that can be used in clinical practice for the diagnosis of SCAD, using deep learning
methods to combine the results of clinical examination and other attributes recorded from
the patients.

2. Methods

    The main purpose was to help physicians in their practice by automatically predicting the
cardiovascular risk for a particular patient, in other words to determine which patients will
have a major cardiovascular event, such as sudden death, in the near future (the next 10
years), so that physicians can prescribe a more aggressive medical treatment.
    We used private data from a private clinic in the South-West region of Romania. The data
were obtained between October 2017 and September 2019. The enrolled patients received a
cardiology consultation with electrocardiogram and various blood tests. Data were
anonymized and patient consent was obtained.
    Patient consultations, cardiac ultrasounds and exercise tests were performed by a
cardiologist. Patients had previous blood tests.
    We proposed an MLP network, a Deep Neural Network sequential model with 4 layers,
multiple inputs and three outputs, because our model needs to predict the overall cardiac risk
for the patient.
    We decided to use the most accessible deep network architecture that could fulfill our
requirements.
    Each hidden layer used a rectifier activation function (ReLU), and we used the SoftMax
function in the output layer because we want a three-class result (low, intermediate and high
risk), so the number of categories in the output layer is more than two.
    For the purpose of implementing and testing the MLP network we used a custom data set that
included a batch of 784 patients.
    Each patient record contained two identifier fields (RegistryNumber, PatientName) and 8
medical characteristics: PatientAge, Gender, Total Cholesterol, LDL Cholesterol, Glicemia,
BMI, ABI and Mean Blood Pressure. After analyzing the medical data, we determined the type of
each medical input attribute and noticed that:
    - Some attributes, such as PatientAge, Glicemia, BMI and LDL, are integers; others, such
as Gender, RelativeRisk and Sex, are categorical.
    - In the test population we have more males over 60 years old. According to eurostat
2016, standardised death rates were higher for
men than for women for nearly all the main causes of death, including cardiac disease.
    - Some attributes with zero value represent non-existent (missing) values for that patient.
    - The patient data set is small (for learning purposes) and contains 784 rows with 11
columns. The output/endpoint of the dataset consisted of 3 distinct classes (low, intermediate
and high risk).
    We also implemented a Graphical User Interface in order to enter the data.

    Figure 1: Screenshot of the Risk Estimator application Graphical User Interface

    In order to load the data into the neural network we implemented an XML file format
specially created for our project.
    The XML format contains metadata along with the structured data as follows:

 <patient number="1">                       (1)
   <item>Age</item>
   <value type="numeric">50</value>
   <item>Name</item>
   <value type="string">ML</value>
   <item>Gender</item>
   <value type="categorical">male</value>
   ...
   <item>FinalRisk</item>
   <value type="string">High risk</value>
 </patient>

    For the implementation of the neural network that predicts risk and makes medical
recommendations (intensive medical treatment and invasive cardiac procedures), we used the
Spyder environment included in the Anaconda distribution, which can be downloaded free of
charge from the Internet. The Tensorflow, Theano and Keras libraries must also be installed.
Keras is the main library that implements Multilayer perceptron network models and it is built
on Tensorflow and Theano, so that these two libraries work in the back-end whenever we execute
a program in Keras [5].

    Figure 2: Proposed Deep Learning Network architecture

    Keras is a high-level neural network API capable of running on Tensorflow, Theano and
CNTK. It allows for fast experimentation through a user-friendly, modular and extensible API,
and it runs on both CPU and GPU [6].
    The MLP network uses the efficient Adam gradient descent optimization algorithm with a
logarithmic loss function, called "categorical_crossentropy" [7].
    The Adam optimizer used a LearningRateSchedule based on an exponential decay schedule with
an initial learning rate of
0.01, decay steps of 10000, a decay rate of 0.9 and an epsilon value of 0.01 [8].
    In Machine Learning, we always divide the medical data into a training part and a testing
part [9]. We train the model on the training data and check its accuracy on the test data.
The efficiency of the model is evaluated on the test data using the F1-score for each class,
overall accuracy, macro-average accuracy and weighted macro-average accuracy [10][11].

3. Results

    Our study collected the data from 784 cases. By training our Deep Learning Network we
achieved two things:
    - we calculated the accuracy of the final risk estimation
    - we computed the risk score for a new patient, based on previous patients' historical
data, by deploying the trained network.
    We trained our model using a batch size of 10 and 120 epochs.
    Because we are modelling a multi-class classification problem using an MLP neural
network, we reshaped the output attribute from a vector that contains the class values (high
risk, intermediate risk and low risk) to a matrix with a boolean for each class value, by
using one-hot encoding, that is, by creating dummy variables from a categorical variable.
    For example, in this problem the three class values are low risk, intermediate risk and
high risk. We can turn this into a one-hot encoded binary matrix with one row for each data
instance that would look like this:

Table 1
Cardiovascular risk coding
  Low risk    Intermediate risk      High risk
      0                0                1
      0                1                0

    Because we used one-hot encoding for our cardiovascular data set, the output layer creates
3 output values, one for each class. The class with the highest output value is taken as the
class predicted by the model.
    We used a Softmax activation function in the output layer. This ensures that the output
values are in the range between 0 and 1 and can be used as predicted probabilities.
    The prediction of the MLP network gives us the probability that the patient will develop a
severe heart disease. We convert that probability into binary 0 and 1.
    In the following step we evaluated the performance of our MLP network model. We already
have the final results and thus we can use classification reports to verify the accuracy of
the model.
    To test our model we used 10-fold stratified cross validation because we had a small
dataset and we wanted to be sure that the results do not depend on the initialization of
weights or on the order of presentation of the training data vectors [12][13].

    Figure 3: Ten intermediary results during k-fold validation from 10 runs
    (panel titles: KFold 1 acc: 63.29%; KFold 2 acc: 77.22%; KFold 3 acc: 75.64%;
    KFold 4 acc: 79.49%; KFold 5 acc: 74.36%; KFold 6 acc: 72.15%; KFold 7 acc: 78.48%;
    KFold 8 acc: 80.77%; KFold 9 acc: 79.49%; KFold 10 acc: 73.08%)

    We computed the average accuracy (ACA) as the percentage of correctly classified cases
during the testing phase [14]. Besides the ACA, the standard deviation (SD) of the ACA and the
95% confidence interval were also computed [15].
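A 10-fold stratified split of the kind described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name stratified_kfold, the seed and the round-robin assignment are hypothetical and are not the authors' implementation, which is not given in the paper. The class sizes (479, 213, 92) are taken from the support column of the classification report in Table 3.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep the class proportions.

    Hypothetical helper for illustration; not the authors' code.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal the shuffled members of each class round-robin into the k folds.
    # The offset is carried over between classes so fold sizes stay balanced.
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[(offset + pos) % k].append(idx)
        offset = (offset + len(members)) % k

    # Each fold is used once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for fold in folds[:i] + folds[i + 1:] for j in fold)
        yield train_idx, test_idx


# Class sizes as reported in the support column of Table 3 (784 patients in total).
labels = [0] * 479 + [1] * 213 + [2] * 92
splits = list(stratified_kfold(labels, k=10, seed=42))

for fold, (train_idx, test_idx) in enumerate(splits, start=1):
    counts = [sum(labels[j] == c for j in test_idx) for c in (0, 1, 2)]
    print(f"KFold {fold}: test size {len(test_idx)}, per-class counts {counts}")
```

With 784 patients and k = 10, every test fold in this sketch holds 78 or 79 patients, and the number of patients of a given risk class differs by at most one between folds.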
Table 2
MLP performance indicators
  Variable       ACA (%)    SD       95% CI
  MLPNetwork     75.396     5.150    (71.711, 79.080)
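The indicators in Table 2 can be checked against the ten per-fold accuracies given in the Figure 3 panel titles. The following is a minimal Python sketch, assuming that the SD is the sample standard deviation and that the 95% confidence interval is a Student's t interval with 9 degrees of freedom; the paper does not state which formulas were used, and the name summarize is hypothetical.

```python
import math
import statistics

# Per-fold test accuracies (%) as given in the Figure 3 panel titles.
fold_acc = [63.29, 77.22, 75.64, 79.49, 74.36, 72.15, 78.48, 80.77, 79.49, 73.08]

# Two-sided 95% critical value of Student's t for n - 1 = 9 degrees of freedom.
# Assumption: a t-based interval; the paper does not name the method.
T_CRIT_95_DF9 = 2.262


def summarize(acc, t_crit):
    """Return (mean, sample SD, (CI lower, CI upper)) for a list of accuracies."""
    n = len(acc)
    mean = statistics.mean(acc)
    sd = statistics.stdev(acc)  # sample standard deviation (n - 1 in the denominator)
    half_width = t_crit * sd / math.sqrt(n)
    return mean, sd, (mean - half_width, mean + half_width)


aca, sd, ci = summarize(fold_acc, T_CRIT_95_DF9)
print(f"ACA = {aca:.2f}%, SD = {sd:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Under these assumptions the sketch agrees with the values in Table 2 to within the rounding of the fold accuracies.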
    We can see from Table 2 that the MLPNetwork performs with an average accuracy of about
75%. Regarding the stability of the model, the SD is 5.150.
    We also built a classification report showing the general classification metrics after
complete MLP training.

Table 3
Overall Classification Report
Class          Precision   Recall   F1-score   Support
0              0.81        0.96     0.88       479
1              0.72        0.46     0.56       213
2              0.84        0.76     0.80       92
Macro avg      0.79        0.73     0.75       784
Weighted avg   0.79        0.80     0.78       784
Accuracy                            0.80       784

    The reported metrics in our testing included precision [16], recall and F1-score per risk
class (low, intermediate and high), the macro average (the unweighted mean over the risk
classes), the weighted average (the support-weighted mean over the risk classes) and the
overall accuracy. The support parameter is the number of patients included in each risk class.
    In this way we determined the performance of our supervised learning algorithm. To
compute these parameters we compared all the instances in a predicted class with the instances
of the "true" class. These instances contained "actual" and "predicted" values.
    We obtained an accuracy of 80% for our cardiac DL network model, which physicians
consider an acceptable accuracy.
    Finally, our model could be used to predict the cardiac risk for a new patient using the
classifier's "predict_classes" method.

4. Conclusion

    Sometimes, the diagnosis of coronary heart disease can escape doctors. With the help of AI,
even less experienced or tired doctors can reach a high degree of diagnostic accuracy. AI can
help doctors improve the effectiveness of their treatment. AI is not perfect, but it has
promising results. One drawback is that AI algorithms need a lot of data and time to be
trained. Studies have suggested that the combination of clinicians and AI skills will
provide patients with higher quality diagnostic results than experience alone [17]. Sooner or
later, the development of deep learning applications will affect every aspect of health
care [18].
    We consider that artificial intelligence can customize healthcare for each patient because
neural networks can learn, and so the cardiac risk scores are improved.
    Therefore, using this innovative DL network, we were able to provide a unitary diagnosis
method for physicians that allowed the “localization” of the ESC guidelines to the Romanian
region. In this way we created a method to transmit medical knowledge in a consistent way, so
physicians will benefit from both the ESC guidelines and “local” experience, because a DL
network has the ability to “learn” from previous patients' medical data in the diagnosis of
coronary heart diseases.
    We further plan to train our application and deep neural network with more clinical data,
including ultrasound and cardiac 3D angiography data [20]. We also plan to use more
complex deep neural networks with multiple layers to test whether we can further improve the
overall accuracy of our risk estimator.

5. References

[1] 2019 ESC Guidelines for the diagnosis and management of chronic coronary syndromes:
    The Task Force for the diagnosis and management of chronic coronary syndromes of the
    European Society of Cardiology (ESC), Juhani Knuuti, 2019, European Heart Journal,
    https://doi.org/10.1093/eurheartj/ehz425
[2] WHO, The top 10 causes of death,
    URL: https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death
[3] LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521: 436–444. pmid:26017442.
[4] Jason Brownlee, Your First Deep Learning Project in Python with Keras Step-By-Step,
    Machine learning mastery, 2019
[5] Pushkar Mandot, Build your First Deep Learning Neural Network Model using Keras in
    Python, Medium, 2017
[6] Milad Toutounchian, Deep Learning from Scratch and Using Tensorflow in Python,
     Medium, 2019
[7] Khyati Mahendru, A Detailed Guide to 7 Loss Functions for Machine Learning
     Algorithms, Medium, 2014
[8] Belciug, S., Artificial Intelligence in Cancer: Diagnostic to Tailored Treatment,
     (Elsevier, 2020).
[9] Peat, J., Barton, B., Medical Statistics: A guide to data analysis and critical
     appraisal. (Blackwell Publishing, 2005).
[10] Thode, H.J., Testing for normality. (New York: Marcel Dekker, 2002).
[11] Altman, D.G., Practical Statistics for Medical Research, (Chapman and Hall,
     New York, 1991).
[12] Jason Brownlee, A Gentle Introduction to k-fold Cross-Validation, Machine learning
     mastery, 2018
[13] Varma, Sudhir; Simon, Richard (2006). "Bias in error estimation when using
     cross-validation for model selection". BMC Bioinformatics. 7: 91.
     doi:10.1186/1471-2105-7-91. PMC 1397873. PMID 16504092.
[14] Politis, Dimitris N.; Romano, Joseph P. (1994). "The Stationary Bootstrap".
     Journal of the American Statistical Association. 89 (428): 1303–1313.
     doi:10.1080/01621459.1994.10476870.
[15] Picard, Richard; Cook, Dennis (1984). "Cross-Validation of Regression Models".
     Journal of the American Statistical Association. 79 (387): 575–583.
     doi:10.2307/2288403. JSTOR 2288403.
[16] Chen JH, Asch SM. Machine learning and prediction in medicine - beyond the peak of
     inflated expectations. N. Eng J Med 2017; 376:2507-2509
[17] Jason Brownlee, How to Calculate Precision, Recall, F1, and More for Deep
     Learning Models, Machine learning mastery, 2020
[18] Popa Didi Liliana, Faiq Baji, Popa Radu Teodoru - Overview of the Deep Learning
     in Medical Imaging, Annals of the Univ. Craiova, Series: Automation, Computers,
     Electronics and Mechatronics, Vol.14(41), No. 1, 2017
[19] Willis BH, Riley RD (2017). "Measuring the statistical validity of summary
     meta-analysis and meta-regression results for use in clinical practice". Statistics in
     Medicine. 36 (21): 3283–3301. doi:10.1002/sim.7372. PMC 5575530. PMID 28620945.
[20] Riley RD, Ahmed I, Debray TP, Willis BH, Noordzij P, Higgins JP, Deeks JJ
     (2015). "Summarising and validating test accuracy results across multiple studies for
     use in clinical practice". Statistics in Medicine. 34 (13): 2081–2103.
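
Supplementary sketch for Table 3. The macro and weighted averages in the classification report (Table 3) follow from its per-class rows. The minimal Python sketch below recomputes them from the rounded per-class values printed in the table; the helper names macro_avg and weighted_avg are hypothetical and this is not the authors' code.

```python
# Per-class rows of Table 3: (precision, recall, F1-score, support).
report = {
    "0": (0.81, 0.96, 0.88, 479),
    "1": (0.72, 0.46, 0.56, 213),
    "2": (0.84, 0.76, 0.80, 92),
}

SUPPORT = 3  # index of the support column in each row


def macro_avg(rows, col):
    """Unweighted mean of one metric over the classes."""
    values = [row[col] for row in rows.values()]
    return sum(values) / len(values)


def weighted_avg(rows, col):
    """Mean of one metric over the classes, weighted by class support."""
    total = sum(row[SUPPORT] for row in rows.values())
    return sum(row[col] * row[SUPPORT] for row in rows.values()) / total


for name, col in (("precision", 0), ("recall", 1), ("F1-score", 2)):
    print(f"{name}: macro avg = {macro_avg(report, col):.2f}, "
          f"weighted avg = {weighted_avg(report, col):.2f}")
```

Rounded to two decimals, these values match the "Macro avg" and "Weighted avg" rows of Table 3. The support-weighted recall is by definition the overall accuracy, which is consistent with the 0.80 reported in the "Accuracy" row.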