Congenital Heart Disease Detection Using Clinical
Data and Auscultation Heart Sounds: a Machine
Learning Approach
Solange Belinha1 , MD, Bruno Miguel Oliveira1,2 , MSc and Pedro Pereira Rodrigues1,2 ,
PhD
1
  Faculty of Medicine, Department of Biochemistry, University of Porto, Al Prof Hernani Monteiro, 4200-319 Porto,
Portugal
2
  CINTESIS - Center for Research in Health Technologies and Information Systems, Porto, Portugal


                                         Abstract
                                         Congenital heart disease (CHD) is the most common congenital malformation and has high morbidity and
                                         mortality related to late diagnosis. Screening protocols are lacking and only 1% of murmurs are associated
                                         with CHD. The decline in auscultation skills highlights the need for better screening. This study aims
                                         to create and evaluate models for the detection of CHD using clinical data and sound features. These
                                         features were extracted using pure conventional MFCC and selected MFCC through matrix profiling
                                         and motif search. Four combinations of data were used to train decision trees (DT) and artificial neural
                                         networks (ANN), and the area under the curve (AUC) was compared. Subsequently, models were also
                                         trained for the detection of any cardiac pathology. For both outcomes, the ANN model using clinical
                                         data and conventional MFCC showed the highest performance, with an AUC of 0.761 for CHD and 0.791 for
                                         any cardiac pathology. However, this is only a slight improvement when compared with the ANN models
                                         using only clinical data (0.747 and 0.789, respectively). Additionally, the inclusion of motif-selected MFCC
                                         seems to worsen the model performance. Although further research is still needed, this is a potential
                                         improvement in CHD screening, particularly for primary care physicians.

                                         Keywords
                                         Heart Auscultation, Machine Learning, Congenital Heart Disease, Mel-frequency Cepstral Coefficients,
                                         Matrix Profile, Decision Tree, Artificial Neural Network, Computer Assisted Decision


1. Introduction
1.1. Background
Congenital heart disease (CHD) is the most common congenital defect in the world [1, 2] and is
defined as an abnormal development of the structures of the heart and/or great vessels which is
present at birth.
   Recent studies estimated a global birth incidence of more than 17/1000 in 2017 [3, 4],
which represents an increase of 4.2% from 1990 [5, 6]. As for global prevalence,
it is estimated that nearly 12 million people were living with CHD in 2017, representing

AIxIA 2021 SMARTERCARE Workshop, November 29, 2021, Milan, IT
Email: up201503744@up.pt (S. Belinha); boliveira@med.up.pt (B. M. Oliveira); pprodrigues@med.up.pt (P. P. Rodrigues)
ORCID: 0000-0002-0760-1808 (S. Belinha); 0000-0001-7665-6506 (B. M. Oliveira); 0000-0001-7867-6682 (P. P. Rodrigues)
                                       © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, http://ceur-ws.org


Solange Belinha et al. CEUR Workshop Proceedings                                             91–103


an increase of 18.17% since 1990 [4].
   Regarding mortality, several studies report a clear decline in CHD mortality between
1990 and 2017 [3, 4]. Nevertheless, CHD remains the cause of 69% of deaths in children
younger than 1 year and is among the top 10 causes of death in this age group [4]. It is also
responsible for more than double the years of life lost when compared to other cardiac diseases,
such as rheumatic heart disease (RHD) [4].
   Regarding morbidity, among all heart diseases, CHD is one of the most common causes of
years lived with disability (YLD) in individuals under 20 years, along with RHD [4]. Morbidity
caused by CHD is related to late diagnosis and can include pulmonary hypertension [7],
neurodevelopmental deficits [8, 9, 10] and delayed growth [11, 12].
   Given this context, appropriate screening and early diagnosis strategies are increasingly
important. Currently, screening consists of prenatal ultrasounds and a pulse
oximetry test at birth, in countries where a formal screening protocol exists [13]. Still, this
screening detects only between 82.8 and 92% of critical CHD [14], and critical CHD accounts
for only about 25% of all CHD [15]. This means that approximately 75% of CHD cases are not
detected at birth.
   Even though there are other signs and symptoms associated with CHD, by far the most
concerning to general physicians and parents is the presence of a murmur, and this is the main
cause for referral to a pediatric cardiology appointment [16, 17]. Nonetheless, estimates show
that between 50 and 72% of children will have a murmur during infancy and adolescence
[16, 17, 18, 19] and that only 1% of these are pathologic murmurs associated with CHD
[16, 18].
   To aggravate the situation, doctors' cardiac auscultation skills are in decline [20]. One study
reported a sensitivity of 73% and a specificity of 78% for the detection of an abnormal murmur by
primary healthcare physicians [21], and even the cardiac auscultation skills of pediatric residents
were found to be suboptimal [22]. Another study found that the overall accuracy of cardiac auscultation
by pediatricians was 73%, which is low when compared to the 83% overall accuracy of cardiologists
[23]. This illustrates the need for a decision-support tool that primary care physicians and
pediatricians could use when referring children with murmurs to a pediatric cardiologist.

1.2. Prior Work
In recent years, there has been an exploration of the potential artificial intelligence brings to
the analysis of heart sounds. Several studies have attempted to detect cardiac pathology from
cardiac auscultation recordings [24]. In the area of pediatrics, a few studies were able to use
these methods to detect specific types of CHD [25, 26, 27] or distinguish normal and abnormal
heart sounds [26, 28, 29]. Those who tried to identify CHD had a limited number of participants
[30]. Nonetheless, to the best of our knowledge, no study has yet explored the detection of
CHD using both clinical and heart sound data.

1.3. Goal of This Study
This study aims to produce machine learning models for the detection of CHD using clinical
data and auscultation sound features. We hypothesized that by adding heart sound features to




the existing clinical data, we would be able to improve the ability of the model to discriminate
individuals with and without CHD. Ideally, this tool would be useful for the screening
of CHD.


2. Methods
2.1. Data Collection and Preprocessing
This is a retrospective study that reuses data obtained from two volunteer mass screening
programs, thus forming a convenience series. These programs were conducted in Eastern Brazil
between July and August 2014, and between June and July 2015, and included all participants
presenting voluntarily within these periods. The eligibility criterion for screening was an age
younger than 21 years.
   When preprocessing the data, we removed individuals identified as fetuses, those who had
undergone previous cardiac surgery and those without an echocardiogram. Errors identified in
variable coding were corrected. Some variables were computed from information in other variables,
and percentiles according to age were calculated for some variables.
   Additionally, the body mass index (BMI) percentile was used to stratify individuals aged
19 years or younger as underweight (BMI percentile ≤ 5), normal (BMI percentile > 5 and
< 85), overweight (BMI percentile ≥ 85) and obese (BMI percentile ≥ 95). For individuals older than
19 years, underweight was defined as a BMI < 18.5, overweight as a BMI ≥ 25 and
obese as a BMI ≥ 30. Clinical indication variables were recoded into
logical variables. Variables including information on diagnosis or orientation and irrelevant
information were removed, as well as redundant variables. Finally, variables with more than
50% missing data were also removed.
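
The BMI stratification described above can be sketched as a small function. This is a minimal illustration and not the authors' code (the study was carried out in R): the function name `bmi_category` is hypothetical, the normal range is assumed to be every percentile strictly between 5 and 85, and obese is assumed to take precedence over overweight where the thresholds overlap.

```python
from typing import Optional


def bmi_category(age_years: float,
                 bmi: Optional[float] = None,
                 bmi_percentile: Optional[float] = None) -> Optional[str]:
    """Return a weight category, or None when the required value is missing.

    Assumption: percentiles are used up to and including 19 years of age,
    and absolute BMI thresholds are used above that age.
    """
    if age_years <= 19:
        if bmi_percentile is None:
            return None
        if bmi_percentile >= 95:      # checked first: obese overrides overweight
            return "obese"
        if bmi_percentile >= 85:
            return "overweight"
        if bmi_percentile <= 5:
            return "underweight"
        return "normal"

    if bmi is None:
        return None
    if bmi >= 30:
        return "obese"
    if bmi >= 25:
        return "overweight"
    if bmi < 18.5:
        return "underweight"
    return "normal"
```

For example, `bmi_category(10, bmi_percentile=90)` falls in the overweight band, while `bmi_category(20, bmi=31.0)` uses the absolute thresholds.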
   Echocardiogram diagnosis was used as the reference standard to create the outcome variable,
presence or absence of CHD. Due to the exclusion criteria, there were no cases with missing data
on the reference standard result. Echocardiogram diagnosis was chosen as the reference standard
because it is the gold standard for the detection of CHD. Additionally, another outcome variable
was created using information from the echocardiogram diagnosis and the cardiology diagnosis
to represent the presence or absence of any cardiac pathology.

2.1.1. Variables Computed
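\[ \mathit{OxygenSaturationDifference} = \mathit{RightArmSaturation} - \mathit{LegSaturation} \qquad (1) \]
\[ \mathit{MeanArterialPressure} = \frac{2 \times \mathit{DiastolicPressure} + \mathit{SystolicPressure}}{3} \qquad (2) \]
\[ \mathit{BodyMassIndex} = \frac{\mathit{Weight}_{kg}}{\mathit{Height}_{cm}^{2}} \times 10000 \qquad (3) \]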
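
As a quick check of the three computed variables, the formulas can be written out directly. This is a hedged sketch: the function and argument names are illustrative and do not come from the study's dataset.

```python
def oxygen_saturation_difference(right_arm_saturation: float,
                                 leg_saturation: float) -> float:
    """Right arm saturation minus leg saturation (percentage points)."""
    return right_arm_saturation - leg_saturation


def mean_arterial_pressure(diastolic: float, systolic: float) -> float:
    """Mean arterial pressure as (2 * diastolic + systolic) / 3."""
    return (2 * diastolic + systolic) / 3


def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """BMI in kg/m^2 from weight in kg and height in cm.

    The factor 10000 converts cm^2 to m^2.
    """
    return weight_kg / height_cm ** 2 * 10000
```

With a blood pressure of 120/80 mmHg, `mean_arterial_pressure(80, 120)` gives roughly 93.3 mmHg; a child weighing 20 kg at 100 cm has a BMI of about 20 kg/m^2.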

2.1.2. Percentiles Calculated
Body Mass Index, Height, Weight, Arm Circumference, Abdominal Circumference, Systolic
Blood Pressure, Diastolic Blood Pressure, Mean Arterial Pressure, Heart Rate




2.2. Preprocessing of Heart Sounds and Extraction of Features
The heart sound recordings were obtained during the mass screening programs and included
recordings from four anatomical locations (aortic, pulmonary, mitral and tricuspid). Along with
these were files indicating the beginning and end times for the four main components of a heart
sound (S1, systole, S2, and diastole).

2.2.1. Mel-Frequency Cepstral Coefficients
MFCCs are well-known features widely used in speech recognition systems [31]. These features were
extracted for each heartbeat. However, because the file annotations contained incomplete heartbeat
segments and recording durations varied widely, we created functions
to count the number of complete heartbeats present and measure their duration. By convention,
MFCCs are calculated over windows of 0.025 s, with a hop of 0.010 s between the starts of
consecutive windows, so that adjacent frames overlap. In our data, the 25th percentile of the number of
heartbeats per recording was 10, so we decided to extract this number of heartbeats, where
possible, and impute missing values for the absent heartbeat segments in the shorter recordings.
Additionally, the heartbeats lasted 0.60 s on average, which with the standard parameters would
yield 60 frames. Taking this into account, and to correct for the
variation in heartbeat duration, we used formula 4 to calculate the hop time that gives
60 frames for each heartbeat, and multiplied that value by 2.5 to obtain the window time, maintaining
the standard window-to-hop ratio. A total of 12 MFCCs were extracted for each frame.

\[ \mathit{hoptime} = \frac{\mathit{heartbeatduration}}{\mathit{frames}} \qquad (4) \]
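
To make the frame arithmetic concrete, the sketch below computes the hop and window times for a heartbeat of a given duration. It assumes 60 frames per heartbeat and a window-to-hop ratio of 2.5, as stated in the text; the function name `frame_parameters` is hypothetical.

```python
from typing import Tuple


def frame_parameters(heartbeat_duration_s: float,
                     frames: int = 60,
                     window_to_hop_ratio: float = 2.5) -> Tuple[float, float]:
    """Return (hop_time, window_time) in seconds for one heartbeat.

    hop_time = heartbeat_duration / frames, and the window is a fixed
    multiple of the hop so that consecutive frames keep the same overlap.
    """
    hop_time = heartbeat_duration_s / frames
    window_time = window_to_hop_ratio * hop_time
    return hop_time, window_time


# An average heartbeat of 0.60 s recovers the standard 0.010 s hop and 0.025 s window.
hop, window = frame_parameters(0.60)
```

A longer heartbeat simply stretches both values: for the 1.27 s maximum reported in the results, the hop is about 0.021 s and the window about 0.053 s.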

2.2.2. Matrix Profile and Motif Finding
In a parallel process, the whole recording was converted into MFCC frames, and these were
used to obtain the matrix profile with a window size of 60 frames, for the same reason as before.
With this we attempted to identify and extract 3 motifs from each recording, each being a 60-frame
segment of all 12 MFCCs. Where motifs could not be extracted, the corresponding values were recorded as missing.
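
The idea behind a matrix profile can be illustrated with a naive, quadratic-time sketch. This is not the algorithm implemented in the tsmp package used in the study: it handles a single one-dimensional series rather than 12 MFCC channels, it uses brute-force z-normalized Euclidean distances, and the function names and the exclusion zone of half a window are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def znorm(window: Sequence[float]) -> List[float]:
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in window]


def matrix_profile(series: Sequence[float], m: int) -> Tuple[List[float], List[int]]:
    """For each length-m subsequence, find the distance to and index of its
    nearest non-trivial neighbour (brute force, O(n^2 * m))."""
    n_sub = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n_sub)]
    exclusion = max(1, m // 2)  # skip near-identical overlapping windows
    profile = [math.inf] * n_sub
    index = [-1] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) < exclusion:
                continue
            d = math.dist(subs[i], subs[j])
            if d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index


def top_motif(series: Sequence[float], m: int) -> Tuple[int, int, float]:
    """Return (start, nearest_neighbour_start, distance) of the best motif pair."""
    profile, index = matrix_profile(series, m)
    best = min(range(len(profile)), key=profile.__getitem__)
    return best, index[best], profile[best]
```

The lowest value in the profile marks the most closely repeated subsequence, which is what a motif search extracts; repeating the search after masking the found pair would yield further motifs.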

2.2.3. Principal Components Analysis
PCA is a technique used to obtain the combinations of the original variables that account for a
certain amount of the total variation [32]. We used this technique to reduce the amount of data
created during the sound feature extraction. For each of three sets of data (conventional MFCC,
motif MFCC, and a combination of both), we retained the components which accounted for 80% of
the total variance.
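
The 80% cut-off can be illustrated as follows. The sketch assumes the per-component variances (eigenvalues) are already available from a PCA fit and only shows how many components are needed to reach a cumulative variance threshold; the function name and the example values are hypothetical.

```python
from typing import Sequence


def components_for_variance(variances: Sequence[float],
                            threshold: float = 0.80) -> int:
    """Smallest number of leading components whose cumulative share of the
    total variance reaches the threshold."""
    total = sum(variances)
    cumulative = 0.0
    for k, variance in enumerate(sorted(variances, reverse=True), start=1):
        cumulative += variance
        if cumulative / total >= threshold:
            return k
    return len(variances)


# Illustrative eigenvalues: shares are 0.4, 0.3, 0.2, 0.1, so three
# components are needed to pass 80% of the total variance.
n_components = components_for_variance([4.0, 3.0, 2.0, 1.0])
```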

2.3. Decision Trees and Artificial Neural Networks
Decision trees (DT) emerged as the most used approach for data mining because of their
characteristics. In our case, the natural handling of a mixture of numerical and categorical
variables and the production of interpretable results were the most important characteristics




Figure 1: Flowchart of participants


[33]. Artificial neural networks (ANN) are the most common machine learning method used for
the classification of cardiac sounds [24]. We trained both types of models using four versions
of the dataset: clinical data alone, clinical data with conventional MFCC, clinical data with
motif MFCC, and clinical data with both types of sound features. Training was performed using
bootstrapping as the resampling method, with only one resample. The models were evaluated
through ROC curve analysis of the bootstrap resample.
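
The resampling and evaluation step can be sketched in plain Python. This is an illustration only: the study used the caret and pROC packages in R, and the sketch assumes that the model is fitted on one bootstrap sample and scored on the observations left out of it, which the text does not spell out. The function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def bootstrap_split(n: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Draw one bootstrap sample of size n (with replacement) and return
    (training_indices, held_out_indices)."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    in_bag = set(train)
    held_out = [i for i in range(n) if i not in in_bag]
    return train, held_out


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A model would be fitted on the rows in `train` and its predicted probabilities for the rows in `held_out` passed to `auc`. With a single resample, the estimate depends on one random split, which is why repeating the procedure gives a more stable figure.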

2.4. Statistical Software
We used R 4.0.5 at every stage of this study. Packages were used for the calculation
of percentiles (package childsds [34]), the descriptive and comparative analyses (packages
gmodels [35] and lattice [36]), the sound file import and MFCC extraction (packages readr [37],
tuneR [38], stringr [39] and foreach [40]), the matrix profile calculation and motif detection
(package tsmp [42]), the model training and validation (package caret [41]), and the ROC curve
analysis (package pROC [43]).


3. Results
3.1. Participants
The first mass screening yielded 1019 participants and the second yielded 915. Due to three
individuals participating in both screenings, the final collected data only comprised 1934
participants. By applying the exclusion criteria for this study, the final dataset analyzed had 1655
participants. This is graphically represented in the flowchart in Fig. 1.




Table 1
Socio-demographic and clinical characteristics of participants
                               With CHD      Without CHD   Overall       p value   Missing
                               (n=459)       (n=1196)      (n=1655)
                               n (%)         n (%)         n (%)                   n (%)
  Gender (female)              225 (49.0)    550 (46.0)    775 (46.8)    .293      0 (0)
  Age group                                                              <.001     14 (0.85)
    Neonate (0-1 month)        8 (1.8)       5 (0.4)       13 (0.8)
    Infant (1-24 months)       119 (26.1)    235 (19.8)    354 (21.6)
    Child (2-12 years)         276 (60.5)    848 (71.6)    1124 (68.5)
    Adolescent (12-16 years)   50 (11.0)     89 (7.5)      139 (8.5)
    Young adult (16-21 years)  3 (0.7)       8 (0.7)       11 (0.7)
  Ethnicity                                                              .004      0 (0)
    Asian                      0 (0)         1 (0.1)       1 (0.1)
    Black                      8 (1.7)       9 (0.8)       17 (1.0)
    Mixed race                 337 (73.4)    964 (80.6)    1301 (78.6)
    White                      114 (24.8)    222 (18.6)    336 (20.3)
  BMI for age                                                            <.001     96 (5.80)
    Underweight                57 (13.2)     65 (5.8)      122 (7.8)
    Normal                     242 (55.9)    631 (56.0)    873 (56.0)
    Overweight                 50 (11.5)     149 (13.2)    199 (12.8)
    Obese                      84 (19.4)     281 (25.0)    365 (23.4)
  Murmur                                                                 <.001     51 (3.08)
    Present                    345 (78.8)    340 (29.2)    685 (42.7)
    Absent                     93 (21.1)     826 (70.8)    919 (57.3)


   In terms of the target pathology, we included 1196 (72.3%) individuals without CHD and 459 (27.7%)
individuals with CHD. The socio-demographic and clinical characteristics of the participants
are reported in Table 1. The differences in the distribution of these variables between the primary
outcome groups are also included in the table. For gender, there were no significant differences
in terms of the presence of CHD. There were statistically significant differences in age group
proportions, with CHD being more common in neonates, infants and adolescents than in the
child and young adult groups. In terms of ethnicity, it is relevant to note the lower proportion
of mixed race individuals having CHD, when compared to other ethnicities.
   Looking at clinical characteristics, there was a significant difference in the BMI distribution
between groups, with the CHD group presenting a higher proportion of
underweight patients than the group without CHD. Additionally, most individuals in the group
with CHD had a murmur, whereas most individuals in the group without the disease
did not.
   Regarding the distribution of the severity of the disease, of those with CHD, 409 (89.1%)




Table 2
AUC values for cardiac pathology models
        Clinical data   Clinical data           Clinical data           Clinical data
                        + Conventional MFCC     + Conventional MFCC     + Motif MFCC
                        + Motif MFCC
  DT    0.733           0.676                   0.733                   0.712
  ANN   0.789           0.784                   0.791                   0.740


Table 3
AUC values for CHD models
        Clinical data   Clinical data           Clinical data           Clinical data
                        + Conventional MFCC     + Conventional MFCC     + Motif MFCC
                        + Motif MFCC
  DT    0.747           0.713                   0.720                   0.714
  ANN   0.747           0.757                   0.761                   0.721


had simple CHD, while only 50 (10.9%) had complex CHD. As for cardiac pathologies in general,
a total of 628 (37.9%) individuals had a cardiac diagnosis. Of those, 459 (73.1%)
had CHD and 169 (26.9%) did not. Looking at the types of murmurs present in
the population, in terms of timing, there were 10 (1.5%) continuous murmurs, 4 (0.6%) diastolic
murmurs, 670 (97.8%) systolic murmurs and 1 with missing classification.
   Concerning the recordings used in this study, the median recording duration was 10 s,
ranging between 2 s and 65 s. The median number of heartbeats per recording was 16, with a range of 3
to 94 heartbeats, and each heartbeat lasted on average 0.60 s, with a minimum of 0.30 s and a
maximum of 1.27 s.

3.2. Test Results
When detecting cardiac pathology, the clinical data models showed an AUC of 0.733 for DT and
0.789 for ANN. When all sound features were included, the AUC fell for both (0.676 and 0.784,
respectively). The best model for this classification was the ANN trained with clinical data and
conventional MFCC, with an AUC of 0.791 (Fig. 2); its equivalent DT showed an AUC of 0.733. When
training with clinical data and motif MFCC, the AUC was 0.712 for DT and 0.740 for ANN.
   For CHD, both the DT and ANN models of the clinical data showed an AUC of 0.747. The DT and
ANN models of clinical data with both types of sound features showed AUC of 0.713 and 0.757,
respectively. Nonetheless, the best model for the detection of CHD was also the ANN trained
using clinical data and conventional MFCC, with an AUC of 0.761 (Fig. 3). The equivalent DT
presented an AUC of 0.720. The models trained with clinical data and motif MFCC performed
poorly, with an AUC of 0.714 for the DT model and 0.721 for the ANN model. Values of AUC for
the cardiac pathology and CHD models are summarised in Table 2 and Table 3, respectively.




Figure 2: ROC curves of cardiac pathology models using clinical data and conventional MFCC


4. Discussion
4.1. Principal Results
The results show a slight improvement in the performance of the ANN models when conventional
MFCC alone were added to the clinical data (AUC from 0.747 to 0.761 for CHD and from 0.789 to
0.791 for any cardiac pathology). This is in line with their use in previous studies in this area
[24, 25, 26, 27, 28, 29, 30], even though we would expect a more evident improvement in the
models.
  Moreover, motif MFCC seem to clearly worsen the performance of the models. This could be
due to the algorithm used being optimized for 3 MFCCs per frame. Selecting the cepstral
coefficients that best characterize cardiac sounds could improve the quality of the features extracted
through this method.




Figure 3: ROC curves of CHD models using clinical data and conventional MFCC


4.2. Limitations
There are several limitations to this study. Firstly, the use of a second-hand dataset made it
difficult to preprocess the data more efficiently. Additionally, we had little information
on the manner in which the screening was conducted and, because of this, there is doubt
about who performed the echocardiograms and the cardiac auscultation. This in turn raises the
question of the observer's experience. Also, due to the nature of voluntary screenings, there is
probably a selection bias which overestimates the presence of pathology within the population.
   Regarding the sound feature extraction, by using MFCC we have limited our analysis of sound
to the frequency domain. Because some pathologies may influence the duration of the heartbeat
components and the loudness of the sound, time and amplitude features could potentially be
added in future work to address this limitation.
   In terms of model training, these results are limited by the fact that only one bootstrapping
resample was performed. These experiments should be repeated with more resamples to obtain a



better estimate of the performance of the models.

4.3. Conclusions
There is much room for improvement and experimentation in this field. Further research is
needed on the extraction and selection of features, preferably avoiding the need for segmentation
of heart sounds. It is also important to create a tool that is computationally efficient and that can
be used on more basic processing devices. Such a tool has the potential to become useful in the
screening of CHD.


Acknowledgments
This work was financed by FEDER - Fundo Europeu de Desenvolvimento Regional funds through
the COMPETE 2020 - Operational Programme for Competitiveness and Internationalisation
(POCI), and by Portuguese funds through FCT - Fundação para a Ciência e a Tecnologia in the
framework of the project POCI-01-0145-FEDER-029200.


References
 [1] P.-L. Bernier, A. Stefanescu, G. Samoukovic, C. I. Tchervenkov, The challenge of congenital
     heart disease worldwide: Epidemiologic and demographic facts, Seminars in Thoracic and
     Cardiovascular Surgery: Pediatric Cardiac Surgery Annual 13 (2010) 26–34.
     doi:10.1053/j.pcsu.2010.02.005.
 [2] Y. Liu, S. Chen, L. Zühlke, S. V. Babu-Narayan, G. C. Black, M.-K. Choy, N. Li, B. D.
     Keavney, Global prevalence of congenital heart disease in school-age children: a
     meta-analysis and systematic review, BMC Cardiovascular Disorders 20 (2020).
     doi:10.1186/s12872-020-01781-x.
 [3] W. Wu, J. He, X. Shao, Incidence and mortality trend of congenital heart disease at
     the global, regional, and national level, 1990–2017, Medicine 99 (2020).
     doi:10.1097/md.0000000000020593.
 [4] M. S. Zimmerman, A. G. Smith, C. A. Sable, M. M. Echko, L. B. Wilner, H. E. Olsen, H. T.
     Atalay, A. Awasthi, Z. A. Bhutta, J. L. Boucher, et al., Global, regional, and national burden
     of congenital heart disease, 1990–2017: A systematic analysis for the global burden of
     disease study 2017, The Lancet Child & Adolescent Health 4 (2020) 185–200.
     doi:10.1016/s2352-4642(19)30402-x.
 [5] Y. Liu, S. Chen, L. Zühlke, G. C. Black, M.-K. Choy, N. Li, B. D. Keavney, Global birth
     prevalence of congenital heart defects 1970–2017: updated systematic review and
     meta-analysis of 260 studies, International Journal of Epidemiology 48 (2019) 455–463.
     doi:10.1093/ije/dyz009.
 [6] W. H. Johnson, J. H. Moller, Pediatric cardiology: The Essential Pocket Guide, John Wiley
     & Sons, 2014.
 [7] S. H. Abman, G. Hansmann, S. L. Archer, D. D. Ivy, I. Adatia, W. K. Chung, B. D. Hanna, E. B.




     Rosenzweig, J. U. Raj, D. Cornfield, et al., Pediatric pulmonary hypertension, Circulation
     132 (2015) 2037–2099. doi:10.1161/cir.0000000000000329.
 [8] B. S. Marino, P. H. Lipkin, J. W. Newburger, G. Peacock, M. Gerdes, J. W. Gaynor, K. A.
     Mussatto, K. Uzark, C. S. Goldberg, W. H. Johnson, et al., Neurodevelopmental outcomes
     in children with congenital heart disease: Evaluation and management, Circulation 126
     (2012) 1143–1172. doi:10.1161/cir.0b013e318265ee8a.
 [9] M. A. Raheem, W. Mohamed, Impact of congenital heart disease on brain development
     in newborn infants, Annals of Pediatric Cardiology 5 (2012) 21.
     doi:10.4103/0974-2069.93705.
[10] A. Ozmen, S. Terlemez, F. S. Tunaoglu, S. Soysal, A. Pektas, E. Cilsal, U. Koca, S. Kula, A. D.
     Oguz, Evaluation of neurodevelopment and factors affecting it in children with acyanotic
     congenital cardiac disease, Iranian Journal of Pediatrics 26 (2016). doi:10.5812/ijp.3278.
[11] B. Varan, K. Tokel, G. Yilmaz, Malnutrition and growth failure in cyanotic and acyanotic
     congenital heart disease with and without pulmonary hypertension, Archives of Disease
     in Childhood 81 (1999) 49–52. doi:10.1136/adc.81.1.49.
[12] A. Soliman, A. Khella, H. Yassin, A. Elawwa, S. Saeed, Linear growth in relation to the
     circulating concentration of insulin-like growth factor-I in young children with acyanotic
     congenital heart disease with left to right shunts before versus after surgical intervention,
     Indian Journal of Endocrinology and Metabolism 16 (2012) 791.
     doi:10.4103/2230-8210.100678.
[13] R. L. Knowles, R. M. Hunter, Screening for congenital heart defects: External review against
     programme appraisal criteria for the UK NSC, 2014. URL:
     https://legacyscreening.phe.org.uk/documents/pulse-oximetry/CHDandPOFirstReviewDoc.pdf.
[14] K. K. Wong, A. Fournier, D. S. Fruitman, L. Graves, D. G. Human, M. Narvey, J. L.
     Russell, Canadian Cardiovascular Society/Canadian Pediatric Cardiology Association
     position statement on pulse oximetry screening in newborns to enhance detection of
     critical congenital heart disease, Canadian Journal of Cardiology 33 (2017) 199–208.
     doi:10.1016/j.cjca.2016.10.006.
[15] M. E. Oster, K. A. Lee, M. A. Honein, T. Riehle-Colarusso, M. Shin, A. Correa, Temporal
     trends in survival among infants with critical congenital heart defects, Pediatrics 131
     (2013). doi:10.1542/peds.2012-3435.
[16] E. Mejia, Innocent murmur, 2021. URL: https://www.ncbi.nlm.nih.gov/books/NBK507849/.
[17] A. A. Lardhi, Prevalence and clinical significance of heart murmurs detected in routine
     neonatal examination, Journal of the Saudi Heart Association 22 (2010) 25–27.
     doi:10.1016/j.jsha.2010.03.005.
[18] E. Kostopoulou, G. Dimitriou, A. Karatza, Cardiac murmurs in children: A challenge for
     the primary care physician, Current Pediatric Reviews 15 (2019) 131–138.
     doi:10.2174/1573396315666190321105536.
[19] J. E. Frank, K. M. Jacobe, Evaluation and management of heart murmurs in children,
     American Family Physician 84 (2011) 793–800. URL:
     https://www.aafp.org/afp/2011/1001/p793.html.
[20] S. Mangione, Cardiac auscultatory skills of physicians-in-training: a comparison of
     three English-speaking countries, The American Journal of Medicine 110 (2001) 210–216.
     doi:10.1016/s0002-9343(00)00673-2.




[21] I. Germanakis, E. T. Petridou, G. Varlamis, I. L. Matsoukis, K. Papadopoulou-Legbelou,
     M. Kalmanti, Skills of primary healthcare physicians in paediatric cardiac auscultation,
     Acta Paediatrica 102 (2012). doi:10.1111/apa.12062.
[22] P. R. A. Gaskin, S. E. Owens, N. S. Talner, S. P. Sanders, J. S. Li, Clinical auscultation skills
     in pediatric residents, Pediatrics 105 (2000) 1184–1187. doi:10.1542/peds.105.6.1184.
[23] K. Kumar, W. R. Thompson, Evaluation of cardiac auscultation skills in pediatric residents,
     Clinical Pediatrics 52 (2012) 66–73. doi:10.1177/0009922812466584.
[24] C. Liu, D. Springer, Q. Li, B. Moody, R. A. Juan, F. J. Chorro, F. Castells, J. M. Roig, I. Silva,
     A. E. W. Johnson, et al., An open access database for the evaluation of heart sound
     algorithms, Physiological Measurement 37 (2016) 2181–2213.
     doi:10.1088/0967-3334/37/12/2181.
[25] S. Gómez-Quintana, C. E. Schwarz, I. Shelevytsky, V. Shelevytska, O. Semenova, A. Factor,
     E. Popovici, A. Temko, A framework for AI-assisted detection of patent ductus
     arteriosus from neonatal phonocardiogram, Healthcare 9 (2021) 169.
     doi:10.3390/healthcare9020169.
[26] S. Aziz, M. U. Khan, M. Alhaisoni, T. Akram, M. Altaf, Phonocardiogram signal processing
     for automatic diagnosis of congenital heart disorders through fusion of temporal and
     cepstral features, Sensors 20 (2020) 3790. doi:10.3390/s20133790.
[27] A. A. Gharehbaghi, A. A. Sepehri, A. A. Babic, Distinguishing septal heart defects from the
     valvular regurgitation using intelligent phonocardiography, Studies in Health Technology
     and Informatics 270 (2020) 178–182. doi:10.3233/SHTI200146.
[28] B. Bozkurt, I. Germanakis, Y. Stylianou, A study of time-frequency features for CNN-based
     automatic heart sound classification for pathology detection, Computers in Biology and
     Medicine 100 (2018) 132–143. doi:10.1016/j.compbiomed.2018.06.026.
[29] W. R. Thompson, A. J. Reinisch, M. J. Unterberger, A. J. Schriefl, Artificial
     intelligence-assisted auscultation of heart murmurs: Validation by virtual clinical trial,
     Pediatric Cardiology 40 (2018) 623–629. doi:10.1007/s00246-018-2036-z.
[30] J. Wang, T. You, K. Yi, Y. Gong, Q. Xie, F. Qu, B. Wang, Z. He, Intelligent diagnosis of heart
     murmurs in children with congenital heart disease, Journal of Healthcare Engineering
     2020 (2020) 1–9. doi:10.1155/2020/9640821.
[31] E. D. Trejos, A. M. Castaño, J. I. Godino, G. Castellanos, Detección de soplos cardíacos
     usando medidas derivadas del análisis acústico en señales fonocardiográficas, IV Latin
     American Congress on Biomedical Engineering 2007, Bioengineering Solutions for Latin America
     Health IFMBE Proceedings 18 (2007) 202–206. doi:10.1007/978-3-540-74471-9_47.
[32] K. V. Mardia, J. M. Bibby, J. T. Kent, Multivariate analysis, Acad. Pr., 1992.
[33] T. Hastie, R. Tibshirani, J. Friedman, The elements of statistical learning: Data mining,
     inference, and prediction, Springer, 2001.
[34] M. Vogel, Cran - package childsds - cran.r-project.org, 2020. URL:
     https://cran.r-project.org/package=childsds.
[35] G. R. Warnes, B. Bolker, T. Lumley, R. C. Johnson, Cran - package gmodels -
     cran.r-project.org, 2018. URL: https://cran.r-project.org/web/packages/gmodels/index.html.
[36] D. Sarkar, Lattice: Multivariate Data Visualization with R, 1st ed., Springer, 2008.
[37] H. Wickham, J. Hester, R. Francois, J. Bryan, J. Jylänki, M. Jorgensen, 2021. URL:
     https://cran.r-project.org/web/packages/readr/index.html.




[38] U. Ligges, S. Krey, O. Mersmann, S. Schnackenberg, G. Guenard, A. Preusser, A. Thieler,
     J. Mielke, C. Weihs, M. Heymann, et al., 2021. URL:
     https://cran.r-project.org/web/packages/tuneR/index.html.
[39] H. Wickham, 2019. URL: https://cran.r-project.org/web/packages/stringr/index.html.
[40] M. Walling, S. Weston, Microsoft, 2020. URL:
     https://cran.r-project.org/web/packages/foreach/index.html.
[41] M. Kuhn, J. Wing, S. Weston, A. Williams, C. Keefer, A. Engelhardt, T. Cooper, Z. Mayer,
     B. Kenkel, M. Benesty, et al., 2021. URL:
     https://cran.r-project.org/web/packages/caret/index.html.
[42] F. Bischoff, M. Yeh, D. Silva, Y. Zhu, H. Dau, M. Linardi, 2020. URL:
     https://cran.r-project.org/web/packages/tsmp/index.html.
[43] X. Robin, N. Turck, A. Hainard, N. Tiberti, F. Lisacek, J.-C. Sanchez, M. Müller, pROC: an
     open-source package for R and S+ to analyze and compare ROC curves, BMC Bioinformatics
     12 (2011). doi:10.1186/1471-2105-12-77.

