<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>IDDM-</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Algorithms for Classification and Prediction of Heart Disease</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nataliya Boyko</string-name>
          <email>nataliya.i.boyko@lpnu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Iryna Dosiak</string-name>
          <email>iryna.dosiak.knm.2018@lpnu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Lviv Polytechnic National University</institution>
          ,
          <addr-line>Profesorska Street 1, Lviv, 79013</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>4</volume>
      <fpage>19</fpage>
      <lpage>21</lpage>
      <abstract>
        <p>The study aims to improve the effectiveness of health care in various ways. The paper considers ML algorithms that allow health professionals to allocate resources optimally and physicians to choose the best treatment options for patients. This approach will reduce the burden on doctors, increase and accelerate patients' access to health care, save resources, and reduce costs. The paper presents the results of research that will allow the use of smaller data sets to develop transparent models. The paper uses a naive Bayes classifier to predict heart disease. The advantage of this approach is that the sample size requirements are reduced from exponential to linear, which is very important. There is an overview of the classification model, its advantages and disadvantages. Materials and methods are also analyzed.</p>
        <p>Keywords: model, classification, machine learning, algorithm, Bayes classifier</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Machine Learning (ML) algorithms allow healthcare professionals to allocate resources optimally
and physicians to choose the best treatment options for patients. This approach reduces the burden on
doctors, increases and accelerates patients' access to health care, saves resources, and reduces costs.
However, despite the achievements of ML research in medicine, its role is currently limited. Creating
and testing a model may require large amounts of high-quality data. Besides, diagnostic models must
be built individually for each disease. It is a lengthy process. In addition, the psychological aspect of
trusting black box algorithms can also be difficult to perceive. However, continuing ML research may
allow using smaller data sets and developing more transparent models [
        <xref ref-type="bibr" rid="ref4">4, 13</xref>
        ].
      </p>
      <p>
        The nature of heart disease is complex. In addition, the diagnosis of heart disease in most cases
depends on a complex combination of clinical and pathological data. The relationship between the
real cause of the disorder and the effects of spontaneous symptoms in patients can often be hidden and
not obvious [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>That is why the analysis of medical data in health care is considered an important but complex task
that must be performed accurately and effectively. In addition, the study of medical data is necessary
to avoid medical error.</p>
      <p>The basis of medical diagnosis is the problem of classification: the diagnosis comes down to the
problem of mapping the data to one of N different results.</p>
      <p>The study aims to apply and compare the Naive Bayes classifier using two existing models:
the Gaussian model and the Multinomial model. This study will focus on the comparative analysis,
differences, capabilities, and effectiveness of the classifier with these different models.</p>
      <p>The purpose of classifying heart disease is to diagnose a disease in a patient based on specific
diagnostic measurements included in the data set. In addition, the work will consist of searching for
significant features and patterns between the various factors influencing the diagnosis.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Review of literature sources</title>
      <p>For a detailed study of these tasks, you need to read and analyze the experience of scientists in this
field. Since the problem is relevant, numerous studies have been conducted that have focused on
diagnosing heart disease in combination with or without another condition.</p>
      <p>
        • G. Parthiban, A. Rajesh, S.K. Srivatsa predicted the chances of people with diabetes having
heart disease and highlighted the results in their article "Diagnosis of Heart Disease for
Diabetic Patients using Naive Bayes Method," published in the International Journal of
Computer Applications [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The accuracy was 74%.
• G. Subbalakshmi, K. Ramesh, and M. Chinna Rao developed a
system that extracts hidden knowledge from a historical heart disease database using Naive
Bayes classification [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The article "Decision Support in Heart Disease Prediction System
using Naive Bayes" was published in the Indian Journal of Computer Science and
Engineering.
• Jyoti Soni, Ujma Ansari, Dipesh Sharma, Sunita Soni conducted a study and compared KNN
and the Naive Bayes classifier to predict heart disease [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The accuracy of the
results reached 45.6% for KNN and 52.33% for the Naive Bayes classifier. Their
article "Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease
Prediction" was published in the International Journal of Computer Applications. In conclusion,
they noted the need to improve the proposed study.
• Vincy Cherian and Bindu M.S developed a heart disease prediction system using a Naive
Bayes classifier and a Laplace smoothing technique [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. They reported this in their article
"Improved Study of Heart Disease Prediction System using Data Mining Classification
Techniques." They achieved high accuracy. However, the system has a limit on the number of
attributes (symptoms).
      </p>
      <p>Unfortunately, searches for such studies among Ukrainian sources did not yield any results.</p>
      <p>Thus, various studies only represent the effectiveness of predicting heart disease using ML
methods. This study aims to find features and patterns between different factors that affect the
diagnosis using a Naive Bayes classifier.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods overview</title>
      <p>Classification solves the following problem: let there be a set of objects divided into classes
according to one or more criteria. Moreover, a finite set of objects is given for which it is known to which classes
they belong. Such a set is considered the training sample. It is unknown to which class the other
objects belong. We need to build an algorithm that can classify any object of the source set, that is, specify
the number or name of the class to which it belongs [9, 11].</p>
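      <p>As an illustration only (not part of the paper), the setting above can be sketched in Python: a labeled training sample is used to build an algorithm a that assigns a class to any new object. The nearest-class-mean rule and the class names used here are invented for the sketch:</p>

```python
# Sketch: build a classifier a(x) from a labeled training sample.
# The nearest-class-mean rule is only an illustration of the setting.

def fit_nearest_mean(training_sample):
    """training_sample: list of (object, class) pairs; objects are numbers."""
    sums, counts = {}, {}
    for x, y in training_sample:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    means = {y: sums[y] / counts[y] for y in sums}
    # The algorithm a: name the class whose mean is closest to x.
    return lambda x: min(means, key=lambda y: abs(x - means[y]))

a = fit_nearest_mean([(1.0, "healthy"), (2.0, "healthy"),
                      (8.0, "sick"), (9.0, "sick")])
print(a(1.5))  # -> healthy
print(a(8.5))  # -> sick
```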
    </sec>
    <sec id="sec-4">
      <title>3.1. A mathematical formulation of the classification problem</title>
      <p>Let X be a set of object descriptions, and Y a set of class numbers or names. There is an unknown target
relationship, a mapping y*: X → Y, whose values are known only on the elements of the finite
training sample X^m = {(x_1, y_1), ..., (x_m, y_m)}. We need to build an algorithm a: X → Y that can classify
an arbitrary object x ∈ X [12].</p>
    </sec>
    <sec id="sec-5">
      <title>3.2. Bayes classifier</title>
      <p>A Bayes classifier provides a classification with a degree of confidence rather than simply issuing
the most plausible class. Bayes' theorem is used to determine the degree of certainty.</p>
      <p>Bayes' theorem describes the probability of an event given the circumstances that may affect the
event. Thus, you can more accurately calculate the probability, considering both already known
information and data from new observations [14].</p>
      <p>A Naive Bayes classifier adds an assumption about the independence of features. In other words,
it assumes that the presence of any feature in a class is unrelated to the presence of any other feature.</p>
    </sec>
    <sec id="sec-6">
      <title>3.3. Method overview</title>
      <p>As mentioned, the Bayes classifier is based on the Bayes theorem, which describes the probability
of an event, given the circumstances that may affect the event [14].</p>
      <p>Suppose there is a symptom S. In addition, there are classes (diseases) C to which the
symptom may belong. It is necessary to find the class (disease) C for which the probability of this observation is
maximum. The mathematical notation is given in Formula 1:</p>
      <p>C = argmax_C P(C|S). (1)</p>
      <p>It is hard to calculate P(C|S) directly. However, you can use Bayes' theorem and go to Formula 2:</p>
      <p>P(C|S) = P(S|C) P(C) / P(S), (2)</p>
      <p>where P(C) is the a priori probability, the probability of meeting the class among all the data;
P(S|C) is the conditional probability, the probability of the symptom in each class;
P(S) is the total probability, the probability of the symptom.</p>
      <p>Usually, it makes no sense to work with one symptom. It is much more effective to detect the
disease on several grounds. Thus, for symptoms S_1, ..., S_n, Formula 2 takes the form of Formula 3:</p>
      <p>P(C|S_1, ..., S_n) = P(S_1, ..., S_n|C) P(C) / P(S_1, ..., S_n). (3)</p>
      <p>Since you need to find the maximum of the function, the denominator can be ignored (it is a
constant). It is also necessary to include the "naive" assumption that the symptoms S_i depend only on
class C and do not depend on each other. Then the numerator takes the form of Formula 4:</p>
      <p>P(S_1, ..., S_n|C) P(C) = P(C) P(S_1|C) ... P(S_n|C). (4)</p>
      <p>So, the final formula looks like Formula 5:</p>
      <p>C = argmax_C P(C) P(S_1|C) ... P(S_n|C). (5)</p>
      <p>So it all comes down to calculating the probabilities P(C) and P(S|C). Calculating these parameters
is called classifier training.</p>
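      <p>A minimal sketch (not the paper's code) of this training step: P(C) and P(S|C) are estimated from counts, and prediction picks the class that maximizes Formula 5. The toy symptom records are invented for illustration:</p>

```python
from collections import Counter, defaultdict

# Toy training data (invented): each record is (set of symptoms, class).
records = [
    ({"chest_pain", "fatigue"}, "disease"),
    ({"chest_pain", "high_bp"}, "disease"),
    ({"fatigue"}, "healthy"),
    ({"high_bp"}, "healthy"),
    ({"chest_pain"}, "disease"),
]

# Training = estimating P(C) and P(S|C) from frequency counts.
class_counts = Counter(c for _, c in records)
symptom_counts = defaultdict(Counter)
for symptoms, c in records:
    for s in symptoms:
        symptom_counts[c][s] += 1

def predict(symptoms):
    """Return argmax over classes of P(C) * product of P(S_i|C)."""
    best_class, best_score = None, -1.0
    for c, n_c in class_counts.items():
        score = n_c / len(records)                  # P(C)
        for s in symptoms:
            score *= symptom_counts[c][s] / n_c     # P(S_i|C), naive assumption
        if score > best_score:
            best_class, best_score = c, score
    return best_class

print(predict({"chest_pain"}))  # -> disease
```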
    </sec>
    <sec id="sec-7">
      <title>3.4. Multinomial Naive Bayes</title>
      <p>
        Multinomial Naive Bayes implements a Naive Bayes algorithm for multinomial distributed data
and is one of two classic variants of Naive Bayes [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>This algorithm puts forward a second independence assumption - the assumption of positional
independence: the conditional probability of a symptom's occurrence is independent of its position in
the data sample [9].</p>
      <p>The data is usually presented as a vector. The basic idea is that each unique feature (symptom) that
occurs is assigned a unique integer. Therefore the data can be represented as a sequence of numbers.</p>
      <p>The distribution is parameterized by a vector θ_C = (θ_C1, ..., θ_Cn) for each
class C, where n is the number of features (symptoms), and θ_Ci is the probability of feature i appearing in
a sample belonging to class C.</p>
      <p>The parameter θ_Ci is estimated by a smoothed version of maximum likelihood, i.e., the relative
frequency count (Formula 6):</p>
      <p>θ̂_Ci = (N_Ci + α) / (N_C + αn), (6)</p>
      <p>where N_Ci is the number of times feature i appears in a class C sample in the training set;
N_C is the total count of all features (symptoms) for class C;
α is the Laplace smoothing parameter.</p>
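      <p>Formula 6 maps directly to a few lines of code. This is an illustrative sketch (not the paper's code); the argument names n_ci, n_c, n_features and alpha follow the notation above:</p>

```python
def smoothed_theta(n_ci, n_c, n_features, alpha=1.0):
    """Formula 6: Laplace-smoothed estimate of the probability theta_Ci.

    n_ci       -- times feature i appears in class C samples
    n_c        -- total count of all features for class C
    n_features -- number of distinct features n
    alpha      -- Laplace smoothing parameter
    """
    return (n_ci + alpha) / (n_c + alpha * n_features)

# An unseen feature (n_ci = 0) still gets a small nonzero probability:
print(smoothed_theta(0, 10, 5))  # 1/15
print(smoothed_theta(3, 10, 5))  # 4/15
```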
    </sec>
    <sec id="sec-8">
      <title>4. Review and analysis of data</title>
      <p>
        The data set about heart disease "heart.csv" is used for research [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. It was taken from Kaggle. This
database contains 76 attributes, but all published experiments use a subset of 14 of them, as
the rest of the information identifies individuals. The total is 303 rows and 14
columns, of which 165 rows correspond to patients with heart disease [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>Attribute information:
1. age;
2. sex (1 = male; 0 = female);
3. cp: chest pain type (4 values);
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital);
5. chol: serum cholesterol in mg/dl;
6. fbs: fasting blood sugar &gt; 120 mg/dl (1 = true; 0 = false);
7. restecg: resting electrocardiography results (values 0, 1, 2);
8. thalach: the maximum heart rate achieved;
9. exang: exercise-induced angina (1 = yes; 0 = no);
10. oldpeak: ST depression induced by exercise relative to rest;
11. slope: the slope of the peak exercise ST segment;
12. ca: the number of major vessels (0-3) colored by fluoroscopy;
13. thal: thalassemia (1 = normal; 2 = fixed defect; 3 = reversible defect);
14. target: 1 = heart disease; 0 = no heart disease.</p>
      <p>Fig. 1, Fig. 2, and Fig. 3 show the data set.</p>
    </sec>
    <sec id="sec-9">
      <title>4.1. Search for the correlation of heart disease with different parameters</title>
      <p>To find the links of heart disease with different parameters, we need to build a correlation matrix
(Fig. 7). Among the observations:</p>
      <p>• Trestbps (resting blood pressure) and fbs (fasting blood sugar) are negatively correlated.
Moreover, the correlation is lower for women compared to men. For these observations, the accuracy
of the conclusions should be checked, taking into account the distribution of data between men and
women (Fig. 9).</p>
      <p>• A value of 0, the probable presence of hypertrophy, does not in itself
indicate the presence of heart disease.</p>
      <p>• By itself, the blood sugar level feature does not give confidence in the presence or absence
of heart disease. However, we will not abandon this feature, as it can be helpful together with other
variables.</p>
      <p>• Chest pain also does not give an unambiguous answer. It is challenging to tell whether a patient has
heart disease based on this symptom alone.</p>
      <p>To verify the accuracy of the conclusions, you should use PCA (principal component analysis), which helps extract a smaller set of
variables from an existing large set of variables. These extracted variables are called principal
components.</p>
      <p>Because the data set is small and does not have many features, only two components are used to see
how much variance they cover.</p>
      <p>The study can explain approximately 90% of the variance in the data set using only two
components. Fig. 14 presents each of these decomposed components:</p>
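      <p>As a sketch of this step (with synthetic correlated data standing in for the heart features, so the numbers will not match the paper's 90%), sklearn's PCA reports the fraction of variance each component explains:</p>

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data: five strongly correlated columns (stand-in for heart.csv).
base = rng.normal(size=(303, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(303, 1)) for _ in range(5)])

pca = PCA(n_components=2)
pca.fit(X)
explained = pca.explained_variance_ratio_
# Sum of the two ratios = share of the total variance the components cover.
print(explained.sum())
```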
    </sec>
    <sec id="sec-10">
      <title>4.2. Application of the Naive Bayes classifier</title>
      <p>The next step is to divide the data into training and test sets in an 80% to 20% ratio. You should also encode and scale
the data with OneHotEncoder and MinMaxScaler [10].</p>
      <p>OneHotEncoder is a strategy in which each value of a category is converted into a new column,
which is assigned a value of 1 or 0 (notation for true/false). Fig. 15 shows an example of the strategy.</p>
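      <p>The strategy can be sketched in plain Python (an illustration; in the experiments, sklearn's OneHotEncoder performs this step). The column prefix cp_ is chosen here just as an example:</p>

```python
def one_hot(values):
    """Convert a list of category values into 1/0 columns, one per category."""
    categories = sorted(set(values))
    # Each row becomes: new column -> 1 if the row had that value, else 0.
    return [{f"cp_{c}": int(v == c) for c in categories} for v in values]

rows = one_hot([0, 2, 1, 2])       # e.g. the 'cp' (chest pain type) column
print(rows[1])                     # {'cp_0': 0, 'cp_1': 0, 'cp_2': 1}
```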
      <p>From the confusion matrix, you can determine the positive predictive value
(precision), the probability of detection (recall), and the combined measure (f1_score), based on:
• TP - true-positive decisions;
• TN - true-negative decisions;
• FP - false-positive decisions;
• FN - false-negative decisions.</p>
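      <p>From these four counts the metrics follow directly. A small sketch with invented counts (not the paper's results):</p>

```python
def metrics(tp, tn, fp, fn):
    """Precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)       # positive predictive value
    recall = tp / (tp + fn)          # probability of detection
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f1, accuracy

# Hypothetical counts for illustration only:
p, r, f1, acc = metrics(tp=28, tn=22, fp=5, fn=6)
print(round(p, 3), round(r, 3), round(f1, 3), round(acc, 3))
```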
      <p>The next step is to use the metrics for this method. The results are shown in Fig. 18.</p>
    </sec>
    <sec id="sec-11">
      <title>4.3. Application of the Multinomial Naive Bayes classifier</title>
      <p>To implement the classification, you should use MultinomialNB from the sklearn library with
different random states when splitting the data.</p>
      <p>The score function from the sklearn library is used to evaluate the results.</p>
      <p>The obtained results are presented in Fig. 21.</p>
      <p>Thus, the average estimate of the Multinomial Naive Bayes classifier for random states from 0 to
200 is 0.82. The highest score is 0.849039016334426.</p>
      <p>The next step is to reduce the number of attributes to 7. The results of the experiments are shown
in Fig. 25.</p>
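      <p>The experiment can be sketched with sklearn. Synthetic non-negative integer data stands in for heart.csv here, so the scores will not match the paper's 0.82 average; the loop over random states mirrors the procedure described above:</p>

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

rng = np.random.default_rng(0)
# Synthetic stand-in for the heart data: non-negative integer features.
X = rng.integers(0, 4, size=(303, 13))
y = (X.sum(axis=1) > 19).astype(int)   # arbitrary rule for a binary target

scores = []
for state in range(0, 201):            # random states 0..200
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=state)
    clf = MultinomialNB().fit(X_tr, y_tr)
    scores.append(clf.score(X_te, y_te))   # accuracy on the 20% test split

print(round(float(np.mean(scores)), 3), round(float(max(scores)), 3))
```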
    </sec>
    <sec id="sec-12">
      <title>5. Discussion of experimental results</title>
      <p>To study the accuracy of the two classification models, we use a set of data on heart disease.</p>
      <p>Table 1 summarizes the characteristics of the data set used in the experiments.</p>
      <p>Figure 27: Comparison of estimates of two methods of the Naive Bayes classifier:</p>
      <table-wrap>
        <table>
          <thead>
            <tr><th/><th>GaussianNB</th><th>MultinomialNB</th></tr>
          </thead>
          <tbody>
            <tr><td>Accuracy, 14 features</td><td>0.84426229</td><td>0.85012901</td></tr>
            <tr><td>Accuracy, 7 features</td><td>0.765245901</td><td>0.830100260</td></tr>
            <tr><td>Precision</td><td>0.828571</td><td>0.852941</td></tr>
            <tr><td>F1_score</td><td>0.852941</td><td>0.865672</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-12-6">
        <title>Comparison of the two Naive Bayes models</title>
        <p>Table 5 shows the execution time of the classification of different Naive Bayes models. On the
same data set, MultinomialNB performs training faster, which again emphasizes its advantage for the
selected data set.</p>
        <p>It is also noticeable that as the number of features decreases, the time decreases (Fig. 28).</p>
        <p>Analyzing Fig. 28, we can conclude that the Multinomial Bayes classifier is more accurate and
faster for the selected data set.</p>
        <p>So, the choice of the Naive Bayes model depends on the data. Multinomial Naive
Bayes is appropriate if the data consists of counts, so that observations can only take non-negative
integer values. It is better to use Gaussian NB for continuous (decimal) features, since GNB assumes features that
follow a normal distribution.</p>
        <p>For the selected data set, which contains features for diagnosing heart disease, the Multinomial
Naive Bayes showed better results. Using this method, we can achieve greater accuracy and reduce
the time to perform training.</p>
        <p>Analyzing the study results, it is worth emphasizing the importance of choosing the correct method
of the naive classifier. It helps achieve better classification results, which is critical in the medical
field.</p>
      </sec>
    </sec>
    <sec id="sec-13">
      <title>6. Conclusion</title>
      <p>The paper considered the relevance of the topic: the use of data mining methods for diagnosing the
disease in a patient on a set of indicators, such as symptoms, test results, and other indicators.</p>
      <p>We used the Heart data set for the study, which we cleaned of outliers and Null values and
normalized. We also performed a search for and analysis of significant features and patterns between
different factors influencing heart disease.</p>
      <p>In addition, we used two algorithms in this work, which objectively showed the classification
results on the selected dataset.</p>
      <p>The parameters used for the analysis were feature selection and removal. We first
tested a classifier with all the features and then gradually reduced the set to determine which
algorithm classifies best with fewer features.</p>
      <p>The simulation results show that the Multinomial Naive Bayes classifier has better accuracy than
the Gaussian method with the same data set and parameters. In addition, it reduces training time,
which is very important because the annual growth of data in medicine is increasing very rapidly.</p>
      <p>In future work, it is worth considering two aspects. Namely, we can compare more algorithms to
achieve better results and potentially introduce a better Naive Bayes variant. Moreover, we can
try to evaluate the effectiveness of these algorithms to justify their use in the health care system.</p>
    </sec>
    <sec id="sec-14">
      <title>7. References</title>
      <p>[9] S. Kharya, S. Soni, Weighted naive Bayes classifier: A predictive model for breast cancer
detection, in: International Journal of Computer Applications, 133(9) (2016): 32-37.
[10] A. Ashari, P. Iman, A. Min Tjoa, "Performance comparison between Naïve Bayes, decision
tree and k-nearest neighbor in searching alternative design in an energy simulation tool", in:
International Journal of Advanced Computer Science and Applications (IJACSA) (2013).
[11] N.D. Uma, Extraction of action rules for chronic kidney disease using Naive Bayes classifier, in:
IEEE International Conference on Computational Intelligence and Computing Research (2016).
[12] W. P. Castelli, Lipids, risk factors and ischaemic heart disease, Atherosclerosis (1996). doi:
10.1016/0021-9150(96)05851-0.
[13] W. F. Wilson, W. B. Kannel, H. Silbershatz, Clustering of Metabolic Factors and Heart Disease,
159(10) (1999): 1104. doi: 10.1001/archinte.159.10.1104.
[14] StatQuest with Josh Starmer - Naive Bayes. URL:
https://www.youtube.com/watch?v=O2L2Uv9pdDA&amp;ab_channel=StatQuestwithJoshStarmer.
[15] N. Boyko, Kh. Shakhovska, L. Mochurad, J. Campos, "Information System of Catering
Selection by Using Clustering Analysis", in: Proceedings of the 1st International Workshop on
Digital Content &amp; Smart Multimedia (DCSMart 2019), Lviv, Ukraine, December 23-25, (2019):
94-106.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Parthiban</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rajesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.K.</given-names>
            <surname>Srivatsa</surname>
          </string-name>
          ,
          <article-title>Diagnosis of Heart Disease for Diabetic Patients using Naive Bayes Method</article-title>
          , in:
          <source>International Journal of Computer Applications</source>
          ,
          <volume>24</volume>
          (
          <issue>3</issue>
          ) (
          <year>2011</year>
          ). doi: 10.5120/2933-3887.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Subbalakshmi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Chinna</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <article-title>Decision Support in Heart Disease Prediction System using Naive Bayes</article-title>
          , in:
          <source>Indian Journal of Computer Science and Engineering</source>
          ,
          <volume>2</volume>
          (
          <issue>2</issue>
          ) (
          <year>2011</year>
          ):
          <fpage>170</fpage>
          -
          <lpage>176</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Soni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Ansari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Soni</surname>
          </string-name>
          ,
          <article-title>Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction</article-title>
          , in:
          <source>International Journal of Computer Applications</source>
          , Vol.
          <volume>17</volume>
          (
          <issue>8</issue>
          ) (
          <year>2011</year>
          ):
          <fpage>43</fpage>
          -
          <lpage>48</lpage>
          . doi: 10.5120/2237-2860.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Cherian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.S.</given-names>
            <surname>Bindu</surname>
          </string-name>
          ,
          <source>Prediction Analysis of Cardiac Disease using Classification</source>
          (
          <year>2019</year>
          ). doi: 10.22214/ijraset.2019.6295.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Kunanets</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Vasiuta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Boyko</surname>
          </string-name>
          ,
          <article-title>Advanced Technologies of Big Data Research in Distributed Information Systems</article-title>
          , in:
          <source>Proceedings of the 14th International Conference "Computer Sciences and Information Technologies" (CSIT 2019)</source>
          , Lviv, Ukraine, September 17-20 (2019): 71-76. doi: 10.1109/STC-CSIT.2019.8929756.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          Heart Database. URL: https://www.kaggle.com/zhaoyingzhu/heartcsv.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          Clinic Manufactory - Cardiovascular Diseases. URL: https://manufacturaclinica.com/blog/sertsevo-sudinni-zahvoryuvannya.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>W.J.</given-names>
            <surname>Loesche</surname>
          </string-name>
          ,
          <article-title>Periodontal disease as a risk factor for heart disease</article-title>
          , in:
          <source>Compendium</source>
          ,
          <volume>15</volume>
          (
          <issue>8</issue>
          ):
          <fpage>978</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>