Application of Association Rule Mining in Medical-
               Biological Investigations: a Survey

 Xenia Naidenova, Vyacheslav Ganapolsky, Alexandre Yakovlev, Tatiana Martirova

                      Military Medical Academy, Saint Petersburg, Russia
                                 ksennaidd@gmail.com


            Abstract. In this chapter, a survey is presented of the application of
        Apriori-like algorithms in medical and biological investigations for discovering
        frequent sets of attribute values in data and then extracting logical rules in the
        form of implicative dependencies between the values of observed and measured
        attributes and diagnostic parameters.

         Keywords: Apriori algorithm, association rule mining, medical investigation


1        Introduction
Intelligent data processing is now an integral part of biomedical research. Revealing
different dependencies in the data (implicative, functional, correlational, etc.) helps in
diagnosis, treatment planning, predicting the course of diseases, and identifying
new factors that expand specialists' understanding of specific diseases and
their combinations.
   The purpose of many biomedical studies is to highlight associative rules in a given
data set. An association rule is a rule of the form X ⇒ Y, where X and Y are
non-intersecting sets of distinct literals called items. In the general case, we can consider
the set of items to be the set of all attribute values that can appear in descriptions of some
objects or situations (transactions) in a database.
   Let I = {i1, i2, ..., iN} be a set of items. A transaction database (TDB) is a set of
transactions, where a transaction <tid, X> contains a set of items (i.e., X ⊆ I) and is
associated with a unique identifier tid. A non-empty itemset Y ⊆ I is a q-itemset if it
contains q items.
   A transaction <tid, X> is said to contain itemset Y if Y ⊆ X. The number of
transactions in TDB containing itemset X is called the support of X, denoted as sup(X):
sup(X) = |{tid | <tid, Y> ∈ TDB, X ⊆ Y}|, where |s| denotes the cardinality of s. Given
a minimum support threshold, min-sup, an itemset Y is frequent if sup(Y) ≥ min-sup.
   From frequent itemsets, association rules are extracted in the form of implications, for
which the value of support is an important characteristic: sup(X ⇒ Y) =
sup(X ∪ Y). A rule has a measure of reliability called confidence, defined as
follows: conf(X ⇒ Y) = sup(X ∪ Y)/sup(X). Confidence is thus the fraction of
transactions in TDB containing both X and Y among those transactions that contain X.


    Copyright © 2020 for this paper by its authors. Use permitted under Creative
    Commons License Attribution 4.0 International (CC BY 4.0).


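    The third main metric of association rules is lift, defined as follows: lift(X ⇒ Y) =
conf(X ⇒ Y)/sup(Y), where sup(Y) is taken here as the relative support, i.e., the fraction
of transactions in TDB that contain Y. This measure quantifies the predictive power of the
rule X ⇒ Y: a lift above 1 means that X and Y occur together more often than would be
expected if they were independent.
    The traditional purpose of data analysis is to find all associative rules (ASRs) that
have support and confidence above the specified minimum values.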
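    As an illustration of the three measures, a minimal Python sketch on a hypothetical
five-transaction database is given here. The item names, the database, and the function
names sup, conf and lift are assumptions made for the example. Lift is computed by
dividing the confidence by the relative support of Y (the fraction of transactions
containing Y), which is one common convention.

```python
# Hypothetical toy database: each transaction is a set of attribute values.
TDB = [
    frozenset({"fever", "cough", "flu"}),
    frozenset({"fever", "flu"}),
    frozenset({"cough"}),
    frozenset({"fever", "cough", "flu"}),
    frozenset({"fever"}),
]


def sup(itemset, tdb):
    """Absolute support: number of transactions containing every item of itemset."""
    return sum(1 for t in tdb if itemset <= t)


def conf(x, y, tdb):
    """Confidence of the rule X => Y: sup(X u Y) / sup(X)."""
    return sup(x | y, tdb) / sup(x, tdb)


def lift(x, y, tdb):
    """Confidence divided by the relative support of Y (assumed convention)."""
    return conf(x, y, tdb) / (sup(y, tdb) / len(tdb))


X = frozenset({"fever"})
Y = frozenset({"flu"})
print(sup(X | Y, TDB), conf(X, Y, TDB), lift(X, Y, TDB))
```

In this toy database the rule fever ⇒ flu would be kept for any minimum support of at
most 3 transactions and any minimum confidence of at most 0.75.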
    In this paper, we consider applications of the classical Apriori algorithm for
mining ASRs in biomedical data. The choice of this algorithm is justified by the
following arguments: it is the most understandable and the most easily mastered by
specialists in different applied fields; it is universal, in the sense that the ASRs
contained in frequent itemsets can be extracted with the help of this algorithm [30]; and
it is constantly being improved on the basis of many advanced techniques in ASR mining.
    The idea of the classical Apriori algorithm [2] is based on the following
consideration: a q-itemset can be frequent only if all its proper non-empty sub-itemsets
are frequent.
    At the first step of the algorithm, all the items (values of attributes,
elements of transactions) are considered, and those satisfying the condition
of minimal support are selected. The selected items are then used to form itemsets of two
items (candidates for frequency). For them, the support is calculated, and those that do not
meet the minimum support are removed. The remaining itemsets are used to form
itemsets of three items. The process continues iteratively as long as it is possible to
generate a new set of candidates for frequency. The Apriori algorithm thus uses an
inductive method of constructing sets of cardinality q+1 ((q+1)-sets) from their
subsets of cardinality q (q-sets). Forming (q+1)-sets from q-sets and
calculating their supports are the main sub-processes of the Apriori algorithm, and they
determine its computational complexity.
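    The level-wise procedure described above can be sketched as follows. This is a
minimal, unoptimized illustration and not the implementation of [2]; the function name
apriori, the toy transactions, and the threshold value are assumptions made for the
example.

```python
from itertools import combinations


def apriori(transactions, min_sup):
    """Return {frequent itemset: absolute support} for a list of item sets."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # One pass over the database per level; keep candidates meeting min_sup.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_sup}

    items = {frozenset([i]) for t in transactions for i in t}
    level = count(items)  # frequent 1-itemsets
    frequent = dict(level)
    q = 1
    while level:
        # Join step: unite pairs of frequent q-itemsets into (q+1)-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == q + 1}
        # Prune step: every q-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, q))
        }
        level = count(candidates)
        frequent.update(level)
        q += 1
    return frequent


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent = apriori(transactions, min_sup=3)
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

The two nested loops inside count and the pairwise join are the sub-processes that
dominate the running time, which is why most improvements of the algorithm target them.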
    The paper is organized as follows. Sections 2 and 3 contain brief surveys of
applications of the Apriori algorithm in medical studies. Improvements of the Apriori
algorithm are considered in Section 4, followed by a short conclusion.


2      The Apriori Algorithm in Medical Studies
The Apriori algorithm is widely used in medical research. These studies cover
cardiovascular disease [23, 29, 41]; lung cancer [18]; oral cancer [39]; infectious diseases
(Ebola virus) [13]; type 2 diabetes [26, 34, 37, 45, 46]; Alzheimer's disease [7]; and liver
cancer [32]. In [41], some other conditions are enumerated that have been studied with
the Apriori algorithm or its modifications: asthma, impotence, lupus, obesity,
whooping cough, pregnancy, and Raynaud's phenomenon. The problems solved
are also varied: searching for unknown trends in disease; determining the nature of
a disease based on a prediction method; diagnosis (detection) of disease; predicting a
patient's response to a drug [38, 44]; early diagnosis and prevention of disease [37];
predicting the progress (course) of a disease; predicting the outcome of a disease
[23]; identification of disease risk factors [34]; identification of relationships between
different medical operations, appointments, analyses, and diagnoses [35]; extracting
diagnostic patterns (sets of features, symptoms) and association rules from electronic
medical databases [1, 11, 19, 22]; and many others.


   The papers [3, 5, 41, 51] give detailed reviews of ASR mining based on the Apriori
algorithm and its modifications. In [39], early detection and prevention of oral cancer is
considered, and the Apriori algorithm is used to assess patients' chances of survival.
This is achieved by extracting a set of significant rules relating various laboratory tests
and investigations, such as FNAC (Fine Needle Aspiration Cytology) of the neck node, LFT
(Liver Function Tests), biopsy, USG (ultrasonography), and CT (computed tomography) scan
or MRI, to the survivability of oral cancer patients. Liver function tests
give information about the state of a patient's liver. Most liver diseases cause only
mild symptoms initially, but it is vital that these diseases be detected early. Biopsy is
important, as it is the only sure way to know whether an abnormal area is cancerous. USG
is an ultrasound-based diagnostic imaging technique for visualizing intra-abdominal
structures. The extracted ASRs show that if FNAC of the neck node, USG, and CT
scan/MRI are positive, then the chance of survival is reduced. However, if LFT is normal,
the probability of survival is high. All the generated rules have the highest confidence
level.
   Nahar et al. [31] extract the significant prevention factors for particular types of
cancer: bladder, breast, cervical, lung, prostate, and skin cancer. The algorithms
Apriori, Predictive Apriori, and Tertius are used to discover most of the significant
prevention factors against these specific types of cancer. Predictive Apriori tries to
maximize expected accuracy rather than confidence, as Apriori does; Tertius is a top-down
rule discovery system employing a well-known decision tree algorithm.
   The article [6] describes methods for detecting the comorbidity of occupational
diseases caused by pathogenic factors, namely, by ionizing radiation. The Apriori algorithm
was implemented in the SQL Server Analysis Services data mining environment.
The following conditions are considered: chronic radiation sickness of degrees 1, 2, and
3, residual phenomena of chronic radiation sickness, exposure to ionizing radiation,
malaise and fatigue, and disorders of the vegetative nervous system.
   Numerous papers deal with the use of the Apriori algorithm to study coronary heart
disease. Karaolis et al. [20] developed a data analysis system that searches for
associations in order to assess heart disease risk factors using the WEKA tools. In [29],
the objective of the study was to effectively predict possible heart attacks from a patient
dataset. The database consisted of 209 records (instances) collected from a hospital in Iran,
with 8 attributes. Apriori algorithms were implemented in WEKA 2016 (version 3.9.0)
and MATLAB R2013a software. Data preprocessing consisted of cleaning the data and
applying an unsupervised discretization filter to convert numeric data into nominal data.
The algorithm implemented in MATLAB showed the best results in terms of the diagnosis
prediction accuracy achieved with the obtained ASRs.
   It should be noted that there are still very few works in the domestic (Russian-language)
literature on the use of the Apriori algorithm in biomedical research. In addition to the
work of Biryukov A. and Dumansky S. [6], we can cite the article [5], which proposes a new
efficient algorithm, AprioriScale, for building ASRs. The algorithm is applied to the
problem of detecting children's diseases: obesity and metabolic syndrome. Two problems
are solved: 1) distinguishing between obesity and metabolic
syndrome, and 2) identifying a combination of risk factors, determined in the early stages
of a child's development, that may indicate the onset of the disease in adolescence. Twelve
rules with reliability above 0.75 were obtained. Let us take a closer look at two of
these rules:
           • Toxicosis, TIP, WeakLD ➔ MS, frequency = 0.16, probability = 0.83;
           • Toxicosis, TIP, ExtraGM, PPH ➔ MS, frequency = 0.09, probability =
               0.93,
where TIP - threat of interruption of pregnancy, MS - metabolic syndrome, WeakLD
- weak labour delivery, ExtraGM - extragenital diseases of the mother, and PPH -
postprandial hyperlipemia. These rules show that if factors such as toxicosis, TIP, and
ExtraGM, later accompanied by factors such as WeakLD and PPH, are observed in the early
stages of a child's development, the risk of metabolic syndrome is high: the probability
of its manifestation is between 0.83 and 0.93.


3      Analysis of Biological and Genetic Data Based on Association
       Rule Extraction
The analysis of patients' biological data is now becoming particularly
relevant. Biological data analysis is connected with the identification of
previously unknown hidden patterns (frequent itemsets) and associative structures in
large numbers of biological sequences. These sequences include gene sequences, amino
acid sequences, protein composition, and other data that describe the structure,
localization, interaction, or functioning of proteins and genes in cells. Amino acids are
the building material of proteins. The shape and other properties of proteins are
associated with the exact sequence of amino acids contained in them. The chemical
properties of amino acids determine the biological activity of proteins.
   Many diseases have a biological nature: obesity, high blood cholesterol, diabetes,
insomnia, arthritis, and many others. Analysis of gene information, including analysis with
Apriori algorithms [1, 10, 21, 27], helps to study the nature of a disease, optimize its
treatment, and predict the course of the disease. An overview of some methods of extracting
knowledge from biological (DNA) sequences is given in [8]. A comparison of the Apriori
algorithm with other algorithms in mutation analysis is made in [28]. In [17], a
model is proposed for finding a dominant sequence of amino acids to block the
growth of cancer cells based on protein clustering.
   In [16], a comparative analysis of classifiers in cancer prediction using multiple
data mining techniques is given. The dataset contained a total of 844 records and nine
features: Clump thickness, Uniformity of cell size, Uniformity of cell shape, Marginal
adhesion, Single epithelial cell size, Bare nuclei, Bland chromatin, Normal nucleoli,
and Mitoses. The data analysis consisted of two stages. In the first stage, the
Apriori algorithm was applied to reduce the number of input features: the variables
Marginal adhesion and Bare nuclei were removed as noise, which gave a new subset of
seven features for solving the classification problem. In the second stage, six
predictive algorithms were applied and validated through a k-fold cross-validation scheme:
decision tree (DT), support vector machine (SVM), k-nearest neighbour (KNN), naive Bayes
(NB), random forest (RF), and neural network (NN). For the experiments, the R statistical
environment was chosen, as it is an open-source scripting language specifically designed
for data analysis. The classifiers were evaluated on performance metrics including
accuracy, sensitivity, and specificity. The SVM classifier achieved a classification
accuracy of 0.9372 with a sensitivity of 0.9332 and a specificity of 0.9226, performing
better than all the remaining classifiers.
   A similar method of two-stage data processing is used in [44] to predict a patient's
response to a drug in the treatment of cancer. For the prediction of drug response based
on molecular profiles of multiple cancer cell types, a large-scale pharmacogenomics
dataset was generated for 1001 cancer cell lines and 251 anti-cancer drugs. The
authors performed feature selection in the form of ASRs and used the selected
features to train state-of-the-art Deep Learning Neural Networks (DLNNs) to
predict pharmacological response in a blind (control) set. The ASRs are treated as a
novel meta-dataset. Specifically, the Apriori algorithm was applied to generate a
ruleset containing all tissue-to-gene, tissue-to-drug, gene-to-drug, and drug-to-drug
associations.
   Studies show that type 2 diabetes is a genetic disease, and evidence of a
statistical interaction among several Single Nucleotide Polymorphisms (SNPs) has been
reported. In [26], the Apriori-Gen algorithm was applied to SNP data on type 2
diabetes for an association study. The obtained associations are measured through the risk
rate (RR) and odds ratio (OR) proposed by the authors. The results make it possible to
assess with high accuracy and statistical reliability the interaction of nucleotide
polymorphisms with disease complexes. An analysis of diabetic ASRs based on the Apriori
algorithm is given in [45].
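   The exact definitions of RR and OR used in [26] are not reproduced in this survey. As a
hedged illustration only, the following sketch assumes the standard epidemiological
definitions of the risk ratio and the odds ratio computed from a 2x2 table for a rule
"SNP pattern ⇒ disease". The function name and the counts are hypothetical.

```python
def risk_ratio_and_odds_ratio(a, b, c, d):
    """Risk ratio and odds ratio from a 2x2 contingency table.

    a, b: diseased / healthy subjects carrying the SNP pattern
    c, d: diseased / healthy subjects not carrying it
    (standard textbook definitions, assumed here; not taken from [26])
    """
    risk_ratio = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return risk_ratio, odds_ratio


# Hypothetical counts: 30 of 100 carriers and 10 of 100 non-carriers are diseased.
rr, odds_ratio = risk_ratio_and_odds_ratio(30, 70, 10, 90)
print(round(rr, 3), round(odds_ratio, 3))
```

Values of both measures above 1 indicate that the pattern is observed more often among
diseased subjects than among healthy ones.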
   The volume of biological knowledge is rapidly increasing in the form of gene
expression databases (GEO, ArrayExpress, etc.), information on microarray experiments
(spotted probes, data processing protocols, etc.), molecular databases (GenBank,
EMBL, UniGene, etc.), semantic sources such as thesauri, ontologies, or semantic networks
(UMLS, GO, etc.), bibliographical databases (Medline, Biosis, etc.), and gene/protein
related specific sources (KEGG, OMIM, etc.) [27].


4      Perfection of the Apriori Algorithm
The popularity of the Apriori algorithm for medical diagnostic tasks is due to its
simplicity; however, its application to large data sets requires the development of more
efficient modifications that reduce its computational complexity. Work
to improve this algorithm is being carried out all over the world [4]. In
particular, the following directions in improving the Apriori algorithm can be
distinguished: 1) developing new algorithms; 2) data management; 3) constraint-based ASR
mining; 4) incremental mode of ASR mining.
   Developing New Algorithms. The algorithms FP-GROWTH and ECLAT belong
to the first direction. The main drawback of the Apriori algorithm is that it scans
the database several times. The FP-GROWTH algorithm [14, 32] uses a
frequent-pattern tree structure (FP-Tree), which stores the whole database. This structure
can compress the data up to 200 times, and it is stored in the computer's memory.
Then, frequent itemsets are extracted directly from the FP-Tree using the
divide-and-conquer method. This algorithm is used for analyzing risk factors of type 2
diabetes in [46].
    The algorithm "Equivalence Class Transformation" (ECLAT) mines frequent
itemsets in a vertical data format [12]. The algorithm builds the TID set (the set of
identifiers of the transactions that contain the item) for every item in the
transaction database. ASR mining on vertically partitioned data is used in [15].
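    The vertical data format can be illustrated with a short sketch. It is a simplified
depth-first variant written for this survey, not the code of [12]; the function names
vertical_format and eclat and the toy database are assumptions. The support of an itemset
is obtained by intersecting TID sets instead of rescanning the transactions.

```python
from collections import defaultdict


def vertical_format(transactions):
    """Map each item to its TID set (ids of the transactions containing it)."""
    tidsets = defaultdict(set)
    for tid, items in transactions.items():
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def eclat(prefix, candidates, min_sup, out):
    """Depth-first search; the support of an itemset is the size of its TID set."""
    for i, (item, tids) in enumerate(candidates):
        if len(tids) < min_sup:
            continue
        itemset = prefix | {item}
        out[itemset] = len(tids)
        # Extend only with later items, intersecting the TID sets.
        extensions = [(other, tids & other_tids)
                      for other, other_tids in candidates[i + 1:]]
        eclat(itemset, extensions, min_sup, out)
    return out


tdb = {1: {"a", "b", "c"}, 2: {"a", "b"}, 3: {"a", "c"}, 4: {"b", "c"}, 5: {"a", "b", "c"}}
tidsets = vertical_format(tdb)
frequent = eclat(frozenset(), sorted(tidsets.items()), 3, {})
print(tidsets["a"], frequent[frozenset({"a", "b"})])
```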
    The article [45] gives a general overview of the efficient application of the Apriori
algorithm to medical data. The authors propose a modification of the Apriori algorithm
in which the support of candidate itemsets is calculated only over transactions whose
length is greater than or equal to the cardinality of the candidates in question. The
study in [9] aims to see the effect of the k-means clustering algorithm on the Apriori
algorithm by combining these two algorithms. A logical-combinatorial neuron-like network
is proposed for optimizing the Apriori algorithm in [30].
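    The length-based restriction on support counting mentioned above can be sketched as
follows. This is one possible reading of the modification, not the code of [45]; the
function name and the toy data are hypothetical. A transaction shorter than the candidate
cannot contain it, so it is skipped without a subset test.

```python
def support_with_length_filter(candidate, transactions):
    """Return (support, number of transactions actually tested)."""
    k = len(candidate)
    tested = 0
    support = 0
    for t in transactions:
        if len(t) < k:
            continue  # too short to contain a k-itemset
        tested += 1
        if candidate <= t:
            support += 1
    return support, tested


transactions = [{"a"}, {"a", "b"}, {"a", "b", "c"}, {"c"}, {"a", "b", "c", "d"}]
support, tested = support_with_length_filter(frozenset({"a", "b", "c"}), transactions)
print(support, tested)
```

The result is the same as with a full scan; only the number of subset tests is reduced.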
   Data Management. The following articles can be attributed to the category of data
management. In [37], ASRs are extracted to predict co-diseases in diabetes
mellitus patients. The peculiarity of this work is that rules are selected by testing them
on a sample of data not used in data processing. The efficiency of the Apriori algorithm
was increased with the help of a prefixed-itemset-based data structure [49]. In paper
[38], a new method and a statistical test on rules were introduced to mine ASRs over
multiple databases. In [42], a method for ASR extraction based on ontology
semantics is proposed. The medical dataset is transformed into an ontology in the form of
triples (subject, predicate, object), and SPARQL (Query Results XML Format) is
used to query the generated ontology.
    Constraint-Based Association Rules Mining. Currently, many studies have appeared
in which the Apriori algorithm is optimized to obtain not the whole set of possible
ASRs but only a subset satisfying a given essential property: interesting rules,
non-redundant rules, negative rules, maximal rules, association rules
generated from the queries of end users, and some others.
    The authors of [1] introduce a Query-constraint-based ASR Mining (QARM)
approach for exploratory analysis of multiple, diverse clinical data sets in the National
Sleep Research Resource (NSRR). A Top-k Non-Redundant (TNR) ASR mining
algorithm is used in this work. A non-redundant ASR is a rule such that deleting at least
one item from it changes the rule's support for the worse. The work [36] also
proposes an algorithm to generate non-redundant ASRs.
   Both positive and negative rules were generated in [27] to analyze which diagnosis
types require or do not require Laboratory Diagnostic Tests (LDTs) for patients. The
negative rules are generated from infrequent itemsets.
   Maximal ASRs are extracted from maximal frequent patterns (itemsets). A pattern
X is maximal frequent in a data set D if X is frequent and there exists no
super-pattern X' such that X ⊂ X' and X' is frequent in D. An example of a maximal
frequent pattern mining algorithm is the Maximal Frequent Itemset Algorithm (MAFIA) [14, 21].
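   The definition of a maximal frequent pattern can be checked directly on a collection
of frequent itemsets that has already been mined. The sketch below is a naive quadratic
filter given only to illustrate the definition; it is not MAFIA, and the function name
and the input are hypothetical.

```python
def maximal_itemsets(frequent):
    """Keep the frequent itemsets that have no frequent proper superset."""
    frequent = [frozenset(f) for f in frequent]
    return [x for x in frequent if not any(x < y for y in frequent)]


frequent = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"d"}]
maximal = maximal_itemsets(frequent)
print([sorted(m) for m in maximal])
```

Since every subset of a frequent itemset is frequent, the maximal itemsets alone determine
which itemsets are frequent, although not their supports.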
    Currently there is considerable interest in methods that restrict the extraction of
rules to the specific types of rules most interesting to users. In [33], a
method of Rank-Based Weighted ASR Mining (RANWAR), developed for
biological data processing, is proposed. Two new measures of interestingness based
on gene ranking are considered.
   In [43], a method for generating efficient rules for associative classification is
proposed. Associative classification is a technique that integrates classification and
ASR mining for classifying unseen data. Associative classification gives more
accurate and easier-to-understand rules than can be obtained by using traditional
classifiers. In [37, 40], mining ASRs is also combined with extracting classification
rules.
   Existing ASR mining algorithms rely on frequency-based rule evaluation methods,
which fail to provide sound statistical or computational measures for rule evaluation and
often suffer from many redundant rules. In [37], the authors propose a
predictability-based ASR mining algorithm based on cross-validation with a new rule
evaluation step. A training dataset is partitioned into inner training sets and inner test
sets, and then the predictive performance of candidate rules is evaluated.
   Incremental Mode of Association Rules Mining. Traditional static ASR mining
cannot solve real-world problems with dynamically changing data. When the size of
the transaction database increases, an initially frequent item may become
infrequent, and an initially infrequent item may become frequent.
Incremental ASR mining algorithms help to cope with this drawback of the classical
Apriori algorithm. The authors of [52] combine the Fast Update Pruning (FUP)
algorithm with a compressed Boolean matrix [25] and propose a new incremental ASR
mining algorithm, named FBCM. This algorithm requires only a single scan of both
the database and the incremental database. An incremental algorithm for mining
interesting ASRs has been developed in [48]. The paper [50] summarizes the methods for
incremental ASR mining.
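   The effect of a growing database on item frequency can be illustrated with a small
sketch. It is restricted to single items and follows only the general idea of
FUP-style updating; it is not the FBCM algorithm of [52]. The function names, the
relative threshold min_ratio, and the data are assumptions. Previously frequent items
are updated from the increment alone, and a previously infrequent item is looked up in
the old database only if it is frequent within the increment.

```python
from collections import Counter


def count_items(db):
    """Absolute support of every single item in a list of transactions."""
    return Counter(item for t in db for item in set(t))


def update_frequent_items(old_db, old_frequent, increment, min_ratio):
    """Return (frequent items of the updated database, items needing an old-db lookup)."""
    inc_counts = count_items(increment)
    threshold = min_ratio * (len(old_db) + len(increment))
    updated, rescanned = {}, set()
    # Previously frequent items: the old count is known, only the increment is scanned.
    for item, old_count in old_frequent.items():
        total = old_count + inc_counts.get(item, 0)
        if total >= threshold:
            updated[item] = total
    # Previously infrequent items can become frequent only if they are frequent
    # within the increment itself; only these require the old database.
    for item, inc_count in inc_counts.items():
        if item in old_frequent or inc_count < min_ratio * len(increment):
            continue
        rescanned.add(item)
        total = inc_count + sum(1 for t in old_db if item in t)
        if total >= threshold:
            updated[item] = total
    return updated, rescanned


old_db = [{"a", "b"}, {"a"}, {"a", "c"}, {"b"}]
old_frequent = {i: n for i, n in count_items(old_db).items() if n >= 0.5 * len(old_db)}
increment = [{"c"}, {"a", "c"}, {"c"}, {"c"}]
updated, rescanned = update_frequent_items(old_db, old_frequent, increment, 0.5)
print(updated, rescanned)
```

In this example item b, frequent in the old database, becomes infrequent after the
update, while item c, initially infrequent, becomes frequent.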


Conclusion
The paper provides an overview of mining associative rules from data in biomedical
research. This review is based on a study of works published from 2013 to 2020 and shows
the widespread use of the Apriori algorithm and its modifications in medicine. The
review also covers methods for improving the Apriori algorithm so as to mine more
effective associative rules adapted to various research tasks.


References
 1. Abeysinghe, R. and Cui, L.: Query-constraint-based mining of association rules for
    exploratory analysis of clinical datasets in the National Sleep Research Resource. Medical
    Informatics and Decision Making 18 (Suppl 2), 89-100 (2018)
 2. Agrawal, R., Imielinski, T., and Swami, A.: Mining association rules between sets of items
    in large databases. ACM SIGMOD Record, 22(2), 207-216 (1993)
 3. Altaf, W., Shahbaz, M., and Guergachi, A.: Applications of association rule mining in
    health informatics: A survey. Artificial Intelligence Review, 47(3), 313-340 (2017)


 4. Bhende, D., Kasarker, U., and Gedam, M.: Study of various improved Apriori algorithm.
    IOSR Journal of Computer Science Engineering, e-ISSN: 2278-0661, pp. 55-58 (2016)
 5. Billig, V. A. and Ivanova, O. Y.: Building association rules in a task of medical diagnosis.
    Software Products and Systems, 2(114), 146-157 (2016) (in Russian)
 6. Birukov, A. P. and Dumansky, S. M.: Revealing the comorbidity of professional diseases
    caused by pathogenic factors with the help of associative algorithms on the examples of a
    cohort of victims of ionizing radiation. Medicine of Extreme Situations, 2, 13-24 (2016)
    (in Russian)
 7. Chavez, R., Górriz, J. M., Ramirez, J., Salas-Gonzalez, D., and Gómez-Rio, M.: Efficient
    mining of association rules for the early diagnosis of Alzheimer's disease. Phys. Med. Biol.,
    56, 6047-6063 (2011)
 8. Das, N. N. et al.: Brief survey on DNA sequence mining. International Journal of Computer
    Science and Mobile Computing, 2(11), 129-134 (2013)
 9. Dharshini, N. P., Azmi, F., Fawwaz, I., Husein, A. M., and Siregar, S. D.: Analysis of
    accuracy k-means and Apriori algorithms for patient data clusters. Journal of Physics:
    Conference Series 1230, 1-9. IOP Publishing (2019)
10. Dhumale, S.: Predicting patterns over protein sequences using Apriori algorithm.
    International Journal of Engineering and Computer Science 4(7), 13011-13016 (2015)
11. Doddi, D., Marath, A., Ravi, S.S., and Torney, D.C.: Discovery of association rules in
    medical data. Medical Informatics and Internet in Medicine 26(1), 25-33 (2001)
12. Domadia, N. and Rao, U.P.: Privacy-preserving distributed association rule mining
    approach on vertically partitioned healthcare data. Procedia Computer Science, 148, 303-312
    (2019)
13. Go, E., Lee, S., and Yoon, T.: Analysis of ebolavirus with decision tree and Apriori
    algorithm. International Journal of Machine Learning and Computing, 4(6), 543-548 (2014)
14. Han, J., Pei, J., and Yin, Y.: Mining frequent patterns without candidate generation. In:
    Proceedings of ACM SIGMOD International Conference on Management of Data, pp. 1-12
    (2000)
15. Harahap, M., Husein, A.M., Aisyah, S. et al.: Mining association rules based on diseases
    population for recommendation of medicine need. Journal of Physics: Conference Series,
    1007, 1-12 (2018)
16. Jalali, S. M., Moro, S., Mahmoudi, M. R., Ghaffary, K. A., Maleki, M., and Alidoostan,
    A.: A comparative analysis of classifiers in cancer prediction using multiple data mining
    techniques. International Journal of Business Intelligence and Systems Engineering, 1(2),
    166-178 (2017)
17. Kalaiyarasi, R. and Prabasri, S.: Predicting the Lung Cancer from Biological Sequences.
    International Journal of Innovation in Engineering and Technology (IJIET), 5(1), 106-111
    (2015)
18. Kanageswari, S. and Gladis, D.: Generation of association rules of data mining for lung
    cancer by air pollution. International Journal of Engineering and Advanced Technology
    (IJEAT), 9(3), 2874-2880 (2020)
19. Kang'ethe, S. M. and Wagacha, P. W.: Extracting diagnosis patterns in electronic medical
    records using association rule mining. International Journal of Computer Applications
    108(15), 19-27 (2014)
20. Karaolis, M., Moutris, J.A., Papaconstantinou, L., and Pattichis, C. S.: Association rule
    analysis for the assessment of the risk of coronary heart events. In: Proceedings of the 31st
    Annual International Conference of the IEEE Engineering in Medicine and Biology Society,
    pp. 6238-6241. Minneapolis, MN, USA (2009)


21. Kavakiotis, I., Tzanis, G., and Vlahavas, I.: Mining frequent patterns and association rules
    from biological data. In: Mourad Elloumi and Albert Y. Zomaya (eds.), Biological
    Knowledge Discovery Handbook: Preprocessing, Mining, and Postprocessing of Biological
    Data, First Edition, Chapter 34, pp. 737-762. John Wiley & Sons, Inc. (2014)
22. Kumar, R. P. R., Jayakumar, R., and Sankaridevi, A.: Apriori-based frequent symptomset
    association mining in medical databases. International Journal of Recent Technology and
    Engineering (IJRTE), 7(5C), 65-68 (2019)
23. Lakshmi, K.R., Krishna, M.V., and Kumar, S.P.: Performance comparison of data mining
    techniques for predicting of heart disease survivability. International Journal of Scientific
    and Research Publications, 3(6), 1-10 (2013)
24. Lee, J.Y. and Kim, K.-Y.: Semantic and association rule mining-based knowledge
    extension for reusable medical equipment random forest rules. Journal of Integrated Design and
    Process Science, 22(4), 55-81 (2018)
25. Li, T. and Luo, D.: A new improved Apriori algorithm based on compression matrix. In:
    Proceedings of International Conference on Advanced Data Mining Application, vol.
    8933, pp. 1-15. Berlin, Germany: Springer-Verlag (2014)
26. Mao, W. and Mao, J.: The Application of Apriori-Gen algorithm in the association study
    in type 2 diabetes. In: Proceedings of 3rd International Conference on Bioinformatics and
    Biomedical Engineering (ICBBE), 4(1), 126-140 (2009)
27. Martinez, R., Pasquier, C., and Pasquier, N.: GENMINER: Mining informative association
    rules from genomic data. In: Proceedings of the IEEE International Conference on
    Bioinformatics and Biomedicine, pp. 15-22 (2007)
28. Mayilvaganan, M. and Hemalathe, R.: Performance comparison of FSA Red and Apriori
    algorithm's in mutation analysis. International Journal of Computer Trends and Technology
    (IJCTT), 17(4), 205-209 (2014)
29. Mirmozaffari, M., Alinezhad, A., and Gilanpour, A.: Data mining Apriori algorithm for
    heart disease prediction. International Journal of Computing, Communication &
    Instrumentation Engineering (IJCCTE), 4(1), 20-23 (2017)
30. Naidenova, X., Parkhomenko, V., and Shvetsov, K.: Application of a logical-combinatorial
    network to symbolic machine learning tasks. In: Naidenova, X., Shvetsov,
    K., and Yakovlev, A. (eds.), Machine Learning in Analysis of Biomedical and
    Socio-Economic Data, pp. 371-409. Saint-Petersburg, Russia: Polytech Press (2020)
31. Nahar, J., Tickle, K., Ali, A.B.M.S., and Chen, Y.P.: Significant cancer prevention factor
    extraction: An association rule discovery approach. Journal of Medical System 35(3),
    353-367 (2011)
32. Pinheiro, F.M.R.: Applying the Apriori and FP-Growth association algorithms to liver
    cancer data. A Thesis Submitted in Partial Fulfillment of the Requirement for the Degree of
    Master of Science, 172 pp. University of Victoria (2013)
33. Premalatha, S. and Nandhini, U. C.: Efficiently generating the rank based weighted
    association rule mining using Apriori algorithm in High Biological Database. International
    Research Journal of Engineering and Technology (IRJET), 2(9), 2143-2147 (2015)
34. Ramezankhani, A., Pournik, O., Shahrabi, J., Azizi, F., and Hadegh, F.: An application of
    association rule mining to extract risk pattern for type 2 diabetes using Tehran Lipid and
    Glucose Study database. International Journal of Endocrinology and Metabolism, 13(2),
    e25389 (2015)
35. Sarıyer, G. and Taşar, C. Ö.: Highlighting the rules between diagnosis types and laboratory
    diagnostic tests for patients of an emergency department: use of association rule mining.
    Health Informatics Journal, 2(6), 1-17 (2019)
36. Severac, F., Sauleau, E. A., Meyer, N., Lefevre, H., Nisand, G., and Jay, N.:
    Non-redundant association rules between diseases and medications: an automated method for
    knowledge base construction. BMC Medical Informatics and Decision Making 15(1), 2-7
    (2015)
37. Shahebaz, Ah. Kh. and Jabbar, M. A.: Improved classification techniques to predict the
    co-disease in diabetic mellitus patients using discretization and Apriori algorithm.
    International Journal of Innovative Technology and Exploring Engineering (IJITEE), 8(11),
    730-733 (2019)
38. Shang, E., Duan, J., Fan, X., Tang, Y., and Ye, L.: Association rule mining and statistic
    test over multiple datasets on TCM drug pairs. International Journal of Biomedical Data
    Mining 6(1), 2-6 (2017)
39. Sharma, N. and Om, H.: Early detection and prevention of oral cancer: association rules
    mining on investigation. WSEAS Transaction on Computers, 13, 1-8 (2014)
40. Song, K. and Lee, K.: Predictability-based collective class association rule mining. Expert
    System with Application, 79, 1-7 (2017)
41. Swathi, P. and Prajna, B.: The effective procession of Apriori algorithm prescribed data
    mining on medical data. International Journal of Computer Science and Technology
    (IJCST), 7(3), 22-26 (2016)
42. Thamer, V., El-Sappagh, S., and El-Shishtawy, T.: A semantic approach for extracting
    medical association rules. International Journal of Intelligent Engineering Systems, 13(3),
    280-293 (2020)
43. Thanajiranthorn, Ch. and Songram, P.: Generation of efficient rules for associative
    classification. In: Proceedings of the 13th International Conference MIWAI, pp. 109-120 (2019)
44. Vougas, K., Krochmal, M., Jackson, T., Polyzos, A. et al.: Deep learning and association
    rule mining for predicting drug response in cancer. A personalized medicine approach.
    Current version available at https://www.biorxiv.org/content/10.1101/070490v3.full (2017)
45. Wang, X., Su, K., and Liu, Zh.: Analysis of diabetic association rules based on Apriori
    algorithm. In: Ch. Huang, Yu-Wei Chan, and Neil Yen (eds.), Data Processing Techniques
    and Applications for Cyber-Physical System, pp. 553-563. Springer (2020)
46. Wei, Z. and Guangjian, Ye.: The research on analyzing risk factors of type 2 diabetes
    mellitus based on improved frequent pattern tree algorithm. In: Proceedings of the 2015
    International Conference on Materials Engineering and Information Technology Applications,
    pp. 459-463. Atlantic Press (2015)
47. Win, S.L., Htike, Z.Z., Yusof, F., and Noorbatcha, I.A.: Gene expression mining for
    survivability of patients in early stages of lung cancer. International Journal of
    Bioinformatics and Biosciences, 4(2), 1-9 (2014)
48. Yafi, E., Al-Hegami, A., Alam, A., and Biswas, R.: YAMI: Incremental Mining of
    Interesting Association Patterns. The International Arab Journal of Information Technology
    9(6), 504-5011 (2012)
49. Yu, S. and Zhou, Y.: A prefixed-itemset-based improvement for Apriori algorithm.
    Cornell University, arXiv preprint, arXiv:1601.01746 (2016)
50. Zhan, F., Zhu, X., Zhang, L., Wang, X., Wang, L., and Liu, Ch.: Summary of association
    rules. IOP Conf. Series: Earth and Environmental Science (ESMA), vol. 252, pp. 12 (2019)
51. Zhang, B.Z., Jiang, K. Q., and Zhang, Y. Z.: Survey on incremental association rule
    mining research. Journal of China Computer Systems, 37(1), 18-23 (2016)
52. Zhou, D., Ouyag, V., Kang, Zh., Li, Zh., Zhou, J.P., and Cheng, X.: Incremental ARM
    based on matrix compression for edge computing. IEEE Access, Section on Innovation
    and Application of Intelligent Processing of Data, Information and Knowledge as
    Resources in Edge Computing, vol. 7, 173044-173053 (2019)

