Application Association Rule Mining in Medical-Biological Investigations: A Survey

Xenia Naidenova, Vyacheslav Ganapolsky, Alexandre Yakovlev, Tatiana Martirova
Military Medical Academy, Saint Petersburg, Russia
ksennaidd@gmail.com

Abstract. This chapter surveys the application of Apriori-like algorithms in medical and biological investigations. These algorithms discover frequent sets of attribute values in data and then extract logical rules in the form of implicative dependencies between the values of observed or measured attributes and diagnostic parameters.

Keywords: Apriori algorithm, association rule mining, medical investigation

1 Introduction

Intelligent data processing is now an integral part of biomedical research. Revealing different dependencies in the data (implicative, functional, correlational, etc.) helps in diagnosis, treatment planning, predicting the course of diseases, and identifying new factors that expand specialists' understanding of specific diseases and their combinations. The purpose of many biomedical studies is to reveal association rules in a given data set.

An association rule is a rule of the form $X \Rightarrow Y$, where $X$ and $Y$ are non-intersecting sets of distinct literals called items. In the general case, the set of items can be regarded as the set of all attribute values that can appear in the descriptions of objects or situations (transactions) in a database. Let $I = \{i_1, i_2, \ldots, i_N\}$ be a set of items. A transaction database (TDB) is a set of transactions, where each transaction contains a set of items $X \subseteq I$ and is associated with a unique identifier $tid$. A non-empty itemset $Y \subseteq I$ is a $q$-itemset if it contains $q$ items. A transaction with itemset $X$ is said to contain itemset $Y$ if $Y \subseteq X$. The number of transactions in TDB containing itemset $X$ is called the support of $X$, denoted $\mathrm{sup}(X)$:

$$\mathrm{sup}(X) = |\{tid \mid (tid, Y) \in \mathrm{TDB},\ X \subseteq Y\}|,$$

where $|s|$ denotes the cardinality of $s$.
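The support definition above can be illustrated with a minimal Python sketch. The transaction database `TDB` and its item names are hypothetical toy data invented for this illustration, and `support` is simply a name chosen here; it counts the transactions whose itemset contains the queried itemset.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical toy transaction database: tid -> set of items (attribute values).
TDB: Dict[int, FrozenSet[str]] = {
    1: frozenset({"fever", "cough", "flu"}),
    2: frozenset({"fever", "fatigue"}),
    3: frozenset({"fever", "cough", "fatigue", "flu"}),
    4: frozenset({"cough"}),
    5: frozenset({"fever", "cough", "flu"}),
}


def support(itemset: Iterable[str], tdb: Dict[int, FrozenSet[str]]) -> int:
    """Absolute support: number of transactions (tid, Y) with itemset X a subset of Y."""
    x = frozenset(itemset)
    return sum(1 for items in tdb.values() if x <= items)


for query in ({"fever"}, {"fever", "cough"}, {"fatigue", "flu"}):
    print(sorted(query), support(query, TDB))
```

Note that the empty itemset is contained in every transaction, so under this definition its support equals the size of the database.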
Given a minimum support threshold min-sup, an itemset $Y$ is frequent if $\mathrm{sup}(Y) \geq \text{min-sup}$. From frequent itemsets, association rules are extracted in the form of implications, for which the value of support is an important characteristic:

$$\mathrm{sup}(X \Rightarrow Y) = \mathrm{sup}(X \cup Y).$$

A rule also has a measure of reliability called confidence, defined as follows:

$$\mathrm{conf}(X \Rightarrow Y) = \mathrm{sup}(X \cup Y) / \mathrm{sup}(X).$$

Confidence is the fraction of the transactions in TDB containing $X$ that also contain $Y$.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

The third main metric of association rules is lift, defined as follows:

$$\mathrm{lift}(X \Rightarrow Y) = \mathrm{conf}(X \Rightarrow Y) / \mathrm{sup}(Y).$$

This measure quantifies the predictive power of the rule $X \Rightarrow Y$. In this formula $\mathrm{sup}(Y)$ is taken as the relative support, i.e., the fraction of transactions in TDB that contain $Y$; a lift above 1 means that $X$ and $Y$ occur together more often than would be expected if they were independent.

The traditional purpose of data analysis is to find all association rules (ASRs) that have support and confidence above the specified minimum values.

In this paper, we consider the applications of the classical Apriori algorithm for mining ASRs in biomedical data. The choice of this algorithm is justified by the following arguments: it is the most understandable algorithm and the one most easily mastered by specialists in different applied fields; it is universal, in the sense that the ASRs contained in frequent itemsets can be extracted with its help [30]; and it is constantly being improved on the basis of many advanced techniques in ASR mining.

The idea of the classical Apriori algorithm [2] is based on the following consideration: a $q$-itemset can be frequent only if all its proper sub-itemsets are frequent. At the first step of the algorithm, all the items (values of attributes, elements of transactions) are considered, and those satisfying the minimum support condition are selected. The selected items are then used to form itemsets of two items (candidates for frequency). Their supports are calculated, and the candidates that do not meet the minimum support are removed.
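As a rough illustration of the level-wise procedure (the first two passes described above, repeated for larger itemsets), the following Python sketch mines frequent itemsets and then derives rules with the support, confidence, and lift measures defined earlier. It is a minimal sketch under stated assumptions, not the implementation used in any of the surveyed works: the data in `TRANSACTIONS` are hypothetical, the function names `apriori` and `rules` are chosen here, and supports are counted by naive scans of the whole database.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is a set of items.
TRANSACTIONS = [
    {"fever", "cough", "flu"},
    {"fever", "fatigue"},
    {"fever", "cough", "fatigue", "flu"},
    {"cough"},
    {"fever", "cough", "flu"},
]


def apriori(transactions, min_sup):
    """Return {itemset: absolute support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Pass 1: count single items and keep those meeting the minimum support.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_sup}
    frequent = dict(level)
    q = 1
    while level:
        # Join: unite pairs of frequent q-itemsets into (q+1)-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == q + 1}
        # Prune: every q-item subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, q))}
        # Count: one scan of the database per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_sup}
        frequent.update(level)
        q += 1
    return frequent


def rules(frequent, n_transactions, min_conf):
    """Yield (X, Y, sup, conf, lift) for rules X => Y with conf >= min_conf."""
    for itemset, sup_xy in frequent.items():
        for r in range(1, len(itemset)):
            for x in map(frozenset, combinations(itemset, r)):
                y = itemset - x
                conf = sup_xy / frequent[x]
                # Lift divides by the relative support of Y.
                lift = conf / (frequent[y] / n_transactions)
                if conf >= min_conf:
                    yield x, y, sup_xy, conf, lift


freq = apriori(TRANSACTIONS, min_sup=3)
for x, y, sup, conf, lift in rules(freq, len(TRANSACTIONS), min_conf=0.9):
    print(sorted(x), "=>", sorted(y), sup, round(conf, 2), round(lift, 2))
```

The lookups `frequent[x]` and `frequent[y]` are safe because every subset of a frequent itemset is itself frequent, which is exactly the property the pruning step relies on.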
The remaining itemsets are used to form itemsets of three items. The process continues iteratively as long as a new set of candidates for frequency can be generated. The Apriori algorithm thus uses an inductive method of constructing sets of cardinality $q+1$ ($(q+1)$-sets) from their subsets of cardinality $q$ ($q$-sets). Forming $(q+1)$-sets from $q$-sets and calculating their supports are the main sub-processes of the Apriori algorithm, and they determine its computational complexity.

The paper is organized as follows. Sections 2 and 3 briefly survey applications of the Apriori algorithm in medical and biological studies. Improvements to the Apriori algorithm are considered in Section 4, followed by a short conclusion.

2 The Apriori Algorithm in Medical Studies

The Apriori algorithm is widely used in medical research. These studies cover cardiovascular disease [23, 29, 41]; lung cancer [18]; oral cancer [39]; infectious diseases (Ebola virus) [13]; type 2 diabetes [26, 34, 37, 45, 46]; Alzheimer's disease [7]; and liver cancer [32]. In [41], some other conditions are enumerated that were studied with the Apriori algorithm or its modifications: asthma, impotence, lupus, obesity, whooping cough, pregnancy, and Raynaud's phenomenon.

The problems solved are also varied: searching for unknown trends in disease; determining the nature of a disease based on a prediction method; diagnosis (detection) of disease; predicting a patient's response to a drug [38, 44]; early diagnosis and prevention of disease [37]; predicting the progress (course) of a disease; predicting the outcome of a disease [23]; identifying disease risk factors [34]; identifying relationships between different medical operations, appointments, analyses, and diagnoses [35]; extracting diagnostic patterns (sets of features, symptoms) and association rules from electronic medical databases [1, 11, 19, 22]; and many others.
The papers [3, 5, 41, 51] give detailed reviews of ASR mining based on the Apriori algorithm and its modifications.

In [39], early detection and prevention of oral cancer is considered, and the Apriori algorithm is used to assess patients' chances of survival. This is achieved by extracting a set of significant rules relating the survivability of oral cancer patients to various laboratory tests and investigations such as FNAC (Fine Needle Aspiration Cytology) of the neck node, LFT (Liver Function Tests), biopsy, USG (ultrasonography), and CT (computed tomography) scan or MRI (magnetic resonance imaging). Liver Function Tests give information about the state of a patient's liver. Most liver diseases cause only mild symptoms initially, but it is vital that these diseases be detected early. Biopsy is important, as it is the only sure way to know whether the abnormal area is cancer. USG is an ultrasound-based diagnostic imaging technique used to visualize intra-abdominal structures. The extracted ASRs clearly show that if FNAC of the neck node, USG, and CT scan/MRI are positive, then the chance of survival is reduced. However, if LFT is normal, the probability of survival is high. All the generated rules hold the highest confidence level.

Nahar J. et al. [31] extract the significant prevention factors for particular types of cancer: bladder, breast, cervical, lung, prostate, and skin cancer. The algorithms Apriori, Predictive Apriori, and Tertius are used to discover most of the significant prevention factors against these specific types of cancer. Predictive Apriori tries to maximize expected accuracy rather than confidence, as Apriori does; Tertius is a top-down rule discovery system employing a well-known decision tree algorithm.

The article [6] describes methods for detecting the comorbidity of occupational diseases caused by pathogenic factors, namely, by ionizing radiation. The Apriori algorithms were implemented in the SQL Server Analysis Services Data Mining environment.
The following conditions are considered: chronic radiation sickness of degree 1, 2, and 3, residual phenomena of chronic radiation sickness, exposure to ionizing radiation, malaise and fatigue, and autonomic (vegetative) nervous system disorders.

Numerous papers deal with the use of the Apriori algorithm to study coronary heart disease. Karaolis et al. [20] developed a data analysis system that searches for associations in order to assess heart disease risk factors with WEKA tools. In [29], the objective of the study was to effectively predict possible heart attacks from a patient dataset. The database consisted of 209 records (instances) collected from a hospital in Iran, with 8 attributes. Apriori algorithms were implemented in WEKA 2016 (version 3.9.0) and MATLAB R2013a. Data preprocessing consisted of preliminary data cleaning with the unsupervised discretization filter and a discretization method that converts numeric data into nominal data. The algorithm implemented in MATLAB showed the best diagnosis prediction accuracy when the obtained ASRs were used.

It should be noted that there are still very few works in the Russian-language literature on the use of the Apriori algorithm in biomedical research. In addition to the work of Biryukov A. and Dumansky S. [6], we can cite the article [5], which proposes a new effective algorithm, AprioriScale, for building ASRs. The algorithm is applied to the problem of detecting children's diseases: obesity and metabolic syndrome. Two problems are solved: 1) distinguishing between obesity and metabolic syndrome, and 2) identifying a combination of risk factors, determined in the early stages of a child's development, that may indicate the onset of the disease in adolescence. Twelve rules with reliability above 0.75 were obtained.
Let us take a closer look at two of these rules:

• Toxicosis, TIP, WeakLD → MS, frequency = 0.16, probability = 0.83;
• Toxicosis, TIP, ExtraGM, PPH → MS, frequency = 0.09, probability = 0.93,

where TIP is threat of interruption of pregnancy, MS is metabolic syndrome, WeakLD is weak labour delivery, ExtraGM is extragenital diseases of the mother, and PPH is postprandial hyperlipemia. These rules show that if factors such as toxicosis, TIP, and ExtraGM, later accompanied by factors such as WeakLD and PPH, are observed in the early stages of a child's development, the risk of metabolic syndrome is high: the probability of its manifestation is between 0.83 and 0.93.

3 Analysis of Biological and Genetic Data Based on Association Rule Extracting

The analysis of patients' biological data is now becoming particularly relevant. Biological data analysis is concerned with identifying previously unknown hidden patterns (frequent itemsets) and associative structures in large numbers of biological sequences. These sequences include gene sequences, amino acid sequences, protein composition, and other data that describe the structure, localization, interaction, or functioning of proteins and genes in cells. Amino acids are the building material of proteins. The shape and other properties of proteins are associated with the exact sequence of amino acids contained in them. The chemical properties of amino acids determine the biological activity of proteins.

Many diseases have a biological nature: obesity, high blood cholesterol, diabetes, insomnia, arthritis, and many others. Analysis of gene information, including analysis with Apriori algorithms [1, 10, 21, 27], helps to study the nature of a disease, optimize its treatment, and predict its course. An overview of some methods of extracting knowledge from biological (DNA) sequences is given in [8]. The Apriori algorithm is compared with other algorithms for mutation analysis in [28].
In [17], a model based on protein clustering is proposed for finding a dominant sequence of amino acids that blocks the growth of cancer cells.

In [16], a comparative analysis of classifiers in cancer prediction using multiple data mining techniques is given. The dataset contained a total of 844 records and nine features: Clump thickness, Uniformity of cell size, Uniformity of cell shape, Marginal adhesion, Single epithelial cell size, Bare nuclei, Bland chromatin, Normal nucleoli, and Mitoses. The data analysis consisted of two stages. In the first stage, the Apriori algorithm was applied to reduce the number of input features. Two variables, Marginal adhesion and Bare nuclei, were removed as noise, so that a new subset of seven features was provided for solving the classification problem. In the second stage, six classifiers were applied and validated through a k-fold cross-validation scheme. The six predictive algorithms were decision tree (DT), support vector machine (SVM), k-nearest neighbour (KNN), naïve Bayes (NB), random forest (RF), and neural network (NN). The R statistical environment was chosen for the experiments, as it is an open source scripting language specifically designed for data analysis. The classifiers were evaluated on performance metrics including accuracy, sensitivity, and specificity. The SVM classifier achieved a classification accuracy of 0.9372 with a sensitivity of 0.9332 and a specificity of 0.9226, outperforming all the remaining classifiers.

A similar two-stage data processing method is used in [44] to predict a patient's response to a drug in the treatment of cancer. To predict drug response from the molecular profiles of multiple cancer cell types, a large-scale pharmacogenomics dataset was generated for 1001 cancer cell lines and 251 anti-cancer drugs.
The authors performed feature selection in the form of ASRs and used the selected features to train state-of-the-art Deep Learning Neural Networks (DLNNs) to predict pharmacological response in a blind (control) set. The ASRs are treated as a novel meta-dataset. Specifically, the Apriori algorithm was applied to generate a ruleset containing all tissue-to-gene, tissue-to-drug, gene-to-drug, and drug-to-drug associations.

Type 2 diabetes is reported to be a genetic disease, and evidence of a statistical interaction among several Single Nucleotide Polymorphisms (SNPs) has been reported. In [26], the Apriori-Gen algorithm was applied to SNP data on type 2 diabetes for an association study. The obtained associations are measured through the risk rate (RR) and odds ratio (OR), as proposed by the authors. The results make it possible to assess, with high accuracy and statistical reliability, the interaction of nucleotide polymorphisms with disease complexes. An analysis of diabetic ASRs based on the Apriori algorithm is given in [45].

The volume of biological knowledge is rapidly increasing in the form of gene expression databases (GEO, ArrayExpress, etc.), information on microarray experiments (spotted probes, data processing protocols, etc.), molecular databases (GenBank, EMBL, UniGene, etc.), semantic sources such as thesauri, ontologies, or semantic networks (UMLS, GO, etc.), bibliographical databases (Medline, Biosis, etc.), and gene/protein related specific sources (KEGG, OMIM, etc.) [27].

4 Improvement of the Apriori Algorithm

The popularity of the Apriori algorithm for medical diagnostic tasks is due to its simplicity. However, applying it to large data sets requires more efficient modifications that reduce its computational complexity. Work on improving the algorithm is being carried out all over the world [4].
In particular, the following directions for improving the Apriori algorithm can be distinguished: 1) developing new algorithms; 2) data management; 3) constraint-based ASR mining; 4) incremental ASR mining.

Developing New Algorithms. The FP-GROWTH and ECLAT algorithms belong to the first direction. The main drawback of the Apriori algorithm is that it scans the database several times. The FP-GROWTH algorithm [14, 32] uses a frequent-pattern tree structure (FP-Tree), which stores the whole database. This structure can compress the data up to 200 times, and it is held in the computer's memory. Frequent itemsets are then extracted directly from the FP-Tree using the divide-and-conquer method. This algorithm is used for analyzing the risk factors of type 2 diabetes in [46].

The "Equivalence Class Transformation" (ECLAT) algorithm mines frequent itemsets in a vertical data format [12]. The algorithm builds the TID set (the set of identifiers of the transactions containing the item) for every item in the transaction database. ASR mining on vertically partitioned data is used in [15].

The article [45] gives a general overview of the effective processing of medical data with the Apriori algorithm. The authors propose a modification of the Apriori algorithm in which the support of a candidate itemset is calculated only over transactions whose length is greater than or equal to the cardinality of the candidate in question. The study in [9] examines the effect of the k-means clustering algorithm on the Apriori algorithm by combining the two algorithms. A logical-combinatorial neuron-like network is proposed for optimizing the Apriori algorithm in [30].

Data Management. The following articles can be attributed to the category of data management. In [37], ASRs are extracted to predict co-diseases in diabetes mellitus patients. The distinctive feature of this work is that rules are selected by testing them on a sample of data not used in data processing.
The efficiency of the Apriori algorithm was increased with the help of a prefixed-itemset-based data structure [49]. In [38], a new method and a statistical test on rules were introduced to mine ASRs over multiple databases. In [42], a method for ASR extraction based on ontology semantics is proposed. The medical dataset is transformed into an ontology in the form of triples (subject, predicate, object), and SPARQL queries (with results in the Query Results XML Format) are used to query the generated ontology.

Constraint-Based Association Rules Mining. Many recent studies optimize the Apriori algorithm so that it obtains not the whole set of possible ASRs but only a subset satisfying a given essential property: interesting rules, non-redundant rules, negative rules, maximal rules, association rules generated through the queries of end users, and some others.

The authors of [1] introduce a Query-constraint-based ASR Mining (QARM) approach for exploratory analysis of multiple, diverse clinical data sets in the National Sleep Research Resource (NSRR). The Top-k Non-Redundant (TNR) ASR mining algorithm is used in this work. A non-redundant ASR is a rule such that deleting at least one item from it changes the rule's support for the worse. The work [36] also proposes an algorithm to generate non-redundant ASRs.

Both positive and negative rules were generated in [27] to analyze which diagnosis types do or do not require Laboratory Diagnostic Tests (LDTs) for patients. The negative rules are generated from infrequent itemsets.

Maximal ASRs are extracted from maximal frequent patterns (itemsets). A pattern $X$ is a maximal frequent pattern in data set $D$ if $X$ is frequent and there exists no super-pattern $X'$ such that $X \subset X'$ and $X'$ is frequent in $D$. An example of a maximal frequent pattern mining algorithm is the Maximal Frequent Itemset Algorithm (MAFIA) [14, 21].
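The definition of a maximal frequent pattern can be made concrete with a short Python sketch. This is not MAFIA: it is a naive quadratic filter over an already mined collection of frequent itemsets, and both the function name `maximal_itemsets` and the data in `FREQUENT` are hypothetical, chosen only for illustration.

```python
# Hypothetical collection of frequent itemsets, e.g. the output of an Apriori run.
FREQUENT = [
    {"fever"}, {"cough"}, {"flu"}, {"rash"},
    {"fever", "cough"}, {"fever", "flu"}, {"cough", "flu"},
    {"fever", "cough", "flu"},
]


def maximal_itemsets(frequent):
    """Keep the frequent itemsets that have no frequent proper superset."""
    sets = [frozenset(s) for s in frequent]
    # x < y on frozensets tests whether x is a proper subset of y.
    return [x for x in sets if not any(x < y for y in sets)]


for itemset in maximal_itemsets(FREQUENT):
    print(sorted(itemset))
```

Because every subset of a frequent itemset is frequent, the maximal itemsets form a compact representation from which all frequent itemsets (though not their supports) can be recovered.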
There is currently considerable interest in methods that restrict rule extraction to the specific types of rules that are most interesting to users. In [33], a method of Rank Based Weighted ASR Mining (RANWAR), developed for biological data processing, is proposed. Two new measures of interestingness based on gene ranking are considered.

In [43], a method for generating efficient rules for associative classification is proposed. Associative classification is a technique that integrates classification and ASR mining for classifying unseen data. It gives rules that are more accurate and easier to understand than those obtained with traditional classifiers. In [37, 40], ASR mining is also combined with extracting classification rules.

Existing ASR mining algorithms rely on frequency-based rule evaluation methods, which fail to provide sound statistical or computational measures for rule evaluation and often produce many redundant rules. In [37], the authors propose a predictability-based ASR mining algorithm that uses cross-validation with a new rule evaluation step. A training dataset is partitioned into inner training sets and inner test sets, and the predictive performance of the candidate rules is then evaluated.

Incremental Mode of Association Rules Mining. Traditional static ASR mining cannot solve real-world problems with dynamically changing data. When the size of the transaction database increases, an initially frequent item may become infrequent, and an initially infrequent item may become frequent. Incremental ASR mining algorithms help to cope with these drawbacks of the classical Apriori algorithm. The authors of [52] combine the Fast Update Pruning (FUP) algorithm with a compressed Boolean matrix [25] and propose a new incremental ASR mining algorithm named FBCM. This algorithm requires only a single scan of both the database and the incremental database.
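The change of status described above (a frequent item becoming infrequent and vice versa) can be illustrated with a small Python sketch. It is neither FUP nor FBCM: it only maintains counts for single items and assumes a relative minimum support threshold, so that the original database does not have to be rescanned when an increment arrives. All data and function names here are hypothetical.

```python
from collections import Counter


def count_items(transactions):
    """Count, for each item, the number of transactions that contain it."""
    counts = Counter()
    for t in transactions:
        counts.update(set(t))
    return counts


def frequent_items(counts, n_transactions, min_rel_sup):
    """Items whose relative support reaches the threshold."""
    return {i for i, c in counts.items() if c / n_transactions >= min_rel_sup}


# Hypothetical original database and incremental database.
original = [{"a", "b"}, {"a"}, {"a", "c"}, {"b"}]
increment = [{"c"}, {"b", "c"}, {"c"}, {"b", "c"}]

counts = count_items(original)
before = frequent_items(counts, len(original), min_rel_sup=0.5)

# Only the increment is scanned; Counter.update adds the new counts.
counts.update(count_items(increment))
after = frequent_items(counts, len(original) + len(increment), min_rel_sup=0.5)

print("frequent before:", sorted(before))
print("frequent after: ", sorted(after))
```

For larger itemsets the difficulty is that counts of previously infrequent itemsets are usually not stored, which is the problem that dedicated incremental algorithms address.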
An incremental algorithm for mining interesting ASRs has been developed in [48]. The paper [50] summarizes the methods for incremental ASR mining.

Conclusion

The paper provides an overview of mining association rules from data in biomedical research. This review is based on a study of works from 2013 to 2020 and shows the widespread use of the Apriori algorithm and its modifications in medicine. The review also covers methods for improving the Apriori algorithm so that it mines more effective association rules adapted to various research tasks.

References

1. Abeysinghe, R. and Cui, L.: Query-constraint-based mining of association rules for exploratory analysis of clinical datasets in the National Sleep Research Resource. Medical Informatics and Decision Making, 18 (Suppl 2), 89-100 (2018)
2. Agrawal, R., Imielinski, T., and Swami, A.: Mining association rules between sets of items in large databases. ACM SIGMOD Record, 22(2), 207-216 (1993)
3. Altaf, W., Shahbaz, M., and Guergachi, A.: Applications of association rule mining in health informatics: A survey. Artificial Intelligence Review, 47(3), 313-340 (2017)
4. Bhende, D., Kasarker, U., and Gedam, M.: Study of various improved Apriori algorithm. IOSR Journal of Computer Science Engineering, e-ISSN: 2278-0661, pp. 55-58 (2016)
5. Billig, V. A. and Ivanova, O. Y.: Building association rules in a task of medical diagnosis. Software Products and Systems, 2(114), 146-157 (2016) (in Russian)
6. Birukov, A. P. and Dumansky, S. M.: Revealing the comorbidity of professional diseases caused by pathogenic factors with the help of associative algorithms on the examples of a cohort of victims of ionizing radiation. Medicine of Extreme Situations, 2, 13-24 (2016) (in Russian)
7. Chavez, R., Górriz, J. M., Ramirez, J., Salas-Gonzalez, D., and Gómez-Rio, M.: Efficient mining of association rules for the early diagnosis of Alzheimer's disease. Phys Med Biol., 56, 6047-6063 (2011)
8. Das, N. N. et al.: Brief survey on DNA sequence mining. International Journal of Computer Science and Mobile Computing, 2(11), 129-134 (2013)
9. Dharshini, N. P., Azmi, F., Fawwaz, I., Husein, A. M., and Siregar, S. D.: Analysis of accuracy k-means and Apriori algorithms for patient data clusters. Journal of Physics: Conference Series, 1230, 1-9. IOP Publishing (2019)
10. Dhumale, S.: Predicting patterns over protein sequences using Apriori algorithm. International Journal of Engineering and Computer Science, 4(7), 13011-13016 (2015)
11. Doddi, D., Marath, A., Ravi, S. S., and Torney, D. C.: Discovery of association rules in medical data. Medical Informatics and Internet in Medicine, 26(1), 25-33 (2001)
12. Domadia, N. and Rao, U. P.: Privacy-preserving distributed association rule mining approach on vertically partitioned healthcare data. Procedia Computer Science, 148, 303-312 (2019)
13. Go, E., Lee, S., and Yoon, T.: Analysis of ebolavirus with decision tree and Apriori algorithm. International Journal of Machine Learning and Computing, 4(6), 543-548 (2014)
14. Han, J., Pei, J., and Yin, Y.: Mining frequent patterns without candidate generation. In: Proceedings of ACM SIGMOD International Conference on Management of Data, pp. 1-12 (2000)
15. Harahap, M., Husein, A. M., Aisyah, S. et al.: Mining association rules based on diseases population for recommendation of medicine need. Journal of Physics: Conference Series, 1007, 1-12 (2018)
16. Jalali, S. M., Moro, S., Mahmoudi, M. R., Ghaffary, K. A., Maleki, M., and Alidoostan, A.: A comparative analysis of classifiers in cancer prediction using multiple data mining techniques. International Journal of Business Intelligence and Systems Engineering, 1(2), 166-178 (2017)
17. Kalaiyarasi, R. and Prabasri, S.: Predicting the Lung Cancer from Biological Sequences. International Journal of Innovation in Engineering and Technology (IJIET), 5(1), 106-111 (2015)
18. Kanageswari, S. and Gladis, D.: Generation of association rules of data mining for lung cancer by air pollution. International Journal of Engineering and Advanced Technology (IJEAT), 9(3), 2874-2880 (2020)
19. Kang'ethe, S. M. and Wagacha, P. W.: Extracting diagnosis patterns in electronic medical records using association rule mining. International Journal of Computer Applications, 108(15), 19-27 (2014)
20. Karaolis, M., Moutris, J. A., Papaconstantinou, L., and Pattichis, C. S.: Association rule analysis for the assessment of the risk of coronary heart events. In: Proceedings of the 31st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 6238-6241. Minneapolis, MN, USA (2009)
21. Kavakiotis, I., Tzanis, G., and Vlahavas, I.: Mining frequent patterns and association rules from biological data. In: Mourad Elloumi and Albert Y. Zomaya (eds.), Biological Knowledge Discovery Handbook: Preprocessing, Mining, and Postprocessing of Biological Data, First Edition, Chapter 34, pp. 737-762. John Wiley & Sons, Inc. (2014)
22. Kumar, R. P. R., Jayakumar, R., and Sankaridevi, A.: Apriori-based frequent symptomset association mining in medical databases. International Journal of Recent Technology and Engineering (IJRTE), 7(5C), 65-68 (2019)
23. Lakshmi, K. R., Krishna, M. V., and Kumar, S. P.: Performance comparison of data mining techniques for predicting of heart disease survivability. International Journal of Scientific and Research Publications, 3(6), 1-10 (2013)
24. Lee, J. Y. and Kim, K.-Y.: Semantic and association rule mining-based knowledge extension for reusable medical equipment random forest rules. Journal of Integrated Design and Process Science, 22(4), 55-81 (2018)
25. Li, T. and Luo, D.: A new improved Apriori algorithm based on compression matrix. In: Proceedings of International Conference on Advanced Data Mining Application, vol. 8933, pp. 1-15. Berlin, Germany: Springer-Verlag (2014)
26. Mao, W. and Mao, J.: The Application of Apriori-Gen algorithm in the association study in type 2 diabetes. In: Proceedings of 3rd International Conference on Bioinformatics and Biomedical Engineering (ICBBE), 4(1), 126-140 (2009)
27. Martinez, R., Pasquier, C., and Pasquier, N.: GENMINER: Mining informative association rules from genomic data. In: Proceeding of the IEEE International Conference on Bioinformatics and Biomedicine, pp. 15-22 (2007)
28. Mayilvaganan, M. and Hemalathe, R.: Performance comparison of FSA Red and Apriori algorithm's in mutation analysis. International Journal of Computer Trends and Technology (IJCTT), 17(4), 205-209 (2014)
29. Mirmozaffari, M., Alinezhad, A., and Gilanpour, A.: Data mining Apriori algorithm for heart disease prediction. International Journal of Computing, Communication & Instrumentation Engineering (IJCCIE), 4(1), 20-23 (2017)
30. Naidenova, X., Parkhomenko, V., and Shvetsov, K.: Application of a logical-combinatorial network to symbolic machine learning tasks. In: Naidenova, X., Shvetsov, K., and Yakovlev, A. (eds.), Machine Learning in Analysis of Biomedical and Socio-Economic Data, pp. 371-409. Saint-Petersburg, Russia: Polytech Press (2020)
31. Nahar, J., Tickle, K., Ali, A. B. M. S., and Chen, Y. P.: Significant cancer prevention factor extraction: An association rule discovery approach. Journal of Medical System, 35(3), 353-367 (2011)
32. Pinheiro, F. M. R.: Applying the Apriori and FP-Growth association algorithms to liver cancer data. A Thesis Submitted in Partial Fulfillment of the Requirement for the Degree of Master of Science, 172 pp. University of Victoria (2013)
33. Premalatha, S. and Nandhini, U. C.: Efficiently generating the rank based weighted association rule mining using Apriori algorithm in High Biological Database. International Research Journal of Engineering and Technology (IRJET), 2(9), 2143-2147 (2015)
34. Ramezankhani, A., Pournik, O., Shahrabi, J., Azizi, F., and Hadegh, F.: An application of association rule mining to extract risk pattern for type 2 diabetes using Tehran Lipid and Glucose Study database. International Journal of Endocrinology and Metabolism, 13(2), e25389 (2015)
35. Sarıyer, G. and Taşar, C. Ö.: Highlighting the rules between diagnosis types and laboratory diagnostic tests for patients of an emergency department: use of association rule mining. Health Informatics Journal, 2(6), 1-17 (2019)
36. Severac, F., Sauleau, E. A., Meyer, N., Lefevre, H., Nisand, G., and Jay, N.: Non-redundant association rules between diseases and medications: an automated method for knowledge base construction. BMC Medical Informatics and Decision Making, 15(1), 2-7 (2015)
37. Shahebaz, Ah. Kh. and Jabbar, M. A.: Improved classification techniques to predict the co-disease in diabetic mellitus patients using discretization and Apriori algorithm. International Journal of Innovative Technology and Exploring Engineering (IJITEE), 8(11), 730-733 (2019)
38. Shang, E., Duan, J., Fan, X., Tang, Y., and Ye, L.: Association rule mining and statistic test over multiple datasets on TCM drug pairs. International Journal of Biomedical Data Mining, 6(1), 2-6 (2017)
39. Sharma, N. and Om, H.: Early detection and prevention of oral cancer: association rules mining on investigation. WSEAS Transaction on Computers, 13, 1-8 (2014)
40. Song, K. and Lee, K.: Predictability-based collective class association rule mining. Expert System with Application, 79, 1-7 (2017)
41. Swathi, P. and Prajna, B.: The effective procession of Apriori algorithm prescribed data mining on medical data. International Journal of Computer Science and Technology (IJCST), 7(3), 22-26 (2016)
42. Thamer, V., El-Sappagh, S., and El-Shishtawy, T.: A semantic approach for extracting medical association rules. International Journal of Intelligent Engineering Systems, 13(3), 280-293 (2020)
43. Thanajiranthorn, Ch. and Songram, P.: Generation of efficient rules for associative classification. In: Proceedings of the 13th International Conference MIWAI, pp. 109-120 (2019)
44. Vougas, K., Krochmal, M., Jackson, T., Polyzos, A. et al.: Deep learning and association rule mining for predicting drug response in cancer. A personalized medicine approach. Current version available at https://www.biorxiv.org/content/10.1101/070490v3.full (2017)
45. Wang, X., Su, K., and Liu, Zh.: Analysis of diabetic association rules based on Apriori algorithm. In: Ch. Huang, Yu-Wei Chan, and Neil Yen (eds.), Data Processing Techniques and Applications for Cyber-Physical System, pp. 553-563. Springer (2020)
46. Wei, Z. and Guangjian, Ye.: The research on analyzing risk factors of type 2 diabetes mellitus based on improved frequent pattern tree algorithm. In: Proceedings of the 2015 International Conference on Materials Engineering and Information Technology Applications, pp. 459-463. Atlantic Press (2015)
47. Win, S. L., Htike, Z. Z., Yusof, F., and Noorbatcha, I. A.: Gene expression mining for survivability of patients in early stages of lung cancer. International Journal of Bioinformatics and Biosciences, 4(2), 1-9 (2014)
48. Yafi, E., Al-Hegami, A., Alam, A., and Biswas, R.: YAMI: Incremental Mining of Interesting Association patterns. The International Arab Journal of Information Technology, 9(6), 504-5011 (2012)
49. Yu, S. and Zhou, Y.: A prefixed-itemset-based improvement for Apriori algorithm. Cornell University, arXiv preprint, arXiv:1601.01746 (2016)
50. Zhan, F., Zhu, X., Zhang, L., Wang, X., Wang, L., and Liu, Ch.: Summary of association rules. IOP Conf. Series: Earth and Environment Science (ESMA), vol. 252, pp. 12 (2019)
51. Zhang, B. Z., Jiang, K. Q., and Zhang, Y. Z.: Survey on incremental association rule mining research. Journal of China Computer Systems, 37(1), 18-23 (2016)
52. Zhou, D., Ouyag, V., Kang, Zh., Li, Zh., Zhou, J. P., and Cheng, X.: Incremental ARM based on matrix compression for edge computing. IEEE Access, Section on Innovation and Application of Intelligent Processing of Data, Information and Knowledge as Resources in Edge Computing, vol. 7, 173044-173053 (2019)