<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">An Efficient Algorithm for the Prediction of Cancer of the Kidney Using Data Analytic Technique</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Aranuwa</forename><surname>Felix</surname></persName>
							<email>felix.aranuwa@aaua.edu.ng</email>
							<affiliation key="aff0">
								<orgName type="institution">Adekunle Ajasin University</orgName>
								<address>
									<postCode>+2347031341911</postCode>
									<settlement>Akungba-Akoko</settlement>
									<region>Ondo State</region>
									<country key="NG">Nigeria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ogundare</forename><surname>Olanike</surname></persName>
							<email>ogundareolanike@yahoo.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Adekunle Ajasin University</orgName>
								<address>
									<postCode>+2347031341911</postCode>
									<settlement>Akungba-Akoko</settlement>
									<region>Ondo State</region>
									<country key="NG">Nigeria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sellappan</forename><surname>Palaniappan</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Malaysia University of Science and Technology</orgName>
								<address>
									<postCode>+6010212624</postCode>
									<settlement>Selangor</settlement>
									<country key="MY">Malaysia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">An Efficient Algorithm for the Prediction of Cancer of the Kidney Using Data Analytic Technique</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">F052BAC8783D036D75619791955D3A6A</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T00:54+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Data Analytics</term>
					<term>Classification Algorithms</term>
					<term>Data Mining</term>
					<term>Kidney Cancer</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This research presents an efficient algorithm for the prediction of cancer of the kidney, from which medical practitioners and patients can gain valuable knowledge for early and proactive intervention strategies to save lives from this harmful disease. To achieve this objective, a dataset of kidney cancer patients was acquired from selected private and public hospitals in south west Nigeria. A two-layered classifier system consisting of Rule Induction (RI) and Decision Tree (DT) classifiers was designed to build the model using a data analytic approach. The classifier system was tested successfully using case study data from fifty-two (52) selected Local Governments in South West Nigeria, collected with purposive and selective sampling techniques. Ten classification algorithms were used in the modeling. The Waikato Environment for Knowledge Analysis (WEKA) was used for the experiment, and each model was built in two test modes (10-fold cross validation and percentage split). The performance of the algorithms was compared using standard metrics of classification accuracy and the time taken to build each model. The experimental results show that the J48 decision tree algorithm outperforms all other algorithms in all the layers, with correctly classified instances of 74.7%, F-Measure of 0.614, TP rate of 0.747, FP rate of 0.135, and precision and recall of 0.687 and 0.714 respectively. The algorithm took 0.03 seconds to build the model. This indicates that the algorithm is suitable for the research purpose. The results from the system framework on test data show that the identified attributes, the algorithm and the system model performed well and can serve as a valuable tool for early detection of the disease in patients.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>CCS Concepts</head><p>• Software and its engineering ➝ Software organization and properties ➝ Extra-functional properties ➝ Software performance</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>In Africa, experimental studies have shown that most cancers are diagnosed at an advanced stage of the disease, which usually contributes to complications and mortality. This is due to limited awareness of the early signs and symptoms of the disease among the public and healthcare providers. According to Lasebikan, Nwadinigwe &amp; Onyegbule (2014), the mortality rate of this disease is always compounded by the late stage at which the disease is diagnosed, presenting a ticking time bomb of life expectancy and lifestyle changes such as women having fewer children, as well as hormonal intervention such as postmenopausal hormonal therapy <ref type="bibr" target="#b0">[1]</ref>. An effective way to reduce the harm caused by the disease is to detect it early <ref type="bibr" target="#b1">[2]</ref>. However, early detection and prognosis require accurate information, a reliable analytic procedure and an efficient algorithm. Therefore, this work presents a reliable analytic procedure and an efficient algorithm suitable for the prediction of cancer of the kidney through a data analytic approach, from which medical practitioners and patients can gain valuable knowledge and help for proactive intervention strategies in order to save lives from this harmful disease.</p><p>Data analytics has proven to be a multi-dimensional discipline that uses descriptive techniques and predictive models to gain valuable knowledge from data warehouses for recommendations and decision making. It is the discovery and communication of meaningful patterns in data <ref type="bibr" target="#b2">[3]</ref>. According to Berson, Smith and Thearling (1999), data analytics is the science of examining raw data with the purpose of drawing conclusions from it <ref type="bibr" target="#b8">[9]</ref>. 
It focuses on inference, identifying undiscovered patterns and establishing hidden relationships <ref type="bibr" target="#b3">[4]</ref>. Figure <ref type="figure" target="#fig_0">1</ref> depicts the process of data analytics. The science is generally divided into exploratory data analysis (EDA), where new features in the data are discovered, and confirmatory data analysis (CDA), where existing hypotheses are proven true or false. The term is typically used to describe the technical aspects of data analysis, especially predictive modeling and machine learning techniques. Data analytics has commonly been applied to business data, marketing mix modeling, web analysis, risk analysis and fraud analysis to communicate insights from data. It is well suited to recommending actions and guiding decision making.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">METHOD AND MATERIALS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Data Collection and Data Format</head><p>The dataset for this research work was collected from selected health centres and hospitals in the south western part of Nigeria using purposive and selective sampling techniques. The researcher collected a sample totaling 1,006 records from fifty-two selected health centres in six (6) different states. The data collected was cleaned, normalized and organized in a form suitable for the data analytic process. Table <ref type="table">1</ref> shows the data format for the research data collection, while Figure <ref type="figure" target="#fig_0">1</ref> and Figure <ref type="figure" target="#fig_1">2</ref> show the visualized information about selected states and health centres respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Data Analysis &amp; Interpretation</head><p>Of the 1,006 patient records captured, 44.8% were male while the remaining 55.2% were female (see Table <ref type="table" target="#tab_0">2</ref>). The analysis further revealed that 57.1% of the patients were exposed to chemical and industrial contents, while 32.7% of the population had a gender and hereditary disorder. The patients' lifestyle data also indicated that people in this region are addicted to smoking and drinking of alcohol, and make regular use of nonsteroidal anti-inflammatory drugs (NSAIDs) such as ibuprofen and naproxen, which can increase the risk of the disease by 51%. Other factors include obesity; faulty genes; a family history of kidney cancer; having kidney disease that needs dialysis; being infected with hepatitis C; and previous treatment for testicular cancer or cervical cancer. There is also an indication that high blood pressure is a possible risk factor, though this is still under investigation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">DESIGN OF EXPERIMENT AND RESULTS</head></div>
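<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Research Experimental Platform</head><p>The Waikato Environment for Knowledge Analysis (WEKA) platform was used for the data analytic experiment. It is a powerful data mining tool that has a GUI Chooser from which any of the four major WEKA application environments (Explorer, Experimenter, KnowledgeFlow and Simple CLI) can be selected. The Explorer application was selected for this experiment because it has a workbench that contains a collection of visualization tools, data processing, attribute ranking and predictive modeling, with a graphical user interface (GUI) for easy access to these functionalities, which are very important to the research work. WEKA is a collection of machine learning algorithms for data mining tasks. Algorithms implemented in WEKA include Bayesian classifiers, Decision Trees, Rules, Artificial Neural Networks (Functions), Lazy classifiers and miscellaneous classifiers. For the purpose of this work, however, Rule Induction and Decision Tree classifiers were considered. These families of classifiers were selected because of their performance in various domains. Both have been successfully applied to a variety of real-world classification tasks in industry, business, science and education with good performance [10]. The classifier system designed for the data modeling, as shown in Figure 3, has two layers: Layer 1 consists of JRip, PART and Decision Table from the family of Rule Induction, and Layer 2 consists of J48, LAD Tree, Decision Stump, Random Forest, RepTree, BF Tree and LMT from the family of Decision Tree. The Decision Tree, also known as a "white box" classification model, can provide an explanation for its models and can be used directly for decision making <ref type="bibr" target="#b4">[5]</ref>, while Rule Induction is one of the fundamental tools of data mining, in which formal rules are extracted from a set of observations. The rules extracted represent a full scientific model of the data <ref type="bibr" target="#b5">[6]</ref>. According to <ref type="bibr" target="#b6">Kapil et al., (2013)</ref>, rule induction is a popular and well researched method for discovering interesting relations between variables in large databases. These abilities make rule induction well suited to any effective and efficient intelligent system. A major paradigm of Rule Induction is Association Rules <ref type="bibr" target="#b6">[7]</ref>.</p></div>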
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 3: Designed Classifier System</head><p>As shown in Figure <ref type="figure">3</ref>, the patient databank component is responsible for collecting, updating and storing patients' data from different sources. The classifier system component is responsible for the data modeling based on the algorithms in the layers. The performance evaluation component evaluates the performance of the algorithms considered in the layers using standard metrics to identify the best (optimal) algorithm. The rules generated from this algorithm are to be incorporated into the prediction system. Since the objective of the research work, which has been achieved, is to present a suitable algorithm for the kidney cancer prediction system, the processes of the prediction system are not discussed here but will be covered in future work.</p></div>
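<div xmlns="http://www.tei-c.org/ns/1.0"><p>The standard metrics used by the performance evaluation component can be illustrated with a minimal sketch. The Python function below derives the TP rate, FP rate, precision, recall and F-measure for one class from a confusion matrix. The function name class_metrics and the counts in EXAMPLE_CONFUSION are hypothetical and are not taken from the study's data; WEKA reports weighted averages of such per-class values, which this sketch does not reproduce.</p>
<p>
```python
def class_metrics(confusion, positive):
    """Compute per-class metrics from confusion[actual][predicted] counts."""
    labels = list(confusion)
    tp = confusion[positive][positive]
    # Records of the positive class that were predicted as something else.
    fn = sum(confusion[positive][p] for p in labels if p != positive)
    # Records of other classes that were predicted as the positive class.
    fp = sum(confusion[a][positive] for a in labels if a != positive)
    tn = sum(confusion[a][p] for a in labels for p in labels
             if a != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fp_rate = fp / (fp + tn) if fp + tn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {
        "tp_rate": recall,  # the TP rate is the same quantity as recall
        "fp_rate": fp_rate,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts for the three prediction levels (illustration only).
EXAMPLE_CONFUSION = {
    "One": {"One": 40, "Two": 10, "Three": 0},
    "Two": {"One": 5, "Two": 30, "Three": 5},
    "Three": {"One": 5, "Two": 0, "Three": 5},
}

metrics = class_metrics(EXAMPLE_CONFUSION, "One")
print({name: round(value, 3) for name, value in metrics.items()})
```
</p>
<p>For the hypothetical class One above, 40 of 50 actual cases and 40 of 50 predicted cases are correct, so precision, recall and F-measure all come to 0.8 and the FP rate to 0.2.</p></div>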
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Experimental Results</head><p>Ten (10) classification algorithms from the families of classifiers implemented in this work were used to model the patients' dataset. The dataset for the experiment was first divided into two parts, the training and testing datasets: 66% of the dataset was devoted to training while the remaining 34%, selected at random, was used for testing. JRip, PART and Decision Table in layer 1 of the classifier system were first used to model the patients' data, followed by the Decision Tree classifiers in layer 2.</p><p>The 10-fold cross validation and percentage split test modes were both considered in the modeling. Since the algorithms are from different classifier families, they yielded different models that classify differently on some inputs. The algorithms were tested on the dataset in order to determine which one best models the data with the best predictive accuracy.</p><p>The performance of the various algorithms in layer 1 and layer 2 was compared based on the output from the percentage split (hold-out) and 10-fold cross validation modes. The results of the models from the two modes and the performance evaluations are presented in Table <ref type="table" target="#tab_2">3</ref>. The 10-fold cross validation test mode was considered good since it produced the best model in both layer 1 and layer 2 of the classifier system. Moreover, 10-fold cross validation has been widely used, and it is described as a better option for determining the performance of a classifier <ref type="bibr" target="#b7">[8]</ref>. Table <ref type="table" target="#tab_3">4</ref> shows the standard accuracy metrics from the 10-fold cross validation mode for all the algorithms in the experiment. Figure <ref type="figure">4</ref> and Figure <ref type="figure">5</ref> show the graphs of predictive accuracy and time taken to build the models by the classifiers respectively. 
The experimental results and analysis show that the J48 decision tree and LMT outperform all other algorithms in the layers. However, the J48 decision tree was chosen as the best algorithm in this work because it has correctly classified instances of 74.7%, ROC Area of 0.78 and recall of 0.714. It has a lower FP rate of 0.135, an F-Measure of 0.614, and took only 0.03 seconds to build the model, compared with 29.25 seconds for LMT, as shown in Table <ref type="table" target="#tab_3">4</ref>. Additionally, J48 decision tree algorithms can generally produce a simple tree structure with high classification accuracy, even with a huge volume of data <ref type="bibr" target="#b8">[9]</ref>. Pruning methods have been introduced to reduce the complexity of the tree structure without any decrease in classification accuracy. The J48 decision tree structure and rules as generated by WEKA are presented in Figure <ref type="figure" target="#fig_4">6</ref>. The rules generated from the best algorithm (J48 pruned decision tree) are as stated in rules 1 to 20. The rules were tested in a prediction system framework and their prediction levels (PL) are classified as One, Two and Three. These levels show the status of patients: Levels One and Two indicate a risk level or status of disease manifestation that needs urgent attention, while Level Three indicates that the patient is not manifesting any symptoms of kidney cancer but may suffer from other diseases. A back-end for updating the rules as the situation arises will be incorporated into the system to match other conditions.</p></div>
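<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two test modes described in this section can be sketched in a few lines of Python. This is a minimal illustration of how a 66% percentage split and a 10-fold cross validation partition 1,006 records; it is not WEKA's implementation, and the function names, the seed and the shuffling scheme are assumptions made for the example.</p>
<p>
```python
import random


def percentage_split(n_records, train_fraction=0.66, seed=1):
    """Shuffle record indices and split them into train and test lists."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    n_train = int(round(n_records * train_fraction))
    return indices[:n_train], indices[n_train:]


def k_fold_splits(n_records, k=10, seed=1):
    """Return k (train, test) pairs; every record is tested exactly once."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


train, test = percentage_split(1006)
print(len(train), len(test))

splits = k_fold_splits(1006)
print(sorted(len(test_fold) for _, test_fold in splits))
```
</p>
<p>With 1,006 records the percentage split trains on 664 records and tests on 342, whereas cross validation tests every record once across ten folds of 100 or 101 records each, which is one reason it is often preferred for estimating classifier performance.</p></div>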
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">CONCLUSIONS</head><p>This research work focused on presenting an efficient algorithm suitable for predicting the status of kidney cancer in patients. To achieve the objectives of the research work: (i) a dataset pertaining to patients was acquired from fifty-two (52) selected health centres in Local Government Areas (LGAs) of the south western region of Nigeria using purposive and selective sampling techniques; (ii) the researcher developed a two-layered classifier system consisting of Rule Induction and Decision Tree classifiers, implemented on the Waikato Environment for Knowledge Analysis (WEKA), to build the data model using a data analytic approach; and (iii) different machine learning algorithms were used in the search for the algorithm that produced the model with the best predictive accuracy. In the experiment, ten (10) classification algorithms from different classifier families were applied to the patients' dataset. Since they are from different classifier families, they yielded different models that classify differently on some inputs. The performance of the various algorithms in layer 1 and layer 2 was compared, and the standard metrics of accuracy, precision, recall and F-measure were computed, as shown in Table <ref type="table" target="#tab_2">3</ref> and Table <ref type="table" target="#tab_3">4</ref> respectively. The results show that the J48 decision tree outperforms all other algorithms in the layers, with correctly classified instances of 74.7%, a model build time of 0.03 seconds, ROC Area of 0.78, FP rate of 0.135, TP rate of 0.747, precision of 0.687, recall of 0.714 and F-Measure of 0.614.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Data Analytics Process</figDesc><graphic coords="1,318.60,588.25,239.75,106.45" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Visualized information about selected health centres in LGAs</figDesc><graphic coords="2,61.57,486.83,103.90,67.45" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Predictive Accuracy of Classifiers in Layers 1 and 2 for both 10-fold and Hold-out (Percentage Split) Validations</figDesc><graphic coords="4,54.50,444.50,231.70,141.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: J48 Decision Tree Structure as presented by WEKA</figDesc><graphic coords="5,54.50,223.45,236.60,210.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Rules 1 to 20</head><label></label><figDesc>Rule 1: IF (G&amp;H Disorder = NO) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = blood in urine: PL = One
Rule 2: IF (G&amp;H Disorder = NO) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = back pain: PL = Two
Rule 3: IF (G&amp;H Disorder = NO) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = tumor: PL = Three
Rule 4: IF (G&amp;H Disorder = NO) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = Fibroids: PL = Three
Rule 5: IF (G&amp;H Disorder = NO) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = Stomach ulcer: PL = Two
Rule 6: IF (G&amp;H Disorder = NO) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = Kidney pain: PL = One
Rule 7: IF (G&amp;H Disorder = NO) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = Abdominal pain: PL = Two
Rule 8: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = blood in urine: PL = One
Rule 9: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Obesity) AND Complaints = blood in urine: PL = Two
Rule 10: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = Yes) AND (Lifestyle = HB Pressure) AND Complaints = blood in urine: PL = Two
Rule 11: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = Drug Abuse OR Tumor OR Fibroids: PL = Two
Rule 12: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = Abdominal pain: PL = Two
Rule 13: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = Kidney pain: PL = One
Rule 14: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Smoking) AND Complaints = stomach ulcer: PL = One
Rule 15: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Alcohol OR Dialysis) AND Complaints = stomach ulcer: PL = Two
Rule 16: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Radiation) AND Complaints = stomach ulcer OR blood in urine: PL = One
Rule 17: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = Yes) AND (Lifestyle = Water pills) AND Complaints = stomach ulcer: PL = Three
Rule 18: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = NO) AND (Lifestyle = Smoking) AND Complaints = stomach ulcer OR kidney pain: PL = One
Rule 19: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = NO) AND (Lifestyle = Smoking) AND Complaints = stomach ulcer: PL = Two
Rule 20: IF (G&amp;H Disorder = YES) AND (C&amp;I Exposure = NO) AND (Lifestyle = Smoking OR Obesity OR Drug Abuse OR Radiation OR Water Pills OR Dialysis) AND Complaints = stomach ulcer: PL = Three</figDesc></figure>
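<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of how such rules could be applied in a prediction system, the sketch below encodes only the first four rules of the listing (no G and H disorder, C and I exposure present, lifestyle of smoking) as a Python lookup. The function name predict_level, the argument names and the case-insensitive matching are assumptions made for the example; the paper does not describe the prediction system's implementation.</p>
<p>
```python
# Complaint to prediction level (PL) for the first four rules, which share
# the conditions: G and H Disorder = NO, C and I Exposure = Yes,
# Lifestyle = Smoking.
FIRST_FOUR_RULES = {
    "blood in urine": "One",
    "back pain": "Two",
    "tumor": "Three",
    "fibroids": "Three",
}


def predict_level(gh_disorder, ci_exposure, lifestyle, complaint):
    """Return the PL from the first four rules, or None if none applies."""
    conditions = (gh_disorder.lower(), ci_exposure.lower(), lifestyle.lower())
    if conditions != ("no", "yes", "smoking"):
        return None  # covered by later rules that this sketch does not encode
    return FIRST_FOUR_RULES.get(complaint.lower())


print(predict_level("NO", "Yes", "Smoking", "blood in urine"))
print(predict_level("NO", "Yes", "Smoking", "back pain"))
```
</p>
<p>A full system would encode all twenty rules in the same way and, as the paper notes, allow them to be updated from a back-end as conditions change.</p></div>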
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 2 : Statistical Data for the Selected Attributes</head><label>2</label><figDesc></figDesc><table><row><cell>S/N</cell><cell>Attributes</cell><cell>Data</cell><cell>Percentage (%)</cell></row><row><cell>1</cell><cell>Gender</cell><cell></cell><cell></cell></row><row><cell></cell><cell>Male</cell><cell></cell><cell>44.8</cell></row><row><cell></cell><cell>Female</cell><cell></cell><cell>55.2</cell></row><row><cell>2</cell><cell>Lifestyle</cell><cell></cell><cell></cell></row><row><cell></cell><cell>Smoking</cell><cell></cell><cell>39.5</cell></row><row><cell></cell><cell>Obesity</cell><cell></cell><cell>1.9</cell></row><row><cell></cell><cell>Drug Abuse</cell><cell></cell><cell>13.3</cell></row><row><cell></cell><cell>HB Pressure</cell><cell></cell><cell>10.53</cell></row><row><cell></cell><cell>Water Pills</cell><cell></cell><cell>3.98</cell></row><row><cell></cell><cell>Dialysis</cell><cell></cell><cell>0.8</cell></row><row><cell></cell><cell>Alcohol</cell><cell></cell><cell>29.3</cell></row><row><cell></cell><cell>Radiation</cell><cell></cell><cell>0.69</cell></row><row><cell>3</cell><cell>G&amp;H Disorder</cell><cell></cell><cell></cell></row><row><cell></cell><cell>Yes</cell><cell></cell><cell>32.7</cell></row><row><cell></cell><cell>No</cell><cell></cell><cell>67.3</cell></row><row><cell>4</cell><cell>C&amp;I Exposure</cell><cell></cell><cell></cell></row><row><cell></cell><cell>Yes</cell><cell></cell><cell>57.3</cell></row><row><cell></cell><cell>No</cell><cell></cell><cell>42.7</cell></row><row><cell>5</cell><cell>Complaints</cell><cell></cell><cell></cell></row><row><cell></cell><cell>Blood in Urine</cell><cell></cell><cell>11.23</cell></row><row><cell></cell><cell>Back pain</cell><cell></cell><cell>20.17</cell></row><row><cell></cell><cell>Tumor</cell><cell></cell><cell>18.8</cell></row><row><cell></cell><cell>Fibroid</cell><cell></cell><cell>13.02</cell></row><row><cell>6</cell><cell>Stomach Ulcer</cell><cell></cell><cell>14.31</cell></row><row><cell></cell><cell>Kidney pain</cell><cell></cell><cell>15.8</cell></row><row><cell></cell><cell>Abdominal pain</cell><cell></cell><cell>6.67</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head></head><label></label><figDesc>are very important to the research work. WEKA is a collection of machine learning algorithms for data mining tasks. Algorithms implemented in WEKA include: Bayesian classifiers, Decision Trees, Rules, Artificial Neural Network (Functions), Lazy classifiers and miscellaneous classifiers. But for the purpose of this work Rule Induction and Decision Tree classifiers was considered. These families of classifiers have been selected because of their performances in various domains. They have both been successfully applied to a variety of real-world classification tasks in industry, business, science and education with good performances [10]. The classifier system designed for the data modeling as shown in Figure3is of two layers: Layer 1 consists of JRiP, PART and Decision Table of the family of Rules Induction and Layer 2 consists of J48, LAD Tree, Decision Stump, Random Forest, Rep Tree, BF Tree, and</figDesc><table><row><cell cols="2">7 functionalities, which Age Group</cell><cell></cell><cell></cell></row><row><cell></cell><cell>20-30</cell><cell>38</cell><cell>3.8</cell></row><row><cell></cell><cell>31-40</cell><cell>150</cell><cell>15.0</cell></row><row><cell></cell><cell>41-50</cell><cell>231</cell><cell>23.0</cell></row><row><cell></cell><cell>51-60</cell><cell>240</cell><cell>23.9</cell></row><row><cell></cell><cell>61-70</cell><cell>211</cell><cell>21.0</cell></row><row><cell></cell><cell>70 -80</cell><cell>94</cell><cell>9.13</cell></row><row><cell></cell><cell>81-90</cell><cell>42</cell><cell>4.17</cell></row><row><cell></cell><cell>91-100</cell><cell>0</cell><cell>0</cell></row><row><cell>S/N</cell><cell>Variable Name</cell><cell cols="2">Variable Format</cell><cell>Variable Type</cell></row><row><cell>1</cell><cell>Gender</cell><cell>Male, Female</cell><cell></cell><cell>Categorical</cell></row><row><cell>2</cell><cell>Age</cell><cell>25, 
30,……..</cell><cell></cell><cell>Numerical</cell></row><row><cell>3</cell><cell>Lifestyle</cell><cell cols="2">Smoking, Obesity,</cell><cell>Categorical</cell></row><row><cell>4</cell><cell>G&amp;H Disorder</cell><cell>Yes, No</cell><cell></cell><cell>Categorical</cell></row><row><cell>5</cell><cell>C &amp; I Exposure</cell><cell>Yes No</cell><cell></cell><cell>Categorical</cell></row><row><cell>6</cell><cell>Prediction</cell><cell>One, Two, Three</cell><cell></cell><cell>Categorical</cell></row><row><cell></cell><cell>Level</cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3 : Classification Accuracy Comparison between Hold-out and 10-fold Cross Validations in Layer 1 and Layer 2</head><label>3</label><figDesc></figDesc><table><row><cell>10-fold Cross Validation</cell><cell>Hold-out (Percentage Split)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4 : Compared standard metric accuracy details for all the Classification Algorithms</head><label>4</label><figDesc></figDesc><table><row><cell>S/N</cell><cell>Algorithms</cell><cell>TP Rate</cell><cell>FP Rate</cell><cell>Precision</cell><cell>Recall</cell><cell>F-Measure</cell><cell>ROC Area</cell><cell>Build Time (s)</cell><cell>Correctly classified (%)</cell></row><row><cell></cell><cell>J48 Decision Tree</cell><cell>0.747</cell><cell>0.135</cell><cell>0.687</cell><cell>0.714</cell><cell>0.614</cell><cell>0.78</cell><cell>0.03</cell><cell>74.7</cell></row><row><cell></cell><cell>LMT</cell><cell>0.746</cell><cell>0.239</cell><cell>0.73</cell><cell>0.746</cell><cell>0.733</cell><cell>0.863</cell><cell>29.25</cell><cell>74.6</cell></row><row><cell></cell><cell>LAD Tree</cell><cell>0.731</cell><cell>0.292</cell><cell>0.714</cell><cell>0.731</cell><cell>0.702</cell><cell>0.85</cell><cell>0.91</cell><cell>73.1</cell></row><row><cell></cell><cell>RepTree</cell><cell>0.716</cell><cell>0.548</cell><cell>0.536</cell><cell>0.658</cell><cell>0.533</cell><cell>0.571</cell><cell>0.03</cell><cell>71.6</cell></row><row><cell></cell><cell>JRiP</cell><cell>0.709</cell><cell>0.274</cell><cell>0.728</cell><cell>0.749</cell><cell>0.731</cell><cell>0.754</cell><cell>0.06</cell><cell>70.9</cell></row><row><cell></cell><cell>PART</cell><cell>0.718</cell><cell>0.294</cell><cell>0.694</cell><cell>0.718</cell><cell>0.695</cell><cell>0.814</cell><cell>0.03</cell><cell>71.8</cell></row><row><cell></cell><cell>Decision Table</cell><cell>0.704</cell><cell>0.238</cell><cell>0.716</cell><cell>0.704</cell><cell>0.702</cell><cell>0.816</cell><cell>0.05</cell><cell>70.4</cell></row><row><cell></cell><cell>Decision Stump</cell><cell>0.649</cell><cell>0.36</cell><cell>0.579</cell><cell>0.647</cell><cell>0.612</cell><cell>0.669</cell><cell>0.02</cell><cell>64.9</cell></row><row><cell></cell><cell>Random Forest</cell><cell>0.643</cell><cell>0.327</cell><cell>0.622</cell><cell>0.643</cell><cell>0.629</cell><cell>0.74</cell><cell>0.08</cell><cell>64.3</cell></row><row><cell></cell><cell>BF Tree</cell><cell>0.579</cell><cell>0.223</cell><cell>0.718</cell><cell>0.716</cell><cell>0.717</cell><cell>0.748</cell><cell>2.54</cell><cell>57.9</cell></row></table></figure>
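<div xmlns="http://www.tei-c.org/ns/1.0"><p>The comparison in Table 4 can be reproduced with a short Python sketch that ranks the ten classifiers by correctly classified instances and breaks ties by model build time. The accuracy and build time values are copied from Table 4; the function name rank_classifiers and the tie-breaking rule are assumptions for illustration, not the authors' stated selection procedure.</p>
<p>
```python
# (algorithm, correctly classified %, build time in seconds) from Table 4.
TABLE_4 = [
    ("J48 Decision Tree", 74.7, 0.03),
    ("LMT", 74.6, 29.25),
    ("LAD Tree", 73.1, 0.91),
    ("RepTree", 71.6, 0.03),
    ("JRiP", 70.9, 0.06),
    ("PART", 71.8, 0.03),
    ("Decision Table", 70.4, 0.05),
    ("Decision Stump", 64.9, 0.02),
    ("Random Forest", 64.3, 0.08),
    ("BF Tree", 57.9, 2.54),
]


def rank_classifiers(rows):
    """Sort by accuracy (descending), then by build time (ascending)."""
    return sorted(rows, key=lambda row: (-row[1], row[2]))


ranked = rank_classifiers(TABLE_4)
best_name, best_accuracy, best_time = ranked[0]
print(best_name, best_accuracy, best_time)
```
</p>
<p>Under this ordering J48 ranks first with 74.7% accuracy and a 0.03 second build time, just ahead of LMT at 74.6% and 29.25 seconds.</p></div>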
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Pattern of bone tumours seen in a regional orthopaedic hospital in Nigeria</title>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">A</forename><surname>Lasebikan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">U</forename><surname>Nwadinigwe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">C</forename><surname>Onyegbule</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">American Cancer Society Guidelines on nutrition and physical activity for cancer prevention: reducing</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">H</forename><surname>Kushi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Doyle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mccullough</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title/>
		<author>
			<persName><forename type="first">R</forename><surname>Kohavi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">J</forename><surname>Rothleder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Simoudis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Emerging Trends in Business Analytics Published by ACM</title>
		<imprint>
			<biblScope unit="volume">45</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="45" to="48" />
			<date type="published" when="2002-08">2002. August 2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title/>
		<author>
			<persName><surname>Berson</surname></persName>
		</author>
		<author>
			<persName><surname>Smith</surname></persName>
		</author>
		<author>
			<persName><surname>Thearling</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">A meta-learning approach for recommending a subset of white-box classification algorithms for Moodle datasets</title>
		<author>
			<persName><forename type="first">C</forename><surname>Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Olmo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ventura</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
			<pubPlace>, Spain</pubPlace>
		</imprint>
		<respStmt>
			<orgName>Department of Computer Science, University of Cordoba</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Rule Induction -University of Kansas</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">W</forename><surname>Grzymala-Busse</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Extracted</title>
		<imprint>
			<biblScope unit="page" from="20" to="26" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A Hybrid Approach Based On Association Rule Mining and Rule Induction in Data Mining</title>
		<author>
			<persName><forename type="first">S</forename><surname>Kapil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Sheveta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Heena</surname></persName>
		</author>
		<author>
			<persName><surname>Richa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Jasreena</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Soft Computing and Engineering</title>
				<imprint>
			<publisher>IJSCE</publisher>
			<date type="published" when="2013-03">2013. March 2013 146</date>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="2231" to="2307" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><surname>Weka</surname></persName>
		</author>
		<ptr target="http://www.cs.waikato.ac.nz/ml/weka/" />
		<title level="m">WEKA Tutorial. The University of Waikato</title>
				<imprint>
			<date type="published" when="2011-07-20">2011. 2011. 20 July, 2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A Comparative Study of Reduced Error Pruning Method in Decision Tree Algorithms</title>
		<author>
			<persName><forename type="first">W</forename><surname>Mohamed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Nor Haizan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mohd</forename><forename type="middle">N S</forename></persName>
		</author>
		<author>
			<persName><forename type="first">Abdul</forename><forename type="middle">H</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Control System, Computing and Engineering</title>
				<meeting><address><addrLine>Penang, Malaysia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2012-11-25">2012. 23 -25 Nov. 2012</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
