<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">A Survey on Data mining classification approaches</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Anupama</forename><surname>Mishra</surname></persName>
							<email>tiwari.anupama@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Swami Rama Himalayan University</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">B</forename><forename type="middle">B</forename><surname>Gupta</surname></persName>
							<email>gupta.brij@gmail.com</email>
							<affiliation key="aff1">
								<orgName type="institution">National Institute of Technology</orgName>
								<address>
									<postCode>136119</postCode>
									<settlement>Kurukshetra</settlement>
									<region>Haryana</region>
									<country key="IN">India</country>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="institution">Asia University</orgName>
								<address>
									<postCode>413</postCode>
									<settlement>Taichung</settlement>
									<country key="TW">Taiwan</country>
								</address>
							</affiliation>
							<affiliation key="aff3">
								<orgName type="institution">Staffordshire University</orgName>
								<address>
									<addrLine>Stoke-on-Trent</addrLine>
									<postCode>ST4 2DE</postCode>
									<country key="GB">UK</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Dragan</forename><surname>Peraković</surname></persName>
							<email>dperakovic@fpz.unizg.hr</email>
							<affiliation key="aff4">
								<orgName type="institution">University of Zagreb</orgName>
								<address>
									<country key="HR">Croatia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Francisco</forename><forename type="middle">José</forename><surname>García Peñalvo</surname></persName>
							<affiliation key="aff5">
								<orgName type="institution">University of Salamanca</orgName>
								<address>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">A Survey on Data mining classification approaches</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">DDDA25439938BD3F3715B03D308EA3DE</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T10:49+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Bagging</term>
					<term>Naive Bayes</term>
					<term>SVM</term>
					<term>Random Forest</term>
					<term>data mining</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this review article, we discuss a number of different classification algorithms used in data mining for various applications. There are various techniques to analyse data with continuous and discrete values; in this paper, we focus on the algorithms used for classification in data mining. Classification is a technique for categorising data into discrete categories depending on constraints. Genetic algorithms, C4.5, and the Naive Bayes algorithm are examples of classification algorithms.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The practice of identifying previously unknown, valid patterns and relationships in large data sets using advanced data-analysis tools is known as data mining. These tools include statistical models, mathematical algorithms, and machine learning methodologies. Data mining techniques have a wide range of applications <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>. As a result, data mining includes more than just data collection and maintenance; it also includes analysis and prediction. The classification technique, which can handle a wider range of data than regression, is gaining prominence <ref type="bibr" target="#b2">[3]</ref>. Knowledge discovery from datasets is part of data mining: data mining tools and methods are applied to extract patterns and features from large amounts of data <ref type="bibr" target="#b11">[12]</ref>, which can then be applied to other datasets <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b5">6]</ref>. Classification is a process that assigns an object or event to one of a set of predefined classes based on its characteristics, so that its future behaviour can be predicted. Classification methods are used when the data set has already been divided into groups before the classification process begins. Accuracy often depends on the preprocessing of the data, which involves data cleaning (missing, null, and blank values), data integration from multiple sources, data transformation, and discretization <ref type="bibr" target="#b14">[15]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Classification Techniques in Data Mining</head><p>Classification techniques are methods of data analysis that determine the category of an individual based on its attributes <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b9">10]</ref>. These techniques help us better understand individuals by grouping them according to their lifestyle, habits, and traits. Figure <ref type="figure" target="#fig_0">1</ref> presents the classification algorithms generally used in data mining applications. Classification is one of the most commonly used data mining techniques and can be applied to both categorical and numerical attributes. The goal is to predict the class labels of new, unseen observations using training data consisting of labeled examples: an algorithm identifies patterns in the training data that are predictive of new observations <ref type="bibr" target="#b24">[25,</ref><ref type="bibr" target="#b17">18]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Decision Tree</head><p>A decision tree is a class discriminator that recursively splits the training set until each partition contains only, or primarily, samples from one class. Each non-leaf node of the tree contains a split point: a test on one or more attributes that determines how the data is partitioned <ref type="bibr" target="#b12">[13]</ref>.</p></div>
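A minimal sketch of this recursive partitioning using scikit-learn's DecisionTreeClassifier (the library, toy data, and parameters are assumptions for illustration, not from the paper):

```python
# Hedged sketch: a scikit-learn decision tree on invented, separable toy data.
from sklearn.tree import DecisionTreeClassifier

# One numeric attribute; the two classes are separable at x = 0.
X_train = [[-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0]]
y_train = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)

# The learned split point divides the data into two pure leaves, so each
# final partition contains samples from only one class.
preds = tree.predict([[-2.5], [2.5]])
```

On this separable toy set the tree fits the training data perfectly, which illustrates the splitting criterion rather than real-world generalisation.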
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Naive Bayes</head><p>Naive Bayes works with probabilistic models and is widely used in machine learning <ref type="bibr" target="#b10">[11]</ref>. In this model, a probability is calculated for each class, which is then used to predict the class of a new instance. Let 𝑦 be an instance to be classified, represented by the vector 𝑦 = (𝑦1, 𝑦2, ..., 𝑦𝑛), where 𝑛 is the number of independent attributes. For each possible class 𝑐𝑘, the model assigns the instance the posterior probability 𝑝(𝑐𝑘 | 𝑦1, 𝑦2, ..., 𝑦𝑛), computed by Bayes' theorem as 𝑝(𝑐𝑘 | 𝑦) = 𝑝(𝑐𝑘) 𝑝(𝑦 | 𝑐𝑘) / 𝑝(𝑦).</p></div>
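The posterior computation above can be sketched in plain Python for categorical attributes; since p(y) is a common denominator across classes, it can be dropped when comparing them. The data, attribute names, and function names below are invented for illustration:

```python
# Minimal categorical Naive Bayes: score(c) = p(c) * product of p(y_i | c),
# estimated directly from counts (illustrative sketch, no smoothing).
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Estimate class priors p(c) and per-attribute likelihoods p(y_i | c)."""
    priors = Counter(labels)
    likelihoods = defaultdict(Counter)   # (class, attribute index) -> value counts
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            likelihoods[(c, i)][v] += 1
    return priors, likelihoods, len(labels)

def predict_nb(priors, likelihoods, n, row):
    """Pick the class maximizing p(c) * prod p(y_i | c); p(y) cancels out."""
    best_class, best_score = None, -1.0
    for c, count in priors.items():
        score = count / n                         # prior p(c)
        for i, v in enumerate(row):
            score *= likelihoods[(c, i)][v] / count   # likelihood p(y_i | c)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train_nb(rows, labels)
pred = predict_nb(*model, ("rainy", "mild"))
```

A production implementation would add smoothing so that unseen attribute values do not zero out the whole product.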
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Rule Based Classification</head><p>Classification rules are "if-then" rules, where the "if" part is a condition and the "then" part assigns a class. Individual rules are ranked: rule-based ordering ranks rules by their quality, while class-based ordering groups together rules that predict the same class. A good rule should be error-free and cover as many cases as possible.</p></div>
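An ordered rule list of this kind can be sketched as follows; the rules, attribute, class labels, and default fallback are all hypothetical:

```python
# Rule-based classification sketch: "if-then" rules tried in rank order,
# with the first matching rule firing and a default class as fallback.
rules = [
    (lambda r: r["age"] < 18, "minor"),    # highest-ranked rule
    (lambda r: r["age"] >= 65, "senior"),  # next in quality order
]
DEFAULT_CLASS = "adult"                    # covers records no rule matches

def classify(record):
    for condition, label in rules:
        if condition(record):
            return label
    return DEFAULT_CLASS

labels = [classify({"age": a}) for a in (10, 40, 70)]
```

The rule order matters: with rule-based ordering, an earlier (higher-quality) rule shadows any later rule that would also match.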
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">Support Vector Machine</head><p>The Support Vector Machine (SVM) <ref type="bibr" target="#b22">[23]</ref> is a classification technique that can be used to build both classifiers and non-parametric regression models, for both linear and non-linear data. SVM works by finding an optimal hyperplane that separates objects of different classes in the input space based on the training samples. For non-linear data, it transforms the original training data into a higher dimension using a non-linear mapping and searches in that space for the linear optimal separating hyperplane (the "decision boundary"). With a suitable non-linear mapping to a sufficiently high dimension, data from two classes can always be separated by a hyperplane.</p></div>
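A minimal SVM sketch using scikit-learn's SVC (the library and the toy data are assumptions, not from the paper); here a linear kernel suffices, while a kernel such as "rbf" would supply the non-linear mapping described above:

```python
# Hedged sketch: linear-kernel SVM finding a maximum-margin hyperplane
# on invented, linearly separable 2-D data.
from sklearn.svm import SVC

X_train = [[-2.0, 0.0], [-1.0, 0.5], [1.0, -0.5], [2.0, 0.0]]
y_train = [0, 0, 1, 1]

clf = SVC(kernel="linear")    # swap kernel="rbf" for non-linear data
clf.fit(X_train, y_train)

# The fitted hyperplane separates the two classes in the input space.
preds = clf.predict([[-1.5, 0.0], [1.5, 0.0]])
```

For data like XOR that no hyperplane in the input space can separate, the kernel trick performs the implicit mapping to a higher dimension where a separating hyperplane exists.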
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.">Genetic Algorithms</head><p>In genetic algorithms (GA), association rule mining is utilised to uncover indeterminate solutions <ref type="bibr" target="#b8">[9]</ref>. GA is typically implemented on a small collection of categorical data; once run, it produces high-level prediction rules for selecting better attributes. The Michigan technique encodes a single prediction rule in each individual of the population, lowering the cost <ref type="bibr" target="#b6">[7]</ref>. The Pittsburgh method <ref type="bibr" target="#b4">[5]</ref> instead encodes a whole set of prediction rules in each individual; when classifying, the overall quality of the rule set is evaluated rather than the quality of each individual rule. Rules are generalised or specialised based on the data, for example by combining conditions with logical OR and logical AND operators.</p></div>
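A toy genetic-algorithm sketch for attribute selection, with selection, one-point crossover, mutation, and elitism; the target mask, fitness function, and all parameters are invented for illustration and are not from the paper:

```python
# Each individual is a bit string marking which attributes a rule uses;
# fitness counts agreement with a hidden "ideal" attribute subset.
import random

random.seed(0)
TARGET = [1, 0, 1, 1, 0, 1, 0, 0]          # hypothetical ideal subset
fitness = lambda ind: sum(a == b for a, b in zip(ind, TARGET))

def evolve(pop_size=20, generations=40, mutation_rate=0.1):
    n = len(TARGET)
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    initial_best = max(fitness(ind) for ind in pop)
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = pop[:2]                        # elitism: keep the two best
        while len(next_pop) < pop_size:
            p1, p2 = random.sample(pop[:10], 2)   # select among the fitter half
            cut = random.randrange(1, n)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if random.random() < mutation_rate else g
                     for g in child]              # bit-flip mutation
            next_pop.append(child)
        pop = next_pop
    return initial_best, max(fitness(ind) for ind in pop)

initial_best, final_best = evolve()
```

Because elitism preserves the best individuals each generation, the best fitness never decreases across generations.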
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Model Evaluation and Selection</head><p>The task of choosing a model is challenging because many models are often equivalent in terms of accuracy but differ in computational complexity. Evaluating and selecting the best model for a particular application depends on this cost-complexity trade-off. One alternative in tackling the task is to carry out an exhaustive search over all possible models, which may be costly in terms of computational time or storage space. Figure <ref type="figure" target="#fig_1">2</ref> depicts the types of methods used for evaluation and selection of the model <ref type="bibr" target="#b15">[16]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Hold-Out</head><p>Hold-out validation is a technique used to estimate classification accuracy. The available data is split into two disjoint parts, one for training and one for testing. Because the test data is withheld from training, evaluating the model only on it prevents the accuracy estimate from being inflated by over-fitting to the training set.</p></div>
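The split can be sketched in plain Python; the test fraction and seed below are illustrative choices:

```python
# Hold-out split sketch: shuffle, then carve off a disjoint test portion.
import random

def holdout_split(data, test_fraction=0.3, seed=42):
    """Shuffle and split data into disjoint training and test parts."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

data = list(range(10))
train, test = holdout_split(data)
```

In practice one would split index lists so that features and labels stay aligned, but the disjointness property shown here is the essential point.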
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">n-fold Cross Validation</head><p>The available data is divided into n distinct subsets of equal size. In turn, each subset is held out as the test set while a classifier is trained on the remaining n − 1 subsets. The operation is repeated n times, and the reported accuracy is the average of the n accuracies. 10-fold and 5-fold cross-validation are often utilised. This strategy is employed when the available data is small.</p></div>
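The fold rotation can be sketched in plain Python; the stand-in "classifier" below (a mean threshold) and the toy data are invented for illustration:

```python
# n-fold cross-validation sketch: generate folds, train on the rest,
# test on the held-out fold, and average the n accuracies.

def folds(n_items, n_folds):
    """Partition item indices into n_folds near-equal contiguous folds."""
    size, rem = divmod(n_items, n_folds)
    start, out = 0, []
    for i in range(n_folds):
        end = start + size + (1 if i < rem else 0)
        out.append(list(range(start, end)))
        start = end
    return out

X = [-4, -3, -2, -1, 1, 2, 3, 4]
y = [0, 0, 0, 0, 1, 1, 1, 1]

accuracies = []
for test_idx in folds(len(X), n_folds=4):
    # "Train" a trivial threshold classifier on the remaining folds...
    train_idx = [i for i in range(len(X)) if i not in test_idx]
    boundary = sum(X[i] for i in train_idx) / len(train_idx)
    # ...then evaluate it on the held-out fold.
    correct = sum((X[i] > boundary) == (y[i] == 1) for i in test_idx)
    accuracies.append(correct / len(test_idx))

mean_accuracy = sum(accuracies) / len(accuracies)
```

Real implementations shuffle (or stratify) before folding so each fold reflects the class distribution.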
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Leave-one-out cross validation</head><p>This method can be used when the data volume is small. It is a special case of cross-validation in which each fold contains exactly one test case, with all of the remaining data used for training <ref type="bibr" target="#b16">[17]</ref>. When there are m examples in the original data set, it is referred to as m-fold cross-validation.</p></div>
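A leave-one-out loop can be sketched with a simple 1-nearest-neighbour classifier; the data and classifier choice are illustrative assumptions:

```python
# Leave-one-out CV sketch: each example is the test case exactly once,
# and the classifier is fitted on the remaining m - 1 examples.

X = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
y = ["a", "a", "a", "b", "b", "b"]

correct = 0
for i in range(len(X)):
    # Train on everything except example i.
    train = [(x, c) for j, (x, c) in enumerate(zip(X, y)) if j != i]
    # 1-nearest-neighbour prediction for the held-out example.
    nearest = min(train, key=lambda t: abs(t[0] - X[i]))
    correct += nearest[1] == y[i]

loocv_accuracy = correct / len(X)   # m-fold CV with m = len(X)
```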
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Validation Set</head><p>A validation set is widely used in learning algorithms to estimate parameters. In such instances, the final parameter values are those that provide the highest accuracy on the validation set. Cross-validation can also be used to estimate parameters. The data may be divided into three sets: 1. training set, 2. validation set, 3. test set.</p></div>
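A three-way split can be sketched as follows; candidate parameter values are scored on the validation set, and the chosen value is assessed once on the test set (data, thresholds, and the trivial classifier are invented for illustration):

```python
# Three-way split sketch: pick the decision threshold scoring best on the
# validation set, then report accuracy on the untouched test set.

X = [-4, -3, -2, -1, 0.5, 1, 2, 3, 4, 5]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

train_X, train_y = X[:4] + X[6:8], y[:4] + y[6:8]   # 1. training set
val_X, val_y = [X[4], X[8]], [y[4], y[8]]           # 2. validation set
test_X, test_y = [X[5], X[9]], [y[5], y[9]]         # 3. test set

def accuracy(threshold, xs, ys):
    return sum((x > threshold) == (c == 1) for x, c in zip(xs, ys)) / len(xs)

# Parameter estimation: the final threshold is the candidate with the
# highest accuracy on the validation set.
candidates = [-2.0, 0.0, 1.5]
best_threshold = max(candidates, key=lambda t: accuracy(t, val_X, val_y))
test_accuracy = accuracy(best_threshold, test_X, test_y)
```

The test set plays no role in choosing the parameter, so its accuracy remains an unbiased final estimate.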
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5.">Minimum Description Length (MDL)</head><p>Missing values are treated by MDL as though they were missing at random. Zeros replace sparse numerical data, while zero vectors replace sparse categorical data <ref type="bibr" target="#b13">[14]</ref>. Missing values in nested columns are considered sparse. In MDL, both the model's size and the reduction in uncertainty achieved by using the model matter <ref type="bibr" target="#b21">[22]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Techniques to Improve Classification Accuracy</head><p>Classification accuracy describes how well a model can assign the correct class to a given input. Improving classification accuracy is important because more accurate and fair models reduce the risk of unjustified misclassifications and false alarms.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Bagging</head><p>The most common technique for improving classification accuracy is bagging <ref type="bibr" target="#b20">[21]</ref>. During the training phase, the training data is repeatedly resampled with replacement (a process called bootstrapping), and each classifier in the ensemble is trained on one bootstrap sample. The trained classifiers then vote to classify new data, and the resulting ensemble typically forms a better model than any single component could be alone. In contrast to boosting, where each classifier concentrates on the examples misclassified by its predecessors, the classifiers in bagging are trained independently of one another. Boosting, another popular technique, instead applies higher weights to examples that the earlier classifiers misclassify.</p></div>
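A bagging sketch using scikit-learn's BaggingClassifier, whose default base estimator is a decision tree (the library, data, and parameters are assumptions, not from the paper):

```python
# Bagging sketch: each tree is trained on a bootstrap sample of the
# invented toy data, and the ensemble votes on new observations.
from sklearn.ensemble import BaggingClassifier

X_train = [[-4.0], [-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0], [4.0]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

bag = BaggingClassifier(n_estimators=10, random_state=0)
bag.fit(X_train, y_train)        # bootstrapping happens inside fit
preds = bag.predict([[-2.5], [2.5]])   # majority vote of the 10 trees
```

Because each tree sees a slightly different bootstrap sample, their individual errors tend to cancel in the vote, which is where the variance reduction comes from.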
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Boosting</head><p>Boosting algorithms are well known to be high-performing classifiers <ref type="bibr" target="#b18">[19]</ref>. They are versatile and provide good accuracy even when there is a large imbalance in the training data, but they can be computationally expensive for online or near real-time processing. Boosting produces classifiers in succession: each classifier depends on the preceding one and concentrates on its errors. Examples that previous classifiers predicted wrongly are selected more frequently and weighted accordingly: the weights of misclassified data are increased, while the weights of correctly classified data are reduced. Boosting can be used in many different scenarios, but it is most often applied when the goal is to identify which class (or label) an observation belongs to <ref type="bibr" target="#b19">[20]</ref>.</p><p>A significant issue with boosting models is that they do not always converge well, which prevents the algorithm from making accurate predictions. Many techniques can mitigate this issue, one of them being early stopping.</p><p>Boosting models work by accumulating error terms, which are then used to adjust weights on different parts of the model or training data. The more data available for a given class, the larger its weight in the model's prediction function, and vice versa for the other classes.</p></div>
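A boosting sketch using scikit-learn's AdaBoostClassifier, whose default base learner is a decision stump; successive stumps reweight the examples the previous ones got wrong (library, data, and parameters are assumptions, not from the paper):

```python
# AdaBoost sketch on invented, separable toy data: each new stump is fitted
# to a reweighted sample emphasising previously misclassified examples.
from sklearn.ensemble import AdaBoostClassifier

X_train = [[-4.0], [-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0], [4.0]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

boost = AdaBoostClassifier(n_estimators=10, random_state=0)
boost.fit(X_train, y_train)
preds = boost.predict([[-2.5], [2.5]])
```

The n_estimators parameter is also where early stopping would act: capping the number of rounds limits how far the accumulated error terms can drive the weights.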
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Occam's Razor</head><p>Occam's razor applies when selecting among theories that fit the data and classify unfamiliar objects: if two or more models have similar generalisation error, the simpler model should be preferred over the more complex one. There is a greater probability that a sophisticated model will accidentally fit mistakes in the data <ref type="bibr" target="#b21">[22]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Random Forest</head><p>Random forest achieves very high accuracy, comparable to gradient boosting and support vector machines, and is available in general-purpose classification and regression tools such as WEKA <ref type="bibr" target="#b23">[24]</ref>. A random forest is built from two ingredients: 1. regression and classification trees; 2. bootstrap samples, i.e. samples drawn from the original dataset with replacement that are the same size as the original dataset. Each tree is trained on its own bootstrap sample.</p></div>
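A random-forest sketch using scikit-learn's RandomForestClassifier, which combines exactly those two ingredients: many trees, each fitted on a bootstrap sample (library, data, and parameters are assumptions, not from the paper):

```python
# Random forest sketch on invented, separable toy data: an ensemble of
# trees trained on bootstrap samples, predicting by majority vote.
from sklearn.ensemble import RandomForestClassifier

X_train = [[-4.0], [-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0], [4.0]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(X_train, y_train)
preds = forest.predict([[-2.5], [2.5]])
```

Beyond bagging, random forest also randomises the attributes considered at each split, further decorrelating the trees when there is more than one feature.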
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>This review discusses numerous data mining classification techniques; each technique has its own set of advantages and disadvantages. Data mining is a broad term that encompasses a variety of approaches for analysing vast amounts of data, drawing on technologies such as machine learning and deep learning together with statistics. These areas provide a large number of data mining algorithms for accomplishing various data-analysis tasks; based on the behaviour of the data, suitable algorithms and evaluation methods can be applied to analyse it.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Classification Algorithm</figDesc><graphic coords="3,89.29,84.19,436.80,201.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Model Evaluation and Selection</figDesc><graphic coords="5,89.29,84.19,436.80,201.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Techniques to Improve Classification Accuracy</figDesc><graphic coords="7,89.29,84.19,436.80,201.60" type="bitmap" /></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Quick Medical Data Access Using Edge Computing</title>
		<author>
			<persName><surname>Mamta</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Insights2Techinfo</title>
		<imprint>
			<biblScope unit="page">1</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Artificial Intelligence and Machine learning for Smart and Secure Healthcare System</title>
		<author>
			<persName><forename type="first">Sandeep</forename><surname>Kumar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Insights2Techinfo</title>
		<imprint>
			<biblScope unit="page">1</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Training an agent for fps doom game using visual reinforcement learning and vizdoom</title>
		<author>
			<persName><forename type="first">K</forename><surname>Adil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Grigoriev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">B</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Rho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Advanced Computer Science and Applications</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">12</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Parallel implementation for 3d medical volume fuzzy segmentation</title>
		<author>
			<persName><forename type="first">S</forename><surname>Alzu'bi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Shehab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Al-Ayyoub</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jararweh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Gupta</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition Letters</title>
		<imprint>
			<biblScope unit="volume">130</biblScope>
			<biblScope unit="page" from="312" to="318" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><surname>Witten</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Frank</surname></persName>
		</author>
		<title level="m">Data Mining: Practical Machine Learning Tools And Techniques</title>
				<meeting><address><addrLine>San Francisco</addrLine></address></meeting>
		<imprint>
			<publisher>Morgan Kaufmann</publisher>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
	<note>2nd Edition</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Constructing X-Of-N Attributes For Decision Tree Learning</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Zheng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="page" from="35" to="75" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Accelerating 3D medical volume segmentation using GPUs</title>
		<author>
			<persName><forename type="first">M</forename><surname>Al-Ayyoub</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Alzu'bi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jararweh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Shehab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">B</forename><surname>Gupta</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Multimedia Tools and Applications</title>
		<imprint>
			<biblScope unit="volume">77</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="4939" to="4958" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Bayesian Network Classifiers</title>
		<author>
			<persName><forename type="first">N</forename><surname>Friedman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Geiger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Goldszmidt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="page" from="131" to="163" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">From Data Mining To Knowledge Discovery In Databases</title>
		<author>
			<persName><forename type="first">U</forename><surname>Fayyad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Piatetsky-Shapiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Smyth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Ai Magazine</title>
		<imprint>
			<date type="published" when="1996">1996</date>
			<publisher>American Association For Artificial Intelligence</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Being Bayesian About Network Structure: A Bayesian Approach To Structure Discovery In Bayesian Networks</title>
		<author>
			<persName><forename type="first">N</forename><surname>Friedman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Koller</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Machine Learning</title>
		<imprint>
			<biblScope unit="volume">50</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="95" to="125" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">C4.5 -Programs For Machine Learning</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Quinlan</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1993">1993</date>
			<publisher>Morgan Kaufmann Publishers</publisher>
			<pubPlace>San Francisco, Ca</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Identification Of Foreign-Accented French Using Data Mining Techniques</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">D</forename><surname>Bianca</surname></persName>
		</author>
		<imprint/>
		<respStmt>
			<orgName>Computer Sciences Laboratory For Mechanics And Engineering Sciences (Limsi</orgName>
		</respStmt>
	</monogr>
	<note>PhilippeBoula De Mareüil And Martine Adda-Decker</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Simplifying Decision Trees:A Survey</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Breslow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">W</forename><surname>Aha</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge Engineering Review</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="1" to="40" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">An Introduction To Bayesian Networks</title>
		<author>
			<persName><forename type="first">F</forename><surname>Jensen</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1996">1996</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Machine learning for computer and cyber security: principle, algorithms, and practices</title>
		<editor>Gupta, B. B., &amp; Sheng, Q. Z.</editor>
		<imprint>
			<date type="published" when="2019">2019</date>
			<publisher>CRC Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Classification of spammer and nonspammer content in online social network using genetic algorithm-based feature selection</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>Sahoo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">B</forename><surname>Gupta</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Enterprise Information Systems</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="710" to="736" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">The Performance Of Bayesian Network Classifiers Constructed Using Different Techniques</title>
		<author>
			<persName><forename type="first">M</forename><surname>Madden</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings Of European Conference On Machine Learning, Workshopon Probabilistic Graphical Models ForClassification</title>
				<meeting>Of European Conference On Machine Learning, Workshopon Probabilistic Graphical Models ForClassification</meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="59" to="70" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">Michael</forename><surname>Kearns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dana</forename><surname>Ron</surname></persName>
		</author>
		<title level="m">Algorithmic Stability And Sanity-Check Bounds For Leave-One-Out Cross Validation</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Boosting A Weak Learning Algorithm By Majority</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Freund</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information And Computation</title>
		<imprint>
			<biblScope unit="volume">121</biblScope>
			<biblScope unit="page" from="256" to="285" />
			<date type="published" when="1995">1995</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">Process Consistency ForAdaboost</title>
		<author>
			<persName><forename type="first">W</forename><surname>Jiang</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2000">2000</date>
		</imprint>
		<respStmt>
			<orgName>Dept. Of Statistics,Northwestern University</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Tech. Report</note>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">Bagging Predictors, Machine Learning</title>
		<author>
			<persName><forename type="first">Leo</forename><surname>Breiman</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Voting Over Multiple Condensed Nearest Neighbors</title>
		<author>
			<persName><forename type="first">E</forename><surname>Alpaydin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence Review</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="115" to="132" />
			<date type="published" when="1997">1997</date>
			<publisher>Kluwer Academic Publishers</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><surname>Cristianini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shawe-Taylor</surname></persName>
		</author>
		<title level="m">An Introduction To Support Vector Machines</title>
				<meeting><address><addrLine>Cambridge</addrLine></address></meeting>
		<imprint>
			<publisher>Cambridge University Press</publisher>
			<date type="published" when="2000">2000</date>
			<biblScope unit="volume">1</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">Classification And Regression Trees</title>
		<author>
			<persName><forename type="first">L</forename><surname>Breiman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Friedman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">A</forename><surname>Olshen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">J</forename><surname>Stone</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1984">1984</date>
			<pubPlace>Wadsworth, Belmont</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Real-time detection of fake account in twitter using machine-learning approach</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>Sahoo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">B</forename><surname>Gupta</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advances in computational intelligence and communication technology</title>
				<meeting><address><addrLine>Singapore</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="149" to="159" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
