<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Analysis of the Performance of Naive Bayes and K-Nearest Neighbor Classifiers</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Hubert</forename><surname>Bojda</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Applied Mathematics</orgName>
								<orgName type="institution">Silesian University of Technology</orgName>
								<address>
									<addrLine>Kaszubska 23</addrLine>
									<postCode>44100</postCode>
									<settlement>Gliwice</settlement>
									<country key="PL">POLAND</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Dawid</forename><surname>Gala</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Applied Mathematics</orgName>
								<orgName type="institution">Silesian University of Technology</orgName>
								<address>
									<addrLine>Kaszubska 23</addrLine>
									<postCode>44100</postCode>
									<settlement>Gliwice</settlement>
									<country key="PL">POLAND</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Analysis of the Performance of Naive Bayes and K-Nearest Neighbor Classifiers</title>
					</analytic>
					<monogr>
						<title level="m">Information Society and University Studies</title>
						<idno type="ISSN">1613-0073</idno>
						<meeting>
							<address>
								<settlement>Kaunas</settlement>
								<country key="LT">Lithuania</country>
							</address>
						</meeting>
						<imprint>
							<date type="published" when="2024-05-17">May 17, 2024</date>
						</imprint>
					</monogr>
					<idno type="MD5">647F105A8CBF3D211D514CB99CB15E5D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T16:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>artificial intelligence</term>
					<term>London weather data</term>
					<term>dataset</term>
					<term>machine learning algorithms</term>
					<term>K-Nearest Neighbors (KNN)</term>
					<term>Naive Bayes</term>
					<term>accuracy</term>
					<term>F1-score</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In our study, we implemented and compared two machine learning algorithms: K-Nearest Neighbors (KNN) and Naive Bayes. For each algorithm, we conducted 10 test runs to evaluate their performance. The results indicated that the KNN algorithm achieved an accuracy ranging from 0.80 to 0.82, demonstrating its robustness in predicting weather conditions based on London's historical weather data. The Naive Bayes algorithm, on the other hand, achieved an accuracy ranging from 0.74 to 0.76. Although slightly lower than KNN, these results still reflect the Naive Bayes algorithm's effectiveness in handling the weather data. Overall, this analysis provides valuable insights into the predictive capabilities of these algorithms.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Artificial intelligence methods offer many examples and uses of machine learning algorithms. This is increasingly important today, because more and more systems implement AI algorithms of varying sophistication. For example, they can be used in deep neural network models for imbalanced medical data of IoT systems <ref type="bibr" target="#b0">[1]</ref> or to predict the spread of the COVID-19 virus <ref type="bibr" target="#b1">[2]</ref>. This artificial intelligence system was developed to explore and validate the effectiveness of the K-Nearest Neighbors (KNN) and Gaussian Naive Bayes algorithms. To achieve this, we selected a weather database, which is particularly well-suited for testing these algorithms due to its mix of numerical and categorical data. The database includes columns with numerical values such as temperature, humidity, and wind speed, alongside a column containing categorical information about weather conditions at the time of observation, including categories like 'Clear', 'Overcast', and 'Foggy'. This rich and diverse dataset facilitates effective training and testing, enabling a thorough evaluation of the algorithms' performance.</p><p>The numerical data suits the KNN algorithm, which predicts outcomes based on the distance between data points. For KNN, we use the Euclidean distance measure to find the closest neighbors to a given data point and make predictions based on these neighbors. The categorical weather classifications, on the other hand, are well-suited for the Naive Bayes algorithm. Naive Bayes calculates the probability of each class from the feature distributions and assumes that the features are independent given the class label, which makes it efficient for categorical data.</p><p>We then divided the dataset into a 70:30 ratio for training and testing. This split provides a substantial amount of data for training the models while reserving enough data to accurately assess their performance. Our benchmark tests involved evaluating the algorithms using standard performance metrics such as accuracy, precision, recall, and F1-score. These metrics offer a comprehensive view of the algorithms' ability to classify weather conditions correctly. Additionally, we performed cross-validation to ensure that our results were not overly dependent on a particular train-test split, further validating the robustness and reliability of our models.</p></div>
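The split-and-evaluate procedure above can be sketched with scikit-learn. This is an illustrative sketch, not the authors' code: the data here is a synthetic stand-in generated with `make_classification`, and the variable names (`X`, `y`, `model`) are ours, so the scores will not match the paper's numbers.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Synthetic stand-in for the weather data: numeric features, 3 classes.
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=0, n_classes=3, random_state=0)

# 70:30 train/test split, as in the study.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

model = KNeighborsClassifier(n_neighbors=6).fit(X_train, y_train)
pred = model.predict(X_test)

# The four metrics named above (macro-averaged over the classes).
acc = accuracy_score(y_test, pred)
prec = precision_score(y_test, pred, average="macro")
rec = recall_score(y_test, pred, average="macro")
f1 = f1_score(y_test, pred, average="macro")

# Cross-validation to check the result is not split-dependent.
cv_scores = cross_val_score(KNeighborsClassifier(n_neighbors=6), X, y, cv=5)
print(acc, prec, rec, f1, cv_scores.mean())
```

Macro averaging is one of several choices `average` accepts; the paper does not state which one was used.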
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Methodology</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">K Nearest Neighbors</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.1.">Description</head><p>The KNN classifier <ref type="bibr" target="#b2">[3]</ref>, or k-nearest neighbor algorithm, classifies and predicts the value of the variable specified in the decision column of the database. The algorithm compares the values of the columns that describe the phenomenon with the values of the corresponding variables in the learning set, and the prediction is based on the k closest observations from the learning set.</p><p>An important aspect in the creation of the classifier is the selection of an appropriate metric that calculates the distance between a new observation and the observations of the training set. The most popular metrics are the Euclidean, Minkowski, and Manhattan distances.</p><p>With successive iterations, the division of the data is corrected with respect to the given metric. The algorithm moves data between classes so that the variance within each class is as small as possible.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.2.">Formulas</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Calculating Distance Between Points:</head><p>The Euclidean distance 𝑑 between two points 𝑥 𝑖 = (𝑥 𝑖1 , 𝑥 𝑖2 , . . . , 𝑥 𝑖𝑛 ) and 𝑥 𝑗 = (𝑥 𝑗1 , 𝑥 𝑗2 , . . . , 𝑥 𝑗𝑛 ) is given by:</p><formula>𝑑(𝑥 𝑖 , 𝑥 𝑗 ) = √( ∑_{𝑘=1}^{𝑛} (𝑥 𝑖𝑘 − 𝑥 𝑗𝑘 )² )<label>(1)</label></formula></div>
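As a quick sanity check, the Euclidean distance defined above can be computed directly (a minimal sketch; the function name is ours):

```python
import math

def euclidean(xi, xj):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

print(euclidean((0, 0), (3, 4)))  # → 5.0
```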
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Finding Nearest Neighbors</head><p>To find the 𝑘 nearest neighbors for a test point, compute the Euclidean distances from the test point to all points in the training set and select the 𝑘 points with the smallest distances.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Classification by Majority Voting</head><p>For classification, the class of the test point is determined by the classes of its 𝑘 nearest neighbors. The class 𝐶 of the test point is given by:</p><formula>𝐶 = arg max_𝑐 ∑_{𝑖=1}^{𝑘} 1(𝑦 𝑖 = 𝑐)<label>(2)</label></formula><p>where 1(𝑦 𝑖 = 𝑐) is an indicator function that equals 1 if 𝑦 𝑖 = 𝑐 and 0 otherwise.</p></div>
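Majority voting over the neighbors' labels can be done with `collections.Counter`; a minimal illustration (function name is ours):

```python
from collections import Counter

def majority_vote(neighbor_labels):
    # The most frequent class among the k nearest neighbors' labels.
    return Counter(neighbor_labels).most_common(1)[0][0]

print(majority_vote(["rain", "clear", "rain"]))  # → rain
```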
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.3.">Classifier Algorithm</head><p>The KNN classifier algorithm is shown below:</p><formula xml:id="formula_0">Algorithm 1 KNN Algorithm
Require: 𝑋_𝑡𝑟𝑎𝑖𝑛, 𝑦_𝑡𝑟𝑎𝑖𝑛, 𝑋_𝑡𝑒𝑠𝑡, 𝑦_𝑡𝑒𝑠𝑡
1: 𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛𝑠 ← []
2: for 𝑥 in 𝑋_𝑡𝑒𝑠𝑡 do
3: Calculate and sort distances from 𝑥 to 𝑋_𝑡𝑟𝑎𝑖𝑛.
4: Select 𝑘 nearest neighbors' labels.
5: Perform majority voting to determine the most frequent label.
6: Add the most frequent label to 𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛𝑠.
7: Calculate the accuracy by comparing 𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛𝑠 to 𝑦_𝑡𝑒𝑠𝑡.</formula></div>
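Algorithm 1 translates almost line-for-line into Python. This is an illustrative from-scratch sketch with our own names, not the authors' implementation:

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, X_test, k=6):
    """From-scratch KNN following Algorithm 1."""
    predictions = []
    for x in X_test:
        # Step 3: calculate and sort distances from x to X_train.
        dists = sorted(
            (math.dist(x, xt), label) for xt, label in zip(X_train, y_train)
        )
        # Step 4: select the k nearest neighbors' labels.
        labels = [label for _, label in dists[:k]]
        # Steps 5-6: majority vote, append the most frequent label.
        predictions.append(Counter(labels).most_common(1)[0][0])
    return predictions

def accuracy(predictions, y_test):
    # Step 7: compare predictions to y_test.
    return sum(p == y for p, y in zip(predictions, y_test)) / len(y_test)
```

A usage example on a toy set: `knn_predict([(0, 0), (0, 1), (5, 5), (6, 5)], ["a", "a", "b", "b"], [(0, 0.5), (5, 6)], k=3)` returns one label per test point.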
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Naive Bayes</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.1.">Description</head><p>Before describing Gaussian Naive Bayes, we would like to describe how the Naive Bayes algorithm works. Consider classifying a fruit as an apple from its color, roundness, and diameter: a naive Bayes classifier considers each of these features to contribute independently to the probability that the fruit is an apple, regardless of any possible correlations between the color, roundness, and diameter features. Based on prior knowledge of conditions that may be related to an event, Bayes' theorem describes the probability of the event.</p><p>So what is Gaussian Naive Bayes? <ref type="bibr" target="#b3">[4]</ref> Gaussian Naive Bayes is a variant of the Naive Bayes method in which continuous attributes are considered and the data features are assumed to follow a Gaussian distribution throughout the dataset. In the terminology of the Sklearn library, Gaussian Naive Bayes is a classification algorithm, based on Naive Bayes, that works on continuous, normally distributed features. The Naive Bayes classifier is based on Bayes' theorem and the assumption of conditional independence of features. The formula is as follows:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.2.">Formulas</head><formula xml:id="formula_1">𝑃 (𝐶 𝑘 |x) = 𝑃 (𝐶 𝑘 ) • 𝑃 (x|𝐶 𝑘 ) / 𝑃 (x)<label>(3)</label></formula><p>where:</p><p>• 𝑃 (𝐶 𝑘 |x) is the posterior probability of class 𝐶 𝑘 given the sample x,</p><p>• 𝑃 (𝐶 𝑘 ) is the prior probability of class 𝐶 𝑘 ,</p><p>• 𝑃 (x|𝐶 𝑘 ) is the likelihood of sample x given class 𝐶 𝑘 ,</p><p>• 𝑃 (x) is the total probability of the sample x.</p><p>Assuming conditional independence of the features x = (𝑥 1 , 𝑥 2 , . . . , 𝑥 𝑛 ), we can write:</p><formula xml:id="formula_2">𝑃 (x|𝐶 𝑘 ) = 𝑃 (𝑥 1 , . . . , 𝑥 𝑛 |𝐶 𝑘 ) = ∏_{𝑖=1}^{𝑛} 𝑃 (𝑥 𝑖 |𝐶 𝑘 )<label>(4)</label></formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Therefore, the final formula for the Naive Bayes classifier is:</p><formula xml:id="formula_3">𝑃 (𝐶 𝑘 |x) ∝ 𝑃 (𝐶 𝑘 ) • ∏_{𝑖=1}^{𝑛} 𝑃 (𝑥 𝑖 |𝐶 𝑘 )<label>(5)</label></formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.3.">Classifier Algorithm</head><p>The Naive Bayes classifier algorithm is shown below:</p><formula>Algorithm 2 Description of the Naive Bayes Algorithm
Require: 𝑋_𝑡𝑟𝑎𝑖𝑛, 𝑦_𝑡𝑟𝑎𝑖𝑛, 𝑋_𝑡𝑒𝑠𝑡
1: 𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛𝑠 ← []
2: Calculate the prior probabilities for each class using 𝑦_𝑡𝑟𝑎𝑖𝑛.
3: Calculate the mean and variance of each feature for each class using 𝑋_𝑡𝑟𝑎𝑖𝑛 and 𝑦_𝑡𝑟𝑎𝑖𝑛.
4: for each point 𝑥 in 𝑋_𝑡𝑒𝑠𝑡 do
5: Calculate the likelihood of 𝑥 for each class using the Gaussian probability density function.
6: Calculate posterior probabilities for each class based on the features of point 𝑥.
7: Select the class with the highest posterior probability as the predicted label for point 𝑥.
8: Add the predicted label to the predictions list.
9: end for
10: return 𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑖𝑜𝑛𝑠</formula></div>
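The Gaussian Naive Bayes steps can be sketched from scratch as follows. This is an illustrative sketch under the Gaussian assumption; the function and variable names are ours, not the authors' code, and the likelihoods are accumulated in log space for numerical stability:

```python
import math
from collections import defaultdict

def gnb_fit(X_train, y_train):
    """Class priors plus per-class feature means and variances."""
    by_class = defaultdict(list)
    for x, y in zip(X_train, y_train):
        by_class[y].append(x)
    priors, stats = {}, {}
    for c, rows in by_class.items():
        priors[c] = len(rows) / len(X_train)
        feats = []
        for col in zip(*rows):  # one column per feature
            m = sum(col) / len(col)
            v = sum((val - m) ** 2 for val in col) / len(col) + 1e-9  # smoothed variance
            feats.append((m, v))
        stats[c] = feats
    return priors, stats

def log_gaussian(x, mean, var):
    # Log of the Gaussian probability density function.
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def gnb_predict(priors, stats, X_test):
    predictions = []
    for x in X_test:
        # Posterior ∝ prior · product of per-feature likelihoods (Eq. 5),
        # computed as a sum of logs; pick the class with the highest value.
        posteriors = {
            c: math.log(priors[c])
               + sum(log_gaussian(v, m, s) for v, (m, s) in zip(x, stats[c]))
            for c in priors
        }
        predictions.append(max(posteriors, key=posteriors.get))
    return predictions
```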
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Dataset preparation</head><p>The Weather Dataset <ref type="bibr" target="#b4">[5]</ref>, extracted by MUTHUKUMAR.J, contains data from the years 1979 to 2021. Records that did not meet the following dependency have been removed from the original database:</p><formula xml:id="formula_4">• Formatted Date • Apparent Temperature (C) • Precip Type • Loud Cover • Daily Summary • Humidity • Wind Bearing (degrees)</formula><p>In the first phase of testing, we worked on three abstract classes: 'rain,' 'clear,' and 'overcast.' The accuracy of the classifiers was around 90% for KNN and 80% for Naive Bayes. However, the confusion matrices revealed that the count of entities labeled 'rain' in the 'Summary' column was very low. Consequently, the next step was to identify the abstract classes with the highest count of entities. To address this, we analyzed the distribution of the 'Summary' column values across the different classes. This analysis helped us determine which classes had the highest representation, allowing us to focus our efforts on balancing the dataset and improving the overall performance of the classifiers. Based on this, records that do not belong to one of these abstract classes were removed, as the remaining classes had a negative impact <ref type="bibr" target="#b5">[6]</ref> on the model's performance. Examples of removed classes: "Breezy and Mostly Cloudy", "Windy and Foggy", "Windy and Dry", "Dry and Partly Cloudy".</p></div>
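The preparation step can be sketched with pandas. This is a hypothetical sketch: the tiny in-memory frame stands in for the Kaggle CSV, and the dropped columns and kept classes shown are only a subset of those the paper lists:

```python
import pandas as pd

# Toy stand-in for the Weather Dataset; column names follow the Kaggle file.
df = pd.DataFrame({
    "Summary": ["Clear", "Overcast", "Windy and Dry", "Foggy"],
    "Temperature (C)": [21.0, 15.2, 10.1, 8.4],
    "Wind Speed (km/h)": [5.0, 12.3, 30.1, 3.2],
    "Loud Cover": [0.0, 0.0, 0.0, 0.0],
    "Daily Summary": ["sunny", "cloudy", "windy", "fog"],
})

# Drop unneeded columns (here only those present in this toy frame).
df = df.drop(columns=["Loud Cover", "Daily Summary"])

# Keep only the dominant abstract classes; rare mixed labels such as
# "Windy and Dry" hurt the model, as noted above.
kept = ["Clear", "Overcast", "Foggy"]
df = df[df["Summary"].isin(kept)]
print(df["Summary"].value_counts())
```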
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Tests</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">KNN tests</head><p>The first phase of testing involved appropriately reducing the number of classes in the project's dataset to decrease the computational complexity of the model. After data preparation, model testing commenced. Next, the optimal value of 𝑘 for the model was determined. The Matplotlib library <ref type="bibr" target="#b6">[7]</ref>, which generates graphs, was helpful in this regard. In Figure <ref type="figure" target="#fig_1">1</ref>, we observe that our model performs best for 𝑘 = 6. However, in the interval [1, 10], the values exhibit significant variability, with stabilization occurring only in the interval (10, 30).</p></div>
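The 𝑘 sweep behind Figure 1 can be sketched as follows. Since the data here is synthetic stand-in data, the best 𝑘 found will generally differ from the paper's 𝑘 = 6:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the prepared weather data.
X, y = make_classification(n_samples=400, n_features=4, n_informative=3,
                           n_redundant=0, n_classes=3, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=1)

# Evaluate test accuracy for each candidate k in [1, 30].
scores = {}
for k in range(1, 31):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores[k] = clf.score(X_test, y_test)

best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
# A matplotlib line plot of scores.keys() vs. scores.values()
# reproduces the shape of Figure 1.
```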
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Naive Bayes tests</head><p>The conditions under which the Bayes classifier has a good distribution are discussed alongside Figure <ref type="figure" target="#fig_2">2</ref>. The k-nearest neighbor classifier has about 8 percentage points higher accuracy than the Gaussian Naive Bayes classifier in fig. <ref type="figure" target="#fig_5">5</ref>. This difference can be attributed to several factors. First, KNN is a nonparametric algorithm, meaning that it does not assume any particular distribution of the data. This flexibility allows it to effectively capture complex, nonlinear relationships in the feature space. GNB, on the other hand, assumes that the features have a Gaussian distribution and are independent given the class label. When the actual data distribution deviates from these assumptions, GNB's performance can suffer. Second, KNN relies on the proximity of data points in the feature space, adapting well to different data distributions without making strong assumptions. In addition, KNN can mitigate the impact of outliers and noisy data by considering multiple nearest neighbors, which helps smooth out the influence of anomalous data points. GNB, in contrast, can be inaccurate under the significant influence of outliers, as they can distort the estimation of the mean and variance parameters of the Gaussian distribution for each feature. Together, these factors contribute to the higher accuracy we observed for KNN in our tests. From the time comparison in fig. <ref type="figure" target="#fig_6">6</ref>, the classifiers have completely different execution times. The KNN algorithm can be more time-consuming, especially for large datasets, due to the need to calculate the distance between each test point and every point in the training set. Naive Bayes, being based on a simple probabilistic model, often exhibits lower computational complexity. In addition, differences in running times may also be due to differences in the implementations of these algorithms and the characteristics of the specific data, such as the number of dimensions or the size of the dataset. The F1-score, or F1-measure, is a measure of predictive performance. It is calculated from the precision and recall of the test, where precision is the number of true positive results divided by the number of all samples predicted to be positive, including those not identified correctly, and recall is the number of true positive results divided by the number of all samples that should have been identified as positive. Precision is also known as positive predictive value, and recall is also known as sensitivity in diagnostic binary classification. Using the built-in F1 metric from the sklearn library <ref type="bibr" target="#b8">[9]</ref>, nine iterations were conducted with data shuffling to calculate the results. This approach ensured robustness in evaluating the model's performance across multiple trials and varying data distributions. Each iteration involved computing the F1-score, which provides a balanced measure of the classifier's precision and recall, thus capturing its ability to correctly classify positive instances while minimizing false positives and false negatives. The iterative process allowed for a comprehensive assessment of the model's effectiveness in handling different data configurations and revealed insights into its consistency and reliability. As seen in fig. <ref type="figure" target="#fig_7">7</ref>, the results obtained by the F1-score from the sklearn library closely align with the results obtained using the accuracy calculation algorithm implemented by the authors in the tested classifier.</p></div>
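The nine reshuffled F1 evaluations can be sketched with `sklearn.metrics.f1_score` [9]. Synthetic stand-in data is used again, so the scores only illustrate the procedure, not the paper's figures:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import f1_score

# Synthetic stand-in for the prepared weather data.
X, y = make_classification(n_samples=400, n_features=4, n_informative=3,
                           n_redundant=0, n_classes=3, random_state=2)

f1_runs = []
for seed in range(9):  # nine iterations with data shuffling
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, shuffle=True, random_state=seed)
    pred = KNeighborsClassifier(n_neighbors=6).fit(X_tr, y_tr).predict(X_te)
    f1_runs.append(f1_score(y_te, pred, average="macro"))

print(min(f1_runs), max(f1_runs))
```

Plotting `f1_runs` per iteration reproduces the shape of Figure 7.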
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Analysis</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.1.">F1-Score</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion</head><p>To sum up and recap, our study using the London weather dataset provided valuable insights into the functioning and performance of K-Nearest Neighbors (KNN) and the Gaussian Naive Bayes classifier (GNB). KNN is much easier to implement. Its concept is straightforward: it classifies new data points based on the most common class among the nearest neighbors. This simplicity of implementation makes KNN an attractive option for quick and easy classification tasks. However, it has its limitations. KNN can be slower to classify, especially for large datasets, because it is necessary to calculate the distance between a new point and each point in the training set. This distance calculation can become computationally expensive as the size of the dataset increases, leading to longer classification times. On the other hand, the Bayes classifier, particularly the Naive Bayes classifier, may require more effort at the implementation stage. This is due to the need to calculate and model conditional probabilities and to make feature independence assumptions. Despite this initial complexity, the Naive Bayes classifier can be faster during the classification phase. It only requires calculating the conditional probabilities for each feature and applying Bayes' rule. Throughout this study, we gained a great deal of knowledge. Implementing these algorithms and benchmarking their performance allowed us to gain practical experience with both KNN and GNB. We discovered firsthand the trade-offs between ease of implementation and computational efficiency.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Choosing the best k value</figDesc><graphic coords="4,153.60,397.70,286.10,208.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Cumulative Distribution Function for Naive Bayes Classifier</figDesc><graphic coords="5,155.95,86.95,286.70,199.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>•</head><label></label><figDesc>The lines for each class increase rapidly, indicating high probabilities assigned by the model to the correct classes. • Lines for different classes should be separated from each other, indicating that the model distinguishes classes well. • The CDF lines should be close to zero at low probabilities As you can see from fig.2, all of these things are almost maintained, indicating that the classifier predicts quite well. We can confirm this because the classifier has an accuracy of about 75%. It is worth noting on the sudden intersection of the foggy class. The abrupt intersection of the line indicates that the model is uncertain about assigning probabilities to this particular class, which may be the result of an overlap in feature space between this class and other classes.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 3 : 4 :</head><label>34</label><figDesc>Figure 3: Confusion Matrix for K Nearest Neighbors Figure 4: Confusion Matrix for Naive Bayes</figDesc><graphic coords="5,70.10,557.20,190.10,151.05" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Comparison of the classifier accuracies</figDesc><graphic coords="6,148.90,145.90,298.90,174.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Comparison of the classifier times</figDesc><graphic coords="6,148.90,532.60,298.90,174.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: F1 Score</figDesc></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">BiLSTM deep neural network model for imbalanced medical data of IoT systems</title>
		<author>
			<persName><forename type="first">Marcin</forename><surname>Woźniak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michał</forename><surname>Wieczorek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jakub</forename><surname>Siłka</surname></persName>
		</author>
		<ptr target="https://www.sciencedirect.com/science/article/pii/S0167739X22004095" />
	</analytic>
	<monogr>
		<title level="j">Future Generation Computer Systems</title>
		<imprint>
			<biblScope unit="volume">141</biblScope>
			<biblScope unit="page" from="489" to="499" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Neural network powered COVID-19 spread forecasting model</title>
		<author>
			<persName><forename type="first">Michał</forename><surname>Wieczorek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jakub</forename><surname>Siłka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marcin</forename><surname>Woźniak</surname></persName>
		</author>
		<ptr target="https://www.sciencedirect.com/science/article/pii/S0960077920305993" />
	</analytic>
	<monogr>
		<title level="j">Chaos, Solitons &amp; Fractals</title>
		<idno type="ISSN">0960-0779</idno>
		<imprint>
			<biblScope unit="volume">140</biblScope>
			<biblScope unit="page">110203</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">K-Nearest Neighbor(KNN) Algorithm in Machine Learning</title>
		<author>
			<persName><forename type="first">Rizwana</forename><surname>Yasmeen</surname></persName>
		</author>
		<ptr target="https://medium.com/@rizwanayasmeen06/k-nearest-neighbor-knn-algorithm-in-machine-learning-d38d9638d7e0" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Intrusion Detection System using Naive Bayes algorithm</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">S</forename><surname>Sharmila</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Nagapadma</surname></persName>
		</author>
		<ptr target="https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9019921%5C&amp;casa_token=FEZMEU72iF8AAAAA:4D1LZ_1ZcT7dqbDdxFSbDGfqnG8TMb-vwrGeDgnZRzxV7YMyJGNupv8dmhmhkpsq2C6SJqZAmxc" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Weather Dataset</title>
		<author>
			<persName><forename type="middle">J</forename><surname>Muthukumar</surname></persName>
		</author>
		<ptr target="https://www.kaggle.com/datasets/muthuj7/weather-dataset" />
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page">1</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Assessing the Impact of Changing Environments on Classifier Performance</title>
		<author>
			<persName><forename type="first">Rocio</forename><surname>Alaiz-Rodriguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nathalie</forename><surname>Japkowicz</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-540-68825-9_2</idno>
		<ptr target="https://link.springer.com/chapter/10.1007/978-3-540-68825-9_2" />
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Matplotlib for Python Developers</title>
		<author>
			<persName><forename type="first">Sandro</forename><surname>Tosi</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Uczenie maszynowe z użyciem Scikit-Learn i TensorFlow, Wydanie II [Machine Learning with Scikit-Learn and TensorFlow, 2nd Edition, updated to TensorFlow 2]</title>
		<author>
			<persName><forename type="first">Aurélien</forename><surname>Géron</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">2</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<ptr target="https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html" />
		<title level="m">sklearn.metrics.f1_score</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note>scikit-learn</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
