A Simple Approach to Weather Predictions by using Naive Bayes Classifiers

Agnieszka Lutecka, Zuzanna Radosz
Silesian University of Technology, Faculty of Applied Mathematics, Kaszubska 23, 44-100 Gliwice, Poland
agnilut814@polsl.pl (A. Lutecka); zuzarad785@polsl.pl (Z. Radosz)

SYSTEM 2022: 8th Scholar's Yearly Symposium of Technology, Engineering and Mathematics, Brunek, July 23, 2022

Abstract
This article presents and explains how we used a naive Bayes classifier and a weather database to predict the weather. We compare how the accuracy depends on the probability distribution used, for various types of data.

Keywords
Naive Bayes Classifier, probability distribution, Python

1. Introduction

Many IT systems use artificial intelligence in the broad sense [1, 2]. Artificial intelligence methods also apply to data processing [3, 4, 5]. They are widely applied in systems installed in cars, for example to detect the quality of the road surface [6]. Many technical problems lead to optimization tasks [7, 8, 9], where the complexity of the objective functional is a big challenge [10, 11, 12]. Optimization processes concern many different areas of life and require a constant search for new, more effective optimization methods [13, 14] based on the observation of nature [15, 16, 17]. A very important branch of artificial intelligence is the application of broadly understood neural networks [18]. Interesting applications concern health protection [19] and adult care [20, 21]. Artificial intelligence methods are also used for weather forecasting [22, 23, 24], as well as for feature detection [25, 26, 27].

2. Program description

The task of our program is to predict the weather using the naive Bayes classifier. This classifier is especially suitable for problems with a lot of input data, so it fits our project well. It uses the conditional probability formula

P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}, \quad (1)

where
β€’ A is our hypothesis,
β€’ B is the observed data (attribute values),
β€’ P(B|A) is the probability that we would observe such data if the hypothesis were true,
β€’ P(A) is the a priori probability that the hypothesis is true,
β€’ P(B) is the probability of the observed data.

Importantly, the naive Bayes classifier assumes that the influence of the attributes is independent of each other; therefore P(B|A) can be written as

P(B|A) = P(B_1|A) \cdot P(B_2|A) \cdot \ldots \cdot P(B_n|A). \quad (2)

The naivety of this classifier follows from the above assumption.

3. Description of how the program works

We started the project with an analysis of the data from the database. Initially, we shuffled the data and normalized it to the range 0-1 to operate on smaller numbers, thus increasing the performance of our program. After dividing the data into validation and training sets, we move on to the main part of our program, i.e. the use of the naive Bayes classifier. Its task is to return the name of the most probable weather for a given sample. The algorithm uses a probability distribution defined by a density function, which describes how the probability of a random variable x is distributed. We implemented 5 different probability distributions to compare the algorithm's efficiency for different probability density formulas: Gaussian, Laplace, log-normal, uniform and triangular.
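The steps above can be summarized in a short sketch. This is only an illustration, assuming the ProcessingData and NaiveClassifier classes detailed in Section 6 and the data file introduced in Section 5; it is not a listing from the project itself.

    import pandas as pd

    # Load the weather database (see Section 5) into a DataFrame.
    data = pd.read_csv("Istanbul Weather Data.csv")

    # Preprocessing steps described above (methods defined in Section 6).
    data = ProcessingData.shuffle(data)              # randomize the record order
    data = ProcessingData.normalize(data)            # rescale numeric columns to 0-1
    train, val = ProcessingData.splitSet(data, 0.7)  # 7:3 training/validation split

    # Classify a single validation sample with the Gaussian density.
    weather = NaiveClassifier.bayes(train, val.iloc[0], "gauss")
    print(weather)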
3.1. Gaussian distribution

It is one of the most important probability distributions, playing a central role in statistics. The formula for the probability density is as follows:

f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \quad (3)

where \mu is the expected (mean) value and \sigma is the standard deviation. The plot of this density is a bell-shaped curve (the so-called bell curve).

Figure 1: Graph of the probability density function (source: Wikipedia.org). The red line corresponds to the standard normal distribution.

3.2. Laplace distribution

It is a continuous probability distribution named after Pierre Laplace. The probability density is given by the formula

f(x) = \frac{1}{2b} \exp\left(-\frac{|x-\mu|}{b}\right), \quad (4)

where \mu is the expected value, i.e. the mean, and b > 0 is the scale parameter.

Figure 2: Graph of the probability density function (source: Wikipedia.org).

3.3. Log-normal distribution

It is the continuous probability distribution of a positive random variable whose logarithm is normally distributed. Its density is

f(x) = \frac{1}{\sqrt{2\pi}\,\sigma x} \exp\left(-\frac{(\ln x-\mu)^2}{2\sigma^2}\right) \cdot \mathbf{1}_{(0,\infty)}(x), \quad (5)

where \mu is the mean value and \sigma is the standard deviation of the variable's logarithm.

Figure 3: Graph of the probability density function (source: Wikipedia.org).

3.4. Triangular distribution

It is a continuous probability distribution of a random variable. The probability density of a symmetric triangular distribution with mean \mu and standard deviation \sigma can be expressed as

f(x) = \begin{cases} 0 & \text{for } x < \mu - \sqrt{6}\sigma \\ \frac{x-\mu}{6\sigma^2} + \frac{1}{\sqrt{6}\sigma} & \text{for } \mu - \sqrt{6}\sigma \le x \le \mu \\ -\frac{x-\mu}{6\sigma^2} + \frac{1}{\sqrt{6}\sigma} & \text{for } \mu \le x \le \mu + \sqrt{6}\sigma \\ 0 & \text{for } x > \mu + \sqrt{6}\sigma \end{cases} \quad (6)

Figure 4: Graph of the probability density function (source: Wikipedia.org).

3.5. Uniform distribution

It is a continuous probability distribution whose density is constant and non-zero on an interval, and zero elsewhere. Parametrized by the mean \mu and the standard deviation \sigma, the density is

f(x) = \begin{cases} 0 & \text{for } x < \mu - \sqrt{3}\sigma \\ \frac{1}{2\sqrt{3}\sigma} & \text{for } \mu - \sqrt{3}\sigma \le x \le \mu + \sqrt{3}\sigma \\ 0 & \text{for } x > \mu + \sqrt{3}\sigma \end{cases} \quad (7)

Figure 5: Graph of the probability density function (source: Wikipedia.org).

3.6. Selected distributions

As we can see, the formulas differ significantly, which will definitely have a big impact on the effectiveness of the program. We tried to choose the parametrization of each probability density so that the returned values would not be very divergent, as we will present in the next paragraphs.
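As an illustration of how differently the five densities behave, the following sketch (not part of the project; it uses scipy.stats instead of our hand-written methods) evaluates each of them at the same point, with the same mean \mu and standard deviation \sigma, using the parametrizations of formulas (3)-(7):

    import numpy as np
    from scipy import stats

    mu, sigma, x = 0.5, 0.1, 0.55
    print(stats.norm.pdf(x, loc=mu, scale=sigma))                 # Gaussian, eq. (3)
    print(stats.laplace.pdf(x, loc=mu, scale=sigma/np.sqrt(2)))   # Laplace, b = sigma/sqrt(2)
    print(stats.lognorm.pdf(x, s=sigma, scale=np.exp(mu)))        # log-normal, eq. (5)
    print(stats.uniform.pdf(x, loc=mu - np.sqrt(3)*sigma,
                            scale=2*np.sqrt(3)*sigma))            # uniform, eq. (7)
    print(stats.triang.pdf(x, c=0.5, loc=mu - np.sqrt(6)*sigma,
                           scale=2*np.sqrt(6)*sigma))             # triangular, eq. (6)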
4. Algorithm

The naive Bayes classifier uses probability density functions to compute the probability of a given input condition. The NaiveClassifier class has 6 static methods: "laplace", "logarytmiczny" (log-normal), "jednostajny" (uniform), "trojkatny" (triangular), "gauss" and "bayes", where the first 5 implement the different probability distributions. We use as many as 5 of them to compare the algorithm's effectiveness for different probability density formulas. The "bayes" method accepts the following input data: data - the training set; sample - a set of values in the range 0-1 that represent successive database columns; name - the name of the probability distribution (Gauss, Laplace, log-normal, uniform, triangular).

At the beginning, the algorithm extracts the records with the given weather. Then, looping over all the records of the training set, the program calculates the mean value and the standard deviation of each column for each type of weather. Using the given "name", the algorithm calls the appropriate density method with the input data sample[j] (the sample value for column j), sr[j] (the mean value of column j) and std[j] (the standard deviation of the values from column j). These methods pass the input data through the formula of the chosen probability distribution and return the value of the probability density function at the point sample[j], that is, the likelihood of sample[j] occurring under the conditions sr[j] and std[j]. Finally, the algorithm returns the name of the weather most likely to occur for the sample input.

Each probability distribution has a differently defined density function; therefore, the distributions may differ in their results. Below we present the pseudocode of the NaiveClassifier class methods, with an emphasis on how the input data is processed by the probability distributions.

Data: Input data, sample, name
Result: Weather name
  Extract weather records;
  Enter the weather record sets into the list names;
  i := 0;
  for i < len(names) do
    tr = [];
    j := 1;
    for j < 7 do
      Calculate the mean value of column j;
      Calculate the standard deviation of column j;
      if name == "laplace'a" then
        tr.append(NaiveClassifier.laplace(sample[j], sr[j-1]));
      if name == "log-normalny" then
        tr.append(NaiveClassifier.logarytmiczny(sample[j], std[j-1], sr[j-1]));
      if name == "jednostajny" then
        tr.append(NaiveClassifier.jednostajny(sample[j], std[j-1], sr[j-1]));
      if name == "trojkatny" then
        tr.append(NaiveClassifier.trojkatny(sample[j], std[j-1], sr[j-1]));
      if name == "gauss" then
        tr.append(NaiveClassifier.gauss(sample[j], std[j-1], sr[j-1]));
    end
    Return the probability of the given weather;
  end
  Return the name of the most likely weather;
Algorithm 1: Bayes algorithm

Data: Input x, mean
Result: The value of the density function at x
  b := 2;
  return (1/(2*b))*(math.exp(-(math.fabs(x-mean)/b)));
Algorithm 2: Laplace's algorithm

Data: Input x, std, sr
Result: The value of the density function at x
  if x > 0 then
    return (1/((math.pi*2)**(1/2)*std*x))*math.exp(-((math.log(x)-sr)**2)/(2*std**2));
  else
    return 0;
Algorithm 3: Logarithmic (log-normal) algorithm

Data: Input x, std, sr
Result: The value of the density function at x
  if x < sr - 3**(1/2)*std then return 0;
  if sr - 3**(1/2)*std <= x <= sr + 3**(1/2)*std then return 1/(2*(3**(1/2))*std);
  if x > sr + 3**(1/2)*std then return 0;
Algorithm 4: Uniform algorithm

Data: Input x, std, sr
Result: The value of the density function at x
  if x < sr - 6**(1/2)*std then return 0;
  if sr - 6**(1/2)*std <= x <= sr then return (x-sr)/(6*std**2) + 1/(6**(1/2)*std);
  if sr < x <= sr + 6**(1/2)*std then return -(x-sr)/(6*std**2) + 1/(6**(1/2)*std);
  if x > sr + 6**(1/2)*std then return 0;
Algorithm 5: The triangular algorithm

Data: Input x, std, sr
Result: The value of the density function at x
  return (1/(std*np.sqrt(2*np.pi)))*np.exp(-((x-sr)**2)/(2*std**2));
Algorithm 6: Gauss algorithm
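For convenience, the density methods of Algorithms 2-6 can be collected into one runnable module. The sketch below is our reading of the pseudocode above (including the fixed scale b = 2 in the Laplace method and the Polish method names used in the project), not the original listing.

    import math

    def laplace(x, sr, b=2.0):
        # Laplace density, eq. (4), with the fixed scale parameter b = 2
        return (1.0/(2.0*b)) * math.exp(-abs(x - sr)/b)

    def logarytmiczny(x, std, sr):
        # log-normal density, eq. (5); zero for non-positive x
        if x <= 0:
            return 0.0
        return (1.0/(math.sqrt(2.0*math.pi)*std*x)) * \
               math.exp(-((math.log(x) - sr)**2)/(2.0*std**2))

    def jednostajny(x, std, sr):
        # uniform density, eq. (7), on [sr - sqrt(3)*std, sr + sqrt(3)*std]
        half = math.sqrt(3.0)*std
        return 1.0/(2.0*half) if sr - half <= x <= sr + half else 0.0

    def trojkatny(x, std, sr):
        # triangular density, eq. (6), peaking at x = sr
        half = math.sqrt(6.0)*std
        if sr - half <= x <= sr:
            return (x - sr)/(6.0*std**2) + 1.0/half
        if sr < x <= sr + half:
            return -(x - sr)/(6.0*std**2) + 1.0/half
        return 0.0

    def gauss(x, std, sr):
        # Gaussian density, eq. (3)
        return (1.0/(std*math.sqrt(2.0*math.pi))) * \
               math.exp(-((x - sr)**2)/(2.0*std**2))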
5. Databases

5.1. Database analysis

For our project, we use the Istanbul Weather Data database, downloaded from the Kaggle website. The database has 3896 records. It contains the following data columns: DateTime, Condition, Rain, MaxTemp, MinTemp, SunRise, SunSet, MoonRise, MoonSet, AvgWind, AvgHumidity, AvgPressure.

Figure 6: Data types in each column.

We analyzed the data using a matrix graph that shows the relationships between the weather variables.

Figure 7: Matrix graph.

The chart is dominated by warm colors, which means that most of the records in our database are sunny and slightly cloudy. It can also be seen that the data form compact groups: the parameters for different weather conditions do not differ much from one another. This can make our algorithm, which determines the weather based on these parameters, not very accurate. There may be situations where the algorithm returns "Sunny" because it was the most probable weather, while the actual weather is different. In the Experiments section we test and analyze the accuracy obtained by the algorithm.

We also analyzed the data using a violin graph of the maximum temperature for all weather conditions, as shown below.

Figure 8: Violin graph.

In the attached plot we can see how the temperature value varies for a given weather; for example, for "Moderate rain" the maximum temperature ranges from 5 to 20 degrees.

5.2. Database modification

DateTime, SunRise, SunSet, MoonRise and MoonSet will not be used in our project, so we can get rid of them:

    data.drop('DateTime', axis=1, inplace=True)
    data.drop('SunRise', axis=1, inplace=True)
    data.drop('SunSet', axis=1, inplace=True)
    data.drop('MoonRise', axis=1, inplace=True)
    data.drop('MoonSet', axis=1, inplace=True)

To normalize the data to the range 0-1, we changed the int64 columns to float64.
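The dtype change mentioned above can be done with pandas roughly as in the following sketch (an assumed snippet, not the project's listing):

    # cast every int64 column to float64 so that min-max normalization
    # can store fractional values back into the DataFrame
    for col in data.select_dtypes(include="int64").columns:
        data[col] = data[col].astype("float64")
    print(data.dtypes)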
6. Implementation

6.1. ProcessingData class

Our project consists of two files: a file containing the program code, "Pogoda.ipynb", and the database, "Istanbul Weather Data.csv". After analyzing the data from the database, we moved on to the ProcessingData class, in which we created 3 static methods: shuffle, splitSet and normalize.

6.2. Shuffle method

It takes base as input, i.e. our database. We use a for loop to go through it, selecting records and swapping them.

Data: Input base
Result: Database with jumbled records
  for i in range(len(base)-1, -1, -1) do
    take two records from the database, the second with a random index, and swap them;
  end
  return base;
Algorithm 7: The shuffle algorithm

Program code

    @staticmethod
    def shuffle(base):
        for i in range(len(base)-1, -1, -1):
            j = rd.randint(0, i)   # draw the index once so both sides of the swap agree
            base.iloc[i], base.iloc[j] = base.iloc[j], base.iloc[i]
        return base

6.3. SplitSet method

It takes as input x, the database, and k, the division ratio of the set. In the variable n we store the length of the set x multiplied by k, to know where to divide the set. Then we write the records of the database up to index n into xTrain, creating the training set, and all the records following index n into xVal, creating the validation set. Finally, we return both of these sets.

Data: Input x, k
Result: Training and validation set
  n = int(len(x) * k);
  xTrain = x[:n];
  xVal = x[n:];
  return xTrain, xVal;
Algorithm 8: SplitSet algorithm

Program code

    @staticmethod
    def splitSet(x, k):
        n = int(len(x) * k)
        xTrain = x[:n]
        xVal = x[n:]
        return xTrain, xVal
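A short usage example (our illustration, assuming the 7:3 division used in the later tests):

    data = ProcessingData.shuffle(data)
    train, val = ProcessingData.splitSet(data, 0.7)
    # with 3896 records, int(3896*0.7) = 2727 go to training, 1169 to validation
    print(len(train), len(val))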
6.4. Normalize method

It takes x, a database whose records have already been scrambled using the shuffle method. At the beginning, we load all the data from the database, except the string columns, into the variable values, and the column names into the variable columnNames. We loop over all the columns; for each column we take all its rows and store them in the variable data. The variables max1 and min1 are assigned the maximum and minimum values from data. Using the next loop, we go through all the rows and assign to the variable val the result of the min-max normalization formula: we subtract min1 from the database record at coordinates [row, column], and then divide this difference by the difference between max1 and min1. Finally, we write the value after normalization back into the database. The method returns a normalized database.

Data: Input x
Result: Normalized database
  values = x.select_dtypes(exclude="object");
  columnNames = values.columns.tolist();
  for column in columnNames do
    take all the rows from column column;
    max1 = max(data); min1 = min(data);
    for row in range(0, len(x)) do
      val = (x.at[row, column] - min1)/(max1 - min1);
      x.at[row, column] = val;
    end
  end
  return x;
Algorithm 9: The normalize algorithm

Program code

    @staticmethod
    def normalize(x):
        # select all data from the database except object (string) columns
        values = x.select_dtypes(exclude="object")
        columnNames = values.columns.tolist()
        for column in columnNames:
            data = x.loc[:, column]   # take all rows in column
            max1 = max(data)
            min1 = min(data)
            for row in range(0, len(x)):   # go through all the rows
                val = (x.at[row, column] - min1) / (max1 - min1)
                x.at[row, column] = val
        return x

6.5. NaiveClassifier class and the bayes method

The NaiveClassifier class has 6 static methods: "laplace", "logarytmiczny", "jednostajny", "trojkatny", "gauss" and "bayes". The first 5 are the methods implementing the different probability distributions. The "bayes" method has been described in general in the Algorithm section, so now we will look at the details. First, the method extracts the database records for each weather name into separate lists. Then each of the lists created above is put into the names list. Additionally, we create a stringnames list with the string weather names, and a values list that will store the calculated probabilities for each weather.

Data: Input data, sample, name
Result: Weather name
  Extract weather records;
  Enter the weather record sets into the list names;
  Enter the weather names into the list stringnames;
  values := [];
  i := 0;
  for i < len(names) do
    tr = []; sr = []; std = [];
    j := 1;
    for j < 7 do
      Calculate the mean value of column j;
      Calculate the standard deviation of column j;
      if sr[j-1] == 0 then sr[j-1] = 0.0000001;
      if std[j-1] == 0 then std[j-1] = 0.0000001;
      if name == "laplace'a" then
        tr.append(NaiveClassifier.laplace(sample[j], sr[j-1]));
      if name == "log-normalny" then
        tr.append(NaiveClassifier.logarytmiczny(sample[j], std[j-1], sr[j-1]));
      if name == "jednostajny" then
        tr.append(NaiveClassifier.jednostajny(sample[j], std[j-1], sr[j-1]));
      if name == "trojkatny" then
        tr.append(NaiveClassifier.trojkatny(sample[j], std[j-1], sr[j-1]));
      if name == "gauss" then
        tr.append(NaiveClassifier.gauss(sample[j], std[j-1], sr[j-1]));
    end
    values.append(np.prod(tr)*len(names[i])/len(names));
  end
  Index = values.index(max(values));
  return the value from stringnames at index Index;
Algorithm 10: Bayes algorithm

The next step is a loop over all the values of the names list in sequence. Inside it we create auxiliary lists: tr[] to store the 6 probability values corresponding to the successive database columns, sr[] to store the mean values of each column, and std[] to store the standard deviations of each column. The inner loop goes through all the columns one by one. For each column we calculate the mean and the standard deviation; next come the conditions that prevent zero values from occurring in sr[] and std[]. Then come the distributions: depending on the input "name", we append the result of the given distribution's density function to the list tr[]. After the inner loop, we compute the value from Bayes' theorem: based on the formula for conditional probability, we multiply the values in the tr list, then multiply that product by the length of the list names[i], and divide the whole thing by the length of the names list. We add the obtained result to the values list. After going through both loops, we determine the index of the highest value in the values list. Finally, we return the name of the weather with the given index from the stringnames list.

6.6. AnalizingData class

Another class in our project is AnalizingData, with the analize method. This method measures the accuracy of the Bayes classifier as a percentage. The input data are: Train - the training set, Val - the validation set, and name - the name of the probability distribution. The algorithm first sets the value of the correct variable to 0. Then it iterates over all the records of the Val set. If the value returned by the Bayes classifier for the input Train, Val[i], name is the same as the weather name of the record Val[i], it increases the variable correct by 1. Finally, the algorithm returns the accuracy, which we calculate by dividing correct by the length of the validation set and multiplying by 100.

Data: Input Train, Val, name
Result: Accuracy of the bayes algorithm
  correct := 0;
  i := 0;
  for i < len(Val) do
    if NaiveClassifier.classify(Train, Val.iloc[i], name) == Val.iloc[i].Condition then
      correct += 1;
  end
  accuracy = correct/len(Val)*100;
  return accuracy;
Algorithm 11: Analize algorithm

Program code

    class AnalizingData:
        @staticmethod
        def analize(Train, Val, name):
            correct = 0
            for i in range(len(Val)):
                if NaiveClassifier.classify(Train, Val.iloc[i], name) == Val.iloc[i].Condition:
                    correct += 1
            accuracy = correct / len(Val) * 100
            return accuracy
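Putting the pieces together, the accuracy comparison of the next section could be driven by a loop such as the sketch below (our illustration, not the project's listing; the distribution name strings follow those used in the pseudocode):

    distributions = ["gauss", "laplace'a", "log-normalny", "jednostajny", "trojkatny"]
    data = ProcessingData.normalize(ProcessingData.shuffle(data))
    train, val = ProcessingData.splitSet(data, 0.7)
    for name in distributions:
        accuracy = AnalizingData.analize(train, val, name)
        print(name, accuracy, "%")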
7. Tests

We started our tests by checking the algorithm's operation on various samples.

Figure 9: Sample tests.

The output shows that, depending on the values in the sample, the algorithm returns different weather names, which confirms its correct operation.

The next step was to determine the accuracy of our algorithm for the various probability distributions. For this we used the AnalizingData class with the analize method. We called the method for each of the 5 types of probability distributions, with the training/validation division in the ratio 7:3, and then, using the plt package, we displayed the graph.

Figure 10: The graph of the accuracy of the algorithm depending on the probability distribution.

As the figure shows, the algorithm's accuracy differs between the probability distributions. The Gaussian distribution definitely exceeds the other distributions, with an accuracy of over 60%; this means that more than 60% of the returned weather names were valid. However, the Laplace and uniform distributions do not lag far behind, with values in the range 50-60%. The log-normal and triangular distributions are the least efficient, as their accuracy is below 20%. Additionally, we measured the execution time of the algorithm: it was almost 4.512 minutes.
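A bar chart like Figure 10 can be produced with the plt package roughly as follows. The accuracy values in this sketch are placeholders chosen only to match the qualitative ranges reported above, not measured results:

    import matplotlib.pyplot as plt

    names = ["Gauss", "Laplace", "uniform", "log-normal", "triangular"]
    accuracy = [62, 56, 53, 16, 12]   # placeholder values, see the text above
    plt.bar(names, accuracy)
    plt.ylabel("Accuracy [%]")
    plt.title("Accuracy of the classifier per probability distribution")
    plt.show()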
8. Experiments

8.1. Analysis of the algorithm's results for normalized and non-normalized data

We tested the operation of our program for both normalized and non-normalized values, and we measured the algorithm's execution time for both data sets. The graphs show the accuracy obtained with the individual probability distributions.

Figure 11: Bar chart for unnormalized data.

The first plot (Figure 11) shows the Bayes classifier's operation for unnormalized data with a 7:3 split, while the second plot shows the operation for normalized data with the same partition. The program execution time for the first graph was 4.512 minutes, while for the second it was 4.433 minutes. As we can see, the only significant difference appeared when using the Laplace distribution, whose accuracy decreased by almost 10%. The remaining results are similar for both types of data. The time difference is insignificant, as it is only 5.023 s.

8.2. Analysis of the algorithm's results for different data divisions

The following charts show the efficiency of the algorithm on normalized data for various divisions into training and validation sets:

Figure 12: Bar chart 1, for the training set 0.1.
Figure 13: Bar chart 2, for the training set 0.3.
Figure 14: Bar chart 3, for the training set 0.5.
Figure 15: Bar chart 4, for the training set 0.9.

The algorithm's execution time decreases as the training set grows (and the validation set shrinks). For the last execution of the algorithm, where the division was in the ratio 9:1, the time was only 1.467 minutes. However, the first calculation, where the division was in the ratio 1:9, took as long as 9.517 minutes. This is because reducing the training set increases the number of records in the validation set; as a result, the classifier is called more times, and its most time-consuming elements, such as extracting the records with a given weather and the loops, are executed many times.

Analyzing the above, we can see that the accuracy of the Gaussian distribution is superior to all the others, and its value remains practically unchanged across the divisions. The Laplace distribution is in second place, almost reaching 60%. The value of the uniform distribution ranges from 50-55%; it achieves the most on the last chart, where the training set is 0.9. The triangular and log-normal distributions reach much lower values than the previously mentioned distributions; the gap is quite big, around 30%. The log-normal distribution exceeds the triangular one only slightly and only once, in the third graph. Nevertheless, the accuracy values of both distributions never exceed 20%.
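The split-ratio experiment above could be reproduced with a loop such as this sketch (our illustration, assuming the classes defined in Section 6; timing via time.perf_counter):

    import time

    for k in (0.1, 0.3, 0.5, 0.7, 0.9):   # fraction of records used for training
        train, val = ProcessingData.splitSet(data, k)
        start = time.perf_counter()
        accuracy = AnalizingData.analize(train, val, "gauss")
        elapsed = (time.perf_counter() - start) / 60.0
        print(f"k={k}: accuracy {accuracy:.1f}%, time {elapsed:.2f} min")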
9. Conclusion

We can conclude that the Gaussian distribution is the best probability distribution for our database. The algorithm with this distribution, under each modification, correctly determines about 60% of the weather names, which is a good but still unsatisfactory value. This is due to the way the data is distributed in the database: with more distinct parameter values for the different weather conditions, this algorithm could become much more efficient.

References

[1] Y. Li, W. Dong, Q. Yang, S. Jiang, X. Ni, J. Liu, Automatic impedance matching method with adaptive network based fuzzy inference system for wpt, IEEE Transactions on Industrial Informatics 16 (2019) 1076–1085.
[2] J. Yi, J. Bai, W. Zhou, H. He, L. Yao, Operating parameters optimization for the aluminum electrolysis process using an improved quantum-behaved particle swarm algorithm, IEEE Transactions on Industrial Informatics 14 (2017) 3405–3415.
[3] W. Dong, M. WoΕΊniak, et al., Denoising aggregation of graph neural networks by using principal component analysis, IEEE Transactions on Industrial Informatics (2022).
[4] N. Brandizzi, V. Bianco, G. Castro, S. Russo, A. Wajda, Automatic rgb inference based on facial emotion recognition, in: CEUR Workshop Proceedings, volume 3092, CEUR-WS, 2021, pp. 66–74.
[5] R. Avanzato, F. Beritelli, M. Russo, S. Russo, M. Vaccaro, Yolov3-based mask and face recognition algorithm for individual protection applications, in: CEUR Workshop Proceedings, volume 2768, CEUR-WS, 2020, pp. 41–45.
[6] M. WoΕΊniak, A. Zielonka, A. Sikora, Driving support by type-2 fuzzy logic control model, Expert Systems with Applications 207 (2022) 117798.
[7] G. Borowik, M. WoΕΊniak, A. Fornaia, R. Giunta, C. Napoli, G. Pappalardo, E. Tramontana, A software architecture assisting workflow executions on cloud resources, International Journal of Electronics and Telecommunications 61 (2015) 17–23. doi:10.1515/eletel-2015-0002.
[8] T. Qiu, B. Li, X. Zhou, H. Song, I. Lee, J. Lloret, A novel shortcut addition algorithm with particle swarm for multisink internet of things, IEEE Transactions on Industrial Informatics 16 (2019) 3566–3577.
[9] G. Capizzi, G. Lo Sciuto, C. Napoli, R. Shikler, M. Wozniak, Optimizing the organic solar cell manufacturing process by means of afm measurements and neural networks, Energies 11 (2018).
[10] M. WoΕΊniak, A. Sikora, A. Zielonka, K. Kaur, M. S. Hossain, M. Shorfuzzaman, Heuristic optimization of multipulse rectifier for reduced energy consumption, IEEE Transactions on Industrial Informatics 18 (2021) 5515–5526.
[11] G. Capizzi, G. Lo Sciuto, C. Napoli, E. Tramontana, M. WoΕΊniak, A novel neural networks-based texture image processing algorithm for orange defects classification, International Journal of Computer Science and Applications 13 (2016) 45–60.
[12] N. Brandizzi, S. Russo, R. Brociek, A. Wajda, First studies to apply the theory of mind theory to green and smart mobility by using gaussian area clustering, in: CEUR Workshop Proceedings, volume 3118, CEUR-WS, 2021, pp. 71–76.
[13] D. Yu, C. P. Chen, Smooth transition in communication for swarm control with formation change, IEEE Transactions on Industrial Informatics 16 (2020) 6962–6971.
[14] C. Napoli, G. Pappalardo, E. Tramontana, A hybrid neuro-wavelet predictor for qos control and stability, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8249 LNAI (2013) 527–538. doi:10.1007/978-3-319-03524-6_45.
[15] Y. Zhang, S. Cheng, Y. Shi, D.-w. Gong, X. Zhao, Cost-sensitive feature selection using two-archive multi-objective artificial bee colony algorithm, Expert Systems with Applications 137 (2019) 46–58.
[16] M. Ren, Y. Song, W. Chu, An improved locally weighted pls based on particle swarm optimization for industrial soft sensor modeling, Sensors 19 (2019) 4099.
[17] B. Nowak, R. Nowicki, M. WoΕΊniak, C. Napoli, Multi-class nearest neighbour classifier for incomplete data handling, in: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science), volume 9119, Springer Verlag, 2015, pp. 469–480. doi:10.1007/978-3-319-19324-3_42.
[18] V. S. Dhaka, S. V. Meena, G. Rani, D. Sinwar, M. F. Ijaz, M. WoΕΊniak, A survey of deep convolutional neural networks applied for prediction of plant leaf diseases, Sensors 21 (2021) 4749.
[19] R. Brociek, G. Magistris, F. Cardia, F. Coppa, S. Russo, Contagion prevention of covid-19 by means of touch detection for retail stores, in: CEUR Workshop Proceedings, volume 3092, CEUR-WS, 2021, pp. 89–94.
[20] N. Dat, V. Ponzi, S. Russo, F. Vincelli, Supporting impaired people with a following robotic assistant by means of end-to-end visual target navigation and reinforcement learning approaches, in: CEUR Workshop Proceedings, volume 3118, CEUR-WS, 2021, pp. 51–63.
[21] M. WoΕΊniak, M. Wieczorek, J. SiΕ‚ka, D. PoΕ‚ap, Body pose prediction based on motion sensor data and recurrent neural network, IEEE Transactions on Industrial Informatics 17 (2020) 2101–2111.
[22] G. Capizzi, F. Bonanno, C. Napoli, A wavelet based prediction of wind and solar energy for long-term simulation of integrated generation systems, in: SPEEDAM 2010 - International Symposium on Power Electronics, Electrical Drives, Automation and Motion, 2010, pp. 586–592. doi:10.1109/SPEEDAM.2010.5542259.
[23] G. Capizzi, G. Lo Sciuto, C. Napoli, M. WoΕΊniak, G. Susi, A spiking neural network-based long-term prediction system for biogas production, Neural Networks 129 (2020) 271–279.
[24] G. Capizzi, G. Lo Sciuto, C. Napoli, E. Tramontana, An advanced neural network based solution to enforce dispatch continuity in smart grids, Applied Soft Computing Journal 62 (2018) 768–775.
[25] O. Dehzangi, et al., Imu-based gait recognition using convolutional neural networks and multi-sensor fusion, Sensors 17 (2017) 2735.
[26] G. Capizzi, F. Bonanno, C. Napoli, Hybrid neural networks architectures for soc and voltage prediction of new generation batteries storage, in: 3rd International Conference on Clean Electrical Power: Renewable Energy Resources Impact, ICCEP 2011, 2011, pp. 341–344. doi:10.1109/ICCEP.2011.6036301.
[27] H. G. Hong, M. B. Lee, K. R. Park, Convolutional neural network-based finger-vein recognition using nir image sensors, Sensors 17 (2017) 1297.