=Paper= {{Paper |id=Vol-3360/p06 |storemode=property |title=Suggesting a Specific Factor-driven Career Choice using KNN and Soft Set Algorithms |pdfUrl=https://ceur-ws.org/Vol-3360/p06.pdf |volume=Vol-3360 |authors=Joanna Bodora,Jadwiga Cader,Nikola GΔ™bka |dblpUrl=https://dblp.org/rec/conf/system/BodoraCG22 }} ==Suggesting a Specific Factor-driven Career Choice using KNN and Soft Set Algorithms== https://ceur-ws.org/Vol-3360/p06.pdf
Suggesting a Specific Factor-driven Career Choice using
KNN and Soft Set Algorithms
Joanna Bodora1, Jadwiga Cader1 and Nikola GΔ™bka1
1 Faculty of Applied Mathematics, Silesian University of Technology, Kaszubska 23, 44-100 Gliwice, Poland


        Abstract
        Choosing the perfect career path is not an easy task, especially in the IT sector. Lately, data science and the jobs
        connected to this field have been growing more and more popular. To reduce the time spent on finding the perfect
        position in data science, the authors present a solution that selects the best job based on factors introduced by
        the user. The final job title is the result of combining a soft set algorithm with the analyzed accuracies of
        k-nearest neighbours classifiers run with different k parameters and on various collections of the data.

        Keywords
        Soft set, k-nearest neighbours, Classification



SYSTEM 2022: 8th Scholar's Yearly Symposium of Technology, Engineering and Mathematics, Brunek, July 23, 2022
bodorajoanna@polsl.pl (J. Bodora); jadwcad575@polsl.pl (J. Cader); nikogeb061@polsl.pl (N. GΔ™bka)
Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

Nowadays, IT systems [1, 2] very often use artificial intelligence methods, which allow not only to download and process data [3], but also to infer and to support the decision-making process based on them. One of the important branches of artificial intelligence systems are fuzzy sets [4, 5, 6], which are used in numerous applications, among others in the detection of pavement damage [7] or in smart home management [8, 9]. The second important direction of applications are optimization algorithms [10, 11, 12, 13], which are used in optimization processes, where the aim is to minimize or maximize the objective function [14, 15, 2]. An interesting application of heuristic algorithms concerns the reduction of energy consumption [16, 17, 18]. An important part of optimization algorithms are algorithms modeled on the behavior of animals cooperating in large groups [19, 20]. These algorithms, imitating the behavior of a community, e.g. ants and bees, allow the goal to be achieved quickly and effectively. The third direction of the development of artificial intelligence are the methods based on artificial neural networks [21, 22]. They are widely used in medicine, in the care of the elderly [23, 24, 25], in detection [26, 27], as well as in machine learning [28, 29, 30, 31].

We created a program that allows the user to choose a career path based on specific factors. The program makes it possible to select the optimal result using the k-nearest neighbors algorithm together with soft sets. We create a table for soft sets with the accuracies of various types of distance calculation methods in the KNN algorithm. The columns of the soft set table are the specific factors on which we focus, the rows are the successive KNN variants, and the content of the table is the accuracy that we obtained using a specific KNN variant.

The program is written in Python, has no graphical interface and is executed in the IDE. The data is in the form of a database in a .csv file, while the user enters the weights and values for each column as a list directly in the program.

2. K-nearest neighbors Algorithm

The k-nearest neighbors algorithm is a classification algorithm: it evaluates to which group the point from the current iteration of the algorithm belongs. The classification works by counting the number of nearest neighbor points in each group; the result is returned based on the vote of the majority.

The program classifies data based on different variants of the KNN (k-nearest neighbors) algorithm. KNN consists in finding the k already classified elements (neighbors) closest to the new element and assigning this element to the group to which most of its neighbors belong. Several metrics can be used to determine similarity; this program uses two: Manhattan (taxicab) and Minkowski.

Manhattan metric:

    d(x, y) = \sum_{i=1}^{n} |x_i - y_i|    (1)

Where:
d – distance,
Joanna Bodora et al. CEUR Workshop Proceedings                                                                      41–48



x – value of a sample,
y – value of a classified element,
n – number of elements in the sample.

Minkowski metric – a modified Euclidean metric:

    L_m(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^m \right)^{1/m}    (2)

Where:
d – distance,
x – value of a sample,
y – value of a classified element,
n – number of elements in the sample,
m – any small integer.

After calculating the distances, the data is grouped: first sorted in ascending order, then voting is done on the basis of a 1:1 matching of the sample attribute to the test set attribute – the same elements are added to the common set. Then the percentage share of the matched elements in relation to the entire data set is calculated:

    Accuracy = (size of the set of matched elements / size of the whole data set) Γ— 100%

The variable k largely determines the behavior of the classifier. It determines the number of the closest neighbors that decide on the classification of the element. It is a natural number. This parameter is arbitrary, but if we want our classifier to work efficiently, we must make a few assumptions:

    β€’ k must be greater than the square root of the number of all classified elements:

        k β‰₯ βˆšn,

      where n is the number of classified elements.

    β€’ If the number of groups is even, k must be odd; otherwise, k must be even:

        k = 2a + 1 if 2 | c, k = 2a otherwise,    (3)

      where c is the number of groups and a ∈ N.

    β€’ k must be greater than the number of groups:

        k > c.

3. Soft Set

Let U be an initial universe set and E the set of parameters or attributes relative to U. Let P(U) denote the power set of U and let A βŠ† E. The pair (F, A) is called a soft set over U, where F is a mapping given by F : A β†’ P(U). In other words, a soft set (F, A) over U is a parameterized family of subsets of U. For e ∈ A, F(e) can be considered the set of e-elements or e-approximate elements of the soft set (F, A). Thus, (F, A) is defined as:

    (F, A) = {F(e) ∈ P(U) : e ∈ E, F(e) = βˆ… if e βˆ‰ A}

The sample values are combined with the user-defined weights as:

    \sum_{i=1}^{n} s_i Β· w_i    (4)

    β€’ s_i – element of the sample
    β€’ w_i – weight
    β€’ n – length of the sample

4. Other methods used

Cross validation – a statistical method involving the division of a statistical sample into subsets, then conducting the analysis on the training set, while the test set is used to confirm the plausibility of its results.

Rule extraction – rejection of variables not useful in the study.

Data normalization – scaling the data into a given range.

Min-max normalization – using a linear function, it reduces the data to the interval specified by the user (newmin, newmax). At the same time, we should know the range that the data can take. If we do not know it, we can use the highest and the smallest value in the analyzed set:

    x' = (x - min) / (max - min) Β· (newmax - newmin) + newmin

The KNN algorithm is used for both regression and classification. It is useful when dependencies between objects of the same classes are difficult to interpret.

5. Database

The project was created with the use of a database taken from the website https://www.kaggle.com. The database deals with salaries in individual professions in work related to the field of data analysis.

Database link:
https://www.kaggle.com/datasets/saurabhshahane/data-science-jobs-salaries
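The two distance metrics (Eqs. 1 and 2) and the min-max normalization described above can be sketched in Python as follows. This is an illustrative sketch, not the program's actual code, and the function names are ours:

```python
# Illustrative sketch of Eqs. (1)-(2) and min-max normalization;
# not the authors' program code, and the function names are ours.

def manhattan(x, y):
    # Eq. (1): d(x, y) = sum_i |x_i - y_i|
    return sum(abs(a - b) for a, b in zip(x, y))

def minkowski(x, y, m):
    # Eq. (2): L_m(x, y) = (sum_i |x_i - y_i|^m)^(1/m); m = 2 gives Euclidean
    return sum(abs(a - b) ** m for a, b in zip(x, y)) ** (1.0 / m)

def min_max(values, new_min=0.0, new_max=1.0):
    # x' = (x - min) / (max - min) * (newmax - newmin) + newmin
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min
            for v in values]
```

For m = 1 the Minkowski metric coincides with the Manhattan metric, which is why only the exponent differs between the two functions.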



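The three assumptions on the parameter k listed in Section 2 can be combined into a simple helper. The following is a sketch under our own interpretation of those rules, not code from the paper:

```python
import math

# Sketch (our interpretation, not the authors' code) of the rules for
# choosing k: k >= sqrt(n), k > c, and k odd when the number of
# groups c is even (even otherwise).

def choose_k(n, c):
    # n: number of classified elements, c: number of groups
    k = max(math.ceil(math.sqrt(n)), c + 1)
    want_odd = (c % 2 == 0)
    if (k % 2 == 1) != want_odd:
        k += 1  # fix parity; k >= sqrt(n) and k > c still hold
    return k
```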



6. Implementation of KNN algorithm

The final program was developed to return the best KNN variants based on the accuracy obtained from analyzing different options. We implemented two types of KNN algorithm. The first, based on the distances between the values of the sample and the dataset, tried to give the best job position by sorting by distance and summing the appearances of the various job titles. The second also calculated distances, but it first focused on getting a specific category of work and then returned the nearest neighbours for job positions from this limited collection of data. Both of these algorithms were closely analyzed, and the results showed that the classic KNN algorithm without any categorization gives the best accuracy.

    Data: Input sample, dataTab, k
    Result: jobTitle
    dist := [];
    classes := [];
    for each record in dataTab do
        Calculate distance between sample and record in dataTab, save it to dist;
    end
    Add dist as new column to dataTab;
    Sort dataTab by column dist;
    for i in range(0, k) do
        Save number of different job titles' occurrences for the k first records in dataTab to classes;
    end
    return jobTitle that appeared most frequently in classes;

    Algorithm 1: Algorithm of our implementation of KNN

Figure 1: Histogram presenting values of annual salary

7. Analyzing dataset

The histogram Fig. 1 and the plot Fig. 3 show the distribution of earnings in the field of data science. They inform us that there are over 160 people earning between 0 and 10000 USD per year. We note that earnings accumulate in the range of approximately 50000 to 200000 USD. The remaining values are sporadic and we regard them as outliers.

From the chart Fig. 2, we obtain information about the earnings for a specific position. We also note the number of records that define a given job. Positions such as Data Scientist or Data Engineer have more records than, for example, Data Specialist, which appears only once in the database. Not having the same number of records for the different positions will affect the accuracy of the classified data. Also, salaries in different job positions overlap in ranges, which may make it difficult to distinguish positions based on the amount of salary.

Plots Fig. 4 and Fig. 5, presenting the connection between the location or the nationality of an employee and the amount of salary, show that the research was conducted mainly on the American market. Also, the range of salaries of employees of different nationalities and of companies from other countries largely coincides, i.e. the amount of the salary does not depend on the citizenship of the employee or the country in which he works. Therefore, it can be concluded that there are certain salary scales that are offered in IT positions in data analysis regardless of location or nationality.

Fig. 6 shows the connection between the employee's level of experience and his salary. The highest rate was offered to the people with the greatest responsibility, i.e. those working in executive positions, for example the position of director, leader or project manager. After them, the seniors have the highest rate, while the lowest rate is found in the junior experience group. There are also single outliers in each group.

Fig. 7 checks whether there is any connection between the amount of the salary and the company's size. We may notice the lack of large differences in rates for employees from various companies.

Pie charts Fig. 8 and Fig. 9 were generated to verify the percentage of the various work modes and types of employment. They show that remote or semi-remote work is provided in almost 85 percent of positions, while full-time employment predominates in the type of employment.

Summing up, the available data does not stand out for a specific group of job positions or, for example, for a certain location of the company, which may result in difficulty classifying them and in lower accuracy. The lack of visible boundaries in the rates with respect to the size of the company shown in Fig. 7, or the small number of records for certain positions, Fig. 2, will be factors that make classification difficult. Also, the predominance of companies located in, and employees holding citizenship of, the United States makes the data reflect the reality rather of developed countries.
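Algorithm 1 can be sketched in Python roughly as follows. This is a minimal illustrative version; the record layout and names are our assumptions, not the program's actual structures:

```python
from collections import Counter

# Minimal sketch of Algorithm 1; `data` is a list of (features, job_title)
# pairs -- an assumed layout, not the program's actual dataTab structure.

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def knn_job_title(sample, data, k, dist=manhattan):
    # Distance from the sample to every record, sort ascending,
    # then majority vote among the k nearest job titles.
    ranked = sorted(data, key=lambda rec: dist(sample, rec[0]))
    votes = Counter(title for _, title in ranked[:k])
    return votes.most_common(1)[0][0]
```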







Figure 2: Plot presenting values of annual salary according to the job title

8. Analyzing KNN performance

The presented KNN algorithms achieved an accuracy of between 37 and 89% for classification based on job title. The data were divided in the proportions of 30% testing and 70% training parts. The results were analyzed to determine the best combination of dataset, k parameter and distance metric used in the KNN algorithm. We focused on two types of distance metrics: Minkowski and Manhattan. The comparison test consists in checking the performance of the KNN algorithm on a normalized dataset, a non-normalized dataset, and data normalized only in the salary column.

The graphs presented in Fig. 10 show the influence of k on the accuracy of the k-nearest neighbors algorithm using an additional column of job categories. We can see that for k equal to 8 there is a sudden decrease in accuracy both for the Minkowski method of distance calculation and for the taxicab method. Then the values from k equal to 9 decrease. Better accuracy is obtained by using the Manhattan distance metric.

The impact of k on accuracy shown in Fig. 11 informs us that normalizing all columns with little variation in the data does not allow the algorithm to classify properly. As a result, we get low accuracy of the algorithm's operation. Therefore, in further work, despite re-verifying the operation on normalized values, we gave up using this normalized data due to the very low accuracy.

We may notice in Fig. 12 that normalizing only the salary column itself, which initially takes values in the thousands, allows the accuracy of the classification to be increased when job type categorization is used. Once more, the taxicab metric is the better method of calculating distances.

The graph and table in Fig. 13 show the accuracies for different k using the classic KNN algorithm without additional categorization. The accuracy values are practically the same, with minimal variation depending on the distance metric used.

Working on completely normalized data in each of the columns turns out to be pointless due to the very low accuracy that we obtain regardless of the parameter k, Fig. 14. Therefore, in the created table for the soft set algorithm, we do not take into account the accuracy obtained when working on this type of data set.

In the presented graphs, Fig. 15, we may notice that the parameter k affects the determination of the accuracy.
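The evaluation described above can be sketched as a 70/30 train/test split followed by an accuracy measurement for each k and metric. This is illustrative only; the dataset layout, names and split procedure are our assumptions:

```python
import random
from collections import Counter

# Sketch of the Section 8 evaluation loop: 70/30 train/test split, then
# KNN accuracy for each k and metric. Names and layout are assumptions.

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def minkowski2(x, y):
    # Minkowski metric with m = 2 (Euclidean)
    return sum(abs(a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def knn_predict(sample, train, k, dist):
    ranked = sorted(train, key=lambda rec: dist(sample, rec[0]))
    return Counter(t for _, t in ranked[:k]).most_common(1)[0][0]

def accuracy(train, test, k, dist):
    # Percentage of matched elements, as in the Accuracy formula
    hits = sum(knn_predict(x, train, k, dist) == t for x, t in test)
    return hits / len(test) * 100

def evaluate(data, ks, metrics, seed=0):
    rows = random.Random(seed).sample(data, len(data))  # shuffled copy
    split = int(0.7 * len(rows))                        # 70% training
    train, test = rows[:split], rows[split:]
    return {(k, name): accuracy(train, test, k, d)
            for k in ks for name, d in metrics.items()}
```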







In the graphs on the left, which use the Minkowski metric to calculate the distance, we see that the accuracy remains high for the initial 4 values of k and then gradually decreases. On the other hand, when using the Manhattan metric, the values decrease from the initial k.

Figure 3: Plot presenting values of annual salary
Figure 4: Plot presenting values of annual salary according to the nationality of an employee
Figure 5: Plot presenting values of annual salary according to the company location
Figure 6: Plot presenting values of annual salary according to the employee's experience level
Figure 7: Plot presenting values of annual salary according to the company size

9. Experiments

Fig. 16 presents the table obtained for the operation of the soft set algorithm. Its rows contain the accuracies for the respective KNN algorithms, using the given parameter k as well as a specific data set. We obtain this soft set table after analyzing for which parameters k gives the best accuracy.
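The selection step in Section 9 can be sketched as a weighted score over the soft set table, using the weighted sum of Eq. (4). The table values and factor names below are invented for illustration and are not the paper's actual results:

```python
# Sketch of picking the best KNN variant from the soft set table via the
# weighted sum of Eq. (4). Table values and factor names are invented.

def best_variant(table, weights):
    # table:   {variant name: {factor: accuracy}}
    # weights: {factor: user-defined weight}
    def score(accs):
        return sum(accs[f] * w for f, w in weights.items())
    return max(table, key=lambda name: score(table[name]))
```

Changing the user's weights changes which row of the table wins, which is how the final suggestion reflects the factors the user focuses on the most.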







Figure 8: Pie chart presenting the percentage of different types of work
Figure 9: Pie chart presenting the percentage of different forms of employment
Figure 10: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification with categorization, on non-normalized values, using the Minkowski and Manhattan distance metrics
Figure 11: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification with categorization, on normalized values, using the Minkowski and Manhattan distance metrics
Figure 12: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification with categorization, on values normalized only in the salary column, using the Minkowski and Manhattan distance metrics

10. Conclusion

As we can see, the presented solution allows the user to find the perfect job position based on the factors on which he or she focuses. Because of the in-depth examination of the data set, we could distinguish the best combinations of the KNN algorithm in terms of the k parameter, the distance metric and the data set itself. Thanks to creating the soft set table of accuracies of the different KNN solutions, we get the best algorithm, which also gives the utmost importance to the factors we focus on the most.

A. Online Resources

The sources for the solution are available via

    β€’ GitHub








Figure 13: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification of non-normalized values using the Minkowski and Manhattan distance metrics
Figure 14: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification of normalized values using the Minkowski and Manhattan distance metrics
Figure 15: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification of values normalized only in the salary column using the Minkowski and Manhattan distance metrics
Figure 16: Table showing the final accuracies for selected algorithms on specific data

References

 [1] M. A. Sanchez, O. Castillo, J. R. Castro, Generalized type-2 fuzzy systems for controlling a mobile robot and a performance comparison with interval type-2 and type-1 fuzzy systems, Expert Systems with Applications 42 (2015) 5904–5914.
 [2] Q.-b. Zhang, P. Wang, Z.-h. Chen, An improved particle filter for mobile robot localization based on particle swarm optimization, Expert Systems with Applications 135 (2019) 181–193.
 [3] W. Dong, M. WoΕΊniak, et al., Denoising aggregation of graph neural networks by using principal component analysis, IEEE Transactions on Industrial Informatics (2022).
 [4] Y. Li, W. Dong, Q. Yang, S. Jiang, X. Ni, J. Liu, Automatic impedance matching method with adaptive network based fuzzy inference system for wpt, IEEE Transactions on Industrial Informatics 16 (2019) 1076–1085.
 [5] F. Qu, J. Liu, H. Zhu, D. Zang, Wind turbine condition monitoring based on assembled multidimensional membership functions using fuzzy inference system, IEEE Transactions on Industrial Informatics 16 (2019) 4028–4037.
 [6] A. Carpenzano, R. Caponetto, L. Lo Bello, O. Mirabella, Fuzzy traffic smoothing: An approach for real-time communication over ethernet networks, in: 4th IEEE International Workshop on Factory Communication Systems, IEEE, 2002, pp. 241–248.
 [7] M. WoΕΊniak, A. Zielonka, A. Sikora, Driving support by type-2 fuzzy logic control model, Expert Systems with Applications 207 (2022) 117798.
 [8] M. WoΕΊniak, A. Zielonka, A. Sikora, M. J. Piran, A. Alamri, 6g-enabled iot home environment control using fuzzy rules, IEEE Internet of Things Journal 8 (2020) 5442–5452.
 [9] C. Napoli, G. Pappalardo, E. Tramontana, Improving files availability for bittorrent using a diffusion model, in: Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, IEEE Computer Society, 2014, pp. 191–196. doi:10.1109/WETICE.2014.65.




[10] T. Qiu, B. Li, X. Zhou, H. Song, I. Lee, J. Lloret, A novel shortcut addition algorithm with particle swarm for multisink internet of things, IEEE Transactions on Industrial Informatics 16 (2019) 3566–3577.
[11] D. Yu, C. P. Chen, Smooth transition in communication for swarm control with formation change, IEEE Transactions on Industrial Informatics 16 (2020) 6962–6971.
[12] G. Capizzi, G. Lo Sciuto, C. Napoli, R. Shikler, M. Woźniak, Optimizing the organic solar cell manufacturing process by means of AFM measurements and neural networks, Energies 11 (2018).
[13] G. Capizzi, G. Lo Sciuto, C. Napoli, E. Tramontana, An advanced neural network based solution to enforce dispatch continuity in smart grids, Applied Soft Computing Journal 62 (2018) 768–775.
[14] J. Yi, J. Bai, W. Zhou, H. He, L. Yao, Operating parameters optimization for the aluminum electrolysis process using an improved quantum-behaved particle swarm algorithm, IEEE Transactions on Industrial Informatics 14 (2017) 3405–3415.
[15] C. Napoli, G. Pappalardo, E. Tramontana, Using modularity metrics to assist move method refactoring of large systems, in: Proceedings - 2013 7th International Conference on Complex, Intelligent, and Software Intensive Systems, CISIS 2013, 2013, pp. 529–534. doi:10.1109/CISIS.2013.96.
[16] F. Bonanno, G. Capizzi, C. Napoli, Some remarks on the application of RNN and PRNN for the charge-discharge simulation of advanced lithium-ions battery energy storage, in: SPEEDAM 2012 - 21st International Symposium on Power Electronics, Electrical Drives, Automation and Motion, 2012, pp. 941–945. doi:10.1109/SPEEDAM.2012.6264500.
[17] M. Woźniak, A. Sikora, A. Zielonka, K. Kaur, M. S. Hossain, M. Shorfuzzaman, Heuristic optimization of multipulse rectifier for reduced energy consumption, IEEE Transactions on Industrial Informatics 18 (2021) 5515–5526.
[18] F. Bonanno, G. Capizzi, A. Gagliano, C. Napoli, Optimal management of various renewable energy sources by a new forecasting method, 2012, pp. 934–940. doi:10.1109/SPEEDAM.2012.6264603.
[19] M. Ren, Y. Song, W. Chu, An improved locally weighted PLS based on particle swarm optimization for industrial soft sensor modeling, Sensors 19 (2019) 4099.
[20] Y. Zhang, S. Cheng, Y. Shi, D.-w. Gong, X. Zhao, Cost-sensitive feature selection using two-archive multi-objective artificial bee colony algorithm, Expert Systems with Applications 137 (2019) 46–58.
[21] V. S. Dhaka, S. V. Meena, G. Rani, D. Sinwar, M. F. Ijaz, M. Woźniak, A survey of deep convolutional neural networks applied for prediction of plant leaf diseases, Sensors 21 (2021) 4749.
[22] C. Napoli, F. Bonanno, G. Capizzi, An hybrid neuro-wavelet approach for long-term prediction of solar wind, Proceedings of the International Astronomical Union 6 (2010) 153–155.
[23] M. Woźniak, M. Wieczorek, J. Siłka, D. Połap, Body pose prediction based on motion sensor data and recurrent neural network, IEEE Transactions on Industrial Informatics 17 (2020) 2101–2111.
[24] S. Illari, S. Russo, R. Avanzato, C. Napoli, A cloud-oriented architecture for the remote assessment and follow-up of hospitalized patients, in: CEUR Workshop Proceedings, volume 2694, CEUR-WS, 2020, pp. 29–35.
[25] N. Dat, V. Ponzi, S. Russo, F. Vincelli, Supporting impaired people with a following robotic assistant by means of end-to-end visual target navigation and reinforcement learning approaches, in: CEUR Workshop Proceedings, volume 3118, CEUR-WS, 2021, pp. 51–63.
[26] O. Dehzangi, M. Taherisadr, R. ChangalVala, IMU-based gait recognition using convolutional neural networks and multi-sensor fusion, Sensors 17 (2017) 2735.
[27] H. G. Hong, M. B. Lee, K. R. Park, Convolutional neural network-based finger-vein recognition using NIR image sensors, Sensors 17 (2017) 1297.
[28] A. T. Özdemir, B. Barshan, Detecting falls with wearable sensors using machine learning techniques, Sensors 14 (2014) 10691–10708.
[29] N. Brandizzi, V. Bianco, G. Castro, S. Russo, A. Wajda, Automatic RGB inference based on facial emotion recognition, in: CEUR Workshop Proceedings, volume 3092, CEUR-WS, 2021, pp. 66–74.
[30] R. Brociek, G. Magistris, F. Cardia, F. Coppa, S. Russo, Contagion prevention of COVID-19 by means of touch detection for retail stores, in: CEUR Workshop Proceedings, volume 3092, CEUR-WS, 2021, pp. 89–94.
[31] K. G. Liakos, P. Busato, D. Moshou, S. Pearson, D. Bochtis, Machine learning in agriculture: A review, Sensors 18 (2018) 2674.