=Paper= {{Paper |id=Vol-3360/p06 |storemode=property |title=Suggesting a Specific Factor-driven Career Choice using KNN and Soft Set Algorithms |pdfUrl=https://ceur-ws.org/Vol-3360/p06.pdf |volume=Vol-3360 |authors=Joanna Bodora,Jadwiga Cader,Nikola GΔ™bka |dblpUrl=https://dblp.org/rec/conf/system/BodoraCG22 }} ==Suggesting a Specific Factor-driven Career Choice using KNN and Soft Set Algorithms== https://ceur-ws.org/Vol-3360/p06.pdf
Suggesting a Specific Factor-driven Career Choice using
KNN and Soft Set Algorithms
Joanna Bodora1, Jadwiga Cader1 and Nikola GΔ™bka1
1 Faculty of Applied Mathematics, Silesian University of Technology, Kaszubska 23, 44-100 Gliwice, Poland


        Abstract
        Choosing the perfect career path is not an easy task, especially in the IT sector. Lately, data science and the jobs
        connected to this field have been growing more and more popular. To reduce the time spent on finding the perfect
        position in data science, the authors present a solution that selects the best job based on factors introduced by
        the user. The final job title is the result of combining a soft set algorithm with the analyzed accuracies of
        k-nearest neighbours classifiers run with different k parameters and on various collections of the data.

        Keywords
        Soft set, k-nearest neighbours, Classification



SYSTEM 2022: 8th Scholar's Yearly Symposium of Technology, Engineering and Mathematics, Brunek, July 23, 2022
bodorajoanna@polsl.pl (J. Bodora); jadwcad575@polsl.pl (J. Cader); nikogeb061@polsl.pl (N. GΔ™bka)
Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

Nowadays, IT systems [1, 2] very often use artificial intelligence methods, which allow not only to download and process data [3], but also to infer and to support the decision-making process based on them. One of the important branches of artificial intelligence systems are fuzzy sets [4, 5, 6], which are used in numerous applications, among others in the detection of pavement damage [7] or in smart home management [8, 9]. The second important direction of applications are optimization algorithms [10, 11, 12, 13], which are used in optimization processes, where the aim is to minimize or maximize the objective function [14, 15, 2]. An interesting application of heuristic algorithms concerns the reduction of energy consumption [16, 17, 18]. An important part of optimization algorithms are algorithms modeled on the behavior of animals cooperating in large groups [19, 20]. These algorithms, imitating the behavior of a community, e.g. ants and bees, allow the goal to be achieved quickly and effectively. The third direction of the development of artificial intelligence are the methods based on artificial neural networks [21, 22]. They are widely used in medicine, in the care of the elderly [23, 24, 25], in detection [26, 27], as well as in machine learning [28, 29, 30, 31].

We created a program that allows the user to choose a career path based on specific factors. The program makes it possible to select the optimal result using the k-nearest neighbors algorithm together with soft sets. We create a table for soft sets with the accuracies of various types of distance calculation methods in the KNN algorithm. The columns of the soft set table are the specific factors on which we focus, the rows are the successive KNN variants, and the content of the table is the accuracy that we obtained using a specific KNN variant.

The program is written in Python, has no graphical interface and is executed in the IDE. The data is in the form of a database in a .csv file, while the user enters the weights and values for each column as a list directly in the program.

2. K-nearest neighbors Algorithm

The k-nearest neighbors algorithm is a classification algorithm: it evaluates to which group the point from the current iteration of the algorithm belongs. The classification works by counting the number of nearest neighbor points in each group; the result is returned based on the vote of the majority.

The program classifies data based on different variants of the KNN (k-nearest neighbors) algorithm. KNN consists in finding the k already classified elements (neighbors) closest to the new element and assigning this element to the group to which most of its neighbors belong. Several metrics can be used to determine similarity; this program uses two: Manhattan (taxicab) and Minkowski.

Manhattan metric:

    d(x, y) = \sum_{i=1}^{n} |x_i - y_i|    (1)

Where:
d – distance,
Joanna Bodora et al. CEUR Workshop Proceedings                                                                      41–48



x – value of a sample,
y – value of a classified element,
n – number of elements in the sample.

Minkowski metric – a modified Euclidean metric:

    L_m(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^m \right)^{1/m}    (2)

Where:
d – distance,
x – value of a sample,
y – value of a classified element,
n – number of elements in the sample,
m – any small integer.

After calculating the distances, the data is grouped: first sorted in ascending order, then voting is done on the basis of a 1:1 matching of the sample attribute to the test set attribute – the same elements are added to the common set. Then the percentage share of the matched elements in relation to the entire data set is calculated:

    Accuracy = (size of the set of matched elements / size of the whole data set) Γ— 100%

The variable k largely determines the behavior of the classifier. It determines the number of the closest neighbors that decide on the classification of the element. It is a natural number. This parameter is arbitrary, but if we want our classifier to work efficiently, we must make a few assumptions:

    β€’ k must be greater than the square root of the number of all classified elements:

        k β‰₯ βˆšn,

      where n is the number of classified elements.

    β€’ If the number of groups is even, k must be odd; otherwise, k must be even:

        k = 2a + 1 if 2 | c, k = 2a otherwise,    (3)

      where c is the number of groups and a ∈ N.

    β€’ k must be greater than the number of groups:

        k > c.

3. Soft Set

Let U be an initial universe set and E the set of parameters or attributes relative to U. Let P(U) denote the power set of U and let A βŠ† E. The pair (F, A) is called a soft set over U, where F is a mapping given by F : A β†’ P(U). In other words, a soft set (F, A) over U is a parameterized family of subsets of U. For e ∈ A, F(e) can be considered the set of e-elements or e-approximate elements of the soft set (F, A). Thus, (F, A) is defined as:

    (F, A) = {F(e) ∈ P(U) : e ∈ E, F(e) = βˆ… if e βˆ‰ A}

The sample values are combined with the user-defined weights as:

    \sum_{i=1}^{n} s_i Β· w_i    (4)

    β€’ s_i – element of the sample
    β€’ w_i – weight
    β€’ n – length of the sample

4. Other methods used

Cross validation – a statistical method involving the division of a statistical sample into subsets, then conducting the analysis on the training set, while the test set is used to confirm the plausibility of its results.

Rule extraction – rejection of variables not useful in the study.

Data normalization – scaling the data into a given range.

Min-max normalization – using a linear function, it reduces the data to the interval specified by the user (newmin, newmax). At the same time, we should know the range that the data can take. If we do not know it, we can use the highest and the smallest value in the analyzed set:

    x' = (x - min) / (max - min) Β· (newmax - newmin) + newmin

The KNN algorithm is used for both regression and classification. It is useful when dependencies between objects of the same classes are difficult to interpret.

5. Database

The project was created with the use of a database taken from the website https://www.kaggle.com. The database deals with salaries in individual professions in work related to the field of data analysis.

Database link:
https://www.kaggle.com/datasets/saurabhshahane/data-science-jobs-salaries
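The two distance metrics (Eqs. 1 and 2) and the min-max normalization described above can be sketched in Python as follows. This is an illustrative sketch, not the program's actual code, and the function names are ours:

```python
# Illustrative sketch of Eqs. (1)-(2) and min-max normalization;
# not the authors' program code, and the function names are ours.

def manhattan(x, y):
    # Eq. (1): d(x, y) = sum_i |x_i - y_i|
    return sum(abs(a - b) for a, b in zip(x, y))

def minkowski(x, y, m):
    # Eq. (2): L_m(x, y) = (sum_i |x_i - y_i|^m)^(1/m); m = 2 gives Euclidean
    return sum(abs(a - b) ** m for a, b in zip(x, y)) ** (1.0 / m)

def min_max(values, new_min=0.0, new_max=1.0):
    # x' = (x - min) / (max - min) * (newmax - newmin) + newmin
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min
            for v in values]
```

For m = 1 the Minkowski metric coincides with the Manhattan metric, which is why only the exponent differs between the two functions.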



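The three assumptions on the parameter k listed in Section 2 can be combined into a simple helper. The following is a sketch under our own interpretation of those rules, not code from the paper:

```python
import math

# Sketch (our interpretation, not the authors' code) of the rules for
# choosing k: k >= sqrt(n), k > c, and k odd when the number of
# groups c is even (even otherwise).

def choose_k(n, c):
    # n: number of classified elements, c: number of groups
    k = max(math.ceil(math.sqrt(n)), c + 1)
    want_odd = (c % 2 == 0)
    if (k % 2 == 1) != want_odd:
        k += 1  # fix parity; k >= sqrt(n) and k > c still hold
    return k
```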



6. Implementation of KNN algorithm

The final program was developed to return the best KNN variants based on the accuracy obtained from analyzing different options. We implemented two types of KNN algorithm. The first, based on the distances between the values of the sample and the dataset, tried to give the best job position by sorting by distance and summing the appearances of the various job titles. The second also calculated distances, but it first focused on getting a specific category of work and then returned the nearest neighbours for job positions from this limited collection of data. Both of these algorithms were closely analyzed, and the results showed that the classic KNN algorithm without any categorization gives the best accuracy.

    Data: Input sample, dataTab, k
    Result: jobTitle
    dist := [];
    classes := [];
    for each record in dataTab do
        Calculate distance between sample and record in dataTab, save it to dist;
    end
    Add dist as new column to dataTab;
    Sort dataTab by column dist;
    for i in range(0, k) do
        Save number of different job titles' occurrences for the k first records in dataTab to classes;
    end
    return jobTitle that appeared most frequently in classes;

    Algorithm 1: Algorithm of our implementation of KNN

Figure 1: Histogram presenting values of annual salary

7. Analyzing dataset

The histogram Fig. 1 and the plot Fig. 3 show the distribution of earnings in the field of data science. They inform us that there are over 160 people earning between 0 and 10000 USD per year. We note that earnings accumulate in the range of approximately 50000 to 200000 USD. The remaining values are sporadic and we regard them as outliers.

From the chart Fig. 2, we obtain information about the earnings for a specific position. We also note the number of records that define a given job. Positions such as Data Scientist or Data Engineer have more records than, for example, Data Specialist, which appears only once in the database. Not having the same number of records for the different positions will affect the accuracy of the classified data. Also, salaries in different job positions overlap in ranges, which may make it difficult to distinguish positions based on the amount of salary.

Plots Fig. 4 and Fig. 5, presenting the connection between the location or the nationality of an employee and the amount of salary, show that the research was conducted mainly on the American market. Also, the range of salaries of employees of different nationalities and of companies from other countries largely coincides, i.e. the amount of the salary does not depend on the citizenship of the employee or the country in which he works. Therefore, it can be concluded that there are certain salary scales that are offered in IT positions in data analysis regardless of location or nationality.

Fig. 6 shows the connection between the employee's level of experience and his salary. The highest rate was offered to the people with the greatest responsibility, i.e. those working in executive positions, for example the position of director, leader or project manager. After them, the seniors have the highest rate, while the lowest rate is found in the junior experience group. There are also single outliers in each group.

Fig. 7 checks whether there is any connection between the amount of the salary and the company's size. We may notice the lack of large differences in rates for employees from various companies.

Pie charts Fig. 8 and Fig. 9 were generated to verify the percentage of the various work modes and types of employment. They show that remote or semi-remote work is provided in almost 85 percent of positions, while full-time employment predominates in the type of employment.

Summing up, the available data does not stand out for a specific group of job positions or, for example, for a certain location of the company, which may result in difficulty classifying them and in lower accuracy. The lack of visible boundaries in the rates with respect to the size of the company shown in Fig. 7, or the small number of records for certain positions, Fig. 2, will be factors that make classification difficult. Also, the predominance of companies located in, and employees holding citizenship of, the United States makes the data reflect the reality rather of developed countries.
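Algorithm 1 can be sketched in Python roughly as follows. This is a minimal illustrative version; the record layout and names are our assumptions, not the program's actual structures:

```python
from collections import Counter

# Minimal sketch of Algorithm 1; `data` is a list of (features, job_title)
# pairs -- an assumed layout, not the program's actual dataTab structure.

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def knn_job_title(sample, data, k, dist=manhattan):
    # Distance from the sample to every record, sort ascending,
    # then majority vote among the k nearest job titles.
    ranked = sorted(data, key=lambda rec: dist(sample, rec[0]))
    votes = Counter(title for _, title in ranked[:k])
    return votes.most_common(1)[0][0]
```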







Figure 2: Plot presenting values of annual salary according to the job title

8. Analyzing KNN performance

The presented KNN algorithms achieved an accuracy of between 37 and 89% for classification based on job title. The data were divided in the proportions of 30% testing and 70% training parts. The results were analyzed to determine the best combination of dataset, k parameter and distance metric used in the KNN algorithm. We focused on two types of distance metrics: Minkowski and Manhattan. The comparison test consists in checking the performance of the KNN algorithm on a normalized dataset, a non-normalized dataset, and data normalized only in the salary column.

The graphs presented in Fig. 10 show the influence of k on the accuracy of the k-nearest neighbors algorithm using an additional column of job categories. We can see that for k equal to 8 there is a sudden decrease in accuracy both for the Minkowski method of distance calculation and for the taxicab method. Then the values from k equal to 9 decrease. Better accuracy is obtained by using the Manhattan distance metric.

The impact of k on accuracy shown in Fig. 11 informs us that normalizing all columns with little variation in the data does not allow the algorithm to classify properly. As a result, we get low accuracy of the algorithm's operation. Therefore, in further work, despite re-verifying the operation on normalized values, we gave up using this normalized data due to the very low accuracy.

We may notice in Fig. 12 that normalizing only the salary column itself, which initially takes values in the thousands, allows the accuracy of the classification to be increased when job type categorization is used. Once more, the taxicab metric is the better method of calculating distances.

The graph and table in Fig. 13 show the accuracies for different k using the classic KNN algorithm without additional categorization. The accuracy values are practically the same, with minimal variation depending on the distance metric used.

Working on completely normalized data in each of the columns turns out to be pointless due to the very low accuracy that we obtain regardless of the parameter k, Fig. 14. Therefore, in the created table for the soft set algorithm, we do not take into account the accuracy obtained when working on this type of data set.

In the presented graphs, Fig. 15, we may notice that the parameter k affects the determination of the accuracy.
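The evaluation described above can be sketched as a 70/30 train/test split followed by an accuracy measurement for each k and metric. This is illustrative only; the dataset layout, names and split procedure are our assumptions:

```python
import random
from collections import Counter

# Sketch of the Section 8 evaluation loop: 70/30 train/test split, then
# KNN accuracy for each k and metric. Names and layout are assumptions.

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def minkowski2(x, y):
    # Minkowski metric with m = 2 (Euclidean)
    return sum(abs(a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def knn_predict(sample, train, k, dist):
    ranked = sorted(train, key=lambda rec: dist(sample, rec[0]))
    return Counter(t for _, t in ranked[:k]).most_common(1)[0][0]

def accuracy(train, test, k, dist):
    # Percentage of matched elements, as in the Accuracy formula
    hits = sum(knn_predict(x, train, k, dist) == t for x, t in test)
    return hits / len(test) * 100

def evaluate(data, ks, metrics, seed=0):
    rows = random.Random(seed).sample(data, len(data))  # shuffled copy
    split = int(0.7 * len(rows))                        # 70% training
    train, test = rows[:split], rows[split:]
    return {(k, name): accuracy(train, test, k, d)
            for k in ks for name, d in metrics.items()}
```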







In the graphs on the left, which use the Minkowski metric to calculate the distance, we see that the accuracy remains high for the initial 4 values of k and then gradually decreases. On the other hand, when using the Manhattan metric, the values decrease from the initial k.

Figure 3: Plot presenting values of annual salary
Figure 4: Plot presenting values of annual salary according to the nationality of an employee
Figure 5: Plot presenting values of annual salary according to the company location
Figure 6: Plot presenting values of annual salary according to the employee's experience level
Figure 7: Plot presenting values of annual salary according to the company size

9. Experiments

Fig. 16 presents the table obtained for the operation of the soft set algorithm. Its rows contain the accuracies for the respective KNN algorithms, using the given parameter k as well as a specific data set. We obtain this soft set table after analyzing for which parameters k gives the best accuracy.
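The selection step in Section 9 can be sketched as a weighted score over the soft set table, using the weighted sum of Eq. (4). The table values and factor names below are invented for illustration and are not the paper's actual results:

```python
# Sketch of picking the best KNN variant from the soft set table via the
# weighted sum of Eq. (4). Table values and factor names are invented.

def best_variant(table, weights):
    # table:   {variant name: {factor: accuracy}}
    # weights: {factor: user-defined weight}
    def score(accs):
        return sum(accs[f] * w for f, w in weights.items())
    return max(table, key=lambda name: score(table[name]))
```

Changing the user's weights changes which row of the table wins, which is how the final suggestion reflects the factors the user focuses on the most.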







Figure 8: Pie chart presenting the percentage of different types of work
Figure 9: Pie chart presenting the percentage of different forms of employment
Figure 10: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification with categorization, on non-normalized values, using the Minkowski and Manhattan distance metrics
Figure 11: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification with categorization, on normalized values, using the Minkowski and Manhattan distance metrics
Figure 12: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification with categorization, on values normalized only in the salary column, using the Minkowski and Manhattan distance metrics

10. Conclusion

As we can see, the presented solution allows the user to find the perfect job position based on the factors on which he or she focuses. Because of the in-depth examination of the data set, we could distinguish the best combinations of the KNN algorithm in terms of the k parameter, the distance metric and the data set itself. Thanks to creating the soft set table of accuracies of the different KNN solutions, we get the best algorithm, which also gives the utmost importance to the factors we focus on the most.

A. Online Resources

The sources for the solution are available via

    β€’ GitHub








Figure 13: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification of non-normalized values using the Minkowski and Manhattan distance metrics
Figure 14: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification of normalized values using the Minkowski and Manhattan distance metrics
Figure 15: Results and plots presenting the impact of the k parameter on the accuracy of KNN classification of values normalized only in the salary column using the Minkowski and Manhattan distance metrics
Figure 16: Table showing the final accuracies for selected algorithms on specific data

References

 [1] M. A. Sanchez, O. Castillo, J. R. Castro, Generalized type-2 fuzzy systems for controlling a mobile robot and a performance comparison with interval type-2 and type-1 fuzzy systems, Expert Systems with Applications 42 (2015) 5904–5914.
 [2] Q.-b. Zhang, P. Wang, Z.-h. Chen, An improved particle filter for mobile robot localization based on particle swarm optimization, Expert Systems with Applications 135 (2019) 181–193.
 [3] W. Dong, M. WoΕΊniak, et al., Denoising aggregation of graph neural networks by using principal component analysis, IEEE Transactions on Industrial Informatics (2022).
 [4] Y. Li, W. Dong, Q. Yang, S. Jiang, X. Ni, J. Liu, Automatic impedance matching method with adaptive network based fuzzy inference system for wpt, IEEE Transactions on Industrial Informatics 16 (2019) 1076–1085.
 [5] F. Qu, J. Liu, H. Zhu, D. Zang, Wind turbine condition monitoring based on assembled multidimensional membership functions using fuzzy inference system, IEEE Transactions on Industrial Informatics 16 (2019) 4028–4037.
 [6] A. Carpenzano, R. Caponetto, L. Lo Bello, O. Mirabella, Fuzzy traffic smoothing: An approach for real-time communication over ethernet networks, in: 4th IEEE International Workshop on Factory Communication Systems, IEEE, 2002, pp. 241–248.
 [7] M. WoΕΊniak, A. Zielonka, A. Sikora, Driving support by type-2 fuzzy logic control model, Expert Systems with Applications 207 (2022) 117798.
 [8] M. WoΕΊniak, A. Zielonka, A. Sikora, M. J. Piran, A. Alamri, 6g-enabled iot home environment control using fuzzy rules, IEEE Internet of Things Journal 8 (2020) 5442–5452.
 [9] C. Napoli, G. Pappalardo, E. Tramontana, Improving files availability for bittorrent using a diffusion model, in: Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, IEEE Computer Society, 2014, pp. 191–196. doi:10.1109/WETICE.2014.65.




[10] T. Qiu, B. Li, X. Zhou, H. Song, I. Lee, J. Lloret, A novel shortcut addition algorithm with particle swarm for multisink internet of things, IEEE Transactions on Industrial Informatics 16 (2019) 3566–3577.
[11] D. Yu, C. P. Chen, Smooth transition in communication for swarm control with formation change, IEEE Transactions on Industrial Informatics 16 (2020) 6962–6971.
[12] G. Capizzi, G. Lo Sciuto, C. Napoli, R. Shikler, M. Woźniak, Optimizing the organic solar cell manufacturing process by means of AFM measurements and neural networks, Energies 11 (2018).
[13] G. Capizzi, G. Lo Sciuto, C. Napoli, E. Tramontana, An advanced neural network based solution to enforce dispatch continuity in smart grids, Applied Soft Computing Journal 62 (2018) 768–775.
[14] J. Yi, J. Bai, W. Zhou, H. He, L. Yao, Operating parameters optimization for the aluminum electrolysis process using an improved quantum-behaved particle swarm algorithm, IEEE Transactions on Industrial Informatics 14 (2017) 3405–3415.
[15] C. Napoli, G. Pappalardo, E. Tramontana, Using modularity metrics to assist move method refactoring of large systems, in: Proceedings - 2013 7th International Conference on Complex, Intelligent, and Software Intensive Systems, CISIS 2013, 2013, pp. 529–534. doi:10.1109/CISIS.2013.96.
[16] F. Bonanno, G. Capizzi, C. Napoli, Some remarks on the application of RNN and PRNN for the charge-discharge simulation of advanced lithium-ions battery energy storage, in: SPEEDAM 2012 - 21st International Symposium on Power Electronics, Electrical Drives, Automation and Motion, 2012, pp. 941–945. doi:10.1109/SPEEDAM.2012.6264500.
[17] M. Woźniak, A. Sikora, A. Zielonka, K. Kaur, M. S. Hossain, M. Shorfuzzaman, Heuristic optimization of multipulse rectifier for reduced energy consumption, IEEE Transactions on Industrial Informatics 18 (2021) 5515–5526.
[18] F. Bonanno, G. Capizzi, A. Gagliano, C. Napoli, Optimal management of various renewable energy sources by a new forecasting method, 2012, pp. 934–940. doi:10.1109/SPEEDAM.2012.6264603.
[19] M. Ren, Y. Song, W. Chu, An improved locally weighted PLS based on particle swarm optimization for industrial soft sensor modeling, Sensors 19 (2019) 4099.
[20] Y. Zhang, S. Cheng, Y. Shi, D.-w. Gong, X. Zhao, Cost-sensitive feature selection using two-archive multi-objective artificial bee colony algorithm, Expert Systems with Applications 137 (2019) 46–58.
[21] V. S. Dhaka, S. V. Meena, G. Rani, D. Sinwar, M. F. Ijaz, M. Woźniak, A survey of deep convolutional neural networks applied for prediction of plant leaf diseases, Sensors 21 (2021) 4749.
[22] C. Napoli, F. Bonanno, G. Capizzi, An hybrid neuro-wavelet approach for long-term prediction of solar wind, Proceedings of the International Astronomical Union 6 (2010) 153–155.
[23] M. Woźniak, M. Wieczorek, J. Siłka, D. Połap, Body pose prediction based on motion sensor data and recurrent neural network, IEEE Transactions on Industrial Informatics 17 (2020) 2101–2111.
[24] S. Illari, S. Russo, R. Avanzato, C. Napoli, A cloud-oriented architecture for the remote assessment and follow-up of hospitalized patients, in: CEUR Workshop Proceedings, volume 2694, CEUR-WS, 2020, pp. 29–35.
[25] N. Dat, V. Ponzi, S. Russo, F. Vincelli, Supporting impaired people with a following robotic assistant by means of end-to-end visual target navigation and reinforcement learning approaches, in: CEUR Workshop Proceedings, volume 3118, CEUR-WS, 2021, pp. 51–63.
[26] O. Dehzangi, M. Taherisadr, R. ChangalVala, IMU-based gait recognition using convolutional neural networks and multi-sensor fusion, Sensors 17 (2017) 2735.
[27] H. G. Hong, M. B. Lee, K. R. Park, Convolutional neural network-based finger-vein recognition using NIR image sensors, Sensors 17 (2017) 1297.
[28] A. T. Özdemir, B. Barshan, Detecting falls with wearable sensors using machine learning techniques, Sensors 14 (2014) 10691–10708.
[29] N. Brandizzi, V. Bianco, G. Castro, S. Russo, A. Wajda, Automatic RGB inference based on facial emotion recognition, in: CEUR Workshop Proceedings, volume 3092, CEUR-WS, 2021, pp. 66–74.
[30] R. Brociek, G. Magistris, F. Cardia, F. Coppa, S. Russo, Contagion prevention of COVID-19 by means of touch detection for retail stores, in: CEUR Workshop Proceedings, volume 3092, CEUR-WS, 2021, pp. 89–94.
[31] K. G. Liakos, P. Busato, D. Moshou, S. Pearson, D. Bochtis, Machine learning in agriculture: A review, Sensors 18 (2018) 2674.