<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Trainee Churn Prediction using Machine Learning: A Case Study of Data Scientist Job</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Oanh Thi Tran</string-name>
          <email>oanhtt@isvnu.vn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ly Phuong Nguyen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>International School, Vietnam National University</institution>
          ,
          <addr-line>Hanoi</addr-line>
          ,
          <country country="VN">Vietnam</country>
        </aff>
      </contrib-group>
      <fpage>74</fpage>
      <lpage>82</lpage>
      <abstract>
        <p>The number of positions for data scientists is increasing. Companies working in big data and data science usually receive many registrations for their training programs before officially giving trainees a permanent role. Among those trainees, the companies want to know which candidates really want to work for them and which will look for new employment after the training period. Knowing this helps to reduce training costs and brings higher levels of satisfaction and retention. This work interprets the main factors impacting a candidate's decision and then builds a prediction model for the probability that a candidate will look for a new job or will work for the company, using current credentials, demographics, experience data, etc. To this end, different robust machine learning methods are carefully investigated on a public dataset: single classifiers such as decision trees, Naive Bayes, KNN and SVM, and ensemble classifiers such as random forest, voting strategies, XGBoost and LightGBM. The experimental results show that the ensemble classifiers achieve relatively higher performance than the single classifiers. The LGBM classifier was the best one, yielding up to 80% in the F1 score using the selected feature sets. This research provides a strong preliminary result on this interesting yet unexplored problem.</p>
      </abstract>
      <kwd-group>
        <kwd>Churn prediction</kwd>
        <kwd>machine learning method</kwd>
        <kwd>data science</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Churn prediction [
        <xref ref-type="bibr" rid="ref1 ref13 ref9">1,9,13</xref>
        ] is commonly used by companies and organizations to
know when and why employees are likely to leave. This research
direction is attracting the attention of many researchers around the world. Recently,
the application of machine learning in this field has been blooming because data for
churn prediction is now available in considerable quantity. For example, much
research has used different robust machine learning methods such
as SVM [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], logistic regression [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], XGBoost [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], or tree-based classifiers like
decision trees [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and random forests [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] on many public datasets.
      </p>
      <p>While much research has been done on employee churn prediction, to our
knowledge, there is no published research on trainee or candidate churn
prediction. Nowadays, companies active in Big Data and Data Science want
to hire data scientists from among the people who successfully pass courses
conducted by the company. Learning and developing during the training period is a
win-win for both the companies and the trainees. Typically, these companies receive
multiple candidate signups for their training programs. Hence, they want to
know which of these candidates really want to work for the company after the
training period and which will look for employment at other companies. This prediction
would be extremely useful because it helps to reduce cost and time, improve
the quality of training, and support course planning and candidate categorization.</p>
      <p>This prediction problem is quite close to the problem of
employee churn prediction. In fact, data about current and past candidates
can be analyzed to figure out the common characteristics of candidates,
which can then be used to make predictions about the possible retention of potential
candidates in the future. In this paper, we aim to systematically study
trainee churn prediction. We exploited the public data available at Kaggle
to conduct the research.</p>
      <p>
        This dataset is designed to help understand the factors that lead a person to leave
the company after its training programs. With models trained on these data, we can
predict the probability that a candidate will look for a new job or will work for the
company, as well as interpret the factors affecting the candidates' decisions.
Specifically, we conducted a systematic study on different robust machine learning
techniques as follows:
– Single classifiers: decision tree [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], logistic regression [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], multilayer
perceptron, k-nearest neighbors, and support vector machine [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
– Ensemble classifiers: random forest [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], voting strategies, XGBoost [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and
LightGBM [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
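      <p>As an illustration (a minimal sketch assuming the scikit-learn implementations of these classifiers; XGBoost and LightGBM live in their own packages and are omitted here), the two families can be instantiated as:</p>

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, VotingClassifier

# Single classifiers investigated in this study.
single = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "mlp": MLPClassifier(max_iter=500, random_state=0),
    "knn": KNeighborsClassifier(),
    "svm": SVC(),
}

# Ensemble classifiers: random forest plus a hard-voting combination of the
# single models (XGBClassifier / LGBMClassifier would be added analogously).
ensemble = {
    "random_forest": RandomForestClassifier(random_state=0),
    "voting": VotingClassifier(estimators=list(single.items())),
}
```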
      <p>Before implementing the different models, we performed exploratory
data analysis to gain more insight into this dataset. We also performed
pre-processing to ensure the data were of good quality before feeding them into the models.
Finally, we applied feature selection to select the most important
features for building the best model. Experimental results on the public dataset
(https://www.kaggle.com/arashnic/hr-analytics-job-change-of-data-scientists)
are quite promising. The SVM method proved to be the best model among the
single classifiers, while the LGBM classifier was the best among the ensemble
classifiers. LGBM even outperformed SVM on all evaluation metrics and yielded
nearly 80% in the F1 score. This result was slightly improved with the selected
28-feature set, with which LGBM reached 80% in the F1 score.</p>
      <p>The rest of this paper is organized as follows: Section 2 presents related
work on employee churn prediction. Section 3 gives a preliminary scan of
the data using exploratory data analysis before the prediction models are developed
using the proposed methods described in Section 4. Section 5 describes the
experimental setup, experimental results and some discussion of the results. Finally,
we conclude the paper and outline some lines of future work in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Alamsyah et al., 2018 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] used three popular models for prediction, namely
Naïve Bayes, decision tree, and random forest, on a Human Resource
Information System (HRIS) dataset from a well-known telecommunications company in
Indonesia. Punnoose et al., 2016 [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] also used data from the HRIS of a global
retailer to compare XGBoost against six historically used supervised classifiers and
demonstrated its significantly higher accuracy for predicting employee turnover.
Similarly, Jain et al., 2021 [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] used a dataset from an HRIS and showed that a
system using the CatBoost algorithm outperforms other ML algorithms.
Alduayj et al., 2018 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] conducted experiments on a synthetic dataset created by IBM
Watson using the following machine learning models: SVM, random forest
and KNN. Qutub et al., 2021 [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] used the IBM attrition dataset for training
and evaluating machine learning models. Their results suggest that logistic
regression achieved the highest scores and the decision tree the lowest. Khera
et al., 2019 [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] used a support vector machine (SVM) for prediction based on
archival employee data collected from the Human Resource databases of three IT
companies in India, including their employment status at the time of collection.
On the same dataset, however, Zhao et al., 2019 [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] used tree-based classifiers
(XGB, GBT, RF, DT) and showed that they worked well in general. Srivastava
et al., 2021 [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] established the predictive power of deep learning for employee
churn prediction over ensemble machine learning techniques on real-time
employee data from a mid-sized Fast-Moving Consumer Goods (FMCG) company.
Nguyen et al., 2020 [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] applied a case study of an organization with 1470
employee positions to demonstrate the whole process integrating churn prediction,
the Employee Value Model (EVM) and machine learning.
      </p>
      <p>These studies mostly focused on employees who are
permanently working for their companies, using a wide range of machine learning
techniques. In this paper, we target candidates or trainees of a company to
see whether or not they are likely to leave the company after the training period. We
performed a systematic study of this task using a public dataset from Kaggle.</p>
    </sec>
    <sec id="sec-3">
      <title>Exploratory Data Analysis</title>
      <p>Here are some analyses of this dataset using exploratory data analysis techniques
such as histograms, box plots, correlation analysis, etc.:
– The number of candidates ‘leaving’ accounted for only 25%, while the number of
candidates ‘not leaving’ made up 75%. Hence, this is an imbalanced-class
problem.
– It is noted that the majority of those ‘leaving’ are male (89%). This is not
surprising given that the dataset features a higher relative number of male than
female and other candidates.
– People in their first eight years of working in Data Science are more likely to
look for a new job, and more than half of those who have been in the field
for more than 20 years are not looking for a new job.
– Candidates working in small companies are more likely to look for a new job,
while those in medium and large companies show smaller numbers seeking new
opportunities.
– Candidates with graduate education are more likely than others to look for
a new job.
– The majority of the candidates who do not leave the company are from cities
with city indexes ranging from 0.8 to 0.9, whereas the candidates who do
leave the company are from cities with city indexes ranging from 0.6 to 0.9.</p>
    </sec>
    <sec id="sec-4">
      <title>Proposed ML classifiers</title>
      <p>In this work, we exploited both single classifiers and ensemble classifiers to
train the prediction models.</p>
    </sec>
    <sec id="sec-5">
      <title>Experiments</title>
      <sec id="sec-5-1">
        <title>Data Pre-processing</title>
        <p>Dealing with missing data There are eight features containing missing values:
experience, enrolled university, last new job, education level, major
discipline, gender, company type and company size. To handle this problem, we
used the fillna() method to replace NaN values with ‘unknown’ in these eight
columns.</p>
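        <p>A minimal sketch of this step (toy values, covering two of the eight affected columns):</p>

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the dataset's categorical columns.
df = pd.DataFrame({
    "gender": ["Male", np.nan, "Female"],
    "company_size": ["50-99", np.nan, "<10"],
})

# Replace every NaN with the literal string 'unknown'.
df = df.fillna("unknown")
```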
        <p>
          Converting categorical features All predictor variables in many
models must be numeric. Therefore, the categorical variables must be properly
transformed into numeric representations using dummy encoding methods.
Feature Scaling Feature scaling transforms the values of different
numerical features onto a comparable scale; we used the StandardScaler function,
which standardizes each feature to zero mean and unit variance.
Class imbalance We used the SMOTE method with the tuned LGBM classifier,
our best model. SMOTE creates synthetic (not duplicated)
samples of the minority class until the minority class equals the
majority class in size. It does this by selecting similar records and altering each
record one column at a time by a random amount within the difference to the
neighboring records [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
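        <p>A dependency-light sketch of the encoding and scaling steps (the column names and values are hypothetical; the z-score below is what StandardScaler computes):</p>

```python
import pandas as pd

# Hypothetical frame with one numeric and one categorical feature.
df = pd.DataFrame({
    "city_development_index": [0.92, 0.62, 0.76, 0.89],
    "education_level": ["Graduate", "Masters", "Graduate", "Phd"],
})

# Dummy (one-hot) encoding of the categorical column.
encoded = pd.get_dummies(df, columns=["education_level"])

# Standardization, equivalent to sklearn's StandardScaler: zero mean, unit variance.
col = "city_development_index"
encoded[col] = (encoded[col] - encoded[col].mean()) / encoded[col].std(ddof=0)
```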
      </sec>
      <sec id="sec-5-2">
        <title>Experimental Setups</title>
        <p>We conducted a 5-fold cross-validation test. All experiments were performed using
Google Colab and evaluated using precision, recall, F1 and accuracy scores.</p>
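        <p>This setup maps directly onto scikit-learn's cross_validate (a sketch on synthetic data; in the actual experiments the features come from the pre-processed Kaggle frame):</p>

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC

# Synthetic stand-in with the paper's 75/25 class imbalance.
X, y = make_classification(n_samples=300, n_features=10,
                           weights=[0.75, 0.25], random_state=0)

# 5-fold cross-validation scored with the four metrics used here.
scores = cross_validate(SVC(), X, y, cv=5,
                        scoring=["precision", "recall", "f1", "accuracy"])
mean_f1 = scores["test_f1"].mean()
```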
      </sec>
      <sec id="sec-5-3">
        <title>Experimental Results</title>
      </sec>
      <sec id="sec-5-4">
        <title>Experimental results of different ML methods</title>
        <p>Table 1 shows the experimental results of the models in terms of precision, recall, F1 score and accuracy.</p>
        <p>Among the single classifiers, the worst performance came from the
decision tree, followed by the MLP. The SVM method
significantly outperformed the other methods and yielded the highest performance on all
four evaluation metrics. In comparison to the second and third best methods,
KNN and logistic regression, it boosted the F1 and accuracy scores by
approximately 3%. Using SVM, we achieved 78.81% in the F1 score and 79.22% in
accuracy.</p>
        <p>As shown in Table 1, the ensemble classifiers achieved relatively higher
performance than the single classifiers. The simple voting techniques
could not enhance performance, even using a strong single classifier like SVM.
The random forest technique was competitive with the best
single SVM classifier. The two variants of gradient boosting architectures,
XGBoost and LGBM, proved to be quite effective in predicting the likelihood
of candidate churn on this dataset. Of the two, LGBM was slightly
better than XGBoost. It boosted the F1 score by nearly 1% in comparison to the
single SVM classifier. This best classifier yielded quite good performance with
79.71% in the F1 score and 79.64% in accuracy.</p>
      </sec>
      <sec id="sec-5-5">
        <title>Experimental results using SMOTE to handle imbalanced data</title>
        <p>Table 2 illustrates the evaluation of the best ensemble classifier, LGBM, without and
with SMOTE. The SMOTE technique slightly improves performance on all evaluation
metrics; it enhanced the F1 score by 0.24% in comparison to not using it.</p>
        <p>We also measured per-class performance using SMOTE and found
that predicting class 1 is more difficult than predicting class 0.
In more detail, we obtained 86% and 61.64% in the F1 score for class 0 and class
1, respectively.</p>
      </sec>
      <sec id="sec-5-6">
        <title>Comparing experimental results between selected and full features</title>
        <p>Feature selection greatly impacts the performance of the models. In this study,
we investigated three popular feature selection methods: univariate
selection with the chi-squared statistical test, feature importance from the tuned
LGBM classifier, and a correlation heatmap. Among the top 50 features selected by each
technique, we found that all three methods shared the same 28 features. Based on
these feature sets, we built the best models using the best tuned LGBM classifier.
To get a better picture of the best feature sets, we also tried other
options around these 28 features. Figure 2 shows that using the 28 shared features
yielded slightly better performance than using their
subsets. Using the best set of 28 features yielded the best performance,
with a 0.3% improvement in the F1 score in comparison to using the full feature
set.</p>
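        <p>The univariate chi-squared selection can be sketched with scikit-learn's SelectKBest (synthetic data; chi2 requires non-negative features, hence the shift):</p>

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2

# Synthetic stand-in; shift features so chi2's non-negativity requirement holds.
X, y = make_classification(n_samples=200, n_features=40, random_state=0)
X = X - X.min(axis=0)

# Keep the 28 features with the highest chi-squared scores.
selector = SelectKBest(chi2, k=28)
X_selected = selector.fit_transform(X, y)
kept = selector.get_support(indices=True)  # indices of retained features
```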
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>This paper presented work on predicting the likelihood that candidates
intend to leave or stay with the company after the training period. This
work was performed to interpret the main factors impacting a candidate's decision
and then build a prediction model for the probability that a candidate will
look for a new job or will work for the company, using current credentials,
demographics, experience data, etc. We conducted extensive experiments using
different machine learning methods in order to find the best prediction model.
Experimental results on a public dataset showed that, in general, the ensemble
classifiers gave relatively higher performance than the single
classifiers. The LGBM classifier was the best one, yielding up to 80% in
the F1 score using the selected feature sets. Between the two classes, the experimental
results showed that predicting class 1 (the candidate leaves the company)
is more difficult than predicting class 0 (the candidate does not leave the
company). We do not expect a perfect model, but the promising results suggest
that the best model could be used by companies today.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alamsyah</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salma</surname>
          </string-name>
          , N.:
          <article-title>A Comparative Study of Employee Churn Prediction Model</article-title>
          .
          <source>In Proceedings of the 4th International Conference on Science and Technology (ICST)</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          (
          <year>2018</year>
          ), doi: 10.1109/ICSTC.
          <year>2018</year>
          .
          <volume>8528586</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Alduayj</surname>
            ,
            <given-names>S.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rajpoot</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Predicting Employee Attrition using Machine Learning</article-title>
          .
          <source>In: Proceedings of the 2018 International Conference on Innovations in Information Technology (IIT)</source>
          , pp.
          <fpage>93</fpage>
          -
          <lpage>98</lpage>
          (
          <year>2018</year>
          ), doi: 10.1109/INNOVATIONS.
          <year>2018</year>
          .
          <volume>8605976</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Amin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rahim</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ali</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khan</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anwar</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>A Comparison of Two Oversampling Techniques (SMOTE vs MTDF) for Handling Class Imbalance Problem: A Case Study of Customer Churn Prediction</article-title>
          . In: Rocha A.,
          <string-name>
            <surname>Correia</surname>
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Costanzo</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reis</surname>
            <given-names>L</given-names>
          </string-name>
          . (eds) New Contributions in
          <source>Information Systems and Technologies. Advances in Intelligent Systems and Computing</source>
          , vol
          <volume>353</volume>
          . Springer, Cham. https://doi.org/10.1007/978-3-
          <fpage>319</fpage>
          -16486-1
          <fpage>22</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Breiman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Random forests</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>45</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>5</fpage>
          -
          <lpage>32</lpage>
          (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benesty</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khotilovich</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Xgboost: extreme gradient boosting</article-title>
          .
          <source>R package version 04-2</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Cortes</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Support vector machine</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>20</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>273</fpage>
          -
          <lpage>297</lpage>
          (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Zhang,
          <string-name>
            <surname>C.</surname>
          </string-name>
          :
          <article-title>An improved random forest algorithm for predicting employee turnover</article-title>
          . Mathematical Problems in Engineering, pp.
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Jin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De-lin</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , Fen-xiang,
          <string-name>
            <surname>M.:</surname>
          </string-name>
          <article-title>An improved ID3 decision tree algorithm</article-title>
          .
          <source>In Proceedings of the 4th International Conference on Computer Science and Education: IEEE</source>
          (
          <year>2009</year>
          ). https://doi.org/10.1109/iccse.
          <year>2009</year>
          .
          <volume>5228509</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Jain</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tomar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jana</surname>
            ,
            <given-names>P.K.</given-names>
          </string-name>
          <article-title>A novel scheme for employee churn problem using multi-attribute decision making approach and machine learning</article-title>
          .
          <source>J Intell Inf Syst 56</source>
          , pp.
          <fpage>279</fpage>
          -
          <lpage>302</lpage>
          (
          <year>2021</year>
          ). https://doi.org/10.1007/s10844-020-00614-9.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Ke</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meng</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Finley</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , Ma,
          <string-name>
            <given-names>W.</given-names>
            ,
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            ,
            <surname>Liu</surname>
          </string-name>
          , T.Y.:
          <article-title>LightGBM: a highly efficient gradient boosting decision tree</article-title>
          .
          <source>In Proceedings of the 31st International Conference on Neural Information Processing Systems</source>
          , pp.
          <fpage>3149</fpage>
          -
          <lpage>3157</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Khera</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Divya.:
          <article-title>Predictive Modelling of Employee Turnover in Indian IT Industry Using Machine Learning Techniques</article-title>
          .
          <source>Vision: The Journal of Business Perspective</source>
          ,
          <volume>23</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>12</fpage>
          -
          <lpage>21</lpage>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>T.N.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>D.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vijender</surname>
            ,
            <given-names>K.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>L.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vu</surname>
            ,
            <given-names>H.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luong</surname>
            ,
            <given-names>N.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vu</surname>
            ,
            <given-names>T.N.</given-names>
          </string-name>
          :
          <article-title>Integrating Employee Value Model with Churn Prediction</article-title>
          .
          <source>International Journal of Sensors</source>
          , Wireless Communications and Control;
          <volume>10</volume>
          (
          <issue>4</issue>
          ) (
          <year>2020</year>
          ), https://doi.org/10.2174/2210327910666200213123728.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Punnoose</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ajit</surname>
          </string-name>
          , P.:
          <article-title>Prediction of Employee Turnover in Organizations using Machine Learning Algorithms</article-title>
          .
          <source>International Journal of Advanced Research in Artificial Intelligence</source>
          ,
          <volume>5</volume>
          (
          <issue>9</issue>
          ) (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Qutub</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Al-Mehmadi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Al-Hssan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aljohani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alghamdi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Prediction of Employee Attrition Using Machine Learning and Ensemble Methods</article-title>
          .
          <source>International Journal of Machine Learning and Computing</source>
          ,
          <volume>11</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>110</fpage>
          -
          <lpage>114</lpage>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Srivastava</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eachempati</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Intelligent Employee Retention System for Attrition Rate Analysis and Churn Prediction</article-title>
          .
          <source>Journal of Global Information Management</source>
          ,
          <volume>29</volume>
          (
          <issue>6</issue>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>29</lpage>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Webb</surname>
            ,
            <given-names>G.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sammut</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perlich</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horvath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wrobel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Korb</surname>
            ,
            <given-names>K.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Noble</surname>
            ,
            <given-names>W.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leslie</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lagoudakis</surname>
            ,
            <given-names>M.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Quadrianto</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buntine</surname>
            ,
            <given-names>W.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Getoor</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Namata</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Getoor</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xin Jin</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ting</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vijayakumar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schaal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raedt</surname>
            ,
            <given-names>L.D.</given-names>
          </string-name>
          :
          <article-title>Logistic regression</article-title>
          .
          In
          <source>Encyclopedia of Machine Learning</source>
          (pp.
          <fpage>631</fpage>
          -
          <lpage>631</lpage>
          ) (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hryniewicki</surname>
            ,
            <given-names>M.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Employee Turnover Prediction with Machine Learning: A Reliable Approach</article-title>
          . In:
          <string-name>
            <surname>Arai</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kapoor</surname>
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhatia</surname>
            <given-names>R.</given-names>
          </string-name>
          . (eds)
          <article-title>Intelligent Systems and Applications</article-title>
          .
          <source>IntelliSys 2018. Advances in Intelligent Systems and Computing</source>
          , vol
          <volume>869</volume>
          . Springer, Cham (
          <year>2019</year>
          ). https://doi.org/10.1007/978-3-030-01057-7_56.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>