=Paper=
{{Paper
|id=Vol-2083/paper-04
|storemode=property
|title=Recommendation of Job Offers Using Random Forests and Support Vector Machines
|pdfUrl=https://ceur-ws.org/Vol-2083/paper-04.pdf
|volume=Vol-2083
|authors=Jorge Martinez-Gil,Bernhard Freudenthaler,Thomas Natschläger
|dblpUrl=https://dblp.org/rec/conf/edbt/GilFN18
}}
==Recommendation of Job Offers Using Random Forests and Support Vector Machines==
<pdf width="1500px">https://ceur-ws.org/Vol-2083/paper-04.pdf</pdf>
<pre>
     Recommendation of Job Offers Using Random Forests and
                  Support Vector Machines
              Jorge Martinez-Gil                                Bernhard Freudenthaler                              Thomas Natschläger
         Software Competence Center                            Software Competence Center                         Software Competence Center
              Hagenberg GmbH                                        Hagenberg GmbH                                     Hagenberg GmbH
              Hagenberg, Austria                                    Hagenberg, Austria                                 Hagenberg, Austria
          jorge.martinez-gil@scch.at                          bernhard.freudenthaler@scch.at                      thomas.natschlaeger@scch.at

ABSTRACT                                                                                 kind of information filtering mechanism (a.k.a. job recommender
The challenge of automatically recommending job offers to ap-                            system [2, 6, 20]) aiming to predict the potential interest of job
propriate job seekers is a topic that has attracted much research                        seekers in given job offers. More specifically, job recommender
effort in recent years. However, it is generally assumed that                            systems aim to automatically suggest job openings in such a way
there is a need for more user-friendly filtering methods so that                         that as many offers as possible are offered to the right candidates
the automated recommendation systems might be more widely                                at the right moment.
used. We present here our research on two methods from the                                   To appropriately address these problems, a number of alternatives
data analytics field that are able to disseminate job offers to the                      have already been explored: whether data concerning the of-
right person at the right time, which are based on Random Forests                        fer should be provided in a structured or unstructured way [7],
and Support Vector Machines, respectively. Both methods are                              which communication channels are the most appropriate in a
used here to identify the actual attributes on which users focus                         given context [4, 5], how knowledge extraction over the job de-
when they are attracted to a job offer. Preliminary results in the                       scriptions should be performed [22], and so on. However, it is
context of automatic job recommendation suggest that these two                           widely assumed that more accurate and user-friendly filtering
methods seem to be promising.                                                            methods need to be developed in order to reach a wider audience
                                                                                         for this kind of software product [18].
KEYWORDS                                                                                     Our research work proposes to make this process much
                                                                                         smoother and more comfortable for the users looking for accurate job
e-recruitment, data analytics, random forests, support vector
                                                                                         recommendations. In fact, our methods aim to automatically
machines
                                                                                         identify the criteria on which potential candidates evaluate the
                                                                                         acceptance of a given job offer. Additionally, our research aims to
1    INTRODUCTION                                                                        improve the perceived quality of recommendations as feedback
Today, the job market is becoming more and more dynamic. In                              is received from users. Therefore, in view of the aforementioned
fact, this is one of the major reasons for an increasing demand for                      issues, we propose here a novel approach for the accurate rec-
better methods for publishing or finding interesting job offers.                         ommendation of job offers using two well-known methods from
Moreover, this interest is bidirectional [13], which means that it                       the data analytics field that can have great performance in this
stems not only from Human Resources (HR) departments in com-                             context. In fact, the major contributions of this ongoing work
panies, intermediaries or manufacturers of recruiting software,                          can be summarized as follows:
but also from job seekers looking to face new professional
challenges. This means that, as a first step, it is assumed that                             • We propose a novel mechanism to automatically recom-
a preliminary reduction of the most promising applicants and                                   mend job offers based on Random Forests in an accurate
job offers can lead to considerable improvements and savings                                   way.
(in terms of money, time and effort) for both parties [10]. In this                          • We propose an alternative mechanism to automatically
context, job portals and online recruitment platforms have been                                recommend job offers based on the computation of Sup-
traditionally designed in order to help job providers and job seek-                            port Vector Machines.
ers to easily find suitable candidates and job offers respectively.                          • We perform an empirical evaluation of our two proposed
   At present, many job portals and web-based recruitment sys-                                 methods with real data concerning recruitment from one
tems offer their services around the world. However, there is                                  of our partners.
a great corpus of literature suggesting that the functionality of
the existing portals could be improved [3, 7, 14–16, 19, 23]. As
a general case, only references to online job advertisements are                            The remainder of this work is organized in the following way:
managed, which are then classified using a simple textual de-                            Section 2 reports the state-of-the-art on existing methods and
scription or core attributes. This means that there are serious                          tools for the automatic recommendation of job offers. Section 3
obstacles to satisfactory support, at least on the side of job                           presents the problem that we are addressing within the frame
seekers who are forced to browse through the list of available                           of this work. Section 4 describes our two methods to address that
job offers to find what best fits their needs and interests.                             problem; these two methods are based on Random Forests and
   In order to allow job seekers to efficiently find what they                           Support Vector Machines, respectively. Section 5 reports the em-
are looking for, the research community has been working on a                            pirical evaluation of our methods. Section 6 outlines the analysis
                                                                                         of the results that we have achieved from our empirical evalua-
© 2018 Copyright held by the owner/author(s). Published in the Workshop
Proceedings of the EDBT/ICDT 2018 Joint Conference (March 26, 2018, Vienna,
                                                                                         tion. Finally, we remark the conclusions and the future lines of
Austria) on CEUR-WS.org (ISSN 1613-0073). Distribution of this paper is permitted        research.
under the terms of the Creative Commons license CC-by-nc-nd 4.0.


2   BACKGROUND                                                                When using relational databases, job offers with descriptive
For many years, information systems for human resources (a.k.a.            attributes such as job title, location, company, required skills,
Human Resources Management Systems or simply HRMs) have                    etc. and the URL of the job advertisement are stored in relations,
been mainly restricted to tracking applicants’ data through                and access is provided by means of database queries in standard
applicant management systems [11]. However, through an in-                 languages such as SQL [21]. Consequently, only those vacancies
creasing differentiation of labor and business worlds, the process         matching exactly the given search criteria can be found [17].
of finding the right person for a job opening and vice versa is            When using IR methods, the full text search is alternatively sup-
becoming increasingly complex. It is clear that upcoming social media      ported by keywords whereby standard search engines can be
channels in addition to an overwhelming number of job portals              integrated. Both procedures can be used in a similar way when
require new strategies and technologies for both recruiters and            searching for offers. However, IR-based methods allow exploiting
job seekers [9].                                                           semantic similarity in keywords, but this is only supported to a
                                                                           limited extent by standard search engines. On the other hand,
2.1 Use Cases                                                              these approaches generate ordered lists of URLs, where users
                                                                           have a proven tendency to view only the highest ranking results.
Solutions for the automatic recommendation of job offers are
                                                                              For these reasons, and regardless of the way in which job offers
currently of great interest for a number of organizations that
                                                                           are handled and processed, the task of recommending the right
wish to automate their e-recruitment processes. Among the
                                                                           offer to the right user has always been an important task [12].
most important ones, we can mention HR departments, market
                                                                           In this way, the research community is working to find ways
intermediaries, electronic job platforms and portals, or software
                                                                           to make this recommendation fully satisfactory to all parties
manufacturers. We offer here a closer look at each of them.
                                                                           involved in the process.
   2.1.1 HR departments. The Human Resources (HR) depart-
ments in companies have to face problems of this kind daily.               2.3     Existing Methods
Currently, the HR departments of large companies receive lots
                                                                           Techniques for automatic recommendation of job offers are specif-
of incoming e-mail applications. All the application documents
                                                                           ically designed to address the problem of information overload
have to be manually processed, so that the relevant information
                                                                           by giving priority to information delivery for individual users
extracted can be transferred into the internal recruiting systems.
                                                                           based on their learned preferences [1].
This process is very time consuming and consumes a lot of resources
                                                                              The most common way to process this information nowadays con-
(time, money, effort). For this reason, only the data from proper
                                                                           sists of automatically processing the documents involved in the
candidates should be transferred into the system.
                                                                           e-recruitment process. For each document, it is possible to extract
   2.1.2 Market intermediaries. HR Recruiters and headhunters              a vector for each of its fields (which contain textual information)
usually receive the order of finding the most suitable candidate           using the bag-of-words model and TF-IDF as weighting function.
for a specific job description. The challenge is so complex that           Then, some kind of method for set comparison can yield results
many companies are willing to pay big sums for successfully                on the suitability of a given candidate for a specific job offer.
completing this task. Solutions for job recommendation can help               In general, most methods try to exploit solutions based on
to alleviate this problem, so that it can be performed much more           the Vector Space Model (VSM) to measure the similarity ratio
efficiently and effectively.                                               between the original job offer and the application received. It is
                                                                           a solution easy to implement, with very low computational costs,
   2.1.3 Electronic job platforms and portals. The segment of              and that traditionally has achieved very good results in the context
electronic job platforms and/or portals is subject to a strong             of job recommendation. However, new trends bet on the use of
competition. To survive in this highly competitive market, these           machine learning technology in order to overcome the traditional
operators continually provide their customers with new and additional      limitations concerning the incapability of going further beyond
services. With the envisaged research results in the field of auto-        the syntactical representation of the documents.
matic job recommendation, portal operators can increase their
level of innovation and therefore generate additional competitive
advantages for their customers.
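
The vector-based comparison described in Section 2.3 (bag-of-words vectors
weighted with TF-IDF and compared within the Vector Space Model) can be
illustrated with a minimal sketch. The function names, the whitespace
tokenizer, the smoothed IDF variant and the example texts below are all
assumptions made for illustration; they are not taken from the systems
surveyed in this paper.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency
            # (one common variant among several).
            term: (count / len(tokens)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical job offer and two hypothetical applications.
offer = "python developer machine learning"
applications = [
    "python developer with machine learning experience",
    "sales manager retail experience",
]
vectors = tf_idf_vectors([offer] + applications)
scores = [cosine_similarity(vectors[0], v) for v in vectors[1:]]
```

Here the first application shares four terms with the offer and therefore
scores higher than the second one, which shares none. This also shows the
limitation mentioned above: the comparison is purely syntactic, so synonyms
do not contribute to the score.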
                                                                           3     PROBLEM STATEMENT
                                                                           The problem that we address within the frame of this work is be-
   2.1.4 Manufacturers of recruiting software. It is also neces-           ing able to automatically recommend job offers to the appropriate
sary to mention the manufacturers of recruiting software, since            candidates. We are given past solved cases
this group is constantly striving to expand its software solu-
tion with additional and innovative modules to                                (x_i, y_i), \quad x_i \in \mathbb{R}^d, \; y_i \in \{-1, 1\}.
increase customer satisfaction and generate additional revenue.
For this reason, software manufacturers of recruiting solutions            We want a classifier such that
are potential beneficiaries of results leading to a satisfactory
job recommendation.                                                           g(x) = \mathrm{sign}(\phi(w) \cdot \phi(x) + b),                (1)

2.2 Existing Recommendation Engines                                        where
Existing job portals are mainly based on either the use of rela-              \phi(w) \cdot \phi(x) = K(w, x).                  (2)
tional databases or well-known methods from the area of infor-
mation retrieval (IR). A major difference between them is that                The key here is being able to evaluate the performance of the
relational systems are only able to work with job offers that are          proposed method in relation to the past solved cases that are used
already stored in the databases, while IR-based approaches may             to feed the algorithm in each iteration to readjust the internal
allow global searches over the Web or social networks.                     parameters.
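
The classifier of Equations (1) and (2) can be sketched in a few lines. This
is only an illustration under stated assumptions: the two kernels, the
hand-picked values of w and b, the toy "past solved cases" and the convention
that a score of exactly zero maps to +1 are all hypothetical, since the text
does not fix any of them.

```python
import math


def linear_kernel(w, x):
    """K(w, x) as a plain dot product (phi is the identity map)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def rbf_kernel(w, x, gamma=0.5):
    """K(w, x) as a Gaussian kernel; gamma is an assumed parameter."""
    return math.exp(-gamma * sum((wi - xi) ** 2 for wi, xi in zip(w, x)))


def classify(x, w, b, kernel=linear_kernel):
    """g(x) = sign(K(w, x) + b); a score of exactly 0 is mapped to +1 here."""
    return 1 if kernel(w, x) + b >= 0 else -1


def accuracy(cases, w, b, kernel=linear_kernel):
    """Fraction of past solved cases (x_i, y_i) that g classifies correctly."""
    hits = sum(classify(x, w, b, kernel) == y for x, y in cases)
    return hits / len(cases)


# Hypothetical past solved cases with d = 2 and labels in {-1, 1}.
past_cases = [([2.0, 1.0], 1), ([3.0, 0.5], 1), ([1.0, 3.0], -1), ([0.5, 2.0], -1)]
w, b = [1.0, -1.0], 0.0  # hand-picked for the example, not learned
```

In a real setting w and b would be learned from the past solved cases, and
the accuracy function corresponds to the evaluation against those cases that
is mentioned at the end of Section 3.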


[Figure 1 diagram: a Random Forest (RF) made of decision trees T1, T2, T3, ..., Tn, whose leaves each vote Y or N]
Figure 1: Example of Random Forest bagging N decision trees. Each decision tree gives a vote for a given class.
Then, the random forest chooses the classification having the most votes.

[Figure 2 plot: axes x and y; the class of relevant job offers and the class of non relevant job offers,
separated by the hyperplane that best segregates the two classes]


4     METHODS
In order to improve the accuracy of the predictions, great research           Figure 2: Example of 2-dimensional Support Vector Ma-
efforts have been made in the last few years concerning the                   chine. The method consists of looking for the hyperplane
definition of methods for combining a number of simple methods.               that maximizes the separation between the two given
These methods construct a set of hypotheses (a.k.a. ensemble),                classes
and combine the predictions of the ensemble in some way to
classify new data. The precision obtained by this combination of
hypotheses is usually better than the precision of each individual                  • RF, in general, can be easily extended to support multiple
component. One of the most popular methods in this context is                         classes
the random forest.                                                                  • RF is based on probabilistic principles
   On the other hand, algorithms based on n-dimensional geom-
etry that learn from a set of past solved cases are also                      4.2     Support Vector Machines
gaining popularity. In this way, it is possible to label the classes          Support Vector Machines (SVM) is a state-of-the-art classification
and train the algorithm to build a geometric model that correctly             method that separates data samples using the geometric notion
classifies a new sample. We give a deeper insight into these two              of hyperplane. The concept behind SVM is very intuitive and
methods below.                                                                easy to understand: If we have data samples that have al-
                                                                              ready been classified, SVM can be used to generate multiple separation
4.1        Random Forests                                                     hyperplanes so that the data samples already classified can be
The first method that we envision in this research work is the                divided into segments.
Random Forest (RF). The rationale behind RF is to work with a                     The idea is that each of these segments contains only one class.
given number of decision trees at the same time. Each tree gives              The SVM technique is generally useful and very accurate in
a vote for a given class. This process is iterated by all trees. Then,        scenarios involving some kind of classification. The reason is
the RF indicates the results having the most votes.                           that SVM is designed to minimize the classification error and
    One of the advantages of using RFs is that, in most situations,           maximize the geometric margin.
this method is able to avoid overfitting of the training set, which               From all the classifiers which are able to correctly classify the
is not always possible by using other machine learning tech-                  past samples, we are just interested in picking the one with the
niques. Figure 1 shows us an example of RF. Please note that, in              largest margin. Figure 2 shows us the rationale behind SVM with an
order to work in a correct way, each decision tree has to be                  example that represents a space of two dimensions. The aim here
built following these steps:                                                  is to find the hyperplane that best segregates the class of relevant
    (1) Let N be the number of test cases and M the number of variables       job offers from the class of non relevant job offers. When a new
        in the classifier.                                                    instance is added, then this hyperplane has to be recalculated in
    (2) The number of input variables to be used to determine the             order to facilitate future classifications.
        decision on a node is m; moreover, m must always be smaller               SVM has demonstrated a great performance in a number of
        than M                                                                scenarios involving some kind of classification of data samples
    (3) Select a training set for this tree and use the remaining             in the past. We also think that SVM offers several advantages in
        test cases to calculate the error.                                    the context of automatic recommendation of job offers. These
    (4) For each node of the decision tree, randomly select m                 advantages are the following:
        variables on which to base the decision. Calculate the best                 • SVM has a regularization mechanism which allows avoid-
        distribution of the training set from the variable m.                          ing over-fitting (a.k.a. geometric margin)
    We think that the main advantages of using RF in this context                   • SVM is defined by an optimization problem for which there
can be summarized as follows:                                                          are a number of existing efficient solutions
      • In general, RF has only one parameter to configure, the                     • SVM provides an approximation to a bound on the test
        number of trees in the RF                                                      error, which makes it very robust
      • Unlike black-box models, the results obtained by RF are                   SVM also has an additional advantage, which consists of using ker-
        easier to interpret                                                   nels, so that it is possible to add expert knowledge about the


Table 1: Average values and standard deviations for the numerical attributes of our data set

           Attribute         Average     Std. Deviation
           Workers           5069.5      9195.2
           Inhabitants       361547.5    642882.9
           Distance          36.3        37.4
           Salary            52437.5     13717.4
           Working time      38.8        3.5

[Figure 3 bar chart: Degree of success (axis from 0 to 100). SVM: 100, RF(B): 87.5, Baseline: 62.5]
Figure 3: Results obtained for the experiment that generates a salary driven profile
problem. This aspect is out of the scope of the present work, but it
could be quite interesting to address it as part of our future work.
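
The tree-building steps and the voting scheme of Section 4.1 can be sketched
as follows. This is a deliberately simplified illustration and not the
implementation evaluated in this paper: each "tree" is reduced to a single
split (a decision stump), the error estimation on the remaining cases (step 3)
is omitted, and the function names, the toy data and the parameter values are
hypothetical.

```python
import random
from collections import Counter


def best_stump(rows, labels, features):
    """Among the given features, pick the split with the fewest training errors."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def train_forest(rows, labels, n_trees=15, m=1, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and m random variables."""
    n_cases, n_variables = len(rows), len(rows[0])
    if not 0 < m < n_variables:
        raise ValueError("m must be positive and smaller than the number of variables M")
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Select a training set for this tree (sampling with replacement).
        sample = [rng.randrange(n_cases) for _ in range(n_cases)]
        # Randomly select the m variables on which to base the decision.
        features = rng.sample(range(n_variables), m)
        forest.append(best_stump([rows[i] for i in sample],
                                 [labels[i] for i in sample], features))
    return forest


def predict(forest, row):
    """Each tree gives a vote; the forest returns the class with the most votes."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


# Hypothetical offers described by (yearly salary, distance to home).
rows = [(70000, 5), (65000, 8), (80000, 3), (62000, 10),
        (40000, 40), (45000, 35), (38000, 50), (50000, 30)]
labels = ["Y", "Y", "Y", "Y", "N", "N", "N", "N"]
forest = train_forest(rows, labels, n_trees=15, m=1, seed=0)
```

With this toy data, predict(forest, (90000, 2)) is expected to vote for "Y"
and predict(forest, (30000, 60)) for "N". Using an odd number of trees avoids
ties when there are only two classes.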

5    RESULTS

We report here the results from our experiments in the field of
automatic job recommendation. We have worked with a data set
of 40 job offers that have been evaluated on the basis of templates or
profiles. A template or profile is a pre-defined pattern that shows
interest in job offers that meet certain conditions.
    The sample set we are working with is not too large (mainly
due to the cost of acquiring data in this context) but it can give
us a good starting point to test the accuracy of these methods for
solving the problem we are facing.
    Before each execution, our complete data set is randomly di-
vided into a training set (80% of samples) and a test set (20% of samples).
The former is intended to train both RF and SVM, and the latter
is intended to verify the accuracy of the method.
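
The random 80/20 division described above can be sketched as follows. The
function names, the seed and the placeholder identifiers for the offers are
assumptions made for illustration only.

```python
import random


def split_train_test(samples, test_fraction=0.2, seed=None):
    """Shuffle the samples and return (training set, test set)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, actual):
    """Fraction of test samples whose predicted label matches the actual one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Placeholder identifiers standing in for the 40 job offers.
offers = list(range(40))
train, test = split_train_test(offers, test_fraction=0.2, seed=42)
```

With 40 offers this yields 32 training samples and 8 test samples, so each
test sample accounts for 12.5 percentage points of the measured accuracy.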
                                                                             [Figure 4 plot: Score (0 to 0.8) versus Number of Decision Trees (5 to 30)]
    It is also important to mention that the attributes for each job
offer are the following:
                                                                             Figure 4: Evolution of the performance as more decision
    • Company name                                                           trees are considered in the case of a salary driven profile
    • Position title
    • City
    • Distance to home                                                       attributes for each of the offers that the potential candidate liked
    • Working hours                                                          in the past. Then, we compare new offers with the ’average’ one,
    • Yearly salary before taxes                                             and we decide if it is similar or not based on the number of similar
    • Are your potentially interested (Y/N)? (to be predicted)               attributes, i.e. attributes closer to the average.
   Table 1 shows us the average values for the attributes and its
corresponding standard deviations (the amount of variation or                5.2      Salary driven profile
dispersion of the values)                                                    The first case we are going to study is the profile of a person who
   Moreover, the most repeated Position Title is programmer, al-             is willing to be interested in job offers with very high salaries.
though other occupations that appear in the data set are analyst,            Figure 3 shows us the results. Please note that for the RF, we
researcher, desk support or developer. The attribute to be pre-              pick the best result since this result can vary depending on the
dicted is dependent of the profile that we are analyzing. And in             number of decision trees that our method is trying to bag, as we
some cases it can be strongly unbalanced (what means that it                 explain later.
will be an an overwhelming majority samples of one class) what                  It is very important to determine the number of decision trees
makes the learning process even more difficult. However, this is             that we are going to work with. To do that, we run several time the
how things work in real e-recruitment scenarios, where users                 algorithm in order to determine what is the appropriate number
click in either just a few or in many potential job offers, so we            of trees to be bagged.
are facing here a realistic situation.                                          From Figure 4 it is possible to see, the more decision trees we
   The results will show us the degree of accuracy that we have              add the better get the results. However, at a certain point the
achieved in each case. In order to identify what is the best strategy        benefit is lower than the cost (in terms of computing time) of
in each of these cases, we propose a baseline method that it does            including additional decision trees.
not involve any kind of learning.
                                                                             5.3      Distance driven profile
5.1 Baseline                                                                 In this case, we are going to study the profile of a person who
In order to compare the results from our methods, we need to                 is willing to be interested in job offers for those companies that
define a baseline method. Since we want to verify the advantages             are located near its current location. Therefore, the template will
of using methods being able to analyze past solved cases, we are             have Yes in job offers with shorter distances and No in job offers
going to choose a baseline method with no learning capabilities.             for positions located further away. However, what in principle
In this case, we are considering to calculate the average of the             seems to be an easy scenario, it is not so easy to solve as we


can see in Figure 5. The reason is that the data set generated by the template is very
unbalanced, which means that only a few offers are located in the surrounding area.

Figure 5: Results obtained for the experiment that generates a distance driven profile
(bar chart of degree of success on a 0 to 100 scale: SVM 75, RF(B) 87.5, Baseline 37.5)

   In Figure 6, we can see once again how the score improvement decreases as the number
of decision trees increases, which means that adding more trees helps only up to a
point.

Figure 6: Evolution of the performance as more decision trees are considered in the case
of a distance driven profile (line plot; x-axis: number of decision trees, y-axis: score)

5.4 Highly paid hour profile
In this experiment, the template chooses those job offers that offer the best hourly
rate, i.e. those where the ratio between salary and working time is most advantageous.
This case is quite interesting because it might allow us to understand how our methods
behave when the user looks for a complex aggregation of attributes. Figure 7 shows the
results for this experiment.

Figure 7: Results obtained for the experiment that generates a highly paid hour profile
(bar chart of degree of success on a 0 to 100 scale: SVM 87.5, RF(B) 62.5, Baseline 62.5)

   In Figure 8, we can see once again how, at some point, the improvement of the results
decreases as the number of decision trees increases.

Figure 8: Evolution of the performance as more decision trees are considered for a
highly paid hour profile (line plot; x-axis: number of decision trees, y-axis: score)

5.5 Big companies located in big cities profile
In this experiment, the template chooses those job offers that are offered by large
companies located in big cities. This case is also interesting because it might allow us
to see how our methods deal with the fact that more than one attribute has an impact on
the user's decision. Figure 9 shows the results of the experiment. As can be seen, it
was not a difficult scenario for any of the methods considered.

Figure 9: Results obtained for the experiment that generates a profile giving importance
to big companies located in big cities (bar chart of degree of success on a 0 to 100
scale: SVM 100, RF(B) 100, Baseline 100)

   For the case of RF, Figure 10 shows the evolution of the score in relation to the
number of decision trees. In this case, the RF remains stable during all the
experiments.

6 DISCUSSION
From the results that we have achieved in our pool of experiments, it is possible to see
that the most important advantages of our approach are:
     • Both RF and SVM are quite accurate learning algorithms in the context of
       automatic job recommendation. For a sufficiently large data set, it is possible
       to build very accurate classifiers. Even for smaller samples like ours, results
       are better than those from methods with no learning capabilities.
     • RF and SVM can both handle many variables without discarding any of them, which
       makes them good candidates to work efficiently at web scale, in large databases
       or with large instances.
     • Last, but not least, RF is able to provide useful insights for understanding the
       interactions between the different variables. On the other hand, SVM operates in
       a less intuitive way but, in exchange, has performed better in most cases.
   However, a complete empirical evaluation over larger data sets should be performed in
order to gain deeper insights into the advantages of these two methods. The reason is
that, as we have seen, it is not always possible to obtain optimal results with small
samples like ours.

Figure 10: Evolution of the performance as more decision trees are considered for the
profile Big companies located in big cities (line plot; x-axis: number of decision
trees, y-axis: score)

7 CONCLUSIONS AND FUTURE WORK
In this work, we have presented our proposal for the automatic recommendation of job
offers. Our goal is to build methods that are able to deliver appropriate job offers to
those job seekers who could potentially be interested in them. To do that, we have based
our research efforts on two well-known classification methods: random forests (RF) and
support vector machines (SVM).
   Our empirical evaluation shows interesting facts. For example, RF are easier to
interpret, although they do not present a particularly good performance in relation to
SVM. On the other hand, SVM are more accurate, although they work with a model that is
much harder for humans to interpret. What is clear is that, in both cases, we have shown
that these two methods are quite appropriate for working accurately in the context of
automatic job recommendation.
   As future work, we propose to design novel computational methods that are able to
process the textual description of the job offers. So far, we have used just the
quantitative information that is advertised. However, we think that the way an offer is
written can help to attract potential candidates as well; new methods for natural
language processing using neural networks might help in this task. We would also like to
explore the possibilities of working with expert knowledge via kernel mapping in the
case of SVM, as we mentioned earlier. Finally, it is also necessary to study how to
integrate this technology with existing web information systems so that these two
methods can be put into operation by the industry.

ACKNOWLEDGMENTS
We would like to thank the anonymous reviewers for their useful suggestions to improve
this work. The research reported in this paper has been supported by the Austrian
Ministry for Transport, Innovation and Technology, the Federal Ministry of Science,
Research and Economy, and the Province of Upper Austria in the frame of the COMET center
SCCH.

REFERENCES
 [1] Fabian Abel, Andras A. Benczur, Daniel Kohlsdorf, Martha Larson, Robert Palovics:
     Proceedings of the 2016 Recommender Systems Challenge, RecSys Challenge 2016,
     Boston, Massachusetts, USA, September 15, 2016. ACM 2016.
 [2] Daniel Bernardes, Mamadou Diaby, Raphael Fournier, Francoise Fogelman-Soulie,
     Emmanuel Viennet: A Social Formalism and Survey for Recommender Systems. SIGKDD
     Explorations 16(2): 20-37 (2014).
 [3] Stefan Buschner, Rafael Schirru, Hanna Zieschang, Peter Junker: Providing
     recommendations for horizontal career change. I-KNOW 2014: 33:1-33:4.
 [4] Mamadou Diaby, Emmanuel Viennet, Tristan Launay: Toward the next generation of
     recruitment tools: an online social network-based job recommender system. ASONAM
     2013: 821-828.
 [5] Mamadou Diaby, Emmanuel Viennet, Tristan Launay: Exploration of methodologies to
     improve job recommender systems on social networks. Social Netw. Analys. Mining
     4(1): 227 (2014).
 [6] Frank Faerber, Tim Weitzel, Tobias Keim: An Automated Recommendation Approach to
     Selection in Personnel Recruitment. AMCIS 2003: 302.
 [7] Evanthia Faliagka, Lazaros S. Iliadis, Ioannis Karydis, Maria Rigou, Spyros
     Sioutas, Athanasios K. Tsakalidis, Giannis Tzimas: On-line consistent ranking on
     e-recruitment: seeking the truth behind a well-formed CV. Artif. Intell. Rev.
     42(3): 515-528 (2014).
 [8] Evanthia Faliagka, Athanasios K. Tsakalidis, Giannis Tzimas: An Integrated
     E-Recruitment System for Automated Personality Mining and Applicant Ranking.
     Internet Research 22(5): 551-568 (2012).
 [9] Tobias Keim: Extending the Applicability of Recommender Systems: A Multilayer
     Framework for Matching Human Resources. HICSS 2007: 169.
[10] Stefan Lang, Sven Laumer, Christian Maier, Andreas Eckhardt: Drivers, challenges
     and consequences of E-recruiting: a literature review. CPR 2011: 26-35.
[11] Sven Laumer, Andreas Eckhardt: Help to find the needle in a haystack: integrating
     recommender systems in an IT supported staff recruitment system. CPR 2009: 7-12.
[12] Jochen Malinowski, Tim Weitzel, Tobias Keim: Decision support for team staffing:
     An automated relational recommendation approach. Decision Support Systems 45(3):
     429-447 (2008).
[13] Jochen Malinowski, Tobias Keim, Oliver Wendt, Tim Weitzel: Matching People and
     Jobs: A Bilateral Recommendation Approach. HICSS 2006.
[14] Jorge Martinez Gil: An Overview of Knowledge Management Techniques for
     e-Recruitment. JIKM 13(2) (2014).
[15] Jorge Martinez Gil, Alejandra Lorena Paoletti, Klaus-Dieter Schewe: A Smart
     Approach for Matching, Learning and Querying Information from the Human Resources
     Domain. ADBIS (Short Papers and Workshops) 2016: 157-167.
[16] Alejandra Lorena Paoletti, Jorge Martinez Gil, Klaus-Dieter Schewe: Extending
     Knowledge-Based Profile Matching in the Human Resources Domain. DEXA (2) 2015:
     21-35.
[17] Alejandra Lorena Paoletti, Jorge Martinez Gil, Klaus-Dieter Schewe: Top-k Matching
     Queries for Filter-Based Profile Matching in Knowledge Bases. DEXA (2) 2016:
     295-302.
[18] Ioannis K. Paparrizos, Berkant Barla Cambazoglu, Aristides Gionis: Machine learned
     job recommendation. RecSys 2011: 325-328.
[19] Gabor Racz, Attila Sali, Klaus-Dieter Schewe: Semantic Matching Strategies for Job
     Recruitment: A Comparison of New and Known Approaches. FoIKS 2016: 149-168.
[20] Amit Singh, Rose Catherine, Karthik Visweswariah, Vijil Chenthamarakshan,
     Nandakishore Kambhatla: PROSPECT: a system for screening candidates for
     recruitment. CIKM 2010: 659-668.
[21] Eufemia Tinelli, Simona Colucci, Francesco M. Donini, Eugenio Di Sciascio, Silvia
     Giannini: Embedding semantics in human resources management automation via SQL.
     Appl. Intell. 46(4): 952-982 (2017).
[22] Eufemia Tinelli, Simona Colucci, Silvia Giannini, Eugenio Di Sciascio, Francesco
     M. Donini: Large Scale Skill Matching through Knowledge Compilation. ISMIS 2012:
     192-201.
[23] Xing Yi, James Allan, W. Bruce Croft: Matching resumes and jobs based on relevance
     models. SIGIR 2007: 809-810.

</pre>