<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Deep Semi-supervised Graph Representation Learning Model for Resume Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Wissem Inoubli</string-name>
          <email>wissem.inoubli@loria.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Armelle Brun</string-name>
          <email>armelle.brun@loria.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Keep In Touch</institution>
          ,
          <addr-line>Strasbourg</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Lorraine</institution>
          ,
          <addr-line>CNRS, LORIA</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <fpage>18</fpage>
      <lpage>23</lpage>
      <abstract>
        <p>The main goal of job seekers is to identify job offers that match their profile. The same holds for human resource departments, which aim to identify candidates, through their resumes, that match the recruiter's expectations. However, the number of job seekers and job offers is so large that neither human resource employees nor job seekers are able to go through all the resumes and offers manually. Recommender systems have emerged in recent years with the goal of recommending job offers to job seekers and resumes to human resource departments. One of the approaches adopted in the literature relies on identifying the content elements in offers and resumes that contribute to the matching. We propose to represent data in the form of graphs and to approach this problem as a classification problem. We present DGL4C, a semi-supervised deep graph learning model that learns an adequate representation from a graph and trains a classifier on this latent representation. Experiments are carried out on an open dataset of anonymous resumes. Results show that DGL4C significantly improves on the precision and accuracy of traditional deep learning models such as sBERT, and confirm the pertinence of relying on a graph structure for the classification task in the HR domain.</p>
      </abstract>
      <kwd-group>
        <kwd>Classification</kwd>
        <kwd>graph representation learning</kwd>
        <kwd>deep learning</kwd>
        <kwd>semi-supervised learning</kwd>
        <kwd>resume classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Recruitment is the process of matching job offers with resumes. It is performed by both job seekers and human resource (HR) departments, through the use of Applicant Tracking Systems (ATS), e.g. JobSCAN<sup>1</sup>.</p>
      <p>Due to the huge number of resumes and offers, matching cannot be performed manually anymore. Information retrieval (IR) algorithms have traditionally been used to perform this task. They combine feature extraction techniques and a retrieval model (e.g. the standard boolean model). For example, the goal in [1] is to find relevant resumes. Recommender systems are designed to recommend either the resumes that match a given job offer to HR, or the offers relevant to his/her profile to a job seeker [2].</p>
      <p>RecSys in HR’22: The 2nd Workshop on Recommender Systems for Human Resources, in conjunction with the 16th ACM Conference on</p>
      <p>ORCID: 0000-0001-5121-9043 (W. Inoubli); 0000-0002-9876-6906 (A. Brun)</p>
      <p>Although job offers can easily be collected to form a dataset, resume collection is a trickier task, due to privacy issues. Such datasets are generally collected by private firms and are not freely available.</p>
      <sec id="sec-1-1">
        <p>Worse, very few datasets contain the ground-truth matching between offers and resumes. Thus, the content-based recommendation approach, which identifies the content elements of resumes and offers that contribute to the matching, is one of the approaches adopted to perform it.</p>
        <p>Besides, due to this lack of ground truth, the evaluation of recommendation models remains a challenging task. To cope with this limit, we propose to perform this matching by using higher-level information. For example, the occupation of a job offer, e.g. computer scientist, can be used to perform this matching. We propose to view the identification of this higher-level information as a classification problem, as proposed by [<xref ref-type="bibr" rid="ref2">3</xref>]. The challenge here is thus to learn a classifier dedicated to resumes or job offers, i.e. to unstructured plain texts.</p>
        <p>Text classification traditionally takes place in two steps. First, the texts are pre-processed to extract features. For example, TF-IDF (term frequency-inverse document frequency), LDA (Latent Dirichlet Allocation) [<xref ref-type="bibr" rid="ref1">4</xref>], and Word2vec [5] are traditional models for feature extraction and text representation. Second, classification is performed by supervised machine learning algorithms that exploit such representations. The literature has shown that the performance of classifiers is highly dependent on the quality of the representation [6].</p>
        <p>Deep learning, which performs feature extraction and classification in a unique step, has also been studied. Many neural network variants were studied, such as long short-term memory (LSTM) [7], which is based on a recurrent neural network (RNN) architecture [8], or convolutional neural networks (CNN) [9]. Deep learning has proven to perform better than classical machine learning approaches. However, in the context of HR, deep learning still suffers from high error rates and low classification accuracy, especially for resume classification [1]. This can probably be explained by the limited size of the training datasets used [10].</p>
        <p>Besides, graph structures have traditionally been adopted to manage rich data. Recently, deep graph learning models [<xref ref-type="bibr" rid="ref2">3</xref>], which make it possible to learn a non-Euclidean space of data, have emerged. Surprisingly, they have not been studied in the HR context, especially for resume classification.</p>
        <p>Thus, in this work, we propose DGL4C, which stands for Deep Graph Representation Learning for Classification, a new classification model based on deep graph learning. DGL4C is a semi-supervised model, designed for resume classification, that manages both labeled (resumes) and unlabeled (elements of resumes) data. Concretely, we propose two variants of DGL4C. DGL4C-GCN, Deep Graph Representation Learning for Classification with a Graph Convolution Network, is an end-to-end model that learns all the stages between the initial input phase and the final output result (resume classification). DGL4C-GRL is made up of two stages: (i) text (resume) representation made by a GCN architecture, and (ii) a machine learning-based classifier.</p>
        <p>The remainder of this paper is organized as follows. Section 2 introduces the literature related to resume classification. In Section 3, we introduce the two variants of DGL4C: DGL4C-GCN and DGL4C-GRL. Then, in Section 4, experimental results are described and analysed. Last, in Section 5, we conclude and propose perspectives.</p>
        <p><sup>1</sup> https://www.jobscan.co/applicant-tracking-systems</p>
        <p>CEUR Workshop Proceedings (http://ceur-ws.org), ISSN 1613-0073. © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>In this section, we briefly review some works related to resume classification in the area of human resources. The literature has proposed several approaches, which can be divided into three categories: (i) ontology-based models, (ii) machine learning models and (iii) deep learning models.</p>
      <p>Let us first consider the ontology-based models. After a feature extraction step, ontology-based models use ontologies to perform classification. An ontology is a conceptual meta-model that represents domain knowledge [11]. Few international and national knowledge bases in HR are released. The most famous ones are DISCO<sup>2</sup>, ISCO<sup>3</sup> and ESCO<sup>4</sup> [12]. Those knowledge bases represent occupational groups at various granularity levels. An occupation is defined as a set of jobs whose main tasks and duties are characterized by a high degree of similarity [12], and a skill is defined as the ability to apply knowledge and know-how to complete tasks and solve problems. For example, the ESCO standard contains 13,485 skills related to 2,942 occupations, sorted on four granularity levels. In [12, 13], the authors use the Textkernel Extract parser, an industrial tool<sup>5</sup>, to extract skills from resumes. The ESCO base is then used to build a classifier based on the matching of the skills defined by ESCO with the skills extracted from the resumes.</p>
      <p>Machine learning models, which are widely used, rely on training data and also require a pre-processing step dedicated to feature extraction. Machine learning models such as Random Forests, Decision Trees and Support Vector Machines have shown high efficiency and performance on the resume classification task [14]. In such models, the accuracy of the feature extraction step strongly impacts the classification performance.</p>
      <p>Unlike ontology-based and machine learning models, deep learning models consider both feature extraction and classification in a unique step, which reduces the possible loss of information in the feature extraction step. Deep learning models are highly popular and have shown a significant improvement in performance. Several works have been proposed for job classification [9, 2, 13, 15], where both 1-D convolutional neural network (CNN) and recurrent neural network (RNN) architectures were adapted to the HR context.</p>
      <p>Graph representation learning techniques have recently emerged and are used in many applications. Graphs are a traditional way to represent data, but models that rely on such a representation suffer from data sparsity and a lack of robustness to noise, which decreases the performance of predictive models. To overcome those limits, graph representation learning has been designed to represent data in a low-dimensional space, and has shown its efficiency for unstructured data such as images, texts and graphs. Graph representation learning can be categorized into three families: matrix factorization based models [16, 17], random-walk based models [18, 19] and neural network based models [20, 21]. The latter are neural networks that are used to learn node embeddings by aggregating information from neighboring nodes through edges. Neighborhood aggregation consists in forwarding and receiving back data between nodes, throughout their neighborhood. In a GNN, a node has an unlimited number of direct neighbors, whereas in the other neural network architectures the number of direct neighbors is limited (e.g. two for RNN architectures and eight in the case of 2D-CNN architectures). This unlimited number of direct neighbors has shown its ability to encode both structural and semantic (node features) information, which makes it successful. This neighborhood information leads to a neural network that learns better than other architectures, which is confirmed by the work of Ding Yao et al. [22] in the case of hyperspectral image classification.</p>
      <p>Graph Neural Network (GNN) architectures are receiving growing attention [20] and many models are proposed in the literature. Starting from the initial Graph Convolution Networks (GCN), GraphSage [23] was then proposed to overcome the scalability issue of GCN, by changing the convolution method. In the same context, an attention mechanism was proposed by the model called GAT [24].</p>
      <p>To the best of our knowledge, the GNN architecture has not been studied for modeling resumes or job offers, and we assume that it could be of interest.</p>
      <p><sup>2</sup> European Dictionary of Skills and Competences. <sup>3</sup> International Standard Classification of Occupations. <sup>4</sup> European Skills, Competences, Qualifications and Occupations. <sup>5</sup> https://www.textkernel.com/</p>
    </sec>
    <sec id="sec-3">
      <title>3. DGL4C: a GNN-based Architecture for Classification</title>
      <p>Considering that data representation is a core step of classification models, we propose a new approach for resume representation, based on both a graph structure and semantic information. Concretely, we propose a semi-supervised graph representation learning model, based on a GNN architecture. This representation is used in the DGL4C classification model that we design.</p>
      <p>As previously mentioned, the main motivation for choosing graph-based representation learning, and especially a GNN architecture, comes from the neighborhood information (neighboring resumes) taken into consideration during the training step.</p>
      <p>As deep learning requires a lot of training data to be effective, and since resume datasets are generally small, we adopt a semi-supervised learning algorithm, which takes both labeled and unlabeled examples [21]; the training process therefore runs on both types of examples. DGL4C aims to form a high-quality latent representation of resumes, further used in the classification phase. In the following subsections, we present the way we propose to construct a graph of resumes, then the core idea of graph representation learning, and the way we develop DGL4C, designed to encode a dataset of resumes into a latent vector space.</p>
      <sec id="sec-3-1">
        <title>3.1. Graph Construction</title>
        <p>Before learning the graph representation, a first phase consists in constructing the graph of resumes. We take inspiration from the recent work conducted in [<xref ref-type="bibr" rid="ref2">3</xref>]. Let <italic>D</italic> = (<italic>R</italic>, <italic>L</italic>) denote a dataset. <italic>R</italic> is the set of resumes; <italic>r<sub>i</sub></italic> ∈ <italic>R</italic> is a resume, with <italic>r<sub>i</sub></italic> = {<italic>w</italic><sub>1</sub>, <italic>w</italic><sub>2</sub>, ..., <italic>w<sub>n</sub></italic>} the set of words of the resume. <italic>L</italic> is the set of labels, where <italic>l<sub>i</sub></italic> ∈ <italic>L</italic> is the label of resume <italic>r<sub>i</sub></italic> and |<italic>L</italic>| is the number of labels (classes). <italic>W</italic> = ⋃ <italic>r<sub>i</sub></italic> is the vocabulary of <italic>R</italic>, i.e. the set of distinct words in <italic>R</italic>.</p>
        <p>Definition of an R-graph. Let <italic>G</italic> be a heterogeneous, attributed and unweighted graph built from <italic>R</italic>, the set of resumes. <italic>G</italic> = (<italic>V</italic>, <italic>E</italic>), where <italic>V</italic> and <italic>E</italic> represent the nodes and edges of <italic>G</italic> respectively. The set of nodes <italic>V</italic> = <italic>R</italic> ∪ <italic>W</italic> is made up of the union of the set of resumes and the set of unique words of the resume dataset. As a consequence, <italic>G</italic> is made up of two types of nodes: resume nodes and word nodes. The set of edges <italic>E</italic> is also divided into two types of edges: word-to-resume edges and word-to-word edges.</p>
        <p>An edge between two nodes exists if the similarity between those nodes is positive, similarly to [<xref ref-type="bibr" rid="ref2">3</xref>]. Recall that the edges are unweighted.</p>
        <p>The way this similarity is evaluated depends on the type of edge. Word-to-resume edges are evaluated by the well-known term frequency-inverse document frequency (TF-IDF), which is computed as follows:
          <disp-formula><label>(1)</label><tex-math>\mathrm{TF\text{-}IDF}(w, r, R) = \mathrm{TF}(w, r) \times \mathrm{IDF}(w, R)</tex-math></disp-formula>
          where TF(<italic>w</italic>, <italic>r</italic>) denotes the number of times the word <italic>w</italic> appears in a resume <italic>r</italic>, and IDF(<italic>w</italic>, <italic>R</italic>) is the inverse document frequency, derived from the number of resumes that contain the word <italic>w</italic>. Word-to-word edges are evaluated by the point-wise mutual information (PMI), as in [<xref ref-type="bibr" rid="ref2">3</xref>]. A positive PMI value means a high semantic correlation of the pair of words in the corpus. A negative value indicates that both words are not semantically close. Therefore, only the edges that are associated with a positive PMI value exist in the graph, in an unweighted form. The PMI value is computed as follows:
          <disp-formula><label>(2)</label><tex-math>\mathrm{PMI}(i, j) = \log \frac{p(i, j)}{p(i)\, p(j)}</tex-math></disp-formula>
          <disp-formula><label>(3)</label><tex-math>p(i, j) = \frac{\#R(i, j)}{\#R}</tex-math></disp-formula>
          <disp-formula><label>(4)</label><tex-math>p(i) = \frac{\#R(i)}{\#R}</tex-math></disp-formula>
          where #<italic>R</italic>(<italic>i</italic>) is the number of resumes that contain the word <italic>w<sub>i</sub></italic>, #<italic>R</italic>(<italic>i</italic>, <italic>j</italic>) is the number of resumes that contain both words <italic>w<sub>i</sub></italic> and <italic>w<sub>j</sub></italic>, and #<italic>R</italic> is the number of resumes in the corpus.</p>
        <p>Equation (5) represents the adjacency matrix <italic>A</italic> of the graph <italic>G</italic>:
          <disp-formula><label>(5)</label><tex-math><![CDATA[A_{i,j} = \begin{cases} 1 & \text{if } i, j \text{ are words and } \mathrm{PMI}(i, j) > 0 \\ 1 & \text{if } i \text{ is a document, } j \text{ is a word and } \mathrm{TF\text{-}IDF}_{i,j} > 0 \\ 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}]]></tex-math></disp-formula>
        </p>
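        <p>As an illustration, the graph construction described in this subsection can be sketched in a few lines of Python. This is a minimal sketch and not the implementation used in the experiments: the function name build_r_graph, the node encoding as ("r", index) and ("w", token) pairs, the pre-tokenised input and the logarithmic IDF variant are all assumptions made for the example.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from itertools import combinations


def build_r_graph(resumes):
    """Build the unweighted adjacency sets of a resume/word graph.

    resumes: list of token lists (hypothetical pre-tokenised input).
    Returns a dict mapping each node to the set of its neighbours.
    Resume nodes are ("r", index); word nodes are ("w", token).
    """
    n_resumes = len(resumes)
    word_sets = [set(tokens) for tokens in resumes]

    # Number of resumes containing each word, i.e. #R(i).
    doc_freq = {}
    for words in word_sets:
        for w in words:
            doc_freq[w] = doc_freq.get(w, 0) + 1

    # Number of resumes containing both words of a pair, i.e. #R(i, j).
    pair_freq = {}
    for words in word_sets:
        for a, b in combinations(sorted(words), 2):
            pair_freq[(a, b)] = pair_freq.get((a, b), 0) + 1

    adjacency = {}

    def connect(u, v):
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Self-loops: the i = j case of the adjacency matrix.
    for idx in range(n_resumes):
        connect(("r", idx), ("r", idx))
    for w in doc_freq:
        connect(("w", w), ("w", w))

    # Word-to-resume edges: kept when TF-IDF is positive.
    for idx, tokens in enumerate(resumes):
        for w in set(tokens):
            tf = tokens.count(w)
            idf = math.log(n_resumes / doc_freq[w])  # assumed IDF variant
            if tf * idf > 0:
                connect(("r", idx), ("w", w))

    # Word-to-word edges: kept when PMI is positive.
    for (a, b), count in pair_freq.items():
        p_ab = count / n_resumes
        p_a = doc_freq[a] / n_resumes
        p_b = doc_freq[b] / n_resumes
        if math.log(p_ab / (p_a * p_b)) > 0:
            connect(("w", a), ("w", b))

    return adjacency
```
]]></preformat>
        <p>With this IDF variant, a word that appears in every resume has a TF-IDF of zero and is therefore not linked to any resume node, which is one possible reading of the positivity condition on word-to-resume edges.</p>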
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Graph Representation Learning</title>
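        <p>After building the graph <italic>G</italic> from the set of resumes <italic>R</italic>, graph representation learning is the second step of the proposed model. Most neural networks share the same universal architecture, namely a set of connected multi-layer perceptrons that operate on the input data. GCN [21] is a convolutional neural network on graphs that performs operations similar to those of a CNN, except that it applies convolution over a graph instead of over a 2-D array. GCN learns a latent representation by propagating information from direct neighbors in the graph and applying a linear transformation. The information propagation procedure consists in aggregating information from the direct and <italic>k</italic>-hop neighborhood. Next, like perceptrons, GCN applies a linear transformation followed by a pointwise non-linearity. By stacking <italic>k</italic> GCN layers, each node aggregates information from nodes up to <italic>k</italic> hops away. The GCN [21] propagation rule is defined as follows:
          <disp-formula><label>(6)</label><tex-math>H^{(l)} = \sigma\left(\hat{A}\, H^{(l-1)}\, W^{(l)}\right), \quad \text{with } H^{(0)} = X</tex-math></disp-formula>
        </p>
        <p>where <inline-formula><tex-math>\hat{A} = \tilde{D}^{-1/2} A \tilde{D}^{-1/2}</tex-math></inline-formula> is the normalized symmetric adjacency matrix (<inline-formula><tex-math>\tilde{D}</tex-math></inline-formula> being the degree matrix of <italic>A</italic>), <italic>H</italic><sup>(<italic>l</italic>−1)</sup> and <italic>H</italic><sup>(<italic>l</italic>)</sup> are the previous and the new hidden state matrices respectively, <italic>X</italic> is the input feature matrix, <italic>W</italic><sup>(<italic>l</italic>)</sup> is a trainable weight matrix for layer <italic>l</italic>, and σ denotes any non-linear activation function (e.g., ReLU). The convolution step in GCN is based on message passing, which is divided into two sub-steps: (i) message gathering and (ii) aggregation. Message gathering consists in getting messages from n-hop neighbors, and aggregation consists in normalizing all messages in order to get an embedding of a node <italic>v</italic>. The message passing form of equation (6) can be written as follows:
          <disp-formula><label>(7)</label><tex-math>m_v^{(l)} = \sum_{u \in \mathcal{N}(v)} \frac{1}{\sqrt{|\mathcal{N}(v)|}\,\sqrt{|\mathcal{N}(u)|}}\, h_u^{(l-1)}</tex-math></disp-formula>
          <disp-formula><label>(8)</label><tex-math>h_v^{(l)} = \sigma\left(W^{(l)} m_v^{(l)} + b^{(l)}\right)</tex-math></disp-formula>
          where <inline-formula><tex-math>\mathcal{N}(v)</tex-math></inline-formula> is the set of neighbors of node <italic>v</italic> and <italic>b</italic><sup>(<italic>l</italic>)</sup> is a bias term.
        </p>
        <p>The loss function is computed over the set of resume indices that have labels, and the dimension of the last layer of the GCN is the number of profiles. As only the resume nodes are used by the loss function, whereas the graph contains both resumes and elements of resumes (words), the training performed by DGL4C is semi-supervised.</p>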
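        <p>To make the two sub-steps of message passing concrete, the following minimal sketch implements one layer in plain Python: messages from neighbors are summed with a symmetric normalization by neighborhood sizes, then a linear transformation, a bias and a ReLU activation are applied. The function name gcn_layer and the dictionary-based data layout are assumptions made for the example; this is not the code of DGL4C, which relies on a deep learning framework.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def gcn_layer(adjacency, features, weights, bias):
    """One message-passing layer with symmetric degree normalisation.

    adjacency: dict node -> list of neighbour nodes (self-loops included).
    features:  dict node -> list of floats (previous hidden state).
    weights:   list of rows, shape (out_dim, in_dim).
    bias:      list of floats, length out_dim.
    Returns a dict node -> list of floats (new hidden state).
    """
    new_features = {}
    for v, neighbours in adjacency.items():
        in_dim = len(features[v])
        # Message gathering: normalised sum of the neighbours' states.
        message = [0.0] * in_dim
        for u in neighbours:
            norm = math.sqrt(len(adjacency[v])) * math.sqrt(len(adjacency[u]))
            for k in range(in_dim):
                message[k] += features[u][k] / norm
        # Update: linear transformation, bias, then ReLU.
        new_features[v] = [
            max(0.0, sum(w_k * m_k for w_k, m_k in zip(row, message)) + b)
            for row, b in zip(weights, bias)
        ]
    return new_features
```
]]></preformat>
        <p>Calling the function twice, feeding the output of the first call as the features of the second, lets each node receive information from nodes up to two hops away.</p>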
        <p>In addition, the experiments we conducted showed that the GraphSage [23] aggregation method performs better than the GCN [21] and GAT [24] aggregation methods; thus the GraphSage aggregation (mean aggregation) is the one kept for DGL4C.</p>
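        <p>For comparison with the normalized sum used by GCN, a mean aggregation step can be sketched as follows. Only the aggregation is shown: a complete GraphSage layer also combines the result with the node's own representation through learned weights, which is omitted here. The function name mean_aggregate and the data layout are assumptions made for the example.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def mean_aggregate(adjacency, features):
    """Average the feature vectors of each node's neighbours.

    adjacency: dict node -> list of neighbour nodes.
    features:  dict node -> list of floats.
    Nodes without neighbours receive a zero vector.
    """
    aggregated = {}
    for v, neighbours in adjacency.items():
        dim = len(features[v])
        if not neighbours:
            aggregated[v] = [0.0] * dim
            continue
        aggregated[v] = [
            sum(features[u][k] for u in neighbours) / len(neighbours)
            for k in range(dim)
        ]
    return aggregated
```
]]></preformat>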
        <p>We propose two variants of DGL4C, which differ in the number of steps they are made up of. DGL4C-GCN is an end-to-end model with a unique step that includes both representation and classification. DGL4C-GRL is made up of two stages: (i) text (resume) representation, and (ii) a machine learning classifier.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>In this section, we aim at evaluating the performance of
both DGL4C-GCN and DGL4C-GRL.</p>
      <sec id="sec-4-1">
        <title>4.1. Experimental Setup</title>
        <sec id="sec-4-1-1">
          <title>4.1.1. Dataset</title>
          <p>The dataset used is a corpus of 2,484 anonymous resumes<sup>6</sup>. Each resume is associated with one label, which represents the resume profile and which we consider as the class label. 24 profiles (classes) are available. Each resume is written in natural language and contains personal information, education, experience, etc.</p>
          <p>In the experiments conducted, the aim is to evaluate the performance of the proposed models and to compare them with several baseline models from the literature. In addition, we are interested in evaluating the impact of the number of classes (profiles) on the accuracy of the models. Thus, we form several datasets so that each fits a predefined number of classes. Statistics about each of the resulting datasets and associated graphs are presented in Table 1.</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Parameters Settings</title>
        <p>DGL4C-GCN and DGL4C-GRL were implemented using the DGL framework<sup>7</sup> with two convolution layers of the GraphSage [23] architecture, which allow message passing among nodes, and the mean aggregation. From an architectural point of view, we set the embedding size of the first convolution layer to 500, a value fixed from initial experiments. We tuned the other parameters and set the learning rate to 0.001 and the dropout to 0.2. Both models were trained over 200 epochs with a batch size of 32 training samples. For each dataset, we randomly use 80% of the resumes for training and the remaining resumes for testing, and perform this selection 10 times. As a consequence, the accuracy reported in the experiments is the mean test accuracy over these 10 runs.</p>
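        <p>The evaluation protocol (ten random 80/20 splits, mean test accuracy) can be sketched as follows. This is a minimal illustration under stated assumptions: mean_test_accuracy and its train_and_predict argument are hypothetical names, and the callable stands for any model that is trained on the training part and returns predictions for the test part.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random


def mean_test_accuracy(samples, labels, train_and_predict,
                       runs=10, train_ratio=0.8, seed=0):
    """Average test accuracy over repeated random train/test splits.

    train_and_predict is a hypothetical callable with the signature
    (train_x, train_y, test_x) -> list of predicted labels.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    accuracies = []
    for _ in range(runs):
        # Draw a new random split for every run.
        rng.shuffle(indices)
        cut = int(train_ratio * len(indices))
        train_idx, test_idx = indices[:cut], indices[cut:]
        predictions = train_and_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        correct = sum(
            1 for i, pred in zip(test_idx, predictions) if pred == labels[i]
        )
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)
```
]]></preformat>
        <p>Fixing the seed makes the ten splits reproducible, so that several models can be compared on exactly the same partitions.</p>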
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Experimental Results</title>
        <p>To evaluate the effectiveness of DGL4C-GCN and DGL4C-GRL, we compare their performance with several models from the literature, which differ in either the text representation or the classifier step. Each model is a pair made up of a text representation model and a classifier, except for the end-to-end model DGL4C-GCN. The popular text representation models mentioned in the related work section are used. The list of models is presented in Table 2.</p>
        <p><sup>7</sup> https://www.dgl.ai/</p>
        <sec id="sec-4-3-1">
          <title>4.3.1. Impact of the Text Representation</title>
          <p>We first focus on the evaluation of the impact of the text representation, by fixing the classifier. We choose to use the popular random forest (RF) algorithm. The models studied are listed in the two upper parts of Table 2. Table 3 presents the test accuracy of these models.</p>
          <p>Let us first compare accuracy across models, on the complete dataset (D5). As expected, TF-IDF+RF is the least performing model (30.09 accuracy), TF-IDF being the historical representation and a quite simple way to represent texts. The deep-learning-based representations, Word2Vec+RF and sBERT+RF, perform better, with an accuracy of 49.65 and 60.23 respectively. sBERT+RF performs better than Word2Vec+RF, which is in line with the literature, sBERT being the current best performing model in NLP, specifically on the semantic textual similarity task [25].</p>
          <p>Let us now consider the graph-based representation models. DGL4C-GCN, the end-to-end model we propose, performs slightly better than sBERT+RF, but this increase is not statistically significant. DGL4C-GRL+RF, in contrast, performs significantly better than sBERT+RF. We can conclude that graph-based representations are adequate for the resume classification task and that the use of neighborhood information in the representation learning, which combines both semantic and structural information, is useful. In particular, this information can be viewed as a way to compensate for the lack of data faced by deep learning models.</p>
          <p>Let us now focus on the impact of the number of classes on the performance of the models, by studying performance on the D1 to D5 datasets, i.e. from 5 to 24 classes. As expected, we can see that the performance of each model is negatively impacted by the increase in the number of classes. For example, the accuracy of DGL4C-GRL+RF is 94.38 with 5 classes and decreases to 67.87 with 24 classes. However, this performance does not decrease linearly with the number of classes. In particular, the performance between 11 and 20 classes remains stable. A significant decrease occurs between 20 and 24 classes, from 75.76 to 67.87. A similar decrease also occurs for the other graph-based model, DGL4C-GCN. However, this is not the case for deep-learning-based models, nor for TF-IDF.</p>
          <p>We can conclude that graph representation based models perform better than traditional deep learning based models. However, they seem to be less robust as the number of classes grows. Additional experiments should be conducted to identify whether the decrease in performance is due to the number of classes or to characteristics of the 4 additional classes of D5.</p>
        </sec>
        <sec id="sec-4-3-2">
          <title>4.3.2. Impact of the Classifier</title>
          <p>We now focus on the evaluation of the impact of the classifier on the performance of DGL4C-GRL. We evaluate several well-known classifiers, namely Support Vector Classification (SVC), Multi-layer Perceptron (MLP) and Logistic Regression (LR), which we compare to the previously studied Random Forest. The list of models studied is presented in the two lower parts of Table 2. The mean accuracy of these models is presented in Table 4, which also recalls the performance of the best deep learning model, sBERT+RF, and of the graph-based end-to-end DGL4C-GCN model.</p>
          <p>First of all, we can see that whatever the classifier used, DGL4C-GRL still performs better than sBERT+RF on most of the dataset versions. If we focus on the impact of the classifier on the performance of DGL4C-GRL, LR and SVC are the two best performing classifiers, slightly outperforming RF. However, this improvement is not statistically significant. We can thus conclude that the nature of the classifier does not significantly impact the performance of the model. On the contrary, the graph representation seems to be the most influential step for the performance, which confirms the findings of the literature.</p>
          <p>DGL4C-GCN is the best performing model for two of the five datasets (D2 and D3). However, DGL4C-GCN has a significantly lower performance on D5 (62.43 accuracy) compared to the best performing model, DGL4C-GRL+SVC (69.04 accuracy). This can be explained by the fact that an end-to-end model has one optimization function that optimises both the representation learning and the classification, whereas DGL4C-GRL has two optimization functions used separately, which makes the classifier more flexible.</p>
          <p>Table fragment recovered from the source (models DGL4C-GRL+LR, DGL4C-GRL+SVC and DGL4C-GRL+MLP; accuracy values 94.38, 73.65, 73.76 and 67.87); the remaining cells are not recoverable.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>In this paper we have proposed DGL4C, a deep semi-supervised graph representation learning based model for resume classification. This model can be used to provide recommendations to ATSs, human resource departments, and professional online social networks (e.g. LinkedIn, Viadeo, Meetup, JobCase, etc.).</p>
      <p>DGL4C relies on a deep learning approach and adapts the GNN architecture to textual data. The experiments conducted demonstrate the performance of the two variants of DGL4C: DGL4C-GCN and DGL4C-GRL. In particular, both variants perform better than machine learning-based and deep learning-based models from the literature, including sBERT, which has shown good performance on close use cases. Experiments thus confirm the relevance of relying on a graph-based representation in the HR context.</p>
      <p>In future work, we plan to adopt unsupervised graph representation learning instead of semi-supervised learning, which will make it possible to collect and evaluate on larger datasets.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref>
        <mixed-citation>[1] A. Zaroor, M. Maree, M. Sabha, <article-title>Jrc: a job post and resume classification system for online recruitment</article-title>, in: <source>29th ICTAI</source>, IEEE, <year>2017</year>, pp. <fpage>780</fpage>–<lpage>787</lpage>.</mixed-citation>
      </ref>
      <ref id="ref1">
        <mixed-citation>[2] A. Giabelli, L. Malandri, F. Mercorio, M. Mezzanzanica, A. Seveso, <article-title>Skills2job: A recommender system that encodes job offer embeddings on graph databases</article-title>, <source>Applied Soft Computing</source> <volume>101</volume> (<year>2021</year>) 107049.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[3] L. Yao, C. Mao, Y. Luo, <article-title>Graph convolutional networks for text classification</article-title>, in: <source>AAAI</source>, volume <volume>33</volume>, <year>2019</year>, pp. <fpage>7370</fpage>–<lpage>7377</lpage>.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[14] S. Fareri, N. Melluso, F. Chiarello, G. Fantoni, <article-title>Skillner: Mining and mapping soft skills from any text</article-title>, <source>Expert Systems with Applications</source> <volume>184</volume> (<year>2021</year>) 115544.</mixed-citation>
      </ref>
      <ref>
        <mixed-citation>[15] E. Abdollahnejad, M. Kalman, B. H. Far, <article-title>A deep learning bert-based approach to person-job fit in talent recruitment</article-title>, in: <source>CSCI</source>, IEEE, <year>2021</year>, pp. <fpage>98</fpage>–<lpage>104</lpage>.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>