<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Enhancing Human Capital Management: AI Techniques for Candidate Matching and Skill Extraction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Muhammad Hasan Nizami</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ahtisham Uddin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Muhammad Talha Salani</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ayesha Saeed</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Faisal Alvi</string-name>
          <email>faisal.alvi@sse.habib.edu.pk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abdul Samad</string-name>
          <email>abdul.samad@sse.habib.edu.pk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Habib University</institution>
          ,
          <addr-line>Block 18, Gulistan-e-Jauhar, Karachi</addr-line>
          ,
          <country country="PK">Pakistan</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Artificial Intelligence (AI) is transforming talent acquisition by enabling semantically informed, retrieval-based approaches for job and skill matching. In this paper, we present our system developed for the TalentCLEF 2025 challenge, addressing Task A (multilingual job title similarity) and Task B (job title-based skill prediction). Our approach leverages pre-trained multilingual sentence embedding models and cosine similarity to match job titles and rank relevant skills without requiring large-scale supervision or retraining. This framework achieves scalable and effective performance across both multilingual and monolingual settings, demonstrating the potential of embedding-based retrieval methods in human capital management applications.</p>
      </abstract>
      <kwd-group>
<kwd>Artificial Intelligence</kwd>
        <kwd>Profile Matching</kwd>
        <kwd>Skill Extraction</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>SBERT</kwd>
        <kwd>Job Retrieval</kwd>
        <kwd>Sentence Embeddings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The following review synthesizes state-of-the-art research in AI-driven job matching and skill
extraction, highlighting key methodologies, challenges such as data sparsity and bias, and future
research directions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature Review</title>
<p>Elements of Human Capital Management such as title matching and skill extraction have particularly
benefited from the integration of Natural Language Processing. Just as automatic job title classification
requires innovative solutions, NLP models for real-world HR use must also be tailored to cross-industry
and multilingual settings. This review examines the effectiveness and limitations of the
existing solutions to these tasks, with particular attention to retrieval-based approaches.</p>
      <p>
        Matching job titles across languages is a major problem for international recruiters.
ESCOXLM-R [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], built on top of XLM-R large, is a transformer-based model that achieves state-of-the-art (SOTA)
results for job title classification. It leverages masked language modelling and ESCO relation prediction, and it
performs cross-language job title alignment remarkably well, especially for very short titles. However,
it struggles with low-resource languages: where training data is scarce, performance
deteriorates.
      </p>
      <p>
        Retrieval-augmented generation (RAG) combined with multilingual embeddings is another way [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] attempt to improve the
effectiveness of job title classification. This approach combines job title embeddings with semantic retrieval,
enabling better alignment across different languages and industries. Compared with standard embedding models
and techniques, it offers much better efficiency. However, ambiguous job
titles remain an issue and require additional contextual features in order to be resolved.
      </p>
      <p>
        Recent advances in retrieval-based systems leverage contrastive learning for multilingual job matching.
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] proposed a two-stage approach combining unsupervised pre-training on skill distributions with
contrastive fine-tuning using ESCO taxonomy pairs. Their method achieved a 4.3% improvement
in Mean Average Precision (MAP) over monolingual baselines, demonstrating strong cross-lingual
alignment. However, its reliance on ESCO limits coverage for Asian languages like Japanese and Korean.
This highlights the trade-off between taxonomy-driven precision and language coverage in retrieval
systems.
      </p>
      <p>
        A number of attempts have been made to address the problem of job title classification and
normalization. Shi et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] created Job2Skills, a model built specifically for LinkedIn, which improved the
efficiency of the system's job recommendations. However, because the data is limited
to LinkedIn, the model's transferability to other tools remains in question. Li et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] faced the
same limitation with their two-step, tokenization-based job title normalization
approach: they matched tokenized user-generated job titles against a common reference table
within LinkedIn, but this solution lacked adaptability to
standardized taxonomies like ESCO or O*NET.
      </p>
      <p>
        Graph-based retrieval systems offer an alternative by explicitly modeling job-skill relationships.
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] combined a multilingual sentence encoder (mUSE) with GraphSage to create a bipartite job-skill
graph, using TF-IDF weighted edges for retrieval. Their hybrid approach achieved 0.7329 MAP in
job-job matching, outperforming text-only baselines by 15%. Notably, the system showed robustness
in cold-start scenarios, maintaining 0.691 MAP even when 95% of test skills were removed during
training. This demonstrates the value of structural information in retrieval systems, though at the cost
of increased preprocessing complexity.
      </p>
      <p>
        Proprietary datasets pose a further challenge for job classification, one that is being
addressed through taxonomy-driven classification. Javed et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] propose a solution
through semi-supervised, taxonomy-based classification, using hierarchical classifiers trained on the O*NET-SOC
taxonomy to classify online recruitment data.
      </p>
      <p>
        “JobBERT,” [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is a widely recognized benchmark known for its methodology, which frames job title classification
as a semantic text similarity (STS) problem over a taxonomy aligned with
ESCO. Differing from prior approaches, JobBERT seeks to capture semantics by deriving job-relevant
skills from the associated vacancies and descriptions, thus reducing the need for extensive labeled
datasets or continuously updated standardized titles.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed Approach</title>
      <p>3.1. Task A</p>
      <p>To tackle Task A, we adopted a fine-tuning-based strategy using the Sentence-BERT (SBERT) framework,
particularly leveraging the paraphrase-multilingual-mpnet-base-v2 model due to its strong cross-lingual
capabilities. The dataset comprises job title pairs in multiple languages (English, Spanish, German,
and Chinese), categorized by their semantic similarity. For each language, we preprocessed the text to
normalize spacing and cleaned inconsistencies. Positive pairs were formed from job titles sharing the
same family ID, while hard negatives were created by sampling titles from different families, enabling
the model to better learn nuanced distinctions. We structured these pairs into InputExample objects
and aggregated them across all languages.</p>
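      <p>As an illustration, this pair-construction step can be sketched in plain Python; the data layout and the one-negative-per-title sampling rate below are assumptions for the sketch, not the exact pipeline. In the actual implementation, each (anchor, candidate, label) tuple is wrapped into an SBERT InputExample.

```python
import random

def build_pairs(titles_by_family, negatives_per_title=1, seed=0):
    """Build (anchor, candidate, label) pairs: label 1.0 for titles sharing
    a family ID, 0.0 for hard negatives sampled from other families."""
    rng = random.Random(seed)
    families = list(titles_by_family)
    pairs = []
    for fam in families:
        titles = titles_by_family[fam]
        # positive pairs: every combination of titles within the same family
        for i in range(len(titles)):
            for j in range(i + 1, len(titles)):
                pairs.append((titles[i], titles[j], 1.0))
        # hard negatives: sample titles from a different family
        others = [f for f in families if f != fam]
        for t in titles:
            for _ in range(negatives_per_title):
                neg_fam = rng.choice(others)
                pairs.append((t, rng.choice(titles_by_family[neg_fam]), 0.0))
    return pairs
```
</p>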
      <sec id="sec-3-1">
        <title>3.1.1. Model Architecture</title>
        <p>For our architecture, we built upon the Sentence-BERT (SBERT) framework, which modifies the
original BERT model by enabling it to generate semantically meaningful sentence embeddings. We
specifically used the paraphrase-multilingual-mpnet-base-v2 variant, a transformer-based model
fine-tuned for multilingual paraphrase identification. This model projects each job title into a dense
768-dimensional vector space, where cosine similarity between vectors reflects semantic closeness. The
architecture comprises an MPNet encoder that captures bidirectional context, followed by mean pooling
over token embeddings to produce fixed-size sentence representations. These embeddings are then
compared using a cosine similarity function during training and inference. The model is optimized
using CosineSimilarityLoss, which encourages embeddings of similar job titles to be close in vector
space and dissimilar ones to be far apart. This setup ensures efficient and accurate semantic matching
across languages, which is critical for capturing the subtle distinctions and equivalencies between
multilingual job titles.</p>
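        <p>The mean pooling and cosine similarity operations described above are simple to state precisely; a minimal stdlib sketch follows (in the real model these run over 768-dimensional MPNet token embeddings rather than the toy vectors shown here):

```python
from math import sqrt

def mean_pool(token_embeddings):
    """Average the token vectors into one fixed-size sentence vector."""
    n = len(token_embeddings)
    return [sum(col) / n for col in zip(*token_embeddings)]

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```
</p>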
      </sec>
      <sec id="sec-3-2">
        <title>3.1.2. Text Processing</title>
        <p>Before feeding the job titles into the model, we implemented a standardized text preprocessing pipeline to
ensure consistency and reduce noise. All job titles were first lowercased to avoid case-based mismatches.
We then removed extraneous punctuation, digits, and special characters that could interfere with the
semantic embedding process. Stopwords were retained intentionally, as they often carry essential
syntactic and semantic meaning in short phrases like job titles (e.g., “Head of Marketing”). Tokenization
was handled internally by the SBERT model’s tokenizer, which splits the input into subword tokens
compatible with the MPNet architecture. In multilingual cases, the same preprocessing steps were
applied uniformly across all languages to maintain parallel structure and reduce the risk of
languagespecific biases. This careful preprocessing ensured that each job title was converted into a clean,
semantically rich representation, enabling more accurate comparisons across the embedding space.</p>
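        <p>These preprocessing steps can be sketched with a small regex-based function; this is an illustrative simplification rather than the exact cleaning code (Python's \w is Unicode-aware, so letters in any script survive the filter):

```python
import re

def preprocess(title):
    """Lowercase, strip punctuation and digits, collapse whitespace.
    Stopwords are deliberately kept (e.g. "Head of Marketing")."""
    t = title.lower()
    # replace runs of punctuation, digits, underscores, and whitespace
    # with a single space; letters in any script are kept
    t = re.sub(r"[\W\d_]+", " ", t)
    return t.strip()
```
</p>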
      </sec>
      <sec id="sec-3-3">
        <title>3.1.3. Retrieval System</title>
        <p>Our proposed approach centers around a retrieval-based system, leveraging Sentence-BERT (SBERT)
embeddings and cosine similarity to identify and return the most semantically relevant job titles. Rather
than classifying input into predefined categories or generating new content, the system encodes both
user queries and existing job titles into dense vector representations using a pre-trained SBERT model.
These embeddings capture the semantic relationships between words and phrases, allowing the system to
compute cosine similarity scores between the query and all entries in the dataset. The top match—based
on the highest similarity score—is then retrieved as the output. This method ensures a scalable and
interpretable mechanism for matching user input with job roles, offering strong performance for
real-world applications where nuance and context in textual data are critical.</p>
        <p>• Use of SBERT for high-quality sentence embeddings capturing semantic meaning.
• Cosine similarity to compare query-job title pairs efficiently.
• Top-k retrieval mechanism to extract the most relevant job titles.
• No need for extensive labeled training data—relies on pre-trained semantic understanding.
• Scalable to large corpora of job roles and adaptable to new data entries with minimal overhead.</p>
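        <p>The retrieval mechanism summarized by these bullets amounts to scoring every corpus entry against the query and keeping the k best; a minimal stdlib sketch, with short vectors standing in for SBERT embeddings:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def top_k(query_vec, corpus, k=5):
    """Rank (title, vector) corpus entries by similarity to the query, keep k best."""
    scored = [(title, cosine(query_vec, vec)) for title, vec in corpus]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```
</p>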
        <p>In Task A, we explored various modeling approaches before finalizing a robust retrieval-based system
using SBERT and cosine similarity. This method effectively captures semantic nuances and provides
accurate role recommendations without requiring extensive training. The system demonstrates strong
potential for scalable deployment in real-world job matching scenarios.</p>
        <p>3.2. Task B</p>
        <p>For Task B, we adopted a retrieval-based strategy to identify relevant skills for given job titles. Unlike
Task A, Task B is monolingual (English-only), which allowed us to focus on semantic representation and
matching within a single language. Instead of using a generative model to produce skills, our method
relies on computing semantic similarity between job titles and skill aliases using transformer-based
sentence embeddings. This approach not only eliminates the need for extensive labeled data but also
ensures interpretability and scalability.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.2.1. Model Architecture</title>
        <p>We utilized the paraphrase-multilingual-mpnet-base-v2 model from the
SentenceTransformers library. Although the dataset was entirely in English and did not require multilingual handling, this
model was chosen for its superior performance in generating semantically rich embeddings compared
to smaller or monolingual alternatives. The model projects input sentences into a 768-dimensional
vector space, where semantic similarity is quantified using cosine similarity.</p>
        <p>Each job title and each skill alias (treated as an individual concept) was encoded into vector
embeddings using this model. By leveraging pre-trained transformer weights, we benefited from a high level
of semantic understanding without requiring additional fine-tuning.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.2.2. Text Processing</title>
        <p>To prepare the data for retrieval, we processed two key components: the job titles (queries) and the skill
aliases (corpus elements). The skill entries in the dataset often included multiple aliases per skill. These
were parsed using ast.literal_eval to convert string representations into Python lists, and then
expanded using the explode() function to treat each alias as a separate candidate for retrieval.</p>
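        <p>A minimal sketch of this parse-and-explode step, using a plain list of rows in place of the pandas DataFrame (the column layout here is an assumption for illustration):

```python
import ast

def explode_aliases(rows):
    """Parse stringified alias lists and emit one (skill_id, alias) row per
    alias, mirroring ast.literal_eval followed by DataFrame.explode()."""
    out = []
    for skill_id, alias_field in rows:
        aliases = ast.literal_eval(alias_field)  # string repr of a list -> list
        for alias in aliases:
            out.append((skill_id, alias))
    return out
```
</p>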
        <p>Job titles and skill aliases were cleaned minimally, preserving their original casing and structure
to retain contextual meaning. Unlike typical NLP pipelines, we intentionally avoided aggressive
preprocessing (like stopword removal) because job titles and skill names are often short phrases where
every word may carry critical semantic information. Tokenization was handled internally by the model’s
tokenizer, ensuring compatibility with the MPNet architecture.</p>
      </sec>
      <sec id="sec-3-6">
        <title>3.2.3. Retrieval System</title>
        <p>The retrieval system was designed to compute semantic similarity between job titles and individual
skill aliases. Using the SentenceTransformer model, we encoded all job titles and aliases into dense
vector embeddings. We then computed cosine similarity between each job title vector and all skill alias
vectors.</p>
        <p>For each query, we ranked all skill aliases based on descending similarity scores. To prevent redundant
outputs, we retained only the highest-ranked alias per unique skill ID, ensuring that each skill appeared
only once per query’s final list.</p>
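        <p>Assuming the (skill ID, alias, score) triples are already sorted by descending similarity, this per-skill deduplication can be sketched as:

```python
def dedupe_by_skill(ranked):
    """Given (skill_id, alias, score) triples sorted by descending score,
    keep only the highest-scoring alias per unique skill ID."""
    seen = set()
    out = []
    for skill_id, alias, score in ranked:
        if skill_id not in seen:
            seen.add(skill_id)
            out.append((skill_id, alias, score))
    return out
```
</p>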
        <p>The results were formatted in TREC-style output, capturing the query ID, skill ID, rank, similarity
score, and system tag. This format enabled direct evaluation using the official TalentCLEF evaluation
script.</p>
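        <p>A sketch of this formatting step; the column order follows the standard TREC run format (query ID, the literal Q0, document/skill ID, rank, score, run tag), and the tag value below is illustrative:

```python
def trec_lines(query_id, ranked_skills, tag="run1"):
    """Format ranked (skill_id, score) pairs as TREC run lines:
    query_id Q0 skill_id rank score tag."""
    return [
        "{} Q0 {} {} {:.4f} {}".format(query_id, sid, rank, score, tag)
        for rank, (sid, score) in enumerate(ranked_skills, start=1)
    ]
```
</p>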
      </sec>
      <sec id="sec-3-7">
        <title>3.2.4. Design Rationale</title>
        <p>Our decision to use the paraphrase-multilingual-mpnet-base-v2 model was driven by
empirical results rather than the multilingual feature set. Preliminary experiments demonstrated that this
model outperformed lighter or task-specific alternatives and significantly exceeded the benchmark
performance. Additionally, the retrieval-based architecture offers several advantages:
• No need for supervised fine-tuning or labeled training data.
• High-quality semantic matching via dense embeddings.
• Scalable to large corpora of skill terms with low computational overhead.
• Easily adaptable to new job titles and evolving skill taxonomies.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>4.1. Task A</p>
      <sec id="sec-4-1">
        <title>4.1.1. Baseline Approach Analysis</title>
        <p>The baseline approach relied on the paraphrase-multilingual-MiniLM-L12-v2 model from the
Sentence-Transformers library, which is a lightweight multilingual model designed for general-purpose
semantic similarity tasks. While computationally efficient and easy to deploy, this model lacked the
contextual depth and domain-specific training necessary for capturing nuanced job-related semantics.
In particular, it struggled to differentiate between closely related job titles and skill terms, resulting in
reduced retrieval precision. Furthermore, the absence of task-specific fine-tuning or alignment with
recruitment domain vocabularies made it less effective in understanding multilingual and ambiguous
queries. As shown in Figure 1, its performance lagged behind fine-tuned alternatives across all evaluation
metrics and languages.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.1.2. Proposed Approach &amp; Analysis</title>
        <p>Our final retrieval-based approach, which employed fine-tuned SBERT embeddings, outperformed
both the provided baseline and our initial implementation across all languages. In English, we achieved
a MAP of 0.5468 and an MRR of 0.7948, significantly improving upon the baseline score of 0.4992 and
our initial score of 0.5213. Similarly, the Spanish results improved from a baseline of 0.3717 to a final
MAP of 0.4469, and German saw a jump from 0.2840 to 0.3733. Chinese also exhibited notable gains,
increasing from a baseline MAP of 0.4371 to 0.4965. These results demonstrate the effectiveness of our
multilingual fine-tuned SBERT model in enhancing retrieval performance. The use of cosine similarity
on dense representations allowed for accurate candidate-job matching across different languages with
minimal overhead.</p>
        <p>4.2. Task B</p>
        <p>This section presents the findings from testing various models and hyperparameters to achieve optimal
performance for Task B. The primary evaluation metric was Mean Average Precision (MAP).</p>
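        <p>For reference, MAP is the mean over queries of average precision, where average precision rewards placing relevant items near the top of the ranking; a minimal sketch:

```python
def average_precision(ranked_ids, relevant):
    """AP for one query: precision at each rank holding a relevant item,
    averaged over the full relevant set."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked_ids, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP over a list of (ranked_ids, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```
</p>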
      </sec>
      <sec id="sec-4-3">
        <title>4.2.1. Final Result</title>
        <p>The most effective configuration, as outlined in Section 3.2, yielded the following performance:</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.2.2. Model Selection</title>
        <p>In the initial stages, several sentence embedding models were tested. The model
paraphrase-multilingual-mpnet-base-v2 achieved the highest MAP score of 0.2205,
outperforming all other transformer-based models considered.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.2.3. Pooling Strategy</title>
        <p>Among the different pooling strategies, using the cls_token embedding performed best. This method
relies on the special [CLS] token, which encodes sentence-level semantics.</p>
      </sec>
      <sec id="sec-4-6">
        <title>4.2.4. Max Input Length</title>
        <p>Varying the maximum input length (128, 256, 512) showed minimal impact on performance, likely
because most input sequences in the dataset were shorter than 128 tokens.</p>
      </sec>
      <sec id="sec-4-7">
        <title>4.2.5. Softmax Temperature</title>
        <p>Since predictions were based on cosine similarity scores passed through a softmax layer, the team
experimented with diferent temperature values. A temperature of 0.1 yielded the best results. However,
even this configuration underperformed compared to the multilingual MPNet model, which did not use
temperature scaling.</p>
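        <p>Temperature scaling divides the similarity scores by a constant before the softmax, so lower temperatures sharpen the distribution toward the top match; a minimal sketch:

```python
from math import exp

def softmax(scores, temperature=0.1):
    """Temperature-scaled softmax; lower temperature sharpens the distribution."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```
</p>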
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Error Analysis</title>
      <p>When we dug into where our system stumbled, some clear patterns emerged. Short, ambiguous job
titles like “Analyst” or “Manager” sometimes misfired: our embeddings grasped general meaning but
missed industry nuances (e.g., matching a financial analyst to a data science role). We also noticed
skill aliases working against us: terms like “Python Programming” and “Python (Language)” split the
relevance signal for the same skill, dragging down scores. Language gaps persisted too: culture-specific
Chinese roles like “关系经理” (Guanxi Manager) underperformed compared to European titles, hinting
at our model’s Eurocentric training roots.</p>
      <p>More subtly, we struggled with “implied” skills: titles like “Project Coordinator” rarely surfaced “Team
Leadership” despite their real-world connection, showing how pure text similarity misses conceptual
hierarchies. And while our MAP scores looked solid, we saw real frustration points: correct matches
often lurked just outside the top-ranked spot (#2–#5), which matters immensely when recruiters only
glance at the first suggestion. These stumbles remind us that for real-world use, we would need smarter
alias grouping and ways to inject contextual or industry-aware signals.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Perspectives and Future Work</title>
      <sec id="sec-6-1">
        <title>6.1. Bias Mitigation and Fairness in Recruitment Systems</title>
        <p>To reduce systemic bias in AI-driven recruitment, future work should include data augmentation
that modifies sensitive attributes such as gender or ethnicity while keeping qualifications constant, to
evaluate fairness. Adversarial debiasing can be used to penalize reliance on demographic proxies. For
transparency, embedding-based explanations (e.g., LIME adaptations) can help highlight key textual
features influencing similarity scores. Hybrid pipelines combining embeddings with rule-based logic
may further enhance interpretability and user trust.</p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Real-Time and Incremental Updating</title>
        <p>As job markets and terminology rapidly evolve, future systems must incorporate continuous updates.
Streaming data pipelines can ingest new job titles and skills, enabling online or continual learning
through adapter layers or time-aware embeddings. Time-stamped tokens can help capture semantic
shifts. Human feedback, via active learning loops or recruiter interfaces, should be used to identify edge
cases and refine the model’s understanding of emerging patterns.</p>
      </sec>
      <sec id="sec-6-3">
        <title>6.3. Multimodal and Cross-Domain Integration</title>
        <p>Future systems should go beyond plain text by integrating visual and auditory information for more
accurate skill assessment. Layout-aware models like LayoutLMv3 can utilize resume structure (headings,
tables, font styles) to prioritize relevant sections. Additionally, incorporating candidate video or audio
(e.g., short introductions) can help infer soft skills like communication or leadership. Combining
text, layout, and speech into multimodal embeddings will result in richer and more holistic talent
representations.</p>
      </sec>
      <sec id="sec-6-4">
        <title>6.4. Domain-Specific Customization and Transferability</title>
        <p>Generic skill-matching models often fail to capture the specificity of different industries. Future research
should explore domain-adapted modules (e.g., using adapter layers) and meta-learning techniques to
fine-tune models on small in-domain datasets. Transferability should be tested across public and
proprietary taxonomies to identify generalization gaps. Few-shot and zero-shot learning methods may
enable rapid adaptation to company-specific roles and specialized job titles with minimal annotation
effort.</p>
      </sec>
      <sec id="sec-6-5">
        <title>6.5. Explainability, Transparency, and Ethical Considerations</title>
        <p>Future recruitment systems must prioritize both ethical robustness and privacy. Intersectional fairness
audits—evaluating bias across combinations of sensitive attributes—are essential. Counterfactual
explanations can help expose dependencies that influence predictions unfairly. Privacy-preserving training
techniques such as federated learning and differential privacy should be explored to ensure compliance
and protect user data. These efforts are vital for building equitable, trustworthy, and legally sound
AI-based hiring tools.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>In this paper, we presented comprehensive approaches for both Task A and Task B of the TalentCLEF
2025 lab, tackling the challenges of job title similarity and skill recommendation in multilingual and
monolingual settings, respectively.</p>
      <p>For Task A, which focused on multilingual job title similarity, we fine-tuned a Sentence-BERT
(SBERT) model using the paraphrase-multilingual-mpnet-base-v2 transformer backbone. Our
method was designed to produce semantically meaningful embeddings across four languages: English,
Spanish, German, and Chinese. Through strategic creation of positive and hard negative pairs based on
job family IDs, and consistent multilingual preprocessing, we enabled the model to effectively capture
cross-lingual semantic relationships. The resulting embeddings allowed for precise cosine similarity
computations, enabling accurate and scalable retrieval of semantically equivalent job titles across
languages.</p>
      <p>For Task B, we adopted a retrieval-based approach to associate English-language job titles with
relevant skill terms. Rather than using a generative model, we utilized the same SentenceTransformer
backbone to encode both job titles and skill aliases into high-dimensional vector spaces. Each skill
alias was treated as an individual retrieval candidate, and cosine similarity was used to rank them in
relation to each job title. By exploding multi-alias skill entries and ensuring one result per unique
skill ID, we constructed a refined output that was both semantically relevant and non-redundant. Our
system demonstrated strong alignment with the gold-standard skill associations, validated through
official evaluation metrics such as Mean Average Precision (MAP).</p>
      <p>Overall, our contributions showcase the flexibility and effectiveness of transformer-based sentence
embeddings for talent search and retrieval tasks. Whether through fine-tuning for multilingual
paraphrasing or zero-shot semantic matching in monolingual skill recommendation, the use of pretrained
language models enabled high-quality results without requiring task-specific architectures. Our work
underscores the practicality of embedding-based solutions for large-scale, real-world applications in
job-market intelligence and human resource technology.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>We would like to acknowledge the support provided by the Office of Research (OoR) at Habib University,
Karachi, Pakistan for funding this project through the internal research grant IRG-2235.
During the preparation of this work, the author(s) used ChatGPT and other generative AI tools to fix
grammar and spelling errors and to paraphrase and reword sections of the paper where needed. After using
this tool/service, the author(s) reviewed and edited the content to their liking and need. The author(s)
take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Gasco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Fabregat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>García-Sardiña</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Estrella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Deniz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rodrigo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zbib</surname>
          </string-name>
          ,
          <article-title>Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management, in: International Conference of the Cross-Language Evaluation Forum for European Languages</article-title>
          , Springer,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , et al., ESCOXLM-R:
          <article-title>Multilingual taxonomy-driven pre-training for the job market domain</article-title>
          , arXiv (
          <year>2023</year>
          ). URL: https://arxiv.org/abs/2305.12092.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Kavas</surname>
          </string-name>
          , et al.,
          <article-title>Enhancing job posting classification with multilingual embeddings and large language models</article-title>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Deniz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Retyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Garcia-Sardina</surname>
          </string-name>
          , et al.,
          <article-title>Combined unsupervised and contrastive learning for multilingual job recommendation</article-title>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>Salience and market-aware skill extraction for job targeting</article-title>
          ,
          <source>ArXiv</source>
          .org (
          <year>2020</year>
          ). URL: https://arxiv.org/abs/2005.13094.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>Deep job understanding at LinkedIn</article-title>
          (
          <year>2020</year>
          ). doi:10.1145/339727.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Fabregat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Retyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Poves</surname>
          </string-name>
          , et al.,
          <article-title>Inductive graph neural network for job-skill framework analysis</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>73</volume>
          (
          <year>2024</year>
          )
          <fpage>83</fpage>
          -
          <lpage>94</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F.</given-names>
            <surname>Javed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>McNair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Jacob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>Towards a job title classification system</article-title>
          ,
          <source>ArXiv.org</source>
          (
          <year>2016</year>
          ). URL: https://arxiv.org/abs/1606.00917.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.-J.</given-names>
            <surname>Decorte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Van Hautte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Demeester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Develder</surname>
          </string-name>
          ,
          <article-title>JobBERT: Understanding job titles through skills</article-title>
          ,
          <source>ArXiv</source>
          abs/2109.09605 (
          <year>2021</year>
          ). URL: https://api.semanticscholar.org/CorpusID:237572142.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>