<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Enhancing Job Posting Classification with Multilingual Embeddings and Large Language Models</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Hamit</forename><surname>Kavas</surname></persName>
							<email>hamit.kavas@upf.edu</email>
							<affiliation key="aff0">
								<orgName type="laboratory">NLP Group</orgName>
								<orgName type="institution">Pompeu Fabra University</orgName>
								<address>
									<addrLine>C/ Roc Boronat, 138</addrLine>
									<postCode>08018</postCode>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Adevinta Spain</orgName>
								<address>
									<addrLine>C/ de la Ciutat de Granada, 150</addrLine>
									<postCode>08018</postCode>
									<settlement>Barcelona</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Marc</forename><surname>Serra-Vidal</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Adevinta Spain</orgName>
								<address>
									<addrLine>C/ de la Ciutat de Granada, 150</addrLine>
									<postCode>08018</postCode>
									<settlement>Barcelona</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Leo</forename><surname>Wanner</surname></persName>
							<email>leo.wanner@upf.edu</email>
							<affiliation key="aff0">
								<orgName type="laboratory">NLP Group</orgName>
								<orgName type="institution">Pompeu Fabra University</orgName>
								<address>
									<addrLine>C/ Roc Boronat, 138</addrLine>
									<postCode>08018</postCode>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="department">Catalan Institute for Research and Advanced Studies (ICREA)</orgName>
								<address>
									<addrLine>Passeig Lluís Companys, 23</addrLine>
									<postCode>08010</postCode>
									<settlement>Barcelona</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Enhancing Job Posting Classification with Multilingual Embeddings and Large Language Models</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
						<meeting>Tenth Italian Conference on Computational Linguistics
							<address>
								<settlement>Pisa</settlement>
								<country key="IT">Italy</country>
							</address>
						</meeting>
						<imprint>
							<date type="published" when="2024-12-04">Dec 04-06, 2024</date>
						</imprint>
					</monogr>
					<idno type="MD5">573D11A030519E239924306358A6196C</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:36+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>ESCO labour market taxonomy</term>
					<term>job posting classification</term>
					<term>class embeddings</term>
					<term>text embeddings</term>
					<term>LLM</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In the modern labour market, taxonomies such as the European Skills, Competences, Qualifications and Occupations (ESCO) classification are used as an interlingua to match job postings with job seeker profiles. Both are classified with respect to ESCO occupations, and match if they align with the same occupation and the same skills assigned to the occupation. However, matching models usually struggle with the classification because of overlapping skills and similar definitions of occupations defined in the ESCO taxonomy. This often leads to imprecise classification outcomes. In this paper, we focus on the challenge of the classification of job postings written in Italian or Spanish against ESCO occupations written in English. We experiment with multilingual embeddings, zero-shot classification, and the use of a large language model (LLM), and show that the use of an LLM leads to the best results. Furthermore, we also explore an alternative automatic labeling method by prompting three top-performing LLMs to annotate the test dataset. This approach serves both as an experiment on the usability of automatic labeling and as an evaluation of the reliability of the automatically assigned labels, involving human annotators.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The modern labour market is becoming more and more diverse. High-tech jobs demand novel skills and competences, which, in their turn, keep undergoing adaptations and modifications. Under these circumstances, accurately classifying job postings and CVs of job seekers (henceforth candidate experiences) that contain detailed technological specifications with remarkably similar yet distinct skills and experiences has evolved into a complex challenge.</p><p>The overwhelming majority of job portals and employment agencies use either the European Skills, Competences, Qualifications and Occupations (ESCO) taxonomy 1 or its US equivalent O*Net taxonomy 2 to classify job postings and candidate experiences in terms of job-title-labeled ESCO/O*Net occupations. Most of the proposals for the automatic alignment of job postings with candidate experiences (or vice versa) also use ESCO or O*Net <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b2">3]</ref>. However, despite their wide use, both the ESCO and O*Net taxonomies exhibit principal limitations for the task of automatic classification of job postings and candidate experiences: due to their tree structure, they often fail to adequately distinguish between occupations that exhibit substantial skill overlaps. For instance, two job postings labeled as 'data analyst' may appear similar but require different skills if one focuses on market research while the other concentrates on healthcare trends analysis. This issue is particularly pronounced when classification relies on a single label, such as the job title of an ESCO occupation, where skill overlaps undermine precise classification. Hence, employing multiple job titles and framing the problem as a multi-label classification task is imperative.</p><p>This paper addresses the challenge of multilingual multi-label classification using Large Language Models (LLMs) for the alignment of Italian and Spanish job postings with English job titles encountered in the ESCO taxonomy. Multilingual class embeddings are explored to improve classification accuracy, aiming to provide the necessary contextual awareness and to address the core limitations of taxonomies such as ESCO.</p><p>Furthermore, we explore an alternative automatic labeling method by prompting three top-performing LLMs to annotate the test dataset. This approach serves both as an experiment on the usability of automatic labeling and as an evaluation of the reliability of the automatically assigned labels, involving human annotators.</p><p>To provide LLMs with domain-specific information and to mitigate hallucinations in the course of the classification of the job postings, we employ Retrieval Augmented Generation (RAG) <ref type="bibr" target="#b3">[4]</ref>, which combines information retrieval with a generative model. RAG serves two critical functions in our methodology. Firstly, it provides detailed definitions, including essential skills and synonyms for each ESCO occupation, selected through vector similarity as outlined in <ref type="bibr" target="#b4">[5]</ref>. 
Secondly, it ensures that the assigned job titles are restricted to titles within our predefined label space, i.e., standardized job titles defined in the ESCO taxonomy.</p><p>The contributions of our work are:</p><p>• We explore the impact of using multilingual class embeddings derived from the ESCO taxonomy for the task of job posting classification.</p><p>• We integrate RAG to provide LLMs with domain-specific information and eliminate the dependency on fine-tuning.</p><p>• We show how the LLM response can be restricted to standardized job titles and thus how LLMs can be used for high-quality job title classification that outperforms state-of-the-art proposals for this task.</p><p>The remainder of the paper is structured as follows. In Section 2, we present a concise overview of the related work. In Section 3, the model on which our work is based is outlined. Section 4 describes the experiments we carried out, the results we obtained in these experiments, and their discussion. Section 5, finally, draws some conclusions from the presented work and outlines some directions for future research. In Appendix A, we present an ablation study in which we assess the comprehension of English ESCO job titles and their Spanish equivalents by our model. Appendix B provides, for illustration, examples of Italian job postings and predicted ESCO job titles.</p><p>In Appendix E, we present the signature used to prompt Large Language Models for pre-processing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>A number of works have been carried out in the domain of job title classification, focusing on various facets of the problem. Shi et al. <ref type="bibr" target="#b5">[6]</ref> introduce Job2Skills, a model developed for LinkedIn. The model significantly improves job recommendation performance metrics; however, it raises questions about its effectiveness beyond LinkedIn. Li et al. <ref type="bibr" target="#b6">[7]</ref> propose a two-step job title normalization, also for LinkedIn, which is based on tokenization and matching of the original job title provided by the user against a lookup table. The use of a lookup table instead of a standard occupation taxonomy such as ESCO or O*Net significantly limits the generalization potential of this strategy. Zhang et al. <ref type="bibr" target="#b7">[8]</ref> extract soft and hard skills from job posting descriptions, showing that domain-specific pre-training significantly enhances performance in skills and knowledge extraction. Javed et al. <ref type="bibr" target="#b2">[3]</ref> introduce a semi-supervised machine learning approach that utilizes hierarchical classifiers and the O*NET Standard Occupational Classification (SOC) taxonomy for the classification of online recruitment data. Similarly, Wang et al. <ref type="bibr" target="#b8">[9]</ref> propose a model based on multi-stream convolutional neural networks, aiming to classify noisy user-generated job titles by considering different elements such as characters and words within job titles. Yamashita et al. <ref type="bibr" target="#b9">[10]</ref> and Zbib et al. <ref type="bibr" target="#b0">[1]</ref> conduct studies on the classification of job titles, focusing on job title alignment and job similarity training, respectively. JobBERT, introduced by Decorte et al. <ref type="bibr" target="#b1">[2]</ref>, classifies job titles against the ESCO taxonomy, treating the task as a semantic text similarity (STS) exercise. In particular, JobBERT emphasizes the understanding of the semantics of job titles through the skills inferred from the associated vacancies and descriptions, thus alleviating the need for an extensive labeled dataset or a continuously updated list of standardized titles. Before the recent proposals <ref type="bibr" target="#b10">[11]</ref> and <ref type="bibr" target="#b11">[12]</ref>, JobBERT used to be referenced as the state-of-the-art baseline. In general, all of these works draw upon some of the information encoded in the ESCO taxonomy. However, none of them uses detailed descriptions of ESCO occupations, as we propose.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">The Model</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">The Basics</head><p>The proposed model is based on the notion of distinctiveness, which specifies the difference between the prompt concept 𝜃* and other concepts within the conceptual space Θ <ref type="bibr" target="#b12">[13]</ref>. The notion is crucial for distinguishing in-context learning concepts that are aimed to be learned by analogy. 𝜃* acts as a latent parameter in a Hidden Markov Model that defines a distribution over observed tokens, represented by selected ESCO job titles as labels. As proposed by Xie et al. <ref type="bibr" target="#b12">[13]</ref>, the error of the in-context predictor approaches optimality under the condition that 𝜃* is distinguishable from other concepts in Θ ∖ {𝜃*}. When RAG is adapted as a few-shot reasoning (or in-context learning) framework for job posting classification <ref type="bibr" target="#b13">[14]</ref>, 𝜃* is represented by the top-selected ESCO labels and ensures that the LLM can effectively differentiate between closely related job categories.</p><p>The explanation-enriched prompts enhance the LLM's ability to learn more from each example. According to Xie et al. <ref type="bibr" target="#b12">[13]</ref>, the expected error decreases as the length and informational content of each example increase, contributing to the richness of the input-output mapping for a more robust in-context learning environment. This assumption is proven to be true under the condition of distinguishability of in-context examples and can be mathematically expressed as a reduction of the expected error 𝐸[𝜖], correlated with an increase in the information content 𝐼 of the examples:</p><formula xml:id="formula_0">𝐸[𝜖] ∝ 1 / 𝐼(𝑆𝑛, 𝑥test)<label>(1)</label></formula><p>The use of RAG also helps to avoid hallucination: when directly prompted with job postings, LLMs have been observed to sometimes produce non-existent labels <ref type="bibr" target="#b4">[5]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Design of the Model</head><p>The proposed model (see Figure <ref type="figure" target="#fig_0">1</ref>) uses multilingual class embeddings of the E5-large model <ref type="bibr" target="#b14">[15]</ref> to retrieve pertinent ESCO occupation definitions in English. The definitions serve as contextual information to prompt language models for selection of the most suitable job titles. To this end, we incorporate the DSPy library's Chain-of-Thought mechanism, <ref type="foot" target="#foot_0">3</ref> augmented by a hint to restrict the model output to a specified list of job titles. The signature used in this methodology (cf. Figure <ref type="figure" target="#fig_1">2</ref>) is inspired by <ref type="bibr" target="#b15">[16]</ref>.</p><p>To implement the RAG model, we initially established a vector database, <ref type="foot" target="#foot_1">4</ref> in which English ESCO occupation definitions were inserted as multilingual embedding vectors. Acknowledging the reported significance of chunking in many NLP applications, we conducted a series of ablation studies to determine the optimal chunk size. These studies revealed that subdividing the ESCO occupation definitions into smaller segments adversely affects the performance of vector-based similarity matching. Therefore, we opted for storing each of the 3,015 occupations represented in the ESCO taxonomy in its entirety.</p></div>
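<div xmlns="http://www.tei-c.org/ns/1.0"><p>For illustration, the following sketch shows how such an index could be built. It assumes the sentence-transformers release of intfloat/multilingual-e5-large and a local Chroma collection; the loader load_esco_occupations() is a hypothetical placeholder, and the snippet approximates our setup rather than reproducing its exact implementation.</p><p>
# Sketch of the retrieval index described above: one embedding per ESCO
# occupation definition (no chunking), stored in a Chroma collection.
# load_esco_occupations() is hypothetical.
import chromadb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/multilingual-e5-large")
client = chromadb.PersistentClient(path="esco_index")
collection = client.get_or_create_collection(name="esco_occupations")

occupations = load_esco_occupations()  # hypothetical: dicts with uri, title, description, skills
for occ in occupations:
    # E5 models expect a "passage: " prefix for documents and "query: " for queries.
    text = f"{occ['title']}. {occ['description']} Skills: {occ['skills']}"
    embedding = model.encode("passage: " + text, normalize_embeddings=True)
    collection.add(ids=[occ["uri"]], embeddings=[embedding.tolist()], documents=[text])

# At query time, a job posting is embedded with the "query: " prefix and the
# 30 most similar occupation definitions are retrieved as candidate labels.
posting = "Cercasi commessa per la vendita di attrezzature per telecomunicazioni ..."
query_emb = model.encode("query: " + posting, normalize_embeddings=True)
candidates = collection.query(query_embeddings=[query_emb.tolist()], n_results=30)
</p></div>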
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 1</head><p>Recall values for classification with E5-large text embeddings vector similarity.</p><p>Precision @ K: @5 = 0.4238; @10 = 0.9004; @30 = 0.9627; @40 = 0.9817</p><p>To accurately classify a given job posting with respect to the ESCO taxonomy, we include 30 ESCO occupation documents (i.e., 30 nodes of the taxonomy) into the LLM's context as potential job titles. The rationale for choosing 30 documents is that we aim to strike a balance between computational efficiency and the accuracy of the retrieved documents. The precision of the LLM would naturally decrease when it is presented with inaccurate labels. Although, as shown in Table <ref type="table">1</ref>, the precision of the model slightly increases with 40 documents in the context, we accepted this trade-off in favor of a lower VRAM requirement.</p><p>Upon the retrieval of the 30 ESCO occupations that are most closely aligned with a given job posting description, a composite prompt (see Figure <ref type="figure" target="#fig_1">2</ref>) is constructed as input to the LLM. The prompt integrates the actual text data encompassing job titles, descriptions, and skills pertinent to the selected occupations. The design of the simplified composite prompt aims to minimize bias by focusing only on the core elements. The prompt is then processed using a locally stored Llama-3 LLM<ref type="foot" target="#foot_2">5</ref> in an isolated environment<ref type="foot" target="#foot_3">6</ref>.</p><p>As a few-shot predictor, the LLM evaluates the composite prompt to accurately classify job postings by examining the semantic nuances of the selected ESCO occupations, aligning them with the actual job titles within the offers. To quantitatively assess the alignment between a job posting vector 𝐽 and each occupation embedding 𝐸ESCO derived from the ESCO taxonomy, cosine similarity 𝑎(𝐽, 𝐸ESCO) is used:</p><formula xml:id="formula_1">𝑎(𝐽, 𝐸ESCO) = (𝐽 • 𝐸ESCO) / (‖𝐽‖ ‖𝐸ESCO‖)<label>(2)</label></formula><p>The similarity scores yielded through 𝑎(𝐽, 𝐸ESCO) for each 𝐸ESCO facilitate the identification and selection of the ESCO occupation embeddings that are most pertinent to the job posting in question. Armed with this information, the LLM proceeds to classify the job posting by selecting the ESCO occupation that exhibits the highest degree of semantic and contextual relevance.</p><p>For a specific job posting 𝐽, an embedding function 𝐸 is employed, such that 𝐸(𝐽) produces the corresponding embedding for 𝐽. The degree of similarity between the job posting's embedding 𝐸(𝐽) and any ESCO occupation embedding 𝑒𝑖 from 𝐸ESCO (where 𝐸ESCO stands for the ensemble of occupation embeddings derived from the ESCO taxonomy) is determined through the similarity function 𝑆(𝐸(𝐽), 𝑒𝑖) (in our case cosine).</p><p>The similarity scores for each occupation embedding 𝑒𝑖 within 𝐸ESCO relative to 𝐸(𝐽) are computed. The ten class embeddings that exhibit the highest similarity to 𝐸(𝐽), denoted as 𝐸top, are selected. Formally, 𝐸top is defined as the subset {𝑒1, 𝑒2, . . . , 𝑒10} from 𝐸ESCO, where each 𝑒𝑖 is selected based on the top 10 similarity scores 𝑆(𝐸(𝐽), 𝑒𝑖).</p><p>The last stage entails a decision-making process enacted by the Llama-3 LLM, represented by the function 𝐷. This function accepts the composite prompt, including the candidates {𝑒1, 𝑒2, . . . , 𝑒10} accumulated in 𝐸top, and the job posting 𝐽, to render the final selected occupation embedding. 
The chosen occupation embedding 𝑒* is determined by 𝑒* = 𝐷(𝐸top, 𝐽), representing the ESCO occupation best matched by the model.</p><p>The entire algorithm can be presented by the following equation, which encapsulates the embedding generation, similarity assessment, and decision-making process by the LLM, culminating in the selection of the most suitable ESCO occupation embedding 𝑒* for the given job posting description.</p><formula xml:id="formula_2">𝑒* = 𝐷({𝑒1, 𝑒2, . . . , 𝑒𝑘 | 𝑒𝑖 ∈ 𝐸ESCO; top 𝑘 by 𝑆(𝐸(𝐽), 𝑒𝑖)}, 𝐽)<label>(3)</label></formula></div>
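<div xmlns="http://www.tei-c.org/ns/1.0"><p>A minimal sketch of Equations (2) and (3) is given below; embed() and llm_decide() are hypothetical placeholders for the embedding model and the Llama-3 decision step, so the snippet illustrates the procedure rather than the exact implementation.</p><p>
# Sketch of Equations (2)-(3): rank ESCO occupation embeddings by cosine
# similarity to the posting embedding, then let an LLM pick among the top-k
# candidates. embed(), esco_embeddings and llm_decide() are hypothetical.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_candidates(posting_text, esco_embeddings, k=10):
    """esco_embeddings: dict mapping an ESCO job title to its embedding vector."""
    j = embed(posting_text)  # hypothetical embedding function E(J)
    scored = sorted(
        esco_embeddings.items(),
        key=lambda item: cosine_similarity(j, item[1]),
        reverse=True,
    )
    return [title for title, _ in scored[:k]]

def classify_posting(posting_text, esco_embeddings, esco_definitions, k=10):
    candidates = top_k_candidates(posting_text, esco_embeddings, k)
    # Compose the prompt from the candidate titles, definitions and skills,
    # and delegate the final choice e* = D(E_top, J) to the LLM.
    context = "\n".join(f"{t}: {esco_definitions[t]}" for t in candidates)
    return llm_decide(posting_text, context, allowed_labels=candidates)  # hypothetical LLM call
</p></div>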
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiments</head><p>To evaluate the effectiveness of the proposed model in handling multilingual job postings, experiments were conducted separately on Italian and Spanish datasets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Test dataset</head><p>To have a reliable test dataset, we use three high-performing LLMs as initial annotators of 100 real-world Italian and 100 Spanish job postings with the most extensive descriptions from the InfoJobs<ref type="foot" target="#foot_4">7</ref> database. Non-informative elements such as company descriptions and promotional content were removed using a DSPy module (cf. Appendix E for the prompt), which employs zero-shot Llama-3 inference to anonymize sensitive information in job postings and candidate experiences. The preprocessed postings were annotated by the top three performing LLMs: GPT-4o<ref type="foot" target="#foot_5">8</ref>, Gemini 1.5 Pro<ref type="foot" target="#foot_6">9</ref>, and Claude 3.5 Sonnet<ref type="foot" target="#foot_7">10</ref>, according to LmSys Arena <ref type="bibr" target="#b16">[17]</ref>. In this context, the ESCO job titles are presented to each model separately, each model is asked to select the appropriate job titles, and we then measure their level of agreement on these labels. The agreement between LLM models was assessed using Cohen's kappa coefficient <ref type="bibr" target="#b17">[18]</ref>. The average kappa score between Gemini and GPT-4o was found to be 0.6386, indicating a substantial level of agreement. The agreement between Gemini and Claude was lower, with an average kappa of 0.5798, suggesting a moderate level of agreement. Similarly, the kappa score between GPT-4o and Claude was 0.6497, also indicative of substantial agreement. Overall, the average kappa score across all "annotators" was 0.6227, reflecting a general trend towards substantial inter-annotator agreement among the models.</p><p>To establish ground truth labels, we incorporated a dual-layer labelling process. Although the test set consists of only 200 items, labeling them from scratch would be time-consuming due to the complexity of the ESCO taxonomy, which includes 3,015 distinct occupations. Human annotators would require extensive training to accurately navigate this taxonomy. Therefore, we first annotate the occupations automatically using LLMs and then have the initial annotations cross-examined by a human expert annotator. Since each data point was reviewed by one annotator only, inter-annotator agreement among human annotators was not quantified. Instead, we conducted an analysis to identify job titles that consistently showed agreement or disagreement across the three LLMs, where domain-specific professionals from InfoJobs reviewed label discrepancies. This analysis, detailed in Appendix C, suggests that certain occupations are inherently more challenging to classify, possibly due to overlapping skills or ambiguous descriptions. Furthermore, we repeated the experiments using ground truth labels where any two of the three automatic LLMs agreed on the label. The results showed alignment between the models' predictions and the automatic labeling process, indicating consistency with the patterns recognized by the automatic methods when there is partial agreement. A detailed analysis of this alignment can be found in Appendix D.</p></div>
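<div xmlns="http://www.tei-c.org/ns/1.0"><p>For illustration, the pairwise agreement scores could be computed as in the following sketch, which assumes scikit-learn's cohen_kappa_score; the per-posting labels shown are hypothetical placeholders, not our data.</p><p>
# Sketch of the pairwise agreement computation between the three LLM
# "annotators" using Cohen's kappa (scikit-learn). The label lists are
# hypothetical placeholders for the per-posting ESCO titles chosen by each model.
from itertools import combinations
from sklearn.metrics import cohen_kappa_score

annotations = {
    "gpt-4o": ["restaurant manager", "project manager", "deli worker"],
    "gemini-1.5-pro": ["restaurant manager", "project engineer", "deli worker"],
    "claude-3.5-sonnet": ["quick service restaurant team leader", "project manager", "deli worker"],
}

for (name_a, labels_a), (name_b, labels_b) in combinations(annotations.items(), 2):
    kappa = cohen_kappa_score(labels_a, labels_b)
    print(f"{name_a} vs {name_b}: kappa = {kappa:.4f}")
</p></div>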
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Baselines</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.1.">SkillGPT</head><p>SkillGPT <ref type="bibr" target="#b4">[5]</ref> has been introduced as a tool for skill extraction and classification that relies on vector similarity search against LLM-precomputed ESCO embeddings. The authors employ embeddings generated by an LLM, although they do not directly use the LLM to select among candidate embeddings. Instead, they rely on embedding similarity to assign the most closely related ESCO class to the job descriptions under consideration.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.2.">Zero-Shot Classification</head><p>By transforming the classification task into a Natural Language Inference (NLI) problem, any model pretrained on NLI tasks can be utilized as a text classifier without the need for fine-tuning, effectively achieving zero-shot text classification. This is particularly beneficial when we deal with classes unseen during training, making it a robust solution for a variety of text classification scenarios <ref type="bibr" target="#b18">[19]</ref>.</p><p>In the implementation that we use as a baseline, we utilize the BART-MNLI model <ref type="bibr" target="#b19">[20]</ref>, which showed high performance in summarization tasks and was further trained on the MNLI dataset <ref type="bibr" target="#b20">[21]</ref> for NLI; we leverage its capability to understand entailment relations for the classification of a given sequence into one of the specified categories. We also apply the same methodology with the Llama-3 model.</p></div>
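<div xmlns="http://www.tei-c.org/ns/1.0"><p>The following sketch illustrates this zero-shot setup with the Hugging Face zero-shot-classification pipeline. The model facebook/bart-large-mnli is used for illustration (the tables report a multilingual mBART-MNLI variant), and the posting and candidate titles are hypothetical.</p><p>
# Sketch of the zero-shot NLI baseline: each candidate ESCO job title is
# treated as an entailment hypothesis for the job posting.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

posting = "Presentazione e vendita di attrezzature per telecomunicazioni ai clienti ..."
candidate_titles = [
    "telecommunications equipment specialised seller",
    "shop assistant",
    "sales account manager",
]

result = classifier(posting, candidate_labels=candidate_titles, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
</p></div>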
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Model Optimization</head><p>To optimize LLMs with a minimal set of manually crafted examples, we use the DSPy library <ref type="bibr" target="#b21">[22]</ref>. We initialize the classifier module with a Llama-3 model and use a GPT-4o model as the teacher. Our optimization of the classification is aimed at achieving high F1 scores for each dataset individually. In each run, we use 10 labeled training examples and 30 labeled validation examples. We employ DSPy's BootstrapFewShot, configuring it to perform a maximum of 2 rounds with up to 8 bootstrapped demonstrations. We define a custom metric, the F1 score, to guide the bootstrapping process. For the optimization of the LLMs, we use data points that had high inter-agreement among the automatic methods and were reviewed by human annotators. We perform a validation/test split to ensure that the optimization did not bias the evaluation results.</p></div>
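<div xmlns="http://www.tei-c.org/ns/1.0"><p>A hedged sketch of this optimization step is shown below. The signature, field names, metric and training examples are hypothetical illustrations, and the snippet assumes a DSPy version that exposes the OllamaLocal and OpenAI clients and the BootstrapFewShot teleprompter; it approximates, rather than reproduces, our configuration.</p><p>
# Sketch: compile a Llama-3 student with BootstrapFewShot, using a GPT-4o
# teacher and an F1-based metric, with at most 2 rounds and 8 bootstrapped
# demonstrations. Field names and train_examples are hypothetical.
import dspy
from dspy.teleprompt import BootstrapFewShot

class ClassifyPosting(dspy.Signature):
    """Select the most suitable ESCO job titles for the job posting."""
    job_posting = dspy.InputField()
    candidate_titles = dspy.InputField(desc="retrieved ESCO titles with definitions")
    job_titles = dspy.OutputField(desc="chosen titles, restricted to the candidates")

student_lm = dspy.OllamaLocal(model="llama3")  # local Llama-3 served by Ollama
teacher_lm = dspy.OpenAI(model="gpt-4o")       # teacher model
dspy.settings.configure(lm=student_lm)

classifier = dspy.ChainOfThought(ClassifyPosting)

def f1_metric(example, prediction, trace=None):
    gold = set(example.job_titles)
    pred = set(t.strip() for t in prediction.job_titles.split(","))
    if not pred or not gold:
        return 0.0
    overlap = len(gold.intersection(pred))
    precision, recall = overlap / len(pred), overlap / len(gold)
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)

optimizer = BootstrapFewShot(
    metric=f1_metric,
    teacher_settings=dict(lm=teacher_lm),
    max_bootstrapped_demos=8,
    max_rounds=2,
)
# train_examples: a small list of labeled dspy.Example objects (hypothetical)
optimized_classifier = optimizer.compile(classifier, trainset=train_examples)
</p></div>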
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Outcome of the experiments</head><p>For the evaluation of the results of the experiments, we used the micro recall and micro precision metrics, which are suitable for our multi-class classification task. We report evaluation scores separately on the Spanish and Italian test sets. Tables 2 and 3 display the results on the Italian and Spanish datasets, respectively. The results indicate that prompting techniques outperform SkillGPT in both languages. Specifically, the optimized Llama-3-8b model with chain-of-thought (CoT) achieves the highest precision and recall at @5 for Italian, with values of 0.32 and 0.76, respectively, and for Spanish, with values of 0.28 and 0.72. This supports our assumption that optimization enhances performance. The multilingual E5-large model achieves the highest precision at @10 for Italian (0.19) and the highest recall at @10 for Spanish (0.92), underscoring the efficacy of embeddings in classification. This implies that semantically less similar labels can confuse models, whereas embeddings ensure higher recall accuracy, particularly in wider retrieval scenarios. Although both models exhibit similar precision, indicating comparable accuracy in their predictions, the optimized model's capacity to capture a broader range of relevant job titles ensures greater alignment with expert human preferences. This enhances the model's ability to make relevant job title suggestions, thereby improving the overall matching process.</p></div>
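<div xmlns="http://www.tei-c.org/ns/1.0"><p>For clarity, the sketch below shows one way the reported micro precision@k and recall@k could be computed for multi-label predictions; the data structures and values are hypothetical illustrations.</p><p>
# Sketch of micro-averaged precision@k and recall@k for multi-label job title
# predictions. predictions_at_k holds the ranked titles per posting and
# gold_labels the reference ESCO titles; both are hypothetical.
def micro_precision_recall_at_k(predictions_at_k, gold_labels, k):
    true_positives = 0
    predicted_total = 0
    gold_total = 0
    for posting_id, gold in gold_labels.items():
        predicted = set(predictions_at_k[posting_id][:k])
        gold = set(gold)
        true_positives += len(predicted.intersection(gold))
        predicted_total += len(predicted)
        gold_total += len(gold)
    precision = true_positives / predicted_total if predicted_total else 0.0
    recall = true_positives / gold_total if gold_total else 0.0
    return precision, recall

# Hypothetical example:
preds = {"post_1": ["project manager", "project engineer", "ict project manager"]}
gold = {"post_1": ["project manager"]}
print(micro_precision_recall_at_k(preds, gold, k=3))  # (0.333..., 1.0)
</p></div>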
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5.">Discussion</head><p>In Tables <ref type="table" target="#tab_0">2 and</ref> 3, we observe that the combined use of general text embeddings and language models significantly outperforms current classification techniques, which rely on language models specifically tailored to the field of the labour market, such as <ref type="bibr" target="#b11">[12]</ref>. We see that using vector similarity with the text embeddings created by the E5-large text embeddings model alone does not surpass the baseline. However, it is worth noting that the results are quite close, despite the fact that this model was not specifically fine-tuned on labour market data or adapted to the ESCO taxonomy, as is the case for <ref type="bibr" target="#b11">[12]</ref>. Furthermore, we can observe how text embeddings indeed provide a significant value for filtering the k occupations closest to a job posting within the taxonomy. Using these k occupations as input to various language models for few-shot classification significantly improves over the baselines. Table <ref type="table" target="#tab_3">6</ref> in the Appendix illustrates the decisions of the LLMs in the case of four sample job postings.</p><p>We also evaluated the effectiveness of a large language model for classification of job titles based on provided descriptions, as shown in Table <ref type="table" target="#tab_2">4</ref>, even when the correct titles were not explicitly listed among the initial ESCO job titles. The model's ability to select accurate titles reflects its functionality in processing and understanding the contextual and semantic aspects of the job descriptions. For instance, when presented with a job description focused on the management of comprehensive water and wastewater services, the model identified "Operations Manager" as the correct title. This identification was made despite the presence of several closely related but distinct labels (such as "Water treatment plant manager") within the pool of ESCO job titles. This indicates that the model's decisions are more influenced by a comprehensive understanding of the job responsibilities and sectors than by the mere presence of keywords or phrases in the ESCO job titles.</p><p>The model's capacity to differentiate between job titles with more specific definitions enhances its comprehension of job postings and assigned labels, thereby improving the precision of suggesting relevant skills. Upon integration into an operational job platform, this model will better understand the requirements of job postings and accurately assign job titles that align with the specific needs of companies. Similarly, in the context of parsing job candidate experiences, keywords tend to appear more frequently in semantically related ESCO definitions, enabling parsers to incorporate these keywords to enhance parsing performance.</p><p>Overall, we can thus state that the integration of class embeddings generated using the multilingual E5-large model, with subsequent application of few-shot classification techniques through LLMs, significantly improves the accuracy of job title classification, clearly surpassing the baselines.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.6.">Computational Cost of Compared Methods</head><p>In addition to evaluating performance metrics, we analyzed the computational cost and environmental impact of each method. The Llama-3-8b model, with 8 billion parameters, requires significant resources for inference, necessitating a GPU with at least 16 GB of VRAM (e.g., NVIDIA RTX 3090). Its average inference time per job posting is approximately 1.5 seconds, and its high energy consumption leads to increased CO2 emissions, making large-scale deployment less environmentally sustainable without optimizations.</p><p>In contrast, the mBART-large-mnli model has about 610 million parameters and operates on GPUs with 8 GB of VRAM, offering faster inference times under 0.5 seconds per job posting. The embeddings-based method using the multilingual E5-large model, with 330 million parameters, allows for precomputed embeddings and efficient CPU-based vector similarity searches, reducing inference time to less than 0.2 seconds per job posting. These smaller models consume less energy, providing more resource-efficient and eco-friendly alternatives suitable for production environments where computational cost and environmental impact are critical considerations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions and future work</head><p>In this paper, we argued that the use of multilingual embeddings in combination with LLMs significantly enhances our ability to distinguish between very similar (or even identical) job titles that suggest different skills and competencies. Our experiments have shown that this is indeed the case, demonstrating that the combination of multilingual text embedding similarity with Llama-3 markedly exceeds the performance of other leading approaches in the field.</p><p>In the future, we plan to apply the same approach to the analysis and classification of job candidate experiences. Once it is ensured that both job postings and candidate experiences can accurately be modeled using the embedded representation of the ESCO taxonomy, we plan to set the stage for a more direct and efficient alignment process between job postings and experiences of job seekers.</p><p>Another interesting direction for future research is to analyze the lexical overlap between English domain-specific terms that appear in Italian and Spanish job postings and the English occupation descriptions in the ESCO taxonomy. Such an analysis would reveal whether job types with higher lexical overlap affect model accuracy, providing deeper insights into the multilingual nature of the task.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Ablation Study</head><p>In our ablation study, we pursued two primary objectives. Firstly, we wanted to evaluate the model's comprehension of ESCO job titles and its decision-making process; to achieve this, we prompted the model to articulate its underlying rationale. Secondly, so far we have reported the performance of our model when Italian and Spanish data were matched against English job titles and occupations in the ESCO taxonomy; here, we wanted to explore whether its comprehension extends to data in different languages. We selected Spanish for this purpose and discovered that the model's understanding was consistent, irrespective of the language; see Table <ref type="table" target="#tab_2">4</ref>.</p><p>As illustrated in Figure <ref type="figure" target="#fig_2">3</ref>, the LLM showcases a comprehensive understanding of the task at hand, effectively narrowing down potential ESCO job titles to identify the most suitable label. Additionally, the LLM is observed to generate a novel job title, referred to as "fast food shift team leader". This can be attributed to the absence of constraints imposed on the LLM regarding structured output for classification, thereby granting it the autonomy to propose the most fitting job title. The analysis initially excludes broader or less related job titles such as "business manager", "hospitality revenue manager", and "accommodation manager", which are not specific to quick-service restaurant operations. Subsequently, the model considers and ultimately selects titles that emphasize leadership within this specific restaurant context, narrowing down to "quick service restaurant team leader" and "fast food shift team leader" as the most apt job titles. The model's reasoning in choosing these titles is correct, as they precisely reflect the managerial and leadership responsibilities pertinent to the restaurant environment.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Job postings and Predicted ESCO job titles</head><p>The following tables provide examples of job titles, job posting descriptions, and the corresponding gold labels in Table <ref type="table">5</ref> and optimized LLama-3 job titles in Table <ref type="table" target="#tab_3">6</ref>. These examples illustrate how the job titles assigned by recruiters may not always capture the specific nature of the job described in the postings. The gold labels and the optimized LLama-3 job titles offer a more accurate representation of the job roles based on the detailed job descriptions. The job title "Commessa" (Salesperson) is generic and does not specify the specialization required for the job. The gold label "telecommunications equipment specialised seller" fits better because the job description clearly focuses on selling telecommunications equipment, which requires specific knowledge and skills related to this type of product. The gold label accurately reflects the specialized nature of the role. The job title "Project engineer" given by the recruiter suggests a technical and</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>engineering-focused role. However, the job description emphasizes project management, coordination of project activities, support to product development, and supervision of the technical team. The gold label "project manager" fits better as it captures the overall management and coordination responsibilities described, which are more aligned with the duties of a project manager than just a project engineer. The job title "Addetto alle vendite" (Sales Assistant) is too generic and does not capture the specialized nature of the role described in the vacancy. The description specifies duties typical of a deli worker, such as serving customers, slicing cheeses and cured meats, preparing packages, and managing the deli counter. Our model's titles "meat and meat products specialised seller" and "deli worker" are more precise, indicating a specialized role in food handling and customer service, which goes beyond the general sales assistant title. This demonstrates our model's ability to interpret the specific context and responsibilities of the job accurately.</p><p>The job title "IT Specialist" is generic and could encompass various IT roles. However, the job description clearly indicates responsibilities such as managing ICT projects and providing technical support. The optimized titles "ICT project manager" and "software development manager" are more accurate as they reflect the leadership, coordination, and project management aspects of the role, which go beyond the scope of a general IT specialist.</p><p>The job title "Sales Manager" suggests a mid-level management role. However, the job description highlights responsibilities such as business development, defining sales strategies, managing the sales team, monitoring performance, and managing relationships with key clients and strategic partners. These responsibilities are more aligned with a higher-level role such as "business development manager" or "sales director", which involve strategic planning and high-level management.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Spanish Job Posting Example (Table 4)</head><p>Gold label job title: Quick Service Restaurant Team Leader. Posting job title: Encargado de Franquicias.</p><p>Posting description: -Responsable de garantizar la satisfacción de los huéspedes y de gestionar y superar los objetivos financieros y operativos de los restaurantes a mi cargo. -Garantizar una excelente atención a los huéspedes en base a las promesas y estándares definidos. -Liderar, motivar y desarrollar equipos. -Facilitar los recursos y el apoyo necesario a los equipos en sus restaurantes. -Utilizar de manera eficaz los diferentes recursos de la Compañía. -Identificar oportunidades y amenazas de negocio en el mercado. -Aportar ideas y ejecutando proyectos en el corto y medio plazo. -Difundir las mejores practicas y resolver problemas comunes en los restaurantes. -Cumplir los protocolos y políticas de la Marca y la Compañía. -Garantizar y difundir los valores y principios definidos por la Compañía.</p><p>Skills: SAP Girnet Gtock, Cuiner.</p><p>ESCO job titles (retrieved candidates): Restaurant Manager, Business Manager, Hospitality Revenue Manager, Accommodation Manager, Delicatessen Shop Manager, Rooms Division Manager, Customer Experience Manager, Quick Service Restaurant Team Leader, Destination Manager, Membership Manager.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Ambiguity from Specialized and Contextual Factors</head><p>To further understand the complexity of job classification in a multilingual context, we conducted an ablation study focusing on cases where both human annotators and LLMs demonstrated shared uncertainty in assigning definitive labels. These cases were particularly challenging due to specialized terminology, regional language variations, or overlapping responsibilities within job postings. Table <ref type="table" target="#tab_4">7</ref> highlights key examples where annotators, despite their recruitment expertise, aligned with the LLMs in experiencing ambiguity.</p><p>As presented in Table <ref type="table" target="#tab_4">7</ref>, each example illustrates specific challenges encountered in classifying job postings across multilingual and sector-specific contexts. The Junior Project Manager job posting, for instance, combines general project management with specialized tasks such as machine vision, but without enough specific context, it is unclear whether the focus should be on technical expertise or managerial skills. The Project Engineer example shows the impact of technical terminology and sector-specific language on classification. Terms such as "SCADA" and "Modbus TCP" are common in international engineering contexts but may not align with the typical understanding of recruiters, leading to the selection of varied labels by both LLMs and annotators. The example of the Assistente Amministrativo with a legal and fiscal focus involves highly specialized processes such as "Registro Nazionale delle Varietà Vegetali" and complex fiscal duties like "Dichiarazioni IRAP". These terms relate to specific Italian governmental and regulatory compliance requirements, which could exceed the annotators' typical recruitment experience, thus resulting in generalized labels that do not fully capture the compliance and accounting complexity.</p><p>These cases emphasize that job postings, as human-created documents, often do not provide enough context for a definitive classification, resulting in ambiguity across specialized and regional terms.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. Analysis of Model Alignment with Partial Agreement Ground Truth Labels</head><p>In our evaluation, we established two levels of ground truth labels: gold and silver. Gold labels represent unanimous agreement among all three annotators (GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet), validated by human experts. Silver labels indicate a strong majority consensus, assigned when any two annotators agree, even if the third disagrees.</p><p>We assessed our model's performance on both silver and gold labels to understand its effectiveness under different levels of agreement. Results for gold labels were reported in Tables <ref type="table" target="#tab_1">2 and 3</ref>; results for silver labels are presented in Table <ref type="table" target="#tab_5">8</ref>. For the Spanish dataset, the model's performance was relatively consistent between silver and gold labels, with only minor variations in precision and recall. This consistency suggests that the model robustly captures underlying patterns in the job postings, regardless of labeling strictness.</p><p>In contrast, the Italian dataset exhibited more significant differences between performances on silver and gold labels. For example, in some cases, the precision was higher for silver labels while recall was higher for gold labels. This disparity may indicate that the model better captures broader classifications aligning with majority consensus in Italian but struggles with the stricter criteria required for unanimous agreement.</p><p>An interesting observation is that optimization using gold label ground truth data had a negative effect on the models' scores derived from silver labels. This could be explained by the fact that during optimization, the language models became more attuned to the patterns present in the gold labels, potentially diverging from those in the silver labels. As a result, the models may have become less effective at predicting labels where only partial agreement (silver labels) was present among the automatic methods.</p></div>
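<div xmlns="http://www.tei-c.org/ns/1.0"><p>The silver-label rule can be illustrated with the following sketch, in which a label is accepted when at least two of the three LLM annotators propose it; the annotation sets shown are hypothetical.</p><p>
# Sketch of the silver/gold label derivation described above: silver labels
# require agreement of at least two LLM annotators, gold labels require all
# three. The annotation sets are hypothetical illustrations.
from collections import Counter

def consensus_labels(annotations_per_model, min_votes=2):
    """annotations_per_model: list of label sets, one per LLM annotator."""
    votes = Counter(label for labels in annotations_per_model for label in set(labels))
    return {label for label, count in votes.items() if count >= min_votes}

llm_annotations = [
    {"operations manager"},                       # e.g., GPT-4o
    {"operations manager", "utilities manager"},  # e.g., Gemini 1.5 Pro
    {"water treatment plant manager"},            # e.g., Claude 3.5 Sonnet
]
silver = consensus_labels(llm_annotations, min_votes=2)  # {"operations manager"}
gold = consensus_labels(llm_annotations, min_votes=3)    # empty set here: no unanimous label
</p></div>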
<div xmlns="http://www.tei-c.org/ns/1.0"><head>E. DSPy Signature</head><p>We utilize DSPy signatures to prompt large language models (LLMs) for performing downstream tasks. The signature was optimized through recursive LLM calls, and its final form is based on empirical observations.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Model Architecture</figDesc><graphic coords="3,89.29,84.19,208.35,101.17" type="bitmap" /></figure>
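<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration, the following sketch mirrors the pre-processing step described above with a DSPy signature; the field names are hypothetical and it is not the exact signature shown in Figure 4.</p><p>
# Hedged sketch of a pre-processing signature in the spirit of Appendix E:
# a zero-shot predictor that strips promotional content and anonymizes
# sensitive information. Field names are hypothetical.
import dspy

class PreprocessPosting(dspy.Signature):
    """Remove company descriptions, promotional content and personal data
    from the job posting, keeping only the job-relevant text."""
    raw_posting = dspy.InputField(desc="original job posting text")
    cleaned_posting = dspy.OutputField(desc="anonymized, job-relevant text only")

# Assumed client name for a local Llama-3 served by Ollama.
dspy.settings.configure(lm=dspy.OllamaLocal(model="llama3"))
preprocess = dspy.Predict(PreprocessPosting)
cleaned = preprocess(raw_posting="Somos una empresa lider ... Buscamos tecnico de redes ...").cleaned_posting
</p></div>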
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Prompt Template</figDesc><graphic coords="3,302.62,84.19,203.36,189.37" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: LLM's Rationale</figDesc><graphic coords="8,89.29,84.18,416.70,234.39" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Pre-processing Signature</figDesc><graphic coords="11,328.04,488.12,152.52,162.15" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 2</head><label>2</label><figDesc>Italian Performance Metrics for Top 5 and Top 10 Predictions</figDesc><table><row><cell>Model</cell><cell cols="2">Precision</cell><cell cols="2">Recall</cell></row><row><cell></cell><cell>@5</cell><cell>@10</cell><cell>@5</cell><cell>@10</cell></row><row><cell>llama-3-8b (CoT opt.)</cell><cell>0.32</cell><cell>0.13</cell><cell>0.76</cell><cell>0.80</cell></row><row><cell>llama-3-8b (CoT)</cell><cell>0.26</cell><cell>0.12</cell><cell>0.62</cell><cell>0.64</cell></row><row><cell>llama-3-8b (SkillGPT)</cell><cell>0.19</cell><cell>0.19</cell><cell>0.36</cell><cell>0.82</cell></row><row><cell>mBart-large-mnli (0-shot)</cell><cell>0.13</cell><cell>0.12</cell><cell>0.29</cell><cell>0.58</cell></row><row><cell>multilingual-e5-large</cell><cell>0.16</cell><cell>0.19</cell><cell>0.36</cell><cell>0.88</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 3</head><label>3</label><figDesc>Spanish Performance Metrics for Top 5 and Top 10 Predictions</figDesc><table><row><cell>Model</cell><cell cols="2">Precision</cell><cell cols="2">Recall</cell></row><row><cell></cell><cell>@5</cell><cell>@10</cell><cell>@5</cell><cell>@10</cell></row><row><cell>llama-3-8b (CoT opt.)</cell><cell>0.28</cell><cell>0.20</cell><cell>0.72</cell><cell>0.90</cell></row><row><cell>llama-3-8b (CoT)</cell><cell>0.26</cell><cell>0.16</cell><cell>0.64</cell><cell>0.68</cell></row><row><cell>llama-3-8b (SkillGPT)</cell><cell>0.09</cell><cell>0.12</cell><cell>0.36</cell><cell>0.62</cell></row><row><cell>mBart-large-mnli (0-shot)</cell><cell>0.15</cell><cell>0.14</cell><cell>0.39</cell><cell>0.70</cell></row><row><cell>multilingual-e5-large</cell><cell>0.20</cell><cell>0.19</cell><cell>0.48</cell><cell>0.92</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 5</head><label>5</label><figDesc>Examples of Job Titles, Descriptions, and Gold Labels</figDesc><table><row><cell>Posting Job Title</cell><cell>Job Posting Description</cell><cell>Gold Labels</cell></row><row><cell>Commessa</cell><cell>Commessa; Commessa; -Presentazione e vendita di attrezzature per telecomunicazioni ai clienti; -Servizio e supporto clienti; -Gestione delle transazioni di vendita; -Gestione dello stock e dell'inventario.</cell><cell>Telecommunications equipment specialised seller</cell></row><row><cell>Project Engineer</cell><cell>Project Engineer; Project Engineer; PROJECT MANAGER / PROJECT ENGINEER Divisione: Amministrazione Tecnica -Coordinamento delle attività di gestione progetti in ambito tecnico; -Supporto al Product Development; -Pianificazione e monitoraggio delle attività progettuali; -Supervisione del team tecnico; -Assistenza alla gestione dei fornitori e del budget di progetto.</cell><cell>Project manager, Product development manager</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 6</head><label>6</label><figDesc>Examples of Job Titles, Descriptions, and Optimized Job Titles</figDesc><table><row><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 7</head><label>7</label><figDesc>Examples of Job Postings with Ambiguous Classification due to Multilingual and Contextual Challenges</figDesc><table><row><cell>Job Title</cell><cell>Description Excerpt</cell><cell>Labels Suggested</cell></row><row><cell>Junior Project Manager</cell><cell>Applicare i metodi e gli strumenti propri del Project Management a commesse specifiche per il settore dell'automazione industriale, di cui l'azienda fornisce sistemi di visione artificiale.</cell><cell>Project Manager, ICT Project Manager, Programme Manager</cell></row><row><cell>Assistente Amministrativo (Healthcare)</cell><cell>Gestione dei flussi delle segnalazioni dei cittadini per prenotazioni vaccinazioni e assistenza pandemica, inclusa la verifica del "certificato verde" per la conformità alle normative sanitarie.</cell><cell>Healthcare Assistant, Administrative Assistant, Contact Tracing Agent</cell></row><row><cell>Commesso di Negozio (Retail)</cell><cell>Creazione di vetrine accattivanti con abbinamenti di tendenza e assistenza alla clientela nella scelta dei prodotti.</cell><cell>Shop Assistant, Sales Assistant, Visual Merchandiser</cell></row><row><cell>Team Leader (Energy Sector)</cell><cell>Predisposizione documenti formativi e aggiornamento processi operativi presso sede Enel, inclusa l'implementazione e il collaudo di software per la gestione energetica.</cell><cell>Team Leader, Energy Analyst, Business Process Analyst</cell></row><row><cell>Assistente Amministrativo (Legal and Fiscal)</cell><cell>Compiti legati al Registro Nazionale delle Varietà Vegetali e mansioni fiscali complesse come Dichiarazioni IRAP.</cell><cell>Accounting Assistant, Administrative Assistant, Compliance Officer</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 8</head><label>8</label><figDesc>Performance Metrics for Top 5 and Top 10 Predictions (silver labels)</figDesc><table><row><cell>Model</cell><cell cols="2">Precision</cell><cell cols="2">Recall</cell></row><row><cell></cell><cell>@5</cell><cell>@10</cell><cell>@5</cell><cell>@10</cell></row><row><cell cols="5">Spanish (SPA)</cell></row><row><cell>llama-3-8b (CoT opt.)</cell><cell>0.12</cell><cell>0.06</cell><cell>0.58</cell><cell>0.62</cell></row><row><cell>llama-3-8b (CoT)</cell><cell>0.22</cell><cell>0.16</cell><cell>0.64</cell><cell>0.68</cell></row><row><cell>llama-3-8b (SkillGPT)</cell><cell>0.19</cell><cell>0.12</cell><cell>0.36</cell><cell>0.62</cell></row><row><cell>mBart-large-mnli (0-shot)</cell><cell>0.15</cell><cell>0.14</cell><cell>0.39</cell><cell>0.70</cell></row><row><cell>multilingual-e5-large</cell><cell>0.20</cell><cell>0.19</cell><cell>0.48</cell><cell>0.92</cell></row><row><cell cols="5">Italian (ITA)</cell></row><row><cell>llama-3-8b (CoT opt.)</cell><cell>0.12</cell><cell>0.06</cell><cell>0.56</cell><cell>0.60</cell></row><row><cell>llama-3-8b (CoT)</cell><cell>0.23</cell><cell>0.07</cell><cell>0.55</cell><cell>0.59</cell></row><row><cell>llama-3-8b (SkillGPT)</cell><cell>0.22</cell><cell>0.06</cell><cell>0.53</cell><cell>0.59</cell></row><row><cell>mBart-large-mnli (0-shot)</cell><cell>0.27</cell><cell>0.06</cell><cell>0.31</cell><cell>0.58</cell></row><row><cell>multilingual-e5-large</cell><cell>0.35</cell><cell>0.08</cell><cell>0.39</cell><cell>0.79</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_0">https://github.com/stanfordnlp/dspy</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_1">https://www.trychroma.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_2">https://llama.meta.com/llama3/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_3">We use dockerized models from the open-source Ollama library https://ollama.com/ for all experiments</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_4">https://www.infojobs.net/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_5">https://openai.com/index/hello-gpt-4o/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_6">https://deepmind.google/technologies/gemini/pro/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_7">https://www.anthropic.com/news/claude-3-5-sonnet</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Learning job titles similarity from noisy skill labels</title>
		<author>
			<persName><forename type="first">R</forename><surname>Zbib</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">L</forename><surname>Alvarez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Retyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Poves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Aizpuru</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Fabregat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Šimkus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">G</forename><surname>Casademont</surname></persName>
		</author>
		<idno>ArXiv abs/2207.00494</idno>
		<ptr target="https://api.semanticscholar.org/CorpusID:250243975" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><forename type="first">J.-J</forename><surname>Decorte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">V</forename><surname>Hautte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Demeester</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Develder</surname></persName>
		</author>
		<idno>ArXiv abs/2109.09605</idno>
		<ptr target="https://api.semanticscholar.org/CorpusID:237572142" />
		<title level="m">JobBERT: Understanding job titles through skills</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><surname>Javed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mcnair</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Jacob</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhao</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1606.00917</idno>
		<title level="m">Towards a job title classification system</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Language models are few-shot learners</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">B</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ryder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Subbiah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kaplan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dhariwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Neelakantan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shyam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sastry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Askell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Herbert-Voss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Krueger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Henighan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Child</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ramesh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Ziegler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Winter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Hesse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sigler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Litwin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chess</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Clark</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Berner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mccandlish</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Amodei</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2005.14165</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Kang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">D</forename><surname>Bie</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2304.11060</idno>
		<title level="m">Skillgpt: a restful api service for skill extraction and standardization using a large language model</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Salience and market-aware skill extraction for job targeting</title>
		<author>
			<persName><forename type="first">B</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>He</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2005.13094</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Deep job understanding at linkedin</title>
		<author>
			<persName><forename type="first">S</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Shi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>He</surname></persName>
		</author>
		<idno type="DOI">10.1145/3397271.3401403</idno>
		<ptr target="http://dx.doi.org/10.1145/3397271.3401403" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval</title>
				<meeting>the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">N</forename><surname>Jensen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">D</forename><surname>Sonniks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Plank</surname></persName>
		</author>
		<idno>ArXiv abs/2204.12811</idno>
		<ptr target="https://api.semanticscholar.org/CorpusID:248405777" />
		<title level="m">Skillspan: Hard and soft skill extraction from english job postings</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Deepcarotene -job title classification with multi-stream convolutional neural network</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Abdelfatah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Korayem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Balaji</surname></persName>
		</author>
		<idno type="DOI">10.1109/BigData47090.2019.9005673</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="1953" to="1961" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Yamashita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">T</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ekhtiari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Tran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lee</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2202.10739</idno>
		<title level="m">James: Job title mapping with multi-aspect embeddings and reasoning</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Escoxlmr: Multilingual taxonomy-driven pre-training for the job market domain</title>
		<author>
			<persName><forename type="first">M</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Van Der Goot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Plank</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:258832782" />
	</analytic>
	<monogr>
		<title level="m">Annual Meeting of the Association for Computational Linguistics</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Job offer and applicant cv classification using rich information from a labour market taxonomy</title>
		<author>
			<persName><forename type="first">H</forename><surname>Kavas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Serra-Vidal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wanner</surname></persName>
		</author>
		<idno type="DOI">10.2139/ssrn.4519766</idno>
	</analytic>
	<monogr>
		<title level="j">SSRN Electronic Journal</title>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">An explanation of in-context learning as implicit bayesian inference</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Raghunathan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Liang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ma</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2111.02080</idno>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xiong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Jia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2312.10997</idno>
		<title level="m">Retrieval-augmented generation for large language models: A survey</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Long</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2308.03281</idno>
		<idno type="arXiv">arXiv:2308.03281</idno>
		<title level="m">Towards General Text Embeddings with Multi-stage Contrastive Learning</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>Oosterlinck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Khattab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Remy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Demeester</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Develder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Potts</surname></persName>
		</author>
		<idno>ArXiv abs/2401.12178</idno>
		<ptr target="https://api.semanticscholar.org/CorpusID:267068618" />
		<title level="m">In-context learning for extreme multi-label classification</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<author>
			<persName><forename type="first">W.-L</forename><surname>Chiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Angelopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jordan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E</forename><surname>Gonzalez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Stoica</surname></persName>
		</author>
		<idno>ArXiv abs/2403.04132</idno>
		<ptr target="https://api.semanticscholar.org/CorpusID:268264163" />
		<title level="m">Chatbot arena: An open platform for evaluating llms by human preference</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">A coefficient of agreement for nominal scales</title>
		<author>
			<persName><forename type="first">J</forename><surname>Cohen</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:15926286" />
	</analytic>
	<monogr>
		<title level="j">Educational and Psychological Measurement</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="page" from="37" to="46" />
			<date type="published" when="1960">1960</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Minaee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kalchbrenner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Cambria</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Nikzad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chenaghlu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<idno>CoRR abs/2004.03705</idno>
		<ptr target="https://arxiv.org/abs/2004.03705.arXiv:2004.03705" />
		<title level="m">Deep learning based text classification: A comprehensive review</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Shu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xu</surname></persName>
		</author>
		<idno>ArXiv abs/2202.01924</idno>
		<title level="m">Zero-shot aspectbased sentiment analysis</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">A broadcoverage challenge corpus for sentence understanding through inference</title>
		<author>
			<persName><forename type="first">A</forename><surname>Williams</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Nangia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bowman</surname></persName>
		</author>
		<ptr target="http://aclweb.org/anthology/N18-1101" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long Papers</title>
		<meeting>the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1112" to="1122" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">O</forename><surname>Khattab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Singhvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Maheshwari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Santhanam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Vardhamanan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Haq</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">T</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Moazam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zaharia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Potts</surname></persName>
		</author>
		<idno>ArXiv abs/2310.03714</idno>
		<ptr target="https://api.semanticscholar.org/CorpusID:263671701" />
		<title level="m">Dspy: Compiling declarative language model calls into self-improving pipelines</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
