
Using vector representations for matching tasks to skills

Miriam Amin

Jan-Peter Bergmann

Yuri Campbell

Fraunhofer Center for International Management and Knowledge Economy (IMW), Neumarkt 9-19, 04109 Leipzig, Germany

Science, Technology and Innovation (ST&I) companies as well as large research organizations repeatedly face the problem of matching an emerging task with the appropriate skill that is present somewhere in an organizational unit. Many organizations already have skill or competence taxonomies that can be useful in this regard. In this working paper, we present our experiments on automatically recommending suitable skills from the internal skill taxonomy of the Fraunhofer Society research organization to incoming research requests in order to support human decision-making processes. To this end, we applied three different vector-based approaches: one based on language models, one on word embeddings and one on a simple one-hot encoding of keywords. Our results show that the language-model-based approach outperforms the other methods and is able to recommend skills to research requests with an MAP of 0.82. These first findings pave the way for further improvements of our method and for the transfer to other related problems.

Keywords: Recommender Systems, Knowledge Management, Skill Taxonomy, Competence Taxonomy, Task-Skill Matching

1. Introduction

Table 2: Research request

We are searching for a solution to link a smart metering system of high-resolution electricity, gas and heat data with our intelligent cloud solution. In the cloud, we want to automatically process the data using machine learning to check for consistency and completeness and to enable load forecasts and cost optimization. We are also looking for the joint development of innovative business models.

2. Methods

2.1. Data and Preprocessing

The Fraunhofer Society combines a wide variety of specialized institutes under one umbrella. To handle the variety of skills contained in the different institutes, the Fraunhofer Society developed an overview of its existing competences as well as prospective ones. It is planned that employees will be able to subscribe to the skills and topics that interest them, i.e. skills are not automatically assigned to employees. Based on their individual selections, employees can then receive relevant messages and notifications about incoming research requests. The skills are hierarchically structured in a tree-like taxonomy with four levels: the root, the scientific disciplines (first level), their research fields (second level) and, finally, the skills as the leaves of this skill tree.

The entire dataset includes approximately 1,000 skills that are written in German, in English or in a mix of both. The descriptions of disciplines and research fields have a similar language composition. That means that even when a leaf is described in English, such as Machine Learning, its research field Künstliche Intelligenz can be written in German, and vice versa. In order to give more contextual information to single skills, we concatenate skill, research field and scientific discipline to build one textual representation for every skill. These preprocessed skills have an average length of 128 characters. Table 1 shows an example of a skill hierarchy and the concatenated skill string. In this specific case, all levels are in English.

Research requests, on the other hand, are short texts of approximately 1,112 characters in length. Since they come from different authors, they are very diverse both structurally and stylistically. They also cover a large variety of research fields and can be German or English, but are mainly German. Our experimental corpus of research requests comprises approximately 100 documents. Table 2 shows an example of such a research request.

In order to support the matching between research requests and in-house skills in large organizations, we propose a vector-based approach, which draws from recent Transfer Learning advances in Natural Language Processing. Firstly, we represent the skills in the taxonomy with a vector model. Then, with the same vector representation approach, we transform the requests and project them into the same vector space. Finally, every research request acts as a query for which we retrieve matching documents. In this Information Retrieval setting, we return the k closest skill vectors to a specific query vector as matches for that request.

In this framework, we test three distinct approaches to create useful vector representations for the task at hand: Keyword-Binarizer (KB), Keyword-Embedding (KE) and Language Model (LM). In the KB approach, we extract keywords from the text descriptions of skills and requests using the keyword extraction algorithm YAKE! [5]; then a binary vector is constructed in a one-hot-encoding manner for all skills and requests. It is important to note here that YAKE! extracts keywords as well as keyphrases (combinations of two or more words). From now on, we will refer to both as keywords only.

In the KE vector model, the texts undergo the same keyword extraction procedure as in KB. However, the final step for the construction of the vector representation is different. Here, given a skill or a request, we create the corresponding vector representation by averaging the Word2Vec embeddings of the keywords belonging to that skill or request. We use Word2Vec word embeddings which were trained by Deepset¹ on the whole German Wikipedia corpus. In cases where the vector representation for a specific word is not found in the embedding dictionary, we apply compound splitting and attempt a vector retrieval for the resulting components. This procedure is especially useful for German, since many German words have a compositional structure, for example Forschungsprojekt = Forschung (research) + Projekt (project). Words for which no representation can be found receive a 0-vector, which practically cancels any impact they might have on the average representation.

Finally, in the LM approach, we use a multilingual language model which is fine-tuned on the task of semantic similarity. More precisely, we use the model paraphrase-multilingual-mpnet-v2, provided by Sentence-Transformers². This model is suitable for creating vector representations of sentences and paragraphs for information retrieval, clustering or sentence similarity tasks³. The model paraphrase-multilingual-mpnet-v2 is the multilingual version of the original model all-mpnet-base-v2 and is trained via multilingual knowledge distillation [6]. In other words, a smaller multilingual model, in this case XLM-RoBERTa [7], is used as the student model, while a bigger monolingual MPNET [8] model is used to guide the multilingual vector representations of translated pairs by means of a double mean squared error loss on the generated representations for the multilingual training pair. The pre-trained monolingual teacher model MPNET was fine-tuned with an SBERT-like objective [9] on more than 1 billion pairs of sentences/paragraphs⁴. The pre-training objective of the teacher model is a usual contrastive learning objective. That means that, for a given pair of sentences, paragraphs or sentence and paragraph, the model predicts which pairs, out of a set of randomly constructed pairs with at least one component of the original pair, were actually paired in the billion-pair dataset. In our use case, just the trained student model is used in order to create multilingual vector representations for skills and requests. Both require no further pre-processing steps, as the XLM-RoBERTa model has SentencePiece as its base tokenizer and was pre-trained on many languages, among them English and German.

¹ https://www.deepset.ai/german-word-embeddings
² https://www.sbert.net/docs/pretrained_models.html
³ https://huggingface.co/sentence-transformers/all-mpnet-base-v2
⁴ https://huggingface.co/sentence-transformers/all-mpnet-base-v2

2.3. Validation

In order to validate the three approaches described in the preceding section, we took two different samples of the request corpus, retrieved the top five skill recommendations from each method and assessed their relevance.

For the experiments at hand, we needed to conduct the relevance assessment manually. In the near future, however, a completely expert-labeled ground truth dataset will be at hand, recording all relevant skills for each request. We labeled a request-skill pair with the relevance value '2' when the skill was completely relevant for the request, '1' when it was not completely relevant but also not irrelevant, and '0' when it was completely irrelevant.

We took two samples of ten requests, each with a different sampling method. In the sampling method 'expert', we selected ten requests for which the authors of this paper themselves have expert knowledge of the required skills, resulting in ten IT and AI related requests. For the sampling method 'top similarities', we considered the top five skills with the highest similarity scores for each request. We then took the mean of these top five similarity scores. For each vectorization method, we then selected the ten requests with the highest mean similarities. Note that the 'top similarities' sample sets differ among the methods. In addition, we calculated the mean value from the 'expert' and the 'top similarities' samples.

With 20 relevance assessments for each method, we were able to calculate the Normalized Discounted Cumulative Gain@5 (NDCG@5) and the Mean Average Precision (MAP) for each system. In order to calculate these measures despite the missing ground truth, we assumed that there are five matching skills for each request. In order to calculate the MAP, which requires a binary relevance, we considered the relevance labels '1' and '2' as relevant and '0' as irrelevant.

Table 3: NDCG@5 and MAP values for the three vectorization methods and the two sampling methods. LM: Language Model, KE: Keyword-Embeddings, KB: Keyword-Binarizer.

| Method | Top similarities NDCG@5 | Top similarities MAP | Expert NDCG@5 | Expert MAP | Mean NDCG@5 |
|---|---|---|---|---|---|
| LM | 0.70 | 0.89 | 0.63 | 0.76 | 0.67 |
| KE | 0.25 | 0.37 | 0.13 | 0.16 | 0.19 |
| KB | 0.28 | 0.39 | 0.15 | 0.28 | 0.21 |
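The retrieval step and the 'top similarities' sampling described in this section can be illustrated with a minimal sketch. It assumes cosine similarity as the closeness measure (the text only speaks of the closest skill vectors) and uses Keyword-Binarizer-style one-hot vectors over a tiny, hypothetical vocabulary. All function names, identifiers and keywords are illustrative and not taken from the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def binarize(keywords, vocabulary):
    """Binary (one-hot style) vector marking which vocabulary keywords occur."""
    present = set(keywords)
    return [1.0 if term in present else 0.0 for term in vocabulary]


def top_k_skills(request_vec, skill_vecs, k=5):
    """Return the k (skill_id, similarity) pairs closest to the request vector."""
    scored = [(skill_id, cosine(request_vec, vec)) for skill_id, vec in skill_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def top_similarity_sample(request_vecs, skill_vecs, k=5, n=10):
    """Select the n requests whose top-k skills have the highest mean similarity."""
    mean_scores = {}
    for request_id, vec in request_vecs.items():
        top = top_k_skills(vec, skill_vecs, k)
        mean_scores[request_id] = sum(score for _, score in top) / len(top)
    ranked = sorted(mean_scores, key=mean_scores.get, reverse=True)
    return ranked[:n]


# Toy data: hypothetical keywords, not taken from the Fraunhofer taxonomy.
vocabulary = ["machine learning", "smart metering", "load forecast", "epitaxy"]
skills = {
    "energy_ai": binarize(["machine learning", "load forecast"], vocabulary),
    "metering": binarize(["smart metering"], vocabulary),
    "semiconductors": binarize(["epitaxy"], vocabulary),
}
requests = {
    "r1": binarize(["smart metering", "machine learning", "load forecast"], vocabulary),
    "r2": binarize(["epitaxy", "smart metering"], vocabulary),
}

matches = top_k_skills(requests["r1"], skills, k=2)
sample = top_similarity_sample(requests, skills, k=2, n=1)
```

The same two functions would work unchanged with averaged word embeddings or language-model embeddings in place of the binary vectors, since only the vector construction differs between the three approaches.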

3. Results

The purpose of our experiment was to find out which NLP method yields the best results for the task of recommending skills from a standardised skill ontology to a specific task or request. Table 3 shows an overview of the NDCG@5 and the MAP scores obtained during our experiments.

From the data, it is apparent that the language-model-based method yielded by far the best results. Over all samples, the language model achieved an MAP of 0.82 and an NDCG@5 of 0.67. The other two methods are far behind.
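As an illustration of how the two measures can be derived from graded relevance labels, the following sketch computes NDCG@5 and average precision for a single request. Several details are assumptions, because the text does not specify them: linear gains with a log2 rank discount, an ideal ranking consisting of five completely relevant skills (label '2'), and an average-precision denominator of five assumed matching skills. The label lists are the assessment labels of the single example request reported in this paper, so the values are per-request illustrations and not the scores in Table 3.

```python
import math


def dcg(labels):
    """Discounted cumulative gain with linear gains and a log2 rank discount (assumed)."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(labels, start=1))


def ndcg_at_k(labels, k=5, max_label=2):
    """NDCG@k against an assumed ideal list of k completely relevant skills."""
    ideal = dcg([max_label] * k)
    return dcg(labels[:k]) / ideal


def average_precision(labels, n_relevant=5):
    """Average precision; labels 1 and 2 count as relevant, 0 as irrelevant."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(labels, start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / n_relevant


def mean_average_precision(label_lists, n_relevant=5):
    """MAP: the mean of the per-request average precision values."""
    return sum(average_precision(labels, n_relevant) for labels in label_lists) / len(label_lists)


# Assessment labels (ranks 1 to 5) of the example request for each method.
lm_labels = [2, 2, 2, 1, 2]
ke_labels = [0, 0, 0, 0, 1]
kb_labels = [0, 0, 0, 0, 0]

for name, labels in [("LM", lm_labels), ("KE", ke_labels), ("KB", kb_labels)]:
    print(name, round(ndcg_at_k(labels), 3), round(average_precision(labels), 3))
```

Other common NDCG variants use exponential gains (2^rel - 1) or derive the ideal ranking from the observed labels; either choice would change the absolute values but not the ordering of the three methods in this example.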

To illustrate the findings of these first experiments, we show the top five skill recommendations of each method for one example request in Table 5.

Table 4

| Before prompt engineering | After prompt engineering |
|---|---|
| Simulation, control and operational management of energy supply systems Field of competence energy informatics AI-based autonomous actions | We work on AI-based autonomous actions. Our field of competence is energy informatics, within the research field of simulation, control and operational management of energy supply systems |

4. Discussion

The results of these preliminary experiments are very satisfactory. We have shown that our language-model-based method in particular performed very well for matching skills to specific tasks. This was somewhat surprising given that the skills have a comparatively short text length and thus do not provide much context for the language model to compute semantic similarities. Equally surprising was that the word-embedding-based method (KE), which was supposed to perform well even without context, showed such poor performance. We suspect that this is due to the rather technical vocabulary in both the skills and the requests, which is not present in our word embedding vocabulary. Our attempt to counteract this with the compound splitting described above does not seem to have achieved the expected results.

Nevertheless, we are convinced that the performance - particularly that of the LM approach - can still be improved by further tuning. In future work, we want to experiment with further text preprocessing and prompt engineering methods. For example, we are interested in how the transformation of the skill string into a real-language sentence impacts the performance. For this, a sentence template with slots for the hierarchical elements of the skill string can be used. Table 4 shows an example of such a transformed string. With such a transformation, we hope to provide even more context to the Language Model, especially to the attention mechanism. Moreover, LMs are trained and optimized on whole natural sentences, not on syntaxless word groups.
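A sentence template of this kind could look like the following sketch. The template wording is taken from the transformed example string in this paper; the function name, the slot names and the handling of the "Field of competence" prefix and of capitalization are assumptions made for illustration only.

```python
# Template wording follows the paper's transformed example; slot names are hypothetical.
TEMPLATE = (
    "We work on {skill}. Our field of competence is {competence_field}, "
    "within the research field of {research_field}"
)

# Assumption: the second hierarchy level may carry this label as a prefix.
PREFIX = "Field of competence "


def lowercase_first(text):
    """Lower-case the first character so the phrase fits mid-sentence.

    Note: this would also lower-case a leading acronym such as "AI".
    """
    return text[:1].lower() + text[1:]


def skill_to_sentence(research_field, competence_field, skill):
    """Fill the hierarchical elements of a skill string into the sentence template."""
    if competence_field.startswith(PREFIX):
        competence_field = competence_field[len(PREFIX):]
    return TEMPLATE.format(
        skill=skill,
        competence_field=competence_field,
        research_field=lowercase_first(research_field),
    )


sentence = skill_to_sentence(
    "Simulation, control and operational management of energy supply systems",
    "Field of competence energy informatics",
    "AI-based autonomous actions",
)
print(sentence)
```

Keeping the template as a single constant makes it easy to compare several phrasings against each other once a labelled ground truth is available.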

Again, we should stress that the sample size of this experiment is still rather small and that the results need to be confirmed as soon as the entire dataset of research requests has been labelled with the matching skills.

We also hope to make further improvements to our approach with such a ground truth at hand. Not only would this allow us to calculate more evaluation measures, such as precision@k and F1@k, we could also fine-tune the vector-space model. With contrastive learning, we could optimize the vector space in a way that requests move closer to the matching skills and further away from the mismatching skills, hoping that this new vector space is transferable to unknown requests.
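Once such a ground truth exists, the additional measures mentioned above could be computed along the following lines. This is a minimal sketch under the assumption that the ground truth is available as a set of relevant skill identifiers per request; the identifiers and the function name are hypothetical.

```python
def precision_recall_f1_at_k(recommended, relevant, k=5):
    """Precision@k, recall@k and F1@k for one request.

    recommended: ranked list of skill identifiers returned by the system.
    relevant: set of skill identifiers marked as matching in the ground truth.
    """
    top_k = recommended[:k]
    hits = sum(1 for skill_id in top_k if skill_id in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical identifiers for illustration only.
recommended = ["s1", "s2", "s3", "s4", "s5"]
relevant = {"s1", "s3", "s9"}

precision, recall, f1 = precision_recall_f1_at_k(recommended, relevant, k=5)
```

Unlike the assumption of five matching skills used in the current validation, recall@k and F1@k depend on the true number of relevant skills per request, which is why they require the complete labelling.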

Last, and maybe most importantly, we want to explore the transferability of our method to other, related problems. These are, e.g., recommending skills for more general tasks and work assignments or even finding the worker or team with the optimal skill set for requests, tasks and work assignments.

However, it remains very important to mention that such recommender systems are only useful and properly utilized when they are designed to support an essentially human-driven decision-making process.

Table 5

Request: We are searching for a solution to link a smart metering system of high-resolution electricity, gas and heat data with our intelligent cloud solution. In the cloud, we want to automatically process the data using machine learning to check for consistency and completeness and to enable load forecasts and cost optimization. We are also looking for the joint development of innovative business models.

| Method | Rank | Research field | Skill | Assessment label |
|---|---|---|---|---|
| Language Model | 1 | Energy Information Technology | Data Science, Statistics, Time Series Analyses, AI/ML | 2 |
| Language Model | 2 | Energy Information Technology | Data Management | 2 |
| Language Model | 3 | Economic and regulatory assessment | Energy system analyses | 2 |
| Language Model | 4 | Energy Information Technology | AI-based methods of optimized, predictive network operation management | 1 |
| Language Model | 5 | Energy Information Technology | Standards and interfaces for interoperable communication | 2 |
| Keyword Embedding | 1 | Storage & storage systems | Integration of new storage systems | 0 |
| Keyword Embedding | 2 | Lightweight construction technologies | Functional integration in lightweight construction | 0 |
| Keyword Embedding | 3 | Power grids | Modeling of power grids | 0 |
| Keyword Embedding | 4 | Artificial Intelligence Methods | Generation of Synthetic Training Data | 0 |
| Keyword Embedding | 5 | Artificial Intelligence Methods | AI Technologies in Production & Logistics | 1 |
| Keyword Binarizer | 1 | Module manufacturing/integration | Packaging for RF and analog mixed-signal modules | 0 |
| Keyword Binarizer | 2 | Process Technologies | Epitaxy | 0 |
| Keyword Binarizer | 3 | Component manufacturing | High- and ultra-high-frequency components (High-Frequency Devices) | 0 |
| Keyword Binarizer | 4 | Component manufacturing | Actuators, MEMS actuators | 0 |
| Keyword Binarizer | 5 | Component packaging, module manufacturing/integration | Display, RFID packaging | 0 |

References

[1] Qin, Zhu, Xu, Zhu, Jiang, E. Chen, Xiong, Enhancing person-job fit for talent recruitment, in: K. Collins-Thompson (Ed.), The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, ACM Conferences, ACM, New York, NY, 2018, pp. 25-34. doi:10.1145/3209978.3210025.

[2] Zhao, Wang, Sigdel, Zhang, P. Hoang, M. Liu, Korayem, Embedding-based recommender system for job to candidate matching on scale, 2021. URL: https://arxiv.org/pdf/2107.00221.

[3] Lavi, Medentsiy, D. Graus, consultantbert: Fine-tuned siamese sentence-bert for matching jobs and job seekers, in: The Workshop on Recommender Systems for Human Resources (RecSys in HR 2021), 2021.

[4] M. H. Jarrahi, Askay, Eshraghi, Smith, Artificial intelligence and knowledge management: A partnership between human and ai, Business Horizons (2022). doi:10.1016/j.bushor.2022.03.002.

[5] Campos, Mangaravite, Pasquali, Jorge, Nunes, Jatowt, Yake! keyword extraction from single documents using multiple local features, Information Sciences 509 (2020) 257-289. doi:10.1016/j.ins.2019.09.013.

[6] N. Reimers, I. Gurevych, Making monolingual sentence embeddings multilingual using knowledge distillation, 2020. URL: https://arxiv.org/abs/2004.09813. doi:10.48550/ARXIV.2004.09813.

[7] Conneau, Khandelwal, Goyal, Chaudhary, Wenzek, Guzmán, E. Grave, Ott, Zettlemoyer, Stoyanov, Unsupervised cross-lingual representation learning at scale, 2019. URL: https://arxiv.org/abs/1911.02116.

[8] K. Song, X. Tan, T. Qin, J. Lu, T.-Y. Liu, Mpnet: Masked and permuted pre-training for language understanding, 2020. URL: https://arxiv.org/abs/2004.09297. doi:10.48550/ARXIV.2004.09297.

[9] N. Reimers, I. Gurevych, Sentence-bert: Sentence embeddings using siamese bert-networks, 2019. URL: https://arxiv.org/abs/1908.10084. doi:10.48550/ARXIV.1908.10084.