Recommendation of Job Offers Using Random Forests and Support Vector Machines

Jorge Martinez-Gil, Bernhard Freudenthaler, Thomas Natschläger
Software Competence Center Hagenberg GmbH, Hagenberg, Austria
jorge.martinez-gil@scch.at, bernhard.freudenthaler@scch.at, thomas.natschlaeger@scch.at

ABSTRACT
The challenge of automatically recommending job offers to appropriate job seekers has attracted much research effort in recent years. However, it is generally assumed that more user-friendly filtering methods are needed so that automated recommendation systems can be more widely used. We present here our research on two methods from the data analytics field that are able to disseminate job offers to the right person at the right time, based on Random Forests and Support Vector Machines respectively. Both methods are used here to identify the attributes that users actually focus on when they are attracted to a job offer. Preliminary results in the context of automatic job recommendation suggest that these two methods are promising.

KEYWORDS
e-recruitment, data analytics, random forests, support vector machines

1 INTRODUCTION
Today, the job market is becoming more and more dynamic. In fact, this is one of the major reasons for an increasing demand for better methods for publishing or finding interesting job offers. Moreover, this interest is bidirectional [13], which means that it stems not only from Human Resources (HR) departments in companies, intermediaries or manufacturers of recruiting software, but also from job seekers looking for new professional challenges. As a first step, it is assumed that a preliminary reduction to the most promising applicants and job offers can lead to considerable improvements and savings (in terms of money, time and effort) for both parties [10]. In this context, job portals and online recruitment platforms have traditionally been designed to help job providers and job seekers to easily find suitable candidates and job offers respectively.

At present, many job portals and web-based recruitment systems offer their services around the world. However, there is a large corpus of literature suggesting that the functionality of the existing portals could be improved [3, 7, 14–16, 19, 23]. As a general case, only references to online job advertisements are managed, which are then classified using a simple textual description or core attributes. This means that there are serious obstacles to satisfactory support, at least on the side of job seekers, who are forced to browse through the list of available job offers to find what best fits their needs and interests.

In order to allow job seekers to efficiently find what they are looking for, the research community has been working on a kind of information filtering mechanism (a.k.a. job recommender system [2, 6, 20]) that aims to predict the potential interest of job seekers in given job offers. More specifically, job recommender systems aim to automatically suggest job openings in such a way that as many offers as possible reach the right candidates at the right moment.

To appropriately face these problems, a number of alternatives have already been explored: whether data concerning the offer should be provided in a structured or unstructured way [7], which communication channels are the most appropriate in a given context [4, 5], how knowledge extraction over the job descriptions should be performed [22], and so on. However, it is widely assumed that more accurate and user-friendly filtering methods need to be developed in order to reach a wider audience for this kind of software product [18].

Our research work aims to make this process much smoother and more comfortable for users looking for accurate job recommendations. In fact, our methods aim to automatically identify the criteria by which potential candidates evaluate the acceptance of a given job offer. Additionally, our research aims to improve the perceived quality of recommendations as feedback is received from users. Therefore, in view of the aforementioned issues, we propose here a novel approach for the accurate recommendation of job offers using two well-known methods from the data analytics field that can perform well in this context. The major contributions of this ongoing work can be summarized as follows:

• We propose a novel mechanism to automatically and accurately recommend job offers based on Random Forests.
• We propose an alternative mechanism to automatically recommend job offers based on the computation of Support Vector Machines.
• We perform an empirical evaluation of our two proposed methods with real recruitment data from one of our partners.

The remainder of this work is organized in the following way: Section 2 reports the state-of-the-art on existing methods and tools for the automatic recommendation of job offers. Section 3 presents the problem that we are addressing within the frame of this work. Section 4 describes our two methods to face that problem, which are based on Random Forests and Support Vector Machines respectively. Section 5 reports the empirical evaluation of our methods. Section 6 outlines the analysis of the results that we have achieved from our empirical evaluation. Finally, we present the conclusions and the future lines of research.

© 2018 Copyright held by the owner/author(s). Published in the Workshop Proceedings of the EDBT/ICDT 2018 Joint Conference (March 26, 2018, Vienna, Austria) on CEUR-WS.org (ISSN 1613-0073). Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0.
2 BACKGROUND
For many years, information systems for human resources (a.k.a. Human Resources Management Systems or simply HRMs) have been mainly restricted to tracking applicants' data through applicant management systems [11]. However, with the increasing differentiation of the labor and business worlds, the process of finding the right person for a job opening, and vice versa, is becoming more complex. It is clear that emerging social media channels, in addition to an overwhelming number of job portals, require new strategies and technologies for both recruiters and job seekers [9].

2.1 Use Cases
Solutions for the automatic recommendation of job offers are currently of great interest for a number of organizations that wish to automate their e-recruitment processes. Among the most important ones, we can mention HR departments, market intermediaries, electronic job platforms and portals, and software manufacturers. We offer here a closer look at each of them.

2.1.1 HR departments. The Human Resources (HR) departments in companies have to face problems of this kind daily. Currently, the HR departments of large companies receive large numbers of incoming e-mail applications. All the application documents have to be processed manually, so that the relevant information extracted can be transferred into the internal recruiting systems. This process is very time consuming and uses a lot of resources (time, money, effort). For this reason, only the data from suitable candidates should be transferred into the system.

2.1.2 Market intermediaries. HR recruiters and headhunters are usually commissioned to find the most suitable candidate for a specific job description. The challenge is so complex that many companies are willing to pay large sums for successfully completing this task. Solutions for job recommendation can help to alleviate this problem, so that it can be performed much more efficiently and effectively.

2.1.3 Electronic job platforms and portals. The segment of electronic job platforms and/or portals is subject to strong competition. To survive in this highly competitive market, these operators continually provide their customers with new and additional services. With the envisaged research results in the field of automatic job recommendation, portal operators can increase their level of innovation and therefore generate additional competitive advantages for their customers.

2.1.4 Manufacturers of recruiting software. It is also necessary to mention the manufacturers of recruiting software, since this group is constantly striving to expand its software solutions with additional and innovative modules to increase customer satisfaction and generate additional revenue. For this reason, software manufacturers of recruiting solutions are potential beneficiaries of results leading to a satisfactory job recommendation.

2.2 Existing Recommendation Engines
Existing job portals are mainly based on either the use of relational databases or well-known methods from the area of information retrieval (IR). A major difference between them is that relational systems are only able to work with job offers that are already stored in the databases, while IR-based approaches may allow global searches over the Web or social networks.

When using relational databases, job offers with descriptive attributes such as job title, location, company, required skills, etc. and the URL of the job advertisement are stored in relations, and access is provided by means of database queries in standard languages such as SQL [21]. Consequently, only those vacancies matching exactly the given search criteria can be found [17]. When using IR methods, full-text search is supported by keywords instead, so that standard search engines can be integrated. Both procedures can be used in a similar way when searching for offers. IR-based methods make it possible to exploit semantic similarity between keywords, but this is only supported to a limited extent by standard search engines. In addition, these approaches generate ordered lists of URLs, where users have a proven tendency to view only the highest ranking results.

For these reasons, and regardless of the way in which job offers are handled and processed, recommending the right offer to the right user has always been an important task [12]. The research community is therefore working to find ways to make this recommendation fully satisfactory to all parties involved in the process.

2.3 Existing Methods
Techniques for automatic recommendation of job offers are specifically designed to address the problem of information overload by giving priority to information delivery for individual users based on their learned preferences [1].

The most common way to process this information nowadays consists of automatically processing the documents involved in the e-recruitment process. For each document, it is possible to extract a vector for each of its fields (which contain textual information) using the bag-of-words model and TF-IDF as weighting function. Then, some kind of set comparison method can indicate the suitability of a given candidate for a specific job offer.

In general, most methods try to exploit solutions based on the Vector Space Model (VSM) to measure the similarity ratio between the original job offer and the application received. It is a solution that is easy to implement, has very low computational costs, and has traditionally achieved very good results in the context of job recommendation. However, new trends bet on the use of machine learning technology in order to overcome the traditional inability to go beyond the syntactical representation of the documents.

3 PROBLEM STATEMENT
The problem that we address within the frame of this work is being able to automatically recommend job offers to the appropriate candidates. We are given past solved cases

$$(x_i, y_i), \quad x_i \in \mathbb{R}^d, \quad y_i \in \{-1, 1\}.$$

We want a classifier such that

$$g(x) = \operatorname{sign}(\phi(w) \cdot \phi(x) + b), \tag{1}$$

where

$$\phi(w) \cdot \phi(x) = K(w, x). \tag{2}$$

The key here is to evaluate the performance of the proposed method against the past solved cases, which are fed to the algorithm in each iteration to readjust its internal parameters.

Figure 1: Example of Random Forest bagging N decision trees. Each decision tree gives a vote for a given class. Then, the random forest chooses the classification having the most votes.

Figure 2: Example of 2-dimensional Support Vector Machine. The method consists of looking for the hyperplane that maximizes the separation between the two given classes.

4 METHODS
In order to improve the accuracy of the predictions, great research efforts have been made in the last few years concerning the definition of methods for combining a number of simple methods. These methods construct a set of hypotheses (a.k.a. ensemble), and combine the predictions of the ensemble in some way to classify new data. The precision obtained by this combination of hypotheses is usually better than the precision of each individual component. One of the most popular methods in this context is the random forest (RF), which, in general, can also be easily extended to support multiple classes.
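The ensemble idea described above, a set of simple hypotheses whose predictions are combined by voting, can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names (train_stump, train_ensemble, predict) are hypothetical, one-split decision stumps stand in for full decision trees, and labels are assumed to be -1 or 1.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature_ids):
    """Fit a one-split stump (an assumed, very simple hypothesis)."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            for sign in (1, -1):
                preds = [sign if row[f] > threshold else -sign for row in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, sign)
    _, f, threshold, sign = best
    return lambda x: sign if x[f] > threshold else -sign


def train_ensemble(rows, labels, n_members=15, m=1, seed=0):
    """Train each member on a bootstrap sample and m randomly chosen features."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        feats = rng.sample(range(n_features), m)     # random feature subset
        members.append(train_stump([rows[i] for i in idx],
                                   [labels[i] for i in idx], feats))
    return members


def predict(members, x):
    """Combine the ensemble by majority vote."""
    votes = Counter(member(x) for member in members)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): (yearly salary, distance to home) and interest label.
rows = [(30000, 10), (35000, 50), (60000, 20), (70000, 40)]
labels = [-1, -1, 1, 1]
members = train_ensemble(rows, labels, n_members=15, m=1)
print(predict(members, (80000, 30)))
```

An odd number of members is used here so that a binary vote cannot end in a tie.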
On the other hand, algorithms based on n-dimensional geometry, which are trained on a set of past solved cases, are also gaining popularity. With them, it is possible to label the classes and train the algorithm to build a geometric model that correctly classifies a new sample. We give a deeper insight into these two methods below.

4.1 Random Forests
The first method that we envision in this research work is the Random Forest (RF). The rationale behind RF is to work with a given number of decision trees at the same time. Each tree gives a vote for a given class, and this process is repeated for all trees. Then, the RF returns the class having the most votes.

One of the advantages of using RF is that, in most situations, this method is able to avoid overfitting the training set, which is not always possible with other machine learning techniques. Figure 1 shows an example of RF. Please note that, in order to work correctly, each decision tree has to be built following these steps:

(1) Let N be the number of available cases and M the number of variables in the classifier.
(2) Let m be the number of input variables used to determine the decision at a node; m must always be smaller than M.
(3) Select a training set for this tree from the N cases and use the remaining cases to estimate the error.
(4) For each node of the decision tree, randomly select m variables on which to base the decision. Calculate the best split of the training set based on these m variables.

We think that the main advantages of using RF in this context can be summarized as follows:

• In general, RF has only one parameter to configure: the number of trees in the RF
• Unlike black-box models, the results obtained by RF are easier to interpret
• RF are based on probabilistic principles

4.2 Support Vector Machines
Support Vector Machines (SVM) is a state-of-the-art classification method that separates data samples using the geometric notion of hyperplane. The concept behind SVM is very intuitive and easy to understand: if we have data samples that have already been classified, SVM can be used to generate separating hyperplanes so that the samples are divided into segments. The idea is that each of these segments contains only one class.

The SVM technique is generally useful and very accurate in scenarios involving some kind of classification. The reason is that SVM is designed to minimize the classification error and maximize the geometric margin.

Among all the hyperplanes that are able to correctly classify the past samples, we are only interested in the one with the largest margin, which is determined by the samples closest to it. Figure 2 shows the rationale behind SVM with an example in a two-dimensional space. The aim here is to find the hyperplane that best segregates the class of relevant job offers from the class of non-relevant job offers. When a new labeled instance is added, this hyperplane has to be recalculated in order to facilitate future classifications.

SVM has demonstrated great performance in a number of past scenarios involving some kind of classification of data samples. We also think that SVM offers several advantages in the context of automatic recommendation of job offers. These advantages are the following:

• SVM has a regularization mechanism (a.k.a. geometric margin) which helps to avoid over-fitting
• SVM is defined by an optimization problem for which there are a number of existing efficient solutions
• SVM provides an approximation to a bound on the test error, which makes it very robust

SVM also has the additional advantage of supporting kernels, which make it possible to add expert knowledge about the problem.
This aspect (the use of kernels to add expert knowledge about the problem) is outside the scope of the present work, but it would be interesting to address it as part of our future work.

5 RESULTS
We report here the results from our experiments in the field of automatic job recommendation. We have worked with a data set of 40 job offers that have been evaluated on the basis of templates or profiles. A template or profile is a pre-defined pattern that shows interest in job offers that meet certain conditions.

The sample set we are working with is not very large (mainly due to the cost of acquiring data in this context), but it gives us a good starting point to test the accuracy of these methods for solving the problem we are facing.

Before each execution, our complete data set is randomly divided into a training set (80% of samples) and a test set (20% of samples). The former is intended to train both RF and SVM, and the latter is intended to verify the accuracy of the method.

It is also important to mention that the attributes for each job offer are the following:

• Company name
• Position title
• City
• Distance to home
• Working hours
• Yearly salary before taxes
• Are you potentially interested (Y/N)? (to be predicted)

Table 1 shows the average values for the numerical attributes and their corresponding standard deviations (the amount of variation or dispersion of the values).

Table 1: Average values and standard deviations for the numerical attributes of our data set

Attribute       Average      Std. Deviation
Workers         5069.5       9195.2
Inhabitants     361547.5     642882.9
Distance        36.3         37.4
Salary          52437.5      13717.4
Working time    38.8         3.5

Moreover, the most frequent Position Title is programmer, although other occupations that appear in the data set are analyst, researcher, desk support and developer. The attribute to be predicted depends on the profile that we are analyzing. In some cases it can be strongly unbalanced (which means that an overwhelming majority of samples belong to one class), which makes the learning process even more difficult. However, this is how things work in real e-recruitment scenarios, where users click on either just a few or on many potential job offers, so we are facing a realistic situation here.

The results will show the degree of accuracy that we have achieved in each case. In order to identify the best strategy in each of these cases, we propose a baseline method that does not involve any kind of learning.

5.1 Baseline
In order to compare the results from our methods, we need to define a baseline method. Since we want to verify the advantages of using methods that are able to analyze past solved cases, we choose a baseline method with no learning capabilities. In this case, we calculate the average of the attributes over the offers that the potential candidate liked in the past. Then, we compare new offers with the 'average' one, and we decide whether it is similar or not based on the number of similar attributes, i.e. attributes closer to the average.

5.2 Salary driven profile
The first case we are going to study is the profile of a person who is interested in job offers with very high salaries. Figure 3 shows the results. Please note that for the RF, we pick the best result, since this result can vary depending on the number of decision trees that our method bags, as we explain later.

Figure 3: Results obtained for the experiment that generates a salary driven profile (bar chart, degree of success from 0 to 100: SVM 100, RF(B) 87.5, Baseline 62.5).

It is very important to determine the number of decision trees that we are going to work with. To do that, we run the algorithm several times in order to determine the appropriate number of trees to be bagged.

Figure 4: Evolution of the performance as more decision trees are considered in the case of a salary driven profile (line plot: Score versus Number of Decision Trees, from 5 to 30).

From Figure 4 it is possible to see that the more decision trees we add, the better the results get. However, at a certain point the benefit is lower than the cost (in terms of computing time) of including additional decision trees.

5.3 Distance driven profile
In this case, we are going to study the profile of a person who is interested in job offers from companies that are located near his or her current location. Therefore, the template will have Yes for job offers with shorter distances and No for job offers for positions located further away. However, what in principle seems to be an easy scenario is not so easy to solve, as we can see in Figure 5. The reason is that the data set generated by the template is very unbalanced, which means that only a few offers are located in the surrounding area.

Figure 5: Results obtained for the experiment that generates a distance driven profile (bar chart, degree of success from 0 to 100: SVM 75, RF(B) 87.5, Baseline 37.5).

In Figure 6, we can see once again how the score improvement decreases as the number of decision trees increases, which means that a larger number of trees is useful only up to a point.

Figure 6: Evolution of the performance as more decision trees are considered in the case of a distance driven profile (line plot: Score versus Number of Decision Trees, from 5 to 30).

5.4 Highly paid hour profile
In this experiment, the template chooses those job offers with the best hourly rate offered by the potential employer, i.e. those in which the proportion between salary and working time is most advantageous. This case is quite interesting because it might allow us to understand how our methods behave when the user looks for a complex aggregation of attributes. Figure 7 shows the results for this experiment.

Figure 7: Results obtained for the experiment that generates a highly paid hour profile (bar chart, degree of success from 0 to 100: SVM 87.5, RF(B) 62.5, Baseline 62.5).

In Figure 8, we can see once again how, at some point, the improvement of the results decreases as the number of decision trees increases.

Figure 8: Evolution of the performance as more decision trees are considered for a highly paid hour profile (line plot: Score versus Number of Decision Trees, from 5 to 30).

5.5 Big companies located in big cities profile
In this experiment, the template chooses those job offers which are offered by large companies located in big cities. This case is also interesting because it might allow us to see how our methods deal with the fact that more than one attribute has an impact on the user's decision. Figure 9 shows the results of the experiment. As can be seen, it was not a difficult scenario for any of the methods considered.

Figure 9: Results obtained for the experiment that generates a profile giving importance to big companies located in big cities (bar chart, degree of success from 0 to 100: SVM 100, RF(B) 100, Baseline 100).

For the case of RF, Figure 10 shows the evolution of the score in relation to the number of decision trees. In this case, the RF remains stable during all the experiments.

Figure 10: Evolution of the performance as more decision trees are considered for the profile Big companies located in big cities (line plot: Score versus Number of Decision Trees, from 5 to 30).

6 DISCUSSION
From the results that we have achieved in our pool of experiments, it is possible to see that the most important advantages of our approach are:

• Both RF and SVM are quite accurate learning algorithms in the context of automatic job recommendation. For a sufficiently large data set, it is possible to build very accurate classifiers. Even for smaller samples like ours, results are better than those from methods with no learning capabilities.
• RF and SVM can both handle many variables without discarding any of them, which makes them good candidates to work efficiently at web scale, in large databases or with large instances.
• Last, but not least, RF is able to provide useful insights for understanding the interactions between the different variables. On the other hand, SVM operates in a less intuitive way but, in exchange, has had a better performance in most cases.

However, a complete empirical evaluation over larger data sets should be performed in order to gain deeper insights into the advantages of these two methods. The reason is that, as we have seen, it is not always possible to obtain optimal results with small samples like ours.

7 CONCLUSIONS AND FUTURE WORK
In this work, we have presented our proposal for the automatic recommendation of job offers. Our goal here is to build methods that are able to deliver appropriate job offers to those job seekers that could be potentially interested in them. To do that, we have based our research efforts on two well-known classification methods: random forests (RF) and support vector machines (SVM).

Our empirical evaluation shows some interesting facts. For example, RF are easier to interpret, although they do not perform particularly well in relation to SVM. On the other hand, SVM are more accurate, although they work with a model that is much harder for humans to interpret. What is clear is that, in both cases, we have shown that these two methods are quite appropriate for working accurately in the context of automatic job recommendation.

As future work, we propose to design novel computational methods that are able to process the textual description of the job offers. At this point, we are using only the quantitative information that is advertised. However, we think that the way an offer is written can help to attract potential candidates as well; new methods for natural language processing using neural networks could perhaps help in this task. We would also like to explore the possibilities of working with expert knowledge via kernel mapping in the case of SVM, as we mentioned earlier. Finally, it is also necessary to study how to integrate this technology with existing web information systems so that these two methods can be put into operation by the industry.

ACKNOWLEDGMENTS
We would like to thank the anonymous reviewers for their useful suggestions to improve this work. The research reported in this paper has been supported by the Austrian Ministry for Transport, Innovation and Technology, the Federal Ministry of Science, Research and Economy, and the Province of Upper Austria in the frame of the COMET center SCCH.

REFERENCES
[1] Fabian Abel, Andras A. Benczur, Daniel Kohlsdorf, Martha Larson, Robert Palovics: Proceedings of the 2016 Recommender Systems Challenge, RecSys Challenge 2016, Boston, Massachusetts, USA, September 15, 2016. ACM 2016.
[2] Daniel Bernardes, Mamadou Diaby, Raphael Fournier, Francoise Fogelman-Soulie, Emmanuel Viennet: A Social Formalism and Survey for Recommender Systems. SIGKDD Explorations 16(2): 20-37 (2014).
[3] Stefan Buschner, Rafael Schirru, Hanna Zieschang, Peter Junker: Providing recommendations for horizontal career change. I-KNOW 2014: 33:1-33:4.
[4] Mamadou Diaby, Emmanuel Viennet, Tristan Launay: Toward the next generation of recruitment tools: an online social network-based job recommender system. ASONAM 2013: 821-828.
[5] Mamadou Diaby, Emmanuel Viennet, Tristan Launay: Exploration of methodologies to improve job recommender systems on social networks. Social Netw. Analys. Mining 4(1): 227 (2014).
[6] Frank Faerber, Tim Weitzel, Tobias Keim: An Automated Recommendation Approach to Selection in Personnel Recruitment. AMCIS 2003: 302.
[7] Evanthia Faliagka, Lazaros S. Iliadis, Ioannis Karydis, Maria Rigou, Spyros Sioutas, Athanasios K. Tsakalidis, Giannis Tzimas: On-line consistent ranking on e-recruitment: seeking the truth behind a well-formed CV. Artif. Intell. Rev. 42(3): 515-528 (2014).
[8] Evanthia Faliagka, Athanasios K. Tsakalidis, Giannis Tzimas: An Integrated E-Recruitment System for Automated Personality Mining and Applicant Ranking. Internet Research 22(5): 551-568 (2012).
[9] Tobias Keim: Extending the Applicability of Recommender Systems: A Multilayer Framework for Matching Human Resources. HICSS 2007: 169.
[10] Stefan Lang, Sven Laumer, Christian Maier, Andreas Eckhardt: Drivers, challenges and consequences of E-recruiting: a literature review. CPR 2011: 26-35.
[11] Sven Laumer, Andreas Eckhardt: Help to find the needle in a haystack: integrating recommender systems in an IT supported staff recruitment system. CPR 2009: 7-12.
[12] Jochen Malinowski, Tim Weitzel, Tobias Keim: Decision support for team staffing: An automated relational recommendation approach. Decision Support Systems 45(3): 429-447 (2008).
[13] Jochen Malinowski, Tobias Keim, Oliver Wendt, Tim Weitzel: Matching People and Jobs: A Bilateral Recommendation Approach. HICSS 2006.
[14] Jorge Martinez Gil: An Overview of Knowledge Management Techniques for e-Recruitment. JIKM 13(2) (2014).
[15] Jorge Martinez Gil, Alejandra Lorena Paoletti, Klaus-Dieter Schewe: A Smart Approach for Matching, Learning and Querying Information from the Human Resources Domain. ADBIS (Short Papers and Workshops) 2016: 157-167.
[16] Alejandra Lorena Paoletti, Jorge Martinez Gil, Klaus-Dieter Schewe: Extending Knowledge-Based Profile Matching in the Human Resources Domain. DEXA (2) 2015: 21-35.
[17] Alejandra Lorena Paoletti, Jorge Martinez Gil, Klaus-Dieter Schewe: Top-k Matching Queries for Filter-Based Profile Matching in Knowledge Bases. DEXA (2) 2016: 295-302.
[18] Ioannis K. Paparrizos, Berkant Barla Cambazoglu, Aristides Gionis: Machine learned job recommendation. RecSys 2011: 325-328.
[19] Gabor Racz, Attila Sali, Klaus-Dieter Schewe: Semantic Matching Strategies for Job Recruitment: A Comparison of New and Known Approaches. FoIKS 2016: 149-168.
[20] Amit Singh, Rose Catherine, Karthik Visweswariah, Vijil Chenthamarakshan, Nandakishore Kambhatla: PROSPECT: a system for screening candidates for recruitment. CIKM 2010: 659-668.
[21] Eufemia Tinelli, Simona Colucci, Francesco M. Donini, Eugenio Di Sciascio, Silvia Giannini: Embedding semantics in human resources management automation via SQL. Appl. Intell. 46(4): 952-982 (2017).
[22] Eufemia Tinelli, Simona Colucci, Silvia Giannini, Eugenio Di Sciascio, Francesco M. Donini: Large Scale Skill Matching through Knowledge Compilation. ISMIS 2012: 192-201.
[23] Xing Yi, James Allan, W. Bruce Croft: Matching resumes and jobs based on relevance models. SIGIR 2007: 809-810.
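As a supplementary illustration of the evaluation protocol of Section 5, the following Python sketch shows the random 80/20 split and one possible reading of the baseline of Section 5.1. It is not the code used in the experiments. All names (split, fit_baseline, predict_baseline, accuracy) are hypothetical, and two rules are assumptions that the text does not state: an attribute counts as "similar" when it lies within one standard deviation of the average over the liked offers, and an offer is recommended when at least half of its attributes are similar. Labels are assumed to be 1 (interested) or -1 (not interested).

```python
import random
from statistics import mean, pstdev


def split(samples, train_fraction=0.8, seed=0):
    """Randomly divide the samples into a training set and a test set."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_baseline(train):
    """Average (and spread) of each attribute over the offers liked so far."""
    liked = [x for x, y in train if y == 1]
    return [(mean(col), pstdev(col)) for col in zip(*liked)]


def predict_baseline(profile, x):
    """Recommend when enough attributes are close to the 'average' offer."""
    similar = sum(abs(v - avg) <= sd for v, (avg, sd) in zip(x, profile))
    needed = (len(x) + 1) // 2          # assumed rule: at least half
    return 1 if similar >= needed else -1


def accuracy(predict, test):
    """Degree of success as a percentage of correctly classified offers."""
    return 100.0 * sum(predict(x) == y for x, y in test) / len(test)


# Hypothetical offers: ((yearly salary, distance to home), label).
offers = [((50000.0, 10.0), 1), ((70000.0, 30.0), 1), ((20000.0, 90.0), -1)]
profile = fit_baseline(offers)
print(accuracy(lambda x: predict_baseline(profile, x), offers))
```

With 40 offers, such a split leaves 8 offers for testing, so each test offer accounts for 12.5 points of accuracy, which appears consistent with the degrees of success reported in the figures.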