=Paper=
{{Paper
|id=Vol-3817/long3
|storemode=property
|title=Using Semantic-based Adaptive Relevance Prediction to Enhance Entity Recommendation for Personal Knowledge Assistance
|pdfUrl=https://ceur-ws.org/Vol-3817/long3.pdf
|volume=Vol-3817
|authors=Mahta Bakhshizadeh,Heiko Maus,Andreas Dengel
|dblpUrl=https://dblp.org/rec/conf/kars/BakhshizadehM024
}}
==Using Semantic-based Adaptive Relevance Prediction to Enhance Entity Recommendation for Personal Knowledge Assistance==
Using Semantic-based Adaptive Relevance Prediction to Enhance Entity Recommendation for Personal Knowledge Assistance Mahta Bakhshizadeh1,2,* , Heiko Maus1 and Andreas Dengel1,2 1 German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany 2 University of Kaiserslautern-Landau (RPTU), Kaiserslautern, Germany Abstract Personal knowledge assistance tools are designed to support knowledge work by delivering contextually relevant information and recommendations, thereby enhancing productivity and decision-making. Entity recommendation is a form of knowledge assistance that suggests relevant entities commonly sourced from public knowledge bases, like DBpedia, based on user context to improve productivity in daily digital tasks. In this study, we explore which similarity metrics within RDF2Vec knowledge graph embedding are most effective at capturing users’ personal interpretations of entity similarities within their specific contexts. Accordingly, we propose a semantic-based recommendation method that includes an adaptive relevance prediction module to dynamically evaluate entity relevance by incorporating user feedback. Our approach is benchmarked on RLKWiC, a publicly available dataset of real-life knowledge work in context, and demonstrated a twenty percent improvement over the established baseline for entity recommendation, highlighting its potential to enhance knowledge work support. Keywords Entity recommendation, Personal knowledge assistance, RDF2Vec Embeddings, Knowledge work support 1. Introduction Recommender systems (RS) have become a major technology in a wide range of applications, from e-commerce and social media to digital entertainment [1]. Traditional RS primarily rely on collaborative filtering methods [2], which leverage user-item interaction data, often enhanced by machine learning techniques, to predict user preferences. 
However, these methods typically do not utilize the vast amounts of structured and unstructured knowledge available about the domain of interest. This gap has led to the emergence of Knowledge-aware RS (KaRS), which aim to integrate domain-specific knowledge into the RS to improve not just the accuracy but also the relevance and interpretability of recommendations. KaRS extend beyond the conventional data-driven approaches by incorporating rich semantic information from various knowledge sources, such as ontologies, Knowledge Graphs (KGs), and other structured databases. These systems leverage this knowledge to provide more contextually relevant recommendations, allowing them to address some of the limitations of traditional recommenders, such as cold start problems and lack of explainability. By using knowledge bases and KGs, KaRS can infer new relationships between items and users, capture deeper insights about user preferences, and understand the semantics behind user interactions [3]. The integration of knowledge sources into RS represents a shift towards a more comprehensive approach, where the goal is not only to predict what a user might like but also to provide recommendations that are contextually appropriate and semantically meaningful. As such, knowledge-aware and conversational RS are at the forefront of advancing the field by leveraging the semantic richness of knowledge bases and the interactive nature of conversational AI, ultimately enhancing user satisfaction and engagement [4].

Sixth Knowledge-aware and Conversational Recommender Systems (KaRS) Workshop @ RecSys 2024, September 14–18 2024, Bari, Italy. * Corresponding author. Email: mahta.bakhshizadeh@dfki.de (M. Bakhshizadeh); heiko.maus@dfki.de (H. Maus); andreas.dengel@dfki.de (A. Dengel). ORCID: 0000-0001-7796-3444 (M. Bakhshizadeh); 0000-0003-3508-5860 (H. Maus); 0000-0002-6100-8255 (A. Dengel). © 2024 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Over the past two decades, a limited number of studies have investigated the integration of KaRS with Personal Knowledge Assistance (PKA). This emerging application seeks to develop RS that can continuously provide Knowledge Workers (KW) with the most relevant and useful information based on their specific context, thereby enhancing productivity in their daily tasks [5]. KW, including professionals such as architects, engineers, scientists, lawyers, and academics, rely heavily on knowledge as their primary asset [6] and are primarily focused on processing and applying information rather than engaging in manual labor [7]. In a relevant study, KG-based RS were identified as one of three promising types of RS capable of addressing the complexities of this research goal [5]. A knowledge-aware recommender system integrated into the work environment can support KW in various ways throughout their daily activities. For instance, during the process of writing a paper, such systems could recommend relevant articles, suggest contacts for consultation based on previous research and experience, propose appropriate research tools, and highlight upcoming conferences. These recommendations can be derived from different layers of the KW’s information space (personal, corporate, and global) tailored to both their immediate needs and long-term preferences [8]. While some studies have concentrated on exploring the personal and corporate information spaces of KW for recommendation, often framing it as proactive information delivery [9] or information re-finding [10], other research has focused on leveraging public knowledge bases for recommendations. This latter approach is commonly referred to as Entity Recommendation (ER), where entities typically refer to resources within public KGs, such as DBpedia1 [11, 12, 13].
The goal of our study is to explore how incorporating user feedback into ER systems can enhance their accuracy in suggesting relevant entities to KW, based on the users’ context. Specifically, we aim to investigate whether a semantic approach, which adapts recommendations by measuring the distance between entities in embedding space, can improve the relevance of these suggestions. This raises an additional research question: Are the representations of entities from public KGs, generated through common embedding methods, aligned with users’ personal interpretations of entity relevance within their self-defined contexts? Our study seeks to provide insights into these questions, assessing the potential of adaptive, feedback-driven systems in refining entity recommendations. In the subsequent section, we discuss the evolution of ER towards PKA followed by an introduction to a publicly available benchmark designed for evaluating ER in PKA. We then present our proposed approach, which uses semantic similarities to dynamically predict the relevance of entities, thereby enhancing ER performance on the established benchmark. The paper is then concluded along with a brief overview of potential future research directions. 2. Personal Knowledge Assistance through Entity Recommendation There has been a significant evolution of ER in the web search domain [14, 15] with extensive benchmarking using popular datasets [16] such as Movielens [17]. While PKA has not gained the same level of popularity as these domains, it has nonetheless seen notable contributions and advancements. This evolution reflects a shift from basic, application-specific models to sophisticated, context-aware systems that understand and anticipate user needs across various digital environments. Initially, ER systems relied on limited user inputs and predefined logs, such as query histories or browsing data. 
For instance, the proactive information retrieval efforts focused on screen surveillance and the use of optical character recognition to analyze all content on a user’s screen, enhancing task detection accuracy and proactive retrieval through digital footprints [18, 19]. An earlier study explored ER application within email management, as demonstrated by the SmartOffice extension for Microsoft Outlook [20]. This tool integrated email processing with enterprise workflows, enhancing the efficiency of handling process-relevant emails and documents. Evaluation results highlighted SmartOffice’s potential to significantly improve workflow integration and user acceptance, showcasing the early promise of ER in supporting professional tasks through email. 1 https://www.dbpedia.org As research progressed, there was a greater focus on user intentions, task goals, and the factors driving information search behaviors. Studies highlighted that search behaviors were often linked to creative processes triggered by prior digital activities, underscoring the need for contextual factors in RS design [21]. This led to the development of entity-based systems that deliver actionable recommendations across multiple applications by continuously monitoring digital activities and capturing context through screen frames. Such systems, exemplified by EntityBot, offer recommendations without explicit queries, enhancing user productivity and satisfaction by reducing cognitive load [22, 12, 11]. Further advancements involved refining the methodologies for capturing and using contextual information. Techniques like Dirichlet–Hawkes processes were used to model context from digital traces, enhancing web search query augmentation based on comprehensive user activity data [23]. Additionally, integrating multimodal data from spoken conversations and digital activities improved task predictions and ER, thereby supporting users more effectively in digital environments [24]. 
Entity footprinting, which models contextual user states through digital monitoring and uses latent representations of entities to predict relevance, demonstrated improved accuracy in user state detection and entity prediction [25]. Most recently, research utilizing transformer architectures to model digital activity contexts has shown promise in predicting personalized information needs for various tasks, suggesting that a broader use of contextual data can enhance the effectiveness of RS [26]. The evolution of ER systems towards PKA has emphasized contextual awareness, multimodal data integration, and proactive support, reflecting a continuous effort to improve the relevance and utility of recommendations in meeting users’ information needs. However, despite the aforementioned valuable contributions, a significant barrier to advancing the use of RS for supporting KW has been the lack of a standardized framework for evaluating and benchmarking these methods [27]. Most of the conducted experiments rely on proprietary datasets that are, at best, partially available, leading to challenges in reproducing results and making fair comparisons across different approaches [28]. The RLKWiC Benchmarking Dataset for ER2 , that is introduced in the following section, fills this gap by extending a publicly available dataset3 towards promoting transparency and comparability, enabling researchers to more effectively evaluate and contrast various methods [13]. 3. Entity Recommendation on RLKWiC RLKWiC is a publicly available dataset of Real-Life Knowledge Work in Context, gathered by monitoring computer interactions of 8 volunteers over 2 months. It aims to provide a standardized benchmark for evaluating PKA services by offering multiple information dimensions, including detailed user contexts, documents, semantics, events, and sessions [28]. 
This dataset is extended to create a community benchmark by simulating an ER scenario where participants were given entities extracted from selected segments of their captured activities across various contexts [13]. In this setup, 1,850 entity recommendations were simulated across 56 different contexts within the dataset. After eliminating duplicates, these entities were grouped by context and presented to participants for explicit feedback. To evaluate the relevance of recommendations in relation to their respective contexts, participants were instructed to rate the recommended entities on a 3-point scale: • Irrelevant (0): Indicates that there is no relevance between the recommended entity and the corresponding context. • Relevant (1): Suggests that there is a connection between the entity and the context, but it does not fully encapsulate or represent the context. • Representative (2): Denotes that the entity is closely aligned with the context, reflecting a high degree of relevance, such that the context can be inferred to be specifically about this entity. 2 https://purl.org/entity-recommendation-on-rlkwic 3 https://purl.org/rlkwic Figure 1: Incorporating Adaptive Relevance Prediction (indicated in green) into the entity recommendation scenario on RLKWiC. It was noted that participants showed a preference for receiving information about entities labeled as representative in the corresponding context. For example, a participant who created the “GNN” context (a context about Graph Neural Networks), expressed interest in obtaining information about representative entities such as Message Passing4 and Graph Representation5 within this context. Conversely, participants generally found that explicit information about entities marked as relevant was less helpful in most scenarios. Nevertheless, these relevant entities can provide valuable, contextually rich information and enhance context representation learning for various information tasks. 
For example, in a job search context, the entity of Mannheim6 , a city in Germany where a participant was looking for a job, was labeled as relevant. Although providing direct information about Mannheim may not seem necessary in this context, utilizing this data indirectly—such as filtering job recommendations to only include positions in Mannheim—could lead to more pertinent results. Participants could also suggest additional entities they considered relevant but were not included in the original recommendations. Combining participant feedback with these additional suggestions resulted in a dataset of over a thousand entities, each labeled with explicit relevance scores, enhancing the RLKWiC dataset. This ER benchmark dataset provides comprehensive details for each recommendation case, including timestamps and participant-assigned scores. To establish a baseline for future research, the performance of a simulated ER scenario in recommending relevant and representative entities is also reported. With this scored entity set for each context, the challenge is to develop a recommendation strategy that can be simulated on the RLKWiC dataset to maximize the number of relevant and representative entities while minimizing irrelevant ones [13]. In the following section, we demonstrate how incorporating an Adaptive Relevance Prediction (ARP) module into the simulated ER scenario has allowed us to outperform the established baseline. 4. Adaptive Relevance Prediction Using Semantic Similarities In the previously simulated ER scenario on RLKWiC (illustrated in Figure 1), participants were provided with entities derived from selected portions of their captured activities across various contexts. 4 http://dbpedia.org/resource/Message_passing 5 http://dbpedia.org/resource/Graph_representation 6 http://dbpedia.org/page/Mannheim In this scenario, each recorded event was first classified as either context-informative or not by the ER system. 
Context-informative events, which have evidence of the user’s implicit confirmation of relevance include five types of activities: naming a newly defined context, making a search query, and adding a tag, file, or web page to the current context. Given the abundance of activities within each context, including considerable noise (e.g., distractions from irrelevant emails), only the context-informative events were considered as recommendation triggers. In the case of any of the mentioned triggering events, the corresponding content was extracted and pre-processed for the next step. This involves removing symbols and irrelevant strings (such as “https” in URLs) from the content. Finally, using the DBpedia Spotlight7 Named Entity Recognition (NER) tool, entities were extracted from the pre-processed text to be recommended to the user [13]. A sample recommendation process is shown by the red dashed rectangles in Figure 1. In this case, a recommendation was triggered when a web page was added to the GNN context. The URL and title of the web page were pre-processed and analyzed by the NER module, resulting in three entities being extracted for recommendation to the user [13]. In the original scenario, all entities recognized from the pre-processed text extracted from context-informative events were directly recommended to the user without further filtering (gray dashed arrow in Figure 1). In contrast, our proposed method introduces an ARP module (shown in green color in Figure 1) to refine the recommendation process by incorporating user feedback on previously recommended entities. The core idea behind ARP is to recommend only those entities that are semantically more closely aligned with the ones the user has already confirmed as relevant. By integrating user feedback into the ER process, we demonstrate a noticeable improvement in the system’s performance. 
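The pre-processing step described above (removing symbols and irrelevant strings such as “https” before entity recognition) can be illustrated with a minimal sketch. The function name `preprocess` and the list of tokens treated as irrelevant are hypothetical choices made for this example, not the rules used in the original pipeline, and the subsequent call to DBpedia Spotlight is deliberately left out.

```python
import re

def preprocess(text: str) -> str:
    """Strip URL scaffolding and symbols, leaving space-separated words."""
    # Hypothetical list of tokens that carry no topical meaning in a URL.
    text = re.sub(r"\b(https?|www|html?|com|org)\b", " ", text, flags=re.IGNORECASE)
    # Replace every run of non-alphanumeric characters with a single space.
    text = re.sub(r"[^A-Za-z0-9]+", " ", text)
    # Collapse repeated whitespace and trim the ends.
    return " ".join(text.split())

cleaned = preprocess("https://en.wikipedia.org/wiki/Graph_neural_network")
print(cleaned)
```

The cleaned string would then be passed to the NER tool; which residual tokens (for example host names) should also be dropped is a design decision this sketch does not settle.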
The following subsections provide a detailed description of our proposed method: we first explain how entities are semantically represented in our approach using RDF2Vec embeddings, then conduct a comparative analysis of different similarity metrics to determine the most effective one, followed by detailing our semantic similarity-based algorithm, which utilizes the selected metric to enhance ER on the RLKWiC dataset. Finally, we present the results to illustrate how our method outperforms the original recommendation strategy. 4.1. DBpedia RDF2Vec Graph Embeddings DBpedia, the foundational knowledge base used for representing entities in RLKWiC, is a prominent linked open data resource that structures its knowledge as a Resource Description Framework (RDF) graph. The graph-based nature of RDF makes it inherently complex and challenging to manage using traditional data mining and machine learning techniques. To address this challenge, RDF2Vec was introduced as a method for creating semantic representations of entities in RDF graphs by learning latent numerical vector embeddings [29]. RDF2Vec leverages language modeling techniques traditionally used for word embeddings to generate feature vectors from graph substructures, such as those obtained through Weisfeiler-Lehman Subtree RDF Graph Kernels [30, 31] and random graph walks. These embeddings effectively capture the semantics of entities in a way that is suitable for data mining and predictive modeling tasks [29]. In our study, we utilized the DBpedia RDF2Vec graph embeddings dataset8 [32] to semantically represent RLKWiC entities to enhance the ER accuracy by enabling the system to recommend entities based on their semantic proximity within the DBpedia KG. Our method is based on the assumption that entities representing a user context, or at least those relevant to it, should be positioned closer to each other in the embedding space. 
Consequently, by filtering out recognized entities that are not sufficiently close to previously successful recommendations (entities scored as 1: relevant or 2: representative by the user), we can enhance recommendation precision. To evaluate this assumption, we analyzed the pairwise similarities between the embeddings of entities within each context. For an intuitive understanding, we began by visualizing the similarities. 7 https://www.dbpedia-spotlight.org 8 Available on https://zenodo.org/records/6377944 Figure 2: Pairwise semantic similarity of entities within two sample contexts (left: “Vehicle Vibrations”, and right: “Neuropsychology”) from RLKWiC using Euclidean distance. While similarities based on certain metrics, such as Sigmoid, Linear, and Polynomial kernels, yielded poor results, others aligned well with our hypothesis. Figure 2 illustrates the pairwise semantic similarity of entities within two sample contexts from RLKWiC using Euclidean distance. The heatmap on the left represents a context named “Vehicle Vibrations” which includes 17 evaluated entities: 10 representative (scored 2, shown in green font), 4 relevant (scored 1, in yellow), and 3 irrelevant (scored 0, in red). The heatmap demonstrates a high pairwise similarity among representative entities, with relevant entities being somewhat less close, and irrelevant ones further away. An interesting case to note in this context is the entity “stroke”. Here, the entity stroke9 (medical condition) was incorrectly recognized and recommended within the “Vehicle Vibrations” context, whereas the correct entity to recommend was stroke in an engine10 . The heatmap shows that the correct entity (stroke in engine) is closer to the appropriate entities compared to the incorrect one (stroke as a medical condition), highlighting the potential of semantic information for disambiguation and improving ER. 
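The disambiguation effect observed for “stroke” can be sketched with toy data. The three-dimensional vectors below are made-up stand-ins for the 200-dimensional RDF2Vec embeddings, and `mean_distance` is a hypothetical helper; the sketch only shows the mechanism of preferring the candidate with the smaller average Euclidean distance to the representative entities of a context.

```python
import numpy as np

# Toy 3-d vectors standing in for RDF2Vec embeddings (made-up values).
representative = np.array([
    [0.9, 0.1, 0.0],   # stands for an engine-related entity
    [0.8, 0.2, 0.1],   # stands for a vibration-related entity
])
candidates = {
    "Stroke": np.array([0.0, 0.9, 0.8]),
    "Stroke_(engine)": np.array([0.85, 0.15, 0.05]),
}

def mean_distance(vec, group):
    """Average Euclidean distance from one vector to every row of `group`."""
    return float(np.linalg.norm(group - vec, axis=1).mean())

scores = {name: mean_distance(vec, representative) for name, vec in candidates.items()}
best = min(scores, key=scores.get)  # candidate closest to the context on average
print(best)
```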
The heatmap on the right depicts another context from a different participant named “Neuropsychology”, with more entities, supporting the same interpretation. While these visualizations provided some intuitive confirmation of our assumption, they are not sufficient for drawing definitive conclusions. Therefore, in the designed experiment presented in the next section, we assess our assumption more rigorously and identify which similarity metrics appear to be more effective in representing entity similarities in alignment with users’ assessments of relevance. 4.2. Comparative analysis of similarity metrics While there has been intriguing research on entity similarities within KG embedding spaces, such as conducting extensive experiments to evaluate the clustering capabilities of various KG embedding models to investigate how different models may capture distinct notions of similarity [33], the challenge of ensuring that these embeddings inherently reflect the user’s personal interpretation of entity similarity by positioning similar entities close to each other has not yet been explored. This research takes a step towards this goal by exploring entity similarity in the RDF2Vec DBpedia KG embedding space from the perspective of the user’s context. To achieve this, we utilized the scikit-learn library11 , which provides functions for computing pairwise similarities and supports a variety of metrics including Euclidean, Cosine, and Manhattan distances, along with RBF, Laplacian, Sigmoid, Linear, and Polynomial kernels. 9 http://dbpedia.org/resource/Stroke 10 http://dbpedia.org/resource/Stroke_(engine) 11 https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise_distances.html Figure 3: Values and averages of 𝐺0 , 𝐺1 , and 𝐺2 based on cosine similarity. 
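As an illustration of the pairwise computations mentioned above, the following sketch applies the eight metrics to a small random matrix using functions from `sklearn.metrics.pairwise`. The random matrix `X` is a placeholder for the RDF2Vec vectors of one context, and the dictionary name `METRICS` is hypothetical; kernel parameters are left at the library defaults, which may differ from the settings used in the study.

```python
import numpy as np
from sklearn.metrics.pairwise import (
    cosine_similarity, euclidean_distances, manhattan_distances,
    rbf_kernel, laplacian_kernel, sigmoid_kernel, linear_kernel, polynomial_kernel,
)

# Map a readable name to each pairwise function compared in the text.
METRICS = {
    "cosine": cosine_similarity,
    "euclidean": euclidean_distances,
    "manhattan": manhattan_distances,
    "rbf": rbf_kernel,
    "laplacian": laplacian_kernel,
    "sigmoid": sigmoid_kernel,
    "linear": linear_kernel,
    "polynomial": polynomial_kernel,
}

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200))  # 5 placeholder entities, 200 dimensions each

# Each call returns a 5 x 5 matrix of pairwise values for the entities in X.
matrices = {name: fn(X) for name, fn in METRICS.items()}
for name, matrix in matrices.items():
    print(name, matrix.shape)
```

Note that Euclidean and Manhattan values are distances (smaller means closer), whereas the remaining functions return similarities (larger means closer), so the two kinds need to be read in opposite directions when they are compared.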
To evaluate our assumption and determine which similarity metrics are more effective in aligning with users’ assessments of entity relevance to their contexts, we applied a one-way ANOVA (Analysis of Variance) test. The one-way ANOVA is used to determine whether an independent variable has an effect on a dependent variable. In our case, the independent variable is the entity’s relevance to the context, which is categorized as 0 (irrelevant), 1 (relevant), or 2 (representative). The dependent variable is the average pairwise semantic similarity between all entities and representative entities within each context. The one-way ANOVA test produces an F-statistic, which is then used to calculate the p-value. The p-value helps us decide whether to reject or fail to reject the null hypothesis (<math>H_0</math>), which states that there is no statistically significant difference between the pairwise semantic similarity within the representative entities of a context compared to their similarity to other entities within the same context. Conversely, the alternative hypothesis (<math>H_1</math>) posits that representative entities are semantically closer to each other. We denote the average pairwise semantic similarities for each group by <math>G_i</math>:

<math>G_i = \Bigg\{ \frac{1}{|E_c^2| \cdot |E_c^i|} \sum_{e_2 \in E_c^2} \sum_{e_i \in E_c^i} \mathrm{sim}_m(e_2, e_i) \;\Bigg|\; c = 1, \ldots, 56 \Bigg\}, \quad \text{for } i = 0, 1, 2</math>

where <math>G_i</math> (<math>i \in \{0, 1, 2\}</math>) denotes the set of average semantic similarities between entities scored with <math>i</math> and representative entities (scored with 2) over the 56 existing contexts in RLKWiC. The notation <math>E_c^2</math> represents the set of entities scored with 2 in context <math>c</math> (<math>c = 1, \ldots, 56</math>), and <math>E_c^i</math> represents the set of entities scored with <math>i</math> (which can be 0, 1, or 2) in context <math>c</math>. The function <math>\mathrm{sim}_m(e_2, e_i)</math> denotes the semantic similarity between a representative entity <math>e_2</math> in <math>E_c^2</math> and an entity <math>e_i</math> in <math>E_c^i</math>, based on the similarity metric <math>m</math>.
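The group averages defined above, and the one-way ANOVA applied to them, can be sketched as follows. This is a minimal illustration on synthetic data: the randomly generated contexts, the cosine similarity computed by hand, and the names `group_average` and `G` are assumptions made for the example and do not reproduce the RLKWiC data or the reported values. The test itself uses `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy.stats import f_oneway

def group_average(sim, scores, i):
    """Mean similarity between entities scored 2 and entities scored i.

    Follows the double sum over all pairs, so for i == 2 self-pairs are included.
    Returns None when either group is empty in this context.
    """
    rep = np.flatnonzero(scores == 2)
    grp = np.flatnonzero(scores == i)
    if rep.size == 0 or grp.size == 0:
        return None
    return float(sim[np.ix_(rep, grp)].mean())

rng = np.random.default_rng(0)
G = {0: [], 1: [], 2: []}
for _ in range(56):                      # one toy context per iteration
    scores = rng.integers(0, 3, size=12)
    scores[:2] = 2                       # guarantee some representative entities
    centre = rng.normal(size=20)
    # Toy assumption: higher-scored entities lie closer to the context centre.
    noise = rng.normal(size=(12, 20)) * (3 - scores)[:, None]
    X = centre + noise
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = unit @ unit.T                  # cosine similarity matrix
    for i in G:
        value = group_average(sim, scores, i)
        if value is not None:
            G[i].append(value)

f_stat, p_value = f_oneway(G[0], G[1], G[2])
print(f_stat, p_value)
```

Repeating the last step with a different similarity matrix per metric would yield one F-statistic and one p-value per metric, which is the shape of the comparison reported in Table 1.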
The double summation <math>\sum_{e_2 \in E_c^2} \sum_{e_i \in E_c^i} \mathrm{sim}_m(e_2, e_i)</math> calculates the total sum of semantic similarities between all pairs of entities from sets <math>E_c^2</math> and <math>E_c^i</math>, and dividing by <math>|E_c^2| \cdot |E_c^i|</math> computes the average of these similarities over the number of pairwise comparisons. Figure 3 presents the values of the three groups calculated using cosine similarity as an example, along with their respective average lines, which align with our expectation that the similarities among representative entities will be the highest, followed by relevant entities, and then irrelevant ones. Using the computed average similarity groups across contexts for each metric, we then calculated the F-statistic, which is the ratio of the mean square between groups to the mean square within groups (MST/MSE), to determine the p-values for testing our hypothesis. As shown in Table 1, five of the eight p-values are below 0.05, indicating a statistically significant difference between the pairwise semantic similarity among representative entities in a context and their similarity to other entities within the same context.

{| class="wikitable"
|+ Table 1: Comparative analysis of similarity metrics using one-way ANOVA (rounded to three decimal places).
! !! Cosine similarity !! Euclidean distance !! Manhattan distance !! RBF kernel !! Laplacian kernel !! Sigmoid kernel !! Polynomial kernel !! Linear kernel
|-
! F-statistic
| 3.503 || 4.775 || 4.657 || 3.961 || 4.930 || 1.778 || 2.064 || 1.902
|-
! P-value
| 0.033 || 0.010 || 0.011 || 0.021 || 0.009 || 0.173 || 0.131 || 0.153
|}

Additionally, the Laplacian kernel and Euclidean distance metrics, which have the lowest p-values (0.009 and 0.010), show the greatest potential for aligning with users’ personal interpretations of entity similarities in relation to their contexts.

4.3. Recommendation algorithm

Based on the findings from our semantic analysis, we propose an algorithm aimed at enhancing entity recommendations on the RLKWiC dataset.
In our approach, detailed in Algorithm 1, we categorize evaluated entities into relevant (scored 1 or 2) and irrelevant (scored 0) based on the explicit feedback provided in the RLKWiC benchmark dataset. For each newly recognized entity, we compute its semantic distance using the Laplacian kernel metric relative to both the relevant and irrelevant groups. The new entity is recommended only if it is closer to the relevant entities than to the irrelevant ones. We ordered the recognized entities within each context chronologically, simulating the availability of user feedback at each recommendation point, and then applied our recommendation method across the 56 contexts of the 8 participants in the RLKWiC dataset.

Algorithm 1: Adaptive relevance prediction via semantic similarity of RDF2Vec entity embeddings

Input: Participants <math>P</math>; for each <math>p \in P</math> a set of contexts <math>C_p</math> including a chronologically ordered set of entities <math>E_{pc}</math>; participant’s explicit feedback on previous recommendations divided into relevant entities <math>R_{pc}</math> and irrelevant entities <math>I_{pc}</math>; scaling parameter <math>\gamma = \frac{1.0}{200}</math> (RDF2Vec embeddings dataset consists of 200-dimensional vectors of DBpedia entities)

Output: Recommendation decision for all recognized entities in RLKWiC

 for each participant <math>p \in P</math> do
   for each context <math>c \in C_p</math> do
     for each entity <math>e \in E_{pc}</math> do
       Compute the average semantic similarity of <math>e</math> with relevant entities in <math>C_p</math>:
         <math>\mathrm{sim}_{rel_{pc}} \leftarrow \frac{1}{|R_{pc}|} \sum_{r \in R_{pc}} \exp\left(-\frac{\lVert \mathrm{RDF2Vec}(e) - \mathrm{RDF2Vec}(r) \rVert_1}{\gamma}\right)</math>
       Compute the average semantic similarity of <math>e</math> with irrelevant entities in <math>C_p</math>:
         <math>\mathrm{sim}_{irr_{pc}} \leftarrow \frac{1}{|I_{pc}|} \sum_{i \in I_{pc}} \exp\left(-\frac{\lVert \mathrm{RDF2Vec}(e) - \mathrm{RDF2Vec}(i) \rVert_1}{\gamma}\right)</math>
       if <math>\mathrm{sim}_{rel_{pc}} > \mathrm{sim}_{irr_{pc}}</math> then
         Recommend the entity <math>e</math>
         Add <math>e</math> to relevant entities <math>R_{pc}</math>
       else
         Do not recommend the entity <math>e</math>
         Add <math>e</math> to irrelevant entities <math>I_{pc}</math>

The step-by-step process of simulating the proposed ER method on RLKWiC is explained as follows:

4.3.1. Input Data: The algorithm works with data from multiple participants.
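As a rough illustration of Algorithm 1 for a single context, the sketch below walks through a chronologically ordered list of entities and applies the comparison between the two average similarities. Several points are assumptions made for the example: the two-dimensional toy embeddings, the seed sets, and the names `mean_similarity` and `arp` are hypothetical; an empty group is given similarity 0; and the L1 distance is multiplied by `gamma`, following the convention of scikit-learn's `laplacian_kernel`, which may not match how the scaling parameter is applied in the original implementation.

```python
import numpy as np

def mean_similarity(e, group, gamma):
    """Mean Laplacian-kernel similarity exp(-gamma * L1) between e and a group."""
    if len(group) == 0:
        return 0.0  # assumption: an empty group contributes no similarity
    l1 = np.abs(np.asarray(group) - e).sum(axis=1)
    return float(np.exp(-gamma * l1).mean())

def arp(entities, embeddings, relevant, irrelevant, gamma):
    """Decide, in chronological order, whether each entity is recommended."""
    rel = [embeddings[name] for name in relevant]
    irr = [embeddings[name] for name in irrelevant]
    decisions = {}
    for name in entities:
        e = embeddings[name]
        recommend = mean_similarity(e, rel, gamma) > mean_similarity(e, irr, gamma)
        decisions[name] = recommend
        # Grow the matching set, as in the update step of Algorithm 1.
        (rel if recommend else irr).append(e)
    return decisions

# Toy 2-d embeddings (made-up values, not RDF2Vec vectors).
embeddings = {
    "gnn": np.array([1.0, 0.0]),
    "spam": np.array([0.0, 1.0]),
    "message_passing": np.array([0.9, 0.1]),
    "newsletter": np.array([0.1, 0.9]),
}
decisions = arp(
    entities=["message_passing", "newsletter"],
    embeddings=embeddings,
    relevant=["gnn"],
    irrelevant=["spam"],
    gamma=1.0 / 2,  # 1 / number of dimensions in this toy setting
)
print(decisions)
```

A variant that adds each entity to the relevant or irrelevant set according to the participant's recorded score, rather than according to the decision, would only change the line that appends to `rel` or `irr`.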
For each participant, the algorithm processes a set of contexts, each of which includes a chronologically ordered set of entities. The participant’s explicit feedback is also provided, where entities are divided into two categories:

• Relevant entities (scored 1 or 2), meaning the user found these entities useful.

• Irrelevant entities (scored 0), meaning the user did not find these entities useful.

The core data utilized for the similarity computation is a set of 200-dimensional RDF2Vec embeddings representing DBpedia entities. These embeddings capture semantic information about the entities.

4.3.2. Contextual Evaluation: The algorithm processes the entity data within each participant’s context. For each context, entities are evaluated one by one in chronological order, simulating a scenario where the system provides recommendations progressively, as more feedback becomes available.

4.3.3. Similarity Computation: For each entity, the algorithm calculates two separate semantic similarity scores:

• Similarity to relevant entities: This measures how close the new entity is to entities previously marked as relevant within the current context. The similarity is computed using the Laplacian kernel, which measures the distance between the RDF2Vec embedding of the new entity and the embeddings of the relevant entities. The formula for the similarity is:

<math>\mathrm{sim}_{rel_{pc}} = \frac{1}{|R_{pc}|} \sum_{r \in R_{pc}} \exp\left(-\frac{\lVert \mathrm{RDF2Vec}(e) - \mathrm{RDF2Vec}(r) \rVert_1}{\gamma}\right)</math>

where <math>\lVert \cdot \rVert_1</math> denotes the L1 (Manhattan) distance, and <math>\gamma</math> is a scaling parameter that controls the sensitivity of the similarity function.

• Similarity to irrelevant entities: This is similarly computed, but compares the new entity to those previously marked as irrelevant, using the same Laplacian kernel approach:

<math>\mathrm{sim}_{irr_{pc}} = \frac{1}{|I_{pc}|} \sum_{i \in I_{pc}} \exp\left(-\frac{\lVert \mathrm{RDF2Vec}(e) - \mathrm{RDF2Vec}(i) \rVert_1}{\gamma}\right)</math>

4.3.4.
Recommendation Decision: Once both similarity scores are computed, the algorithm compares them:

• If the similarity to relevant entities (<math>\mathrm{sim}_{rel_{pc}}</math>) is greater than the similarity to irrelevant entities (<math>\mathrm{sim}_{irr_{pc}}</math>), the new entity is recommended. This means the new entity is considered closer, in terms of semantic similarity, to the group of relevant entities.

• If the similarity to irrelevant entities is higher, the entity is not recommended.

4.3.5. Feedback Incorporation: After each decision (whether to recommend or not), the algorithm updates the relevant or irrelevant entity sets for that context by adding the newly evaluated entity to the corresponding set. This allows the recommendation process to evolve as more user feedback becomes available.

{| class="wikitable"
|+ Table 2: Performance comparison between the baseline method and ER with ARP on RLKWiC.
! Recommendation task !! Method !! Accuracy !! Precision !! Recall !! F1-Score
|-
| Recommending relevant entities || Baseline ER || 0.563 || 0.563 || 1.00 || 0.721
|-
| Recommending relevant entities || ER with ARP || 0.598 || 0.613 || 0.780 || 0.686
|-
| Recommending representative entities || Baseline ER || 0.258 || 0.258 || 1.00 || 0.410
|-
| Recommending representative entities || ER with ARP || 0.458 || 0.302 || 0.838 || 0.444
|}

4.3.6. Repeat for All Participants and Contexts: The process is repeated for each participant and for each context within their data to generate a set of recommendation decisions for all recognized entities in the RLKWiC dataset. These recommendations are personalized for each participant and context, based on how semantically similar a new entity is to those previously marked as relevant or irrelevant.

In summary, this algorithm adapts to user preferences over time by learning from their explicit feedback and progressively improving recommendations based on the semantic similarities of entity embeddings. The use of the Laplacian kernel with RDF2Vec embeddings allows for the assessment of how closely related new entities are to those the user has previously interacted with, ultimately aiming to deliver more relevant recommendations.

4.4.
Results

In our evaluation, we focused exclusively on the set of entities recognized by the NER module within the scenario, excluding any additional entities manually added by users (unlike the original evaluation [13]). Since the baseline method recommends all entities identified by the NER module, it achieves the highest possible recall (1.0). In contrast, our ARP-based method filters out entities that are not semantically related to the relevant ones, which improves accuracy at the cost of lower recall. Table 2 provides a detailed comparison of the two methods. While ARP provides only a slight accuracy improvement for recommending relevant entities (0.563 to 0.598), it raises the accuracy for representative entities by 20 percentage points (0.258 to 0.458).

Additionally, we trained several binary classification models, including Random Forest, Linear Regression, and Gaussian Naive Bayes, which are known to perform well with sparse datasets, to predict the relevance of recognized entities using their embeddings. However, these models did not demonstrate any significant improvements compared to our proposed ARP-based approach.

5. Conclusion and Future Work

In this paper, we explored ER as a method to support knowledge work. We examined the evolution of ER tools towards PKA and introduced the RLKWiC benchmark, the only existing dataset that ensures full transparency and reproducibility for ER evaluation. By incorporating user feedback and simulating ER with an ARP module, our approach outperformed the defined baseline by 20 percentage points in accuracy when recommending representative entities. To predict the relevance of recognized entities in an adaptive manner, we measured the semantic distance between new entities and previously evaluated entities within each context. We utilized RDF2Vec embeddings to represent DBpedia entities in the RLKWiC dataset. Additionally, we tested our core hypothesis statistically, demonstrating the potential of semantic-based methods in enhancing ER.
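As a compact recap of the ARP procedure described in Sections 4.3.2 to 4.3.5, the per-context loop can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names, the default value of the scaling parameter, the three-dimensional toy embeddings, and the treatment of empty feedback sets (zero similarity, and recommending by default when no feedback exists yet) are all hypothetical choices made for this example.

```python
import numpy as np


def laplacian_similarity(vec, others, gamma):
    """Mean Laplacian-kernel similarity between `vec` and a list of vectors."""
    if len(others) == 0:
        # Assumption: an empty feedback set contributes zero similarity.
        return 0.0
    l1 = np.abs(np.asarray(others) - vec).sum(axis=1)  # L1 (Manhattan) distances
    return float(np.exp(-l1 / gamma).mean())


def arp_for_context(entities, embeddings, feedback, gamma=1.0):
    """Return one recommend / do-not-recommend decision per entity."""
    relevant, irrelevant, decisions = [], [], []
    for entity in entities:  # entities are given in chronological order
        vec = embeddings[entity]
        sim_rel = laplacian_similarity(vec, relevant, gamma)
        sim_irr = laplacian_similarity(vec, irrelevant, gamma)
        # Assumption: with no feedback at all, recommend by default.
        cold_start = not relevant and not irrelevant
        decisions.append(cold_start or sim_rel > sim_irr)
        # Feedback incorporation: scores 1 or 2 are relevant, 0 is irrelevant.
        (relevant if feedback[entity] >= 1 else irrelevant).append(vec)
    return decisions


# Toy data: 3-dimensional stand-ins for the 200-dimensional embeddings.
embeddings = {
    "a": np.array([0.0, 0.0, 0.0]),
    "b": np.array([5.0, 5.0, 5.0]),
    "c": np.array([0.1, 0.0, 0.0]),
    "d": np.array([5.0, 5.0, 4.9]),
}
feedback = {"a": 2, "b": 0, "c": 1, "d": 0}
decisions = arp_for_context(["a", "b", "c", "d"], embeddings, feedback)
print(decisions)  # [True, True, True, False]
```

In this toy run, entity d is rejected because it lies close to b, which was marked irrelevant, while c is recommended because it lies close to the relevant entity a.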
We found that, among various similarity metrics, the Laplacian kernel and Euclidean distance show more potential than the alternatives for preserving the user’s context-based interpretation of entity closeness. Furthermore, we highlighted that entities perceived as similar by a user may not align with their closeness in conventional knowledge representations. Consequently, methods that are effective in general knowledge tasks may not necessarily perform well in PKA, where a more personalized interpretation of world knowledge is required. Further exploration of novel approaches to computing the semantic similarity of resources in DBpedia [34] could offer valuable insights in this regard.

Despite the usefulness of the RLKWiC dataset for evaluating our approach, the study has some limitations. The small sample size, with only eight participants, restricts the generalizability of our findings, making it essential to explore the robustness of the results with a larger and more diverse user base in future work. Additionally, our study could benefit from a more comprehensive comparison with alternative ER approaches. Benchmarking our approach against state-of-the-art ER methods would provide deeper insights into its relative performance and areas for further improvement. Addressing these limitations in future research could strengthen the validity and applicability of our conclusions.

Several further avenues could improve ER. One potential enhancement involves incorporating a disambiguation module into the system. This would allow the model to distinguish context-specific meanings of entities, such as recognizing that “stroke” in the context of “Vehicle Vibrations” refers to an engine component rather than a medical condition. Our method focuses on enhancing the ER scenario following the NER phase.
However, this enhancement could also be applied at an earlier stage in the ER process by leveraging novel methods, such as inflection-tolerant ontology-based NER for real-time applications [35], to further improve the effectiveness of ER. Additionally, large language models are anticipated to excel in predicting entity relevance when given enough information about user contexts, as they have already demonstrated considerable success in various domains including RS [36].

6. Acknowledgments

This work was funded by the BMBF project SensAI (grant no. 01IW20007) and the DFG project Managed Forgetting (project no. 318396700).

References

[1] H. Ko, S. Lee, Y. Park, A. Choi, A survey of recommendation systems: Recommendation models, techniques, and application fields, Electronics 11 (2022). URL: https://www.mdpi.com/2079-9292/11/1/141.
[2] J. B. Schafer, D. Frankowski, J. Herlocker, S. Sen, Collaborative filtering recommender systems, in: The adaptive web, Springer, 2007, pp. 291–324.
[3] V. W. Anelli, P. Basile, G. De Melo, F. M. Donini, A. Ferrara, C. Musto, F. Narducci, A. Ragone, M. Zanker, Fifth knowledge-aware and conversational recommender systems workshop (KaRS), in: Proceedings of the 17th ACM Conference on Recommender Systems, RecSys ’23, Association for Computing Machinery, New York, NY, USA, 2023, pp. 1259–1262. doi:10.1145/3604915.3608759.
[4] V. W. Anelli, P. Basile, D. Bridge, T. Di Noia, P. Lops, C. Musto, F. Narducci, M. Zanker, Knowledge-aware and conversational recommender systems, RecSys ’18, Association for Computing Machinery, New York, NY, USA, 2018, pp. 521–522. doi:10.1145/3240323.3240338.
[5] M. Bakhshizadeh, C. Jilek, H. Maus, A. Dengel, Leveraging context-aware recommender systems for improving personal knowledge assistants by introducing contextual states, LWDA, 2021, pp. 1–12. URL: https://ceur-ws.org/Vol-2993/paper-01.pdf.
[6] T. H.
Davenport, Thinking for a Living: How to Get Better Performances and Results from Knowledge Workers, Harvard Bus. School Press, 2005.
[7] P. F. Drucker, Knowledge-worker productivity: The biggest challenge, California Management Review 41 (1999) 79–94.
[8] M. Bakhshizadeh, C. Jilek, H. Maus, A. Dengel, Towards context-aware recommender systems for supporting knowledge workers in personal and corporate information space, 2024.
[9] O. Rostanin, H. Maus, T. Suzuki, K. Maeda, Using concept maps to improve proactive information delivery in tasknavigator, in: R. Setchi, I. Jordanov, R. J. Howlett, L. C. Jain (Eds.), Knowledge-Based and Intelligent Information and Engineering Systems, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010, pp. 639–648. doi:10.1007/978-3-642-15387-7_67.
[10] M. Sappelli, S. Verberne, W. Kraaij, Evaluation of context-aware recommendation systems for information re-finding, Journal of the Association for Information Science and Technology 68 (2017) 895–910. doi:10.1002/ASI.23717.
[11] T. Vuong, S. Andolina, G. Jacucci, P. Daee, K. Klouche, M. Sjöberg, T. Ruotsalo, S. Kaski, Entitybot: Actionable entity recommendations for everyday digital task, in: Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, CHI EA ’22, Association for Computing Machinery, New York, NY, USA, 2022. doi:10.1145/3491101.3519910.
[12] T. Vuong, S. Andolina, G. Jacucci, P. Daee, K. Klouche, M. Sjöberg, T. Ruotsalo, S. Kaski, Entitybot: Supporting everyday digital tasks with entity recommendations, in: Proceedings of the 15th ACM Conference on Recommender Systems, RecSys ’21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 753–756. doi:10.1145/3460231.3478883.
[13] M. Bakhshizadeh, H. Maus, A.
Dengel, Context-based entity recommendation for knowledge workers: Establishing a benchmark on real-life data, in: Proceedings of the 18th ACM Conference on Recommender Systems, RecSys ’24, Association for Computing Machinery, New York, NY, USA, 2024. doi:10.1145/3640457.3688068.
[14] Z. Wang, F. Wang, J.-R. Wen, Z. Li, Bring user interest to related entity recommendation, in: M. Croitoru, P. Marquis, S. Rudolph, G. Stapleton (Eds.), Graph Structures for Knowledge Representation and Reasoning, Springer International Publishing, Cham, 2015, pp. 139–153.
[15] X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick, J. Han, Personalized entity recommendation: a heterogeneous information network approach, in: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, WSDM ’14, Association for Computing Machinery, New York, NY, USA, 2014, pp. 283–292. doi:10.1145/2556195.2556259.
[16] E. Palumbo, D. Monti, G. Rizzo, R. Troncy, E. Baralis, entity2rec: Property-specific knowledge graph embeddings for item recommendation, Expert Systems with Applications 151 (2020) 113235. doi:10.1016/j.eswa.2020.113235.
[17] F. M. Harper, J. A. Konstan, The movielens datasets: History and context, ACM Trans. Interact. Intell. Syst. 5 (2015). doi:10.1145/2827872.
[18] T. Vuong, G. Jacucci, T. Ruotsalo, Proactive information retrieval via screen surveillance, in: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’17, Association for Computing Machinery, New York, NY, USA, 2017, pp. 1313–1316. doi:10.1145/3077136.3084151.
[19] T. Vuong, G. Jacucci, T. Ruotsalo, Watching inside the screen: Digital activity monitoring for task recognition and proactive information retrieval, Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1 (2017). doi:10.1145/3130974.
[20] C. Lampasona, O. Rostanin, H.
Maus, Seamless integration of order processing in ms outlook using smartoffice: an empirical evaluation, in: Proceedings of the ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM ’12, Association for Computing Machinery, New York, NY, USA, 2012, pp. 165–168. doi:10.1145/2372251.2372281.
[21] T. Vuong, M. Saastamoinen, G. Jacucci, T. Ruotsalo, Understanding user behavior in naturalistic information search tasks, Journal of the Association for Information Science and Technology 70 (2019) 1248–1261. doi:10.1002/asi.24201.
[22] G. Jacucci, P. Daee, T. Vuong, S. Andolina, K. Klouche, M. Sjöberg, T. Ruotsalo, S. Kaski, Entity recommendation for everyday digital tasks, ACM Trans. Comput.-Hum. Interact. 28 (2021). doi:10.1145/3458919.
[23] T. Vuong, S. Andolina, G. Jacucci, T. Ruotsalo, Does more context help? effects of context window and application source on retrieval performance, ACM Trans. Inf. Syst. 40 (2021). doi:10.1145/3474055.
[24] T. Vuong, Behavioral Task Modeling for Entity Recommendation, Ph.D. thesis, Finland, 2022.
[25] Z. R. Yousefi, T. Vuong, M. AlGhossein, T. Ruotsalo, G. Jacucci, S. Kaski, Entity footprinting: Modeling contextual user states via digital activity monitoring, ACM Trans. Interact. Intell. Syst. 14 (2024). doi:10.1145/3643893.
[26] T. Vuong, T. Ruotsalo, Predicting representations of information needs from digital activity context, ACM Trans. Inf. Syst. 42 (2024). doi:10.1145/3639819.
[27] M. Bakhshizadeh, Supporting knowledge workers through personal information assistance with context-aware recommender systems, in: Proceedings of the 18th ACM Conference on Recommender Systems, RecSys ’24, Association for Computing Machinery, New York, NY, USA, 2024. doi:10.1145/3640457.3688010.
[28] M. Bakhshizadeh, C. Jilek, M. Schröder, H. Maus, A. Dengel, Data collection of real-life knowledge work in context: The rlkwic dataset, in: S.
Li (Ed.), Information Management, Springer Nature Switzerland, Cham, 2024, pp. 277–290. doi:10.1007/978-3-031-64359-0_22.
[29] P. Ristoski, H. Paulheim, Rdf2vec: Rdf graph embeddings for data mining, in: P. Groth, E. Simperl, A. Gray, M. Sabou, M. Krötzsch, F. Lecue, F. Flöck, Y. Gil (Eds.), The Semantic Web – ISWC 2016, Springer International Publishing, Cham, 2016, pp. 498–514.
[30] G. K. D. de Vries, A fast approximation of the weisfeiler-lehman graph kernel for rdf data, in: H. Blockeel, K. Kersting, S. Nijssen, F. Železný (Eds.), Machine Learning and Knowledge Discovery in Databases, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 606–621.
[31] G. K. D. de Vries, S. de Rooij, Substructure counting graph kernels for machine learning from rdf data, Journal of Web Semantics 35 (2015) 71–84. doi:10.1016/j.websem.2015.08.002. Machine Learning and Data Mining for the Semantic Web (MLDMSW).
[32] M. P. Christensen, M. Lissandrini, K. Hose, Dbpedia rdf2vec graph embeddings, 2022. doi:10.5281/zenodo.6377944.
[33] N. Hubert, H. Paulheim, A. Brun, D. Monticolo, Do similar entities have similar embeddings?, in: A. Meroño Peñuela, A. Dimou, R. Troncy, O. Hartig, M. Acosta, M. Alam, H. Paulheim, P. Lisena (Eds.), The Semantic Web, Springer Nature Switzerland, Cham, 2024, pp. 3–21.
[34] G. Piao, S. S. Ara, J. G. Breslin, Computing the semantic similarity of resources in dbpedia for recommendation purposes, in: G. Qi, K. Kozaki, J. Z. Pan, S. Yu (Eds.), Semantic Technology, Springer International Publishing, Cham, 2016, pp. 185–200.
[35] C. Jilek, M. Schröder, R. Novik, S. Schwarz, H. Maus, A. Dengel, Inflection-tolerant ontology-based named entity recognition for real-time applications, in: M. Eskevich, G. de Melo, C. Fäth, J. P. McCrae, P. Buitelaar, C. Chiarcos, B. Klimek, M.
Dojchinovski (Eds.), 2nd Conference on Language, Data and Knowledge (LDK 2019), volume 70 of Open Access Series in Informatics (OASIcs), Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 2019, pp. 11:1–11:14. doi:10.4230/OASIcs.LDK.2019.11.
[36] L. Wu, Z. Zheng, Z. Qiu, H. Wang, H. Gu, T. Shen, C. Qin, C. Zhu, H. Zhu, Q. Liu, H. Xiong, E. Chen, A survey on large language models for recommendation, World Wide Web 27 (2024) 60. doi:10.1007/s11280-024-01291-2.